Hprc banner tamu.png

Terra:Batch Job Files

Revision as of 16:10, 26 October 2016 by Whomps (talk | contribs) (Building Job Files)
Jump to: navigation, search

Building Job Files

While not the only method of submitted programs to be executed, job files fulfill the needs of most users.

The general idea behind job files follows:

  • Make resource requests
  • Add your commands and/or scripting
  • Submit the job to the batch system

Basic Job Specifications

Several of the most important options are described below. These basic options are typically all that is needed to run a job on Terra.

Basic Terra/Slurm Job Specifications
Specification Option Example Example-Purpose
Reset Env I --export=NONE Do not propagate environment to job
Reset Env II --get-user-env=L Replicate the login environment
Wall Clock Limit -t [hh:mm:ss] -t 01:15:00 Set wall clock limit to 1 hour 15 min
Job Name -J [SomeText] -J mpiJob Set the job name to "mpiJob"
Node Count -N [min[-max]] -N 4 Spread all tasks/cores across 4 nodes
Total Task/Core Count -n [#] -n 16 Request 16 tasks/cores total
Memory Per Node --mem=[K|M|G|T] --mem=32768M Request 32768 MB (32 GB) per node
Combined stdout/stderr -j oe [OutputName].%j -j oe mpiOut.%j Collect stdout/err in mpiOut.[JobID]

It should be noted that Slurm divides processing resources as such: Nodes -> Cores/CPUs -> Tasks

A user may change the number of tasks per core. For the purposes of this guide, each core will be associated with exactly a single task.

Optional Job Specifications

Optional Job Specifications

Optional Terra/Slurm Job Specifications
Specification Option Example Example-Purpose
Set Allocation -A ###### -A 274839 Set allocation to charge to 274839
Email Notification I --mail-type=[type] --mail-type=ALL Send email on all events
Email Notification II --mail-user=[address] --mail-user=howdy@tamu.edu Send emails to howdy@tamu.edu
Specify Queue -q [queue] -q gpu Request only nodes in gpu subset
Submit Test Job --test-only Submit test job for Slurm validation
Request Temp Disk --tmp=M --tmp=10240 Request at least 10 GB in temp disk space

Several examples of Slurm job files for Terra are listed below. For translating Ada/LSF job files, the Batch Job Translation Guide provides some reference.

Documentation for advanced options can be found under Advanced Documentation.