Terra:Batch Job Files

== Building Job Files ==

While not the only method of submitting programs for execution, job files fulfill the needs of most users.

The general idea behind job files is as follows:

* Make resource requests
* Add your commands and/or scripting
* Submit the job to the batch system (see the submission example below)
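
Once a job file is written, it is handed to the batch system with Slurm's sbatch command. A minimal sketch, assuming the job file has been saved under the placeholder name MyJob.slurm:

sbatch MyJob.slurm    # Submit the job; Slurm prints the ID assigned to the job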

In a job file, resource specification options are preceded by a script directive. For each batch system, this directive is different. On Terra (Slurm) this directive is #SBATCH.
For every line of resource specifications, this directive must be the first text of the line, and all specifications must come before any executable lines. An example of a resource specification is given below:

#SBATCH --job-name=MyExample  #Set the job name to "MyExample"

Note: Comments in a job file also begin with a #, but Slurm recognizes #SBATCH as a directive.

A list of the most commonly used and important options for these job files is given in the following section of this wiki. Full job file examples are given [[Terra:Batch#Job_File_Examples | below]].

=== Basic Job Specifications ===

Several of the most important options are described below. These basic options are typically all that is needed to run a job on Terra.

{| class="wikitable" style="text-align: center;"
|+ Basic Terra (Slurm) Job Specifications
|-
! Specification !! Option !! Example !! Example-Purpose
|-
| Reset Env I || --export=NONE || || Do not propagate environment to job
|-
| Reset Env II || --get-user-env=L || || Replicate the login environment
|-
| Wall Clock Limit || --time=[hh:mm:ss] || --time=05:00:00 || Set wall clock limit to 5 hour 0 min
|-
| Job Name || --job-name=[SomeText] || --job-name=mpiJob || Set the job name to "mpiJob"
|-
| Total Task/Core Count || --ntasks=[#] || --ntasks=56 || Request 56 tasks/cores total
|-
| Tasks per Node I || --ntasks-per-node=# || --ntasks-per-node=28 || Request exactly (or max) of 28 tasks per node
|-
| Memory Per Node || <nowiki>--mem=value[K|M|G|T]</nowiki> || --mem=32G || Request 32 GB per node
|-
| Combined stdout/stderr || --output=[OutputName].%j || --output=mpiOut.%j || Collect stdout/err in mpiOut.[JobID]
|}

Note that Slurm divides processing resources as follows: Nodes -> Cores/CPUs -> Tasks

A user may change the number of tasks per core. For the purposes of this guide, each core will be associated with exactly a single task.

Note: To submit batch scripts using non-Intel MPI toolchains, you must omit the Reset Env I and Reset Env II parameters from your batch script:

#INCOMPATIBLE WITH OpenMPI/NON-INTEL MPI                        #COMPATIBLE WITH OpenMPI/NON-INTEL MPI
#!/bin/bash                                                     
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION                     ##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION
#SBATCH --export=NONE    #Do not propagate environment          ##SBATCH --export=NONE    #Do not propagate environment OMIT THIS
#SBATCH --get-user-env=L #Replicate login environment           ##SBATCH --get-user-env=L #Replicate login environment  OMIT THIS

##NECESSARY JOB SPECIFICATIONS                                  ##NECESSARY JOB SPECIFICATIONS
#SBATCH --job-name=jobname                                      #SBATCH --job-name=jobname
#SBATCH --time=5:00                                             #SBATCH --time=5:00
#SBATCH --ntasks=56                                             #SBATCH --ntasks=56
#SBATCH --ntasks-per-node=28                                    #SBATCH --ntasks-per-node=28
#SBATCH --mem=32G                                               #SBATCH --mem=32G
#SBATCH --output=example.%j                                     #SBATCH --output=example.%j

## YOUR COMMANDS BELOW                                          ## YOUR COMMANDS BELOW


=== Optional Job Specifications ===

A variety of optional specifications are available to customize your job. The table below lists the specifications which are most useful for users of Terra.

{| class="wikitable" style="text-align: center;"
|+ Optional Terra/Slurm Job Specifications
|-
! Specification !! Option !! Example !! Example-Purpose
|-
| Set Allocation || --account=###### || --account=274839 || Set allocation to charge to 274839
|-
| Email Notification I || --mail-type=[type] || --mail-type=ALL || Send email on all events
|-
| Email Notification II || --mail-user=[address] || --mail-user=howdy@tamu.edu || Send emails to howdy@tamu.edu
|-
| Specify Queue || --partition=[queue] || --partition=gpu || Request only nodes in gpu subset
|-
| Specify General Resource || --gres=[resource]:[count] || --gres=gpu:1 || Request one GPU per node
|-
| Specify GPU Type || --gres=gpu:[type]:[count] || --gres=gpu:v100:1 || Request a v100 GPU (type=k80 or v100)
|-
| Submit Test Job || --test-only || || Submit test job for Slurm validation
|-
| Request Temp Disk || --tmp=M || --tmp=10240 || Request at least 10 GB in temp disk space
|-
| Request License || --licenses=[LicenseLoc] || --licenses=nastran@slurmdb:12 ||
|}
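
As an illustration, the sketch below combines several of these optional specifications to request one v100 GPU in the gpu queue with email notification. The allocation number and email address are placeholders, not real values, and these lines would be added to the basic specifications above rather than used on their own:

#SBATCH --partition=gpu              # Run in the gpu queue
#SBATCH --gres=gpu:v100:1            # Request one v100 GPU per node
#SBATCH --account=123456             # Charge the job to allocation 123456 (placeholder)
#SBATCH --mail-type=ALL              # Send email on all job events
#SBATCH --mail-user=netid@tamu.edu   # Placeholder notification address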

=== Alternative Specifications ===

The job options within the above sections specify resources in the following way:

* Cores and CPUs are equivalent
* 1 Task per 1 CPU desired
* You specify: desired number of tasks (equals number of CPUs)
* You specify: desired number of tasks per node (equal to or less than the 28 cores per compute node)
* You get: total nodes equal to #ofCPUs / #ofTasksPerNode
* You specify: desired Memory per node

Slurm allows users to specify resources in units of Tasks, CPUs, Sockets, and Nodes.

There are many overlapping settings and some settings may (quietly) overwrite the defaults of other settings. A good understanding of Slurm options is needed to correctly utilize these methods.

{| class="wikitable" style="text-align: center;"
|+ Alternative Memory/Core/Node Specifications
|-
! Specification !! Option !! Example !! Example-Purpose
|-
| Node Count || --nodes=[min[-max]] || --nodes=4 || Spread all tasks/cores across 4 nodes
|-
| CPUs per Task || --cpus-per-task=# || --cpus-per-task=4 || Require 4 CPUs per task (default: 1)
|-
| Memory per CPU || --mem-per-cpu=MB || --mem-per-cpu=2000 || Request 2000 MB per CPU <br> NOTE: If this parameter is less than 1024, SLURM will misinterpret it as 0
|-
| Tasks per Core || --ntasks-per-core=# || --ntasks-per-core=4 || Request max of 4 tasks per core
|-
| Tasks per Node II || --tasks-per-node=# || --tasks-per-node=5 || Equivalent to Tasks per Node I
|-
| Tasks per Socket || --ntasks-per-socket=# || --ntasks-per-socket=6 || Request max of 6 tasks per socket
|-
| Sockets per Node || --sockets-per-node=# || --sockets-per-node=2 || Restrict to nodes with at least 2 sockets
|}
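
As a sketch of this alternative style, the lines below ask for 2 nodes with 7 tasks per node and 4 CPUs per task (28 cores per node, matching Terra's compute nodes) and 2000 MB of memory per CPU; the values are illustrative only:

#SBATCH --nodes=2                    # Spread the job across 2 nodes
#SBATCH --ntasks-per-node=7          # 7 tasks on each node
#SBATCH --cpus-per-task=4            # 4 CPUs per task, so 7 x 4 = 28 cores per node
#SBATCH --mem-per-cpu=2000           # 2000 MB per CPU (keep this value at 1024 or above)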

If you want to make resource requests in an alternative format, you are free to do so. Our ability to support alternative resource request formats may be limited.

=== Using Other Job Options ===

Slurm has facilities to make advanced resource requests and change settings that most Terra users do not need. These options are beyond the scope of this guide.

If you wish to explore the advanced job options, see the [[Terra:Batch#Advanced_Documentation | Advanced Documentation]].

=== Environment Variables ===

All the nodes enlisted for the execution of a job carry most of the environment variables the login process created: HOME, SCRATCH, PWD, PATH, USER, etc. In addition, Slurm defines new ones in the environment of an executing job. Below is a list of the most commonly used environment variables.

{| class="wikitable" style="text-align: center;"
|+ Basic Slurm Environment Variables
|-
! Variable !! Usage !! Description
|-
| Job ID || $SLURM_JOBID || Batch job ID assigned by Slurm.
|-
| Job Name || $SLURM_JOB_NAME || The name of the Job.
|-
| Queue || $SLURM_JOB_PARTITION || The name of the queue the job is dispatched from.
|-
| Submit Directory || $SLURM_SUBMIT_DIR || The directory the job was submitted from.
|-
| Temporary Directory || $TMPDIR || This is a directory assigned locally on the compute node for the job, located at /work/job.$SLURM_JOBID. Use of $TMPDIR is recommended for jobs that use many small temporary files.
|}

Note: To see all relevant Slurm environment variables for a job, add the following line to the executable section of a job file and submit that job. All the variables will be printed in the output file.

env | grep SLURM
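
As an example of how these variables might be used in the executable section of a job file, the sketch below stages data through the node-local $TMPDIR and copies the results back to the submission directory; myInput.dat and myProgram.o are placeholder names:

cd $TMPDIR                                           # Work in the node-local temporary directory
cp $SLURM_SUBMIT_DIR/myInput.dat .                   # Stage input from the directory the job was submitted from
$SLURM_SUBMIT_DIR/myProgram.o myInput.dat > out.$SLURM_JOBID   # Run the placeholder program, tagging output with the job ID
cp out.$SLURM_JOBID $SLURM_SUBMIT_DIR/               # Copy results back before the job ends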

=== Clarification on Memory, Core, and Node Specifications ===

Memory Specifications are IMPORTANT.
For examples on calculating memory, core, and/or node specifications on Terra, see [[:Terra:Batch_Memory_Specs | Specification Clarification]].
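
As a quick worked example using the basic specifications above: --ntasks=56 with --ntasks-per-node=28 yields 56 / 28 = 2 nodes, and --mem=32G is a per-node request, so the job is allocated 2 x 32 GB = 64 GB in total, or roughly 32 GB / 28 cores ≈ 1.1 GB per core. The linked page works through more of these calculations.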

=== Executable Commands ===

After the resource specification section of a job file comes the executable section. This executable section contains all the necessary UNIX, Linux, and program commands that will be run in the job.
Some commands that may go in this section include, but are not limited to:

* Changing directories
* Loading, unloading, and listing modules
* Launching software

An example of a possible executable section is below:

cd $SCRATCH      # Change current directory to /scratch/user/[netID]/
ml purge         # Purge all modules
ml intel/2016b   # Load the intel/2016b module
ml               # List all currently loaded modules

./myProgram.o    # Run "myProgram.o"

For information on the module system or specific software, visit our [[SW:Modules | Modules]] page and our [[SW | Software]] page.