Ada:Batch Memory Specs

Clarification on Memory, Core, and Node Specifications

With LSF, you can specify the number of cores, the number of cores per node, and memory per core. The number of nodes that the job uses is determined by the core count divided by the number of cores per node.
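
For instance, as an illustrative calculation (numbers chosen only for this example), requesting -n 40 with span[ptile=20] gives:

40 Cores / 20 Cores per Node = 2 Nodes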

Basic Ada (LSF) Memory/Core Specifications

Specification     Option                   Example                  Example-Purpose
Core count        -n ##                    -n 20                    Assigns 20 job slots/cores.
Cores per node    -R "span[ptile=##]"      -R "span[ptile=5]"       Requests 5 cores per node.
Memory Per Core   -M [MB]                  -M 2560                  Sets the per process memory limit to 2560 MB.
Memory Per Core   -R "rusage[mem=[MB]]"    -R "rusage[mem=2560]"    Schedules the job on nodes that have at least 2560 MB available per core.

Note: There are two memory per core specification options listed above. It is recommended that both be used in any job file, and their values should always match.
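
Putting the table's example values together, a complete job file might look like the sketch below. The job name, output file, wall-clock limit, and program name are hypothetical placeholders; only the memory/core lines come from the table above.

#BSUB -J example_job          #(hypothetical) job name
#BSUB -o example_job.%J       #(hypothetical) output file; %J is replaced by the job ID
#BSUB -W 2:00                 #(hypothetical) 2 hour wall-clock limit
#BSUB -n 20                   #Request 20 cores
#BSUB -R "span[ptile=5]"      #Request 5 cores per node (20 / 5 = 4 nodes)
#BSUB -R "rusage[mem=2560]"   #Request 2560MB per core (5 * 2560MB = 12800MB per node)
#BSUB -M 2560                 #Set the per process enforceable memory limit to 2560MB

./my_program                  #(hypothetical) program to run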

Memory Per Node

Another important number to keep in mind is memory per node. On Ada, 800+ nodes have 64GB of memory (54GB usable) and 26 nodes have 256GB of memory (245GB usable). A small selection of 1TB and 2TB nodes is also available for large jobs. More hardware information can be found on the Ada Hardware Summary page.

Memory per node is important to keep in mind because it is one of the main factors that dictate which node(s) a job is placed on. A job that requires less than 54GB of memory per node can be placed on any of the 800+ nodes on Ada, whereas a job requiring more than 54GB of memory per node has only 26 nodes on which it can be placed.
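
To estimate the memory per node for a job, multiply the memory requested per core (rusage[mem]) by the number of cores per node (ptile) and compare the result with the usable figures above. An illustrative calculation (numbers chosen only for this example):

3000MB per core * 20 Cores per Node = 60000MB (~60GB) per Node

Since ~60GB exceeds the 54GB usable on the 64GB nodes, such a job could only be placed on the 26 256GB nodes.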

Examples

Below you will find examples of the above specifications as well as calculations to find the memory per node and number of nodes.

Memory Example 1:

#BSUB -n 1                   #Request 1 core
#BSUB -R "span[ptile=1]"     #Request 1 core per node.
#BSUB -R "rusage[mem=5000]"  #Request 5000MB per core for the job
#BSUB -M 5000                #Set the per process enforceable memory limit to 5000MB.

This results in:

1 Core / 1 Core per Node = 1 Node
5000MB per core * 1 Core per Node = 5000MB per Node

Memory Example 2:

#BSUB -n 10                  #Request 10 cores
#BSUB -R "span[ptile=10]"    #Request 10 cores per node.
#BSUB -R "rusage[mem=10000]"  #Request 10000MB per core for the job
#BSUB -M 10000                #Set the per process enforceable memory limit to 10000MB.

This results in:

10 Cores / 10 Cores per Node = 1 Node
~10GB per core * 10 Cores per Node = ~100GB per Node

Note: This is an example of a job that would require one of the 26 256GB nodes to run.
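
One way to make that placement explicit, rather than relying only on the memory calculation, is to add a node-type selection string; mem256gb is among the node types Ada's LSF configuration accepts (others include nxt, gpu, phi, mem1t, and mem2t). For example:

#BSUB -R "select[mem256gb]"  #Request placement on one of the 256GB nodes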

Memory Example 3:

#BSUB -n 20                  #Request 20 cores
#BSUB -R "span[ptile=10]"    #Request 10 cores per node.
#BSUB -R "rusage[mem=10000]"  #Request 10000MB per core for the job
#BSUB -M 10000                #Set the per process enforceable memory limit to 10000MB.

This results in:

20 Cores / 10 Cores per Node = 2 Nodes
~10GB per core * 10 Cores per Node = ~100GB per Node
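
Note: As in Example 2, each node here needs ~100GB of memory, which only the 256GB nodes can provide, so this job would occupy two of the 26 256GB nodes.

Once a job file has been written (saved here as example3.job, a hypothetical file name), it is submitted to LSF with bsub, which reads the job file from standard input:

bsub < example3.job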