
Ada:Batch Memory Specs

Revision as of 17:54, 10 January 2017 by Cryssb818 (talk | contribs) (Clarification on Memory, Core, and Node Specifications)

Clarification on Memory, Core, and Node Specifications

With LSF, you can specify the total number of cores, the number of cores per node, and the memory per core. The number of nodes that the job uses is the core count divided by the number of cores per node, rounded up to a whole node.
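The node-count arithmetic can be sketched in a few lines of shell. This is only a planning aid: nodes_needed is a hypothetical helper name, not an LSF command, and the rounding up reflects that a partly filled last node still counts as a node.

```shell
# nodes_needed CORES PTILE
# Hypothetical helper (not part of LSF): number of nodes implied by
# "-n CORES" together with -R "span[ptile=PTILE]".
nodes_needed() {
    cores=$1
    ptile=$2
    # Integer ceiling division: round up to a whole node.
    echo $(( ($cores + $ptile - 1) / $ptile ))
}

nodes_needed 20 5      # -n 20 with ptile=5:   prints 4
nodes_needed 900 16    # -n 900 with ptile=16: prints 57
```

When the core count is not a multiple of the ptile value, the last node is only partly used by the job.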

Basic Ada (LSF) Memory/Core Specifications
Specification    | Option                  | Example                 | Example purpose
Core count       | -n ##                   | -n 20                   | Assigns 20 job slots/cores.
Cores per node   | -R "span[ptile=##]"     | -R "span[ptile=5]"      | Requests 5 cores per node.
Memory per core  | -M [MB]                 | -M 2560                 | Sets the per-process memory limit to 2560 MB.
Memory per core  | -R "rusage[mem=[MB]]"   | -R "rusage[mem=2560]"   | Schedules the job on nodes that have at least 2560 MB available per core.

Note: Two memory-per-core options are listed above. We recommend using both in every job file, and their values should always match.
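As an illustration, a job-file fragment that sets both memory options to the same value might look like the sketch below. The values are the ones from the table; the rest of the job file (job name, wall-clock limit, commands) is left out.

```shell
#BSUB -n 20                     # 20 cores/job slots in total
#BSUB -R "span[ptile=5]"        # 5 cores per node, so 4 nodes
#BSUB -R "rusage[mem=2560]"     # schedule on nodes with at least 2560 MB available per core
#BSUB -M 2560                   # per-process memory limit of 2560 MB, matching the rusage value
```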

Memory Per Node

Another important number to keep in mind is memory per node. On Ada, 800+ nodes have 64GB of memory (54GB usable) and 26 nodes have 256GB of memory (245GB usable). A small number of 1TB and 2TB nodes are also available for large jobs. More hardware information can be found on the Ada Hardware Summary page.

Memory per node matters because it is one of the main factors that determine which node(s) a job is placed on. A job that requires less than 54GB of memory per node can be placed on any of the 800+ nodes on Ada, whereas a job that requires more than 54GB of memory per node can be placed on only 26 nodes.
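A rough way to see which class of node a request lands on is to multiply the cores per node by the memory per core and compare the result with the usable memory figures above. The sketch below makes two assumptions that are not from this page: it treats 1 GB as 1000 MB for the estimate, and it counts a request exactly at the limit as fitting. node_class is a hypothetical helper name, not an LSF command.

```shell
# Assumed usable memory per node in MB (taking 1 GB as 1000 MB for a rough estimate).
STANDARD_NODE_MB=54000    # 64GB nodes, 54GB usable
LARGE_NODE_MB=245000      # 256GB nodes, 245GB usable

# node_class PTILE MEM_PER_CORE_MB
# Hypothetical helper: reports the memory needed per node and the
# smallest node class that could hold it.
node_class() {
    per_node=$(( $1 * $2 ))
    if [ "$per_node" -le "$STANDARD_NODE_MB" ]; then
        echo "$per_node MB per node: fits a 64GB node"
    elif [ "$per_node" -le "$LARGE_NODE_MB" ]; then
        echo "$per_node MB per node: needs a 256GB node"
    else
        echo "$per_node MB per node: needs a 1TB or 2TB node"
    fi
}

node_class 20 2560    # 20 cores per node at 2560 MB per core
node_class 20 5000    # 20 cores per node at 5000 MB per core
```

Lowering either the cores per node or the memory per core is what brings a request back under the 54GB figure.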


Five important job parameters:

#BSUB -n NNN                    # NNN: total number of cores/jobslots to allocate for the job
#BSUB -R "span[ptile=XX]"       # XX:  number of cores/jobslots per node to use. Also, a node selection criterion
#BSUB -R "select[node-type]"    # node-type: nxt, mem256gb, gpu, phi, mem1t, mem2t ...
#BSUB -R "rusage[mem=nnn]"      # reserves nnn MBs per process/CPU for the job
#BSUB -M mmm                    # sets the per process enforceable memory limit to mmm MB

We list these together because in many jobs they are closely related and must therefore be set consistently. We recommend using them in all jobs: serial, single-node, and multi-node. The rusage[mem=nnn] setting causes LSF to select nodes that can each allocate XX * nnn MB for the execution of the job. The -M mmm option sets and enforces the per-process memory limit; when this limit is violated, the job aborts. If you omit this option, LSF applies the default memory limit, which is configured as 2.5 gigabytes (2500 MB) per process. The following examples, with some commentary, illustrate the use of these options.

Important: if the process memory limit, default (2500 MB) or specified, is exceeded during execution the job will fail with a memory violation error.

#BSUB -n 900                    # 900: number of cores/jobslots to allocate for the job
#BSUB -R "span[ptile=20]"       # 20:  number of cores per node to use
#BSUB -R "select[nxt]"          # Allocates NeXtScale type nodes

The above specifications will allocate 45 (=900/20) whole nodes. In many parallel jobs, selecting NeXtScale nodes at 20 cores per node is the best choice. This example only illustrates what happens when you omit the memory-related options; we strongly urge you to specify them. Because they are omitted here, the enforceable memory limit per process is the default setting of 2500 MB (2.5 GB).

#BSUB -n 900                    # 900: total number of cores/jobslots to allocate for the job
#BSUB -R "span[ptile=16]"       # 16:  number of cores/jobslots per node to use
#BSUB -R "select[nxt]"          # allocates NeXtScale type nodes
#BSUB -R "rusage[mem=3600]"     # schedules on nodes that have at least 3600 MB per process/CPU avail
#BSUB -M 3600                   # enforces 3600 MB memory use per process 

The above specifications will allocate 57 (= ceiling(900/16)) nodes. Deciding to use only XX (here 16) cores per node, rather than the maximum of 20, requires some judgement, and the execution profile of the job is important. Typically, some experimentation is needed to find the optimal ptile value for a given code.

#BSUB -n 1                    # Allocate a total of 1 cpu/core for the job, appropriate for serial processing.
#BSUB -R "span[ptile=1]"      # Allocate 1 core per node.
#BSUB -R "select[gpu]"        # Allocate a node that has gpus (of 64GB or 256GB memory). A "select[phi]"
                              # specification would allocate a node with phi coprocessors.

Omitting the last two options in the above will cause LSF to place the job on any conveniently available core on any node, idle or (partially) busy, of any type, except on those with 1TB or 2TB memory.

It is worth emphasizing that, under the current LSF setup, only the -x option or a ptile value equal to the node's core count will prevent LSF from scheduling other jobs onto the remaining unreserved cores of a node.
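As a sketch of the first approach, the fragment below requests fewer cores than a node holds and adds -x to ask for exclusive use of the node. The core count and memory values are illustrative only, and whether exclusive jobs are permitted or charged differently on a given queue is not covered here.

```shell
#BSUB -n 4                      # 4 cores/job slots in total
#BSUB -R "span[ptile=4]"        # all 4 cores on one node
#BSUB -R "rusage[mem=2560]"     # reserve 2560 MB per core
#BSUB -M 2560                   # matching per-process memory limit
#BSUB -x                        # exclusive execution: no other jobs share the node
```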