==Queues==
 
LSF, upon job submission, sends your jobs to appropriate batch queues. These are (software) service stations configured to control the scheduling and dispatch of jobs that have arrived in them. Batch queues are characterized by a number of parameters. Some of the most important are:

# the total number of jobs that can be concurrently running (number of run slots)
# the wall-clock time limit per job
# the type and number of nodes it can dispatch jobs to
# which users or user groups can use that queue

These settings control whether a job will remain idle in the queue or be dispatched quickly for execution.
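These parameters can be inspected directly with LSF's bqueues command; the following is only a minimal sketch, and sn_regular is just an example queue name taken from the table below:
<pre>
# List the batch queues available on the system, with their status and job counts.
bqueues

# Show the detailed configuration of a single queue, including its run-slot,
# walltime, and per-user limits ("sn_regular" is just an example queue name).
bqueues -l sn_regular
</pre>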
The current queue structure ('''updated on September 27, 2019''') is shown in the table below.

'''NOTE: Each user is now limited to 8000 cores total for his/her pending jobs across all the queues.'''
{| class="wikitable" style="text-align: center;"
! Queue
! Job Min/Default/Max Cores
! Job Default/Max Walltime
! Compute Node Types
! Per-Queue Limits
! Aggregate Limits Across Queues
! Per-User Limits Across Queues
! Notes
|-
| sn_short
| rowspan="4" | 1 / 1 / 20
| 10 min / 1 hr
| rowspan="8" style="white-space: nowrap;" | 64 GB nodes (811)<br>256 GB nodes (26)
| rowspan="4" |
| rowspan="4" | Maximum of '''7000''' cores for all running jobs in the single-node (sn_*) queues.
| rowspan="4" | Maximum of '''1000 cores and 100 jobs per user''' for all running jobs in the single-node (sn_*) queues.
| rowspan="4" | For jobs needing '''only one compute node'''.
|-
| sn_regular
| 1 hr / 1 day
|-
| sn_long
| 24 hr / 4 days
|-
| sn_xlong
| 4 days / 30 days
|-
| mn_short
| 2 / 2 / 200
| 10 min / 1 hr
| Maximum of '''2000''' cores for all running jobs in this queue.
| rowspan="4" | Maximum of '''12000''' cores for all running jobs in the multi-node (mn_*) queues.
| rowspan="4" | Maximum of '''3000 cores and 150 jobs per user''' for all running jobs in the multi-node (mn_*) queues.
| rowspan="4" | For jobs needing '''more than one compute node'''.
|-
| mn_small
| 2 / 2 / 120
| 1 hr / 10 days
| Maximum of '''7000''' cores for all running jobs in this queue.
|-
| mn_medium
| 121 / 121 / 600
| 1 hr / 7 days
| Maximum of '''6000''' cores for all running jobs in this queue.
|-
| mn_large
| 601 / 601 / 2000
| 1 hr / 5 days
| Maximum of '''8000''' cores for all running jobs in this queue.
|-
| xlarge
| 1 / 1 / 280
| 1 hr / 10 days
| style="white-space: nowrap;" | 1 TB nodes (11)<br>2 TB nodes (4)
| colspan="2" |
|
| For jobs needing '''more than 256 GB of memory per compute node'''.
|-
| vnc
| 1 / 1 / 20
| 1 hr / 6 hr
| style="white-space: nowrap;" | GPU nodes (30)
| colspan="2" |
|
| For remote visualization jobs.
|-
| special
| None
| 1 hr / 7 days
| style="white-space: nowrap;" | 64 GB nodes (811)<br>256 GB nodes (26)
| colspan="2" |
|
| Requires permission to access this queue.
|-
| v100 (*)
| 1 / 1 / 72
| 1 hr / 2 days
| style="white-space: nowrap;" | 192 GB nodes, dual 32 GB V100 GPUs (2)
| colspan="2" |
|
|
|}
* V100 nodes were moved to terra in preparation for the decommissioning of Ada.
  
LSF determines which queue will receive a job for processing. The selection is determined mainly by the resources (e.g., number of CPUs, wall-clock limit) specified, explicitly or by default. There are two exceptions:
# The '''xlarge''' queue, which is associated with nodes that have 1 TB or 2 TB of main memory. To use it, submit jobs with the '''-q xlarge''' option along with '''-R "select[mem1tb]"''' or '''-R "select[mem2tb]"''' (see the example job-file header after this list).
# The '''special''' queue, which gives access to all of the compute nodes. '''You MUST request permission to get access to this queue.'''
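For instance, a job file destined for the xlarge queue and a 1 TB node might begin with a header along the following lines. This is only a sketch: the job name, core count, walltime, and output file name are placeholders to adapt to your own job.
<pre>
## Example job-file header for the xlarge queue (values are placeholders).
#BSUB -J bigmem_job           # job name
#BSUB -L /bin/bash            # login shell for the job
#BSUB -n 20                   # number of cores
#BSUB -W 48:00                # walltime limit (hh:mm), within the queue's 10-day maximum
#BSUB -o bigmem_job.%J        # standard output file; %J expands to the job ID
#BSUB -q xlarge               # send the job to the xlarge queue
#BSUB -R "select[mem1tb]"     # place the job on a 1 TB node (use select[mem2tb] for a 2 TB node)
</pre>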
To access any of the above queues, you must use the '''-q queue_name''' option in your job script.

Output from the bjobs command contains the name of the queue associated with a given job.
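For example (a sketch; the job ID below is a placeholder):
<pre>
# Summary of your own pending and running jobs; the QUEUE column shows the
# queue each job was sent to.
bjobs

# Detailed information for a single job, including its queue.
bjobs -l 123456
</pre>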
{{:SW:Checkpointing | Running jobs longer than the walltime limit}}
[[ Category:Ada ]]

==Checkpointing==

Checkpointing is the practice of creating a save state of a job so that, if interrupted, it can begin again without starting completely over. This technique is especially important for long jobs on the batch systems, because each batch queue has a maximum walltime limit.


A checkpointed job file is particularly useful for the gpu queue, which is limited to 4 days of walltime due to its demand. There are many cases of jobs that require the use of GPUs and must run longer than that limit, such as training a machine learning model.


Users can change their code to implement save states so that their jobs can restart automatically when cut off by the walltime limit. There are many different ways to checkpoint a job depending on the software used, but it is almost always done at the application level. How frequently save states are written depends on the fault tolerance the job needs; in the case of the batch system, however, the exact time of the 'fault' is known in advance: it is simply the walltime limit of the queue. In that case only one checkpoint needs to be created, shortly before the limit is reached. Many resources are available describing checkpointing techniques for common software packages.
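As a generic illustration only (not one of the software-specific recipes), a job file can run the application under a time limit slightly shorter than the queue walltime and resubmit itself if the work did not finish. Everything in the sketch below is a placeholder: my_app, its --checkpoint option, the state.chk file, and the assumption that the script is saved as checkpoint_job.sh in the submission directory.
<pre>
#!/bin/bash
## Hypothetical self-resubmitting job script (checkpoint_job.sh); all names and
## values are placeholders, and the application itself must know how to write
## and resume from the checkpoint file.
#BSUB -J chkpt_demo           # job name
#BSUB -L /bin/bash            # login shell for the job
#BSUB -n 1                    # number of cores
#BSUB -W 24:00                # walltime limit of the sn_regular queue (1 day)
#BSUB -o chkpt_demo.%J        # standard output file
#BSUB -q sn_regular           # example single-node queue

# Run for at most 22 hours, leaving time to flush the checkpoint and resubmit
# before the 24-hour walltime limit is reached.
timeout 22h ./my_app --checkpoint state.chk
status=$?

# GNU timeout exits with status 124 when the time limit was hit, i.e. the run
# is not finished; resubmit the same script so it continues from state.chk.
if [ "$status" -eq 124 ]; then
    bsub < checkpoint_job.sh
fi
</pre>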