AMS Service Unit

A 'Service Unit' (SU) is charged to a job when the job uses one 'effective_core' for one hour (walltime). For example, a job that uses 10 effective cores for 3 hours is charged 30 SUs.

The calculation of effective_core depends on whether or not a job is an exclusive job (whether '--exclusive' is specified).

SUs are allocated on a per-cluster basis and cannot be transferred between clusters. Additionally, SUs expire at the end of each fiscal year and cannot be carried over into the new fiscal year.

Exclusive Job

When an exclusive job runs on a node with m cores:
'effective_core = m'
Note that the effective_core for an exclusive job on a node is independent of how many cores the job requests.

Once the effective_core on a node is calculated, the effective_core for the job is simply the sum of the effective_core on all nodes where the job runs.
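
The following is a minimal Python sketch of the exclusive-job accounting described above (illustrative only; the function name and the walltime value are assumptions, not HPRC tooling):

  # Sketch: an exclusive job is charged for every core on every node it
  # occupies, regardless of how many cores it actually requested.
  def exclusive_effective_cores(cores_per_node):
      return sum(cores_per_node)

  # e.g., an exclusive job on two 48-core nodes running for 2 hours:
  effective_cores = exclusive_effective_cores([48, 48])   # 96
  sus_charged = effective_cores * 2                       # 192 SUs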

Non-Exclusive Job

The effective_core count on a node is calculated by taking the requested memory into account. When a job requests xxx memory via "--mem=xxx" on a node with m cores, then for that requested memory, we have
'memory_equivalent_core = min(m, ceil(xxx/total_memory*m))'
where total_memory is the total memory available to users on that node.
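
This formula can be written as a short Python sketch (illustrative only; the function name is an assumption, and xxx and total_memory must be given in the same units):

  import math

  # memory_equivalent_core = min(m, ceil(xxx / total_memory * m))
  def memory_equivalent_core(requested_mem, node_cores, total_memory):
      return min(node_cores, math.ceil(requested_mem / total_memory * node_cores))

  # Example 1 below: 13G requested on a 48-core node with ~351.5G usable
  print(memory_equivalent_core(13, 48, 351.5625))   # 2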

Example 1

   A compute node has 48 cores and around 351.5G memory available to users. 
   When a job requests 13G memory and 2 cores, the job uses memory_equivalent_core of min(48, ceil(13/351.5625*48))=2 cores.

Example 2

   A compute node has 48 cores and around 351.5G memory available to users. 
   When the job requests 20G memory and 2 cores, the job uses memory_equivalent_core of min(48, ceil(20/351.5625*48))=3 cores.

Example 3

   A compute node has 48 cores and around 351.5G memory available to users. 
   When the job requests 150G memory and 10 cores, the job uses memory_equivalent_core of min(48, ceil(150/351.5625*48))=21 cores.

Once the memory_equivalent_core is calculated, the effective_core on a node where a job requests yyy cores can be calculated as below:

  effective_core = max(yyy, memory_equivalent_core)

Finally, the effective_core for a job is the sum of the effective_core on all nodes where the job runs.
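
Putting the two formulas together, here is a hedged Python sketch of the per-node charge (it reuses the memory_equivalent_core() function from the sketch above; names are illustrative):

  # effective_core = max(yyy, memory_equivalent_core)
  def effective_core(requested_cores, requested_mem, node_cores, total_memory):
      mec = memory_equivalent_core(requested_mem, node_cores, total_memory)
      return max(requested_cores, mec)

  # Example 3 above: 150G and 10 cores on a 48-core node -> 21 effective cores,
  # so a 2-hour run would be charged 21 * 2 = 42 SUs.
  print(effective_core(10, 150, 48, 351.5625))   # 21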

SUs for Jobs Requesting GPUs

A job requesting GPUs is considered an exclusive job. That is, the SUs consumed by the job are calculated based on all of the CPUs on the GPU nodes it uses. For example, if a job requests one CPU and one GPU, the job is a GPU job, and its SUs are calculated as for 28 or 48 CPUs (all CPUs on a GPU node, depending on the node type). Note that this policy will soon be changing. Come back for updates.
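
As a worked example under the current policy (a sketch; the 48-CPU node type and the walltime are assumptions for illustration):

  # A GPU job is charged for all CPUs on its GPU node, however few it requests.
  gpu_node_cpus = 48        # 28 on the older GPU nodes, 48 on the newer ones
  walltime_hours = 5        # assumed walltime for illustration
  sus_charged = gpu_node_cpus * walltime_hours   # 240 SUs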