The gpuavail command is available on all clusters and displays the current GPU configuration and availability.


The gpuavail output shows the current GPU configuration and the compute nodes with resources readily available for new jobs. If only one of several GPUs on a compute node is in use, gpuavail shows the number of GPUs and CPUs and the amount of memory still available to other jobs on that node.

In the following example output, to schedule a job with 6 A100 GPUs on a single compute node, you could request the 16 available CPUs and 687 GB of memory, and your job would run on the compute node named fc024 without your having to specify the node name.


  NODE            NODE 
  TYPE            COUNT
  gpu:t4:4         29
  gpu:t4:8         10
  gpu:a100:8        2
  gpu:a40:2         2
  gpu:a100:4        2
  gpu:a30:2         2
  gpu:a100:16       1
  gpu:a10:4         1
  gpu:a40:4         1
  gpu:t4:2          1
  gpu:a10:2         1

  NODE    GPU     GPU    GPU    CPU    MEM (GB)
          TYPE    TOTAL  AVAIL  AVAIL  AVAIL
  fc004   a100    16   11     24     797
  fc009   t4      4    4      64     250
  fc010   t4      4    4      64     250
  fc011   t4      4    4      64     250
  fc012   t4      8    8      64     250
  fc013   t4      8    8      64     250
  fc023   t4      2    2      64     250
  fc024   a100    8    6      16     687
  fc026   a100    4    3      52     131
  fc031   a100    8    3      28     783
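
The fc024 example above can be written as a batch request. This is a minimal sketch assuming a Slurm scheduler (the gpu:type:count node listings above match Slurm's GRES notation); the partition name and application are hypothetical placeholders.

```shell
#!/bin/bash
# Sketch of a job matching the fc024 example above, assuming Slurm.
#SBATCH --job-name=a100-job
#SBATCH --nodes=1                 # keep all 6 GPUs on one compute node
#SBATCH --gres=gpu:a100:6         # the 6 A100 GPUs available on fc024
#SBATCH --cpus-per-task=16        # the 16 CPUs still available on that node
#SBATCH --mem=687G                # the 687 GB of memory still available
#SBATCH --partition=gpu           # hypothetical partition name

# The scheduler can place this on fc024 automatically;
# --nodelist=fc024 is not required.
srun ./my_gpu_program             # hypothetical application
```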
  • It is good practice not to request all of the CPUs and memory on a GPU compute node when you are not scheduling all of its GPUs, unless your job requires those resources.

  • If you schedule one of several available GPUs on a compute node but also request all of its CPUs and memory, the remaining GPUs on that node will be unavailable to other jobs and will sit idle for the duration of your job.
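
To illustrate the guidance above, assuming Slurm and one of the gpu:t4:4 nodes listed earlier (4 GPUs, 64 CPUs, 250 GB): requesting 1 of the 4 GPUs suggests asking for roughly a quarter of the CPUs and memory, leaving the rest for other jobs.

```shell
#!/bin/bash
# Sketch of a single-GPU request sized proportionally, assuming a Slurm
# scheduler and a gpu:t4:4 node (4 GPUs, 64 CPUs, 250 GB) as shown above.
#SBATCH --gres=gpu:t4:1       # 1 of the node's 4 GPUs
#SBATCH --cpus-per-task=16    # ~1/4 of the 64 CPUs
#SBATCH --mem=62G             # ~1/4 of the 250 GB of memory

srun ./my_gpu_program         # hypothetical application
```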
