CUDA Programs

Access

In order to compile, run, and debug CUDA programs, a CUDA module must be loaded:

[ netID@ada ~]$ module load CUDA

For more information on the modules system, please see our Modules System page.

Compiling CUDA C/C++ with NVIDIA nvcc

The compiler nvcc is the NVIDIA CUDA C/C++ compiler. The command line for invoking it is:

[ netID@ada ~]$ nvcc [options] -o cuda_prog.exe file1 file2 ...

where file1, file2, ... are any appropriate source, assembly, object, object library, or other (linkable) files that are linked to generate the executable file cuda_prog.exe.
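
As a minimal sketch (the file vector_add.cu and its contents are illustrative, not taken from Ada's documentation), a single-source CUDA program that could be built with the command above might look like:

 // vector_add.cu -- illustrative example, not from the Ada documentation
 #include <cstdio>
 #include <cstdlib>
 #include <cuda_runtime.h>
 
 // Kernel: each thread adds one pair of elements
 __global__ void vector_add(const float *a, const float *b, float *c, int n)
 {
     int i = blockIdx.x * blockDim.x + threadIdx.x;
     if (i < n)
         c[i] = a[i] + b[i];
 }
 
 int main()
 {
     const int n = 1 << 20;
     size_t bytes = n * sizeof(float);
 
     // Allocate and initialize host arrays
     float *h_a = (float *)malloc(bytes);
     float *h_b = (float *)malloc(bytes);
     float *h_c = (float *)malloc(bytes);
     for (int i = 0; i < n; i++) { h_a[i] = 1.0f; h_b[i] = 2.0f; }
 
     // Allocate device arrays and copy the inputs to the device
     float *d_a, *d_b, *d_c;
     cudaMalloc(&d_a, bytes);
     cudaMalloc(&d_b, bytes);
     cudaMalloc(&d_c, bytes);
     cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
     cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);
 
     // Launch with 256 threads per block
     int threads = 256;
     int blocks = (n + threads - 1) / threads;
     vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);
 
     // Copy the result back and spot-check it
     cudaMemcpy(h_c, d_c, bytes, cudaMemcpyDeviceToHost);
     printf("c[0] = %f\n", h_c[0]);
 
     cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
     free(h_a); free(h_b); free(h_c);
     return 0;
 }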

The CUDA devices on Ada are K20s. K20 GPUs are compute capability 3.5 devices. When compiling your code, you need to specify:

[ netID@ada ~]$ nvcc -arch=compute_35 -code=sm_35 ...

By default, nvcc uses gcc to compile the host portions of your source code. However, it is better to use the Intel compiler by adding the flag -ccbin=icc to your compile command.
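
Putting these pieces together, a complete compile line for the illustrative vector_add.cu example above might look like:

[ netID@ada ~]$ nvcc -ccbin=icc -arch=compute_35 -code=sm_35 -o vector_add.exe vector_add.cu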

For more information on nvcc, please refer to the online manual at http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc.

Running CUDA Programs

Only 5 of the login nodes on Ada (login1, login2, ..., login5) are equipped with K20s, either one or two per node. To find out how many K20s a node has and how busy they are, run the NVIDIA system management interface program nvidia-smi. This command reports which GPU device your code is running on, how much memory is used on the device, and the GPU utilization.

[ netID@ada ~]$ nvidia-smi
 Wed Jan  7 11:16:05 2015       
 +------------------------------------------------------+                       
 | NVIDIA-SMI 340.29     Driver Version: 340.29         |                       
 |-------------------------------+----------------------+----------------------+
 | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
 | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
 |===============================+======================+======================|
 |   0  Tesla K20m          On   | 0000:20:00.0     Off |                    0 |
 | N/A   18C    P8    15W / 225W |     22MiB /  4799MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
 |   1  Tesla K20m          On   | 0000:8B:00.0     Off |                    0 |
 | N/A   16C    P8    16W / 225W |     13MiB /  4799MiB |      0%      Default |
 +-------------------------------+----------------------+----------------------+
                                                                              
 +-----------------------------------------------------------------------------+
 | Compute processes:                                               GPU Memory |
 |  GPU       PID  Process name                                     Usage      |
 |=============================================================================|
 |   0       18950  ./a.out                                               7MiB |
 +-----------------------------------------------------------------------------+ 
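
Besides nvidia-smi, device information can also be queried from within a program through the CUDA runtime API. A minimal sketch (illustrative, not part of the Ada documentation):

 // list_gpus.cu -- illustrative sketch using the CUDA runtime API
 #include <cstdio>
 #include <cuda_runtime.h>
 
 int main()
 {
     int count = 0;
     cudaGetDeviceCount(&count);            // number of CUDA devices on this node
     for (int d = 0; d < count; d++) {
         cudaDeviceProp prop;
         cudaGetDeviceProperties(&prop, d);
         // Report name, compute capability, and total device memory
         printf("GPU %d: %s (compute %d.%d, %zu MiB)\n",
                d, prop.name, prop.major, prop.minor,
                prop.totalGlobalMem / (1024 * 1024));
     }
     return 0;
 }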

You can test your CUDA program on one or more of the login nodes as long as you abide by the rules stated in Computing Environment. For production runs, you should submit a batch job to run your code on the compute nodes. Ada has 20 compute nodes with dual K20s and 256GB (host) memory and 10 compute nodes with a single K20 and 64GB (host) memory. Your job needs to specify one of the following, in conjunction with other parameters, to secure one or more GPU nodes.

 Node Type Needed    Job Parameter to Use
 Any GPU             -R "select[gpu]"
 64GB GPU            -R "select[gpu64gb]"
 256GB GPU           -R "select[gpu256gb]"

For example, the following job options will select one node with 256GB memory and dual K20s:

 #BSUB -n 20 -R "span[ptile=20]" -R "select[gpu256gb]"
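
As a sketch of a complete batch script built around these options (the job name, wall-clock limit, and file names are illustrative assumptions, not site defaults):

 #BSUB -J cuda_job                                      # job name (illustrative)
 #BSUB -n 20 -R "span[ptile=20]" -R "select[gpu256gb]"  # one dual-K20, 256GB node
 #BSUB -W 1:00                                          # wall-clock limit of 1 hour
 #BSUB -o cuda_job.%J                                   # output file, %J is the job ID
 
 module load CUDA
 ./cuda_prog.exe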

Debugging CUDA Programs

The NVIDIA CUDA debugging tool is cuda-gdb. CUDA programs must be compiled with "-g -G" to disable optimization (O0) and to generate code with debugging information. To generate debugging code for the K20, compile and link the code with:

[ netID@ada ~]$ nvcc -g -G -arch=compute_35 -code=sm_35 cuda_prog.cu -o cuda_prog.out
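
A debugging session might then look like the following sketch; the kernel name is an illustrative assumption, while break, run, info cuda kernels, and continue are standard cuda-gdb commands:

 [ netID@ada ~]$ cuda-gdb ./cuda_prog.out
 (cuda-gdb) break vector_add       # stop at the (illustrative) kernel
 (cuda-gdb) run
 (cuda-gdb) info cuda kernels      # list kernels currently running on the GPU
 (cuda-gdb) continue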

For more information on cuda-gdb, please refer to its online manual at http://docs.nvidia.com/cuda/cuda-gdb.