Ada:Compile:MPI
MPI Programs
There are currently two MPI stacks installed on Ada: OpenMPI and Intel MPI. The recommended MPI stack for software development is Intel MPI, and most of this section will focus on it.
Intel MPI
To use the Intel MPI environment you need to load the Intel module. This can be done with the following command:
[ netID@cluster ~]$ module load intel/2017A
Note: It is no longer possible to load a default intel module; you must specify the version you want to load, for the sake of consistency and clarity. More information about finding and loading modules can be found on our Modules Systems page.
Compiling MPI Code
To compile MPI code, an MPI compiler wrapper is used. The wrapper calls the appropriate underlying compiler with additional linker flags specific to MPI programs. The Intel MPI software stack has wrappers for the Intel compilers as well as wrappers for the GNU compilers. Any argument not recognized by the wrapper is passed on to the underlying compiler; therefore, any valid compiler flag (Intel or GNU) will also work when using the MPI wrappers.
The following table shows the most commonly used MPI wrappers provided by Intel MPI; a minimal example C program that these wrappers can compile is shown after the table.
| MPI Wrapper | Compiler | Language | Example |
|---|---|---|---|
| mpiicc | icc | C | mpiicc <compiler_flags> prog.c |
| mpicc | gcc | C | mpicc <compiler_flags> prog.c |
| mpiicpc | icpc | C++ | mpiicpc <compiler_flags> prog.cpp |
| mpicxx | g++ | C++ | mpicxx <compiler_flags> prog.cpp |
| mpiifort | ifort | Fortran | mpiifort <compiler_flags> prog.f90 |
| mpif90 | gfortran | Fortran | mpif90 <compiler_flags> prog.f90 |
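As a point of reference, the following is a minimal MPI program in C (an illustrative sketch, not part of the original examples; the file name mpi_hello.c is hypothetical) that any of the wrappers above can compile. Each task reports its rank and the total number of tasks:

/* mpi_hello.c - minimal MPI example (illustrative sketch) */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* initialize the MPI environment */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank (id) of this task */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of tasks */
    printf("Hello from task %d of %d\n", rank, size);
    MPI_Finalize();                         /* shut down the MPI environment */
    return 0;
}

Compiling and running it follows the same pattern as the examples later on this page, e.g. mpiicc -o mpi_hello.x mpi_hello.c followed by mpirun -np 4 mpi_hello.x.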
To see the full compiler command of any of the MPI wrapper scripts, use the -show flag. This flag does not actually call the compiler; it only prints the full compiler command and exits. This can be useful for debugging purposes and/or when experiencing problems with any of the compiler wrappers.
Example: Show the full compiler command for the mpiifort wrapper script
[ netID@cluster ~]$ mpiifort -show
ifort -I/software/easybuild/software/impi/4.1.3.049/intel64/include -I/software/easybuild/software/impi/4.1.3.049/intel64/include -L/software/easybuild/software/impi/4.1.3.049/intel64/lib -Xlinker --enable-new-dtags -Xlinker -rpath -Xlinker /software/easybuild/software/impi/4.1.3.049/intel64/lib -Xlinker -rpath -Xlinker -/opt/intel/mpi-rt/4.1 -lmpigf -lmpi -lmpigi -ldl -lrt -lpthread
Running MPI Code
Running MPI code requires an MPI launcher, which sets up the environment and starts the requested number of MPI tasks on the allocated nodes.
Use the following command to launch an MPI program, where [mpi_flags] are options passed to the MPI launcher, <executable> is the name of the MPI program, and [executable params] are optional parameters for the MPI program (we continue to assume use of the Intel MPI stack here):
[ netID@cluster ~]$ mpirun [mpi_flags] <executable> [executable params]
Note: <executable> must be on the $PATH; otherwise the launcher will not be able to find it.
For a list of the most common mpi_flags, see the table below. The table shows only a very small subset of all available flags. To see a full listing, type mpirun --help
| Flag | Description |
|---|---|
| -np <n> | The number of mpi tasks to start. |
| -n <n> | The number of mpi tasks to start (same as -np). |
| -perhost <n> | Places <n> consecutive (MPI) processes on each host/node. |
| -ppn <n> | Stands for Process (i.e., task) Per Node (same as -perhost). |
| -hostfile <file> | The name of the file that contains the list of host/node names the launcher will place tasks on. |
| -f <file> | Same as -hostfile. |
| -hosts {host list} | Comma separated list of specific host/node names. |
| -help | Shows list of available flags and options. |
Hybrid MPI/OpenMP Code
To compile hybrid MPI/OpenMP programs (i.e., MPI programs that also contain OpenMP directives), invoke the appropriate MPI wrapper and add the -openmp flag to enable processing of the OpenMP directives.
Running a hybrid program is very similar to running a pure MPI program. To control the number of OpenMP threads used per task, set the OMP_NUM_THREADS environment variable.
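For illustration, a minimal hybrid program in C might look as follows (a generic sketch, not taken from this page; the file name hybrid.c is hypothetical). MPI calls are made outside the OpenMP parallel region, and each task spawns OMP_NUM_THREADS threads inside it:

/* hybrid.c - minimal hybrid MPI/OpenMP sketch (illustrative) */
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char *argv[]) {
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    #pragma omp parallel
    {
        /* each MPI task executes this region with OMP_NUM_THREADS threads */
        printf("Task %d, thread %d of %d\n",
               rank, omp_get_thread_num(), omp_get_num_threads());
    }
    MPI_Finalize();
    return 0;
}

Such a source would be compiled analogously to Example 7 below, e.g. with mpiicc -openmp -o hybrid.x hybrid.c (the exact OpenMP flag may differ between compiler versions).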
Advanced: mapping tasks and threads
Explicitly mapping MPI tasks to processors can result in significantly better performance. This is especially true for hybrid MPI/OpenMP programs, where both MPI tasks and OpenMP threads are mapped onto the available cores of a node. The Intel MPI stack provides a way to control the pinning of MPI tasks using the environment variable I_MPI_PIN_DOMAIN.
[ netID@cluster ~]$ export I_MPI_PIN_DOMAIN=<domain>
where <domain> can have the following values: node, socket, core, cache1, cache2, cache3. The domain tells where to pin the tasks; for example, "socket" will pin the tasks on different sockets. The OpenMP threads are mapped according to the OpenMP affinity settings (e.g. OMP_PLACES and OMP_PROC_BIND, as in Example 9 below).
NOTE: The above syntax is just one way to describe the pinning. Please visit the Process Pinning documentation or the Intel MPI reference (see the Further Information section for the link) for alternative ways to pin tasks using the I_MPI_PIN_DOMAIN environment variable.
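One way to verify where tasks actually end up (a diagnostic sketch under the assumption of a Linux/glibc environment; not part of the original page, and the file name where_am_i.c is hypothetical) is to have every rank report the host and CPU it is currently running on:

/* where_am_i.c - report host name and CPU per task (illustrative sketch) */
#define _GNU_SOURCE
#include <stdio.h>
#include <sched.h>   /* sched_getcpu(), a glibc extension */
#include <mpi.h>

int main(int argc, char *argv[]) {
    int rank, namelen;
    char host[MPI_MAX_PROCESSOR_NAME];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &namelen);   /* name of the node this task runs on */
    printf("Task %d on host %s, cpu %d\n", rank, host, sched_getcpu());
    MPI_Finalize();
    return 0;
}

Running this with different I_MPI_PIN_DOMAIN settings shows how the tasks are distributed over the nodes, sockets, and cores.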
Examples
This section contains various examples of compiling and running MPI programs with the Intel toolchain.
Example 1: Compile an MPI program written in C and name the executable mpi_prog.x. Use the underlying Intel compiler with -O3 optimization.
[ netID@cluster ~]$ mpiicc -o mpi_prog.x -O3 mpi_prog.c
Example 2: Compile an MPI program written in Fortran and name the executable mpi_prog.x, this time using the underlying GNU Fortran compiler.
[ netID@cluster ~]$ mpif90 -o mpi_prog.x mpi_prog.f90
Example 3: Run an MPI program on the local host using 4 tasks.
[ netID@cluster ~]$ mpirun -np 4 mpi_prog.x
Example 4: Run an MPI program on a specific host, using 4 tasks.
[ netID@cluster ~]$ mpirun -np 4 -hosts login1 mpi_prog.x
Example 5: Run an MPI program on two different hosts, using 4 tasks and a host file, assigning the tasks in round-robin fashion.
[ netID@cluster ~]$ mpirun -np 4 -perhost 1 -hostfile mylist mpi_prog.x
where mylist is a file that contains the following lines:
login1
login2
Note: If you don't specify -perhost 1, all the tasks will be started on login1, even though the hostfile contains multiple entries.
Example 6: Run 4 different programs concurrently using mpirun (MPMD style program)
[ netID@cluster ~]$ mpirun -np 1 prog1.x : -np 1 prog2.x : -np 1 prog3.x : -np 1 prog4.x
Note: For executing a large number of serial (or OpenMP) programs, we recommend using the tamulauncher utility.
Example 7: Compile an MPI Fortran program named hybrid.f90 that also contains OpenMP directives, using the underlying Intel Fortran compiler.
[ netID@cluster ~]$ mpiifort -openmp -o hybrid.x hybrid.f90
Example 8: Run the hybrid program named hybrid.x with 8 tasks, where every task uses 2 threads in its OpenMP regions.
[ netID@cluster ~]$ export OMP_NUM_THREADS=2
[ netID@cluster ~]$ mpirun -np 8 ./hybrid.x
Example 9: Run a hybrid MPI/OpenMP program using 2 tasks and 10 threads per task, pin the tasks to different sockets, and map all OpenMP threads within their socket.
[ netID@cluster ~]$ export I_MPI_PIN_DOMAIN=socket
[ netID@cluster ~]$ export OMP_NUM_THREADS=10
[ netID@cluster ~]$ export OMP_PLACES="socket"
[ netID@cluster ~]$ export OMP_PROC_BIND="master"
[ netID@cluster ~]$ mpirun -np 2 ./hybrid.x
Further Information
For a detailed description of the Intel MPI stack, please visit the Intel MPI Developer Reference Manual. This site contains detailed information about the MPI compiler wrappers, an in-depth discussion of mpirun and its options, as well as guidance on tuning your application for best performance and on pinning tasks.
OpenMPI
Using OpenMPI is very similar to using Intel MPI, with a few minor differences. To use OpenMPI you will need to load one of the OpenMPI modules. Ada has OpenMPI versions built with the Intel compilers as well as the GNU compilers; the underlying compiler depends on the loaded OpenMPI module.
Example 1: Load OpenMPI version 1.8.4 with GNU as the underlying compiler.
[ netID@cluster ~]$ module load OpenMPI/1.8.4-GCC-4.9.2
To see a list of all available OpenMPI versions type:
[ netID@cluster ~]$ module spider openmpi
Compiling
The table below shows the various MPI compiler wrappers. The names are the same regardless of the underlying compiler.
| MPI Wrapper | Language | Example |
|---|---|---|
| mpicc | C | mpicc <compiler_flags> prog.c |
| mpic++ | C++ | mpic++ <compiler_flags> prog.cpp |
| mpif90 | Fortran | mpif90 <compiler_flags> prog.f90 |
To see the complete compiler command, use the -show flag.
Running
To launch an MPI program, use the mpirun command. This command is very similar to the Intel MPI mpirun launcher discussed above; however, some of the flags are different for OpenMPI. The table below shows some of the more common flags.
| Flag | Description |
|---|---|
| -np <n> | The number of MPI tasks to start. |
| -npernode <n> | Places <n> MPI processes on each allocated node. |
| -npersocket <n> | Places <n> MPI processes on each socket of each allocated node. |
| -hostfile <file> | The name of the file that contains the list of host/node names the launcher will place tasks on. |
| -host {host list} | Comma separated list of specific host/node names. |
To see all the available options and flags (including short descriptions) use the following command:
[ netID@cluster ~]$ mpirun -help