https://hprc.tamu.edu/w/api.php?action=feedcontributions&user=Pennings&feedformat=atomTAMU HPRC - User contributions [en]2022-08-09T08:41:59ZUser contributionsMediaWiki 1.31.15https://hprc.tamu.edu/w/index.php?title=SW:tamulauncher&diff=12999SW:tamulauncher2022-03-21T18:37:51Z<p>Pennings: </p>
<hr />
<div><br />
== tamulauncher ==<br />
<br />
'''tamulauncher''' provides a convenient way to run a large number of serial or multithreaded commands without the need to submit individual jobs or a job array. tamulauncher takes as its argument a text file containing all the commands that need to be executed and executes those commands concurrently. The number of concurrently executed commands depends on the batch requirements. When tamulauncher is run interactively, the number of concurrently executed commands is limited to at most 8. tamulauncher is available on grace and terra; there is no need to load any module. tamulauncher has been successfully tested executing over 100K commands.<br />
<br />
''tamulauncher is preferred over Job Arrays to submit a large number of individual jobs, especially when the run times of the commands are relatively short. It allows for better utilization of the nodes, puts less burden on the batch scheduler, and lessens interference with jobs of other users on the same node.'' <br />
<br />
=== Synopsis ===<br />
<pre><br />
[ NetID@ ~]$ tamulauncher --help<br />
Usage: /sw/local/bin/tamulauncher [options] FILE<br />
<br />
This script will execute commands in FILE concurrently. <br />
<br />
OPTIONS:<br />
<br />
--commands-pernode | -p <n> <br />
Set the number of concurrent processes per node.<br />
<br />
--norestart<br />
Do not restart.<br />
<br />
--status <commands file><br />
Prints number of finished commands and exits. <br />
<br />
--list <commands file><br />
Prints detailed list of all finished commands and exits.<br />
<br />
--remove-logs <commands file><br />
Removes the log directory and exits<br />
<br />
--version | -v<br />
Prints version and exits.<br />
<br />
--help | -h | ?<br />
Shows this message and exits.<br />
</pre><br />
<br />
=== Commands file ===<br />
<br />
The commands file is a regular text file containing all the commands that need to be executed, one command per line. A command can be a user-compiled program, a Linux command, a script (e.g. bash, Python, Perl), a software package, etc. Commands can also be compounded using the Linux semicolon operator. In general, any command that works when typed in a bash shell will work when executed using tamulauncher. Below is an example of a commands file; it illustrates that a commands file can contain any combination of commands (although in practice it is mostly a repetition of the same command with varying input parameters). Often a commands file can be generated automatically.<br />
<br />
<br />
<pre><br />
./prog1 125<br />
./prog2 "aa" 3 <br />
mkdir testcase1 ; cd testcase1; ./myprog<br />
./prog1 100<br />
:<br />
:<br />
:<br />
time ./prog3 <br />
python mypython.py<br />
./prog1 141 ; ./prog4 > OUTPUT<br />
./prog5 < myinput<br />
<br />
</pre><br />
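As mentioned above, a commands file can often be generated automatically. Below is a minimal sketch that writes one command per input file; ''./prog1'' and the data file names are placeholders, not part of tamulauncher:<br />

```shell
# Hypothetical sketch: generate a commands file with one line per input file.
# "./prog1" and the data file names are placeholders.
for i in 1 2 3
do
  echo "./prog1 data${i}.txt"
done > commands.in

cat commands.in
```

The resulting ''commands.in'' can then be passed directly to tamulauncher.<br />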
<br />
=== Dynamic release of resources ===<br />
<br />
tamulauncher will automatically release resources whenever they become idle. Resources are released on a per-node basis. This feature is especially useful in cases where the majority of requested cores/nodes are idle, taking up valuable resources, while only a few cores are processing the last few commands.<br />
<br />
'''NOTE:''' You might see some slurm error messages such as "srun: error: <NODE>: task 7: Killed". These messages can be safely ignored. <br />
<br />
'''NOTE:''' This is an experimental feature we are still improving. For that reason, as well as some needed changes to the calculation of SUs, <br />
the number of SUs charged will not be adjusted at this time. However, it will help to make the cluster less congested.<br />
<br />
== Examples ==<br />
<br />
The following sections show two simple examples of how to use tamulauncher. The first example shows how to run serial commands and the second shows how to run multi-threaded (OpenMP) commands.<br />
<br />
=== Example 1: Running serial commands ===<br />
<br />
<pre><br />
#!/bin/bash<br />
<br />
#SBATCH --export=NONE <br />
#SBATCH --get-user-env=L <br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=demo-tamulauncher<br />
#SBATCH --output=demo-tamulauncher.%j<br />
#SBATCH --time=07:00:00 <br />
#SBATCH --ntasks=480 <br />
#SBATCH --mem=4096M <br />
<br />
tamulauncher commands.in<br />
</pre><br />
<br />
In the above example, tamulauncher will extract the requirements specified in the SLURM script and distribute all the commands over the 480 tasks.<br />
<br />
<br />
=== Example 2: Running multi-threaded (OpenMP) commands ===<br />
<br />
The next example shows how to run a set of OpenMP executables, where every command will use 4 threads.<br />
<br />
<pre><br />
#!/bin/bash<br />
<br />
#SBATCH --export=NONE <br />
#SBATCH --get-user-env=L <br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=demo-tamulauncher<br />
#SBATCH --output=demo-tamulauncher.%j<br />
#SBATCH --time=07:00:00 <br />
#SBATCH --ntasks=100 <br />
#SBATCH --cpus-per-task=4<br />
#SBATCH --mem=4096M <br />
<br />
<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
tamulauncher commands.in<br />
</pre><br />
<br />
In the above example, tamulauncher will extract the requirements specified in the SLURM script and distribute all the commands over the 100 tasks. The SLURM '''--cpus-per-task''' option will make sure 4 cores are reserved for every task and every command can use up to 4 threads.<br />
<br />
== Automatic Restart ==<br />
<br />
tamulauncher keeps track of all commands that have been executed. When you start a tamulauncher job, it checks for a log (located in the directory ''.tamulauncher-log'') from a previous run and, if one exists, continues executing the commands that did not finish during that run. This is especially useful when a tamulauncher job was killed because it ran out of wall time or there was a system problem. To turn off the automatic restart option, use the '''--norestart''' flag in your tamulauncher command, e.g.<br />
<br />
tamulauncher --norestart commands.in<br />
<br />
With this option, tamulauncher will wipe all the log files and start as if it were a first run.<br />
<br />
'''NOTE:''' tamulauncher keeps a log for every unique commands file. If you make any changes to the commands file, tamulauncher will assume it's a different commands file and will create a new log directory. This also means multiple tamulauncher runs can be executed in the same directory.<br />
<br />
== Monitoring runs ==<br />
<br />
To see how many commands have been executed use the '''--status''' flag in tamulauncher:<br />
<br />
[ NetID@ ~]$ '''tamulauncher --status <command file>'''<br />
<br />
This will show a one-line summary with the number of commands executed and the total number of commands for the tamulauncher run on ''<command file>''.<br />
<br />
To see a full listing of all finished commands use the '''--list''' flag in tamulauncher:<br />
<br />
[ NetID@ ~]$ '''tamulauncher --list <command file>'''<br />
<br />
This will show a list of all commands that have finished executing, including the index in the commands file, total run time, and exit status for the tamulauncher run on ''<command file>''.<br />
<br />
<br />
== Clearing the log ==<br />
<br />
To clear the log for a particular tamulauncher run, use the '''--remove-logs''' flag.<br />
<br />
[ NetID@ ~]$ '''tamulauncher --remove-logs <command file>'''<br />
<br />
This will clear the logs for the latest tamulauncher run on commands file <command file>. '''NOTE:''' don't clear the logs while tamulauncher is still running on that particular <commands file>.</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:tamubatch&diff=12986SW:tamubatch2022-03-04T19:54:25Z<p>Pennings: </p>
<hr />
<div><br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job submission script on Terra and Grace that allows you to submit job files without batch parameters. tamubatch takes a job file of executable commands and adds the batch parameters directly to the job file, so the user does not have to enter them every time. The user can specify batch parameters using flags or rely on tamubatch's default parameters.<br />
<br />
=== Synopsis ===<br />
<pre><br />
[ NetID@ ~]$ tamubatch -help<br />
<br />
OPTIONS:<br />
<br />
SUBMISSION FLAGS:<br />
<br />
--walltime | -W <0:00><br />
Sets the walltime for the current job in format 0:00.<br />
Default is 30 minutes<br />
<br />
--GPU | -gpu<br />
selects the gpu node to run the current job.<br />
<br />
--cores | -n <n><br />
Sets the total number of cores for the current job.<br />
Default is one Core<br />
<br />
--cores-per-node | -R <n><br />
Sets the number of cores per compute node.<br />
Default is the number of cores requested from the --cores flag (if specified) with a <br />
maximum of 48 cores per node on Grace and 28 cores per node on Terra.<br />
<br />
--total-memory | -M <n>MB/G<br />
Sets the overall memory limit for the current job. Must specify MB or G<br />
Default is <4000MB on grace, 2000MB on terra> * Number of cores.<br />
<br />
--project-account | -P <Account Number><br />
Sets the account to charge for the current job.<br />
<br />
--extras | -x "<all other LSF/SLURM flags>"<br />
Extra batch submission features for Users.<br />
<br />
<br />
ALL OTHER FLAGS:<br />
<br />
--command | -command "<bash commands>"<br />
Intended for Rapid prototyping<br />
<br />
--download | -download<br />
Generates script for download. Will not submit a job if flag is called. <br />
<br />
--help | -h | -help<br />
Shows this message and exits.<br />
</pre><br />
<br />
=== Batch file ===<br />
<br />
The batch file is a regular text file with the commands to be executed on the cluster. These can be Linux commands, cluster commands, etc. Below is an example of commands that might be found in a batch job file.<br />
<br />
<pre><br />
#my_job_file<br />
<br />
echo "Hello"<br />
ml<br />
ml purge<br />
cd /scratch/user/directory<br />
./my_script<br />
</pre><br />
<br />
=== Examples ===<br />
<br />
The following are examples of calls to tamubatch.<br />
<br />
'''Example 1: default submission'''<br />
<br />
<pre><br />
[ NetID@ ~]$ tamubatch my_job_file <br />
</pre><br />
<br />
In the above example, tamubatch will identify the job file (my_job_file) and then submit the job to the cluster with default parameters. The default parameters are 1 core and 1 core per node. On Grace, the default amount of memory per core is 4000MB (4.0G). On Terra, the default amount of memory per core is 2000MB (2.0G). <br />
<br />
'''Example 2: job submission with flags'''<br />
<br />
<pre><br />
[ NetID@ ~]$ tamubatch my_job_file -W 1:00 -n 20 -R 20 -M 50G<br />
</pre><br />
<br />
In the above example, tamubatch will set the wall time to one hour, the number of cores to 20, the number of cores per node to 20, and the total amount of memory for the job as 50G. tamubatch will then submit the job (my_job_file) directly to the cluster.<br />
<br />
'''Example 3: job submission with GPU node and extra batch parameters'''<br />
<br />
<pre><br />
[ NetID@ ~]$ tamubatch my_job_file -gpu -W 5:00 -n 40 -R 20 -M 80G -x "--mail-type=ALL --mail-user=netid@tamu.edu"<br />
</pre><br />
<br />
After setting the number of cores, the number of cores per node, and the memory in the example above, tamubatch will select the gpu node to run the current job (my_job_file). tamubatch will then include extra batch parameters in the job submission. In this specific example, the user will be alerted through email when their batch job starts to run and when their batch job is complete.<br />
<br />
<br />
'''NOTE:''' the -x flag can be used to add any additional batch scheduler flag on either Grace or Terra.<br />
<br />
<br />
'''Example 4: job submission with commands'''<br />
<br />
<pre><br />
[ NetID@ ~]$ tamubatch my_job_file -W 1:00 -n 20 -R 20 -M 50G -command "echo hello; cd /user/net-id/"<br />
</pre><br />
<br />
tamubatch will insert the same batch parameters as in Example 2. It will then append the commands entered with the -command flag to the end of the batch job file (my_job_file) and submit the job to the cluster.<br />
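Conceptually, this can be pictured as appending the quoted commands to a copy of the job file before submission. The sketch below is an illustration only; file names and commands are placeholders, and this is not tamubatch's actual implementation:<br />

```shell
# Illustration only: emulate how "-command" text is appended to the
# end of a batch job file before submission. All names are placeholders.
printf '%s\n' 'echo "Hello"' './my_script' > my_job_file
extra='echo hello; cd /tmp'     # stands in for the -command argument
cp my_job_file my_job_file.submit
printf '%s\n' "$extra" >> my_job_file.submit
cat my_job_file.submit
```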
<br />
<br />
A visual representation can be found in [https://www.youtube.com/watch?v=sP7GnQNvMVo this video] on our YouTube channel.</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Terra:Batch_Job_Submissions&diff=12934Terra:Batch Job Submissions2022-02-04T17:19:05Z<p>Pennings: </p>
<hr />
<div>== Job Submission ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@terra1 ~]$ '''sbatch ''MyJob.slurm''''' <br />
Submitted batch job 3606<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user on grace and terra without the need to write a batch script. The user just needs to provide the executable commands in a text file and tamubatch will automatically submit the job to the cluster. There are flags that the user may specify which allow control over the parameters of the submitted job.<br />
<br />
For more information, visit [https://hprc.tamu.edu/wiki/SW:tamubatch this page.]<br />
<br />
== tamulauncher ==<br />
<br />
'''tamulauncher''' provides a convenient way to run a large number of serial or multithreaded commands without the need to submit individual jobs or a job array. The user provides a text file containing all commands that need to be executed and tamulauncher will execute the commands concurrently. The number of concurrently executed commands depends on the batch requirements. When tamulauncher is run interactively, the number of concurrently executed commands is limited to at most 8. tamulauncher is available on terra and grace. There is no need to load any module before using tamulauncher. tamulauncher has been successfully tested executing over 100K commands.<br />
<br />
''tamulauncher is preferred over Job Arrays to submit a large number of individual jobs, especially when the run times of the commands are relatively short. It allows for better utilization of the nodes, puts less burden on the batch scheduler, and lessens interference with jobs of other users on the same node.'' <br />
<br />
For more information, visit [https://hprc.tamu.edu/wiki/SW:tamulauncher#tamulauncher this page.]<br />
<br />
[[Category:Terra]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:R&diff=12657SW:R2021-10-07T03:01:22Z<p>Pennings: </p>
<hr />
<div>=R=<br />
__TOC__<br />
==Description==<br />
R is a free software environment for statistical computing and graphics. <br><br />
Homepage: [http://www.r-project.org/ http://www.r-project.org/]<br />
<br />
==Access==<br />
R is open to all HPRC users.<br />
<br />
===Loading the Module===<br />
To see all versions of R available:<br />
[NetID@cluster ~]$ '''module spider R'''<br />
<br />
To load a particular version of R (Example: 3.4.2 with the iomkl toolchain):<br />
[NetID@cluster ~]$ '''module load R/3.4.2-iomkl-2017A-Python-2.7.12-default-mt'''<br />
<br />
=== R_tamu ===<br />
<br />
Loading the R module will set up the environment for the base R installation without any additional packages. For the user's convenience, HPRC developed an extension to R called '''R_tamu''', which is built on top of R and provides a large number of additionally installed packages not found in the base R version. R_tamu also makes it easy to install personal packages. In addition, R_tamu can act as an R-project environment manager.<br />
<br />
To see all versions of R_tamu available:<br />
[NetID@cluster ~]$ '''module spider R_tamu'''<br />
<br />
For more information about R_tamu, please visit the [[SW:R_tamu | R_tamu Wiki page]]<br />
<br />
===Installing Packages===<br />
<br />
While there are many packages available with the '''R_tamu''' module, you may find that a package you need is not installed. If you think a particular package might be useful for other R users, you can [https://hprc.tamu.edu/about/contact.html contact us] with a request to install the package system-wide. Alternatively, you can install any package yourself in your own directory. <br />
<br />
The most common way to install a package is to start an interactive R session and use the R function<br />
<br />
> '''install.packages("''package_name''")'''<br />
<br />
<br />
Alternatively, you can use the '''R CMD INSTALL''' command from the shell. This is useful when you already have a local copy of the R package (*.tar.gz format). <br />
<br />
R_tamu sets the R environment variable '''R_LIBS_USER''' to ${SCRATCH}/R_LIBS/<VERSION> (where <VERSION> is the currently loaded R version), so all packages will be automatically installed in that directory. <br />
<br />
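For illustration, with a hypothetical ${SCRATCH} value and R version (both placeholders), the resulting library path looks like:<br />

```shell
# Illustration only: example values for SCRATCH and the loaded R version.
SCRATCH=/scratch/user/netid
VERSION=3.4.2
R_LIBS_USER=${SCRATCH}/R_LIBS/${VERSION}
echo "$R_LIBS_USER"   # prints /scratch/user/netid/R_LIBS/3.4.2
```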
In case you are using the R module, you will be asked to provide a personal directory in which to install packages. The easiest way is to set '''R_LIBS_USER''' before you start your R session. For example:<br />
<br />
[NetID@cluster ~]$ export R_LIBS_USER=${SCRATCH}/myRlibs<br />
<br />
<br />
'''NOTE:''' In case you are using the R module, to be able to use the installed packages, R_LIBS_USER needs to be set every time before starting an R Session.<br />
<br />
<br />
<font color=teal> If you have trouble installing packages for yourself, you can also [https://hprc.tamu.edu/about/contact.html contact us] with any concerns. </font><br />
<br />
<br> <br />
<br />
{{:SW:Login_Node_Warning}}<br />
{{:SW:Compute_Node_Info}}<br />
<br />
=== Slurm Example (terra)===<br />
'''Example 1:''' A serial (single core) R Job example: (Last updated March 24, 2017) <br />
#!/bin/bash<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=R_Job # Sets the job name to R_Job<br />
#SBATCH --time=5:00:00 # Sets the runtime limit to 5 hr<br />
#SBATCH --ntasks=1 # Requests 1 core<br />
#SBATCH --ntasks-per-node=1 # Requests 1 core per node (1 node)<br />
#SBATCH --mem=5G # Requests 5GB of memory per node<br />
#SBATCH --output=stdout1.o%J # Sends stdout and stderr to stdout1.o[jobID]<br />
<br />
## Load the necessary modules<br />
module purge<br />
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt<br />
<br />
## Launch R with proper parameters <br />
Rscript ''myScript.R''<br />
<br />
'''Example 2:''' A parallel (multiple core) R Job example, where ''myScript.R'' is a script that requests 10 slaves. (Last updated March 24, 2017) <br><br />
'''Note:''' The number of cores requested should match the number of slaves requested. <br />
#!/bin/bash<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=R_Job # Sets the job name to R_Job<br />
#SBATCH --time=5:00:00 # Sets the runtime limit to 5 hr<br />
#SBATCH --ntasks=10 # Requests 10 cores<br />
#SBATCH --ntasks-per-node=10 # Requests 10 cores per node (1 node)<br />
#SBATCH --mem=50G # Requests 50GB of memory per node<br />
#SBATCH --output=stdout1.o%J # Sends stdout and stderr to stdout1.o[jobID]<br />
<br />
## Load the necessary modules<br />
module purge<br />
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt<br />
<br />
## Launch R with proper parameters <br />
mpirun -np 1 Rscript ''myScript.R''<br />
<br />
<br />
To submit the batch job, run: (where ''jobscript'' is a file that looks like one of the above examples)<br />
[ NetID@terra ~]$ '''sbatch ''jobscript'''''<br />
<br />
{{:SW:VNC_Node_Warning}}<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:R&diff=12656SW:R2021-10-07T00:28:02Z<p>Pennings: /* R */</p>
<hr />
<div>=R=<br />
__TOC__<br />
==Description==<br />
R is a free software environment for statistical computing and graphics. <br><br />
Homepage: [http://www.r-project.org/ http://www.r-project.org/]<br />
<br />
==Access==<br />
R is open to all HPRC users.<br />
<br />
===Loading the Module===<br />
To see all versions of R available:<br />
[NetID@cluster ~]$ '''module spider R'''<br />
<br />
To load a particular version of R (Example: 3.4.2 with the iomkl toolchain):<br />
[NetID@cluster ~]$ '''module load R/3.4.2-iomkl-2017A-Python-2.7.12-default-mt'''<br />
<br />
=== R_tamu ===<br />
<br />
Loading the R module will set up the environment for the base R installation without any additional packages. For the user's convenience, HPRC developed an extension to R called '''R_tamu''', which is built on top of R and provides a large number of additional pre-installed packages not found in the base R installation. R_tamu also makes it easy to install personal packages and can act as an R-project environment manager.<br />
<br />
To see all available versions of R_tamu:<br />
[NetID@cluster ~]$ '''module spider R_tamu'''<br />
<br />
For more information about R_tamu, please visit the [[SW:R_tamu | R_tamu Wiki page]]<br />
<br />
===Installing Packages===<br />
<br />
While there are many packages available with the '''R_tamu''' module, you may find that a package you need is not installed. If you think that a particular package might be useful for other R users, you can [https://hprc.tamu.edu/about/contact.html contact us] with a request to install this package systemwide. Alternatively, you can install any package yourself in your own directory. <br />
<br />
The most common way to install a package is to start an interactive R session and use the R function<br />
<br />
> '''install.packages("''package_name''")'''<br />
<br />
<br />
Alternatively, you can use the '''R CMD INSTALL''' command from the shell. This is useful when you already have a local copy of the R package (in *.tar.gz format). <br />
<br />
R_tamu sets the R environment variable '''R_LIBS_USER''' to ${SCRATCH}/R_LIBS/<VERSION> (where <VERSION> is the currently loaded R version), so all packages will be installed in that directory automatically. <br />
<br />
If you are using the base R module, you will be asked to provide a personal directory in which to install packages. The easiest way is to set '''R_LIBS_USER''' before you start your R session. For example:<br />
<br />
[NetID@cluster ~]$ export R_LIBS_USER=${SCRATCH}/myRlibs<br />
<br />
<br />
'''NOTE:''' When using the base R module, R_LIBS_USER needs to be set every time before starting an R session in order to use the previously installed packages.<br />
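As a concrete sketch of the note above (assuming the cluster defines '''SCRATCH'''; the directory name ''myRlibs'' is just an example), the variable can be set and the library directory created before starting R:

```shell
# Set a personal R library location before starting an R session.
# SCRATCH is normally defined by the cluster environment; the $HOME
# fallback only keeps this snippet self-contained elsewhere.
export R_LIBS_USER="${SCRATCH:-$HOME}/myRlibs"
mkdir -p "$R_LIBS_USER"
echo "personal R library: $R_LIBS_USER"
```

Adding the export line to your job scripts (or shell startup file) avoids having to retype it for every session.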
<br />
<br />
<font color=teal> If you have trouble installing packages for yourself, you can also [https://hprc.tamu.edu/about/contact.html contact us] with any concerns. </font><br />
<br />
<br> <br />
<br />
{{:SW:Login_Node_Warning}}<br />
{{:SW:Compute_Node_Info}}<br />
<br />
=== Slurm Example (terra)===<br />
'''Example 1:''' A serial (single core) R Job example: (Last updated March 24, 2017) <br />
#!/bin/bash<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=R_Job # Sets the job name to R_Job<br />
#SBATCH --time=5:00:00 # Sets the runtime limit to 5 hr<br />
#SBATCH --ntasks=1 # Requests 1 core<br />
#SBATCH --ntasks-per-node=1 # Requests 1 core per node (1 node)<br />
#SBATCH --mem=5G # Requests 5GB of memory per node<br />
#SBATCH --output=stdout1.o%J # Sends stdout and stderr to stdout1.o[jobID]<br />
<br />
## Load the necessary modules<br />
module purge<br />
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt<br />
<br />
## Launch R with proper parameters <br />
Rscript ''myScript.R''<br />
<br />
'''Example 2:''' A parallel (multiple core) R Job example, where ''myScript.R'' is a script that requests 10 slaves. (Last updated March 24, 2017) <br><br />
'''Note:''' The number of cores requested should match the number of slaves requested. <br />
#!/bin/bash<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=R_Job # Sets the job name to R_Job<br />
#SBATCH --time=5:00:00 # Sets the runtime limit to 5 hr<br />
#SBATCH --ntasks=10 # Requests 10 cores<br />
#SBATCH --ntasks-per-node=10 # Requests 10 cores per node (1 node)<br />
#SBATCH --mem=50G # Requests 50GB of memory per node<br />
#SBATCH --output=stdout1.o%J # Sends stdout and stderr to stdout1.o[jobID]<br />
<br />
## Load the necessary modules<br />
module purge<br />
module load R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt<br />
<br />
## Launch R with proper parameters <br />
mpirun -np 1 Rscript ''myScript.R''<br />
<br />
<br />
To submit the batch job, run: (where ''jobscript'' is a file that looks like one of the above examples)<br />
[ NetID@terra ~]$ '''sbatch ''jobscript'''''<br />
<br />
{{:SW:VNC_Node_Warning}}<br />
<br />
==Usage on Large Memory Nodes==<br />
Large memory nodes allow computations that require more memory than a standard node provides. To use one of the following modules on the large memory nodes:<br />
*R_tamu/3.3.2-iomkl-2017A-Python-2.7.12-default-mt<br />
*R_tamu/3.3.2-intel-2017A-Python-2.7.12-default-mt<br />
please follow the steps below.<br />
<br />
Load the Westmere module (named for the large memory nodes' architecture) to set up the environment for loading and using Westmere-specific modules:<br />
<pre>module load Westmere</pre><br />
Load the [https://hprc.tamu.edu/wiki/SW:R_tamu R_tamu] module using<br />
<pre>module load R_tamu/3.3.2-intel-2017A-Python-2.7.12-default-mt</pre><br />
When installing your own packages, create your own Westmere environment:<br />
<pre>R --rtamenvs=westmere</pre><br />
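The steps above can be combined in a batch script. The sketch below writes one to a file for inspection; the '''#SBATCH''' values are illustrative (the large memory nodes may require additional options, such as a specific partition), and the module names follow the list above.

```shell
# Sketch of a large-memory job script; resource values are illustrative.
cat > westmere_R.slurm <<'EOF'
#!/bin/bash
#SBATCH --job-name=R_bigmem
#SBATCH --time=5:00:00
#SBATCH --ntasks=1
#SBATCH --mem=100G

module load Westmere             # enables Westmere-specific modules
module load R_tamu/3.3.2-intel-2017A-Python-2.7.12-default-mt
Rscript myScript.R
EOF
echo "wrote westmere_R.slurm"
```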
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12557SW:Matlab2021-09-22T22:18:08Z<p>Pennings: </p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start Matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded, using as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases, since the speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node, and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
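As a sketch of how the flags combine (''matlabsubmit'' itself only exists on the HPRC clusters, so this snippet just composes and prints the command lines; the script and function names are placeholders):

```shell
# Compose the example invocation above: 7-hour wall-time, 4 threads.
CMD='matlabsubmit -t 07:00 -s 4 myscript.m'
echo "$CMD"

# A GPU run of a function call; parentheses require double quotes
# around the function call.
GPU_CMD='matlabsubmit -g -f "myfunc(21)"'
echo "$GPU_CMD"

# Save the composed commands so they can be inspected.
printf '%s\n%s\n' "$CMD" "$GPU_CMD" > matlabsubmit_examples.txt
```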
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE:''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed in double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<!-- <font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><be> --><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general introduction to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox] section on the Mathworks website. Here we will discuss how to use Matlab cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads every worker can use, where to store meta-data, and other settings. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the node on which the Matlab client is running. <br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested, as well as batch scheduler properties such as wall-time and memory.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''.<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
=== Getting the Cluster Profile Object ===<br />
<br />
To get a '''TAMUClusterProperties''' object you can do the following:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
</pre><br />
<br />
'''tp''' is an object of type '''TAMUClusterProperties''' with default values for all the properties. To see all the properties, you can simply print the value of '''tp'''. You can easily change the values using the convenience methods of '''TAMUClusterProperties'''.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
</pre><br />
<br />
== Creating a Parallel Pool ==<br />
<br />
To start a parallel pool, you can use the HPRC convenience function '''tamuprofile.parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
For example:<br />
<pre><br />
mypool = tamuprofile.parpool(tp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the default cluster profile, with 4 additional workers. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the parpool block will be destroyed once the block is finished.<br />
<br />
==== Alternative approach to create parallel pool ====<br />
<br />
Matlab already provides functions to create parallel pools, namely ''parcluster(<string clustername>)'' and ''parpool(<parcluster object>)''. You can use these functions as well, but it is more complicated to set all the properties correctly (we will not discuss how to do that here). To create a parallel pool using the basic Matlab functions, you can do the following: <br />
<br />
<pre><br />
cp = parcluster('TAMU2020b')<br />
% add code to set the number of workers manually. There is no uniform way to do this and might depend on the type of cluster profile and the batch scheduler (e.g. Slurm)<br />
mypool = parpool(cp);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
TAMU HPRC also provides a convenience function that returns a fully populated ''parcluster'' object, which can be passed to the Matlab ''parpool'' function. See below for an example that creates a pool with 4 workers:<br />
<br />
<pre><br />
tp = TAMUClusterProperties();<br />
tp.workers(4);<br />
cp = tamuprofile.parcluster(tp);<br />
mypool = parpool(cp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
<br />
<br />
<br />
<!-- <br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the gpu is actually very straightforward. Matlab provides GPU versions for many build-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU. The results of these operations will also reside on the GPU. To see what functions can be run on the GPU type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead of executing code on the gpu because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This functions shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU wit name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client, a matrix multiplication on the GPU, and prints out elapsed times for both. The actual cpu-gpu matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
<br />
--><br />
<br />
<br />
<br />
<!-- <br />
<br />
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions: E.g.<br />
<br />
>> help TAMUClusterProperties/doc <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]<br />
--></div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12556SW:Matlab2021-09-22T22:16:08Z<p>Pennings: /* Running (parallel) Matlab Scripts on HPRC compute nodes */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will setup the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distribued over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and makes sure Matlab will use 4 computational threads for its run ( '''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<!-- <font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><be> --><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC cluster. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads every worker can use, where to store meta-data, as well as some other meta-data. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running. <br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework, to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object. The TAMUClusterProperties object keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested as well as batch scheduler properties such as wall-time and memory. '''TAMUClusterProperties'''.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done using by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
=== Getting the Cluster Profile Object ===<br />
<br />
To get a '''TAMUClusterProperties''' object you can do the following:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
</pre><br />
<br />
'''tp''' is an object of type '''TAMUClusterProperties''' with default values for all the properties. To see all the properties, you can just print the value of '''tp'''. You can easily change the values using the convenience methods of '''TAMUClusterProperties'''<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
</pre><br />
<br />
== Creating a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamuprofile.parpool'''. It takes as argument a '''TAMUClustrerProperties''' object that specifies all the resources that are requested. <br />
<br />
For example:<br />
<pre><br />
mypool = tamuprofile.parpool(tp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the imported cluster profile, with the 4 workers specified in '''tp'''. <br />
<br />
NOTE: only instructions within ''parfor'' and ''spmd'' blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside a ''parfor'' or ''spmd'' block are destroyed once the block is finished.<br />
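<br />
For example, in the following sketch (the loop body is illustrative only) the ''parfor'' iterations are distributed over the 4 workers, while all other statements run on the client:<br />
<pre><br />
tp = TAMUClusterProperties;<br />
tp.workers(4);<br />
mypool = tamuprofile.parpool(tp);<br />
n = 100;<br />
results = zeros(1,n);<br />
parfor i = 1:n<br />
    results(i) = i^2;   % executed on the workers<br />
end<br />
delete(mypool)<br />
</pre><br />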
<br />
==== Alternative approach to create parallel pool ====<br />
<br />
Matlab already provides functions to create parallel pools, namely ''parcluster(<string clustername>)'' and ''parpool(<parcluster object>)''. You can use these functions as well, but it is more complicated to set all the properties correctly (we will not discuss how to do that here). To create a parallel pool using the basic Matlab functions, you can do the following: <br />
<br />
<pre><br />
cp = parcluster('TAMU2020b');<br />
% add code to set the number of workers manually; there is no uniform way to do this and it may depend on the type of cluster profile and the batch scheduler (e.g. Slurm)<br />
mypool = parpool(cp);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
TAMU HPRC also provides a convenience function that returns a fully populated ''parcluster'' object, which can be passed to the Matlab ''parpool'' function. See below for an example that creates a pool with 4 workers:<br />
<br />
<pre><br />
tp = TAMUClusterProperties();<br />
tp.workers(4);<br />
cp = tamuprofile.parcluster(tp);<br />
mypool = parpool(cp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
== Using GPU ==<br />
<br />
Normally, all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in functions. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and their results will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
<pre><br />
methods('gpuArray')<br />
</pre><br />
This will show a list of all available functions that can be run on the GPU, as well as a list of static functions to create data on the GPU directly (discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers between the client and the GPU. <br />
<br />
Another useful function is:<br />
<pre><br />
gpuDevice<br />
</pre><br />
This function shows all the properties of the GPU. When it is called from the client (or from a node without a GPU), it will just print an error message.<br />
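<br />
If a script may run on nodes with or without a GPU, you can guard the GPU-specific code instead of letting the call error out. A minimal sketch:<br />
<pre><br />
if gpuDeviceCount > 0<br />
    disp(gpuDevice)           % show the properties of the selected GPU<br />
else<br />
    disp('No GPU available on this node')<br />
end<br />
</pre><br />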
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace, Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU: it creates a random matrix directly on the GPU, multiplies it with itself on the GPU, and copies the result back to the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
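<br />
To measure the benefit of running on the GPU, you can time the client and GPU versions yourself. A minimal sketch (variable names are illustrative) using ''tic''/''toc'', with a ''wait'' call to make sure the asynchronous GPU computation has finished before the timer stops:<br />
<pre><br />
a = rand(1000);<br />
tic; b = a*a; tcpu = toc;                      % multiplication on the client<br />
ag = gpuArray(a);                              % copy the matrix to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;  % multiplication on the GPU<br />
fprintf('client: %8.5f s, GPU: %8.5f s\n', tcpu, tgpu);<br />
c = gather(bg);                                % copy the result back<br />
</pre><br />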
<br />
<!-- <br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions: E.g.<br />
<br />
>> help TAMUClusterProperties/doc <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]<br />
--></div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12555SW:Matlab2021-09-22T22:15:23Z<p>Pennings: /* Using Matlab Parallel Toolbox on HPRC Resources */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start Matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded, using as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8, which should suffice for most cases. The speedup achieved through multi-threading depends on many factors and is not guaranteed. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
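<br />
You can verify the current setting with Matlab's built-in '''maxNumCompThreads''' function, which returns the current number of computational threads:<br />
<pre><br />
>> feature('NumThreads',4);<br />
>> maxNumCompThreads<br />
</pre><br />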
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the HPRC portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script or start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node, and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE:''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed in double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<!-- <font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><be> --><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general introduction to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster Profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads every worker can use, where to store meta-data, and other settings. There are two kinds of profiles:<br />
<br />
* local profiles: parallel processing is limited to the node the Matlab client is running on. <br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested, as well as batch scheduler properties such as wall-time and memory.<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12554SW:Matlab2021-09-22T22:13:20Z<p>Pennings: /* Starting a Parallel Pool */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will setup the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distribued over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and makes sure Matlab will use 4 computational threads for its run ( '''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC cluster. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads every worker can use, where to store meta-data, as well as some other meta-data. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running. <br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework, to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object. The TAMUClusterProperties object keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested as well as batch scheduler properties such as wall-time and memory. '''TAMUClusterProperties'''.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done using by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
=== Getting the Cluster Profile Object ===<br />
<br />
To get a '''TAMUClusterProperties''' object you can do the following:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
</pre><br />
<br />
'''tp''' is an object of type '''TAMUClusterProperties''' with default values for all the properties. To see all the properties, you can simply print the value of '''tp'''. You can easily change the values using the convenience methods of '''TAMUClusterProperties'''.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
</pre><br />
<br />
== Creating a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamuprofile.parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources being requested. <br />
<br />
For example:<br />
<pre><br />
mypool = tamuprofile.parpool(tp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the resources specified in the '''TAMUClusterProperties''' object '''tp''' (in the example above, 4 workers). <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks are destroyed once the block finishes.<br />
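As a minimal illustration of this division of work, the sketch below assumes a pool has already been created with '''tamuprofile.parpool''' as above:<br />
<br />
<pre><br />
n = 8;                     % executed on the client<br />
results = zeros(1,n);<br />
parfor i = 1:n             % iterations are distributed over the pool workers<br />
    results(i) = i^2;      % executed on a worker<br />
end<br />
disp(results)              % 'results' is available again on the client<br />
</pre><br />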
<br />
==== Alternative approach to create parallel pool ====<br />
<br />
Matlab already provides functions to create parallel pools, namely ''parcluster(<string clustername>)'' and ''parpool(<parcluster object>)''. You can use these functions as well, but it will be more complicated to set all the properties correctly (we will not discuss how to do that here). To create a parallel pool using the basic Matlab functions, you can do the following: <br />
<br />
<pre><br />
cp = parcluster('TAMU2020b')<br />
% add code to set the number of workers manually. There is no uniform way to do this and might depend on the type of cluster profile and the batch scheduler (e.g. Slurm)<br />
mypool = parpool(cp);<br />
:<br />
delete(mypool)<br />
</pre><br />
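For instance, one property that usually needs to be set manually is the number of workers; standard ''parcluster'' objects expose this as the '''NumWorkers''' property. The sketch below assumes this is the only property that needs changing; on a real scheduler, additional settings may be required:<br />
<br />
<pre><br />
cp = parcluster('TAMU2020b');<br />
cp.NumWorkers = 4;   % request 4 workers; scheduler-specific properties may also be needed<br />
mypool = parpool(cp);<br />
:<br />
delete(mypool)<br />
</pre><br />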
<br />
TAMU HPRC also provides a helper function that returns a fully populated ''parcluster'' object, which can be passed to the Matlab ''parpool'' function. The example below creates a pool with 4 workers:<br />
<br />
<pre><br />
tp = TAMUClusterProperties();<br />
tp.workers(4);<br />
cp = tamuprofile.parcluster(tp);<br />
mypool = parpool(cp)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
== Using GPU ==<br />
<br />
Normally, all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
<pre><br />
methods('gpuArray')<br />
</pre><br />
This will show a list of all available functions that can be run on the GPU, as well as a list of static functions to create data directly on the GPU (discussed later). <br />
<br />
NOTE: There is significant overhead in executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
<pre><br />
gpuDevice<br />
</pre><br />
This function shows all the properties of the GPU. When this function is called from the client (or on a node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable carr to the GPU under the name garr. <br />
<br />
In the example above, the 1000x1000 matrix needs to be copied from the client workspace to the GPU, which involves significant overhead.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example creates a random matrix directly on the GPU, performs a matrix multiplication on the GPU, and copies the result back to the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
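To actually compare elapsed times on the CPU and the GPU, one possible sketch uses '''tic'''/'''toc'''; '''wait(gpuDevice)''' is needed so the timer includes the asynchronous GPU computation:<br />
<br />
<pre><br />
a = rand(1000);                 % created on the client (CPU)<br />
tic; b = a*a; tcpu = toc;<br />
<br />
ag = gpuArray(a);               % copy the matrix to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;<br />
<br />
fprintf('cpu: %f s  gpu: %f s\n', tcpu, tgpu);<br />
</pre><br />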
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]].<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, the remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', want to use 4 workers, and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
If you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the name of the HPRC cluster you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''', you can use the Matlab '''help''' or '''doc''' functions, e.g.:<br />
<br />
 >> help TAMUClusterProperties<br />
 >> doc TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
    j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
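How results are retrieved depends on how '''tamu_run_batch''' creates the job; for a script-based batch job, the usual Parallel Computing Toolbox pattern is the following sketch (generic, not HPRC-specific):<br />
<br />
<pre><br />
>> wait(myjob);      % block until the job has finished<br />
>> load(myjob);      % load the script's workspace variables into the client workspace<br />
</pre><br />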
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12552SW:Matlab2021-09-22T21:40:47Z<p>Pennings: /* Importing Cluster Profile */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors, and in certain cases multi-threading provides little or no benefit. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
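A related, documented alternative is '''maxNumCompThreads''', which can both query and set the number of computational threads:<br />
<br />
<pre><br />
>> maxNumCompThreads       % query the current number of computational threads<br />
>> maxNumCompThreads(4);   % limit Matlab to 4 computational threads<br />
</pre><br />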
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
  -s     set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general intro to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads each worker can use, where to store meta-data, and several other settings. There are two kinds of profiles:<br />
<br />
* local profiles: parallel processing is limited to the node the Matlab client is running on. <br />
* cluster profiles: parallel processing can span multiple nodes; the profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here; from the user's perspective, processing with a local profile works exactly the same as processing with a cluster profile.<br />
<br />
<br />
TAMU HPRC provides a framework to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested, as well as batch scheduler properties such as wall-time and memory.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''.<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
=== Getting the Cluster Profile Object ===<br />
<br />
To get a '''TAMUClusterProperties''' object you can do the following:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
</pre><br />
<br />
'''tp''' is an object of type '''TAMUClusterProperties''' with default values for all the properties. To see all the properties, you can just print the value of '''tp'''. You can easily change the values using the convenience methods of '''TAMUClusterProperties'''.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
</pre><br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the parfor or spmd block will be destroyed once the block is finished.<br />
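A minimal '''parfor''' sketch tying these notes together (it assumes a pool has been started as shown above; the loop body is just an illustrative independent computation):<br />
<pre><br />
mypool = parpool(4);<br />
<br />
n = 1000;<br />
results = zeros(1,n);    % declared on the client<br />
parfor i = 1:n<br />
    % iterations are divided among the 4 workers;<br />
    % every iteration must be independent of the others<br />
    results(i) = sin(i)^2;<br />
end<br />
<br />
delete(mypool)<br />
</pre><br />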
<br />
== Using GPU ==<br />
<br />
Normally, all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or any node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable carr to the GPU under the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace. The GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
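To compare elapsed times of the same multiplication on the client and on the GPU, '''tic'''/'''toc''' can be used. A sketch (the '''gather''' is included in the GPU timing so the measurement also covers the memory transfer back to the client):<br />
<pre><br />
a = rand(1000);                       % matrix in the client workspace<br />
tic; b = a*a; tcpu = toc;             % multiplication on the client<br />
<br />
ag = gpuArray(a);                     % copy the matrix to the GPU<br />
tic; bg = ag*ag; c = gather(bg); tgpu = toc;   % multiplication on the GPU<br />
<br />
fprintf('client: %f s, GPU: %f s\n', tcpu, tgpu);<br />
</pre><br />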
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions: E.g.<br />
<br />
>> help TAMUClusterProperties <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
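Since '''myjob''' is a standard Matlab '''Job''' object, the usual Job methods apply. As a sketch:<br />
<pre><br />
>> wait(myjob);     % block until the submitted script has finished<br />
>> load(myjob);     % load the script's workspace variables into the client<br />
>> diary(myjob)     % display any text output produced by the script<br />
</pre><br />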
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12551SW:Matlab2021-09-22T21:38:55Z<p>Pennings: /* Cluster Profiles */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. When Matlab creates a parallel pool, it uses the cluster profile to determine how many workers to use, how many threads every worker can use, where to store meta-data, as well as some other meta-data. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running. <br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra). <br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework to easily manage and update cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which keeps track of all the properties needed to successfully create a parallel pool. That includes typical Matlab properties, such as the number of Matlab workers requested, as well as batch scheduler properties such as wall-time and memory.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''.<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
Matlab will store meta-information for every parallel job in the directory ''/scratch/$USER/MatlabJobs/TAMU<VERSION>''.<br />
<br />
=== Getting the Cluster Profile Object ===<br />
<br />
To get a '''TAMUClusterProperties''' object you can do the following:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
</pre><br />
<br />
'''tp''' is an object of type '''TAMUClusterProperties''' with default values for all the properties. To see all the properties, you can just print the value of '''tp'''. You can easily change the values using the convenience methods of '''TAMUClusterProperties'''.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
</pre><br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the parfor or spmd block will be destroyed once the block is finished.<br />
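A minimal '''parfor''' sketch tying these notes together (it assumes a pool has been started as shown above; the loop body is just an illustrative independent computation):<br />
<pre><br />
mypool = parpool(4);<br />
<br />
n = 1000;<br />
results = zeros(1,n);    % declared on the client<br />
parfor i = 1:n<br />
    % iterations are divided among the 4 workers;<br />
    % every iteration must be independent of the others<br />
    results(i) = sin(i)^2;<br />
end<br />
<br />
delete(mypool)<br />
</pre><br />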
<br />
== Using GPU ==<br />
<br />
Normally, all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or any node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable carr to the GPU under the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace. The GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
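To compare elapsed times of the same multiplication on the client and on the GPU, '''tic'''/'''toc''' can be used. A sketch (the '''gather''' is included in the GPU timing so the measurement also covers the memory transfer back to the client):<br />
<pre><br />
a = rand(1000);                       % matrix in the client workspace<br />
tic; b = a*a; tcpu = toc;             % multiplication on the client<br />
<br />
ag = gpuArray(a);                     % copy the matrix to the GPU<br />
tic; bg = ag*ag; c = gather(bg); tgpu = toc;   % multiplication on the GPU<br />
<br />
fprintf('client: %f s, GPU: %f s\n', tcpu, tgpu);<br />
</pre><br />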
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions: E.g.<br />
<br />
>> help TAMUClusterProperties <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
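Since '''myjob''' is a standard Matlab '''Job''' object, the usual Job methods apply. As a sketch:<br />
<pre><br />
>> wait(myjob);     % block until the submitted script has finished<br />
>> load(myjob);     % load the script's workspace variables into the client<br />
>> diary(myjob)     % display any text output produced by the script<br />
</pre><br />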
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12543SW:Matlab2021-09-22T16:36:27Z<p>Pennings: /* Using Matlab Parallel Toolbox on HPRC Resources */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Matlab uses the concept of ''Cluster Profiles'' to create parallel pools. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
TAMU HPRC provides a framework to easily manage cluster profiles. The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which we will discuss in more detail in the next sections.<br />
<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b, it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''.<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
<br />
In this case, Matlab will store meta-information in the directory ''/scratch/$USER/MatlabJobs/TAMU<VERSION>''.<br />
<br />
=== Getting Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information), HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool, you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
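As an illustrative sketch (the exact '''tamu_parpool''' call signature is not shown here), a pool can also be opened by passing the cluster object from the previous section to Matlab's standard '''parpool''' function:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> c=tamu_set_profile_properties(tp);<br />
>> mypool=parpool(c);   % open a pool on the cluster described by c<br />
>> delete(mypool);      % shut the pool down when done<br />
</pre><br />
<br />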
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks will be destroyed once the block is finished.<br />
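<br />
As a minimal illustration of standard '''parfor''' usage (not HPRC-specific), the following distributes the loop iterations over the pool workers:<br />
<br />
<pre><br />
mypool = parpool(4);<br />
sq = zeros(1,100);<br />
parfor i = 1:100<br />
    sq(i) = i^2;   % iterations are executed on the workers<br />
end<br />
delete(mypool)<br />
</pre><br />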
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and their results will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When it is called from the client (or a node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client, a matrix multiplication on the GPU, and prints the elapsed times for both. The core CPU-GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
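<br />
A fuller sketch including the timing mentioned above, using '''tic'''/'''toc''' (variable names are illustrative; measured times depend on the hardware):<br />
<br />
<pre><br />
n = 1000;<br />
a = rand(n);<br />
tic; b = a*a; tcpu = toc;                      % multiply on the client (CPU)<br />
ag = gpuArray(a);                              % copy the matrix to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;  % multiply on the GPU<br />
c = gather(bg);                                % copy the result back to the client workspace<br />
fprintf('CPU: %gs, GPU: %gs\n', tcpu, tgpu);<br />
</pre><br />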
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''', you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
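<br />
As a sketch, the standard Parallel Computing Toolbox job methods can be used on the returned '''Job''' object (assuming it behaves like a Matlab batch job running a script):<br />
<br />
<pre><br />
>> wait(myjob);    % block until the job has finished<br />
>> diary(myjob);   % display the command-line output of the script<br />
>> load(myjob);    % load the script's workspace variables into the client<br />
</pre><br />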
<br />
<br />
[[Category:Software]]</div>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will setup the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distribued over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and makes sure Matlab will use 4 computational threads for its run ( '''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC cluster. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object which we will discuss in more detail in the next section<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done using by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION'', where <VERSION> represents the Matlab version. For example, for Matlab R2020b it will be ''/scratch/$USER/MatlabJobs/TAMU2020b''<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
'''NOTE:''' For Matlab versions before R2019b, use the following function<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
In this case, Matlab will store meta-information in directory ''/scratch/$USER/MatlabJobs/TAMU''<br />
<br />
=== Getting Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClustrerProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). It creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile and deletes the pool when the work is done. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks will be destroyed once the block is finished.<br />
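<br />
These notes can be illustrated with a short parfor sketch (assumes a pool is already running, e.g. started with ''parpool(4)''):<br />
<pre><br />
n = 8;<br />
sq = zeros(1, n);<br />
parfor i = 1:n        % loop iterations are executed on the workers<br />
    sq(i) = i^2;      % runs on a worker<br />
end<br />
disp(sq)              % runs on the client; sq is available after the loop<br />
</pre><br />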
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When it is called from the client (or a node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU under the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example creates a random matrix directly on the GPU, performs the matrix multiplication there, and gathers the result back into the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
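<br />
To compare elapsed times on the client and on the GPU, a simple sketch using '''tic'''/'''toc''' (for more reliable GPU timings consider Matlab's '''gputimeit''' function):<br />
<pre><br />
a = rand(1000);                          % matrix in the client workspace<br />
tic; b = a*a; toc                        % multiplication on the client (CPU)<br />
ag = gpuArray(a);                        % copy the matrix to the GPU<br />
tic; bg = ag*ag; c = gather(bg); toc     % GPU multiplication plus copy back<br />
</pre><br />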
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab code remotely, see [[SW:Matlab_app | here]].<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, the remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', want to use 4 workers, and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs a Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=12540SW:Matlab2021-09-22T15:16:49Z<p>Pennings: /* Running Matlab on a login node */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2020b'''<br />
<br />
This will set up the environment for Matlab version R2020b. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded, using as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases, as the speedup achieved through multi-threading depends on many factors and can be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
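<br />
'''feature''' is an undocumented interface; Matlab's documented '''maxNumCompThreads''' function can be used for the same purpose and also returns the previous setting:<br />
<pre><br />
>> nOld = maxNumCompThreads(4);   % set 4 computational threads, return the old setting<br />
>> n = maxNumCompThreads          % query the current number of threads<br />
</pre><br />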
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general introduction to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which we will discuss in more detail in the next section.<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the node the Matlab client is running on.<br />
* cluster profiles: parallel processing can span multiple nodes; the profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Knitro&diff=12537SW:Knitro2021-09-20T17:17:26Z<p>Pennings: /* Knitro */</p>
<hr />
<div>=Knitro=<br />
<br />
__TOC__<br />
==Description==<br />
<br />
The Artelys Knitro Solver is a plug-in Solver Engine that extends Analytic Solver Platform, Risk Solver Platform, Premium Solver Platform or Solver SDK Platform to solve nonlinear optimization problems of virtually unlimited size. The solver has plugins for MATLAB, R, Python, C/C++, and Fortran. <br />
<br />
'''NOTE:''' Knitro is currently only available on the '''terra''' cluster.<br />
<br />
For more information, please visit https://www.solver.com/artelys-knitro-solver-engine<br />
<br />
<br />
== Setting up the Knitro environment ==<br />
<br />
Before using Knitro, you need to set up the environment. To do this, load the Knitro module<br />
<br />
[NetID@terra ~]$ '''module load Knitro/12.4'''<br />
<br />
== Using Knitro ==<br />
<br />
As mentioned, Knitro has bindings for MATLAB, R, Python, C/C++, and Fortran. To use Knitro with any of these, you need to load the appropriate module (e.g. Matlab, R) in addition to the Knitro module<br />
<br />
=== MATLAB ===<br />
<br />
To use Knitro with MATLAB, load both modules first:<br />
<br />
<pre><br />
module load Knitro/12.4<br />
module load Matlab<br />
</pre><br />
<br />
Loading the Knitro module will set '''$MATLABPATH''' to the directory containing the Knitro MEX files and interfaces. After loading the modules, you can call the Knitro solver like any regular MATLAB function. This directory also contains a sample Knitro options file and example MATLAB scripts that use the Knitro solver.<br />
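<br />
As a sketch, the MATLAB interface provides fmincon-style driver functions such as '''knitro_nlp'''; the exact signature can differ per Knitro version, so check the shipped example scripts (the call below is illustrative):<br />
<pre><br />
% minimize (x1-1)^2 + (x2-2)^2 subject to simple bounds<br />
obj = @(x) (x(1)-1)^2 + (x(2)-2)^2;<br />
x0 = [0; 0];                 % starting point<br />
lb = [0; 0]; ub = [5; 5];    % lower/upper bounds<br />
x = knitro_nlp(obj, x0, [], [], [], [], lb, ub);<br />
</pre><br />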
<br />
=== R ===<br />
<br />
To use Knitro with R, load both modules first:<br />
<br />
<pre><br />
module load Knitro/12.4<br />
module load R<br />
</pre><br />
<br />
'''NOTE:''' Instead of loading the default '''R''' module, it is recommended to load a specific R version (e.g. R/3.3.2-iomkl-2017A-Python-2.7.12-default-mt). Alternatively, you can also load the R_tamu module. Knitro will work with any R or R_tamu version.<br />
<br />
The Knitro module will append the Knitro package directory to the R environment variable '''$R_LIBS_USER'''. Loading the Knitro package can be done using the R '''library()''' function:<br />
<br />
<pre><br />
> library('KnitroR')<br />
</pre><br />
<br />
For some example R scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/R'''<br />
<br />
=== Python ===<br />
<br />
To use Knitro with Python, load both modules first:<br />
<br />
<pre><br />
module load Knitro/12.4<br />
module load Python/3.5.2-intel-2017A<br />
</pre><br />
<br />
'''NOTE:''' The Python version used above is just an example. Knitro will work with any Python version (2.7.xx as well as 3.xx) and toolchain combination. <br />
<br />
The Knitro module will set the environment variable '''PYTHONPATH''' to include the Knitro Python directory. The Knitro solver can be accessed from any Python script. For some example Python scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/Python'''<br />
<br />
=== C/C++ and Fortran ===<br />
<br />
To use Knitro inside your own C/C++, or Fortran code, load Knitro and the preferred toolchain first:<br />
<br />
<pre><br />
module load Knitro/12.4<br />
module load intel/2020b<br />
</pre><br />
<br />
'''NOTE:''' The toolchain used above is just an example. Knitro will work with any toolchain (e.g. Intel, GNU). <br />
<br />
The Knitro module will set the environment variables '''CPATH''' for the include files, '''LIBRARY_PATH''' for the compile-time library paths, and '''LD_LIBRARY_PATH''' for the runtime library paths. <br />
<br />
<br />
Directories '''${KNITROEXAMPLES}/C''', '''${KNITROEXAMPLES}/C++''', and '''${KNITROEXAMPLES}/Fortran''' contain example programs for C,C++, and Fortran respectively. The directories also contain a Makefile, explaining how to compile the examples.<br />
<br />
== Acknowledgement ==<br />
<br />
The license for Knitro has been purchased by the Department of Economics. We are thankful for their contribution and for allowing access to all HPRC users.<br />
<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Knitro&diff=12536SW:Knitro2021-09-20T17:16:28Z<p>Pennings: /* R */</p>
<hr />
<div>=Knitro=<br />
<br />
__TOC__<br />
==Description==<br />
<br />
The Artelys Knitro Solver is a plug-in Solver Engine that extends Analytic Solver Platform, Risk Solver Platform, Premium Solver Platform or Solver SDK Platform to solve nonlinear optimization problems of virtually unlimited size. The solver has plugins for MATLAB, R, Python, C/C++, and Fortran. <br />
<br />
'''NOTE:''' Knitro is currently only available on the '''terra''' cluster.<br />
<br />
For more information, please visit https://www.solver.com/artelys-knitro-solver-engine<br />
<br />
<br />
== Setting up the Knitro environment ==<br />
<br />
Before using Knitro, you need to set up the environment. To do this, load the Knitro module<br />
<br />
[NetID@terra ~]$ '''module load Knitro/12.4'''<br />
<br />
== Using Knitro ==<br />
<br />
As mentioned, Knitro has bindings for MATLAB, R, Python, C/C++, and Fortran. To use Knitro with any of these, you need to load the appropriate module (e.g. Matlab, R) in addition to the Knitro module<br />
<br />
=== MATLAB ===<br />
<br />
To use Knitro with MATLAB, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Matlab<br />
</pre><br />
<br />
Loading the Knitro module will set the '''$MATLABPATH''' to the directory containing the Knitro mex files and interfaces. After loading the modules you can call the Knitro solver like any regular MATLAB function. The $MATLABROOT directory also contains a sample Knitro options file and example MATLAB scripts that use the Knitro solver.<br />
<br />
=== R ===<br />
<br />
To use Knitro with R, load both modules first:<br />
<br />
<pre><br />
module load Knitro/12.4<br />
module load R<br />
</pre><br />
<br />
'''NOTE:''' Instead of loading the default '''R''' module it's recommended to load a version specific R version (e.g. R/3.3.2-iomkl-2017A-Python-2.7.12-default-mt). Alternatively, you can also load the R_tamu module. Knitro will work with any R or R_tamu version.<br />
<br />
The Knitro module will append the Knitro package directory to the R environmental variable '''$R_LIBS_USER'''. Loading the Knitro package can be done using the R '''library()''' function<br />
<br />
<pre><br />
> library('KnitroR')<br />
</pre><br />
<br />
For some example R scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/R'''<br />
<br />
=== Python ===<br />
<br />
To use Knitro with Python, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Python/3.5.2-intel-2017A<br />
</pre><br />
<br />
'''NOTE:''' The Python version used above is just an example. Knitro will work with any Python version (2.7.xx as well as 3.xx) and toolchain combination. <br />
<br />
The Knitro module will set the environmental variable '''PYTHONPATH''' to include the Knitro Python directory. The Knitro solver can be accessed from any Python script. For some example Python scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/Python'''<br />
<br />
=== C/C++ and Fortran ===<br />
<br />
To use Knitro inside your own C/C++, or Fortran code, load Knitro and the preferred toolchain first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Intel/2017A<br />
</pre><br />
<br />
'''NOTE:''' The toolchain used above is just an example. Knitro will work with any toolchain (e.g. Intel, GNU). <br />
<br />
The Knitro module will set the environmental variables '''CPATH''' for the include files, '''LIBRARY_PATH''' for the compile time library paths, and '''LD_LIBRARY_PATH''' for the runtime library paths. <br />
<br />
<br />
Directories '''${KNITROEXAMPLES}/C''', '''${KNITROEXAMPLES}/C++''', and '''${KNITROEXAMPLES}/Fortran''' contain example programs for C,C++, and Fortran respectively. The directories also contain a Makefile, explaining how to compile the examples.<br />
<br />
== Acknowledgement ==<br />
<br />
The license for Knitro has been purchased by the Department of Economics. We are thankful for their contribution and for allowing access to all HPRC users.<br />
<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Knitro&diff=12535SW:Knitro2021-09-20T17:16:04Z<p>Pennings: /* Setting up the Knitro environment */</p>
<hr />
<div>=Knitro=<br />
<br />
__TOC__<br />
==Description==<br />
<br />
The Artelys Knitro Solver is a plug-in Solver Engine that extends Analytic Solver Platform, Risk Solver Platform, Premium Solver Platform or Solver SDK Platform to solve nonlinear optimization problems of virtually unlimited size. The solver has plugins for MATLAB, R, Python, C/C++, and Fortran. <br />
<br />
'''NOTE:''' Knitro is currently only available on the '''terra''' cluster.<br />
<br />
For more information, please visit https://www.solver.com/artelys-knitro-solver-engine<br />
<br />
<br />
== Setting up the Knitro environment ==<br />
<br />
Before using Knitro, you need to set up the environment. To do this, load the Knitro module<br />
<br />
[NetID@terra ~]$ '''module load Knitro/12.4'''<br />
<br />
== Using Knitro ==<br />
<br />
As mentioned, Knitro has bindings for MATLAB, R, Python, C/C++, and Fortran. To use Knitro with any of these, you need to load the appropriate module (e.g. Matlab, R) in addition to the Knitro module<br />
<br />
=== MATLAB ===<br />
<br />
To use Knitro with MATLAB, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Matlab<br />
</pre><br />
<br />
Loading the Knitro module will set the '''$MATLABPATH''' to the directory containing the Knitro mex files and interfaces. After loading the modules you can call the Knitro solver like any regular MATLAB function. The $MATLABROOT directory also contains a sample Knitro options file and example MATLAB scripts that use the Knitro solver.<br />
<br />
=== R ===<br />
<br />
To use Knitro with R, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load R<br />
</pre><br />
<br />
'''NOTE:''' Instead of loading the default '''R''' module, it is recommended to load a specific R version (e.g. R/3.3.2-iomkl-2017A-Python-2.7.12-default-mt). Alternatively, you can load the R_tamu module. Knitro will work with any R or R_tamu version.<br />
<br />
The Knitro module will append the Knitro package directory to the R environment variable '''$R_LIBS_USER'''. The Knitro package can then be loaded using the R '''library()''' function:<br />
<br />
<pre><br />
> library('KnitroR')<br />
</pre><br />
<br />
For some example R scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/R''' <br />
<br />
=== Python ===<br />
<br />
To use Knitro with Python, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Python/3.5.2-intel-2017A<br />
</pre><br />
<br />
'''NOTE:''' The Python version used above is just an example. Knitro will work with any Python version (2.7.xx as well as 3.xx) and toolchain combination. <br />
<br />
The Knitro module will set the environment variable '''PYTHONPATH''' to include the Knitro Python directory. The Knitro solver can be accessed from any Python script. For some example Python scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/Python'''<br />
<br />
=== C/C++ and Fortran ===<br />
<br />
To use Knitro inside your own C/C++, or Fortran code, load Knitro and the preferred toolchain first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Intel/2017A<br />
</pre><br />
<br />
'''NOTE:''' The toolchain used above is just an example. Knitro will work with any toolchain (e.g. Intel, GNU). <br />
<br />
The Knitro module will set the environment variables '''CPATH''' for the include files, '''LIBRARY_PATH''' for the compile-time library paths, and '''LD_LIBRARY_PATH''' for the runtime library paths. <br />
<br />
<br />
Directories '''${KNITROEXAMPLES}/C''', '''${KNITROEXAMPLES}/C++''', and '''${KNITROEXAMPLES}/Fortran''' contain example programs for C, C++, and Fortran, respectively. Each directory also contains a Makefile explaining how to compile the examples.<br />
<br />
== Acknowledgement ==<br />
<br />
The license for Knitro has been purchased by the Department of Economics. We are thankful for their contribution and for allowing access to all HPRC users.<br />
<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Portal&diff=12423SW:Portal2021-07-27T20:26:30Z<p>Pennings: </p>
<hr />
<div>__TOC__<br />
<br />
==What is TAMU HPRC OnDemand Portal==<br />
The TAMU HPRC OnDemand portal is based on Open OnDemand (https://openondemand.org/), an open-source web platform through which users can access HPC clusters and services with a web browser. The portal provides an intuitive, easy-to-use interface that allows new users to become productive with the HPC resources for their research right away, while also providing a convenient alternative way for experienced users to access the HPC resources. The portal has a flexible and extensible design that makes it easy to deploy new services as needed.<br />
<br />
==Services Provided==<br />
*Job submission and monitoring <br />
*File transfer and management <br />
*File editing <br />
*Shell access<br />
*Interactive applications<br />
**[[SW:ABAQUS | Abaqus]]<br />
**[[SW:ANSYS | Ansys]]<br />
**[[SW:IGV | IGV]]<br />
**[[SW:LS-PREPOST | LS-PREPOST]]<br />
**[[SW:Matlab | Matlab]]<br />
**[[SW:Jupyter Notebook | Jupyter Notebook]]<br />
**[https://www.paraview.org/ Paraview]<br />
**VNC<br />
**[https://rstudio.com/ Rstudio]<br />
**[[SW:JupyterLab | JupyterLab]]<br />
**[[SW:JBrowse | JBrowse]]<br />
<br />
==How to Access==<br />
<br />
We recommend you access the Grace or Terra portal through their landing page at<br />
<br />
https://portal.hprc.tamu.edu<br />
<br />
Click the portal you want to connect to. The portals are CAS authenticated. All active HPRC users have access to both portals using their NetID and password. You will only be authenticated once; until your session expires, you can freely access both portals without further authentication. <br />
<br />
If accessing from off-campus, the TAMU VPN is needed.<br />
<br />
You can go directly to the Grace or Terra portal using one of the following URLs:<br />
<br />
https://portal-terra.hprc.tamu.edu<br />
<br />
https://portal-grace.hprc.tamu.edu<br />
<br />
<br />
{{:IT_Info#Two-Factor_Authentication_Requirement}}<br />
<br />
==Using the Portal==<br />
Each service provided by the portal is available at the navigation bar at the top of the page.<br />
<br />
[[Image:Navigation-bar.png|1000px]]<br />
<br />
===Files===<br />
[[Image:File-explorer.png|right|600px]]<br />
The first option in the navigation bar is the "Files" drop down menu. From this menu, a user can view a file explorer at either their home directory or scratch directory. <br />
<br />
Some users may find the visual interface of the file explorer more intuitive than shell based file exploring. All files in the directory are shown on screen, along with the file tree or hierarchy. <br />
<br />
Normal file management commands are available with the click of a button. These include:<br />
* Viewing files <br />
* Text editing<br />
* Copy/Paste<br />
* Renaming files<br />
* Creating files<br />
* Creating directories<br />
* Deleting files<br />
* File upload/download<br />
<br />
The 'View' button will display the highlighted file in the browser, as long as the browser supports the file type. Fortunately, modern browsers support many different types of files, from simple text files to images and multimedia. This feature is convenient when you want to review a file quickly, since you don't have to download the file to your local machine first, as you would when connected to a cluster using PuTTY or MobaXterm.<br />
<br />
===File Editor===<br />
The file editor allows you to edit a selected file. It cannot be accessed from the main menu, but is available through the Files app or the Job Composer. In the Files app, first select a file, then click 'Edit'; a new tab will open where you can edit the file in the editor. In the Job Composer, you can edit the job script by clicking 'Open Editor' at the bottom of the Job Composer.<br />
<br />
===Cluster Shell Access===<br />
Shell access to any of the three clusters is available from this drop-down menu with one click. The shell access app is similar to an SSH client such as PuTTY or MobaXterm; it allows users to log in to a cluster with their NetID and password.<br />
<br />
Copy/Paste can be done with hot keys. To copy text from the shell access terminal, highlight the text with the mouse; the highlighted text will be copied into the clipboard. To paste text from the clipboard into the terminal, type 'Ctrl+v'. <br />
<br />
Shell access works with Firefox and Chrome only.<br />
<br />
====Copy/Paste in VNC====<br />
If launching an interactive session in the portal, there are a few extra steps that need to be taken. Please reference the media below, or the summary of steps below that for more information.<br />
[[File:portalDemo.gif]]<br />
# Open the toolbar on the left of the screen and select "Clipboard".<br />
# If you want to paste text from your host computer to the remote session, paste the text in the clipboard box. You can then use the middle-mouse button (MMB) to paste it in your terminal.<br />
# If you want to copy text from the remote session to your host computer's clipboard, simply highlight the text in the terminal. It will appear in the Clipboard toolbar pop-out where you can copy it to your host clipboard.<br />
<br />
===Jobs===<br />
From the jobs drop down menu, a user can view their active jobs or compose and submit jobs using the job composer. <br />
<br />
====Active Jobs====<br />
The active jobs menu provides information about running jobs on the cluster, including their JobID, name, user, account, time used, queue, and status. <br />
Clicking the arrow to the left of a given job will reveal more details, such as where it was submitted from, which node it's running on, when it was submitted, process IDs, memory, and CPU time.<br />
<br />
[[Image:Activejobs.png|600px]]<br />
<br />
====Job Composer====<br />
[[Image:Jobcomposer.png|right|550px]]<br />
When first launched, the job composer will walk the user through each of its features, covering the whole process of creating, editing, and submitting a job. <br />
<br />
The job composer provides some template job scripts the user can choose from. <br />
Once a template is selected, you need to edit the template to provide customized job content. This can be done by clicking 'Open Editor' underneath the job script contents.<br />
<br />
The job composer has a specific directory in the user's scratch to store the jobs it has created. We call this directory the job composer's root directory. Each new job created by the job composer gets a sub-directory in the root directory. The name of the sub-directory is the same as the index of the job, an integer maintained by the job composer: the first job has index 1, the second job has index 2, and so on. Knowing this helps you use the job composer more effectively. <br />
<br />
There are two ways to work with the default directory created by the job composer. <br />
<br />
<br />
<br />
'''Method 1''': use the default directory as the working directory of your job. This means you need to upload all input files to that directory before you click the submit button. This can easily be done by clicking 'Open Dir' right beneath the job script contents. A file explorer will open the job directory in a new tab where you can transfer files.<br />
<br />
'''Method 2''': if you already have the input files stored somewhere on the cluster and don't want to move them around, or you prefer to organize your directories yourself, you can simply add one line to the job script before any other command, where ''/path/to/job_working_dir'' is the directory in which you want all the commands to be executed:<br />
<br />
<pre>cd /path/to/job_working_dir</pre><br />
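For Method 2, a complete Terra job script might look like the sketch below; the job name, paths, resource values, and program name are hypothetical placeholders, not part of the portal documentation.<br />

```shell
#!/bin/bash
#SBATCH --job-name=myjob            # hypothetical job name
#SBATCH --time=01:00:00             # 1 hour wall clock limit
#SBATCH --ntasks=1                  # request a single task
#SBATCH --mem=2G                    # request 2 GB of memory
#SBATCH --output=myjob.%j           # send stdout/err to "myjob.[jobID]"

# Run everything from your own directory instead of the
# job composer's numbered sub-directory
cd /scratch/user/NetID/my_project

./my_program input.dat              # hypothetical program and input file
```

With the cd line in place, input files are read from, and output files written to, the directory you chose rather than the job composer's default directory.<br />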
<br />
===Common Problems===<br />
1. The session starts and quits immediately. <br />
<br />
Check your quota in your home and scratch directories. If usage is full or close to full, clean up your disk space and try again.<br />
<br />
2. In ANSYS Workbench, not all windows are available in the foreground.<br />
<br />
Right-click the bottom panel title bar "Unsaved Project - Workbench" and select 'Maximize'.<br />
<br />
=== Log out ===<br />
To properly log out of the portal, you must do two things: (1) log out of the portal by clicking 'Log out' in the top navigation bar; (2) close the browser to completely terminate the session.<br />
<br />
<b>Be aware that logging out of the portal alone is not enough. You must also close the entire browser (not just the tab)</b>, a side effect of CAS. This is especially important if you are using a public computer.<br />
<br />
=== Cleanup ===<br />
The portal stores temporary files for interactive apps in $SCRATCH/ondemand/data/sys/dashboard/. Although the disk space used by those files accumulates slowly, it is a good habit to clean this directory periodically.<br />
<br />
rm -rf $SCRATCH/ondemand/data/sys/dashboard/batch_connect/sys/*<br />
<br />
==Interactive Apps==<br />
Each piece of software listed above in the "services provided" section is directly available to launch from this menu. <br />
When a piece of software is selected, you will see the interface for job parameters such as number of cores, wall time, memory, and type of node. If you are not sure what to change, '''the default values work fine'''. Once you fill out the form, click 'Launch' and the app will be launched as a job. It will first go into a queue, and when ready, a button will become available to view the interface of the chosen software. <br />
<br />
Interactive sessions can be managed via the "My Interactive Sessions" button on the navigation bar.<br />
<br />
We have tried to provide the most commonly used GUI software packages in the Interactive Apps drop-down menu. If a package is not available, you can always run it within VNC, which is also provided in the drop-down menu. To run a GUI application in a VNC session on the portal, follow these steps.<br />
<br />
1. Click 'VNC' from 'Interactive Apps' and start a vnc session. <br><br />
2. In the terminal within the new tab, load the module for the software you want to run. <br><br />
3. If you have chosen a GPU node, please run<br />
<br />
vglrun app_name<br />
<br />
Otherwise, type the ''app_name'' from the command line directly.<br />
<br />
==== RStudio ====<br />
To install CRAN packages, start RStudio with enough memory for the install process: 10 cores and 2GB per core for example.<br />
<br />
Then install CRAN packages using the following command at the RStudio R prompt:<br />
<br />
===== Install Ada CRAN packages =====<br />
<pre><br />
install.packages('package_name', repos='http://10.70.4.4/cran/')<br />
</pre><br />
<br />
===== Install Terra CRAN packages =====<br />
<pre><br />
install.packages('package_name', repos='http://10.76.5.24/cran/')<br />
</pre><br />
<br />
A local CRAN repository is used since RStudio runs on the compute nodes which do not have internet access.<br />
<br />
If you need Bioconductor or github R packages installed, contact the HPRC helpdesk to request installation.<br />
<br />
{{:SW:JupyterLab}}<br />
<br />
{{:SW:Jupyter_Notebook}}<br />
<br />
{{:SW:Spark_Jupyter_Notebook}}<br />
<br />
== Additional Information ==<br />
<br />
Ohio Supercomputing Center has video for OOD at:<br />
<br />
https://youtu.be/DfK7CppI-IU<br />
<br />
<br />
[[ Category:Software ]] [[ Category:Terra ]] [[ Category:Grace ]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Comsol&diff=11953SW:Comsol2021-02-02T16:25:06Z<p>Pennings: /* Running Comsol in Different Parallel Mode */</p>
<hr />
<div><br />
<br />
__TOC__<br />
<br />
==Description==<br />
COMSOL Multiphysics is a cross-platform finite element analysis, solver and multiphysics simulation software. It allows conventional physics-based user interfaces and coupled systems of partial differential equations (PDEs). COMSOL provides an IDE and unified workflow for electrical, mechanical, fluid, and chemical applications. An API for Java and LiveLink for MATLAB may be used to control the software externally, and the same API is also used via the Method Editor.<br />
<br />
Once a model is built in the Comsol GUI, the next step is to compute the model for a solution, which is often time-consuming. A job script must be created to run the model in batch so that you can control wall time, memory, and other cluster resources for your simulation. This tutorial illustrates how to create Comsol batch scripts for Ada (LSF) and Terra (Slurm).<br />
<br />
All solvers in Comsol can run in parallel in one of three parallel modes: shared memory mode, distributed mode, or hybrid mode. By default, a Comsol solver runs in shared memory mode. This is similar to OpenMP, where parallelism is limited by the total number of CPU cores available on a single compute node.<br />
<br />
==Access==<br />
<br />
Comsol is restricted software, limited to users and groups who have a license. You can choose one of the following to get access: <br />
<br />
#Purchase your own license (If you choose this route, you can either ask us to host the license server or you can host the server by yourself.)<br />
#Ask for permission to use the license server maintained by the school of engineering. <br />
<br />
The contact person is:<br />
<br />
Mitch Wittneben<br />
979-845-5235<br />
mwittneben@tamu.edu<br />
<br />
==A Complete Batch Job Example==<br />
<br />
'''Note: xxx represents a Comsol version. Replace it with the version you want to use.'''<br />
<br />
'''Example 1a (Ada)''': Solving a model in shared memory mode using 20 cores on one Ada cluster node.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 20 -R "span[ptile=20]" #Request 20 cores with all 20 cores on 1 node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
comsol -np 20 batch -inputfile in.mph -outputfile out.mph<br />
<br />
Save the above example to a file and run "bsub < file" to submit this job.<br />
<br />
'''Example 1b (Terra)''': Solving a model in shared memory mode using 28 cores on one Terra cluster node.<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=28 #Request 28 tasks<br />
#SBATCH --ntasks-per-node=28 #Request 28 tasks/cores per node<br />
#SBATCH --mem=56G #Request 56 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
<br />
comsol -np 28 batch -inputfile in.mph -outputfile out.mph<br />
<br />
Save the above example to a file and run "sbatch file" to submit this job.<br />
<br />
==Running Comsol in Different Parallel Modes==<br />
<br />
Assuming other things are the same as in Example 1, we will show additional examples running in different parallel modes by changing the number of cores and the Comsol command line parameters. <br />
<br />
===Shared Memory Mode===<br />
<br />
'''Example 2a (Ada)''': Solving a model in shared memory mode using 10 cores on one Ada cluster node. This is similar to Example 1a.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 20 -R "span[ptile=20]" #Request 20 cores with all 20 cores on 1 node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
comsol -np 10 batch -inputfile in.mph -outputfile out.mph<br />
<br />
'''Example 2b (Terra)''': Solving a model in shared memory mode using 14 cores on one Terra cluster node. This is similar to Example 1b.<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=14 #Request 14 tasks<br />
#SBATCH --ntasks-per-node=14 #Request 14 tasks/cores per node<br />
#SBATCH --mem=28G #Request 28 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
<br />
comsol -np 14 batch -inputfile in.mph -outputfile out.mph<br />
<br />
===Distributed Mode===<br />
<br />
Comsol solvers can also run in distributed mode by checking the "distributed computing" checkbox of the solver when building the model. In this mode, the solver runs on multiple nodes and uses MPI for communication. All solvers except PARDISO support distributed mode. However, PARDISO also has a distributed-computing check box; if it is selected, the solver actually used is MUMPS.<br />
<br />
'''Example 3a (Ada)''': Solving a model in distributed mode on two Ada cluster nodes with a total of 40 cores<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 40 -R "span[ptile=20]" #Request 40 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
comsol -simplecluster -inputfile input.mph -outputfile output.mph<br />
<br />
This is equivalent to:<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 40 -R "span[ptile=20]" #Request 40 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
cat $LSB_DJOB_HOSTFILE > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 40 batch -inputfile input.mph -outputfile output.mph<br />
<br />
'''Example 3b (Terra)''': Solving a model in distributed mode on two Terra cluster nodes with a total of 56 cores<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=56 #Request 56 tasks<br />
#SBATCH --ntasks-per-node=28 #Request 28 tasks/cores per node<br />
#SBATCH --mem=56G #Request 56 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
<br />
comsol -simplecluster -inputfile input.mph -outputfile output.mph<br />
<br />
===Hybrid Mode===<br />
<br />
Each mode has its pros and cons. Shared memory mode utilizes CPU cores better than distributed mode but can only run on one cluster node, while distributed mode can utilize more than one physical cluster node. It is usually best to run a solver in a way that takes advantage of both modes. This can be done easily at the command line by fine-tuning the options -nn, -nnhost, and -np.<br />
<br />
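As a sanity check on these options, the shell arithmetic below shows how -nn, -nnhost, and -np combine (the values are those of Example 4a; this snippet is purely illustrative and not part of any job script):<br />

```shell
nn=2        # -nn: total number of MPI tasks
nnhost=1    # -nnhost: MPI tasks per node
np=20       # -np: threads (cores) per MPI task
nodes=$(( nn / nnhost ))        # nodes spanned by the job
total_cores=$(( nn * np ))      # must not exceed the cores requested from the scheduler
echo "${nodes} nodes, ${total_cores} cores"
```

Here the job spans 2 nodes and uses 40 cores in total, matching the #BSUB -n 40 request.<br />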
'''Example 4a (Ada)''': Solving a model in hybrid mode on 2 Ada cluster nodes with 40 cores. In this example, Comsol will spawn 2 MPI tasks in total (one on each cluster node). Each MPI task will be running with 20 threads on 20 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 40 -R "span[ptile=20]" #Request 40 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol batch -f ./hostfile.$LSB_JOBID -nn 2 -nnhost 1 -np 20 -inputfile input.mph -outputfile output.mph<br />
<br />
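The hostfile manipulation above works because $LSB_DJOB_HOSTFILE contains one line per allocated core, grouped by node, so piping it through uniq collapses the adjacent duplicates to one line per node. A small illustration with hypothetical node names and, for brevity, 3 cores per node:<br />

```shell
# mimic an LSF hostfile for two nodes with 3 allocated cores each
printf 'node01\nnode01\nnode01\nnode02\nnode02\nnode02\n' > hostfile.raw
uniq hostfile.raw > hostfile.uniq    # adjacent duplicate lines collapse
cat hostfile.uniq                    # prints node01 and node02, one per line
```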
'''Example 5 (Ada)''': Solving a model in hybrid mode on 2 Ada cluster nodes with 40 cores. In this example, Comsol will spawn 4 MPI tasks in total (two on each cluster node). Each MPI task will be running with 10 threads on 10 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 40 -R "span[ptile=20]" #Request 40 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol batch -f ./hostfile.$LSB_JOBID -nn 4 -nnhost 2 -np 10 -inputfile input.mph -outputfile output.mph<br />
<br />
===Parametric Sweep===<br />
<br />
Comsol models configured with parametric sweep can also benefit from parallel computing in different ways. A model configured with parametric sweep needs to run over a range of parameters or combinations of parameters, and each set of parameters can be computed independently. Once a model with a parametric sweep node is created in the Comsol GUI, it must also be configured with a cluster sweep to distribute the parameter combinations to be processed in parallel. <br />
<br />
'''Example 6 (Ada)''': Run a parametric sweep model on 40 cores. In this example, 10 combinations of parameters will be running concurrently on two cluster nodes with 5 combinations of parameters on each Ada cluster node. Each combination of parameters will be running with 4 threads on 4 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 40 -R "span[ptile=20]" #Request 40 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 10 -nnhost 5 -np 4 -inputfile input.mph -outputfile output.mph<br />
<br />
If each combination of parameters requires a large amount of memory to solve, we can specify one combination of parameters per node so that the entire memory of the node is used for solving one combination of parameters.<br />
<br />
'''Example 7 (Ada)''': Run a parametric sweep model (with 10 parameter combinations) on 200 cores with each parameter combination taking an entire Ada cluster node.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 200 -R "span[ptile=20]" #Request 200 cores, 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 10 -nnhost 1 -np 20 -inputfile input.mph -outputfile output.mph<br />
<br />
==Common problems==<br />
<br />
1. Disk quota exceeded in home directory<br />
<br />
By default, COMSOL stores all temporary files in your home directory. For large models, you are likely to get a "Disk quota exceeded" error due to the huge number of temporary files dumped into your home directory. To resolve this issue, redirect temporary files to your scratch directory.<br />
<br />
comsol -tmpdir /scratch/user/username/comsol/tmp -recoverydir /scratch/user/username/comsol/recovery -np 20 -inputfile input.mph -outputfile output.mph<br />
<br />
[[Category:Software]]<br />
<br />
2. Out of Memory<br />
<br />
<br />
If you receive the "Out of Memory" error, most likely the Java heap has an inadequate size. The default Java heap size for Comsol is 2G. You can change the value by following three steps: <br />
<br />
*'''1) load the comsol module'''<br />
ml Comsol/version<br />
<br />
*'''2) copy the Comsol setup file to your home directory'''<br />
<br />
If you are running a one-core Comsol job, copy comsolbatch.ini: <br />
<br />
cp $EBROOTCOMSOL/bin/glnxa64/comsolbatch.ini $HOME/comsol.ini<br />
<br />
<br />
If you are running a cluster Comsol job, copy comsolclusterbatch.ini: <br />
<br />
cp $EBROOTCOMSOL/bin/glnxa64/comsolclusterbatch.ini $HOME/comsol.ini<br />
<br />
<br />
*'''3) edit the local setup file and increase Xmx'''<br />
sed -i "s/-Xmx.*/-Xmx8196m/" $HOME/comsol.ini (here we increase the heap size to 8G)<br />
*'''4) add '-comsolinifile $HOME/comsol.ini' at the command line.'''<br />
comsol -comsolinifile $HOME/comsol.ini ... (... represent other command line options)</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Comsol&diff=11952SW:Comsol2021-02-02T16:23:38Z<p>Pennings: /* A Complete Batch Job Example */</p>
<hr />
<div><br />
<br />
__TOC__<br />
<br />
==Description==<br />
COMSOL Multiphysics is a cross-platform finite element analysis, solver and multiphysics simulation software. It allows conventional physics-based user interfaces and coupled systems of partial differential equations (PDEs). COMSOL provides an IDE and unified workflow for electrical, mechanical, fluid, and chemical applications. An API for Java and LiveLink for MATLAB may be used to control the software externally, and the same API is also used via the Method Editor.<br />
<br />
Once a model is built in the Comsol GUI, the next step is to compute the model for a solution, which is often time-consuming. A job script must be created to run the model in batch so that you can control wall time, memory, and other cluster resources for your simulation. This tutorial illustrates how to create Comsol batch scripts for Ada (LSF) and Terra (Slurm).<br />
<br />
All solvers in Comsol can run in parallel in one of three parallel modes: shared memory mode, distributed mode, or hybrid mode. By default, a Comsol solver runs in shared memory mode. This is similar to OpenMP, where parallelism is limited by the total number of CPU cores available on a single compute node.<br />
<br />
==Access==<br />
<br />
Comsol is restricted software, limited to users and groups who have a license. You can choose one of the following to get access: <br />
<br />
#Purchase your own license (If you choose this route, you can either ask us to host the license server or you can host the server by yourself.)<br />
#Ask for permission to use the license server maintained by the school of engineering. <br />
<br />
The contact person is:<br />
<br />
Mitch Wittneben<br />
979-845-5235<br />
mwittneben@tamu.edu<br />
<br />
==A Complete Batch Job Example==<br />
<br />
'''Note: xxx represents a Comsol version. Replace it with the version you want to use.'''<br />
<br />
'''Example 1a (Ada)''': Solving a model in shared memory mode using 20 cores on one Ada cluster node.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 20 -R "span[ptile=20]" #Request 20 cores with all 20 cores on 1 node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
<br />
comsol -np 20 batch -inputfile in.mph -outputfile out.mph<br />
<br />
Save the above example to a file and run "bsub < file" to submit this job.<br />
<br />
'''Example 1b (Terra)''': Solving a model in shared memory mode using 28 cores on one Terra cluster node.<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=28 #Request 28 tasks<br />
#SBATCH --ntasks-per-node=28 #Request 28 tasks/cores per node<br />
#SBATCH --mem=56G #Request 56 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
<br />
comsol -np 28 batch -inputfile in.mph -outputfile out.mph<br />
<br />
Save the above example to a file and run "sbatch file" to submit this job.<br />
<br />
==Running Comsol in Different Parallel Modes==<br />
<br />
Assuming other things are the same as in Example 1, we will show additional examples running in different parallel modes by changing the number of cores and the Comsol command line parameters. <br />
<br />
===Shared Memory Mode===<br />
<br />
'''Example 2a (Ada)''': Solving a model in shared memory mode using 10 cores on one Ada cluster node. This is similar to Example 1a.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
#BSUB -n 20 -R "span[ptile=20]" #Request 20 cores with all 20 cores on 1 node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
comsol -np 10 batch -inputfile in.mph -outputfile out.mph<br />
<br />
'''Example 2b (Terra)''': Solving a model in shared memory mode using 14 cores on one Terra cluster node. This is similar to Example 1b.<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=14 #Request 14 tasks<br />
#SBATCH --ntasks-per-node=14 #Request 14 tasks/cores per node<br />
#SBATCH --mem=28G #Request 28 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
comsol -np 14 batch -inputfile in.mph -outputfile out.mph<br />
<br />
===Distributed Mode===<br />
<br />
Comsol solvers can also run in distributed mode by checking the "distributed computing" checkbox of the solver when building the model. In this mode, the solver runs on multiple nodes and uses MPI for communication. All solvers except PARDISO support distributed mode; however, PARDISO also has a distributed-computing checkbox, and if it is selected the solver actually used is MUMPS.<br />
<br />
'''Example 3a (Ada)''': Solving a model in distributed mode on two Ada cluster nodes with a total of 40 cores<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 40 -R "span[ptile=20]"      #Request 40 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
comsol -simplecluster -inputfile input.mph -outputfile output.mph<br />
<br />
This is equivalent to:<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 40 -R "span[ptile=20]"      #Request 40 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
cat $LSB_DJOB_HOSTFILE > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 40 batch -inputfile input.mph -outputfile output.mph<br />
<br />
'''Example 3b (Terra)''': Solving a model in distributed mode on two Terra cluster nodes with a total of 56 cores<br />
<br />
#!/bin/bash<br />
#SBATCH --job-name=comsol #Set the job name to "Comsol"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=56 #Request 56 tasks<br />
#SBATCH --ntasks-per-node=28 #Request 28 tasks/cores per node<br />
#SBATCH --mem=56G #Request 56 GB of memory per node<br />
#SBATCH --output=comsol.%j #Send stdout/err to "comsol.[jobID]"<br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
comsol -simplecluster -inputfile input.mph -outputfile output.mph<br />
<br />
===Hybrid Mode===<br />
<br />
Each mode has its pros and cons. Shared memory mode utilizes CPU cores better than distributed mode but can only run on one cluster node, while distributed mode can span more than one physical cluster node. It is usually best to run a solver in a way that takes advantage of both modes. This can be done at the command line by fine-tuning the options -nn, -nnhost, and -np.<br />
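These three options obey simple arithmetic that is worth checking before submission. The shell sketch below uses the layout of Example 4a; all values are illustrative only:<br />

```shell
# Illustrative sanity check of COMSOL hybrid-mode parameters.
# nn     = total number of MPI processes   (-nn)
# nnhost = MPI processes per cluster node  (-nnhost)
# np     = threads (cores) per MPI process (-np)
nn=2
nnhost=1
np=20

nodes=$(( nn / nnhost ))        # cluster nodes the job spans
total_cores=$(( nn * np ))      # cores the job will occupy
echo "nodes=$nodes total_cores=$total_cores"
```

Here nn*np should match the total number of cores requested from the batch system (40 in Examples 4a and 5 below).<br />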
<br />
'''Example 4a (Ada)''': Solving a model in hybrid mode on 2 Ada cluster nodes with 40 cores. In this example, Comsol will spawn 2 MPI tasks in total (one on each cluster node). Each MPI task will be running with 20 threads on 20 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 40 -R "span[ptile=20]"      #Request 40 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol batch -f ./hostfile.$LSB_JOBID -nn 2 -nnhost 1 -np 20 -inputfile input.mph -outputfile output.mph<br />
<br />
'''Example 5 (Ada)''': Solving a model in hybrid mode on 2 Ada cluster nodes with 40 cores. In this example, Comsol will spawn 4 MPI tasks in total (two on each cluster node). Each MPI task will be running with 10 threads on 10 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 40 -R "span[ptile=20]"      #Request 40 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol batch -f ./hostfile.$LSB_JOBID -nn 4 -nnhost 2 -np 10 -inputfile input.mph -outputfile output.mph<br />
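The hybrid examples above use LSF on Ada. A comparable hybrid-mode sketch for a Slurm cluster such as Terra might look as follows; the module version and resource numbers are illustrative and untested, and depending on the Comsol version a hostfile (-f) may still be required as in the LSF examples:<br />

```shell
#!/bin/bash
#SBATCH --job-name=comsol          #Set the job name to "comsol"
#SBATCH --time=01:30:00            #Set the wall clock limit to 1hr and 30min
#SBATCH --ntasks=2                 #Request 2 MPI tasks in total
#SBATCH --ntasks-per-node=1        #Place 1 MPI task on each node
#SBATCH --cpus-per-task=28         #Request 28 cores per task
#SBATCH --mem=56G                  #Request 56 GB of memory per node
#SBATCH --output=comsol.%j         #Send stdout/err to "comsol.[jobID]"
ml Comsol/xxx
export LM_LICENSE_FILE=port@license-server

# 2 MPI tasks, 1 per node, 28 threads each (mirrors Example 4a's layout)
comsol batch -nn 2 -nnhost 1 -np 28 -inputfile input.mph -outputfile output.mph
```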
<br />
===Parametric Sweep===<br />
<br />
Comsol models configured with parametric sweep can also benefit from parallel computing in different ways. A model configured with a parametric sweep must run over a range of parameters or combinations of parameters, and each parameter set can be computed independently. Once a parametric sweep node has been created in the Comsol GUI, the model must also be configured with a cluster sweep to distribute the parameter sets for parallel processing. <br />
<br />
'''Example 6 (Ada)''': Run a parametric sweep model on 40 cores. In this example, 10 parameter combinations run concurrently on two Ada cluster nodes, 5 combinations per node, each running with 4 threads on 4 cores. <br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 40 -R "span[ptile=20]"      #Request 40 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 10 -nnhost 5 -np 4 -inputfile input.mph -outputfile output.mph<br />
<br />
If each combination of parameters requires a large amount of memory to solve, we can assign one combination of parameters per node so that the entire memory of the node is available for solving that combination.<br />
<br />
'''Example 7 (Ada)''': Run a parametric sweep model (with 10 parameter combinations) on 200 cores with each parameter combination taking an entire Ada cluster node.<br />
<br />
#BSUB -J comsol #Set the job name to comsol<br />
 #BSUB -n 200 -R "span[ptile=20]"      #Request 200 cores with 20 cores per node<br />
#BSUB -M 2800 -R "rusage[mem=2800]" #Request 2800MB of memory per core<br />
#BSUB -o output.%J #Set stdout/err to comsol.[jobID]<br />
#BSUB -L /bin/bash #Use the bash shell for the job script<br />
#BSUB -W 2:00 #Set the wall clock limit to 2 hours <br />
ml Comsol/xxx <br />
export LM_LICENSE_FILE=port@license-server<br />
cat $LSB_DJOB_HOSTFILE |uniq > hostfile.$LSB_JOBID<br />
comsol -f ./hostfile.$LSB_JOBID -nn 10 -nnhost 1 -np 20 -inputfile input.mph -outputfile output.mph<br />
<br />
==Common problems==<br />
<br />
1. Disk quota exceeded in home directory<br />
<br />
By default, COMSOL stores all temporary files in your home directory. For large models, you are likely to get a "Disk quota exceeded" error due to the huge number of temporary files written to your home directory. To resolve this issue, redirect temporary files to your scratch directory:<br />
<br />
 comsol -tmpdir /scratch/user/username/comsol/tmp -recoverydir /scratch/user/username/comsol/recovery -np 20 batch -inputfile input.mph -outputfile output.mph<br />
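A sketch of the full workflow — the target directories should exist before COMSOL starts. The sketch uses the $SCRATCH variable with a local fallback purely so that it is self-contained; on the clusters $SCRATCH is predefined:<br />

```shell
# Redirect COMSOL temporary/recovery files to scratch.  On the clusters
# $SCRATCH is predefined; the fallback below only makes the sketch portable.
SCRATCH=${SCRATCH:-$PWD/scratch}
mkdir -p "$SCRATCH/comsol/tmp" "$SCRATCH/comsol/recovery"

# On a compute node you would then run (options as in the examples above):
#   comsol -tmpdir $SCRATCH/comsol/tmp -recoverydir $SCRATCH/comsol/recovery \
#          -np 20 batch -inputfile input.mph -outputfile output.mph
echo "created: $SCRATCH/comsol/tmp"
```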
<br />
[[Category:Software]]<br />
<br />
2. Out of Memory<br />
<br />
<br />
If you receive the "Out of Memory" error, most likely the Java heap size is inadequate. The default Java heap size for Comsol is 2 GB. You can change the value by following the steps below: <br />
<br />
*'''1) load the comsol module'''<br />
ml Comsol/version<br />
<br />
*'''2) copy the Comsol setup file to your home directory'''<br />
<br />
If you are running a single-core Comsol job, copy comsolbatch.ini <br />
<br />
cp $EBROOTCOMSOL/bin/glnxa64/comsolbatch.ini $HOME/comsol.ini<br />
<br />
<br />
If you are running a cluster Comsol job, copy comsolclusterbatch.ini <br />
<br />
cp $EBROOTCOMSOL/bin/glnxa64/comsolclusterbatch.ini $HOME/comsol.ini<br />
<br />
<br />
*'''3) edit the local setup file and increase Xmx'''<br />
 sed -i "s/-Xmx.*/-Xmx8192m/" $HOME/comsol.ini (here we increase the heap size to 8 GB)<br />
*'''4) add '-comsolinifile $HOME/comsol.ini' at the command line.'''<br />
 comsol -comsolinifile $HOME/comsol.ini ... (the ellipsis represents other command-line options)</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Grace:QuickStart&diff=11759Grace:QuickStart2020-12-17T20:47:03Z<p>Pennings: /* tamubatch */</p>
<hr />
<div><H1>Grace Quick Start Guide</H1><br />
__TOC__<br />
== Grace Usage Policies ==<br />
'''Access to Grace is granted with the condition that you will understand and adhere to all TAMU HPRC and Grace-specific policies.''' <br />
<br />
General policies can be found on the [https://hprc.tamu.edu/policies/ HPRC Policies page].<br />
<br />
Grace-specific policies, which are similar to Terra, can be found on the [[Grace:Policies | Grace Policies page]].<br />
<br />
== Accessing Grace ==<br />
<br />
<!--'''For convenience, this topic has been summarized in a short video lesson, which you can view [https://www.youtube.com/watch?v=dypaj5uHpqQ here]. Otherwise, feel free to continue reading.'''--><br />
<br />
<!--Most access to Grace is done via a secure shell session.<br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system.<br />
<br />
The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''NetID''@Grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.--><br />
<br />
Most access to Grace is done via a secure shell session. In addition, '''two-factor authentication''' is required to login to any cluster. <br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use. When starting an ssh session in PuTTY, choose the connection type 'SSH', select port 22, and then type the hostname 'Grace.tamu.edu'. For MobaXterm, select 'Session', 'SSH', and then remote host 'Grace.tamu.edu'. Check the box to specify username and type your NetID. After selecting 'Ok', you will be prompted for Duo Two Factor Authentication. For more detailed instructions, visit the [https://hprc.tamu.edu/wiki/Two_Factor#MobaXterm Two Factor Authentication] page.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system. The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''[NetID]''@grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same as the one used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as you type it into the login prompt.<br />
<br />
=== Off Campus Access ===<br />
Please visit [https://hprc.tamu.edu/wiki/HPRC:Remote_Access this page] to find information on accessing Grace remotely. <br />
<br />
For more detailed instructions on how to access our systems, please see the [[HPRC:Access | HPRC Access page]].<br />
<br />
<br />
== Navigating Grace & Storage Quotas ==<br />
<br />
When you first access Grace, you will be within your ''home'' directory. This directory has smaller storage quotas and should not be used for general-purpose work.<br />
<br />
You can navigate to your ''home'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /home/''NetID'''''<br />
<br />
Your ''scratch'' directory has more storage space than your ''home'' directory and is recommended for general purpose use. You can navigate to your ''scratch'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /scratch/user/''NetID'''''<br />
<br />
You can navigate to ''scratch'' or ''home'' easily by using their respective environment variables. <br />
<br />
Navigate to ''scratch'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $SCRATCH'''<br />
<br />
Navigate to ''home'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $HOME'''<br />
<br />
<font color=purple><br />
Your ''scratch'' directory is restricted to 1TB/250,000 files of storage. This storage quota is '''expandable''' upon request. A user's ''scratch directory'' is '''NOT''' backed up.<br />
<br />
Your ''home'' directory is restricted to 10GB/10,000 files of storage. This storage quota is '''not expandable'''. A user's ''home'' directory is backed up on a nightly basis.<br />
</font><br />
<br />
You can see the current status of your storage quotas with:<br />
[NetID@grace1 ~]$ '''showquota'''<br />
<br />
If you need a storage quota increase, please contact us with justification and the expected length of time that you will need the quota increase.<br />
<br />
==The Batch System==<br />
The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - <font color=red> attempting to circumvent the batch system is not allowed.</font><br />
<br />
On Grace, '''Slurm''' is the batch system that provides job management. <br />
More information on '''Slurm''' can be found in the [[Grace:Batch | Grace Batch]] page.<br />
<br />
== Managing Project Accounts ==<br />
The batch system will charge SUs from either the account specified in the job parameters or, if this parameter is omitted, from your default account. To avoid errors in SU billing, you can view your active accounts and set your default account using the [https://hprc.tamu.edu/wiki/HPRC:myproject myproject] command.<br />
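For example, a job file can carry an explicit account directive so SUs are charged to a specific project (the account number below is made up for illustration):<br />

```shell
##Charge SUs to a specific project account (hypothetical number)
#SBATCH --account=123456789
```

The same option can also be given at submission time instead: '''sbatch --account=123456789 MyJob.slurm'''.<br />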
<br />
== Finding Software ==<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
A list of the most popular software on our systems is available on the [[:SW | HPRC Available Software]] page.<br />
<br />
To '''find''' ''most'' available software on Grace, use the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
To '''search for''' particular software by keyword, use:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
To load a module, use:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
To list all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
To remove all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also frequently a queue of pending software installation requests. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
<br />
== Running Your Program / Preparing a Job File ==<br />
<br />
In order to properly run a program on Grace, you will need to create a job file and submit a job.<br />
<br />
The simple example job file below requests 1 core on 1 node with 2.5GB of RAM for 1.5 hours. Note that typical nodes on Grace have 28 cores with 120GB of usable memory; make sure your job requirements fit within these limits. Any modules that need to be loaded or executable commands will replace the ''"#First Executable Line"'' in this example.<br />
#!/bin/bash<br />
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION<br />
#SBATCH --export=NONE #Do not propagate environment<br />
#SBATCH --get-user-env=L #Replicate login environment<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=JobExample1 #Set the job name to "JobExample1"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=1 #Request 1 task<br />
#SBATCH --ntasks-per-node=1 #Request 1 task/core per node<br />
#SBATCH --mem=2560M #Request 2560MB (2.5GB) per node<br />
#SBATCH --output=Example1Out.%j #Send stdout/err to "Example1Out.[jobID]"<br />
<br />
#First Executable Line<br />
<br />
Note: If your job file has been written on an older Mac or DOS workstation, you will need to use "dos2unix" to remove certain characters that interfere with parsing the script.<br />
<br />
[NetID@grace1 ~]$ '''dos2unix ''MyJob.slurm'''''<br />
<br />
More information on '''job options''' can be found in the [[Grace:Batch#Building_Job_Files | Building Job Files]] section of the [[Grace:Batch | Grace Batch]] page.<br />
<br />
More information on '''dos2unix''' can be found on the [[:SW:dos2unix | dos2unix]] section of the [[:SW | HPRC Available Software]] page.<br />
<br />
== Submitting and Monitoring Jobs ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@grace1 ~]$ '''sbatch ''MyJob.slurm'''''<br />
Submitted batch job 3606<br />
<br />
After the job has been submitted, you are able to monitor it with several methods. <br />
To see the status of all of your jobs, use the following command:<br />
[NetID@grace1 ~]$ '''squeue -u ''NetID'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
3606 myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To see the status of one job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''squeue --job ''XXXX'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
XXXX myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To cancel a job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''scancel ''XXXX'''''<br />
<br />
More information on [[:Grace:Batch#Job_Submission | Job Submission]] and [[:Grace:Batch#Job_Monitoring_and_Control_Commands | Job Monitoring]] Slurm jobs can be found at the [[:Grace:Batch | Grace Batch System]] page.<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user without the need to write a batch script on the Ada, Terra, and Grace clusters. The user only needs to provide the executable commands in a text file, and tamubatch will automatically submit the job to the cluster. There are flags the user may specify to control the parameters of the submitted job.<br />
<br />
For more information, please visit the [[:SW:tamubatch| tamubatch wiki page]]<br />
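As a minimal sketch (file and program names are illustrative), tamubatch takes a plain-text file with one executable command per line:<br />

```shell
# Build a commands file: one executable command per line.
printf '%s\n' './prog1 input1.txt' './prog2 input2.txt' > commands.in

# On a cluster login node you would then submit it with:
#   tamubatch commands.in
wc -l < commands.in     # number of commands in the file
```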
<br />
== Additional Topics ==<br />
<br />
=== Translating Ada/LSF <--> Grace/Slurm ===<br />
<br />
The [[:HPRC:Batch_Translation | HPRC Batch Translation]] page contains information on '''converting''' between LSF, PBS, and Slurm.<br />
<br />
Our staff has also written some example jobs for specific software. These software-specific examples can be seen on the [[:SW | Individual Software Pages]] where available. <br />
<br />
=== Finding Software ===<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
You can see the most popular software on the [[:SW | HPRC Available Software]] page.<br />
<br />
You can '''find''' ''most'' available software on Grace with the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
You can '''search for''' particular software by keyword using:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
You can load a module using:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
You can list all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
You can remove all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also frequently a queue of pending software installation requests. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
=== Transferring Files ===<br />
<br />
Files can be transferred to Grace using the ''scp'' command or a file transfer program.<br />
<br />
Our users most commonly utilize:<br />
* [https://winscp.net/eng/download.php WinSCP] - Straightforward, legacy<br />
* [https://filezilla-project.org/ FileZilla Client] - Easy to use, additional features, available on most platforms<br />
* [https://mobaxterm.mobatek.net/features.html MobaXterm Graphical SFTP] - Included with MobaXterm<br />
<br />
See our [Grace-Filezilla example video] for a demonstration of this process.<br />
<br />
<font color=teal>'''Advice:''' while GUIs are acceptable for file transfers, the cp and scp commands are much quicker and may significantly benefit your workflow.</font><br />
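For example, a single archive can be pushed straight to your scratch directory with scp, run from your local machine (replace NetID; the file name is illustrative and the target path follows the scratch layout described above):<br />

```shell
# From your local workstation, not from a Grace login node:
scp results.tar.gz NetID@grace.tamu.edu:/scratch/user/NetID/
```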
<br />
==== Reliably Transferring Large Files ====<br />
<br />
For files larger than several GB, you will want to consider the use of a more fault-tolerant utility such as rsync.<br />
[NetID@grace1 ~]$ '''rsync -av [-z] ''localdir/ userid@remotesystem:/path/to/remotedir/'''''<br />
<br />
An rsync example can be seen on the [[:Ada:Fast_Data_Transfer#Data_transfer_using_rsync | Ada Fast Transfer]] page.<br />
<!-- See our [Grace-rsync example video] for a demonstration of this process. --><br />
<!-- [Insert info on glob, ftn] --><br />
<br />
=== Graphical User Interfaces (Visualization) ===<br />
<br />
You have '''three options''' for using GUIs on Grace.<br />
<br />
The '''first option''' is to use the Open OnDemand Portal. See the [[SW:Portal | HPRC Portal]] page for more information.<br />
<br />
The '''second option''' is to run on the login node. When doing this, you '''must''' observe the fair-use policy of login node usage. Users commonly violate these policies by accident, resulting in terminated processes, confusion, and warnings from our admins.<br />
<br />
The '''third option''' is to use a VNC job. This method is outside the scope of this guide. See the [[Grace:Remote-Viz | Grace Remote Visualization]] page for more information.<br />
<br />
<br />
<br />
[[Category: Grace]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Grace:QuickStart&diff=11758Grace:QuickStart2020-12-17T20:46:39Z<p>Pennings: /* tamubatch */</p>
<hr />
<div><H1>Grace Quick Start Guide</H1><br />
__TOC__<br />
== Grace Usage Policies ==<br />
'''Access to Grace is granted with the condition that you will understand and adhere to all TAMU HPRC and Grace-specific policies.''' <br />
<br />
General policies can be found on the [https://hprc.tamu.edu/policies/ HPRC Policies page].<br />
<br />
Grace-specific policies, which are similar to Terra, can be found on the [[Grace:Policies | Grace Policies page]].<br />
<br />
== Accessing Grace ==<br />
<br />
<!--'''For convenience, this topic has been summarized in a short video lesson, which you can view [https://www.youtube.com/watch?v=dypaj5uHpqQ here]. Otherwise, feel free to continue reading.'''--><br />
<br />
<!--Most access to Grace is done via a secure shell session.<br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system.<br />
<br />
The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''NetID''@Grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.--><br />
<br />
Most access to Grace is done via a secure shell session. In addition, '''two-factor authentication''' is required to login to any cluster. <br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use. When starting an ssh session in PuTTY, choose the connection type 'SSH', select port 22, and then type the hostname 'Grace.tamu.edu'. For MobaXterm, select 'Session', 'SSH', and then remote host 'Grace.tamu.edu'. Check the box to specify username and type your NetID. After selecting 'Ok', you will be prompted for Duo Two Factor Authentication. For more detailed instructions, visit the [https://hprc.tamu.edu/wiki/Two_Factor#MobaXterm Two Factor Authentication] page.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system. The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''[NetID]''@grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.<br />
<br />
=== Off Campus Access ===<br />
Please visit [https://hprc.tamu.edu/wiki/HPRC:Remote_Access this page] to find information on accessing Grace remotely. <br />
<br />
For more detailed instructions on how to access our systems, please see the [[HPRC:Access | HPRC Access page]].<br />
<br />
<br />
== Navigating Grace & Storage Quotas ==<br />
<br />
When you first access Grace, you will be within your ''home'' directory. This directory has smaller storage quotas and should not be used for general purpose.<br />
<br />
You can navigate to your ''home'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /home/''NetID'''''<br />
<br />
Your ''scratch'' directory has more storage space than your ''home'' directory and is recommended for general purpose use. You can navigate to your ''scratch'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /scratch/user/''NetID'''''<br />
<br />
You can navigate to ''scratch'' or ''home'' easily by using their respective environment variables. <br />
<br />
Navigate to ''scratch'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $SCRATCH'''<br />
<br />
Navigate to ''home'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $HOME'''<br />
<br />
<font color=purple><br />
Your ''scratch'' directory is restricted to 1TB/250,000 files of storage. This storage quota is '''expandable''' upon request. A user's ''scratch directory'' is '''NOT''' backed up.<br />
<br />
Your ''home'' directory is restricted to 10GB/10,000 files of storage. This storage quota is '''not expandable'''. A user's ''home'' directory is backed up on a nightly basis.<br />
</font><br />
<br />
You can see the current status of your storage quotas with:<br />
[NetID@grace1 ~]$ '''showquota'''<br />
<br />
If you need a storage quota increase, please contact us with justification and the expected length of time that you will need the quota increase.<br />
<br />
==The Batch System==<br />
The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - <font color=red> attempting to circumvent the batch system is not allowed.</font><br />
<br />
On Grace, '''Slurm''' is the batch system that provides job management. <br />
More information on '''Slurm''' can be found in the [[Grace:Batch | Grace Batch]] page.<br />
<br />
== Managing Project Accounts ==<br />
The batch system will charge SUs from the either the account specified in the job parameters, or from your default account (if this parameter is omitted). To avoid errors in SU billing, you can view your active accounts, and set your default account using the [https://hprc.tamu.edu/wiki/HPRC:myproject myproject] command.<br />
<br />
== Finding Software ==<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
A list of the most popular software on our systems is available on the [[:SW | HPRC Available Software]] page.<br />
<br />
To '''find''' ''most'' available software on Grace, use the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
To '''search for''' particular software by keyword, use:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
To load a module, use:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
To list all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
To remove all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
<br />
== Running Your Program / Preparing a Job File ==<br />
<br />
In order to properly run a program on Grace, you will need to create a job file and submit a job.<br />
<br />
The simple example job file below requests 1 core on 1 node with 2.5GB of RAM for 1.5 hours. Note that typical nodes on Grace have 28 cores with 120GB of usable memory and ensure that your job requirements will fit within these restrictions. Any modules that need to be loaded or executable commands will replace the ''"#First Executable Line"'' in this example.<br />
#!/bin/bash<br />
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION<br />
#SBATCH --export=NONE #Do not propagate environment<br />
#SBATCH --get-user-env=L #Replicate login environment<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=JobExample1 #Set the job name to "JobExample1"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=1 #Request 1 task<br />
#SBATCH --ntasks-per-node=1 #Request 1 task/core per node<br />
#SBATCH --mem=2560M #Request 2560MB (2.5GB) per node<br />
#SBATCH --output=Example1Out.%j #Send stdout/err to "Example1Out.[jobID]"<br />
<br />
#First Executable Line<br />
<br />
Note: If your job file was written on an older Mac or DOS workstation, you will need to convert it with "dos2unix" to remove the carriage-return characters that interfere with parsing the script.<br />
<br />
[NetID@grace1 ~]$ '''dos2unix ''MyJob.slurm'''''<br />
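If you want to verify that a file actually has DOS line endings, or dos2unix is not at hand, standard tools can do the same job. This is a minimal sketch; it creates a sample CRLF file just for demonstration, and the filename is an example:<br />

```shell
# Create a sample job file with DOS/Windows (CRLF) line endings for demonstration:
printf '#!/bin/bash\r\necho hello\r\n' > MyJob.slurm

# Count lines ending in a carriage return; a non-zero count means conversion is needed:
grep -c $'\r$' MyJob.slurm

# sed can strip the carriage returns in place, as dos2unix would:
sed -i 's/\r$//' MyJob.slurm
```

After the sed command, the grep count drops to zero and the script parses cleanly.<br />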
<br />
More information on '''job options''' can be found in the [[Grace:Batch#Building_Job_Files | Building Job Files]] section of the [[Grace:Batch | Grace Batch]] page.<br />
<br />
More information on '''dos2unix''' can be found on the [[:SW:dos2unix | dos2unix]] section of the [[:SW | HPRC Available Software]] page.<br />
<br />
== Submitting and Monitoring Jobs ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@grace1 ~]$ '''sbatch ''MyJob.slurm'''''<br />
Submitted batch job 3606<br />
<br />
After the job has been submitted, you are able to monitor it with several methods. <br />
To see the status of all of your jobs, use the following command:<br />
[NetID@grace1 ~]$ '''squeue -u ''NetID'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
3606 myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To see the status of one job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''squeue --job ''XXXX'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
XXXX myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
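Because squeue writes plain whitespace-separated columns, its output can be post-processed with standard text tools. The sketch below counts jobs by state; the here-document stands in for real squeue output (using the sample above), and on the cluster you would pipe the output of squeue -u NetID into the same awk command instead:<br />

```shell
# Count jobs by state (STATE is column 7 of the squeue output shown above).
# The here-document simulates squeue output so the example is self-contained.
awk 'NR > 1 { count[$7]++ } END { for (s in count) print s, count[s] }' <<'EOF'
JOBID NAME   USER  PARTITION NODES CPUS STATE   TIME TIME_LEFT START_TIME          REASON NODELIST
3606  myjob2 NetID short     1     3    RUNNING 0:30 00:10:30  2020-11-27T23:44:12 None   tnxt-[0340]
EOF
# prints: RUNNING 1
```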
<br />
To cancel a job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''scancel ''XXXX'''''<br />
<br />
More information on [[:Grace:Batch#Job_Submission | Job Submission]] and [[:Grace:Batch#Job_Monitoring_and_Control_Commands | Job Monitoring]] of Slurm jobs can be found on the [[:Grace:Batch | Grace Batch System]] page.<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user without the need to write a batch script on the Ada, Terra, and Grace clusters. The user only needs to provide the executable commands in a text file, and tamubatch will automatically submit the job to the cluster. There are flags that the user may specify which allow control over the parameters of the submitted job.<br />
<br />
For more information, visit the [[:SW:tamubatch| tamubatch wiki page]].<br />
<br />
== Additional Topics ==<br />
<br />
=== Translating Ada/LSF <--> Grace/Slurm ===<br />
<br />
The [[:HPRC:Batch_Translation | HPRC Batch Translation]] page contains information on '''converting''' between LSF, PBS, and Slurm.<br />
<br />
Our staff has also written some example jobs for specific software. These software-specific examples can be seen on the [[:SW | Individual Software Pages]] where available. <br />
<br />
<br />
=== Transferring Files ===<br />
<br />
Files can be transferred to Grace using the ''scp'' command or a file transfer program.<br />
<br />
Our users most commonly utilize:<br />
* [https://winscp.net/eng/download.php WinSCP] - Straightforward, legacy<br />
* [https://filezilla-project.org/ FileZilla Client] - Easy to use, additional features, available on most platforms<br />
* [https://mobaxterm.mobatek.net/features.html MobaXterm Graphical SFTP] - Included with MobaXterm<br />
<br />
<!-- See our [Grace-Filezilla example video] for a demonstration of this process. --><br />
<br />
<font color=teal>'''Advice:''' while GUIs are acceptable for file transfers, the ''cp'' and ''scp'' commands are much quicker and may significantly benefit your workflow.</font><br />
<br />
==== Reliably Transferring Large Files ====<br />
<br />
For files larger than several GB, consider using a more fault-tolerant utility such as ''rsync''.<br />
[NetID@grace1 ~]$ '''rsync -av [-z] ''localdir/ userid@remotesystem:/path/to/remotedir/'''''<br />
<br />
An rsync example can be seen on the [[:Ada:Fast_Data_Transfer#Data_transfer_using_rsync | Ada Fast Transfer]] page.<br />
<!-- See our [Grace-rsync example video] for a demonstration of this process. --><br />
<!-- [Insert info on glob, ftn] --><br />
<br />
=== Graphical User Interfaces (Visualization) ===<br />
<br />
You have '''three options''' for using GUIs on Grace.<br />
<br />
The '''first option''' is to use the Open OnDemand Portal. See the [[SW:Portal | HPRC Portal]] page for more information.<br />
<br />
The '''second option''' is to run on the login node. When doing this, you '''must''' observe the fair-use policy of login node usage. Users commonly violate these policies by accident, resulting in terminated processes, confusion, and warnings from our admins.<br />
<br />
The '''third option''' is to use a VNC job. This method is outside the scope of this guide. See the [[Grace:Remote-Viz | Grace Remote Visualization]] page for more information.<br />
<br />
<br />
<br />
[[Category: Grace]]</div>
<hr />
<div><H1>Grace Quick Start Guide</H1><br />
__TOC__<br />
== Grace Usage Policies ==<br />
'''Access to Grace is granted with the condition that you will understand and adhere to all TAMU HPRC and Grace-specific policies.''' <br />
<br />
General policies can be found on the [https://hprc.tamu.edu/policies/ HPRC Policies page].<br />
<br />
Grace-specific policies, which are similar to Terra, can be found on the [[Grace:Policies | Grace Policies page]].<br />
<br />
== Accessing Grace ==<br />
<br />
<!--'''For convenience, this topic has been summarized in a short video lesson, which you can view [https://www.youtube.com/watch?v=dypaj5uHpqQ here]. Otherwise, feel free to continue reading.'''--><br />
<br />
<!--Most access to Grace is done via a secure shell session.<br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system.<br />
<br />
The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''NetID''@Grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.--><br />
<br />
Most access to Grace is done via a secure shell session. In addition, '''two-factor authentication''' is required to login to any cluster. <br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use. When starting an ssh session in PuTTY, choose the connection type 'SSH', select port 22, and then type the hostname 'Grace.tamu.edu'. For MobaXterm, select 'Session', 'SSH', and then remote host 'Grace.tamu.edu'. Check the box to specify username and type your NetID. After selecting 'Ok', you will be prompted for Duo Two Factor Authentication. For more detailed instructions, visit the [https://hprc.tamu.edu/wiki/Two_Factor#MobaXterm Two Factor Authentication] page.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system. The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''[NetID]''@grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.<br />
<br />
=== Off Campus Access ===<br />
Please visit [https://hprc.tamu.edu/wiki/HPRC:Remote_Access this page] to find information on accessing Grace remotely. <br />
<br />
For more detailed instructions on how to access our systems, please see the [[HPRC:Access | HPRC Access page]].<br />
<br />
<br />
== Navigating Grace & Storage Quotas ==<br />
<br />
When you first access Grace, you will be within your ''home'' directory. This directory has smaller storage quotas and should not be used for general purpose.<br />
<br />
You can navigate to your ''home'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /home/''NetID'''''<br />
<br />
Your ''scratch'' directory has more storage space than your ''home'' directory and is recommended for general purpose use. You can navigate to your ''scratch'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /scratch/user/''NetID'''''<br />
<br />
You can navigate to ''scratch'' or ''home'' easily by using their respective environment variables. <br />
<br />
Navigate to ''scratch'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $SCRATCH'''<br />
<br />
Navigate to ''home'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $HOME'''<br />
<br />
<font color=purple><br />
Your ''scratch'' directory is restricted to 1TB/250,000 files of storage. This storage quota is '''expandable''' upon request. A user's ''scratch directory'' is '''NOT''' backed up.<br />
<br />
Your ''home'' directory is restricted to 10GB/10,000 files of storage. This storage quota is '''not expandable'''. A user's ''home'' directory is backed up on a nightly basis.<br />
</font><br />
<br />
You can see the current status of your storage quotas with:<br />
[NetID@grace1 ~]$ '''showquota'''<br />
<br />
If you need a storage quota increase, please contact us with justification and the expected length of time that you will need the quota increase.<br />
<br />
==The Batch System==<br />
The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - <font color=red> attempting to circumvent the batch system is not allowed.</font><br />
<br />
On Grace, '''Slurm''' is the batch system that provides job management. <br />
More information on '''Slurm''' can be found in the [[Grace:Batch | Grace Batch]] page.<br />
<br />
== Managing Project Accounts ==<br />
The batch system will charge SUs from the either the account specified in the job parameters, or from your default account (if this parameter is omitted). To avoid errors in SU billing, you can view your active accounts, and set your default account using the [https://hprc.tamu.edu/wiki/HPRC:myproject myproject] command.<br />
<br />
== Finding Software ==<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
A list of the most popular software on our systems is available on the [[:SW | HPRC Available Software]] page.<br />
<br />
To '''find''' ''most'' available software on Grace, use the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
To '''search for''' particular software by keyword, use:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
To load a module, use:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
To list all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
To remove all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
<br />
== Running Your Program / Preparing a Job File ==<br />
<br />
In order to properly run a program on Grace, you will need to create a job file and submit a job.<br />
<br />
The simple example job file below requests 1 core on 1 node with 2.5GB of RAM for 1.5 hours. Note that typical nodes on Grace have 28 cores with 120GB of usable memory and ensure that your job requirements will fit within these restrictions. Any modules that need to be loaded or executable commands will replace the ''"#First Executable Line"'' in this example.<br />
#!/bin/bash<br />
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION<br />
#SBATCH --export=NONE #Do not propagate environment<br />
#SBATCH --get-user-env=L #Replicate login environment<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=JobExample1 #Set the job name to "JobExample1"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=1 #Request 1 task<br />
#SBATCH --ntasks-per-node=1 #Request 1 task/core per node<br />
#SBATCH --mem=2560M #Request 2560MB (2.5GB) per node<br />
#SBATCH --output=Example1Out.%j #Send stdout/err to "Example1Out.[jobID]"<br />
<br />
#First Executable Line<br />
<br />
Note: If your job file has been written on an older Mac or DOS workstation, you will need to use "dos2unix" to remove certain characters that interfere with parsing the script.<br />
<br />
[NetID@grace1 ~]$ '''dos2unix ''MyJob.slurm'''''<br />
<br />
More information on '''job options''' can be found in the [[Grace:Batch#Building_Job_Files | Building Job Files]] section of the [[Grace:Batch | Grace Batch]] page.<br />
<br />
More information on '''dos2unix''' can be found on the [[:SW:dos2unix | dos2unix]] section of the [[:SW | HPRC Available Software]] page.<br />
<br />
== Submitting and Monitoring Jobs ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@grace1 ~]$ '''sbatch ''MyJob.slurm'''''<br />
Submitted batch job 3606<br />
<br />
After the job has been submitted, you are able to monitor it with several methods. <br />
To see the status of all of your jobs, use the following command:<br />
[NetID@grace1 ~]$ '''squeue -u ''NetID'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
3606 myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To see the status of one job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''squeue --job ''XXXX'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
XXXX myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To cancel a job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''scancel ''XXXX'''''<br />
<br />
More information on [[:Grace:Batch#Job_Submission | Job Submission]] and [[:Grace:Batch#Job_Monitoring_and_Control_Commands | Job Monitoring]] Slurm jobs can be found at the [[:Grace:Batch | Grace Batch System]] page.<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user without the need of writing a batch script on the Ada, Terra, and Grace clusters. The user just needs to provide the executable commands in a text file and tamubatch will automatically submit the job to the cluster. There are flags that the user may specify which allows control over the parameters for the job submitted.<br />
<br />
For more information, visit the [[:SW:tamubatch | HPRC tamubatch wiki page]].<br />
<br />
== Additional Topics ==<br />
<br />
=== Translating Ada/LSF <--> Grace/Slurm ===<br />
<br />
The [[:HPRC:Batch_Translation | HPRC Batch Translation]] page contains information on '''converting''' between LSF, PBS, and Slurm.<br />
<br />
Our staff has also written some example jobs for specific software. These software-specific examples can be seen on the [[:SW | Individual Software Pages]] where available. <br />
<br />
=== Finding Software ===<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
You can see the most popular software on the [[:SW | HPRC Available Software]] page.<br />
<br />
You can '''find''' ''most'' available software on Grace with the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
You can '''search for''' particular software by keyword using:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
You can load a module using:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
You can list all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
You can remove all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
=== Transferring Files ===<br />
<br />
Files can be transferred to Grace using the ''scp'' command or a file transfer program.<br />
<br />
Our users most commonly utilize:<br />
* [https://winscp.net/eng/download.php WinSCP] - Straightforward, legacy<br />
* [https://filezilla-project.org/ FileZilla Client] - Easy to use, additional features, available on most platforms<br />
* [https://mobaxterm.mobatek.net/features.html MobaXterm Graphical SFTP] - Included with MobaXterm<br />
<br />
<!-- See our [Grace-Filezilla example video] for a demonstration of this process. --><br />
<br />
<font color=teal>'''Advice:''' while GUIs are acceptable for file transfers, the cp and scp commands are much quicker and may significantly benefit your workflow.</font><br />
<br />
==== Reliably Transferring Large Files ====<br />
<br />
For files larger than several GB, consider using a more fault-tolerant utility such as ''rsync''.<br />
[NetID@grace1 ~]$ '''rsync -av [-z] ''localdir/ userid@remotesystem:/path/to/remotedir/'''''<br />
<br />
An rsync example can be seen on the [[:Ada:Fast_Data_Transfer#Data_transfer_using_rsync | Ada Fast Transfer]] page.<br />
<!-- See our [Grace-rsync example video] for a demonstration of this process. --><br />
<!-- [Insert info on glob, ftn] --><br />
<br />
=== Graphical User Interfaces (Visualization) ===<br />
<br />
You have '''three options''' for using GUIs on Grace.<br />
<br />
The '''first option''' is to use the Open OnDemand Portal. See the [[SW:Portal | HPRC Portal]] page for more information.<br />
<br />
The '''second option''' is to run on the login node. When doing this, you '''must''' observe the fair-use policy of login node usage. Users commonly violate these policies by accident, resulting in terminated processes, confusion, and warnings from our admins.<br />
<br />
The '''third option''' is to use a VNC job. This method is outside the scope of this guide. See the [[Grace:Remote-Viz | Grace Remote Visualization]] page for more information.<br />
<br />
<br />
<br />
[[Category: Grace]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Grace:QuickStart&diff=11755Grace:QuickStart2020-12-17T20:41:57Z<p>Pennings: /* tamubatch */</p>
<hr />
<div><H1>Grace Quick Start Guide</H1><br />
__TOC__<br />
== Grace Usage Policies ==<br />
'''Access to Grace is granted with the condition that you will understand and adhere to all TAMU HPRC and Grace-specific policies.''' <br />
<br />
General policies can be found on the [https://hprc.tamu.edu/policies/ HPRC Policies page].<br />
<br />
Grace-specific policies, which are similar to Terra, can be found on the [[Grace:Policies | Grace Policies page]].<br />
<br />
== Accessing Grace ==<br />
<br />
<!--'''For convenience, this topic has been summarized in a short video lesson, which you can view [https://www.youtube.com/watch?v=dypaj5uHpqQ here]. Otherwise, feel free to continue reading.'''--><br />
<br />
<!--Most access to Grace is done via a secure shell session.<br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system.<br />
<br />
The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''NetID''@Grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.--><br />
<br />
Most access to Grace is done via a secure shell session. In addition, '''two-factor authentication''' is required to log in to any cluster. <br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use. When starting an ssh session in PuTTY, choose the connection type 'SSH', select port 22, and then type the hostname 'Grace.tamu.edu'. For MobaXterm, select 'Session', 'SSH', and then remote host 'Grace.tamu.edu'. Check the box to specify username and type your NetID. After selecting 'Ok', you will be prompted for Duo Two Factor Authentication. For more detailed instructions, visit the [https://hprc.tamu.edu/wiki/Two_Factor#MobaXterm Two Factor Authentication] page.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system. The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''[NetID]''@grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same as the one used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as you type it into the login prompt.<br />
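On '''Mac''' and '''Linux/Unix''', an SSH client configuration entry can shorten the login command. The entry below is a sketch for your ~/.ssh/config file; the alias name ''grace'' is arbitrary, and ''NetID'' is a placeholder for your own NetID:<br />

```
# Hypothetical ~/.ssh/config entry; afterwards 'ssh grace' opens the same session
Host grace
    HostName grace.tamu.edu
    User NetID
```

Two-factor authentication is still required when connecting this way.<br />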
<br />
=== Off Campus Access ===<br />
Please visit [https://hprc.tamu.edu/wiki/HPRC:Remote_Access this page] to find information on accessing Grace remotely. <br />
<br />
For more detailed instructions on how to access our systems, please see the [[HPRC:Access | HPRC Access page]].<br />
<br />
<br />
== Navigating Grace & Storage Quotas ==<br />
<br />
When you first access Grace, you will be within your ''home'' directory. This directory has smaller storage quotas and should not be used for general-purpose work.<br />
<br />
You can navigate to your ''home'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /home/''NetID'''''<br />
<br />
Your ''scratch'' directory has more storage space than your ''home'' directory and is recommended for general purpose use. You can navigate to your ''scratch'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /scratch/user/''NetID'''''<br />
<br />
You can navigate to ''scratch'' or ''home'' easily by using their respective environment variables. <br />
<br />
Navigate to ''scratch'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $SCRATCH'''<br />
<br />
Navigate to ''home'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $HOME'''<br />
<br />
<font color=purple><br />
Your ''scratch'' directory is restricted to 1TB/250,000 files of storage. This storage quota is '''expandable''' upon request. A user's ''scratch directory'' is '''NOT''' backed up.<br />
<br />
Your ''home'' directory is restricted to 10GB/10,000 files of storage. This storage quota is '''not expandable'''. A user's ''home'' directory is backed up on a nightly basis.<br />
</font><br />
<br />
You can see the current status of your storage quotas with:<br />
[NetID@grace1 ~]$ '''showquota'''<br />
<br />
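Because quotas limit file counts as well as bytes, it can be useful to check how many files a directory tree holds before approaching the 250,000-file ''scratch'' limit. A sketch, assuming the $SCRATCH environment variable is set as shown above:<br />

```shell
# Count regular files and report total size under $SCRATCH; both figures
# count against the quota reported by showquota
find "$SCRATCH" -type f | wc -l
du -sh "$SCRATCH"
```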
If you need a storage quota increase, please contact us with justification and the expected length of time that you will need the quota increase.<br />
<br />
==The Batch System==<br />
The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - <font color=red> attempting to circumvent the batch system is not allowed.</font><br />
<br />
On Grace, '''Slurm''' is the batch system that provides job management. <br />
More information on '''Slurm''' can be found in the [[Grace:Batch | Grace Batch]] page.<br />
<br />
== Managing Project Accounts ==<br />
The batch system will charge SUs to either the account specified in the job parameters or, if that parameter is omitted, your default account. To avoid errors in SU billing, you can view your active accounts and set your default account using the [https://hprc.tamu.edu/wiki/HPRC:myproject myproject] command.<br />
<br />
== Finding Software ==<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
A list of the most popular software on our systems is available on the [[:SW | HPRC Available Software]] page.<br />
<br />
To '''find''' ''most'' available software on Grace, use the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
To '''search for''' particular software by keyword, use:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
To load a module, use:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
To list all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
To remove all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
<br />
== Running Your Program / Preparing a Job File ==<br />
<br />
In order to properly run a program on Grace, you will need to create a job file and submit a job.<br />
<br />
The simple example job file below requests 1 core on 1 node with 2.5GB of RAM for 1.5 hours. Note that typical compute nodes on Grace have 48 cores and 384GB of RAM; make sure that your job requirements will fit within these restrictions. Any modules that need to be loaded or executable commands will replace the ''"#First Executable Line"'' in this example.<br />
#!/bin/bash<br />
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION<br />
#SBATCH --export=NONE #Do not propagate environment<br />
#SBATCH --get-user-env=L #Replicate login environment<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=JobExample1 #Set the job name to "JobExample1"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=1 #Request 1 task<br />
#SBATCH --ntasks-per-node=1 #Request 1 task/core per node<br />
#SBATCH --mem=2560M #Request 2560MB (2.5GB) per node<br />
#SBATCH --output=Example1Out.%j #Send stdout/err to "Example1Out.[jobID]"<br />
<br />
#First Executable Line<br />
<br />
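As a sketch of how the template is filled in, the variant below requests 4 cores for a single multithreaded task. The final lines are placeholders, not real Grace software:<br />

```shell
#!/bin/bash
##NECESSARY JOB SPECIFICATIONS
#SBATCH --job-name=JobExample2      #Set the job name to "JobExample2"
#SBATCH --time=02:00:00             #Set the wall clock limit to 2hr
#SBATCH --ntasks=1                  #Request 1 task
#SBATCH --cpus-per-task=4           #Request 4 cores for the one task
#SBATCH --mem=8G                    #Request 8GB per node
#SBATCH --output=Example2Out.%j     #Send stdout/err to "Example2Out.[jobID]"

#First Executable Line: outside a Slurm job SLURM_CPUS_PER_TASK is unset,
#so default to 4 here so the script can also be sanity-checked standalone
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-4}
echo "Running with $OMP_NUM_THREADS threads"   #replace with your program
```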
Note: If your job file has been written on a Windows/DOS or older Mac workstation, you will need to use "dos2unix" to remove hidden line-ending characters that interfere with parsing the script.<br />
<br />
[NetID@grace1 ~]$ '''dos2unix ''MyJob.slurm'''''<br />
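If ''dos2unix'' is not available for some reason, stripping the trailing carriage-return characters with ''sed'' has the same effect on DOS-style files; a sketch:<br />

```shell
# Remove the trailing \r that Windows/DOS editors add to each line;
# afterwards the job file parses as a normal Unix script
sed -i 's/\r$//' MyJob.slurm
```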
<br />
More information on '''job options''' can be found in the [[Grace:Batch#Building_Job_Files | Building Job Files]] section of the [[Grace:Batch | Grace Batch]] page.<br />
<br />
More information on '''dos2unix''' can be found on the [[:SW:dos2unix | dos2unix]] section of the [[:SW | HPRC Available Software]] page.<br />
<br />
== Submitting and Monitoring Jobs ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@grace1 ~]$ '''sbatch ''MyJob.slurm'''''<br />
Submitted batch job 3606<br />
<br />
After the job has been submitted, you are able to monitor it with several methods. <br />
To see the status of all of your jobs, use the following command:<br />
[NetID@grace1 ~]$ '''squeue -u ''NetID'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
3606 myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To see the status of one job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''squeue --job ''XXXX'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
XXXX myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To cancel a job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''scancel ''XXXX'''''<br />
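When submissions are scripted, the job ID can be captured at submission time; the sketch below uses sbatch's --parsable flag, which makes sbatch print only the job ID:<br />

```shell
# Submit a job and keep its ID so later squeue/scancel calls can be scripted
jobid=$(sbatch --parsable MyJob.slurm)
echo "Submitted job ${jobid}"
squeue --job "${jobid}"
```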
<br />
More information on [[:Grace:Batch#Job_Submission | Job Submission]] and [[:Grace:Batch#Job_Monitoring_and_Control_Commands | Job Monitoring]] of Slurm jobs can be found on the [[:Grace:Batch | Grace Batch System]] page.<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user on the Ada, Terra, and Grace clusters without the need to write a batch script. The user just needs to provide the executable commands in a text file, and tamubatch will automatically submit the job to the cluster. There are flags that the user may specify which allow control over the parameters of the submitted job.<br />
<br />
For more information, visit [https://hprc.tamu.edu/wiki/SW:tamubatch this page.]<br />
<br />
== Additional Topics ==<br />
<br />
=== Translating Ada/LSF <--> Grace/Slurm ===<br />
<br />
The [[:HPRC:Batch_Translation | HPRC Batch Translation]] page contains information on '''converting''' between LSF, PBS, and Slurm.<br />
<br />
Our staff has also written some example jobs for specific software. These software-specific examples can be seen on the [[:SW | Individual Software Pages]] where available. <br />
<br />
=== Finding Software ===<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
You can see the most popular software on the [[:SW | HPRC Available Software]] page.<br />
<br />
You can '''find''' ''most'' available software on Grace with the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
You can '''search for''' particular software by keyword using:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
You can load a module using:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
You can list all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
You can remove all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
=== Transferring Files ===<br />
<br />
Files can be transferred to Grace using the ''scp'' command or a file transfer program.<br />
<br />
Our users most commonly utilize:<br />
* [https://winscp.net/eng/download.php WinSCP] - Straightforward, legacy<br />
* [https://filezilla-project.org/ FileZilla Client] - Easy to use, additional features, available on most platforms<br />
* [https://mobaxterm.mobatek.net/features.html MobaXterm Graphical SFTP] - Included with MobaXterm<br />
<br />
<!-- See our [Grace-Filezilla example video] for a demonstration of this process. --><br />
<br />
<font color=teal>'''Advice:''' while GUIs are acceptable for file transfers, the cp and scp commands are much quicker and may significantly benefit your workflow.</font><br />
<br />
==== Reliably Transferring Large Files ====<br />
<br />
For files larger than several GB, consider using a more fault-tolerant utility such as ''rsync''.<br />
[NetID@grace1 ~]$ '''rsync -av [-z] ''localdir/ userid@remotesystem:/path/to/remotedir/'''''<br />
<br />
An rsync example can be seen on the [[:Ada:Fast_Data_Transfer#Data_transfer_using_rsync | Ada Fast Transfer]] page.<br />
<!-- See our [Grace-rsync example video] for a demonstration of this process. --><br />
<!-- [Insert info on glob, ftn] --><br />
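For very large single files, rsync's --partial option keeps partially transferred data, so an interrupted copy resumes instead of starting over when the same command is re-run. A sketch with placeholder file, user, and paths:<br />

```shell
# Restartable transfer of one large file to Grace scratch space;
# --progress shows transfer status, --partial keeps interrupted pieces
rsync -av --partial --progress bigfile.dat NetID@grace.tamu.edu:/scratch/user/NetID/
```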
<br />
=== Graphical User Interfaces (Visualization) ===<br />
<br />
You have '''three options''' for using GUIs on Grace.<br />
<br />
The '''first option''' is to use the Open OnDemand Portal. See the [[SW:Portal | HPRC Portal]] page for more information.<br />
<br />
The '''second option''' is to run on the login node. When doing this, you '''must''' observe the fair-use policy of login node usage. Users commonly violate these policies by accident, resulting in terminated processes, confusion, and warnings from our admins.<br />
<br />
The '''third option''' is to use a VNC job. This method is outside the scope of this guide. See the [[Grace:Remote-Viz | Grace Remote Visualization]] page for more information.<br />
<br />
<br />
<br />
[[Category: Grace]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Grace:QuickStart&diff=11754Grace:QuickStart2020-12-17T20:41:45Z<p>Pennings: /* tamubatch */</p>
<hr />
<div><H1>Grace Quick Start Guide</H1><br />
__TOC__<br />
== Grace Usage Policies ==<br />
'''Access to Grace is granted with the condition that you will understand and adhere to all TAMU HPRC and Grace-specific policies.''' <br />
<br />
General policies can be found on the [https://hprc.tamu.edu/policies/ HPRC Policies page].<br />
<br />
Grace-specific policies, which are similar to Terra, can be found on the [[Grace:Policies | Grace Policies page]].<br />
<br />
== Accessing Grace ==<br />
<br />
<!--'''For convenience, this topic has been summarized in a short video lesson, which you can view [https://www.youtube.com/watch?v=dypaj5uHpqQ here]. Otherwise, feel free to continue reading.'''--><br />
<br />
<!--Most access to Grace is done via a secure shell session.<br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system.<br />
<br />
The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''NetID''@Grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.--><br />
<br />
Most access to Grace is done via a secure shell session. In addition, '''two-factor authentication''' is required to login to any cluster. <br />
<br />
Users on '''Windows''' computers use either [http://www.putty.org/ PuTTY] or [http://mobaxterm.mobatek.net/ MobaXterm]. If MobaXterm works on your computer, it is usually easier to use. When starting an ssh session in PuTTY, choose the connection type 'SSH', select port 22, and then type the hostname 'Grace.tamu.edu'. For MobaXterm, select 'Session', 'SSH', and then remote host 'Grace.tamu.edu'. Check the box to specify username and type your NetID. After selecting 'Ok', you will be prompted for Duo Two Factor Authentication. For more detailed instructions, visit the [https://hprc.tamu.edu/wiki/Two_Factor#MobaXterm Two Factor Authentication] page.<br />
<br />
Users on '''Mac''' and '''Linux/Unix''' should use whatever SSH-capable terminal is available on their system. The command to connect to Grace is as follows. Be sure to replace [NetID] with your TAMU NetID. <br />
[user1@localhost ~]$ '''ssh ''[NetID]''@grace.tamu.edu'''<br />
<font color=teal>'''Note:''' In this example ''[user1@localhost ~]$'' represents the command prompt on your local machine.</font> <br><br />
Your login password is the same that used on [https://howdy.tamu.edu/ Howdy]. You will not see your password as your type it into the login prompt.<br />
<br />
=== Off Campus Access ===<br />
Please visit [https://hprc.tamu.edu/wiki/HPRC:Remote_Access this page] to find information on accessing Grace remotely. <br />
<br />
For more detailed instructions on how to access our systems, please see the [[HPRC:Access | HPRC Access page]].<br />
<br />
<br />
== Navigating Grace & Storage Quotas ==<br />
<br />
When you first access Grace, you will be within your ''home'' directory. This directory has smaller storage quotas and should not be used for general purpose.<br />
<br />
You can navigate to your ''home'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /home/''NetID'''''<br />
<br />
Your ''scratch'' directory has more storage space than your ''home'' directory and is recommended for general purpose use. You can navigate to your ''scratch'' directory with the following command:<br />
[NetID@grace1 ~]$ '''cd /scratch/user/''NetID'''''<br />
<br />
You can navigate to ''scratch'' or ''home'' easily by using their respective environment variables. <br />
<br />
Navigate to ''scratch'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $SCRATCH'''<br />
<br />
Navigate to ''home'' with the following command:<br />
[NetID@grace1 ~]$ '''cd $HOME'''<br />
<br />
<font color=purple><br />
Your ''scratch'' directory is restricted to 1TB/250,000 files of storage. This storage quota is '''expandable''' upon request. A user's ''scratch directory'' is '''NOT''' backed up.<br />
<br />
Your ''home'' directory is restricted to 10GB/10,000 files of storage. This storage quota is '''not expandable'''. A user's ''home'' directory is backed up on a nightly basis.<br />
</font><br />
<br />
You can see the current status of your storage quotas with:<br />
[NetID@grace1 ~]$ '''showquota'''<br />
<br />
If you need a storage quota increase, please contact us with justification and the expected length of time that you will need the quota increase.<br />
<br />
==The Batch System==<br />
The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - <font color=red> attempting to circumvent the batch system is not allowed.</font><br />
<br />
On Grace, '''Slurm''' is the batch system that provides job management. <br />
More information on '''Slurm''' can be found in the [[Grace:Batch | Grace Batch]] page.<br />
<br />
== Managing Project Accounts ==<br />
The batch system will charge SUs from the either the account specified in the job parameters, or from your default account (if this parameter is omitted). To avoid errors in SU billing, you can view your active accounts, and set your default account using the [https://hprc.tamu.edu/wiki/HPRC:myproject myproject] command.<br />
<br />
== Finding Software ==<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
A list of the most popular software on our systems is available on the [[:SW | HPRC Available Software]] page.<br />
<br />
To '''find''' ''most'' available software on Grace, use the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
To '''search for''' particular software by keyword, use:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
To load a module, use:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
To list all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
To remove all currently loaded modules, use:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
<br />
== Running Your Program / Preparing a Job File ==<br />
<br />
In order to properly run a program on Grace, you will need to create a job file and submit a job.<br />
<br />
The simple example job file below requests 1 core on 1 node with 2.5GB of RAM for 1.5 hours. Note that typical nodes on Grace have 28 cores with 120GB of usable memory and ensure that your job requirements will fit within these restrictions. Any modules that need to be loaded or executable commands will replace the ''"#First Executable Line"'' in this example.<br />
#!/bin/bash<br />
##ENVIRONMENT SETTINGS; CHANGE WITH CAUTION<br />
#SBATCH --export=NONE #Do not propagate environment<br />
#SBATCH --get-user-env=L #Replicate login environment<br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=JobExample1 #Set the job name to "JobExample1"<br />
#SBATCH --time=01:30:00 #Set the wall clock limit to 1hr and 30min<br />
#SBATCH --ntasks=1 #Request 1 task<br />
#SBATCH --ntasks-per-node=1 #Request 1 task/core per node<br />
#SBATCH --mem=2560M #Request 2560MB (2.5GB) per node<br />
#SBATCH --output=Example1Out.%j #Send stdout/err to "Example1Out.[jobID]"<br />
<br />
#First Executable Line<br />
<br />
Note: If your job file has been written on an older Mac or DOS workstation, you will need to use "dos2unix" to remove certain characters that interfere with parsing the script.<br />
<br />
[NetID@grace1 ~]$ '''dos2unix ''MyJob.slurm'''''<br />
<br />
More information on '''job options''' can be found in the [[Grace:Batch#Building_Job_Files | Building Job Files]] section of the [[Grace:Batch | Grace Batch]] page.<br />
<br />
More information on '''dos2unix''' can be found on the [[:SW:dos2unix | dos2unix]] section of the [[:SW | HPRC Available Software]] page.<br />
<br />
== Submitting and Monitoring Jobs ==<br />
Once you have your job file ready, it is time to submit your job. You can submit your job to slurm with the following command:<br />
[NetID@grace1 ~]$ '''sbatch ''MyJob.slurm'''''<br />
Submitted batch job 3606<br />
<br />
After the job has been submitted, you are able to monitor it with several methods. <br />
To see the status of all of your jobs, use the following command:<br />
[NetID@grace1 ~]$ '''squeue -u ''NetID'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
3606 myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To see the status of one job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''squeue --job ''XXXX'''''<br />
JOBID NAME USER PARTITION NODES CPUS STATE TIME TIME_LEFT START_TIME REASON NODELIST <br />
XXXX myjob2 NetID short 1 3 RUNNING 0:30 00:10:30 2020-11-27T23:44:12 None tnxt-[0340] <br />
<br />
To cancel a job, use the following command, where ''XXXX'' is the JobID:<br />
[NetID@grace1 ~]$ '''scancel ''XXXX'''''<br />
<br />
More information on [[:Grace:Batch#Job_Submission | Job Submission]] and [[:Grace:Batch#Job_Monitoring_and_Control_Commands | Job Monitoring]] Slurm jobs can be found at the [[:Grace:Batch | Grace Batch System]] page.<br />
<br />
== tamubatch ==<br />
<br />
'''tamubatch''' is an automatic batch job script that submits jobs for the user without the need of writing a batch script on the Ada, Terra, and Grace clusters. The user just needs to provide the executable commands in a text file and tamubatch will automatically submit the job to the cluster. There are flags that the user may specify which allows control over the parameters for the job submitted.<br />
<br />
<br />
For more information, visit [https://hprc.tamu.edu/wiki/SW:tamubatch this page.]<br />
<br />
== Additional Topics ==<br />
<br />
=== Translating Ada/LSF <--> Grace/Slurm ===<br />
<br />
The [[:HPRC:Batch_Translation | HPRC Batch Translation]] page contains information on '''converting''' between LSF, PBS, and Slurm.<br />
<br />
Our staff has also written some example jobs for specific software. These software-specific examples can be seen on the [[:SW | Individual Software Pages]] where available. <br />
<br />
=== Finding Software ===<br />
<br />
Software on Grace is loaded using '''modules'''.<br />
<br />
You can see the most popular software on the [[:SW | HPRC Available Software]] page.<br />
<br />
You can '''find''' ''most'' available software on Grace with the following command:<br />
[NetID@grace1 ~]$ '''module avail'''<br />
<br />
You can '''search for''' particular software by keyword using:<br />
[NetID@grace1 ~]$ '''module spider ''keyword'''''<br />
<br />
You can load a module using:<br />
[NetID@grace1 ~]$ '''module load ''moduleName'''''<br />
<br />
You can list all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module list'''<br />
<br />
You can remove all currently loaded modules using:<br />
[NetID@grace1 ~]$ '''module purge'''<br />
<br />
If you need '''new software''' or '''an update''', please contact us with your request. <br />
<br />
There are restrictions on what software we can install. There is also regularly a queue of requested software installations. <br />
<br />
<font color=teal>Please account for '''delays''' in your installation request timeline. </font><br />
<br />
=== Transferring Files ===<br />
<br />
Files can be transferred to Grace using the ''scp'' command or a file transfer program.<br />
<br />
Our users most commonly utilize:<br />
* [https://winscp.net/eng/download.php WinSCP] - Straightforward, legacy<br />
* [https://filezilla-project.org/ FileZilla Client] - Easy to use, additional features, available on most platforms<br />
* [https://mobaxterm.mobatek.net/features.html MobaXterm Graphical SFTP] - Included with MobaXterm<br />
<br />
See our [Grace-Filezilla example video] for a demonstration of this process.<br />
<br />
<font color=teal>'''Advice:''' while GUIs are acceptable for file transfers, the cp and scp commands are much quicker and may significantly benefit your workflow.</font><br />
<br />
==== Reliably Transferring Large Files ====<br />
<br />
For files larger than several GB, you will want to consider the use of a more fault-tolerant utility such as rsync.<br />
[NetID@grace1 ~]$ '''rsync -av [-z] ''localdir/ userid@remotesystem:/path/to/remotedir/'''''<br />
<br />
An rsync example can be seen on the [[:Ada:Fast_Data_Transfer#Data_transfer_using_rsync | Ada Fast Transfer]] page.<br />
<!-- See our [Grace-rsync example video] for a demonstration of this process. --><br />
<!-- [Insert info on glob, ftn] --><br />
<br />
=== Graphical User Interfaces (Visualization) ===<br />
<br />
You have '''three options''' for using GUIs on Grace.<br />
<br />
The '''first option''' is to use the Open OnDemand Portal. See the [[SW:Portal | HPRC Portal]] page for more information.<br />
<br />
The '''second option''' is to run on the login node. When doing this, you '''must''' observe the fair-use policy of login node usage. Users commonly violate these policies by accident, resulting in terminated processes, confusion, and warnings from our admins.<br />
<br />
The '''third option''' is to use a VNC job. This method is outside the scope of this guide. See the [[Grace:Remote-Viz | Grace Remote Visualization]] page for more information.<br />
<br />
<br />
<br />
[[Category: Grace]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Grace:Intro&diff=11708Grace:Intro2020-12-14T21:16:31Z<p>Pennings: /* Namesake */</p>

<hr />
<div><H1>Grace: A Dell x86 HPC Cluster</H1><br />
__TOC__<br />
=== Hardware Overview ===<br />
<br />
----<br />
[[Image:Grace-racks.jpg|right|400px|caption]]<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
| System Name:<br />
| Grace<br />
|-<br />
| Host Name:<br />
| grace.hprc.tamu.edu<br />
|-<br />
| Operating System:<br />
| Linux (CentOS 7)<br />
|-<br />
| Total Compute Cores/Nodes:<br />
| 44,656 cores<br>925 nodes<br />
|-<br />
| Compute Nodes:<br />
| 800 48-core compute nodes, each with 384GB RAM <br> 100 48-core GPU nodes, each with two A100 40GB GPU accelerators and 384GB RAM <br>9 48-core GPU nodes, each with two RTX 6000 24GB GPU accelerators and 384 GB RAM<br>8 48-core GPU nodes, each with four T4 16GB GPU accelerators<br> 8 80-core large memory nodes, each with 3TB RAM<br />
|-<br />
| Interconnect:<br />
| Mellanox HDR Infiniband<br />
|-<br />
| Peak Performance:<br />
| 2,638.5 TFLOPS<br />
|-<br />
| Global Disk:<br />
| 5PB (usable) via DDN appliance for general use <br>?PB (raw) via Lenovo's DSS purchased by and dedicated for ??<br />
|-<br />
| File System:<br />
| Lustre and GPFS<br />
|-<br />
| Batch Facility:<br />
| [http://slurm.schedmd.com/ Slurm by SchedMD]<br />
|-<br />
| Location:<br />
| West Campus Data Center<br />
|-<br />
| Production Date:<br />
| Spring 2021<br />
|}<br />
<br />
Grace is an Intel x86-64 Linux cluster with 925 compute nodes (44,656 total cores) and 5 login nodes. There are 800 compute nodes with 384 GB of memory, and 117 GPU nodes with 384 GB of memory. Among the 117 GPU nodes, there are 100 GPU nodes with two A100 40 GB GPU cards, 9 GPU nodes with two RTX 6000 24GB GPU cards, and 8 GPU nodes with four T4 16GB GPU cards. Each of these 800 compute nodes and 117 GPU nodes is a dual-socket server with two Intel 6248R 3.0GHz 24-core processors. There are also 8 compute nodes with 3 TB of memory and four Intel 6248 2.5 GHz 20-core processors.<br />
<br />
The interconnecting fabric is a two-level fat-tree based on HDR Infiniband. High performance mass storage of 5 petabytes (usable) capacity is made available to all nodes by DDN.<br />
<br />
For details on using this system, see the [[Grace | User Guide for Grace]].<br />
<br />
<br />
== Compute Nodes ==<br />
<br />
A description of the four types of compute nodes is below:<br />
<br />
{| class="wikitable" style="text-align: center;"<br />
|+ Table 1 Details of Compute Nodes<br />
!<br />
! General 384GB <br> Compute<br />
! GPU A100<br> Compute<br />
! GPU RTX6000<br> Compute<br />
! GPU T4<br> Compute<br />
! Large memory 3TB<br> Compute<br />
|-<br />
| Total Nodes<br />
| 800<br />
| 100<br />
| 9<br />
| 8<br />
| 8<br />
|-<br />
| Processor Type<br />
| colspan=4 | Intel Xeon 6248R 3.0GHz 24-core<br />
| Intel 6248 2.5 GHz 20-core<br />
|-<br />
| Sockets/Node<br />
| colspan=4 | 2<br />
| 4<br />
|-<br />
| Cores/Node<br />
| colspan=4 | 48<br />
| 80<br />
|-<br />
| Memory/Node<br />
| colspan=4 | 384 GB DDR4, 3200 MHz<br />
| 3 TB DDR4, 3200 MHz<br />
|-<br />
| Accelerator(s)<br />
| N/A<br />
| 2 NVIDIA A100 40GB Accelerator<br />
| 2 NVIDIA RTX6000 24GB Accelerator<br />
| 4 NVIDIA T4 16GB Accelerator<br />
| N/A<br />
|-<br />
| Interconnect<br />
| colspan=5 | Mellanox HDR Infiniband<br />
|-<br />
|Local Disk Space<br />
| colspan=5 | 1.6TB NVMe<br />
|}<br />
<br />
<br />
== Usable Memory for Batch Jobs ==<br />
<br />
While nodes on Grace have either 384GB or 3TB of RAM, some of this memory is used to maintain the software and operating system of the node. In most cases, excessive memory requests will be automatically rejected by SLURM.<br />
<br />
The table below contains information regarding the approximate limits of Grace memory hardware and our suggestions on its use.<br />
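As an illustration only (the values below are assumptions for the sketch, not official limits), a batch request that stays safely below a general compute node's 384GB hardware total might look like:<br />
<br />
```
#SBATCH --nodes=1
#SBATCH --mem=360G   # assumed value, deliberately below the 384GB hardware total
```
<br />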
<br />
<br />
== Login Nodes ==<br />
<br />
The '''grace.hprc.tamu.edu''' hostname can be used to access the Grace cluster. This translates into one of the five login nodes, '''grace[1-5].hprc.tamu.edu'''. To access a specific login node use its corresponding host name (e.g., grace2.hprc.tamu.edu). All login nodes have 10 GbE connections to the TAMU campus network and direct access to all global parallel (Lustre-based) file systems. The table below provides more details about the hardware configuration of the login nodes.<br />
<br />
{| class="wikitable" style="text-align: center;" <br />
|+ Table 2: Details of Login Nodes<br />
!<br />
! No Accelerator<br />
! NVIDIA A100 Accelerator<br />
! NVIDIA RTX6000 Accelerator<br />
! NVIDIA T4 Accelerator<br />
|-<br />
| HostNames<br />
| grace1.hprc.tamu.edu<br>grace2.hprc.tamu.edu<br />
| grace3.hprc.tamu.edu<br />
| grace4.hprc.tamu.edu<br />
| grace5.hprc.tamu.edu<br />
|-<br />
| Processor Type<br />
| colspan=4 | Intel Xeon 6248R 3.0GHz 24-core<br />
|-<br />
| Memory<br />
| colspan=4 | 384 GB DDR4 3200 MHz<br />
|-<br />
| Total Nodes<br />
| 2<br />
| colspan=3 | 1<br />
|-<br />
| Cores/Node<br />
| colspan=4 | 48<br />
|-<br />
| Interconnect<br />
| colspan=4 | Mellanox HDR Infiniband<br />
|-<br />
| Local Disk Space<br />
| colspan=4 | per node: two 480 GB SSD drives<br />
|}<br />
<br />
== Mass Storage ==<br />
<br />
5PB (usable) with Lustre provided by DDN<br />
<br />
<br />
== Interconnect ==<br />
<br />
<br />
<br />
== Namesake ==<br />
<br />
"Grace" is named for [https://en.wikipedia.org/wiki/Grace_Hopper Grace Hopper].<br />
<br />
[[Category:Grace]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:tamulauncher&diff=11457SW:tamulauncher2020-09-16T05:30:00Z<p>Pennings: /* Example 2: Running multi-threaded commands */</p>
<hr />
<div><br />
== tamulauncher ==<br />
<br />
'''tamulauncher''' provides a convenient way to run a large number of serial or multithreaded commands without the need to submit individual jobs or a Job array. tamulauncher takes as arguments a text file containing all commands that need to be executed and tamulauncher will execute the commands concurrently. The number of concurrently executed commands depends on the batch requirements. When tamulauncher is run interactively the number of concurrently executed commands is limited to at most 8. tamulauncher is available on terra and ada. There is no need to load any module. tamulauncher has been successfully tested to execute over 100K commands.<br />
<br />
''tamulauncher is preferred over Job Arrays to submit a large number of individual jobs, especially when the run times of the commands are relatively short. It allows for better utilization of the nodes, puts less burden on the batch scheduler, and lessens interference with jobs of other users on the same node.'' <br />
<br />
=== Synopsis ===<br />
<pre><br />
[ NetID@ ~]$ tamulauncher --help<br />
Usage: /sw/local/bin/tamulauncher [options] FILE<br />
<br />
This script will execute commands in FILE concurrently. <br />
<br />
OPTIONS:<br />
<br />
--commands-pernode | -p <n> <br />
Set the number of concurrent processes per node.<br />
<br />
--norestart<br />
Do not restart.<br />
<br />
--status <commands file><br />
Prints number of finished commands and exits. <br />
<br />
--list <commands file><br />
Prints detailed list of all finished commands and exits.<br />
<br />
--remove-logs <commands file><br />
Removes the log directory and exits<br />
<br />
--version | -v<br />
Prints version and exits.<br />
<br />
--help | -h | ?<br />
Shows this message and exits.<br />
</pre><br />
<br />
=== Commands file ===<br />
<br />
The commands file is a regular text file containing all the commands that need to be executed. Every line contains one command. A command can be a user-compiled program, a Linux command, a script (e.g. bash, Python, Perl, etc), a software package, etc. Commands can also be compounded using the Linux semi-colon operator. In general, any command that will work when typed in a bash shell will work when executed using tamulauncher. Below is an example of a commands file; it illustrates that a commands file can contain any combination of commands (although in practice it is mostly a repetition of the same command with varying input parameters). Often, a commands file can be generated automatically.<br />
<br />
<br />
<pre><br />
./prog1 125<br />
./prog2 "aa" 3 <br />
mkdir testcase1 ; cd testcase1; ./myprog<br />
./prog1 100<br />
:<br />
:<br />
:<br />
time ./prog3 <br />
python mypython.py<br />
./prog1 141 ; ./prog4 > OUTPUT<br />
./prog5 < myinput<br />
<br />
</pre><br />
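A commands file like the one above can be generated automatically with a small bash loop. The sketch below writes one command per input value (the program name, arguments, and file names are placeholders; adjust them to your own workload):<br />
<br />
```shell
# Generate one command line per input value into commands.in.
# The ">" inside the quotes is written literally into the file;
# redirection happens later, when tamulauncher runs each line.
for i in $(seq 1 100); do
  echo "./prog1 ${i} > output_${i}.log"
done > commands.in

wc -l < commands.in   # expect 100 lines
```
<br />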
<br />
=== Dynamic release of resources ===<br />
<br />
tamulauncher will automatically release resources whenever they become idle. On ada, resources will be released on a per-core basis. <br />
On terra, resources will be released on per-node basis. This feature is especially useful in cases where the majority of requested cores/nodes<br />
are idle, taking up valuable resources, while only a few cores are processing the last few commands. On ada, please add the following LSF line <br />
to your batch script<br />
<br />
<pre><br />
#BSUB -app resizable<br />
</pre><br />
<br />
Adding this option will enable LSF to dynamically release resources. Your tamulauncher run will still work fine without the above LSF flag, but LSF<br />
will not release idle resources and will write a warning message to your output/error file. <br />
<br />
There are no changes required on terra to enable dynamic release of resources. <br />
<br />
'''NOTE''' On terra you might see some slurm error messages such as "srun: error: <NODE>: task 7: Killed". These messages can be safely ignored. <br />
<br />
'''NOTE:''' this is an experimental feature we are still improving on. For that reason, as well as some needed changes to the calculation of SUs, <br />
the number of SUs charged will not be adjusted at this time. However, it will help to make the cluster less congested.<br />
<br />
== Examples ==<br />
<br />
The following sections describe two simple examples of how to use tamulauncher. The first example shows how to run serial commands <br />
and the second example shows how to run multi-threaded commands.<br />
<br />
===Example 1: Simple tamulauncher run ===<br />
<br />
<pre><br />
#BSUB -L /bin/bash<br />
#BSUB -J demo-tamulauncher<br />
#BSUB -o demo-tamulauncher.%J<br />
#BSUB -W 07:00<br />
#BSUB -n 200<br />
#BSUB -M 150<br />
#BSUB -R 'rusage[mem=150]'<br />
#BSUB -R 'span[ptile=20]'<br />
<br />
# special LSF option to release resources<br />
#BSUB -app resizable<br />
<br />
<br />
tamulauncher commands.in<br />
</pre><br />
<br />
In the above example, tamulauncher will distribute the commands among the 200 requested cores; 200 commands will be executed concurrently and every task will process N/200 commands. On ''terra'' the script will use '''SLURM''' style directives.<br />
<br />
===Example 2: Running multi-threaded commands ===<br />
<br />
'''LSF''' (''ada'') does not provide an easy way to specify requirements for jobs where every task (command) wants to utilize multiple cores (i.e. hybrid jobs). This might be a problem when tamulauncher needs to execute multi-threaded (e.g. OpenMP) commands. For that reason, tamulauncher provides the '''--commands-pernode''' option to explicitly set the number of concurrent commands per node. '''NOTE:''' ''terra'' uses the '''SLURM''' batch scheduler, which provides an easy way to specify requirements for hybrid jobs. Therefore the '''--commands-pernode''' option is mostly used on ''ada''. <br />
<br />
==== ada example ====<br />
<pre><br />
#BSUB -L /bin/bash<br />
#BSUB -J demo-tamulauncher<br />
#BSUB -o demo-tamulauncher.%J<br />
#BSUB -W 07:00<br />
#BSUB -n 200<br />
#BSUB -M 100<br />
#BSUB -R 'rusage[mem=100]'<br />
#BSUB -R 'span[ptile=20]'<br />
<br />
# special LSF option to release resources<br />
#BSUB -app resizable<br />
<br />
export OMP_NUM_THREADS=4<br />
tamulauncher --commands-pernode 5 commands.in<br />
</pre><br />
<br />
In this example, tamulauncher will execute only 5 commands concurrently per node (even though ptile is set to 20). The environment variable '''OMP_NUM_THREADS''' is set to 4 so every command will use 4 cores (threads). The total number of cores used per node is 5*4=20.<br />
<br />
'''NOTE:''' in this case, another option would be to set '''ptile=5''' and include '''#BSUB -x''' to reserve whole nodes.<br />
<br />
==== terra example ====<br />
<br />
<pre><br />
#!/bin/bash<br />
<br />
#SBATCH --export=NONE <br />
#SBATCH --get-user-env=L <br />
<br />
##NECESSARY JOB SPECIFICATIONS<br />
#SBATCH --job-name=demo-tamulauncher<br />
#SBATCH --output=demo-tamulauncher.%j<br />
#SBATCH --time=07:00:00 <br />
#SBATCH --ntasks=70 <br />
#SBATCH --ntasks-per-node=7 <br />
#SBATCH --cpus-per-task=4<br />
#SBATCH --mem=4096M <br />
<br />
<br />
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK<br />
tamulauncher commands.in<br />
</pre><br />
<br />
In the above example, tamulauncher will use all the requirements specified in the SLURM script and execute 7 commands per node. The SLURM '''--cpus-per-task''' option will make sure 4 cores are reserved for every task, so every command can use up to 4 threads. There is no need to specify the tamulauncher '''--commands-pernode''' option in this case.<br />
<br />
== Automatic Restart ==<br />
<br />
tamulauncher keeps track of all commands that have been executed. When you start a tamulauncher job, it will check for a log (located in the ''.tamulauncher-log'' directory) from a previous run; if one exists, it will continue executing the commands that did not finish during the previous run. This is especially useful when a tamulauncher job was killed because it ran out of wall time or there was a system problem. To turn off the automatic restart option, use the '''--norestart''' flag in your tamulauncher command, e.g.<br />
<br />
 tamulauncher --norestart commands.in<br />
<br />
With this option, tamulauncher will wipe all the log files and start as if it were a first run.<br />
<br />
'''NOTE:''' tamulauncher keeps a log for every unique commands file. If you make any changes to the commands file, tamulauncher will assume it's a different commands file and will create a new log directory. This also means multiple tamulauncher runs can be executed in the same directory.<br />
<br />
== Monitoring runs ==<br />
<br />
To see how many commands have been executed use the '''--status''' flag in tamulauncher:<br />
<br />
[ NetID@ ~]$ '''tamulauncher --status <command file>'''<br />
<br />
This will show a one-line summary with the number of commands executed and the total number of commands for the tamulauncher run on ''<command file>''.<br />
<br />
To see a full listing of all finished commands use the '''--list''' flag in tamulauncher:<br />
<br />
[ NetID@ ~]$ '''tamulauncher --list <command file>'''<br />
<br />
This will show a list of all commands that have finished executing, including the index in the commands file, total run time, and exit status for the tamulauncher run on ''<command file>''.<br />
<br />
<br />
== Clearing the log ==<br />
<br />
To clear the log for a particular tamulauncher run, use the '''--remove-logs''' flag.<br />
<br />
[ NetID@ ~]$ '''tamulauncher --remove-logs <command file>'''<br />
<br />
This will clear the logs for the latest tamulauncher run on commands file <command file>. '''NOTE:''' don't clear the logs while tamulauncher is still running on that particular <commands file>.</div>Penningshttps://hprc.tamu.edu/w/index.php?title=Main_Page&diff=11386Main Page2020-09-02T21:51:01Z<p>Pennings: </p>
<hr />
<div>{{WelcomeBanner}}<br />
<!--Main page of Wiki: top level pages for all clusters and HPRC material are shown here.--><br />
<!-- Blank Line --><br />
<!-- Blank Line --><br />
<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title=Announcements }}<br />
<br />
* '''New GPU nodes in the Ada cluster: ''' Four new GPU nodes are now available in the Ada Cluster. Each GPU node has two Intel Skylake Xeon Gold 5118 20-core processors, 192 GB of memory and two NVIDIA 32GB V100 GPUs. To use these new GPU nodes, please submit jobs to the '''v100''' queue on Ada by including the following job directive in your job scripts:<br />
<br />
<br />
}}<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title=Getting Started: Understanding HPC }}<br />
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1; font-size:100%; align=center; margin: auto;"><br />
* New to High Performance Computing (HPC)? [https://hprc.tamu.edu/resources/ This HPC Introduction Page] explains the "why" and "how" of high performance computing. Also see the [https://hprc.tamu.edu/policies Policies Page] to better understand the rules and etiquette of cluster usage. <br />
<br />
</div><br />
}}<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title=Getting Started: Access to the Clusters }}<br />
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1; font-size:100%; align=center; margin: auto;"><br />
* '''Getting an Account: ''' All computer systems managed by the HPRC are available for use to TAMU faculty, staff, and students who require large-scale computing capabilities. The HPRC hosts the [[:Ada | Ada]] and [[:Terra | Terra]] clusters at TAMU. To apply for or renew an HPRC account, please visit the [https://hprc.tamu.edu/apply/ Account Applications] page. For information on how to obtain an allocation to run jobs on one of our clusters, please visit the [https://hprc.tamu.edu/policies/allocations.html Allocations Policy] page. ''All accounts expire and must be renewed in September of each year.''<br />
<br />
</div><br />
}}<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title= Creating Your Own Batch Jobs}}<br />
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1; font-size:100%; align:center; margin: auto;"><br />
* The [[:SW:tamubatch | tamubatch Page]] provides information on how to use tamubatch to create and submit jobs easily.<br />
{{DoubleTextBox<br />
| LeftTitle1 = Ada / LSF Batch Pages<br />
| LeftContent1 =<br />
* [[:Ada:Batch_Processing_LSF | Complete Ada Batch Page]]<br />
* [[:Ada:Batch_Processing_LSF#Job_Submission | Job Submission (bsub)]]<br />
* [[:Ada:Batch_Processing_LSF#Queues | Ada Queue Structure]]<br />
<br />
| RightTitle1= Terra / SLURM Batch Pages<br />
| RightContent1= <br />
* [[:Terra:Batch | Complete Terra Batch Page]]<br />
* [[:Terra:Batch#Job_Submission | Job Submission (sbatch)]]<br />
* [[:Terra:Batch#Queues | Terra Queue Structure]]<br><br />
<br />
| LeftTitle2=<br />
| LeftContent2=<br />
<br />
| RightTitle2=<br />
| RightContent2=<br />
<br />
</div><br />
}}<br />
}}<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title= Troubleshooting}}<br />
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1; font-size:100%; align:center; margin: auto;"><br />
* While we cannot predict all bugs and errors, some issues are extremely common on our clusters. See the [[:HPRC:CommonProblems | Common Problems and Quick Solutions Page]] for a small collection of the most prevalent issues.<br />
<br />
</div><br />
}}<br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title=HPRC's YouTube Channel }}<br />
<div style="column-count:1;-moz-column-count:1;-webkit-column-count:1; font-size:110%; align=center; margin: auto;"><br />
* Prefer visual learning? HPRC has launched its official YouTube channel where you can find video versions of our help guides, recordings of our short courses, and more! '''Subscribe''' [https://www.youtube.com/channel/UCgeDEHE5GwkxYUGS0FDLmPw?disable_polymer=true here.]<br />
<br />
</div><br />
}}<br />
<br />
<!-- Blank Line --><br />
<!-- Blank Line --><br />
<!-- Links: More Information --><br />
<!-- Text Box Template (has buffer around it) --><br />
{{ TextBox <br />
| content = {{TitleBar | align=center | title=Further Reading }}<br />
<div style="column-count:4;-moz-column-count:4;-webkit-column-count:4; font-size:110%;"><br />
* [[:Ada | Ada User Guide]]<br />
* [[:Terra | Terra User Guide]]<br />
* [[:HPRCLab | Workstations]]<br />
* [https://hprc.tamu.edu/resources/ Hardware Overview]<br />
* [[:Ada:Intro | Ada Hardware]]<br />
* [[:Terra:Intro | Terra Hardware]]<br />
* [[:SW:Portal | TAMU OnDemand Portal]]<br />
* [[:SW | Software Overview]]<br />
* [[:SW:Modules | Loading Software]]<br />
* [[:SW:License_Checker | Check Software License Availability]]<br />
* [https://hprc.tamu.edu/ Software Policies]<br />
* [https://hprc.tamu.edu/policies/ Usage Policies]<br />
* [https://hprc.tamu.edu/apply/ Account Application]<br />
* [https://hprc.tamu.edu/ams/ Manage SUs (Transfers)]<br />
* [https://hprc.tamu.edu/about/contact.html Contact Us]<br />
</div><br />
}}<br />
<!--Blank Line--><br />
<!--Logo Banner Template--><br />
{{LogoBanner}}<br />
<br />
<!-- Following line DISABLES automatic table of contents --><br />
<!-- This line is down here to remove blank line up top --><br />
__NOTOC__</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Knitro&diff=11102SW:Knitro2020-05-06T19:19:01Z<p>Pennings: </p>
<hr />
<div>=Knitro=<br />
<br />
__TOC__<br />
==Description==<br />
<br />
The Artelys Knitro Solver is a plug-in Solver Engine that extends Analytic Solver Platform, Risk Solver Platform, Premium Solver Platform or Solver SDK Platform to solve nonlinear optimization problems of virtually unlimited size. The solver has plugins for MATLAB, R, Python, C/C++, and Fortran. <br />
<br />
'''NOTE:''' Knitro is currently only available on the '''terra''' cluster.<br />
<br />
For more information, please visit https://www.solver.com/artelys-knitro-solver-engine<br />
<br />
<br />
== Setting up the Knitro environment ==<br />
<br />
Before using Knitro, you need to set up the environment. To do this, load the Knitro module<br />
<br />
[NetID@terra ~]$ '''module load Knitro/10.3.0'''<br />
<br />
<br />
== Using Knitro ==<br />
<br />
As mentioned, Knitro has bindings for MATLAB, R, Python, C/C++, and Fortran. To use Knitro with any of these, you need to load the appropriate module (e.g. Matlab, R) in addition to the Knitro module<br />
<br />
=== MATLAB ===<br />
<br />
To use Knitro with MATLAB, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Matlab<br />
</pre><br />
<br />
Loading the Knitro module will set the '''$MATLABPATH''' to the directory containing the Knitro mex files and interfaces. After loading the modules you can call the Knitro solver like any regular MATLAB function. The same directory also contains a sample Knitro options file and example MATLAB scripts that use the Knitro solver.<br />
<br />
=== R ===<br />
<br />
To use Knitro with R, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load R<br />
</pre><br />
<br />
'''NOTE:''' Instead of loading the default '''R''' module, it is recommended to load a specific R version (e.g. R/3.3.2-iomkl-2017A-Python-2.7.12-default-mt). Alternatively, you can also load the R_tamu module. Knitro will work with any R or R_tamu version.<br />
<br />
The Knitro module will append the Knitro package directory to the R environment variable '''$R_LIBS_USER'''. Loading the Knitro package can be done using the R '''library()''' function<br />
<br />
<pre><br />
> library('KnitroR')<br />
</pre><br />
<br />
For some example R scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/R''' <br />
<br />
=== Python ===<br />
<br />
To use Knitro with Python, load both modules first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Python/3.5.2-intel-2017A<br />
</pre><br />
<br />
'''NOTE:''' The Python version used above is just an example. Knitro will work with any Python version (2.7.xx as well as 3.xx) and toolchain combination. <br />
<br />
The Knitro module will set the environment variable '''PYTHONPATH''' to include the Knitro Python directory. The Knitro solver can be accessed from any Python script. For some example Python scripts that use the Knitro solver, see directory '''${KNITROEXAMPLES}/Python'''<br />
<br />
=== C/C++ and Fortran ===<br />
<br />
To use Knitro inside your own C/C++, or Fortran code, load Knitro and the preferred toolchain first:<br />
<br />
<pre><br />
module load Knitro/10.3.0<br />
module load Intel/2017A<br />
</pre><br />
<br />
'''NOTE:''' The toolchain used above is just an example. Knitro will work with any toolchain (e.g. Intel, GNU). <br />
<br />
The Knitro module will set the environment variables '''CPATH''' for the include files, '''LIBRARY_PATH''' for the compile-time library paths, and '''LD_LIBRARY_PATH''' for the runtime library paths. <br />
<br />
<br />
Directories '''${KNITROEXAMPLES}/C''', '''${KNITROEXAMPLES}/C++''', and '''${KNITROEXAMPLES}/Fortran''' contain example programs for C,C++, and Fortran respectively. The directories also contain a Makefile, explaining how to compile the examples.<br />
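Because '''CPATH''', '''LIBRARY_PATH''', and '''LD_LIBRARY_PATH''' are already set by the module, compiling against Knitro should not require explicit -I or -L flags. A hedged sketch (the source file name and the '''-lknitro''' library name are assumptions here; check the Makefiles in '''${KNITROEXAMPLES}''' for the exact link line):<br />
<br />
```shell
module load Knitro/10.3.0 Intel/2017A
icc myprog.c -o myprog -lknitro
```
<br />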
<br />
== Acknowledgement ==<br />
<br />
The license for Knitro has been purchased by the Department of Economics. We are thankful for their contribution and for allowing access to all HPRC users.<br />
<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=HPRC:myproject&diff=11091HPRC:myproject2020-05-04T22:29:23Z<p>Pennings: </p>
<hr />
<div>==myproject command==<br />
'''For convenience, this topic has been summarized in a short video lesson, which can viewed [https://www.youtube.com/watch?v=dg3KaUvIWcU&list=PLHR4HLly3i4YrkNWcUE77t8i-AkwN5AN8&index=7&t=0s here].'''<br />
<br />
On both clusters, the "myproject" command is the tool that allows you to 1) check allocations (default and project), 2) set the default allocation account, and 3) show all past completed jobs.<br />
<pre><br />
myproject -h<br />
-d accountNo: Set accountNo as the default project account<br />
-e YYYY-MM-DD: Specify end date (00:00 AM) for job records<br />
(-j option must be specified.)<br />
-h: Display this help and exit.<br />
-j accountNo (or 'all'): Produce the job records in current fiscal year for <br />
the selected project account or all project accounts when 'all' is given.<br />
-l: List all the active local project accounts.<br />
-p accountNo: List uncomplted jobs for accountNo.<br />
-s YYYY-MM-DD: Specify start date (00:00 AM) for job recrods.<br />
(-j option must be specified.)<br />
-c: Display job records with line length <= 80 chars.<br />
(-j option must be specified.)<br />
</pre><br />
<br />
Examples below show the most common usage of "myproject" command:<br />
<br />
===Set Default Allocation===<br />
"myproject -d xxxxxxx" sets the project account xxxxxxx to be the default account to be charged when jobs are submitted by the user. The default project account will be overridden by "#SBATCH -A yyyyyy"(for terra) and "#BSUB -P yyyyyy"(for ada) in a job script for a particular job.<br />
<br />
[user@cluster ~]$ '''myproject -d 122858321140'''<br />
Your default project account now is 122858321140.<br />
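To override the default for a single job on terra, the job script would include a line like the following (the account number is a placeholder):<br />
<br />
```
#SBATCH -A 1228000223136   # hypothetical account number charged instead of the default
```
<br />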
<br />
===Check Allocation===<br />
The "-l" option list all projects for the user on clusters.<br />
<br />
[user@cluster ~]$ '''myproject -l'''<br />
==========================================================================================<br />
List of user's Project Accounts<br />
------------------------------------------------------------------------------------------<br />
| Account | Default | Allocation |Used & Pending SUs| Balance | PI |<br />
------------------------------------------------------------------------------------------<br />
|1228000223136| N| 10000.00| 0.00| 10000.00|Doe, John |<br />
------------------------------------------------------------------------------------------<br />
|1428000243716| Y| 5000.00| -71.06| 4928.94|Doe, Jane |<br />
------------------------------------------------------------------------------------------<br />
|1258000247058| N| 5000.00| -0.91| 4999.09|Doe, Jane |<br />
------------------------------------------------------------------------------------------<br />
<br />
===List Completed Jobs===<br />
The "-j" option list jobs completed on cluster. If the extra "-s" and "-e" option are not specified, all jobs in current fiscal year (starting from 9/1 to 8/31 next year) are listed.<br />
<br />
[user@cluster ~]$ '''myproject -j 122858329759'''<br />
------------------------------------------------------------ List of Jobs --------------------------------------------------------------------- <br />
|ProjectAccount| JobID |JobArrayIndex| SubmitTime | StartTime | EndTime | Walltime | TotalSlots | UsedSUs |<br />
----------------------------------------------------------------------------------------------------------------------------------------------- <br />
|122858329759 |1739566 |0 |2015-09-14 15:18:16 |2015-09-14 15:18:16 |2015-09-14 15:18:27 | 11| 1| 0.00|<br />
|122858329759 |1739567 |0 |2015-09-14 15:18:32 |2015-09-14 15:18:33 |2015-09-14 15:28:41 | 608| 1| 0.17|<br />
|122858329759 |1739568 |0 |2015-09-14 15:18:42 |2015-09-14 15:18:42 |2015-09-14 15:19:09 | 27| 1| 0.01|<br />
|122858329759 |1739569 |0 |2015-09-14 15:18:43 |2015-09-14 15:18:44 |2015-09-14 15:19:15 | 31| 1| 0.01|<br />
|122858329759 |1739570 |0 |2015-09-14 15:18:44 |2015-09-14 15:18:45 |2015-09-14 15:18:55 | 10| 1| 0.00|<br />
|122858329759 |1739578 |0 |2015-09-14 15:20:31 |2015-09-14 15:20:33 |2015-09-14 15:20:40 | 7| 1| 0.00|<br />
|122858329759 |1739579 |0 |2015-09-14 15:20:42 |2015-09-14 15:20:43 |2015-09-14 15:22:01 | 78| 1| 0.02|<br />
|122858329759 |1739580 |0 |2015-09-14 15:20:46 |2015-09-14 15:20:47 |2015-09-14 15:20:54 | 7| 1| 0.00|<br />
|122858329759 |1739581 |0 |2015-09-14 15:20:47 |2015-09-14 15:20:48 |2015-09-14 15:20:55 | 7| 1| 0.00|<br />
|122858329759 |1739582 |0 |2015-09-14 15:20:48 |2015-09-14 15:20:49 |2015-09-14 15:20:57 | 8| 1| 0.00|<br />
|122858329759 |2259750 |0 |2016-03-07 12:46:13 |2016-03-07 12:46:15 |2016-03-07 13:28:55 | 2560| 1| 0.71|<br />
|122858329759 |2259751 |0 |2016-03-07 12:46:15 |2016-03-07 12:46:16 |2016-03-07 13:28:54 | 2558| 1| 0.71|<br />
|122858329759 |2259752 |0 |2016-03-07 12:46:17 |2016-03-07 12:46:18 |2016-03-07 12:56:01 | 583| 1| 0.16|<br />
|122858329759 |2259753 |0 |2016-03-07 12:46:19 |2016-03-07 12:46:20 |2016-03-07 12:55:59 | 579| 1| 0.16|<br />
|122858329759 |2259754 |0 |2016-03-07 12:46:21 |2016-03-07 12:46:22 |2016-03-07 12:49:27 | 185| 1| 0.05|<br />
|122858329759 |2259755 |0 |2016-03-07 12:46:24 |2016-03-07 12:46:25 |2016-03-07 12:49:27 | 182| 1| 0.05|<br />
|122858329759 |2259756 |0 |2016-03-07 12:46:26 |2016-03-07 12:46:26 |2016-03-07 12:47:38 | 72| 1| 0.02|<br />
|122858329759 |2259757 |0 |2016-03-07 12:46:28 |2016-03-07 12:46:28 |2016-03-07 12:47:38 | 70| 1| 0.02|<br />
|122858329759 |2261528 |0 |2016-03-08 11:05:32 |2016-03-08 11:05:33 |2016-03-08 11:05:36 | 3| 1| 0.00|<br />
|122858329759 |2261530 |0 |2016-03-08 11:06:08 |2016-03-08 11:06:09 |2016-03-08 11:06:14 | 5| 1| 0.00|<br />
|122858329759 |2261547 |0 |2016-03-08 11:07:34 |2016-03-08 11:07:35 |2016-03-08 11:08:15 | 40| 1| 0.01|<br />
|122858329759 |2261550 |0 |2016-03-08 11:07:55 |2016-03-08 11:07:56 |2016-03-08 20:08:03 | 32407| 1| 9.00|<br />
|122858329759 |2261694 |0 |2016-03-08 11:59:49 |2016-03-08 11:59:50 |2016-03-08 12:00:17 | 27| 1| 0.01|<br />
------------------------------------------------------------------------------------------------------------------------------------------------ <br />
|Total Jobs: 1021 |Total Usage: 1220.41 |<br />
------------------------------------------------------------------------------------------------------------------------------------------------<br />
<br />
===List Incomplete Jobs===<br />
The "-p" option lists the pending (incomplete) jobs pre-charged to the project specified for this option. The pre-charged SUs listed in the output will be removed after the jobs complete, and each completed job will then be charged its actual SU usage.<br />
<br />
[user@cluster ~]$ '''myproject -p 122858329759'''<br />
--------------------------------------------------------------------<br />
| Job Id |State|#Cores|#EffectiveCores|Walltime(H)|Pending SUs|<br />
--------------------------------------------------------------------<br />
|7841656 | RUN| 1| 1| 1.00| 1.00|<br />
--------------------------------------------------------------------<br />
|Total Jobs: 1 |Total Pending SUs: 1.00 |<br />
--------------------------------------------------------------------<br />
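<br />
The pre-charge shown above can be reproduced by hand. Assuming SUs are charged as effective cores times wall time (consistent with the columns in the table), the single pending job is pre-charged:<br />
<pre><br />
Pending SUs = #EffectiveCores x Walltime(H)<br />
            = 1 x 1.00<br />
            = 1.00<br />
</pre><br />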
<br />
<br />
[[Category:HPRC]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10919SW:Matlab2020-02-25T04:58:47Z<p>Pennings: /* Retrieving fully populated Cluster Profile Object */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will set up the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start Matlab, use the following command:<br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded, using as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. The speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
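<br />
To verify the setting, you can use Matlab's documented '''maxNumCompThreads''' function, which reports and controls the same value:<br />
<pre><br />
>> maxNumCompThreads        % report the current number of computational threads<br />
>> maxNumCompThreads(4);    % documented equivalent of feature('NumThreads',4)<br />
</pre><br />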
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores).<br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general introduction to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster Profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which we will discuss in more detail in the next section.<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab stores meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2019b it will be ''/scratch/$USER/MatlabJobs/TAMU2019b''.<br />
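<br />
To confirm the import succeeded, you can list the profiles Matlab knows about with the built-in [https://www.mathworks.com/help/parallel-computing/parallel.clusterprofiles.html parallel.clusterProfiles] function (the profile name shown is an example; the exact name may differ by Matlab version):<br />
<pre><br />
>> parallel.clusterProfiles()   % lists imported profiles, e.g. 'local' and 'TAMU2019b'<br />
</pre><br />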
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
'''NOTE:''' For Matlab versions before R2019b, use the following function<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
In this case, Matlab will store meta-information in directory ''/scratch/$USER/MatlabJobs/TAMU''<br />
<br />
=== Getting Cluster Profile Object ===<br />
<br />
To retrieve a fully populated cluster object (i.e. with attached resource information), HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
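<br />
As an illustrative sketch, the populated cluster object can be used anywhere Matlab expects a cluster object, for example with the built-in [https://www.mathworks.com/help/distcomp/batch.html batch] function (''myscript.m'' is a hypothetical script name):<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
>> myjob=batch(clusterObject,'myscript');   % run myscript.m on the cluster<br />
</pre><br />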
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested.<br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features ('''parfor''' and '''spmd''', discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool of 4 workers using the default cluster profile.<br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks will be destroyed once the block is finished.<br />
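<br />
As a minimal sketch of how a pool is typically used (here with the default profile; the parfor body runs on the workers, everything else on the client):<br />
<pre><br />
mypool = parpool(4);            % start a pool of 4 workers<br />
results = zeros(1,8);<br />
parfor i = 1:8<br />
    results(i) = max(abs(eig(rand(200))));   % executed on the workers<br />
end<br />
delete(mypool)                  % results remains available on the client<br />
</pre><br />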
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers.<br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable carr to the GPU under the name garr.<br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace. The core GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
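<br />
A fuller sketch that times the same multiplication on both the CPU and the GPU might look as follows (timings depend entirely on hardware; '''wait(gpuDevice)''' is needed because GPU operations execute asynchronously):<br />
<pre><br />
n = 1000;<br />
a = rand(n);                       % data in the client workspace<br />
<br />
tic; b = a*a; tcpu = toc;          % multiply on the CPU<br />
<br />
ag = gpuArray(a);                  % copy the data to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;   % multiply on the GPU<br />
c = gather(bg);                    % copy the result back<br />
<br />
fprintf('CPU: %.4f s  GPU: %.4f s\n', tcpu, tgpu);<br />
</pre><br />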
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]].<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, the remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App, you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', want to use 4 workers, and estimate it will need less than 7 hours of computing time:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''', you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
 >> doc TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
   j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
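<br />
As an illustrative sketch, the standard Matlab Parallel Computing Toolbox methods on '''Job''' objects can be used on the returned handle (exact behavior may vary with Matlab version):<br />
<pre><br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
>> wait(myjob);     % block until the job has finished<br />
>> diary(myjob)     % display the command-line output of the script<br />
>> load(myjob);     % load the script's workspace variables into the client<br />
</pre><br />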
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10918SW:Matlab2020-02-25T04:45:01Z<p>Pennings: /* Importing Cluster Profile */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distribued over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and makes sure Matlab will use 4 computational threads for its run ( '''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC cluster. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object which we will discuss in more detail in the next section<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done using by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION'', where <VERSION> represents the Matlab version. For example, for Matlab R2019b it will be ''/scratch/$USER/MatlabJobs/TAMU2019b''<br />
<br />
<!-- <br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
--><br />
<br />
'''NOTE:''' For Matlab versions before R2019b, use the following function<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
In this case, Matlab will store meta-information in directory ''/scratch/$USER/MatlabJobs/TAMU''<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClustrerProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' functions enables the full functionality of the parallel language features (parfor and spmd, will be discussed below). A parpool creates a special job on a pool of workers, and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool 4<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the default cluster profile, with 4 additional workers. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the matlabpool block will be destroyed once the block is finished.<br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the gpu is actually very straightforward. Matlab provides GPU versions for many build-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU. The results of these operations will also reside on the GPU. To see what functions can be run on the GPU type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead of executing code on the gpu because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This functions shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU wit name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client, a matrix multiplication on the GPU, and prints out elapsed times for both. The actual cpu-gpu matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''', you can use the Matlab '''help''' or '''doc''' functions. For example:<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
 j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for details on how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10917SW:Matlab2020-02-25T04:43:35Z<p>Pennings: /* Importing Cluster Profile */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will set up the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start Matlab, use the following command:<br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors and is not guaranteed in all cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script or start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node, and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall time to 7 hours and ensure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores).<br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel Toolbox on the HPRC clusters. For a general introduction to the Parallel Toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox] section on the Mathworks website. Here we will discuss how to use Matlab Cluster Profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object, which we will discuss in more detail in the next section.<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the node the Matlab client is running on.<br />
* cluster profiles: parallel processing can span multiple nodes; the profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION>'', where <VERSION> represents the Matlab version. For example, for Matlab R2019b it will be ''/scratch/$USER/MatlabJobs/TAMU2019b''.<br />
<br />
<br />
'''NOTE:''' function '''tamuprofile.importProfile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
'''NOTE:''' For Matlab versions before R2019b, use the following function<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
In this case, Matlab will store meta-information in directory ''/scratch/$USER/MatlabJobs/TAMU''<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To obtain a fully populated cluster object (i.e. one with the resource information attached), HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
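As an illustration, the populated cluster object can be passed to standard Parallel Computing Toolbox functions. The following is a minimal sketch, assuming the returned object behaves like a regular ''parcluster'' object:<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
>> j=batch(clusterObject,'mysimulation');  % submit script mysimulation.m as a batch job<br />
>> wait(j);                                % block until the job finishes<br />
</pre><br />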
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested.<br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (''parfor'' and ''spmd'', discussed below). A parpool creates a special job on a pool of workers and connects the pool to the Matlab client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool with 4 workers using the default cluster profile.<br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the parpool block will be destroyed once the block is finished.<br />
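To illustrate, here is a minimal ''parfor'' example using a pool of 4 workers. This is a sketch using standard Matlab syntax, not HPRC-specific code:<br />
<pre><br />
mypool = parpool(4);<br />
n = 100;<br />
s = zeros(1,n);<br />
parfor i = 1:n<br />
    s(i) = i^2;     % iterations are distributed over the 4 workers<br />
end<br />
total = sum(s);     % results are available back on the client<br />
delete(mypool)<br />
</pre><br />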
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see what functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (discussed later).<br />
<br />
NOTE: There is significant overhead in executing code on the GPU because of memory transfers.<br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When it is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
This will copy variable ''carr'' to the GPU under the name ''garr''.<br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client, a matrix multiplication on the GPU, and prints the elapsed times for both. The actual CPU/GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
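The timed comparison described above can be sketched with ''tic''/''toc'' as follows. The variable names are illustrative, actual timings depend on hardware, and the ''gather'' call forces the GPU computation to finish before ''toc'' is read:<br />
<pre><br />
a = rand(1000);                    % matrix in the client workspace<br />
tic; b = a*a; tcpu = toc;          % multiplication on the client (CPU)<br />
<br />
ag = gpuArray.rand(1000);          % matrix created directly on the GPU<br />
tic; bg = ag*ag; c = gather(bg); tgpu = toc;   % multiplication on the GPU<br />
<br />
fprintf('CPU: %f seconds, GPU: %f seconds\n', tcpu, tgpu);<br />
</pre><br />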
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab scripts remotely, see the [[SW:Matlab_app | Matlab app]] page.<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, the remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'' that you want to run with 4 workers, and you estimate it will need less than 7 hours of computing time:<br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''', you can use the Matlab '''help''' or '''doc''' functions. For example:<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
 j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for details on how to get results and information from the submitted job.<br />
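Since '''tamu_run_batch''' returns a Matlab ''Job'' object, the usual job methods apply. The following short sketch assumes the returned object supports the standard batch-job methods; see the section mentioned above for full details:<br />
<pre><br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
>> wait(myjob);     % block until the job finishes<br />
>> diary(myjob)     % display the command-line output captured by the job<br />
>> load(myjob);     % load the script's workspace variables into the client<br />
</pre><br />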
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10916SW:Matlab2020-02-25T04:35:31Z<p>Pennings: /* Using Matlab Parallel Toolbox on HPRC Resources */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distribued over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and makes sure Matlab will use 4 computational threads for its run ( '''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC cluster. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Matlab Cluster profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the discussion below is the '''TAMUClusterProperties''' object which we will discuss in more detail in the next section<br />
<br />
<br />
== Cluster Profiles ==<br />
Matlab Cluster Profiles provide an interface to define properties of how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with a batch scheduler (e.g. SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. Using the profile, you can define how many workers you want, how you want to distribute the workers over the nodes, how many computational threads to use, how long to run, etc. Before you can use this profile you need to import it first. This can be done using by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamuprofile.importProfile()<br />
</pre><br />
<br />
<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta-information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU<VERSION'', where <VERSIOM> represents the Matlab version. For example, for Matlab R2019b it will be ''/scratch/$USER/MatlabJobs/TAMU2019b''<br />
<br />
'''NOTE:''' function '''tamuprofile.clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClustrerProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' functions enables the full functionality of the parallel language features (parfor and spmd, will be discussed below). A parpool creates a special job on a pool of workers, and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool 4<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the default cluster profile, with 4 additional workers. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside the matlabpool block will be destroyed once the block is finished.<br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the gpu is actually very straightforward. Matlab provides GPU versions for many build-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU. The results of these operations will also reside on the GPU. To see what functions can be run on the GPU type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead of executing code on the gpu because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This functions shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU wit name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client, a matrix multiplication on the GPU, and prints out elapsed times for both. The actual cpu-gpu matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions: E.g.<br />
<br />
>> help TAMUClusterProperties/doc <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for details on how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10915SW:Matlab2020-02-25T03:35:23Z<p>Pennings: /* Using Matlab Parallel Toolbox on HPRC Resources */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors and in certain cases may be negligible. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
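<br />
Equivalently, you can use Matlab's documented '''maxNumCompThreads''' function, which sets the number of computational threads and returns the previous value:<br />
<pre><br />
>> lastN = maxNumCompThreads(4); % set thread count to 4; returns the previous value<br />
>> maxNumCompThreads           % query the current number of computational threads<br />
</pre><br />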
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will focus on utilizing the Parallel toolbox on HPRC clusters. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. Here we will discuss how to use Cluster profiles to distribute workers over multiple nodes.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' object which we will discuss in more detail in the next section<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Matlab uses Cluster Profiles to define how and where to start Matlab workers. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. You can use this profile to define how many workers you want and how you want to distribute the workers over the nodes. Before you can use this profile you need to import it first. This can be done by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' ( ''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs)<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton. It doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
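<br />
As a sketch (assuming '''tamu_parpool''' accepts a '''TAMUClusterProperties''' object and, like Matlab's built-in ''parpool'', returns a pool handle; the exact signature may differ):<br />
<pre><br />
>> tp = TAMUClusterProperties();<br />
>> tp.workers(4);             % request 4 workers<br />
>> mypool = tamu_parpool(tp); % start the pool on the requested resources<br />
>> % ... run parallel code (parfor/spmd) ...<br />
>> delete(mypool);            % shut the pool down when finished<br />
</pre><br />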
<br />
The '''parpool''' function enables the full functionality of the parallel language features (''parfor'' and ''spmd'', discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a worker pool using the default cluster profile, with 4 workers. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared on the workers will be destroyed once the pool is deleted.<br />
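<br />
For illustration, a minimal ''parfor'' loop using standard Parallel Computing Toolbox syntax; the iterations are divided among the pool workers while the reduction runs on the client:<br />
<pre><br />
parpool(4);                % start a pool with 4 workers<br />
s = zeros(1,100);<br />
parfor i = 1:100<br />
    s(i) = i^2;            % iterations are distributed over the workers<br />
end<br />
total = sum(s);            % executed on the client<br />
delete(gcp('nocreate'));   % close the pool<br />
</pre><br />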
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions for many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU. The results of these operations will also reside on the GPU. To see which functions can be run on the GPU type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead when executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy the variable ''carr'' to the GPU under the name ''garr''. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the client and a matrix multiplication on the GPU, and prints out elapsed times for both. The GPU part of the matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
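<br />
The timing comparison itself can be sketched as follows; note that GPU operations execute asynchronously, so ''wait(gpuDevice)'' is needed to make sure the GPU work has finished before '''toc''' is read:<br />
<pre><br />
a = rand(1000);                                % matrix in the client workspace<br />
tic; b = a*a; tCPU = toc;                      % multiplication on the client<br />
<br />
ag = gpuArray.rand(1000);                      % matrix created directly on the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tGPU = toc;  % multiplication on the GPU<br />
c = gather(bg);                                % copy the result back to the client<br />
fprintf('client: %f s, gpu: %f s\n', tCPU, tGPU);<br />
</pre><br />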
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, the remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this.<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', want to use 4 workers, and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions. E.g.<br />
<br />
 >> help TAMUClusterProperties <br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for details on how to get results and information from the submitted job.<br />
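<br />
Since '''tamu_run_batch''' returns a standard Matlab ''Job'' object, the usual ''Job'' methods should apply. A sketch:<br />
<pre><br />
>> myjob = tamu_run_batch(tp,'mysimulation.m');<br />
>> wait(myjob);   % block until the job has finished<br />
>> diary(myjob);  % display the captured screen output<br />
>> load(myjob);   % load the job's workspace variables into the client<br />
</pre><br />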
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab_matlabsubmit&diff=10914SW:Matlab matlabsubmit2020-02-25T03:01:54Z<p>Pennings: /* Example 3: Utilizing Matlab workers (multi node) */</p>
<hr />
<div> <br />
= Example 1: basic use =<br />
<br />
The following example shows the simplest use of matlabsubmit. It will execute the Matlab script ''test.m'' using default values for batch resources and Matlab resources. matlabsubmit will also print some useful information to the screen. As can be seen in the example, it will show the Matlab resources requested (e.g. #threads, #workers), the submit command that will be used to submit the job, the batch scheduler JobID, and the location of output generated by Matlab and the batch scheduler.<br />
<br />
<pre><br />
-bash-4.1$ matlabsubmit test.m<br />
<br />
===============================================<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 0<br />
Nodes : 1<br />
#threads : 8<br />
===============================================<br />
<br />
sbatch -o MatlabSubmitLOG1/slurm.out --job-name=matlabsubmitID-1_test --ntasks=1 --ntasks-per-node=1 --cpus-per-task=8 --mem=16000 --time=02:00:00 MatlabSubmitLOG1/submission_script<br />
Submitted batch job 3759554<br />
(from job_submit) your job is charged as below<br />
Project Account: 122839393845<br />
Account Balance: 17504.211670<br />
Requested SUs: 16<br />
<br />
-----------------------------------------------<br />
matlabsubmit ID : 1<br />
Matlab output file : MatlabSubmitLOG1/matlab.log<br />
-bash-4.1$<br />
</pre><br />
<br />
<br />
The Matlab script ''test.m'' has to be in the current directory. Control will be returned immediately after executing the matlabsubmit command. To check the run status or kill a job, use the respective batch scheduler commands (e.g. '''squeue''' and '''scancel'''). matlabsubmit will create a subdirectory named '''MatlabSubmitLOG<N>''' (where '''N''' is the matlabsubmit ID). In this directory matlabsubmit will store all its relevant files: the generated batch script, the Matlab driver, and redirected output and error. A listing of this directory will show the following files: <br />
<br />
* '''slurm.err''' redirected error <br />
* '''slurm.out''' redirected output (both Slurm and Matlab)<br />
* '''matlab.log''' redirected Matlab screen output<br />
* '''matlabsubmit_wrapper.m''' Matlab code that sets #threads and calls user function<br />
* '''submission_script''' the generated Slurm batch script<br />
* '''batch_job_id-<SLURM_JOBID>''' dummy file to show the actual Slurm job id this matlabsubmit job is associated with.<br />
<br />
= Example 2: Utilizing Matlab workers (single node) =<br />
<br />
To utilize additional workers used by Matlab's parallel features such as ''parfor'', ''spmd'', and ''distributed'', '''matlabsubmit''' provides an option to specify the number of workers. This is done using the ''-w <N>'' flag (where <N> represents the number of workers). The following example shows a simple case of using 8 workers.<br />
<br />
<pre><br />
-bash-4.1$ matlabsubmit -w 8 test.m<br />
<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 8<br />
Nodes : 1<br />
#threads : 1<br />
===============================================<br />
<br />
sbatch -o MatlabSubmitLOG2/slurm.out --job-name=matlabsubmitID-2_test --ntasks=9 --ntasks-per-node=9 --cpus-per-task=1 --mem=16000 --time=02:00:00 MatlabSubmitLOG2/submission_script<br />
Submitted batch job 3759571<br />
(from job_submit) your job is charged as below<br />
Project Account: 122839393845<br />
Account Balance: 17504.211670<br />
Requested SUs: 18<br />
<br />
-----------------------------------------------<br />
matlabsubmit ID : 2<br />
matlab output file : MatlabSubmitLOG2/matlab.log<br />
<br />
</pre><br />
<br />
In this example, '''matlabsubmit''' will first execute Matlab code to create a ''parpool'' with 8 workers (using the local profile). As can be seen in the output, in this case, '''matlabsubmit''' requests 9 cores: 1 core for the client and 8 cores for the workers. The only exception is when the user requests all the workers on a node ( 28 on terra). In that case, matlabsubmit will request all cores (instead of one extra for the client).<br />
<br />
= Example 3: Utilizing Matlab workers (multi node) =<br />
<br />
'''matlabsubmit''' provides excellent options for Matlab runs that need more workers than fit on a single node and/or when the Matlab workers need to be distributed among multiple nodes. Some examples of when to distribute workers among multiple nodes include:<br />
* hybrid runs: where every worker uses multiple computational threads<br />
* memory requirements: where the job needs more memory than fits on a single node. <br />
* GPU needs: where every worker needs to utilize the GPU on a node<br />
<br />
The following example shows how to run a Matlab simulation that utilizes 8 workers, where every node will run 4 workers (i.e. the workers will be distributed among 8/4 = 2 nodes).<br />
<pre><br />
-bash-4.1$ matlabsubmit -w 8 -p 4 test.m<br />
===============================================<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 8<br />
Nodes : 2<br />
#threads : 1<br />
===============================================<br />
<br />
... starting matlab batch. This might take some time. See MatlabSubmitLOG5/matlab-batch-commands.log<br />
...Starting Matlab from host: tlogin-0502.cluster<br />
MATLAB is selecting SOFTWARE OPENGL rendering.<br />
<br />
< M A T L A B (R) ><br />
Copyright 1984-2019 The MathWorks, Inc.<br />
R2019b Update 3 (9.7.0.1261785) 64-bit (glnxa64)<br />
November 27, 2019<br />
<br />
<br />
To get started, type doc.<br />
For product information, visit www.mathworks.com.<br />
<br />
... Interactive Matlab session, multi threading reduced to 8<br />
<br />
<br />
submitstring =<br />
<br />
'--cpus-per-task=1 --ntasks=9 --ntasks-per-node=4 --mem=5334 --time=2:00:00 '<br />
<br />
-bash-4.1$<br />
<br />
</pre><br />
<br />
As can be seen, the output is very different from the previous examples (some of the output has been removed for readability). When a job uses multiple nodes the approach '''matlabsubmit''' uses is a bit different. '''matlabsubmit''' will start a regular ''interactive'' Matlab session and from within it will run the Matlab ''batch'' command using a specialized cluster profile. It will then exit Matlab while the Matlab script is executed on the compute nodes. <br> <br />
<br><br />
<br />
The contents of the MatlabSubmitLOG directory are also slightly different. A listing will show the following files:<br />
<br />
* '''matlab-batch-commands.log''' screen output from Matlab <br />
* '''matlabsubmit_driver.m''' Matlab code that sets up the cluster profile and calls Matlab ''batch'' <br />
* '''matlabsubmit_wrapper.m''' Matlab code that sets #threads and calls user function<br />
* '''submission_script''' The actual command to start Matlab<br />
<br />
In addition to the MatlabSubmitLOG directory created by '''matlabsubmit''', Matlab will also create a directory named '''Job<N>''' used by the cluster profile to store metadata, log files, and screen output. The '''*.diary.txt''' text files will show screen output for the client and all the workers. All the Job<N> directories can be found in your $SCRATCH/MatlabJobs/<PROFILE> directory where <PROFILE> is the name of the used cluster profile. For R2019b it will be ''$SCRATCH/MatlabJobs/TAMU2019b''. Note: for older versions, the profile is always ''TAMU''</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10913SW:Matlab2020-02-24T23:25:55Z<p>Pennings: /* Using Matlab Parallel Toolbox on HPRC Resources */</p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors and in certain cases may be negligible. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
-s set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section we will focus on utilizing the Parallel toolbox on HPRC clusters. For a general intro to the Parallel Toolbox see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav parallel toolbox ] section on the Mathworks website. We will give a brief introduction to Matlab Cluster profiles, parallel pools, the parallel constructs ''parfor'' and ''spmd'', and how to utilize GPUs using Matlab.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' class introduced in the ''Submit Matlab Scripts Remotely or Locally From the Matlab Command Line'' section above.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Cluster profiles define where and how you want to do the parallel processing. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. You can use this profile to define how many workers you want and how you want to distribute the workers over the nodes. Before you can use this profile you need to import it first. This can be done by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' ( ''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs)<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton. It doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (''parfor'' and ''spmd'', discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside ''parfor'' and ''spmd'' blocks will be destroyed once the block is finished.<br />
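<br />
As an illustration, assuming a pool is already running (the names and sizes below are arbitrary), a ''parfor'' loop distributes independent iterations over the workers, while an ''spmd'' block runs the same code on every worker:<br />
<pre><br />
% Sketch: parfor and spmd usage with a pool of 4 workers<br />
mypool = parpool(4);<br />
<br />
% parfor: iterations are independent and distributed over the workers<br />
sq = zeros(1,100);<br />
parfor i = 1:100<br />
    sq(i) = i^2;<br />
end<br />
<br />
% spmd: every worker executes the same block; labindex identifies the worker<br />
spmd<br />
    fprintf('worker %d of %d\n', labindex, numlabs);<br />
end<br />
<br />
delete(mypool);<br />
</pre><br />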
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is straightforward: Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and their results also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead to executing code on the GPU because of memory transfers between the client and the GPU. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When it is called from the client (or a node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
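<br />
To compare elapsed times on the client and on the GPU, one possible sketch is shown below (timings vary with hardware; ''gputimeit'' and ''timeit'' give more reliable measurements than ''tic''/''toc''):<br />
<pre><br />
% Sketch: time the same matrix multiplication on the client and on the GPU<br />
a = rand(1000);                          % created in the client workspace<br />
tic; b = a*a; tcpu = toc;                % multiply on the client<br />
<br />
ag = gpuArray(a);                        % copy the matrix to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;   % wait for the GPU to finish<br />
<br />
fprintf('client: %f s, GPU: %f s\n', tcpu, tgpu);<br />
</pre><br />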
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
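<br />
Since '''tamu_run_batch''' returns a regular Matlab '''Job''' object, results can (assuming the standard batch-job interface) be retrieved along the following lines:<br />
<pre><br />
% Sketch, assuming myjob was returned by tamu_run_batch as shown above<br />
wait(myjob);     % block until the job has finished<br />
diary(myjob);    % display the captured command-line output<br />
load(myjob);     % load the job's workspace variables into the client<br />
delete(myjob);   % remove the job's data when no longer needed<br />
</pre><br />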
<br />
<br />
[[Category:Software]]</div>
Pennings: https://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10912 SW:Matlab 2020-02-24T23:24:16Z
<p>Pennings: </p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will set up the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
  -s     set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section we will focus on utilizing the Parallel Toolbox on HPRC clusters. For a general introduction to the toolbox, see the [https://www.mathworks.com/help/parallel-computing/index.html?s_tid=CRUX_lftnav Parallel Computing Toolbox] section on the Mathworks website. We will give a brief introduction to Matlab cluster profiles, parallel pools, the parallel constructs ''parfor'' and ''spmd'', and how to utilize GPUs from Matlab.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' class introduced in the ''Submit Matlab Scripts Remotely or Locally From the Matlab Command Line'' section above.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Cluster profiles define properties on where and how you want to do the parallel processing. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC has already created a custom Cluster Profile. You can use this profile to define how many workers you want and how to distribute them over the nodes. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' ( ''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs)<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton. It doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (''parfor'' and ''spmd'', discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4);<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside ''parfor'' and ''spmd'' blocks will be destroyed once the block is finished.<br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is straightforward: Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and their results also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead to executing code on the GPU because of memory transfers between the client and the GPU. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When it is called from the client (or a node without a GPU), it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information on how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]</div>
Pennings: https://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10911 SW:Matlab 2020-02-24T22:15:26Z
<p>Pennings: </p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to the HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will set up the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases. Speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information on how to use the portal, see our [[SW:Portal | HPRC OnDemand Portal]] section.<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes).<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and make sure Matlab uses 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
  -s     set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will discuss some common concepts from the Matlab Parallel Toolbox and the convenience functions HPRC created to utilize the Parallel toolbox. We will give a brief introduction into Matlab Cluster profiles, parallel pools, the parallel constructs ''parfor'' and ''spmd'' , and how to utilize GPUs using Matlab.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' class introduced in the ''Submit Matlab Scripts Remotely or Locally From the Matlab Command Line'' section above.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Cluster profiles define properties on where and how you want to do the parallel processing. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the same node the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC has already created a custom Cluster Profile. You can use this profile to define how many workers you want and how to distribute them over the nodes. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and it creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' ( ''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs)<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton. It doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
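The populated cluster object can be passed to standard Parallel Computing Toolbox functions. For example, a minimal sketch using Matlab's standard '''batch''' function and a hypothetical script name:<br />
<br />
<pre><br />
>> j = batch(clusterObject,'myscript');   % submit the (hypothetical) script myscript.m via the cluster object<br />
</pre><br />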
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a parallel pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks will be destroyed once the block is finished.<br />
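Putting this together, here is a minimal sketch of a pool-based run using the HPRC convenience function; whether '''tamu_parpool''' returns a pool handle is an assumption here:<br />
<br />
<pre><br />
tp = TAMUClusterProperties;<br />
tp.workers(4);<br />
mypool = tamu_parpool(tp);   % start a pool on the requested resources (assumed to return a pool handle)<br />
parfor i = 1:100<br />
    A(i) = i^2;              % iterations are divided among the pool workers<br />
end<br />
delete(mypool)               % shut the pool down when done<br />
</pre><br />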
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead in executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace. The actual GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
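To compare elapsed times of the CPU and GPU versions, the standard ''tic''/''toc'' timers can be used; ''wait(gpuDevice)'' ensures the asynchronous GPU computation has finished before the timer stops:<br />
<br />
<pre><br />
a = rand(1000);<br />
tic; b = a*a; tcpu = toc;                        % multiply on the client (CPU)<br />
<br />
ag = gpuArray(a);                                % copy the data to the GPU<br />
tic; bg = ag*ag; wait(gpuDevice); tgpu = toc;    % multiply on the GPU<br />
fprintf('CPU: %.4f s  GPU: %.4f s\n', tcpu, tgpu);<br />
</pre><br />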
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information about how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
 j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
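Assuming the returned '''Job''' behaves like a standard Parallel Computing Toolbox job object, results can typically be retrieved along these lines:<br />
<br />
<pre><br />
wait(myjob);                     % block until the job has finished<br />
results = fetchOutputs(myjob);   % retrieve any results returned by the script<br />
diary(myjob)                     % display the captured command-line output<br />
</pre><br />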
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10910SW:Matlab2020-02-24T22:12:58Z<p>Pennings: </p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
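The current setting can also be queried and changed with the standard Matlab function ''maxNumCompThreads'':<br />
<br />
<pre><br />
n = maxNumCompThreads;     % query the current number of computational threads<br />
maxNumCompThreads(4);      % equivalent way to set it to 4<br />
</pre><br />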
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab will use 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
   -s     set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will discuss some common concepts from the Matlab Parallel Toolbox and the convenience functions HPRC created to utilize the Parallel Toolbox. We will give a brief introduction to Matlab Cluster profiles, parallel pools, the parallel constructs ''parfor'' and ''spmd'', and how to utilize GPUs using Matlab.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' class introduced in the ''Submit Matlab Scripts Remotely or Locally From the Matlab Command Line'' section above.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Cluster profiles define properties on where and how you want to do the parallel processing. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the node on which the Matlab client is running.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here. Processing using a local profile is exactly the same as processing using cluster profiles.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. You can use this profile to define how many workers you want and how you want to distribute the workers over the nodes. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function.<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' (''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs).<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton. It doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (parfor and spmd, discussed below). A parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a parallel pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared inside parfor and spmd blocks will be destroyed once the block is finished.<br />
<br />
== Common Parallel constructs ==<br />
=== parfor ===<br />
<br />
The concept of a parfor-loop is similar to the standard Matlab for-loop. The difference is that parfor partitions the iterations among the available <br />
workers to run in parallel. For example:<br />
<br />
<pre><br />
parfor i=1:1024<br />
A(i)=sin((i/1024)*2*pi);<br />
end<br />
<br />
</pre><br />
If a parallel pool is open, parfor divides the iterations of this loop among the available workers and executes them in parallel. <br />
<br />
For more information please visit the Matlab parfor page.<br />
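parfor also supports reduction variables, which are accumulated across iterations regardless of how the iterations are distributed among the workers; for example, a parallel sum:<br />
<br />
<pre><br />
total = 0;<br />
parfor i = 1:1024<br />
    total = total + sin((i/1024)*2*pi);   % reduction variable<br />
end<br />
</pre><br />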
<br />
<br />
=== spmd ===<br />
<br />
<br />
spmd runs the same program on all workers concurrently. A typical use of spmd is when you need to run the same program on multiple sets of input. For example, suppose you have 4 inputs named data1, data2, data3, data4 and you want to run function myfun on all of them:<br />
<br />
<pre><br />
spmd (4)<br />
data = load(['data' num2str(labindex)])<br />
myresult = myfun(data)<br />
end<br />
</pre><br />
NOTE: labindex is a Matlab variable and is set to the worker id, values range from 1 to number of workers. <br />
<br />
Every worker will have its own version of variable myresult. To access these variables outside the spmd block you append {i} to the variable name, e.g. myresult{3} represents variable myresult from worker 3. <br />
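For example, a minimal sketch of computing a per-worker value and accessing it outside the spmd block:<br />
<br />
<pre><br />
spmd (4)<br />
    myresult = labindex^2;   % each worker computes its own value<br />
end<br />
myresult{3}                  % the value computed by worker 3, i.e. 9<br />
</pre><br />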
<br />
For more information please visit the Matlab spmd page.<br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
methods('gpuArray')<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (will be discussed later). <br />
<br />
NOTE: There is significant overhead in executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
gpuDevice<br />
This function shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example performs a matrix multiplication on the GPU and copies the result back to the client workspace. The actual GPU matrix multiplication code can be written as:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
<br />
= Running (parallel) Matlab Scripts on HPRC compute nodes =<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information about how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
 >> help TAMUClusterProperties<br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
 j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab&diff=10909SW:Matlab2020-02-24T21:34:53Z<p>Pennings: </p>
<hr />
<div><br />
__TOC__<br />
= Running Matlab interactively =<br />
Matlab is accessible to all HPRC users within the terms of our license agreement. If you have particular concerns about whether specific usage falls within the TAMU HPRC license, please send an email to HPRC Helpdesk. You can start a Matlab session either directly on a login node or through our portal.<br />
<br />
== Running Matlab on a login node ==<br />
<br />
To be able to use Matlab, the Matlab module needs to be loaded first. This can be done using the following command:<br />
[ netID@cluster ~]$ '''module load Matlab/R2019a'''<br />
<br />
This will setup the environment for Matlab version R2019a. To see a list of all installed versions, use the following command:<br />
[ netID@cluster ~]$ '''module spider Matlab'''<br />
<font color=teal>'''Note:''' New versions of software become available periodically. Version numbers may change.</font><br />
<br />
To start matlab, use the following command: <br />
[ netID@cluster ~]$ '''matlab'''<br />
<br />
Depending on your X server settings, this will start either the Matlab GUI or the Matlab command-line interface. To start Matlab in command-line interface mode, use the following command with the appropriate flags:<br />
[ netID@cluster ~]$ '''matlab -nosplash -nodisplay'''<br />
<br />
By default, Matlab will execute a large number of built-in operators and functions multi-threaded and will use as many threads (i.e. cores) as are available on the node. Since login nodes are shared among all users, HPRC restricts the number of computational threads to 8. This should suffice for most cases; the speedup achieved through multi-threading depends on many factors and may be limited in certain cases. To explicitly change the number of computational threads, use the following Matlab command:<br />
>>feature('NumThreads',4);<br />
<br />
This will set the number of computational threads to 4.<br />
<br />
To completely disable multi-threading, use the -singleCompThread option when starting Matlab:<br />
[ netID@cluster ~]$ '''matlab -singleCompThread'''<br />
<br />
{{:SW:Login_Node_Warning}}<br />
<br />
== Running Matlab through the hprc portal ==<br />
<br />
HPRC provides a portal through which users can start an interactive Matlab GUI session inside a web browser. For more information how to use the portal see our [[SW:Portal | HPRC OnDemand Portal]] section<br />
<br />
= Running Matlab through the batch system =<br />
<br />
<br />
HPRC developed a tool named '''matlabsubmit''' to run Matlab simulations on the HPRC compute nodes without the need to create your own batch script and without the need to start a Matlab session. '''matlabsubmit''' will automatically generate a batch script with the correct requirements. In addition, '''matlabsubmit''' will also generate boilerplate Matlab code to set up the environment (e.g. the number of computational threads) and, if needed, will start a ''parpool'' using the correct Cluster Profile (''local'' if all workers fit on a single node and a cluster profile when workers are distributed over multiple nodes)<br />
<br />
To submit your Matlab script, use the following command:<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit myscript.m<br />
</pre><br />
<br />
In the above example, '''matlabsubmit''' will use all default values for runtime, memory requirements, the number of workers, etc. To specify resources, you can use the command-line options of '''matlabsubmit'''. For example:<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m<br />
</pre><br />
<br />
will set the wall-time to 7 hours and ensure Matlab will use 4 computational threads for its run ('''matlabsubmit''' will also request 4 cores). <br />
<br />
To see all options for '''matlabsubmit''' use the '''-h''' flag<br />
<br />
<pre><br />
[ netID@cluster ~]$ matlabsubmit -h<br />
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME<br />
<br />
This tool automates the process of running Matlab codes on the compute nodes.<br />
<br />
OPTIONS:<br />
-h Shows this message<br />
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)<br />
-t sets the walltime; form hh:mm (e.g. -t 03:27)<br />
-w sets the number of ADDITIONAL workers<br />
-g indicates script needs GPU (no value needed)<br />
-b sets the billing account to use <br />
   -s     set number of threads for multithreading (default: 8; 1 when -w > 0)<br />
-p set number of workers per node<br />
-f run function call instead of script<br />
-x add explicit batch scheduler option<br />
<br />
DEFAULT VALUES:<br />
memory : 2000 per core <br />
time : 02:00<br />
workers : 0<br />
gpu : no gpu <br />
threading: on, 8 threads<br />
<br />
</pre><br />
<br />
<br />
'''NOTE''' when using the '''-f''' flag to execute a function instead of a script, the function call must be enclosed with double quotes when it contains parentheses. For example: '''matlabsubmit -f "myfunc(21)"'''<br />
<br />
<br><br />
<br />
When executing, '''matlabsubmit''' will do the following:<br />
* generate boilerplate Matlab code to setup the Matlab environment (e.g. #threads, #workers) <br><br />
* generate a batch script with all resources set correctly and the command to run Matlab <br><br />
* submit the generated batch script to the batch scheduler and return control back to the user <br><br />
<br />
<br />
For detailed examples on using matlabsubmit see the [[ SW:Matlab_matlabsubmit | examples ]] section.<br />
<br />
=Running (parallel) Matlab Scripts on HPRC compute nodes=<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, this method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
<br />
For detailed information about how to submit Matlab codes remotely, click [[SW:Matlab_app | here]]<br />
<br />
== Submit Matlab Scripts Remotely or Locally From the Matlab Command Line ==<br />
<br />
'''NOTE:''' Due to the new 2-factor authentication mechanism, remote submission method does not work at the moment. We will update this wiki page when this is fixed.<br />
<br />
Instead of using the App you can also call Matlab functions (developed by HPRC) directly to run your Matlab script on HPRC compute nodes. There are two steps involved in submitting your Matlab script:<br />
<br />
* Define the properties for your Matlab script (e.g. #workers). HPRC created a class named '''TAMUClusterProperties''' for this<br />
* Submit the Matlab script to run on HPRC compute nodes. HPRC created a function named '''tamu_run_batch''' for this.<br />
<br />
For example, suppose you have a script named ''mysimulation.m'', you want to use 4 workers and estimate it will need less than 7 hours of computing time: <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.workers(4);<br />
>> tp.walltime('07:00');<br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre><br />
<br />
'''NOTE:''' '''TAMUClusterProperties''' will use all default values for any of the properties that have not been set explicitly. <br />
<br />
In case you want to submit your Matlab script remotely from your local Matlab GUI, you also have to specify the HPRC cluster name you want to run on and your username. <br />
For example, suppose you have a script that uses Matlab GPU functions and you want to run it on terra:<br />
<pre><br />
>> tp=TAMUClusterProperties();<br />
>> tp.gpu(1);<br />
>> tp.hostname('terra.tamu.edu');<br />
>> tp.user('<USERNAME>'); <br />
>> myjob=tamu_run_batch(tp,'mysimulation.m');<br />
</pre> <br />
<br />
To see all available methods on objects of type '''TAMUClusterProperties''' you can use the Matlab '''help''' or '''doc''' functions, e.g.<br />
<br />
<pre><br />
>> help TAMUClusterProperties<br />
</pre><br />
<br />
To see help page for '''tamu_run_batch''', use:<br />
<br />
<pre><br />
>> help tamu_run_batch<br />
tamu_run_batch runs Matlab script on worker(s). <br />
<br />
 j = TAMU_RUN_BATCH(tp,'script') runs the script<br />
script.m on the worker(s) using the TAMUClusterProperties object tp.<br />
Returns j, a handle to the job object that runs the script.<br />
<br />
<br />
</pre><br />
<br />
<br />
'''tamu_run_batch''' returns a variable of type '''Job'''. See the section ''"Retrieve results and information from Submitted Job"'' for how to get results and information from the submitted job.<br />
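<br />
Since '''tamu_run_batch''' returns a standard Matlab ''Job'' object, the usual Parallel Computing Toolbox job methods should apply to it. A minimal sketch (assuming ''myjob'' was created as in the examples above):<br />
<br />
<pre><br />
>> wait(myjob);   % block until the job has finished<br />
>> diary(myjob)   % display the screen output captured by the job<br />
>> load(myjob);   % load the job's workspace variables into the client<br />
</pre><br />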
<br />
<br />
= Using Matlab Parallel Toolbox on HPRC Resources=<br />
<br />
<br />
<font color=red> ''THIS SECTION IS UNDER CONSTRUCTION'' </font><br><br />
<br />
In this section, we will discuss some common concepts from the Matlab Parallel Toolbox and the convenience functions HPRC created to utilize the Parallel toolbox. We will give a brief introduction into Matlab Cluster profiles, parallel pools, the parallel constructs ''parfor'' and ''spmd'', and how to utilize GPUs using Matlab.<br />
<br />
The central concept in most of the convenience functions is the '''TAMUClusterProperties''' class introduced in the ''Submit Matlab Scripts Remotely or Locally From the Matlab Command Line'' section above.<br />
<br />
<br />
== Cluster Profiles ==<br />
<br />
Cluster profiles define properties on where and how you want to do the parallel processing. There are two kinds of profiles.<br />
<br />
* local profiles: parallel processing is limited to the node the Matlab client is running on.<br />
* cluster profiles: parallel processing can span multiple nodes; profile interacts with batch scheduler (e.g. LSF on ada, SLURM on terra).<br />
<br />
'''NOTE:''' we will not discuss ''local profiles'' any further here; processing with a local profile works exactly the same as processing with a cluster profile.<br />
<br />
<br />
=== Importing Cluster Profile ===<br />
<br />
For your convenience, HPRC already created a custom Cluster Profile. You can use this profile to define how many workers you want and how you want to distribute the workers over the nodes. Before you can use this profile, you need to import it first. This can be done by calling the following Matlab function:<br />
<br />
<pre><br />
>>tamu_import_TAMU_clusterprofile()<br />
</pre><br />
<br />
This function imports the cluster profile and creates a directory structure in your scratch directory where Matlab will store meta information during parallel processing. The default location is ''/scratch/$USER/MatlabJobs/TAMU'' (''/scratch/$USER/MatlabJobs/TAMUREMOTE'' for remote jobs).<br />
<br />
'''NOTE:''' convenience function '''tamu_import_TAMU_clusterprofile''' is a wrapper around the Matlab function <br />
[https://www.mathworks.com/help/distcomp/parallel.importprofile.html parallel.importprofile]<br />
<br />
You only need to import the cluster profile once. However, the imported profile is just a skeleton: it doesn't contain information about how many resources (e.g. #workers) you want to use for parallel processing. In the next section, we will discuss how to create a fully populated cluster object that can be used for parallel processing.<br />
<br />
For more information about '''tamu_import_TAMU_clusterprofile()''' you can use the Matlab ''help'' and ''doc'' functions.<br />
<br />
=== Retrieving fully populated Cluster Profile Object ===<br />
<br />
To return a fully completed cluster object (i.e. with attached resource information) HPRC created the '''tamu_set_profile_properties''' convenience function. There are two steps to follow:<br />
<br />
* define the properties using the TAMUClusterProperties class<br />
* call '''tamu_set_profile_properties''' using the created TAMUClusterProperties object.<br />
<br />
For example, suppose you have Matlab code and want to use 4 workers for parallel processing. <br />
<br />
<pre><br />
>> tp=TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject=tamu_set_profile_properties(tp);<br />
</pre><br />
<br />
Variable ''clusterObject'' is a fully populated cluster object that can be used for parallel processing. <br />
<br />
'''NOTE:''' convenience function '''tamu_set_profile_properties''' is a wrapper around Matlab function <br />
[https://www.mathworks.com/help/distcomp/parcluster.html parcluster]. It also uses HPRC convenience function '''tamu_import_TAMU_clusterprofile''' to check if the '''TAMU''' profile has been imported already.<br />
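<br />
Because '''tamu_set_profile_properties''' wraps ''parcluster'', the returned object is an ordinary Matlab cluster object and should also work with standard Matlab functions that accept a cluster, such as ''batch''. A hypothetical sketch (''mysimulation.m'' is an assumed script name):<br />
<br />
<pre><br />
>> tp = TAMUClusterProperties;<br />
>> tp.workers(4);<br />
>> clusterObject = tamu_set_profile_properties(tp);<br />
>> j = batch(clusterObject, 'mysimulation');  % run mysimulation.m through the profile<br />
>> wait(j);<br />
</pre><br />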
<br />
== Starting a Parallel Pool ==<br />
<br />
To start a parallel pool you can use the HPRC convenience function '''tamu_parpool'''. It takes as argument a '''TAMUClusterProperties''' object that specifies all the resources that are requested. <br />
<br />
The '''parpool''' function enables the full functionality of the parallel language features (''parfor'' and ''spmd'', discussed below). parpool creates a special job on a pool of workers and connects the pool to the MATLAB client. For example:<br />
<pre><br />
mypool = parpool(4)<br />
:<br />
delete(mypool)<br />
</pre><br />
<br />
This code starts a pool of 4 workers using the default cluster profile. <br />
<br />
NOTE: only instructions within parfor and spmd blocks are executed on the workers. All other instructions are executed on the client. <br />
<br />
NOTE: all variables declared on the workers will be destroyed once the parallel pool is deleted.<br />
<br />
== Common Parallel constructs ==<br />
=== parfor ===<br />
<br />
The concept of a parfor-loop is similar to the standard Matlab for-loop. The difference is that parfor partitions the iterations among the available <br />
workers to run in parallel. For example:<br />
<br />
<pre><br />
parfor i=1:1024<br />
A(i)=sin((i/1024)*2*pi);<br />
end<br />
<br />
</pre><br />
The iterations of this loop will be divided among the workers in the open parallel pool and executed in parallel. <br />
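<br />
parfor also supports reduction variables, which Matlab combines across the workers automatically. For example, a parallel sum of the same values:<br />
<br />
<pre><br />
total = 0;<br />
parfor i = 1:1024<br />
    total = total + sin((i/1024)*2*pi);  % total is a reduction variable<br />
end<br />
</pre><br />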
<br />
For more information please visit the Matlab parfor page.<br />
<br />
<br />
=== spmd ===<br />
<br />
<br />
spmd runs the same program on all workers concurrently. A typical use of spmd is when you need to run the same program on multiple sets of input. For example, suppose you have 4 inputs named data1, data2, data3, and data4, and you want to run function myfun on all of them:<br />
<br />
<pre><br />
spmd (4)<br />
data = load(['data' num2str(labindex)])<br />
myresult = myfun(data)<br />
end<br />
</pre><br />
NOTE: labindex is a Matlab variable set to the worker id; its values range from 1 to the number of workers. <br />
<br />
Every worker will have its own version of variable myresult. To access these variables outside the spmd block, append {i} to the variable name; e.g. myresult{3} represents variable myresult from worker 3. <br />
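<br />
Putting these pieces together, a complete session might look as follows (a sketch; the data files and the function ''myfun'' are assumed to exist):<br />
<br />
<pre><br />
mypool = parpool(4);<br />
spmd<br />
    data = load(['data' num2str(labindex)]);<br />
    myresult = myfun(data);<br />
end<br />
disp(myresult{3})  % Composite variable: the result computed by worker 3<br />
delete(mypool)<br />
</pre><br />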
<br />
For more information please visit the Matlab spmd page.<br />
<br />
== Using GPU ==<br />
<br />
Normally all variables reside in the client workspace and Matlab operations are executed on the client machine. However, Matlab also provides options to utilize available GPUs to run code faster.<br />
Running code on the GPU is actually very straightforward. Matlab provides GPU versions of many built-in operations. These operations are executed on the GPU automatically when the variables involved reside on the GPU, and the results of these operations will also reside on the GPU. To see which functions can be run on the GPU, type:<br />
<br />
<pre><br />
methods('gpuArray')<br />
</pre><br />
<br />
This will show a list of all available functions that can be run on the GPU, as well as a list of available static functions to create data on the GPU directly (discussed later). <br />
<br />
NOTE: there is significant overhead to executing code on the GPU because of memory transfers. <br />
<br />
Another useful function is:<br />
<pre><br />
gpuDevice<br />
</pre><br />
This function shows all the properties of the GPU. When this function is called from the client (or a node without a GPU) it will just print an error message.<br />
<br />
<br />
To copy variables from the client workspace to the GPU, you can use the gpuArray command. For example:<br />
<pre><br />
carr = ones(1000);<br />
garr = gpuArray(carr);<br />
</pre><br />
<br />
will copy variable carr to the GPU with the name garr. <br />
<br />
In the example above the 1000x1000 matrix needs to be copied from the client workspace to the GPU. There is a significant overhead involved in doing this.<br />
<br />
To create the variables directly on the GPU, Matlab provides a number of convenience functions. For example:<br />
<pre><br />
garr=gpuArray.ones(1000)<br />
</pre><br />
<br />
This will create a 1000x1000 matrix directly on the GPU consisting of all ones. <br />
<br />
<br />
To copy data back to the client workspace Matlab provides the gather operation.<br />
<pre><br />
carr2 = gather(garr)<br />
</pre><br />
<br />
This will copy the array garr on the GPU back to variable carr2 in the client workspace.<br />
<br />
The next example creates a random matrix directly on the GPU, performs the matrix multiplication on the GPU, and copies the result back to the client workspace:<br />
<pre><br />
ag = gpuArray.rand(1000); <br />
bg = ag*ag;<br />
c = gather(bg); <br />
</pre><br />
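<br />
To actually compare elapsed times on the client and on the GPU you can use ''tic''/''toc'' (or ''gputimeit'' for more accurate GPU timings). A sketch, assuming a GPU is available:<br />
<br />
<pre><br />
a = rand(1000);<br />
tic; b = a*a; toc                      % multiply on the client<br />
<br />
ag = gpuArray(a);<br />
tic; bg = ag*ag; wait(gpuDevice); toc  % multiply on the GPU; wait for the kernel to finish<br />
c = gather(bg);                        % copy the result back to the client<br />
</pre><br />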
<br />
<br />
<br />
[[Category:Software]]</div>Penningshttps://hprc.tamu.edu/w/index.php?title=SW:Matlab_matlabsubmit&diff=10908SW:Matlab matlabsubmit2020-02-24T21:26:57Z<p>Pennings: /* Example 3: Utilizing Matlab workers (multi node) */</p>
<hr />
<div> <br />
= Example 1: basic use =<br />
<br />
The following example shows the simplest use of matlabsubmit. It will execute matlab script ''test.m'' using default values for batch resources and Matlab resources. matlabsubmit will also print some useful information to the screen. As can be seen in the example, it will show the Matlab resources requested (e.g. #threads, #workers), the submit command that will be used to submit the job, the batch scheduler JobID, and the location of output generated by Matlab and the batch scheduler.<br />
<br />
<pre><br />
-bash-4.1$ matlabsubmit test.m<br />
<br />
===============================================<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 0<br />
Nodes : 1<br />
#threads : 8<br />
===============================================<br />
<br />
sbatch -o MatlabSubmitLOG1/slurm.out --job-name=matlabsubmitID-1_test --ntasks=1 --ntasks-per-node=1 --cpus-per-task=8 --mem=16000 --time=02:00:00 MatlabSubmitLOG1/submission_script<br />
Submitted batch job 3759554<br />
(from job_submit) your job is charged as below<br />
Project Account: 122839393845<br />
Account Balance: 17504.211670<br />
Requested SUs: 16<br />
<br />
-----------------------------------------------<br />
matlabsubmit ID : 1<br />
Matlab output file : MatlabSubmitLOG1/matlab.log<br />
-bash-4.1$<br />
</pre><br />
<br />
<br />
The matlab script ''test.m'' has to be in the current directory. Control will be returned immediately after executing the matlabsubmit command. To check the run status or kill a job, use the respective batch scheduler commands (e.g. '''squeue''' and '''scancel'''). matlabsubmit will create a subdirectory named '''MatlabSubmitLOG<N>''' (where '''N''' is the matlabsubmit ID). In this directory matlabsubmit will store all its relevant files: the generated batch script, matlab driver, and redirected output and error. A listing of this directory will show the following files: <br />
<br />
* '''slurm.err''' redirected error <br />
* '''slurm.out''' redirected output (both Slurm and Matlab)<br />
* '''matlab.log''' redirected Matlab screen output<br />
* '''matlabsubmit_wrapper.m''' Matlab code that sets #threads and calls user function<br />
* '''submission_script''' the generated Slurm batch script<br />
* '''batch_job_id-<SLURM_JOBID>''' dummy file to show the actual Slurm job id this matlabsubmit job is associated to.<br />
<br />
= Example 2: Utilizing Matlab workers (single node) =<br />
<br />
To utilize additional workers used by Matlab's parallel features such as ''parfor'',''spmd'', and ''distributed'', '''matlabsubmit''' provides an option to specify the number of workers. This is done using the ''-w <N>'' flag (where <N> represents the number of workers). The following example shows a simple case of using 8 workers.<br />
<br />
<pre><br />
-bash-4.1$ matlabsubmit -w 8 test.m<br />
<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 8<br />
Nodes : 1<br />
#threads : 1<br />
===============================================<br />
<br />
sbatch -o MatlabSubmitLOG2/slurm.out --job-name=matlabsubmitID-2_test --ntasks=9 --ntasks-per-node=9 --cpus-per-task=1 --mem=16000 --time=02:00:00 MatlabSubmitLOG2/submission_script<br />
Submitted batch job 3759571<br />
(from job_submit) your job is charged as below<br />
Project Account: 122839393845<br />
Account Balance: 17504.211670<br />
Requested SUs: 18<br />
<br />
-----------------------------------------------<br />
matlabsubmit ID : 2<br />
matlab output file : MatlabSubmitLOG2/matlab.log<br />
<br />
</pre><br />
<br />
In this example, '''matlabsubmit''' will first execute Matlab code to create a ''parpool'' with 8 workers (using the local profile). As can be seen in the output, in this case, '''matlabsubmit''' requests 9 cores: 1 core for the client and 8 cores for the workers. The only exception is when the user requests all the workers on a node ( 28 on terra). In that case, matlabsubmit will request all cores (instead of one extra for the client).<br />
<br />
= Example 3: Utilizing Matlab workers (multi node) =<br />
<br />
'''matlabsubmit''' provides excellent options for Matlab runs that need more workers than fit on a single node and/or when the Matlab workers need to be distributed among multiple nodes. Reasons for distributing workers among different nodes include: the need to use certain resources such as GPUs on multiple nodes, enabling multi threading on every worker, and using the available memory of multiple nodes.<br />
The following example shows how to run a Matlab simulation that utilizes 8 workers, where every node will run 4 workers (i.e. the workers will be distributed among 8/4 = 2 nodes).<br />
<pre><br />
-bash-4.1$ matlabsubmit -w 8 -p 4 test.m<br />
===============================================<br />
Running Matlab script with following parameters<br />
-----------------------------------------------<br />
Script : test.m<br />
Workers : 8<br />
Nodes : 2<br />
#threads : 1<br />
===============================================<br />
<br />
... starting matlab batch. This might take some time. See MatlabSubmitLOG5/matlab-batch-commands.log<br />
...Starting Matlab from host: tlogin-0502.cluster<br />
MATLAB is selecting SOFTWARE OPENGL rendering.<br />
<br />
< M A T L A B (R) ><br />
Copyright 1984-2019 The MathWorks, Inc.<br />
R2019b Update 3 (9.7.0.1261785) 64-bit (glnxa64)<br />
November 27, 2019<br />
<br />
<br />
To get started, type doc.<br />
For product information, visit www.mathworks.com.<br />
<br />
... Interactive Matlab session, multi threading reduced to 8<br />
<br />
<br />
submitstring =<br />
<br />
'--cpus-per-task=1 --ntasks=9 --ntasks-per-node=4 --mem=5334 --time=2:00:00 '<br />
<br />
-bash-4.1$<br />
<br />
</pre><br />
<br />
As can be seen, the output is very different from the previous examples (some of the output has been removed for readability). When a job uses multiple nodes, '''matlabsubmit''' uses a slightly different approach: it will start a regular ''interactive'' Matlab session and from within it run the Matlab ''batch'' command using a specialized cluster profile. It will then exit Matlab while the Matlab script is executed on the compute nodes. <br> <br />
<br><br />
<br />
The contents of the MatlabSubmitLOG directory are also slightly different. A listing will show the following files:<br />
<br />
* '''matlab-batch-commands.log''' screen output from Matlab <br />
* '''matlabsubmit_driver.m''' Matlab code that sets up the cluster profile and calls Matlab ''batch'' <br />
* '''matlabsubmit_wrapper.m''' Matlab code that sets #threads and calls user function<br />
* '''submission_script''' The actual command to start Matlab<br />
<br />
In addition to the MatlabSubmitLOG directory created by '''matlabsubmit''', Matlab will also create a directory named '''Job<N>''' used by the cluster profile to store metadata, log files, and screen output. The '''*.diary.txt''' text files will show screen output for the client and all the workers. All the Job<N> directories can be found in your $SCRATCH/MatlabJobs/<PROFILE> directory where <PROFILE> is the name of the used cluster profile. For R2019b it will be ''$SCRATCH/MatlabJobs/TAMU2019b''. Note: for older versions, the profile is always ''TAMU''</div>
<br />
<br />
-----------------------------------------------<br />
matlabsubmit JOBID : 8<br />
batch output file (client) : Job1/Task1.diary.txt<br />
batch output files (workers) : Job1/Task[2-25].diary.txt<br />
Done<br />
<br />
-bash-4.1$<br />
<br />
</pre><br />
<br />
As can be seen, the output is very different from the previous examples. When a job uses multiple nodes, matlabsubmit takes a slightly different approach: it starts a regular ''interactive'' Matlab session and, from within it, runs the Matlab ''batch'' command using the '''TAMUG''' cluster profile. It then exits Matlab while the Matlab script executes on the compute nodes. <br> <br />
<br><br />
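The multi-node launch above is roughly equivalent to the following Matlab sketch. This illustrates the mechanism only, not the exact code matlabsubmit generates; the profile name and job ID are assumptions based on the output above.<br />
<pre><br />
% select the cluster profile used for multi-node jobs<br />
c = parcluster('TAMUG');<br />
<br />
% submit the user script as a batch job with a pool of 24 workers;<br />
% Matlab adds 1 client process, hence the 25 cores in the bsub command<br />
job = batch(c, 'test', 'Pool', 24);<br />
<br />
% later, from an interactive session, the job and its output can be retrieved<br />
job = findJob(c, 'ID', 1);   % ID as reported in the Job properties above<br />
wait(job);                   % block until the job finishes<br />
diary(job);                  % print the accumulated screen output<br />
</pre><br />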
<br />
The contents of the MatlabSubmitLOG directory are also slightly different. A listing will show the following files:<br />
<br />
* '''matlab-batch-commands.log''' screen output from Matlab <br />
* '''matlabsubmit_driver.m''' Matlab code that sets up the cluster profile and calls the Matlab ''batch'' command <br />
* '''matlabsubmit_wrapper.m''' Matlab code that sets the number of threads and calls the user's function<br />
* '''submission_script''' the actual command used to start Matlab<br />
<br />
In addition to the MatlabSubmitLOG directory created by matlabsubmit, Matlab will also create a directory named '''Job<N>''', used by the cluster profile to store metadata, log files, and screen output. The '''*.diary.txt''' text files show the screen output for the client and all the workers.</div>Pennings