# matlabsubmit

## Usage

HPRC developed a tool named **matlabsubmit** to run Matlab simulations
on the HPRC compute nodes without the need to create your own batch
script and without the need to start a Matlab session. **matlabsubmit**
will automatically generate a batch script with the correct
requirements. In addition, **matlabsubmit** will also generate
boilerplate Matlab code to set up the environment (e.g. the number of
computational threads) and, if needed, will start a *parpool* using the
correct Cluster Profile (*local* if all workers fit on a single node and
a cluster profile when workers are distributed over multiple nodes).

To submit your Matlab script, use the following command:

```
[ NetID@cluster ~]$ matlabsubmit myscript.m
```

In the above example, **matlabsubmit** will use all default values for
runtime, memory requirements, the number of workers, etc. To specify
resources, you can use the command-line options of **matlabsubmit**.
For example:

```
[ NetID@cluster ~]$ matlabsubmit -t 07:00 -s 4 myscript.m
```

This sets the wall-time to 7 hours and makes sure Matlab uses 4
computational threads for its run (**matlabsubmit** will also request 4
cores).

To see all options for **matlabsubmit**, use the **-h** flag:

```
[ NetID@cluster ~]$ matlabsubmit -h
Usage: /sw/hprc/sw/Matlab/bin/matlabsubmit [options] SCRIPTNAME
This tool automates the process of running Matlab codes on the compute nodes.
OPTIONS:
-h Shows this message
-m set the amount of requested memory in MEGA bytes(e.g. -m 20000)
-t sets the walltime; form hh:mm (e.g. -t 03:27)
-w sets the number of ADDITIONAL workers
-g indicates script needs GPU (no value needed)
-b sets the billing account to use
-s set number of threads for multithreading (default: 8 ( 1 when -w > 0)
-p set number of workers per node
-f run function call instead of script
-x add explicit batch scheduler option
DEFAULT VALUES:
memory : 2000 per core
time : 02:00
workers : 0
gpu : no gpu
threading: on, 8 threads
```

**NOTE**: when using the **-f** flag to execute a function instead of a
script, the function call must be enclosed in double quotes when it
contains parentheses. For example: **matlabsubmit -f "myfunc(21)"**
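The quotes are needed because the shell treats unquoted parentheses as its own syntax and would reject the command before **matlabsubmit** ever sees it. A minimal sketch, assuming a hypothetical function `myfunc` that stands in for your own function:

```shell
# Hypothetical function call; myfunc stands in for your own function.
call='myfunc(21)'

# Quoted, the call is passed on as one literal argument.
printf 'argument passed: %s\n' "$call"

# Submit only where matlabsubmit is actually installed
# (e.g. on a cluster login node).
if command -v matlabsubmit >/dev/null 2>&1; then
    matlabsubmit -f "$call"
fi
```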

When executing, **matlabsubmit** will do the following:

- generate boilerplate Matlab code to set up the Matlab environment (e.g. #threads, #workers)
- generate a batch script with all resources set correctly and the command to run Matlab
- submit the generated batch script to the batch scheduler and return control back to the user

## Examples

### Example 1: basic use

The following example shows the simplest use of matlabsubmit. It will
execute the Matlab script *test.m* using default values for batch resources
and Matlab resources. matlabsubmit will also print some useful
information to the screen. As can be seen in the example, it will show
the Matlab resources requested (e.g. #threads, #workers), the submit
command that will be used to submit the job, the batch scheduler JobID,
and the location of output generated by Matlab and the batch scheduler.

```
-bash-4.1$ matlabsubmit test.m
===============================================
Running Matlab script with following parameters
-----------------------------------------------
Script : test.m
Workers : 0
Nodes : 1
#threads : 8
===============================================
sbatch -o MatlabSubmitLOG1/slurm.out --job-name=matlabsubmitID-1_test --ntasks=1 --ntasks-per-node=1 --cpus-per-task=8 --mem=16000 --time=02:00:00 MatlabSubmitLOG1/submission_script
Submitted batch job 3759554
(from job_submit) your job is charged as below
Project Account: 122839393845
Account Balance: 17504.211670
Requested SUs: 16
-----------------------------------------------
matlabsubmit ID : 1
Matlab output file : MatlabSubmitLOG1/matlab.log
-bash-4.1$
```

The Matlab script *test.m* has to be in the current directory. Control
is returned immediately after executing the matlabsubmit command.
To check the run status or kill a job, use the respective batch
scheduler commands (e.g. **squeue** and **scancel**). matlabsubmit will
create a subdirectory named **MatlabSubmitLOGN** (where **N** is the
matlabsubmit ID). In this directory matlabsubmit stores all its
relevant files: the generated batch script, the Matlab driver, and the
redirected output and error. A listing of this directory will show the
following files:

- **slurm.err**: redirected error
- **slurm.out**: redirected output (both Slurm and Matlab)
- **matlab.log**: redirected Matlab screen output
- **matlabsubmit_wrapper.m**: Matlab code that sets #threads and calls the user function
- **submission_script**: the generated Slurm batch script
- **batch_job_id-**: dummy file that shows the actual Slurm job ID this matlabsubmit job is associated with
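As a rough sketch of how the Slurm job ID can be picked up for use with **squeue** and **scancel**: the snippet below parses the `Submitted batch job` line shown in the Example 1 output. The sample line is copied from that output; in practice you would capture the output of your own matlabsubmit run.

```shell
# Sample line copied from the Example 1 output.
output='Submitted batch job 3759554'

# The Slurm job ID is the fourth whitespace-separated field of that line.
jobid=$(printf '%s\n' "$output" | awk '/^Submitted batch job/ {print $4}')
echo "Slurm job ID: $jobid"

# Query the scheduler only where Slurm is available.
if command -v squeue >/dev/null 2>&1; then
    squeue -j "$jobid"      # show the job's current state
    # scancel "$jobid"      # uncomment to cancel the job
fi
```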

### Example 2: Utilizing Matlab workers (single node)

To utilize additional workers used by Matlab's parallel features such as
*parfor*, *spmd*, and *distributed*, **matlabsubmit** provides an option
to specify the number of workers. This is done using the *-w N* flag
(where *N* is the number of workers). For example:

```
-bash-4.1$ matlabsubmit -w 8 test.m
Running Matlab script with following parameters
-----------------------------------------------
Script : test.m
Workers : 8
Nodes : 1
#threads : 1
===============================================
sbatch -o MatlabSubmitLOG2/slurm.out --job-name=matlabsubmitID-2_test --ntasks=9 --ntasks-per-node=9 --cpus-per-task=1 --mem=16000 --time=02:00:00 MatlabSubmitLOG2/submission_script
Submitted batch job 3759571
(from job_submit) your job is charged as below
Project Account: 122839393845
Account Balance: 17504.211670
Requested SUs: 18
-----------------------------------------------
matlabsubmit ID : 2
matlab output file : MatlabSubmitLOG2/matlab.log
```

In this example, **matlabsubmit** will first execute Matlab code to
create a *parpool* with 8 workers (using the local profile). As can be
seen in the output, in this case, **matlabsubmit** requests 9 cores: 1
core for the client and 8 cores for the workers. The only exception is
when the user requests all the workers on a node. In that
case, matlabsubmit will request all cores (instead of one extra for the
client).
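For illustration, a script that benefits from the workers could look like the sketch below. The file name `parfor_demo.m` and the computation inside it are hypothetical; the point is only that independent loop iterations inside *parfor* are spread over the pool that **matlabsubmit** starts.

```shell
# Write a small Matlab script; parfor_demo.m is a hypothetical file name.
cat > parfor_demo.m <<'EOF'
% Each iteration is independent, so parfor can spread them over the workers.
n = 1000;
results = zeros(1, n);
parfor i = 1:n
    results(i) = sum(sqrt(1:i));
end
fprintf('total = %f\n', sum(results));
EOF

# Request 8 workers, as in Example 2; guarded so the snippet is
# harmless on machines without matlabsubmit.
if command -v matlabsubmit >/dev/null 2>&1; then
    matlabsubmit -w 8 parfor_demo.m
fi
```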

### Example 3: Utilizing Matlab workers (multi node)

**matlabsubmit** provides excellent options for Matlab runs that need
more workers than fit a single node and/or when the Matlab workers need
to be distributed among multiple nodes. Some examples for distributing
workers among multiple nodes include

- hybrid runs: where every worker uses multiple computational threads
- memory requirements: where the job needs more memory than fits on a single node.
- GPU needs: where every worker needs to utilize the GPU on a node

The following example shows how to run a Matlab simulation that utilizes 8 workers, where every node will run 4 workers (i.e. the workers will be distributed among 8/4 = 2 nodes).

```
-bash-4.1$ matlabsubmit -w 8 -p 4 test.m
===============================================
Running Matlab script with following parameters
-----------------------------------------------
Script : test.m
Workers : 8
Nodes : 2
#threads : 1
===============================================
... starting matlab batch. This might take some time. See MatlabSubmitLOG5/matlab-batch-commands.log
...Starting Matlab from host: tlogin-0502.cluster
MATLAB is selecting SOFTWARE OPENGL rendering.
< M A T L A B (R) >
Copyright 1984-2019 The MathWorks, Inc.
R2019b Update 3 (9.7.0.1261785) 64-bit (glnxa64)
November 27, 2019
To get started, type doc.
For product information, visit www.mathworks.com.
... Interactive Matlab session, multi threading reduced to 8
submitstring =
'--cpus-per-task=1 --ntasks=9 --ntasks-per-node=4 --mem=5334 --time=2:00:00 '
-bash-4.1$
```

As can be seen, the output is very different from the previous examples
(some of the output has been removed for readability). When a job uses
multiple nodes, the approach **matlabsubmit** uses is a bit different.
**matlabsubmit** will start a regular *interactive* Matlab session and
from within it will run the Matlab *batch* command using a specialized
cluster profile. It will then exit Matlab while the Matlab script is
executed on the compute nodes.
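The node and task counts in the output can be reproduced with a little arithmetic. This is a sketch of the relationship seen in the example, not the tool's actual code; in particular, rounding up when the workers do not divide evenly over the nodes is an assumption.

```shell
workers=8      # value passed to -w
per_node=4     # value passed to -p

# Ceiling division; rounding up for uneven splits is an assumption,
# the example only shows the evenly divisible case.
nodes=$(( (workers + per_node - 1) / per_node ))

# One extra task on top of the workers, consistent with --ntasks=9
# in the submit string of the example.
tasks=$(( workers + 1 ))

echo "nodes=$nodes tasks=$tasks"
```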

The contents of the MatlabSubmitLOG directory are also slightly different. A listing will show the following files:

- **matlab-batch-commands.log**: screen output from Matlab
- **matlabsubmit_driver.m**: Matlab code that sets up the cluster profile and calls Matlab *batch*
- **matlabsubmit_wrapper.m**: Matlab code that sets #threads and calls the user function
- **submission_script**: the actual command to start Matlab

In addition to the MatlabSubmitLOG directory created by
**matlabsubmit**, Matlab will also create a directory named **Job**,
used by the cluster profile to store metadata, log files, and screen
output. The **\*.diary.txt** text files show screen output for the
client and all the workers. All the Job directories are stored in
*$SCRATCH/MatlabJobs/TAMU2019b*. Note: for older versions, the profile
is always *TAMU*.
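To locate the diary files after a multi-node run, something along these lines may help. It is a sketch that assumes `$SCRATCH` is set (as it normally is on the cluster) and uses the *TAMU2019b* path mentioned above; the fallback to the current directory is only there so the snippet does not fail elsewhere.

```shell
# Directory named in the text; falls back to the current directory
# when SCRATCH is not set (an assumption for running off-cluster).
jobroot="${SCRATCH:-.}/MatlabJobs/TAMU2019b"

if [ -d "$jobroot" ]; then
    # List the diary files, which hold the client and worker screen output.
    find "$jobroot" -name '*.diary.txt'
else
    echo "no Matlab job directory at $jobroot"
fi
```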