
Difference between revisions of "SW:Matlab matlabsubmit"

From TAMU HPRC

Revision as of 15:59, 24 February 2020

Example 1: basic use

The following example shows the simplest use of matlabsubmit. It executes the Matlab script test.m using default values for batch resources and Matlab resources. matlabsubmit also prints some useful information to the screen: the Matlab resources requested (e.g. #threads, #workers), the submit command used to submit the job, the batch scheduler JobID, and the location of the output generated by Matlab and the batch scheduler.

-bash-4.1$ matlabsubmit test.m

===============================================
Running Matlab script with following parameters
-----------------------------------------------
Script     : test.m
Workers    : 0
Nodes      : 1
#threads   : 8
===============================================

sbatch -o MatlabSubmitLOG1/slurm.out --job-name=matlabsubmitID-1_test   --ntasks=1  --ntasks-per-node=1 --cpus-per-task=8 --mem=16000 --time=02:00:00      MatlabSubmitLOG1/submission_script
Submitted batch job 3759554
(from job_submit) your job is charged as below
              Project Account: 122839393845
              Account Balance: 17504.211670
              Requested SUs:   16

-----------------------------------------------
matlabsubmit ID        : 1
Matlab output file     : MatlabSubmitLOG1/matlab.log
-bash-4.1$


The Matlab script test.m has to be in the current directory. Control is returned immediately after executing the matlabsubmit command. To check the run status or kill a job, use the respective batch scheduler commands (e.g. squeue and scancel). matlabsubmit creates a subdirectory named MatlabSubmitLOG<N> (where N is the matlabsubmit ID), in which it stores all its relevant files: the generated batch script, the Matlab driver, and the redirected output and error. A listing of this directory will show the following files:

  • slurm.err redirected error
  • slurm.out redirected output (both Slurm and Matlab)
  • matlab.log redirected Matlab screen output
  • matlabsubmit_wrapper.m Matlab code that sets #threads and calls user function
  • submission_script the generated Slurm batch script
  • batch_job_id-<SLURM_JOBID> dummy file that shows the actual Slurm job ID this matlabsubmit job is associated with
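
Because the Slurm job ID is encoded in the name of the batch_job_id-<SLURM_JOBID> file, it can be recovered from the log directory later on. The following is a minimal sketch, not part of matlabsubmit: the function name matlabsubmit_jobid is hypothetical, and it assumes the marker file is named exactly as listed above.

```shell
# Print the Slurm job ID recorded in a matlabsubmit log directory.
# Hypothetical helper; relies on the batch_job_id-<SLURM_JOBID> marker file.
matlabsubmit_jobid() {
    logdir="$1"
    for marker in "$logdir"/batch_job_id-*; do
        # If the glob matched nothing, the literal pattern comes back
        # and no such file exists.
        [ -e "$marker" ] || return 1
        name=${marker##*/}                     # strip the directory part
        printf '%s\n' "${name#batch_job_id-}"  # strip the fixed prefix
        return 0
    done
}

# Usage on the cluster (not executed here):
#   jobid=$(matlabsubmit_jobid MatlabSubmitLOG1) && squeue -j "$jobid"
#   scancel "$jobid"
```

The function returns a non-zero status when no marker file is found, so the squeue call in the usage comment is skipped for a directory that does not belong to a submitted job.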

Example 2: Utilizing Matlab workers (single node)

To utilize additional workers used by Matlab's parallel features such as parfor, spmd, and distributed, matlabsubmit provides an option to specify the number of workers. This is done using the -w <N> flag (where <N> represents the number of workers). The following example shows a simple case of using 8 workers.

-bash-4.1$ matlabsubmit -w 8 test.m

Running Matlab script with following parameters
-----------------------------------------------
Script     : test.m
Workers    : 8
Nodes      : 1
#threads   : 1
===============================================

sbatch -o MatlabSubmitLOG2/slurm.out --job-name=matlabsubmitID-2_test   --ntasks=9  --ntasks-per-node=9 --cpus-per-task=1 --mem=16000 --time=02:00:00      MatlabSubmitLOG2/submission_script
Submitted batch job 3759571
(from job_submit) your job is charged as below
              Project Account: 122839393845
              Account Balance: 17504.211670
              Requested SUs:   18

-----------------------------------------------
matlabsubmit ID        : 2
matlab output file     : MatlabSubmitLOG2/matlab.log

In this example, matlabsubmit first executes Matlab code to create a parpool with 8 workers (using the local profile). As the output shows, matlabsubmit requests 9 cores: 1 core for the client and 8 cores for the workers. The only exception is when the user requests all the workers on a node (28 on Terra). In that case, matlabsubmit requests all cores of the node instead of adding one extra core for the client.
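
The core-count rule described above can be written out as a small calculation. This is only an illustrative sketch of the rule as stated in the text, not the actual matlabsubmit code; the function name cores_requested and its two parameters are assumptions made for this example.

```shell
# Number of cores requested for a single-node run with a given
# number of workers (hypothetical helper mirroring the rule above).
cores_requested() {
    workers=$1
    cores_per_node=$2
    if [ "$workers" -eq "$cores_per_node" ]; then
        # All cores of the node are used; no extra core for the client.
        echo "$workers"
    else
        # One additional core for the Matlab client.
        echo $((workers + 1))
    fi
}

cores_requested 8 28    # prints 9, matching --ntasks=9 in the output above
cores_requested 28 28   # prints 28
```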

Example 3: Utilizing Matlab workers (multi node)

matlabsubmit also supports Matlab runs that need more workers than a single node can provide and/or runs where the Matlab workers need to be distributed among multiple nodes. Reasons for distributing workers among different nodes include the need to use certain resources, such as GPUs, on multiple nodes, enabling multithreading on every worker, and using the available memory on multiple nodes. The following example shows how to run a Matlab simulation that uses 24 workers, where every node runs 4 workers (i.e. the workers are distributed among 24/4 = 6 nodes).

-bash-4.1$ matlabsubmit -w 24 -p 4 test.m
===============================================
Running Matlab script with following parameters
-----------------------------------------------
Script     : test.m
Workers    : 24
Nodes      : 6
Mem/proc   : 2500
#threads   : 1
===============================================

... starting matlab batch. This might take some time. 
See MatlabSubmitLOG8/matlab-batch-commands.log
...Starting Matlab from host: login4
MATLAB is selecting SOFTWARE OPENGL rendering.

                                           < M A T L A B (R) >
                                 Copyright 1984-2016 The MathWorks, Inc.
                                 R2016a (9.0.0.341360) 64-bit (glnxa64)
                                            February 11, 2016

 
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
 
... Interactive Matlab session, multi threading reduced to 4

	Academic License


commandToRun =

bsub -L /bin/bash -J Job1 -o '/general/home/pennings/Job1/Job1.log' -n 25 -M 2500 
     -R rusage[mem=2500] -R "span[ptile=4]" -W 02:00       
     "source /general/home/pennings/Job1/mdce_envvars ;
     /general/software/x86_64/tamusc/Matlab/toolbox/tamu/profiles/lsfgeneric/communicatingJobWrapper.sh"


job = 

 Job

    Properties: 

                   ID: 1
                 Type: pool
             Username: pennings
                State: running
           SubmitTime: Mon Aug 01 12:15:15 CDT 2016
            StartTime: 
     Running Duration: 0 days 0h 0m 0s
      NumWorkersRange: [25 25]

      AutoAttachFiles: true
  Auto Attached Files: /general/home/pennings/MatlabSubmitLOG8/matlabsubmit_wrapper.m
                       /general/home/pennings/test.m
        AttachedFiles: {}
      AdditionalPaths: {}

    Associated Tasks: 

       Number Pending: 25
       Number Running: 0
      Number Finished: 0
    Task ID of Errors: []
  Task ID of Warnings: []




-----------------------------------------------
matlabsubmit JOBID            : 8
batch  output file (client)   : Job1/Task1.diary.txt
batch  output files (workers) : Job1/Task[2-25].diary.txt
Done

-bash-4.1$

As can be seen, the output is very different from the previous examples. When a job uses multiple nodes, matlabsubmit takes a different approach: it starts a regular interactive Matlab session and, from within that session, runs the Matlab batch command using the TAMUG cluster profile. It then exits Matlab while the Matlab script is executed on the compute nodes.
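
The Nodes value reported in the output above follows from dividing the number of workers (-w) by the number of workers per node (-p). A minimal sketch of that arithmetic is shown below; the function name nodes_needed is hypothetical, and rounding up when the division is not exact is an assumption, since the example only covers the exact case 24/4.

```shell
# Nodes needed to place a number of workers at a fixed number of
# workers per node (hypothetical helper; rounding up is an assumption).
nodes_needed() {
    workers=$1
    per_node=$2
    # Integer ceiling of workers / per_node.
    echo $(( (workers + per_node - 1) / per_node ))
}

nodes_needed 24 4   # prints 6, matching "Nodes : 6" in the output above
```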

The contents of the MatlabSubmitLOG directory are also slightly different. A listing will show the following files:

  • matlab-batch-commands.log screen output from Matlab
  • matlabsubmit_driver.m Matlab code that sets up the cluster profile and calls Matlab batch
  • matlabsubmit_wrapper.m Matlab code that sets #threads and calls user function
  • submission_script The actual command to start Matlab

In addition to the MatlabSubmitLOG directory created by matlabsubmit, Matlab will also create a directory named Job<N>, which the cluster profile uses to store metadata, log files, and screen output. The *.diary.txt text files show the screen output for the client and all the workers.