SW:Comsol
After a model is built in Comsol, the next step is to compute a solution. This is often time consuming, so a job script must be created to run the computation in batch. This tutorial illustrates how to create Comsol LSF batch scripts on Ada.
All solvers in Comsol can run in parallel in one of three modes: shared memory mode, distributed mode, or hybrid mode. By default, a Comsol solver runs in shared memory mode. This works like OpenMP: the parallelism is limited by the total number of CPU cores available on one compute node of the cluster.
Example:
#BSUB -n 20
comsol -np 20 batch
Comsol solvers can also run in distributed mode by checking the "distributed computing" checkbox of the solver when building the model. In this mode, the solver runs on multiple nodes and uses MPI for communication. All solvers except PARDISO support distributed mode. PARDISO nevertheless has a checkbox for distributed computing; if it is selected, the solver actually used is MUMPS.
Example:
#BSUB -n 40
comsol -f -simplecluster
comsol -f ./hostfile.$JOB_ID -nn 40 batch
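The example reads its host list from ./hostfile.$JOB_ID, but this page does not show how that file is produced. The following is only a minimal sketch of one way it might be done. It assumes the job runs under LSF with the allocated hosts exported in the LSB_HOSTS variable (one entry per slot) and that Comsol accepts one host name per line. The function name unique_hosts is hypothetical.

```shell
# Hypothetical helper: print each allocated host once, in first-seen order.
# Assumes $1 is a whitespace-separated host list such as the one LSF
# exports in LSB_HOSTS (one entry per job slot).
unique_hosts() {
    # $1 is deliberately unquoted so the shell splits it into one word per host
    printf '%s\n' $1 | awk '!seen[$0]++'
}

# Inside a job script one might then write (file name as in the example above):
#   unique_hosts "$LSB_HOSTS" > ./hostfile.$JOB_ID
```

For very large jobs LSF may not set LSB_HOSTS, so check the variable before relying on it.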
Each mode has its pros and cons. Shared memory mode utilizes CPU cores better than distributed mode but can only run on one node, while distributed mode can use more than one node. It is usually best to run a solver in hybrid mode, which takes advantage of both. This is easily done at the command line by tuning the options -nn, -nnhost, and -np (-nn is the total number of Comsol processes, -nnhost is the number of processes per host, and -np is the number of cores used by each process).
Example:
#BSUB -n 40
comsol batch -f ./hostfile.$JOB_ID -nn 2 -nnhost 1 -np 20
comsol batch -f ./hostfile.$JOB_ID -nn 4 -nnhost 2 -np 10
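The two commands in this example split the same 40 cores in different ways. As a rough sketch of the arithmetic, the helper below derives the three options from the total core count, the cores per node, and the number of Comsol processes wanted on each node. The function name hybrid_layout is hypothetical, and the value of 20 cores per node is an assumption inferred from the examples on this page.

```shell
# Hypothetical helper: derive -nn, -nnhost and -np for a hybrid run.
#   $1 = total cores requested with #BSUB -n
#   $2 = cores per node (20 is assumed below, inferred from the examples)
#   $3 = Comsol processes to start on each node
hybrid_layout() {
    local total=$1 per_node=$2 per_host=$3
    local nodes=$(( total / per_node ))    # number of nodes in the job
    local nn=$(( nodes * per_host ))       # total Comsol processes
    local np=$(( per_node / per_host ))    # cores (threads) per process
    echo "-nn $nn -nnhost $per_host -np $np"
}

hybrid_layout 40 20 1   # prints: -nn 2 -nnhost 1 -np 20
hybrid_layout 40 20 2   # prints: -nn 4 -nnhost 2 -np 10
```

The sketch uses integer division, so it only gives sensible values when the total is a multiple of the cores per node and the cores per node is a multiple of the processes per node.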
Comsol models configured with a parameter sweep can also benefit from parallel computing in different ways. A parameter sweep model runs over a range of parameters or combinations of parameters, and each set of parameters can be calculated independently. Once a model has a parameter sweep node, it must also be configured with a cluster sweep to use the cluster.
Example:
#BSUB -n 40
run each set of parameters on one CPU core
comsol -f ./hostfile.$JOB_ID -nn 10 -nnhost 5 -np 4
large memory requirement
#BSUB -n 200
comsol -f ./hostfile.$JOB_ID -nn 10 -nnhost 1 -np 20
If each set of parameters requires more than one node (due to memory or computation time),
Common problems:
1. Disk quota exceeded in home directory.
By default, Comsol stores all temporary files in your home directory. For large models, you are likely to get a "Disk quota exceeded" error because of the large volume of temporary files written to your home directory. To resolve this issue, redirect temporary files to your scratch directory.
comsol -tmpdir /scratch/user/username/comsol/tmp -recoverydir /scratch/user/username/comsol/recovery ...
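The two directories may not exist yet when the job starts. As a minimal sketch, the helper below creates them under a base directory and prints the matching options so they can be added to the comsol command line. The function name comsol_dir_opts is hypothetical, and the tmp and recovery subdirectory names simply follow the command above.

```shell
# Hypothetical helper: create tmp and recovery directories under a base
# directory and print the corresponding Comsol options.
#   $1 = base directory, e.g. /scratch/user/username/comsol
comsol_dir_opts() {
    local base=$1
    # Stop early if the directories cannot be created (for example, bad path)
    mkdir -p "$base/tmp" "$base/recovery" || return 1
    echo "-tmpdir $base/tmp -recoverydir $base/recovery"
}

# Possible use inside a job script (replace username with your own):
#   opts=$(comsol_dir_opts /scratch/user/username/comsol)
#   comsol $opts ...
```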
For admins:
1. ??? Java exception occurred:java.lang.OutOfMemoryError: Java heap space
Increase the Java heap size in $COMSLPATH/bin/comsol.ini. Change -Xmx from 1024m to a larger value.
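A minimal sketch of the edit, assuming the setting appears in comsol.ini in the usual Java form -Xmx1024m. The function name set_xmx and the value 4096m are illustrative only. The helper writes the modified text to standard output, so the original file is left untouched until the result has been reviewed.

```shell
# Hypothetical helper: print a copy of an ini file with the -Xmx value replaced.
#   $1 = path to the ini file
#   $2 = new heap size, e.g. 4096m (illustrative value)
set_xmx() {
    # Match -Xmx followed by digits and a size suffix; other lines pass through
    sed "s/-Xmx[0-9][0-9]*[kKmMgG]/-Xmx$2/" "$1"
}

# Possible use (review comsol.ini.new before replacing the original):
#   set_xmx "$COMSLPATH/bin/comsol.ini" 4096m > comsol.ini.new
```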