
Curie:Batch Processing LSF

From TAMU HPRC

Curie Batch Processing: LSF

The Batch System

The batch system is a load distribution implementation that ensures convenient and fair use of a shared resource. Submitting jobs to a batch system allows a user to reserve specific resources with minimal interference to other users. All users are required to submit resource-intensive processing to the compute nodes through the batch system - attempting to circumvent the batch system is not allowed.

Unlike other HPRC resources, you will not be charged SUs for using Curie.

Submitting Jobs

Hardware Considerations

Each Curie node has 16 cores and a total of 234 GB of usable memory. Because of this, the maximum recommended "rusage" memory request is 14GB per core.
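The 14GB figure follows from splitting a node's memory evenly among its cores. A minimal sketch of that arithmetic in bash, assuming 1 GB = 1024 MB (the convention the job examples on this page use); the variable names are illustrative only:

```shell
#!/bin/bash
# Per-node figures quoted above.
NODE_MEM_GB=234
CORES_PER_NODE=16

# Convert to MB (assumes 1 GB = 1024 MB) and split evenly across the cores.
NODE_MEM_MB=$(( NODE_MEM_GB * 1024 ))
PER_CORE_MB=$(( NODE_MEM_MB / CORES_PER_NODE ))
PER_CORE_GB=$(( PER_CORE_MB / 1024 ))   # integer division rounds down

echo "Even share per core: ${PER_CORE_MB} MB (about ${PER_CORE_GB} GB, rounded down)"
```

Rounding the even share down to a whole number of GB leaves a small amount of memory on each node unreserved.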

If a job requests more memory or cores than exist, it will never run.

Curie uses IBM POWER7 CPUs. Therefore, executables built on Intel x86 systems such as Ada or Terra are not compatible with Curie.
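One way to see which kind of system you are on is to look at the machine architecture string. The sketch below assumes a Linux system where `uname -m` reports a name beginning with `ppc64` on POWER and `x86_64` on Intel x86; the `describe_arch` helper is hypothetical and exists only for illustration:

```shell
#!/bin/bash
# Hypothetical helper: map an architecture string to a short family label.
# Assumes Linux naming conventions (ppc64* for POWER, x86_64 for Intel x86).
describe_arch() {
    case "$1" in
        ppc64*) echo "POWER" ;;
        x86_64) echo "x86" ;;
        *)      echo "other" ;;
    esac
}

# Report the architecture of the machine this script runs on.
ARCH=$(uname -m)
echo "This machine reports '${ARCH}', family: $(describe_arch "$ARCH")"
```

An executable built on a machine in one family should be rebuilt from source before it is used on a machine in the other.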

Example Job Scripts

NOTE: Job examples are NOT lists of commands, but are a template of the contents of a job file. These examples should be pasted into a text editor and submitted as a job to be tested, not entered as commands line by line.

Example One

This is an example job using one core with 2.5GB of memory.

##NECESSARY JOB SPECIFICATIONS
#BSUB -J JobExample1         #Set the job name to "JobExample1"
#BSUB -L /bin/bash           #Uses the bash login shell to initialize the job's execution environment.
#BSUB -W 1:30                #Set the wall clock limit to 1hr and 30min
#BSUB -n 1                   #Request 1 core 
#BSUB -R "span[ptile=1]"     #Request 1 core per node (This can be a maximum of 16 cores)
#BSUB -R "rusage[mem=2560]"  #Request 2560MB (2.5GB) per core (If using all cores, the recommended maximum is 14GB)
#BSUB -M 2560                #Set the per process enforceable memory limit to 2560MB
#BSUB -o Example1Out.%J      #Send stdout/err to "Example1Out.[jobID]"

##OPTIONAL JOB SPECIFICATIONS
#BSUB -P 123456              #Set billing account to 123456
#BSUB -u email_address       #Send all emails to email_address
#BSUB -B -N                  #Send email on job begin (-B) and end (-N)

##Your Commands After This Line

Example Two

This example job uses all 16 cores on each of 2 nodes, for a total of 32 cores, and uses most of the available memory.

##NECESSARY JOB SPECIFICATIONS
#BSUB -J JobExample2         #Set the job name to "JobExample2"
#BSUB -L /bin/bash           #Uses the bash login shell to initialize the job's execution environment.
#BSUB -W 5:00                #Set the wall clock limit to 5hr and 00min
#BSUB -n 32                  #Request 32 cores 
#BSUB -R "span[ptile=16]"    #Request 16 cores per node (This can be a maximum of 16 cores)
#BSUB -R "rusage[mem=14336]" #Request 14336MB (14GB) per core (If using all cores, the recommended maximum is 14GB)
#BSUB -M 14336               #Set the per process enforceable memory limit to 14336MB
#BSUB -o Example2Out.%J      #Send stdout/err to "Example2Out.[jobID]"

##OPTIONAL JOB SPECIFICATIONS
#BSUB -P 123456              #Set billing account to 123456
#BSUB -u email_address       #Send all emails to email_address
#BSUB -B -N                  #Send email on job begin (-B) and end (-N)

##Your Commands After This Line
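
To check that a request like Example Two fits on a node, multiply the per-core memory by the number of cores requested per node and compare the result with the node's usable memory. A rough sketch, assuming 1 GB = 1024 MB and that the "rusage" value is reserved once per core, as the comments in the example state; the variable names are illustrative:

```shell
#!/bin/bash
PTILE=16                         # cores per node, from span[ptile=16]
MEM_PER_CORE_MB=14336            # from rusage[mem=14336]
NODE_MEM_MB=$(( 234 * 1024 ))    # usable memory per node (assumes 1 GB = 1024 MB)

# Memory reserved on each node and what is left over.
REQUESTED_MB=$(( PTILE * MEM_PER_CORE_MB ))
HEADROOM_MB=$(( NODE_MEM_MB - REQUESTED_MB ))

echo "Reserved per node: ${REQUESTED_MB} MB of ${NODE_MEM_MB} MB"
echo "Left unreserved:   ${HEADROOM_MB} MB"

# A negative headroom would mean the request cannot be satisfied by any node.
if [ "$HEADROOM_MB" -lt 0 ]; then
    echo "Request exceeds node memory; the job would never run."
fi
```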

Curie uses the same workload manager as Ada, so LSF commands and scripts are the same on both systems. For more information visit Ada Batch Processing.
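
As a sketch of the usual LSF workflow, the commands below save a trimmed copy of Example One to a file and submit it with `bsub`, which reads the job script from standard input. The file name `JobExample1.job` and the `echo` line inside the job are placeholders chosen for illustration, not names required by the system:

```shell
#!/bin/bash
# Hypothetical file name; any name works.
JOBFILE="JobExample1.job"

# Write a trimmed copy of Example One, plus one placeholder command, to the job file.
# The quoted 'EOF' keeps the contents literal, so nothing is expanded here.
cat > "$JOBFILE" <<'EOF'
#BSUB -J JobExample1
#BSUB -L /bin/bash
#BSUB -W 1:30
#BSUB -n 1
#BSUB -R "span[ptile=1]"
#BSUB -R "rusage[mem=2560]"
#BSUB -M 2560
#BSUB -o Example1Out.%J

echo "Running on $(hostname)"   # placeholder for your own commands
EOF

# Submit only where LSF is installed.
if command -v bsub >/dev/null 2>&1; then
    bsub < "$JOBFILE"    # submit the job file
    bjobs                # list your pending and running jobs
else
    echo "bsub not found; run this on a system where LSF is available."
fi
```

Once the job finishes, its output appears in the file named by the `-o` option, here `Example1Out.` followed by the job ID.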