
SW:mpiBLAST

From TAMU HPRC

mpiBLAST

GCATemplates available: no

 module spider mpiBLAST

mpiBLAST homepage

Make sure to add the following line to your job script:

 export I_MPI_MPD_RSH=ssh

Available formatted databases are found at the following locations, in either 38-fragment (38_frags) or 78-fragment (78_frags) versions:

/scratch/datasets/mpiblast/nt/38_frags
/scratch/datasets/mpiblast/nr/38_frags
/scratch/datasets/mpiblast/est_human/38_frags

/scratch/datasets/mpiblast/nt/78_frags
/scratch/datasets/mpiblast/nr/78_frags
/scratch/datasets/mpiblast/est_human/78_frags

If you select the 38-fragment databases, run your job on one node using 40 cores, since mpiBLAST requires 2 extra cores for its management processes (38 + 2 = 40).

If you select the 78-fragment databases, run your job on two nodes using 48 cores per node, since mpiBLAST requires 2 extra cores for its management processes.

Since mpiBLAST uses a config file called ~/.ncbirc, you should only run one mpiBLAST job at a time.

Your ~/.ncbirc file should reflect the number of frags you are using:
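Since a stale ~/.ncbirc pointing at the wrong fragment set is easy to overlook, a small pre-submission check can help. The helper below is a hypothetical sketch (not an HPRC tool); it simply greps your ~/.ncbirc for the fragment directory you intend to use:

```shell
# Sketch: confirm ~/.ncbirc references the fragment set you intend to use
# before submitting. check_ncbirc is a hypothetical helper, not part of
# mpiBLAST or the HPRC environment.
check_ncbirc() {
    local frags="$1" rcfile="${2:-$HOME/.ncbirc}"
    if grep -q "$frags" "$rcfile" 2>/dev/null; then
        echo "ok: $rcfile references $frags"
    else
        echo "warning: $rcfile does not reference $frags" >&2
        return 1
    fi
}
```

For example, `check_ncbirc 78_frags` before submitting a 78-fragment job, or `check_ncbirc 38_frags` before a 38-fragment one.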

Here is an example of the ~/.ncbirc file for the 78-fragment nt database. First create a directory called /scratch/user/Your_NetID/tmp_mpiblast.

[NCBI]
Data=/scratch/datasets/mpiblast/data

[BLAST]
BLASTDB=/scratch/datasets/mpiblast/nt/78_frags
BLASTMAT=/scratch/datasets/mpiblast/data

[mpiBLAST]
Shared=/scratch/datasets/mpiblast/nt/78_frags
Local=/scratch/user/Your_NetID/tmp_mpiblast

If you want your mpiBLAST job to finish much faster, add these lines to your job script. (Do not create the .ncbirc file yourself; let the job script create it for you, since it will use the $TMPDIR variable at runtime.)

echo "[NCBI]
Data=/scratch/datasets/mpiblast/data

[BLAST]
BLASTDB=/scratch/datasets/mpiblast/nt/78_frags
BLASTMAT=/scratch/datasets/mpiblast/data

[mpiBLAST]
Shared=/scratch/datasets/mpiblast/nt/78_frags
Local=$TMPDIR" > ~/.ncbirc

An example of a command using the 78-fragment nt database and the .ncbirc file above (-m 9 selects tabular output with comment lines) would look like this:

 mpirun -np 78 mpiblast -p blastn -d nt -i my_seqs.fa -o my_output.tsv -m 9 --use-parallel-write --removedb

And the Slurm parameters in your job script to accompany the above example:

#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
#SBATCH --cpus-per-task=1
#SBATCH --mem=56G
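Putting the pieces above together, a complete job script might look like the sketch below. The job name, walltime, and bare `module load mpiBLAST` line are assumptions; check `module spider mpiBLAST` for the exact module name and version to load.

```shell
#!/bin/bash
#SBATCH --job-name=mpiblast_nt    # hypothetical job name
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
#SBATCH --cpus-per-task=1
#SBATCH --mem=56G
#SBATCH --time=24:00:00           # assumed walltime; adjust as needed

module load mpiBLAST              # assumed module name; see module spider mpiBLAST
export I_MPI_MPD_RSH=ssh

# Write ~/.ncbirc at runtime so Local= points at the per-job $TMPDIR
echo "[NCBI]
Data=/scratch/datasets/mpiblast/data

[BLAST]
BLASTDB=/scratch/datasets/mpiblast/nt/78_frags
BLASTMAT=/scratch/datasets/mpiblast/data

[mpiBLAST]
Shared=/scratch/datasets/mpiblast/nt/78_frags
Local=$TMPDIR" > ~/.ncbirc

mpirun -np 78 mpiblast -p blastn -d nt -i my_seqs.fa -o my_output.tsv -m 9 --use-parallel-write --removedb
```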

If you format your own database, you can use a local copy of the BLAST matrices, which are found here:

 /scratch/datasets/mpiblast/data
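Fragmented mpiBLAST databases are produced with mpiformatdb, which wraps NCBI formatdb and splits the output into the requested number of fragments. A hedged sketch, to be run on the cluster with the mpiBLAST module loaded (my_db.fasta is a placeholder input; pick --nfrags to match the core counts discussed above):

```shell
# Format a nucleotide FASTA file (-p F) into 78 fragments for mpiBLAST.
# my_db.fasta is a hypothetical input file name.
mpiformatdb --nfrags=78 -i my_db.fasta -p F
```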