Metagenomics
Mothur
GCATemplates available: no
Mothur homepage
module spider Mothur
Open-source, platform-independent, community-supported software for describing and comparing microbial communities
For the command syntax to use in command line mode (for example, in a batch job script), see:
https://mothur.org/wiki/Command_line_mode
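As a minimal sketch of command line mode: mothur accepts its commands as one quoted string that starts with `#`, with the commands separated by semicolons. The helper name `build_mothur_batch` and the input file `sample.fasta` are placeholders chosen for illustration and are not part of mothur.

```shell
#!/usr/bin/env bash
# Join mothur commands into the single "#cmd1; cmd2" string that
# command line mode expects. build_mothur_batch is a hypothetical helper.
build_mothur_batch() {
    local joined="" cmd
    for cmd in "$@"; do
        if [ -z "$joined" ]; then
            joined="$cmd"
        else
            joined="$joined; $cmd"
        fi
    done
    printf '#%s\n' "$joined"
}

# sample.fasta is a placeholder input file (assumption).
batch=$(build_mothur_batch "summary.seqs(fasta=sample.fasta)" "unique.seqs(fasta=sample.fasta)")

# Show the command that would be run.
echo "mothur \"$batch\""

# Run it only when mothur is on PATH (for example after loading a Mothur
# module) and the placeholder input file actually exists.
if command -v mothur >/dev/null 2>&1 && [ -f sample.fasta ]; then
    mothur "$batch"
fi
```

In a batch job script, the same quoted string can be passed to mothur directly after loading the module version reported by module spider Mothur.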
QIIME
GCATemplates available: no
QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.
QIIME homepage
PICRUSt
GCATemplates available: no
PICRUSt homepage
PICRUSt (pronounced “pie crust”) is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.
module spider PICRUSt2
SortMeRNA
GCATemplates available: no
SortMeRNA homepage
Fast filtering, mapping and OTU picking.
SortMeRNA is a tool for filtering, mapping, and OTU picking of NGS reads in metatranscriptomic and metagenomic data.
# for Grace
module purge
module load Anaconda3/2021.11
source activate /sw/hprc/sw/Anaconda3/2021.11/envs/sortmerna-4.3.4
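Once the environment is active, a run might look like the sketch below. It only prints the command (a dry run) so it can be reviewed first. The flag names are the ones I recall from the SortMeRNA 4 help text, so verify them with `sortmerna -h`. The helper name `sortmerna_cmd`, the file names `rRNA_db.fasta` and `sample.fastq.gz`, and the output directory are placeholders, not files provided on the cluster.

```shell
#!/usr/bin/env bash
# Print (dry run) a SortMeRNA 4 command for one reads file.
# Arguments: reference FASTA, reads file, work directory, thread count.
# sortmerna_cmd is a hypothetical helper; all file names are placeholders.
sortmerna_cmd() {
    local ref="$1" reads="$2" workdir="$3" threads="${4:-4}"
    printf 'sortmerna --ref %s --reads %s --workdir %s --fastx --aligned %s/rRNA --other %s/non_rRNA --threads %s\n' \
        "$ref" "$reads" "$workdir" "$workdir" "$workdir" "$threads"
}

# Reads matching the reference go to the --aligned prefix and the
# remaining reads go to the --other prefix.
sortmerna_cmd rRNA_db.fasta sample.fastq.gz sortmerna_out 4
```

After checking the printed command against `sortmerna -h`, paste it into the shell or job script where the environment has been activated.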
MetaVelvet
GCATemplates available: no
MetaVelvet homepage
module load MetaVelvet
A short read assembler for metagenomics
MetaPhlAn
MetaPhlAn homepage, tutorial, usage
module spider MetaPhlAn
MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
The MetaPhlAn binaries are located in the following directory:
ls $EBROOTMETAPHLAN/bin
MetaPhlAn2
GCATemplates available: no (see MetaPhlAn template)
MetaPhlAn2 homepage
module spider MetaPhlAn2
MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data with species-level resolution. From version 2.0, MetaPhlAn2 can also identify specific strains (in the not-so-frequent cases in which the sample contains a previously sequenced strain) and track strains across samples for all species.
The MetaPhlAn2 utilities are located in the following directory:
ls $EBROOTMETAPHLAN2/utils
To use the bacterial profiles provided with MetaPhlAn2 together with Bowtie2:
metaphlan2.py --bowtie2db $EBROOTMETAPHLAN2/db_v20/mpa_v20_m200 --bowtie2_exe $EBROOTBOWTIE2/bin/bowtie2
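The command above shows only the database and Bowtie2 options. A fuller invocation, sketched here as a dry run that prints the command, adds an input file, the input type, a thread count, and an output file. The helper name `metaphlan2_cmd` and the file names `sample.fastq` and `profile.txt` are placeholders. The `--input_type`, `--nproc`, and `-o` options are as I recall them from the metaphlan2.py help, so confirm them with `metaphlan2.py -h` for the installed version.

```shell
#!/usr/bin/env bash
# Print (dry run) a MetaPhlAn2 command for one FASTQ file.
# Arguments: reads file, output profile, number of threads.
# metaphlan2_cmd is a hypothetical helper; file names are placeholders.
metaphlan2_cmd() {
    local reads="$1" out="$2" nproc="${3:-4}"
    # The $EBROOT... variables are printed literally; they are expanded
    # when the printed command runs after the module has been loaded.
    printf 'metaphlan2.py %s --input_type fastq --bowtie2db $EBROOTMETAPHLAN2/db_v20/mpa_v20_m200 --bowtie2_exe $EBROOTBOWTIE2/bin/bowtie2 --nproc %s -o %s\n' \
        "$reads" "$nproc" "$out"
}

metaphlan2_cmd sample.fastq profile.txt 4
```

The printed command expects the MetaPhlAn2 module to be loaded first, so that `$EBROOTMETAPHLAN2` and `$EBROOTBOWTIE2` are defined.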
SRST2
SRST2 homepage
module spider SRST2
Molecular Genotyping of Bacterial Pathogens
This program is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes) and report the presence of STs and/or reference genes.
List of MLST databases: here
You must download the appropriate MLST database prior to running SRST2 using the getmlst.py script provided by SRST2.
Here is an example of downloading the Salmonella enterica database on the UNIX command line in your working directory prior to running SRST2.
NOTE: Run these two commands on the login node command line, not in a job script, because the compute nodes do not have internet access.
module load SRST2/0.2.0-intel-2015B-Python-2.7.10
getmlst.py --species "Salmonella enterica"
Read the output of the getmlst.py command for any recommendations on running srst2, for example:
For SRST2, remember to check what separator is being used in this allele database
Looks like --mlst_delimiter '_'
>aroC_1 --> --> ('aroC', '_', '1')
Suggested srst2 command for use with this MLST database:
srst2 --output test --input_pe *.fastq.gz --mlst_db Salmonella_enterica.fasta --mlst_definitions senterica.txt --mlst_delimiter '_'
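The separator check that getmlst.py recommends can also be done by hand. The sketch below splits an allele header such as `>aroC_1` into gene name, delimiter, and allele number, assuming the header ends in digits preceded by a single delimiter character. The function name `split_allele_header` is a hypothetical helper and is not part of SRST2.

```shell
#!/usr/bin/env bash
# Split an allele header like ">aroC_1" into "gene delimiter allele".
# Assumes the header ends in digits preceded by one delimiter character.
# split_allele_header is a hypothetical helper, not part of SRST2.
split_allele_header() {
    local header allele rest delim gene
    header=${1#>}                 # drop the leading ">"
    allele=${header##*[!0-9]}     # trailing run of digits
    rest=${header%"$allele"}      # everything before the allele number
    gene=${rest%?}                # all but the last character
    delim=${rest#"$gene"}         # the last character is the delimiter
    printf '%s %s %s\n' "$gene" "$delim" "$allele"
}

split_allele_header '>aroC_1'

# Inspect the first header of the downloaded database, if it is present.
if [ -f Salmonella_enterica.fasta ]; then
    split_allele_header "$(grep -m1 '^>' Salmonella_enterica.fasta)"
fi
```

The middle field is the value to pass to --mlst_delimiter in the srst2 command.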