Metagenomics

Mothur

module spider Mothur

Open-source, platform-independent, community-supported software for describing and comparing microbial communities

See the page below for the command syntax to use when running mothur in command line mode, for example from a batch file.

https://mothur.org/wiki/Command_line_mode
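As a minimal sketch of running mothur non-interactively: a batch file lists one mothur command per line, while command line mode passes the commands in a single quoted string that starts with `#`. The file names `example.batch` and `example.fasta` are placeholders chosen for illustration, not files provided on the cluster, and the two mothur commands are only examples.

```shell
# Write a small mothur batch file: one mothur command per line.
# example.fasta is a hypothetical input file; replace it with your own.
cat > example.batch <<'EOF'
summary.seqs(fasta=example.fasta)
unique.seqs(fasta=example.fasta)
EOF

# Run mothur only if it is on PATH (after loading a Mothur module)
# and the placeholder input file actually exists.
if command -v mothur >/dev/null 2>&1 && [ -s example.fasta ]; then
    # Batch mode: pass the batch file as the only argument.
    mothur example.batch
    # Command line mode: one quoted string, prefixed with '#',
    # with commands separated by ';'.
    mothur "#summary.seqs(fasta=example.fasta); unique.seqs(fasta=example.fasta)"
else
    echo "mothur or example.fasta not found; only example.batch was written"
fi
```

Use `module spider Mothur` to find the module version to load before running either form.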

QIIME

GCATemplates available: no

QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.

QIIME homepage

PICRUSt

GCATemplates available: no

PICRUSt homepage

PICRUSt (pronounced “pie crust”) is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.

module spider PICRUSt2

SortMeRNA

GCATemplates available: no

SortMeRNA homepage

Fast filtering, mapping and OTU picking.

SortMeRNA is a tool for filtering, mapping, and OTU-picking of NGS reads in metatranscriptomic and metagenomic data.

# for Grace
module purge
module load Anaconda3/2021.11
source activate /sw/hprc/sw/Anaconda3/2021.11/envs/sortmerna-4.3.4
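Once the environment is active, a run might look like the sketch below. It assumes the `--ref`, `--reads`, `--workdir`, `--fastx`, and `--threads` options of SortMeRNA 4.x; the function name `run_sortmerna` and all file and directory names are hypothetical placeholders. Check `sortmerna -h` in the activated environment before relying on any option.

```shell
# Sketch: run SortMeRNA on one reads file, or print the command
# when the tool or the input files are not available here.
run_sortmerna() {
    ref="$1"      # reference rRNA database in FASTA format (placeholder)
    reads="$2"    # input reads file (placeholder)
    workdir="$3"  # directory for SortMeRNA index and output files

    if command -v sortmerna >/dev/null 2>&1 && [ -s "$ref" ] && [ -s "$reads" ]; then
        sortmerna --ref "$ref" --reads "$reads" --workdir "$workdir" --fastx --threads 4
    else
        echo "would run: sortmerna --ref $ref --reads $reads --workdir $workdir --fastx --threads 4"
    fi
}

# Hypothetical file names; replace them with your own data.
run_sortmerna rRNA_reference.fasta sample_reads.fastq.gz sortmerna_run
```

Printing the command when prerequisites are missing makes it easy to confirm the paths before submitting a job.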

MetaVelvet

GCATemplates available: no

MetaVelvet homepage

module load MetaVelvet

A short-read assembler for metagenomics.

MetaPhlAn

GCATemplates

Grace

MetaPhlAn homepage, tutorial, usage

module spider MetaPhlAn

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.

The MetaPhlAn binaries are located in the following directory:

ls $EBROOTMETAPHLAN/bin

MetaPhlAn2

GCATemplates available: no (see MetaPhlAn template)

MetaPhlAn2 homepage

module spider MetaPhlAn2

MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data with species-level resolution. From version 2.0, MetaPhlAn2 can also identify specific strains (in the not-so-frequent cases in which the sample contains a previously sequenced strain) and track strains across samples for all species.

The MetaPhlAn2 utilities are located in the following directory:

ls $EBROOTMETAPHLAN2/utils

To use the bacterial profiles provided with MetaPhlAn2 together with Bowtie2, pass the database location and the Bowtie2 executable on the command line:

metaphlan2.py --bowtie2db $EBROOTMETAPHLAN2/db_v20/mpa_v20_m200 --bowtie2_exe $EBROOTBOWTIE2/bin/bowtie2
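The options above still need an input file and an output destination. The sketch below wraps a full invocation in a hypothetical helper, `profile_sample`, and assumes the `--input_type`, `--nproc`, and `-o` options of `metaphlan2.py`; the sample and output file names are placeholders. Depending on the installed version, additional options may be required, so check `metaphlan2.py -h` after loading the module.

```shell
# Sketch: profile one FASTQ file with the database and Bowtie2 paths shown above.
profile_sample() {
    sample_fastq="$1"   # input reads (placeholder name supplied by the caller)
    profile_out="$2"    # output profile table (placeholder name)

    # The EBROOT* variables are only set after the modules are loaded.
    if [ -z "${EBROOTMETAPHLAN2:-}" ] || [ -z "${EBROOTBOWTIE2:-}" ]; then
        echo "EBROOTMETAPHLAN2 or EBROOTBOWTIE2 is not set; load the MetaPhlAn2 module first" >&2
        return 1
    fi
    if [ ! -s "$sample_fastq" ]; then
        echo "input file not found or empty: $sample_fastq" >&2
        return 1
    fi

    metaphlan2.py "$sample_fastq" \
        --input_type fastq \
        --bowtie2db "$EBROOTMETAPHLAN2/db_v20/mpa_v20_m200" \
        --bowtie2_exe "$EBROOTBOWTIE2/bin/bowtie2" \
        --nproc 4 \
        -o "$profile_out"
}

# Hypothetical file names; replace them with your own data.
profile_sample sample.fastq sample_profile.txt || echo "profiling skipped"
```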

SRST2

GCATemplates

Ada

SRST2 homepage

module spider SRST2

Molecular Genotyping of Bacterial Pathogens

This program is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes) and report the presence of sequence types (STs) and/or reference genes.

List of MLST databases: here

Before running SRST2, you must download the appropriate MLST database using the getmlst.py script provided with SRST2.

Here is an example of downloading the Salmonella enterica database from the UNIX command line into your working directory.

NOTE: Run these two commands on the login node command line, not in a job script, because the compute nodes do not have internet access.

module load SRST2/0.2.0-intel-2015B-Python-2.7.10
getmlst.py --species "Salmonella enterica"

Read the output of the getmlst.py command for any recommendations on running srst2:

  For SRST2, remember to check what separator is being used in this allele database

  Looks like --mlst_delimiter '_'

  >aroC_1  --> -->   ('aroC', '_', '1')

  Suggested srst2 command for use with this MLST database:

    srst2 --output test --input_pe *.fastq.gz --mlst_db Salmonella_enterica.fasta --mlst_definitions senterica.txt --mlst_delimiter '_'
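Because the database must be downloaded on the login node beforehand, a job script can verify that the files exist before starting SRST2. The sketch below uses a hypothetical helper, `check_mlst_files`, together with the file names from the suggested command above; it is one possible guard, not part of SRST2 itself.

```shell
# Return 0 only if every listed file exists and is non-empty.
check_mlst_files() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "Missing or empty: $f (run getmlst.py on the login node first)" >&2
            return 1
        fi
    done
    return 0
}

# Start SRST2 only when the database files are in place and srst2 is on PATH.
if check_mlst_files Salmonella_enterica.fasta senterica.txt && command -v srst2 >/dev/null 2>&1; then
    srst2 --output test --input_pe *.fastq.gz \
        --mlst_db Salmonella_enterica.fasta \
        --mlst_definitions senterica.txt \
        --mlst_delimiter '_'
else
    echo "SRST2 not started: database files or srst2 not available"
fi
```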