Bioinformatics Tool Categories (Overview)
- Sequence QC
- Data Normalization, Clustering & Collapsing
- PacBio Tools
- Oxford Nanopore Tools
- Genome Assembly
- Metagenomics
- RNA-seq
- ChIP-seq
- Sequence Variants (SNPs and indels)
- CNV
- Methylation
- Sequence Alignments
- Multiple Sequence Alignments
- Proteomics
- Population Genomics
- Phylogenetics
- Genotyping
- Gene Networks
- Chromatin Structure
- Software Suites
- Common Tools
- Data Visualization
- Statistics
- File Format Tools
- Licenses
- Aspera (SRA, 1000genomes, BioMart)
- Conda/Bioconda
- Biocontainers
- FAQ
This page summarizes most of the NGS bioinformatics tools on the HPRC clusters.
It does not cover every NGS bioinformatics tool installed on each cluster.
A complete software module listing for each cluster can be found here:
Software installed as a conda environment rather than as a module can be listed by loading an Anaconda module and running conda env list, as in the following two commands:
module load Anaconda/3-5.0.0.1
conda env list
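The output of `conda env list` can be long, so it may help to narrow it to the tool you are looking for. The following is a minimal sketch, not an HPRC-provided utility: the helper name `filter_envs` and the keyword `busco` are illustrative assumptions. It reads the listing on standard input, skips comment lines, and prints environments whose name contains the keyword (case-insensitive).

```shell
# Print conda environments whose name contains the given keyword.
# Expects the output of `conda env list` on stdin. Lines starting with '#'
# are headers and are skipped. For environments that have no name, conda
# prints only the path, so the path is what gets matched and printed.
filter_envs() {
    keyword="$1"
    awk -v kw="$keyword" '
        !/^#/ && NF > 0 && index(tolower($1), tolower(kw)) { print $1 }
    '
}

# Usage on the cluster (module name taken from the commands above;
# "busco" is only an example keyword):
#   module load Anaconda/3-5.0.0.1
#   conda env list | filter_envs busco
```

Reading from standard input keeps the helper independent of how conda is loaded, so it can be reused with any Anaconda module version.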
Check the software package's website to see whether the version on the HPRC cluster is the latest available, and let us know if a newer version needs to be installed.
Genome Fasta, Index Files and Databases
The following genomes are available on Ada and can be accessed via the command line.
Link to these files from your job script rather than copying them into your own directories.
Genomic Reference Sequences, Indexes and Databases | |
---|---|
All of NCBI's BLAST5 databases (nt, nr, rna, 16SMicrobial,...) | /scratch/data/bio/blast/ |
The latest UCSC Genomes | /scratch/data/bio/ucsc/gbdb/ |
BUSCO lineage files | /scratch/data/bio/busco4/ /scratch/data/bio/busco5/ |
Silva database files | /scratch/data/bio/silva/ |
CESM | /scratch/data/bio/cesm/ |
Gemini | /scratch/data/bio/gemini/ |
Augustus species | $EBROOTAUGUSTUS/config/species/ (after loading AUGUSTUS module) |
Bowtie, Bowtie2 and BWA indexed genomes* | /scratch/data/bio/genome_indexes/ |
* bowtie indexes will work with any version of the bowtie tool but will not work with the bowtie2 tool
* bowtie2 indexes will work with any version of the bowtie2 tool but will not work with the bowtie tool
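Linking to a shared reference instead of copying it, and checking which aligner an index belongs to, can be sketched as follows. This is an illustration under stated assumptions: the function names `link_reference` and `index_type` are hypothetical, and the index check relies on the usual file extensions (`.ebwt` or `.ebwtl` for Bowtie, `.bt2` or `.bt2l` for Bowtie2) rather than on anything specific to the HPRC installation.

```shell
# Symlink a shared reference directory into a working directory
# instead of copying it. Both arguments are placeholders: use a path
# from the table above as the source and your own job directory as the target.
link_reference() {
    src="$1"      # shared directory, e.g. one under /scratch/data/bio/
    workdir="$2"  # your working directory
    if [ ! -d "$src" ]; then
        echo "missing reference directory: $src" >&2
        return 1
    fi
    mkdir -p "$workdir" || return 1
    # -s: symbolic link, -f: replace an existing link, -n: do not follow it
    ln -sfn "$src" "$workdir/$(basename "$src")"
}

# Guess which aligner an index prefix was built for by looking at the
# extension of its first index file.
index_type() {
    prefix="$1"
    for f in "$prefix".1.bt2 "$prefix".1.bt2l; do
        if [ -e "$f" ]; then echo bowtie2; return 0; fi
    done
    for f in "$prefix".1.ebwt "$prefix".1.ebwtl; do
        if [ -e "$f" ]; then echo bowtie; return 0; fi
    done
    echo unknown
    return 1
}

# Usage (the working directory and index prefix are hypothetical):
#   link_reference /scratch/data/bio/genome_indexes/ "$PWD/refs"
#   index_type "$PWD/refs/genome_indexes/<path-to-index-prefix>"
```

A symlink keeps a single copy of the data on disk and keeps paths in the job script short. Pointing the aligner at the absolute shared path directly works as well.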
Bioinformatics Web Resources
General Tutorials | |
---|---|
Tutorials | Cornell Virtual Workshop |
Genomic Databases | |
NCBI | National Center for Biotechnology Information |
Ensembl | Provides a bioinformatics framework to organise biology around the sequences of large genomes. |
EBI | The European Bioinformatics Institute |
JGI | Genome sequences of plants, fungi, microbes, and metagenomes |
DDBJ | DNA Data Bank of Japan |
ENCODE | ENCyclopedia Of DNA Elements |
HapMap | Identify and catalog genetic similarities and differences in human beings |
GOLD | Genomes Online Database: information regarding genome and metagenome sequencing projects |
DAVID | The Database for Annotation, Visualization and Integrated Discovery |
RepBase | Genetic Information Research Institute. RepBase: repetitive sequences database |
Gene Nomenclature and Info | |
HGNC | A curated online repository of HGNC-approved gene nomenclature. |
GeneCards | Information on all annotated and predicted human genes |
Sequencing Projects | |
HMP | Human Microbiome Project: characterizes microbial communities found at multiple human body sites |
1000 Genomes | A Deep Catalog of Human Genetic Variation |
UK10K | Rare Genetic Variants in Health and Disease |
Protein Sequence Databases | |
InterPro | Protein sequence analysis & classification |
PDB | Protein Data Bank |
Pfam | A large collection of protein families |
UniProt | Protein sequences and functional information |
Download Sequence Data | |
BioMart | Specific gene regions, Gene Ontology and Alternate IDs |
Ensembl | cDNA, GTF, VEP, CDS, Protein |
iGenome | Ensembl, NCBI, UCSC reference fasta; Bowtie, BWA and Bowtie2 index files |
RNA Databases | |
miRBase | A searchable database of published miRNA sequences and annotation |
RDP | Aligned and annotated Bacterial and Archaeal 16S rRNA sequences, and Fungal 28S rRNA sequences |
SILVA | A comprehensive on-line resource for quality checked and aligned ribosomal RNA sequence data |
Gene Expression Databases | |
Expression Atlas | EBI Expression Atlas |
BioXpress | Gene expression in cancer |
EMAGE | Mouse embryo in situ gene expression data |
Human Protein Atlas | Expression of human protein-coding genes based on both RNA and protein data |
PLEXdb | Gene expression in diverse organs, tissues, or developmental stages |
Metabolic Pathway Databases | |
KEGG | A database resource for understanding high-level functions and utilities of the biological system |
Reactome | A curated and peer reviewed pathway database |
GeneNetwork | Tools used to study complex networks of genes, molecules, gene function and phenotypes |
Model Organism Databases | |
TAIR | The Arabidopsis Information Resource |
CGD | Candida Genome Database |
FlyBase | A Database of Drosophila Genes & Genomes |
Gramene | Comparative functional genomics in crops and model plant species |
MaizeGDB | Zea mays database |
MGI | International database resource for the laboratory mouse |
RGD | Rat Genome Database |
Saccharomyces | Comprehensive integrated biological information for the budding yeast Saccharomyces cerevisiae |
VectorBase | Bioinformatics Resource for Invertebrate Vectors of Human Pathogens |
WormBase | Nematodes |
Xenbase | Xenopus Database |
ZFIN | The Zebrafish Model Organism Database |
Genome Browsers | |
UCSC | |
Genome Data Viewer | NCBI genome browser for 600+ RefSeq genome assemblies |
Ensembl | |
Forums and Techniques | |
RNA-Seq Blog | Blog on the latest RNA-seq experimental design and analysis techniques |
SeqAnswers | Forums on everything Bioinformatics |
BioStar | Forums on everything Bioinformatics |
Genohub | Designing your Next Generation Sequencing Run |
Genohub | Coverage and Read Depth Recommendations by Sequencing Application |
Software and Database Lists | |
NCBI | Databases available at NCBI |
NCBI | Tools available at NCBI |
ExPASy | Bioinformatics Resource Portal |
NAR | Nucleic Acids Research, published papers on databases |
Bioinformatics Web Tutorials | |
NCBI | NIH Online Bioinformatics Tutorials |
EMBL-EBI | Train online with EMBL-EBI (Home) |
EMBL-EBI | Train online with EMBL-EBI (Next Generation Sequencing Practical Course) |
Ensembl | Ensembl Tutorials and Worked Examples |
Data Carpentry | Genomics (QC, trim, call variants) |
Melbourne Bioinformatics | A few good tutorials including some for Galaxy |
Other Useful Tools | |
SAM Flags | Explain SAM/BAM bitwise flags |
SRA Explorer | Explore SRA by SRA/GEO id, organism, seq type, tissue type, ... |
SNPnexus | Web-based annotation of human SNPs |