Difference between revisions of "Bioinformatics:Data Normalization, Clustering & Collapsing"
Revision as of 11:38, 24 April 2017
NGS: Data Normalization, Clustering & Collapsing
BBNorm
GCATemplates available: no
BBNorm homepage: https://sourceforge.net/projects/bbmap/
module spider BBMap
bbnorm.sh is the data normalization script that is part of the BBMap package.
BBNorm is a k-mer-based error-correction and normalization tool.
CD-HIT
GCATemplates available: no
CD-HIT homepage: http://weizhongli-lab.org/cd-hit/
module spider CD-HIT
CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences.
FASTX-Toolkit
GCATemplates available: no
FASTX-Toolkit homepage: http://hannonlab.cshl.edu/fastx_toolkit/
module spider FASTX-Toolkit
The fastx_collapser tool is included in the FASTX-Toolkit.
It collapses identical sequences in a FASTQ/FASTA file into a single sequence while preserving the read counts.
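To make the idea of collapsing concrete, here is a minimal sketch that uses only standard Unix tools rather than fastx_collapser itself. It assumes a FASTA file with one sequence line per record; the file names (demo_reads.fa, demo_collapsed.fa) and the function name collapse_fasta are hypothetical and exist only for this illustration. The ">rank-count" header style is an assumption meant to resemble the kind of output a collapser produces, not a guarantee of the tool's exact format.

```shell
#!/usr/bin/env bash
# Illustration only: this is NOT fastx_collapser, just the same idea
# built from grep/sort/uniq/awk. All file names are hypothetical.

# Create a tiny demo FASTA file (one sequence line per record).
cat > demo_reads.fa <<'EOF'
>read1
ACGT
>read2
TTGA
>read3
ACGT
>read4
ACGT
>read5
TTGA
>read6
GGCC
EOF

collapse_fasta() {
    # 1. drop header lines, keeping only the sequences
    # 2. sort so identical sequences are adjacent, then count them
    # 3. order by count (highest first), ties broken by sequence
    # 4. write FASTA records with a ">rank-count" header
    grep -v '^>' "$1" \
        | LC_ALL=C sort \
        | uniq -c \
        | LC_ALL=C sort -k1,1nr -k2,2 \
        | awk '{ printf(">%d-%d\n%s\n", NR, $1, $2) }'
}

collapse_fasta demo_reads.fa > demo_collapsed.fa
cat demo_collapsed.fa
```

In this demo the six input reads reduce to three records, and the count in each header records how many reads shared that sequence. For real data, including multi-line FASTA or FASTQ input, use fastx_collapser from the module listed above.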