Revision as of 13:32, 3 December 2021 by Cmdickens (talk | contribs)

Data Normalization, Clustering & Collapsing


GCATemplates available: no

BBNorm homepage

 module spider BBMap

bbnorm.sh is the data normalization script that is part of the BBMap package.

BBNorm is a k-mer-based error-correction and normalization tool.
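
To illustrate the general idea behind k-mer-based normalization, here is a minimal Python sketch. It is not BBNorm's actual algorithm; it is a simplified digital-normalization scheme that estimates each read's depth as the median count of its k-mers among the reads kept so far, and discards the read once that estimate reaches a target. The names `kmers` and `normalize` and the values `k=4` and `target=2` are assumptions chosen for the example.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return every overlapping k-mer in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize(reads, k=4, target=2):
    """Keep a read only while its estimated depth is below target.

    Depth is estimated as the median count of the read's k-mers
    among the reads that have been kept so far.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: no depth estimate possible
        depth = median(counts[km] for km in read_kmers)
        if depth < target:
            kept.append(read)
            counts.update(read_kmers)
    return kept


# Five copies of one read plus a single unrelated read (toy data).
reads = ["ACGTTGCAAG"] * 5 + ["TTGACCAGTA"]
print(normalize(reads))
```

With these toy reads, only two copies of the over-represented read are kept, while the unrelated low-coverage read passes through unchanged.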


GCATemplates available: no

CD-HIT homepage

 module spider CD-HIT

CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences.
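
As a rough illustration of sequence clustering, the Python sketch below uses a greedy incremental approach: sequences are processed from longest to shortest, and each one either joins the first cluster whose representative it resembles closely enough or starts a new cluster. The `identity` function is a crude stand-in based on `difflib` matching blocks, not a real sequence alignment, and the function names and the 0.9 threshold are assumptions for the example rather than CD-HIT's implementation. Input sequences are assumed to be non-empty.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Fraction of the shorter sequence covered by matching blocks.

    A crude similarity measure used only for illustration.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs, threshold=0.9):
    """Cluster sequences greedily, longest first.

    Returns a list of (representative, members) tuples.
    """
    clusters = []
    for seq in sorted(seqs, key=len, reverse=True):
        for rep, members in clusters:
            if identity(rep, seq) >= threshold:
                members.append(seq)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters.append((seq, [seq]))
    return clusters


seqs = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
for rep, members in greedy_cluster(seqs):
    print(rep, members)
```

In this toy input the first two sequences differ only at the last base and end up in one cluster, while the third forms a cluster of its own.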


GCATemplates available: no

FASTX-Toolkit homepage

 module spider FASTX-Toolkit

The fastx_collapser tool is included in the FASTX-Toolkit.

It collapses identical sequences in a FASTQ or FASTA file into a single sequence while preserving the read counts.
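
The collapsing step can be sketched in a few lines of Python. This is an illustration of the idea, not the tool's source code: `collapse` and `to_fasta` are hypothetical names, and the `>rank-count` header layout is an assumption modeled on the style of output fastx_collapser is understood to produce.

```python
from collections import Counter


def collapse(sequences):
    """Return (sequence, count) pairs, most abundant first."""
    return Counter(sequences).most_common()


def to_fasta(collapsed):
    """Format collapsed sequences as FASTA with '>rank-count' headers."""
    lines = []
    for rank, (seq, count) in enumerate(collapsed, start=1):
        lines.append(f">{rank}-{count}")
        lines.append(seq)
    return "\n".join(lines)


# Toy input: read sequences only, without headers or quality strings.
reads = ["ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC"]
print(to_fasta(collapse(reads)))
```

Each unique sequence appears once in the output, and the count in its header records how many input reads it replaced.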