Latest revision as of 12:32, 3 December 2021
Data Normalization, Clustering & Collapsing
BBNorm
GCATemplates available: no
BBNorm homepage
module spider BBMap
bbnorm.sh is the data normalization script included in the BBMap package.
BBNorm is a k-mer-based error-correction and normalization tool.
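To make the idea of k-mer-based normalization concrete, the following is a minimal Python sketch of depth-based read normalization. It is an illustration only and does not reproduce BBNorm's implementation: the function names `kmers` and `normalize`, the median-depth rule, and the tiny `k` and `target` values are all assumptions chosen for readability, and reverse complements are ignored.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return every overlapping k-mer in seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize(reads, k=4, target=2):
    """Keep a read only while the median depth of its k-mers is below target.

    Hypothetical simplification: exact counts in a dict, single pass,
    no reverse-complement handling.
    """
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, nothing to count
        depth = median(counts[km] for km in read_kmers)
        if depth < target:
            kept.append(read)
            counts.update(read_kmers)  # only kept reads add to the depth
    return kept


# Five copies of an over-represented read plus one rare read.
reads = ["ACGTACGT"] * 5 + ["TTTTGGGG"]
normalized = normalize(reads, k=4, target=2)
print(normalized)
```

In this toy input the redundant copies of the abundant read are discarded once its k-mers reach the target depth, while the rare read is retained.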
CD-HIT
GCATemplates available: no
CD-HIT homepage
module spider CD-HIT
CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences.
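As a rough illustration of sequence clustering, the sketch below implements a greedy, longest-first scheme in Python: sequences are processed from longest to shortest, and each one either joins the first cluster whose representative it resembles or starts a new cluster. This is a sketch under stated assumptions, not CD-HIT's code. The names `identity` and `greedy_cluster` are hypothetical, and the similarity measure is `difflib.SequenceMatcher.ratio()` from the Python standard library, used here as a stand-in for a real alignment-based identity.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Stand-in similarity in [0, 1]; NOT an alignment-based sequence identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold=0.9):
    """Greedy incremental clustering, longest sequence first.

    Returns a list of (representative, members) tuples.
    """
    clusters = []
    for seq in sorted(seqs, key=len, reverse=True):
        for rep, members in clusters:
            if identity(rep, seq) >= threshold:
                members.append(seq)  # similar enough: join this cluster
                break
        else:
            clusters.append((seq, [seq]))  # no match: seq becomes a representative
    return clusters


seqs = ["MKTAYIAKQ", "GGGGGGGG", "MKTAYIAKQR"]
clusters = greedy_cluster(seqs, threshold=0.9)
for rep, members in clusters:
    print(rep, members)
```

Because the longest sequence in each cluster is seen first, it serves as the representative that later, shorter sequences are compared against.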
FASTX-Toolkit
GCATemplates available: no
FASTX-Toolkit homepage
module spider FASTX-Toolkit
The fastx_collapser tool is included in the FASTX-Toolkit.
It collapses identical sequences in a FASTQ or FASTA file into a single sequence while preserving read counts.
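The collapsing step can be sketched in a few lines of Python. This is an illustration of the idea rather than the tool's own code: the function names `collapse_reads` and `to_fasta` are hypothetical, the input is a plain list of sequence strings instead of a parsed FASTQ/FASTA file, and the `>rank-count` header style is assumed for the example.

```python
from collections import Counter


def collapse_reads(sequences):
    """Collapse identical sequences into (sequence, count) pairs.

    Pairs are sorted by count (highest first), then by sequence so that
    ties come out in a deterministic order.
    """
    counts = Counter(sequences)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def to_fasta(collapsed):
    """Format collapsed reads as FASTA text with '>rank-count' headers (assumed style)."""
    lines = []
    for rank, (seq, count) in enumerate(collapsed, start=1):
        lines.append(f">{rank}-{count}")
        lines.append(seq)
    return "\n".join(lines)


reads = ["ACGT", "TTGA", "ACGT", "ACGT", "TTGA", "GGCC"]
collapsed = collapse_reads(reads)
print(to_fasta(collapsed))
```

Each distinct sequence appears once in the output, and the count in its header records how many input reads it represents.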