
Bioinformatics: Data Normalization, Clustering & Collapsing

From TAMU HPRC
Latest revision as of 13:32, 3 December 2021

Data Normalization, Clustering & Collapsing

BBNorm

GCATemplates available: no

BBNorm homepage: https://sourceforge.net/projects/bbmap/

 module spider BBMap

bbnorm.sh is the data normalization script included in the BBMap package. BBNorm is a k-mer-based error-correction and normalization tool.
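
A minimal sketch of a normalization run, assuming single-end reads in a hypothetical file reads.fastq. The in=, out=, target= and min= parameters are bbnorm.sh options; the values 100 and 5 are illustrative assumptions, not site recommendations. Load a BBMap module first (module spider BBMap lists the available versions).

```shell
# Hypothetical file names; replace with your own data.
reads_in="reads.fastq"
reads_out="normalized.fastq"

# target: desired depth after normalization (illustrative value)
# min:    k-mers below this depth are ignored when a read's depth is estimated (illustrative value)
bbnorm_cmd="bbnorm.sh in=${reads_in} out=${reads_out} target=100 min=5"

# Run only when the tool and the input file are present; otherwise just show the command.
if command -v bbnorm.sh >/dev/null 2>&1 && [ -f "$reads_in" ]; then
    $bbnorm_cmd   # unquoted on purpose: relies on word splitting, so avoid spaces in file names
else
    echo "Would run: $bbnorm_cmd"
fi
```

For paired-end data, bbnorm.sh also accepts in2= and out2= for the second read file.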

CD-HIT

GCATemplates available: no

CD-HIT homepage: http://weizhongli-lab.org/cd-hit/

 module spider CD-HIT

CD-HIT is a widely used program for clustering and comparing protein or nucleotide sequences.
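
A minimal sketch of a protein clustering run, assuming a hypothetical input file proteins.fasta. The -i, -o, -c and -n flags are cd-hit options; the 90% identity threshold is an illustrative choice. Load a CD-HIT module first (module spider CD-HIT lists the available versions).

```shell
# Hypothetical file names; replace with your own data.
seqs_in="proteins.fasta"
seqs_out="proteins_c90.fasta"

# -c: sequence identity threshold (0.9 = 90%, illustrative value)
# -n: word length (5 is the setting for protein thresholds of 0.7 and above)
cdhit_cmd="cd-hit -i ${seqs_in} -o ${seqs_out} -c 0.9 -n 5"

# Run only when the tool and the input file are present; otherwise just show the command.
if command -v cd-hit >/dev/null 2>&1 && [ -f "$seqs_in" ]; then
    $cdhit_cmd   # unquoted on purpose: relies on word splitting, so avoid spaces in file names
else
    echo "Would run: $cdhit_cmd"
fi
```

The run writes the representative sequences to the -o file and the cluster membership to a companion file with a .clstr suffix. For nucleotide sequences, the cd-hit-est program takes the same -i, -o and -c flags.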

FASTX-Toolkit

GCATemplates available: no

FASTX-Toolkit homepage: http://hannonlab.cshl.edu/fastx_toolkit/

 module spider FASTX-Toolkit

The fastx_collapser tool is included in the FASTX-Toolkit.

fastx_collapser collapses identical sequences in a FASTA/FASTQ file into a single sequence while maintaining read counts.
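
To illustrate what collapsing means, the sketch below emulates the idea with standard Unix tools on a tiny made-up FASTA file. It is not fastx_collapser itself: it assumes one sequence per line, and the rank-count header style is assumed to mirror what the tool writes. With a FASTX-Toolkit module loaded, the corresponding call would take the form shown in the comment.

```shell
# Build a tiny FASTA file in a temporary directory (made-up reads, one sequence per line).
workdir=$(mktemp -d)
printf '>r1\nACGT\n>r2\nGGCC\n>r3\nACGT\n>r4\nACGT\n' > "$workdir/reads.fa"

# With the module loaded, the real tool would be called as:
#   fastx_collapser -i reads.fa -o collapsed.fa

# Emulation: drop headers, count identical sequences, order by count (highest first),
# then write each unique sequence once with a ">rank-count" header.
collapsed=$(grep -v '^>' "$workdir/reads.fa" \
    | sort | uniq -c | sort -k1,1nr -k2,2 \
    | awk '{ printf ">%d-%d\n%s\n", NR, $1, $2 }')

echo "$collapsed"
# prints:
# >1-3
# ACGT
# >2-1
# GGCC

rm -r "$workdir"
```

The three copies of ACGT collapse into one record that carries the count 3, so no abundance information is lost.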