Software Modules on the Ada Cluster

Last Updated: Mon Oct 14 00:00:02 CDT

The software available on the Ada cluster is listed in the table below. Click any package name for more information, such as the available versions and any additional documentation.

Name Description
3to2lib3to2 is a set of fixers that are intended to backport code written for Python version 3.x into Python version 2.x.
4ti2A software package for algebraic, geometric and combinatorial problems on linear spaces
AAFAAF constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment.
ABACAS2ABACAS2, a tool for ordering and orientating biosequences along a reference
ABAQUSFinite Element Analysis software for modeling, visualization and best-in-class implicit and explicit dynamics FEA.
ABINITABINIT is a package whose main program allows one to find the total energy, charge density and electronic structure of systems made of electrons and nuclei (molecules and periodic solids) within Density Functional Theory (DFT), using pseudopotentials and a planewave or wavelet basis.
ABRA2ABRA2 is an updated implementation of ABRA featuring: RNA support, Improved scalability (Human whole genomes now supported), Improved accuracy, Improved stability and usability (BWA is no longer required to run ABRA although we do recommend BWA as the initial aligner for DNA) URL:
ABRicateMass screening of contigs for antimicrobial resistance or virulence genes. It comes bundled with multiple databases: Resfinder, CARD, ARG-ANNOT, NCBI BARRGD, NCBI, EcOH, PlasmidFinder, Ecoli_VF and VFDB. URL:
absl-pyAbseil Python Common Libraries
ABySSAssembly By Short Sequences - a de novo, parallel, paired-end sequence assembler
ACMLACML provides a free set of thoroughly optimized and threaded math routines for HPC, scientific, engineering and related compute-intensive applications. ACML is ideal for weather modeling, computational fluid dynamics, financial analysis, oil and gas applications and more.
ACTCACTC converts independent triangles into triangle strips or fans.
AdapterRemovalAdapterRemoval searches for and removes remnant adapter sequences from High-Throughput Sequencing (HTS) data and (optionally) trims low quality bases from the 3' end of reads following adapter removal.
adapter_trimRemoves 3' adapter from Illumina sequencing of small RNAs where read length is greater than the size of RNAs.
ADDAADDA is an open-source parallel implementation of the discrete dipole approximation, capable to simulate light scattering by particles of arbitrary shape and composition in a wide range of particle sizes. URL:
adjustTextA small library for automatically adjustment of text position in matplotlib plots to minimize overlaps.
AdvisorVectorization Optimization and Thread Prototyping - Vectorize & thread code or performance “dies” - Easy workflow + data + tips = faster code faster - Prioritize, Prototype & Predict performance gain
AEGeAnThe AEGeAn Toolkit is designed for the Analysis and Evaluation of Genome Annotations. The toolkit includes a variety of analysis programs as well as a C library whose API provides access to AEGeAn's core functions and data structures.
AFNIAFNI is a set of C programs for processing, analyzing, and displaying functional MRI (FMRI) data - a technique for mapping human brain activity.
AGEntAGEnt is a program for identifying accessory genomic elements in bacterial genomes by using an in-silico subtractive hybridization approach against a core genome, such as those generated by the Spine algorithm. URL:
AGFusionAGFusion is a python package for annotating gene fusions from the human or mouse genomes. URL:
aiohttp" Async http client/server framework
AlbacoreAlbacore is a software project that provides an entry point to the Oxford Nanopore basecalling algorithms.
Algorithm-LoopsAlgorithm::Loops - Looping constructs: NestedLoops, MapCar*, Filter, and NextPermute* URL:
almaBTEThe almaBTE software package developed by this project extends the ShengBTE approach currently employed for homogeneous bulk materials, into the mesoscale, to fully describe thermal transport from the electronic ab initio level, through the atomistic one, all the way into the mesoscopic structure level.
amaskamask is a set of tools to to determine the affinity of MPI processes and OpenMP threads in a parallel environment.
AMBERAmber is a package of programs for molecular dynamics simulations of proteins and nucleic
AmberMiniA stripped-down set of just antechamber, sqm, and tleap.
AMOSThe AMOS consortium is committed to the development of open-source whole genome assembly software
AMPL-MPAn open-source library for mathematical programming.
AnacondaAnaconda environment for
Anaconda2Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern open data science analytics architecture.
Anaconda3Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern open data science analytics architecture.
Ancestry_HMMa hidden Markhov model
angsdProgram for analysing NGS data.
ANNOVARANNOVAR is an efficient software tool to utilize update-to-date information to functionally annotate genetic variants detected from diverse genomes (including human genome hg18, hg19, hg38, as well as mouse, worm, fly, yeast and many others).
ANSYSANSYS simulation software enables organizations to confidently predict how their products will operate in the real world. We believe that every product is a promise of something greater.
AnsysEMANSYS Electromagnetics Suite
antApache Ant is a Java library and command-line tool whose mission is to drive processes described in build files as targets and extension points dependent upon each other. The main known usage of Ant is the build of Java applications.
ANTLRANTLR, ANother Tool for Language Recognition, (formerly PCCTS) is a language tool that provides a framework for constructing recognizers, compilers, and translators from grammatical descriptions containing Java, C#, C++, or Python actions.
ANTsANTs extracts information from complex datasets that include imaging. ANTs is useful for managing, interpreting and visualizing multidimensional data.
anvioAnvi'o is an open-source, community-driven analysis and visualization platform for 'omics data.
APBSAPBS is a software package for modeling biomolecular solvation through solution of the Poisson-Boltzmann equation (PBE), one of the most popular continuum models for describing electrostatic interactions between molecular solutes in salty, aqueous media.
APRApache Portable Runtime (APR) libraries.
APR-utilApache Portable Runtime (APR) util libraries.
AragorntRNA (and tmRNA) detection
argparsePython command-line parsing library
argtableArgtable is an ANSI C library for parsing GNU style command line options with a minimum of fuss.
ARIBAARIBA is a tool that identifies antibiotic resistant genes by running local assemblies
ARKSScaffolding genome sequence assemblies using 10X Genomics GemCode/Chromium data. This project is a new kmer-based (alignment free) implementation of ARCS.
ArmadilloArmadillo is an open-source C++ linear algebra library (matrix maths) aiming towards a good balance between speed and ease of use. Integer, floating point and complex numbers are supported, as well as a subset of trigonometric and statistics functions.
arpack-ngARPACK is a collection of Fortran77 subroutines designed to solve large scale eigenvalue problems.
ArrayFireArrayFire is a general-purpose library that simplifies the process of developing software that targets parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices.
ArrowApache Arrow is a cross-language development platform for in-memory data.
ARTART is a set of simulation tools to generate synthetic next-generation sequencing reads
ARTSARTS is a radiative transfer model for the millimeter and sub-millimeter spectral range. There are a number of models mostly developed explicitly for the different sensors.
ASAP3ASAP is a calculator for doing large-scale classical molecular dynamics within the Campos Atomic Simulation Environment (ASE).
ASEASE is a python package providing an open source Atomic Simulation Environment in the Python scripting language. URL:
AssimuloAssimulo is a simulation package for solving ordinary differential equations.
astropyThe Astropy Project is a community effort to develop a single core package for Astronomy in Python and foster interoperability between Python astronomy packages. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
asyncoroPython framework for concurrent, distributed, asynchronous network programming with coroutines, asynchronous completions and message passing.
ATKATK provides the set of accessibility interfaces that are implemented by other toolkits and applications. Using the ATK interfaces, accessibility tools have full access to view and control running applications. URL:
AtkmmAtkmm is the official C++ interface for the ATK accessibility toolkit library.
ATLASATLAS (Automatically Tuned Linear Algebra Software) is the application of the AEOS (Automated Empirical Optimization of Software) paradigm, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical, linear algebra kernel library.
at-spi2-atkAT-SPI 2 toolkit bridge URL:
at-spi2-coreAssistive Technology Service Provider Interface. URL:
AUGUSTUSAUGUSTUS is a program that predicts genes in eukaryotic genomic sequences URL:
AutoconfAutoconf is an extensible package of M4 macros that produce shell scripts to automatically configure software source code packages. These scripts can adapt the packages to many kinds of UNIX-like systems without manual user intervention. Autoconf creates a configuration script for a package from a template file that lists the operating system features that the package can use, in the form of M4 macro calls.
AutoDock_VinaAutoDock Vina is an open-source program for doing molecular docking.
AutomakeAutomake: GNU Standards-compliant Makefile generator
AutotoolsThis bundle collect the standard GNU build tools: Autoconf, Automake and libtool
awscliUniversal Command Line Environment for AWS URL:
BactSNPBactSNP is a tool to identify SNPs among bacterial isolates.
BaderA fast algorithm for doing Bader's analysis on a charge density grid.
bamkitTools for common BAM file manipulations
bam-readA tool for reading BAM files developed exclusively by the Transrate tool.
BamToolsBamTools provides both a programmer's API and an end-user's toolkit for handling BAM files.
BandageBandage is a program for visualising de novo assembly graphs URL:
BarnapBarrnap predicts the location of ribosomal RNA genes in genomes. It supports bacteria (5S,23S,16S), archaea (5S,5.8S,23S,16S), mitochondria (12S,16S) and eukaryotes (5S,5.8S,28S,18S).
barrnapBarrnap (BAsic Rapid Ribosomal RNA Predictor) predicts the location of ribosomal RNA genes in genomes.
basemapThe matplotlib basemap toolkit is a library for plotting 2D data on maps in Python
batThe BAT Python package supports the processing and analysis of Bro data with Pandas, scikit-learn, and Spark
BayesTraitsBayesTraits is a computer package for performing analyses of trait evolution among groups of species for which a phylogeny or sample of phylogenies is available. This new package incoporates our earlier and separate programes Multistate, Discrete and Continuous. BayesTraits can be applied to the analysis of traits that adopt a finite number of discrete states, or to the analysis of continuously varying traits. Hypotheses can be tested about models of evolution, about ancestral states and about correlations among pairs of traits.
BazelBazel is a build tool that builds code quickly and reliably. It is used to build the majority of Google's software. URL:
bbFTPbbFTP is a file transfer software. It implements its own transfer protocol, which is optimized for large files (larger than 2GB) and secure as it does not read the password in a file and encrypts the connection information. bbFTP main features are: - Encoded username and password at connection - SSH and Certificate authentication modules - Multi-stream transfer - Big windows as defined in RFC1323 - On-the-fly data compression - Automatic retry - Customizable time-outs - Transfer simulation - AFS authentication integration - RFIO interface
bbftpPRObbftpPRO is a data transfer program - as opposed to ordinary file transfer programs, capable of transferring arbitrary data over LAN/WANs at parallel speed. bbftpPRO has been started at the Particle Physics Dept. of Weizmann Institute of Science as an enhancement of bbftp, developed at IN2P3, ref:
BBMapThis package includes BBMap, a short read aligner, as well as various other bioinformatic tools. BBMap: Short read aligner for DNA and RNA-seq data. Capable of handling arbitrarily large genomes with millions of scaffolds. Handles Illumina, PacBio, 454, and other reads; very high sensitivity and tolerant of errors and numerous large indels. Very fast. BBNorm: Kmer-based error-correction and normalization tool. Dedupe: Simplifies assemblies by removing duplicate or contained subsequences that share a target percent identity.
bcbio-nextgenA python toolkit providing best-practice pipelines for fully automated high throughput sequencing analysis. Validated, scalable, community developed variant calling, RNA-seq and small RNA analysis.
BCELThe Byte Code Engineering Library (Apache Commons BCEL™) is intended to give users a convenient way to analyze, create, and manipulate (binary) Java class files (those ending with .class).
BCFtoolsSamtools is a suite of programs for interacting with high-throughput sequencing data. BCFtools - Reading/writing BCF2/VCF/gVCF files and calling/filtering/summarising SNP and short indel sequence variants
bcl2fastq2bcl2fastq Conversion Software both demultiplexes data and converts BCL files generated by Illumina sequencing systems to standard FASTQ file formats for downstream analysis. URL:
BEAGLEBeagle version 4.0 performs genotype calling, genotype phasing, imputation of ungenotyped markers, and identity-by-descent segment detection.
beagle-libbeagle-lib is a high-performance library that can perform the core calculations at the heart of most Bayesian and Maximum Likelihood phylogenetics packages.
BeastBEAST is a cross-platform program for Bayesian MCMC analysis of molecular sequences. It is entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models. It can be used as a method of reconstructing phylogenies but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology. BEAST uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability. URL:
BeautifulSoupBeautiful Soup is a Python library designed for quick turnaround projects like screen-scraping.
BEDOPSBEDOPS is an open-source command-line toolkit that performs highly efficient and scalable Boolean and other set operations, statistical calculations, archiving, conversion and other management of genomic data of arbitrary scale. Tasks can be easily split by chromosome for distributing whole-genome analyses across a computational cluster.
BEDToolsThe BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM.
BerkeleyGWThe BerkeleyGW Package is a set of computer codes that calculates the quasiparticle properties and the optical responses of a large variety of materials from bulk periodic crystals to nanostructures such as slabs, wires and molecules.
BESSTBESST is a package for scaffolding genomic assemblies. It contains several modules for e.g. building a "contig graph" from available information, obtaining scaffolds from this graph, and accurate gap size information.
BGLRBayesian Generalized Linear Regression.
binutilsbinutils: GNU binary utilities
bioawkBioawk is an extension to Brian Kernighan's awk, adding the support of several common biological data formats, including optionally gzip'ed BED, GFF, SAM, VCF, FASTA/Q and TAB-delimited formats with column names.
Bio-DB-HTSRead files using HTSlib including BAM/CRAM, Tabix and BCF database files
Bio-EaselEasel is an ANSI C code library for computational analysis of biological sequences using probabilistic models. Easel is used by HMMER, the profile hidden Markov model software that underlies the Pfam protein families database, and by Infernal, the profile stochastic context-free grammar software that underlies the Rfam RNA family database. URL:
BiogemeBiogeme is an open source freeware designed for the maximum likelihood estimation of parametric models in general, with a special emphasis on discrete choice models.
biom-formatThe BIOM file format (canonically pronounced biome) is designed to be a general-use format for representing biological sample by observation contingency tables.
BioPerlBioperl is the product of a community effort to produce Perl code which is useful in biology. Examples include Sequence objects, Alignment objects and database searching objects.
BiopythonBiopython is a set of freely available tools for biological computation written in Python by an international team of developers. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. URL:
BismarkA tool to map bisulfite converted sequence reads and determine cytosine methylation states
BisonBison is a general-purpose parser generator that converts an annotated context-free grammar into a deterministic LR or generalized LR (GLR) parser employing LALR(1) parser tables.
bitarraybitarray provides an object type which efficiently represents an array of booleans
BitSeqBitSeq (Bayesian Inference of Transcripts from Sequencing Data) is an application for inferring expression levels of individual transcripts from sequencing (RNA-Seq) data and estimating differential expression (DE) between conditions. An advantage of this approach is the ability to account for both technical uncertainty and intrinsic biological variance in order to avoid false DE calls. The technical contribution to the uncertainty comes both from finite read-depth and the possibly ambiguous mapping of reads to multiple transcripts.
blasrBLASR is a method of mapping Single Molecule Sequencing (SMS) reads that are thousands to tens of thousands of bases long with divergence between the read and genome dominated by insertion and deletion error..
BLASTBasic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
BLAST+Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
BLATBLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more.
BlenderBlender is the free and open source 3D creation suite. It supports the entirety of the 3D pipeline-modeling, rigging, animation, simulation, rendering, compositing and motion tracking, even video editing and game creation.
Blitz++Blitz++ is a (LGPLv3+) licensed meta-template library for array manipulation in C++ with a speed comparable to Fortran implementations, while preserving an object-oriented interface
BlobToolsA modular command-line solution for visualisation, quality control and taxonomic partitioning of genome datasets.
BloscBlosc, an extremely fast, multi-threaded, meta-compressor library URL:
bmlThe basic matrix library (bml) is a collection of various matrix data formats (in dense and sparse) and their associated algorithms for basic matrix operations.
bmonbmon is a monitoring and debugging tool to capture networking related statistics and prepare them visually in a human friendly way.
bmtaggerBest Match Tagger for removing human reads from metagenomics datasets
bokehStatistical and novel interactive HTML plots for Python URL:
BoltzTraP2band-structure interpolator and transport coefficient calculator
BoostBoost provides free peer-reviewed portable C++ source libraries.
Boost.PythonBoost.Python is a C++ library which enables seamless interoperability between C++ and the Python programming language. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
BotanBotan (Japanese for peony) is a cryptography library written in C++11 and released under the permissive Simplified BSD license.
BowtieBowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome.
Bowtie2Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
BRAKER1BRAKER1: Unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS.
BreakDancerBreakDancer is a Perl/C++ package that provides genome-wide detection of structural variants from next generation paired-end sequencing reads URL:
BroadPeakBroadPeak broad peak calling algorithm for diffuse ChIP-seq datasets.
BsoftBsoft is a collection of programs and a platform for development of software for image and molecular processing in structural biology. Problems in structural biology are approached with a highly modular design, allowing fast development of new algorithms without the burden of issues such as file I/O. It provides an easily accessible interface, a resource that can be and has been used in other packages.
buildenvThis module sets a group of environment variables for compilers, linkers, maths libraries, etc., that you can use to easily transition between toolchains when building your software. To query the variables being set please use: module show <this module name> URL: None
BUSCOBUSCO provides measures for quantitative assessment of genome assembly, gene set, and transcriptome completeness based on evolutionarily informed expectations of gene content from near-universal single-copy orthologs selected from OrthoDB.
BWABurrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome.
bwidgetThe BWidget Toolkit is a high-level Widget Set for Tcl/Tk built using native Tcl/Tk 8.x namespaces. URL:
bx-pythonThe bx-python project is a Python library and associated set of scripts to allow for rapid implementation of genome scale analyses.
byaccBerkeley Yacc (byacc) is generally conceded to be the best yacc variant available. In contrast to bison, it is written to avoid dependencies upon a particular compiler.
bzip2bzip2 is a freely available, patent free, high-quality data compressor. It typically compresses files to within 10% to 15% of the best available techniques (the PPM family of statistical compressors), whilst being around twice as fast at compression and six times faster at decompression.
C3DConvert3D Medical Image Processing Tool
CaffeCaffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors.
cairoCairo is a 2D graphics library with support for multiple output devices. Currently supported output targets include the X Window System (via both Xlib and XCB), Quartz, Win32, image buffers, PostScript, PDF, and SVG file output. Experimental backends include OpenGL, BeOS, OS/2, and DirectFB
cairocfficffi-based cairo bindings for Python
cairommThe Cairomm package provides a C++ interface to Cairo.
CanuCanu is a fork of the Celera Assembler designed for high-noise single-molecule sequencing (such as the PacBio RSII or Oxford Nanopore MinION).
CapnProtoCap’n Proto is an insanely fast data interchange format and capability-based RPC system.
CargoThe Rust package manager
cath-resolve-hitsCollapse a list of domain matches to your query sequence(s) down to the non-overlapping subset (ie domain architecture) that maximises the sum of the hits' scores.
causalmlCausal ML: A Python Package for Uplift Modeling and Causal Inference with ML URL:
CD-HITCD-HIT is a very widely used program for clustering and comparing protein or nucleotide sequences. URL:
CDOCDO is a collection of command line Operators to manipulate and analyse Climate and NWP model Data.
cdsapiClimate Data Store API Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
CEGMACEGMA (Core Eukaryotic Genes Mapping Approach) is a pipeline for building a set of high reliable set of gene annotations in virtually any eukaryotic genome.
CellRangerCell Ranger is a set of analysis pipelines that process Chromium single-cell RNA-seq output to align reads, generate gene-cell matrices and perform clustering and gene expression analysis. URL:
CentrifugeClassifier for metagenomic sequences. Centrifuge is a novel microbial classification engine that enables rapid, accurate and sensitive labeling of reads and quantification of species
CESMThe Community Earth System Model (CESM) is a coupled climate model for simulating the earth's climate system. Composed of four separate models simultaneously simulating the earth's atmosphere, ocean, land surface and sea-ice, and one central coupler component, the CESM allows researchers to conduct fundamental research into the earth's past, present and future climate states.
CESM-depsCESM is a fully-coupled, community, global climate model that provides state-of-the-art computer simulations of the Earth's past, present, and future climate states. URL:
cffiPython http for humans
CFITSIOCFITSIO is a library of C and Fortran subroutines for reading and writing data files in FITS (Flexible Image Transport System) data format. URL:
cftimeTime-handling functionality from netcdf4-python
CGALThe goal of the CGAL Open Source Project is to provide easy access to efficient and reliable geometric algorithms in the form of a C++ library. URL:
CGATCGAT is a collection of tools for the computational genomicist written in the python language.
CGNSThe CGNS system is designed to facilitate the exchange of data between sites and applications, and to help stabilize the archiving of aerodynamic data.
Charm++Charm++ is a parallel programming framework in C++, supported by an adaptive runtime system, which enhances user productivity and allows programs to run portably from small multicore computers (your laptop) to the largest supercomputers.
CHARMMCHARMM is a versatile and widely used molecular simulation program with broad application to many-particle sy stems. Use charmm for the serial version and charmm-mpi for the mpi version. Environmental variable $TREXHOME is set to the location of the AdHocTrex directory, which inludes examples and scripts.\\ This module has restricted access.
CheckMCheckM provides a set of tools for assessing the quality of genomes recovered from isolates, single cells, or metagenomes.
CheetahCheetah is an open source template engine and code generation tool.
CheMPS2CheMPS2 is a scientific library which contains a spin-adapted implementation of the density matrix renormalization group (DMRG) for ab initio quantum chemistry. URL:
ChimeraUCSF Chimera is a highly extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles.
ciftifyThe tools of the Human Connectome Project (HCP) adapted for working with non-HCP datasets
CirclatorCirclator will attempt to identify each circular sequence and output a linearised version of it. It does this by assembling all reads that map to contig ends and comparing the resulting contigs with the input assembly.
CircosCircos is a software package for visualizing data and information. It visualizes data in a circular layout - this makes Circos ideal for exploring relationships between objects or positions.
cisTEMcisTEM is user-friendly software to process cryo-EM images of macromolecular complexes and obtain high-resolution 3D reconstructions from them.
ClangC, C++, Objective-C compiler, based on LLVM. Does not include C++ standard library -- use libstdc++ from GCC.
CLHEPThe CLHEP project is intended to be a set of HEP-specific foundation and utility classes such as random generators, physics vectors, geometry and linear algebra. CLHEP is structured in a set of packages independent of any external package.
clickA simple wrapper around optparse for powerful command line utilities.
CLISPCommon Lisp is a high-level, general-purpose, object-oriented, dynamic, functional programming language.
ClustAGEClustAGE is a command-line tool built using the Perl scripting language for the purpose of analyzing and comparing accessory genomic elements (AGEs) between genomes. URL:
Clustal-OmegaClustal Omega is a multiple sequence alignment program for proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. Evolutionary relationships can be seen via viewing Cladograms or Phylograms
ClustalW2ClustalW2 is a general purpose multiple sequence alignment program for DNA or proteins.
CMakeCMake, the cross-platform, open-source build system. CMake is a family of tools designed to build, test and package software.
CNVnatorA tool for CNV discovery and genotyping from depth-of-coverage by mapped reads.
coloramaCross-platform colored terminal text.
colorspaceColor Space Manipulation
CONCOCTClustering cONtigs with COverage and ComposiTion (CONCOCT) is a program for unsupervised binning of metagenomic contigs by using nucleotide composition, coverage data in multiple samples and linkage data from paired end reads.
configparserconfigparser is a Python library that brings the updated configparser from Python 3.5 to Python 2.6-3.5
configurable-http-proxyHTTP proxy for node.js including a REST API for updating the routing table. Developed as a part of the Jupyter Hub multi-user server.
CONTRACONTRA is a tool for copy number variation (CNV) detection for targeted resequencing data such as those from whole-exome capture data. CONTRA calls copy number gains and losses for each target region with key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and the estimation of log-ratio variations via binning and interpolation. It takes standard alignment formats (BAM/SAM) and output in variant call format (VCF 4.0) for easy integration with other next generation sequencing analysis package.
ConvergeCFDConverge CFD software by Convergent Science URL:
ConvergeStudioConverge Studio software by Convergent Science URL:
CopereadCOPE (Connecting Overlapped Pair-End reads) is a method to align and connect the illumina sequenced Pair-End reads of which the insert size is smaller than the sum of the two read length.
CoreutilsThe GNU Core Utilities are the basic file, shell and text manipulation utilities of the GNU operating system. These are the core utilities which are expected to exist on every operating system.
cornerMake some beautiful corner plots.
coverageCoverage.py is a tool for measuring code coverage of Python programs. It monitors your program, noting which parts of the code have been executed, then analyzes the source to identify code that could have been executed but was not.
cowsayConfigurable talking characters in ASCII art
CP2KCP2K is a freely available (GPL) program, written in Fortran 95, to perform atomistic and molecular simulations of solid state, liquid, molecular and biological systems. It provides a general framework for different methods such as density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW), and classical pair and many-body potentials.
CPLEXIBM ILOG CPLEX Optimizer's mathematical programming technology enables analytical decision support for improving efficiency, reducing costs, and increasing profitability.
CppUnitCppUnit is the C++ port of the famous JUnit framework for unit testing.
cramCram is a functional testing framework for command line applications. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
CRF++CRF++ is a simple, customizable, and open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data. CRF++ is designed for generic purpose and will be applied to a variety of NLP tasks, such as Named Entity Recognition, Information Extraction and Text Chunking.
cryptographycryptography is a package which provides cryptographic recipes and primitives to Python developers.
csvkitcsvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
CubeGUICube, which is used as performance report explorer for Scalasca and Score-P, is a generic tool for displaying a multi-dimensional performance space consisting of the dimensions (i) performance metric, (ii) call path, and (iii) system resource. Each dimension can be represented as a tree, where non-leaf nodes of the tree can be collapsed or expanded to achieve the desired level of granularity. This module provides the Cube graphical report explorer.
CubeLibCube, which is used as performance report explorer for Scalasca and Score-P, is a generic tool for displaying a multi-dimensional performance space consisting of the dimensions (i) performance metric, (ii) call path, and (iii) system resource. Each dimension can be represented as a tree, where non-leaf nodes of the tree can be collapsed or expanded to achieve the desired level of granularity. This module provides the Cube general purpose C++ library component and command-line tools.
CubeWriterCube, which is used as performance report explorer for Scalasca and Score-P, is a generic tool for displaying a multi-dimensional performance space consisting of the dimensions (i) performance metric, (ii) call path, and (iii) system resource. Each dimension can be represented as a tree, where non-leaf nodes of the tree can be collapsed or expanded to achieve the desired level of granularity. This module provides the Cube high-performance C writer library component.
CUDACUDA (formerly Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented by the graphics processing units (GPUs) that they produce. CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs.
cuDNNThe NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.
CufflinksTranscript assembly, differential expression, and differential regulation for RNA-Seq
cURLlibcurl is a free and easy-to-use client-side URL transfer library, supporting DICT, FILE, FTP, FTPS, Gopher, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMTP, SMTPS, Telnet and TFTP. libcurl supports SSL certificates, HTTP POST, HTTP PUT, FTP uploading, HTTP form based upload, proxies, cookies, user+password authentication (Basic, Digest, NTLM, Negotiate, Kerberos), file transfer resume, http proxy tunneling and more.
cutadaptCutadapt finds and removes adapter sequences, primers, poly-A tails and other types of unwanted sequence from your high-throughput sequencing reads.
CVXOPTCVXOPT is a free software package for convex optimization based on the Python programming language. Its main purpose is to make the development of software for convex optimization applications straightforward by building on Python's extensive standard library and on the strengths of Python as a high-level programming language. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
CVXPYCVXPY is a Python-embedded modeling language for convex optimization problems. It allows you to express your problem in a natural way that follows the math, rather than in the restrictive standard form required by solvers. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
CWPSUSeismic Unix is an open source seismic utilities package supported by the Center for Wave Phenomena (CWP) at the Colorado School of Mines (CSM).
CyclerComposable style cycles
CythonThe Cython language makes writing C extensions for the Python language as easy as Python itself. Cython is a source code translator based on the well-known Pyrex, but supports more cutting edge functionality and optimizations.
CyToolzCython implementation of the toolz package, which provides high performance utility functions for iterables, functions, and dictionaries.
cyvcf2cython + htslib == fast VCF and BCF processing. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
DakotaDakota software's advanced parametric analyses enable design exploration, model calibration, risk analysis, and quantification of margins and uncertainty with computational models.
damageprotoMonitoring the regions affected by rendering has wide-spread use, from VNC-like systems scraping the screen to screen magnifying applications designed to aid users with limited visual acuity. The DAMAGE extension is designed to make such applications reasonably efficient in the face of server-client latency.
daskDask natively scales Python. Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love.
datamashGNU datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
DBBerkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects.
DBD-mysqlPerl binding for MySQL
DB_FilePerl5 access to Berkeley DB version 1.x.
DBG2OLCDBG2OLC: Efficient Assembly of Large Genomes Using Long Erroneous Reads of the Third Generation Sequencing Technologies
DBusD-Bus is a message bus system, a simple way for applications to talk to one another. In addition to interprocess communication, D-Bus helps coordinate process lifecycle; it makes it simple and reliable to code a "single instance" application or daemon, and to launch applications and daemons on demand when their services are needed.
dbus-glibD-Bus is a message bus system, a simple way for applications to talk to one another.
dDocentdDocent is a simple bash wrapper to QC, assemble, map, and call SNPs from almost any kind of RAD sequencing. If you have a reference already, dDocent can be used to call SNPs from almost any type of NGS data set.
dealiideal.II is a C++ software library supporting the creation of finite element codes and an open community of users and developers.
deal.IIdeal.II is a C++ program library targeted at the computational solution of partial differential equations using adaptive finite elements.
deepdiffDeepDiff: Deep Difference of dictionaries, iterables and almost any other object recursively.
Delft3DDelft3D is Open Source Software. To enhance collaboration, to combine the unique expertise of researchers worldwide and to further expand the modelling suite, the source code of Delft3D 4 Suite can be downloaded. The following modules are available: FLOW + MOR + WAVE + WAQ (DELWAQ) + PART.
DellyDelly is an integrated structural variant (SV) prediction method that can discover, genotype and visualize deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data.
DendroPyA Python library for phylogenetics and phylogenetic computing: reading, writing, simulation, processing and manipulation of phylogenetic trees (phylogenies) and characters. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
DETONATEDETONATE (DE novo TranscriptOme rNa-seq Assembly with or without the Truth Evaluation) consists of two component packages, RSEM-EVAL and REF-EVAL. Both packages are mainly intended to be used to evaluate de novo transcriptome assemblies, although REF-EVAL can be used to compare sets of any kinds of genomic sequences.
DEXTRACTORThe Dextractor commands allow one to pull exactly and only the information needed for assembly and reconstruction from the source HDF5 files produced by the PacBio RS II sequencer, or from the source BAM files produced by the PacBio Sequel sequencer.
DFT-D3DFT-D3 implements a dispersion correction for density functionals, Hartree-Fock and semi-empirical quantum chemical methods.
DIAMONDAccelerated BLAST compatible local sequence aligner
dichromatColor Schemes for Dichromats
DIDADIDA is a novel framework that performs the large-scale alignment tasks by distributing the indexing and alignment stages into smaller subtasks over a cluster of compute nodes.
digestCreate Compact Hash Digests of R Objects
dilldill extends Python's pickle module for serializing and de-serializing Python objects to the majority of the built-in Python types. Serialization is the process of converting an object to a byte stream; the inverse is converting a byte stream back into a Python object hierarchy. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
discovarDISCOVAR de novo can generate de novo assemblies for both large and small genomes. It currently does not call variants.
dispyDistributed and Parallel Computing with/for Python.
DMTCPDMTCP is a tool to transparently checkpoint the state of multiple simultaneous applications, including multi-threaded and distributed applications. It operates directly on the user binary executable, without any Linux kernel modules or other kernel modifications.
DocutilsDocutils is an open-source text processing system for processing plaintext documentation into useful formats, such as HTML, LaTeX, man-pages, open-document or XML. It includes reStructuredText, the easy to read, easy to use, what-you-see-is-what-you-get plaintext markup language.
DOLFINDOLFIN is the C++/Python interface of FEniCS, providing a consistent PSE (Problem Solving Environment) for ordinary and partial differential equations.
DomainFinderConverts manually curated CATH structural domain hierarchy used to search UniProt, RefSeq and Ensembl protein sequences into simple multi-domain architectures
dos2unixUNIX to DOS/MAC and vice versa text file format converter
double-conversionEfficient binary-decimal and decimal-binary conversion routines for IEEE doubles.
DoxygenDoxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors), Fortran, VHDL, PHP, C#, and to some extent D.
DrakeDrake is a simple-to-use, extensible, text-based data workflow tool that organizes command execution around data and its dependencies.
dtcmpDatatype Compare (DTCMP) Library for sorting and ranking distributed data using MPI
EasyBuildEasyBuild is a software build and installation framework written in Python that allows you to install software in a structured, repeatable and robust way.
EasyBuild-adaEasyBuild environment variables for building system software on
EasyBuild-ada-REasyBuild environment variables for building software for the experimental R_modules on
EasyBuild-ada-restricted-amberEasyBuild environment variables for building restricted software Amber on
EasyBuild-ada-restricted-econstatEasyBuild environment variables for building restricted software for the econstat group on
EasyBuild-ada-restricted-junjiezEasyBuild environment variables for building software on for the junjiez group
EasyBuild-ada-restricted-math_madymoEasyBuild environment variables for building restricted software Amber on
EasyBuild-ada-restricted-orcaEasyBuild environment variables for building restricted software for ORCA on
EasyBuild-ada-restricted-tamamis-sharedEasyBuild environment variables for building restricted software for group tamamis-shared
EasyBuild-ada-restricted-tamuscEasyBuild environment variables for building restricted software for HPRC on
EasyBuild-ada-restricted-tecplotgrpEasyBuild environment variables for building restricted software Amber on
EasyBuild-ada-restricted-vaspEasyBuild environment variables for building restricted software VASP on
EasyBuild-ada-SCRATCHUser EasyBuild environment for in $SCRATCH/eb
ea-utilsCommand-line tools for processing biological sequencing data. Barcode demultiplexing, adapter trimming, etc. Primarily written to support an Illumina based pipeline - but should work with any FASTQs.
ecCodesecCodes is a package developed by ECMWF which provides an application programming interface and a set of tools for decoding and encoding messages in the following formats: WMO FM-92 GRIB edition 1 and edition 2, WMO FM-94 BUFR edition 3 and edition 4, WMO GTS abbreviated header (only decoding).
ecoPCRecoPCR helps you estimate Barcode primers quality. In conjunction with OBITools, you can postprocess ecoPCR output to compute barcode coverage and barcode specificity.
EigenEigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.
EIGENSOFTThe EIGENSOFT package combines functionality from our population genetics methods (Patterson et al. 2006) and our EIGENSTRAT stratification correction method (Price et al. 2006). The EIGENSTRAT method uses principal components analysis to explicitly model ancestry differences between cases and controls along continuous axes of variation; the resulting correction is specific to a candidate marker’s variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. The EIGENSOFT package has a built-in plotting script and supports multiple file formats and quantitative phenotypes.
elfutilsThe elfutils project provides libraries and tools for ELF files and DWARF data.
ELI5ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions.
ElkAn all-electron full-potential linearised augmented-plane wave (FP-LAPW) code with many advanced features. Written originally at Karl-Franzens-Universität Graz as a milestone of the EXCITING EU Research and Training Network, the code is designed to be as simple as possible so that new developments in the field of density functional theory (DFT) can be added quickly and reliably.
ELPAEigenvalue SoLvers for Petaflop-Applications.
EmacsGNU Emacs is an extensible, customizable text editor—and more. At its core is an interpreter for Emacs Lisp, a dialect of the Lisp programming language with extensions to support text editing.
EMAN2EMAN2 is the successor to EMAN1. It is a broadly based greyscale scientific image processing suite with a primary focus on processing data from transmission electron microscopes.
EMBOSSEMBOSS is 'The European Molecular Biology Open Software Suite'. EMBOSS is a free Open Source software analysis package specially developed for the needs of the molecular biology (e.g. EMBnet) user community.
EmbreeEmbree is a collection of high-performance ray tracing kernels, developed at Intel. The target users of Embree are graphics application engineers who want to improve the performance of their applications by leveraging the optimized ray tracing kernels of Embree. The kernels are optimized for photo-realistic rendering on the latest Intel® processors with support for SSE, AVX, AVX2, and AVX512. Embree supports runtime code selection to choose the traversal and build algorithms that best match the instruction set of your CPU.
emceeEmcee is an extensible, pure-Python implementation of Goodman & Weare's Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. It's designed for Bayesian parameter estimation and it's really sweet! Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
enaBrowserToolenaBrowserTools is a set of scripts that interface with the ENA web services to download data from ENA easily, without any knowledge of scripting required.
ensemblEnsembl Core API
ensembl-comparaThe ensembl-io repo is intended as a shared codebase for handling the parsing and writing of popular biological formats used by Ensembl, such as BED, BigWig and FASTA. For a full list of supported formats, see the child objects in modules/Bio/EnsEMBL/IO/Parser/.
ensembl-funcgenThe Funcgen database contains currently 4 different types of data which can be accessed through the API. 1. Regulatory Features 2. Segmentation 3. Microarray Probe Mappings 4. External Regulatory Data
ensembl-ioThe ensembl-io repo is intended as a shared codebase for handling the parsing and writing of popular biological formats used by Ensembl, such as BED, BigWig and FASTA. For a full list of supported formats, see the child objects in modules/Bio/EnsEMBL/IO/Parser/.
ensembl-variationThe Ensembl Variation API (Application Programme Interface) serves as a middle layer between the underlying MySQL database and the user's script. It aims to encapsulate the database layout by providing high level access to the database.
entrypointsEntry points are a way for Python packages to advertise objects with some common interface.
ESMFThe Earth System Modeling Framework (ESMF) is software for building and coupling weather, climate, and related models.
etaETA Progress bar for command-line utilities
ETSF_IOA library of F90 routines to read/write the ETSF file format has been written. It is called ETSF_IO and available under LGPL.
eudeveudev is a fork of systemd-udev with the goal of obtaining better compatibility with existing software such as OpenRC and Upstart, older kernels, various toolchains and anything else required by users and various distributions.
EvidentialGeneEvidentialGene is a genome informatics project for "Evidence Directed Gene Construction for Eukaryotes", for constructing high quality, accurate gene sets for animals and plants (any eukaryotes), being developed by Don Gilbert at Indiana University, gilbertd at indiana edu.
ExonerateExonerate is a generic tool for pairwise sequence comparison. It allows you to align sequences using many alignment models, using either exhaustive dynamic programming or a variety of heuristics.
expatExpat is an XML parser library written in C. It is a stream-oriented parser in which an application registers handlers for things the parser might find in the XML document (like start tags).
export2graphlanexport2graphlan is a conversion software tool for producing both annotation and tree files for GraPhlAn. In particular, the annotation file tries to highlight specific sub-trees, automatically deriving from the input file which nodes are important.
ExtraeExtrae is the core instrumentation package developed by the Performance Tools group at BSC. Extrae is capable of instrumenting applications based on MPI, OpenMP, pthreads, CUDA, OpenCL, and StarSs using different instrumentation approaches. The information gathered by Extrae typically includes timestamped events of runtime calls, performance counters and source code references. In addition, Extrae provides its own API to allow users to manually instrument their applications.
faacA complete, cross-platform solution to record, convert and stream audio and video.
fast5A lightweight C++ library for accessing Oxford Nanopore Technologies sequencing data.
FASTAThe FASTA programs find regions of local or global (new) similarity between protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence.
fastahackfastahack is a small application for indexing and extracting sequences and subsequences from FASTA files. The included Fasta.cpp library provides a FASTA reader and indexer that can be embedded into applications which would benefit from directly reading subsequences from FASTA files. The library automatically handles index file generation and use.
FastaIndexFastA index (.fai) handler compatible with samtools faidx
FastANIFastANI is developed for fast alignment-free computation of whole-genome Average Nucleotide Identity (ANI). ANI is defined as mean nucleotide identity of orthologous gene pairs shared between two microbial genomes. FastANI supports pairwise comparison of both complete and draft genome assemblies.
FastMEFastME: a comprehensive, accurate and fast distance-based phylogeny inference program.
FastQCFastQC is a quality control application for high throughput sequence data. It reads in sequence data in a variety of formats and can either provide an interactive application to review the results of several different QC checks, or create an HTML based report which can be integrated into a pipeline.
fastq-joinfastq-join joins two paired-end reads on the overlapping ends.
FastQScreenFastQ Screen allows you to screen a library of sequences in FastQ format against a set of sequence databases so you can see if the composition of the library matches with what you expect.
fastq-toolsThis package provides a number of small and efficient programs to perform common tasks with high throughput sequencing data in the FASTQ format. All of the programs work with typical FASTQ files as well as gzipped FASTQ files.
fastsimcoal2fast sequential Markov coalescent simulation of genomic data under complex evolutionary models
fastStructureA variational framework for inferring population structure from SNP genotype data.
FastTreeFastTree infers approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences. FastTree can handle alignments with up to a million sequences in a reasonable amount of time and memory.
FASTX-ToolkitThe FASTX-Toolkit is a collection of command line tools for Short-Reads FASTA/FASTQ files preprocessing.
FFCThe FEniCS Form Compiler (FFC) is a compiler for finite element variational forms.
FFmpegA complete, cross-platform solution to record, convert and stream audio and video.
FFTWFFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data.
FIATThe FInite element Automatic Tabulator (FIAT) supports generation of arbitrary order instances of the Lagrange elements on lines, triangles, and tetrahedra. It is also capable of generating arbitrary order instances of Jacobi-type quadrature rules on the same element shapes.
FigTreeFigTree is designed as a graphical viewer of phylogenetic trees and as a program for producing publication-ready figures
fileThe file command is 'a file type guesser', that is, a command-line tool that tells you in words what kind of data a file contains.
File-Copy-LinkThe distribution File-Copy-Link includes the modules File::Spec::Link and File::Copy::Link and the script copylink. They include routines to read and copy links.
File-Tempreturn name and handle of a temporary file safely
FiltlongFiltlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.
fineRADstructureA package for population structure inference from RAD-seq data
fixesprotoX.org FixesProto protocol headers.
FLANNFLANN is a library for performing fast approximate nearest neighbor searches in high dimensional spaces.
FLASHFLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge RNA-seq data.
FlaskFlask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications.
flexFlex (Fast Lexical Analyzer) is a tool for generating scanners. A scanner, sometimes called a tokenizer, is a program which recognizes lexical patterns in text.
FLTKFLTK is a cross-platform C++ GUI toolkit for UNIX/Linux (X11), Microsoft Windows, and MacOS X. FLTK provides modern GUI functionality without the bloat and supports 3D graphics via OpenGL and its built-in GLUT emulation.
FlyeFlye is a de novo assembler for long and noisy reads, such as those produced by PacBio and Oxford Nanopore Technologies.
FMILibraryFMI library is intended as a foundation for applications interfacing FMUs (Functional Mockup Units) that follow the FMI Standard. This version of the library supports FMI 1.0 and FMI 2.0.
fmtfmt (formerly cppformat) is an open-source formatting library.
fontconfigFontconfig is a library designed to provide system-wide font configuration, customization and application access.
fossGNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
fosscudaGCC based compiler toolchain with CUDA support, and including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
FoXFoX is an XML library written in Fortran 95. It allows software developers to read, write and modify XML documents from Fortran applications without the complications of dealing with multi-language development.
FragGeneScanFragGeneScan is an application for finding (fragmented) genes in short reads.
FreeBayesFreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.
freeglutfreeglut is a completely OpenSourced alternative to the OpenGL Utility Toolkit (GLUT) library.
FreeSurferFreeSurfer is a set of tools for analysis and visualization of structural and functional brain imaging data. FreeSurfer contains a fully automatic structural imaging stream for processing cross sectional and longitudinal data.
freetypeFreeType 2 is a software font engine that is designed to be small, efficient, highly customizable, and portable while capable of producing high-quality output (glyph images). It can be used in graphics libraries, display servers, font conversion tools, text image generation tools, and many other products as well.
FreeXLFreeXL is an open source library to extract valid data from within an Excel (.xls) spreadsheet.
FriBidiThe Free Implementation of the Unicode Bidirectional Algorithm.
FSLFSL is a comprehensive library of analysis tools for FMRI, MRI and DTI brain imaging data.
FTGLFTGL is a free open source library to enable developers to use arbitrary fonts in their OpenGL applications.
futurepython-future is the missing compatibility layer between Python 2 and Python 3.
fxtractExtract sequences from a fastx (fasta or fastq) file given a subsequence.
g2clibLibrary contains GRIB2 encoder/decoder ('C' version).
g2libLibrary contains GRIB2 encoder/decoder and search/indexing routines.
GAMESS_tamuTAMU HPRC GAMESS launcher - rungms
GAM-NGSGenomic assemblies merger for next generation sequencing
gapGAP is a system for computational discrete algebra, with particular emphasis on Computational Group Theory.
GapCloserGapCloser is designed to close the gaps emerging during the scaffolding process by SOAPdenovo or other assembler, using the abundant pair relationships of short reads.
GapFillerGapFiller is a stand-alone program for closing gaps within pre-assembled scaffolds. It is unique in offering the possibility to manually control the gap closure process. By using the distance information of paired-read data, GapFiller seeks to close the gap from each edge in an iterative manner. From a good number of tests we see the program yields excellent results both on bacterial and eukaryotic data sets. The command-line Perl script and additional files can be downloaded below. The input data is given by pre-assembled scaffold sequences (FASTA) and NGS paired-read data (FASTA or FASTQ). The final gap-filled scaffolds are provided in FASTA format.
GATEGATE is advanced open-source software developed by the international OpenGATE collaboration and dedicated to numerical simulations in medical imaging. It currently supports simulations of Emission Tomography (Positron Emission Tomography - PET and Single Photon Emission Computed Tomography - SPECT), and Computed Tomography.
GATKThe Genome Analysis Toolkit or GATK is a software package developed at the Broad Institute to analyse next-generation resequencing data. The toolkit offers a wide variety of tools, with a primary focus on variant discovery and genotyping as well as strong emphasis on data quality assurance. Its robust architecture, powerful processing engine and high-performance computing features make it capable of taking on projects of any size. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
gawkgawk: GNU awk
gcThe Boehm-Demers-Weiser conservative garbage collector can be used as a garbage collecting replacement for C malloc or C++ new.
GCATemplatesGCATemplates is a collection of HPC template scripts for tools useful for bioinformatics tasks.
GCCThe GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...).
GCCcoreThe GNU Compiler Collection includes front ends for C, C++, Objective-C, Fortran, Java, and Ada, as well as libraries for these languages (libstdc++, libgcj,...).
gcccudaGNU Compiler Collection (GCC) based compiler toolchain, along with CUDA toolkit.
GConfGConf is a system for storing application preferences. It is intended for user preferences; not configuration of something like Apache, or arbitrary data storage.
GDGD.pm - Interface to Gd Graphics Library
GDALGDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.
GDBThe GNU Project Debugger
gdbguiBrowser-based frontend to gdb (gnu debugger). Add breakpoints, view the stack, visualize data structures, and more in C, C++, Go, Rust, and Fortran. Run gdbgui from the terminal and a new tab will open in your browser.
gdc-clientThe GDC provides a standard client-based mechanism in support of high-performance data downloads and submission.
GDCHARTEasy to use C API, high performance library to create charts and graphs in PNG, GIF and WBMP format.
Gdk-PixbufThe Gdk Pixbuf is a toolkit for image loading and pixel buffer manipulation. It is used by GTK+ 2 and GTK+ 3 to load and manipulate images. In the past it was distributed as part of GTK+ 2 but it was split off into a separate package in preparation for the change to GTK+ 3.
Geant4Geant4 is a toolkit for the simulation of the passage of particles through matter. Its areas of application include high energy, nuclear and accelerator physics, as well as studies in medical and space science.
gearshifftBenchmark Suite for Heterogeneous FFT Implementations
GeminiGEMINI (GEnome MINIng) is a flexible framework for exploring genetic variation in the context of the wealth of genome annotations available for the human genome. By placing genetic variants, sample phenotypes and genotypes, as well as genome annotations into an integrated database framework, GEMINI provides a simple, flexible, and powerful system for exploring genetic variation for disease and population genetics.
geneidgeneid is a program to predict genes in anonymous genomic sequences designed with a hierarchical structure.
GeneMark-ESGeneMark-ES - Gene Prediction in Eukaryotes. Unsupervised training is an important feature of the GeneMark-ES algorithm that identifies protein coding genes in eukaryotic genomes. This is the only eukaryotic gene finder that can perform gene prediction without curated training sets.
GeneMarkSGeneMarkS - Gene Prediction in Prokaryotes.
GenomeMapperGenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference.
GenomeToolsThe GenomeTools genome analysis system is a free collection of bioinformatics tools (in the realm of genome informatics) combined into a single binary named gt. It is based on a C library named libgenometools which contains a wide variety of classes for efficient and convenient implementation of sequence and annotation processing software.
GEOSGEOS (Geometry Engine - Open Source) is a C++ port of the Java Topology Suite (JTS)
GerrisGerris is a Free Software program for the solution of the partial differential equations describing fluid flow
gettextGNU 'gettext' is an important step for the GNU Translation Project, as it is an asset on which we may build many other steps. This package offers to programmers, translators, and even users, a well integrated set of tools and documentation
GffCompareGffCompare provides classification and reference annotation mapping and matching statistics for RNA-Seq assemblies (transfrags) or other generic GFF/GTF files.
gffreadGFF/GTF parsing utility providing format conversions, region filtering, FASTA sequence extraction and more.
gflagsThe gflags package contains a C++ library that implements commandline flags processing. It includes built-in support for standard types such as string and the ability to define flags in the source file in which they are used.
ggplot2 An Implementation of the Grammar of Graphics
GhostscriptGhostscript is a versatile processor for PostScript data with the ability to render PostScript to different targets. It used to be part of the cups printing stack, but is no longer used for that.
giflibgiflib is a library for reading and writing gif images. It is API and ABI compatible with libungif which was in wide use while the LZW compression algorithm was patented.
gifsicleGifsicle is a command-line tool for creating, editing, and getting information about GIF images and animations. Making a GIF animation with gifsicle is easy.
gimpiGNU Compiler Collection (GCC) based compiler toolchain, next to Intel MPI.
giolfGNU Compiler Collection (GCC) based compiler toolchain, including IntelMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
GIREMIGIREMI is a method that can identify RNA editing sites using one RNA-seq data set without requiring genome sequence data.
gitGit is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
GitPythonGitPython is a python library used to interact with Git repositories
GizaGiza is an open, lightweight scientific plotting library built on top of cairo that provides uniform output to multiple devices.
GL2PSGL2PS: an OpenGL to PostScript printing library
GladeGlade is a RAD tool to enable quick & easy development of user interfaces for the GTK+ toolkit and the GNOME desktop environment.
glewThe OpenGL Extension Wrangler Library (GLEW) is a cross-platform open-source C/C++ extension loading library. GLEW provides efficient run-time mechanisms for determining which OpenGL extensions are supported on the target platform. OpenGL core and extension functionality is exposed in a single header file. GLEW has been tested on a variety of operating systems, including Windows, Linux, Mac OS X, FreeBSD, Irix, and Solaris.
GLibGLib is one of the base libraries of the GTK+ project
glibcThe GNU C Library project provides the core libraries for the GNU system and GNU/Linux systems, as well as many other systems that use Linux as the kernel.
GLibmmGLib is one of the base libraries of the GTK+ project
GLIMMERGlimmer is a system for finding genes in microbial DNA, especially the genomes of bacteria, archaea, and viruses.
GlimmerHMMGlimmerHMM is a new gene finder based on a Generalized Hidden Markov Model (GHMM). Although the gene finder conforms to the overall mathematical framework of a GHMM, additionally it incorporates splice site models adapted from the GeneSplicer program and a decision tree adapted from GlimmerM. It also utilizes Interpolated Markov Models for the coding and noncoding models
GLMOpenGL Mathematics (GLM) is a header only C++ mathematics library for graphics software based on the OpenGL Shading Language (GLSL) specifications.
GlobalArraysGlobal Arrays (GA) is a Partitioned Global Address Space (PGAS) programming model
glogA C++ implementation of the Google logging module.
GLPKThe GLPK (GNU Linear Programming Kit) package is intended for solving large-scale linear programming (LP), mixed integer programming (MIP), and other related problems. It is a set of routines written in ANSI C and organized in the form of a callable library.
glprotoX protocol and ancillary headers
glueAn implementation of interpreted string literals
GMAP-GSNAPGMAP: A Genomic Mapping and Alignment Program for mRNA and EST Sequences GSNAP: Genomic Short-read Nucleotide Alignment Program
GMPGMP is a free library for arbitrary precision arithmetic, operating on signed integers, rational numbers, and floating point numbers.
gmpichgcc and GFortran based compiler toolchain, including MPICH for MPI support.
gmpolfgcc and GFortran based compiler toolchain, MPICH for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
gmpy2GMP/MPIR, MPFR, and MPC interface to Python 2.6+ and 3.x
GNUCompiler-only toolchain with GCC and binutils.
gnuplotPortable interactive, function plotting utility
gnutlsGnuTLS is a secure communications library implementing the SSL, TLS and DTLS protocols and technologies around them. It provides a simple C language application programming interface (API) to access the secure communications protocols as well as APIs to parse and write X.509, PKCS #12, OpenPGP and other required structures. It is aimed to be portable and efficient with focus on security and interoperability.
GoGo is an open source programming language that makes it easy to build simple, reliable, and efficient software.
goatoolsPython scripts to find enrichment of GO terms
GObject-IntrospectionGObject introspection is a middleware layer between C libraries (using GObject) and language bindings. The C library can be scanned at compile time and generate a metadata file, in addition to the actual native C library. Then at runtime, language bindings can read this metadata and automatically provide bindings to call into the C library.
golfGNU Compiler Collection (GCC) based compiler toolchain, including OpenBLAS (BLAS and LAPACK support) and FFTW.
gompiGNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support.
gompicGNU Compiler Collection (GCC) based compiler toolchain along with CUDA toolkit, including OpenMPI for MPI support with CUDA features enabled.
google-java-formatReformats Java source code to comply with Google Java Style.
googletestGoogle's C++ test framework
goolfGNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
goolfcGCC based compiler toolchain with CUDA support, and including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW and ScaLAPACK.
GPAWGPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). It uses real-space uniform grids and multigrid methods or atom-centered basis-functions.
GPAW-setupsPAW setup for the GPAW Density Functional Theory package. Users can install setups manually using 'gpaw install-data' or use setups from this package. The versions of GPAW and GPAW-setups can be intermixed.
gperfGNU gperf is a perfect hash function generator. For a given list of strings, it produces a hash function and hash table, in form of C or C++ code, for looking up a value depending on the input string. The hash function is perfect, which means that the hash table has no collisions, and the hash table lookup needs a single string comparison only.
gperftoolsgperftools are for use by developers so that they can create more robust applications. Especially of use to those developing multi-threaded applications in C++ with templates. Includes TCMalloc, heap-checker, heap-profiler and cpu-profiler.
GPflowGPflow is a package for building Gaussian process models in python using TensorFlow.
gprMaxgprMax is open source software that simulates electromagnetic wave propagation. It uses Yee's algorithm to solve Maxwell’s equations in 3D using the Finite-Difference Time-Domain (FDTD) method.
gpustatdstat-like utilization monitor for NVIDIA GPUs
grabixgrabix leverages the fantastic BGZF library in samtools to provide random access into text files that have been compressed with bgzip. grabix creates its own index (.gbi) of the bgzipped file. Once indexed, one can extract arbitrary lines from the file with the grab command. Or choose random lines with the, well, random command.
GraceGrace is a WYSIWYG tool to make two-dimensional plots of numerical data.
GraphicsMagickGraphicsMagick is the swiss army knife of image processing.
GraPhlAnGraPhlAn is a software tool for producing high-quality circular representations of taxonomic and phylogenetic trees. It focuses on concise, integrative, informative, and publication-ready representations of phylogenetically- and taxonomically-driven investigation.
graph-toolGraph-tool is an efficient Python module for manipulation and statistical analysis of graphs (a.k.a. networks). Contrary to most other Python modules with similar functionality, the core data structures and algorithms are implemented in C++, making extensive use of template metaprogramming, based heavily on the Boost Graph Library. This gives it a level of performance that is comparable (both in memory usage and computation time) to that of a pure C/C++ library.
GraphvizGraphviz is open source graph visualization software. Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks. It has important applications in networking, bioinformatics, software engineering, database and web design, machine learning, and in visual interfaces for other technical domains.
GRASSGRASS GIS, commonly referred to as GRASS (Geographic Resources Analysis Support System), is a free and open source Geographic Information System (GIS) software suite used for geospatial data management and analysis, image processing, graphics and maps production, spatial modeling, and visualization.
grib_apiThe ECMWF GRIB API is an application program interface accessible from C, FORTRAN and Python programs developed for encoding and decoding WMO FM-92 GRIB edition 1 and edition 2 messages. A useful set of command line tools is also provided to give quick access to GRIB messages.
GROMACSGROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. This is a CPU only build, containing both MPI and threadMPI builds.
GSLThe GNU Scientific Library (GSL) is a numerical library for C and C++ programmers. The library provides a wide range of mathematical routines such as random number generators, special functions and least-squares fitting.
GST-plugins-baseGStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
GStreamerGStreamer is a library for constructing graphs of media-handling components. The applications it supports range from simple Ogg/Vorbis playback, audio/video streaming to complex audio (mixing) and video (non-linear editing) processing.
gtableArrange 'Grobs' in Tables
GTDB-TkA toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
gtestGoogle's framework for writing C++ tests on a variety of platforms
GTK+The GTK+ 3 package contains libraries used for creating graphical user interfaces for applications.
GtkmmThe Gtkmm package provides a C++ interface to GTK+ 3.
GTSGTS stands for the GNU Triangulated Surface Library. It is an Open Source Free Software Library intended to provide a set of useful functions to deal with 3D surfaces meshed with interconnected triangles.
GuileGuile is a programming language, designed to help programmers create flexible applications that can be extended by users or other programmers with plug-ins, modules, or scripts.
gzipgzip (GNU zip) is a popular data compression program as a replacement for compress
h5pyHDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
HADDOCKHigh Ambiguity Driven biomolecular DOCKing based on biochemical and/or biophysical information. This module has restricted access.
HadoopHadoop MapReduce by Cloudera
HarfBuzzHarfBuzz is an OpenType text shaping engine.
HarvestToolsHarvestTools is a part of the Harvest software suite and provides file conversion between Gingr files and various standard text formats
HDDMHDDM is a Python toolbox for hierarchical Bayesian parameter estimation of the Drift Diffusion Model (via PyMC).
HDFHDF (also known as HDF4) is a library and multi-object file format for storing and managing data between machines.
HDF5HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.
HDF-EOSThe HDF-EOS2 is a software library built on HDF4 to support EOS-specific data structures, namely Grid, Point, and Swath.
HelloThe GNU Hello program produces a familiar, friendly greeting. Yes, this is another implementation of the classic program that prints "Hello, world!" when you run it. However, unlike the minimal version often seen, GNU Hello processes its argument list to modify its behavior, supports greetings in many languages, and so on.
help2manhelp2man produces simple manual pages from the '--help' and '--version' output of other commands.
hisatHISAT is a fast and sensitive spliced alignment program for mapping RNA-seq reads. In addition to one global FM index that represents a whole genome, HISAT uses a large set of small FM indexes that collectively cover the whole genome (each index represents a genomic region of ~64,000 bp and ~48,000 indexes are needed to cover the human genome). These small indexes (called local indexes) combined with several alignment strategies enable effective alignment of RNA-seq reads, in particular, reads spanning multiple exons. The memory footprint of HISAT is relatively low (~4.3GB for the human genome). We have developed HISAT based on the Bowtie2 implementation to handle most of the operations on the FM index.
HISAT2HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) against the general human population (as well as against a single reference genome).
HMMERHMMER is used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments. It implements methods using probabilistic models called profile hidden Markov models (profile HMMs). Compared to BLAST, FASTA, and other sequence alignment and database search tools based on older scoring methodology, HMMER aims to be significantly more accurate and more able to detect remote homologs because of the strength of its underlying mathematical models. In the past, this strength came at significant computational expense, but in the new HMMER3 project, HMMER is now essentially as fast as BLAST.
HomerHOMER (Hypergeometric Optimization of Motif EnRichment) is a suite of tools for Motif Discovery and next-gen sequencing analysis.
HPCGThe HPCG Benchmark project is an effort to create a more relevant metric for ranking HPC systems than the High Performance LINPACK (HPL) benchmark, that is currently used by the TOP500 benchmark.
HPLHPL is a software package that solves a (random) dense linear system in double precision (64 bits) arithmetic on distributed-memory computers. It can thus be regarded as a portable as well as freely available implementation of the High Performance Computing Linpack Benchmark.
HRLDASThe High-Resolution Land Data Assimilation System (HRLDAS) is an offline driver of the land surface models present in the Weather Research and Forecasting (WRF) model.
htopAn interactive process viewer for Unix
HTSeqA framework to process and analyze data from high-throughput sequencing (HTS) assays
HTSlibA C library for reading/writing high-throughput sequencing data. This package includes the utilities bgzip and tabix
hunspellHunspell is a spell checker and morphological analyzer library and program designed for languages with rich morphology and complex word compounding or character encoding.
hwlocThe Portable Hardware Locality (hwloc) software package provides a portable abstraction (across OS, versions, architectures, ...) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores and simultaneous multithreading. It also gathers various system attributes such as cache and memory information as well as the locality of I/O devices such as network interfaces, InfiniBand HCAs or GPUs. It primarily aims at helping applications with gathering information about modern computing hardware so as to exploit it accordingly and efficiently.
HYCOMHYCOM - HYbrid Coordinate Ocean Model
Hyperopthyperopt is a Python library for optimizing over awkward search spaces with real-valued, discrete, and conditional dimensions.
HyperworksComputer-aided engineering simulator.
hypothesisHypothesis is an advanced testing library for Python. It lets you write tests which are parametrized by a source of examples, and then generates simple and comprehensible examples that make your tests fail. This lets you find more bugs in your code with less work. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
HypreHypre is a library for solving large, sparse linear systems of equations on massively parallel computers. The problems of interest arise in the simulation codes being developed at LLNL and elsewhere to study physical phenomena in the defense, environmental, energy, and biological sciences.
ICA-AROMAICA-AROMA (i.e. 'ICA-based Automatic Removal Of Motion Artifacts') concerns a data-driven method to identify and remove motion-related independent components from fMRI data.
iccIntel C and C++ compilers
iccifortIntel C, C++ & Fortran compilers
iccifortcudaIntel C, C++ & Fortran compilers with CUDA toolkit
IceTThe Image Composition Engine for Tiles (IceT) is a high-performance sort-last parallel rendering library.
ICORN2ICORN2 is a software to correct reference genome sequences. The main idea is to iteratively map reads and find differences in the sequence.
iCountiCount: protein-RNA interaction analysis is a Python module and associated command-line interface (CLI), which provides all the commands needed to process iCLIP data on protein-RNA interactions.
ictceIntel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MPI & Intel MKL.
ICUICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and Globalization support for software applications.
IDBA-UDIDBA-UD is an iterative De Bruijn Graph De Novo Assembler for Short Reads Sequencing data with Highly Uneven Sequencing Depth. It is an extension of the IDBA algorithm. IDBA-UD also iterates from small k to a large k. In each iteration, short and low-depth contigs are removed iteratively with cutoff threshold from low to high to reduce the errors in low-depth and high-depth regions. Paired-end reads are aligned to contigs and assembled locally to generate some missing k-mers in low-depth regions. With these techniques, IDBA-UD can iterate the k value of the de Bruijn graph to a very large value with fewer gaps and fewer branches to form long contigs in both low-depth and high-depth regions.
IDLENVIEXELIS IDL is a programming language used for data analysis. It is popular in particular areas of science, such as astronomy, atmospheric physics and medical imaging.
ifortIntel Fortran compiler
igraphigraph is a collection of network analysis tools with the emphasis on efficiency, portability and ease of use. igraph is open source and free. igraph can be programmed in R, Python and C/C++.
IGVThe Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.
IGVToolsThis package contains command line utilities for preprocessing, computing feature count density (coverage), sorting, and indexing data files.
iimpiIntel C/C++ and Fortran compilers, alongside Intel MPI.
iimpicIntel C/C++ and Fortran compilers, alongside Intel MPI and CUDA.
imageioImageio is a Python library that provides an easy interface to read and write a wide range of image data, including animated images, video, volumetric data, and scientific formats. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
ImageJImage Processing and Analysis in Java
ImageMagickImageMagick is a software suite to create, edit, compose, or convert bitmap images
imbalanced-learnimbalanced-learn is a Python package offering a number of re-sampling techniques commonly used in datasets showing strong between-class imbalance.
imklIntel Math Kernel Library is a library of highly optimized, extensively threaded math routines for science, engineering, and financial applications that require maximum performance. Core math functions include BLAS, LAPACK, ScaLAPACK, Sparse Solvers, Fast Fourier Transforms, Vector Math, and more.
impiIntel MPI Library, compatible with MPICH ABI
InelasticaPython package for eigenchannels, vibrations and inelastic electron transport based on SIESTA/TranSIESTA DFT.
InfernalInfernal ("INFERence of RNA ALignment") is for searching DNA sequence databases for RNA structure and sequence similarities.
InputProto protocol headers.
InspectorIntel Inspector XE is an easy to use memory error checker and thread checker for serial and parallel applications
IntaRNAEfficient RNA-RNA interaction prediction incorporating accessibility and seeding of interaction sites
intelCompiler toolchain including Intel compilers, Intel MPI and Intel Math Kernel Library (MKL).
intelcudaIntel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MPI & Intel MKL, with CUDA toolkit
intel-paraIntel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, ParaStation MPI & Intel MKL.
IntelPythonIntel® Distribution for Python. Powered by Anaconda. Accelerating Python* performance on modern architectures from Intel.
InterProScanInterProScan is a sequence analysis application (nucleotide and protein sequences) that combines different protein signature recognition methods into one resource.
intltoolintltool is a set of tools to centralize translation of many different file formats using GNU gettext-compatible PO files.
iomklIntel Cluster Toolchain Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MKL & OpenMPI.
iompiIntel C/C++ and Fortran compilers, alongside Open MPI.
IpoptIpopt (Interior Point OPTimizer, pronounced eye-pea-Opt) is a software package for large-scale nonlinear optimization.
ipsmpiIntel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MKL combined with ParaStation MPI.
IPythonIPython provides a rich architecture for interactive computing with: Powerful interactive shells (terminal and Qt-based). A browser-based notebook with support for code, text, mathematical expressions, inline plots and other rich media. Support for interactive data visualization and use of GUI toolkits. Flexible, embeddable interpreters to load into your own projects. Easy to use, high performance tools for parallel computing.
IQ-TREEEfficient phylogenomic software by maximum likelihood
iRAPa flexible RNA-seq analysis pipeline that allows the user to select and apply their preferred combination of existing tools for mapping reads, quantifying expression and testing for differential expression.
ispcIntel SPMD Program Compiler: an open-source compiler for high-performance SIMD programming on the CPU. ispc is a compiler for a variant of the C programming language, with extensions for 'single program, multiple data' (SPMD) programming. Under the SPMD model, the programmer writes a program that generally appears to be a regular serial program, though the execution model is actually that a number of program instances execute in parallel on the hardware.
isPcrCommand line program that builds its own index (rather than relying on gfServer) to do PCR. This uses a lot of memory and is best done one chromosome at a time in batch mode, ideally on a cluster of machines.
itacThe Intel Trace Collector is a low-overhead tracing library that performs event-based tracing in applications. The Intel Trace Analyzer provides a convenient way to monitor application activities gathered by the Intel Trace Collector through graphical displays.
I-TASSERI-TASSER is a set of pre-compiled binaries and scripts for protein structure and function modelling and comparison.
ITKInsight Segmentation and Registration Toolkit (ITK) provides an extensive suite of software tools for registering and segmenting multidimensional imaging data.
itsdangerousVarious helpers to pass trusted data to untrusted environments and back.
JabbaJabba, a hybrid error correction tool for sequencing reads.
JAGSJAGS is Just Another Gibbs Sampler. It is a program for analysis of Bayesian hierarchical models using Markov Chain Monte Carlo (MCMC) simulation
JasPerThe JasPer Project is an open-source initiative to provide a free software-based reference implementation of the codec specified in the JPEG-2000 Part-1 standard.
JavaJava Platform, Standard Edition (Java SE) lets you develop and deploy Java applications on desktops and servers.
JBIGKIT(description not available)
JBIG-KITJBIG-KIT provides a portable library of compression and decompression functions with a documented interface that you can include very easily into your image or document processing software.
JBrowseJBrowse is a genome browser with a fully dynamic AJAX interface, being developed as the eventual successor to GBrowse. It is very fast and scales well to large datasets.
JDKJava Platform, Standard Edition (Java SE) lets you develop and deploy Java applications on desktops and servers.
JellyfishJellyfish is a tool for fast, memory-efficient counting of k-mers in DNA.
jemallocjemalloc is a general purpose malloc(3) implementation that emphasizes fragmentation avoidance and scalable concurrency support.
Jinja2Jinja2 is a template engine written in pure Python. It provides a Django inspired non-XML syntax but supports inline expressions and an optional sandboxed environment.
JiTCODEJust-in-time compilation for ordinary/delay/stochastic differential equations (DDEs)
json2htmlPython wrapper to convert JSON into a human readable HTML Table representation.
JsonCppJsonCpp is a C++ library that allows manipulating JSON values, including serialization and deserialization to and from strings. It can also preserve existing comments in deserialization/serialization steps, making it a convenient format to store user input files.
JudyA C library that implements a dynamic array.
JuliaJulia is a high-level, high-performance dynamic programming language for numerical computing
Julia_tamuJulia is a high-level, high-performance dynamic programming language for numerical computing.
JUnitA programmer-oriented testing framework for Java.
jupyterhubJupyterHub is a multiuser version of the Jupyter (IPython) notebook designed for centralized deployments in companies, university classrooms and research labs.
kallistokallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads.
KAnalyzeKAnalyze is a Java toolkit designed to convert DNA and RNA sequences into k-mers. It is both a command line application and an API.
KarectKAUST Assembly Read Error Correction Tool
KATKAT is a suite of tools that generate, analyse and compare k-mer spectra produced from sequence files.
KBProto protocol headers.
KerasKeras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano.
KILAPEKILAPE (K-masking and Iterative Local Assembly of Paired Ends) is an automated scaffolding and gap filling software pipeline which predicts repetitive elements in Next Generation Sequencing read libraries without resorting to a reference sequence. To see executable files: ls $KILAPE_HOME; ls $KILAPE_BIN
KMCKMC is a disk-based program for counting k-mers from (possibly gzipped) FASTQ/FASTA files.
KNIMEKNIME Analytics Platform is the open source software for creating data science applications and services. KNIME stands for KoNstanz Information MinEr.
KnitroThe Artelys Knitro Solver is a plug-in Solver Engine that extends Analytic Solver Platform, Risk Solver Platform, Premium Solver Platform or Solver SDK Platform to solve nonlinear optimization problems of virtually unlimited size.
KokkosKokkos implements a programming model in C++ for writing performance portable applications targeting all major HPC platforms.
KorfLab-Perl_utilsMiscellaneous Perl scripts and modules used by people in the Korf lab.
KrakenKraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.
Kraken2Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.
KronaToolsKrona Tools is a set of scripts to create Krona charts from several Bioinformatics tools as well as from text and XML files.
kSNPkSNP identifies the pan-genome SNPs in a set of genome sequences, and estimates phylogenetic trees based upon those SNPs. SNP discovery is based on k-mer analysis, and requires neither multiple sequence alignment nor the selection of a reference genome, so kSNP can take hundreds of microbial genomes as input.
KyotoCabinetKyoto Cabinet is a library of routines for managing a database.
labelingAxis Labeling
LAMELAME is a high quality MPEG Audio Layer III (MP3) encoder licensed under the LGPL.
LAPACKLAPACK is written in Fortran 90 and provides routines for solving systems of simultaneous linear equations, least-squares solutions of linear systems of equations, eigenvalue problems, and singular value problems.
lapelsLapels - A remapper and annotator of in silico (pseudo) genome alignments
LASTLAST finds similar regions between sequences. LAST copes more efficiently with repeat-rich sequences (e.g. genomes). For example, it can align reads to genomes without repeat-masking and without becoming overwhelmed by repetitive hits.
LATTEOpen source density functional tight binding molecular dynamics.
LCovLCOV - the LTP GCOV extension
LEfSeLEfSe (Linear discriminant analysis Effect Size) determines the features (organisms, clades, operational taxonomic units, genes, or functions) most likely to explain differences between classes by coupling standard tests for statistical significance with additional tests encoding biological consistency and effect relevance.
LeptonicaLeptonica is a collection of pedagogically-oriented open source software that is broadly useful for image processing and image analysis applications.
LevelDBLevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.
lftpLFTP is a sophisticated ftp/http client, and a file transfer program supporting a number of network protocols. Like BASH, it has job control and uses the readline library for input. It has bookmarks, a built-in mirror command, and can transfer several files in parallel. It was designed with reliability in mind.
libaioAsynchronous input/output library that uses the kernel's native interface. URL:
libarchiveMulti-format archive and compression library URL:
libartGraphics routines used by the GnomeCanvas widget and some other applications. libart renders vector paths and the like.
libavLibav is a friendly and community-driven effort to provide its users with a set of portable, functional and high-performance libraries for dealing with multimedia formats of all sorts.
libcerflibcerf is a self-contained numeric library that provides an efficient and accurate implementation of complex error functions, along with Dawson, Faddeeva, and Voigt functions.
libcircleAn API to provide an efficient distributed queue on a cluster. libcircle is an API for distributing embarrassingly parallel workloads using self-stabilization. URL:
libconfigLibconfig is a simple library for processing structured configuration files
libConfuselibConfuse is a configuration file parser library, licensed under the terms of the ISC license, and written in C.
libdapA C++ SDK which contains an implementation of DAP 2.0 and the development versions of DAP3, up to 3.4. This includes both Client- and Server-side support classes.
libdrmDirect Rendering Manager runtime library.
libdwarfThe DWARF Debugging Information Format is of interest to programmers working on compilers and debuggers (and anyone interested in reading or writing DWARF information). URL:
libelflibelf is a free ELF object file access library URL:
libepoxyEpoxy is a library for handling OpenGL function pointer management for you URL:
libeventThe libevent API provides a mechanism to execute a callback function when a specific event occurs on a file descriptor or after a timeout has been reached. Furthermore, libevent also supports callbacks due to signals or regular timeouts.
libffcallGNU Libffcall is a collection of four libraries which can be used to build foreign function call interfaces in embedded interpreters
libffiThe libffi library provides a portable, high level programming interface to various calling conventions. This allows a programmer to call any function specified by a call interface description at run-time.
libgcryptLibgcrypt is a general-purpose cryptographic library originally based on code from GnuPG.
libgdGD is an open source code library for the dynamic creation of images by programmers.
libgeotiffLibrary for reading and writing coordinate system information from/to GeoTIFF files
libgit2libgit2 is a portable, pure C implementation of the Git core methods provided as a linkable library with a solid API, allowing to build Git functionality into your application. URL:
libgladeLibglade is a library for constructing user interfaces dynamically from XML descriptions.
libGLUThe OpenGL Utility Library (GLU) is a computer graphics library for OpenGL. URL:
libgnomecanvasThe canvas widget allows you to create custom displays using stock items such as circles, lines, text, and so on. It was originally a port of the Tk canvas widget but has evolved quite a bit over time.
libgpg-errorLibgpg-error is a small library that defines common error values for all GnuPG components.
libgpuarrayLibrary to manipulate tensors on the GPU.
libgtextutilslibgtextutils is a dependency of fastx-toolkit and is provided via the same upstream.
libharulibHaru is a free, cross platform, open source library for generating PDF files.
libICEX Inter-Client Exchange library for
libiconvLibiconv converts from one character encoding to another through Unicode conversion
libidnGNU Libidn is a fully documented implementation of the Stringprep, Punycode and IDNA specifications. Libidn's purpose is to encode and decode internationalized domain names.
LibintLibint library is used to evaluate the traditional (electron repulsion) and certain novel two-body matrix elements (integrals) over Cartesian Gaussian functions used in modern atomic and molecular theory.
libjpeg-turbolibjpeg-turbo is a fork of the original IJG libjpeg which uses SIMD to accelerate baseline JPEG compression and decompression. libjpeg is a library that implements JPEG image encoding, decoding and transcoding.
libmathevalGNU libmatheval is a library (callable from C and Fortran) to parse and evaluate symbolic expressions input as text.
libMemcachedlibMemcached is an open source C/C++ client library and tools for the memcached server. It has been designed to be light on memory usage, thread safe, and to provide full access to server-side methods.
libMeshThe libMesh library provides a framework for the numerical simulation of partial differential equations using arbitrary unstructured discretizations on serial and parallel platforms. A major goal of the library is to provide support for adaptive mesh refinement (AMR) computations in parallel while allowing a research scientist to focus on the physics they are modeling. NOTE: This module has been specifically configured for use with MOOSE.
libnlThe libnl suite is a collection of libraries providing APIs to netlink protocol based Linux kernel interfaces.
libpciaccessGeneric PCI access library.
libpnglibpng is the official PNG reference library
libpthread-stubslibpthread-stubs provides weak aliases for pthread functions not provided in libc or otherwise available by default.
libreadlineThe GNU Readline library provides a set of functions for use by applications that allow users to edit command lines as they are typed in. Both Emacs and vi editing modes are available. The Readline library includes additional functions to maintain a list of previously-entered command lines, to recall and perhaps reedit those lines, and perform csh-like history expansion on previous commands.
libsigc++The libsigc++ package implements a typesafe callback system for standard C++. URL:
libsigsegvGNU libsigsegv is a library for handling page faults in user mode.
libSMX11 Session Management library, which allows applications both to manage sessions and to make use of session managers to save and restore their state for later use.
libsndfileLibsndfile is a C library for reading and writing files containing sampled sound (such as MS Windows WAV and the Apple/SGI AIFF format) through one standard library interface.
libsodiumSodium is a modern, easy-to-use software library for encryption, decryption, signatures, password hashing and more. URL:
libspatialindexC++ implementation of R*-tree, an MVR-tree and a TPR-tree with C API
libspatialiteSpatiaLite is an open source library intended to extend the SQLite core to support fully fledged Spatial SQL capabilities.
LIBSVMLIBSVM is integrated software for support vector classification (C-SVC, nu-SVC), regression (epsilon-SVR, nu-SVR) and distribution estimation (one-class SVM). It supports multi-class classification.
libtarC library for manipulating POSIX tar files
libtasn1Libtasn1 is the ASN.1 library used by GnuTLS, GNU Shishi and some other packages. It was written by Fabio Fiorina, and has been shipped as part of GnuTLS for some time but is now a proper GNU package. URL:
LibTIFFtiff: Library and tools for reading and writing TIFF data files
libtoolGNU libtool is a generic library support script. Libtool hides the complexity of using shared libraries behind a consistent, portable interface.
libunistringThis library provides functions for manipulating Unicode strings and for manipulating C strings according to the Unicode standard.
libunwindThe primary goal of libunwind is to define a portable and efficient C programming interface (API) to determine the call-chain of a program. The API additionally provides the means to manipulate the preserved (callee-saved) state of each call-frame and to resume execution at any point in the call-chain (non-local goto). The API supports both local (same-process) and remote (across-process) operation. As such, the API is useful in a number of applications.
LibUUIDPortable uuid C library
libvdwxclibvdwxc is a general library for evaluating energy and potential for exchange-correlation (XC) functionals from the vdW-DF family that can be used with various density functional theory (DFT) codes. URL:
libwebpWebP is a modern image format that provides superior lossless and lossy compression for images on the web. Using WebP, webmasters and web developers can create smaller, richer images that make the web faster. URL:
libX11X11 client-side library
libXauThe libXau package contains a library implementing the X11 Authorization Protocol. This is useful for restricting client access to the display.
libxcLibxc is a library of exchange-correlation functionals for density-functional theory. The aim is to provide a portable, well tested and reliable set of exchange and correlation functionals.
libxcbThe X protocol C-language Binding (XCB) is a replacement for Xlib featuring a small footprint, latency hiding, direct access to the protocol, improved threading support, and extensibility.
libXdamageX Damage extension library
libXdmcpThe libXdmcp package contains a library implementing the X Display Manager Control Protocol. This is useful for allowing clients to interact with the X Display Manager.
libXextCommon X Extensions library
libXfixesX Fixes extension library
libXfontX font library
libXftX11 client-side library
libXiLibXi provides an X Window System client interface to the XINPUT extension to the X protocol.
libXineramaXinerama multiple monitor library
libxml++libxml++ is a C++ wrapper for the libxml XML parser library. URL:
libxml2Libxml2 is the XML C parser and toolchain developed for the Gnome project (but usable outside of the Gnome platform).
libXmulibXmu provides a set of miscellaneous utility convenience functions for X libraries to use. libXmuu is a lighter-weight version that does not depend on libXt or libXext.
libXplibXp provides the X print library.
libXpmlibXpm provides the X Pixmap (XPM) library.
libXrandrX Resize, Rotate and Reflection extension library
libXrenderX11 client-side library
libxsltLibxslt is the XSLT C library developed for the GNOME project (but usable outside of the Gnome platform).
libxsmmLIBXSMM is a library for small dense and small sparse matrix-matrix multiplications targeting Intel Architecture (x86).
libXtlibXt provides the X Toolkit Intrinsics, an abstract widget library upon which other toolkits are based. Xt is the basis for many toolkits, including the Athena widgets (Xaw), and LessTif (a Motif implementation).
libyamlLibYAML is a YAML parser and emitter written in C.
libzeepC++ library for reading and writing XML and creating web and SOAP servers
LIGGGHTSLIGGGHTS® is an Open Source Discrete Element Method Particle Simulation Software. It can be used for the simulation of particulate materials, and aims at application to industrial problems.
LIGGGHTS-PUBLICLIGGGHTS® is an Open Source Discrete Element Method Particle Simulation Software. It can be used for the simulation of particulate materials, and aims at application to industrial problems.
LIGGGHTS-WITH-BONDSLIGGGHTS® DEM software with Bonds enabled.
LighterFast and memory-efficient sequencing error corrector
LINKSLINKS is a genomics application for scaffolding or re-scaffolding genome assemblies with long reads, such as those produced by Oxford Nanopore Technologies Ltd. It provides a generic framework for scaffolding and can work on any sequences.
lisLis (Library of Iterative Solvers for linear systems, pronounced [lis]) is a parallel software library for solving linear equations and eigenvalue problems that arise in the numerical solution of partial differential equations using iterative methods.
LittleCMSLittle CMS intends to be an OPEN SOURCE small-footprint color management engine, with special focus on accuracy and performance.
LLVMThe LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some less common ones!). These libraries are built around a well specified code representation known as the LLVM intermediate representation ("LLVM IR"). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator.
LMDBLMDB is a fast, memory-efficient database. With memory-mapped files, it has the read performance of a pure in-memory database while retaining the persistence of standard disk-based databases.
LocARNALocARNA is a collection of alignment tools for the structural analysis of RNA. Given a set of RNA sequences, LocARNA simultaneously aligns and predicts common structures for your RNAs. In this way, LocARNA performs Sankoff-like alignment and is in particular suited for analyzing sets of related RNAs without known common structure.
LoFreqFast and sensitive variant calling from next-gen sequencing data
LongRangerLong Ranger is a set of analysis pipelines that processes Chromium sequencing output to align reads and call and phase SNPs, indels, and structural variants. There are five main pipelines, each triggered by a longranger command.
lpsolveMixed Integer Linear Programming (MILP) solver
L_RNA_scaffolderL_RNA_scaffolder is a novel scaffolding tool that uses long transcriptome reads to scaffold genome fragments. The method is suitable for most genomes. The program can handle transcript reads generated from 454/Sanger/Ion_Torrent sequencing, or assembled de novo from paired-end Illumina sequencing. Since the large introns cover most transcribed genome regions and RNA-sequencing is much less expensive than large insert library construction, the method provides a practical, cost-effective alternative to existing fosmid/BAC library-based approaches for scaffolding genome sequences.
lrsliblrslib is a self-contained ANSI C implementation of the reverse search algorithm for vertex enumeration/convex hull problems
LSCLSC is a pure implementation of the long read error correction algorithm. Long reads and high-quality short reads are homopolymer-compressed. Then, compressed short reads are mapped to compressed long reads with Bowtie2. Finally, the consensus sequences of the short reads replace the mapped regions in the long reads.
LS-DYNALS-DYNA is a general-purpose finite element program capable of simulating complex real world problems.
LS-OPTLS-OPT is a standalone Design Optimization and Probabilistic Analysis package with an interface to LS-DYNA.
LS-PREPOSTLS-PREPOST is an advanced pre and post-processor that is delivered free with LS-DYNA.
LS-TASCLS-TaSC is a Topology and Shape Computation tool. Developed for engineering analysts who need to optimize structures.
LuaLua is a powerful, fast, lightweight, embeddable scripting language. Lua combines simple procedural syntax with powerful data description constructs based on associative arrays and extensible semantics. Lua is dynamically typed, runs by interpreting bytecode for a register-based virtual machine, and has automatic memory management with incremental garbage collection, making it ideal for configuration, scripting, and rapid prototyping.
LuaJITLuaJIT is a Just-In-Time Compiler (JIT) for the Lua programming language. Lua is a powerful, dynamic and light-weight programming language. It may be embedded or used as a general-purpose, stand-alone language.
LUMPYA probabilistic framework for structural variant discovery.
lwgrpThe Light-weight Group Library provides methods for MPI codes to quickly create and destroy process groups URL:
lxmlThe lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
Lyve-SETLYVE version of the Snp Extraction Tool (SET), a method of using hqSNPs to create a phylogeny.
lz4LZ4 is a lossless compression algorithm, providing compression speed at 400 MB/s per core. It features an extremely fast decoder, with speed in multiple GB/s per core.
LZOPortable lossless data compression library
M4GNU M4 is an implementation of the traditional Unix macro processor. It is mostly SVR4 compatible although it has some extensions (for example, handling more than 9 positional parameters to macros). GNU M4 also has built-in functions for including files, running shell commands, doing arithmetic, etc.
MACSModel-based Analysis of ChIP-Seq (MACS) on short-read sequencers such as Genome Analyzer (Illumina / Solexa). MACS empirically models the length of the sequenced ChIP fragments, which tends to be shorter than sonication or library construction size estimates, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome sequence, allowing for more sensitive and robust prediction.
MACS2Model Based Analysis for ChIP-Seq data
MadymoSimcenter Madymo is the worldwide standard software for analyzing and optimizing occupant and pedestrian safety designs. URL:
MafFilterMafFilter is a program dedicated to the analysis of genome alignments. It parses and manipulates MAF files as well as more simple fasta files.
MAFFTMAFFT is a multiple sequence alignment program for unix-like operating systems. It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences), FFT-NS-2 (fast; for alignment of <∼10,000 sequences), etc.
mafToolsBioinformatics tools for dealing with Multiple Alignment Format (MAF) files.
Magic-BLASTMagic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Unlike other BLAST nucleotide search programs, such as BLASTN or Megablast, Magic-BLAST produces spliced alignments and optimizes alignment scores for paired reads.
MagicsMagics is the latest generation of the ECMWF's meteorological plotting software and can be either accessed directly through its Python or Fortran interfaces or by using Metview.
magmaThe MAGMA project aims to develop a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current Multicore+GPU systems.
MagresPythonMagresPython is a Python library for parsing the CCP-NC ab-initio magnetic resonance file format. This is used in the latest version of the CASTEP and Quantum ESPRESSO (PWSCF) codes.
magrittrA Forward-Pipe Operator for R
makeGNU version of make utility
makedependThe makedepend package contains a C-preprocessor like utility to determine build-time dependencies.
MAKERA portable and easily configurable genome annotation pipeline. MAKER identifies repeats, aligns ESTs and proteins to a genome, produces ab-initio gene predictions and automatically synthesizes these data into gene annotations having evidence-based quality values.
MakoA super-fast templating language that borrows the best ideas from the existing templating languages. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
MapSpliceMapSplice is software for mapping RNA-seq data to a reference genome for splice junction discovery. It depends only on the reference genome, and not on any further annotations.
MariaDBMariaDB is an enhanced, drop-in replacement for MySQL.
MariaDB-connector-cMariaDB Connector/C is used to connect applications developed in C/C++ to MariaDB and MySQL databases. URL:
MarkdownPython implementation of Markdown.
MarkupSafeMarkupSafe implements a text object that escapes characters so it is safe to use in HTML and XML.
MASSSupport Functions and Datasets for Venables and Ripley's MASS
MaSuRCAMaSuRCA is whole genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454, PacBio and Nanopore).
MATCHMultipurpose Atom Checker for CHARMM
Math-DerivativeMath::Derivative - Numeric 1st and 2nd order differentiation URL:
Math-SplineMath::Spline - Cubic Spline Interpolation of data URL:
Math-UtilsMath::Utils - Useful mathematical functions not in Perl. URL:
MATIOmatio is a C library for reading and writing Matlab MAT files.
MatlabA numerical computing environment and fourth-generation programming language.
Matlab-MCRMatlab Component Runtime. Standalone Matlab libraries for running Matlab codes.
matplotlibmatplotlib is a python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell, web application servers, and six graphical user interface toolkits.
MavenBinary Maven install. Apache Maven is a software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting and documentation from a central piece of information.
MavericKMavericK is a program for inferring population structure on the basis of genetic information. The mixture modelling framework used by MavericK is identical to that used in the program STRUCTURE by Pritchard et al. (2000), which remains one of the most powerful and widely used programs in population genetics.
mawkmawk is an interpreter for the AWK Programming Language.
MaxBinMaxBin is software for binning assembled metagenomic sequences based on an Expectation-Maximization algorithm.
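MaximaMaxima is a computer algebra system for manipulating symbolic and numerical expressions. It is written in Common Lisp, a high-level, general-purpose, object-oriented, dynamic, functional programming language.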
MBBCMBBC is a useful tool in metagenomic studies. It is a novel composition-based approach to bin environmental shotgun reads, by considering the k-mer frequency in reads and the inferred Markovian property of the unknown species or OTUs (operational taxonomic units).
mbuffermbuffer is a tool for buffering data streams with a large set of unique features.
McCortexMcCortex performs multi-sample de novo assembly and variant calling using linked de Bruijn graphs.
MCLThe MCL algorithm is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs (also known as networks) based on simulation of (stochastic) flow in graphs.
mcOutbrykmcOutbryk is a SNP calling pipeline using McCortex.
MCRThe MATLAB Runtime is a standalone set of shared libraries that enables the execution of compiled MATLAB applications or components on computers that do not have MATLAB installed.
MDTrajRead, write and analyze MD trajectories with only a few lines of Python code.
MECATMECAT is an ultra-fast mapping, error correction and de novo assembly tool for single-molecule sequencing (SMRT) reads.
medakamedaka is a tool to create a consensus sequence of nanopore sequencing data.
MEGAHITAn ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph
MEMEThe MEME Suite allows you to discover and analyze sequence motifs in DNA, RNA and protein sequences.
memory-profilermemory-profiler is a Python module for monitoring memory consumption of a process as well as line-by-line analysis of memory consumption for python programs. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
MesaMesa is an open-source implementation of the OpenGL specification - a system for rendering interactive 3D graphics.
MesonMeson is a cross-platform build system designed to be both as fast and as user friendly as possible.
MesquiteMesh-Quality Improvement Library
MetaClusterMetaCluster5.0 is an unsupervised binning method that can handle (1) samples with low-abundance species, or (2) samples (even with high-abundance species) that contain many extremely-low-abundance species.
MetaCluster-TAMetaCluster-TA is a tool for binning and annotating short paired-end reads.
MetaPhlAnMetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data.
MetaPhlAn2MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data with species level resolution. From version 2.0 MetaPhlAn is also able to identify specific strains (in the not-so-frequent cases in which the sample contains a previously sequenced strain) and to track strains across samples for all species.
Metaxa2Metaxa2: Improved Identification and Taxonomic Classification of Small and Large Subunit rRNA in Metagenomic Data
MethPipeThe MethPipe software package is a computational pipeline for analyzing bisulfite sequencing data (BS-seq, WGBS and RRBS).
METISMETIS is a set of serial programs for partitioning graphs, partitioning finite element meshes, and producing fill reducing orderings for sparse matrices. The algorithms implemented in METIS are based on the multilevel recursive-bisection, multilevel k-way, and multi-constraint partitioning schemes.
MicroMagnumMicroMagnum is a fast easy-to-use micromagnetic simulator that runs on CPUs as well as on GPUs using the CUDA platform. It combines the speed and flexibility of C++ together with the usability of Python.
MINCMedical Image NetCDF or MINC isn't netCDF.
MincedMinCED is a program to find Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) in full genomes or environmental datasets such as metagenomes, in which sequence size can be anywhere from 100 to 800 bp.
MiniasmMiniasm is a very fast OLC-based de novo assembler for noisy long reads. It takes all-vs-all read self-mappings (typically by minimap) as input and outputs an assembly graph in the GFA format.
Miniconda2Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern open data science analytics architecture.
Miniconda3Built to complement the rich, open source Python community, the Anaconda platform provides an enterprise-ready data analytics platform that empowers companies to adopt a modern open data science analytics architecture.
minimap2Minimap2 is a fast sequence mapping and alignment program that can find overlaps between long noisy reads, or map long reads or their assemblies to a reference genome optionally with detailed alignment (i.e. CIGAR). At present, it works efficiently with query sequences from a few kilobases to ~100 megabases in length at an error rate ~15%. Minimap2 outputs in the PAF or the SAM format. On limited test data sets, minimap2 is over 20 times faster than most other long-read aligners. It will replace BWA-MEM for long reads and contig alignment.
MiniScrubMiniScrub is a de novo long sequencing read preprocessing method that improves read quality by predicting and removing ('scrubbing') read segments that have a high concentration of errors. Since long read technologies have high error rates, read scrubbing can be used to improve downstream applications such as alignment or assembly.
MinPathMinPath (Minimal set of Pathways) is a parsimony approach for biological pathway reconstructions using protein family predictions, achieving a more conservative, yet more faithful, estimation of the biological pathways for a query dataset.
miRDeep2miRDeep2 is a completely overhauled tool which discovers microRNA genes by analyzing sequenced RNAs
miRhubCandidate miRNA regulatory hub identification pipeline.
miRNAThe BCGSC miRNA Profiling Pipeline produces expression profiles of known miRNAs from BWA-aligned BAM files and generates summary reports and graphs describing the results.
MITREMITRE learns predictive models of patient outcomes from microbiome time-series data in the form of short lists of interpretable rules URL:
mkl-dnnIntel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN)
mkl-servicePython hooks for Intel(R) Math Kernel Library runtime control settings. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
MMseqs2MMseqs2: ultra fast and sensitive search and clustering suite
MOCAT2MOCAT2 (metagenomic analysis toolkit) is a package for analyzing metagenomics datasets. Currently MOCAT2 supports Illumina single- and paired-end reads in raw FastQ format. Using MOCAT2 you can generate taxonomic and functional profiles, as well as assemble reads and predict genes in assembled sequences.
modtoolsA tool set for manipulating MOD and pseudogenomes
MoldenMolden is a package for displaying Molecular Density from the Ab Initio packages GAMESS-UK, GAMESS-US and GAUSSIAN and the Semi-Empirical packages Mopac/Ampac
molmodMolMod is a Python library with many components that are useful to write molecular modeling programs. URL:
MonoAn open source, cross-platform, implementation of C# and the CLR that is binary compatible with Microsoft.NET.
MOOSEThe Multiphysics Object-Oriented Simulation Environment (MOOSE) is a finite-element, multiphysics framework primarily developed by Idaho National Laboratory. It provides a high-level interface to some of the most sophisticated nonlinear solver technology on the planet.
mosdepthFast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
MothurMothur is a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
motifMotif refers to both a graphical user interface (GUI) specification and the widget toolkit for building applications that follow that specification under the X Window System on Unix and other POSIX-compliant systems. It was the standard toolkit for the Common Desktop Environment and thus for Unix.
MotifMakerMotifMaker is a tool for identifying motifs associated with DNA modifications in prokaryotic genomes.
MPCGnu Mpc is a C library for the arithmetic of complex numbers with arbitrarily high precision and correct rounding of the result. It extends the principles of the IEEE-754 standard for fixed precision real floating point numbers to complex numbers, providing well-defined semantics for every operation. At the same time, speed of operation at high precision is a major design goal.
MPFRThe MPFR library is a C library for multiple-precision floating-point computations with correct rounding.
mpi4pyMPI for Python (mpi4py) provides bindings of the Message Passing Interface (MPI) standard for the Python programming language, allowing any Python program to exploit multiple processors.
mpiBLASTmpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. By efficiently utilizing distributed computational resources through database fragmentation, query segmentation, intelligent scheduling, and parallel I/O, mpiBLAST improves NCBI BLAST performance by several orders of magnitude while scaling to hundreds of processors. mpiBLAST is also portable across many different platforms and operating systems.
MPICHMPICH v3.x is an open source high-performance MPI 3.0 implementation. It does not support InfiniBand (use MVAPICH2 with InfiniBand devices).
mpifileutilsMPI-Based File Utilities For Distributed Systems URL:
mpiJavampiJava is an object-oriented Java interface to the standard Message Passing Interface (MPI). The interface was developed as part of the HPJava project, but mpiJava itself does not assume any special extensions to the Java language - it should be portable to any platform that provides compatible Java-development and native MPI environments.
mpiPmpiP is a lightweight profiling library for MPI applications. Because it only collects statistical information about MPI functions, mpiP generates considerably less overhead and much less data than tracing tools. All the information captured by mpiP is task-local. It only uses communication during report generation, typically at the end of the experiment, to merge results from all of the tasks into one output file. URL:
mpmathmpmath can be used as an arbitrary-precision substitute for Python's float/complex types and math/cmath modules, but also does much more advanced mathematics. Almost any calculation can be performed just as well at 10-digit or 1000-digit precision, with either real or complex numbers, and in many cases mpmath implements efficient algorithms that scale well for extremely high precision work.
MrBayesMrBayes is a program for the Bayesian estimation of phylogeny.
msprimemsprime is a coalescent simulator and library for processing tree-based genetic data.
MuaveMauve is a software package that attempts to align orthologous and xenologous regions among two or more genome sequences that have undergone both local and large-scale changes.
MultiQCAggregate results from bioinformatics analyses across many samples into a single report. MultiQC searches a given directory for analysis logs and compiles a HTML report. It's a general use tool, perfect for summarising the output from numerous bioinformatics tools.
MultiwfnMultiwfn is an extremely powerful program for realizing electronic wavefunction analysis, which is a key ingredient of quantum chemistry. Multiwfn is free, open-source, highly efficient, very user-friendly and flexible, and it supports almost all of the most important wavefunction analysis methods. URL:
MUMmerMUMmer is a system for rapidly aligning entire genomes, whether in complete or draft form. AMOS makes use of it.
mummichogMummichog is a Python program for analyzing data from high throughput, untargeted metabolomics. It leverages the organization of metabolic networks to predict functional activity directly from feature tables, bypassing metabolite identification.
MUMPSA parallel sparse direct solver URL:
munsellUtilities for Using Munsell Colours
muParsermuParser is an extensible high performance math expression parser library written in C++. It works by transforming a mathematical expression into bytecode and precalculating constant parts of the expression.
MuPeXIMuPeXI: Mutant Peptide eXtractor and Informer. Given a list of somatic mutations (VCF file) as input, MuPeXI returns a table containing all mutated peptides (neo-peptides) of user-defined lengths, along with several pieces of information relevant for identifying which of these neo-peptides are likely to serve as neo-epitopes.
MUSCLEMUSCLE is one of the best-performing multiple alignment programs according to published benchmark tests, with accuracy and speed that are consistently better than CLUSTALW. MUSCLE can align hundreds of sequences in seconds. Most users learn everything they need to know about MUSCLE in a few minutes; only a handful of command-line options are needed to perform common alignment tasks.
myAnaconda2A TAMU HPRC module to help users maintain their own virtual environments in $SCRATCH/myAnaconda2
myAnaconda3A TAMU HPRC module to help users maintain their own virtual environments in $SCRATCH/myAnaconda3
myEBUser EasyBuild built modules in $SCRATCH/eb
mygenePython Client for MyGene.Info services. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
myHadoopmyHadoop: wrapper for running Hadoop on an HPC cluster. See the sample job script, $MYHADOOP_HOME/example/lsf.qsub
myPythonA TAMU HPRC module to help users maintain their own virtual environments in $SCRATCH/myPython
myRA TAMU HPRC module to help users maintain their own R libraries in $SCRATCH/myR
myriadSimple distributed computing.
MySQL-pythonMySQL database connector for Python
NAMDNAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.
NanonetNanonet provides recurrent neural network basecalling for Oxford Nanopore MinION data. It represents the first generation of such a basecaller from Oxford Nanopore Technologies, and is provided as a technology demonstrator.
nanopolishA nanopore consensus algorithm using a signal-level hidden Markov model.
NanoSimNanoSim is a fast and scalable read simulator that captures the technology-specific features of ONT data, and allows for adjustments upon improvement of nanopore sequencing technology.
NASMNASM: General-purpose x86 assembler
NCBI-ToolkitThe NCBI Toolkit is a collection of utilities developed for the production and distribution of GenBank, Entrez, BLAST, and related services by the National Center for Biotechnology Information.
ncbi-vdbThe SRA Toolkit and SDK from NCBI is a collection of tools and libraries for using data in the INSDC Sequence Read Archives. URL:
NCCLThe NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multi-node collective communication primitives that are performance optimized for NVIDIA GPUs.
ncduNcdu is a disk usage analyzer with an ncurses interface. It is designed to find space hogs on a remote server where you don't have an entire graphical setup available, but it is a useful tool even on regular desktop systems. Ncdu aims to be fast, simple and easy to use, and should be able to run in any minimal POSIX-like environment with ncurses installed.
NCLNCL is an interpreted language designed specifically for scientific data analysis and visualization.
NCOmanipulates and analyzes data stored in netCDF-accessible formats, including DAP, HDF4, and HDF5
ncompressCompress is a fast, simple LZW file compressor. Compress does not have the highest compression rate, but it is one of the fastest programs to compress data. Compress is the de facto standard in the UNIX community for compressing files.
ncursesThe Ncurses (new curses) library is a free software emulation of curses in System V Release 4.0, and more. It uses Terminfo format, supports pads and color and multiple highlights and forms characters and function-key mapping, and has all the other SYSV-curses enhancements over BSD Curses.
ncviewNcview is a visual browser for netCDF format files. Typically you would use ncview to get a quick and easy, push-button look at your netCDF files. You can view simple movies of the data, view along various dimensions, take a look at the actual data values, change color maps, invert the data, etc.
netCDFNetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. URL:
netcdf4-pythonPython/numpy interface to netCDF.
netCDF-C++NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
netCDF-C++4NetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. URL:
netCDF-FortranNetCDF (network Common Data Form) is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. URL:
NetLogoNetLogo is a multi-agent programmable modeling environment. It is used by tens of thousands of students, teachers and researchers worldwide. It also powers HubNet participatory simulations. It is authored by Uri Wilensky and developed at the CCL.
netMHCpanThe NetMHCpan software predicts binding of peptides to any known MHC molecule using artificial neural networks (ANNs).
nettleNettle is a cryptographic library that is designed to fit easily in more or less any context: In crypto toolkits for object-oriented languages (C++, Python, Pike, ...), in applications like LSH or GNUPG, or even in kernel space.
networkxNetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. URL:
NEURONEmpirically-based simulations of neurons and networks of neurons.
nglviewIPython widget to interactively view molecular structures and trajectories. URL:
NGSNGS is a new, domain-specific API for accessing reads, alignments and pileups produced from Next Generation Sequencing. URL:
NGSadmixNGSadmix is a tool for finding admixture proportions from NGS data, based on genotype likelihoods.
NGSUtilsNGSUtils is a suite of software tools for working with next-generation sequencing datasets
NiBabelNiBabel provides read/write access to some common medical and neuroimaging file formats, including: ANALYZE (plain, SPM99, SPM2 and later), GIFTI, NIfTI1, NIfTI2, MINC1, MINC2, MGH and ECAT as well as Philips PAR/REC. We can read and write Freesurfer geometry, and read Freesurfer morphometry and annotation files. There is some very limited support for DICOM. NiBabel is the successor of PyNIfTI. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
NIfTINiftilib is a set of i/o libraries for reading and writing files in the nifti-1 data format.
NilearnNilearn is a Python module for fast and easy statistical learning on NeuroImaging data. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
NimNim is a systems and applications programming language.
NinjaNinja is a small build system with a focus on speed.
NipypeNipype is a Python project that provides a uniform interface to existing neuroimaging software and facilitates interaction between these packages within a single workflow.
NLoptNLopt is a free/open-source library for nonlinear optimization, providing a common interface for a number of different free optimization routines available online as well as original implementations of various other algorithms.
nodejsNode.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices. URL:
NormalizNormaliz is an open source tool for computations in affine monoids, vector configurations, lattice polytopes, and rational cones.
nose-parameterizedParameterized testing with any Python test framework.
nsegNSEG is used to mask nucleic acid sequences, needed by RepeatScout
NSPRNetscape Portable Runtime (NSPR) provides a platform-neutral API for system level and libc-like functions.
NSSNetwork Security Services (NSS) is a set of libraries designed to support cross-platform development of security-enabled client and server applications.
numactlThe numactl program allows you to run your application program on specific CPUs and memory nodes. It does this by supplying a NUMA memory policy to the operating system before running your program. The libnuma library provides convenient ways for you to add NUMA memory policies into your own program.
numbaNumba is an Open Source NumPy-aware optimizing compiler for Python sponsored by Continuum Analytics, Inc. It uses the remarkable LLVM compiler infrastructure to compile Python syntax to machine code.
numexprThe numexpr package evaluates multiple-operator array expressions many times faster than NumPy can. It accepts the expression as a string, analyzes it, rewrites it more efficiently, and compiles it on the fly into code for its internal virtual machine (VM). Due to its integrated just-in-time (JIT) compiler, it does not require a compiler at runtime. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
numpyNumPy is the fundamental package for scientific computing with Python. It contains among other things: a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, useful linear algebra, Fourier transform, and random number capabilities. Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.
NWChemNWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters. NWChem software can handle: biomolecules, nanostructures, and solid-state; from quantum to classical, and all combinations; Gaussian basis functions or plane-waves; scaling from one to thousands of processors; properties and relativity.
OasesOases is a de novo transcriptome assembler designed to produce transcripts from short read sequencing technologies, such as Illumina, SOLiD, or 454 in the absence of any genomic assembly.
OBIToolsThe OBITools programs aim to help you manipulate various data and sequence files in a convenient way using the Unix command line interface. They follow the standard Unix interface for command line programs, allowing you to chain a set of commands using the pipe mechanism.
OCamlOCaml is a general purpose industrial-strength programming language with an emphasis on expressiveness and safety. Developed for more than 20 years at Inria it benefits from one of the most advanced type systems and supports functional, imperative and object-oriented styles of programming.
OctaveGNU Octave is a high-level interpreted language, primarily intended for numerical computations.
OMBOSU (MPI) Micro-Benchmarks
ont_albacoreAlbacore performs real-time basecalls on Oxford Nanopore Technologies sequencing data.
ont-fast5-apiOxford Nanopore Technologies fast5 API software
OOF2OOF: Finite Element Analysis of Microstructures
OOF3DOOF: Finite Element Analysis of Microstructures
OPARI2OPARI2, the successor of Forschungszentrum Juelich's OPARI, is a source-to-source instrumentation tool for OpenMP and hybrid codes. It surrounds OpenMP directives and runtime library calls with calls to the POMP2 measurement interface. URL:
OpenBabelOpen Babel is a chemical toolbox designed to speak the many languages of chemical data. It's an open, collaborative project allowing anyone to search, convert, analyze, or store data from molecular modeling, chemistry, solid-state materials, biochemistry, or related areas.
OpenBLASOpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
OpenColorIOOpenColorIO (OCIO) is a complete color management solution geared towards motion picture production with an emphasis on visual effects and computer animation.
OpenCVOpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products.
OpenEXROpenEXR is a high dynamic-range (HDR) image file format developed by Industrial Light & Magic for use in computer imaging applications
OpenFASTOpenFAST is an open-source wind turbine simulation tool that was established in 2017 with the FAST v8 code as its starting point (see FAST v8 and the transition to OpenFAST). OpenFAST is a multi-physics, multi-fidelity tool for simulating the coupled dynamic response of wind turbines.
OpenFOAMOpenFOAM is a free, open source CFD software package. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.
OpenFOAM-ExtendOpenFOAM is a free, open source CFD software package. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.
OpenGLOriginally developed by Silicon Graphics in the early '90s, OpenGL® has become the most widely-used open graphics standard in the world. NVIDIA supports OpenGL and a complete set of OpenGL extensions, designed to give you maximum performance on our GPUs.
OpenImageIOOpenImageIO is a library for reading and writing images, and a bunch of related classes, utilities, and applications.
OpenJPEGOpenJPEG is an open-source JPEG 2000 codec written in C language. It has been developed in order to promote the use of JPEG 2000, a still-image compression standard from the Joint Photographic Experts Group (JPEG). Since May 2015, it is officially recognized by ISO/IEC and ITU-T as a JPEG 2000 Reference Software.
OpenKIM-APIOpen Knowledgebase of Interatomic Models. OpenKIM is an API and a collection of interatomic models (potentials) for atomistic simulations. It is a library that can be used by simulation programs to get access to the models in the OpenKIM database. This EasyBuild only installs the API, the models have to be installed by the user by running kim-api-collections-management install user MODELNAME or kim-api-collections-management install user OpenKIM to install them all.
OpenMCOpenMC is a Monte Carlo particle transport simulation code focused on neutron criticality calculations. It is capable of simulating 3D models based on constructive solid geometry with second-order surfaces. OpenMC supports either continuous-energy or multi-group transport.
OpenMPIThe Open MPI Project is an open source MPI-3 implementation.
OpenPGMOpenPGM is an open source implementation of the Pragmatic General Multicast (PGM) specification in RFC 3208. PGM is a reliable and scalable multicast protocol that enables receivers to detect loss, request retransmission of lost data, or notify an application of unrecoverable loss. PGM is a receiver-reliable protocol, which means the receiver is responsible for ensuring all data is received, absolving the sender of reception responsibility.
OpenPhaseOpenPhase is the open source software project targeted at the phase field simulations of complex scientific problems involving microstructure formation in systems undergoing first order phase transformation.
openpyxlA Python library to read/write Excel 2010 xlsx/xlsm files URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
OpenSeesThe Open System for Earthquake Engineering Simulation (OpenSees) is a software framework for simulating the seismic response of structural and geotechnical systems.
OpenSSLThe OpenSSL Project is a collaborative effort to develop a robust, commercial-grade, full-featured, and Open Source toolchain implementing the Secure Sockets Layer (SSL v2/v3) and Transport Layer Security (TLS v1) protocols as well as a full-strength general purpose cryptography library.
OPERAAn optimal genome scaffolding program
OptiTypeOptiType is a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate 4-digit HLA genotyping predictions from NGS data by simultaneously selecting all major and minor HLA Class I alleles.
ORCAORCA is a flexible, efficient and easy-to-use general purpose tool for quantum chemistry with specific emphasis on spectroscopic properties of open-shell molecules. It features a wide variety of standard quantum chemical methods ranging from semiempirical methods to DFT to single- and multireference correlated ab initio methods. It can also treat environmental and relativistic effects. URL:
ORCA-HPRC-LicenseLicense terms for using ORCA on TAMU HPRC clusters
OrfMA simple and not slow open reading frame (ORF) caller.
OrthoFinderOrthoFinder is a program for identifying orthologous protein sequence families.
OrthoMCLOrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It provides not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families. So it serves as an important utility for automated eukaryotic genome annotation.
OSPRayOSPRay is an open source, scalable, and portable ray tracing engine for high-performance, high-fidelity visualization on Intel® Architecture CPUs.
OSPREYOSPREY is a suite of programs for computational structure-based protein design.
OVITOOVITO is a scientific visualization and analysis software for atomistic simulation data
p11-kitProvides a way to load and enumerate PKCS#11 modules. Provides a standard configuration setup for installing PKCS#11 modules in such a way that they're discoverable. Also solves problems with coordinating the use of PKCS#11 by different components or libraries living in the same process.
P3DFFTParallel Three-Dimensional Fast Fourier Transforms, dubbed P3DFFT, as well as its extension P3DFFT++, is a library for large-scale computer simulations on parallel platforms. This project was initiated at San Diego Supercomputer Center (SDSC) at UC San Diego by its main author Dmitry Pekurovsky, Ph.D.
p4estp4est is a C library to manage a collection (a forest) of multiple connected adaptive quadtrees or octrees in parallel. URL:
p4vaspVisualization suite for VASP
p7zipp7zip is a quick port of 7z.exe and 7za.exe (command line version of 7zip) for Unix. 7-Zip is a file archiver with highest compression ratio.
PAGANPAGAN is a general-purpose method for the alignment of sequence graphs. PAGAN is based on the phylogeny-aware progressive alignment algorithm and uses graphs to describe the uncertainty in the presence of characters at certain sequence positions.
PAGITTools to automatically generate high-quality sequence by ordering contigs, closing gaps, correcting sequence errors and transferring annotation.
PAMLPAML is a package of programs for phylogenetic analyses of DNA or protein sequences using maximum likelihood.
pandaspandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
PANDAseqPANDASEQ is a program to align Illumina reads, optionally with PCR primers embedded in the sequence, and reconstruct an overlapping sequence.
PandocIf you need to convert files from one markup format into another, pandoc is your swiss-army knife
PangoPango is a library for laying out and rendering of text, with an emphasis on internationalization. Pango can be used anywhere that text layout is needed, though most of the work on Pango so far has been done in the context of the GTK+ widget toolkit. Pango forms the core of text and font handling for GTK+-2.x.
PangommThe Pangomm package provides a C++ interface to Pango.
PAPIPAPI provides the tool designer and application engineer with a consistent interface and methodology for use of the performance counter hardware found in most major microprocessors. PAPI enables software engineers to see, in near real time, the relation between software performance and processor events. In addition Component PAPI provides access to a collection of components that expose performance measurement opportunities across the hardware and software stack. URL:
parallelparallel: Build and execute shell commands in parallel URL:
parasailparasail is a SIMD C (C99) library containing implementations of the Smith-Waterman (local), Needleman-Wunsch (global), and semi-global pairwise sequence alignment algorithms.
ParaViewParaView is a scientific parallel visualizer. URL:
ParFlowParFlow is an integrated, parallel watershed model that makes use of high-performance computing to simulate surface and subsurface fluid flow.
ParMETISParMETIS is an MPI-based parallel library that implements a variety of algorithms for partitioning unstructured graphs, meshes, and for computing fill-reducing orderings of sparse matrices. ParMETIS extends the functionality provided by METIS and includes routines that are especially suited for parallel AMR computations and large scale numerical simulations. The algorithms implemented in ParMETIS are based on the parallel multilevel k-way graph-partitioning, adaptive repartitioning, and parallel multi-constrained partitioning schemes.
ParMGridGenParMGridGen is an MPI-based parallel library that is based on the serial package MGridGen, that implements (serial) algorithms for obtaining a sequence of successive coarse grids that are well-suited for geometric multigrid methods.
ParsnpParsnp is a command-line-tool for efficient microbial core genome alignment and SNP detection. Parsnp was designed to work in tandem with Gingr, a flexible platform for visualizing genome alignments and phylogenetic trees; both Parsnp and Gingr form part of the Harvest suite.
PASAPASA, acronym for Program to Assemble Spliced Alignments (and pronounced 'pass-uh'), is a eukaryotic genome annotation tool that exploits spliced alignments of expressed transcript sequences to automatically model gene structures, and to maintain gene structure annotation consistent with the most recently available experimental sequence data. URL:
PasteTools for using a Web Server Gateway Interface stack
PasteDeployLoad, configure, and compose WSGI applications and servers
PasteScriptA pluggable command-line frontend, including commands to setup package file layouts
patchelfPatchELF is a small utility to modify the dynamic linker and RPATH of ELF executables.
is a Python library implementing path objects as first-class entities, allowing common operations on files to be invoked on those path objects directly.
pbalignpbalign aligns PacBio reads to reference sequences, filters aligned reads according to user-specific filtering criteria, and converts the output to either the SAM format or PacBio Compare HDF5 (e.g., .cmp.h5) format. The output Compare HDF5 file will be compatible with Quiver if --forQuiver option is specified.
pbbamThe pbbam software package provides components to create, query, & edit PacBio BAM files and associated indices.
PBSuiteSoftware for Long-Read Sequencing Data from PacBio
PCAngsdPCAngsd estimates the covariance matrix for low depth NGS data in an iterative procedure based on genotype likelihoods and is able to perform multiple population genetic analyses in heterogeneous populations.
PCLThe Point Cloud Library (PCL) is a standalone, large scale, open project for 2D/3D image and point cloud processing.
PCMSolverAn API for the Polarizable Continuum Model.
PCREThe PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5.
PCRE2The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. URL:
PDTProgram Database Toolkit (PDT) is a framework for analyzing source code written in several programming languages and for making rich program knowledge accessible to developers of static and dynamic analysis tools. PDT implements a standard program representation, the program database (PDB), that can be accessed in a uniform way through a class library supporting common PDB operations. URL:
PeakRangerPeakRanger: A multi-purpose ultrafast peak caller for ChIP Seq data
PeakSplitterSubdivision of ChIP-seq/ChIP-chip regions into discrete signal peaks.
PEARPEAR is an ultrafast, memory-efficient and highly accurate paired-end read merger. It is fully parallelized and can run with as little as a few kilobytes of memory.
PerlLarry Wall's Practical Extraction and Report Language
Perl_tamuPerl_tamu contains Perl modules that are not installed in the EasyBuild module: GD, GD::Graph, GD::TextUtil, PerlIO::gzip, File::Spec::Link, Parallel::ForkManager, XML::NamespaceSupport, XML::SAX, XML::Lite, XML::LibXML, Array::Utils, Exporter::Tiny, List::MoreUtils, Math::Counting, CPAN::Meta, inc::latest, Module::Build. Note: new modules may be added to the list when new modules are installed.
PETScPETSc, pronounced PET-see (the S is silent), is a suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations.
petsc4pypetsc4py are Python bindings for PETSc, the Portable, Extensible Toolchain for Scientific Computation.
PGDSpiderAn automated data conversion tool for connecting population genetics and genomics programs
PGIC, C++ and Fortran compilers from The Portland Group - PGI URL:
PHASTPHAST is a freely available software package for comparative and evolutionary genomics.
PhiPackThe Phi Test is a simple, rapid, and statistically efficient test for recombination. Its performance is comparable with coalescent based methods like LDHat, and yet it can be applied to large alignments with hundreds of sequences.
PhobiusPrediction of transmembrane topology and signal peptides from the amino acid sequence of a protein. URL:
phonopyPhonopy is an open source package of phonon calculations based on the supercell approach. URL:
PHYLIPPHYLIP is a free package of programs for inferring phylogenies.
PhyloNetworksPhyloNetworks is a Julia package for the manipulation, visualization, inference of phylogenetic networks, and their use for trait evolution.
PhyloSNPPhyloSNP is designed to take SNP data files (.csv and .vcf) and generate phylogenetic trees from the provided data.
PhyMLPhyML is software that estimates maximum likelihood phylogenies from alignments of nucleotide or amino acid sequences. The main strength of PhyML lies in the large number of substitution models coupled to various options to search the space of phylogenetic tree topologies, going from very fast and efficient methods to slower but generally more accurate approaches. PhyML was designed to process moderate to large data sets. In theory, alignments with up to 4,000 sequences of 2,000,000 characters each can be processed.
picardA set of tools (in Java) for working with next generation sequencing data in the BAM format.
PICRUStPICRUSt (pronounced “pie crust”) is a bioinformatics software package designed to predict metagenome functional content from marker gene (e.g., 16S rRNA) surveys and full genomes.
pigzpigz, which stands for parallel implementation of gzip, is a fully functional replacement for gzip that exploits multiple processors and multiple cores to the hilt when compressing data. pigz was written by Mark Adler, and uses the zlib and pthread libraries.
PILThe Python Imaging Library (PIL) adds image processing capabilities to your Python interpreter. This library supports many file formats, and provides powerful image processing and graphics capabilities.
PillowPillow is the 'friendly PIL fork' by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lundh and Contributors. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
Pillow-SIMDPillow is the 'friendly PIL fork' by Alex Clark and Contributors. PIL is the Python Imaging Library by Fredrik Lundh and Contributors. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
PilonPilon is an automated genome assembly improvement and variant detection tool
pipThe PyPA recommended tool for installing Python packages.
pixmanPixman is a low-level software library for pixel manipulation, providing features such as image compositing and trapezoid rasterization. Important users of pixman are the cairo graphics library and the X server.
pizzlyPizzly is a program for detecting gene fusions from RNA-Seq data of cancer samples.
pkgconfigpkgconfig is a Python module to interface with the pkg-config command line tool Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pkg-configpkg-config is a helper tool used when compiling applications and libraries. It helps you insert the correct compiler options on the command line so an application can use gcc -o test test.c `pkg-config --libs --cflags glib-2.0` for instance, rather than hard-coding values on where to find glib (or other libraries).
PlanetWRFThe Planetary Weather Research and Forecasting model (planetWRF) is an open-source general purpose numerical model for planetary atmospheres research.
PlatanusPLATform for Assembling NUcleotide Sequences
plcplc is the public Planck Likelihood Code. It provides C and Fortran libraries that allow users to compute the log likelihoods of the temperature, polarization, and lensing maps. Optionally, it also provides a python version of this library, as well as tools to modify the predetermined options for some likelihoods (e.g. changing the high-ell and low-ell lmin and lmax values of the temperature). URL:
PLINKPLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.
PLINKSEQPLINK/SEQ is an open-source C/C++ library for working with human genetic variation data. The specific focus is to provide a platform for analytic tool development for variation data from large-scale resequencing and genotyping projects, particularly whole-exome and whole-genome studies. It is independent of (but designed to be complementary to) the existing PLINK package.
ploidyNGSVisually exploring ploidy with Next Generation Sequencing data
PloticusPloticus is a free GPL software utility that can produce various types of plots and graphs
plotly.pyAn open-source, interactive graphing library for Python URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
PLUMEDPLUMED is an open source library for free energy calculations in molecular systems which works together with some of the most popular molecular dynamics engines. Free energy calculations can be performed as a function of many order parameters with a particular focus on biological problems, using state of the art methods such as metadynamics, umbrella sampling and Jarzynski-equation based steered MD. The software, written in C++, can be easily interfaced with both Fortran and C/C++ codes.
PLYPLY is yet another implementation of lex and yacc for Python.
plyr Tools for Splitting, Applying and Combining Data
PMIxProcess Management for Exascale Environments. PMI Exascale (PMIx) represents an attempt to provide an extended version of the PMI standard specifically designed to support clusters up to and including exascale sizes. The overall objective of the project is not to branch the existing pseudo-standard definitions - in fact, PMIx fully supports both of the existing PMI-1 and PMI-2 APIs - but rather to (a) augment and extend those APIs to eliminate some current restrictions that impact scalability, and (b) provide a reference implementation of the PMI-server that demonstrates the desired level of scalability.
PnetCDFPnetCDF is a high-performance parallel I/O library for accessing files in formats compatible with Unidata's NetCDF, specifically CDF-1, 2, and 5. The CDF-5 file format, an extension of CDF-2, supports unsigned data types and uses 64-bit integers to allow users to define large dimensions, attributes, and variables (> 2B array elements).
pompiToolchain with PGI C, C++ and Fortran compilers, alongside OpenMPI. URL:
popplerPoppler is a PDF rendering library based on the xpdf-3.0 code base.
poptPopt is a C library for parsing command line parameters.
PorechopAdapter trimmer for Oxford Nanopore reads
PoretoolsA toolkit for working with nanopore sequencing data from Oxford Nanopore.
PosiGenePosiGene is a tool that (i) detects positively selected genes on genome-scale, (ii) allows analysis of specific evolutionary branches, (iii) can be used in arbitrary species contexts and (iv) offers visualization of the candidates.
PostgreSQLPostgreSQL is a powerful, open source object-relational database system. It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures (in multiple languages). It includes most SQL:2008 data types, including INTEGER, NUMERIC, BOOLEAN, CHAR, VARCHAR, DATE, INTERVAL, and TIMESTAMP. It also supports storage of binary large objects, including pictures, sounds, or video. It has native programming interfaces for C/C++, Java, .Net, Perl, Python, Ruby, Tcl, ODBC, among others, and exceptional documentation.
POTPOT (Python Optimal Transport) is a Python library that provides several solvers for optimization problems related to Optimal Transport for signal processing, image processing and machine learning.
POTIONPOTION (POsitive selecTION) is an open source, modular and end-to-end software for genomic scale detection of positive Darwinian selection in groups of homologous coding sequences through estimation of dN/dS ratios.
POV-RayThe Persistence of Vision Raytracer, or POV-Ray, is a ray tracing program which generates images from a text-based scene description, and is available for a variety of computer platforms. POV-Ray is a high-quality, Free Software tool for creating stunning three-dimensional graphics. The source code is available for those wanting to do their own ports.
pplacerPplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. Pplacer is designed to be fast, to give useful information about uncertainty, and to offer advanced visualization and downstream analysis.
PRANKPRANK is a probabilistic multiple alignment program for DNA, codon and amino-acid sequences. PRANK is based on a novel algorithm that treats insertions correctly and avoids over-estimation of the number of deletion events.
preseqSoftware for predicting library complexity and genome coverage in high-throughput sequencing.
Primer3Primer3 is a widely used program for designing PCR primers (PCR = 'Polymerase Chain Reaction'). PCR is an essential and ubiquitous tool in genetics and molecular biology. Primer3 can also design hybridization probes and sequencing primers.
PRINSEQA bioinformatics tool to PRe-process and show INformation of SEQuence data.
printprotoX.org PrintProto protocol headers.
PRISMS-PFPRISMS-PF is a powerful, massively parallel finite element code for conducting phase field and other related simulations of microstructural evolution. URL:
ProdigalProdigal (Prokaryotic Dynamic Programming Genefinding Algorithm) is a microbial (bacterial and archaeal) gene finding program developed at Oak Ridge National Laboratory and the University of Tennessee.
profilemyjobsprofilemyjobs is a script that provides coarse-grained profiling of batch jobs
progressbar33Text progress bar library for Python.
ProgressiveCactusProgressive Cactus is a whole-genome alignment package.
PROJProgram proj is a standard Unix filter function which converts geographic longitude and latitude coordinates into cartesian coordinates
ProjectQAn open source software framework for quantum computing
prokkaProkka is a software tool for the rapid annotation of prokaryotic genomes.
prompt-toolkitprompt_toolkit is a Python library for building powerful interactive command lines and terminal applications.
ProovreadLarge-scale high accuracy PacBio correction through iterative short read consensus.
protobufGoogle Protocol Buffers
protobuf-pythonPython Protocol Buffers runtime library.
PROVEANPROVEAN (Protein Variation Effect Analyzer) is a software tool which predicts whether an amino acid substitution or indel has an impact on the biological function of a protein. PROVEAN is useful for filtering sequence variants to identify nonsynonymous or indel variants that are predicted to be functionally important.
pscomParaStation is a robust and efficient cluster middleware, consisting of a high-performance communication layer (MPI) and a sophisticated management layer.
PSI4PSI4 is an open-source suite of ab initio quantum chemistry programs designed for efficient, high-accuracy simulations of a variety of molecular properties. We can routinely perform computations with more than 2500 basis functions running serially or in parallel.
psmcThis software package infers population size history from a diploid sequence using the Pairwise Sequentially Markovian Coalescent (PSMC) model.
psmpiParaStation MPI, as part of the ParaStationV5 cluster suite, provides robust, flexible and scalable communication and management functions for Linux-based compute clusters. Besides parallel applications based on the Message Passing Interface specification, version 2 (MPI2), serial applications are also supported.
PSolverPoisson Solver from the BigDFT code compiled as a standalone library.
psrecordpsrecord is a small utility that uses the psutil library to record the CPU and memory activity of a process.
PSSpredPSSpred (Protein Secondary Structure prediction) is a simple neural network training algorithm for accurate protein secondary structure prediction. URL:
pstoeditpstoedit translates PostScript and PDF graphics into other vector formats
psutilA cross-platform process and system utilities module for Python URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
psycopg2Psycopg is the most popular PostgreSQL adapter for the Python programming language.
ptemceeptemcee, pronounced "tem-cee", is a fork of Daniel Foreman-Mackey's wonderful emcee to implement parallel tempering more robustly. If you're trying to characterise awkward, multi-modal probability distributions, then ptemcee is your friend. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pubtcrsThis repository contains C++ source code for the TCR clustering and correlation analyses described in the manuscript "Human T cell receptor occurrence patterns encode immune history, genetic background, and receptor specificity" by William S DeWitt III, Anajane Smith, Gary Schoch, John A Hansen, Frederick A Matsen IV and Philip Bradley, available on bioRxiv. URL:
pullseqUtility program for extracting sequences from a fasta/fastq file
Purge_HaplotigsPipeline to help with curating heterozygous diploid genome assemblies (for instance when assembling using FALCON or FALCON-unzip).
pylibrary with cross-python path, ini-parsing, io, code, log facilities
PyAPS3Python 3 Atmospheric Phase Screen URL:
pybedtoolspybedtools wraps and extends BEDTools and offers feature-level manipulations from within Python.
pyBigWigA python extension, written in C, for quick access to bigBed files and access to and creation of bigWig files.
pybind11pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
PyCairoPython bindings for the cairo library
PyCogentPyCogent is a software library for genomic biology. It is a fully integrated and thoroughly tested framework for: controlling third-party applications; devising workflows; querying databases; conducting novel probabilistic analyses of biological sequence evolution; and generating publication quality graphics.
pycparserC parser in Python
PyCUDAPyCUDA lets you access Nvidia’s CUDA parallel computation API from Python.
PydusaPydusa is a package for parallel programming using Python. It contains a module for doing MPI programming in Python. We have added parallel solver packages such as Parallel SuperLU for solving sparse linear systems.
pyEGA3A basic Python-based EGA download client URL:
pyexcel_xlsxpyexcel-xlsx is a tiny wrapper library to read, manipulate and write data in xlsx and xlsm format using openpyxl.
pyfastaStores a flattened version of the fasta file without spaces or headers and uses either a mmap of numpy binary format or fseek/fread so the sequence data is never read into memory. URL:
pyFFTWA pythonic wrapper around FFTW, the FFT library, presenting a unified interface for all the supported transforms. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pyfitsThe PyFITS module is a Python library providing access to FITS (Flexible Image Transport System)
PyFMIPyFMI is a package for loading and interacting with Functional Mock-Up Units (FMUs), which are compiled dynamic models compliant with the Functional Mock-Up Interface (FMI)
PygmentsPygments is a syntax highlighting package written in Python.
PyGObjectPython Bindings for GLib/GObject/GIO/GTK+
pygribPython interface for reading and writing GRIB data URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
PyGTKPyGTK lets you easily create programs with a graphical user interface using the Python programming language.
pyhdfPython wrapper around the NCSA HDF version 4 library Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pyironAn integrated development environment (IDE) for computational materials science. URL:
Pyke3Pyke introduces a form of Logic Programming (inspired by Prolog) to the Python community by providing a knowledge-based inference engine (expert system) written in 100% Python.
PylintPylint is a tool that checks for errors in Python code, tries to enforce a coding standard and looks for code smells. It can also look for certain type errors, suggest how particular blocks can be refactored, and report details about the code's complexity. URL:
py-lmdbUniversal Python binding for the LMDB 'Lightning' Database
PylonsPylons Web Framework
pymatgenPython Materials Genomics is a robust materials analysis code that defines core object representations for structures and molecules with support for many electronic structure codes.
PyNASTPyNAST is a reimplementation of the NAST sequence aligner, which has become a popular tool for adding new 16S rRNA sequences to existing 16S rRNA alignments. This reimplementation is more flexible, faster, and easier to install and maintain than the original NAST implementation.
PyomoPyomo is a Python-based open-source software package that supports a diverse set of optimization capabilities for formulating and analyzing optimization models.
PyOpenGLPyOpenGL is the most common cross platform Python binding to OpenGL and related APIs.
pyOpenSSLHigh-level wrapper around a subset of the OpenSSL library.
PyPhlAnTools to use with GraPhlAn
pyprojPython interface to PROJ4 library for cartographic transformations Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pyqipyqi (canonically pronounced pie chee) is a Python framework designed to support wrapping general commands in multiple types of interfaces, including at the command line, HTML, and API levels.
PyQtPyQt is a set of Python v2 and v3 bindings for Digia's Qt application framework.
PyQt5PyQt5 is a set of Python bindings for v5 of the Qt application framework from The Qt Company.
PyRADPyRAD is a pipeline to assemble de novo RADseq loci with the aim of optimizing coverage across phylogenetic datasets. It uses a wrapper around an alignment-clustering algorithm, which allows for indel variation within and between samples, as well as for incomplete overlap among reads (e.g. paired-end)
PyRETISPyRETIS is a Python library for rare event molecular simulations with emphasis on methods based on transition interface sampling and replica exchange transition interface sampling.
PyrexPyrex - a Language for Writing Python Extension Modules
PysamPysam is a python module for reading and manipulating Samfiles. It's a lightweight wrapper of the samtools C-API. Pysam also includes an interface for tabix. URL:
pyScafpyScaf orders contigs from genome assemblies utilising several types of information
PySCFPySCF is an open-source collection of electronic structure modules powered by Python. URL:
pysnptoolsPySnpTools is a library for reading and manipulating genetic data.
pysqlitepysqlite is an interface to the SQLite 3.x embedded relational database engine. It is almost fully compliant with the Python database API version 2.0 and also exposes the unique features of SQLite.
PyStanPython interface to Stan, a package for Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo.
PyTablesPyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package. It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code (generated using Cython), makes it a fast, yet extremely easy to use tool for interactively browsing, processing and searching very large amounts of data. One important feature of PyTables is that it optimizes memory and disk resources so that data takes much less space (especially if on-the-fly compression is used) than other solutions such as relational or object-oriented databases. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
pytestpytest: simple powerful testing with Python
PythonPython is a programming language that lets you work more quickly and integrate your systems more effectively.
python-hostlistPython module for hostlist handling
python-igraphPython interface to the igraph high performance graph library, primarily aimed at complex network research and analysis.
PyTorchTensors and Dynamic neural networks in Python with strong GPU acceleration. PyTorch is a deep learning framework that puts Python first. URL:
PyVCFA VCF parser for Python
PyYAMLPyYAML is a YAML parser and emitter for the Python programming language. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
PyZMQPython bindings for ZeroMQ
QCATaking a hint from the similarly-named Java Cryptography Architecture, QCA aims to provide a straightforward and cross-platform crypto API, using Qt datatypes and conventions. QCA separates the API from the implementation, using plugins known as Providers. The advantage of this model is to allow applications to avoid linking to or explicitly depending on any particular cryptographic library. This allows one to easily change or upgrade crypto implementations without even needing to recompile the application! QCA should work everywhere Qt does, including Windows/Unix/MacOSX.
qcintlibcint is an open source library for analytical Gaussian integrals. qcint is an optimized libcint branch for the x86-64 platform. URL:
QDDA user-friendly program to select microsatellite markers and design primers from large sequencing projects.
QGISQGIS is a user friendly Open Source Geographic Information System (GIS)
QhullQhull computes the convex hull, Delaunay triangulation, Voronoi diagram, halfspace intersection about a point, furthest-site Delaunay triangulation, and furthest-site Voronoi diagram. The source code runs in 2-d, 3-d, 4-d, and higher dimensions. Qhull implements the Quickhull algorithm for computing the convex hull.
QiimeQIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data. QIIME is designed to take users from raw sequencing data generated on the Illumina or other platforms through publication quality graphics and statistics. This includes demultiplexing and quality filtering, OTU picking, taxonomic assignment, phylogenetic reconstruction, and diversity analyses and visualizations. QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.
QJsonQJson is a Qt-based library that maps JSON data to QVariant objects and vice versa.
qmd-progressPROGRESS: Parallel, Rapid O(N) and Graph-based Recursive Electronic Structure Solver.
qpthA fast and differentiable QP solver for PyTorch. URL:
qrupdateqrupdate is a Fortran library for fast updates of QR and Cholesky decompositions.
QScintillaQScintilla is a port to Qt of Neil Hodgson's Scintilla C++ editor control
QScintilla5QScintilla is a port to Qt of Neil Hodgson's Scintilla C++ editor control
QtQt is a comprehensive cross-platform C++ application framework.
Qt5Qt is a comprehensive cross-platform C++ application framework. URL:
QuakeQuake is a package to correct substitution sequencing errors in experiments with deep coverage (e.g. >15X), specifically intended for Illumina sequencing reads.
QualimapQualimap examines sequencing alignment data in SAM/BAM files according to the features of the mapped reads and provides an overall view of the data that helps to detect biases in the sequencing and/or mapping of the data and eases decision-making for further analysis.
QuantumESPRESSOQuantum ESPRESSO is an integrated suite of computer codes for electronic-structure calculations and materials modeling at the nanoscale. It is based on density-functional theory, plane waves, and pseudopotentials (both norm-conserving and ultrasoft).
QUASTQUAST evaluates genome assemblies by computing various metrics. It works both with and without reference genomes. The tool accepts multiple assemblies, thus is suitable for comparison.
QuaZIPQuaZIP is the C++ wrapper for Gilles Vollant's ZIP/UNZIP package (AKA Minizip) using Trolltech's Qt library. URL:
QuTiPQuTiP is open-source software for simulating the dynamics of open quantum systems.
QwtThe Qwt library contains GUI Components and utility classes which are primarily useful for programs with a technical background. URL:
QwtPolarThe QwtPolar library contains classes for displaying values on a polar coordinate system.
RR is a free software environment for statistical computing and graphics.
R6Classes with Reference Semantics
RaconUltrafast consensus module for raw de novo genome assembly of long uncorrected reads.
RagoutRagout (Reference-Assisted Genome Ordering UTility) is a tool for chromosome assembly using multiple references. Given a set of assembly fragments (contigs/scaffolds) and one or multiple related references (complete or draft), it produces a chromosome-scale assembly (as a set of scaffolds).
rainbowEfficient tool for clustering and assembling short reads, especially for RAD.
randfoldMinimum free energy of folding randomization test software
R_asremlR is a free software environment for statistical computing and graphics.
RAxMLRAxML search algorithm for maximum likelihood based inference of phylogenetic trees. URL:
RBFOptRBFOpt is a Python library for black-box optimization (also known as derivative-free optimization). URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
R-bundle-BioconductorR is a free software environment for statistical computing and graphics.
rcloneRclone is a command line program to sync files and directories to and from a variety of online storage services URL:
RColorBrewerColorBrewer Palettes
RcppSeamless R and C++ Integration
RDP-ClassifierThe RDP Classifier is a naive Bayesian classifier that can rapidly and accurately provide taxonomic assignments from domain to genus, with confidence estimates for each assignment.
re2cre2c is a free and open-source lexer generator for C and C++. Its main goal is generating fast lexers: at least as fast as their reasonably optimized hand-coded counterparts. Instead of using traditional table-driven approach, re2c encodes the generated finite state automata directly in the form of conditional jumps and comparisons. URL:
REAPRREAPR is a tool that evaluates the accuracy of a genome assembly using mapped paired end reads, without the use of a reference genome for comparison. It can be used in any stage of an assembly pipeline to automatically break incorrect scaffolds and flag other errors in an assembly for manual inspection. It reports mis-assemblies and other warnings, and produces a new broken assembly based on the error calls.
RedundansRedundans is a pipeline that assists an assembly of heterozygous/polymorphic genomes.
RELIONRELION (for REgularised LIkelihood OptimisatioN, pronounced rely-on) is a stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM).
REMORAREsource MOnitoring for Remote Applications
renderprotoXrender protocol and ancillary headers
RepeatMaskerRepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences.
RepeatModelerRepeatModeler is a de-novo repeat family identification and modeling package. At the heart of RepeatModeler are two de-novo repeat finding programs ( RECON and RepeatScout ) which employ complementary computational methods for identifying repeat element boundaries and family relationships from sequence data. RepeatModeler assists in automating the runs of RECON and RepeatScout given a genomic database and uses the output to build, refine and classify consensus models of putative interspersed repeats.
RepeatScoutRepeatScout is a tool to discover repetitive substrings in DNA. The purpose of the RepeatScout software is to identify repeat family sequences from genomes where hand-curated repeat databases (a la RepBase update) are not available. In fact, the output of this program can be used as input to RepeatMasker as a way of automatically masking newly-sequenced genomes.
requestsPython http for humans
reshape2Flexibly Reshape Data: A Reboot of the Reshape Package.
RMBlastRMBlast is a RepeatMasker compatible version of the standard NCBI BLAST suite.
RNAclustRNAclust is a perl script summarizing all the single steps required for clustering of structured RNA motifs, i.e. identifying groups of RNA sequences sharing a secondary structure motif. It requires as input a multiple FASTA file.
RNAIndelRNAIndel calls coding indels and classifies them into somatic, germline, and artifact from tumor RNA-Seq data.
RNAmmerRNAmmer predicts ribosomal RNA genes in full genome sequences by utilising two levels of Hidden Markov Models: An initial spotter model searches both strands. The spotter model is constructed from highly conserved loci within a structural alignment of known rRNA sequences. Once the spotter model detects an approximate position of a gene, flanking regions are extracted and passed to the full model, which matches the entire gene. This two-level approach avoids running the full model across an entire genome sequence, allowing faster predictions.
RNA-SeQCRNA-SeQC is a Java program which computes a series of quality control metrics for RNA-seq data. The input can be one or more BAM files. The output consists of HTML reports and tab-delimited files of metrics data. This program can be valuable for comparing sequencing quality across different samples or experiments to evaluate different experimental parameters. It can also be run on individual samples as a means of quality control before continuing with downstream analysis.
ROOTThe ROOT system provides a set of OO frameworks with all the functionality needed to handle and analyze large amounts of data in a very efficient way.
RosettaRosetta is the premier software suite for modeling macromolecular structures. As a flexible, multi-purpose application, it includes tools for structure prediction, design, and remodeling of proteins and nucleic acids.
rpy2Python interface to the R language (embedded R)
RSATRegulatory Sequence Analysis Tools (RSAT), a software suite for the detection of cis-regulatory elements in genomic sequences.
RSEMRNA-Seq by Expectation-Maximization
RSeQCRSeQC provides a number of useful modules that can comprehensively evaluate high throughput sequence data especially RNA-seq data. Some basic modules quickly inspect sequence quality, nucleotide composition bias, PCR bias and GC bias, while RNA-seq specific modules evaluate sequencing saturation, mapped reads distribution, coverage uniformity, strand specificity, transcript level RNA integrity etc.
R_tamuR is a free software environment for statistical computing and graphics.
RubyRuby is a dynamic, open source programming language with a focus on simplicity and productivity. It has an elegant syntax that is natural to read and easy to write.
RustRust is a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.
S4S⁴ (or simply S4) stands for Stanford Stratified Structure Solver, a frequency domain code to solve the linear Maxwell’s equations in layered periodic structures. Internally, it uses Rigorous Coupled Wave Analysis (RCWA; also called the Fourier Modal Method (FMM)) and the S-matrix algorithm.
SailfishSailfish: Rapid Alignment-free Quantification of Isoform Abundance
SalmonSalmon is a wicked-fast program to produce highly accurate, transcript-level quantification estimates from RNA-seq data.
SALSAA tool to scaffold long read assemblies with Hi-C
SambambaSambamba is a high-performance, modern, robust and fast tool (and library), written in the D programming language, for working with SAM and BAM files. Current functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth.
samblastersamblaster: a tool to mark duplicates and extract discordant and split reads from sam files
samclipFilter SAM file for soft and hard clipped alignments
SAMtoolsSAM Tools provide various utilities for manipulating alignments in the SAM format, including sorting, merging, indexing and generating alignments in a per-position format.
SASA software suite developed by SAS Institute for advanced analytics, business intelligence, data management, and predictive analytics.
ScaLAPACKThe ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers.
scalesScale Functions for Visualization
SCATSA statistical tool to detect differential alternative splicing events using single-cell RNA-seq
ScientificPythonScientificPython is a collection of Python modules for scientific computing. It contains support for geometry, mathematical functions, statistics, physical units, IO, visualization, and parallelization.
scikit-imagescikit-image is a collection of algorithms for image processing.
scikit-learnScikit-learn integrates machine learning algorithms in the tightly-knit scientific Python world, building upon numpy, scipy, and matplotlib. As a machine-learning module, it provides versatile tools for data mining and analysis in any field of science and engineering. It strives to be simple and efficient, accessible to everybody, and reusable in various contexts. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
scikit-optimizeScikit-Optimize, or skopt, is a simple and efficient library to minimize (very) expensive and noisy black-box functions.
SCIPhISingle-cell mutation identification via phylogenetic inference (SCIPhI) is a new approach to mutation detection in individual tumor cells by leveraging the evolutionary relationship among cells.
scipySciPy is a collection of mathematical algorithms and convenience functions built on the Numpy extension for Python.
SciPy-bundleBundle of Python packages for scientific software Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
SConsSCons is a software construction tool. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
SCOTCHSoftware package and libraries for sequential and parallel graph partitioning, static mapping, and sparse matrix block ordering, and sequential mesh and hypergraph partitioning.
scpThe module uses a paramiko transport to send and receive files via the scp1 protocol.
ScriptureScripture is a method for transcriptome reconstruction that relies solely on RNA-Seq reads and an assembled genome to build a transcriptome ab initio.
ScytheScythe uses a Naive Bayesian approach to classify contaminant substrings in sequence reads. It considers quality information, which can make it robust in picking out 3'-end adapters, which often include poor quality bases.
SDL2SDL: Simple DirectMedia Layer, a cross-platform multimedia library
SeabornSeaborn is a Python visualization library based on matplotlib. It provides a high-level interface for drawing attractive statistical graphics.
sedsed: GNU implementation of sed, the POSIX stream editor
SeederSeeder is a framework for DNA motif discovery.
segemehlsegemehl is software for mapping short sequencer reads to reference genomes. Unlike other methods, segemehl is able to detect not only mismatches but also insertions and deletions. Furthermore, segemehl is not limited to a specific read length and is able to map primer- or polyadenylation-contaminated reads correctly. segemehl implements a matching strategy based on enhanced suffix arrays (ESA). segemehl now supports the SAM format, reads gzipped queries to save both disk and memory space, and allows bisulfite sequencing mapping and split read mapping.
SeisSolSeisSol is a software package for simulating wave propagation and dynamic rupture based on the arbitrary high-order accurate derivative discontinuous Galerkin method (ADER-DG).
sepPython and C library for Source Extraction and Photometry. (This easyconfig provides the Python library only.)
SeqAnSeqAn is an open source C++ library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data
SeqKitA cross-platform ultrafast comprehensive toolkit for FASTA/Q processing
SeqmagickWe often have to convert between sequence formats and do little tasks on them, and it's not worth writing scripts for that. Seqmagick is a kickass little utility built in the spirit of imagemagick to expose the file format conversion in Biopython in a convenient way. Instead of having a big mess of scripts, there is one that takes arguments.
seqOutBiasMolecular biology enzymes have nucleic acid preferences for their substrates; the preference of an enzyme is typically dictated by the sequence at or near the active site of the enzyme. This bias may result in spurious read count patterns when used to interpret high-resolution molecular genomics data. The seqOutBias program aims to correct this issue by scaling the aligned read counts by the ratio of genome-wide observed read counts to the expected sequence based counts for each k-mer.
SeqPrepTool for stripping adaptors and/or merging paired reads with overlap into single reads.
SeqSeroSalmonella serotyping from genome sequencing data. SeqSero is a pipeline for Salmonella serotype determination from raw sequencing reads or genome assemblies.
SeqtkSeqtk is a fast and lightweight tool for processing sequences in the FASTA or FASTQ format. It seamlessly parses both FASTA and FASTQ files which can also be optionally compressed by gzip.
SerfThe serf library is a high performance C-based HTTP client library built upon the Apache Portable Runtime (APR) library.
setuptoolsDownload, build, install, upgrade, and uninstall Python packages -- easily!
SHAPEITSHAPEIT is a fast and accurate method for estimation of haplotypes (aka phasing) from genotype or sequencing data. Version 4 is a refactored and improved version of the SHAPEIT algorithm with multiple key additional features.
SibeliaSibelia: A comparative genomics tool: It assists biologists in analysing the genomic variations that correlate with pathogens, or the genomic changes that help microorganisms adapt in different environments. Sibelia will also be helpful for the evolutionary and genome rearrangement studies for multiple strains of microorganisms.
SICERA clustering approach for identification of enriched domains from histone modification ChIP-Seq data.
SiestaSIESTA is both a method and its computer program implementation, to perform efficient electronic structure calculations and ab initio molecular dynamics simulations of molecules and solids.
SignalPSignalP 4.1 predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method incorporates a prediction of cleavage sites and a signal peptide/non-signal peptide prediction based on a combination of several artificial neural networks.
SiloSilo is a library for reading and writing a wide variety of scientific data to binary disk files.
simMSGExact numerical calculation of the joint site-frequency spectrum as in Wakeley and Hey (1997) Estimating ancestral population parameters.
simplejsonSimple, fast, extensible JSON encoder/decoder for Python
simpySimPy is a process-based discrete-event simulation framework based on standard Python. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
SIPSIP is a tool that makes it very easy to create Python bindings for C and C++ libraries.
sixPython 2 and 3 compatibility utilities
SKESASKESA is a de-novo sequence read assembler for cultured single isolate genomes based on DeBruijn graphs.
SLEPcSLEPc (Scalable Library for Eigenvalue Problem Computations) is a software library for the solution of large scale sparse eigenvalue problems on parallel computers. It is an extension of PETSc and can be used for either standard or generalized eigenproblems, with real or complex arithmetic. It can also be used for computing a partial SVD of a large, sparse, rectangular matrix, and to solve quadratic eigenvalue problems.
slepc4pyPython bindings for SLEPc, the Scalable Library for Eigenvalue Problem Computations.
slidingwindowslidingwindow is a simple little Python library for computing a set of windows into a larger dataset, designed for use with image-processing algorithms that utilise a sliding window to break the processing up into a series of smaller chunks.
SMALTSMALT efficiently aligns DNA sequencing reads with a reference genome.
snakemakeThe Snakemake workflow management system is a tool to create reproducible and scalable data analyses.
SNAPSNAP is a general purpose gene finding program suitable for both eukaryotic and prokaryotic genomes. SNAP is an acronym for Semi-HMM-based Nucleic Acid Parser.
SNAP-HMM(Semi-HMM-based Nucleic Acid Parser) gene prediction tool
snappySnappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression.
SnifflesSniffles is a structural variation caller using third generation sequencing (PacBio or Oxford Nanopore). It detects all types of SVs (10bp+) using evidence from split-read alignments, high-mismatch regions, and coverage analysis.
snoscanSearch for C/D box methylation guide snoRNA genes in genomic sequences
snpEffSnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of genetic variants (such as amino acid changes).
SNPGenieSNPGenie is a collection of Perl scripts for estimating πN/πS, dN/dS, and other evolutionary parameters from next-generation sequencing (NGS) single-nucleotide polymorphism (SNP) variant data.
SNPomaticHigh throughput sequencing technologies generate large amounts of short reads. Mapping these to a reference sequence consumes large amounts of processing time and memory, and read mapping errors can lead to noisy or incorrect alignments. SNP-o-matic is a fast, memory-efficient, and stringent read mapping tool offering a variety of analytical output functions, with an emphasis on genotyping.
SNP-PipelineThe CFSAN SNP Pipeline is a Python-based system for the production of SNP matrices from sequence data used in the phylogenetic analysis of pathogenic organisms sequenced from samples of interest to food safety.
SOAPdenovo2SOAPdenovo is a novel short-read assembly method that can build a de novo draft assembly for human-sized genomes. The program is specially designed to assemble Illumina short reads. It creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost effective way. SOAPdenovo2 is the successor of SOAPdenovo.
SOAPfuseSOAPfuse is an open source tool developed for genome-wide detection of fusion transcripts from paired-end RNA-Seq data.
socatsocat is a relay for bidirectional data transfer between two independent data channels.
SOFI2DSOFI2D stands for Seismic mOdeling with FInite differences and denotes our 2D viscoelastic time domain massive parallel modeling code for P- and SV-waves. SOFI2D is the forward solver for the full waveform inversion code IFOS2D.
SPAdesGenome assembler for single-cell and isolates data sets
SparkSpark is Hadoop MapReduce done in memory
sparsehashAn extremely memory-efficient hash_map implementation. 2 bits/entry overhead! The SparseHash library contains several hash-map implementations, including implementations that optimize for space or speed.
SPECFEM2DSPECFEM2D simulates forward and adjoint seismic wave propagation in two-dimensional acoustic, (an)elastic, poroelastic or coupled acoustic-(an)elastic-poroelastic media, with Convolution PML absorbing conditions.
SpeedSeqA flexible framework for rapid genome analysis and interpretation.
spglibSpglib is a C library for finding and handling crystal symmetries.
spglib-pythonSpglib for Python. Spglib is a library for finding and handling crystal symmetries written in C.
SphinxSphinx is a tool that makes it easy to create intelligent and beautiful documentation. It was originally created for the new Python documentation, and it has excellent facilities for the documentation of Python projects. C/C++ is also supported, and special support for other languages is planned.
sphireSParx for HIgh Resolution Electron Microscopy
SpineSpine is a program for identification of the conserved core genome of bacteria and other small genome organisms.
SPLASHSPLASH is a free and open source visualisation tool for Smoothed Particle Hydrodynamics (SPH) simulations.
SpyderSpyder is an interactive Python development environment providing MATLAB-like features in a simple and lightweight package.
SQLAlchemySQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL.
SQLiteSQLite: SQL Database Engine in a C Library
SRA-ToolkitThe NCBI SRA Toolkit enables reading (dumping) of sequencing files from the SRA database and writing (loading) files into the .sra format
sRNAnalyzerA pipeline for small RNA sequencing data analysis.
SRPRISMSingle Read Paired Read Indel Substitution Minimizer
SRST2Short Read Sequence Typing for Bacterial Pathogens -- This program is designed to take Illumina sequence data, an MLST database and/or a database of gene sequences (e.g. resistance genes, virulence genes, etc.) and report the presence of STs and/or reference genes.
SSPACE_BasicSSPACE Basic, SSAKE-based Scaffolding of Pre-Assembled Contigs after Extension
SSPACE-LongReadSSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or matepair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool in Perl. The input data is given by pre-assembled contig sequences (FASTA) and NGS paired-read data (Illumina/454/Solid FASTA or FASTQ). The final scaffolds are provided in FASTA format.
SSPACE-STANDARDSSPACE standard is a stand-alone program for scaffolding pre-assembled contigs using NGS paired-read data. It is unique in offering the possibility to manually control the scaffolding process. By using the distance information of paired-end and/or matepair data, SSPACE is able to assess the order, distance and orientation of your contigs and combine them into scaffolds. Currently we offer this as a command-line tool in Perl. The input data is given by pre-assembled contig sequences (FASTA) and NGS paired-read data (Illumina/454/Solid FASTA or FASTQ). The final scaffolds are provided in FASTA format.
StacksStacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.
STAMP-METAGENOMICSSTAMP is a software package for analyzing taxonomic or metabolic profiles that promotes ‘best practices’ in choosing appropriate statistical techniques and reporting results. Statistical hypothesis tests for pairs of samples or groups of samples are supported, along with a wide range of exploratory plots.
StampyStampy is a package for the mapping of short reads from Illumina sequencing machines onto a reference genome.
STARSTAR aligns RNA-seq reads to a reference genome using uncompressed suffix arrays.
STAR-CCM+Software for solving problems involving flow (of fluids or solids), heat transfer and stress.
STAR-FusionSTAR-Fusion uses the STAR aligner to identify candidate fusion transcripts supported by Illumina reads. STAR-Fusion further processes the output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set.
STAR-STARSpliced Transcripts Alignment to a Reference
StataStata is a complete, integrated statistical software package that provides everything you need for data analysis, data management, and graphics.
statsmodelsStatsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
StrAutoAutomation and Parallelization of STRUCTURE Analysis. StrAuto is used to streamline population structure analysis using parallel computing.
STREAMThe STREAM benchmark is a simple synthetic benchmark program that measures sustainable memory bandwidth (in MB/s) and the corresponding computation rate for simple vector kernels.
strelkaStrelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs.
stringi Character String Processing Facilities
stringrSimple, Consistent Wrappers for Common String Operations
StringTieStringTie is a fast and highly efficient assembler of RNA-Seq alignments into potential transcripts.
StructureThe program structure is a free software package for using multi-locus genotype data to investigate population structure. Its uses include inferring the presence of distinct populations, assigning individuals to populations, studying hybrid zones, identifying migrants and admixed individuals, and estimating population allele frequencies in situations where many individuals are migrants or admixed.
StructureHarvesterStructure Harvester is a program for parsing the output of Pritchard's STRUCTURE and for performing the Evanno method.
Structure_threaderA program to parallelize the runs of Structure, fastStructure and MavericK software.
SubreadSubread: an accurate and efficient aligner for mapping both genomic DNA-seq reads and RNA-seq reads (for the purpose of expression analysis).
SubversionSubversion is an open source version control system.
SuiteSparseSuiteSparse is a collection of libraries to manipulate sparse matrices.
SUMOSUMO allows modelling of intermodal traffic systems including road vehicles, public transport and pedestrians. Included with SUMO is a wealth of supporting tools which handle tasks such as route finding, visualization, network import and emission calculation.
SUNDIALSSUNDIALS: SUite of Nonlinear and DIfferential/ALgebraic Equation Solvers
SuperLUSuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.
SuperLU_DISTSuperLU is a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.
suspendersAllows the merging of alignments that have been annotated using pylapels into a single alignment that picks the highest quality alignment.
SVDetectSVDetect is an application for the isolation and type prediction of intra- and inter-chromosomal rearrangements from paired-end/mate-pair sequencing data provided by high-throughput sequencing technologies. This tool aims to identify structural variations with both clustering and sliding-window strategies, and to help visualize them at the genome scale.
svtyperBayesian genotyper for structural variants
swak4Foamswak4Foam stands for SWiss Army Knife for Foam. Like that knife it rarely is the best tool for any given task, but sometimes it is more convenient to get it out of your pocket than going to the tool-shed to get the chain-saw.
swalignThis package implements a Smith-Waterman style local alignment algorithm. You can align a query sequence to a reference. The scoring functions can be based on a matrix, or simple identity.
SWANSWAN is a third-generation wave model, developed at Delft University of Technology, that computes random, short-crested wind-generated waves in coastal regions and inland waters.
SWIGSWIG is a software development tool that connects programs written in C and C++ with a variety of high-level programming languages.
SymEngineSymEngine is a standalone fast C++ symbolic manipulation library.
sympySymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python and does not require any external libraries. Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
SzipSzip compression software, providing lossless compression of scientific data
tabixGeneric indexer for TAB-delimited genome position files
TahoeTahoe is an open source research-oriented software platform for the development of numerical methods and material models.
tamu-libsThis module provides missing libraries for compute nodes.
TASSELTASSEL provides tools to investigate relationships between phenotypes and genotypes
tbbIntel(R) Threading Building Blocks (Intel(R) TBB) lets you easily write parallel C++ programs that take full advantage of multicore performance, that are portable, composable and have future-proof scalability.
tbl2asnTbl2asn is a command-line program that automates the creation of sequence records for submission to GenBank.
TclTcl (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking, administration, testing and many more.
tcshTcsh is an enhanced, but completely compatible version of the Berkeley UNIX C shell (csh). It is a command language interpreter usable both as an interactive login shell and a shell script command processor. It includes a command-line editor, programmable word completion, spelling correction, a history mechanism, job control and a C-like syntax.
Tecplot360EXQuickly plot and animate your CFD results exactly the way you want. Analyze complex solutions, arrange multiple layouts, and communicate your results with professional images and animations.
TensorFlowAn open-source software library for Machine Intelligence
terminaltablesGenerate simple tables in terminals from a nested list of strings.
tesseractTesseract is an optical character recognition engine
testfixturesTestfixtures is a collection of helpers and mock objects that are useful when writing automated tests in Python.
testpathTest utilities for code working with files and commands
Test-Simpleyet another framework for writing test scripts
TetGenA Quality Tetrahedral Mesh Generator and a 3D Delaunay Triangulator
texinfoTexinfo is the official documentation format of the GNU project.
TheanoTheano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently.
thirdorderThe purpose of the thirdorder scripts is to help users create FORCE_CONSTANTS_3RD files in an efficient and convenient manner.
TiCCutilsTiCC utils is a collection of generic C++ software which is used in a lot of programs produced at Tilburg centre for Cognition and Communication (TiCC) at Tilburg University and Centre for Dutch Language and Speech at University of Antwerp.
TiMBLTiMBL (Tilburg Memory Based Learner) is an open source software package implementing several memory-based learning algorithms, among which IB1-IG, an implementation of k-nearest neighbor classification with feature weighting suitable for symbolic feature spaces, and IGTree, a decision-tree approximation of IB1-IG. All implemented algorithms have in common that they store some representation of the training set explicitly in memory. During testing, new cases are classified by extrapolation from the most similar stored cases.
TINKERThe TINKER molecular modeling software is a complete and general package for molecular mechanics and dynamics, with some special features for biopolymers.
TkTk is an open source, cross-platform widget toolchain that provides a library of basic elements for building a graphical user interface (GUI) in many different programming languages.
TkinterTkinter module, built with the Python buildsystem
TM-alignThis package unifies protein structure alignment and RNA structure alignment into the standard TM-align program for single chain structure alignment, MM-align program for multi-chain structure alignment, and TM-score program for sequence dependent structure superposition.
TMHMMPrediction of transmembrane helices in proteins.
tmuxtmux is a terminal multiplexer. It lets you switch easily between several programs in one terminal, detach them (they keep running in the background) and reattach them to a different terminal.
TopHatTopHat is a fast splice junction mapper for RNA-Seq reads.
torchvisionDatasets, Transforms and Models specific to Computer Vision
ToscaWidgetsWeb widget creation toolkit based on TurboGears widgets
tqdmA fast, extensible progress bar for Python and CLI. Compatible modules: Python/2.7.15-GCCcore-8.2.0 (default), Python/3.7.2-GCCcore-8.2.0
TracerTracer is a program for analysing the trace files generated by Bayesian MCMC runs (that is, the continuous parameter values sampled from the chain). It can be used to analyse runs of BEAST, MrBayes, LAMARC and possibly other MCMC programs.
TransDecoderTransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using TopHat and Cufflinks.
TransrateTransrate is software for de-novo transcriptome assembly quality analysis. It examines your assembly in detail and compares it to experimental evidence such as the sequencing reads, reporting quality scores for contigs and assemblies. This allows you to choose between assemblers and parameters, filter out the bad contigs from an assembly, and help decide when to stop trying to improve the assembly.
TreeMixTreeMix is a method for inferring the patterns of population splits and mixtures in the history of a set of populations.
trfTandem Repeats Finder is a program to locate and display tandem repeats in DNA sequences. In order to use the program, the user submits a sequence in FASTA format. There is no need to specify the pattern, the size of the pattern or any other parameter.
TrilinosThe Trilinos Project is an effort to develop algorithms and enabling technologies within an object-oriented software framework for the solution of large-scale, complex multi-physics engineering and scientific problems. A unique design feature of Trilinos is its focus on packages.
trimAlA tool for automated alignment trimming in large-scale phylogenetic analyses
Trim_GaloreA wrapper tool around Cutadapt and FastQC to consistently apply quality and adapter trimming to FastQ files, with some extra functionality for MspI-digested RRBS-type (Reduced Representation Bisulfite-Seq) libraries.
TrimmomaticTrimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-ended data. The selection of trimming steps and their associated parameters are supplied on the command line.
TrinityTrinity represents a novel method for the efficient and robust de novo reconstruction of transcriptomes from RNA-Seq data. Trinity combines three independent software modules: Inchworm, Chrysalis, and Butterfly, applied sequentially to process large volumes of RNA-Seq reads.
TRINITY_TAMU(description not available)
TrinotateTrinotate is a comprehensive annotation suite designed for automatic functional annotation of transcriptomes, particularly de novo assembled transcriptomes, from model or non-model organisms. Trinotate makes use of a number of different well referenced methods for functional annotation including homology search to known sequence data (BLAST+/SwissProt), protein domain identification (HMMER/PFAM), protein signal peptide and transmembrane domain prediction (signalP/tmHMM), and leveraging various annotation databases (eggNOG/GO/Kegg databases).
tRNAscan-SESearching for tRNA genes in genomic sequences
TurboCheetahTurboGears plugin to support use of Cheetah templates
TurboJsonPython template plugin that supports JSON
UCLUSTUCLUST: Extreme high-speed sequence clustering, alignment and database search.
UCSCtoolsUCSC utilities pre-compiled binaries.
UCXUnified Communication X: an open-source, production-grade communication framework for data-centric and high-performance applications
UDUNITSUDUNITS supports conversion of unit specifications between formatted and binary forms, arithmetic manipulation of units, and conversion of values between compatible scales of measurement.
UFLThe Unified Form Language (UFL) is a domain specific language for declaration of finite element discretizations of variational forms. More precisely, it defines a flexible interface for choosing finite element spaces and defining expressions for weak forms in a notation close to mathematical notation.
umisPackage for estimating UMI counts in Transcript Tag Counting data. URL: Compatible modules: Python/3.7.2-GCCcore-8.2.0 (default), Python/2.7.15-GCCcore-8.2.0
UnblurUnblur is used to align the frames of movies recorded on an electron microscope to reduce image blurring due to beam-induced motion.
UnicyclerUnicycler is an assembly pipeline for bacterial genomes. It can assemble Illumina-only read sets where it functions as a SPAdes-optimiser. It can also assemble long-read-only sets (PacBio or Nanopore) where it runs a miniasm+Racon pipeline.
unrarRAR is a powerful archive manager.
UnZipUnZip is an extraction utility for archives compressed in .zip format (also called "zipfiles"). Although highly compatible both with PKWARE's PKZIP and PKUNZIP utilities for MS-DOS and with Info-ZIP's own Zip program, our primary objectives have been portability and non-MSDOS functionality. URL:
USEARCHUSEARCH is a unique sequence analysis tool which offers search and clustering algorithms that are often orders of magnitude faster than BLAST. URL:
utf8procutf8proc is a small, clean C library that provides Unicode normalization, case-folding, and other operations for data in the UTF-8 encoding.
util-linuxSet of Linux utilities
ValgrindValgrind: Debugging and profiling tools
VarScanVarScan is a platform-independent variant caller for targeted, exome, and whole-genome resequencing data generated on Illumina, SOLiD, Life/PGM, Roche/454, and similar instruments.
VASPThe Vienna Ab initio Simulation Package (VASP) is a computer program for atomic scale materials modelling, e.g. electronic structure calculations and quantum-mechanical molecular dynamics, from first principles. Includes VTST from:
vawkAn awk-like VCF parser. vawk command syntax is exactly the same as awk syntax with a few additional features.
VCF-kitVCF-kit is a command-line based collection of utilities for performing analysis on Variant Call Format (VCF) files.
vcflibA C++ library for parsing and manipulating VCF files.
VCFtoolsThe aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.
vContact2vConTACT2 is a tool to perform guilt-by-contig-association automatic classification of viral contigs. URL:
VelvetSequence assembler for very short reads
VelvetOptimiserVelvetOptimiser is a multi-threaded Perl script for automatically optimising the three primary parameter options (K, -exp_cov, -cov_cutoff) for the Velvet de novo sequence assembler.
VEPVariant Effect Predictor (VEP) determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
version_requiredA TAMU HPRC module to force users to specify a version when loading certain modules
ViennaRNAThe Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures.
VimVim is an advanced text editor that seeks to provide the power of the de-facto Unix editor 'Vi', with a more complete feature set. URL:
viridisLiteDefault Color Maps from 'matplotlib' (Lite Version)
VisItVisIt is an Open Source, interactive, scalable, visualization, animation and analysis tool.
VmatchLarge scale sequence analysis software.
VMDVMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.
Voro++Voro++ is a software library for carrying out three-dimensional computations of the Voronoi tessellation. A distinguishing feature of the Voro++ library is that it carries out cell-based calculations, computing the Voronoi cell for each particle individually. It is particularly well-suited for applications that rely on cell-based statistics, where features of Voronoi cells (e.g. volume, centroid, number of faces) can be used to analyze a system of particles.
VSEARCHVSEARCH supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
V_SimV_Sim visualizes atomic structures such as crystals, grain boundaries and so on (sic)
VTKThe Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, image processing and visualization. VTK consists of a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. VTK supports a wide variety of visualization algorithms including: scalar, vector, tensor, texture, and volumetric methods; and advanced modeling techniques such as: implicit modeling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation. URL:
VTuneIntel VTune Amplifier XE is the premier performance profiler for C, C++, C#, Fortran, Assembly and Java.
wcwidthwcwidth is a low-level Python library to simplify Terminal emulation.
WerkzeugThe Swiss Army knife of Python web development
WestmereWestmere (large-memory node) built packages for
wgetPure Python download utility
WGSCelera Assembler : scientific software for biological research. Celera Assembler is a de novo whole-genome shotgun (WGS) DNA sequence assembler. It reconstructs long sequences of genomic DNA from fragmentary data produced by whole-genome shotgun sequencing. Celera Assembler has enabled many advances in genomics, including the first whole genome shotgun sequence of a multi-cellular organism (Myers 2000) and the first diploid sequence of an individual human (Levy 2007). Celera Assembler was developed at Celera Genomics starting in 1999.
wheelA built-package format for Python.
wise2wise2 key programs are genewise, a program for aligning proteins or protein HMMs to DNA, and dynamite, a rather cranky "macro language" which automates the production of dynamic programming.
workerThe Worker framework has been developed to help deal with parameter exploration experiments that would otherwise result in many jobs, forcing the user to resort to scripting to retain her sanity; see also
WPSWRF Preprocessing System (WPS) for WRF. The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs.
WRFThe Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs.
wrf-depsThis module sets up dependency requirements for building customized WRF.
wtdbg2Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). It assembles raw reads without error correction and then builds the consensus from intermediate assembly output.
wxPythonwxPython is a GUI toolkit for the Python programming language. It allows Python programmers to create programs with a robust, highly functional graphical user interface, simply and easily. It is implemented as a Python extension module (native code) that wraps the popular wxWidgets cross platform GUI library, which is written in C++.
X11The X Window System (X11) is a windowing system for bitmap displays
x264x264 is a free software library and application for encoding video streams into the H.264/MPEG-4 AVC compression format, and is released under the terms of the GNU GPL.
x265x265 is a free software library and application for encoding video streams into the H.265/HEVC compression format, and is released under the terms of the GNU GPL.
xarrayxarray (formerly xray) is an open source project and Python package that aims to bring the labeled data power of pandas to the physical sciences, by providing N-dimensional variants of the core pandas data structures.
xbitmapsProvides bitmaps for X
xcb-protoThe X protocol C-language Binding (XCB) is a replacement for Xlib featuring a small footprint, latency hiding, direct access to the protocol, improved threading support, and extensibility.
XCfunXCFun is a library of DFT exchange-correlation (XC) functionals. It is based on automatic differentiation and can therefore generate arbitrary order derivatives of these functionals. URL:
XCrySDenXCrySDen is a crystalline and molecular structure visualisation program aiming at display of isosurfaces and contours, which can be superimposed on crystalline structures and interactively rotated and manipulated. URL:
Xerces-C++Xerces-C++ is a validating XML parser written in a portable subset of C++. Xerces-C++ makes it easy to give your application the ability to read and write XML data. A shared library is provided for parsing, generating, manipulating, and validating XML documents using the DOM, SAX, and SAX2 APIs.
xextprotoXExtProto protocol headers.
XFOILXFOIL is an interactive program for the design and analysis of subsonic isolated airfoils.
xgboostScalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library
xlrdLibrary for developers to extract data from Microsoft Excel (tm) spreadsheet files
XlsxWriterA Python module for creating Excel XLSX files
XMDS2The purpose of XMDS2 is to simplify the process of creating simulations that solve systems of initial-value first-order partial and ordinary differential equations.
XML-LibXMLPerl binding for libxml2 URL:
XML-NamespaceSupportsimple generic namespace support class
XML-ParserThis is a Perl extension interface to James Clark's XML parser, expat.
XML-SAX-BaseBase class SAX Drivers and Filters macros utilities.
xpropThe xprop utility is for displaying window and font properties in an X server. One window or font is selected using the command line arguments or possibly in the case of a window, by clicking on the desired window. A list of properties is then given, possibly with formatting information.
xprotoX protocol and ancillary headers
xsspThe source code for building the mkdssp, mkhssp, hsspconv, and hsspsoap programs is bundled in the xssp project. The DSSP executable is mkdssp.
xtransxtrans includes a number of routines to make X implementations transport-independent; at time of writing, it includes support for UNIX sockets, IPv4, IPv6, and DECnet.
XZxz: XZ utilities
yaffYaff stands for 'Yet another force field'. It is a pythonic force-field code. URL:
yaml-cppyaml-cpp is a YAML parser and emitter in C++ matching the YAML 1.2 spec.
YasmYasm: Complete rewrite of the NASM assembler with BSD license
YAXTYet Another eXchange Tool
ZDOCKProtein docking software that performs a full rigid-body search of docking orientations between two proteins
ZeroMQZeroMQ looks like an embeddable networking library but acts like a concurrency framework. It gives you sockets that carry atomic messages across various transports like in-process, inter-process, TCP, and multicast. You can connect sockets N-to-N with patterns like fanout, pub-sub, task distribution, and request-reply. It's fast enough to be the fabric for clustered products. Its asynchronous I/O model gives you scalable multicore applications, built as asynchronous message-processing tasks. It has a score of language APIs and runs on most operating systems. URL:
zlibzlib is designed to be a free, general-purpose, legally unencumbered -- that is, not covered by any patents -- lossless data-compression library for use on virtually any computer hardware and operating system.
zshZsh is a shell designed for interactive use, although it is also a powerful scripting language.
zstdZstandard is a real-time compression algorithm, providing high compression ratios. It offers a very wide range of compression/speed trade-off, while being backed by a very fast decoder. It also offers a special mode for small data, called dictionary compression, and can create dictionaries from any sample set. URL: