Data Visualization

JBrowse

GCATemplates available: no, JBrowse is only available at the HPRC Portal

JBrowse is a fast, scalable genome browser built completely with JavaScript and HTML5.

JBrowse is available on the Grace portal.

Before loading a genome in JBrowse, you need to index its FASTA file.

To do this, go to the HPRC cluster Unix command line and index the FASTA reference file with samtools, which creates a .fai index file:

module load GCC/12.3.0 SAMtools/1.18
samtools faidx reference_genome.fasta
  • Then start JBrowse in the portal (under the Interactive Apps section), selecting the Number of hours and Number of cores as needed.
  • Select 'Open Sequence File' and click the 'Select Files...' button in the 'Local files' section (files that are located on the HPRC cluster).
  • Load reference_genome.fasta and reference_genome.fasta.fai at the same time in JBrowse.
  • You can then add tracks (gff, bam, ...) by selecting Track at the top of the JBrowse right panel.
  • All files that you add as tracks or genomes must be located on the HPRC cluster. URLs are not supported because JBrowse on portal.hprc.tamu.edu runs on compute nodes, which do not have internet access.
  • Remember to delete the interactive session when you are finished.
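
Before launching the JBrowse session, it can help to confirm that both files from the indexing step are present on the cluster. The following is a minimal sketch: check_fai is a hypothetical helper name (not part of samtools or JBrowse), and it only checks that the two files exist and are non-empty, not that the index is valid.

```shell
# Hypothetical helper (not a samtools or JBrowse command):
# report whether a FASTA file and its .fai index both exist and are non-empty.
check_fai() {
    fasta="$1"
    if [ ! -s "$fasta" ]; then
        echo "missing FASTA: $fasta"
        return 1
    fi
    if [ ! -s "$fasta.fai" ]; then
        echo "missing index: $fasta.fai (run: samtools faidx $fasta)"
        return 1
    fi
    echo "ok: $fasta and $fasta.fai"
}
```

For example, running check_fai reference_genome.fasta in the directory that holds the genome reports which of the two files, if any, still needs to be created.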

IGV

GCATemplates available: no

IGV homepage

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

IGV is best viewed using the HPRC portal.

  • Open your favorite web browser and go to portal.hprc.tamu.edu
  • Log in using your NetID and password
  • Select 'Interactive Apps'
  • Select 'IGV'
  • Adjust the 'Number of hours' as needed.
  • Increase the 'Number of processors' to 20 for viewing large genomes
  • Click the blue 'Launch' button
  • Wait for your job to start and then click the blue 'Launch noVNC in New Tab' button when it appears.
  • Navigate within IGV to find preconfigured genomes at /scratch/datasets/IGV
  • You can add your own genomes by creating a .fai index for your FASTA file with samtools on the command line before launching IGV:
    • module load SAMtools
    • samtools faidx my_genome.fasta
  • When you are finished using IGV, click the red 'Delete' button in the 'My Interactive Sessions' section to stop the IGV job.
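
If you edit or replace a genome between IGV sessions, the index has to be rebuilt. As a rough sketch of how the indexing step above could be wrapped, the function below runs samtools faidx only when the .fai file is missing or older than the FASTA file. The name index_if_needed is hypothetical, and the sketch assumes the SAMtools module is already loaded and that the shell supports the -nt file test (bash and dash do).

```shell
# Hypothetical wrapper around the documented 'samtools faidx' step.
# Assumes the SAMtools module has already been loaded.
index_if_needed() {
    fasta="$1"
    # Rebuild when the index is absent or the FASTA has a newer modification time.
    if [ ! -e "$fasta.fai" ] || [ "$fasta" -nt "$fasta.fai" ]; then
        samtools faidx "$fasta"
    else
        echo "index is up to date: $fasta.fai"
    fi
}
```

Calling index_if_needed my_genome.fasta before launching IGV avoids re-indexing an unchanged genome.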

Circos

GCATemplates available: no

Circos homepage

Circos is a software package for visualizing data and information. It visualizes data in a circular layout, which makes it well suited to exploring relationships between objects or positions.

Use the module spider command to see how to load the Circos module.

module spider Circos

Sample config files are found here:

ls $EBROOTCIRCOS/etc

Sample command:

circos -conf main.conf -outputfile genome_image.png

The SVG output format is vector-based and scales without pixelation, so it may look better than the PNG format.

Circos is also available in Maroon Galaxy.

MultiQC

GCATemplates available: no

MultiQC homepage

module spider MultiQC

After you load the MultiQC module and run multiqc, you can view the HTML report using the HPRC portal at portal.hprc.tamu.edu.

The following example command summarizes multiple FastQC-processed samples in MultiQC, where output_dir contains the FastQC report files ending in _fastqc.zip and _fastqc.html. You can use any directory name in place of output_dir:

multiqc output_dir
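
Running multiqc on a directory that contains no FastQC results is an easy mistake to make. The sketch below counts the _fastqc.zip files first and runs multiqc only if at least one is found. Both function names are hypothetical, and the sketch assumes the MultiQC module is already loaded and that the FastQC archives sit directly in the given directory rather than in subdirectories.

```shell
# Hypothetical helper: count FastQC archives directly inside a directory.
count_fastqc_zips() {
    dir="$1"
    n=0
    for f in "$dir"/*_fastqc.zip; do
        # An unmatched glob stays literal, so confirm the file really exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical wrapper: run multiqc only when FastQC archives are present.
# Assumes the MultiQC module has already been loaded.
run_multiqc_checked() {
    dir="$1"
    n=$(count_fastqc_zips "$dir")
    if [ "$n" -eq 0 ]; then
        echo "no *_fastqc.zip files found in $dir"
        return 1
    fi
    echo "found $n FastQC archive(s) in $dir"
    multiqc "$dir"
}
```

For example, run_multiqc_checked output_dir reports how many archives it found before handing the directory to multiqc.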

Go to portal.hprc.tamu.edu and click the Files menu item and then click your scratch directory:

[Image: Portal Files]

Then navigate the files to find and select your multiqc_report.html file and then click the Download button to view it on your personal computer.

Here is an example of a MultiQC report for FastQC output files (2 samples) viewed in portal.hprc.tamu.edu:

[Image: View HTML page]