Bioinformatics and Statistics Resources

Information and resources on bioinformatics and statistics for the UCSF research community

General Purpose

Galaxy


"Galaxy is an open source, web-based platform for data intensive biomedical research."

Galaxy is freely available, web-based software.


PLINK

"PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner."

PLINK is freely available here.


R and Bioconductor

"R is a free software environment for statistical computing and graphics."

"Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development."

R is freely available here. Bioconductor is freely available with installation instructions here. You must have R installed to use Bioconductor.


UCSC Genome Browser

The UCSC Genome Browser includes "a broad collection of vertebrate and model organism assemblies and annotations, along with a large suite of tools for viewing, analyzing and downloading data."

The UCSC Genome Browser is freely available, web-based software.

Alignment and Mapping

BioPerl

"The Bioperl Project is an international association of users & developers of open source Perl tools for bioinformatics, genomics and life science."

BioPerl is freely available here.


Bowtie

"Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour."

Bowtie is freely available here.


DECIPHER

"DECIPHER is a software toolset that can be used for deciphering and managing biological sequences efficiently using the R programming language. Some functionality of the program is accessible online through web tools."

DECIPHER is freely available with installation instructions here (requires R and Bioconductor; see above).


HISAT2

"HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes (as well as to a single reference genome)."

HISAT2 is freely available here.

Annotation and Enrichment

DAVID

"This tool suite, introduced in the first version of DAVID, mainly provides typical batch annotation and gene-GO term enrichment analysis to highlight the most relevant GO terms associated with a given gene list."

DAVID is freely available as web-based software.


GATK

"Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping."

GATK is freely available here. It can also be run on the cloud.


GSEA

"Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. phenotypes)."

GSEA is freely available here.


PARADIGM

"A factor graph framework for pathway inference on high-throughput genomic data."

PARADIGM is freely available here.

Data Manipulation

bedtools

"Bedtools is a fast, flexible toolset for genome arithmetic." ... "For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF."

bedtools is freely available with installation instructions here.
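
To give a feel for what "genome arithmetic" on intervals means, here is a minimal Python sketch of two of the operations named above, merge and intersect. It is an illustration only and is not how bedtools is implemented or invoked. The function names (merge_intervals, intersect_intervals) and the tuple layout are hypothetical, and intervals are assumed to use BED-style 0-based, half-open coordinates.

```python
from typing import List, Tuple

# Hypothetical representation: (chromosome, start, end), 0-based and half-open.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Combine overlapping or directly adjacent intervals on the same chromosome."""
    merged: List[Interval] = []
    # Sorting by (chromosome, start, end) puts candidates for merging next to each other.
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect_intervals(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b."""
    overlaps: List[Interval] = []
    # Simple all-pairs comparison; fine for small inputs, too slow for whole genomes.
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start = max(start_a, start_b)
            end = min(end_a, end_b)
            if start < end:  # a non-empty overlap
                overlaps.append((chrom_a, start, end))
    return sorted(overlaps)


if __name__ == "__main__":
    peaks = [("chr1", 1, 5), ("chr1", 4, 10), ("chr1", 20, 30), ("chr2", 1, 3)]
    genes = [("chr1", 8, 25)]
    print(merge_intervals(peaks))
    print(intersect_intervals(merge_intervals(peaks), genes))
```

For real datasets, a dedicated tool such as bedtools is the better choice, since it reads the standard file formats directly and is built for genome-scale inputs.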


Picard

"A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF."

Picard is freely available with installation instructions here.


Samtools

"Samtools is a suite of programs for interacting with high-throughput sequencing data." 

Samtools is freely available with installation instructions here.


VCFtools

"VCFtools is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files."

VCFtools is freely available with installation instructions here.

Visualization

Chimera

"UCSF Chimera is a highly extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles."

Chimera is freely available here.


Cytoscape

"Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data." ... "Cytoscape supports many use cases in molecular and systems biology, genomics, and proteomics"

Cytoscape is freely available here.


GenomeBrowse

"The free Golden Helix GenomeBrowse® tool delivers stunning visualizations of your genomic data that give you the power to see what is occurring at each base pair in your samples."

GenomeBrowse is freely available here.


Integrative Genomics Viewer

"The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations."

IGV is freely available here.