BWA

Burrows-Wheeler Aligner, a genomic sequence mapping program

Homepage Version(s): 0.7.17

BWA requires an environment module

In order to use BWA, you must first load the appropriate environment module:

module load gnu
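Whether the module made BWA available can be checked from the shell. The following is a minimal sketch: the helper name `require_tool` is hypothetical (it is not part of BWA or the module system), and it only assumes that loading the module places the `bwa` executable on your `PATH`.

```shell
# Hypothetical helper: report whether a command is available on PATH.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found; has the environment module been loaded?" >&2
        return 1
    fi
}
```

After `module load gnu`, running `require_tool bwa` should print the location of the executable; a non-zero exit status suggests the module is not loaded in the current shell.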

BWA (Burrows-Wheeler Aligner) is a genomic sequence mapping program designed to map low-divergent sequence reads against large reference genomes. It provides three algorithms:

BWA-backtrack, which is intended for Illumina sequence reads of up to 100 base pairs;
BWA-MEM, which is meant for longer sequence reads of 70 base pairs up to 1 Mbp and supports long reads and split alignment;
BWA-SW, which targets the same longer reads and shares these features with BWA-MEM.

In general, BWA-MEM is the latest of the three algorithms and is recommended for high-quality queries because it is faster and more accurate.
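The guidance above can be turned into a rough rule of thumb based on read length. The sketch below is only an illustration: the function names `max_read_length` and `suggest_algorithm` are hypothetical, the 70 bp cut-off is taken from the read-length ranges listed above, and using the longest read in the file as the deciding value is an assumption rather than anything BWA prescribes.

```shell
# Hypothetical helper: print the longest sequence length in a FASTA file.
# Sequence lines are summed per record, so wrapped FASTA is handled.
max_read_length() {
    awk '
        /^>/ { if (len > max) max = len; len = 0; next }
             { len += length($0) }
        END  { if (len > max) max = len; print max + 0 }
    ' "$1"
}

# Rule of thumb (assumption): reads shorter than 70 bp go to BWA-backtrack,
# anything longer goes to BWA-MEM.
suggest_algorithm() {
    if [ "$1" -lt 70 ]; then
        echo "BWA-backtrack (bwa aln + bwa samse/sampe)"
    else
        echo "BWA-MEM (bwa mem)"
    fi
}

# Example with a tiny two-read FASTA file written to a temporary location.
tmp_fa=$(mktemp)
printf '>r1\nACGTACGT\n>r2\nACGTACGTACGT\nACGT\n' > "$tmp_fa"
suggest_algorithm "$(max_read_length "$tmp_fa")"
rm -f "$tmp_fa"
```

For reads in the overlapping 70 to 100 bp range, either algorithm applies; this sketch defaults to BWA-MEM because it is the recommended one.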

Using BWA on the HPC

The following example runs a small single-end alignment with BWA-backtrack on the HPC.

From an HPC login node, run the following commands:

# Load the GNU module
$ module load gnu

# Create a folder to serve as your workspace
$ mkdir ~/bwa-test && cd ~/bwa-test

# Download example data
$ wget https://raw.github.com/dzerbino/velvet/master/data/test_reads.fa
$ wget https://raw.github.com/dzerbino/velvet/master/data/test_reference.fa

# Index the reference genome
$ bwa index test_reference.fa

# Align the reads with BWA-backtrack (writes suffix array coordinates to a .sai file)
$ bwa aln test_reference.fa test_reads.fa > aln_test.sai

# Generate single-end alignments in SAM format
$ bwa samse test_reference.fa aln_test.sai test_reads.fa > aln_test.sam
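Because BWA-MEM is the recommended algorithm, the same test data can also be aligned in a single step with `bwa mem`, which takes the indexed reference and the reads and writes SAM to standard output. The sketch below wraps that call and adds a quick sanity check that counts mapped records by testing the 0x4 (unmapped) bit of the SAM FLAG column. The function names `align_with_mem` and `count_mapped` are illustrative assumptions and not part of BWA.

```shell
# Illustrative wrapper: align reads with BWA-MEM in one step.
# Assumes `bwa index` has already been run on the reference.
align_with_mem() {
    ref=$1; reads=$2; out=$3
    bwa mem "$ref" "$reads" > "$out"
}

# Count mapped and unmapped records in a SAM file.
# Header lines start with "@"; bit 0x4 of the FLAG column marks unmapped.
count_mapped() {
    awk -F '\t' '
        /^@/ { next }
        { if (int($2 / 4) % 2 == 1) unmapped++; else mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "$1"
}
```

From the workspace created above, `align_with_mem test_reference.fa test_reads.fa mem_test.sam` followed by `count_mapped mem_test.sam` would give a quick summary (the output file name `mem_test.sam` is arbitrary). `count_mapped aln_test.sam` works the same way on the BWA-backtrack output. Note that the count is per SAM record, so any supplementary alignments reported by BWA-MEM are included.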

For a complete set of commands, refer to the BWA Manual.