Bowtie2 is a bioinformatics program for aligning sequencing reads to a reference genome. It works well for reads of about 50 up to hundreds or thousands of characters in length, and it is particularly good at aligning such reads to relatively long genomes, such as mammalian genomes.
Serially Running Bowtie2 on the HPC and Spear
The gnu module must be loaded before running Bowtie2 on the HPC and Spear.
$ module load gnu
$ bowtie2-build e_coli_1000.fa e_coli
The bowtie2-build command builds an index from the reference file e_coli_1000.fa. It should print many lines of output and then exit. When the command completes, the current directory will contain six new files whose names all start with e_coli and end with .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2, and .rev.2.bt2. These files constitute the index. To align a set of unpaired reads to the E. coli reference genome using the index generated in the previous step, run the Bowtie2 aligner with the command:
$ bowtie2 -x e_coli -U e_coli_1000.fq -S eg1.sam
The alignment results in SAM format are written to the file eg1.sam, and a short alignment summary is written to the console.
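As a quick sanity check on eg1.sam, the records can be counted directly from the file. The helper below is a minimal sketch, not part of Bowtie2: the function name sam_summary is made up for illustration, and it assumes one record per read, which should hold for unpaired reads aligned with default settings. It relies on the SAM convention that bit 0x4 of the FLAG field (column 2) marks an unaligned read.

```shell
# Hypothetical helper: count aligned and unaligned records in a SAM file.
# Assumes one record per read (no secondary or supplementary records).
sam_summary() {
    awk '
        /^@/ { next }    # skip SAM header lines
        {
            # Bit 0x4 of the FLAG column is set when the read is unaligned.
            if (int($2 / 4) % 2 == 1) unaligned++
            else aligned++
        }
        END { printf "aligned=%d unaligned=%d\n", aligned, unaligned }
    ' "$1"
}
```

Running sam_summary eg1.sam prints a single line of counts that can be compared against the summary Bowtie2 writes to the console.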
Running Bowtie2 in Parallel on HPC
Below is a script to run the above example on the HPC using the Slurm job scheduler. The script must be saved with the .sh extension.
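#!/bin/bash
#SBATCH -p genacc_q
#SBATCH -J Bowtie2Job

# Load the gnu module required by Bowtie2
module load gnu

# Build the index, then align the unpaired reads against it
bowtie2-build e_coli_1000.fa e_coli
bowtie2 -x e_coli -U e_coli_1000.fq -S eg1.sam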
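The script above runs the aligner with a single thread. Bowtie2 accepts a -p option that sets the number of alignment threads, and Slurm exports SLURM_CPUS_PER_TASK when a job requests cores with -c (--cpus-per-task). The following is a minimal sketch of how the two might be connected; the helper name bowtie2_threads and the core count of 4 are illustrative assumptions, not site recommendations.

```shell
# Sketch only: the -c value and this helper are illustrative assumptions.
# In the job script, cores would be requested with a directive such as:
#   #SBATCH -c 4
# Slurm then exports SLURM_CPUS_PER_TASK inside the job.

# Print the thread count to pass to bowtie2, defaulting to 1 when the
# variable is unset (for example, outside a Slurm job).
bowtie2_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Usage inside the job script (not executed here):
#   bowtie2 -p "$(bowtie2_threads)" -x e_coli -U e_coli_1000.fq -S eg1.sam
```

Reading the count from the environment keeps the thread count in step with the cores actually allocated, so the script does not need editing when the -c value changes.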
Then submit your script using the following command, replacing YOURSCRIPT with the name of your script file:
$ sbatch YOURSCRIPT.sh
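The index only needs to be built once, so a script that is submitted repeatedly could skip bowtie2-build when the index files are already present. The check below is a hedged sketch: index_complete is a hypothetical helper name, and it assumes the six .bt2 suffixes listed earlier on this page.

```shell
# Hypothetical helper: succeed only if all six index files for the
# given prefix exist in the current directory.
index_complete() {
    for suffix in 1 2 3 4 rev.1 rev.2; do
        [ -f "$1.$suffix.bt2" ] || return 1
    done
    return 0
}

# Usage inside the job script (not executed here):
#   index_complete e_coli || bowtie2-build e_coli_1000.fa e_coli
```

With this guard in place, the first job builds the index and later jobs go straight to the alignment step.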
For more information about Bowtie2, please refer to the official documentation.