ABySS is a bioinformatics program designed to assemble genomes from short paired-end sequence reads. It can be run either serially or in parallel; the parallel version, which uses MPI, can assemble larger genomes than the serial version.
Running ABySS on RCC Resources
ABySS can be run on the HPC cluster either serially or in parallel. Load the abyss module before running ABySS.
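A quick way to confirm that the module is loaded is to check that abyss-pe is on the PATH. The following is a minimal sketch, assuming the module is named abyss as stated above; the variable name abyss_status is only illustrative.

```shell
# Load the module only where the "module" command is available.
if command -v module >/dev/null 2>&1; then
    module load abyss
fi

# Record whether the abyss-pe driver script can be found on PATH.
if command -v abyss-pe >/dev/null 2>&1; then
    abyss_status="found"
    echo "abyss-pe found at: $(command -v abyss-pe)"
else
    abyss_status="missing"
    echo "abyss-pe is not on PATH; check 'module load abyss'" >&2
fi
```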
Running ABySS serially on Spear
Download and assemble a small synthetic data set.
module load abyss
abyss-pe k=25 name=test se=https://raw.github.com/dzerbino/velvet/master/data/test_reads.fa
Calculate assembly contiguity statistics
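Contiguity statistics are usually computed with abyss-fac, which ships with ABySS. The following is a minimal sketch, assuming the test assembly was run with name=test and wrote its unitigs to test-unitigs.fa in the current directory; the file name is an assumption based on how abyss-pe names its output.

```shell
# Assumption: the single-end test assembly wrote test-unitigs.fa here.
# Adjust the file name if your run used a different name= value.
assembly="test-unitigs.fa"

if ! command -v abyss-fac >/dev/null 2>&1; then
    echo "abyss-fac not found; run 'module load abyss' first" >&2
elif [ ! -s "$assembly" ]; then
    echo "$assembly is missing or empty; run the assembly first" >&2
else
    # Print N50 and related contiguity statistics for the assembly.
    abyss-fac "$assembly"
fi
```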
To assemble paired reads in two files named test-1.fa and test-3.fa into contigs in a file named test-contigs.fa, run the command:
abyss-pe name=test k=64 in='test-1.fa test-3.fa'
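Because abyss-pe is a driver script implemented as a Makefile, it accepts make's -n (dry-run) option, which prints the pipeline commands without executing them. The sketch below checks that both input files exist and then performs a dry run; the variable names are hypothetical and the values mirror the command above.

```shell
# Hypothetical variable names; the values mirror the command above.
name="test"
k=64
reads="test-1.fa test-3.fa"

# Check that every read file exists before starting the assembly.
missing=0
for f in $reads; do
    if [ ! -f "$f" ]; then
        echo "missing input file: $f" >&2
        missing=1
    fi
done

if [ "$missing" -eq 0 ] && command -v abyss-pe >/dev/null 2>&1; then
    # -n is make's dry-run flag: show the commands without running them.
    abyss-pe -n name="$name" k="$k" in="$reads"
fi
```

Dropping the -n flag runs the actual assembly with the same parameters.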
Further details about the commands can be found in the ABySS documentation.
Running ABySS in parallel on HPC
The following SLURM submit script can be used as a template to submit a parallel ABySS job on the HPC cluster.
#!/bin/bash
#
# Name your job
#SBATCH -J abyss
#
# Change the queue
#SBATCH -p genacc_q
#
# Change the number of nodes and processes per node as necessary
#SBATCH -N 2
#SBATCH --ntasks-per-node=4
#
# Change the wall time
#SBATCH -t 00:30:00
#
module load abyss
#
# Run your ABySS commands
# np is the total number of MPI processes (2 nodes x 4 tasks per node = 8)
abyss-pe name=test k=48 np=8 in='test-1.fa test-3.fa'
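In abyss-pe, the np parameter sets the number of MPI processes, so it should match the total number of tasks requested from SLURM. Rather than hard-coding the count, it can be derived inside the job from SLURM's environment variables. The following is a sketch under that assumption; total_procs is a hypothetical helper, and the 1 x 1 fallback only applies when the script runs outside SLURM.

```shell
# total_procs NODES TASKS_PER_NODE: print the product (hypothetical helper).
total_procs() {
    echo $(( $1 * $2 ))
}

# Inside a SLURM job these variables reflect -N and --ntasks-per-node;
# outside SLURM they are unset, so default to 1 node x 1 task.
np=$(total_procs "${SLURM_JOB_NUM_NODES:-1}" "${SLURM_NTASKS_PER_NODE:-1}")
echo "total MPI processes: $np"

# Launch the assembly only when ABySS and both input files are available.
if command -v abyss-pe >/dev/null 2>&1 && [ -f test-1.fa ] && [ -f test-3.fa ]; then
    abyss-pe name=test k=48 np="$np" in='test-1.fa test-3.fa'
fi
```

With the template's settings (-N 2 and --ntasks-per-node=4) this yields np=8. Once the submit script is saved to a file, for example abyss_job.sh (a hypothetical name), it is submitted with sbatch abyss_job.sh.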