Clustal W

Clustal W is a program that takes nucleic acid (genetic) or protein sequence data and aligns the sequences. Clustal W is essentially the same program as Clustal X; the only difference is that Clustal X provides a graphical interface to Clustal W, while Clustal W itself is run from the command line.

Using Clustal W on RCC Resources

Running Clustal W on HPC Login Nodes

Clustal W requires the gnu module to run on HPC login nodes.

To begin working with Clustal W on a login node, run the command clustalw. This starts an interactive command-line interface with prompts, from which you can run your Clustal W job. You can also run Clustal W in non-interactive mode by passing options and files on the command line, using clustalw -[OPTIONS] FILES. Detailed documentation on the available options and inputs is printed by clustalw -help. For more information, please refer to the official website.
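As a rough sketch of the non-interactive mode described above, the snippet below writes a small FASTA file and aligns it with explicit options. The file names demo.fa and demo.phy and the three sequences are made up for illustration. The option names (-INFILE, -ALIGN, -OUTPUT, -OUTFILE) are the ones listed in the ClustalW 2.x help text; confirm them with clustalw -help for the version installed on your system.

```shell
# Write a tiny three-sequence FASTA file to align.
# These sequences are invented for illustration only.
cat > demo.fa <<'EOF'
>seq1
ATGCTAGCTAGCTACGATCGATCGTACG
>seq2
ATGCTAGCTTGCTACGATCGTTCGTACG
>seq3
ATGCTAGCAAGCTACGATCGATCGTTCG
EOF

# Run Clustal W non-interactively if it is on the PATH
# (on the login nodes, that means after `module load gnu`).
if command -v clustalw >/dev/null 2>&1; then
    # Assumed ClustalW 2.x option syntax: -OPTION=value
    clustalw -INFILE=demo.fa -ALIGN -OUTPUT=PHYLIP -OUTFILE=demo.phy
else
    echo "clustalw not found; load the gnu module first" >&2
fi
```

Here the alignment is written in PHYLIP format to demo.phy rather than to the default .aln file.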

The following is a basic example run of the program. If you do not specify an output format, Clustal W writes a default output file that has the same name as the input file but with the .aln extension (here, TEST.aln).

module load gnu
clustalw TEST.fa
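To check the result of a run like the one above, the following sketch derives the expected output names from the input name and reports whether each file exists. It assumes the default behavior, in which Clustal W also writes a guide tree with the .dnd extension next to the .aln alignment; the variable names are illustrative.

```shell
input="TEST.fa"
base="${input%.*}"          # strip the extension: TEST.fa -> TEST

alignment="${base}.aln"     # default alignment output
guide_tree="${base}.dnd"    # guide tree written alongside the alignment

# Report which of the expected output files are present.
for f in "$alignment" "$guide_tree"; do
    if [ -f "$f" ]; then
        echo "found $f"
    else
        echo "missing $f"
    fi
done
```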

Running Clustal W in Parallel

It is also possible to run Clustal W in parallel with OpenMPI. To do this, both the gnu and openmpi modules are required and must be loaded first.

An example Slurm submit script for Clustal W is below, again using the default output parameters. The submit script must be saved with the .sh extension and is submitted to the scheduler with sbatch.

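#!/bin/bash
#SBATCH -J clustalTest # Rename to better describe your specific job
#SBATCH -n 4           # Number of tasks
#SBATCH -t 1:00:00     # Time limit (1 hour)
#SBATCH -p genacc_q    # Partition to run in

# Load the compiler and MPI modules required for the parallel run
module load gnu openmpi

# Launch Clustal W on the allocated tasks
srun clustalw TEST.fa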
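Submitting the script could look like the sketch below. The file name clustal_job.sh is a placeholder chosen for this example, not a name taken from this page; substitute whatever name you saved the submit script under.

```shell
script="clustal_job.sh"   # hypothetical file name; use your own

# Submit the job and list your queued jobs, if Slurm is available here.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$script"
    squeue -u "$USER"
else
    echo "sbatch not found; run this on a Slurm login node" >&2
fi
```

By default, Slurm writes the job's standard output to a file named slurm-<jobid>.out in the directory you submitted from.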