Desmond
Desmond is a software package developed by D. E. Shaw Research to perform molecular dynamics simulations of biological systems on conventional commodity clusters. Desmond is installed on the FSU HPC cluster as a component of the Schrodinger software suite. The Schrodinger suite also includes Maestro, a visualization tool for molecular dynamics. The Schrodinger suite is available on the shared HPC storage system at the following path:
/gpfs/research/software/desmond/schrodinger2013-3
Molecular Dynamics Simulation Process Using the Schrodinger Suite
The procedure for running a molecular dynamics simulation using Desmond and Maestro can be summarized in the following figure:
In particular, the original structure file imported from the Protein Data Bank (PDB) must be prepared (preprocessed) by Maestro to produce the structure file (with force field) that is used as the input of the desmond simulation. The configuration file contains the simulation parameters such as the global cell, the force field, constraints (if there are any), and the integrator.
The visualization tool Maestro is a graphical user interface (GUI).
It is not advisable to run GUI software on the HPC login nodes. We suggest that you install Maestro on your personal computer and prepare the structure file on your laptop/desktop before running the Desmond simulation (the CPU-intensive process) on the HPC cluster.
Command Line Syntax for Desmond
The syntax for desmond is as follows:
/gpfs/research/software/desmond/schrodinger2013-3/desmond [Job_Options] [Backend_Options] Backend_Arguments
The Job_Options can be:
-h : print help message.
-v : print version information and exit.
-WAIT : don't exit until job completes.
-p,-NPROC : number of processors to be used (default is 1).
-JOBNAME name : the name of this job.
-gpu : run GPU version.
-jin filename : files or directories to be transferred to the compute node.
-jout filename : files to be copied back to the submit node.
-dryrun backend_cfg : generate backend config file only.
The Backend_Options can be:
-comm plugin : use communication plugin (serial or mpi).
-c config_file : parameter file for simulation.
-tpp n : number of threads per processor.
-dp : run double precision version (single precision by default).
-noopt : do not optimize parameters automatically.
-overwrite : overwrite trajectory.
-profile : enable backend profiling.
The Backend_Arguments can be:
-in x.cms : the structure file
-restore checkpoint : a checkpoint file for resuming a simulation
(run the desmond -h command for details).
As an example, to run the desmond simulation with input structure file x.cms and configuration file y.cfg on a single machine, run:
$SCHRODINGER/desmond -in x.cms -c y.cfg
where SCHRODINGER is the environment variable that holds the path to the Schrodinger software installation.
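The following is a minimal sketch of a wrapper around the command above. The function name desmond_cmd and its checks are hypothetical and not part of the Schrodinger suite; the function only prints the command line so that it can be inspected before it is run.

```shell
# Hypothetical helper: print the desmond command line for a structure
# file ($1) and a configuration file ($2) after basic sanity checks.
desmond_cmd() {
    # SCHRODINGER must point to the Schrodinger installation.
    if [ -z "${SCHRODINGER:-}" ]; then
        echo "SCHRODINGER is not set (try: module load desmond)" >&2
        return 1
    fi
    if [ $# -ne 2 ]; then
        echo "usage: desmond_cmd structure.cms config.cfg" >&2
        return 1
    fi
    printf '%s/desmond -in %s -c %s\n' "$SCHRODINGER" "$1" "$2"
}

# Fall back to the installation path given on this page if the
# variable has not been set by the desmond module.
SCHRODINGER=${SCHRODINGER:-/gpfs/research/software/desmond/schrodinger2013-3}
desmond_cmd x.cms y.cfg
```

Printing the command first is a cautious choice: a typo in a file name is caught before any compute time is requested.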
Command Line Options for Schrodinger Job Control Facility
Besides the command line options specific to desmond, there are important command line options recognized directly by the Schrodinger Job Control Facility. Here are a few important ones (refer to the Schrodinger Job Control Guide for more information):
-HOST host
-HOST host:n
-HOST "host_1:n_1 host_2:n_2 ... host_k:n_k"
This option tells the Job Control Facility to run the job on a specified host, or to submit the job to a queue. Here, host is the value of a name entry (not the host entry) in the schrodinger.hosts file (see the discussion below), or the actual address of a host, and n (n_1, n_2, ... n_k) is the number of cores to use on that host. When specifying more than one host, separate them with spaces and enclose the list in quotes.
-QARGS queue-args
This option passes arguments to the queue manager. These arguments are appended to those specified by the qargs settings in the hosts file schrodinger.hosts.
-TMPDIR directory
This option specifies the scratch directory for the job. The job directory is created as a subdirectory of the scratch directory. We suggest that you use $HOME/scratch or $HOME/_tmp.
There are also options for inspecting the runtime configuration that Desmond will use. For example,
-ENTRY
This option shows the section of the schrodinger.hosts file that will be used for this job, provided the -HOST host option points to a section of the hosts file.
Remark. Command-line options always take precedence over the corresponding environment variable.
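As a sketch of how the -HOST value for several hosts can be assembled, the hypothetical helper below joins host:n pairs with spaces and rejects a core count that is not a number. The host names node1 and node2 are placeholders, and the resulting command line is printed rather than run.

```shell
# Hypothetical helper: join "host" or "host:n" arguments into the single
# space-separated string that -HOST expects for more than one host.
host_arg() (
    out=""
    for pair in "$@"; do
        case "$pair" in
            *:*)
                n=${pair#*:}        # text after the first colon
                case "$n" in
                    ''|*[!0-9]*)
                        echo "bad core count in '$pair'" >&2
                        exit 1 ;;
                esac ;;
        esac
        out="${out:+$out }$pair"    # add a space only between items
    done
    printf '%s\n' "$out"
)

hosts=$(host_arg node1:8 node2:8)   # node1 and node2 are placeholder names
# The quotes around the host list are required when it contains spaces.
printf '%s\n' "\$SCHRODINGER/desmond -in x.cms -c y.cfg -HOST \"$hosts\" -TMPDIR \$HOME/scratch"
```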
Running Desmond Simulation on the HPC cluster
Molecular Dynamics simulation is CPU-intensive. A desmond simulation can run on the HPC cluster in two ways:
- via the Slurm job submit script (recommended)
- through the Schrodinger Job Control Facility
Submitting Desmond Jobs using a Slurm script
Below is an example Slurm submit script:
#!/bin/bash
#SBATCH -J desmond_mpi
#SBATCH --mail-type=ALL
#SBATCH -N 2
#SBATCH --ntasks-per-node=8
#SBATCH -t 24:00:00
# 2 GB of memory per CPU; 8 tasks per node gives 16 GB per node
#SBATCH --mem-per-cpu=2000
#SBATCH -p genacc_q
# the desmond module defines the environmental variables needed
# for running desmond
module load desmond
# export SCHRODINGER=/gpfs/research/software/desmond/schrodinger2013-3
mpirun $SCHRODINGER/desmond -in x.cms -c y.cfg -comm mpi
Here, x.cms and y.cfg are the structure file and the simulation parameter file, respectively.
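The resources requested by the #SBATCH lines in the script can be worked out with a few lines of shell arithmetic. This is only a sketch; the variable names are chosen for illustration.

```shell
# Values taken from the #SBATCH lines of the example script.
nodes=2               # -N 2
tasks_per_node=8      # --ntasks-per-node=8
mem_per_cpu_mb=2000   # --mem-per-cpu=2000

total_tasks=$((nodes * tasks_per_node))
mem_per_node_mb=$((tasks_per_node * mem_per_cpu_mb))
total_mem_mb=$((total_tasks * mem_per_cpu_mb))

echo "MPI tasks:       $total_tasks"
echo "Memory per node: $mem_per_node_mb MB"
echo "Memory in total: $total_mem_mb MB"
```

Assuming the submit script is saved as desmond_job.sh (a hypothetical file name), it would be submitted with sbatch desmond_job.sh and watched with squeue -u $USER.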
Submitting jobs using the Schrodinger Job Control facility
The Job Control Facility obtains information about the hosts on which it will run jobs (and other information needed by the queue) from the hosts file, schrodinger.hosts, and will submit and monitor the job for you. The default hosts file is located in:
$SCHRODINGER/schrodinger.hosts
An example entry of the schrodinger.hosts file is as follows:
name: genacc_q
host: hpc-login
queue: slurm
qargs: -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000 -t 14-00:00:00
processors: 500
tmpdir: $HOME/_tmp # You will need to create this directory before running the job
In this example:
- genacc_q is the name of the entry (each entry is one job submission scenario),
- hpc-login is the name of the host (the login node of the HPC cluster),
- slurm is the queuing software,
- qargs contains the command line options of qsub (or sbatch for Slurm); in the genacc_q queue, the wall clock limit is 14 days,
- tmpdir is the scratch space for the Desmond application (for example, you can use $HOME/scratch in your own schrodinger.hosts file).
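To check what a given entry will pass to the scheduler, the fields of a hosts file can be read with a short awk filter. The sketch below is not a Schrodinger tool: the helper name hosts_field is hypothetical, and it assumes "key: value" lines with each entry starting at its name: line, as in the example entry. It works on an abridged copy of that entry written to a temporary file.

```shell
# Hypothetical helper: print the value of one field (for example qargs)
# from the entry whose name matches.
# $1 = hosts file, $2 = entry name, $3 = field name.
hosts_field() {
    awk -v want="$2" -v key="$3" '
        $1 == "name:" { current = $2 }           # remember the current entry
        current == want && $1 == (key ":") {
            sub(/^[^:]*:[ \t]*/, "")             # drop the "key:" prefix
            print
        }
    ' "$1"
}

# Abridged copy of the example entry, so the sketch is self-contained.
hosts_file=$(mktemp)
cat > "$hosts_file" <<'EOF'
name: genacc_q
host: hpc-login
queue: slurm
qargs: -p genacc_q -N 2 --ntasks-per-node=8 --mem-per-cpu=2000
processors: 500
EOF

hosts_field "$hosts_file" genacc_q qargs
rm -f "$hosts_file"
```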
To submit a parallel job using the above configuration:
$ module load desmond
$ $SCHRODINGER/desmond -in x.cms -c x.cfg -HOST genacc_q
The -HOST genacc_q flag above tells the Schrodinger Job Control Facility to use the entry named genacc_q in the schrodinger.hosts file to create the job submit script. Consequently, the simulation is submitted to the genacc_q queue, requesting 2 nodes with 8 cores each.
To use your own hosts file, create a directory .schrodinger under your home directory and copy the default hosts file to it:
$ mkdir $HOME/.schrodinger
$ cp $SCHRODINGER/schrodinger.hosts $HOME/.schrodinger/schrodinger.hosts
Then edit the copy. You also need to define the environment variable SCHRODINGER_HOSTS to point to this file:
export SCHRODINGER_HOSTS=$HOME/.schrodinger/schrodinger.hosts
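The steps above can be combined into one repeatable sketch. The helper name setup_hosts_file is hypothetical; it uses mkdir -p so that a second run does not fail, and it leaves an existing personal copy untouched so that earlier edits are not overwritten.

```shell
# Hypothetical helper: copy the default hosts file into a personal
# location unless a personal copy already exists, then print its path.
# $1 = Schrodinger installation directory, $2 = home directory.
setup_hosts_file() (
    src="$1/schrodinger.hosts"
    dest_dir="$2/.schrodinger"
    dest="$dest_dir/schrodinger.hosts"
    [ -f "$src" ] || { echo "no hosts file at $src" >&2; exit 1; }
    mkdir -p "$dest_dir"                   # no error if it already exists
    [ -f "$dest" ] || cp "$src" "$dest"    # keep an already edited copy
    printf '%s\n' "$dest"
)

# Typical use after "module load desmond" has set SCHRODINGER.
if [ -n "${SCHRODINGER:-}" ] && hosts=$(setup_hosts_file "$SCHRODINGER" "$HOME"); then
    export SCHRODINGER_HOSTS="$hosts"
fi
```

To make the setting persistent, the export line can be added to a shell startup file such as ~/.bashrc.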
Monitoring Jobs Using the Schrodinger Job Control Facility
Besides the job control commands provided by the job scheduler, such as squeue and sbatch, Desmond simulations can be monitored by the Schrodinger Job Control Facility.
The Job Control facility provides tools for monitoring and controlling the jobs that it runs. Information about each job is kept in the user's job database. This database is kept in the directory $HOME/.schrodinger/.jobdb.
A Desmond job can be manipulated and monitored by the Schrodinger utility jobcontrol as follows:
$SCHRODINGER/jobcontrol [command or query]
Here, command is the action you want to perform, and query defines the scope of the action performed by the command.
The command can be any one of the following:
-cancel : cancel a job that has been launched but has not started.
-kill : terminate a job immediately.
-list : list the JobId, job name, and status.
-resume : continue running a paused job.
-dump : show the complete job record.
A query can be:
all : all jobs in your job database.
active : all active jobs (not finished).
finished : all finished jobs.
JobId : a single job. The JobId is a unique identifier consisting of the name of the submission host, a sequence number, and a hexadecimal timestamp, e.g., submit-0-a1b2c3d4.
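As an illustration of the JobId layout, the sketch below splits an identifier into its three parts with shell parameter expansion. The helper name parse_jobid is hypothetical, and the hyphen-separated layout is inferred from the example above rather than taken from the Schrodinger documentation.

```shell
# Hypothetical helper: split a JobId of the form host-sequence-timestamp.
# Fields are taken from the right because a host name may itself
# contain hyphens (for example hpc-login).
parse_jobid() (
    id="$1"
    stamp=${id##*-}     # text after the last hyphen
    rest=${id%-*}       # everything before the last hyphen
    num=${rest##*-}     # sequence number
    host=${rest%-*}     # the remaining prefix is the host name
    printf 'host=%s seq=%s stamp=%s\n' "$host" "$num" "$stamp"
)

parse_jobid submit-0-a1b2c3d4
# prints: host=submit seq=0 stamp=a1b2c3d4
```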
For example, to list all the jobs in your job database that finished successfully, enter:
$SCHRODINGER/jobcontrol -list finished
To list just the job whose JobId is submit-0-a1b2c3d4, enter:
$SCHRODINGER/jobcontrol -list submit-0-a1b2c3d4
To list the complete database record for a job, enter the command:
$SCHRODINGER/jobcontrol -dump jobid
References
A few useful resources for Desmond are as follows:
(1) [Desmond User's Guide](http://www.deshawresearch.com/downloads/download_desmond.cgi/Desmond_Users_Guide-0.5.3.pdf)
(2) [Schrodinger Job Control Guide](http://gohom.win/ManualHom/Schrodinger/Schrodinger_2012_docs/general/job_control.pdf)