Python is a powerful, easy-to-use, object-oriented scripting language. It has numerous packages designed for a wide range of purposes.
Using Python on RCC Resources
Currently, Python 2 and Python 3 are available on HPC resources (versions 2.7.5 and 3.7.0). Python 2 is available by default:
    $ python
    Python 2.7.5 (default, Aug 4 2017, 00:39:18)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
To use Python 3.7.0, load the python3 module:
    $ module load python3
    $ python3.7
    Python 3.7.0 (default, Feb 15 2019, 13:18:19)
    [GCC 4.8.5 20150623 (Red Hat 4.8.5-16)] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
Anaconda is a custom distribution of Python with hundreds of modules prepackaged for scientific and mathematical use. For details, refer to our Anaconda documentation. We have two versions of Anaconda installed, one for Python 2 and one for Python 3.
To use Python 2.7.15 from Anaconda, load the anaconda2.7.15 module, which changes your PATH environment variable to include it:
    $ module load anaconda2.7.15
    $ python
    Python 2.7.15 |Anaconda, Inc.| (default, May 1 2018, 23:32:55)
    [GCC 7.2.0] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
To use Python 3.7.3 from Anaconda, load the anaconda3.7.3 module, which changes your PATH environment variable to include it:
    $ module load anaconda3.7.3
    $ python
    Python 3.7.3 (default, Mar 27 2019, 22:11:17)
    [GCC 7.3.0] :: Anaconda, Inc. on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>>
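Because the default python command starts Python 2, a script written for Python 3 can fail in confusing ways if the wrong interpreter is used. The following is a minimal sketch of a version guard using the standard sys.version_info tuple. The function name meets_minimum and the 3.7 threshold are illustrative assumptions, not part of the RCC setup.

```python
import sys


def meets_minimum(minimum=(3, 7)):
    """Return True if the running interpreter is at least the given (major, minor)."""
    return tuple(sys.version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # Assumed requirement for illustration: Python 3.7 or newer.
    if not meets_minimum((3, 7)):
        sys.exit("This script needs Python 3.7 or newer; found %d.%d"
                 % tuple(sys.version_info[:2]))
    print("Running Python %d.%d.%d" % tuple(sys.version_info[:3]))
```

Placed at the top of a script, a guard like this stops immediately with a clear message when the script is started with the wrong interpreter.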
Loading a module within Python is no different from loading one on a local installation of Python. For example, either
    from math import *
or
    import math
imports the math library (the second form keeps the names under the math namespace). The list of readily available Python modules can be viewed from the Python command line.
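One way to list the modules visible to the current interpreter is sketched below. It is an illustration built on the standard-library pkgutil.iter_modules() function and sys.builtin_module_names, not necessarily the command this page originally referred to. The function name available_modules is an assumption made for the example.

```python
import pkgutil
import sys


def available_modules():
    """Return a sorted list of top-level module names the interpreter can find."""
    # Modules and packages found on the module search path (sys.path).
    found = {module_info[1] for module_info in pkgutil.iter_modules()}
    # Modules compiled directly into the interpreter, such as sys.
    found.update(sys.builtin_module_names)
    return sorted(found)


if __name__ == "__main__":
    names = available_modules()
    print("%d modules found; first few: %s" % (len(names), names[:5]))
```

The result depends on which environment module (python3, an Anaconda module, or a virtualenv) is active when the interpreter starts.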
Anaconda and Biopython both contain a wealth of important functions, and Anaconda also comes with a large set of libraries. Shown below is a list of the major Python packages RCC has available that are not contained within Anaconda or Biopython.
To use modules not available on the HPC, either modify the Python-specific environment search path PYTHONPATH to include a directory with the downloaded module, or run the python executable within a directory that includes the downloaded module, both of which are described here.
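Directories listed in PYTHONPATH are added to sys.path when the interpreter starts, and the directory of the script being run is also placed on that search path. The sketch below shows the same mechanism at runtime: it adds a directory to sys.path and then imports a module from it. The helper import_from_directory and the demo_module file are hypothetical names created only for this illustration.

```python
import importlib
import os
import sys
import tempfile


def import_from_directory(module_name, directory):
    """Import module_name after putting directory on the module search path."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    # Make sure the import system notices files created after startup.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


if __name__ == "__main__":
    # Stand-in for a downloaded module: a throwaway file in a temp directory.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "demo_module.py"), "w") as handle:
        handle.write("GREETING = 'hello from a custom module'\n")
    demo = import_from_directory("demo_module", demo_dir)
    print(demo.GREETING)
```

Setting PYTHONPATH in the shell before starting Python has the same effect without any changes to the script itself.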
Custom modules with virtualenv
You can install any Python module (for Python 2 or 3) using a virtualenv, which is a self-contained copy of the Python runtime and libraries that lives in your home directory. You can create multiple virtual environments in your home directory for different applications.
The following example demonstrates how to create a Python virtual environment named 'myapp':
    # Load Python module
    $ module load python3

    # Create a Python virtualenv named 'myapp'
    $ virtualenv -p python3.7 ~/myapp

    # Use pip3 to install pycrypto
    $ ~/myapp/bin/pip3.7 install pycrypto

    # Run Python in virtual environment
    $ ~/myapp/bin/python3.7
If you use pip to install Python packages, you may see warnings during installation that look like:
Failed building wheel for pycrypto
This does not necessarily mean that the package installation failed. Run Python and try importing and using the module before assuming it is broken.
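A quick import check like the one suggested above can be scripted. The following is a minimal sketch using the standard importlib.import_module function. The helper name module_works is an assumption for illustration, and Crypto is the import name that the pycrypto package provides.

```python
import importlib


def module_works(module_name):
    """Return (True, None) if module_name imports, else (False, error text)."""
    try:
        importlib.import_module(module_name)
    except Exception as error:  # ImportError, or errors raised by a broken build
        return False, "%s: %s" % (type(error).__name__, error)
    return True, None


if __name__ == "__main__":
    # "math" should always import; "Crypto" imports only if pycrypto is installed.
    for name in ("math", "Crypto"):
        ok, detail = module_works(name)
        print(name, "OK" if ok else "FAILED (%s)" % detail)
```

Run it with the interpreter from the virtual environment (for example ~/myapp/bin/python3.7) so that the check sees the packages installed there. A successful import is a good first sign, though it does not exercise every function in the package.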
You can also create a virtual environment starting from Anaconda Python. For example, to use Anaconda Python 3.7.3:
    $ cd $HOME
    $ module load anaconda3.7.3
    # this creates a directory venv in your home directory
    $ virtualenv venv
    # activate your virtual environment
    $ source venv/bin/activate
    # install package "tensorflow" into your venv
    $ pip install tensorflow
Below is an example of submitting a Python virtualenv job using Slurm. Notice that the python3.7 executable from the virtualenv is used.
    #!/bin/bash
    #SBATCH -n 1
    #SBATCH -p genacc_q
    #SBATCH -t 00:10:00
    #SBATCH --mail-type=ALL

    module load python3
    ~/myapp/bin/python3.7 my_python_script.py
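The contents of my_python_script.py are not shown on this page. As a hypothetical placeholder, the sketch below reports which interpreter the job actually used and whether it is running inside a virtual environment, which can help confirm that the submit script picked up the virtualenv. The function name in_virtualenv is an assumption for illustration.

```python
import sys


def in_virtualenv():
    """Return True when running from a virtualenv or venv interpreter."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtualenv())
```

The printed lines end up in the job's Slurm output file, so they can be checked after the job finishes.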