Python

Version
3.10.4, 3.6.8, 2.7.17

Python is a powerful, easy-to-use, object-oriented scripting language. It has a large ecosystem of packages designed for many different purposes.

Using Python on RCC Resources

Python 3 is currently available on HPC resources:

$ python
Python 3.6.8 (default, Aug 24 2020, 17:57:11)
[GCC 8.3.1 20191121 (Red Hat 8.3.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Anaconda

Anaconda is a custom distribution of Python with hundreds of modules prepackaged for scientific and mathematical use. For details, refer to our Anaconda documentation. Three versions of Anaconda are installed, providing Python 2.7.15, 3.7.3, and 3.8.3 respectively.

To use Python 3.7.3 from Anaconda, load the anaconda/3.7.3 module:

$ module load anaconda/3.7.3
$ python
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 

Available Modules

Loading a module within Python works the same way as on a local installation of Python. For example, use import math or from math import * to import the math library.
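
As a brief illustration of the two import styles, the sketch below uses only the standard math library. The variable names are made up for the example.

```python
import math                  # access names through the module: math.sqrt
from math import pi, sqrt    # import selected names directly

# Both styles refer to the same underlying function.
hypotenuse = math.sqrt(3 ** 2 + 4 ** 2)
circle_area = pi * sqrt(4) ** 2

print(hypotenuse)   # 5.0
print(circle_area)
```

Importing specific names is generally easier to follow than from math import *, which pulls every public name from the module into the current namespace.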

Anaconda and Biopython both provide a large collection of libraries. The list below shows the major Python packages RCC has available that are not included in Anaconda or Biopython.

  • PyHDF
  • pyOpenSSL
  • pkgconfig
  • polygon2
  • shapely
  • six
  • ttfquery
  • urllib3
  • visual
  • pyGtkGLExt
  • h5py

To use modules that are not available on the HPC, either add the directory containing the downloaded module to the Python search-path environment variable PYTHONPATH, or run the python executable from within a directory that contains the downloaded module. Both approaches are described here.
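
Directories listed in PYTHONPATH end up in sys.path, the list Python searches when it imports a module. The sketch below shows the same effect from inside a script. The helper name add_module_dir and the directory ~/python_modules are hypothetical and only serve as an illustration.

```python
import os
import sys


def add_module_dir(path):
    """Put a directory at the front of the module search path, once."""
    path = os.path.abspath(os.path.expanduser(path))
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical directory holding a downloaded module.
module_dir = add_module_dir("~/python_modules")
print(module_dir in sys.path)  # True
```

Printing sys.path is also a quick way to check whether a PYTHONPATH setting was picked up by the interpreter.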

Custom modules with virtualenv

You can install any Python (v2 or 3) module you wish using a virtualenv, which is a self-contained copy of the Python runtime and libraries that lives in your home directory. You can create multiple virtual environments in your home directory for different applications.

The following example demonstrates how to create a Python virtual environment named 'myapp':

# Load Python module
$ module load python/3

# Create a Python virtualenv named 'myapp'
$ virtualenv -p python ~/myapp

# Use pip3 to install pycrypto
$ ~/myapp/bin/pip3 install pycrypto

# Run Python in virtual environment
$ ~/myapp/bin/python3

If you use pip to install Python packages, you may see warnings during installation that look like:

  Failed building wheel for pycrypto

This does not necessarily mean that the package installation failed. Run Python and try importing and using the module before assuming it is broken.
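
One way to run that check is a short script executed with the virtualenv's interpreter (for example ~/myapp/bin/python3). This is a minimal sketch; the helper name can_import is made up for illustration, and it only confirms that the import succeeds, not that every feature of the package works.

```python
import importlib


def can_import(module_name):
    """Return True if the named module imports cleanly, False otherwise."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# "Crypto" is the import name that pycrypto provides.
for name in ("Crypto", "json"):
    print(name, "OK" if can_import(name) else "NOT importable")
```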

You can also create a virtual environment starting from Anaconda Python. For example, to use Anaconda Python 3.7.3:

$ cd $HOME
$ module load anaconda/3.7.3

# this creates a directory venv in your home directory
$ virtualenv venv

# activate your virtual environment
$ source venv/bin/activate

# install package "tensorflow" into your venv
$ pip install tensorflow
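
To confirm that the environment is really active before installing or running anything, the interpreter can be asked directly. The following is a sketch using only the standard library; the function name in_virtual_environment is an illustrative choice.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer
    # virtualenv releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment:", in_virtual_environment())
print("prefix:", sys.prefix)
```

Inside an activated environment, the printed prefix should point at the environment directory rather than at the system or Anaconda installation.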

Below is an example of submitting a Python virtualenv job using Slurm. Notice that the python3 executable from the virtualenv is used.

#!/bin/bash

#SBATCH -n 1
#SBATCH -p genacc_q
#SBATCH -t 00:10:00
#SBATCH --mail-type=ALL

~/myapp/bin/python3 my_python_script.py
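
The contents of my_python_script.py are not shown in this document. As a hypothetical stand-in, the sketch below reports which interpreter the job actually ran with and the Slurm job ID, which can help when checking that the virtualenv was picked up. The function name job_summary is an assumption made for this example.

```python
import os
import sys


def job_summary(environ=os.environ):
    """Collect a few facts that help when debugging a batch job."""
    return {
        "executable": sys.executable,
        "python_version": sys.version.split()[0],
        # Slurm sets SLURM_JOB_ID inside a job; it is absent elsewhere.
        "slurm_job_id": environ.get("SLURM_JOB_ID", "not running under Slurm"),
    }


if __name__ == "__main__":
    for key, value in job_summary().items():
        print("{}: {}".format(key, value))
```

When the job runs through the virtualenv, the executable line in the job output should show a path under the virtualenv directory.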

 

Setting Up Custom Environments with virtualenv

Create a requirements.txt file listing every package your environment needs so that pip can install them all in one batch. Save this file in your home directory. The following example is taken from the Python Bootcamp given by RCC on 2/8/2022.

numpy
pandas
tensorflow
jupyterlab
matplotlib
addfips
plotly==4.5
argparse
nltk
datetime
notebook
nbconvert
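
Long requirements files can end up naming the same package twice, sometimes with conflicting version pins. The sketch below is one way to spot such repeats before running pip. The function name duplicate_requirements is hypothetical, and the parsing is deliberately simplified: it handles plain names and version specifiers, not every form pip accepts.

```python
import re
from collections import Counter


def duplicate_requirements(lines):
    """Return package names that appear more than once, in sorted order."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):    # skip blanks and pip options
            continue
        # The name is everything before the first version specifier or extra.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        names.append(name.lower().replace("_", "-"))
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


sample = ["numpy", "plotly==4.5", "datetime", "nbconvert", "datetime", "plotly"]
print(duplicate_requirements(sample))  # ['datetime', 'plotly']
```

To check a real file, pass an open file object instead of the sample list, since the function only needs an iterable of lines.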

Next, write a script that creates the environment and installs the packages, and save it in your home directory as well (the same folder as your requirements.txt file). You can also have your login shell activate that environment automatically if you want to; be careful with this if you plan to keep multiple virtual environments in your home directory. The following example, called prepareenv.sh, is taken from the Python Bootcamp given by RCC on 2/8/2022.

#!/bin/bash

# Work from the home directory, where requirements.txt is saved
cd "${HOME}"

# Create and activate the virtual environment, then install the packages
virtualenv -p python "${HOME}/bootcamp_venv"
source "${HOME}/bootcamp_venv/bin/activate"
pip install -r requirements.txt
## Comment out the following line if you don't want to have the environment
## automatically start every time you log in!
echo "source ${HOME}/bootcamp_venv/bin/activate" >> "${HOME}/.bashrc"