Conda and Anaconda

Conda is a Python package manager, virtual environment manager, and more.


Conda is a package manager, similar to pip. It installs, updates, and removes packages for you. It has two advantages over pip: environment management is built in, so each project can have its own isolated environment, and it can install data science libraries and tools that are not written in Python (e.g., R or C libraries). It is widely used for data science.

Anaconda is a "batteries included" distribution of Python that includes over 150 data science packages. It uses Conda as its package manager.

Conda vs Pip

Both Conda and Pip are package managers written in Python. The following table shows the differences between the two [1]:

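|                               | conda                  | pip                             |
|-------------------------------|------------------------|---------------------------------|
| manages                       | binaries               | wheel or source                 |
| can require compilers         | no                     | yes                             |
| package types                 | any                    | Python-only                     |
| creates isolated environments | yes, built-in          | no, requires virtualenv or venv |
| dependency checks             | yes                    | no                              |
| package sources               | Anaconda repo or cloud | PyPI                            |
| recommended for               | data science           | Python-only code                |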

Setting up a Conda Environment

Initial setup

To use Conda, you need to configure your shell environment using the conda init command. You only need to do this once. After that, your shell will be configured for Conda each time you log in to the HPC.

From a login node:

# Load the anaconda module (1) 
[USER@h22-login-26 ~]$ module load anaconda

# Initialize your base environment (you only need to do this ONCE!) (2)
[USER@h22-login-26 ~]$ conda init bash

# Read the new configuration (3)
[USER@h22-login-26 ~]$ source ~/.bashrc

# Your prompt will look something like this (notice the "(base)" in front of the prompt)
(base) [USER@h22-login-26 ~]$
  1. If you need a specific version of anaconda, use anaconda/VERSION. To see a list of versions available on the cluster, run module avail anaconda.
  2. If you are using a shell other than bash (e.g., tcsh, zsh, or fish), substitute that here.
  3. If you are using a shell other than bash, you will need to source your shell initialization file; if you do not know what that file name is, you can log out and log back in to the HPC.
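
Before rerunning conda init, it can help to check whether your startup file already contains the block that conda init writes. The following is a minimal sketch, assuming bash and the "# >>> conda initialize >>>" marker comment; the function name conda_init_present is hypothetical and not part of Conda.

```shell
# Hypothetical helper: succeed if the given startup file already holds
# the block that `conda init` writes.
conda_init_present() {
    rc_file="${1:-$HOME/.bashrc}"
    # -q: quiet, -F: treat the marker as a fixed string, not a regex
    [ -f "$rc_file" ] && grep -qF '# >>> conda initialize >>>' "$rc_file"
}

if conda_init_present "$HOME/.bashrc"; then
    echo "conda init has already been applied to ~/.bashrc"
else
    echo "no conda block found; run: conda init bash"
fi
```

The check only reads the file, so it is safe to run at any time.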

Note

The conda initialization may cause your shell to take a second or two longer to load.

If you no longer need to use the conda package manager, you can edit your shell initialization script (~/.bashrc for bash) and remove the conda block: every line from # >>> conda initialize >>> through # <<< conda initialize <<<, including those two marker lines.
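
The same removal can be scripted with sed. This is a minimal sketch, assuming the two marker comments sit at the start of their own lines; the function name remove_conda_init is hypothetical, and the original file is kept next to the edited one with a .bak suffix.

```shell
# Hypothetical helper: delete the conda init block from a startup file,
# keeping a backup copy with a .bak suffix.
remove_conda_init() {
    rc_file="$1"
    # The address range /start/,/end/ selects both marker lines and
    # everything between them; d deletes the selection.
    sed -i.bak '/^# >>> conda initialize >>>/,/^# <<< conda initialize <<</d' "$rc_file"
}

# Example (not run here): remove_conda_init "$HOME/.bashrc"
```

After running it, log out and back in (or start a new shell) so the change takes effect.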

Managing Conda Environments

You can create as many Conda environments as you need. Each environment is isolated in its own directory.

For example, to create a Conda environment named my_conda_app:

(base) [USER@h22-login-26 ~]$ conda create -n my_conda_app

Read and accept the prompts. When the script completes, you will see the following:

#
# To activate this environment, use
#
#     $ conda activate my_conda_app
#
# To deactivate an active environment, use
#
#     $ conda deactivate

Run the command conda activate my_conda_app:

(base) [USER@h22-login-26 ~]$ conda activate my_conda_app

# Your prompt will now be prefixed with:
(my_conda_app) [USER@h22-login-26 ~]$
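
A script that installs packages can check the active environment first, so that packages do not land in base by mistake. This is a minimal sketch, assuming that conda activate exports the CONDA_DEFAULT_ENV variable with the name of the active environment; the function name require_env is hypothetical.

```shell
# Hypothetical guard: succeed only when the named environment is active.
# `conda activate` exports CONDA_DEFAULT_ENV with the active environment's name.
require_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

if require_env my_conda_app; then
    echo "my_conda_app is active"
else
    echo "my_conda_app is not active (current: ${CONDA_DEFAULT_ENV:-none})"
fi
```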

Work in your Conda environment:

(my_conda_app) [USER@h22-login-26 ~]$ conda install some-package

To deactivate your Conda environment:

(my_conda_app) [USER@h22-login-26 ~]$ conda deactivate

# Your prompt will revert to (base)
(base) [USER@h22-login-26 ~]$

List your Conda environments:

(base) [USER@h22-login-26 ~]$ conda env list
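
The output of conda env list can also be used in scripts, for example to create an environment only if it does not exist yet. This is a minimal sketch, assuming the listing puts the environment name in the first column and prefixes header lines with "#"; the function name env_in_list is hypothetical.

```shell
# Hypothetical helper: read `conda env list` output on stdin and succeed
# if an environment with the given name is listed.
env_in_list() {
    # Skip comment lines; the first column holds the environment name.
    awk -v name="$1" '!/^#/ && $1 == name { found = 1 } END { exit !found }'
}

# Create my_conda_app only when conda is available and the name is unused.
if command -v conda >/dev/null 2>&1; then
    if conda env list | env_in_list my_conda_app; then
        echo "my_conda_app already exists"
    else
        conda create -y -n my_conda_app
    fi
fi
```

The -y flag answers the confirmation prompts automatically; leave it out if you prefer to review them.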

Delete a Conda environment:

# Replace 'my_conda_app' with the name of your Conda environment
(base) [USER@h22-login-26 ~]$ conda remove --name my_conda_app --all

Anaconda

We provide several pre-installed Anaconda environments globally on the HPC. To load the default Anaconda version, load the environment module:

$ module load anaconda

If you need a specific version, you can use the module avail anaconda command, which will show the available versions on the HPC:

$ module avail anaconda

------------------------------------------------------ /opt/modulefiles/core -------------------------------------------------------
   anaconda/2.7.15    anaconda/3.7.3    anaconda/3.8.3 (D)

  Where:
   D:  Default Module

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

You can verify that Anaconda loaded successfully by checking the Python path with the which command:

$ module load anaconda

$ which python
/gpfs/research/software/python/anaconda38/bin/python

# Use Anaconda packages
$ ipython
Python 3.8.3 (default, Jul  2 2020, 16:21:59) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.16.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]:
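
The path check can be automated in a job or setup script. This is a minimal sketch, assuming the Anaconda install directory has "anaconda" in its name, as in the example path above; the function name is_anaconda_python is hypothetical.

```shell
# Hypothetical check: does a Python path point into an Anaconda install?
# Assumes the install directory has "anaconda" in its name.
is_anaconda_python() {
    case "$1" in
        *anaconda*) return 0 ;;
        *)          return 1 ;;
    esac
}

# command -v prints the path of the python that would run, if any.
python_path=$(command -v python || true)
if is_anaconda_python "$python_path"; then
    echo "Anaconda Python: $python_path"
else
    echo "python resolves to '${python_path}', not an Anaconda install"
fi
```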

Discovering Anaconda Packages

The list of packages included with Anaconda differs based on which module you use. The best way to determine what packages are installed in Anaconda is to run the conda list command.
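
To look up a single package, you can filter the conda list output. This is a minimal sketch, assuming the listing has the package name in the first column and the version in the second, with header lines prefixed by "#"; the function name pkg_version is hypothetical, and numpy is only an example package.

```shell
# Hypothetical helper: read `conda list` output on stdin and print the
# version of the named package, or fail if it is not listed.
pkg_version() {
    awk -v name="$1" '!/^#/ && $1 == name { print $2; found = 1 } END { exit !found }'
}

if command -v conda >/dev/null 2>&1; then
    conda list | pkg_version numpy || echo "numpy is not installed in this environment"
fi
```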

Installing Anaconda in your home directory

If you want to use the latest version of Anaconda, you can create a conda environment in your home directory and install Anaconda in it.

(base) [USER@h22-login-26 ~]$ conda create -n my_anaconda_app
(base) [USER@h22-login-26 ~]$ conda activate my_anaconda_app
(my_anaconda_app) [USER@h22-login-26 ~]$ conda install anaconda

# This will take a while... It will install approximately 4.8 GB of data into your home directory

(my_anaconda_app) [USER@h22-login-26 ~]$ conda list anaconda$
# packages in environment at /gpfs/home/cam02h/.conda/envs/my_anaconda_app:
#
# Name                    Version                   Build  Channel
anaconda                  2023.06                 py311_0  
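
Because a full Anaconda install is large, you may want to check how much space the environment uses in your home directory. This is a minimal sketch, assuming the environment lives under ~/.conda/envs as in the listing above; the function name dir_size_kb is hypothetical.

```shell
# Hypothetical helper: print the size of a directory in kilobytes.
dir_size_kb() {
    # -s: one summary line, -k: report sizes in 1 KB blocks
    du -sk "$1" | awk '{ print $1 }'
}

env_dir="$HOME/.conda/envs/my_anaconda_app"   # assumed default location
if [ -d "$env_dir" ]; then
    echo "my_anaconda_app uses $(dir_size_kb "$env_dir") KB"
fi
```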

  1. Comparison table courtesy of Kumar Brar