
Software and Modules


Software
We have installed a number of software packages in /central/software.  These software packages are managed by the modules environment management system.

Please take a look and see what is available for your use. If you need other software installed that would be of use to the community in general, feel free to contact us at help-hpc@caltech.edu and we can look into it; otherwise, a local or group install is advisable (see below).

Typically, end users or groups will compile and install scientific software packages themselves into their home directories or shared group directories. This is because the build parameters are often specific to the group or person in question and may include compiled-in options that only they can use. In such cases, you may want to use a package management system like Anaconda or Spack to make building and maintaining the software environment easier.

Modules
The module system manages your environment: it puts the software packages you select in your PATH and applies any other settings a package may need.

To get started, look at a listing of what is available:

module avail

To load something into your environment use the load command:

module load openmpi/2.1.2

You have to specify the version as well as the package name when loading it.

Some modules require other modules to be loaded as well; if this is the case, the module command will tell you. You can also find out more information about a package using the show or whatis commands:

module show openmpi/2.1.2
module whatis openmpi/2.1.2

If you want to remove all currently loaded modules, you can use the purge command:

module purge

If you always use the same module, you can add the load command to your startup file, such as .bash_profile or .cshrc, depending on your shell.
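For example, bash users could append the load command shown above to their profile (this assumes the openmpi/2.1.2 module used earlier is the one you want at every login):

echo 'module load openmpi/2.1.2' >> ~/.bash_profile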


Containers
If you need to run containers on the cluster, you should use Singularity as the container technology. It is built specifically for HPC workloads: it supports MPI, is scheduled like any other job, runs in userspace, and overlays the filesystem.
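As a rough sketch (the Docker image and the command run inside it are only examples, not something we provide), you might pull an image and run a command inside it like this:

singularity pull docker://ubuntu:20.04
singularity exec ubuntu_20.04.sif cat /etc/os-release

The same singularity exec command can be placed in a Slurm submission script and scheduled like any other job.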



Spack Package Management

Some users may want to manage their own software stack. Spack is a tool that makes this easier by combining software installation and module generation in one package. You can find more information at https://spack.io/. You may also want to check out the Spack tutorial.
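As a minimal sketch (the package name is chosen only for illustration), a typical Spack workflow in a home or group directory looks roughly like this:

git clone https://github.com/spack/spack.git
. spack/share/spack/setup-env.sh
spack install zlib
spack load zlib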


Anaconda Package Management

If most of your software requirements revolve around scientific python packages or you require newer packages than the system python, you may want to install Conda into your home or group directory. Conda is an open source package and environment management system.

Here's a Conda Cheat Sheet.
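For quick reference, a few of the most common Conda commands look like this (the environment and package names are placeholders):

conda create -n myenv python=3.9
conda activate myenv
conda install numpy
conda deactivate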

Example 1, Miniconda installed into your home directory. (Not recommended due to 50GB home directory quotas)

Run the following to install Miniconda. Note that we recommend Example 2 below, as group directories provide much more storage space than the 50GB home directories on the cluster.

Python3 MiniConda installation

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh ./Miniconda3-latest-Linux-x86_64.sh 

(Accept the license agreement, select the install location, and select yes to add conda to your path if desired.) Log out and back into the cluster and run the following to verify that the base Conda environment is now active.

(base) [jflilley@login1]$ which python
~/anaconda3/bin/python

Because sourcing of the Conda environment is in your ~/.bashrc, all compute nodes will have the proper paths set to use Conda packages. If you have set up multiple Conda environments, be sure that your Slurm submission scripts activate the correct environment if it is not the default one, as in the sketch below.
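A minimal sketch of such a submission script (the environment name myenv, the script my_script.py, and the install path are placeholders for your own setup):

#!/bin/bash
#SBATCH --job-name=conda-job
#SBATCH --ntasks=1
#SBATCH --time=00:10:00

# adjust this path to wherever your Miniconda is installed
source ~/miniconda3/etc/profile.d/conda.sh
# activate the desired environment before running your code
conda activate myenv
python my_script.py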

Example 2, MiniConda install into a group directory.  (Recommended approach as group directories are 10TB+)

Create a dedicated location for your Miniconda installation:
mkdir /central/groups/<your-group-dir>/$USER

Python3 MiniConda installation

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
sh ./Miniconda3-latest-Linux-x86_64.sh 

(Accept the license agreement, select the install location (the newly created group directory), and select yes to add conda to your path if desired.) Log out and back into the cluster and run the following to verify that the base Conda environment is now active.

(base) [jflilley@login1]$ which python
/central/groups/<your-group-dir>/<username>/miniconda3/bin/python

Because sourcing of the Conda environment is in your ~/.bashrc, all compute nodes will have the proper paths set to use Conda packages. If you have set up multiple Conda environments, be sure that your Slurm submission scripts activate the correct environment if it is not the default one (see the sketch in Example 1).

Example 3, Conda install into your group's shared directory. Complete the Example 1 installation above first. Add the secondary Conda environment location by creating a new .condarc file in your home directory, replacing <group name directory> with your group's directory name.

envs_dirs: 
  - /central/groups/<group name directory>/anaconda2
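The shared environment itself has to exist before it will show up; as a hypothetical sketch, whoever maintains the group installation could create it with something like this (the Python version is arbitrary here):

conda create --prefix /central/groups/<group name directory>/anaconda2/second-shared-environment python=3.9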

Verify the environment shows up and then activate.
[jflilley@login1 ~]$ conda env list
second-shared-environment     /central/groups/MICS/anaconda2/second-shared-environment
base                       *  /home/jflilley/anaconda2

conda activate second-shared-environment
which python
/central/groups/<group name directory>/second-shared-environment/bin/python

Another method to share Conda environments, while allowing other users the flexibility to manage their own, is to export an environment to a YAML file that can be imported into another user's personal Conda setup.


To export an environment to share with another researcher, first activate the intended environment in Conda and run the following command. 

conda env export > ~/shared-conda-environment.yml
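The resulting file is a plain YAML description of the environment; a trimmed, invented example of its shape (names and versions are placeholders) would be:

name: myenv
channels:
  - defaults
dependencies:
  - python=3.9
  - numpy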

To import the Conda environment into another Conda installation, run:

conda env create -f shared-conda-environment.yml 

To verify the import, you may list your Conda environments:

conda env list


Using Python's native Virtual Environment function (venv)

Another option is to use our Python environment module on the cluster along with Python's native virtual environment function (venv). The only downside to this method is that the Python version may not be quite as new as the one available via Conda.


module load python3/3.8.5


python3 -m venv /central/groups/imss_admin/jflilley/python-environments/my-test-venv


source /central/groups/imss_admin/jflilley/python-environments/my-test-venv/bin/activate


which python

(my-test-venv) [jflilley@login1 ~]$ which python

/central/groups/imss_admin/jflilley/python-environments/my-test-venv/bin/python

Now install whatever packages are needed via pip, etc.
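For example (the package names here are placeholders only):

pip install numpy scipy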

To deactivate a Python venv, run 'deactivate'. Be sure to always load the python3/3.8.5 module before working with the environments, or add 'module load python3/3.8.5' to your ~/.bashrc.