- 1 Anaconda
- 1.1 Important Concepts
- 1.2 Versions available on Ada/Terra
- 1.3 Managing Anaconda Virtual Environments
- 1.3.1 Using Anaconda
- 1.3.1.1 List Anaconda Virtual Environments
- 1.3.1.2 Virtual Environment Types
- 1.3.1.3 Create a Private Anaconda Virtual Environment
- 1.3.1.4 Access an Anaconda Virtual Environment
- 1.3.1.5 Check Packages in an Anaconda Virtual Environment
- 1.3.1.6 Install/Uninstall Packages in an Anaconda Virtual Environment
- 1.3.1.7 Clean and Remove a Virtual Environment
- 1.3.2 JupyterLab
- 1.3.3 Jupyter Notebook
- 1.4 Using Conda
Anaconda is a leading open data science platform powered by Python. It provides a collection of over 720 open source packages and also serves as a package and virtual environment manager. More details on Anaconda: https://docs.continuum.io/anaconda/. This page first introduces several important concepts and then describes the Anaconda modules on Ada and Terra.
Important Concepts
A package is a collection of programs. For example, numpy is a package and tensorflow is a package. Over 150 packages are installed automatically with Anaconda. Over 250 additional open source packages can be installed from the Anaconda repository with the 'conda install' command. Moreover, thousands of other packages are available from Anaconda Cloud.
A virtual environment is a named collection of packages. For example, a virtual environment named 'test_environment' could be a collection of python 3.5, basemap 1.0.7, and shapely 1.5.16. A user may create one virtual environment per project if each project needs a different collection of software. Virtual environments therefore avoid version conflicts between a user's projects. The 'conda' command is used to create and manage virtual environments in Anaconda. Besides 'conda', the 'pip' command can also be used to install Python packages into a virtual environment. The 'pip' command gives access to more Python packages. However, 'pip' does not resolve package dependencies well, while 'conda' does a much better job.
Versions available on Ada/Terra
The most up to date listing of available versions on the cluster you are using can be found with:
module avail Anaconda # or "ml avail Anaconda" if you get tired of typing "module"
This will show all available versions of the modules described below (plus some of the myAnaconda modules described on the Python page).
Anaconda Modules on Ada
Run the command 'module spider Anaconda' to list the Anaconda modules, which include Anaconda/2-5.0.1 and Anaconda/3-184.108.40.206. Module Anaconda/2-5.0.1 is for Python 2.7, while Anaconda/3-220.127.116.11 is for Python 3.6. Anaconda/3-18.104.22.168 is recommended for users who need Python 3, while Anaconda/2-5.0.1 is recommended for users who need Python 2.
Anaconda Modules on Terra
Run the command 'module spider Anaconda' to list the Anaconda modules, which include Anaconda/2-4.3.1 and Anaconda/3-22.214.171.124. Module Anaconda/2-4.3.1 is for Python 2.7, while Anaconda/3-126.96.36.199 is for Python 3.6. Anaconda/3-188.8.131.52 is recommended for users without legacy issues.
Managing Anaconda Virtual Environments
Information about environment management can be found on this page.
Anaconda/3-184.108.40.206 is recommended on both Terra and Ada.
List Anaconda Virtual Environments
You can list all shared virtual environments and your own private virtual environments using the command:
conda info --env
We have many shared environments for specific tasks. For example, the tensorflow-gpu and keras-gpu environments can be useful for machine learning applications.
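The listing can get long when many shared environments exist. The following is a minimal sketch, not a site-provided tool, of a helper that extracts only the environment locations from the listing. The function name `list_env_paths` is hypothetical, and it assumes the usual listing layout in which comment lines start with `#` and the environment path is the last field on each line (paths containing spaces are not handled). The sketch spells the flag `--envs`, which is its full form; the `--env` form used above is assumed to be accepted as an abbreviation on these clusters.

```shell
# Hypothetical helper: print the location of every environment in a
# `conda info --envs` listing read from standard input.
list_env_paths() {
    # Skip comment lines (starting with '#') and blank lines;
    # the environment path is the last whitespace-separated field.
    awk '!/^#/ && NF > 0 { print $NF }'
}

# Usage on the cluster, after loading an Anaconda module:
#   conda info --envs | list_env_paths
```

Reading from standard input keeps the helper usable on saved output as well as on a live `conda` call.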
Virtual Environment Types
- Shared Virtual Environments: The command 'conda create -n virtual_environment_name python=x.x' creates a virtual environment named 'virtual_environment_name' (choose your own name) in Anaconda. On our clusters, users do not have write permission to the location of the Anaconda root environment, so users cannot use this command to create a virtual environment there with the Anaconda modules. Instead, we may create a virtual environment for users. Note that all virtual environments created this way are accessible to all users.
- Note: The available shared virtual environments (currently 76) can be found from the files in /sw/local/etc/Anaconda/venvs/. For example, on Terra, using Anaconda/3-220.127.116.11, one should be able to activate the virtual environment "tensorflow-gpu-1.4.1" as needed.
- Private Virtual Environment: A user can create a private virtual environment using the command 'conda create -n virtual_environment_name package_to_install', where package_to_install is optional. Such a virtual environment is accessible only to the user who creates it. Private virtual environments are located at $SCRATCH/.conda/envs. NOTE: private virtual environments work only with Anaconda/3-4.4.0 and later versions (e.g., 18.104.22.168.1), and with Anaconda/2-5.0.1 on Ada.
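To look for a shared environment by keyword, something like the sketch below may help. It only lists entries in the directory named in the note above and filters them by name. The function name `find_shared_envs` and the `VENV_DIR` override are hypothetical, and how the file names map to environment names is an assumption.

```shell
# Directory that the note above says holds the shared environment files.
# Can be overridden by setting VENV_DIR before calling the function.
VENV_DIR="${VENV_DIR:-/sw/local/etc/Anaconda/venvs}"

# Hypothetical helper: print entries in VENV_DIR whose names contain
# the given keyword (case-insensitive).
find_shared_envs() {
    local keyword="$1"
    if [ ! -d "$VENV_DIR" ]; then
        echo "Directory not found: $VENV_DIR" >&2
        return 1
    fi
    ls -1 "$VENV_DIR" | grep -i -- "$keyword"
}

# Usage:
#   find_shared_envs tensorflow
```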
Create a Private Anaconda Virtual Environment
Change to your scratch directory and run the following commands in order to create your own virtual environment. NOTE: Do not create an environment in your home directory. You will exceed your home directory file limit.
[NetID@cluster NetID]$ cd $SCRATCH # Make scratch your current directory
[NetID@cluster NetID]$ module load Anaconda/3-22.214.171.124 # Load Anaconda module
[NetID@cluster NetID]$ conda create --name myenv # Create environment
Now the "conda info --env" command will also show your private environment.
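Because of the home directory file limit, it can be worth confirming where a new environment actually landed. The sketch below is one way to check. The function name `where_is_env` is hypothetical, and it assumes private environments live in `$SCRATCH/.conda/envs` (as stated above) or, if something went wrong, in conda's default per-user location `$HOME/.conda/envs`.

```shell
# Hypothetical helper: report whether a named environment exists under
# scratch, under home, or in neither location.
where_is_env() {
    local name="$1"
    if [ -d "${SCRATCH:-}/.conda/envs/${name}" ]; then
        echo "scratch"
    elif [ -d "${HOME:-}/.conda/envs/${name}" ]; then
        echo "home"
    else
        echo "missing"
    fi
}

# Usage:
#   where_is_env myenv
```

A result of "home" suggests the environment should be removed and recreated so that it does not count against the home directory file limit.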
Access an Anaconda Virtual Environment
To activate a virtual environment, first load an Anaconda module and then follow these steps:
[NetID@cluster NetID]$ module load Anaconda/3-126.96.36.199 # Load Anaconda module
[NetID@cluster NetID]$ source activate myenv # Activate environment
(myenv) [NetID@cluster NetID]$ python myprogram.py # Run your programs/commands (activated environment name will show on left of command line)
(myenv) [NetID@cluster NetID]$ source deactivate # Deactivate environment
[NetID@cluster NetID]$ # Command line changes to normal
Normally, you load the Anaconda module and any other modules your virtual environment needs, activate the virtual environment with 'source activate', and then run the programs or commands that use the packages in the activated environment. After the programs or commands finish, deactivate the virtual environment with 'source deactivate'. This last step is not necessary if you do not need to restore your PATH environment. The steps to access a virtual environment are summarized below.
- module load Anaconda/xxx
- module load any_other_module_needed
- source activate your_virtual_environment_name
- run your programs/commands
- source deactivate
Note: if you have a virtual environment not in the output of 'conda info --env', then you need the full path of the virtual environment in the source activate command. For example: source activate /scratch/user/uncommon/test.
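The steps summarized above can be wrapped in a small bash function, sketched here under a few assumptions: the name `run_in_env` is hypothetical, the `module` command and the `activate`/`deactivate` scripts are provided by the cluster and the loaded Anaconda module, and no additional modules are needed (add further `module load` lines if they are).

```shell
#!/bin/bash
# Hypothetical wrapper: run one command inside a conda environment.
# Arguments: <anaconda_module> <environment_name_or_full_path> <command> [args...]
run_in_env() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_in_env <module> <env> <command> [args...]" >&2
        return 2
    fi
    local anaconda_module="$1" env_name="$2" status
    shift 2

    module load "$anaconda_module" || return 1   # load the Anaconda module
    source activate "$env_name" || return 1      # activate the environment
    "$@"                                         # run the program or command
    status=$?
    set --                                       # do not pass leftover arguments to deactivate
    source deactivate                            # restore the original PATH
    return "$status"
}

# Usage (replace the module name with the version you normally load):
#   run_in_env Anaconda/xxx myenv python myprogram.py
```

The function returns the exit status of the command it ran, so it can be used in scripts that check for failures.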
Check Packages in an Anaconda Virtual Environment
To check the list of packages in an Anaconda environment, follow these steps on the command line:
[NetID@cluster NetID]$ module load Anaconda/3-188.8.131.52 # Load Anaconda module
[NetID@cluster NetID]$ source activate myenv # Activate environment
(myenv) [NetID@cluster NetID]$ conda list # Conda list command to check packages
If you run the "conda list" command without activating an environment, it will show the packages in the root environment.
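To test for a single package rather than read the whole listing, the output can be filtered. The sketch below assumes the usual `conda list` layout, with header lines starting with `#` and the package name in the first column; the function name `has_package` is hypothetical.

```shell
# Hypothetical helper: succeed if the package listing on standard input
# contains a package with exactly the given name.
has_package() {
    awk -v pkg="$1" '
        !/^#/ && $1 == pkg { found = 1 }
        END { exit !found }
    '
}

# Usage, with the environment already activated:
#   conda list | has_package numpy && echo "numpy is installed"
```

Matching on the whole first column avoids false positives from packages whose names merely start with the same text.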
Install/Uninstall Packages in an Anaconda Virtual Environment
NOTE: Users can only install/uninstall packages in their private environment. Users don't have access to install/uninstall packages in root and shared environments.
To install or uninstall packages in a private environment, you first need to activate it. For example, the next few steps show how to install and uninstall the numpy package in the "myenv" private environment.
[NetID@cluster NetID]$ module load Anaconda/3-184.108.40.206 # Load Anaconda module
[NetID@cluster NetID]$ source activate myenv # Activate environment
(myenv) [NetID@cluster NetID]$ conda install numpy # Command to install numpy package
(myenv) [NetID@cluster NetID]$ conda list # Conda list command to check packages
(myenv) [NetID@cluster NetID]$ conda uninstall numpy # Command to uninstall numpy package
If you see the following error after installing a software package in Anaconda:
This system lists a couple of UTF-8 supporting locales that you can pick from. The following suitable locales were discovered: aa_DJ.utf8, aa_ER.utf8, aa_ET.utf8, af_ZA.utf8, am_ET.utf8, an_ES.utf8, ar_AE.utf8, ar_BH.utf8, ar_DZ.utf8, ar_EG.utf8, ar_IN.utf8, ar_IQ.utf8, ar_JO.utf8, ar_KW.utf8, ar_LB.utf8, ar_LY.utf8, ar_MA.utf8, ar_OM.utf8, ar_QA.utf8, ar_SA.utf8, ar_SD.utf8, ar_SY.utf8, ar_TN.utf8, ar_YE.ut
Then copy the activate_utf.sh file to your conda environment, substituting USERNAME and ENVIRONMENTNAME with your NetID and environment name:
Ada: cp /sw/hprc/Anaconda/activate_utf.sh /scratch/user/USERNAME/.conda/envs/ENVIRONMENTNAME/etc/conda/activate.d/
Terra: cp /sw/hprc/sw/Anaconda/activate_utf.sh /scratch/user/USERNAME/.conda/envs/ENVIRONMENTNAME/etc/conda/activate.d/
Or, if that doesn't work, run the following commands after activating your environment:
export LANGUAGE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
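Conda sources the `*.sh` files found in an environment's `etc/conda/activate.d/` directory each time the environment is activated, which is why copying a script there makes the fix permanent. The sketch below writes an equivalent hook by hand. It is an assumption that the site's activate_utf.sh does something similar, since its contents are not shown here; the function name `install_utf8_hook` and the file name `utf8_locale.sh` are hypothetical.

```shell
# Hypothetical helper: write an activation hook that sets the UTF-8
# locale variables listed above.
# Argument: full path to the conda environment.
install_utf8_hook() {
    local env_prefix="$1"
    local hook_dir="$env_prefix/etc/conda/activate.d"
    # The activate.d directory may not exist yet in a new environment.
    mkdir -p "$hook_dir" || return 1
    cat > "$hook_dir/utf8_locale.sh" <<'EOF'
export LANGUAGE=en_US.UTF-8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
EOF
}

# Usage:
#   install_utf8_hook /scratch/user/USERNAME/.conda/envs/ENVIRONMENTNAME
```

Creating the directory first also covers the case where a plain `cp` into `activate.d/` fails because the directory is missing.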
Clean and Remove a Virtual Environment
Anaconda downloads package archives before the software packages are installed, and those downloads consume your disk quota. To free that space, you may run 'conda clean --all' after your software packages are installed. To completely remove your private virtual environment myenv when you no longer need it, run the command "conda remove --name myenv --all".
JupyterLab
You can create your own JupyterLab conda environment using either Anaconda or Miniconda for use on the HPRC portal, but you must use one of the Anaconda versions listed on the JupyterLab HPRC portal web page.
Make sure you have enough available file quota (about 30,000 files), since conda creates thousands of files.
An Anaconda install of JupyterLab creates about the same number of files as Miniconda3.
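To see how much of the file quota an environment uses, its files can be counted with standard tools. This is a rough sketch with a hypothetical function name `count_entries`; it counts every file, directory, and link under the given directory, including the directory itself, and it is an assumption that the quota counts entries in the same way.

```shell
# Hypothetical helper: count all entries (files, directories, links)
# under a directory, including the directory itself.
count_entries() {
    # tr strips the padding that some versions of wc add to the count.
    find "$1" | wc -l | tr -d ' '
}

# Usage:
#   count_entries "$SCRATCH/.conda/envs/jupyterlab_1.2.2"
#   count_entries "$SCRATCH/.conda"
```

Comparing the count before and after running 'conda clean --all' shows how many files the download cache was holding.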
To create an Anaconda conda environment called jupyterlab_1.2.2, do the following on the command line:
module purge
module load Anaconda/3-220.127.116.11
conda create -n jupyterlab_1.2.2
After your jupyterlab_1.2.2 environment is created, you will see output on how to activate and use your jupyterlab_1.2.2 environment:
#
# To activate this environment, use:
# > source activate jupyterlab_1.2.2
#
# To deactivate an active environment, use:
# > source deactivate
#
Then you can install jupyterlab (specifying a version if needed) and add packages to your jupyterlab_1.2.2 environment:
source activate jupyterlab_1.2.2
conda install -c conda-forge jupyterlab=1.2.2
conda install -c conda-forge package-name
To remove downloads and unused files after packages are installed, run:
conda clean --all
Installing JupyterLab v1.2.2 via Miniconda3 installs Python v3.6.7, while installing it via Anaconda installs Python 3.8.0.
Anaconda/3-18.104.22.168 and Miniconda3/4.7.10 both use Python v3.6.7 with JupyterLab v1.2.0, but JupyterLab v1.2.2 installs Python 3.8.0 in Anaconda. For now, it is best to use Anaconda for JupyterLab if you want JupyterLab v1.2.2 instead of v1.2.0.
To create a Miniconda conda environment called jupyterlab_1.2.0, do the following on the command line:
module purge
module load Miniconda3/4.7.10
conda create -p /scratch/user/your_netid/.conda/envs/jupyterlab_1.2.0 jupyterlab=1.2.0
After your jupyterlab_1.2.0 environment is created, you will see output on how to activate and use your jupyterlab_1.2.0 environment:
#
# To activate this environment, use
#
#     $ conda activate /scratch/user/your_netid/.conda/envs/jupyterlab_1.2.0
#
# To deactivate an active environment, use
#
#     $ conda deactivate
You can add packages to your Miniconda3 environment using either Anaconda/3-22.214.171.124 or Miniconda3/4.7.10, both of which use Python v3.6.7.
When activating the conda environment using the Miniconda3 module, you must specify the full path. When using the Anaconda module you only need to specify the environment name.
In this example, JupyterLab should be run using the portal JupyterLab app. You can use your Miniconda3/4.7.10 environment in the JupyterLab portal app by selecting the Anaconda/3-126.96.36.199 module in the portal app page and providing the name including full path of your Miniconda3/4.7.10 environment in the "JupyterLab Environment to be activated" box.
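Since both the Miniconda3 activation and the portal box expect the full path, a small helper can build it and avoid typos. This sketch follows the /scratch/user/<netid>/.conda/envs/<name> layout used in the commands above; the function name `env_path` is hypothetical.

```shell
# Hypothetical helper: print the full path of a private conda environment.
# Arguments: <netid> <environment_name>
env_path() {
    local netid="$1" name="$2"
    printf '/scratch/user/%s/.conda/envs/%s\n' "$netid" "$name"
}

# Usage with the Miniconda3 module loaded:
#   conda activate "$(env_path your_netid jupyterlab_1.2.0)"
# The same string can be pasted into the
# "JupyterLab Environment to be activated" box on the portal.
```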
Jupyter Notebook
You can create your own Jupyter Notebook environment using Python or Anaconda for use on the HPRC Portal, but you must use one of the module versions listed on the Jupyter Notebook HPRC portal web page.
Make sure you have enough available file quota (about 10,000 files), since conda and pip create thousands of files.
A Python module can be used to create a virtual environment to be used in the portal Jupyter Notebook app when all you need is Python packages.
You can use a default Python virtual environment in the Jupyter Notebook portal app by leaving the "Optional Environment to be activated" field blank.
To create a Python virtual environment called my_notebook-python-3.6.6-foss-2018b (you can name it whatever you like), do the following on the command line. You can save your virtual environments in any $SCRATCH directory you want. In this example, a directory called /scratch/user/mynetid/pip_envs is used, but you can use another name instead of pip_envs.
mkdir -p /scratch/user/mynetid/pip_envs
A good practice is to name your environment so that you can identify which Python version is in your virtualenv so that you know which module to load.
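One way to apply this naming practice consistently is to derive the environment name from the module name. The sketch below is only a convention, not a site requirement, and the function name `venv_name` is hypothetical: it lowercases the module name and replaces the slash with a hyphen.

```shell
# Hypothetical helper: build an environment name that records the
# Python module it was created with.
# Arguments: <prefix> <module_name>
venv_name() {
    local prefix="$1" module_name="$2" suffix
    # Lowercase the module name and turn "Python/3.6.6" into "python-3.6.6".
    suffix=$(printf '%s' "$module_name" | tr '[:upper:]' '[:lower:]' | tr '/' '-')
    printf '%s-%s\n' "$prefix" "$suffix"
}

# Usage:
#   venv_name my_notebook Python/3.6.6-foss-2018b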
The next three lines will create your virtual environment.
module purge
module load Python/3.6.6-foss-2018b
virtualenv /scratch/user/mynetid/pip_envs/my_notebook-python-3.6.6-foss-2018b
Then you can activate the virtual environment by using the full path to the activate command inside your virtual environment and install Python packages.
source /scratch/user/mynetid/pip_envs/my_notebook-python-3.6.6-foss-2018b/bin/activate
pip install notebook
pip install python_package_name
You can use your Python/3.6.6-foss-2018b environment in the Jupyter Notebook portal app by selecting the Python/3.6.6-foss-2018b module in the portal app page and providing the name including full path to the activate command for your Python/3.6.6-foss-2018b environment in the "Optional Conda Environment to be activated" box. The activate command is found inside the bin directory of your virtual env. An example of what to put in the "Optional Conda Environment to be activated" box is the full path used in the source command above.
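Before pasting a path into the portal form, it may help to confirm that the activate script really exists at that location. A minimal sketch, with a hypothetical function name `activate_path`:

```shell
# Hypothetical helper: print the full path to the activate script of a
# Python virtual environment, or fail if it does not exist.
# Argument: full path to the virtual environment directory.
activate_path() {
    local venv_dir="$1"
    local candidate="$venv_dir/bin/activate"
    if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        echo "No activate script found at $candidate" >&2
        return 1
    fi
}

# Usage:
#   activate_path /scratch/user/mynetid/pip_envs/my_notebook-python-3.6.6-foss-2018b
```

The printed path is the value to enter in the environment box on the portal app page.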
Anaconda differs from Python's virtualenv in that you can install other types of software, such as R and R packages, in your environment. To create an Anaconda conda environment called my_notebook (you can name it whatever you like), do the following on the command line:
module purge
module load Anaconda/3-188.8.131.52
conda create -n my_notebook
After your my_notebook environment is created, you will see output on how to activate and use your my_notebook environment:
#
# To activate this environment, use:
# > source activate my_notebook
#
# To deactivate an active environment, use:
# > source deactivate
#
Then install notebook, after which you can add optional packages to your my_notebook environment:
source activate my_notebook
conda install -c conda-forge notebook
conda install -c conda-forge package-name
Using Conda
In addition to creating a virtual environment as discussed above, the 'conda' command can list, clone, remove, and share virtual environments. More details can be found at https://conda.io/docs/using/envs.html. The conda cheat sheet may also be helpful: https://conda.io/docs/_downloads/conda-cheatsheet.pdf
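As an illustration of sharing, an environment's package list can be exported to a file that another user can rebuild from. The sketch below wraps `conda env export` in a hypothetical function named `export_env`; the output file name is up to you, and an Anaconda module is assumed to be loaded already.

```shell
# Hypothetical helper: save the package list of a named environment
# to a YAML file.
# Arguments: <environment_name> <output_file>
export_env() {
    local name="$1" outfile="$2"
    conda env export --name "$name" > "$outfile"
}

# Usage:
#   export_env myenv "$SCRATCH/myenv_environment.yml"
#
# Related commands:
#   conda create --name myenv_copy --clone myenv              # clone an environment
#   conda env create --file "$SCRATCH/myenv_environment.yml"  # rebuild from the file
```

The exported file records package versions, so it also serves as a record of what a project was run with.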