SW:Python vs Anaconda
This table can help you decide when to use a Python module and when to use an Anaconda module for installing Python packages.
Feature | Python | Anaconda |
---|---|---|
Example module | module load Python/3.6.6-intel-2018b | module load Anaconda/3-5.0.0.1 |
When to use | When only Python packages are required | When C, C++ or R modules are required to install a software package with an extensive dependency list (example: qiime2). A conda environment can also hold specific versions of programming languages such as Python, R, Ruby, Lua, Scala, Java, JavaScript, C/C++, FORTRAN, Julia and more |
Python version | only the same version as the module loaded | can install any version of Python 3 within Anaconda |
Env location | virtual environment can be saved in any directory. It's up to the user to remember where environments are saved | Manages environments in a centralized location: $SCRATCH/.conda/envs |
Env activation | Must provide the full or relative path when activating. Command line example: 'source /scratch/user/netid/my_envs/env_name/bin/activate'. Terra Jupyter Notebook portal app example: '/scratch/user/netid/my_envs/env_name/bin/activate' | Only need to provide the environment name when activating. Command line example: 'source activate env_name'. Jupyter Notebook portal app example: 'env_name' |
Available packages | PyPI | Anaconda Cloud (includes bioconda) and PyPI |
Package install command | pip | conda (for Anaconda packages); pip (for PyPI packages) |
Installation type | wheel or source | precompiled binaries (using 'conda install pkg_name'); wheel or source (using 'pip install --user pkg_name') |
Software speed | Some software packages, such as the non-GPU (CPU) build of TensorFlow, are much faster as correctly configured modules than as Anaconda binaries, because the modules are compiled from source and can take advantage of CPU features. For GPU versions of TensorFlow, however, the performance of modules and Anaconda environments is relatively similar. | Precompiled binaries may be slower for some software packages that run on CPU |
Dependency checks | yes but not completely (see link below) | yes |
File usage | Each virtual environment downloads its own packages | Multiple conda environments share a common directory for downloaded packages, so a package previously installed in one conda environment does not have to be downloaded again for a new conda environment (unless you have run 'conda clean -t') |
Remove install cache | 'pip cache purge' (for pip >= 20.1b1) | 'conda clean -t' to remove downloaded tar packages from the shared pkgs directory |
Delete virtual environment | rm -rf env_name_directory | conda env remove --name env_name |
Possible issues | Not all dependencies are resolved globally when installing multiple packages (see link below) | Installing package dependencies from multiple channels (default vs conda-forge) may cause conflicts |
Note: You must activate the Python virtualenv or Anaconda environment before installing packages with 'pip install --user' or 'conda install'.
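
The difference between the "Env location" and "Env activation" rows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`venv_activate_script`, `conda_env_dir`, `activation_command`) are hypothetical and not part of pip, conda, or the cluster software, the paths are the placeholder paths from the table, and the fallback value used for `$SCRATCH` is an assumption.

```python
import os
from pathlib import Path


def venv_activate_script(env_dir: str) -> Path:
    """Return the activate script of a virtualenv stored at env_dir (POSIX layout)."""
    return Path(env_dir) / "bin" / "activate"


def conda_env_dir(env_name: str, scratch: str) -> Path:
    """Return the directory of a named conda environment under $SCRATCH/.conda/envs."""
    return Path(scratch) / ".conda" / "envs" / env_name


def activation_command(kind: str, env: str) -> str:
    """Build the command-line activation string for either kind of environment."""
    if kind == "virtualenv":
        # A virtualenv can live anywhere, so it is activated by the path
        # to its activate script.
        return f"source {venv_activate_script(env)}"
    if kind == "conda":
        # Conda environments live in one central location, so the name suffices.
        return f"source activate {env}"
    raise ValueError(f"unknown environment kind: {kind!r}")


if __name__ == "__main__":
    # Placeholder paths taken from the table; replace netid and env_name.
    print(activation_command("virtualenv", "/scratch/user/netid/my_envs/env_name"))
    print(activation_command("conda", "env_name"))
    # Assumption: fall back to a placeholder path when $SCRATCH is not set.
    scratch = os.environ.get("SCRATCH", "/scratch/user/netid")
    print(conda_env_dir("env_name", scratch))
```

The sketch only builds strings and paths; it does not create or activate anything. In practice, the user has to keep track of the virtualenv directory, while a conda environment can be found again from its name.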