SW:R-CNN

From TAMU HPRC
Revision as of 19:01, 13 August 2020 by J-perdue (talk | contribs) (Install pytorch/caffe2)

"Region Based Convolutional Neural Networks (R-CNN) are a family of machine learning models for computer vision and specifically object detection." -- Wikipedia

The Wikipedia page quoted above lists a number of packages available for using R-CNNs. For now, this page will concentrate on Detectron2.

Note that these instructions build from source in a Python virtual environment so that the result is optimized for the current CPU/machine. We explicitly do NOT use Anaconda (Python for newbies), whose precompiled binaries target CPU architectures from over a decade ago and are poorly suited to high-performance computing (HPC) in the 2020s.

Detectron

We'll start with single-node (no MPI) Detectron, the predecessor to Detectron2, since we've successfully built it on ada (a terra test is still to come). The instructions for Detectron2 (not yet written or tested) will be added below later.

These steps come from Detectron's INSTALL.md and from the Caffe2 instructions for building from source.

Installing Detectron in a Python virtual environment on HPRC clusters

foss/2018b

Download sources

Download, via Git, the needed sources. Note: we used the system git here (no module), but if you have problems you may try loading a Git module.

mkdir -p $SCRATCH/tmp  # -p: no error if the directory already exists
cd $SCRATCH/tmp
git clone https://github.com/facebookresearch/Detectron.git
git clone https://github.com/cocodataset/cocoapi.git
git clone https://github.com/pytorch/pytorch.git # for caffe2
cd pytorch
git submodule update --init --recursive

Create virtual environment

For a clean (re)start, remove any previous virtual environment directory.

rm -rf $SCRATCH/Detectron-foss-2018b # remove previous attempt, if there was one.
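
If the clean-up step is scripted, a small guard can make the rm -rf less risky. The following is only a sketch: remove_old_venv is a hypothetical helper name, not something provided by HPRC, and it assumes that $SCRATCH is exported by the login environment as in the commands on this page.

```shell
# Hypothetical helper (not part of any HPRC tooling): remove a previous
# virtual environment directory under $SCRATCH, refusing to run when
# SCRATCH or the directory name is empty.
remove_old_venv() {
    if [ -z "${SCRATCH:-}" ] || [ -z "${1:-}" ]; then
        echo "SCRATCH or directory name is empty; refusing to remove anything" >&2
        return 1
    fi
    rm -rf "${SCRATCH}/$1"
}

# Example call (commented out so that sourcing this snippet removes nothing):
# remove_old_venv Detectron-foss-2018b
```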

Create and activate a Python VE to install into.

ml purge  # module purge
ml Python/3.6.6-foss-2018b  # module load Python/3.6.6-foss-2018b
python -m venv $SCRATCH/Detectron-foss-2018b
source $SCRATCH/Detectron-foss-2018b/bin/activate
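
Before installing anything with pip, it may be worth confirming that the virtual environment really is active, so packages do not end up in the module's own Python. A minimal sketch follows. venv_is_active is a hypothetical helper name, and it relies on the VIRTUAL_ENV variable that the standard venv activate script sets.

```shell
# Hypothetical check: succeed only if a virtual environment is active and
# the first "python" found on PATH lives inside that environment.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    case "$(command -v python)" in
        "$VIRTUAL_ENV"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
# venv_is_active && echo "venv OK" || echo "venv NOT active"
```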

Update pip/setuptools.

pip install --upgrade pip setuptools

Install pytorch/caffe2

Load a newer CMake module (the system cmake is too old), which is needed for the build below.

ml CMake/3.12.1-GCCcore-7.3.0

Install needed module(s).

pip install pyaml

Load cuDNN/CUDA for GPU support (WARNING: adding these currently causes problems on terra).

ml cuDNN/7.6.5.32-CUDA-9.2.148.1  # See WARNING

On ada, edit line 185 of $SCRATCH/tmp/pytorch/torch/csrc/DataLoader.cpp and change:

throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" PRId64, key);

to

throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" "ld", key);

to avoid an error with PRId64 on RHEL6.
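
The same edit can be applied non-interactively. The sketch below matches on the message text rather than on line 185, since the line number may differ between pytorch checkouts. patch_prid64 is a hypothetical helper name, and the sed expression assumes the line looks exactly like the one quoted above.

```shell
# Hypothetical helper: on lines containing the _BaseDataLoaderIter message,
# replace the PRId64 macro with the literal "ld". A copy of the original
# file is kept next to it with a .bak suffix.
patch_prid64() {
    sed -i.bak '/_BaseDataLoaderIter with id/s/PRId64/"ld"/' "$1"
}

# Example call (path taken from the instructions on this page):
# patch_prid64 "$SCRATCH/tmp/pytorch/torch/csrc/DataLoader.cpp"
```

Comparing the file with its .bak copy afterwards (for example with diff) shows whether exactly one line changed.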

Install pytorch/caffe2.

cd $SCRATCH/tmp/pytorch
rm -rf build  # remove previous attempts
python setup.py install

Test.

cd  # don't run in pytorch directory (fails)

# To check if Caffe2 build was successful
python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

# To check if Caffe2 GPU build was successful
# This must print a number > 0 in order to use Detectron
python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
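
For use in a build script, the GPU check can be turned into a pass/fail gate. This is a sketch only: require_caffe2_gpu is a hypothetical helper name, and it wraps the same python one-liner shown above.

```shell
# Hypothetical gate: fail unless Caffe2 imports and reports at least one
# CUDA device. Uses the last line of output as the device count.
require_caffe2_gpu() {
    n=$(python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())' 2>/dev/null | tail -n 1)
    case "$n" in
        ''|*[!0-9]*)
            echo "Caffe2 import failed or printed no device count" >&2
            return 1 ;;
        0)
            echo "Caffe2 reports 0 CUDA devices" >&2
            return 1 ;;
        *)
            echo "Caffe2 sees $n CUDA device(s)" ;;
    esac
}

# Example use (run outside the pytorch source directory):
# require_caffe2_gpu || exit 1
```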

Install COCO API

Install the COCO API.

cd $SCRATCH/tmp/cocoapi/PythonAPI
make install

Build Detectron

Install the modules needed by Detectron (and pytorch).

# install non-binary numpy first so the binary version isn't brought in by opencv-python
pip install --no-binary :all: numpy
# we install the binary version of opencv-python (listed in requirements.txt) first since building from source seems to have problems.
pip install --only-binary :all: opencv-python

# install scipy and kiwisolver from wheels to avoid LAPACK problems
pip install scipy kiwisolver

# install other needed modules from source
pip install -r $SCRATCH/tmp/Detectron/requirements.txt
cd $SCRATCH/tmp/Detectron
make install

Test Detectron.

python $SCRATCH/tmp/Detectron/detectron/tests/test_spatial_narrow_as_op.py

This will fail if pytorch/caffe2 was built without cuDNN/CUDA.

Detectron2

"Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark." --Detectron2 site

It also includes support for Fast R-CNN, Faster R-CNN and other R-CNNs.

See the Detectron2 site for usage and training. For now, this page will only cover installation.

Installing Detectron2 in a Python virtual environment on HPRC clusters

foss/2019b

This is a basic/starter build. Note that this build does not include a CUDA-enabled OpenMPI, so it is limited to the GPUs on a single node.

Modules used include:

CMake/3.15.3-GCCcore-8.3.0
Python/3.7.4-GCCcore-8.3.0
cuDNN/7.0.5-CUDA-9.0.176
(optional?) Graphviz/2.42.2-foss-2019b

Start with a clean module environment and install directory.

ml purge
rm -rf $SCRATCH/Detectron2-foss-2019b

Create and activate a Python VE to install into.

ml Python/3.7.4-GCCcore-8.3.0
python -m venv $SCRATCH/Detectron2-foss-2019b
source $SCRATCH/Detectron2-foss-2019b/bin/activate

fosscuda/2018b

This build includes a CUDA-enabled OpenMPI for using multiple GPU nodes to speed up processing.

CMake/3.12.1-GCCcore-7.3.0
Python/3.6.6-fosscuda-2018b