
Difference between revisions of "SW:R-CNN"

From TAMU HPRC
Latest revision as of 12:13, 14 August 2020

"Region Based Convolutional Neural Networks (R-CNN) are a family of machine learning models for computer vision and specifically object detection." -- Wikipedia

The Wikipedia page quoted above mentions a number of packages available for using R-CNNs. For now, this page will concentrate on Detectron2.

Note that these instructions are for building from source using a Python virtual environment so we can get optimizations for the current CPU/machine. We explicitly do NOT use Anaconda (Python for newbies), which uses precompiled binaries for CPU architectures from over a decade ago that are poorly suited for high-performance computing (HPC) in the 2020s.

Detectron

We'll start with single-node (no MPI) Detectron, the predecessor to Detectron2, since we've successfully built it on ada (terra test to come). The instructions for Detectron2 (not yet written/tested) will be added below later.

These steps come from Detectron's INSTALL.md and from the Caffe2 instructions for building from source.

Installing Detectron in a Python virtual environment on HPRC clusters

foss/2018b

Download sources

Download, via Git, the needed sources. Note: we used the system git here (no module), but if you have problems you may try loading a Git module.

mkdir -p $SCRATCH/tmp  # -p: no error if the directory already exists
cd $SCRATCH/tmp
git clone https://github.com/facebookresearch/Detectron.git
git clone https://github.com/cocodataset/cocoapi.git
git clone https://github.com/pytorch/pytorch.git # for caffe2
cd pytorch
git submodule update --init --recursive
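
Before moving on, it can help to confirm that all three clones are actually in place. The following is a minimal sketch and not part of the original instructions: the function name check_sources is hypothetical, and it only checks that each clone directory contains a .git entry.

```shell
# Hypothetical helper: verify that each expected clone exists under a base directory.
# Returns 0 if Detectron, cocoapi and pytorch are all present, 1 otherwise.
check_sources() {
    base="$1"
    missing=0
    for repo in Detectron cocoapi pytorch; do
        if [ ! -e "$base/$repo/.git" ]; then
            echo "missing clone: $base/$repo" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage:
#   check_sources "$SCRATCH/tmp" && echo "all sources present"
```

To inspect the submodules as well, git -C $SCRATCH/tmp/pytorch submodule status --recursive prints a leading "-" for every submodule that has not been initialized.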

Create virtual environment

For a clean (re)start, remove previous virtual environment directory.

rm -rf $SCRATCH/Detectron-foss-2018b # remove previous attempt, if there was one.

Create and activate a Python VE to install into.

ml purge  # module purge
ml Python/3.6.6-foss-2018b  # module load Python/3.6.6-foss-2018b
python -m venv $SCRATCH/Detectron-foss-2018b
source $SCRATCH/Detectron-foss-2018b/bin/activate
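
Every later pip and python command assumes the VE is active, so a quick guard can save a misdirected install. This is a minimal sketch: the function name venv_is_active is hypothetical, and it relies on the activate script exporting VIRTUAL_ENV as the path the VE was created with (it may differ if that path is reached through a symlink).

```shell
# Hypothetical helper: succeed only if the named virtual environment is active.
# The activate script sets VIRTUAL_ENV to the environment's directory.
venv_is_active() {
    expected="$1"
    [ -n "${VIRTUAL_ENV:-}" ] && [ "$VIRTUAL_ENV" = "$expected" ]
}

# Usage:
#   venv_is_active "$SCRATCH/Detectron-foss-2018b" || echo "activate the VE first" >&2
```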

Update pip/setuptools.

pip install --upgrade pip setuptools

Install pytorch/caffe2

Load a newer CMake module (the system cmake is too old), which is needed for the steps below.

ml CMake/3.12.1-GCCcore-7.3.0

Install needed module(s).

pip install pyaml

Load cuDNN/CUDA for GPU support (WARNING: loading these currently causes problems on terra).

ml cuDNN/7.6.5.32-CUDA-9.2.148.1  # See WARNING

On ada, edit line 185 of $SCRATCH/tmp/pytorch/torch/csrc/DataLoader.cpp and change:

throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" PRId64, key);

to

throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" "ld", key);

to avoid an error with PRId64 on RHEL6.
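
The same edit can be scripted, which is handy when the build is repeated. The sketch below is an assumption-laden alternative to editing by hand: the function name patch_prid64 is hypothetical, and it matches on the error message text rather than on line 185, since the line number may differ between checkouts.

```shell
# Hypothetical helper: replace the PRId64 macro with "ld", but only on the line
# that contains the _BaseDataLoaderIter error message.
patch_prid64() {
    file="$1"
    # -i.bak edits the file in place and keeps the original as FILE.bak
    sed -i.bak '/_BaseDataLoaderIter with id/ s/PRId64/"ld"/' "$file"
}

# Usage:
#   patch_prid64 "$SCRATCH/tmp/pytorch/torch/csrc/DataLoader.cpp"
```

Other uses of PRId64 in the file are left untouched, and the .bak copy makes it easy to revert.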

Install pytorch/caffe2.

cd $SCRATCH/tmp/pytorch
rm -rf build  # remove previous attempts
python setup.py install

Go have lunch or something. This will take a while (just under 2 hours on terra).

Test.

cd  # don't run in pytorch directory (fails)

# To check if Caffe2 build was successful
python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

# To check if Caffe2 GPU build was successful
# This must print a number > 0 in order to use Detectron
python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
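
In a batch job it is more convenient to get a pass/fail status than to read the printed number. The following is a minimal sketch of that idea: the function name gpu_count_ok is hypothetical, and it treats empty or non-numeric output (for example, when the import fails) as a failure.

```shell
# Hypothetical helper: succeed only if the argument is an integer greater than 0.
gpu_count_ok() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric output counts as failure
    esac
    [ "$1" -gt 0 ]
}

# Usage (run from outside the pytorch directory, with the VE active):
#   count=$(python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())' 2>/dev/null)
#   gpu_count_ok "$count" && echo "GPU build OK" || echo "no GPUs visible to Caffe2"
```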

Install COCO API

Install the COCO API.

Install needed module(s).

pip install numpy

Build/install.

cd $SCRATCH/tmp/cocoapi/PythonAPI
make install

Build Detectron

Install the needed module(s).

On ada, install opencv-python.

# on ada, we install the binary version of opencv-python (listed in requirements.txt) first since building from source seems to have problems.
pip install --only-binary :all: opencv-python

Install other needed modules.

pip install -r $SCRATCH/tmp/Detectron/requirements.txt

Build.

cd $SCRATCH/tmp/Detectron
make install

Test Detectron.

python $SCRATCH/tmp/Detectron/detectron/tests/test_spatial_narrow_as_op.py

This will fail if pytorch/caffe2 was built without cuDNN/CUDA.

Detectron2

"Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark." --Detectron2 site

It also includes support for Fast R-CNN, Faster R-CNN and other R-CNNs.

See the Detectron2 site for using and training. For now, this page will only cover installation.

Installing Detectron2 in a Python virtual environment on HPRC clusters

foss/2019b

This is a basic/starter build. Note that this build does not include a CUDA-enabled OpenMPI, so it is limited to the GPUs on a single node.

Modules used include:

CMake/3.15.3-GCCcore-8.3.0
Python/3.7.4-GCCcore-8.3.0
cuDNN/7.0.5-CUDA-9.0.176
(optional?) Graphviz/2.42.2-foss-2019b

Start with a clean module environment and install directory.

ml purge
rm -rf $SCRATCH/Detectron2-foss-2019b

Create and activate a Python VE to install into.

ml Python/3.7.4-GCCcore-8.3.0
python -m venv $SCRATCH/Detectron2-foss-2019b

fosscuda/2018b

This build includes a CUDA-enabled OpenMPI for using multiple GPU nodes to speed up processing.

CMake/3.12.1-GCCcore-7.3.0
Python/3.6.6-fosscuda-2018b