Difference between revisions of "SW:R-CNN"
Latest revision as of 12:13, 14 August 2020
The page above mentions a number of packages available for using R-CNNs. For now, this page will concentrate on Detectron2.
Note that these instructions build from source in a Python virtual environment so that we get optimizations for the current CPU/machine. We explicitly do NOT use Anaconda (Python for newbies), which ships precompiled binaries targeting CPU architectures from over a decade ago; these are poorly suited to high-performance computing (HPC) in the 2020s.
Detectron
We'll start with single-node (no MPI) Detectron, the predecessor to Detectron2, since we've successfully built it on ada (terra test to come). The instructions for Detectron2 (not yet written/tested) will be added below later.
These steps come from Detectron's INSTALL.md and from the Caffe2 instructions for building from source.
Installing Detectron in a Python virtual environment on HPRC clusters
foss/2018b
Download sources
Download, via Git, the needed sources. Note: we used the system git here (no module), but if you have problems you may try loading a Git module.
mkdir -p $SCRATCH/tmp
cd $SCRATCH/tmp
git clone https://github.com/facebookresearch/Detectron.git
git clone https://github.com/cocodataset/cocoapi.git
git clone https://github.com/pytorch/pytorch.git # for caffe2
cd pytorch
git submodule update --init --recursive
Create virtual environment
For a clean (re)start, remove the previous virtual environment directory.
rm -rf $SCRATCH/Detectron-foss-2018b # remove previous attempt, if there was one.
Create and activate a Python VE to install into.
ml purge # module purge
ml Python/3.6.6-foss-2018b # module load Python/3.6.6-foss-2018b
python -m venv $SCRATCH/Detectron-foss-2018b
source $SCRATCH/Detectron-foss-2018b/bin/activate
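The later `pip` and `python setup.py` steps install into whichever Python comes first on the PATH, so it can help to confirm that the environment really is active before continuing. A minimal sketch, assuming the standard `venv` activate script (which exports `VIRTUAL_ENV`); the function name `venv_is_active` is hypothetical and not part of the original instructions:

```shell
# Hypothetical helper (not from the original instructions).
# Succeeds only if a virtual environment is active and its path
# matches the expected directory given as $1.
venv_is_active() {
    [ -n "${VIRTUAL_ENV:-}" ] && [ "$VIRTUAL_ENV" = "$1" ]
}

# Assumes $SCRATCH is set, as it is on the HPRC clusters.
if venv_is_active "${SCRATCH:-}/Detectron-foss-2018b"; then
    echo "virtual environment active: $VIRTUAL_ENV"
else
    echo "virtual environment NOT active; run the activate step again"
fi
```

The comparison is a plain string match, so it may report a false negative if `$SCRATCH` is reached through a symlink or the path was given with a trailing slash.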
Update pip/setuptools.
pip install --upgrade pip setuptools
Install pytorch/caffe2
Load a newer CMake module (the system cmake is too old), which is needed for the build below.
ml CMake/3.12.1-GCCcore-7.3.0
Install needed module(s).
pip install pyaml
Load cuDNN/CUDA for GPU support (WARNING: loading these currently causes problems on terra).
ml cuDNN/7.6.5.32-CUDA-9.2.148.1 # See WARNING
On ada, edit line 185 of $SCRATCH/tmp/pytorch/torch/csrc/DataLoader.cpp and change:
throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" PRId64, key);
to
throw ValueError("Cannot find worker information for _BaseDataLoaderIter with id %" "ld", key);
to avoid an error with PRId64 on RHEL6.
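The manual edit can also be scripted. A minimal sketch, assuming the source line still contains exactly the text shown above; `patch_prid64` is a hypothetical helper name, and `sed -i.bak` keeps a backup copy of the file with a `.bak` suffix:

```shell
# Hypothetical helper (not from the original instructions).
# Replaces the PRId64 format macro with "ld" on the matching line of
# the file given as $1, leaving a backup in "$1.bak".
patch_prid64() {
    sed -i.bak 's/with id %" PRId64, key/with id %" "ld", key/' "$1"
}

# Assumes the pytorch sources were cloned into $SCRATCH/tmp.
src="${SCRATCH:-}/tmp/pytorch/torch/csrc/DataLoader.cpp"
if [ -f "$src" ]; then
    patch_prid64 "$src"
    # Show the patched line for review.
    grep -n 'with id %' "$src" || echo "pattern not found in $src"
fi
```

Matching on the text rather than on line 185 means the substitution is simply skipped if a newer checkout has changed that line.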
Install pytorch/caffe2.
cd $SCRATCH/tmp/pytorch
rm -rf build # remove previous attempts
python setup.py install
Go have lunch or something. This will take a while (just under 2 hours on terra).
Test.
cd # don't run in pytorch directory (fails)

# To check if Caffe2 build was successful
python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

# To check if Caffe2 GPU build was successful
# This must print a number > 0 in order to use Detectron
python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())'
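When the build runs inside a batch job, the GPU check is easier to act on as a pass/fail test than as a printed number. A minimal sketch under that assumption; `is_positive_int` is a hypothetical helper and not part of Caffe2 or Detectron:

```shell
# Hypothetical helper (not from the original instructions).
# Succeeds only if $1 is a decimal integer greater than zero.
is_positive_int() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -gt 0 ]
}

# Take the last line of output in case the import prints warnings first.
count=$(python -c 'from caffe2.python import workspace; print(workspace.NumCudaDevices())' 2>/dev/null | tail -n 1)

if is_positive_int "$count"; then
    echo "Caffe2 sees $count CUDA device(s)"
else
    echo "No CUDA devices reported; Detectron will not be usable"
fi
```

An empty result (for example, when the import itself fails) is treated the same as zero devices.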
Install COCO API
Install the COCO API.
Install needed module(s).
pip install numpy
Build/install.
cd $SCRATCH/tmp/cocoapi/PythonAPI
make install
Build Detectron
Install the needed module(s).
On ada, install opencv-python.
# on ada, we install the binary version of opencv-python (listed in requirements.txt) first since building from source seems to have problems.
pip install --only-binary :all: opencv-python
Install other needed modules.
pip install -r $SCRATCH/tmp/Detectron/requirements.txt
Build.
cd $SCRATCH/tmp/Detectron
make install
Test Detectron.
python $SCRATCH/tmp/Detectron/detectron/tests/test_spatial_narrow_as_op.py
This will fail if pytorch/caffe2 was built without cuDNN/CUDA.
Detectron2
"Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up rewrite of the previous version, Detectron, and it originates from maskrcnn-benchmark." --Detectron2 site
It also includes support for Fast R-CNN, Faster R-CNN and other R-CNNs.
See the Detectron2 site for using and training. For now, this page will only cover installation.
Installing Detectron2 in a Python virtual environment on HPRC clusters
foss/2019b
This is a basic/starter build. Note that this build does not include a CUDA-enabled OpenMPI, so it is limited to the GPUs on a single node.
Modules used include:
CMake/3.15.3-GCCcore-8.3.0 Python/3.7.4-GCCcore-8.3.0 cuDNN/7.0.5-CUDA-9.0.176 (optional?) Graphviz/2.42.2-foss-2019b
Start with a clean module environment and install directory.
ml purge
rm -rf $SCRATCH/Detectron2-foss-2019b
Create and activate a Python VE to install into.
ml Python/3.7.4-GCCcore-8.3.0
python -m venv $SCRATCH/Detectron2-foss-2019b
source $SCRATCH/Detectron2-foss-2019b/bin/activate
fosscuda/2018b
This build includes a CUDA-enabled OpenMPI for using multiple GPU nodes to speed up processing.
CMake/3.12.1-GCCcore-7.3.0
Python/3.6.6-fosscuda-2018b