Caffe

Note

Caffe was superseded by Caffe2, which was itself absorbed into PyTorch in 2020. The Caffe2 web site presently consists of the following statement: "This website is being deprecated - Caffe2 is now a part of PyTorch. While the APIs will continue to work, we encourage you to use the PyTorch APIs. Read more or visit https://pytorch.org."

PyTorch can export a model defined in PyTorch to Caffe2 format for users who are constrained to the latter, as described in this tutorial:

https://pytorch.org/tutorials/advanced/super_resolution_with_caffe2.html

TAMU HPRC fully supports recent versions of PyTorch, but has no plans to support any of the obsolete Caffe or Caffe2 versions.

Description

Caffe is a deep learning framework made with expression, speed, and modularity in mind.

Homepage: http://caffe.berkeleyvision.org

Access

Caffe is open to all HPRC users.

Caffe Module

TAMU HPRC provides Caffe through the module system. A limited number of Caffe builds are available on Grace.

You can learn more about the module system on our Modules page.

You can explore the available Caffe modules using the following:

    [NetID@grace ~]$ module spider Caffe

Each Caffe module includes instructions on how to load it. For example, Caffe with CUDA depends on GCC, CUDA, and OpenMPI. Load all of them at once like this:

    [NetID@grace ~]$ module load GCC/8.2.0-2.31.1 CUDA/10.1.105 OpenMPI/3.1.3 Caffe/1.0-CUDA-10.1.105-Python-3.7.2
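
After loading the modules, it can be worth confirming that the Python interpreter in your environment actually sees the Caffe bindings before you submit a job. The following is a minimal sketch that uses only the Python standard library; the helper name `caffe_available` is hypothetical and is not part of Caffe.

```python
import importlib.util


def caffe_available():
    """Return True if a module named 'caffe' can be found on the import path."""
    # find_spec only locates the module; it does not import it,
    # so this check is cheap and does not initialize Caffe.
    return importlib.util.find_spec("caffe") is not None


if caffe_available():
    print("Caffe Python bindings found.")
else:
    print("Caffe not found; check that the Caffe module is loaded.")
```

If the second message appears, re-run the `module load` command above in the same shell and try again.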

Anaconda and Caffe

You can install your own version of Caffe through Anaconda.

Example Caffe Script

As with most work on the cluster, Caffe should be run by submitting a job file. Caffe scripts are written in Python, so they should not be written directly inside a job file or entered in the shell line by line. Instead, create a separate file for the Python/Caffe script and have the job file execute it.

Caffe was developed to represent deep networks in a modular way: the layers of a network are defined in a plain-text .prototxt file that is kept separate from the script that uses them. Before the script can be used, the layer file must be created (in the text editor of your choice). More about the anatomy of a Caffe model can be found here.

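Note: The script loads conv.prototxt by a relative path, so the layer file(s) and the script MUST be in the same directory, and the script must be run from that directory.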

The following example was designed with Caffe 1.0. It is recommended that you test your script with the same version.

Creating the layer file, conv.prototxt:

name: "convolution"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 100
input_dim: 100
layer {
  name: "conv"
  type: "Convolution"
  bottom: "data"
  top: "conv"
  convolution_param {
    num_output: 3
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
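
As a sanity check on the layer file above, the shapes it implies can be worked out by hand. The sketch below uses the standard convolution size formula, assuming no padding (the prototxt sets none); the function name `conv_output_size` is illustrative and not part of Caffe.

```python
def conv_output_size(input_size, kernel_size, stride=1, pad=0):
    """Output length of a convolution along one spatial dimension."""
    return (input_size + 2 * pad - kernel_size) // stride + 1


# Values copied from conv.prototxt.
batch, channels, height, width = 1, 1, 100, 100
num_output, kernel_size, stride = 3, 5, 1

out_h = conv_output_size(height, kernel_size, stride)
out_w = conv_output_size(width, kernel_size, stride)

# Output blob: one image, one channel per filter, reduced spatial size.
output_shape = (batch, num_output, out_h, out_w)
# Weight blob: one 5x5 kernel per (filter, input channel) pair.
weight_shape = (num_output, channels, kernel_size, kernel_size)

print(output_shape)  # (1, 3, 96, 96)
print(weight_shape)  # (3, 1, 5, 5)
```

A 100x100 single-channel input therefore yields three 96x96 feature maps, since a 5x5 kernel with stride 1 and no padding trims four pixels from each dimension.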

Creating the script file:

Load Caffe:

import caffe

To make the script GPU exclusive:

caffe.set_device(0)
caffe.set_mode_gpu()

To make the script CPU exclusive:

caffe.set_mode_cpu()

To load the net layer defined in conv.prototxt:

net = caffe.Net('conv.prototxt', caffe.TEST)
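
Putting the steps above together, a complete script might look like the following sketch. The function name `run_conv` and the random test input are assumptions made for illustration; `net.blobs` and `net.forward()` belong to the pycaffe interface. The imports are placed inside the function so that a missing Caffe installation produces a readable message rather than a traceback.

```python
def run_conv(prototxt_path="conv.prototxt", use_gpu=False):
    """Load the net, push one random image through it, return the output shape.

    Returns None if Caffe or NumPy cannot be imported.
    """
    try:
        import caffe
        import numpy as np
    except ImportError:
        print("Caffe or NumPy is not importable; load the Caffe module first.")
        return None

    if use_gpu:
        # Use the first GPU visible to the job.
        caffe.set_device(0)
        caffe.set_mode_gpu()
    else:
        caffe.set_mode_cpu()

    # Build the network from the layer file in inference (TEST) mode.
    net = caffe.Net(prototxt_path, caffe.TEST)

    # Fill the input blob with random values matching the four input_dim
    # entries in conv.prototxt: (1, 1, 100, 100).
    net.blobs["data"].data[...] = np.random.rand(1, 1, 100, 100)
    net.forward()

    # The "conv" blob holds the output of the convolution layer.
    return net.blobs["conv"].data.shape


if __name__ == "__main__":
    print(run_conv())
```

Saved as, for example, a hypothetical conv_test.py next to conv.prototxt, the script can be executed from the job file with `python conv_test.py`. With the layer file shown earlier, the printed shape should have three output channels, one per filter.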

NOTE: While it is acceptable to test programs on the login nodes, please do not run extended or intense computation on these shared resources. Use a batch job and the compute nodes for heavy processing.

Please limit interactive processing to short, non-intensive usage. Use non-interactive batch jobs for resource-intensive and/or multiple-core processing. Users are requested to be responsible and courteous to other users when using software on the login nodes.

Pay careful attention to which node this script will run on, as not all login nodes have GPUs.
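
One way to guard against this is to check for a GPU before selecting the Caffe mode. The sketch below is a heuristic only: it assumes that GPU nodes have the NVIDIA `nvidia-smi` utility on the PATH, and the helper name `node_has_gpu` is hypothetical.

```python
import shutil
import subprocess


def node_has_gpu():
    """Heuristic: True if nvidia-smi exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if node_has_gpu():
    print("GPU detected; caffe.set_mode_gpu() can be used.")
else:
    print("No GPU detected; use caffe.set_mode_cpu().")
```

A script can call this helper first and fall back to CPU mode when it returns False, so that the same file runs on both kinds of node.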

(More information on the Grace computing environment).

The most important processing limits here are:

- ONE HOUR of PROCESSING TIME per login session.
- EIGHT CORES per login session on the same node or (cumulatively) across all login nodes.

Anyone found violating the processing limits will have their processes killed without warning. Repeated violation of these limits will result in account suspension. Note: Your login session will disconnect after one hour of inactivity.

Batch Usage on the Compute Nodes

Non-interactive batch jobs on the compute nodes allow for resource-demanding processing. Non-interactive jobs have higher limits on the number of cores, amount of memory, and runtime length.

For instructions on how to create and submit a batch job, please see the appropriate wiki page for each respective cluster:

Grace: About Grace Batch Processing

Graphical Usage on the Compute Nodes

A VNC windowed session allows the use of a graphical user interface (GUI) on a compute node without disrupting other users.

For more information on using software with VNC, please visit our Remote Visualization page.