Keras

Description

Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.

If you are interested in using Keras 2.3.0 or a later version, please note that the Keras team recommends that users switch to tf.keras in TensorFlow 2.0. According to the Keras team, tf.keras is better maintained and integrates better with TensorFlow features (eager execution, distribution support, and others).

Homepage: https://keras.io/

Access

Keras is available on the Grace, FASTER, and ACES clusters.

Keras is open to all HPRC users.

Keras Modules

TAMU HPRC currently supports the use of Keras through the module system.

You can learn more about the module system on our Modules page.


To find what Keras versions are available, use module spider:

module spider Keras

To learn how to load a specific version, pass the full module name to module spider:

module spider Keras/2.4.3

You will need to load all module(s) on any one of the lines below before the "Keras/2.4.3" module is available to load.

GCC/10.2.0 CUDA/11.1.1 OpenMPI/4.0.5
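After loading the prerequisite modules and Keras/2.4.3, it can be worth confirming that Python actually sees the packages your script needs. The following is a minimal sketch using only the standard library. The helper name check_packages is hypothetical, and the package list assumes the Keras module provides numpy and tensorflow, which may differ between module versions.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each package name to True if it can be located."""
    # find_spec() locates a top-level package without importing it, so this
    # check is fast and does not initialize TensorFlow or touch the GPUs.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed package list; adjust it to whatever your script imports.
    for name, found in check_packages(["numpy", "tensorflow"]).items():
        print(f"{name}: {'found' if found else 'NOT found'}")
```

If a package is reported as not found, the most likely cause is that the Keras module or one of its prerequisites has not been loaded in the current shell.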

Example Keras Script

As with any job on the system, Keras should be used via the submission of a job file. Because Keras scripts are written in Python, they should not be written directly inside a job file or entered in the shell line by line. Instead, create a separate file for the Python/Keras script and have the job file execute it.

To create a new script file, simply open up the text editor of your choice.

Below is an example script that computes a batch-wise dot product with Keras:

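import numpy as np
from tensorflow import keras

# Report the Keras version bundled with TensorFlow.
print(keras.__version__)

# Two random 500x500 matrices.
x = np.random.rand(500, 500)
y = np.random.rand(500, 500)

# batch_dot computes one dot product per pair of matching rows.
z = keras.backend.batch_dot(x, y)
print(z.shape)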
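Note that batch_dot is not an ordinary matrix product. For two 2-D inputs it is documented to treat the first axis as the batch axis and to compute one dot product per pair of matching rows, so the result has a single column. The sketch below reproduces that behavior in plain Python on a small example that can be checked by hand. The helpers rowwise_dot and matmul are illustrative only and are not Keras functions.

```python
def rowwise_dot(x, y):
    """Dot product of matching rows, mirroring batch_dot on 2-D inputs."""
    return [[sum(a * b for a, b in zip(row_x, row_y))] for row_x, row_y in zip(x, y)]


def matmul(x, y):
    """Ordinary matrix product, shown only for contrast."""
    columns = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in x]


x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]

print(rowwise_dot(x, y))  # [[17], [53]]
print(matmul(x, y))       # [[19, 22], [43, 50]]
```

If an ordinary matrix product is what you need, tf.linalg.matmul(x, y) is the usual TensorFlow call.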

Saving the script with a .py file extension is recommended but not required.

If you encounter an error like

CUBLAS_STATUS_NOT_INITIALIZED

it is most likely because the GPUs on the login node are fully occupied. You can check GPU usage with

[NetID@cluster ~]$ nvidia-smi

In fact, if you see this error, you are already using the GPU-enabled version of the TensorFlow backend for Keras, which is a good sign. You can go ahead and submit a job to try things out on a compute node.
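To confirm GPU support without waiting for an error, you can ask TensorFlow directly which GPUs it sees. The following is a minimal sketch, assuming TensorFlow 2.1 or later. The function name describe_tensorflow_gpus is hypothetical, and the import is guarded so the snippet still runs where TensorFlow is not loaded.

```python
def describe_tensorflow_gpus():
    """Return a one-line summary of the GPU support TensorFlow reports."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not importable; load the Keras module first."
    # Listing devices does not allocate GPU memory or run any kernels.
    gpus = tf.config.list_physical_devices("GPU")
    built_with_cuda = tf.test.is_built_with_cuda()
    return f"built with CUDA: {built_with_cuda}, visible GPUs: {len(gpus)}"


if __name__ == "__main__":
    print(describe_tensorflow_gpus())
```

Running this inside a job on a compute node shows how many GPUs were actually made available to that job.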

Once saved (for example, as testscript.py), the script can be tested on a login node by entering:

[NetID@cluster ~]$ python testscript.py