PyTorch

Description

PyTorch is a deep learning framework that puts Python first.

Access

PyTorch is open to all HPRC users.

Anaconda and PyTorch Packages

TAMU HPRC currently supports the use of PyTorch through the Anaconda modules. A variety of Anaconda modules are available on Ada and Terra.

While several versions of Anaconda have a PyTorch environment installed, it is simplest to use exactly the versions listed in the following sections.

You can learn more about the module system on our SW:Modules page.

You can explore the available Anaconda environments on a per-module basis using the following:

[NetID@ada ~]$ module load Anaconda/[SomeVersion]
[NetID@ada ~]$ conda info --envs

PyTorch on Ada (CPU-only)

A single version of PyTorch is currently available on Ada. This version is limited to the CPU only (no GPU).

To load this version (Python 3.6):

[NetID@ada ~]$ module load Anaconda/3-5.0.0.1
[NetID@ada ~]$ source activate pytorch-0.2.0
[NetID@ada ~]$ [run your Python program accessing Pytorch]
[NetID@ada ~]$ source deactivate

This version can be run on any of the 64GB or 256GB compute nodes.
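If you want to confirm the environment loaded correctly before running real work, a minimal check along the following lines can be run inside the activated environment (this is a sketch, not part of the HPRC documentation; the tensor sizes are arbitrary):

import torch

# torch.__version__ should report 0.2.0 here (assumption based on the environment name)
print(torch.__version__)

# Multiply two small random matrices on the CPU to confirm basic tensor operations work
a = torch.randn(3, 4)
b = torch.randn(4, 2)
print(a.mm(b))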

PyTorch on Terra (GPU-only)

One GPU-enabled version of PyTorch (0.1.12) is available on Terra under the module Anaconda/3-5.0.0.1.

To load pytorch-gpu-0.1.12 (Python 3.6.2):

[NetID@terra ~]$ module load Anaconda/3-5.0.0.1
[NetID@terra ~]$ source activate pytorch-gpu-0.1.12
[NetID@terra ~]$ [run your Python program accessing Pytorch]
[NetID@terra ~]$ source deactivate
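Before launching a full training job, a quick device check like the one below (a minimal sketch; run it on a GPU node, since the .cuda() call will fail where no GPU is present) verifies that this build of PyTorch can see the GPU:

import torch

# Should print True on a Terra GPU node with pytorch-gpu-0.1.12 active
print(torch.cuda.is_available())

# Move a small tensor onto the GPU and back to confirm device transfers work
x = torch.randn(4, 4).cuda()
print(x.cpu().sum())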

Example PyTorch Script

import torch
dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
# N is batch size; D_in is input dimension;
# H is hidden dimension; D_out is output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and output data
x = torch.randn(N, D_in).type(dtype)
y = torch.randn(N, D_out).type(dtype)
# Randomly initialize weights
w1 = torch.randn(D_in, H).type(dtype)
w2 = torch.randn(H, D_out).type(dtype)
learning_rate = 1e-6
for t in range(500):
    # Forward pass: compute predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # Compute and print loss
    loss = (y_pred - y).pow(2).sum()
    print(t, loss)
    # Backprop to compute gradients of w1 and w2 with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # Update weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
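The dtype assignment at the top of the script is what switches the example between CPU and GPU tensors. If you want one file that runs unchanged on both Ada (CPU) and Terra (GPU), one possible variation (a sketch, not part of the original example) is to select the tensor type at runtime:

import torch

# Use GPU tensors when a CUDA device is visible, otherwise fall back to CPU tensors
if torch.cuda.is_available():
    dtype = torch.cuda.FloatTensor
else:
    dtype = torch.FloatTensor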