ACES: Graphcore IPU Tutorial

Overview

Instructor: Zhenhua He

Time: Tuesday, March 5, 2024 — 10:00AM-12:30PM CT

Location: Online using Zoom

Prerequisites: Current ACCESS ID; basic Linux/Unix skills; basic understanding of machine learning concepts, neural networks, and deep learning; familiarity with deep learning frameworks TensorFlow and/or PyTorch

This short course introduces researchers to Graphcore Intelligence Processing Units (IPUs) on the ACES cluster, a composable accelerator testbed at Texas A&M University. It explains how the IPU can be used to accelerate machine learning and artificial intelligence workloads, and it offers practical advice and hands-on experience for engineers and researchers who want to improve the performance of their AI and ML workloads.

The course begins by covering the architecture of the IPU and how it differs from traditional CPUs and GPUs. Participants will learn about the unique features of the IPU, such as its large memory bandwidth and high-performance interconnects, which make it well-suited for deep learning and other AI workloads.

The course then covers the use of the popular deep learning frameworks TensorFlow and PyTorch on the IPU. Participants will learn how to optimize their models for the IPU and how to use the Graphcore-specific libraries to take full advantage of its capabilities, with hands-on exercises for practice. The course also includes a section on Graphcore's software development kit (SDK) and its tools for profiling, debugging, and monitoring the IPU.

Course Materials

The presentation slides are available as downloadable PDF files.

  • Graphcore IPU Tutorial (Spring 2024) PDF
  • Graphcore IPU Workshop (Fall 2023) PDF
  • Graphcore IPU Workshop (Spring 2023) PDF
  • IPU Training Labs (Fall 2022) PDF
  • IPU-Training GitHub repo

Agenda

The tutorial is divided into four sessions:

  1. Intro to IPU (30 minutes)
    We will introduce Graphcore, the IPU architecture, and the IPU system on the TAMU ACES platform.
  2. Demo on ACES (30 minutes)
    We will demonstrate how to run models from different frameworks on the ACES IPU system.
  3. TensorFlow on IPU (30 minutes)
    We will learn to convert a Keras MNIST classification model to run on the IPU.
  4. PyTorch on IPU (30 minutes)
    We will learn to convert a PyTorch Fashion-MNIST classification model to run on the IPU.

For more information about ACES, see: https://hprc.tamu.edu/aces/

Note: During the class sessions, many aspects of the material will be illustrated live via a login to a training system. Attendees are welcome to follow along on their own computers.