ACES: AI/ML TechLab on Graphcore IPUs
Overview
Instructor(s): Dr. Zhenhua He
Time: Tuesday, November 11, 2025 10:00AM-12:30PM CT
Location: Online using Zoom
Prerequisite(s): Current ACCESS ID; basic Linux/Unix skills; basic understanding of machine learning concepts, neural networks, and deep learning; familiarity with the TensorFlow and/or PyTorch deep learning frameworks
This short course introduces researchers to Graphcore Intelligence Processing Units (IPUs) on the ACES cluster, a composable accelerator testbed at Texas A&M University. It explains how the IPU can be used to accelerate machine learning and artificial intelligence workloads, and it offers practical advice and hands-on experience for engineers and researchers who want to improve the performance of those workloads.
The course begins by covering the architecture of the IPU and how it differs from traditional CPUs and GPUs. Participants will learn about the unique features of the IPU, such as its large memory bandwidth and high-performance interconnects, which make it well-suited for deep learning and other AI workloads.
The course then covers using the TensorFlow and PyTorch deep learning frameworks on the IPU. Participants will learn how to optimize their models for the IPU and how to use the Graphcore-specific libraries to take full advantage of its capabilities, with hands-on exercises for practice. The course also includes a section on Graphcore's software development kit (SDK) and its tools for profiling, debugging, and monitoring the IPU.
A Registration button will appear here when registration has been opened.
Course Materials
The presentation slides are available as downloadable PDF files.
Learning Objectives
In this course, participants will:
- Access the IPU systems on the ACES cluster: Colossus and Bow Pod16.
- Run PyTorch and TensorFlow models on the IPU systems.
- Migrate a Keras MNIST classification model to the IPU.
- Migrate a PyTorch Fashion-MNIST classification model to the IPU.
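To give a rough idea of what the PyTorch migration objective involves, here is a minimal sketch and not the course's actual exercise. It assumes Graphcore's PopTorch package is importable (normally only after the Poplar SDK environment has been enabled on an IPU host) and falls back to plain PyTorch otherwise. The helper names `pick_backend`, `build_classifier`, and `wrap_for_training`, the layer sizes, and the hyperparameters are all hypothetical and chosen only for illustration.

```python
import importlib.util

# Assumption: PopTorch is importable only where the Poplar SDK is enabled.
HAS_TORCH = importlib.util.find_spec("torch") is not None
HAS_POPTORCH = HAS_TORCH and importlib.util.find_spec("poptorch") is not None


def pick_backend():
    """Return 'ipu' when PopTorch is importable, otherwise 'cpu'."""
    return "ipu" if HAS_POPTORCH else "cpu"


def build_classifier():
    """Hypothetical Fashion-MNIST classifier; returns None without PyTorch."""
    if not HAS_TORCH:
        return None
    import torch

    class Classifier(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(
                torch.nn.Flatten(),
                torch.nn.Linear(28 * 28, 128),
                torch.nn.ReLU(),
                torch.nn.Linear(128, 10),
            )
            self.loss_fn = torch.nn.CrossEntropyLoss()

        def forward(self, images, labels=None):
            logits = self.net(images)
            if labels is None:
                return logits
            # For training, PopTorch expects the loss to be computed
            # inside forward() and returned alongside the output.
            return logits, self.loss_fn(logits, labels)

    return Classifier()


def wrap_for_training(model):
    """Wrap the model for IPU training; return it unchanged on CPU."""
    if pick_backend() != "ipu":
        return model
    import poptorch

    opts = poptorch.Options()
    opts.deviceIterations(10)  # illustrative value, not a recommendation
    optimizer = poptorch.optim.SGD(model.parameters(), lr=0.01)
    return poptorch.trainingModel(model, options=opts, optimizer=optimizer)
```

On an IPU host, `wrap_for_training(build_classifier())` would return a PopTorch wrapper that is called with `(images, labels)` batches in place of the usual backward-and-step loop. On a machine without PopTorch the same call simply returns the unmodified PyTorch model.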
See: https://hprc.tamu.edu/aces/
Note: This is a training session that will take place on the ACES cluster. Participants will log in and follow along with the instructor to complete the hands-on exercises.
