ACES: Using the Slurm Scheduler on Composable Resources
Overview
Instructor: Michael Dickens
Time: Tuesday, September 10, 2024 1:30PM-4:00PM CT
Location: Online using Zoom
Prerequisites: Current ACCESS ID, basic Linux/Unix skills
This course introduces the Slurm scheduler on the ACES cluster, a composable accelerator testbed at Texas A&M University. Topics covered include multiple job scheduling approaches and job management tools.
Course Materials
The presentation slides are available as a downloadable PDF file.
Learning Objectives and Agenda
In this course, participants will:
- Learn the basics of HPC architecture
- Learn the basic components of a job script
- Learn how to submit a job script
- Learn how to review a job's HPC resource usage
- Learn how to debug failed jobs
This short course will cover various job scheduling approaches using the Slurm Workload Manager on ACES:
- HPC Architecture
- SBATCH Parameters
- Single node jobs
  - single-core
  - multi-core
- Multi-node jobs
  - MPI jobs
  - TAMULauncher
  - array jobs
- Monitoring job resource usage
  - at runtime
  - after job completion
  - job debugging
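As a rough illustration of the "basic components of a job script" named in the objectives, a minimal single-node, multi-core Slurm batch script might look like the sketch below. The job name, resource values, and output file name are hypothetical placeholders, not values taken from the course materials, and ACES may require additional directives (for example a partition or account) that are not shown here.

```shell
#!/bin/bash
# Minimal sketch of a Slurm job script; all values below are illustrative assumptions.
#SBATCH --job-name=example_job        # name shown in the queue (hypothetical)
#SBATCH --time=00:10:00               # wall-clock limit, HH:MM:SS
#SBATCH --nodes=1                     # single-node job
#SBATCH --ntasks=1                    # one task
#SBATCH --cpus-per-task=4             # multi-core: four cores for that task
#SBATCH --mem=8G                      # memory requested per node
#SBATCH --output=example_job.%j.out   # %j expands to the job ID

# Slurm sets these variables inside a running job. The fallbacks let the
# script also be run directly with bash as a dry run outside the scheduler.
job_id="${SLURM_JOB_ID:-none}"
cores="${SLURM_CPUS_PER_TASK:-1}"

# The actual workload would go here; this sketch only reports its allocation.
summary="job ${job_id} running with ${cores} core(s)"
echo "$summary"
```

Assuming the script is saved as `example_job.slurm` (a hypothetical file name), it would typically be submitted with `sbatch example_job.slurm`, watched in the queue with `squeue -u $USER`, and reviewed after completion with `sacct -j <jobid>`.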
See: https://hprc.tamu.edu/aces
Note: During the class sessions, many aspects of the material will be illustrated live via a login to ACES. Attendees will log in to ACES and complete the exercises. You are encouraged to contact the HPRC helpdesk with any questions regarding ACES.