System Notices
FASTER and Grace Cluster Maintenance, March 10-13 — UPDATED
UPDATE: (03/15/2025 4:36p):
The FASTER cluster is currently available with about 80% of its compute nodes. We are still investigating issues with FASTER's OOD portals.
The Grace cluster redeployment is still in progress.
UPDATE: (03/14/2025 11:55p):
The FASTER cluster may be available tomorrow morning at 75% capacity after testing overnight. Some GPU nodes will remain offline due to composability fabric issues that will be remediated next week.
The Grace cluster remains unavailable because its redeployment with a new OS is taking much longer than anticipated. We will continue working through the weekend to complete the remaining maintenance and make the Grace cluster available as soon as possible.
UPDATE: (03/13/2025 10:06pm):
Maintenance on the shared storage and the Liqid composability fabrics was completed successfully but took longer than anticipated. A failed disk that needed replacement contributed to delays in the shared storage maintenance. We will provide more updates as we continue work on the FASTER and Grace cluster maintenance.
Posted at 03/04/2025 10:28a
The FASTER and Grace clusters will be unavailable from 9am March 10 to 8pm March 13. Software maintenance will be performed on FASTER's nodes and the Liqid fabrics. The Grace cluster will be redeployed with the same OS (RHEL 8.10) as FASTER. The software on the shared storage will be updated as well.