Grace: A Dell x86 HPC Cluster

=== Hardware Overview ===


[[Image:Grace-racks.jpg|right|400px|caption]]
{| class="wikitable" style="text-align: center;"
| System Name:
| Grace
|-
| Host Name:
| grace.hprc.tamu.edu
|-
| Operating System:
| Linux (CentOS 7)
|-
| Total Compute Cores/Nodes:
| 44,656 cores<br>925 nodes
|-
| Compute Nodes:
| 800 48-core compute nodes, each with 384GB RAM<br>100 48-core GPU nodes, each with two A100 40GB GPU accelerators and 384GB RAM<br>9 48-core GPU nodes, each with two RTX 6000 24GB GPU accelerators and 384GB RAM<br>8 48-core GPU nodes, each with four T4 16GB GPU accelerators and 384GB RAM<br>8 80-core large memory nodes, each with 3TB RAM
|-
| Interconnect:
| Mellanox HDR InfiniBand
|-
| Peak Performance:
| 2,638.5 TFLOPS
|-
| Global Disk:
| 5PB (usable) via DDN appliance for general use<br>?PB (raw) via Lenovo's DSS purchased by and dedicated for ??
|-
| File System:
| Lustre and GPFS
|-
| Batch Facility:
| [http://slurm.schedmd.com/ Slurm by SchedMD]
|-
| Location:
| West Campus Data Center
|-
| Production Date:
| Spring 2021
|}

Grace is an Intel x86-64 Linux cluster with 925 compute nodes (44,656 total cores) and 5 login nodes. There are 800 general compute nodes with 384 GB of memory and 117 GPU nodes, also with 384 GB of memory. Among the 117 GPU nodes, 100 have two A100 40GB GPU cards, 9 have two RTX 6000 24GB GPU cards, and 8 have four T4 16GB GPU cards. Each of these 800 compute nodes and 117 GPU nodes is a dual-socket server with two Intel 6248R 3.0 GHz 24-core processors. There are also 8 large memory compute nodes, each with 3 TB of memory and four Intel 6248 2.5 GHz 20-core processors.
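As a quick check, these figures are consistent with the node breakdown in the table above: 800 + 100 + 9 + 8 + 8 = 925 nodes, and (800 + 100 + 9 + 8) × 48 cores + 8 × 80 cores = 44,016 + 640 = 44,656 cores.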

The interconnecting fabric is a two-level fat-tree based on HDR InfiniBand. High-performance mass storage with 5 petabytes of usable capacity is made available to all nodes by a DDN appliance.

For details on using this system, see the User Guide for Grace.
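Because the batch facility listed in the table above is Slurm, work on the compute nodes is normally submitted as a batch script. The following is only a minimal sketch: the job name, resource requests, and executable are placeholders, and Grace-specific settings such as partition names are covered in the User Guide for Grace.

<pre>
#!/bin/bash
#SBATCH --job-name=example        # placeholder job name
#SBATCH --ntasks=1                # single task on a single core
#SBATCH --time=01:00:00           # one-hour wall-clock limit
#SBATCH --mem=2G                  # memory request for the job
#SBATCH --output=example.%j.out   # output file; %j expands to the job ID

./my_program                      # placeholder executable
</pre>

Such a script would be submitted with <code>sbatch</code> and monitored with <code>squeue</code>.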


== Compute Nodes ==

A description of the four types of compute nodes is below:


== Login Nodes ==

The '''grace.hprc.tamu.edu''' hostname can be used to access the Grace cluster. It resolves to one of the five login nodes, '''grace[1-5].hprc.tamu.edu'''. To access a specific login node, use its corresponding host name (e.g., grace2.hprc.tamu.edu). All login nodes have 10 GbE connections to the TAMU campus network and direct access to all global parallel (Lustre-based) file systems. The table below provides more details about the hardware configuration of the login nodes.
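For example, logging in from a terminal might look like the following, where <code>NetID</code> is a placeholder for your own username:

<pre>
# connect to the cluster; the alias resolves to one of the five login nodes
ssh NetID@grace.hprc.tamu.edu

# or target a specific login node
ssh NetID@grace2.hprc.tamu.edu
</pre>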


== Mass Storage ==

5PB (usable) with Lustre provided by DDN


== Interconnect ==

== Namesake ==

"Grade" is named for Grace Hopper.