From TAMU High Performance Research Computing

Terra: A Lenovo x86 HPC Cluster

Under Construction

Hardware Overview


System Name: Terra
Host Name: terra.tamu.edu
Operating System: Linux (CentOS 7)
Total Compute Cores/Nodes: 8,512 cores; 304 nodes
Compute Nodes: 256 compute nodes, each with 64GB RAM; 48 GPU nodes, each with a single dual-GPU Tesla K80 accelerator and 128GB of RAM
Interconnect: Intel Omni-Path 100 Series switches
Peak Performance: 326 TFLOPs
Global Disk: 2PB (raw) via two IBM/Lenovo GSS26 appliances for general use; 1PB (raw) via Lenovo's GSS24 appliance purchased by and dedicated for GEOSAT
File System: General Parallel File System (GPFS)
Batch Facility: Slurm by SchedMD
Location: Teague Data Center
Production Date: Spring 2017 (tentative)

Terra is an Intel x86-64 Linux cluster with 304 compute nodes (8,512 total cores) and 3 login nodes. There are 256 compute nodes with 64 GB of memory and 48 compute nodes with 128 GB of memory and a K80 GPU card. Each compute node is a dual-socket server with two Intel Xeon E5-2680 v4 2.40GHz 14-core processors. The interconnecting fabric is a two-level fat-tree based on Intel Omni-Path Architecture (OPA). High-performance mass storage of 3 petabytes (raw) capacity is made available to all nodes by two IBM GSS26 storage appliances and one GSS24 storage appliance.
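The core count and the quoted peak performance can be cross-checked with a little arithmetic. The sketch below is only a rough estimate: it assumes 16 double-precision floating-point operations per cycle per core (the usual figure for AVX2 with two FMA units on this processor generation), uses the 2.40 GHz base clock, and ignores the GPUs. The page does not say how the 326 TFLOPs figure was derived, so the constant FLOPS_PER_CYCLE and the function name are illustrative assumptions.

```python
# Rough cross-check of Terra's core count and peak CPU performance.
# Node, core, and clock figures come from the hardware overview above.
NODES = 304
CORES_PER_NODE = 28          # 2 sockets x 14 cores
CLOCK_GHZ = 2.4              # base clock of the Xeon E5-2680 v4
FLOPS_PER_CYCLE = 16         # ASSUMPTION: AVX2, two FMA units, double precision


def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Return the theoretical peak in TFLOPs for a homogeneous CPU cluster."""
    gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gflops / 1000.0


total_cores = NODES * CORES_PER_NODE
estimate = peak_tflops(NODES, CORES_PER_NODE, CLOCK_GHZ, FLOPS_PER_CYCLE)

print(total_cores)           # 8512
print(round(estimate, 1))    # 326.9
```

Under these assumptions the estimate lands close to the quoted 326 TFLOPs, which suggests (but does not confirm) that the published figure counts CPU cores only.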

For details on using this system, see the User Guide for Terra.

Compute Nodes

A description of the two types of compute nodes is below:

Table 1: Details of Compute Nodes

                     General 64 GB Compute     GPU 128 GB Compute
Total Nodes          256                       48
Processor Type       Intel Xeon E5-2680 v4 2.40GHz 14-core (both node types)
Sockets/Node         2                         2
Cores/Node           28                        28
Memory/Node          64 GB DDR4, 2400 MHz      128 GB DDR4, 2400 MHz
Accelerator(s)       N/A                       1 NVIDIA K80 Accelerator
Interconnect         Intel Omni-Path Architecture (OPA) (both node types)
Local Disk Space     1TB 7.2K RPM SATA disk (both node types)

Note: each K80 accelerator has two GPUs.
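As a quick consistency check on the figures for the two node types, the totals can be tallied as in the sketch below. The dictionary layout and the function name are illustrative only; the per-node values are taken from the compute node details and the note about dual-GPU K80 cards.

```python
# Tally cluster-wide totals from the per-node-type figures.
# The structure and names here are illustrative, not an official inventory.
NODE_TYPES = {
    "general": {"nodes": 256, "cores": 28, "mem_gb": 64, "gpus": 0},
    "gpu": {"nodes": 48, "cores": 28, "mem_gb": 128, "gpus": 2},  # one K80 = 2 GPUs
}


def cluster_totals(node_types):
    """Sum nodes, cores, memory, and GPUs across all node types."""
    totals = {"nodes": 0, "cores": 0, "mem_gb": 0, "gpus": 0}
    for spec in node_types.values():
        totals["nodes"] += spec["nodes"]
        totals["cores"] += spec["nodes"] * spec["cores"]
        totals["mem_gb"] += spec["nodes"] * spec["mem_gb"]
        totals["gpus"] += spec["nodes"] * spec["gpus"]
    return totals


totals = cluster_totals(NODE_TYPES)
print(totals["nodes"], totals["cores"], totals["gpus"])  # 304 8512 96
```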


Login Nodes

The terra.tamu.edu hostname can be used to access the Terra cluster. It resolves to one of the three login nodes, terra[1-3].tamu.edu. To access a specific login node, use its host name (e.g., terra2.tamu.edu). All login nodes have 1 GigE external connectivity and direct access to all global parallel (GPFS-based) file systems. The table below provides more details about the hardware configuration of the login nodes.

Table 2: Details of Login Nodes

                     No Accelerator            One NVIDIA K80 Accelerator
HostNames            terra1.tamu.edu, terra2.tamu.edu, terra3.tamu.edu
Processor Type       Intel Xeon E5-2680 v4 2.40GHz 14-core (all login nodes)
Memory               128 GB DDR4 2400 MHz (all login nodes)
Total Nodes          2                         1
Cores/Node           28                        28
Interconnect         Intel Omni-Path Architecture (OPA) (all login nodes)
Local Disk Space     per node: two 900GB 10K RPM SAS drives (all login nodes)

Usable Memory/RAM

While nodes on Terra have either 64GB or 128GB of RAM, some of this memory is used to maintain the software and operating system of the node. In most cases, excessive memory requests will be automatically rejected by SLURM.

The table below contains information regarding the approximate limits of Terra memory hardware and our suggestions on its use.

Memory Limits of Nodes

                         64GB Nodes            128GB Nodes
Node Count               256                   48
Number of Cores          28 Cores (2 sockets x 14 cores) (both node types)
Memory Limit Per Core    2048 MB (2 GB)        4096 MB (4 GB)
Memory Limit Per Node    57344 MB (56 GB)      114688 MB (112 GB)

SLURM may queue your job for a long time (or indefinitely) while it waits for nodes with sufficient memory to become free.
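The per-node limits above can be turned into a simple pre-submission check. The sketch below is plain arithmetic on the published limits, not a model of how the scheduler actually decides; the names NODE_LIMITS_MB, per_core_limit_mb, and eligible_node_types are hypothetical helpers introduced for illustration. Dividing each per-node limit by 28 cores gives 2048 MB and 4096 MB per core.

```python
# Check which Terra node types could satisfy a per-node memory request.
# Limits are the usable per-node values listed above, in MB.
NODE_LIMITS_MB = {"64GB": 57344, "128GB": 114688}
CORES_PER_NODE = 28


def per_core_limit_mb(node_type):
    """Usable memory per core when all 28 cores share the node evenly."""
    return NODE_LIMITS_MB[node_type] // CORES_PER_NODE


def eligible_node_types(mem_per_node_mb):
    """Return the node types whose usable memory covers the request."""
    return [name for name, limit in NODE_LIMITS_MB.items()
            if mem_per_node_mb <= limit]


print(per_core_limit_mb("64GB"))      # 2048
print(per_core_limit_mb("128GB"))     # 4096
print(eligible_node_types(57344))     # ['64GB', '128GB']
print(eligible_node_types(60000))     # ['128GB']
print(eligible_node_types(120000))    # []
```

A request that only the 128GB nodes can satisfy competes for 48 nodes rather than 304, which is one reason such jobs may wait longer; a request that no node type can satisfy is the kind that would be expected to be rejected.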

Mass Storage

Terra provides 2PB (raw) of storage via two IBM/Lenovo GSS26 appliances for general use and 1PB (raw) via Lenovo's GSS24 appliance, purchased by and dedicated for GEOSAT. All storage appliances will use GPFS.

Interconnect

[Figure: OPA fabric]

Namesake

"Terra" comes from the Latin word for "this planet", a.k.a. "Earth". One of the purposes of this cluster is to study images gathered from an "Earth Observation Satellite" (EOS). Given that we just retired a cluster named "Eos" (named after the Greek goddess of the dawn, waiting to spread the light of knowledge each day), the name Terra was chosen instead.
