HPC101: How to use a Supercomputer?
HPC Saudi 2017
HPC101: Introduction to High Performance Computing
Saber Feki
Computational Scientist Lead
KAUST Supercomputing Core Lab
Agenda
• 8:30am: Introduction to HPC (Dr. Saber Feki)
• 8:40am: Overview of Shaheen II Architecture
• 8:50am: How to get access to Shaheen (Dr. Bilel Hadri)
• 9:05am: Programming Environment
• 9:30am: Runtime Environment (Dr. Samuel Kortas)
• 9:55am: Lustre Parallel Filesystem (Dr. Georgios Markomanolis)
• 10:15am: Coffee Break
• 10:30am: Application Software examples (Dr. Rooh Khurram and Dr. Zhiyong Zhu)
• 11:00am: Visualization Tools (Dr. Madhu Srinivasan, Ms. Dina Garatly)
• 11:45am: Tips & Best Practices (Dr. Bilel Hadri)
What is HPC and Why HPC?
• https://www.youtube.com/watch?v=TGSRvV9u32M
HPC101: Shaheen II Architecture Overview
Saber Feki
Computational Scientist Lead
KAUST Supercomputing Core Lab
Shaheen II Overview
COMPUTE
Node: Intel Haswell processors
2 CPU sockets per node, 16 processor cores per CPU, 2.3 GHz
6174 nodes, 197,568 cores
128 GB of memory per node
Over 790 TB total memory
Power: Up to 2.8 MW, water cooled
Weight/Size: More than 100 metric tons
36 XC40 compute cabinets, plus disk, blowers, management, etc.
Speed: 7.2 Pflop/s peak theoretical performance
5.5 Pflop/s sustained LINPACK
Network: Cray Aries interconnect with Dragonfly topology
57% of the maximum global bandwidth between the 18 groups of two cabinets
STORE
Storage: Sonexion 2000 Lustre appliance
17.6 petabytes of usable storage, over 500 GB/s bandwidth
Burst Buffer: DataWarp Solid State Devices (SSD) fast data cache
Over 1 TB/s bandwidth (delivery September 2015)
Archiving: Tiered Adaptive Storage (TAS)
Hierarchical storage with 200 TB disk cache and 20 PB of tape storage, using a Spectra Logic tape library (can expand up to 100 PB)
ANALYZE
Analyzing: Urika-GD
2 TB of global shared memory, 64 Threadstorm4 processors with 128 hardware threads per processor
Over 75 TB of Lustre PFS
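The compute figures on this slide can be cross-checked with a little arithmetic. The sketch below is illustrative only: the node count, clock, memory and LINPACK values come from the slide, while the 16 double-precision flops per core per cycle is an assumption about Haswell (two AVX2 FMA units, 4 doubles per 256-bit register, 2 flops per FMA) that the slide does not state. The function name `shaheen_summary` is hypothetical.

```python
# Figures quoted on the Shaheen II overview slide.
NODES = 6174
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 16
CLOCK_GHZ = 2.3
MEM_PER_NODE_GB = 128
LINPACK_PFLOPS = 5.5

# Assumption (not on the slide): 16 double-precision flops per core per cycle,
# i.e. 2 FMA units x 4 doubles per 256-bit AVX2 register x 2 flops per FMA.
FLOPS_PER_CYCLE = 16


def shaheen_summary():
    """Derive aggregate figures from the per-node numbers."""
    cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
    memory_tb = NODES * MEM_PER_NODE_GB / 1000  # decimal terabytes
    # cores * GHz * flops/cycle gives Gflop/s; divide by 1e6 for Pflop/s.
    peak_pflops = cores * CLOCK_GHZ * FLOPS_PER_CYCLE / 1e6
    return {
        "cores": cores,
        "memory_tb": memory_tb,
        "peak_pflops": peak_pflops,
        "linpack_efficiency": LINPACK_PFLOPS / peak_pflops,
    }


if __name__ == "__main__":
    for name, value in shaheen_summary().items():
        print(f"{name}: {value}")
```

Under that assumption the core count and total memory match the slide, the peak comes out at roughly 7.27 Pflop/s, and the sustained LINPACK result is about 76% of peak. Real AVX clock rates can differ from the nominal 2.3 GHz, so treat the peak as a nominal figure.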
Intel Haswell CPU
AVX 2 and FMA in Haswell
Intel Haswell CPU
XC40 Compute Blades
High Speed Network (HSN)
High Speed Network (HSN)
High Speed Network (HSN)
XC40 Routing
High Speed Network (HSN)
Networking
Shaheen II Sonexion
• Cray Sonexion 2000 Storage System consisting of 12 cabinets containing a total of 5988 4 TB SAS disk drives.
• The cabinets are interconnected by an FDR InfiniBand fabric.
• Each cabinet can contain up to 6 Scalable Storage Units (SSUs); Shaheen II has a total of 72 SSUs.
• As there are 2 OSSs/OSTs for each SSU, there are 144 OSTs in total.
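The storage numbers above fit together as simple arithmetic. This is a minimal sketch, not a vendor sizing tool: the counts come from the slides, the 17.6 PB usable capacity and the 500 GB/s figure come from the overview slide, and the even split of bandwidth across OSTs is an assumption made only for illustration. The function name `sonexion_summary` is hypothetical.

```python
# Figures quoted on the Sonexion and overview slides.
CABINETS = 12
SSUS = 72
OSTS_PER_SSU = 2
DISKS = 5988
DISK_TB = 4
USABLE_PB = 17.6
BANDWIDTH_GBS = 500  # slide says "over 500 GB/s", so this is a lower bound


def sonexion_summary():
    """Derive OST count, raw capacity and rough per-OST bandwidth."""
    osts = SSUS * OSTS_PER_SSU
    raw_pb = DISKS * DISK_TB / 1000  # decimal petabytes
    return {
        "ssus_per_cabinet": SSUS / CABINETS,
        "osts": osts,
        "raw_pb": raw_pb,
        "usable_fraction": USABLE_PB / raw_pb,
        # Assumption: aggregate bandwidth is spread evenly over all OSTs.
        "gbs_per_ost": BANDWIDTH_GBS / osts,
    }


if __name__ == "__main__":
    for name, value in sonexion_summary().items():
        print(f"{name}: {value}")
```

The result is 6 SSUs per cabinet (so all 12 cabinets are fully populated), 144 OSTs, and just under 24 PB of raw disk, of which roughly 73% is usable. The gap between raw and usable capacity presumably covers redundancy and spares, which the slides do not break down.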
Questions?