perspective on extreme scale computing in china

Perspective on Extreme Scale Computing in China Depei Qian Sino-German Joint Software Institute (JSI) Beihang University Co-design 2013, Guilin, Oct. 29, 2013



TRANSCRIPT

Page 1: Perspective on Extreme Scale Computing in China

Perspective on Extreme Scale Computing in China

Depei Qian

Sino-German Joint Software Institute (JSI)

Beihang University

Co-design 2013, Guilin, Oct. 29, 2013

Page 2: Perspective on Extreme Scale Computing in China

Outline
  Related R&D programs in China
  HPC system development
  Application service environment
  Applications

Page 3: Perspective on Extreme Scale Computing in China

Related R&D programs in China


Page 4: Perspective on Extreme Scale Computing in China

HPC-related R&D under NSFC

NSFC key initiative “Basic algorithms for high performance scientific computing and computable modeling”
  2011-2018
  180 million RMB
  Basic algorithms and highly efficient implementation
  Computable modeling
  Verification by solving domain problems

Page 5: Perspective on Extreme Scale Computing in China

HPC-related R&D under the 863 program

3 key projects in the last 12 years
  High performance computer and core software (2002-2005)
  High productivity computer and Grid service environment (2006-2010)
  High productivity computer and application environment (2011-2016)
3 major projects
  Multicore/many-core programming support (2012-2015)
  High performance parallel algorithms and parallel coupler development for earth systems study (2010-2013)
  HPC software support for earth system modeling (2010-2013)

Page 6: Perspective on Extreme Scale Computing in China

HPC-related R&D under the 973 program

973 program
  High performance scientific computing
  Large scale scientific computing
  Aggregation and coordination mechanisms in virtual computing environment
  Highly efficient and trustworthy virtual computing environment

Page 7: Perspective on Extreme Scale Computing in China

There is no national long-term R&D program on extreme scale computing

Coordination between the different programs is needed

Page 8: Perspective on Extreme Scale Computing in China

Shift of 863 program emphasis

1987: Intelligent computers, following the 5th generation computer program in Japan

1990: from intelligent computers to high performance parallel computers

1999: from individual HPC system to the national HPC environment

2006: from high performance computers to high productivity computers

Page 9: Perspective on Extreme Scale Computing in China

History of HPC development under 863 program

1990: parallel computers identified as priority topic of the 863 program
  National Intelligent Computer R&D Center established
1993: Dawning 1, 640MIPS, SMP
1995: Dawning 1000, 2.5GFlops, MPP
  Dawning company established in 1995
1996: Dawning 1000A, cluster system
  First product-oriented system of Dawning
1998: Dawning 2000, 100GFlops, cluster

Page 10: Perspective on Extreme Scale Computing in China

History of HPC development under 863 program

2000: Dawning 3000, 400GFlops, cluster
  First system commercialized

2002: Lenovo DeepComp 1800, 1TFlops, cluster
  Lenovo entered the HPC market

2003: Lenovo DeepComp 6800, 5.3TFlops, cluster

2004: Dawning 4000A, 11.2TFlops

Page 11: Perspective on Extreme Scale Computing in China

History of HPC development under 863 program

2008:
  Lenovo DeepComp 7000: 150TFlops, heterogeneous cluster
  Dawning 5000A: 230TFlops, cluster
2010:
  Dawning 6000: 3PFlops, heterogeneous system, CPU+GPU
  TH-1A: 4.7PFlops, heterogeneous, CPU+GPU
2011:
  Sunway BlueLight: 1PFlops+100TFlops, based on domestic processor
2013:
  TH-2: heterogeneous system with CPU+MIC

Page 12: Perspective on Extreme Scale Computing in China

863 key projects on HPC and Grid: 2002-2010

“High performance computer and core software”
  4-year project, May 2002 to Dec. 2005
  100 million Yuan funding from the MOST
  More than 2x associated funding from local government, application organizations, and industry
  Major outcomes: China National Grid (CNGrid)
“High productivity Computer and Grid Service Environment”
  Period: 2006-2010 (extended to now)
  940 million Yuan from the MOST and more than 1B Yuan matching money from other sources

Page 13: Perspective on Extreme Scale Computing in China

Current 863 key project

“High productivity computer and application environment”
  2011-2015 (2016)
  1.3B Yuan investment secured
  Develop leading-level high performance computers
  Transform CNGrid into an application service environment
  Develop parallel applications in selected areas

Page 14: Perspective on Extreme Scale Computing in China

Projects launched

The first round of projects launched in 2011
  High productivity computer (1)
    100PF by the end of 2015
  HPC applications (6)
    Fusion simulation
    Simulation for aircraft design
    Drug discovery
    Digital media
    Structural mechanics for large machinery
    Simulation of electro-magnetic environment
  Parallel programming framework (1)
Application service environment will be supported in the second round
  Emphasis on application service support
  Technologies for new mode of operation

Page 15: Perspective on Extreme Scale Computing in China

HPC system development

Page 16: Perspective on Extreme Scale Computing in China

Major challenges
  Power consumption
  Performance obtained by the applications
  Programmability
  Resilience
Major obstacles
  Memory walls
  Power walls
  I/O walls
  …

Page 17: Perspective on Extreme Scale Computing in China

Power consumption

The limiting factor in implementing extreme scale computers
  Impossible to increase performance by expanding system scale only
  Cooling of the system is difficult and affects reliability of the system
  Energy cost is a heavy burden and prevents acceptance of extreme scale computers by end users

Page 18: Perspective on Extreme Scale Computing in China

Performance obtained by applications

Systems installed at general purpose computing centers
  Serving a large population of users
  Supporting a wide range of applications
LinPack is not everything
  Need to be efficient for both general-purpose and special-purpose computing
  Need to support both computing-intensive and data-intensive applications

Page 19: Perspective on Extreme Scale Computing in China

Programmability

Must handle
  Concurrency/locality
  Heterogeneity of the system
  Porting of legacy programs
Lower the skill requirement for application developers

Page 20: Perspective on Extreme Scale Computing in China

Resilience

Very short MTBF for extreme scale systems
Long-time continuous operation
  System must self-heal/recover from hardware faults/failures
  System must detect and tolerate errors in software
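
To illustrate why MTBF becomes very short at extreme scale, here is a minimal sketch. It assumes independent node failures with constant failure rates, and the per-node MTBF of 5 years is a hypothetical figure, not one given in the talk.

```python
def system_mtbf_hours(node_mtbf_hours: float, num_nodes: int) -> float:
    """System MTBF when any single node failure interrupts the system.

    Assumes independent nodes with constant (exponential) failure rates,
    so the failure rates add up and the MTBF divides by the node count.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return node_mtbf_hours / num_nodes


# Hypothetical figure: each node fails on average once every 5 years.
NODE_MTBF_HOURS = 5 * 365 * 24

for nodes in (1_000, 10_000, 100_000):
    mtbf = system_mtbf_hours(NODE_MTBF_HOURS, nodes)
    print(f"{nodes:>7} nodes -> system MTBF {mtbf:8.2f} hours")
```

Under these assumptions a 100,000-node system would be interrupted more than once an hour, which is why self-healing and fault tolerance are listed as requirements.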

Page 21: Perspective on Extreme Scale Computing in China

Constrained design principle

We must set strong constraints on the extreme scale system implementation
  Power consumption
    50GF/W or less before 2020
    5GF/W in 2015
  System scale
    <100,000 processors
    <200 cabinets
  Cost
    <300 million dollars (or <2B Yuan)
We can only design and implement extreme scale systems within those constraints
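
The power constraint can be turned into a rough power budget. The sketch below is only illustrative arithmetic: the 100PF target and the 5GF/W and 50GF/W figures come from the slides, while pairing 50GF/W with a 1000PF (exascale) machine is an assumption made here for comparison.

```python
def power_megawatts(peak_pflops: float, gflops_per_watt: float) -> float:
    """Power draw in MW for a given peak performance and energy efficiency."""
    gflops = peak_pflops * 1e6        # 1 PFlops = 1e6 GFlops
    watts = gflops / gflops_per_watt
    return watts / 1e6                # W -> MW


# 100 PFlops at the 2015 figure of 5 GFlops/W
print(power_megawatts(100, 5))
# Assumption: an exascale (1000 PFlops) system at 50 GFlops/W
print(power_megawatts(1000, 50))
```

Both cases come out at 20MW, which suggests the efficiency figures are chosen to keep the power draw flat while performance grows tenfold.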

Page 22: Perspective on Extreme Scale Computing in China

How to address the challenges?

Architectural support
Technology innovation
Hardware and software coordination

Page 23: Perspective on Extreme Scale Computing in China

Architectural support

Using the most appropriate architecture to achieve the goal
Making trade-offs between performance, power consumption, programmability, resilience, and cost
Hybrid architecture (TH-1A & TH-2)
  General purpose + high density computing (GPU or MIC)
HPP architecture (Dawning 6000/Loongson)
  Enable different processors to co-exist
  Support global address space
  Multiple levels of parallelism
Multi-conformation and multi-scale adaptive architecture (SW/BL)
  Cluster implemented with Intel processors for supporting commercial software
  Homogeneous system implemented with domestic multicore processors for computing-intensive applications
  Support parallelism at different levels

Page 24: Perspective on Extreme Scale Computing in China

Classification of current major architectures

Classifying architectures using “homogeneity/heterogeneity” and “CPU only/CPU+Accelerator”
Homo-/Hetero- refers to the ISA

                CPU only                          CPU+Acc
Homogeneous     Sequoia, K computer, Sunway/BL    Stampede, TH-2
Heterogeneous   Dawning 6000/HPP (AMD+Loongson)   TH-1A, Dawning 6000/Nebulae, Tsubame 2.0
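
The two classification axes can be expressed as a small data structure. This is a sketch only: the `System` type, the `classify` helper, and the ISA labels are hypothetical names introduced for illustration, chosen so that each example lands in the quadrant the slide assigns to it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    isas: frozenset          # instruction sets present (illustrative labels)
    has_accelerator: bool    # True for CPU+Acc designs


def classify(system: System) -> tuple:
    """Return (ISA axis, accelerator axis) following the slide's scheme."""
    isa_axis = "Homogeneous" if len(system.isas) == 1 else "Heterogeneous"
    acc_axis = "CPU+Acc" if system.has_accelerator else "CPU only"
    return isa_axis, acc_axis


examples = [
    System("Sunway/BL", frozenset({"SW"}), False),
    System("TH-2", frozenset({"x86"}), True),
    System("Dawning 6000/HPP", frozenset({"x86", "Loongson"}), False),
    System("TH-1A", frozenset({"x86", "GPU"}), True),
]

for s in examples:
    print(s.name, classify(s))
```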

Page 25: Perspective on Extreme Scale Computing in China

Comparison of different architectures

                 Power       Performance      Programmability/productivity   Resilience
Homo/CPU only    poor/fair   good/excellent   good/good                      varies
Heter/CPU only   poor        good             fair/fair                      varies
Homo/CPU+ACC     fair        good/excellent   good/poor?                     varies
Heter/CPU+ACC    good        good/excellent   fair/poor?                     varies

Page 26: Perspective on Extreme Scale Computing in China

TH-1A architecture

Hybrid system architecture
  Computing sub-system
  Service sub-system
  Communication networks
  Storage sub-system
  Monitoring and diagnosis sub-system

[Figure: TH-1A block diagram showing the compute sub-system (CPU+GPU nodes), service sub-system (operation nodes), storage sub-system (MDS and OSS servers), communication sub-system, and monitor and diagnosis sub-system]

Page 27: Perspective on Extreme Scale Computing in China

Dawning/Loongson HPP (Hyper Parallel Processing) architecture

Hyper node composed of AMD and Loongson processors
Separation of OS & application processors
Multiple interconnects
H/W global synchronization

[Figure: HPP block diagram. Labels: hypernodes containing application CPUs with runtimes (RTs) and memories, an OS CPU with its memory, an HPP controller, and I/O; data, OS, and global synchronization interconnects between hypernodes]

Page 28: Perspective on Extreme Scale Computing in China

Sunway BlueLight Architecture

[Figure: Sunway BlueLight system diagram. Labels: BlueLight computer, global I/O network, I/O nodes, online/nearline/offline storage, storage manager, subnetwork manager, system management, system service, database service, security service, job management nodes, login nodes, console, TCP/IP network, data center, firewalls, VPN, intranet with local users, Internet with remote users, National Grid, cloud services]

Page 29: Perspective on Extreme Scale Computing in China

Technology innovations

Innovation at different levels
  Device
  Component
  System
New processor architectures
  Heterogeneous many-core, accelerators, reconfigurable
Address memory wall
  New memory devices
  3D stacking
  New cache architectures
High performance interconnect
  All-optical network
  Silicon photonics
High density system design
Low power design

Page 30: Perspective on Extreme Scale Computing in China

SW1600 processor features

CPU                  SW1600
Release time         Aug. 2010
Processor cores      16
Peak performance     [email protected]
Clock frequency      0.975~1.1GHz
Process generation   65nm
Power                35~70W

A general-purpose multi-core processor
Power efficient, achieving 2.0GFlops/W
Next generation processor is under development

Page 31: Perspective on Extreme Scale Computing in China

FT-1500 CPU

SPARC V9, 16 cores, 4 SIMD
40nm, 1.8GHz
Performance: 144GFlops
Typical power: ~65W

Page 32: Perspective on Extreme Scale Computing in China

Heterogeneous Compute Node (TH-2)

Similar ISA, different ALU
2 Intel Ivy Bridge CPUs + 3 Intel Xeon Phi
16 Registered ECC DDR3 DIMMs, 64GB
3 PCI-E 3.0 with 16 lanes
PDP Comm. Port
Dual Gigabit LAN
Peak Perf.: 3.432Tflops

[Figure: TH-2 compute node block diagram. Labels: CPUs, QPI, MICs with GDDR5 memory, PCH, DMI, 16X PCIE links, CPLD, IPMB, GE (dual Gigabit LAN), PDP comm. port]
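
The quoted node peak of 3.432Tflops can be reproduced with simple arithmetic. The core counts, clock rates, and flops per cycle below are not stated in the talk; they are assumptions based on the parts commonly reported for TH-2 (12-core 2.2GHz Ivy Bridge Xeons and 57-core 1.1GHz Xeon Phi cards), so treat this as a plausibility check rather than an official breakdown.

```python
def peak_gflops(cores: int, ghz: float, flops_per_cycle: int) -> float:
    """Theoretical double-precision peak of one processor in GFlops."""
    return cores * ghz * flops_per_cycle


# Assumed part parameters (not given in the slides).
cpu = peak_gflops(cores=12, ghz=2.2, flops_per_cycle=8)    # Ivy Bridge Xeon
mic = peak_gflops(cores=57, ghz=1.1, flops_per_cycle=16)   # Xeon Phi

node_gflops = 2 * cpu + 3 * mic
print(f"CPU {cpu:.1f} GF, MIC {mic:.1f} GF, node {node_gflops / 1000:.3f} TF")
```

With these assumed values the three accelerators contribute close to 88% of the node's peak.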

Page 33: Perspective on Extreme Scale Computing in China

Interconnection network (TH-2)

Fat-tree topology using 13 576-port top level switches
Optical-electronic hybrid transport technology
Proprietary network protocol

[Figure: compute nodes connected to 576-port switches 0 through 12]

Page 34: Perspective on Extreme Scale Computing in China

Interconnection network (TH-2)

High radix router ASIC: NRC
  Feature size: 90nm
  Die size: 17.16mm x 17.16mm
  Package: FC-PBGA, 2577 pins
  Throughput of single NRC: 2.56Tbps
Network interface ASIC: NIC
  Same feature size and package
  Die size: 10.76mm x 10.76mm
  675 pins, PCI-E G2 16X

Page 35: Perspective on Extreme Scale Computing in China

High density system design (SW/BL)

Computing node: basic element, one processor + memory
Node complex: high density assembly, 2 computing nodes + network interface
Supernode: 256 nodes (processors), tightly coupled interconnect
Cabinet: 1024 computing nodes (4 supernodes)

[Figure: packaging hierarchy from multi/many-core processor to computing node, node complex, supernode, and system]
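
The packaging hierarchy translates directly into unit counts for a given machine size. The sketch below is a hypothetical helper, using the per-level sizes from this slide and the 8704-processor total quoted for the delivered Sunway BlueLight system.

```python
import math

NODES_PER_COMPLEX = 2       # node complex: 2 computing nodes
NODES_PER_SUPERNODE = 256   # supernode: 256 nodes
NODES_PER_CABINET = 1024    # cabinet: 4 supernodes


def packaging(total_nodes: int) -> dict:
    """Number of units needed at each packaging level (rounded up)."""
    return {
        "node_complexes": math.ceil(total_nodes / NODES_PER_COMPLEX),
        "supernodes": math.ceil(total_nodes / NODES_PER_SUPERNODE),
        "cabinets": math.ceil(total_nodes / NODES_PER_CABINET),
    }


print(packaging(8704))
```

For 8704 nodes this gives 34 full supernodes, which fill eight and a half cabinets, so nine cabinets are needed.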

Page 36: Perspective on Extreme Scale Computing in China

Low power design

Low power design at different levels
  Low power processors
  Low power interconnect
  Highly efficient cooling
  Highly efficient power supply
Low power management
  Fine-grain real-time power consumption monitor
  System status sensing
  Multi-layer power consumption control
Low power programming
  Default system tools like debugging and tuning?
  Code power consumption modeling
  Sampling the code power consumption as code performance
  Feedback to programming

Page 37: Perspective on Extreme Scale Computing in China

Power supply (SW/BL)

DC UPS
Conversion efficiency 77%
Highly reliable
Power monitoring associated

[Figure: SW/BL power supply chain, AC 10kV to AC 240V to DC 300V to DC 12V to 0.9V at the many-core processor. Labels: power distribution, primary power, in-cabinet secondary power; dual AC inputs with switchover; high-voltage phase-shifting transformer (10kV to 240V); 12-phase conversion; controlled rectifier with N+1 backup; DC UPS; 12V main power DC/DC modules (TDK-Lambda) with N+1 hot backup; board-level power supply]

[Figure: machine room power distribution diagram, dated 6-Feb-2013. Labels: UPS units, 1000A buses, dual-power transfer switches, peripheral equipment distribution panels WPD1 to WPD6 (100kW, 100kW, 90kW each group)]

Page 38: Perspective on Extreme Scale Computing in China

Efficient Cooling (TH-2)

Close-coupled chilled water cooling
Customized Liquid Cooling Unit (LCU)
  High cooling capacity: 80kW
Use city cooling system to supply cooling water to LCUs

Page 39: Perspective on Extreme Scale Computing in China

Efficient Cooling (SW/BL)

Water cooling to the board (node complex)
Energy-saving
Environment-friendly
High room temperature
Low noise

Page 40: Perspective on Extreme Scale Computing in China

HW/SW coordination

Using a combination of hardware and software technologies to address the technical issues
Achieving performance while maintaining flexibility
  Compilation support
  Parallel programming framework
  Performance tools
HW/SW coordinated reliability measures
  User level checkpointing
  Redundancy based reliability measure
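
As an illustration of user level checkpointing, here is a minimal single-process sketch. It is not the mechanism used on any of the systems in the talk: the function names, the JSON file format, and the checkpoint interval are all assumptions chosen to show the idea of saving application state periodically and resuming from the last saved state after a failure.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write state atomically: write a temp file, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename, so a crash never leaves a partial file


def load_checkpoint(path: str, default: dict) -> dict:
    """Return the last saved state, or the default if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path: str, total_steps: int, checkpoint_every: int = 10) -> int:
    """Resume from the last checkpoint and compute sum(range(total_steps))."""
    state = load_checkpoint(path, {"step": 0, "value": 0})
    while state["step"] < total_steps:
        state["value"] += state["step"]   # stand-in for real computation
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["value"]
```

A restarted job calls `run` with the same path and continues from the last saved step instead of step 0. The checkpoint interval trades I/O overhead against the amount of work lost per failure.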

Page 41: Perspective on Extreme Scale Computing in China

Software stack of TH-2

Page 42: Perspective on Extreme Scale Computing in China

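Compiler for many-core

Features
  Support C, Fortran and SIMD extension
  Libc for computing kernel
  Support storage hierarchy
  Programming model for many-core acceleration
  Collaborative cache data prefetch
  Instruction prefetch optimization
  Static/dynamic instruction scheduling optimization

[Figure: structure of the base compiler for heterogeneous many-core integration. Labels: front end (C, C++, Fortran, SIMD); conventional optimizations (interprocedural, loop nest, global, …); optimizations for heterogeneous many-core (programming model and collaborative optimization, memory access optimization, multi-level register allocation, combined static and dynamic scheduling, data access instruction proxy optimization, hot function reordering and padding, lightweight dynamic allocation of local memory, many-core thread scheduling); intermediate representation conversion and code generation, with machine descriptions and assembly code generation for computing cores and for computing-control cores; accelerated thread support library (thread scheduling/control, thread creation/reclamation, asynchronous/mask support, interrupt/exception management); accelerated thread libraries and base libraries for both core types; assembler, linker, disassembler; heterogeneous program loader (pure computing-control core mode, heterogeneous hybrid mode, pure computing core mode); SBM; D cache]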

Page 43: Perspective on Extreme Scale Computing in China

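Basic math lib for many-core

Basic math lib based on many-core structure
  Basic function lib
  SIMD extended function lib
  Fortran function lib
Technical features
  Standard function call interface
  Customized optimization
  Support accuracy analysis

[Figure: basic math library system. Labels: basic function lib, SIMD extended function lib, Fortran function lib; trigonometric, hyperbolic, exponential, logarithmic, numerical operation, numerical processing, classification, Bessel, and error functions; basic algorithms and SIMD algorithms; performance optimization, floating-point exception control, precision control; ISO C99 math function interface specification; IEEE 754]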

Page 44: Perspective on Extreme Scale Computing in China

Parallel OS

Technical features
  Unified architecture for heterogeneous many-cores
  Low overhead virtualization
  Highly efficient resource management

Page 45: Perspective on Extreme Scale Computing in China

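Parallel application development platform

Covering program development, testing, tuning, parallelization and code translation
Collaborative tuning framework
Tools for parallelism analysis and parallelization
Integrated translation tools for multiple source codes

[Figure: collaborative development and tuning framework. Labels: automatic management (project, file, template, configuration); development scenario (editor, compilation, execution, and environment management); parallel debugging for multiple programming models (SWGDB, static and dynamic debugging modes, micro-architecture-level command environment); integrated tuning (algorithm language, parallel language, base language; parameterized tuning, data collection, iterative, joint, and strategy optimization, automatic SIMD vectorization, integrated tuning model, performance monitoring); extended functions (parallelism identification and automatic parallelization, binary translation); application service support (application service middleware, user management, authorization, container, data; compile and execute services, parallel/base compilation, job execution management, debugging and tuning services, engine services with model plug-ins and instance management); help system]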

Page 46: Perspective on Extreme Scale Computing in China

Parallel programming framework

Hide the complexity of programming millions of cores

Integrate highly efficient implementations of fast parallel algorithms

Provide efficient data structures and solver libraries

Support software engineering concept for code extensibility.

Page 47: Perspective on Extreme Scale Computing in China

High Performance Computing Applications Infrastructure

[Figure: the infrastructure shown as middleware between applications (materials, climate, nuclear energy, …) and supercomputers, with machines growing 100 times from peta-scale to 100 PFlops. Programming wall: think parallel, write sequential]

Page 48: Perspective on Extreme Scale Computing in China

Infrastructure: four types of computing

Structured mesh
  JASMIN: J Adaptive Structured Meshes applications INfrastructure (parallel adaptive structured mesh software framework)
Unstructured mesh
  JAUMIN: J Adaptive Unstructured Meshes applications INfrastructure (parallel adaptive unstructured mesh software framework)
Finite element
  PHG: Parallel Hierarchical Grid infrastructure (parallel adaptive finite element software platform)
Combinatory geometry
  JCOGIN: J mesh-free COmbinatory Geometry INfrastructure (parallel 3D mesh-free combinatory geometry software framework)

Page 49: Perspective on Extreme Scale Computing in China

Reliability design

High-quality components, strict screening test
Water cooling to prolong the lifetime of components
High density assembly to reduce the length of wires and improve data transfer reliability
Multiple error correction codes to deal with instantaneous errors
Redundant design for memory, computing node, networks, I/O, power supply, and water cooling

Page 50: Perspective on Extreme Scale Computing in China

Hardware monitoring (SW/BL)

Basis for reliability, availability, maintainability of the system
Monitor major components
Maintenance
Diagnosis
Dedicated management network

[Figure: SW/BL monitoring network. Labels: system consoles, environment monitoring, emergency system, 10-Gigabit main switch, monitoring switches, Ethernet switching modules, maintenance controllers, maintenance service cards; computing supernodes and interconnect network plug-ins (IBA switches) with boards carrying CPUs, Ethernet switching, an Ethernet controller, ARM, FPGA, and BMC with parallel and serial ports]

Page 51: Perspective on Extreme Scale Computing in China

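High availability (SW/BL)

SW/HW coordinated multi-level fault-tolerant architecture
Local fault suppression, fault isolation, faulty component replacement, fault recovery

[Figure: multi-level fault-tolerance architecture. Labels: application layer (user applications); controlled fault-tolerance measures (active fault tolerance, passive fault tolerance, save and restore, job rollback, job degradation, startup fault tolerance, service repair, dual-machine takeover, RAC, active migration); control layer (fault-tolerance master control, plug-in environment, fault-tolerance plug-ins); system information base (fault data, early warning information, fault-tolerance policies); basic support (system maintenance, heartbeat detection, RAS); hardware system (nodes, network)]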

Page 52: Perspective on Extreme Scale Computing in China

Delivered system: TH-1A

Tianhe: “Galaxy” in Chinese
Hybrid architecture: CPU & GPU
Peak performance 4.7PF
Linpack 2.57PF
Power consumption 4.04MW

Items          Configuration
Processors     14,336 Xeon CPUs + 7,168 NVIDIA GPUs + 2,048 FT CPUs
Memory         262TB in total
Interconnect   Proprietary high-speed interconnect network
Storage        Global shared parallel storage system, 2PB
Racks          120 compute racks + 14 storage racks + 6 communication racks

Page 53: Perspective on Extreme Scale Computing in China

Delivered system: Dawning 6000

Hybrid system
  Service unit (Nebulae)
    3PF peak performance
    1.27PF Linpack performance
    2.6MW
  Computing unit
    Experiment on using Loongson processor

Page 54: Perspective on Extreme Scale Computing in China

Delivered system: Sunway BlueLight

Installed in September 2011 at the National Supercomputing Center in Jinan
Implemented completely with domestic 16-core ShenWei processor SW1600
8704 ShenWei processors in total
Peak performance: 1.07PFlops (with 8196 processors)
Linpack performance: 796TFlops (with 8196 processors)
Power consumption: 1074kW (during Linpack execution)
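
The figures quoted for the three delivered systems can be compared on Linpack efficiency and energy efficiency. The sketch below uses only numbers from the slides. One assumption is labeled in the code: the power figures for TH-1A and Dawning 6000 are treated as power under Linpack load, which the slides state explicitly only for Sunway BlueLight.

```python
def linpack_efficiency(peak_tflops: float, linpack_tflops: float) -> float:
    """Fraction of theoretical peak achieved on Linpack."""
    return linpack_tflops / peak_tflops


def mflops_per_watt(linpack_tflops: float, power_kw: float) -> float:
    """Energy efficiency in MFlops/W (1 TFlops = 1e6 MFlops, 1 kW = 1e3 W)."""
    return (linpack_tflops * 1e6) / (power_kw * 1e3)


# (peak TFlops, Linpack TFlops, power kW) as quoted in the slides.
# Assumption: the power figures correspond to Linpack runs for all three.
systems = {
    "TH-1A": (4700, 2570, 4040),
    "Dawning 6000": (3000, 1270, 2600),
    "Sunway BlueLight": (1070, 796, 1074),
}

for name, (peak, linpack, kw) in systems.items():
    eff = linpack_efficiency(peak, linpack)
    mfw = mflops_per_watt(linpack, kw)
    print(f"{name:17s} efficiency {eff:6.1%}  {mfw:7.1f} MFlops/W")
```

On these numbers the homogeneous Sunway BlueLight reaches the highest fraction of its peak, while all three systems remain well below the 5GF/W level.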

Page 55: Perspective on Extreme Scale Computing in China

Delivered system: TH-2

Page 56: Perspective on Extreme Scale Computing in China

TH-2 specifications

Hybrid Architecture Xeon CPU & Xeon Phi

Page 57: Perspective on Extreme Scale Computing in China

Application service environment

Page 58: Perspective on Extreme Scale Computing in China

China National Grid (CNGrid)

14 sites
  SCCAS (Beijing, major site)
  SSC (Shanghai, major site)
  NSC-TJ (Tianjin)
  NSC-SZ (Shenzhen)
  NSC-JN (Jinan)
  Tsinghua University (Beijing)
  IAPCM (Beijing)
  USTC (Hefei)
  XJTU (Xi’an)
  SIAT (Shenzhen)
  HKU (Hong Kong)
  SDU (Jinan)
  HUST (Wuhan)
  GSCC (Lanzhou)
The CNGrid Operation Center (based on SCCAS)

Page 59: Perspective on Extreme Scale Computing in China

CNGrid sites   CPU/GPU        Storage
SCCAS          157TF/300TF    1.4PB
SSC            200TF          600TB
NSC-TJ         1PF/3.7PF      2PB
NSC-SZ         716TF/1.3PF    9.2PB
NSC-JN         1.1PF          2PB
THU            104TF/64TF     1PB
IAPCM          40TF           80TB
USTC           10TF           50TB
XJTU           5TF            50TB
SIAT           30TF/200TF     1PB
HKU            23TF/7.7TF     130TB
SDU            10TF           50TB
HUST           3TF            22TB
GSCC           13TF/28TF      40TB

[Figure: map of the 14 CNGrid sites]
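
The per-site figures can be summed to check the aggregate numbers quoted for CNGrid (more than 3PF of computing power and more than 15PB of storage). The sketch below copies the CPU and storage values from the site table, converted to TF and TB; reading the first number of each "CPU/GPU" pair as the CPU figure is an assumption.

```python
# (site, CPU TFlops, storage TB), copied from the site table.
# Assumption: the first value of each "CPU/GPU" pair is the CPU figure.
sites = [
    ("SCCAS", 157, 1400), ("SSC", 200, 600), ("NSC-TJ", 1000, 2000),
    ("NSC-SZ", 716, 9200), ("NSC-JN", 1100, 2000), ("THU", 104, 1000),
    ("IAPCM", 40, 80), ("USTC", 10, 50), ("XJTU", 5, 50),
    ("SIAT", 30, 1000), ("HKU", 23, 130), ("SDU", 10, 50),
    ("HUST", 3, 22), ("GSCC", 13, 40),
]

cpu_total_tf = sum(cpu for _, cpu, _ in sites)
storage_total_tb = sum(tb for _, _, tb in sites)

print(f"{len(sites)} sites, CPU total {cpu_total_tf / 1000:.2f} PF, "
      f"storage total {storage_total_tb / 1000:.2f} PB")
```

The CPU-only sum is consistent with the ">3PF" aggregate, and the storage sum with ">15PB".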

Page 60: Perspective on Extreme Scale Computing in China

CNGrid GOS Architecture

[Figure: layered CNGrid GOS architecture. Labels, bottom to top: PC server (Grid server); OS (Linux/Unix/Windows); Java J2SE (1.4.2_07, 1.5.0_07); hosting environment with Tomcat (5.0.28) + Axis (1.2 rc2), GT4, gLite, OMII, and Axis handlers for message level security; core, system, and application level services (Agora with user, resource, and Agora management; naming; security; resource access control and sharing; Grip runtime and Grip instance management; CA service; message service; dynamic deploy service; batch job management; metaschedule; account management; file management; metainfo management; DB service; workflow engine; service controller; HPCG backend; resource space); GOS system calls (resource, Agora, user, Grip management, etc.) and GOS library (batch, message, file, etc.); tools and applications (Grid portal, system management portal, HPCG application and management portal, GSML browser, GSML composer, GSML workshop, Gsh and command line tools, VegaSSH, IDE, compiler, debugger, GridWorkflow, DataGrid, other domain specific applications, other third-party software and tools)]

Page 61: Perspective on Extreme Scale Computing in China

CNGrid GOS deployment

CNGrid GOS deployed on 11 sites and some application Grids
Support heterogeneous HPCs: Galaxy, Dawning, DeepComp
Support multiple platforms: Unix, Linux, Windows
Using public network connection, with only the HTTP port enabled
Flexible client
  Web browser
  Special client
  GSML client

Page 62: Perspective on Extreme Scale Computing in China

CNGrid: Resources

14 sites
>3PF aggregated computing power
>15PB storage

Page 63: Perspective on Extreme Scale Computing in China

CNGrid: Service and Users

>450 services
>2800 users
  China Commercial Aircraft Corp
  Bao Steel
  Automobile
  Institutes of CAS
  Universities
  ……

Page 64: Perspective on Extreme Scale Computing in China

CNGrid : applications

Supporting >700 projects
  973, 863, NSFC, CAS Innovative, and Engineering projects

Page 65: Perspective on Extreme Scale Computing in China

Application Villages

Support domain applications
  Industrial product design optimization
  New drug discovery
  Digital media
Introducing the Cloud Computing concept
  CNGrid as IaaS and partially PaaS
  Application villages as SaaS and partially PaaS
Build up business models for HPC applications

Page 66: Perspective on Extreme Scale Computing in China

Applications

Page 67: Perspective on Extreme Scale Computing in China

CNGrid applications

Page 68: Perspective on Extreme Scale Computing in China

Grid applications
  Drug discovery
  Weather forecasting
  Scientific data Grid and its application in research
  Water resource information system
  Grid-enabled railway freight information system
  Grid for Chinese medicine database applications
  HPC and Grid for aerospace industry (AviGrid)
  National forestry project planning, monitoring and evaluation

Page 69: Perspective on Extreme Scale Computing in China

HPC applications
  Computational chemistry
  Computational astronomy
  Parallel program for large fluid machinery design
  Fusion ignition simulation
  Parallel algorithms for bio- and pharmacy applications
  Parallel algorithms for weather forecasting based on GRAPES
  10000+ core scale simulation for aircraft design
  Seismic imaging for oil exploration
  Parallel algorithm libraries for PetaFlops systems

Page 70: Perspective on Extreme Scale Computing in China

China’s status in the related fields

Significant progress in developing HPC systems and HPC service environment
Lack of long-term strategic study and plan
Still far behind in many aspects
  Lack of kernel technologies
    Processors, memory, interconnect, system software, algorithms…
  Especially weak in applications
    Need multi-disciplinary research
    Shortage of cross-disciplinary talent
Sustainable development is crucial
  Lack of regular budget for e-Infrastructure
  Always competing for funding with other disciplines

Page 71: Perspective on Extreme Scale Computing in China

Pursuing international Cooperation

We wish to cooperate with international HPC communities
Joint work on grand challenge problems
  Climate change
  New energy
  Environment protection
  Disaster mitigation
Jointly address challenges towards extreme scale systems
  Low power system design and implementation
  Performance obtained by applications
  Heterogeneous system programming
  Resilience of large systems

Page 72: Perspective on Extreme Scale Computing in China

Thank you!