TRANSCRIPT
HITACHI All Rights Reserved, Copyright © 2001, Hitachi, Ltd.
Overview of Hitachi’s
Super Technical Server SR8000
Hitachi, Ltd. Enterprise Server Division
March, 2001
Yoshiro Aihara
The Third International Workshop on Next Generation Climate Models
HITACHI Supercomputers

[Figure: timeline of Hitachi supercomputers by announcement year ('77 to '01) against peak performance (FLOPS), from 0.01 GFLOPS to 10 TFLOPS]

- IAP (Integrated Array Processor) systems: M-200H IAP, M-280H IAP, M-680 IAP
- Vector type: S-810 Series (first Japanese vector supercomputer), S-820 Series (single-CPU peak performance 3 GFLOPS), S-3000 Series (single-CPU peak performance 8 GFLOPS, fastest in the world)
- RISC parallel: SR2201 Series (first commercially available distributed-memory parallel processor)
- Advanced RISC parallel: SR8000, a new-concept machine for advanced HPC users (combination of parallel and vector)
Design Concept of SR8000

Target of design and Hitachi's solution:
- High single-node performance: RISC-based processor (Hitachi developed) with the PVP feature (vector processing), the COMPAS feature (element parallel processing), and high memory throughput
- High scalability: multi-dimensional crossbar network (high-speed inter-node network)
- Short development cycle, easy enhancement

PVP: Pseudo Vector Processing
COMPAS: Co-operative Micro-Processors in single Address Space

SR8000: a new concept combining the advantages of the vector processor and the RISC parallel processor.
Basic Configuration of SR8000

Each node contains high-performance RISC microprocessors (Hitachi developed) with pseudo-vector processing, operated cooperatively through COMPAS (CO-operative Micro-Processors in single Address Space), together with main memory and a system processor (SP). Nodes are connected by the high-speed inter-node network (multi-dimensional crossbar network) through a network controller. System control and PCI I/O adapters attach the external devices: RAID disk, Ethernet/ATM/HiPPI interfaces, and the SVP with its I/O devices and MCD.

SVP: SerVice Processor
MCD: Maintenance Console Device
SP: System Processor
Multi-dimensional Crossbar Network

Example: an 8 x 8 x 2 (128-node) configuration, with 8 nodes along each X-axis crossbar, 8 nodes along each Y-axis crossbar, and 2 nodes along the Z axis.
SR8000 Hardware Specification

SYSTEM
Number of nodes                  |    4 |    8 |    16 |    32 |    64 |   128 |    256 |    512
Peak performance (GFLOPS)
  SR8000                         |   32 |   64 |   128 |   256 |   512 |  1024 |     -  |     -
  SR8000 Model E1                | 38.4 | 76.8 | 153.6 | 307.2 | 614.4 | 1228.8 | 2457.6 | 4915.2
  SR8000 Model F1                |   48 |   96 |   192 |   384 |   768 |  1536 |   3072 |   6144
  SR8000 Model G1                | 57.6 | 115.2 | 230.4 | 460.8 | 921.6 | 1843.2 | 3686.4 | 7372.8
Maximum total memory capacity (GB)
  SR8000                         |   32 |   64 |   128 |   256 |   512 |  1024 |     -  |     -
  Models E1/F1/G1                |   64 |  128 |   256 |   512 |  1024 |  2048 |   4096 |   8192
Inter-node network
  SR8000: one-, two-, or three-dimensional crossbar, depending on configuration (up to 128 nodes)
  Models E1/F1/G1: one-, two-, or three-dimensional crossbar, depending on configuration
Inter-node transfer speed
  SR8000:   1 GB/s (single direction) x 2
  Model E1: 1.2 GB/s (single direction) x 2
  Model F1: 1 GB/s (single direction) x 2
  Model G1: 1.6 GB/s (single direction) x 2

NODE
Peak performance: SR8000 8 GFLOPS; Model E1 9.6 GFLOPS; Model F1 12 GFLOPS; Model G1 14.4 GFLOPS
Memory capacity: SR8000 2 GB / 4 GB / 8 GB; Models E1/F1/G1 2 GB / 4 GB / 8 GB / 16 GB
External interface: Ultra SCSI, Ethernet/Fast Ethernet, Gigabit Ethernet, ATM, HIPPI, Fibre Channel
Pseudo-Vector Processing (PVP)

Problems of the conventional RISC processor:
- Performance drops on large-scale simulations because of cache overflow
- Sustained performance: under 10% of peak

PVP features:
- Prefetch: reads data from main memory into the cache before calculation; accelerates sequential data access
- Preload: reads data from main memory into the extended floating-point registers (160 registers) before calculation; accelerates strided and indirectly addressed memory access

[Figure: data paths from main memory to the FPUs, via the cache (prefetch, then load) and directly via the extended floating-point registers (preload)]
COMPAS Feature of SR8000

COMPAS: CO-operative Micro-Processors in single Address Space

Program behavior: a scalar part runs on a single IP; a Start Parallel instruction wakes the remaining IPs (which are waiting for startup), the loop part runs on all IPs in parallel, and an End Parallel instruction returns control to the single IP for the next scalar part.

Hardware feature: the IPs share main storage (MS) through the storage controller (SC), and a hardware high-speed communication mechanism realizes high-speed parallel processing startup and synchronization across the multiple processors.

COMPAS thus realizes the elementwise parallel processing of DO loops, as employed in vector supercomputers, with the multiple processors in a node (the compiler performs automatic elementwise parallelization within a node).

IP: Instruction Processor; SC: Storage Controller; MS: Main Storage
Programming Models

Hardware | Programming model | Example
Single CPU | Pseudo-vector processing | Vector application
Single node | Independent processing on each IP | Compilation, parallel make
Single node | Message passing | MPP application
Single node | DO-loop distribution with COMPAS | Vector application
Multiple nodes | Parallel processing of independent blocks of code |
Multiple nodes | Message passing | MPP application
Multiple nodes | COMPAS and message passing | Vector parallel application
Physical Data of SR8000

Example: 128-node configuration (Model G1)
- Power consumption: approx. 370 kVA
- Heat dissipation: approx. 330 kW
- Cooling-air inlet temperature: 16 to 22 deg C
- Weight: approx. 15,000 kg
- Floor space: approx. 50 sq. meters (incl. service area)
- Footprint (128 nodes): approx. 3.3 m x 8.0 m; height approx. 1.8 m
Overview of Software Products

- OS: HI-UX/MPP (OSF/1 microkernel-based OS); NQS, BGT, DIFF, SFF, PFF
- Language processors: Optimizing FORTRAN77/90, HPF, Optimizing C, C++, OpenMP (Ver 1)
- Parallel libraries: MPI-2, PVM, PARALLELWARE
- Numerical calculation: MATRIX/MPP, MATRIX/MPP/SSS, MSL2
- Program development / development support: symbolic debugger, Optimizing C/FORTRAN90, performance monitor (for HP-UX)
- GUI: X11R6, Motif 1.2
- Graphic libraries: GKS, PEX, PHIGS
- Network: Ethernet/Fast Ethernet, GbE, HiPPI, ATM; TCP/IP, NFS V3, telnet, rlogin
Single UNIX System

- Single system operation (file system, process control, network)
- Open system (standardized OS, compiler, network)
- Flexible system operation (partitioning operation, automatic operation)
- Scalable system (4 to 512 nodes)

Within each node, COMPAS (CO-operative Micro-Processors in single Address Space) is supported by a micro-kernel that controls all IPs, while a UNIX (OSF/1) server co-operates functionally with the other nodes, so the whole machine appears as a single UNIX system.

[Figure: nodes (each containing IPs, an SP, and main storage) connected by the 3D crossbar network, with HIPPI and Ethernet links to SR2000 Series, 3500 Series, and H-9000V Series machines and to other vendors' systems (SGI, etc.), plus disks, RAID, graphics, a console, workstations, PCs, and X terminals]

IP: Instruction Processor
3D-XB: 3-dimensional crossbar network
Remote DMA Transfer

- Normal transfer: data is memory-copied from the sending program into a send buffer in the OS, crosses the crossbar network into a receive buffer in the receiving node's OS, and is then memory-copied into the receiving program. This path involves protocol processing, context switches, and interrupt handling.
- Remote DMA transfer: a direct memory copy between user programs on different nodes, with no buffering in the kernel and no OS system call, which minimizes OS overhead.
Examples of ISV Packages

- Libraries: IMSL, NAG
- Structural analysis: MSC.Nastran, MSC.Marc, LS-DYNA, PAM-CRASH, ABAQUS/Standard, ABAQUS/Explicit
- Computational fluid dynamics: STAR-CD, PHOENICS, SCRYU, STREAM, FLUENT
- Chemical analysis: GAUSSIAN98, AMBER
- Tools: AVS/EXPRESS, TotalView, Vampir
SR8000 Installation Sites (Example)

- Leibniz Rechenzentrum (Germany)
- High Energy Accelerator Research Organization
- University of Tokyo
- Japan Meteorological Agency
- University of Tokyo / Institute for Solid State Physics
- Tsukuba Advanced Computing Center (TACC) / AIST
- Meteorological Research Institute
- Hokkaido University
- Institute of Statistical Mathematics
- HWW / Universitat Stuttgart & DLR (Germany)
TOP500 Supercomputing Sites - November 3rd, 2000

Rmax/Rpeak > 75%: the Hitachi SR8000 works efficiently.

Rank | Manufacturer | Computer | Rmax (GFlops) | Installation Site | Country | Year | Area of Installation | # Proc | Rpeak (GFlops)
1 | IBM | ASCI White, SP Power3 375 MHz | 4938 | Lawrence Livermore National Laboratory | USA | 2000 | Research Energy | 8192 | 12288
2 | Intel | ASCI Red | 2379 | Sandia National Labs | USA | 1999 | Research | 9632 | 3207
3 | IBM | ASCI Blue-Pacific SST, IBM SP 604e | 2144 | Lawrence Livermore National Laboratory | USA | 1999 | Research Energy | 5808 | 3868
4 | SGI | ASCI Blue Mountain | 1608 | Los Alamos National Laboratory | USA | 1998 | Research | 6144 | 3072
5 | IBM | SP Power3 375 MHz | 1417 | Naval Oceanographic Office (NAVOCEANO) | USA | 2000 | Research Aerospace | 1336 | 2004
6 | IBM | SP Power3 375 MHz | 1179 | National Centers for Environmental Prediction | USA | 2000 | Research Weather | 1104 | 1656
7 | Hitachi | SR8000-F1/112 | 1035 | Leibniz Rechenzentrum | Germany | 2000 | Academic | 112 | 1344
8 | IBM | SP Power3 375 MHz 8way | 929 | UCSD/San Diego Supercomputer Center | USA | 2000 | Research | 1152 | 1728
9 | Hitachi | SR8000-F1/100 | 917 | High Energy Accelerator Research Organization / KEK | Japan | 2000 | Research | 100 | 1200
10 | Cray Inc. | T3E1200 | 892 | Government | USA | 1998 | Classified | 1084 | 1300.8
TOP500 Supercomputing Sites - November 3rd, 2000

Rmax/Rpeak = 85.3% on SR8000/128
Rmax/Rpeak = 90.0% on SR8000-E1/80
The Hitachi SR8000 works efficiently.

Rank | Manufacturer | Computer | Rmax (GFlops) | Installation Site | Country | Year | Area of Installation | # Proc | Rpeak (GFlops)
11 | Cray Inc. | T3E1200 | 892 | US Army HPC Research Center at NCS | USA | 2000 | Research | 1084 | 1300.8
12 | Fujitsu | VPP5000/100 | 886 | ECMWF | UK | 2000 | Research Weather | 100 | 960
13 | Hitachi | SR8000/128 | 873 | University of Tokyo | Japan | 1999 | Academic | 128 | 1024
14 | Cray Inc. | T3E900 | 815 | Government | USA | 1997 | Classified | 1324 | 1191.6
15 | IBM | SP Power3 375 MHz | 795 | Charles Schwab | USA | 2000 | Industry Finance | 768 | 1152
16 | IBM | SP Power3 375 MHz | 741 | North Carolina Supercomputing Center (NCSC) | USA | 2000 | Academic | 720 | 1080
17 | IBM | SP Power3 375 MHz | 723 | Oak Ridge National Laboratory | USA | 2000 | Research | 704 | 1056
18 | Hitachi | SR8000-E1/80 | 691.3 | Japan Meteorological Agency | Japan | 2000 | Research Weather | 80 | 768
19 | SGI | ORIGIN 2000 250 MHz | 690.9 | Los Alamos National Laboratory/ACL | USA | 1999 | Research | 2048 | 1024
20 | IBM | SP Power3 375 MHz | 688 | NCAR (National Center for Atmospheric Research) | USA | 2000 | Research | 668 | 1002
SR8000 F1 & G1 LINPACK Performance

[Figure: LINPACK performance (GFlop/s, 0 to 1200) vs. number of nodes (0 to 120)]

SR8000 F1: 80.25 (8 nodes), 159.50 (16 nodes), 313.30 (32 nodes), 605.30 (64 nodes), 917.20 (100 nodes), 1035.00 (112 nodes)
SR8000 G1: 398.50 (32 nodes), 786.95 (64 nodes)

313.30 Gflop/s on SR8000 F1/32 with Nmax = 65000
  ↓ 6% speed-up
331.50 Gflop/s on SR8000 F1/32 with Nmax = 84800
  ↓ 20% speed-up
398.50 Gflop/s on SR8000 G1/32 with Nmax = 84800
NAS Parallel Benchmark (FT)

FT: a 3-D fast Fourier transform partial differential equation benchmark.

[Figure: FT performance (Gflop/s, 0 to 80) vs. number of nodes]

Number of nodes |   1 |    2 |    4 |    8 |   16
SR8000 F1       | 6.5 |  8.4 | 15.5 | 29.5 | 57.2
SR8000 G1       | 8.4 | 10.8 | 19.8 | 38.0 | 74.1

Model G1 is 1.28 to 1.30 times faster than Model F1.
NAS Parallel Benchmark (MG)

MG: a simple 3D multigrid benchmark.

[Figure: MG performance (Gflop/s, 0 to 70) vs. number of nodes]

Number of nodes |   1 |   2 |    4 |    8 |   16
SR8000 F1       | 4.1 | 8.0 | 14.8 | 27.8 | 46.8
SR8000 G1       | 5.0 | 9.8 | 18.0 | 33.8 | 58.2

Model G1 is 1.22 to 1.24 times faster than Model F1.
MPI Ping-Pong Performance

Remote DMA (Direct Memory Access) is sender-driven and makes a memory-to-memory copy of the data. Remote DMA provides a high-speed inter-processor communication function without redundant copying.

[Figure: throughput (MB/s, 0 to 1800) vs. message size (8 bytes to 16,777,216 bytes), for SR8000 F1 and SR8000 G1, each with and without the RDMA option]