06/20/22 Copyright G Bell & TCM History Center 1 Supercomputers(t) Gordon Bell Bay Area Research Center Microsoft Corp. http://research.microsoft.com/users/gbell Photos courtesy of The Computer Museum History Center Please only copy with credit! http://www.computerhistory.org

Post on 15-Jan-2016


Page 1:

Supercomputers(t)

Gordon Bell

Bay Area Research Center

Microsoft Corp.

http://research.microsoft.com/users/gbell

Photos courtesy of The Computer Museum History Center

Please only copy with credit!

http://www.computerhistory.org

Page 2:

Supercomputer

• Largest computer at a given time
• Technical use for science and engineering calculations
• Large government defense, weather, aero laboratories are first buyers
• Price is no object
• Market size is 3-5

Page 3:

Growth in Computational Resources Used for UK Weather Forecasting

[Chart: computational resources on a log scale from 10 to 10T, plotted from 1950 to 2000; machines shown: Leo, Mercury, KDF9, 195, 205, YMP]

10^10 / 50 yrs = 1.58^50
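The growth figure on this slide can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the slide means a 10^10 increase in computational resources over 50 years; the function name `annual_growth_factor` is introduced here for illustration and is not from the talk.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Return the constant yearly multiplier that compounds to total_factor."""
    return total_factor ** (1.0 / years)


# Figures read off the slide: a 10^10 increase over 50 years.
uk_weather_rate = annual_growth_factor(1e10, 50)
print(f"annual growth factor: {uk_weather_rate:.3f}")
```

Compounding that yearly factor for 50 years recovers the 10^10 total, which is what the slide's 1.58^50 expresses.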

Page 4:

What a difference 25 years and spending >10x more makes!

LLNL 150 Mflops machine room c1978

Artist’s view of 40 Tflops ESRDC c2002
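To put the two captions side by side, the sketch below estimates the speedup and the implied doubling time. It assumes the dates on the slide (c1978 and c2002) are exact, which they are not; `doubling_time_years` is a hypothetical helper name.

```python
import math


def doubling_time_years(start: float, end: float, years: float) -> float:
    """Years per doubling, assuming steady exponential growth from start to end."""
    return years / math.log2(end / start)


llnl_1978_flops = 150e6    # 150 Mflops machine room, c1978
esrdc_2002_flops = 40e12   # 40 Tflops, c2002

speedup = esrdc_2002_flops / llnl_1978_flops
doubling = doubling_time_years(llnl_1978_flops, esrdc_2002_flops, 2002 - 1978)
print(f"speedup: {speedup:,.0f}x, doubling roughly every {doubling:.2f} years")
```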

Page 5:

Harvard Mark I aka IBM ASCC

Page 6:

“I think there is a world market for maybe five computers.”

Thomas Watson Senior, Chairman of IBM, 1943

Page 7:

The scientific market is still about that size… 3 computers

• When scientific processing was 100% of the industry a good predictor
• $3 Billion: 6 vendors, 7 architectures
• DOE buys 3 very big ($100-$200 M) machines every 3-4 years

Page 8:

Supercomputer price (t)

Time   $M      structure              example
1950   1       mainframes             many...
1960   3       instruction //sm       IBM / CDC
               mainframe SMP
1970   10      pipelining             7600 / Cray 1
1980   30      vectors; SCI           “Crays”
1990   250     MIMDs: mC, SMP, DSM    “Crays”/MPP
2000   1,000   ASCI, COTS MPP         Grid, Legion
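The price column can be turned into growth rates. The sketch below is illustrative only: it takes the $M figures from the table at face value and computes the multiple from each decade to the next plus the overall yearly rate; the variable names are assumptions introduced here.

```python
# Supercomputer price in $M by decade, as listed in the table.
price_millions = {1950: 1, 1960: 3, 1970: 10, 1980: 30, 1990: 250, 2000: 1000}

years = sorted(price_millions)

# Price multiple from each decade to the next, keyed by the later decade.
decade_multiples = {
    later: price_millions[later] / price_millions[earlier]
    for earlier, later in zip(years, years[1:])
}

# Constant yearly multiplier that takes $1M (1950) to $1,000M (2000).
overall_annual = (price_millions[2000] / price_millions[1950]) ** (1 / 50)

for decade, multiple in decade_multiples.items():
    print(f"{decade}: {multiple:.1f}x the previous decade")
print(f"overall: {overall_annual:.3f}x per year")
```

On these numbers the largest single jump is 1980 to 1990, the decade in which the table lists the move to MIMDs.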

Page 9:

Supercomputing: speed at any price, using parallelism

Intra processor
• Memory overlap & instruction lookahead
• Functional parallelism (2-4)
• Pipelining (10)
• SIMD à la ILLIAC 2d array of 64 PEs vs vectors
• Wide instruction word (2-4)
• MTA (10-20)

MIMDs… processor replication
• SMP (4-64)
• Distributed Shared Memory SMPs 100

MIMD… computer replication
• Multicomputers aka MPP aka clusters (10K)
• Grid: 100K

Page 10:

High performance architectures timeline

1950 . 1960 . 1970 . 1980 . 1990 . 2000

Vtubes Trans. MSI(mini) Micro RISC nMicr

Processor overlap, lookahead “killer micros”

Cray era 6600 7600 Cray1 X Y C T

Vector-----SMP---------------->

SMP mainframes---> “multis”----------->

DSM KSR SGI---->

Clusters Tandem VAX IBM UNIX->

MPP if n>1000 Ncube Intel IBM->

Networks n>10,000 NOW Grid

Page 11:

High performance architectures timeline

1950 . 1960 . 1970 . 1980 . 1990 . 2000

Vtubes Trans. MSI(mini) Micro RISC nMicr

Sequential programming---->------------------------------

<SIMD Vector--//---------------

Parallelization---

Parallel programming <---------------

multicomputers <--MPP era------

ultracomputers 10X in price 10xMPP

“in situ” resources 100x in //sm NOW VLC

Grid

Page 12:

Time line of HPCC contributions

1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 2005 2010

Processors
IBM Interleaving, overlap, Instruction lookahead
CDC/Cray/Supers 6600 7600 Vector
DEC mini Alpha
Intel 8008 8086,8 286 386 486 Ppro P2,3, Merced
RISC and "the killer micros" RISC
VLIW Cydrome & Multiflow XXX
SIMD Illiac IV CM1 CM2 Maspar XXX
Multi-threaded Architecture Denelcor? Tera MTA ????????????

Multiprocessors
SMP cabinet mainframes Burroughs, Univac, IBM, etc.-----------------
SMP "multis" Multis=Sequent, Encore, etc. -------------
SMP on a chip X-------------------
SMPv. Cray, NEC, Fujitsu, Hitachi XMP YMP C T ---------- ??????
Distributed Shared Memory KSR Origin numa----- ??????
Shared address multicomputers BBN T3D T3E

Multicomputers aka clusters aka MPP
Clusters of minis or mainframes Tandem VAX Clustr Sysplex UNIX ---------------------
MPPs: Intel, Thinking Machines, IBM CalT Ncube Beowulf------------------
Workstation clusters UC/B NOW etc.------- ??????
NOW worldwide Grid-------- ??????

Page 13:

Time line of HPCC contributions

1955 1960 1965 1970 1975 1980 1985 1990 1995 2000 2005 2010

Processors
IBM Stretch 360 370 G
CDC 1604 6600 7600 Cray 1
DEC PDP 8 PDP11 VAX Alpha
Intel 8008 8086,8 286 386 486 Ppro P2,3, Merced
RISC all MIPS/Ppc/Sparc
VLIW Cydrome & Multiflow XXX
SIMD Illiac IV CM1 CM2 Maspar
Multi-threaded Architecture Denelcor? Tera MTA

Multiprocessors
SMP B5000, Univac, etc. Multis=Sequent, Encore, etc.
SMP.IBM 8090 ….
SMP.SUN 10K
SMPv.Cray XMP YMP C T
SMPv.NEC SX 1…… 5
DSM SUN SUN NUMA
DSM.SGI/Cray KSR Origin numa
T3D T3E

Multicomputers aka clusters aka MPP
Clusters Tandem VAX Clustr Sysplex UNIX
Multicomputers CalTech Ncube Beowulf
Intel MPPs iPSC1, 2, Par, Delta 1.Tf 2Tf
Thinking Machines CM1,2, 5
IBM MPP SP1 SP2
NOW UC/B NOW Grid

Page 14:

Lehmer UC/Berkeley pre-computer number sieves

Page 15:

Eniac c1946

Page 16:

Manchester: the first computer. Baby, Mark I, and Atlas

Page 17:

von Neumann computers: Rand Johnniac

Page 18:

Gene Amdahl’s Dissertation and first computer

Page 19:

IBM

Page 20:

IBM Stretch c1961 & 360/91 c1965 consoles!

Page 21:

IBM Terabit Photodigital Store c1967

Page 22:

STC Terabytes of storage c1999

Page 23:

Amdahl aka Fujitsu version of the 360 c1975

Page 24:

IBM ASCI Blue Pacific @ LLNL

Page 25:

CDC, ETA, Cray Research, Cray Computer

Page 26:

Cray 1925-1996

Page 27:

Circuits and Packaging, Plumbing (bits and atoms) & Parallelism… plus Programming and Problems

• Packaging, including heat removal
• High level bit plumbing… getting the bits from I/O, into memory through a processor and back to memory and to I/O
• Parallelism
• Programming: O/S and compiler
• Problems being solved

Page 28:

Seymour Cray Computers

• 1951: ERA 1103 control circuits
• 1957: Sperry Rand NTDS; to CDC
• 1959: Little Character to test transistor ckts
• 1960: CDC 1604 (3600, 3800) & 160/160A
• 1964: CDC 6600 (6xxx series)
• 1969: CDC 7600

Page 29:

Cray Research, Cray Computer Corp. and SRC Computer Corp.

1976: Cray 1... (1/M, 1/S, XMP, YMP, C90, T90)

1985: Cray Computer Cray 2 from Cray Research; GaAs: Cray 3 (1993), Cray 4

1999: SRC Company large scale, shared memory multiprocessor using x86 microprocessors

Page 30:

Cray contributions…

• Creative and productive during his entire career 1951-1996.
• Creator and undisputed designer of supers from c1960 1604 to Cray 1, 1s, 1m c1977… basis for SMPvector: XMP, YMP, T90, C90, 2, 3
• Circuits, packaging, and cooling…
• “the mini” as a peripheral computer
• Use I/O computers versus I/O processors
• Use the main processor and interrupt it for I/O versus I/O processors aka IBM Channels

Page 31:

Cray Contributions

• Multi-threaded processor (6600 PPUs)
• CDC 6600 functional parallelism leading to RISC… software control
• Pipelining in the 7600 leading to...
• Use of vector registers: adopted by 10+ companies. Mainstream for technical computing
• Established the template for vector supercomputer architecture
• SRC Company use of x86 micro in 1996 that could lead to largest, smP?

Page 32:

“Cray” Clock speed (MHz), no. of processors, peak power (Mflops)

[Chart: log scale from 1.E-01 to 1.E+06, plotted from 1960 to 2000]

Page 33:

CDC 1604 & 6600

Page 34:

CDC 7600: pipelining

Page 35:

CDC 8600 Prototype: SMP, scalar, discrete circuits, failed to achieve clock speed

Page 36:

CDC STAR… ETA10

Page 37:

CDC 7600 & Cray 1 at Livermore

Cray 1 CDC 7600

Disks

Page 38:

Cray 1 #6 from LLNL. Located at The Computer Museum History Center, Moffett Field

Page 39:

Cray 1: 150 kW MG set & heat exchanger

Page 40:

Cray XMP/4 Proc. c1984

Page 41:

Cray 2 from NERSC/LBL

Page 42:

Cray 3 c1995 processor
• 500 MHz
• 32 modules
• 1K GaAs ICs/module
• 8 proc.

Page 43:

c1970: Beginning the search for parallelism

SIMDs Illiac IV CDC Star Cray 1

Page 44:

Illiac IV: first SIMD c1970s

Page 45:

SCI (Strategic Computing Initiative)

funded by DARPA and aimed at a Teraflops!

Era of State computers and many efforts to build high speed computers… led to HPCC

Thinking Machines, Intel supers, Cray T3 series

Page 46:

Minisupercomputers: a market whose time never came. Alliant, Convex, Ardent+Stellar= Stardent = 0,

Page 47:

Cydrome and Multiflow: prelude to wide word parallelism in Merced

• Minisupers with VLIW attack the market
• Like the minisupers, they are repelled
• It’s software, software, and software
• Was it a basically good idea that will now work as Merced?

Page 48:

MasPar...

A less costly, CM 1/2 done in silicon chips

It is repelled. S is the fatal flaw

Page 49:

Thinking Machines:

Page 50:

Thinking Machines: CM1 & CM5 c1983-1993

Page 51:

“In Dec. 1995 computers with 1,000 processors will do most of the scientific processing.”

Danny Hillis 1990 (1 paper or 1 company)

Page 52:

The Bell-Hillis Bet: Massive Parallelism in 1995

[Chart: TMC versus world-wide supers, compared on applications, revenue, and petaflops / mo.]

Page 53:

Bell-Hillis Bet: wasn’t paid off!

My goal was not necessarily to just win the bet!

Hennessy and Patterson were to evaluate what was really happening…

Wanted to understand degree of MPP progress and programmability

Page 54:

KSR 1: first commercial DSM NUMA (non-uniform memory access) aka COMA (cache-only memory architecture)

Page 55:

SCI (c1980s): Strategic Computing Initiative funded

ATT/Columbia (Non Von), BBN Labs, Bell Labs/Columbia (DADO), CMU Warp (GE & Honeywell), CMU (Production Systems), Encore, ESL, GE (like connection machine), Georgia Tech, Hughes (dataflow), IBM (RP3), MIT/Harris, MIT/Motorola (Dataflow), MIT Lincoln Labs, Princeton (MMMP), Schlumberger (FAIM-1), SDC/Burroughs, SRI (Eazyflow), University of Texas, Thinking Machines (Connection Machine),

Page 56:

Those who gave their lives in the search for parallelism

Alliant, American Supercomputer, Ametek, AMT, Astronautics, BBN Supercomputer, Biin, CDC, Chen Systems, CHOPP, Cogent, Convex (now HP), Culler, Cray Computers, Cydrome, Denelcor, Elexsi, ETA, E & S Supercomputers, Flexible, Floating Point Systems, Gould/SEL, IPM, Key, KSR, MasPar, Multiflow, Myrias, Ncube, Pixar, Prisma, SAXPY, SCS, SDSA, Supertek (now Cray), Suprenum, Stardent (Ardent+Stellar), Supercomputer Systems Inc., Synapse, Thinking Machines, Vitec, Vitesse, Wavetracer.

Page 57:

NCSA Cluster of 8 x 128 processors SGI Origin c1999

Page 58:

Humble beginning: In 1981… would you have predicted this would be the basis of supers?

Page 59:

Intel’s iPSC 1 & Touchstone Delta

Page 60:

Intel Sandia Cluster 9K PII: 1.8 TF

Page 61:

GB with NT, Compaq, HP cluster

Page 62:

192 HP 300 MHz

64 Compaq 333 MHz

• Andrew Chien, CS UIUC-->UCSD
• Rob Pennington, NCSA
• Myrinet Network, HPVM, Fast Msgs
• Microsoft NT OS, MPI API

“Supercomputer performance at mail-order prices” -- Jim Gray, Microsoft

The Alliance LES NT Supercluster

Page 63:

Intel/Sandia: 9000x1 node Ppro

LLNL/IBM: 512x8 PowerPC (SP2)

LANL/Cray: 6144 CPUs

Maui Supercomputer Center– 512x1 SP2

Our Tax Dollars At Work

ASCI for Stockpile Stewardship

Page 64:

ASCI Blue Mountain 3.1 Tflops SGI Origin 2000

12,000 sq. ft. of floor space

1.6 MWatts of power

530 tons of cooling

384 cabinets to house 6144 CPU’s with 1536 GB (32GB / 128 CPUs)

48 cabinets for metarouters

96 cabinets for 76 TB of raid disks

36 x HIPPI-800 switch Cluster Interconnect

9 cabinets for 36 HIPPI switches

about 348 miles of fiber cable
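The Blue Mountain figures are internally consistent, which a short calculation confirms. This is a sketch that uses only the numbers on this slide; the variable names are introduced here for illustration, and the per-CPU power figure simply divides total power by CPU count, so it includes everything else in the room.

```python
# Figures as listed on the slide.
cpus = 6144
cabinets = 384
cpus_per_computer = 128          # from "32GB / 128 CPUs"
memory_gb_per_computer = 32
peak_flops = 3.1e12              # 3.1 Tflops
power_watts = 1.6e6              # 1.6 MWatts

# Derived quantities.
computers = cpus // cpus_per_computer
total_memory_gb = computers * memory_gb_per_computer
cpus_per_cabinet = cpus // cabinets
peak_mflops_per_cpu = peak_flops / cpus / 1e6
watts_per_cpu = power_watts / cpus

print(f"{computers} computers, {total_memory_gb} GB total memory")
print(f"{cpus_per_cabinet} CPUs per cabinet")
print(f"about {peak_mflops_per_cpu:.0f} Mflops peak and {watts_per_cpu:.0f} W per CPU")
```

The 48 computers derived here agree with the "6 Groups of 8 Computers each" on the LASL ASCI Cluster Interconnect slide, and 48 x 32 GB reproduces the quoted 1536 GB.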

Page 65:

Half of SGI ASCI Computer at LASL c1999

Page 66:

LASL ASCI Cluster Interconnect

[Diagram: switches numbered 1 to 18, computer groups numbered 1 to 6]

• 6 Groups of 8 Computers each
• 18 16x16 Crossbar Switches
• 18 Separate Networks

Page 67:

LASL ASCI Cluster Interconnect

Page 68:

Typical MCNP BNCT simulation:
• 1 cm resolution (21x21x25)
• 1 million particles
• 1 hour on 200 MHz PC

ASCI Blue Mountain MCNP simulation:
• 1 mm resolution (256x256x250)
• 100 million particles
• 2 hours on 6144 CPUs

3 TeraOps makes a difference!
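The size of that difference can be quantified from the two bullet lists. The sketch below only compares the quoted grid sizes, particle counts, and CPU-hours; it makes no claim about how MCNP run time actually scales, and the names are hypothetical.

```python
import math

# Grid dimensions and particle counts as quoted on the slide.
pc_grid = (21, 21, 25)
asci_grid = (256, 256, 250)
pc_particles = 1_000_000
asci_particles = 100_000_000

pc_cells = math.prod(pc_grid)
asci_cells = math.prod(asci_grid)

cell_ratio = asci_cells / pc_cells
particle_ratio = asci_particles / pc_particles

# CPU-hours: 1 hour on one PC versus 2 hours on 6144 CPUs.
pc_cpu_hours = 1 * 1
asci_cpu_hours = 2 * 6144

print(f"grid cells: {pc_cells:,} vs {asci_cells:,} ({cell_ratio:,.0f}x)")
print(f"particles: {particle_ratio:.0f}x")
print(f"CPU-hours: {pc_cpu_hours} vs {asci_cpu_hours:,}")
```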

Page 69:

LLNL Architecture

[Diagram: Sector S, Sector Y, and Sector K, each with 24 links to the HPGN; HiPPI and FDDI connections; remaining numeric labels: 66, 12]

Each SP sector has
• 488 Silver nodes
• 24 HPGN Links

System Parameters
• 3.89 TFLOP/s Peak
• 2.6 TB Memory
• 62.5 TB Global disk

Per-sector figures (one entry per sector, as shown in the diagram)
• 2.5 GB/node Memory, 24.5 TB Global Disk, 8.3 TB Local Disk
• 1.5 GB/node Memory, 20.5 TB Global Disk, 4.4 TB Local Disk
• 1.5 GB/node Memory, 20.5 TB Global Disk, 4.4 TB Local Disk

SST Achieved >1.2 TFLOP/s on sPPM and Problem >70x Larger Than Ever Solved Before!

Page 70:

I/O Hardware Architecture

[Diagram: 488 Node IBM SP Sector with 56 GPFS Servers and 432 Silver Compute Nodes, System Data and Control Networks, and 24 SP Links to Second Level Switch]

Each SST Sector
• local and global I/O file system
• 2.2 GB/s global I/O performance
• 3.66 GB/s local I/O performance
• Separate SP first level switches
• Independent command and control

Full system mode
• Application launch over full 1,464 Silver nodes
• 1,048 MPI/us tasks, 2,048 MPI/IP tasks
• High speed, low latency communication
• Single STDIO interface
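The node counts on this slide and the system parameters on the LLNL Architecture slide can be cross-checked. This is a rough sketch using only the quoted figures; the three per-node memory sizes come from the architecture slide, the assumption that every node in a sector carries that sector's quoted memory is mine, and the variable names are introduced here.

```python
# Per-sector node breakdown from this slide.
compute_nodes = 432
gpfs_servers = 56
sectors = 3

nodes_per_sector = compute_nodes + gpfs_servers
total_nodes = sectors * nodes_per_sector

# System peak from the LLNL Architecture slide: 3.89 TFLOP/s.
peak_gflops_per_node = 3.89e12 / total_nodes / 1e9

# GB/node memory for the three sectors, as quoted on the architecture slide.
memory_gb_per_node = [2.5, 1.5, 1.5]
total_memory_gb = sum(gb * nodes_per_sector for gb in memory_gb_per_node)

print(f"{nodes_per_sector} nodes per sector, {total_nodes} nodes in total")
print(f"about {peak_gflops_per_node:.2f} Gflops peak per node")
print(f"about {total_memory_gb:.0f} GB of memory across all sectors")
```

The totals line up with the "488 Silver nodes" per sector, the "1,464 Silver nodes" in full system mode, and, to rounding, the quoted 2.6 TB of memory.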

Page 71:

Fujitsu VPP5000 multicomputer: (not available in the U.S.)

Computing nodes
• speed: 9.6 Gflops vector, 1.2 Gflops scalar
• primary memory: 4-16 GB
• memory bandwidth: 76 GB/s (9.6 x 64 Gb/s)
• inter-processor comm: 1.6 GB/s non-blocking with global addressing among all nodes
• I/O: 3 GB/s to scsi, hippi, gigabit ethernet, etc.

1-128 computers deliver 1.22 Tflops
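Two of the VPP5000 figures follow from the node speed. The sketch below assumes that "9.6 x 64 Gb/s" means one 64-bit word moved per vector flop, which is my reading of the slide rather than a vendor specification; the variable names are illustrative.

```python
vector_gflops = 9.6      # per-node vector speed from the slide
word_bits = 64           # assumed: one 64-bit operand per flop
max_nodes = 128

# Memory bandwidth: 9.6 G words/s x 64 bits, converted to bytes.
bandwidth_gbits = vector_gflops * word_bits
bandwidth_gbytes = bandwidth_gbits / 8

# Aggregate peak for a full 128-node configuration.
peak_tflops = max_nodes * vector_gflops / 1000

print(f"memory bandwidth: {bandwidth_gbits:.1f} Gb/s = {bandwidth_gbytes:.1f} GB/s")
print(f"128-node peak: {peak_tflops:.4f} Tflops")
```

Under that assumption the results land on the slide's 76 GB/s and 1.22 Tflops once truncated.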

Page 72:

NEC SX 5: clustered SMPv (not available in the U.S.)

SMPv computing nodes
– 4 - 8 processors/computer
– Processor pap: 8 Gflops
– Memory
– I/O speed

Cluster

Page 73:

NEC Supers

Page 74:

High Performance COTS

Raceway and (RACE++) Busses
– ANSI Standardized
– Mapped Memory, Message Passing, ‘Planned Direct’ Transfers
– Circuit Switched; Basic Bus Interface Unit Is a 6 (8) Port Bidirectional Switch at 40 MB/s (66 MB/s) Per Port.
– Scales to 4000 Processors

Skychannel
– ANSI Standardized
– 320 MB/sec; Crossbar backplane supports up to 1.6 GB/s Throughput Non-blocking
– Heart of Air Force $3M / 256 Gflops System

Page 75:

Mercury & Sky Computers - & $

• Rugged System With 10 Modules ~ $100K; $1K /#
• Scalable to several K processors; ~1-10 Gflop / Ft3
• 10 9U Boards * 4 Ppc750’s 440 Specfp95 in 1 Ft3 (18.5 * 8 * 10.75”)
• Sky 384 Signal Processor, #20 on ‘Top 500’, $3M

Mercury VME Platinum System

Sky PPC Daughtercard
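The packaging claim in the third bullet can be checked from the quoted dimensions. This is a minimal sketch, assuming the three dimensions are in inches and that the 440 SPECfp95 figure is the aggregate over all processors; both are my reading of the slide, and the names are hypothetical.

```python
boards = 10
processors_per_board = 4          # 4 PPC750s per 9U board
aggregate_specfp95 = 440
dimensions_inches = (18.5, 8, 10.75)

processors = boards * processors_per_board
specfp95_per_processor = aggregate_specfp95 / processors

# 1 cubic foot = 12 x 12 x 12 = 1728 cubic inches.
volume_cubic_inches = dimensions_inches[0] * dimensions_inches[1] * dimensions_inches[2]
volume_cubic_feet = volume_cubic_inches / 1728

print(f"{processors} processors, {specfp95_per_processor:.0f} SPECfp95 each")
print(f"enclosure volume: {volume_cubic_feet:.2f} cubic feet")
```

The enclosure works out to a little under one cubic foot, consistent with the "in 1 Ft3" on the slide.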

Page 76:

Brookhaven/Columbia QCD c1999 (1999 Bell Prize for performance/$)

Page 77:

Brookhaven/Columbia QCD board

Page 78:

HT-MT: What’s 0.55? c1999

Page 79:

HT-MT…

• Mechanical: cooling and signals
• Chips: design tools, fabrication
• Chips: memory, PIM
• Architecture: MTA on steroids
• Storage material

Page 80:

HTMT challenges the heuristics for a successful computer

• Mead 11 year rule: time between lab appearance and commercial use
• Requires >2 breakthroughs
• Team’s first computer or super
• It’s government funded… albeit at a university