heiko schröder, 1998 architectures special purpose...

100
The Future of Parallel Computing The Future of Parallel Computing Special Purpose Mesh Architectures Heiko Schröder, 1998 P A R C SA ISA PIPS RM OH

Upload: others

Post on 06-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

The Future of ParallelComputing

The Future of ParallelComputing

Special Purpose MeshArchitectures

Heiko Schröder, 1998P A R C

SA ISA PIPS RM OH

Page 2: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 2

P A R C ContentsContents

•Why meshes ???•Application specific parallelmesh architectures

Fine grain1983

Coarse grain1997

-Systolic Arrays-Instruction Systolic Arrays-PIPS-Reconfigurable mesh-Optical Highway

Page 3: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 3

P A R C Physical limitsPhysical limits

c=300 000 km/sec • OPS -- 0.3 mm/OP• 1000 PEs with OPS --

30cm/OP• massive parallelism• distributed memory

9101210

Page 4: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 4

P A R C

0.01

0.1

1

10

100

1000

1970 1975 1980 1985 1990 1995

Perf

orm

ance

(MIP

S)

Year

4004

808080286

80386

Pentium 2

Pentium

80486

Processor powerProcessor power

Page 5: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 5

P A R C

0,5 µ

0,25 µ

ScalingFaktor 2:

• 1/2 width • 1/2 hight • 1/2 switching time

8 x performance!

Page 6: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 6

P A R C CMOS transistorsCMOS transistors

1960 1970 1980 1990 2000 2010 2020 2030

0,01µµµµ

0,1µµµµ

1µµµµ

10µµµµ

Size of minimal transistor

ca. 0,03µµµµ

Page 7: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 7

P A R C Mesh/TorusMesh/Torus

diameterbisection width

nn

2D mesh

Page 8: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 8

P A R C HypercubeHypercube

0-D0

11-D

00

01

10

112-D

000 010

001 011

100 110

101 111

3-D

0 1

4-D

diameter log nbisection width n

Page 9: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 9

P A R C VLSIVLSI

VeryLarge

Scale Integration

• simple cells• few types• regular architecture• short connections

mesh -- torus

Page 10: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 10

P A R C Pin limitationsPin limitations

16x16 pins

diameter 256

16 pins

diameter 16

16x12 pins

Page 11: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 11

P A R C Bisection widthBisection width

Bisection width 256

25 cm

Bisection width 32K

32 m

Page 12: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 12

P A R C ProgrammingProgramming

• SA --- Systolic Array• SIMD --- Single Instruction Multiple Data• ISA --- Instruction Systolic Array• MIMD --- Multiple Instruction Multiple Data

Page 13: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 13

P A R C parallel mergeparallel merge

initial situation:

1.) sort columns(odd-even-transposition sort)

2.) sort rows(odd-even-transposition sort)

sorted !!!!

x1 x2 x3 x4 x5 x6

x7

x17 x18

y1 y2 y3 y4 y5 y6

y7

y17 y18

...

...

...

...

Page 14: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 14

P A R C 0-1 principle0-1 principle

• The 0-1 principle states that ifall sequences of 0 and 1 aresorted properly than this is acorrect sorter.

• The sorter must be based onmoving data.

initially

0s

0s

1s

after verticalsort

0s

1s

after horizontalsort

0s

1s

Page 15: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 15

P A R C MIMD-mesh (clocked)MIMD-mesh (clocked)

min max Time: 2n

Page 16: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 16

P A R C systolic mergesystolic merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 17: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 17

P A R C systolic mergesystolic merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 18: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 18

P A R C systolic mergesystolic merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 19: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 19

P A R C systolic mergesystolic merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 20: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 20

P A R C systolic mergesystolic merge

1 3 3 45 5 6 79 8 8 74 4 3 2

1 3 3 45 5 6 74 4 3 29 8 8 7

Page 21: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 21

P A R C systolic mergesystolic merge

1 3 3 45 5 6 74 4 3 29 8 8 7

Page 22: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 22

P A R C systolic mergesystolic merge

1 3 3 44 4 3 25 5 6 79 8 8 7

Page 23: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 23

P A R C systolic mergesystolic merge

1 3 3 44 4 3 25 5 6 79 8 8 7

Page 24: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 24

P A R C systolic mergesystolic merge

1 3 3 24 4 3 45 5 6 79 8 8 7

Page 25: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 25

P A R C systolic mergesystolic merge

1 3 3 24 4 3 45 5 6 79 8 8 7

Page 26: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 26

P A R C systolic mergesystolic merge

1 3 3 24 4 3 45 5 6 79 8 8 7

Page 27: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 27

P A R C systolic mergesystolic merge

1 3 2 34 3 4 45 5 6 79 8 8 7

Page 28: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 28

P A R C

1 3 2 34 3 4 45 5 6 79 8 8 7

systolic mergesystolic merge

Page 29: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 29

P A R C

1 2 3 33 4 4 45 5 6 79 8 8 7

systolic mergesystolic merge

Page 30: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 30

P A R C

1 2 3 33 4 4 45 5 6 79 8 8 7

systolic mergesystolic merge

Page 31: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 31

P A R C1 2 3 33 4 4 45 5 6 79 8 8 7

systolic mergesystolic merge

Page 32: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 32

P A R C1 2 3 33 4 4 45 5 6 78 9 7 8

systolic mergesystolic merge

Page 33: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 33

P A R C1 2 3 33 4 4 45 5 6 78 9 7 8

systolic mergesystolic merge

Page 34: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 34

P A R C1 2 3 33 4 4 45 5 6 78 7 9 8

systolic mergesystolic merge

Page 35: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 35

P A R C systolic mergesystolic merge1 2 3 33 4 4 45 5 6 78 7 9 8

Page 36: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 36

P A R C• sorted !!!

systolic mergesystolic merge1 2 3 33 4 4 45 5 6 77 8 8 9

Page 37: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 37

P A R C Characteristics of SAsCharacteristics of SAs

Extremely high cost-performanceno flexibility -- long development time

Suitable for special signal processing tasks ???

Page 38: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 38

P A R C Systolic architectures ISystolic architectures I1. Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “A Fast Sorting Algorithm

for VLSI”, Proc. 10th ICALP, Barcelona, July 1983, Lecture Notes in ComputerScience, 154, pp408--419, 1983

2. Lang, H.W., Schimmler, H., Schröder, H., “Pattern Matching in Binary Trees on aMesh-Connected Processor Array”, VLSI: Algorithms and Architectures, Bertolazziand Luccio (eds.), North Holland, pp113--124, 1984

3. Schmeck, H., Schröder, H., “Dictionary Machines for Different Models of VLSI”,IEEE Transactions on Computers, C-34, pp472--475, 1985

4. Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “Realistic Comparisons ofSorting Algorithms for VLSI”, Foundations of Data Organisation, Ghosh,Kambayashi and Tanaka (eds.), Plenum Press, pp309--318, 1987

5. Schröder, H., “VLSI-Sorting Evaluated under the Linear Model”, Journal ofComplexity 4, pp 330-355, December 1988

6. Schmeck, H., Schröder, H., Starke, C., “Systolic s2-Way Merge Sort is NearlyOptimal”, in IEEE Transactions on Computers, Vol 38, Nr 7, pp 1052-1056, 1989

7. Murthy, V.K., Schröder, H., “Systolic Arrays for Parallel G-Inversion and FindingPetri Net Invariants”, in Parallel Computing, Vol. 11, Nr. 3, pp 349-359, 1989

8. E.V. Krishnamurthy, M. Kunde, M. Schimmler, H. Schröder, “Systolic algorithm fortensor products of matrices --- implementation and applications”, Parallel Computing13, 301-308, 1989

Page 39: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 39

P A R C Systolic architectures IISystolic architectures II9. Schimmler, M., Schröder, H., “A Simple Systolic Method to Find all Bridges on

an undirected Graph”, Parallel Computing 12, 107-111, 198910. P. Lenders, H. Schröder, “A programmable systolic device for Image Processing -

-- based on Mathematical Morphology”, Parallel Computing, 13, pp 337-344,1990

11. Schröder, H., Krishnamurthy, E.V., “Systolic Algorithms for PolynomialInterpolation and Related Problems”, Parallel Computing, 17, pp 493-504, 1991

12. Schröder, H., Krishnamurthy, E.V., “Systolic Algorithms for MultivariateApproximation Using Tensor Products of Basis Functions”, Parallel Computing,17, pp 483-492, 1991

13. B. Beresford-Smith, J. Breckling, H. Schröder, “Systolic Codebook Generation”,Transactions on Acoustics, Speech, & Signal Processing, pp 144-149, Vol.1, Nr.2, April 1993

14. Schröder, H., “Partition Sorts for VLSI”, Proceedings GI-13, Jahrestagung,Hamburg, October 1983, Informatik-Fachberichte 73, pp101--116, 1983

15. Schröder, H., “VLSI-gerechte Sortierverfahren”, PARS-Workshop, Erlangen,April 1984, Mitteilungen GI-PARS Nr. 2, pp134--143, 1984

16. Schröder, H. “VLSI-Sorting under the Linear Model”, Proceedings of the TenthAustralian Computer Science Conference, pp330--340, Deakin University,February 1987

17. B. Beresford-Smith, J. Breckling, H. Schröder, “Systolic Devices for SpeechProcessing”, CompEuro 89, Hamburg, Germany, May 1989

Page 40: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 40

P A R C ISA mergeISA merge

1 3 3 45 5 6 79 8 8 74 4 3 2

C:=min{C, CE}

C:=max{C, CW}

Page 41: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 41

P A R C ISA mergeISA merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 42: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 42

P A R C ISA mergeISA merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 43: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 43

P A R C ISA mergeISA merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 44: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 44

P A R C ISA mergeISA merge

1 3 3 45 5 6 79 8 8 74 4 3 2

Page 45: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 45

P A R C ISA mergeISA merge

1 3 3 45 5 6 74 8 8 79 4 3 2

Page 46: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 46

P A R C ISA mergeISA merge

1 3 3 45 5 6 74 8 8 79 4 3 2

Page 47: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 47

P A R C ISA mergeISA merge

1 3 3 44 5 6 75 4 8 79 8 3 2

Page 48: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 48

P A R C ISA mergeISA merge

1 3 3 44 5 6 75 4 8 79 8 3 2

Page 49: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 49

P A R C ISA mergeISA merge

1 3 3 44 4 6 75 5 3 79 8 8 2

Page 50: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 50

P A R C ISA mergeISA merge

1 3 3 44 4 6 75 5 3 79 8 8 2

Page 51: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 51

P A R C ISA mergeISA merge

1 3 3 44 4 3 75 5 6 29 8 8 7

Page 52: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 52

P A R C ISA mergeISA merge

1 3 3 44 4 3 75 5 6 29 8 8 7

Page 53: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 53

P A R C ISA mergeISA merge

1 3 3 44 4 3 25 5 6 79 8 8 7

Page 54: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 54

P A R C ISA mergeISA merge

1 3 3 44 4 3 25 5 6 79 8 8 7

Page 55: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 55

P A R C ISA mergeISA merge

1 3 3 24 4 3 45 5 6 79 8 8 7

Page 56: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 56

P A R C ISA mergeISA merge

1 3 3 24 4 3 45 5 6 79 8 8 7

Page 57: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 57

P A R C ISA mergeISA merge

1 3 2 34 3 4 45 5 6 79 8 8 7

Page 58: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 58

P A R C ISA mergeISA merge

1 3 2 34 3 4 45 5 6 79 8 8 7

Page 59: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 59

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 79 8 8 7

Page 60: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 60

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 79 8 8 7

Page 61: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 61

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 79 8 8 7

Page 62: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 62

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 78 9 7 8

Page 63: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 63

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 78 9 7 8

Page 64: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 64

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 78 7 9 8

Page 65: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 65

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 78 7 9 8

Page 66: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 66

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 77 8 8 9

Page 67: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 67

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 77 8 8 9

Page 68: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 68

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 77 8 8 9

Page 69: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 69

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 77 8 8 9

Page 70: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 70

P A R C ISA mergeISA merge

1 2 3 33 4 4 45 5 6 77 8 8 9

Page 71: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 71

P A R C Hough transform on the ISAHough transform on the ISA

• good line detection method

shear

Fast tomography

Page 72: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 72

P A R C robot visionrobot vision

projectorCCD CCD

• stereo vision

Page 73: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 73

P A R C Use of the ISAUse of the ISA

Areas of application for ISA:automatic optical quality controlreal time signal processingcomputer graphics /visualizationlinear equationsCryptography --> Tele-medicine ?

Special features:fast aggregate functions (sum, carry)fast local communicationno local memorytypical improvement over PC: Factor 20-30

Page 74: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 74

P A R C Instruction Systolic ArrayInstruction Systolic Array1. Schröder, H., “The Instruction Systolic Array --- A Tradeoff between Flexibility and

Speed”, Computer Systems Science and Engineering, Vol 3 No 2, April 19882. Schröder, H., “Top-Down Designs of Instruction Systolic Arrays for Polynomial

Interpolation and Evaluation”, in Journal of Parallel and Distributed Computing, Vol.6, pp 692-703, 1989

3. Kunde, M., Lang, H.W., Schimmler, M., Schmeck, H., Schröder, H., “The InstructionSystolic Array and its Relation to other Models of Parallel Computers”, ParallelComputing, Vol 7, pp 25-39, 1988

4. Schröder, H., Krishnamurthy, E.V., “Instruction Systolic Array Computation of theCharacteristic Polynomial of a Hessenberg Matrix”, Parallel Computing, 17, pp 273-278, 1991

5. H. Schröder, P. Strazdins, “Programm compression on the ISA”, Parallel Computing17, 207-219, 1991

6. B. Pham, H. Schröder, “An Instruction Systolic Device for Quadratic SurfaceGeneration”, CompEuro 89, Hamburg, West Germany, May 1989

7. P. Lenders, H. Schröder, P. Strazdins, “Microprogramming Instruction SystolicArrays”, MICRO 22, Dublin, August 1989

8. B. Schmidt, M. Schimmler, H. Schröder, “Morphological Hough Transform on theInstruction Systolic Array”, Euro-Par ’97 – Parallel Computing, LNCS 1300, SpringerVerlag, pp. 798-806, 1997.

Page 75: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 75

P A R C PIPS (1990-94)PIPS (1990-94)

1 M bit

1 M bit

32x32 torus16 bit parallelcommunication16 bit addprefetch

memory control

BHP -- CSIRO -- NU -- ADFA 1.4 M

Page 76: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 76

P A R C

Special features:local memorySIMD-torusmemory pre-fetch

Applications:visualization3D-simulation (CFD, FEM)

Page 77: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 77

P A R C

Page 78: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 78

P A R C PIPSPIPS1. A. Spray, K.T. Lie, H. Schröder, “Test Strategies Employed in a Massively Parallel

Visualization Engine”, PRFTS, Melbourne, December 19932. A. Spray, H. Schröder, K.T. Lie, “A Low-Cost Machine for Real-Time

Visualization”, 5th International Symposium on IC Technology, Singapore, 19933. H. Schröder et al, “PIPADS: A Vertically Integrated Parallel Image Processing and

Display System”, 5th Australian Supercomputing Conference, Melbourne, 1992.4. R.Lang, E.Plesner, H.Schröder, A.Spray, “An efficient systolic architecture for the

one-dimensional wavelet transform”, in Wavelet Applications, Harold H. Szu,Editor, Proc. SPIE 2242, p925-935, 1994.

5. R. Lang, A. Spray, H. Schröder, “2D wavelet transform on a SIMD torus ofscanline processors”, Australian Computer Science Communications, Vol 17, Nr17, 271-277, 1995.

6. P. Bray, S.W. Chan, K.T.Lie, Meiyun, H. Schröder, A. Spray, “A torus for scan-line based image processing”, Presented at the Post-ISCA Special PurposeArchitectures Workshop, May 1992.

7. A. Spray, H. Schröder, K.T. Lie, E. Plesner, P. Bray, “PIPADS --- A Low-CostReal-Time Visualization Tool”, Supercomputing, Melbourne, 1993.

8. H. Schröder, A. Spray, “PIPS a massively parallel image processing system”,Workshop PARAGRAPH'94, Linz, March 1994.

9. H. Schröder, “3D-Visualisation based on Scan-line Image Processing”, invitedspeaker, Workshop MWTAI’97 in Missen-Wilhams, March 1997

Page 79: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 79

P A R C Use in industry ?Use in industry ?

1993 1994 1995 1996

500

1000

1500

2000

2500

3000

Performance[Gflops]

Research

Industry

1327

2121

648

3675

126248

693

1168

Page 80: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 80

P A R C InvestmentsInvestments

Investments into parallel computers[M$]

0

500

1000

1500

2000

2500

3000

3500

1993 1994 1995 1996

ResearchIndustry

Page 81: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 81

P A R C ConcentrationConcentration

1993 1994 1995 1996

10

20

30

40

50

60

Number ofmanufacturers

11

2119

49

Page 82: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 82

P A R C Degree of ParallelismDegree of Parallelism

050

100150200250300350400450

May-93

Nov-93

May-94

Nov-94

May-95

Nov-95

May-96

Nov-96

May-97

Nov-97

1 to 6364 to 255256 to 10231024 and more

Number ofnew Systems

Page 83: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 83

P A R C EvaluationEvaluation

Cost * computation time

• Parallel computers with standard components

• Imbedded parallel systems

Page 84: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 84

P A R C reconfigurable meshreconfigurable mesh

reconfigurable mesh =mesh + interior connections

15 positions

low cost

Page 85: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 85

P A R C global OR and modulo 3global OR and modulo 3

1 0 000 1 0

* * “V”

log n on EREW-PRAM10 11 10

*1 mod 3

log n / log log n on CRCW-PRAM

Page 86: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 86

P A R C sorting with all-to-all mappingsorting with all-to-all mapping

Sorting:sort blocksall-to-all (columns)sort blocks all-to-all (rows)o-e-sort blocks

Page 87: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 87

P A R C all-to-all mappingall-to-all mapping

n2

3

n x n

Page 88: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 88

P A R C vertical all-to-allvertical all-to-all

Page 89: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 89

P A R C horizontal all-to-allhorizontal all-to-all

Page 90: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 90

P A R C1 step

2 steps

3 steps

k/2 steps

3 steps

2 steps

1 step

(k/2)2 steps

Page 91: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 91

P A R C sorting in optimal timesorting in optimal time

(k/2)2 stepsk=n1/3

each step takes n1/3 time --> T= n/4

x 2

x 2

/2

T = n/2all-to-all

Sorting:sort blocks (O(n2/3))all-to-all (n/2)sort blocks (O(n2/3))all-to-all (n/2)sort blocks (O(n2/3))

time: n + o(n)

Page 92: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 92

P A R C Reconfigurable meshReconfigurable mesh

Special featuresSIMDconstant diameterfaster than PRAM ?Suitable applicationsrouting/sorting/load balancingsparse matrix multiplicationsegmentation / component labelingfeature extractionimage database ?

Page 93: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 93

P A R C Reconfigurable meshReconfigurable mesh1. Kapoor, H. Schröder, B. Beresford-Smith, “Connected Component Labelling on

3D Reconfigurable Mesh Architecture”, Parallel Processing, 19942. M. Kaufmann, H. Schröder, J. Sibeyn, “Routing and Sorting on Reconfigurable

Meshes”, Parallel Processing Letters, Vol 5, Nr 1, pp 81-96, 19953. G. Turner, H. Schröder, “Token Distribution and Load Balancing on recon-

figurable, d-D. Meshes”, Journal of Parallel Architectures and Algorithms, 19964. M. Middendorf, H. Schmeck, H. Schröder, G. Turner. “Multiplikation of matrices

with different sparseness properties on dynamically reconfigurable meshes”, toappear in VLSI Design.

5. Kapoor A., Schröder H., Beresford-Smith B., “Constant Time sorting on aReconfigurable Mesh “, Australian Computer Science Conference, February 1993.

6. D. Yu, H. Schröder, “Parallel Border following, Connecting and Labeling forBinary Image with 8-neighborhood Coding on a Mesh with Bi-reconfigurableBuses”, 3rd International Workshop on Parallel Image Analysis, Maryland, 1994.

7. Kapoor, H. Schröder, B. Beresford-Smith, “Deterministic Permutation Routing onthe Reconfigurable Mesh”, Proceedings 8th IPPS, pp 536-540, Mexico, 1994.

8. G. Turner, H. Schröder, “Fast Token Distribution on Reconfigurable Meshes”, inProceedings of the 2nd Australasian Conference on Parallel and Real-TimeSystems (PART)}, vol 2, pp 127-133, 1995

9. H. Schmeck, H. Schröder, G. Turner, “Efficient Sparse Matrix Multiplication on aReconfigurable Mesh”, PARS-Workshop, Stuttgart, Germany, October 1995.

10. H. Schröder, “Mathematical Morphology for Robot Vision on a ReconfigurableMesh Architecture”, ISCA Special Purpose Architectures Workshop, May 1992.

11. M. Kaufmann, H. Schröder, J. Sibeyn, “Asymptotically Optimal and PracticalRouting on the Reconfigurable Mesh”, GI/ITG-Workshop “Architekturen fürhochintegrierte Schaltungen”, Schloß Dagstuhl, July1994.

Page 94: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 94

P A R C Optical HighwayOptical Highway

( )

6

3

10

#

===

=

CwidthW

processorsPCWP W=1; P=100

W=100; P=22

C

All-to-all connection

Page 95: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 95

P A R C

Horizontal all-to-all

Verticalall-to-all

Page 96: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 96

P A R C

Features of optically connected meshesSIMD/SPMD/MIMDimplement all major architecturesall-to-all communication in 2 stepsBulk synchronous processing (BSP)no latency hidingno pin-limitationApplicationscoarse grain parallel computing only?ray-tracing ????

Page 97: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 97

P A R C Optical HighwayOptical Highway

1. H. Schröder et al, “RMB --- A Reconfigurable Multiple BusNetwork”, HPCA 96, San Jose, 19962. H. Schröder, O. Sykora, I. Vrto, “Optical All-to-All Communicationfor some Product Graphs”, SOFSEM '97, Milovy, Czech Republic,1997

Page 98: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 98

P A R C Bisection-width / DiameterBisection-width / Diameter

Diameter log nbisection width n

diameterbisection width

nn

SAISAPIPS

Diameter 1bisection width n

RM

Diameter 1bisection width nOH

Page 99: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486

Heiko Schröder, 1998 Slide 99

P A R C Suitable problems ?Suitable problems ?bisection width n

SA: suitable applications?ISA: 2D-problems, aggregate functions local communicationPIPS: 3D-problems, local communication

RM: diameter-bound > bisection-width-bound

OH: PRAM equivalent?

SAISAPIPS

RM

OH

Page 100: Heiko Schröder, 1998 Architectures Special Purpose Meshweb.cecs.pdx.edu/~mperkows/temp/May22/warwick.pdf · Performance (MIPS) Year 4004 8080 80286 80386 Pentium 2 Pentium 80486