
Modular design and the importance of models of computation

Bart Kienhuis, Leiden University, LIACS Computer Systems Group

Based on the presentation given at the 37th DAC tutorial on Embedded System Design


New Applications

Stream oriented applications
• Multi-media
• (Smart) imaging
• Bioinformatics
• Classical digital signal processing

Ferocious appetite for compute power


Smart Imaging

Camellia project
• Core for ambient and mobile intelligent imaging applications
• IST project (fr5)
• Detection of a pedestrian walking in front of a car
• Renault, Philips

[Figure: detection example with two regions labeled Target 1 and Target 2]

Huge compute requirements
• Giga operations per second
• Real time

Embedded system
• Low cost
• Low power


Heterogeneous Architectures

• Stream-based applications
• Autonomous operating components (task-level parallelism)
• Low bandwidth communication between components
• Programmable interconnect
• Distributed memory

[Figure: heterogeneous architecture in which microprocessors (CPU), RPUs, IP cores, and memories are connected by a programmable interconnect (NoC)]

Components:
• Microprocessors (DSP, CPU)
• Reconfigurable Units (FPGA)
• Dedicated Hardware (IP cores)
• Distributed Memory (memory banks)


Intrinsic Computational Efficiency (ICE)

[Figure: computational efficiency [MOPS/W] on a logarithmic axis (10^0 to 10^6) versus feature size [µm] (2, 1, 0.5, 0.25, 0.13, 0.07). The plot shows the intrinsic computational efficiency of silicon and a series of microprocessors (i386SX, i486DX, 68040, P5, microsparc, Supersparc, 601, 604, 604e, Ultrasparc, P6, 21164a, Turbosparc, 21364, 7400). A region of the plot is labeled "Playing field".]

E. Roza, System-on-chip: what are the limits?, IEE Electronics Communication Engineering Journal, vol. 13, no. 6, Dec. 2001, pp. 249-255.


Xilinx Virtex II Pro

• PowerPC based
  • 420 Dhrystone MIPS at 300 MHz
  • 1 to 4 PowerPCs
• Virtex-4
  • Already over a billion transistors
  • 90nm technology

[Figure: chip layout showing reconfigurable logic and memory blocks and the PowerPCs]

Source: Xilinx


SpaceCake

Current development at Philips Research. The idea is to make a platform that is Moore’s Law resilient.

Homogeneous Tiles

If more transistors become available, more tiles can be combined, delivering more compute power.

[Figure: a tile with CPU0, CPU1, CPU2, an RPU, an IP core, and memory, connected by a programmable interconnect]

Stravers, P. and Hoogerbrugge, J. 2001. Homogeneous multiprocessing and the future of silicon design paradigms. In Proceedings of the Int. Symposium on VLSI Technology, Systems, and Applications.


PicoArray

• Startup, England
• Processor with 430 16-bit RISC cores on a single die
• Projected markets: WCDMA / 802.11
• Homogeneous architecture

Source: www.picoarray.com


Different Views

Homogeneous architectures
• Linear scaling: more CPUs lead to an equal increase in available compute power
• Load balancing: each CPU is working at the same workload level

Heterogeneous architectures
• Flexibility: match the computation to the correct component in terms of ICE; take advantage of heterogeneity


Software Efficiency

[Figure: two curves on logarithmic axes, plotted over the years 1981 to 2009.
• Logic transistors per chip (K): 58% / yr. compound complexity growth rate
• Productivity in transistors per staff-month (Tr./S.M.): 21% / yr. compound productivity growth rate
The widening distance between the two curves is labeled the productivity gap.]

Source: SEMATECH © Kreutzer
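To get a feeling for how fast the productivity gap widens, the two growth rates quoted on the slide can be compounded against each other. The following is only a back-of-the-envelope sketch in Python; the function names are made up for illustration, and the only data used are the 58% and 21% yearly rates from the figure.

```python
import math

COMPLEXITY_GROWTH = 0.58    # logic transistors per chip, per year (from the slide)
PRODUCTIVITY_GROWTH = 0.21  # transistors per staff-month, per year (from the slide)


def gap_factor(years: float) -> float:
    """Factor by which complexity outgrows productivity after `years` years."""
    yearly_ratio = (1 + COMPLEXITY_GROWTH) / (1 + PRODUCTIVITY_GROWTH)
    return yearly_ratio ** years


def gap_doubling_time() -> float:
    """Number of years after which the gap has doubled."""
    return math.log(2) / math.log(gap_factor(1))


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years:2d} years the gap has grown by a factor {gap_factor(years):.1f}")
    print(f"the gap doubles roughly every {gap_doubling_time():.1f} years")
```

Under these two rates, the gap grows by about 31% per year, so it doubles in a little over two and a half years.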


Problem

“We know how to build billion transistor ICs, but we do not know how to program them”

Current compiler technology is not capable of handling the heterogeneity of these architectures.

Why is that, and how can it be solved?


Y-chart Approach

Three different ways to improve the performance of a system:
• Suggest architectural improvements
• Rewrite the applications
• Use different mapping strategies

[Figure: the Y-chart. Applications and an Architecture Instance are brought together by Mapping; Performance Analysis then yields Performance Numbers, which feed back into the three improvements listed above.]

Kienhuis, B., Deprettere, E., Van der Wolf, P., and Vissers, K. 2002. A Methodology to Design Programmable Embedded Systems. LNCS, vol. 2268. Springer Verlag, pages 18-37.


Mapping

[Figure: the Y-chart (Applications, Architecture Instance, Mapping, Performance Analysis, Performance Numbers) applied to MPEG decoding. The application is an MPEG decoder that turns coded video into decoded video, with blocks Demux, VLD, Q-1, IDCT, an adder (+), Motion, Buffer, and Reorder, and with signals for ordering, quantization control, and motion vectors & mode. The architecture instance is a heterogeneous architecture of microprocessors (CPU), RPUs, IP cores, and memories on a programmable interconnect (NoC). The step between the two is labeled MAPPING.]


Mapping

[Figure: an architecture consisting of a CPU and two coprocessors connected by a bus]

Architecture:
• Resources
  • ALUs, CORDICs, PEs
  • Registers, SRAM, DRAM
  • Busses, Switches
• Communication
  • Bits, Signals

[Figure: MPEG decoding block diagram, from coded video to decoded video]

Application:
• Computations
  • IDCT, SQRT, Quantizer
• Communication
  • Pixels, Blocks

Both describe a network of components that perform a particular function and that communicate in a particular way.


Mapping

[Figure: the Architecture (a CPU and two coprocessors on a bus) on one side and the Application (the MPEG decoding block diagram) on the other, connected by Mapping]

Can we formalize the description of these networks?
“Models of Architecture” and “Models of Computation”


Model of Computation

A Model of Computation is a formal representation of the operational semantics of networks of functional blocks describing the computations.

[Figure: network of four functional blocks A, B, C, and D]


Model of Computation: Terminology

• Actor: describes the functionality.
• Relation: the actors can communicate with each other using relations.
• Token: the exchange of a quantum of information. It represents a signal.
• Firing: a quantum of computation; the moment of interaction with other actors.
• Port: active or passive.

fire { … token = get(); … send(token); … }

[Figure: network of actors A, B, C, and D; the actors are connected through ports by relations, and a token travels over a relation]
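The terminology above can be made concrete with a few small classes. This is a minimal sketch in Python, not the notation of any particular tool: the class names follow the slide's terms, but their methods (`put`, `take`, `get`, `send`, `fire`) and the choice of a FIFO behind the relation are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque


@dataclass
class Token:
    """A quantum of information exchanged between actors."""
    value: Any


class Relation:
    """Connects two ports. Modeled here as a FIFO (an assumption)."""

    def __init__(self) -> None:
        self._fifo: Deque[Token] = deque()

    def put(self, token: Token) -> None:
        self._fifo.append(token)

    def take(self) -> Token:
        return self._fifo.popleft()


class Port:
    """An actor's access point to a relation."""

    def __init__(self, relation: Relation) -> None:
        self.relation = relation

    def get(self) -> Token:
        return self.relation.take()

    def send(self, token: Token) -> None:
        self.relation.put(token)


class Actor:
    """Describes the functionality; one call to fire() is one firing."""

    def __init__(self, name: str, function: Callable[[Any], Any],
                 in_port: Port, out_port: Port) -> None:
        self.name = name
        self.function = function
        self.in_port = in_port
        self.out_port = out_port

    def fire(self) -> None:
        token = self.in_port.get()                  # token = get();
        result = Token(self.function(token.value))  # the computation
        self.out_port.send(result)                  # send(token);


def demo() -> Any:
    """Wire A -> B and push one token through both actors."""
    r_in, r_ab, r_out = Relation(), Relation(), Relation()
    a = Actor("A", lambda v: v + 1, Port(r_in), Port(r_ab))
    b = Actor("B", lambda v: v * 2, Port(r_ab), Port(r_out))
    r_in.put(Token(3))
    a.fire()
    b.fire()
    return r_out.take().value
```

Calling `demo()` pushes the value 3 through A (add one) and B (double). What changes from one model of computation to the next is mainly what `Relation` does and who decides when `fire()` is called.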


Active/Passive Actors

Two kinds of actors:

Passive Actor:
• Scheduler needed
  • Schedule ABBCD
• A firing needs to terminate
• Fire-and-exit behavior

fire { token = get(); … send(token); … }

Active Actor:
• Schedules itself
• A firing typically doesn’t terminate
  • Endless while loop
• Process behavior

fire { while(1) { token = get(); send(token); } }

[Figure: network of actors A, B, C, and D]
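The difference between the two kinds of actors shows up clearly when the same function is written both ways. The sketch below uses Python's standard `queue` and `threading` modules; the doubling function, the helper names, and the `STOP` marker are assumptions for illustration. In particular, the slide's active actor loops forever, while this one stops on `STOP` so that the example can finish.

```python
import queue
import threading
from typing import List

STOP = object()  # hypothetical end-of-stream marker, not part of the slide


def passive_fire(inp: queue.Queue, out: queue.Queue) -> None:
    """Passive actor: one firing, then exit. A scheduler decides when to call it."""
    token = inp.get()
    out.put(token * 2)


def active_fire(inp: queue.Queue, out: queue.Queue) -> None:
    """Active actor: schedules itself by looping and blocking in get()."""
    while True:
        token = inp.get()
        if token is STOP:      # assumption: lets the endless loop terminate
            out.put(STOP)
            return
        out.put(token * 2)


def run_passive(values: List[int]) -> List[int]:
    inp, out = queue.Queue(), queue.Queue()
    for v in values:
        inp.put(v)
    for _ in values:           # the scheduler: one firing per available token
        passive_fire(inp, out)
    return [out.get() for _ in values]


def run_active(values: List[int]) -> List[int]:
    inp, out = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=active_fire, args=(inp, out))
    worker.start()             # from here on the actor runs on its own
    for v in values:
        inp.put(v)
    inp.put(STOP)
    results = []
    while (token := out.get()) is not STOP:
        results.append(token)
    worker.join()
    return results
```

Both variants compute the same stream; the passive one needs `run_passive` to act as scheduler, while the active one only needs to be started.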


Communication Between Actors

[Figure: Actor 1 sends a token through its port to the port of Actor 2]

Actor 1: fire { … send(); … }
Actor 2: fire { … get(); … }

Data type of the token:
• Integer, Double, Complex
• Matrix, Vector
• Record

Way the exchange takes place (communication semantics):
• Buffered
• Timed
• Synchronized


Different Semantics

• Analog computers (ODEs)
• Discrete time (difference equations)
• Discrete-event systems (DE)
• Process networks (Kahn)
• Sequential processes with rendezvous (CSP)
• Dataflow (Dennis)
• Synchronous-reactive systems (SR)
• Codesign finite state machines (CFSM)

[Figure: timing diagrams for five kinds of semantics: continuous time, discrete time, discrete events, partially-ordered events (E1 to E6), and synchronous/reactive]


Synchronous/reactive Models (SR)

• Network of concurrently executing actors
  • Passive actors
  • Communication is unbuffered
• Computation and communication are instantaneous.
• A model progresses as a sequence of “ticks.”
• At a tick, the signals are defined by a fixed point equation:

  $x = f_A(1), \quad y = f_B(z), \quad z = f_C(x, y)$

Characteristics of SR models:
• Tightly synchronized
• Stable state points
• Control intensive systems

[Figure: actors A, B, C, and D connected by the signals x, y, and z; a token passes from a sending port (fire { … send(); … }) to a receiving port (fire { … get(); … })]
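One way to read the fixed point equation is as an iteration: at the start of a tick every signal is unknown, and the actor functions are re-evaluated until nothing changes any more. The Python sketch below illustrates this under stated assumptions. The slide does not say what the functions of A, B, and C compute, so the bodies here are invented: A is a constant source, B is a negation, and C is an OR that can decide its output from a single known input, which is what allows the cycle between y and z to be resolved.

```python
from typing import Dict, Optional

Signal = Optional[bool]  # None means "not yet determined within this tick"


def f_a(_one: int) -> Signal:
    """Actor A (assumed): constant source that always emits True."""
    return True


def f_b(z: Signal) -> Signal:
    """Actor B (assumed): negation; an unknown input gives an unknown output."""
    return None if z is None else (not z)


def f_c(x: Signal, y: Signal) -> Signal:
    """Actor C (assumed): OR that is decided as soon as one input is True."""
    if x is True or y is True:
        return True
    if x is False and y is False:
        return False
    return None


def tick() -> Dict[str, Signal]:
    """Solve x = f_A(1), y = f_B(z), z = f_C(x, y) for one tick."""
    signals: Dict[str, Signal] = {"x": None, "y": None, "z": None}
    while True:
        updated = {
            "x": f_a(1),
            "y": f_b(signals["z"]),
            "z": f_c(signals["x"], signals["y"]),
        }
        if updated == signals:  # nothing changed: the fixed point is reached
            return signals
        signals = updated
```

Because each function only ever moves a signal from unknown to known, the loop ends after a bounded number of rounds. A signal that is still `None` at the end would indicate a cycle that the model cannot resolve.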


Process Network (PN)

• Network of concurrently executing processes
  • Active actors
  • Communicate over unbounded FIFOs
  • At any moment a process performs some operation, a blocking read, or a non-blocking write

Characteristics of process networks:
• Deterministic execution
• Doesn’t impose a particular schedule
• (Dynamic) dataflow

[Figure: processes A, B, C, and D connected by stream channels; a token passes from a sending port (fire { … send(); … }) to a receiving port (fire { … get(); … })]
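The properties listed above can be tried out with threads and unbounded queues from the Python standard library. The network below is a made-up example: two sources feed an adder, which feeds a sink. The process bodies, names, and the fixed token count are assumptions; Kahn processes usually run forever, whereas these stop after `count` tokens so that the example terminates.

```python
import queue
import threading
from typing import Iterable, List


def source(values: Iterable[int], out: queue.Queue) -> None:
    """Process that writes a stream; put() never blocks on an unbounded FIFO."""
    for v in values:
        out.put(v)


def adder(in_a: queue.Queue, in_b: queue.Queue, out: queue.Queue, count: int) -> None:
    """Process that reads one token from each input (blocking) and writes the sum."""
    for _ in range(count):
        a = in_a.get()   # blocking read
        b = in_b.get()   # blocking read
        out.put(a + b)   # non-blocking write


def sink(inp: queue.Queue, count: int, results: List[int]) -> None:
    """Process that collects the output stream."""
    for _ in range(count):
        results.append(inp.get())


def run_network(count: int = 5) -> List[int]:
    # queue.Queue() without maxsize is unbounded, matching the FIFO assumption.
    a_to_c, b_to_c, c_to_d = queue.Queue(), queue.Queue(), queue.Queue()
    results: List[int] = []
    processes = [
        threading.Thread(target=source, args=(range(count), a_to_c)),
        threading.Thread(target=source, args=(range(0, 10 * count, 10), b_to_c)),
        threading.Thread(target=adder, args=(a_to_c, b_to_c, c_to_d, count)),
        threading.Thread(target=sink, args=(c_to_d, count, results)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return results
```

No schedule is specified anywhere; the operating system is free to interleave the four threads. The output stream is nevertheless the same on every run, because a process can only wait on one specific channel at a time and never tests whether a channel is empty.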


Synchronous Dataflow (SDF)

• Network of concurrently executing actors
  • Passive actors
  • Communication is buffered
• A model progresses as a sequence of “iterations.”
• A “firing rule” determines the firing condition of an actor.
• At each firing, a fixed number of tokens is consumed and produced.

Characteristics of SDF:
• Compile time analyzable: memory, schedule, speed
• Static dataflow

Schedule: ABBBC

[Figure: actors A, B, C, and D connected through ports; each port is annotated with a fixed token rate of 1 or 3. Tokens pass from a sending port (fire { … send(); … }) to a receiving port (fire { … get(); … }).]
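Because the token rates are fixed, the number of firings per iteration and a valid schedule can be computed before the model runs. The sketch below does this in Python for a three-actor chain. The rates are an assumption: the extracted figure does not show which rate belongs to which port, so they were picked such that the result reproduces the slide's schedule ABBBC. The function names are hypothetical, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import gcd
from typing import Dict, List, Tuple

# (producer, tokens produced per firing, consumer, tokens consumed per firing)
Edge = Tuple[str, int, str, int]

# Assumed rates: A produces 3 tokens, B moves 1 token at a time, C consumes 3.
EDGES: List[Edge] = [("A", 3, "B", 1), ("B", 1, "C", 3)]


def repetition_vector(edges: List[Edge]) -> Dict[str, int]:
    """Solve the balance equations q[src] * produced == q[dst] * consumed."""
    rates: Dict[str, Fraction] = {edges[0][0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, prod, dst, cons in edges:
            if src in rates and dst not in rates:
                rates[dst] = rates[src] * prod / cons
                changed = True
            elif dst in rates and src not in rates:
                rates[src] = rates[dst] * cons / prod
                changed = True
            elif src in rates and dst in rates:
                if rates[src] * prod != rates[dst] * cons:
                    raise ValueError("inconsistent rates: no static schedule exists")
    scale = 1
    for r in rates.values():  # least common multiple of all denominators
        scale = scale * r.denominator // gcd(scale, r.denominator)
    return {actor: int(r * scale) for actor, r in rates.items()}


def static_schedule(edges: List[Edge]) -> str:
    """Build one iteration by firing any actor whose firing rule is satisfied."""
    remaining = repetition_vector(edges)
    tokens = [0] * len(edges)  # tokens currently buffered on each edge
    schedule: List[str] = []
    while any(remaining.values()):
        for actor in sorted(remaining):
            ready = all(tokens[i] >= cons
                        for i, (_, _, dst, cons) in enumerate(edges) if dst == actor)
            if remaining[actor] and ready:
                for i, (src, prod, dst, cons) in enumerate(edges):
                    if dst == actor:
                        tokens[i] -= cons
                    if src == actor:
                        tokens[i] += prod
                remaining[actor] -= 1
                schedule.append(actor)
                break
        else:
            raise RuntimeError("deadlock: no actor can fire")
    return "".join(schedule)
```

With the assumed rates, `repetition_vector(EDGES)` yields one firing of A, three of B, and one of C, and `static_schedule(EDGES)` lists them in an order that never reads from an empty buffer. The largest value seen in `tokens` during the loop would also give the buffer sizes, which is the memory part of the compile-time analysis.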


Codesign Finite State Machine (CFSM)

• Network of concurrently executing actors
  • Passive actors
  • Synchronous locally
  • Asynchronous globally
• An “event” causes the evaluation (firing) of an FSM.

Characteristics of CFSM:
• Compile time analyzable
• Reactive systems

[Figure: FSMs A, B, C, and D; a token, shown as a timed event, travels from the port of one FSM to the port of another]


Finite State Machine (FSM)

• More efficient way to describe sequential control.
• Formal semantics, which allows for verifying various properties like safety, liveness, and fairness.
• An FSM may have only one state active at a time.
• An FSM has only a finite number of states.

[Figure: FSM with the states OFF, WAIT, and ALARM and the ports Port_KEY, Port_BELT, Port_END, Port_START, and Port_ALARM. Transition labels (condition => action):
• KEY=ON => START
• KEY=OFF or BELT=ON => ALARM=OFF
• END=5 => ALARM=ON
• END=10 or BELT=ON or KEY=OFF => ALARM=OFF]
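The FSM in the figure is small enough to write out as a transition function. The Python sketch below does so under one important assumption: the extracted text only gives the transition labels, not which states they connect, so the source and target states are filled in the way the widely used seat-belt alarm example is usually drawn (OFF to WAIT on KEY=ON, WAIT to ALARM on END=5, and back to OFF otherwise). The function and variable names are hypothetical.

```python
from typing import Dict, List, Tuple

Inputs = Dict[str, object]  # values present on Port_KEY, Port_BELT, Port_END


def step(state: str, inputs: Inputs) -> Tuple[str, List[str]]:
    """Evaluate one firing of the FSM; returns (next state, emitted actions)."""
    key_on = inputs.get("KEY") == "ON"
    key_off = inputs.get("KEY") == "OFF"
    belt_on = inputs.get("BELT") == "ON"
    end = inputs.get("END")

    if state == "OFF" and key_on:
        return "WAIT", ["START"]            # KEY=ON => START
    if state == "WAIT":
        if key_off or belt_on:
            return "OFF", ["ALARM=OFF"]     # KEY=OFF or BELT=ON => ALARM=OFF
        if end == 5:
            return "ALARM", ["ALARM=ON"]    # END=5 => ALARM=ON
    if state == "ALARM" and (end == 10 or belt_on or key_off):
        return "OFF", ["ALARM=OFF"]         # END=10 or BELT=ON or KEY=OFF => ALARM=OFF
    return state, []                        # no transition enabled: stay put


def run(events: List[Inputs], state: str = "OFF") -> Tuple[List[str], List[str]]:
    """Feed a sequence of events to the FSM and record states and actions."""
    states, actions = [], []
    for inputs in events:
        state, emitted = step(state, inputs)
        states.append(state)
        actions.extend(emitted)
    return states, actions
```

For instance, `run([{"KEY": "ON"}, {"END": 5}, {"BELT": "ON"}])` walks through WAIT and ALARM and back to OFF. Only one state is active at any time because `step` returns exactly one next state.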


Model of Architecture

A Model of Architecture is a formal representation of the operational semantics of networks of functional blocks describing architectures.

A Model of Architecture is similar to a Model of Computation, but the focus is on the architecture instead of on the applications.

A, B, C, and D are now hardware resources like CPUs, busses, memory, and dedicated coprocessors.

[Figure: network of four blocks A, B, C, and D]


Examples

Control dominated tasks
• Sequential

Control/data tasks
• Sequential
• Centralized computation
• Mutually exclusive [1]

Data dominated tasks
• Parallel / DMA
• Data flow
• Distributed computation

Models of architecture are less mature than MoC.

[Figure: example architectures ordered along a complexity axis from low to high: a CPU with bus and memory (shown twice), and a programmable communication network connecting a CPU, processing elements (PE1, PE2, PE3, PE), and memory]

[1] Wolf, Wayne. A Decade of Hardware/Software Codesign. IEEE Computer, vol. 36, April 2003, pages 38-43.


Conclusion: Matching Models

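[Figure: the Application with its Model of Computation on one side, the Architecture with its Model of Architecture on the other, and the Data Type between them; the connection is labeled Natural Fit]

When the MoC and MoA match, a simple mapping results.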


Putting It Together Example 1.

Platform: microprocessor, “von Neumann architecture”

[Figure: the Y-chart instantiated for a microprocessor:
• Architecture Instances: MicroProcessor, given as a Machine Description (Pentium/Arm, MIPS/Alpha)
• Applications: SPECInt Benchmarks
• Mapping: Compiler (GCC)
• Result: Performance Numbers]


Putting It Together Example 1.

Application (Picture in Picture):

    for i=1:1:10
      for j=1:1:10
        A(i,j) = FIR();
      end
    end
    for i=1:1:10,
      for j=1:1:10,
        A(i,j) = SRC( A(i,j) );
      end
    end

Model of Computation:
• Sequential
• Shared Memory

Model of Architecture:
• Sequential (Program Counter)
• One item over the bus at a time
• Shared Memory

Natural FIT

[Figure: a microprocessor with Program Counter, Memory (address), Instruction Decoder, and ALU. The Y-chart here consists of the Picture in Picture application, the Micro Processor, a Compiler, a Simulator, and Performance Numbers.]


Putting It Together Example 2.

[Figure: the Y-chart (Applications, Architecture Instance, Mapping, Performance Analysis, Performance Numbers) with a heterogeneous architecture as the architecture instance: microprocessors (CPU), RPUs, IP cores, and memories on a programmable interconnect (NoC)]

Matlab Code (QR Algorithm):

    %parameter N 8 16;
    %parameter K 100 1000;

    for k = 1:1:K,
      for j = 1:1:N,
        [ r(j,j), x(k,j), t ] = Vectorize( r(j,j), x(k,j) );
        for i = j+1:1:N,
          [ r(j,i), x(k,i), t ] = Rotate( r(j,i), x(k,i), t );
        end
      end
    end

Model of Computation:
• Imperative
• Sequential Execution
• Global Memory

Model of Architecture:
• Task Level Parallelism
• Heterogeneity
• Distributed Memory

NO Natural FIT
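To see what the nested loop computes, it helps to run it. The Python sketch below mirrors the loop structure of the Matlab code, with zero-based indices. The slide does not define `Vectorize` and `Rotate`, so their bodies are an assumption: they are written as the two halves of a Givens rotation, which is the usual way a QR update of this shape is built. Starting R at zero and the function names are likewise choices made for this illustration.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Angle = Tuple[float, float]  # (cosine, sine) of the rotation


def vectorize(r: float, x: float) -> Tuple[float, float, Angle]:
    """Assumed semantics: find the rotation that zeroes x against r."""
    norm = math.hypot(r, x)
    if norm == 0.0:
        return 0.0, 0.0, (1.0, 0.0)
    return norm, 0.0, (r / norm, x / norm)


def rotate(r: float, x: float, t: Angle) -> Tuple[float, float, Angle]:
    """Assumed semantics: apply the rotation t to the pair (r, x)."""
    c, s = t
    return c * r + s * x, -s * r + c * x, t


def qr_update(samples: Matrix, n: int) -> Matrix:
    """Run the nested loop over K rows of N samples; returns the N x N matrix r."""
    x = [row[:] for row in samples]        # work on a copy of the input
    r = [[0.0] * n for _ in range(n)]      # assumption: r starts at zero
    for k in range(len(x)):
        for j in range(n):
            r[j][j], x[k][j], t = vectorize(r[j][j], x[k][j])
            for i in range(j + 1, n):
                r[j][i], x[k][i], t = rotate(r[j][i], x[k][i], t)
    return r
```

Every call reads and writes `r`, `x`, and `t` in one global address space and in one fixed order. That is exactly the imperative, sequential, global-memory model listed above, even though many `rotate` calls in the inner loop are independent of each other.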


Putting It Together Example 2.

[Figure: the Y-chart (Applications, Architecture Instance, Mapping, Performance Analysis, Performance Numbers) with the same heterogeneous architecture as the architecture instance. The application is now a process network with the nodes Source, P1, P2, P3, P4, S1, and Sink.]

Model of Computation:
• Process Networks
• Distributed Memory
• Distributed Control

Model of Architecture:
• Task Level Parallelism
• Heterogeneity
• Distributed Memory

Natural FIT


Our Research Focus….

[Figure: the programming flow from application to architecture.
• Application: the Matlab code of the QR algorithm (the nested loop over k, j, and i with Vectorize and Rotate, parameters N and K)
• Compaan: turns the Matlab code into a Process Network (Source, P1, P2, P3, P4, S1, Sink)
• Laura / ESPAM: take the process network to the architecture
• Architecture: microprocessors (CPU), RPUs, IP cores, and memories on a programmable interconnect (NoC)]


Other Examples

Other examples of the “Natural Fit” concept:
• VSP architecture, CSDF
• Tangram, CSP
• Polis system, CFSM


Conclusions

• We will get billion transistor ICs.
• We already know how to build them, but not how to program them.
• Current compilers do not take into account the notion of “natural fit” in terms of models of computation and models of architecture.
• New compiler research is needed that takes into account different models of computation.