Transcript
Page 1:

Bridging the gap between asynchronous design and designers

Thanks to Jordi Cortadella, Luciano Lavagno, Mike Kishinevsky and many others

Page 2:

Outline

1. Basic concepts on asynchronous circuit design
2. Logic synthesis from concurrent specifications
3. Design automation for asynchronous circuits

Page 3:

Basic concepts on asynchronous circuit design

Page 4:

Outline

What is an asynchronous circuit?
Asynchronous communication
Asynchronous design styles (Micropipelines)
Asynchronous logic building blocks
Control specification and implementation
Delay models and classes of async circuits
Why asynchronous circuits?

Page 5:

Synchronous circuit

[Figure: four registers (R) interleaved with three combinational logic blocks (CL), all clocked by CLK]

Implicit (global) synchronization between blocks
Clock period > Max Delay (CL + R)

Time is an independent physical variable (quantity)
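The clock-period constraint above can be illustrated with a small calculation. This is only a sketch: the stage delays are made-up numbers, and the names `stages`, `min_clock_period` and `slack_per_stage` are hypothetical, not taken from the slides.

```python
# Hypothetical per-stage delays in nanoseconds (illustrative values only).
stages = [
    {"cl": 3.2, "reg": 0.4},
    {"cl": 5.1, "reg": 0.4},
    {"cl": 2.7, "reg": 0.4},
]


def min_clock_period(stages):
    """Return the largest CL + R delay over all stages.

    The clock period must exceed this value for every stage to settle.
    """
    return max(s["cl"] + s["reg"] for s in stages)


def slack_per_stage(stages, period):
    """Return how long each stage sits idle when clocked at `period`."""
    return [period - (s["cl"] + s["reg"]) for s in stages]


bound = min_clock_period(stages)
print("lower bound on clock period (ns):", bound)
print("idle time per stage at that period (ns):", slack_per_stage(stages, bound))
```

The idle time of the faster stages is what a global clock cannot recover, since every stage is paced by the slowest one.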

Page 6:

Asynchronous circuit

[Figure: four registers (R) interleaved with three combinational logic blocks (CL), with Req and Ack signals instead of a clock]

Explicit (local) synchronization: Req / Ack handshakes

Time = events + quantity
Time does not exist if nothing happens (Aristotle)

Page 7:

Motivation for asynchronous

Asynchronous design is often unavoidable:
  Asynchronous interfaces, arbiters etc.

Modern clocking is multi-phase and distributed, and virtually ‘asynchronous’ (cf. GALS, next slide):
  Mesochronous (clock travels together with data)
  Local (possibly stretchable) clock generation

Robust asynchronous design flow is coming (e.g. VLSI programming from Philips, NCL from Theseus Logic, fine-grain pipelining from Fulcrum)

Page 8:

Globally Async Locally Sync (GALS)

[Figure: a clocked domain (R, CL, R with a local CLK) inside an async-to-sync wrapper, connected to the asynchronous world through the handshake signals Req1/Ack1, Req2/Ack2, Req3/Ack3 and Req4/Ack4]

Page 9:

Key Design Differences

Synchronous logic design:
  Proceeds without taking timing correctness (hazards, signal ack-ing etc.) into account
  Combinational logic and memory latches (registers) are built separately
  Static timing analysis of CL is sufficient to determine the Max Delay (clock period)
  Fixed set-up and hold conditions for latches

Page 10:

Key Design Differences

Asynchronous logic design:
  Must ensure hazard-freedom, signal ack-ing, local timing constraints
  Combinational logic and memory latches (registers) are often mixed in “complex gates”
  Dynamic timing analysis of logic is needed to determine relative delays between paths

To avoid complex issues, circuits may be built as Delay-insensitive and/or Speed-independent (Muller’s theory vs Huffman asynchronous automata)

Page 11:

Verification and Testing Differences

Synchronous logic verification and testing:
  Only the functional correctness aspect is verified and tested
  Testing can be done with standard ATE and at low speed

Asynchronous logic verification and testing:
  In addition to functional correctness, the temporal aspect is crucial: e.g. causality and order, deadlock-freedom
  Testing must cover faults in complex gates (logic+memory) and must proceed at normal operation rate
  Delay fault testing may be needed

Page 12:

Synchronous communication

Clock edges determine the time instants where data must be sampled
Data wires may glitch between clock edges (set-up/hold times must be satisfied)
Data are transmitted at a fixed rate (clock frequency)

[Figure: waveform carrying the bit sequence 1 1 0 0 1 0]

Page 13:

Dual rail

Two wires with L (low) and H (high) per bit
  “LL” = “spacer”, “LH” = “0”, “HL” = “1”
n-bit data communication requires 2n wires
Each bit is self-timed
Other delay-insensitive codes exist (e.g. k-of-n) and event-based signalling (choice criteria: pin and power efficiency)

[Figure: dual-rail waveform carrying the bit sequence 1 1 0 0 1 0]
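The dual-rail code above can be sketched in a few lines. The wire-pair encoding follows the slide ("LL" spacer, "LH" 0, "HL" 1); the function names `encode_word` and `decode_word` are hypothetical, and treating "HH" as an error is an assumption made here for illustration.

```python
SPACER = ("L", "L")
ENCODE = {0: ("L", "H"), 1: ("H", "L")}
DECODE = {pair: bit for bit, pair in ENCODE.items()}


def encode_word(bits):
    """Encode a list of bits as one wire pair per bit (2n wires for n bits)."""
    return [ENCODE[b] for b in bits]


def decode_word(pairs):
    """Decode a list of wire pairs.

    Returns None while any bit is still a spacer (the word is not complete),
    and raises ValueError on the unused code word "HH" (assumed illegal here).
    """
    if any(p == ("H", "H") for p in pairs):
        raise ValueError("HH is not a valid dual-rail code word")
    if any(p == SPACER for p in pairs):
        return None
    return [DECODE[p] for p in pairs]


word = encode_word([1, 1, 0, 0, 1, 0])
print(word)
print(decode_word(word))
```

Because each bit announces its own validity, the receiver needs no separate clock or strobe to know when the word has arrived.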

Page 14:

Bundled data

Validity signal
  Similar to an aperiodic local clock
n-bit data communication requires n+1 wires
Data wires may glitch when not valid
Signaling protocols
  level sensitive (latch)
  transition sensitive (register): 2-phase / 4-phase

[Figure: bundled-data waveform carrying the bit sequence 1 1 0 0 1 0]

Page 15:

Example: memory read cycle

Transition signaling, 4-phase

[Figure: timing diagram of the Address bus (valid address) with signal A and the Data bus (valid data) with signal D]

Page 16:

Example: memory read cycle

Transition signaling, 2-phase

[Figure: timing diagram of the Address bus (valid address) with signal A and the Data bus (valid data) with signal D]
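The difference between the 4-phase and 2-phase variants of the memory read example can be sketched as event sequences. This is an illustration under assumptions: the generic names `req` and `ack` stand in for the slides' signals A and D, and the strictly alternating order is the simplest possible interleaving, not necessarily the one drawn in the timing diagrams.

```python
def four_phase(n_transfers):
    """Return-to-zero signaling: four transitions per transfer."""
    events = []
    for _ in range(n_transfers):
        events += ["req+", "ack+", "req-", "ack-"]
    return events


def two_phase(n_transfers):
    """Non-return-to-zero signaling: every transition carries meaning,
    so each transfer needs only one req and one ack transition."""
    events = []
    req = ack = 0
    for _ in range(n_transfers):
        req ^= 1
        events.append("req+" if req else "req-")
        ack ^= 1
        events.append("ack+" if ack else "ack-")
    return events


print(four_phase(2))
print(two_phase(2))
```

With the same four transitions, the 2-phase protocol completes two transfers where the 4-phase protocol completes one, at the cost of logic that must react to both rising and falling edges.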

Page 17:

Asynchronous modules

Signaling protocol:
  reqin+ start+ [computation] done+ reqout+ ackout+ ackin+
  reqin- start- [reset] done- reqout- ackout- ackin-
(more concurrency is also possible)

[Figure: module with a DATAPATH block (Data IN, Data OUT) and a CONTROL block (req in, ack in, req out, ack out), linked by the start and done signals]
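The fully sequential signaling protocol of the module can be written down as a cyclic event list and used to check a trace. A minimal sketch: the event names come from the slide, while `CYCLE` and `follows_protocol` are hypothetical names, and the checker covers only the strictly sequential ordering (the slide notes that more concurrent orderings are also possible).

```python
# One full cycle of the sequential protocol: evaluation phase, then reset phase.
CYCLE = [
    "reqin+", "start+", "done+", "reqout+", "ackout+", "ackin+",
    "reqin-", "start-", "done-", "reqout-", "ackout-", "ackin-",
]


def follows_protocol(trace):
    """Return True if `trace` is a prefix of the endlessly repeated CYCLE."""
    return all(event == CYCLE[i % len(CYCLE)] for i, event in enumerate(trace))


print(follows_protocol(CYCLE * 2))            # two complete cycles
print(follows_protocol(["reqin+", "done+"]))  # done+ before start+
```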

Page 18:

Asynchronous latches: C element

[Figure: C-element symbol with inputs A, B and output Z]

A B | Z+
0 0 | 0
0 1 | Z
1 0 | Z
1 1 | 1

[Figure: static logic implementation, a transistor schematic between Vdd and Gnd with inputs A, B and output Z [van Berkel 91]]
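The truth table of the C element translates directly into a next-state function. A minimal behavioural sketch (not a model of the transistor circuit); the names `c_element` and `run` are hypothetical.

```python
def c_element(a, b, z):
    """Next output Z+ of a C element with inputs a, b and current output z.

    The output copies the inputs when they agree and holds its value otherwise.
    """
    if a == b:
        return a
    return z


def run(input_pairs, z=0):
    """Apply a sequence of (a, b) input pairs and collect the outputs."""
    outputs = []
    for a, b in input_pairs:
        z = c_element(a, b, z)
        outputs.append(z)
    return outputs


# Output rises only after both inputs rise, and falls only after both fall.
print(run([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]))
```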

Page 19:

C-element: Other implementations

[Figure: two transistor schematics between Vdd and Gnd with inputs A, B and output Z; labels: Dynamic, Quasi-Static, Weak inverter]

Page 20:

Dual-rail logic

[Figure: dual-rail AND gate with inputs A.t, A.f, B.t, B.f and outputs C.t, C.f]

Valid behavior for monotonic environment
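One common way to build a dual-rail AND gate is C.t = A.t AND B.t and C.f = A.f OR B.f. The sketch below assumes that form, since the gate-level drawing did not survive in the transcript; the `(t, f)` tuple representation and the name `dual_rail_and` are hypothetical.

```python
# Each signal is a (t, f) pair of rails.
SPACER, ZERO, ONE = (0, 0), (0, 1), (1, 0)


def dual_rail_and(a, b):
    """Dual-rail AND, assuming C.t = A.t & B.t and C.f = A.f | B.f."""
    a_t, a_f = a
    b_t, b_f = b
    return (a_t & b_t, a_f | b_f)


for a in (SPACER, ZERO, ONE):
    for b in (SPACER, ZERO, ONE):
        print(a, b, "->", dual_rail_and(a, b))
```

With this form a single input at "0" already yields a valid "0" at the output while the other input is still a spacer, which is one reason the gate only behaves correctly if the environment changes its inputs monotonically (spacer to valid, then valid to spacer).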

Page 21:

Completion detection

[Figure: dual-rail logic block whose outputs feed a completion detection tree, ending in a C element that produces done]
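Completion detection for a dual-rail word can be sketched as follows: each bit is valid when one of its rails is high, and C elements combine the per-bit validity signals so that done rises only when every bit is valid and falls only when every bit has returned to the spacer. This sketch uses a linear chain of C elements rather than a balanced tree, purely for brevity; the class name `CompletionDetector` is hypothetical.

```python
def c_element(a, b, z):
    """C element: output follows the inputs when they agree, else holds z."""
    return a if a == b else z


class CompletionDetector:
    """Chain of C elements over the per-bit validity signals (t OR f)."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        # Stored output of each C element in the chain (n_bits - 1 of them).
        self.state = [0] * max(n_bits - 1, 0)

    def update(self, word):
        """Feed the current (t, f) pairs and return the done signal."""
        valid = [t | f for t, f in word]
        acc = valid[0]
        for i in range(1, self.n_bits):
            acc = c_element(acc, valid[i], self.state[i - 1])
            self.state[i - 1] = acc
        return acc


det = CompletionDetector(3)
print(det.update([(0, 0), (0, 0), (0, 0)]))  # all spacers
print(det.update([(1, 0), (0, 0), (0, 1)]))  # one bit still missing
print(det.update([(1, 0), (0, 1), (0, 1)]))  # whole word valid
print(det.update([(0, 0), (0, 1), (0, 1)]))  # partially reset
print(det.update([(0, 0), (0, 0), (0, 0)]))  # fully reset
```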

Page 22:

Differential cascode voltage switch logic

[Figure: 3-input AND/NAND gate: N-type transistor network with inputs A.t, B.t, C.t and A.f, B.f, C.f, outputs Z.t and Z.f, a start signal and a done signal]

Page 23:

Examples of dual-rail design

Asynchronous dual-rail ripple-carry adder (A. Martin, 1991)
  Critical delay is proportional to log N (N = number of bits)
  32-bit adder delay (1.6 µm MOSIS CMOS): 11 ns versus 40 ns for synchronous
  Async cell transistor count = 34 versus synchronous = 28

More recent success stories (modularity and automatic synthesis) of dual-rail logic: Null Convention Logic from Theseus Logic

Page 24:

Bundled-data logic blocks

[Figure: single-rail logic block, with a delay element connecting start to done]

Conventional logic + matched delay

Page 25:

Micropipelines (Sutherland 89)

[Figure: micropipeline (2-phase) control blocks: Join (C), Merge, Toggle (in, out0, out1), Select (in, sel, outt, outf), Call (r1, r2, ra, a1, a2) and Request-Grant-Done (RGD) Arbiter (r1, r2, g1, g2, d1, d2)]

Page 26:

Micropipelines (Sutherland 89)

[Figure: four latches (L) interleaved with three logic blocks, controlled by C elements and delay elements between Rin/Aout and Rout/Ain]

Page 27:

Data-path / Control

[Figure: four latches (L) interleaved with three logic blocks, driven by a CONTROL block with signals Rin, Aout, Rout and Ain]

Synthesis of control is a major challenge

Page 28:

Control specification

[Figure: specification with events A+, B+, A-, B- and a circuit with signals A and B]

A input, B output

Page 29:

Control specification

[Figure: specification with events A+, B-, A-, B+ and a circuit with signals A and B]

Page 30:

Control specification

[Figure: specification with events A+, B+, C-, A-, B-, C+ and circuits with signals A, B and C]

Page 31:

Control specification

[Figure: specification with events A+, B+, C-, A-, B-, C+ and circuits with signals A, B and C]

Page 32:

Control specification

[Figure: FIFO controller (FIFO cntrl) with signals Ri, Ao, Ro, Ai, a circuit labelled C, and a specification with events Ri+, Ao+, Ri-, Ao-, Ro+, Ai+, Ro-, Ai-]

Page 33:

Gate vs wire delay models

Gate delay model: delays in gates, no delays in wires
Wire delay model: delays in gates and wires

Page 34:

Delay models for async. circuits

Bounded delays (BD): realistic for gates and wires.
  Technology mapping is easy, verification is difficult
Speed independent (SI): Unbounded (pessimistic) delays for gates and “negligible” (optimistic) delays for wires.
  Technology mapping is more difficult, verification is easy
Delay insensitive (DI): Unbounded (pessimistic) delays for gates and wires.
  DI class (built out of basic gates) is almost empty
Quasi-delay insensitive (QDI): Delay insensitive except for critical wire forks (isochronic forks).
  In practice it is the same as speed independent

[Figure: diagram of the circuit classes BD, SI, QDI and DI]

Page 35:

Environment models

Slow enough environment = Fundamental mode
  (Inputs change AFTER system has settled)
Reactive environment = I/O mode
  (Inputs may change once the first output changes)

Page 36:

Correctness of a circuit wrt delay assumptions

C-element: z = ab + zb + za

[Figure: two circuit diagrams with inputs a, b and output z]

Page 37:

Motivation (designer’s view)

Modularity for system-on-chip design
  Plug-and-play interconnectivity
Average-case performance
  No worst-case delay synchronization
Many interfaces are asynchronous
  Buses, networks, ...

Page 38:

Motivation (technology aspects)

Low power
  Automatic clock gating
Electromagnetic compatibility
  No peak currents around clock edges
Security
  No ‘electro-magnetic difference’ between logical ‘0’ and ‘1’ in dual rail code
Robustness
  High immunity to technology and environment variations (temperature, power supply, ...)

Page 39:

Resistance

Concurrent models for specification
  CSP, Petri nets, ...: no more FSMs
Difficult to design
  Hazards, synchronization
Complex timing analysis
  Difficult to estimate performance
Difficult to test
  No way to stop the clock

Page 40:

But ... some successful stories

Philips
AMULET microprocessors
Sharp
Intel (RAPPID)
Start-up companies:
  Theseus Logic, Fulcrum, Self-Timed Solutions

Recent blurb: It's Time for Clockless Chips, by Claire Tristram (MIT Technology Review, v. 104, no. 8, October 2001: http://www.technologyreview.com/magazine/oct01/tristram.asp) …

