VLSI Architectures 048878, Lecture 3: Handshake Ckt Implementations (S&F Ch. 5)

© 2003-2009 Ran Ginosar

Post on 19-Dec-2015



Implementations

• We only consider simple circuits

• More aggressive circuits will come later

• First, reminder on latches


4- & 2-phase bundled data latches

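[Figure: two bundled-data pipelines. 4-phase: three LATCH stages with EN inputs, each controlled by a C-element in the REQ/ACK handshake path. 2-phase: LATCH stages with C and P control inputs, with C-elements in the REQ/ACK path.]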


4-phase dual rail – many bits

[Figure: 4-phase dual-rail pipeline for a 2-bit channel. Each rail (d[0].t, d[0].f, d[1].t, d[1].f) passes through a C-element per stage; ACK is returned between stages.]


4-phase Fork, Join

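[Figure: table with columns COMPONENT, 4-phase bundled data, 4-phase dual-rail.

Fork (x to y and z): the bundled-data version uses x-req, y-req, z-req and a C-element combining y-ack and z-ack into x-ack; the dual-rail version uses rails x.t, x.f, y.t, y.f, z.t, z.f and a C-element combining y-ack and z-ack into x-ack.

Join (wait for all; x and y to z): the bundled-data version uses a C-element on x-req, y-req, z-req, with data x as z0 and y as z1 and with z-ack returned as x-ack and y-ack; the dual-rail version maps x.t, x.f to z0.t, z0.f and y.t, y.f to z1.t, z1.f, with z-ack returned as x-ack and y-ack.]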
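The "wait for all" behaviour of the Join can be sketched with a behavioural model of the Muller C-element. This is a minimal sketch, not the slide's circuit: the class name `CElement` and the assumption that z-req is formed as C(x-req, y-req) are illustrative.

```python
class CElement:
    """Muller C-element: the output follows the inputs when they all agree,
    and holds its previous value otherwise."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, *inputs):
        if all(v == 1 for v in inputs):
            self.out = 1
        elif all(v == 0 for v in inputs):
            self.out = 0
        # Inputs disagree: keep the stored value.
        return self.out


# Assumed join: z_req = C(x_req, y_req), one 4-phase cycle.
join = CElement()
sequence = [(1, 0), (1, 1), (0, 1), (0, 0)]  # (x_req, y_req)
trace = [join.update(x, y) for x, y in sequence]
print(trace)  # [0, 1, 1, 0]
```

z-req rises only after both requests have risen, and falls only after both have returned to zero.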


4-phase Bundled-data Mux

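[Figure: 4-phase bundled-data MUX symbol (data channels x, y, z; control ctl selecting input 0 or 1) and its handshake circuit, built from C-elements on x-req, y-req, z-req, x-ack, y-ack, z-ack, the control rails ctl.t and ctl.f, and ctl-ack.]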


4-phase Bundled-data Demux

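[Figure: 4-phase bundled-data DEMUX symbol (data channels x, y, z; control ctl selecting output 0 or 1) and its handshake circuit, built from two C-elements on x-req, y-req, z-req, x-ack, y-ack, z-ack, the control rails ctl.t and ctl.f, and ctl-ack.]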


4-phase Merge

[Figure: table row for Merge (wait for one) with columns COMPONENT, 4-phase bundled data, 4-phase dual-rail. Labels shown: x-req, y-req, z-req, x-ack, y-ack, z-ack, rails x.t, x.f, y.t, y.f, z.t, z.f, several C-elements, and two blocks marked CD.]


4-phase Merge

• Inputs must be mutually exclusive. This is guaranteed elsewhere! (more later..)

• Assume x is active… the C-element sees an input glitch

• Relative timing: x-req < z-ack, which allows simplifying the C-element


Asymmetric C Element

• Useful when we know the relative timing: b < a ⇒ only a is needed to pull up

• Only one pMOS, so it is faster

[Figure: asymmetric C-element symbol and transistor-level circuit, inputs a and b, output c.]


2-phase Merge

• Try it at home…

• This is not an assignment!


Mutual Exclusion: MUTEX

[Figure: MUTEX symbol and circuit. Requests R1, R2; grants G1, G2; internal nodes x1, x2.]
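The mutual-exclusion property can be sketched behaviourally. This is a minimal sketch under stated assumptions: the class `Mutex` and its tie-break rule are hypothetical, and the model ignores metastability, which a real MUTEX resolves only after an unbounded settling time.

```python
class Mutex:
    """Behavioural MUTEX: at most one of G1, G2 is high at any time."""

    def __init__(self):
        self.g1 = 0
        self.g2 = 0

    def update(self, r1, r2):
        # A grant is withdrawn as soon as its request is released.
        if not r1:
            self.g1 = 0
        if not r2:
            self.g2 = 0
        # A new grant is issued only while the other grant is low.
        # Favouring R1 on simultaneous requests is an assumption of this sketch.
        if r1 and not self.g2:
            self.g1 = 1
        elif r2 and not self.g1:
            self.g2 = 1
        return self.g1, self.g2


m = Mutex()
requests = [(1, 0), (1, 1), (0, 1), (1, 1), (1, 0), (0, 0)]  # (R1, R2)
grants = [m.update(r1, r2) for r1, r2 in requests]
print(grants)  # [(1, 0), (1, 0), (0, 1), (0, 1), (1, 0), (0, 0)]
```

The second requester is held off until the first one releases its request, so the two grants never overlap.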


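Standard Gate MUTEXs

[Figure: two standard-gate MUTEX circuits with requests R1, R2, intermediate outputs O1, O2 and grants G1, G2; the second circuit is annotated "Very low threshold".]

The outputs are not fully guaranteed to be mutually exclusive (M/E), but it is highly probable!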


Arbiter

[Figure: ARBITER symbol with channels R1/A1, R2/A2 and R0/A0, and its implementation using a MUTEX (grants G1, G2), two C-elements and internal signals Y1, Y2.]


Arbitrating Merge

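[Figure: ARB-MERGE symbol (inputs x, y; output z) and its implementation: a MUTEX on x-req and y-req producing Gx and Gy, two C-elements producing Fx and Fy, handshake signals z-req, z-ack, x-ack, y-ack, and a data path from x and y to z labelled Fx, Fy.]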


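Function Blocks

• We said “transparent” but…

– Need a matched delay for bundled-data

– Need to generate completion for dual-rail

– Need to join inputs, fork outputs

[Figure: equivalence ("=") diagram illustrating joined inputs and forked outputs; not recoverable from the transcript.]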
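Generating completion for dual-rail data amounts to detecting whether every bit is valid or every bit is empty. The following is a minimal sketch; the function name `codeword_state` and the list-of-pairs encoding are assumptions made for illustration.

```python
def codeword_state(bits):
    """Classify a dual-rail word given as a list of (t, f) rail pairs."""
    if any(t and f for t, f in bits):
        raise ValueError("illegal codeword: both rails of a bit are high")
    # Per bit, the OR of the two rails says whether that bit carries a value.
    valid = [bool(t or f) for t, f in bits]
    if all(valid):
        return "valid"
    if not any(valid):
        return "empty"
    # Some bits valid, some empty: a C-element based detector would hold
    # its previous output in this state.
    return "intermediate"


print(codeword_state([(1, 0), (0, 1)]))  # valid
print(codeword_state([(0, 0), (0, 0)]))  # empty
print(codeword_state([(1, 0), (0, 0)]))  # intermediate
```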


Transparency Revisited

• Function blocks must not affect how the latches “shake hands” (except for timing)

[Figure: equivalence ("=") diagram with ack signals; not recoverable from the transcript.]


Indication Revisited

• FB(req_out) means:

– FB(req_in)

– Computation finished, data out ready

• Simple “strong indication” for bundled data (req_in to req_out):

1. All DATA_IN valid

2. REQ_IN

3. Compute

4. All DATA_OUT valid

5. REQ_OUT


Strong vs. Weak Indication

• Strong Indication: All inputs must arrive before any output is allowed (“indicated”).

– Even if some outputs are ready earlier, there is no REQ_OUT, so they cannot be used.

– Implies worst-case latency

• Weak Indication: Some outputs are allowed even before all inputs have arrived

– Only makes sense in dual-rail:


Weak Indication

• No REQ on dual-rail: each bit is “self-indicating”

• May lead to faster circuits

• Example chain of events:

[Figure: example chain of events through a dual-rail (DR) stage with ack; events numbered 1 to 7.]


Composition of FBs

• Legal composition:

– All inputs and outputs are connected

– No cycles

• Legal composition of weakly indicating FBs is weakly indicating

• Legal composition of strongly indicating FBs is strongly indicating


Example: Ripple-carry

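[Figure: n-stage ripple-carry adder. Stage i has inputs ai, bi and carry-in ci, and outputs sum si and carry-out di; stage 1 takes cin and stage n produces cout.]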


Example: Ripple-carry

• Full adder (a,b,c) = (s,d)

– s = a ⊕ b ⊕ c

– d = ab + ac + bc

• Shortcuts for look-ahead (prop, gen, kill):

– p = a ⊕ b, so s = p ⊕ c

– g = ab, so d = g + pc, or d' = k + pc'

– k = a'b'

• Sometimes d can be made valid without waiting for c
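The propagate/generate/kill identities can be checked exhaustively over the eight input combinations. This is a small sketch; the function names `full_adder` and `early_carry` are illustrative and not from the lecture.

```python
from itertools import product


def full_adder(a, b, c):
    """Reference full adder: returns (sum, carry-out)."""
    s = a ^ b ^ c
    d = (a & b) | (a & c) | (b & c)
    return s, d


def early_carry(a, b):
    """Carry-out when it is decided by a and b alone, else None."""
    if a & b:            # generate: d = 1 whatever c is
        return 1
    if not (a | b):      # kill: d = 0 whatever c is
        return 0
    return None          # propagate: d = c, so we must wait for c


for a, b, c in product((0, 1), repeat=3):
    s, d = full_adder(a, b, c)
    p, g, k = a ^ b, a & b, (1 - a) & (1 - b)
    assert s == p ^ c
    assert d == g | (p & c)
    assert (1 - d) == k | (p & (1 - c))
    if p == 0:
        assert early_carry(a, b) == d

print(early_carry(1, 1), early_carry(0, 0), early_carry(1, 0))  # 1 0 None
```

Whenever p = 0 the carry-out depends only on g or k, which is the case where d can be produced without waiting for c.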


Speculative / Strong Ripple Carry

• 16 bit ripple-carry adder, bundled-data

• Longest carry is 16 stages

• But if p8=0 then longest carry is 8 stages

• And if p12 = p8 = p4 = 0, then longest carry is 4 stages

• If willing to trade area and power for speed:


Speculative / Strong Ripple Carry

[Figure: ADDER whose REQ_IN passes through a DELAY SELECTOR that picks a short, medium or long delay to produce REQ_OUT.]
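The selection rule for the delay can be sketched as follows. This is a minimal sketch under assumptions: stages are numbered 1 (LSB) to 16, p_i = a_i XOR b_i, the "short" case requires p4, p8 and p12 to all be zero, and the names `propagate_bits` and `select_delay` are hypothetical.

```python
def propagate_bits(a, b, width=16):
    """Return {i: p_i} for stages 1..width, with p_i = a_i XOR b_i."""
    x = a ^ b
    return {i: (x >> (i - 1)) & 1 for i in range(1, width + 1)}


def select_delay(a, b):
    """Pick the matched delay for a 16-bit bundled-data ripple-carry adder."""
    p = propagate_bits(a, b)
    if p[4] == 0 and p[8] == 0 and p[12] == 0:
        return "short"    # carry chain is broken every 4 stages
    if p[8] == 0:
        return "medium"   # carry chain is broken in the middle
    return "long"         # worst case: carry may ripple through all stages


print(select_delay(0x0000, 0x0000))  # short
print(select_delay(0x0008, 0x0000))  # medium
print(select_delay(0x0080, 0x0000))  # long
```

The extra propagate logic and the three delay lines are the area and power paid for the shorter average latency.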


ST-CL

Based on David, Ginosar, Yoeli, "An Efficient Implementation of Boolean Functions as Self-Timed Circuits," IEEE Trans. Computers, Jan. 1992


Dual-Rail DIMS PLA Notation

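[Figure: dual-rail DIMS gate with inputs a (a.t, a.f) and b (b.t, b.f) and output z (z.t, z.f), built from four C-elements; shown as a gate-level circuit and in the equivalent PLA notation.]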
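A two-input DIMS gate can be modelled with one C-element per input minterm, with the minterm outputs ORed onto z.t and z.f. This is a minimal sketch; the choice of AND as the function and the class names `CElement` and `DimsAnd2` are assumptions made for illustration.

```python
class CElement:
    """Muller C-element: output changes only when all inputs agree."""

    def __init__(self):
        self.out = 0

    def update(self, *inputs):
        if all(v == 1 for v in inputs):
            self.out = 1
        elif all(v == 0 for v in inputs):
            self.out = 0
        return self.out


class DimsAnd2:
    """Dual-rail AND gate in DIMS style: one C-element per (a, b) minterm."""

    def __init__(self):
        self.cells = {(a, b): CElement() for a in (0, 1) for b in (0, 1)}

    def update(self, a_t, a_f, b_t, b_f):
        a_rail = {1: a_t, 0: a_f}
        b_rail = {1: b_t, 0: b_f}
        m = {(a, b): cell.update(a_rail[a], b_rail[b])
             for (a, b), cell in self.cells.items()}
        z_t = m[(1, 1)]                           # AND is true for minterm 11 only
        z_f = m[(0, 0)] | m[(0, 1)] | m[(1, 0)]   # false for the other minterms
        return z_t, z_f


gate = DimsAnd2()
steps = [(1, 0, 0, 0),   # a = 1, b still empty
         (1, 0, 0, 1),   # a = 1, b = 0
         (0, 0, 0, 1),   # a empty, b still valid
         (0, 0, 0, 0),   # both empty
         (1, 0, 1, 0)]   # a = 1, b = 1
outputs = [gate.update(*s) for s in steps]
print(outputs)  # [(0, 0), (0, 1), (0, 1), (0, 0), (1, 0)]
```

The output becomes valid only after both inputs are valid and returns to empty only after both inputs are empty, which is the strongly indicating behaviour.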


Dual-Rail DIMS Adders

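[Figure: dual-rail DIMS full adder ADD with inputs a, b, c and outputs s, d, drawn in PLA notation over rails a.t, a.f, b.t, b.f, c.t, c.f, s.t, s.f, d.t, d.f. One version uses eight C-elements; a second version uses ten C-elements, including GEN and KILL terms.]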

Still slow: LF(V) = LF(E)


Transistor Level DIMS

• Too many P transistors: slow

• Some N paths can be shared:

[Figure: transistor-level networks for d.f, drawn as series stacks gated by the rails a.f, a.t, b.f, b.t, c.f, c.t, before and after sharing common N paths.]


Hybrid Adder

• Dual-rail carry (for flexible latency)

• Bundled-data data inputs and sum output (for lower area and power)

• Data-dependent data-forward (V) latency

• Constant empty-forward (E) latency


Hybrid Adder

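[Figure: hybrid adder. Bundled-data inputs a1, b1 … ai, bi … an, bn with REQ_IN and cin feed a chain of CARRY blocks linked by dual-rail carries (c1.t/c1.f, d1.t/d1.f, …, ci, di, …, cn, dn, cout) and a row of SUM blocks producing s1 … si … sn. A completion detector (CD) and a C-element generate REQ_OUT. The carry path is marked dual-rail and the data path bundled-data.]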


Domino Logic Dual Rail

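[Figure: dual-rail domino logic gate controlled by REQ_IN, with outputs f and t; REQ_OUT is marked with a question mark.]

REQ_OUT is generated either by (flexible) completion detection or by a matched (worst-case) delay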


Hybrid Adder: Sum Ckt

[Figure: sum circuit of the hybrid adder, controlled by REQ_IN, with inputs a, b and carry rails c.t, c.f, and output s.]


Hybrid Adder: Two Carry Ckts

[Figure: two carry circuits controlled by REQ_IN, each with inputs a, b, carry rails c.t, c.f and outputs d.t, d.f. One is labelled Weak Indication and the other Strong Indication; KILL and GEN paths are marked.]


Hybrid Adder: Two Carry Ckts

[Figure: carry chain of stages 1 to 6 (and beyond) mixing WEAK CARRY and STRONG CARRY blocks, feeding a completion detector (CD).]

Slightly faster…