Post on 15-Jan-2016

Delay/Phase Regeneration Circuits

Crescenzo D’Alessandro, Andrey Mokhov, Alex Bystrov, Alex Yakovlev

Microelectronics Systems Design Group

School of EECE

Newcastle University, UK

ASYNC 2007 - C D'Alessandro et al. - 2/28

Outline

Introduction

Background on Phase-encoding

Dual-rail/multiple-rail phase encoding

Motivation for the present work

Taxonomy

Latch-based designs

MUTEX-based designs

Design types

Conclusions

Phase Encoding

Dual-Rail

Main idea: encode data on the phase relationship between two identical out-of-phase signals

Resistant to transient faults

Similarity with dual-rail dual-spacer protocol
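The main idea above can be sketched in a few lines of Python. This is only an illustration: the `EdgePair` type, the `encode`/`decode` helpers and the convention that the wire named after the bit leads are assumptions made here, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgePair:
    """Arrival times of the edges on the two wires (arbitrary time units)."""
    t_0: float
    t_1: float


def encode(bit: int, start: float, separation: float) -> EdgePair:
    """Place the leading edge at `start` and the trailing edge `separation` later.

    Assumed convention (not from the slides): the wire named after the bit leads.
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    if separation <= 0:
        raise ValueError("separation must be positive")
    if bit == 0:
        return EdgePair(t_0=start, t_1=start + separation)
    return EdgePair(t_0=start + separation, t_1=start)


def decode(pair: EdgePair) -> int:
    """Recover the bit from the order of the two edges only."""
    if pair.t_0 == pair.t_1:
        raise ValueError("edges coincide: order is undefined")
    return 0 if pair.t_0 < pair.t_1 else 1
```

Because only the order of the edges matters, a delay offset on one wire is harmless as long as it stays smaller than the separation: delaying `t_0` of `encode(0, 0.0, 2.0)` by 1.5 still decodes to 0, while a delay of 2.5 flips the order and the decoded bit.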

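[Figure: waveforms of the ref and data signals with edges t_0 and t_1, annotated “t_1 before t_0” and “t_0 before t_1”; spacers sp0/sp1 and data values 0/1 are marked along the sequence]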

Multiple Rail

No group of wires has the same delay

All wires toggle when an item of data is sent
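Since every wire toggles once per data item, the information is carried by the order of the toggles, and n wires give n! distinct orders (24 for four wires). The sketch below maps a toggle order to an integer and back using lexicographic ranking. The ranking scheme and the function names are assumptions for illustration; the slides do not state the mapping, although with this scheme the orders `cdba` and `dbac` rank as 17 and 20, the same numbers that appear in the slide's figure.

```python
from math import factorial


def order_to_index(order: str, wires: str = "abcd") -> int:
    """Lexicographic rank of a toggle order among all permutations of `wires`."""
    if sorted(order) != sorted(wires):
        raise ValueError("order must use each wire exactly once")
    remaining = sorted(wires)
    index = 0
    for position, wire in enumerate(order):
        rank = remaining.index(wire)  # how many unused wires sort before this one
        index += rank * factorial(len(order) - 1 - position)
        remaining.pop(rank)
    return index


def index_to_order(index: int, wires: str = "abcd") -> str:
    """Inverse of order_to_index."""
    remaining = sorted(wires)
    if not 0 <= index < factorial(len(remaining)):
        raise ValueError("index out of range for this number of wires")
    out = []
    for position in range(len(remaining), 0, -1):
        rank, index = divmod(index, factorial(position - 1))
        out.append(remaining.pop(rank))
    return "".join(out)


print(order_to_index("cdba"), order_to_index("dbac"))  # prints: 17 20
```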

[Figure: timing diagram of the req, ack, a, b, c and d signals; transition orders cdba, cdba and dbac are shown, with numeric labels 17 and 20]

Phase Corruption

Phase corruption is due to jitter (introduced by the gates), the physical wire fabric and transistor mismatches

Mismatch due to process variations causes a systematic delay offset between the two lines, which could cause decoding errors

Additionally, cross-talk causes symbol-dependent phase corruption

As the wires are always “allies” in terms of cross-talk, the longer the wire, the more corrupted the phase relationship between the wires

What, then, is the optimal wire length that “guarantees” that the phase relationship is maintained?

Phase Corruption

Example of phase corruption

No change in sequence

Change in absolute value of phase

Taxonomy

Different design styles can be identified

This presentation focuses on digital implementations

Latch-based designs

A latch is used on each wire

Gate-level implementation

Transistor-level implementation

MUTEX-based designs

A single MUTEX is used to arbitrate between the two edges

“Early-propagating”

“Merging”

Parameters

Maximum input time separation affected, δmax

Events whose time separation is > δmax retain their original separation

Circuit latency λ

Time between the first event occurring and the corresponding output being generated

Response time ζ

Time between the two events below which the time separation cannot be regenerated

Capture range κ = δmax − ζ

Using the convention sometimes used in PLLs to give a value for the range

Linearity
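The relationship between these parameters can be illustrated with an idealised, perfectly linear regenerator. The sketch below is an assumption-laden model, not one of the circuits in this talk: the class name, the choice to regenerate every captured separation to exactly δmax, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegeneratorModel:
    """Idealised regenerator described by the parameters above (all in FO4)."""
    delta_max: float  # δmax: separations above this pass through unchanged
    zeta: float       # ζ: separations below this cannot be regenerated
    latency: float    # λ: first input event to corresponding output

    def __post_init__(self) -> None:
        if not 0 <= self.zeta <= self.delta_max:
            raise ValueError("expected 0 <= zeta <= delta_max")

    @property
    def capture_range(self) -> float:
        """κ = δmax − ζ."""
        return self.delta_max - self.zeta

    def output_separation(self, delta_in: float) -> Optional[float]:
        """Output time separation for a given input time separation.

        Assumption: inside the capture range the output is a constant equal
        to δmax (a perfectly flat, i.e. perfectly linear, response).
        Returns None when the input is too close together to regenerate.
        """
        if delta_in < 0:
            raise ValueError("delta_in must be non-negative")
        if delta_in < self.zeta:
            return None
        if delta_in <= self.delta_max:
            return self.delta_max
        return delta_in


# Hypothetical numbers, chosen only to exercise the formula.
model = RegeneratorModel(delta_max=8.0, zeta=2.0, latency=2.5)
print(model.capture_range)  # prints: 6.0
```

A real circuit departs from this model in the flat region, and how far it departs is what the linearity parameter captures.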

Graphs

[Figure: Response of latch-based design (2). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise). The plot is annotated with δmax, λ, ζ and κ; linearity is marked as how flat the regenerated part of the curve is]

Passive Solution

“Textbook” solution

Different response for rising/falling – can be matched using balanced drivers

Not very linear

Capacitor size a problem – also introduces latency

[Figure: Response of passive device. x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

[Figure: passive circuit driven by a Sender, with inputs i1, i2 and outputs o1, o2]

Latch-based

Gate level/1

Latches are transparent at startup

They are closed after one edge arrives at the output

They are then reopened after the pulse is finished

6 FO4 capture range, stops working around 5 FO4 input delta

Difference in rising and falling behaviour
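The open/close/reopen behaviour listed above can be captured in a small event-level model. This is a behavioural sketch under stated assumptions, not the gate-level circuit: the function name, the `pulse_width` and `latch_delay` parameters and the rule that a blocked edge emerges exactly when the latches reopen are all hypothetical, and the model ignores the response-time limit ζ.

```python
def latch_regenerate(t_i1: float, t_i2: float,
                     pulse_width: float,
                     latch_delay: float = 0.0) -> dict:
    """Return output edge times for input edges arriving at t_i1 and t_i2.

    Assumed behaviour:
      1. Both latches start transparent, so the first edge passes through.
      2. The latches close when that edge reaches its output.
      3. They reopen `pulse_width` later; an edge that arrived while they
         were closed appears at its output at that moment.
    If both edges arrive at the same time, i1 is treated as the first.
    """
    if pulse_width < 0 or latch_delay < 0:
        raise ValueError("pulse_width and latch_delay must be non-negative")
    inputs = sorted((("o1", t_i1), ("o2", t_i2)), key=lambda edge: edge[1])
    (first_out, t_first), (second_out, t_second) = inputs
    t_first_out = t_first + latch_delay
    reopen = t_first_out + pulse_width
    t_second_out = max(t_second + latch_delay, reopen)
    return {first_out: t_first_out, second_out: t_second_out}


out = latch_regenerate(0.0, 1.0, pulse_width=4.0)
print(out["o2"] - out["o1"])  # prints: 4.0
```

In this model a 1-unit input separation is stretched to the pulse width, while an input separation larger than the pulse width passes through unchanged, which matches the role that δmax plays in the parameter definitions.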

[Figure: circuit with two latches (D, G, Q), inputs i1, i2 and outputs o1, o2]

[Figure: Response of latch-based design (1). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

Latch-based

Gate level/2

Similar to previous design

Two pulse generators – faster

Only blocks one output and not both

Only one output used – less difference between rising and falling edges

[Figure: circuit with two latches (D, G, Q) and two pulse generators, inputs i1, i2 and outputs o1, o2]

[Figure: Response of latch-based design (2). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

Latch-based

Transistor level

Better latency and response

Capture range can be increased by increasing tau

Good linearity

[Figure: transistor-level circuit with inputs i1, i2 and outputs o1, o2]

[Figure: Response of transistor-based design (with keepers). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

MUTEX-based

Higher latency (complex gates)

Good response and capture range

Poor linearity

Early-propagating

[Figure: MUTEX-based circuit with inputs i1, i2 and outputs o1, o2]

[Figure: Response of modified-ME design, shown at two scales (input time separation 0 to 5 FO4 and 0 to 2 FO4). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

MUTEX-based

“Infinite” capture range – lower-bounded

Flat response

Very high latency – dependent on input time separation

NOR-MUTEX is slow

[Figure: "merge" circuit with two elements labelled C, internal signals g11, g12, g21, g22, a block with + and − inputs and a ref signal, inputs i1, i2 and outputs o1, o2]

[Figure: Response of "merge" design. x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

STG for Repeater

STG for a repeater

Use timing assumptions:

i1- -> p1 -> g11-, g12-

g11- -> i1+

… and mirror ones

This STG can be synthesised using PETRIFY

Synthesised version on the next slide…

[Figure: STG with rising and falling transitions of i1, i2, g11, g12, g21, g22, o1 and o2, places p1 to p8, and mutual-exclusion elements ME1 and ME2]

MUTEX-based

w/PETRIFY

Very good linearity and capture range

High latency, independent of the input time separation until 0.5 FO4

Generated using PETRIFY (STG in previous slide)

[Figure: synthesised circuit with an element labelled C, internal signals g11, g12, g21, g22, inputs i1, i2 and outputs o1, o2]

[Figure: Response of PETRIFY-generated design. x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

TSE

Transition Sequence Encoder

This circuit generates a number of requests based on an input matrix

The acknowledgments can be either “proper” or a delayed version of the output signals

Can be used as a phase-encoder

[Figure: TSE circuit with a go input, matrix inputs R[1,2], R[1,3], R[2,1], R[2,3], R[3,1], R[3,2], request outputs req[1] to req[3] and acknowledgment inputs ack[1] to ack[3]]

MUTEX-TSE

This solution is similar to the MUTEX-based one, only using the TSE as a sender

λ < 2 FO4

Increasing output time separation dependent on the input (output δ > 8 FO4)

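[Figure: MUTEX-TSE circuit with an element labelled C, a go signal, matrix inputs R[1,2] and R[2,1], inputs i1, i2 and outputs o1, o2]

[Figure: Response of TSE design. x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]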

TSE – Transistor-level

As above, but handling both rising and falling edges

Transistor-level implementation of the TSE

Results similar to the previous case

Note the similarity with the transistor-level latch-based design

[Figure: transistor-level TSE circuit with an element labelled C, matrix inputs R1[1,2], R1[2,1], R2[1,2], R2[2,1], inputs i1, i2 and outputs o1, o2]

[Figure: Response of TSE design (transistor level). x-axis: Input Time Separation (FO4); y-axis: Output Time Separation/Latency (FO4); curves: Fall, Rise, Latency (fall), Latency (rise)]

Multiple-rail

Multiple-rail phase-encoding requires similar designs to regenerate the phase relationship

The design on the right is a simple expansion of the previous latch-based design

Very slow response

Only useful for large δ

Acceptable latency

[Figure: four-wire latch-based circuit with four latches (D, G, Q), four pulse generators, inputs i1 to i4 and outputs o1 to o4]

Multiple-rail “merge”

Better design: use a TSE

Shown: 3-wire regeneration. Left: rising edge only; right: rising and falling edges

Better response, but λ depends on the input time separation (needs to wait for all inputs to be present)

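[Figure: two multiple-rail "merge" circuits built from a receiver and a TSE sender with a go signal. Labels shown: inputs i1 to i3, outputs o1 to o4, an element labelled C, and matrix inputs R1[1,2], R1[1,3], R1[1,4], R2[1,2], R2[1,3], R2[1,4]]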

Performance comparison

Dual-rail implementations

Area in transistor count

κ and λ in FO4

Area and energy for the “Latch-based transistor level” design are given as no keeper/keeper

“Charge compensation”: area calculated estimating the size of the capacitors

Avg. for rise/fall

| Design | Area | pJ/bit | ζ | κ | λ |
|---|---|---|---|---|---|
| Latch-based 1 | 58 | 0.82 | 6 (avg) | 2 | 2.5 |
| Latch-based 2 | 68 | 0.59 | 4 | 3 | 3.5 |
| Mod. MUTEX | 88 | 1.17 | <0.5 | 1 | 7 |
| Automatic Synthesis | 94 | 0.9 | <0.1 | 5 | 7 |
| MUTEX-based merging | 110 | 0.98 | <0.1 | | δinput |
| Latch-based transistor level | 28/32 | 0.43/0.47 | <0.5 | 3 | 1 |
| Charge-compensation | 24 | 0.22 | 2 | 4 | 4 (avg) |
| TSE gate-level | 74 | 0.78 | <0.1 | | δinput |
| TSE transistor-level | 52 | 0.79 | <0.1 | | δinput |

Conclusions

Some phase-regeneration circuits have been presented

More work to do:

Metastability behaviour, in particular for keeper structures

Behaviour in case of faults

Characterisation with different input signal slopes

Contact details

Crescenzo S. D’Alessandro

Microelectronics Systems Design Group

School of Electrical, Electronics and Computer Engineering

Merz Court

Newcastle University, UK

Crescenzo.D’Alessandro@ncl.ac.uk

http://async.org.uk
