9 Timing Issues

Upload: andrew-campbell

Post on 08-Jan-2018

Page 1: 9 Timing Issues Contents 1. Clocking Schemes and Storage Elements

9.1

9 Timing Issues

Contents

1. Clocking Schemes and Storage Elements

2. Clock Distribution Network

3. Self-Timed Logic Circuits

4. Synchronizers

5. Clock Generation & Synchronization

9.2

1. Clocking Schemes based on each storage element

Waveforms for the D-latch, the positive edge-triggered D flip-flop, and the 2-phase double latch (the latter two are equivalent to each other).

9.3

Finite State Machines based on each storage element

clk1 and clk2 do not overlap each other.

9.4

Clock jitter (skew): positive skew and negative skew

[Figure: f/f 1 and f/f 2 with combinational logic CL between them, clocked by clk and clk'; the skew is +ve when the clock runs in the signal direction and -ve when it runs against it.]

Positive skew: $t_{skew} < t_{delay,min}$ must be obeyed.

Otherwise, the 2nd f/f, at the current sample point, samples the next value rather than the current (correct) one. This is called double clocking.

Negative skew: $|t_{skew}| < T_p - (t_{delay,max} + t_{setup})$ must be obeyed.

Otherwise, the 2nd f/f, at the next sample point, samples the old value rather than the updated (correct) one.

9.5

Single-phase system with edge-triggered flip-flops

(This is the negative-skew case.)

9.6

i) Maximum allowable clock skew, $t_{skew,max}$ (race equation):

This prevents a race condition, i.e., it prevents the f/f from deciding Q from the next input rather than the current input.

When hold time is ignored,

$t_{skew,max} < t_{f/f,min} + t_{cl,min}$

When hold time is considered,

$t_{skew,max} < t_{f/f,min} + t_{cl,min} - t_{hold,max}$

($t_{hold}$: minimum time a signal needs to stay stable after the clock edge.) This much additional margin is needed so that the current signal, not the next sample, is sampled correctly, i.e., with the hold-time requirement satisfied.

ii) Minimum clock cycle time for correct operation with stable f/f inputs, considering clock skew (delay equation):

$T_{p,min} > t_{f/f,max} + t_{cl,max} + t_{setup,max} - t_{skew,max}$

If $t_{skew}$ is positive, $T_p$ is effectively lengthened by $t_{skew}$; that is, $T_p$ gains that much margin.

iii) $t_{\text{clk-width}} > t_{hold}$, to guarantee correct data capture.

Even for an edge-triggered f/f (the +ve edge-triggered case), the clock must stay high for at least $t_{hold}$ after the sampling edge.
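The race and delay equations for the edge-triggered case can be checked numerically. The following is a minimal sketch: the function names and the delay values (in ns) are assumptions chosen for illustration, not figures from the slides.

```python
def max_skew(t_ff_min, t_cl_min, t_hold_max=0.0):
    """Race equation: upper bound on the clock skew."""
    return t_ff_min + t_cl_min - t_hold_max


def min_period(t_ff_max, t_cl_max, t_setup_max, t_skew):
    """Delay equation: lower bound on the clock period for a given skew."""
    return t_ff_max + t_cl_max + t_setup_max - t_skew


# Hypothetical delays in ns (assumed for illustration only).
skew_bound = max_skew(t_ff_min=0.25, t_cl_min=0.5, t_hold_max=0.125)
period_pos = min_period(0.5, 2.0, 0.25, t_skew=0.25)    # positive skew
period_neg = min_period(0.5, 2.0, 0.25, t_skew=-0.25)   # negative skew
print(skew_bound, period_pos, period_neg)  # 0.625 2.5 3.0
```

With these numbers, positive skew shortens the minimum period and negative skew lengthens it, while the skew itself has to stay below the race bound.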

9.7

Single-phase system with latches

(This is the negative-skew case.)

9.8

i) Race condition: double-sided constraint on the clock width, $t_{\text{clk-width}}$

The clock width must be greater than $t_{setup}$ ($t_{setup}$ for a latch is the minimum time a signal should remain stable before the falling clock edge):

$t_{\text{clk-width}} > t_{setup,max}$

The clock width must be shorter than the one-stage delay (consisting of $t_{latch}$ and $t_{cl}$) minus the hold time and the skew, to prevent any signal from passing through more than one stage:

$t_{\text{clk-width}} < t_{latch,min} + t_{cl,min} - t_{hold,max} - t_{skew,max}$

The current signal must be held for $t_{hold,max}$ even after clk falls; that is, the next signal must not arrive until $t_{hold,max}$ after the falling clock edge. With negative skew, the clock width is effectively reduced by the skew.

ii) Minimum cycle time (in the critical stage):

$t_{cycle,min} > t_{latch,max} + t_{cl,max} + t_{setup,max} + t_{skew,max} - t_{\text{clk-width},min}$

The delay and the setup must both complete within the combined time ($t_P + t_W$); with negative skew, this is equivalent to that time being reduced by the skew. A delay of up to $t_{\text{clk-width},min}$ can be transferred to the preceding or succeeding non-critical stages.
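The latch constraints bound the clock width from both sides and set the cycle time. A small sketch of that check follows; the helper names and the numbers (in ns) are hypothetical and serve only to exercise the inequalities.

```python
def clock_width_window(t_setup_max, t_latch_min, t_cl_min, t_hold_max, t_skew_max):
    """Return the (lower, upper) bounds on the clock width for a latch stage."""
    lower = t_setup_max
    upper = t_latch_min + t_cl_min - t_hold_max - t_skew_max
    return lower, upper


def min_cycle(t_latch_max, t_cl_max, t_setup_max, t_skew_max, t_clk_width_min):
    """Lower bound on the cycle time of the critical latch stage."""
    return t_latch_max + t_cl_max + t_setup_max + t_skew_max - t_clk_width_min


# Hypothetical delays in ns (assumed for illustration only).
lower, upper = clock_width_window(0.25, 0.5, 1.0, 0.25, 0.25)
print(lower, upper)                             # 0.25 1.0
print(min_cycle(0.75, 3.0, 0.25, 0.25, 0.5))    # 3.75
```

If the upper bound comes out below the lower bound, no clock width satisfies both constraints and the stage needs more minimum logic delay or less skew.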

9.9

2-phase non-overlapping clock using double latch

9.10

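i) Race condition:

$t_{hold,max} < t_{\text{non-ol},min} + t_{latch,min} + t_{logic,min} - t_{skew,max}$

Compared with the single-phase edge-triggered f/f system, there is this much extra margin (the non-overlap time).

ii) Minimum cycle time (delay equation):

$t_{cycle,min} > t_{\text{non-ol},max} + t_{latch,max} + t_{logic,max} + t_{setup,max} - t_{skew,max}$

iii) Data capture condition:

$t_{setup} < t_{\text{clk-width}}$

It is convenient to think of this the other way around.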
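To see how the non-overlap time enters the 2-phase conditions, the sketch below computes the slack in the race condition and the minimum cycle time. The function names and the values (in ns) are assumptions made for illustration.

```python
def race_margin(t_hold_max, t_non_ol_min, t_latch_min, t_logic_min, t_skew_max):
    """Slack in the 2-phase race condition; it must be positive."""
    return t_non_ol_min + t_latch_min + t_logic_min - t_skew_max - t_hold_max


def min_cycle_two_phase(t_non_ol_max, t_latch_max, t_logic_max, t_setup_max, t_skew_max):
    """Lower bound on the cycle time from the 2-phase delay equation."""
    return t_non_ol_max + t_latch_max + t_logic_max + t_setup_max - t_skew_max


# Hypothetical delays in ns (assumed for illustration only).
print(race_margin(0.25, 0.25, 0.5, 0.5, 0.75))          # 0.25 -> race-free
print(race_margin(0.25, 0.0, 0.5, 0.5, 0.75))           # 0.0  -> no slack left
print(min_cycle_two_phase(0.25, 0.75, 3.0, 0.25, 0.25)) # 4.0
```

In this example the only difference between the first two calls is the non-overlap time, which is what supplies the race margin; the same term also lengthens the cycle.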

9.11

Intentional clock skew

9.12

When tcl2,min > tcl1,min, tcl3,min, CLK' is shifted earlier and CLK'' is shifted later, so that the time left over in CL1 and CL3 can be used by CL2. If CLK' is advanced too far (or CLK'' is delayed too far), the minimum-delay path of CL2 becomes shorter than ta (for edge-triggered f/fs) or tb (for latches), and a race can occur.

9.13

Relation between the race condition (on maximum clock skew) and the delay condition (on minimum clock period)

i) When data and clock run in the same direction (positive skew):

– Clock skew should be tightly controlled to prevent a race condition.

– With +ve skew, the clock frequency can be increased for higher performance.

ii) When data and clock run in opposite directions (negative skew):

– There is no need to worry about a race condition.

– However, -ve skew degrades performance by increasing the minimum clock period according to the delay equation.

9.14

How to suppress the race condition

1) Route the clock in the direction opposite to the data (easy to implement in a datapath),

– at the cost of performance degradation.

2) Control the non-overlap periods of the clock (in 2-phase clocking).

3) Design a good clock distribution network so that the clock skew is as uniform as possible at the local clock points. (The absolute skew between the clock source and a local clock point is irrelevant.)

4) Clock distribution network:

– interconnect material

– shape of the distribution network

– clock driver/buffering schemes

– load, i.e., fan-out on the clock lines

– rise/fall time of the clock

5) Avoid a global clock; use a self-timed approach.

9.15

2. Clock Distribution Network

H-tree as clock distribution network

A clock receiver (photodiode) at the center receives a sharp laser pulse through a glass window in the package.

9.16

Two-level buffering(Hierarchy)

9.17

Composition of a PLL(Phase-Locked Loop)

i) Loop filter: the loop filter is introduced to remove clock jitter. A 1st- to 3rd-order LPF is generally used, since excessive phase shift from higher-order filtering can cause instability in this feedback structure.

ii) Lock range: the range of input frequencies over which the output follows the input with the given relationship.

iii) Lock time: the time the PLL takes to lock onto the input.

iv) Jitter: the loop filter (LPF) helps remove jitter.

9.18

Simply generating local clocks from the global clock introduces clock skew, which can make inter-chip communication impossible.

9.19

How to minimize clock skew in a multi-chip system, i.e., a board or multiple-board system. The path from the global clock source has three parts:

i) Global distribution network.

ii) On-chip clock generator/buffer. A PLL can help only here.

iii) Local distribution network.

9.20

Each component of skew:

i) Chip-to-chip clock skew due to the global distribution network can be suppressed by:

– placing the clock pins/pads at identical positions on the chip carrier/chip;

– keeping the lead length and capacitive loading of the clock pins and wires from the global clock source to each clock pin as identical as possible.

ii) Skew due to the on-chip clock generator/buffer can be suppressed by a PLL.

9.21

Each Component of PLL(Phase detector, LPF, voltage-controlled delay line)

9.22

Methodology for dealing with timing problems in LARGE systems:

1) Divide the whole system into a number of regions, each operating synchronously.

2) Communication among the regions is either

i) through a global clock that is slower (by a factor of N) than the local clock, or

ii) asynchronous, using a self-timed discipline.

9.23

Using a PLL for local synchronization between the global and local clocks.

The delay of the local clock is adjusted via the PLL so that the local clock edge occurs simultaneously with the global clock edge.

9.24

Minimal skew system

1) equal-length chip-to-chip interconnection

2) PLL-based clock generator/buffer, and

3) equal-length on-chip distribution(H-tree)

9.25

Symmetric clock trees (H-tree vs. X-tree)

- The H-tree is better than the X-tree in that:

i) in the H-tree, no corner is sharper than 90°, so the inductive discontinuity is smaller and reflection is small;

ii) in the H-tree, the fan-out is only 2, which simplifies impedance matching.

9.26

Reduction of inductive discontinuities at the corners of H-tree.

9.27

Matching condition at the branch point:

- Impedance matching occurs when $Z_k = Z_{k+1} \parallel Z_{k+1} = Z_{k+1}/2$.

[Figure: a line of impedance $Z_k$ branching into two lines of impedance $Z_{k+1}$.]
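Applying the matching condition at every branch point means the characteristic impedance doubles at each level of the tree. A minimal sketch, assuming a hypothetical root impedance of 12.5 ohms and three branching levels:

```python
def parallel(z_a, z_b):
    """Impedance of two lines seen in parallel."""
    return z_a * z_b / (z_a + z_b)


def branch_impedances(z_root, levels):
    """Impedance at each level when every branch point is matched (Z_k = Z_{k+1} / 2)."""
    return [z_root * 2 ** k for k in range(levels + 1)]


# Hypothetical root impedance in ohms (assumed for illustration only).
zs = branch_impedances(12.5, 3)
print(zs)  # [12.5, 25.0, 50.0, 100.0]

# Each pair of child lines in parallel presents the parent's impedance.
for z_parent, z_child in zip(zs, zs[1:]):
    assert parallel(z_child, z_child) == z_parent
```

The rapid growth of the required impedance toward the leaves is one way to see why the line has to become narrower at each branch.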

9.28

Driving the clock lines :

9.29

The clock signal is sharpened at the receiver front end, before distribution within the subblock, using a Schmitt trigger or a source-end-terminated buffer. (Look at the sharp rise of Vb in the previous slide.)

9.30

RC network representation of the H-clock tree (simplified as a distributed RC line):

When a tailored H-clock tree is used, i.e., if the line width is halved at each branching point, the above distributed RC tree network is equivalent to a uniformly distributed RC line ($R_1 = R_2 = R_3 = \ldots$, $C_1 = 2C_2 = 4C_3 = \ldots$).

9.31

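Requirement on the cross-sectional geometry (height, width) of the interconnection line:

1) From the distributed RC model:

Total distance from the clock source to an end point ($l_{tot}$) in the H-tree:

$$l_{tot} = \frac{D}{4} + \frac{D}{4} + \frac{D}{8} + \frac{D}{8} + \cdots = 2\sum_{k=1}^{n/2} \frac{D}{2^{k+1}} = D\left[1 - \left(\frac{1}{2}\right)^{n/2}\right]$$

Time required for the last node to reach 90% of its final value:

$$T_{90\%} = R_{int}\, C_{int}\, D^2$$

$R_{int}$: resistance of the interconnection per unit length; $C_{int}$: capacitance per unit length.

$$R_{int} = \frac{\rho_{Al}}{H_{int} W_{int}}, \qquad C_{int} = \frac{\varepsilon_{ox} W_{int}}{t_{ox}}$$

$$T_{90\%} = \frac{\rho_{Al}\, \varepsilon_{ox}}{H_{int}\, t_{ox}}\, D^2 \qquad (1)$$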
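As a rough numerical check of the distributed RC estimate, the sketch below sums the H-tree segment lengths D/4, D/4, D/8, D/8, ... and evaluates T90% = R_int C_int D^2 with R_int = rho/(H W) and C_int = eps W/t_ox, so that the width W cancels. The material constants and dimensions are assumed values for illustration, not data from the slides.

```python
EPS0 = 8.854e-12            # permittivity of free space, F/m
EPS_OX = 3.9 * EPS0         # assumed SiO2 permittivity
RHO_AL = 3.0e-8             # assumed aluminum resistivity, ohm*m


def h_tree_length(d, n):
    """Source-to-leaf length for n branching levels (n even): D/4, D/4, D/8, D/8, ..."""
    return sum(2 * d / 2 ** (k + 1) for k in range(1, n // 2 + 1))


def t90(rho, eps, d, h, t_ox):
    """90% rise time of the distributed RC line; the line width cancels out."""
    return rho * eps * d ** 2 / (h * t_ox)


print(h_tree_length(1.0, 8))                       # 0.9375, i.e. 1 - (1/2)**4
# Assumed geometry: D = 1 cm, H = 1 um, t_ox = 1 um.
print(t90(RHO_AL, EPS_OX, 0.01, 1e-6, 1e-6))       # about 1.0e-10 s
```

The total length approaches D as the number of levels grows, which is why D stands in for the line length in the delay estimate.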

9.32

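2) From the lossy transmission line RC model:

$$1 - \text{loss} = \exp\left[-\frac{R_{int}\, l}{2 Z_o}\right]$$

$$Z_o = \frac{\sqrt{\mu_o \varepsilon_{ox}}}{C_{int}} = \frac{t_{ox}}{W_{int}} \sqrt{\frac{\mu_o}{\varepsilon_{ox}}}, \qquad R_{int} = \frac{\rho_{Al}}{W_{int} H_{int}}$$

With $l \approx D$:

$$H_{int} = \frac{\rho_{Al}\, D}{2\, t_{ox} \ln[1/(1 - \text{loss})]} \sqrt{\frac{\varepsilon_{ox}}{\mu_o}} \qquad (2)$$

Eq. (1) and (2) need to be considered in determining H and W.

At high frequency, the skin effect makes it ineffective to thicken the interconnection beyond 2-4 times the skin depth. At 1 GHz, the skin depth of aluminum is 2.8 µm.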
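The skin-depth figure and the lossy-line thickness requirement can be cross-checked with a few lines of arithmetic. This sketch uses the standard skin-depth formula delta = sqrt(rho / (pi f mu0)) and solves 1 - loss = exp[-R_int D / (2 Z_o)] for the line height; the resistivity, oxide permittivity, and dimensions are assumptions chosen for illustration.

```python
import math

MU0 = 4e-7 * math.pi        # permeability of free space, H/m
EPS_OX = 3.9 * 8.854e-12    # assumed SiO2 permittivity, F/m
RHO_AL = 3.0e-8             # assumed aluminum resistivity, ohm*m


def skin_depth(rho, freq):
    """Skin depth of a good conductor at frequency freq (Hz)."""
    return math.sqrt(rho / (math.pi * freq * MU0))


def line_height(rho, d, t_ox, eps_ox, loss):
    """Line height H_int that gives the stated fractional loss over length d."""
    return rho * d * math.sqrt(eps_ox / MU0) / (2 * t_ox * math.log(1 / (1 - loss)))


print(skin_depth(RHO_AL, 1e9))                       # about 2.76e-6 m
# Assumed case: D = 1 cm, t_ox = 1 um, 50% allowed loss.
print(line_height(RHO_AL, 0.01, 1e-6, EPS_OX, 0.5))  # about 1.1e-6 m
```

With the assumed resistivity the skin depth comes out near 2.76 µm, close to the 2.8 µm quoted for 1 GHz; tightening the allowed loss increases the required height.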

9.33

Simulation of H-clock tree with the last stage unmatched.

9.34

Reflections in the final unmatched branch :

9.35

3. Self-Timed Logic Circuits

Synchronous vs. Asynchronous (Self-Timed)

i) In pipelined systems, performance (throughput) depends on the worst-case stage delay in synchronous systems, while it depends on the average delay in self-timed systems.

ii) (Con of synchronous systems): distributing a high-speed clock over all regions is a very difficult problem.

iii) (Con of asynchronous systems): handshaking logic overhead.
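The worst-case versus average-delay point can be illustrated with a deliberately simplified model: one function block processing a stream of samples whose computation times depend on the data. The delay values and the fixed per-sample handshake cost below are hypothetical.

```python
def synchronous_time(delays):
    """Every sample is given a full clock period sized for the slowest case."""
    return len(delays) * max(delays)


def self_timed_time(delays, handshake_overhead=0.0):
    """Each sample takes only its own delay plus the handshake cost."""
    return sum(d + handshake_overhead for d in delays)


# Hypothetical data-dependent delays in ns (assumed for illustration only).
delays = [2.0, 3.0, 2.0, 9.0]
print(synchronous_time(delays))        # 36.0
print(self_timed_time(delays))         # 16.0
print(self_timed_time(delays, 1.0))    # 20.0
```

The self-timed total tracks the average delay, and the handshake overhead is what eats into that advantage.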

9.36

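Pipelined, synchronous datapath:

[Figure: In -> R1 -> F1 -> R2 -> F2 -> R3 -> F3 -> R4 -> Out, with delays tpreg, tpF1, tpF2, tpF3.]

Self-timed pipelined datapath:

[Figure: In -> R1 -> F1 -> R2 -> F2 -> R3 -> F3 -> Out, with delays tpF1, tpF2, tpF3. Handshake (HS) blocks exchange Req/Ack between stages and Start/Done with each function block; the events are numbered 1 to 8.]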

9.37

Sequence of events:

i) When an input word arrives at the input of R1, i.e., at the input buffer (IN buffer), Req (to F1) is raised. If F1 is available (inactive) at that time, the data is transferred to R1 and Ack is returned to the IN buffer.

ii) F1 is then enabled by the 'Start' signal. The 'Done' signal goes high to indicate the completion of the computation.

iii) A request to the F2 module is issued. If F2 is available, Ack is raised and the output of F1 is transferred to R2. F1 then continues computing with its next sample.
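The Req/Ack sequence can be turned into a small timing model. The sketch below assumes zero handshake delay, one register per stage, and all input words available at time 0; a stage starts a token once the previous stage is done with it (Req) and its own previous result has been accepted by the next stage (Ack). The function name and the delay values are hypothetical.

```python
def self_timed_schedule(delays):
    """delays[i][s] is the compute time of token i in stage s; returns done[i][s]."""
    n_tokens, n_stages = len(delays), len(delays[0])
    start = [[0.0] * n_stages for _ in range(n_tokens)]
    done = [[0.0] * n_stages for _ in range(n_tokens)]
    for i in range(n_tokens):
        for s in range(n_stages):
            # Req: the previous stage has finished this token.
            data_ready = done[i][s - 1] if s > 0 else 0.0
            # Ack: this stage has handed its previous token onward.
            if i == 0:
                stage_free = 0.0
            elif s + 1 < n_stages:
                stage_free = start[i - 1][s + 1]
            else:
                stage_free = done[i - 1][s]
            start[i][s] = max(data_ready, stage_free)
            done[i][s] = start[i][s] + delays[i][s]
    return done


# Two tokens through two stages; the first token is slow in the second stage.
print(self_timed_schedule([[1.0, 3.0], [1.0, 1.0]]))  # [[1.0, 4.0], [2.0, 5.0]]
```

In this run the first stage begins its second token at time 1, as soon as the second stage has accepted the first result, rather than waiting for a clock edge.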

9.38

Completion(Done) signal generation :

- Redundant signal generation. Ex.: 2 bits per signal:

00 : out 0

11 : out 1

01 or 10 : out retains its current value.

B                          B0   B1
in transition (or reset)   0    0
0                          0    1
1                          1    0
illegal                    1    1
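With this two-rail code, completion can be detected from the data itself: a word is valid once every bit pair has left the 00 (in transition) state. A minimal sketch, with hypothetical function names and the (B0, B1) ordering taken from the table:

```python
# value -> (B0, B1): 0 is coded as 01 and 1 as 10; 00 means "in transition".
ENCODE = {0: (0, 1), 1: (1, 0)}


def encode(bits):
    """Encode a list of bits as dual-rail (B0, B1) pairs."""
    return [ENCODE[b] for b in bits]


def is_complete(word):
    """Done is true once every pair holds valid data (exactly one rail high)."""
    if any(pair == (1, 1) for pair in word):
        raise ValueError("illegal code 11")
    return all(b0 != b1 for b0, b1 in word)


print(is_complete(encode([1, 0, 1])))     # True
print(is_complete([(0, 0), (0, 1)]))      # False: first bit still in transition
```

In hardware terms this corresponds to OR-ing the two rails of each bit and combining the results across the word.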

9.39

Completion signal generation

9.40

9.41

Self-Timed signaling(2-cycle Hand shaking protocol)

9.42

Muller C-element

9.43

Implementation of the Muller C-element:

9.44

An example showing Muller C-element.

1)

2) 2-phase, self-timed FIFO

9.45

4-cycle(4-phase) handshaking protocol :

ex) Implementation of 4-phase handshaking protocol

9.46

Muller C-element: its output is 1 when all inputs are 1 and 0 when all inputs are 0; otherwise it remains in its earlier state. It is also called a rendezvous, join, or last-of circuit.
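The behavior described above is easy to model in software. The class below is a behavioral sketch only (the class and method names are hypothetical, and no gate-level timing is modeled):

```python
class CElement:
    """Behavioral model of a Muller C-element with any number of inputs."""

    def __init__(self, state=0):
        self.state = state

    def update(self, *inputs):
        if all(inputs):          # all inputs 1 -> output 1
            self.state = 1
        elif not any(inputs):    # all inputs 0 -> output 0
            self.state = 0
        return self.state        # otherwise hold the earlier state


c = CElement()
print(c.update(1, 0))  # 0: inputs disagree, output holds
print(c.update(1, 1))  # 1: both inputs high
print(c.update(0, 1))  # 1: inputs disagree, output holds
print(c.update(0, 0))  # 0: both inputs low
```

The output changes only after the last input has changed, which is where the "last-of" name comes from.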

9.47

Implementation :

9.48

Implementation :

9.49

Conversion between single rail plus request and double rail data :

[Figure: single rail -> double rail -> single rail]

9.50

Conversion from chip to package pins:

9.51

ex) Self-timed CMOS PLA

Because of the extra parasitic capacitance on the dummy line, DONE rises to "1" only after all product lines of the AND-plane have become sufficiently stable.

9.52

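When several PLAs are cascaded, the DONE signal is used as the PC of the next PLA. In that case the 2nd PLA cannot have inverting inputs (they would cause unintended discharge); if an inverted signal is really needed, it must be generated separately in the 1st PLA.

When the inputs of a PLA come from several PLAs, it must take the latest DONE as its PC.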