
Post on 25-Apr-2020


ECE680: Physical VLSI Design

Chapter VII

Timing Issues in Digital Circuits

(chapter 10 in textbook)

GMU, ECE 680 Physical VLSI Design 1

Synchronous Timing

[Figure (Fig. 10-1): In -> register R1 -> combinational logic (Cin, Cout) -> register R2 -> Out; both registers are clocked by CLK.]

$T > t_{c-q} + t_{logic} + t_{su}$

$t_{hold} < t_{c-q,cd} + t_{logic,cd}$
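
The two synchronous timing constraints can be checked numerically. The following is a minimal sketch, not part of the original slides; the function names and the delay values (in picoseconds) are hypothetical and chosen only for illustration.

```python
def min_clock_period(t_cq, t_logic, t_su):
    """Lower bound on T from T > t_cq + t_logic + t_su (all times in ps)."""
    return t_cq + t_logic + t_su


def hold_ok(t_hold, t_cq_cd, t_logic_cd):
    """True when t_hold < t_cq_cd + t_logic_cd (contamination delays, in ps)."""
    return t_hold < t_cq_cd + t_logic_cd


# Hypothetical example values, in picoseconds.
bound = min_clock_period(t_cq=200, t_logic=1500, t_su=100)
print("T must exceed", bound, "ps")
print("hold constraint met:", hold_ok(t_hold=50, t_cq_cd=100, t_logic_cd=30))
```

The clock period only has to cover the slowest path, while the hold check uses the fastest (contamination) delays, so the two checks are independent of each other.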

Timing Definitions: Latch Parameters

[Figure: latch with inputs D and Clk and output Q. The waveforms mark the clock period T and pulse width PW on Clk, PWm, tsu and thold on D, and tc-q and td-q on Q.]

Delays can be different for rising and falling data transitions

Register Parameters

[Figure: register with inputs D and Clk and output Q. The waveforms mark the clock period T on Clk, tsu and thold on D, and tc-q on Q.]

Delays can be different for rising and falling data transitions

Clock Uncertainties

Sources of clock uncertainty

1) Clock Generation
2) Devices
3) Interconnect
4) Power Supply
5) Temperature
6) Capacitive Load
7) Coupling to Adjacent Lines

Clock Nonidealities

• Clock skew
– Spatial variation in temporally equivalent clock edges; deterministic + random, tSK

• Clock jitter
– Temporal variations in consecutive edges of the clock signal; modulation + random noise
– Cycle-to-cycle (short-term) tJS
– Long term tJL

• Variation of the pulse width
– Important for level sensitive clocking

Clock Skew and Jitter

[Figure: Clk waveforms with the skew tSK and the cycle-to-cycle jitter tJS marked on the edges.]

• Both skew and jitter affect the effective cycle time

• Only skew affects the race margin

$T_{clk} + \delta - 2t_{jitter} > t_{c-q} + t_{logic} + t_{su}$

$\delta + 2t_{jitter} + t_{hold} < t_{c-q,cd} + t_{logic,cd}$

(Page 501)
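
The effect of skew and jitter on both inequalities can be explored with a short calculation. This is a sketch under stated assumptions: the function names and the numbers (in picoseconds) are hypothetical, and the sign convention is the one in the two inequalities on this slide.

```python
def min_period(t_cq, t_logic, t_su, skew, t_jitter):
    """Bound on Tclk from Tclk + skew - 2*t_jitter > t_cq + t_logic + t_su."""
    return t_cq + t_logic + t_su - skew + 2 * t_jitter


def hold_margin(t_hold, t_cq_cd, t_logic_cd, skew, t_jitter):
    """Margin of skew + 2*t_jitter + t_hold < t_cq_cd + t_logic_cd.

    A positive result means the race (hold) constraint is met.
    """
    return (t_cq_cd + t_logic_cd) - (skew + 2 * t_jitter + t_hold)


# Hypothetical values in ps: 50 ps of skew, 30 ps of jitter.
print(min_period(200, 1500, 100, skew=50, t_jitter=30))
print(hold_margin(50, 100, 30, skew=50, t_jitter=30))
```

With these example numbers the skew relaxes the cycle-time bound while the same skew pushes the hold margin negative, which is the trade-off the two inequalities express.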

Clock Skew

[Figure: distribution of the number of registers versus Clk delay. The earliest occurrence of the Clk edge is at nominal - δ/2 and the latest occurrence at nominal + δ/2; the insertion delay and the max Clk skew are marked.]

Positive and Negative Skew

[Figure: pipeline In -> R1 -> combinational logic -> R2 -> combinational logic -> R3 -> ..., with the clock reaching the registers at tCLK1, tCLK2 and tCLK3 through delay elements in the CLK line. (a) Positive skew. (b) Negative skew.]

Positive Skew

[Figure: waveforms CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4), with TCLK and th marked.]

Launching edge arrives before the receiving edge

Negative Skew

[Figure: waveforms CLK1 (edges 1 and 3) and CLK2 (edges 2 and 4), with TCLK and TCLK + δ marked.]

Receiving edge arrives before the launching edge

Timing Constraints

[Figure: In -> R1 -> combinational logic -> R2; CLK reaches the registers at tCLK1 and tCLK2. Register parameters: tc-q, tc-q,cd, tsu, thold. Logic parameters: tlogic, tlogic,cd.]

Minimum cycle time: $T - \delta = t_{c-q} + t_{su} + t_{logic}$

Worst case is when receiving edge arrives early (positive δ)

Timing Constraints

[Figure: In -> R1 -> combinational logic -> R2; CLK reaches the registers at tCLK1 and tCLK2. Register parameters: tc-q, tc-q,cd, tsu, thold. Logic parameters: tlogic, tlogic,cd.]

Hold time constraint: $t_{c-q,cd} + t_{logic,cd} > t_{hold} + \delta$

Worst case is when receiving edge arrives late

Race between data and clock

Impact of Jitter

[Figure: CLK waveform with period TCLK; each edge can move by -tjitter to +tjitter. CLK drives the registers (REGS) around the combinational logic with input In. Parameters: tc-q, tc-q,cd, tlogic, tlogic,cd, tsu, thold, tjitter.]

Longest Logic Path in Edge-Triggered Systems

[Figure: Clk waveform with period T. TClk-Q and TLM start at the latest point of launching and, with TSU, must fit before the earliest arrival of the next cycle; TJI + δ is marked on the clock edges.]

Clock Constraints in Edge-Triggered Systems

If launching edge is late and receiving edge is early, the data will not be too late if:

$T_{c-q} + T_{LM} + T_{SU} < T - T_{JI,1} - T_{JI,2} - \delta$

Minimum cycle time is determined by the maximum delays through the logic

$T_{c-q} + T_{LM} + T_{SU} + \delta + 2T_{JI} < T$

Skew can be either positive or negative
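
The step between the two inequalities on this slide can be written out as follows. This is a sketch of the algebra only; it assumes both edges have the same jitter bound, $T_{JI,1} = T_{JI,2} = T_{JI}$, which the slide does not state explicitly.

```latex
\begin{align*}
T_{c-q} + T_{LM} + T_{SU} &< T - T_{JI,1} - T_{JI,2} - \delta \\
T_{c-q} + T_{LM} + T_{SU} + \delta + T_{JI,1} + T_{JI,2} &< T \\
T_{c-q} + T_{LM} + T_{SU} + \delta + 2T_{JI} &< T
  \qquad \text{(assuming } T_{JI,1} = T_{JI,2} = T_{JI}\text{)}
\end{align*}
```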

Shortest Path

[Figure: Clk waveforms. TClk-Q and TLm start at the earliest point of launching; at the receiving register, TH is measured from the nominal clock edge. Data must not arrive before this time.]

Example 10.1

Clock Constraints in Edge-Triggered Systems

If launching edge is early and receiving edge is late:

$T_{c-q} + T_{Lm} - T_{JI,1} > T_H + T_{JI,2} + \delta$

Minimum logic delay

$T_{c-q} + T_{Lm} > T_H + 2T_{JI} + \delta$

$T_{clk} + \delta - 2t_{jitter} > t_{c-q} + t_{logic} + t_{su}$

$\delta + 2t_{jitter} + t_{hold} < t_{c-q,cd} + t_{logic,cd}$

How to counter Clock Skew?

[Figure: Data and Clock Routing. Registers (REG) separated by logic blocks between In and Out, with the clock distribution routed alongside; regions of negative skew and positive skew are marked.]

Flip-Flop-Based Timing

[Figure: clock cycle divided into skew, flip-flop delay (TSU, TClk-Q) and logic delay, for flip-flop -> logic. Representation after M. Horowitz, VLSI Circuits 1996.]

$T_{clk} + \delta - 2t_{jitter} > t_{c-q} + t_{logic} + t_{su}$

$\delta + 2t_{jitter} + t_{hold} < t_{c-q,cd} + t_{logic,cd}$

Flip-Flops and Dynamic Logic

[Figure: two clock-cycle diagrams showing TSU, TClk-Q and the logic delay, with the precharge and evaluate phases of the dynamic logic.]

Flip-flops are used only with static logic

Latch timing

[Figure: latch with inputs D and Clk and output Q.]

• tD-Q: when data arrives to transparent latch. Latch is a 'soft' barrier.

• tClk-Q: when data arrives to closed latch. Data has to be 're-launched'.

Single-Phase Clock with Latches

[Figure: latch followed by logic, clocked by Clk. The Clk waveform marks the period P, the pulse width PW, and the skews Tskl and Tskt on its edges.]

Latch-Based Design

• L1 latch is transparent when the clock = 0

• L2 latch is transparent when the clock = 1

[Figure: L1 latch -> logic -> L2 latch -> logic.]

Slack-borrowing

[Figure: In (a) -> L1 -> (b) CLB_A, delay tpd,A -> (c) L2 -> (d) CLB_B, delay tpd,B -> (e) L1. The L1 latches are clocked by CLK1 and the L2 latch by CLK2. The timing diagram over TCLK shows a valid, b valid, c valid, d valid and e valid, separated by the delays tpd,A, tDQ, tpd,B and tDQ; the slack is passed to the next stage.]

Latch-Based Timing

[Figure: L1 latch -> static logic -> L2 latch -> logic, with the L1 and L2 latch clock phases, the skew, the long path and the short path marked.]

Can tolerate skew!

Clock Distribution

H-tree

CLK

Clock is distributed in a tree-like fashion


More realistic H-tree

[Restle98]

The Grid System

[Figure: clock grid driven by four drivers, each fed from GCLK.]

• No rc-matching
• Large power

Example: DEC Alpha 21164

Clock Frequency: 300 MHz - 9.3 Million Transistors

Total Clock Load: 3.75 nF

Power in Clock Distribution network: 20 W (out of 50)

Uses Two Level Clock Distribution:

• Single 6-stage driver at center of chip
• Secondary buffers drive left and right side clock grid in Metal3 and Metal4

Total driver size: 58 cm!
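
To get a feel for these numbers, the switching power of the clock load alone can be estimated with the usual dynamic power relation P = C·V²·f. This is a rough sketch, not a figure from the slides: the 3.3 V supply is an assumption (the slide does not give the supply voltage), and the estimate covers only the 3.75 nF load, not the drivers themselves.

```python
def dynamic_power(c_load, v_dd, f_clk):
    """Switching power in watts for a load charged and discharged every cycle."""
    return c_load * v_dd ** 2 * f_clk


C_CLOCK = 3.75e-9     # total clock load from the slide, in farads
F_CLOCK = 300e6       # clock frequency from the slide, in hertz
V_DD_ASSUMED = 3.3    # ASSUMPTION: supply voltage, not stated on the slide

p_load = dynamic_power(C_CLOCK, V_DD_ASSUMED, F_CLOCK)
print(f"clock load switching power: {p_load:.2f} W")
```

Under that assumption the load alone accounts for a large share of the 20 W quoted for the clock distribution network.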

21164 Clocking

Clock waveform: tcycle = 3.3 ns, trise = 0.35 ns, tskew = 150 ps

• 2 phase single wire clock, distributed globally

• 2 distributed driver channels
– Reduced RC delay/skew
– Improved thermal distribution
– 3.75 nF clock load
– 58 cm final driver width

• Local inverters for latching

• Conditional clocks in caches to reduce power

• More complex race checking

• Device variation

[Figure: location of the clock driver on the die, showing the pre-driver and the final drivers.]

Clock Drivers


Clock Skew in Alpha Processor

EV6 (Alpha 21264) Clocking

600 MHz – 0.35 micron CMOS

Global clock waveform: tcycle = 1.67 ns, trise = 0.35 ns, tskew = 50 ps

• 2 Phase, with multiple conditional buffered clocks
– 2.8 nF clock load
– 40 cm final driver width

• Local clocks can be gated "off" to save power

• Reduced load/skew

• Reduced thermal issues

• Multiple clocks complicate race checking

[Figure: die plot with the PLL marked.]
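
The cycle times quoted for the two Alpha processors follow directly from their clock frequencies, and the skew can be put in proportion to the cycle. A small sketch using only the numbers from the slides; the function names are hypothetical.

```python
def cycle_time_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1e3 / f_mhz


def skew_fraction(t_skew_ps, f_mhz):
    """Clock skew as a fraction of the clock period."""
    return t_skew_ps / (cycle_time_ns(f_mhz) * 1e3)


# (frequency in MHz, skew in ps) as quoted on the slides.
chips = {"21164": (300, 150), "21264 (EV6)": (600, 50)}
for name, (f_mhz, t_skew_ps) in chips.items():
    print(f"{name}: tcycle = {cycle_time_ns(f_mhz):.2f} ns, "
          f"skew = {100 * skew_fraction(t_skew_ps, f_mhz):.1f}% of cycle")
```

Even though the cycle time halves from one design to the next, the skew shrinks by a factor of three, so it takes a smaller share of the cycle.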

21264 Clocking

EV6 Clock Results

[Figure: two die maps. GCLK Skew (at Vdd/2 crossings), scale from 5 to 50 ps. GCLK Rise Times (20% to 80%, extrapolated to 0% to 100%), scale from 300 to 345 ps.]

EV7 Clock Hierarchy

Active Skew Management and Multiple Clock Domains

+ widely dispersed drivers

+ DLLs compensate static and low-frequency variation

+ divides design and verification effort

- DLL design and verification is added work

+ tailored clocks

[Figure: clock domains NCLK (Mem Ctrl), GCLK (CPU Core), L2L_CLK (L2 Cache) and L2R_CLK (L2 Cache); SYSCLK feeds a PLL, and three DLLs are shown.]

Self-timed and Asynchronous Design

Functions of clock in synchronous design

1) Acts as completion signal

2) Ensures the correct ordering of events

Truly asynchronous design

1) Completion is ensured by careful timing analysis

2) Ordering of events is implicit in logic

Self-timed design

1) Completion ensured by completion signal

2) Ordering imposed by handshaking protocol

Synchronous Pipelined Datapath

[Figure: In -> R1 -> Logic Block #1 -> R2 -> Logic Block #2 -> R3 -> Logic Block #3 -> R4, all registers clocked by CLK. Delays: tpd,reg, tpd1, tpd2, tpd3.]

Advantages: Easy to integrate

Disadvantages: Frequency assumes the worst case of logic
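
The disadvantage can be made concrete: the common clock period has to cover the slowest logic block, so faster blocks sit idle for part of every cycle. A minimal sketch, assuming the period is simply the register delay plus the largest block delay (setup time and skew are ignored here); the delay values in picoseconds are hypothetical.

```python
def sync_period(t_pd_reg, block_delays):
    """Clock period set by the register delay plus the slowest logic block."""
    return t_pd_reg + max(block_delays)


def idle_time(block_delays):
    """Time each block spends waiting for the slowest block, per cycle."""
    slowest = max(block_delays)
    return [slowest - d for d in block_delays]


# Hypothetical delays in ps for tpd,reg and tpd1, tpd2, tpd3.
delays = [400, 900, 600]
print("period:", sync_period(100, delays), "ps")
print("idle time per block:", idle_time(delays), "ps")
```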

Self-Timed Pipelined Datapath

[Figure: In -> R1 -> F1 -> R2 -> F2 -> R3 -> F3 -> Out, with logic delays tpF1, tpF2 and tpF3. Each stage has a handshake block (HS) that exchanges Req and Ack with its neighbors and exchanges Start and Done with its logic block.]

Completion Signal Generation

[Figure: logic network (In -> Out) in parallel with a delay module (Start -> Done).]

Using Delay Element (e.g. in memories)

Completion Signal Generation

Using Redundant Signal Encoding
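
One common form of redundant encoding is dual-rail: each bit is carried on two wires, and the wire pair itself tells the receiver whether a value has arrived. The sketch below is illustrative only. The specific code assignment, with (B0, B1) = (0, 0) meaning "not ready", (1, 0) meaning logic 0 and (0, 1) meaning logic 1, is an assumption and is not spelled out on the slide.

```python
NOT_READY = (0, 0)  # ASSUMED reset/precharge code: neither rail asserted


def encode(bit):
    """Dual-rail code (B0, B1) for a valid bit, under the assumed convention."""
    return (1, 0) if bit == 0 else (0, 1)


def done(pair):
    """Completion for one bit: exactly one rail has been asserted."""
    b0, b1 = pair
    return (b0 ^ b1) == 1


def word_done(pairs):
    """Completion for a word: every bit must have completed."""
    return all(done(p) for p in pairs)


def decode(pair):
    """Recover the bit value from a completed pair."""
    if not done(pair):
        raise ValueError("value not ready")
    return pair[1]


print(word_done([encode(1), NOT_READY]))   # one bit still evaluating
print(word_done([encode(1), encode(0)]))   # all bits valid
```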

Completion Signal in DCVSL

[Figure: DCVSL gate with two pull-down networks (PDN) driven by In1 and In2, precharged from VDD under control of Start. The outputs B0 and B1 are combined to produce Done.]

Self-Timed Adder

[Figure: (a) Differential carry generation: two carry chains between VDD and Start, one controlled by P0 to P3 and G0 to G3 and the other by P0 to P3 and K0 to K3, producing C0 to C4. (b) Completion signal: Done is generated from Start and the carries C1 to C4.]

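Completion Signal Using Current Sensing

[Figure: static CMOS logic between the input register and the output, supplied from VDD, with a current sensor in its ground path (GNDsense). Start drives a min delay generator with output A; the current sensor output is B; Done is derived from A and B. The timing diagram shows Start, Input, A, B, Done and Output, marking tdelay, toverlap, tMDG and tpd-NOR, with Done indicating that the output is valid.]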

Hand-Shaking Protocol

[Figure: (a) Sender-receiver configuration: SENDER and RECEIVER connected by Data, Req and Ack. (b) Timing diagram over cycle 1 and cycle 2, with events numbered 1 (Data), 2 (Req) and 3 (Ack) and marked as sender's action or receiver's action.]

Two Phase Handshake

Event Logic: The Muller-C Element

(a) Schematic: C element with inputs A and B and output F

(b) Truth table

A  B  F(n+1)
0  0  0
0  1  F(n)
1  0  F(n)
1  1  1

[Figure: implementations. (a) Logic (with an S-R flip-flop), (b) Majority function, (c) Dynamic.]
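
The truth table translates directly into a small behavioral model: the output follows the inputs when they agree and holds its previous value when they differ. This is a sketch for illustration; the function names are hypothetical.

```python
def c_element_next(a, b, f_prev):
    """Next output of a Muller-C element with inputs a, b and current output f_prev."""
    if a == b:
        return a        # both inputs agree: output follows them
    return f_prev       # inputs differ: output holds its state


def trace(inputs, f_init=0):
    """Apply a sequence of (a, b) input pairs and collect the outputs."""
    f = f_init
    outputs = []
    for a, b in inputs:
        f = c_element_next(a, b, f)
        outputs.append(f)
    return outputs


# The output rises only after both inputs rise, and falls only after both fall.
print(trace([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]))
```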

2-Phase Handshake Protocol

[Figure: sender logic and receiver logic connected by Data. The handshake logic uses a Muller-C element (C) with the signals Data ready, Data accepted, Req and Ack.]

Advantage: FAST - minimal # of signaling events (important for global interconnect)

Disadvantage: edge-sensitive, has state

Example: Self-timed FIFO

[Figure: In -> R1 -> R2 -> R3 -> Out. The register enables (En) are driven by a chain of Muller-C elements (C) with the signals Reqi, Acki, Req0, Acko and Done.]

All 1s or 0s -> pipeline empty

Alternating 1s and 0s -> pipeline full
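
The empty and full conditions can be expressed as two predicates over the states of the C elements along the pipeline. A minimal sketch; representing the pipeline as a list of 0/1 values, one per C element, is an assumption made for illustration.

```python
def is_empty(c_states):
    """Pipeline is empty when all C-element outputs are equal (all 1s or all 0s)."""
    return len(set(c_states)) <= 1


def is_full(c_states):
    """Pipeline is full when neighboring C-element outputs alternate."""
    if len(c_states) < 2:
        return False
    return all(x != y for x, y in zip(c_states, c_states[1:]))


print(is_empty([1, 1, 1]), is_full([1, 1, 1]))
print(is_empty([1, 0, 1]), is_full([1, 0, 1]))
print(is_empty([1, 1, 0]), is_full([1, 1, 0]))   # partially filled
```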

2-Phase Protocol

Example

[Figures: four example slides.]

From [Horowitz]

4-Phase Handshake Protocol

[Figure: timing diagram over cycle 1 and cycle 2 with Data (event 1), Req (events 2 and 4) and Ack (events 3 and 5), marked as sender's action or receiver's action.]

Slower, but unambiguous

Also known as RTZ
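
The difference between the two protocols shows up when counting signaling events per transfer. The sketch below lists the Req/Ack events for each protocol. It is illustrative only: data changes are left out, and the names are hypothetical.

```python
def two_phase_events(n_transfers):
    """2-phase: every transition of Req or Ack is an event (2 per transfer)."""
    req = ack = 0
    events = []
    for _ in range(n_transfers):
        req ^= 1
        events.append(("Req", req))   # sender toggles Req: data is ready
        ack ^= 1
        events.append(("Ack", ack))   # receiver toggles Ack: data accepted
    return events


def four_phase_events(n_transfers):
    """4-phase (return-to-zero): Req and Ack rise, then both return to 0."""
    events = []
    for _ in range(n_transfers):
        events += [("Req", 1), ("Ack", 1), ("Req", 0), ("Ack", 0)]
    return events


print(len(two_phase_events(2)), "events for 2 transfers (2-phase)")
print(len(four_phase_events(2)), "events for 2 transfers (4-phase)")
```

In the 4-phase version every transfer starts from the same signal levels, so the meaning of a level is unambiguous, at the cost of twice as many events.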

4-Phase Handshake Protocol

Implementation using Muller-C elements

[Figure: sender logic and receiver logic connected by Data. The handshake logic uses two Muller-C elements (C) with the signals Data ready, Data accepted, Req, Ack and S.]

Self-Resetting Logic

[Figure: three precharged logic blocks (L1), (L2), (L3) in series, each with its own completion detection. Circuit detail: gate with inputs A, B, C, internal node int and output out, with post-charge logic connected to VDD.]

Clock-Delayed Domino

[Figure: domino gate between VDD and GND with a pulldown network, input D1 and output Q1 (also D2); clocked by CLK1, with CLK2 passed to the next stage.]

Asynchronous-Synchronous Interface

[Figure: asynchronous system (fin) -> synchronization -> synchronous system (fCLK).]

Synchronizers and Arbiters

• Arbiter: Circuit to decide which of 2 events occurred first

• Synchronizer: Arbiter with the clock as one of the inputs

• Problem: Circuit HAS to make a decision in limited time - which decision is not important

• Caveat: It is impossible to ensure correct operation

• But, we can decrease the error probability at the expense of delay

A Simple Synchronizer

[Figure: latch with input D and output Q, built from inverters I1 and I2 with internal node int, clocked by CLK.]

• Data sampled on rising edge of the clock

• Latch will eventually resolve the signal value, but ... this might take infinite time!

Synchronizer: Output Trajectories

[Figure: Vout (0.0 to 2.0) versus time (0 to 300 ps).]

Single-pole model for a flip-flop

Mean Time to Failure

Example

$T_\phi$ = 10 nsec = T
$T_{signal}$ = 50 nsec
$t_r$ = 1 nsec
$\tau$ = 310 psec
$V_{IH} - V_{IL}$ = 1 V ($V_{DD}$ = 5 V)

N(T) = 3.9 × 10^-9 errors/sec
MTF(T) = 2.6 × 10^8 sec = 8.3 years
MTF(0) = 2.5 μsec
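
The numbers in this example can be reproduced with a short calculation. The formula used below is an assumption, since the slide with the expression did not survive extraction: it is the usual single-pole metastability model, N(T) = (VIH - VIL)/Vswing · tr/Tsignal · exp(-T/τ) / Tφ, with the swing taken equal to VDD.

```python
import math


def sync_error_rate(v_window, v_swing, t_r, t_signal, t_phi, tau, t_wait):
    """Synchronization errors per second after waiting t_wait (assumed model)."""
    # Probability that a sample lands in the undefined region VIL..VIH.
    p_init = (v_window / v_swing) * (t_r / t_signal)
    # Exponential resolution over t_wait, one sample every clock period t_phi.
    return p_init * math.exp(-t_wait / tau) / t_phi


params = dict(v_window=1.0, v_swing=5.0, t_r=1e-9, t_signal=50e-9,
              t_phi=10e-9, tau=310e-12)

n_t = sync_error_rate(t_wait=10e-9, **params)
n_0 = sync_error_rate(t_wait=0.0, **params)
print(f"N(T)   = {n_t:.2e} errors/sec")
print(f"MTF(T) = {1 / n_t:.2e} sec")
print(f"MTF(0) = {1 / n_0:.2e} sec")
```

Waiting one full clock period before using the synchronizer output is what moves the mean time to failure from microseconds to years.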

Influence of Noise

[Figure: probability density p(v) between 0, VIL and VIH. Initial distribution: uniform distribution around VM. After time T (logarithmic reduction) the distribution is still uniform.]

Low amplitude noise does not influence synchronization behavior

Typical Synchronizers

[Figure: two synchronizer circuits with output Q, clocked by phases 1 and 2: one using a 2 phase clocking circuit, one using a delay line.]

Cascaded Synchronizers Increase MTF

[Figure: In -> Sync -> O1 -> Sync -> O2 -> Sync -> Out.]

Arbiters

[Figure: (a) Schematic symbol: arbiter with inputs Req1, Req2 and outputs Ack1, Ack2. (b) Implementation, with internal nodes A and B. (c) Timing diagram: Req1, Req2, A, B and Ack1 versus t; A and B are metastable until they are separated by the VT gap.]

PLL-Based Synchronization

[Figure: Chip 1 and Chip 2, each with a PLL, a clock buffer and a digital system, exchange Data. A crystal oscillator (fcrystal < 200 MHz) supplies the reference clock, and a divider is used with the PLL so that fsystem = N x fcrystal.]

PLL Block Diagram

[Figure: reference clock and local clock -> phase detector -> Up / Down -> charge pump -> loop filter -> vcont -> VCO -> system clock; a divide-by-N block produces the local clock.]

Phase Detector

[Figure: (a), (b) waveforms of ref, local clock, the output before filtering and the low-pass-filtered output. (c) Transfer characteristic: low-pass-filtered output (up to VDD) versus phase error from -180 to 180 degrees.]

Phase-Frequency Detector

[Figure: (a) Schematic: two D flip-flops with reset (Rst), clocked by A and B, producing UP and DN. (b) State transition diagram with states (UP = 0, DN = 1), (UP = 0, DN = 0) and (UP = 1, DN = 0); transitions are triggered by A and B. (c) Timing waveforms: A, B, UP, DN.]
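
The three-state behavior can be modeled in a few lines. This is a sketch of the usual phase-frequency detector state machine, assuming that a rising edge on A moves the state toward UP = 1 and a rising edge on B moves it toward DN = 1; the state encoding (-1, 0, +1) and the function names are hypothetical.

```python
def pfd_step(state, edge):
    """Advance the detector state on a rising edge of input 'A' or 'B'.

    state: -1 (UP=0, DN=1), 0 (UP=0, DN=0), +1 (UP=1, DN=0)
    """
    if edge == "A":
        return min(state + 1, 1)
    if edge == "B":
        return max(state - 1, -1)
    raise ValueError("edge must be 'A' or 'B'")


def pfd_outputs(state):
    """Return (UP, DN) for a given state."""
    return (int(state == 1), int(state == -1))


def run(edges, state=0):
    """Apply a sequence of rising edges and record (UP, DN) after each one."""
    history = []
    for edge in edges:
        state = pfd_step(state, edge)
        history.append(pfd_outputs(state))
    return history


# A leads B: UP pulses between each A edge and the following B edge.
print(run(["A", "B", "A", "B"]))
# A is faster than B: the detector stays in the UP state most of the time.
print(run(["A", "A", "B", "A", "A"]))
```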

PFD Response to Frequency

[Figure: waveforms A, B, UP and DN.]

PFD Phase Transfer Characteristic

[Figure: average (UP-DN) versus phase error (deg), with VDD marked on the vertical axis.]

Charge Pump

[Figure: charge pump supplied from VDD and switched by UP and DN; its output goes to the VCO control input.]

PLL Simulation

[Figure: control voltage (V), 0.0 to 1.0, versus time (μs), 0 to 5, with waveform insets of ref, div and vco.]

Clock Generation using DLLs

Delay-Locked Loop (Delay Line Based)

[Figure: fREF -> phase detector -> U / D -> charge pump -> filter -> delay line (DL) -> fO.]

Phase-Locked Loop (VCO-Based)

[Figure: fREF -> PD -> U / D -> CP -> filter -> VCO -> fO, with a ÷N block in the loop.]

Delay Locked Loop

[Figure: (a) Block diagram: FREF -> phase detect (PH) -> U / D -> charge pump (C) -> VCTRL -> VCDL -> FO. (b) Waveforms: REF, OUT, UP, DN. (c) Delay, PH and VCTRL.]

DLL-Based Clock Distribution

[Figure: GLOBAL CLK feeds several VCDLs, each driving a digital circuit; each VCDL is controlled by a phase detector and a CP/LF block.]
