Timing Issues and Clock Distribution Lecture 10 18-322 Fall 2003 Textbook: [Sections 7.5, 10.1, 10.3]


  • Timing Issues and Clock Distribution

    Lecture 10, 18-322 Fall 2003

    Textbook: [Sections 7.5, 10.1, 10.3]

  • Overview

    Timing issues & clock distribution

    System performance determination

    Pipelining

    Clock skew. Register timing

    Countering clock skew

  • Review: Register Timing

    [Figure: register timing diagram with clk and Q waveforms, marking the setup time, hold time, clk-to-Q (propagation) delay (tpFF), cycle time, and the unstable-data window. © Prentice Hall 1995]

  • Sequential Systems: The Big Picture

    [Figure: combinational logic takes the primary inputs and the current state and produces the primary outputs and the next state; memory elements (registers), driven by the clock, hold the state]

  • Maximum Clock Frequency

    [Figure: flip-flops (FF’s) clocked by φ around a LOGIC block with delay tp,comb]

    “Speed” of the sequential machine (how fast can this machine be clocked)

    f = 1/Tφ (clock frequency)

    Example: tp ~ 100ns => 10MHz (limit on performance)

    tp,FF + tp,comb + tsetup < Tφ
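
    The cycle-time bound above can be turned into a small calculation. The sketch below is illustrative only: the function name and the way the 100 ns total is split among the three terms are assumptions, and only the 100 ns / 10 MHz figures come from the slide.

```python
def max_clock_frequency(t_p_ff, t_p_comb, t_setup):
    """Return (T_min, f_max) for one register-to-register path.

    All times are in seconds. T_min is the bound t_p_ff + t_p_comb + t_setup.
    """
    t_min = t_p_ff + t_p_comb + t_setup
    return t_min, 1.0 / t_min


# Hypothetical split of the slide's ~100 ns total delay.
t_min, f_max = max_clock_frequency(t_p_ff=5e-9, t_p_comb=90e-9, t_setup=5e-9)
print(f"T_min = {t_min * 1e9:.0f} ns, f_max = {f_max / 1e6:.0f} MHz")
```

    Any path that is longer than the others sets T_min for the whole machine, which is why the later slides look for the longest path.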

  • Setup Time

    Required time for input to be stable BEFORE CLOCK EDGE

    [Figure: combinational logic feeding a register; the data must be stable at the register input before the clock edge arrives]

  • Setup Time Fix

    This violation can be fixed by stretching the clock cycle

    [Figure: Φ and Data waveforms before the fix and after it (marked OK)]

  • Setup Time Fix 2

    OR… by accelerating the combinational logic

    [Figure: Φ and Data waveforms before the fix and after it (marked OK)]

  • Hold Time

    Required time for input to be stable AFTER CLOCK EDGE

    [Figure: combinational logic feeding a register; the data must stay stable at the register input after the clock edge]

  • Hold Time Violations

    Example: Prop Delay: 1 ns, Hold Time: 2 ns

    Hold time violations are caused by “short paths”

    They cannot be fixed by slowing down the clock!

    They are fixed by slowing down the fast paths
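
    A short sketch of why the clock period does not help here: the hold check compares the shortest path delay with the hold time, and T never appears. The function name is hypothetical, and reading the slide's "Prop Delay: 1 ns" as the total shortest-path delay into the next register is an assumption.

```python
def hold_slack(path_delay_min, t_hold):
    """Hold slack in ns; negative means new data arrives before the hold window closes.

    Note that the clock period is not an argument: stretching T changes nothing.
    """
    return path_delay_min - t_hold


# Slide numbers: 1 ns propagation delay against a 2 ns hold time.
slack = hold_slack(path_delay_min=1.0, t_hold=2.0)

# Fix suggested on the slide: slow down the fast path (e.g. by adding delay).
padding_needed = max(0.0, -slack)
fixed_slack = hold_slack(path_delay_min=1.0 + padding_needed, t_hold=2.0)

print(slack, padding_needed, fixed_slack)
```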

  • Timing Analysis

    Look for longest path: clock speed

    Look for shortest paths: check hold time

    Static Timing Analysis:

    Attempt to determine longest/shortest path from schematic

    Difficult problem: we know the delay of the logic elements, but cannot easily reason about the entire design
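
    As a rough illustration of the longest/shortest path search, the sketch below propagates earliest and latest arrival times through a gate-level graph in topological order. The netlist, gate names, and delays are invented for the example, and the approach is purely topological: it does not know which paths can actually be sensitized, so it would also report false paths.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def arrival_times(gate_delay, fanin):
    """Return {gate: (earliest, latest)} arrival time at each gate output.

    gate_delay: {gate: delay}; fanin: {gate: [driving gates]}.
    Gates without fan-in are assumed to be driven at time 0.
    """
    graph = {g: fanin.get(g, []) for g in gate_delay}
    times = {}
    for g in TopologicalSorter(graph).static_order():
        drivers = fanin.get(g, [])
        early = min((times[d][0] for d in drivers), default=0.0)
        late = max((times[d][1] for d in drivers), default=0.0)
        times[g] = (early + gate_delay[g], late + gate_delay[g])
    return times


# Hypothetical netlist: a long path g1 -> g2 -> out and a short path g3 -> out.
delays = {"g1": 2.0, "g2": 3.0, "g3": 1.0, "out": 1.0}
fanin = {"g2": ["g1"], "out": ["g2", "g3"]}
shortest, longest = arrival_times(delays, fanin)["out"]
print(shortest, longest)
```

    The latest arrival bounds the clock period, and the earliest arrival is what the hold check uses.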

  • False Paths

    [Figure: example circuit with blocks labeled #2, #3, #3, #4]

    Solutions:

    Simulation

    False Path Analysis

  • Speeding up System Performance: Pipelining

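    [Figure: non-pipelined version and pipelined version of the same datapath. Inputs a and b pass through registers (REG) clocked by φ and the logic, including a log block, to the output Out; the pipelined version places additional registers between the logic blocks. The logic delay of the non-pipelined version is tp,comb.]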

  • How Good Is This?

    Tmin,pipe = tp,reg + max(tp,ADD, tp,abs, tp,log) + tsetup,reg

    Pipelining is used to implement high-performance data-paths

    Adding extra pipeline stages only makes sense up to a certain point

    [Figure: pipelined version of the datapath, with registers (REG) clocked by φ between the logic blocks, inputs a and b, and output Out]
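
    To see what the Tmin,pipe expression buys, the sketch below compares it with the non-pipelined bound tp,reg + tp,comb + tsetup, taking tp,comb as the sum of the three block delays. All delay values are hypothetical; only the two formulas follow the slides.

```python
def t_min_nonpipelined(t_p_reg, stage_delays, t_setup_reg):
    """All logic sits between one pair of registers, so the delays add up."""
    return t_p_reg + sum(stage_delays) + t_setup_reg


def t_min_pipelined(t_p_reg, stage_delays, t_setup_reg):
    """One register per stage, so only the slowest stage limits the clock."""
    return t_p_reg + max(stage_delays) + t_setup_reg


# Hypothetical delays in ns for the ADD, abs, and log blocks.
stages = [4.0, 2.0, 6.0]
t_flat = t_min_nonpipelined(1.0, stages, 1.0)
t_pipe = t_min_pipelined(1.0, stages, 1.0)
print(t_flat, t_pipe, t_flat / t_pipe)
```

    With these numbers three stages give a speedup of only 1.75, because the register overhead (tp,reg + tsetup,reg) is paid in every stage and the stages are unbalanced. That is one way to read the remark that extra stages only make sense up to a point.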

  • Overview

    Timing issues & clock distribution

    System performance determination

    Pipelining

    Clock skew. Register timing

    Countering clock skew

  • Synchronous Pipelined Data-Path: Clock Skew

    Clock Rates as High as 1 GHz in CMOS!

    [Figure: pipelined datapath CL1, R1, CL2, R2, CL3, R3, Out, with local clock arrival times tφ’, tφ’’, tφ’’’ at the registers and delays labeled tl,min / tl,max, tr,min / tr,max, and ti]

    Clock Edge Timing Depends upon Position

    A clock line behaves as a distributed RC line

    Each register sees a local clock time that depends on its distance from the clock source -> clock skew

    δ = tφ” – tφ’ (> 0 or

  • Constraints on Skew

    (a) Race between clock and data.

    [Figure: registers R1 and R2 with clocks φ’ and φ’’ separated by the skew δ; data launched at tφ’ reaches R2 at the earliest after tr,min + tl,min + ti, while R2 is clocked at tφ’’ = tφ’ + δ]

    If the local clock of R2 is delayed w.r.t. R1, it might happen that the inputs of R2 change before the previous data is latched -> race

    δ ≤ tr,min + ti + tl,min

    (b) Data should be stable before clock pulse is applied.

    [Figure: R1 clocked at tφ’ and R2 clocked one period later at tφ’’ + T = tφ’ + T + δ; in the worst case the data arrives tr,max + tl,max + ti after tφ’]

    The correct input data is stable at R2 after the worst-case propagation delay. The clock period must be large enough for the computations to settle.

    T ≥ tr,max + ti + tl,max - δ

  • Clock Constraints in Edge-Triggered Logic

    δ ≤ tr,min + ti + tl,min    (1)

    T ≥ tr,max + ti + tl,max - δ    (2)

    Maximum Clock Skew Determined by Minimum Delay between Latches (condition 1)

    Minimum Clock Period Determined by Maximum Delay between Latches (condition 2)
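
    Conditions (1) and (2) can be checked mechanically. In the sketch below the function names and all delay values (in ps) are hypothetical, and reading tr, ti, and tl as register, interconnect, and logic delay is an assumption. The three calls show how the sign of δ moves the minimum period.

```python
def max_skew(t_r_min, t_i, t_l_min):
    """Condition (1): largest skew that still avoids a clock/data race."""
    return t_r_min + t_i + t_l_min


def min_period(t_r_max, t_i, t_l_max, skew):
    """Condition (2): smallest clock period for a given skew."""
    return t_r_max + t_i + t_l_max - skew


# Hypothetical delays in ps.
limit = max_skew(t_r_min=200, t_i=100, t_l_min=300)
t_zero = min_period(t_r_max=300, t_i=100, t_l_max=2000, skew=0)
t_pos = min_period(t_r_max=300, t_i=100, t_l_max=2000, skew=200)
t_neg = min_period(t_r_max=300, t_i=100, t_l_max=2000, skew=-200)

print(limit, t_zero, t_pos, t_neg)
print("skew of 200 ps is race-free:", 200 <= limit)
```

    A positive skew below the limit shortens the period, while a negative skew always satisfies (1) but lengthens the period by |δ|.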

  • Positive and Negative Skew

    (a) Positive skew

    [Figure: data passes through R, CL, R, CL, R, CL; the clock φ is routed in the same direction as the data]

    The clock is routed in the same direction as data

    The skew has to satisfy (1)

    If it violates (1), then the circuit malfunctions independently of the clock period

    Clock period decreases!

    (b) Negative skew

    [Figure: data passes through R, CL, R, CL, R, CL; the clock φ is routed in the opposite direction]

    The clock is routed in the opposite direction of data

    (1) is satisfied implicitly. The circuit operates correctly independently of the skew

    Clock period increases by |δ|

  • Overview

    Timing issues & clock distribution

    Pipelining

    Clock skew. Register timing

    Countering clock skew

  • Countering Clock Skew

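    [Figure: Data and Clock Routing. A datapath from In to Out through registers (REG) clocked by φ and a log block, with the clock distribution drawn alongside and the directions that give positive skew and negative skew marked]

    Goal: clock skew between registers is bounded!

    (What matters is the relative skew between communicating registers.)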

  • Clock Distribution: H-Trees

    [Figure: H-tree clock network driven from clk]

    • Every branch sees the same wire length and capacitance

    • The clock skew is theoretically zero

    • The sub-blocks should be small enough s.t. the skew within the block is tolerable

    • It is essential to consider clock distribution early in the design process

    Clock distribution is a major design problem!
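
    The equal-wire-length property of an H-tree can be checked with a small recursive construction. This is a geometric sketch only: the function name, the segment lengths, and the number of levels are made up, and real clock trees also have to balance loads and buffers.

```python
def h_tree_leaves(x, y, half, levels, horizontal=True, dist=0.0):
    """Return [(leaf_x, leaf_y, wire_length_from_root)] for an idealized H-tree.

    Each level branches left/right or up/down by `half`; the segment length
    is halved after every vertical step, which draws nested H shapes.
    """
    if levels == 0:
        return [(x, y, dist)]
    next_half = half if horizontal else half / 2
    leaves = []
    for sign in (-1, 1):
        nx = x + sign * half if horizontal else x
        ny = y if horizontal else y + sign * half
        leaves += h_tree_leaves(nx, ny, next_half, levels - 1,
                                not horizontal, dist + half)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half=4.0, levels=4)
lengths = {d for _, _, d in leaves}
print(len(leaves), lengths)
```

    Every leaf ends up at the same wire distance from the root, which is why the skew is zero in theory when the loads are equal too.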

  • Clock Network with Distributed Buffering

    [Figure: CLOCK feeds a main clock driver, which feeds secondary clock drivers; each secondary driver serves the modules in its local area]

    Reduces absolute delay, and makes Power-Down easier

    Sensitive to variations in Buffer Delay

  • DEC Alpha 21164

    [Figure label: Clock Drivers]

    9.3 M Transistors, 4 metal layers, 0.55 µm

    Clock Freq: 300 MHz

    Clock Load: 3.75 nF

    Power in Clock = 20 W (out of 50 W)

    Two Level Clock Distribution:

    o Single 6-stage driver at center

    o Secondary buffers drive left and right side

    o Max clock skew less than 100 psec

    o Routing the clock in the opposite direction

    o Proper timing

  • Clock Skew in Alpha

    [Figure label: Clock driver]

  • Timing & Race Conditions: Example

    [Figure: source registers R1 and R2 (32-bit) feed a 32-bit adder built from cells with inputs A, B, Cin and outputs Sum, Cout; the destination registers are R3, R4, and R5 (32-bit). Clock network labels: clk driver 150 Ω; 300 fF; ~1 mm wire 200 Ω, 100 fF]

  • Example (cont’d)

    [Figure: π model of the clock line: 150 Ω to node φ’ (600 fF + 50 fF), then 200 Ω to node φ’’ (50 fF + 900 fF)]

    Find the skew between the source register clock (φ’) and the destination (φ’’)

    tφ’ = 0.69 (150) (650) = 67 ps

    tφ’’ = 0.69 [(150) (650) + (150 + 200)(950)] = 297 ps

    δ = tφ’’ – tφ’ = 230 ps

    Check race condition

    δ ≤ tr,min + ti + tl,min    condition (1)

    thold + δ ≤ tclk-Q + tsum

    100 + 230 ≤ 50 + 300 TRUE => No race problem

    Find minimum clock period

    T ≥ tr,max + ti + tl,max - δ    condition (2)

    T ≥ tclk-Q + 31 tcarry + tsum - δ + tsetup

    T ≥ 50 + 31(250) + 300 – 230 + 150 = 8020 ps => T ≥ 8.02 ns
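
    The arithmetic of this example can be replayed in a few lines. The sketch below uses the slide's component values and its 0.69·RC expressions as written; the variable names and the helper `rc_ps` are made up for the illustration, and δ is taken as tφ’’ minus tφ’.

```python
def rc_ps(r_ohm, c_ff):
    """R in ohms times C in femtofarads, returned in picoseconds."""
    return r_ohm * c_ff * 1e-3


R_DRV, R_WIRE = 150.0, 200.0     # ohms: clock driver and ~1 mm wire
C_SRC = 600.0 + 50.0             # fF at the source-side node
C_DST = 50.0 + 900.0             # fF at the destination-side node

# Local clock arrival times, following the slide's expressions.
t_phi1 = 0.69 * rc_ps(R_DRV, C_SRC)
t_phi2 = 0.69 * (rc_ps(R_DRV, C_SRC) + rc_ps(R_DRV + R_WIRE, C_DST))
skew = round(t_phi2 - t_phi1, -1)   # rounded to 10 ps, as on the slide

# Register and adder delays from the slide, in ps.
T_HOLD, T_SETUP, T_CLKQ, T_CARRY, T_SUM = 100, 150, 50, 250, 300

race_free = T_HOLD + skew <= T_CLKQ + T_SUM                   # condition (1)
t_min_ps = T_CLKQ + 31 * T_CARRY + T_SUM - skew + T_SETUP     # condition (2)

print(f"t_phi' = {t_phi1:.0f} ps, t_phi'' = {t_phi2:.0f} ps, skew = {skew:.0f} ps")
print(f"race-free: {race_free}, T_min = {t_min_ps / 1000:.2f} ns")
```

    Evaluating condition (2) with these values gives 8020 ps, and the 31 carry delays dominate, so the skew changes the minimum period by only a few percent here.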
