Advanced Digital Design: Metastability
A. Steininger, Vienna University of Technology
Lecture "Advanced Digital Design" © A. Steininger / TU Vienna
Outline
What is metastability
Effects and threats
The unavoidability
MTBU estimation
Synchronizers & Countermeasures
Trends
Measurement of Model Parameters
Metastability: An Example
Ball may remain on top („metastable“) for unbounded time
A small disturbance causes the ball to fall in either direction
[Figure: ball on top of a hill between two valleys, labeled "stable left position" and "stable right position"]
What is Metastability?
continuous-valued input space (initial position of the ball)
mapped to
binary output space (left or right position)
mapping may be undecided for unbounded time
Metastability in Logic?
"In the synchronous digital world we do not have a continuous space" (after all, that's the key benefit!)
"Inputs and outputs of gates are all digital"
So why bother about metastability?
The real world
signal levels representing the digital state are continuous
pulse lengths are continuous in time
relative signal arrival times are continuous
transistors and the circuits built from them operate in continuous time with continuous voltage amplitudes
Specifying Problems Away
is the input high or low? => spec: forbidden range
is the pulse long enough to be recognized by a gate? => spec: min pulse width
did A occur before or after B? => spec: setup/hold time
Limits of the Abstraction
in a closed world these issues can be "specified away", but
• what happens at interfaces?
• what happens with faults?
The synchronous digital abstraction cannot cover these issues
when facing metastability, CMOS circuits are operated out of spec, hence have undefined behavior
Level: Inverter Example
analog transfer characteristics
„forbidden“ input level may lead to „forbidden“ output level
propagation of „forbidden“ level
[Figure: inverter characteristics, u_out plotted over u_in]
Pulsewidth: RC Example
short digital input pulse
creates analog output in forbidden range
parasitic RCs are omnipresent in ASICs
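The effect can be illustrated with a small sketch, assuming the parasitic RC behaves as an ideal first-order low-pass that is driven by a rectangular pulse from 0 V to VDD. The function names, the RC value of 100 ps and the input thresholds of 0.8 V and 2.0 V are assumptions chosen for illustration only.

```python
import math

def rc_peak_voltage(vdd, pulse_width, rc):
    """Peak output of a first-order RC low-pass driven by a 0 -> vdd pulse.

    The output charges as vdd * (1 - exp(-t/RC)) while the pulse is high,
    so the peak is reached at the end of the pulse.
    """
    return vdd * (1.0 - math.exp(-pulse_width / rc))

def in_forbidden_range(v, v_il, v_ih):
    """True if v is neither a valid low (<= v_il) nor a valid high (>= v_ih)."""
    return v_il < v < v_ih

# Assumed example values (not taken from the lecture)
VDD = 3.3                 # supply voltage in V
V_IL, V_IH = 0.8, 2.0     # assumed input thresholds in V
RC = 100e-12              # assumed parasitic time constant in s

for pw in (20e-12, 70e-12, 500e-12):
    peak = rc_peak_voltage(VDD, pw, RC)
    print(f"pulse {pw * 1e12:5.0f} ps -> peak {peak:.2f} V, "
          f"forbidden: {in_forbidden_range(peak, V_IL, V_IH)}")
```

Under these assumptions a very short pulse is filtered out completely, a long pulse reaches a valid high level, and a pulse in between leaves the output in the forbidden range.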
A before B: AND Example
contradicting digital transitions on inputs
depending on timing a glitch is produced
RC will convert it into an ambiguous voltage
[Figure: timing diagram of a, b and a AND b]
Setup/Hold Time of Latch
The feedback path must be stable when switching from "transparent" to "hold".
Otherwise we feed the storage loop with a marginal condition (pulse width, level), thus creating undefined behavior.
[Figure: latch schematic with signals CLK, D and Q]
Metastability in the Latch
normal operation: strong momentum will roll ball to other side
metastability: marginal momentum will roll ball just to the top
[Figure: ball and hill with "stable left position" and "stable right position"]
Response Time of a FF
Observation: An input transition during the decision window leads to an (unbounded) increase of clock-to-output delay
[Figure: t_clk2out plotted over t_clk2data for signals CLK and D; axis marks t_setup, 0 and t_hold; nominal level t_clk2out,nom; one region is labeled "off-spec"]
Observation
combinational elements transform off-spec inputs into off-spec outputs immediately
sequential (stateful) elements are expected to decide for one state:
• off-spec inputs will delay this decision
• only they can become metastable
Faces of Metastability
(properly shaped) late transition: may cause timing problems; problem specific for synchronous design
creeping through the forbidden voltage range: generates a long undefined level
oscillation: generates erroneous transitions
Metastability: Creeping
[Figure: inverter loop of Inv 1 and Inv 2 with u_e,2 = u_a,1 and u_e,1 = u_a,2; the transfer characteristics intersect in the points "stable (HI)", "metastable" and "stable (LO)"; steps 1 to 5 mark the trajectory]
Metastability: Oscillation
A pulse with length shorter than the roundtrip delay through the inverter loop can circulate
Thus it appears periodically at the output: "oscillation"
[Figure: inverter loop with delays D1 and D2; a pulse circulates if PW < D1 + D2]
Ways of Triggering MS
Time domain: glitch in feedback loop, caused by S/H violation or by a glitch on D
Value domain: marginal input voltage stored, even without S/H violation
[Figure: latch (L) with inputs D and Clk and feedback path FB, shown for both cases; waveforms of CLK, D and Q]
Why violate Setup/Hold?
in a closed synchronous system no violations will occur
BUT: no system is really closed
• non-synchronous interfaces
• clock domain boundaries
• fault effects (single-event upsets)
• off-spec operation (temp, VCC, frequency)
Asynchronous Inputs
[Figure: asynchronous event arriving at an arbitrary point within the clock period T_clk; decision window (setup/hold) of width T_0]
probability of setup/hold violation:
$P_{violate} = \frac{T_0}{T_{clk}}$
Multiple Clock Domains
arbitrary "phase" relation
setup/hold violation inevitable (fundamentally!)
[Figure: waveforms of CLK 1 (Ref), CLK 2 and signal A]
Metastability: Threats
propagation: undefined logic level/timing at an input may produce undefined output
"Byzantine" interpretation: thresholds/timing of different inputs are different (type variations), so a marginal input level/timing may be interpreted differently
Metastability Propagation
Combinational gates as well as the inverters inside the FF map metastable inputs to metastable outputs
[Figure: FF (D, CLK) with metastable output X feeding a further FF; inverter characteristics, u_out over u_in]
Inconsistent Perception
The metastable state may be regarded as "1" by one FF and as "0" by another
[Figure: metastable output X of one FF drives two FFs A and B with different thresholds (threshold A, threshold B); level diagram "CMOS 3V" with marks at 0.0 V, 0.4 V, 0.8 V, 2.0 V, 2.4 V and 3.3 V]
Metastability Proofs
Formal proofs exist that metastability can in principle not be avoided ("Buridan's Principle")
• no upper bound on the duration of the metastable state can be given
• but after infinite time the state will be resolved with probability 1
Fundamental issue:
Mapping from a continuous space to a discrete space involves a decision that may take unbounded time (namely in borderline cases)
Approaching the Border
The mapping from continuous to binary space needs a borderline
In the proximity of the borderline the force pulling towards one of the binary states becomes smaller (compare momentum of the ball)
In the continuous input space one can go arbitrarily close to the borderline, thus moving this force towards zero
Often the stable binary states represent energy-minima, while the metastable state represents a (local) maximum (Remember: energy must change continuously)
Metastability Avoidance?
Can't we avoid metastability in practice, if we avoid borderline cases? (only those are problematic!)
=> synchronous design, noise margins…
allow arbitrary time for resolving?
change input threshold of successor stage?
use a different storage element?
Why use the D-Flipflop?
Metastability is not restricted to D-FFs; it is encountered with SR-latch, JK-flip-flop, Muller C-gate, …
Basically all bistable elements can become metastable:
• state is always associated with energy
• state change always involves energy transfer
• laws of physics dictate continuous transfer
• but: binary state
[Figure: energy curve with two minima (min) and a maximum (max) in between]
Mitigating Metastability
Metastability cannot be eliminated in general
• all such circuits have been shown to fail…
in practice systems still work because metastability is very improbable
it can be made more or less probable by design techniques
it can be transformed between its different modes:
• marginal voltage level
• late transition
• oscillation
Conversions
Low-pass: oscillation => creeping
Discriminator: creeping + noise => oscillation
High/low threshold input: creeping => glitch
Schmitt trigger: creeping => late transition
Flip-flop: late transition => creeping or oscillation
Masking Metastability
assume m-of-n voting
[Figure: voter with n inputs, m-1 of them marked]
• If the metastable input just makes the difference, MS can propagate
• in all other cases MS will be masked
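The masking argument can be sketched with a three-valued abstraction, assuming each voter input is either 0, 1 or metastable. The function name `vote_m_of_n` and the marker "M" are illustrative choices, not part of the lecture, and a real voter is an analog circuit that this abstraction only approximates.

```python
def vote_m_of_n(inputs, m):
    """m-of-n threshold voter over the values 0, 1 and "M" (metastable).

    Returns 1 if at least m inputs are 1 regardless of the metastable ones,
    0 if m ones cannot be reached even if every metastable input resolved
    to 1, and "M" if the metastable inputs make the difference.
    """
    ones = sum(1 for v in inputs if v == 1)
    undecided = sum(1 for v in inputs if v == "M")
    if ones >= m:
        return 1            # decided high, MS is masked
    if ones + undecided < m:
        return 0            # decided low, MS is masked
    return "M"              # MS input tips the balance and can propagate

# 2-of-3 voting (TMR) with one metastable input
print(vote_m_of_n([1, 1, "M"], 2))   # other inputs agree on 1
print(vote_m_of_n([0, 0, "M"], 2))   # other inputs agree on 0
print(vote_m_of_n([1, 0, "M"], 2))   # m-1 ones: MS input decides
```

Only the last call corresponds to the case where exactly m-1 inputs are 1, so the metastable input just makes the difference.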
Detecting Metastability
… often possible by comparing Q and Q̄
creeping: both Q and Q̄ deliver VDD/2; this is often perceived as the "same" logic level
late transition: with proper separation of Schmitt trigger / high-threshold inverter and output inverter => no visible effect
oscillation: literature reports "in phase" oscillation of Q and Q̄
Quantifying the Risk of MS
"Upset": metastable output is captured by subsequent FF after t_res
Mean Time Between Upset (MTBU): expected value (statistics!) for the interval between two subsequent upsets
$MTBU = \frac{1}{\lambda_{dat} \cdot f_{clk} \cdot T_0} \cdot \exp\left(\frac{t_{res}}{\tau_C}\right)$
Resolution Time
[Figure: FF with input asyn and clock clk, followed by comb. logic and a second FF; timing diagram of clk, asyn and syn showing t_clk2out, t_comb, t_SU and t_res]
$t_{res} = T_{clk} - t_{clk2out} - t_{comb} - t_{SU}$
normal operation: t_res > 0
upset: t_res < 0
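As a worked example, the relation t_res = T_clk - t_clk2out - t_comb - t_SU can be evaluated directly. The helper name `resolution_time` and all delay values below are assumptions for illustration; times are given in nanoseconds.

```python
def resolution_time(t_clk, t_clk2out, t_comb, t_su):
    """Time left for a metastable output to resolve within one clock period.

    All arguments must use the same time unit. A negative result means the
    path does not even meet timing in normal operation.
    """
    return t_clk - t_clk2out - t_comb - t_su

# Assumed example: 100 MHz clock, values in ns
t_res = resolution_time(t_clk=10.0, t_clk2out=0.5, t_comb=6.0, t_su=0.3)
print(f"t_res = {t_res:.2f} ns")

# Longer combinational path: no slack is left for resolving metastability
t_res_tight = resolution_time(t_clk=10.0, t_clk2out=0.5, t_comb=9.5, t_su=0.3)
print(f"t_res = {t_res_tight:.2f} ns")
```

Shortening the combinational path after a synchronizing FF is therefore the most direct way to gain resolution time.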
Parameters
Resolution time t_res: interval available for output to settle after active clock edge
Flip-flop parameters τ_C, T_0: experimentally determined
• time constant τ_C: dep. on transit frequency
• T_0: from effective width of decision window
Clock period of FF: T_clk = 1/f_clk
Average rate of change λ_dat: avg. rate of transitions at FF data input
Modeling Metastability
How can we derive this equation?
Which model to apply?
Simple Metastability Model
model bistable element by inverter pair
use linear model for inverter, around midpoint of transfer function („balance point“)
consider the "homogeneous" case, i.e. closed loop w/o inputs
linear inverter model: $u_{out} = -A \cdot u_{in}$
[Figure: inverter characteristics, u_out over u_in; inverter loop with node voltages u_1 and u_2]
Introducing Dynamics
1st order approximation of dynamic behavior: RC element
assume symmetry (same A, RC for both inverters) for simplicity
WLOG assume symmetric supply (+VCC/-VCC) against GND
[Figure: loop of two inverting amplifiers with gain -A, each followed by an RC element with RC = τ; node voltages u_1 and u_2]
Differential Equations
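Basics:
$u_R = R \cdot i_R, \qquad i_C = C \cdot \frac{du_C}{dt}$
forward path:
$-A \cdot u_1 = u_2 + RC \cdot \frac{du_2}{dt}$
backward path:
$-A \cdot u_2 = u_1 + RC \cdot \frac{du_1}{dt}$
Laplace (with RC = τ):
$\mathcal{L}\left\{\frac{du(t)}{dt}\right\} = s \cdot U(s) - u^0$
$-A \cdot U_1 = U_2 + \tau \cdot (s \cdot U_2 - u_2^0)$
$-A \cdot U_2 = U_1 + \tau \cdot (s \cdot U_1 - u_1^0)$
time-domain solution:
$u_2(t) = \frac{u_2^0 - u_1^0}{2} \cdot \exp\left(\frac{A-1}{\tau} \cdot t\right) + \frac{u_2^0 + u_1^0}{2} \cdot \exp\left(-\frac{A+1}{\tau} \cdot t\right)$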
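The time-domain solution can be cross-checked numerically. The sketch below integrates the two first-order equations τ·du_2/dt = -A·u_1 - u_2 and τ·du_1/dt = -A·u_2 - u_1 with a plain forward-Euler scheme and compares the result with the closed form (growing term with exponent (A-1)·t/τ plus decaying term with exponent -(A+1)·t/τ). The function names, the normalized values A = 5 and τ = 1 and the step count are assumptions for illustration; the linear model has no supply rails, so the voltage is not limited here.

```python
import math

def analytic_u2(t, u1_0, u2_0, gain, tau):
    """Closed-form u2(t) of the linearized inverter loop."""
    grow = 0.5 * (u2_0 - u1_0) * math.exp((gain - 1.0) * t / tau)
    decay = 0.5 * (u2_0 + u1_0) * math.exp(-(gain + 1.0) * t / tau)
    return grow + decay

def simulate_u2(t_end, u1_0, u2_0, gain, tau, steps=200_000):
    """Forward-Euler integration of the coupled RC/inverter equations."""
    dt = t_end / steps
    u1, u2 = u1_0, u2_0
    for _ in range(steps):
        du1 = (-gain * u2 - u1) / tau   # backward path
        du2 = (-gain * u1 - u2) / tau   # forward path
        u1 += du1 * dt
        u2 += du2 * dt
    return u2

# Assumed, normalized example: start 1 mV away from the balance point
sim = simulate_u2(t_end=2.0, u1_0=0.0, u2_0=1e-3, gain=5.0, tau=1.0)
ana = analytic_u2(t=2.0, u1_0=0.0, u2_0=1e-3, gain=5.0, tau=1.0)
print(f"simulated u2 = {sim:.4f}, analytic u2 = {ana:.4f}")
```

The two values agree closely, and the decaying term contributes almost nothing after a short time, which is why it can be neglected.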
The Solution
$u_2(t) \approx \frac{u_2^0 - u_1^0}{2} \cdot \exp\left(\frac{A-1}{\tau} \cdot t\right)$
u_2^0 - u_1^0 … difference of initial voltages (charges on Cs); zero at balance point
τ … RC constant, bandwidth = 1/RC
A … inverter gain at balance point
A/τ … gain-bandwidth product of inverter
starting from the initial difference, u_2 rises exponentially with time towards the positive or negative supply voltage
Plot of u2 over Time
[Figure: plot of u_2 over time]
For a given t we can project the "forbidden" input range back to a "forbidden" range of the initial voltage difference
Forbidden Initial Range
$u_2(t) \approx \underbrace{\frac{u_2^0 - u_1^0}{2}}_{u_0} \cdot \exp\left(\frac{A-1}{\tau} \cdot t\right)$
$u_{0,border}(t_{res}) = U_{out,border} \cdot \exp\left(-\frac{A-1}{\tau} \cdot t_{res}\right)$
The forbidden output voltage range relates to a forbidden range of initial difference voltage (i.e. just after sampling). This range becomes exponentially smaller for high resolution time t_res and high gain-bandwidth product A/τ.
Aperture Window TAW
How long does it take for the input voltage difference to cross the forbidden range?
Depends on slopes of both,input voltage AND feedback voltage
[Figure: u_diff(t) with slope S crossing the band between +u_0,border and -u_0,border within T_AW]
$T_{AW} = \frac{2 \cdot u_{0,border}}{S}$
Calibrating TAW
T_AW depends on u_0,border, which in turn depends on t_res:
$T_{AW} = \frac{2 \cdot u_{0,border}}{S} = \frac{2 \cdot U_{out,border}}{S} \cdot \exp\left(-\frac{A-1}{\tau} \cdot t_{res}\right)$
for immediate use of the output:
$T_{AW}(t_{res} = 0) = \frac{2 \cdot U_{out,border}}{S} = T_{W0}$ (technology parameter)
thus
$T_{AW} = T_{W0} \cdot \exp\left(-\frac{A-1}{\tau} \cdot t_{res}\right)$
Hitting the Aperture
with exponentially distributed inter-arrival time of input events (rate λ_dat) and sampling with period T_clk (i.e. window T_AW is repeated) the upset rate can be calculated as
$\lambda_{upset} = \lambda_{dat} \cdot \frac{T_{AW}}{T_{clk}}$
Hence the MTBU becomes
$MTBU = \frac{1}{\lambda_{upset}} = \frac{1}{\lambda_{dat}} \cdot \frac{T_{clk}}{T_{AW}}$
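The upset rate λ_upset = λ_dat · T_AW / T_clk can be checked with a small Monte Carlo sketch. It draws exponentially distributed inter-arrival times and counts the events that fall into a window of width T_AW repeated every T_clk. The function name, the seed and the strongly exaggerated window of 5 ns are assumptions that keep the run short; a realistic T_AW is many orders of magnitude smaller.

```python
import random

def simulate_upset_rate(lam_dat, t_clk, t_aw, n_events, seed=1):
    """Estimate the rate of input events that hit the aperture window.

    Events form a Poisson process with rate lam_dat; the window covers the
    first t_aw seconds of every clock period t_clk.
    """
    rng = random.Random(seed)
    t = 0.0
    hits = 0
    for _ in range(n_events):
        t += rng.expovariate(lam_dat)   # next event time
        if (t % t_clk) < t_aw:          # phase inside the window?
            hits += 1
    return hits / t                     # hits per second of simulated time

LAM_DAT = 2e6      # event rate in 1/s
T_CLK = 100e-9     # clock period of a 10 MHz clock in s
T_AW = 5e-9        # assumed, exaggerated aperture window in s

measured = simulate_upset_rate(LAM_DAT, T_CLK, T_AW, n_events=200_000)
predicted = LAM_DAT * T_AW / T_CLK
print(f"measured {measured:.3e} 1/s, predicted {predicted:.3e} 1/s")
```

The estimate scatters around the predicted value, and its reciprocal is the MTBU for this artificial window.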
Putting it all together
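$MTBU = \frac{1}{\lambda_{upset}} = \frac{1}{\lambda_{dat}} \cdot \frac{T_{clk}}{T_{AW}}$
$T_{AW} = T_{W0} \cdot \exp\left(-\frac{A-1}{\tau} \cdot t_{res}\right)$
$MTBU = \frac{T_{clk}}{\lambda_{dat} \cdot T_{W0}} \cdot \exp\left(\frac{A-1}{\tau} \cdot t_{res}\right)$
with T_W0 written as T_0 and (A-1)/τ written as 1/τ_C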
The widely used equation
$MTBU = \frac{1}{\lambda_{dat} \cdot f_{clk} \cdot T_0} \cdot \exp\left(\frac{t_{res}}{\tau_C}\right)$
MTBU: expected time between upsets (statistical!)
λ_dat: rate of input events
f_clk: sampling frequency
T_0, τ_C: technology parameters
t_res: available resolution time
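A minimal sketch of the equation MTBU = exp(t_res / τ_C) / (λ_dat · f_clk · T_0) in code. The function name `mtbu` and the technology parameters τ_C = 50 ps and T_0 = 100 ps are assumptions for illustration; real values have to come from measurements or from the vendor.

```python
import math

def mtbu(t_res, tau_c, t_0, f_clk, lam_dat):
    """Mean time between upsets in seconds.

    t_res:   available resolution time in s
    tau_c:   resolution time constant of the flip-flop in s
    t_0:     effective decision window of the flip-flop in s
    f_clk:   sampling clock frequency in Hz
    lam_dat: average rate of data transitions in 1/s
    """
    return math.exp(t_res / tau_c) / (lam_dat * f_clk * t_0)

# Assumed technology parameters (hypothetical)
TAU_C = 50e-12
T_0 = 100e-12
F_CLK = 10e6
LAM_DAT = 2e6

for t_res in (0.0, 1e-9, 2e-9, 3e-9):
    value = mtbu(t_res, TAU_C, T_0, F_CLK, LAM_DAT)
    print(f"t_res = {t_res * 1e9:.1f} ns -> MTBU = {value:.3e} s")
```

Every additional τ_C of resolution time multiplies the MTBU by e, which is why a modest increase of t_res has such a strong effect.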
Late Transition
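calculate output delay over data to clk distance
$u_{out}(t) = \frac{u_{diff}}{2} \cdot \exp\left(\frac{t}{\tau_C}\right)$
$t(u_{diff}) = \tau_C \cdot \ln\left(\frac{2 \cdot u_{out}}{u_{diff}}\right)$
with detector threshold $u_{out} = U_{th}$ and input slope S, i.e. $u_{diff} = S \cdot T_{in}$:
$t_{dly}(T_{in}) = \tau_C \cdot \ln\left(\frac{2 \cdot U_{th}}{S \cdot T_{in}}\right) = \tau_C \cdot \ln\left(\frac{T_{W0}}{T_{in}}\right)$
output delay depends on input phase with ln(1/x)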
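The logarithmic dependence t_dly = τ_C · ln(T_W0 / T_in) can be tabulated with a short sketch. The function name, the use of the absolute value of T_in (assuming a symmetric behavior on both sides of the metastable point), the clamping to zero outside the window and the parameter values are assumptions for illustration.

```python
import math

def output_delay(t_in, tau_c, t_w0):
    """Additional output delay for a data-to-clock distance t_in.

    t_in is measured from the metastable point. The delay grows without
    bound as t_in approaches zero; outside the window (|t_in| >= t_w0)
    the model predicts no additional delay.
    """
    if t_in == 0.0:
        return math.inf
    return max(0.0, tau_c * math.log(t_w0 / abs(t_in)))

TAU_C = 50e-12     # assumed time constant in s
T_W0 = 100e-12     # assumed window width in s

for t_in in (100e-12, 10e-12, 1e-12, 1e-15):
    dly = output_delay(t_in, TAU_C, T_W0)
    print(f"T_in = {t_in:.0e} s -> extra delay = {dly * 1e12:.1f} ps")
```

Each reduction of the distance by a factor of 10 adds the same amount of delay, τ_C · ln(10).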
Graphical View
[Figure: output delay Dly plotted over data-to-clock distance]
Provoking Metastability
asynchronous inputs
multiple clock domains
clock divider (uncontrolled delay)
low timing margins
slow technology (gain/BW prod)
supply drop (excessive delay)
operation under high temperature
Determination of T_0, τ_C
experimental: vary t_res, observe MTBU
log graph => straight line
• slope -> τ_C
• offset -> T_0
[Figure: semilog plot over t_res (ns) with typical values, λ_dat = 2 MHz, f_clk = 10 MHz; offset λ_dat·f_clk·T_0, slope 1/τ_C]
Metastability – Trends
Claim: "Metastability is a non-issue in modern technologies"
[Figure: log MTBU [s] over t_res for 1996 (XC4005) and 2002 (XC2VP4)]
BUT: clock rates have increased by a factor of 16 during that period, and timing margins have shrunk in the same way!
Mitigating Metastability
avoid/minimize non-synchronous interfaces (IFs)
leave sufficient timing margins
use fast technology (gain/BW prod)
ensure proper operating conditions (stable power supply, cooling,…)
basic principle of synchronizers:
trade performance for increased timing margins (tres)
Synchronizer
Example: Cascade of n Input-FFs
[Figure: cascade of FFs clocked by clk; input asyn, output syn]
MTBU calculation: same equation as before, but now individual resolution times sum up:
$t_{res} = \sum_i t_{res,i}$
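MTBF of n-Stage Synchronizer
Recall the projection of allowed output range to an input range considering the exponential increase during the resolution time:
$\hat{u}_0(t_{res}) = \hat{U}_{out} \cdot \exp\left(-\frac{t_{res}}{\tau_C}\right)$
u_0 for FF_k is provided by the output of a preceding stage FF_k-1 => we make the same projection again:
$\hat{u}_{0,k-1}(t_{res,k-1}) = \hat{u}_{0,k} \cdot \exp\left(-\frac{t_{res,k-1}}{\tau_{C,k-1}}\right) = \hat{U}_{out} \cdot \exp\left(-\frac{t_{res,k}}{\tau_{C,k}}\right) \cdot \exp\left(-\frac{t_{res,k-1}}{\tau_{C,k-1}}\right)$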
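Since each stage contributes its own factor exp(t_res,k / τ_C,k), the exponents add up over the cascade. The sketch below applies this to the MTBU equation, assuming that the whole cascade can be described by one common T_0; that simplification, the function name and the parameter values are assumptions for illustration.

```python
import math

def mtbu_cascade(stage_t_res, stage_tau_c, t_0, f_clk, lam_dat):
    """MTBU of a synchronizer cascade with per-stage t_res and tau_c.

    The exponent is the sum of t_res,k / tau_c,k over all stages.
    """
    if len(stage_t_res) != len(stage_tau_c):
        raise ValueError("need one tau_c per stage")
    exponent = sum(t / tau for t, tau in zip(stage_t_res, stage_tau_c))
    return math.exp(exponent) / (lam_dat * f_clk * t_0)

# Assumed parameters (hypothetical): identical stages, 1 ns each
TAU_C, T_0, F_CLK, LAM_DAT = 50e-12, 100e-12, 10e6, 2e6

for n in (1, 2, 3):
    value = mtbu_cascade([1e-9] * n, [TAU_C] * n, T_0, F_CLK, LAM_DAT)
    print(f"{n} stage(s): MTBU = {value:.3e} s")
```

With identical stages this reduces to the single-stage equation with the summed resolution time.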
Synchronizer-Rules
never synchronize more than one signal (rail)
• danger of data inconsistency
• degradation of MTBU by number of signals
for a wider bus, use one signal for handshaking
never introduce a fork before the end of synchronizer
estimate the MTBU of your solution
• too low MTBU leads to failures
• too many stages introduce unnecessary delay
there is definitely no magic solution to eliminate the potential for metastability, but it can be made arbitrarily improbable
Synchronizer – Trends
need for more synchronizers
• more function units being integrated on a chip
• more standardized frequencies
• higher communication demands
need for more synchronizer stages
• increasing PVT variations => larger safety margins
• synchronizer parameters become worse: for decades τ_C used to scale proportionally to the (FO4) propagation delay; below 45 nm technologies the scaling is worse
synchronizers tend to create a considerable performance loss in the future
Even/Odd Synchronizer
works for two periodic clocks only
avoids performance penalty of synchronizers
largely eliminates potential for metastability
for details see [Dally & Tell, The Even/Odd Synchronizer, ASYNC 2010]
Mutex
For deciding the „A before B“ problem a special circuit exists, namely the Mutex (mutual exclusion element)
Unlike the Synchronizer it assumes there is unbounded time to resolve
It will be treated in a later Section.
Assumptions made so far
linear inverter slope (1st order model)
load independent gain
dominating RC const. (1st order model)
full symmetry (RCs, inverter properties, rising/falling slopes,…)
decreasing exp term neglected
homogeneous case (MUX switching and input signal shape neglected)
equally distributed voltage levels
exponentially distributed input events
What about Oscillation?
Can our model be used for oscillatory behavior?
How / Why not?
A More general MS Model
ideal amplifier: gain -A
pure delay: delay D
slope limiter: time constant RC, slope S
GBWP = A/RC determines dynamics (decay of metastable state)
oscillation for D > RC/A
creeping otherwise
Characterizing Metastability
know (=assume) exponential MTBU relation
measure MTBU over tres
draw semilog plot => straight line
find params:
• slope -> τ_C
• offset -> T_0
need very good setup for measurements! (assumptions made…)
[Figure: semilog plot over t_res (ns) for λ_dat = 2 MHz, f_clk = 10 MHz; offset λ_dat·f_clk·T_0, slope 1/τ_C]
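The parameter extraction can be sketched as a least-squares line through the points (t_res, ln MTBU), assuming the exponential relation ln(MTBU) = t_res / τ_C - ln(λ_dat · f_clk · T_0) holds exactly. The function name and the synthetic "measurements" generated from assumed values τ_C = 50 ps and T_0 = 100 ps are illustrative; real data are noisy and need an uncertainty analysis.

```python
import math

def fit_tau_c_and_t0(t_res_values, mtbu_values, f_clk, lam_dat):
    """Fit tau_c and T_0 from MTBU measurements at several t_res values."""
    n = len(t_res_values)
    ys = [math.log(m) for m in mtbu_values]
    mean_x = sum(t_res_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in t_res_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(t_res_values, ys))
    slope = sxy / sxx                       # equals 1 / tau_c
    intercept = mean_y - slope * mean_x     # equals -ln(lam_dat * f_clk * T_0)
    tau_c = 1.0 / slope
    t_0 = math.exp(-intercept) / (lam_dat * f_clk)
    return tau_c, t_0

# Synthetic data from assumed parameters (hypothetical)
F_CLK, LAM_DAT = 10e6, 2e6
TAU_TRUE, T0_TRUE = 50e-12, 100e-12
t_points = [1.0e-9, 1.5e-9, 2.0e-9, 2.5e-9, 3.0e-9]
mtbu_points = [math.exp(t / TAU_TRUE) / (LAM_DAT * F_CLK * T0_TRUE)
               for t in t_points]

tau_fit, t0_fit = fit_tau_c_and_t0(t_points, mtbu_points, F_CLK, LAM_DAT)
print(f"tau_c = {tau_fit * 1e12:.2f} ps, T_0 = {t0_fit * 1e12:.2f} ps")
```

With noise-free synthetic data the fit recovers the parameters used to generate them; T_0 comes from an exponential of the intercept, so it reacts strongly to small errors in the fitted line.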
Measuring Metastability
[Figure: measurement setup with MS producer, DUT (FF with D, Q and clk), MS detector and counter] [Altera]
MS Producer
Single clock source, controllable relative delay between clock and data path
• variable delay element, optional: feedback control
• create as many MS events as possible in short time
• well-controlled and reproducible phase
• steer into deep metastability
• problems: noise, cannot derive MTBU
Two independent clock sources: uniform distribution of phase relations
• problems: MS rare, phase distribution truly uniform?
MS Detector
Aims: detect metastable output of DUT
Problem: how to define MS?
• late transition detection
• intermediate voltage detection
• output proximity detection
Implementation options (late trans det):
• sample DUT output with FF1 after t_res
• compare with reference FF2 having "infinite" t_res
• mismatch indicates metastability
many sources of error!
Late Transition Detection
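[Figure: osc 1 and osc 2 drive the DUT flip-flop; its output is sampled by the detector FF (DET) after the variable delay var D and by the reference FF (REF) after the delay D∞; a comparator (≠) increments the counter CNT on mismatch]
• max of var D determines maximum detectable t_CO
• infinite delay not feasible => false positive for large t_CO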
Detecting Metastability (1)
Fundamental problems:
• MS behavior is highly sensitive, esp. to loading
• cannot measure w/o influencing
• can only make indirect observation
What is an "upset" at all?
• no sharp definition
• MS interpretation becomes ambiguous
• often "by chance" (threshold of next stage) or "deliberate" (scope)
Detecting Metastability (2)
Practical problems:
FFs in "relevant" circuits are not accessible
• cannot propagate subtle effects over pins
• cannot reliably capture them on-chip either
detection circuits usually involve forks
• different path delays, different thresholds
• usually ignored: symmetry assumed
how do PVT variations impact the results?
• in DUT and measurement circuit
which manifestation of MS to observe?
• intermediate voltage, output proximity, late trans.
where to get the reference from?
• infinite time…
Relating the Results
We plot log(MTBU) or t_CO over t_DtoC
How to determine t_DtoC?
• measure with oscilloscope/counter
• know from timing control: dly 2 - dly 1
This relates to the external view (pins)!
The actual FF cell will perceive a different timing due to non-matching path delays for C/D
At best this may shift the MS point, but what about variable path delays (VT)?
Time Accuracy
Clock
• how accurate/stable is it?
• where is it used?
Delay
• how accurate is it?
• in which granularity can I vary it?
Output delay measurement
• how accurate is my scope?
Uncertainty Characterization
…is a „must“ in many types of measurement.
Result is given as value ± u%
For probabilistic results: confidence interval
These types of characterization allow
• estimation of the credibility of value
• determination of worst case for value
• calculation of compound uncertainty
Why not care for this in metastability measurement / MTBU prediction?
Why we SHOULD care
There is no other evidence for the (even approximate) correctness of MTBU prediction: Wait for 1000 years?
Highly super-linear dependence of predicted MTBU on measured parameters => may amplify errors!
Given the ample PVT variations – how to translate a specific measurement result into a generally valid prediction?
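The error amplification can be made concrete with a small sketch based on the exponential relation MTBU ∝ exp(t_res / τ_C). It computes by which factor the predicted MTBU changes when τ_C is measured with a relative error, everything else being equal. The function name and the example ratio t_res / τ_C = 60 are assumptions for illustration.

```python
import math

def mtbu_error_factor(t_res, tau_c, rel_error_tau):
    """Ratio of the MTBU predicted with a mis-measured tau_c to the MTBU
    predicted with the true tau_c (T_0, f_clk and lam_dat cancel out)."""
    tau_measured = tau_c * (1.0 + rel_error_tau)
    return math.exp(t_res / tau_measured - t_res / tau_c)

T_RES = 3e-9       # assumed resolution time in s
TAU_C = 50e-12     # assumed true time constant in s

for err in (-0.10, -0.05, 0.05, 0.10):
    factor = mtbu_error_factor(T_RES, TAU_C, err)
    print(f"tau_c error {err:+.0%} -> predicted MTBU off by factor {factor:.3g}")
```

Under these assumptions, underestimating τ_C by only 10 % already inflates the predicted MTBU by a factor of several hundred.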
What about simulation
simulation can provide access to all nodes of interest in a non-intrusive way
metastability is, however, a very subtle effect, depending on many details
• a very detailed model for transistors (parasitics) and circuit (layout!) is needed
• analog simulation is needed, so the simulation time may become considerable
• finding the right phase CLK to data is difficult
• the simulator tends to run into numeric problems
• noise is not necessarily considered
so are the results finally representative?
Summary (1)
Metastability is unavoidable when mapping from a continuous space to a binary one.
It can result in late transition, creeping or oscillation.
It can be specified away, but only in a closed system.
Metastable inputs make gates operate out of spec, hence their behavior is undefined.
Metastability can propagate, even over masking provisions (TMR, etc.)
Summary (2)
In practice, the risk of facing a metastable upset can be made arbitrarily small.
On a statistical basis, the upset probability of a flip-flop can be predicted.
The corresponding equation can be derived by investigating the homogeneous solution of a dynamic model built from first-order models of the inverters.
The generally used equation is based on many simplifying assumptions.
Summary (3)
The required model parameters are often hard to find. Their determination by measurements involves a lot of uncertainties.
Synchronizers trade performance for a reduced probability of a metastable upset.
Metastability is also an issue for modern technologies. It can be best mitigated by conservative design and large timing margins.