ee4271 vlsi design interconnect optimizations buffer insertion
Post on 20-Dec-2015
![Page 1: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/1.jpg)
EE4271 VLSI Design
Interconnect Optimizations
Buffer Insertion
![Page 2: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/2.jpg)
Moore’s law
The number of transistors doubles approximately every two years, so clock frequency should double accordingly
![Page 3: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/3.jpg)
[Figure: delay (psec, 0 to 300) versus technology generation (µm, from 0.8 down to 0.15), comparing transistor/gate delay with interconnect delay. Source: Gordon Moore, Chairman Emeritus, Intel Corp.]
Interconnects Dominate
This is why Moore’s law is not true anymore.
![Page 4: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/4.jpg)
Objectives
• What have we learned?
– Compute circuit delay on wires and gates
– Gate delay optimization
• What are we going to learn?
– Interconnect delay optimization: buffer insertion
  • Why reduce delay
  • How to perform it
– This is the most important optimization in circuit design
![Page 5: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/5.jpg)
[Figure: the same plot of transistor/gate delay and interconnect delay versus technology generation. Source: Gordon Moore, Chairman Emeritus, Intel Corp.]
Why does this trend occur?
![Page 6: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/6.jpg)
A scaling primer
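• Ideal process scaling:
– Device geometries shrink by s (= 0.7x)
  • Device delay shrinks by s
– Wire geometries shrink by s
  • Unit resistance: $\rho/(ws \cdot hs) = r/s^2$
  • Unit coupling capacitance: $\epsilon(hs)/(Ss) = \epsilon h/S$
• With s = 0.7, resistance per unit length doubles ($1/s^2 \approx 2$), while capacitance per unit length is roughly unchanged
• How about the change in wire length?
[Figure: a device (S, G, D) and wires of width w, height h, and length l, separated by spacing S]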
![Page 7: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/7.jpg)
Technology scaling
• Global (long) interconnect lengths don’t shrink
– Global interconnects link cells that are far apart
• Local (short) interconnect lengths shrink by s
– Local interconnects link cells that are nearby
![Page 8: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/8.jpg)
Interconnect delay scaling
• Delay of a wire of length l: $\tau_{int} = (rl)(cl) = rcl^2$ (a quadratic function of length)
• Local interconnects: $\tau_{int} = (r/s^2)(c)(ls)^2 = rcl^2$
– Local interconnect delay unchanged
• Global interconnects: $\tau_{int} = (r/s^2)(c)(l)^2 = rcl^2/s^2$
– Global interconnect delay doubled: unsustainable!
• Interconnect delay is increasingly dominant
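The scaling argument above can be checked numerically. The following is a minimal sketch, assuming one ideal scaling step with s = 0.7 and hypothetical unit values for r, c, and l; the function name is illustrative and not from the slides.

```python
def scaled_wire_delay(r, c, l, s, is_global):
    """Wire delay r*c*l^2 after one ideal scaling step with factor s (< 1)."""
    r_scaled = r / s**2                     # unit resistance grows as 1/s^2
    c_scaled = c                            # unit capacitance roughly unchanged
    l_scaled = l if is_global else l * s    # only local wires shrink
    return r_scaled * c_scaled * l_scaled**2

# Hypothetical values: r = c = 1, l = 10, s = 0.7
base = 1.0 * 1.0 * 10.0**2
local = scaled_wire_delay(1.0, 1.0, 10.0, 0.7, is_global=False)
glob = scaled_wire_delay(1.0, 1.0, 10.0, 0.7, is_global=True)
print(local / base, glob / base)  # local ratio stays near 1, global ratio near 2
```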
![Page 9: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/9.jpg)
Buffer Insertion For Delay Reduction
![Page 10: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/10.jpg)
Elmore Delay for Wire
[Figure: a wire of length x driving a load capacitance C, with unit wire capacitance c and unit wire resistance r]
![Page 11: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/11.jpg)
Elmore Delay for Buffer
[Figure: a buffer from node u to node v driving a load C, modeled by its input capacitance and its driving resistance]
![Page 12: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/12.jpg)
Elmore Delay for A Circuit
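• Delay = $\sum_{\text{all } R_i} \; \sum_{\text{all } C_j \text{ downstream from } R_i} R_i C_j$
• Elmore delay to n1: R(B)*(C1+C2)
• Elmore delay to n2: R(B)*(C1+C2)+R(w)*C2
[Figure: buffer B with resistance R(B) drives node n1 (capacitance C1); a wire of resistance R(w) connects n1 to node n2 (capacitance C2)]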
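The double sum can be sketched for a simple RC chain, where every capacitor at or beyond a resistor counts as downstream of it. The numeric values below are hypothetical and the function name is illustrative.

```python
def elmore_delays(resistances, capacitances):
    """Elmore delay to every node of an RC chain.

    resistances[i] connects node i-1 to node i (node -1 is the source);
    capacitances[i] is the capacitance to ground at node i.
    """
    delays = []
    total = 0.0
    for i, r_i in enumerate(resistances):
        downstream = sum(capacitances[i:])  # all C_j downstream from R_i
        total += r_i * downstream
        delays.append(total)
    return delays

# Hypothetical values for the two-node circuit: R(B), R(w), C1, C2
R_B, R_w, C1, C2 = 2.0, 3.0, 1.0, 4.0
d_n1, d_n2 = elmore_delays([R_B, R_w], [C1, C2])
print(d_n1, d_n2)
```

Here `d_n1` equals R(B)*(C1+C2) and `d_n2` adds R(w)*C2, matching the two expressions on the slide.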
![Page 13: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/13.jpg)
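Buffers Reduce Wire Delay
[Figure: a driver of resistance R drives a wire of length x into a load C; in the buffered version the wire is split into two halves of length x/2, each modeled by resistance rx/2 with capacitance cx/4 at either end]
$t_{unbuf} = R(cx + C) + rx(cx/2 + C)$
$t_{buf} = 2R(cx/2 + C) + rx(cx/4 + C)$
$t_{buf} - t_{unbuf} = RC - rcx^2/4$
[Plot: ∆t versus x]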
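A small sketch of the two delay expressions, assuming (as the formulas do) that the buffer has the same driving resistance R and input capacitance C as the driver and the load. The break-even length follows from setting RC - rcx²/4 to zero; the function names and numeric values are illustrative.

```python
import math

def t_unbuf(R, C, r, c, x):
    """Elmore delay of the unbuffered wire of length x."""
    return R * (c * x + C) + r * x * (c * x / 2 + C)

def t_buf(R, C, r, c, x):
    """Elmore delay with one buffer inserted at the midpoint."""
    return 2 * R * (c * x / 2 + C) + r * x * (c * x / 4 + C)

def break_even_length(R, C, r, c):
    """Length beyond which the midpoint buffer reduces delay (RC = rcx^2/4)."""
    return 2 * math.sqrt(R * C / (r * c))

# Hypothetical values: R = C = r = c = 1
x0 = break_even_length(1, 1, 1, 1)
diff_short = t_buf(1, 1, 1, 1, 1.0) - t_unbuf(1, 1, 1, 1, 1.0)
diff_long = t_buf(1, 1, 1, 1, 4.0) - t_unbuf(1, 1, 1, 1, 4.0)
print(x0, diff_short, diff_long)
```

For wires shorter than the break-even length the buffer adds delay; for longer wires it removes delay.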
![Page 14: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/14.jpg)
Buffered global interconnects: Intuition
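Unbuffered interconnect delay = $rcl^2/2$
Buffered interconnect delay = $\sum_i rcl_i^2/2 < rcl^2/2$ (where $l = \sum_j l_j$), since $\sum_j l_j^2 < \left(\sum_j l_j\right)^2$
(Of course, we need to consider buffer delay as well)
[Figure: a wire of length l divided by buffers into segments $l_1, l_2, l_3, \ldots, l_n$]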
![Page 15: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/15.jpg)
Optimal Buffer Insertion on A Wire
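• Delay before buffer insertion = $rcL^2/2$
• Assume N identical buffers with equal inter-buffer length l (L = Nl)
$T = N\left[R_d(C_g + cl) + rl(C_g + cl/2)\right] = L\left[\frac{rcl}{2} + (rC_g + R_d c) + \frac{1}{l} R_d C_g\right]$
• For minimum delay, $\frac{dT}{dl} = 0$:
$L\left[\frac{rc}{2} - \frac{R_d C_g}{l_{opt}^2}\right] = 0 \;\Rightarrow\; l_{opt} = \sqrt{\frac{2 R_d C_g}{rc}}$
• $R_d$: on-resistance of inverter
• $C_g$: gate input capacitance
• r, c: unit resistance and capacitance
[Figure: a wire of total length L with identical buffers inserted at equal spacing l]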
![Page 16: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/16.jpg)
Optimal interconnect delay
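• Substituting $l_{opt}$ back into the interconnect delay expression:
$T_{opt} = L\left[\frac{rc\,l_{opt}}{2} + (rC_g + R_d c) + \frac{1}{l_{opt}} R_d C_g\right] = L\left[\frac{rc}{2}\sqrt{\frac{2 R_d C_g}{rc}} + (rC_g + R_d c) + R_d C_g\sqrt{\frac{rc}{2 R_d C_g}}\right]$
$T_{opt} = L\left(\sqrt{2 R_d C_g rc} + rC_g + R_d c\right)$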
Delay grows linearly with L (instead of quadratically)
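The closed forms can be sanity-checked numerically. This is a minimal sketch with hypothetical parameter values (r = 1, c = 2, Rd = 4, Cg = 1); it treats the number of stages N = L/l as continuous, as the derivation does, and the function names are illustrative.

```python
import math

def buffered_delay(L, l, r, c, Rd, Cg):
    """Total delay of a wire of length L split into stages of length l."""
    n = L / l  # number of stages, treated as continuous
    return n * (Rd * (Cg + c * l) + r * l * (Cg + c * l / 2))

def l_opt(r, c, Rd, Cg):
    """Optimal inter-buffer length sqrt(2*Rd*Cg/(r*c))."""
    return math.sqrt(2 * Rd * Cg / (r * c))

def t_opt(L, r, c, Rd, Cg):
    """Delay of the optimally buffered wire; linear in L."""
    return L * (math.sqrt(2 * Rd * Cg * r * c) + r * Cg + Rd * c)

r, c, Rd, Cg = 1.0, 2.0, 4.0, 1.0   # hypothetical values
best_l = l_opt(r, c, Rd, Cg)
print(best_l, t_opt(10.0, r, c, Rd, Cg), buffered_delay(10.0, best_l, r, c, Rd, Cg))
```

Evaluating `buffered_delay` at spacings slightly above or below `best_l` gives a larger delay, and doubling L doubles `t_opt`.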
![Page 17: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/17.jpg)
Total buffer count
• Ever-increasing fractions of total cell count will be buffers
– 70% in 32nm
– 25% is widely observed
[Figure: % of cells used to buffer nets (0 to 80) at 90nm, 65nm, 45nm, and 32nm; series: clk-buf, buf, tot-buf]
![Page 18: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/18.jpg)
ITRS projections
[Figure: relative delay (log scale, 0.1 to 100) versus feature size (250, 180, 130, 90, 65, 45, 32 nm) for gate delay, local interconnect (M1,2), global interconnect with repeaters, and global interconnect without repeaters. Source: ITRS, 2003]
![Page 19: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/19.jpg)
Exercise 1
• Given a wire of length 10 with r=2, c=2, what is its delay?
• Given a buffer with Rd =10, Cg=20, after optimally buffering the wire, what is the delay?
• What if wire length is 100?
• Any conclusion?
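One way to check your answers is to evaluate the two delay formulas from the earlier slides, rcL²/2 for the unbuffered wire and L(√(2RdCgrc) + rCg + Rdc) for the optimally buffered wire, with the exercise values. This is only a sketch; the function names are illustrative.

```python
import math

r, c = 2, 2        # unit wire resistance and capacitance from the exercise
Rd, Cg = 10, 20    # buffer driving resistance and input capacitance

def unbuffered(L):
    """Wire delay before buffer insertion: r*c*L^2/2."""
    return r * c * L**2 / 2

def buffered(L):
    """Delay of the optimally buffered wire (continuous buffer spacing)."""
    return L * (math.sqrt(2 * Rd * Cg * r * c) + r * Cg + Rd * c)

for L in (10, 100):
    print(L, unbuffered(L), buffered(L))
```

Comparing the two columns at L = 10 and at L = 100 shows on which side of the crossover each wire length falls.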
![Page 20: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/20.jpg)
Exercise 2
• Relationship with gate sizing
– If we can size the buffer, what is the best buffer size?
– Let R0 denote the unit size buffer driving resistance, and C0 denote the unit size buffer input capacitance. Thus, Rd=R0/h and Cg=C0h
– What is the best h leading to the smallest delay?
![Page 21: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/21.jpg)
Analogy
![Page 22: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/22.jpg)
Analogy
• Advancing technology = period of city expansion, more transistors = larger city
• Interconnects = streets
• Buffers = gas stations
• Signal delay (timing) = time to cross the city
• Buffer insertion = gas station construction
![Page 23: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/23.jpg)
Previous Result is Only Theoretical: Discrete Buffer Locations
Candidate buffer locations
![Page 24: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/24.jpg)
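RAT: Required Arrival Time
[Figure: a wire with delay 80 from driver to sink; AT = 0 at the driver and RAT = 100 at the sink]
[Figure: the same wire after propagating timing; the driver has AT = 0 and RAT = 20, the sink has AT = 80 and RAT = 100]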
![Page 25: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/25.jpg)
Slack: RAT - AT
[Figure: a wire with delay 80; the driver has RAT = 20, AT = 0, and Slack = 20; the sink has RAT = 100, AT = 80, and Slack = 20]
Minimizing circuit delay = maximizing RAT at driver = maximizing slack at driver
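The numbers on this slide follow from two subtractions, sketched below. RAT propagates backward from the sink, AT propagates forward from the driver, and the function names are illustrative.

```python
def rat_upstream(rat_downstream, delay):
    """RAT at the upstream end of a wire: RAT downstream minus wire delay."""
    return rat_downstream - delay

def slack(rat, at):
    """Slack = RAT - AT."""
    return rat - at

wire_delay = 80
rat_sink, at_driver = 100, 0

rat_driver = rat_upstream(rat_sink, wire_delay)  # 100 - 80
at_sink = at_driver + wire_delay                 # 0 + 80
print(slack(rat_driver, at_driver), slack(rat_sink, at_sink))
```

Both ends of the wire see the same slack, which is why maximizing RAT at the driver is equivalent to maximizing slack there.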
![Page 26: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/26.jpg)
Motivation for Problem Formulation
• Without buffer (slack = -50):
– Sink 1: RAT = 300, AT = 350, Slack = RAT - AT = -50
– Sink 2: RAT = 700, AT = 600, Slack = 100
• With buffer (slack = 50): decouple capacitive load from critical path
– Sink 1: RAT = 300, AT = 250, Slack = 50
– Sink 2: RAT = 700, AT = 400, Slack = 300
• RAT = Required Arrival Time
• Slack = RAT - AT
• We need to maximize slack (or RAT) at the driver
![Page 27: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/27.jpg)
Timing Driven Buffering Problem Formulation
• Given
– A Steiner tree
– RAT at each sink
– A buffer type
– RC parameters
– Candidate buffer locations
• Find a buffer insertion solution such that the slack (or RAT) at the driver is maximized
![Page 28: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/28.jpg)
An Example for Buffer Insertion
Candidates are written (node, C, Q); each wire segment has length 2.
• r = 1, c = 1
• Rb = 1, Cb = 1
• Rd = 1
• Sink candidate: (v1, 1, 20)
• Add wire: (v2, 3, 16)
– Add wire: (v3, 5, 8); add driver: slack = 3
– Insert buffer: (v2, 1, 13); add wire: (v3, 3, 9); add driver: slack = 6
![Page 29: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/29.jpg)
Candidate Buffering Solution
• Definition
• Each candidate solution is associated with
– vi: a node
– ci: downstream capacitance
– qi: RAT
[Figure: if vi is a sink, ci is the sink capacitance; v can also be an internal node]
![Page 30: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/30.jpg)
Van Ginneken’s Algorithm
Candidate solutions are propagated toward the source
![Page 31: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/31.jpg)
Solution Propagation: Add Wire
• $c_2 = c_1 + cx$
• $q_2 = q_1 - rcx^2/2 - rxc_1$
• r: wire resistance per unit length
• c: wire capacitance per unit length
[Figure: a wire of length x from (v2, c2, q2) at its upstream end to (v1, c1, q1) at its downstream end]
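The add-wire update can be sketched as a small function. The call below uses the values from the earlier example slide (sink candidate with C = 1, Q = 20, wire length 2, r = c = 1); the function name is illustrative.

```python
def add_wire(c1, q1, x, r, c):
    """Propagate a candidate (c1, q1) upstream across a wire of length x."""
    c2 = c1 + c * x                            # wire capacitance adds to the load
    q2 = q1 - r * c * x**2 / 2 - r * x * c1    # subtract the wire's Elmore delay
    return c2, q2

c2, q2 = add_wire(1, 20, x=2, r=1, c=1)
print(c2, q2)
```

This reproduces the candidate (v2, 3, 16) from the example.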
![Page 32: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/32.jpg)
Solution Propagation: Insert Buffer
• c1b = Cb
• q1b = q1 – Rbc1
• Cb: buffer capacitance
• Rb: buffer resistance
[Figure: a buffer inserted at v1 turns candidate (v1, c1, q1) into (v1, c1b, q1b)]
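A sketch of the insert-buffer update, following the two equations on this slide (the buffer delay is modeled as Rb times the load it drives, with no intrinsic delay term). The call uses the candidate (v2, 3, 16) and Rb = Cb = 1 from the earlier example; the function name is illustrative.

```python
def insert_buffer(c1, q1, Rb, Cb):
    """Insert a buffer in front of candidate (c1, q1)."""
    c1b = Cb               # upstream now sees only the buffer input capacitance
    q1b = q1 - Rb * c1     # buffer delay: driving resistance times its load
    return c1b, q1b

c1b, q1b = insert_buffer(3, 16, Rb=1, Cb=1)
print(c1b, q1b)
```

This reproduces the candidate (v2, 1, 13) from the example.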
![Page 33: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/33.jpg)
Solution Propagation: Add Driver
• q0d = q0 – Rdc0
• Rd: driver resistance
• Pick solution with max slack
[Figure: adding the driver at v0 turns candidate (v0, c0, q0) into (v0, c0d, q0d)]
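The final step can be sketched as follows, using the two candidates (5, 8) and (3, 9) that reach the driver in the earlier example with Rd = 1. The sketch assumes the arrival time at the driver is 0, so the value q0d is also the slack; the function name is illustrative.

```python
def add_driver(c0, q0, Rd):
    """RAT at the driver input after accounting for the driver delay."""
    return q0 - Rd * c0

Rd = 1
candidates = [(5, 8), (3, 9)]   # (c0, q0) pairs at the driver output
slacks = [add_driver(c0, q0, Rd) for c0, q0 in candidates]
best = max(candidates, key=lambda k: add_driver(k[0], k[1], Rd))
print(slacks, best)
```

The candidate with the smaller load wins here even though the two q values differ by only 1, because the driver delay scales with the load it sees.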
![Page 34: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/34.jpg)
Exercise
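[Figure: a sink with candidate (20,400) connected to the driver through wire segments of length 2]
• Unit Wire Cap = 5
• Unit Wire Res = 3
• Buffer C=5, R=1
• Perform buffer insertion to maximize the slack at the driver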
![Page 35: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/35.jpg)
Exponential Runtime
2 solutions
4 solutions
8 solutions
16 solutions
n candidate buffer locations lead to $2^n$ solutions
![Page 36: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/36.jpg)
Solution Pruning
• Two candidate solutions
– (v, c1, q1)
– (v, c2, q2)
• Solution 1 is inferior if
– c1 > c2: larger load
– and q1 < q2: tighter timing
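A minimal sketch of the pruning rule for candidates at one node. It sorts by load and sweeps once, keeping a candidate only if its q beats every candidate with a smaller or equal load. Note one assumption beyond the slide: ties are also dropped (c1 ≥ c2 and q1 ≤ q2), so of two candidates with the same load only the one with larger q survives. The input list uses the four candidates from the worked example in this deck.

```python
def prune(candidates):
    """Remove inferior (c, q) candidates: no smaller load and no larger RAT."""
    kept = []
    # Smallest load first; among equal loads, largest q first.
    for c, q in sorted(candidates, key=lambda k: (k[0], -k[1])):
        if not kept or q > kept[-1][1]:
            kept.append((c, q))
    return kept

survivors = prune([(40, 40), (5, 0), (15, 160), (5, 145)])
print(survivors)
```

After pruning, the surviving list is strictly increasing in both c and q, so neither of any two survivors dominates the other.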
![Page 37: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/37.jpg)
An Analogy - I
[Figure label: LOAD]
• Faster -> Smaller Delay -> Larger RAT (since RAT = RAToutput - Delay)
• Larger Load -> Larger Capacitance
![Page 38: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/38.jpg)
An Analogy - II
[Figure labels: LOAD, LOAD, END]
• Faster & smaller load (larger RAT, smaller capacitance): Good
• Slower & larger load (smaller RAT, larger capacitance): Inferior
![Page 39: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/39.jpg)
An Analogy - III
[Figure label: END]
Who will be the winner? Cannot tell at this moment, so keep both of them.
![Page 40: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/40.jpg)
An Analogy - IV
[Figure label: END]
Who will be the winner? Cannot tell at this moment, so keep both of them.
![Page 41: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/41.jpg)
Pruning When Insert Buffer
All candidates created by inserting a buffer at the same node have the same load capacitance Cb, so only the one with the maximum q is kept
![Page 42: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/42.jpg)
Generating Candidates
[Figure: candidate generation, steps (1), (2), and (3)]
From Dr. Charles Alpert
![Page 43: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/43.jpg)
Pruning Candidates
[Figure: step (3) with candidates (a) and (b), followed by step (4)]
Both (a) and (b) “look” the same to the source. Throw out the one with the worse slack.
![Page 44: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/44.jpg)
Candidate Example Continued
[Figure: steps (4) and (5)]
![Page 45: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/45.jpg)
Candidate Example Continued: After Pruning
[Figure: step (5) after pruning]
At the driver, compute which candidate maximizes slack. The result is optimal.
![Page 46: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/46.jpg)
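Example
• Unit Wire Cap = 5, Unit Wire Res = 3
• Buffer C=5, R=1
• Wire segments of length 2; candidates are written (C, Q)
• Sink: (20,400)
• After the first segment: (30,250); with a buffer inserted: (5,220)
• After the second segment: (40,40) and (15,160); with a buffer inserted: (5,0) and (5,145)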
![Page 47: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/47.jpg)
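Example Cont’d
• Candidates before pruning: (40,40), (5,0), (15,160), (5,145)
• (5,0) is inferior to (5,145). (40,40) is inferior to (15,160)
• Candidates after pruning: (15,160), (5,145)
• Propagated across the next segment with a buffer inserted: (5,15) from (15,160) and (5,70) from (5,145)
• Pick the solution with the largest slack, then follow the arrows to recover the buffer positions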
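The whole example can be replayed with a short sketch of candidate propagation along a single wire. Assumptions not stated on the slides: there are three segments of length 2, a candidate buffer location sits at the upstream end of each segment (numbered 1 to 3 from the sink), and each candidate carries a third field listing where buffers were inserted so the solution can be recovered. The driver step is left out because the example gives no driver resistance. All function names are illustrative.

```python
def add_wire(cand, x, r, c):
    """Propagate candidate (cap, rat, buffers) across a wire of length x."""
    cap, rat, bufs = cand
    return (cap + c * x, rat - r * c * x * x / 2 - r * x * cap, bufs)

def insert_buffer(cand, node, Rb, Cb):
    """Insert a buffer at `node` in front of the candidate."""
    cap, rat, bufs = cand
    return (Cb, rat - Rb * cap, bufs + (node,))

def prune(cands):
    """Keep only candidates not dominated in both load and RAT."""
    kept = []
    for cand in sorted(cands, key=lambda k: (k[0], -k[1])):
        if not kept or cand[1] > kept[-1][1]:
            kept.append(cand)
    return kept

def buffer_wire(sink, segments, r, c, Rb, Cb):
    """Propagate candidates from the sink toward the source, pruning at each node."""
    cands = [(sink[0], sink[1], ())]
    for node, x in enumerate(segments, start=1):
        wired = [add_wire(k, x, r, c) for k in cands]
        buffered = [insert_buffer(k, node, Rb, Cb) for k in wired]
        cands = prune(wired + buffered)
    return cands

result = buffer_wire((20, 400), [2, 2, 2], r=3, c=5, Rb=1, Cb=5)
for cap, rat, bufs in result:
    print(cap, rat, bufs)
```

In this sketch the buffered candidate (5,15) is generated at the last location and pruned by (5,70). The unbuffered candidate (15,85) also survives alongside (5,70); choosing between them requires the driver resistance, which determines the final slack.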
![Page 48: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/48.jpg)
Exercise
• Without pruning, there will be an exponential number of candidate solutions (i.e., given n candidate buffer locations, there will be $2^n$ solutions). With pruning, how many solutions will we have?
![Page 49: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/49.jpg)
Exercise
• Continue the following buffer insertion process. Assume that all partial candidate buffering solutions are as shown.
• Unit Wire Cap = 1, Unit Wire Res = 1
• Buffer C=1, R=1
[Figure: a wire with segments of length 2 and 2; current candidates: (10,40), (8,50), (5,10), (15,40), (7,10), (9,30), (12,20)]
![Page 50: EE4271 VLSI Design Interconnect Optimizations Buffer Insertion](https://reader038.vdocuments.net/reader038/viewer/2022102907/56649d435503460f94a1fecf/html5/thumbnails/50.jpg)
Summary
• Interconnect delay increases with technology scaling
• Linear interconnect delay with buffer insertion
• Buffer insertion with candidate buffer locations
• Pruning accelerates the buffer insertion algorithm