Fast Algorithms for Slew Constrained Minimum Cost Buffering

S. Hu*, C. Alpert**, J. Hu*, S. Karandikar**, Z. Li*, W. Shi* and C. Sze**
*Dept. of ECE, Texas A&M University
**IBM Austin Research Lab


Outline

• Motivation
• Slew Model
• Algorithms
  - Discrete slew buffering with fixed input slew
  - Discrete slew buffering with non-fixed input slew
  - Continuous slew buffering
• Experimental Results
• Conclusion

Motivation

• Buffer insertion is prevalent in VLSI designs
  - e.g., for timing optimization and improving signal integrity
• Problems
  - Most nets (90-95%) are NOT timing critical, so they don't need a timing-driven formulation
  - Takes a long time, as millions of nets need buffering
  - Uses a ton of area, so area minimization is critical
• Our solution
  - Replace the timing-driven formulation with a slew-driven formulation: Slew Buffering
  - Good enough for most nets, as they are not critical
  - >100x faster than timing-driven buffering
  - Still saves area compared to timing buffering

A New Flow for Buffering 1M Nets

• First, buffer 1 million nets using the slew buffering algorithm
  - Can be done efficiently, as our algorithm can buffer 1000 industrial nets (48 buffer types) in 5 seconds
• Timing analysis finds that most nets, except say 50K critical nets, satisfy the timing constraint
• Rip up these 50K critical nets for rebuffering by the timing buffering algorithm
• Benefits
  - Much faster
  - More area saving

Slew Definition

[Figure: definitions of delay and slew (transition time)]

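Slew Model

• Slew degradation on the wire from the upstream node $v_i$ to the downstream node $v_j$:
  $s_w(v_i, v_j) = \ln 9 \cdot \mathrm{ElmoreDelay}(v_i, v_j)$
• Slew at the downstream node $v_j$:
  $s(v_j) = \sqrt{s_{b,out}(v_i)^2 + s_w(v_i, v_j)^2}$

[Figure: wire from upstream node $v_i$ to downstream node $v_j$, annotated with the buffer output slew $s_{b,out}(v_i)$, the slew degradation on the wire $s_w(v_i, v_j)$, and the resulting slew $s(v_j)$]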
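The two formulas of the slew model can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers in the usage line are hypothetical, and the Elmore delay is taken as an already computed input.

```python
import math

LN9 = math.log(9.0)  # ln 9: 10%-90% transition time of a single-pole RC response, in time constants


def wire_slew(elmore_delay: float) -> float:
    """Slew degradation on a wire: s_w = ln 9 * ElmoreDelay."""
    return LN9 * elmore_delay


def downstream_slew(s_b_out: float, s_w: float) -> float:
    """Slew at v_j: root-sum-square of the buffer output slew and the wire slew degradation."""
    return math.hypot(s_b_out, s_w)


# Hypothetical values, in arbitrary but consistent time units.
s_vj = downstream_slew(s_b_out=30.0, s_w=wire_slew(elmore_delay=10.0))
```

Because the two terms combine as a root-sum-square, the larger of the two dominates the slew seen at the downstream node.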

Buffer/Driver Input Slew Assumption

• Output slew of a buffer depends on its input slew, so bottom-up dynamic programming is inapplicable
• Assumption: the input slew of each buffer is conservatively assumed to be a fixed value
• Then the output slew of a buffer is
  $s_{b,out}(v_i) = R_b \cdot C(v_i) + K_b$
  where $R_b$ and $K_b$ are called slew resistance and intrinsic slew
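As a small illustration of the linear output-slew model under the fixed input slew assumption, the sketch below evaluates $R_b \cdot C(v_i) + K_b$ for a given load. The function name and the sample values are hypothetical.

```python
def buffer_output_slew(r_b: float, k_b: float, load_cap: float) -> float:
    """Output slew of a buffer: slew resistance times downstream capacitance, plus intrinsic slew."""
    return r_b * load_cap + k_b


# Hypothetical buffer: slew resistance 2.0, intrinsic slew 0.5, driving a load of 3.0.
example_slew = buffer_output_slew(r_b=2.0, k_b=0.5, load_cap=3.0)
```

With the input slew fixed, the output slew depends only on the downstream capacitance, which is what lets candidate solutions be propagated bottom-up.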

Slew Resistance of an Inverter

Problem Formulation

• Given
  - A Steiner tree
  - Maximum input slew rate α at each buffer/sink (slew constraint)
  - A buffer library
  - RC parameters
  - Candidate buffer locations
• Find a minimal-area buffer insertion solution such that the slew constraint is satisfied

NP-Complete Proof

• 2-1 partition problem: given $2n$ positive integers $x_1, x_2, \ldots, x_{2n}$ with $\sum_{i=1}^{2n} x_i = 2N$, is there an index set $I$ which contains exactly one of $2i-1$ and $2i$ for each $1 \le i \le n$ such that $\sum_{i \in I} x_i = N$?
• There is a solution with slew constraint $N^n$ and cost $M = N + \sum_{i=1}^{n} N^i$

[Figure: net used in the reduction, with segments labeled by powers of N]

Buffer      Slew Res    Input Cap    Cost
B_1         1           x_1          x_2 + N^n
B_2         1           x_2          x_1 + N^n
B_3         N           x_3          x_4 + N^{n-1}
B_4         N           x_4          x_3 + N^{n-1}
...         ...         ...          ...
B_{2n-1}    N^{n-1}     x_{2n-1}     x_{2n} + N
B_{2n}      N^{n-1}     x_{2n}       x_{2n-1} + N

[Equation: relation between the sums of $x_i$ over the index set $I$ and the bounds expressed in powers of $N$]

Fixed-Input Slew Buffering: Candidate Solution Characteristics

• Each candidate solution is associated with
  - $v_i$: a node
  - $c_i$: downstream capacitance
  - $s_i$: cumulative slew degradation along wire
  - $w_i$: cumulative buffer area
• If $v_i$ is a sink, $c_i$ is the sink capacitance

[Figure: tree with sinks and an internal node v]
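One possible in-memory form of a candidate solution is sketched below. The class and function names are hypothetical, and starting a sink candidate with zero slew degradation and zero buffer area is an assumption that the slide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    node: str   # v: the node this candidate solution belongs to
    c: float    # downstream capacitance
    s: float    # cumulative slew degradation along wire
    w: float    # cumulative buffer area


def sink_candidate(node: str, sink_cap: float) -> Candidate:
    """Initial candidate at a sink: c is the sink capacitance.

    Assumption: no wire has been traversed and no buffer inserted yet, so s = w = 0.
    """
    return Candidate(node=node, c=sink_cap, s=0.0, w=0.0)


start = sink_candidate("sink_a", sink_cap=2.5)
```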

Dynamic Programming

• Start from sinks
• Candidate solutions are generated
• Candidate solutions are propagated toward the source
• Three operations
  - Add Wire
  - Insert Buffer
  - Merge
• Solution Pruning

Solution Propagation: Add Wire

• $c_2 = c_1 + c x$
• $s_2 = s_1 + \left(\frac{r c x^2}{2} + r x c_1\right) \ln 9$
• s: slew degradation along wires
• r: wire slew resistance per unit length
• c: wire capacitance per unit length

[Figure: a wire of length x propagates the solution $(v_1, c_1, w_1, s_1)$ to $(v_2, c_2, w_2, s_2)$]
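The Add Wire update can be sketched as follows. Candidates are plain `(c, w, s)` tuples here, the function name is hypothetical, and leaving the buffer area `w` unchanged is an assumption (a wire adds no buffer area) that the slide does not spell out.

```python
import math


def add_wire(cand, x, r, c):
    """Propagate a candidate (c1, w1, s1) across a wire of length x.

    r and c are the wire resistance and capacitance per unit length.
    """
    c1, w1, s1 = cand
    c2 = c1 + c * x                                               # load grows by the wire capacitance
    s2 = s1 + (r * c * x * x / 2.0 + r * x * c1) * math.log(9.0)  # ln 9 times the Elmore delay of the wire
    return (c2, w1, s2)                                           # assumption: buffer area is unchanged


c2, w2, s2 = add_wire((1.0, 0.0, 0.0), x=2.0, r=0.5, c=0.25)
```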

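Solution Propagation: Insert Buffer

• $c_{1b} = C_b$, $s_{1b} = 0$
• $w_{1b} = w_1 + w(b)$
• $C_b$: buffer input capacitance
• $R_b$: buffer output slew resistance
• $K_b$: buffer intrinsic slew
• Pruned if the following slew constraint is violated:
  $\sqrt{(R_b c_1 + K_b)^2 + s_1^2} \le \alpha$

[Figure: inserting buffer b at $v_1$ turns $(v_1, c_1, w_1, s_1)$ into $(v_1, c_{1b}, w_{1b}, s_{1b})$]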
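A minimal sketch of the Insert Buffer step, again on `(c, w, s)` tuples. The function and parameter names are hypothetical, and returning `None` for a pruned candidate is just one convenient convention.

```python
import math


def insert_buffer(cand, r_b, k_b, c_b, area_b, alpha):
    """Insert a buffer above candidate (c1, w1, s1).

    Returns the new candidate, or None if the slew constraint alpha is violated.
    """
    c1, w1, s1 = cand
    slew = math.hypot(r_b * c1 + k_b, s1)   # sqrt((R_b*c1 + K_b)^2 + s1^2)
    if slew > alpha:
        return None                         # pruned: the downstream stage would see too much slew
    return (c_b, w1 + area_b, 0.0)          # load becomes C_b, slew degradation resets to 0


kept = insert_buffer((1.0, 2.0, 4.0), r_b=2.0, k_b=1.0, c_b=0.5, area_b=3.0, alpha=5.5)
dropped = insert_buffer((1.0, 2.0, 4.0), r_b=2.0, k_b=1.0, c_b=0.5, area_b=3.0, alpha=4.0)
```

In the usage lines the slew evaluates to 5, so the candidate survives with alpha = 5.5 and is pruned with alpha = 4.0.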

Solution Propagation: Merge

• $c_{merge} = c_l + c_r$
• $w_{merge} = w_l + w_r$
• $s_{merge} = \max(s_l, s_r)$

[Figure: at a branch node v, the left solution $(v, c_l, w_l, s_l)$ and the right solution $(v, c_r, w_r, s_r)$ are merged]
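The Merge step can be sketched on `(c, w, s)` tuples as below. Pairing every left candidate with every right candidate in `merge_lists` is an assumption about how the branch lists are combined; the slide only gives the per-pair formulas.

```python
def merge(left, right):
    """Merge two branch candidates meeting at the same node."""
    c_l, w_l, s_l = left
    c_r, w_r, s_r = right
    return (c_l + c_r, w_l + w_r, max(s_l, s_r))


def merge_lists(lefts, rights):
    """Assumed combination rule: try every left/right pair."""
    return [merge(a, b) for a in lefts for b in rights]


merged = merge((1.0, 2.0, 0.3), (0.5, 1.0, 0.7))
```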

Solution Pruning

• Two candidate solutions
  - Solution 1: $(v, c_1, w_1, s_1)$
  - Solution 2: $(v, c_2, w_2, s_2)$
• Solution 1 is inferior if
  - $c_1 > c_2$: larger load
  - and $w_1 > w_2$: larger buffer area
  - and $s_1 > s_2$: worse cumulative slew degradation on wire
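The inferiority test can be sketched as a direct pairwise filter over `(c, w, s)` tuples. This quadratic version is only meant to show the rule with the strict inequalities from the slide; it is not a claim about how the authors implement pruning efficiently.

```python
def is_inferior(a, b):
    """True if candidate a is worse than b in load, buffer area, and slew degradation."""
    return a[0] > b[0] and a[1] > b[1] and a[2] > b[2]


def prune(cands):
    """Keep only candidates that are not inferior to any other candidate."""
    return [a for a in cands if not any(is_inferior(a, b) for b in cands)]


survivors = prune([(1.0, 1.0, 1.0), (2.0, 2.0, 2.0), (0.5, 3.0, 0.5)])
```

In this example `(2.0, 2.0, 2.0)` is removed because `(1.0, 1.0, 1.0)` beats it on all three fields, while `(0.5, 3.0, 0.5)` stays because it has a smaller load despite its larger area.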

Timing vs Slew Buffering (I)

• A buffer insertion: S = 0, C = C(b)
• Inserting one buffer yields one new solution (the one with the smallest cost)
• In min-cost timing buffering, a buffer insertion brings many non-inferior (C, W, Q) with the same C, where Q is the required arrival time (RAT)

Timing vs Slew Buffering (II)

• Slew constraint is close to length constraint
• An extreme case
  - In min-cost timing buffering, solutions with no buffer inserted live till the driver
  - They soon become infeasible in slew buffering
• A linear-time optimal algorithm for slew buffering with a single buffer type
• No polynomial-time min-cost timing buffering algorithm in the same case

Non-Fixed Input Slew

• Input slew to each buffer can vary
• Our idea: discretize the possible input slew values into input slew bins
• For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions
• Some details
  - Input slew bins can be merged for speedup
  - Inferiority also depends on the slew bin
  - A maximum bipartite matching algorithm is used for pruning
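The discretization idea can be illustrated with a small binning helper. The slides do not say how the bins are laid out, so the uniform bins over (0, α] and the round-up rule below are assumptions, and both function names are hypothetical.

```python
import bisect


def make_slew_bins(alpha, num_bins):
    """Upper edges of num_bins uniform input slew bins covering (0, alpha] (assumed layout)."""
    return [alpha * (k + 1) / num_bins for k in range(num_bins)]


def bin_for(slew, bins):
    """Index of the smallest bin whose upper edge is >= slew, or None if slew exceeds the last edge.

    Rounding up to the bin edge keeps the assumed input slew conservative.
    """
    k = bisect.bisect_left(bins, slew)
    return k if k < len(bins) else None


bins = make_slew_bins(alpha=2.0, num_bins=4)   # [0.5, 1.0, 1.5, 2.0]
```

Each bin edge can then play the role of the fixed input slew when the fixed-input procedure is run for that bin.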

Continuous Slew Buffering

• Buffers are allowed to be inserted anywhere
• Single buffer type: a linear-time greedy optimal algorithm: add each buffer as far upstream as possible
  - Start from sinks
  - Greedy algorithm toward the source
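For a single buffer type on a two-pin wire, the greedy rule can be sketched as below. Several things here are assumptions rather than the authors' method: the maximum segment length is found by bisection, the driver is assumed to have the same slew resistance and intrinsic slew as the buffer, and all names and parameter values are hypothetical.

```python
import math

LN9 = math.log(9.0)


def slew_at_load(x, load, r, c, r_b, k_b):
    """Slew at the far end of a wire of length x driven by a buffer, with `load` at the far end."""
    s_buf = r_b * (c * x + load) + k_b                    # buffer output slew
    s_wire = LN9 * (r * c * x * x / 2.0 + r * x * load)   # wire slew degradation
    return math.hypot(s_buf, s_wire)


def max_wire_length(limit, load, r, c, r_b, k_b, alpha):
    """Longest x in [0, limit] that meets the slew constraint (bisection; slew grows with x)."""
    if slew_at_load(limit, load, r, c, r_b, k_b) <= alpha:
        return limit
    lo, hi = 0.0, limit
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if slew_at_load(mid, load, r, c, r_b, k_b) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo


def greedy_buffer_positions(length, sink_cap, r, c, r_b, k_b, c_b, alpha):
    """Distances from the sink at which buffers are placed, each as far upstream as allowed."""
    positions, covered, load = [], 0.0, sink_cap
    while True:
        remaining = length - covered
        x = max_wire_length(remaining, load, r, c, r_b, k_b, alpha)
        if x >= remaining:
            return positions      # assumed driver (same R_b, K_b) can drive what is left
        if x <= 0.0:
            raise ValueError("slew constraint cannot be met even with zero wire length")
        covered += x
        positions.append(covered)
        load = c_b                # upstream now sees the buffer input capacitance


positions = greedy_buffer_positions(5.0, 0.1, r=1.0, c=1.0, r_b=1.0, k_b=0.1, c_b=0.1, alpha=2.0)
```

Extending this to a tree would also need the Merge step at branch nodes, which the sketch leaves out.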

Multiple Buffer Types

• Multiple buffer types: greedily insert buffers for every possibility
  - Slow
• Approximation via adaptive buffer selection
  - Buffer library is shrunk
  - Prefer buffer types with small slew resistance
  - Tight slew constraint: choose top few buffer types
  - Loose constraint: choose more buffer types
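The adaptive buffer selection idea might look like the sketch below. The slides give no thresholds or library sizes, so `tight_threshold`, `few`, and `more` are hypothetical knobs, and the function name is made up for illustration.

```python
def select_buffer_types(library, alpha, tight_threshold, few=2, more=4):
    """Shrink the buffer library before the greedy pass.

    library: list of (name, slew_resistance) pairs.
    Buffer types with small slew resistance are preferred; a tight slew
    constraint keeps only the top `few`, a loose one keeps `more`.
    """
    ranked = sorted(library, key=lambda b: b[1])          # smallest slew resistance first
    k = few if alpha <= tight_threshold else more          # hypothetical tight/loose rule
    return ranked[:k]


library = [("b1", 3.0), ("b2", 1.0), ("b3", 2.0), ("b4", 0.5), ("b5", 4.0)]
tight = select_buffer_types(library, alpha=0.3, tight_threshold=0.5)
loose = select_buffer_types(library, alpha=2.0, tight_threshold=0.5)
```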

Experiments

• Experiment setup
  - 1000 industrial netlists
  - 48 buffer types, including non-inverting buffers and inverting buffers
  - A Pentium 4 machine with a 3.2 GHz CPU and 1 GB memory
• Compared to slew-constrained min-cost timing buffering
  - Pruning based on (Q, C, W); S is maintained
  - S is only responsible for checking whether the solution violates the slew constraint

Slew Constraint vs Buffer Area

[Figure: buffer area (y-axis, 0 to 50000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

Slew Constraint vs CPU Time (s)

[Figure: CPU time in seconds (y-axis, 0 to 1000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

Slew Constraint vs Slack

[Figure: slack (y-axis, 7600 to 8800) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

Slew Constraint vs Solutions at Driver

[Figure: number of solutions at the driver (y-axis, 0 to 350) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

Observations

• Discrete fixed-input slew buffering
  - Loose slew constraint: smaller area
  - >100x faster than timing buffering
  - Saves 6% area over timing buffering
  - Small slack sacrifice
• Non-fixed input slew buffering
  - Saves up to 40% area over fixed input slew buffering
• Continuous slew buffering
  - Tight slew constraint causes many buffer insertions
  - Not-well-set candidate buffer positions: significant buffer waste
  - Continuous slew buffering reduces the waste
  - Fast due to adaptive buffer selection strategy

Conclusion

• Propose three slew buffering algorithms
  - Discrete fixed-input slew buffering
  - Discrete non-fixed input slew buffering
  - Continuous slew buffering
• >100x faster while still saving area compared to timing buffering
• >90% of nets are not timing critical in reality and thus can be buffered by our algorithm

Page 2: Fast Algorithms for Slew Constrained Minimum Cost Buffering

2

Outline

MotivationMotivation Slew ModelSlew Model AlgorithmsAlgorithms

ndash Discrete slew buffering with fixed input slewDiscrete slew buffering with fixed input slewndash Discrete slew buffering with non-fixed input Discrete slew buffering with non-fixed input

slewslewndash Continuous slew bufferingContinuous slew buffering

Experimental ResultsExperimental Results ConclusionConclusion

3

Motivation

Buffer insertion is prevalent in VLSI designsBuffer insertion is prevalent in VLSI designsndash eg for timing optimization and improving signal eg for timing optimization and improving signal

integrityintegrity ProblemsProblems

ndash Most nets (90-95) are NOT timing critical so donrsquot Most nets (90-95) are NOT timing critical so donrsquot need a timing-driven formulationneed a timing-driven formulation

ndash Takes a long time as millions of nets need bufferingTakes a long time as millions of nets need bufferingndash Uses a ton of area so area minimization is criticalUses a ton of area so area minimization is critical

Our solutionOur solutionndash Replace timing-driven formulation with slew-driven Replace timing-driven formulation with slew-driven

formulation - formulation - Slew BufferingSlew Bufferingndash Good enough for Good enough for most netsmost nets as they are not critical as they are not criticalndash gt100xgt100x faster than timing-driven buffering faster than timing-driven bufferingndash Still saving area compared to timing bufferingStill saving area compared to timing buffering

4

A New Flow for Buffering 1M Nets First buffering 1 million nets using slew First buffering 1 million nets using slew

buffering algorithm buffering algorithm ndash Can be done efficiently as our algorithm can Can be done efficiently as our algorithm can

buffer buffer 10001000 industrial nets ( industrial nets (4848 buffer types) in buffer types) in 55 secondsseconds

Timing analysis finds that most nets except Timing analysis finds that most nets except say 50K critical nets satisfy the timing say 50K critical nets satisfy the timing constraintconstraint

Rip up these 50K critical nets for rebuffering by Rip up these 50K critical nets for rebuffering by timing buffering algorithmtiming buffering algorithm

BenefitsBenefitsndash Much fasterMuch fasterndash More area savingMore area saving

5

Slew Definition

Delay

Slew (Transition Time)

6

Slew Model

( ) ( ) ln 9w i j i js v v ElmoreDelay v v

2 2( ) ( ) ( )j b out i w i js v s v s v v

Upstream Downstream

Sbout(vi) Slew degradation on wire Sw(vivj)

S(vj)

vi vj

7

BufferDriver Input Slew Assumption

Output slew of a buffer depends on its input Output slew of a buffer depends on its input slewslewbottom-up dynamic programming bottom-up dynamic programming inapplicableinapplicable

AssumptionAssumption the input slew of each buffer is the input slew of each buffer is conservatively assumed to be a fixed valueconservatively assumed to be a fixed value

Then the output slew of a buffer isThen the output slew of a buffer is

where Rwhere Rbb and K and Kbb are called are called slew resistanceslew resistance and and intrinsic slewintrinsic slew

( ) ( )b out i b i bs v R C v K

8

Slew Resistance of An Inverter

9

Problem Formulation

GivenGivenndash A Steiner treeA Steiner treendash Maximum input slew rate α Maximum input slew rate α

at each buffersink (at each buffersink (slew constraintslew constraint))ndash A buffer libraryA buffer libraryndash RC parametersRC parametersndash Candidate buffer locationsCandidate buffer locations

Find a Find a minimal areaminimal area buffer insertion solution buffer insertion solution such that the slew constraint is satisfiedsuch that the slew constraint is satisfied

10

NP-Complete Proof

hellip

nN

1nN

2N

nN

1

1

There is a solution with slew constraint and cost n

n i

i

N M N N

BufferBuffer Slew Slew ResRes

Input Input CapCap

CostCost

BB11 11 XX11 XX22+N+Nnn

BB22 11 XX22 XX11+N+Nnn

BB33 NN XX33 XX44+N+Nn-1n-1

BB44 NN XX44 XX33+N+Nn-1n-1

helliphellip helliphellip helliphellip helliphellipBB2n-12n-1 NNn-1n-1 XX2n-12n-1 XX2n2n+N+NBB2n2n NNn-1n-1 xx2n2n XX2n-12n-1+N+N

2

1 2 21

2 -1 2

2-1 partition problem given 2 positive integers and 2

is there an index set which contains exactly one of and for each 1 such that

n

n ii

i i ii I

n x x x x N

I x x i n x N

1

1 1

n n

n n i ii i

i I i I i i

N x N x N N N

11

Fixed-Input Slew Buffering Candidate Solution Characteristics

Each candidate Each candidate solution is associated solution is associated withwithndash vvii a node a nodendash ccii downstream downstream

capacitancecapacitancendash ssii cumulative slew cumulative slew

degradation along wiredegradation along wirendash wwii cumulative buffer cumulative buffer

areaarea

vi is a sinkci is sink capacitance

v is an internal node

12

Dynamic Programming

Candidate solutions are propagated toward the source

Start from sinks Candidate

solutions are generated

Three operationsndash Add Wirendash Insert Bufferndash Merge

Solution Pruning

13

Solution Propagation Add Wire

cc22 = c = c11 + cx + cx ss22 = s = s11 + (rcx + (rcx222 + rxc2 + rxc11)ln9)ln9 s slew degradation along wiress slew degradation along wires r wire slew resistance per unit lengthr wire slew resistance per unit length c wire capacitance per unit lengthc wire capacitance per unit length

(v1 c1 w1 s1)

(v2 c2 w2 s2)

x

14

Solution Propagation Insert Buffer

c1b = Cb s1b = 0

w1b = w1+w(b)Cb buffer input capacitancePruned if the following slew constraint is violated

Rb buffer output slew resistanceKb buffer intrinsic slew

(v1 c1 w1 s1)(v1 c1b w1b s1b)

221 1( )b bR c K s

15

Solution Propagation Merge

ccmerge merge = c= cl l + c+ crr

wwmerge merge = w= wl l + w+ wrr

ssmergemerge = max(s = max(sl l s srr))

(v cl wl sl) (v cr wlrsr)

16

Solution Pruning

Two candidate solutionsTwo candidate solutionsndash Solution 1 (v c1 w1 s1)Solution 1 (v c1 w1 s1)ndash Solution 2 (v c2 w2 s2)Solution 2 (v c2 w2 s2)

Solution 1 is Solution 1 is inferiorinferior if if ndash c1 gt c2 larger loadc1 gt c2 larger loadndash and w1 gt w2 larger buffer areaand w1 gt w2 larger buffer areandash and s1 gt s2 worse cumulative slew and s1 gt s2 worse cumulative slew

degradationdegradationon wireon wire

17

Timing vs Slew Buffering (I)

A buffer insertion S=0 C=C(b) A buffer insertion S=0 C=C(b) Inserting one buffer Inserting one buffer one new one new

solutionsolution (the one with the smallest (the one with the smallest cost) cost)

In min-cost timing buffering a buffer In min-cost timing buffering a buffer insertion brings many non-inferior insertion brings many non-inferior (CWQ) with the same C where Q is (CWQ) with the same C where Q is the required arrival time (RAT)the required arrival time (RAT)

18

Timing vs Slew Buffering (II)

Slew constraint is close to length Slew constraint is close to length constraint constraint

An extreme case An extreme case ndash in min-cost timing buffering solutions with no buffer in min-cost timing buffering solutions with no buffer

inserted live till driver inserted live till driver ndash Soon become infeasible in slew bufferingSoon become infeasible in slew buffering

A A linear timelinear time optimal algorithm for slew optimal algorithm for slew buffering with a single buffer type buffering with a single buffer type

No polynomial timeNo polynomial time min-cost timing min-cost timing buffering algorithm in the same case buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary Input slew to each buffer can vary Our idea discretize the possible input slew Our idea discretize the possible input slew

values into values into input slew binsinput slew bins For each input slew bin carry out the above For each input slew bin carry out the above

procedure (for the fixed input slew case) to procedure (for the fixed input slew case) to propagate solutions propagate solutions

Some detailsSome detailsndash Input slew bins can be merged for speedupInput slew bins can be merged for speedupndash Inferiority also depends on the slew binInferiority also depends on the slew binndash A maximum bipartite matching algorithm is A maximum bipartite matching algorithm is

used for pruningused for pruning

20

Continuous Slew Buffering

Buffers are allowed to be inserted Buffers are allowed to be inserted anywhereanywhere

Single buffer type a linear greedy optimal Single buffer type a linear greedy optimal algorithm ndash add buffer as upstream as algorithm ndash add buffer as upstream as possiblepossible Start from sinks

Greedy algorithm toward the source

21

Multiple Buffer Types

Multiple buffer types greedily inserts Multiple buffer types greedily inserts buffers for every possibilitybuffers for every possibilityndash SlowSlow

Approximation via Approximation via adaptive buffer adaptive buffer selectionselection ndash Buffer library is shrunkenBuffer library is shrunkenndash Prefer buffer types with small slew Prefer buffer types with small slew

resistanceresistance Tight slew constraint choose top few buffer Tight slew constraint choose top few buffer

typestypes Loose constraint choose more buffer typesLoose constraint choose more buffer types

22

Experiments

Experiment SetupExperiment Setupndash 1000 industrial netlists1000 industrial netlistsndash 48 buffer types including non-inverting 48 buffer types including non-inverting

buffers and inverting buffersbuffers and inverting buffersndash A Pentium 4 machine with a 32GHz CPU A Pentium 4 machine with a 32GHz CPU

1G memory1G memory Compared to slew constrained min-Compared to slew constrained min-

cost timing buffering cost timing buffering ndash Pruning based on (QCW) S is maintainedPruning based on (QCW) S is maintainedndash S is only responsible for checking whether S is only responsible for checking whether

the solution violates the slew constraintthe solution violates the slew constraint

23

Slew Constraint vs Buffer Area

05000

100001500020000250003000035000400004500050000

03 05 1 15 2

Fixed SBTiming BufNon-fixed SBContinuous

Slew Constraint

Area

24

Slew Constraint vs CPU Time (s)

0100200300400500600700800900

1000

03 05 1 15 2

Fixed SBTiming BufNon-fixed SBContinuous

Slew Constraint

CPU Time (s)

25

Slew Constraint vs Slack

7600

7800

8000

8200

8400

8600

8800

03 05 1 15 2

Fixed SBTiming Buf

Slew Constraint

Slack

26

Slew Con vs Solutions at Driver

050

100150200250300350

03 05 1 15 2

Fixed SBTiming Buf

Slew Constraint

Sol at Driver

27

Observations

Discrete Fixed-Input Slew BufferingDiscrete Fixed-Input Slew Bufferingndash Loose slew constraint smaller areaLoose slew constraint smaller areandash gt100x faster than timing bufferinggt100x faster than timing bufferingndash Saves 6 area over timing bufferingSaves 6 area over timing bufferingndash Small slack sacrificeSmall slack sacrifice

Non-fixed input slew buffering Non-fixed input slew buffering ndash Save up to 40 area over fixed input slew bufferingSave up to 40 area over fixed input slew buffering

Continuous slew bufferingContinuous slew bufferingndash Tight slew constraint causes many buffer insertions Tight slew constraint causes many buffer insertions

Not-well set candidate buffer positions significant Not-well set candidate buffer positions significant buffer waste buffer waste

ndash Continuous slew buffering reduces the wasteContinuous slew buffering reduces the wastendash Fast due to adaptive buffer selection strategyFast due to adaptive buffer selection strategy

28

Conclusion

Propose three slew buffering algorithms Propose three slew buffering algorithms ndash Discrete fixed-input slew bufferingDiscrete fixed-input slew bufferingndash Discrete non-fixed input slew bufferingDiscrete non-fixed input slew bufferingndash Continuous slew bufferingContinuous slew buffering

gt100x faster while still saving area gt100x faster while still saving area compared to timing bufferingcompared to timing buffering

gt90 nets are not timing critical in reality gt90 nets are not timing critical in reality and thus can be buffered by our and thus can be buffered by our algorithmalgorithm

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • BufferDriver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 3: Fast Algorithms for Slew Constrained Minimum Cost Buffering

3

Motivation

Buffer insertion is prevalent in VLSI designsBuffer insertion is prevalent in VLSI designsndash eg for timing optimization and improving signal eg for timing optimization and improving signal

integrityintegrity ProblemsProblems

ndash Most nets (90-95) are NOT timing critical so donrsquot Most nets (90-95) are NOT timing critical so donrsquot need a timing-driven formulationneed a timing-driven formulation

ndash Takes a long time as millions of nets need bufferingTakes a long time as millions of nets need bufferingndash Uses a ton of area so area minimization is criticalUses a ton of area so area minimization is critical

Our solutionOur solutionndash Replace timing-driven formulation with slew-driven Replace timing-driven formulation with slew-driven

formulation - formulation - Slew BufferingSlew Bufferingndash Good enough for Good enough for most netsmost nets as they are not critical as they are not criticalndash gt100xgt100x faster than timing-driven buffering faster than timing-driven bufferingndash Still saving area compared to timing bufferingStill saving area compared to timing buffering

4

A New Flow for Buffering 1M Nets First buffering 1 million nets using slew First buffering 1 million nets using slew

buffering algorithm buffering algorithm ndash Can be done efficiently as our algorithm can Can be done efficiently as our algorithm can

buffer buffer 10001000 industrial nets ( industrial nets (4848 buffer types) in buffer types) in 55 secondsseconds

Timing analysis finds that most nets except Timing analysis finds that most nets except say 50K critical nets satisfy the timing say 50K critical nets satisfy the timing constraintconstraint

Rip up these 50K critical nets for rebuffering by Rip up these 50K critical nets for rebuffering by timing buffering algorithmtiming buffering algorithm

BenefitsBenefitsndash Much fasterMuch fasterndash More area savingMore area saving

5

Slew Definition

Delay

Slew (Transition Time)

6

Slew Model

( ) ( ) ln 9w i j i js v v ElmoreDelay v v

2 2( ) ( ) ( )j b out i w i js v s v s v v

Upstream Downstream

Sbout(vi) Slew degradation on wire Sw(vivj)

S(vj)

vi vj

7

BufferDriver Input Slew Assumption

Output slew of a buffer depends on its input Output slew of a buffer depends on its input slewslewbottom-up dynamic programming bottom-up dynamic programming inapplicableinapplicable

AssumptionAssumption the input slew of each buffer is the input slew of each buffer is conservatively assumed to be a fixed valueconservatively assumed to be a fixed value

Then the output slew of a buffer isThen the output slew of a buffer is

where Rwhere Rbb and K and Kbb are called are called slew resistanceslew resistance and and intrinsic slewintrinsic slew

( ) ( )b out i b i bs v R C v K

8

Slew Resistance of An Inverter

9

Problem Formulation

GivenGivenndash A Steiner treeA Steiner treendash Maximum input slew rate α Maximum input slew rate α

at each buffersink (at each buffersink (slew constraintslew constraint))ndash A buffer libraryA buffer libraryndash RC parametersRC parametersndash Candidate buffer locationsCandidate buffer locations

Find a Find a minimal areaminimal area buffer insertion solution buffer insertion solution such that the slew constraint is satisfiedsuch that the slew constraint is satisfied

10

NP-Complete Proof

hellip

nN

1nN

2N

nN

1

1

There is a solution with slew constraint and cost n

n i

i

N M N N

BufferBuffer Slew Slew ResRes

Input Input CapCap

CostCost

BB11 11 XX11 XX22+N+Nnn

BB22 11 XX22 XX11+N+Nnn

BB33 NN XX33 XX44+N+Nn-1n-1

BB44 NN XX44 XX33+N+Nn-1n-1

helliphellip helliphellip helliphellip helliphellipBB2n-12n-1 NNn-1n-1 XX2n-12n-1 XX2n2n+N+NBB2n2n NNn-1n-1 xx2n2n XX2n-12n-1+N+N

2

1 2 21

2 -1 2

2-1 partition problem given 2 positive integers and 2

is there an index set which contains exactly one of and for each 1 such that

n

n ii

i i ii I

n x x x x N

I x x i n x N

1

1 1

n n

n n i ii i

i I i I i i

N x N x N N N

11

Fixed-Input Slew Buffering Candidate Solution Characteristics

Each candidate Each candidate solution is associated solution is associated withwithndash vvii a node a nodendash ccii downstream downstream

capacitancecapacitancendash ssii cumulative slew cumulative slew

degradation along wiredegradation along wirendash wwii cumulative buffer cumulative buffer

areaarea

vi is a sinkci is sink capacitance

v is an internal node

12

Dynamic Programming

Start from the sinks, where candidate solutions are generated

Candidate solutions are propagated toward the source

Three operations
- Add Wire
- Insert Buffer
- Merge

Solution Pruning

13

Solution Propagation Add Wire

$$c_2 = c_1 + c x$$

$$s_2 = s_1 + \left(\frac{r c x^2}{2} + r x c_1\right) \ln 9$$

- $s$: slew degradation along wires
- $r$: wire slew resistance per unit length
- $c$: wire capacitance per unit length

[Figure: a wire of length $x$ from $(v_1, c_1, w_1, s_1)$ up to $(v_2, c_2, w_2, s_2)$]
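The Add Wire update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the `Candidate` tuple and the `add_wire` function are hypothetical names, and the sketch assumes the buffer area is unchanged when only wire is added.

```python
import math
from typing import NamedTuple


class Candidate(NamedTuple):
    """One candidate solution (v, c, w, s) at a node."""
    node: str
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


def add_wire(cand: Candidate, upstream_node: str, length: float,
             r: float, c: float) -> Candidate:
    """Propagate a candidate across a wire of the given length.

    r and c are the per-unit-length wire resistance and capacitance.
    """
    # Elmore delay of the wire segment driving the downstream load
    wire_elmore = r * c * length ** 2 / 2 + r * length * cand.cap
    return Candidate(
        node=upstream_node,
        cap=cand.cap + c * length,
        area=cand.area,  # assumption: wire adds no buffer area
        slew=cand.slew + math.log(9) * wire_elmore,
    )


if __name__ == "__main__":
    start = Candidate("v1", cap=2.0, area=5.0, slew=0.0)
    print(add_wire(start, "v2", length=2.0, r=1.0, c=1.0))
```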

14

Solution Propagation Insert Buffer

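$$c_{1b} = C_b, \quad s_{1b} = 0, \quad w_{1b} = w_1 + w(b)$$

- $C_b$: buffer input capacitance
- $R_b$: buffer output slew resistance
- $K_b$: buffer intrinsic slew

Inserting a buffer turns $(v_1, c_1, w_1, s_1)$ into $(v_1, c_{1b}, w_{1b}, s_{1b})$

Pruned if the following slew constraint is violated:

$$\sqrt{(R_b c_1 + K_b)^2 + s_1^2} \le \alpha$$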
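A minimal Python sketch of the Insert Buffer step follows. The names `Candidate`, `Buffer`, and `insert_buffer` are hypothetical, and the feasibility test assumes the slew at the downstream end is the root-sum-square of the buffer output slew and the accumulated wire slew, compared against the slew constraint `alpha`.

```python
import math
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    node: str
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


class Buffer(NamedTuple):
    slew_resistance: float  # R_b
    intrinsic_slew: float   # K_b
    input_cap: float        # C_b
    area: float             # w(b)


def insert_buffer(cand: Candidate, buf: Buffer,
                  alpha: float) -> Optional[Candidate]:
    """Insert buf at cand.node; return None if the slew constraint fails."""
    out_slew = buf.slew_resistance * cand.cap + buf.intrinsic_slew
    if math.hypot(out_slew, cand.slew) > alpha:
        return None  # infeasible: this buffer cannot drive the load
    # The buffer shields the downstream load and resets the wire slew
    return Candidate(cand.node, buf.input_cap, cand.area + buf.area, 0.0)


if __name__ == "__main__":
    cand = Candidate("v1", cap=2.0, area=1.0, slew=4.0)
    buf = Buffer(slew_resistance=1.0, intrinsic_slew=1.0,
                 input_cap=0.5, area=3.0)
    print(insert_buffer(cand, buf, alpha=5.0))
    print(insert_buffer(cand, buf, alpha=4.9))
```

Because the new candidate always has $s = 0$ and $c = C_b$, only the cheapest feasible candidate per buffer type needs to be kept at a node.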

15

Solution Propagation Merge

$$c_{merge} = c_l + c_r$$

$$w_{merge} = w_l + w_r$$

$$s_{merge} = \max(s_l, s_r)$$

[Figure: left candidate $(v, c_l, w_l, s_l)$ and right candidate $(v, c_r, w_r, s_r)$ merged at node $v$]
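The Merge rule translates directly into code. The sketch below is a minimal illustration; `Candidate` and `merge` are hypothetical names.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    node: str
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


def merge(left: Candidate, right: Candidate) -> Candidate:
    """Combine the candidates of two branches that meet at the same node."""
    if left.node != right.node:
        raise ValueError("candidates must belong to the same branch node")
    return Candidate(
        node=left.node,
        cap=left.cap + right.cap,         # loads add up
        area=left.area + right.area,      # buffer areas add up
        slew=max(left.slew, right.slew),  # the worse branch dominates
    )


if __name__ == "__main__":
    print(merge(Candidate("v", 1.0, 2.0, 0.3), Candidate("v", 2.5, 1.0, 0.7)))
```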

16

Solution Pruning

Two candidate solutions
- Solution 1: $(v, c_1, w_1, s_1)$
- Solution 2: $(v, c_2, w_2, s_2)$

Solution 1 is inferior if
- $c_1 > c_2$: larger load
- and $w_1 > w_2$: larger buffer area
- and $s_1 > s_2$: worse cumulative slew degradation on wire
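The pruning rule can be sketched as a dominance filter. This is a simple quadratic-time illustration, not the authors' implementation; `Candidate`, `dominates`, and `prune` are hypothetical names. As an assumption, the comparison here is non-strict (a candidate that is no better in all three fields is dropped), whereas the slide states the condition with strict inequalities.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    node: str
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b in load, area, and slew."""
    return a.cap <= b.cap and a.area <= b.area and a.slew <= b.slew


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other kept candidate dominates."""
    kept: List[Candidate] = []
    for cand in cands:
        if any(dominates(k, cand) for k in kept):
            continue  # cand is inferior to something already kept
        # cand may in turn make earlier candidates inferior
        kept = [k for k in kept if not dominates(cand, k)]
        kept.append(cand)
    return kept


if __name__ == "__main__":
    sols = [
        Candidate("v", 2.0, 2.0, 2.0),
        Candidate("v", 1.0, 1.0, 1.0),
        Candidate("v", 0.5, 3.0, 1.0),
    ]
    print(prune(sols))
```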

17

Timing vs Slew Buffering (I)

A buffer insertion: S = 0, C = C(b)

Inserting one buffer yields only one new solution (the one with the smallest cost)

In min-cost timing buffering, a buffer insertion brings many non-inferior (C, W, Q) solutions with the same C, where Q is the required arrival time (RAT)

18

Timing vs Slew Buffering (II)

Slew constraint is close to length constraint

An extreme case
- In min-cost timing buffering, solutions with no buffer inserted live till the driver
- They soon become infeasible in slew buffering

A linear-time optimal algorithm for slew buffering with a single buffer type

No polynomial-time min-cost timing buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary

Our idea: discretize the possible input slew values into input slew bins

For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions

Some details
- Input slew bins can be merged for speedup
- Inferiority also depends on the slew bin
- A maximum bipartite matching algorithm is used for pruning

20

Continuous Slew Buffering

Buffers are allowed to be inserted anywhere

Single buffer type: a linear-time greedy optimal algorithm that adds each buffer as far upstream as possible

[Figure: start from the sinks; the greedy algorithm proceeds toward the source]

21

Multiple Buffer Types

Multiple buffer types: greedily insert buffers for every possibility
- Slow

Approximation via adaptive buffer selection
- Buffer library is shrunk
- Prefer buffer types with small slew resistance
  - Tight slew constraint: choose the top few buffer types
  - Loose constraint: choose more buffer types

22

Experiments

Experiment Setup
- 1000 industrial netlists
- 48 buffer types, including non-inverting buffers and inverting buffers
- A Pentium 4 machine with a 3.2 GHz CPU and 1 GB memory

Compared to slew constrained min-cost timing buffering
- Pruning based on (Q, C, W); S is maintained
- S is only responsible for checking whether the solution violates the slew constraint

23

Slew Constraint vs Buffer Area

[Chart: buffer area (y-axis, 0 to 50000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

24

Slew Constraint vs CPU Time (s)

[Chart: CPU time in seconds (y-axis, 0 to 1000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

25

Slew Constraint vs Slack

[Chart: slack (y-axis, 7600 to 8800) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

26

Slew Con vs Solutions at Driver

[Chart: number of solutions at the driver (y-axis, 0 to 350) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

27

Observations

Discrete Fixed-Input Slew Buffering
- A looser slew constraint gives a smaller area
- >100x faster than timing buffering
- Saves 6% area over timing buffering
- Small slack sacrifice

Non-fixed input slew buffering
- Saves up to 40% area over fixed input slew buffering

Continuous slew buffering
- Tight slew constraint causes many buffer insertions
  - Poorly set candidate buffer positions lead to significant buffer waste
- Continuous slew buffering reduces the waste
- Fast due to adaptive buffer selection strategy

28

Conclusion

Propose three slew buffering algorithms
- Discrete fixed-input slew buffering
- Discrete non-fixed input slew buffering
- Continuous slew buffering

>100x faster while still saving area compared to timing buffering

>90% of nets are not timing critical in reality and thus can be buffered by our algorithm

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • Buffer/Driver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 4: Fast Algorithms for Slew Constrained Minimum Cost Buffering

4

A New Flow for Buffering 1M Nets

First, buffer 1 million nets using the slew buffering algorithm
- Can be done efficiently, as our algorithm can buffer 1000 industrial nets (48 buffer types) in 5 seconds

Timing analysis finds that most nets, except say 50K critical nets, satisfy the timing constraint

Rip up these 50K critical nets for rebuffering by the timing buffering algorithm

Benefits
- Much faster
- More area saving

5

Slew Definition

Delay

Slew (Transition Time)

6

Slew Model

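$$s_w(v_i, v_j) = \ln 9 \cdot \mathrm{ElmoreDelay}(v_i, v_j)$$

$$s(v_j) = \sqrt{s_{b,out}(v_i)^2 + s_w(v_i, v_j)^2}$$

[Figure: a wire from the upstream node $v_i$ to the downstream node $v_j$, showing the buffer output slew $s_{b,out}(v_i)$, the slew degradation on the wire $s_w(v_i, v_j)$, and the resulting slew $s(v_j)$]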
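As a quick numerical illustration of the slew model, the two formulas can be written as small Python helpers. The function names `wire_slew` and `slew_at_downstream` are hypothetical; the Elmore delay is taken as a given input rather than computed from a netlist.

```python
import math

LN9 = math.log(9.0)


def wire_slew(elmore_delay: float) -> float:
    """Slew degradation on a wire: ln 9 times its Elmore delay."""
    return LN9 * elmore_delay


def slew_at_downstream(buffer_out_slew: float, elmore_delay: float) -> float:
    """Root-sum-square of the upstream buffer output slew and the wire slew."""
    return math.hypot(buffer_out_slew, wire_slew(elmore_delay))


if __name__ == "__main__":
    # A wire with zero Elmore delay leaves the buffer output slew unchanged
    print(slew_at_downstream(0.2, 0.0))
    print(slew_at_downstream(0.2, 0.1))
```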

7

Buffer/Driver Input Slew Assumption

Output slew of a buffer depends on its input slew, which makes bottom-up dynamic programming inapplicable

Assumption: the input slew of each buffer is conservatively assumed to be a fixed value

Then the output slew of a buffer is

$$s_{b,out}(v_i) = R_b \cdot C(v_i) + K_b$$

where $R_b$ and $K_b$ are called slew resistance and intrinsic slew

Page 6: Fast Algorithms for Slew Constrained Minimum Cost Buffering

6

Slew Model

( ) ( ) ln 9w i j i js v v ElmoreDelay v v

2 2( ) ( ) ( )j b out i w i js v s v s v v

Upstream Downstream

Sbout(vi) Slew degradation on wire Sw(vivj)

S(vj)

vi vj

7

BufferDriver Input Slew Assumption

Output slew of a buffer depends on its input Output slew of a buffer depends on its input slewslewbottom-up dynamic programming bottom-up dynamic programming inapplicableinapplicable

AssumptionAssumption the input slew of each buffer is the input slew of each buffer is conservatively assumed to be a fixed valueconservatively assumed to be a fixed value

Then the output slew of a buffer isThen the output slew of a buffer is

where Rwhere Rbb and K and Kbb are called are called slew resistanceslew resistance and and intrinsic slewintrinsic slew

( ) ( )b out i b i bs v R C v K

8

Slew Resistance of An Inverter

9

Problem Formulation

GivenGivenndash A Steiner treeA Steiner treendash Maximum input slew rate α Maximum input slew rate α

at each buffersink (at each buffersink (slew constraintslew constraint))ndash A buffer libraryA buffer libraryndash RC parametersRC parametersndash Candidate buffer locationsCandidate buffer locations

Find a Find a minimal areaminimal area buffer insertion solution buffer insertion solution such that the slew constraint is satisfiedsuch that the slew constraint is satisfied

10

NP-Complete Proof

hellip

nN

1nN

2N

nN

1

1

There is a solution with slew constraint and cost n

n i

i

N M N N

Buffer      Slew Res    Input Cap    Cost
B_1         1           x_1          x_2 + N^n
B_2         1           x_2          x_1 + N^n
B_3         N           x_3          x_4 + N^{n-1}
B_4         N           x_4          x_3 + N^{n-1}
...         ...         ...          ...
B_{2n-1}    N^{n-1}     x_{2n-1}     x_{2n} + N
B_{2n}      N^{n-1}     x_{2n}       x_{2n-1} + N

2-1 partition problem: given $2n$ positive integers $x_1, x_2, \ldots, x_{2n}$ and $\sum_{i=1}^{2n} x_i = 2N$, is there an index set $I$ which selects exactly one of $x_{2i-1}$ and $x_{2i}$ for each $1 \le i \le n$ such that $\sum_{i \in I} x_i = N$?

1

1 1

n n

n n i ii i

i I i I i i

N x N x N N N
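To make the 2-1 partition problem concrete, here is a brute-force checker. It is only an illustration of the problem statement (exponential in n) and is not part of the reduction; the function name and the 0-based indexing are choices made for this sketch.

```python
from itertools import product
from typing import List, Optional


def two_one_partition(xs: List[int], target: int) -> Optional[List[int]]:
    """Pick exactly one element from each pair (xs[0], xs[1]), (xs[2], xs[3]), ...

    Returns the 0-based indices of a selection summing to `target`, or None.
    """
    if len(xs) % 2 != 0:
        raise ValueError("expected an even number of integers")
    n = len(xs) // 2
    for picks in product((0, 1), repeat=n):  # 0 = first of the pair, 1 = second
        indices = [2 * i + p for i, p in enumerate(picks)]
        if sum(xs[j] for j in indices) == target:
            return indices
    return None
```

In the problem as stated, `target` is N, half of the total sum 2N.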

11

Fixed-Input Slew Buffering Candidate Solution Characteristics

Each candidate solution is associated with
  - $v_i$: a node
  - $c_i$: downstream capacitance
  - $s_i$: cumulative slew degradation along wire
  - $w_i$: cumulative buffer area

[Figure: a tree in which $v_i$ is a sink, with $c_i$ the sink capacitance, and $v$ is an internal node]

12

Dynamic Programming

  • Start from the sinks, where candidate solutions are generated
  • Candidate solutions are propagated toward the source
  • Three operations
      - Add Wire
      - Insert Buffer
      - Merge
  • Solution pruning

13

Solution Propagation: Add Wire

$c_2 = c_1 + cx$

$s_2 = s_1 + (rcx^2/2 + rxc_1) \ln 9$

  - $s$: slew degradation along wires
  - $r$: wire slew resistance per unit length
  - $c$: wire capacitance per unit length

[Figure: candidate $(v_1, c_1, w_1, s_1)$ is propagated across a wire of length $x$ to give $(v_2, c_2, w_2, s_2)$]
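The add-wire update can be sketched as follows. The `Candidate` tuple and the function name are illustrative; the buffer area is carried over unchanged, which the slide implies but does not state.

```python
import math
from typing import NamedTuple


class Candidate(NamedTuple):
    node: str
    c: float  # downstream capacitance
    w: float  # cumulative buffer area
    s: float  # cumulative slew degradation along wire


def add_wire(cand: Candidate, upstream_node: str, x: float, r: float, c_unit: float) -> Candidate:
    """Propagate a candidate across a wire of length x toward the source."""
    elmore = r * c_unit * x * x / 2.0 + r * x * cand.c  # Elmore delay of the segment
    return Candidate(
        node=upstream_node,
        c=cand.c + c_unit * x,            # c2 = c1 + c*x
        w=cand.w,                         # adding wire adds no buffer area (assumed)
        s=cand.s + math.log(9.0) * elmore,  # s2 = s1 + ln9 * (r*c*x^2/2 + r*x*c1)
    )
```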

14

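Solution Propagation: Insert Buffer

$c_{1b} = C_b$, $s_{1b} = 0$, $w_{1b} = w_1 + w(b)$

  - $C_b$: buffer input capacitance
  - $R_b$: buffer output slew resistance
  - $K_b$: buffer intrinsic slew

Pruned if the following slew constraint is violated:

$\sqrt{(R_b c_1 + K_b)^2 + s_1^2} \le \alpha$

[Figure: candidate $(v_1, c_1, w_1, s_1)$ becomes $(v_1, c_{1b}, w_{1b}, s_{1b})$ after the buffer is inserted]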
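A sketch of the insert-buffer step, assuming the constraint is the root-sum-square of the buffer output slew and the accumulated wire degradation, compared against the limit α from the problem formulation. The `Buffer` and `Candidate` types are illustrative.

```python
import math
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    node: str
    c: float  # downstream capacitance
    w: float  # cumulative buffer area
    s: float  # cumulative slew degradation along wire


class Buffer(NamedTuple):
    r_b: float   # output slew resistance R_b
    k_b: float   # intrinsic slew K_b
    c_in: float  # input capacitance C_b
    area: float  # cost w(b)


def insert_buffer(cand: Candidate, buf: Buffer, alpha: float) -> Optional[Candidate]:
    """Return the buffered candidate, or None if the slew constraint is violated."""
    output_slew = buf.r_b * cand.c + buf.k_b
    if math.hypot(output_slew, cand.s) > alpha:
        return None  # pruned: downstream slew would exceed alpha
    # The buffer hides the downstream load and resets the wire degradation.
    return Candidate(cand.node, buf.c_in, cand.w + buf.area, 0.0)
```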

15

Solution Propagation: Merge

$c_{merge} = c_l + c_r$

$w_{merge} = w_l + w_r$

$s_{merge} = \max(s_l, s_r)$

[Figure: left candidate $(v, c_l, w_l, s_l)$ and right candidate $(v, c_r, w_r, s_r)$ meeting at branch node $v$]
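The merge rule at a branch node can be sketched as below. Combining every left candidate with every right candidate is an assumption of this sketch; the slide only gives the rule for one pair.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    node: str
    c: float  # downstream capacitance
    w: float  # cumulative buffer area
    s: float  # cumulative slew degradation along wire


def merge(left: Candidate, right: Candidate) -> Candidate:
    """Merge one candidate from each branch at their common node."""
    if left.node != right.node:
        raise ValueError("candidates must belong to the same branch node")
    return Candidate(left.node, left.c + right.c, left.w + right.w, max(left.s, right.s))


def merge_all(lefts: List[Candidate], rights: List[Candidate]) -> List[Candidate]:
    """All pairwise merges; pruning is expected to follow."""
    return [merge(l, r) for l in lefts for r in rights]
```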

16

Solution Pruning

  • Two candidate solutions
      - Solution 1: $(v, c_1, w_1, s_1)$
      - Solution 2: $(v, c_2, w_2, s_2)$
  • Solution 1 is inferior if
      - $c_1 > c_2$: larger load
      - and $w_1 > w_2$: larger buffer area
      - and $s_1 > s_2$: worse cumulative slew degradation on wire
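The inferiority test translates into a simple dominance filter. This sketch uses the strict comparisons stated on the slide and a quadratic scan for clarity; a production implementation would likely use sorted lists.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    node: str
    c: float  # downstream capacitance
    w: float  # cumulative buffer area
    s: float  # cumulative slew degradation along wire


def is_inferior(a: Candidate, b: Candidate) -> bool:
    """True if a is worse than b in load, buffer area, and slew degradation."""
    return a.c > b.c and a.w > b.w and a.s > b.s


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that are not inferior to any other candidate."""
    return [a for a in cands if not any(is_inferior(a, b) for b in cands)]
```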

17

Timing vs Slew Buffering (I)

  • A buffer insertion resets the candidate to S = 0, C = C(b)
  • Inserting one buffer yields one new solution (the one with the smallest cost)
  • In min-cost timing buffering, a buffer insertion brings many non-inferior solutions (C, W, Q) with the same C, where Q is the required arrival time (RAT)

18

Timing vs Slew Buffering (II)

  • The slew constraint is close to a length constraint
  • An extreme case
      - In min-cost timing buffering, solutions with no buffer inserted survive until the driver
      - In slew buffering, they soon become infeasible
  • A linear-time optimal algorithm for slew buffering with a single buffer type
  • No polynomial-time min-cost timing buffering algorithm in the same case

19

Non-Fixed Input Slew

  • Input slew to each buffer can vary
  • Our idea: discretize the possible input slew values into input slew bins
  • For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions
  • Some details
      - Input slew bins can be merged for speedup
      - Inferiority also depends on the slew bin
      - A maximum bipartite matching algorithm is used for pruning
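One way to discretize input slews into bins is sketched below. The uniform bin width, the round-up rule, and the function names are assumptions made for illustration; the slide does not specify how bins are chosen, and bin merging and matching-based pruning are not shown.

```python
import math


def slew_bin(slew: float, alpha: float, num_bins: int) -> int:
    """Index of the uniform bin over [0, alpha] that contains `slew`."""
    if not 0.0 <= slew <= alpha:
        raise ValueError("slew must lie within [0, alpha]")
    width = alpha / num_bins
    return min(num_bins - 1, max(0, math.ceil(slew / width) - 1))


def bin_upper_bound(index: int, alpha: float, num_bins: int) -> float:
    """Largest slew in a bin, usable as a conservative fixed input slew for that bin."""
    return (index + 1) * alpha / num_bins
```

Running the fixed-input-slew procedure once per bin with `bin_upper_bound` as the assumed input slew keeps each run conservative for every slew that falls in that bin.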

20

Continuous Slew Buffering

  • Buffers are allowed to be inserted anywhere
  • Single buffer type: a linear-time greedy optimal algorithm that adds each buffer as far upstream as possible

[Figure: start from the sinks and apply the greedy algorithm toward the source]
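The greedy idea can be illustrated on a two-pin net (a single wire from sink to driver). This is a sketch under several assumptions not taken from the talk: the driver has the same slew parameters as the buffer, the farthest feasible position is found by bisection, and branching trees are not handled.

```python
import math

LN9 = math.log(9.0)


def slew_at_load(x, c_load, r, c_unit, r_b, k_b):
    """Slew seen at the load when a buffer drives it through a wire of length x."""
    out_slew = r_b * (c_load + c_unit * x) + k_b
    wire_deg = LN9 * (r * c_unit * x * x / 2.0 + r * x * c_load)
    return math.hypot(out_slew, wire_deg)


def max_reach(c_load, limit, r, c_unit, r_b, k_b, alpha, tol=1e-9):
    """Largest x in [0, limit] that keeps the slew within alpha, or None."""
    if slew_at_load(0.0, c_load, r, c_unit, r_b, k_b) > alpha:
        return None
    if slew_at_load(limit, c_load, r, c_unit, r_b, k_b) <= alpha:
        return limit
    lo, hi = 0.0, limit
    while hi - lo > tol:  # slew grows monotonically with x, so bisection applies
        mid = (lo + hi) / 2.0
        if slew_at_load(mid, c_load, r, c_unit, r_b, k_b) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo


def greedy_buffer_positions(length, sink_cap, r, c_unit, r_b, k_b, c_b, alpha):
    """Buffer positions, measured from the sink, placed as far upstream as possible."""
    positions = []
    pos, c_load = 0.0, sink_cap
    while True:
        remaining = length - pos
        reach = max_reach(c_load, remaining, r, c_unit, r_b, k_b, alpha)
        if reach is None or (reach < remaining and reach < 1e-6):
            raise ValueError("slew constraint cannot be met")
        if reach >= remaining:
            return positions  # the driver can reach the current load directly
        pos += reach
        positions.append(pos)
        c_load = c_b  # the new buffer's input capacitance becomes the load
```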

21

Multiple Buffer Types

  • Multiple buffer types: greedily insert buffers for every possibility
      - Slow
  • Approximation via adaptive buffer selection
      - The buffer library is shrunk
      - Prefer buffer types with small slew resistance
      - Tight slew constraint: choose the top few buffer types
      - Loose constraint: choose more buffer types
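Adaptive buffer selection could look like the sketch below. The threshold `tight_alpha` and the counts `k_tight` and `k_loose` are hypothetical tuning parameters; the slide only says that fewer buffer types are kept under a tight constraint and that small slew resistance is preferred.

```python
from typing import List, NamedTuple


class Buffer(NamedTuple):
    name: str
    r_b: float   # slew resistance
    area: float  # cost


def select_buffers(library: List[Buffer], alpha: float,
                   tight_alpha: float, k_tight: int, k_loose: int) -> List[Buffer]:
    """Shrink the library, keeping the buffer types with the smallest slew resistance."""
    ranked = sorted(library, key=lambda b: b.r_b)
    keep = k_tight if alpha <= tight_alpha else k_loose
    return ranked[:keep]
```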

22

Experiments

  • Experiment setup
      - 1000 industrial netlists
      - 48 buffer types, including non-inverting buffers and inverting buffers
      - A Pentium 4 machine with a 3.2 GHz CPU and 1 GB of memory
  • Compared to slew-constrained min-cost timing buffering
      - Pruning is based on (Q, C, W); S is maintained
      - S is only responsible for checking whether the solution violates the slew constraint

23

Slew Constraint vs Buffer Area

[Chart: buffer area (axis 0 to 50000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

24

Slew Constraint vs CPU Time (s)

[Chart: CPU time in seconds (axis 0 to 1000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

25

Slew Constraint vs Slack

[Chart: slack (axis 7600 to 8800) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

26

Slew Con vs Solutions at Driver

[Chart: number of solutions at the driver (axis 0 to 350) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

27

Observations

  • Discrete fixed-input slew buffering
      - Loose slew constraint: smaller area
      - >100x faster than timing buffering
      - Saves 6% area over timing buffering
      - Small slack sacrifice
  • Non-fixed input slew buffering
      - Saves up to 40% area over fixed input slew buffering
  • Continuous slew buffering
      - A tight slew constraint causes many buffer insertions
      - Poorly set candidate buffer positions: significant buffer waste
      - Continuous slew buffering reduces the waste
      - Fast due to the adaptive buffer selection strategy

28

Conclusion

  • Propose three slew buffering algorithms
      - Discrete fixed-input slew buffering
      - Discrete non-fixed input slew buffering
      - Continuous slew buffering
  • >100x faster while still saving area compared to timing buffering
  • >90% of nets are not timing critical in reality and thus can be buffered by our algorithm

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • BufferDriver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 9: Fast Algorithms for Slew Constrained Minimum Cost Buffering

9

Problem Formulation

GivenGivenndash A Steiner treeA Steiner treendash Maximum input slew rate α Maximum input slew rate α

at each buffersink (at each buffersink (slew constraintslew constraint))ndash A buffer libraryA buffer libraryndash RC parametersRC parametersndash Candidate buffer locationsCandidate buffer locations

Find a Find a minimal areaminimal area buffer insertion solution buffer insertion solution such that the slew constraint is satisfiedsuch that the slew constraint is satisfied

10

NP-Complete Proof

hellip

nN

1nN

2N

nN

1

1

There is a solution with slew constraint and cost n

n i

i

N M N N

BufferBuffer Slew Slew ResRes

Input Input CapCap

CostCost

BB11 11 XX11 XX22+N+Nnn

BB22 11 XX22 XX11+N+Nnn

BB33 NN XX33 XX44+N+Nn-1n-1

BB44 NN XX44 XX33+N+Nn-1n-1

helliphellip helliphellip helliphellip helliphellipBB2n-12n-1 NNn-1n-1 XX2n-12n-1 XX2n2n+N+NBB2n2n NNn-1n-1 xx2n2n XX2n-12n-1+N+N

2

1 2 21

2 -1 2

2-1 partition problem given 2 positive integers and 2

is there an index set which contains exactly one of and for each 1 such that

n

n ii

i i ii I

n x x x x N

I x x i n x N

1

1 1

n n

n n i ii i

i I i I i i

N x N x N N N

11

Fixed-Input Slew Buffering Candidate Solution Characteristics

Each candidate Each candidate solution is associated solution is associated withwithndash vvii a node a nodendash ccii downstream downstream

capacitancecapacitancendash ssii cumulative slew cumulative slew

degradation along wiredegradation along wirendash wwii cumulative buffer cumulative buffer

areaarea

vi is a sinkci is sink capacitance

v is an internal node

12

Dynamic Programming

Candidate solutions are propagated toward the source

Start from sinks Candidate

solutions are generated

Three operationsndash Add Wirendash Insert Bufferndash Merge

Solution Pruning

13

Solution Propagation Add Wire

cc22 = c = c11 + cx + cx ss22 = s = s11 + (rcx + (rcx222 + rxc2 + rxc11)ln9)ln9 s slew degradation along wiress slew degradation along wires r wire slew resistance per unit lengthr wire slew resistance per unit length c wire capacitance per unit lengthc wire capacitance per unit length

(v1 c1 w1 s1)

(v2 c2 w2 s2)

x

14

Solution Propagation Insert Buffer

$c_{1b} = C_b$, $s_{1b} = 0$, $w_{1b} = w_1 + w(b)$
$C_b$: buffer input capacitance
$R_b$: buffer output slew resistance
$K_b$: buffer intrinsic slew

$(v_1, c_1, w_1, s_1) \rightarrow (v_1, c_{1b}, w_{1b}, s_{1b})$

Pruned if the slew constraint on $\sqrt{(R_b c_1 + K_b)^2 + s_1^2}$ is violated
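A minimal sketch of the Insert Buffer step follows. It assumes the slew seen at the end of the stage is the root-sum-square of the buffer output slew $R_b c_1 + K_b$ and the wire degradation $s_1$, and it uses a hypothetical parameter `slew_limit` for the constraint value. The type and function names are illustrative only.

```python
import math
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    node: str     # v
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


class Buffer(NamedTuple):
    name: str
    r_slew: float  # R_b: output slew resistance
    k_slew: float  # K_b: intrinsic slew
    c_in: float    # C_b: input capacitance
    area: float    # w(b): buffer area (cost)


def insert_buffer(cand: Candidate, buf: Buffer,
                  slew_limit: float) -> Optional[Candidate]:
    """Return the buffered candidate, or None if the slew check fails."""
    output_slew = buf.r_slew * cand.cap + buf.k_slew
    # Assumed combination rule: sqrt(output_slew^2 + wire_slew^2).
    if math.hypot(output_slew, cand.slew) > slew_limit:
        return None
    # Upstream now sees only the buffer input cap; slew degradation resets.
    return Candidate(cand.node, buf.c_in, cand.area + buf.area, 0.0)
```

Because the new candidate always has $s = 0$ and $c = C_b$, only the cheapest candidate per buffer type needs to be kept at a node.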

15

Solution Propagation Merge

$c_{merge} = c_l + c_r$

$w_{merge} = w_l + w_r$

$s_{merge} = \max(s_l, s_r)$

$(v, c_l, w_l, s_l)$ and $(v, c_r, w_r, s_r)$ are the left and right branch candidates at $v$
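The Merge rules translate directly into code. The sketch below uses an illustrative `Candidate` tuple and `merge` function that are not taken from the authors' implementation.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    node: str     # v
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


def merge(left: Candidate, right: Candidate) -> Candidate:
    """Combine one candidate from each branch at a branching node."""
    if left.node != right.node:
        raise ValueError("candidates must belong to the same node")
    return Candidate(
        left.node,
        left.cap + right.cap,          # loads add up
        left.area + right.area,        # buffer areas add up
        max(left.slew, right.slew),    # the worse branch dominates
    )
```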

16

Solution Pruning

Two candidate solutions
– Solution 1: $(v, c_1, w_1, s_1)$
– Solution 2: $(v, c_2, w_2, s_2)$

Solution 1 is inferior if
– $c_1 > c_2$: larger load
– and $w_1 > w_2$: larger buffer area
– and $s_1 > s_2$: worse cumulative slew degradation on wire
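The inferiority test can be sketched as a simple quadratic-time filter. Two points here are assumptions rather than the authors' method: the comparison uses `>=` instead of the slide's strict `>` so that exact duplicates are also dropped, and no sorted data structure is used to speed up the scan.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    node: str     # v
    cap: float    # c: downstream capacitance
    area: float   # w: cumulative buffer area
    slew: float   # s: cumulative slew degradation along the wire


def is_inferior(a: Candidate, b: Candidate) -> bool:
    """True if a is no better than b in load, area and slew degradation."""
    return a.cap >= b.cap and a.area >= b.area and a.slew >= b.slew


def prune(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that are not inferior to another candidate."""
    kept: List[Candidate] = []
    for cand in cands:
        if any(is_inferior(cand, k) for k in kept):
            continue  # an already-kept candidate is at least as good
        # Drop previously kept candidates that the new one makes inferior.
        kept = [k for k in kept if not is_inferior(k, cand)]
        kept.append(cand)
    return kept
```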

17

Timing vs Slew Buffering (I)

A buffer insertion: $S = 0$, $C = C(b)$

Inserting one buffer yields one new solution (the one with the smallest cost)

In min-cost timing buffering, a buffer insertion brings many non-inferior $(C, W, Q)$ solutions with the same $C$, where $Q$ is the required arrival time (RAT)

18

Timing vs Slew Buffering (II)

Slew constraint is close to a length constraint

An extreme case
– In min-cost timing buffering, solutions with no buffer inserted live until the driver
– They soon become infeasible in slew buffering

A linear-time optimal algorithm for slew buffering with a single buffer type

No polynomial-time min-cost timing buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary

Our idea: discretize the possible input slew values into input slew bins

For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions

Some details
– Input slew bins can be merged for speedup
– Inferiority also depends on the slew bin
– A maximum bipartite matching algorithm is used for pruning
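The discretization idea can be illustrated with a small helper that maps an input slew to a bin. The slides do not say how the bins are laid out, so the uniform bin width, the choice of the upper bin edge as the representative value, and the name `slew_bin` are all assumptions made for this sketch. It does not cover bin merging or the matching-based pruning.

```python
import math
from typing import Tuple


def slew_bin(input_slew: float, slew_limit: float,
             num_bins: int) -> Tuple[int, float]:
    """Map an input slew to (bin index, representative slew).

    Bins are uniform over [0, slew_limit] and indexed from 1. The
    representative value is the upper edge of the bin, so treating the
    input slew as that value never underestimates it.
    """
    if not 0.0 <= input_slew <= slew_limit:
        raise ValueError("input slew outside [0, slew_limit]")
    width = slew_limit / num_bins
    index = min(num_bins, max(1, math.ceil(input_slew / width)))
    return index, index * width
```

The fixed-input-slew procedure would then be run once per bin, using the representative slew as the assumed buffer input slew.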

20

Continuous Slew Buffering

Buffers are allowed to be inserted anywhere

Single buffer type: a linear greedy optimal algorithm – add each buffer as far upstream as possible

Start from the sinks and run the greedy algorithm toward the source
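For a single two-pin wire, the "as far upstream as possible" rule can be sketched as below. This is a simplified illustration under several assumptions that are not stated on the slides: the net is a straight line with no branches, the driver has the same slew parameters as the buffer, the stage slew is $\sqrt{(R_b c + K_b)^2 + s^2}$ with $s$ computed by the Add Wire formula, and the farthest feasible position is found by bisection rather than in closed form.

```python
import math
from typing import List

LN9 = math.log(9.0)


def stage_slew(x: float, load: float, r: float, c: float,
               r_b: float, k_b: float) -> float:
    """Slew at the end of a stage: a gate driving wire length x plus load."""
    wire_slew = LN9 * (r * c * x * x / 2 + r * x * load)
    gate_slew = r_b * (load + c * x) + k_b
    return math.hypot(gate_slew, wire_slew)


def greedy_buffer_positions(length: float, sink_cap: float, r: float,
                            c: float, r_b: float, k_b: float, c_b: float,
                            limit: float, tol: float = 1e-9) -> List[float]:
    """Buffer positions measured from the sink toward the driver."""
    positions: List[float] = []
    start, load = 0.0, sink_cap
    while True:
        remaining = length - start
        # Assumption: the driver behaves like the buffer type.
        if stage_slew(remaining, load, r, c, r_b, k_b) <= limit:
            return positions
        if stage_slew(0.0, load, r, c, r_b, k_b) > limit:
            raise ValueError("slew constraint cannot be met")
        lo, hi = 0.0, remaining  # lo is feasible, hi is infeasible
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if stage_slew(mid, load, r, c, r_b, k_b) <= limit:
                lo = mid
            else:
                hi = mid
        if lo <= tol:
            raise ValueError("no progress possible under this constraint")
        start += lo
        positions.append(start)
        load = c_b  # upstream of the buffer, the load is its input cap
```

With idealized parameters (`r = c = 1`, zero gate parameters and loads, `limit = 0.5 * ln 9`) each stage can span a wire of length 1, so a wire of length 3.5 receives buffers near positions 1, 2 and 3.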

21

Multiple Buffer Types

Multiple buffer types: greedily insert buffers for every possibility
– Slow

Approximation via adaptive buffer selection
– Buffer library is shrunk
– Prefer buffer types with small slew resistance
– Tight slew constraint: choose the top few buffer types
– Loose constraint: choose more buffer types
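The library-shrinking step can be sketched as a sort on slew resistance followed by a cut. The slides do not give the rule that maps constraint tightness to the number of retained types, so the sketch leaves that number to the caller as the hypothetical parameter `keep`. The `Buffer` fields and the function name are likewise illustrative.

```python
from typing import List, NamedTuple


class Buffer(NamedTuple):
    name: str
    r_slew: float  # slew resistance
    area: float    # buffer area (cost)


def shrink_library(library: List[Buffer], keep: int) -> List[Buffer]:
    """Return the `keep` buffer types with the smallest slew resistance.

    A caller would pass a small `keep` under a tight slew constraint and
    a larger one under a loose constraint.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return sorted(library, key=lambda b: b.r_slew)[:keep]
```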

22

Experiments

Experiment setup
– 1000 industrial netlists
– 48 buffer types, including non-inverting buffers and inverting buffers
– A Pentium 4 machine with a 3.2 GHz CPU and 1 GB of memory

Compared to slew-constrained min-cost timing buffering
– Pruning is based on $(Q, C, W)$; $S$ is maintained
– $S$ is only responsible for checking whether the solution violates the slew constraint

23

Slew Constraint vs Buffer Area

[Figure: buffer area (axis from 0 to 50000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

24

Slew Constraint vs CPU Time (s)

[Figure: CPU time in seconds (axis from 0 to 1000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB, and Continuous]

25

Slew Constraint vs Slack

[Figure: slack (axis from 7600 to 8800) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

26

Slew Con vs Solutions at Driver

[Figure: number of solutions at the driver (axis from 0 to 350) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

27

Observations

Discrete fixed-input slew buffering
– A looser slew constraint gives a smaller area
– >100x faster than timing buffering
– Saves 6% area over timing buffering
– Small slack sacrifice

Non-fixed input slew buffering
– Saves up to 40% area over fixed input slew buffering

Continuous slew buffering
– A tight slew constraint causes many buffer insertions; poorly placed candidate buffer positions lead to significant buffer waste
– Continuous slew buffering reduces the waste
– Fast due to the adaptive buffer selection strategy

28

Conclusion

Propose three slew buffering algorithms
– Discrete fixed-input slew buffering
– Discrete non-fixed input slew buffering
– Continuous slew buffering

>100x faster while still saving area compared to timing buffering

>90% of nets are not timing critical in reality and thus can be buffered by our algorithm

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • BufferDriver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 12: Fast Algorithms for Slew Constrained Minimum Cost Buffering

12

Dynamic Programming

Candidate solutions are propagated toward the source

Start from sinks Candidate

solutions are generated

Three operationsndash Add Wirendash Insert Bufferndash Merge

Solution Pruning

13

Solution Propagation Add Wire

cc22 = c = c11 + cx + cx ss22 = s = s11 + (rcx + (rcx222 + rxc2 + rxc11)ln9)ln9 s slew degradation along wiress slew degradation along wires r wire slew resistance per unit lengthr wire slew resistance per unit length c wire capacitance per unit lengthc wire capacitance per unit length

(v1 c1 w1 s1)

(v2 c2 w2 s2)

x

14

Solution Propagation Insert Buffer

c1b = Cb s1b = 0

w1b = w1+w(b)Cb buffer input capacitancePruned if the following slew constraint is violated

Rb buffer output slew resistanceKb buffer intrinsic slew

(v1 c1 w1 s1)(v1 c1b w1b s1b)

221 1( )b bR c K s

15

Solution Propagation Merge

ccmerge merge = c= cl l + c+ crr

wwmerge merge = w= wl l + w+ wrr

ssmergemerge = max(s = max(sl l s srr))

(v cl wl sl) (v cr wlrsr)

16

Solution Pruning

Two candidate solutionsTwo candidate solutionsndash Solution 1 (v c1 w1 s1)Solution 1 (v c1 w1 s1)ndash Solution 2 (v c2 w2 s2)Solution 2 (v c2 w2 s2)

Solution 1 is Solution 1 is inferiorinferior if if ndash c1 gt c2 larger loadc1 gt c2 larger loadndash and w1 gt w2 larger buffer areaand w1 gt w2 larger buffer areandash and s1 gt s2 worse cumulative slew and s1 gt s2 worse cumulative slew

degradationdegradationon wireon wire

17

Timing vs Slew Buffering (I)

A buffer insertion S=0 C=C(b) A buffer insertion S=0 C=C(b) Inserting one buffer Inserting one buffer one new one new

solutionsolution (the one with the smallest (the one with the smallest cost) cost)

In min-cost timing buffering a buffer In min-cost timing buffering a buffer insertion brings many non-inferior insertion brings many non-inferior (CWQ) with the same C where Q is (CWQ) with the same C where Q is the required arrival time (RAT)the required arrival time (RAT)

18

Timing vs Slew Buffering (II)

Slew constraint is close to length Slew constraint is close to length constraint constraint

An extreme case An extreme case ndash in min-cost timing buffering solutions with no buffer in min-cost timing buffering solutions with no buffer

inserted live till driver inserted live till driver ndash Soon become infeasible in slew bufferingSoon become infeasible in slew buffering

A A linear timelinear time optimal algorithm for slew optimal algorithm for slew buffering with a single buffer type buffering with a single buffer type

No polynomial timeNo polynomial time min-cost timing min-cost timing buffering algorithm in the same case buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary Input slew to each buffer can vary Our idea discretize the possible input slew Our idea discretize the possible input slew

values into values into input slew binsinput slew bins For each input slew bin carry out the above For each input slew bin carry out the above

procedure (for the fixed input slew case) to procedure (for the fixed input slew case) to propagate solutions propagate solutions

Some detailsSome detailsndash Input slew bins can be merged for speedupInput slew bins can be merged for speedupndash Inferiority also depends on the slew binInferiority also depends on the slew binndash A maximum bipartite matching algorithm is A maximum bipartite matching algorithm is

used for pruningused for pruning

20

Continuous Slew Buffering

Buffers are allowed to be inserted Buffers are allowed to be inserted anywhereanywhere

Single buffer type a linear greedy optimal Single buffer type a linear greedy optimal algorithm ndash add buffer as upstream as algorithm ndash add buffer as upstream as possiblepossible Start from sinks

Greedy algorithm toward the source

21

Multiple Buffer Types

Multiple buffer types greedily inserts Multiple buffer types greedily inserts buffers for every possibilitybuffers for every possibilityndash SlowSlow

Approximation via Approximation via adaptive buffer adaptive buffer selectionselection ndash Buffer library is shrunkenBuffer library is shrunkenndash Prefer buffer types with small slew Prefer buffer types with small slew

resistanceresistance Tight slew constraint choose top few buffer Tight slew constraint choose top few buffer

typestypes Loose constraint choose more buffer typesLoose constraint choose more buffer types

22

Experiments

Experiment SetupExperiment Setupndash 1000 industrial netlists1000 industrial netlistsndash 48 buffer types including non-inverting 48 buffer types including non-inverting

buffers and inverting buffersbuffers and inverting buffersndash A Pentium 4 machine with a 32GHz CPU A Pentium 4 machine with a 32GHz CPU

1G memory1G memory Compared to slew constrained min-Compared to slew constrained min-

cost timing buffering cost timing buffering ndash Pruning based on (QCW) S is maintainedPruning based on (QCW) S is maintainedndash S is only responsible for checking whether S is only responsible for checking whether

the solution violates the slew constraintthe solution violates the slew constraint

23

Slew Constraint vs Buffer Area

05000

100001500020000250003000035000400004500050000

03 05 1 15 2

Fixed SBTiming BufNon-fixed SBContinuous

Slew Constraint

Area

24

Slew Constraint vs CPU Time (s)

0100200300400500600700800900

1000

03 05 1 15 2

Fixed SBTiming BufNon-fixed SBContinuous

Slew Constraint

CPU Time (s)

25

Slew Constraint vs Slack

7600

7800

8000

8200

8400

8600

8800

03 05 1 15 2

Fixed SBTiming Buf

Slew Constraint

Slack

26

Slew Con vs Solutions at Driver

050

100150200250300350

03 05 1 15 2

Fixed SBTiming Buf

Slew Constraint

Sol at Driver

27

Observations

Discrete fixed-input slew buffering
- Loose slew constraint: smaller area
- >100x faster than timing buffering
- Saves 6% area over timing buffering
- Small slack sacrifice

Non-fixed input slew buffering
- Saves up to 40% area over fixed-input slew buffering

Continuous slew buffering
- Tight slew constraint causes many buffer insertions
- Poorly set candidate buffer positions lead to significant buffer waste
- Continuous slew buffering reduces the waste
- Fast due to the adaptive buffer selection strategy

28

Conclusion

Three slew buffering algorithms are proposed
- Discrete fixed-input slew buffering
- Discrete non-fixed-input slew buffering
- Continuous slew buffering

>100x faster while still saving area compared to timing buffering

>90% of nets are not timing critical in reality and thus can be buffered by our algorithm

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • Buffer/Driver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 13: Fast Algorithms for Slew Constrained Minimum Cost Buffering

13

Solution Propagation Add Wire

$c_2 = c_1 + cx$

$s_2 = s_1 + \left(\tfrac{1}{2} r c x^2 + r x c_1\right) \ln 9$

- s: slew degradation along wires
- r: wire slew resistance per unit length
- c: wire capacitance per unit length

[Diagram: a wire of length x connects the solution (v1, c1, w1, s1) at node v1 to the solution (v2, c2, w2, s2) at node v2.]
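The wire update above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation: the `Solution` tuple and the `add_wire` function are hypothetical names, and carrying the buffer cost w over unchanged is an assumption (the slide only gives the updates for c and s).

```python
import math
from typing import NamedTuple


class Solution(NamedTuple):
    node: str    # v: node the candidate solution is attached to
    cap: float   # c: downstream capacitance seen at the node
    area: float  # w: total buffer cost (area) inserted so far
    slew: float  # s: accumulated slew degradation along the wire


def add_wire(sol: Solution, new_node: str, x: float, r: float, c: float) -> Solution:
    """Propagate a solution across a wire of length x.

    r and c are the per-unit-length wire slew resistance and capacitance.
    """
    cap = sol.cap + c * x
    slew = sol.slew + (0.5 * r * c * x * x + r * x * sol.cap) * math.log(9)
    # Assumption: a bare wire adds no buffer cost, so w is carried over unchanged.
    return Solution(new_node, cap, sol.area, slew)
```

Note that the slew term grows quadratically with x, which is why long unbuffered wires quickly violate a slew constraint.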

14

Solution Propagation Insert Buffer

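$c_{1b} = C_b$, $s_{1b} = 0$, $w_{1b} = w_1 + w(b)$

- C_b: buffer input capacitance
- R_b: buffer output slew resistance
- K_b: buffer intrinsic slew

The new solution is pruned if the following slew constraint is violated:

$\sqrt{(R_b c_1 + K_b)^2 + s_1^2} \le \text{slew constraint}$

[Diagram: inserting buffer b at node v1 turns the solution (v1, c1, w1, s1) into (v1, c1b, w1b, s1b).]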
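A minimal sketch of the buffer insertion step follows. It assumes the slew check combines the buffer output slew R_b c_1 + K_b with the wire slew degradation s_1 as a root sum of squares; the expression on the slide is damaged in extraction, so that reading is an assumption. The names `Buffer`, `insert_buffer`, and `slew_limit` are hypothetical.

```python
import math
from typing import NamedTuple, Optional


class Solution(NamedTuple):
    node: str    # v
    cap: float   # c: downstream capacitance
    area: float  # w: accumulated buffer cost
    slew: float  # s: accumulated wire slew degradation


class Buffer(NamedTuple):
    c_in: float   # C_b: input capacitance
    r_out: float  # R_b: output slew resistance
    k: float      # K_b: intrinsic slew
    cost: float   # w(b): buffer cost (area)


def insert_buffer(sol: Solution, buf: Buffer, slew_limit: float) -> Optional[Solution]:
    """Return the solution after inserting buf at sol.node, or None if pruned."""
    # Assumed slew model: buffer output slew and wire degradation add in quadrature.
    downstream_slew = math.hypot(buf.r_out * sol.cap + buf.k, sol.slew)
    if downstream_slew > slew_limit:
        return None  # slew constraint violated, so the candidate is dropped
    # The buffer shields the downstream load and resets the wire slew degradation.
    return Solution(sol.node, buf.c_in, sol.area + buf.cost, 0.0)
```

Because the slew degradation is reset to zero and the load becomes C_b, every buffered solution at a node looks the same apart from its cost.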

15

Solution Propagation Merge

$c_{merge} = c_l + c_r$

$w_{merge} = w_l + w_r$

$s_{merge} = \max(s_l, s_r)$

[Diagram: the left-branch solution (v, cl, wl, sl) and the right-branch solution (v, cr, wr, sr) are merged at node v.]
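The merge rule translates directly into code. The sketch below uses a hypothetical `Solution` tuple and `merge` function; the check that both branches sit at the same node is an added safeguard, not something the slide states.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    node: str    # v
    cap: float   # c: downstream capacitance
    area: float  # w: accumulated buffer cost
    slew: float  # s: accumulated wire slew degradation


def merge(left: Solution, right: Solution) -> Solution:
    """Combine the solutions of two branches that meet at the same node."""
    if left.node != right.node:
        raise ValueError("branches must meet at the same node")
    return Solution(
        left.node,
        left.cap + right.cap,          # loads add up
        left.area + right.area,        # buffer costs add up
        max(left.slew, right.slew),    # the worse branch determines the slew
    )
```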

16

Solution Pruning

Two candidate solutions
- Solution 1: (v, c1, w1, s1)
- Solution 2: (v, c2, w2, s2)

Solution 1 is inferior if
- c1 > c2 (larger load)
- and w1 > w2 (larger buffer area)
- and s1 > s2 (worse cumulative slew degradation on wire)
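The inferiority test can be sketched as a three-way comparison. The code below is an illustration under assumptions: `is_inferior` and `prune` are hypothetical names, the strict inequalities are taken from the slide as extracted, and the quadratic scan is chosen for clarity rather than speed.

```python
from typing import List, NamedTuple


class Solution(NamedTuple):
    node: str    # v
    cap: float   # c: downstream capacitance
    area: float  # w: accumulated buffer cost
    slew: float  # s: accumulated wire slew degradation


def is_inferior(a: Solution, b: Solution) -> bool:
    """True if a is worse than b in load, buffer area, and slew degradation."""
    return a.cap > b.cap and a.area > b.area and a.slew > b.slew


def prune(solutions: List[Solution]) -> List[Solution]:
    """Drop every solution that is inferior to some other solution in the list."""
    return [a for a in solutions if not any(is_inferior(a, b) for b in solutions)]
```

With strict comparisons, two solutions that tie in any one field both survive; a production implementation would likely also discard exact duplicates.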

17

Timing vs Slew Buffering (I)

After a buffer insertion, S = 0 and C = C(b), so inserting one buffer yields only one new solution (the one with the smallest cost).

In min-cost timing buffering, a buffer insertion brings many non-inferior (C, W, Q) solutions with the same C, where Q is the required arrival time (RAT).

18

Timing vs Slew Buffering (II)

Slew constraint is close to length constraint

An extreme case
- In min-cost timing buffering, solutions with no buffer inserted live till the driver
- They soon become infeasible in slew buffering

A linear-time optimal algorithm for slew buffering with a single buffer type

No polynomial-time min-cost timing buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary

Our idea: discretize the possible input slew values into input slew bins

For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions

Some details
- Input slew bins can be merged for speedup
- Inferiority also depends on the slew bin
- A maximum bipartite matching algorithm is used for pruning

20

Continuous Slew Buffering

Buffers are allowed to be inserted anywhere

Single buffer type: a linear greedy optimal algorithm (add each buffer as far upstream as possible)

[Diagram: start from the sinks; the greedy algorithm proceeds toward the source.]
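For intuition, the greedy idea can be sketched on the simplest case, a two-pin wire with one buffer type. Everything below is a sketch under stated assumptions rather than the algorithm from the talk: the slew model is the root-sum-square form assumed for the insertion step, the driver is assumed to have the same slew characteristics as the buffer, and the names `end_slew`, `max_span`, and `greedy_buffer_positions` are hypothetical. Bisection is used to find the farthest upstream point at which the slew limit still holds.

```python
import math
from typing import List, NamedTuple


class Buffer(NamedTuple):
    c_in: float   # C_b: input capacitance
    r_out: float  # R_b: output slew resistance
    k: float      # K_b: intrinsic slew


def end_slew(x: float, load: float, buf: Buffer, r: float, c: float) -> float:
    """Slew at the far end of a wire of length x driven by buf into `load`."""
    out_slew = buf.r_out * (load + c * x) + buf.k
    wire_slew = (0.5 * r * c * x * x + r * x * load) * math.log(9)
    return math.hypot(out_slew, wire_slew)


def max_span(load: float, remaining: float, buf: Buffer,
             r: float, c: float, limit: float) -> float:
    """Longest wire (capped at `remaining`) that still meets the slew limit."""
    if end_slew(0.0, load, buf, r, c) > limit:
        raise ValueError("slew limit cannot be met even with zero wire length")
    if end_slew(remaining, load, buf, r, c) <= limit:
        return remaining
    lo, hi = 0.0, remaining
    for _ in range(60):  # bisection: end_slew grows monotonically with x
        mid = 0.5 * (lo + hi)
        if end_slew(mid, load, buf, r, c) <= limit:
            lo = mid
        else:
            hi = mid
    return lo


def greedy_buffer_positions(length: float, c_sink: float, buf: Buffer,
                            r: float, c: float, limit: float) -> List[float]:
    """Buffer positions on a two-pin wire, measured as distance from the sink."""
    positions: List[float] = []
    pos, load = 0.0, c_sink
    while True:
        remaining = length - pos
        span = max_span(load, remaining, buf, r, c, limit)
        if span >= remaining:
            return positions  # the driver can handle the rest of the wire
        if span <= 0.0:
            raise ValueError("no progress possible under this slew limit")
        pos += span  # walk as far upstream as the slew limit allows
        positions.append(pos)
        load = buf.c_in  # upstream of a buffer, the load is its input capacitance
```

On a tree the same idea would have to be combined with the merge step at branch points, which this sketch does not attempt.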

21

Multiple Buffer Types

Multiple buffer types: greedily insert buffers for every possibility
- Slow

Approximation via adaptive buffer selection
- Buffer library is shrunk
- Prefer buffer types with small slew resistance
- Tight slew constraint: choose the top few buffer types
- Loose constraint: choose more buffer types
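The library-shrinking step might look like the sketch below. The slide does not say how the number of retained buffer types is derived from the slew constraint, so `keep` is left as a caller-supplied parameter; `Buffer` and `shrink_library` are hypothetical names.

```python
from typing import List, NamedTuple


class Buffer(NamedTuple):
    name: str
    r_out: float  # output slew resistance
    cost: float   # buffer cost (area)


def shrink_library(library: List[Buffer], keep: int) -> List[Buffer]:
    """Keep the `keep` buffer types with the smallest output slew resistance."""
    if keep < 1:
        raise ValueError("at least one buffer type must be kept")
    return sorted(library, key=lambda b: b.r_out)[:keep]
```

A caller would pass a small `keep` under a tight slew constraint and a larger one under a loose constraint, in line with the bullets above.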

Page 17: Fast Algorithms for Slew Constrained Minimum Cost Buffering

17

Timing vs Slew Buffering (I)

A buffer insertion: S = 0, C = C(b)
Inserting one buffer yields one new solution (the one with the smallest cost)
In min-cost timing buffering, a buffer insertion brings many non-inferior solutions (C, W, Q) with the same C, where Q is the required arrival time (RAT)
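The collapse to a single solution can be sketched in a few lines of Python. This is only an illustration: the `Solution` tuple, its field names, and `insert_buffer` are hypothetical, and the sketch assumes every candidate passed in has already been checked as drivable by the buffer within the slew constraint.

```python
from typing import List, NamedTuple


class Solution(NamedTuple):
    cap: float   # C: downstream capacitance seen at this node
    cost: float  # W: total buffer cost (for example, area) used so far
    slew: float  # S: slew degradation accumulated since the last buffer


def insert_buffer(candidates: List[Solution], buf_cap: float, buf_cost: float) -> Solution:
    """Collapse feasible candidates into the single solution left after buffering.

    Assumption: all candidates are already known to satisfy the slew
    constraint when driven by this buffer, so only cost distinguishes them.
    """
    if not candidates:
        raise ValueError("no feasible candidate to buffer")
    cheapest = min(c.cost for c in candidates)
    # After the buffer, S resets to 0 and C becomes the buffer input capacitance.
    return Solution(cap=buf_cap, cost=cheapest + buf_cost, slew=0.0)
```

Because S and C are the same for every buffered candidate, only the cheapest one can be non-inferior, which is why one insertion yields one new solution here.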

18

Timing vs Slew Buffering (II)

Slew constraint is close to length constraint
An extreme case
- In min-cost timing buffering, solutions with no buffer inserted live till the driver
- They soon become infeasible in slew buffering
A linear-time optimal algorithm for slew buffering with a single buffer type
No polynomial-time min-cost timing buffering algorithm in the same case

19

Non-Fixed Input Slew

Input slew to each buffer can vary
Our idea: discretize the possible input slew values into input slew bins
For each input slew bin, carry out the above procedure (for the fixed input slew case) to propagate solutions
Some details
- Input slew bins can be merged for speedup
- Inferiority also depends on the slew bin
- A maximum bipartite matching algorithm is used for pruning
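A minimal sketch of the binning step, in Python. The bin edges, the function names, and the choice to round an input slew up to the next bin edge (a pessimistic convention) are assumptions made for illustration; the slides do not say how bins are chosen.

```python
import bisect
from typing import Dict, List, Sequence


def assign_slew_bin(input_slew: float, bin_edges: Sequence[float]) -> int:
    """Return the index of the smallest bin edge that is >= input_slew.

    bin_edges must be sorted in increasing order (hypothetical values).
    """
    idx = bisect.bisect_left(bin_edges, input_slew)
    if idx == len(bin_edges):
        raise ValueError("input slew exceeds the largest bin edge")
    return idx


def group_by_bin(input_slews: Sequence[float], bin_edges: Sequence[float]) -> Dict[int, List[float]]:
    """Group input slew values by bin so each bin can be propagated separately."""
    groups: Dict[int, List[float]] = {}
    for s in input_slews:
        groups.setdefault(assign_slew_bin(s, bin_edges), []).append(s)
    return groups
```

Each bin would then be handled by the fixed-input-slew procedure, using the bin's upper edge as the assumed input slew.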

20

Continuous Slew Buffering

Buffers are allowed to be inserted anywhere
Single buffer type: a linear greedy optimal algorithm
- Add buffer as upstream as possible
- Start from sinks
- Greedy algorithm toward the source
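The greedy idea can be illustrated on a two-pin net in Python. This sketch replaces the slew model with a simple assumption: a buffer (or the driver) can drive at most `max_span` units of wire before the slew constraint is violated. The function name and parameters are hypothetical, and branching nets are not handled.

```python
from typing import List


def greedy_buffer_positions(net_length: int, max_span: int) -> List[int]:
    """Place buffers on a single wire, walking from the sink toward the source.

    Positions are distances from the sink. Each buffer is pushed as far
    upstream as the assumed maximum drivable span allows; the driver at
    net_length drives whatever wire remains.
    """
    if net_length < 0 or max_span <= 0:
        raise ValueError("net_length must be >= 0 and max_span must be > 0")
    positions = []
    k = 1
    while k * max_span < net_length:
        positions.append(k * max_span)
        k += 1
    return positions
```

For a wire of length 10 with a maximum span of 3, the sketch places buffers at 3, 6 and 9, and the driver drives the last unit of wire.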

21

Multiple Buffer Types

Multiple buffer types: greedily insert buffers for every possibility
- Slow
Approximation via adaptive buffer selection
- Buffer library is shrunken
- Prefer buffer types with small slew resistance
- Tight slew constraint: choose top few buffer types
- Loose constraint: choose more buffer types
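One way to picture adaptive buffer selection is the Python sketch below. The `BufferType` fields, the threshold separating tight from loose constraints, and the counts `few` and `more` are all hypothetical; the slides only state that types with small slew resistance are preferred and that a looser constraint admits more types.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class BufferType:
    name: str
    slew_resistance: float
    area: float


def select_buffers(
    library: Sequence[BufferType],
    slew_constraint: float,
    tight_threshold: float = 0.5,  # assumed boundary between tight and loose
    few: int = 2,                  # assumed library size under a tight constraint
    more: int = 4,                 # assumed library size under a loose constraint
) -> List[BufferType]:
    """Shrink the library, keeping the types with the smallest slew resistance."""
    ranked = sorted(library, key=lambda b: b.slew_resistance)
    k = few if slew_constraint <= tight_threshold else more
    return ranked[:k]
```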

22

Experiments

Experiment setup
- 1000 industrial netlists
- 48 buffer types, including non-inverting buffers and inverting buffers
- A Pentium 4 machine with a 3.2 GHz CPU and 1 GB memory
Compared to slew-constrained min-cost timing buffering
- Pruning based on (Q, C, W); S is maintained
- S is only responsible for checking whether the solution violates the slew constraint
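The baseline's pruning rule can be sketched in Python as follows. The tuple layout, the function names, and the quadratic pairwise comparison are assumptions chosen for clarity, not the implementation used in the experiments: S only filters out infeasible solutions, while dominance is decided on (Q, C, W) alone.

```python
from typing import List, NamedTuple


class TimingSolution(NamedTuple):
    q: float  # Q: required arrival time (larger is better)
    c: float  # C: downstream capacitance (smaller is better)
    w: float  # W: buffer cost (smaller is better)
    s: float  # S: slew, used only for the feasibility check


def dominates(a: TimingSolution, b: TimingSolution) -> bool:
    """True if a is at least as good as b in Q, C and W."""
    return a.q >= b.q and a.c <= b.c and a.w <= b.w


def prune(solutions: List[TimingSolution], slew_limit: float) -> List[TimingSolution]:
    """Drop slew-violating solutions, then drop solutions inferior in (Q, C, W)."""
    feasible = [x for x in solutions if x.s <= slew_limit]
    kept = []
    for i, a in enumerate(feasible):
        dominated = False
        for j, b in enumerate(feasible):
            if i == j:
                continue
            # On exact ties in (Q, C, W), keep only the earliest solution.
            if dominates(b, a) and (not dominates(a, b) or j < i):
                dominated = True
                break
        if not dominated:
            kept.append(a)
    return kept
```

Since three independent quantities decide dominance, many solutions can survive at each node, which is consistent with the larger solution counts reported for timing buffering at the driver.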

23

Slew Constraint vs Buffer Area

[Plot not reproduced: buffer area (axis 0 to 50000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB and Continuous]

24

Slew Constraint vs CPU Time (s)

[Plot not reproduced: CPU time in seconds (axis 0 to 1000) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB, Timing Buf, Non-fixed SB and Continuous]

25

Slew Constraint vs Slack

[Plot not reproduced: slack (axis 7600 to 8800) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

26

Slew Con vs Solutions at Driver

[Plot not reproduced: number of solutions at the driver (axis 0 to 350) versus slew constraint (0.3, 0.5, 1, 1.5, 2) for Fixed SB and Timing Buf]

  • Fast Algorithms for Slew Constrained Minimum Cost Buffering
  • Outline
  • Motivation
  • A New Flow for Buffering 1M Nets
  • Slew Definition
  • Slew Model
  • BufferDriver Input Slew Assumption
  • Slew Resistance of An Inverter
  • Problem Formulation
  • NP-Complete Proof
  • Fixed-Input Slew Buffering Candidate Solution Characteristics
  • Dynamic Programming
  • Solution Propagation Add Wire
  • Solution Propagation Insert Buffer
  • Solution Propagation Merge
  • Solution Pruning
  • Timing vs Slew Buffering (I)
  • Timing vs Slew Buffering (II)
  • Non-Fixed Input Slew
  • Continuous Slew Buffering
  • Multiple Buffer Types
  • Experiments
  • Slew Constraint vs Buffer Area
  • Slew Constraint vs CPU Time (s)
  • Slew Constraint vs Slack
  • Slew Con vs Solutions at Driver
  • Observations
  • Conclusion
Page 23: Fast Algorithms for Slew Constrained Minimum Cost Buffering

23

Slew Constraint vs Buffer Area

[Chart: buffer area (y-axis, 0 to 50000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2). Series: Fixed SB, Timing Buf, Non-fixed SB, Continuous.]

24

Slew Constraint vs CPU Time (s)

[Chart: CPU time in seconds (y-axis, 0 to 1000) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2). Series: Fixed SB, Timing Buf, Non-fixed SB, Continuous.]

25

Slew Constraint vs Slack

[Chart: slack (y-axis, 7600 to 8800) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2). Series: Fixed SB, Timing Buf.]

26

Slew Con vs Solutions at Driver

[Chart: number of solutions at the driver (y-axis, 0 to 350) versus slew constraint (x-axis: 0.3, 0.5, 1, 1.5, 2). Series: Fixed SB, Timing Buf.]

27

Observations

Discrete fixed-input slew buffering
– Loose slew constraint → smaller area
– >100x faster than timing buffering
– Saves 6% area over timing buffering
– Small slack sacrifice

Non-fixed input slew buffering
– Saves up to 40% area over fixed-input slew buffering

Continuous slew buffering
– Tight slew constraint causes many buffer insertions
– Not-well-set candidate buffer positions → significant buffer waste
– Continuous slew buffering reduces the waste
– Fast due to adaptive buffer selection strategy
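The buffer-waste observation can be illustrated with a toy calculation. The sketch below is not the authors' slew model or algorithm. It assumes a single two-pin wire and replaces the slew constraint with a hypothetical `max_reach`: the longest wire any driver or buffer may drive. The names `buffers_discrete`, `buffers_continuous`, and `pitch` (spacing of the allowed candidate positions) are invented for illustration, and all lengths are integers in arbitrary units.

```python
def ceil_div(a, b):
    """Integer ceiling division for positive b."""
    return -(-a // b)


def buffers_discrete(wire_len, max_reach, pitch):
    """Minimum buffers when buffers may sit only at multiples of `pitch`.

    Toy assumption: every stage can drive at most `max_reach` of wire.
    """
    # Longest hop between two legal candidate positions.
    step = (max_reach // pitch) * pitch
    if step == 0:
        raise ValueError("candidate pitch exceeds max_reach: infeasible")
    if wire_len <= max_reach:
        return 0  # the driver reaches the sink on its own
    # The last buffer may drive a full max_reach to the sink, so only
    # the first (wire_len - max_reach) must be covered by hops of `step`.
    return ceil_div(wire_len - max_reach, step)


def buffers_continuous(wire_len, max_reach):
    """Minimum buffers when a buffer may sit anywhere along the wire."""
    if wire_len <= max_reach:
        return 0
    return ceil_div(wire_len - max_reach, max_reach)


if __name__ == "__main__":
    wire_len, max_reach = 9000, 1000
    for pitch in (100, 600, 900):
        print(pitch,
              buffers_discrete(wire_len, max_reach, pitch),
              buffers_continuous(wire_len, max_reach))
```

In this toy setting a candidate pitch that does not divide the reach evenly shortens every hop, so the discrete solution needs more buffers than the continuous one. A tighter constraint (smaller `max_reach`) makes the effect appear at finer pitches.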

28

Conclusion

Propose three slew buffering algorithms
– Discrete fixed-input slew buffering
– Discrete non-fixed input slew buffering
– Continuous slew buffering

>100x faster while still saving area compared to timing buffering

>90% of nets are not timing critical in reality and thus can be buffered by our algorithm
