
High-Performance Gate Selection with a Signoff Timer

Andrew B. Kahng*, Seokhyeong Kang*, Hyein Lee*, Igor L. Markov+ and Pankit Thapar+

UC San Diego* University of Michigan+


Outline

• Gate Selection in VLSI Design
• Previous Work
• Challenges in Gate Selection
• High-Performance Gate Selection with a Signoff Timer
• Overall Flow
• Experimental Results
• Conclusions and Future Work


Gate Selection in VLSI Design

• Effective approach to power and delay optimization
• Objective: select a library cell for each gate
  • Tunable cell parameters: gate length, gate width, Vth
  • Minimize power
  • Satisfy constraints: slack, slew, max load capacitance, …

Cell options, ordered from lower (leakage) power and lower speed to higher (leakage) power and higher speed:
• Gate width (drive strength): INVX2, INVX4, INVX8, INVX16
• Multi-Vth: HVT, NVT, LVT
• Lgate bias: L=65nm, L=60nm, L=55nm


Previous Techniques

• Common heuristics/algorithms
  • Continuous methods: linear programming, convex optimization, Lagrangian relaxation
  • Discrete methods: dynamic programming, sensitivity-based selection
  • [Figure: the methods compared in terms of optimality and scalability]
• Limitations
  • Do not account for realistic delay models and constraints (capacitance, slew)
  • Continuous methods: industrial cell libraries offer discrete gate sizes, and rounding solutions is not easy
  • Discrete methods: scalability to large circuits is an issue


Previous Work

• Our work extends Trident 1.0 [Hu et al. Proc. ICCAD 2012]
  • Produced strongest results on ISPD 2012 benchmarks as of ICCAD 2012
  • Metaheuristic optimization with importance sampling and sensitivity-guided search
  • Limitation: no interconnect delay calculation (an unrealistic assumption)


Outline

• Gate Selection in VLSI Design
• Previous Work
• Challenges in Gate Selection
  • Interconnect delay
  • Incorrect internal timer
  • Critical paths
• High-Performance Gate Selection with a Signoff Timer
• Overall Flow
• Experimental Results
• Conclusions and Future Work


Challenges in Gate Selection

• Selection problem seen at all phases of the RTL-to-GDS flow
• Becomes more challenging at later design stages
  • Timing constraints are strict
  • Gate and interconnect delay
  • Slew, max capacitance
• Gate selection can result in a large change in interconnect delay

[Figure: RTL-to-GDS flow: RTL → (logic synthesis) → gate-level netlist → (placement) → placed netlist → (route) → routed netlist → GDS. Gate selection applies throughout and is most challenging on routed netlists, where interconnects are present.]

Our Problem

New challenges in the ISPD 2013 Gate Selection Contest
• Routed netlists including interconnect
• Realistic timing constraints including slew and capacitance
• Relying on an industry signoff timer


Issue 1: Interconnect Delay/Slew

• Delay and slew calculations for gates and wires
  • Delay: 50% of input transition to 50% of output transition
  • Slew: 25% to 75% of transition
• Gate delay and slew are estimated with lookup-table-based nonlinear delay models (NLDMs)
• Interconnect delay and slew are estimated with analytical models for RC trees

[Figure: wire delay and wire slew (25% to 75%) on a net connecting cells (source S; sinks A, B, C), and the corresponding RC-tree model with nodes v0 to v5, capacitances C0 to C5 and resistances R0-1, R1-2, R2-3, R2-4, R4-5]
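As an illustration of an analytical model for RC trees, the sketch below computes the Elmore delay from the source to every node of a tree. It is a minimal example under assumed conventions, not the authors' timer: the function name `elmore_delays` and the dictionary-based tree encoding (`parent`, `res`, `cap`) are hypothetical.

```python
from collections import defaultdict


def elmore_delays(parent, res, cap, root):
    """Elmore delay from `root` to every node of an RC tree.

    parent[v]: parent node of v (root has no entry)
    res[v]:    resistance of the segment parent[v] -> v
    cap[v]:    capacitance attached to node v (root included)
    """
    children = defaultdict(list)
    for v, p in parent.items():
        children[p].append(v)

    # Pre-order traversal: every node appears after its parent.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    # Downstream capacitance: accumulate from the leaves towards the root.
    down = dict(cap)
    for v in reversed(order):
        if v != root:
            down[parent[v]] += down[v]

    # delay(v) = delay(parent) + R(parent -> v) * downstream capacitance at v
    delay = {root: 0.0}
    for v in order:
        if v != root:
            delay[v] = delay[parent[v]] + res[v] * down[v]
    return delay


if __name__ == "__main__":
    # Toy chain v0 -> v1 -> v2 with unit resistances (hypothetical values).
    d = elmore_delays({"v1": "v0", "v2": "v1"},
                      {"v1": 1.0, "v2": 1.0},
                      {"v0": 0.0, "v1": 1.0, "v2": 2.0},
                      "v0")
    print(d)
```

Because the downstream capacitances are accumulated once, all sink delays come out of two linear passes over the tree, which is the kind of cost a move-based optimizer can afford.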


Issue 1: Interconnect Delay/Slew
• The impact of interconnects on slew values propagates both upstream and downstream and changes delays along the way
  • Output pin capacitance change + slew change by interconnect
  • Slew propagation + slew degradation by interconnect
  • Large delay changes in upstream and downstream gates and nets

[Figure: target cell T with fanin cells FI1, FI2, fanout cells FO1, FO2 and slews S1, S2]


Issue 2: Incorrect Internal Timer
• A timer is essential to estimate interconnect delay and slew, which are affected by gate selection/Vth swapping
• Two options: signoff timer and internal timer
  • Signoff timer: iterative invocation increases runtime
  • Internal timer: timing discrepancy with the signoff timer

[Figure: post-layout optimizer (gate selection/Vt swapping) between post-layout and signoff, using either a signoff timer or an internal timer]

An accurate internal timer is needed


Issue 2: Inaccurate Internal Timer
• Challenges in matching signoff timer
  • Error propagation along paths
  • Error accumulation with netlist changes

[Figure: two plots of error (internal – signoff): error propagation on paths, versus logic depth along the path, and error accumulation with netlist change, versus number of cell changes]

Timing calibration to a signoff timer is needed to avoid divergence


Issue 3: Critical Paths
• Many near-critical paths in the given benchmarks
• Challenging to obtain a timing-feasible solution

* From ISPD 2013 Discrete Gate Selection Contest Presentation

Dedicated critical path optimization is needed


Outline

• Gate Selection in VLSI Design
• Previous Work
• Challenges in Gate Selection
• High-Performance Gate Selection with a Signoff Timer
  • Internal Timer with Interconnect Delay Modeling
  • Calibration to a Signoff Timer
  • Dedicated Critical Path Optimization
  • Sensitivity Functions
• Overall Flow
• Experimental Results
• Conclusions and Future Work


Our Sizer

• High-Performance Gate Selection with a Signoff Timer
  1. Interconnect delay/slew models for an internal timer
  2. Efficient calibration to a signoff timer
  3. Critical path optimization for timing-feasible solutions
  4. Sensitivity-guided cell selection


1. Interconnect Delay/Slew for Internal Timer

• Essential to estimate interconnect delay and slew affected by gate selection/Vth swapping
• Requirements for an internal timer
  • Fast enough for move-based optimization
  • Accurate enough to track signoff timer
• Our approach: use best-performing models for interconnect delay/slew from previous work


Interconnect Delay/Slew: Previously Known Models
• Early optimization does not require accuracy → fast interconnect models
• We use pre-existing models
  • Delay models: Elmore delay, D2M, DM1, DM2
  • Slew models: PERI, S2M
  • Effective capacitance models: McCormick, total capacitance
• Model selection criterion: endpoint slack error between signoff timer* and our estimation

References for the models:
• D2M: Alpert et al. ISPD 2000
• DM1, DM2: Kahng et al. TCAD 1997
• PERI: Kashyap et al. TAU 2002
• S2M: Agarwal et al. TCAD 2004
• McCormick: McCormick Thesis 1989

* Synopsys PrimeTime
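For readers unfamiliar with the closed-form metrics named above, here is a small sketch of two of them as they are commonly stated in the literature: D2M as ln(2) · m1² / √m2, with m1 and m2 the first two moments of the impulse response at the sink, and PERI as the root-sum-square of the step-response slew and the input slew. The function names and argument conventions are assumptions made for this example; how the authors' timer obtains the moments is not described on the slide.

```python
import math


def d2m_delay(m1, m2):
    """D2M delay metric: ln(2) * m1^2 / sqrt(m2).

    m1, m2: first and second moments of the impulse response at the sink.
    Only m1^2 is used, so the sign convention of m1 does not matter.
    """
    return math.log(2) * m1 * m1 / math.sqrt(m2)


def peri_slew(step_slew, input_slew):
    """PERI ramp-input extension: sqrt(step_slew^2 + input_slew^2)."""
    return math.hypot(step_slew, input_slew)


if __name__ == "__main__":
    # Single-pole RC with time constant 2.0: m1 = -2, m2 = 4,
    # for which D2M reduces to ln(2) * RC.
    print(d2m_delay(-2.0, 4.0))
    print(peri_slew(3.0, 4.0))
```

Both metrics are evaluated in constant time once the moments are known, which fits the slide's point that early optimization favors fast models over exact ones.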


Interconnect Delay/Slew: Model Selection
• The (D2M, PERI) model combination has the smallest mean and standard deviation of endpoint slack error

[Figure: endpoint slack error distributions for (EM, PERI), (D2M, PERI), (DM1, PERI) and (DM2, PERI); x-axis: slack error (ps), y-axis: % of #paths]
[Figure: normalized mean and standard deviation of endpoint slack error for each combination]


2. Calibration to a Signoff Timer
• Challenges in matching the results of a signoff timer
  • Timing divergence from error propagation along timing paths and error accumulation with netlist changes
• The divergence can be compensated with an offset
  • Offset-based slack calibration [Moon et al. Patent 7,823,098]
  • Improve the accuracy of a given STA engine by periodically invoking a signoff timer and storing slack differences (offsets)

[Figure: the internal timer requests timing information from the signoff timer; offset = signoff timer – internal timer]
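The offset idea can be sketched in a few lines. This is a minimal illustration under assumed data structures (slacks stored per endpoint in dictionaries, a hypothetical `OffsetCalibrator` class), not the interface of the authors' tool or of the cited patent.

```python
class OffsetCalibrator:
    """Stores per-endpoint offsets: offset = signoff slack - internal slack."""

    def __init__(self):
        self.offset = {}

    def calibrate(self, internal, signoff):
        # Both arguments map endpoint name -> slack (ps) for the SAME netlist state.
        self.offset = {ep: signoff[ep] - internal[ep] for ep in signoff}

    def corrected(self, internal):
        # Apply the stored offsets to fresh internal-timer slacks.
        # Endpoints never calibrated keep their internal value.
        return {ep: s + self.offset.get(ep, 0.0) for ep, s in internal.items()}


if __name__ == "__main__":
    cal = OffsetCalibrator()
    # Hypothetical slacks at calibration time.
    cal.calibrate(internal={"ff1/D": 10.0, "ff2/D": -5.0},
                  signoff={"ff1/D": 7.0, "ff2/D": -9.0})
    # After a few cell changes the internal timer is rerun and corrected.
    print(cal.corrected({"ff1/D": 12.0, "ff2/D": -4.0}))
```

Between calibrations the optimizer only pays for the internal timer; the signoff timer is consulted again when the stored offsets are judged stale.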


Calibration Frequency vs. Error
• Impact of calibration frequency on average slack error during gate selection: a 5% threshold shows <10ps slack errors

[Figure: x-axis: % of cell changes during leakage optimization; y-axis: average slack error relative to the signoff timer]


Efficient Signoff-Timer Interface

• Tcl socket interface allows sending/receiving commands to/from the signoff timer
• Basis of the winning ISPD 2013 gate selection contest entry

(a) Tcl socket code

Tcl server (signoff timer):

    # Register a handler for each incoming connection.
    proc accept {sock addr port} {
        fileevent $sock readable [list svcHandler $sock]
        fconfigure ...
    }
    socket -server accept $port
    vwait events

Tcl client (Sizer):

    set server xx.xx.xx
    set chan [socket $server $port]

    # Read one line of timing data from the signoff timer.
    proc GetData {} {
        global chan
        set data [gets $chan]
        return $data
    }

    # Send one command line to the signoff timer.
    proc SendData {data} {
        global chan
        puts $chan $data
    }

(b) Timing correlation with the socket interface

[Figure: the Sizer loads the design, launches the signoff timer and opens a socket; during cell sizing it sends the cell swap list, the signoff timer updates cell sizes and runs incremental STA, and the timing results are returned for timing calibration]


3. Critical Path Optimization
• ISPD 2013 contest: many near-critical paths in benchmarks
  • Challenging to obtain a timing-feasible solution
• Dedicated critical path optimization: optimize cells on the most critical path to reduce WNS*
  • Downsizing fanouts
  • Peephole optimization

* WNS: Worst Negative Slack


Critical Path Optimization: Downsizing Fanouts
• Downsizing fanouts of critical cells
  • Improve delay of the target cell by reducing output load
  • Downsize fanout cells with sensitivity score SF_down, a function of C_out(c) and size(c)

[Figure: critical cells and their fanout cells; downsizing the fanout cells reduces their input capacitance, so the critical cell's delay decreases with the reduced output load]


Critical Path Optimization: Peephole Optimization
• Pick k cells in a critical path and exhaustively search for the best combination of sizes for the k cells
• All possible combinations are listed in Gray-code order to minimize the overhead of incremental STA*
  • N (number of trials) = (number of size options)^k

[Figure: a window of k cells slides along the critical path (current window, next window); within a window all combinations (trial 1, trial 2, ..., trial N) are enumerated in Gray-code order and evaluated with incremental STA (iSTA), and the best move is picked]

* STA: Static Timing Analysis
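The benefit of Gray-code ordering is that consecutive trials differ in exactly one cell, so each incremental STA update only has to account for a single swap. The sketch below shows one way to enumerate such an order (a reflected n-ary Gray code) and pick the best trial. The function names and the `score` callback are hypothetical; a real flow would score each trial with incremental STA rather than a plain function.

```python
def gray_combinations(num_options, k):
    """Yield all num_options**k size assignments for k cells such that
    consecutive assignments differ in exactly one position."""
    if k == 0:
        yield ()
        return
    forward = True
    for prefix in gray_combinations(num_options, k - 1):
        # Sweep the last digit up, then down, then up again (reflection).
        digits = range(num_options) if forward else range(num_options - 1, -1, -1)
        for d in digits:
            yield prefix + (d,)
        forward = not forward


def best_in_window(window, num_options, score):
    """Exhaustively evaluate every assignment for the cells in `window`
    and return the best one as {cell: size_index}."""
    best = max(gray_combinations(num_options, len(window)), key=score)
    return dict(zip(window, best))


if __name__ == "__main__":
    for combo in gray_combinations(3, 2):
        print(combo)
    # Toy score (assumption): prefer size 2 for the first cell, size 1 for the second.
    print(best_in_window(["u1", "u2"], 3, lambda c: -(abs(c[0] - 2) + abs(c[1] - 1))))
```

With 3 size options and k = 2 this visits 3^2 = 9 trials, matching the trial count formula on the slide.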


4. Sensitivity Function: Timing Recovery
• Our sensitivity function takes into account
  • The direct impact of changing a given cell on its slack
  • The required increase in leakage power
  • The number of critical paths whose slack is improved
• Sensitivity function for timing recovery

  $$SF_{GTR} = \frac{\Delta\mathit{slack} \cdot \#\mathit{paths}}{\Delta\mathit{leakage\_power}^{\alpha}}$$

  • Δslack, Δleakage_power: slack and leakage power change from the cell change
  • #paths: the number of paths passing through the cell
  • α: leakage exponent
• Same as [7], but interconnect delay/slew is considered
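A direct transcription of the timing-recovery sensitivity, assuming Δleakage power is positive (an upsizing move) and that the per-move quantities are already available from the timer. The function name, the candidate tuples and their values are made up for illustration.

```python
def sf_gtr(delta_slack, delta_leakage, num_paths, alpha=1.0):
    """SF_GTR = (delta_slack * #paths) / (delta_leakage ** alpha).

    delta_leakage is assumed to be > 0 (upsizing increases leakage).
    """
    return delta_slack * num_paths / (delta_leakage ** alpha)


if __name__ == "__main__":
    # Hypothetical candidate moves: (cell, delta_slack ps, delta_leakage, #paths).
    moves = [("u1", 10.0, 4.0, 3), ("u2", 6.0, 1.0, 2), ("u3", 8.0, 16.0, 5)]
    ranked = sorted(moves, key=lambda m: sf_gtr(m[1], m[2], m[3], alpha=0.5), reverse=True)
    print([name for name, *_ in ranked])
```

A larger α penalizes leakage-hungry moves more strongly, so α acts as a knob between fast timing recovery and low power cost.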


Sensitivity Function: Leakage Reduction

• Multiple sensitivity functions from [7] are used
• Among the five SFs, the best SF is selected and used for the next optimization stage

SF1 = ∆leakage / ∆delay
SF2 = ∆leakage * slack
SF3 = ∆leakage / (∆delay * #paths)
SF4 = ∆leakage * slack / #paths
SF5 = ∆leakage * slack / (∆delay * #paths)
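The five leakage-reduction sensitivities can be written as a small lookup table, which makes it easy to run the same downsizing loop once per function and keep the best result. The dictionary keys and the `move` field names below are assumptions made for this sketch.

```python
# Each function takes a dict describing one candidate downsizing move:
#   d_leak:  leakage saved by the move
#   d_delay: delay increase caused by the move (assumed > 0)
#   slack:   current slack of the cell
#   paths:   number of paths through the cell (assumed > 0)
SENSITIVITY_FUNCTIONS = {
    "SF1": lambda m: m["d_leak"] / m["d_delay"],
    "SF2": lambda m: m["d_leak"] * m["slack"],
    "SF3": lambda m: m["d_leak"] / (m["d_delay"] * m["paths"]),
    "SF4": lambda m: m["d_leak"] * m["slack"] / m["paths"],
    "SF5": lambda m: m["d_leak"] * m["slack"] / (m["d_delay"] * m["paths"]),
}


def all_sensitivities(move):
    """Evaluate every sensitivity function on one candidate move."""
    return {name: sf(move) for name, sf in SENSITIVITY_FUNCTIONS.items()}


if __name__ == "__main__":
    print(all_sensitivities({"d_leak": 6.0, "d_delay": 2.0, "slack": 4.0, "paths": 3}))
```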


Outline

• Gate Selection in VLSI Design
• Previous Work
• Challenges in Gate Selection
• High-Performance Gate Selection with a Signoff Timer
• Overall Flow
  • Global Timing Recovery
  • Power Reduction with Feasible Timing
• Experimental Results
• Conclusions and Future Work


Overall Optimization Flow
• Overall flow: Global Timing Recovery (GTR) + Power Reduction with Feasible Timing (PRFT)

[Figure: flow from the routed netlist and SPEF to the final solution]
• Input: routed netlist, SPEF; cells are set to minimum size
• Global Timing Recovery
  • GTRwoST: find timing-feasible solution with an internal timer
  • GTRwST: find timing-feasible solution with a signoff timer
• Power Reduction w/ Feasible Timing
  • PRFT phase 1: leakage reduction with different sensitivity functions
  • PRFT phase 2: leakage reduction with kick-move
• Output: gate selection solution


GTR without Signoff Timer

• Objective: find timing-feasible solution with internal timer (no need for accurate timing information)
• Use guardband for the fast solution search
  • Multi-threaded GTR(α, γ) runs with a guardband (GB); if no run is feasible, increase the guardband
  • Result: timing-feasible solution (may be non-feasible with signoff timer)
• GTR procedure
  • STA
  • Calculate sensitivity (α)
  • Upsize γ% of cells in descending order of sensitivity
  • Incremental STA
  • Timing met? If not, repeat


GTR with Signoff Timer
• Objective: find timing-feasible solution with signoff timer
• Timing recovery is added to GTR flow
• Timing recovery procedure

[Figure: timing recovery loop with the steps cell upsizing, peephole and critical path optimization, and internal slack calibration (the signoff timer updates the slack offset); the loop repeats until the solution is timing-feasible]


PRFT with Sensitivity Functions
• Objective: find the best leakage solution
• Various sensitivity functions are tried sequentially
  • For each sensitivity function SFi: run SGGS(SFi); if the result is not feasible, apply timing recovery; then move to the next sensitivity function
  • Output: best solution and its sensitivity function (SF)
• SGGS procedure
  • Run static timing analysis
  • Calculate sensitivity for all cells
  • Downsize cell C with maximum sensitivity
  • Incremental STA
  • If slack(C) < 0, revert the change


PRFT: Speeding up Bottleneck Cells
• Speed up bottleneck cells: recover timing slack with minimum power impact
• To escape from a local optimum, γ% of bottleneck cells are upsized

[Figure: PRFT phase 2 starts from the best solution and best sensitivity function (SF) from PRFT phase 1; each kick move (LSMC) with ratio γ% is followed by SGGS(SF), and by timing recovery if the result is not feasible, before the next kick move]


Outline

• Gate Selection in VLSI Design
• Previous Work
• Challenges in Gate Selection
• High-Performance Gate Selection with a Signoff Timer
• Overall Flow
• Experimental Results
• Conclusions and Future Work


ISPD 2013 Gate Selection Contest

• Realistic benchmarks and constraints
  • Netlist (Verilog), parasitics (SPEF), timing constraint (SDC)
  • Max slew/load constraint
• Library: 11 logic functions, 30 cell types each (three Vth options and ten different sizes), 330 cells in total
• Leakage power of violation-free solutions is compared
• Final timing evaluation with a commercial signoff tool


Experimental Results: Power and Runtime Result
• Power and runtime comparison with contest best result (http://www.ispd.cc/contests/13/ISPD_2013_Contest_Final.pdf)

[Figure: normalized leakage power and normalized runtime of Contest Best vs. Trident2.0 on usb_phy_fast, usb_phy_slow, pci_b32_fast, pci_b32_slow, fft_fast, fft_slow, cordic_slow, des_perf_slow, edit_dist_fast, edit_dist_slow, matrix_m_slow, netcard_fast and netcard_slow]

No team found a feasible solution for netcard_fast


Experimental Results: Runtime Breakdown

[Figure: runtime breakdown]


Experimental Results: Optimization Trajectories
• Normalized TNS* and leakage power over GTR iterations
• After timing correlation, TNS increases due to discrepancy between internal timer and signoff timer

[Figure: normalized TNS and leakage vs. GTR iteration (0 to 30), covering GTR without signoff timer followed by GTR with signoff timer; the point after timing correlation is marked]

* TNS: Total Negative Slack


Experimental Results: Impact of Timing Calibration
• The minimum leakage without timing violation is achieved with calibration at every 5% cell change
  • No calibration → timing violation cannot be fixed
  • One calibration → leakage increase after timing recovery
  • Applying guardband (GB) → leakage overhead

[Figure: normalized leakage (%) and TNS (ps), for PRFT and after timing recovery, comparing calibration (5%), init calibration, no calibration, GB=5ps and GB=10ps; result of pci_b32_fast]


Conclusions and Future Work
• Trident 2.0: high-performance gate selection
  • Fast interconnect models with reasonable accuracy for an efficient internal timer
  • Calibration to a signoff timer with an interface to improve timing accuracy
  • Dedicated critical path optimization with heuristics
• ISPD 2013 gate selection contest
  • Trident 2.0 took 2nd and 1st places in two contest categories, respectively
• Future work
  • See if Lagrangian relaxation helps
  • Additional industry benchmarks


Thank you!