-1- uc san diego / vlsi cad laboratory construction of realistic gate sizing benchmarks with known...

26
-1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI CAD LABORATORY, UC San Diego International Symposium on Physical Design March 27 th , 2012

Upload: kevin-jordan

Post on 01-Jan-2016

216 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-1-UC San Diego / VLSI CAD Laboratory

Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions

Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions

Andrew B. Kahng, Seokhyeong Kang VLSI CAD LABORATORY, UC San Diego

International Symposium on Physical DesignMarch 27th, 2012

Page 2: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-2-

OutlineOutline

Background and Motivation Benchmark Generation Experimental Framework and Results Conclusions and Ongoing Work

Page 3: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-3-

Gate Sizing in VLSI DesignGate Sizing in VLSI Design

Gate sizing– Essential for power, delay and area

optimization

– Tunable parameters: gate-width, gate-length and threshold voltage

– Sizing problem seen in all phases of RTL-to-GDS flow

Common heuristics/algorithms– LP, Lagrangian relaxation, convex

optimization, DP, sensitivity-based gradient descent, ...

1. Which heuristic is better?

2. How suboptimal a given sizing solution is?

systematic and quantitative comparison is required

Page 4: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-4-

Suboptimality of Sizing HeuristicsSuboptimality of Sizing Heuristics Eyechart *

– Built from three basic topologies, optimally sized with DP – allow suboptimalities to be evaluated

– Non-realistic: Eyechart circuits have different topology from real design – large depth (650 stages) and small Rent parameter (0.17)

More realistic benchmarks are required along w/ automated generation flow

*Gupta et al., “Eyecharts: Constructive Benchmarking of Gate Sizing Heuristics”, DAC 2010.

Chain

MESHSTAR

Page 5: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-5-

Our Work: Realistic Benchmark Generation w/ Known Optimal Solution

Our Work: Realistic Benchmark Generation w/ Known Optimal Solution1.Propose benchmark circuits with known optimal

solutions2.The benchmarks resemble real designs

– Gate count, path depth, Rent parameter and net degree

3.Assess suboptimality of standard gate sizing approaches

Construct chains

Find optimal solution

Connect chainskeeping the optimal solution

Net

list g

ener

atorCharacteristic

parameters

Benchmark circuit

w/ known optimal solution

Real design

Extract parameters

Automated benchmark generation flow

Page 6: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-6-

OutlineOutline

Background and Motivation Benchmark Considerations and

Generation Experimental Framework and Results Conclusions and Ongoing Work

Page 7: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-7-

Benchmark ConsiderationsBenchmark Considerations Realism vs. Tractability to Analysis – opposing

goals To construct realistic benchmark: use design

characteristic parameters– # primary ports, path depth, fanin/fanout

distribution

To enable known optimal solutions– Library simplification as in Gupta et al. 2010:

slew-independent library

1 2 3 4 5 60

0.2

0.4

0.6

fanin fanout

design: JPEG Encoder

Fanin distirbution25%: 1-input60%: 2-input15%: >3-input

Path depth: 72Avg. net degree: 1.84Rent parameter: 0.72

Page 8: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-8-

Benchmark GenerationBenchmark Generation Input parameters

1. timing budget T2. depth of data path K3. number of primary ports N4. fanin, fanout distribution fid(i), fod(j)

Constraints– T should be larger than min. delay of K-stage

chain

Generation flow1. construct N chains with depth K2. attach connection cells (C )3. connect chains netlist with N*K + C cells

Page 9: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-9-

Benchmark Generation: Construct ChainsBenchmark Generation: Construct Chains

1. Construct N chains each with depth k (N*k cells)

2. Assign gate instance according to fid(i)3. Assign # fanouts to output ports according to

fod(o) Assignment strategy: arranged and random

chain1

chain2

chainN

...

stage1 stageK-1 stageK

gate(1,1)

gate(N,K)

Page 10: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-10-

Benchmark Generation: Construct ChainsBenchmark Generation: Construct Chains

1. Construct N chains each with depth k (N*k cells)

2. Assign gate instance according to fid(i)3. Assign # fanouts to output ports according to

fod(o) Assignment strategy: arranged and random

fanout fanin

Arranged assignment Random assignment

Page 11: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-11-

Benchmark Generation: Find Optimal Solution with DP Benchmark Generation: Find Optimal Solution with DP

1. Attach connection cells to all open fanouts- to connect chains keeping optimal solution

2. Perform dynamic programming with timing budget T

- optimal solution is achievable w/ slew-independent lib.

chain1

chain2

chainN

...

connection cellchain1

chain2

chainN

...

Page 12: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-12-

Benchmark Generation: Solving a Chain Optimally (Example)

Benchmark Generation: Solving a Chain Optimally (Example)

6

8 20 1

INV1 INV2 INV3

Dmax = 8

1 10 22 10 23 5 14 5 15 5 16 5 17 5 18 5 1

3 20 24 15 15 15 26 10 17 10 18 10 1

4 20 25 15 16 15 27 10 18 10 1

Stage 1 Stage 2 Stage 3

Stage 3Stage 1 Stage 2

Budget Power Size

Budget Power Size

Budget Power Size

Load= 3

Load= 6

Load= 3

Load= 6

size inputcap

leakage

power

delay

load 3

load 6

Size 1 3 5 3 4

Size 2 6 10 1 2

2 10 23 10 24 5 15 5 16 5 17 5 18 5 1

8 25 2

OPTIMIZED CHAIN

size 2 size 1 size 1

Page 13: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-13-

Benchmark Generation: Connect ChainsBenchmark Generation: Connect Chains

1. Run STA and find arrival time for each gate2. Connect each connection cell to open fanin port

- connect only if timing constraints are satisfied- connection cells do not change the optimal chain solution

3. Tie unconnected ports to logic high or low

c

g

wc,g

ac

ag

dgc

chain1

chain2

chainN

...

VDD

chain1

chain2

chainN

...

Page 14: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-14-

Benchmark Generation: Generated NetlistBenchmark Generation: Generated Netlist Generated output:

– benchmark circuit of N*K + C cells w/ optimal solution

Schematic of generated netlist (N = 10, K = 20)

Chains are connected to each other various topologies

Page 15: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-15-

OutlineOutline

Background and Motivation Benchmark Generation Experimental Framework and

Results Conclusions and Ongoing Work

Page 16: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-16-

Experimental SetupExperimental Setup Delay and Power model (library)

– LP: linear increase in power – gate sizing context

– EP: exponential increase in power – Vt or gate-length

Heuristics compared– Two commercial tools (BlazeMO, Cadence

Encounter)– UCLA sizing tool– UCSD sensitivity-based leakage optimizer

Realistic benchmarks: six open-source designs

Suboptimality calculationSuboptimality = powerheuristic - poweropt

poweropt

Page 17: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-17-

Generated Benchmark - ComplexityGenerated Benchmark - Complexity

Complexity (suboptimality) of generated benchmarkChain-only vs. connected-chain topologies

EP-40-20

EP-40-40

EP-80-20

EP-80-40

LP-40-20

LP-40-40

LP-80-20

LP-80-40

0.0%

5.0%

10.0%

15.0%

20.0%

chain-only

connected

EP-40-20

EP-40-40

EP-80-20

EP-80-40

LP-40-20

LP-40-40

LP-80-20

LP-80-40

-5.0%

0.0%

5.0%

10.0%

15.0%

20.0%chain-only

connected

Sub

opti

malit

y

Commercial tool Greedy

Chain-only: avg. 2.1% Connected-chain: avg. 12.8%

[library]-[N]-[k]

Page 18: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-18-

Generated Benchmark - ConnectivityGenerated Benchmark - Connectivity

Problem complexity and circuit connectivity1. Arranged assignment: improve connectivity

(larger fanin – later stage, larger fanout – earlier stage)

2. Random assignment: improve diversity of topologyarranged random unconnecte

dSubopt.

100% 0% 0.00% 2.60%

75% 25% 0.00% 6.80%

50% 50% 0.25% 10.30%

25% 75% 0.75% 11.20%

0% 100% 17.00% 7.70%

Page 19: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-19-

Suboptimality w.r.t. ParametersSuboptimality w.r.t. Parameters For different number of chains

40 80 160 320 6408%

9%

10%

11%

12%

13%

14%

1

10

100

1000

10000

subopt.(Comm)subopt.(Greedy)subopt.(SensOpt)runtime(Comm)runtime(Greedy)

number of chains

sub

opti

mal

ity

run

tim

e (m

in)

For different number of stages

20 40 60 80 1008%

9%

10%

11%

12%

13%

14%

1

10

100

1000

subopt.(Comm)subopt.(Greedy)subopt.(SensOpt)runtime(Comm)runtime(Greedy)runtime(SensOpt)

number of stages

sub

opti

mal

ity

run

tim

e (m

in)

Total # paths increase significantly w.r.t. N and K

Page 20: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-20-

Suboptimality w.r.t. Parameters (2)Suboptimality w.r.t. Parameters (2)

For different average net degrees

For different delay constraints

1.2 1.6 2 2.40%

20%

40%

60%

80%

100%

120%

0.1

1.0

10.0

100.0

1000.0

subopt.(Comm)subopt.(Greedy)subopt.(SensOpt)runtime(Comm)runtime(Greedy)

average net degree

sub

opti

mal

ity

run

tim

e (m

in)

0.4 0.5 0.6 0.7 0.8 0.9 1 1.10%

5%

10%

15%

20%

25%

0.1

1.0

10.0

100.0

subopt.(Comm)subopt.(Greedy)subopt.(SensOpt)runtime(Comm)runtime(Greedy)

timing constraint (ns)

sub

op

tim

alit

y

run

tim

e (m

in)

Page 21: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-21-

Generated Realistic BenchmarksGenerated Realistic Benchmarks

Target benchmarks– SASC, SPI, AES, JPEG, MPEG (from OpenCores)– EXU (from OpenSPARC T1)

Characteristic parameters of real and generated benchmarks

data depth

#instance

real designs generated

Rent param.

net degree

Rent param.

net degree

SASC 20 624 0.858 2.06 0.865 2.06

SPI 33 1092 0.880 1.81 0.877 1.80

EXU 31 25560 0.858 1.91 0.814 1.90

AES 23 23622 0.810 1.89 0.820 1.88

JPEG 72 141165 0.721 1.84 0.831 1.84

MPEG 33 578034 0.848 1.59 0.848 1.60

Page 22: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-22-

Suboptimality of HeuristicsSuboptimality of Heuristics Suboptimality w.r.t. known optimal solutions

for generated realistic benchmarks

Vt swap context –up to 52.2% avg. 16.3% eye-

chartSASC SPI AES EXU JPEG MPEG

0.00%

20.00%

40.00%

60.00%

Comm1 Comm2 Greedy SensOpt

eye-chart

SASC SPI AES EXU JPEG MPEG

-20.00%

0.00%

20.00%

40.00%

60.00%

Comm1 Comm2 Greedy SensOptGate sizing context –up to 43.7% avg. 25.5%

Suboptimality

* Greedy results for MPEG are missing

With EP library

With LP library

Page 23: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-23-

Comparison w/ Real Designs Comparison w/ Real Designs Suboptimality versus one specific heuristic (SensOpt)

Real designs and real delay/leakage library (TSMC 65nm) case Actual suboptimaltiy will be greater !

SASC SPI AES EXU JPEG MPEG-10.00%

0.00%

10.00%

20.00%

30.00%

40.00%

50.00%

Comm1 Comm2 Greedy

SASC SPI AES EXU JPEG MPEG-10.00%

0.00%

10.00%

20.00%

30.00%

40.00%

Comm1 Comm2 Greedy

Suboptimality from our benchmarks

Discrepancy: simplified delay model, reduced library set, ...

Page 24: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-24-

ConclusionsConclusions

A new benchmark generation technique for gate sizing construct realistic circuits with known optimal solutions

Our benchmarks enable systematic and quantitative study of common sizing heuristics

Common sizing methods are suboptimal for realistic benchmarks by up to 52.2% (Vt assignment) and 43.7% (sizing)

http://vlsicad.ucsd.edu/SIZING/

Page 25: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-25-

Ongoing WorkOngoing Work

Analyze discrepancies between real and artificial benchmarks

Handle more realistic delay model– Use realistic delay library in the context

of realistic benchmarks with tight upper bounds

Alternate approach for netlist generation– (1) cutting nets in a real design and find

optimal solution (2) reconnecting the nets keeping the optimal solution

Page 26: -1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI

-26-

Thank you