are floorplan representations important in digital design? h. h. chan, s. n. adya, i. l. markov the...

25
Are Floorplan Representations Important in Digital Design? H. H. Chan, S. N. Adya, I. L. Markov The University of Michigan

Upload: abraham-george

Post on 29-Dec-2015

219 views

Category:

Documents


3 download

TRANSCRIPT

Are Floorplan Representations Important in Digital Design?

H. H. Chan, S. N. Adya, I. L. Markov

The University of Michigan

Motivation

• Many FP representations have been proposed since sequence-pair [Murata et.al. ‘96]

• Emphasize better area optimization results, with appealing properties based on math. results

• Interconnect is often more important in physical design, but rarely discussed in the literature

Does the choice of FP reps matter in interconnect-driven floorplanning?

Outline of the Talk

• Common Practices in FP Research

• Background on Representations

• Evaluation Framework

• Results and Analysis

• Conclusions

Current Status Quo

• Prevailing algorithm: simulated annealing– Can optimize power, performance, etc

• Most work focuses on FP representations • Many FP reps exist, compared in terms of

– Size of solution space– Area optimality (capture area-optimal floorplan?)– Asymptotic complexity in “realizing” a floorplan– Complexity of incremental changes in annealing

Are these properties relevant in interconnect-driven floorplanning?

No Room For Improvement Left?

• MCNC benchmark suite is almost always used

• The suite has only 5 benchmarks, with <50 blks – Area-optimal results on apte, xerox and hp– Extremely close results on ami33 and ami49

• Area optimization results are emphasized– Interconnect optimization is often ignored

• Temperature schedules are rarely reported– Hard to reproduce results

Contexts for Floorplanning

• Outline-free floorplanning – Minimize a combination of area and HPWL– Many floorplanners are evaluated in this framework

• Fixed-outline floorplanning– Minimize HPWL subject to a bounding box– More relevant to modern designs

• Large scale floorplanning– Excellent area-packing results for >500 blks

e.g., 10 – 20 copies of ami49– Hardly interesting w/o interconnect optimization

Outline of the Talk

• Common Practices in FP Research

• Background on Representations

• Evaluation Framework

• Results and Analysis

• Conclusions

Families of FP Representations (1)

• Different FP reps can share the same solution space (“equivalent”)– Sequence pair, TCG, TCG-S– O-tree, B*-tree– Mosaic reps: CBL, Q-sequence, TBS etc

• Equivalent FP reps produce similar solutions, only differ in runtime

• We seek to compare solution spacesrather than specific representations

Families of FP Representations (2)

We study sequence pair and B*-tree, since• They are best studied in the literature

– Broad extensions for various constraintse.g. pre-placed blks, rectilinear blks, soft blks

– Hierarchical extensions for over 10K blocks(Parquet-in-Capo, MB*-tree)

• Two “extremes” of representations– Sequence pair: largest solution space– B*-tree: smallest solution space

Sequence Pair (SP)

• Proposed by Murata et al in 1996• Encodes a floorplan using two permutations

• O(n2) time to realize a floorplan– O(n lg n), O(n lg lg n) algos exist– O(n2) time algo is easy to

implement and fast in practice

• Size of solution space: n!2

• Some area-optimal solutions• Equivalent to TCG, TCG-S

B*-tree

• Proposed by Chang et al in 2000• Encodes a floorplan by a binary tree and a

permutation

• O(n)-time to realize a floorplan• Size of solution space:

O(n!22n-2 / n1.5)• Some area-optimal floorplans• All packings are compacted to

the bottom• Equivalent to O-tree

Sequence Pair vs. B*-tree

SP captures low interconnect floorplans, that are not captured by B*-tree

• SP captures more floorplans than B*-tree– This packing is encoded

by SP <a b c> <c a b>– Not captured by B*-tree

Outline of the Talk

• Common Practices in FP Research

• Background on Representations

• Evaluation Framework

• Results and Analysis

• Conclusions

Evaluation Framework (1)

• Floorplanner Parquet [Adya et.al ICCD 2001]

– Simulated annealer based on sequence pair– Competitive results in

• Outline-free floorplanning• Fixed-outline floorplanning• Handling both hard and soft blocks

– Embedded in placer Capo 9.0 for mixed-sized placement (standard cells + large macros) [Adya et.al ICCAD 2004]

Evaluation Framework (2)

• Replace the sequence pair annealer by B*-tree, with minimal changes– Identical temperature schedule– Probabilities of applying moves are identical,

except for those specific to B*-trees– Both versions are open source

http://vlsicad.eecs.umich.edu/BK/parquet/

• Report averages over 50 independent startsrather than best results

• Runtime breakdown: Block-packing vs WL eval.

Evaluation Framework (3)

Min-cut floorplacement with Capo• Use min-cut partitioning whenever possible• Otherwise, resort to annealing-based packing

– Cluster standard cells into soft blocks– Fixed-outline floorplanning on clustered

instances– Min-cut resumes on standard cells

• Outperforms competitive annealers on large FP instances (> 100 blocks)– Clustered FP instances have up to 300 blks, often <20

• Generates thousands of very different FP benchmarks

What to Expect?

• Wirelength evaluation is very CPU-intesive

• Count # floating-point ops in floorplan and wirelength evaluation per move (for SP)– Instrument the code by adding counters

• # ops in HPWL evaluation– Only count arithmetic ops,

not assignments

• # ops in FP evaluation– Worst-case complexity O(n2)

is too pessimistic – Runtime of SP eval

fits to cn1.3

t = cn1.3

Outline of the Talk

• Common Practices in FP Research

• Background on Representations

• Evaluation Framework

• Results and Analysis

• Conclusions

Area Optimization (average performance reported, not best)

• B*-tree packs better than SP• Similar for fixed-outline FP, B*-tree: higher success rate• SP beats B*-tree in evaluation time up to 200 blks,

despite their asymptotic complexities --- O(n2) vs. O(n)

ami33 ami49 n100 n200 n300

SP 9.5% 4.8ms

9.1% 7.2ms

9.2% 17.6ms

10.2% 37.9ms

10.7% 64.3ms

B*-tree 5.6% 7.6ms

5.3% 10.7ms

5.5% 20.3ms

5.9% 39.4ms

6.3% 57.1ms

Outline-free FP (deadspace % / time-per-move)

Area + HPWL Optimization

• Similar performance for SP and B*-tree in area, HPWL, and evaluation time

• Runtime dominated by HPWL evaluation • Same trends for fixed-outline FP and soft blocks

ami33 ami49 n100 n200 n300

SP 14.3% 76e3 29.7ms

14.1% 823e3 48.5ms

11.6% 322e3 93.9ms

13.4% 589e3 219ms

14.1% 707e3 305ms

B*-tree 15.8% 75e3 31.5ms

16.2% 847e3 50.6ms

11.3% 320e3 99.5ms

12.0% 580e3 219ms

11.9% 700e3 313ms

Outline-free FP (deadspace % / HPWL / time-per-move)

Min-Cut Floorplacement

• A wide range of FP benchmarks– Instances with both hard and soft blocks – Facilitate a rigorous empirical comparison

• Capo on the IBM-MSwPins benchmarks – Similar performance of SP and B*-tree

(<1% diff in wirelength, <2.5% in runtime)

• Stand-alone Parquet on the generated instances– 2000+ fixed-outline instances with both hard and soft blocks– Block counts range from 1 to ~300, often <20 – Similar performance of SP and B*-tree

Wirelength Evaluation Runtime (1)

• In both outline-free and fixed-outline FP, HPWL evaluation dominates runtime (~80%)

Outline-free (TPM area-only / area+wire)ami33 ami49 n100 n200 n300

SP 4.8ms 29.7ms

7.2ms 48.5ms

17.6ms 93.9ms

37.9ms 219ms

64.3ms 305ms

B*-tree 7.6ms 31.5ms

10.7ms 50.6ms

20.3ms 99.5ms

39.4ms 219ms

57.1ms 313ms

Consistent with our analysis that WL evaluation should dominate runtime.

Wirelength Evaluation Runtime (2)

• Estimates less accurate in larger benchmarks– Larger netlists may not fit into processor cache

FP evaluation is never a runtime bottleneck!

ami33 ami49 n100 n200 n300

estimated 89% 88% 85% 80% 76%

actual 86% 87% 84% 85% 82%

% time-per-move spent on WL evaluation

(analytical estimates versus actual measurements)

Must Improve WL EvaluationRather than FP Representations!

• Ideas for speed-ups– Special-case the evaluation of 2-pin nets:

no loops, 2 floating-point comparisons only– Remove inessential nets, whose bounding box contains the outline– Conglomerate 2-pin nets between the same blocks

into heavy-weight nets

• These techniques lead to 10% speed-upwithout loss of quality in Parquet 3

• Parquet 4: another 10% speed-up (2x speed-up per move) with improved wirelength (longer temp. schedule)– Better use of cache (float versus double)– Simultaneous computation of min and max

(25% less comparisons)

Conclusions

• We compared the performance ofSP and B*-tree in wirelength-driven floorplanning – Surprisingly similar performance!

• The main bottleneck is interconnect evaluation:

>75% runtime in realistic FP instances– Parquet-4: Improved WL eval. tangible speed-up

• Asymptotic complexity of FP eval. has little relevance – Realistic FP instances are small (<200 blks)

– Worst-case analysis is too pessimistic

• New FP reps seem irrelevant unlessthey support incremental wirelength evaluation

• The status quo in FP literature needs to be changed

t = cn1.3