Are Floorplan Representations Important in Digital Design?

H. H. Chan, S. N. Adya, I. L. Markov

The University of Michigan

Motivation

• Many FP representations have been proposed since sequence pair [Murata et al. '96]

• Papers emphasize better area-optimization results and appealing properties backed by mathematical results

• Interconnect is often more important in physical design, but is rarely discussed in the literature

Does the choice of FP reps matter in interconnect-driven floorplanning?

Outline of the Talk

• Common Practices in FP Research

• Background on Representations

• Evaluation Framework

• Results and Analysis

• Conclusions

Current Status Quo

• Prevailing algorithm: simulated annealing
  – Can optimize power, performance, etc.
• Most work focuses on FP representations
• Many FP reps exist, compared in terms of
  – Size of solution space
  – Area optimality (does the rep capture an area-optimal floorplan?)
  – Asymptotic complexity of “realizing” a floorplan
  – Complexity of incremental changes in annealing

Are these properties relevant in interconnect-driven floorplanning?
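As a reminder of what the prevailing algorithm looks like, here is a minimal, generic simulated-annealing loop. It is only a sketch: the geometric cooling schedule, the parameter values, and the names `anneal`, `swap_two` and `displacement` are assumptions made for illustration and are not taken from Parquet or any other floorplanner.

```python
import math
import random


def anneal(initial, cost, neighbor, t_start=1.0, t_end=1e-3,
           alpha=0.95, moves_per_temp=50, rng=None):
    """Generic annealer: returns the best state seen and its cost."""
    rng = rng or random.Random(0)
    current = best = initial
    current_cost = best_cost = cost(initial)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            candidate = neighbor(current, rng)
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Always accept improvements; accept uphill moves with
            # probability exp(-delta / t) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = current, current_cost
        t *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost


def swap_two(perm, rng):
    """Toy move: swap two positions of a permutation (returns a new tuple)."""
    i, j = rng.sample(range(len(perm)), 2)
    items = list(perm)
    items[i], items[j] = items[j], items[i]
    return tuple(items)


def displacement(perm):
    """Toy cost: total distance of each element from its sorted position."""
    return sum(abs(value - index) for index, value in enumerate(perm))


start = (4, 3, 2, 1, 0)
best, best_cost = anneal(start, displacement, swap_two)
```

In a floorplanner the state would be a representation such as a sequence pair, the move would perturb it, and the cost would combine area and wirelength of the realized floorplan.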

No Room For Improvement Left?

• MCNC benchmark suite is almost always used

• The suite has only 5 benchmarks, with <50 blocks
  – Area-optimal results on apte, xerox and hp
  – Extremely close results on ami33 and ami49
• Area optimization results are emphasized
  – Interconnect optimization is often ignored
• Temperature schedules are rarely reported
  – Hard to reproduce results

Contexts for Floorplanning

• Outline-free floorplanning
  – Minimize a combination of area and HPWL (half-perimeter wirelength)
  – Many floorplanners are evaluated in this framework
• Fixed-outline floorplanning
  – Minimize HPWL subject to a bounding box
  – More relevant to modern designs
• Large-scale floorplanning
  – Excellent area-packing results for >500 blocks, e.g., 10–20 copies of ami49
  – Hardly interesting w/o interconnect optimization

Families of FP Representations (1)

• Different FP reps can share the same solution space (“equivalent”)
  – Sequence pair, TCG, TCG-S
  – O-tree, B*-tree
  – Mosaic reps: CBL, Q-sequence, TBS, etc.
• Equivalent FP reps produce similar solutions and differ only in runtime
• We seek to compare solution spaces rather than specific representations

Families of FP Representations (2)

We study sequence pair and B*-tree, since
• They are the best studied in the literature
  – Broad extensions for various constraints, e.g. pre-placed blocks, rectilinear blocks, soft blocks
  – Hierarchical extensions for over 10K blocks (Parquet-in-Capo, MB*-tree)
• They are two “extremes” of representations
  – Sequence pair: largest solution space
  – B*-tree: smallest solution space

Sequence Pair (SP)

• Proposed by Murata et al. in 1996
• Encodes a floorplan using two permutations
• $O(n^2)$ time to realize a floorplan
  – $O(n \lg n)$ and $O(n \lg \lg n)$ algorithms exist
  – The $O(n^2)$-time algorithm is easy to implement and fast in practice
• Size of solution space: $(n!)^2$
• Captures some area-optimal solutions
• Equivalent to TCG, TCG-S
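The simple quadratic-time realization can be sketched as follows. It uses the usual sequence-pair reading: if a precedes b in both permutations, a is to the left of b; if a precedes b only in the first, a is above b. The function name `pack_sequence_pair` and the block sizes in the usage lines are hypothetical; this is not the Parquet code.

```python
def pack_sequence_pair(gamma_plus, gamma_minus, sizes):
    """Place blocks encoded by a sequence pair.

    sizes maps block -> (width, height).
    Returns ({block: (x, y)} lower-left corners, (width, height) of bbox).
    """
    pos_plus = {blk: i for i, blk in enumerate(gamma_plus)}
    coords = {}
    placed = []
    # Visiting blocks in gamma_minus order guarantees that every block
    # constraining `cur` (left of it or below it) is already placed.
    for cur in gamma_minus:
        x = y = 0.0
        for prev in placed:
            px, py = coords[prev]
            pw, ph = sizes[prev]
            if pos_plus[prev] < pos_plus[cur]:
                x = max(x, px + pw)   # prev is left of cur
            else:
                y = max(y, py + ph)   # prev is below cur
        coords[cur] = (x, y)
        placed.append(cur)
    width = max(coords[b][0] + sizes[b][0] for b in coords)
    height = max(coords[b][1] + sizes[b][1] for b in coords)
    return coords, (width, height)


# Made-up sizes for three blocks, packed with the pair <a b c> <c a b>.
sizes = {"a": (2, 1), "b": (2, 1), "c": (4, 1)}
coords, bbox = pack_sequence_pair("abc", "cab", sizes)
```

The doubly nested loop is what makes this $O(n^2)$; the asymptotically faster algorithms replace the inner scan with a longest-common-subsequence style computation.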

B*-tree

• Proposed by Chang et al. in 2000
• Encodes a floorplan by a binary tree and a permutation
• $O(n)$ time to realize a floorplan
• Size of solution space: $O(n! \cdot 2^{2n-2} / n^{1.5})$
• Captures some area-optimal floorplans
• All packings are compacted to the bottom
• Equivalent to O-tree
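A B*-tree can be realized with a preorder traversal: a left child is placed immediately to the right of its parent, a right child is placed above its parent at the same x, and each block drops to the lowest y that clears the blocks already placed. The sketch below is an assumption-laden illustration: it finds y by scanning all placed blocks, so it runs in quadratic time, whereas the $O(n)$ bound relies on a contour data structure that is omitted here. The names `pack_bstar_tree`, `children` and the example tree are hypothetical.

```python
def pack_bstar_tree(root, children, sizes):
    """Place blocks encoded by a labeled B*-tree.

    children maps block -> (left_child, right_child), with None for a
    missing child; sizes maps block -> (width, height).
    Returns {block: (x, y)} lower-left corners.
    """
    coords = {}
    placed = []
    stack = [(root, 0.0)]
    while stack:
        node, x = stack.pop()
        w, h = sizes[node]
        # Drop the block onto the highest placed block whose x-span
        # overlaps [x, x + w); a real implementation keeps a contour.
        y = 0.0
        for other in placed:
            ox, oy = coords[other]
            ow, oh = sizes[other]
            if ox < x + w and x < ox + ow:
                y = max(y, oy + oh)
        coords[node] = (x, y)
        placed.append(node)
        left, right = children.get(node, (None, None))
        if right is not None:
            stack.append((right, x))      # above the parent, same x
        if left is not None:
            stack.append((left, x + w))   # popped first: right of parent
    return coords


# Hypothetical 3-block tree: b is the left child of a, c the right child.
sizes = {"a": (2, 2), "b": (1, 1), "c": (3, 1)}
children = {"a": ("b", "c")}
coords = pack_bstar_tree("a", children, sizes)
```

Pushing the right child before the left one makes the stack visit the whole left subtree first, which is the preorder the representation assumes.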

Sequence Pair vs. B*-tree

SP captures low-interconnect floorplans that are not captured by B*-tree

• SP captures more floorplans than B*-tree
  – This packing is encoded by SP <a b c> <c a b>
  – Not captured by B*-tree

Evaluation Framework (1)

• Floorplanner Parquet [Adya et al., ICCD 2001]
  – Simulated annealer based on sequence pair
  – Competitive results in
    • Outline-free floorplanning
    • Fixed-outline floorplanning
    • Handling both hard and soft blocks
  – Embedded in placer Capo 9.0 for mixed-size placement (standard cells + large macros) [Adya et al., ICCAD 2004]

Evaluation Framework (2)

• Replace the sequence pair annealer by B*-tree, with minimal changes
  – Identical temperature schedule
  – Probabilities of applying moves are identical, except for those specific to B*-trees
  – Both versions are open source: http://vlsicad.eecs.umich.edu/BK/parquet/
• Report averages over 50 independent starts rather than best results
• Runtime breakdown: block packing vs. WL evaluation

Evaluation Framework (3)

Min-cut floorplacement with Capo
• Use min-cut partitioning whenever possible
• Otherwise, resort to annealing-based packing
  – Cluster standard cells into soft blocks
  – Fixed-outline floorplanning on clustered instances
  – Min-cut resumes on standard cells
• Outperforms competitive annealers on large FP instances (>100 blocks)
  – Clustered FP instances have up to 300 blocks, often <20
• Generates thousands of very different FP benchmarks

What to Expect?

• Wirelength evaluation is very CPU-intensive
• Count # floating-point ops in floorplan and wirelength evaluation per move (for SP)
  – Instrument the code by adding counters
• # ops in HPWL evaluation
  – Only count arithmetic ops, not assignments
• # ops in FP evaluation
  – Worst-case complexity $O(n^2)$ is too pessimistic
  – Runtime of SP evaluation fits $t = c\,n^{1.3}$
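The kind of instrumentation described above might look like the following sketch of an HPWL evaluator with an operation counter. The counting convention is an assumption: comparisons, additions and subtractions are counted, assignments are not. The function name `hpwl_with_op_count` and the pin data are hypothetical.

```python
def hpwl_with_op_count(nets, pin_xy):
    """Return (total HPWL, number of counted floating-point ops).

    nets is a list of pin-name lists; pin_xy maps pin name -> (x, y).
    """
    total = 0.0
    ops = 0
    for net in nets:
        first_x, first_y = pin_xy[net[0]]
        min_x = max_x = first_x
        min_y = max_y = first_y
        for pin in net[1:]:
            x, y = pin_xy[pin]
            ops += 4  # four comparisons per additional pin
            if x < min_x:
                min_x = x
            if x > max_x:
                max_x = x
            if y < min_y:
                min_y = y
            if y > max_y:
                max_y = y
        # Half-perimeter of the net's bounding box, added to the total.
        total += (max_x - min_x) + (max_y - min_y)
        ops += 4  # two subtractions and two additions
    return total, ops


pin_xy = {"p1": (0.0, 0.0), "p2": (3.0, 4.0),
          "p3": (1.0, 1.0), "p4": (2.0, 5.0), "p5": (4.0, 0.0)}
nets = [["p1", "p2"], ["p3", "p4", "p5"]]
total, ops = hpwl_with_op_count(nets, pin_xy)
```

Under this convention a net with k pins costs 4k counted operations, so the per-move HPWL cost grows with the total pin count of the netlist.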

Area Optimization (average performance reported, not best)

• B*-tree packs better than SP
• Similar for fixed-outline FP; B*-tree has a higher success rate
• SP beats B*-tree in evaluation time up to 200 blocks, despite their asymptotic complexities, $O(n^2)$ vs. $O(n)$

Outline-free FP (deadspace % / time-per-move)

          ami33           ami49            n100             n200              n300
SP        9.5% / 4.8ms    9.1% / 7.2ms     9.2% / 17.6ms    10.2% / 37.9ms    10.7% / 64.3ms
B*-tree   5.6% / 7.6ms    5.3% / 10.7ms    5.5% / 20.3ms    5.9% / 39.4ms     6.3% / 57.1ms

Area + HPWL Optimization

• Similar performance for SP and B*-tree in area, HPWL, and evaluation time
• Runtime dominated by HPWL evaluation
• Same trends for fixed-outline FP and soft blocks

Outline-free FP (deadspace % / HPWL / time-per-move)

          ami33                    ami49                     n100                      n200                     n300
SP        14.3% / 76e3 / 29.7ms    14.1% / 823e3 / 48.5ms    11.6% / 322e3 / 93.9ms    13.4% / 589e3 / 219ms    14.1% / 707e3 / 305ms
B*-tree   15.8% / 75e3 / 31.5ms    16.2% / 847e3 / 50.6ms    11.3% / 320e3 / 99.5ms    12.0% / 580e3 / 219ms    11.9% / 700e3 / 313ms

Min-Cut Floorplacement

• A wide range of FP benchmarks
  – Instances with both hard and soft blocks
  – Facilitate a rigorous empirical comparison
• Capo on the IBM-MSwPins benchmarks
  – Similar performance of SP and B*-tree (<1% difference in wirelength, <2.5% in runtime)
• Stand-alone Parquet on the generated instances
  – 2000+ fixed-outline instances with both hard and soft blocks
  – Block counts range from 1 to ~300, often <20
  – Similar performance of SP and B*-tree

Wirelength Evaluation Runtime (1)

• In both outline-free and fixed-outline FP, HPWL evaluation dominates runtime (~80%)

Outline-free FP (time-per-move, area-only / area+wire)

          ami33              ami49               n100                n200               n300
SP        4.8ms / 29.7ms     7.2ms / 48.5ms      17.6ms / 93.9ms     37.9ms / 219ms     64.3ms / 305ms
B*-tree   7.6ms / 31.5ms     10.7ms / 50.6ms     20.3ms / 99.5ms     39.4ms / 219ms     57.1ms / 313ms

Consistent with our analysis that WL evaluation should dominate runtime.

Wirelength Evaluation Runtime (2)

• Estimates are less accurate on larger benchmarks
  – Larger netlists may not fit into the processor cache

FP evaluation is never a runtime bottleneck!

% time-per-move spent on WL evaluation (analytical estimates versus actual measurements)

            ami33   ami49   n100   n200   n300
estimated   89%     88%     85%    80%    76%
actual      86%     87%     84%    85%    82%

Must Improve WL Evaluation Rather than FP Representations!

• Ideas for speed-ups
  – Special-case the evaluation of 2-pin nets: no loops, 2 floating-point comparisons only
  – Remove inessential nets, whose bounding box contains the outline
  – Conglomerate 2-pin nets between the same blocks into heavy-weight nets
• These techniques lead to a 10% speed-up without loss of quality in Parquet 3
• Parquet 4: another 10% speed-up (2x speed-up per move) with improved wirelength (longer temp. schedule)
  – Better use of cache (float versus double)
  – Simultaneous computation of min and max (25% fewer comparisons)
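Two of these ideas are easy to illustrate. The slides do not say how Parquet implements them, so the code below is only a sketch using standard techniques: a loop-free 2-pin evaluation, and the textbook pairwise scan that finds min and max together with about 1.5 comparisons per element instead of 2 (hence roughly 25% fewer). The names `hpwl_two_pin` and `min_max_pairwise` are hypothetical.

```python
def hpwl_two_pin(p, q):
    """HPWL of a 2-pin net: no loop, one abs() per axis."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def min_max_pairwise(values):
    """Return (min, max, comparisons) for a non-empty sequence.

    Elements are taken two at a time: one comparison orders the pair,
    then only the smaller is tested against the running min and only
    the larger against the running max.
    """
    n = len(values)
    comparisons = 0
    if n % 2:
        lo = hi = values[0]
        start = 1
    else:
        comparisons += 1
        if values[0] < values[1]:
            lo, hi = values[0], values[1]
        else:
            lo, hi = values[1], values[0]
        start = 2
    for i in range(start, n, 2):
        a, b = values[i], values[i + 1]
        comparisons += 3
        small, large = (a, b) if a < b else (b, a)
        if small < lo:
            lo = small
        if large > hi:
            hi = large
    return lo, hi, comparisons


lo, hi, used = min_max_pairwise([5, 1, 9, 3, 7, 2, 8, 4])
```

For 8 values the pairwise scan uses 10 comparisons, while two independent passes would use 14; the saving approaches 25% as the number of pins grows.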

Conclusions

• We compared the performance of SP and B*-tree in wirelength-driven floorplanning
  – Surprisingly similar performance!
• The main bottleneck is interconnect evaluation: >75% of runtime in realistic FP instances
  – Parquet-4: improved WL evaluation gives a tangible speed-up
• Asymptotic complexity of FP evaluation has little relevance
  – Realistic FP instances are small (<200 blocks)
  – Worst-case analysis is too pessimistic
• New FP reps seem irrelevant unless they support incremental wirelength evaluation
• The status quo in FP literature needs to be changed
