parallelization by simplification: a case study in vlsi placement myung-chul kim, dong-jin lee and...
Post on 20-Dec-2015
![Page 1: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/1.jpg)
PAPA2011, University of Michigan
Parallelization by SimPLification: A Case Study in VLSI Placement
Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov
Dept. of EECS, University of Michigan
![Page 2: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/2.jpg)
Complexities of Parallel Algorithms & SW
1. Objectives of parallelization
A. Improve completion time by using multiple cores in parallel
B. Improve throughput by using stream processing (latency may increase and become less predictable)
C. Improve power consumption (by decreasing the clock rate)
2. Not an objective (a pitfall)
− Come up with a slow algorithm that is easy to parallelize
■ In this talk: how to accomplish 1.A without 2
− Take a leading algorithm and speed up its bottlenecks
− Design a new algorithm that is (a) better, (b) easy to parallelize
![Page 3: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/3.jpg)
CAD Algorithms
■ Sequence of optimizations
− Subject to Amdahl’s law
− The more stages, the harder it is to parallelize effectively
■ Additional complications
− Elaborate data structures may entail overhead for parallel access
− When processing is light, memory bandwidth may become a bottleneck (with 4+ threads)
■ Recommendations
− A simpler algorithm is often easier to parallelize (fewer stages, simpler data structures)
− Using standard solvers, e.g., for linear algebra, helps reuse previous work on parallelization
![Page 4: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/4.jpg)
Global Placement: Motivation
■ Interconnect lagging in performance while transistors continue scaling
− Circuit delay, power dissipation and area dominated by interconnect
− Routing quality highly controlled by placement
■ Circuit size and complexity rapidly increasing
− Scalable placement algorithm is critical
− Simplicity, integration with other optimizations
[Figure labels: Unloaded, Coupling, IR drop, RC delay]
![Page 5: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/5.jpg)
Goals in Placement
■ Find good relative ordering of cells
− Minimize wire length and congestion
− Maximize timing slack
■ Find good spacing of cells
− Eliminate wiring congestion problems
− Provide space for post-placement stages
– clock trees
– buffer insertion
– timing correction
■ Find good global position
![Page 6: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/6.jpg)
A B C
Optimize Relative Order
![Page 7: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/7.jpg)
A B C
To spread ...
![Page 8: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/8.jpg)
A B C
... or not to spread
![Page 9: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/9.jpg)
A B C
Place to the left
![Page 10: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/10.jpg)
A B C
… or to the right
![Page 11: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/11.jpg)
A B C
Optimize Relative Order
Without whitespace, placement is dominated by ordering
![Page 12: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/12.jpg)
Example of Global Placement (APlace 2.04 from UCSD)
![Page 13: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/13.jpg)
Example of Global Placement (mFar from UCSB)
![Page 14: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/14.jpg)
Placement Formulation
■ Objective: Minimize estimated wirelength
− Half-perimeter wirelength (HPWL)
− (max X – min X) + (max Y – min Y)
■ Subject to constraints:
− Legality: Row-based placement with no overlaps
− Routability: Limiting local interconnect congestion for successful routing
− Timing: Meeting performance target of a design
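The HPWL objective above can be sketched in a few lines. This is a minimal illustration, not the evaluator used in the talk: the data layout (a `pins` dictionary of coordinates and `nets` as lists of pin names) and all names are assumptions made for the example.

```python
def hpwl(nets, pins):
    """Total half-perimeter wirelength over all nets.

    nets: list of nets, each a list of pin names (hypothetical layout).
    pins: dict mapping pin name -> (x, y) location.
    """
    total = 0.0
    for net in nets:
        xs = [pins[p][0] for p in net]
        ys = [pins[p][1] for p in net]
        # Half the perimeter of the net's bounding box.
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Illustrative data: one 3-pin net and one 2-pin net.
pins = {"a": (0.0, 0.0), "b": (3.0, 1.0), "c": (1.0, 4.0)}
nets = [["a", "b", "c"], ["a", "b"]]
total = hpwl(nets, pins)
```

Because the x and y terms are summed independently, the objective separates by axis.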
![Page 15: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/15.jpg)
Quadratic Placement
■ Consider a graph first, not a hypergraph
■ Minimize $\sum_{e_{ij}} \left[(x_i - x_j)^2 + (y_i - y_j)^2\right]$ (the sum is over edges $e_{ij}$)
− Seems unrelated to $\sum_{e_{ij}} \left(|x_i - x_j| + |y_i - y_j|\right)$ but can still be separated into x- and y-components
■ Physical analogy: Hooke’s law
− Consider an elastic spring, stretched by x
− Force $F = -kx$ (k is the spring constant)
− Energy $E = \tfrac{1}{2}kx^2$
− Our goal: minimize the energy of the system
A system of springs will only settle in a minimum
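The spring analogy can be tried on a tiny one-dimensional example. The sketch below is not the solver from the talk: it uses plain Gauss-Seidel relaxation, and the function names and the chain of two movable cells between two fixed pads are assumptions chosen for illustration. Each sweep moves a cell to the weighted mean of its neighbors, which is where the net spring force on it is zero.

```python
def relax_springs(movable, fixed, edges, sweeps=100):
    """Gauss-Seidel relaxation of a 1-D spring system.

    movable: dict name -> initial x of movable cells.
    fixed:   dict name -> x of fixed pads.
    edges:   list of (u, v, weight) springs.
    """
    pos = dict(fixed)
    pos.update(movable)
    neighbors = {name: [] for name in movable}
    for u, v, w in edges:
        if u in neighbors:
            neighbors[u].append((v, w))
        if v in neighbors:
            neighbors[v].append((u, w))
    for _ in range(sweeps):
        for name, nbrs in neighbors.items():
            total_w = sum(w for _, w in nbrs)
            if total_w > 0:
                # Zero net force: weighted mean of the neighbor positions.
                pos[name] = sum(w * pos[n] for n, w in nbrs) / total_w
    return pos


def energy(pos, edges):
    """Quadratic objective: sum of w * (x_u - x_v)^2 over all springs."""
    return sum(w * (pos[u] - pos[v]) ** 2 for u, v, w in edges)


# Chain: pad at 0 -- a -- b -- pad at 3, unit spring weights.
edges = [("left", "a", 1.0), ("a", "b", 1.0), ("b", "right", 1.0)]
result = relax_springs({"a": 0.0, "b": 0.0}, {"left": 0.0, "right": 3.0}, edges)
```

The cells settle at evenly spaced positions between the pads, the unique minimum of the quadratic objective for this chain.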
![Page 16: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/16.jpg)
Iterative Optimization
![Page 17: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/17.jpg)
Prior Work
■ Ideal Placer
− Low runtime without sacrificing solution quality
− Simplicity, integration with other optimizations
[Figure: placers plotted by Speed and Solution Quality]
− Non-convex optimization: mPL6, APlace2, NTUPlace3
− Quadratic and force-directed: mFAR, Kraftwerk2, FastPlace3
− Ideal placer
![Page 18: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/18.jpg)
Key features of SimPL
■ Flat quadratic placement
■ Primal-dual optimization
− Closing the gap between upper and lower bounds
[Figure: Wirelength vs. Iteration. Curves: Upper-Bound Solution by Look-ahead Legalization; Lower-Bound Solution by Linear System Solver. Marked points: Initial WL Opt.; Final Solution; Final Legal Solution]
![Page 19: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/19.jpg)
Common Analytical Placement Flow
[Flowchart]
Placement Instance → Initial WL Optimization → Global Placement → Converge?
− no: repeat Global Placement
− yes: Legalization and Detailed Placement
![Page 20: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/20.jpg)
SimPL Flow
[Flowchart]
Placement Instance
Initial WL Optimization (repeated until WL converges): B2B Graph Building, Linear System Solver
Global Placement (repeated until convergence): Look-ahead Legalization (Upper-Bound), Pseudonet Insertion, B2B Graph Building, Linear System Solver (Lower-Bound)
Legalization and Detailed Placement
B2B net model: [P. Spindler, et al, “Kraftwerk2 - A Fast Force-Directed Quadratic Placement Approach Using an Accurate Net Model,” TCAD 2008]
We delegate final legalization and detailed placement to FastPlace-DP [M. Pan, et al, “An Efficient and Effective Detailed Placement Algorithm”, ICCAD 2005]
![Page 21: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/21.jpg)
SimPL: Look-ahead Legalization
■ Purpose: Produces almost-legal placement (Upper-Bound) while preserving the relative cell ordering given by linear system solver (Lower-Bound)
■ Identify target region
− Find overflow bin b
− Create a minimal bin cluster B around b that is wide enough
■ Perform geometric top-down partitioning
− Find cell area median (Cc) and whitespace median (CB)
− Assign cells (Cc) to corresponding partitions (CB)
■ Non-linear scaling
− Form stripe regions
− Move cells across stripe regions in-order based on whitespace
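A heavily simplified one-dimensional sketch of the top-down partitioning step follows. It is not the SimPL implementation. It assumes an obstacle-free region, so the whitespace median CB is simply the midpoint of the region. It also places a lone cell at the center of its sub-region instead of performing per-stripe scaling. The function name and data layout are hypothetical.

```python
def lookahead_spread_1d(cells, lo, hi):
    """Spread cells over [lo, hi] while preserving their left-to-right order.

    cells: list of (x, width) pairs.
    Returns the new x centers, listed in the cells' sorted order.
    """
    cells = sorted(cells)
    if not cells:
        return []
    if len(cells) == 1:
        return [(lo + hi) / 2.0]
    total_area = sum(w for _, w in cells)
    # Cell-area median Cc: first index where cumulative area reaches half.
    acc = 0.0
    split = len(cells) - 1
    for i, (_, w) in enumerate(cells):
        acc += w
        if acc >= total_area / 2.0:
            split = i + 1
            break
    split = min(max(split, 1), len(cells) - 1)  # keep both sides non-empty
    # Whitespace median CB: midpoint, since no obstacles are modeled here.
    c_b = (lo + hi) / 2.0
    left = lookahead_spread_1d(cells[:split], lo, c_b)
    right = lookahead_spread_1d(cells[split:], c_b, hi)
    return left + right


# Four unit-width cells piled up near x = 4 in the region [0, 8].
clustered = [(4.0, 1.0), (4.1, 1.0), (4.2, 1.0), (4.3, 1.0)]
spread = lookahead_spread_1d(clustered, 0.0, 8.0)
```

Cells left of Cc go to the region left of CB and the rest go right, so the relative order from the lower-bound solution survives the spreading.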
![Page 22: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/22.jpg)
SimPL: Look-ahead Legalization (1)
Performing geometric top-down partitioning
[Figure labels: overfilled bin; bin cluster (B); cell-area median (Cc); whitespace median (CB); B0, B1]
![Page 23: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/23.jpg)
SimPL: Look-ahead Legalization (2)
[Figure labels: cell-area median (Cc); whitespace median (CB); B0]
![Page 24: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/24.jpg)
SimPL: Look-ahead Legalization (2)
[Figure: non-linear scaling. Labels: CB; obstacle borders; uniform cutlines; cell ordering; per-stripe linear scaling]
![Page 25: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/25.jpg)
SimPL: Look-ahead Legalization (3)
■Example (adaptec1)
Look-ahead legalization stops when target regions become small enough
![Page 26: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/26.jpg)
SimPL: Using legal locations as anchors
■ Purpose: Gradually perturb the linear system to generate lower-bound solutions with less overlap
■ Anchors and Pseudonets
− Look-ahead locations used as fixed, zero-area anchors
− Anchors and original cells connected with 2-pin pseudonets
− Pseudonet weights grow linearly with iterations
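The pull of a pseudonet can be seen in a one-cell, one-dimensional sketch. The notation is assumed here for illustration and does not come from the slides: $w_{ij}$ are net weights, $a_i$ is the anchor location produced by look-ahead legalization, $\alpha_k$ is the pseudonet weight at iteration $k$, and $c$ is a hypothetical growth constant.

```latex
\begin{align*}
\Phi_k(x_i) &= \sum_j w_{ij}\,(x_i - x_j)^2 + \alpha_k\,(x_i - a_i)^2,
  \qquad \alpha_k = c\,k, \\
\frac{\partial \Phi_k}{\partial x_i} = 0
  \;&\Longrightarrow\;
  x_i^{*} = \frac{\sum_j w_{ij}\,x_j + \alpha_k\,a_i}{\sum_j w_{ij} + \alpha_k}.
\end{align*}
```

With $\alpha_k = 0$ the cell sits at the weighted mean of its neighbors, the pure wirelength optimum. As $\alpha_k$ grows with the iteration count, $x_i^{*}$ moves toward the anchor $a_i$, so successive lower-bound solutions have less overlap.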
![Page 27: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/27.jpg)
Next illustration: Tug-of-war between low-wirelength and legalized placements
![Page 28: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/28.jpg)
SimPL Iterations on Adaptec1 (1)
[Figure panels: Iteration=0 (Init WL Opt.); Iteration=1 (Upper Bound); Iteration=2 (Lower Bound); Iteration=3 (Upper Bound)]
![Page 29: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/29.jpg)
SimPL Iterations on Adaptec1 (2)
[Figure panels: Iteration=10 (Lower Bound); Iteration=11 (Upper Bound); Iteration=20 (Lower Bound); Iteration=21 (Upper Bound)]
![Page 30: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/30.jpg)
SimPL Iterations on Adaptec1 (3)
[Figure panels: Iteration=30 (Lower Bound); Iteration=31 (Upper Bound); Iteration=40 (Lower Bound); Iteration=41 (Upper Bound)]
![Page 31: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/31.jpg)
Convergence of SimPL
■ Legal solution is formed between two bounds
![Page 32: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/32.jpg)
Empirical Results: ISPD05 Benchmarks
■ Experimental setup
− Single-threaded runs on a 3.2GHz Intel core i7 Quad CPU Q660 Linux workstation
− HPWL is computed by GSRC Bookshelf Evaluator
■ < 5000 lines of code in C++, including CG solver for sparse linear systems (with Jacobi preconditioner)
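For readers unfamiliar with the solver named above, here is a minimal sketch of conjugate gradient with a Jacobi (diagonal) preconditioner. It is a textbook formulation in pure Python, not the C++ code from the talk. The `matvec` callback interface and the 2x2 test system (two movable cells chained between fixed pads at 0 and 3) are assumptions for illustration.

```python
def pcg_jacobi(matvec, diag, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A.

    matvec: function returning A @ v as a list (hypothetical interface).
    diag:   diagonal of A, used as the Jacobi preconditioner.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                 # residual b - A x for x = 0
    z = [r[i] / diag[i] for i in range(n)]      # apply diag(A)^-1
    p = list(z)
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:
            break                               # residual norm small enough
        ap = matvec(p)
        alpha = rz / sum(p[i] * ap[i] for i in range(n))
        for i in range(n):
            x[i] += alpha * p[i]
            r[i] -= alpha * ap[i]
        z = [r[i] / diag[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Chain "pad(0) - a - b - pad(3)" with unit weights: 2a - b = 0, -a + 2b = 3.
A = [[2.0, -1.0], [-1.0, 2.0]]


def dense_matvec(v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


solution = pcg_jacobi(dense_matvec, [2.0, 2.0], [0.0, 3.0])
```

The solver touches the matrix only through `matvec`, which is why a faster sparse matrix-vector product speeds up the whole solve.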
![Page 33: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/33.jpg)
Speeding Up Placement Using Parallelism
■ SimPL has very few components (5KLOC)
■ Each bottleneck is amenable to some form of parallelism
− Thread-level
− Instruction-level
[Figure: runtime profile]
− Initial placement 8%
− CG solver 31%
− Sparse matrix and B2B net modeling 8%
− Look-ahead legalization 14%
− Pseudo-net insertion 1%
− Post Global Placement 38%
− IO 0%
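Amdahl's law turns the runtime profile on this slide into a rough ceiling on overall speedup. The sketch below rests on a strong assumption that the slides do not state: only the CG solver, B2B modeling and look-ahead legalization are parallelized, and they scale perfectly. The function and variable names are illustrative.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Upper bound on speedup when only parallel_fraction of runtime scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


# Runtime shares taken from the single-threaded profile on this slide.
profile = {
    "cg_solver": 0.31,
    "b2b_modeling": 0.08,
    "lookahead_legalization": 0.14,
}
# Assumption: these three components parallelize perfectly, nothing else does.
parallel_fraction = sum(profile.values())
bounds = {n: amdahl_speedup(parallel_fraction, n) for n in (2, 4, 8)}
limit = 1.0 / (1.0 - parallel_fraction)  # bound as the thread count grows
```

Under these assumptions the bound is about 1.9x at eight threads and about 2.1x in the limit, which shows why the stages left serial matter as much as the parallel ones.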
![Page 34: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/34.jpg)
Parallelism in Conjugate Gradient Solver
■ Coarse-grain row partitioning
− Implemented using OpenMP 3.0 compiler directives
■ SSE2 (Streaming SIMD Extensions 2) instructions
− Process 4 data elements with a single instruction
− Marginal runtime improvement in SpMxV (sparse matrix-vector multiplication)
■ Reducing memory bandwidth demand of SpMxV
− CSR (Compressed Sparse Row) format
Y. Saad, “Iterative Methods for Sparse Linear Systems,” SIAM 2003
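Row partitioning of a CSR matrix-vector product can be sketched as follows. The talk used OpenMP in C++. This Python version with a thread pool only illustrates the decomposition, and because of CPython's global interpreter lock it should not be expected to run faster. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def spmv_rows(values, col_idx, row_ptr, x, row_lo, row_hi):
    """Compute y[row_lo:row_hi] for a CSR matrix; rows are independent."""
    out = []
    for row in range(row_lo, row_hi):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


def spmv_parallel(values, col_idx, row_ptr, x, n_threads=2):
    """Split the rows into contiguous blocks, one block per task."""
    n_rows = len(row_ptr) - 1
    chunk = max((n_rows + n_threads - 1) // n_threads, 1)
    blocks = [(lo, min(lo + chunk, n_rows)) for lo in range(0, n_rows, chunk)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        parts = pool.map(
            lambda blk: spmv_rows(values, col_idx, row_ptr, x, *blk), blocks
        )
        return [v for part in parts for v in part]


# CSR form of [[2, -1, 0], [-1, 2, -1], [0, -1, 2]].
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
col_idx = [0, 1, 0, 1, 2, 1, 2]
row_ptr = [0, 2, 5, 7]
y = spmv_parallel(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

CSR stores only nonzeros plus one offset per row, so each row block reads a contiguous slice of `values` and `col_idx`, which keeps memory traffic low.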
![Page 35: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/35.jpg)
Parallelism in CG Solver - Example
![Page 36: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/36.jpg)
Parallelism in B2B Model Update
■ B2B net model update
− B2B model is separable
− Can process the x and y cases in parallel
− Additionally, split the nets of the netlist into equal groups that can be processed by multiple threads.
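Both levels of parallelism can be sketched together: the x and y models are built as two independent tasks, and within each axis the nets are split into groups. The edge weight used below, 1 / ((P − 1) · |xi − xj|) for a P-pin net, is the bound-to-bound form as assumed here and is not given on the slides. All function names are hypothetical, and pin names within a net are assumed distinct.

```python
from concurrent.futures import ThreadPoolExecutor


def b2b_edges(net, coord, eps=1e-9):
    """Bound-to-bound 2-pin edges for one net along one axis."""
    if len(net) < 2:
        return []
    lo = min(net, key=lambda p: coord[p])
    hi = max(net, key=lambda p: coord[p])
    if lo == hi:                        # all pins coincide on this axis
        hi = next(p for p in net if p != lo)
    scale = 1.0 / (len(net) - 1)

    def weight(a, b):
        return scale / max(abs(coord[a] - coord[b]), eps)

    edges = [(lo, hi, weight(lo, hi))]
    for p in net:
        if p != lo and p != hi:         # inner pins connect to both bounds
            edges.append((lo, p, weight(lo, p)))
            edges.append((p, hi, weight(p, hi)))
    return edges


def build_axis(nets, coord, n_groups=2):
    """Split the nets into groups and build each group's edges as one task."""
    groups = [nets[i::n_groups] for i in range(n_groups)]
    with ThreadPoolExecutor(max_workers=n_groups) as pool:
        parts = pool.map(
            lambda g: [e for net in g for e in b2b_edges(net, coord)], groups
        )
        return [e for part in parts for e in part]


def build_b2b(nets, xs, ys):
    """The model is separable: build the x and y edge lists concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fx = pool.submit(build_axis, nets, xs)
        fy = pool.submit(build_axis, nets, ys)
        return fx.result(), fy.result()


def quadratic_cost(edges, coord):
    return sum(w * (coord[a] - coord[b]) ** 2 for a, b, w in edges)


xs = {"a": 0.0, "b": 3.0, "c": 1.0}
ys = {"a": 0.0, "b": 1.0, "c": 4.0}
nets = [["a", "b", "c"], ["a", "b"]]
edges_x, edges_y = build_b2b(nets, xs, ys)
```

With this weight, the quadratic cost of the edges evaluated at the current placement equals the net's half-perimeter extent along that axis. The groups are independent because each net contributes its own edges.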
![Page 37: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/37.jpg)
SSE optimization affects Runtime Profile
[Figure: two runtime profiles]
Before (matches the single-threaded profile):
− Initial placement 8%
− CG solver 31%
− Sparse matrix and B2B net modeling 8%
− Look-ahead legalization 14%
− Pseudo-net insertion 1%
− Post Global Placement 38%
− IO 0%
After:
− Initial placement 5%
− CG solver 19%
− Sparse matrix and B2B net modeling 10%
− Look-ahead legalization 18%
− Pseudo-net insertion 1%
− Post Global Placement 46%
− IO 1%
![Page 38: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/38.jpg)
Parallelism in Look-ahead Legalization (1)
■Look-ahead legalization (LAL) started consuming a significant fraction of overall runtime
■Top-down geometric partitioning and non-linear scaling (T&N) are amenable to parallelization
− Top-down partitioning generates an increasing number of subtasks of similar sizes which can be solved in parallel
− After each level of T&N on a bin cluster, each thread generates two sub-clusters with similar numbers of cells
![Page 39: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/39.jpg)
Parallelism in Look-ahead Legalization (2)
■ LAL keeps the global queue of bin clusters Q
■ Static partitioning
− Assign initial bin clusters to available threads such that each thread has similar number of bin clusters to start
■Subtask updates
− Thread ti processes one of two sub-clusters (for the next level of T&N), the remainder is added to the global cluster queue Q
■Dynamic task scheduling
− When thread ti is idle, it dynamically retrieves clusters from the global cluster queue Q. The number of clusters to be retrieved N = max(Q.size()/N_threads, 1)
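The three scheduling rules above can be mimicked with a toy scheduler. This is a sketch under assumptions, not the SimPL code. Clusters are modeled as index ranges `(lo, hi)` that are halved until they are small, and the `pending` counter used to detect termination is a device added for this example. Only the batch size `max(len(Q) // n_threads, 1)` follows the slide.

```python
import threading
import time
from collections import deque


def run_lal_tasks(initial_clusters, n_threads=4, min_size=2):
    """Toy scheduler: clusters are (lo, hi) ranges that get split in half."""
    queue = deque()                        # global cluster queue Q
    lock = threading.Lock()
    pending = [len(initial_clusters)]      # clusters queued or in progress
    leaves = []

    def refine(lo, hi):
        # Subtask update: keep one sub-cluster, add the other to Q.
        while hi - lo > min_size:
            mid = (lo + hi) // 2
            with lock:
                queue.append((mid, hi))
                pending[0] += 1
            hi = mid
        with lock:
            leaves.append((lo, hi))
            pending[0] -= 1

    def worker(own):
        for lo, hi in own:                 # static partitioning
            refine(lo, hi)
        while True:                        # dynamic task scheduling
            with lock:
                if pending[0] == 0:
                    return
                take = max(len(queue) // n_threads, 1)
                count = min(take, len(queue))
                batch = [queue.popleft() for _ in range(count)]
            if not batch:
                time.sleep(0.0005)         # other threads may still add work
            for lo, hi in batch:
                refine(lo, hi)

    threads = [
        threading.Thread(target=worker, args=(initial_clusters[i::n_threads],))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(leaves)


leaves = run_lal_tasks([(0, 32), (32, 64)], n_threads=4)
```

Taking a share of the queue rather than a single cluster reduces how often idle threads contend for the lock, while the floor of 1 keeps them busy when the queue is short.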
![Page 40: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/40.jpg)
Empirical Results – Overall Speed-ups
■ Experimental setup
− Multithreaded runs on an 8-core AMD-based system with four dual-core CPUs and 16GByte RAM
− Each CPU was an Opteron 880 processor running at 2.4GHz with 1024KB cache
![Page 41: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/41.jpg)
Empirical Results – Component Speed-ups
![Page 42: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/42.jpg)
Empirical Results – Component Speed-ups
![Page 43: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/43.jpg)
Extending the Routability-driven Placement
■Ongoing work: simultaneous place-and-route
![Page 44: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/44.jpg)
Simultaneous Place-and-Route
■ After Look-Ahead Legalization (LAL) perform Look-Ahead Routing (LAR)
− Integrate an in-house router through clean API
− Cell locations in, accurate congestion maps out
− The placer accounts for congestion in addition to density (slightly modified formulas, almost no extra work)
■ ISPD 2011 contest organized by IBM Research
− New, large benchmarks
− Placements evaluated by a common global router
![Page 45: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/45.jpg)
SimPL SimPLR
■ Key metric is #overflows (OF)
■ Also shown: routed WL (RtWL)
![Page 46: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/46.jpg)
Conclusions
■ New flat quadratic placement algorithm: SimPL
− Novel primal-dual based approach
− Amenable to integration with physical synthesis
■ Self-contained, compact implementation
− Fastest among available academic placers
− Highly competitive solution quality
− Amenable to parallelism
− Easy to extend to simultaneous place-and-route
![Page 47: Parallelization by SimPLification: A Case Study in VLSI Placement Myung-Chul Kim, Dong-Jin Lee and Igor L. Markov Dept. of EECS, University of Michigan](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d4d5503460f94a2c28d/html5/thumbnails/47.jpg)
Questions and Answers
Thank you! Time for Questions