Analytic Models and Empirical Search: A Hybrid Approach to Code Optimization

A. Epshteyn¹, M. Garzaran¹, G. DeJong¹, D. Padua¹, G. Ren¹, X. Li¹, K. Yotov², K. Pingali²

¹ University of Illinois at Urbana-Champaign
² Cornell University


Post on 31-Dec-2015




Two approaches to code optimization:

• Models
– E.g., calculate the best tile size for MM as a function of cache size.
– Fast
– May be inaccurate
– No verification through feedback

• Empirical Search
– E.g., execute and measure different versions of MM code with different tile sizes.
– Slow
– Accurate because of feedback


Hybrid Approach

• Faster than empirical search

• More accurate than the model
– Use the model as a prior
– Use active sampling to minimize the amount of searching


Why is Speed Important?

• Adaptation may have to be applied at runtime, where running time is critical.

• Adaptation may have to be applied at compile time (e.g., with feedback from a fast simulator)

• Library routines can be used as a benchmark to evaluate alternative machine designs.


Problem: Matrix Multiplication

• Tiling
– Improves the locality of references
• Cache blocking (NB): the matrix is decomposed into smaller sub-blocks of size NB × NB
• Matrix multiplication: an illustrative example for testing the hybrid approach

• Ultimate goal: a learning compiler that specializes itself to its installation environment, user profile, etc.
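As an illustration of cache blocking, here is a minimal Python sketch of a tiled matrix multiplication. The function names and the list-of-lists representation are assumptions made for this example and are not taken from the slides. Real libraries such as ATLAS generate tuned C kernels instead.

```python
def matmul_tiled(a, b, nb):
    """Multiply square matrices a and b (lists of lists) using NB x NB tiles."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    # Outer three loops walk over tiles; inner three loops work inside one tile.
    for ii in range(0, n, nb):
        for jj in range(0, n, nb):
            for kk in range(0, n, nb):
                # min() handles the ragged edge when nb does not divide n.
                for i in range(ii, min(ii + nb, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + nb, n)):
                        a_ik = row_a[k]
                        row_b = b[k]
                        for j in range(jj, min(jj + nb, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def matmul_naive(a, b):
    """Untiled reference implementation used to check the tiled version."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Both functions compute the same product. Only the order in which elements are touched changes, which is what improves locality once each NB × NB tile fits in cache.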


Empirical Search: ATLAS

• Try tiling parameters NB in the range $16 \ldots \min(80, \sqrt{\text{L1 cache size}})$ in steps of 4
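The exhaustive search over NB can be sketched as follows. This assumes the upper bound is the square root of the L1 cache size counted in matrix elements. The names `atlas_nb_candidates`, `empirical_search`, and the `measure` callback (which would build and time one kernel and return its MFLOPS) are hypothetical and not part of ATLAS.

```python
import math


def atlas_nb_candidates(l1_size_elems, lo=16, hi_cap=80, step=4):
    """Candidate NB values: 16 .. min(80, sqrt(L1 size)) in steps of 4."""
    hi = min(hi_cap, math.isqrt(l1_size_elems))
    return list(range(lo, hi + 1, step))


def empirical_search(candidates, measure):
    """Measure every candidate and keep the one with the best performance."""
    return max(candidates, key=measure)
```

Every candidate costs one full compile-and-run cycle, which is why this style of search is accurate but slow.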


Model (Yotov et al.)

• Compute NB which optimizes the use of the L1 cache.
• Constructed by analyzing the memory access trace of the matrix multiplication code.
• Formula:

$$\max NB \ \text{such that}\ \left\lceil \frac{NB^2}{L1_{\text{line size}}} \right\rceil + 3 \left\lceil \frac{NB}{L1_{\text{line size}}} \right\rceil + 1 \le \frac{L1_{\text{size}}}{L1_{\text{line size}}}$$

• Has been extended to optimize the use of the L2 cache
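The L1 model formula can be evaluated directly. Below is a minimal sketch, assuming the cache size and line size are given in the same unit (for example, matrix elements). The function name `model_nb` is an assumption made for this example.

```python
def model_nb(l1_size, l1_line_size):
    """Largest NB with ceil(NB^2 / line) + 3 * ceil(NB / line) + 1 <= size / line."""
    def ceil_div(x, y):
        return -(-x // y)

    def lines_needed(nb):
        return ceil_div(nb * nb, l1_line_size) + 3 * ceil_div(nb, l1_line_size) + 1

    budget = l1_size // l1_line_size  # number of cache lines available
    nb = 0
    # The left-hand side never decreases with NB, so stop at the first failure.
    while lines_needed(nb + 1) <= budget:
        nb += 1
    return nb
```

For instance, a cache holding 4096 elements with 8-element lines gives `model_nb(4096, 8) == 62`. No code is executed or timed, which is why the model is fast but cannot verify its own prediction.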


Model in action:

• Figure (not reproduced): performance curve, with vertical lines marking the model-predicted L1 and L2 blocking factors

• Whether to tile for the L1 or the L2 cache depends on the architecture and the application


Hybrid approach

• Model performance with a family of regression curves

• Regression (nonparametric)

– Minimize the average error

• Regression (ML, maximum likelihood)

– Distribution over regression curves

– Pick the most likely curve


Regression (Bayesian)

• Prior distribution P(curve) over regression curves
– Make regression curves with model-predicted maxima more likely
• Posterior distribution given the data (Bayes rule):
– P(curve|data) = P(data|curve) P(curve) / P(data)
• Pick the maximum a posteriori curve
– Picks curves with peaks in model-predicted locations when the data sample is small
– Picks curves which fit the data best when the sample is large
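The maximum a posteriori choice can be illustrated with a toy curve family. Everything specific here is an assumption made for the sketch and does not come from the slides: a one-peak Gaussian-shaped curve, a Gaussian prior centered on the model-predicted peak, Gaussian measurement noise, and a grid of candidate peak locations.

```python
import math


def curve(x, peak, height=1.0, width=20.0):
    """Hypothetical one-peak performance curve centered at `peak`."""
    return height * math.exp(-0.5 * ((x - peak) / width) ** 2)


def map_peak(data, model_peak, peaks, prior_sd=5.0, noise_sd=0.1):
    """Return the peak location maximizing log P(data|curve) + log P(curve)."""
    def log_posterior(peak):
        # Prior: curves peaking near the model prediction are more likely.
        log_prior = -0.5 * ((peak - model_peak) / prior_sd) ** 2
        # Likelihood: squared error between measurements and the curve.
        log_lik = sum(-0.5 * ((y - curve(x, peak)) / noise_sd) ** 2
                      for x, y in data)
        # P(data) does not depend on the curve, so it is dropped.
        return log_prior + log_lik
    return max(peaks, key=log_posterior)
```

With an empty `data` list the result is the candidate closest to `model_peak`. As more measurements are added, the likelihood term outweighs the prior and the result moves toward the peak that best fits the data.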


Active sampling

• Objectives:
1) Sample at lower tile sizes – takes less time
2) Explore – don’t oversample in the same region
3) Get information about the dominant peak


Solution: Potential Fields (objectives 1, 2)

• Positive charge at the origin

• Negative charges at previously sampled points

• Sample at the point which minimizes the field
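A minimal sketch of this sampling rule follows. The slides do not give the functional form of the field, so the inverse-distance terms, the charge magnitudes, and the `eps` softening constant are assumptions. The sign convention is also an assumption: the field is treated as the energy of a probe that is attracted to the positive charge at the origin and repelled by the negative charges at earlier samples.

```python
def next_sample(candidates, sampled, origin_charge=1.0, sample_charge=1.0, eps=1.0):
    """Pick the unsampled candidate tile size with the lowest field value."""
    def energy(x):
        # Attraction toward the origin: lower energy at small tile sizes.
        pull = -origin_charge / (x + eps)
        # Repulsion from earlier samples: higher energy close to them.
        push = sum(sample_charge / (abs(x - s) + eps) for s in sampled)
        return pull + push

    remaining = [x for x in candidates if x not in sampled]
    return min(remaining, key=energy)
```

With no earlier samples the smallest tile size is chosen. Later picks are pushed away from regions that have already been measured, and `origin_charge` controls how strongly cheap small tiles are preferred.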


Potential Fields (objective 3)

• Positive charge in the region of the dominant peak
• How do we know which peak dominates:
– Distribution over regression curves
• Can compute: P(peak1 is located at x), P(peak2 is located at x), P(peak1 is of height h), P(peak2 is of height h)
• Hence, can compute P(peak1 dominates peak2)
• Impose a positive charge in the region of each peak proportional to its probability of domination
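The domination probability can be sketched for a discrete distribution over curves. The representation is an assumption made for this example: each candidate curve is summarized as a hypothetical `(probability, peak1_height, peak2_height)` triple, and the function names are not from the slides.

```python
def domination_probabilities(curves):
    """Return (P(peak1 dominates peak2), P(peak2 dominates peak1)).

    curves: list of (probability, peak1_height, peak2_height) triples.
    """
    total = sum(p for p, _, _ in curves)
    p1 = sum(p for p, h1, h2 in curves if h1 > h2) / total
    p2 = sum(p for p, h1, h2 in curves if h2 > h1) / total
    return p1, p2


def peak_charges(curves, total_charge=1.0):
    """Positive charge for each peak, proportional to its domination probability."""
    p1, p2 = domination_probabilities(curves)
    return total_charge * p1, total_charge * p2
```

Early in the search the two probabilities are close, so both peak regions attract samples. As measurements accumulate, the charge concentrates on the peak that is more likely to dominate.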


Results I – Regression Curves


Results II – Time, Performance

Performance (MFLOPS)

         Model    Hybrid   ATLAS
Sparc    376.66   851.04   832.63
SGI      499.81   553.15   505.4

Time (mins)

         Model    Hybrid   ATLAS
Sparc    0:00     3:12     8:59
SGI      0:00     14:02    59:00

• Sparc – actual improvement due to the hybrid search for NB: ~10%
• SGI – improvement over both the model and ATLAS due to choosing to tile for the L2 cache


Results III – Library Performance


Conclusion

• Approach: incorporates the analytic model as a prior.

• Active sampling: picks the most informative region to sample next.

• Decreases the search time relative to pure empirical search and improves on the model’s performance.