Page 1: Optimizing Compilers CISC 673 Spring 2011 Inlining

University of Delaware • Computer & Information Sciences Department

Optimizing Compilers
CISC 673
Spring 2011
Inlining

John Cavazos
University of Delaware

Page 2: Optimizing Compilers CISC 673 Spring 2011 Inlining

Background
• Inlining is important
  • Removes call overhead
  • Enables optimization opportunities
• Can be detrimental
  • Increased compilation time
  • Increased register pressure
  • Cache effects

Page 3: Optimizing Compilers CISC 673 Spring 2011 Inlining

Interprocedural Optimization
• Some optimizations are disrupted by calls
  • Constant propagation might stop at a call site
• Possible solution: interprocedural optimization
  • Optimization that involves more than one function
  • Gets complicated (e.g., when functions are not in the same file)

Page 4: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining
• Replace a function call with the body of the called function
• Assumed to be beneficial up to a certain point
• Enables optimizations
  • Constant folding, common subexpression elimination, better global register allocation
• The gain from the enabled optimizations can outweigh the gain from removing call overhead
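
As a small illustration of how inlining exposes constant folding, the sketch below hand-inlines a hypothetical `scale` method into its caller. The class, method names, and values are made up for this note and are not from the lecture; a real compiler would do this on its intermediate representation rather than on source code.

```java
// Hypothetical example: names and values are illustrative only.
class InlineFoldDemo {
    // Callee: so small that the call overhead can rival the work it does.
    static int scale(int x, int factor) {
        return x * factor + 1;
    }

    // Before inlining: the constants 4 and 8 stop at the call boundary
    // unless the compiler analyzes scale() interprocedurally.
    static int beforeInlining() {
        return scale(4, 8);
    }

    // Step 1, after inlining: the formals x and factor are replaced
    // by the actual arguments at the call site.
    static int afterInlining() {
        return 4 * 8 + 1;
    }

    // Step 2, after constant folding: the body collapses to one constant.
    static int afterFolding() {
        return 33;
    }

    public static void main(String[] args) {
        System.out.println(beforeInlining() + " " + afterInlining() + " " + afterFolding());
    }
}
```

All three methods return the same value; only the amount of work left for run time differs.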

Page 5: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Advantages
• Eliminates call disruption
  • No register save/restore required
• Call overhead removed
• Allows context-specific tailoring
• Eliminates call barrier for analysis/optimizations

Page 6: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Disadvantages
• Eliminates the benefits of calls
  • A call resets state for register allocation; inlining increases register pressure
  • Procedure calls (reuse) keep code size small
• Compilation time increases
  • Larger functions
• Code bloat

Page 7: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining for Object-Oriented Languages
• Plays a particularly important role in optimization of OO languages
  • High ratio of calls (and overhead)
  • Many methods are short (e.g., setter/getter)
• Issues mapping virtual calls to concrete implementations
  • Requires inserting a run-time type test
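
The run-time type test can be pictured with the following sketch of guarded inlining. The `Shape` hierarchy is hypothetical and exists only for this illustration, and the choice of `Square` as the predicted receiver type is an assumption (in practice it might come from profiling or class hierarchy analysis).

```java
// Hypothetical class hierarchy used only to illustrate guarded inlining.
class Shape {
    int area() { return 0; }
}

class Square extends Shape {
    final int side;
    Square(int side) { this.side = side; }
    @Override int area() { return side * side; }
}

class Circle extends Shape {
    final int radius;
    Circle(int radius) { this.radius = radius; }
    @Override int area() { return 3 * radius * radius; } // crude integer approximation
}

class GuardedInlineDemo {
    // Original code: a virtual call whose target depends on the run-time type.
    static int areaVirtual(Shape s) {
        return s.area();
    }

    // After guarded inlining, assuming Square is the predicted receiver type.
    // The exact-class test guards the inlined body; every other type falls
    // back to ordinary virtual dispatch.
    static int areaGuarded(Shape s) {
        if (s.getClass() == Square.class) {
            Square q = (Square) s;
            return q.side * q.side;   // inlined body of Square.area()
        }
        return s.area();              // fallback: virtual call
    }
}
```

The guard compares the exact class rather than using `instanceof`, because a subclass of `Square` could override `area()` and the inlined body would then be wrong for it.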

Page 8: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining example

Page 9: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining example

Page 10: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Transformation Easy
• Actual transformation is easy
  • Rewrite call site with callee’s body
  • Rewrite formal parameter names with actual parameter names
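
One detail the slide leaves implicit: renaming formals to actuals is only safe as plain text substitution when the actual is a simple variable or constant. If the actual has a side effect, it should be bound to a temporary so that it is still evaluated exactly once. The sketch below uses hypothetical names (`next`, `square`) to show the difference.

```java
// Hypothetical example of parameter substitution during inlining.
class SubstitutionDemo {
    static int counter = 0;

    // An argument expression with a side effect.
    static int next() { return ++counter; }

    // Callee whose formal parameter x is used twice.
    static int square(int x) { return x * x; }

    // Original call site: next() runs once.
    static int callSite() {
        return square(next());
    }

    // Naive textual substitution of the actual for the formal:
    // next() now runs twice, which changes the program's behavior.
    static int inlinedNaively() {
        return next() * next();
    }

    // Safe inlining: bind the actual to a fresh temporary first,
    // then use the temporary wherever the formal appeared.
    static int inlinedCorrectly() {
        int x = next();
        return x * x;
    }
}
```

Starting from `counter = 0`, `callSite()` and `inlinedCorrectly()` both return 1 and advance the counter once, while `inlinedNaively()` returns 2 and advances it twice.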

Page 11: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Decision Hard
• Resource constraint decision
  • Code size: must consider both the whole program and the procedure
  • Excessive code growth leads to excessive compilation time (important for JITs!)
• Profitability depends on specific context
  • Can the callee be tailored and optimized?
• Each decision affects profitability and resources available later!

Page 12: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Decision Hard
• Consider the following call graph [figure: call graph]
• Assign each edge a type {inline, no-inline}
• Choice at each edge affects other decisions
• Each decision has a profit and a cost (in terms of resources)
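
To make the profit/cost framing concrete, here is one simple possibility: a greedy pass that labels edges `inline` in order of profit per unit of cost until a code-size budget runs out. This is not the method from the lecture. The `CallEdge` fields, the budget, and the numbers are all assumptions for illustration, and the sketch deliberately ignores the interaction between decisions that makes the real problem hard.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical call-graph edge with an estimated profit and a code-growth cost.
class CallEdge {
    final String caller;
    final String callee;
    final double profit; // assumed estimate of the benefit of inlining this edge
    final int cost;      // assumed code growth if inlined; must be positive

    CallEdge(String caller, String callee, double profit, int cost) {
        this.caller = caller;
        this.callee = callee;
        this.profit = profit;
        this.cost = cost;
    }
}

class GreedyInliner {
    // Returns the edges labeled "inline"; all remaining edges are "no-inline".
    static List<CallEdge> choose(List<CallEdge> edges, int budget) {
        List<CallEdge> sorted = new ArrayList<>(edges);
        // Highest profit per unit of cost first.
        sorted.sort(Comparator.comparingDouble((CallEdge e) -> e.profit / e.cost).reversed());

        List<CallEdge> chosen = new ArrayList<>();
        int remaining = budget;
        for (CallEdge e : sorted) {
            if (e.cost <= remaining) {   // inline only if the budget allows it
                chosen.add(e);
                remaining -= e.cost;     // each decision shrinks what is left for later ones
            }
        }
        return chosen;
    }
}
```

A static greedy ranking like this treats profit and cost as fixed per edge. In a real compiler both change once a neighboring edge has been inlined, which is why the slide stresses that each choice affects the others.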

Page 13: Optimizing Compilers CISC 673 Spring 2011 Inlining

Inlining Decision Procedures
• Some decisions are obvious
  • Inline small procedures
    • Code smaller than linkage
  • Inline procedures called only once
• Still lots of experimental work to do!
  • Cavazos 2005, Waterman 2006
  • Cooper, Hall, & Torczon or Davidson & Holler

Page 14: Optimizing Compilers CISC 673 Spring 2011 Inlining

Adaptive Decision Making
• How should we determine a good decision heuristic?
• Cavazos proposed an adaptive solution
  • Train a heuristic
  • Specialized for a given hardware or benchmark
• Prior Art
  • Ad hoc (manually-constructed) heuristic based on program properties
  • Combine ad hoc heuristics into a single test applied at each call site, applied in a fixed order

Page 15: Optimizing Compilers CISC 673 Spring 2011 Inlining

Proposed Solution
• Use machine learning
  • Features predict which methods to inline
  • Heuristic function controls inlining
• Tune heuristic to:
  • Different compilation scenario
  • Different architecture

Page 16: Optimizing Compilers CISC 673 Spring 2011 Inlining

Applying Genetic Algorithms
• Cross-validation
  • Evolve heuristic over set of benchmarks
  • Test on a different set of benchmarks
  • Average high performance
• Self-validation
  • Evolve heuristic for one benchmark
  • Best performance for benchmark

Page 17: Optimizing Compilers CISC 673 Spring 2011 Inlining

High Performance Compiler
• IBM Jikes RVM
  • Java JIT Compiler
  • Tuned for Server Applications
• Commercial quality
• Used by Several Hundred Researchers
  • Over 100 Publications
  • Several papers on Inlining

Page 18: Optimizing Compilers CISC 673 Spring 2011 Inlining

Default Inlining Heuristic
• Small methods
  • Always inline
• Medium-sized methods
  • Use static heuristic (IBM)
• Large methods
  • Never inline

Page 19: Optimizing Compilers CISC 673 Spring 2011 Inlining

Default Inlining Heuristic

    if (calleeSize > CALLEE_MAX_SIZE) return NO
    if (calleeSize < ALWAYS_INLINE_SIZE) return YES
    if (inlineDepth > MAX_INLINE_DEPTH) return NO
    if (callerSize > CALLER_MAX_SIZE) return NO
    return YES
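
The pseudocode above can be written as a small runnable Java method. The four threshold names come from the slide; the numeric values below are placeholders chosen for this sketch, since the slide does not state them, and the class name is hypothetical.

```java
// Runnable rendering of the slide's heuristic. The threshold VALUES are
// placeholders for illustration; the slide gives only the names.
class DefaultInlineHeuristic {
    static final int CALLEE_MAX_SIZE = 20;     // assumed value
    static final int ALWAYS_INLINE_SIZE = 10;  // assumed value
    static final int MAX_INLINE_DEPTH = 5;     // assumed value
    static final int CALLER_MAX_SIZE = 2000;   // assumed value

    // Returns true for YES (inline) and false for NO, testing in the
    // same fixed order as the slide.
    static boolean shouldInline(int calleeSize, int callerSize, int inlineDepth) {
        if (calleeSize > CALLEE_MAX_SIZE) return false;   // large callee: never inline
        if (calleeSize < ALWAYS_INLINE_SIZE) return true; // small callee: always inline
        if (inlineDepth > MAX_INLINE_DEPTH) return false; // too deeply nested already
        if (callerSize > CALLER_MAX_SIZE) return false;   // caller has grown too large
        return true;                                      // medium callee, within limits
    }
}
```

Because the tests run in a fixed order, a small callee is inlined even when the depth or caller-size limits are exceeded; those two limits only affect medium-sized callees.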

Page 20: Optimizing Compilers CISC 673 Spring 2011 Inlining

Genetic Algorithms
• Tune parameters of IBM heuristic
• Individual
  • Vector of Integers
• Fitness is benchmark running time
• Tuning time
  • Few hours per benchmark
  • Few days per suite
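
A minimal sketch of such a tuning loop is shown below, under several assumptions. An individual is an `int[]` of heuristic parameters, as on the slide. The fitness function is passed in as `runningTime`; in the lecture's setting that would mean running a benchmark, but here any function can be supplied. The selection scheme (keep the faster half, refill with uniform crossover and a 10% mutation rate) and all names are choices made for this sketch, not details taken from the lecture.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Sketch of a genetic algorithm over integer parameter vectors.
// Requires populationSize >= 2. Lower running time is better.
class HeuristicTuner {
    static int[] evolve(ToDoubleFunction<int[]> runningTime, int genes, int maxValue,
                        int populationSize, int generations, long seed) {
        Random rng = new Random(seed);
        Comparator<int[]> byTime = Comparator.comparingDouble(runningTime);

        // Random initial population with each gene in [0, maxValue].
        int[][] population = new int[populationSize][genes];
        for (int[] individual : population) {
            for (int g = 0; g < genes; g++) {
                individual[g] = rng.nextInt(maxValue + 1);
            }
        }

        int survivors = populationSize / 2;
        for (int gen = 0; gen < generations; gen++) {
            // Fastest individuals first. A real tuner would cache fitness,
            // since each evaluation is a benchmark run.
            Arrays.sort(population, byTime);
            // Keep the faster half unchanged; replace the rest with children.
            for (int i = survivors; i < populationSize; i++) {
                int[] a = population[rng.nextInt(survivors)];
                int[] b = population[rng.nextInt(survivors)];
                int[] child = new int[genes];
                for (int g = 0; g < genes; g++) {
                    child[g] = rng.nextBoolean() ? a[g] : b[g];   // uniform crossover
                    if (rng.nextDouble() < 0.1) {                 // mutation
                        child[g] = rng.nextInt(maxValue + 1);
                    }
                }
                population[i] = child;
            }
        }
        Arrays.sort(population, byTime);
        return population[0];   // best parameter vector found
    }
}
```

For a four-parameter heuristic, a call might look like `HeuristicTuner.evolve(time, 4, 50, 20, 50, 42L)`, where `time` is a stand-in that maps a parameter vector to a measured or simulated running time. Because the faster half always survives, the best individual never gets worse from one generation to the next.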

Page 21: Optimizing Compilers CISC 673 Spring 2011 Inlining

Parameters Tuned by GA

Metric to Evaluate an Individual

Page 22: Optimizing Compilers CISC 673 Spring 2011 Inlining

Genetic Algorithms Primer

Page 23: Optimizing Compilers CISC 673 Spring 2011 Inlining

Scenarios and Metrics
• Scenarios
  • Adaptive
  • Optimizing
• Metrics
  • Running Time
  • Total Time

Page 24: Optimizing Compilers CISC 673 Spring 2011 Inlining

Experimental Setup
• High-Performance Java compiler
  • Jikes RVM 2.3.3
• Intel Pentium 4, 2.6 GHz
• PowerPC G4, 500 MHz (not shown)
• Training Set
  • SPEC JVM benchmarks
• Test Set
  • DaCapo benchmarks + SPEC JBB

Page 25: Optimizing Compilers CISC 673 Spring 2011 Inlining

Adaptive Scenario (SPEC JVM98)

Page 26: Optimizing Compilers CISC 673 Spring 2011 Inlining

Adaptive Scenario (DaCapo + JBB)

Page 27: Optimizing Compilers CISC 673 Spring 2011 Inlining

Optimizing Scenario (SPEC JVM98)

Page 28: Optimizing Compilers CISC 673 Spring 2011 Inlining

Optimizing Scenario (DaCapo + JBB)

Page 29: Optimizing Compilers CISC 673 Spring 2011 Inlining

Self-Tuned Results

Page 30: Optimizing Compilers CISC 673 Spring 2011 Inlining

Conclusions
• Outperforms well-tuned heuristic
  • 37% total time reduction on Intel
  • 7% total time reduction on PowerPC
• Automatically tunes compiler heuristic
  • Compilation Scenario
  • Different Architectures