Presented by: Sameer Kulkarni, Dept of Computer & Information Sciences, University of Delaware
CISC673 – Optimizing Compilers 1/34
• Presented by: Sameer Kulkarni
• Dept of Computer & Information Sciences
• University of Delaware
Phase Ordering
Optimization??
does it really work??
[Figure: SpecJVM98 relative speedup (y-axis, 0 to 2) at optimization levels O1, O2 and O3 for _209_db, _213_javac, _222_mpegaudio, _202_jess, _201_compress, _205_raytrace and _228_jack]
No. of optimizations
• O64 = 264 (on last count)
• JikesRVM = 67
Search space
• Consider a hypothetical case where we apply 40 optimizations
• O64: 3.98 x 10^47
• Jikes: 4.1 x 10^18
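Both figures match the number of ways to choose 40 optimizations out of the available set without regard to order, i.e. the binomial coefficient C(n, 40). The slide does not say how the numbers were derived, so that reading is an assumption. A minimal Python check under that assumption (the names `APPLIED`, `AVAILABLE` and `search_space` are illustrative):

```python
import math

# Assumption: the slide counts unordered selections of 40 optimizations,
# C(n, 40). The slide itself does not state the formula.
APPLIED = 40
AVAILABLE = {"O64": 264, "JikesRVM": 67}

def search_space(n_available: int, n_applied: int = APPLIED) -> int:
    """Number of ways to pick n_applied optimizations out of n_available."""
    return math.comb(n_available, n_applied)

for name, n in AVAILABLE.items():
    print(f"{name}: {search_space(n):.2e}")
```

Counting ordered sequences instead of unordered selections would multiply each figure by 40!, so the real phase-ordering space is larger still.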
Could take a while
• Considering the smaller problem, assume that running all the benchmarks once takes just 1 second
• Jikes would take: 130.2 billion years
• Age of the universe: 13 billion years
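The estimate can be reproduced with a few lines of arithmetic. This sketch assumes one benchmark run per selection of 40 out of 67 optimizations, runs executed one after another, and a 365-day year; none of these details are stated on the slide.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year

# Assumption: one evaluation per selection of 40 out of 67 optimizations,
# each evaluation taking 1 second, with no parallelism.
evaluations = math.comb(67, 40)
seconds_per_evaluation = 1

years = evaluations * seconds_per_evaluation / SECONDS_PER_YEAR
print(f"{years / 1e9:.1f} billion years")
```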
Some basic Optimizations
• Common Subexpression Elimination (CSE)
• Loop Unrolling
• Local Copy Propagation
• Branch Optimizations
• ...
Example
for (int i = 0; i < 3; i++) {
    a = a + i + 1;
}
Loop Unrolling
CSE
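The intermediate code for the two steps named above did not survive in the transcript. The following Python sketch mirrors the loop and shows one plausible outcome of each step. The function names are illustrative, and the final folded form is an assumption about what the slide intended to show.

```python
def original(a: int) -> int:
    # Direct translation of the slide's loop.
    for i in range(3):
        a = a + i + 1
    return a

def unrolled(a: int) -> int:
    # Loop unrolling: the three iterations written out, with i replaced
    # by its known value in each copy.
    a = a + 0 + 1
    a = a + 1 + 1
    a = a + 2 + 1
    return a

def folded(a: int) -> int:
    # Assumption: later passes fold the constants into a single addition.
    return a + 6

# All three versions compute the same result.
for start in (0, 5, -3):
    assert original(start) == unrolled(start) == folded(start)
```

The point of the example is that the second step only becomes possible after the first: with the loop intact, there are no constant operands to combine.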
Instruction Scheduling vs Register Allocation
• Instruction Scheduling (IS): maximizing parallelism
• Register Allocation (RA): minimizing register spilling
Phase Ord. vs Opt Levels
• Opt Levels ~ Timing Constraints
• Phase ordering ~ code interactions
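A toy illustration of such a code interaction: two passes applied to the same three-statement program give different results depending on their order. The tuple-based program representation and both pass implementations are hypothetical and assume each variable is assigned only once.

```python
# A program is a list of (target, operand) pairs; target "ret" returns operand.
# Operands are either integer constants or variable names (strings).
PROGRAM = [("x", 5), ("y", "x"), ("ret", "y")]

def propagate(program):
    """Constant/copy propagation: replace a variable by its known value."""
    known = {}
    out = []
    for target, operand in program:
        operand = known.get(operand, operand)
        if target != "ret":
            known[target] = operand
        out.append((target, operand))
    return out

def eliminate_dead(program):
    """Dead code elimination: drop assignments whose target is never read."""
    live = set()
    out = []
    for target, operand in reversed(program):
        if target == "ret" or target in live:
            live.discard(target)
            if isinstance(operand, str):
                live.add(operand)
            out.append((target, operand))
    out.reverse()
    return out

dce_then_prop = propagate(eliminate_dead(PROGRAM))
prop_then_dce = eliminate_dead(propagate(PROGRAM))
print(len(dce_then_prop), "statements vs", len(prop_then_dce), "statement")
```

Running dead code elimination first finds nothing to remove, because every assignment still feeds the return. Running propagation first makes both assignments dead, so the second pass can delete them.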
Whimsical??
• Opt X would like to go before Opt Y, but not always.
[Figure: series Best db, Best javac, Best mpegaudio, Best jess, Best raytrace and Best compress, plotted for _209_db, _213_javac, _222_mpegaudio, _202_jess, _205_raytrace and _201_compress; y-axis from 0.91 to 1.01]
Ideal Solution?
• Oracle: the perfect sequence, known at the very start
• Wise Man solution: given the present code, predict the best optimization solution
Wise Man?
• Understand Compilers
• Optimizations
• Source Code
Possible Solutions
• Pruning the search space
• Genetic Algorithms
• Estimating running times
• Precompiled choices
Pruning Search space
Fast and Efficient Searches for Effective Optimization Phase Sequences, Kulkarni et al. TACO 2005
Optimization Profiling
Fast and Efficient Searches for Effective Optimization Phase Sequences, Kulkarni et al. TACO 2005
Genetic Algorithms
Fast Searches for Effective Optimization Phase Sequences, Kulkarni et al. PLDI '04
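As a rough illustration of the general idea, and not the algorithm from the cited paper, a generic genetic algorithm over phase sequences might look like the sketch below. The phase names, the sequence length, and `toy_fitness` are hypothetical; a real search would replace `toy_fitness` with compiling and timing a benchmark.

```python
import random

PHASES = ["cse", "unroll", "copy_prop", "branch_opt", "dce"]  # hypothetical names
SEQ_LEN = 8

def toy_fitness(seq):
    """Hypothetical stand-in for 'compile, run, and time the benchmark'.
    Rewards 'unroll' immediately followed by 'cse' and penalizes
    repeating the same phase back to back."""
    score = 0
    for prev, cur in zip(seq, seq[1:]):
        if (prev, cur) == ("unroll", "cse"):
            score += 2
        if prev == cur:
            score -= 1
    return score

def evolve(generations=30, pop_size=20, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.choice(PHASES) for _ in range(SEQ_LEN)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=toy_fitness, reverse=True)
        history.append(toy_fitness(pop[0]))
        survivors = pop[: pop_size // 2]      # elitism: keep the better half
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, SEQ_LEN)   # one-point crossover
            child = mom[:cut] + dad[cut:]
            child = [rng.choice(PHASES) if rng.random() < mutation_rate else p
                     for p in child]          # per-gene mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=toy_fitness)
    return best, history

best, history = evolve()
print(best, toy_fitness(best))
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next.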
Exhaustive vs Heuristic [2]
Disadvantages
• Benchmark Specific
• Architecture dependent
• Code disregarded
Improvements
• Profiling the application
• Understand the code
• Understanding optimizations
• Continuous evaluation of transformations
Proposed solution
• Input = Code Features
• Output = Running time
• Evolve Neural Networks
Proposed solution
Experimental Setup
• Neural Network Evolver (ANJI)
• Training Set { javaGrande }
• Testing Set { SpecJVM, Da Capo }
ANJI
• Mutating & generating networks
• Network phase ordering
• Timing information
• Scoring the network
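One way to picture how a network could drive phase ordering and then be scored is sketched below. It is not ANJI's API or the presenter's implementation. The single-layer network, the feature vector `[loops, copies, bias]`, the weights, `update_features`, and the speedup-based `fitness` are all assumptions made for illustration.

```python
import math

OPTS = ["cse", "unroll", "copy_prop", "stop"]  # hypothetical network outputs

def forward(weights, features):
    """Single-layer network: one tanh output per entry in OPTS."""
    return [math.tanh(sum(w * f for w, f in zip(row, features)))
            for row in weights]

def order_phases(weights, features, update_features, max_steps=10):
    """Ask the network for the next optimization until it answers 'stop'."""
    sequence = []
    for _ in range(max_steps):
        scores = forward(weights, features)
        choice = OPTS[scores.index(max(scores))]
        if choice == "stop":
            break
        sequence.append(choice)
        features = update_features(features, choice)
    return sequence

def update_features(features, choice):
    """Hypothetical effect of a pass on the features [loops, copies, bias]."""
    loops, copies, bias = features
    if choice == "unroll":
        loops = 0
    elif choice == "copy_prop":
        copies = 0
    return [loops, copies, bias]

def fitness(baseline_seconds, network_seconds):
    """Score a network by its speedup over a baseline run."""
    return baseline_seconds / network_seconds

# Hand-picked weights, one row per entry in OPTS (an evolver would find these).
WEIGHTS = [[0, 0, -1], [1, 0, 0], [0, 1, 0], [0, 0, 0.1]]
print(order_phases(WEIGHTS, [2, 1, 1.0], update_features))
```

In an evolutionary setup, the measured `fitness` of each candidate network would decide which networks are mutated and carried into the next generation.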
Training Phase
• Generations and Chromosomes
• Random chromosomes
• Back Propagation
• Add/Remove/Update hidden nodes
Experimental Setup
Network Evolution
[Figure: ANJI evolution over generations. x-axis: Generation (0 to 1000); y-axis: Speedups (0.94 to 1.1); series: Max Fitness, O3 (rough)]
Network Evolution
javaGrande
• Set of very small benchmarks
• Low running times
• Memory management
• Machine Architecture
Testing
• SpecJVM'98 & Da Capo
• Champion network
• Running times
Present Solution
Implementation in GCC
• Milepost GCC
• Created for intelligent compilation
• Collecting source features
• Submitting features to common loc.
• Hooks into the Compilation process.
Possible Use Case
Structure for Phase Ordering
ANJI network from Source features
LLVM
• Open Source Compiler
• Modular Design
• Easy to work with
• All Optimizations are interchangeable
Questions
Most of the files and this presentation have been uploaded to http://www.cis.udel.edu/~skulkarn/ta.html