Meta Optimization: Improving Compiler Heuristics with Machine Learning
Mark Stephenson, Una-May O’Reilly, Martin Martin, and Saman Amarasinghe
MIT Computer Architecture Group
http://www.cag.lcs.mit.edu/metaopt


Page 1: Meta Optimization


Meta Optimization
Improving Compiler Heuristics with Machine Learning

Mark Stephenson, Una-May O’Reilly, Martin Martin, and Saman Amarasinghe
MIT Computer Architecture Group

Page 2: Meta Optimization


Motivation

• Compiler writers are faced with many challenges:
  – Many compiler problems are NP-hard
  – Modern architectures are inextricably complex
    • Simple models can’t capture architecture intricacies
  – Micro-architectures change quickly

Page 3: Meta Optimization


Motivation

• Heuristics alleviate complexity woes
  – Find good approximate solutions for a large class of applications
  – Find solutions quickly

• Unfortunately…
  – They require a lot of trial-and-error tweaking to achieve suitable performance

Page 4: Meta Optimization


Priority Functions

• A heuristic’s Achilles heel
  – A single priority or cost function often dictates the efficacy of a heuristic

• Priority functions rank the options available to a compiler heuristic
  – Graph coloring register allocation (selecting nodes to spill)
  – List scheduling (identifying instructions in the worklist to schedule first)
  – Hyperblock formation (selecting paths to include)
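To make the role concrete, here is a minimal sketch (not from the talk) of a list scheduler whose choices are entirely dictated by a priority function; the instruction fields and the weights are invented for illustration.

```python
# Sketch: how a priority function steers a compiler heuristic.
# A list scheduler repeatedly picks the ready instruction with the
# highest priority. Fields and weights here are illustrative only.

def priority(instr):
    # A hand-tuned cost function of the kind the talk targets:
    # favor instructions high on the critical path, break ties by latency.
    return 10.0 * instr["critical_path_height"] + instr["latency"]

def list_schedule(ready):
    order = []
    while ready:
        best = max(ready, key=priority)  # the priority function ranks the options
        ready.remove(best)
        order.append(best["name"])
    return order

insts = [
    {"name": "load", "critical_path_height": 3, "latency": 2},
    {"name": "add",  "critical_path_height": 1, "latency": 1},
    {"name": "mul",  "critical_path_height": 2, "latency": 3},
]
print(list_schedule(insts))  # ['load', 'mul', 'add']
```

Changing only `priority` changes every scheduling decision, which is why a single function can dictate the efficacy of the whole heuristic (dependence-driven readiness is omitted here for brevity).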

Page 5: Meta Optimization


Machine Learning

• We propose using machine learning techniques to automatically search the priority function space
  – Search space is feasible
  – Make use of spare computer cycles

Page 6: Meta Optimization


Case Study I: Hyperblock Formation

• Find predictable regions of control flow

• Enumerate paths of control in the region
  – Exponential, but in practice it’s okay

• Prioritize paths based on several characteristics
  – The priority function we want to optimize

• Add paths to the hyperblock in priority order
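The steps above can be sketched as a greedy loop. This is illustrative only: the ops budget and the path fields are invented, not IMPACT’s actual resource model.

```python
# Sketch of the path-selection step: rank paths with the priority
# function, then add them to the hyperblock in priority order while
# a (made-up) resource budget allows.

def form_hyperblock(paths, priority, budget):
    block = []
    used = 0
    for p in sorted(paths, key=priority, reverse=True):
        if used + p["ops"] <= budget:
            block.append(p["name"])
            used += p["ops"]
    return block

paths = [
    {"name": "hot",  "ops": 9,  "exec_ratio": 0.79},
    {"name": "warm", "ops": 10, "exec_ratio": 0.14},
    {"name": "cold", "ops": 13, "exec_ratio": 0.01},
]
# Toy priority: just the execution ratio.
print(form_hyperblock(paths, lambda p: p["exec_ratio"], budget=20))
# ['hot', 'warm']
```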

Page 7: Meta Optimization


Case Study I: IMPACT’s Function

hazard_i    = 1 if path_i is hazard free; 0.25 if path_i has a hazard

dep_ratio_i = dep_height_i / max over j in N of dep_height_j

op_ratio_i  = op_i / max over j in N of op_j

priority_i  = exec_ratio_i * hazard_i * (2.1 - dep_ratio_i * op_ratio_i)

exec_ratio favors frequently executed paths; dep_ratio favors short paths; op_ratio favors parallel paths; hazard penalizes paths with hazards.

Page 8: Meta Optimization


Hyperblock Formation

• What are the important characteristics of a hyperblock formation priority function?

• IMPACT uses four characteristics

• Extract all the characteristics you can think of and have a machine learning algorithm find the priority function

Page 9: Meta Optimization


Hyperblock Formation

x1  Maximum ops over paths
x2  Dependence height
x3  Number of paths
x4  Number of operations
x5  Does path have subroutine calls?
x6  Number of branches
x7  Does path have unsafe calls?
x8  Path execution ratio
x9  Does path have pointer derefs?
x10 Average ops executed in path
x11 Issue width of processor
x12 Average predictability of branches in path
…
xN  Predictability product of branches in path

priority = function(x1, …, xN)

in place of IMPACT’s priority_i = exec_ratio_i * hazard_i * (2.1 - dep_ratio_i * op_ratio_i)

Page 10: Meta Optimization


Genetic Programming

• GP’s representation is a directly executable expression
  – Basically a Lisp expression (or an AST)
  – In our case, GP variables are interesting characteristics of the program

[Expression-tree figure: a * node whose children are (- num_ops 2.3) and (/ predictability 4.1).]
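Such expressions can be executed directly with a tiny interpreter. This sketch evaluates tuple-encoded trees like the one in the figure; the tree’s exact shape is inferred from the slide, and the variable values are invented.

```python
# Sketch: GP individuals are directly executable lisp-like expressions.
# Leaves are constants or program characteristics; internal nodes are ops.
import operator

OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "div": lambda a, b: a / b if b else 0.0}

def evaluate(expr, env):
    if isinstance(expr, (int, float)):
        return expr                 # constant leaf
    if isinstance(expr, str):
        return env[expr]            # program-characteristic leaf
    op, *args = expr                # internal node: apply the operator
    return OPS[op](*(evaluate(a, env) for a in args))

# The tree from the slide: (* (- num_ops 2.3) (/ predictability 4.1))
tree = ("mul", ("sub", "num_ops", 2.3), ("div", "predictability", 4.1))
print(round(evaluate(tree, {"num_ops": 10, "predictability": 0.82}), 2))  # 1.54
```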

Page 11: Meta Optimization


Genetic Programming

• Searching algorithm analogous to natural selection
  – Maintain a population of expressions
  – Selection
    • The fittest expressions in the population are more likely to reproduce
  – Sexual reproduction
    • Crossing over subexpressions of two expressions
  – Mutation
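A minimal sketch of these operators over the tuple-encoded expressions above. This is simplified, not the paper’s exact operators: tournament selection stands in for “fitter expressions are more likely to reproduce,” and crossover swaps one random subexpression.

```python
# Sketch: selection and subtree crossover over tuple-encoded expressions.
import random

def select(population, fitness, k=2):
    # Tournament selection: sample k expressions, keep the fittest.
    return max(random.sample(population, k), key=fitness)

def subtrees(expr, path=()):
    # Enumerate every subexpression with its path (child indices from 1,
    # since index 0 holds the operator name).
    yield path, expr
    if isinstance(expr, tuple):
        for i, child in enumerate(expr[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(expr, path, new):
    # Rebuild expr with the subexpression at `path` swapped for `new`.
    if not path:
        return new
    i = path[0]
    return expr[:i] + (replace(expr[i], path[1:], new),) + expr[i + 1:]

def crossover(a, b):
    # Swap a random subexpression of a with a random subexpression of b.
    pa, _ = random.choice(list(subtrees(a)))
    _, sb = random.choice(list(subtrees(b)))
    return replace(a, pa, sb)

random.seed(0)
print(crossover(("add", "x", 1.0), ("mul", "y", 2.0)))
```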

Page 12: Meta Optimization


Genetic Programming

Flowchart: Create initial population (initial solutions) → Evaluation → Selection → Generations < Limit? If yes, Generation of variants (mutation and crossover) and back to Evaluation; if no, END.

• Most expressions in the initial population are randomly generated

• The population is also seeded with the compiler writer’s best guesses
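The flowchart’s loop, as a compact sketch. The fitness and breeding callables are placeholders for the real compile-and-run pipeline and the GP operators; the truncation-style selection here is a simplification.

```python
# Sketch of the generational loop from the flowchart.
import random

def evolve(initial_population, fitness, breed, generations=50):
    population = list(initial_population)   # random exprs + seeded best guesses
    for _ in range(generations):            # Generations < Limit?
        scored = sorted(population, key=fitness, reverse=True)   # Evaluation
        parents = scored[: len(scored) // 2]                     # Selection
        population = parents + [                                 # Variants
            breed(random.choice(parents), random.choice(parents))
            for _ in range(len(scored) - len(parents))
        ]
    return max(population, key=fitness)     # END: best expression found

# Toy run: "expressions" are just numbers, fitness is the number itself.
best = evolve(range(10), fitness=lambda x: x,
              breed=lambda a, b: min(a + b, 9), generations=5)
print(best)  # 9
```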

Page 13: Meta Optimization


Genetic Programming


• Each expression is evaluated by compiling and running benchmark(s)

• Fitness is the relative speedup over the baseline on the benchmark(s)
  – The baseline expression is the one that’s distributed with Trimaran
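A sketch of that fitness measure, assuming fitness is the mean per-benchmark speedup over the baseline; the run times below are invented stand-ins for actually compiling and simulating.

```python
# Sketch: fitness = relative speedup of an evolved expression's binaries
# over the Trimaran-baseline binaries, averaged across benchmarks.

def fitness(run_times_evolved, run_times_baseline):
    speedups = [base / new
                for base, new in zip(run_times_baseline, run_times_evolved)]
    return sum(speedups) / len(speedups)   # > 1.0 means it beats the baseline

# Two benchmarks: evolved binaries run in 80s and 100s,
# baseline binaries in 100s and 110s.
print(round(fitness([80.0, 100.0], [100.0, 110.0]), 3))  # 1.175
```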


Page 14: Meta Optimization


Genetic Programming


• Just as with Natural Selection, the fittest individuals are more likely to survive and reproduce.



Page 16: Meta Optimization


Genetic Programming


• Use crossover and mutation to generate new expressions


Page 17: Meta Optimization


Hyperblock Results
Compiler Specialization

[Bar chart: speedup over the Trimaran baseline when the priority function is specialized per benchmark (129.compress, g721encode, g721decode, huff_dec, huff_enc, rawcaudio, rawdaudio, toast, mpeg2dec, and the average), shown for the train data set and an alternate data set; the labeled values are 1.54 (train) and 1.23 (alternate).]

Example evolved expressions:
(add (sub (cmul (gt (cmul $b0 0.8982 $d17)…$d7)) (cmul $b0 0.6183 $d28)))
(add (div $d20 $d5) (tern $b2 $d0 $d9))

Page 18: Meta Optimization


Hyperblock Results
A General Purpose Priority Function

[Bar chart: speedup on decodrle4, codrle4, g721decode, g721encode, rawdaudio, rawcaudio, toast, mpeg2dec, 124.m88ksim, 129.compress, huff_enc, huff_dec, and the average, for the train and alternate data sets; labeled averages 1.44 (train) and 1.25 (alternate).]

Page 19: Meta Optimization


Cross Validation
Testing General Purpose Applicability

[Bar chart: speedup on benchmarks outside the training set (unepic, djpeg, rasta, 023.eqntott, 132.ijpeg, 052.alvinn, 147.vortex, 085.cc1, art, 130.li, osdemo, mipmap, and the average); labeled average 1.09.]

Page 20: Meta Optimization


Case Study II: Register Allocation
A General Purpose Priority Function

[Bar chart: speedup on 129.compress, g721decode, g721encode, huff_enc, huff_dec, rawcaudio, rawdaudio, mpeg2dec, and the average, for the train and alternate data sets; both labeled averages are 1.03.]

Page 21: Meta Optimization


Register Allocation Results
Cross Validation

[Bar chart: speedup on decodrle4, codrle4, 124.m88ksim, unepic, djpeg, 023.eqntott, 132.ijpeg, 147.vortex, 085.cc1, 130.li, and the average, measured with 32 and with 64 registers; labeled averages 1.02 (32 regs) and 1.03 (64 regs).]

Page 22: Meta Optimization


Conclusion

• Machine learning techniques can identify effective priority functions

• ‘Proof of concept’ by evolving two well-known priority functions

• Human cycles vs. computer cycles

Page 23: Meta Optimization


GP Hyperblock Solutions
General Purpose

(add (sub (mul exec_ratio_mean 0.8720) 0.9400)
     (mul 0.4762
          (cmul (not has_pointer_deref)
                (mul 0.6727 num_paths)
                (mul 1.1609
                     (add (sub (mul (div num_ops dependence_height) 10.8240)
                               exec_ratio)
                          (sub (mul (cmul has_unsafe_jsr predict_product_mean 0.9838)
                                    (sub 1.1039 num_ops_max))
                               (sub (mul dependence_height_mean num_branches_max)
                                    num_paths)))))))

One subexpression (highlighted on the slide) is an intron that doesn’t affect the solution.

Page 24: Meta Optimization


GP Hyperblock Solutions
General Purpose

(Same expression as on the previous slide, with one factor highlighted.) The (not has_pointer_deref) factor favors paths that don’t have pointer dereferences.

Page 25: Meta Optimization


GP Hyperblock Solutions
General Purpose

(Same expression, next highlight.) Favor highly parallel (fat) paths.

Page 26: Meta Optimization


GP Hyperblock Solutions
General Purpose

(Same expression, next highlight.) If a path calls a subroutine that may have side effects, penalize it.

Page 27: Meta Optimization


Case Study I: IMPACT’s Algorithm

[Control-flow graph: basic blocks A through G, with edge execution counts such as 4k, 24k, 22k, 2k, 25, 28k, and 10.]

Path        exec  haz   ops  dep  priority
A-B-D-F-G   ~0    1.0   13   4    ~0
A-B-F-G     0.14  1.0   10   4    0.21
A-C-F-G     0.79  1.0   9    2    1.44
A-C-E-F-G   0.07  0.25  13   5    0.02
A-C-E-G     ~0    0.25  11   3    ~0
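As a sanity check on the reconstruction of the Page 7 function: computing exec * hazard * (2.1 - dep_ratio * op_ratio), with the ratios taken against the per-region maxima in this table (13 ops, dependence height 5), reproduces the table’s priority column to rounding. The operator placement is reconstructed from the garbled slide, so treat it as inferred rather than quoted.

```python
# Cross-check the table's priority column against the reconstructed
# IMPACT priority function.

paths = {                  # name: (exec_ratio, hazard, ops, dep_height)
    "A-B-F-G":   (0.14, 1.0,  10, 4),
    "A-C-F-G":   (0.79, 1.0,   9, 2),
    "A-C-E-F-G": (0.07, 0.25, 13, 5),
}
max_ops = 13               # maxima over all five paths in the table
max_dep = 5

for name, (ex, haz, ops, dep) in paths.items():
    pr = ex * haz * (2.1 - (dep / max_dep) * (ops / max_ops))
    print(f"{name}: {pr:.2f}")
# A-B-F-G: 0.21, A-C-F-G: 1.44, A-C-E-F-G: 0.02, matching the table
```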
