Translation Validation for an Optimizing Compiler
Guy Erez
Based on George C. Necula's article (ACM SIGPLAN 2000)
Advanced Programming Languages Seminar, Winter 2000

Post on 15-Jan-2016



In a Nutshell

• The Problem: Verify that the optimized and source code are equivalent
• Partial (heuristic) Solution: Independently prove the validity of each translation pass
• Motivation: Optimizer Testing

Outline

• Introduction
• Intermediate Language
• An extensive example
  – Simulation Relation
  – Execution Pair
  – Equivalence Checking
• Branch Navigation
• Results and Limitations

Methods of Proving Compiler Correctness

• Prove compiler general correctness:
  + absolute
  – tedious
  – impractical for large programs
  – very dependent on compiler code

Methods of Proving Compiler Corr. (cont.)

• Show that each translation phase was valid
  – weaker
  + proof per program
  + applicable for large programs
  + independent of compiler code

Compilation Process

[Figure: Source Code → Intermediate Language (IL) → Target Code]

Optimization Process

[Figure: IL Code 0 → Optimize Pass → IL Code 1 → … → IL Code n; Validator]
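The per-pass checking named in the nutshell slide can be sketched roughly as below. This is only an illustration: the `Program` type, the two toy passes and `toy_validate` are hypothetical stand-ins, not gcc's IL or the validator from the article.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: an IL program is a list of instruction strings.
Program = List[str]
OptimizerPass = Callable[[Program], Program]
Validator = Callable[[Program, Program], bool]


def optimize_with_validation(
    il0: Program, passes: List[OptimizerPass], validate: Validator
) -> Tuple[Program, List[int]]:
    """Run the passes in order, validating each (input, output) pair on its own.

    Returns the final IL and the indices of the passes the validator rejected.
    """
    current = il0
    alarms: List[int] = []
    for index, run_pass in enumerate(passes):
        optimized = run_pass(current)
        if not validate(current, optimized):
            alarms.append(index)  # either a real bug or a false alarm
        current = optimized
    return current, alarms


# Toy passes and a toy validator, for illustration only.
def drop_nops(program: Program) -> Program:
    return [ins for ins in program if ins != "nop"]


def buggy_pass(program: Program) -> Program:
    return program[:-1]  # wrongly deletes the last instruction


def toy_validate(source: Program, target: Program) -> bool:
    return drop_nops(source) == drop_nops(target)


il0 = ["r1 = 0", "nop", "r2 = r1 + 1"]
final_il, alarms = optimize_with_validation(il0, [drop_nops, buggy_pass], toy_validate)
print(alarms)  # [1]
```

Because each pass is judged only on its own input and output, a rejected pass can be reported without knowing anything about the optimizer's source code.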

The IL in GNU C (subset)

• Instructions:
• Expressions:
• Operators:

An Example

extern int g;
extern int a[…];
main()
{
    int n=… /* n contains the length of the array */
    int i;
    for (i=0; i<n; i++)
        a[i]=g*i+3;
    return i;
}

And in IL…

for (i=0; i<n; i++)
    a[i]=g*i+3;
return i;

After Transformation…

Use registers
Transform while to a repeat loop
?<==>

Equivalence

• $x_1,\ldots,x_n$ – variables in source
• $y_1,\ldots,y_m$ – variables in target
• Variable Equivalence: $x_1 = y_3$
• Expression Equivalence: $x_1 + x_2 = y_3 + 6$

Simulation Relation

• A set of equivalences between a source block and a target block

Execution Pair

• Definition: An execution path in the source and its corresponding path in the target

[Figure: corresponding execution paths in Source and Target]

Checking Equivalence

• Equivalence is checked at the end of a specific execution pair
• A variable’s value after the run is marked with a prime

[Figure: Symbolic Substitution: x becomes x’=x+1, y becomes y’=y*3]
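A minimal sketch of symbolic substitution along one straight-line path, assuming a hypothetical encoding in which an expression is a variable name, an integer constant, or a tuple `(operator, left, right)`. The names `substitute` and `symbolic_run` are illustrative and not taken from the article.

```python
def substitute(expr, env):
    """Replace every variable in expr by its current symbolic value in env.

    A variable that has not been assigned yet stands for its initial value.
    """
    if isinstance(expr, str):
        return env.get(expr, expr)
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, env), substitute(right, env))
    return expr  # integer constant


def symbolic_run(assignments):
    """Execute straight-line assignments symbolically.

    Returns each assigned variable's final (primed) value, expressed over
    the initial values of the variables.
    """
    env = {}
    for var, expr in assignments:
        env[var] = substitute(expr, env)
    return env


# x = x + 1 gives x' = x + 1, as on the slide.
print(symbolic_run([("x", ("+", "x", 1))]))

# t = x + 1; x = t * 2 gives x' = (x + 1) * 2.
print(symbolic_run([("t", ("+", "x", 1)), ("x", ("*", "t", 2))])["x"])
```

Running the source path and the target path of an execution pair this way yields the primed expressions that the equivalences are then checked against.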

Equivalence Simplification

• An equivalence can be simplified using:
  – Arithmetic rules
  – Already proven equivalences
• Example: If x’=x+1 and y’=y*5 then:
  3*x’=y’
  3*(x+1)=y*5
  3*x+3=y*5
• An equivalence holds if it can be simplified to an already proven equivalence
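The example above can be replayed with a small sketch restricted to linear arithmetic, which is an assumption made here for brevity. A linear form is a hypothetical dictionary from variable name to coefficient, with the key `"1"` holding the constant term; `holds` and its helpers are illustrative names.

```python
def substitute(form, defs):
    """Replace primed variables by their definitions and collect like terms.

    Multiplying each definition by its coefficient is the distributivity step.
    """
    out = {}
    for var, coeff in form.items():
        replacement = defs.get(var, {var: 1})
        for name, c in replacement.items():
            out[name] = out.get(name, 0) + coeff * c
    return out


def normalize(lhs, rhs):
    """Rewrite lhs = rhs as (lhs - rhs) = 0 and drop zero coefficients."""
    diff = dict(lhs)
    for name, c in rhs.items():
        diff[name] = diff.get(name, 0) - c
    return {name: c for name, c in diff.items() if c != 0}


def holds(lhs, rhs, defs, proven):
    """True if the goal is trivial or simplifies to an already proven equivalence."""
    goal = normalize(substitute(lhs, defs), substitute(rhs, defs))
    if not goal:
        return True  # follows from arithmetic rules alone
    return any(goal == normalize(l, r) for l, r in proven)


# x' = x + 1 and y' = y * 5
defs = {"x'": {"x": 1, "1": 1}, "y'": {"y": 5}}
# Assume 3*x + 3 = y*5 was proven earlier.
proven = [({"x": 3, "1": 3}, {"y": 5})]

print(holds({"x'": 3}, {"y'": 1}, defs, proven))  # True
print(holds({"x'": 3}, {"y'": 1}, defs, []))      # False
```

The second call fails because, without the earlier equivalence, nothing relates x to y. The validator would report this as an alarm, whether or not the translation is actually wrong.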

Checking Simulation Relations

• A relation is correct if for each execution pair entering it, all of its equivalences hold

[Figure: source x, target y, with the relation x=y+1]
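The "for each execution pair" condition can be illustrated with the sketch below. Note the swapped-in technique: it samples concrete states at random instead of proving the equivalences symbolically, so it can only find counterexamples, never establish correctness. All names (`check_relation`, `related_states`, the two example pairs) are hypothetical.

```python
import random


def check_relation(equivalences, execution_pairs, make_related_states, trials=100, seed=0):
    """Return the indices of execution pairs after which some equivalence fails.

    equivalences: predicates over (source_state, target_state).
    execution_pairs: (run_source, run_target) functions mapping state to state.
    make_related_states: draws a (source, target) state pair that satisfies
    the relation at the start of the paths.
    """
    rng = random.Random(seed)
    failing = []
    for index, (run_source, run_target) in enumerate(execution_pairs):
        for _ in range(trials):
            source, target = make_related_states(rng)
            source_after = run_source(dict(source))
            target_after = run_target(dict(target))
            if not all(eq(source_after, target_after) for eq in equivalences):
                failing.append(index)
                break
    return failing


# The relation from the figure: x = y + 1.
equivalences = [lambda s, t: s["x"] == t["y"] + 1]


def related_states(rng):
    y = rng.randint(-100, 100)
    return {"x": y + 1}, {"y": y}


good_pair = (lambda s: {"x": s["x"] + 2}, lambda t: {"y": t["y"] + 2})
bad_pair = (lambda s: {"x": s["x"] + 2}, lambda t: {"y": t["y"] + 3})

print(check_relation(equivalences, [good_pair, bad_pair], related_states))  # [1]
```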

Something fishy

• What’s the point of proving something using the same rules that created it?
• Simpler
• Provides an independent perspective on the final code

Showtime…

A. Element #1 holds
B. There is only one execution pair (no cond.)
C. Prove elem. #2 (Trivial)

Element #5

• Two execution pairs:

b3-b1-b2 and b7-b5

Element #5 (cont.)

• The other pair:

b3-b1-b3 and b7-b7

Known Equivalences

• Equivalences from the start of the run:

• Equivalences at the end of run:

Need to Prove

• The path condition is correct:

• The equivalences hold, mainly:

Elem #5: Path Cond.

Elem #5: The Equivalence

distributivity

commutativity

Q.E.D.

Algorithm Parts

• Inferring Simulation Relations
• Finding execution pairs
• Solving Constraints

Navigating Branches

• An optimizer might eliminate or reverse branches
• Problem: did branch B’ originate from branch B in the source?
• Solution: Use heuristics

A Typical Case

Similarity

• The similarity between two branches depends on the similarity of their:
  – preceding instruction sequence
  – boolean conditions
  – the two branching sequences

Similarity (cont.)

• Formally:

• ~ is a numeric relation (0..1)
• “and” is multiplication
• “or” is maximum
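The two combination rules can be sketched as follows. The formula on the slide did not survive in this transcript, so the way `branch_similarity` arranges its ingredients (preceding sequence, condition, and the two branch sequences, either in the same or in the reversed direction) is an assumption made for illustration, not the article's definition.

```python
import math


def sim_and(*scores: float) -> float:
    """Similarities that must all hold are combined by multiplication."""
    return math.prod(scores)


def sim_or(*scores: float) -> float:
    """Alternative explanations are combined by taking the maximum."""
    return max(scores)


def branch_similarity(
    prefix: float,        # similarity of the preceding instruction sequences
    cond_same: float,     # similarity of the two conditions as written
    cond_negated: float,  # similarity when one condition is negated
    then_then: float,
    else_else: float,
    then_else: float,
    else_then: float,
) -> float:
    """Hypothetical arrangement: the branch is kept as is, or it was reversed."""
    same_direction = sim_and(cond_same, then_then, else_else)
    reversed_direction = sim_and(cond_negated, then_else, else_then)
    return sim_and(prefix, sim_or(same_direction, reversed_direction))


# A branch whose condition was negated and whose arms were swapped.
print(branch_similarity(1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.5))  # 0.5
```

With scores in the range 0..1, multiplication means one poor ingredient drags the whole score down, while maximum lets the best explanation win.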

Boolean Similarity

• Branches are similar if:
  – one can be simplified into the other using simple transforms, such as:

Instruction Similarity

• Instruction similarity depends on:
  – the number of function calls
  – whether the sequences lead to already related branches (in that case, similarity is 1.0)

Instruction Similarity…

• gcc-specific features
  – IL instruction serial numbers
  – source line number information (for code duplication detection)

Results

• Detected a known bug in gcc 2.7.2.2
• Used on large programs:

• Increased compile time x4

Limitations

• Cannot handle loop unrolling
• Cannot resolve all types of equivalences
• Produces several false alarms (e.g. the gcc bug was accompanied by 3 false alarms)

Conclusion

• Automatically infer equivalences
• Uses:
  – simple rules and substitution
  – heuristics
• Good results
• Problems:
  – false alarms
  – runtime overhead