Compiler Principles, Fall 2014-2015, Lecture 11: Loop Optimizations (Roman Manevich)


Post on 18-Jan-2018


Compiler Principles, Fall 2014-2015
Lecture 11: Loop Optimizations
Roman Manevich, Ben-Gurion University

Tentative syllabus
- Front End: Scanning, Top-down Parsing (LL), Bottom-up Parsing (LR)
- Intermediate Representation: Lowering
- Optimizations: Local Optimizations, Dataflow Analysis, Loop Optimizations
- Code Generation: Register Allocation, Instruction Selection
- Mid-term, exam

Previously
- Dataflow framework
- Global constant propagation

Agenda
- Finish dataflow framework
  - Monotone frameworks
  - Distributivity
- Reaching definitions
- Loop code motion
- (Strength reduction via induction variables)

Dataflow framework reminder

Join semilattice definition
- A join semilattice is a pair (V, ⊔), where
  - V is a set of elements (the domain)
  - ⊔ is a join operator that is
    - commutative: x ⊔ y = y ⊔ x
    - associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z)
    - idempotent: x ⊔ x = x
- If x ⊔ y = z, we say that z is the join, or least upper bound, of x and y
- Every join semilattice considered here has a bottom element, denoted ⊥, such that ⊥ ⊔ x = x for all x

Partial ordering induced by join
- Every join semilattice (V, ⊔) induces an ordering relationship ⊑ over its elements
- Define x ⊑ y iff x ⊔ y = y
- Need to prove
  - Reflexivity: x ⊑ x
  - Antisymmetry: if x ⊑ y and y ⊑ x, then x = y
  - Transitivity: if x ⊑ y and y ⊑ z, then x ⊑ z

Join semilattice example for liveness
[Figure: Hasse diagram of the subsets of {a, b, c} ordered by inclusion: {} (bottom element); {a}, {b}, {c}; {a, b}, {a, c}, {b, c}; {a, b, c}.]

Dataflow framework
- A global analysis is a tuple (D, V, ⊔, F, I), where
  - D is a direction (forward or backward): the order in which to visit statements within a basic block, NOT the order in which to visit the basic blocks
  - V is a set of values (sometimes called the domain)
  - ⊔ is a join operator over those values
  - F is a set of transfer functions f_s : V → V (one for every statement s)
  - I is an initial value

Running global analyses
- Assume that (D, V, ⊔, F, I) is a forward analysis
- For every statement s, maintain values before it, IN[s], and after it, OUT[s]
- Set OUT[s] = ⊥ for all statements s
- Set OUT[entry] = I
- Repeat until no values change:
  - For each statement s with predecessors PRED[s] = {p_1, p_2, …, p_n}
    - Set IN[s] = OUT[p_1] ⊔ OUT[p_2] ⊔ … ⊔ OUT[p_n]
    - Set OUT[s] = f_s(IN[s])
- The order of this iteration does not matter (chaotic iteration)

Soundness of dataflow analysis
- The result of the fixed-point algorithm is sound: it holds for every execution of the program
- Sometimes coming up with precise join and transfer functions is hard, but we can always make do with more conservative versions
- Define an operator ⊔' such that x ⊔' y ⊒ x ⊔ y
  - It returns some upper bound, not necessarily the least one
- Example, liveness: {a, b, c} ⊔ {b, c, d} is (by union) {a, b, c, d}
  - We can define a more conservative upper bound operation X ⊔' Y = (X ∪ Y) ∪ {e}

Termination

Reasoning for dataflow analysis: proving termination
- Our algorithm for running these analyses loops until no changes are detected
- Problem: how do we know the analyses will eventually terminate?

A non-terminating analysis
- The following analysis will loop infinitely on any CFG containing a loop:
  - Direction: Forward
  - Domain: ℕ
  - Join operator: max
  - Transfer function: f(n) = n + 1
  - Initial value: 0

[Figure sequence: a CFG start → x = y → end, where x = y has a back edge to itself. After initialization all values are 0. Each iteration picks the block x = y, joins the incoming values with max and applies f: the value before x = y goes 0, 1, 2, … and the value after it goes 1, 2, 3, …, while the value after start stays 0.]

Why doesn't this terminate?
- Values can increase without bound
- Note that "increase" refers to the lattice ordering, not the ordering on the natural numbers
- The height of a semilattice is the length of the longest increasing sequence in that semilattice
- The dataflow framework is not guaranteed to terminate for semilattices of infinite height
- Note that a semilattice can be infinitely large but have finite height, e.g. constant propagation

Height of a lattice
- An increasing chain is a sequence of elements ⊥ ⊏ a_1 ⊏ a_2 ⊏ … ⊏ a_k
  - The length of such a chain is k
- The height of a lattice is the length of the maximal increasing chain
- For liveness with n program variables: {} ⊂ {v_1} ⊂ {v_1, v_2} ⊂ … ⊂ {v_1, …, v_n}
- For available expressions it is the number of expressions of the form a = b op c
  - For n program variables and m operator types: O(m · n^3)

Monotonicity

Another non-terminating analysis
- This analysis works on a finite-height semilattice, but will not terminate on certain CFGs:
  - Direction: Forward
  - Domain: Boolean values true and false
  - Join operator: Logical OR
  - Transfer function: Logical NOT
  - Initial value: false

[Figure sequence: the same CFG start → x = y → end with a back edge on x = y. All values start as false. In iteration 1 the value after x = y becomes true; in iteration 2 the value before it becomes true and the value after it becomes false; in iteration 3 the value after it becomes true again, and so on.]

Why doesn't it terminate?
- Values can loop indefinitely: false, true, false, true, false, ...
- Intuitively, the join operator keeps pulling values up
- If the transfer function can keep pushing values back down again, then the values might cycle forever
- How can we fix this?
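The two non-terminating analyses above can be replayed with a small simulation. This is only a sketch under stated assumptions: the CFG is reduced to a single statement s whose predecessors are the entry node and s itself, and the function name `iterate_single_loop` and the `max_rounds` cap are hypothetical helpers introduced here, not part of the lecture.

```python
def iterate_single_loop(join, transfer, initial, bottom, max_rounds=6):
    """Run the forward fixed-point iteration on a CFG with one statement s
    whose predecessors are the entry node and s itself (a self-loop).

    Returns (history of OUT[s] values, whether a fixed point was reached).
    """
    out_entry = initial   # OUT[entry] = I
    out_s = bottom        # OUT[s] = bottom
    history = []
    for _ in range(max_rounds):
        in_s = join(out_entry, out_s)   # IN[s] = OUT[entry] join OUT[s]
        new_out = transfer(in_s)        # OUT[s] = f_s(IN[s])
        history.append(new_out)
        if new_out == out_s:            # nothing changed: fixed point
            return history, True
        out_s = new_out
    return history, False               # gave up after max_rounds


# Infinite height: naturals with max as join, f(n) = n + 1, I = 0.
growing = iterate_single_loop(max, lambda n: n + 1, initial=0, bottom=0)

# Finite height but non-monotone: booleans with OR as join, f = NOT, I = false.
cycling = iterate_single_loop(lambda a, b: a or b, lambda v: not v,
                              initial=False, bottom=False)

print(growing)  # values keep increasing
print(cycling)  # values alternate between True and False
```

In the first run the values climb 1, 2, 3, ... as in the figure sequence for the max analysis; in the second they alternate, which is the cycling behaviour that monotonicity is meant to rule out.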
Monotone transfer functions
- A transfer function f is monotone iff x ⊑ y implies f(x) ⊑ f(y)
- Intuitively, if you know less information about a program point, you can't gain back more information about that program point
- Many transfer functions are monotone, including those for liveness and constant propagation
- Note: monotonicity does not mean that x ⊑ f(x) (this is a different property called extensivity)

Liveness and monotonicity
- A transfer function f is monotone iff x ⊑ y implies f(x) ⊑ f(y)
- Recall that our transfer function for a = b + c is
  - f_{a = b + c}(V) = (V − {a}) ∪ {b, c}
- Recall that our join operator is set union and induces the ordering relationship X ⊑ Y iff X ⊆ Y
- Is this monotone?

Is constant propagation monotone?
- A transfer function f is monotone iff x ⊑ y implies f(x) ⊑ f(y)
- Recall our transfer functions
  - f_{x = k}(V) = V[x ↦ k] (update V by mapping x to k)
  - f_{x = y}(V) = V[x ↦ V(y)] (update the value of x to the value of y)
  - f_{x = a + b}(V) = V[x ↦ Not-a-Constant] (assign Not-a-Constant)
- Is this monotone?

The grand result
- Theorem: a dataflow analysis with a finite-height semilattice and a family of monotone transfer functions always terminates
- Proof sketch:
  - The join operator can only bring values up
  - Transfer functions can never lower values back down below where they were in the past (monotonicity)
  - Values cannot increase indefinitely (finite height)

Distributivity

An optimality result
- A transfer function f is distributive if f(a ⊔ b) = f(a) ⊔ f(b) for all domain elements a and b
- If all transfer functions are distributive, then the fixed-point solution is equal to the solution computed by joining results from all (potentially infinitely many) control-flow paths
  - Join over all paths
- Optimal if we ignore program conditions
  - Pretend all control-flow paths can be executed by the program
- Which analyses use distributive functions?
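For a small finite domain, the two properties can be checked by brute force. The sketch below does this for the liveness transfer function of a = b + c over the subsets of {a, b, c}, and for logical NOT over the booleans ordered false ⊑ true. The helper names (`powerset`, `is_monotone`, `is_distributive`) are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotone(f, elements, leq):
    """Check: x <= y implies f(x) <= f(y), for all pairs of elements."""
    return all(leq(f(x), f(y))
               for x in elements for y in elements if leq(x, y))


def is_distributive(f, elements, join):
    """Check: f(x join y) == f(x) join f(y), for all pairs of elements."""
    return all(f(join(x, y)) == join(f(x), f(y))
               for x in elements for y in elements)


def liveness_transfer(live):
    """Transfer function for the statement a = b + c."""
    return (live - {"a"}) | {"b", "c"}


subsets = powerset({"a", "b", "c"})
liveness_monotone = is_monotone(liveness_transfer, subsets,
                                lambda x, y: x <= y)      # subset ordering
liveness_distributive = is_distributive(liveness_transfer, subsets,
                                        lambda x, y: x | y)  # join is union

# Logical NOT over booleans with join OR, i.e. ordering False <= True.
not_monotone = is_monotone(lambda v: not v, [False, True],
                           lambda x, y: (not x) or y)

print(liveness_monotone, liveness_distributive, not_monotone)
```

Exhaustive checking only works because the domains here are tiny; for real analyses the properties are established by proof, as on the slides.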
Loop optimizations
- Most of a program's computations are done inside loops
  - Focus optimization effort on loops
- The optimizations we've seen so far are independent of the control structure
- Some optimizations are specialized to loops
  - Loop-invariant code motion
  - (Strength reduction via induction variables)
- These require another type of analysis to find out where expressions get their values from
  - Reaching definitions

Loop invariant computation
[Figure: CFG. start → block (y = …; t = …; z = …) → loop header (y = t * 4; x < y + z) → loop body (x = x + 1) → back to the loop header; the loop header also exits to end.]
- t * 4 and y + z have the same value on each iteration

Code hoisting
[Figure: CFG after the transformation. start → block (y = …; t = …; z = …; y = t * 4; w = y + z) → loop header (x < w) → loop body (x = x + 1) → back to the loop header; the loop header also exits to end.]

What reasoning did we use?
- y is defined inside the loop, but it is loop-invariant since t * 4 is loop-invariant
- Both t and z are defined only outside of the loop
- Constants are trivially loop-invariant
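The code-hoisting example can be written out as two equivalent functions. This is a sketch, assuming the elided initial values of x, t and z are passed in as parameters; the function names are made up for illustration.

```python
def loop_original(x, t, z):
    """Loop before hoisting: y = t * 4 and y + z are recomputed
    every time the loop header is evaluated."""
    while True:
        y = t * 4                 # loop-invariant: t never changes in the loop
        if not (x < y + z):       # y + z is loop-invariant as well
            break
        x = x + 1
    return x, y


def loop_hoisted(x, t, z):
    """Loop after hoisting: the invariant computations run once."""
    y = t * 4
    w = y + z
    while x < w:
        x = x + 1
    return x, y


print(loop_original(0, 2, 3), loop_hoisted(0, 2, 3))
```

Hoisting y = t * 4 is safe in this example because the loop header always executes at least once, so y receives the same value in both versions even when the body never runs.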
[Figure: the same CFG as before hoisting, with the statement t = t + 1 added to the loop body after x = x + 1.]
- Now t is not loop-invariant, and therefore neither are t * 4 and y

Loop-invariant code motion
- d: t = a_1 op a_2, where d is a program location
- a_1 op a_2 is loop-invariant (for a loop L) if it computes the same value in each iteration
  - Hard to know in general
- Conservative approximation: each a_i is
  - a constant, or
  - a variable such that all definitions of a_i that reach d are outside L, or
  - a variable such that only one definition of a_i reaches d, and that definition is loop-invariant itself
- Transformation: hoist the loop-invariant code outside of the loop

Reaching definitions analysis
- A definition d: t = … reaches a program location if there is a path from the definition to the program location along which the defined variable is never redefined
- Let's define the corresponding dataflow analysis:
  - Direction: Forward
  - Domain: sets of program locations that are definitions
  - Join operator: union
  - Transfer function:
    - f_{d: a = b op c}(RD) = (RD − defs(a)) ∪ {d}
    - f_{d: not-a-def}(RD) = RD
    - where defs(a) is the set of locations defining a (statements of the form a = ...)
  - Initial value: {}

Reaching definitions analysis: example
[Figure sequence: CFG. start → d1: y = … → d2: t = … → d3: z = … → loop header (d4: y = t * 4; x < y + z) → d5: x = x + 1 → back to the loop header; the loop header also exits to end. All values are initialized to {} and the analysis is iterated six times until nothing changes.]
- Fixed point:
  - after d1: {d1}
  - after d2: {d1, d2}
  - after d3: {d1, d2, d3}
  - before d4: {d1, d2, d3, d4, d5}
  - after d4: {d2, d3, d4, d5}
  - after x < y + z: {d2, d3, d4, d5}
  - after d5: {d2, d3, d4, d5}
  - at end: {d2, d3, d4, d5}

Which expressions are loop invariant
- t is defined only in d2, outside of the loop
- z is defined only in d3, outside of the loop
- The only definition of y that reaches x < y + z is d4, inside the loop, but it depends on t and 4, both loop-invariant
- x is defined only in d5, inside the loop, so it is not loop-invariant

Inferring loop invariants

Inferring loop-invariant expressions
- For a statement s of the form t = a_1 op a_2
- A variable a_i is immediately loop-invariant if all reaching definitions in IN[s] = {d_1, …, d_k} for a_i are outside of the loop
- LOOP-INV = immediately loop-invariant variables and constants
- LOOP-INV = LOOP-INV ∪ {x | d: x = a_1 op a_2, d is in the loop, and both a_1 and a_2 are in LOOP-INV}
  - Iterate until a fixed point is reached
- An expression is loop-invariant if all of its operands are loop-invariant

Computing LOOP-INV
[Figure sequence: the example CFG annotated with the reaching-definitions fixed point, with LOOP-INV growing step by step.]
- (immediately) LOOP-INV = {t}
- (immediately) LOOP-INV = {t, z}
- LOOP-INV = {t, z, 4}
- LOOP-INV = {t, z, 4, y}

Strength reduction via induction variables

Induction variables
    while (i < x) {
      j = a + 4 * i
      a[j] = j
      i = i + 1
    }
- i is incremented by a loop-invariant expression on each iteration; this is called an induction variable
- j is a linear function of the induction variable with multiplier 4

Strength reduction
    j = a + 4 * (i - 1)      (prepare initial value)
    while (i < x) {
      j = j + 4              (increment by multiplier)
      a[j] = j
      i = i + 1
    }

Summary of optimizations
    Analysis                Enabled optimizations
    Available Expressions   Common-subexpression elimination, copy propagation
    Constant Propagation    Constant folding
    Live Variables          Dead code elimination
    Reaching Definitions    Loop-invariant code motion

Dataflow analysis summary

Join semilattice
- A join semilattice is a pair (V, ⊔), where
  - V is a set of elements
  - ⊔ is a join operator that is
    - commutative: x ⊔ y = y ⊔ x
    - associative: (x ⊔ y) ⊔ z = x ⊔ (y ⊔ z)
    - idempotent: x ⊔ x = x
- If x ⊔ y = z, we say that z is the join, or least upper bound, of x and y
- Every join semilattice considered here has a bottom element, denoted ⊥, such that ⊥ ⊔ x = x for all x

Join semilattices and orderings
- Every join semilattice (V, ⊔) induces an ordering relationship ⊑ over its elements
- Define x ⊑ y iff x ⊔ y = y
- Need to prove
  - Reflexivity: x ⊑ x
  - Antisymmetry: if x ⊑ y and y ⊑ x, then x = y
  - Transitivity: if x ⊑ y and y ⊑ z, then x ⊑ z

Hasse diagram
[Figure: Hasse diagram of the subsets of {a, b, c}, from {} (lower) up to {a, b, c} (greater).]

Dataflow framework
- A (global) dataflow analysis is a tuple (D, V, ⊔, F, I)
  - D is a direction (forward or backward): the order in which to visit statements within a basic block
  - V is a set of lattice elements
  - ⊔ is a join operator over the lattice elements
  - F is a set of transfer functions f : V → V, one per statement
  - I is an initial value

Running global analyses
- Assume that (D, V, ⊔, F, I) is a forward analysis
- Set OUT[s] = ⊥ for all statements s
- Set OUT[entry] = I
- Repeat until no values change:
  - For each statement s with predecessors p_1, p_2, …, p_n:
    - Set IN[s] = OUT[p_1] ⊔ OUT[p_2] ⊔ … ⊔ OUT[p_n]
    - Set OUT[s] = f_s(IN[s])
- The order of this iteration does not matter: chaotic iteration (due to Gary Kildall)

Height of a lattice
- An increasing chain is a sequence of elements ⊥ ⊏ a_1 ⊏ a_2 ⊏ … ⊏ a_k
  - The length of such a chain is k
- The height of a lattice is the length of the maximal increasing chain

Properties of transfer functions
- A transfer function f is monotone iff x ⊑ y implies f(x) ⊑ f(y)
- A transfer function f is distributive iff f(a ⊔ b) = f(a) ⊔ f(b) for all domain elements a and b

Dataflow analysis main theorem
- Theorem 1: a dataflow analysis with a finite-height semilattice and a family of monotone transfer functions always terminates
- Theorem 2: the solution computed for the system of equations is unique: it is the minimal fixed point in the lattice

Analysis precision
- Ideal solution
  - Only consider execution paths that are possible at runtime
  - Undecidable to find these paths statically
- Join over all paths (JOP)
  - Assumes every control-flow path is feasible
  - Less precise than the ideal solution, but conservative (sound)
  - Equal to the least fixed point if all transfer functions are distributive
- Least fixed point
  - Less precise than JOP, but sound
  - The result of any dataflow analysis

Analysis direction
- Forward
  - In the CFG, information is propagated from predecessors of a statement to its output
  - Properties depend on past computations
  - Examples: available expressions, constant propagation, reaching definitions
- Backward
  - In the CFG, information is propagated from successors of a statement to its output
  - Properties depend on future computations
  - Examples: liveness analysis

Types of dataflow analysis
- May vs. Must
- Must analysis: properties hold on all paths (join is set intersection)
  - Examples: available expressions, constant propagation
- May analysis: properties hold on some path (join is set union)
  - Examples: liveness, reaching definitions

Global optimizations practice
[Figure: lattice diagram with elements including false, true, x < 0, x = 0 and x > 0.]

Aliasing dataflow problem
- Direction: D = forward
- Domain: subsets of MayPointTo = {x = &y | x is a pointer variable and y is a numeric variable}
- Join operator: ⊔ = ∪
- Transfer functions
  - For two numeric variables x and y: f_{x = y}(M) = M
  - For two pointer variables x and y: f_{x = y}(M) = (M \ {x = &z | z is a numeric variable}) ∪ {x = &z | y = &z ∈ M}
  - f_{x = &y}(M) = (M \ {x = &z | z is a numeric variable}) ∪ {x = &y}
  - f_{x = y + z}(M) = f_{x = *y}(M) = f_{*x = y}(M) = M
- Initial value: I = {}
- The lattice is finite: there are O(n^2) facts of the form x = &y for n program variables. Therefore the height of the lattice is finite.
- The transfer functions have GEN-KILL form, f(X) = (X \ A) ∪ B, which we already proved to be monotone in class

Next lecture: Register Allocation
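The reaching-definitions analysis and the LOOP-INV computation from this lecture can be sketched on the running example as follows. The encoding is an assumption made for illustration: each statement is a pair (defined variable or None, list of operands), the unlabeled test x < y + z is given the hypothetical label "test", the elided right-hand sides of d1 to d3 are modelled as having no operands, and the function names are invented here. The second function follows the conservative rule from the slides and omits the extra safety checks a compiler would need before actually moving code.

```python
STMTS = {                      # label -> (defined variable, operands)
    "d1": ("y", []),
    "d2": ("t", []),
    "d3": ("z", []),
    "d4": ("y", ["t", 4]),     # y = t * 4
    "test": (None, ["y", "z"]),  # x < y + z (not a definition)
    "d5": ("x", ["x", 1]),     # x = x + 1
}
PREDS = {"d1": [], "d2": ["d1"], "d3": ["d2"],
         "d4": ["d3", "d5"], "test": ["d4"], "d5": ["test"]}
LOOP = {"d4", "test", "d5"}


def definitions_by_variable(stmts):
    """defs(a): the set of labels that define variable a."""
    defs = {}
    for label, (target, _) in stmts.items():
        if target is not None:
            defs.setdefault(target, set()).add(label)
    return defs


def reaching_definitions(stmts, preds):
    """Forward analysis, join = union, initial value = {}."""
    defs = definitions_by_variable(stmts)
    IN = {s: set() for s in stmts}
    OUT = {s: set() for s in stmts}
    changed = True
    while changed:
        changed = False
        for s, (target, _) in stmts.items():
            IN[s] = set().union(*(OUT[p] for p in preds[s]))
            if target is None:
                new_out = set(IN[s])                     # f(RD) = RD
            else:
                new_out = (IN[s] - defs[target]) | {s}   # kill, then gen
            if new_out != OUT[s]:
                OUT[s] = new_out
                changed = True
    return IN, OUT


def loop_invariants(stmts, IN, loop):
    """LOOP-INV: constants, immediately invariant variables, then closure."""
    defs = definitions_by_variable(stmts)
    inv = set()
    for s in loop:
        for operand in stmts[s][1]:
            if not isinstance(operand, str):
                inv.add(operand)                         # constant
            elif all(d not in loop for d in IN[s] & defs.get(operand, set())):
                inv.add(operand)                         # defined outside only
    changed = True
    while changed:
        changed = False
        for s in loop:
            target, operands = stmts[s]
            if (target is not None and target not in inv
                    and all(o in inv for o in operands)):
                inv.add(target)
                changed = True
    return inv


IN, OUT = reaching_definitions(STMTS, PREDS)
LOOP_INV = loop_invariants(STMTS, IN, LOOP)
print(sorted(IN["d4"]), sorted(OUT["d4"]))
print(LOOP_INV)
```

Unlike the slides, this sketch also records the constant 1 from x = x + 1 as loop-invariant, since it treats every constant operand inside the loop the same way; x itself stays out of LOOP-INV because its only reaching definition, d5, is inside the loop.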