Cse322, Programming Languages and Compilers. Lecture #11, May 10, 2007: More optimizations (local, loop, global), Liveness Analysis, Spilling, Problems with Jumps, Ranges, Intervals, Linear Scan register allocation.

Post on 22-Dec-2015



Lecture #11, May 10, 2007
• More optimizations (local, loop, global)
• Liveness Analysis
• Spilling
• Problems with Jumps
• Ranges
• Intervals
• Linear Scan register allocation


More Optimizations

• Local Optimizations
– Constant Folding
– Constant Propagation
– Copy Propagation
– Reduction in Strength
– In Lining
– Common sub-expression elimination

• Loop Optimizations
– Loop Invariants
– Reduction in strength due to induction variables
– Loop unrolling

• Global Optimizations
– Dead Code elimination
– Code motion
» Reordering
» Code hoisting


Constant Folding

• Subexpressions whose operands are all constants can be evaluated at compile time.

• E.g.

X := 2 * 4

Rather than generating

Movi 2 r5

Movi 4 r7

Prim "*" [r5,r7] r9

. . .

Generate this instead

Movi 8 r9

. . .

• Code like this is not ordinarily written by programmers but is often the result of translation of index calculations.
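To make the folding step concrete, here is a minimal Python sketch. The expression encoding (an int for a constant, a string for a variable, a tuple for a binary operation) and the name `fold` are assumptions made for this illustration; they are not the IR used in the course.

```python
import operator

# Hypothetical encoding: an int is a constant, a str is a variable name,
# and a tuple (op, left, right) is a binary operation.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr):
    """Return expr with every all-constant subexpression evaluated."""
    if not isinstance(expr, tuple):
        return expr                      # constant or variable: nothing to fold
    op, left, right = expr
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)      # both operands known: compute it now
    return (op, left, right)

print(fold(("*", 2, 4)))                 # the slide's X := 2 * 4
print(fold(("+", "i", ("*", 2, 4))))     # the kind of term index arithmetic produces
```

Folding bottom-up means a constant produced in an inner subexpression can enable folding in the enclosing one.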


Constant Propagation

• Sometimes we know a symbolic value is a constant. So we can propagate the constant and generate better code:

1 step := 4

2 total := 0

3 i := 1

4 total := x [i + step]

• Note that because there are no possible paths to statement 4 that do not pass through statements 1, 2 and 3, we know that i+step can be computed as (1+4), which is known at compile time to be 5.
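The four-statement example can be traced with a small Python sketch. It handles only straight-line code, which is the case the note above relies on; the statement encoding and the names `propagate` and `subst` are assumptions for illustration.

```python
def subst(expr, known):
    """Replace variables whose constant value is known, folding '+' on the way."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return known.get(expr, expr)
    op, a, b = expr
    a, b = subst(a, known), subst(b, known)
    if op == "+" and isinstance(a, int) and isinstance(b, int):
        return a + b
    return (op, a, b)

def propagate(stmts):
    """stmts: list of (dest, expr) in straight-line order (no jumps)."""
    known = {}                      # variable -> constant known at this point
    out = []
    for dest, expr in stmts:
        expr = subst(expr, known)
        if isinstance(expr, int):
            known[dest] = expr
        else:
            known.pop(dest, None)   # dest no longer holds a known constant
        out.append((dest, expr))
    return out

program = [
    ("step", 4),
    ("total", 0),
    ("i", 1),
    ("total", ("index", "x", ("+", "i", "step"))),   # total := x[i + step]
]
for stmt in propagate(program):
    print(stmt)
```

With branches, a variable is a known constant at a point only if every path to that point agrees on its value, which needs a dataflow analysis rather than this single pass.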


Copy Propagation

• Assignments of one variable to another also propagate information:

x := y

. . .

total := Z[x]

• Note that if my translation knows that y is stored in some register, R7, I can use R7 rather than fetching x from memory.

• Copy propagation may remove all references to x completely. This also makes the assignment to x dead code and a candidate for further optimization.
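A minimal Python sketch of copy propagation over straight-line code follows. The encoding (each statement is a destination plus a tuple of operand names, and a one-operand statement is a copy) and the name `copy_propagate` are assumptions for this example.

```python
def copy_propagate(stmts):
    """stmts: list of (dest, operands); a single-operand statement is a copy."""
    copies = {}                      # x -> y, meaning "x currently equals y"
    out = []
    for dest, operands in stmts:
        # Use the original variable wherever a copy of it is read.
        operands = tuple(copies.get(v, v) for v in operands)
        # Assigning dest invalidates every copy fact that mentions it.
        copies = {x: y for x, y in copies.items() if dest not in (x, y)}
        if len(operands) == 1 and operands[0] != dest:
            copies[dest] = operands[0]
        out.append((dest, operands))
    return out

# x := y ; total := Z[x]
print(copy_propagate([("x", ("y",)), ("total", ("Z", "x"))]))
```

After the rewrite nothing reads x, so the assignment x := y becomes a candidate for dead code elimination, as the slide points out.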


Reduction in Strength

• Some sequences of code can be replaced with simpler (or less expensive) sequences.

x := 2 * y

could be replaced by

x := y + y

• Exponentiation by 2 can be replaced by a multiply
– x ^ 2 == x * x

• Multiplication by a power of 2 can be replaced by a shift
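The three rewrites above can be sketched as a single Python function on integer operations. The name `reduce_strength` and the tuple encoding are hypothetical; a real compiler would also check operand types and the target machine's instruction costs.

```python
def reduce_strength(op, operand, constant):
    """Rewrite (operand op constant) into a cheaper equivalent when one is known.
    Returns a tuple (op, left, right). Integer arithmetic is assumed."""
    if op == "*" and constant == 2:
        return ("+", operand, operand)            # 2 * y  ->  y + y
    if op == "^" and constant == 2:
        return ("*", operand, operand)            # x ^ 2  ->  x * x
    if op == "*" and constant > 0 and constant & (constant - 1) == 0:
        shift = constant.bit_length() - 1         # constant == 2 ** shift
        return ("<<", operand, shift)             # y * 8  ->  y << 3
    return (op, operand, constant)                # no cheaper form known

print(reduce_strength("*", "y", 2))
print(reduce_strength("^", "x", 2))
print(reduce_strength("*", "y", 8))
```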


In-Lining

• Some calls to functions (especially primitives like +, -, *, absolute value, ord and char) can be inlined as a sequence of machine instructions instead of a call to a library routine.

i := abs(j)

Bneg j L2

Mov j i

Br L3

L2:

Neg j R2

Mov R2 i

L3:


Common Sub-expressions

• Common subexpressions can be exploited by not duplicating code

x := z[j+2] - w[j+2]

T1 := j+2

x := z[T1] - w[T1]

• Note that common subexpressions often occur even when they are not in the user level code.

– E.g. subscript computations on two multi-dimensional arrays with the same dimensions will often have common subexpressions even if the indices into the arrays are completely different.
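A minimal Python sketch of local common sub-expression elimination on straight-line three-address code follows. The statement shape `(dest, op, a, b)`, the `"copy"` marker and the name `eliminate_cse` are assumptions made for this example.

```python
def eliminate_cse(stmts):
    """stmts: list of (dest, op, a, b) in straight-line code.
    A repeated (op, a, b) is replaced by a copy from the earlier result."""
    available = {}                   # (op, a, b) -> variable holding its value
    out = []
    for dest, op, a, b in stmts:
        key = (op, a, b)
        if key in available:
            out.append((dest, "copy", available[key], None))
        else:
            out.append((dest, op, a, b))
        # Redefining dest kills expressions that read dest or are held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2]) and v != dest}
        if key not in available and dest not in (a, b):
            available[key] = dest
    return out

# x := z[j+2] - w[j+2], lowered naively so that j+2 is computed twice
code = [
    ("T1", "+", "j", 2),
    ("T2", "load", "z", "T1"),
    ("T3", "+", "j", 2),
    ("T4", "load", "w", "T3"),
    ("x", "-", "T2", "T4"),
]
for stmt in eliminate_cse(code):
    print(stmt)
```

The second `j + 2` turns into a copy of T1; copy propagation can then rewrite the load through T3 to use T1 directly.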


Loop Invariants

• Computations inside loops which remain invariant each time around the loop can be computed once outside the loop rather than each time around the loop.

For i := 1 to N do { total := x[i] / sqr(n) + total }

T1 := sqr(n)

For i := 1 to N do

{ total := x[i] / T1 + total }

• Note that index calculation may also introduce computations which are invariant around the loop.
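One simple way to find such computations is sketched below in Python. It is deliberately conservative and rests on stated assumptions: the loop body is a flat list of `(dest, op, operands)` statements, every operation is free of side effects, and the loop runs at least once. The name `hoist_invariants` is hypothetical.

```python
def hoist_invariants(loop_var, body):
    """body: list of (dest, op, operands). Returns (hoisted, remaining).
    A statement is treated as invariant when none of its operands is assigned
    inside the loop and its destination is assigned exactly once in the loop."""
    assigned = [dest for dest, _, _ in body] + [loop_var]
    hoisted, remaining = [], []
    for stmt in body:
        dest, op, operands = stmt
        invariant = (all(v not in assigned for v in operands)
                     and assigned.count(dest) == 1)
        (hoisted if invariant else remaining).append(stmt)
    return hoisted, remaining

# total := x[i] / sqr(n) + total, lowered to three-address form
body = [
    ("T1", "sqr", ("n",)),
    ("T2", "load", ("x", "i")),
    ("T3", "/", ("T2", "T1")),
    ("total", "+", ("T3", "total")),
]
hoisted, remaining = hoist_invariants("i", body)
print(hoisted)      # statements that can move in front of the loop
print(remaining)    # statements that must stay inside the loop
```

This single pass misses statements that only become invariant once their operands have been hoisted; repeating the pass until nothing changes would catch those.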


Induction Variables and reduction in strength

• Variables which vary in a regular way around loops are called induction variables.

• The control variables of for-loops are obvious cases, but implicit induction variables give much opportunity for optimization.

For i := 1 to 10 do { k := i * 4;

total := x[i] + w[k] }

• Note that k varies as a linear function of i.

i := 1

k := 4

while i <= 10 do

{ total := x[i] + w[k]

i := i + 1; k := k + 4 }
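The equivalence of the two loops above can be checked directly. The Python sketch below collects the value computed in each iteration instead of overwriting total, purely so the two versions can be compared; the function names and the sample arrays are made up for the example.

```python
def with_multiply(x, w):
    out = []
    for i in range(1, 11):
        k = i * 4                 # recomputed with a multiply every iteration
        out.append(x[i] + w[k])
    return out

def with_induction_variable(x, w):
    out = []
    i, k = 1, 4
    while i <= 10:
        out.append(x[i] + w[k])
        i, k = i + 1, k + 4       # k tracks 4*i using only an addition
    return out

x = list(range(100, 111))         # valid indices 0..10
w = list(range(41))               # valid indices 0..40
assert with_multiply(x, w) == with_induction_variable(x, w)
```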


Induction Variables (cont. 1)

i := 1 k := 4

while i <= 10 do

{ total := x[i] + w[k]

i := i + 1; k := k + 4 }

• Note that x[i] and w[k] are computed by formulas like:

r4 := addr(x) + lowbound(x) + i;
load r4 r4

• Note that addr(x) + lowbound(x) can be moved outside the loop, and this introduces another induction variable.

xptr := addr(x) + lowbound(x)
i := 1
k := 4
while i <= 10 do
{ total := (xptr + i)* + w[k]
  i := i + 1; k := k + 4 }


Induction Variables (cont. 2)

xptr := addr(x) + lowbound(x)

wptr := addr(w) + lowbound(w)

i := 1

k := 4

while i <= 10 do

{ total := (xptr + i)* + (wptr + k)*

i := i + 1; k := k + 4 }

xptr := addr(x) + lowbound(x) + 1

wptr := addr(w) + lowbound(w) + 4

bound := xptr + 9

while xptr <= bound do

{ total := xptr* + wptr*

xptr := xptr + 1; wptr := wptr + 4 }


Loop Unrolling

• Loops with a low trip count can be unrolled. This does away with the loop initialization and the test for termination conditions.

list := [1,2]

while (list <> nil) do

{ total := total + hd(list);

list := tl(list)

}

total := total + hd(list)

list := tl(list)

total := total + hd(list)


Dead Code Elimination

• Automatic generation techniques often generate code that is unreachable.

debug := false;

if debug

then print x;

f(x);

• Because of constant propagation it is possible to tell at compile-time that the then branch will never be executed.
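Once constant propagation has established the value of a condition, removing the dead branch is mechanical. The Python sketch below assumes a tiny statement encoding in which `("if", var, then_stmts)` is a one-armed conditional; that encoding and the name `remove_dead_branches` are made up for illustration.

```python
def remove_dead_branches(stmts, constants):
    """stmts: list of statement tuples; ('if', var, then_stmts) is a conditional.
    constants: variables whose boolean value is known at compile time."""
    out = []
    for stmt in stmts:
        if stmt[0] == "if" and stmt[1] in constants:
            if constants[stmt[1]]:
                # Condition known true: keep the body, drop the test.
                out.extend(remove_dead_branches(stmt[2], constants))
            # Condition known false: the branch is unreachable, emit nothing.
        else:
            out.append(stmt)
    return out

program = [
    ("assign", "debug", False),
    ("if", "debug", [("print", "x")]),
    ("call", "f", "x"),
]
print(remove_dead_branches(program, {"debug": False}))
```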


Code Motion (reordering)

• Sometimes reordering statements that do not interfere allows other, more powerful optimizations to become applicable.

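Original:
Push R2
Movi 7 R3
Pop R4

After moving the Movi ahead of the Push:
Movi 7 R3
Push R2
Pop R4

The adjacent Push/Pop pair can then be replaced by a move:
Movi 7 R3
Mov R2 R4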

• Now copy propagation might remove R2 altogether


Code Motion (Code Hoisting)

• Branches in code sometimes repeat identical calculations.

• These calculations can sometimes be “hoisted” before the branch, then they don’t have to be repeated.

• This saves space, but not time.

if g(x)

then x := (d*2) + w / k

else x := (d*2) - w / j

T1 := (d*2);
if g(x)
then x := T1 + w / k
else x := T1 - w / j

• Multi branch “case” statements can make this quite a space saver.


Code Hoisting in Nested Lang’s

• Note that x := a+b - 3; could be hoisted out of the nested function, then hoisted out of the loop.

Procedure P (a,b:int);

var x : int = 5;

function f(y:int):int;

begin x := a+b - 3;

return x + y

end;

begin

for i := 1 to 3 do print f(i)

end;

Procedure P (a,b:int);

var x : int = 5;

function f(y:int):int;

begin return x + y

end;

begin

x := a+b - 3;

for i := 1 to 3 do print f(i)

end;


Register Allocation

• Task: Manage scarce resources (registers) in environment with imperfect information (static program text) about dynamic program behavior.

• General aim is to keep frequently-used values in registers as much as possible, to lower memory traffic. Can have a large effect on program performance.

• Variety of approaches are possible, differing in sophistication and in scope of analysis used.


Spilling

• Allocator may be unable to keep every "live" variable in registers; must then "spill" variables to memory. Spilling adds new instructions, which often affects the allocation analysis, requiring a new iteration.

• If spilling is necessary, what should we spill? Some heuristics:

– Don't spill variables used in inner loops.

– Spill variables not used again for the "longest" time.

– Spill variables which haven't been updated since last read from memory.


Simplistic approach

• Assume variables "normally" live in memory.

• Use existing (often redundant) fetches and stores present in IR1.

• So: only need to allocate registers to IR temporaries (T5 etc.).

• Ignore possibility of spills.

• Use simple linear scan register allocator based on liveness intervals.


Liveness

• To determine how long to keep a given variable (or temporary) in a register, need to know the range of instructions for which the variable is live.

• A variable or temporary is live immediately following an instruction if its current value will be needed in the future (i.e., it will be used again, and it won't be changed before that use).


Example

                 | live after instruction:
T2 := 3          | T2
T3 := T2         | T2 T3
T4 := T3 + 4     | T2 T4
T4 := T2 + T4    | T4
a := T4          | (nothing)

• It's easy to calculate liveness for a consecutive series of instructions without branches, just by working backwards.


First, compute vars of an IR exp

fun varsOf x =
  case x of
    BINOP(m,x,y) => varsOf x @ varsOf y
  | RELOP(m,x,y) => varsOf x @ varsOf y
  | CALL(f,xs) => varsOfF varsOf xs
  | MEM x => varsOf x
  | NAME s => []
  | TEMP n => [TEMP n]
  | PARAM n => [PARAM n]
  | MEMBER(x,n) => varsOf x
  | VAR n => [VAR n]
  | CONST (s,ty) => []
  | STRING s => []
  | ESEQ(ss,x) => varsOfF varsOfSt ss @ varsOf x


Next, compute vars of an IR stmt

and varsOfSt stmt =
  case stmt of
    MOVE(x,y) => varsOf x @ varsOf y
  | JUMP n => []
  | CJUMP(m,x,y,n) => varsOf x @ varsOf y
  | LABEL n => []
  | CALLST(f,xs) => varsOfF varsOf xs
  | RETURN x => varsOf x
  | STMTlist ss => varsOfF varsOfSt ss
  | COMMENT(s,message) => varsOfSt s

and varsOfF f [] = []
  | varsOfF f (x::xs) = f x @ varsOfF f xs


Helper functions

• We treat lists as sets, so we need functions that
– Remove an element from a set
– Union two sets

• Note there will only ever be one element with a given value in a set.

fun remove x [] = []
  | remove x (y::ys) = if x=y then ys else y :: remove x ys;

fun union [] ys = ys
  | union (x::xs) ys =
      if List.exists (fn z => z=x) ys
      then union xs ys
      else x :: (union xs ys)


Work backwards

fun defines (s as MOVE(target,src)) live =
      union (varsOf src) (remove target live)
  | defines s live = union (varsOfSt s) live

fun annLive (s::ss) live ans =
      annLive ss (defines s live) ((s,live)::ans)
  | annLive [] live ans = ans;

fun live stmts = annLive (rev stmts) [] []


Jumps cause problems

    T1 := 0                 | T1 T3
L1: T2 := T1 + 1            | T2 T3
    T3 := T3 + T2           | T2 T3
    T1 := (2 * T2)          | T1 T3
    if T1 < 1000 GOTO L1    | T3
    return T3               |

• Consider the above IR program. It is labelled by the results obtained from the previous algorithm.

• But what happens if we jump back to L1? The annotation on the second-to-last statement (the conditional jump) should still state that T1 is live, because T1 is read at L1 whenever the jump is taken.


Correct Liveness

1. T1 := 0 | T1 T3

2. L1: T2 := T1 + 1 | T2 T3

3. T3 := T3 + T2 | T2 T3

4. T1 := (2 * T2) | T1 T3

5. if T1 < 1000 GOTO L1 | T1 T3

6. return T3 |

• A live range is a set of pairs. Each pair indicates a range where the variable is live.

– T1 (1,1) (4,5)

– T2 (2,3)

– T3 (1,5)
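One standard way to obtain the corrected annotations is to iterate the backward pass until nothing changes, letting each instruction's live-after set be the union of what is live on entry to all of its successors. The Python sketch below does this for the six-statement program; the instruction encoding (defs, uses, successors) and the name `liveness` are assumptions for this example, not the course's ML code.

```python
def liveness(instrs, labels):
    """instrs: list of (defs, uses, successors); a successor is a 0-based
    index or a label name. Returns the live-after set of each instruction."""
    n = len(instrs)
    live_in = [set() for _ in range(n)]
    live_out = [set() for _ in range(n)]
    changed = True
    while changed:                          # repeat until a fixed point
        changed = False
        for i in reversed(range(n)):
            defs, uses, succs = instrs[i]
            out = set()
            for s in succs:
                out |= live_in[labels.get(s, s)]
            new_in = set(uses) | (out - set(defs))
            if out != live_out[i] or new_in != live_in[i]:
                live_out[i], live_in[i] = out, new_in
                changed = True
    return live_out

program = [
    (["T1"], [],           [1]),          # 1. T1 := 0
    (["T2"], ["T1"],       [2]),          # 2. L1: T2 := T1 + 1
    (["T3"], ["T3", "T2"], [3]),          # 3. T3 := T3 + T2
    (["T1"], ["T2"],       [4]),          # 4. T1 := (2 * T2)
    ([],     ["T1"],       [5, "L1"]),    # 5. if T1 < 1000 GOTO L1
    ([],     ["T3"],       []),           # 6. return T3
]
result = liveness(program, {"L1": 1})
for number, live in enumerate(result, start=1):
    print(number, sorted(live))
```

The first backward pass reproduces the straight-line answer; the second pass picks up T1 after statement 5 through the back edge to L1.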


Computing Ranges

• The result of computing the liveness is a list of variables for each line number. Eg.

1. [[TEMP 1, TEMP 3]
2. ,[TEMP 1, TEMP 3]
3. ,[TEMP 2, TEMP 3]
4. ,[TEMP 2, TEMP 3]
5. ,[TEMP 1, TEMP 3]
6. ,[TEMP 3]
7. ,[]]

• For each variable, we look for runs of consecutive line numbers in which the variable appears.


Computing ranges

• Given such a list, we compute the range for a single variable (e.g. Temp 1) by making a pass over the list.

• We repeat for each variable.

1. [[TEMP 1,TEMP 3]
2. ,[TEMP 1,TEMP 3]
3. ,[TEMP 3,TEMP 2]
4. ,[TEMP 2,TEMP 3]
5. ,[TEMP 1,TEMP 3]
6. ,[TEMP 3]
7. ,[]]

Range for Temp 1 [(5,5),(1,2)]

Range for Temp 2 [(3,4)]

Range for Temp 3 [(1,6)]


ML code

The idea is to carry an option value indicating whether we are in an active run for the variable name. If we are in an active run and we find a line that doesn’t have the variable, end the run. If we’re not in a run and we find the variable, start a new run.

fun range name [] line NONE ans = ans

| range name [] line (SOME(x,y)) ans = (x,y)::ans

| range name (rs::rss) line interval ans =

(case (List.find (fn x => x=name) rs,interval) of

(NONE,NONE) =>

range name rss (line+1) NONE ans

| (NONE,SOME(x,y)) =>

range name rss (line+1) NONE ((x,y)::ans)

| (SOME _,NONE) =>

range name rss (line+1) (SOME(line,line)) ans

| (SOME _,SOME(x,y)) =>

range name rss (line+1) (SOME(x,line)) ans)


From Ranges to intervals

• Computing intervals from ranges is easy.

• Find the smallest start line number, and the largest finish line number in a set of ranges.

– [(5,5),(1,2)] -> (1,5)

fun max [x] = x
  | max (x::xs) = let val n = max xs in if x < n then n else x end;

fun min [x] = x
  | min (x::xs) = let val n = min xs in if x < n then x else n end;

fun interval ranges =
  let fun fst (x,y) = x
      fun snd (x,y) = y
  in (min (map fst ranges), max (map snd ranges)) end;


Live ranges and register allocation

– T1 (1,1) (4,5)
– T2 (2,3)
– T3 (1,5)

• Two variables with no overlapping ranges can share the same register. Note that T1 and T2 could be stored in the same physical register.

• APPROXIMATION TECHNIQUE

• Compute live intervals: for each variable, the first and last statements at which it can be live. This coalesces the variable's ranges into a single interval.

– T1 (1,5)
– T2 (2,3)
– T3 (1,5)


Linear Scan Register Allocation

1. Compute the start point and end point of the live interval for each temporary (Temp i). Store the intervals in a list in order of increasing start point.

   Range for Temp 1 [(5,5),(1,2)]   interval (1,5)
   Range for Temp 2 [(3,4)]         interval (3,4)
   Range for Temp 3 [(1,6)]         interval (1,6)

   [(TEMP 1,1,5),(TEMP 3,1,6),(TEMP 2,3,4)]

2. Initialize set active := [ ] and pool of free registers := all usable registers.

3. For each live interval i, in order of increasing start point:
   1. For each interval j in active, in order of increasing end point:
      1. If endpoint[j] >= startpoint[i], break to step 3.2.
      2. Remove j from active.
      3. Add register[j] to the pool of free registers.
   2. Set register[i] := next free register, and remove it from the pool. If the pool is already empty, we need to spill.
   3. Add i to active, sorted by increasing end point.
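The three steps can be sketched in Python as follows. The slide does not say which interval to spill when the pool is empty; this sketch simply spills the incoming interval, which is an assumption of the example, as are the function name `linear_scan` and the register names.

```python
def linear_scan(intervals, registers):
    """intervals: list of (name, start, end). registers: list of register names.
    Returns (assignment, spilled)."""
    free = list(registers)
    active = []                       # (end, name), kept sorted by end point
    assignment, spilled = {}, []
    for name, start, end in sorted(intervals, key=lambda iv: iv[1]):
        # Step 3.1: expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(assignment[old])
        # Step 3.2: take a free register, or spill if none is left
        # (assumption: the incoming interval is the one spilled).
        if not free:
            spilled.append(name)
            continue
        assignment[name] = free.pop(0)
        # Step 3.3: keep active ordered by increasing end point.
        active.append((end, name))
        active.sort()
    return assignment, spilled

intervals = [("T1", 1, 5), ("T3", 1, 6), ("T2", 3, 4)]
print(linear_scan(intervals, ["r1", "r2"]))
print(linear_scan(intervals, ["r1", "r2", "r3"]))
```

With two registers T2 has to be spilled, because the coalesced interval (1,5) for T1 hides the gap in which T1 is actually dead. That is the price of using intervals instead of exact live ranges.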


Fixing problems with Jumps

1.     T1 := 0                 | T1 T3
2. L1: T2 := T1 + 1            | T2 T3
3.     T3 := T2 + T3           | T2 T3
4.     T1 := (2 * T2)          | T1 T3
5.     if T1 < 1000 GOTO L1    | T1 T3
6.     return T3

• To fix problems with jumps we break code into basic blocks.

• We use simple analysis within blocks (where the flow of control is simple)

• And more complex analysis between blocks.
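Splitting code into basic blocks is usually done by finding "leaders": the first instruction, every labelled instruction, and every instruction that follows a jump or return. The Python sketch below applies that rule to the six-statement program; the instruction encoding and the name `basic_blocks` are assumptions for this example.

```python
def basic_blocks(instrs):
    """instrs: list of (label_or_None, kind, target_or_None), where kind is
    'jump', 'cjump', 'return' or 'other'.
    Returns a list of blocks, each a list of instruction indices."""
    leaders = {0} if instrs else set()
    for i, (label, kind, _) in enumerate(instrs):
        if label is not None:
            leaders.add(i)                    # a jump target starts a block
        if kind in ("jump", "cjump", "return") and i + 1 < len(instrs):
            leaders.add(i + 1)                # so does whatever follows a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(instrs)]
    return [list(range(s, e)) for s, e in zip(starts, ends)]

program = [
    (None, "other",  None),     # 1. T1 := 0
    ("L1", "other",  None),     # 2. L1: T2 := T1 + 1
    (None, "other",  None),     # 3. T3 := T2 + T3
    (None, "other",  None),     # 4. T1 := (2 * T2)
    (None, "cjump",  "L1"),     # 5. if T1 < 1000 GOTO L1
    (None, "return", None),     # 6. return T3
]
print(basic_blocks(program))
```

Within each block the simple backward pass is exact; only the sets flowing across block boundaries need the iterative treatment.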