School of EECS, Peking University
“Advanced Compiler Techniques” (Fall 2011)
SSA
Guo, Yao


Content
• SSA Introduction
• Converting to SSA
• SSA Example
• SSAPRE

Reading
• Tiger Book: 19.1, 19.3
• Related Papers

Prelude
SSA (Static Single-Assignment): a program is said to be in SSA form iff
• each variable is statically defined exactly once, and
• each use of a variable is dominated by that variable's definition.

So, is straight-line code in SSA form?

Example
• In general, how do we transform an arbitrary program into SSA form?
• Does the definition of X2 dominate its use in the example?

[Figure: CFG with nodes 1, 2, and 4; X is defined on two paths and the definitions are merged by X3 ← φ(X1, X2) before x is used.]

SSA: Motivation
• Provide a uniform basis of an IR to solve a wide range of classical dataflow problems
• Encode both dataflow and control flow information
• An SSA form can be constructed and maintained efficiently
• Many SSA dataflow analysis algorithms are more efficient (have lower complexity) than their CFG counterparts.

Static Single-Assignment Form
Each variable has only one definition in the program text.

This single static definition can be in a loop and may be executed many times. Thus even in a program expressed in SSA, a variable can be dynamically defined many times.

Advantages of SSA
• Simpler dataflow analysis
• No need to use use-def/def-use chains, which require N×M space for N uses and M definitions
• SSA form relates in a useful way to dominance structures.
• Differentiates unrelated uses of the same variable
  • E.g. loop induction variables

SSA Form: An Example
SSA form
• Each name is defined exactly once
• Each use refers to exactly one name

What's hard
• Straight-line code is trivial
• Splits in the CFG are trivial
• Joins in the CFG are hard

Building SSA form
• Insert φ-functions at birth points
• Rename all values for uniqueness

[Figure: CFG with the definitions x ← 17 - 4, x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q and s ← w - x; which definition does each use see?]

Birth Points
Consider the flow of values in this example:

[Figure: CFG with the definitions x ← 17 - 4, x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q and s ← w - x.]

The value x appears everywhere. It takes on several values.
• Here, x can be 13, y - z, or 17 - 4
• Here, it can also be a + b

If each value has its own name …
• Need a way to merge these distinct values
• Values are “born” at merge points

Birth Points
Consider the flow of values in this example:

[Figure: CFG with the definitions x ← 17 - 4, x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q and s ← w - x. Three join points are annotated:]
• New value for x here: 17 - 4 or y - z
• New value for x here: 13 or (17 - 4 or y - z)
• New value for x here: a + b or (13 or (17 - 4 or y - z))

Birth Points
Consider the value flow below:

[Figure: CFG with the definitions x ← 17 - 4, x ← a + b, x ← y - z, x ← 13 and the uses z ← x * q and s ← w - x, with the three join points marked.]

These are all birth points for values.
• All birth points are join points
• Not all join points are birth points
• Birth points are value-specific …

Review
SSA form
• Each name is defined exactly once
• Each use refers to exactly one name

What's hard
• Straight-line code is trivial
• Splits in the CFG are trivial
• Joins in the CFG are hard

Building SSA form
• Insert φ-functions at birth points
• Rename all values for uniqueness

A φ-function is a special kind of copy that selects one of its parameters.

The choice of parameter is governed by the CFG edge along which control reached the current block.

Real machines do not implement a φ-function directly in hardware (not yet!).

[Figure: y1 ← … and y2 ← … on two incoming paths, merged by y3 ← φ(y1, y2).]
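The selection behaviour of a φ-function can be sketched in a few lines. This is only an illustration of the semantics described above, not how a compiler stores φ-functions; the predecessor names `left` and `right` and the values 10 and 20 are made up for the example.

```python
def phi(incoming_edge, operands):
    """Return the operand tied to the CFG edge along which control arrived."""
    return operands[incoming_edge]

# y3 <- phi(y1, y2): one operand per predecessor block (names are hypothetical).
y1, y2 = 10, 20
operands = {"left": y1, "right": y2}

y3_from_left = phi("left", operands)    # control came from the block defining y1
y3_from_right = phi("right", operands)  # control came from the block defining y2
```

The dictionary keyed by predecessor makes the point of the slide explicit: the φ result depends only on the incoming edge, never on a run-time test of the values.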

Use-def Dependencies in Non-straight-line Code
• Many uses to many defs
• Overhead in representation
• Hard to manage

[Figure: three definitions of a (a = …) on separate paths reach three uses of a below a join, so every use has a use-def edge to every definition.]

Factoring Operator φ
Factoring: when multiple edges cross a join point, create a common node φ that all edges must pass through.
• Number of edges reduced from 9 to 6
• A φ is regarded as a def (its parameters are uses)
• Many uses to 1 def
• Each def dominates all its uses

[Figure: the three definitions of a feed a = φ(a, a, a) at the join; the three uses of a read the φ result.]
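The edge count quoted on this slide follows from a product-versus-sum argument. As a worked check, assuming N uses below the join and M definitions above it (N = M = 3 in the figure):

```latex
\underbrace{N \times M}_{\text{direct use-def edges}} = 3 \times 3 = 9
\qquad \text{versus} \qquad
\underbrace{M + N}_{\text{edges through one } \phi} = 3 + 3 = 6
```

The gap widens as N and M grow, which is the space argument made earlier for avoiding explicit use-def chains.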

Rename to represent use-def edges

[Figure: the definitions a1 = …, a2 = …, a3 = … feed a4 = φ(a1, a2, a3); all three uses read a4.]

• No longer necessary to represent the use-def edges explicitly

SSA Form in Control-Flow Path Merges
B1: b ← M[x]; a ← 0
B2: if b < 4
B3: a ← b
B4: c ← a + b

Is this code in SSA form?

No: two definitions of a (in B1 and B3) reach the use in B4.

How can we transform this code into SSA form?

We can create two versions of a, one for B1 and another for B3.

SSA Form in Control-Flow Path Merges
B1: b ← M[x]; a1 ← 0
B2: if b < 4
B3: a2 ← b
B4: c ← a? + b

But which version should we use in B4 now?

We define a fictional function φ that “knows” which control path was taken to reach the basic block B4:

φ(a1, a2) = a1 if we arrive at B4 from B2
φ(a1, a2) = a2 if we arrive at B4 from B3

SSA Form in Control-Flow Path Merges
B1: b ← M[x]; a1 ← 0
B2: if b < 4
B3: a2 ← b
B4: a3 ← φ(a2, a1); c ← a3 + b

But which version should we use in B4 now?

We define a fictional function φ that “knows” which control path was taken to reach the basic block B4:

φ(a2, a1) = a1 if we arrive at B4 from B2
φ(a2, a1) = a2 if we arrive at B4 from B3

A Loop Example
Original code:
  a ← 0
  loop: b ← a + 1
        c ← c + b
        a ← b * 2
        if a < N goto loop
  return

SSA form:
  a1 ← 0
  loop: a3 ← φ(a1, a2)
        b1 ← φ(b0, b2)
        c2 ← φ(c0, c1)
        b2 ← a3 + 1
        c1 ← c2 + b2
        a2 ← b2 * 2
        if a2 < N goto loop
  return

Note: only a and c are used in the loop body before they are redefined. b is redefined right at the beginning!

The φ-function φ(b0, b2) is not necessary because its result is never used. But the phase that generates φ-functions does not know it. Unnecessary φ-functions are eliminated by dead code elimination.

The φ Function
How can we implement a φ function that “knows” which control path was taken?

Answer 1: We don't! The φ function is used only to connect uses to definitions during optimization, but is never implemented.

Answer 2: If we must execute the φ function, we can implement it by inserting MOVE instructions in all control paths.

Criteria for Inserting φ Functions
We could insert one φ function for each variable at every join point (a point in the CFG with more than one predecessor). But that would be wasteful.

What should our criteria be for inserting a φ function for a variable a at node z of the CFG?

Intuitively, we should add a φ function if there are two definitions of a that can reach the point z through distinct paths.

A naïve method
• Simply introduce a φ-function at each “join” point in the CFG
• But we already pointed out that this is inefficient: too many useless φ-functions may be introduced!
• What is a good algorithm to introduce only the right number of φ-functions?

Path-Convergence Criterion
Insert a φ function for a variable a at a node z if all the following conditions are true:
1. There is a block x that defines a
2. There is a block y ≠ x that defines a
3. There is a non-empty path Pxz from x to z
4. There is a non-empty path Pyz from y to z
5. Paths Pxz and Pyz don't have any nodes in common other than z
6. The node z does not appear within both Pxz and Pyz prior to the end, but it might appear in one or the other.

The start node contains an implicit definition of every variable.

Iterated Path-Convergence Criterion
The φ function itself is a definition of a. Therefore the path-convergence criterion is a set of equations that must be satisfied.

while there are nodes x, y, z satisfying conditions 1-6
      and z does not contain a φ function for a
do insert a ← φ(a, a, …, a) at node z

This algorithm is extremely costly, because it requires the examination of every triple of nodes x, y, z and every path leading from x and y.

Can we do better?

Concept of Dominance Frontiers
An intuitive view

[Figure: the region of blocks dominated by bb1, extending down to bbn. The border between dominated and non-dominated blocks is the dominance frontier.]

Dominance Frontier
• The dominance frontier DF(x) of a node x is the set of all nodes z such that x dominates a predecessor of z, without strictly dominating z.
• Recall: if x dominates y and x ≠ y, then x strictly dominates y
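The definition can be turned directly into a (deliberately slow) computation. The sketch below assumes a CFG given as a successor dictionary in which every node is reachable from the entry; the six-node graph and the function names are illustrative, not taken from the slides.

```python
def dominators(succ, entry):
    """Iterate dom[n] = {n} | intersection of dom[p] over predecessors p."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom, pred

def dominance_frontier(succ, entry):
    """DF(x) = {z : x dominates a predecessor of z, x does not strictly dominate z}."""
    dom, pred = dominators(succ, entry)
    df = {n: set() for n in succ}
    for x in succ:
        for z in succ:
            strictly_dominates = x in dom[z] and x != z
            if not strictly_dominates and any(x in dom[p] for p in pred[z]):
                df[x].add(z)
    return df

# Hypothetical CFG: 1 -> 2, 2 -> {3, 4}, 3 -> 5, 4 -> 5, 5 -> {2, 6}
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
df = dominance_frontier(cfg, 1)
```

In this graph node 2 ends up in its own frontier because of the back edge 5 → 2: node 2 dominates the predecessor 5 but does not strictly dominate itself.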

Calculate the Dominance Frontier
How do we determine the dominance frontier of node 5?

[Figure: CFG with nodes 1 to 13.]

An intuitive way:
1. Determine the dominance region of node 5: {5, 6, 7, 8}
2. Determine the targets of edges crossing out of the dominance region of node 5

These targets are the dominance frontier of node 5:

DF(5) = {4, 5, 12, 13}

NOTE: node 5 is in DF(5) in this case. Why?

Algorithm: Converting to SSA
Big picture: translation to SSA form is done in 3 steps
1. The dominance frontier mapping is constructed from the control flow graph.
2. Using the dominance frontiers, the locations of the φ-functions for each variable in the original program are determined.
3. The variables are renamed by replacing each mention of an original variable V by an appropriate mention of a new variable Vi

CFG
• A control flow graph G = (V, E)
• Set V contains distinguished nodes ENTRY and EXIT
  • Every node is reachable from ENTRY
  • EXIT is reachable from every node in G
  • ENTRY has no predecessors
  • EXIT has no successors
• Notation: predecessor, successor, path

Dominance Relation
• If X appears on every path from ENTRY to Y, then X dominates Y.
• Dominance relation is both reflexive and transitive.
• idom(Y): immediate dominator of Y
• Dominator Tree
  • ENTRY is the root
  • Any node Y other than ENTRY has idom(Y) as its parent
  • Notation: parent, child, ancestor, descendant
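As a rough illustration of these definitions, dominator sets can be computed by a simple fixed-point iteration and idom read off from them. This is a minimal sketch, assuming every node is reachable from ENTRY; the diamond-shaped edges below are an assumed example over the node names ENTRY, a, b, c, d, EXIT, and faster algorithms exist.

```python
def dominators(succ, entry):
    """dom[n] = set of nodes that appear on every path from entry to n."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def immediate_dominators(dom, entry):
    """idom(n) is the strict dominator of n closest to n.

    Strict dominators of n form a chain, so the closest one is the one
    with the largest dominator set of its own.
    """
    return {n: max(ds - {n}, key=lambda d: len(dom[d]))
            for n, ds in dom.items() if n != entry}

# Assumed diamond CFG: ENTRY -> a, a -> {b, c}, b -> d, c -> d, d -> EXIT
cfg = {"ENTRY": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
       "d": ["EXIT"], "EXIT": []}
dom = dominators(cfg, "ENTRY")
idom = immediate_dominators(dom, "ENTRY")
```

Reading `idom` as a parent map gives the dominator tree: here d hangs directly under a, because neither b nor c lies on every path to d.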

Dominator Tree Example

[Figure: a CFG over the nodes ENTRY, a, b, c, d, EXIT (left) and its dominator tree DT (right).]


Dominance Frontier
• Dominance Frontier DF(X) for node X
  • Set of nodes Y such that
    • X dominates a predecessor of Y
    • X does not strictly dominate Y
• Equation 1:
  $DF(X) = \{\, Y \mid \exists P \in Pred(Y) : X \ \mathrm{dom}\ P \ \text{and}\ X \ \mathrm{!sdom}\ Y \,\}$
• Equation 2:
  $DF(X) = DF_{local}(X) \cup \bigcup_{Z \in Children(X)} DF_{up}(Z)$
  $DF_{local}(X) = \{\, Y \in Succ(X) \mid X \ \mathrm{!sdom}\ Y \,\} = \{\, Y \in Succ(X) \mid idom(Y) \neq X \,\}$
  $DF_{up}(Z) = \{\, Y \in DF(Z) \mid idom(Z) \ \mathrm{!sdom}\ Y \,\}$
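Equation 2 suggests a bottom-up walk over the dominator tree. The following is a sketch under assumptions: the CFG and its idom map are supplied by hand (both are illustrative), and the test idom(Y) ≠ X stands in for "X does not strictly dominate Y", which is valid for the candidates considered here (successors of X and members of DF(Z) for children Z of X).

```python
def dominance_frontiers(succ, idom, entry):
    """Compute DF(X) = DF_local(X) | union of DF_up(Z) over children Z of X."""
    children = {n: [] for n in succ}
    for node, parent in idom.items():
        children[parent].append(node)
    df = {n: set() for n in succ}

    def visit(x):
        for z in children[x]:          # children first: DF(Z) must be known
            visit(z)
        for y in succ[x]:              # DF_local(X)
            if idom.get(y) != x:
                df[x].add(y)
        for z in children[x]:          # DF_up(Z), where idom(Z) == X
            for y in df[z]:
                if idom.get(y) != x:
                    df[x].add(y)

    visit(entry)
    return df

# Hypothetical CFG and its immediate dominators (the entry node has none).
cfg = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: [2, 6], 6: []}
idom = {2: 1, 3: 2, 4: 2, 5: 2, 6: 5}
df = dominance_frontiers(cfg, idom, 1)
```

Each CFG edge and each frontier member is examined a bounded number of times, which is why this formulation is preferred over testing Equation 1 for every pair of nodes.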

Dominance Frontier
• How do we prove that Equation 1 and Equation 2 agree?
  • It is easy to see that ⇒ is correct
  • We still have to show that everything in DF(X) has been accounted for.
  • Suppose Y ∈ DF(X), and let U → Y be an edge such that X dominates U but doesn't strictly dominate Y.
  • If U = X, then Y ∈ DF_local(X)
  • If U ≠ X, then there exists a path from X to U in the dominator tree, which implies there exists a child Z of X that dominates U. Z doesn't strictly dominate Y because X doesn't strictly dominate Y. So Y ∈ DF_up(Z).

φ-function and Dominance Frontier
• Intuition behind dominance frontier
• Y ∈ DF(X) means:
  • Y has multiple predecessors
  • X dominates one of them, say U; U inherits everything defined in X
  • Reaching definitions of Y come from U and other predecessors
  • So Y is exactly the place where a φ-function is needed

Control Dependences and Dominance Frontier
• A CFG node Y is control dependent on a CFG node X if both of the following hold:
  • There is a nonempty path p: X →+ Y such that Y postdominates every node after X on p.
  • The node Y doesn't strictly postdominate the node X.
• If X appears on every path from Y to EXIT, then X postdominates Y.

Control Dependences and Dominance Frontier
• In other words, there is some edge from X that definitely causes Y to execute, and there is also some path from X that avoids executing Y.

RCFG
• The reverse control flow graph RCFG has the same nodes as CFG, but has edge Y → X for each edge X → Y in CFG.
• Entry and Exit are also reversed.
• The postdominator relation in CFG is the dominator relation in RCFG.
• Let X and Y be nodes in CFG. Then Y is control dependent on X in CFG iff X ∈ DF(Y) in RCFG.
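The last bullet can be checked mechanically: reverse the edges, compute dominators from EXIT, and take dominance frontiers in the reversed graph. This is a small sketch, assuming EXIT is reachable from every node; the diamond CFG and function names are illustrative only.

```python
def dominators(succ, entry):
    """Dominator sets by fixed-point iteration; also returns predecessors."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in pred[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom, pred

def control_dependence(succ, exit_node):
    """Return cd[Y] = set of nodes X such that Y is control dependent on X."""
    rsucc = {n: [] for n in succ}            # RCFG: edge Y -> X per CFG edge X -> Y
    for x, targets in succ.items():
        for y in targets:
            rsucc[y].append(x)
    pdom, rpred = dominators(rsucc, exit_node)   # dominators in RCFG = postdominators
    cd = {n: set() for n in succ}
    for y in succ:                           # cd[y] is DF(y) computed in the RCFG
        for x in succ:
            strictly = y in pdom[x] and y != x
            if not strictly and any(y in pdom[p] for p in rpred[x]):
                cd[y].add(x)
    return cd

# Assumed diamond CFG: the branch at "a" decides whether b or c runs.
cfg = {"ENTRY": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
       "d": ["EXIT"], "EXIT": []}
cd = control_dependence(cfg, "EXIT")
```

Here b and c are each control dependent on a, while d is not: d postdominates a, so no choice made at a can avoid executing d.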

SSA Construction: Place φ Functions
For each variable V
    Add all nodes with assignments to V to worklist W
    While W is not empty do
        Remove a node X from W
        For each Y in DF(X) do
            If no φ for V has been added in Y then
                Place (V = φ(V, …, V)) at Y
                If Y has not been added to W before, add Y to W
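The worklist above can be sketched as follows. The function name and the input encoding (definition sites per variable, dominance frontier per node) are assumptions for illustration; the data describes the seven-block loop example used later in these slides, with dominance frontiers worked out by hand from the assumed edges B1→B2, B2→{B3, B4}, B3→{B5, B6}, B5→B7, B6→B7, B7→B2.

```python
def place_phis(defsites, df):
    """Return, for each variable, the set of nodes that need a phi-function.

    defsites: variable -> set of nodes containing an assignment to it
    df:       node -> dominance frontier of that node
    """
    phis = {}
    for var, sites in defsites.items():
        has_phi = set()
        ever_on_worklist = set(sites)
        worklist = list(sites)
        while worklist:
            x = worklist.pop()
            for y in df[x]:
                if y not in has_phi:
                    has_phi.add(y)                 # place var = phi(var, ..., var) at y
                    if y not in ever_on_worklist:  # the phi is itself a new definition
                        ever_on_worklist.add(y)
                        worklist.append(y)
        phis[var] = has_phi
    return phis

df = {1: set(), 2: {2}, 3: {2}, 4: set(), 5: {7}, 6: {7}, 7: {2}}
defsites = {"i": {1}, "j": {1, 5, 6}, "k": {1, 5, 6}}
phis = place_phis(defsites, df)
```

The result places φ-functions for j and k in blocks 2 and 7 and none for i, which matches the φ-functions shown in the worked example.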

SSA Construction: Rename Variables
• Rename from the ENTRY node recursively
• For node X
  • For each assignment (V = …) in X
    • Rename any use of V with the TOS (top of stack) of the rename stack
    • Push the new name Vi on the rename stack
    • i = i + 1
  • Rename all the φ operands through successor edges
  • Recursively rename all child nodes in the dominator tree
  • For each assignment (V = …) in X
    • Pop Vi in X from the rename stack
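A compact version of this renaming walk is sketched below. The statement encoding (dictionaries with `var`, `op`, `args`) and the four-block diamond are assumptions made for the example; φ-functions are taken to be already placed, and `V0` stands for the implicit definition at ENTRY.

```python
def rename(blocks, succ, dom_children, entry, variables):
    """Rename definitions and uses in place, walking the dominator tree."""
    counter = {v: 0 for v in variables}
    stack = {v: [v + "0"] for v in variables}   # v0: implicit definition at ENTRY

    def visit(b):
        pushed = []
        for stmt in blocks[b]:
            if stmt["op"] != "phi":
                # Ordinary uses see the current top-of-stack name.
                stmt["args"] = [stack[a][-1] if a in stack else a
                                for a in stmt["args"]]
            v = stmt["var"]
            counter[v] += 1
            stmt["dest"] = f"{v}{counter[v]}"
            stack[v].append(stmt["dest"])
            pushed.append(v)
        for s in succ[b]:
            # Fill in the phi operand that corresponds to the edge b -> s.
            for stmt in blocks[s]:
                if stmt["op"] == "phi":
                    stmt["args"][b] = stack[stmt["var"]][-1]
        for child in dom_children[b]:
            visit(child)
        for v in pushed:                        # undo this block's definitions
            stack[v].pop()

    visit(entry)
    return blocks

blocks = {
    "B1": [{"var": "a", "op": "const", "args": [0]}],
    "B2": [{"var": "a", "op": "add", "args": ["a", 1]}],
    "B3": [],
    "B4": [{"var": "a", "op": "phi", "args": {}},
           {"var": "c", "op": "add", "args": ["a", 5]}],
}
succ = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
dom_children = {"B1": ["B2", "B3", "B4"], "B2": [], "B3": [], "B4": []}
rename(blocks, succ, dom_children, "B1", ["a", "c"])
```

After the call, the φ in B4 reads a2 along the edge from B2 and a1 along the edge from B3, and the use in `c = a + 5` refers to the φ result a3.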

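Rename Example

[Figure: a CFG in the middle of renaming. The first definition has become a1 = … and its use a1 + 5; the remaining statements (a = …, a + 5, a = …, a = φ(a1, a), a + 5) are not yet renamed. The rename stack holds a0 at the bottom and a1 at the top (TOS).]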

Converting Out of SSA
• Eventually, a program must be executed.
• The φ-functions have precise semantics, but they are generally not represented in existing target machines.

Converting Out of SSA
• Naively, a k-input φ-function at the entrance to node X can be replaced by k ordinary assignments, one at the end of each control predecessor of X.
• Inefficient object code can be generated.
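The naive replacement can be sketched as below. The tuple encoding of statements is an assumption for illustration, blocks are taken to carry no explicit branch instruction (a real pass would insert the copy before the terminator), and the known pitfalls of this scheme are not addressed.

```python
def eliminate_phis(blocks):
    """Replace each phi with one copy at the end of every predecessor block.

    A phi is encoded as ("phi", dest, {predecessor: source}); any other tuple
    is an ordinary statement and is left untouched.
    """
    for name, stmts in blocks.items():
        for stmt in [s for s in stmts if s[0] == "phi"]:
            _, dest, operands = stmt
            for pred, src in operands.items():
                blocks[pred].append(("move", dest, src))
        blocks[name] = [s for s in stmts if s[0] != "phi"]
    return blocks

# The path-merge example from earlier: a3 <- phi(a2, a1) in B4.
blocks = {
    "B2": [],
    "B3": [("move", "a2", "b")],
    "B4": [("phi", "a3", {"B3": "a2", "B2": "a1"}),
           ("add", "c", "a3", "b")],
}
eliminate_phis(blocks)
```

Each predecessor now ends with its own `a3 ← …` copy, so B4 can read a3 without knowing which path was taken. The extra copies are the inefficiency the slide mentions.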

Dead Code Elimination
• Where does dead code come from?
  • Assignments without any use
• Dead code elimination method
  • Initially all statements are marked dead
  • Some statements need to be marked live because of certain conditions
  • Marking these statements can cause others to be marked live
  • After the worklist is empty, truly dead code can be removed

SSA Example
• Dead Code Elimination Intuition
  • Because there is only one definition for each variable, if the list of uses of the variable is empty, the definition is dead.
  • When a statement v ← x ⊕ y is eliminated because v is dead, this statement must be removed from the list of uses of x and y. This might cause those definitions to become dead.
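The use-list idea can be sketched with a worklist. The encoding is an assumption for the example: each SSA name maps to the names its defining statement reads, and statements with side effects (here a pseudo-name `ret` for a return) are listed separately so they are never removed.

```python
def eliminate_dead_code(defs, has_side_effect=()):
    """Remove definitions whose list of uses is empty, repeatedly.

    defs: SSA name -> list of SSA names read by its (single) defining statement
    """
    uses = {v: set() for v in defs}          # uses[x] = statements that read x
    for v, operands in defs.items():
        for x in operands:
            if x in uses:
                uses[x].add(v)
    removed = set()
    worklist = list(defs)
    while worklist:
        v = worklist.pop()
        if v in removed or uses[v] or v in has_side_effect:
            continue
        removed.add(v)                       # v is dead: delete its statement
        for x in defs[v]:
            if x in uses:
                uses[x].discard(v)           # ... and drop it from x's use list
                worklist.append(x)           # x may have become dead as well
    return {v: ops for v, ops in defs.items() if v not in removed}

# Hypothetical program: z is never used, which in turn makes y dead.
defs = {"x": [], "y": ["x"], "z": ["y"], "w": ["x"], "ret": ["w"]}
live = eliminate_dead_code(defs, has_side_effect={"ret"})
```

Removing z empties the use list of y, so y is removed on a later worklist step; x survives because w still reads it.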

SSA Example
• Simple Constant Propagation Intuition
  • If there is a statement v ← c, where c is a constant, then all uses of v can be replaced by c.
  • A φ function of the form v ← φ(c1, c2, …, cn) where all ci are identical can be replaced by v ← c.
  • Using a worklist algorithm on a program in SSA form, we can perform constant propagation in linear time.
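A worklist version of these rules might look like the sketch below. The statement encoding is assumed for illustration: each SSA name maps to `(kind, args)`, where string arguments are SSA names and anything else is a literal constant. A copy whose operand has become a constant is treated like `v ← c`, and no arithmetic folding is attempted.

```python
def propagate_constants(stmts):
    """Apply the v <- c and phi(c, ..., c) rules until nothing changes."""
    stmts = {v: (kind, list(args)) for v, (kind, args) in stmts.items()}
    users = {v: [] for v in stmts}           # users[x] = statements reading x
    for v, (_, args) in stmts.items():
        for a in args:
            if a in users:
                users[a].append(v)
    worklist = list(stmts)
    while worklist:
        v = worklist.pop()
        kind, args = stmts[v]
        all_literal = bool(args) and not any(isinstance(a, str) for a in args)
        if kind in ("phi", "copy") and all_literal and len(set(args)) == 1:
            kind, args = "const", [args[0]]  # phi(c, ..., c) or copy of c
            stmts[v] = (kind, args)
        if kind == "const":
            for u in users[v]:               # replace every use of v by c
                ukind, uargs = stmts[u]
                stmts[u] = (ukind, [args[0] if a == v else a for a in uargs])
                worklist.append(u)
            users[v] = []
    return stmts

# The loop example from these slides, after conversion to SSA form.
program = {
    "i1": ("const", [1]), "j1": ("const", [1]), "k1": ("const", [0]),
    "j2": ("phi", ["j4", "j1"]), "k2": ("phi", ["k4", "k1"]),
    "j3": ("copy", ["i1"]), "k3": ("add", ["k2", 1]),
    "j5": ("copy", ["k2"]), "k5": ("add", ["k2", 2]),
    "j4": ("phi", ["j3", "j5"]), "k4": ("phi", ["k3", "k5"]),
}
result = propagate_constants(program)
```

Each use edge is rewritten at most once, which is where the linear-time bound comes from. On this input the sketch yields j3 ← 1, j2 ← φ(j4, 1), k2 ← φ(k4, 0), and j4 ← φ(1, j5).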

SSA Example
i=1; j=1; k=0;
while (k<100) {
  if (j<20) { j=i; k=k+1; }
  else      { j=k; k=k+2; }
}
return j;

CFG:
B1: i ← 1; j ← 1; k ← 0
B2: if k < 100
B3: if j < 20
B4: return j
B5: j ← i; k ← k + 1
B6: j ← k; k ← k + 2
B7: (join of B5 and B6)

SSA Example
B1: i ← 1; j ← 1; k1 ← 0
B2: if k < 100
B3: if j < 20
B4: return j
B5: j ← i; k3 ← k + 1
B6: j ← k; k5 ← k + 2
B7: (join of B5 and B6)

SSA Example
B1: i ← 1; j ← 1; k1 ← 0
B2: if k < 100
B3: if j < 20
B4: return j
B5: j ← i; k3 ← k + 1
B6: j ← k; k5 ← k + 2
B7: k4 ← φ(k3, k5)

SSA Example
B1: i ← 1; j ← 1; k1 ← 0
B2: k2 ← φ(k4, k1); if k < 100
B3: if j < 20
B4: return j
B5: j ← i; k3 ← k + 1
B6: j ← k; k5 ← k + 2
B7: k4 ← φ(k3, k5)

SSA Example
B1: i ← 1; j ← 1; k1 ← 0
B2: k2 ← φ(k4, k1); if k2 < 100
B3: if j < 20
B4: return j
B5: j ← i; k3 ← k2 + 1
B6: j ← k; k5 ← k2 + 2
B7: k4 ← φ(k3, k5)

SSA Example
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4, j1); k2 ← φ(k4, k1); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← i1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

SSA Example: Constant Propagation
Before:
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4, j1); k2 ← φ(k4, k1); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← i1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

After:
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← 1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

SSA Example: Dead Code Elimination
Before:
B1: i1 ← 1; j1 ← 1; k1 ← 0
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← 1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

After (the definitions in B1 are gone):
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← 1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

SSA Example: Constant Propagation and Dead Code Elimination
Before:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← 1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(j3, j5); k4 ← φ(k3, k5)

After:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: j3 ← 1; k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(1, j5); k4 ← φ(k3, k5)

SSA Example: One Step Further
But block 6 is never executed! How can we find this out, and simplify the program?

SSA conditional constant propagation finds the least fixed point for the program and allows further elimination of dead code.

B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(1, j5); k4 ← φ(k3, k5)

SSA Example: Dead Code Elimination
Before:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B3: if j2 < 20
B4: return j2
B5: k3 ← k2 + 1
B6: j5 ← k2; k5 ← k2 + 2
B7: j4 ← φ(1, j5); k4 ← φ(k3, k5)

After:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← φ(1); k4 ← φ(k3)

SSA Example: Single-Argument φ Function Elimination
Before:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← φ(1); k4 ← φ(k3)

After:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← 1; k4 ← k3

SSA Example: Constant and Copy Propagation
Before:
B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← 1; k4 ← k3

After:
B2: j2 ← φ(1, 1); k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← 1; k4 ← k3

SSA Example: More Dead Code
Before:
B2: j2 ← φ(1, 1); k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1
B7: j4 ← 1; k4 ← k3

After:
B2: j2 ← φ(1, 1); k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1

SSA Example: More φ Function Simplification
Before:
B2: j2 ← φ(1, 1); k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1

After:
B2: j2 ← 1; k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1

SSA Example: More Constant Propagation
Before:
B2: j2 ← 1; k2 ← φ(k3, 0); if k2 < 100
B4: return j2
B5: k3 ← k2 + 1

After:
B2: j2 ← 1; k2 ← φ(k3, 0); if k2 < 100
B4: return 1
B5: k3 ← k2 + 1

SSA Example: Ultimate Dead Code Elimination
Before:
B2: j2 ← 1; k2 ← φ(k3, 0); if k2 < 100
B4: return 1
B5: k3 ← k2 + 1

After:
B4: return 1

SSAPRE

Kennedy et al., Partial redundancy elimination in SSA form, ACM TOPLAS 1999

SSAPRE: Motivation
• Traditional data flow analyses based on bit-vectors do not interface well with the SSA representation

Traditional Partial Redundancy Elimination
Before PRE (SSA form):
B1: a ← x*y
B2: (no computation of x*y)
B3: b ← x*y

After PRE (not SSA form!):
B1: t ← x*y; a ← t
B2: t ← x*y
B3: b ← t

SSAPRE: Motivation (Cont.)
• Traditional PRE needs a postpass transform to turn the results into SSA form again.
• SSA is based on variables, not on expressions

SSAPRE: Motivation (Cont.)
• Representative occurrence
  • Given a partially redundant occurrence E0, we define the representative occurrence for E0 as the nearest to E0 among all occurrences that dominate E0.
  • Unique and well-defined

FRG (Factored Redundancy Graph)

[Figure: two versions of a graph with four occurrences of an expression E, drawn with control flow edges and redundancy edges. Without factoring, redundancy edges run directly between occurrences. Factored, they pass through a node E = Φ(E, E, ⊥).]

FRG (Factored Redundancy Graph)
Assume E = x*y.

[Figure: the factored graph with E = Φ(E, E, ⊥) next to the corresponding code: t0 ← x*y and t1 ← x*y above the join, t2 ← φ(t0, t1, t3) at the join, and two uses of t2 below it.]

Note: this is in SSA form

Observation
• Every use-def edge of the temporary corresponds directly to a redundancy edge for the expression.

Intuition for SSAPRE
• Construct the FRG for each expression E
• Refine redundancy edges to form the use-def relation for each expression E's temporary
• Transform the program
  • For a definition, replace with t ← E
  • For a use, replace with ← t
  • Sometimes an expression also needs to be inserted

Main Steps
• Initialization: scan the whole program, collect all the occurrences, and build a worklist of expressions
• Then, for each entry in the worklist:
  • PHI placement: find where the PHI nodes of the FRG must be placed.
  • Renaming: assign appropriate "redundancy classes" to all occurrences
  • Analyze: compute various flags in PHI nodes
    • This conceptually is composed of two data flow analysis passes, which in practice only scan the PHI nodes in the FRG, not the whole code, so they are not that heavy.
  • Finalize: make the FRG exactly like the use-def graph for the temporary introduced
  • Code motion: actually update the code using the FRG.

An Example
Before SSAPRE:
B1: a1 ← …
B2: a2 ← φ(a4, a1); … a2 + b …
B3: a3 ← …
B4: a4 ← φ(a2, a3); … a4 + b …
B5: exit

After SSAPRE:
B1: a1 ← …; t1 ← a1 + b1
B2: t2 ← φ(t4, t1); a2 ← φ(a4, a1); … t2 …
B3: a3 ← …; t3 ← a3 + b1
B4: t4 ← φ(t2, t3); a4 ← φ(a2, a3); … t4 …
B5: exit

Benefits of SSAPRE
• SSAPRE algorithm is "sparse"
  • Original PRE using DFA operates globally.
  • SSAPRE operates on expressions individually, looking only at the specific nodes it needs.
• SSAPRE looks at the whole program (method) only once
  • when it scans the code to collect all expression occurrences.
  • Then, it operates on one expression at a time, looking only at its specific data structures
  • which, for each expression, are much smaller than the whole program, so the operations are fast.

Next Time
• Sample Midterm Review
• Midterm Exam (Nov 24th)
• Interprocedural Analysis
  • Reading: Dragon chapter 12