Structural Data-flow Analysis Algorithms: Allen-Cocke Interval Analysis

Comp 512, Spring 2011

Copyright 2011, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 512 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved.


Structural Data-flow Analysis

While strong arguments favor iterative data-flow analysis, you should be aware of the other methods proposed in the literature, in part because they offer insight into the problems and techniques of analysis, and in part because they are, in some situations, techniques that merit serious consideration.

• Structural Data-flow Algorithms
  - Interval analysis, T1-T2, Balanced-tree path compression: all of these follow a reduction-expansion discipline
  - Partitioned variable technique: operates on a variable-by-variable basis
  - These methods rely on structural analysis of the CFG to choose an evaluation order for the data-flow equations

• We will focus on Allen-Cocke Interval Analysis
  - F.E. Allen and J. Cocke, “A Program Data Flow Analysis Procedure”, Comm. ACM, 19(3), March 1976, pp. 137-147
  - Kennedy’s “Survey of Data Flow Analysis Techniques”


Background Material

Definitions

• Interval: an interval I(h) in a control-flow graph is a maximal, single-entry subgraph in which h is the only entry to I(h) and every closed path (cycle) in I(h) contains h.

• Interval header: the node h in I(h) is the sole entry point of the interval and is called the interval header.

By selecting the proper set of interval headers, a CFG can be partitioned into a unique set of disjoint intervals.


Finding Intervals

A simple algorithm finds the “right” set of intervals & headers

Order of H and each I(h) is important — “interval order”

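H ← { n0 }
while (H ≠ Ø) do
    remove next h from H                     /* create interval I(h) */
    I(h) ← { h }

    /* build the interval */
    while ∃ n ∈ N s.t. n ∉ I(h), n ≠ n0, and preds(n) ⊆ I(h)
        I(h) ← I(h) + n

    /* find next headers */
    while ∃ n ∈ N s.t. n has not been added to H, n is in no interval, and ∃ m ∈ preds(n) s.t. m ∈ I(h)
        H ← H + n

The + operator appends to an ordered set, so H and each I(h) record the order in which nodes were added.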
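The algorithm above can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `find_intervals`, the successor-map representation, and the edge set `G1` are all hypothetical. The transcript does not preserve the edges of the example graph, so `G1` was chosen only because it reproduces the header/interval table of the example slide.

```python
from collections import deque

def find_intervals(succs, entry):
    """Partition a CFG into intervals.

    succs maps every node to a list of its successors. The result is a list
    of (header, members) pairs in interval order; members are listed in the
    order in which they joined the interval.
    """
    # Derive predecessor lists from the successor map.
    preds = {n: [] for n in succs}
    for m, outs in succs.items():
        for n in outs:
            preds[n].append(m)

    worklist = deque([entry])   # H, processed in the order headers were added
    queued = {entry}            # every node ever added to H
    placed = set()              # nodes already assigned to some interval
    intervals = []
    while worklist:
        h = worklist.popleft()
        members = [h]
        in_interval = {h}
        placed.add(h)
        # Build the interval: add n once all of its predecessors are inside.
        i = 0
        while i < len(members):
            for n in succs[members[i]]:
                if (n not in placed and n != entry
                        and all(p in in_interval for p in preds[n])):
                    members.append(n)
                    in_interval.add(n)
                    placed.add(n)
            i += 1
        # Find next headers: unplaced nodes with a predecessor in I(h).
        for m in members:
            for n in succs[m]:
                if n not in placed and n not in queued:
                    worklist.append(n)
                    queued.add(n)
        intervals.append((h, members))
    return intervals

# Hypothetical edge set for the eight-node example (an assumption).
G1 = {1: [2], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: [3, 7], 7: [8], 8: [7, 2]}
intervals = find_intervals(G1, 1)
```

With this assumed graph, `intervals` is `[(1, [1]), (2, [2]), (3, [3, 4, 5, 6]), (7, [7, 8])]`, which matches the table in the example.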


Example

[Figure: example control-flow graph with nodes 1 through 8]

Header   Interval
1        1
2        2
3        3, 4, 5, 6
7        7, 8

Notice that taking the intervals in order (1, 2, 3, 7) and following the order of creation within each interval produces a node order with the same effect as reverse postorder. I(3), which lists its nodes as 3, 4, 5, 6, shows this.


Interval Derived Graphs

Replacing each interval with a single node, we derive a new graph

[Figure: the original graph, nodes 1 through 8, beside the derived graph, nodes 1, 2, 9, and 10]

If we call the original graph our first graph, G1, then this graph is the second graph, G2.

Intervals of G2:

Header   Interval
1        1
2        2, 9, 10
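Collapsing intervals into nodes can be sketched as follows. The function `derive_graph`, the edge set `G1`, and the mapping of I(3) to node 9 and I(7) to node 10 are assumptions made for illustration; the transcript shows only the node numbers of G2, not which interval each new node stands for.

```python
def derive_graph(succs, intervals, names):
    """Collapse each interval to a single node and return the derived graph.

    intervals is a list of (header, members) pairs. names optionally renames
    the node that represents an interval; by default the header's name is kept.
    """
    # Map every original node to the node representing its interval.
    owner = {}
    for header, members in intervals:
        for n in members:
            owner[n] = names.get(header, header)

    derived = {names.get(h, h): [] for h, _ in intervals}
    for m, outs in succs.items():
        for n in outs:
            src, dst = owner[m], owner[n]
            # Edges inside an interval, including back edges to its header,
            # disappear; parallel edges between intervals are merged.
            if src != dst and dst not in derived[src]:
                derived[src].append(dst)
    return derived

# Hypothetical edge set and the interval partition from the example table.
G1 = {1: [2], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: [3, 7], 7: [8], 8: [7, 2]}
I1 = [(1, [1]), (2, [2]), (3, [3, 4, 5, 6]), (7, [7, 8])]
G2 = derive_graph(G1, I1, {3: 9, 7: 10})
```

Under these assumptions `G2` is `{1: [2], 2: [9], 9: [10], 10: [2]}`. Its intervals are {1} and {2, 9, 10}, in agreement with the table above.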


Continuing the process, …

[Figure: G2, nodes 1, 2, 9, and 10, reduces to G3, nodes 1 and 11]

Intervals of G3:

Header   Interval
1        1, 11

And, of course, G4 is a single node.


Derived Sequence of Interval Graphs

[Figure: the derived sequence of graphs G1, G2, G3, G4]

This sequence of graphs establishes that G1 is “reducible”: it reduces to a single node.

Allen & Cocke envision a system where we derive the set of interval graphs as a first step in analysis

We can then use the interval graphs to choose an evaluation order for the data-flow equations

The same interval graphs can be used to solve many data-flow problems


Computing Available Expressions

For interval analysis, we need to reformulate the equations in a straightforward way.

In the iterative framework, we used a single equation:

$\text{AVAIL}(b) = \bigcap_{x \in \text{preds}(b)} \big( \text{DEEXPR}(x) \cup ( \text{AVAIL}(x) \cap \overline{\text{EXPRKILL}(x)} ) \big)$

where preds(b) is the set of b’s predecessors in the control-flow graph.

In the interval analysis framework, we use two equations, one for nodes and one for edges:

$\text{AVAILATNODE}(n) = \bigcap_{e = (m,n) \in E} \text{AVAIL}(e)$

$\text{AVAIL}(e = (m,n)) = \text{DEEXPR}(m) \cup ( \text{AVAILATNODE}(m) \cap \overline{\text{EXPRKILL}(m)} )$
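The node and edge equations can be evaluated in a single pass over an acyclic region when the nodes are visited in an order that places every predecessor first. The sketch below is only an illustration under assumptions: the helper `solve_acyclic`, the four-node diamond (3 to 4 and 5, both to 6), the expression names, and the value available on entry to node 3 are all hypothetical.

```python
def solve_acyclic(order, preds, de_expr, expr_kill, avail_at_entry):
    """Evaluate the node and edge equations once, predecessors first.

    order lists the nodes so that each node follows all its predecessors;
    order[0] is the region's entry, whose AVAILATNODE value is supplied.
    """
    avail_at_node = {order[0]: set(avail_at_entry)}
    avail_on_edge = {}
    for n in order[1:]:
        incoming = []
        for m in preds[n]:
            # AVAIL(e), e = (m, n): expressions generated in m, plus those
            # available on entry to m that m does not kill.
            avail_on_edge[(m, n)] = de_expr[m] | (avail_at_node[m] - expr_kill[m])
            incoming.append(avail_on_edge[(m, n)])
        # AVAILATNODE(n): intersection over all edges entering n.
        avail_at_node[n] = set.intersection(*incoming)
    return avail_at_node, avail_on_edge

# Hypothetical data for a diamond-shaped region.
preds = {4: [3], 5: [3], 6: [4, 5]}
de_expr = {3: {"a+b"}, 4: {"c+d"}, 5: {"c+d"}, 6: set()}
expr_kill = {3: set(), 4: {"a+b"}, 5: set(), 6: set()}
at_node, on_edge = solve_acyclic([3, 4, 5, 6], preds, de_expr, expr_kill, {"x*y"})
```

Here `a+b` survives the path through node 5 but is killed in node 4, so the intersection at node 6 drops it: `at_node[6]` is `{"c+d", "x*y"}`.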


The Big Picture — Phase 1: Local to Global

Step 1: Compute initial sets DEEXPR and EXPRKILL for each node in G1

Step 2: Compute AVAIL for each interval exit edge and each internal edge that enters interval’s header

Step 3: Map constants from intervals in G1 to nodes in G2 and AVAIL sets from exit edges in G1 to corresponding edges in G2

Repeat steps 2 & 3 for successive derived graphs
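One possible way to picture the "constants" that Phase 1 carries from an interval to its node in the next graph is as a (gen, kill) pair that summarizes the effect of the interval on whatever is available at its header. The sketch below is an assumption about representation, not Allen and Cocke's own bookkeeping: `compose`, `meet`, and `apply_summary` are hypothetical helpers, the block data is invented, and the back edge into the header is ignored to keep the example short.

```python
def apply_summary(summary, x):
    """Apply f(X) = gen | (X - kill) to a set X."""
    gen, kill = summary
    return gen | (x - kill)

def compose(first, second):
    """Summary of executing `first` and then `second` along one path."""
    gen1, kill1 = first
    gen2, kill2 = second
    return gen2 | (gen1 - kill2), kill1 | kill2

def meet(a, b):
    """Summary of two paths that join; AVAIL intersects at a join point."""
    gen_a, kill_a = a
    gen_b, kill_b = b
    # An expression survives the join only if it survives both paths.
    return gen_a & gen_b, (kill_a - gen_a) | (kill_b - gen_b)

# Hypothetical (DEEXPR, EXPRKILL) pairs for the nodes of one interval.
blocks = {
    3: ({"a+b"}, set()),
    4: ({"c+d"}, {"a+b"}),
    5: ({"c+d"}, set()),
    6: (set(), set()),
}
via_4 = compose(blocks[3], blocks[4])      # path 3, 4
via_5 = compose(blocks[3], blocks[5])      # path 3, 5
exit_summary = compose(meet(via_4, via_5), blocks[6])
result = apply_summary(exit_summary, {"a+b", "x*y"})
```

The summary for the exit edge is `({"c+d"}, {"a+b"})`, so `result` is `{"c+d", "x*y"}`, the same answer that node-by-node evaluation of the edge equations gives for this data.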


The Big Picture — Phase 2: Global to Local

Step 1: Map AVAILATNODE sets from nodes in G4 to interval headers in G3

Step 2: Using the constants and AVAIL sets from phase 1, plus AVAILATNODE for interval headers, solve for AVAIL at each node and edge in the intervals

Repeat steps 1 & 2 on each earlier derived graph until we solve G1


Solving the Equations

Phase 1 & Phase 2 solve local equations on acyclic subgraphs

• Make a single pass, in interval order, over each interval in each derived graph
  - Interval order has the same effect as reverse postorder
  - Solver runs from innermost loop to outermost loop
  - Avoids some unproductive set operations: why compute outer loops when solving for inner loops?
  - Fewer total set operations than the iterative algorithm

• Number of derived graphs, k, is related to loop nesting
  - Hecht showed that k ≥ d(G)

• Graph manipulation & set mapping from Gi to Gk incur some costs


Interval Analysis — The Three Questions

• Termination
  - Interval construction halts (if graph is reducible; more later)
  - Phases 1 & 2 perform fixed steps on each derived graph

• Correctness
  - Allen & Cocke studied the graph-theoretic underpinnings
  - They don’t say much about the theory of the problems & frameworks
  - Clearly works for Kam-Ullman rapid problems (classic DFA)
  - More complex problems need a closure in cyclic intervals
    Result due to Rosen (1980-1982)
    Similar issue arises in other structural frameworks (Tarjan, GW)

• Speed
  - Interval depth determines the number of derived graphs
  - Number of set operations is comparable to iterative for rapid problems
  - Add closure computation for non-rapid problems

Interval sequence length ≥ d(G): Hecht & Ullman, 1975.


What About Procedures with Irreducible Graphs?

Most CFGs are reducible, but we must handle those that are not

• Allen & Cocke suggest “node splitting” & give an example

• Papers always make this transformation look easy

• Handling arbitrary irreducible graphs is hard
  - Interval analysis simplifies the issue by reducing all irreducible graphs to an interval like our example
  - Reducing the interval graph first makes the problem easier

[Figure: an irreducible graph on nodes 1, 2, and 3, and the same graph after node 3 is split into 3 and 3’]

On the irreducible graph, interval construction finds only single-node intervals.

After splitting, the interval {2,3} has a single entry.

Can change CFG for analysis without rewriting the code (unless change enables some optimization).

Alternative is to iterate over irreducible intervals.
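A quick way to check whether a graph is reducible is the T1-T2 method named earlier in the lecture, used here in place of building the derived sequence of interval graphs. The sketch below is a simple, unoptimized version under assumptions: `is_reducible` is a hypothetical helper, every node is assumed reachable from the entry, and both edge sets are guesses at the lost figure (the classic three-node irreducible shape, and one way of splitting node 3).

```python
def is_reducible(succs, entry):
    """Apply T1 and T2 until neither applies; reducible iff one node remains.

    T1 removes a self-loop. T2 merges a non-entry node that has exactly one
    predecessor into that predecessor.
    """
    g = {n: set(s) for n, s in succs.items()}   # private copy with edge sets
    changed = True
    while changed:
        changed = False
        # T1: drop every self-loop.
        for n in g:
            if n in g[n]:
                g[n].discard(n)
                changed = True
        # T2: merge one node that has a unique predecessor, then rescan.
        for n in list(g):
            if n == entry:
                continue
            preds = [m for m in g if n in g[m]]
            if len(preds) == 1:
                p = preds[0]
                g[p].discard(n)
                g[p] |= g.pop(n)   # p inherits n's outgoing edges
                changed = True
                break
    return len(g) == 1

# Assumed edges: node 1 enters the cycle {2, 3} at both 2 and 3.
irreducible = {1: [2, 3], 2: [3], 3: [2]}
# Assumed result of node splitting: the copy 3' takes the edge from node 1.
split = {1: [2, "3'"], 2: [3], 3: [2], "3'": [2]}
before, after = is_reducible(irreducible, 1), is_reducible(split, 1)
```

With these edges `before` is `False` and `after` is `True`: once node 3 is split, the cycle {2, 3} can only be entered through node 2.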


Perspective

Having the set of derived interval graphs is useful for other optimization purposes

• LICM, OSR, copy folding, & others work better inner to outer
  - Process most-frequent instances first
  - Build more accurate cost models for traces & such

• Global optimizations (PRE, RA) don’t benefit from loop structure
  - But, fast global DFA helps both of them …

• Extensive work on how to incrementally update interval-derived solutions to data-flow problems in response to changes in facts & in graph structure
  - Editing procedures with whole-program compilation
  - Carroll & Ryder, TOPLAS, Jan. 1987; Burke, TOPLAS, July 1990
  - Not easy to do incremental update in an iterative framework
    Delete one edge in a graph with complex cycles
    Now, find its former effects…


End of Lecture

Extra Slides Start Here


Background Material

Reaching Definitions:

• A definition d reaches a use u if there exists a path from d to u that does not redefine the name defined in d

$\text{REACHES}(n) = \emptyset, \ \forall \text{ nodes } n$

$\text{REACHES}(n) = \bigcup_{p \in \text{preds}(n)} \big( \text{DEDEF}(p) \cup ( \text{REACHES}(p) \cap \overline{\text{DEFKILL}(p)} ) \big)$

• To simplify the formulation of the algorithm, we will factor the system of equations into a set of available definitions on each edge and a set of reaching definitions for each block

$\text{REACHES}(n) = \bigcup_{e = (m,n) \in E} \text{AVAILDEFS}(e)$

$\text{AVAILDEFS}(e = (m,n)) = \text{DEDEF}(m) \cup ( \text{REACHES}(m) \cap \overline{\text{DEFKILL}(m)} )$
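For comparison, the factored equations can be solved with a plain round-robin iteration. Note that this is the iterative method, not interval analysis; it is shown only to make the equations concrete. The function `reaching_definitions`, the three-block graph, and the definition names `d1` and `d2` (two definitions of the same name) are hypothetical.

```python
def reaching_definitions(preds, de_def, def_kill):
    """Iterate the REACHES equations to a fixed point, starting from empty sets."""
    reaches = {n: set() for n in preds}
    changed = True
    while changed:
        changed = False
        for n in preds:
            new = set()
            for m in preds[n]:
                # AVAILDEFS on edge (m, n): definitions made in m, plus those
                # reaching m that m does not kill.
                new |= de_def[m] | (reaches[m] - def_kill[m])
            if new != reaches[n]:
                reaches[n] = new
                changed = True
    return reaches

# Hypothetical CFG: block 1 enters a loop formed by blocks 2 and 3.
preds = {1: [], 2: [1, 3], 3: [2]}
de_def = {1: {"d1"}, 2: set(), 3: {"d2"}}
def_kill = {1: {"d2"}, 2: set(), 3: {"d1"}}
reaches = reaching_definitions(preds, de_def, def_kill)
```

Both definitions reach block 2, `d1` from the loop entry and `d2` around the back edge, so `reaches[2]` is `{"d1", "d2"}`.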