program analysis - georgia institute of technologyorso/ubaeci/class02-staticanalysis.pdf · program...

29
•7/29/08 •1 Program Analysis Class #2 Readings The Program Dependence Graph and Its Use in Optimization Program slicing A Survey of Program Slicing Techniques Dragon book

Upload: others

Post on 07-Jun-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 1

Program Analysis

Class #2

Readings •  The Program Dependence Graph and Its

Use in Optimization

•  Program slicing •  A Survey of Program Slicing Techniques •  Dragon book

Page 2: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 2

Program Analysis

Data-Flow Analysis (wrap up)

Recall: Dominance (algorithm) Input: N, pred, entry Output: domin: n → {n}

•  domin(entry) = {entry} •  foreach n ∈ N - {entry}

•  domin(n) = N •  label: change = false •  foreach n ∈ N - {entry}

•  T = ∩p ∈ pred(n) domin(p) •  D = {n} ∪ T •  if D ≠ domin(n)

–  change = true –  domin(n) = D

•  if change = true goto label •  return domin

S1

S3

S4

S5

entry

exit

T

S2

S6 F

CFG

T F

Page 3: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 3

Dominance as a DF Problem Local information:

GEN[B] KILL[B]

By propagation: IN[B] OUT[B]

1.  Initialize IN[B], OUT[B] sets 2.  Iterate over all B until no changes

On each iteration, visit all B, and compute IN[B], OUT[B] as

OUT[B]

IN[B]

GEN[B] KILL[B] B

S1

S3

S4

S5

entry

exit

T

S2

S6 F

CFG

T F

Dominance as a DF Problem Local information:

GEN[B] = node itself KILL[B] = empty

By propagation: IN[B] = dominators of nodes in pred(B) OUT[B] = dominators of B

1.  Initialize IN[B], OUT[B] sets 2.  Iterate over all B until no changes

On each iteration, visit all B, and compute IN[B], OUT[B] as

IN[B] = ∩ OUT(P), P in pred(B)

OUT[B] = IN(B) U (GEN[B] - KILL[B])

OUT[B]

IN[B]

GEN[B] KILL[B] B

S1

S3

S4

S5

entry

exit

T

S2

S6 F

CFG

T F

Page 4: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 4

Data-flow for nodes 1, 2, 3 never changes but is computed on every iteration of the algorithm 1

return f2

i=2

i<=m return m

fib(m)

f0=0

m<=1

f1=1

i=i+1

f1=f2

f0=f1

f2=f0+f1 T

T F

F

2

3

4 5

6 8

7 10

11

9 12

Other DF Approaches (worklist)

In general, nodes involved in the computation may be a small subset of the nodes in the graph

For example, what if I want to compute RD only for f1?

Other DF Approaches (worklist) algorithm RDWorklist Input: GEN[B], KILL[B] for all B Output: reaching definitions for each B Method:

initialize IN[B], OUT[B] for all B; add successors of B initially involved in computation to worklist W

repeat remove B from W Oldout=OUT[B] compute IN[B], OUT[B] if oldout != OUT[B] then add successors of B to W endif

until W is empty

Page 5: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 5

Compute RD for f1 using RDWorklist •  GEN[3] is {3}, GEN[10] is {10}, KILL[3]

is {10}, KILL[10] is {3} •  Add successors of 3, 10 to W •  remove 4 from W, compute IN[4],

OUT[4], etc.

Other DF Approaches (worklist)

1

return f2

i=2

i<=m return m

fib(m)

f0=0

m<=1

f1=1

i=i+1

f1=f2

f0=f1

f2=f0+f1 T

T F

F

2

3

4 5

6 8

7 10

11

9 12

Program Analysis

Dependence Analysis

Page 6: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 6

Dependence Analysis

Important for •  Optimization

•  Instruction scheduling •  Data-cache optimization

•  Software engineering •  Program understanding •  Reverse engineering •  Debugging

Two main kinds of dependences •  Data related •  Control related

Data-Dependence Graph (DDG) A DDG has one node for every

variable (basic block) and one edge representing the flow of data between two nodes

Different types of data dependences •  Flow: def to use

•  Anti: use to def •  Output: def to def

•  Input: use to use

Further classifiable as •  Loop-carried: requires iterations •  Loop-independent: occurs anyway

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

Page 7: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 7

Data-Dependence Graph

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

X

X X

X

X

Z Z

Z Y

X

Y X

Z

Control-Dependence Analysis Definition Let G be a CFG, with X and Y

nodes in G Y is control-dependent on X iff

1.  There exists a path P from X to Y with any Z in P (excluding X and Y) postdominated by Y and

2.  X is not postdominated by Y

(intuitively: there are two edges out of X; traversing one edge always leads to Y, the other may not lead to Y)

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

Page 8: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 8

Control-Dependence Analysis What are the control

dependences for each statement in the CFG on the right?

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

Control-Dependence Analysis What are the control

dependences for each statement in the CFG on the right?

Dependences: •  B1, exit – entry •  B2 – B1T •  B3 – B2T •  B4 – B1F •  B5 – B2F, B1F •  B6 – B2F, B1F

entry

Z > 1

X = 1 Z > 2

Y = X + 1

X = 2

Z = X – 3 X = 4

Z = X + 7

exit

B1

B3

B2

B6

B5

B4

Page 9: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 9

Computing CD Using FOW

1.  Construct basic CDG 2.  Add region nodes

Computing CD Using FOW

1.  Construct basic CDG 2.  Add region nodes

Page 10: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 10

Computing CD Using FOW 1.1 Build a CFG En

1

2

3

4

5

6

Ex

T F

T F

CFG

Computing CD Using FOW 1.1 Build a CFG

1.2 Augment the CFG by adding a node Start with edge (Start, entry) labeled “T” and edge (Start, exit) labeled “F”; call this AugCFG

En

1

2

3

4

5

6

Ex

T F

T F

CFG

Page 11: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 11

Computing CD Using FOW 1.1 Build a CFG

1.2 Augment the CFG by adding a node Start with edge (Start, entry) labeled “T” and edge (Start, exit) labeled “F”; call this AugCFG

En

1

2

3

4

5

6

Ex

Start

T F T

F

T F

AugCFG

Computing CD Using FOW 1.1 Build a CFG

1.2 Augment the CFG by adding a node Start with edge (Start, entry) labeled “T” and edge (Start, exit) labeled “F”; call this AugCFG

1.3 Construct the postdominator tree for AugCFG

En

1

2

3

4

5

6

Ex

Start

T F T

F

T F

AugCFG

Page 12: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 12

Computing CD Using FOW 1.1 Build a CFG

1.2 Augment the CFG by adding a node Start with edge (Start, entry) labeled “T” and edge (Start, exit) labeled “F”; call this AugCFG

1.3 Construct the postdominator tree for AugCFG

En

4

5

Ex

1 2 3 6 Start

Pdom Tree

Computing CD Using FOW En

1

2

3

4

5

6

Ex

Start

T F T

F

T F

AugCFG

En

4

5

Ex

1 2 3 6 Start

Pdom Tree

1.4 Consider set S of edges (m, n) in AugCFG such that n does not postdominate m

Page 13: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 13

Computing CD Using FOW En

1

2

3

4

5

6

Ex

Start

T F T

F

T F

AugCFG 1.4 Consider set S of edges (m, n) in AugCFG such that n does not postdominate m

Computing CD Using FOW En

1

2

3

4

5

6

Ex

Start

T F T

F

T F

AugCFG 1.4 Consider set S of edges (m, n) in AugCFG such that n does not postdominate m

1.5 Consider, for each edge (m, n) in S, those nodes in the Pdom tree from n to least common ancestor L of m and n •  Including L if L is m •  Excluding L if L is not m

Page 14: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 14

Computing CD Using FOW 1.5 Consider, for each edge

(m, n) in S, those nodes in the Pdom tree from n to least common ancestor L of m and n •  Including L if L is m •  Excluding L if L is not m

En

4

5

Ex

1 2 3 6 Start

Pdom Tree

Edge L Nodes

Start, En

1, 2

1, 4

2, 3

2, 5

4

1

Edge L Nodes

Start, En

1, 2

1, 4 Ex 4, 5, 6

2, 3

2, 5

Computing CD Using FOW 1.5 Consider, for each edge

(m, n) in S, those nodes in the Pdom tree from n to least common ancestor L of m and n •  Including L if L is m •  Excluding L if L is not m

En

4

5

Ex

1 2 3 6 Start

Pdom Tree

Edge L Nodes

Start, En Ex En, 1

1, 2 Ex 2

1, 4 Ex 4, 5, 6

2, 3 Ex 3

2, 5 Ex 5, 6

All identified nodes are control dependent on m

Page 15: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 15

Computing CD Using FOW 1.5 Consider, for each edge

(m, n) in S, those nodes in the Pdom tree from n to least common ancestor L of m and n •  Including L if L is m •  Excluding L if L is not m

En

4

5

Ex

1 2 3 6 Start

Pdom Tree

Edge L Nodes CD on

Start, En Ex En, 1 Start, T

1, 2 Ex 2 1, T

1, 4 Ex 4, 5, 6 1, F

2, 3 Ex 3 2, T

2, 5 Ex 5, 6 2, F

All identified nodes are control dependent on m

Why is the set we obtain correct?

Computing CD Using FOW

1.6 Create the actual CDG

Edge L Nodes CD on

Start, En Ex En, 1 Start, T

1, 2 Ex 2 1, T

1, 4 Ex 4, 5, 6 1, F

2, 3 Ex 3 2, T

2, 5 Ex 5, 6 2, F

En

6 5

1

2

3 4

Start T

F

T

T

T F

F

F

F

CDG

Page 16: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 16

Program-Dependence Graph (PDG)

A PDG for a program P is the combination of the data- and control- dependence graphs for P

A PDG contains nodes representing statements in P and edges representing either control or data dependence between nodes

PDG exercise Compute the PDG for the program on the left

1  x = 1; 2  y = 2; 3  if (c) 4  x++; 5  while(d) { 6  x--; 7  y += 1; } 8 printf(“%d”, y);

Page 17: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 17

Program Analysis

Slicing

Program Slicing

•  Slicing overview •  Types of slices, levels of slices •  Methods for computing slices

•  Interprocedural slicing

Page 18: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 18

Some History

1.  Mark Weiser, 1981 Experimented with programmers to show that slices are:

“The mental abstraction people make when they are debugging a program” [Weiser]

Used Data Flow Equations

2.  Ottenstein & Ottenstein – PDG, 1984 3.  Horowitz, Reps & Binkley – SDG, 1990

Applications

•  Debugging •  Program Comprehension •  Reverse Engineering

•  Program Testing •  Measuring Program Metrics

•  Coverage, Overlap, Clustering

•  Refactoring

Page 19: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 19

Static VS Dynamic Slicing

Static Slicing •  Statically available information only •  No assumptions on input •  Computed slice in general inaccurate ➡ Approximations ➡ results may be useless

Dynamic Slicing •  Computed on a given input •  Typically more precise ➡ more useful for

applications such as debugging and testing

A backward slice of a program with respect to a program point p and set of program variables V consists of all statements and predicates in the program that may affect the value of variables in V at p

The program point p and the variables V together form the slicing criterion, usually written <p, V>

Types of Slicing (backward)

Page 20: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 20

Types of Slicing (backward)

General approach: backward traversal of program flow •  Slicing starts from point p (C = (p , V)) •  Examines statements that are executed before

p ( ≠ statements that appear before p ) •  Adds statements that affect value of V at p, or

execution of p •  Transitive dependences are considered

Types of Slicing (backward)

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Criterion <10, product>

Page 21: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 21

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Criterion <10, product>

Types of Slicing (backward)

A slice is executable if the statements in the slice form a syntactically correct program that can be executed.

If the slice is computed correctly (safely), running the executable slice produces the same result for variables in V at p for all inputs.

Types of Slicing (executable)

Page 22: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 22

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Criterion <10, product>

Types of Slicing (executable)

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Criterion <10, product>

Types of Slicing (executable)

Page 23: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 23

read (n) i := 1

product := 1 while i <= n do

product := product * i i := i + 1

write (product)

Criterion <10, product>

Types of Slicing (executable)

A forward slice of a program with respect to a program point p and set of program variables V consists of all statements and predicates in the program that may be affected the value of variables in V at p

The program point p and the variables V together form the slicing criterion, usually written <p, V>

Types of Slicing (forward)

Page 24: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 24

Types of Slicing (forward)

General approach: forward traversal of program flow •  Slicing starts from p (C = (p , V)) •  Examine all statements that are executed after p •  Keep statements that are affected by the values of V at

p or by the execution of p •  Transitive dependences considered •  Shows downstream code that depend on a specific

variable or statement •  Can show the code affected by a modification

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Types of Slicing (forward)

Criterion <3, sum>

Page 25: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 25

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Types of Slicing (forward)

Criterion <3, sum>

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Types of Slicing (forward)

Criterion <1, n>

Page 26: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 26

1.  read (n) 2.  i := 1 3.  sum := 0 4.  product := 1 5.  while i <= n do 6.  sum := sum + i 7.  product := product * i 8.  i := i + 1 9.  write (sum) 10.  write (product)

Types of Slicing (forward)

Criterion <1, n>

Methods for Computing Slices

Data-flow on the flow graph •  Intraprocedural: CFG

•  Interprocedural: ICFG

Reachability on a dependence graph •  Intraprocedural: program-dependence graph

(PDG)

•  Interprocedural: system-dependence graph (SDG)

Page 27: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 27

Methods for Computing Slices

Data-flow on the flow graph •  Intraprocedural: CFG

•  Interprocedural: ICFG

Reachability on a dependence graph •  Intraprocedural: program-dependence graph

(PDG)

•  Interprocedural: system-dependence graph (SDG)

Slicing as a reachability�on the PDG

•  Slicing criterion: node n in the PDG •  Simple method: slice = all nodes from

which n is reachable

•  Backward traversal of the PDG

Page 28: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 28

Slicing as a reachability�on the PDG

1  x = 1; 2  y = 2; 3  if (c) 4  x++; 5  while(d) { 6  x--; 7  y += 1; } 8 printf(“%d”, y);

1  x = 1; 2  y = 2; 3  if (c) 4  x++; 5  while(d) { 6  x--; 7  y += 1; } 8 printf(“%d”, y);

What is the slice for <y, 8> ?

Now let’s do it using the PDG 1  x = 1; 2  y = 2; 3  if (c) 4  x++; 5  while(d) { 6  x--; 7  y += 1; } 8 printf(“%d”, y);

Slicing as a reachability�on the PDG

What is the slice for <y, 8> ?

Page 29: Program Analysis - Georgia Institute of Technologyorso/UBAECI/Class02-StaticAnalysis.pdf · Program Analysis Slicing Program Slicing • Slicing overview • Types of slices, levels

• 7/29/08

• 29

entry

1 2 3 5 8

4 6 7

Slice = backward reachable nodes

Slicing as a reachability�on the PDG

What is the slice for <y, 8> ? 1  x = 1; 2  y = 2; 3  if (c) 4  x++; 5  while(d) { 6  x--; 7  y += 1; } 8 printf(“%d”, y);

entry

2 5

7