Context-Sensitive Pointer Alias Analysis Using BDDs


March 3, 2014 Flow-Insensitive Pointer Analysis 1

More Pointer Analysis

Last time

– Flow-Insensitive Pointer Analysis

– Inclusion-based analysis (Andersen)

Today

– Class projects

– Context-Sensitive analysis

Cloning-Based Context-Sensitive

Pointer Alias Analysis using BDDs

John Whaley

Monica Lam

Stanford University

June 10, 2004


Unification vs. Inclusion

• Earlier scalable pointer analysis was context-insensitive and unification-based [Steensgaard ’96]

– Pointers are either unaliased or point to the same set of objects.

– Near-linear, but VERY imprecise.

• Inclusion-based pointer analysis

– Can point to overlapping sets of objects.

– Closure calculation is O(n^3)

– Various optimizations [Fähndrich, Su, Heintze, …]

– BDD formulation: simple, scalable [Berndl, Zhu]
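The precision gap between the two families can be seen on a toy program. The sketch below is a simplification and not either published algorithm in full: the program, the variable names and the helper names `inclusion` and `unification` are hypothetical, and the unification variant simply makes variables related by a copy share one points-to set.

```python
# Toy program (hypothetical): x = new A (h1); y = new B (h2); z = x; z = y
allocs = [("x", "h1"), ("y", "h2")]
copies = [("z", "x"), ("z", "y")]          # (dst, src) for dst = src


def inclusion(allocs, copies):
    """Andersen-style: pts(dst) must include pts(src); iterate to a fixpoint."""
    pts = {}
    for var, obj in allocs:
        pts.setdefault(var, set()).add(obj)
    changed = True
    while changed:
        changed = False
        for dst, src in copies:
            before = len(pts.setdefault(dst, set()))
            pts[dst] |= pts.get(src, set())
            changed |= len(pts[dst]) != before
    return pts


def unification(allocs, copies):
    """Steensgaard-style (simplified): dst = src merges both variables into one class."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for dst, src in copies:
        parent[find(dst)] = find(src)
    pts = {}
    for var, obj in allocs:
        pts.setdefault(find(var), set()).add(obj)
    return {v: pts[find(v)] for v in list(parent)}


inc = inclusion(allocs, copies)
uni = unification(allocs, copies)
print(sorted(inc["x"]), sorted(uni["x"]))
```

With inclusion, only `z` points to both objects; with unification, `x`, `y` and `z` all end up pointing to both, which is the imprecision the slide refers to.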


Context Sensitivity

• Context sensitivity is important for precision.

– Unrealizable paths.

Object id(Object x) {
    return x;
}

a = id(b); c = id(d);


Context Sensitivity

• Context sensitivity is important for precision.

– Unrealizable paths.

– Conceptually give each caller its own copy.

Object id(Object x) {    // copy for the call a = id(b)
    return x;
}

Object id(Object x) {    // copy for the call c = id(d)
    return x;
}

a = id(b); c = id(d);
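The effect of giving each caller its own copy of `id` can be sketched with a tiny inclusion-style propagation. This is an illustrative model, not the paper's implementation: the object names `hb` and `hd`, the clone names `x#1` and `x#2`, and the helper `points_to` are hypothetical.

```python
def points_to(allocs, copies):
    """Propagate points-to sets along copy edges (dst = src) until nothing changes."""
    pts = {var: {obj} for var, obj in allocs}
    changed = True
    while changed:
        changed = False
        for dst, src in copies:
            new = pts.get(src, set()) - pts.setdefault(dst, set())
            if new:
                pts[dst] |= new
                changed = True
    return pts


allocs = [("b", "hb"), ("d", "hd")]        # b and d each point to their own object

# Context-insensitive: both calls share the single parameter x of id().
insensitive = points_to(allocs, [("x", "b"), ("x", "d"), ("a", "x"), ("c", "x")])

# Cloned: each call site gets its own copy of id(), hence its own parameter.
cloned = points_to(allocs, [("x#1", "b"), ("a", "x#1"), ("x#2", "d"), ("c", "x#2")])

print(sorted(insensitive["a"]), sorted(cloned["a"]))
```

In the shared version, `a` appears to point to the object passed by the other call, which is the unrealizable path; in the cloned version `a` and `c` stay separate.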


Summary-Based Analysis

• Popular method for context sensitivity.

• Two phases:

– Bottom-up: Summarize effects of methods.

– Top-down: Propagate information down.

• Problems:

– Difficult to summarize pointer analysis.

– Summary-based analysis using BDD: not shown to scale [Zhu’02]

– Queries (e.g. which context points to x) require expanding an exponential number of contexts.


Cloning-Based Analysis

• Simple brute force technique.

– Clone every path through the call graph.

– Run context-insensitive algorithm on expanded call graph.

• The catch: exponential blowup


Cloning is exponential!


Recursion

• Actually, cloning is unbounded in the presence of recursive cycles.

• Technique: We treat all methods within a strongly-connected component as a single node.
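Collapsing each strongly-connected component into one node leaves an acyclic call graph, so the number of call paths becomes finite. A minimal sketch using Tarjan's algorithm follows; the call graph (`main`, `f`, `g`, `h`) and the helper names `sccs` and `collapse` are hypothetical, and the slides do not say which SCC algorithm the authors use.

```python
def sccs(graph):
    """Tarjan's algorithm; returns a dict mapping each node to a component id."""
    index, low, comp = {}, {}, {}
    stack, on_stack = [], set()

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:              # v is the root of a component
            cid = len(set(comp.values()))
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp[w] = cid
                if w == v:
                    break

    for v in graph:
        if v not in index:
            visit(v)
    return comp


def collapse(graph):
    """Return the component map and the edges between distinct components."""
    comp = sccs(graph)
    edges = {(comp[u], comp[v]) for u in graph for v in graph[u] if comp[u] != comp[v]}
    return comp, edges


# f and g call each other, so they form one recursive cycle.
graph = {"main": ["f"], "f": ["g"], "g": ["f", "h"], "h": []}
comp, edges = collapse(graph)
print(comp, edges)
```

After collapsing, `f` and `g` share a component id and only two edges remain between components, so cloning the reduced graph terminates.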


Recursion

[Figure: example call graph over methods A through G, shown before and after cloning with the recursive cycle treated as a single node. The node layout is not recoverable from the transcript.]


Top 20 Sourceforge Java Apps: Number of Clones

[Figure: log-log scatter plot. x-axis: size of program (variable nodes), 1,000 to 1,000,000. y-axis: number of clones, 10^0 to 10^16.]


Cloning is infeasible (?)

• Typical large program has ~10^14 paths

– If you need 1 byte to represent a clone:

• Would require 256 terabytes of storage

– Registered ECC 1GB DIMMs: $98.6 million

» Power: 96.4 kilowatts = Power for 128 homes

– 300 GB hard disks: 939 x $250 = $234,750

» Time to read (sequential): 70.8 days

• Seems unreasonable!
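The disk figures on this slide can be checked with a few lines of arithmetic. The sketch assumes that "256 terabytes" means 2^48 bytes (about 2.8 × 10^14, the same order of magnitude as the ~10^14 paths) and that a 300 GB disk holds 300 × 10^9 bytes; both readings are assumptions, not statements from the slide.

```python
total_bytes = 2 ** 48                   # assumed reading of "256 terabytes"
disk_bytes = 300 * 10 ** 9              # one 300 GB disk, decimal gigabytes assumed
price_per_disk = 250                    # dollars, from the slide

disks = -(-total_bytes // disk_bytes)   # integer ceiling division
cost = disks * price_per_disk
print(disks, cost)
```

Under these assumptions the result agrees with the slide's 939 disks and $234,750.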


BDD comes to the rescue

• There are many similarities across contexts.

– Many copies of nearly-identical results.

• BDDs can represent large sets of redundant data efficiently.

– Need a BDD encoding that exploits the similarities.


Contribution (1)

• Can represent context-sensitive call graph efficiently with BDDs and a clever context numbering scheme

– Inclusion-based pointer analysis

• 10^14 contexts, 19 minutes

– Generates all answers


Contribution (2)

BDD hacking is complicated

bddbddb

(BDD-based deductive database)

• Pointer algorithm in 6 lines of Datalog

• Automatically translated into an efficient BDD implementation

• 10x performance over hand-tuned solver (2164 lines of Java)


Contribution (3)

• bddbddb: General Datalog solver

– Supports simple declarative queries

– Easy use of context-sensitive pointer results

• Simple context-sensitive analyses:

– Escape analysis

– Type refinement

– Side effect analysis

– Many more presented in the paper


Context-sensitive call graphs in BDD


Call graph relation

• Call graph expressed as a relation.

– Five edges:

• Calls(A,B)

• Calls(A,C)

• Calls(A,D)

• Calls(B,D)

• Calls(C,D)

[Figure: call graph over A, B, C and D with the five edges listed above.]


Call graph relation

• Relation expressed as a binary function.

– A=00, B=01, C=10, D=11

x1 x2 x3 x4 | f
 0  0  0  0 | 0
 0  0  0  1 | 1
 0  0  1  0 | 1
 0  0  1  1 | 1
 0  1  0  0 | 0
 0  1  0  1 | 0
 0  1  1  0 | 0
 0  1  1  1 | 1
 1  0  0  0 | 0
 1  0  0  1 | 0
 1  0  1  0 | 0
 1  0  1  1 | 1
 1  1  0  0 | 0
 1  1  0  1 | 0
 1  1  1  0 | 0
 1  1  1  1 | 0

[Figure: the call graph with nodes labeled A=00, B=01, C=10, D=11.]
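The truth table follows mechanically from the five Calls edges and the two-bit node encoding, with x1 x2 holding the caller and x3 x4 the callee. A short sketch that regenerates it is given here; the names `code`, `calls`, `relation` and `f` are chosen for illustration.

```python
from itertools import product

code = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}
calls = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")]

# Each edge becomes one bit vector (x1, x2, x3, x4) = caller bits + callee bits.
relation = {code[caller] + code[callee] for caller, callee in calls}


def f(x1, x2, x3, x4):
    """Characteristic function of the Calls relation."""
    return int((x1, x2, x3, x4) in relation)


for bits in product((0, 1), repeat=4):
    print(*bits, f(*bits))
```

The five rows with f = 1 are exactly the five edges; every other bit pattern maps to 0.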


Binary Decision Diagrams

• Graphical encoding of a truth table.

• Collapse redundant nodes.

• Eliminate unnecessary nodes.

[Figure sequence: the decision tree for f over x1, x2, x3, x4, drawn with 0 edges and 1 edges, is reduced step by step. The leaves are merged into single 0 and 1 terminals, identical subgraphs are shared, and nodes that no longer affect the result are removed. The final diagram has one x1 node, two x2 nodes, two x3 nodes, one x4 node and the terminals 0 and 1.]


Binary Decision Diagrams

• Size is correlated to amount of redundancy, NOT size of relation.

– As the set gets larger, the number of don’t-care bits increases, leading to fewer necessary nodes.
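The two reduction rules (share identical nodes, drop nodes whose two edges agree) fit in a few lines. The sketch below builds a reduced ordered BDD by brute-force Shannon expansion, which is exponential in the number of variables and only meant for small examples; it is not how a production BDD library works, and `build_robdd` is a hypothetical helper name.

```python
def build_robdd(f, nvars):
    """Build a reduced ordered BDD for f using the variable order x1..xn."""
    unique = {}                               # (var, low, high) -> node id

    def mk(var, low, high):
        if low == high:                       # eliminate unnecessary node
            return low
        key = (var, low, high)
        if key not in unique:                 # collapse redundant (identical) nodes
            unique[key] = len(unique) + 2     # ids 0 and 1 are the terminals
        return unique[key]

    def build(var, bits):
        if var == nvars:
            return int(f(*bits))
        return mk(var, build(var + 1, bits + (0,)), build(var + 1, bits + (1,)))

    return build(0, ()), unique


# The call graph relation from the truth table (A=00, B=01, C=10, D=11).
relation = {(0, 0, 0, 1), (0, 0, 1, 0), (0, 0, 1, 1), (0, 1, 1, 1), (1, 0, 1, 1)}
root, nodes = build_robdd(lambda *bits: bits in relation, 4)

# The full relation (all 16 tuples) is larger as a set but needs no decision node.
full_root, full_nodes = build_robdd(lambda *bits: True, 4)

print(len(nodes), len(full_nodes))
```

The call graph relation needs six decision nodes, while the relation containing every tuple collapses to the terminal 1: every bit is a don't-care, so no node is necessary.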


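Expanded Call Graph

[Figure: call graph over methods A through H and its expansion. As far as the flattened figure allows, A calls B, C and D; B, C and D call E; E calls F and G; F and G call H. The three edges into E are labeled 0, 1, 2. In the expanded graph E, F and G each appear three times and H six times.]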


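Numbering Clones

[Figure: the same call graph and its expansion with contexts numbered. A, B, C and D each have the single context 0. The edges from B, C and D give E the contexts 0, 1 and 2. The edges from E to F and to G carry contexts 0-2. The edge from F maps to contexts 0-2 of H and the edge from G to contexts 3-5 of H, so H has contexts 0 through 5.]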
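One way to reproduce the numbering in the figure is to count call paths from the root and give each incoming edge a contiguous block of context numbers. The sketch below assumes an acyclic call graph (cycles already collapsed) and the edge set read off the flattened figure; the function `number_contexts` is a hypothetical helper, and the paper's actual scheme may differ in detail.

```python
# Call graph as reconstructed from the figure (an assumption, see above).
calls = {"A": ["B", "C", "D"], "B": ["E"], "C": ["E"], "D": ["E"],
         "E": ["F", "G"], "F": ["H"], "G": ["H"], "H": []}


def number_contexts(calls, root):
    """Return (contexts per method, context range assigned to each call edge)."""
    preds = {m: [] for m in calls}
    for caller, callees in calls.items():
        for callee in callees:
            preds[callee].append(caller)
    count = {}

    def contexts(m):                      # number of call paths from root to m
        if m not in count:
            count[m] = 1 if m == root else sum(contexts(p) for p in preds[m])
        return count[m]

    ranges = {}
    for m in calls:
        offset = 0
        for p in preds[m]:                # each incoming edge gets a contiguous block
            ranges[(p, m)] = (offset, offset + contexts(p) - 1)
            offset += contexts(p)
        contexts(m)
    return count, ranges


count, ranges = number_contexts(calls, "A")
print(count["H"], ranges[("F", "H")], ranges[("G", "H")])
```

Contiguous ranges are what make the encoding BDD-friendly: an edge maps a whole block of caller contexts to callee contexts by adding a constant offset.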


Pointer Analysis


Pointer Analysis Example

h1: v1 = new Object();

h2: v2 = new Object();

v1.f = v2;

v3 = v1.f;

Input Relations

vPointsTo(v1,h1)

vPointsTo(v2,h2)

Store(v1,f,v2)

Load(v1,f,v3)

Output Relations

hPointsTo(h1,f,h2)

vPointsTo(v3,h2)

[Figure: points-to graph with v1 → h1, v2 → h2, h1.f → h2 and v3 → h2.]


Inference Rule in Datalog

Stores:

v1.f = v2;

hPointsTo(h1, f, h2) :- Store(v1, f, v2),
                        vPointsTo(v1, h1),
                        vPointsTo(v2, h2).

[Figure: v1 → h1, v2 → h2, and the derived edge h1.f → h2.]
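The store rule can be evaluated by hand as a small fixpoint loop over sets of tuples. The sketch below also includes a load rule written by analogy, vPointsTo(v3, h2) :- Load(v1, f, v3), vPointsTo(v1, h1), hPointsTo(h1, f, h2); that rule is an assumption, since the transcript only shows the store rule. The function `solve` is a hypothetical name and this naive loop is not how bddbddb evaluates rules.

```python
def solve(v_points_to, stores, loads):
    """Naive fixpoint over the store rule and an analogous (assumed) load rule."""
    v_pts = set(v_points_to)              # (variable, heap object)
    h_pts = set()                         # (heap object, field, heap object)
    while True:
        # hPointsTo(h1, f, h2) :- Store(v1, f, v2), vPointsTo(v1, h1), vPointsTo(v2, h2).
        new_h = {(h1, f, h2)
                 for (v1, f, v2) in stores
                 for (a, h1) in v_pts if a == v1
                 for (b, h2) in v_pts if b == v2}
        # vPointsTo(v3, h2) :- Load(v1, f, v3), vPointsTo(v1, h1), hPointsTo(h1, f, h2).
        new_v = {(v3, h2)
                 for (v1, f, v3) in loads
                 for (a, h1) in v_pts if a == v1
                 for (h, g, h2) in h_pts if h == h1 and g == f}
        if new_h <= h_pts and new_v <= v_pts:
            return v_pts, h_pts
        h_pts |= new_h
        v_pts |= new_v


# Input relations from the Pointer Analysis Example slide.
v_pts, h_pts = solve(
    v_points_to={("v1", "h1"), ("v2", "h2")},
    stores={("v1", "f", "v2")},
    loads={("v1", "f", "v3")},
)
print(sorted(h_pts), sorted(v_pts))
```

The loop derives hPointsTo(h1, f, h2) in the first round and vPointsTo(v3, h2) in the second, matching the output relations on the example slide.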


Context-sensitive pointer analysis

• Compute call graph with context-insensitive pointer analysis.

– Datalog rules for:

    • assignments, loads, stores

    • discover call targets, bind parameters

    • type filtering

– Apply rules until fix-point reached.

• Compute expanded call graph relation.

• Apply context-insensitive algorithm to expanded call graph.


bddbddb:

BDD-Based Deductive DataBase


Datalog

• Declarative logic programming language designed for databases

– Horn clauses

– Operates on relations

• Datalog is expressive

– Relational algebra:

    • Explicitly specify relational join, project, rename.

– Relational calculus:

    • Specify relations between variables; operations are implicit.

– Datalog:

    • Allows recursively-defined relations.
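A recursively-defined relation is one whose rule body mentions the relation being defined. A standard example, not taken from the slides, is reachability over a call relation: Reach(x, y) :- Calls(x, y). and Reach(x, z) :- Reach(x, y), Calls(y, z). The sketch evaluates these two rules to a fixpoint; the edge set and the name `transitive_closure` are hypothetical.

```python
def transitive_closure(edges):
    """Evaluate Reach(x,y) :- Calls(x,y).  Reach(x,z) :- Reach(x,y), Calls(y,z)."""
    reach = set(edges)                                  # first rule
    while True:
        # Second rule: join Reach and Calls on the shared variable y, then project y away.
        derived = {(x, z) for (x, y) in reach for (y2, z) in edges if y == y2}
        if derived <= reach:                            # nothing new: fixpoint reached
            return reach
        reach |= derived


calls = {("A", "B"), ("B", "C"), ("C", "D")}            # hypothetical call chain
reach = transitive_closure(calls)
print(sorted(reach))
```

The comprehension is the relational-algebra view of the rule (a join followed by a projection), and the surrounding loop is what the recursion adds on top.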


Datalog → BDD

• Join, project, rename are directly mapped to built-in BDD operations

• Automatically optimizes:

– Rule application order

– Incrementalization

– Variable ordering

– BDD parameter tuning

– Many more…



Experimental Results

• Top 20 Java projects on SourceForge

– Real programs with 100K+ users each

• Using automatic bddbddb solver

– Each analysis only a few lines of code

– Easy to try new algorithms, new queries

• Test system:

– Pentium 4 2.2GHz, 1GB RAM

– RedHat Fedora Core 1, JDK 1.4.2_04, javabdd library, Joeq compiler


Analysis time

[Figure: log-log plot of analysis time in seconds (1 to 10,000) against variable nodes (axis ticks 1 to 1000). Fitted trend: y = 0.0078 x^2.3233, R^2 = 0.9197.]


Analysis memory

[Figure: log-log plot of analysis memory in megabytes (1 to 1000) against variable nodes (axis ticks 1 to 1000). Fitted trend: y = 0.3609 x^1.4204, R^2 = 0.8859.]


Multi-type variables

• A variable is multi-type if it can point to objects of different types.

– Measure of analysis precision

– One line in Datalog

• Two ways of handling context sensitivity:

– Projected: Merge all contexts together

– Full: Keep each context separate
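The difference between the projected and the full query shows up on a two-context example. In this sketch the context-sensitive relation, the heap object types and the helper `multi_type` are all hypothetical; the slides say the real query is one line of Datalog, which the transcript does not show.

```python
heap_type = {"hb": "B", "hd": "D"}                 # hypothetical allocation-site types
v_points_to = {(1, "x", "hb"), (2, "x", "hd")}     # (context, variable, heap object)


def multi_type(keys_and_heaps):
    """Return the keys whose points-to targets span more than one type."""
    types = {}
    for key, heap in keys_and_heaps:
        types.setdefault(key, set()).add(heap_type[heap])
    return {key for key, ts in types.items() if len(ts) > 1}


# Projected: drop the context, so both contexts of x are merged.
projected = multi_type((v, h) for (_c, v, h) in v_points_to)

# Full: keep (context, variable) pairs separate.
full = multi_type(((c, v), h) for (c, v, h) in v_points_to)

print(projected, full)
```

Here `x` counts as multi-type once the contexts are merged, but not in either context on its own.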


Comparison of Accuracy (smaller bars are better)

[Figure: bar chart of the percentage of multi-type variables (0 to 10) per benchmark, with three series: context-insensitive, projected context-sensitive, fully context-sensitive. Benchmarks: freetts, nfcchat, jetty, openwfe, joone, jboss, jbossdep, sshdaemon, pmd, azureus, freenet, sshterm, jgraph, umldot, jbidwatch, columba, gantt, jxplorer, jedit, megamek, gruntspud.]


Related Work

• Context-insensitive pointer analysis

– Steensgaard: Unification-based (POPL’96)

– Andersen: Inclusion-based (’94)

• Optimizations: too many to list

• Berndl: formulate in BDD (PLDI’03)

– Das: one-level-flow (PLDI’00)

• Hybrid unification/inclusion


Related Work

• Scalable context-sensitive pointer analysis

– Fähndrich et al., instantiation constraints (PLDI’00)

    • CFL-reachability

    • Unification-based: Imprecise.

    • Handles recursion well.

    • Computes on-demand.

– GOLF: Das et al. (SAS’01)

    • One level of context sensitivity.

– Foster, Fähndrich, Aiken (SAS’00)

    • Throws away information.

– Wilson & Lam: PTF (PLDI’95)

    • Doesn't really scale (especially complexity)


Related Work

• Whaley & Rinard: Escape analysis (OOPSLA’99)

– Compositional summaries: only weak updates.

– Achieves scalability by collapsing escaped nodes.

• Emami & Hendren: Invocation graphs (PLDI’94)

– Only shown to scale to 8K lines.

• Zhu & Calman: (PLDI’04)

– To be presented next in this session.

• More complete coverage in the paper.


Conclusion

• The first scalable context-sensitive inclusion-based pointer analysis.

– Achieves context sensitivity by cloning.

• bddbddb: Datalog → efficient BDD

• Easy to query results, develop new analyses.

• Very efficient!

– <19 minutes, <600 MB on largest benchmark.

• Complete system is publicly available at:

http://suif.stanford.edu/bddbddb


Epilogue

Impact

– Best Paper Award, PLDI 2004

– High-level specification is successful

– Datalog now used as specification language for Java pointer analyses (Doop)


Reflection

Scalability

– Whaley and Lam’s algorithm scales to 700K LOC

– Shows the benefits of abstraction

– Represent the call graph as a binary function

– Represent the binary function as a BDD

Is this a solved problem?

– LOC measured in bytecodes not source lines

– Only top-level variables are context-sensitive

– This strategy works well for Java but not C

– For C, this analysis only scales to 30K LOC
