context-sensitive pointer alias analysis using bddslin/cs380c/handout15.pdf– summary-based...
TRANSCRIPT
1
March 3, 2014 Flow-Insensitive Pointer Analysis 1
More Pointer Analysis
Last time
– Flow-Insensitive Pointer Analysis
– Inclusion-based analysis (Andersen)
Today
– Class projects
– Context-Sensitive analysis
Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
John Whaley
Monica Lam
Stanford University
June 10, 2004
2
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
2
Unification vs. Inclusion
• Earlier scalable pointer analysis was context-insensitive unification-based [Steensgaard ’96]– Pointers are either unaliased or point to the same set
of objects.
– Near-linear, but VERY imprecise.
• Inclusion-based pointer analysis– Can point to overlapping sets of objects.
– Closure calculation is O(n3)
– Various optimizations [Fahndrich,Su,Heintze,…]
– BDD formulation, simple, scalable [Berndl,Zhu]
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
3
Context Sensitivity
• Context sensitivity is important for precision.
– Unrealizable paths.
Object id(Object x) {
return x;
}
a = id(b); c = id(d);
3
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
4
Object id(Object x) {
return x;
}
Object id(Object x) {
return x;
}
Context Sensitivity
• Context sensitivity is important for precision.
– Unrealizable paths.
a = id(b); c = id(d);
Object id(Object x) {
return x;
}
Object id(Object x) {
return x;
}
• Context sensitivity is important for precision.
– Unrealizable paths.
– Conceptually give each caller its own copy.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
5
Context Sensitivity
a = id(b); c = id(d);
4
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
6
Summary-Based Analysis
• Popular method for context sensitivity.
• Two phases:
– Bottom-up: Summarize effects of methods.
– Top-down: Propagate information down.
• Problems:
– Difficult to summarize pointer analysis.
– Summary-based analysis using BDD: not shown to
scale [Zhu’02]
– Queries (e.g. which context points to x) require
expanding an exponential number of contexts.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
7
Cloning-Based Analysis
• Simple brute force technique.
– Clone every path through the call graph.
– Run context-insensitive algorithm on
expanded call graph.
• The catch: exponential blowup
5
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
8
Cloning is exponential!
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
9
Recursion
• Actually, cloning is unbounded in the
presence of recursive cycles.
• Technique: We treat all methods within a
strongly-connected component as a single
node.
6
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
10
Recursion
A
G
B C D
E F
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
11
Recursion
A
G
B C D
E F
A
G
B C D
E F E F E F
G G
7
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
12
Top 20 Sourceforge Java Apps
Number of Clones
1.E+00
1.E+02
1.E+04
1.E+06
1.E+08
1.E+10
1.E+12
1.E+14
1.E+16
1000 10000 100000 1000000
Size of program (variable nodes)
Nu
mb
er
of
clo
nes
1016
1012
108
104
100
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
13
Cloning is infeasible (?)
• Typical large program has ~1014 paths
– If you need 1 byte to represent a clone:
• Would require 256 terabytes of storage
– Registered ECC 1GB DIMMs: $98.6 million
» Power: 96.4 kilowatts = Power for 128 homes
– 300 GB hard disks: 939 x $250 = $234,750
» Time to read (sequential): 70.8 days
• Seems unreasonable!
8
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
14
BDD comes to the rescue
• There are many similarities across
contexts.
– Many copies of nearly-identical results.
• BDDs can represent large sets of
redundant data efficiently.
– Need a BDD encoding that exploits the
similarities.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
15
Contribution (1)
• Can represent context-sensitive call graph
efficiently with BDDs and a clever context
numbering scheme
– Inclusion-based pointer analysis
• 1014 contexts, 19 minutes
– Generates all answers
9
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
16
Contribution (2)
BDD hacking is complicated
bddbddb
(BDD-based deductive database)
• Pointer algorithm in 6 lines of Datalog
• Automatic translate into efficient BDD
implementation
• 10x performance over hand-tuned solver
(2164 lines of Java)
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
17
Contribution (3)
• bddbddb: General Datalog solver
– Supports simple declarative queries
– Easy use of context-sensitive pointer results
• Simple context-sensitive analyses:
– Escape analysis
– Type refinement
– Side effect analysis
– Many more presented in the paper
10
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
18
Context-sensitive call graphs
in BDD
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
19
Call graph relation
• Call graph expressed as a relation.
– Five edges:
• Calls(A,B)
• Calls(A,C)
• Calls(A,D)
• Calls(B,D)
• Calls(C,D)
B
D
C
A
11
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
20
Call graph relation
• Relation expressed as a binary
function.
– A=00, B=01, C=10, D=11
x1 x2 x3 x4 f
0 0 0 0 0
0 0 0 1 1
0 0 1 0 1
0 0 1 1 1
0 1 0 0 0
0 1 0 1 0
0 1 1 0 0
0 1 1 1 1
1 0 0 0 0
1 0 0 1 0
1 0 1 0 0
1 0 1 1 1
1 1 0 0 0
1 1 0 1 0
1 1 1 0 0
1 1 1 1 0
B
D
C
A 00
1001
11
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
21
Binary Decision Diagrams
• Graphical encoding of a truth table.
x2
x4
x3 x3
x4 x4 x4
0 0 0 1 0 0 0 0
x2
x4
x3 x3
x4 x4 x4
0 1 1 1 0 0 0 1
x1 0 edge
1 edge
12
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
22
Binary Decision Diagrams
• Collapse redundant nodes.
x2
x4
x3 x3
x4 x4 x4
0 0 0 0 0 0 0
x2
x4
x3 x3
x4 x4 x4
0 0 0 0
x1
11 1 1 1
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
23
Binary Decision Diagrams
• Collapse redundant nodes.
x2
x4
x3 x3
x4 x4 x4
x2
x4
x3 x3
x4 x4 x4
0
x1
1
13
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
24
Binary Decision Diagrams
• Collapse redundant nodes.
x2
x4
x3 x3
x2
x3 x3
x4 x4
0
x1
1
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
25
Binary Decision Diagrams
• Collapse redundant nodes.
x2
x4
x3 x3
x2
x3
x4 x4
0
x1
1
14
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
26
Binary Decision Diagrams
• Eliminate unnecessary nodes.
x2
x4
x3 x3
x2
x3
x4 x4
0
x1
1
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
27
Binary Decision Diagrams
• Eliminate unnecessary nodes.
x2
x3
x2
x3
x4
0
x1
1
15
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
28
Binary Decision Diagrams
• Size is correlated to amount of redundancy, NOT
size of relation.
– As the set gets larger, the number of don’t-care bits
increases, leading to fewer necessary nodes.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
29
Expanded Call Graph
A
DB C
E
F G
H
0 1 2
A
DB C
E
F G
H
E E
F F GG
H H H H H
0 1 2 210
16
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
30
Numbering Clones
A
DB C
E
F G
H
A
DB C
E
F G
H
E E
F F GG
H H H H H
0 0 0
01
2
0-2 0-2
0-2 3-5
0 0 0
0 1 2
0 1 2 210
0 1 2 3 4 5
0
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
31
Pointer Analysis
17
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
32
Pointer Analysis Example
h1: v1 = new Object();
h2: v2 = new Object();
v1.f = v2;
v3 = v1.f;
Input Relations
vPointsTo(v1,h1)
vPointsTo(v2,h2)
Store(v1,f,v2)
Load(v1,f,v3)
Output Relations
hPointsTo(h1,f,h2)
vPointsTo(v3,h2)
v1 h1
v2 h2
f
v3
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
33
hPointsTo(h1, f, h2) :- Store(v1, f, v2),
vPointsTo(v1, h1),
vPointsTo(v2, h2).
v1 h1
v2 h2
f
Inference Rule in Datalog
v1.f = v2;
Stores:
18
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
34
Context-sensitive pointer analysis
• Compute call graph with context-insensitive pointer analysis.
– Datalog rules for:• assignments, loads, stores
• discover call targets, bind parameters
• type filtering
– Apply rules until fix-point reached.
• Compute expanded call graph relation.
• Apply context-insensitive algorithm to expanded call graph.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
35
bddbddb:
BDD-Based Deductive DataBase
19
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
36
Datalog
• Declarative logic programming language designed for databases– Horn clauses
– Operates on relations
• Datalog is expressive– Relational algebra:
• Explicitly specify relational join, project, rename.
– Relational calculus:• Specify relations between variables; operations are implicit.
– Datalog:• Allows recursively-defined relations.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
37
Datalog BDD
• Join, project, rename are directly mapped
to built-in BDD operations
• Automatically optimizes:
– Rule application order
– Incrementalization
– Variable ordering
– BDD parameter tuning
– Many more…
20
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
38
Experimental Results
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
39
Experimental Results
• Top 20 Java projects on SourceForge
– Real programs with 100K+ users each
• Using automatic bddbddb solver
– Each analysis only a few lines of code
– Easy to try new algorithms, new queries
• Test system:
– Pentium 4 2.2GHz, 1GB RAM
– RedHat Fedora Core 1, JDK 1.4.2_04, javabdd library, Joeq compiler
21
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
40
Analysis time
y = 0.0078x2.3233
R2 = 0.9197
1
10
100
1000
10000
1 10 100 1000
Variable nodes
Seco
nd
s
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
41
Analysis memory
y = 0.3609x1.4204
R2 = 0.8859
1
10
100
1000
1 10 100 1000
Variable nodes
Me
gab
yte
s
22
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
42
Multi-type variables
• A variable is multi-type if it can point to
objects of different types.
– Measure of analysis precision
– One line in Datalog
• Two ways of handling context sensitivity:
– Projected: Merge all contexts together
– Full: Keep each context separate
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
43
Comparison of Accuracy (smaller bars are better)
0
1
2
3
4
5
6
7
8
9
10
fre
etts
nfc
ch
at
je
tty
op
en
wfe
jo
on
e
jb
oss
jb
ossd
ep
ssh
da
em
on
pm
d
azu
reu
s
fre
en
et
ssh
term
jg
rap
h
um
ldo
t
jb
idw
atc
h
co
lum
ba
ga
ntt
jxp
lore
r
je
dit
me
ga
me
k
gru
nts
pu
d
Benchmarks
% o
f m
ult
i-ty
pe
va
ria
ble
s
Context-insensitive Projected context-sensitive Fully context-sensitive
23
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
44
Related Work
• Context-insensitive pointer analysis
– Steensgaard: Unification-based (POPL’96)
– Andersen: Inclusion-based (’94)
• Optimizations: too many to list
• Berndl: formulate in BDD (PLDI’03)
– Das: one-level-flow (PLDI’00)
• Hybrid unification/inclusion
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
45
Related Work
• Scalable context-sensitive pointer analysis– Fähndrich etal, instantiation constraints (PLDI’00)
• CFL-reachability
• Unification-based: Imprecise.
• Handles recursion well.
• Computes on-demand.
– GOLF: Das etal. (SAS’01)• One level of context sensitivity.
– Foster, Fahndrich, Aiken (SAS’00)• Throws away information.
– Wilson & Lam: PTF (PLDI’95)• Doesn't really scale (especially complexity)
24
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
46
Related Work
• Whaley & Rinard: Escape analysis (OOPSLA’99)– Compositional summaries: only weak updates.
– Achieves scalability by collapsing escaped nodes.
• Emami & Hendren: Invocation graphs (PLDI’94)– Only shown to scale to 8K lines.
• Zhu & Calman: (PLDI’04)– To be presented next in this session.
• More complete coverage in the paper.
June 10, 2004 Cloning-Based Context-Sensitive
Pointer Alias Analysis using BDDs
47
Conclusion
• The first scalable context-sensitive inclusion-
based pointer analysis.
– Achieves context sensitivity by cloning.
• bddbddb: Datalog efficient BDD
• Easy to query results, develop new analyses.
• Very efficient!
– <19 minutes, <600mb on largest benchmark.
• Complete system is publically available at:
http://suif.stanford.edu/bddbddb
25
Epilogue
Impact
– Best Paper Award, PLDI 2004
– High-level specification is successful
– Datalog now used as specification language for Java pointer analyses
(Doop)
March 3, 2014 Flow-Insensitive Pointer Analysis 48
Reflection
Scalability
– Whaley and Lam’s algorithm scales to 700K LOC
– Shows the benefits of abstraction
– Represent the call graph as a binary function
– Represent the binary function as a BDD
Is this a solved problem?
– LOC measured in bytecodes not source lines
– Only top-level variables are context-sensitive
– This strategy works well for Java but not C
– For C, this analysis only scales to 30K LOC
March 3, 2014 Flow-Insensitive Pointer Analysis 49