refinement-based context-sensitive points-to analysis for java

18
1 Refinement-Based Context- Sensitive Points-To Analysis for Java Manu Sridharan, Rastislav Bodík UC Berkeley PLDI 2006

Upload: jasmine-kaufman

Post on 01-Jan-2016

27 views

Category:

Documents


0 download

DESCRIPTION

Refinement-Based Context-Sensitive Points-To Analysis for Java. Manu Sridharan , Rastislav Bodík UC Berkeley PLDI 2006. What Does Refinement Buy You?. Cast Safety Client. Increased scalability : enable new clients Memory : orders of magnitude savings - PowerPoint PPT Presentation

TRANSCRIPT

1

Refinement-Based Context-Sensitive Points-To Analysis for Java

Manu Sridharan, Rastislav BodíkUC Berkeley

PLDI 2006

2

What Does Refinement Buy You?

Increased scalability: enable new clients• Memory: orders of magnitude savings• Time: answer for a variable comes back in 1 second• ) Suitable for IDE

0

10

20

30

40

50

Andersen Whaley/Lam 1-ObjSens(Milanova /

Lhotak)

Refine

Algorithm

% o

f ca

sts

pro

ved

saf

e

Precision:

Cast Safety Client

3

Approach: Focus on the Client

Demand-driven: only do requested work

Client-driven refinement: stop when client satisfied

Example:• client asks: “can x point to o?”

• we refine until we answer NO (the good answer) or we time out

4

Context-Sensitive Analysis Costly

Context-sensitive analysis (def):• Compute result as if all calls inlined

• But, collapse recursive methods

Exponential blowup (code growth)

5

Why Not Existing Technique?

Most analyses approximate same way in all code• E.g., k-CFA

• Precision lost, esp. for data structures

Our analysis focuses precision where it matters• Fully precise in the limit

• Only small amount of code analyzed precisely

• First refinement algorithm for Java

6

Points-To Analysis Overview

Compute objects each variable can point toFor each var x, points-to set pt(x)

Model objects with abstract locations1: x = new Foo() yields pt(x) = { o1 }

Flow-insensitive: statements in any order

7

Points-To Analysis as CFL-Reachability

1) Assignmentsx = new Obj(); // o1

y = new Obj(); // o2

z = x; o1 x

y

z

o2

a

b

pid retidd c

(1 )1

(2 )2

[f

[g

]f

2) Method callsid(p) { return p; } a = id(x);b = id(y);

3) Heap accessesc.f = x;c.g = y;d = c.f; pt(x) = { o | o flowsTo x }

flowsTo: balanced call and field parensflowsTo: balanced call parensflowsTo: path exists

8

Summary of Formulation

Graph represents program

Compute reachability with two filters• Language of balanced call parens

• Language of balanced field parens

9

Single path problem

Problem: show path is unbalancedGoal: reduce number of visited edgesInsight: enough to find one unbalanced paren

o xt0 t1 t2[f

(1 )1

[h

[f (1 )1 [h

t5)5

t6

(7

t8

t9t7

… …

]j[p )8

o2 t10

t11

t12 ]g]k

10

Approximation via Match Edges

Match edges connect matched field parens• From source of open to sink of close

• Initially, all pairs connected

Use match edges to skip subpaths

o t3t0 t1 t2[f

[g [h ]h

t4 x]j ]f

[f [g [h ]h ]j ]f

11

Refining the Approximation

Refine by removing some match edges• Exposes more of original path for checking

Soundness: Traverse match edge ) assume field parens balanced on skipped path

Remove where unbalanced parens expected• Explore deeper levels of pointer indirection

o t3t0 t1[f

[g

t4 x]j ]f

[f [g [h ]h ]j ]f

12

Refinement With Both Languages

o t5t0 t1 t2

(1 )1

[g ]g

t6 x]f

)3t3 t4[f

(2

Match edges enable approximation of calls• Only can check calls on match-free subpaths

Match edge removal ) more call checking • Key point: refine heap and calls together

Calls: (1 )1 (2 )3

Fields: [f [g ]g ]f

13

Evaluation

14

Experimental Configuration

Implemented in Soot framework

Tested on large benchmarks x 2 clients• SPECjvm98, Dacapo suite

• Downcast checking, factory method props

Refine context-insensitive result

Timeout for long-running queries

15

Precision: Cast Checking

0

10

20

30

40

50

60

70

80

90

100

com

press db ja

ckja

vac

jess

mpe

gaud

iom

trt

soot

-c

sable

cc-j

polyg

lotan

tlrbl

oat

char

t

jytho

npm

d ps

Benchmark

% o

f C

as

ts P

rov

ed

Sa

fe

1-ObjSens (Milanova / Lhotak) Refine

16

Scalability: Time and Memory

Average query time less than 1 second• Interactive performance (for IDE)

• At most 13 minutes for casts, 4 minutes for factory client

Very low memory usage: at most 35MB• Of this, 30MB for context-insensitive

result

• Compare with >2GB for 1-ObjSens analysis

17

Demand-Driven vs. Exhaustive

Demand advantage: no caching required• Hence, low memory overhead

• No engineering of efficient sets

• Good for changing code; just re-compute

Demand advantage: faster for many clients• Often only care about some variables

Demand disadvantage: slower querying all vars• At most 90 minutes for all app. vars

• But, still good precision, memory

18

Conclusions

Novel refinement-based analysis• More precise for tested clients

• Interactive performance for queries

• Low memory: could scale even more

• Relatively easy to implement

Insight: refine heap and calls together• Useful for other balanced-paren

analyses?