Spring 2014 Program Analysis and Verification, Lecture 11: Abstract Interpretation III

Post on 22-Feb-2016

Spring 2014 Program Analysis and Verification

Lecture 11: Abstract Interpretation III

Roman Manevich, Ben-Gurion University

2

Syllabus

• Semantics
• Natural Semantics
• Structural semantics
• Axiomatic Verification
• Static Analysis
• Automating Hoare Logic
• Control Flow Graphs
• Equation Systems
• Collecting Semantics
• Abstract Interpretation fundamentals
• Lattices
• Fixed-Points
• Chaotic Iteration
• Galois Connections
• Domain constructors
• Widening/Narrowing
• Interprocedural Analysis
• Analysis Techniques
• Numerical Domains
• CEGAR
• Alias analysis
• Shape Analysis
• Crafting your own
• Soot
• From proofs to abstractions
• Systematically developing transformers

3

Previously

• Solving monotone systems
• Fixed-points
• Vanilla static analysis algorithm
• Chaotic iteration

4

Static analysis

• Given a system of equations for the collecting semantics:

$R[0] = \{x \in \mathbb{Z}\}$ // established input
$R[1] = R[0] \cup R[4]$
$R[2] = [\![\text{assume } x>0]\!]\; R[1]$
$R[3] = [\![\text{assume } x \le 0]\!]\; R[1]$
$R[4] = [\![x := x-1]\!]\; R[2]$

• A static analysis solves a corresponding system of equations over an abstract domain:

$R[0]^\# = \{x \in \mathbb{Z}\}^\#$
$R[1]^\# = R[0]^\# \sqcup R[4]^\#$
$R[2]^\# = [\![\text{assume } x>0]\!]^\#\; R[1]^\#$
$R[3]^\# = [\![\text{assume } x \le 0]\!]^\#\; R[1]^\#$
$R[4]^\# = [\![x := x-1]\!]^\#\; R[2]^\#$

• Questions:
– How do you solve the second system? Chaotic iteration
– What is the relation between the solutions? This lecture

5

Monotone function

[Figure: a function f from L1 to L2; the elements x and y of L1 are mapped to f(x) and f(y) in L2.]

6

Important cases of monotonicity

• Join: $f(X, Y) = X \sqcup Y$ is monotone in each operand
– Prove it!
• Set lifting function: for a set X and any function g, $F(X) = \{ g(x) \mid x \in X \}$ is monotone w.r.t. $\subseteq$
– Prove it!
• Notice that the collecting semantics function is defined in terms of
– Join (set union)
– Semantic function for atomic statements lifted to sets of states
• Conclusion: collecting semantics function is monotone

7

Fixed points

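• $L = (D, \sqsubseteq, \sqcup, \sqcap, \bot, \top)$
• $f : D \to D$ monotone
• $Fix(f) = \{ d \mid f(d) = d \}$
• $Red(f) = \{ d \mid f(d) \sqsubseteq d \}$
• $Ext(f) = \{ d \mid d \sqsubseteq f(d) \}$
• Theorem [Tarski 1955]
– $lfp(f) = \sqcap Fix(f) = \sqcap Red(f) \in Fix(f)$
– $gfp(f) = \sqcup Fix(f) = \sqcup Ext(f) \in Fix(f)$

[Figure: the lattice with the regions Red(f), Ext(f) and Fix(f); lfp and gfp are marked, together with the iterates $f^n(\bot)$.]

1. Does a solution always exist? Yes
2. If so, is it unique? No, but it has least/greatest solutions
3. If so, is it computable? Under some conditions…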
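The theorem can be checked by brute force on a small lattice. The sketch below assumes the powerset of {0, 1, 2} ordered by inclusion and a hypothetical monotone function `f` (neither comes from the lecture); it enumerates Fix, Red and Ext and computes lfp and gfp as the meet of Red and the join of Ext.

```python
from itertools import combinations

# Assumed example lattice: all subsets of U ordered by inclusion.
U = frozenset({0, 1, 2})

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def f(d):
    # Hypothetical monotone function: keep d and always add 0.
    return d | {0}

def meet(elems):
    # Greatest lower bound in the powerset lattice is intersection.
    result = U
    for e in elems:
        result &= e
    return result

def join(elems):
    # Least upper bound in the powerset lattice is union.
    result = frozenset()
    for e in elems:
        result |= e
    return result

D = powerset(U)
fix = [d for d in D if f(d) == d]   # fixed points
red = [d for d in D if f(d) <= d]   # reductive points (post-fixed points)
ext = [d for d in D if d <= f(d)]   # extensive points (pre-fixed points)

lfp = meet(red)
gfp = join(ext)
print(sorted(lfp), sorted(gfp))
```

Here Fix(f) has four elements, so the solution is not unique, but the least and greatest ones are well defined.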

8

Continuous functions

• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
• A function f is continuous if for every increasing chain $Y \subseteq D^*$,
$f(\sqcup Y) = \sqcup \{ f(y) \mid y \in Y \}$
• Lemma: if f is continuous then f is monotone
• Proof: assume $x \sqsubseteq y$. Therefore $x \sqcup y = y$. Then $f(y) = f(x \sqcup y) = f(x) \sqcup f(y)$, which means $f(x) \sqsubseteq f(y)$

10

Kleene’s fixed point theorem

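• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order and $f : D \to D$ a continuous function; then
$lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$
• That is, take the ascending chain
$\bot \sqsubseteq f(\bot) \sqsubseteq f(f(\bot)) \sqsubseteq \dots \sqsubseteq f^n(\bot) \sqsubseteq \dots$
and return the supremum
– Why is this an ascending chain?
• But how do you know if a function f is continuous?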

11

Continuity and ACC condition

• Let $L = (D, \sqsubseteq, \sqcup, \bot)$ be a complete partial order
– Every ascending chain has a least upper bound
• L satisfies the ascending chain condition (ACC) if every ascending chain eventually stabilizes:
$d_0 \sqsubseteq d_1 \sqsubseteq \dots \sqsubseteq d_n = d_{n+1} = d_{n+2} = \dots$
• Lemma: Monotone functions on posets satisfying ACC are continuous

12

Resulting algorithm

• Kleene's fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone

[Figure: the ascending chain $\bot, f(\bot), f^2(\bot), \dots, f^n(\bot)$ climbing to lfp.]

Mathematical definition:

$lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$

Algorithm:

    d := ⊥
    while f(d) ≠ d do
        d := f(d)
    return d
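The loop above translates almost literally into code. This is a minimal sketch, assuming a finite powerset lattice (so ACC holds) and a hypothetical monotone function `step` that computes graph reachability; the graph `EDGES` is made up for illustration.

```python
def kleene_lfp(f, bottom):
    """Iterate f from bottom until the value stops changing."""
    d = bottom
    while f(d) != d:
        d = f(d)
    return d

# Hypothetical example: nodes reachable from node 0 in a small graph.
EDGES = {0: [1], 1: [2], 2: [0], 3: [4], 4: []}

def step(reached):
    # Monotone: a larger input set can only produce a larger output set.
    return frozenset({0}) | frozenset(t for s in reached for t in EDGES[s])

reachable = kleene_lfp(step, frozenset())
print(sorted(reachable))
```

Termination relies on both assumptions: monotonicity makes the iterates an ascending chain, and the finite lattice forces that chain to stabilize.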

13

Vanilla algorithm

Problem Definition:
1. Lattice of properties L of finite height (ACC)
2. For each statement define a monotone transformer

Preparation:
3. Parse program into AST
4. Convert AST into CFG
5. Generate system of equations from CFG

Analysis:
6. Initialize each analysis variable with $\bot$
7. Update all analysis variables of each equation until reaching a fixed point

Drawback: non-incremental. Most variables don't change.

14

Chaotic iteration

• Input:
– A cpo $L = (D, \sqsubseteq, \sqcup, \bot)$ satisfying ACC
– $L^n = L \times L \times \dots \times L$
– A monotone function $f : D^n \to D^n$
– A system of equations $\{ X[i] = F[i](X) \mid 1 \le i \le n \}$, where F[i] is the i-th component of f
• Output: lfp(f)
• A worklist-based algorithm

    for i := 1 to n do
        X[i] := ⊥
    WL = {1,…,n}
    while WL ≠ ∅ do
        i := pop WL // choose index non-deterministically
        N := F[i](X)
        if N ≠ X[i] then
            X[i] := N
            add all the indexes that directly depend on i to WL
            (X[j] depends on X[i] if F[j] contains X[i])
    return X
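A worklist solver of this kind can be sketched as follows. The encoding of each equation as a pair (function, list of indexes it reads) is an assumption made for this example, and the input set is restricted to {0, 1, 2, 3} instead of all integers so that every set stays finite. The equations are the collecting semantics of the `x := x-1` loop shown earlier in the lecture.

```python
def chaotic_iteration(equations, bottom):
    """equations[i] = (F_i, reads_i): F_i maps the vector X to a new value
    for X[i]; reads_i lists the indexes that F_i reads."""
    n = len(equations)
    X = [bottom] * n
    # dependents[i] = indexes j whose equation reads X[i]
    dependents = {i: [j for j in range(n) if i in equations[j][1]]
                  for i in range(n)}
    worklist = set(range(n))
    while worklist:
        i = worklist.pop()            # arbitrary choice of index
        new = equations[i][0](X)
        if new != X[i]:
            X[i] = new
            worklist.update(dependents[i])
    return X

INPUT = frozenset(range(0, 4))  # assumed finite stand-in for "x in Z"

equations = [
    (lambda X: INPUT, []),                                        # R[0]
    (lambda X: X[0] | X[4], [0, 4]),                              # R[1]
    (lambda X: frozenset(x for x in X[1] if x > 0), [1]),         # R[2]
    (lambda X: frozenset(x for x in X[1] if x <= 0), [1]),        # R[3]
    (lambda X: frozenset(x - 1 for x in X[2]), [2]),              # R[4]
]

R = chaotic_iteration(equations, frozenset())
print([sorted(r) for r in R])
```

Only the equations whose inputs changed are re-evaluated, which addresses the "most variables don't change" drawback of the vanilla algorithm.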

15

Required knowledge

• Collecting semantics
• Abstract semantics (over lattices)
• Algorithm to compute abstract semantics (chaotic iteration)
• Connection between collecting semantics and abstract semantics
• Abstract transformers

16

Today

• Galois connections
• Abstract transformers
• Global soundness

18

Recap

• We defined a reference semantics: the collecting semantics
• We defined an abstract semantics for a given lattice and abstract transformers
• We defined an algorithm to compute the abstract least fixed-point when transformers are monotone and the lattice obeys ACC
• Questions:
1. Does the algorithm terminate?
2. What is the connection between the two least fixed-points?
3. Transformer monotonicity is required for termination. What should we require for correctness?

19

Handling non-monotone transformers

• Kleene's fixed point theorem gives a constructive method for computing lfp(f) over a poset with ACC when f is monotone
• Monotonicity ensures that
$\bot \sqsubseteq f(\bot) \sqsubseteq \dots \sqsubseteq f^n(\bot) \sqsubseteq \dots$
is an ascending chain
• What if f is not necessarily monotone? How can we ensure termination?

Mathematical definition:

$lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot)$

Algorithm:

    d := ⊥
    while f(d) ≠ d do
        d := f(d)
    return d

20

Handling non-monotone transformers

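• Define $f'(d) = d \sqcup f(d)$
• Now f' is extensive: $d \sqsubseteq d \sqcup f(d) = f'(d)$, and so
$\bot \sqsubseteq f'(\bot) \sqsubseteq \dots \sqsubseteq f'^n(\bot) \sqsubseteq \dots$
is an ascending chain
• Result is not necessarily the least fixed point: we get a (post)fixed point in finite time (ACC)

Mathematical definition:

$lfp(f) = \bigsqcup_{n \in \mathbb{N}} f^n(\bot) \sqsubseteq \bigsqcup_{n \in \mathbb{N}} f'^n(\bot)$

Revised algorithm:

    d := ⊥
    while f'(d) ≠ d do
        d := f'(d)
    return d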
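The effect of the revised algorithm can be seen on a tiny example. The sketch below assumes the powerset of {0, 1, 2} and a hypothetical non-monotone `f` for which plain iteration cycles forever (∅, {0}, {1}, {2}, {0}, …); iterating f'(d) = d ⊔ f(d) instead terminates with a post-fixed point.

```python
def f(s):
    # Hypothetical non-monotone function on subsets of {0, 1, 2}:
    # f(set()) = {0} but f({1}) = {2}, although set() is below {1}.
    return frozenset((x + 1) % 3 for x in s) if s else frozenset({0})

def extensive_fixpoint(f, bottom):
    """Iterate f'(d) = d | f(d); the iterates can only grow."""
    d = bottom
    while True:
        nxt = d | f(d)
        if nxt == d:
            return d
        d = nxt

result = extensive_fixpoint(f, frozenset())
print(sorted(result))
```

The returned value satisfies f(d) ⊑ d, which is all the revised algorithm promises; it need not be a least fixed point of f.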

21

Relating the concrete domain and the abstract domain

22

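Galois Connection

• Given two complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$, the concrete domain
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$, the abstract domain
• A Galois Connection (GC) is a quadruple $(C, \alpha, \gamma, A)$ that relates C and A via the monotone functions
– The abstraction function $\alpha : D_C \to D_A$
– The concretization function $\gamma : D_A \to D_C$
• For every concrete element $c \in D_C$ and abstract element $a \in D_A$:
$\alpha(\gamma(a)) \sqsubseteq_A a$ and $c \sqsubseteq_C \gamma(\alpha(c))$
• Alternatively: $\alpha(c) \sqsubseteq_A a$ iff $c \sqsubseteq_C \gamma(a)$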
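On finite domains the two formulations can be verified exhaustively. The sketch below is an illustration outside the lecture's running example: it assumes a five-element sign lattice as the abstract domain and subsets of a small integer range as the concrete domain, and checks both conditions and the adjunction form.

```python
from itertools import combinations

UNIVERSE = range(-2, 3)  # assumed finite stand-in for the integers
SIGNS = ["bot", "neg", "zero", "pos", "top"]

def leq(a, b):
    # Flat sign lattice: bot below everything, top above everything.
    return a == b or a == "bot" or b == "top"

def alpha(c):
    if not c:
        return "bot"
    if all(x < 0 for x in c):
        return "neg"
    if all(x == 0 for x in c):
        return "zero"
    if all(x > 0 for x in c):
        return "pos"
    return "top"

MEANING = {
    "bot": lambda x: False,
    "neg": lambda x: x < 0,
    "zero": lambda x: x == 0,
    "pos": lambda x: x > 0,
    "top": lambda x: True,
}

def gamma(a):
    return frozenset(x for x in UNIVERSE if MEANING[a](x))

# Every concrete element: all subsets of the universe.
CONCRETE = [frozenset(c) for r in range(len(UNIVERSE) + 1)
            for c in combinations(UNIVERSE, r)]

reductive = all(leq(alpha(gamma(a)), a) for a in SIGNS)
extensive = all(c <= gamma(alpha(c)) for c in CONCRETE)
adjoint = all(leq(alpha(c), a) == (c <= gamma(a))
              for c in CONCRETE for a in SIGNS)
print(reductive, extensive, adjoint)
```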

23

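Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$

[Figure: domains C and A; c in C is mapped by $\alpha$ to $\alpha(c)$ in A, which $\gamma$ maps back to $\gamma(\alpha(c))$ in C. $\alpha(c)$ is the most precise (least) element in A representing c.]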

24

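Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$

[Figure: domains C and A; a in A is mapped by $\gamma$ to $\gamma(a)$ in C, what a represents in C (its meaning), which $\alpha$ maps back to $\alpha(\gamma(a))$ in A.]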

26

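Example: lattice of equalities

• Concrete lattice:
$C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
• Abstract lattice:
$EQ = \{ x{=}y \mid x, y \in Var \}$
$A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
– Treat elements of A as both formulas and sets of constraints
• Useful for copy propagation, a compiler optimization
• $\beta(\sigma) = \alpha(\{\sigma\}) = \{ x{=}y \mid \sigma(x) = \sigma(y) \}$, that is, $\sigma \models x{=}y$
• $\alpha(X) = \bigcap \{ \beta(\sigma) \mid \sigma \in X \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
• $\gamma(Y) = \{ \sigma \mid \sigma \models \bigwedge Y \} = models(\bigwedge Y)$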
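These definitions are small enough to run. The sketch below assumes three variables and a three-value range (so that the concretization is a finite list), and represents an equality a=b as the pair `(a, b)`; these representation choices are illustrative, not taken from the lecture.

```python
from itertools import product

VARS = ("x", "y", "z")
VALUES = (4, 5, 6)  # assumed small value range so that gamma is finite
EQ = frozenset((a, b) for a in VARS for b in VARS)  # (a, b) stands for a=b

def beta(state):
    """Abstraction of one state: every equality the state satisfies."""
    return frozenset((a, b) for (a, b) in EQ if state[a] == state[b])

def alpha(states):
    """Abstraction of a set of states: equalities satisfied by all of them.
    Intersection is the join of A; the empty set of states maps to EQ."""
    result = EQ
    for s in states:
        result &= beta(s)
    return result

def gamma(constraints):
    """Concretization: all states (over VALUES) that model the constraints."""
    all_states = [dict(zip(VARS, vals))
                  for vals in product(VALUES, repeat=len(VARS))]
    return [s for s in all_states
            if all(s[a] == s[b] for (a, b) in constraints)]

s1 = {"x": 5, "y": 5, "z": 5}
s2 = {"x": 4, "y": 4, "z": 6}
print(sorted(alpha([s1, s2])))
print(len(gamma(frozenset({("x", "y"), ("y", "z")}))))
```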

27

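Galois Connection: $c \sqsubseteq_C \gamma(\alpha(c))$

[Figure: in C, the state $[x \mapsto 5, y \mapsto 5, z \mapsto 5]$; in A, its abstraction {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}, the most precise (least) element in A representing $[x \mapsto 5, y \mapsto 5, z \mapsto 5]$. Concretizing it gives the set $\{\dots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \dots\}$. The element {x=x, y=y, z=z} of A is also shown.]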

28

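Most precise abstract representation

$\alpha(c) = \sqcap \{ c' \mid c \sqsubseteq \gamma(c') \}$

[Figure: c in C and the elements c' of A whose concretization lies above c; $\alpha(c)$ is their meet.]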

29

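Most precise abstract representation

$\alpha(c) = \sqcap \{ c' \mid c \sqsubseteq \gamma(c') \}$

[Figure: the same diagram instantiated with $c = [x \mapsto 5, y \mapsto 5, z \mapsto 5]$ and $\alpha(c)$ = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}; the less precise elements {x=y, y=z}, {x=y, z=y} and {x=y} of A are also shown.]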

30

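Galois Connection: $\alpha(\gamma(a)) \sqsubseteq_A a$

[Figure: a = {x=y, y=z} in A; $\gamma(a)$, what a represents in C (its meaning), is $\{\dots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \dots\}$; $\alpha(\gamma(a))$ = {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y}.]

$\alpha \circ \gamma$ is called a semantic reduction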

31

Partial reduction

• The operator $\alpha \circ \gamma$ is called a semantic reduction since $\alpha(\gamma(a))$ means the same as a, but it is a reduced (more precise) version of a
• An operator $reduce : D_A \to D_A$ is a partial reduction if
1. $reduce(a) \sqsubseteq_A a$ and
2. $\gamma(a) = \gamma(reduce(a))$

32

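Galois Insertion $\forall a: \alpha(\gamma(a)) = a$

[Figure: the element {x=x, y=y, z=z, x=y, y=x, x=z, z=x, y=z, z=y} of A and its concretization $\{\dots, [x \mapsto 6, y \mapsto 6, z \mapsto 6], [x \mapsto 5, y \mapsto 5, z \mapsto 5], [x \mapsto 4, y \mapsto 4, z \mapsto 4], \dots\}$ in C. All elements are reduced.]

How can we obtain a Galois Insertion from a Galois Connection?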

33

Special cases

34

Properties of a Galois Connection

• The abstraction and concretization functions uniquely determine each other:

$\gamma(a) = \sqcup \{ c \mid \alpha(c) \sqsubseteq a \}$
$\alpha(c) = \sqcap \{ a \mid c \sqsubseteq \gamma(a) \}$

35

Abstracting (disjunctive) sets

• It is usually convenient to first define the abstraction of single elements
$\beta(s) = \alpha(\{s\})$
• Then lift the abstraction to sets of elements
$\alpha(X) = \bigsqcup_A \{ \beta(s) \mid s \in X \}$

36

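The case of symbolic domains

• An important class of abstract domains are symbolic domains: domains of formulas
• $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
• If $D_A$ is a set of formulas then the abstraction of a state is defined as
$\beta(\sigma) = \alpha(\{\sigma\}) = \sqcap_A \{ \varphi \mid \sigma \models \varphi \}$
the least formula from $D_A$ that $\sigma$ satisfies
• The abstraction of a set of states is
$\alpha(X) = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in X \}$
• The concretization is
$\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = models(\varphi)$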

37

Composing Galois connections

38

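Inducing along the connections

• Assume the complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
$M = (D_M, \sqsubseteq_M, \sqcup_M, \sqcap_M, \bot_M, \top_M)$
and the Galois connections
$GC_{C,A} = (C, \alpha_{C,A}, \gamma_{A,C}, A)$ and $GC_{A,M} = (A, \alpha_{A,M}, \gamma_{M,A}, M)$
• Lemma: both connections induce the connection
$GC_{C,M} = (C, \alpha_{C,M}, \gamma_{M,C}, M)$
defined by applying the abstractions (respectively concretizations) one after the other:
$\alpha_{C,M}(c) = \alpha_{A,M}(\alpha_{C,A}(c))$ and $\gamma_{M,C}(m) = \gamma_{A,C}(\gamma_{M,A}(m))$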

39

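Inducing along the connections

[Figure: three domains C, A and M connected by $\alpha_{C,A}$, $\gamma_{A,C}$, $\alpha_{A,M}$ and $\gamma_{M,A}$; c in C is mapped to $\alpha_{C,A}(c)$ in A and then to $a' = \alpha_{A,M}(\alpha_{C,A}(c))$ in M, whose concretization back in C is c'.]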

40

Relating abstract transformers to concrete transformers

41

Sound abstract transformer

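• Given two lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
• A concrete transformer $f : D_C \to D_C$ and an abstract transformer $f^\# : D_A \to D_A$
• We say that $f^\#$ is a sound transformer (w.r.t. f) if
• $\forall c: f(c) = c' \Rightarrow f^\#(\alpha(c)) \sqsupseteq \alpha(c')$
• For every $a \in D_A$: $\alpha(f(\gamma(a))) \sqsubseteq_A f^\#(a)$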
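The last condition can be tested exhaustively when the abstract domain is finite. The following sketch assumes a sign lattice over a small symmetric integer range and uses negation as the concrete transformer; the domain, `NEGATE` table and `is_sound` helper are all made up for this illustration.

```python
UNIVERSE = range(-2, 3)  # assumed finite, symmetric stand-in for the integers
SIGNS = ["bot", "neg", "zero", "pos", "top"]

def leq(a, b):
    return a == b or a == "bot" or b == "top"

def alpha(c):
    if not c:
        return "bot"
    if all(x < 0 for x in c):
        return "neg"
    if all(x == 0 for x in c):
        return "zero"
    if all(x > 0 for x in c):
        return "pos"
    return "top"

MEANING = {
    "bot": lambda x: False,
    "neg": lambda x: x < 0,
    "zero": lambda x: x == 0,
    "pos": lambda x: x > 0,
    "top": lambda x: True,
}

def gamma(a):
    return frozenset(x for x in UNIVERSE if MEANING[a](x))

def f(c):
    # Concrete transformer: x := -x lifted to sets of values.
    return frozenset(-x for x in c)

NEGATE = {"bot": "bot", "neg": "pos", "zero": "zero", "pos": "neg", "top": "top"}

def f_sharp(a):
    # Candidate abstract transformer for x := -x.
    return NEGATE[a]

def is_sound(f, f_sharp):
    """Check alpha(f(gamma(a))) <= f_sharp(a) for every abstract element a."""
    return all(leq(alpha(f(gamma(a))), f_sharp(a)) for a in SIGNS)

print(is_sound(f, f_sharp))          # sign-flipping transformer
print(is_sound(f, lambda a: a))      # identity is not sound for negation
```

Treating negation as the identity fails the check because the abstraction of the negated negatives is "pos", which is not below "neg".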

42

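Transformer soundness condition 1

$\forall c: f(c) = c' \Rightarrow f^\#(\alpha(c)) \sqsupseteq \alpha(c')$

[Figure: domains C and A; f maps c to c' in C, $f^\#$ is applied to $\alpha(c)$ in A, and the result lies above $\alpha(c')$.]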

43

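Transformer soundness condition 2

$\forall a: f^\#(a) = a' \Rightarrow f(\gamma(a)) \sqsubseteq \gamma(a')$

[Figure: domains C and A; $f^\#$ maps a to a' in A, f is applied to $\gamma(a)$ in C, and the result lies below $\gamma(a')$.]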

44

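Best (induced) transformer

$f^\#(a) = \alpha(f(\gamma(a)))$

[Figure: domains C and A; a is concretized by $\gamma$, f is applied in C, and the result is abstracted by $\alpha$ to give $f^\#(a)$.]

Problem: incomputable directly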

45

Best abstract transformer [CC'77]

• Best in terms of precision
– Most precise abstract transformer
– May be too expensive to compute
• Constructively defined as
$f^\# = \alpha \circ f \circ \gamma$
– Induced by the GC
• Not directly computable because the first step is concretization
• We often compromise for a "good enough" transformer
– Useful tool: partial concretization

46

Developing a sound abstract transformer by example

47

Transformer example

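• $C = (2^{State}, \subseteq, \cup, \cap, \emptyset, State)$
• $EQ = \{ x{=}y \mid x, y \in Var \}$
$A = (2^{EQ}, \supseteq, \cap, \cup, EQ, \emptyset)$
• $\beta(\sigma) = \alpha(\{\sigma\}) = \{ x{=}y \mid \sigma(x) = \sigma(y) \}$, that is, $\sigma \models x{=}y$
$\alpha(S) = \bigcap \{ \beta(\sigma) \mid \sigma \in S \} = \bigsqcup_A \{ \beta(\sigma) \mid \sigma \in S \}$
$\gamma(\varphi) = \{ \sigma \mid \sigma \models \varphi \} = models(\varphi)$
• Concrete: $[\![x := y]\!]\; S = \{ \sigma[x \mapsto \sigma(y)] \mid \sigma \in S \}$
• Abstract: $[\![x := y]\!]^\#\; S = ?$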

48

Developing a transformer for EQ - 1

• Input has the form $S = \bigwedge \{a{=}b\}$
• $sp(x := expr, \varphi) = \exists v.\; x = expr[v/x] \wedge \varphi[v/x]$
• $sp(x := y, S) = \exists v.\; x = y[v/x] \wedge S[v/x] = \dots$
• Let's define helper notations:
– $Mod(x := y, S) = \bigwedge \{ x{=}a,\; b{=}x \in S \}$
• Subset of equalities containing x (will be modified)
– $Frame(x := y, S) = S \setminus Mod(x := y, S)$
• Subset of equalities not containing x (i.e., the frame)

49

Developing a transformer for EQ - 2

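• $sp(x := y, S) = \exists v.\; x = y[v/x] \wedge \bigwedge\{a{=}b\}[v/x] = \dots$
• Two cases
– x is y: $sp(x := x, S) = S$
– x is different from y:
$sp(x := y, S) = \exists v.\; x{=}y \wedge Mod(x := y, S)[v/x] \wedge Frame(x := y, S)[v/x]$
$= x{=}y \wedge Frame(x := y, S) \wedge \exists v.\; Mod(x := y, S)[v/x]$
$\Rightarrow x{=}y \wedge Frame(x := y, S)$
• Vanilla transformer: $[\![x := y]\!]^{\#1}\; S = x{=}y \wedge Frame(x := y, S)$
• Example: $[\![x := y]\!]^{\#1} \bigwedge\{x{=}p, q{=}x, m{=}n\} = \bigwedge\{x{=}y, m{=}n\}$
Is this the most precise result?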
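The vanilla transformer is a short set computation. The sketch below assumes an abstract element is a frozenset of pairs, with `(a, b)` standing for the equality a=b; the function names `mod`, `frame` and `assign_vanilla` are chosen here to mirror the helper notations and are not the lecture's implementation.

```python
def mod(x, S):
    """Equalities of S that mention x (they are invalidated by x := y)."""
    return frozenset(e for e in S if x in e)

def frame(x, S):
    """Equalities of S that do not mention x (they survive x := y)."""
    return S - mod(x, S)

def assign_vanilla(x, y, S):
    """Vanilla transformer for x := y: keep the frame and add x=y."""
    if x == y:
        return S
    return frame(x, S) | {(x, y)}

before = frozenset({("x", "p"), ("q", "x"), ("m", "n")})
after = assign_vanilla("x", "y", before)
print(sorted(after))
```

This reproduces the example from the slide: the two equalities mentioning x are dropped, m=n is kept, and x=y is added.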

50

Developing a transformer for EQ - 3

• $[\![x := y]\!]^{\#1} \bigwedge\{x{=}p, x{=}q, m{=}n\} = \bigwedge\{x{=}y, m{=}n\} \sqsupseteq \bigwedge\{x{=}y, m{=}n, p{=}q\}$
– Where does the information p=q come from?
• $sp(x := y, S) = x{=}y \wedge Frame(x := y, S) \wedge \exists v.\; Mod(x := y, S)[v/x]$
• $\exists v.\; Mod(x := y, S)[v/x]$ holds possible equalities between different a's and b's. How can we account for that?

51

Developing a transformer for EQ - 4

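• Define a reduction operator:
reduce(S) =
    if $\{a{=}b, b{=}c\} \subseteq S$ and $\{a{=}c\} \not\subseteq S$ then $reduce(S \cup \{a{=}c\})$
    if $\{a{=}b\} \subseteq S$ and $\{b{=}a\} \not\subseteq S$ then $reduce(S \cup \{b{=}a\})$
    else S
• Define $[\![x := y]\!]^{\#2} = [\![x := y]\!]^{\#1} \circ reduce$
• $[\![x := y]\!]^{\#2}(\bigwedge\{x{=}p, x{=}q, m{=}n\}) = \bigwedge\{x{=}y, m{=}n, p{=}q\}$
Is this the best transformer?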

52

Developing a transformer for EQ - 5

• $[\![x := y]\!]^{\#2}(\bigwedge\{y{=}z\}) = \bigwedge\{x{=}y, y{=}z\} \sqsupseteq \bigwedge\{x{=}y, y{=}z, x{=}z\}$
• Idea: apply the reduction operator again after the vanilla transformer
• $[\![x := y]\!]^{\#3} = reduce \circ [\![x := y]\!]^{\#1} \circ reduce$
• Observation: after the first time we apply reduce, all subsequent values will be in the image of the abstraction, so really we only need to apply it once to the input
• Finally: $[\![x := y]\!]^{\#} = reduce \circ [\![x := y]\!]^{\#1}$
– Best transformer for reduced elements (elements in the image of the abstraction)
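The progression from #1 to the final transformer can be replayed in code. This is a sketch under the assumption that an abstract element is a frozenset of pairs `(a, b)` standing for a=b; `reduce` here is written as an iterative symmetric and transitive closure instead of the recursive definition, and `assign_2` and `assign` are names invented for #2 and the final transformer.

```python
def reduce(S):
    """Close a set of equalities under symmetry and transitivity."""
    S = set(S)
    while True:
        new = {(b, a) for (a, b) in S}
        new |= {(a, d) for (a, b) in S for (c, d) in S if b == c}
        if new <= S:
            return frozenset(S)
        S |= new

def assign_vanilla(x, y, S):
    """Transformer #1 for x := y: drop equalities mentioning x, add x=y."""
    if x == y:
        return frozenset(S)
    return frozenset(e for e in S if x not in e) | {(x, y)}

def assign_2(x, y, S):
    """Transformer #2: reduce the input, then apply the vanilla transformer."""
    return assign_vanilla(x, y, reduce(S))

def assign(x, y, S):
    """Final transformer: apply the vanilla transformer, then reduce."""
    return reduce(assign_vanilla(x, y, S))

s1 = frozenset({("x", "p"), ("x", "q"), ("m", "n")})
s2 = frozenset({("y", "z")})
print(("p", "q") in assign_vanilla("x", "y", s1), ("p", "q") in assign_2("x", "y", s1))
print(("x", "z") in assign_2("x", "y", s2), ("x", "z") in assign("x", "y", s2))
```

The first line shows that reducing the input recovers p=q, and the second that only the final reduction recovers x=z. Because the closure also derives reflexive facts such as p=p, the sets are larger than the abbreviated ones written on the slides.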

53

Properties of abstract transformers

54

Negative property of best transformers

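• Let $f^\# = \alpha \circ f \circ \gamma$
• Best transformer does not compose:
$\alpha(f(f(\gamma(a)))) \sqsubseteq f^\#(f^\#(a))$
• Best transformer of composed operation:
$(f^2)^\# = (f \circ f)^\# = \alpha \circ f \circ f \circ \gamma$
• Composition of best transformers:
$(f^\#)^2 = f^\# \circ f^\# = \alpha \circ f \circ \gamma \circ \alpha \circ f \circ \gamma$
• The intermediate $\gamma \circ \alpha$ is the source of precision loss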
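A concrete instance makes the gap visible. The sketch below is an illustration outside the lecture's EQ domain: it assumes bounded integer intervals as the abstract domain, finite sets of integers as the concrete domain, and takes one Collatz step as a hypothetical concrete transformer `f`. With these finite objects the best transformer can be computed literally as abstraction after f after concretization.

```python
def alpha(S):
    """Abstract a finite set of integers by its bounding interval."""
    return (min(S), max(S)) if S else None

def gamma(a):
    """Concretize a bounded interval (lo, hi); None is the empty interval."""
    return frozenset() if a is None else frozenset(range(a[0], a[1] + 1))

def f(S):
    # Hypothetical concrete transformer: one Collatz step on every element.
    return frozenset(x // 2 if x % 2 == 0 else 3 * x + 1 for x in S)

def best(g):
    """Best abstract transformer induced by g: alpha after g after gamma."""
    return lambda a: alpha(g(gamma(a)))

a = (1, 2)
best_of_composition = best(lambda S: f(f(S)))(a)
composition_of_bests = best(f)(best(f)(a))
print(best_of_composition, composition_of_bests)
```

After the first step the exact set {1, 4} is widened to the interval [1, 4]; the extra element 3 then maps to 10, so composing the best transformers yields [1, 10] while the best transformer of the composition yields [2, 4].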

55

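$\alpha(f(f(\gamma(a)))) \sqsubseteq f^\#(f^\#(a))$

[Figure: domains C and A; on the concrete side f is applied twice to $\gamma(a)$ and the result is abstracted once, while on the abstract side $f^\#$ is applied twice, going through $\alpha$ and $\gamma$ between the two applications of f.]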

56

Global (fixed point) Soundness theorems

57

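Soundness theorem 1

1. Given two complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
2. Monotone concrete transformer $f : D_C \to D_C$
3. Monotone abstract transformer $f^\# : D_A \to D_A$
4. Local soundness: $\forall a \in D_A : f(\gamma(a)) \sqsubseteq \gamma(f^\#(a))$

Then, global soundness:
$lfp(f) \sqsubseteq \gamma(lfp(f^\#))$
$\alpha(lfp(f)) \sqsubseteq lfp(f^\#)$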
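The theorem can be observed on a small instance. The sketch below assumes subsets of {0, …, 7} as the concrete domain, a four-element parity lattice as the abstract domain, and hypothetical transformers `f` and `f_sharp` chosen so that local soundness holds; both least fixed points are computed by Kleene iteration and then compared.

```python
UNIVERSE = frozenset(range(8))  # assumed finite concrete universe
PARITIES = ["bot", "even", "odd", "top"]

def lfp(f, bottom):
    d = bottom
    while f(d) != d:
        d = f(d)
    return d

def leq(a, b):
    return a == b or a == "bot" or b == "top"

def join(a, b):
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return "top"

def alpha(c):
    if not c:
        return "bot"
    if all(x % 2 == 0 for x in c):
        return "even"
    if all(x % 2 == 1 for x in c):
        return "odd"
    return "top"

MEANING = {
    "bot": lambda x: False,
    "even": lambda x: x % 2 == 0,
    "odd": lambda x: x % 2 == 1,
    "top": lambda x: True,
}

def gamma(a):
    return frozenset(x for x in UNIVERSE if MEANING[a](x))

def f(c):
    # Concrete: start from 0 and repeatedly add 2 (staying inside UNIVERSE).
    return frozenset({0}) | frozenset(x + 2 for x in c if x + 2 in UNIVERSE)

def f_sharp(a):
    # Abstract: 0 is even, and adding 2 preserves parity.
    return join("even", a)

locally_sound = all(f(gamma(a)) <= gamma(f_sharp(a)) for a in PARITIES)
concrete_lfp = lfp(f, frozenset())
abstract_lfp = lfp(f_sharp, "bot")
print(sorted(concrete_lfp), abstract_lfp)
```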

58

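Soundness theorem 1

[Figure: domains C and A; the concrete iterates $f, f^2, f^3, \dots, f^n$ climb to lfp(f) and the abstract iterates $f^\#, f^{\#2}, f^{\#3}, \dots, f^{\#n}$ climb to $lfp(f^\#)$.]

$\forall a \in D_A : f(\gamma(a)) \sqsubseteq \gamma(f^\#(a))$
$\Rightarrow \forall a \in D_A : f^n(\gamma(a)) \sqsubseteq \gamma(f^{\#n}(a))$
$\Rightarrow \forall a \in D_A : lfp(f^n)(\gamma(a)) \sqsubseteq \gamma(lfp(f^{\#n})(a))$
$\Rightarrow lfp(f) \sqsubseteq \gamma(lfp(f^\#))$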

59

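Soundness theorem 2

1. Given two complete lattices
$C = (D_C, \sqsubseteq_C, \sqcup_C, \sqcap_C, \bot_C, \top_C)$
$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$
and $GC_{C,A} = (C, \alpha, \gamma, A)$ with
2. Monotone concrete transformer $f : D_C \to D_C$
3. Monotone abstract transformer $f^\# : D_A \to D_A$
4. Local soundness: $\forall c \in D_C : \alpha(f(c)) \sqsubseteq f^\#(\alpha(c))$

Then, global soundness:
$\alpha(lfp(f)) \sqsubseteq lfp(f^\#)$
$lfp(f) \sqsubseteq \gamma(lfp(f^\#))$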

60

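Soundness theorem 2

[Figure: domains C and A; the concrete iterates $f, f^2, f^3, \dots, f^n$ climb to lfp(f) and the abstract iterates $f^\#, f^{\#2}, f^{\#3}, \dots, f^{\#n}$ climb to $lfp(f^\#)$.]

$\forall c \in D_C : \alpha(f(c)) \sqsubseteq f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(f^n(c)) \sqsubseteq f^{\#n}(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(lfp(f)(c)) \sqsubseteq lfp(f^\#)(\alpha(c))$
$\Rightarrow \alpha(lfp(f)) \sqsubseteq lfp(f^\#)$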

61

A recipe for a sound static analysis

• Define an "appropriate" operational semantics
• Define "collecting" structural operational semantics
• Establish a Galois connection between collecting states and abstract states
• Local correctness: show that the abstract interpretation of every atomic statement is sound w.r.t. the collecting semantics
• Global correctness: conclude that the analysis is sound

62

Completeness

63

Completeness

• Local property:
– forward complete: $\forall c: f^\#(\alpha(c)) = \alpha(f(c))$
– backward complete: $\forall a: f(\gamma(a)) = \gamma(f^\#(a))$
• A property of the domain and the (best) transformer
• Global property:
– $\alpha(lfp(f)) = lfp(f^\#)$
– $lfp(f) = \gamma(lfp(f^\#))$
• Ideal, but usually not possible unless we change the program model (apply strong abstraction) and/or aim for very simple properties

64

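Forward complete transformer

$\forall c: f^\#(\alpha(c)) = \alpha(f(c))$

[Figure: domains C and A; applying f in C and then abstracting gives the same element as abstracting c and then applying $f^\#$ in A.]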

65

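Backward complete transformer

$\forall a: f(\gamma(a)) = \gamma(f^\#(a))$

[Figure: domains C and A; concretizing a and then applying f in C gives the same element as applying $f^\#$ in A and then concretizing.]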

66

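Global (backward) completeness

[Figure: domains C and A; the concrete iterates $f, f^2, f^3, \dots, f^n$ climb to lfp(f) and the abstract iterates $f^\#, f^{\#2}, f^{\#3}, \dots, f^{\#n}$ climb to $lfp(f^\#)$.]

$\forall a: f(\gamma(a)) = \gamma(f^\#(a))$
$\Rightarrow \forall a: f^n(\gamma(a)) = \gamma(f^{\#n}(a))$
$\Rightarrow \forall a \in D_A : lfp(f^n)(\gamma(a)) = \gamma(lfp(f^{\#n})(a))$
$\Rightarrow lfp(f) = \gamma(lfp(f^\#))$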

67

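Global (forward) completeness

[Figure: domains C and A; the concrete iterates $f, f^2, f^3, \dots, f^n$ climb to lfp(f) and the abstract iterates $f^\#, f^{\#2}, f^{\#3}, \dots, f^{\#n}$ climb to $lfp(f^\#)$.]

$\forall c \in D_C : \alpha(f(c)) = f^\#(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(f^n(c)) = f^{\#n}(\alpha(c))$
$\Rightarrow \forall c \in D_C : \alpha(lfp(f)(c)) = lfp(f^\#)(\alpha(c))$
$\Rightarrow \alpha(lfp(f)) = lfp(f^\#)$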

68

Implementation

69

Implementing the VE analysis

1. Implement the abstract states in the VE domain

2. Implement the VE abstract domain operations

3. Connect analysis to Soot

$A = (D_A, \sqsubseteq_A, \sqcup_A, \sqcap_A, \bot_A, \top_A)$

70

VE factoids

Don’t forget to override toString() – used for displaying analysis results

71

VE abstract state implementation

Implement bottom as a special object: it must be handled appropriately in all abstract operations

72

Abstract domain base class

73

VE domain operations

Used to match the right transformer for a given statement

Handle bottom as a special corner case

74

VE abstract transformers

Find transformer by matching statement

Improved transformer with reduction operator

which branch of the condition

75

Matching transformers to statements

Extend matcher to easily pattern match statements

It is usually safe to ignore many statements (skip)

Handle assume x==y

Handle assume x!=y

Handle x=y

Special case: x=x (what happens if we use the general case here?)

76

Matching transformers to statements

Handling x=expr as identity is unsound. Handle soundly by removing all factoids containing x (that is, lhs).

77

Example transformer x=y#

Check the precondition for applying this transformer

Handle bottom as a special corner case

A transformer is a unary operation parameterized by the type of abstract domain elements

Override the apply method with one argument

78

Example transformer x=expr#

As a convention never mutate states – create new ones and modify them (functional programming style)
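The conventions named on these slides (a special bottom object, no mutation, and dropping every factoid that mentions the assigned variable) can be sketched in a few lines. The code below is written in Python for brevity and is not the lecture's Soot-based implementation; the `BOTTOM` sentinel, the pair representation of factoids and the name `assign_expr` are assumptions of this example.

```python
BOTTOM = None  # assumed sentinel object for the unreachable state

def assign_expr(lhs, state):
    """Sound transformer sketch for lhs = expr with an arbitrary expression:
    forget every factoid (a, b), read as a=b, that mentions lhs."""
    if state is BOTTOM:
        # Bottom is a corner case: nothing reachable before, nothing after.
        return BOTTOM
    # Build a new frozenset instead of modifying the input state.
    return frozenset(fact for fact in state if lhs not in fact)

before = frozenset({("x", "y"), ("m", "n")})
after = assign_expr("x", before)
print(sorted(after), sorted(before))
```

Using immutable sets means a state shared between several program points cannot be changed by accident when one transformer runs.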

Next lecture: Abstract Interpretation IV