1 automatic predicate abstraction of c programs parts of the slides are from

1

Automatic Predicate Abstraction of C Programs

Parts of the slides are from

http://research.microsoft.com/slam/presentations/pldi2001.ppt

2

Predicate abstraction by C2BP

program P

Booleanprogram BP(P,E)

C2BP

predicates E

3

C2BP: Predicate Abstraction for C Programs

Given• P : a C program• E = {e1,...,en} : set of C boolean

expressions over the variables in P– no side effects, no procedure calls

Produces a boolean program B• same control-flow structure as P• only vars are 3-valued booleans {{e1},...,

{en}}• properties true of B are true of P

4

Assumptions

• Operates on intermediate representation of C language– CFG : if-then-else statements and gotos– expressions : side-effect free, no short-circuit

evaluation, no multiple dereferences(e.g. **p)– Function call occurs at the top-most level

• z=x+f(y); w=f(y); z=x+w;– Return statement is in the following form:

• return r; (where r is a local variable)

• Uses a logical memory model– for all i: p+i = p (where p is a pointer)

5

Weak/Strong predicates

• p is weaker than q iff q ) p• For example,

– ‘T’ is weaker than ‘F’since ‘F’ ) ‘T’

– x < 5 is weaker than x < 4 since x < 4 ) x < 5

• ‘Weak’ means ‘easy to be satisfied’– ‘T’ is the weakest predicate.

• ‘Strong’ means ‘hard to be satisfied’– ‘F’ is the strongest predicate.

6

Precondition• p 2 P(s, q) iff

– 8M 2 States : M 2 [[p]] ) ([[s]] M) 2 [[q]](where [[p]] denotes the states that satisfy p and [[s]] denotes the transfer function)

• P(s, q) = the set of predicates whose truth before s entails the truth of q after s terminates

• Example:– (y<3) 2 P(x=y+1, x<20)– (y<19) 2 P(x=y+1, x<20)

• It is easy to see that if p,p’ 2 P(s,q) then pÇp’ 2 P(s,q)– i.e, P(s,q) is closed under Ç– Note that pÇp’ is weaker than p and p’

7

Weakest Precondition

• WP(s, q) = – the weakest predicate p such that p 2 P(s, q)– equivalently, Ç(P(s,q))

• For assignment statement,– WP(x = e, q) = q[e/x]

• Example:– WP(x = y+1, x<20) = (x<20)[(y+1)/x] =

(y+1)<20 = (y<19)– (y<3) 2 P(x = y+1, x<20) but (y<3) ) (y<19)

8

Predicate strengthening

• Example– Given only E={ x<19, y==7, y==z, z<8},– WP(x = y+1, x<19) = y<18-> we don’t have (y<18) as a predicate! panic?– Find the weakest boolean expression consisting of E that

implies (y<18)! How?– SE(p) = Ç{ e | e is a boolean expression over E, e ) p }– But by disjunctive normal form theorem…– c1Æ…Æcn is a cube over E

iff (ci = ei or ci = :ei) and ei 2 E– SE(p) = Ç{ e | e is a cube over E, e ) p }– SE(y<18) = (y==7) Ç (y==z Æ z<8)

9

Strengthening & Weakening

p

SE(p)

WE(p)

10

Translating Assignment Statements

• In general,– “WP(s, ei) is true before s”

implies “predicate ei is true after s”– “WP(s, :ei) is true before s”

implies “predicate ei is false after s “

• But in some cases, WP(s, ei) is not expressible using E• The assignment statement “x=e;” is translated to

– {ei} = {SE(WP(x=e, ei))} ? true :

{SE(WP(x=e, :ei))} ? false : *; for each ei 2 E

11

Translation example

• Translating “x = y+1;”– Given E={ x<19, y==7, y==z, z<8},– SE(WP(x=y+1, x<19)) = SE(y<18) =

(y==7) Ç (y==z Æ z<8)– SE(WP(x=y+1, :(x<19))) = SE(WP(x=y+1,

x¸20)) = SE(y¸19) = false– For {x<19},

• {x<19} = ( {y==7} || ( {y==z} && {z<8} ) ) ? true :

false ? false : *;

– Similar for {y==7}, {y==z}, {z<8}

12

Handling Pointers

• Statement : *p = y;• predicates E = { x==5 }• Weakest Precondition:

– WP(*p=y, x==5) = (x==5)[y/*p] = (x==5)???– What if *p and x alias?

• Correct Weakest Precondition:– WP(*p=y, x==5) =

(p==&x Æ y==5) Ç (p!=&x Æ x==5)

• Uses Das’s pointer analysis to prune infeasible alias scenarios

13

Translating conditional statements (1/2)

• if (e) { … } else { … }

is translated to

• if (*) { assume(e); … } else { assume(:e); … }

• But what if ‘e’ is not expressible in terms of the predicates in E?

• Then find a condition weaker than ‘e’ such that it is expressible in terms of E

14

Predicate weakening

• WE(p) = the strongest predicate among those which are expressible in terms of ‘E’ and are implied by ‘p’

• WE(p) = :(SE(:p))

15

Translating conditional statements (2/2)

• if (e) { … } else { … }

is translated to

• if (*) { assume(WE(e)); … } else { assume(WE(:e)); … }

• Example– E = { x>3, x==10 }– if(x>5) { … } else { … } – if(*) { assume(x>3); … } else { assume(!(x==10));

… }

16

Handling Procedures

• Procedures abstracted in two passes:– a signature is produced for each

procedure in isolation– procedure calls are abstracted using the

callees’ signatures only

17

Global/Local predicates

• Each predicate in E is annotated as being either global or local to a particular procedure

• ei 2 E is …

– global iff ei refers to global variables only

– local to procedure R iff ei refers to a local variable of R (ei cannot refer to local variables of another procedure due to scoping)

18

Formal/Return predicates

• ei 2 E is …

– a formal predicate of procedure R iff ei is a local predicate of R and doesn’t refer to any local variables of R

– a return predicate of procedure R iff one of the following conditions is true: (1) ei refers to return variable ‘r’ of R and doesn’t refer to any local variable except ‘r’(2) ei is a formal predicate and ei refers to a global var or to a pointer dereference of formal parameter of R

19

Example program

int bar(int *q,int y,int z) {

int m1,m2; m1 = y+1; return m1; }

void foo(int *p, int x) { int r; r=bar(p,x,*p); }

bar { y>=0; // formal *q<=y; // formal, return y==m1; // return}

foo { *p<=0; // formal, return x==0; // formal r==0 // local}

20

Translating procedure “bar”int bar(int *q,int y,int z) { int m1,m2; m1 = y+1; return m1; }

bar({y>=0}, {*q<=y}) {bool {y==m1};{y==m1} = false ? true : !{y==m1} ? false : *;

}


21

Translating procedure “foo”

• “foo” invokes “bar” but only the signature of “bar” is needed to handle the procedure call

• Translated form of “foo”:foo({*p<=0}, {x==0}) {

bool {r==0};…

}

22

Signature of procedure “bar”

int bar(int *q,int y,int z) { int m1,m2; m1 = y+1; return m1; }


•Signature of “bar”:int bar(int *q,int y,int z);

Takes (y>=0), (*q<=y) as formal predicates

Returns (*q<=y), (y==m1) as return predicates

Return var = m1

23

Resolving/Strengtheningformal predicates

r=bar(p,x,*p); (in procedure foo)

Formal predicates of “bar”• (y>=0)[p/q, x/y, *p/z] = (x>=0)

– Sfoo(x>=0) = (x==0)– Sfoo(:(x>=0)) = false

• (*q<=y)[p/q, x/y, *p/z] = (*p<=x)– Sfoo(*p<=x) = (*p<=0)Æ(x==0)– Sfoo(:(*p<=x)) = :

(*p<=0)Æ(x==0)


Takes (y>=0), (*q<=y)

Returns (*q<=y), (y==m1)

Return var = m1

•Predicates of “foo”: *p<=0; x==0; r==0

24

Translated procedure invocation

prm1 = {x==0} ? true : false ? false : *;prm2 = {*p<=0}&&{x==0} ? true

: !{*p<=0}&&{x==0} ?

false : *;

r1,r2 = bar(prm1,prm2);

• “bar” returns the resulting value of predicates (*q<=y) and (y==m1)


Takes (y>=0), (*q<=y)


Return var = m1


25

Resolving return predicates


• Return predicates of “bar”(*q<=y)[r/m1, p/q, x/y, *p/z] = (*p<=x)(y==m1)[r/m1, p/q, x/y, *p/z] = (x==r)• Let R = { (*p<=x), (x==r) }

• Predicates of “foo” to be updated– (*p<=0) (“p” is passed to “bar”)– (r==0) (“r” is assigned by proc call)

• Let Ufoo = { (*p<=x), (r==0) }


Takes (y>=0), (*q<=y)


Return var = m1


26

How to update caller’s predicates


• R = { (*p<=x), (x==r) }• Ufoo = { (*p<=0), (r==0) }• Let T = R [ (Efoo\ Ufoo) =

{ (*p<=x), (x==r), (x==0) }• Strengthening Ufoo over T

– ST(*p<=0) = (*p<=x)Æ(x==0)– ST(:(*p<=0)) = :(*p<=x)Æ(x==0)– ST(r==0) = (x==r)Æ(x==0)– ST(:(r==0)) = :(x==r)Æ(x==0)


Takes (y>=0), (*q<=y)


Return var = m1


27

Translated procedure return

…;r1,r2 = bar(prm1,prm2);• r1,r2 represents {*p<=x}, {x==r}

• Ufoo strengthened over T

– ST(*p<=0) = (*p<=x)Æ(x==0)

– ST(:(*p<=0)) = :(*p<=x)Æ(x==0)

– ST(r==0) = (x==r)Æ(x==0)

– ST(:(r==0)) = :(x==r)Æ(x==0)

• Updating predicates of Ufoo

– {*p<=0} = r1&&{x==0} : true ? !r1&&{x==0} : false ? *;

– {r==0} = r2&&{x==0} : true ?

!r2&&{x==0} : false ? *;


Takes (y>=0), (*q<=y)


Return var = m1


28

Finally…foo({*p<=0}, {x==0}) {

bool {r==0};bool prm1, prm2, r1, r2;prm1 = {x==0} ? true :

false ? false : *;prm2 = {*p<=0}&&{x==0} ? true :

!{*p<=0}&&{x==0} ? false : *;

r1,r2 = bar(prm1,prm2);

{*p<=0} = r1&&(x==0) : true ? !r1&&(x==0) : false ? *;{r==0} = r2&&(x==0) : true ?

!r2&&(x==0) : false ? *;}


Takes (y>=0), (*q<=y)


Return var = m1


29

Formal Properties of C2BP

• soundness – B has a superset of the feasible paths in

P

– If {ei} is true (false) at some point on a path in B, then ei is true (false) at that point along a corresponding path in P

• complexity – linear in size of program– exponential in number of predicates

30

Conclusions

• Contributions– C2BP handles the constructs of C

• pointers, recursion, structs, casts• modular abstraction of procedures

– Provably sound

• Defects– Doesn’t use physical memory model

1 automatic predicate abstraction of c programs parts of the slides are from

Documents