1 automatic predicate abstraction of c programs parts of the slides are from
TRANSCRIPT
1
Automatic Predicate Abstraction of C Programs
Parts of the slides are from
http://research.microsoft.com/slam/presentations/pldi2001.ppt
3
C2BP: Predicate Abstraction for C Programs
Given• P : a C program• E = {e1,...,en} : set of C boolean
expressions over the variables in P– no side effects, no procedure calls
Produces a boolean program B• same control-flow structure as P• only vars are 3-valued booleans {{e1},...,
{en}}• properties true of B are true of P
4
Assumptions
• Operates on intermediate representation of C language– CFG : if-then-else statements and gotos– expressions : side-effect free, no short-circuit
evaluation, no multiple dereferences(e.g. **p)– Function call occurs at the top-most level
• z=x+f(y); w=f(y); z=x+w;– Return statement is in the following form:
• return r; (where r is a local variable)
• Uses a logical memory model– for all i: p+i = p (where p is a pointer)
5
Weak/Strong predicates
• p is weaker than q iff q ) p• For example,
– ‘T’ is weaker than ‘F’since ‘F’ ) ‘T’
– x < 5 is weaker than x < 4 since x < 4 ) x < 5
• ‘Weak’ means ‘easy to be satisfied’– ‘T’ is the weakest predicate.
• ‘Strong’ means ‘hard to be satisfied’– ‘F’ is the strongest predicate.
6
Precondition• p 2 P(s, q) iff
– 8M 2 States : M 2 [[p]] ) ([[s]] M) 2 [[q]](where [[p]] denotes the states that satisfy p and [[s]] denotes the transfer function)
• P(s, q) = the set of predicates whose truth before s entails the truth of q after s terminates
• Example:– (y<3) 2 P(x=y+1, x<20)– (y<19) 2 P(x=y+1, x<20)
• It is easy to see that if p,p’ 2 P(s,q) then pÇp’ 2 P(s,q)– i.e, P(s,q) is closed under Ç– Note that pÇp’ is weaker than p and p’
7
Weakest Precondition
• WP(s, q) = – the weakest predicate p such that p 2 P(s, q)– equivalently, Ç(P(s,q))
• For assignment statement,– WP(x = e, q) = q[e/x]
• Example:– WP(x = y+1, x<20) = (x<20)[(y+1)/x] =
(y+1)<20 = (y<19)– (y<3) 2 P(x = y+1, x<20) but (y<3) ) (y<19)
8
Predicate strengthening
• Example– Given only E={ x<19, y==7, y==z, z<8},– WP(x = y+1, x<19) = y<18-> we don’t have (y<18) as a predicate! panic?– Find the weakest boolean expression consisting of E that
implies (y<18)! How?– SE(p) = Ç{ e | e is a boolean expression over E, e ) p }– But by disjunctive normal form theorem…– c1Æ…Æcn is a cube over E
iff (ci = ei or ci = :ei) and ei 2 E– SE(p) = Ç{ e | e is a cube over E, e ) p }– SE(y<18) = (y==7) Ç (y==z Æ z<8)
10
Translating Assignment Statements
• In general,– “WP(s, ei) is true before s”
implies “predicate ei is true after s”– “WP(s, :ei) is true before s”
implies “predicate ei is false after s “
• But in some cases, WP(s, ei) is not expressible using E• The assignment statement “x=e;” is translated to
– {ei} = {SE(WP(x=e, ei))} ? true :
{SE(WP(x=e, :ei))} ? false : *; for each ei 2 E
11
Translation example
• Translating “x = y+1;”– Given E={ x<19, y==7, y==z, z<8},– SE(WP(x=y+1, x<19)) = SE(y<18) =
(y==7) Ç (y==z Æ z<8)– SE(WP(x=y+1, :(x<19))) = SE(WP(x=y+1,
x¸20)) = SE(y¸19) = false– For {x<19},
• {x<19} = ( {y==7} || ( {y==z} && {z<8} ) ) ? true :
false ? false : *;
– Similar for {y==7}, {y==z}, {z<8}
12
Handling Pointers
• Statement : *p = y;• predicates E = { x==5 }• Weakest Precondition:
– WP(*p=y, x==5) = (x==5)[y/*p] = (x==5)???– What if *p and x alias?
• Correct Weakest Precondition:– WP(*p=y, x==5) =
(p==&x Æ y==5) Ç (p!=&x Æ x==5)
• Uses Das’s pointer analysis to prune infeasible alias scenarios
13
Translating conditional statements (1/2)
• if (e) { … } else { … }
is translated to
• if (*) { assume(e); … } else { assume(:e); … }
• But what if ‘e’ is not expressible in terms of the predicates in E?
• Then find a condition weaker than ‘e’ such that it is expressible in terms of E
14
Predicate weakening
• WE(p) = the strongest predicate among those which are expressible in terms of ‘E’ and are implied by ‘p’
• WE(p) = :(SE(:p))
15
Translating conditional statements (2/2)
• if (e) { … } else { … }
is translated to
• if (*) { assume(WE(e)); … } else { assume(WE(:e)); … }
• Example– E = { x>3, x==10 }– if(x>5) { … } else { … } – if(*) { assume(x>3); … } else { assume(!(x==10));
… }
16
Handling Procedures
• Procedures abstracted in two passes:– a signature is produced for each
procedure in isolation– procedure calls are abstracted using the
callees’ signatures only
17
Global/Local predicates
• Each predicate in E is annotated as being either global or local to a particular procedure
• ei 2 E is …
– global iff ei refers to global variables only
– local to procedure R iff ei refers to a local variable of R (ei cannot refer to local variables of another procedure due to scoping)
18
Formal/Return predicates
• ei 2 E is …
– a formal predicate of procedure R iff ei is a local predicate of R and doesn’t refer to any local variables of R
– a return predicate of procedure R iff one of the following conditions is true: (1) ei refers to return variable ‘r’ of R and doesn’t refer to any local variable except ‘r’(2) ei is a formal predicate and ei refers to a global var or to a pointer dereference of formal parameter of R
19
Example program
int bar(int *q,int y,int z) {
int m1,m2; m1 = y+1; return m1; }
void foo(int *p, int x) { int r; r=bar(p,x,*p); }
bar { y>=0; // formal *q<=y; // formal, return y==m1; // return}
foo { *p<=0; // formal, return x==0; // formal r==0 // local}
20
Translating procedure “bar”int bar(int *q,int y,int z) { int m1,m2; m1 = y+1; return m1; }
bar({y>=0}, {*q<=y}) {bool {y==m1};{y==m1} = false ? true : !{y==m1} ? false : *;
}
bar { y>=0; // formal *q<=y; // formal, return y==m1; // return}
21
Translating procedure “foo”
• “foo” invokes “bar” but only the signature of “bar” is needed to handle the procedure call
• Translated form of “foo”:foo({*p<=0}, {x==0}) {
bool {r==0};…
}
22
Signature of procedure “bar”
int bar(int *q,int y,int z) { int m1,m2; m1 = y+1; return m1; }
bar { y>=0; // formal *q<=y; // formal, return y==m1; // return}
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y) as formal predicates
Returns (*q<=y), (y==m1) as return predicates
Return var = m1
23
Resolving/Strengtheningformal predicates
r=bar(p,x,*p); (in procedure foo)
Formal predicates of “bar”• (y>=0)[p/q, x/y, *p/z] = (x>=0)
– Sfoo(x>=0) = (x==0)– Sfoo(:(x>=0)) = false
• (*q<=y)[p/q, x/y, *p/z] = (*p<=x)– Sfoo(*p<=x) = (*p<=0)Æ(x==0)– Sfoo(:(*p<=x)) = :
(*p<=0)Æ(x==0)
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
24
Translated procedure invocation
prm1 = {x==0} ? true : false ? false : *;prm2 = {*p<=0}&&{x==0} ? true
: !{*p<=0}&&{x==0} ?
false : *;
r1,r2 = bar(prm1,prm2);
• “bar” returns the resulting value of predicates (*q<=y) and (y==m1)
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
25
Resolving return predicates
r=bar(p,x,*p); (in procedure foo)
• Return predicates of “bar”(*q<=y)[r/m1, p/q, x/y, *p/z] = (*p<=x)(y==m1)[r/m1, p/q, x/y, *p/z] = (x==r)• Let R = { (*p<=x), (x==r) }
• Predicates of “foo” to be updated– (*p<=0) (“p” is passed to “bar”)– (r==0) (“r” is assigned by proc call)
• Let Ufoo = { (*p<=x), (r==0) }
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
26
How to update caller’s predicates
r=bar(p,x,*p); (in procedure foo)
• R = { (*p<=x), (x==r) }• Ufoo = { (*p<=0), (r==0) }• Let T = R [ (Efoo\ Ufoo) =
{ (*p<=x), (x==r), (x==0) }• Strengthening Ufoo over T
– ST(*p<=0) = (*p<=x)Æ(x==0)– ST(:(*p<=0)) = :(*p<=x)Æ(x==0)– ST(r==0) = (x==r)Æ(x==0)– ST(:(r==0)) = :(x==r)Æ(x==0)
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
27
Translated procedure return
…;r1,r2 = bar(prm1,prm2);• r1,r2 represents {*p<=x}, {x==r}
• Ufoo strengthened over T
– ST(*p<=0) = (*p<=x)Æ(x==0)
– ST(:(*p<=0)) = :(*p<=x)Æ(x==0)
– ST(r==0) = (x==r)Æ(x==0)
– ST(:(r==0)) = :(x==r)Æ(x==0)
• Updating predicates of Ufoo
– {*p<=0} = r1&&{x==0} : true ? !r1&&{x==0} : false ? *;
– {r==0} = r2&&{x==0} : true ?
!r2&&{x==0} : false ? *;
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
28
Finally…foo({*p<=0}, {x==0}) {
bool {r==0};bool prm1, prm2, r1, r2;prm1 = {x==0} ? true :
false ? false : *;prm2 = {*p<=0}&&{x==0} ? true :
!{*p<=0}&&{x==0} ? false : *;
r1,r2 = bar(prm1,prm2);
{*p<=0} = r1&&(x==0) : true ? !r1&&(x==0) : false ? *;{r==0} = r2&&(x==0) : true ?
!r2&&(x==0) : false ? *;}
•Signature of “bar”:int bar(int *q,int y,int z);
Takes (y>=0), (*q<=y)
Returns (*q<=y), (y==m1)
Return var = m1
•Predicates of “foo”: *p<=0; x==0; r==0
29
Formal Properties of C2BP
• soundness – B has a superset of the feasible paths in
P
– If {ei} is true (false) at some point on a path in B, then ei is true (false) at that point along a corresponding path in P
• complexity – linear in size of program– exponential in number of predicates