Effectively-Propositional Modular Reasoning
about Reachability
in Linked Data Structures
CAV’13, POPL’14
Shachar Itzhaky (TAU)
Anindya Banerjee (IMDEA)
Neil Immerman (UMASS)
Ori Lahav (TAU)
Aleks Nanevski (IMDEA)
Mooly Sagiv (TAU)
http://www.cs.tau.ac.il/~shachar/afwp.html
Motivation
• Proving presence (absence) of pointer paths
between memory allocated objects in a
given program
– Partial program correctness
• Memory safety
• Absence of memory leaks
• Data structure invariants
– Acyclicity, sortedness
– Total program correctness
– Program equivalence
Reasoning about Reachability
Program Termination
{x <n*> y}
void traverse(Node x, Node y) {
    for (Node t = x; t != y; t = t.n) {
        …
    }
}
[Figure: an n-linked list ending in null, with x and y pointing into it]
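The role of the assertion {x <n*> y} can be sketched in a few lines of Java. This is only an illustration: the class name TraverseDemo, the helper reaches and the step counter are assumptions introduced here, and the list is assumed to be acyclic.

```java
// Hypothetical demo class; Node mirrors the slide's node with a single field n.
class TraverseDemo {
    static class Node { Node n; }

    // x <n*> y: y is reached from x by following n zero or more times.
    // Assumes the list starting at x is acyclic, so the walk ends at null.
    static boolean reaches(Node x, Node y) {
        for (Node t = x; t != null; t = t.n) {
            if (t == y) return true;
        }
        return y == null; // null is reached at the end of every acyclic list
    }

    // The loop from the slide; it counts iterations instead of doing real work.
    static int traverse(Node x, Node y) {
        int steps = 0;
        for (Node t = x; t != y; t = t.n) {
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        Node a = new Node(), b = new Node(), c = new Node();
        a.n = b;
        b.n = c;                                   // a -> b -> c -> null
        if (reaches(a, c)) {
            System.out.println(traverse(a, c));    // safe: the loop stops at c
        }
        // reaches(c, a) is false; traverse(c, a) would run past null and fail.
    }
}
```

When the reachability assertion does not hold, the loop walks off the end of the list, which is why termination depends on it.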
Disjoint Parallelism
for (x = h; x != null; x = x.n) {
    …
}
for (y = k; y != null; y = y.n) {
    …
}
{∀α: α ≠ null → ¬(h<n*>α ∧ k<n*>α)}
[Figure: two n-linked lists ending in null, with heads h and k, traversed by x and y]
Challenges
• Complexity of reasoning about reachability
assertions
– Undecidability of reachability (not even RE)
• Modularity
– How to specify a procedure's effect on reachability independently of its calling context
• [Inferring reachability properties from the code
and some assertions]
Linked list manipulations are simple
• Simple to reason about correctness
– Small counterexamples
• Even for doubly/circular/nested lists
• “Simple” invariants
– Alternation-free quantifier prefixes + reachability
EA (∃*∀*) formulas: Bernays-Schönfinkel-Ramsey
• t ::= var | constant (Terms)
• ap ::= t1 = t2 | r(t1, t2, …, tn)
• qf ::= ap | qf1 ∧ qf2 | qf1 ∨ qf2 | ¬qf
• ea ::= ∃α1, α2, …, αn: ∀β1, β2, …, βm: qf
• Effectively Propositional
– Skolemization yields finite models
– Equisatisfiable with a propositional formula
– Support from Z3
EA (∃*∀*) formulas: Bernays-Schönfinkel-Ramsey
∃α1, α2: ∀β1: r(α1, β1) ∨ r(β1, α2)
=sat ∀β1: r(c1, β1) ∨ r(β1, c2)
=sat (r(c1, c1) ∨ r(c1, c2)) ∧ (r(c1, c2) ∨ r(c2, c2))
=sat (P11 ∨ P12) ∧ (P12 ∨ P22)
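The grounding steps above (Skolemize the existentials, instantiate the universals over the Skolem constants, read each ground atom as a propositional variable) can be sketched as follows. This is a brute-force illustration, not the Z3-based pipeline of the talk: the class EprGroundingDemo and the Matrix interface are hypothetical names, the connective inside the example matrix is an assumption, and a real tool would pass the grounded formula to a SAT solver instead of enumerating assignments.

```java
// Hypothetical helper for illustration only.
class EprGroundingDemo {
    /** Quantifier-free matrix, evaluated under an interpretation of r for one value of beta. */
    interface Matrix {
        boolean holds(boolean[][] r, int beta);
    }

    /**
     * Decides "exists c_0..c_{k-1}, forall beta: matrix" over the universe of the k Skolem
     * constants by enumerating all 2^(k*k) truth assignments to the atoms r(c_i, c_j).
     * Only intended for small k (k*k must stay below 63).
     */
    static boolean satisfiable(int k, Matrix matrix) {
        int atoms = k * k;
        for (long bits = 0; bits < (1L << atoms); bits++) {
            boolean[][] r = new boolean[k][k];
            for (int i = 0; i < k; i++) {
                for (int j = 0; j < k; j++) {
                    r[i][j] = ((bits >> (i * k + j)) & 1L) == 1L;  // truth value of P_ij
                }
            }
            boolean all = true;
            for (int beta = 0; beta < k && all; beta++) {          // ground the universal quantifier
                all = matrix.holds(r, beta);
            }
            if (all) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // exists a1, a2. forall b: r(a1, b) or r(b, a2), with a1 = c_0 and a2 = c_1
        System.out.println(satisfiable(2, (r, b) -> r[0][b] || r[b][1]));
        // exists a. forall b: r(a, b) and not r(b, a); grounding b := a is contradictory
        System.out.println(satisfiable(1, (r, b) -> r[0][b] && !r[b][0]));
    }
}
```

Restricting the search to the Skolem constants is sound here because the fragment has no function symbols, so the Herbrand universe is finite.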
Alternation Free Reachability (AFR)
• “Extended subset” of EA
– Closed under negation
• t ::= var | constant (Terms)
• ap ::= t1 = t2 | r(t1, t2, …, tn)
| t1 <f*> t2 (Reachability via sequences of f’s)
(exists k: f^k(t1) = t2)
• qf ::= ap | qf1 ∧ qf2 | qf1 ∨ qf2 | ¬qf
• e ::= ∃α1, α2, …, αn: qf
• a ::= ∀α1, α2, …, αm: qf
• afR ::= e | a | afR1 ∧ afR2 | afR1 ∨ afR2
AFR Program Properties
• Acyclicity (>2)
– ∀α, β: α<n*>β ∧ β<n*>α → α = β
• Acyclic list with a head h
– ∀α, β: h<n*>α ∧ h<n*>β ∧ α<n*>β ∧ β<n*>α → α = β
• Sorted segment
– ∀α, β: α<n*>β → α.data ≤ β.data
[Figure: nodes h, u and v connected by n* paths]
AFR Program Properties
• Doubly linked acyclic lists
– ∀α, β: α<f*>β ↔ β<b*>α
• Disjoint lists with heads h and k
– ∀α: α ≠ null → ¬(h<n*>α ∧ k<n*>α)
[Figure: a doubly linked list with f* and b* paths; two n* lists with heads h and k]
List Reversal (isolated)
Node reverse(Node h) {
    Node c = h; Node d = null;
    while {I} (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
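Precondition: {ac[h] ∧ ∀α: h<n*>α}
Postcondition: {ac[d] ∧ ∀α, β: (α<n*>β ↔ β<n0*>α) ∧ ∀α: d<n*>α} (n0 denotes n on entry)
[Loop invariant I = ∀α, β: …, stated over the n* paths from d and from c]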
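The reversal property, that after the call α reaches β exactly when β reached α before, can be checked on concrete lists. The sketch below runs the slide's loop and compares reachability before and after; the class name ReverseCheckDemo and the helpers reaches and reversalFlipsReachability are assumptions made for this illustration, and testing a few list lengths is of course not a proof.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class ReverseCheckDemo {
    static class Node { Node next; }

    // The loop from the slide, unchanged apart from formatting.
    static Node reverse(Node h) {
        Node c = h;
        Node d = null;
        while (c != null) {
            Node t = c.next;
            c.next = d;
            d = c;
            c = t;
        }
        return d;
    }

    // a <n*> b on an acyclic list: b is reached from a in zero or more next-steps.
    static boolean reaches(Node a, Node b) {
        for (Node t = a; t != null; t = t.next) {
            if (t == b) return true;
        }
        return false;
    }

    // Builds a list, reverses it, and checks: a <n*> b (after) iff b <n0*> a (before).
    static boolean reversalFlipsReachability(int length) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < length; i++) nodes.add(new Node());
        for (int i = 0; i + 1 < length; i++) nodes.get(i).next = nodes.get(i + 1);

        boolean[][] before = new boolean[length][length];   // before[i][j]: i <n0*> j
        for (int i = 0; i < length; i++)
            for (int j = 0; j < length; j++)
                before[i][j] = reaches(nodes.get(i), nodes.get(j));

        Node d = reverse(length == 0 ? null : nodes.get(0));

        for (int i = 0; i < length; i++)
            for (int j = 0; j < length; j++)
                if (reaches(nodes.get(i), nodes.get(j)) != before[j][i]) return false;
        // The returned head must be the old last node (or null for the empty list).
        return d == (length == 0 ? null : nodes.get(length - 1));
    }

    public static void main(String[] args) {
        for (int len = 0; len <= 5; len++) {
            System.out.println(len + ": " + reversalFlipsReachability(len));
        }
    }
}
```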
Why AFR?
• Represent invariants of simple linked list
manipulations
• Closed under the Boolean connectives and under wp of x.n := y
• Finite model property
• Decidable for satisfiability/validity
• AFR reduces to AF
• Can be reduced to a propositional formula
– SAT solver is complete for verification/falsification
AFR → AF
• Introduce an auxiliary relation n*
• t[α<n*>β] = n*(α, β)
• Axiomatize n* by an AF formula linOrd =
    ∀α, β: n*(α, β) ∧ n*(β, α) → α = β
    ∀α: n*(α, α)
    ∀α, β, γ: n*(α, β) ∧ n*(β, γ) → n*(α, γ)
    ∀α, β, γ: n*(α, β) ∧ n*(α, γ) → (n*(β, γ) ∨ n*(γ, β))
• Completeness
    φ is satisfiable ⟺ (linOrd ∧ t[φ]) is satisfiable
• AF formulas have the finite model property
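One direction of the correspondence is easy to test: the reflexive transitive closure of a concrete acyclic next-pointer heap satisfies the four order axioms (antisymmetry, reflexivity, transitivity, linearity). In the sketch below the heap encoding (an int array with -1 standing for null) and the names LinOrdDemo, closure and satisfiesLinOrd are assumptions made for illustration.

```java
// Hypothetical demo class; next[i] is the successor of node i, or -1 for null.
class LinOrdDemo {
    // star[i][j] is true iff j is reached from i in zero or more next-steps.
    static boolean[][] closure(int[] next) {
        int k = next.length;
        boolean[][] star = new boolean[k][k];
        for (int i = 0; i < k; i++) {
            int t = i;
            // k steps are enough to visit every node once, even on a cycle.
            for (int step = 0; step < k && t != -1; step++) {
                star[i][t] = true;
                t = next[t];
            }
        }
        return star;
    }

    // Checks the four order axioms on a candidate interpretation of n*.
    static boolean satisfiesLinOrd(boolean[][] s) {
        int k = s.length;
        for (int a = 0; a < k; a++) {
            if (!s[a][a]) return false;                                       // reflexivity
            for (int b = 0; b < k; b++) {
                if (a != b && s[a][b] && s[b][a]) return false;               // antisymmetry
                for (int c = 0; c < k; c++) {
                    if (s[a][b] && s[b][c] && !s[a][c]) return false;         // transitivity
                    if (s[a][b] && s[a][c] && !s[b][c] && !s[c][b]) return false; // linearity
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] acyclic = {1, 2, -1, 2};   // 0 -> 1 -> 2 -> null and 3 -> 2
        int[] cyclic = {1, 0};           // 0 -> 1 -> 0
        System.out.println(satisfiesLinOrd(closure(acyclic)));
        System.out.println(satisfiesLinOrd(closure(cyclic)));
    }
}
```

The cyclic heap fails antisymmetry, which matches the restriction to acyclic structures.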
Inverting n* → n
• In every finite model in which n* satisfies the order requirements:
    ∀α, β: n*(α, β) ∧ n*(β, α) → α = β
    ∀α: n*(α, α)
    ∀α, β, γ: n*(α, β) ∧ n*(β, γ) → n*(α, γ)
    ∀α, β, γ: n*(α, β) ∧ n*(α, γ) → (n*(β, γ) ∨ n*(γ, β))
• n* uniquely determines n
α<n+>β ≡ α<n*>β ∧ α ≠ β
[Figure: nodes u, v, w, x, y connected by n* edges]
‘n(α) = β’ ≡ α<n+>β ∧ ∀γ: α<n+>γ → β<n*>γ
[Figure: nodes u, v, w, x, y connected by n+ edges, some of them marked as n edges]
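The definition of n(α) = β translates directly into a procedure that recovers n from a finite n* relation: β is the n+-successor of α that reaches every other n+-successor. The sketch below assumes the same array encoding as a hypothetical illustration (-1 for null); InvertClosureDemo, closure and invert are names introduced here.

```java
// Hypothetical demo class; star[i][j] encodes i <n*> j, and -1 stands for null.
class InvertClosureDemo {
    static boolean[][] closure(int[] next) {
        int k = next.length;
        boolean[][] star = new boolean[k][k];
        for (int i = 0; i < k; i++) {
            int t = i;
            for (int step = 0; step < k && t != -1; step++) {
                star[i][t] = true;
                t = next[t];
            }
        }
        return star;
    }

    // n(a) = b iff a <n+> b and, for every c with a <n+> c, b <n*> c.
    static int[] invert(boolean[][] star) {
        int k = star.length;
        int[] next = new int[k];
        for (int a = 0; a < k; a++) {
            next[a] = -1;                                   // no proper successor: null
            for (int b = 0; b < k; b++) {
                if (b == a || !star[a][b]) continue;        // need a <n+> b
                boolean first = true;
                for (int c = 0; c < k && first; c++) {
                    if (c != a && star[a][c] && !star[b][c]) first = false;
                }
                if (first) next[a] = b;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, -1, 2};                         // 0 -> 1 -> 2 -> null, 3 -> 2
        int[] recovered = invert(closure(heap));
        System.out.println(java.util.Arrays.toString(recovered));
    }
}
```

On a relation that satisfies the order requirements the candidate b is unique, so the round trip closure followed by invert returns the original heap.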
Simple SAT Application
• Determine whether two clients are equivalent
– They produce isomorphic reachable stores
• reverse(reverse(h)) = h
∀α, β: α<n1*>β ↔ β<n0*>α
∀α, β: α<n2*>β ↔ β<n1*>α
⇒ ∀α, β: α<n0*>β ↔ α<n2*>β
Verification Process
[Diagram: Program P + Assertions φ → VC gen → Verification Condition “P ⊨ φ” → SAT Solver → Counterexample or Proof]
Modular Specification
• Every procedure mutates a limited part of the heap
– footprint
• Can we specify the effect of the procedure on the footprint
only?
• Provide a general adaptation rule for the context
– Possible for second order logics
• Transitive closure
• Separation Logic
• Can this be done in a weak logic?
An Adaptation Rule
mod = “the footprint”
[Figure labels: old path, new path, old local path, new local path]
List Reversal (isolated)
Node reverse(Node h) {
    Node c = h; Node d = null;
    while (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
Precondition: {ac[h] ∧ ∀α: h<n*>α}
Postcondition: {ac[d] ∧ ∀α, β: (α<n*>β ↔ β<n0*>α) ∧ ∀α: d<n*>α} (n0 denotes n on entry)
{?}
reverse(i);
{ac[i] ∧ i<n*>j}
When ownership breaks
Node reverse(Node h) {
    Node c = h; Node d = null;
    while (c != null) {
        Node t = c.next;
        c.next = d;
        d = c;
        c = t;
    }
    return d;
}
{ac[h]}
[Formula: ∀α, β: α<n*>β is specified by cases on whether h<n*>α and h<n*>β hold]
[Figure: a list with head h ending in null, with nodes outside the list pointing into it]
Unbounded Cutpoints
Node find(Node x) {
    Node i = x.p;
    if (i != null) {
        i = find(i);
        x.p = i;
    }
    else i = x;
    return i;
}
[Figure: a parent-pointer tree ending in null with nodes a, b, c, d, … and the accessed branch from x]
Complicated Mutations
[Figure: nodes 1 to 6 with pointers h, k and c, shown before and after a mutation; annotations: h n* c, c n* k]
Limited Programming Model
• Type correct programs
• Recursion instead of loops
• Limited amount of new sharing per call
• Deterministic transitive closure
• Specified changes
• Uniform changes
– Fixed number of contiguous intervals of
changed parts of the heap
Small Footprint
Destructive Updates
x.n := y
[Figure: nodes x and y before and after the update]
[Specification relating x ≠ null, x<n>null, x<n*>y and y<n*>x]
Large Footprint
Destructive Updates
x.n := y
[Figure: nodes x and y]
∀α, β: α<n*>β ↔ α<n0*>β ∨ (α<n0*>x ∧ y<n0*>β) (n0 denotes n before the update)
Small Footprint
Destructive Updates
x.n := null
[Figure: nodes x and s before and after the update]
[Specification relating x ≠ null, x<n>s, x<n*>s and s<n*>x]
Large Footprint
x.n := null
[Figure: nodes x and y in four configurations]
∀α, β: α<n*>β ↔ α<n0*>β ∧ (¬α<n0*>x ∨ β<n0*>x) (n0 denotes n before the update)
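The two large-footprint update formulas, as they are read here, express the new n* purely in terms of the old one: for x.n := y, α reaches β afterwards iff it did before or α reached x and y reached β; for x.n := null, iff it did before and the old path did not use x's outgoing edge. The sketch below compares both predictions with a recomputed closure on small acyclic heaps. The array encoding (-1 for null) and the names UpdateFormulaDemo, linkMatchesFormula and unlinkMatchesFormula are assumptions for illustration; the link case assumes x.n is null before the update and that y does not reach x.

```java
// Hypothetical demo class; next[i] is the successor of node i, or -1 for null.
class UpdateFormulaDemo {
    static boolean[][] closure(int[] next) {
        int k = next.length;
        boolean[][] star = new boolean[k][k];
        for (int i = 0; i < k; i++) {
            int t = i;
            for (int step = 0; step < k && t != -1; step++) {
                star[i][t] = true;
                t = next[t];
            }
        }
        return star;
    }

    // x.n := y: predicted new path a..b iff old path a..b, or old paths a..x and y..b.
    static boolean linkMatchesFormula(int[] next, int x, int y) {
        boolean[][] old = closure(next);
        int[] updated = next.clone();
        updated[x] = y;
        boolean[][] now = closure(updated);
        for (int a = 0; a < next.length; a++) {
            for (int b = 0; b < next.length; b++) {
                boolean predicted = old[a][b] || (old[a][x] && y != -1 && old[y][b]);
                if (now[a][b] != predicted) return false;
            }
        }
        return true;
    }

    // x.n := null: predicted new path a..b iff old path a..b and (a did not reach x or b reached x).
    static boolean unlinkMatchesFormula(int[] next, int x) {
        boolean[][] old = closure(next);
        int[] updated = next.clone();
        updated[x] = -1;
        boolean[][] now = closure(updated);
        for (int a = 0; a < next.length; a++) {
            for (int b = 0; b < next.length; b++) {
                boolean predicted = old[a][b] && (!old[a][x] || old[b][x]);
                if (now[a][b] != predicted) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two lists 0 -> 1 and 2 -> 3; link the end of the first to the head of the second.
        System.out.println(linkMatchesFormula(new int[]{1, -1, 3, -1}, 1, 2));
        // One list 0 -> 1 -> 2 -> 3; cut it after node 1.
        System.out.println(unlinkMatchesFormula(new int[]{1, 2, 3, -1}, 1));
    }
}
```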
Small Footprint Reverse
{∀α: α ≠ null → ¬(h<n*>α ∧ d<n*>α)} (disjoint lists h, d)
Node reverse(Node h, Node d) {
    if (h == null) return d;
    else {
        Node t = h.next;
        h.next = d;
        return reverse(t, h);
    }
}
{∀α, β ∈ mod: α<n*>β ↔ (β<n0*>α ∨ β = d)} (n0 denotes n on entry)
modifies mod = [h, null) ∪ [d, d]
Large Footprint Reverse
[Figure: the list 1 to 5 with head h and external pointers z1, z2, z3, shown before and after reverse(h, null)]
Modular Reasoning
Node find(Node x) {
    Node i = x.p;
    if (i != null) {
        i = find(i);
        x.p = i;
    }
    else i = x;
    return i;
}
[Figure: the parent path from x through y to null, before and after find(x); mod marks the footprint]
Modular Reasoning
void union(Node x, Node y) {
    Node t = find(x);
    Node s = find(y);
    if (t != s) t.p = s;
}
[Figure: the parent paths of x and y, each ending in null, before and after union(x, y); mod marks the footprint]
An Adaptation Rule
[Figure labels: mod, old local path, unchanged path, new local path, becomes unreachable, becomes reachable]
An FOTC Adaptation Rule
• Unmodified edges q
– [Definition of s<q>t in terms of s<f>t and the membership of s and t in mod]
• Paths in post state
– ∀s, t: s<f*>t ↔ s<q*>t ∨ ∃α, β ∈ mod: s<q*>α ∧ α<f*>β ∧ β<q*>t
An Adaptable Heap Logic
• Extend AFR with an idempotent function enmod
• Maps nodes into footprint entrance points
• Reducible to effectively propositional logic
– Even with nested recursive calls
Large Footprint Adaptation
enmod(s) = min([s, null) ∩ mod)
∀s ∉ mod, t ∈ mod: s<f*>t ↔ enmod(s)<f*>t
[Figure: a path from s entering the footprint mod at enmod(s)]
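Reading enmod(s) as the first node of mod on the path from s, the function and the characterization above can be sketched on a concrete heap. The array encoding (-1 for null, a boolean array for mod) and the names EntryPointDemo, enmod, reaches and entryPointCharacterizesPaths are assumptions made for this illustration.

```java
// Hypothetical demo class; next[i] is the f-successor of node i, or -1 for null.
class EntryPointDemo {
    // First node of mod on the path from s, or -1 (null) if the path never enters mod.
    static int enmod(int[] next, boolean[] mod, int s) {
        int t = s;
        for (int step = 0; step < next.length && t != -1; step++) {
            if (mod[t]) return t;
            t = next[t];
        }
        return -1;
    }

    // a <f*> b: b is reached from a in zero or more steps.
    static boolean reaches(int[] next, int a, int b) {
        int t = a;
        for (int step = 0; step < next.length && t != -1; step++) {
            if (t == b) return true;
            t = next[t];
        }
        return false;
    }

    // Checks: for all s outside mod and t inside mod, s <f*> t iff enmod(s) <f*> t.
    static boolean entryPointCharacterizesPaths(int[] next, boolean[] mod) {
        for (int s = 0; s < next.length; s++) {
            if (mod[s]) continue;
            int entry = enmod(next, mod, s);
            for (int t = 0; t < next.length; t++) {
                if (!mod[t]) continue;
                boolean viaEntry = entry != -1 && reaches(next, entry, t);
                if (reaches(next, s, t) != viaEntry) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] next = {1, 2, 3, -1};                    // 0 -> 1 -> 2 -> 3 -> null
        boolean[] mod = {false, false, true, true};    // footprint = {2, 3}
        System.out.println(enmod(next, mod, 0));
        System.out.println(entryPointCharacterizesPaths(next, mod));
    }
}
```

Because a node inside mod is its own entry point, applying enmod twice gives the same result as applying it once, which is the idempotence the logic relies on.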
An Adaptation Rule (Simplified)
[Figure labels: mod, old local path, unchanged path, new local path, becomes unreachable, becomes reachable; sources s1, s2; entry point enmod(s2); exit points ex1, ex2; targets t1, t2, t3]
[Annotations: s2<q*>…<f*>…<q*>t3; enmod(s2)<f*>…; enmod(s1) = null; …<f*>exi, exi<f*>t3]
An AER Adaptation Rule
[Rule: ∀s, t: s<f*>t is characterized by cases over enmod(s) = null with s<f*>t, membership of t in mod with enmod(s)<f*>t, and the exit points exi with enmod(s)<f*>exi and exi<f*>t]
Small Footprint Specification in AFR
Node find(Node x)
requires: x != null
ensures: ∀α, β: α<p*>β ↔ α = β ∨ β = r
where r = max_p [x, null)
[Figure: the parent path from x through y to null, before and after find(x); mod marks the footprint]
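The ensures clause can be exercised on a concrete parent chain. The sketch below runs the slide's find and then checks α<p*>β ↔ α = β ∨ β = r, with the quantifiers ranging over the footprint [x, null); that range, the class name FindSpecDemo and the helpers reaches and findMeetsSpec are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class FindSpecDemo {
    static class Node { Node p; }

    // find with path compression, as on the slide.
    static Node find(Node x) {
        Node i = x.p;
        if (i != null) {
            i = find(i);
            x.p = i;
        } else {
            i = x;
        }
        return i;
    }

    // a <p*> b: b is reached from a by following p zero or more times.
    static boolean reaches(Node a, Node b) {
        for (Node t = a; t != null; t = t.p) {
            if (t == b) return true;
        }
        return false;
    }

    // Builds the chain x = n0 -> n1 -> ... -> root (length >= 1), runs find(x),
    // and checks the postcondition over the footprint mod = [x, null).
    static boolean findMeetsSpec(int length) {
        List<Node> mod = new ArrayList<>();
        for (int i = 0; i < length; i++) mod.add(new Node());
        for (int i = 0; i + 1 < length; i++) mod.get(i).p = mod.get(i + 1);
        Node x = mod.get(0);
        Node r = mod.get(length - 1);          // r = last node on the p-path from x
        if (find(x) != r) return false;
        for (Node a : mod) {
            for (Node b : mod) {
                if (reaches(a, b) != (a == b || b == r)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int len = 1; len <= 5; len++) {
            System.out.println(len + ": " + findMeetsSpec(len));
        }
    }
}
```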
Verification Time (Z3)
                     Formula size
Benchmark            P,Q         mod   VC           Solving time
                     #     ∀     #     #      ∀     (Z3)
SLL: filter          7     2     1     217    6     0.48s
SLL: quicksort       25    2     1     745    9     1.06s
SLL: insert-sort     21    2     1     284    11    0.37s
UF: find             13    2     1     203    6     0.40s
UF: union            20    2     2     188    6     1.39s
Disproving with SAT
                                                         Formula size
Benchmark          Nature of defect                      P,Q         mod   VC           Solving time   C.e. size
                                                         #     ∀     #     #      ∀     (Z3)           (vertices)
UF: find           Incorrect handling of corner case     27    3     2     201    6     1.6s           2
UF: union          Incorrect specification               19    2     2     186    6     0.7s           8
SLL: filter        Uncontrolled sharing                  36    4     1     317    6     0.49s          14
SLL: insert-sort   Violating call precondition           21    2     1     283    9     0.88s          8
Example Bug
Node find(Node x) {
    Node i = x.p;
    if (i != null) {
        i = find(i);
        x.p = i;
    }
    // else i = x;
    return i;
}
[Figure: node x with a null parent, before and after find(x)]
Bug: missing else branch violates spec
∀α: x<p*>α → α<p*>r (for α = x)
where r = max_p [x, null)
Data Structures outside AFR
• Lists of equal length
• Trees
• General DAGs
• Grids
• …
Mutations outside our adaptable logic
• Creation of unbounded sharing
• Changing multiple fields
– Nested linked lists
Property Guided Shape Analysis
N. Bjorner, T. Reps, A. Thakur, T. Weiss
• Predictable Shape Analysis
– Simple fixed Predicate Abstraction
– Infer propositional invariants guided by the verification problem
– When the analysis fails:
• Concrete counterexamples
• A trace showing overly coarse abstraction
– Programmer can define new predicates using AFR
• Employ IC3/PDR
Related Work
• First-Order Reachability Axioms
– [Nelson POPL’83] Useful axioms
– [Lev-Ami’09] Useful axioms + completeness study
• Incremental Methods [Hesse’03, Reps’03, Lahiri&Qadeer POPL’06]
• Decidable Logics [Mona, STRAND, LRP, Berdine’2004, Lahiri&Qadeer POPL’08, Wies, Muñiz, Kuncak’2011,12 …]
Summary
• Reduction to SAT
• Supports modularity
• Works for many programs
• Principles
– Restricted invariants
– Inverting n* to recover n
– Uniform mutations
– Two logics