Post on 21-Dec-2015
Algorithmic Software Verification
Ranjit Jhala, UC San Diego
Checking Properties of Software
Static Safety Verification
Dynamic Liveness Testing
#include <stdlib.h>  /* malloc */

char* rev_copy(char* a, int n){
  int i = 0;
  int j = n - 1;
  char* b = malloc(n);
  while (0 <= j){
    b[i] = a[j];
    i++;
    j--;
  }
  return b;
}
Property: Memory Safety
Access Within Array Bounds
Bugs Are Disastrous: Buffer Overflow Vulnerability
For each domain:
1. Formalize Properties
2. Automate Checking
3. Build Tools
Safety
char* rev_copy(char* a, int n){
  int i = 0;
  int j = n - 1;
  char* b = malloc(n);
  while (j >= 0){
    assert (0<=i && i<n);
    b[i] = a[j];
    i++;
    j--;
  }
  return b;
}
How to prove assert never fails?
0: i = 0; j = n-1;
1: while (0<=j){
2:   assert(i<n);
     i = i+1; j = j-1;
   }
Access Within Array Bounds
Safety
1. Formalize Properties
2. Automate Checking
3. Build Tools
How to prove asserts?
Invariants [Floyd-Hoare]
Invariants
Predicate that is always true
0: i = 0; j = n-1;
1: while (0<=j){
2:   assert(i<n);
     i = i+1; j = j-1;
   }
Invariant at line 0: true
Invariant at line 1: i+j=n-1
Invariant at line 2: i+j=n-1 ∧ 0≤j
Invariant Proves Assert
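The invariants on this slide can also be exercised at run time. The following is a minimal sketch, not part of the original deck; the function name `run_with_invariants` is an assumption made for illustration. It runs the abstracted loop for one value of n and asserts the line-1 and line-2 invariants together with the property itself. A run only tests one n, whereas the Floyd-Hoare argument covers every n.

```c
#include <assert.h>

/* Runs the abstracted loop for a given n and checks the slide's invariants
   at run time. Returns the number of loop iterations executed.
   (Hypothetical helper, for illustration only.) */
int run_with_invariants(int n) {
    int i = 0;
    int j = n - 1;
    int iterations = 0;
    assert(i + j == n - 1);                /* line-1 invariant on loop entry */
    while (0 <= j) {
        assert(i + j == n - 1 && 0 <= j);  /* line-2 invariant */
        assert(i < n);                     /* the property being proved */
        i = i + 1;
        j = j - 1;
        iterations++;
        assert(i + j == n - 1);            /* line-1 invariant re-established */
    }
    return iterations;
}
```

Calling `run_with_invariants(5)` performs five iterations without tripping an assertion; a non-positive n skips the loop entirely.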
How to Prove Asserts?
How to Infer Invariants?
What are Invariants?
0: i = 0; j = n-1;
1: while (0<=j){
2:   assert(i<n);
     i = i+1; j = j-1;
   }
Let Xi = Invariant at line i (X0, X1, X2)
Properties of X0, X1, X2?
What are Invariants?
X0: Initial Values Arbitrary, so X0 = true
Line 0 to line 1: after i = 0; j = n-1 we have true ∧ i=0 ∧ j=n-1, so
  i=0 ∧ j=n-1 ⇒ X1
Line 1 to line 2: the loop guard holds, so
  0≤j ∧ X1 ⇒ X2
At line 2 the assert must hold:
  X2 ⇒ i<n
Line 2 back to line 1: with io, jo the values before the loop body,
  i=io+1 ∧ j=jo-1 ∧ [io/i][jo/j]X2 ⇒ X1
What are Invariants?
Predicates X1, X2 s.t.
  i=0 ∧ j=n-1 ⇒ X1
  0≤j ∧ X1 ⇒ X2
  X2 ⇒ i<n
  i=io+1 ∧ j=jo-1 ∧ [io/i][jo/j]X2 ⇒ X1
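One way to see what "solving for X1, X2" means is to plug in a candidate and test every implication. The sketch below is an illustration only: it substitutes the solution that appears later in the deck (X1 is i+j=n-1, X2 adds 0≤j) and checks the four constraints by exhaustive enumeration over a small range. The names `constraints_hold` and `implies` are assumptions, and bounded enumeration is a stand-in for the SMT validity queries, not a proof.

```c
#include <assert.h>

/* Candidate solution being checked (taken from the later slides):
   X1 is i+j = n-1, and X2 is i+j = n-1 together with 0 <= j. */
static int X1(int i, int j, int n) { return i + j == n - 1; }
static int X2(int i, int j, int n) { return i + j == n - 1 && 0 <= j; }

/* Logical implication on C truth values. */
static int implies(int p, int q) { return !p || q; }

/* Returns 1 if all four constraints hold for every i, j, n in
   [-bound, bound], and 0 as soon as one of them fails. */
int constraints_hold(int bound) {
    for (int n = -bound; n <= bound; n++)
        for (int i = -bound; i <= bound; i++)
            for (int j = -bound; j <= bound; j++) {
                /* entry: i=0 and j=n-1 implies X1 */
                if (!implies(i == 0 && j == n - 1, X1(i, j, n))) return 0;
                /* loop guard: 0<=j and X1 implies X2 */
                if (!implies(0 <= j && X1(i, j, n), X2(i, j, n))) return 0;
                /* assertion: X2 implies i<n */
                if (!implies(X2(i, j, n), i < n)) return 0;
                /* loop body: X2 on the old values (here i, j) implies
                   X1 on the new values i+1, j-1 */
                if (!implies(X2(i, j, n), X1(i + 1, j - 1, n))) return 0;
            }
    return 1;
}
```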
How to Infer Invariants?
How to Solve for X1, X2?
Idea: Lazy Abstraction
Idea: Lazy Abstraction
Tree of executions over atomic predicates
Atoms: i+j=n-1, 0≤j
Nodes: X1, X2
Edges: X1 ⇒ X2
Constraints:
  … ∧ [io/i][jo/j]X2 ⇒ X1
  0≤j ∧ X1 ⇒ X2
  X2 ⇒ i<n
Lazy Predicate Abstraction
Atoms: i+j=n-1, 0≤j
Constraints:
  i=0 ∧ j=n-1 ∧ X0 ⇒ X1
  0≤j ∧ X1 ⇒ X2
  X2 ⇒ i<n
  … ∧ [io/i][jo/j]X2 ⇒ X1
Tree Root: X0 = true (root X, i.e. non-RHS)
Tree Edge: “Unrolled” Implication, from X0 to X1
SMT (Z3) Query: i=0 ∧ j=n-1 ∧ true ⇒ i+j=n-1 ?
Valid
Lazy Predicate Abstraction
X0: true
X1: i+j=n-1
SMT Query: i=0 ∧ j=n-1 ∧ true ⇒ 0≤j ?
Invalid
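The two queries above decide which atoms may be attached to X1: an atom is kept only if the premise implies it. As a rough sketch of that filtering step, the code below replaces the SMT solver with enumeration over a small range of n, which is an assumption made purely for illustration, as are the names `atoms_after_init`, `ATOM_SUM` and `ATOM_JPOS`.

```c
#include <assert.h>

#define ATOM_SUM  1  /* bit for the atom i+j = n-1 */
#define ATOM_JPOS 2  /* bit for the atom 0 <= j   */

/* Returns a bitmask of the atoms implied by i=0 and j=n-1, checked for
   all n in [-bound, bound]. A bounded stand-in for the validity queries. */
int atoms_after_init(int bound) {
    int mask = ATOM_SUM | ATOM_JPOS;
    for (int n = -bound; n <= bound; n++) {
        int i = 0;
        int j = n - 1;  /* the only state satisfying the premise for this n */
        if (!(i + j == n - 1)) mask &= ~ATOM_SUM;
        if (!(0 <= j))         mask &= ~ATOM_JPOS;  /* fails, e.g., for n = 0 */
    }
    return mask;
}
```

With any bound of at least 0 the result keeps only `ATOM_SUM`, matching the slide: the first query is valid, the second is refuted by n = 0.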
Lazy Predicate Abstraction
X0: true
X1: i+j=n-1
X2: i+j=n-1 ∧ 0≤j (from 0≤j ∧ X1 ⇒ X2)
SMT Query: 0≤j ∧ i+j=n-1 ⇒ i<n ?
Valid
Lazy Predicate Abstraction
X0: true
X1: i+j=n-1
X2: i+j=n-1 ∧ 0≤j, which implies i<n
X1 (after the loop body): i+j=n-1
Fixpoint: Stop Unrolling
Inferred Invariants
Proved Asserts
…Constraints Solved
…not so fast!
How to get good atoms? e.g. i+j=n-1
Counterexamples [Ball&Rajamani 00]
Craig Interpolants [popl 04]
Recap
Safety
Invariants: X0, X1
Implications: X0 ⇒ X1
Lazy Abstraction
Higher-level Software?
Simple Imperative
Threads
Asynchronous Events
Generics & Closures
Data Structures
Pointers & Mutation
[popl 02,04][pldi 04][popl 07][pldi 08][pldi 09][popl 10]
Analyzing High Level Software
What’s hard about Data Structures?
int kmp_search(char str[], char pat[]){
  p = 0; s = 0;
  while (p < pat.length && s < str.length){
    if (str[s] == pat[p]) { s++; p++; }
    else if (p == 0) { s++; }
    else { p = table[p-1] + 1; }
  }
  if (p >= pat.length) { return (s - pat.length); }
  return (-1);
}
Need Universally Quantified Invariants
∀i: 0≤i<table.length ⇒ -1≤table[i]
Every element of table is at least -1
Prove Access Within Array Bounds
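To make the role of the quantified invariant concrete, here is a self-contained C sketch of the search. The deck does not show how `table` is built, so `build_table` below is an assumption: it fills `table[i]` with the last index of the longest proper border of `pat[0..i]`, or -1, which is one standard convention consistent with the update `p = table[p-1] + 1`. The assertion checks the invariant at the one index that is read; the verifier has to establish it for every index.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed table construction: table[i] is the last index of the longest
   proper border of pat[0..i], or -1 if there is none. */
static void build_table(const char *pat, int plen, int *table) {
    table[0] = -1;
    for (int i = 1; i < plen; i++) {
        int k = table[i - 1];
        while (k >= 0 && pat[k + 1] != pat[i]) k = table[k];
        if (pat[k + 1] == pat[i]) k++;
        table[i] = k;
    }
}

/* Returns the index of the first occurrence of pat in str, or -1 if there
   is none (also -1 if the table cannot be allocated). */
int kmp_search(const char *str, const char *pat) {
    int slen = (int)strlen(str);
    int plen = (int)strlen(pat);
    if (plen == 0) return 0;
    int *table = malloc((size_t)plen * sizeof *table);
    if (table == NULL) return -1;
    build_table(pat, plen, table);

    int p = 0, s = 0;
    while (p < plen && s < slen) {
        if (str[s] == pat[p]) { s++; p++; }
        else if (p == 0)      { s++; }
        else {
            assert(-1 <= table[p - 1]);  /* the invariant, at one index */
            p = table[p - 1] + 1;        /* hence 0 <= p, so pat[p] is in bounds */
        }
    }
    free(table);
    return (p >= plen) ? s - plen : -1;
}
```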
Need Universally Quantified Invariants
More complex for lists, trees, etc.
∀x: next*(root,x) ⇒ -1≤x.data
Quantifiers Kill SMT Solvers
Cannot Decide Implications
Key: Invariants Without Quantifiers
Idea: Logically Qualified Types
Factor Invariant to Logic x Type
Logic: Describes Individual Data
Type: Quantifies over Structure
Idea: Liquid Types
∀i: 0≤i<table.length ⇒ -1≤table[i]
factored into
table :: {v:int|-1≤v} array
  Logic: {v:int|-1≤v}
  Type: array
∀x: next*(root,x) ⇒ -1≤x.data
factored into
root :: {v:int|-1≤v} list
  Logic: {v:int|-1≤v}
  Type: list
Logic: Describes Individual Data
Type: Quantifies over Structure
Theorem Prover: Reasoning about Individual Data
Type System: Quantified Reasoning about Structure
Demo
Base Types
Collections
Closures
Generics
let rec ffor l u f =
  if l < u then ( f l; ffor (l+1) u f )
Type of f: int → unit
Template of f: {v:int|X1} → unit
l Flows Into Input of f: l<u |- {v:int|v=l} <: {v:int|X1}
Reduces to: l<u ∧ v=l ⇒ X1
Solution: X1 = l≤v ∧ v<u
Liquid Type of f: {v:int|l≤v ∧ v<u} → unit
Base Types
Collections
Closures
Generics
Collections (Structure)
let group kvs =
  let t = H.create 37 in
  List.iter (fun (k,v) ->
    let vs = if H.mem t k then H.find t k else [] in
    H.add t k (v::vs)
  ) kvs;
  t
Types
  t: ('a,'b list) H.t
  vs: 'b list
Templates
  t: ('a,{v:'b list|X1}) H.t
  vs: {v:'b list|X2}
Constraints
  {v:'b list|len v=0} <: {v:'b list|X2}, reduces to len v=0 ⇒ X2
  {v:'b list|X1} <: {v:'b list|X2}, reduces to X1 ⇒ X2
  vs:{X2} |- {len v=len vs + 1} <: {X1}, reduces to X2[vs/v] ∧ len v=len vs + 1 ⇒ X1
Solution
  X1 = 0 < len v
  X2 = 0 ≤ len v
Liquid Type of t
  ('a,{v:'b list|0 < len v}) H.t
Collections (Data)
let nearest dist ctra x =
  let da = Array.map (dist x) ctra in
  [min_index da, (x, 1)]
Type of Output: int * 'b * int list
Template of Output: {v:int|X1} * 'b * {v:int|X2} list
Liquid Type of Array.map: ('a → 'b) → x:'a array → {v:'b array|len x = len v}
Liquid Type of min_index: x:'a array → {v:int|0≤v ∧ v<len x}
da: {v:'b array|len v = len ctra}
min_index da: {v:int|0≤v ∧ v<len da}
Constraint: da:{len v = len ctra} |- {0≤v<len da} * 'b * {v=1} list <: {X1} * 'b * {X2} list
Reduces To
  len da = len ctra ∧ 0≤v<len da ⇒ X1
  len da = len ctra ∧ v=1 ⇒ X2
Solution
  X1 = 0≤v<len ctra
  X2 = 0<v
Liquid Type of Output: {v:int|0≤v<len ctra} * 'b * {v:int|0<v} list
Base Types
Collections
Closures
Generics
let min_index a =
  let min = ref 0 in
  ffor 0 (Array.length a) (fun i ->
    if a.(i) < a.(!min) then min := i
  );
  !min
Liquid Type of ffor: l:int → u:int → ({v:int|l≤v<u} → unit) → unit
Liquid Type of ffor 0: u:int → ({v:int|0≤v<u} → unit) → unit
Liquid Type of ffor 0 (len a): ({v:int|0≤v<len a} → unit) → unit
Template of (fun i ->...): {v:int|Xi} → unit
Constraint: {Xi} → unit <: {0≤v<len a} → unit
Reduces To
  {0≤v<len a} <: {Xi}, i.e. 0≤v<len a ⇒ Xi
  unit <: unit
Solution: Xi = 0≤v<len a
Liquid Type of (fun i ->...): {v:int|0≤v<len a} → unit
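For readers more at home in C than ML, the refinement on the callback's argument can be pictured as a run-time contract. The sketch below is an analogy only, not how the liquid type checker works: it assumes a C rendering of `ffor` with an explicit context pointer in place of the closure, and the names `struct min_state` and `update_min` are hypothetical. The assertions state exactly what the inferred types guarantee statically.

```c
#include <assert.h>

/* Calls f(v, ctx) for every v with l <= v < u, mirroring the refinement
   {v:int | l <= v && v < u} on the callback's argument. */
void ffor(int l, int u, void (*f)(int v, void *ctx), void *ctx) {
    for (int v = l; v < u; v++) {
        assert(l <= v && v < u);  /* holds by construction of the loop */
        f(v, ctx);
    }
}

/* State shared with the callback; C has no closures, so it is explicit. */
struct min_state {
    const int *a;
    int len;
    int min;
};

static void update_min(int i, void *ctx) {
    struct min_state *st = ctx;
    assert(0 <= i && i < st->len);              /* refinement of the argument */
    assert(0 <= st->min && st->min < st->len);  /* refinement of the stored index */
    if (st->a[i] < st->a[st->min]) st->min = i;
}

/* Returns the index of the first smallest element; len must be positive
   for the result to be a valid index. */
int min_index(const int *a, int len) {
    struct min_state st = { a, len, 0 };
    ffor(0, len, update_min, &st);
    return st.min;
}
```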
Base Types
Collections
Closures
Generics
mapreduce (nearest dist ctra) (centroid plus) xs
|> List.iter (fun (i,(x,sz)) -> ctra.(i) <- div x sz)
Type of mapreduce: ('a → 'b * 'c list) → ... → 'b * 'c list
Type Instantiation
  'a with 'a
  'b with int
  'c with 'a * int
Template Instantiation
  'a with 'a
  'b with {v:int|X1}
  'c with 'a * {v:int|X2}
Template of mapreduce: ('a → {X1} * 'a * {X2} list) → ... → {X1} * 'a * {X2} list
Liquid Type of (nearest dist ctra): 'a → {0≤v<len ctra} * 'a * {0<v} list
Constraint: 'a → {0≤v<len ctra} * 'a * {0<v} list <: 'a → {X1} * 'a * {X2} list
Reduces To
  0≤v<len ctra ⇒ X1
  0<v ⇒ X2
Solution
  X1 = 0≤v<len ctra
  X2 = 0<v
Liquid Type of mapreduce Output: {0≤v<len ctra} * 'a * {0<v} list
Recap
Safety
Invariants: X0, X1
Implications: X0 ⇒ X1
Lazy Abstraction
Liquid Types: {v:t|X0}, {v:t|X1}
Subtyping: {v:t|X0} <: {v:t|X1}
Safety
1. Formalize Properties
2. Automate Checking
3. Build Tools
Dsolve
Input: C or ML + Asserts, Atoms
Results: Safe + Types, or Error + Types
Verification Benchmarks
1. Data Structures (ML)
2. Memory Safety (C)
Finite Maps (ML)
From OCaml Standard Library
Implemented as AVL Trees
Rotate/Rebalance on Insert/Delete
[Figure: search tree with root 5: ‘cat’; children 3: ‘cow’ and 8: ‘tic’; leaves 1: ‘doc’, 4: ‘hog’, 7: ‘ant’, 9: ‘emu’]
Verified Invariants
Binary Search Ordered
Height Balanced
Keys Implement Set
Binary Decision Diagrams (ML)
Graph-Based Boolean Formulas [Bryant 86]
[Figure: BDD for X1⇔X2 ∧ X3⇔X4, with nodes X1, X2, X3, X4 and terminal 1]
Efficient Formula Manipulation
Memoizing Results on Subformulas
Verified Invariant
Variables Ordered Along Each Path
Program (ML)            Verified Invariants
List-based Sorting      Sorted, Outputs Permutation of Input
Finite Map              Balance, BST, Implements a Set
Red-Black Trees         Balance, BST, Color
Stablesort              Sorted
Extensible Vectors      Balance, Bounds Checking, …
Binary Heaps            Heap, Returns Min, Implements Set
Splay Heaps             BST, Returns Min, Implements Set
Malloc                  Used and Free Lists Are Accurate
BDDs                    Variable Order
Union Find              Acyclicity
Bitvector Unification   Acyclicity
Verification Benchmarks
1. Data Structures (ML)
2. Memory Safety (C)
Memory Safety of C Programs
Verified Property: Spatial Memory Safety
No Buffer Overflows
No Null Dereferences
Program (C)   Lines   Data Structures Used
stringlists   72      Arrays, Linked Lists
strcpy        77      Arrays
adpcm         198     Arrays
pagemap       250     Arrays, Linked Lists
mst           309     Arrays, Linked Lists, Graphs
power         620     Arrays, Linked Lists, Graphs
ks            650     Arrays, Linked Lists
ft            742     Arrays, Graphs
Safety
1. Formalize Properties
2. Automate Checking
3. Build Tools
Static Safety Verification
Dynamic Liveness Testing
Concurrent, Distributed Systems
System: Nodes exchanging messages
Challenges
Nodes enter, leave, fail
Messages are reordered, lost
Pastry
[Rowstron & Druschel ‘01]
Key-Value Store
Distributed Across Nodes
Organized in Ring Topology
Nodes Leave and Rejoin
1. Node leaves
2. Neighbors detect, reconnect
3. Node returns
4. Node asks for neighbors
5. Node rejoins neighbors
But Sometimes...
Node asks for neighbors
Query bounces back!
Node forever unable to rejoin...
How to find?
How to reproduce?
How to fix?
Liveness
1. Formalize Properties
2. Automate Analysis
3. Build Tools
States and Transitions
State: Snapshot of system
[Figure: Initial State with nodes 1 and 2, and a transition labeled event@1]
At each state, scheduler chooses
1. Node n
2. Event @n
3. Executes code (C++)
The Space of System Executions
[Figure: tree of states rooted at the Initial State, with transitions event@1, event@2, fail@1, fail@2]
Desired Properties
(Despite Failures, ...)
Eventually all nodes regroup
Eventually all data delivered
Eventually something good happens
Eventually “P is true”
Liveness Properties
Live States: P is true
Live Executions
[Figure: execution from the Initial State that reaches the Live States]
Liveness Bugs
Execution never reaches live state
How to find liveness bugs?
Liveness
1. Formalize Properties
2. Automate Analysis
3. Build Tools
How to find liveness bugs?
Idea: Dead Executions
Dead States: No execution can reach live states
Dead Execution: Execution reaches a dead state
Recovery is Impossible
Especially Severe Class of Bugs
How to tell if state is Dead?
Property only says which are live
Idea: Random Walks
Execute long random walks from state
Dead States: Pr[reaching live] = 0
Other states: Pr[reaching live] = 1
How to tell if state is Dead?
Property only says which are live
Executions and Random Walks
At each execution step:
1. Scheduler picks node n
2. Scheduler picks event @n
3. Executes event code
Random Walk: Scheduler picks randomly
Algorithm = Search + Random Walks
1. Systematic Search: find candidates
2. Random Walk: test if candidate dead
Iterate
If walk length >> avg. steps to liveness
Then non-live walk is likely liveness bug!
[Figure labels: 100k Events, 1k Events]
Walk length found with repeated trials
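The random-walk test can be sketched on a toy transition system. Everything concrete below is an assumption for illustration: the six-state graph, the small linear congruential generator (used so that runs are reproducible), and the names `walk_reaches_live` and `looks_dead`. It is not MaceMC's implementation, which executes real event handlers; in the full algorithm the candidate states would come from the systematic search.

```c
#include <assert.h>

#define NUM_STATES 6
#define MAX_SUCC   2

/* Hypothetical toy system: succ[s] lists the states reachable from s
   in one event; succ_count[s] says how many entries are distinct. */
static const int succ_count[NUM_STATES] = { 2, 2, 1, 1, 1, 1 };
static const int succ[NUM_STATES][MAX_SUCC] = {
    { 1, 3 },  /* 0: initial state */
    { 2, 5 },  /* 1 */
    { 2, 2 },  /* 2: the live state */
    { 4, 4 },  /* 3: dead cycle 3 <-> 4 */
    { 3, 3 },  /* 4 */
    { 2, 2 },  /* 5 */
};

static int is_live(int s) { return s == 2; }

/* Small linear congruential generator, so that runs are reproducible. */
static unsigned next_random(unsigned *seed) {
    *seed = *seed * 1103515245u + 12345u;
    return (*seed >> 16) & 0x7fffu;
}

/* One random walk: returns 1 if a live state is reached within max_steps. */
static int walk_reaches_live(int s, int max_steps, unsigned *seed) {
    for (int step = 0; step < max_steps; step++) {
        if (is_live(s)) return 1;
        s = succ[s][next_random(seed) % (unsigned)succ_count[s]];
    }
    return is_live(s);
}

/* Flags s as a candidate dead state if none of `walks` random walks of
   length max_steps reaches a live state. */
int looks_dead(int s, int walks, int max_steps, unsigned seed) {
    for (int w = 0; w < walks; w++)
        if (walk_reaches_live(s, max_steps, &seed)) return 0;
    return 1;
}
```

States 3 and 4 are flagged for any seed, since no path leaves their cycle; state 1 is never flagged, since every path from it reaches the live state within two steps.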
Recap
Dead Executions: System has shot itself (but doesn't know it)
Systematic Search: Finds candidate dead states
Random Walks: Determine if candidate is dead
Liveness
1. Formalize Properties
2. Automate Analysis
3. Build Tools
MaceMC [NSDI 07]
Input: Mace (C++) System, Liveness Properties
Output: Liveness Bugs
Systems Analyzed
RandTree: Random Overlay Tree with max degree.
MaceTransport: User-level, reliable messaging service.
Pastry: Key-based routing, using an overlay ring.
Chord: Key-based routing, using an overlay ring.
Liveness Properties
RandTree: Eventually, all nodes form single tree.
MaceTransport: Eventually, all messages acknowledged.
Pastry: Eventually, all nodes form a ring.
Chord: Eventually, all nodes form a ring.
MaceMC finds the Pastry Bug
Pastry Bug Understood
Node forever unable to rejoin...
1. B sends C message about A
2. A leaves; ring reforms with C and B
3. A returns
4. C receives (stale) message about A, updates routing information
System Dies!
5. A’s Rejoin requests bounced back
A forever unable to rejoin...
“Dropped JoinRequest on rapid rejoin problem: There was a problem with nodes not being able to quickly rejoin if they used the same NodeId. Didn’t find the cause of this bug, but can no longer reproduce.”
(FreePastry README, “Changes since 1.4.2”)
Also in Original Implementation
A “Protocol Level” Bug
Liveness Bugs Yield Safety Assertions
Dead States: Violate a priori unknown safety assertions
MaceMC: Finds dead states, yielding new asserts
New Safety Property: Chord
Nodes with Fwd, Back pointers
Property: Eventually nodes form a ring
Critical Transition To Dead State, Where: n.back = n, n.fwd = m
New Safety Property: IF n.back=n THEN n.fwd=n
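Once stated, the new property is cheap to check on every state. A minimal sketch follows, assuming a hypothetical `struct node` that holds only the two ring pointers named on the slide; the actual Mace data structures are not shown in the deck.

```c
#include <assert.h>

/* Hypothetical node record with the two ring pointers named on the slide. */
struct node {
    struct node *fwd;
    struct node *back;
};

/* Returns 1 if n satisfies: IF n.back = n THEN n.fwd = n, else 0. */
int ring_pointers_consistent(const struct node *n) {
    return n->back != n || n->fwd == n;
}
```

The critical transition on the slide produces a node with `back` pointing to itself and `fwd` pointing to another node m, for which this check returns 0.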
Scorecard
System          Bugs   Liveness   Safety
MaceTransport   11     5          6
RandTree        17     12         5
Pastry          5      5          0
Chord           19     9          10
Total           52     31         21
Several “protocol level” bugs
Routinely used by Mace programmers
Zero False Alarms
Liveness
1. Formalize Properties
2. Automate Analysis
3. Build Tools
Other Work
Distributed Systems
How to find, reproduce, fix end-to-end liveness bugs?
Mace Language [pldi 07]
MC & Random Walks [nsdi 07]
Multithread Analysis
How to prevent and control Thread Interference?
Scalable Race Detection
Multithread Analysis = Sequential Analysis x Race Detection
Lock Allocation
[popl07] [pldi08] [fse07]
Config Management
Q: Can I install emacs?
A: If you have X11 or Xorg (but not both)
How to avoid “DLL hell”?
NP-Complete
Encode and Solve via SAT [icse 07]
SAT solvers in Eclipse, Suse-Linux
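The emacs example can be written as a propositional formula with one boolean per package. The sketch below uses a hypothetical encoding, emacs ⇒ (X11 xor Xorg), and enumerates assignments by brute force; this is feasible only because there are two free variables, and a real tool would hand the formula to a SAT solver instead. The function names are assumptions.

```c
#include <assert.h>

/* Hypothetical encoding of the slide's example: installing emacs requires
   exactly one of X11 and Xorg. Each package is one boolean variable. */
static int constraint_ok(int emacs, int x11, int xorg) {
    return !emacs || (x11 != xorg);
}

/* Counts the assignments to (x11, xorg) under which emacs can be
   installed, by enumerating all four of them. */
int count_emacs_installs(void) {
    int count = 0;
    for (int x11 = 0; x11 <= 1; x11++)
        for (int xorg = 0; xorg <= 1; xorg++)
            if (constraint_ok(1, x11, xorg)) count++;
    return count;
}
```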
Web 2.0 Security
How to prevent JavaScript from doing mischief?
Staged Analysis for JavaScript [pldi 09]
Dynamic Information Flow [tbd]
For each domain:
1. Formalize Properties
2. Automate Checking
3. Build Tools
Analysis Connects Properties & Code
Analyze tricky corner cases
Re-analyze as code evolves
Reliable Software
“ucsd progsys”
(people, papers, code, demos, etc.)