Bandera: Extracting Finite-state Models from Java Source Code
Faculty:
James Corbett
Matthew Dwyer
John Hatcliff
Students and Post-docs:
Shawn Laubach
Corina Pasareanu
Robby
Hongjun Zheng
Roby Joehanes
Ritesh Desai
Venkatesh Ranganath
Oksana Tkachuk
Goal: Increase Software Reliability
Trends:
Size, complexity, concurrency, distributed
Cost of software engineer
Cost of CPU cycle
Future: Automated Fault Detection
The Dream
Program:
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Requirement:
Property 1: …
Property 2: …
Checker
OK, or an error trace
Model Checking
Finite-state model
Temporal logic formula
Model Checker
OK, or an error trace:
Line 5: … Line 12: … Line 15: … Line 21: … Line 25: … Line 27: … … Line 41: … Line 47: …
Why use Model Checking?
In contrast to testing, it gives complete coverage by exhaustively exploring all paths in the system.
It has been used for years with good success in hardware and protocol design.
Automatically check, e.g.,
– invariants, simple safety & liveness properties
– absence of dead-lock and live-lock
– complex event sequencing properties
“Between the window open and the window close, button X can be pushed at most twice.”
This suggests that model-checking can complement existing software quality assurance techniques.
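A minimal sketch of what "exhaustively exploring all paths" means for an invariant check: a breadth-first search over every reachable state that returns a shortest error trace on failure. The class name `InvariantChecker`, the method `check`, and the successor-function interface are illustrative assumptions, not part of Bandera or of any particular model checker, and the sketch covers only invariants, not liveness properties.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

/** Breadth-first reachability check of an invariant over a finite state graph. */
final class InvariantChecker {
    /**
     * Returns an empty list if every reachable state satisfies the invariant,
     * otherwise a shortest path from the initial state to a violating state.
     * States must be non-null and implement equals/hashCode.
     */
    static <S> List<S> check(S init, Function<S, List<S>> successors, Predicate<S> invariant) {
        Map<S, S> parent = new HashMap<>();   // visited set plus back-pointers
        Deque<S> frontier = new ArrayDeque<>();
        parent.put(init, null);
        frontier.add(init);
        while (!frontier.isEmpty()) {
            S s = frontier.remove();
            if (!invariant.test(s)) {
                // Walk the back-pointers to rebuild the error trace.
                List<S> trace = new ArrayList<>();
                for (S t = s; t != null; t = parent.get(t)) {
                    trace.add(t);
                }
                Collections.reverse(trace);
                return trace;
            }
            for (S next : successors.apply(s)) {
                if (!parent.containsKey(next)) {
                    parent.put(next, s);
                    frontier.add(next);
                }
            }
        }
        return Collections.emptyList();
    }
}
```

The visited map is what makes the search terminate on a finite model; its size is also where the state explosion discussed later shows up.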
What makes model-checking software difficult?
Problems using existing checkers:
– Model construction
– Property specification
– State explosion
– Output interpretation
Finite-state model
Temporal logic formula
Model Checker
OK, or an error trace:
Line 5: … Line 12: … Line 15: … Line 21: …
Model Construction Problem
Semantic gap:
Program
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Gap
Model Description
Model Checker
Programming Languages: methods, inheritance, dynamic creation, exceptions, etc.
Model Description Languages: automata
Property Specification Problem
Difficult to formalize a requirement in temporal logic
“Between the window open and the window close, button X can be pushed at most twice.”
…is rendered in LTL as...
[]((open /\ <>close) -> ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ (!pushX U close))))))))))
Property Specification Problem
Forced to state property in terms of model rather than source:
We want to write source level specifications...
Heap.b.head == Heap.b.tail
We are forced to write model level specifications...
(((_collect(heap_b) == 1)\
 && (BoundedBuffer_col.instance[_index(heap_b)].head == BoundedBuffer_col.instance[_index(heap_b)].tail) )\
|| ((_collect(heap_b) == 3)\
 && (BoundedBuffer_col_0.instance[_index(heap_b)].head == BoundedBuffer_col_0.instance[_index(heap_b)].tail) )\
|| ((_collect(heap_b) == 0) && TRAP))
State Explosion Problem
Moore’s law and algorithm advances can help
– Holzmann: 7 days (1980) ==> 7 seconds (2000)
Explosive state growth in software limits scalability
N bits x1,…,xN give 2^N states
Cost is exponential in the number of components
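The 2^N growth can be observed directly with a small sketch that enumerates every state reachable when any one of N bits may flip per step. The class and method names are illustrative assumptions; the point is only that the visited set doubles with each added component.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

final class StateExplosionDemo {
    /** Counts states reachable from all-zero when any one of n bits may flip per step (n < 31). */
    static long reachableStates(int n) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> work = new ArrayDeque<>();
        seen.add(0);
        work.push(0);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int i = 0; i < n; i++) {
                int next = s ^ (1 << i);          // flip bit i
                if (seen.add(next)) {
                    work.push(next);
                }
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 16; n++) {
            System.out.println(n + " bits: " + reachableStates(n) + " states");
        }
    }
}
```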
Output Interpretation Problem
Raw error trace may be thousands of steps long
Program
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Gap
Model Description
Error trace
Line 5: … Line 12: … Line 15: … Line 21: … Line 25: … Line 27: … … Line 41: … Line 47: …
Must map line listing onto model description
Mapping to source is made difficult by
– semantic gap & clever encodings of complex features
– multiple optimizations and transformations
Bandera: An open tool set for model-checking Java source code
Checker Inputs
Checker Outputs
Optimization Control
Transformation & Abstraction Tools
Model Checkers
Java Source
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Bandera Temporal Specification
Graphical User Interface
Error Trace Mapping
Bandera
Addressing the Model Construction Problem
Model extraction: compiling to model checker inputs:
Numerous analyses, optimizations, two intermediate languages, multiple back-ends
Slicing, abstract interpretation, specialization
Variety of usage modes: simple...highly tuned
Java Source
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Model Compiler
Static Analyses, Abstract Interpretation, Slicing, Optimizations
Model Description
Addressing the Property Specification Problem
An extensible language based on field-tested temporal property specification patterns
[]((open /\ <>close) -> ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ ((!pushX /\ !close) U (close \/ ((pushX /\ !close) U (close \/ (!pushX U close))))))))))
Using the pattern system: 2-bounded existence
Between {open} and {close} {pushX} exists atMost {2} times;
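To make the meaning of the bounded-existence pattern concrete, here is a hedged sketch that checks it on a finite trace of event names. It is not how Bandera evaluates specifications (those are compiled to temporal logic for a model checker). The class `BoundedExistence` and its signature are assumptions, and the sketch counts individual event occurrences, whereas the LTL formula above counts maximal runs of states in which pushX holds.

```java
import java.util.List;

final class BoundedExistence {
    /**
     * Checks "between open and close, event occurs at most k times" on a finite
     * trace of event names. A segment that is opened but never closed is not
     * constrained, matching the "between" scope.
     */
    static boolean holds(List<String> trace, String open, String close, String event, int k) {
        boolean inScope = false;
        int count = 0;
        for (String e : trace) {
            if (inScope && e.equals(close)) {
                if (count > k) {
                    return false;       // too many occurrences in a closed segment
                }
                inScope = false;
            } else if (!inScope && e.equals(open)) {
                inScope = true;
                count = 0;
            } else if (inScope && e.equals(event)) {
                count++;
            }
        }
        return true;
    }
}
```

For example, `holds(trace, "open", "close", "pushX", 2)` corresponds to the specification on the slide.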
Addressing the State Explosion Problem
Aggressive customization via slicing, abstract interpretation, program specialization
Java Source
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
…
Property
Model Compiler
Model Descriptions
Generate models customized with respect to the property!
Result: multiple models, even as many as one per property
Addressing the Output Interpretation Problem
Like a debugger: error traces mapped back to source
– Run error traces forwards and backwards
– Program state queried
– Heap structures navigated
– Locks, wait sets, blocked sets displayed
Java Source
void add(Object o) { buffer[head] = o; head = (head+1)%size; }
Object take() { … tail=(tail+1)%size; return buffer[tail]; }
Model Compiler
Intermediate Representations
Model Description
Model Checker
Error trace
Line 5: … Line 12: … Line 15: … Line 21: …
+ simulator
Bandera Architecture
Java
Parser
Jimple
Property Tool
Analyses
Slicer
Abstraction Engine
BIRC
BIR
Simulator
Translators: SPIN, dSPIN, SMV, JPF
Error Trace Display
Property Specification
/**
 * @observable
 *   EXP Full: (head == tail);
 */
class BoundedBuffer {
  Object [] buffer;
  int head, tail, bound;
  public synchronized void add(Object o) {…}
  public synchronized Object take () {…}
}
Requirement:
If a buffer becomes full, it will eventually become non-full.
Bandera Specification:
FullToNonFull: forall[b:BoundedBuffer]. {!Full(b)} responds to {Full(b)} globally;
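The response pattern reads as "every state where Full holds is followed, then or later, by a state where !Full holds". A rough sketch of that reading on a finite trace follows. The names are assumptions, and a finite trace only approximates a liveness property, which a model checker evaluates over infinite executions.

```java
import java.util.List;
import java.util.function.Predicate;

final class ResponseCheck {
    /**
     * Finite-trace reading of "effect responds to cause globally": every state
     * satisfying cause is followed (at the same position or later) by a state
     * satisfying effect.
     */
    static <S> boolean responds(List<S> trace, Predicate<S> cause, Predicate<S> effect) {
        boolean pending = false;
        for (S s : trace) {
            if (cause.test(s)) {
                pending = true;         // an obligation is opened
            }
            if (effect.test(s)) {
                pending = false;        // all open obligations are discharged
            }
        }
        return !pending;
    }
}
```

With states encoded as `{head, tail}` pairs, the slide's observable becomes the predicate `s -> s[0] == s[1]` and its negation is the response.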
Property-directed Slicing
Slicing criterion generated automatically from observables mentioned in the property
Backwards slicing automatically finds all components that might influence the observables
Source program
Slice
Resulting slice
Legend: mentioned in property; indirectly relevant
Property-directed Slicing
/**
 * @observable EXP Full: (head == tail)
 */
class BoundedBuffer {
  Object [] buffer_;
  int bound;
  int head, tail;
  public synchronized void add(Object o) {
    while ( tail == head )
      try { wait(); } catch ( InterruptedException ex) {}
    buffer_[head] = o;
    head = (head+1) % bound;
    notifyAll();
  }
  ...
}
Slicing Criterion: all statements that assign to head, tail.
Legend: included in slicing criterion; indirectly relevant; removed by slicing
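For readers who want to run the example, here is a complete sketch of the bounded buffer. The slides elide the constructor and the body of `take()`, so the initial value of `tail`, the empty test in `take()`, and the class name `BoundedBufferSketch` are all assumptions chosen to be consistent with the Full observable `(head == tail)`. Unlike the slide, the sketch rethrows interruption as an unchecked exception instead of swallowing it.

```java
/** Runnable sketch of the slide's bounded buffer; elided parts are filled in by assumption. */
final class BoundedBufferSketch {
    private final Object[] buffer;
    private final int bound;   // must be at least 2; usable capacity is bound - 1
    private int head;
    private int tail;

    BoundedBufferSketch(int bound) {
        this.bound = bound;
        this.buffer = new Object[bound];
        this.head = 0;
        this.tail = bound - 1;   // assumption: tail trails head by one slot when empty
    }

    /** The slide's observable: Full is (head == tail). */
    synchronized boolean isFull() {
        return head == tail;
    }

    synchronized void add(Object o) {
        while (tail == head) {               // block while full, as on the slide
            awaitChange();
        }
        buffer[head] = o;
        head = (head + 1) % bound;
        notifyAll();
    }

    synchronized Object take() {
        while (head == (tail + 1) % bound) { // assumption: this is the empty test
            awaitChange();
        }
        tail = (tail + 1) % bound;
        Object o = buffer[tail];
        notifyAll();
        return o;
    }

    /** Waits on this monitor; must be called while holding the lock. */
    private void awaitChange() {
        try {
            wait();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
    }
}
```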
Property-directed Slicing
Dependencies for concurrent Java [SAS’99]
Data Dependence: x := 3; y = x + 1;
Control Dependence: z<0
Interference Dependence (Thread 1, Thread 2): x := z; z := 4;
Synchronization Dependence: enter monitor(o); enter monitor(o)
Ready Dependence: notify(o); wait(o)
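Once all of these dependence kinds have been computed, backward slicing reduces to reachability over the dependence relation. The sketch below assumes the dependences are already available as a map from each statement to the statements it depends on. The class name, the string labels for statements, and the merging of all dependence kinds into one relation are simplifying assumptions, not Bandera's data structures.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class BackwardSlicer {
    /**
     * dependsOn maps each statement to the statements it depends on (data,
     * control, interference, synchronization, and ready dependences merged).
     * Returns the criterion plus everything reachable backwards from it.
     */
    static Set<String> slice(Map<String, Set<String>> dependsOn, Set<String> criterion) {
        Set<String> inSlice = new LinkedHashSet<>(criterion);
        Deque<String> work = new ArrayDeque<>(criterion);
        while (!work.isEmpty()) {
            String stmt = work.pop();
            for (String dep : dependsOn.getOrDefault(stmt, Collections.emptySet())) {
                if (inSlice.add(dep)) {
                    work.push(dep);     // newly discovered, explore its dependences too
                }
            }
        }
        return inSlice;
    }
}
```

Statements that never enter the returned set, such as the write to `buffer_[head]` in the bounded buffer when the criterion is the assignments to `head` and `tail`, are the ones removed by slicing.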
Abstraction Engine
Collapses data domains via abstract interpretation:
Data domains
int is abstracted to Signs: (n<0) : neg; (n==0) : zero; (n>0) : pos
Code
int x = 0; if (x == 0) x = x + 1;
Signs x = zero; if (x == zero) x = pos;
Abstraction Component Functionality
Table columns: Variable, Concrete Type, Abstract Type, Inferred Type
Variables: x, y, done, count, …, o, b
Concrete types: int, int, bool, Object, Buffer, int, …
Abstract types: Signs, Signs, Signs
Inferred types: int, AbsBool, …, Point, Buffer
Components: Abstraction Library, Bandera Abstraction Specification Language, BASL Compiler, PVS, Jimple, Jimple Abstraction Engine, Abstracted Jimple
Abstraction Specification
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    begin
      n < 0 -> {NEG};
      n == 0 -> {ZERO};
      n > 0 -> {POS};
    end
  operator + add
    begin
      (NEG , NEG) -> {NEG} ;
      (NEG , ZERO) -> {NEG} ;
      (ZERO, NEG) -> {NEG} ;
      (ZERO, ZERO) -> {ZERO} ;
      (ZERO, POS) -> {POS} ;
      (POS , ZERO) -> {POS} ;
      (POS , POS) -> {POS} ;
      (_,_) -> {NEG, ZERO, POS}; /* case (POS,NEG), (NEG,POS) */
    end
Compiled:
public class Signs {
  public static final int NEG = 0;  // mask 1
  public static final int ZERO = 1; // mask 2
  public static final int POS = 2;  // mask 4
  public static int abstract(int n) {
    if (n < 0) return NEG;
    if (n == 0) return ZERO;
    if (n > 0) return POS;
  }
  public static int add(int arg1, int arg2) {
    if (arg1==NEG && arg2==NEG) return NEG;
    if (arg1==NEG && arg2==ZERO) return NEG;
    if (arg1==ZERO && arg2==NEG) return NEG;
    if (arg1==ZERO && arg2==ZERO) return ZERO;
    if (arg1==ZERO && arg2==POS) return POS;
    if (arg1==POS && arg2==ZERO) return POS;
    if (arg1==POS && arg2==POS) return POS;
    return Bandera.choose(7); /* case (POS,NEG), (NEG,POS) */
  }
}
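The compiled class on the slide relies on `Bandera.choose` and uses `abstract`, a reserved word in Java, as a method name, so it does not compile on its own. A standalone sketch of the same Signs abstraction is given below. It represents each token by its mask bit and returns the set of possible result tokens instead of making a nondeterministic choice. The class name `SignsSketch`, the method name `abs`, and the set-valued return are assumptions made for illustration.

```java
final class SignsSketch {
    // One bit per token, following the "mask" comments on the slide.
    static final int NEG = 1;
    static final int ZERO = 2;
    static final int POS = 4;
    static final int ANY = NEG | ZERO | POS;   // the set {NEG, ZERO, POS}, mask 7

    /** Abstraction function: maps a concrete int to its sign token. */
    static int abs(int n) {
        return n < 0 ? NEG : (n == 0 ? ZERO : POS);
    }

    /**
     * Abstract addition on single tokens. Returns a bitmask holding every
     * token the concrete sum might map to.
     */
    static int add(int a, int b) {
        if (a == ZERO) return b;     // ZERO + x has the sign of x
        if (b == ZERO) return a;     // x + ZERO has the sign of x
        if (a == b) return a;        // NEG + NEG or POS + POS
        return ANY;                  // (POS,NEG) or (NEG,POS): sign unknown
    }
}
```

Under this sketch, the abstracted code `x = zero; if (x == zero) x = pos;` corresponds to `add(abs(0), abs(1))`, which evaluates to `POS`.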
Specification Creation Tools
abstraction Signs abstracts int
begin
  TOKENS = { NEG, ZERO, POS };
  abstract(n)
    begin
      n < 0 -> {NEG};
      n == 0 -> {ZERO};
      n > 0 -> {POS};
    end
  operator + add
    begin
      (NEG , NEG) -> {NEG} ;
      (NEG , ZERO) -> {NEG} ;
      (ZERO, NEG) -> {NEG} ;
      (ZERO, ZERO) -> {ZERO} ;
      (ZERO, POS) -> {POS} ;
      (POS , ZERO) -> {POS} ;
      (POS , POS) -> {POS} ;
      (_,_) -> {NEG, ZERO, POS};
    end
Automatic Generation
Forall n1,n2: neg?(n1) and neg?(n2) implies not pos?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not zero?(n1+n2)
Forall n1,n2: neg?(n1) and neg?(n2) implies not neg?(n1+n2)
Proof obligations submitted to PVS...
Example: Start safe, then refine: +(NEG,NEG)={NEG,ZERO,POS}
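The "start safe, then refine" workflow can be imitated with a brute-force check in place of a theorem prover. The sketch below tests whether an abstract addition table is sound on a small range of integers. This is testing over a bounded sample, not a proof of the kind PVS provides, and the class name, the table encoding (rows and columns ordered NEG, ZERO, POS), and the method names are assumptions. The range is kept small so that Java int overflow does not interfere.

```java
final class SignsAddCheck {
    static final int NEG = 1;
    static final int ZERO = 2;
    static final int POS = 4;
    static final int ANY = NEG | ZERO | POS;

    static int abs(int n) {
        return n < 0 ? NEG : (n == 0 ? ZERO : POS);
    }

    /** Index of a single-token mask: NEG is 0, ZERO is 1, POS is 2. */
    static int idx(int token) {
        return Integer.numberOfTrailingZeros(token);
    }

    /** The refined table from the slide, as sets of allowed result tokens. */
    static int[][] refinedTable() {
        return new int[][] {
            {NEG, NEG, ANY},     // NEG + (NEG, ZERO, POS)
            {NEG, ZERO, POS},    // ZERO + (NEG, ZERO, POS)
            {ANY, POS, POS}      // POS + (NEG, ZERO, POS)
        };
    }

    /** True if, for all n1, n2 in lo..hi, the token of n1 + n2 is allowed by the table. */
    static boolean soundOn(int[][] table, int lo, int hi) {
        for (int n1 = lo; n1 <= hi; n1++) {
            for (int n2 = lo; n2 <= hi; n2++) {
                int allowed = table[idx(abs(n1))][idx(abs(n2))];
                if ((allowed & abs(n1 + n2)) == 0) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

A table whose every entry is `ANY` is the safe starting point; each refinement removes tokens from an entry and is kept only if the check (or, in Bandera, the PVS proof) still succeeds.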
Abstraction Library
Current Library Contains:
Range(i,j) : i..j modeled precisely, e.g.,
– Range(0,0) is the signs abstraction
– Range(2,4) has tokens {lt2,2,3,4,gt4}
Modulo(k), e.g.,
– Modulo(2) is the even-odd abstraction
Specific(v,…) : identifies values of interest, e.g.,
– Specific(10) has tokens {eq10,not10}
User extendable for base type predicates
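As an illustration of what these library abstractions compute, the sketch below gives an abstraction function for each family. The token names for Range and Specific follow the slide (`lt2`, `gt4`, `eq10`, `not10`); the `mod` prefix for Modulo tokens, the class name, and the use of strings as tokens are assumptions.

```java
final class LibraryAbstractions {
    /** Range(i,j): values i..j are kept exact; everything else collapses to lt<i> or gt<j>. */
    static String range(int i, int j, int n) {
        if (n < i) return "lt" + i;
        if (n > j) return "gt" + j;
        return Integer.toString(n);
    }

    /** Modulo(k): the token is the remainder class of n, normalized to 0..k-1. */
    static String modulo(int k, int n) {
        return "mod" + Math.floorMod(n, k);
    }

    /** Specific(v): distinguishes one value of interest from all others. */
    static String specific(int v, int n) {
        return n == v ? "eq" + v : "not" + v;
    }
}
```

With these definitions, `range(0, 0, n)` yields three tokens (`lt0`, `0`, `gt0`), which is the signs abstraction under different names.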
Back End
Bandera Intermediate Representation (BIR)
– guarded command language
– includes: locks, threads, references, heap
– info to help translators (live vars, invisible)
Jimple:
entermonitor r0
r1.count = 0;
…
BIR:
loc s5: live { r0, r1 }
  when lockAvail(r0.lock) do { lock(r0.lock); } goto s6;
loc s6: live { r1 }
  when true do invisible { r1.count = 0; } goto s7;
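The guarded-command style of the two BIR locations can be mimicked in a few lines of Java: each transition has a source location, a guard, an action, and a target location, and a process is blocked when no guard at its current location holds. This is only an illustrative interpreter; the class names, the `State` fields standing in for `r0.lock` and `r1.count`, and the initial value of `count` are assumptions, and the `live` and `invisible` annotations are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

final class GuardedCommands {
    /** Mutable state touched by the two transitions shown on the slide. */
    static final class State {
        boolean lockHeld;      // stands in for r0.lock
        int count = 5;         // stands in for r1.count (initial value is arbitrary)
        String loc = "s5";     // current control location
    }

    /** One transition: at location from, when guard holds, run action and go to to. */
    static final class Transition {
        final String from;
        final Predicate<State> guard;
        final Consumer<State> action;
        final String to;

        Transition(String from, Predicate<State> guard, Consumer<State> action, String to) {
            this.from = from;
            this.guard = guard;
            this.action = action;
            this.to = to;
        }
    }

    /** The two locations from the slide: acquire the lock, then reset the counter. */
    static List<Transition> program() {
        List<Transition> ts = new ArrayList<>();
        ts.add(new Transition("s5", s -> !s.lockHeld, s -> s.lockHeld = true, "s6"));
        ts.add(new Transition("s6", s -> true, s -> s.count = 0, "s7"));
        return ts;
    }

    /** Fires the first enabled transition; returns false if none is enabled. */
    static boolean step(State s, List<Transition> ts) {
        for (Transition t : ts) {
            if (t.from.equals(s.loc) && t.guard.test(s)) {
                t.action.accept(s);
                s.loc = t.to;
                return true;
            }
        }
        return false;
    }
}
```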
Bounded Buffer BIR
process BoundedB()
  BoundedBuffer_ref = ref { BoundedBuffer_col, BoundedBuffer_col_0 };
  BoundedBuffer_rec = record {
    bound_ : range -1..4;
    head_ : range -1..4;
    tail_ : range -1..4;
    BIRLock : lock wait reentrant;
  };
  BoundedBuffer_col : collection [3] of BoundedBuffer_rec;
  BoundedBuffer_col_0 : collection [3] of BoundedBuffer_rec;
…
loc s34: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_6, add_JJJCTEMP_8 }
  when true do invisible { add_JJJCTEMP_8 := (add_JJJCTEMP_6 % add_JJJCTEMP_8); } goto s35;
loc s35: live { b2, b1, add_JJJCTEMP_0, add_JJJCTEMP_8 }
  when true do { add_JJJCTEMP_0.head_ := add_JJJCTEMP_8; } goto s36;
loc s36: live { b2, b1, add_JJJCTEMP_0 }
  when true do { notifyAll(add_JJJCTEMP_0.BIRLock); } goto s37;
loc s37: live { b2, b1, add_JJJCTEMP_0 }
  when true do { unlock(add_JJJCTEMP_0.BIRLock); } goto s38;
Bounded Buffer Promela
typedef BoundedBuffer_rec {
  type_8 bound_;
  type_8 head_;
  type_8 tail_;
  type_18 BIRLock;
}
…
loc_25:
atomic {
  printf("BIR: 25 0 1 OK\n");
  if
  :: (_collect(add_JJJCTEMP_0) == 1) ->
       add_JJJCTEMP_8 = BoundedBuffer_col.instance[_index(add_JJJCTEMP_0)].tail_;
  :: (_collect(add_JJJCTEMP_0) == 2) ->
       add_JJJCTEMP_8 = BoundedBuffer_col_0.instance[_index(add_JJJCTEMP_0)].tail_;
  :: else ->
       printf("BIR: 25 0 1 NullPointerException\n");
       assert(0);
  fi;
  goto loc_26;
}
Translators
Plug-in component that interfaces to a specific model checker
– Translates BIR to checker input language
– Parses output of checker for error trace
Currently
– SPIN, dSPIN, SMV translators complete
– JPF (from NASA Ames) integrated
– XMC, FDR translators in progress
Case Studies
Small examples thus far (< 2000 loc)
– illustrating use of property-pattern system and other components
Scheduler from DEOS real-time OS kernel
– (1600 loc, 22 classes, seven tasks)
Now trying systems up to 20,000 loc
– collection of 15 open-source 100% pure Java programs
– Jigsaw web-server from W3C
– Tomcat, James (from Apache/Jakarta)
In general, 1-2 minutes for model extraction (on ~2000 loc systems)
State space reductions can dramatically reduce cost
Summary
Bandera provides an open platform for experimentation
Separates model checking from extraction
– uses existing model checkers
– supports multiple model checkers
Specialize models for specific properties using automated support for slicing, abstraction, etc.
Designed for extensibility
– well-defined internal representations and interfaces
We hope this will contribute to the definition of APIs for software model-checkers
Other Work on Software Model-checking
Java
– JPF (NASA Ames)
– JCAT (Torino)
– Java to SAL (Stanford)
C
– SLAM (Microsoft Research)
– AX, FeaVer (Lucent)
Current Status
A reasonable subset of concurrent Java
– not handled: recursive methods, exceptions, inner classes, native methods, libraries(*)
Public release: October 2000
Demo tomorrow morning
http://www.cis.ksu.edu/santos/bandera