modeling data in formal verification bits, bit vectors, or words
Post on 25-Feb-2016
29 Views
Preview:
DESCRIPTION
TRANSCRIPT
Modeling Data in Formal Modeling Data in Formal VerificationVerification
Bits, Bit Vectors, or WordsBits, Bit Vectors, or Words
http://www.cs.cmu.edu/~bryant
Randal E. BryantCarnegie Mellon University
– 2 –
OverviewIssueIssue
How should data be modeled in logical analysis? Verification, synthesis, test generation, security analysis, …
ApproachesApproaches Bits: Every bit is represented individually
Words: View each word as arbitrary value
E.g., unbounded integersHistoric program verification work
Bit Vectors: Finite precision words
Captures true semantics of hardware and software
– 3 –
Data Path
Com.Log.
1
Com.Log.
2
Bit-Level Modeling
Represent Every Bit of State IndividuallyRepresent Every Bit of State Individually Behavior expressed as Boolean next-state over current state Historic method for most CAD, testing, and verification tools
E.g., model checkers, test generators
Control Logic
– 4 –
Bit-Level Modeling in Practice
StrengthsStrengths Allows precise modeling of system Well developed technology
BDDs & SAT for Boolean reasoning
LimitationsLimitations Every state bit introduces two Boolean variables
Current state & next state Overly detailed modeling of system functions
Don’t want to capture full details of FPU
Making It WorkMaking It Work Use extensive abstraction to reduce bit count Hard to abstract functionality
– 5 –
Word-Level Abstraction #1:Bits → Integers
View Data as Symbolic WordsView Data as Symbolic Words Arbitrary integers
No assumptions about size or encodingClassic model for reasoning about software
Can store in memories & registers
x0x1x2
xn-1
x
– 6 –
Data Path
Com.Log.
1
Com.Log.
2
Abstracting Data BitsControl Logic
Data Path
Com.Log.
1
Com.Log.
1? ?
What do we do about logic functions?
– 7 –
Word-Level Abstraction #2:Uninterpreted Functions
For any Block that Transforms or Evaluates Data:For any Block that Transforms or Evaluates Data: Replace with generic, unspecified function Only assumed property is functional consistency:
a = x b = y f (a, b) = f (x, y)
ALUf
– 8 –
Abstracting Functions
For Any Block that Transforms Data:For Any Block that Transforms Data: Replace by uninterpreted function Ignore detailed functionality Conservative approximation of actual system
Data Path
Control Logic
Com.Log.
1
Com.Log.
1F1 F2
– 9 –
Word-Level Modeling: History
HistoricHistoric Used by theorem provers
More RecentlyMore Recently Burch & Dill, CAV ’94
Verify that pipelined processor has same behavior as unpipelined reference model
Use word-level abstractions of data paths and memoriesUse decision procedure to determine equivalence
Bryant, Lahiri, Seshia, CAV ’02UCLID verifierTool for describing & verifying systems at word level
– 10 –
Pipeline Verification Example
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
Pipelined Processor
Reference Model
– 11 –
Abstracted Pipeline Verification
Pipelined Processor
Reference Model
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
F1
F2
F3
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
PC
F1
F2
F3
F1
F2
F3
– 12 –
Experience with Word-Level Modeling
Powerful Abstraction ToolPowerful Abstraction Tool Allows focus on control of large-scale system Can model systems with very large memories
Hard to Generate Abstract ModelHard to Generate Abstract Model Hand-generated: how to validate? Automatic abstraction: limited success
Andraus & Sakallah, DAC 2004
Realistic Features Break AbstractionRealistic Features Break Abstraction E.g., Set ALU function to A+0 to pass operand to output
DesireDesire Should be able to mix detailed bit-level representation with
abstracted word-level representation
– 13 –
Bit Vectors: Motivating Example #1
Do these functions produce identical results?Do these functions produce identical results?
StrategyStrategy Represent and reason about bit-level program behavior Specific to machine word size, integer representations, and
operations
int abs(int x) { int mask = x>>31; return (x ^ mask) + ~mask + 1;}
int test_abs(int x) { return (x < 0) ? -x : x; }
– 14 –
Motivating Example #3
Is there a way to expand the program sketch to Is there a way to expand the program sketch to make it match the spec?make it match the spec?
bit[W] popSpec(bit[W] x){ int cnt = 0; for (int i=0; i<W; i++) { if (x[i]) cnt++; } return cnt;}
AnswerAnswer W=16:
[Solar-Lezama, et al., ASPLOS ‘06]
bit[W] popSketch(bit[W] x){ loop (??) { x = (x&??) + ((x>>??)&??); } return x;}
x = (x&0x5555) + ((x>>1)&0x5555); x = (x&0x3333) + ((x>>2)&0x3333); x = (x&0x0077) + ((x>>8)&0x0077); x = (x&0x000f) + ((x>>4)&0x000f);
– 15 –
Motivating Example #4
Is pipelined microprocessor identical to sequential reference model?Is pipelined microprocessor identical to sequential reference model?
StrategyStrategy Represent machine instructions, data, and state as bit vectors
Compatible with hardware description language representation Verifier finds abstractions automatically
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
Reg.File
IF/ID
InstrMem
+4
PC ID/EX
ALU
EX/WB
=
=
RdRa
Rb
Imm
Op
Adat
Control Control
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
Reg.File
InstrMem
+4
ALU
RdRa
Rb
Imm
Op
Adat
Control
Bdat
Pipelined Microprocessor Sequential Reference Model
– 16 –
Bit Vector Formulas
Fixed width data wordsFixed width data words
Arithmetic operationsArithmetic operations Add/subtract/multiply/divide, … Two’s complement, unsigned, …
Bit-wise logical operationsBit-wise logical operations Bitwise and/or/xor, shift/extract, concatenate
PredicatesPredicates ==, <=
TaskTask Is formula satisfiable? E.g., a > 0 && a*a < 0
50000 * 50000 =-1794967296
(on 32-bit machine)
– 17 –
Decision Procedures Core technology for formal reasoning
Boolean SATBoolean SAT Pure Boolean formula
SAT Modulo Theories (SMT)SAT Modulo Theories (SMT) Support additional logic fragments Example theories
Linear arithmetic over reals or integersFunctions with equalityBit vectorsCombinations of theories
FormulaDecision
Procedure
Satisfying solution
Unsatisfiable(+ proof)
– 18 –
Recent Progress in SAT Solving3600
766
147 118 81 46 190
1,000
2,000
3,000
Run
-tim
e (s
ec.)
– 19 –
BV Decision Procedures:Some HistoryB.C. (Before Chaff)B.C. (Before Chaff)
String operations (concatenate, field extraction) Linear arithmetic with bounds checking Modular arithmetic
LimitationsLimitations Cannot handle full range of bit-vector operations
– 20 –
BV Decision Procedures:Using SATSAT-Based “Bit Blasting”SAT-Based “Bit Blasting”
Generate Boolean circuit based on bit-level behavior of operations
Convert to Conjunctive Normal Form (CNF) and check with best available SAT checker
Handles arbitrary operations
Effective in Many ApplicationsEffective in Many Applications CBMC [Clarke, Kroening, Lerda, TACAS ’04] Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV
’05] CVC-Lite [Dill, Barrett, Ganesh], Yices [deMoura, et al]
– 21 –
Bit-Vector ChallengeIs there a better way than bit blasting?Is there a better way than bit blasting?
RequirementsRequirements Provide same functionality as with bit blasting Find abstractions based on word-level structure Improve on performance of bit blasting
ObservationObservation Must have bit blasting at core
Only approach that covers full functionality Want to exploit special cases
Formula satisfied by small valuesSimple algebraic properties imply unsatisfiabilitySmall unsatisfiable coreSolvable by modular arithmetic…
– 22 –
Some Recent Ideas
Iterative ApproximationIterative Approximation UCLID: Bryant, Kroening, Ouaknine, Seshia, Strichman, Brady,
TACAS ’07 Use bit blasting as core technique Apply to simplified versions of formula Successive approximations until solve or show unsatisfiable
Using Modular ArithmeticUsing Modular Arithmetic STP: Ganesh & Dill, CAV ’07 Algebraic techniques to solve special case forms
Layered ApproachLayered Approach MathSat: Bruttomesso, Cimatti, Franzen, Griggio, Hanna,
Nadel, Palti, Sebastiani, CAV ’07 Use successively more detailed solvers
– 23 –
Iterative Approach Background: Approximating Formula
Example Approximation TechniquesExample Approximation Techniques Underapproximating
Restrict word-level variables to smaller ranges of values Overapproximating
Replace subformula with Boolean variable
Original Formula
+Overapproximation + More solutions:
If unsatisfiable, then so is
Underapproximation−
−
Fewer solutions:Satisfying solution also satisfies
– 24 –
Starting Iterations
Initial UnderapproximationInitial Underapproximation (Greatly) restrict ranges of word-level variables Intuition: Satisfiable formula often has small-domain solution
1−
– 25 –
First Half of Iteration
SAT Result for SAT Result for 1− Satisfiable
Then have found solution for Unsatisfiable
Use UNSAT proof to generate overapproximation 1+ (Described later)
1−If SAT, then done
1+
UNSAT proof:generate overapproximation
– 26 –
Second Half of Iteration
SAT Result for SAT Result for 1+ Unsatisfiable
Then have shown unsatisfiable Satisfiable
Solution indicates variable ranges that must be expandedGenerate refined underapproximation
1−
If UNSAT, then done1+
SAT:Use solution to generate refined underapproximation
2−
– 27 –
Iterative Behavior
UnderapproximationsUnderapproximations Successively more precise
abstractions of Allow wider variable ranges
OverapproximationsOverapproximations No predictable relation UNSAT proof not unique
1−
1+
2−
k−
2+
k+
– 28 –
Overall EffectSoundnessSoundness
Only terminate with solution on underapproximation
Only terminate as UNSAT on overapproximation
CompletenessCompleteness Successive underapproximations
approach Finite variable ranges guarantee
termination In worst case, get k−
1−
1+
2−
k−
2+
k+
SAT
UNSAT
– 29 –
Generating Overapproximation
GivenGiven Underapproximation 1− Bit-blasted translation of 1−
into Boolean formula Proof that Boolean formula
unsatisfiable
GenerateGenerate Overapproximation 1+ If 1+ satisfiable, must lead to
refined underapproximationGenerate 2− such that
1− 2−
1−
1+
UNSAT proof:generate overapproximation
2−
– 30 –
Bit-Vector Formula Structure DAG representation to allow shared subformulas
x + 2 z 1
x % 26 = v
w & 0xFFFF = x
x = y
Ç
Æ:
Ç
ÆÇ
a
– 31 –
Structure of Underapproximation
Linear complexity translation to CNFEach word-level variable encoded as set of Boolean variablesAdditional Boolean variables represent subformula values
x + 2 z 1
x % 26 = v
w & 0xFFFF = x
x = y
Ç
Æ:
Ç
ÆÇ
a −
RangeConstraints
wxyz
Æ
– 32 –
RangeConstraints
wxyz
Æ
UNSAT Proof Subset of clauses that is unsatisfiable Clause variables define portion of DAG Subgraph that cannot be satisfied with given range
constraints
x + 2 z 1
x % 26 = v
w & 0xFFFF = x
x = y
a
Ç
Æ
ÆÇ
Ç
:
– 33 –
Extracting Circuit from UNSAT Proof Subgraph that cannot be satisfied with given range
constraintsEven when replace rest of graph with unconstrained
variables
x + 2 z 1
x = y
a Æ
ÆÇ
Ç
:
b1
b2
RangeConstraints
wxyz
ÆUNSAT
– 34 –
Generated Overapproximation Remove range constraints on word-level variables Creates overapproximation
Ignores correlations between values of subformulas
x + 2 z 1
x = y
a Æ
ÆÇ
Ç
:
b1
b2
1+
– 35 –
Refinement PropertyClaimClaim
1+ has no solutions that satisfy 1−’s range constraintsBecause 1+ contains portion of 1− that was shown to be
unsatisfiable under range constraints
x + 2 z 1
x = y
a Æ
ÆÇ
Ç
:
b1
b2
RangeConstraints
wxyz
ÆUNSAT
1+
– 36 –
Refinement Property (Cont.)
ConsequenceConsequence Solving 1+ will expand range of some variables Leading to more exact underapproximation 2−
x + 2 z 1
x = y
a Æ
ÆÇ
Ç
:
b1
b2
1+
– 37 –
Effect of Iteration
Each Complete IterationEach Complete Iteration Expands ranges of some word-level variables Creates refined underapproximation
1−
1+
SAT:Use solution to generate refined underapproximation
2−
UNSAT proof:generate overapproximation
– 38 –
Approximation Methods
So FarSo Far Range constraints
Underapproximate by constraining values of word-level variables
Subformula eliminationOverapproximate by assuming subformula value arbitrary
General RequirementsGeneral Requirements Systematic under- and over-approximations Way to connect from one to another
Goal: Devise Additional Approximation StrategiesGoal: Devise Additional Approximation Strategies
– 39 –
Function Approximation Example
MotivationMotivation Multiplication (and division) are difficult cases for SAT
§: Prohibit Via Additional Range Constraints: Prohibit Via Additional Range Constraints Gives underapproximation Restricts values of (possibly intermediate) terms
§: Abstract as : Abstract as ff ((xx,,yy)) Overapproximate as uninterpreted function f Value constrained only by functional consistency
*x
y
x
0 1 else
y
0 0 0 0
1 0 1 x
else 0 y §
– 40 –
Challenges with Iterative ApproximationFormulating Overall StrategyFormulating Overall Strategy
Which abstractions to apply, when and where How quickly to relax constraints in iterations
Which variables to expand and by how much?Too conservative: Each call to SAT solver incurs costToo lenient: Devolves to complete bit blasting.
Predicting SAT Solver PerformancePredicting SAT Solver Performance Hard to predict time required by call to SAT solver Will particular abstraction simplify or complicate SAT?
Combination Especially DifficultCombination Especially Difficult Multiple iterations with unpredictable inner loop
– 41 –
STP: Linear Equation Solving Ganesh & Dill, CAV ’07 Solve linear equations over integers mod 2w
Capture range of solutions with Boolean Variables
Example ProblemExample Problem Variables: 3-bit unsigned integers
x: [x2 x1 x0] y: [y2 y1 y0] z: [z2 z1 z0] Linear equations: conjunction of linear constraints
General FormA x + b = 0 mod 2w
3x + 4y + 2z = 0 mod 82x + 2y + 2 = 0 mod 82x + 4y + 2z = 0 mod 8
– 42 –
Summary: Modeling LevelsBitsBits
Limited ability to scale Hard to apply functional abstractions
WordsWords Allows abstracting data while precisely representing control Overlooks finite word-size effects
Bit VectorsBit Vectors Realistic semantic model for hardware & software Captures all details of actual operation
Detects errors related to overflow and other artifacts of finite representation
Can apply abstractions found at word-level
– 43 –
Areas of Agreement
SAT-Based Framework Is Only Logical ChoiceSAT-Based Framework Is Only Logical Choice SAT solvers are good & getting better
Want to Automatically Exploit AbstractionsWant to Automatically Exploit Abstractions Function structure Arithmetic properties
E.g., associativity, commutativity Arithmetic reductions
E.g., LU decomposition
Base Level Should Be SATBase Level Should Be SAT Only semantically complete approach
– 44 –
Choices
Optimize for Special Formula ClassesOptimize for Special Formula Classes E.g., STP optimized for conjunctions of constraints
Common in software verification & testing
Iterative AbstractionIterative Abstraction Natural framework for attempting different abstractions Having SAT solver in inner loop makes performance tuning
difficult
Others?Others?
– 45 –
Observations
Bit-Vector Modeling Gaining in PopularityBit-Vector Modeling Gaining in Popularity Recognition of importance Benchmarks and competitions
Just Now Improving on Bit Blasting + SATJust Now Improving on Bit Blasting + SAT
Lots More Work to be DoneLots More Work to be Done
top related