the eager approach to smt - eecs at uc berkeleysseshia/219c/spr11/lectures/... · the eager...

1

The Eager Approach to SMT

Sanjit A. Seshia

UC Berkeley

Slides based on ICCAD ’09 Tutorial

S. A. Seshia 2

Eager Approach to SMT

Key Ideas:• Small-domain encoding

– Constrain model search

• Rewrite rules• Abstraction-based

methods (eager + lazy)

Example Solvers:UCLID, STP, Spear,

Boolector, Beaver, …

Input Formula

Boolean Formula

satisfiable unsatisfiable

Satisfiability-preserving Boolean Encoder

SAT Solver

EAGER ENCODING

SAT Solver involved in Theory Reasoning

2

S. A. Seshia 3

Theories

• Eager Encoding Methods have been demonstrated for the following Theories:– Equality & Uninterpreted Functions

– Integer Linear Arithmetic

– Restricted Lambda expressions • Arrays, memories, etc.

– Finite-precision Bit-Vector Arithmetic

– Strings

S. A. Seshia 4

UCLID Operation

OperationOperation–– Series of Series of

transformations transformations leading to Boolean leading to Boolean formulaformula

–– Each step is validity Each step is validity (satisfiability) (satisfiability) preservingpreserving

–– Each step performs Each step performs optimizationsoptimizations

LambdaExpansionfor Arrays

Encoding Arithmetic

BooleanSatisfiability

InputFormula

-freeFormula

Linear/ BitvectorArithmeticFormula

BooleanFormula

Function&

PredicateElimination

http://uclid.eecs.berkeley.edu

3

S. A. Seshia 5

Rewrites: Eliminating Function Applications

– Two applications of an uninterpreted function f in a formula

– f(x1) and f(x2)

AckermannAckermann’’s s EncodingEncoding

f(f(xx11)) vfvf11

f(f(xx22)) vfvf22

xx11== xx2 2 vfvf1 1 = = vfvf22

Bryant, German, Bryant, German, VelevVelev’’ssEncodingEncoding

f(f(xx11)) vfvf11

f(f(xx22))

ITE(ITE(xx11== xx22, vf, vf11, vf, vf22))

S. A. Seshia 6

The Small Model Property

• A Theory is said to have the small-model property if, given a formula , is satisfiable over the original domain D iffit is satisfiable over a domain S, where S is a function of , and |S| < |D|.

• Example: Integer Linear Arithmetic (QF_LIA)

4

S. A. Seshia 7

Small-Domain Encoding

• Consider an SMT formula (x1, x2, …, xn)where xi ∈ Di

• Small-domain encoding/Finite instantiation: Derive finite set Si ⊂ Di s.t. |Si| ¿ |Di|– In some cases, Si is finite where Di is infinite

• Encode each xi to take values only in Si

– Could be done by encoding to SAT

S. A. Seshia 8

Solving QF_LIA is NP-complete

• In NP:– If a satisfying solution exists, then one exists

within a bound d• log d is polynomial in input size

– Expression for d [Papadimitriou, ‘82]

(n+m) · (bmax +1) · ( m · amax ) 2m+3

– Input size:• m – # constraints

• n – # variables

• bmax – largest constant (absolute value)

• amax – largest coefficient (absolute value)

5

S. A. Seshia 9

Small-domain encoding / Finite Instantiation: Naïve approach

• Steps– Calculate the solution bound d– Encode each integer variable with d log d e bits

& translate to Boolean formula

– Run SAT solver

• Problem: For QF_LIA, d is ( m m )– ( m log m ) bits per variable

• Solution: Exploit special-cases and domain-specific structure

S. A. Seshia 10

Special Case 1: Equality Logic

• Linear constraints are equalities xi = xj

• Result: d = n

xx11 xx22 ∧∧ xx22 xx33 ∧∧ xx11 xx33

33--valued domain is needed: {1, 2, 3}valued domain is needed: {1, 2, 3}

xx11 xx22 ∧∧ xx22 xx33 ∧∧ xx11 xx33

Can find solution with domain {1, 2}Can find solution with domain {1, 2}

[Pnueli et al., Information and Computation, 2002]

6

S. A. Seshia 11

Special Case 2: Difference Logic

• Boolean combination of difference-bound constraints– xi ≥ xj + b, ± xi ≥ b

• Result: d = n · (bmax + 1)[Bryant, Lahiri, Seshia, CAV’02]

• Proof sketch: satisfying solution corresponds to shortest path in constraint graph– Longest such path has length · n · (bmax + 1)

• Tighter formula-specific bounds possible

S. A. Seshia 12

Special Case 3: Generalized 2SAT

• Generalized 2SAT constraints– xi + xj ≥ b, - xi - xj ≥ b, xi - xj ≥ b, xi ≥ b

• d = 2 · n · (bmax + 1) [Seshia, Subramani, Bryant,’04]

7

S. A. Seshia 13

Full Integer Linear Arithmetic

• Can we avoid the mm blow-up?

• In fact, yes. The idea is to derive a new parameterized solution bound d– Formalize parameters that the bound really

depends on

– Parameters characterize sparse structure • Occurs especially in software verification; also in

many high-level hardware models

– [Seshia & Bryant, LICS’04, LMCS’05]

S. A. Seshia 14

Structure of Linear Constraints in Software Verification

• Characteristics of studied benchmarks– Mostly difference constraints

• Only 3% of constraints were NOT difference constraints

– Non-difference constraints are sparse• At most 6 variables per constraint (total number of

variables in 1000s)

• Some similar observations: Pratt’77, ESC/Java-Simplify-TR’03

8

S. A. Seshia 15

Parameterized Solution Bound

max |coefficient|

amax

max |constant|bmax

#variablesn

#constraintsm

New parameters: New parameters: –– kk nonnon--difference constraints, difference constraints,

–– ww variables per constraint (width)variables per constraint (width)

Our solution bound:Our solution bound:

n n ·· ((bbmaxmax +1) +1) ·· ( ( ww ·· aamaxmax ) ) kk

Previous:Previous:((nn++mm) ) ·· ((bbmaxmax +1) +1) ·· ( ( mm ·· aamaxmax ) ) 22mm+3+3

•• Direct dependence on Direct dependence on mm eliminatedeliminated(and (and kk ¿¿ mm ))

S. A. Seshia 16

Example

x1 - x2 ≥ 1

∨

∧

¬

∨

x1 + 2 x2 + x3 > -3

x2 – x4≥ 0

3widthw

1#non-differencek

2max |coefficient|amax

max |constant|

#variables

#constraints

3bmax

4n

3m

d = d = 9696

PreviousPrevious d d = = 282,175,488282,175,488

9

S. A. Seshia 17

Summary of d Values

n · (bmax + 1) · (amaxk · w k)Full Integer Linear

Arithmetic

2 · n · ( bmax + 1 )Generalized 2SAT logic

n · ( bmax + 1 )Difference logic

nEquality logic

Solution Bound dLogic

– 18 –

Proof of Our Bound: StepsProof of Our Bound: Steps

1.1. Previous result for integer linear programming Previous result for integer linear programming (ILP) (ILP) by by BoroshBorosh--TreybigTreybig--FlahiveFlahive [[’’76, 76, ’’86]86]

2.2. Express above result in Express above result in kk and and ww, in addition to , in addition to other parametersother parameters

3.3. Derive QFP bound from ILP boundDerive QFP bound from ILP bound

10

– 19 –

Integer Linear Programming (ILP) NotationInteger Linear Programming (ILP) Notation

A xA x = = b b , , xx ≥≥ 00

m

n

amax = maxi,j |aij | bmax = maxi |bi |

– 20 –

Borosh-Treybig-Flahive Result [1986]Borosh-Treybig-Flahive Result [1986]

Solution bound Solution bound dd is is

((nn+2) +2) ··

wherewhere = largest sub= largest sub--determinant of [determinant of [AA | | bb] ] (abs. value)(abs. value)

Problem: Exponentially many subProblem: Exponentially many sub--determinants!determinants!

11

– 21 –

Matrix StructureMatrix Structure

m

n

Non-difference constraintsk

w non-zeroes per row

– 22 –

k = 0 : Only Difference Constraintsk = 0 : Only Difference Constraints

xxii -- xxjj ≥≥ bb, , ±± xxii ≥≥ bb

Totally Unimodular: All subdeterminants are in {0, -1, +1}

· i | bi | · min(n+1, m) · bmax

12

– 23 –

Arbitrary kArbitrary k

· w

· k

Det. ∈{0,±1}

Each term · amaxk #Terms · w k

· i | bi | (amaxk · w k)

· min(n+1, m) · bmax · (amaxk · w k)

– 24 –

Bound for ILPBound for ILP

·· min(min(nn+1, +1, mm) ) ·· bbmaxmax ·· ((aamaxmaxkk ·· w w kk))

dd = (= (nn+2) +2) ·· [[BoroshBorosh--TreybigTreybig--FlahiveFlahive]]

= (= (nn+2) +2) ·· min(min(nn+1, +1, mm) ) ·· bbmaxmax ·· ((aamaxmaxkk ·· w w kk))

·· ((nn+2) +2) ·· nn ·· bbmaxmax ·· ((aamaxmaxkk ·· w w kk))

(assuming (assuming mm ·· nn))

13

– 25 –

QFP Bound from ILP BoundQFP Bound from ILP Bound

Consider DNF of arbitrary QFP formula Consider DNF of arbitrary QFP formula = = 11 ∨∨ 22 ∨∨ . . . . . . ∨∨ NN

Satisfying assignment to Satisfying assignment to must satisfy some must satisfy some ii

Each Each ii is an ILP is an ILP –– Parameters of Parameters of ii are bounded by those ofare bounded by those of

Therefore: Therefore: d = d = ((nn+2) +2) ·· nn ·· bbmaxmax ·· ((aamaxmaxkk ·· w w kk))

– 26 –

Summary of d ValuesSummary of d Values

((nn+2) +2) ·· n · (bmax + 1) · (amaxk · w k)QuantifierQuantifier--Free Free

PresburgerPresburger logiclogic

2 2 ·· nn ·· ( ( bbmaxmax + 1 )+ 1 )Generalized 2SAT logicGeneralized 2SAT logic

nn ·· ( ( bbmaxmax + 1 )+ 1 )Separation logicSeparation logic

nnEquality logicEquality logic

ddLogicLogic

Note: A paper by Sergei Veselov proved that the bound of Borosh-Treybig-Flahive has been improved from (n+2) · to simply . This eliminates the (n+2) term from the last entry in the table above.

14

S. A. Seshia 27

Abstraction-Based Methods

• For some logics, one cannot easily compute a closed-form expression for the small domain

• Example: Bit-Vector Arithmetic

• In such cases, an abstraction-refinement approach can be used to compute formula-specific small domains

S. A. Seshia 28

Bit-Vector Arithmetic: Some History

• B.C. (Before Chaff)– String operations (concatenate, field extraction)

– Linear arithmetic with bounds checking

– Modular arithmetic

• SAT-Based “Bit Blasting”– Generate Boolean circuit based on bit-level behavior of

operations• Handles arbitrary operations

– Check with best available SAT solver

– Effective in many applications• CBMC [Clarke, Kroening, Lerda, TACAS ’04]

• Microsoft Cogent + SLAM [Cook, Kroening, Sharygina, CAV ’05]

15

S. A. Seshia 29

Research Challenge

• Is there a better way than bit blasting?

• Requirements– Provide same functionality as with bit blasting

• Must support all bit-vector operators

– Exploit word-level structure

– Improve on performance of bit blasting

• Current Approaches based on two core ideas:1. Simplification: Simplify input formula using word-level

rewrite rules and solvers

2. Abstraction: Can use automatic abstraction-refinement to solve simplified formula

S. A. Seshia 30

Bit-Vector SMT Solvers, circa Spr.’2009

Current Techniques with Sample Tools – Proof-based abstraction-refinement – UCLID [Bryant et al.,

TACAS ’07]

– Solver for linear modular arithmetic to simplify the formula –STP [Ganesh & Dill, CAV’07]

– Automatic parameter tuning for SAT– Spear [Hutter et al., FMCAD ’07]

– Rewrites, underapproximation, efficient SAT engine –Boolector [Brummayer & Biere, TACAS’09]

– Equality/constant propagation, logic optimization, special rulesfor non-linear ops - Beaver [Jha et al., CAV’09]

– DPLL(T) framework: Layered approach, rewriting – CVC3[Barrett et al.], MathSAT [Bruttomesso et al], Yices [Dutertre et al.], Z3 [de Moura et al]

16

S. A. Seshia 31

Abstraction-Refinement• Deciding Bit-Vector Arithmetic with Abstraction

[Bryant et al., TACAS ’07, STTT ’09]

– Use bit blasting as core technique

– Apply to simplified versions of formula: under and over approximations

– Generate successive approximations until a solution is found or formula shown unsatisfiable

– Inspired by McMillan & Amla’s proof-based abstraction for finite-state model checking

• Small Motivating Example:

(x + y y + x) ∧ (x * y y * x)– Sufficient to prove the left-hand conjunct unsat

S. A. Seshia 32

Approximations to Formula

• Example Approximation Techniques

– Underapproximating• Restrict word-level variables to smaller ranges of values

– Overapproximating• Replace subformula with Boolean variable

Original Formula

+Overapproximation + More solutions:

If unsatisfiable, then so is

Underapproximation−

−

Fewer solutions:Satisfying solution also satisfies

17

S. A. Seshia 33

Starting Iterations

• Initial Underapproximation

– (Greatly) restrict ranges of word-level variables

– Intuition: Satisfiable formula often has small-domain solution

1−

S. A. Seshia 34

First Half of Iteration

• SAT Result for 1−

– Satisfiable

• Then have found solution for – Unsatisfiable

• Use UNSAT proof to generate overapproximation 1+

1−If SAT, then done

1+

UNSAT proof:generate overapproximation

18

S. A. Seshia 35

Second Half of Iteration

• SAT Result for 1+

– Unsatisfiable: then have shown unsatisfiable

– Satisfiable: solution indicates variable ranges that must be expanded

• Generate refined underapproximation

1−

If UNSAT, then done1+

SAT:Use solution to generate refined underapproximation

2−

S. A. Seshia 36

Example

:= (x = y+2) ∧ (x2 > y2)

1− := (x[1] = y[1]+2) ∧ (x[1]2 > y[1]

2)

2− := (x[2] = y[2]+2) ∧ (x[2]2 > y[2]

2)

1+ := (x = y+2)

SAT, done.

UNSATLook at proof

SATx = 2, y = 0

19

S. A. Seshia 37

Iterative Behavior

• Underapproximations– Successively more precise

abstractions of – Allow wider variable ranges

• Overapproximations– No predictable relation

– UNSAT proof not unique

1−

1+

2−

k−

2+

k+

S. A. Seshia 38

Overall Effect

• Soundness– Only terminate with solution on

underapproximation

– Only terminate as UNSAT on overapproximation

• Completeness– Successive underapproximations

approach – Finite variable ranges guarantee

termination

• In worst case, get k−

SAT

UNSAT

1−

1+

2−

k−

2+

k+

20

S. A. Seshia 39

Summary of Key Ideas

S. A. Seshia 40C. Barrett & S. A. Seshia ICCAD 2009 Tutorial 40

Summary of Ideas: Lazy Methods

• Philosophy: Extend DPLL framework from SAT to SMT

• Literals assigned by SAT are sent to Theory Solver

• Theory Solver determines if literals are satisfiable in the theory

• Key optimizations: small explanations, early conflict detection, theory propagation

21

S. A. Seshia 41

Summary of Ideas: Eager Methods

• Philosophy: Constrain solution space with theory-specific solution bounds

• Small-domain encoding– Compute bounds that work for any formula in

the logic

• Abstraction-refinement of domains– Compute formula-specific small domains

• Rewrite rules: high level and bit level– Simplify formula before and after bit-blasting

S. A. Seshia 42

Challenges and Opportunities

• Solvers for new theories– Strings

– Non-linear arithmetic

– Can we exploit domain-specific structure?

• Parallel SMT

• Better support for quantifiers

• Better proof/interpolant generation

the eager approach to smt - eecs at uc berkeleysseshia/219c/spr11/lectures/... · the eager...

Documents