Hierarchic Superposition With Weak Abstraction and the Beagle Theorem Prover
Peter Baumgartner, NICTA and ANU, Canberra
Uwe Waldmann, MPI für Informatik, Saarbrücken
Baumgartner/Waldmann Hierarchic superposition with weak abstraction
Goal
Automated deduction in hierarchic combinations of specifications
Lists over integers
(l ≈ nil) ∨ (l ≈ cons(head(l), tail(l)))
¬(cons(k, l) ≈ nil)
head(cons(k, l)) ≈ k
tail(cons(k, l)) ≈ l
The inRange predicate, e.g. inRange([1,0,5], 6)
inRange(l, n) ↔ (l ≈ nil ∨ (0 ≤ head(l) < n ∧ inRange(tail(l), n)))
Conjecture ∀ l:list n:int (¬(l ≈ nil) → (inRange(l, n) → inRange(cons(head(l), l), n)))
LIA + Lists/Arrays + Hypotheses ⊨ Conjecture ?
LIA + Lists/Arrays + Hypotheses ⊭ Conjecture ?
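To make the inRange definition and the conjecture concrete, here is a minimal Python sketch that evaluates both on small lists (tuples stand in for lists). The names `in_range`, `conjecture` and `bounded_check` are illustrative and not part of any prover. A bounded search like this can only hunt for counterexamples; it does not replace the proof that a hierarchic prover is meant to deliver.

```python
from itertools import product

def in_range(l, n):
    """inRange(l, n) <-> l = nil or (0 <= head(l) < n and inRange(tail(l), n))."""
    if not l:                       # l ≈ nil
        return True
    head, tail = l[0], l[1:]
    return 0 <= head < n and in_range(tail, n)

def conjecture(l, n):
    """¬(l ≈ nil) → (inRange(l, n) → inRange(cons(head(l), l), n))."""
    if not l:                       # premise l ≉ nil fails, implication holds
        return True
    return (not in_range(l, n)) or in_range((l[0],) + l, n)

def bounded_check(max_len=3, values=range(-1, 4), bounds=range(-1, 5)):
    """Return a counterexample (l, n) within the given bounds, or None."""
    for length in range(max_len + 1):
        for l in product(values, repeat=length):
            for n in bounds:
                if not conjecture(l, n):
                    return l, n
    return None

print(in_range((1, 0, 5), 6))   # the slide's example inRange([1,0,5], 6)
print(bounded_check())          # None means no counterexample in the bounds
```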
Contents
Semantics
- Hierarchic specifications
- Sufficient completeness
Hierarchic superposition
- Weak abstraction
- Two kinds of variables
- Definitions
The Beagle theorem prover
- Overview
- Experimental results
Hierarchic Specifications
Background (BG) specification consists of
sorts, e.g. { int }
operators, e.g. { 0, 1, -1, 2, -2, ..., -, +, >, ≥}
models, e.g. linear integer arithmetic
Foreground (FG) specification extends BG specification by
new sorts, e.g. { list }
new operators, e.g.
{ cons: int × list ↦ list, nil: list, length: list ↦ int, a: list }
first-order clauses, e.g.
{ length(a) ≥ 1, length(cons(x, y)) ≈ length(y)+1 }
Hierarchic Specifications
Assumption
We have a decision procedure for quantified formulas over the BG specification
Goal
Check whether the hierarchic combination has models or not, using the BG decision procedure as a subroutine
Question
What is a model of the hierarchic combination?
Hierarchic Specifications
Models of hierarchic specifications
Must satisfy the FG clauses, and
must leave the interpretation of the BG sorts and operators unchanged (conservative extension):
- distinct BG elements may not be identified (no confusion), and
- no new elements may be added to BG sorts (no junk)
Fundamental problem 1
Absence of junk is not r.e.
⇒ Refutational completeness is only possible in certain cases
⇒ Require sufficient completeness
Sufficient Completeness
In every model of the FG clauses, every ground FG term that has a BG sort must be equal to some BG term
Example
(l ≈ nil) ∨ (l ≈ cons(head(l), tail(l)))
¬(cons(k, l) ≈ nil)
head(cons(k, l)) ≈ k
tail(cons(k, l)) ≈ l
is not sufficiently complete:
take BG domain ℤ ∪ { NaN } and evaluate head(nil) to NaN
Adding head(nil) ≈ 0 and tail(nil) ≈ nil makes it sufficiently complete
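The junk model from the example can be sketched in a few lines of Python. Lists are tuples, and `JUNK`, `make_model` and `axioms_hold` are hypothetical names introduced only for this illustration. The four list axioms say nothing about head(nil), so an interpretation may send it to an element outside ℤ; the check below covers small lists only and is a sanity test, not a proof.

```python
from itertools import product

JUNK = "NaN"   # hypothetical extra element added to the BG domain

def make_model(head_of_nil):
    """Interpret lists as tuples; the value of head(nil) is a free choice."""
    def head(l):
        return l[0] if l else head_of_nil
    def tail(l):
        return l[1:] if l else ()
    def cons(k, l):
        return (k,) + l
    return head, tail, cons

def axioms_hold(head, tail, cons, values=(-1, 0, 1), max_len=2):
    """Check the four list axioms on all lists up to max_len over values."""
    lists = [l for n in range(max_len + 1) for l in product(values, repeat=n)]
    for l in lists:
        if not (l == () or l == cons(head(l), tail(l))):
            return False
        for k in values:
            if cons(k, l) == () or head(cons(k, l)) != k or tail(cons(k, l)) != l:
                return False
    return True

head, tail, cons = make_model(JUNK)
print(axioms_hold(head, tail, cons), head(()))   # axioms hold, head(nil) is junk
```

With `make_model(0)` the same axioms hold and head(nil) evaluates to the integer 0, which is the interpretation that the added axiom head(nil) ≈ 0 forces.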
Hierarchic Specifications
Fundamental Problem 2
We can pass only finite sets of formulas to the BG decision procedure
Second Requirement for Completeness
Compactness: If an infinite set of BG formulas is unsatisfiable, then it has a finite unsatisfiable subset
LIA with global symbolic constants α (parameters) is not compact:
take { α ≥ 0, α ≥ 1, α ≥ 2, ... }
LIA without parameters is compact
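The non-compactness example can be checked by hand, and the following Python sketch spells out the argument. The function names are illustrative. Every finite subset of { α ≥ 0, α ≥ 1, α ≥ 2, ... } is satisfied by taking α as the largest bound that occurs, while any fixed integer α violates some formula of the full set.

```python
def satisfies(alpha, bounds):
    """Does alpha satisfy every formula 'alpha >= b' for b in bounds?"""
    return all(alpha >= b for b in bounds)

def witness(bounds):
    """A value of alpha satisfying a finite set of bounds."""
    return max(bounds, default=0)

def violated_bound(alpha):
    """A bound b >= 0 such that 'alpha >= b' fails for this candidate alpha."""
    return max(alpha + 1, 0)

finite_subset = [0, 1, 2, 7]
print(satisfies(witness(finite_subset), finite_subset))   # finite subset is satisfiable
print(satisfies(41, [violated_bound(41)]))                # the candidate 41 fails
```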
Calculi for Hierarchic Reasoning
If the FG clauses are ground
DPLL(T) + Nelson-Oppen
(Neither sufficient completeness nor compactness poses problems)
If the FG clauses are not ground
DPLL(T) + Nelson-Oppen + instantiation heuristics (CVC4, Z3,...)
Hierarchic superposition [Bachmair Ganzinger Waldmann 1994, Althaus Weidenbach Kruglov 2009, Weidenbach Kruglov 2012]
Model evolution with LIA constraints [B Tinelli 2008, 2011]
Sequent calculus [Rümmer 2008]
Theory instantiation [Korovin 2006]
LASCA [Korovin Voronkov 2007]
Hierarchic superposition with weak abstraction [B Waldmann 2013]
Hierarchic Superposition with Weak Abstraction
Calculus Layout
Input clause set
Weak abstraction
Core calculus: Superposition, Close, Weak abstraction, Simplification, Splitting, Define
(Weak) Abstraction
Unification cannot detect "semantic equality" of BG terms
P(1+2)   ¬P(2+1)   ?
- Abstraction extracts BG terms in terms of new variables
- FG literals are subject to superposition inferences
- BG clauses are passed to the BG solver, in Close inferences
Abstraction: P(X) ∨ X≉1+2   ¬P(Y) ∨ Y≉2+1
Sup: X≉1+2 ∨ X≉2+1
Close: □
ARI595=1.p
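The three steps of this example (abstraction, superposition on the FG literals, Close on the remaining BG clause) can be traced with a small Python sketch. Everything here is an illustrative simplification: `evaluate` stands in for the BG solver and handles only ground sums, and `resolve` handles only the unary predicate P from the slide.

```python
def evaluate(term):
    """Evaluate a ground BG term given as an int or a nested ('+', s, t) tuple."""
    if isinstance(term, int):
        return term
    op, left, right = term
    assert op == "+"
    return evaluate(left) + evaluate(right)

def abstract(sign, bg_term, var):
    """Turn the literal (¬)P(bg_term) into the clause (¬)P(var) ∨ var ≉ bg_term."""
    return {"literal": (sign, "P", var), "constraints": [(var, bg_term)]}

def resolve(pos, neg):
    """Unify P(X) with P(Y) by renaming Y to X; only BG constraints remain."""
    x = pos["literal"][2]
    return [(x, t) for _, t in pos["constraints"] + neg["constraints"]]

def close(constraints):
    """The clause X≉t1 ∨ ... ∨ X≉tn is refuted if a single X equals every ti."""
    values = {evaluate(t) for _, t in constraints}
    return len(values) == 1

c1 = abstract(True, ("+", 1, 2), "X")     # P(X) ∨ X≉1+2
c2 = abstract(False, ("+", 2, 1), "Y")    # ¬P(Y) ∨ Y≉2+1
bg_clause = resolve(c1, c2)               # X≉1+2 ∨ X≉2+1
print(bg_clause)
print(close(bg_clause))                   # a refutation exists iff this holds
```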
Weak Abstraction
Only non-variable BG terms that are direct subterms of non-BG terms are abstracted out
Concrete numbers (0, 1, -1, 2, -2, ...) are never abstracted out
Example (α and β are BG constants)
g(1, α, f(1)+(α+1), Z) ≈ β ↝ g(1, X, f(1)+ Y, Z) ≈ β ∨ X ≉ α ∨ Y ≉ (α+1)
Properties (in relation to [BGW 94])
Extracts fewer terms: less detrimental to unification
Shorter clauses: preserves unit clauses more often (good for rewriting)
Always preserves sufficient completeness (see below)
Inference rules can destroy WA, hence need explicit WA of conclusion
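The weak abstraction example above can be reproduced with a short Python sketch. It is a simplification under stated assumptions: terms are nested tuples, numerals are Python ints, the only BG operator is `+`, the BG constants are α and β, every other string is a variable, and fresh variables are named X1, X2, ... instead of X, Y. A pure BG subterm is abstracted when it is neither a numeral nor a variable and its parent either has a non-BG top symbol or has some argument that is not a pure BG term.

```python
from itertools import count

BG_OPS = {"+"}            # assumed BG operators
BG_CONSTS = {"α", "β"}    # assumed BG constants (parameters)

def is_var(t):
    return isinstance(t, str) and t not in BG_CONSTS

def is_pure_bg(t):
    """True if t is built only from BG operators, BG constants, numerals, variables."""
    if isinstance(t, (int, str)):
        return True
    return t[0] in BG_OPS and all(is_pure_bg(s) for s in t[1:])

def weak_abstract(term, fresh, constraints):
    """Replace target BG subterms by fresh variables, recording var ≉ subterm."""
    if isinstance(term, (int, str)):
        return term
    op, args = term[0], term[1:]
    mixed = op not in BG_OPS or not all(is_pure_bg(s) for s in args)
    new_args = []
    for s in args:
        target = (mixed and is_pure_bg(s)
                  and not isinstance(s, int) and not is_var(s))
        if target:
            var = next(fresh)
            constraints.append((var, s))
            new_args.append(var)
        else:
            new_args.append(weak_abstract(s, fresh, constraints))
    return (op, *new_args)

# Left-hand side of g(1, α, f(1)+(α+1), Z) ≈ β
lhs = ("g", 1, "α", ("+", ("f", 1), ("+", "α", 1)), "Z")
constraints = []
result = weak_abstract(lhs, (f"X{i}" for i in count(1)), constraints)
print(result)        # g(1, X1, f(1)+X2, Z)
print(constraints)   # X1 ≉ α, X2 ≉ α+1
```

The right-hand side β is left alone because it is not a subterm of a non-BG term, and the numerals 1 stay in place as concrete numbers.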
Two Kinds of Variables
Abstraction Variables X, Y, Z
Stand for BG terms
↝ Never unify with non-variable FG terms
Pro: BG terms are always smaller than FG terms
E.g., f(X) ≈ g(Y) is ordered from left to right if f > g
Con1: Subsumption does not work as expected
E.g., P(X) does not subsume P(f(Y))
Con2: Unexpectedly don't get refutations
E.g. f(nil) + 1 ≉ Y + 1
May even destroy sufficient completeness during abstraction
Ordinary variables fix these problems
Two Kinds of Variables
Ordinary Variables x, y, z
Stand for arbitrary BG-sorted terms
↝ May also unify with non-variable FG terms
Con: fewer ordered equations
E.g., f(x) ≈ g(y) is not ordered from left to right even if f > g
Pro1: subsumption works as expected
E.g., P(x) subsumes P(f(y))
Pro2: may get refutations even in absence of s.c.
E.g. f(nil) + 1 ≉ y + 1
Always preserves sufficient completeness during abstraction (use ordinary variables only if abstracted term contains ordinary variables)
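The difference between the two kinds of variables comes down to what a variable may be bound to. The Python sketch below encodes that restriction under simplifying assumptions: upper-case names are abstraction variables, lower-case names are ordinary variables, f and g are the only FG operators, and an abstraction variable may only be bound to a term without FG operators. The helper names are hypothetical.

```python
FG_OPS = {"f", "g"}   # assumed FG operators

def contains_fg(t):
    """True if the term has an FG operator anywhere inside it."""
    if isinstance(t, tuple):
        return t[0] in FG_OPS or any(contains_fg(s) for s in t[1:])
    return False

def can_bind(var, term):
    """Abstraction variables (upper case) stand for BG terms only;
    ordinary variables (lower case) may be bound to any BG-sorted term."""
    if var[0].isupper():
        return not contains_fg(term)
    return True

def subsumes_unary(general, instance):
    """Does P(v) subsume P(t)? Only the single argument needs matching."""
    (p, v), (q, t) = general, instance
    return p == q and can_bind(v, t)

print(subsumes_unary(("P", "X"), ("P", ("f", "Y"))))   # abstraction variable
print(subsumes_unary(("P", "x"), ("P", ("f", "y"))))   # ordinary variable
print(can_bind("Y", ("f", "nil")), can_bind("y", ("f", "nil")))
```

The last line mirrors the f(nil) + 1 ≉ Y + 1 example: unifying the two sides needs the binding of the variable to f(nil), which is only allowed for the ordinary variable y.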
Two Kinds of Variables
Which kind of variables does the input problem use?
Ordinary variables: { x ≉ f(1) } has a refutation
Abstraction variables: { X ≉ f(1) } does not have a refutation
A: there is a trade-off, see above, so let the user decide.
In practice, most variables are abstraction variables, and ordinary variables are only used in additional lemmas:
Lemmas
Valid BG theory clauses, make BG knowledge available to FG reasoner
E.g. ¬(x<x), x+0 ≈ x, ¬(x<y) ∨ ¬(y<z) ∨ x<z
Used to simplify, e.g., f(1)<f(1), length(nil)+0
Definitions
In general, one cannot make an arbitrary hierarchic specification sufficiently complete by construction
We can, however, prevent a ground BG-sorted FG term t from being interpreted as a junk element:
- introduce a new parameter, i.e., a new BG constant α_t
- add the definition t ≈ α_t
This is a well-known preprocessing technique [Kruglov Weidenbach 12]
However, in hierarchic superposition ground terms can show up in the middle of the saturation process
↝ use introduction of definitions as an inference rule
f(X)>5 ∨ X≉1+2
f(X) ≈ α_{f(1+2)} ∨ X≉1+2
Main Theoretical Results [B Waldmann CADE 2013]
Completeness 1
HSPWA is refutationally complete for compact BG specifications and sufficiently complete input clause sets
Completeness 2
HSPWA is refutationally complete for input clause sets where every BG-sorted term is ground
The Beagle Prover
• Full implementation of the calculus above
– Lemmas, ordinary/abstraction variables, definitions, splitting
– Discount/Otter loop, demodulation, subsumption
– LPO/KBO, W/A ratio
– Cautious and aggressive simplification, e.g. 1+(2+a) ≉ 1+x simplifies to 3+a ≉ 1+x
• Front-end for TPTP TFA and SMT-Lib languages
• Background reasoners
– LRA: Fourier-Motzkin, Simplex
– LIA: Cooper QE, Branch and bound
• Written in Scala, easy to install
http://users.cecs.anu.edu.au/~baumgart/systems/beagle/
• Team: PB, A Bauer, J Bax, T Cosgrove
• Companion system: SMTtoTPTP
http://users.cecs.anu.edu.au/~baumgart/systems/smttotptp/
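The simplification example 1+(2+a) ≉ 1+x ↝ 3+a ≉ 1+x amounts to folding the numerals of a sum. The sketch below shows that one step in Python on nested `('+', s, t)` tuples. It is an illustration only; it is not Beagle's Scala code and makes no claim about which of the cautious or aggressive modes performs this step.

```python
def flatten_sum(term):
    """Collect the summands of a nested ('+', s, t) term, left to right."""
    if isinstance(term, tuple) and term[0] == "+":
        return flatten_sum(term[1]) + flatten_sum(term[2])
    return [term]

def fold_constants(term):
    """Add up all numerals in a sum and put the result in front."""
    summands = flatten_sum(term)
    number = sum(s for s in summands if isinstance(s, int))
    rest = [s for s in summands if not isinstance(s, int)]
    # Drop a leading 0 when symbolic summands remain.
    result = number if (number != 0 or not rest) else None
    for s in rest:
        result = s if result is None else ("+", result, s)
    return result

print(fold_constants(("+", 1, ("+", 2, "a"))))   # left side:  1+(2+a)
print(fold_constants(("+", 1, "x")))             # right side: 1+x is unchanged
```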
User Experience
There is a number in [a,...,a+2] that is divisible by 3
$ beagle ARI595=1.p
This is beagle, version 0.7.1 (2/10/2013)

Input formulas
==============
¬((∀ Zᵃ:$int (((a ≤ Zᵃ) ∧ (Zᵃ ≤ (a + 2))) ⇒ p(Zᵃ))) ⇒ (∃ Xᵃ:$int p((3·Xᵃ))))

Clause set signature
====================
Background sorts: {ℚ, ℤ, ℝ}
Foreground sorts: {$i, $o, $tType}
Background operators:
  $greatereq: ℤ × ℤ → $o
  a: ℤ
Foreground operators:
  $true: $o
  p: ℤ → $o
  $false: $o
User Experience
Precedence among foreground operators: p > $true > $false

Clause set
==========
p(Zᵃ) ∨ ¬(Zᵃ ≤ (a + 2)) ∨ ¬(a ≤ Zᵃ)
¬p((3·Xᵃ))

Background sorts used in clause set: ℤ
Used background theory solver: cooper-clauses

Proving...
p(Zᵃ) ∨ ¬(Zᵃ ≤ (2 + a)) ∨ ¬(a ≤ Zᵃ)
¬p((3·Xᵃ))
¬((3·X_13ᵃ) ≤ (2 + a)) ∨ ¬(a ≤ (3·X_13ᵃ))
¬(3|a)
¬(3|(2 + a))
¬(3|(1 + a))

SZS status Theorem for ARI595=1.p

Inference rules
----------------------
...
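The last three derived clauses, ¬(3|a), ¬(3|(2 + a)) and ¬(3|(1 + a)), cannot hold together for any integer a, which is what lets the prover close the proof. A quick Python check over a finite range illustrates this; since divisibility by 3 depends only on a mod 3, three consecutive values already cover all cases. The function names are illustrative.

```python
def has_multiple_of_3(a):
    """Is some number in [a, ..., a+2] divisible by 3?"""
    return any(z % 3 == 0 for z in range(a, a + 3))

def final_clauses_satisfiable(a):
    """Can ¬(3|a), ¬(3|(2+a)) and ¬(3|(1+a)) all hold for this a?"""
    return a % 3 != 0 and (2 + a) % 3 != 0 and (1 + a) % 3 != 0

print(all(has_multiple_of_3(a) for a in range(-30, 30)))
print(any(final_clauses_satisfiable(a) for a in range(-30, 30)))
```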
Beagle on TPTP
LIA-Theorems in TPTP 337
Full Abstraction [BGW94] 242 proved
Weak Abstraction 251 proved
WA + Definitions 297 proved
WA + Definitions + Aggressive Simplification 303 proved
Two more theorems can only be proved when BG axioms are added, but adding BG axioms is (obviously) a double-edged sword and not generally helpful
Proving Infinite Satisfiability [B Bax LPAR-19]
• Given the LIST axioms over integers
• Suppose a set HYP defining functions/relations on lists
E.g. length, in, inRange, count, append
• Suppose we know that LIST ∪ HYP is satisfiable (by construction)
• Then, to disprove a conjecture CON, i.e.
LIST ∪ HYP ⊭ CON
it suffices to prove
LIST ∪ HYP ⊨ ¬CON
• Same for ARRAY
• Use this method in the following result tables, for all provers
– Directly establishing countersatisfiability does not work at all
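The idea of disproving CON by establishing ¬CON can be illustrated with a brute-force search in the intended list model. This is not what the provers do (they derive ¬CON symbolically from LIST ∪ HYP); the Python sketch below only shows what a witness for ¬CON looks like. The conjecture inRange(n, tail(l)) ⇒ inRange(n, l) is used as the example, with inRange taking the bound as its first argument; `find_witness` and `not_con` are hypothetical names.

```python
from itertools import product

def in_range(n, l):
    """inRange(n, l): every element h of the list satisfies 0 <= h < n."""
    return all(0 <= h < n for h in l)

def find_witness(negated_conjecture, max_len=2, values=range(-1, 4)):
    """Search small lists and integers for an assignment satisfying ¬CON."""
    for length in range(max_len + 1):
        for l in product(values, repeat=length):
            for n in values:
                if negated_conjecture(n, l):
                    return n, l
    return None

# CON:  inRange(n, tail(l)) ⇒ inRange(n, l)
# ¬CON: the premise holds but the conclusion fails.
def not_con(n, l):
    return in_range(n, l[1:]) and not in_range(n, l)

print(find_witness(not_con))
```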
Experiments with LIST
A detailed proof of Lemma 3.1 is in the appendix. It proceeds by constructing a canonical (minimal) model of the ⇐-direction of Def_P, which always also is a model of the ⇒-direction. From a logic-programming angle, the user could as well give only the ⇐-direction of Def_P, and the system adds the completion (⇒-direction) for disproving purposes.

Example. Let inRange : ℤ × LIST be a predicate symbol. Consider the extension of Ax_LIST with the following (admissible) definition for P (the free variables are universally quantified with the obvious sorts).

inRange(n, l) ⇔ l ≈ nil ∨ ∃ h_ℤ t_LIST . (l ≈ cons(h, t) ∧ 0 ≤ h ∧ h < n ∧ inRange(n, t))

This example comes from a case study with the first-order logic model checker from [1]. The inRange predicate is used there to specify lists of “ordered items” handled in a purchase order process, which must all be in a range 0..N − 1, for some N ≥ 0.

The following table lists some sample problems together with the runtimes (in seconds) needed to disprove them with the provers mentioned.¹

Problem | Beagle | Spass+T | Z3
inRange(4, cons(1, cons(5, cons(2, nil)))) | 6.2 | 0.3 | 0.2
n > 4 ⇒ inRange(n, cons(1, cons(5, cons(2, nil)))) | 7.2 | 0.3 | 0.2
inRange(n, tail(l)) ⇒ inRange(n, l) | 3.9 | 0.3 | 0.2
∃ n_ℤ l_LIST . l ≉ nil ∧ inRange(n, l) ∧ n − head(l) < 1 | 2.7 | 0.3 | 0.2
inRange(n, l) ⇒ inRange(n − 1, l) | 8.2 | 0.3 | >60
l ≉ nil ∧ inRange(n, l) ⇒ n − head(l) > 2 | 2.8 | 0.3 | 0.2
n > 0 ∧ inRange(n, l) ∧ l′ = cons(n − 2, l) ⇒ inRange(n, l′) | 4.5 | 5.2 | 0.2

We remark that none of these problems is solvable by either prover by directly trying to establish consistency of the axioms, definitions and the conjecture. Even if only the ⇐-direction is used, Z3 and Spass+T do not terminate. Because the universally quantified variables in the conjectures lead to Skolem constants, the resulting clause set is no longer sufficiently complete (see [3]), and a finite saturation obtained by Beagle does not allow one to conclude satisfiability.

Functions. Let Σ⁺ ⊇ Σ_LIST be a signature, s ∈ sorts(Σ) and f ∉ Σ⁺ a function symbol with arity ℤ × LIST ↦ s. Let Def_f be a set of (implicitly) universally quantified formulas of the form below, where k and h are ℤ-sorted and t is LIST-sorted:

f(k, nil) ≈ b[k] ⇐ B[k]   (f_0)
f(k, cons(h, t)) ≈ c_1[k, h, t, f(k, t)] ⇐ C_1[k, h, t, f(k, t)]   (f_1)
...
f(k, cons(h, t)) ≈ c_n[k, h, t, f(k, t)] ⇐ C_n[k, h, t, f(k, t)]   (f_n)

¹ Here and below, Beagle has been run with “cautious simplification on” and “ordinary variables on”; Z3, version 4.3.1 with the options “pull-nested-quantifiers”, “mbqi” and “macro-finder” on; SPASS+T used Yices as a theory solver. All timings obtained on reasonable recent computer hardware. The input problems are available on the Beagle website http://users.cecs.anu.edu.au/~baumgart/systems/beagle/.
Problems 5 and 7 require "ordinary variables" and "cautious simplification" (the most complete parameter setting)
Experiments with LIST
where B is a Σ⁺-formula of arity ℤ, each C_i is a Σ⁺-formula of arity ℤ × ℤ × LIST × s, b is a Σ⁺-term of arity ℤ ↦ s, and each c_i is a Σ⁺-term with arity ℤ × ℤ × LIST × s ↦ s.

Lemma 3.2. Let D be a Σ⁺-domain with D_LIST = LIST. If for all 1 ≤ i < j ≤ n the formula

∀ k_ℤ h_ℤ t_LIST x_s . C_i[k, h, t, x] ∧ C_j[k, h, t, x] ⇒ c_i[k, h, t, x] ≈ c_j[k, h, t, x]

is valid in all Σ⁺-interpretations with domain D then Def_f is an admissible definition of f wrt. Σ⁺ and D.

The condition in the lemma statement is needed to make sure that all cases (f_i) and (f_j) for i ≠ j are consistent. For example, for f(cons(h, t)) ≈ 1 ⇐ h ≈ 1 and f(cons(h, t)) ≈ a ⇐ h ≈ 1 + a this is not the case. Indeed, ∀ h_ℤ . h ≈ 1 ∧ h ≈ 1 + a ⇒ 1 ≈ a is not valid. Notice that establishing the condition is a theorem proving task, which fits well with our method. In the examples below it is trivial.

Example. Let length : LIST ↦ ℤ, count : ℤ × LIST ↦ ℤ, append : LIST × LIST ↦ LIST and in : ℤ × LIST be operators. Consider the extension of Ax_LIST with the following (admissible) definitions, in the given order.

length(nil) ≈ 0
length(cons(h, t)) ≈ 1 + length(t)
append(nil, l) ≈ l
append(cons(h, t), l) ≈ cons(h, append(t, l))
count(k, nil) ≈ 0
count(k, cons(h, t)) ≈ count(k, t) ⇐ k ≉ h
count(k, cons(h, t)) ≈ count(k, t) + 1 ⇐ k ≈ h
in(k, l) ⇔ count(k, l) > 0

Here are some sample conjectures together with the times for disproving them.²

Problem | Beagle | Spass+T | Z3
length(l_1) ≈ length(l_2) ⇒ l_1 ≈ l_2 | 4.3 | 9.0 | 0.2
n ≥ 3 ∧ length(l) ≥ 4 ⇒ inRange(n, l) | 5.4 | 1.1 | 0.2
count(n, l) ≈ count(n, cons(1, l)) | 2.5 | 0.3 | >60
count(n, l) ≥ length(l) | 2.7 | 0.3 | >60
l_1 ≉ l_2 ⇒ count(n, l_1) ≉ count(n, l_2) | 2.4 | 0.8 | >60
length(append(l_1, l_2)) ≈ length(l_1) | 2.1 | 0.3 | 0.2
length(l_1) > 1 ∧ length(l_2) > 1 ⇒ length(append(k, l)) > 4 | 37 | >60 | >60
in(n_1, l_1) ∧ ¬in(n_2, l_2) ∧ l_3 ≈ append(l_1, cons(n_2, l_2)) ⇒ count(n, l_3) ≈ count(n, l_1) | >60 (6.2) | 9.1 | >60

4 Arrays

The signature Σ_ARRAY consists of sorts ARRAY and ℤ and the operators read : ARRAY × ℤ ↦ ℤ, write : ARRAY × ℤ × ℤ ↦ ARRAY, and init : ℤ ↦ ARRAY. The array axioms

² The time of 6.2 seconds for the last problem is with “ordinary variables off”.
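The recursive definitions of length, append, count and in translate directly into Python, which makes it easy to see why some of the sample conjectures fail. In this sketch lists are tuples, `member` stands for the predicate called in (a reserved word in Python), and the concrete counterexamples are chosen by hand for illustration; they are not taken from the provers' output.

```python
def length(l):
    """length(nil) ≈ 0, length(cons(h, t)) ≈ 1 + length(t)."""
    return 0 if not l else 1 + length(l[1:])

def append(l1, l2):
    """append(nil, l) ≈ l, append(cons(h, t), l) ≈ cons(h, append(t, l))."""
    return l2 if not l1 else (l1[0],) + append(l1[1:], l2)

def count(k, l):
    """Number of occurrences of k in l, by the two conditional equations."""
    if not l:
        return 0
    h, t = l[0], l[1:]
    return count(k, t) + 1 if k == h else count(k, t)

def member(k, l):
    """in(k, l) ⇔ count(k, l) > 0."""
    return count(k, l) > 0

# Hand-picked counterexamples to three of the conjectures in the table:
print(length((0,)) == length((1,)), (0,) == (1,))    # same length, different lists
print(count(1, ()), count(1, (1,) + ()))             # count changes under cons(1, l)
print(length(append((), (0,))), length(()))          # append changes the length
```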
The last problem is provable only with "abstraction variables"
Experiments with ARRAY
Examples. Let the operators inRange : ARRAY × ℤ × ℤ, max, distinct be defined as follows (sorted and rev are as defined previously):

inRange(a, r, n) ⇔ ∀ i . (n ≥ i ∧ i ≥ 0) ⇒ (r ≥ read(a, i) ∧ read(a, i) ≥ 0)
distinct(a, n) ⇔ ∀ i, j . (n > i ∧ n > j ∧ j ≥ 0 ∧ i ≥ 0) ⇒ (read(a, i) ≈ read(a, j) ⇒ i ≈ j)
max(a, n) ≈ w ⇐ (∀ i . (n > i ∧ i ≥ 0) ⇒ w ≥ read(a, i)) ∧ (∃ i . n > i ∧ i ≥ 0 ∧ read(a, i) ≈ w)

Here are some sample conjectures together with the times for disproving them.³ Note that u indicates termination with a status “unknown”.

Problem | Beagle | Spass+T | Z3
n ≥ 0 ⇒ inRange(a, max(a, n), n) | 1.40 | 0.16 | u
distinct(init(n), i) | 0.98 | 0.15 | u
read(rev(a, n + 1), 0) = read(a, n) | >60 | >60 (0.27) | >60
distinct(a, n) ⇒ distinct(rev(a, n)) | >60 | 0.11 | 0.36
∃ n_ℤ . ¬sorted(rev(init(n), m), m) | >60 | 0.16 | u
sorted(a, n) ∧ n > 0 ⇒ distinct(a, n) | 2.40 | 0.17 | 0.01

In addition, SPASS+T, Beagle and Z3 were used to prove the functionality condition in Lemma 4.2 for the max and rev operators. All provers verified the condition for max but only SPASS+T and Z3 verified that for rev.

5 Conclusions

The aim of this work is to provide a reasonably expressive language (in practical terms) that allows one to specify properties of data structures under consideration, like lists and arrays, and that supports disproving by existing theorem provers. The main idea is to capitalize on the strengths of these systems for theorem proving for solving disproving problems, instead of relying on their model-building capabilities. To this end we gave some examples and tested them with the theorem provers SPASS+T, Beagle and Z3. It turns out that the theorem provers work rather well, in the sense that all problems we tried could be solved, and in short time. In general, the first-order solvers Beagle and SPASS+T worked most reliably, possibly thanks to handling quantified formulas natively instead of relying solely on instantiation heuristics.

Our examples are inspired by case studies with the first-order model-checker described in [1]. Disprovable conjectures come up there not only by “faulty” conjectures, but also when trying to prove that two state-changing operators commute (for partial-order reduction). Clearly, more experiments are needed, also from different contexts.

³ SPASS+T used Yices as a theory solver. The time of 0.27s in the third problem is obtained by excluding the inRange definition.
The first and the last problem are provable only with "ordinary variables"