A Generalization of Shostak's Method for
Combining Decision Procedures
Clark W. Barrett, David L. Dill, and Aaron Stump
Stanford University, Stanford, CA 94305, USA,
http://verify.stanford.edu
Abstract. Consider the problem of determining whether a quantifier-free formula is satisfiable in some first-order theory T. Shostak's algorithm decides this problem for a certain class of theories with both interpreted and uninterpreted functions. We present two new algorithms based on Shostak's method. The first is a simple subset of Shostak's algorithm for the same class of theories but without uninterpreted functions. This simplified algorithm is easy to understand and prove correct, providing insight into how and why Shostak's algorithm works. The simplified algorithm is then used as the foundation for a generalization of Shostak's method based on the Nelson-Oppen method for combining theories.
1 Introduction
In 1984, Shostak introduced a clever and subtle algorithm for deciding the satisfiability of quantifier-free formulas in a combined theory which includes a first-order theory with certain properties and the pure theory of equality with uninterpreted functions [10]. The method has proved to be popular for automated reasoning applications, having been used as the basis for decision procedures found in several tools including PVS [8], STeP [3, 5], and SVC [1, 2, 6].
Unfortunately, the original paper is difficult to follow, due in part to the fact that it contains several errors. As a result, there has been an ongoing effort to understand and clarify the method [4, 9, 12]. The presentation that is most faithful to Shostak while correcting his errors is the one recently produced by Shankar and Ruess [9].
Our work on SVC has led to a number of additional insights which we hope will help to demystify the method and increase its utility. We first present a subset of the original algorithm, in particular, the subset which decides formulas without uninterpreted functions. This algorithm is interesting in its own right because it is easily proved correct and can be used directly to produce decision procedures.
The simplified algorithm also forms the basis for a more general algorithm which subsumes Shostak's algorithm. In order to justify the more general algorithm, we use an argument based on a variation of the Nelson-Oppen method for combining theories [7, 11] and a new theorem which relates convexity (a requirement for Shostak) and stable-infiniteness (a requirement for Nelson-Oppen).
In Section 2, below, some preliminary definitions and notation are given. The simple algorithm without uninterpreted functions is presented in Section 3. Section 4 reviews the Nelson-Oppen method in preparation for the generalized algorithm, which is presented in Section 5. Finally, Section 6 compares our approach to other work on Shostak's algorithm, and Section 7 concludes.
2 Preliminary Concepts
2.1 Some Notions from Logic
A theory is a set of first-order sentences. For the purposes of this paper, all theories are assumed to be first-order and to include the axioms of equality. The signature of a theory is the set of function, predicate (other than equality), and constant symbols appearing in those sentences. A literal is an atomic formula or its negation. To avoid confusion with the logical equality symbol =, we use the symbol ≡ to indicate that two logical expressions are syntactically identical.
For a given model, M, a variable assignment α is a function which assigns to each variable an element of the domain of M. We write M, α ⊨ φ if φ is true in the model M with variable assignment α. If Γ is a set of formulas, then M, α ⊨ Γ indicates that M, α ⊨ φ for each φ ∈ Γ. In general, whenever sets of formulas are used as logical formulas, the intended meaning is the conjunction of the formulas in the set. A formula φ is satisfiable if there exists some model M and variable assignment α such that M, α ⊨ φ. If Γ is a set of formulas and φ is a formula, then Γ ⊨ φ means that whenever a model and variable assignment satisfy Γ, they also satisfy φ. A set S of literals is convex in a theory T if T ∪ S does not entail any disjunction of equalities without entailing one of the equalities itself. A theory is convex if every set of literals in the language of the theory is convex.
2.2 Equations in Solved Form
Definition 1. A set E of equations is said to be in solved form iff the left-hand side of each equation in E is a variable which appears only once in E. We refer to the variables which appear only on the left-hand sides as solitary variables.
A set E of equations in solved form defines an idempotent substitution: the one which replaces each solitary variable with its corresponding right-hand side. If S is an expression or set of expressions, we denote the result of applying this substitution to S by E(S). Another interesting property of equations in solved form is that the question of whether such a set E entails some formula φ in a theory T can be answered simply by determining the validity of E(φ) in T:
Lemma 1. If T is a theory with signature Σ and E is a set of Σ-equations in solved form, then T ∪ E ⊨ φ iff T ⊨ E(φ).
Proof. Clearly, T ∪ E ⊨ φ iff T ∪ E ⊨ E(φ). Thus we need only show that T ∪ E ⊨ E(φ) iff T ⊨ E(φ). The "if" direction is trivial. To show the other direction, assume that T ∪ E ⊨ E(φ). Any model of T can be made to satisfy T ∪ E by assigning any value to the non-solitary variables of E, and then choosing the value of each solitary variable to match the value of its corresponding right-hand side. Since none of the solitary variables occur anywhere else in E, this assignment is well-defined and satisfies E. By assumption then, this model and assignment also satisfy E(φ), but none of the solitary variables appear in E(φ), so the initial arbitrary assignment to non-solitary variables must be sufficient to satisfy E(φ). Thus it must be the case that every model of T satisfies E(φ) with every variable assignment. ⊓⊔
Corollary 1. If T is a consistent theory with signature Σ and E is a set of Σ-equations in solved form, then T ∪ E is satisfiable.
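To make the solved-form machinery concrete, here is a small illustrative sketch (ours, not from the paper) in which a solved form E is a Python dictionary from solitary variables to terms, and terms are nested tuples. It checks the solved-form condition of Definition 1 and shows that the induced substitution E(·) is idempotent, because no solitary variable appears on any right-hand side.

```python
# Terms: a variable is a string; an application is a tuple (fun, arg1, ..., argn).
def apply_subst(E, t):
    """Apply the substitution induced by a solved form E to term t."""
    if isinstance(t, str):
        return E.get(t, t)
    return (t[0],) + tuple(apply_subst(E, a) for a in t[1:])

def is_solved_form(E):
    """Each left-hand side is a variable occurring nowhere else in E."""
    def vars_of(t):
        return {t} if isinstance(t, str) else {v for a in t[1:] for v in vars_of(a)}
    rhs_vars = set().union(*(vars_of(t) for t in E.values())) if E else set()
    return all(x not in rhs_vars for x in E)

# E = {x = f(u), y = g(u, v)} is in solved form: x and y are solitary.
E = {"x": ("f", "u"), "y": ("g", "u", "v")}
assert is_solved_form(E)
t = ("h", "x", "y", "w")
once = apply_subst(E, t)
assert once == ("h", ("f", "u"), ("g", "u", "v"), "w")
assert apply_subst(E, once) == once  # idempotent: the solitary variables are gone
```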
3 Algorithm S1
In this section we present an algorithm, based on a subset of Shostak's algorithm, for deciding satisfiability of quantifier-free formulas in a theory T which meets certain conditions. We call such a theory a Shostak theory.
Definition 2. A consistent theory T with signature Σ is a Shostak theory if the following conditions hold.
1. Σ does not contain any predicate symbols.
2. T is convex.
3. There exists a canonizer σ, a computable function from Σ-terms to Σ-terms, with the property that T ⊨ a = b iff σ(a) ≡ σ(b).
4. There exists a solver ω, a computable function from Σ-equations to sets of formulas defined as follows:
(a) If T ⊨ a ≠ b, then ω(a = b) ≡ {false}.
(b) Otherwise, ω(a = b) returns a set E of equations in solved form such that T ⊨ [(a = b) ↔ ∃x̄.E], where x̄ is the set of variables which appear in E but not in a or b. Each of these variables must be fresh.
The requirements given here are slightly different from those given by Shostak and others. These differences are discussed in Section 6 below. In the rest of this section, T is assumed to be a Shostak theory with canonizer σ and solver ω. As we will show, the solver can be used to convert an arbitrary set of equations into a set of equations in solved form. The canonizer is used to determine whether a specific equality is entailed by a set of equations in solved form, as shown by the following lemma.
Lemma 2. If T is a Shostak theory with signature Σ, E is a set of Σ-equations in solved form, and σ is a canonizer for the theory T, then T ∪ E ⊨ a = b iff σ(E(a)) ≡ σ(E(b)).
Proof. By Lemma 1, T ∪ E ⊨ a = b iff T ⊨ E(a) = E(b). But T ⊨ E(a) = E(b) iff σ(E(a)) ≡ σ(E(b)) by the definition of σ. ⊓⊔
S1(Γ, Δ, σ, ω)
1.  E := ∅;
2.  WHILE Γ ≠ ∅ DO BEGIN
3.    Remove some equality a = b from Γ;
4.    a* := E(a); b* := E(b);
5.    E* := ω(a* = b*);
6.    IF E* = {false} THEN RETURN FALSE;
7.    E := E*(E) ∪ E*;
8.  END
9.  IF σ(E(a)) ≡ σ(E(b)) for some a ≠ b ∈ Δ THEN RETURN FALSE;
10. RETURN TRUE;

Fig. 1. Algorithm S1: based on a simple subset of Shostak's algorithm.
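The control flow of Fig. 1 translates almost line-for-line into code. The sketch below (our illustration, not part of the paper) takes the canonizer and solver as parameters and instantiates them for a degenerate Shostak theory whose terms are just variables and integer constants: the canonizer is the identity and the solver solves x = t directly. The term representation and the `FALSE` marker are assumptions of this sketch.

```python
FALSE = "false"  # the solver's inconsistency marker, {false} in the paper

def S1(Gamma, Delta, sigma, omega, subst):
    """Decide joint satisfiability of equalities Gamma and disequalities Delta."""
    E = {}                                    # line 1: E := empty solved form
    Gamma = list(Gamma)
    while Gamma:                              # line 2
        a, b = Gamma.pop()                    # line 3
        a, b = subst(E, a), subst(E, b)       # line 4: a* := E(a); b* := E(b)
        Estar = omega(a, b)                   # line 5
        if Estar == FALSE:
            return False                      # line 6
        E = {x: subst(Estar, t) for x, t in E.items()}  # line 7: E := E*(E) ∪ E*
        E.update(Estar)
    for a, b in Delta:                        # line 9
        if sigma(subst(E, a)) == sigma(subst(E, b)):
            return False
    return True                               # line 10

# Degenerate instance: terms are variables (str) or integer constants (int).
def subst(E, t):
    while isinstance(t, str) and t in E:
        t = E[t]
    return t

sigma = lambda t: t                           # identity canonizer

def omega(a, b):
    if a == b:
        return {}                             # trivially true: empty solved form
    if isinstance(a, str):
        return {a: b}                         # solve for the variable a
    if isinstance(b, str):
        return {b: a}
    return FALSE                              # two distinct constants

assert S1([("x", "y"), ("y", 3)], [("x", 5)], sigma, omega, subst) is True
assert S1([("x", "y"), ("y", 3)], [("x", 3)], sigma, omega, subst) is False
```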
Algorithm S1 (shown in Fig. 1) makes use of the properties of a Shostak theory to check the joint satisfiability of an arbitrary set of equalities, Γ, and an arbitrary set of disequalities, Δ, in a Shostak theory with canonizer σ and solver ω. Since the satisfiability of any quantifier-free formula can be determined by first converting it to disjunctive normal form, it suffices to have a satisfiability procedure for a conjunction of literals. Since Σ contains no predicate symbols, all Σ-literals are either equalities or disequalities. Thus, algorithm S1 is sufficient for deciding the satisfiability of quantifier-free formulas. Termination of the algorithm is trivial since each step terminates and each time line 3 is executed the size of Γ is reduced. The following lemmas are needed before proving correctness.
Lemma 3. If T is a theory, Γ and Δ are sets of formulas, and E is a set of equations in solved form, then for any formula φ, T ∪ Γ ∪ Δ ∪ E ⊨ φ iff T ∪ Γ ∪ E(Δ) ∪ E ⊨ φ.
Proof. Follows trivially from the fact that Δ ∪ E and E(Δ) ∪ E are satisfied by exactly the same models and variable assignments. ⊓⊔
Lemma 4. If ω is a solver for a Shostak theory T and Γ is any set of formulas, then for any formula φ, and terms a and b,
T ∪ Γ ∪ {a = b} ⊨ φ iff T ∪ Γ ∪ ω(a = b) ⊨ φ.
Proof.
⇒: Given that T ∪ Γ ∪ {a = b} ⊨ φ, suppose that M, α ⊨ T ∪ Γ ∪ ω(a = b). It is easy to see from the definition of ω that M, α ⊨ a = b and hence by the hypothesis, M, α ⊨ φ.
⇐: Given T ∪ Γ ∪ ω(a = b) ⊨ φ, suppose that M, α ⊨ T ∪ Γ ∪ {a = b}. Then, since T ⊨ (a = b) ↔ ∃x̄.ω(a = b), there exists a modified assignment α' which assigns values to all the variables in x̄ and satisfies ω(a = b) but is otherwise equivalent to α. Then, by the hypothesis, M, α' ⊨ φ. But the variables in x̄ are new variables, so they do not appear in φ, meaning that changing their values cannot affect whether φ is true. Thus M, α ⊨ φ. ⊓⊔
Lemma 5. If Γ, {a = b}, and E are sets of Σ-formulas, with E in solved form, then for every Σ-formula φ, T ∪ Γ ∪ {a = b} ∪ E ⊨ φ iff T ∪ Γ ∪ E* ∪ E*(E) ⊨ φ, where E* = ω(E(a = b)).
Proof.
T ∪ Γ ∪ {a = b} ∪ E ⊨ φ ⇔ T ∪ Γ ∪ {E(a = b)} ∪ E ⊨ φ   (Lemma 3)
⇔ T ∪ Γ ∪ E* ∪ E ⊨ φ   (Lemma 4)
⇔ T ∪ Γ ∪ E* ∪ E*(E) ⊨ φ   (Lemma 3)  ⊓⊔
Lemma 6. During the execution of algorithm S1, E is always in solved form.
Proof. Clearly E is in solved form initially. Consider one iteration. By construction, a* and b* do not contain any of the solitary variables of E, and thus E* doesn't either. E* is in solved form by the definition of ω. Finally, applying E* to E guarantees that none of the solitary variables of E* appear in E*(E), so the new value of E is also in solved form. ⊓⊔
Lemma 7. Let Γn and En be the values of Γ and E after the while loop in algorithm S1 has been executed n times. Then for each n, and any formula φ, the following invariant holds: T ∪ Γ0 ⊨ φ iff T ∪ Γn ∪ En ⊨ φ.
Proof. The proof is by induction on n. For n = 0, the invariant holds trivially. Now suppose the invariant holds for some k ≥ 0. Consider the next iteration.
T ∪ Γ0 ⊨ φ ⇔ T ∪ Γk ∪ Ek ⊨ φ   (Induction Hypothesis)
⇔ T ∪ Γk+1 ∪ {a = b} ∪ Ek ⊨ φ   (Line 3)
⇔ T ∪ Γk+1 ∪ E* ∪ E*(Ek) ⊨ φ   (Lemmas 5 and 6)
⇔ T ∪ Γk+1 ∪ Ek+1 ⊨ φ   (Line 7)  ⊓⊔
Theorem 1. Suppose T is a Shostak theory with signature Σ, canonizer σ, and solver ω. If Γ is a set of Σ-equalities and Δ is a set of Σ-disequalities, then T ∪ Γ ∪ Δ is satisfiable iff S1(Γ, Δ, σ, ω) = TRUE.
Proof. Suppose S1(Γ, Δ, σ, ω) = FALSE. If the algorithm terminates at line 9, then σ(E(a)) ≡ σ(E(b)) for some a ≠ b ∈ Δ. It follows from Lemmas 2 and 7 that T ∪ Γ ⊨ a = b, so clearly T ∪ Γ ∪ Δ is not satisfiable. The other possibility is that the algorithm terminates at line 6. Suppose the loop has been executed n times and that Γn and En are the values of Γ and E at the end of the last loop. It must be the case that T ⊨ a* ≠ b*, so T ∪ {a* = b*} is unsatisfiable. Clearly then, T ∪ {a* = b*} ∪ En is unsatisfiable, so by Lemma 3, T ∪ {a = b} ∪ En is unsatisfiable. But {a = b} is a subset of Γn, so T ∪ Γn ∪ En must be unsatisfiable, and thus by Lemma 7, T ∪ Γ is unsatisfiable.
Suppose on the other hand that S1(Γ, Δ, σ, ω) = TRUE. Then the algorithm terminates at line 10. By Lemma 6, E is in solved form. Let δ be the disjunction of equalities equivalent to ¬(Δ). Since the algorithm does not terminate at line 9, T ∪ E does not entail any equality in δ (by Lemma 2). Because T is convex, it follows that T ∪ E ⊭ δ. Now, since T ∪ E is satisfiable by Corollary 1, it follows that T ∪ E ∪ Δ is satisfiable. But by Lemma 7, T ∪ Γ ⊨ φ iff T ∪ E ⊨ φ, so in particular T ∪ E ⊨ Γ. Thus T ∪ E ∪ Δ ∪ Γ is satisfiable, and hence T ∪ Γ ∪ Δ is satisfiable. ⊓⊔
3.1 An Example
Perhaps the most obvious example of a Shostak theory is the theory of linear arithmetic with signature {0, S, +} (where S is the successor function) and domain the real numbers. Terms in this theory can be more conveniently represented by using some standard abbreviations: base 10 numerals instead of repeated applications of successor (i.e. 3 instead of S(S(S(0)))), and multiplication by a constant instead of repeated applications of + (i.e. 3x instead of x + x + x). Division by a non-zero constant and the use of unary minus can also be included, since equations involving these operations can always be converted into equivalent equations without them.
A simple canonizer for this theory can be obtained by imposing an order on all variables (lexicographic or otherwise) and combining like terms. For example, σ(z + 3y − x − 5z) ≡ −x + 3y + (−4z). Similarly, a solver can be obtained simply by solving for one of the variables in an equation.
A well-known method for obtaining a solution to a system of equations in this theory is simply to use Gaussian elimination and back-substitution. Interestingly, by using the solver and canonizer just described, algorithm S1 actually implements Gaussian elimination with back-substitution.
Consider the following system of equations:

x + 3y − 2z = 1
x − y − 6z = 1

This system can be represented by a matrix and transformed to reduced row echelon form as follows.

[ 1  3 −2  1 ]    [ 1  3 −2  1 ]    [ 1  0 −5  1 ]
[ 1 −1 −6  1 ] ⇒ [ 0 −4 −4  0 ] ⇒ [ 0  1  1  0 ]
Compare this with running algorithm S1 on the same set of equations. The following table shows the values of Γ, E, E(a = b), and E* on each iteration of algorithm S1 starting with Γ = {x + 3y − 2z = 1, x − y − 6z = 1}:

Γ                                | E                  | E(a = b)                 | E*
x + 3y − 2z = 1, x − y − 6z = 1  | ∅                  | x + 3y − 2z = 1          | x = 1 − 3y + 2z
x − y − 6z = 1                   | x = 1 − 3y + 2z    | 1 − 3y + 2z − y − 6z = 1 | y = −z
∅                                | x = 1 + 5z, y = −z |                          |
The substitution for x in the second iteration corresponds to using x as a pivot variable to produce a zero in the second row of the matrix. Similarly, the last execution of line 7 transforms x = 1 − 3y + 2z into x = 1 + 5z, corresponding to the transformation of the first row of the matrix due to back-substitution. Notice that the final solution obtained by algorithm S1 is the same as that obtained from the matrix in reduced row echelon form.
To make the example a little more interesting, suppose a third equation is added: 2x + 8y − 2z = 3. Transforming the matrix yields:

( 1  3 −2  1 )    ( 1  3 −2  1 )    ( 1  3 −2  1 )
( 1 −1 −6  1 ) ⇒ ( 0 −4 −4  0 ) ⇒ ( 0  1  1  0 )
( 2  8 −2  3 )    ( 0  2  2  1 )    ( 0  0  0  1 )

At this point, the last row indicates that the system of equations is unsatisfiable.
Suppose that the same new equation is processed by algorithm S1. Note that rather than restarting the algorithm, the new equation can be placed in Γ and the algorithm can continue from where it left off. This illustrates a very nice property of the algorithm: it is incremental. If a new equation is added to Γ after some of the equations have already been processed, the algorithm can continue without any difficulty. The result is as follows:

Γ                | E                  | E(a = b)                    | E*
2x + 8y − 2z = 3 | x = 1 + 5z, y = −z | 2(1 + 5z) + 8(−z) − 2z = 3  | {false}
The solver detects an inconsistency when it tries to solve the equation obtained after applying the substitution from E. The solver indicates this by returning {false}, which results in the algorithm returning FALSE.
Finally, suppose that instead of the equation 2x + 8y − 2z = 3, the disequality y + x ≠ x − z is added. This is handled by line 9 of the algorithm:

σ(E(y + x)) ≡ σ(−z + 1 + 5z) ≡ 1 + 4z
σ(E(x − z)) ≡ σ(1 + 5z − z) ≡ 1 + 4z

Since y + x ≠ x − z ∈ Δ and σ(E(y + x)) ≡ σ(E(x − z)), the algorithm returns FALSE.
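All three runs above can be replayed mechanically. The sketch below (our illustration; the term representation, exact-rational coefficients, and lexicographic variable choice are assumptions) implements the canonizer and solver for linear arithmetic described in this section together with the S1 loop: the original system yields x = 1 + 5z and y = −z, the extra equation 2x + 8y − 2z = 3 drives the solver to {false}, and the disequality y + x ≠ x − z is caught by the line-9 check.

```python
from fractions import Fraction as Q

# A linear term is (constant, {var: coeff}); it is canonical once zero
# coefficients are dropped, so syntactic identity is tuple equality.
def canon(t):
    c, m = t
    return (Q(c), {v: Q(k) for v, k in sorted(m.items()) if k != 0})

def subst(E, t):
    """Apply solved form E (dict var -> term) and canonize: sigma(E(t))."""
    c, m = t
    out_c, out = Q(c), {}
    for v, k in m.items():
        if v in E:
            ec, em = E[v]
            out_c += k * ec
            for w, kw in em.items():
                out[w] = out.get(w, Q(0)) + k * kw
        else:
            out[v] = out.get(v, Q(0)) + k
    return canon((out_c, out))

FALSE = "false"

def omega(a, b):
    """Solve a = b for the first variable in the ordering."""
    (ac, am), (bc, bm) = a, b
    c = ac - bc
    m = dict(am)
    for v, k in bm.items():
        m[v] = m.get(v, Q(0)) - k
    m = {v: k for v, k in m.items() if k != 0}
    if not m:
        return {} if c == 0 else FALSE       # 0 = 0 is trivial; 0 = c is false
    x = sorted(m)[0]
    kx = m.pop(x)
    return {x: canon((-c / kx, {v: -k / kx for v, k in m.items()}))}

def S1(Gamma, Delta):
    E = {}
    Gamma = list(Gamma)
    while Gamma:
        a, b = Gamma.pop(0)
        a, b = subst(E, a), subst(E, b)
        Estar = omega(a, b)
        if Estar == FALSE:
            return False, E
        E = {v: subst(Estar, t) for v, t in E.items()}
        E.update(Estar)
    for a, b in Delta:
        if subst(E, a) == subst(E, b):
            return False, E
    return True, E

def T(c, **m):  # convenience constructor for linear terms
    return canon((Q(c), {v: Q(k) for v, k in m.items()}))

eqs = [(T(0, x=1, y=3, z=-2), T(1)), (T(0, x=1, y=-1, z=-6), T(1))]
ok, E = S1(eqs, [])
assert ok and E["x"] == T(1, z=5) and E["y"] == T(0, z=-1)    # x = 1 + 5z, y = -z
assert S1(eqs + [(T(0, x=2, y=8, z=-2), T(3))], [])[0] is False
assert S1(eqs, [(T(0, x=1, y=1), T(0, x=1, z=-1))])[0] is False  # y + x != x - z
```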
There is no matrix analog to the case which includes the disequality. Algorithm S1 may, in fact, properly be viewed as a generalization of Gaussian elimination. Not only can it handle disequalities, but it can also introduce new variables or equations when solving. Also, the set of function symbols can be richer than those provided by a vector space. The key requirement is simply that an appropriate canonizer and solver exist.
3.2 Combining Shostak Theories
As noted by Shostak in his original paper [10], it is often possible to combine two Shostak theories to form a new Shostak theory. A canonizer for the combined theory is obtained simply by composing the canonizers from each individual theory. Furthermore, a solver for the combined theory can often be obtained by repeatedly applying the solver for each theory (treating terms in other theories as variables) until a true variable is on the left-hand side of each equation in the solved form. However, as pointed out in [6] and [9], this is not always possible. We do not address this issue here, but mention it as a question which warrants further investigation.
4 The Nelson-Oppen Combination Method
Nelson and Oppen [7] described a method for combining decision procedures for theories which are stably-infinite and have disjoint signatures. A theory T is stably-infinite if any quantifier-free formula is satisfiable in some model of T iff it is satisfiable in an infinite model of T. There have been many detailed presentations of the Nelson-Oppen method. A brief overview based on Tinelli and Harandi's approach [11] is given here.
Suppose T1 and T2 are such theories with T = T1 ∪ T2 (the generalization to more than two theories is straightforward). Let Φ be a set of Σ-literals and suppose we wish to determine the satisfiability of T ∪ Φ. A few more definitions are required before presenting the Nelson-Oppen procedure.
Members of Σi, for i = 1, 2, are called i-symbols. In order to associate all terms with some theory, each variable is also arbitrarily associated with either T1 or T2. A variable is called an i-variable if it is associated with Ti (note that an i-variable is not an i-symbol, as it is not a member of Σi). A Σ-term t is an i-term if it is an i-variable, a constant i-symbol, or an application of a functional i-symbol. An i-predicate is an application of a predicate i-symbol. An atomic i-formula is an i-predicate or an equality whose left term is an i-term. An i-literal is an atomic i-formula or the negation of an atomic i-formula. An occurrence of a term t in either a term or a literal is i-alien if t is a j-term, with i ≠ j, and all of its super-terms (if any) are i-terms. An i-term or i-literal is pure if it contains only i-symbols and variables (i.e. its i-alien sub-terms are all variables).
Given an equivalence relation ∼, let dom ∼ be the domain of the relation. We define the following sets of formulas induced by ∼:

E∼ = { x = y | x, y ∈ dom ∼ and x ∼ y }
D∼ = { x ≠ y | x, y ∈ dom ∼ and x ≁ y }
A∼ = E∼ ∪ D∼

If Ar = A∼ for some equivalence relation ∼ with domain S, we call Ar an arrangement of S.
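As a concrete illustration (ours, not from the paper), the induced sets E∼, D∼, and hence A∼ can be computed from a partition of S, and every candidate arrangement can be enumerated by generating all partitions of S, i.e. all equivalence relations on it:

```python
from itertools import combinations

def arrangement(partition):
    """Build (E~, D~) from a partition (list of blocks) of S; A~ is their union."""
    block_of = {x: i for i, b in enumerate(partition) for x in b}
    S = sorted(block_of)
    eqs  = {(x, y) for x, y in combinations(S, 2) if block_of[x] == block_of[y]}
    deqs = {(x, y) for x, y in combinations(S, 2) if block_of[x] != block_of[y]}
    return eqs, deqs

def partitions(S):
    """Enumerate all partitions of the list S (all equivalence relations on S)."""
    if not S:
        yield []
        return
    first, rest = S[0], S[1:]
    for p in partitions(rest):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]   # put first in an existing block
        yield [[first]] + p                              # or in a block of its own

# For S = {x, y, z} there are Bell(3) = 5 equivalence relations.
assert sum(1 for _ in partitions(["x", "y", "z"])) == 5
eqs, deqs = arrangement([["x", "y"], ["z"]])
assert eqs == {("x", "y")} and deqs == {("x", "z"), ("y", "z")}
```

This brute-force enumeration is exactly the nondeterministic "guess" in the simple version of the procedure described below; practical implementations avoid it, but it suffices for the correctness statement.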
The first step in determining the satisfiability of Φ is to transform Φ into an equisatisfiable formula Φ1 ∧ Φ2, where Φi consists only of pure i-literals, as follows. Let φ be some i-literal in Φ containing a non-variable i-alien j-term t. Replace all occurrences of t in φ with a new j-variable z and add the equation z = t to Φ. Repeat until every literal in Φ is pure. The literals can then easily be partitioned into Φ1 and Φ2. It is easy to see that Φ is satisfiable if and only if Φ1 ∧ Φ2 is satisfiable.
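A minimal sketch of this purification loop (ours; the term encoding and the symbol-to-theory map are assumptions, and only equalities are shown since disequalities are handled identically). Terms are tuples whose head symbol determines the theory; non-variable alien subterms are pulled out and named by fresh variables with defining equations:

```python
import itertools

def purify(literals, theory_of):
    """Split equality literals into pure 1- and 2-literals by naming alien
    subterms. A term is ('var', name, i) or (f, arg1, ..., argn) with
    theory_of[f] = i; a literal's theory is that of its left term."""
    fresh = itertools.count()
    defs, out = [], {1: [], 2: []}

    def term_theory(t):
        return t[2] if t[0] == 'var' else theory_of[t[0]]

    def pure(t, i):
        if t[0] == 'var':
            return t                          # variables are always allowed
        j = theory_of[t[0]]
        if j != i:                            # alien: name it with a fresh j-variable z
            z = ('var', 'w%d' % next(fresh), j)
            defs.append((j, z, t))            # and record the equation z = t
            return z
        return (t[0],) + tuple(pure(a, i) for a in t[1:])

    for a, b in literals:
        i = term_theory(a)
        out[i].append((pure(a, i), pure(b, i)))
    while defs:                               # purify the defining equations too
        j, z, t = defs.pop()
        out[j].append((z, pure(t, j)))
    return out

# Example: purify f(x + 1) = x with f from theory 1 and +, 1 from theory 2.
theory_of = {'f': 1, 'plus': 2, 'one': 2}
x = ('var', 'x', 2)
res = purify([(('f', ('plus', x, ('one',))), x)], theory_of)
assert len(res[1]) == 1 and len(res[2]) == 1  # f(w0) = x  and  w0 = x + 1
```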
Now, let S be the set of all variables which appear in both Φ1 and Φ2. A simple version of the Nelson-Oppen procedure simply guesses an equivalence relation ∼ on S nondeterministically, and then checks whether Ti ∪ Φi ∪ A∼ is satisfiable for i = 1, 2. The correctness of the result is based on the following theorem from [11].
Theorem 2. Let T1 and T2 be two stably-infinite, signature-disjoint theories and let Φi be a set of pure i-literals for i = 1, 2. Let S be the set of variables which appear in both Φ1 and Φ2. Then T1 ∪ T2 ∪ Φ1 ∪ Φ2 is satisfiable iff there exists an arrangement Ar of S such that Ti ∪ Φi ∪ Ar is satisfiable for i = 1, 2.
5 Combining Nelson-Oppen and Shostak
5.1 A Variation of the Nelson-Oppen Procedure
The first step in the version of the Nelson-Oppen procedure described above changes the structure and number of literals in Φ. However, it is possible to give a version of the procedure which does not change the literals in Φ. This makes possible the combination of Shostak and Nelson-Oppen described next. First, a few more definitions are needed.
Let v be a mapping such that for i = 1, 2, each i-term t is mapped to a fresh i-variable v(t). Then, for some formula or term β, define γi(β) to be the result of replacing all i-alien instances of terms t by v(t). It is easy to see that as a result, γi(β) is i-pure. Note that since the γi operator simply replaces terms with unique place-holders, it is invertible. Also, it distributes over equality and Boolean operators. γi(t) is a somewhat cumbersome notation, but it allows us to be precise about the notion of treating alien terms as variables, which is a key part of Shostak's method for combining theories.
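A sketch of the γi operator (ours, with the same assumed term encoding as before: a term is ('var', name, i) or an application whose head symbol belongs to one theory). A global map memoizes the place-holder v(t) chosen for each alien term, so the same term is always replaced by the same variable; this is what makes the operator invertible. Variables are left in place, since a term whose alien subterms are all variables is already pure.

```python
v_cache = {}  # t -> place-holder v(t); shared across calls so gamma is stable

def gamma(i, t, theory_of):
    """Replace every topmost i-alien subterm of t by its place-holder v(t)."""
    if t[0] == 'var':
        return t
    j = theory_of[t[0]]
    if j != i:                                   # i-alien occurrence: replace it whole
        if t not in v_cache:
            v_cache[t] = ('var', 'v%d' % len(v_cache), j)
        return v_cache[t]
    return (t[0],) + tuple(gamma(i, a, theory_of) for a in t[1:])

theory_of = {'f': 1, 'plus': 2, 'one': 2}
x = ('var', 'x', 2)
t = ('f', ('plus', x, ('one',)))                 # f(x + 1)
g = gamma(1, t, theory_of)
assert g[0] == 'f' and g[1][0] == 'var'          # the 2-term x + 1 became a variable
assert gamma(1, t, theory_of) == g               # same term, same place-holder
assert gamma(2, ('plus', x, ('one',)), theory_of) == ('plus', x, ('one',))  # 2-pure
```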
Our variation on the Nelson-Oppen procedure works as follows. Given a set of literals, Φ, first partition Φ into two sets Φ1 and Φ2, where Φi is exactly the set of i-literals in Φ. Let S be the set of all terms which are i-alien (for some i) in some literal in Φ or in some sub-term of some literal in Φ. S will also be referred to as the set of shared terms. As before, an equivalence relation ∼ on S is guessed. If Ti ∪ γi(Φi ∪ A∼) is satisfiable for each i, then T ∪ Φ is satisfiable, as shown by the following theorem.
Theorem 3. Let T1 and T2 be two stably-infinite, signature-disjoint theories and let Φ be a set of literals in the combined signature Σ. If Φi is the set of all i-literals in Φ and S is the set of shared terms in Φ, then T1 ∪ T2 ∪ Φ is satisfiable iff there exists an equivalence relation ∼ on S such that for i = 1, 2, Ti ∪ γi(Φi ∪ A∼) is satisfiable.
Proof.
⇒: Suppose M, α ⊨ T ∪ Φ. Let a ∼ b iff a, b ∈ S and M, α ⊨ a = b. Then clearly for i = 1, 2, M, α ⊨ Ti ∪ Φi ∪ A∼. It is then easy to see that Ti ∪ γi(Φi ∪ A∼) is satisfiable by choosing a variable assignment which assigns to each variable v(t) the corresponding value of the term t which it replaces.
⇐: Suppose that for each i, Ti ∪ γi(Φi ∪ A∼) is satisfiable. Consider i = 1. Let Δ1 be the set of all equations v(t) = t, where t ∈ S is a 1-term. Consider γ1(Δ1). Since γ1 never replaces 1-terms and each v(t) is a new variable, it follows that γ1(Δ1) is in solved form, and its solitary variables are exactly the variables which are used to replace 1-terms. Thus, by Corollary 1, T1 ∪ γ1(Δ1) is satisfiable. Furthermore, since none of the solitary variables of γ1(Δ1) appear in γ1(Φ1 ∪ A∼), a satisfying assignment for T1 ∪ γ1(Δ1) can be constructed from the satisfying assignment for T1 ∪ γ1(Φ1 ∪ A∼) (which exists by hypothesis) so that the resulting assignment satisfies T1 ∪ γ1(Φ1 ∪ A∼ ∪ Δ1). Now, note that the equations in Δ1 are exactly those used if γ2 is applied to a set of expressions which are 1-pure. Thus T1 ∪ γ1(Φ1 ∪ A∼ ∪ Δ1) is equisatisfiable with T1 ∪ γ1(Φ1 ∪ Δ1) ∪ γ2(γ1(A∼)). Applying the same argument with i = 2, we conclude that T2 ∪ γ2(Φ2 ∪ Δ2) ∪ γ1(γ2(A∼)) is satisfiable. But for each i, γi(Φi ∪ Δi) is a set of pure i-literals. Furthermore, γ2(γ1(A∼)) is equivalent to γ1(γ2(A∼)) and is an arrangement of the variables shared by these two sets, so Theorem 2 can be applied to conclude that T ∪ Φ ∪ Δ1 ∪ Δ2, and thus T ∪ Φ, is satisfiable. ⊓⊔
5.2 Convexity and Stable-Infiniteness
In order to generalize Shostak's algorithm, we use the following result which relates convexity and stable-infiniteness.
Theorem 4. Every convex first-order theory with no trivial models is stably-infinite.
Proof. Suppose U is a first-order theory which is not stably-infinite. Then there exists some quantifier-free set of literals Φ which is satisfiable in a finite model of U, but not in an infinite model of U. Let ∃x̄.Φ be the existential closure of Φ. Then ∃x̄.Φ is true in some finite model, but not in any infinite model, of U. It follows that U ∪ {∃x̄.Φ} is a theory with no infinite models. By first-order compactness, there must be some finite cardinality n such that there is a model of U ∪ {∃x̄.Φ} of cardinality n, but none of cardinality larger than n. Clearly, U ∪ Φ is satisfiable in some model of size n, but not in any models larger than n. It follows by the pigeonhole principle that if yi, 0 ≤ i ≤ n, are fresh variables, then U ∪ Φ ⊨ ⋁_{i≠j} yi = yj, but because U has no trivial models (i.e. models of size 1), U ∪ Φ ⊭ yi = yj for any i, j with i ≠ j. Thus, U is not convex. ⊓⊔
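The pigeonhole step in the proof above can be checked by brute force for small n: over any domain of size n, every assignment to the n + 1 variables y0, ..., yn identifies some pair, while (for n ≥ 2) no single fixed pair is identified by all assignments. A quick exhaustive check (ours, for illustration):

```python
from itertools import product, combinations

def entails_some_equality(n):
    """Every assignment of y_0..y_n into a domain of size n equates some pair."""
    return all(any(a[i] == a[j] for i, j in combinations(range(n + 1), 2))
               for a in product(range(n), repeat=n + 1))

def entails_fixed_pair(n, i, j):
    """The single equality y_i = y_j holds under every assignment."""
    return all(a[i] == a[j] for a in product(range(n), repeat=n + 1))

for n in (2, 3, 4):
    assert entails_some_equality(n)                  # the disjunction is entailed
    assert not any(entails_fixed_pair(n, i, j)       # ... but no single disjunct is
                   for i, j in combinations(range(n + 1), 2))
```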
5.3 Combining the Methods
Suppose now that T1 is a Shostak theory and T2 is a convex theory, neither of which admits trivial models (typically, theories of interest do not admit trivial models, or can easily be modified so that this is the case). The above theorem implies that both theories are also stably-infinite. As a result, Theorem 3 can be applied to decide a combination of the two theories as follows.
Suppose Ψ is a set of Σ-literals. As in the previous section, divide Ψ into Ψ1 and Ψ2, where Ψi contains exactly the i-literals of Ψ. Let S be the set of shared terms. By Theorem 3, T ∪ Ψ1 ∪ Ψ2 is satisfiable iff there exists an equivalence relation ∼ such that for i = 1, 2, Ti ∪ γi(Ψi ∪ A∼) is satisfiable.
Let Γ be the set of all equalities in Ψ1 and Δ the set of disequalities in Ψ1. Furthermore, let Sat2 be a decision procedure for satisfiability of literals in T2:
Sat2(Φ) = TRUE iff T2 ∪ γ2(Φ) ⊭ false.
Algorithm S2 is a modification of algorithm S1 which accommodates the additional theory T2. Essentially, the algorithm is identical except for the addition of lines 3 through 5, which check whether Ψ2 is consistent in theory T2 with an arrangement A∼. The equivalence relation ∼ on S is defined in such a way that E∼ is consistent with A∼ by definition:

a ∼ b iff a, b ∈ S ∧ σ(E(γ1(a))) ≡ σ(E(γ1(b)))

Thus, when the algorithm returns TRUE, both Ψ1 and Ψ2 are known to be consistent with the arrangement A∼.
S2(Γ, Δ, σ, ω, Ψ2, Sat2)
1.  E := ∅;
2.  WHILE Γ ≠ ∅ OR ¬Sat2(Ψ2 ∪ A∼) DO BEGIN
3.    IF ¬Sat2(Ψ2 ∪ A∼) THEN BEGIN
4.      IF ¬Sat2(Ψ2 ∪ E∼) THEN RETURN FALSE;
5.      ELSE choose a ≠ b ∈ D∼ such that ¬Sat2(Ψ2 ∪ E∼ ∪ {a ≠ b});
6.    END ELSE Remove some equality a = b from Γ;
7.    a* := E(γ1(a)); b* := E(γ1(b));
8.    E* := ω(a* = b*);
9.    IF E* = {false} THEN RETURN FALSE;
10.   E := E*(E) ∪ E*;
11. END
12. IF a ∼ b for some a ≠ b ∈ Δ THEN RETURN FALSE;
13. RETURN TRUE;

Fig. 2. Algorithm S2: a generalization of Shostak's algorithm.
Line 5 requires a little explanation. If the algorithm reaches line 5, it means that Ψ2 ∪ E∼ ∪ D∼ is not satisfiable in T2, but Ψ2 ∪ E∼ is. It follows from the convexity of T2 that there must be a disequality a ≠ b in D∼ such that Ψ2 ∪ E∼ ∪ {a ≠ b} is not satisfiable in T2.
It is not hard to see that algorithm S2 must terminate. This is because each step terminates and in each iteration either the size of Γ is reduced by one or two equivalence classes in ∼ are merged. As before, the correctness proof requires a couple of preparatory lemmas.
Lemma 8. Suppose T1 is a Shostak theory with signature Σ1, E is a set of Σ1-formulas in solved form, S is a set of terms, and ∼ is defined as above. If ≈ is an equivalence relation on S such that T1 ∪ γ1(A≈) ∪ E is satisfiable, then E∼ ⊆ A≈.
Proof. Consider an arbitrary equation a = b between terms in S. a = b ∈ E∼ iff σ(E(γ1(a))) ≡ σ(E(γ1(b))) iff (by Lemma 2) T1 ∪ E ⊨ γ1(a = b). So γ1(a = b) must be true in every model and assignment satisfying T1 ∪ E. In particular, if T1 ∪ γ1(A≈) ∪ E is satisfiable, the corresponding model and assignment must also satisfy γ1(a = b). Since either the equation a = b or the disequation a ≠ b must be in A≈, it must be the case that a = b ∈ A≈. Thus, E∼ ⊆ A≈. ⊓⊔
Lemma 9. Let Γn and En be the values of Γ and E after the loop in algorithm S2 has been executed n times. Then for each n, the following invariant holds: T ∪ Ψ is satisfiable iff there exists an equivalence relation ∼ on S such that
(1) T1 ∪ γ1(Γn ∪ Δ ∪ A∼) ∪ En is satisfiable, and
(2) T2 ∪ γ2(Ψ2 ∪ A∼) is satisfiable.
Proof. The proof is by induction on n. For the base case, notice that by Theorem 3, T ∪ Ψ is satisfiable iff there exists an equivalence relation ∼ such that (1) and (2) hold with n = 0.
Before doing the induction case, we first show that for some fixed equivalence relation ∼, (1) and (2) hold when n = k iff (1) and (2) hold when n = k + 1. Notice that (2) is independent of n, so it is only necessary to consider (1). There are two cases to consider.
First, suppose that the condition of line 3 is true and line 5 is executed. We first show that (1) holds when n = k iff the following holds:
(3) T1 ∪ γ1(Γk+1 ∪ Δ ∪ A∼ ∪ {a = b}) ∪ Ek is satisfiable.
Since line 6 is not executed, Γk+1 = Γk. The if direction is then trivial since the formula in (1) is a subset of the formula in (3). To show the only if direction, first note that it follows from line 5 that T2 ∪ γ2(Ψ2 ∪ E∼) ⊨ γ2(a = b). But by Lemma 8, E∼ ⊆ A∼, so it follows that T2 ∪ γ2(Ψ2 ∪ A∼) ⊨ γ2(a = b). Since either a = b ∈ A∼ or a ≠ b ∈ A∼, it must be the case that a = b ∈ A∼, and thus (3) follows trivially from (1). Now, by Lemma 5 (where φ is false), (3) holds iff
(4) T1 ∪ γ1(Γk+1 ∪ Δ ∪ A∼) ∪ E*(Ek) ∪ E* is satisfiable,
where E* = ω(Ek(γ1(a = b))). But Ek+1 = E*(Ek) ∪ E*, so (4) is equivalent to (1) with n = k + 1.
In the other case, line 6 is executed (so that Γk+1 = Γk − {a = b}). Thus, (1) holds with n = k iff T1 ∪ γ1(Γk+1 ∪ Δ ∪ {a = b} ∪ A∼) ∪ Ek is satisfiable, which is equivalent to (3). As in the previous case, it then follows from Lemma 5 that (1) holds at k iff (1) holds at k + 1.
Thus, given an equivalence relation, (1) and (2) hold at k + 1 exactly when they hold at k. It follows easily that if an equivalence relation exists which satisfies (1) and (2) at k, then there exists an equivalence relation satisfying (1) and (2) at k + 1, and vice versa. Finally, the induction case assumes that T ∪ Ψ is satisfiable iff there exists an equivalence relation such that (1) and (2) hold at k. It follows from the above argument that T ∪ Ψ is satisfiable iff there exists an equivalence relation such that (1) and (2) hold at k + 1. ⊓⊔
Theorem 5. Suppose T1 is a Shostak theory with signature 1, canonizer ,
and solver !, and T2 is a convex theory with signature 2 disjoint from 1 and
satis ability procedure Sat 2 . Suppose also that neither T1 nor T2 admit trivial
models, and let T = T1 [ T2 and = 1 [ 2 . Suppose is a set of -literals.
Let ? be the subset of which consists of 1-equalities, the subset of which
consists of 1-disequalities, and 2 the remainder of the literals in . T [ is
satis able i
S2(?; ; ; !; 2 ; Sat 2 ) = TRUE.
Proof. First note that by the same argument used in Lemma 6, E is always in
solved form.
Suppose S2(Γ, Δ, Γ2, σ, solve, Sat2) = FALSE. If the algorithm terminates at line
9 or 12, then the proof that T ∪ Φ is unsatisfiable is the same as that for Algorithm
S1 above. If it stops at line 4, then suppose there is an equivalence relation
satisfying condition (1) of Lemma 9. It follows from Lemma 8 that E ⊨ A. But
since the algorithm terminates at line 4, T2 ∪ γ2(Γ2 ∪ A) must be unsatisfiable.
Thus condition (2) of Lemma 9 cannot hold. Thus, by Lemma 9, T ∪ Φ is
unsatisfiable.
Suppose on the other hand that S2(Γ, Δ, Γ2, σ, solve, Sat2) = TRUE. By the
definition of A and Lemma 2, a = b ∈ A iff T1 ∪ E ⊨ γ1(a = b). It follows from
the convexity of T1 and Corollary 1 that T1 ∪ E ∪ γ1(A) is satisfiable. It then
follows from the fact that S2 does not terminate at line 12 (as well as convexity
again) that T1 ∪ E ∪ γ1(Δ ∪ A) is satisfiable. This is condition (1) of Lemma 9.
Condition (2) must hold because the while loop terminates. Thus, by Lemma 9,
T ∪ Φ is satisfiable.
□
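The roles the solver and canonizer play in this proof — keeping E in solved form, and deciding entailed equalities by applying E and then canonizing — can be made concrete. The following sketch is ours, not Algorithm S2 itself: it instantiates these operations for a toy Shostak theory, linear arithmetic over the rationals, and the names `canon`, `solve`, `apply_subst`, and `extend` are invented for the illustration.

```python
from fractions import Fraction

# A term of the toy theory is a dict from variable names to
# coefficients; the key 1 holds the constant part.

def canon(t):
    """Canonizer: drop zero coefficients so that equal terms get
    syntactically identical representations."""
    return {k: v for k, v in t.items() if v != 0}

def subtract(a, b):
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0) - v
    return canon(out)

def solve(a, b):
    """Solver: rewrite a = b as x = t with x not occurring in t,
    or report that the equation is trivial or inconsistent."""
    d = subtract(a, b)                      # a - b = 0
    vars_ = [k for k in d if k != 1]
    if not vars_:
        return "TRUE" if not d else "FALSE"
    x = vars_[0]
    c = d.pop(x)
    return (x, canon({k: Fraction(-v) / c for k, v in d.items()}))

def apply_subst(E, t):
    """Replace variables solved in E by their right-hand sides."""
    out = {}
    for k, v in t.items():
        for k2, v2 in E.get(k, {k: 1}).items():
            out[k2] = out.get(k2, 0) + v * v2
    return canon(out)

def extend(E, lhs, rhs):
    """Process one input equality, keeping E in solved form."""
    r = solve(apply_subst(E, lhs), apply_subst(E, rhs))
    if r in ("TRUE", "FALSE"):
        return r
    x, t = r
    for y in E:                             # substitute x out of E
        E[y] = apply_subst({x: t}, E[y])
    E[x] = t
    return E
```

For example, after processing x = y + 1 and y = 2, applying E to x and canonizing yields the constant 3, so x = 3 is entailed; a trivially true or inconsistent input equality is reported directly by the solver.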
6 Comparison with Shostak's Original Method
There are two main ways in which this work differs from Shostak's original
method, which, as was mentioned, is best represented by Ruess and Shankar
in [9]. The first is in the set of requirements a theory must fulfill. The second is
in the level of abstraction at which the algorithm is presented.
6.1 Requirements on the Theory
Recall that the definition of a Shostak theory gave four requirements which must
be met. The first of these is simply that the theory contain no predicate symbols.
This is a minor point that is included only because Shostak's method does not
give any guidance on what to do if the theory includes predicate symbols. One
possible solution is to encode predicates as functions, but this only works if the
resulting encoding maintains the properties of canonizability and solvability.
The second requirement is that the theory be convex. This may seem overly
restrictive, since Shostak claims that non-convex theories can be handled [10].
Consider, however, a theory with exactly two elements in its domain and the
set of formulas {x ≠ y, y ≠ z, x ≠ z}. Clearly, this set of disequalities is unsatisfiable, but even if a canonizer and solver exist for the theory, Shostak's algorithm
will fail to detect the inconsistency. Ruess and Shankar avoid this difficulty by
restricting their attention to the problem of whether T ∪ Γ ⊨ a = b for some
set of equalities Γ. However, the ability to solve this problem does not lead to a
self-contained decision procedure unless the theory is convex.
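The counterexample can be checked mechanically. The following brute-force sketch (an illustration of the argument only; all names are ours) enumerates every interpretation of x, y, and z over a given finite domain:

```python
from itertools import product

# Disequalities from the counterexample: x != y, y != z, x != z.
disequalities = [("x", "y"), ("y", "z"), ("x", "z")]

def satisfiable_over(domain, diseqs):
    """Does some assignment of x, y, z to elements of `domain`
    satisfy every disequality in `diseqs`?"""
    variables = ["x", "y", "z"]
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(assignment[a] != assignment[b] for a, b in diseqs):
            return True
    return False

# Three variables cannot be pairwise distinct in a two-element
# domain (pigeonhole), but they can in a three-element domain.
print(satisfiable_over([0, 1], disequalities))     # prints False
print(satisfiable_over([0, 1, 2], disequalities))  # prints True
```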
The third requirement on the theory is that a canonizer exist. Shostak gave
five properties that must be satisfied by the canonizer. We have retained only one
of these. In fact, this is the only one that is needed at the level of abstraction of
our algorithms. However, efficient implementations typically require the additional
properties.
A similar situation arises with the requirements on the solver: only a subset
of the original requirements is needed. Note that we require the set
of equalities returned by the solver to be equisatisfiable in every model of T,
whereas Ruess and Shankar require only that it be equisatisfiable in every
σ-model; however, it is not difficult to show that their requirements on the canonizer imply
that every model of T must be a σ-model.
6.2 Level of Abstraction
Algorithm S2 looks very different from Shostak's original published algorithm as
well as most other published versions, though these are, in fact, closely related
to instances of S2. The most obvious difference is that while we leave T2 unspecified, in other work, T2 is always the theory of pure equality with uninterpreted
functions.
Additionally, Shostak incorporates several optimizations, which our algorithms would also benefit from. First of all, notice that Sat2 need only be called
if A changes. This can easily be tracked by maintaining a mapping from each
term t ∈ S to σ(E(γ1(t))).
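This bookkeeping can be sketched as follows (our illustration; the canonical forms are stood in for by opaque values, and `CachedSat2` is an invented name). The wrapper recomputes the arrangement induced by the canonical forms and re-invokes the underlying procedure only when that arrangement has changed:

```python
def arrangement(canon_of):
    """Partition of the shared terms induced by mapping each
    term to its canonical form."""
    classes = {}
    for term, form in canon_of.items():
        classes.setdefault(form, set()).add(term)
    return frozenset(frozenset(c) for c in classes.values())

class CachedSat2:
    """Call the underlying satisfiability procedure only when the
    arrangement of shared terms actually changes."""
    def __init__(self, sat2):
        self.sat2 = sat2
        self.last = None      # last arrangement passed to sat2
        self.result = True
        self.calls = 0        # number of real sat2 invocations

    def check(self, canon_of):
        a = arrangement(canon_of)
        if a != self.last:
            self.last = a
            self.calls += 1
            self.result = self.sat2(a)
        return self.result
```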
Another optimization is to attempt to reduce the number of shared terms,
thus reducing the size of the arrangement A. Rather than precomputing the
set, it can be computed incrementally as follows. Initially, the set of shared terms
S is the empty set. Then, before an equation is processed in line 7, each subterm
t in the equation is considered. If it can be replaced with a term u already in
S which is known to be equivalent, then all instances of t are replaced with u.
Otherwise, if t is i-alien for some i, it is added to S. It is clear that in an actual
implementation, it is desirable to have such optimizations. However, such details
naturally complicate the presentation and proof.
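The incremental computation of S just described can be sketched as follows (our illustration; in the real algorithm the equivalence test would consult E and the canonizer, whereas here a toy syntactic `canon` stands in for it):

```python
def intern_shared(S, index, term, canon):
    """Before an equation is processed, map each candidate shared
    subterm to an equivalent representative already in S if one is
    known; only otherwise enlarge S."""
    form = canon(term)
    if form in index:
        return index[form]    # reuse the existing representative u
    index[form] = term
    S.add(term)
    return term

# Toy equivalence: terms equal up to whitespace are "known equal".
S, index = set(), {}
whitespace_canon = lambda t: t.replace(" ", "")
r1 = intern_shared(S, index, "f(x)", whitespace_canon)
r2 = intern_shared(S, index, "f( x )", whitespace_canon)
# Both occurrences share one representative, so S stays small.
```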
7 Conclusions and Future Work
We have presented a simplified explanation of Shostak's algorithm, omitting
uninterpreted functions. It was then shown that by using the same reasoning as
that used to justify the Nelson-Oppen combination procedure, any convex theory
which satisfies the criteria for the Nelson-Oppen procedure can be combined with
the simple Shostak algorithm.
It is our hope that the insights presented in this paper will serve as a foundation for greater understanding and application of cooperating decision procedures.
Acknowledgments
This work was partially supported by National Science Foundation Grant
CCR-9806889 and ARPA/Air Force contract number F33615-00-C-1693.
References
1. Clark Barrett, David Dill, and Jeremy Levitt. Validity Checking for Combinations
of Theories with Equality. In M. Srivas and A. Camilleri, editors, Formal Methods
in Computer-Aided Design, volume 1166 of Lecture Notes in Computer Science,
pages 187–201. Springer-Verlag, 1996.
2. Clark W. Barrett, David L. Dill, and Aaron Stump. A Framework for Cooperating
Decision Procedures. In 17th International Conference on Automated Deduction,
Lecture Notes in Computer Science. Springer-Verlag, 2000.
3. N. Bjørner. Integrating Decision Procedures for Temporal Verification. PhD thesis,
Stanford University, 1999.
4. D. Cyrluk, P. Lincoln, and N. Shankar. On Shostak's Decision Procedure for Combinations of Theories. In M. McRobbie and J. Slaney, editors, 13th International
Conference on Automated Deduction, volume 1104 of Lecture Notes in Computer Science, pages 463–477. Springer-Verlag, 1996.
5. Z. Manna et al. STeP: Deductive-Algorithmic Verification of Reactive and Real-time Systems. In 8th International Conference on Computer-Aided Verification,
volume 1102 of Lecture Notes in Computer Science, pages 415–418. Springer-Verlag, 1996.
6. J. Levitt. Formal Verification Techniques for Digital Systems. PhD thesis, Stanford
University, 1999.
7. G. Nelson and D. Oppen. Simplification by Cooperating Decision Procedures.
ACM Transactions on Programming Languages and Systems, 1(2):245–257, 1979.
8. S. Owre, J. Rushby, and N. Shankar. PVS: A Prototype Verification System.
In D. Kapur, editor, 11th International Conference on Automated Deduction, volume 607 of Lecture Notes in Artificial Intelligence, pages 748–752. Springer-Verlag,
1992.
9. H. Ruess and N. Shankar. Deconstructing Shostak. In 16th Annual IEEE Symposium on Logic in Computer Science, pages 19–28, June 2001.
10. R. Shostak. Deciding Combinations of Theories. Journal of the Association for
Computing Machinery, 31(1):1–12, 1984.
11. C. Tinelli and M. Harandi. A New Correctness Proof of the Nelson-Oppen Combination Procedure. In F. Baader and K. Schulz, editors, 1st International Workshop
on Frontiers of Combining Systems (FroCoS'96), volume 3 of Applied Logic Series.
Kluwer Academic Publishers, 1996.
12. A. Tiwari. Decision Procedures in Automated Deduction. PhD thesis, State University of New York at Stony Brook, 2000.