SAT Techniques for Modal and Description Logics

2009
Handbook of Satisfiability
Armin Biere, Marijn Heule, Hans van Maaren and Toby Walsh (Eds.)
IOS Press, 2009
© 2009 Roberto Sebastiani and Armando Tacchella and IOS Press. All rights reserved.

Chapter 25

SAT Techniques for Modal and Description Logics

Roberto Sebastiani and Armando Tacchella

25.1. Introduction

In a nutshell, modal logics are propositional logics enriched with modal operators (such as □, ◇, □_i) which can represent complex facts like necessity, possibility, knowledge and belief. For instance, "□ϕ" and "◇ϕ" may represent "necessarily ϕ" and "possibly ϕ" respectively, whilst "□_1 □_2 ϕ" may represent the fact that agent 1 knows that agent 2 knows the fact ϕ. Description logics are extensions of propositional logic which build on entities, concepts (unary relations) and roles (binary relations), and which allow for representing complex concepts. For instance, the concept "male ∧ ∃Children (¬male ∧ teen)" represents the set of fathers who have at least one teenage daughter.

Research in modal and description logics followed two parallel routes until the seminal work by Schild [Sch91], who showed that the core modal logic K_m and the core description logic ALC are notational variants of each other, and that analogous frameworks, results and algorithms had been conceived in parallel in the two communities. Since then, analogous results have been produced for many other logics, so that nowadays the two communities have substantially merged into one research flow.

In the last two decades, modal and description logics have provided a theoretical framework for important applications in many areas of computer science, including artificial intelligence, formal verification, database theory, distributed computing and, more recently, the semantic web.
For this reason, the problem of automated reasoning in modal and description logics has been thoroughly investigated (see, e.g., [Fit83, Lad77, HM92, BH91, Mas00]), and many approaches have been proposed for (efficiently) handling the satisfiability of modal and description logics, with particular interest in the core logics K_m and ALC (see, e.g., [Fit83, BH91, GS00, HPS99, HS99, BGdR03, PSV02, SV06]). Moreover, a significant number of benchmark formulas have been produced for testing the effectiveness of the different techniques [HM92, GRS96, HS96, HPSS00, Mas99, PSS01, PSS03].

We briefly overview the main approaches for the satisfiability of modal and description logics which have been proposed in the literature. The "classic" tableau-based approach [Fit83, Lad77, HM92, Mas00] is based on the construction of propositional-tableau branches, which are recursively expanded on demand by generating successor nodes in a candidate Kripke model. In the DPLL-based approach [GS96a, SV98, GS00] a DPLL procedure, which treats the modal subformulas as propositions, is used as Boolean engine at each nesting level of the modal operators: when a satisfying assignment is found, the corresponding set of modal subformulas is recursively checked for modal consistency. Among the tools employing (and extending) this approach, we recall Ksat [GS96a, GGST00], *SAT [Tac99], Fact [Hor98b], Dlp [PS98], and Racer [HM01].¹ This approach has lately been exported into the context of Satisfiability Modulo Theories (SMT) [ACG00, ABC+02], giving rise to the so-called on-line lazy approach to SMT described in §26.4 (see also [Seb07]). The CSP-based approach [BGdR03] differs from the tableau-based and DPLL-based ones mostly in that a CSP engine is used instead of a tableau/DPLL engine. KCSP is the representative tool of this approach.
In the translational approach [HS99, AGHd00] the modal formula is encoded into first-order logic (FOL), and the encoded formula is then fed to a FOL theorem prover [AGHd00]. MSpass [HSW99] is the most representative tool of this approach. In the inverse-method approach, a search procedure is based on the inverted version of a sequent calculus [Vor99, Vor01] (which can be seen as a modalized version of propositional resolution [PSV02]). KK is the representative tool of this approach. In the automata-theoretic approach, or OBDD-based approach, (an OBDD-based symbolic representation of) a tree automaton accepting all the tree models of the input formula is implicitly built and checked for emptiness [PSV02, PV03]. KBDD [PV03] is the representative tool of this approach. [PV03] also presents an encoding of K-satisfiability into QBF-satisfiability (another PSPACE-complete problem) combined with the use of a state-of-the-art QBF solver. Finally, in the eager approach [SV06, SV08] K_m/ALC-formulas are encoded into SAT and then fed to a state-of-the-art SAT solver. K_m2SAT is the representative tool of this approach.

Most such approaches combine propositional reasoning with various techniques/encodings for handling the modalities, and thus are based on, or have largely benefited from, efficient propositional reasoning techniques. In particular, the adoption of DPLL as a core Boolean reasoning technique boosted the performance of the tools [GS96a, GS96b, Hor98b, PS98, HPS99, GGST00, HPSS00].

In this chapter we show how efficient Boolean reasoning techniques have been imported, used and integrated into reasoning tools for modal and description logics. To this end, we focus on modal logics, and in particular mainly on K_m.
Importantly, this chapter does not address the much more general issue of satisfiability in modal and description logics, because the reasoning techniques which are specific for the different modal and description logics are orthogonal to the issue of Boolean reasoning. We refer the reader to the bibliography presented above and to [BCM+03] for a detailed description of those topics.

¹ Notice that there is no universal agreement on the terminology "tableau-based" and "DPLL-based". E.g., tools like Fact, Dlp, and Racer are often called "tableau-based", although they use a DPLL-like algorithm instead of propositional tableaux for handling the propositional component of reasoning [Hor98b, PS98, HPS99, HM01], because many scientists in these communities consider DPLL as an optimized version of propositional tableaux. The same issue holds for the Boolean system KE [DM94] and its derived systems.

The chapter is organized as follows. In §25.2 we provide some background in modal logics. In §25.3 we describe a basic theoretical framework and we present and analyze the basic tableau-based and DPLL-based techniques. In §25.4 we present optimizations and extensions of the DPLL-based procedures. In §25.5 we present the automata-theoretic/OBDD-based approach. Finally, in §25.6 we present the eager approach.

25.2. Background

In this section we provide some background in modal logics. We refer the reader to, e.g., [Che80, Fit83, HM92] for a more detailed introduction.

25.2.1. The Modal Logic K_m

We start with some basic notions and notation (see, e.g., [Che80, Fit83, HM92] for more details). Given a non-empty set of primitive propositions A = {A_1, A_2, ...} and a set of m modal operators B = {□_1, ..., □_m}, let the language Λ_m be the least set of formulas containing A, closed under the set of propositional connectives {¬, ∧} and the set of modal operators in B. Notationally, we use capital letters A_i, B_i, ... to denote primitive propositions and Greek letters α_i, β_i, ϕ_i, ψ_i to denote formulas in Λ_m. We use the standard abbreviations, that is: "◇_r ϕ_1" for "¬□_r ¬ϕ_1", "ϕ_1 ∨ ϕ_2" for "¬(¬ϕ_1 ∧ ¬ϕ_2)", "ϕ_1 → ϕ_2" for "¬(ϕ_1 ∧ ¬ϕ_2)", "ϕ_1 ↔ ϕ_2" for "¬(ϕ_1 ∧ ¬ϕ_2) ∧ ¬(ϕ_2 ∧ ¬ϕ_1)", and "⊤" and "⊥" for the true and false constants respectively. Formulas like ¬¬ψ are implicitly assumed to be simplified into ψ; thus, if ψ is ¬φ, then by "¬ψ" we mean "φ". We often write

\[ \Big(\bigwedge_i l_i\Big) \to \bigvee_j l_j \quad\text{for the clause}\quad \bigvee_i \neg l_i \vee \bigvee_j l_j, \]

and

\[ \Big(\bigwedge_i l_i\Big) \to \Big(\bigwedge_j l_j\Big) \quad\text{for the conjunction of clauses}\quad \bigwedge_j \Big(\bigvee_i \neg l_i \vee l_j\Big). \]

We call depth of ϕ, written depth(ϕ), the maximum degree of nesting of modal operators in ϕ.

A K_m-formula is said to be in Negative Normal Form (NNF) if it is written in terms of the symbols □_r, ◇_r, ∧, ∨ and propositional literals A_i, ¬A_i (i.e., if all negations occur only before propositional atoms in A). Every K_m-formula ϕ can be converted into an equivalent one NNF(ϕ) by recursively applying the rewriting rules: ¬□_r ϕ ⟹ ◇_r ¬ϕ, ¬◇_r ϕ ⟹ □_r ¬ϕ, ¬(ϕ_1 ∧ ϕ_2) ⟹ (¬ϕ_1 ∨ ¬ϕ_2), ¬(ϕ_1 ∨ ϕ_2) ⟹ (¬ϕ_1 ∧ ¬ϕ_2), ¬¬ϕ ⟹ ϕ.

A K_m-formula is said to be in Box Normal Form (BNF) [PSV02, PV03] if it is written in terms of the symbols □_r, ¬□_r, ∧, ∨, and propositional literals A_i, ¬A_i (i.e., if there are no diamonds, and if all negations occur only before boxes or before propositional atoms in A). Every K_m-formula ϕ can be converted into an equivalent one BNF(ϕ) by recursively applying the rewriting rules: ◇_r ϕ ⟹ ¬□_r ¬ϕ, ¬(ϕ_1 ∧ ϕ_2) ⟹ (¬ϕ_1 ∨ ¬ϕ_2), ¬(ϕ_1 ∨ ϕ_2) ⟹ (¬ϕ_1 ∧ ¬ϕ_2), ¬¬ϕ ⟹ ϕ.

The basic normal modal logic K_m can be defined axiomatically as follows.
A formula in Λ_m is a theorem in K_m if it can be inferred from the following axiom schema:

\[ K.\quad (\Box_r \varphi_1 \wedge \Box_r(\varphi_1 \to \varphi_2)) \to \Box_r \varphi_2 \tag{25.1} \]

by means of tautological inference and of the application of the following inference rule:

\[ \frac{\varphi}{\Box_r \varphi}\ \ (\textit{Necessitation}), \tag{25.2} \]

for every formula ϕ, ϕ_1, ϕ_2 in Λ_m and for every □_r ∈ B. The rule Necessitation characterizes most modal logics; the axiom K characterizes the normal modal logics.

The semantics of modal logics is given by means of Kripke structures. A Kripke structure for K_m is a tuple M = ⟨U, π, R_1, ..., R_m⟩, where U is a set of states, π is a function π : A × U ↦ {True, False}, and each R_r is a binary relation on the states of U. With a little abuse of notation we write "u ∈ M" instead of "u ∈ U". We call a pair M, u a situation. The binary relation ⊨ between a modal formula ϕ and a pair M, u s.t. u ∈ M is defined as follows:

\[
\begin{array}{lcl}
M,u \models A_i,\ A_i \in \mathcal{A} & \iff & \pi(A_i,u) = \mathit{True};\\
M,u \models \neg\varphi_1 & \iff & M,u \not\models \varphi_1;\\
M,u \models \varphi_1 \wedge \varphi_2 & \iff & M,u \models \varphi_1 \text{ and } M,u \models \varphi_2;\\
M,u \models \Box_r \varphi_1,\ \Box_r \in \mathcal{B} & \iff & M,v \models \varphi_1 \text{ for every } v \in M \text{ s.t. } R_r(u,v) \text{ holds in } M.
\end{array}
\]

We extend the definition of ⊨ to formula sets µ = {ϕ_1, ..., ϕ_n} as follows:

\[ M,u \models \mu \iff M,u \models \varphi_i \text{ for every } \varphi_i \in \mu. \]

"M, u ⊨ ϕ" should be read as "M, u satisfies ϕ in K_m" (alternatively, "M, u K_m-satisfies ϕ"). We say that a formula ϕ ∈ Λ_m is satisfiable in K_m (K_m-satisfiable from now on) if and only if there exist M and u ∈ M s.t. M, u ⊨ ϕ. ϕ is valid for M, written M ⊨ ϕ, if M, u ⊨ ϕ for every u ∈ M. ϕ is valid for a class of Kripke structures K if M ⊨ ϕ for every M ∈ K. ϕ is said to be valid in K_m iff M ⊨ ϕ for every Kripke structure M. It can be proved that a Λ_m-formula ϕ is a theorem in K_m if and only if it is valid in K_m [Che80, Fit83, HM92].

When this causes no ambiguity we sometimes write "satisfiability" meaning "K_m-satisfiability".
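The NNF rewriting rules and the satisfaction relation of this subsection can be sketched on finite structures as follows. This is only an illustrative sketch: the tuple encoding of formulas, the function names `nnf` and `holds`, and the representation of π as a set of true (atom, state) pairs are assumptions made here, not notation from the chapter.

```python
# Hypothetical tuple encoding of formulas (an assumption of this sketch):
#   ("atom", name), ("not", f), ("and", f, g), ("or", f, g),
#   ("box", r, f), ("dia", r, f)   with r the index of the modal operator.

def nnf(f):
    """Apply the NNF rewriting rules until negations sit only on atoms."""
    op = f[0]
    if op == "atom":
        return f
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ("box", "dia"):
        return (op, f[1], nnf(f[2]))
    g = f[1]  # op == "not": inspect the negated subformula
    if g[0] == "atom":
        return f
    if g[0] == "not":  # not not phi  =>  phi
        return nnf(g[1])
    if g[0] in ("and", "or"):  # De Morgan rules
        dual = "or" if g[0] == "and" else "and"
        return (dual, nnf(("not", g[1])), nnf(("not", g[2])))
    # not box phi => dia not phi,  not dia phi => box not phi
    dual = "dia" if g[0] == "box" else "box"
    return (dual, g[1], nnf(("not", g[2])))


def holds(pi, rel, u, f):
    """Decide M, u |= f for a finite Kripke structure.

    pi  : set of (atom name, state) pairs on which the atom is True
    rel : dict mapping r to the set of (u, v) pairs of R_r
    """
    op = f[0]
    if op == "atom":
        return (f[1], u) in pi
    if op == "not":
        return not holds(pi, rel, u, f[1])
    if op == "and":
        return holds(pi, rel, u, f[1]) and holds(pi, rel, u, f[2])
    if op == "or":
        return holds(pi, rel, u, f[1]) or holds(pi, rel, u, f[2])
    successors = [v for (w, v) in rel.get(f[1], set()) if w == u]
    if op == "box":
        return all(holds(pi, rel, v, f[2]) for v in successors)
    if op == "dia":
        return any(holds(pi, rel, v, f[2]) for v in successors)
    raise ValueError(f"unknown connective: {op}")


# A three-state example: state 0 sees states 1 and 2 through R_1.
A, B = ("atom", "A"), ("atom", "B")
pi = {("A", 1), ("A", 2), ("B", 1)}
rel = {1: {(0, 1), (0, 2)}}
phi = ("not", ("box", 1, ("and", A, ("not", B))))
```

On this example `nnf(phi)` is the diamond formula over `¬A ∨ B`, and both `phi` and `nnf(phi)` hold at state 0, in line with the claim that NNF conversion yields an equivalent formula. A box formula holds vacuously at a state with no successors.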
If m = 1, we simply write "K" for "K_1".

The problem of determining the K_m-satisfiability of a K_m-formula ϕ is decidable and PSPACE-complete [Lad77, HM92], even when the language is restricted to a single Boolean atom (i.e., A = {A_1}) [Hal95]; if we impose a bound on the modal depth of the K_m-formulas, the problem becomes NP-complete [Hal95]. Intuitively, every satisfiable formula ϕ in K_m can be satisfied by a Kripke structure M which is a finite tree and whose depth is given by depth(ϕ) + 1 (i.e., s.t. |M| ≤ |ϕ|^depth(ϕ)). Such a structure can be spanned by an alternating and/or search procedure, similarly to what is done with QBF. An encoding of K_m-satisfiability into QBF is presented in [Lad77, HM92]. For a detailed description of K_m, including complexity results, we refer the reader to [Lad77, HM92, Hal95].

Table 25.1. Axiom schemata and corresponding properties of R_r for the normal modal logics.

Axiom schema               Property of R_r   Condition
B. ¬ϕ → □_r ¬□_r ϕ         symmetric         ∀u v. [R_r(u, v) ⟹ R_r(v, u)]
D. ¬□_r ⊥                  serial            ∀u. ∃v. [R_r(u, v)]
T. □_r ϕ → ϕ               reflexive         ∀u. [R_r(u, u)]
4. □_r ϕ → □_r □_r ϕ       transitive        ∀u v w. [R_r(u, v) and R_r(v, w) ⟹ R_r(u, w)]
5. ¬□_r ϕ → □_r ¬□_r ϕ     euclidean         ∀u v w. [R_r(u, v) and R_r(u, w) ⟹ R_r(v, w)]

Table 25.2. Properties of R_r for the various normal modal logics. The names between parentheses are the names by which each logic is commonly known. (For better readability, we omit the subscript "m" from the names of the logics.)

Logic L ∈ N (axiomatic characterization)        Corresponding properties of R_r (semantic characterization)
K                                               (none)
KB                                              symmetric
KD                                              serial
KT = KDT (T)                                    reflexive
K4                                              transitive
K5                                              euclidean
KBD                                             symmetric and serial
KBT = KBDT (B)                                  symmetric and reflexive
KB4 = KB5 = KB45                                symmetric and transitive
KD4                                             serial and transitive
KD5                                             serial and euclidean
KT4 = KDT4 (S4)                                 reflexive and transitive
KT5 = KBD4 = KBD5 = KBT4 = KBT5 = KDT5          reflexive, transitive and symmetric (equivalence)
  = KT45 = KBD45 = KBT45 = KDT45 = KBDT4
  = KBDT5 = KBDT45 (S5)
K45                                             transitive and euclidean
KD45                                            serial, transitive and euclidean

25.2.2. Normal Modal Logics

We consider the class N of the normal modal logics. We briefly recall some of the standard definitions and results for these logics (see, e.g., [Che80, Fit83, HM92, Hal95]).

Given the language Λ_m, the class of normal modal logics on Λ_m, N, can be described axiomatically as follows. The set of theorems in a logic L in N is the set of Λ_m-formulas which can be inferred by means of tautological inference and of the application of the Necessitation rule from the axiom schema K, plus a given subset of the axiom schemata {B, D, T, 4, 5} described in the left column of Table 25.1. A list of normal modal logics built by combining such axiom schemata is presented in the left column of Table 25.2. Notice that each logic L is named after the list of its (modal) axiom schemata, and that many logics are equivalent, so that we have only 15 distinct logics out of the 32 possible combinations.

From the semantic point of view, the logics L ∈ N differ from one another by imposing some restrictions on the relations R_r of the Kripke structures. As described in Table 25.1, each axiom schema in {B, D, T, 4, 5} corresponds to a property on R_r.
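The relational conditions of Table 25.1 can be checked directly on a finite relation. The following is a minimal sketch under the assumption that a relation R_r is given as a Python set of pairs; the function name `frame_properties` is hypothetical.

```python
def frame_properties(states, relation):
    """Return the names of the Table 25.1 properties that `relation` has.

    states   : finite set of states (the universe U)
    relation : set of (u, v) pairs, one accessibility relation R_r
    """
    R = set(relation)
    checks = {
        # forall u v. R(u, v) => R(v, u)
        "symmetric": all((v, u) in R for (u, v) in R),
        # forall u. exists v. R(u, v)
        "serial": all(any((u, v) in R for v in states) for u in states),
        # forall u. R(u, u)
        "reflexive": all((u, u) in R for u in states),
        # forall u v w. R(u, v) and R(v, w) => R(u, w)
        "transitive": all((u, w) in R
                          for (u, v) in R for (x, w) in R if x == v),
        # forall u v w. R(u, v) and R(u, w) => R(v, w)
        "euclidean": all((v, w) in R
                         for (u, v) in R for (x, w) in R if x == u),
    }
    return {name for name, ok in checks.items() if ok}


chain = {(0, 1), (1, 2)}                      # none of the five properties
kd45_like = {(0, 1), (1, 1)}                  # serial, transitive, euclidean
equivalence = {(u, v) for u in (0, 1) for v in (0, 1)}
```

Matching the returned set against the right column of Table 25.2 tells which logics a given finite frame is a frame for. For instance, the full relation on two states has all five properties, consistent with the S5 row.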
In each logic L, a formula ϕ can be satisfied only by Kripke structures whose relations R_r satisfy the properties corresponding to L, as described in Table 25.2. (E.g., ϕ is satisfiable in KD4 only by Kripke structures whose relations are both serial and transitive.) Consequently, ϕ is said to be valid in L if it is valid in the corresponding class of Kripke structures. For every L in N, it can be proved that a Λ_m-formula is a theorem in L if and only if it is valid in L [Che80, Fit83].

The problem of determining the satisfiability in L in N ("L-satisfiability" hereafter) of a Λ_m-formula ϕ is decidable for every L. The computational complexity of the problem depends on the logic L and on many other factors, including the maximum number m of distinct box operators, the maximum number |A| of distinct primitive propositions, and the maximum modal depth of the formulas (denoted by depth). In the general case, for most logics L ∈ N, L-satisfiability is PSPACE-complete; in some cases it may become NP-complete, if m = 1 (e.g., with K45, KD45, S5) or if depth is bounded (e.g., with K_m, T_m, K45_m, KD45_m, S5_m); in some cases it may even become PTIME-complete, if some of the features above combine with the fact that |A| is finite [Lad77, Hal95]. We refer the reader to [Lad77, HM92, Hal95, Ngu05] for a detailed description of these issues.

A labeled formula is a pair σ : ϕ, where ϕ is a formula in Λ and σ is a label (typically a sequence of integers) labeling a world in a Kripke structure for L. If Γ = {ϕ_1, ..., ϕ_n}, we write σ : Γ for {σ : ϕ_1, ..., σ : ϕ_n}. Intuitively, σ : ϕ means "the formula ϕ in the world σ". For every L ∈ N, [Fit83, Mas94, Mas00] give a notion of accessibility relation between labels and give the properties of these relations for the various logics L. Essentially, they mirror the accessibility relation between the worlds they label.

25.2.3. Non-normal Modal Logics

We now consider the class of classical (also known as non-normal) modal logics. We briefly recall some of the standard definitions and results for these logics (see, e.g., [Che80, FHMV95]).

Given the language Λ_m, the basic classical modal logic on Λ_m, E_m, can be defined axiomatically as follows. The theorems in E_m are the set of formulas in Λ_m which can be inferred by tautological inference and by the application of the inference rule:

\[ \frac{\varphi \leftrightarrow \psi}{\Box_r \varphi \leftrightarrow \Box_r \psi}\ \ (E). \tag{25.3} \]

As a consequence, the schemata

\[
\begin{array}{ll}
N. & \Box_r \top\\
M. & \Box_r(\varphi \wedge \psi) \to \Box_r \varphi\\
C. & (\Box_r \varphi \wedge \Box_r \psi) \to \Box_r(\varphi \wedge \psi)
\end{array}
\tag{25.4}
\]

which are theorems in K_m do not hold in E_m. The three principles N, M, and C enforce closure conditions on the set of provable formulas which are not always desirable, especially if the □_r operator has an epistemic (such as knowledge or belief) reading. If we interpret □_r ϕ as "a certain agent r believes ϕ", then N enforces that r believes all the logical truths, M that r's beliefs are closed under logical consequence, and C that r's beliefs are closed under conjunction. These three closure properties are different forms of omniscience and, as such, they might not be appropriate for modeling the beliefs of a real agent (see, e.g., [FHMV95]). By combining the schemata in (25.4) and using them as axiom schemata, we can get eight different combinations corresponding to eight distinct logics, where each logic is named after the list of its modal axiom schemata. The logic EMCN_m corresponds to the basic normal modal logic K_m.

The semantics of classical modal logics is given by means of Montague-Scott structures. A Montague-Scott structure for E_m is a tuple S = ⟨U, π, N_1, ..., N_m⟩, where U is a set of states, π is a function π : A × U ↦ {True, False}, and each N_r is a function N_r : U ↦ P(P(U)), i.e., for each u ∈ U, N_r(u) ⊆ P(U).
Notice that Montague-Scott structures are a generalization of Kripke structures, so the class of possible models for K_m is indeed a subclass of the possible models for E_m. In analogy with §25.2.1, we write "u ∈ S" instead of "u ∈ U", and we call S, u a situation. The binary relation ⊨ between a modal formula ϕ and a pair S, u s.t. u ∈ S is defined as follows:

\[
\begin{array}{lcl}
S,u \models A_i,\ A_i \in \mathcal{A} & \iff & \pi(A_i,u) = \mathit{True};\\
S,u \models \neg\varphi_1 & \iff & S,u \not\models \varphi_1;\\
S,u \models \varphi_1 \wedge \varphi_2 & \iff & S,u \models \varphi_1 \text{ and } S,u \models \varphi_2;\\
S,u \models \Box_r \varphi_1,\ \Box_r \in \mathcal{B} & \iff & \{v \mid S,v \models \varphi_1\} \in N_r(u).
\end{array}
\tag{25.5}
\]

We extend the definition of ⊨ to formula sets µ = {ϕ_1, ..., ϕ_n} as follows:

\[ S,u \models \mu \iff S,u \models \varphi_i \text{ for every } \varphi_i \in \mu. \]

"S, u ⊨ ϕ" should be read as "S, u satisfies ϕ in E_m" (alternatively, "S, u E_m-satisfies ϕ"). We say that a formula ϕ ∈ Λ_m is satisfiable in E_m (E_m-satisfiable from now on) if and only if there exist S and u ∈ S s.t. S, u ⊨ ϕ. ϕ is valid for S, written S ⊨ ϕ, if S, u ⊨ ϕ for every u ∈ S. ϕ is valid for a class of Montague-Scott structures C if S ⊨ ϕ for every S ∈ C. ϕ is said to be valid in E_m iff S ⊨ ϕ for every Montague-Scott structure S.

The semantics of the logic E is given by the relation defined in (25.5) only. The logics where one of M, C and N is an axiom require the following closure conditions on N_r to be satisfied for each r, where X and Y range over sets of states and U is the whole set of states:

(M) if X ⊆ Y and X ∈ N_r(w) then Y ∈ N_r(w) (closure by superset inclusion),
(C) if X ∈ N_r(w) and Y ∈ N_r(w) then X ∩ Y ∈ N_r(w) (closure by intersection),
(N) U ∈ N_r(w) (unit containment).

Notice that if an E_m structure S is such that N_r(u) satisfies all the above conditions for each world u ∈ S, then S is also a K_m structure. In analogy with §25.2.1, if m = 1, we simply write, e.g., "E" for "E_1", "EM" for "EM_1", and so on. For every non-normal logic L, it can be proved that a Λ_m-formula is a theorem in L if and only if it is valid in L [Che80, Fit83].
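The satisfaction relation (25.5) and the closure condition (M) can be illustrated on a small finite Montague-Scott structure. The sketch below reuses a hypothetical tuple encoding of formulas (`("atom", name)`, `("not", f)`, `("and", f, g)`, `("box", r, f)`); the names `holds_ms`, `superset_closed` and the dictionary `nbhd` for the functions N_r are assumptions of this example.

```python
def holds_ms(states, pi, nbhd, u, f):
    """Decide S, u |= f in a finite Montague-Scott structure.

    states : the set of states U
    pi     : set of (atom name, state) pairs on which the atom is True
    nbhd   : nbhd[r][u] is N_r(u), a set of frozensets of states
    """
    op = f[0]
    if op == "atom":
        return (f[1], u) in pi
    if op == "not":
        return not holds_ms(states, pi, nbhd, u, f[1])
    if op == "and":
        return (holds_ms(states, pi, nbhd, u, f[1])
                and holds_ms(states, pi, nbhd, u, f[2]))
    if op == "box":
        # The box holds iff the truth set of its argument is in N_r(u).
        truth_set = frozenset(v for v in states
                              if holds_ms(states, pi, nbhd, v, f[2]))
        return truth_set in nbhd[f[1]][u]
    raise ValueError(f"unknown connective: {op}")


def superset_closed(states, family):
    """Condition (M) on a finite N_r(w): adding any state to a member
    of the family must again give a member of the family."""
    return all(x | {s} in family for x in family for s in states)


# Two states; A holds only at 0, B only at 1; N_1(0) contains only the empty set.
A, B = ("atom", "A"), ("atom", "B")
states = {0, 1}
pi = {("A", 0), ("B", 1)}
nbhd = {1: {0: {frozenset()}, 1: set()}}
box_a_and_b = ("box", 1, ("and", A, B))
box_a = ("box", 1, A)
```

In this structure `box_a_and_b` holds at state 0 (the truth set of A ∧ B is empty, and the empty set is in N_1(0)) while `box_a` does not, so this instance of schema M fails. Accordingly, N_1(0) is not closed by superset inclusion.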
The problem of determining the E_m-satisfiability of an E_m-formula ϕ is decidable, and the E_m-satisfiability problem is NP-complete [Var89]. Satisfiability is also NP-complete in all the classical modal logics that do not contain C as an axiom (EM, EN, EMN), while it is PSPACE-complete in the remaining ones (EC, EMC, ECN, EMCN). The satisfiability problems maintain the same complexity classes when considering multi-agent extensions.

25.2.4. Modal Logics and Description Logics

The connection between modal logics and terminological logics (also known as description logics) is due to a seminal paper by Klaus Schild [Sch91], where the description logic ALC [SSS91] is shown to be a notational variant of the modal logic K_m. Here we survey some of the results of [Sch91], and we refer the reader to [BCM+03] for further reading about the current state of the art in modal and description logics.

Following [Sch91], we start by defining the language of ALC. Formally, the language ALC is defined by grammar rules of the form:

\[
\begin{array}{lcl}
C & \to & c \mid \top \mid C_1 \sqcap C_2 \mid \neg C \mid \forall R.C\\
R & \to & r
\end{array}
\tag{25.6}
\]

where C and C_i denote generic concepts, c denotes an atomic concept symbol and r a role symbol. The formal semantics of ALC is specified by an extension function. Let 𝒟 be any set, called the domain. An extension function ε over 𝒟 is a function mapping concepts to subsets of 𝒟 and roles to subsets of 𝒟 × 𝒟 such that

\[
\begin{array}{lcl}
\varepsilon[\top] & = & \mathcal{D}\\
\varepsilon[C \sqcap D] & = & \varepsilon[C] \cap \varepsilon[D]\\
\varepsilon[\neg C] & = & \mathcal{D} \setminus \varepsilon[C]\\
\varepsilon[\forall R.C] & = & \{d \in \mathcal{D} \mid \forall \langle d,e\rangle \in \varepsilon[R].\ e \in \varepsilon[C]\}
\end{array}
\tag{25.7}
\]

Using extension functions, we can define the semantic notions of subsumption, equivalence and coherence: D subsumes C, written ⊨ C ⊑ D, iff for each extension function ε, ε[C] ⊆ ε[D], whereas C and D are equivalent, written ⊨ C = D, iff for each extension function ε, ε[C] = ε[D]. Finally, C is coherent iff there is an extension function ε with ε[C] ≠ ∅.
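The extension equations (25.7) can be evaluated directly over a finite domain. The sketch below is illustrative only: the tuple encoding of concepts, the function names `ext` and `some`, and the toy domain are assumptions of this example. It evaluates a single given extension function, so a non-empty result witnesses coherence, whereas subsumption and equivalence quantify over all extension functions and are not decided by it.

```python
def ext(concept, domain, atoms, roles):
    """Compute the extension of `concept` following equations (25.7).

    concept : ("top",), ("atom", c), ("and", C, D), ("not", C), ("all", r, C)
    domain  : finite set of individuals
    atoms   : dict mapping each atomic concept symbol to a subset of domain
    roles   : dict mapping each role symbol to a set of (d, e) pairs
    """
    op = concept[0]
    if op == "top":
        return set(domain)
    if op == "atom":
        return set(atoms[concept[1]])
    if op == "and":
        return (ext(concept[1], domain, atoms, roles)
                & ext(concept[2], domain, atoms, roles))
    if op == "not":
        return set(domain) - ext(concept[1], domain, atoms, roles)
    if op == "all":
        inner = ext(concept[2], domain, atoms, roles)
        pairs = roles[concept[1]]
        return {d for d in domain
                if all(e in inner for (x, e) in pairs if x == d)}
    raise ValueError(f"unknown constructor: {op}")


def some(role, concept):
    """Existential restriction, written here as the abbreviation
    not-all-not; it is not a primitive of grammar (25.6)."""
    return ("not", ("all", role, ("not", concept)))


# Toy interpretation for the introductory example
# "male and some Children (not male and teen)".
domain = {"al", "bea", "cy"}
atoms = {"male": {"al", "cy"}, "teen": {"bea"}}
roles = {"Children": {("al", "bea"), ("cy", "al")}}
male, teen = ("atom", "male"), ("atom", "teen")
father_of_teen_daughter = ("and", male,
                           some("Children", ("and", ("not", male), teen)))
```

Under this interpretation the extension of `father_of_teen_daughter` is `{"al"}`, which is non-empty, so the concept is coherent. Individuals with no `Children` successors belong vacuously to every universal restriction on that role.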
The following result allows us to concentrate on any of the above notions without loss of generality:

Lemma 1. [Sch91] Subsumption, equivalence, and incoherence are log-space reducible to each other in any terminological logic comprising Boolean operations on concepts.

Viewing ALC from the modal logic perspective (see §25.2.1), atomic concepts can simply be expounded as atomic propositions, and can be interpreted as the set of states in which such propositions hold. In this case "∀." becomes a modal operator since it is applied to formulas. Thus, e.g., ¬c_1 ⊔ ∀r.(c_2 ⊓ c_3) can be expressed by the K_m-formula ¬A_1 ∨ □_r(A_2 ∧ A_3). The subformula □_r(A_2 ∧ A_3) is to be read as "agent r knows A_2 ∧ A_3", and means that in every state accessible for r, both A_2 and A_3 hold.² Actually:

• the domain of an extension function can be read as a set of states U,
• atomic concepts can be interpreted as the set of worlds in which they hold, if expounded as atomic formulas, and
• atomic roles can be interpreted as accessibility relations.

² Notice that we replaced primitive concepts c_i with i ∈ {1, 2, 3} with propositions A_i, assuming the obvious bijection between the two sets.

Hence ∀R.C can be expounded as "all states in which agent R knows proposition C" instead of "all objects for which all R's are in C". To establish the correspondence between ALC and K_m, consider the function f mapping ALC concepts to K_m-formulas with f(c_i) = A_i for i ∈ {1, 2, ...}, i.e., f maps concept symbols to primitive propositions, f(⊤) = ⊤, f(C ⊓ D) = f(C) ∧ f(D), f(¬C) = ¬f(C) and f(∀R.C) = □_R f(C). It can easily be shown by induction on the complexity of ALC-concepts that f is a linearly length-bounded isomorphism such that an ALC-concept C is coherent iff the K_m-formula f(C) is satisfiable. Formally:

Theorem 1.
[Sch91] ALC is a notational variant of the propositional modal logic Km , and satisfiability in Km has the same computational complexity as coherence in ALC. By this correspondence, several theoretical results for Km can easily be carried over to ALC. We immediately know, for example, that without loss of generality, any decision procedure for Km -satisfiability is also a decision procedure for ALC−coherence. There are other result (see, e.g., [BCM+ 03]) that link normal modal logics, as described in 25.2.2, to various description logics that extend ALC in several ways. According to these results, decision procedures for expressive description logics may be regarded as decision procedures for various normal modal logics, e.g., KD, T, B, and the other way round. 25.3. Basic Modal DPLL In this section we introduce the basic concepts of modal tableau-based and DPLLbased procedures, and we discuss their relation. 25.3.1. A Formal Framework Assume w.l.o.g. that the ✸r ’s are not part of the language (each ✸r ϕ can be rewritten into ¬✷r ¬ϕ). We call atom every formula that cannot be decomposed propositionally, that is, every formula whose main connective is not propositional. Examples of atoms are, A1 , Ai (propositional atoms), ✷1 (A1 ∨ ¬A2 ) and ✷2 (✷1 A1 ∨¬A2 ) (modal atoms). A literal is either an atom or its negation. Given a formula ϕ, an atom [literal] is a top-level atom [literal] for ϕ if and only if it occurs in ϕ and under the scope of no boxes. Atoms0 (ϕ) is the set of the top-level atoms of ϕ. We call a truth assignment µ for a formula ϕ a truth value assignment to all the atoms of ϕ. A truth assignment is total if it assigns a value to all atoms in ϕ, partial otherwise. Syntactically identical instances of the same atom are always assigned identical truth values; syntactically different atoms, e.g., ✷1 (ϕ1 ∨ ϕ2 ) ✐ ✐ ✐ ✐ ✐ ✐ “p02c11˙mod” — 2008/11/16 — 16:01 — page 790 — #10 ✐ 790 ✐ Chapter 25. 
and ✷1(ϕ2 ∨ ϕ1), are treated differently and may thus be assigned different truth values.

To this extent, we introduce a bijective function L2P (“L-to-Propositional”) and its inverse P2L := L2P⁻¹ (“Propositional-to-L”), s.t. L2P maps top-level Boolean atoms into themselves and top-level non-Boolean atoms into fresh Boolean atoms, so that two atom instances in ϕ are mapped into the same Boolean atom iff they are syntactically identical, and distributes with sets and Boolean connectives. (E.g., L2P({✷r ϕ1, ¬(✷r ϕ1 ∨ ¬A1)}) is {B1, ¬(B1 ∨ ¬A1)}.) L2P and P2L are also called Boolean abstraction and Boolean refinement respectively.

We represent a truth assignment µ for ϕ as a set of literals

\[
\begin{aligned}
\mu = \{\ & \Box_1\alpha_{11}, \ldots, \Box_1\alpha_{1N_1},\ \neg\Box_1\beta_{11}, \ldots, \neg\Box_1\beta_{1M_1}, \\
 & \ldots \\
 & \Box_m\alpha_{m1}, \ldots, \Box_m\alpha_{mN_m},\ \neg\Box_m\beta_{m1}, \ldots, \neg\Box_m\beta_{mM_m}, \\
 & A_1, \ldots, A_R,\ \neg A_{R+1}, \ldots, \neg A_S\ \},
\end{aligned}
\tag{25.8}
\]

✷r αi’s, ✷r βj’s being modal atoms and Ai’s being propositional atoms. Positive literals ✷r αi and Ak in µ mean that the corresponding atom is assigned to true, negative literals ¬✷r βi and ¬Ak mean that the corresponding atom is assigned to false. If µ2 ⊆ µ1, then we say that µ1 extends µ2 and that µ2 subsumes µ1. A restricted truth assignment

\[
\mu^r = \{\Box_r\alpha_{r1}, \ldots, \Box_r\alpha_{rN_r},\ \neg\Box_r\beta_{r1}, \ldots, \neg\Box_r\beta_{rM_r}\}
\tag{25.9}
\]

is given by restricting µ to the set of atoms in the form ✷r ψ, where 1 ≤ r ≤ m. Trivially µr subsumes µ. Notationally, we use the Greek letters µ, η to represent truth assignments. Sometimes we represent the truth assignments in (25.8) and (25.9) also as the formulas given by the conjunction of their literals:

\[
\begin{aligned}
\mu =\ & \bigwedge_{i=1}^{N_1}\Box_1\alpha_{1i} \wedge \bigwedge_{j=1}^{M_1}\neg\Box_1\beta_{1j}\ \wedge \\
 & \ldots \\
 & \bigwedge_{i=1}^{N_m}\Box_m\alpha_{mi} \wedge \bigwedge_{j=1}^{M_m}\neg\Box_m\beta_{mj}\ \wedge \\
 & \bigwedge_{k=1}^{R}A_k \wedge \bigwedge_{h=R+1}^{S}\neg A_h,
\end{aligned}
\tag{25.10}
\]

\[
\mu^r = \bigwedge_{i}\Box_r\alpha_{ri} \wedge \bigwedge_{j}\neg\Box_r\beta_{rj}.
\tag{25.11}
\]

For every logic L, we say that an assignment µ [restricted assignment µr] is L-satisfiable meaning that its corresponding formula (25.10) [(25.11)] is L-satisfiable.

We say that a total truth assignment µ for ϕ propositionally satisfies ϕ, written µ ⊨p ϕ, if and only if L2P(µ) ⊨ L2P(ϕ), that is, for all sub-formulas ϕ1, ϕ2 of ϕ:

µ ⊨p ϕ1, ϕ1 ∈ Atoms0(ϕ) ⇐⇒ ϕ1 ∈ µ,
µ ⊨p ¬ϕ1 ⇐⇒ µ ⊭p ϕ1,
µ ⊨p ϕ1 ∧ ϕ2 ⇐⇒ µ ⊨p ϕ1 and µ ⊨p ϕ2.

We say that a partial truth assignment µ propositionally satisfies ϕ if and only if all the total truth assignments for ϕ which extend µ propositionally satisfy ϕ. For instance, if ϕ = ✷1 ϕ1 ∨ ¬✷2 ϕ2, then the partial assignment µ = {✷1 ϕ1} is such that µ ⊨p ϕ. In fact, both {✷1 ϕ1, ✷2 ϕ2} and {✷1 ϕ1, ¬✷2 ϕ2} propositionally satisfy ϕ. Henceforth, if not otherwise specified, when dealing with propositional satisfiability we do not distinguish between assignments and partial assignments.

Intuitively, if we consider a formula ϕ as a propositional formula in its top-level atoms, then ⊨p is the standard satisfiability in propositional logic. Thus, for every ϕ1 and ϕ2, we say that ϕ1 ⊨p ϕ2 if and only if µ ⊨p ϕ2 for every µ s.t. µ ⊨p ϕ1. We say that ϕ is propositionally satisfiable if and only if there exists an assignment µ s.t. µ ⊨p ϕ. We also say that ⊨p ϕ (ϕ is propositionally valid) if and only if µ ⊨p ϕ for every assignment µ for ϕ. Thus ϕ1 ⊨p ϕ2 if and only if ⊨p ϕ1 → ϕ2, and ⊨p ϕ iff ¬ϕ is propositionally unsatisfiable.

Notice that ⊨p is stronger than ⊨, that is, if ϕ1 ⊨p ϕ2, then ϕ1 ⊨ ϕ2, but not vice versa. E.g., ✷r ϕ1 ∧ ✷r(ϕ1 → ϕ2) ⊨ ✷r ϕ2, but ✷r ϕ1 ∧ ✷r(ϕ1 → ϕ2) ⊭p ✷r ϕ2.

Example 1.
Consider the following K2 formula ϕ and its Boolean abstraction L2P(ϕ):

\[
\begin{array}{rcl}
\varphi &=& \{\neg\Box_1(\neg A_3 \vee \neg A_1 \vee A_2) \vee A_1 \vee A_5\} \\
 &\wedge& \{\neg A_2 \vee \neg A_5 \vee \Box_1(\neg A_2 \vee A_4 \vee A_5)\} \\
 &\wedge& \{A_1 \vee \Box_2(\neg A_4 \vee A_5 \vee A_2) \vee A_2\} \\
 &\wedge& \{\neg\Box_2(A_4 \vee \neg A_3 \vee A_1) \vee \neg\Box_1(A_4 \vee \neg A_2 \vee A_3) \vee \neg A_5\} \\
 &\wedge& \{\neg A_3 \vee A_1 \vee \Box_2(\neg A_4 \vee A_5 \vee A_2)\} \\
 &\wedge& \{\Box_1(\neg A_5 \vee A_4 \vee A_3) \vee \Box_1(\neg A_1 \vee A_4 \vee A_3) \vee \neg A_1\} \\
 &\wedge& \{A_1 \vee \Box_1(\neg A_2 \vee A_1 \vee A_4) \vee A_2\}
\end{array}
\]

\[
\begin{array}{rcl}
\mathit{L2P}(\varphi) &=& \{\neg B_1 \vee A_1 \vee A_5\} \\
 &\wedge& \{\neg A_2 \vee \neg A_5 \vee B_2\} \\
 &\wedge& \{A_1 \vee B_3 \vee A_2\} \\
 &\wedge& \{\neg B_4 \vee \neg B_5 \vee \neg A_5\} \\
 &\wedge& \{\neg A_3 \vee A_1 \vee B_3\} \\
 &\wedge& \{B_6 \vee B_7 \vee \neg A_1\} \\
 &\wedge& \{A_1 \vee B_8 \vee A_2\}
\end{array}
\]

The partial assignment µp = {B6, B8, ¬B1, ¬B5, B3, ¬A2} satisfies L2P(ϕ), so that the following assignment µ := P2L(µp) propositionally satisfies ϕ:

\[
\begin{array}{rcll}
\mu &=& \Box_1(\neg A_5 \vee A_4 \vee A_3) \wedge \Box_1(\neg A_2 \vee A_1 \vee A_4)\ \wedge & [\bigwedge_i \Box_1\alpha_{1i}] \\
 & & \neg\Box_1(\neg A_3 \vee \neg A_1 \vee A_2) \wedge \neg\Box_1(A_4 \vee \neg A_2 \vee A_3)\ \wedge & [\bigwedge_j \neg\Box_1\beta_{1j}] \\
 & & \Box_2(\neg A_4 \vee A_5 \vee A_2)\ \wedge & [\bigwedge_i \Box_2\alpha_{2i}] \\
 & & \neg A_2. & [\bigwedge_k A_k \wedge \bigwedge_h \neg A_h]
\end{array}
\]

µ gives rise to two restricted assignments µ1 and µ2:

\[
\begin{array}{rcll}
\mu^1 &=& \Box_1(\neg A_5 \vee A_4 \vee A_3) \wedge \Box_1(\neg A_2 \vee A_1 \vee A_4)\ \wedge & [\bigwedge_i \Box_1\alpha_{1i}] \\
 & & \neg\Box_1(\neg A_3 \vee \neg A_1 \vee A_2) \wedge \neg\Box_1(A_4 \vee \neg A_2 \vee A_3) & [\bigwedge_j \neg\Box_1\beta_{1j}] \\
\mu^2 &=& \Box_2(\neg A_4 \vee A_5 \vee A_2) & [\bigwedge_i \Box_2\alpha_{2i}].
\end{array}
\]

We say that a collection M := {µ1, ..., µn} of (possibly partial) assignments propositionally satisfying ϕ is complete if and only if, for every total assignment η s.t. η ⊨p ϕ, there exists µj ∈ M s.t. µj ⊆ η. Intuitively, M can be seen as a compact representation of the whole set of total assignments propositionally satisfying ϕ.

Proposition 1. [SV98] Let ϕ be a formula and let M := {µ1, ..., µn} be a complete collection of truth assignments propositionally satisfying ϕ. Then, for every L, ϕ is L-satisfiable if and only if µj is L-satisfiable for some µj ∈ M.

We also notice the following fact.

Proposition 2. [Seb01] Let α be a non-Boolean atom occurring only positively [resp. negatively] in ϕ.
Let M be a complete set of assignments satisfying ϕ, and let M′ := {µj \ {¬α} | µj ∈ M} [resp. {µj \ {α} | µj ∈ M}]. Then (i) for every µ′j ∈ M′, µ′j ⊨p ϕ, and (ii) ϕ is L-satisfiable if and only if there exists an L-satisfiable µ′j ∈ M′.

Proposition 1 shows that the L-satisfiability of a formula can be reduced to that of a complete collection of sets of literals (assignments), for every L. Proposition 2 says that, if we have non-Boolean atoms occurring only positively [resp. negatively] in the input formula, we can safely drop every negative [resp. positive] occurrence of them from all assignments in a complete set M, preserving the completeness of M.

In general, L-satisfiability of a conjunction of literals depends on L [Fit83, Che80]. The following propositions give a recursive definition for Km.

Proposition 3. [GS00] The truth assignment µ of Equation (25.10) is Km-satisfiable if and only if the restricted truth assignment µr of Equation (25.11) is Km-satisfiable, for all ✷r’s.³

Proposition 4. [GS00] The restricted assignment µr of Equation (25.11) is Km-satisfiable if and only if the formula

\[
\varphi^{rj} = \bigwedge_{i}\alpha_{ri} \wedge \neg\beta_{rj}
\tag{25.12}
\]

is Km-satisfiable, for every ¬✷r βrj occurring in µr.

Notice that Propositions 3 and 4 can be merged into one single theorem stating that µ is Km-satisfiable if and only if ϕrj is Km-satisfiable, for all r and j. Notice furthermore that the depth of every ϕrj is strictly smaller than the depth of ϕ.

³ Notice that the component $\bigwedge_{k=1}^{R} A_k \wedge \bigwedge_{h=R+1}^{S} \neg A_h$ in (25.10) is consistent because µ is a truth assignment.

Example 2. Consider the formula ϕ and the assignments µ, µ1 and µ2 in Example 1. µ propositionally satisfies ϕ. Thus, by Proposition 1, ϕ is Km-satisfiable if µ is Km-satisfiable.
By Proposition 3, µ is Km-satisfiable if and only if both µ1 and µ2 are Km-satisfiable; by Proposition 4, µ2 is trivially Km-satisfiable, as it contains no negated boxes, and µ1 is Km-satisfiable if and only if each of the formulas

\[
\begin{array}{rcl}
\varphi^{11} &=& \bigwedge_i \alpha_{1i} \wedge \neg\beta_{11} = (\neg A_5 \vee A_4 \vee A_3) \wedge (\neg A_2 \vee A_1 \vee A_4) \wedge A_3 \wedge A_1 \wedge \neg A_2, \\
\varphi^{12} &=& \bigwedge_i \alpha_{1i} \wedge \neg\beta_{12} = (\neg A_5 \vee A_4 \vee A_3) \wedge (\neg A_2 \vee A_1 \vee A_4) \wedge \neg A_4 \wedge A_2 \wedge \neg A_3
\end{array}
\]

is Km-satisfiable. As they both are satisfiable propositional formulas, ϕ is Km-satisfiable.

Proposition 1 reduces the L-satisfiability of a formula ϕ to the L-satisfiability of a complete collection of its truth assignments, for every L. If L is Km, Propositions 3 and 4 show how to reduce the latter to the Km-satisfiability of formulas of smaller depth. This process can be applied recursively, decreasing the depth of the formula considered at each iteration.

Following these observations, it is possible to test the Km-satisfiability of a formula ϕ by implementing a recursive alternation of two basic steps [GS96a, GS96b]:

1. Propositional reasoning: using some procedure for propositional satisfiability, find a truth assignment µ for ϕ s.t. µ ⊨p ϕ;
2. Modal reasoning: check the Km-satisfiability of µ by generating the corresponding restricted assignments µr’s and formulas ϕrj’s.

The two steps recurse down until we get to a truth assignment with no modal atoms. At each level, the process is repeated until either a Km-satisfiable assignment is found (in which case ϕ is Km-satisfiable) or no more assignments are found (in which case ϕ is not Km-satisfiable).

25.3.2. Modal Tableaux

We call “tableau-based” a system that implements and extends to other logics Smullyan’s propositional tableau calculus, as defined in [Smu68]. Tableau-based procedures basically consist of a control strategy applied on top of a tableau framework.
By tableau framework for modal logics we denote a refutation formal system extending Smullyan’s propositional tableau with rules handling the modal operators (modal rules). Thus, for instance, in our terminology Kris [BH91, BFH+94], Crack [BFT95] and LWB [HJSS96] are tableau-based systems.

For instance, in the labeled tableau framework for normal modal logics in N described in [Fit83, Mas94, Mas00], branches are represented as sets of labeled formulas u : ψ, where u labels the state in which the formula ψ has to be satisfiable. At the first step the root 1 : ϕ is created, ϕ being the modal formula to be proved (un)satisfiable. At the i-th step, a branch is expanded by applying to a chosen labeled formula the rule corresponding to its main connective, and adding the resulting labeled formula to the branch. The rules are the following:⁴

\[
\frac{u : (\varphi_1 \wedge \varphi_2)}{u : \varphi_1,\ u : \varphi_2}\ (\wedge)
\qquad
\frac{u : (\varphi_1 \vee \varphi_2)}{u : \varphi_1 \quad\mid\quad u : \varphi_2}\ (\vee),
\tag{25.13}
\]

\[
\frac{u : \neg\Box_r\varphi}{u' : \neg\varphi}\ (\neg\Box_r)
\qquad
\frac{u : \Box_r\varphi}{u'' : \varphi}\ (\Box_r).
\tag{25.14}
\]

The modal rules are constrained by the following applicability conditions:

• ¬✷r-rule: u′ is a new state (u′ is said to be directly accessible from u);
• ✷r-rule: u′′ is an existing state which is accessible from u via Rr.

Distinct logics L differ for different notions of accessibility in the ✷r-rule [Fit83, Mas94, Mas00]. Every application of the ∨-rule splits the branch into two sub-branches. A branch is closed when a formula ψ and its negation ¬ψ occur in it. The procedure stops when all branches are closed (ϕ is L-unsatisfiable) or no more rule is applicable (ϕ is L-satisfiable).

⁴ Analogous rules handling negated ∧’s and ∨’s, double negations ¬¬, single and double implications → and ↔, diamonds ✸r, n-ary ∧’s and ∨’s, and the negation of all of them, can be derived straightforwardly, and are thus omitted here. Following [Fit83], the ∧-, ∨-, ¬✷r- and ✷r-rules (and those for their equivalent operators) are often called α-, β-, π-, and ν-rules respectively.

For some modal logics it is possible to drop labels by using alternative sets of non-labeled modal rules [Fit83]. For instance, in Km it is possible to use unlabeled formulas and update branches according to the rules

\[
\frac{\Gamma,\ \varphi_1 \wedge \varphi_2}{\Gamma,\ \varphi_1,\ \varphi_2}\ (\wedge)
\qquad
\frac{\Gamma,\ \varphi_1 \vee \varphi_2}{\Gamma,\ \varphi_1 \quad\mid\quad \Gamma,\ \varphi_2}\ (\vee)
\tag{25.15}
\]

\[
\frac{\mu}{\alpha_1 \wedge \ldots \wedge \alpha_m \wedge \neg\beta_j}\ (\Box_r/\neg\Box_r)
\tag{25.16}
\]

for each box-index r ∈ {1, ..., m}. Γ is an arbitrary set of formulas, and µ is a set of literals which includes ¬✷r βj and whose only positive ✷r-atoms are ✷r α1, ..., ✷r αm.

This describes the tableau-based decision procedure of Figure 25.1, which is the restriction to Km of the basic version of the Kris procedure described in [BH91]. Tableau-based formalisms for many modal logics are described, e.g., in [Fit83, Mas94]. Tableau-based procedures for many modal logics are described, e.g., in [BH91, BFH+94, BFT95, HJSS96].

25.3.3. From Modal Tableaux to Modal DPLL

We call “DPLL-based” any system that implements and extends to other logics the Davis-Putnam-Logemann-Loveland procedure (DPLL) [DP60, DLL62]. DPLL-based procedures basically consist of the combination of a procedure handling the purely-propositional component of reasoning, typically a variant of the DPLL algorithm, and some procedure handling the purely-modal component, typically consisting of a control strategy applied on top of modal tableau rules.
function Km-Tableau(Γ)
    if ψi ∈ Γ and ¬ψi ∈ Γ                          /* branch closed */
        then return False;
    if (ϕ1 ∧ ϕ2) ∈ Γ                               /* ∧-elimination */
        then return Km-Tableau(Γ ∪ {ϕ1, ϕ2} \ {(ϕ1 ∧ ϕ2)});
    if (ϕ1 ∨ ϕ2) ∈ Γ                               /* ∨-elimination */
        then return Km-Tableau(Γ ∪ {ϕ1} \ {(ϕ1 ∨ ϕ2)})
                 or Km-Tableau(Γ ∪ {ϕ2} \ {(ϕ1 ∨ ϕ2)});
    for every r ∈ {1, ..., m} do
        for every ¬✷r βj ∈ Γ do                     /* branch expanded */
            if not Km-Tableau({¬βj} ∪ ⋃_{✷r αi ∈ Γ} {αi})
                then return False;
    return True;

Figure 25.1. An example of a tableau-based procedure for Km. We omit the steps for the other operators.

Thus, for instance, in our terminology Ksat [GS96a, GS00], Fact [Hor98b, Hor98a], Dlp [PS98], Racer [HM01] are DPLL-based systems.⁵

From a purely-logical viewpoint, it is possible to conceive a DPLL-based framework by substituting the propositional tableaux rules with some rules implementing the DPLL algorithms in a tableau-based framework [SV98]. For instance, one can conceive a DPLL-based framework for a normal logic L from Fitting or Massacci’s frameworks (see §25.3.2) by substituting the ∨-rule (25.13) with the following rules:

\[
\frac{u : (l \vee C)}{u : l \quad\mid\quad u : \neg l}\ (\mathit{Branch})
\qquad
\frac{u : l \qquad u : (\neg l \vee C)}{u : C}\ (\mathit{Unit}),
\tag{25.17}
\]

where l is a literal and C is a disjunction of literals.⁶ More recent and richer formal frameworks for representing DPLL and DPLL-based procedures are described in [Tin02, NOT06].

As stated in §25.3.2, for some modal logics it is possible to drop labels by using alternative sets of non-labeled modal rules [Fit83]. If so, DPLL-based procedures can be implemented more straightforwardly. For instance, in Km it is possible to use unlabeled formulas and update branches according to the following rules [SV98]:

\[
\frac{\varphi}{\mu_1 \quad\mid\quad \mu_2 \quad\mid\quad \ldots \quad\mid\quad \mu_n}\ (\mathit{DPLL})
\qquad
\frac{\mu}{\alpha_1 \wedge \ldots \wedge \alpha_m \wedge \neg\beta_j}\ (\Box_r/\neg\Box_r)
\tag{25.18}
\]

where the ✷r/¬✷r-rule is that of (25.16), and {µ1, ..., µn} is a complete set of assignments for ϕ, which can be produced by the DPLL algorithm.

⁵ See footnote 1 in §25.1.
⁶ Here we assume for simplicity that the input formula is in conjunctive normal form (CNF). Equivalent formalisms are available for non-CNF formulas [DM94, Mas98].

function Ksat(ϕ)
    return KsatF(ϕ, ⊤);

function KsatF(ϕ, µ)
    if ϕ = ⊤                                        /* base */
        then return KsatA(µ);
    if ϕ = ⊥                                        /* backtrack */
        then return False;
    if {a unit clause (l) occurs in ϕ}              /* unit */
        then return KsatF(assign(l, ϕ), µ ∧ l);
    l := choose-literal(ϕ);                         /* split */
    return KsatF(assign(l, ϕ), µ ∧ l) or
           KsatF(assign(¬l, ϕ), µ ∧ ¬l);

/* µ is ⋀i ✷1 α1i ∧ ⋀j ¬✷1 β1j ∧ ... ∧ ⋀i ✷m αmi ∧ ⋀j ¬✷m βmj ∧ ⋀k Ak ∧ ⋀h ¬Ah */
function KsatA(µ)
    for each box index r ∈ {1...m} do
        if not KsatAR(⋀i ✷r αri ∧ ⋀j ¬✷r βrj)
            then return False;
    return True;

/* µr is ⋀i ✷r αri ∧ ⋀j ¬✷r βrj */
function KsatAR(µr)
    for each literal ¬✷r βrj ∈ µr do
        if not Ksat(⋀i αri ∧ ¬βrj)
            then return False;
    return True;

Figure 25.2. The basic version of the Ksat algorithm.

25.3.4. Basic Modal DPLL for Km

The ideas described in §25.3.3 were implemented in the Ksat procedure [GS96a, GS00], whose basic version is reported in Figure 25.2. This schema evolved from that of the PTAUT procedure in [AG93], and is based on the “classic” DPLL procedure [DP60, DLL62].

Ksat takes in input a modal formula ϕ and returns a truth value asserting whether ϕ is Km-satisfiable or not. Ksat invokes KsatF (where “F” stands for “Formula”), passing as arguments ϕ and (by reference) the empty assignment ⊤. KsatF tries to build a Km-satisfiable assignment µ propositionally satisfying ϕ. This is done recursively, according to the following steps:

• (base) If ϕ = ⊤, then µ satisfies ϕ. Thus, if µ is Km-satisfiable, then ϕ is Km-satisfiable. Therefore KsatF invokes KsatA(µ) (where “A” stands for “Assignment”), which returns a truth value asserting whether µ is Km-satisfiable or not.
• (backtrack) If ϕ = ⊥, then µ does not satisfy ϕ, so that KsatF returns False.
• (unit) If a literal l occurs in ϕ as a unit clause, then l must be assigned ⊤.⁷ To obtain this, KsatF is invoked recursively with arguments the formula returned by assign(l, ϕ) and the assignment obtained by adding l to µ. assign(l, ϕ) substitutes every occurrence of l in ϕ with ⊤ and evaluates the result.
• (split) If none of the above situations occurs, then choose-literal(ϕ) returns an unassigned literal l according to some heuristic criterion. Then KsatF is first invoked recursively with arguments assign(l, ϕ) and µ ∧ l. If the result is negative, then KsatF is invoked with arguments assign(¬l, ϕ) and µ ∧ ¬l.

⁷ A notion of unit clause for non-CNF propositional formulas is given in [AG93].

KsatF is a variant of the “classic” DPLL algorithm [DP60, DLL62]. The KsatF schema differs from that of classic DPLL by only two steps.

The first difference is the “base” case: when finding an assignment µ which propositionally satisfies the input formula, classic DPLL simply returns True. KsatF instead is supposed also to check the Km-satisfiability of the corresponding set of literals, by invoking KsatA on µ. If the latter returns True, then the whole formula is satisfiable and KsatF returns True as well; otherwise, KsatF backtracks and looks for the next assignment.

The second difference is in the fact that in KsatF the pure-literal step [DLL62] is removed.⁸ In fact the sets of assignments generated by DPLL with pure-literal might be incomplete and might cause incorrect results, as shown by the following example.

Example 3. Let ϕ be the following formula:

(✷1 A1 ∨ A1) ∧ (✷1(A1 → A2) ∨ A2) ∧ (¬✷1 A2 ∨ A2) ∧ (¬A2 ∨ A3) ∧ (¬A2 ∨ ¬A3).
ϕ is Km-satisfiable, because µ = {A1, ¬A2, ✷1(A1 → A2), ¬✷1 A2} is an assignment which propositionally satisfies ϕ and which is also modally consistent. It is easy to see that no satisfiable assignment propositionally satisfying ϕ assigns ✷1 A1 to true. As ✷1 A1 occurs only positively in ϕ, DPLL with the pure literal rule would assign ✷1 A1 to true as first step, which would lead the procedure to return False.

With these simple modifications, the embedded DPLL procedure works as an enumerator of a complete set of assignments, whose Km-satisfiability is recursively checked by KsatA. KsatA(µ) invokes KsatAR(µr) (where “AR” stands for “Restricted Assignment”) for every box index r. This is repeated until either KsatAR returns a negative value (in which case KsatA(µ) returns False) or no more ✷r’s are available (in which case KsatA(µ) returns True). KsatAR(µr) invokes Ksat(ϕrj) for any conjunct ¬✷r βrj occurring in µr. Again, this is repeated until either Ksat returns a negative value (in which case KsatAR(µr) returns False) or no more ¬✷r βrj’s are available (in which case KsatAR(µr) returns True).

Notice that KsatF, KsatA and KsatAR are a direct implementation of Propositions 1, 3 and 4, respectively. This guarantees their correctness and completeness.

⁸ Alternatively, the application of the pure-literal rule can be restricted to atomic propositions only.

25.3.5. Modal DPLL vs. Modal Tableaux

[GS96a, GS96b, GS00, GGST98, GGST00, HPS99, HPSS00] presented extensive empirical comparisons, in which DPLL-based procedures outperformed tableau-based ones, with performance gaps that can reach orders of magnitude. (Similar performance gaps between tableau-based and DPLL-based procedures were later obtained also in a completely different context [ACG00].)
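For concreteness, the basic Ksat schema of Figure 25.2, on which these comparisons rest, can be sketched in Python. This is only an illustrative sketch under several assumptions that are not part of the original: formulas are encoded as nested tuples, the input need not be in CNF, the unit step and the choose-literal heuristic are replaced by a plain split on the first top-level atom found, and all function names are hypothetical.

```python
# Formulas are nested tuples (an encoding assumed for this sketch only):
#   "A1"                 propositional atom
#   ("box", r, f)        modal atom: box_r f
#   ("not", f), ("and", f1, ..., fn), ("or", f1, ..., fn)
TRUE, FALSE = ("true",), ("false",)

def is_atom(f):
    return isinstance(f, str) or f[0] == "box"

def first_atom(f):
    """Return some top-level atom of f (boxes are not entered), or None."""
    if is_atom(f):
        return f
    for g in f[1:]:
        a = first_atom(g)
        if a is not None:
            return a
    return None

def assign(f, atom, value):
    """Replace the top-level occurrences of atom by value, then simplify."""
    if f == atom:
        return TRUE if value else FALSE
    if is_atom(f) or f in (TRUE, FALSE):
        return f
    if f[0] == "not":
        g = assign(f[1], atom, value)
        return FALSE if g == TRUE else TRUE if g == FALSE else ("not", g)
    absorbing, neutral = (FALSE, TRUE) if f[0] == "and" else (TRUE, FALSE)
    args = [assign(g, atom, value) for g in f[1:]]
    if absorbing in args:
        return absorbing
    args = [g for g in args if g != neutral]
    if not args:
        return neutral
    return args[0] if len(args) == 1 else (f[0], *args)

def ksat(f):
    return ksat_f(f, [])

def ksat_f(f, mu):
    if f == TRUE:                      # base: mu propositionally satisfies the input
        return ksat_a(mu)
    if f == FALSE:                     # backtrack
        return False
    a = first_atom(f)                  # split (unit step and heuristics omitted)
    return (ksat_f(assign(f, a, True), mu + [(a, True)])
            or ksat_f(assign(f, a, False), mu + [(a, False)]))

def ksat_a(mu):
    """Proposition 3: check the restricted assignment of every box index in mu."""
    box_indices = {a[1] for a, _ in mu if not isinstance(a, str)}
    return all(ksat_ar(r, mu) for r in box_indices)

def ksat_ar(r, mu):
    """Proposition 4: one recursive call per negated box of index r."""
    modal = [(a[2], v) for a, v in mu if not isinstance(a, str) and a[1] == r]
    alphas = [g for g, v in modal if v]
    betas = [g for g, v in modal if not v]
    return all(ksat(("and", *alphas, ("not", b))) for b in betas)

def implies(p, q):
    return ("or", ("not", p), q)

# The formula of Example 3, in the tuple encoding assumed above.
phi3 = ("and",
        ("or", ("box", 1, "A1"), "A1"),
        ("or", ("box", 1, implies("A1", "A2")), "A2"),
        ("or", ("not", ("box", 1, "A2")), "A2"),
        ("or", ("not", "A2"), "A3"),
        ("or", ("not", "A2"), ("not", "A3")))
```

Because there is no pure-literal step, the enumeration is complete, so `ksat(phi3)` finds a modally consistent assignment for the formula of Example 3. Each recursive call made by `ksat_ar` works on a formula of strictly smaller modal depth, which is what makes the recursion terminate.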
Remarkably, most such results were obtained with tools implementing the “classic” DPLL procedure of §25.3.4, very far from the efficiency of current DPLL implementations.

We concentrate on the basic tableau-based and DPLL-based algorithms for Km-satisfiability described in §25.3.2 and §25.3.4. Both procedures work (i) by enumerating truth assignments which propositionally satisfy the input formula ϕ and (ii) by recursively checking the Km-satisfiability of the assignments found. Both algorithms perform the latter step in the same way. The key difference is thus in the way they handle propositional inference. [GS96b, GS00] remarked that, regardless of the quality of implementation and the optimizations performed, tableau-based procedures have, with respect to DPLL-based procedures, two weaknesses which make them intrinsically less efficient, and whose effects get up to exponentially amplified when using them in modal inference. We consider them in turn.

Syntactic vs. semantic branching. In a propositional tableau, truth assignments are generated as branches induced by the application of the ∨-rule to disjunctive subformulas of the input formula ϕ. Thus, tableaux perform what we call syntactic branching [GS96b], that is, the branching in the search tree is induced by the syntactic structure of ϕ. As discussed in [D’A92, DM94], an application of the ∨-rule generates two subtrees which can be mutually consistent, i.e., which may share propositional models.⁹ Therefore, the set of truth assignments enumerated by propositional tableau procedures grows exponentially with the number of disjunctions occurring positively in ϕ, regardless of the fact that it may contain up to exponentially-many duplicated and/or subsumed assignments.

Things get even worse in the modal case.
When testing Km-satisfiability, unlike the propositional case where tableaux look for one assignment satisfying the input formula, the propositional tableaux are used to enumerate all the truth assignments, which must be recursively checked for Km-consistency. This requires checking recursively possibly-many sub-formulas of the form ⋀i αri ∧ ¬βj of depth d − 1, for which a propositional tableau will enumerate all truth assignments, and so on. At all levels of nesting, a redundant truth assignment introduces a redundant modal search tree. Thus, with modal formulas the redundancy of the propositional case propagates with the modal depth, and, in the worst case, the number of redundant truth assignments can become exponential.

DPLL instead performs a search which is based on what we call semantic branching [GS96b], that is, a branching on the truth value of sub-formulas ψ of ϕ (typically atoms):¹⁰

\[
\frac{\varphi}{\varphi[\psi/\top] \quad\mid\quad \varphi[\psi/\bot]},
\]

where ϕ[ψ/⊤] is the result of substituting with ⊤ all occurrences of ψ in ϕ and then simplifying the result. Thus, every branching step generates two mutually-inconsistent subtrees.¹¹ Because of this, DPLL always generates non-redundant sets of assignments. This avoids any search duplication and, in the case of modal search, any recursive exponential propagation of such a redundancy.

⁹ As pointed out in [D’A92, DM94], the propositional tableaux rules are unable to represent bivalence: “every proposition is either true or false, tertium non datur”. This is a consequence of the elimination of the cut rule in cut-free sequent calculi, from which propositional tableaux are derived.

Figure 25.3. Search trees for the formula Γ = (α ∨ ¬β) ∧ (α ∨ β) ∧ (¬α ∨ ¬β). Left: a tableau-based procedure. Right: a DPLL-based procedure.

Example 4.
Consider the simple formula Γ = (α ∨ ¬β) ∧ (α ∨ β) ∧ (¬α ∨ ¬β), where α and β are modal atoms s.t. α ∧ ¬β is not modally consistent, and let d be the depth of Γ. The only possible assignment propositionally satisfying Γ is µ = α ∧ ¬β.

Look at Figure 25.3, left. Assume that in a tableau-based procedure, the ∨-rule is applied to the three clauses occurring in Γ in the order they are listed. Then two distinct but identical open branches are generated, both representing the assignment µ. Then the tableau expands the two open branches in the same way, until it generates two identical (and possibly big) closed modal sub-trees T of modal depth d, each proving the Km-unsatisfiability of µ.

This phenomenon may repeat itself at the lower level in each sub-tree T, and so on. For instance, if α = ✷1((α′ ∨ ¬β′) ∧ (α′ ∨ β′)) and β = ✷1(α′ ∧ β′), then at the lower level we have a formula Γ′ of depth d − 1 analogous to Γ. This propagates exponentially the redundancy with the depth d.

Finally, notice that if we considered the formula

\[
\Gamma_K = \bigwedge_{i=1}^{K} (\alpha_i \vee \neg\beta_i) \wedge (\alpha_i \vee \beta_i) \wedge (\neg\alpha_i \vee \neg\beta_i),
\]

the tableau would generate 2^K identical truth assignments µK = ⋀i αi ∧ ¬βi, and things would get exponentially worse.

Look at Figure 25.3, right. A DPLL-based procedure branches asserting α = ⊤ or α = ⊥. The first branch generates α ∧ ¬β, while the second gives ¬α ∧ ¬β ∧ β, which immediately closes. Therefore, only one instance of µ = α ∧ ¬β is generated. The same applies to µK.

¹⁰ Notice that the notion of “semantic branching” introduced in [GS96b] is stronger than that later used in [Hor98b, HPS99], the former corresponding to the latter plus the usage of unit-propagation.

¹¹ This fact holds for both “classic” [DP60, DLL62] and “modern” DPLL (see, e.g., [ZM02]), because in both cases two branches differ for the truth value of at least one atom, although for the latter case the explanation is slightly more complicated.
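The counting argument of Example 4 can be checked mechanically on the Boolean abstraction of ΓK. The following Python sketch is an illustration only: the modal atoms αi, βi are treated as opaque Boolean atoms named `a1`, `b1`, and so on, the tableau is modeled naively as "pick one disjunct per clause and keep the open branches", and every function name is hypothetical.

```python
from itertools import product

# A literal is a pair (atom_name, polarity); a clause is a list of literals.
def gamma_k(k):
    """Clauses of Gamma_K: (a_i | ~b_i), (a_i | b_i), (~a_i | ~b_i) for i = 1..k."""
    clauses = []
    for i in range(1, k + 1):
        a, b = f"a{i}", f"b{i}"
        clauses += [[(a, True), (b, False)],
                    [(a, True), (b, True)],
                    [(a, False), (b, False)]]
    return clauses

def tableau_branches(clauses):
    """Syntactic branching: one disjunct per clause, keeping only open branches."""
    open_branches = []
    for choice in product(*clauses):
        branch = set(choice)
        closed = any((atom, not pol) in branch for atom, pol in branch)
        if not closed:
            open_branches.append(frozenset(branch))
    return open_branches

def assign(clauses, lit):
    """Make lit true: drop satisfied clauses, delete the complementary literal."""
    atom, pol = lit
    return [[l for l in c if l != (atom, not pol)] for c in clauses if lit not in c]

def dpll_assignments(clauses, mu=frozenset()):
    """Semantic branching with unit propagation; returns the assignments found."""
    if any(len(c) == 0 for c in clauses):
        return []                                  # violated clause: prune
    if not clauses:
        return [mu]                                # every clause is satisfied
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:
        return dpll_assignments(assign(clauses, unit), mu | {unit})
    atom = clauses[0][0][0]                        # branch on the truth value of an atom
    pos, neg = (atom, True), (atom, False)
    return (dpll_assignments(assign(clauses, pos), mu | {pos})
            + dpll_assignments(assign(clauses, neg), mu | {neg}))

if __name__ == "__main__":
    for k in (1, 2, 3):
        t = tableau_branches(gamma_k(k))
        d = dpll_assignments(gamma_k(k))
        print(k, len(t), len(set(t)), len(d))
```

For each K the sketch reports how many open branches the naive tableau produces, how many of them are distinct, and how many assignments the DPLL-style enumerator returns. In a modal procedure every duplicated branch would trigger its own recursive Km-satisfiability check, which is where the redundancy becomes costly.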
Figure 25.4. Search trees for the formula Γ = (α ∨ φ1) ∧ (β ∨ φ2) ∧ φ3 ∧ (¬α ∨ ¬β). Left: a tableau-based procedure. Right: a DPLL-based procedure.

Detecting constraint violations. A propositional formula ϕ can be seen as a set of constraints for the truth assignments which possibly satisfy it. For instance, a clause A1 ∨ A2 constrains every assignment not to set both A1 and A2 to ⊥. Unlike tableaux, DPLL prunes a branch as soon as it violates some constraint of the input formula. (For instance, in Ksat this is done by the function assign.)

Example 5. Consider the formula Γ = (α ∨ φ1) ∧ (β ∨ φ2) ∧ φ3 ∧ (¬α ∨ ¬β), α and β being atoms, φ1, φ2 and φ3 being sub-formulas, such that α ∧ β ∧ φ3 is propositionally satisfiable and α ∧ φ2 is Km-unsatisfiable. Look at Figure 25.4, left. Again, assume that, in a tableau-based procedure, the ∨-rule is applied in order, left to right. After two steps, the branch α, β is generated, which violates the constraint imposed by the last clause (¬α ∨ ¬β). A tableau-based procedure is not able to detect such a violation until it explicitly branches on that clause, that is, only after having generated the whole sub-tableau T3 for α ∧ β ∧ φ3, which may be rather big. DPLL instead (Figure 25.4, right) avoids generating the violating assignment: it detects the violation and immediately prunes the branch.

25.4. Advanced Modal DPLL

In this section we present the most important optimizations of the DPLL-based procedures described in §25.3, and one extension to non-normal modal logics.

25.4.1. Optimizations

As described in §25.3.4, the first DPLL-based tools of [GS96a, GS96b, SV98, GS00] were based on the “classic” recursive DPLL schema [DP60, DLL62].
Drastic improvements in performance were later obtained by importing ideas and techniques from the SAT literature and/or by directly implementing tools on top of modern DPLL solvers, which are applied to the Boolean abstraction of the input formula [GGST98, GGST00, HPS99, Tac99, GGT01, HM01].

In particular, modern DPLL implementations are non-recursive, and are based on very efficient, destructive data structures to handle Boolean formulas and assignments. They benefit from sophisticated search techniques (e.g., backjumping, learning, restarts [MSS96, BS97, GSK98]), smart splitting heuristics (e.g., [MMZ+01, GN02, ES04]), highly-engineered data structures and implementation tricks (e.g., the two-watched literal scheme [MMZ+01]), and advanced preprocessing techniques [Bra01, BW03, EB05].

In particular, modern DPLL implementations perform conflict analysis on failed assignments µ, which detects the reason of each failure, that is, a (typically much smaller) subset µ′ of µ which alone causes the failure. When this happens, the procedure

• adds the negation of µ′ as a new clause to the formula, so that no assignment containing µ′ will ever be investigated again. This technique is called learning;
• backtracks to the highest point in the stack where one literal l in the learned clause ¬µ′ is not assigned, unit-propagates l, and proceeds with the search. This technique is called backjumping.

Backjumping and learning are of great interest in our discussion, as will be made clear in §25.4.1.4. The other DPLL optimizations come for free by using state-of-the-art SAT solvers and they are substantially orthogonal to our discussion, so that they will not be discussed here.

We describe some further optimizations which have been proposed to the basic schema of §25.3.4.
For better readability, the description will refer to the case of Km, but they can be extended to other logics. Most of these techniques and optimizations have later been adopted by the so-called lazy tools for Satisfiability Modulo Theories, SMT (see §26.4.3).

25.4.1.1. Normalizing atoms

One potential source of inefficiency for DPLL-based procedures is the occurrence in the input formula of equivalent though syntactically-different atoms (e.g., ✷r(A1 ∨ A2) and ✷r(A2 ∨ A1)), or pairs of atoms in which one is equivalent to the negation of the other (e.g., ✷r(A1 ∨ A2) and ✸r(¬A1 ∧ ¬A2)). If two atoms ψ1, ψ2 are s.t. ψ1 ≠ ψ2 and ⊨ ψ1 ↔ ψ2 [resp. ψ1 ≠ ¬ψ2 and ⊨ ψ1 ↔ ¬ψ2], then they are recognized as distinct Boolean atoms B1 =def L2P(ψ1) and B2 =def L2P(ψ2), which may be assigned different [resp. identical] truth values by DPLL. This may cause the useless generation of many unsatisfiable assignments and the corresponding useless calls to KsatA (e.g., up to $2^{|\mathit{Atoms}^0(\varphi)|-2}$ calls on assignments like {✷r(A1 ∨ A2), ¬✷r(A2 ∨ A1), ...}).

In order to avoid these problems, it is wise to preprocess atoms so as to map as many equivalent literals as possible into syntactically identical ones [GS96a, GS00, HPS99]. This can be achieved by applying some rewriting rules, like, e.g.:

• Drop dual operators: ✷r(ϕ1 ∨ ϕ2), ✸r(¬ϕ1 ∧ ¬ϕ2) =⇒ ✷r(ϕ1 ∨ ϕ2), ¬✷r(ϕ1 ∨ ϕ2), or even (ϕ1 ∧ ϕ2), (¬ϕ1 ∨ ¬ϕ2) =⇒ (ϕ1 ∧ ϕ2), ¬(ϕ1 ∧ ϕ2).
• Exploit associativity: ✷r(ϕ1 ∨ (ϕ2 ∨ ϕ3)), ✷r((ϕ1 ∨ ϕ2) ∨ ϕ3) =⇒ ✷r(ϕ1 ∨ ϕ2 ∨ ϕ3).
• Sort: ✷r(ϕ2 ∨ ϕ1 ∨ ϕ3), ✷r(ϕ3 ∨ ϕ1 ∨ ϕ2) =⇒ ✷r(ϕ1 ∨ ϕ2 ∨ ϕ3).
• Exploit properties of normal modal logics: ✷r(ϕ1 ∧ ϕ2) =⇒ ✷r ϕ1 ∧ ✷r ϕ2 if L ∈ N.
• Exploit specific properties of some logic L: ✷r ✷r ϕ1 =⇒ ✷r ϕ1 if L is S5.

Notice that pre-conversion to BNF (§25.2.1) goes in this direction.

Example 6.
Consider the modal atoms occurring in the formula ϕ in Example 1. For every modal atom in ϕ there are 3! = 6 equivalent permutations, which are all mapped into one atom if the modal atoms are sorted. E.g., if we consider an equivalent formula ϕ′ in which the second occurrence of the atom ✷2(¬A4 ∨ A5 ∨ A2), occurring in rows 3 and 5, is rewritten as ✷2(A5 ∨ ¬A4 ∨ A2), then the latter will be encoded by L2P into a different Boolean variable, namely B9, which could be assigned by DPLL a different truth value w.r.t. B3, generating a modally inconsistent assignment µ′. If all atoms in ϕ′ are pre-sorted, then the problem does not occur.

25.4.1.2. Early Pruning

Another optimization [GS96a, GS00, Tac99] was conceived after the empirical observation that most assignments found by DPLL are “trivially” Km-unsatisfiable, that is, they will remain Km-unsatisfiable even after removing some of their conjuncts. If an incomplete¹² assignment µ′ is Km-unsatisfiable, then all its extensions are Km-unsatisfiable. If the unsatisfiability of µ′ is detected in time, then this prevents checking the Km-satisfiability of all the up to 2^(|Atoms0(ϕ)|−|µ′|) truth assignments which extend µ′.

This suggests the introduction of an intermediate Km-satisfiability test on incomplete assignments just before the split. (Notice there is no need to introduce similar tests before unit propagation.) In the basic algorithm of Figure 25.2, this is done by introducing the three lines below in the function KsatF, just before the “split”:

    if (Likely-Unsatisfiable(µ))        /* early-pruning */
        if not KsatA(µ)
            then return False;

(We temporarily ignore the test performed by Likely-Unsatisfiable.) KsatA is invoked on the current incomplete assignment µ. If KsatA(µ) returns False, then all possible extensions of µ are unsatisfiable, and therefore KsatF returns False.
The introduction of this intermediate check, which is called early pruning, caused a drastic improvement in the overall performance [GS96a, GS00].

Example 7. Consider the formula ϕ of Example 1. Suppose that, after three recursive calls, KsatF builds the incomplete assignment:

    µ′ = ✷1(¬A1 ∨ A4 ∨ A3) ∧ ✷1(¬A2 ∨ A1 ∨ A4) ∧ ¬✷1(A4 ∨ ¬A2 ∨ A3)

(rows 6, 7 and 4 of ϕ). If it is invoked on µ′, KsatA will check the K2-satisfiability of the formula

    (¬A1 ∨ A4 ∨ A3) ∧ (¬A2 ∨ A1 ∨ A4) ∧ ¬A4 ∧ A2 ∧ ¬A3,

which is unsatisfiable. Therefore there will be no more need to select further literals, and KsatF will backtrack.

¹² By incomplete assignment µ for ϕ we mean that µ has not assigned enough atoms to determine whether µ |= ϕ or not.

The intermediate consistency checks, however, may introduce some useless calls to KsatA. One way of addressing this problem is to condition the calls to KsatA in early-pruning steps to some heuristic criteria (here represented by the heuristic function Likely-Unsatisfiable). The main idea is to avoid invoking KsatA when it is very unlikely that, since the last call, the new literals added to µ can cause inconsistency: e.g., when the only literals added are purely propositional or contain new Boolean atoms [GS96a, GS00].

Another way is to make KsatA work in an incremental way: if for some box index r ∈ {1...m} no literal of the form ✷r ψ or ¬✷r ψ has been added to µ since the last call to KsatA, then KsatA can avoid performing the corresponding call to KsatAR; moreover, if for some box index r ∈ {1...m} no positive ✷r ψ's have been added to µ since the last call to KsatA, then KsatAR can avoid calling Ksat recursively on the subformulas (⋀i αri ∧ ¬βrj) s.t. ¬✷r βrj was already passed to KsatA in the last call [Tac99].

25.4.1.3. Caching

This section is an overview of [GT01], to which we refer for further reading. Consider the basic version of the Ksat algorithm in Figure 25.2. Without loss of generality, in the remainder of this section we assume that |B| = 1, so that the call to KsatA is the same as KsatAR. The extension to the case where |B| > 1 is straightforward since it simply requires checking the different modalities in separate calls to KsatAR.

Given two assignments µ and µ′, it may be the case that KsatA(µ) and KsatA(µ′) perform some equal subtests, i.e., recursive calls to Ksat. This is the case, e.g., when µ and µ′ differ only in the propositional conjuncts and there is at least one conjunct of the form ¬✷β. To prevent recomputation, the obvious solution is to cache both the formula whose satisfiability is being checked and the result of the check. Then, the cache is consulted before performing each subtest to determine whether the result of the subtest can be assessed on the basis of the cache contents. In the following, we assume to have two different caching mechanisms, each using a separate caching structure:
• S-cache to store and query about satisfiable formulas, and
• U-cache to store and query about unsatisfiable formulas.

In this way, storing a subtest amounts to storing the formula in the appropriate caching structure. Of course, the issue is how to implement effective caching mechanisms that reduce the number of subtests as much as possible. To this end, the following considerations are in order:
1. if a formula ⋀α∈∆′ α ∧ ¬β has already been determined to be satisfiable then, if ∆ ⊆ ∆′, we can conclude that also ⋀α∈∆ α ∧ ¬β is satisfiable, and
2. if a formula ⋀α∈∆′ α ∧ ¬β has already been determined to be unsatisfiable then, if ∆ ⊇ ∆′, we can conclude that also ⋀α∈∆ α ∧ ¬β is unsatisfiable.

The above observations suggest the usage of caching mechanisms that allow for storing sets of formulas and for efficiently querying about subsets or supersets.
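As a rough illustration of observations 1 and 2, the following Python sketch stores each subtest as a pair (∆, β) and answers queries by subset- and superset-matching. It is not the data structure of [GT01]: the class name SubsumptionCache, the encoding of formulas as plain strings, and the linear scan over stored entries are all assumptions made here for brevity.

```python
from typing import FrozenSet, Iterable, List, Optional, Tuple

Formula = str  # assumption: formulas are identified by canonical strings


class SubsumptionCache:
    """Naive cache for subtests of the form (AND of delta) AND NOT beta."""

    def __init__(self) -> None:
        # Each entry pairs a set delta with the beta it was tested against.
        self.sat: List[Tuple[FrozenSet[Formula], Formula]] = []
        self.unsat: List[Tuple[FrozenSet[Formula], Formula]] = []

    def store(self, delta: Iterable[Formula], beta: Formula,
              satisfiable: bool) -> None:
        entry = (frozenset(delta), beta)
        (self.sat if satisfiable else self.unsat).append(entry)

    def lookup(self, delta: Iterable[Formula],
               beta: Formula) -> Optional[bool]:
        """Return True/False if the cache decides the subtest, else None."""
        d = frozenset(delta)
        # Observation 1: a satisfiable superset delta' makes delta satisfiable.
        for stored, b in self.sat:
            if b == beta and d <= stored:
                return True
        # Observation 2: an unsatisfiable subset delta' makes delta unsatisfiable.
        for stored, b in self.unsat:
            if b == beta and d >= stored:
                return False
        return None


cache = SubsumptionCache()
cache.store({"a", "b", "c"}, "p", satisfiable=True)
cache.store({"a"}, "q", satisfiable=False)
print(cache.lookup({"a", "b"}, "p"))   # decided by subset-matching
print(cache.lookup({"a", "b"}, "q"))   # decided by superset-matching
print(cache.lookup({"a", "d"}, "p"))   # not decided by the cache
```

The linear scan keeps the sketch short; an actual implementation would presumably index the stored sets so that subset and superset queries do not have to visit every entry.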
    function KsatA(µ)
      ∆ := {α | ✷α is a conjunct of µ};
      Γ := {β | ¬✷β is a conjunct of µ};
      if U-cache_get(∆, Γ) return False;
      Γr := S-cache_get(∆, Γ);
      Γs := ∅;
      foreach β ∈ Γr do
        if not Ksat(⋀α∈∆ α ∧ ¬β) then
          if Γs ≠ ∅ then S-cache_store(∆, Γs);
          U-cache_store(∆, β);
          return False
        else Γs := Γs ∪ {β};
      S-cache_store(∆, Γs);
      return True.

    Figure 25.5. KsatA: satisfiability checking for K with caching

In other words, given a subtest

    Ksat(⋀α∈∆ α ∧ ¬β),        (25.19)

we want to be able to query our S-cache about the presence of a formula

    ⋀α∈∆′ α ∧ ¬β        (25.20)

with ∆ ⊆ ∆′ (query for subsets or subset-matching). Analogously, given the subtest (25.19), we want to be able to query our U-cache about the presence of a formula (25.20) with ∆ ⊇ ∆′ (query for supersets or superset-matching). In this way, caching a subtest avoids the recomputation of the very same subtest, and of the possibly many “subsumed” subtests.

Observations 1 and 2 are independent of the particular modal logic being considered. They are to be taken into account when designing caching structures for satisfiability in any modal logic. Of course, depending on the particular modal logic considered, some other considerations might be in order. For example, in K, we observe that in KsatA there is a natural imbalance between satisfiable subtests and unsatisfiable ones. In fact, with reference to Figure 25.2, when testing an assignment µ
3. many subtests can be determined to be satisfiable, all sharing the same set ∆, and
4. at most one subtest may turn out to be unsatisfiable.

Observation 3 suggests that S-cache should be able to store satisfiable subtests sharing a common set ∆ in a compact way. Therefore, S-cache associates the set ∆ to the set Γ′ ⊆ Γ, representing the “computed” satisfiable subtests ⋀α∈∆ α ∧ ¬β for each β ∈ Γ′.
Observation 4 suggests that U-cache should not care about subtests sharing a common ∆. Therefore, U-cache associates ∆ to the single β for which the subtest ⋀α∈∆ α ∧ ¬β failed. Given the design issues outlined above, we can modify Ksat to yield the procedure KsatA shown in Figure 25.5. In the figure:
• U-cache_get(∆, Γ) returns True if U-cache contains a set ∆′ such that ∆ ⊇ ∆′, ∆′ is associated with β and β ∈ Γ;
• S-cache_get(∆, Γ) returns the set Γ \ Γ′ where Γ′ is the union over all the sets Γ′′ such that for some set ∆′ ⊇ ∆, Γ′′ is associated to ∆′ in S-cache;
• U-cache_store(∆, β) stores in U-cache the set ∆ and associates β to it;
• S-cache_store(∆, Γ) stores in S-cache the set ∆ and associates to it the set Γ.

The new issue is now to implement effective data structures for S-cache and U-cache supporting the above functions. Clearly, we expect that the computational costs associated with the above functions will be higher than the computational costs associated with other caching structures designed for “equality-matching”, i.e., effectively supporting the functions obtained from the above by substituting “⊇” with “=”. There is indeed a trade-off between “smart but expensive” and “simple but efficient” data structures for caching. Of course, depending on
• the particular logic being considered, and
• the characteristics of the particular formula being tested,
we expect that one caching mechanism will lead to a faster decision process than the others.

Independently of the data structure being used, the following (last) observation needs to be taken into account when dealing with modal logics whose decision problem is not in NP (e.g., K, S4):
5. testing the consistency of a formula may require an exponential number of subtests. This is the case for the Halpern and Moses formulas presented in [HM85] for various modal logics.
Observation 5 suggests that it may be necessary to bound the size of the cache, and introduce mechanisms for deciding which formulas to discard when the bound is reached. Further discussion about the implementation of caching and low-level optimizations can be found in [GT01].

25.4.1.4. Modal Backjumping

Another very important optimization, called modal backjumping [Hor98a, PS98], generalizes the idea of backjumping in DPLL. KsatA can be easily modified so that, when invoked on a Km-unsatisfiable set of modal literals µ, it also returns the subset µ′ of µ which caused the inconsistency of µ. We call µ′ a modal conflict set of µ. An easy way of computing µ′ is that of returning the set L2P({✷r αri}i ∪ {¬✷r βrj}) corresponding to the first formula ϕrj = ⋀i αri ∧ ¬βrj which is found unsatisfiable by Ksat.

Example 8. Consider the formula ϕ of Example 1. The assignment µp = {B6, B8, B2, ¬B1, ¬B5, B3} is found by KsatF, which satisfies L2P(ϕ). Thus KsatA is given as input

    µ  = ✷1(¬A5 ∨ A4 ∨ A3) ∧ ✷1(¬A2 ∨ A1 ∨ A4) ∧ ✷1(¬A2 ∨ A4 ∨ A5) ∧     [⋀i ✷1 α1i]
         ¬✷1(¬A3 ∨ ¬A1 ∨ A2) ∧ ¬✷1(A4 ∨ ¬A2 ∨ A3) ∧                      [⋀j ¬✷1 β1j]
         ✷2(¬A4 ∨ A5 ∨ A2)                                                [⋀i ✷2 α2i]

and hence invokes KsatAR on the two restricted assignments:

    µ1 = ✷1(¬A5 ∨ A4 ∨ A3) ∧ ✷1(¬A2 ∨ A1 ∨ A4) ∧ ✷1(¬A2 ∨ A4 ∨ A5) ∧     [⋀i ✷1 α1i]
         ¬✷1(¬A3 ∨ ¬A1 ∨ A2) ∧ ¬✷1(A4 ∨ ¬A2 ∨ A3)                        [⋀j ¬✷1 β1j]
    µ2 = ✷2(¬A4 ∨ A5 ∨ A2)                                                [⋀i ✷2 α2i].

µ2 is trivially Km-satisfiable. µ1 requires invoking Ksat on the two formulas

    ϕ11 = (¬A5 ∨ A4 ∨ A3) ∧ (¬A2 ∨ A1 ∨ A4) ∧ (¬A2 ∨ A4 ∨ A5) ∧ A3 ∧ A1 ∧ ¬A2,
    ϕ12 = (¬A5 ∨ A4 ∨ A3) ∧ (¬A2 ∨ A1 ∨ A4) ∧ (¬A2 ∨ A4 ∨ A5) ∧ ¬A4 ∧ A2 ∧ ¬A3.
The latter is unsatisfiable, from which we can conclude that ✷1(¬A5 ∨ A4 ∨ A3) ∧ ✷1(¬A2 ∨ A1 ∨ A4) ∧ ✷1(¬A2 ∨ A4 ∨ A5) ∧ ¬✷1(A4 ∨ ¬A2 ∨ A3) is Km-unsatisfiable, so that {B6, B8, B2, ¬B5} is a conflict set of µp.

The conflict set µ′ found is then used to drive the backjumping mechanism of DPLL. Different strategies are possible. The DPLL-based modal tools [Hor98a, PS98] and earlier SMT tools [WW99] used to jump up to the most recent branching point s.t. at least one literal lp ∈ µ′ is not assigned. Intuitively, all open subbranches departing from the current branch at a lower decision point contain µ′, so that there is no need to explore them; this allows for pruning all these subbranches from the search tree. (Notice that these strategies do not explicitly require adding the clause ¬µ′ to ϕ.) More sophisticated versions of this technique, which mirror the most modern backjumping techniques introduced in DPLL, were later introduced in the context of SMT (see §26.4.3.5).

In substance, modal backjumping differs from standard Boolean backjumping only in the notion of conflict set used: whilst a Boolean conflict set µ is an assignment which causes a propositional inconsistency if conjoined to ϕ (i.e., s.t. µ ∧ ϕ |=p ⊥), a modal conflict set is a set of literals which is Km-inconsistent (i.e., s.t. µ |= ⊥).

25.4.1.5. Pure-literal filtering

This technique, which we call pure-literal filtering,¹³ was implicitly proposed by [WW99] and then generalized and adopted in the *SAT tool [Tac99] (and later imported into SMT [ABC+02], see §26.4.3.7).

¹³ Also called triggering in [WW99, ABC+02].

The idea is that, if we have non-Boolean atoms occurring only positively [resp. negatively] in the input formula, we can safely drop every negative [resp. positive] occurrence of them from the assignment µ to be checked by KsatA.
(The correctness and completeness of this process are a consequence of Proposition 2 in §25.3.1.) This behavior has several benefits. Let µ′ be the reduced version of µ. First, µ′ might be Km-satisfiable even though µ is Km-unsatisfiable. If so, and if µ (and hence µ′) propositionally satisfies ϕ, then KsatF can stop, potentially saving a lot of search. Second, if both µ′ and µ are Km-unsatisfiable, the call to KsatA on µ′ rather than on µ can yield smaller conflict sets, thus improving the effectiveness of backjumping and learning. Third, checking the Km-satisfiability of µ′ rather than that of µ can be significantly faster. In fact, suppose ✷r βrj occurs only positively in ϕ and it is assigned a negative value by KsatF, so that ¬✷r βrj ∈ µ but ¬✷r βrj ∉ µ′. Thus ¬✷r βrj will not occur in the restricted assignment µr fed to KsatAR, avoiding the call to Ksat on (⋀i αri ∧ ¬βrj). This allows for extending the notion of “incrementality” of §25.4.1.2, by considering only the literals in µ′ rather than those in µ.

25.4.2. Extensions to Non-Normal Modal Logics

This section briefly surveys some of the contents of [GGT01], to which we refer for further reading. Following the notation of [GGT01], we say that an assignment µ satisfies a formula ϕ if µ entails ϕ by propositional reasoning, and that a formula ϕ is consistent in a logic L (or L-consistent) if ¬ϕ is not a theorem of L, i.e., if ¬ϕ ∉ L. Whether an assignment is consistent depends on the particular classical modal logic L being considered. Furthermore, depending on the logic L considered, the consistency problem for L (i.e., determining whether a formula is consistent in L) belongs to different complexity classes. In particular, the consistency problem for E, EM, EN, EMN is NP-complete, while for EC, ECN, EMC it is PSPACE-complete (see [Var89, FHMV95]). Here, to save space, we divide these eight logics in two groups.
We present the algorithms for checking the L-consistency of an assignment first in the case in which L is one of E, EM, EN, EMN, and then in the case in which L is one of the others.

25.4.2.1. Logics E, EM, EN, EMN

The following proposition is an easy consequence of the results presented in [Var89].

Proposition 5. Let µ = ⋀i ✷αi ∧ ⋀j ¬✷βj ∧ γ be an assignment in which γ is a propositional formula. Let L be one of the logics E, EM, EN, EMN. µ is consistent in L if for each conjunct ¬✷βj in µ one of the following conditions is satisfied:
• (αi ≡ ¬βj) is L-consistent for each conjunct ✷αi in µ, and L=E;
• (αi ∧ ¬βj) is L-consistent for each conjunct ✷αi in µ, and L=EM;
• ¬βj and (αi ≡ ¬βj) are L-consistent for each conjunct ✷αi in µ, and L=EN;
• ¬βj and (αi ∧ ¬βj) are L-consistent for each conjunct ✷αi in µ, and L=EMN.

    function LsatA(µ)
      foreach conjunct ¬✷βj do
        foreach conjunct ✷αi do
          if M[i, j] = Undef then M[i, j] := Lsat(αi ∧ ¬βj);
          if L ∈ {EN, EMN} and M[i, j] = True then M[j, j] := True;
          if L ∈ {E, EN} and M[i, j] = False then
            if M[j, i] = Undef then M[j, i] := Lsat(¬αi ∧ βj);
            if L = EN and M[j, i] = True then M[i, i] := True;
            if M[j, i] = False then return False
        end
        if L ∈ {EN, EMN} then
          if M[j, j] = Undef then M[j, j] := Lsat(¬βj);
          if M[j, j] = False then return False
      end;
      return True.

    Figure 25.6. LsatA for E, EM, EN, EMN

When implementing the above conditions, care must be taken in order to avoid repetitions of consistency checks. In fact, while an exponential number of assignments satisfying the input formula can be generated, at most n² checks are possible in L, where n is the number of “✷” in the input formula. Given this upper bound, for each new consistency check, we can cache the result for possible future reuse in an n × n matrix M.
This ensures that at most n² consistency checks will be performed. In more detail, given an enumeration ϕ1, ϕ2, ..., ϕn of the boxed subformulas of the input formula, M[i,j], with i ≠ j, stores the result of the consistency check for (ϕi ∧ ¬ϕj). M[i,i] stores the result of the consistency check for ¬ϕi. Initially, each element of the matrix M has value Undef (meaning that the corresponding test has not been done yet). The result is the procedure LsatA in Figure 25.6, where the procedure Lsat is identical to the procedure Ksat modulo the call to KsatA, which must be replaced by LsatA.

Consider Figure 25.6 and assume that L=E or L=EN. Given a pair of conjuncts ✷αi and ¬✷βj, we split the consistency test for (αi ≡ ¬βj) into two simpler sub-tests:
• first, we test whether (αi ∧ ¬βj) is consistent, and
• only if this test gives False, we test whether (¬αi ∧ βj) is consistent.

Notice also that, in case L=EN or L=EMN, if we know that, e.g., (αi ∧ ¬βj) is consistent, then also ¬βj is consistent and we store this result in M[j,j].

The following proposition ensures the correctness of Lsat in the case of E, EM, EN and EMN.

Proposition 6. Let µ = ⋀i ✷αi ∧ ⋀j ¬✷βj ∧ γ be an assignment in which γ is a propositional formula. Let L be one of the logics E, EM, EN, EMN. Assume that, for any formula ϕ whose depth is less than the depth of µ, Lsat(ϕ)
• returns True if ϕ is L-consistent, and
• False otherwise.
LsatA(µ) returns True if µ is L-consistent, and False otherwise.

25.4.2.2. Logics EC, ECN, EMC

The following proposition is an easy consequence of the results presented in [Var89].

Proposition 7. Let µ = ⋀i ✷αi ∧ ⋀j ¬✷βj ∧ γ be an assignment in which γ is a propositional formula. Let ∆ be the set of formulas αi such that ✷αi is a conjunct of µ. Let L be one of the logics EC, ECN, EMC. µ is consistent in L if for each conjunct ¬✷βj in µ one of the following conditions is satisfied:
• ((⋀αi∈∆′ αi) ≡ ¬βj) is L-consistent for each non-empty subset ∆′ of ∆, and L=EC;
• ((⋀αi∈∆′ αi) ≡ ¬βj) is L-consistent for each subset ∆′ of ∆, and L=ECN;
• ∆ is empty or ((⋀αi∈∆ αi) ∧ ¬βj) is L-consistent, and L=EMC.

    function LsatA(⋀i ✷αi ∧ ⋀j ¬✷βj ∧ γ)
      ∆ := {αi | ✷αi is a conjunct of µ};
      foreach conjunct ¬✷βj do
        ∆′ := ∆;
        if L ∈ {EC, ECN} then
          foreach conjunct ✷αi do
            if M[j, i] = Undef then M[j, i] := Lsat(¬αi ∧ βj);
            if M[j, i] = True then ∆′ := ∆′ \ {αi}
          end;
        if L ∈ {ECN} or ∆′ ≠ ∅ then
          if not Lsat(⋀αi∈∆′ αi ∧ ¬βj) then return False
      end;
      return True.

    Figure 25.7. LsatA for EC, ECN, EMC

Assume that L=EC or L=ECN. The straightforward implementation of the corresponding condition may lead to a number of checks that is exponential in the cardinality |∆| of ∆. More carefully, for each conjunct ¬✷βj in µ, we can perform at most |∆| + 1 checks if
1. for each formula αi in ∆, we first check whether (¬αi ∧ βj) is consistent in L. Let ∆′ be the set of formulas for which the above test fails. Then,
2. in case L=ECN or ∆′ ≠ ∅, we perform the last test, checking whether ((⋀αi∈∆′ αi) ∧ ¬βj) is consistent in L.

Furthermore, the result of the consistency checks performed in the first step can be cached in a matrix M analogous to the one used in the previous subsection. If L=EC or L=ECN, the procedure LsatA in Figure 25.7 implements the above ideas. Otherwise, it is a straightforward implementation of the conditions in Proposition 7.

The following proposition ensures the correctness of Lsat in the case of EC, ECN and EMC.

Proposition 8. Let µ = ⋀i ✷αi ∧ ⋀j ¬✷βj ∧ γ be an assignment in which γ is a propositional formula. Let L be one of the logics EC, ECN, EMC.
Assume that, for any formula ϕ whose depth is less than the depth of µ, Lsat(ϕ)
• returns True if ϕ is L-consistent, and
• False otherwise.
LsatA(µ) returns True if µ is L-consistent, and False otherwise.

25.5. The OBDD-based Approach

In this section we briefly survey the basics of the OBDD-based approach to implementing decision procedures for modal K, and we refer the reader to [PSV02, PV03] for further details. The contents of this section and the next borrow from [PSV06], including basic notation and the description of the algorithms.

The OBDD-based approach is inspired by the automata-theoretic approach for logics with the tree-model property. In that approach, one proceeds in two steps. First, an input formula is translated to a tree automaton that accepts all the tree models of the formula. Second, the automaton is tested for non-emptiness, i.e., whether it accepts some tree. The approach described in [PSV02] combines the two steps and carries out the non-emptiness test without explicitly constructing the automaton. The logic K is simple enough that the automaton's non-emptiness test consists of a single fixpoint computation, which starts with a set of states and then repeatedly applies a monotone operator until a fixpoint is reached. In the automaton that corresponds to a formula, each state is a type, i.e., a set of formulas satisfying some consistency conditions. The algorithms that we describe here start from some set of types and then repeatedly apply a monotone operator until a fixpoint is reached.

25.5.1. Basics

To aid the description of the OBDD-based algorithms we introduce some additional notation. The set of propositional atoms used in a formula ϕ is denoted AP(ϕ), and, given a formula ψ, we call its set of subformulas sub(ψ). For ϕ ∈ sub(ψ), we can define depth(ϕ) in the usual way. If not stated otherwise, we assume all formulas to be in BNF.
The closure of a formula cl(ψ) is defined as the smallest set such that, for all subformulas ϕ of ψ, if ϕ is not of the form ¬ϕ′, then {ϕ, ¬ϕ} ⊆ cl(ψ). The algorithms that we present here work on types, i.e., maximal sets of formulas that are consistent w.r.t. the Boolean operators, and where (negated) box formulas are treated as atoms. A set of formulas a ⊆ cl(ψ) is called a ψ-type (or simply a type if ψ is clear from the context) if it satisfies the following conditions:
• If ϕ = ¬ϕ′, then ϕ ∈ a iff ϕ′ ∉ a.
• If ϕ = ϕ′ ∧ ϕ′′, then ϕ ∈ a iff ϕ′ ∈ a and ϕ′′ ∈ a.
• If ϕ = ϕ′ ∨ ϕ′′, then ϕ ∈ a iff ϕ′ ∈ a or ϕ′′ ∈ a.

For a set of types T, we define the maximal accessibility relation ∆ ⊆ T × T as follows:

    ∆(t, t′) iff for all ✷ϕ′ ∈ t, we have ϕ′ ∈ t′.

    X := Init(ψ)
    repeat
      X′ := X
      X := Update(X′)
    until X = X′
    if exists x ∈ X such that ψ ∈ x then
      return “ψ is satisfiable”
    else
      return “ψ is not satisfiable”

    Figure 25.8. Basic schema for the OBDD-based algorithm.

In Figure 25.8 we present the basic schema for the OBDD-based decision procedures. The schema can be made to work in two fashions, called top-down and bottom-up in [PSV06], according to the definition of the accessory functions Init and Update. In both cases, since the algorithms operate with elements in a finite lattice 2^cl(ψ) and use a monotone Update, they are bound to terminate.

In the case of the top-down approach, the accessory functions are defined as:
• Init(ψ) is the set of all ψ-types.
• Update(T) := T \ bad(T), where bad(T) are the types in T that contain unwitnessed negated box formulas. More precisely,
  bad(T) := {t ∈ T | there exists ¬✷ϕ ∈ t and, for all u ∈ T with ∆(t, u), we have ϕ ∈ u}.

Intuitively, the top-down algorithm starts with the set of all types and removes those types with “possibilities” ✸ϕ for which no “witness” can be found.
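To make the top-down instance of the schema concrete, here is a small Python sketch that follows the definitions of cl(ψ), ψ-types, ∆ and bad(T) given above. It is only an illustration under several assumptions of ours: sets of types are enumerated explicitly rather than represented with OBDDs, formulas are encoded as nested tuples built by the hypothetical helpers Atom, Not, And, Or and Box, and input formulas are assumed to contain no double negations.

```python
from itertools import product

# Hypothetical tuple encoding of formulas (not from the chapter).
def Atom(name): return ("atom", name)
def Not(f): return ("not", f)
def And(f, g): return ("and", f, g)
def Or(f, g): return ("or", f, g)
def Box(f): return ("box", f)

def subformulas(f):
    yield f
    for g in f[1:]:
        if isinstance(g, tuple):
            yield from subformulas(g)

def closure(psi):
    """cl(psi): every non-negated subformula together with its negation."""
    cl = set()
    for f in subformulas(psi):
        if f[0] != "not":
            cl.update({f, Not(f)})
    return cl

def all_types(psi):
    """Enumerate psi-types by choosing truth values for atoms and boxes."""
    cl = closure(psi)
    base = [f for f in cl if f[0] in ("atom", "box")]
    types = []
    for bits in product([False, True], repeat=len(base)):
        val = dict(zip(base, bits))
        def holds(f):
            if f[0] in ("atom", "box"):
                return val[f]
            if f[0] == "not":
                return not holds(f[1])
            if f[0] == "and":
                return holds(f[1]) and holds(f[2])
            return holds(f[1]) or holds(f[2])  # "or"
        types.append(frozenset(f for f in cl if holds(f)))
    return types

def delta(t, u):
    """Maximal accessibility: the body of every box in t belongs to u."""
    return all(f[1] in u for f in t if f[0] == "box")

def bad(T):
    """Types with a negated box formula that has no witness in T."""
    out = set()
    for t in T:
        for f in t:
            if f[0] == "not" and f[1][0] == "box":
                phi = f[1][1]
                if all(phi in u for u in T if delta(t, u)):
                    out.add(t)
                    break
    return out

def k_satisfiable_top_down(psi):
    X = set(all_types(psi))          # Init(psi)
    while True:
        X_new = X - bad(X)           # Update(X)
        if X_new == X:
            break
        X = X_new
    return any(psi in x for x in X)

A, B = Atom("A"), Atom("B")
print(k_satisfiable_top_down(And(Not(Box(A)), Box(B))))
print(k_satisfiable_top_down(And(Box(A), Not(Box(Or(A, B))))))
```

The explicit enumeration is exponential in the number of atoms and box subformulas, which is precisely the cost that a symbolic (OBDD) representation of the sets is meant to mitigate.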
In the bottom-up approach, the accessory functions are defined as:
• Init(ψ) is the set of all those types that do not require any witness, which means that they do not contain any negated box formula, or equivalently, that they contain all positive box formulas in cl(ψ). More precisely,
  Init(ψ) := {t ⊆ cl(ψ) | t is a type and ✷ϕ ∈ t for each ✷ϕ ∈ cl(ψ)}.
• Update(T) := T ∪ supp(T), where supp(T) is the set of those types whose negated box formulas are witnessed by types in T. More precisely,
  supp(T) := {t ⊆ cl(ψ) | t is a type and, for all ¬✷ϕ ∈ t, there exists u ∈ T with ¬ϕ ∈ u and ∆(t, u)}.

Intuitively, the bottom-up algorithm starts with the set of types having no possibilities ✸ϕ, and adds those types whose possibilities are witnessed by a type in the set. Notice that the two algorithms described above correspond to the two ways in which non-emptiness can be tested for automata for K.

25.5.2. Optimizations

The decision procedures described in the previous section handle the formula in three steps. First, the formula is converted into BNF. Then the initial set of types is generated; we can think of this set as having some memory-efficient representation. Finally, this set is updated through a fixpoint process. The answer of the decision procedure depends on a simple syntactic check of this fixpoint. In the following we consider three orthogonal optimization techniques. See [PSV06] for more details, and for a description of preprocessing techniques that may further improve the performance of the OBDD-based implementations.

25.5.2.1. Particles

The approaches presented so far strongly depend on the fact that the BNF is used, and they can be said to be redundant: if a type contains two conjuncts of some subformula of the input, then it also contains the corresponding conjunction, although the truth value of the latter is determined by the truth values of the former. Working with a different normal form, it is possible to reduce such redundancy. We consider K-formulas in NNF (negation normal form) and we assume hereafter that all the formulas are in NNF. A set p ⊆ sub(ψ) is a ψ-particle if it satisfies the following conditions:
• If ϕ = ¬ϕ′, then ϕ ∈ p implies ϕ′ ∉ p.
• If ϕ = ϕ′ ∧ ϕ′′, then ϕ ∈ p implies ϕ′ ∈ p and ϕ′′ ∈ p.
• If ϕ = ϕ′ ∨ ϕ′′, then ϕ ∈ p implies ϕ′ ∈ p or ϕ′′ ∈ p.

Thus, in contrast to a type, a particle may contain both ϕ′ and ϕ′′, but neither ϕ′ ∧ ϕ′′ nor ϕ′ ∨ ϕ′′. Incidentally, particles are closer than types to assignments over modal atoms as described in Section 25.3.4. For particles, ∆(·, ·) is defined as for types. From a set of particles P and the corresponding ∆(·, ·), a Kripke structure Kp can be constructed in the same way as from a set of types (see [PSV06]).

The schema presented in Figure 25.8 can be made to work for particles as well. In the top-down algorithm:
• Init(ψ) is the set of all ψ-particles.
• Update(P) := P \ bad(P), where bad(P) is the set of particles in P that contain unwitnessed diamond formulas, and it is defined similarly to the case of types.

Also in the case of particles, the bottom-up approach differs only in the definitions of Init and Update:
• Init(ψ) := {p ⊆ sub(ψ) | p is a particle and ✸ϕ ∉ p for all ✸ϕ ∈ sub(ψ)} is the set of ψ-particles p that do not contain diamond formulas.
• Update(P) := P ∪ supp(P), where supp(P) is the set of witnessed particles, defined similarly to witnessed types.

Just like a set of types can be encoded in some efficient way, e.g., as a set of bit vectors using a BDD, the same can be done for particles.
It is easy to see that bit vectors for particles may be longer than bit vectors for types because, for example, the input may involve subformulas ✷A and ✸¬A. The overall size of the BDD may, however, be smaller for particles since particles impose fewer constraints than types, and improvements in the run time of the algorithms may result because the particle-based Update functions require checking fewer formulas than the type-based ones.

25.5.2.2. Lean approaches

Even though the particle approach imposes fewer constraints than the type approach, it still involves redundant information: like types, particles may contain both a conjunction and the corresponding conjuncts. To further reduce the size of the corresponding BDDs, [PSV06] proposes a representation that keeps track only of “non-redundant” subformulas. A set of “non-redundant” subformulas atom(ψ) is defined as the set of those formulas in cl(ψ) that are neither conjunctions nor disjunctions, i.e., each ϕ ∈ atom(ψ) is of the form ✷ϕ′, A, ¬✷ϕ′, or ¬A. By definition of types, each ψ-type t ⊆ cl(ψ) corresponds one-to-one to a lean type lean(t) := t ∩ atom(ψ). To specify algorithms for lean types, a relation ∈̇ must be defined recursively as follows: ϕ ∈̇ t if
• ϕ ∈ atom(ψ) and ϕ ∈ t,
• ϕ = ¬ϕ′ and not ϕ′ ∈̇ t,
• ϕ = ϕ′ ∧ ϕ′′, ϕ′ ∈̇ t, and ϕ′′ ∈̇ t, or
• ϕ = ϕ′ ∨ ϕ′′, and ϕ′ ∈̇ t or ϕ′′ ∈̇ t.

The top-down and bottom-up approaches for types can be easily modified to work for lean types. It suffices to modify the definition of the functions bad and supp as follows:

    bad(T) := {t ∈ T | there exists ¬✷ϕ ∈ t and, for all u ∈ T with ∆(t, u), we have ϕ ∈̇ u}.
    supp(T) := {t ⊆ cl(ψ) | t is a type and, for all ¬✷ϕ ∈ t, there exists u ∈ T with ¬ϕ ∈̇ u and ∆(t, u)}.

A lean optimization can also be defined for particles; details are given in [PSV06].
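The recursive relation ∈̇ for lean types can be read as a small evaluator that recomputes the membership of conjunctions and disjunctions from the members of atom(ψ) kept in the lean type. The Python sketch below illustrates this reading; the tuple encoding of formulas and the names Atom, Not, And, Or, Box, is_lean_atom and dot_in are assumptions of ours, and the formula queried is assumed to belong to cl(ψ).

```python
# Hypothetical tuple encoding of formulas (not from the chapter).
def Atom(name): return ("atom", name)
def Not(f): return ("not", f)
def And(f, g): return ("and", f, g)
def Or(f, g): return ("or", f, g)
def Box(f): return ("box", f)

def is_lean_atom(f):
    """Members of atom(psi): A, box formulas, and their negations."""
    if f[0] == "not":
        f = f[1]
    return f[0] in ("atom", "box")

def dot_in(phi, t):
    """Decide phi (dotted-in) t for a lean type t, following the four cases."""
    if is_lean_atom(phi):
        return phi in t
    if phi[0] == "not":
        return not dot_in(phi[1], t)
    if phi[0] == "and":
        return dot_in(phi[1], t) and dot_in(phi[2], t)
    if phi[0] == "or":
        return dot_in(phi[1], t) or dot_in(phi[2], t)
    raise ValueError("unexpected formula: %r" % (phi,))

A, B = Atom("A"), Atom("B")
lean_t = frozenset({A, Not(B), Box(A)})      # a lean type over A, B, box A
print(dot_in(And(A, Not(B)), lean_t))
print(dot_in(Or(B, Not(Box(A))), lean_t))
print(dot_in(Not(And(A, B)), lean_t))
```

Only the atoms and (negated) box formulas are stored; everything else is derived on demand, which is the saving that the lean representation aims at.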
Notice that this approach also bears some resemblance to the approach used in [CGH97] to translate LTL to SMV.

25.5.2.3. Level-based evaluation

Another variation of the basic algorithm presented in Figure 25.8 exploits the fact that K enjoys the finite-tree-model property, i.e., each satisfiable formula ψ of K has a finite tree model whose depth is bounded by the depth of nested modal operators depth(ψ) of ψ. We can think of such a model as being partitioned into layers, where all states that are at distance i from the root are said to be in layer i. Instead of representing a complete model using a set of particles or types, each layer in the model can be represented using a separate set. Since only a subset of all subformulas appears in one layer, the representation can be more compact. We start by (re)defining cl(·) as

    cli(ψ) := {ϕ ∈ cl(ψ) | ϕ occurs at modal depth i in ψ}

and ∆(·, ·) as

    ∆i(t, t′) iff t ⊆ cli(ψ), t′ ⊆ cli+1(ψ), and ϕ′ ∈ t′ for all ✷ϕ′ ∈ t

in order to adapt them to the layered approach. A sequence of sets of types T = ⟨T0, T1, ..., Td⟩ with Ti ⊆ 2^cli(ψ) can still be converted into a tree Kripke structure (see [PSV06] for details).

    d := depth(ψ)
    Xd := Initd(ψ)
    for i := d − 1 downto 0 do
      Xi := Update(Xi+1, i)
    end
    if exists x ∈ X0 such that ψ ∈ x then
      return “ψ is satisfiable”
    else
      return “ψ is not satisfiable”

    Figure 25.9. Algorithm for the level-based optimization.

A bottom-up algorithm for level-based evaluation can be defined as in Figure 25.9. The algorithm works bottom-up in the sense that it starts with the leaves of a tree model at the deepest level and then moves up the tree model toward the root, adding nodes that are “witnessed”. In contrast, the bottom-up approach presented earlier starts with all leaves of a tree model.
The accessory functions can be defined as follows:
• Init_i(ψ) := {t ⊆ cl_i(ψ) | t is a type}.
• Update(T, i) := {t ∈ Init_i(ψ) | for all ¬✷ϕ ∈ t there exists u ∈ T with ¬ϕ ∈ u and ∆_i(t, u)}.
For a set T of types of formulas at level i + 1, Update(T, i) represents all types of formulas at level i that are witnessed in T.

25.6. The Eager DPLL-based approach

Recently, [SV06, SV08] have explored the idea of encoding Km/ALC-satisfiability into SAT and handling it with state-of-the-art SAT tools. A satisfiability-preserving encoding from Km/ALC to SAT was proposed there, with a few variations and some important optimizations. As Km-satisfiability is PSPACE-complete, the encoding is necessarily worst-case exponential (unless PSPACE=NP). However, the only source of exponentiality is the modal depth of the input formula: if the depth is bounded, the problem is NP-complete [Hal95], so that the encoding becomes polynomial. In practice, the experiments presented there showed that this approach can handle most or all of the problems which are within the reach of the other approaches, with performance comparable to, or even better than, that of the current state-of-the-art tools. As this idea was inspired by the so-called “eager approach” to SMT [BGV99, SSB02, Str02] (see §26.3), we call this approach the eager approach to modal reasoning. In this section we present an overview of this approach.

25.6.1.
The basic encoding

In order to make our presentation more uniform, and to avoid considering the polarity of subformulas, we adopt from [Fit83, Mas00] the representation of Km-formulas from the following table:

    α              α1      α2          β              β1      β2
    (ϕ1 ∧ ϕ2)      ϕ1      ϕ2          (ϕ1 ∨ ϕ2)      ϕ1      ϕ2
    ¬(ϕ1 ∨ ϕ2)     ¬ϕ1     ¬ϕ2         ¬(ϕ1 ∧ ϕ2)     ¬ϕ1     ¬ϕ2
    ¬(ϕ1 → ϕ2)     ϕ1      ¬ϕ2         (ϕ1 → ϕ2)      ¬ϕ1     ϕ2

    π^r            π0^r                ν^r            ν0^r
    ✸r ϕ1          ϕ1                  ✷r ϕ1          ϕ1
    ¬✷r ϕ1         ¬ϕ1                 ¬✸r ϕ1         ¬ϕ1

in which non-literal Km-formulas are grouped into four categories: α’s (conjunctive), β’s (disjunctive), π’s (existential), ν’s (universal). All such formulas occur in the main formula with positive polarity only (e.g., a ∧-formula [resp. ∨-formula] occurring negatively is considered a positive occurrence of a β-formula [resp. an α-formula]; a ✷r-formula [resp. a ✸r-formula] occurring negatively is considered a positive occurrence of a π-formula [resp. a ν-formula]). This allows for disregarding the issue of polarity of subformulas.

We borrow some notation from the Single Step Tableau (SST) framework [Mas00, DM00]. We uniquely represent states in M as labels σ, i.e., non-empty sequences of integers 1.n_1^{r_1}.n_2^{r_2}. ... .n_k^{r_k}, s.t. the label 1 represents the root state, and σ.n^r represents the n-th successor of σ through the relation R_r. With a little abuse of notation, hereafter we may say “a state σ” meaning “a state labeled by σ”. We call a labeled formula a pair ⟨σ : ψ⟩, s.t. σ is a state label and ψ is a Km-formula.

Let A[ , ] be an injective function which maps a labeled formula ⟨σ : ψ⟩, s.t. ψ is not in the form ¬φ, into a Boolean variable A[σ, ψ]. Let L[σ, ψ] denote ¬A[σ, φ] if ψ is in the form ¬φ, A[σ, ψ] otherwise. Given a Km-formula ϕ, Km2SAT builds a Boolean CNF formula recursively as follows:

\begin{align}
K_m2SAT(\varphi) &:= A_{[1,\,\varphi]} \wedge Def(1, \varphi) \tag{25.21}\\
Def(\sigma, A_i) &:= \top \tag{25.22}\\
Def(\sigma, \neg A_i) &:= \top \tag{25.23}\\
Def(\sigma, \alpha) &:= (L_{[\sigma,\,\alpha]} \rightarrow (L_{[\sigma,\,\alpha_1]} \wedge L_{[\sigma,\,\alpha_2]})) \wedge Def(\sigma, \alpha_1) \wedge Def(\sigma, \alpha_2) \tag{25.24}\\
Def(\sigma, \beta) &:= (L_{[\sigma,\,\beta]} \rightarrow (L_{[\sigma,\,\beta_1]} \vee L_{[\sigma,\,\beta_2]})) \wedge Def(\sigma, \beta_1) \wedge Def(\sigma, \beta_2) \tag{25.25}\\
Def(\sigma, \pi^{r,j}) &:= (L_{[\sigma,\,\pi^{r,j}]} \rightarrow L_{[\sigma.j,\,\pi_0^{r,j}]}) \wedge Def(\sigma.j, \pi_0^{r,j}) \tag{25.26}\\
Def(\sigma, \nu^{r}) &:= \bigwedge_{\langle\sigma:\pi^{r,i}\rangle} ((L_{[\sigma,\,\nu^{r}]} \wedge L_{[\sigma,\,\pi^{r,i}]}) \rightarrow L_{[\sigma.i,\,\nu_0^{r}]}) \wedge \bigwedge_{\langle\sigma:\pi^{r,i}\rangle} Def(\sigma.i, \nu_0^{r}) \tag{25.27}
\end{align}

Here by “⟨σ : π^{r,i}⟩” we mean that π^{r,i} is the i-th distinct π^r formula labeled by σ. We assume that the Km-formulas are represented as DAGs, so as to avoid the expansion of the same Def(σ, ψ) more than once. Moreover, following [Mas00], we assume that, for each σ, the Def(σ, ψ)’s are expanded in the order: α, β, π, ν. Thus, each Def(σ, ν^r) is expanded after the expansion of all Def(σ, π^{r,i})’s, so that Def(σ, ν^r) will generate one clause ((L[σ, π^{r,i}] ∧ L[σ, ν^r]) → L[σ.i, ν0^r]) and one novel definition Def(σ.i, ν0^r) for each Def(σ, π^{r,i}) expanded (notice that, e.g., an occurrence of ✷r ψ is considered a ν-formula if positive, a π-formula if negative).

Intuitively, Km2SAT(ϕ) mimics the construction of an SST tableau expansion [Mas00, DM00], s.t., if there exists an open tableau T for ⟨1 : ϕ⟩, then there exists a total truth assignment µ which satisfies Km2SAT(ϕ), and vice versa. Thus, from the correctness and completeness of the SST framework, we have the following fact.

Theorem 2. [SV08] A Km-formula ϕ is Km-satisfiable if and only if the corresponding Boolean formula Km2SAT(ϕ) is satisfiable.

Notice that, due to (25.27), the number of variables and clauses in Km2SAT(ϕ) may grow exponentially with depth(ϕ). This is in accordance with what is stated in [Hal95].

Example 9 (NNF). Let ϕnnf be (✸A1 ∨ ✸(A2 ∨ A3)) ∧ ✷¬A1 ∧ ✷¬A2 ∧ ✷¬A3.
(For K1-formulas, we omit the box and diamond indexes.) It is easy to see that ϕnnf is K1-unsatisfiable: the ✸-atoms impose that at least one atom Ai is true in at least one successor of the root state, whilst the ✷-atoms impose that all atoms Ai are false in all successor states of the root state. Km2SAT(ϕnnf) is:

     1.     A[1, ϕnnf]
     2. ∧ ( A[1, ϕnnf] → (A[1, ✸A1∨✸(A2∨A3)] ∧ A[1, ✷¬A1] ∧ A[1, ✷¬A2] ∧ A[1, ✷¬A3]) )
     3. ∧ ( A[1, ✸A1∨✸(A2∨A3)] → (A[1, ✸A1] ∨ A[1, ✸(A2∨A3)]) )
     4. ∧ ( A[1, ✸A1] → A[1.1, A1] )
     5. ∧ ( A[1, ✸(A2∨A3)] → A[1.2, A2∨A3] )
     6. ∧ ( (A[1, ✷¬A1] ∧ A[1, ✸A1]) → ¬A[1.1, A1] )
     7. ∧ ( (A[1, ✷¬A2] ∧ A[1, ✸A1]) → ¬A[1.1, A2] )
     8. ∧ ( (A[1, ✷¬A3] ∧ A[1, ✸A1]) → ¬A[1.1, A3] )
     9. ∧ ( (A[1, ✷¬A1] ∧ A[1, ✸(A2∨A3)]) → ¬A[1.2, A1] )
    10. ∧ ( (A[1, ✷¬A2] ∧ A[1, ✸(A2∨A3)]) → ¬A[1.2, A2] )
    11. ∧ ( (A[1, ✷¬A3] ∧ A[1, ✸(A2∨A3)]) → ¬A[1.2, A3] )
    12. ∧ ( A[1.2, A2∨A3] → (A[1.2, A2] ∨ A[1.2, A3]) )

After a run of BCP, 3. reduces to the implicate disjunction A[1, ✸A1] ∨ A[1, ✸(A2∨A3)]. If the first element A[1, ✸A1] is assigned to true, then by BCP we have a conflict on 4. and 6. If A[1, ✸A1] is set to false, then the second element A[1, ✸(A2∨A3)] is assigned to true, and by BCP we have a conflict on 12. Thus Km2SAT(ϕnnf) is unsatisfiable.

25.6.2. Optimizations

The following optimizations of the encoding have been proposed in [SV06, SV08] in order to reduce the size of the output propositional formula.

25.6.2.1. Pre-conversion to BNF

Before the encoding, some potentially useful preprocessing on the input formula can be performed. First, the input Km-formulas can be converted into BNF. One potential advantage is that, when one ✷r ψ occurs both positively and negatively (like, e.g., in (✷r ψ ∨ ...) ∧ (¬✷r ψ ∨ ...) ∧ ...), then both occurrences of ✷r ψ are labeled by the same Boolean atom A[σ, ✷r ψ], and hence they are always assigned the same truth value by DPLL; with NNF, instead, the negative occurrence ¬✷r ψ is rewritten into ✸r ¬ψ, so that two distinct Boolean atoms A[σ, ✷r ψ] and A[σ, ✸r ¬ψ] are generated; DPLL can assign them the same truth value, creating a hidden conflict which may require some extra Boolean search to reveal.

Example 10 (BNF). We consider the BNF variant of the ϕnnf formula of Example 9, ϕbnf = (¬✷¬A1 ∨ ¬✷(¬A2 ∧ ¬A3)) ∧ ✷¬A1 ∧ ✷¬A2 ∧ ✷¬A3. As before, it is easy to see that ϕbnf is K1-unsatisfiable. Km2SAT(ϕbnf) is:

     1.     A[1, ϕbnf]
     2. ∧ ( A[1, ϕbnf] → (A[1, (¬✷¬A1∨¬✷(¬A2∧¬A3))] ∧ A[1, ✷¬A1] ∧ A[1, ✷¬A2] ∧ A[1, ✷¬A3]) )
     3. ∧ ( A[1, (¬✷¬A1∨¬✷(¬A2∧¬A3))] → (¬A[1, ✷¬A1] ∨ ¬A[1, ✷(¬A2∧¬A3)]) )
     4. ∧ ( ¬A[1, ✷¬A1] → A[1.1, A1] )
     5. ∧ ( ¬A[1, ✷(¬A2∧¬A3)] → ¬A[1.2, (¬A2∧¬A3)] )
     6. ∧ ( (A[1, ✷¬A1] ∧ ¬A[1, ✷¬A1]) → ¬A[1.1, A1] )
     7. ∧ ( (A[1, ✷¬A2] ∧ ¬A[1, ✷¬A1]) → ¬A[1.1, A2] )
     8. ∧ ( (A[1, ✷¬A3] ∧ ¬A[1, ✷¬A1]) → ¬A[1.1, A3] )
     9. ∧ ( (A[1, ✷¬A1] ∧ ¬A[1, ✷(¬A2∧¬A3)]) → ¬A[1.2, A1] )
    10. ∧ ( (A[1, ✷¬A2] ∧ ¬A[1, ✷(¬A2∧¬A3)]) → ¬A[1.2, A2] )
    11. ∧ ( (A[1, ✷¬A3] ∧ ¬A[1, ✷(¬A2∧¬A3)]) → ¬A[1.2, A3] )
    12. ∧ ( ¬A[1.2, (¬A2∧¬A3)] → (A[1.2, A2] ∨ A[1.2, A3]) )

Unlike with NNF, Km2SAT(ϕbnf) is found unsatisfiable directly by BCP. In fact, the unit-propagation of A[1, ✷¬A1] from 2. causes ¬A[1, ✷¬A1] in 3. to be false, so that one of the two (unsatisfiable) branches induced by the disjunction is cut a priori. With NNF, the corresponding atoms A[1, ✷¬A1] and A[1, ✸A1] are not recognized to be the negation of each other, s.t. DPLL may need to explore one more Boolean branch.

25.6.2.2.
Lifting boxes and diamonds

As a second form of preprocessing, the Km-formula can also be rewritten by recursively applying the Km-validity-preserving “box/diamond-lifting rules”:

    (✷r ϕ1 ∧ ✷r ϕ2) =⇒ ✷r(ϕ1 ∧ ϕ2),        (✸r ϕ1 ∨ ✸r ϕ2) =⇒ ✸r(ϕ1 ∨ ϕ2),
    (¬✷r ϕ1 ∨ ¬✷r ϕ2) =⇒ ¬✷r(ϕ1 ∧ ϕ2),     (¬✸r ϕ1 ∧ ¬✸r ϕ2) =⇒ ¬✸r(ϕ1 ∨ ϕ2).        (25.28)

This has the potential benefit of reducing the number of π^{r,i} formulas, and hence the number of labels σ.i to take into account in the expansion of the Def(σ, ν^r)’s (25.27).

Example 11 (BNF with LIFT). If we apply the rules (25.28) to the formula of Example 10, then we have ϕbnflift = ¬✷(¬A1∧¬A2∧¬A3) ∧ ✷(¬A1∧¬A2∧¬A3). Km2SAT(ϕbnflift) is thus:

    1.     A[1, ϕbnflift]
    2. ∧ ( A[1, ϕbnflift] → (¬A[1, ✷(¬A1∧¬A2∧¬A3)] ∧ A[1, ✷(¬A1∧¬A2∧¬A3)]) )
    3. ∧ ( ¬A[1, ✷(¬A1∧¬A2∧¬A3)] → ¬A[1.1, (¬A1∧¬A2∧¬A3)] )
    4. ∧ ( (A[1, ✷(¬A1∧¬A2∧¬A3)] ∧ ¬A[1, ✷(¬A1∧¬A2∧¬A3)]) → A[1.1, (¬A1∧¬A2∧¬A3)] )
    5. ∧ ( ¬A[1.1, (¬A1∧¬A2∧¬A3)] → (A[1.1, A1] ∨ A[1.1, A2] ∨ A[1.1, A3]) )
    6. ∧ ( A[1.1, (¬A1∧¬A2∧¬A3)] → (¬A[1.1, A1] ∧ ¬A[1.1, A2] ∧ ¬A[1.1, A3]) )

Km2SAT(ϕbnflift) is found unsatisfiable directly by BCP on 1. and 2.

One potential drawback of applying the lifting rules (25.28) is that, by collapsing a conjunction/disjunction of modal atoms into one single atom, the possibility of sharing box/diamond subformulas in the DAG representation of ϕ is reduced. To cope with this problem, it is possible to adopt a controlled policy for applying box/diamond-lifting, that is, to apply (25.28) only if neither atom has multiple occurrences.

25.6.2.3. Handling incompatible π^r and ν^r

A first straightforward optimization, in the BNF variant, avoids the useless encoding of incompatible π^r and ν^r formulas.
In BNF, in fact, the same subformula ✷r ψ may occur in the same state σ both positively and negatively (e.g., if π^{r,j} is ¬✷r ψ and ν^r is ✷r ψ). If so, Km2SAT labels both those occurrences of ✷r ψ with the same Boolean atom A[σ, ✷r ψ], and recursively produces two distinct subsets of clauses in the encoding, by applying (25.26) to ¬✷r ψ and (25.27) to ✷r ψ respectively. However, the latter step (25.27) generates a valid clause (A[σ, ✷r ψ] ∧ ¬A[σ, ✷r ψ]) → A[σ.j, ψ], which can be dropped. Consequently, if A[σ.j, ψ] no longer occurs in the formula, then Def(σ.j, ψ) can be dropped as well, as there is no longer any need to define ⟨σ.j : ψ⟩.

Example 12. In the formula ϕbnf of Example 10 the implication 6. is valid and can be dropped. In the formula ϕbnflift of Example 11, not only 4., but also 6. can be dropped.

25.6.2.4. On-the-fly Boolean Constraint Propagation

One major problem of the basic encoding of §25.6.1 is that it is purely syntactic, that is, it does not consider the possible truth values of the subformulas, and the effect of their propagation through the Boolean and modal connectives. In particular, Km2SAT applies (25.26) [resp. (25.27)] to every π-subformula [resp. ν-subformula], regardless of the fact that the truth values which can be deterministically assigned to the labeled subformulas of ⟨1 : ϕ⟩ may allow for dropping some labeled π-/ν-subformulas, and thus remove the need to encode them.

One solution to this problem is to apply BCP on-the-fly during the construction of Km2SAT(ϕ). If a contradiction is found, then Km2SAT(ϕ) is ⊥. When BCP allows for dropping one implication in (25.24)-(25.27) without assigning some of its implicate literals, namely L[σ, ψi], then ⟨σ : ψi⟩ need not be defined, so that Def(σ, ψi) can be dropped.
Importantly, dropping Def(σ, π^{r,j}) for some π-formula ⟨σ : π^{r,j}⟩ prevents generating the label σ.j (25.26) and all its successor labels σ.j.σ′ (corresponding to the subtree of states rooted in σ.j), so that all the corresponding labeled subformulas are not encoded.

Example 13. Consider Example 10. After building 1.-3. in Km2SAT(ϕbnf), the atoms A[1, ϕbnf], A[1, (¬✷¬A1∨¬✷(¬A2∧¬A3))], A[1, ✷¬A1], A[1, ✷¬A2] and A[1, ✷¬A3] can be deterministically assigned to true by applying BCP. This causes the removal from 3. of the first implied disjunct ¬A[1, ✷¬A1], so that 4. is not generated. As label 1.1 is not defined, 6., 7. and 8. are not generated. Then, after the construction of 5., 9., 10., 11. and 12., a contradiction is found by applying BCP, so that Km2SAT(ϕ) is ⊥.

25.6.2.5. On-the-fly Pure-Literal Reduction

Another technique, evolved from that proposed in [PSV02, PV03], applies Pure-Literal reduction on-the-fly during the construction of Km2SAT(ϕ). When, for some label σ, all the clauses containing atoms A[σ, ψ] have been generated, if one of these atoms occurs only positively [resp. negatively], then it can be safely assigned to true [resp. to false], and hence the clauses containing A[σ, ψ] can be dropped. As a consequence, some other atom A[σ, ψ′] can become pure, so that the process is repeated until a fixpoint is reached.

Example 14. Consider the formula ϕbnf of Example 10. During the construction of Km2SAT(ϕbnf), after 1.-8. are generated, no more clauses containing atoms of the form A[1.1, ψ] are to be generated. Then we notice that A[1.1, A2] and A[1.1, A3] occur only negatively, so that they can be safely assigned to false. Therefore, 7. and 8. can be safely dropped. The same reasoning applies later to A[1.2, A1] and 9. The resulting formula is found inconsistent by BCP.
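The basic encoding of §25.6.1 and the two Boolean simplifications above can be tried out on the running examples with the following minimal Python sketch. It is not the implementation of [SV06, SV08]: the tuple-based formula representation, the single modality, the absence of double negations and all function names are assumptions made here for illustration, and BCP and pure-literal reduction are applied to the finished clause set rather than on-the-fly. Successor labels may also be numbered in a different order than in the examples.

```python
from collections import defaultdict

# Hypothetical formula encoding (single modality, no double negations):
#   "A1" (atom), ("not", f), ("and", f, g, ...), ("or", f, g, ...),
#   ("box", f), ("dia", f)

def neg(f):
    return f[1] if isinstance(f, tuple) and f[0] == "not" else ("not", f)

def lit(sigma, f):
    """L[sigma, f] as (sign, variable); the variable stands for A[sigma, .]."""
    if isinstance(f, tuple) and f[0] == "not":
        return (False, (sigma, f[1]))
    return (True, (sigma, f))

def negate(l):
    return (not l[0], l[1])

def classify(f):
    """Kind (alpha, beta, pi, nu or lit) and components of f."""
    if isinstance(f, str):
        return "lit", []
    if f[0] == "not":
        g = f[1]
        if isinstance(g, str):
            return "lit", []
        dual = {"and": "beta", "or": "alpha", "box": "pi", "dia": "nu"}
        return dual[g[0]], [neg(h) for h in g[1:]]
    kind = {"and": "alpha", "or": "beta", "dia": "pi", "box": "nu"}
    return kind[f[0]], list(f[1:])

def km2sat(phi):
    """Clauses of the basic encoding; labels are tuples such as (1, 2)."""
    clauses = [[lit((1,), phi)]]
    todo = [((1,), [phi])]
    while todo:
        sigma, pending = todo.pop()
        seen, pis, nus = set(), [], []
        while pending:                      # alpha/beta expansion first
            f = pending.pop()
            if f in seen:
                continue
            seen.add(f)
            kind, comps = classify(f)
            head = negate(lit(sigma, f))
            if kind == "alpha":
                clauses += [[head, lit(sigma, c)] for c in comps]
                pending += comps
            elif kind == "beta":
                clauses.append([head] + [lit(sigma, c) for c in comps])
                pending += comps
            elif kind == "pi":
                pis.append(f)
            elif kind == "nu":
                nus.append(f)
        for j, p in enumerate(pis, start=1):    # then pi, then nu
            succ, p0 = sigma + (j,), classify(p)[1][0]
            clauses.append([negate(lit(sigma, p)), lit(succ, p0)])
            body = [p0]
            for nu in nus:
                nu0 = classify(nu)[1][0]
                clauses.append([negate(lit(sigma, nu)),
                                negate(lit(sigma, p)), lit(succ, nu0)])
                body.append(nu0)
            todo.append((succ, body))
    return clauses

def bcp(clauses):
    """Unit propagation: None on conflict, else the partial assignment."""
    value, changed = {}, True
    while changed:
        changed = False
        for c in clauses:
            if any(value.get(v) == s for s, v in c):
                continue
            free = [(s, v) for s, v in c if v not in value]
            if not free:
                return None
            if len(free) == 1:
                value[free[0][1]] = free[0][0]
                changed = True
    return value

def pure_literal(clauses):
    """Drop clauses containing a pure variable, up to a fixpoint."""
    clauses = list(clauses)
    while True:
        signs = defaultdict(set)
        for c in clauses:
            for s, v in c:
                signs[v].add(s)
        pure = {v for v, ss in signs.items() if len(ss) == 1}
        if not pure:
            return clauses
        clauses = [c for c in clauses if not any(v in pure for _, v in c)]

box = lambda f: ("box", f)
no = lambda f: ("not", f)
tail = (box(no("A1")), box(no("A2")), box(no("A3")))
phi_nnf = ("and", ("or", ("dia", "A1"), ("dia", ("or", "A2", "A3")))) + tail
phi_bnf = ("and", ("or", no(box(no("A1"))),
                   no(box(("and", no("A2"), no("A3")))))) + tail
```

On these inputs, bcp(km2sat(phi_bnf)) reports a conflict while bcp(km2sat(phi_nnf)) does not, in line with Examples 9 and 10, and pure_literal(km2sat(phi_bnf)) drops the three clauses corresponding to 7., 8. and 9. of Example 10.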
(In fact, notice that in Example 10 A[1.1, A2], A[1.1, A3], and A[1.2, A1] play no role in the unsatisfiability of Km2SAT(ϕbnf).)

References

[ABC+02] G. Audemard, P. Bertoli, A. Cimatti, A. Kornilowicz, and R. Sebastiani. A SAT Based Approach for Solving Formulas over Boolean and Linear Mathematical Propositions. In Proceedings of 18th International Conference on Automated Deduction (CADE), volume 2392 of LNAI. Springer, 2002.
[ACG00] A. Armando, C. Castellini, and E. Giunchiglia. SAT-based procedures for temporal reasoning. In Proceedings of 5th European Conference on Planning (ECP), volume 1809 of LNCS. Springer, 2000.
[AG93] A. Armando and E. Giunchiglia. Embedding Complex Decision Procedures inside an Interactive Theorem Prover. Annals of Mathematics and Artificial Intelligence, 8(3–4):475–502, 1993.
[AGHd00] C. Areces, R. Gennari, J. Heguiabehere, and M. de Rijke. Tree-based heuristics in modal theorem proving. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI), pages 199–203, 2000.
[BCM+03] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors. The Description Logic Handbook: Theory, Implementation, and Applications. Cambridge University Press, 2003.
[BFH+94] F. Baader, E. Franconi, B. Hollunder, B. Nebel, and H. J. Profitlich. An Empirical Analysis of Optimization Techniques for Terminological Representation Systems or: Making KRIS get a move on. Applied Artificial Intelligence. Special Issue on Knowledge Base Management, 4:109–132, 1994.
[BFT95] P. Bresciani, E. Franconi, and S. Tessaris. Implementing and testing expressive Description Logics: a preliminary report. In Proc. International Workshop on Description Logics, Rome, Italy, 1995.
[BGdR03] S. Brand, R. Gennari, and M. de Rijke. Constraint Programming for Modelling and Solving Modal Satisfiability.
In Proceedings of 9th International Conference on Principles and Practice of Constraint Programming (CP), volume 3010 of LNAI, pages 795–800. Springer, 2003.
[BGV99] R. Bryant, S. German, and M. Velev. Exploiting Positive Equality in a Logic of Equality with Uninterpreted Functions. In Proceedings of 11th International Conference on Computer Aided Verification (CAV), volume 1633 of LNCS. Springer, 1999.
[BH91] F. Baader and B. Hollunder. A Terminological Knowledge Representation System with Complete Inference Algorithms. In Proceedings of the First International Workshop on Processing Declarative Knowledge, volume 572 of LNCS, pages 67–85, Kaiserslautern (Germany), 1991. Springer-Verlag.
[Bra01] R. Brafman. A simplifier for propositional formulas with many binary clauses. In Proceedings of 17th International Joint Conference on Artificial Intelligence (IJCAI), 2001.
[BS97] R. J. Bayardo and R. C. Schrag. Using CSP Look-Back Techniques to Solve Real-World SAT instances. In Proceedings of 14th National Conference on Artificial Intelligence (AAAI), pages 203–208. AAAI Press, 1997.
[BW03] F. Bacchus and J. Winter. Effective Preprocessing with Hyper-Resolution and Equality Reduction. In Proceedings of 6th International Conference on Theory and Applications of Satisfiability Testing (SAT), 2003.
[CGH97] E. M. Clarke, O. Grumberg, and K. Hamaguchi. Another look at LTL model checking. Formal Methods in System Design, 10(1):47–71, 1997.
[Che80] B. F. Chellas. Modal Logic: an Introduction. Cambridge University Press, 1980.
[D’A92] M. D’Agostino. Are Tableaux an Improvement on Truth-Tables? Journal of Logic, Language and Information, 1:235–252, 1992.
[DLL62] M. Davis, G. Logemann, and D. Loveland. A machine program for theorem proving. Communications of the ACM, 5(7), 1962.
[DM94] M. D’Agostino and M. Mondadori. The Taming of the Cut. Journal of Logic and Computation, 4(3):285–319, 1994.
[DM00] F. Donini and F. Massacci. EXPTIME tableaux for ALC.
Artificial Intelligence, 124(1):87–138, 2000.
[DP60] M. Davis and H. Putnam. A computing procedure for quantification theory. Journal of the ACM, 7:201–215, 1960.
[EB05] N. Eén and A. Biere. Effective Preprocessing in SAT Through Variable and Clause Elimination. In Proceedings of 8th International Conference on Theory and Applications of Satisfiability Testing (SAT), volume 3569 of LNCS. Springer, 2005.
[ES04] N. Eén and N. Sörensson. An extensible SAT-solver. In Proceedings of 6th International Conference on Theory and Applications of Satisfiability Testing (SAT), volume 2919 of LNCS, pages 502–518. Springer, 2004.
[FHMV95] R. Fagin, J. Halpern, Y. Moses, and M. Y. Vardi. Reasoning about knowledge. The MIT Press, 1995.
[Fit83] M. Fitting. Proof Methods for Modal and Intuitionistic Logics. D. Reidel Publishing, 1983.
[GGST98] E. Giunchiglia, F. Giunchiglia, R. Sebastiani, and A. Tacchella. More evaluation of decision procedures for modal logics. In Proceedings of Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR’98), Trento, Italy, 1998.
[GGST00] E. Giunchiglia, F. Giunchiglia, R. Sebastiani, and A. Tacchella. SAT vs. Translation based decision procedures for modal logics: a comparative evaluation. Journal of Applied Non-Classical Logics, 10(2):145–172, 2000.
[GGT01] E. Giunchiglia, F. Giunchiglia, and A. Tacchella. SAT Based Decision Procedures for Classical Modal Logics. Journal of Automated Reasoning. Special Issue: Satisfiability at the start of the year 2000, 2001.
[GN02] E. Goldberg and Y. Novikov. BerkMin: A Fast and Robust SAT-Solver. In Proc. DATE ’02, page 142, Washington, DC, USA, 2002. IEEE Computer Society.
[GRS96] F. Giunchiglia, M. Roveri, and R. Sebastiani. A new method for testing decision procedures in modal and terminological logics. In Proc.
of 1996 International Workshop on Description Logics - DL’96, Cambridge, MA, USA, November 1996.
[GS96a] F. Giunchiglia and R. Sebastiani. Building decision procedures for modal logics from propositional decision procedures - the case study of modal K. In Proc. CADE’13, LNAI, New Brunswick, NJ, USA, August 1996. Springer.
[GS96b] F. Giunchiglia and R. Sebastiani. A SAT-based decision procedure for ALC. In Proc. of the 5th International Conference on Principles of Knowledge Representation and Reasoning - KR’96, Cambridge, MA, USA, November 1996.
[GS00] F. Giunchiglia and R. Sebastiani. Building decision procedures for modal logics from propositional decision procedures - the case study of modal K(m). Information and Computation, 162(1/2), October/November 2000.
[GSK98] C. P. Gomes, B. Selman, and H. Kautz. Boosting combinatorial search through randomization. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI’98), pages 431–437, Madison, Wisconsin, 1998.
[GT01] E. Giunchiglia and A. Tacchella. Testing for Satisfiability in Modal Logics using a Subset-matching Size-bounded cache. Annals of Mathematics and Artificial Intelligence, 33:39–68, 2001.
[Hal95] J. Y. Halpern. The effect of bounding the number of primitive propositions and the depth of nesting on the complexity of modal logic. Artificial Intelligence, 75(3):361–372, 1995.
[HJSS96] A. Heuerding, G. Jager, S. Schwendimann, and M. Seyfried. The Logics Workbench LWB: A Snapshot. Euromath Bulletin, 2(1):177–186, 1996.
[HM85] J. Y. Halpern and Y. Moses. A guide to the modal logics of knowledge and belief: preliminary draft. In Proceedings of 9th International Joint Conference on Artificial Intelligence, pages 480–490, Los Angeles, CA, 1985. Morgan Kaufmann Publ. Inc.
[HM92] J. Y. Halpern and Y. Moses. A guide to the completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence, 54(3):319–379, 1992.
[HM01] V. Haarslev and R. Moeller. RACER System Description. In Proc. of International Joint Conference on Automated Reasoning - IJCAR-2001, volume 2083 of LNAI, Siena, Italy, July 2001. Springer-Verlag.
[Hor98a] I. Horrocks. The FaCT system. In Proc. Automated Reasoning with Analytic Tableaux and Related Methods: International Conference Tableaux’98, number 1397 in LNAI, pages 307–312. Springer, May 1998.
[Hor98b] I. Horrocks. Using an expressive description logic: FaCT or fiction? In Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR’98), pages 636–647, 1998.
[HPS99] I. Horrocks and P. F. Patel-Schneider. Optimizing Description Logic Subsumption. Journal of Logic and Computation, 9(3):267–293, 1999.
[HPSS00] I. Horrocks, P. F. Patel-Schneider, and R. Sebastiani. An Analysis of Empirical Testing for Modal Decision Procedures. Logic Journal of the IGPL, 8(3):293–323, May 2000.
[HS96] A. Heuerding and S. Schwendimann. A benchmark method for the propositional modal logics K, KT, S4. Technical Report IAM-96-015, University of Bern, Switzerland, 1996.
[HS99] U. Hustadt and R. Schmidt. An empirical analysis of modal theorem provers. Journal of Applied Non-Classical Logics, 9(4), 1999.
[HSW99] U. Hustadt, R. A. Schmidt, and C. Weidenbach. MSPASS: Subsumption Testing with SPASS. In Proc. 1999 International Workshop on Description Logics (DL’99), vol. 22, CEUR Workshop Proceedings, pages 136–137, 1999.
[Lad77] R. Ladner. The computational complexity of provability in systems of modal propositional logic. SIAM J. Comp., 6(3):467–480, 1977.
[Mas94] F. Massacci. Strongly analytic tableaux for normal modal logics. In Proceedings of 12th International Conference on Automated Deduction, volume 814 of Lecture Notes in Computer Science. Springer, 1994.
[Mas98] F. Massacci. Simplification: A general constraint propagation technique for propositional and modal tableaux. In Proc. 2nd International Conference on Analytic Tableaux and Related Methods (TABLEAUX-97), volume 1397 of LNAI. Springer, 1998.
[Mas99] F. Massacci. Design and Results of Tableaux-99 Non-Classical (Modal) System Competition. In Automated Reasoning with Analytic Tableaux and Related Methods: International Conference (Tableaux’99), 1999.
[Mas00] F. Massacci. Single Step Tableaux for modal logics: methodology, computations, algorithms. Journal of Automated Reasoning, Vol. 24(3), 2000.
[MMZ+01] M. W. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik. Chaff: Engineering an efficient SAT solver. In Design Automation Conference, 2001.
[MSS96] J. P. Marques-Silva and K. A. Sakallah. GRASP - A new Search Algorithm for Satisfiability. In Proc. ICCAD’96, 1996.
[Ngu05] L. A. Nguyen. On the Complexity of Fragments of Modal Logics. In Advances in Modal Logic. King’s College Publications, 2005.
[NOT06] R. Nieuwenhuis, A. Oliveras, and C. Tinelli. Solving SAT and SAT Modulo Theories: from an Abstract Davis-Putnam-Logemann-Loveland Procedure to DPLL(T). Journal of the ACM, 53(6):937–977, November 2006.
[PS98] P. F. Patel-Schneider. DLP system description. In Proc. DL-98, pages 87–89, 1998.
[PSS01] P. F. Patel-Schneider and R. Sebastiani. A system and methodology for generating random modal formulae. In Proc. IJCAR-2001, volume 2083 of LNAI. Springer-Verlag, 2001.
[PSS03] P. F. Patel-Schneider and R. Sebastiani. A New General Method to Generate Random Modal Formulae for Testing Decision Procedures. Journal of Artificial Intelligence Research (JAIR), 18:351–389, May 2003. Morgan Kaufmann.
[PSV02] G. Pan, U. Sattler, and M. Y. Vardi. BDD-Based Decision Procedures for K.
In Proceedings of 18th International Conference on Automated Deduction, volume 2392 of Lecture Notes in Computer Science. Springer, 2002.
[PSV06] G. Pan, U. Sattler, and M. Y. Vardi. BDD-Based Decision Procedures for the Modal Logic K. Journal of Applied Non-Classical Logics, 16(2), 2006.
[PV03] G. Pan and M. Y. Vardi. Optimizing a BDD-based modal solver. In Proceedings of 19th International Conference on Automated Deduction, volume 2741 of Lecture Notes in Computer Science. Springer, 2003.
[Sch91] K. D. Schild. A correspondence theory for terminological logics: preliminary report. In Proc. 12th Int. Joint Conf. on Artificial Intelligence, IJCAI, Sydney, Australia, 1991.
[Seb01] R. Sebastiani. Integrating SAT Solvers with Math Reasoners: Foundations and Basic Algorithms. Technical Report 0111-22, ITC-IRST, Trento, Italy, November 2001.
[Seb07] R. Sebastiani. Lazy Satisfiability Modulo Theories. Journal on Satisfiability, Boolean Modeling and Computation (JSAT), 3, 2007.
[Smu68] R. M. Smullyan. First-Order Logic. Springer-Verlag, NY, 1968.
[SSB02] O. Strichman, S. Seshia, and R. Bryant. Deciding separation formulas with SAT. In Proc. of Computer Aided Verification (CAV’02), LNCS. Springer, 2002.
[SSS91] M. Schmidt-Schauß and G. Smolka. Attributive Concept Descriptions with Complements. Artificial Intelligence, 48:1–26, 1991.
[Str02] O. Strichman. On Solving Presburger and Linear Arithmetic with SAT. In Proc. of Formal Methods in Computer-Aided Design (FMCAD 2002), LNCS. Springer, 2002.
[SV98] R. Sebastiani and A. Villafiorita. SAT-based decision procedures for normal modal logics: a theoretical framework. In Proc. AIMSA’98, volume 1480 of LNAI. Springer, 1998.
[SV06] R. Sebastiani and M. Vescovi. Encoding the Satisfiability of Modal and Description Logics into SAT: The Case Study of K(m)/ALC. In Proc. SAT’06, volume 4121 of LNCS.
Springer, 2006.
[SV08] R. Sebastiani and M. Vescovi. Automated Reasoning in Modal and Description Logics via SAT Encoding: the Case Study of K(m)/ALC-Satisfiability. Technical report, DISI, University of Trento, Italy, 2008. Submitted for journal publication. Available as http://disi.unitn.it/~rseba/sat06/.
[Tac99] A. Tacchella. *SAT system description. In Proc. 1999 International Workshop on Description Logics (DL’99), vol. 22, CEUR Workshop Proceedings, pages 142–144, 1999.
[Tin02] C. Tinelli. A DPLL-based Calculus for Ground Satisfiability Modulo Theories. In Proc. JELIA-02, volume 2424 of LNAI, pages 308–319. Springer, 2002.
[Var89] M. Y. Vardi. On the complexity of epistemic reasoning. In Proceedings, Fourth Annual Symposium on Logic in Computer Science, pages 243–252, 1989.
[Vor99] A. Voronkov. KK: a theorem prover for K. In CADE-16: Proceedings of the 16th International Conference on Automated Deduction, number 1632 in LNAI, pages 383–387. Springer, 1999.
[Vor01] A. Voronkov. How to optimize proof-search in modal logics: new methods of proving redundancy criteria for sequent calculi. ACM Transactions on Computational Logic, 2(2):182–215, 2001.
[WW99] S. Wolfman and D. Weld. The LPSAT Engine & its Application to Resource Planning. In Proc. IJCAI, 1999.
[ZM02] L. Zhang and S. Malik. The quest for efficient boolean satisfiability solvers. In Proc. CAV’02, number 2404 in LNCS, pages 17–36. Springer, 2002.