2.2.1 Formal Logics.
Many kinds of logic-based KR systems have been proposed over the years, mostly relying on first-order logic (FOL), either by restricting or by extending it. Examples include description logics and modal logics, which have been used to represent, for instance, terminological knowledge and time-dependent or subjective knowledge. Here, we briefly recall the state of the art of FOL and its most relevant subsets.
First-order logic. FOL is a general-purpose logic that can be used to represent knowledge symbolically, in a very flexible way. More precisely, it allows both human and computational agents to express (i.e., write) the properties of, and the relations among, a set of entities constituting the domain of the discourse via one or more formulæ and, possibly, to reason over such formulæ by drawing inferences. Here, the domain of the discourse \(\mathbb {D}\) is the set of all relevant entities that should be represented in FOL to be amenable to formal treatment in a particular scenario.
Informally, the syntax for the general FOL formula is defined over the assumption that there exist: (i) a set of constant or function symbols, (ii) a set of predicate symbols, and (iii) a set of variables. Under this assumption, a FOL formula is any expression composed of a list of quantified variables, followed by a number of literals, i.e., predicates that may or may not be prefixed by the negation operator (\(\lnot\)). Literals are commonly combined into expressions via logic connectives, such as conjunction (\(\wedge\)), disjunction (\(\vee\)), implication (\(\rightarrow\)), or equivalence (\(\leftrightarrow\)).
Each predicate consists of a predicate symbol, possibly applied to one or more terms. Terms may be of three sorts: constants, functions, or variables. Constants represent entities from the domain of the discourse. In particular, each constant references a different entity. Functions are combinations of one or more entities via a function symbol. Similar to predicates, functions may carry one or more terms. Being containers of terms, functions enable the creation of arbitrarily complex data structures combining several elementary terms into composite ones. This kind of composability by recursion is what makes the aforementioned definition of ‘symbolic’ valid for FOL. Finally, variables are placeholders for unknown terms, i.e., for either individual entities or groups of entities.
Predicates and terms are very flexible tools to represent knowledge. While terms can be used to represent or reference either entities or groups of entities from the domain of the discourse, predicates can be used to represent relations among entities or the properties of each single entity.
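As an illustration of how terms compose recursively, the following minimal Python sketch models constants, variables, functions, and predicates as immutable records. All class and helper names (`Const`, `Var`, `Func`, `Predicate`, `depth`, `is_ground`) are hypothetical and chosen only for this example, as is the `cons`/`nil` encoding of a list.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    """A constant: references one entity of the domain of the discourse."""
    name: str


@dataclass(frozen=True)
class Var:
    """A variable: a placeholder for an unknown term."""
    name: str


@dataclass(frozen=True)
class Func:
    """A function: a function symbol applied to zero or more terms."""
    symbol: str
    args: Tuple["Term", ...] = ()


Term = Union[Const, Var, Func]


@dataclass(frozen=True)
class Predicate:
    """A predicate symbol, possibly applied to one or more terms."""
    symbol: str
    args: Tuple[Term, ...] = ()


def depth(term: Term) -> int:
    """Nesting depth of a term: 0 for constants and variables."""
    if isinstance(term, Func):
        return 1 + max((depth(a) for a in term.args), default=0)
    return 0


def is_ground(term: Term) -> bool:
    """A term is ground when it contains no variables."""
    if isinstance(term, Var):
        return False
    if isinstance(term, Func):
        return all(is_ground(a) for a in term.args)
    return True


# Assumed encoding: the list [1, 2] as the composite term cons(1, cons(2, nil)).
nil = Const("nil")
two_item_list = Func("cons", (Const("1"), Func("cons", (Const("2"), nil))))

# A property of an entity, stated as a predicate over terms: member(2, [1, 2]).
fact = Predicate("member", (Const("2"), two_item_list))
```

Because `Func` may contain other `Func` instances, composite terms of any depth can be built from elementary ones, which is the composability by recursion mentioned above.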
Intensional vs. extensional. In logic, one may define concepts (i.e., describe data) either extensionally or intensionally. Extensional definitions are direct representations of data. In the particular case of FOL, this implies defining a relation or set by explicitly mentioning the entities it involves. Conversely, intensional definitions are indirect representations of data. In the particular case of FOL, this implies defining a relation or set by describing its elements via other relations or sets. Recursive intensional predicates are very expressive and powerful, as they enable the description of infinite sets via a finite (and commonly small) number of formulæ. This is one of the key benefits of FOL as a means for KR.
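The distinction can be sketched in Python, assuming the set of even natural numbers as a running example (the names `EVEN_UP_TO_8` and `is_even` are hypothetical). The extensional definition lists elements one by one and is necessarily finite; the intensional one mirrors the two recursive clauses even(0) and even(N + 2) ← even(N), and covers the whole infinite set.

```python
# Extensional definition: the members are enumerated explicitly (finite by necessity).
EVEN_UP_TO_8 = {0, 2, 4, 6, 8}


def is_even(n: int) -> bool:
    """Intensional definition, mirroring two recursive clauses:
    even(0).
    even(N + 2) <- even(N).
    """
    if n == 0:
        return True
    if n < 2:
        return False
    return is_even(n - 2)
```

Two clauses suffice to describe infinitely many elements, whereas the extensional set must grow with every element one wishes to cover. The recursion here is kept deliberately naive and is only suitable for small inputs.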
2.2.2 Expressiveness vs. Tractability: Notable Subsets of FOL.
Tractability deals with the theoretical question: can a logic reasoner compute whether a logic formula is true (or not) in reasonable time? Such aspects are deeply entangled with the particular reasoner of choice. Depending on which and how many features a logic includes, it may be more or less expressive. The higher the expressiveness, the higher the complexity of the problems that may be represented via logic and processed via inference. This opens the possibility for the solver to meet queries that cannot be answered in practical time or with a limited amount of memory, or that cannot be answered at all. Roughly speaking, more expressive logic languages make it easier for human beings to describe a particular domain, usually requiring them to write fewer and more concise clauses, at the expense of making it harder for software agents to draw inferences autonomously, because of computational tractability. This is a well-understood phenomenon in both computer science and computational logic [8, 31], often referred to as the expressiveness/tractability trade-off.
FOL, in particular, is considered very expressive. Indeed, it comes with many undecidable, semi-decidable, or simply intractable properties. Hence, several relevant subsets of FOL have been identified in the literature, often sacrificing expressiveness for tractability. Major notions concerning these logics are recalled below.
Horn logic. Horn logic is a notable subset of FOL, characterised by a good trade-off between theoretical expressiveness and practical tractability [36].
Horn logic is designed around the notion of the Horn clause [26]. Horn clauses are FOL formulæ having no explicit quantifiers and consisting of a disjunction of literals, where at most one literal is non-negated or, equivalently, an implication having a single predicate as post-condition and a conjunction of predicates as pre-condition: \(h \leftarrow b_1,\ \ldots ,\ b_n\). Here, \(\leftarrow\) denotes logic implication from right to left, commas denote logic conjunction, and all \(b_i\), as well as \(h\), are predicates of arbitrary arity, possibly carrying FOL terms of any sort, i.e., variables, constants, or functions. Horn clauses are thus if–then rules written in reverse order and only supporting conjunctions of predicates as pre-conditions.
Essentially, Horn logic is a very restricted subset of FOL where (i) formulæ are reduced to clauses, as they can only contain predicates, conjunctions, and a single implication operator; therefore, (ii) operators such as \(\vee\), \(\leftrightarrow\), or \(\lnot\) cannot be used; (iii) variables are implicitly quantified; and (iv) terms work as in FOL.
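To make the shape \(h \leftarrow b_1,\ \ldots ,\ b_n\) concrete, the following Python sketch performs forward chaining over Horn clauses. It is a simplification under stated assumptions: clauses are ground (variable-free), predicates are represented by plain strings, and the clause set and the name `forward_chain` are hypothetical examples rather than part of any particular reasoner.

```python
from typing import FrozenSet, List, Set, Tuple

# A ground Horn clause h <- b_1, ..., b_n as (head, body); an empty body is a fact.
Clause = Tuple[str, FrozenSet[str]]


def forward_chain(clauses: List[Clause]) -> Set[str]:
    """Derive every head whose body is already derived, until nothing changes."""
    derived: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            if head not in derived and body <= derived:
                derived.add(head)
                changed = True
    return derived


# Hypothetical example clauses.
clauses: List[Clause] = [
    ("rain", frozenset()),                            # rain.
    ("no_roof", frozenset()),                         # no_roof.
    ("wet_floor", frozenset({"rain", "no_roof"})),    # wet_floor <- rain, no_roof.
    ("slippery", frozenset({"wet_floor"})),           # slippery <- wet_floor.
    ("flood", frozenset({"rain", "broken_dam"})),     # flood <- rain, broken_dam.
]

consequences = forward_chain(clauses)
```

Here `flood` is never derived, since `broken_dam` does not hold. Each clause has exactly one head and a purely conjunctive body, so the fixpoint loop never needs to reason by cases.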
Datalog. Datalog is a restricted subset of FOL [3] representing knowledge via function-free Horn clauses, as defined above. Thus, essentially, Datalog is a subset of Horn logic where structured terms (i.e., recursive data structures) are forbidden. This is a direct consequence of the lack of function symbols.
Similar to Horn logic, Datalog’s knowledge bases consist of sets of function-free Horn clauses.
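A naive bottom-up evaluation of a function-free knowledge base can be sketched as follows. This is a minimal illustration, not an optimised Datalog engine. It assumes the Prolog-like convention that variables start with an uppercase letter, and that every head variable also occurs in the body; the `parent`/`ancestor` program and all helper names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]   # (predicate symbol, arguments)
Rule = Tuple[Atom, List[Atom]]       # (head, body)


def is_var(term: str) -> bool:
    # Assumed convention: variables start with an uppercase letter.
    return term[:1].isupper()


def match(atom: Atom, fact: Atom, subst: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Extend subst so that atom equals the ground fact, or return None."""
    if atom[0] != fact[0] or len(atom[1]) != len(fact[1]):
        return None
    out = dict(subst)
    for a, f in zip(atom[1], fact[1]):
        if is_var(a):
            if out.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return out


def evaluate(facts: Set[Atom], rules: List[Rule]) -> Set[Atom]:
    """Apply all rules repeatedly until no new ground fact is derived."""
    db = set(facts)
    while True:
        new: Set[Atom] = set()
        for head, body in rules:
            substs: List[Dict[str, str]] = [{}]
            for atom in body:
                substs = [s2 for s in substs for f in db
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                new.add((head[0], tuple(s.get(t, t) for t in head[1])))
        if new <= db:
            return db
        db |= new


facts = {("parent", ("alice", "bob")), ("parent", ("bob", "carol"))}
rules = [
    # ancestor(X, Y) <- parent(X, Y).
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    # ancestor(X, Z) <- parent(X, Y), ancestor(Y, Z).
    (("ancestor", ("X", "Z")), [("parent", ("X", "Y")), ("ancestor", ("Y", "Z"))]),
]
closure = evaluate(facts, rules)
```

Since no function symbols are available, only finitely many ground atoms can be built from the constants in the knowledge base, so the loop is guaranteed to reach a fixpoint.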
Description logics. Description logics (DL) are a family of subsets of FOL, generally involving some or no quantifiers, no structured terms, and no n-ary predicates such that \(n \ge 3\). In other words, description logics represent knowledge by only leveraging constants and variables, in addition to atomic, unary, and binary predicates.
Differences among specific variants of DL lie in which and how many logic connectives are supported, as well as in whether negation is supported or not. The wide variety of DL is due to the well-known expressiveness/tractability trade-off: depending on the particular situation at hand, one may prefer a more expressive (\(\approx\) feature-rich) DL variant at the price of reduced tractability (or even decidability) of the algorithms aimed at manipulating knowledge represented through that DL, or vice versa.
Regardless of the particular DL variant of choice, it is common practice in the scope of DL to call (i) constant terms ‘individuals’, as each constant references a single entity from a given domain; (ii) unary predicates either ‘classes’ or ‘concepts’, as each predicate groups a set of individuals, i.e., all those individuals for which the predicate is true; and (iii) binary predicates either ‘properties’ or ‘roles’, as each predicate relates two sets of individuals. Following such a nomenclature, any piece of knowledge can be represented in DL by tagging each relevant entity with some constant (e.g., a URL) and by defining concepts and properties accordingly.
Notably, binary predicates are of particular interest as they support connecting pairs of entities. This is commonly achieved via subject–predicate–object triplets, i.e., ground binary predicates of the form \(\langle \mathtt {a}\ \mathit {f}\ \mathtt {b} \rangle\) or \(\mathit {f}(\mathtt {a}, \mathtt {b})\), where \(\mathtt {a}\) is the subject, \(\mathit {f}\) is the predicate, and \(\mathtt {b}\) is the object. Such triplets allow users to extensionally describe knowledge in a readable, machine-interpretable, and tractable way.
Collections of triplets constitute the so-called knowledge graphs (KGs), i.e., directed graphs where vertices represent individuals, while arcs represent the binary properties connecting these individuals. These may explicitly or implicitly instantiate a particular ontology, i.e., a formal description of the classes characterising a given domain, of their relations (inclusion, exclusion, intersection, equivalence, etc.), and of the properties they must (or must not) include.
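A knowledge graph of this kind can be sketched in Python as a plain set of subject–predicate–object triplets, queried by pattern. The individuals, the property names, and the `query` helper below are all hypothetical. Encoding class membership through an `is_a` triplet is an assumption of this sketch, made so that unary predicates fit the same three-field shape.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical knowledge graph: each triplet is a ground binary predicate f(a, b).
KG: Set[Triple] = {
    ("alice", "is_a", "Person"),
    ("bob", "is_a", "Person"),
    ("acme", "is_a", "Company"),
    ("alice", "knows", "bob"),
    ("bob", "works_for", "acme"),
}


def query(kg: Set[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Set[Triple]:
    """Return the triplets matching the pattern; None acts as a wildcard."""
    return {
        t for t in kg
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


people = {s for s, _, _ in query(KG, p="is_a", o="Person")}
outgoing_from_bob = query(KG, s="bob")
```

Read as a directed graph, each triplet is an arc labelled by its predicate, going from the subject vertex to the object vertex.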
Propositional logic. Propositional logic is a very restricted subset of FOL, where quantifiers, terms, and non-atomic predicates are missing. Hence, propositional formulæ simply consist of expressions involving one or many 0-ary predicates (i.e., propositions), possibly interconnected by ordinary logic connectives. Here, each proposition may be interpreted as a Boolean variable that can either be true or false, and the truth of formulæ can be computed as in Boolean algebra. Thus, for instance, an example of a propositional formula could be as follows: \(p \wedge \lnot q \rightarrow r,\) where p may be the proposition ‘it is raining’, q may be the proposition ‘there is a roof’, and r may be the proposition ‘the floor is wet’.
The expressiveness of propositional logic is far lower than that of FOL. For instance, because of the lack of quantifiers, each relevant aspect/event should be explicitly modelled as a proposition. Furthermore, because of the lack of terms, entities from a given domain cannot be explicitly referenced. Such a lack of expressiveness, however, implies that computing the satisfiability of a propositional formula is a decidable problem, which may be a desirable property in some application scenarios.
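Decidability can be illustrated by brute force: a formula over n propositions has only \(2^n\) truth assignments, so enumerating them always terminates. The sketch below applies this idea to the formula \(p \wedge \lnot q \rightarrow r\) from the example above. The function names are hypothetical, and exhaustive enumeration is used only for clarity, not as a practical satisfiability procedure.

```python
from itertools import product
from typing import Callable, Dict, Optional, Sequence

Assignment = Dict[str, bool]


def find_model(formula: Callable[[Assignment], bool],
               names: Sequence[str]) -> Optional[Assignment]:
    """Return the first satisfying assignment found, or None if unsatisfiable."""
    for values in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, values))
        if formula(assignment):
            return assignment
    return None


def rain_rule(a: Assignment) -> bool:
    # p AND (NOT q) -> r, with "x -> y" rewritten as "(not x) or y".
    return (not (a["p"] and not a["q"])) or a["r"]


def contradiction(a: Assignment) -> bool:
    # p AND (NOT p): no assignment can satisfy this.
    return a["p"] and not a["p"]


model = find_model(rain_rule, ["p", "q", "r"])
no_model = find_model(contradiction, ["p"])
```

The procedure always answers, but its cost grows exponentially with the number of propositions, which is why decidability alone does not imply tractability.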
Although propositional logic may appear too trivial to handle common decision tasks where non-binary data is involved, it turns out that a number of apparently complex situations can indeed be reduced to a propositional setting. This is the case, for instance, of any expression involving numeric variables or constants, arithmetical comparison operators, logic connectives, and nothing more than that. In fact, formulæ containing comparisons among variables, among constants, or between variables and constants can be reduced to propositional logic by mapping each comparison into a proposition.
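The reduction can be sketched as follows, assuming a hypothetical formula over two numeric variables, `temperature` and `humidity`. Each comparison is given a proposition name, and the formula is then evaluated purely over Booleans. The thresholds, variable names, and helper functions are illustrative assumptions.

```python
import operator
from typing import Callable, Dict, Tuple

# Each proposition stands for one comparison: (variable, operator, constant).
COMPARISONS: Dict[str, Tuple[str, Callable[[float, float], bool], float]] = {
    "p": ("temperature", operator.gt, 30.0),  # p := temperature > 30
    "q": ("humidity", operator.le, 0.4),      # q := humidity <= 0.4
}


def propositionalise(sample: Dict[str, float]) -> Dict[str, bool]:
    """Map a numeric sample to the truth value of each proposition."""
    return {
        name: op(sample[var], constant)
        for name, (var, op, constant) in COMPARISONS.items()
    }


def formula(props: Dict[str, bool]) -> bool:
    # Propositional counterpart of: temperature > 30 AND NOT (humidity <= 0.4).
    return props["p"] and not props["q"]


props = propositionalise({"temperature": 35.0, "humidity": 0.7})
holds = formula(props)
```

This mapping treats each comparison as an opaque proposition, so arithmetic dependencies between comparisons over the same variable (e.g., two mutually exclusive thresholds) are not captured unless they are stated explicitly as additional formulæ.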