1. Introduction
Most introductions to first-order logic first define the syntax of formulas, then formalize their meaning in the form established in the 1930s by Tarski [1] (essentially what programming language theory today calls a “denotational semantics” [2]), then introduce a deduction calculus, and finally show the soundness and completeness of this calculus with respect to the semantics: if a formula can be derived in the calculus, it is true according to the semantics, and vice versa. These relationships between truth and derivability have to be established because there is no self-evident link between the semantics of a formula and the deduction rules associated with it. Historically, deduction came first; the soundness of a deduction calculus was established by showing that it could not lead to apparent inconsistencies, i.e., that a formula and its negation could not both be derived in the deduction system. It was Tarski who first gave formulas a meaning that was independent of deduction.
However, as it has done since the 1940s for many other mathematical areas, category theory [3,4,5,6,7], the general theory of mathematical structures, can also shed an alternative light on first-order logic. It does so by considering logical notions as special instances of “universal” constructions, where a value of interest is determined
first, by describing the core property that the value shall satisfy; and
second, by giving a criterion for choosing a canonical value from all values that satisfy the property.
It was eventually recognized that such universal constructions determine the semantics of the connectives of propositional logic directly from their associated introduction and elimination rules. However, it was not until the late 1960s that Lawvere gained the fundamental insight that this idea could also be applied to the quantifiers of first-order logic [8], thus establishing a direct relationship between its semantics and its proof calculus.
However, this insight has not yet gained a foothold in introductory texts on logic or in basic logic education. The main reason may be that the corresponding material is found mostly in texts on category theory and its applications, where it is dispersed among examples of the application of categorical notions without a clear central presentation. Furthermore, the general treatment of first-order logic with terms and variables requires a complex mathematical apparatus [9] which is far beyond the scope of basic introductions. Reasonably compact introductions can be found, e.g., in Section 2.1.10 of [10], in [11], in Section 1.6 of [12] (however, in the context of type theory rather than classical first-order logic), in Section 9.5 of [3] (the treatment of quantifiers only), and in Section 7.1.12 of [7] (again only the treatment of quantifiers).
The goal of this paper is to give a compact introduction to a categorical version of first-order logic that is fully self-contained, only introduces the categorical notions relevant for the stated purpose, and presents them from the point of view of the intended application. For this purpose, it elaborates a simple but completely formalized syntactic and semantic framework of first-order logic that represents the background of the discussion, without gaps and inconsistencies. As a deliberate decision, this framework does not address the syntax and semantics of terms but abstracts atomic formulas to opaque relations; this allows the discussion to focus on the essentials. However, to describe a reasonably close relative of first-order logic, this framework is (in contrast to other presentations) not based on relations of fixed arity, i.e., with a fixed number of variables; instead, we consider relations of infinite arity, i.e., with infinitely many variables. Only finitely many of these variables may influence the truth value of a relation, which reflects the fact that a classical atomic formula can only reference a finite number of variables. The overall result is a compact and elegant presentation. Because duality has many manifestations in logic, and because a duality is widely regarded as a “giant symmetry”, i.e., a symmetry between theories, we focus on this concept in our approach.
For the implementation, we use RISCAL, the RISC Algorithm Language [13], which is a specification language with an associated software system for describing mathematical algorithms, formally specifying their behavior based on mathematical theories, and validating the correctness of algorithms, specifications, and theories by the execution/evaluation of their formal semantics. The term “algorithm language” indicates that RISCAL is intended to model algorithms in a high-level language (as they can be found in textbooks on discrete mathematics) rather than low-level code, and to specify the behavior of these algorithms by formal contracts. RISCAL has been developed to validate the correctness of mathematical theories, specifications, and programs by checking instances of these artifacts on finite domains; applications of RISCAL are found, for instance, in discrete mathematics, number theory, and computer algebra.
Software based on formal logic plays an ever-increasing role in areas where a mathematically precise understanding of a subject domain and sound rules for reasoning about the properties of this domain are essential. A prime example is the formal modeling, specification, and verification of computer programs and computing systems, but there are many other applications in areas such as knowledge-based systems, computer mathematics, or the semantic web [14]. Furthermore, the intent of all our projects (namely LogTechEdu, SemTech [15], and others listed in the Funding section) is to further advance education in computer science and related topics. In academic courses on computer science and mathematics, students shall, by utilizing the power of modern software based on formal logic and semantics, engage with the material they encounter by actively producing problem solutions rather than just passively taking them from the lecturer.
The remainder of this paper is structured as follows: in Section 2, we define a term-free variant of first-order logic and give it a semantics in the usual style based on set-theoretic notions. In Section 3, we introduce those categorical notions that are necessary for understanding the following elaboration and discuss their relationships. The core of this paper is Section 4, where we elaborate the categorical formulation of the semantics of our variant of first-order logic. In Section 5, we demonstrate that this semantics is constructive by modeling it in the RISCAL system [13], which allows us to automatically check the core propositions in particular finite models. Section 6 concludes our presentation and gives an outlook on our future work.
2. A Relational First-Order Logic
In this section, we introduce a simplified variant of first-order logic that abstracts from the syntactic structure of atomic formulas and thus dispenses with the concepts of terms, constants, function symbols, predicate symbols, and all of the associated semantic apparatus. To this end, atomic formulas are replaced by relations over assignments (maps of variables to values) that are constrained to depend only on a finite number of variables; we will call such relations “predicates”. Consequently, the semantics of every non-atomic formula is also a relation (indeed a predicate, as will be shown).
We begin with some standard notions. First, we specify variables and the values that variables hold.
Axiom 1 (Variables and Values). Let $\mathit{Var}$ denote an arbitrary infinite and enumerable set; we call the elements of this set variables. Furthermore, let $\mathit{Val}$ denote an arbitrary non-empty set; we call the elements of this set values.
Next, we define assignments.
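Definition 1 (Assignments). We define $\mathit{Ass} := \mathit{Var} \to \mathit{Val}$ as the set of all mappings of variables to values (a function space); we call the elements of this set assignments. Thus, for every assignment $a \in \mathit{Ass}$ and every variable $x \in \mathit{Var}$, we have $a(x) \in \mathit{Val}$.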
We note that an assignment is similar to the concept of a state in the formal semantics of programming languages (see, e.g., [16]), where the state is a function from variables to values: the state associates each variable with its current value.
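Definition 2 (Updates). Let $a \in \mathit{Ass}$ be an assignment, $x \in \mathit{Var}$ a variable, and $v \in \mathit{Val}$ a value. We define the update assignment $a[x \mapsto v] \in \mathit{Ass}$ as follows:
$$a[x \mapsto v](y) := \begin{cases} v & \text{if } x = y \\ a(y) & \text{otherwise} \end{cases}$$
Consequently, $a[x \mapsto v]$ is identical to $a$ except that it maps variable $x$ to value $v$.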
Based on this, we can formulate the following updating properties.
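Proposition 1 (Update Properties). Let $a \in \mathit{Ass}$ be an assignment, $x, y \in \mathit{Var}$ variables, and $v_1, v_2 \in \mathit{Val}$ values. Then, we have the following properties:
$$a = a[x \mapsto a(x)]$$
$$a[x \mapsto v_1][x \mapsto v_2] = a[x \mapsto v_2]$$
$$x \neq y \Rightarrow a[x \mapsto v_1][y \mapsto v_2] = a[y \mapsto v_2][x \mapsto v_1]$$
Proof. Directly from the definitions. □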
The properties of assignments listed above (and only these) will be of importance in the subsequent proofs.
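The update operation and the identities that the later proofs rely on can be tried out on a small finite model. The following Python sketch is only an illustration under stated assumptions: it replaces the infinite set of variables by three variable indices, uses two values, represents an assignment as a tuple, and the names `update` and `check_update_properties` are hypothetical.

```python
from itertools import product

# Hypothetical finite model: three variables (indices 0..2) and two values.
VARIABLES = range(3)
VALUES = range(2)

# An assignment is a tuple whose i-th component is the value of variable i.
ASSIGNMENTS = list(product(VALUES, repeat=len(VARIABLES)))


def update(a, x, v):
    """Return the assignment that agrees with a except that x maps to v."""
    return a[:x] + (v,) + a[x + 1:]


def check_update_properties():
    """Exhaustively check three update identities on the finite model."""
    for a in ASSIGNMENTS:
        for x in VARIABLES:
            # Writing back the current value of x changes nothing.
            assert update(a, x, a[x]) == a
            for v1, v2 in product(VALUES, repeat=2):
                # A later update of x overrides an earlier one.
                assert update(update(a, x, v1), x, v2) == update(a, x, v2)
                for y in VARIABLES:
                    if x != y:
                        # Updates of distinct variables commute.
                        assert (update(update(a, x, v1), y, v2)
                                == update(update(a, y, v2), x, v1))
    return True
```

Calling `check_update_properties()` runs through all eight assignments of this model; a failing identity would raise an `AssertionError`.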
Now, we turn to the fundamental semantic notions.
Definition 3 (Relations). We define $\mathit{Rel} := \mathcal{P}(\mathit{Ass})$ as the set of all sets of assignments; we call the elements of this set “relations”. Consequently, a relation $R \in \mathit{Rel}$ is a set of assignments.
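Definition 4 (Variable Independence). We say that relation $R \in \mathit{Rel}$ is independent of variable $x \in \mathit{Var}$, written as $R \perp x$, if and only if the following holds:
$$\forall a \in \mathit{Ass},\ v_1, v_2 \in \mathit{Val}:\ a[x \mapsto v_1] \in R \Leftrightarrow a[x \mapsto v_2] \in R$$
Consequently, if $R \perp x$, the value of $x$ in any assignment $a$ does not influence whether $a$ is in $R$. We say that $R$ depends on $x$ if $R \perp x$ does not hold.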
We transfer the central syntactic property of atomic formulas (they can only refer to finitely many variables) to its semantic counterpart.
Definition 5 (Predicates). A relation $R \in \mathit{Rel}$ is a predicate if it only depends on finitely many variables. We denote by $\mathit{Pred}$ the set of all predicates and by $\mathit{Pred}^x$ the subset of all predicates that are independent of $x$.
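Variable independence can be decided mechanically once the model is finite. The sketch below is a hedged illustration, not part of the formal development: it assumes three variable indices and two values, represents a relation as a `frozenset` of assignment tuples, and the names `independent`, `dependencies`, and `EQ01` are hypothetical. In such a finite model every relation depends on finitely many variables and is therefore a predicate.

```python
from itertools import product

# Hypothetical finite model: three variables (indices 0..2) and two values.
VARIABLES = range(3)
VALUES = range(2)
ASSIGNMENTS = list(product(VALUES, repeat=len(VARIABLES)))


def update(a, x, v):
    """Return the assignment that agrees with a except that x maps to v."""
    return a[:x] + (v,) + a[x + 1:]


def independent(R, x):
    """True iff membership in R never changes when only x is changed."""
    return all((update(a, x, v1) in R) == (update(a, x, v2) in R)
               for a in ASSIGNMENTS for v1 in VALUES for v2 in VALUES)


def dependencies(R):
    """Return the set of variables on which relation R depends."""
    return {x for x in VARIABLES if not independent(R, x)}


# Example relation: "variable 0 has the same value as variable 1".
EQ01 = frozenset(a for a in ASSIGNMENTS if a[0] == a[1])
```

Here `dependencies(EQ01)` yields `{0, 1}`: the relation is independent of variable 2, while the empty relation and the relation of all assignments depend on no variable at all.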
Now, we are ready to introduce the central entities of our paper. First, we give a definition of the abstract syntax of formulas.
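Definition 6 (Abstract Syntax of Formulas). We define $\mathit{Form}$ as the smallest set of abstract syntax trees in which every element is generated by an application of a rule of the following context-free grammar (where $P \in \mathit{Pred}$ denotes an arbitrary predicate and $x \in \mathit{Var}$ denotes an arbitrary variable):
$$F ::= P \mid \top \mid \bot \mid \neg F \mid F_1 \wedge F_2 \mid F_1 \vee F_2 \mid F_1 \Rightarrow F_2 \mid F_1 \Leftrightarrow F_2 \mid \forall x.\ F \mid \exists x.\ F$$
We call the elements of this set formulas.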
In this definition, the role of a classical atomic predicate $p(t_1, \ldots, t_m)$ with argument terms $t_1, \ldots, t_m$ in which $n$ variables $x_1, \ldots, x_n$ occur freely is abstracted to a predicate $P$ that depends on variables $x_1, \ldots, x_n$.
Now, we establish the relationship between the syntax and semantics of formulas.
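Definition 7 (Semantics of Formulas). Let $F \in \mathit{Form}$ be a formula. We define the relation $\llbracket F \rrbracket \in \mathit{Rel}$, called the semantics of $F$, by induction on the structure of $F$:
$$\begin{aligned}
\llbracket P \rrbracket &:= P \\
\llbracket \top \rrbracket &:= \mathit{Ass} \\
\llbracket \bot \rrbracket &:= \emptyset \\
\llbracket \neg F \rrbracket &:= \{a \in \mathit{Ass} \mid a \notin \llbracket F \rrbracket\} \\
\llbracket F_1 \wedge F_2 \rrbracket &:= \{a \in \mathit{Ass} \mid a \in \llbracket F_1 \rrbracket \wedge a \in \llbracket F_2 \rrbracket\} \\
\llbracket F_1 \vee F_2 \rrbracket &:= \{a \in \mathit{Ass} \mid a \in \llbracket F_1 \rrbracket \vee a \in \llbracket F_2 \rrbracket\} \\
\llbracket F_1 \Rightarrow F_2 \rrbracket &:= \{a \in \mathit{Ass} \mid a \in \llbracket F_1 \rrbracket \Rightarrow a \in \llbracket F_2 \rrbracket\} \\
\llbracket F_1 \Leftrightarrow F_2 \rrbracket &:= \{a \in \mathit{Ass} \mid a \in \llbracket F_1 \rrbracket \Leftrightarrow a \in \llbracket F_2 \rrbracket\} \\
\llbracket \forall x.\ F \rrbracket &:= \{a \in \mathit{Ass} \mid \forall v \in \mathit{Val}:\ a[x \mapsto v] \in \llbracket F \rrbracket\} \\
\llbracket \exists x.\ F \rrbracket &:= \{a \in \mathit{Ass} \mid \exists v \in \mathit{Val}:\ a[x \mapsto v] \in \llbracket F \rrbracket\}
\end{aligned}$$
The above definition is well-defined in that every formula denotes a relation. To show that formulas indeed denote predicates, some more work is required.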
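The inductive definition of the semantics translates directly into a recursive evaluator over a finite model. The following Python sketch is an illustration under stated assumptions: three variable indices and two values stand in for the sets of variables and values, formulas are encoded as nested tuples with hypothetical tags such as `"pred"`, `"imp"`, and `"forall"`, and the function name `sem` is likewise an assumption.

```python
from itertools import product

# Hypothetical finite model: three variables (indices 0..2) and two values.
VARIABLES = range(3)
VALUES = range(2)
ASSIGNMENTS = list(product(VALUES, repeat=len(VARIABLES)))

# Truth functions of the binary connectives (tags are illustrative names).
BINARY = {
    "and": lambda p, q: p and q,
    "or": lambda p, q: p or q,
    "imp": lambda p, q: (not p) or q,
    "iff": lambda p, q: p == q,
}


def update(a, x, v):
    """Return the assignment that agrees with a except that x maps to v."""
    return a[:x] + (v,) + a[x + 1:]


def sem(formula):
    """Return the set of assignments denoted by a formula (nested tuple)."""
    tag = formula[0]
    if tag == "pred":      # atomic formula: an opaque relation
        return frozenset(formula[1])
    if tag == "true":
        return frozenset(ASSIGNMENTS)
    if tag == "false":
        return frozenset()
    if tag == "not":
        inner = sem(formula[1])
        return frozenset(a for a in ASSIGNMENTS if a not in inner)
    if tag in BINARY:
        left, right = sem(formula[1]), sem(formula[2])
        return frozenset(a for a in ASSIGNMENTS
                         if BINARY[tag](a in left, a in right))
    if tag in ("forall", "exists"):
        x, inner = formula[1], sem(formula[2])
        quantify = all if tag == "forall" else any
        return frozenset(a for a in ASSIGNMENTS
                         if quantify(update(a, x, v) in inner for v in VALUES))
    raise ValueError(f"unknown formula tag: {tag!r}")


# Example predicate: "variable 0 has the same value as variable 1".
EQ01 = frozenset(a for a in ASSIGNMENTS if a[0] == a[1])
```

For instance, `sem(("exists", 0, ("pred", EQ01)))` is the set of all eight assignments, because variable 0 can always be set to the value of variable 1, whereas `sem(("forall", 0, ("pred", EQ01)))` is empty, because two distinct values cannot both equal the value of variable 1.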
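Proposition 2 (Quantified Formulas and Variable Independence). For every variable $x \in \mathit{Var}$ and formula $F \in \mathit{Form}$, we have $\llbracket \forall x.\ F \rrbracket \perp x$ and $\llbracket \exists x.\ F \rrbracket \perp x$, i.e., the semantics of quantified formulas does not depend on $x$.
Proof. We prove this proposition by reductio ad absurdum.
First, assume that $\llbracket \forall x.\ F \rrbracket$ depends on $x$. Then, we have some assignment $a$ and some values $v_1, v_2$ such that $a[x \mapsto v_1] \in \llbracket \forall x.\ F \rrbracket$ and $a[x \mapsto v_2] \notin \llbracket \forall x.\ F \rrbracket$. From $a[x \mapsto v_2] \notin \llbracket \forall x.\ F \rrbracket$, we have some $v$ with $a[x \mapsto v_2][x \mapsto v] \notin \llbracket F \rrbracket$ and thus $a[x \mapsto v] \notin \llbracket F \rrbracket$. However, $a[x \mapsto v_1] \in \llbracket \forall x.\ F \rrbracket$ implies $a[x \mapsto v_1][x \mapsto v] \in \llbracket F \rrbracket$ and thus $a[x \mapsto v] \in \llbracket F \rrbracket$, which represents a contradiction.
Now, assume that $\llbracket \exists x.\ F \rrbracket$ depends on $x$. Then, we have some assignment $a$ and some values $v_1, v_2$ such that $a[x \mapsto v_1] \in \llbracket \exists x.\ F \rrbracket$ and $a[x \mapsto v_2] \notin \llbracket \exists x.\ F \rrbracket$. From $a[x \mapsto v_1] \in \llbracket \exists x.\ F \rrbracket$, we have some $v$ with $a[x \mapsto v_1][x \mapsto v] \in \llbracket F \rrbracket$ and thus $a[x \mapsto v] \in \llbracket F \rrbracket$. However, $a[x \mapsto v_2] \notin \llbracket \exists x.\ F \rrbracket$ implies $a[x \mapsto v_2][x \mapsto v] \notin \llbracket F \rrbracket$ and thus $a[x \mapsto v] \notin \llbracket F \rrbracket$, which represents a contradiction. □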
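Proposition 3 (Formula Semantics and Predicates). For every formula $F \in \mathit{Form}$, we have $\llbracket F \rrbracket \in \mathit{Pred}$, i.e., the semantics of $F$ is a predicate.
Proof. The proof proceeds by induction over the structure of $F$.
If $F = P$, we have $\llbracket F \rrbracket = P \in \mathit{Pred}$.
If $F \in \{\top, \bot\}$, there are no $a, x, v_1, v_2$ such that $a[x \mapsto v_1] \in \llbracket F \rrbracket$ and $a[x \mapsto v_2] \notin \llbracket F \rrbracket$ because, for $F = \top$, the second condition must be false and, for $F = \bot$, the first one; thus, $\llbracket F \rrbracket$ does not depend on any variable.
If $F = \neg F_1$, by the induction hypothesis, we may assume that $\llbracket F_1 \rrbracket$ depends only on the variables in some finite variable set $X$. From the definition of $\llbracket F \rrbracket$, it is then easy to show that $\llbracket F \rrbracket$ also depends only on the variables in $X$.
If $F = F_1 \circ F_2$ for a binary connective $\circ \in \{\wedge, \vee, \Rightarrow, \Leftrightarrow\}$, we may assume by the induction hypothesis that $\llbracket F_1 \rrbracket$ depends only on the variables in some finite set $X_1$ while $\llbracket F_2 \rrbracket$ only depends on the variables in some finite set $X_2$. From the definition of $\llbracket F \rrbracket$, it is then easy to show that $\llbracket F \rrbracket$ depends only on the variables in the finite set $X_1 \cup X_2$.
If $F = \forall x.\ F_1$ or $F = \exists x.\ F_1$, we may assume by the induction hypothesis that $\llbracket F_1 \rrbracket$ only depends on the variables in some finite variable set $X$. We are now going to show that $\llbracket F \rrbracket$ only depends on the variables in the finite set $X \setminus \{x\}$. To this end, we assume that this is not the case and derive a contradiction. From this assumption and Proposition 2, we have a variable $y \notin X$ with $y \neq x$ on which $\llbracket F \rrbracket$ depends; thus, we have an assignment $a$ and values $v_1, v_2$ such that $a[y \mapsto v_1] \in \llbracket F \rrbracket$ and $a[y \mapsto v_2] \notin \llbracket F \rrbracket$.
If $F = \forall x.\ F_1$, from $a[y \mapsto v_2] \notin \llbracket F \rrbracket$, we have a $v$ with $a[y \mapsto v_2][x \mapsto v] \notin \llbracket F_1 \rrbracket$ and thus (since $x \neq y$) $a[x \mapsto v][y \mapsto v_2] \notin \llbracket F_1 \rrbracket$. From $a[y \mapsto v_1] \in \llbracket F \rrbracket$, we know $a[y \mapsto v_1][x \mapsto v] \in \llbracket F_1 \rrbracket$ and thus $a[x \mapsto v][y \mapsto v_1] \in \llbracket F_1 \rrbracket$. Thus, $\llbracket F_1 \rrbracket$ depends on a variable $y \notin X$, which contradicts the induction assumption.
If $F = \exists x.\ F_1$, from $a[y \mapsto v_1] \in \llbracket F \rrbracket$, we have a $v$ with $a[y \mapsto v_1][x \mapsto v] \in \llbracket F_1 \rrbracket$ and thus (since $x \neq y$) $a[x \mapsto v][y \mapsto v_1] \in \llbracket F_1 \rrbracket$. From $a[y \mapsto v_2] \notin \llbracket F \rrbracket$, we know $a[y \mapsto v_2][x \mapsto v] \notin \llbracket F_1 \rrbracket$ and thus $a[x \mapsto v][y \mapsto v_2] \notin \llbracket F_1 \rrbracket$. Thus, $\llbracket F_1 \rrbracket$ depends on a variable $y \notin X$, which contradicts the induction assumption.
This completes our proof. □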
In the following, we transfer the classical model-theoretic notions to our framework.
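Definition 8 (Satisfaction). Let $a \in \mathit{Ass}$ be an assignment and $F \in \mathit{Form}$ be a formula. We define $a \models F$ (read: $a$ satisfies $F$) as follows:
$$a \models F \ :\Leftrightarrow\ a \in \llbracket F \rrbracket$$
Definition 9 (Validity). Let $F \in \mathit{Form}$ be a formula. We define $\models F$ (read: $F$ is valid) as follows:
$$\models F \ :\Leftrightarrow\ \forall a \in \mathit{Ass}:\ a \models F$$
Definition 10 (Logical Consequence). Let $F, G \in \mathit{Form}$ be formulas. We define $F \models G$ (read: $G$ is a logical consequence of $F$) as follows:
$$F \models G \ :\Leftrightarrow\ \forall a \in \mathit{Ass}:\ a \models F \Rightarrow a \models G$$
Definition 11 (Logical Equivalence). Let $F, G \in \mathit{Form}$ be formulas. We define $F \equiv G$ (read: $F$ and $G$ are logically equivalent) as follows:
$$F \equiv G \ :\Leftrightarrow\ \forall a \in \mathit{Ass}:\ a \models F \Leftrightarrow a \models G$$
Proposition 4 (Logical Consequence and Logical Equivalence). Let $F, G \in \mathit{Form}$ be formulas. Then, we have the following equivalences:
$$F \models G \ \Leftrightarrow\ \models (F \Rightarrow G) \ \Leftrightarrow\ \llbracket F \rrbracket \subseteq \llbracket G \rrbracket$$
$$F \equiv G \ \Leftrightarrow\ \models (F \Leftrightarrow G) \ \Leftrightarrow\ \llbracket F \rrbracket = \llbracket G \rrbracket$$
Proof. Directly from the definitions. □
Thus, logical consequence on the meta-level coincides with implication on the formula level and with the subset relation on the semantic level. Furthermore, logical equivalence on the meta-level coincides with equivalence on the formula level and with the equality relation on the semantic level.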
In the following, we establish a set-theoretic interpretation of the logical operations of our formula language.
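Definition 12 (Complement). We define the complement $\overline{R}$ of relation $R \in \mathit{Rel}$ as the relation $\overline{R} := \{a \in \mathit{Ass} \mid a \notin R\}$. Consequently, an assignment is in $\overline{R}$ if and only if it is not in $R$.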
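Proposition 5 (Propositional Semantics as Set Operations). Let $F_1, F_2 \in \mathit{Form}$ be formulas. We then have the following equalities:
$$\begin{aligned}
\llbracket \top \rrbracket &= \mathit{Ass} \\
\llbracket \bot \rrbracket &= \emptyset \\
\llbracket \neg F_1 \rrbracket &= \overline{\llbracket F_1 \rrbracket} \\
\llbracket F_1 \wedge F_2 \rrbracket &= \llbracket F_1 \rrbracket \cap \llbracket F_2 \rrbracket \\
\llbracket F_1 \vee F_2 \rrbracket &= \llbracket F_1 \rrbracket \cup \llbracket F_2 \rrbracket \\
\llbracket F_1 \Rightarrow F_2 \rrbracket &= \overline{\llbracket F_1 \rrbracket} \cup \llbracket F_2 \rrbracket \\
\llbracket F_1 \Leftrightarrow F_2 \rrbracket &= (\overline{\llbracket F_1 \rrbracket} \cup \llbracket F_2 \rrbracket) \cap (\overline{\llbracket F_2 \rrbracket} \cup \llbracket F_1 \rrbracket)
\end{aligned}$$
Proof. Directly from the definition of the semantics. □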
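The correspondence between connectives and set operations, together with the link between valid implications and the subset relation, can be checked exhaustively on a small model. The sketch below is an illustration under stated assumptions: it uses only two variables and two values (four assignments, sixteen relations), and the names `pointwise` and `check_set_operations` are hypothetical.

```python
from itertools import combinations, product

# Hypothetical finite model: two variables and two values.
VALUES = range(2)
ASSIGNMENTS = list(product(VALUES, repeat=2))
ALL = frozenset(ASSIGNMENTS)

# All 2^4 = 16 relations over the four assignments.
RELATIONS = [frozenset(c) for k in range(len(ASSIGNMENTS) + 1)
             for c in combinations(ASSIGNMENTS, k)]


def pointwise(op, R1, R2):
    """Semantics of a binary connective, computed assignment by assignment."""
    return frozenset(a for a in ASSIGNMENTS if op(a in R1, a in R2))


def check_set_operations():
    """Compare pointwise semantics with set operations for all pairs."""
    pairs = 0
    for R1, R2 in product(RELATIONS, repeat=2):
        not_R1, not_R2 = ALL - R1, ALL - R2   # complements
        assert pointwise(lambda p, q: p and q, R1, R2) == R1 & R2
        assert pointwise(lambda p, q: p or q, R1, R2) == R1 | R2
        imp = pointwise(lambda p, q: (not p) or q, R1, R2)
        assert imp == not_R1 | R2
        iff = pointwise(lambda p, q: p == q, R1, R2)
        assert iff == (not_R1 | R2) & (not_R2 | R1)
        # The implication is valid exactly if R1 is a subset of R2.
        assert (imp == ALL) == (R1 <= R2)
        # The equivalence is valid exactly if both relations are equal.
        assert (iff == ALL) == (R1 == R2)
        pairs += 1
    return pairs
```

The function visits all 256 pairs of relations of this model and returns the number of pairs checked.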
While the above results are quite intuitive, a corresponding set-theoretic interpretation of quantified formulas is not. In the following, we only state the plain result without indicating how it can be intuitively understood; we defer this explanation to Section 4, where the categorical framework will provide us with adequate insight.
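Proposition 6 (Quantifier Semantics as Set Operations). Let $F \in \mathit{Form}$ be a formula. We then have the following equalities:
$$\llbracket \forall x.\ F \rrbracket = \bigcup\ \{P \in \mathit{Pred}^x \mid P \subseteq \llbracket F \rrbracket\}$$
$$\llbracket \exists x.\ F \rrbracket = \bigcap\ \{P \in \mathit{Pred}^x \mid \llbracket F \rrbracket \subseteq P\}$$
In other words, $\llbracket \forall x.\ F \rrbracket$ is the weakest predicate $P$ (“weakest” in the sense of the largest set) that is independent of $x$ and that satisfies the property $P \subseteq \llbracket F \rrbracket$, while $\llbracket \exists x.\ F \rrbracket$ is the strongest predicate $P$ (“strongest” in the sense of the smallest set) that is independent of $x$ and that satisfies the property $\llbracket F \rrbracket \subseteq P$.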
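Proof. The proof is in two stages. First, we take an arbitrary assignment $a \in \mathit{Ass}$ and show
$$a \in \llbracket \forall x.\ F \rrbracket \ \Leftrightarrow\ \exists P \in \mathit{Pred}:\ P \perp x \wedge P \subseteq \llbracket F \rrbracket \wedge a \in P$$
⇒: We assume $a \in \llbracket \forall x.\ F \rrbracket$ and prove for $P := \llbracket \forall x.\ F \rrbracket$
$$(1)\ P \in \mathit{Pred} \qquad (2)\ P \perp x \qquad (3)\ P \subseteq \llbracket F \rrbracket \qquad (4)\ a \in P$$
From Proposition 3, we have (1). From Proposition 2, we have (2). From $a \in \llbracket \forall x.\ F \rrbracket$, we have (4). To show (3), we take arbitrary assignment $a' \in \llbracket \forall x.\ F \rrbracket$ and show $a' \in \llbracket F \rrbracket$. From $a' \in \llbracket \forall x.\ F \rrbracket$, we know $a'[x \mapsto v] \in \llbracket F \rrbracket$ for $v := a'(x)$. Since $a' = a'[x \mapsto a'(x)]$, we thus know $a' \in \llbracket F \rrbracket$.
⇐: We assume for some $P \in \mathit{Pred}$
$$(5)\ P \perp x \qquad (6)\ P \subseteq \llbracket F \rrbracket \qquad (7)\ a \in P$$
and prove $a \in \llbracket \forall x.\ F \rrbracket$. For this, we take arbitrary $v \in \mathit{Val}$ and prove $a[x \mapsto v] \in \llbracket F \rrbracket$. From (6), it suffices to show $a[x \mapsto v] \in P$. From (5), we know
$$(8)\ \forall v_1, v_2 \in \mathit{Val}:\ a[x \mapsto v_1] \in P \Leftrightarrow a[x \mapsto v_2] \in P$$
From (7) and $a = a[x \mapsto a(x)]$, we know $a[x \mapsto v_1] \in P$ for $v_1 := a(x)$. Thus, with (8), we know $a[x \mapsto v] \in P$.
Now, we take arbitrary $a \in \mathit{Ass}$ and show
$$a \in \llbracket \exists x.\ F \rrbracket \ \Leftrightarrow\ \forall P \in \mathit{Pred}:\ P \perp x \wedge \llbracket F \rrbracket \subseteq P \Rightarrow a \in P$$
⇒: We assume $a \in \llbracket \exists x.\ F \rrbracket$ and take arbitrary but fixed $P \in \mathit{Pred}$ for which we assume
$$(9)\ P \perp x \qquad (10)\ \llbracket F \rrbracket \subseteq P$$
Our goal is to show $a \in P$. From $a \in \llbracket \exists x.\ F \rrbracket$, we know $a[x \mapsto v] \in \llbracket F \rrbracket$ for some $v \in \mathit{Val}$. From (10), we thus know $a[x \mapsto v] \in P$. From (9), we thus know $a[x \mapsto a(x)] \in P$. Since $a = a[x \mapsto a(x)]$, we thus know $a \in P$.
⇐: We assume
$$(11)\ \forall P \in \mathit{Pred}:\ P \perp x \wedge \llbracket F \rrbracket \subseteq P \Rightarrow a \in P$$
and prove $a \in \llbracket \exists x.\ F \rrbracket$. From (11) instantiated with $P := \llbracket \exists x.\ F \rrbracket$ and Propositions 3 and 2, it suffices to prove $\llbracket F \rrbracket \subseteq \llbracket \exists x.\ F \rrbracket$. Take arbitrary assignment $a' \in \llbracket F \rrbracket$. Since $a' = a'[x \mapsto a'(x)]$, we thus have $a'[x \mapsto v] \in \llbracket F \rrbracket$ for $v := a'(x)$ and thus $a' \in \llbracket \exists x.\ F \rrbracket$. □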
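The characterization of the quantifiers as a largest independent subset and a smallest independent superset can also be confirmed by brute force on a finite model. The following Python sketch is an illustration under stated assumptions: three variable indices and two values replace the sets of variables and values, every relation of this finite model counts as a predicate, and the names `forall`, `exists`, and `check_quantifier_characterization` are hypothetical.

```python
from itertools import combinations, product

# Hypothetical finite model: three variables (indices 0..2) and two values.
VARIABLES = range(3)
VALUES = range(2)
ASSIGNMENTS = list(product(VALUES, repeat=len(VARIABLES)))
ALL = frozenset(ASSIGNMENTS)

# All 2^8 = 256 relations; in a finite model each of them is a predicate.
RELATIONS = [frozenset(c) for k in range(len(ASSIGNMENTS) + 1)
             for c in combinations(ASSIGNMENTS, k)]


def update(a, x, v):
    """Return the assignment that agrees with a except that x maps to v."""
    return a[:x] + (v,) + a[x + 1:]


def independent(R, x):
    """True iff membership in R never changes when only x is changed."""
    return all((update(a, x, v1) in R) == (update(a, x, v2) in R)
               for a in ASSIGNMENTS for v1 in VALUES for v2 in VALUES)


def forall(x, R):
    """Pointwise semantics of universal quantification over x."""
    return frozenset(a for a in ASSIGNMENTS
                     if all(update(a, x, v) in R for v in VALUES))


def exists(x, R):
    """Pointwise semantics of existential quantification over x."""
    return frozenset(a for a in ASSIGNMENTS
                     if any(update(a, x, v) in R for v in VALUES))


def check_quantifier_characterization():
    """Check both equalities for every variable and every relation."""
    checked = 0
    for x in VARIABLES:
        indep = [P for P in RELATIONS if independent(P, x)]
        for R in RELATIONS:
            # Union of all x-independent predicates below R.
            weakest = frozenset().union(*(P for P in indep if P <= R))
            # Intersection of all x-independent predicates above R.
            strongest = ALL.intersection(*(P for P in indep if R <= P))
            assert forall(x, R) == weakest
            assert exists(x, R) == strongest
            checked += 1
    return checked
```

The check covers 3 × 256 = 768 combinations of a variable and a relation. Both families are never empty, since the empty relation and the relation of all assignments are independent of every variable.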
5. An Implementation of the Categorical Semantics
In this section, we describe how the constructions that we have theoretically modeled in Section 2 can actually be implemented. For this purpose, we use RISCAL, the RISC Algorithm Language [13,17] (developed at JKU, Linz, Austria, https://www3.risc.jku.at/research/formal/software/RISCAL/), a specification language with an associated software system for modeling mathematical theories and algorithms based on first-order logic and set theory. The language is based on a type system where all types have finite sizes (specified by the user); this allows formulas to be decided and the correctness of algorithms to be verified fully automatically for all possible inputs. To this end, the system translates every syntactic phrase into an executable form of its denotational semantics; the RISCAL model checker evaluates this semantics to determine the results of algorithms and the truth values of formulas such as the postconditions of algorithms. Since the domains of RISCAL models have (parameterized but) finite size, the validity of all theorems and the correctness of all algorithms can be fully automatically checked; the system has been mainly employed in educational scenarios [18,19].
Figure 2 gives a screenshot of the software with the RISCAL model that is going to be discussed below. Figure 3 and Figure 4 list a RISCAL model of the categorical semantics over a domain of $N+1$ variables (identified with the natural numbers $0, \ldots, N$) with $M+1$ values, for arbitrary model parameters $N, M \in \mathbb{N}$; all theorems over these domains are decidable and can be checked by RISCAL. The RISCAL definitions of domains, functions, and predicates closely correspond to those given in this paper; in particular, we have a domain Pred of predicates (since the number of variables is finite, by definition all relations are predicates) and predicate functions TRUE, FALSE, AND, OR, IMP, FORALL, EXISTS. Different from the categorical formulation, IMP is a binary function, not a family of unary functions; likewise, FORALL and EXISTS are binary functions whose first argument is a variable. Furthermore, we introduce functions NOT and EQUIV for the semantics of negation and equivalence and show by theorems Not and Equiv that they can be reduced to the other functions.
All other logical operations are first defined in their usual set-theoretic form. Subsequently, we describe their categorical semantics by a pair of theorems: the first theorem claims that the set-theoretic semantics is equivalent to an implicit definition of the categorical semantics, while the second theorem claims equivalence to the corresponding constructive definition. Choosing the small parameter values $N = 2$ and $M = 1$ (i.e., relations with variables $0, 1, 2$ and values $0, 1$), RISCAL can easily check the validity of all claims, as demonstrated by the following output:
RISC Algorithm Language 2.6.4 (10 December 2018)
http://www.risc.jku.at/research/formal/software/RISCAL
(C) 2016-, Research Institute for Symbolic Computation (RISC)
This is free software distributed under the terms of the GNU GPL.
Execute "RISCAL -h" to see the available command line options.
-----------------------------------------------------------------
Reading file /usr2/schreine/papers/CategoricalLogic2019/catlogic.txt
Using N=2.
Using M=1.
Computing the value of Ass...
Computing the value of TRUE...
Computing the value of FALSE...
Type checking and translation completed.
Executing True1().
Execution completed (3 ms).
Executing True2().
Execution completed (1 ms).
Executing False1().
Execution completed (0 ms).
Executing False2().
Execution completed (1 ms).
Executing And1(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
...
Execution completed for ALL inputs (18373 ms, 65536 checked, 0 inadmissible).
Executing And2(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
46273 inputs (36446 checked, 0 inadmissible, 0 ignored, 9827 open)...
Execution completed for ALL inputs (3576 ms, 65536 checked, 0 inadmissible).
Executing Or1(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
...
Execution completed for ALL inputs (26889 ms, 65536 checked, 0 inadmissible).
Executing Or2(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
42676 inputs (32887 checked, 0 inadmissible, 0 ignored, 9789 open)...
Execution completed for ALL inputs (3907 ms, 65536 checked, 0 inadmissible).
Executing Imp1(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
...
Execution completed for ALL inputs (48592 ms, 65536 checked, 0 inadmissible).
Executing Imp2(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
...
Execution completed for ALL inputs (9462 ms, 65536 checked, 0 inadmissible).
Executing Not(Set[Array[ℤ]]) with all 256 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (28 ms, 256 checked, 0 inadmissible).
Executing Equiv(Set[Array[ℤ]],Set[Array[ℤ]]) with all 65536 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (354 ms, 65536 checked, 0 inadmissible).
Executing Forall1(ℤ,Set[Array[ℤ]]) with all 768 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (1315 ms, 768 checked, 0 inadmissible).
Executing Forall2(ℤ,Set[Array[ℤ]]) with all 768 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (512 ms, 768 checked, 0 inadmissible).
Executing Exists1(ℤ,Set[Array[ℤ]]) with all 768 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (1299 ms, 768 checked, 0 inadmissible).
Executing Exists2(ℤ,Set[Array[ℤ]]) with all 768 inputs.
PARALLEL execution with 4 threads (output disabled).
Execution completed for ALL inputs (461 ms, 768 checked, 0 inadmissible).
These values are, however, the largest ones with which model checking is realistically feasible; choosing even slightly larger parameter values gives, for checking theorem And1, a number of possible inputs whose checking on a single processor core would take RISCAL more than two decades.
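The growth of the input space can be estimated with a few lines of arithmetic. The following Python sketch is a back-of-the-envelope illustration, assuming (consistently with the input counts in the output above) that the model has N+1 variables and M+1 values, that a relation is an arbitrary set of assignments, that a theorem over two predicates is checked for all pairs of relations, and that a quantifier theorem is checked for all pairs of a variable and a relation. The function name `input_counts` and the dictionary keys are hypothetical.

```python
def input_counts(n, m):
    """Count the inputs of the checked theorems for model parameters N, M."""
    variables = n + 1                      # variables 0, ..., N
    values = m + 1                         # values 0, ..., M
    assignments = values ** variables      # maps from variables to values
    relations = 2 ** assignments           # arbitrary sets of assignments
    return {
        "assignments": assignments,
        "relations": relations,
        "binary_theorem_inputs": relations ** 2,
        "quantifier_theorem_inputs": variables * relations,
    }
```

For N = 2 and M = 1, this gives 8 assignments, 256 relations, 65536 inputs for the theorems over two predicates, and 768 inputs for the quantifier theorems, in agreement with the numbers reported by RISCAL. Since the number of relations is doubly exponential in the number of variables, each additional variable or value enlarges the input space dramatically.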