Applied Cat-Theory
WHAT IS APPLIED
CATEGORY THEORY?
DEPARTMENT OF MATHEMATICS
CUNY GRADUATE CENTER
NEW YORK, NEW YORK
tbradley@gradcenter.cuny.edu
what is applied category theory? 2
For those thinking thought #1, I hope to convince you that the answer is No way! It’s true that category
theory sometimes goes by the name of general abstract nonsense, which might incline you to think that cate-
gory theory is too pie-in-the-sky to have any impact on the “real world.” My hope is that these notes will
convince you that that’s far from the truth!
For those thinking thought #2, yes it’s true that ideas and results from category theory have found ap-
plications in computer science and quantum physics (not to mention pure mathematics itself), but these
are not the only applications to which the word applied in applied category theory is being applied. So what is
applied category theory?
Read on.
• I’ll make heavy use of hyperlinks, as I have already, and I’ll also incorporate the occasional use of color throughout the text. For these reasons, it’s probably best to read this PDF on a computer rather than in print form.
  Margin note: – What is a Category? – What is a Functor? – What is a Natural Transformation? At the first link, you’ll find a list of other recommended resources for learning about category theory.
• Finally, a fair warning: I use italics a lot (along with frequent par-
enthetical remarks). I also like exclamation points! And many of
my sentences begin with a conjunction.
My gratitude goes to the participants and mentors of the 2018 ACT Workshop from whom I learned a great deal. I also thank John
Baez, Joseph Hirsh, and Maximilien Péroux for providing valuable feedback on a first draft of these notes.
Introduction
One of the great features of category theory, birthed in the 1940s, is that its organizing principles have
been used to reshape and reformulate problems within pure mathematics, including topology, homotopy
theory and algebraic geometry. Category theory has shed light on those problems, making them easier to solve
and opening doors for new avenues of research. Historically, then, category theory has found immense
application within mathematics. As John Baez recently noted, “[category theory] was meant to be applied.”
More recently, however, category theory has found applications in a wide range of disciplines outside
of pure mathematics—even beyond the closely related fields of computer science and quantum physics.
These disciplines include chemistry, neuroscience, systems biology, natural language processing, causal-
ity, network theory, dynamical systems, and database theory to name a few. And what do they all have in
common? That’s precisely what applied category theory seeks to discover. In other words, the techniques,
tools, and ideas of category theory are being used to identify recurring themes across these various dis-
ciplines with the purpose of making them a little more formal. This is what the phrase applied category
theory (ACT) is meant to describe. As explained on the ACT 2018 workshop webpage,
...we should treat the use of categorical concepts as a natural part of transferring and integrating knowledge
across disciplines. The restructuring employed in applied category theory cuts through jargon, helping to elu-
cidate common themes across disciplines. Indeed, the drive for a common language and comparison of similar
structures in algebra and topology is what led to the development of category theory in the first place, and recent
hints show that this approach is not only useful between mathematical disciplines, but between scientific ones as
well.
Of course, one of the challenges of using category theory to transfer and integrate knowledge across
disciplines is making category theory itself accessible to the broader scientific audience. John Baez and
Brendan Fong address this very point in their 2016 paper on electrical circuit diagrams [1]:
Although their comments refer to a particular project, they can apply to the field at large, too.
The goal of this document is to give a taste of applied category theory from a graduate student’s perspective.
In doing so, I’ll share two themes and two constructions that appeared frequently during the ACT 2018
workshop. The math underlying these themes and constructions is not new. The newness, rather, is in how
they are being applied. To illustrate the themes and constructions, I’ll also share two examples—two re-
search projects in the field of ACT. The first project relates to chemistry and the second to natural language
processing, though the expositions are weighted unevenly. I’ll devote considerably more time to the second
example since that’s where my own research interests lie. And that’s what’s on the carte du jour! Two
themes and two constructions and two examples, along with a few crumbs (i.e. digressions) in between.
Here’s the menu in more detail:
Contents
1 Two Themes
  1.1 Functorial Semantics
  1.2 Compositionality
  1.3 Further Reading
2 Two Constructions
  2.1 Monoidal Categories
  2.2 Decorated Cospans
  2.3 Further Reading
3 Two Examples
  3.1 Chemical Reaction Networks
  3.2 Natural Language Processing
  3.3 Further Reading
4 But Wait! There’s More...
Margin note: Although the items are listed linearly, they are very much intertwined. The themes motivate the constructions; the constructions embody the themes, and both the themes and the constructions come to life in the examples.
1 Two Themes
Two themes that appear over and over (and over and over and over) in
applied category theory are functorial semantics and compositional-
ity. Let’s talk about the first one first.
$$\text{syntax} \to \text{semantics}$$
To get a better idea of syntax vs. semantics, think of the English language where two important features of communication are 1) grammar, which provides rules for combining words to form sentences, and 2) the actual meaning conveyed by those words and sentences. Grammar is the syntax, and the meaning is the semantics.
Margin note: I’m using the English language as an analogy to illustrate syntax vs. semantics, but it’s more than an analogy! As we’ll see in Section 3.2, the pairings “grammar $\rightsquigarrow$ syntax”, “meanings of words $\rightsquigarrow$ semantics”
A small-ish digression...
Even though the idea goes by the fancy name of functorial semantics, it
is not just a “category theory thing.” Mind if I digress for a while to elaborate on this?
Margin note: I’ll take your silence as a No.
If you know a little bit about groups, then you’ve seen functorial semantics in action before! How so? A group is a set endowed with some extra structure, though that tells us nothing about why groups are useful. It’s better to think of a group as encoding for some kind of action or transformation. And this is why group representations are so great! A group representation provides a way to view your abstract group elements as concrete linear transformations of some vector space. Explicitly, given a vector space $V$, a group representation is a group homomorphism from $G$ to $\mathrm{Aut}(V)$, the group of all automorphisms of $V$
$$G \to \mathrm{Aut}(V)$$
It assigns to each group element a linear isomorphism $V \to V$.
Margin note: Group elements are like verbs. They DO stuff! For more on this notion from a categorical perspective, check out the article Group Elements, Categorically on Math3ma.
Margin note: If we replace $\mathrm{Aut}(V)$ by $\mathrm{Aut}(X)$ for some set $X$ (i.e. the group of automorphisms, i.e. bijections, on $X$), then a group homomorphism $G \to \mathrm{Aut}(X)$ is precisely a group action on $X$.
As a quick example, suppose our group is $D_3$, the dihedral group of order 6, which is the group of symmetries of an equilateral triangle. If we were to look at a presentation of the group,
$$D_3 = \langle r, s \mid r^3 = s^2 = rsrs = 1 \rangle$$
$$\text{syntax} \to \text{semantics}$$
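To make the syntax-to-semantics idea concrete, here is a minimal Python sketch of one representation of the dihedral group of order 6. It assumes a particular choice that the text does not fix: the two generators act on a three-dimensional space by permuting basis vectors, as if the triangle's vertices were labelled 1, 2, 3. The names `GENERATORS`, `matmul`, and `interpret` are hypothetical and exist only for this illustration.

```python
# Illustrative sketch: D3 represented by 3x3 permutation matrices.
# The two matrices below are an assumed choice, not one fixed by the text.

IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

GENERATORS = {
    "r": [[0, 0, 1], [1, 0, 0], [0, 1, 0]],  # rotation: e1 -> e2 -> e3 -> e1
    "s": [[1, 0, 0], [0, 0, 1], [0, 1, 0]],  # reflection: swaps e2 and e3
}

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def interpret(word):
    """Send a word in the letters r, s (syntax) to a matrix (semantics)."""
    result = IDENTITY
    for letter in word:
        result = matmul(result, GENERATORS[letter])
    return result

# The relations of the presentation hold on the semantic side.
relations_hold = all(interpret(w) == IDENTITY for w in ("rrr", "ss", "rsrs"))

# Concatenating words corresponds to multiplying matrices.
is_homomorphism = interpret("rs" + "rr") == matmul(interpret("rs"), interpret("rr"))

# Six words, six distinct matrices: one for each element of the group.
images = {tuple(map(tuple, interpret(w))) for w in ("", "r", "rr", "s", "rs", "rrs")}
print(relations_hold, is_homomorphism, len(images))
```

The words in `r` and `s` play the role of syntax, and the matrices they are sent to play the role of semantics. Any other faithful choice of matrices satisfying the same relations would serve equally well.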
In fact... it’s not like that. It IS that. If we view both the groups $G$ and $\mathrm{Aut}(V)$ as one-object [2] categories, then a group representation
$$G \to \mathrm{Aut}(V)$$
IS a functor from syntax to semantics. That’s because every group homomorphism is a functor when the groups are viewed as one-object categories! So although functorial semantics has the word “functor” in it, don’t think that the idea behind it is unique to category theory. Indeed, representation theory capitalizes on the relationship between syntax and semantics: a representation assigns to an abstract algebraic gadget (the syntax) some concrete meaning (the semantics).
[2] Every group $G$ gives rise to a category having a single object $\bullet$ (the group itself) and a morphism $\bullet \xrightarrow{g} \bullet$ for each group element $g \in G$. Composition is given by the group operation.
Margin note: Here’s another example I can’t resist sharing: operads! If you’re not familiar with operads, just know that this is a souped-up version of the group theory example. If you are familiar with operads, then you know this is the souped-up version of the group theory example. An operad is an example of syntax, while an algebra over that operad provides the semantics. For example, given a vector space $V$, an operad homomorphism from the [commutative, associative, Lie, Poisson,...] operad to the endomorphism operad on $V$ IS a [commutative, associative, Lie, Poisson,...]-algebra! That is, the structure-preserving homomorphism provides an interpretation of each abstract $n$-ary operation as an actual, concrete operation $V^{\otimes n} \to V$.

I could end our digression here, but I’d like to share one more instance of functorial semantics at work in pure mathematics. The next few examples involve monoids and monoidal categories, so I’ll assume you are familiar with those words. If you are not familiar with those words, don’t fret, you’re in luck! Section 2.1 is all about monoids and monoidal categories, so feel free to read that section first then come back here. In either case, let’s proceed with another neat example of functorial semantics in action:
[3] A functor $F : C \to D$ between monoidal categories is called lax monoidal if for every pair of objects $c, c'$ in $C$ there is a morphism
$$Fc \otimes Fc' \to F(c \otimes c')$$
(which assembles into a natural transformation). It’s called strong monoidal if $Fc \otimes Fc' \cong F(c \otimes c')$, and it’s called strict monoidal if $Fc \otimes Fc' = F(c \otimes c')$.
Here I’m viewing both 1 and Set as monoidal categories. The symbol 1 is meant to represent the category with one object and only one morphism (the identity), which we can view as a monoidal category $(1, \otimes, 1)$ in exactly one way. The category of sets has a monoidal structure given by the Cartesian product with the set containing one element, denoted $\{*\}$, as the monoidal unit. Technically then,
$$(1, \otimes, 1) \to (C, \otimes, 1)$$
$$(1, \otimes, 1) \to (\mathsf{Top}, \times, *)$$
In Section 3.1 we’ll see how the behavior of a chemical reaction net-
work is modeled by a functor
1.2 Compositionality
Compositionality, also known as the principle of compositionality, also known as Frege’s principle (Frege as in Gottlob Frege), is the idea that the meaning of a complex expression is determined by
1. the meanings of its constituent parts, and
In Section 3.2, our example from natural language, the complex expression will be a sentence; its constituent parts are the words that comprise the sentence.
In both examples, functorial semantics and the principle of compositionality will go hand-in-hand. The former prompts us to model behavior using a functor between syntax and semantics categories. The latter encourages us to take things one at a time: To model a huge system, compositionality tells us, it’s enough to model smaller pieces of it and then stick those pieces together. Simple enough. But what does it mean to “stick pieces together” mathematically? The answer is provided by the structure of a monoidal category. And that is the first of our two main constructions in ACT.
Margin note: Matrix factorization provides another illustration of compositionality in mathematics. As an example, every $n \times m$ matrix $M$ has a singular value decomposition, which means it can be written as a product of three matrices $M = U D V^\dagger$ where $U$ and $V$ are unitary square matrices (here $V^\dagger$ denotes the conjugate transpose of $V$) and $D$ is a rectangular diagonal matrix. Intuitively then, the linear transformation $M$ can be broken down into a rotation followed by a scaling followed by another rotation. So you can analyze your transformation (or your data set, if that’s what $M$ is encoding) by understanding its constituent pieces (the factors) and how they compose together. More generally, I like to think that tensor networks are a good example of compositionality, but such a discussion might take us too far off course. Perhaps another day!
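As a small illustration of the matrix factorization remark, here is a Python sketch that reassembles one particular 2 × 2 real matrix from three factors. The matrix and its factors are an assumed example worked out by hand for this note (they do not come from the text), and since everything is real, the conjugate transpose is just the transpose. The helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

s2, s5, s10 = math.sqrt(2), math.sqrt(5), math.sqrt(10)

# Hand-computed factors of the assumed example matrix [[3, 0], [4, 5]].
U = [[1 / s10, 3 / s10], [3 / s10, -1 / s10]]   # orthogonal
D = [[3 * s5, 0.0], [0.0, s5]]                  # diagonal: the singular values
V = [[1 / s2, 1 / s2], [1 / s2, -1 / s2]]       # orthogonal

# Composing the three pieces recovers the original transformation.
M = matmul(matmul(U, D), transpose(V))

# Acting on a vector one factor at a time agrees with acting by M at once.
x = [[1.0], [2.0]]
stepwise = matmul(U, matmul(D, matmul(transpose(V), x)))
at_once = matmul(M, x)
print(M, stepwise, at_once)
```

The point is only that the behavior of the whole is determined by the behavior of the pieces and the way they are composed. Note that the orthogonal factor `U` here happens to have determinant −1, so strictly speaking it is a reflection rather than a rotation.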
• Take a look at (this small notice on) William Lawvere’s 1963 PhD
thesis “Functorial Semantics of Algebraic Theories” for the formal
foundations for functorial semantics.
2 Two Constructions
Two constructions that appear over and over (and over and over and
over) in (some projects in) applied category theory are monoidal
categories and decorated cospans. Let’s talk about the first one first.
and butter of many applied category theorists. One reason for this
is that monoidal categories provide a good setting in which to view
morphisms $\to$ as physical processes and objects $A, B, \ldots$ as states. As a
non-technical example, let’s suppose $A$ is a bunch of lemon meringue
pie ingredients while $B$ is a fully-assembled-yet-unbaked lemon
meringue pie. We might view a morphism $A \to B$ as the process of
mixing the raw ingredients together and then pouring the resulting
concoction into a pre-baked crust.
As it turns out, this pie example isn’t so silly after all. It’s one of the
motivating examples that Brendan Fong and David Spivak use in
their excellent book Seven Sketches in Compositionality: An Invitation
to Applied Category Theory to illustrate both the ubiquity and the sim-
plicity of monoidal categories. (If you haven’t read Seven Sketches yet,
you really must.) Below is a copy of their lemon meringue pie dia-
gram, where I’ve drawn our A and B on the left as input and right as
output.
Now that we’ve zoomed in, we can see that our process
is actually made up of a bunch of other processes! This isn’t too
surprising as there are several steps that go into preparing a lemon
pie: separating the eggs, making the lemon filling, filling the crust, and
so on. Fong and Spivak’s diagram illustrates just how those
individual steps combine to form the single process prepare lemon
meringue pie. What’s neat is that we can describe these steps using the
language of monoidal categories! We’ll go into more detail later in
this section, but here’s a quick preview:
I’ll explain.
That’s it.
$$\eta_V : \mathbb{R} \to V \otimes V^* \qquad \epsilon_V : V^* \otimes V \to \mathbb{R}$$
In fact, there’s a nice fact from linear algebra, namely that once we fix a basis $\{e_1, \ldots, e_n\}$ for $V$ then there is an isomorphism $V \cong V^*$. So let’s fix that basis (the standard one) and write $V$ instead of $V^*$. Also, the subscript is a little cumbersome, so let’s drop it for now. So we have two maps
$$\eta : \mathbb{R} \to V \otimes V \qquad \epsilon : V \otimes V \to \mathbb{R} \tag{1}$$
Margin note: We’ll need $\eta$ and $\epsilon$ for a computation in Section 3.2, so it’s good to see what they look like explicitly.
The map $\eta$ is called the unit [7], and it assigns to every real number a vector in $V \otimes V$, namely:
$$\eta(1) = \sum_{i=1}^{n} e_i \otimes e_i \quad \text{(and extend linearly)}$$
[7] Note: this unit is not to be confused with “monoidal unit”!
$$\epsilon\Big(\sum_{i,j=1}^{n} c_{ij}\, v_i \otimes w_j\Big) = \sum_{i,j=1}^{n} c_{ij}\,(v_i \cdot w_j) \quad \text{where } \cdot \text{ is the inner product}$$
The $\epsilon$ map just extends this linearly. That is, if we now have any vector $\sum_{ij} c_{ij}\, v_i \otimes w_j$ in $V \otimes V$, then $\epsilon : V \otimes V \to \mathbb{R}$ is given by
$$\epsilon\Big(\sum_{ij} c_{ij}\, v_i \otimes w_j\Big) = \sum_{ij} c_{ij}\,(v_i \cdot w_j)$$
as above.
Finally, the unit $\eta$ and counit $\epsilon$ interact nicely with each other because they satisfy some equations called the yanking equations, which I’ll explain shortly. The bottom line is that all the above (the maps $\eta$ and $\epsilon$ and the equations they satisfy) makes $V^*$ into a bona fide dual for $V$. The upshot is that compact closed categories generalize these notions.
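Here is a minimal Python sketch of the unit and counit for real n-dimensional space with its standard basis, together with a check of one yanking equation on a sample vector. It assumes a deliberately naive encoding that is not in the text: a vector in the tensor product is stored as a list of `(coefficient, left, right)` terms. All function names are hypothetical.

```python
def dot(v, w):
    """Standard inner product."""
    return sum(a * b for a, b in zip(v, w))

def basis(n):
    """Standard basis e_1, ..., e_n as lists of floats."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def eta(scalar, n):
    """Unit: scalar * sum_i (e_i tensor e_i), stored as (coeff, left, right) terms."""
    return [(scalar, e, e) for e in basis(n)]

def epsilon(terms):
    """Counit: sum of coeff * (left . right) over the terms of a tensor."""
    return sum(c * dot(v, w) for c, v, w in terms)

def yank(v):
    """Apply the unit next to v, then the counit to v and the first new factor.

    v  |->  v tensor sum_i (e_i tensor e_i)  |->  sum_i (v . e_i) e_i
    """
    out = [0.0] * len(v)
    for c, left, right in eta(1.0, len(v)):
        coeff = epsilon([(c, v, left)])
        out = [o + coeff * r for o, r in zip(out, right)]
    return out

v = [2.0, -1.0, 5.0]
print(yank(v))                 # the yanking equation says this is v again
print(epsilon(eta(1.0, 3)))    # counit after unit gives the dimension of the space
```

Applying the counit directly to the unit's output returns the dimension of the space, which is a handy sanity check when you write these maps down by hand.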
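$$\eta_c^r : 1 \to c^r \otimes c \qquad \epsilon_c^r : c \otimes c^r \to 1$$
that satisfy the “yanking (or snake) equations”
$$(\mathrm{id}_c \otimes \epsilon^l) \circ (\eta^l \otimes \mathrm{id}_c) = \mathrm{id}_c \qquad (\epsilon^r \otimes \mathrm{id}_c) \circ (\mathrm{id}_c \otimes \eta^r) = \mathrm{id}_c$$
$$(\epsilon^l \otimes \mathrm{id}_{c^l}) \circ (\mathrm{id}_{c^l} \otimes \eta^l) = \mathrm{id}_{c^l} \qquad (\mathrm{id}_{c^r} \otimes \epsilon^r) \circ (\eta^r \otimes \mathrm{id}_{c^r}) = \mathrm{id}_{c^r} \tag{2}$$
Margin note: Here $\mathrm{id}_c$ denotes the identity morphism $\mathrm{id}_c : c \to c$.
objects are vector spaces $V, W, \ldots$. In this case, the left and right dual of a space $V$ is its vector space dual
$$V^* = V^r = V^l$$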
We’ll get to the yanking equations shortly, but first: If this document
has been your first introduction into string diagrams, then here is
THE KEY thing to know:
In category theory, we often draw an object as a dot $\bullet$ and a morphism as an arrow $\bullet \to \circ$. To draw a string diagram, just do the opposite! (This goes back to Poincaré duality in topology.) To draw a string diagram, draw an object as an arrow and a morphism as a dot or, even better, a box.
Margin note: The lemon pie diagram that we saw on page 15 is an example of a string diagram!
which suggests that the unit is “invisible.” But I like to draw it any-
way, shaded:
Again, to simplify the notation we’ll use the fact that for vector spaces, $V^* = V^r = V^l$. I’ll also drop the subscripts to keep things clean.
Graphically, the $\eta$s and $\epsilon$s are drawn as below. The reason we have two versions of each map is that the “information flow” can either flow up or it can flow down.
And since FVect is a symmetric monoidal category, and since the left and right duals are both $V^*$, there is really only one unit and one counit for vector spaces. [10]
[10] Remember, the sentence “FVect is symmetric monoidal” means there is an isomorphism $V \otimes W \cong W \otimes V$ for every pair of vector spaces $V$ and $W$. In string diagram calculus, this means that the order in which we draw our arrows doesn’t matter:
$$\eta = \eta^r = \eta^l \quad \text{and} \quad \epsilon = \epsilon^r = \epsilon^l,$$
and these are precisely the $\eta$ and $\epsilon$ defined on page 17! So in this example, the four yanking equations of (2) reduce down to just two:
Margin note: For fun, verify that the unit and counit maps on page 17 do indeed satisfy these two equations.
After yanking the strings taut, you’ll notice that information flows
rightwards in the first equation, while it flows leftwards in the second
equation.
By the way, a key feature of FVect (and more generally, all symmetric compact closed categories) is that processes, i.e. morphisms, $V \to W$ are in bijection with states $\mathbb{R} \to W \otimes V^* \cong V^* \otimes W$, which is the special name given to morphisms whose domain is the monoidal unit. This bijection is sometimes called process-state duality, and in the context of FVect, it means we can view linear maps as vectors in a tensor product [11] and vice versa!
[11] While a linear map $\mathbb{R} \to V^* \otimes W$ is not itself a vector in $V^* \otimes W$, it can be identified with one, namely with the image of $1$ in $\mathbb{R}$! More generally, for any finite-dimensional vector space $A$ over $\mathbb{R}$, you can always think of $\hom(\mathbb{R}, A)$ as $A$ itself, at least at the set level. That’s because the forgetful functor $U : \mathsf{FVect} \to \mathsf{Set}$ is representable with representing object $\mathbb{R}$. In other words, linear maps $\mathbb{R} \to A$ are in one-to-one correspondence with the vectors in $A$, viewed as elements of its underlying set,
$$\hom(\mathbb{R}, A) \cong UA$$
This is completely analogous to how functions $\{*\} \to X$ from the one-point set to a set $X$ are in one-to-one correspondence with the elements in $X$,
$$\hom(\{*\}, X) \cong X$$
and is another manifestation of the “probing” idea we saw in the margin on page 9.
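Process-state duality can be seen quite literally once bases are fixed. The Python sketch below is one assumed encoding (not taken from the text): a linear map is a matrix given as a list of rows, and the corresponding state is the table of coefficients of the basis tensors, stored as a dictionary keyed by index pairs. The function names are hypothetical.

```python
def to_state(matrix):
    """Process -> state: record the coefficient of each basis tensor (w_i tensor v_j*)."""
    return {(i, j): a
            for i, row in enumerate(matrix)
            for j, a in enumerate(row) if a != 0}

def to_process(state, dim_w, dim_v):
    """State -> process: rebuild the matrix of the linear map V -> W."""
    matrix = [[0] * dim_v for _ in range(dim_w)]
    for (i, j), coeff in state.items():
        matrix[i][j] = coeff
    return matrix

def apply(matrix, v):
    """Apply the linear map to a vector in the usual way."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def contract(state, v, dim_w):
    """Feed v into the dual slot of the state; this recovers the map's action."""
    out = [0] * dim_w
    for (i, j), coeff in state.items():
        out[i] += coeff * v[j]
    return out

A = [[1, 2, 0], [0, -1, 3]]        # an assumed example map from R^3 to R^2
state = to_state(A)
v = [1, 1, 1]
print(apply(A, v), contract(state, v, 2), to_process(state, 2, 3) == A)
```

Nothing deep happens in the code, and that is the point: in coordinates, the matrix of a map and the coefficients of its state are the same numbers read two ways.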
$$\eta \rightsquigarrow \text{“unit”} \qquad \epsilon \rightsquigarrow \text{“counit”}$$
Is it a coincidence that these two words are also used in the definition of an adjunction? NOPE. They are closely related. Specifically, the data $V$, $V^*$, $\eta$, and $\epsilon$ together with the yanking equations are an instance of a categorical adjunction! I think this is a neat fact, [12] so let’s take yet another digression. Happily, it will tie in quite nicely with our discussion on string diagrams. We’ll begin by recalling the definition of an adjunction.
[12] which appears on the first page of “Coherence for Compact Closed Categories” by Kelly and Laplaza.
Definition 2.2. An adjunction between categories $C$ and $D$ is a pair of functors
$$L : C \rightleftarrows D : R$$
and a pair of natural transformations
$$\eta : \mathrm{id}_C \Longrightarrow RL \qquad \epsilon : LR \Longrightarrow \mathrm{id}_D$$
called the unit and counit respectively, such that these two triangles commute:
$$L \xrightarrow{\;L \circ \eta\;} LRL \xrightarrow{\;\epsilon \circ L\;} L \qquad\qquad R \xrightarrow{\;\eta \circ R\;} RLR \xrightarrow{\;R \circ \epsilon\;} R$$
with composites $\mathrm{id}_L$ and $\mathrm{id}_R$ respectively.
The adjunction is denoted $L \dashv R$, and $L$ is said to be left adjoint to $R$ while $R$ is said to be right adjoint to $L$.
Margin note: Equivalently, $L$ and $R$ form an adjunction if for all objects $c \in C$, $d \in D$ there is an isomorphism
$$\hom_D(Lc, d) \xleftrightarrow{\;\cong\;} \hom_C(c, Rd)$$
that’s natural in both $c$ and $d$.
Margin note: There, $\mathrm{id}_C$ denotes the identity functor on $C$. It assigns each object and morphism in $C$ to itself.
Margin note: Here, $L \circ \eta$ denotes the natural transformation whose components are of the form $L\eta_c : Lc \to LRLc$, while $\epsilon \circ L$ is the natural transformation with components $\epsilon_{Lc} : LRLc \to Lc$. A similar story holds for $\eta \circ R$ and $R \circ \epsilon$. (As per the margin comment on page 7, I’d prefer to omit the composition symbol $\circ$, but I’m writing it now for good reason, as we’ll soon see!)
Why are (3) and (4) so similar? What’s going on here? Is there a sense
in which a vector space and its dual form an adjunction?
Is $V \dashv V^*$ a thing?
Yes!
iii. some arrows between the arrows (natural transformations)
and natural transformations
$$\eta : \mathrm{id}_C \Longrightarrow RL \qquad \epsilon : LR \Longrightarrow \mathrm{id}_D$$
You’ll notice that the objects and the arrows themselves form a cate-
gory, namely Cat, the category of all categories. The objects of Cat are
categories and the morphisms are functors.
Nice.
So what do we do?
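$$l \xrightarrow{\;l \circ \eta\;} l \circ r \circ l \xrightarrow{\;\epsilon \circ l\;} l \qquad\qquad r \xrightarrow{\;\eta \circ r\;} r \circ l \circ r \xrightarrow{\;r \circ \epsilon\;} r$$
with composites $\mathrm{id}_l$ and $\mathrm{id}_r$ respectively, i.e. such that the following equations hold [13]
$$(\epsilon \circ l) \circ (l \circ \eta) = \mathrm{id}_l \quad \text{and} \quad (r \circ \epsilon) \circ (\eta \circ r) = \mathrm{id}_r$$
where $l \circ \eta := \mathrm{id}_l \circ \eta$, and similarly for $\epsilon \circ l$ and so on. We’ll say $l$ is a left adjoint of $r$, and $r$ is a right adjoint of $l$, and we’ll denote the adjunction by $l \dashv r$.
[13] (Diagram: 2-cells $\eta$ and $\epsilon$ on the left; their composite $\epsilon \diamond \eta$ on the right.) then vertical composition $\diamond$ gives a 2-cell $\epsilon \diamond \eta$ as shown above on the right. This is composition along a common 1-cell $\to$. On the other hand, given four 1-cells as shown below left, (Diagram: 2-cells $\eta$ and $\epsilon$ side by side, meeting at a 0-cell $\star$; their composite $\epsilon \circ \eta$ on the right.) horizontal composition $\circ$ gives a 2-cell $\epsilon \circ \eta$ as shown above right. This is composition along a common 0-cell $\star$. Moreover, the triangle identities involve both compositions. That is, the actual equations are
$$(\epsilon \diamond l) \circ (l \diamond \eta) = \mathrm{id}_l \quad \text{and} \quad (r \diamond \epsilon) \circ (\eta \diamond r) = \mathrm{id}_r$$
Take note of the diamonds vs. the circles!

Alright, fine. But what does this have to do with vector spaces?

The answer lies in the following neat fact.

Neat Fact: Every monoidal category $(C, \otimes, 1)$ can be viewed as a 2-category!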
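so that the following triangles commute
$$V \cong V \otimes \mathbb{R} \xrightarrow{\;V \otimes \eta\;} V \otimes V^* \otimes V \xrightarrow{\;\epsilon \otimes V\;} \mathbb{R} \otimes V \cong V$$
$$V^* \cong \mathbb{R} \otimes V^* \xrightarrow{\;\eta \otimes V^*\;} V^* \otimes V \otimes V^* \xrightarrow{\;V^* \otimes \epsilon\;} V^* \otimes \mathbb{R} \cong V^*$$
with composites $\mathrm{id}_V$ and $\mathrm{id}_{V^*}$ respectively, i.e. so that the following equations hold
$$(\epsilon \otimes \mathrm{id}_V) \circ (\mathrm{id}_V \otimes \eta) = \mathrm{id}_V \quad \text{and} \quad (\mathrm{id}_{V^*} \otimes \epsilon) \circ (\eta \otimes \mathrm{id}_{V^*}) = \mathrm{id}_{V^*}$$
and these are precisely the string diagram equations shown in the chart on page 21.
Margin note: and for all vector spaces $V$ in FVect, $V \otimes \mathbb{R} \cong V \cong \mathbb{R} \otimes V$.
Margin note: On the leftmost triangle, the notation $V \otimes \eta$ denotes the linear map
$$\mathrm{id}_V \otimes \eta : V \otimes \mathbb{R} \to V \otimes V^* \otimes V$$
that appears in the first equation. A similar statement holds for $\epsilon \otimes V$, etc. Also, take note of the different symbols $\otimes$ and $\circ$ and compare them with the diamond $\diamond$ and circle $\circ$ in the margin on the previous page.

Voila!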
Finally, notice that the above holds for every vector space $V$ in FVect. On the other hand, there are certainly 2-categories in which not every 1-cell is dualizable, i.e. has an adjoint. Take Cat for instance! Not every functor is part of an adjunction. There is, however, a special name given to those bicategories C that do arise from a monoidal category C and in which every 1-cell has an adjoint.

That name is compact closed.
Margin note: The punchline for this section is that monoidal categories are an appropriate framework for stacking things together, and the calculus of string diagrams allows us to replace complicated, messy equations by simple, neat pictures. In Section 3, we’ll see two examples of how this can be put into practice.
3 Two Examples
Having taken a leisurely stroll through two themes (functorial se-
mantics and compositionality) and two constructions (monoidal
categories and decorated cospans) within applied category theory, it’s
time to see them come to life in two examples. As mentioned in the
introduction, we’ll walk through the first example—chemical reaction
networks—relatively quickly. There are several excellent resources
available online, including John Baez’s expositions on the n-Category
Café as well as on his personal webpage. (I’ve included a few links to
these in Section 3.3.) Afterwards we’ll take a longer stroll through the
second example—natural language processing—in Section 3.2.
Of course, you can imagine that there might be lots of various reac-
tants, products, and chemical processes. The corresponding network
would then be a (possibly huge) collection of these graphs stacked
Graphs such as these are examples of Petri nets. A Petri net is es-
sentially a bipartite directed (multi)graph that allows us to visually
represent reactions, though they are used outside of chemistry as
well.
But if we do wish to model chemical reactions, then an important
thing we’d like to account for is the rate at which one or more chemi-
cals change over to another. A Petri net with rates included is called,
appropriately, a Petri net with rates. More specifically, it’s a bipartite
directed graph whose two types of vertices are called places, which
represent chemical species, and transitions, which represent chemical
reactions. Moreover, each transition τi is assigned a rate ri , a positive
real number that describes how fast or how likely it is for τi to occur.
These rates then allow us to write down differential equations that
describe the system. A Petri net with rates is thus a pictorial repre-
sentation of a set of differential equations that describe a system. So,
for instance, if you did watch Baez’s “The Mathematics of Networks”
talk then this example will look familiar:
law of mass action, which says that the rate with which a chemical reaction will occur is equal to its rate constant $r_i$ multiplied by the product of the concentrations of the reactants, i.e. the concentrations of the “inputs” of the reaction.
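To see the law of mass action turn a Petri net with rates into differential equations, here is a small Python sketch. The reaction (species A and B combining into C with rate constant 0.5) is a made-up example, not the one from the talk mentioned above, and the forward Euler step is just the simplest way to nudge the concentrations along; it is not meant as a serious integrator. All names are hypothetical.

```python
def mass_action_rate(rate_constant, concentrations, inputs):
    """Rate of one transition: rate constant times the product of input concentrations."""
    rate = rate_constant
    for species in inputs:
        rate *= concentrations[species]
    return rate

def euler_step(concentrations, transitions, dt):
    """Advance every concentration by one forward-Euler step of the rate equations."""
    change = {species: 0.0 for species in concentrations}
    for rate_constant, inputs, outputs in transitions:
        rate = mass_action_rate(rate_constant, concentrations, inputs)
        for species in inputs:
            change[species] -= rate   # inputs are consumed
        for species in outputs:
            change[species] += rate   # outputs are produced
    return {s: c + dt * change[s] for s, c in concentrations.items()}

# Assumed example: a single transition A + B -> C with rate constant 0.5.
concentrations = {"A": 2.0, "B": 3.0, "C": 0.0}
transitions = [(0.5, ["A", "B"], ["C"])]

rate_now = mass_action_rate(0.5, concentrations, ["A", "B"])
after = euler_step(concentrations, transitions, dt=0.1)
print(rate_now, after)
```

Each transition contributes one term to the right-hand side of the equations, so adding more transitions to the list is the code-level version of drawing more boxes in the net.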
By the way, the rates themselves could change with time, which
might suggest the presence of a dynamical system. What’s more, a
system such as the above could potentially interact with its environ-
ment, which is to say there might be some quantities that flow in and
some quantities that flow out, resulting in an open Petri net with
rates:
• a morphism $X \to Y$ is an open Petri net with rates, i.e. a cospan
$$X \xrightarrow{\;i\;} V \xleftarrow{\;o\;} Y$$
together with a Petri net with rates whose set of places is $V$.
  Margin note: Really, it’s an isomorphism class of cospans. Also, you’ll notice that in Theorem 12 of Baez and Pollard’s “A Compositional Framework,” their syntax category is something called RxNet. That stands for the category of open reaction networks with rates. An open reaction network is very nearly the same as an open Petri net, though I’m glossing over this a bit.
The next corollary provides the same statement for the semantics
category:
• a morphism $X \to Y$ is an open dynamical system, i.e. a cospan
$$X \xrightarrow{\;i\;} V \xleftarrow{\;o\;} Y$$
together with a smooth vector field on $\mathbb{R}^V$.
  Margin note: Again, it’s really an isomorphism class of cospans. And again we can think of $V$ as the set of all places in a Petri net where, as before, there may be a real number $r_i$ attached to each vertex that can vary with time. The description of how these things vary in time is precisely a vector field on $\mathbb{R}^V$.
˝p f ˝ gq “ ˝ f ˝ ˝g
˝p f b gq “ ˝ f b ˝g
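• a poset $(P, \leq)$
• and moreover each element $p$ has both a left dual $p^l$ and a right dual $p^r$ with maps
$$p^l p \overset{\epsilon^l}{\leq} 1 \overset{\eta^l}{\leq} p p^l \quad \text{and} \quad p p^r \overset{\epsilon^r}{\leq} 1 \overset{\eta^r}{\leq} p^r p.$$
$$p^l p \xrightarrow{\;\epsilon^l\;} 1 \xrightarrow{\;\eta^l\;} p p^l \quad \text{and} \quad p p^r \xrightarrow{\;\epsilon^r\;} 1 \xrightarrow{\;\eta^r\;} p^r p.$$
discussed in Section 2.1, and the yanking equations (2) amount to the following:
$$p = p \cdot 1 \xrightarrow{\;1 \cdot \eta^r\;} p p^r p \xrightarrow{\;\epsilon^r \cdot 1\;} 1 \cdot p = p$$
$$p = 1 \cdot p \xrightarrow{\;\eta^l \cdot 1\;} p p^l p \xrightarrow{\;1 \cdot \epsilon^l\;} p \cdot 1 = p$$
$$p^r = 1 \cdot p^r \xrightarrow{\;\eta^r \cdot 1\;} p^r p p^r \xrightarrow{\;1 \cdot \epsilon^r\;} p^r \cdot 1 = p^r$$
$$p^l = p^l \cdot 1 \xrightarrow{\;1 \cdot \eta^l\;} p^l p p^l \xrightarrow{\;\epsilon^l \cdot 1\;} 1 \cdot p^l = p^l$$
Margin note: In the first equality of the third line we’re rewriting $p^r$ as $1 \cdot p^r$ rather than $p^r \cdot 1$ because neither of the $\eta$s nor $\epsilon$s provide a map $1 \to p p^r$. Similarly for the last line, write $p^l = p^l \cdot 1$ rather than $p^l = 1 \cdot p^l$ since there’s no map $1 \to p^l p$.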
is a pregroup. The partial order is given pointwise: $f \leq g$ if and only if $f n \leq g n$ for all $n$. The monoid multiplication is given by function composition $f \cdot g := f \circ g$. The monoidal unit is $\mathrm{id}_{\mathbb{Z}}$. Given such a function $f$, its left and right duals are given by
$$f^l n := \min\{m \in \mathbb{Z} \mid n \leq f m\} \qquad f^r n = \max\{m \in \mathbb{Z} \mid f m \leq n\}.$$
For example, if $f m = 2m$, then
$$f^l n = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even,} \\ \frac{n+1}{2} & \text{if } n \text{ is odd} \end{cases} \qquad f^r n = \begin{cases} \frac{n}{2} & \text{if } n \text{ is even,} \\ \frac{n-1}{2} & \text{if } n \text{ is odd.} \end{cases}$$
In short, $f^l n = \lfloor \frac{n+1}{2} \rfloor$ and $f^r n = \lfloor \frac{n}{2} \rfloor$.
Margin note: Fun fact: the pair $(f^l, f)$ forms a special kind of categorical adjunction called a Galois connection since it satisfies
$$f^l n \leq m \Leftrightarrow n \leq f m.$$
Indeed if $n$ is even, then $n/2 \leq m \Leftrightarrow n \leq 2m$. And if $n$ is odd, then $(n+1)/2 \leq m$ which means $n + 1 \leq 2m$ which is true iff $n \leq 2m$. Similarly, the pair $(f, f^r)$ forms a Galois connection since
$$f n \leq m \Leftrightarrow n \leq f^r m.$$
Indeed, if $m$ is even then $2n \leq m \Leftrightarrow n \leq m/2$. And if $m$ is odd, then $2n \leq m$ means $n \leq m/2$ which is true iff $n \leq (m-1)/2$.

You can find this example in “Iterated Galois Connections in Arithmetic and Linguistics” by Lambek, which appears in the Springer book Galois Connections and Applications. You’ll also find mention of it in the “Mathematical Foundations” paper of Coecke et al.
Margin note: For a couple of great introductions to Galois connections (They are super cool and appear in lots of places in math!) take a look at Lecture 4 of John Baez’s online course on applied category theory as well as Section 1.5 of Seven Sketches by Fong and Spivak.
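The doubling example is easy to check by brute force. The Python sketch below verifies, on a finite window of integers (the window size is an arbitrary assumption), that the floor formulas agree with the min and max definitions, that both Galois connection conditions hold, and that the pregroup inequalities hold pointwise when multiplication is composition. Python's `//` is floor division, so it handles negative integers the way the floor brackets do.

```python
def f(m):
    return 2 * m

def f_left(n):
    """Left dual of doubling: floor((n + 1) / 2)."""
    return (n + 1) // 2

def f_right(n):
    """Right dual of doubling: floor(n / 2)."""
    return n // 2

WINDOW = range(-20, 21)       # arbitrary finite test range
SEARCH = range(-100, 101)     # wide enough to contain every min/max we need

# The closed formulas agree with the min/max definitions.
formulas_ok = all(
    f_left(n) == min(m for m in SEARCH if n <= f(m))
    and f_right(n) == max(m for m in SEARCH if f(m) <= n)
    for n in WINDOW
)

# Both Galois connection conditions.
galois_ok = all(
    ((f_left(n) <= m) == (n <= f(m))) and ((f(n) <= m) == (n <= f_right(m)))
    for n in WINDOW for m in WINDOW
)

# Pregroup inequalities, pointwise, with multiplication given by composition.
pregroup_ok = all(
    f_left(f(n)) <= n <= f(f_left(n)) and f(f_right(n)) <= n <= f_right(f(n))
    for n in WINDOW
)
print(formulas_ok, galois_ok, pregroup_ok)
```

A finite check is of course no proof, but it is a quick way to catch an off-by-one in the odd cases.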
While arithmetic is fun, this next example is the one we’re most
interested in.
Example 3.5. Given any finite poset $X$, we can construct the free pregroup generated by $X$, denoted $\mathrm{Preg}\,X$. For a simple example, suppose $X = \{n, s\}$ whose elements we’ll think of as basic grammar types: $n$ is the type of a noun and $s$ is the type of a (declarative) sentence. Elements of $\mathrm{Preg}\{n, s\}$ are concatenations of the letters $n$ and $s$ and their left and right duals and iterations of those duals and so on. For example, some grammatical types in $\mathrm{Preg}\{n, s\}$ are:
The strings of letters are called compound types, and there is a morphism $a \to b$ between compound types if and only if $a$ can reduce to $b$ by application of one or more of the counit maps $\epsilon^r$ and $\epsilon^l$.
Consider a banana, for example. It has type $n$, of course, while the adjective yellow has type $n n^l$. The reason that adjectives have grammar type $n n^l$ is that an adjective can always be paired on the left with a noun, resulting in a new noun, e.g. yellow banana.
yellow: $n n^l$    banana: $n$
This tells us that the phrase yellow banana has grammar type $n$. That’s good. A yellow banana is a noun!
In light of this discussion on yellow bananas, you might enjoy taking a few seconds to think about why $n^r s n^l$ represents the grammar type of transitive verb.
A transitive verb is a word that accepts a noun on the right and an-
other noun on the left such that the resulting phrase is a full sen-
tence. Since we like bananas, here’s another fruit-based example:
Here’s that same reduction written out step-by-step. For clarity, I’ll indicate the concatenation with a dot:
$$n\, n^r s n^l\, n = n n^r \cdot s \cdot n^l n \xrightarrow{\;\epsilon^r \cdot 1 \cdot 1\;} 1 \cdot s \cdot n^l n = s \cdot n^l n \xrightarrow{\;1 \cdot \epsilon^l\;} s \cdot 1 = s$$
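The reduction above is mechanical enough to hand to a computer. Here is a small Python sketch under an assumed encoding that is not in the text: a simple type is a pair `(base, k)` where `k` counts adjoints, so `("n", 0)` is the noun type, `("n", -1)` is its left dual, and `("n", 1)` is its right dual. With that convention both counit maps become a single rule: an adjacent pair with the same base and adjoint counts `k`, `k + 1` cancels. The function name `reduce_type` is hypothetical, and the greedy left-to-right strategy is only a sketch that handles contractions like the ones shown here, not a full decision procedure for pregroup grammars.

```python
N, N_LEFT, N_RIGHT, S = ("n", 0), ("n", -1), ("n", 1), ("s", 0)

def reduce_type(simple_types):
    """Greedily cancel adjacent pairs (x, k)(x, k + 1), scanning left to right."""
    stack = []
    for base, k in simple_types:
        if stack and stack[-1] == (base, k - 1):
            stack.pop()               # a counit map fires: the pair reduces to 1
        else:
            stack.append((base, k))
    return stack

adjective = [N, N_LEFT]               # type of "yellow"
transitive_verb = [N_RIGHT, S, N_LEFT]

yellow_banana = reduce_type(adjective + [N])
noun_verb_noun = reduce_type([N] + transitive_verb + [N])
verb_alone = reduce_type(transitive_verb)
print(yellow_banana, noun_verb_noun, verb_alone)
```

A noun phrase reduces to the single type for nouns, and noun, transitive verb, noun reduces to the sentence type, while a transitive verb on its own has nothing to cancel against and stays as it is.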
$$\text{syntax} \to \text{semantics}$$
In this section, we’ve just seen that the syntax category is taken to be a pregroup freely generated on a finite set of basic grammar types, i.e. syntax $= \mathrm{Preg}\,X$. Let’s move on to semantics now.
What number?
Theorem (The Yoneda Lemma for Linguistics). You shall know a word
by the company it keeps.
Proof. John Firth [20]
[20] Firth, J. R. A synopsis of linguistic theory, 1930–1955. In
Selected Papers of JR Firth, 1952–59 (ed. J. Firth and F. Palmer).
Indiana University Press.
$$w_i = (0, \ldots, \overbrace{1}^{i\text{th spot}}, \ldots, 0)$$
The coefficients $c_i$ are real numbers that indicate the number of
times that $w$ occurs near [22] $w_i$ in the corpus.
[22] You can decide what “near” means. That is, the context of $w$ is
the set of words within $k$ words of $w$, where $k = 1$ or 2 or 3 or
whatever you like.
Here’s an example. Suppose we’re reading a book that contains
the words
$$\{\text{sweet}, \text{green}, \text{furry}\}$$
Let’s choose them to be our context words and make the assignment
so that
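$$\text{sweet} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad \text{green} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad \text{furry} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$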
Then if banana, puppy and fruit are also words in our book, we might
have something like
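$$\text{banana} = \begin{bmatrix} 21 \\ 9 \\ 0 \end{bmatrix} \qquad \text{puppy} = \begin{bmatrix} 8 \\ 1 \\ 32 \end{bmatrix} \qquad \text{fruit} = \begin{bmatrix} 43 \\ 19 \\ 0 \end{bmatrix}$$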
In other words, we’ve used data from the corpus to embed these
words as vectors inside of a three-dimensional vector space. This
prompts us to say that the meaning of the word banana is the vector
$(21, 9, 0)$, the meaning of puppy is $(8, 1, 32)$, and the meaning of
fruit is $(43, 19, 0)$.
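To get a feel for what these meaning vectors buy us, here is a small Python sketch that compares them. The vectors are the ones above; the choice of cosine similarity (the dot product rescaled by the vectors’ lengths) is my own assumption about a reasonable way to measure closeness, and the names `MEANING`, `dot` and `cosine_similarity` are just illustrative.

```python
import math

# Context-word coordinates are (sweet, green, furry), as in the text.
MEANING = {
    "banana": (21, 9, 0),
    "puppy": (8, 1, 32),
    "fruit": (43, 19, 0),
}


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """Dot product rescaled by the lengths of both vectors."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


banana_fruit = cosine_similarity(MEANING["banana"], MEANING["fruit"])
banana_puppy = cosine_similarity(MEANING["banana"], MEANING["puppy"])
print(f"banana vs fruit: {banana_fruit:.3f}")
print(f"banana vs puppy: {banana_puppy:.3f}")
```

With these toy numbers, banana lands far closer to fruit than to puppy, which is what we’d hope for.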
And this works! That is, you can feed distributional models into
your computer, and they’ll ace the word-similarity portion of your
SAT exam. Or you can compute the dot product between word vectors,
and you’ll find that vectors are closer together when the words
they represent have similar meanings. It’s all familiar territory for
NLP practitioners. The semantics category for Coecke et al. is thus
the category FVect of finite-dimensional vector spaces over
$\mathbb{R}$. So we are after a functor
$$F : \text{syntax} \to \text{semantics}$$
or more specifically,
$$F : \mathrm{Preg}X \to \mathsf{FVect}$$
For simplicity, let’s take $X = \{n, s\}$ as we did before. Now to define
a functor, we need simply to say what it does on objects and mor-
phisms. So let’s do that. On objects,
and on morphisms
• $F$ assigns to a type reduction $a \xrightarrow{r} b$ a linear map
$Fa \xrightarrow{Fr} Fb$ that sends the vector corresponding to a word
or phrase of type $a$ in $Fa$ to the vector corresponding to a word or
phrase of type $b$ in $Fb$.
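$$F(1 \xrightarrow{\eta^r_n} n^r n) = F(1 \xrightarrow{\eta^l_n} n n^l) = \eta_N \quad \text{and} \quad F(n n^r \xrightarrow{\epsilon^r_n} 1) = F(n^l n \xrightarrow{\epsilon^l_n} 1) = \epsilon_N$$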
e.g. $F n^r = F n^l = N^*$. But our vector spaces are finite dimensional
and so $N^* \cong N$ and therefore $F n^r = F n^l = N$.
That output vector will be the “meaning” of the sentence. Our goal is
to find that meaning.
bananas $\rightsquigarrow n$
are $\rightsquigarrow n^r s n^l$
fruit $\rightsquigarrow n$
as before. These basis vectors generate the noun space. But what
about the sentence space $S$? For simplicity, let’s define $S$ to be a
“true or false” space so that it’s a one-dimensional vector space
spanned by a single vector $\vec{1}$. The origin $0 \in S$ corresponds
to “false” while $\vec{1}$ corresponds to “true.” What about scalar
multiples of $\vec{1}$? If you like, you’re more than welcome to think
of a positive scalar multiple of $\vec{1}$ as the meaning vector for a
sentence that is super true. The larger the scalar, the more true the
sentence!
Finally, note that once we’ve established N and S, the verb space
comes for free:
$$F(n^r s n^l) = N \otimes S \otimes N$$
This is a nine-dimensional space spanned by vectors of the form
$w_i \otimes \vec{1} \otimes w_j$, where $i$ and $j$ range between 1 and 3.
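If the count of nine feels abstract, the basis can be written out explicitly. The following Python sketch does so under some assumptions of mine: tensors are flattened into plain lists, the helper `kron` (a hypothetical name) implements the usual Kronecker product of coordinate vectors, and $\vec{1}$ is represented by the one-entry list `[1]`.

```python
from itertools import product


def kron(u, v):
    """Kronecker product of two coordinate vectors given as lists."""
    return [a * b for a in u for b in v]


# Basis of the noun space N and of the sentence space S.
N_BASIS = {"sweet": [1, 0, 0], "green": [0, 1, 0], "furry": [0, 0, 1]}
ONE = [1]

# One basis vector w_i (x) 1 (x) w_j for every ordered pair of context words.
verb_basis = {
    (wi, wj): kron(kron(N_BASIS[wi], ONE), N_BASIS[wj])
    for wi, wj in product(N_BASIS, repeat=2)
}

print(len(verb_basis))
print(verb_basis[("sweet", "green")])
```

There are nine pairs, and each one produces a different standard basis vector of a nine-dimensional space, so the verb space really is nine-dimensional.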
Note that both of these vectors live in the noun space $N$ since each
word has grammar type $n$. But what about the transitive verb are? By
Step 1, we know that are has grammar type $n^r s n^l$ and is therefore a
vector in the tensor product $N \otimes S \otimes N$. That is, there are
coefficients $c_{ij} \in \mathbb{R}$ so that
$$\text{are} = c_{11}\, \text{sweet} \otimes \vec{1} \otimes \text{sweet} + c_{12}\, \text{sweet} \otimes \vec{1} \otimes \text{green} + \cdots + c_{33}\, \text{furry} \otimes \vec{1} \otimes \text{furry}.$$
$$n n^r s n^l n \xrightarrow{\epsilon^r_n \cdot 1_s \cdot \epsilon^l_n} s$$
Aside: You might wonder about the word “Choose” in “Step 4: Choose
a type reduction.” What’s up with that? Incidentally, no choice was
needed in this toy example of ours, so the purpose of this aside might
be unclear. Indeed, there’s only one way to parse the sentence bananas
are fruit. But there exist sentences that can be parsed in more than one
way. Consequently, the grammar type of such sentences may reduce
down to type s via more than one reduction morphism. In Step 4, we are
required to choose one. As an illustration, here is a nice sentence:
or perhaps
Those are two parsings of the same sentence, each of which corre-
sponds to a different type reduction in the pregroup. In turn, this gives
rise to different meaning vectors! And rightly so. Those two sentences
have different meanings! Step 4 is simply reminding us of this fact.
Step 5: Apply F!
$$n n^r s n^l n \xrightarrow{\epsilon^r_n \cdot 1_s \cdot \epsilon^l_n} s$$
$$1074\, \vec{1}$$
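Here is a short Python sketch of that last computation. The noun vectors are the ones from the corpus example, but the coefficients $c_{ij}$ of the verb are not reproduced in this passage, so the identity matrix used below is purely an assumption of mine (chosen because it is the simplest choice that returns the value 1074). The function name `sentence_meaning` is hypothetical.

```python
# Noun vectors in the (sweet, green, furry) basis, taken from the text.
bananas = [21, 9, 0]
fruit = [43, 19, 0]

# ASSUMPTION: c_ij = 1 when i = j and 0 otherwise. The actual verb
# coefficients are not shown in this passage.
are = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]


def sentence_meaning(subject, verb, obj):
    """Coefficient of the sentence vector after applying both counits.

    The left counit pairs the subject with the verb's first noun slot and
    the right counit pairs the verb's last noun slot with the object.
    """
    return sum(
        subject[i] * verb[i][j] * obj[j]
        for i in range(len(subject))
        for j in range(len(obj))
    )


print(sentence_meaning(bananas, are, fruit))  # coefficient of the vector 1
```

Under that assumption the coefficient is $21 \cdot 43 + 9 \cdot 19 + 0 \cdot 0 = 1074$, a large positive multiple of $\vec{1}$, so the sentence comes out as very true indeed.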
[Diagram: two string diagrams, each with strands labeled $a$, $a^r$, $a$ at the top and at the bottom]
Now consider a functor $P \to \mathsf{FVect}$. It assigns $a$ in $P$ to
a vector space $A$ in FVect, and it assigns
$(1_a \cdot \eta_a) \circ (\epsilon_a \cdot 1_a)$ to the corresponding
linear map,
$(1_A \cdot \eta_A) \circ (\epsilon_A \cdot 1_A) : A \otimes A^* \otimes A \to A \otimes A^* \otimes A$,
which we’ll just denote by $f$,
[Diagram: $f$ drawn as a string diagram with strands $A$, $A^*$, $A$ at the top and at the bottom]
and which must be an isomorphism. Now if the dimension of $A$ is at
least 2, then we can choose orthogonal basis vectors $e_1$ and $e_2$ so
that $e_1 \otimes e_2^* \otimes e_1 \in A \otimes A^* \otimes A$. And
since $\epsilon_A$ computes the inner product between $e_1$ and $e_2^*$,
we have $f(e_1 \otimes e_2^* \otimes e_1) = 0$. Therefore $f$ is not
injective, and so it cannot be an isomorphism.
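The argument can be checked by hand in the smallest interesting case, and the following Python sketch does exactly that. It assumes $A = \mathbb{R}^2$ with its standard basis, stores a tensor in $A \otimes A^* \otimes A$ as a flat list of coefficients, and uses hypothetical helper names `index` and `f`; it is a sanity check rather than a general proof.

```python
def index(i, j, k, d):
    """Position of the basis tensor e_i (x) e_j^* (x) e_k in a flat list."""
    return (i * d + j) * d + k


def f(t, d):
    """Apply the composite: first the counit on slots 1-2, then the unit."""
    # Counit on the first two slots: pair e_i with e_j^*, keep slot three.
    a = [sum(t[index(i, i, k, d)] for i in range(d)) for k in range(d)]
    # Unit on the right: send a to a (x) sum_j (e_j^* (x) e_j).
    out = [0] * d ** 3
    for i in range(d):
        for j in range(d):
            out[index(i, j, j, d)] = a[i]
    return out


d = 2
t = [0] * d ** 3
t[index(0, 1, 0, d)] = 1  # the basis tensor e_1 (x) e_2^* (x) e_1
image = f(t, d)
print(image)
```

The nonzero tensor `t` is sent to the zero vector, so in this example $f$ has a nontrivial kernel and cannot be an isomorphism.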
The intuition is, perhaps, that pregroups have too few morphisms
to capture the semantics. In particular, pregroups do not allow us
to distinguish different parsings of strings of types. One string may
reduce in several ways—e.g. (Men and women) whom I like vs. Men
and (women whom I like)—and the morphisms in a pregroup do
not account for this. So in some sense, there isn’t enough “wiggle
room” for meaning in pregroup syntax, so the output can only be a
one-dimensional vector space. But all is not lost! As Preller showed,
the problem can be fixed by replacing a free pregroup with a free
compact closed category. For more details, see her paper “From
Logical to Distributional Models.”
• The yellow banana example of the previous section was just a toy
example meant to showcase the functor of the DisCoCat model of
meaning. But this is a document on applied category theory, and
so you’d surely like to see some applications! For empirical data
arising from actual implementations of the DisCoCat model, take a
look at:
• Jelle Herold and the folks at Statebox filmed most of the 2018
workshop talks, and you can watch them here:
https://statebox.org/events/act-leiden.html. Speakers include Samson
Abramsky, John Baez, Bob Coecke, Kathryn Hess, Aleks Kissinger, Tom
Leinster, David Spivak, and many more!