A Natural History of Human Thinking, by Michael Tomasello
A Natural History
of Human Thinking
Michael Tomasello
Cambridge, Massachusetts
London, England
2014
Copyright 2014 by the President and Fellows of Harvard College
All rights reserved
Printed in the United States of America
Tomasello, Michael.
A natural history of human thinking / Michael Tomasello.
pages cm
Includes bibliographical references and index.
ISBN 978-0-674-72477-8 (hardcover : alk. paper)
1. Cognition--Social aspects. 2. Evolutionary psychology.
3. Psychology, Comparative. I. Title.
BF311.T6473 2014
153--dc23 2013020185
For Rita, Anya, Leo, and Chiara
Contents
Preface
1. The Shared Intentionality Hypothesis
2. Individual Intentionality
    Evolution of Cognition
    Thinking like an Ape
    Cognition for Competition
3. Joint Intentionality
    A New Form of Collaboration
    A New Form of Cooperative Communication
    Second-Personal Thinking
    Perspectivity: The View from Here and There
4. Collective Intentionality
    The Emergence of Culture
    The Emergence of Conventional Communication
    Agent-Neutral Thinking
    Objectivity: The View from Nowhere
5. Human Thinking as Cooperation
    Theories of Human Cognitive Evolution
    Sociality and Thinking
    The Role of Ontogeny
6. Conclusion
Notes
References
Index
Preface
This book is a sequel (or, better, a prequel) to The Cultural Origins of Human
Cognition (Harvard University Press, 1999). But it also has a slightly different
focus. In the 1999 book the question was what makes human cognition unique,
and the answer was culture. Individual human beings develop uniquely pow-
erful cognitive skills because they grow to maturity in the midst of all kinds
of cultural artifacts and practices, including a conventional language, and of
course they have the cultural learning skills necessary to master them. Indi-
viduals internalize the artifacts and practices they encounter, and these then
serve to mediate all of their cognitive interactions with the world.
In the current book, the question is similar: what makes human thinking
unique? And the answer is similar as well: human thinking is fundamentally
cooperative. But this slightly different question and slightly different answer
lead to a very different book. The 1999 book was clean and simple because the
data we had comparing apes and humans were so sparse. We could thus say
things like "Only humans understand others as intentional agents, and this
enables human culture." But we now know that the picture is more complex
than this. Great apes appear to know much more about others as intentional
agents than previously believed, and still they do not have human-like culture
With regard to the manuscript itself, I would like to thank Larry Barsa-
lou, Mattia Galloti, Henrike Moll, and Marco Schmidt for reading various
chapters and providing very useful feedback. Of special importance, Richard
Moore and Hannes Rakoczy each read the entire manuscript at a fairly early
stage and provided a number of trenchant comments and suggestions, re-
garding both content and presentation. Thanks also to Elizabeth Knoll and
three anonymous reviewers at Harvard University Press for a number of help-
ful comments and criticisms on the penultimate draft.
Last and most important, I thank my wife, Rita Svetlova, for providing con-
stant and detailed critical commentary and suggestions throughout. Many
ideas were made clearer through discussion with her, and confusing passages
were made clear, or at least clearer, by her literate eye.
1
The Shared Intentionality Hypothesis
first place. Mead (1934) pointed out that when humans interact with one another,
especially in communication, they are able to imagine themselves in the
role of the other and to take the other's perspective on themselves. Piaget
(1928) argued further that these role-taking and perspective-taking abilities,
along with a cooperative attitude, not only make culture and language possible
but also make possible reasoning in which individuals subordinate their
own point of view to the normative standards of the group. And Wittgenstein
(1955) explicated several different ways in which the appropriate use of a
linguistic convention or cultural rule depends on a preexisting set of shared
social practices and judgments ("forms of life"), which constitute the pragmatic
infrastructure from which all uses of language and rules gain their
interpersonal significance. These social infrastructure theorists, as we may
call them, all share the belief that language and culture are only the icing on
the cake of humans' ultrasocial ways of relating to the world cognitively.
Insightful as they were, all of these classic theorists were operating with-
out several new pieces of the puzzle, both empirical and theoretical, that have
emerged only in recent years. Empirically, one new finding is the surprisingly
sophisticated cognitive abilities of nonhuman primates, which have been
discovered mostly in the last few decades (for reviews, see Tomasello and Call,
1997; Call and Tomasello, 2008). Thus, great apes, as the closest living rela-
tives of humans, already understand in human-like ways many aspects of
their physical and social worlds, including the causal and intentional rela-
tions that structure those worlds. This means that many important aspects of
human thinking derive not from humans' unique forms of sociality, culture,
and language but, rather, from something like the individual problem-solving
abilities of great apes in general.
Another new set of findings concerns prelinguistic (or just linguistic) human
infants, who have yet to partake fully of the culture and language around
them. These still fledgling human beings nevertheless operate with some cog-
nitive processes that great apes do not, enabling them to engage with others
socially in some ways that great apes cannot, for example, via joint attention
and cooperative communication (Tomasello et al., 2005). The fact that these
precultural and prelinguistic creatures are already cognitively unique provides
empirical support for the social infrastructure theorists' claim that important
aspects of human thinking emanate not from culture and language per se but,
rather, from some deeper and more primitive forms of uniquely human social
engagement.
heuristics (so-called system 1 processes), humans and at least some other ani-
mals also solve some problems and make some decisions by thinking (system
2 processes; e.g., Kahneman, 2011). A specific focus on thinking is useful be-
cause it restricts our topic to a single cognitive process, but one that involves
several key components, especially (1) the ability to cognitively represent expe-
riences to oneself off-line; (2) the ability to simulate or make inferences trans-
forming these representations causally, intentionally, and/or logically; and
(3) the ability to self-monitor and evaluate how these simulated experiences
might lead to specific behavioral outcomes and so to make a thoughtful
behavioral decision.
It seems obvious that, compared with other animal species, humans think
in special ways. But this difference is hard to characterize using traditional
theories of human thinking since they presuppose key aspects of the process
that are actually evolutionary achievements. These are precisely the social
aspects of human thinking that are our primary focus here. Thus, although
many animal species can cognitively represent situations and entities at least
somewhat abstractly, only humans can conceptualize one and the same situ-
ation or entity under differing, even conflicting, social perspectives (leading
ultimately to a sense of objectivity). Further, although many animals also
make simple causal and intentional inferences about external events, only
humans make socially recursive and self-reflective inferences about others or
their own intentional states. And, finally, although many animals monitor
and evaluate their own actions with respect to instrumental success, only
humans self-monitor and evaluate their own thinking with respect to the
normative perspectives and standards (reasons) of others or the group. These
fundamentally social differences lead to an identifiably different type of think-
ing, what we may call, for the sake of brevity, objective-reflective-normative
thinking.
In this book we attempt to reconstruct the evolutionary origins of this
uniquely human objective-reflective-normative thinking. The shared intentionality
hypothesis is that what created this unique type of thinking (its processes
of representation, inference, and self-monitoring) were adaptations for dealing
with problems of social coordination, specifically, problems presented by individuals'
attempts to collaborate and communicate with others (to co-operate
with others). Although humans' great ape ancestors were social beings, they
lived mostly individualistic and competitive lives, and so their thinking was
geared toward achieving individual goals. But early humans were at some
Individual Intentionality
Cognitive processes are a product of natural selection, but they are not its tar-
get. Indeed, natural selection cannot even see cognition; it can only see the
effects of cognition in organizing and regulating overt actions (Piaget, 1971). In
evolution, being smart counts for nothing if it does not lead to acting smart.
The two classic theories of animal behavior, behaviorism and ethology, both
focused on overt actions, but they somehow forgot the cognition. Classical
ethology had little or no interest in animal cognition, and classical behavior-
ism was downright hostile to the idea. Although contemporary instantiations of
ethology and behaviorism take some account of cognitive processes, they pro-
vide no systematic theoretical accounts. Nor are any other modern approaches
to the evolution of cognition sufficient for current purposes.
And so to begin this account of the evolutionary emergence of uniquely
human thinking, we must first formulate, in broad outline, a theory of the
evolution of cognition more generally. We may then begin our natural history
proper by using this theoretical framework to characterize processes of cognition
and thinking in modern-day great apes, as representative of humans'
evolutionary starting point before they separated from other primates some
six million years ago.
Evolution of Cognition
All organisms possess some reflexive reactions that are organized linearly as
stimulus-response linkages. Behaviorists think that all behavior is organized
in this way, though in complex organisms the linkages may be learned and
become associated with others in various ways. The alternative is to recognize
that complex organisms also possess some adaptive specializations that are
organized circularly, as feedback control systems, with built-in goal states and
action possibilities. Starting from this foundation, cognition evolves not from
a complexifying of stimulus-response linkages but, rather, from the individ-
ual organism gaining (1) powers of flexible decision-making and behavioral
control in its various adaptive specializations, and (2) capacities for cognitively
representing and making inferences from the causal and intentional relations
structuring relevant events.
Adaptive specializations are organized as self-regulating systems, as are
many physiological processes such as the homeostatic regulation of blood
sugar and body temperature in mammals. These specializations go beyond
reflexes in their capacity to produce adaptive behavior in a much wider range
of circumstances, and indeed, they may be quite complex, for example, spi-
ders spinning webs. There is no way that a spider can spin a web using only
stimulus-response linkages. The process is too dynamic and dependent on
local context. Instead, the spider must have goal states that it is motivated to
bring about, and the ability to perceive and act so as to bring them about in a
self-regulated manner. But adaptive specializations are still not cognitive (or
only weakly cognitive) because they are unknowing and inflexible by definition:
perceived situations and behavioral possibilities for goal attainment are mostly
connected in an inflexible manner. The individual organism does not have the
kind of causal or intentional understanding of the situation that would enable it
to deal flexibly with novel situations. Natural selection has designed these
adaptive specializations to work invariantly in the same situations as those
encountered in the past, and so cleverness from the individual is not needed.
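The contrast drawn here between a linear stimulus-response linkage and a circularly organized, self-regulating system can be sketched in a few lines of Python. This is only an illustrative toy, not a model from the text: the reflex table, the set point of 37.0, the tolerance, and the step size are all assumed values chosen for the example. Note that the regulator maps each discrepancy to a fixed corrective action, which is the sense in which such a specialization is goal-directed yet inflexible.

```python
from typing import Dict

# Reflex: a linear stimulus-response linkage (a fixed lookup, no goal state).
# The entries are hypothetical examples.
REFLEXES: Dict[str, str] = {"puff_of_air": "blink", "hot_surface": "withdraw"}


def reflex(stimulus: str) -> str:
    """Return the fixed response to a stimulus, or no response at all."""
    return REFLEXES.get(stimulus, "no_response")


def regulate(temperature: float,
             set_point: float = 37.0,   # assumed goal state
             tolerance: float = 0.5,    # assumed acceptable deviation
             max_steps: int = 100) -> float:
    """Feedback control: perceive, compare to the goal state, act, repeat."""
    for _ in range(max_steps):
        error = temperature - set_point      # compare perception to goal state
        if abs(error) <= tolerance:          # goal state attained: stop acting
            break
        # Inflexible mapping from discrepancy to action:
        # too warm -> cool down a step, too cold -> warm up a step.
        temperature += -0.25 if error > 0 else 0.25
    return temperature


if __name__ == "__main__":
    print(reflex("puff_of_air"))
    print(regulate(39.0), regulate(35.0))
```

The reflex does the same thing whatever its result, whereas the regulator keeps acting until what it perceives matches its goal state. Neither, however, could cope with a situation for which it has no built-in corrective action.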
Cognition and thinking enter the picture when organisms live in less predictable
worlds and natural selection crafts cognitive and decision-making
processes that empower the individual to recognize novel situations and to
deal flexibly, on its own, with unpredictable exigencies. What enables effec-
tive handling of a novel situation is some understanding of the causal and/or
intentional relations involved, which then suggests an appropriate and poten-
tially novel behavioral response. For example, a chimpanzee might recognize
that the only tool available to her in a given situation demands, based on the
physical causality involved, manipulations she has never before performed
toward this goal. A cognitively competent organism, then, operates as a con-
trol system with reference values or goals, capacities for attending to situations
causally or intentionally relevant to these reference values or goals, and
capacities for choosing actions that lead to the fulfillment of these reference
values or goals (given the causal and/or intentional structure of the situation).
Cognitive Representation
Cognitive representation in a self-regulating, intentional system may be char-
acterized both in terms of its content and in terms of its format. In terms of
content, the claim here is that both the organism's internal goals and its externally
directed attention (NB: not just perception but attention) have as
content not punctate stimuli or sense data, but rather whole situations. Goals,
values, and other reference values (pro-attitudes) are cognitive representa-
tions of situations that the organism is motivated to bring about or maintain.
Although we sometimes speak of an object or location as someone's goal, this
is really only a shorthand way of speaking; the goal is the situation of having
the object or reaching the location. The philosopher Davidson (2001) writes,
"Wants and desires are directed to propositional contents. What one wants
is . . . that one has the apple in hand. . . . Similarly . . . someone who intends
to go to the opera intends to make it the case that he is at the opera" (p. 126).
In this same manner, modern decision theory often speaks of the desire or
preference that a particular state of affairs be realized.
If goals and values are represented as desired situations, then what the or-
ganism must attend to in its perceived environment is situations relevant to
those goals and values. Desired situations and attended-to environmental situ-
ations are thus perforce in the same perceptually based, fact-like representa-
tional format, which enables their cognitive comparison. Of course, complex
organisms also perceive less complex things, such as objects, properties, and
events (and can attend to them for specific purposes), but in the current
analysis they always do so as components of situations relevant to behavioral
decision making.
To illustrate the point, let us suppose that the image in Figure 2.1 is what
a chimpanzee sees as she approaches a tree while foraging.
The chimpanzee perceives the scene in the same basic way that we would;
our visual systems are similar enough that we see the same basic objects and
their spatial relationships. But what situations does the chimpanzee attend
to? Although she could potentially focus her attention on any of the poten-
tially infinite situations that this image presents, at the current moment she
must make a foraging decision, and so she attends to the situations or facts
relevant to this behavioral decision, to wit (as described in English):
For a foraging chimpanzee with the goal of obtaining food, given all of its
perceptual and behavioral capacities and its knowledge of the local ecology,
all of these are relevant situations for deciding what to do, all present in a
single visual image and, of course, nonverbally. (NB: Even the absence of
something expected, such as food not in its usual location, may be a relevant
situation.)
Relevance is one of those occasion-sensitive judgments that cannot be
given a general definition. But in broad strokes, organisms attend to situations
as either (1) opportunities or (2) obstacles to the pursuit and maintenance of
their goals and values (or as information relevant to predicting possible future
opportunities or obstacles). Different species have different ways of life, of
course, which means that they perceive or attend to different situations (and
components of situations). Thus, for a leopard, the situation of bananas in a
tree would not represent an opportunity to eat, but the presence of a chimpanzee
would. For the chimpanzee, in contrast, the leopard's presence now
presents an obstacle to its value of avoiding predators, and so it should look
for a situation providing opportunities for escape, such as a tree to climb without
low-hanging limbs, given its knowledge that leopards cannot climb such
trees and its familiarity with its own tree-climbing prowess. If we now throw
into the mix a worm resting on the banana's surface, the relevant situations
for the three different species (the obstacles and opportunities for their respective
goals) would overlap even less, if at all. Relevant situations are thus
determined jointly by the organism's goals and values, its perceptual abilities
and knowledge, and its behavioral capacities, that is to say, by its overall
functioning as a self-regulating system. Identifying situations relevant for
a behavioral decision thus involves an organism's whole way of life (von
Uexküll, 1921).2
In terms of representational format, the key is that to make creative infer-
ences that go beyond particular experiences, the organism must represent its
experiences as types, that is to say, in some generalized, schematized, or abstract
form. One plausible hypothesis is a kind of exemplar model in which the indi-
vidual in some sense saves the particular situations and components to
which it has attended (many models of knowledge representation have atten-
tion as the gateway). There is then generalization or abstraction across these in
a process that we might call schematization. (Langacker's [1987] metaphor is of
a stack of transparencies, each depicting a single situation or entity, and sche-
matization is the process of looking down through them for overlap.) We
might think of the result of this process of schematization as cognitive models
of various types of situations and entities, for example, categories of objects,
schemas of events, and models of situations. Recognizing a situation or entity
as a token of a known type (as an exemplar of a cognitive category, schema,
or model) enables novel inferences about the token appropriate to the type.
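The transparency metaphor can be made concrete with a deliberately crude sketch in which each attended-to exemplar is a set of features and schematization is simply the overlap across them. Everything here is an assumption made for illustration: the feature names, the use of plain set intersection as a stand-in for "looking down through the stack," and the `type_knowledge` passed to `infer_from_type`. The imagistic, perceptually grounded models the text has in mind are far richer than feature sets.

```python
from typing import FrozenSet, Iterable, Set


def schematize(exemplars: Iterable[Set[str]]) -> FrozenSet[str]:
    """Keep only what overlaps across all stored exemplars (the 'stack')."""
    stack = [set(e) for e in exemplars]
    if not stack:
        return frozenset()
    return frozenset(set.intersection(*stack))


def is_token_of(entity: Set[str], schema: FrozenSet[str]) -> bool:
    """An entity counts as a token of the type if it shows every schema feature."""
    return bool(schema) and schema <= entity


def infer_from_type(entity: Set[str], schema: FrozenSet[str],
                    type_knowledge: Set[str]) -> Set[str]:
    """Carry knowledge attached to the type over to a newly recognized token."""
    return set(type_knowledge) if is_token_of(entity, schema) else set()


if __name__ == "__main__":
    # Hypothetical exemplars of previously attended-to bananas.
    seen = [
        {"yellow", "curved", "edible", "in_tree"},
        {"green", "curved", "edible", "in_tree"},
        {"yellow", "curved", "edible", "on_ground"},
    ]
    banana = schematize(seen)
    novel = {"brown", "curved", "edible", "on_ground"}
    print(sorted(banana))
    print(infer_from_type(novel, banana, {"peel_before_eating"}))
```

The point of the sketch is only the last step: once a never-before-seen entity is recognized as "another one of those," whatever is known about the type becomes available for inference about the token.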
Categories, schemas, and models as cognitive types are nothing more or less
than imagistic or iconic schematizations of the organism's (or, in some cases, its
species') previous experience (Barsalou, 1999, 2008). As such, they do not suffer
from the indeterminacy of interpretation that some theorists attribute to iconic
representations considered as mental pictures, that is, the indeterminacy of
whether this image is of a banana, a fruit, an object, and so forth (Crane, 2003).
They do not because they are composed of individual experiences in which the
organism was attending to a relevant (already interpreted) situation. Thus,
the organism interprets, or understands, particular situations and entities in
the context of its goals as it assimilates them to known (cognitively represented)
types: "This is another one of those."
Behavioral Self-Monitoring
To think effectively, an organism with individual intentionality must be able
to observe the outcome of its actions in a given situation and evaluate whether
they match the desired goal state or outcome. Engaging in some such pro-
cesses of behavioral self-monitoring and evaluation is what enables learning
from experience over time.
A cognitive version of such self-monitoring enables the agent, as noted above,
to inferentially simulate a potential action-outcome sequence ahead of time
and observe it (as if it were an actual action-outcome sequence) and then
evaluate the imagined outcome. This process creates more thoughtful decision
making through the precorrection of errors. (Dennett [1995] calls it Popperian
learning because failure means that my hypothesis dies, not me.) For
example, consider a squirrel on one tree branch gearing up to jump to another.
One can see the muscles preparing, but in some cases the squirrel decides the
leap is too far and so, after feigning some jumps, climbs down the trunk and
then back up the other branch. The most straightforward description of this
event is that the squirrel is observing and evaluating a simulation of what it
would experience if it leaped; for example, it would experience missing the
branch and falling, a decidedly negative outcome. The squirrel must then
use that simulation to make a decision about whether to actually leap. Okrent
(2007) holds that imagining the possible outcomes of different behavioral
choices ahead of time, and then evaluating and deciding for the one with the
best imagined outcome, is the essence of instrumental rationality.
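The squirrel example has a simple computational shape: simulate each behavioral option off-line, evaluate the imagined outcome, and only then act. The sketch below is one hedged way to write that down. The outcome values (1.0, 0.5, -10.0), the `max_jump` parameter, and the function names are all hypothetical, chosen only so that falling is scored as much worse than a slow but safe detour.

```python
from typing import Callable, Dict


def simulate_leap(gap: float, max_jump: float) -> float:
    """Imagined outcome of leaping: a quick arrival, or missing the branch."""
    return 1.0 if gap <= max_jump else -10.0   # assumed payoffs


def simulate_detour() -> float:
    """Imagined outcome of climbing down and back up: slow but safe."""
    return 0.5                                  # assumed payoff


def decide(options: Dict[str, Callable[[], float]]) -> str:
    """Run every simulation first, then pick the best imagined outcome."""
    imagined = {name: simulate() for name, simulate in options.items()}
    return max(imagined, key=imagined.get)


def squirrel_choice(gap: float, max_jump: float = 2.0) -> str:
    """Choose between leaping and detouring for a gap of the given width."""
    return decide({
        "leap": lambda: simulate_leap(gap, max_jump),
        "climb_around": simulate_detour,
    })


if __name__ == "__main__":
    print(squirrel_choice(1.0))
    print(squirrel_choice(3.0))
```

No action is taken inside `decide`; the error of leaping too far is corrected in simulation, which is the precorrection of errors described above.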
This kind of self-monitoring, requiring what some call executive function-
ing, is cognitive because the individual, in some sense, observes not just its ac-
tions and their results in the environment but also its own internal simulations.
It is also possible for the organism to assess things like the information it has
available for making a decision in order to predict the likelihood that it will
make a successful choice (before it actually chooses). Humans even use the
imagined evaluations of other persons (or the imagined comprehension of
others in the case of communication) to evaluate potential behavioral decisions.
Whatever its specific form, internal self-monitoring of some kind is
critical to anything we would want to call thinking, as it constitutes, in some
sense, the individual knowing what it is doing.
(2) if the food were inside the shaking cup, then it would make noise;
(3) therefore, the food is inside the cup. In condition 2, the experimenter
shook the empty cup. In this case the chimpanzee observed only silence and
had to infer backward in the causal chain to why that might be, specifically,
that there was no food in the cup. This is a kind of proto-modus tollens: (1) the
shaking cup is silent; (2) if the food were inside the shaking cup, then it
would make noise; (3) therefore, the food must not be in the cup (the shaken
cup must be empty). The chimpanzees made this inference, but they also made
an additional one. They combined their understanding of the causality of
noise making in this context with their preexisting knowledge that the food
was in one of the two cups to locate the food in the other, nonshaken cup (if
the food is not in this one, then it must be in that one; see bottom row in
Figure 2.2). This inferential paradigm thus involves the kind of exclusion in-
ference characteristic of a disjunctive syllogism.
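The inferential steps in the shaking-cups paradigm can be laid out explicitly. The following is a minimal sketch of the logic only, under the assumptions stated in the text (food is in exactly one of two cups, and food inside a shaken cup makes noise); the cup labels and the function name are hypothetical, and nothing here is a claim about how apes actually compute the answer.

```python
CUPS = ("left", "right")  # background knowledge: food is in exactly one of these


def infer_food_location(shaken_cup: str, heard_noise: bool) -> str:
    """Locate the food from a single shake, given the causal rule
    'if the food were inside the shaking cup, then it would make noise'."""
    if shaken_cup not in CUPS:
        raise ValueError(f"unknown cup: {shaken_cup!r}")
    other_cup = CUPS[1] if shaken_cup == CUPS[0] else CUPS[0]

    if heard_noise:
        # Condition 1: inference from the effect (noise) back to its cause.
        return shaken_cup

    # Condition 2, step 1 (proto-modus tollens):
    #   the cup is silent; food would make noise; so no food in this cup.
    # Condition 2, step 2 (exclusion, as in a disjunctive syllogism):
    #   the food is in this cup or that one; not this one; so that one.
    return other_cup


if __name__ == "__main__":
    print(infer_food_location("left", heard_noise=True))
    print(infer_food_location("left", heard_noise=False))
```

The silent case is the interesting one, because the answer is a cup about which the ape has received no direct perceptual information at all.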
Negation is a very complex cognitive operation, and one could easily
object to the use of negation in these proposed accounts of great ape logical
inferences. But Bermudez (2003) makes a novel theoretical proposal about some
possible evolutionary precursors to formal negation that make these accounts
much more plausible. The proposal is to think of a kind of protonegation
as simply comprising exclusionary opposites on a scale (contraries), such as
presence-absence, noise-silence, safety-danger, success-failure, and available-
not available. If we assume that great apes understand polar opposites such as
these as indeed mutually exclusive (for example, if something is absent, it
cannot be present, or if it makes noise it cannot be silent), then this could be
a much simpler basis for the negation operation. All of the current descrip-
tions assume protonegation of this type.
When taken together, the conditional (if-then) and negation operations
structure all of the most basic paradigms of human logical reasoning. The
claim is thus that great apes can solve complex and novel physical problems
by assimilating key aspects of the problem situation to already known cognitive
models with causal structure and then use those models to simulate or make
inferences about what has happened previously or what might happen next,
employing both a kind of protoconditional and a kind of protonegation in
both forward-facing and backward-facing paradigms. Our general conclusion
is thus that since the great apes in these studies are using cognitive models con-
taining general principles of causality, and they are also simulating or making
inferences in various kinds of protological paradigms, with various kinds of
self-monitoring along the way, what the great apes are doing in these studies
is thinking.
tion, subordinate chimpanzees avoided going for food that a dominant could
not see now but had seen hidden in one of the locations some moments be-
fore; they knew that he knew where the hidden food was located (Hare et al.,
2001; Kaminski et al., 2008). In still another variation, in a back-and-forth
foraging game, chimpanzees knew that if their competitor chose first, he would
choose a board that was lying slanted on the table (as if something were under-
neath) rather than a flat board (under which there could be nothing); they
knew what kind of inference he would make in the situation (Schmelz et al.,
2011). Chimpanzees thus know that others see things, know things, and
make inferences about things.
But beyond exploiting their understanding of what others do and do not
experience and how this affects their behavior, great apes sometimes even at-
tempt to manipulate what others experience. In a series of experiments, Hare
et al. (2006) and Melis et al. (2006a) had chimpanzees compete with a human
(sitting in a booth-like apparatus) for two pieces of food. In some conditions,
the human could see the ape equally well if it approached either piece of food;
in these cases, the apes had no preference for either piece. But in the key con-
dition, a barrier was in place so that the apes could approach one piece of food
without being seen, which is exactly what they did. They even did this when
they themselves could not see the human in either case. (They had to choose
to reach for food from behind a barrier in both cases, but through a clear tun-
nel in one case and an opaque one in the other.) Perhaps most impressive, the
same individuals also preferentially chose to pursue food that they could approach
silently (so that the distracted human competitor did not know they
were doing so) as opposed to food that required them to make noise en
route. This generalization to a completely different perceptual modality speaks
to the power and flexibility of the cognitive models and inferences involved.
Importantly, and analogous to the domain of physical cognition, the chimpanzees
in these studies not only made productive inferences based on a general
understanding of intentionality but also connected their inferences into para-
digms to both predict and even manipulate what others would do (see Figure
2.3). The background knowledge required in all of these food competition
experiments is that a competitor will go for a piece of food if and only if (1) he
has the goal of having it and (2) he perceives its location (e.g., at location A).
The protoconditional inferences in the Hare et al. (2000) experiment follow
straightforwardly from this: if the dominant wants the banana and sees it at
location A, then she will go to location A. Also analogous to the domain of
counted the unusual action and used their hands as they normally would.
However, when they saw the human use the unusual action when there was
no physical constraint dictating this (he just turned on the light with his
head for no discernable reason), they quite often copied the unusual behavior
themselves. The most natural interpretation of this differentiated pattern of
response would be that the apes employed a kind of proto-modus tollens, from
effect to cause with protonegation, similar to that in the Call (2004) shaking
cups study: (1) he is not using his hands; (2) if he had a free choice, he would
be using his hands; (3) therefore he must not have a free choice (in one case
for obvious reasons; in the other not).
These studies demonstrate that great apes can solve complex social problems,
just as they solve complex physical problems, by assimilating key aspects of the
problem situation to a cognitive model (which in this case embodies a general
understanding of intentionality) and then using that model to simulate or
make inferences about what has happened or what might happen next. Great
apes employ both a kind of protoconditional and a kind of protonegation
(in both forward-facing and backward-facing modes) in the context of
protological paradigms of social inferring. Our conclusion is thus that in the
social domain, as well as the physical domain, what the great apes in these
studies are doing is thinking.
Cognitive Self-Monitoring
Great apes in these studies are clearly not just automatically flipping through
behavioral alternatives and reacting to a goal match; they monitor, and so in
some sense know, what they are doing in order to make more effective deci-
sions. On the level of action (recall the hesitant squirrel), recent studies of
great apes have shown that they can (1) delay taking a smaller reward so as to
get a larger reward later, (2) inhibit a previously successful response in favor
of a new one demanded by a changed situation, (3) make themselves do some-
thing unpleasant for a desired reward at the end, (4) persist through failures, and
(5) concentrate through distractions. Specifically, in a comprehensive comparative
study, chimpanzees' ability to do these things was roughly comparable
to that of three-year-old human children (Herrmann et al., submitted). These
are all skills referred to variously as impulse control, attentional control, emotion
regulation, and executive function, though we prefer to use the terms
behavioral self-monitoring, for more action-based self-regulation, and cognitive
empirical data are less clear-cut in the case of cognitive self-monitoring, Call's
(2010) finding that the same factors affect the process in humans and great
apes is highly suggestive that, in concrete situations at least, the apes are
genuinely self-monitoring the decision-making process.
In any case, our natural history of human thinking begins with this pos-
sibly somewhat generous account of great ape thinking. To summarize, think-
ing comprises three key components, and great apes operate in cognitively
sophisticated ways with each of them.
and also that they can predict what particular individuals will do in situations
based on their past experience with them (Hare et al., 2001). Further evidence
is the fact that in experiments great apes individuate objects, so that if they
see a particular object go behind a screen, they expect to find that particular
object there, and if they see it leave and another replace it, they do not expect
to find it there anymore; and if two identical objects go behind the screen,
they expect to find two objects. They are not feature placing, but rather, they
are tracking the self-same object or objects engaging in different actions across
time (Mendes et al., 2008).
With respect to different individuals doing the same thing, great apes
know such individual things as "leopards climb trees," "snakes climb trees,"
and "monkeys climb trees," each in their own way. Here things are a bit more
difficult evidentially because there are few if any nonverbal methods for inves-
tigating event schemas like climbing. But one hypothesis is that a nonverbal
way of establishing an event schema is imitation. That is, an individual who
imitates another knows at the very least that a demonstrator is doing X and
then they themselves can do X, "the same thing," as well (and perhaps
other actors also). Although imitation is not their frontline strategy for social
learning, great apes (at least those raised by humans) are nevertheless capable
of reproducing the actions of others with some facility in some contexts (e.g.,
Tomasello et al., 1993; Custance et al., 1995; Buttelmann et al., 2007). Some
apes also know when another individual is imitating them, again suggesting
at least a rudimentary understanding of self-other equivalence (Haun and
Call, 2008). But imitation involves just self and other. Since apes understand
the goals of all agents, an alternative hypothesis might be that apes schematize
acts of climbing based not on movements but on an understanding that the
actor has the goal of getting up the tree and that goal (not actions per se)
provides the basis for an event schema across all individuals, with or without
the self.
Great ape cognition thus goes at least some way toward meeting the gen-
erality constraint, although productivity may be limited. The claim would be
that great ape productive thinking enables an individual to imagine, for ex-
ample, that if I chase this novel animal it might climb a tree, even if I have
never before seen this animal climb a tree. On the other hand, it may be that
an ape could not imagine something contrary to fact (i.e., contrary to its causal
understanding), such as a leopard flying, as humans are able to do with the
aid of external communicative vehicles. Apes' sense of self-other equivalence
may also be limited by the fact that imitation takes place sequentially, whereas
much better for establishing self-other equivalence are situations in which
the equivalence is manifest in a single social interaction simultaneously (e.g.,
role reversal in the collaborative activities of humans).
Behavioral Self-Monitoring
The third key component of great ape thinking is the ability to self-monitor the
decision-making process. Many animal species self-monitor, and even antici-
pate, the outcomes of their behavioral decisions in the world. But great apes
do more than this simple behavioral self-monitoring.
And so we may imagine a common ancestor to humans and other great apes.
Its daily life was like that of extant nonhuman apes: most waking hours spent
in small bands foraging individually for fruit and other vegetation, with
various kinds of social interactions, mostly competitive, interspersed. Our
hypothesis is that this creature (and also probably australopithecines for the
ensuing 4 million years of the human lineage) was individually intentional
and instrumentally rational. It cognitively represented its physical and social
experience categorically and schematically, and it made all kinds of produc-
tive and hypothetical inferences and chains of inferences about its experience
as well, all with a modicum of cognitive self-monitoring. And so, the crucial
point is that well before the emergence of uniquely human sociality, much
less culture, language, and institutions, the foundations for human thinking
were securely in place in humans' last common ancestor with other apes.
Joint Intentionality
with other primates, then, it would seem that we need an intermediate step
in our natural history. We need some early humans who were not yet living in
cultures and using conventional languages, but who were nevertheless much
more cooperatively inclined than the last common ancestor.
And so we will posit in this chapter, as an initial step, some early humans
who created new forms of social coordination, perhaps in the context of collaborative
foraging. Early humans' new form of collaborative activity was
unique among primates because it was structured by joint goals and joint
attention into a kind of second-personal joint intentionality of the moment, a
"we" intentionality with a particular other, within which each participant
had an individual role and an individual perspective. Early humans' new
form of cooperative communication (the natural gestures of pointing and
pantomiming) enabled them to coordinate their roles and perspectives on
external situations with a collaborative partner toward various kinds of joint
objectives. The result was that these early humans cooperativized great ape
individual intentionality into human joint intentionality involving new forms
of cognitive representation (perspectival, symbolic), inference (socially recursive),
and self-monitoring (regulating one's actions from the perspective of a
cooperative partner), which, when put to use in solving concrete problems of
social coordination, constituted a radically new form of thinking.
So let us look, first, at the new form of collaboration that emerged with
early humans, then at the new form of cooperative communication that early
humans used to coordinate their collaborative activities, and then at the result-
ing new form of thinking that all of this collaborating and communicating
required as substrate.
characterizes virtually all of the foraging activities of the four great ape
species.
The main exception to this general great ape pattern is chimpanzees' group
hunting of monkeys, systematically observed only in chimpanzees, and only
in some groups (Boesch and Boesch, 1989; Watts and Mitani, 2002). What
happens prototypically is that a small party of male chimpanzees spies a red
colobus monkey somewhat separated from its group, which they then proceed
to surround and capture. Normally, one individual begins the chase, and
others scramble to the monkey's possible escape routes, including the ground.
One individual actually captures the monkey, and he ends up getting the
most and best meat. But because he cannot dominate the carcass on his own,
all participants (and many bystanders) usually get at least some meat, depend-
ing on their dominance and the vigor with which they beg and harass the
captor (Gilby, 2006).
The social and cognitive processes involved in chimpanzee group hunting
could potentially be complex, but they could also be fairly simple. The rich
reading is a human-like reading, namely, that chimpanzees have the joint goal
of capturing the monkey together and that they coordinate their individual
roles in doing so (Boesch, 2005). But more likely, in our opinion, is a leaner
interpretation (Tomasello et al., 2005). In this interpretation, each individual
is attempting to capture the monkey on its own (since captors get the most
meat), and they take into account the behavior, and perhaps intentions, of
the other chimpanzees as these affect their chances of capture. Adding some
complexity, individuals prefer that one of the other hunters capture the mon-
key (in which case they will get a small amount of meat through begging and
harassing) to the possibility of the monkey escaping totally (in which case
they get no meat). In this view, chimpanzees in a group hunt are engaged in
a kind of co-action in which each individual is pursuing his own individual
goal of capturing the monkey (what Tuomela, 2007, calls "group behavior
in I-mode"). In general, it is not clear that chimpanzees' group hunting of
monkeys is so different cognitively from the group hunting of other social
mammals, such as lions and wolves.
In stark contrast, human foraging is collaborative in much more fundamen-
tal ways. In modern forager societies, individuals produce the vast majority
of their daily sustenance collaboratively with others, either immediately
through collaborative efforts or via procurers who bring the food back to some
central location for sharing (Hill and Hurtado, 1996; Hill, 2002; Alvard,
population size were both expanding rapidly (Gowlett et al., 2012). We may
hypothesize that these collaborative foragers lived as more or less loose bands
comprising a kind of pool of potential collaborators.
But more important than when is how. In the hypothesis of Tomasello et
al. (2012), obligate collaborative foraging became an evolutionarily stable strat-
egy for early humans because of two interrelated processes: interdependence
and social selection. The first and most basic point is that humans began a
lifestyle in which individuals could not procure their daily sustenance alone
but instead were interdependent with others in their foraging activities,
which meant that individuals needed to develop the skills and motivations to
forage collaboratively or else starve. There was thus direct and immediate selective
pressure for skills and motivations for joint collaborative activity (joint
intentionality). The second point is that as a natural outcome of this interde-
pendence, individuals began to make evaluative judgments about others as
potential collaborative partners: they began to be socially selective, since choos-
ing a poor partner meant less food. Cheaters and laggards were thus selected
against, and bullies lost their power to bully. Importantly, this now meant that
early human individuals had to worry, in a way that other great apes do not,
both about evaluating others and about how others were evaluating them as
potential collaborative partners (i.e., a concern for self-image).
The situation these early humans faced is perhaps best modeled by the stag
hunt scenario from game theory (Skyrms, 2004). Two individuals have easy
access to low-payoff hares (e.g., low-calorie vegetation), and then there ap-
pears on the horizon a high-payoff but difficult-to-obtain stag (e.g., large
game) that can be acquired only if individuals abandon their hares and col-
laborate. Their motivations thus align, because it is in both their interests to
work together. The dilemma is thus purely cognitive: since collaboration is
mandatory and I am risking my hare, I want to go for the stag only if you do,
too. But you only want to go for the stag if I do, too. How do we coordinate
this potential standoff? There are some cognitively simple ways out of the
dilemma (see Bullinger et al., 2011b, for the leader-follower strategy that chim-
panzees use), but they always involve one individual incurring disproportion-
ate risk, and so they are unstable in certain circumstances. For example, if there
were very few hares, so that each was highly valued, and hunting stags was
only rarely successful, then the cost/benefit analysis would require that each
individual attempt to make certain that their potential partner was also going
for the stag before they relinquished their hare.
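The structure of this dilemma can be sketched in a few lines of code. The following is only an illustrative sketch, not a model taken from the sources cited: the payoff numbers, the success rate, and the function names are all assumptions chosen for the example. It shows that both "we both hunt stag" and "we both keep our hares" are stable outcomes, and that the confidence one needs in one's partner rises as hares become more valuable or stag hunting less reliable.

```python
from itertools import product

STAG, HARE = "stag", "hare"
ACTIONS = (STAG, HARE)


def payoff(mine, yours, stag_value=4.0, hare_value=1.0):
    """Payoff to the player choosing `mine` when the partner chooses `yours`.

    The numbers are hypothetical; only their ordering matters:
    a shared stag beats a hare, and a hare beats hunting stag alone.
    """
    if mine == HARE:
        return hare_value  # a hare is safe whatever the partner does
    return stag_value if yours == STAG else 0.0  # a lone stag hunter gets nothing


def pure_nash_equilibria(stag_value=4.0, hare_value=1.0):
    """Return the action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(ACTIONS, repeat=2):
        a_ok = all(
            payoff(a, b, stag_value, hare_value) >= payoff(alt, b, stag_value, hare_value)
            for alt in ACTIONS
        )
        b_ok = all(
            payoff(b, a, stag_value, hare_value) >= payoff(alt, a, stag_value, hare_value)
            for alt in ACTIONS
        )
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


def min_confidence_for_stag(stag_value=4.0, hare_value=1.0, success_rate=1.0):
    """Smallest probability that the partner goes for the stag at which giving up
    the hare pays off in expectation. `success_rate` (an assumed parameter) is the
    chance that a joint hunt succeeds. A result above 1 means stag never pays."""
    return hare_value / (success_rate * stag_value)


if __name__ == "__main__":
    print(pure_nash_equilibria())
    print(min_confidence_for_stag())
    print(min_confidence_for_stag(hare_value=3.0, success_rate=0.8))
```

With scarce, highly valued hares and unreliable stag hunts (the last call above), the required confidence approaches certainty, which is the circumstance in which a simple leader-follower strategy becomes unstable.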
It is important here that each of our goals is not just to capture the stag
but, rather, to capture it together with the other. Each of us wanting sepa-
rately to capture the stag (even if this was mutual knowledge; see Searle, 1995)
would constitute two individuals hunting in parallel, not jointly. It is also
important that we have mutual knowledge of one another's goal, that is, that
our respective goals are part of our common conceptual ground. Each of us
may want to capture the stag together with the other, but if neither of us
knows that this is the case, we very likely will not succeed in coordinating
(for all of the reasons outlined by Lewis and Schelling, among others). Thus,
joint intentionality is operative both in the action content of each of our goals
or intentions (that we act together) and in our mutual knowledge, or common
ground, that we both know that we both intend this.
Young children begin engaging with others in ways that suggest some form
of joint goal from around fourteen to eighteen months of age, when they are
still mostly prelinguistic. Thus, Warneken et al. (2006, 2007) had infants of
this age engage in a joint activity with an adult, such as obtaining a toy by
each operating one side of an apparatus. Then, the adult simply stopped play-
ing her role for no reason. The children were not happy about this and did
various things to attempt to reengage their partner. (They did not do this if
her stopping was for a good reason; e.g., she had to attend to something else
[Warneken et al., 2012].) Interestingly, when this same situation was arranged
for human-raised chimpanzees, they simply ignored the recalcitrant partner
and tried to find ways to achieve the goal on their own. Although infants'
reengagement attempts do not suggest necessarily that they have a fully adult-
like joint goal in common ground with their partner, at the very least they
reflect an expectation that, barring obstacles, my partner in this joint activity
is committed enough to reengage after a stoppage, an expectation that, apparently,
chimpanzees in similar activities do not have.
By the time they are three years of age, children provide much more con-
vincing evidence for joint goals because they themselves display commitment
to the joint activity in the face of distractions and temptations. For example,
Hamann et al. (2012) had pairs of three-year-old children work together to
bring rewards to the top of a step-like structure. The problem was that for
one child the reward, surprisingly, became available midway through. Never-
theless, when this happened, the lucky child delayed consumption of her own
reward and persevered until the other got hers (i.e., more than they helped
the partner in a similar situation in which they were acting individually,
without collaboration). Such commitment to the partner suggests that the chil-
dren constructed a joint goal at the beginning that we get the prizes together,
and they made whatever adjustments were necessary to realize that joint goal.
Again, great apes do not behave in this same way. In a similar experiment with
chimpanzees, Greenberg et al. (2010) found no signs of a human-like com-
mitment to follow through on the joint action until both partners received
their reward. (And Hamann et al. [2011] found that at the end of the collab-
orative activity, three-year-old children, but not chimpanzees, were committed
to dividing the spoils equally among participants as well.)
Importantly, when children of this same age have it in their common ground
with a collaborative partner that each is counting on the other to come through
(we are interdependent), they both feel obligated to the other (see Gilbert,
1989, 1990). Thus, Gräfenhain et al. (2009) had preschoolers explicitly agree
to play a game with one adult, and then another adult attempted to lure them
away to a more exciting game. Although two-year-old children mostly just
bolted to the new game straightaway, from three years of age children paused
before departing and took leave, either verbally or by handing the adult the
tool they had been using together. The children seemed to recognize that joint
goals involve joint commitments, the breaking of which requires some kind
of acknowledgment or even apology. No study of this type has ever been
done with chimpanzees, but there are no published reports of one chimpanzee
taking leave from, making excuses to, or apologizing to another for breaking
a joint commitment.
In addition to joint goals, collaborative activities also demand a division of
labor and so individual roles. Bratman (1992) specifies that in joint coopera-
tive activities individuals must mesh their subplans together toward the
joint goal, and even help one another in their individual roles as necessary. In
the Hamann et al. (2012) study cited above, young children stopped to help
their partner as needed. This demonstrates that the partners are attending to
one another and their respective subgoals, and perhaps even attending to the
partner attending to them, and so forth. Indeed, other studies have found
that young children, but not chimpanzees, learn important new things about
the partner's role as they are collaborating. For example, Carpenter et al. (2005)
found that after young children played one role in a collaboration, they could
quickly switch to the other, whereas chimpanzees could not do this (Toma-
sello and Carpenter, 2005). Most important, Fletcher et al. (2012) found that
three-year-olds who had first participated in a collaboration playing role
A then knew much better how to play role B than if they had not previously
collaborated, whereas this was not true of chimpanzees.
Young children are thus beginning to understand that the roles in a collab-
orative activity are in most cases interchangeable among individuals, which
suggests a bird's-eye view of the collaboration in which the various roles,
including one's own, are all conceptualized in the same representational format
(see Hobson, 2004). This species-unique understanding may support an
especially deep appreciation of self-other equivalence, as individuals imagine
different subjects/agents engaging in similar or complementary activities
simultaneously in the same collaborative activity. As suggested in our discus-
sion of great ape thinking, the understanding of self-other equivalence is a
key component enabling various kinds of combinatorial flexibility in think-
ing. (It also sets the stage for a full-blooded appreciation of agent neutrality
encompassing not just self and other but all possible agents, which is a key
feature of cultural norms and institutions, and objectivity more generally,
as we shall see in chapter 4.)
Preschool children are not good models for the early humans we are pic-
turing here because they are modern humans and they are bathed in culture
and language from the beginning. But from soon after their first birthdays,
and continuing up to their third birthdays, they come to engage with others
in collaborative activities that have a species-unique structure and that do
not, in any obvious way, depend on cultural conventions or language. These
young children coordinate a joint goal, commit themselves to that joint goal
until all get their reward, expect others to be similarly committed to the joint
goal, divide the common spoils of a collaboration equally, take leave when
breaking a commitment, understand their own and the partner's role in the
joint activity, and even help the partner in her role when necessary. When
tested in highly similar circumstances, humans' nearest primate relatives, great
apes, do not show any of these capacities for collaborative activities underlain
by joint intentionality. Importantly, young children also seem to have a species-
unique motivation for collaboration, as shown in recent studies in which
children and chimpanzees had to choose between pulling in a certain amount
of food collaboratively with a conspecific or pulling in that same amount of
food (or more or less) in a solo activity. Children very much preferred the col-
laborative option, whereas chimpanzees went wherever there was most food
regardless of opportunities for collaboration (Rekers et al., 2011; Bullinger
et al., 2011a).
Penn et al. (2008) have proposed that what makes human cognition
different from that of other primates is thinking in terms of relations,
especially higher-order relations. To support their claim, they review
evidence from many different domains of cognition: judgments of relational
similarity, judgments of same-difference, analogy, transitive inference,
hierarchical relations, and so forth.
But at the same time, it is true that humans are particularly skilled at
relational thinking (Gentner, 2003). One hypothesis that might explain the
data is that there are actually two kinds of relational thinking. One
concerns the concrete physical world of space and quantities, in which we
may compare various characteristics or magnitudes such as bigger-smaller,
brighter-darker, fewer-greater, higher-lower, and even same-different.
Nonhuman primates have some skills with these kinds of physical relations
and relational magnitudes. What they may not comprehend at all, though
there are few direct tests, is a second type of relation. Specifically, they
may not comprehend functional categories of things defined by their role in
some larger activity. Humans are exceptional in creating categories such as
pet, husband, pedestrian, referee, customer, guest, tenant, and so forth,
what Markman and Stillwell (2001) call role-based categories. They are
relational not in the sense of comparing two physical entities but, rather, in
assessing the relation between an entity and some larger event or process in
which it plays a role.
The obvious hypothesis here is that this second type of relational thinking
comes from humans' unique understanding of collaborative activities with
joint goals and individual roles (perhaps later generalized to all kinds of
social activities even if they are not collaborative per se). As humans
constructed these kinds of activities, they were creating more or less abstract
slots or roles that anyone could play. These abstract slots formed role-
based categories, such as things that one uses to kill game (viz., weapons;
Barsalou, 1983), as well as more abstract narrative categories such as
protagonist, victim, avenger, and so on. A further speculation might be that
these abstract slots at some point enabled humans to even put relational
material in the slots; for example, a married couple can play a role in a
cultural activity. This would be the basis for the kinds of higher-order
relational thinking that Penn et al. (2008) emphasize as especially important
in differentiating human thinking.
In any case, the proposal here is that, at the very least, constructing the
kinds of dual-level cognitive models needed to support collaborative
activities enhanced, if not enabled, human engagement in much broader
and more flexible relational thinking involving roles in larger social realities,
and possibly in higher-order relational thinking as well.
The main point for now is that early humans seem to have created a new
cognitive model. Collaborating toward a joint goal created a new kind of so-
cial engagement, a joint intentionality in which we are hunting antelopes
together (or whatever), with each partner playing her own interdependent
role. This dual-level structure of simultaneous sharedness and individuality
(a joint goal but with individual roles) is a uniquely human form of second-personal
joint engagement requiring species-unique cognitive skills and
motivational propensities. It also has a number of perhaps surprising ramifi-
cations for many different aspects of human cognition that go beyond our
primary focus here (see box 1 for one example).
attention. Underlying this coordination is, once again, some notion of com-
mon ground, in which each individual at least potentially can attend to
his partner's attention, his partner's attention to his attention, and so forth
(Tomasello, 1995). Joint actions, joint goals, and joint attention are thus of a
piece, and so they must have coevolved together.
The current proposal is that the phylogenetic origins of the ability to participate
with others in joint attention (the first and most concrete way in
which young children create common conceptual ground and so shared realities
with others) lie in collaborative activities. This is what Tomasello (2008)
calls the top-down version of joint attention because it is directed by joint
goals. (The alternative is bottom-up joint attention, such as when a loud noise
attracts both of our attention, and we both know it must have attracted the
other's attention as well.) Ontogenetically, young children begin to structure
their joint actions with others via joint visual attention at around nine to
twelve months of age, often called joint attentional activities. These are such
activities as giving and taking objects, rolling a ball back and forth, building
a block tower together, putting away toys together, and reading books to-
gether. Despite specific attempts to identify and solicit such joint attentional
activities with human-raised chimpanzees, Tomasello and Carpenter (2005)
were unable to find any (nor are there any other reliable reports of joint atten-
tion in nonhuman primates).
Just as each partner in a joint collaborative activity has her own individual
role, each partner in joint attentional engagement has her own individual
perspective and knows that the other has her own individual perspective as
well. The crucial point, which will be foundational for all that follows, is that
the notion of perspective assumes a single target of joint attention on which we
have differing perspectives (Moll and Tomasello, 2007, in press). If you are
looking out one window of the house and I am looking out another in the
opposite direction, we do not have different perspectives; we are just seeing
completely different things. We can thus operate with the notion of individu-
ally distinct perspectives only if (1) we are both considering the same thing,
and (2) we both know the other is attending to it differently. If I see some-
thing in one way, and then round the corner to see it in another, this does not
give me two perspectives on the same thing, because I do not have multiple
perspectives available to me simultaneously for comparison. But when two
people are attending to the same thing simultaneously (and it is in their
common ground that they are both doing so), then space is created (to use
Social Self-Monitoring
Early humans living as obligate collaborative foragers would have become
more deeply social in still another way. Although skills of joint intentionality
are necessary for human-like collaborative foraging, they are not sufficient.
One also has to find a good partner. This may not always be overly difficult,
as even chimpanzees, after some experience, learn which partners are good
(i.e., lead to success) and which are not (Melis et al., 2006b). But in addition,
in situations in which there is meaningful partner choice, one must be or at
least appear to be a good collaborative partner oneself. To be an attractive
partner for others, and so not be excluded from collaborative opportunities,
one must not only have good collaborative skills, but also do one's share of
the work, help one's partner when necessary, share the spoils at the end of the
collaboration, and so forth.
And so early humans had to develop a concern for how other individuals
in their group were evaluating them as potential collaborative partners, and
then regulate their actions so as to affect these external social judgments in
positive ways, what we may call social self-monitoring. Other great apes do
that brought the human species into the modern human world, as we shall
see in the next chapter.
honestly, for their not our benefit, is the starting point. The notion of truth
thus entered the human psyche not with the advent of individual intentionality
and its focus on accuracy in information acquisition but, rather, with the
advent of joint intentionality and its focus on communicating cooperatively
with others.4
The second important consequence of this new cooperative way of com-
municating was that it created a new kind of inference, namely, a relevance
inference. The recipient of a cooperative communicative act asks herself: given
that we know together that he is trying to help me, why does he think that I
will find the situation he is pointing out to me relevant to my concerns? Consider
great apes. If a human points and looks at some food on the ground, apes
will follow the pointing/looking to the food and so take it; no inferences
required. But if food is hidden in one of two buckets (and the ape knows it is
in only one of them) and a human then points to a bucket, apes are clueless
(see Tomasello, 2006, for a review). Apes follow the human's pointing and
looking to the bucket, but then they do not make the seemingly straightfor-
ward inference that the human is directing their attention there because he
thinks it is somehow relevant to their current search for the food. They do not
make this relevance inference because it does not occur to them that the human
is trying to inform them helpfully (since ape communication is always
directive), and this means that they are totally uninterested in why the human
is pointing to one of the boring buckets. Importantly, it is not that apes
cannot make inferences from human behavior at all. When a human first sets
up with them a competitive situation and then reaches desperately toward one
of the buckets, great apes know immediately that the food must be in that one
(Hare and Tomasello, 2004). They make the competitive inference, "He wants
in that bucket; therefore the food must be in there," but they do not make the
cooperative inference, "He wants me to know that the food is in that bucket."
This pattern of behavior contrasts markedly with that of human infants.
In the same situation, prelinguistic infants of only twelve months of age trust
that the adult is pointing out to them something relevant to their current
search (they comprehend the informative motive), and so they know immediately
that the pointed-to bucket is the one containing the reward (Behne
et al., 2005, 2012). The mutual assumption of cooperativeness in such situations
is so natural for humans that they have developed a special set of signals
(ostensive signals such as eye contact and addressing the other vocally) by
means of which the communicator highlights for the recipient that he has
some relevant information for her. Thus, as an evolutionary example, suppose
that while we are collaboratively foraging I point to berries on a bush for you,
with eye contact and an excited vocalization. You look and see the bush but,
at first, no berries. So you ask yourself: why does he think that this bush is relevant
for me? This makes you look harder for something that is indeed
relevant, and thus you discover the berries. As communicator, I know that
you, as recipient, are going to be engaging in this process if and only if you see
me as directing your attention cooperatively, and so I want to make sure that
you know that I am doing this. Therefore, I not only want you to know that
there are berries here but also want you to know that I want you to know
this so that you will follow through the inferential process to its conclusion
(Grice, 1957; Moore, in press). By addressing you ostensively, and based on
our mutual expectation of cooperation, I am in effect saying, "You are going
to want to know this," and you do want to know it because you trust that
I have your interests in mind.
The third and final consequence of this newly cooperative way of communicating
was that there now emerged, at least in nascent form, a distinction
between communicative force (as overtly expressed in requestive and
informative intonations) and situational or propositional content (as indicated
by the pointing gesture). (NB: This means that by this time early humans
would have had to control their vocal expressions of emotions voluntarily in
a way that apes do not.) Early humans could now point toward berries in a
bush, with one of two different motives, expressed intonationally: either an
insistent requestive intonation, in the hopes that the recipient would fetch
some berries for her, or a neutral intonation to just inform the recipient of the
berries' location so that she might get some for herself. We thus now have a
clear distinction between something like communicative force and commu-
nicative content: the communicative content is the presence of the berries,
and the communicative force is either requestive or informative. All of this is
implicit, of course, and so we still have some way to go to reach the convention-
alized and so explicit distinction between communicative force and content
that is so important in conventional linguistic communication (see chapter 4).
But the breakthrough here is the relative independence of referential (situational,
propositional) content from the communicator's motives or intentions
for referring attention to it.
And so, early humans' joint collaborative activities created a new motivational
infrastructure for their communication, a cooperative motivation to
inform one another of things helpfully and honestly. This then motivated
recipients to do significant inferential work to find out why the communicator
thought that looking in a certain direction would be relevant for their con-
cerns, which then motivated communicators to advertise when they had some-
thing relevant for a recipient. And the fact that there were now two different
communicative motives possible (requestive and informative) meant that
the situational (propositional) content of the communicative act was starting
to be conceptualized as independent of the particular intentional states of the
communicator.
indicating for you the fact that there is now no predator in the tree. Common
ground and a mutual assumption of relevance (not possible for apes because
they simply do not engage in this kind of cooperative communication)
enable a meeting of minds in the direction of the protruding finger.
Following the analysis in chapter 2, relevant situations are those that present
individuals with opportunities and/or obstacles for reaching their goals and
maintaining their values. Thus, if during our search for fruit I point toward a
distant banana tree, it would never occur to you that I might be pointing out
the presence of the leaves, even if leaves is all that you see at the moment, since
the presence of leaves is in no way relevant to what we are doing. Instead, you
will continue looking until you see, for example, some bananas behind the
leaves, whose presence is highly relevant to what we are doing. Another dimen-
sion of this process is that only new situations are communicatively relevant,
since currently shared situations need not be pointed out. And so, in the
example from above, after the predator left the banana tree I pointed to
the banana tree with the intention to indicate the situation of the predator's
absence, which you readily discerned. How could I intend and you infer preda-
tor absence when the presence of the bananas is also highly relevant? Because
the presence of the bananas was already in our current common ground, and
so me pointing out this situation to you would be superfluous. If I am going to
be helpful, I must point out situations that are new for you, or else why bother.
And so, in human cooperative communication, both communicators and
recipients mutually assume in their common ground that communicators
point out for recipients situations that are both relevant and new.
Perhaps surprisingly, even young infants are skillful at keeping track of the
common ground they have with specific other individuals and using that to
determine relevance in both the comprehension and production of pointing
gestures. For example, Liebal et al. (2009) had a one-year-old infant and an
adult clean up together by picking up toys and putting them in a basket. At
one point the adult stopped and pointed to a target toy, which the infant then
cleaned up into the basket. However, when the infant and adult were clean-
ing up in exactly this same way, and a second adult who had not shared this
context entered the room and pointed toward the target toy in exactly the
same way, infants did not put the toy away into the basket; they mostly just
handed it to him, presumably because the second adult had not shared the
cleaning up game with them as common ground. Infants' interpretations
thus did not depend on their own current egocentric activities and interests,
which were the same in both cases, but rather on their shared experience with
each of the pointing adults. (In another study, Liebal et al. [2010] found that
infants of this same age also produced points differently depending on their
common ground with the recipient.)
Infants in this same age range also use a mutual assumption of newness to
determine what a pointing adult thinks is relevant for them. Thus, Moll et al.
(2006) had eighteen-month-old infants play with an adult and a toy drum.
If a new adult now entered the room and indicated the drum excitedly, the
child assumed he was talking about the cool drum. But if the adult with
whom the child had just been sharing enjoyment of the drum pointed to the
drum excitedly in exactly the same manner, the child did not assume that she
was excited about the drum: how could she be, since it is old news for us?
Rather, children assumed that the adult's excitement must be due to something
new about the drum that they had not previously noticed, and so they
attended to some new aspect, for example, on the adult's side of the drum. In
their production of pointing, infants also use this distinction between shared
and new information. For example, when a fourteen-month-old infant wanted
his mother to put his high chair up to the dining room table, on one occasion
he pointed to the chair (because he and his mother had already shared atten-
tion to the empty space at the table), whereas on another occasion he pointed
to the empty space at the table (because he and his mother had already shared
attention to the chair) (Tomasello et al., 2007a). In both cases the infant wants
the exact same thing (his chair placed at the table), but to communicate
effectively he assumes that the object he and his mother are focused on is al-
ready part of their common ground, and so he points out the aspect of the
situation that she may not have noticed, the new part.6
Engaging in cooperative (ostensive-inferential) communication of this type
requires some new types of thinking. In effect, all three components of the
thinking process (representation, inference, and self-monitoring) must
become socialized.
With respect to representation, the key novelty is that both participants in
the communicative interaction must represent one another's perspective
on the situation and its elements. Thus, the communicator attempts to focus
the recipient's attention on one of the many possible situations (fact-like
representations) immanent in the current perceptual scene (e.g., there are
bananas in the tree versus there is no predator in the tree). The communicative
act thus perspectivizes the scene for the recipient. It also perspectivizes
the elements. For example, if we are building a fire, me pointing out to you the
presence of a log construes that log as firewood. But if we are tidying up the
cave, me pointing out to you the presence of that very same log construes it as
trash. In the object choice task, the communicator is not pointing to the bucket
qua physical object or qua vessel for carrying water, but rather qua location:
I am informing you that the reward is located in there. Cooperative pointing
thus creates different conceptualizations or construals of things. These presage
the ability of linguistic creatures to place one and the same entity under alter-
native different descriptions or aspectual shapes, which is one of the hall-
marks of human conceptual thinking; but it does this without the use of any
conventional or symbolic vehicles with articulate semantic content.
With respect to inference, the key point is that the inferences used in co-
operative communication are socially recursive. Thus, implicit in all of the
foregoing is a kind of backing-and-forthing of individuals making inferences
about the partner's intentions toward my intentional states. In the object choice
task, for example, the recipient infers that the communicator intends that she
know that the food is in that bucket, a socially recursive inference that great
apes apparently do not make. This inference requires in all cases an abductive
leap, something like: his pointing in the direction of that otherwise boring
bucket would make sense (i.e., would be consistent with common ground,
relevance, and newness) if it is the case that he intends that I know where the
reward is. The communicator, for his part, is attempting to help the recipient
to make that abductive leap appropriately. To do this, at least in many situa-
tions, the communicator must engage in some kind of simulation, or think-
ing, in which he imagines how pointing in a particular direction will lead the
recipient to make a particular abductive inference: if I point in this direction,
what inferences will he make about my intentions toward his intentional states?
And then, when making his abductive inference, the recipient can potentially
take into account the communicator's taking into account of what kind of
inference she is likely to make about his communicative intentions. And so
forth.
Finally, with respect to self-monitoring, the key is that being able to oper-
ate in this way communicatively requires individuals to self-monitor in
a new way. As opposed to apes' cognitive self-monitoring, this new way was
social. Specifically, as an individual was communicating with another, he
was simultaneously imagining himself in the role of the recipient attempting
to comprehend him (Mead, 1934). And so was born a new kind of self-monitoring
will not make the normal inference but rather a different one. For example,
Liebal et al. (2011) had an adult and a two-year-old child again tidying up
toys into a large basket. In the normal course of events, when the adult pointed
to a medium-sized box on the floor, the child took this to suggest that she
should tidy up this box into the basket as well. But in some cases the adult
pointed to the box with flashing eyes and a kind of insistent pointing directed
at the child, obviously not the normal way of doing it. The adult clearly intended
something different from the norm. In this case, many children looked at the
adult puzzled but then proceeded to open the box and look at what was in-
side (and tidy it up). The most straightforward interpretation of this behavior
is that the child understood that the adult was anticipating how she would
construe a normal point, which he did not want, and so he was marking his
pointing gesture so that she would be motivated to search for a different in-
terpretation. This is the child thinking about the adult thinking about her
thinking about his thinking.
And so, the kind of thinking that goes on in human cooperative commu-
nication is evolutionarily new in that it is perspectival and socially recur-
sive. Individuals must think (simulate, imagine, make inferences) about their
communicative partner thinking (simulating, imagining, making inferences)
about their thinking, at the very least. Great apes show no signs of making
such inferences, and their failure to comprehend even the simplest acts of
cooperative pointing, for example, in the object choice task (while making non-
recursive inferences in the same task setting), provides positive evidence that
they do not. Human thinking in cooperative communication also involves a
new kind of social self-monitoring, in which the communicator imagines
what perspective the recipient is taking, or will take, on his intentions toward
her intentions, and so imagines how she will comprehend it. In all, what we
have at this point in our evolutionary story of human communication is indi-
viduals attempting to coordinate their intentional states, and so their actions,
by pointing out new and relevant situations to one another. This relies on their
having a certain amount and type of common ground, and it requires, further,
that the interactants make a series of interlocking and socially recursive inferences
about one another's perspectives and intentional states.
Symbolizing in Pantomime
Beyond the pointing gesture, the second form of natural communication that
humans employ is spontaneously generated, nonconventional iconic gestures,
the appropriate common ground, but the unique perspective in each case is
not in any way contained in the protruding finger itself (see Wittgenstein's
[1955] incisive, if cryptic, discussion of this issue). But with iconic gestures,
I would indicate for you the shape, size, or material of a piece of paper, or
whether I want you to write on the paper or throw it away, by depicting each
of these different aspects or actions with different icons. The momentous new
feature of iconic gestures is thus that the different perspectives of things and
situations only implicit in pointing are now expressed overtly in external sym-
bolic vehicles with semantic content.
Relatedly, the vast majority of communicative conventions in a natural
language are category terms. That is, common nouns and most verbs are con-
ventionalized for reference to categories of entities such as dog and bite, which
means that to make reference to a specific dog or instance of biting, we must
do some kind of pragmatic grounding (such as with the or my dog, or the dog
who lives next door in the case of nouns; or tense and aspect markers, as in is
biting or bit, in the case of verbs). Iconic gestures are already category terms,
because they implore the recipient to imagine something like this. (It is possible
that one could iconically gesture an individual as well, for example, by
mimicking her idiosyncratic mannerisms, and so the distinction between
common and proper nouns is at least in principle possible in this modality.)
The categorical dimension is bound up with perspective in the sense that call-
ing someone either Bill or Mr. Smith is not perspectival because these are not
category terms, but calling him a father or a man or a policeman is perspectival
because it puts him under a description, that is, it perspectivizes him differ-
ently on different occasions for different communicative purposes.
Iconic gestures are thus an important step on the road to linguistic conven-
tions in that they are symbolic, with semantic content, and are at least poten-
tially categorical. An interesting fact that reinforces this point is that although
young children produce some iconic gestures from early in development,
they actually go down in frequency over the second year of life as children
begin learning language, whereas pointing increases in frequency during the
same period. One hypothesis is that pointing increases because it does not
compete with language but complements it by performing a different func-
tion. As symbolic vehicles with semantic content, iconic gestures compete with
linguistic conventions, and they lose the competition, for many obvious
reasons, which usurps the need to create spontaneous gestures on the spot,
except in a few exceptional circumstances. If one imagines an evolutionary
Box 2 (continued)
And so the first surprising effect of iconic gestures is that their emergence in
human evolution led to skills of acting out pretend scenarios with and for
others, which may be the basis for humans' creation of all of the imaginary
situations and institutions within which they reside. In addition, to
anticipate our story a bit, it is also reasonable to suppose that the creation of
what Searle (1995) calls cultural status functions, such as being a president
or a husband and pieces of paper standing for (indeed, constituting)
money, has its phylogenetic and ontogenetic roots in pretend play in which
children together anoint a stick as a horse, which gives the stick special
powers, in a manner very similar to anointing a person as a president
(Rakoczy and Tomasello, 2007). If thinking is at base a form of imagining,
then one can hardly overestimate the importance of imagining things for
other people, as embodied in iconic gestures, for the evolution and develop-
ment of uniquely human thinking (Donald, 1991).
But more recently, some theorists have dug more deeply into this connec-
tion. Beginning with the pioneering work of Lakoff and Johnson (1979), it is
well known that humans quite often talk about abstract situations or
entities metaphorically or analogically in terms of concrete spatial relation-
ships. As just a few examples, we talk about putting things into and taking
them out of our lectures, we fall into love, we are on our way to success, or
we are going nowhere in our career, or I am out of my mind, or she is
coming to her senses, and on and on. We are not talking about just surface
metaphors here, but very basic ways of conceptualizing complicated and
abstract situations. Thus, in his follow-up work, Johnson (1987) identified a
number of so-called image schemas that seem to permeate our thinking,
such as containment (in and out of a lecture), part-whole (the foundation of
our relationship), link (we are connected), obstacle (my lack of education gets
in the way of my social life), and path (we are on our way to marriage).
The speculation is thus that in addition to numerous other reasons for space
being important in human cognition, a critically important reason is that at
an early stage in their evolution humans conceptualized many things for
others in their gestural communication in a fictive space with fictive actors
and actions. Basically, the only way to depict many things in spontaneous,
nonconventionalized gestures is by acting out in space the referent objects
and events. And so if we believe that human thinking is intimately tied to
communication (how we have come to conceptualize things for others),
then the fact that we did this for some time in our history by pantomiming
in space may go a long way toward explaining the inordinately important
role of space in human cognition.
Combining Gestures
Great apes do not create new communicative functions by combining their
gestures, their vocalizations, or their gestures and vocalizations together (Liebal
et al., 2004; Tomasello, 2008). But humans do, including young children
from the earliest stages of their communicative development, and including
even children exposed to no conventional language, vocal or signed, at all
(Goldin-Meadow, 2003).
While there is no principled reason why someone could not string together
various pointing gestures (and individuals may do this on occasion), this is
not commonly observed. Beginning language learners combine their earliest
linguistic conventions with pointing or other conventions, and beginning sign
language learners produce iconic or conventional signs in combination with
pointing (as do, again, children exposed to no conventional language at all;
Goldin-Meadow, 2003). As originating context in evolution, one can easily
Second-Personal Thinking
We are trying to get to the full flowering of modern human objective-reflective-
normative thinking in the context of culture and language. We are halfway
there. With the reconstructed early humans we are picturing here, we have
creatures who are not just strategizing how to obtain food or mates in bigger,
better, and faster ways than others, as are great apes, but rather who are attempt-
ing to coordinate their actions and intentional states with others via evolution-
arily new forms of collaborative activity and cooperative communication. They
are not just organizing their actions via individual intentionality; they are also
organizing them via joint intentionality. And this changed the way they imag-
ined the world so as to manipulate it in acts of thinking.
whereas in pointing, the act (protruding finger) would be the same in both
cases, with the common ground of the collaborative activity (whether we are
admiring the local fauna or seeking sustenance) carrying the semantic weight.
Another important feature of iconic gestures is that they are mostly categori-
cal in nature, that is, used to conceptualize or perspectivize things, events, or
situations like this. In choosing what to act out for others in pantomime,
then, communicators construe the situation from a particular perspective cat-
egorically, as opposed to other possible categorical perspectives.
pointing and pantomiming on their own are quite weak communicative vehicles,
inferential leaps of at least some distance are always required to reconstruct
the communicator's communicative intention, so that at least some
help is almost always needed.
And so developed a form of communication in which a communicator
intended that a recipient know something, for her benefit. The recipient un-
derstood this and so, for example, understood that he intends for me to know
that the banana is in that bucket. The communicator, for his part, knew that
the recipient would make such an inference if he helped her to do so by alert-
ing her to the fact that he had such an intention (the Gricean communicative
intention that the recipient notice that the communicator wants her to know
something). This may not be one multiply embedded communicative intention,
as in the Gricean analysis, but rather, as argued by Moore (in press), two
singly embedded intentions: I intend that you notice that this communicative
act is for you, plus I intend that you know that the banana is in that bucket.
Nevertheless, the single embedding in this second intention is already more
than great apes can do, and so it represents a new form of recursive inference
(the production version occurring when the communicator simulated the recipient's
intentional states in order to formulate communicative acts that
would be readily comprehensible for her: not throwing the ball at her, but
rather to her).
constructions with abstract slots in this way, they created for themselves
almost unlimited combinatorial freedom. Schema formation in communi-
cative acts and the parsing of communicative intentions into discrete overt
components represent a significant step in the direction of the kind of in-
ferential promiscuity characteristic of modern human thinking in a con-
ventional language.
Beyond the new possibilities for creating novel, even counterfactual thoughts
via external communicative vehicles, a number of theorists have emphasized
the necessary role of such external vehicles for individuals to reflect on their
own thinking (e.g., Bermudez, 2003). When individuals formulate an overt
communicative act and then perceive and comprehend it as they produce it,
they are, in effect, reflecting on their own thinking (a process that may become
internalized so that we may think about things that we could potentially com-
municate overtly). Because the gesture combinations at this point have only
limited semantic content (e.g., no logical vocabulary and no propositional atti-
tude vocabulary), early humans could reflect only in a highly limited way on
their own thinking.
With the advent of early human collaboration and cooperative communication,
then, the causal inferences of great apes were, like their cognitive representations,
cooperativized. This meant that the communicator's inferences
were about what the situation was from the perspective of the recipient, and
the recipient's inferences were about the communicator's simulations of her
simulating his perspective. Overt combinations of symbols, especially if schematized,
led to the possibility of thinking various new and even counterfactual
thoughts, as well as to the first, rather modest, reflections on one's own
thinking. With all of these new inferential possibilities, then, we are now well
on our way to thinking processes that are truly reflectively reasoned.
Second-Personal Self-Monitoring
Great apes self-monitor their goal-directed behavior, including its psychologi-
cal underpinnings with respect to such things as memory and decision making.
But great apes are not normative creatures. They experience instrumental
pressure, for example, when they have a goal to eat food and they know that
food is available at location X; this implies that they must go to location X.
But this is just the way control systems with individual intentionality work: a
mismatch between goal and perceived reality motivates action. In contrast,
early humans began to self-monitor from the perspective of others and, indeed,
self-regulated their behavioral decisions with others' evaluations in mind. Now
we may talk of something that is socially regulated, that is, socially norma-
tive, albeit only in second-personal (as opposed to agent-neutral) form. There
were two manifestations.
desire, and informative utterances such as "There is some fruit over there" are
public offers of helpful information). Communication of this type could
never be adaptively stable in contexts that were not fundamentally coopera-
tive, and so fully human-like skills of joint intentionality could never evolve
solely in the context of competition.
There can be no doubt that the last common ancestor to humans and other
primates engaged in individual thinking in pursuit of individual goals, mostly
in order to compete with groupmates for valued resources. Along the way, they
attended to situations relevant to those goals. Early human individuals (in
response to a changing feeding ecology) then began to join together with
other individuals dyadically in pursuit of joint goals, and they jointly attended
to situations relevant to that joint goal. Each participant in the collaboration
had her own individual role and her own individual perspective on the situation
as part of the interactive unit. This dual-level structure (simultaneous
jointness and individuality) is the defining structure of what we are calling
joint intentionality, and it is foundational for all subsequent manifestations of
human shared intentionality.
The problem was how to coordinate these collaborative activities as they
became ever more complex, both to negotiate a joint goal and to coordinate
the two different roles. The solution was cooperative communication. Early
humans directed the attention of their collaborative partner to relevant situa-
tions by pointing, which required taking her perspective and simulating her
thinking (i.e., in terms of the abductive leap she might be expected to make
given different possible communicative acts). To comprehend, the recipient
had to take the perspective of the communicator taking her perspective,
which constituted a new form of socially recursive inferring. Early humans'
concern that their partner comprehend them led to social self-monitoring via
the anticipated evaluations of the partner with respect to the comprehensibil-
ity of the communicative act.
The basic cognitive challenge in all of this was to coordinate one's own perspective
with the perspective of one's collaborative partner. And so, as early
humans engaged in the truck and barter of making a living collaboratively,
they began to truck and barter in perspectives with their interactive partners
communicatively (and in their own perspectives reflectively, to some degree),
and this gave human cognitive representation and inference a new kind of flexibility
and power. Now, instead of just their own view on the world, early humans
could also view the world at the same time from the perspective of the
other, which might also include her perspective on my perspective. Early hu-
mans had not just a great ape view from here, but rather a view simultaneously
from here and there.
We do not know precisely who these early humans were, but we may spec-
ulate Homo heidelbergensis some 400,000 years ago, living as loosely struc-
tured bands or pools of recurrently collaborating partners. Of course Homo
heidelbergensis did not engage in modern human forms of fully objective-
reflective-normative thinking. Their thinking was not objective but rather
was still tied to the two second-personal perspectives of I and you. Their
thinking was only weakly reflective because they could express very few of
their intentional states or cognitive operations externally in communicative
vehicles (and so they could act as both producers and comprehenders of only
some limited semantic content). And their thinking was socially normative
only in the sense that they were concerned with how their partner evaluated
their cooperative behavior and comprehended their communicative acts, not
with the group's normative standards. There is thus no question that we are
still some way from modern human collective intentionality and its objective-
reflective-normative thinking. But, we would argue, the in-between step of
early human joint intentionality and its perspectival-recursive-socially moni-
tored thinking was necessary for getting there. It was necessary because the
transition to modern humans was all about creating cultural conventions, and
if these were to be in a cooperative direction (as they almost invariably were),
then some very strong cooperative tendencies had to be already present in the
individuals doing the conventionalizing.
Together, then, early human collaborative activities and cooperative com-
munication represent a kind of second-personal cooperativization of great
ape lifeways and thinking. But these evolutionarily new forms of second-
person social interaction involved joint engagement with specific other per-
sons on specific occasions only, and they did not retain their special character-
istics very far outside of the collaborative activities themselves. And so, despite
the great leap forward represented by this new joint intentional way of living,
communicating, and thinking, the next leap forward will have to take this
cooperativized cognition and thinking and collectivize it by conventionalizing
and institutionalizing (and so normativizing and objectifying) almost
everything.
4
Collective Intentionality
The most cultural of nonhuman animals are undoubtedly the great apes, espe-
cially chimpanzees and orangutans. Observations in the wild have documented
for these two species a relatively large number of population-specific behaviors
that persist in the group over time and that very likely involve social learning
(Whiten et al., 1999; van Schaik et al., 2003). Experimental studies have also
demonstrated some skills of social learning in these two species, for example,
in learning to use novel tools, that very likely are at work in generating their
cultural patterns in the wild (see Whiten, 2010, for a review).
But great ape culture is not human culture. Tomasello (2011) characterizes
great ape culture as mainly exploitive, as individuals socially learn from others
who may not even know they are being watched. Modern human culture, in
contrast, is fundamentally cooperative, as adults actively teach children, altru-
istically, and children actively conform to adults, as a way of fitting in coopera-
tively with the cultural group. The hypothesis is that this cooperative form of
culture was made possible by the intermediate step of early humans' highly
cooperative lifeways and how this transformed great ape social learning into
truly cultural learning. Teaching borrows its basic structure from cooperative
communication in which we inform others of things helpfully, and conformity
is imitation fortified by the desire to coordinate with the normative expectations
of the group. Modern humans did not start from scratch but started from early
human cooperation. Human culture is early human cooperation writ large.
Group Identification
The small-scale, ad hoc collaborative foraging characteristic of early humans
was a stable adaptive strategy, for a while. In the hypothesis of Tomasello
et al. (2012), it was destabilized by two, essentially demographic, factors.
The first factor was competition with other humans. This meant that a
loose pool of collaborators had to turn into a proper social group in order to
protect their way of life from invaders. A loose social grouping of early humans
was under pressure to transform into a coherent collaborative group with
joint goals aimed at group survival (each group member needing the others as
collaborative partners for both foraging and fighting) and division-of-labor
roles toward this end. As with early humans' smaller-scale collaborations,
this meant that group members were motivated to help one another, as they
were all now clearly interdependent with one another at all times: "we" must
together compete with and protect ourselves from "them." Individuals thus
trusting and even conforming; the resulting cumulative effect was that the
"we" became an enduring culture to which "we" (past, present, and future) are
all committed (just as early humans were committed to their ongoing, small-scale
collaborations). Human populations thus became more than a loosely
structured pool of collaborators; they became self-identified cultures with
their own histories. Once again, precisely when this all happened is not
crucial to our story, but the first clear signs of distinct human cultures appear
with Homo sapiens sapiens, that is, modern humans, beginning at the earliest
some 200,000 years ago.
That humans do indeed think of their group as a "we" of interdependent
individuals (that humans identify with their group) is a well-established
psychological fact. Most fundamentally, humans have a marked in-group/out-
group psychology that is, in all likelihood, unique to the species. Much re-
search shows that humans favor their in-group in all kinds of ways, and they
care about their reputation more in their in-group than in any out-groups as
well (Engelmann et al., in press). Moreover, they think of others from other
groups not just as strangers, as do apes and as did early humans, but as mem-
bers of specific out-groups with alien, often despised ways. Perhaps the most
striking phenomenon of group identity is collective guilt, shame, and pride.
Individuals feel guilty, ashamed, and/or proud when an individual of their
group does something noteworthy in basically the same way that they would
if they themselves had done the deed (Bennett and Sani, 2008). In the con-
temporary world, one sees such group identity and collective guilt, shame, and
pride quite clearly in struggles over ethnic identity, linguistic identity, collective
responsibility, and so forth, and even in such frivolous phenomena as fan
support of sports teams. As far as we know, great apes do not have, and early
humans did not have, this sense of group identity at all.
The proposal is thus that with increasing population sizes and competition
among humans, the members of human groups began to think of themselves
and their groupmates (known and unknown, present and past) as participants
in one big, interdependent, collaborative activity aimed at surviving and thriv-
ing in competition with other human groups. Group members were identified
most readily by specific cultural practices, and so teaching and conformity to
the group's lifeways became a critical part of the process. These new forms of
group-mindedness led to what we may call the collectivization of human social
life, as embodied in group-wide cultural conventions, norms, and institutions,
which transformed, one more time, the way that humans think.
perspectives on the world. This process has many implications for human think-
ing, but a prominent one is the understanding of false beliefs (which great
apes clearly do not do; see Tomasello and Moll, in press, for a review). Thus,
we previously invoked something like Davidson's notion of social triangulation
to explain how it is that early humans came to understand that others
have perspectives that differ from their own. But to get to an understanding
of beliefs, including false beliefs, we must have some notion of a generalized
perspective on an objective reality that is independent of any particular per-
spective. Something like this is needed to make the judgment not just that
a belief is different from mine, but that it is wrong since objective reality is
the final arbiter. It is likely that young children begin to think in terms
of multiple different perspectives on things as soon as they participate
in joint attention with its two perspectives during late infancy (Onishi and
Baillargeon, 2005; Buttelmann et al., 2009), and we may hypothesize that this
was the case for early humans as well. But it is not for several more years that
children come to a full-blown understanding of beliefs, including false beliefs,
because they (and so presumably all humans before modern humans) do not
yet understand objective reality.1
are not just statistical but, rather, socially normative, as in "you are expected to
do your part (or else!)." The force of the expectations derives from the fact that
individuals who do not conform to our group's way of doing things often create
disruptions, which should not be tolerated, and indeed, if individuals behave
too differently it signals that they are not one of us (or do not want to be one of
us) and so cannot be trusted. Group-minded individuals thus view nonconformity
in general as potentially harmful to group life in general. The result is that
humans conform to social norms for instrumental reasons (to coordinate successfully),
for prudential reasons (to avoid the group's opprobrium), and in order
to benefit the group's functioning, since nonconformity potentially disrupts
this functioning (a group-minded reason).
Like conventions in general, social norms operate not in second-personal
mode but rather in agent-neutral, transpersonal, generic mode. First, and most
basic, social norms are generic in that they imply an objective standard against
which an individual's behavior is evaluated and judged. In early humans' social
evaluations, individuals only knew who did things ineffectively or noncooperatively,
but now the roles have specific agent-neutral standards (that can
be taught as such). These objective standards come from the mutual under-
standing of how the different functions in particular conventionalized cultural
practices are effected if everyone is to reap the anticipated benefit. Thus, if it is
cultural common ground in the group that when collecting honey the person
smoking out the bees must do so in this particular way, and that if she does
not do it in this way we will all go home empty-handed, then her behavior
may be evaluated relative to this objective standard for job performance.
Social norms are also generic in terms of their source. Social norms emanate
not from an individual's personal preferences and evaluations but, rather, from
the group's agreed-upon evaluations for these kinds of things. Thus, when an
individual enforces a social norm, she is doing so, in effect, as an emissary of
the group as a whole, knowing that the group will back her up. Group-minded
individuals thus enforce social norms because their collective commitment
to a social norm means that they commit not only to following
it themselves but also to seeing that others do, too, for the benefit both of
ourselves and of other group members with whom we are interdependent
(Gilbert, 1983). The typical formulation of individuals enforcing social norms
would be something like, "One cannot do it like that; one must do it like
this," which is of course very similar to the generic mode used in teaching.
(Indeed, norm enforcement and teaching may be two versions of the same
done best in a conventional language; see below). This means that early humans'
concern with being judged is transformed by modern humans into a concern
for one's public reputation and social status. And, critically, reputational status
is more than just a sum of many social evaluations; it is nothing less than a
Searlian status function (see next section) in which my public persona is a reified
cultural product created by the collectivity, who can take it away in a second,
as any scandalized modern politician can attest.
Institutional Reality
In the limit, some conventional cultural practices turn into full-blown institu-
tions. Obviously, the dividing line is fuzzy, but a basic prerequisite is that the
cultural practice is not a solo activity but is in some sense collaborative, with
well-defined, complementary roles. But the key feature distinguishing cul-
tural institutions is that they comprise social norms that do not just regulate
existing activities but, rather, create new cultural entities (the norms are not
regulative but constitutive). For example, a human group might tend to make
decisions about such things as where to travel next, how to set up defenses
against potential predators, and so forth, by simply arguing among themselves.
But if there are difficulties in making decisions, or infighting among several
coalitions, then the group could institutionalize the process into some kind
of governing council. Creating this council would give otherwise normal indi-
viduals abnormal status and powers. The council might then designate a
chief, whom they would empower to do still other abnormal things, like ban-
ish people from the group. The council and chief are thus cultural creations,
and their entitlements and obligations are bestowed upon them by the mem-
bers of the group, who can, in theory, take them away and so turn the council
members and chief back into everyday people again. The roles in institutions
are explicitly agent neutral because, in theory if not in practice, anyone may
play any role.
Searle (1995) has been most explicit about how this process works. First,
obviously, is some kind of mutual agreement or joint acceptance among group
members to designate, for example, an individual as chief. Second, there must
be some kind of symbolizing capacity so as to enact Searle's well-known
formula "X counts as Y in context C" (X counts as chief in the context of
group decision making). Related to this, there should be some actual physical
symbol (something like a leader's headdress or scepter or presidential seal) to
help in marking the new status in a public way. The fact that institutions are
public means that everyone knows that everyone knows about them and can-
not plead ignorance in the face of overt symbolic marking. This is one reason
why new institutions and officials are anointed with their new obligations and
entitlements not just implicitly but explicitly and publicly. One could not do
something bad to a chief wearing his official headdress, right after his official
inauguration, and then plead ignorance of his status. Similarly with formally
written rules and laws: their public nature essentially means that one cannot
break them and expect to be excused by pleading ignorance.
Rakoczy and Tomasello (2007) argue that a simple model for understand-
ing cultural institutions is rule games. Of course one may move a piece of
wood shaped like a horse all around a checkered board in any way one likes.
But if one wants to play chess, then one acknowledges that this horse-shaped
piece is a knight, and one moves a knight only in certain ways, and the other
pieces in other ways, toward the goal of winning the game, where winning is
defined by certain agreed-upon configurations of pieces. The pieces are given
their statuses by the norms or rules, whose existence comes from, and only
from, the explicit agreement of the players. Thus, we would argue that the
ontogenetic cradle of such cultural status functions is young children's joint
pretense when they, for example, designate together a stick to be a snake. In
doing this they are engaging in the fundamental act that creates new statuses
since, we would claim, this designation is a social, public agreement with one's
play partner (see Wyman et al., 2009). Importantly, although the ability to
pretend derives evolutionarily, as we have argued, from the way that early hu-
mans created pretend realities by pantomiming situations for others commu-
nicatively, the normative dimension comes only with the group-mindedness
and collectivity characteristic of modern human cultures.
The most important point for current purposes is that there are in the
modern human world social or institutional facts. These are objective facts
about the world: Barack Obama is president of the United States, the piece of
paper in my pocket is a 20-euro note, and one wins at chess by checkmating one's
opponent. At the same time that they are objective, however, these facts are
observer relative; that is, they are created by individuals in social groups, and
so they may be just as easily dissolved (Searle, 1995). Barack Obama is president
but only as long as we say so, euros are legal tender but only as long as we act
so, and, in theory, the rules of chess may be changed at any time. What is ab-
solutely extraordinary about social facts, then, is that they are both objectively
real and socially created, speaking once again to the power of the objectifi-
cation/reification process. Indeed, if one gives five-year-old children some
objects with almost no instruction, they very quickly create their own rules
for how to play with them, and they then apply these rules both to themselves
and to new players as objective facts: "One must do this first," "It works
like this," and so forth (Goeckeritz et al., unpublished manuscript). As in
adult teaching and norm enforcement, the "must" here implies the guiding
hand of an objective reality, independent of the perspective or wishes of any
particular individual.
who have been communicating their whole lives with hearing parents using
spontaneous iconic gestures, come together and conventionalize some of these
home signs. This often leads to a kind of stylization or shortening of signs
(see Senghas et al., 2004). Thus, the wavy hand for snake danger might be-
come abbreviated to the point of almost no waving. Typically this is because
the recipient can predict what is coming in the communicative situation; for
example, if she is about to turn over a rock, as soon as someone sticks out his
hand she can anticipate the snake-danger gesture. Children and other new-
comers would then just imitate or conform to the abbreviated hand-out (with
no waving) gesture to direct attention to snake-danger situations.2 Powerful
skills of imitation and conformity thus undermine iconicity in communica-
tion, as iconicity is not necessary in a group with cultural common ground
about what gesture to use for communicating about certain situations conven-
tionally. Communicative conventions can thus become arbitrary.
The implications of this conventional-arbitrary way of doing things for the
individual and her processes of thinking were, needless to say, momentous.
For one thing, children were now born into a group of people using a set of
communicative conventions that their ancestors had previously found useful
in coordinating their referential acts, and everyone was expected to acquire
and use exactly these conventions. Individuals thus did not have to invent
their own ways of conceptualizing things; they just had to learn those of oth-
ers, which embodied, as it were, the entire collective intelligence of the entire
cultural group over much historical time. Individuals thus inherited myr-
iad ways of conceptualizing and perspectivizing the world for others, which
created the possibility of viewing one and the same situation or entity simul-
taneously under different construals as, for example, berry, fruit, food, or
trading resource. The mode of construal was not due to reality, or even to the
communicator's goals, but rather to the communicator's thinking about how
best to construe a situation or entity so that a recipient would most effectively
discern his communicative intention.
In addition to this fundamentally new form of conventional/normative
and perspectival cognitive representation, communicating conventionally with
arbitrary devices also creates, or at least facilitates, two other new processes of
cognitive representation. The first is that the arbitrariness leads to a higher level
of abstractness. Thus, when gestures are purely iconic, the level of abstraction
is typically low and local. For example, with spontaneous iconic signs, open-
ing a door is pantomimed in one way whereas opening a jar is pantomimed
in the cultural common ground of the group that everyone knows all of the
alternatives and so will be making inferences about a communicators choice
among them.
And so, with the advent of communicative conventions, we now have some
new forms of conceptualization. Modern humans inherit a set of commu-
nicative conventions in their cultural common ground with others in the group,
and the use of these conventions is normatively governed, in the sense that
deviance from them puts one outside the cultural practice. The arbitrariness
of communicative conventions means that they may be used to conceptual-
ize situations and entities of almost unlimited abstractness, including role-
relational, thematic, and narrative schemas. And with communicative con-
ventions, we now have collectively known inferential connections among
conceptualizations, both formal and pragmatic, which were not possible in
the same way with natural gestures.
Over the past few decades, a handful of great apes have been raised by humans
and taught some form of human-like communication. They end up doing very
interesting things, but it is not clear in which ways they are human-like and in
which ways they are not. With particular regard to linguistic constructions,
there is no doubt that apes can combine their signs, sometimes creatively, but
they do not seem to have anything resembling human constructions (even
though they are perfectly capable of schematizing conceptual content in
general). Why is this? To help answer this question, here are some examples of
the kinds of utterances they produce, with either manual gestures or human-
provided visual symbols (not resembling their referents):
The first thing to note is that these are all requests, reflecting the fact that
systematic studies have found that over 95% of the communicative acts
produced by these individuals are some form of imperative (and the other
5% are questionable; Greenfield and Savage-Rumbaugh, 1990, 1991; Rivas,
2005). This is because no matter how they are trained by humans, great apes
will not acquire a motive to simply inform others of things or share
information with them (Tomasello, 2008). And in strictly imperative
communication, there is little functional need for all the complexities of
human linguistic communication (prototypically, no subject, no tense, etc.).
Box 3 (continued)
of syntax that are geared at making the utterance comprehensible to the
recipient, a key part of the cooperative motive. For example:
• They do not ground their acts of reference for the listener to help
  them identify the referent. That is to say, they do not have noun
  phrases with things like articles and adjectives that help to specify
  which ball or cheese is wanted, for example. Nor do they have any
  kind of markers of tense that would indicate which event, as
  indicated by when it occurred, they intend to indicate.
• They do not use second-order symbols such as case markers or word
  order to mark semantic roles and so to indicate who is doing what to
  whom in the utterance. Communicators do not need this information;
  it is provided to make sure that the listener understands the role of each
  participant in the larger situation or event being communicated about.
• They do not have constructions or other devices for indicating for
  listeners what is old versus new versus contrasting information. For
  example, if you adamantly expressed that "Bill broke the window," I
  probably would correct you by using a cleft construction and say
  "No, it was FRED that broke the window." Apes do not have such
  constructions.
• They do not choose constructions based on perspective. For
  example, I might describe the same event either as "I broke the vase"
  or "The vase broke," based on your knowledge and expectations and
  my communicative intentions, whereas linguistic apes have not
  learned constructional alternatives of this type.
• They do not specifically indicate in their utterances their communicative
  motive (why should they, since it is always requestive?) or anything
  of their epistemic or modal attitudes toward the referential situation.
The key theoretical point is that, beyond just supplying ordering preferences for
utterances, human linguistic constructions are created with adaptations for the
recipient's knowledge, expectations, and perspective in mind. And even very
simple constructions like noun phrases require adaptations to the recipient's
knowledge, expectations, and perspective. Humans also conventionalize
expressions of motives and epistemic and modal attitudes in their constructions.
Call all of this the pragmatic dimension of grammar, and call it uniquely human.
let us suppose that on my way back from a hunting trip I see gazelles drinking
at watering hole no. 2, leading me to infer that their preferred watering hole
no. 1 is currently dry (given the recent dry weather). Back at the home base,
you inform me that you are headed to watering hole no. 1 to fetch water. I
want to inform you that it very likely does not have water, but I do not want
to just state as fact "It does not have water" since I am not certain. Presumably
the first marker of speaker uncertainty used in such situations was some
involuntary facial expression (see above). But then humans conventionalized
ways of indicating doubt, for example, by saying something like, "Maybe it
does not have water" or "I think it does not have water." Interestingly, young
English- and German-speaking children first use words for thinking not to
indicate a specific mental act of thinking but rather to express their uncertainty
in the same way as "maybe" (thus, "I think it does not have water" means
"maybe it does not"; Diessel and Tomasello, 2001). Only later do they make
explicit reference to third-person mental happenings. And so one hypothesis
is that it was the demands of discourse that led humans to begin talking explic-
itly about mental states, and they did not do this across-the-board initially, but
only for their own epistemic attitudes toward propositional contents. Later,
they came to refer to the mental states of anyone and everyone, including both
others and the self, with the exact same set of communicative conventions.
Once humans could make explicit reference to intentional states, they could
think reflectively about them in some new ways.
A second set of cognitive processes often in need of explication is the
communicator's logical inferring processes. These include most prominently
those indicated by "and" and "or," various kinds of negation (e.g., "not"), and implication
("if . . . then . . ."). For example, in response to pressure from the recipient
in argumentative discourse, the speaker requires terms such as these to make
explicit his reasoning processes. And so, analogous to communicative pres-
sure in normal discourse, logical pressure in argumentative discourse forces
disputants to make explicit in language the logical operations that until that
time were only procedural and not representational at all. One can imagine a
first gestural/iconic step in which, for example, "or" is expressed in some kind
of pantomiming in which one offers someone either this object (held out
with one hand) or that object (held out with the other). An "if . . . then . . ."
implication could be acted out in pantomiming such everyday social interac-
tions as threats and warnings (if X . . . then Y). But, as always, symbolizing
these logical operators in linguistic conventions would make them much
more abstract and powerful and, once again, much more readily available
for self-monitoring and self-reflection.
Third, speakers are often forced to make explicit some of the background
assumptions and/or common ground to help the recipient to comprehend. For
example, assume we are foraging together for honey, a cultural practice with
which we are both very familiar from our cultural common ground. The
knowledge we share about this practice (what kind of hive we are looking
for, the height in the tree we should scan, the tools we will need, the container
we will need for transport, and so forth) directs many of our activities. Thus,
if you go off and start picking and weaving together leaves, I wait for you
patiently as we both know that a vessel will be needed for transport. But this
shared knowledge is all implicit in our (cultural) common ground. An early
human might make this knowledge overt by pointing out to his partner the
presence of some appropriate leaves. But now imagine that I, as a modern human,
express my intention that you notice the leaves' presence by means of some
shared communicative conventions: "Look, there are some good leaves over
there." This draws your attention to the leaves in a much more explicit way,
but there is still room for misunderstanding (good for what?). So perhaps you
look over at the leaves but draw a blank. Depending on my assessment of what
you are not comprehending, I might say, "It's banyan leaves," or "We are going
to need a vessel," or "We need banyan leaves to make the vessel," or whatever.
I am making explicit for you the reason I am directing your attention to the
leaves' presence (which I erroneously thought you could infer from our common
ground), and, in the process, making explicit the bases for my own thinking
for communicating. Once more, this makes it possible for me to reflect on
my thoughts and their connections in a way that I could not when they were
only an implicit part of our common ground.
And so with modern humans such things as intentional states, logical
operations, and background assumptions could be expressed explicitly in a
relatively abstract and normatively governed set of collectively known lin-
guistic conventions. Because of the conventional and normative nature of
language, new processes of reflection now took place not just as when apes
monitor their own uncertainty in making a decision, and not as when early
humans monitor recipient comprehension, but rather as an objectively and
normatively thinking communicator evaluating his own linguistic conceptu-
alization as if it were coming from some other objectively and normatively
thinking person. The outcome is that modern humans engage not just in in-
dividual self-monitoring or second-personal social evaluation but, rather, in
fully normative self-reflection.
believe together, that either of us has any standing to demand that one another
reason logically.
Such cooperative argumentation, as we may call it, may be modeled in game
theory as a "battle of the sexes": our highest goals are collaborative (we will
hunt together under all circumstances because otherwise there is zero hope of
success), but within that cooperative framework we each argue our case.
Critically, in this context, neither of us wants to convince the other if we are
in fact wrong about the location of antelopes; each would rather lose the
argument and eat tonight than win the argument and go hungry. And so a
key dimension of our cooperativeness is that we both have agreed ahead of
time, implicitly, that we will go in the direction for which there are the best
reasons. That is what being reasonable is all about.
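The coordination structure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two directions ("east" and "west"), the payoff numbers, and the function names are hypothetical and do not come from the text. The sketch captures just the battle-of-the-sexes shape of the situation, in which hunting together in either direction beats splitting up, and it leaves out the further point that neither hunter wants to prevail if he is in fact wrong about where the antelopes are.

```python
from itertools import product

# Hypothetical payoffs (illustrative numbers only, not from the text).
# Each hunter picks a direction; the tuple is (payoff to A, payoff to B).
ACTIONS = ("east", "west")
PAYOFFS = {
    ("east", "east"): (3, 2),  # A's proposal prevails, both hunt together
    ("west", "west"): (2, 3),  # B's proposal prevails, both hunt together
    ("east", "west"): (0, 0),  # they split up: zero hope of success
    ("west", "east"): (0, 0),
}


def is_nash(profile):
    """Return True if neither hunter gains by switching direction alone."""
    a, b = profile
    pay_a, pay_b = PAYOFFS[profile]
    a_cannot_improve = all(PAYOFFS[(alt, b)][0] <= pay_a for alt in ACTIONS)
    b_cannot_improve = all(PAYOFFS[(a, alt)][1] <= pay_b for alt in ACTIONS)
    return a_cannot_improve and b_cannot_improve


def pure_nash_equilibria():
    """List every pure-strategy profile from which no one wants to deviate."""
    return [p for p in product(ACTIONS, ACTIONS) if is_nash(p)]


if __name__ == "__main__":
    print(pure_nash_equilibria())
```

With these assumed payoffs, the only stable outcomes are the two in which the hunters go in the same direction; which of the two they settle on is exactly what the argument over reasons is for.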
An appeal to "best" reasons invokes what Sellars (1963) calls "common
standards of correctness and relevance, which relate what I do think to what
anyone ought to think." Our cooperative argumentation in the context of
joint or collective decision making is thus premised on a shared metric that we
both use in determining which reasons are indeed best. There have thus
arisen social norms that govern cooperative argumentation in group decision
making, specifying, for example, that reasons based on direct observation trump
reasons based on indirect evidence or hearsay. An even deeper, conceptual
point is that to be in an argument in the first place means to accept as infrastructure
certain rules of the game, namely, the group's social norms for
arguing cooperatively. This is the difference between a street fight and a boxing
match. The early Greeks made explicit some of the most important of these
norms of argumentation in Western culture, for example, the law of noncon-
tradiction (a disputant cannot hold the same statement to be both true and
false at the same time), and the law of identity (a disputant cannot change the
identity of A during the course of the argument). Even before the Greeks, we
can imagine that individuals who, for example, held the same statement to
be both true and false at the same time were either ignored by others or else
exhorted to argue rationally. The cooperative infrastructure was thus deci-
sive in determining what it means to reason at all. The natural world itself
may be totally "is" (the antelopes are where they are). However, the culturally
embedded discourse processes by which we determine what that "is" in
fact is (in "the space of reasons," to use Sellars's evocative phrase) are fraught
with "ought."
tive normativity in which the individual regulates her actions and thinking
based on the group's normative conventions and standards, what some have
called "normative self-governance" (e.g., Korsgaard, 2009).
Agent-Neutral Thinking
The second-personal thinking of early humans was aimed at solving coordina-
tion problems presented by direct collaborative and communicative interactions
with specific others. Modern humans faced different kinds of coordination
problems, namely, those involving unknown others, with whom one had little
or no personal common ground. The solution on the behavioral level was the
creation of group-wide, agent-neutral conventions, norms, and institutions,
to which everyone expected everyone, in cultural common ground, to con-
form. To coordinate with others communicatively in such a world, human
communication had to be conventional as well, based again not on personal
but rather on cultural common ground. And to be a good communicative
partner in conventional communication (especially, to be a cooperative participant
in shared decision making), modern humans needed to express their
reasons for thinking in certain ways explicitly in language and then simulate
the cultural group's normative judgments of the intelligibility and rationality
of those linguistic acts and reasons. Modern humans participate not only in
joint intentionality with other individuals but also in collective intentionality
with the entire cultural group.
Representing Objectively
Early humans cognitively represented to themselves various situations and
entities simultaneously from more than one perspective, and they then indi-
cated or symbolized particular perspectives on those situations and entities
for others in their deictic and iconic communicative acts. Modern humans
then began collaborating and communicating with sometimes unfamiliar
others structured by agent-neutral conventions, norms, and institutions, so that
the cognitive models they were building and the perspectives they were simu-
lating concerned not just particular others but, rather, some kind of generic
other or, perhaps, the group at large. The linguistic conventions individuals
were born into embodied the way that the group as a whole, from many years
past, perspectivized and schematized experience, so that this way seemed inevi-
table. This new way of operating socially led to cognitive representations with
three important new features.
Reasoning Reflectively
The inferences of the common ancestor to humans and great apes were simple
causal and intentional inferences. The inferences of early humans were recur-
sively structured, enabling them to produce and interpret cooperative com-
municative acts comprising nothing but a protruding finger. But now, the
linguistic communication of modern humans opened up whole new vistas of
inference and reasoning. We now have such things as formal and pragmatic
inferences, and external communicative vehicles can be reflected upon by the
communicator from an objective and normative perspective. And the giving
of reasons and justifications to others and to the self in internal reasoning
now serves to connect up an individual's various conceptualizations into a
single inferential web.
is the fact that we all know collectively that we all know the linguistic options
available to a communicator, which leads to the kind of pragmatic inferences
that Grice (1975) made famous: if I refer to someone as an acquaintance,
that almost certainly means that we are not friends, because if we were friends
I would have used the word friend. These implicatures and corresponding
inferences are possible because, and only because, the options available are
part of the group's cultural common ground, so that we can wonder together
why I made the choice that I did. Conventional linguistic communication
thus makes possible powerful new kinds of inferences.
In addition, linguistic communication, and the arbitrary nature of lin-
guistic conventions, enabled modern humans to express explicitly in lan-
guage some conceptualizations that could not be expressed easily, if at all,
in the natural gestures of early humans, for example, intentional states and
logical operations. Based on the hypothesis that one can reflect on one's
thinking only as it is expressed in external behavior directed at another
(because only then can one play the other's role and attempt to comprehend
it from her perspective), linguistic communication now made available to
modern humans many new conceptualizations about which they could think
reflectively. Importantly, as modern humans thought about their own think-
ing reflectively, at least in some situations, they did not do so merely from
their own perspective, or even that of the other, but from a more objective
perspective.
Normative Self-Monitoring
Early humans engaged in what we have called cooperative self-monitoring,
regulating their collaborative activities by the evaluative reactions of specific
partners; and communicative self-monitoring, regulating their communica-
ways but also attempt to assess ahead of time whether those are good goals to
pursue or good decisions to make or good reasons to have, a clearly extra
layer of reflection and evaluation. And the normative judgment here is not
simply mine alone, nor that of a specific other partner, but rather a judgment
about whether that would be a good goal or decision or line of reasoning for
any rational person, that is, for anyone from our group who does things the
way that we do them.
Modern humans thus operate with the social norms of the group as inter-
nalized guides to both action and thinking. This means that in their collab-
orative interactions modern humans conform to the collectively accepted
ways of doing things, based on norms of cooperation, and in their communi-
cative interactions they conform to the collectively accepted ways of using
language and also linguistically formulated arguments, based on the group's
norms of reason.
argumentation, modern humans made explicit the reasons for their assertions,
thus connecting them in an inferential web to their other knowledge, and
then this social practice of reason-giving was internalized into fully reflective
reason. And the self-monitoring of modern humans for the first time reflected
not just their expectations about the second-personal evaluations of specific
others but, rather, their expectations about the normative evaluations of us
as a cultural group. Given all of these new ways of behaving and thinking, the
crack in the human experiential egg now became a veritable chasm: the individual
no longer contrasted her own perspective with that of a specific other
(the view from here and there); rather, she contrasted her own perspective with
some kind of generic perspective of anyone and everyone about things that
were objectively real, true, and right from any perspective whatsoever: a
perspectiveless view from nowhere.
And so, if from a moral point of view, cooperation always entails some
kind of effacing of one's own interests in deference to those of others or the
group, then, from a cognitive point of view, cooperative thinking always
entails some kind of effacing of one's own perspective in deference to the more
objective perspective of others or the group (Piaget, 1928). Thus, in coop-
erative communication I must always honor the perspective of my recipient,
and in cooperative argumentation I must be committed to accept the reasons
and arguments of others if they are better than my own (by the yardstick of
our agreed-upon normative criteria of rationality, which include our agreed-upon
objective reality) and so to abandon mine for theirs. In the words of
Nagel (1986, p. 4): "Objectivity is a method of understanding . . . To acquire
a more objective understanding of some aspect of life or the world, we step
back from our initial view of it and form a new conception which has that
view and its relation to the world as its object. . . . The process can be repeated,
using a still more objective conception." In this formulation, objectivity is
the result of being able to think of things from ever wider perspectives and
also recursively, as one embeds one's perspective within another, more encompassing
perspective. In the current view, more encompassing means simply
from the perspective of an ever wider, more transpersonally constituted
generic individual or social group: the view from anyone.
The monumental second step on the way to modern humans thus took the
already cooperativized and perspectival thinking of early humans and collec-
tivized and objectified it. Whereas early humans internalized and referenced
the perspective of what Mead (1934) calls the significant other, modern
Human cognition and thinking are much more complex than the cognition
and thinking of other primates. Human social interaction and organization are
much more complex than the social interaction and organization of other pri-
mates as well. It is highly unlikely, we would argue, that this is a coincidence.
Complex human cognition is of course responsible for complex human
societies in the sense that human societies would fall apart if human-like cog-
nition were not available to support them. But this cognition-to-society
causal link is not a plausible direction for an account of evolutionary origins.
For that direction of effect, there would need to be some other behavioral
domain in which powerful cognitive skills were selected, and then those
skills were somehow extended to solving social problems. But it is not clear
what other behavioral domain that might be, given that we are trying to explain
the many particularities of cognitive skills supporting humans' unique
forms of collaboration and communication, including in the end such things
as cultural conventions, norms, and institutions. It seems highly unlikely
that cognitive skills adapted for, say, individual tool use or the tracking of
prey could be exapted in this way for such complex cooperative enterprises.
And so, in the current view, the most plausible evolutionary scenario is
that new ecological pressures (e.g., the disappearance of individually obtain-
able foods and then increased population sizes and competition from other
groups) acted directly on human social interaction and organization, leading
to the evolution of more cooperative human lifeways (e.g., collaboration for
foraging and then cultural organization for group coordination and defense).
Coordinating these newly collaborative and cultural lifeways communica-
tively required new skills and motivations for co-operating with others, first
via joint intentionality, and then via collective intentionality. Thinking for
co-operating. This, in broadest possible outline, is the shared intentionality
hypothesis.
But our evolutionary story has taken many more detailed twists and turns
as we have attempted to account, in detail, for the many different aspects of
uniquely human thinking as they relate, in detail, to the many different aspects
of uniquely human collaboration and communication. Because there are no
other contemporary evolutionary stories with exactly this focus, we have thus
far made scant reference to other theories. But there are a number of other
contemporary accounts of the evolution of uniquely human cognition and/or
uniquely human sociality in general, and a broad survey of these will help to
better situate the shared intentionality hypothesis within the current theo-
retical landscape.
can be applied simultaneously in the same situation, which creates some nov-
elties, and in addition, humans evolved a disposition to imagine and rehearse
action plans creatively in working memory, a capacity that enables all of the
other modules to interact with one another more flexibly.
Mithen (1996) makes a systematic attempt to provide a modular theory of
human cognitive evolution closely tied to the artifactual record. He makes a
distinction between early humans and modern humans, noting that early
humans were relatively limited cognitively, using the same tools everywhere
over many millennia, with no symbolic behavior, and so forth. He explains
this limitation by positing that early humans, like most animals, possessed
several different cognitive modules that were not integrated with one another.
Specifically, they had a technical intelligence with tools, a natural history in-
telligence with animals, and a social intelligence with conspecifics, none of
which interacted with any other module. With modern humans, we get symbolic
capacities and language, which enabled the modules to work together, creat-
ing the kind of cognitive fluidity associated with modern human thinking.
What all of these more specific evolutionary psychology accounts have in
common is the proposal that nonhuman primates, and perhaps even early
humans, are dominated by highly compartmentalized modules, and this
means that their cognitive processes are relatively narrow and inflexible. In
contrast, human cognition is broader and more flexible because humans have
modules, including some novel modules, that somehow work together or
communicate with one another (via metarepresentation, symbols and lan-
guage, or some horizontal processes such as creative imagination in working
memory). This means that nonhuman animals (and perhaps early humans)
operate only with system 1 intuitive inferences, whereas modern humans op-
erate in addition with system 2 reflective inferences based on actual thinking.
But this view (a kind of strict view of modularity for all animals except
modern humans) is not compatible with the data on great ape thinking at
all. There is no evidence that great apes operate only with highly compart-
mentalized modules, and indeed, chapter 2 presented evidence that they do
not; they often use system 2 processes to think before they act in both the
physical and social domains, in both cases using abstract representations,
simple inferences, and protological paradigms (structured by physical causal-
ity or social intentionality). In our view, then, these attempts to both be true
to modularity theory and simultaneously make room for human flexible
thinking simply do not accord well with available empirical evidence.
It is also disconcerting to see how different are the specific modules that
the different theorists posit; indeed, they often operate at very different levels
of analysis (compare snake detection and face recognition with technical
intelligence and normative reasoning). Perhaps a more systematic and com-
prehensive list could be compiled, but the real problem is that modularity
theorists do not often ask the question of origins beyond seeking a single
evolutionary function for a module (what it is good for). It is well known
that in evolution new functions are often subserved by existing structures,
perhaps put together in some new ways. Thus, for example, the proposed mod-
ule for normative reasoning would almost certainly have been constructed
out of earlier skills and motivations for such things as making individual
inferences, conforming to others and the group, evaluating others and being
sensitive to their evaluations, cooperative communication, and other skills.
Looking at the architecture of contemporary human cognition for a single
evolutionary function (via reverse engineering) misses out on the dynamics
of evolution, the way that new functions are created by cobbling together
already existing structures as evolution proceeds. This dynamic means that
there are deep relations among many cognitive functions via common de-
scent. A complex adaptive behavior such as collaborative foraging, for example,
may comprise many component processes such as fast running, accurate throwing,
and skillful tracking (not to mention skills of joint intentionality) that
may each have other adaptive functions, either on their own or in other complex
behaviors. Once we get past narrowly defined problems with immediate
and urgent fitness benefits (e.g., mate selection and predator detection), this
hierarchical structure is crucial for understanding how different cognitive
skills interrelate with others.
Our preference would thus be not to use the word module, which suggests
a static architectural or engineering perspective. Rather, we would prefer the
word adaptation, which suggests dynamic evolutionary processes. Adapta-
tions may be quite narrowly targeted, and we ourselves have invoked the etho-
logical notion of adaptive specialization (e.g., spiders building webs), which is
very close in spirit to the notion of a module. But other adaptations may apply
more broadly, either initially or through extensions over time. For example,
great apes do not seem to be specifically adapted for tool use, as neither goril-
las nor bonobos (and only some orangutan populations) use tools in the wild.
But all great apes use tools, and quite adeptly, in appropriate circumstances
in captivity. The adaptation thus seems to be more for causal understanding
in the manipulation of objects, which can then be applied in the use of tools
if the need arises for an individual (in contrast to some bird species, which
seem to be specifically adapted for tool use).
Even more generally along these lines, the question arises whether there
exist any truly domain-general horizontal abilities. (The metaphor is that
specific content, like space or quantity, is vertical, whereas general processes
like representation, memory, and inference are horizontal.) Some modularity
theorists believe that seemingly horizontal abilities do not represent single
domain-general processes; rather, each module has its own computational
procedures that have nothing to do with those in other modules. Our view is
that this again misses the importance of hierarchical organization in complex
adaptations. Processes such as cognitive representation, inference, and self-monitoring
may have evolved initially, in some ancient vertebrate ancestor,
in the context of some fairly narrow behavioral specializations. But as new
species have evolved, in the face of new and complex problems, these pro-
cesses have been co-opted for use as subcomponents, as it were, in many dif-
ferent adaptations, some quite broad. This co-option process is especially
important in highly flexible organisms such as great apes and humans, and
indeed, wide-ranging occurrence of this process is a key component of cogni-
tive flexibility.
Finally, we must also note that human skills and motivations for shared
intentionality do not, in our view, represent typical cognitive adaptations
occurring within individuals. Early humans had their own individual cognitive
skills, but then they began attempting to coordinate with others toward joint
goals with joint attention. Solving these coordination problems did not end
the matter, however; rather, it opened up a whole new way of operating for
early humans, especially the possibility of communicating referentially about
basically everything in their experience with modified processes of represen-
tation and inference. The emergence of shared intentionality thus effected a
restructuring, a transformation, a socialization, of all the processes involved
in individual intentionality and thinking, an unusual, if not unprecedented,
evolutionary event. This does not mean that humans do not also operate with
many system 1 processes impervious to this transformation; they do, quite
often making gut decisions about event probabilities, moral dilemmas, dangerous
situations, and so forth (see, e.g., Gigerenzer and Selten, 2001; Haidt,
2012). But still, humans may consider and even communicate about all of these
things in their system 2 thinking, even if this does not affect their eventual
1. Competition with groupmates led to sophisticated forms of nonhuman
primate social cognition and thinking without human-like forms of
sociality or communication. Basic mammalian sociality is simply the
motivation to live in a social group. Within-group competition engenders
social relations of dominance and, along with other factors, affiliation.
Great apes, and perhaps
other primates, engage in more-than-average social competition and so
have developed skills for understanding the goals and perceptions of others
as a way of flexibly predicting their behavior. They also are especially skillful
at manipulating physical causes in tool use and the intentional states of
others in gestural communication. Great apes collaborate (that is, actually
work together) very little, and when they do, it is best characterized as
2. Early human collaborative activities and cooperative communication
employing new forms of social coordination led to new forms of human
thinking without either culture or language. For more than 5 million
of the 6 million years that humans have been on their own evolutionary
pathway, their thinking was mainly ape-like (though their skills at making
tools may have enhanced their causal understanding). But then there was a
change in ecological conditions that forced some early humans to begin col-
laborating in new ways to obtain food. This made individuals interdependent
with one another in an especially urgent way. In mutualistic activities such as
these, communication could become fully cooperative since it was in the in-
terest of each individual to coordinate with others toward their mutualistic
goal and to inform them of things useful to them in their role. And so were
born early humans who could survive and thrive only by collaborating and
communicating cooperatively with social partners.
Collaborative foraging created a number of difficult problems of social
coordination. The basic solution was to form with others joint goals to do
things together, to which both participants were jointly committed. This cre-
ated the dual-level structure: joint goals with individual roles, along with
joint attention with individual perspectives. In the cooperative communication
used to coordinate individuals' perspectives (and so actions) within these
activities (initially via pointing and pantomiming), the communicator was
committed to cooperation in the form of an honest informative act, and
communicator and recipient collaborated to ensure successful communica-
tion. The recipient followed the pointing gesture, or imagined the referent of
the pantomime, and then made an abductive inference from that to what,
given their common ground, the communicator intended to communicate.
The communicator, for his part, knew that this was what the recipient would
be doing and so attempted to conceptualize the situation for her in his choice
of referents (anticipating her perspective of his perspective on her perspective,
recursively) in a way that facilitated her abductive leap. Moreover, in
the special context of joint decision making, early human communicators
sometimes pointed out relevant situations to their partner that (implicitly)
provided reasons for them to jointly decide on a certain course of action based
on their common ground understanding of the causal and/or intentional
implications of the indicated situation.
To do all of this effectively required thinking of a type not possible for great
apes and their individual intentionality: the communicator had to make
judgments not only about his common conceptual ground with recipient but
also about which aspects of the current situation the recipient would find
both relevant and new and so what kind of abductive inference she would
make given different possible referential acts. Doing this led to what we have
called second-personal thinking, comprising (1) cognitive representations that
are perspectival and symbolic, (2) inferences that are recursively structured to
include intentional states within intentional states, and (3) self-monitoring that
incorporates the imagined social evaluation and/or comprehension of the col-
laborative and/or communicative partner. These changes all served to basically
cooperativize great ape individual intentionality into a kind of second-personal
joint intentionality and thinking.
And so, early humans' joint intentionality and second-personal thinking
represented a radical break, a new type of relation between sociality and think-
ing. The cooperative and recursive sociality of early humans created an adaptive
context requiring individuals, if they were to survive and thrive, to coordinate
their actions and intentional states with others, which required them to co-
operativize their cognitive representations, inferences, and self-monitoring,
and so the processes of thinking that these enabled. Importantly for theories
of the relation of sociality and thinking, this new type of second-personal
thinking took place without conventionalization, culture, or language or any-
thing else going beyond direct, second-personal, social engagements.
3. Modern human processes of conventionalized culture and language led
to all of the unique complexities of modern human thinking and reasoning.
Modern humans
faced some new social challenges due to increases in group sizes accompanied
by competition among groups. For survival, modern human groups had to
begin operating as relatively cohesive collaborative units, with various division-
of-labor roles (see Wilson, 2012). This created the problem of how individuals
could coordinate with in-group strangers, with whom they had no personal
common ground. The solution was the conventionalization of cultural prac-
tices: everyone conformed to what everyone else was doing, and expected
others to conform as well (and expected them to expect them to, etc.), which
created a kind of cultural common ground that could be assumed of all
members of the group (but not other groups). Modern humans' ways of com-
4. Cumulative cultural evolution led to a plethora of culturally specific
cognitive skills and types of thinking. All of these processes of joint and
collective intentionality are universal
in the human species. Most likely, the first step of joint intentionality evolved
in Africa before the split between Neanderthals and modern humans and so
characterized both species. The second step of collective intentionality likely
evolved in a population of modern humans in Africa before they migrated
out into other parts of the world after 100,000 years ago. But once they started
migrating out and settling in highly variable local ecologies, differences in
cultural practices became pronounced. Different human cultures created
very different sets of particular cognitive skills, for example, for navigating
across large distances, for building important tools and artifacts, and even for
communicating linguistically. This meant that different cultures created, on
top of their species-wide cognitive skills of individual, joint, and collective
intentionality, many culturally specific cognitive skills and ways of thinking
for their own local purposes.
Importantly, these culturally specific skills build on one another over his-
torical time within a culture in a kind of ratchet effect, leading to cumulative
cultural evolution. Because of humans' especially powerful skills of cultural
learning, along with adult teaching and children's tendency to conform, the
artifacts and practices of a culture acquire a history. Individuals mediate their
interactions with the world through the culture's artifacts and symbols from
early in ontogeny (Vygotsky, 1978; Tomasello, 1999), thus absorbing something
of the wisdom of the entire cultural group and its history. Cumulative cul-
tural evolution is what enabled humans to conquer all kinds of otherwise
uninhabitable places all over the globe.
As one dramatic example in the contemporary world, we may point to
what are arguably the most abstract and complex forms of human thinking,
that is, those involved in Western science and mathematics. The point here is
that these forms of thinking are simply not possible without special forms of
socially constructed conventions, namely, those in written form, that devel-
oped over historical time in Western culture. This point is stressed especially
by Peirce (1931-1958) and is summarized in the classic text of modern logic by
Lewis and Langford (1932, p. 4): "Had it not been for the adoption of the new
and more versatile ideographic symbols, many branches of mathematics could
never have been developed because no human mind could grasp the essence of
their operations in terms of the phonograms of ordinary language." Many
scholars of literacy would also argue that written language makes certain forms
of reasoning, if not possible, at least more accessible (Olson, 1994). Writing also
greatly facilitates metalinguistic thinking and the possibility to analyze, criti-
cize, and evaluate our own linguistic communication, as well as that of others.
Pictures and other graphic symbols used as communicative devices are collec-
tive representations that contribute to the process in important ways as well.
Those modern cultures that have created active communities of scientists,
mathematicians, linguists, and other scholars are pretty much unthinkable
without written language, written mathematical numerals and operations,
and other forms of visually based and semipermanent symbols. Cultures
that have not created and do not currently possess any of these kinds of graphic
symbols cannot currently participate in these activities. This demonstrates
quite clearly that many of the most complex and sophisticated human cog-
nitive processes are indeed culturally and historically constructed. It also
opens the possibility that some other human cognitive achievements are a
kind of co-evolutionary mixture. Our own view would be that many of the
It is theoretically possible that this entire account applies not to human think-
ing in general but only to a kind of modularized thinking for collaboration
and communication specifically (see Sperber, 1994, for something in this
general direction). But this does not seem to be the case. Human perspectival
and objective representations, recursive and reflective inferences, and normative
self-monitoring (the constituents of uniquely human thinking) do not
just go away when humans are not collaborating or communicating. On the
contrary, they structure nearly everything that humans do, with the possible
exception of sensory-motor activities. Thus, humans use recursive inferences
in the grammatical structures of their languages, in mind-reading in non-
communicative contexts, in mathematics, and in music, to name just the
most obvious examples. Humans use perspectival and objective representa-
tions for thinking about everything, even in their solitary reveries, and they
are engaged in normative self-monitoring whenever they are concerned about
their reputation, which is pretty much all of the time. We might also recall
here skills of relational thinking, which are products of dual-level collabora-
tion but used more broadly, and skills of imagination and pretense, which are
products of imagining in pantomime but are now used in all kinds of artistic
creation. Collaboration and communication may play the crucial instigating
roles in our story, but their effects on cognitive representation, inference, and
self-monitoring extend much more broadly to basically all of humans'
conceptually mediated activities.
Along these same lines, we should also be clear that the new forms of
social cognition that this account proposes are not just modularized theory
of mind skills. Rather, such things as perspectival representations, recursive
inferences, and social self-monitoring evolved so that individuals could now
understand the world in new ways by putting their heads together with others
in acts of shared intentionality. Doing this requires more than just some spe-
cific cognitive skill aimed at some specific content domain, because coordi-
nating actions and intentional states with others toward outside referents
requires new ways of operating across the board. Skills and motivations for
shared intentionality thus changed not just the way that humans think about
others but also the way they conceptualize and think about the entire world,
and their own place in it, in collaboration with others.
ment. Thus, at around three years of age, young children do not just follow
social norms but begin actively to enforce them on others (and to feel guilty
when they break norms themselves). They do this in ways that demonstrate their
understanding that particular norms apply only in particular contexts and
only to individuals in the group who have conventionalized them. They also
understand that some pieces of language, for example, common nouns, are
conventional for everyone in the group, whereas proper names are conventional
only for those who know the person (see Schmidt and Tomasello, 2012, for a
review). Skills of collective intentionality have not been studied in any depth
outside of Western, middle-class culture, and so the cross-cultural generality
of this developmental timing is not known.
The second point about the role of ontogeny is this: neither joint nor collec-
tive intentionality would exist without it. This is true of many human traits,
as the human species has evolved an extended ontogeny for all kinds of things
that other species possess in mature, or almost mature, form at birth. Thus,
whereas many small primates have brains that develop very rapidly in the first
months of life, maturing in less than a year, and chimpanzees have brains that
develop to maturity in only about five years, the human brain takes more
than ten years to reach its fully mature adult volume (Coqueugniot et al., 2004).
Because this extended ontogeny is highly risky both for youngsters and
mothers, there must be offsetting advantages, presumably in terms of such
things as especially flexible behavioral organization, cognition, and decision
making, as well as time to master the local group's cultural artifacts, symbols,
and practices (Bruner, 1972).
Human skills of joint and collective intentionality thus come into existence
during an extended ontogeny in which the child and her developing brain are
in constant interaction with the environment, especially the social environ-
ment. Our hypothesis is that they would not come into existence without this
interaction. To make the point as concretely as possible, let us invoke a thought
experiment we have used before, and then add a novel twist. Imagine a child
born on a desert island, miraculously kept alive and healthy until adulthood,
but all alone. The hypothesis is that this child, as an adult, would not have
skills of either joint or collective intentionality. This social isolate could not,
as an adult, enter a human group and start collaborating by forming joint goals
with individual roles, or communicating cooperatively in the context of joint
attention with individual perspectives. This individual would thus not develop
during its isolated lifetime second-personal thinking with perspectival and
social group if there was in fact no social group that antedated one's social
and cognitive development? No, skills of collective intentionality are not sim-
ply innate or maturational either; they are biological adaptations that come
into existence only through an extended ontogeny in a collectively created
and transmitted cultural environment, which takes multiple generations
to emerge. In this case, then, adults and all of their cultural paraphernalia
are indeed necessary for the ontogenetic development of skills of collective
intentionality.
It is not incoherent to believe that all of the cognitive and thinking skills
we have described could be built in, to the degree that our wild child or
orphaned peers could, if discovered as adults, immediately display perfectly
mature forms of uniquely human thinking at both levels. It is just, in our
view, highly unlikely. Humans biologically inherit their basic capacities for
constructing uniquely human cognitive representations, forms of inference,
and self-monitoring, from out of their collaborative and communicative inter-
actions with other social beings. Absent a social environment, these capacities
would wither away from disuse, like the capacity for vision in a person born
and raised completely in darkness.
One could, in principle, collect data on the role of ontogeny in the emergence
of uniquely human thinking, but only if one had no moral scruples. One
would have to be prepared to randomly assign newborn children to different
rearing environments. Natural experiments, such as the feral child Victor
of Aveyron and other wolf children, are not definitive on the question for
many reasons, not the least of which is that some of these children could have
been abandoned by their parents precisely because they were not normally
functioning (Candland, 1995), and none of them was tested for the relevant
cognitive skills either. Some interesting indirect evidence for the important
role of a human-like social environment is provided by so-called enculturated
apes. When apes are raised by humans in the midst of all kinds of human-like
social interaction and artifacts, they do not develop more human-like skills of
physical cognition (e.g., space, object permanence, tool use), but they do develop
more human-like skills of imitation and communication (Call and Tomasello,
1996; Tomasello and Call, 2004). The significance of these findings for human
ontogeny is, however, not straightforward.
In any case, while everyone will continue to be fascinated by the question of
wild children and how much and what kinds of social experience are necessary
for humans to develop their unique forms of cognition and thinking, the
question is very likely to remain a deep mystery for the foreseeable future. In
the meantime, our hypothesis is that, like many human adaptations, adapta-
tions for shared intentionality are built to grow and flourish only in the midst
of rich social and cultural nourishment of particular kinds.
6
Conclusion
At least since Aristotle, human beings have wondered how they differ from
other animal species. But for almost all of that time the appropriate information
for making this comparison was not available, most importantly because
for the first several thousand years of Western civilization there were no non-
human primates in Europe. Aristotle and Descartes could readily posit things
like "only humans have reason" or "only humans have free will" because they
were comparing humans to birds, rats, various domesticated animals, and
the occasional fox or wolf.
In the nineteenth century nonhuman primates, including great apes, came
to Europe via newly created zoological gardens. Darwin himself was dumb-
struck by his encounter in 1838 with an orangutan named Jenny at the London
Zoo (whom Queen Victoria termed "disagreeably human"). After publication
of the Origin of Species some twenty-one years later, and the Descent of Man
some twelve years after that, the differences between humans and other
animals, as now represented by our closest living relatives, became much
more difficult to pinpoint. Many philosophers reacted by simply defining away
the problem: thinking is a process that takes place in, and only in, the medium
of language, and so other animal species cannot think by definition (the most
prominent modern proponents being Davidson [2001] and Brandom [1994]).
Recent research on great ape cognition and thinking, such as that reviewed
here, should already be undermining this radical discontinuity view. Great
apes cognitively represent the world in abstract format, they make complex
causal and intentional inferences with logical structure, and they seem to
know, at least in some sense, what they are doing while they are doing it.
Although this may not be fully human thinking, for sure it has some key
components.
But the problem is deeper than finding a line of demarcation. The point is
that the great ape species alive today are arbitrarily far from humans; it is just a
matter of who survived and who did not. So what if we discovered, in some
remote jungle, surviving members of the species Homo heidelbergensis or Homo
neanderthalensis? How would we decide whether they possess fully human
thinking, yes or no, given that, in all probability, they would be somewhere
in between contemporary humans and great apes? Even more radical, what if
we discovered some earlier side branches from the human evolutionary tree
who had their own ways of doing and thinking about things, overlapping only
partially with modern human thinking? Perhaps these creatures never devel-
oped pointing and so did not evolve skills of recursive inferring. Or perhaps
they never imitated at a level sufficient for pantomime and so did not symbolize
their experience for others gesturally. Or perhaps they collaborated but did not
care about others' evaluations and so did not become socially normative. Or
perhaps they never had situations in which they had to make group decisions
and so never came to offer one another reasons and justifications for their assertions.
Our question is what would these creatures' version of thinking look
like if it skipped a key ingredient (along with all of its cascading effects) of the
modern human version? We might end up with something sharing many fea-
tures with modern human thinking but having its own unique features as well.
The point is that, considered evolutionarily, human thinking is not a monolith
but a motley, and it could have turned out other than it did.
What we have done in the current natural history is to imagine one pos-
sible missing link in the evolution of human thinking from great apes to
modern humans based on selected aspects of the way of life of contemporary
hunter-gatherers and selected aspects of the thinking of young children (accompanied
by a few, admittedly indeterminate, paleoanthropological facts).
But, importantly, our claim is not just that such an intermediate step can be
imagined and that it probably occurred, but that it was necessary. It was nec-
essary because one cannot even imagine going directly from ape-like com-
petitive interactions and imperative communication to modern human cul-
ture and language with no evolutionary intermediary. Human culture and
that early and modern humans faced as they moved toward ever more coop-
erative ways of making a living.
It is certain that some parts of our evolutionary story are incomplete. The
main problem is that collaboration, communication, and thinking do not
fossilize, and so we will always be in a position of speculation about such
behavioral phenomena, as well as the specific events that were critical to their
evolution. Most crucial, we do not know how much contemporary great apes
have changed from their common ancestor with humans because there are
basically no relevant fossils from this era. Furthermore, our intermediary step
of early humans very likely had much more of a gradual evolution than de-
scribed here; indeed, it is not even clear that Homo heidelbergensis was a sepa-
rate species at all. And we have given only cursory attention to humans after
agriculture and all of the complexities arising from the intermixing of cultural
groups, from literacy and numeracy, and from institutions such as science
and government. And so our attempt is less of an explicitly historical exercise
than an attempt to carve nature at some of its most important joints, specifi-
cally, at some of its most important evolutionary joints.
A list of open questions at this point would be quite long. But two par-
ticularly big ones are these: First is the nature of the jointness or collectivity
or we-ness that characterizes all forms of shared intentionality. Many theo-
rists subscribe to something like an irreducibility thesis (e.g., Gallotti, 2012)
in which such things as joint attention and shared conventions are irreducibly
social phenomena, and attempting to capture them in terms of the individu-
als involved, and what is going on in their individual heads, is doomed to
failure. Our view is that shared intentionality is indeed an irreducibly social
phenomenon in the moment (joint attention only exists when two or more
individuals are interacting, for example), but at the same time we may ask
the evolutionary or developmental question of what does the individual bring
to the interaction that enables her to engage in joint attention in a way that
other apes and younger children cannot. And so for us this means that something
like recursive mind-reading or inferring (still not adequately characterized,
and in most instances fully implicit) has to be a part of the story of
shared intentionality. From the individual's point of view, shared intentionality
is simply experienced as a sharing, but its underlying structure, reflecting
its evolution, is that each participant in an interaction can potentially take
the perspective of others taking her perspective taking their perspective, and
so forth for at least a few levels. But this, as they say, is a point on which rea-
sonable people may disagree.
A second open question is how and why modern humans reify and objectify
what are essentially socially created entities. Money is not just a piece of paper
but legal tender, and Barack Obama is not just a person living in a large white
house but commander in chief, because we act and talk as if they are these
things. We also reify such things as morality, arguing not about the moral
norms of different social groups, including those shared by all human groups,
but rather about what is the right and wrong way to do things, where right
and wrong are considered objective features of the world. And nowhere is
this tendency stronger than in language, where everyone has a tendency
(correctable, but only with much effort) to reify the conceptualizations codified
in our own natural language. About all of these things, we are like the young
child who says that even if long ago everyone agreed to call the striped feline in
front of us a "gazzer," it would not be right to do so because, well, "It's a tiger."
Our own view is that such objectifying tendencies could come only from the
kind of agent-neutral, group-minded perspective that imagines things from the
view of any one of us, the view of any rational person, the view from nowhere,
in the context of a world of social and institutional realities that antedate our
own existence and that speak with an authority larger than us. This is the au-
thoritative voice that lies behind the use of genericized linguistic expressions in
norm enforcement ("That is wrong") and pedagogy ("It works like this"), and
it determines, in large part, what we consider real. But, again, this is a point on
which reasonable people may disagree.
Despite these gaping questions, and others, we cannot conceive any com-
prehensive theory of the origins of uniquely human thinking that is not fun-
damentally social in character. To be as clear as possible: we are not claiming
that all aspects of human thinking are socially constituted, only the species-
unique aspects. It is an empirical fact that the social interaction and organi-
zation of great apes and humans are hugely different, with humans being
much more cooperative in every way. We find it difficult in the extreme to
believe that this is unrelated to the huge differences in cognition and think-
ing that also separate great apes from humans, especially when we focus on
the details. What nonsocial theory can explain such things as cultural insti-
tutions, perspectival and conventional conceptualizations in natural languages,
recursive and rational reasoning, objective perspectives, social norms and
normative self-governance, and on and on? These are all coordinative phe-
nomena through and through, and it is almost inconceivable that they arose
evolutionarily from some nonsocial source. Something like the shared inten-
tionality hypothesis just must be true.
Notes
2. Individual Intentionality
1. Importantly, complex organisms embody hierarchies of control systems, so that
most of their actions are attempts to regulate multiple goals simultaneously at mul-
tiple levels (e.g., the same act is simultaneously attempting to place left foot in front
of right, pursue a prey, feed the family, etc.).
2. This account is related to the notion of Gibsonian affordances, but it is much
broader in including not only direct opportunities for the self's concrete actions but
also situations that are relevant to the organism in many more indirect ways. In addi-
tion, we should also acknowledge that all organisms are hardwired to attend to some
things as naturally salient (e.g., for humans, loud noises) because of potential relevance
to biological goals and values (so-called bottom-up processes of attention).
3. In none of these studies did chimpanzees understand and predict that, when a
dominant was not just ignorant but had a false belief, she would reliably go to the
place where she (falsely) thought the food was located. They treated ignorance and
false belief as the same (see also Kaminski et al., 2008; Krachun et al., 2009, 2010;
chapter 3 discusses this distinction further).
3. Joint Intentionality
1. Contemporary human foragers are not good models for the early humans we
are imagining here, as they have gone through both steps of our evolutionary story
and so live in cultures with social norms, institutions, and languages. Moreover,
contemporary foragers have tools and weapons that make individual foraging (then
sharing at the end) feasible, whereas the early humans we are imagining here had
more primitive weapons and so needed to work together.
2. Of course, contemporary human societies are also full of selfishness and non-
cooperation, not to mention cruelty and war. Much of this is generated by conflicts
between people from different groups (however this is defined) and concerns compe-
tition for private property and the accumulation of wealth that began only in the last
10,000 years or so, after the advent of agriculture, that is, after humans had spent
many millennia as small-group collaborative foragers.
3. Davidson is actually concerned with a special kind of perspective, namely, be-
lief: a cognitive representation of the world that the subject knows might be in error.
His claim is that a necessary condition for the notion of error is a social situation in
which I and another person focus on the same object or event simultaneously
yet differently, what we have called perspective. But the notion of error
introduces an additional consideration because it privileges one of the perspectives as
accurate (and the other as in error), and this requires some notion of an objective
perspective. This notion of objectivity, and so the notion of belief, will not be
available to humans until the next step in our story when agent-neutral perspectives
are possible (see chapter 4).
4. The possibility of lying meant that recipients had to practice epistemic vigi-
lance (Sperber et al., 2010). And so the notion of true propositions also arose from
the comprehension side of the interaction, as comprehenders attempted to distin-
guish truthful from deceptive communicative acts.
5. For purposes of simplicity, the terminology here (referential acts underlain by
communicative intentions) is slightly different from that of Tomasello (2008).
What is here called the communicative intention comprises what was there called
the social intention in the context of the Gricean communicative intention.
6. Some researchers think that this characterization of children's early communication
via pointing is too cognitively rich (see, e.g., Gomez, 2007; Southgate et al.,
2007) and that infants are actually doing something simpler.
7. Said another way, it is one thing to throw an object at another person (which,
coincidently, many apes do), but it is quite another to throw something to someone
in anticipation of her task of catching (Darwall, 2006), which is what the communicator
does, metaphorically, in human cooperative communication.
8. Some researchers have claimed that some great ape intention-movements are
actually functioning iconically, for example, when one gorilla ritualistically pushes
another in a direction in a sexual or play context (Tanner and Byrne, 1996). But these
are most likely garden-variety ritualized behaviors that appear to humans to be
iconic because they derive from attempts to actually move the body of the other in
the desired direction; they are not functioning iconically for the apes themselves.
9. Some contemporary cultures have more than one (e.g., pointing with the in-
dex finger and pinky extended simultaneously for a certain subclass of situations),
but the presumption is that those are derived from the original primordial index
finger pointing, with which all children begin.
10. We have until this point discussed only propositional contents in the sense
of fact-like situations that are expressed in cooperative communication. By the term
proposition we mean a communicative act expressed as a fully articulated act of con-
ventional linguistic communication.
4. Collective Intentionality
1. Children also find it difficult for some time to comprehend situations in
which objective reality is unaffected by the fact that we humans may describe it from
different, even conflicting, perspectives, for example, situations in which there is an
undisturbed objective reality despite the fact that this entity is simultaneously a dog,
an animal, and a pet (see Moll and Tomasello, in press).
2. This is not unlike the way that some motivated linguistic forms, such as meta-
phors, become opaque (dead) across historical time as new learners are ignorant of
the original motivation.
3. A number of unusual situations in the contemporary world have illustrated the
process, at least in broad outline. Most spectacular is the case of Nicaraguan Sign
Language. A number of young deaf individuals each had their own kind of pidgin
signing, or home sign, with very little grammatical structuring, that they used with
their hearing families. But soon after they were brought together into a community
(within three generations) their various idiosyncratic home signs turned into a system
of conventionalized signs used in numerous constructions with all kinds of grammatical
organization (Senghas et al., 2004). A very similar process was observed in the
birth of Al-Sayyid Bedouin Sign Language (Sandler et al., 2005), and indeed, some-
what similar processes have been at least indirectly observed in many cases in which
spoken pidgin languages have turned into creoles and full languages (Lefebvre, 2006).
What seems to happen is that pidgin communication (or home sign) works well
among family members, coworkers, and others with very strong common ground,
typically in highly restricted and recurrent situations such as mealtime or a work task.
But especially as a wider community of communicators and communicative situa-
tions must be accommodated, this process breaks down, and new grammatical means
must be found to help recipients to reconstruct the events and participants (and their
roles) in the intended referential situation. Communicator and recipient then work
together further until there is comprehension, and successful grammatical solutions
are repeated and imitated and so conventionalized in the community.
4. Participants and events in situations may be linguistically indicated at many
different levels of specificity, depending on the common ground between communicator
and recipient (Gundel et al., 1993). Pronouns are used to indicate entities already well
established in common ground, whereas nouns with relative clauses are used for new
entities that the recipient may identify using our common ground (e.g., "the man we
saw yesterday"). In addition, many languages have determiners such as "the" and "a,"
which specifically indicate whether something is or is not in our common ground in
the current communicative interaction. Events are typically grounded in the current
communicative interaction by specifying when they occurred or will occur relative
to, ultimately, now (i.e., via tense). This way of specifying referents thus leads to the
kind of hierarchical tree structures diagramed in traditional linguistic analyses, as
the different linguistic items of a noun phrase or a verbal complex, each with its own
function, are used together, collaboratively as it were, toward the overall goal of indi-
cating a particular participant or event in the referential situation.
5. Sandler et al. (2005) provide a very interesting description of how successive
generations of the newly created Al-Sayyid Bedouin Sign Language conventional-
ized speaker motive and attitude, mostly by conventionalizing slightly exaggerated
facial expressions. Thus, across generations signers came to use conventionalized fa-
cial expressions to signal such things as the illocutionary force of an utterance, such
as assertions vs. questions (p. 31), as in mature sign languages. In addition, communicators
in later but not earlier generations came to symbolize conventionally
their various modal and epistemic attitudes, such things as necessity, possibility, uncer-
tainty, or surprise.
References
Call, J., and M. Tomasello. 1996. The effect of humans on the cognitive develop-
ment of apes. In A.E. Russon, K.A. Bard, and S.T. Parker, eds., Reaching into
thought (pp. 371-403). New York: Cambridge University Press.
Call, J., and M. Tomasello. 2005. What chimpanzees know about seeing, revisited: An explanation of
the third kind. In N. Eilan, C. Hoerl, T. McCormack, and J. Roessler, eds.,
Joint attention: Communication and other minds (pp. 45-64). Oxford: Oxford
University Press.
Call, J., and M. Tomasello. 2007. The gestural communication of apes and monkeys. Mahwah, NJ:
Lawrence Erlbaum.
Call, J., and M. Tomasello. 2008. Does the chimpanzee have a theory of mind: 30 years later. Trends in
Cognitive Science, 12, 87-92.
Callaghan, T., H. Moll, H. Rakoczy, T. Behne, U. Liszkowski, and M. Tomasello.
2011. Early social cognition in three cultural contexts. Monographs of the Society
for Research in Child Development 76(2). Boston: Wiley-Blackwell.
Candland, D.K. 1995. Feral children and clever animals: Reflections on human
nature. Oxford: Oxford University Press.
Carey, S. 2009. The origin of concepts. New York: Oxford University Press.
Carpenter, M., K. Nagel, and M. Tomasello. 1998. Social cognition, joint attention, and
communicative competence from 9 to 15 months of age. Monographs of the Society for
Research in Child Development 63(4). Chicago: University of Chicago Press.
Carpenter, M., M. Tomasello, and T. Striano. 2005. Role reversal imitation in
12- and 18-month-olds and children with autism. Infancy, 8, 253-278.
Carruthers, P. 2006. The architecture of the mind. Oxford: Oxford University Press.
Carruthers, P., and M. Ritchie. 2012. The emergence of metacognition: Affect and
uncertainty in animals. In M. Beran et al., eds., Foundations of metacognition.
(pp.21137). New York: Oxford University Press.
Chapais, B. 2008. Primeval kinship: How pair-bonding gave birth to human society.
Cambridge, MA: Harvard University Press.
Chase, P. 2006. The emergence of culture. New York: Springer.
Chwe, M.S.-Y. 2003. Rational ritual: Culture, coordination and common knowledge.
Princeton, NJ: Princeton University Press.
Clark, H. 1996. Uses of language. Cambridge: Cambridge University Press.
Collingwood, R. 1946. The idea of history. Oxford: Clarendon Press.
Coqueugniot, H., J.-J. Hublin, F. Veillon, F. Houet, and T. Jacob. 2004. Early
brain growth in Homo erectus and implications for cognitive ability. Nature,
431, 299-302.
Corballis, M. 2011. The recursive mind. Princeton, NJ: Princeton University Press.
Crane, T. 2003. The mechanical mind: A philosophical introduction to minds,
machines and mental representation. 2nd ed. New York: Routledge.
Crockford, C., R. M. Wittig, R. Mundry, and K. Zuberbuehler. 2011. Wild chimpanzees
inform ignorant group members of danger. Current Biology, 22, 142-146.
Marín Manrique, H., A.N. Gross, and J. Call. 2010. Great apes select tools on the
basis of their rigidity. Journal of Experimental Psychology: Animal Behavior
Processes, 36(4), 409-422.
Markman, A., and H. Stillwell. 2001. Role-governed categories. Journal of Experimental
and Theoretical Artificial Intelligence, 13, 329-358.
Maynard Smith, J., and M. Szathmáry. 1995. Major transitions in evolution. Oxford:
W.H. Freeman Spektrum.
Mead, G.H. 1934. Mind, self, and society (ed. C. W. Morris). Chicago: University of
Chicago Press.
Melis, A., J. Call, and M. Tomasello. 2006a. Chimpanzees conceal visual and
auditory information from others. Journal of Comparative Psychology, 120, 154-162.
Melis, A., B. Hare, and M. Tomasello. 2006b. Chimpanzees recruit the best
collaborators. Science, 311, 1297-1300.
Melis, A., B. Hare, and M. Tomasello. 2009. Chimpanzees coordinate in a negotiation game. Evolution and
Human Behavior, 30, 381-392.
Mendes, N., H. Rakoczy, and J. Call. 2008. Ape metaphysics: Object individuation
without language. Cognition, 106(2), 730-749.
Mercier, H., and D. Sperber. 2011. Why do humans reason? Arguments for an
argumentative theory. Behavioural and Brain Sciences, 34(2), 57-74.
Millikan, R.G. 1987. Language, thought, and other biological categories. New
foundations for realism. Cambridge, MA: The MIT Press.
Mitani, J., J. Call, P. Kappeler, R. Palombit, and J. Silk, eds. 2012. The evolution of
primate societies. Chicago: University of Chicago Press.
Mithen, S. 1996. The prehistory of the mind. New York: Phoenix Books.
Moll, H., and M. Tomasello. 2007. Cooperation and human cognition: The
Vygotskian intelligence hypothesis. Philosophical Transactions of the Royal Society
of London, Series B: Biological Sciences, 362, 639-648.
Moll, H., and M. Tomasello. 2012. Three-year-olds understand appearance and reality, just not about
the same object at the same time. Developmental Psychology, 48, 1124-1132.
Moll, H., and M. Tomasello. In press. Social cognition in the second year of life. In A. Leslie and
T. German, eds., Handbook of Theory of Mind. New York: Taylor and Francis.
Moll, H., C. Koring, M. Carpenter, and M. Tomasello. 2006. Infants determine
others' focus of attention by pragmatics and exclusion. Journal of Cognition and
Development, 7, 411-430.
Moll, H., A. Meltzoff, K. Mersch, and M. Tomasello. 2013. Taking versus confront-
ing visual perspectives in preschool children. Developmental Psychology, 49(4),
646-654.
Moore, R. In press. Cognizing communicative intent. Mind and Language.
Mulcahy, N.J., and J. Call. 2006. Apes save tools for future use. Science, 312, 1038-1040.
Muller, M.N., and J.C. Mitani. 2005. Conflict and cooperation in wild chimpanzees.
Advances in the Study of Behavior, 35, 275-331.
Nagel, T. 1986. The view from nowhere. New York: Oxford University Press.
Okrent, M. 2007. Rational animals: The teleological roots of intentionality. Athens:
Ohio University Press.
Olson, D. 1994. The world on paper. Cambridge: Cambridge University Press.
Onishi, K.H., and R. Baillargeon. 2005. Do 15-month-old infants understand false
beliefs? Science, 308, 255-258.
Peirce, C.S. 1931-1958. Collected writings (ed. C. Hartshorne, P. Weiss, and A.W.
Burks). 8 vols. Cambridge, MA: Harvard University Press.
Penn, D.C., K.J. Holyoak, and D.J. Povinelli. 2008. Darwin's mistake:
Explaining the discontinuity between human and nonhuman minds. Behavioral
and Brain Sciences, 31, 109-178.
Perner, J. 1991. Understanding the representational mind. Cambridge, MA: The MIT
Press.
Piaget, J. 1928. Genetic logic and sociology. Reprinted in J. Piaget, Sociological
studies (ed. L. Smith). New York: Routledge, 1995.
Piaget, J. 1952. The origins of intelligence in children. New York: W.W. Norton.
Piaget, J. 1971. Biology and knowledge. Chicago: University of Chicago Press.
Povinelli, D. 2000. Folk physics for apes: The chimpanzee's theory of how the world
works. New York: Oxford University Press.
Povinelli, D.J., and D. O'Neill. 2000. Do chimpanzees use their gestures to
instruct each other? In S. Baron-Cohen, H. Tager-Flusberg, and D. Cohen, eds.,
Understanding other minds: Perspectives from developmental cognitive neuroscience,
2nd ed. (pp.11133). Oxford: Oxford University Press.
Rakoczy, H., and M. Tomasello. 2007. The ontogeny of social ontology: Steps to
shared intentionality and status functions. In S. Tsohatzidis, ed., Intentional acts
and institutional facts (pp. 113–137). Dordrecht: Springer.
Rakoczy, H., F. Warneken, and M. Tomasello. 2008. The sources of normativity:
Young children's awareness of the normative structure of games. Developmental
Psychology, 44, 875–881.
Rekers, Y., D. Haun, and M. Tomasello. 2011. Children, but not chimpanzees,
prefer to forage collaboratively. Current Biology, 21, 1756–1758.
Richerson, P., and R. Boyd. 2006. Not by genes alone: How culture transformed
human evolution. Chicago: University of Chicago Press.
Riedl, K., K. Jensen, J. Call, and M. Tomasello. 2012. No third-party punishment
in chimpanzees. Proceedings of the National Academy of Sciences of the United
States of America, 109, 14824–14829.
Rivas, E. 2005. Recent use of signs by chimpanzees (Pan troglodytes) in interactions
with humans. Journal of Comparative Psychology, 119(4), 404–417.
Sandler, W., I. Meir, C. Padden, and M. Aronoff. 2005. The emergence of
grammar: Systematic structure in a new language. Proceedings of the National
Academy of Sciences of the United States of America, 102(7), 2661–2665.
Saussure, F. de. 1916. Cours de linguistique générale (ed. Charles Bally and Albert
Sechehaye).
Schelling, T.C. 1960. The strategy of conflict. Cambridge, MA: Harvard University
Press.
Schmelz, M., J. Call, and M. Tomasello. 2011. Chimpanzees know that others
make inferences. Proceedings of the National Academy of Sciences of the United
States of America, 108, 17284–17289.
Schmidt, M., and M. Tomasello. 2012. Young children enforce social norms.
Current Directions in Psychological Science, 21, 232–236.
Schmidt, M., H. Rakoczy, and M. Tomasello. 2012. Young children enforce social
norms selectively depending on the violator's group affiliation. Cognition, 124,
325–333.
Schmitt, V., B. Pankau, and J. Fischer. 2012. Old World monkeys compare to apes
in the Primate Cognition Test Battery. PLoS One, 7(4), e32024.
Searle, J. 1995. The construction of social reality. New York: Free Press.
Searle, J. 2001. Rationality in action. Cambridge, MA: The MIT Press.
Sellars, W. 1963. Empiricism and the philosophy of mind. London: Routledge.
Senghas, A., S. Kita, and A. Özyürek. 2004. Children creating core properties of
language: Evidence from an emerging sign language in Nicaragua. Science, 305,
1779–1782.
Shore, B. 1995. Culture in mind: Cognition, culture, and the problem of meaning. New
York: Oxford University Press.
Skyrms, B. 2004. The stag hunt and the evolution of sociality. Cambridge: Cambridge
University Press.
Slobin, D. 1985. Crosslinguistic evidence for the language-making capacity. In D.I.
Slobin, ed., The crosslinguistic study of language acquisition, Vol. 2: Theoretical
issues (pp. 1157–1260). Hillsdale, NJ: Lawrence Erlbaum.
Smith, J. M., and E. Szathmáry. 1995. The major transitions in evolution.
Oxford: Oxford University Press.
Southgate, V., C. van Maanen, and G. Csibra. 2007. Infant pointing:
Communication to cooperate or communication to learn? Child Development, 78(3),
735–774.
Sperber, D. 1994. The modularity of thought and the epidemiology of
representations. In L.A. Hirschfeld and S.A. Gelman, eds., Mapping the mind
(pp. 39–67). Cambridge: Cambridge University Press.
Sperber, D. 1996. Explaining culture: A naturalistic approach. Oxford: Blackwell.
Sperber, D. 2000. Metarepresentations in an evolutionary perspective. In D. Sperber,
ed., Metarepresentations: A multidisciplinary perspective (pp. 219–34). Oxford:
Oxford University Press.
Sperber, D., and D. Wilson. 1996. Relevance: Communication and cognition. 2nd ed.
Oxford: Basil Blackwell.