Evolutionary Syntax (2015)

This document discusses the evolution of syntax from its earliest stages to modern recursive syntax. It proposes that proto-syntax consisted of small clauses that were rigid with no movement or recursion. It then evolved through three stages: 1) an intransitive two-word stage with absolutives and unaccusatives as precursors to transitivity, 2) a paratactic stage with coordination as a precursor to hierarchy, and 3) a stage with specific functional categories that allowed for recursion. The document argues this evolutionary account is supported by evidence from language acquisition, agrammatism, neuroscience, genetics, and studies of other communication systems.


OUP CORRECTED PROOF – FINAL, 9/5/2015, SPi

Evolutionary Syntax

Oxford Studies in the Evolution of Language
General Editors
Kathleen R. Gibson, University of Texas at Houston,
and Maggie Tallerman, Newcastle University
RECENTLY PUBLISHED
11
The Prehistory of Language
Edited by Rudolf Botha and Chris Knight
12
The Cradle of Language
Edited by Rudolf Botha and Chris Knight
13
Language Complexity as an Evolving Variable
Edited by Geoffrey Sampson, David Gil, and Peter Trudgill
14
The Evolution of Morphology
Andrew Carstairs-McCarthy
15
The Origins of Grammar
Language in the Light of Evolution II
James R. Hurford
16
How the Brain Got Language
The Mirror System Hypothesis
Michael A. Arbib
17
The Evolutionary Emergence of Language
Evidence and Inference
Edited by Rudolf Botha and Martin Everaert
18
The Nature and Origin of Language
Denis Bouchard
19
The Social Origins of Language
Edited by Daniel Dor, Chris Knight, and Jerome Lewis
20
Evolutionary Syntax
Ljiljana Progovac
See the end of the book for a complete list of titles published
and in preparation for the series.

Evolutionary Syntax

LJILJANA PROGOVAC


Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Ljiljana Progovac 2015
The moral rights of the author have been asserted
First Edition published in 2015
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2014957516
ISBN 978–0–19–873654–7 (Hbk.)
ISBN 978–0–19–873655–4 (Pbk.)
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.

To Ana and Stefan


Contents
Preface
Acknowledgments
List of abbreviations

1. Introduction
1.1 Background and rationale
1.2 Proposal in a nutshell
1.2.1 What did proto-syntax look like?
1.2.2 A method of reconstruction based on Minimalism
1.3 Three rough stages
1.4 Can natural/sexual selection be relevant for syntax?
1.5 Corroboration and testing
1.6 A brief comparison with Jackendoff’s (and other) approaches
1.7 Syntactic theory
1.8 Chapter-by-chapter overview
2. The small (clause) beginnings
2.1 Introduction
2.2 Root small clauses in English
2.3 (Unaccusative) Root small clauses in Serbian
2.4 Small clause syntax is rigid (no Move, no recursion)
2.5 Corroborating evidence and testing grounds
2.5.1 Language acquisition
2.5.2 Agrammatism
2.5.3 Neuroimaging
2.5.4 Genetics and the FOXP2 gene
2.5.5 Stratification accounts elsewhere
2.6 Conclusion
3. The intransitive two-word stage: Absolutives, unaccusatives, and middles as precursors to transitivity
3.1 Introduction: The two-word stage
3.2 Intransitive absolutives
3.3 More on living fossils: What is it that unaccusatives, exocentrics, and absolutives have in common?
3.3.1 Unaccusatives
3.3.2 Exocentric compounds
3.3.3 Absolutives
3.3.4 More absolutive-like patterns in nominative/accusative languages
3.3.4.1 Nominals
3.3.4.2 Dative subjects
3.3.4.3 Clausal complements
3.4 Precursors to transitivity
3.4.1 Serial verb constructions
3.4.2 The “middle” ground
3.5 Corroborating evidence and testing grounds
3.6 Conclusion
4. Parataxis and coordination as precursors to hierarchy: Evolving recursive grammars
4.1 Hypothesized evolutionary stages of syntax
4.2 Paratactic proto-syntax stage
4.2.1 Operation Conjoin: Clause-internally and clause-externally
4.2.2 Paratactic grammar vs. separate utterances
4.2.3 Absolutes and correlatives: More on Conjoin
4.3 The proto-coordination stage
4.4 The specific functional category stage
4.4.1 From linkers to specific functional categories
4.4.2 CP and recursion
4.4.3 DP and recursion
4.4.4 Benefits of subordination
4.4.5 Possible precursors to Move
4.4.6 Transitions and overlaps
4.5 Corroborating evidence
4.5.1 Corroborating evidence for the paratactic stage
4.5.1.1 Ancient languages
4.5.1.2 Grammaticalization
4.5.1.3 Comparative studies: Animal communication
4.5.1.4 Agrammatism
4.5.1.5 Neuroscience
4.5.1.6 Acquisition
4.5.2 Corroborating evidence for a proto-coordination stage
4.6 Concluding remarks
5. Islandhood (Subjacency) as an epiphenomenon of evolutionary tinkering
5.1 Introductory note
5.2 What is islandhood/subjacency?
5.3 Why there is no principled account of islandhood
5.4 Subjacency in the light of evolution
5.5 Conclusion
6. Exocentric VN compounds: The best fossils
6.1 Introduction
6.2 Paratactic grammar behind VN compounds
6.2.1 Absolutive-like proto-predication
6.2.2 Exocentricity
6.3 A comparison with the hierarchical verbal compounds
6.4 A surprising verb form: The imperative
6.5 Crosslinguistic distribution and parallels
6.5.1 VN compounds in other Slavic languages
6.5.2 VN compounds in Romance languages
6.5.3 VN compounds in non-Indo-European languages
6.6 VN compounds and sexual selection
6.7 Corroborating evidence and testing grounds
6.8 Concluding remarks
6.9 Appendix 1: Additional English VN compounds
6.10 Appendix 2: Additional (mostly coarse) VN compounds as Serbian people and place names
7. The plausibility of natural selection for syntax
7.1 Concrete and selectable advantages accrued by each stage
7.2 From one-word to two-word utterances: Vagueness galore
7.3 From the two-word stage to hierarchical syntax: Evolving transitivity, displacement, and recursion
7.3.1 Introductory remarks
7.3.2 Grammaticalizing tense
7.3.3 Grammaticalizing transitivity
7.3.4 Recursion
7.3.5 Historical change vs. language evolution
7.4 A detailed selection scenario
7.5 The timeline for the evolution of language
7.5.1 Was there enough time?
7.5.2 The timeline
8. Conclusion

Appendix
Testing grounds: Neuroimaging
Co-authored with Noa Ofen
References
Index of languages and language groups
Index of names
Index of subjects

Preface
This monograph is meant to be readable and evaluable not only by linguists—all
kinds of linguists—but also by non-linguists. To this end, painful efforts have been
made to write it clearly, and to present the theories and postulates it draws upon in an
accessible way, without taking away too much from the complexity of the issues
discussed. This is especially important to do in a monograph which purports to
stimulate interdisciplinary projects on language evolution. The footnotes are used to
do justice to some of the complexities, and they include some technical details of the
analysis. The reader who ignores the footnotes will still get the gist of the arguments.
However, the reader will only fully grasp the impact of this proposal after working
through at least Chapter 4, which brings it all together. Each of Chapters 2, 3, and 4
considers the proposed proto-syntax stages from a different angle, and it is only after all
these angles are taken into account that a clear picture will emerge.
This monograph draws directly upon the field of theoretical syntax, and presents
some of its key postulates in an accessible way so that cross-fertilization can be sought
between this field and the fields of evolutionary biology, neuroscience, and genetics.
In addition, this monograph sometimes takes into account the linguistic
(sub-)disciplines such as typology and theories of grammaticalization. Doing an
interdisciplinary study of this kind inevitably leads to some loss of detail with each particular
field, but my assessment is that any such loss is more than compensated for by the
synergy among these fields, yielding insights that would never be possible without
this kind of approach.

Acknowledgments
The completion of this book owes to many, many people—certainly more than I will
remember to mention here. First and foremost, I am deeply thankful to Martha
Ratliff, who carefully read the whole monograph and provided substantial and often
crucial feedback on every single chapter, on every single idea, not only in the context
of this book, but over the span of the past seven or so years. Her criticism,
encouragement, and friendship kept me motivated and balanced. Daniel Ross and
Robert Henderson read selected chapters and provided valuable feedback on them.
I have co-authored work on language evolution with Eugenia Casielles and John
L. Locke, and I cannot imagine better collaborators.
For countless discussions and exchanges on the topic of the evolution of syntax,
I am grateful to, in no particular order: Martha Ratliff, Brady Clark, Dorit Bar-On,
David Gil, Fritz Newmeyer, Jasmina Milićević, Ana Progovac, Stefan Progovac,
Dušan Progovac, Eugenia Casielles, Draga Zec, John L. Locke, Noa Ofen, Relja
Vulanović, Steven Franks, Tecumseh Fitch, Andrea Moro, Ray Jackendoff, Željko
Bošković, Nataša Todorović, Igor Yanovich, Dan Everett, Natasha Kondrashova,
Dan Seely, Robert Henderson, Paweł Rutkowski, Ellen Barton, Kate Paesani, Pat
Siple, Walter Edwards, Richard Kayne, Juan Uriagereka, Stephanie Harves, Jim
Hurford, Andrew Nevins, Mitch Green, Raffaella Zanuttini, Haiyong Liu, Franck
Floricic, Margaret Winters, Geoff Nathan, Geoffrey Sampson, John McWhorter,
Ruth Crabtree, Bernd Heine, Eric Reuland, Johanna Nichols, Ken Safir, Patricia
Schneider-Zioga, Acrisio Pires, Ileana Paul. Specific acknowledgments to those who
provided data are to be found in the relevant places in the book.
I am very grateful for receiving several grants to pursue this project, including: the
2014 Marilyn Williamson Endowed Distinguished Faculty Fellowship for the
experimental fMRI project “In Search of Protosyntax in the Brain;” 2013 Keal Faculty
Fellowship, for this book’s manuscript preparation; 2007 Gershenson Distinguished
Faculty Award, as well as 2006 Humanities Innovative Projects in the Arts and
Humanities Grant, for the project “Rudimentary Grammar in the Evolution of
Human Language.”
My immense gratitude goes to the two Oxford University Press reviewers, for
providing amazingly thorough and stimulating comments, as well as to editor
Maggie Tallerman, for many thoughtful and detailed comments. Their comments
made me produce a more nuanced, more engaged, and more informed monograph.
My deep gratitude also goes to the Oxford University Press editor, Julia Steer, for her
thoughtfulness.

The ideas pursued in this monograph have been presented at various conferences
and workshops, which were crucial in shaping my proposal on the evolution of
language, and I am grateful to the audiences there for their valuable feedback: Slavic
Linguistic Society (2006); Michigan Linguistic Society (2006, 2007); Georgetown
University Round Table (GURT) (2007); International Linguistics Association
(ILA), New York (2007); Max Planck Workshop on Complexity, Leipzig, Germany
(2007); Illinois State University Conference on Recursion in Human Language
(2007); Formal Approaches to Slavic Linguistics (FASL) (2007, 2008, 2012, 2014);
American Association of Teachers of Slavic and East European Languages
(AATSEEL), Chicago (2007); DGfS Workshop on Language Universals in Bamberg,
Germany (2008); EvoLang in Barcelona, Spain (2008); Biolinguistics: Acquisition
and Language Evolution (BALE) in York, England (2008); Generative Syntax
Workshop, Novi Sad, Serbia (2008); Ways to Protolanguage Workshop, Toruń, Poland
(2009); EvoLang, Utrecht, Netherlands (2010); SyntaxFest, Bloomington, Indiana
(2010); Workshop on Protolanguage, University of Virginia, Charlottesville (2012);
Symposium on Formal Linguistics and the Measurement of Grammatical
Complexity, Seattle, Washington (2012); Transcending the Boundaries Workshop, Duke
Institute for Brain Sciences (2013); University of Connecticut Workshop on the
Evolution of Syntax, Storrs, Connecticut (2014).
Needless to say, I have not always heeded the advice, and whether I did or not, all
errors remain mine.

List of abbreviations
1,2,3 First, Second, or Third Person
ABSL Al-Sayyid Bedouin Sign Language
ACC Accusative
BP Before Present
CP Complementizer Phrase
DOM Differential Object Marking
DP Determiner Phrase
ECM Exceptional Case Marking
F Feminine (gender)
IE Indo-European
IFG Inferior frontal gyrus
IMP Imperative
INF Infinitive
M Masculine (gender)
N Neuter (gender)
mya Million years ago
NOM Nominative
NP Noun Phrase
NSL Nicaraguan Sign Language
PART Participle
PERF Perfective
PIE Proto-Indo-European
PL Plural
PRES Present
pSTS Posterior superior temporal sulcus
SC Small Clause
SG Singular
SOV Subject-Object-Verb
SVO Subject-Verb-Object
TAM Tense/Aspect/Mood
TP Tense Phrase
VN Verb-noun
VP Verb Phrase
vP Light Verb Phrase

“The sensations and ideas thus excited in us by music, or expressed by the cadences
of oratory, appear, from their vagueness, yet depth, like mental reversions to the
emotions and thought of a long-past age.”
(Darwin 1874: 595)
OUP CORRECTED PROOF – FINAL, 8/5/2015, SPi

1 Introduction

1.1 Background and rationale

Broadly speaking, the goal of this monograph is to provide a framework, a program
for studying the evolution of syntax, by relying on a theory of syntax. The intent is
to show that syntax can be decomposed into evolutionary primitives/layers, and
that such decomposition can not only help identify the stages of evolutionary
progression of syntax, but also shed light on the very nature of language design.
I also show that the progression through the postulated stages makes evolutionary
sense, i.e., that each new stage brings some concrete advantage(s) over the previous
stage(s), and that such advantage(s) would have been subject to natural/sexual
selection.1 My proposal is therefore that the capacity for syntax evolved
incrementally, in stages, subject to selection pressures.2 This approach leaves the door wide
open for the possibility that the pressures to evolve syntax, and language in general,
played an active role in shaping human brains, contrasting with the view that the
brain’s capabilities evolved for some other purpose, and then got co-opted for
language.
This monograph draws on and brings together: (i) Darwin’s (1859, 1872, 1874)
theory of gradualist evolution invoking natural/sexual selection; (ii) some key
syntactic postulates of the Minimalist Program for syntax (e.g. Chomsky 1995) and its
predecessors; (iii) Jackendoff’s (1999, 2002) idea of syntactic “fossils;” and (iv) the

1
While sexual selection is typically considered to be a subcase of natural selection, given that they both
ultimately reduce to reproduction, I sometimes use both terms in this book next to each other in order to
highlight the prominent role sexual selection might have played in language evolution. As I will argue, at
least some aspects of the evolution of syntax/language may not have been adaptive in the sense of physical
survival in the environment, but instead beneficial for securing mates.
2
When I refer to the gradual evolution of syntax in this monograph, this can also be interpreted as the
gradual evolution of the capacity to use syntax, one aspect of which is the capacity to establish numerous
neural connections in the brain, as discussed in Section 1.4. But my primary intent here is to hypothesize
what kind of syntax/grammar was actually in use in each proposed stage, whether the use of simpler
grammars at these stages reflected the lesser capacity for establishing a multitude of neural connections
necessary to support more complex syntax, or whether the use of simpler syntax simply reflected the lack of
innovation (of more complex syntactic structures) at that point, or both.


idea of internal reconstruction using a linguistic theory (Heine and Kuteva 2007).3
The proposed framework is not only informed by syntactic theory, but it is also
consistent with the forces of natural/sexual selection and it is specific enough to yield
testable hypotheses that can be subjected to e.g. neuroimaging experiments.
Remarkably, by reconstructing a particular path along which syntax evolved, this approach is
able to explain the crucial properties of language design itself, as well as some major
parameters of crosslinguistic variation.4
In the spirit of Darwin (e.g. 1859), and as elaborated in Jacob (1977), evolution is to
be seen as a “tinkerer,” rather than an engineer. Unlike engineering, which designs
from scratch, with foresight and plan, and with perfection, tinkering works by
cobbling something together out of bits and pieces that happen to be available locally,
with no long-term foresight. Evolution is also known to be conservative and not to
throw a good thing away, but to build upon it, which is why one should expect to find
constructions of previous stages (“fossils”) in the later stages. One of the themes of
this monograph is that the advent of a new stage does not obliterate the previous
stage(s), but rather that the older stages continue to co-exist, often in specialized or
marginalized roles, in addition to being built into the very foundation of more
complex structures.
However, many syntacticians believe that it is inconceivable for there to exist, or to
have ever existed, a human language which does not come complete with unbounded
Merge, Move, structural case, subordination, and a series of functional projections:
the hallmarks of modern syntax. The claim is often that syntax in its entirety evolved
suddenly, as a result of a single event. The following quotation from Berwick (1998:
338–9) summarizes this view: “In this sense, there is no possibility of an ‘intermediate’
syntax between a non-combinatorial one and full natural language—one either has
Merge in all its generative glory, or one has no combinatorial syntax at all . . . ” (see
also Chomsky 2002, 2005; Piattelli-Palmarini 2010; Moro 2008). When it comes to
language evolution, this stance has been challenged by e.g. Pinker and Bloom (1990);
Newmeyer (1991, 1998, 2005); Jackendoff (1999, 2002); Culicover and Jackendoff
(2005); Givón (e.g. 2002a,b, 2009); Tallerman (2007, 2013a,b, 2014a,b); Heine and
Kuteva (2007); Hurford (2007, 2012); Progovac (2006, 2009a,b, 2013b).5 Most recently,

3
This may appear to be an uneasy alliance, especially given that Noam Chomsky himself has rejected a
gradualist approach to the evolution of syntax, as discussed below in the text (in this respect, see Clark 2013
for the argument that one’s theoretical framework does not determine one’s stand on language evolution).
To my mind, a syntactic program such as Minimalism is not the truth about language, but it is a framework
which provides tools that can be used to search for the truth (see Section 1.7). The tools of other approaches
can certainly be used as well.
4
As put in Givón (2002b: 39), “like other biological phenomena, language cannot be fully understood
without reference to its evolution, whether proven or hypothesized.” An even stronger claim to this effect can be
found in Dobzhansky’s (1973) article titled “Nothing in biology makes sense except in the light of evolution.”
5
For a thorough overview of the recent approaches to language evolution, the reader is referred to the
introductory chapter of Heine and Kuteva (2007), as well as to Tallerman and Gibson (2012).

on the other hand, Berwick and Chomsky (2011: 29–31) assert again that “the simplest
assumption, hence the one we adopt . . . , is that the generative procedure emerged
suddenly as the result of a minor mutation. In that case we would expect the
generative procedure to be very simple . . . The generative process is optimal. . . .
Language is something like a snowflake, assuming its particular form by virtue of laws of
nature . . . Optimally, recursion can be reduced to Merge . . . 6 There is no room in this
picture for any precursors to language—say a language-like system with only short
sentences. The same holds for language acquisition, despite appearances . . . ”7
This monograph shows that there is in fact ample room for a language system with
short (and flat) sentences, and that such constructs are not just something we can
postulate for the evolution of syntax, but that they are also found as “living fossils”
throughout present-day languages (see e.g. Jackendoff 1999, 2002 for the idea of living
fossils of syntax). In biological literature, living fossils are defined as species that have
changed little from their fossil ancestors in the distant past, e.g. lungfish (Ridley
1993).8 Significantly, such fossil structures in syntax are clearly characterizable using
the tools of Minimalism, and their properties follow precisely from the
reconstruction formula introduced in the following section. For something to qualify as a
syntactic fossil, I argue, it has to be theoretically proven to be measurably simpler
than its more complex/more modern counterparts, and yet show clear continuity
with them. Strikingly, there is evidence that these (proto-syntactic) fossils provide a
foundation upon which more complex syntactic structures are built.
Jackendoff (1999, 2002) considers paratactic grammars as evolutionarily more
primary than hierarchical grammars, and identifies some fossils of such grammars,
including compounds and adjunction processes (see Section 1.6 for more details
regarding Jackendoff’s approach). Parataxis can be considered as a loose
combination or concatenation of two or more elements. Jackendoff’s claim is that the
achievements of the previous stages are still there, co-existing side by side with
more complex hierarchical constructions. This monograph shows that one can
make an even stronger and more specific claim than this, which is that these
paratactic (fossil) structures are built into the very foundation of every modern clause

6
This idea that syntax is optimal in some sense can be found in various recent papers on Minimalism.
According to the so-called Strong Minimalist Thesis (SMT), language is an optimal solution to legibility
conditions (e.g. Chomsky 2000: 96; see also Epstein, Kitahara, and Seely 2010). However, what “optimal”
should mean in this context has not been defined, and this makes it impossible to falsify these claims, or
to respond to them in a meaningful way (see also the discussion in Johnson and Lappin 1999).
7
In fact, saltationist views sometimes flirt with the idea that not just syntax, but language in its entirety,
arose as one single event. While most claims are vague in this respect, Piattelli-Palmarini (2010: 160) states
that it is “illusory” to think that words can exist outside of full-blown syntax, or that any protolanguage can
be reconstructed in which words are used, but not full-blown syntax.
8
Linguistic fossils are also discussed in Bickerton (1990, 1998), although Bickerton claims that there is
no continuity between such fossils, found e.g. in pidgin languages and early children’s speech, and modern
grammars. In addition, Givón (e.g. 1979) also refers to vestiges of previous stages of language in present-day
languages, in a very similar sense.

or phrase, as will be seen below. Consider also that Jackendoff’s (and Bickerton’s
1990) concatenation protolanguage stage differs from mine in another important
respect: while theirs allows more than one argument per verb from the very start,
I argue that the initial stage of proto-syntax was necessarily intransitive, as well as
absolutive-like (Section 1.2).9
Bickerton (1990, and subsequent work) claims that what he terms “protolanguage”
does not have syntax, and is in fact not real language for that reason (see also
Section 1.6). My use of the term “proto-syntax” is meant to imply that this is a
stage which shows syntax, although of a different, simpler kind.10 Postulating an
absolutive-like two-word stage allows for a more fine-grained identification of stages,
which in turn makes it easier to identify the pressures, as well as precursors, for
evolving hierarchical grammars, including transitivity. The postulation of an
intransitive absolutive-like stage also opens up the possibility of using crosslinguistic
variation in the expression of transitivity to correlate these stages with the hominin
timeline (Chapter 7).
This monograph thus challenges the view that syntax is an all-or-nothing package,
and that it evolved suddenly in all its complexity. My position is instead that the
capacity for syntax evolved gradually, in stages, subject to selection pressures.11 It is
based on very specific claims, whose feasibility can be evaluated and tested both in the
theory of syntax and in neuroscience, as well as corroborated by the findings in other
relevant fields or subfields, including language acquisition, grammaticalization
theory, typology, aphasia, and genetics. There are several components of this proposal
that set it apart from the other approaches to the evolution of language. First, this
approach pursues an internal reconstruction of the stages of grammar based on the
syntactic theory, to arrive at precise, specific, and tangible hypotheses. Second, it
provides an abundance of theoretically analyzed “living fossils” for each postulated
stage, drawn from a variety of languages. Third, the postulated stages, as well as
fossils, are at the appropriate level of granularity to reveal the selection pressures that
would have driven the progression through stages. Fourth, this approach offers a very

9
The meaning of the term “absolutive-like” will be made much clearer in Chapter 3. For now, it is to be
understood as a construction with a verb and one single argument whose status as a subject vs. as an object
of the verb is not syntactically specified. This characterization pertains most clearly to constructions which
are ergative/absolutive both syntactically and morphologically, as will be explained in Chapter 3.
10
In this book, to avoid confusion, I will reserve the term “protolanguage” for presyntactic
(non-combinatorial) stages of language, such as the one-word stage, even though, in principle, the term protolanguage
could be taken to encompass proto-syntax as well.
11
A reviewer has wondered if the term “gradual” can be interpreted to mean “continuous,” as that
would not be the correct characterization of what I mean here. The term has been associated with
Darwinian adaptationism, and has been used in this context, with a clear sense of incremental processes,
using small steps rather than leaps, as discussed at length in e.g. Dawkins (1996); see also Fitch (2010: 46).
As Dawkins (1996) explains, by situating Darwin’s writings within the context of the debates of his own
time, one can clearly see that Darwin was not a constant-rate gradualist, as is sometimes suggested by
punctuated equilibrium advocates.

specific experimental design for testing the proposed hypotheses. Last but not least, it
arrives at a reconstruction which can be meaningfully correlated with the hominin
timeline.
This monograph draws directly upon the field of theoretical syntax, and presents
some of its key postulates in an accessible way so that cross-fertilization can be
sought between this field and the fields of e.g. evolutionary biology, neuroscience, and
genetics. An interdisciplinary endeavor of this scope will inevitably lead to some loss
of depth and technical detail with each particular field, including when it comes to
the theory of syntax, but my assessment is that any such loss is more than
compensated for by the potential to cross-fertilize these fields, yielding insights that would
never be possible by looking at each field separately. As much as this book is about
reconstructing the evolutionary path for syntax, it is also deeply about what syntax
actually is, as the two questions are inextricably linked. This particular evolutionary
scenario offers a reconstruction of how communicative benefits may have been
involved in the shaping of the formal design of language itself.

1.2 Proposal in a nutshell

1.2.1 What did proto-syntax look like?
Specifically, my proposal is that the first “sentences” were paratactic (not hierarchical,
not headed) combinations of e.g. a “verb” and a “noun” (akin to present-day
intransitive small clauses), in which the noun, the only argument of the verb, was
absolutive-like, specified as neither subject nor object.12 To put it in less technical
terms, I propose that a proto-sentence was somewhat like a two-slot mold, which
could fit just two words, for example one verb and one noun, and as such it could not
be transitive, as a transitive structure requires three basic elements, a verb, a subject,
and an object. While postulating two-slot proto-grammars may seem far-fetched at
first sight, the (fossil) structures that clearly exhibit properties of such grammars are
easily found across various constructions in all present-day languages, if not at the
level of the sentence, then at the level of the noun phrase or at the level of the
compound. My basic argument is that such fossil small clauses have been built into
more complex hierarchical structures, rather than having been replaced by them.

12
More accurately, instead of using the term “sentence” here, one can talk about combinations of nouns
and verbs, as some of these combinations appear to be compounds used as names/nicknames. In addition,
some of these combinations may involve predicates other than verbs, but the majority of the examples
I consider in the monograph consist of a verb and a noun, which is also in line with Heine and Kuteva’s
(2007) conclusion that nouns and verbs were the first (proto-)word categories to emerge in the evolution of
human language. A reviewer points out that noun-noun compounds may also be of interest for this
approach, as well as verb-verb combinations, as attested in serial verb constructions in some languages.
I return to serial verb constructions in Sections 1.6 and 3.4.1, and to noun-noun compounding in
Section 1.6.

There are additional reasons to believe that the first syntactic combinations were
short (and binary), that is, that they consisted of only two (main) elements loosely/
paratactically combined. First of all, binary branching in syntactic theory (including
in Minimalism) is considered to be a syntactic universal, that is, it is considered that
all syntactic operations can only join two elements at a time. The overwhelming
majority of compounds across languages are binary, consisting of only two free
morphemes. Child language acquisition is typically reported to proceed from a
one-word stage to a two-word stage, before combining more words into single
utterances becomes available (e.g. Bloom 1970).13 In addition, where (small) clauses
themselves are combined paratactically with other such clauses (as in e.g. Nothing
ventured, nothing gained; Easy come, easy go; Come one, come all), the number of
clauses that combine is again overwhelmingly just two.14
There are some exceptions, such as No shoes, no shirt, no service; however,
combining more than two expressions paratactically often becomes very difficult to
process, as the following example helps illustrate:
(1) Nothing ventured, nothing gained, nothing lost.
One is not sure what the above example means. Does it mean that if nothing is
ventured, and nothing is gained, then nothing is lost either? Or does it mean that if
nothing is ventured, then nothing is either gained or lost? Or something else? This
is not grammatically specified in the example in (1), and our brains do not seem
prepared to readily assign meanings to such ternary structures. The only way to
unambiguously accommodate three or more clauses like that into a single utter-
ance is by creating hierarchical syntax, using function words such as if, then,
and, etc.
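
The ambiguity of (1) can be made concrete with a small computational sketch. The Python fragment below is an illustration only, not part of the proposal itself: the function name `bracketings` and the use of nested tuples to stand for grouping are assumptions introduced here. It enumerates the order-preserving binary groupings of a sequence of clauses. Two clauses admit exactly one grouping, while three admit two, which correspond roughly to the two readings considered above.

```python
def bracketings(items):
    """Return every order-preserving way of grouping `items` into nested binary pairs."""
    if len(items) == 1:
        return [items[0]]
    results = []
    # Try every position at which the sequence can be split into a left and a right part.
    for split in range(1, len(items)):
        for left in bracketings(items[:split]):
            for right in bracketings(items[split:]):
                results.append((left, right))
    return results


clauses = ["nothing ventured", "nothing gained", "nothing lost"]

# A paratactic (flat) combination leaves the grouping unspecified.
flat = tuple(clauses)

# Hierarchical syntax has to commit to one of the binary groupings.
readings = bracketings(clauses)
for reading in readings:
    print(reading)
```

On this sketch, the grouping `(A, (B, C))` is closest to "if nothing is ventured, then nothing is either gained or lost," and `((A, B), C)` to "if nothing is ventured and nothing is gained, then nothing is lost." The flat triple by itself selects neither.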
If our ancestors started with the capacity to use small (clause) paratactic gram-
mars of the kind approximated above, they would have faced ample evolutionary
pressures to develop a capacity for more elaborated grammars, that is, grammars
that can accommodate a combination of more than just two clauses, as well as
transitive grammars, which can accommodate more than two words/phrases. In
other words, my claim is that the two-slot proto-syntax (an early stage in the
evolution of syntax), characterized by paratactic union (#), operated both inside the
sentence (to produce intransitive two-word proto-sentences, such as Come # all),
and at the level of clause union (to produce binary combinations of the kind Come
one, # come all).

13
In his Overview to the edited collection, Bloom (1994) specifies that the two-word stage is typically
observed between the ages of eighteen months and two and a half years.
14
There is a wealth of data from a variety of languages which follow this AB-AC formula, and these are
typically fossilized expressions, although in some languages, such as Hmong, they can be in productive use,
exemplifying true living fossils (Chapters 2, 3, and 4).

It has been hypothesized by several researchers that there was a simpler stage of
syntax in the evolution of human language, involving elements loosely concatenated
into a single utterance (e.g. by Givón 1979, 2002a,b; Dwyer 1986; Bickerton 1990, 1998;
Jackendoff 1999, 2002; Culicover and Jackendoff 2005; Deutscher 2005; Burling 2005;
Gil 2005; Tallerman 2007, 2013a,b, 2014a,b; Hurford 2007, 2012; Progovac 2006, 2008a,
b, 2009a,b; Jackendoff and Wittenberg 2014; and many others). My approach builds
on these claims, and takes them significantly further, to hypothesize an intransitive,
absolutive-like, two-word (small clause) stage (Progovac 2014a,b).
Using crosslinguistic data, Progovac (e.g. 2006, 2008a,b, 2009a,b) has extended the
idea of paratactic proto-grammars (i.e. early evolutionary stages of grammar) to what
is referred to in the literature as “small clauses” in embedded contexts, but which are
also found in isolation as root small clauses (2, 4). According to this proposal, clauses
in (2) and (4) are relevant fossils of the two-word stage, as they are not only
intransitive, but also lack (at least) the TP (Tense Phrase) layer of structure, typically
associated with modern finite sentences in Minimalism (3, 5). They can be reduced to
a single layer of structure, the layer of the small clause.
(2) Problem solved. Case closed. Me first! Him worry?!
(3) The problem has been solved. The case has been closed.
I will be first! He worries?!
(4) a. Pala vlada. (Serbian)15
fall.PART government
b. Pao sneg.
fall.PART snow
c. Stigla pošta.
arrive.PART mail
(5) a. Vlada je pala.
AUX
‘The government has fallen.’
b. Pao je sneg.
‘It has snowed.’
c. Stigla je pošta.
‘The mail has arrived.’
My argument is that comparable small clauses served as precursors to more complex
sentences (TPs) in the evolution of human language, given that they are syntactically
measurably simpler, and given that, according to the theory, they provide a foundation
for building TPs (Sections 1.2.2; 1.7).16

15
The form of the verb in the examples in (4) is a perfective participle form, indicating perfective/
completed aspect. There is no past tense marking in these examples. In contrast, the past tense is marked in
the examples in (5) by the auxiliary je.

The argument for the proposed progression from a small clause to a TP stage has
three prongs to it: (i) providing evidence of “tinkering” with the language design, in
the sense that older structures (i.e. small clauses) get built into more complex
structures (i.e. TPs); (ii) identifying “living fossils” of the small clause stage in modern
languages; and (iii) identifying existing or potential corroborating evidence and
testing grounds, from language acquisition, agrammatism, genetics, and neurosci-
ence. In addition, the argument is that the progression from a small clause stage to a
TP stage brings with it concrete communicative advantages, which could have been
subject to natural/sexual selection (see Section 1.4).
It is important to keep in mind here that the fossils discussed in this monograph
can only be seen as rough approximations of the structures once used in the deep
evolutionary past. Depending on the language in question, such fossils in present-day
languages may show morphological markings, e.g. case marking and aspect marking.
It is in no way implied in this monograph that the proto-syntax in evolutionary times
had any such morphology. The structures identified as fossils in this monograph
count as fossils in some relevant respect under consideration, for example, in their
lack of a TP, but not in all their properties. It also seems that some of the fossils
discussed in the monograph (such as exocentric verb-noun (VN) compounds, e.g.
cry-baby, pick-pocket, hunch-back, rattle-snake) are closer approximations of the
proto-syntactic constructs than others, for the reasons given in e.g. Chapter 6,
which discusses such compounds in great detail.
It is also of interest that different languages can use the foundational, fossil
structures in different ways, and in different constructions. In some languages, the
fossil constructions are still in productive use, as is the case with e.g. Serbian
unaccusative small clauses in (4), to be covered in Chapter 2, and Hmong AB-AC
formulae, as discussed in Chapters 2, 3, and 4. Looking at more languages in this light
would uncover more types of fossil structures, and provide further insights into the
evolution and nature of human language.

1.2.2 A method of reconstruction based on Minimalism

The method used in this monograph for hypothesizing the stages of proto-syntax can
be characterized as internal reconstruction based on the theory of clause/sentence
structure adopted in Minimalism (e.g. Chomsky 1995) and its predecessors. The
simplified hierarchy of functional projections/layers characterizing modern clauses/
sentences in Minimalism is given in (6).

16
In this respect, my analysis of small clauses being transformed into full sentences/TPs resembles, to some
extent, the development of the heart (thanks to Garrett Mitchener, p.c. April 2013, for the analogy). The
embryo initially has only a small precursor to the heart, consisting of two simple tubes which merge (“primitive
heart”), and this precursor gradually bulges and expands to become the complex heart. A reviewer has pointed
out that the analogy is not complete, as the complex human heart no longer has the two tubes discernible.
Perhaps the analogy can at least be taken to show that the complex human heart does not come into existence
in its full complexity, but that there is a simple precursor, however hard it may be to imagine one.

(6) CP > TP > vP > VP/SC
Very roughly speaking, the inner VP (Verb Phrase)/SC (Small Clause) layer accom-
modates the verb/predicate and one argument, while vP (Light Verb Phrase) accom-
modates an additional argument, such as agent, in transitive structures. TP (Tense
Phrase) accommodates the expression of tense and finiteness, while CP (Comple-
mentizer Phrase) accommodates subordination/embedding, among other processes
(see Section 1.7 for more discussion).
This hierarchy is a theoretical construct which offers a natural and precise method
of reconstructing previous syntactic stages in language evolution, as outlined in (7).
(7) Internal Reconstruction, based on syntactic theory
Structure X is considered to be primary relative to Structure Y if
X can be composed independently of Y, but Y can only be built
upon the foundation of X.17
While SCs/VPs can be composed without the TP layer, TPs must be built on the
foundation of a small clause/VP, as postulated in the theory of syntax.18 Likewise,
while TPs can be composed without CPs, CPs require the foundation of a TP. One
can thus reconstruct a stage of proto-syntax which had no TPs or CPs, but had SCs/
VPs, and possibly also vPs. To put it differently, one can reconstruct a stage in the
evolution of syntax in which it was possible to compose structures comparable to
those in (2) and (4), but not structures comparable to those in (3) and (5).
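
The logic of (7) can be sketched as a small dependency check. The Python fragment below is a minimal illustration under stated assumptions, not an implementation of any existing tool: the table `REQUIRES` and the function names are hypothetical, and the choice to let TP rest directly on SC/VP (rather than on vP) follows the statement that the stage without TPs or CPs had SCs/VPs "and possibly also vPs."

```python
# Assumed "built upon" relation among the layers in (6): each layer names the
# layer it requires as its foundation, or None if it requires nothing.
REQUIRES = {"SC/VP": None, "vP": "SC/VP", "TP": "SC/VP", "CP": "TP"}


def foundation(layer):
    """Return every layer that `layer` must be built upon, nearest first."""
    chain = []
    needed = REQUIRES[layer]
    while needed is not None:
        chain.append(needed)
        needed = REQUIRES[needed]
    return chain


def is_primary(x, y):
    """(7): X is primary relative to Y if Y needs X but X does not need Y."""
    return x in foundation(y) and y not in foundation(x)


def composable(layers):
    """A set of layers is a possible grammar if every foundation is present."""
    return all(REQUIRES[l] is None or REQUIRES[l] in layers for l in layers)


print(is_primary("SC/VP", "TP"))    # small clause is primary relative to TP
print(is_primary("TP", "CP"))       # TP is primary relative to CP
print(composable({"SC/VP"}))        # a TP-less, vP-less stage can be composed
print(composable({"TP"}))           # a TP without a small clause cannot
```

Under these assumptions, the reconstructible stages are exactly the sets of layers for which `composable` holds, with the bare `{"SC/VP"}` grammar as the smallest of them.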
Similarly, while SCs/VPs can be composed without a vP layer, the vP can only
build its shell upon the foundation of a SC/VP. One can thus reconstruct a vP-less
(intransitive) stage in the evolution of syntax, reduced to only SC/VP. By removing
these three layers of hierarchical structure, one is essentially left with an intransitive
flat small clause, which is arguably absolutive-like, and which approximates the small
clause beginnings without functional projections, and without the possibility of
distinguishing subjects from objects (see Section 1.2.1). I focus on reconstructing
the properties of these TP-less and vP-less stages of proto-syntax in Chapters 2 and 3
respectively. The significance of the emergence of CP is discussed in Chapters 4, 5,
and 6, but the focus of this book is on the earliest stages of proto-syntax, as they are
most relevant for the biological evolution of language, as well as most difficult to
reconstruct.

17
The term “primary” is used here in the sense that there was a stage in language evolution when the
primary structure X was in use, but not the non-primary structure Y. This is also the sense in which the
internal reconstruction method is used by Heine and Kuteva (2007), as elaborated below.
18
The idea that a sentence (TP) is built upon the foundation of a small clause is one of the most stable
postulates in the theory (Section 1.7).

The absence of each functional projection has concrete and observable conse-
quences, as established based on the theory of syntax, as well as on the abundance
of fossil data taken from across languages. A variety of these fossil constructions
will be exemplified and discussed from this viewpoint throughout the monograph,
including absolutives, unaccusatives, exocentrics, and middles. Consistent with the
gradualist approach advocated here, it is significant that these fossils include
constructions which straddle the boundary between transitivity and intransitivity:
the so-called middles.
The recurring theme of this monograph is that each new stage preserves, and
builds upon, the achievements of the previous stage(s). Thus, a TP is built upon
the foundation of the small clause (which might or might not include a vP),
and transitive structures (vP/VP shells), as well as “middles,” are built upon
the foundation of intransitive (absolutive-like) VPs/SCs. In brain stratification
accounts (see e.g. Vygotsky 1979 and Jean Piaget’s work, as outlined in e.g.
Gruber and Vonèche 1977), as well as in the triune brain proposals (e.g. MacLean
1949), the common theme is the inclusion of attainments of earlier stages in the
structures of later stages (Section 2.5.5). This kind of scaffolding finds corroboration
in the processes of language acquisition and language loss, as well as in language
disorders.
A method of internal reconstruction is also used in Heine and Kuteva (2007), but
based on a different linguistic theory: a theory of grammaticalization. Since gram-
maticalization typically works in the direction of developing a functional (gram-
matical) category out of a lexical category (or a more abstract category out of a
more concrete category), but not the other way around, the authors reconstruct a
stage in the evolution of human language which only had lexical (content) cat-
egories, but not functional categories. In that sense, lexical categories are primary
with respect to corresponding functional categories (see Footnote 17). Importantly,
the proposed syntactic reconstruction in (6–7) leads to a convergent result: strip-
ping away functional layers (such as CP, TP, and vP) leaves one with a bare small
clause, consisting typically only of a verb and a noun, with no functional projec-
tions on top. What these two methods share is that they use a linguistic theory, as
well as a wealth of linguistic data behind these theories, to arrive at hypotheses
about language evolution, and it is significant that these two approaches lead to a
convergent result.
The sense in which the term “internal reconstruction” is used by Heine and Kuteva
(2007: 24), as well as in this book, is based on the assumption that languages reveal
evidence of past changes in their present structures, and that certain kinds of present
alternation in a language can be reconstructed back to an earlier stage in which there
was no alternation of that kind (see also e.g. Comrie 2002). The internal reconstruction
method contrasts with the comparative method, which necessarily looks at more than
one language in order to reconstruct the language of the common ancestor.19
My approach does not lead to identical results to those reached by Heine and
Kuteva (2007), but rather to results that complement each other, and reinforce each
other. Heine and Kuteva’s (2007) focus is on the lexicon, that is, word categories,
while the focus here is on syntax/grammar, that is, on how the words in that lexicon
were combined, and how these syntactic combinations got to be more complex
over time.
It is worth pointing out, however, that the capacity for abstract vocabulary
building is not unrelated to the emergence of functional categories and hierarchical
syntax. Grammaticalization processes typically take more concrete words, such as go,
say, etc., and metaphorically extend their meanings to the point when they become
e.g. highly abstract functional categories (such as tense markers, or subordination
markers; see e.g. Heine and Kuteva 2007 and references there). Thus, the capacity for
hierarchical syntax probably presupposes the capacity for abstract vocabulary build-
ing based on metaphorical extension. According to e.g. Givón (2002a: 151–2), one
reason to believe that some basic words used in isolation (one-word stage) preceded a
syntactic stage is that grammatical categories are more abstract than lexical categories
(see also Tallerman 2014a).
While my approach identifies specific syntactic stages of language evolution, as
well as evolutionary pressures that would have driven the progression through stages,
Heine and Kuteva do not explore the role of natural/sexual selection in the evolution
of the lexicon or syntax. Even though the final evidence regarding the origins of
human language may have to come from other disciplines, perhaps neuroscience and
genetics, only linguistics can provide specific and linguistically sound hypotheses for
these fields to engage.
This book thus contributes to the view that language, and in particular syntax,
emerged gradually, through evolutionary tinkering. However, the gradualist view of
the evolution of language is sometimes dismissed by pointing out that recently
observed language changes are not always linear/directional, and that it is possible
both to develop certain (more complex) forms, and to revert back to the original
(simpler) forms. So the question here is, once you evolve more complex structures,
can you or can you not revert to a paratactic (small clause) grammar?

19
It should be pointed out here that internal reconstruction (as opposed to the comparative method)
has been used much less in phonological reconstruction, and that syntactic reconstruction in particular is
much newer and less successful than phonological reconstruction. However, it is pointed out in e.g.
Newman (2014: 13), as well as references cited there, that internal reconstruction has a lot to offer:
“Although internal reconstruction (IR) is not as well understood nor commonly utilized as the comparative
method, it has a long pedigree in historical linguistics (see Hoenigswald 1944; Kuryłowicz 1973). While
recognizing the limitations of IR, most historical linguists appreciate its value in historical linguistics and
would agree with Hock (1991: 550) when he concludes: ‘internal reconstruction is an extremely useful and
generally quite accurate tool for the reconstruction of linguistic prehistory.’ ” Newman also says in a
footnote that Ringe (2003) is a “curious exception” in rejecting the IR method. This controversy aside,
when it comes to the evolution of syntax, the internal reconstruction method is the only one available. It is
also a method used for reconstructing language isolates, such as Basque.

The basic claim of this monograph is that the foundational, paratactic structures
remain built into the very foundation of the hierarchical grammars, and that they
also continue to live in various marginal, and sometimes not so marginal, construc-
tions, which can be characterized as “living fossils” of the paratactic stage. If so, then
it should not be impossible to fall back onto these simpler, paratactic strategies, still
alive in the brain, especially in the case of adversity, such as agrammatism, pidginiza-
tion, and second language acquisition. Evolution should be able to revert back to
more robust, foundational strategies. According to the so-called last in, first out
principle, used in e.g. computer science and psychology (see e.g. Code 2005), what is
acquired last is the most shallow/fragile layer that is the easiest to lose, and vice versa.
When it comes to complex syntax, such loss can take place in pidginization and in
agrammatic aphasia (see also Gil 2005 for the development of Riau Indonesian; also
Heine and Kuteva 2007).
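
The last in, first out principle mentioned above can be pictured as a stack. The Python sketch below is purely illustrative: the list `ACQUISITION_ORDER` is an assumed, simplified order of emergence for the layers in (6), and the function `lose` is a hypothetical name introduced here. The point is only that the most recently added layers are the first to be removed, leaving the foundational small clause layer intact.

```python
# Assumed order of emergence, earliest first (a simplification for illustration).
ACQUISITION_ORDER = ["SC/VP", "vP", "TP", "CP"]


def lose(stack, n):
    """Return (remaining, lost) after removing the n most recently acquired layers."""
    remaining = list(stack)                      # copy, so the input is untouched
    count = min(n, len(remaining))
    lost = [remaining.pop() for _ in range(count)]
    return remaining, lost


grammar = list(ACQUISITION_ORDER)

# Under adversity, the latest layers are shed first.
remaining, lost = lose(grammar, 2)
print(remaining)   # the older, more robust layers
print(lost)        # the layers lost, most recent first
```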
In fact, there are reversals elsewhere in the evolution of organisms. As observed e.g.
in the work of Richard Dawkins, body hair is one of those traits that can recede and
reappear a number of times in the history of a species (e.g. with mammoths, who
rapidly became wooly in the most recent ice ages in Eurasia). In addition, some recent
genetic studies reveal that reversals and losses are possible even in the evolution of
multi-cellularity, a major transition in the history of life. For example, Schirrmeister,
Antonelli, and Bagheri (2011) report that the majority of extant cyanobacteria, one of
the oldest phyla still alive, including many single-celled species, descend from multi-cellular
ancestors, and that reversals to unicellularity occurred at least five times. In a
sense, then, pidginization, and other similar losses of syntactic suprastructure, can be
seen as comparable to the return to a simpler, unicellular mode of existence.

1.3 Three rough stages

As outlined in the previous sections, this monograph focuses on intransitive fossil
clauses of various kinds, lacking the layers of functional structure characterizing
modern clauses, including TP and vP layers. This monograph is further concerned
with the nature of the bond between the merged elements in these proto-syntactic
constructs, as well as with how that bond evolves over time to be able to support
processes such as coordination and subordination. In this respect, I identify the
following three rough stages in the evolution of syntactic bond (i–iii), following a
hypothetical non-syntactic one-word stage (0). This progression is meant to shed
light on the emergence of functional categories, as well as on the nature of modern
language design, and the existence of various transitional, ambivalent structures (see
Chapter 4 for details).

(0) One-word stage (no combinatorial power/no syntax)

This stage is characterized by single words intended as complete utterances (as in e.g.
Run! Snake! Out!).

(i) Paratactic stage (proto-syntax)

In this paratactic stage, prosody/supra-segmentals provide the only glue for (proto-)
Merge. In other words, in this stage there is prosodic evidence, but not any segmental
evidence, that the words/constituents are Merged.20 The proto-syntax of this stage can
be characterized by an operation Conjoin (or proto-Merge), rather than Merge proper
as conceived in Minimalism. As explained at length in Chapter 4, Conjoin, which is
akin to Adjoin, does not create headedness or hierarchy, and the products of Conjoin
are flat structures, which do not allow Move or recursion, as can be corroborated by
looking at the present-day fossils which approximate this stage (Chapter 2). Adjunc-
tion itself is known to be rigid when it comes to Move, and adjuncts are well-known
islands for movement, as discussed below (see also Chapter 5 on Subjacency).

(ii) Proto-coordination stage

This is a stage in which, in addition to prosody, a conjunction/linker provides all-
purpose segmental glue to hold the utterance together. In this stage, the evidence for
(proto-)Merge is more robust, as it retains the prosodic evidence (the only type of
evidence available in the previous stage(s)), and adds to it segmental evidence (the
linker), even though in this stage the segmental piece does not provide any more
specific information regarding the nature of the categories and projections. In other
words, in this stage the evidence for proto-Merge is dual: both prosodic (retained
from the paratactic stage), and segmental (in the form of a linker), an innovation of
this stage. This stage is arguably still syntactically flat/non-hierarchical, and still does
not allow Move, which is consistent with the well-known fact that coordination
structures even today constitute islands for Move (Chapter 5). The proposal is that
the meaningless linkers/proto-conjunctions, best approximated by the conjunction
of the type and in present-day languages, were among the first functional categories,
whose initial purpose was only to consolidate the important achievement of Stage (i),
the ability to (proto-)Merge, i.e. to Conjoin.

(iii) Specific functional category stage (hierarchical/subordination stage)

In this hierarchical stage, in addition to prosody and to segmental glue, specific
functional categories become available, providing specialized syntactic glue for constituent
cohesion, including tense particles and subordinators/complementizers. In
other words, this stage includes all the attainments of the previous stages (prosody
and linkers), and adds another, which is to use the segmental piece (linker) also to
identify the type of the constituent created by Merge. I argue that it is only at this
stage that hierarchical structure, Move, and recursion become available.21

20
Roughly speaking, prosody refers to the rhythm, stress, and intonation of speech, while segments
refer to specific sounds.

These postulated stages mark a progression from least syntactically elaborated
(parataxis), to more elaborated (coordination), to most elaborated (specialized func-
tional categories/projections). My claim is that each of these grammars can operate
both clause-internally, e.g. to combine a subject and a predicate into a small clause
(e.g. Come winter, . . . ), and clause-externally, to combine two such clauses into a
single utterance (e.g. Come one, come all). As will be shown, all the hierarchical
phenomena discussed in this book, including transitivity and CP subordination,
seem to have alternative, paratactic routes.
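
The contrast among the three stages can be sketched as a contrast between data structures. The Python fragment below is a rough illustration under stated assumptions, not a formalization of Conjoin or Merge as defined in Chapter 4: the class `Node`, the functions `conjoin`, `merge`, `clause`, and `nesting`, and the "TP" label given to embedded clauses are all hypothetical simplifications introduced here. Conjoin yields an unlabeled pair (optionally held together by a linker), whereas Merge yields a constituent labeled by a specific functional category, and only labeled constituents can be counted as embedded within others of the same category.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    parts: tuple
    label: Optional[str] = None    # None: no head or category (stages i and ii)
    linker: Optional[str] = None   # all-purpose segmental glue (stage ii)


def conjoin(a, b, linker=None):
    """Stages (i)/(ii): flat, unlabeled union of two elements."""
    return Node(parts=(a, b), linker=linker)


def merge(category, function_word, complement):
    """Stage (iii): a specific functional word identifies the constituent type."""
    return Node(parts=(function_word, complement), label=category)


def clause(matrix, complement):
    """Simplified clause node taking a complement (labeled TP by assumption)."""
    return Node(parts=(matrix, complement), label="TP")


def nesting(tree, category):
    """Count how deeply constituents labeled `category` are nested in each other."""
    if not isinstance(tree, Node):
        return 0
    deepest = max(nesting(part, category) for part in tree.parts)
    return deepest + (1 if tree.label == category else 0)


stage_i = conjoin("come one", "come all")
stage_ii = conjoin("come one", "come all", linker="and")

# Stage (iii): repeated CP embedding, as in "I believe that Henry knows that
# Peter doubts that recursion is real."
cp = merge("CP", "that", "recursion is real")
for matrix in ["Peter doubts", "Henry knows"]:
    cp = merge("CP", "that", clause(matrix, cp))
stage_iii = clause("I believe", cp)

print(nesting(stage_i, "CP"), nesting(stage_ii, "CP"), nesting(stage_iii, "CP"))
```

In this sketch the flat structures of stages (i) and (ii) have no category labels at all, so there is nothing for a same-category embedding to be defined over; the loop that builds `stage_iii` can in principle be run any number of times.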
This approach can explain why adjuncts and conjuncts are islands for Move, and
more generally why languages exhibit Subjacency/islandhood effects in the first place
(Chapter 5). It also sheds light on the vast overlap and indeterminacy between coord-
ination structures and paratactic structures, at one end of the spectrum, as well as
between coordination structures and subordination structures at the other end. The
overlap is expected if each stage is taken to gradually integrate into the next (Chapter 4).

1.4 Can natural/sexual selection be relevant for syntax?

One encounters three main types of argument against subjecting syntax to a gradualist
evolutionary approach. The first, as pointed out in Section 1, is the claim that
syntax itself is an all-or-nothing package, and that syntax cannot be decomposed into
evolutionary primitives or stages. Second, even if syntax can be decomposed in some
fashion, the progression to a more complex syntax stage would involve the acquisi-
tion of principles such as Subjacency, which are just too abstract and arbitrary to be
targeted by evolutionary forces. Third, it has been claimed that there was not enough
evolutionary time to allow natural/sexual selection to operate on syntax or language
in general (see e.g. Hornstein 2009).22 I briefly address each of these objections below,
but in much more detail in Chapter 7. Chapter 7 also considers the basic timeline for
the evolution of syntax, consistent with the postulated stages. Even in this broadest
outline, this approach can help choose among some proposed hypotheses regarding
the evolution of the human species.

21
What I mean by recursion in this monograph corresponds to what linguists typically mean by it: the
embedding of a constituent of a certain syntactic category (e.g. a clause/CP) within another constituent of
the same category (another clause/CP), as in (i) below (but see e.g. Tomalin 2011 for the confusion
surrounding the term). Also typically associated with the use of the term recursion by linguists is the
assumption that you can repeat the procedure in principle any number of times. This is essentially what
Heine and Kuteva (2007: 68) call “productive recursion.” I return to recursion, and its different
characterizations, in Chapter 4 (see also Section 1.7).
(i) I believe [CP that Henry knows [CP that Peter doubts [CP that recursion is real.]]]
22
Another potential problem is raised by Christiansen and Chater (2008), which has to do with the
constant and rapid language change. According to the authors, the linguistic environment over which
selectional pressures operate thus presents a “moving target” for natural selection. However, in a commentary
to this article, Fitch (2008) counters that the same issue of a rapidly changing environment also arises with
uncontroversially adaptive biological processes, and calls for more sophisticated models of co-evolution
between ontogeny, phylogeny, and language change in an attempt to understand the nature of language.

Regarding the first objection, this monograph not only shows that syntax can be
decomposed into primitives, but also that the progression through these basic
syntactic stages can be successfully reconstructed given syntactic theory, as discussed
in the previous sections, and as will be elaborated in much more detail in subsequent
chapters. As for the second objection, the intent of this monograph is also to show
that the progression through the identified syntactic stages makes evolutionary sense,
i.e. that each new stage brings some tangible advantage(s) over the previous stage(s),
and that such advantage(s) were significant enough to have been subject to natural/sexual
selection. These advantages are discussed throughout the monograph, but
especially in Chapter 7.
For example, each step in the progression from one-word stage (no syntax), to
small clause stage (paratactic two-slot syntax), to hierarchical TP stage accrues
clear incremental communicative benefits. Small clauses (or half-clauses), with only
one layer of structure, would have been immensely useful to our ancestors when
they first started using syntax.23 A half-clause is still useful, even in expressing
propositional content—much more useful than having no syntax at all (one-word
stage), and much less useful than having more articulated hierarchical syntax of the
specific functional category stage. This is exactly the scenario upon which evolution/selection
can operate.
According to Pinker and Bloom (1990), based on Darwin’s work, the only way to
evolve a truly complex design that serves a particular purpose is through a sequence
of mutations/changes with small effects, and through intermediate stages, with each
change/stage useful enough to trigger natural selection. This monograph explores
exactly that kind of scenario for the evolution of syntax. As pointed out by Pinker
and Bloom, it is impossible to make sense of the structure of the eye without
acknowledging that it evolved for the purpose of seeing; evolution is the only
physical process that can create an eye because it is the only physical process in
which the criterion of being good at seeing can play a causal role. The same can be
applied to language: evolution can create a system as complex as language because
evolution is the physical process in which the criterion of being good at language/
communication can play a causal role.

23
The reference to half-clauses (in Progovac’s 2008a paper titled “What use is half a clause?”) is meant
to mimic the typical objections to Darwin’s adaptationist approach in general, in the form of “what use is
half an eye?”

Moreover, not all linguistic innovations need have begun with a genetic change.
The Baldwin Effect postulates that learning and culture can guide evolution, given
that individuals using innovative features can set up a pressure for the evolution of
neural mechanisms that would make decoding such innovative features of language
automatic and undistracted by irrelevant factors, triggering conventional Darwinian
evolution (Hinton and Nowlan 1987; Pinker and Bloom 1990; Deacon 2003), as
discussed further in Chapter 7.
Chapter 7 summarizes the advantages that each postulated stage brings with it, and
considers one concrete hypothetical scenario for progressing from one stage to the
next, invoking sexual selection. For example, as shown in Chapter 6, exocentric VN
compounds are fossil structures which specialize for derogatory reference (e.g. turn-
coat, kill-joy, cry-baby, hunch-back), and which provide evidence not only of most
rudimentary syntax, but also of ritual insult/sexual selection for such simple syntax
(see Progovac and Locke 2009; Progovac 2012). Selecting for the ability to quickly
produce (and interpret) such (often humorous and vivid) compounds on the spot
would have gone a long way toward not only solidifying the capacity to use paratactic
grammars, the foundation for more complex grammars, but also the capacity for
building (abstract) vocabulary.
The abundance of examples from various languages offered in Chapter 6 makes it
clear that these compounds combine basic, concrete words, often denoting body
parts and functions, in order to create vivid and memorable abstract concepts. Thus,
sexual selection for the capacity to produce and interpret such compounds could
have been one of the factors facilitating the progression from the one-word stage to
the two-word paratactic stage. There is no doubt that many other factors would have
also contributed to solidifying this foundational syntactic strategy, given that having
simple syntax, as opposed to having no syntax at all, accrues a host of communica-
tive advantages.
As pointed out above, transitioning from the paratactic stage to the specific
functional category stage may have proceeded through a linker/proto-conjunction
stage in some cases, where the linker initially served only to solidify proto-Merge, as
will be discussed in detail in Chapter 4. Perhaps the initial meaningless linker
occurring between a subject and a predicate of a small clause gradually became a
Tense particle which can now automatically express reference to past and future
events, but also build a TP, and with it hierarchical structure. The grammaticalization
of the functional projection such as TP renders automatic and undistracted the
expression of the temporal and modal properties, allowing the speakers to break
away from the here-and-now much more easily than is possible with paratactic small
clause grammars (see Chapter 2 for the data showing that TP-less root small clauses
are typically grounded in the here-and-now).
Breaking away from the here-and-now, and from the prison of pragmatics in
general, may have been one dimension along which language evolved.24 As will be
discussed in Chapter 3, two-slot proto-grammars do not distinguish between subjects
and objects, and it is typically pragmatics that determines the meaning of sentences
created by such grammars. The same certainly holds of one-word utterances. So,
imagine encountering the following one-word (8) and two-word (9) utterance
sequences in a proto-syntax stage:
(8) Apple . . . Eat . . . John . . . Go . . .
(9) Apple eat. John go.
These kinds of utterances are much less precise (i.e. more vague) than a correspond-
ing TP sentence such as (10) below, and can receive many interpretations in addition
to the one in (10):
(10) John will (go and) eat the apple.
However, one meaning that does not readily come to mind with respect to (8) and (9)
is the one expressed in (11). That reading is disfavored because of its
pragmatic oddness, and not because there is anything in the structure of (8) or (9)
that excludes it. In contrast, this reading is excluded by the structure of the sentence
in (10), and it is the only reading that the structure in (11) allows.
(11) The apple will (go and) eat John.
This suggests that pragmatically odd (or impossible) propositions are harder to express
without complex syntax, given that underspecified structures, resulting in vague
interpretations, are in close alliance with the pragmatics of the situation; in this
sense, such structures are prisoners of pragmatics. Adding the transitive vP and/or
the TP layer to the small clause structure would have yielded more precise grammars,
with subjects and objects more clearly differentiated, making it much easier to describe
odd or pragmatically impossible events (11). This is so because with such grammars one
can now unambiguously make “the apple” the subject of eating a human being. But one
may wonder what good it does to be able to talk about apples eating humans.
First, it is important to keep in mind that language (and syntax) are not just used to
express propositions and exchange information, but that they are also often used for
playful purposes and in order to impress (see references in Progovac and Locke
2009), something we formal linguists often forget.25 Thus, if an ancient language user
wanted to draw attention to himself by using language in a funny and surprising way,
he would have had a much harder time doing so with structures such as (8) or (9),
than with a structure such as (11), which relies on hierarchical syntax. This is just
one way in which transitivity, and hierarchical structure more generally, would have
been adaptive. Of course, the precision in expressing the argument structure (who
did what to whom) would have been adaptive in so many other ways as well,
including in gossip and story-telling, both of which rely on displacement.

24
Displacement, roughly characterizable as the capability of language to communicate about things
that are not present, is commonly thought to be one of the design features of human language, and
moreover one of the features that arguably distinguishes human language from animal communication
systems (see e.g. Hockett 1960; Hockett and Altmann 1968). This is the sense in which I am using the term
“displacement” in this book. The reader should note that the same term is also used to refer to a completely
different phenomenon, to the rearrangement of constituents within a sentence, as a result of the syntactic
operation Move.

In other words, the capacity for displacement, a key design feature of human
language, is facilitated by hierarchical syntax. As will be discussed in Chapter 4,
hierarchical syntax also enables Move and recursion, making it possible e.g. to
embed one point of view within another. Therefore, once the innovation that was
hierarchical syntax appeared on the evolutionary scene, there would have been
multiple types of pressures to select for it.
This is not to claim that every single phenomenon of syntax, such as every single
functional projection, or every single construction, has been selected; certain syntac-
tic phenomena seem to be bundled together, and it may be enough for one of them to
emerge to make the others possible. Likewise, just as is the case with the evolution of
other aspects of living organisms, there will surely be phenomena in syntax that serve
no particular purpose, and that can be seen as spandrels (i.e. by-products of some
other adaptations), or which perhaps developed through drift (variation due to
chance).26 However, the existence of such phenomena should not distract one
from identifying those aspects of language that responded to selection pressures,
and from devising methods to test such hypotheses.
Finally, the third objection to the gradualist approach to language evolution is that
there was not enough evolutionary time for the selection to take place. Pinker and
Bloom (1990) propose that language evolved gradually, subject to the Baldwin Effect,
the process whereby environmentally-induced responses set up selection pressures
for such responses to become innate, triggering conventional Darwinian evolution
(see also Deacon 1997; Hinton and Nowlan 1987). Deacon (2003) puts emphasis on
learning, rather than innateness, in his adoption of the Baldwin Effect. He considers
that masking and unmasking of “preadaptations” plays an important role in this
process. As an innovative tool (e.g. language) became more and more essential to
successful reproduction, “novel selection pressures unmasked selection on previously
‘neutral’ variants and created advantages for certain classes of mutations that might
not otherwise have been favored” (93–4). At the same time, this innovative tool
“masked selection on traits made less vital by being supplemented” by the innovative
tool, such as perhaps the inventory and specificity of human calls (94). As clarified in
Chapter 7, where I return to this topic, this approach ultimately reduces to Darwinian
natural selection.

25
In this respect, Dunbar, Duncan, and Marriot (1997) report that only about 10–20% of conversation
time is devoted to practical and technical topics, while the rest is devoted to social concerns (see also
Tallerman 2013b).
26
For example, the availability of Move may be inextricably linked to the availability of functional
layers, such as TP and CP, which Move serves to connect (Section 4.4.5). Recursion itself may be a
by-product of the emergence of specialized functional projections, such as CP for clausal subordination
(Chapter 4). As argued at length in Chapter 5, Subjacency should be seen as a by-product of other
adaptations, and not a principle in its own right.

Tiny selective advantages are sufficient for evolutionary change; according to
Haldane (1927), a variant that produces on average 1% more offspring than its
alternative allele would increase in frequency from 0.1% to 99.9% of the population
in just over 4,000 generations. This would still leave plenty of time for language to
have evolved: 3.5 to 5 million years, if early Australopithecines were the first talkers,
or, as an absolute minimum, several hundred thousand years (Stringer and Andrews
1988), in the event that early H. sapiens were the first.27 Moreover, fixations of
different genes can go in parallel, and sexual selection can significantly speed up
any of these processes. The speed of the spread depends on how high the fitness of
these individuals was relative to the competitors. According to e.g. Stone and Lurquin
(2007), if relative fitness is high, it can take just a few dozen generations for the
variant frequency to increase tenfold.28
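The arithmetic behind such estimates can be illustrated with a minimal sketch. It assumes the simplest possible model, a haploid population with a constant selective advantage s, in which the odds p/(1 - p) of the variant grow by a factor of (1 + s) in each generation. This is an illustrative assumption and not necessarily the model behind the figures cited above; the function names, and the value s = 0.10 standing in for "high" relative fitness, are likewise hypothetical.

```python
import math


def generations_to_reach(p_start: float, p_end: float, s: float) -> float:
    """Generations for a variant to rise from frequency p_start to p_end.

    Assumes a haploid model with constant selective advantage s, so that
    the odds p / (1 - p) are multiplied by (1 + s) every generation.
    """
    odds_start = p_start / (1.0 - p_start)
    odds_end = p_end / (1.0 - p_end)
    return math.log(odds_end / odds_start) / math.log(1.0 + s)


def simulate(p_start: float, p_end: float, s: float) -> int:
    """Same quantity obtained by explicit iteration, as a cross-check."""
    p, generations = p_start, 0
    while p < p_end:
        # Frequency of the variant after one round of selection.
        p = p * (1.0 + s) / (1.0 + p * s)
        generations += 1
    return generations


if __name__ == "__main__":
    # 1% advantage, from 0.1% to 99.9% of the population.
    print(round(generations_to_reach(0.001, 0.999, 0.01)))
    print(simulate(0.001, 0.999, 0.01))
    # Assumed 10% advantage, tenfold increase from 0.1% to 1%.
    print(round(generations_to_reach(0.001, 0.01, 0.10)))
```

Under these assumptions, a 1% advantage carries the variant from 0.1% to 99.9% in roughly 1,400 generations, fewer than the figure of just over 4,000 cited above, which presumably rests on different modeling assumptions. An assumed 10% advantage yields a tenfold increase in about two dozen generations. In both cases the numbers are small relative to the evolutionary time available.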
The initial arguments for the saltationist views, such as the postulation of the
Middle to Upper Paleolithic revolution, as well as the dating of the FOXP2 gene, have
now been mostly reconsidered (see Chapter 7 for more discussion on this). It was
initially reported by Enard et al. (2002) that the FOXP2 gene mutation in humans
occurred at some point in the last 200,000 years, which could have then coincided
with the emergence of syntax. However, it has since been found that the same
mutation characterizes Neanderthals (Krause et al. 2007), which pushes the mutation
back to at least the common ancestor, about half a million years ago (see e.g. Piattelli-
Palmarini and Uriagereka 2011 for discussion).
Another type of evidence that has been invoked in favor of the saltationist view has
to do with the postulation of the Middle to Upper Paleolithic transition/revolution.
Based on archeological findings, Mellars (2002) and others initially suggested
that there was a major cultural and cognitive transition/revolution around 43–35,000
Before Present (BP). These archeological findings were often interpreted to mean that
syntax (or language) in its entirety arose at this juncture, through one single event,
such as a mutation (e.g. Bickerton 1995; Chomsky 2002, 2005). However, the more
recent findings suggest that there was no human revolution, at least not at this
particular juncture (see e.g. McBrearty and Brooks 2000; McBrearty 2007; and
Mellars himself (2007: 3), as further discussed in Chapter 7).

27
Berwick et al. (2013) maintain that the capacity for language evolved about 100,000 years ago.
28
As one example, the fitness of lactose tolerance is 2–3% higher in dairy areas. It took about
5,000–10,000 years to reach the current rates of lactose tolerance among northern Europeans, which are
close to 100% in some populations.

In short, there is no real obstacle to studying syntax in a gradualist evolutionary
framework. The recurring theme of this monograph is that there is evidence of
evolutionary tinkering in the language design itself, consistent with the view that
the complexity of syntax emerged gradually, through evolutionary tinkering. As a
result, modern clauses are quirky and redundant (rather than optimal and perfect).29
In this evolutionary perspective, rather than a system designed from scratch in an
optimal way, syntax is seen as a patchwork of structures incorporating various stages
of its evolution, and thus exhibiting a variety of quirky phenomena, many of which
are discussed in this monograph.30

1.5 Corroboration and testing

The gradualist evolutionary scenario for syntax, as outlined in this monograph, finds
corroboration in practically every domain relevant for language, in addition to the
fossil proto-structures found in modern languages. As discussed in various chapters
in this monograph, in children, as well as in adults acquiring a second language,
language clearly emerges gradually, through stages, arguably starting with the para-
tactic (small clause) stage, and it can also be partly lost in e.g. agrammatic aphasia
and speech language disorders.
When it comes to neuroscience, there is converging evidence in the literature
showing that increased syntactic complexity corresponds to increased neural activa-
tion in certain specific areas of the brain (see e.g. Caplan 2001; Indefrey, Hagoort et al.
2001; Just et al. 1996; Pallier, Devauchelle, and Dehaene 2011; Brennan et al. 2012). The
experiments performed by Pallier, Devauchelle, and Dehaene (2011) and Brennan
et al. (2012) have found a positive correlation between the levels of hierarchical
embedding and the degree of activation in the brain, even when keeping the number
of words constant. This is consistent with the proposal in this monograph that the
complexity of syntax is graded, and that it evolved gradually. The proposals in this
monograph are also specific and concrete enough that they themselves can be
subjected to neuroscientific testing, as summarized in the Appendix to this book,
which proposes a specific experimental design.

29
A reviewer suggests that while syntactic representations may be quirky and redundant, the syntactic
operations, such as Merge, may be optimal and perfect. Again, in order for these claims to be falsifiable, one
will need to define what “optimal” and “perfect” should mean in this respect (see Footnote 6). Besides, my
argument in this monograph is that the crucial properties of syntax, including the operations such as
Merge, have precursors (e.g. Conjoin), and that they, too, arose through evolutionary tinkering. In this
view, the reason why syntactic representations show quirkiness today is because they often incorporate
structures created by different types of operations, including by the precursors to Merge.
30
The quirkiness of a variety of syntactic constructions is recognized in e.g. Culicover (1999) and
Culicover and Jackendoff (2005), who refer to some such constructions as “syntactic nuts.”

It is also significant that the framework explored here can serve as a point of
contact, an intermediary, between the fields of neuro-linguistics and genetics. To take
one example, some recent experiments suggest that a specifically human FOXP2
mutation is responsible for increased synaptic plasticity, establishing better connect-
ivity among the neurons in the brain (e.g. Vernes et al. 2007; Enard et al. 2009; see
Section 2.5.4).31 If better synaptic plasticity is what facilitates the processing of more
complex syntax, then one can hypothesize that the pressures to evolve the capacity
for a more complex syntax could have contributed to the spread of the human
mutation of e.g. the FOXP2 gene. Even at this preliminary level, one can appreciate
the potential for synergy among the fields which can shed light on the evolution of
human language: syntactic theory, neuroscience, and genetics. It is evolutionary
considerations like these that can provide the point of contact.32

1.6 A brief comparison with Jackendoff’s (and other) approaches

The work of Ray Jackendoff (e.g. 1999, 2002) on the evolution of syntax, in particular
his characterization of syntactic “fossils,” provided an important component of the
proposal that I am exploring in this book. This short section is in no way meant to be
an overview of his approach, or to do justice to various aspects of his approach, but
merely to indicate where my proposal converges with his, and where it diverges.
Generally speaking, I would say that Jackendoff’s approach is in broad strokes, with
wider implications (including some discussion of phonology and morphology), while
mine is narrowly focused only on the evolution of the layers of clausal structure, as
postulated in the syntactic theory associated with Minimalism.
Also, as per the distinction drawn in Heine and Kuteva (2007), Jackendoff’s
approach is integrative, considering data from various domains, including language
acquisition, pidgin languages, and aphasia, while mine is discipline-based, following,
perhaps doggedly, a reconstruction method based on a syntactic theory, while the
evidence from the other domains is only considered as secondary, corroborating
evidence. Even though it seems plausible and insightful in many respects, Jackendoff’s
(1999, 2002) approach does not easily lend itself to specific hypotheses that can be tested.
Botha (2006: 135), among others, has pointed out that such windows into the evolution of
language need to be accompanied by insightful theories.

31
FOXP2 is just one of several genes that are implicated in language and speech (disorders), and are
thus of potential relevance for language evolution (see e.g. Vernes et al. 2007; Newbury and Monaco 2010,
for FOXP1). Two other potentially relevant genes are CNTNAP2 and ASPM (see e.g. Dediu and Ladd 2007;
Fitch 2010: 291; Diller and Cann 2012). The exact contribution of FOXP2 and the other genes remains to be
determined, as rightly pointed out by a reviewer, but I do not think this can be determined without some
concrete input by linguists, and without some specific, testable hypotheses about language evolution. If
these genes were even partly selected for some specific language abilities then, without hypothesizing what
these specific abilities might have been, we will not be able to find out.
32
There is no doubt that the researchers in each of these fields will notice that there are other possible
takes and interpretations of the analyses and data presented here, and that there are many more
complexities involved with each field than this approach can do justice to. Still, if this is one of the ways
that all these fields can be brought together, then it is at least worth exploring.

My intent in this book is to show that following a narrowly focused and discipline-
based approach leads to greater depth, and to some surprising novel insights, which
in turn make it possible to formulate specific hypotheses and predictions, as well as to
reveal clear communicative advantages that come with each stage. There is both a
virtue and a curse in following this kind of simple and precise reconstruction method.
The virtue is that you can be fairly confident in your reconstruction. The curse is that
it does not tell you about other things.
The idea of syntactic fossils advocated by Jackendoff is very powerful, as is the
idea that one can reconstruct the stages of language evolution by looking at the
nature of language itself. I believe that my approach strengthens these ideas
further by proposing that each new stage literally leans upon the structure of
the previous one, and cannot exist without it.33 On my approach, not only are the
fossil structures used side by side with more complex structures (e.g. paratactic
(adjunct) structures alongside hierarchical structures), but these foundational
paratactic structures are literally built into the foundation of hierarchical struc-
tures. As will be shown, intransitive small clauses are built into vPs and TPs, and
exocentric compounds are built into hierarchical compounds. Thus, on my
approach, the fossil structures do not just provide cognitive scaffolding (i.e. increased
cognitive abilities) for advancing to hierarchical language (e.g. Jackendoff 2009);
the fossil structures also provide concrete syntactic scaffolding for hierarchical
structures to be built upon.
Jackendoff (2002) identifies certain fossil principles of language, such as Agent
First (where the agent precedes the patient/theme: e.g. Bear chase boy); Grouping
(where modifiers are grouped next to the words they modify: e.g. Brown bear chase
boy); Topic First (where the topic of the sentence appears before the comment);
Focus Last; etc. These are still not syntactic principles, although they can be seen
as precursors to such principles, especially the connection between the linear word
order and the semantic/thematic role. These also relate to what Givón (1979) has
called the pragmatic mode of language, which preceded the syntactic mode. In this
respect, Hurford (2012) suggests that the first proto-sentences involved a topic-
comment dichotomy, from which even the categories of nouns and verbs emerged.
According to Hurford, the first proto-sentences consisted of two words juxtaposed,
always in the topic-comment word order (653). It would be only at a later
stage that topics would give rise to subjects, and comments would give rise to
predicates.34

33
While Jackendoff (1999, 2002) has identified some fossils of the postulated one-word stage (proto-words
which are not combinable with other words, such as ouch, wow, shhh, etc.), I have identified some
rigid small clause structures in English (e.g. Case closed) and Serbian (Pala karta ‘Card played’), which also
seem to be syntactic isolates in the sense that they cannot combine further (Chapters 2–4).

While all these ideas can in principle be plausible, without having a theory that
organizes these proposed principles, as well as illuminating the transition from one
stage to the next, it is hard to know what kind of evidence bears on these hypotheses.
It is also not clear from such a characterization what selection pressures, and what
communicative advantages, led to the transition from the topic-comment based, or
Agent-First based, language to the subject-predicate based language. For example, as
Jackendoff’s Agent-First protolanguage is already capable of communicating who
does what to whom, and it already allows more than one argument per verb, it is not
clear why there would have existed selection pressures to transition to the (transitive)
subject-predicate grammars. In contrast, on my approach, which starts from an
intransitive absolutive-like foundation, the communicative advantages of developing
transitivity can be characterized clearly and precisely. In addition, this reconstruction
down to the intransitive absolutive-like layer allows me to connect this proposal
meaningfully to the typological variation across languages, as well as to the hominin
timeline (Chapter 7).
More generally, Jackendoff (2002: 238) considers that protolanguage consisted of
the following separable components: use of symbols; concatenation of symbols; use
of open-class symbols; and use of word order to convey semantic relations. The
hierarchical stage then adds to the protolanguage the following: symbols encoding
abstract semantic relations, as well as grammatical categories and grammatical
functions, including subject vs. object. This is the sense in which, for Jackendoff,
the hierarchical stage subsumes protolanguage. However, as pointed out above, it is
not clear how and why, and in what manageable, incremental steps, one proceeds
from the concatenation of symbols with Agent First to a hierarchical transitive
sentence with a subject and object. My approach puts emphasis on this incremental
progression through stages, as well as on the specific communicative advantages
gained with each incremental step.
Bickerton’s (1990, 1998) influential work is also relevant here, as he has proposed
that pidgin languages are indicative of our ability to tap into the proto-linguistic
mode. However, in his view, pidgin languages (or child language) have no syntax,
which leads him to a saltationist view of the emergence of syntax, from no syntax at
all, to full-blown hierarchical syntax. Bickerton’s main reason for considering
protolanguage to be without any syntax is his observation that the arguments routinely go
missing in these systems, and that syntactic language must obligatorily realize all the
arguments of the verb. For Bickerton, these asyntactic systems do not even involve
true language. Notice that Givón (1979: 296) has also proposed that there was a pre-
syntactic, pragmatic mode of discourse, which had a “low noun to verb ratio.”

34
However, see Casielles and Progovac (2010, 2012) for the idea that the so-called thetic statements are
evolutionarily primary, and that they preceded categorical statements, those which feature such topic-
comment dichotomies. Some examples of thetic statements include e.g. English It rained, and Serbian Pao
sneg (‘Fell snow’), which do not involve a clear dichotomy between a topic and a comment, or between a
subject and a predicate, but rather describe an event as a whole. Such thetic statements also typically do not
have agents, and often overlap with unaccusative constructions (see Section 3.3).

My approach has elevated the stage characterized by intransitivity (one argument
per verb) not only to the level of language, but also to the level of (simpler) syntax.
Given the logic behind the two-slot syntax, one is not dealing with missing argu-
ments here, but rather with a coherent syntax which can accommodate only one
argument per verb. Not only is this kind of proto-syntax syntactic and language-like,
but such fossil structures are still available across various constructions and lan-
guages, including in English. In my analysis, the proto-syntactic stages clearly show
continuity with the more innovative stages of syntax.
As pointed out above, Jackendoff (2002) advocates that word order in the proto-
language stage followed the semantic ordering of Agent First. In contrast, my
argument is that the initial stages were absolutive-like, with agent and patient not
being structurally differentiated at all. However, notice that Agent First may be
relevant after all, even in my approach, although in a somewhat roundabout way.
As discussed throughout the book (e.g. Section 3.4; Chapter 4), the paratactic small
clause combinations of the kind attested in Nicaraguan Sign Language (e.g.
WOMAN PUSH – MAN FALL) may have provided precursors to accusative-type
transitivity (e.g. Senghas et al. 1997). We notice here that WOMAN is interpreted as
an agent, and MAN as a patient, but this is not directly related to the Agent-First
principle, given that one is dealing with two clauses here. Instead, this may ultimately
reduce to a Cause First principle, which is operative in a much wider array of paratactic
combinations, including e.g. Easy come, easy go; Nothing ventured, nothing gained. In
other words, given that the first clause (e.g. WOMAN PUSH; Easy come) is inter-
preted, roughly speaking, as Cause of the second clause (MAN FALL; Easy go), then
WOMAN will be seen as the agent/causer. This will be discussed further, especially in
Chapters 3 and 4.
While Jackendoff’s work, as well as Bickerton’s, is more about characterizing the
fancy properties of modern syntax, which they list and illustrate, my approach is
more about envisioning and illustrating what the initial, early syntax was like, in its
own right, and its own logic. I offer a lot of data from present-day languages which
arguably approximate these early stages, emphasizing the symmetry and flatness of
proto-syntax, as opposed to asymmetry and hierarchy of modern syntax. Interest-
ingly, if my approach is correct, it suggests that syntax was autonomous in this very
early stage: while thoughts can be fluid, with many participants associated with one
verb or event, the proto-syntactic mold only allowed one such participant to occur
with the verb. In other words, the syntactic form did not just follow from the
semantic principles, or from the properties of events, but it imposed its own logic
and constraints.
The fossil structures discussed in Jackendoff (1999, 2002) also include English
noun-noun (NN) compounds, such as snowman, doghouse, housedog. However, even
though seemingly simple, these compounds, at least in present-day English, are
headed and recursive, which may suggest more complex structure. Interestingly,
this kind of NN compounding process is not productive, and certainly not recursive,
in e.g. Serbian (see Snyder 2014 for other languages in this respect). Still, if there was a
proto-syntax stage with verb-like and noun-like proto-words (as per Heine and
Kuteva’s 2007 reconstruction), then it would stand to reason that one should have
been able to combine not just verbs with nouns, but also nouns with nouns, and verbs
with verbs, as pointed out by a reviewer. However, the method of reconstruction that
I follow, and the fossil evidence that I have gathered, do not lead to a clear conclusion
in this respect, and I thus leave NN compounds for future research.
As for the verb-verb (VV) combinations, the discussion above suggests that at
least some serial verb constructions across languages probably have a complex
clausal origin, rather than just being plain VV compounds or concatenations (see
Section 3.4.1 for more discussion on this). However, one does find an occasional
VV compound which can be of evolutionary significance, such as Macedonian
veži-dreši (tie-untie ‘an ignorant person’), consisting of two imperative forms
strung together (Olga Tomić, p.c., 2006); see especially Section 6.4 for the relevance
of imperative morphology in compounds. Possibly of interest are also English tie-
dye and French passe-passe (‘sleight of hand’).
Finally, a reviewer wonders if my approach cannot be somehow reconciled with
the saltationist views (e.g. Chomsky 2005; Berwick and Chomsky 2011), if one can
interpret their position to be that one single mutation occurred at a point when
protolanguage in Bickerton’s sense was already in place, in which case this one single
mutation would have brought about hierarchical syntax. First of all, there are clear
claims in Berwick and Chomsky (2011: 29–31) and elsewhere in this line of work to the
effect that “there is no room in this picture for any precursors to language—say a
language-like system with only short sentences,” as quoted in full in Section 1.1.
Bickerton (1990, 1998) shares this view. Some saltationists (e.g. Piattelli-Palmarini
2010: 160) go even further to propose that even unstructured protolanguage in
Bickerton’s sense could not have existed (see also Section 4.2.2).
Crucially, my approach has isolated a coherent two-slot, flat stage in the evolu-
tion of syntax, which provides a clear transition/intermediate stage between one-
word protolanguage and hierarchical syntax. Not only that, but my approach
identifies specific kinds of (paratactic) precursors to all the hierarchical phenomena
discussed in the book, showing that the incremental, scaffolding approach is on the
right track. In this respect, my approach differs sharply not only from Berwick and
Chomsky’s approach, but also from Jackendoff’s and all the other approaches to
language evolution (see Chapter 7 for more discussion on this). Importantly, one
cannot object here that what I have reconstructed is not real language or real syntax.
As will be shown, such structures are found, in some form or another, all over
present-day languages.

1.7 Syntactic theory

As an influential approach to syntax for an extended period of time, Minimalism (e.g.
Chomsky 1995) and its predecessors (the frameworks leading to it, such as Government
and Binding and Principles and Parameters) have accrued many insightful
generalizations and analyses which provide important tools for analyzing structure.
However, Minimalism has been in flux, and it would be unwise to base a whole
evolutionary framework on one particular version of Minimalism. My solution to
that is to rely only on well-established theoretical postulates: those which have
withstood the test of time and empirical scrutiny within this framework, and which
date back to the predecessors of Minimalism, as well as surviving into various later
versions. In this section, I will thus present only some of those stable postulates, in
particular those that will be useful for further discussion. For a fuller picture of the
framework that I am adopting here, the reader is referred to e.g. Adger (2003) and
Chomsky (1995). The discussion of Minimalist (and other theoretical) notions in this
monograph will be made accessible to a non-expert, and only those postulates will be
discussed that are relevant for further discussion.
As pointed out in Section 1.2.2, the theory of clause/sentence structure adopted in
Minimalism (e.g. Chomsky 1995) and its predecessors involves a hierarchy of func-
tional projections, which includes at least the following projections (several more
projections have been postulated, but these are the ones that are largely agreed upon):
(12) CP > TP > vP > VP/SC
The structure is built bottom up, so that the inner VP/SC layer is formed first, to
accommodate the verb/predicate and one argument.35 The next layer, vP, accom-
modates an additional argument, such as agent in transitive structures. TP (Tense
Phrase) accommodates the expression of tense and finiteness, while CP (Comple-
mentizer Phrase) accommodates e.g. subordination/embedding, and wh-movement,
as discussed below.
The following derivation illustrates how one constructs a transitive sentence (TP)
in this manner, starting from a small clause.
(13) Maria will roll the ball.

35 Sometimes the initial combination of a verb and a noun is referred to as a VP (Verb Phrase), and
other times as a SC (Small Clause). Even though it is a bit clumsy, I will use the label VP/SC here in
order to reflect that reality, as well as because vP is a label for the light verb phrase, which is considered
to be an additional layer of verb phrase. The layer of the vP structure on top of the VP is referred to as a
vP/VP shell.

(14) a. [SC/VP roll the ball] →
b. [vP Maria [SC/VP roll the ball]] →
c. [TP Maria will [vP Maria [SC/VP roll the ball]]]
[The strike-out notation is used to represent the original (pre-Move) copy of the
moved constituent.]
In deriving the sentence in (13), one starts with the basic, small clause layer in (14a).
At this point, one cannot know if the ball will be the subject or the object of the
sentence (see 16 below). Then, the agent (Maria) is merged in the higher vP layer
(14b), which is now responsible not only for accommodating this additional argu-
ment, but also for assigning (abstract) accusative case to the object (the ball). Finally,
the TP layer is projected on top of the vP layer, and “Maria,” the highest argument,
moves to become the subject of the TP (14c).36
It is of note here that a corresponding intransitive sentence (15) can be derived
without the vP layer, simply by moving the argument of the small clause to the TP
layer (see e.g. Kratzer 1996, 2000; Chomsky 1995: 214 for unaccusatives):
(15) The ball will roll.
(16) a. [SC/VP roll the ball] →
b. [TP The ball will [SC/VP roll the ball]]
The two derivations above illustrate the fluid, relative nature of
subjecthood. While “the ball” in (13) is the object, in (15) it is the subject, even though
its semantic role with respect to the event of rolling seems to be the same. This will be
relevant for the later discussion of proto-syntactic structures, especially those involv-
ing absolutive-like roles in Chapter 3.
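The two derivations in (14) and (16) can be sketched in code as a sequence of Merge steps followed by Move. The Python fragment below is only a minimal illustration under stated assumptions: the names merge, mark_copy, and move_to_spec are hypothetical, constituents such as the ball are treated as unanalyzed strings, and angle brackets stand in for the strike-out notation that marks the pre-Move copy. It is not meant as an implementation of any particular version of Minimalism.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Copy:
    """Original (pre-Move) copy of a moved constituent (illustrative)."""
    item: "Node"


@dataclass(frozen=True)
class Phrase:
    """A labeled layer of structure, e.g. SC/VP, vP, or TP."""
    label: str
    parts: Tuple["Node", ...]


Node = Union[str, Copy, Phrase]


def merge(label, *parts):
    """Merge: combine the given parts under a new label."""
    return Phrase(label, tuple(parts))


def mark_copy(node, target):
    """Return node with each occurrence of target wrapped as a Copy."""
    if node == target:
        return Copy(target)
    if isinstance(node, Phrase):
        return Phrase(node.label, tuple(mark_copy(p, target) for p in node.parts))
    return node


def move_to_spec(label, target, head, complement):
    """Move: project a new layer whose first element is target,
    leaving a marked copy of target inside the complement."""
    return merge(label, target, head, mark_copy(complement, target))


def show(node):
    """Render a structure in labeled-bracket notation."""
    if isinstance(node, str):
        return node
    if isinstance(node, Copy):
        return "<" + show(node.item) + ">"
    return "[" + node.label + " " + " ".join(show(p) for p in node.parts) + "]"


# Transitive derivation, as in (14): SC/VP, then vP, then TP.
sc = merge("SC/VP", "roll", "the ball")
vp = merge("vP", "Maria", sc)
tp_transitive = move_to_spec("TP", "Maria", "will", vp)

# Intransitive derivation, as in (16): SC/VP, then TP, with no vP layer.
tp_intransitive = move_to_spec("TP", "the ball", "will", sc)

print(show(tp_transitive))
print(show(tp_intransitive))
```

Both derivations start from the same small clause; only the transitive one passes through the vP layer, which is why the ball ends up as the object in one case and as the subject in the other.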
What is of relevance here is that the modern syntactic theory associated with the
Minimalist Program (e.g. Chomsky 1995, 2001) analyzes every clause/sentence as
initially a small clause (SC) which gets transformed into a full TP only upon
subsequent Merge of tense, and subsequent Move of the subject to TP in English.
This kind of analysis was originally proposed in Stowell (1981, 1983); Burzio (1981);
Kitagawa (1985, 1986); and further solidified in the work of Koopman and Sportiche
(1991); Chomsky (1995); and many others. This is thus one of those solid postulates in
this framework that has withstood the test of time and empirical scrutiny.
The import of this incremental structure building for the evolutionary proposal in
this book will become clearer in Chapters 2 and 3. The significance of the operation
Move from the evolutionary standpoint will be discussed in Sections 2.2–2.4 and
4.4.5. Further, Section 5.2 discusses wh-movement in the context of Subjacency,
showing where wh-movement is possible, and where it is not.

36 In fact, there is an additional Move operation postulated in this structure, that is, Move of the verb in
V to the position of the light verb in v, but this operation is not directly relevant to the considerations in
this book, except to characterize Move as a force whose more general purpose may be to connect the layers
of structure, as per Section 4.4.5.

Moving up the hierarchy in (12), wh-movement in e.g. English utilizes the CP layer
of structure, which is constructed on top of a TP:
(17) What will Maria roll?
(18) a. [SC/VP roll what] →
b. [vP Maria [SC/VP roll what]] →
c. [TP Maria will [vP Maria [SC/VP roll what]]]
d. [CP What will [TP Maria will [vP Maria [SC/VP roll what]]]]
What happens in step (d) is that the wh-word what Moves to the CP layer, and so
does the auxiliary verb will, as indicated by the strike-out notation. Many more
instances of wh-movement are to be found in Section 5.2.
There are many complexities involved in characterizing Merge and Move, includ-
ing the motivation for Move, the availability of landing sites for Move, and the
restrictions on Move, including Subjacency, the latter discussed at length in
Chapter 5. Minimalism typically relies on various grammatical features to implement
Merge and Move, that is, to predict when these operations will take place and when
they will not. The reader is referred to e.g. Adger (2003) for one detailed implemen-
tation of the feature checking theory.
Embedded sentences, introduced by the subordinator/complementizer such as
that or whether, also rely on the CP layer, as illustrated below:
(19) (I believe) that Maria will roll the ball.
(20) a. [SC/VP roll the ball] →
b. [vP Maria [SC/VP roll the ball]] →
c. [TP Maria will [vP Maria [SC/VP roll the ball]]]
d. [CP that [TP Maria will [vP Maria [SC/VP roll the ball]]]]
As discussed in Section 4.4, the CP layer may be instrumental in facilitating recursion
when it comes to clausal embedding. The following example illustrates how CPs can
be recursively embedded one within another:
(21) I believe [CP that John suspects [CP that Sue will acknowledge
[CP that Maria will roll the ball.]]]
As pointed out in Footnote 21, recursion in linguistics is typically characterized as one
type of category (in this case CP) embedding within another category of the same
type (another CP), potentially ad infinitum (see Section 4.4 for much more discussion
on recursion).
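The recursive embedding in (21) can be illustrated with a short recursive function, in which building a CP may call for building another CP inside it. This is only a sketch in Python: the function name embed_cp and the representation of clauses as plain strings are assumptions made for illustration, and nothing here is claimed about how embedding is actually computed.

```python
def embed_cp(frames, innermost):
    """Wrap the innermost clause in one CP per embedding frame.

    frames: subject-plus-verb strings, outermost first (illustrative).
    innermost: the most deeply embedded clause.
    """
    if not frames:
        # Base case: the lowest CP contains no further CP.
        return f"[CP that {innermost}]"
    # Recursive case: a CP that contains another CP of the same type.
    return f"[CP that {frames[0]} {embed_cp(frames[1:], innermost)}]"


frames = ["John suspects", "Sue will acknowledge"]
sentence = "I believe " + embed_cp(frames, "Maria will roll the ball")
print(sentence)
```

Adding further frames to the list deepens the embedding without any change to the function, which is the sense in which the same category can embed within itself without a principled upper bound.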
Just as functional projections are postulated on top of verb phrases, they are also
postulated on top of noun phrases, although there is less agreement among syntacticians
regarding the nature and number of such projections when it comes to the noun phrase.
One such functional projection postulated for English is a Determiner Phrase (DP)
projection (e.g. Abney 1987), as in:
(22) [DP the [NP book on recursion]]
(23) [DP Peter’s [NP book on recursion]]
In other words, a similar kind of hierarchical layering of structure seen with clauses
also seems to characterize noun phrases. In this respect, it should be pointed out that
nouns and verbs are considered to be lexical (content) categories, and NP and VP
lexical projections of nouns and verbs, respectively. On the other hand, vP, TP, CP,
and DP are all considered to be functional projections, as they are not direct
projections of lexical categories.
Section 4.4.3 returns to DP structures in the context of the discussion of DP
recursion in English and other languages. DP recursion is characterized in English
by a repeated embedding of one possessive DP within another, as illustrated below:
(24) [DP [DP [DP Peter’s] friend’s] book on recursion]
Section 4.4.3 discusses languages and constructions in which recursion is not
possible, as well as the evolutionary significance of that.
What this section provides is a mere skeleton of the theory of structure building
that is adopted in this book, to help the reader follow the discussion in the subsequent
chapters better. Various other details of this theory will be discussed only when and if
they become relevant.

1.8 Chapter-by-chapter overview

Chapter 2 focuses on TP-less (root) small clauses, mostly in English and Serbian, and
shows that such clauses are measurably simpler than finite TP counterparts, and that
they can be considered as (living) fossils of the small clause stage in the evolution of
syntax. These clauses lack finite verbs, structural case, the ability to question or move
constituents, and the ability to embed and show recursion.37 These are exactly the
characteristics postulated for the initial (paratactic) stage in the evolution of syntax.
Many among the root small clauses in both Serbian and English are now marginal,
formulaic expressions, although Serbian also has a type of small clause with a VN
order, which is used productively in parallel with its TP counterparts, as a true living
fossil. One finds corroborating evidence for the primacy of small clauses, and their
continuity with TP equivalents, in first language acquisition, second language
acquisition, and agrammatism, and potential testing grounds in the fields of neuroscience
and genetics.

37 It follows from the considerations in this monograph that (unbounded) recursion only becomes
possible in the later stages of the evolution of language, more precisely when specialized functional
categories emerge, as discussed at length in Chapter 4.

Chapter 3 builds on the small clause analysis of Chapter 2, and takes it further to
the conclusion that only intransitive small clauses, those that license only one
argument, can be considered as true fossils of the initial stage of syntactic evolution.
One arrives at the intransitive small clause stage via the method of internal recon-
struction by removing the outer layers of functional structure, in this case the TP and
the vP layers, the latter being responsible for transitivity and accusative case. The
chapter goes even further to claim that this stage was characterized by an absolutive-
like grammar, that is, an intransitive grammar in which the distinction between
subjects and objects is not syntactically expressed.38
Fossils of such an absolutive-like grammar are arguably found in all languages, and
include absolutives, unaccusatives, certain nominals, and exocentrics (e.g. exocentric
compounds).39 Consistent with the gradualist evolutionary proposal for the expres-
sion of transitivity, one also finds various types of intermediate structures, including
the so-called “middles,” which straddle the boundary between transitivity and
intransitivity, and between subjecthood and objecthood. Corroborating evidence
for an intransitive absolutive-like stage comes from child language acquisition, as
well as from the initial stages of spontaneously emerging sign languages. Finally,
neuroscience provides fertile testing grounds for this proposal.
Chapter 4 is concerned with the nature of the bond between the merged elements
in these proto-syntactic constructs, as well as with how the syntactic bond evolved
over time. In this respect, I identify three rough stages in the evolution of the syntactic
bond: (i) the Paratactic Stage; (ii) the Proto-Coordination Stage; and (iii) the Specific
Functional Category Stage (hierarchical stage), as outlined in Section 1.3. These
syntactic stages follow a hypothesized one-word stage. This progression is meant to
shed light on the gradual/incremental emergence of hierarchical syntax and recur-
sion, as well as on the existence of various transitional, ambivalent structures that
straddle the stage boundaries in modern languages.
In addition to a variety of living fossils of these stages found in modern languages,
this progression of stages finds corroboration in the studies of language acquisition,
agrammatism, grammaticalization, neuroscience, and animal communication. The
corroborating evidence is more robust for the paratactic stage than it is for the
proto-coordination stage, and this can be attributed to parataxis serving as the ultimate
foundation for both coordination and subordination. The chapter identifies some
concrete communicative advantages that each stage brings with it, making it possible
to invoke natural/sexual selection in evolving hierarchical grammars (that yield
recursion).

38 According to many authors, the notions of subjecthood and objecthood, which are descriptive terms
particularly suited for nominative-accusative patterns, are not useful distinctions to make when it comes to
ergative/absolutive patterns (e.g. Authier and Haude 2012; Blake 1976; Mithun 1994: 247; Shibatani 1998:
120; Tchekhoff 1973). This will be explained in detail in Chapter 3.
39 These terms will be explained and illustrated in Chapter 3.

Chapter 5 on islandhood (Subjacency) is a good example of how evolutionary
considerations of this kind can shed light on the very nature of language design, by
explaining certain phenomena observed in modern syntax, which otherwise remain
unaccounted for. Despite the sustained effort of syntactic theory for over forty years
to account for the islandhood effects, that is, for the existence of constructions that
prohibit Move, to date there has been no principled account. It is also significant that
the postulated arbitrariness of Subjacency, the principle that is supposed to capture
islandhood effects, has been used to argue that syntax could not have evolved
gradually: one does not see why evolution would target a grammar with Subjacency,
when its contribution to grammar is not clear, let alone its contribution to survival.
As put in Lightfoot (1991), “Subjacency has many virtues, but . . . it could not have
increased the chances of having fruitful sex.”
However, the approach explored here stands this argument on its head and shows
that subjecting syntax to a gradualist evolutionary scenario can in fact explain the
existence of islandhood effects. In this view, Subjacency is not a principle of syntax, or
a principle of any kind, but rather just an epiphenomenon of evolutionary tinkering.
Subjacency or islandhood can be seen as the default, primary state of language, due to
the evolutionary beginnings of language which had no Move. This default state can
be overridden in certain, evolutionarily novel, fancy constructions, such as hierarchical
CPs. To put it differently, given that the proto-syntactic stages (the paratactic and
proto-coordination stages) did not have Move (Chapter 4), the survivors of these stages,
adjuncts and conjuncts, continue to show islandhood effects.
Chapter 6 considers in detail what may be the best fossils we can access today of
the paratactic absolutive-like stage in the evolution of human grammar: the exocen-
tric VN compounds (e.g. turn-coat, cry-baby, hunch-back, pick-pocket, kill-joy, spoil-
sport). These fossils consist of just one verb and one noun, with the noun in the
absolutive-like role. Structurally speaking, these compounds are exactly what the
postulated paratactic stage would have looked like: a rigid combination of only two
elements, a verb and a noun, with no subject/object distinction, and with no Move or
recursion available. What is even more striking about these compounds is that they
specialize for derogatory reference (insult) when they refer to humans, in language
after language, which makes it plausible that comparable creations in the ancient past
could have been used for ritual insult, and could have thus contributed to the sexual
selection of this simplest type of syntax (Progovac and Locke 2009). Some corrob-
oration for the primary nature of VN compounds comes from language acquisition
and neuroscience, the latter in connection to the often obscene nature of these
compounds.
Chapter 7 considers the communicative advantages gained by each new stage, and
how the progression through each stage would have been guided by evolutionary
pressures. This chapter looks in detail at one concrete evolutionary scenario that
would have solidified the paratactic stage by way of sexual selection (see also
Section 1.4). The approach explored in this chapter thus offers a reconstruction of
how communicative benefits may have been involved in shaping the formal design of
human language. Finally, given that the postulated stages of the evolution of human
language are consistent only with certain hypotheses regarding human prehistory,
this approach can also help choose among some competing hypotheses about the
origins of the human species.
Chapter 8 summarizes and concludes, and considers future prospects and
promises.
The Appendix, written jointly with neuroscientist Noa Ofen, considers how the
main claims advanced in this monograph can be subjected to neuroimaging testing.
There is already reasonable evidence from neuro-linguistics establishing that
increased syntactic complexity correlates with the increased neuronal activation in
the specific areas of the brain. The Appendix builds on that and proposes an
experimental design for how specific hypotheses advanced in this monograph can
be tested. To take one example, one can compare and contrast the processing of
TP-less and vP-less small clauses/compounds with the processing of their hierarch-
ical counterparts, which use the same vocabulary, and only differ with respect to tiny
grammatical pieces, making hardly any difference in meaning in controlled contexts.
While the processing of full hierarchical structures is expected to show clear
lateralization in the left hemisphere, with extensive activation of some specific
Broca’s areas, the proto-structures, including root small clauses and VN compounds,
are expected to show less lateralization, and less involvement of Broca’s area, but
more reliance on both hemispheres, as well as, possibly, more reliance on the
subcortical structures of the brain.

2 The small (clause) beginnings

2.1 Introduction
My proposal in this chapter is based on very specific claims, whose feasibility can be
evaluated and tested both in the theory of syntax and in neuroimaging experiments,
as well as corroborated by the findings in several other fields. Specifically, the
proposal is that the first “sentences” were paratactic (not hierarchical, not headed)
combinations of a verb and just one argument, akin to present-day intransitive
small clauses (see e.g. Progovac 2006, 2008a,b, 2009a,b). The claim is that such
structures are still found across various constructions in present-day languages,
“living fossils” of this stage of grammar (see Jackendoff 1999, 2002 for the idea of
living fossils in syntax). A living fossil in syntax can be characterized as a measurably
simpler syntactic construct which nonetheless shows continuity with more complex
counterparts, and which can be reconstructed back to a time when such complex
counterparts were not available.1 My take on small clauses as living fossils is that they
have been preserved within modern sentences, rather than have been replaced by
them.2

1 I believe that the term living fossil is appropriate to use in the context of my analysis, even though the
rigor of proof may not be identical to what one finds in biology. As pointed out in Chapter 1, biologists
consider lungfish to be a living fossil because it has changed little from its evolutionary past. In the case
of lungfish, actual fossils, identical to modern Queensland lungfish, have been found and dated at over
100 million years, proving that lungfish is a living fossil (the term first used by Charles Darwin).
First of all, this suggests that living fossils in principle are possible, and that they can survive millions of
years, living side-by-side more modern species. Second, even if biologists had not discovered the actual
fossils, lungfish would still be a living fossil, and perhaps there would be some other, less direct way to
prove this, or at least to hypothesize this. In other words, what I hypothesize to be living fossils of language
evolution are not identical to, but are closely comparable to what is considered to be living fossils in
biology. Clearly, the proposal that I am exploring here requires a different kind of proof.
2 As pointed out in Chapter 1, the fossils as discussed in this monograph can only be seen as
approximations of the structures once used in the deep evolutionary past. Such fossils in present-day
languages may show morphological markings and other complexities which were not there in
proto-syntax. These structures count as fossils in some relevant respect under consideration, for example, in their
lack of a TP, but not in all their properties.

Evolutionary Syntax. First edition. Ljiljana Progovac
© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

Many have proposed a simpler stage of syntax in the evolution of human language
involving concatenation (e.g. Givón 1979; Dwyer 1986; Bickerton 1990; Jackendoff
1999, 2002; Culicover and Jackendoff 2005; Deutscher 2005; Burling 2005; Hurford
2007, 2012; Tallerman 2007, 2013a, b, 2014a, b). Progovac (2006, 2008a,b, 2009a,b;
2013b, 2014a) has connected this idea to the well-known construct in syntax, “small
clauses.” According to her proposal, small clauses used in isolation lack (at least) the
TP (Tense Phrase) layer of structure, typically associated with modern sentences in
Minimalism, and can be reduced to a single layer of structure. My argument is that
comparable small clauses served as precursors to more complex sentences (TPs),
given that they are syntactically measurably simpler (shorter by one or more layers of
structure), and given that they provide a foundation for building TPs.
Crosslinguistically, small clauses (SCs) are both pervasive and robust syntactic constructs, occurring
in root contexts (as RootSCs), as embedded small clauses, as loosely attached adjuncts
or conjuncts, and, most importantly, they also serve as foundation for building full
sentences, according to Minimalism (e.g. Chomsky 1995, and subsequent work) and
predecessors to Minimalism (see Section 1.7).
The argument for the proposed progression from a small clause to a TP stage has
three prongs to it: (i) providing evidence of “tinkering” with the language design, in
the sense that older structures (i.e. small clauses) get built into more complex
structures (i.e. TPs); (ii) identifying “living fossils” of the small clause stage in modern
languages; and (iii) identifying existing or potential corroborating evidence and
testing grounds, from language acquisition, agrammatism, genetics, and neurosci-
ence. Moreover, the goal is to show that each identified stage accrues concrete and
tangible advantages over the previous stage(s), advantages that were significant
enough to be targeted by natural/sexual selection.
The method for hypothesizing previous stages of syntax can be roughly charac-
terized as internal reconstruction, based on the syntactic theory of structure building
adopted in e.g. Minimalism (e.g. Chomsky 1995). My proposal is that a structure X is
considered to be primary relative to a structure Y if X can be composed independently
of Y, but Y can only be built upon the foundation of X. While small clauses can be
composed without the TP layer, TPs must be built on the foundation of a small clause,
providing syntactic proof that small clauses are more primary than TPs (see Section 1.2.2
for the definition of the term “primary” as used in the context of a reconstruction).
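The definition of “primary” given above can be restated as a simple check over dependencies between layers. The Python sketch below is offered only as an illustration: the table REQUIRES is an assumed simplification of the hierarchy in Section 1.7 (a TP needs a small clause but not necessarily a vP, and a CP is built on a TP), and the names foundations and is_primary are hypothetical.

```python
# Assumed simplification: which layers each layer must be built upon.
REQUIRES = {
    "SC/VP": set(),      # a small clause can be composed on its own
    "vP": {"SC/VP"},     # the vP shell is built on the small clause
    "TP": {"SC/VP"},     # intransitive TPs can skip the vP layer
    "CP": {"TP"},        # CP is constructed on top of a TP
}


def foundations(layer):
    """Return every layer that `layer` directly or indirectly requires."""
    seen = set()
    stack = list(REQUIRES[layer])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(REQUIRES[current])
    return seen


def is_primary(x, y):
    """X is primary relative to Y if X can be composed independently of Y,
    while Y can only be built upon the foundation of X."""
    return y not in foundations(x) and x in foundations(y)


print(is_primary("SC/VP", "TP"))
print(is_primary("TP", "SC/VP"))
```

Under these assumptions the small clause comes out as primary relative to the TP, but not the other way around, and the vP does not come out as primary relative to the TP, since an intransitive TP can be built without it.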

2.2 Root small clauses in English

Consider the following three types of (marginal) RootSCs (i.e. small clauses used in
root/unembedded contexts): the so-called “Mad Magazine”/incredulity clauses in (1)
(Akmajian 1984); imperative/optative clauses in (3); and RootSCs involving past
participles in (5).3 While (2), (4), and (6) can be considered their respective full

3
The examples of small clauses offered in this chapter are all intransitive, consistent with my proposal
that transitivity is a later evolutionary development in clause building, as well as with the syntactic analysis
OUP CORRECTED PROOF – FINAL, 8/5/2015, SPi

Root small clauses in English 35

sentential counterparts, no sentential paraphrase completely captures the expressive

power of RootSCs, which seem to convey a sense of urgency and immediacy. This is
just one indication that RootSCs are not simply elliptical versions of full sentences.
(1) Him retire!? John a doctor?! Sheila happy?!
Me in Rome?!
(2) Is he going to retire? Is John a doctor? Is Sheila happy?
Am I in Rome?
(3) Me first! Family first! Everybody out!
(4) I want to be first! Family should be first! Everybody must
go out!
(5) Problem solved. Case closed. Point taken. Crisis averted.
Mission accomplished. Lesson learned.
(6) The problem has been solved. The case is closed. The point is
taken. The crisis has been averted. The mission has been
accomplished. The lesson has been learned.
The clauses in (1, 3, 5) combine a noun/pronoun with a predicate (typically verb),
resulting in a predication structure, arguably without functional projections, at the
very least without a TP.4
While root small clauses are typically not discussed in syntactic literature, being
delegated to the periphery (but see Progovac 2006), embedded small clauses, such as
the bracketed clauses in (7) below, have received a lot of attention.
(7) I wanted [the problem solved].
I imagined [Sheila happy].
There are competing analyses of embedded small clauses, including some that ascribe
quite complex structure to them (see e.g. Cardinaletti and Guasti 1995; Dubinsky et al.
2000).5 However, the tendency is still, overwhelmingly, to label them as “SCs,”

according to which transitivity involves an extra syntactic layer, a vP shell, as disussed in Chapter 3.
I assume here that passive-looking clauses in (5) do not involve a vP shell or Move(ment). Although I do
not provide a specific analysis for these clauses, my approach allows for them to be treated as middle-like
constructions, as discussed in Chapter 3.
4
The syntactic analysis of this kind of “nonsentential” speech is based on Barton (1990), Barton and
Progovac (2005), and Progovac (2006, 2013a) (see also Tang 2005). Fortin (2007), who embeds her analysis
in the phase framework of Minimalism (e.g. Chomsky 2001), also argues for the nonsentential analysis of
certain syntactic phrases, such as adverbials, vocatives, and bare unergative verbs, but she specifically
argues against such an analysis of any propositional constructs, as are small clauses in (1, 3, 5), which are the
focus of this chapter.
5
One influential early analysis is Stowell (1981), which treats such clauses, at least in the embedded
contexts, as headed structures, that is, as Adjective Phrases (APs) (Me first!), Verb Phrases (VPs) (Him
OUP CORRECTED PROOF – FINAL, 8/5/2015, SPi

36 The small (clause) beginnings

suggesting hesitance to commit to an analysis that renders them projections of their

predicate, or of something else. In fact, the simplest among them may just be
paratactic creations, in which the subject and the predicate combine by the operation
Conjoin, which is akin to the operation Adjoin used in syntactic theory to account for
the attachment of various adjuncts (e.g. adverbials) (for details, see Chapter 4).
Conjoin here is to be understood in the simple sense “join together, unite, combine,”
rather than in the sense of using a conjunction. Chapter 4 returns to the character-
ization of Conjoin, and paratactic attachment in general (Section 4.2). Sufﬁce it to say
here that Conjoin does not create headedness or hierarchy, but rather creates
structures which can be characterized as ﬂat, and not asymmetrical.
Uriagereka (2008) looks at the embedded SCs such as the ones in (7), and
concludes that their structure is rather basic, and may involve the simplest type of
syntax in Chomsky Hierarchy (Chomsky 1956; for characterization and detailed
discussion of the Hierarchy, see e.g. Hurford 2012). One of the arguments Uriagereka
invokes for the basic nature of (embedded) small clauses is the long-noted observa-
tion that these clauses do not have an internal source of abstract (structural) Case for
their subjects, which are instead assigned Case by clause-external elements, such as
the verbs want or imagine in (7).
According to Progovac (2006), RootSCs likewise do not have a structural mech-
anism for assigning case to their subjects, providing another argument that they are
comparable to embedded SCs. Since with RootSCs there is no external source of case
either (they are not embedded under a verb), their subjects surface with what can be
characterized as default case (in the sense of e.g. Schütze 2001)—witness the accusa-
tive case on the pronominal subjects in (1,3).6 The evolutionary perspective sheds

worry?!), Prepositional Phrases (PPs) (Everybody out!), etc., based on the category of the predicate. Such
clauses, however, have properties that suggest that they form a natural class, which would not be captured if
they were labeled differently. For example, all of them can embed under a verb like want or imagine (i),
even though these verbs cannot otherwise take APs or PPs as their complements (ii).
(i) He imagined [everybody out]/[the problem solved]/[me happy].
(ii) *He imagined [very happy]/[out of there].
[The “*” marks an ungrammatical structure.]
That is one reason why the label small clause (SC) still persists, tacitly acknowledging that we do not know
what heads these structures, if anything at all.
6
As noted in Progovac (2006), small clauses found in isolation (e.g. Problem solved. Case closed) are even
“smaller” than corresponding embedded small clauses, which require a determiner in the following examples:
(i) I consider the problem solved. / *I consider problem solved.
(ii) I consider the case closed. / *I consider case closed.
Essentially, some kind of grammatical relationship or functional projection is needed for syntactic
embedding/subordination (see Section 4.4), and for that reason bare small clauses, such as the ones
found in isolation, cannot embed (see also further discussion in the text).
The presence of the determiner the in embedded clauses instantiates the DP (Determiner Phrase) layer,
which, according to e.g. Longobardi (1994), is necessary to establish a case-checking relationship between the
higher verb and the subject of the small clause in English (Progovac 2006 has more discussion on this). This
relationship seems to provide sufficient grammatical “glue” to allow embedding in this case (see also Section 4.4).

light on the existence of both embedded SCs and RootSCs—both can be seen as
“living fossils” of a proto-syntactic stage in which, presumably, clauses were put
together by an operation akin to adjunction (i.e. Conjoin), and in which there were
no functional categories or projections to facilitate grammatical relationships.
In any event, small clauses do not have the functional power to assign their
subjects a structural (nominative) case. In Minimalism, structural nominative case
is associated with the projection of TP, providing another argument that root small
clauses are not TPs. The next chapter will establish that absolutive-type fossils show a
comparable property of not having the functional power to assign structural (accusa-
tive) case to their objects.
As established in Section 1.7, the modern syntactic theory associated with Minim-
alism analyzes every clause/sentence as initially a small clause (SC) (examples below
in (a)), which gets transformed into a full TP only upon subsequent Merge of tense
(examples in (b)), and subsequent Move of the subject to TP in English (examples in
(c)). In other words, according to this influential analysis, the layer of TP is superimposed
upon the layer of small clause:7
(8) a. Small clause: [SC Sheila happy]
b. [TP is [SC/AP Sheila happy]] →
c. Sentence: [TP Sheila [T’ is [SC/AP Sheila happy]]]
(9) a. Small clause: [SC Peter retire]
b. [TP will [SC/VP Peter retire]] →
c. Sentence: [TP Peter [T’ will [SC/VP Peter retire]]]
(10) a. Small clause: [SC (the) problem solved]
b. [TP is [SC/VP (the) problem solved]] →
c. Sentence: [TP The problem [T’ is [SC/VP (the) problem solved]]]
The (a) examples above involve only one clausal projection/layer, which can be
uniformly characterized as a SC (small clause). The full finite clauses in (c) have at
least two layers of clausal structure: the inner SC layer, and the outer TP layer, clearly
creating hierarchical structure. In other words, small clauses morph/transform into
TPs, as if the building of the modern sentence (TP) retraces its evolutionary steps.
The kind of derivation (from SC to TP) illustrated in (8-10) is the commonly
accepted postulate in Minimalism and predecessors, dating back to the early 1980s. In

7
I do not use the vP shells of Minimalism here because I only discuss intransitive clauses, for which the
vP shell is arguably not necessary. Chapter 3 discusses this issue in greater detail and actually proposes that
the vP shell should be seen as a later evolutionary innovation, an additional layer of functional structure
superimposed over the foundational VP layer, introducing agency and transitivity. In this view, original
small clauses did not have vP shells.

fact, this is one of the most stable postulates in this approach, which has survived
many changes of analysis and focus. In general, I base my proposal on the discoveries
and claims which are reasonably uncontroversial within this approach, and which
both predate Minimalism and survive into many later versions. Such stable postu-
lates include not only the derivation of the sentence from the underlying small clause,
but also the layering of sentential structure (e.g. CP>TP>vP>SC/VP), as established
in Chapter 1.
Recall from Chapter 1 that my proposal relies on an internal syntactic reconstruc-
tion to arrive at the intransitive small clause proto-syntax:

Internal reconstruction of clause structure

A structure X is considered to be (evolutionarily) primary relative to a structure
Y if X can be composed independently of Y, but Y can only be created upon the
foundation of X.
This reconstruction claims that there was a point in time when only the primary
structures X were available, but not the structures Y, as explained in Section 1.2.2.
While small clauses (and VPs) can be composed without the vP or TP layers, vPs and
TPs need to be built upon the foundation of a small clause/VP.
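
The reconstruction criterion can be stated as a simple check over a table of dependencies. The sketch below is a hedged illustration only: the dictionary FOUNDATION and the function names requires and is_primary are hypothetical, and the table records nothing beyond the dependencies stated in the preceding paragraph (vPs and TPs are built upon a small clause/VP, which itself needs no further foundation).

```python
# Hypothetical table: each layer maps to the layers it must be built upon.
FOUNDATION = {
    "SC/VP": set(),
    "vP": {"SC/VP"},
    "TP": {"SC/VP"},
}

def requires(y, x, foundation=FOUNDATION):
    """True if building y presupposes x, directly or via intermediate layers."""
    pending = list(foundation[y])
    seen = set()
    while pending:
        layer = pending.pop()
        if layer == x:
            return True
        if layer not in seen:
            seen.add(layer)
            pending.extend(foundation[layer])
    return False

def is_primary(x, y, foundation=FOUNDATION):
    """X is primary relative to Y: X is composable without Y, but Y needs X."""
    return requires(y, x, foundation) and not requires(x, y, foundation)

print(is_primary("SC/VP", "TP"))   # small clause relative to TP
print(is_primary("TP", "SC/VP"))   # the reverse relation
```

The check is deliberately asymmetric: it only orders two structures when one of them presupposes the other, and it says nothing about pairs of layers that the table leaves unrelated.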
The ambivalence as to how to analyze small clause structures is the reason behind
the persistence of the vague label “SC.” While the syntactic theory postulates that
every category (head) projects a phrase (e.g. a noun projects a noun phrase; a verb
projects a verb phrase; and a tense projects a tense phrase), and every phrase is a
projection of a head, the label SC defies this very important postulate, as it is not clear
at all what heads small clauses. One way to deal with this situation is to concede that
small clauses are not modern hierarchical formations, but rather fossils of a paratac-
tic proto-syntax stage, whose formations were neither headed nor hierarchical.
Kinsella (2009: 44) raises the question of why one should have Move in the
syntactic theory, in addition to Merge, given that even in the Minimalist Program
Move is sometimes considered to be “more costly” than Merge.8 This question is
related to the question of why every sentence should begin as a small clause in the
first place. But if the small clause core of the sentence can be seen as the first step in
building sentential structure, as the paratactic scaffolding, then Move can be seen as a
force which connects different layers of sentential derivation, created by evolutionary
tinkering.9 In other words, the building of the sentence bottom up, from small clause
to TP, may be seen, metaphorically, as retracing the steps of the evolutionary
development of the sentence. Neither bottom-up sentence building, nor small clausal

8
Kinsella (2009) is a published version of Parker (2006).
9
But see e.g. McDaniel (2005) for a different view of Move, as discussed in Section 4.4.5. See also
Hurford (2012) who considers that the initial impetus for Move may have been pragmatically driven, e.g.
for focalizing or topicalizing purposes.

beginnings of the sentence, nor Move, would then need to be considered as concep-
tual necessities, but rather as just epiphenomena of evolutionary tinkering.10
In support of the claim that modern clauses have (at least) two layers of structure,
notice that they can have two subject positions: one in which the subject is first
Merged in the small clause, and the other in which the subject actually surfaces, after
Move (8-10). In certain sentences, both subject positions can be overtly filled, as
in (11), where The jurors occupies the TP position and all the small clause position
(see e.g. Koopman and Sportiche 1991):
(11) [TP The jurors [T’ will [SC/VP all rise]]].
The root counterpart of the small clause in (4) is also in use:
(12) All rise!
In this sense, then, a SC is indeed a half-clause in comparison to the corresponding
TP, with a substantial overlap between the two layers, as will become even more
obvious when we consider unaccusative small clauses in Serbian in the next section.
There is also some division of labor between root small clauses and their full
sentential counterparts: while full TPs specialize for indicative mood and assertion,
root small clauses tend to exhibit elsewhere, non-indicative, “irrealis” functions,
ranging over expressions of incredulity, commands, wishes, etc. Root small clauses
in general also specialize for the here-and-now, as further discussed in the following
section. It is important to point out that overlap, and partial specialization, are
hallmarks of evolutionary tinkering, but not of optimal design. According to e.g.
Carroll (2005: 170–1), evolving multiple means to the same end creates the oppor-
tunity for the evolution of specialization through the division of labor.11
In the evolutionary perspective, if there was a stage of proto-syntax characterized
by root small clauses, then in that stage such clauses were probably able to express
assertions as well, there not yet having arisen the opportunity for the division of
labor.12 The emergence of Tense/TP would have created such an opportunity for
specialization between small clauses and full TPs.
A similar scenario, which can illustrate how the emergence of a more
specialized category can lead to division of labor, has been reported for the
grammaticalization of tense and indicative mood in more recent, historical times, in
pre-Indo-European (pre-IE).13 Many Indo-Europeanists converge on the conclusion

10
At a more abstract level, the theoretical construct Move can be seen as a metaphor for interpreting
one and the same constituent as somehow present/relevant in more than one position in a sentence. In fact,
regardless of the metaphor used, having a gap in one place which needs to be linked to a constituent in
another place in a sentence is a powerful mechanism of syntactic cohesion.
11
Carroll (2005) shows that extra limbs/appendages of various kinds, with various species, demonstrate
such specializations. In this respect, Carroll also mentions the specialization of human hands vs. legs.
12
Serbian unaccusative RootSCs are still used to express assertions, as shown in the next section. They
show specialization with respect to full TPs in other respects.
13
While Kiparsky (1968) estimates that Proto-Indo-European was spoken around 3,700 BC, Renfrew
(1987) dates it back to 7,000 BC, and Gray and Atkinson (2003), using a computational model, to around

that Proto-Indo-European (PIE), as well as early IE languages, had an unaugmented
verb form, the injunctive, which was not marked for tense or mood (see e.g. Thurneysen
1883; Kuryłowicz 1925/1927; Gonda 1956; Kiparsky 1968).14 As discussed in Kiparsky
(1968), this unmarked injunctive form, upon the grammaticalization of tense, began to
specialize for non-indicative/irrealis moods.15 Child language acquisition seems to
proceed in a comparable fashion, from small clauses to TPs (Section 2.5), providing,
at the very least, corroborating evidence for the syntactic simplicity/primacy of small
clauses relative to tensed sentences (TPs).

2.3 (Unaccusative) Root small clauses in Serbian

Serbian has the same types of root small clauses illustrated in the previous section for
English, but I will not discuss those here as they do not add anything new to the
arguments already presented in the previous section. However, Serbian has another
type of root small clause, the unaccusative root small clause, which makes an even
stronger argument for “half” (TP-less) syntax, and for the gradual integration of the
small clause into a finite TP. Moreover, this type of clause is quite productive in
expressing assertions, making an even stronger case for living fossils.
Serbian has (at least) two ways of expressing propositions/assertions with unac-
cusative verbs: as VS (verb-subject) small clauses, arguably without the TP layer (13),
and as full TPs with free word order (14) (for details, see Progovac 2008a,b, 2013b):16
(13) a. Stigla pošta. (cf. ??Pošta stigla.)
arrive.PART mail
‘The mail has (just) arrived.’

10,000 years ago. Yet others have indicated that we simply do not know, and that Proto-Indo-European
could be pushed back even further (Dixon 1997: 49).
14
Such tenseless/moodless forms occur in other families as well, e.g. in Bantu languages. Tswana, as
described in Cole (1955: 445), has neutral tense forms which are used in coordinate structures, but also in
complements, where modern IE languages mostly use infinitives. Cole refers to them as subjunctive forms.
Such neutral tenses are also found in other African languages, including Herero, Duala (where the form is
called aorist), and Swahili (Meinhof 1948).
15
In this injunctive stage of pre-IE, according to Kiparsky (1968), it was possible to express time by
temporal adverbials, which, unlike grammaticalized tense, were neither obligatory nor associated with a
specific functional projection, and which can best be described as adjuncts. In fact, in Greek and Sanskrit,
verbs are commonly put into (what looks like) present tense when modified by adverbs denoting past time
(Kiparsky 1968: 47), and this is considered to be a vestige of the PIE injunctive. It is highly likely that
temporal adverbs preceded the grammaticalization of tense in the evolution of syntax.
According to Kuryłowicz (1964: 21), the injunctive, a tenseless verbal form, was the only mood in earliest
PIE. Gonda (1956: 36–7) points out that any attempt exactly to translate the injunctive categories into a
modern Western idiom is doomed to fail, given “the vagueness in meaning and the great, and in the eyes of
modern man astonishing, variety of its functions.” As pointed out in Chapter 6, some of these functions
were optative and imperative, which seem to have been preserved in VN compounds in some languages.
16
“PART” in the glosses stands for a perfective participle form of the verb, which marks perfective/
completed aspect, but not tense. The form is equivalent to English “gone” in expressions such as “All gone.”

b. Pala vlada. (cf. ??Vlada pala.)
fall.PART government
‘The government has (just) collapsed.’
c. Umro Petar. (cf. ??Petar umro.)
die.PART Peter
‘Peter just died.’
d. Pao sneg/mrak. (cf. ??Sneg/Mrak pao.)
fall.PART snow/darkness
‘It has just snowed./It got dark.’
Among such SCs, many are formulaic/rigid/non-compositional in meaning, and
with these the rigidity of VS word order is more obvious:
e. Pala karta. (cf. ?*Karta pala.)
fall.PART card
‘Card laid, card played.’
f. Proš’o voz. (cf. ??Voz proš’o.)
gone train
‘The opportunity has passed.’
g. Pukla tikva. (cf. ?*Tikva pukla.)
burst.PART squash
‘The friendship/alliance has ended.’
The full (TP) counterparts feature the finite (past tense) auxiliary je, as well as free(er)
word order (14a-d). In addition, formulaic readings typically do not survive expansion
into full clauses, as shown in (14e-g), which are interpreted literally.
(14) a. Pošta je stigla. / Stigla je pošta.
mail is arrived arrived is mail
‘The mail (has) arrived.’
b. Vlada je pala. / Pala je vlada.
‘The government (has) collapsed.’
c. Petar je umro. / Umro je Petar.
‘Peter died.’
d. Sneg je pao. / Pao je sneg. Mrak je pao. / Pao je mrak.
‘It snowed.’ ‘It got dark.’
e. Karta je pala. / Pala je karta.
‘The card fell.’
f. Voz je prošao. / Prošao je voz.
‘The train is gone.’

g. Tikva je pukla. / Pukla je tikva.
‘The squash has burst.’
Unaccusatives can be roughly characterized as intransitive structures which blur the
boundary between subjects and objects in the sense that their only argument is not an
agent, but typically a theme, showing some properties of objects. Consequently, unaccusative
verbs (e.g. arrive, fall, come, appear) are analyzed crosslinguistically as
Merging their sole arguments as objects, rather than as subjects (e.g. Perlmutter 1978;
Burzio 1981, 1986). In Serbian, there is a preference even in the surface structure for
unaccusative subjects to follow the verb, the position typically associated with objects.
With unaccusative SCs in (13) this preference becomes more rigid, imposing the
otherwise non-canonical VS order (see Progovac 2008a,b, 2013b, for details).17
Given this widely accepted analysis of unaccusatives, full/finite unaccusative
clauses in Serbian and English are derived as follows:18
(15) a. Small clause: [SC pala vlada] →
b. [TP je [SC pala vlada]] →
c. TP Sentence: [TP vlada [T’ je [SC pala vlada]]]
(16) a. [SC spill the milk] →
b. [TP will [SC spill the milk]] →
c. [TP the milk [T’ will [SC spill the milk]]]
Here, “the milk” can be considered object-like also in the sense that if there is an
agent added, as in John spilled the milk, then “the milk” would clearly emerge as an
object (for further discussion of this, see Chapter 3).
In the paper entitled “What use is half a clause?” Progovac (2008a) argues that root
small clauses are “half-clauses” in comparison to the TP counterparts, which have an
additional (TP) layer of structure. The clause in (15a) is a half-clause in comparison to
the clause/sentence (15c), which has two layers of structure, SC and TP, and two
subject positions, as also discussed below. Frequent arguments against Darwin’s
theory of evolution have been of the kind: what use is half an eye? Since similar
arguments have been raised against a gradualist approach to the evolution of syntax,

17
The closest English equivalents to rigid VS unaccusatives occur in fossilized expressions such as
Come winter, she will travel to Rome (cf. *Winter come, she will travel to Rome). Another example, Come
one, come all, is found among the fossilized small clause combinations, as discussed in Section 2.4. Just as in
Serbian, the word order in these expressions is VS, even though the word order in English is otherwise
SVO, and Serbian typically shows a freedom of word order, with SVO being the default order.
18
As shown in the examples in the text, Serbian TPs can also retain the unaccusative VS order, in which
case the tense particle je has to follow the verb (see example (14) in the text). Since je is a clitic, it has to
observe the Clitic-Second requirement in Serbian, and that is why it cannot appear first in (14). There are
many different approaches to the placement of clitics in Serbian, but for my purposes here I will assume
that the SV examples in (14) are syntactically more complex than the VS counterparts, involving more
movement operations, most notably subject raising to the specifier of TP.

it is instructive to wonder about whether such half-clauses would have been useful to
our ancestors when they first stumbled upon syntax.19
As it turns out, such half-clauses are used productively in Serbian even today
(13), alongside the full TP counterparts illustrated in (14) (Section 2.4). As is
the case with English root small clauses discussed in the previous section, Serbian
unaccusative half-clauses also specialize for the here-and-now, reporting on an
event that has just manifested itself. Consequently, these clauses cannot be modi-
fied by adverbs denoting remote past, such as “three years ago” (?*Stigla pošta pre
tri godine, ‘arrived mail before three years’), leading again to a division of labor.
Moreover, some formulaic unaccusative clauses (13e, repeated below) are only
possible as half-clauses, and not as full clauses, when used to perform a speech-
act in the context of a card game:
(13e) Pala karta. (cf. ?*Karta pala. / ?*Karta je pala.)
fall.PART card
‘Card laid, card played—you cannot take it back now.’
These clauses first of all provide a forceful argument that half-clause syntax is real:
the VS word order in these clauses can only be explained if the widely-adopted
unaccusative hypothesis is coupled with the small-clause analysis. The awkwardness
of the (otherwise default) SV order (13) makes it clear that they are not just
abbreviated/elliptical versions of some finite counterparts, such as those given in
(14). Rather, these half-clauses, as well as the ones illustrated for English in the
previous section, demonstrate consistent and systematic properties of a different,
simpler clausal syntax, a syntax that involves one (less) layer of clausal structure, the
basic (underived) word order (no Move), non-finite verb forms, and default case (for
more details, see Progovac 2006, 2008b).
From the evolutionary standpoint, it is significant that half-clauses (13) to some
extent overlap in function with their full equivalents (14), even though they show a
degree of specialization as well, as elaborated in the repeated example below.
(14’) a. Stigla je pošta.
arrive.PERF.PART.F.SG is.3SG mail.F.SG
‘The mail has (just) arrived.’

19
As put in Carroll (2005: 170–1), “the erroneous notion . . . has been that the intermediate stages in the
evolution of structures must be useless—the old saw of ‘What use is half a leg or half an eye?’ ” Such
expressions of disbelief were partly due to the inability to imagine, based on the structure of the modern
eye, how it could have been broken down into stages, and moreover stages that would have provided
incremental advantages. The arguments against a gradualist approach to the evolution of syntax are of a
comparable kind: given how we view/understand modern syntax today, we cannot imagine how it could
have evolved through stages, and moreover how each new stage could have provided incremental
advantages.

While the perfective participles in half-clauses contribute to the perfective aspect (but
have no tense or TP), the full counterparts mark both perfective aspect (with the
participle) and (past) tense (with the auxiliary je). This expression of tense/aspect
must be redundant at least to some extent (especially for the here-and-now situ-
ations), given that only past tense auxiliaries in this case are compatible with these
perfective participle forms.
Agreement properties of these clauses exhibit redundancy and overlap even more
obviously. As indicated in the glosses in (14’), the participle form agrees with the
subject in number and gender, but not in person, the type of agreement that also
characterizes adjectives in Serbian. On the other hand, the auxiliary verb agrees with
the subject in person and number (but not in gender). It is as though both layers
of the clause have their own subject position (see previous section, examples 11–12),
their own separate agreement properties, which partly overlap, and their own ways of
encoding time/aspect, which again partly overlap. This provides evidence of tinker-
ing with clausal structure, rather than evidence of optimal design.20
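
The partial overlap in agreement can be made explicit by comparing the two feature sets. This is a minimal sketch under the assumption that agreement on each layer is adequately represented as a set of feature names; the variable names are hypothetical, and the feature inventories are taken directly from the description above.

```python
# Agreement features carried by each layer, as described in the text.
PARTICIPLE_AGREEMENT = {"number", "gender"}   # small clause layer
AUXILIARY_AGREEMENT = {"person", "number"}    # TP layer

# Features marked on both layers are redundant once TP is superimposed.
redundant = PARTICIPLE_AGREEMENT & AUXILIARY_AGREEMENT
only_small_clause = PARTICIPLE_AGREEMENT - AUXILIARY_AGREEMENT
only_tp = AUXILIARY_AGREEMENT - PARTICIPLE_AGREEMENT

print(sorted(redundant))
print(sorted(only_small_clause))
print(sorted(only_tp))
```

On this encoding number is marked twice, while gender and person are each contributed by one layer only, which is the pattern of overlap with partial specialization discussed here.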
My proposal in this respect is that a layer of TP (or a comparable functional
projection) was at one point in evolution superimposed upon the layer of a small
clause (half-clause), the proto-syntactic construct which already was able to express
some basic clausal properties: predication and some temporal/aspectual properties.21
If so, then half-clauses would have been useful to our ancestors. A half-clause is still
useful, even in expressing propositional content—much more useful than having no
clausal syntax at all, and less useful than articulated finite syntax. This is exactly the
scenario upon which evolution/selection can operate.

2.4 Small clause syntax is rigid (no Move, no recursion)

In addition to the obvious morpho-syntactic hallmarks of the discussed root small
clauses in English and Serbian (absence of a finite verbal form, absence of nominative
case checking in English, absence of subject raising with Serbian unaccusative SCs,
resulting in strong preference for VS order), these clauses are also characterized by

20
The argument here is that root small clauses both in English and in Serbian are approximations of a
proto-syntax stage in the evolution of human language, and that the superimposition of a TP over the small
clause layer works basically in the same way in both languages. A reviewer wonders if these data may
just be properties of particular languages, rather than a design feature of the capacity for language. First of all,
Minimalism and predecessors analyze every sentence/clause as starting as a small clause, and the intent for
this analysis is to hold universally, in all languages. The English and Serbian data discussed here are thus
just illustrations of this otherwise universal phenomenon. What I am claiming here is that this property of
language, recognized in syntactic theory, reflects an imperfection, evidence of tinkering, rather than
optimal design.
21
Finally, the reviewers wonder how one can distinguish between historical change and language
evolution in this case. While historical language change is typically considered to be a change which
does not have any genetic consequences, language evolution (and evolution in general) is typically
associated with genetic changes. I address this issue in Section 7.3.5.

the following surprising properties: they do not tolerate movement of any kind
(17–18); they cannot embed (one within another), and thus do not show recursion
(19–20); their interpretation is typically confined to the here-and-now (21–22); and
many among them are (semi-)formulaic. For all these reasons, these clauses cannot
be analyzed as identical in structure or complexity to their full finite counterparts,
nor can they be reduced to elliptical versions of the full counterparts.
The following (a) examples illustrate that small clause syntax does not exhibit (wh-)
movement, in sharp contrast to the full sentences (b).22
(17) a. *Who(m) worry?!
*Where everybody?!
b. Who worries?
Where is everybody?
(18) a. *Kada stigla pošta?
when arrived mail
b. Kada je stigla pošta?
The examples below illustrate that a small clause of this type cannot embed into
another clause, and thus does not show recursion. Recall from Chapter 1 that
recursion is defined in this book, as per the traditional view, as a category of a certain
type being embedded within another category of the same type. In this respect, what
we see in (19) is a (failed) attempt to embed one SC within another SC.
(19) a. *Him worry [me first]?
b. *Him worry [problem solved]?
(20) a. *Ja mislim [(da) stigla pošta].
I think (that) arrived mail
b. cf. Ja mislim [da je stigla pošta].
At first sight, it may seem that the clauses in (20) should be able to embed if the
complementizer da is used, given that complementizers are supposed to provide
the specific functional glue necessary for subordination, as per the discussion in
Chapter 4. However, in syntactic theory it is considered that there is a hierarchy

22
Interestingly, as the reviewers of this manuscript have pointed out, some types of questions seem
possible with subsentential structures, and I have no good explanation for that. While (i) seems to be a fixed
expression, an unanalyzed unit, (ii) illustrates that why can freely combine with various categories.
Interestingly, however, neither of them can combine with a small clause (iv). The example in (iii) may
be an echo question, that is, a question echoing what somebody else has said before.
(i) How come?
(ii) Why worry? Why now? Why Mary?
(iii) Solve what?
(iv) *Why Mary worry? *How come Mary worry?

of functional projections such that each functional category selects the next (e.g.
Abney 1987; Adger 2003; but see e.g. Grimshaw 2000, who does not adopt this view
of the hierarchy). Recall that this is the theoretical postulate on which my recon-
struction method rests, making use of the hierarchy of functional projections:
CP>TP>vP>SC/VP.
In this concrete case, a CP (Complementizer Phrase) needs to select a TP (Tense
Phrase). This would mean that if a clause does not have a TP to begin with, it cannot
build a CP or use a finite complementizer/subordinator. Serbian data illustrated in
(20) conform to this pattern rather dramatically. The subordinate clauses with the
tense auxiliary can be introduced by the complementizer, and are fully recursive
(20b), while the clauses without the tense auxiliary cannot take a complementizer and
cannot embed at all (20a).23
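
The selectional reasoning behind (20) can be sketched as a one-step check. The code below is an illustration under simplifying assumptions, not an implementation of any particular syntactic framework: a clause is represented as a hypothetical bottom-up list of layer labels, the table SELECTS records only the selection step discussed here (C selects TP), and the function name is invented for the example.

```python
# Only the selection step discussed in the text: a complementizer selects TP.
SELECTS = {"CP": "TP"}

def can_take_complementizer(layers):
    """A clause can project CP only if its outermost layer is what C selects.

    `layers` lists the clause's layers bottom-up, e.g. ["SC", "TP"].
    """
    return layers[-1] == SELECTS["CP"]

half_clause = ["SC"]          # stigla pošta, as in (20a)
full_clause = ["SC", "TP"]    # je stigla pošta, as in (20b)

print(can_take_complementizer(half_clause))
print(can_take_complementizer(full_clause))
```

Nothing in the check refers to meaning or processing load; whether the complementizer da can be added depends only on which layers the clause already has.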
Finally, the following data illustrate that small clauses are typically confined to the
here-and-now, rejecting modification by adverbials denoting distant past.
(21) a. *Stigla pošta pre tri godine.
arrived mail three years ago
b. *Pala karta pre tri godine.
(22) a. *Case closed three years ago.
b. *Me first three years ago!
Clearly, we are dealing with two distinct types of grammar here: the simpler, rigid,
TP-less small clause grammar, arguably approximating the ancient proto-syntax
stage, and the more complex TP grammar, which subsumes the former in that a
TP is projected upon the small clause foundation.24 TPs have at least one more layer
of structure than root small clauses (or “half-clauses”). Superimposing one layer (e.g.
TP) over another (SC) creates hierarchy, as well as additional syntactic space for
Move to target as it connects multiple layers of structure. Due to the wiring of the
brain in this particular way throughout human evolution, it is entirely possible that
the only way we humans can build sentences is by starting with the small clause, even
if one can certainly envision more direct and more optimal derivations.

23
A reviewer wonders about the English example in (i) in this respect:
(i) [CP For [TP John [T’ to have left]]] would make sense.
The bracketed clause is still typically analyzed as having a C (for) which selects a TP headed by to. But the
reviewer is correct to point out that sometimes the heads of TP or CP are allowed to be non-overt in this
theory, which makes it harder to see how the hierarchy works.
24
It is worth pointing out here that the rigid small clauses considered here seem to be syntactic isolates,
in the sense that they cannot be easily embedded or modified, and in this sense they do not show recursion
even in Hauser, Chomsky, and Fitch’s (2002) weak sense of the term, which seems to reduce to the
possibility to reapply Merge. Recursion is discussed in much more detail in Chapter 4.

Quite clearly, these data cannot be attributed to any cognitive restrictions: the ability
to embed examples in (20b), but not those in (20a), depends solely on the structural
properties of these two types of clauses. The significance of this for the evolutionary
argument is that one may in principle be capable of recursive thought, but cannot
express it through language via subordination if the structural properties of language
are limited in this way.25 Given these data and analyses, one can reconstruct a
gradualist progression from proto-syntax to the development of recursion, as discussed
in Chapter 4.
Notice that Move and recursion, reducible to Merge, are considered to be universal
and defining properties of human language by most Minimalist researchers (see
e.g. Hauser, Chomsky, and Fitch 2002; Chomsky 2005; Moro 2008). While Hauser,
Chomsky, and Fitch (2002) do not define recursion, what they seem to mean by
it is the ability to apply and then re-apply Merge, so perhaps this sense of
recursion can be characterized as recombinability (see e.g. Tomalin 2011 for some
useful discussion). In other words, the operation Merge can apply repeatedly. The
ability to recombine/re-Merge in this way yields hierarchy, but not necessarily what is
typically considered to be recursion by linguists, that is, the ability to embed one
category (e.g. CP) within another category of the same type, in an unlimited fashion
(see Chapter 4 for more discussion on this).
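The difference between recursion as recombinability and recursion as same-category embedding can be made concrete with a minimal Python sketch. The names `Node`, `merge`, `depth`, and `self_embeds` are hypothetical, and the category labels (TP, VP, CP) are simplified assumptions chosen for illustration, not an analysis proposed in this monograph. The sketch shows only that reapplying a binary operation yields hierarchy, whereas embedding in the narrower sense additionally requires a category to occur inside another category of the same type.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Node:
    label: str                    # category label, e.g. "TP" or "CP" (simplified)
    left: Union["Node", str]      # a bare string stands for a word or unanalyzed phrase
    right: Union["Node", str]


def merge(label, left, right):
    """Binary Merge (illustrative): combine exactly two objects under one label."""
    return Node(label, left, right)


def depth(x):
    """Number of Merge layers above the deepest word; bare words count as 0."""
    if isinstance(x, str):
        return 0
    return 1 + max(depth(x.left), depth(x.right))


def self_embeds(x, label, inside=False):
    """True if a node labelled `label` occurs inside another node with the same label."""
    if isinstance(x, str):
        return False
    if x.label == label and inside:
        return True
    now_inside = inside or x.label == label
    return (self_embeds(x.left, label, now_inside)
            or self_embeds(x.right, label, now_inside))


# Reapplying Merge gives hierarchy, but no CP occurs inside another CP here.
simple = merge("TP", "the problem", merge("VP", "has been", "solved"))

# A clause in which one CP is embedded within another CP.
inner = merge("CP", "that", merge("TP", "the problem", "has been solved"))
mid = merge("CP", "that", merge("TP", "I", merge("VP", "think", inner)))
outer = merge("TP", "He", merge("VP", "worries", mid))

print(depth(simple), self_embeds(simple, "CP"))   # 2 False
print(depth(outer), self_embeds(outer, "CP"))     # 7 True
```

In this toy representation, `simple` already has hierarchical depth from repeated Merge, but only `outer` satisfies the stricter, same-category sense of recursion.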
If considerations in this book are on the right track, then Move and recursion
cannot be the defining properties of human language, not even recursion in the weak
sense of recombinability. Nor can they be reduced to Merge. Rather, both Move and
recursion should be seen as relatively recent, fragile innovations, which emerged with
the hierarchical stage (as discussed at length in Chapter 4).
The TP/CP grammar allows for embedded recursion (23) and for the expression of
a variety of nuanced meanings with respect to the temporal/aspectual/modal prop-
erties of the clause (24):
(23) He worries [that I think [that the problem has been solved]].
(24) The problem has been/may have been/will be solved.
I will be/should be/better be first.
The small clause grammar, on the other hand, allows for flat concatenation of two
clauses, of the type illustrated in (25–36), once again often resulting in (semi-)formulaic
expressions, not subject to questioning or recursion (see Progovac 2010a):26

25
A reviewer has wondered how recursive thought is related to recursive syntax. As my goal in this
monograph is to confine my claims to what my reconstruction and my fossil data lead me to, this question
falls outside the scope of this monograph. Still, for what it is worth, I believe that, as in art, the medium
partly shapes not only what you can express, but also what occurs to you to express.
26
The reader will notice that not all of these “clauses” involve a noun-verb combination, but that there
are other possibilities as well. Most of them involve an interesting AB-AC pattern, although in the Serbian
examples the As are not identical, only similar (correlated) in some sense. In spite of the differences among
them, they all seem to exhibit a characteristic rhythm and symmetry.

(25) Nothing ventured, nothing gained.

(26) Easy come, easy go.
(27) First come, first serve.
(28) Monkey see, monkey do.
(29) Come one, come all.
(30) Like father, like son.
(31) So far, so good.
(32) Preko preče, naokolo bliže. (Serbian)
‘Across shorter, around closer.’
(33) Što na umu, to na drumu.
what on mind that on road
‘What one thinks, one says.’
(34) Duga kosa, kratka pamet.
long hair short intelligence
‘Who wears long hair has little intelligence.’
(35) Wo dua wo twa. (Twi)27
you sow you reap
(36) Wo hwehwea wo hu.
you seek you find
As discussed in Chapter 4, the glue that holds these small clause combinations together
is paratactic, resulting in a non-hierarchical, symmetric bond, arguably created by the
same operation Conjoin which joins the subject and the predicate in flat small clauses
(Section 2.2). Let me represent the paratactic process Conjoin with the symbol #:
(35) [SC Monkey # see] # [SC monkey # do]
In fact, if musical protolanguage was an episode in the evolution of human language
(see e.g. Darwin 1874; Fitch 2010: 475), then it would have been most useful right here,
in this paratactic stage, in which the rhythm and melody would have served to hold
together not only two-word clauses, but also binary combinations of such clauses.
However, unlike in Fitch’s proposal in which a musical stage preceded words
(criticized by Tallerman 2013a), the use of melody/prosody here would have been
compositional, used to combine words into larger utterances (see also Section 4.2).

27
Twi is spoken in Ghana. Thanks to Kingsley Okai (p.c. 2011) for supplying the data.

The best evidence of rigid syntax is typically found among intransitive constructions,
specifically among unaccusative and absolutive-like constructions. This observation is
taken up in the next chapter, which postulates an intransitive, absolutive-like stage in
the evolution of syntax, and which discusses its many modern manifestations. The next
section considers corroborating evidence and testing grounds for the small clause
proposal explored in this chapter.

2.5 Corroborating evidence and testing grounds

There exists abundant corroborating evidence for the small clause beginnings of
language, which comes from the gradual emergence of syntax in child language
acquisition and second language acquisition (Section 2.5.1) and from agrammatism
(2.5.2). In addition, neuroscience (2.5.3) and genetics (2.5.4) constitute possible testing
grounds for this hypothesis. Section 2.5.5 considers various other stratification-based
approaches to evolutionary progression, adding to the plausibility of the proposal
explored in this monograph.

2.5.1 Language acquisition

When it comes to first language acquisition, it has been argued by many that children
go through a root small clause/root infinitive stage (e.g. Radford 1988, 1990; Lebeaux
1989; Platzack 1990; Ouhalla 1991; Guilfoyle and Noonan 1992; Rizzi 1994; Jordens 2002;
Potts and Roeper 2006; but see e.g. Guasti 2002; Pinker 1996, for opposing views). Of
course, this stage follows the well-known one-word stage in which single words are
often used to express full propositions (e.g. Bloom 1970). According to Radford (1990),
children enter the one-word stage at about ten months of age, start sequencing single
words at around fifteen months, and start using something comparable to small clauses
around two years of age. This sequencing represents a plausible transition from a
one-word stage to a small clause stage in evolution as well (Chapter 4).
Below are some examples from child English, using root small clauses, based on
Guasti (2002):
(37) Marie go. Me go.
Eve gone.
Kitty hiding.
You nice.
These data are obviously missing markers of finiteness, and can thus be analyzed as
TP-less small clauses, as they are in e.g. Radford’s work. It is also obvious that these
data are directly comparable to the root small clauses in adult speech, as introduced
in the previous sections, which also consist of a noun/pronoun and a predicate, either
involving a non-finite verb (infinitive or participle), or some other predicate, such as
an adjective. If these are essentially the same kinds of root small clauses, used by both
adults and children, then it is not surprising that children’s initial stages of language
development do not show embedding/subordination or Move, both reported to be
rather late developments in children (e.g. Radford 1990; Lebeaux 1989; Ouhalla 1991;
Platzack 1990; Potts and Roeper 2006; Hollebrandse and Roeper 2007). The claim in
this monograph is that Move and recursion are unavailable to paratactic (small
clause) grammars. As shown in Section 2.4, adult root small clauses, taken to be
fossils of the paratactic small clause stage in language evolution, also show no Move
or recursion. Given the small clause data in adult speech and in language acquisition,
one can conclude that a small clause stage in language evolution was not only
possible, but highly probable.
Early stages of second language acquisition have been analyzed in a similar light.
According to e.g. Klein and Perdue’s (1997) influential work, second language
acquisition can stabilize/fossilize at the stage of the so-called Basic Variety, which
is, according to them, a well-structured, simple, and efficient form of language. The
Basic Variety also does without most functional categories, complex hierarchical
structure, Move, and subordination.
A reviewer has wondered why language acquisition would be relevant for the
considerations of language evolution. First, I should remind the reader that my
proposal is based on syntactic reconstruction, as well as on the availability of
syntactic fossils, and I use language acquisition only as corroborating evidence for
the proposal, and not as main evidence. Having said that, let me also point out that in
my proposal language evolved through scaffolding/layering, in such a way that the
lowest layers served as necessary foundation for the higher layers. The prediction of
this proposal is that child language, to the extent that it emerges in stages, has to
observe the same scaffolding. So, even without any ontogeny/phylogeny connections
ever established in biology, child language acquisition would still be relevant for
language evolution considerations, at least for the approach that I am pursuing in this
monograph.
In biological texts (e.g. Ridley 1993: 551; also Strickberger 2000: 493–4), the rela-
tionship between ontogeny and phylogeny is considered to be a classic topic in
evolutionary studies, despite much controversy surrounding it. In my work, I do
not consider that ontogeny literally recapitulates phylogeny, but only that it can be
used as secondary, corroborating support for a proposal that is independently
established. This is in line with e.g. Studdert-Kennedy (1991); Rolfe (1996); Locke
(2009); Locke and Bogin (2006), who suggest that present-day views warrant the use
of ontogeny to corroborate hypotheses about phylogeny.
When it comes to studies of the evolution of language, Burling (2005: 174)
makes use of the phylogeny/ontogeny connection, and so does Lieberman (e.g. 2000)
in his discussion of the descent of the larynx. In his work on Riau Indonesian, Gil (2005)
also invokes the phylogeny-ontogeny connection. In particular, he argues that Riau
comes close to being a perfect example of an IMA (Isolating–Monocategorial–
Associational) language, a language whose syntax can be characterized as exhibiting
a simple combinatorial operation (call it Conjoin), the semantic effect of which is a
loose associational relationship. According to Gil, an IMA language may constitute a
stage both in language acquisition and language evolution.
On the other hand, as a reviewer points out, Yang (2013) argues that there is no
connection between phylogeny and ontogeny when it comes to language, and con-
cludes from there that gradualist approaches to language evolution are not feasible.
However, in the final analysis, this article cannot reach both of these conclusions at the
same time. That is, one cannot claim that the connection between ontogeny and
phylogeny is irrelevant, and then use it (i.e. use the supposed lack of correspondence
between the two) to make an argument that there was no gradual evolution of
language. I thus interpret this article to be actually using the (lack of) ontogeny/
phylogeny connection in a certain domain to make an argument against a gradualist
approach to the evolution of syntax. This paper is discussed further in Section 4.5.1.3.
The reviewer also wonders if there is enough evidence for a two-word stage in first
language acquisition. It has been reported by various researchers that there typically
is such a stage, a stage where utterances of more than two words are very rare (e.g.
Bloom 1970; Bloom 1994). Of course, a two-word stage does not eradicate the one-
word stage, as I discuss in this monograph, and thus this stage would encompass both
one-word and two-word utterances. It is worth mentioning in this respect that the
level of language proficiency in children is typically measured by the mean length of
their utterances (MLU), characterized as the number of words/morphemes per
utterance. It is well-known that MLU increases gradually with age, and it is often
used as an indicator of the acquisition stage. I return to this issue in Section 3.5, where
I discuss the absolutive-like nature of children’s two-word utterances.
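As a rough illustration of how MLU is computed, the following Python sketch counts words per utterance and averages them. The function name `mean_length_of_utterance` is hypothetical, and counting whitespace-separated words (rather than morphemes) is a simplifying assumption; actual MLU studies follow more detailed transcription and counting conventions. The sample utterances are the child English examples in (37).

```python
def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated words per non-empty utterance.

    Simplifying assumption: words, not morphemes, are counted.
    """
    lengths = [len(u.split()) for u in utterances if u.split()]
    if not lengths:
        raise ValueError("no non-empty utterances to measure")
    return sum(lengths) / len(lengths)


# Child English root small clauses from (37): all are two-word utterances.
two_word_sample = ["Marie go", "Me go", "Eve gone", "Kitty hiding", "You nice"]
print(mean_length_of_utterance(two_word_sample))   # 2.0

# A sample mixing one-word and two-word utterances gives an MLU between 1 and 2.
mixed_sample = ["Gone", "Me go"]
print(mean_length_of_utterance(mixed_sample))      # 1.5
```

On this simplified measure, a stage that encompasses both one-word and two-word utterances would show an MLU between 1 and 2, rising as longer utterances become more frequent.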
However, an influential theory of language acquisition, the Continuity Hypothesis,
has proposed that in spite of appearances, children’s grammars, from the very start,
are full-blown adult grammars, where the functional categories are only superficially
missing (see e.g. Guasti 2002 and references there; Pinker 1996; see also the quote
from Berwick and Chomsky 2011 in Section 1.1). The rationale behind the Continuity
Hypothesis is that without assuming it, there is a discontinuity between child
grammars and adult grammars, and it would be a difficult task to explain how
children then advance to adult grammars (see also Section 3.5 for more discussion
on this). Interestingly, comparable arguments have been advanced against the grad-
ualist approach to the evolution of syntax.
When it comes to the continuity of syntax, Progovac et al. (2006) have proposed in
the Epilogue that continuity lies in small clause/paratactic grammars, rather than in
full finite grammars. As pointed out above, both children and adults use small clause
grammars, and, moreover, small clauses are built into the very foundation of finite
sentences. Even when agrammatic patients (see following subsection) lose
the ability to consistently produce full sentences, they are often capable of producing
small clauses. According to Kolk’s (e.g. 2006) study of Dutch and German, normal
adults produced 10% non-finite root clauses, aphasics produced about 60%, and in
children the overuse of non-finite root clauses decreases with age: from 83% in the
two-year-olds, to 60% in the two-and-a-half-year-olds, to 40% in the three-year-olds.
There is thus continuity and common ground in the use of small clause/non-finite
grammars across all these groups.
Clearly, if small clause grammars are the foundational structures, upon which
everything else rests, then they are expected to emerge first for that reason alone.
Given that adults use small clause grammars as well, both in root and embedded
contexts, children are exposed to simpler grammars, and the continuity of language is
not disrupted.

2.5.2 Agrammatism
Agrammatism offers another source of corroborating evidence for the primacy of
simpler, paratactic grammars. When it comes to aphasia, Kolk (2006; also Kolk 1995;
Kolk, van Grunsven, and Keyser 1985) has argued that the preventive adaptation
results in a bias to select simpler types of constructions, often sub-sentential, includ-
ing root small clauses and root infinitives (see also Friedmann and Grodzinsky 1997).
The argument is that the impaired system reorganizes to exploit alternative routes to
the same goal.
(38) Koffie drinken.
coffee drink-INF
(39) Portemonnaie verloren.
wallet lost-PAST.PART
(40) iedereen naar buiten
‘Everybody out’
Just as with the English small clause data, the data above illustrate clauses with non-
finite verb forms, in particular the infinitive (38) and the past participle (39), as well as
clauses without a verb (40). The use of non-finite clauses in Dutch and German is
significant not only because they occur so frequently in agrammatism, but also because
they involve morphology and word order that are distinct from what is found in the
corresponding finite clauses. Just as with the Serbian data discussed in Sections 2.3 and
2.4, this again indicates that one is dealing with a distinct, simpler type of grammar,
which cannot be reduced to elliptical versions of full finite sentences.

2.5.3 Neuroimaging
Neuroimaging can provide a fertile testing ground for various evolutionary claims,
including the hypotheses explored in this chapter. My suggestion is that one can use
the subtraction and other neuro-linguistic methods to determine how paratactic
small clause structures are processed in comparison to their more complex (TP)
counterparts, in an attempt to identify neuro-biological correlates of TP layering and
finiteness (see Progovac 2010b).
For the reasons given in the Appendix, while the processing of full TPs is expected
to show clear lateralization in the left hemisphere, with extensive activation of
specific Broca’s areas, the proto-syntactic structures, including root small clauses,
are expected to show less lateralization, and less involvement of Broca’s areas, but
more reliance on both hemispheres, as well as, possibly, more reliance on the
subcortical structures of the brain. This can be tested given the availability of
minimally contrasting pairs in English and Serbian discussed in this chapter, such
as Case closed vs. The case is closed; Pao sneg vs. Sneg je pao (13–14). A detailed
proposal to this effect can be found in the Appendix. If syntax evolved gradually,
through several stages, then it is plausible to expect that modern syntactic structures
and operations decompose into evolutionary primitives. If so, this will be measurable
in the activation of the brain.
If the brain co-evolved with language/syntax, then the pressures to use ever more
complex syntax contributed to the strengthening of necessary neural connections,
and with it some specific processing strategies. According to Deacon (2003: 86–7), if
language structure arose in a drawn-out coevolutionary process in which both brain
and language structures would have exerted selection pressures on one another, then
“we should expect to find that human brains exhibit species-unique modifications
that tend to ‘fit’ the unique processing demands imposed by language learning and
use.” As noted by many, including Deacon (1997) and Diller and Cann (2013), not all
parts of the brain increased at the same rate, but Broca’s areas and other language
processing areas increased more than proportionately. According to Diller and Cann
(2013: 253), “in biology we expect form and function to evolve together.” In other
words, if certain processing strategies evolved more recently in order to support e.g.
layered syntax, then it is expected that such strategies would exhibit a particularly
good fit for the function of processing such syntactic structures.

2.5.4 Genetics and the FOXP2 gene

Genetics is another area of great interest to evolutionary considerations. A gene
has recently been identified, the FOXP2 gene, which is taken to play a role not only in
articulation, but also in the processing of (morpho-)syntax. The symptoms of the
affected members of the KE family (those who have a mutation) include simplified
morpho-syntax (e.g. Gopnik and Crago 1991), potentially implicating problems with
building functional categories and projections, including tense and TP (see also
Piattelli-Palmarini and Uriagereka 2011; also some discussion in Tallerman 2013a).
The specific symptoms involve subject drop and the nonsystematic use of plural
forms and tense (Gopnik and Crago 1991, and references cited there). While Diller
and Cann (2012: 171) consider that “it would seem likely that FOXP2 is more
important for . . . vocalized speech than for something as complex as grammar,”
they add that “speech and certain aspects of grammar . . . are closely related to each
other from the standpoint of human neural function,” and that “the KE family . . . has
a disruption of both speech and certain aspects of grammar.”
This may suggest that the affected KE family members experience difficulties
establishing neural connections necessary for connecting multiple layers of structure,
in a manner similar to agrammatic speakers (Section 2.5.2); see also the discussion
below. In an fMRI experiment (Liégeois et al. 2003), the unaffected KE family
members showed a typical left-dominant distribution of activation involving Broca's
area, whereas the affected members showed a more posterior and more extensively
bilateral pattern of activation, as well as significant underactivation in Broca’s area
and its right homologue. This may suggest that they are relying on alternative
processing strategies, possibly those better suited for processing paratactic language.
According to Enard et al. (2002), there is evidence for positive selection of the gene
in humans, which would render this discovery relevant for the evolution of
language. Diller and Cann (2009; 2012: 171) estimate that the FOXP2 mutation dates
back to 1.8 to 1.9 mya (million years ago), approximately the time when Homo (Homo
habilis, H. ergaster, and H. erectus) emerged, and when the hominin brains began to
triple in size. According to Diller and Cann (2012: 171), this would be consistent with
symbolic speech, grammatical language, and the spectacular brain growth evolving
together.
FOXP2 is just one of several genes that are implicated in language and speech
(disorders), and are thus of potential relevance for language evolution (see e.g.
Newbury and Monaco 2010). Two other potentially relevant genes are CNTNAP2
and ASPM (Diller and Cann 2012; see Dediu and Ladd 2007 for ASPM and Micro-
cephalin).28 In order to actively engage this and any other relevant future findings in
genetics, we linguists will have to come up with some concrete linguistically-based
hypotheses about how language evolved. Without that, these remarkable findings in
genetics will go untapped.
To suggest just one possible track, some recent experiments indicate that the
specifically human FOXP2 mutations are responsible for increasing synaptic plasticity
and for establishing better connectivity among neurons in the brain (e.g. Vernes
et al. 2007; Enard et al. 2009: 968). This contributes to the enhanced capability of
cortico-basal ganglia circuits in the human brain that regulate critical aspects of
language, cognition, and motor control (Lieberman 2009). One can thus hypothesize
that the FOXP2 mutation was selected, in part, in order to facilitate the processing
of syntax.

28
According to Christiansen and Chater (2008), human genome-wide scans have revealed evidence of
recent positive selection for more than 250 genes (Voight, Kudaravalli, Wen, and Pritchard 2006), making
it very possible that genetic adaptations for language would have continued in this scenario. According to
Hurford and Dediu (2009: 179), there is “genetic diversity across the human species and each gene has a
different history.” See also Levinson and Dediu (2013). Fitch (2010: 503) observes that if “widespread allelic
variations turn out to correlate with subtle linguistic differences, as suggested by Dediu and Ladd (2007),
genetic data may help resolve such debates in the coming decades.”

2.5.5 Stratification accounts elsewhere

The proposal in this chapter considers that the complexity in clausal structure arose
through the imposition of one layer of structure upon another, that is, by superim-
posing a layer of TP over the layer of a small clause. It is worth pointing out that
stratification accounts have also been proposed for brain evolution in general, where
newly emerged patterns are considered to become dominant and “rework”/subor-
dinate older patterns into conformity with them (e.g. Rolfe 1996; Vygotsky 1979).
Vygotsky (155–6) states that “brain development proceeds in accordance with the
laws of stratification of construction of new levels on old ones . . . Instinct is not
destroyed, but ‘copied’ in conditioned reflexes as a function of the ancient brain,
which is now to be found in the new one.”29 A repeated theme in Piaget’s work is the
inclusion of attainments of earlier stages in the structures of later stages (Gruber and
Vonèche 1977: xxiii). From this perspective, small clause structures can be seen as the
older/lower structures, which are retained in, and subordinated to, the newer/higher
sentential TP structures.30
The notion of the triune brain also invokes the idea of evolutionary layering and
subordination. According to Isaacson (1982: 1, 240), following Broca (e.g. 1878), the
inner lobe of the brain is organized into two layers: the inner and phyletically oldest
ring (allocortex) and the outer limbic ring (transitional cortex). The lowest, proto-
reptilian brain involves ancestral learning and memories, which are subjugated by the
higher limbic brain, thus allowing forgetfulness and suppression of the protoreptilian
habitual way of responding (MacLean 1949: 240–2, 247). In turn, rational decision
making is associated with the prefrontal cortex, or yet-higher brain (Strickberger
2000: 506).
In his characterization of symbolic reference, Deacon (1997: 300) argues that each
higher-order form of a representational relationship must be constructed from, or
decomposed into, lower levels of representation, in such a way that indexical reference
depends upon iconic reference, and symbolic reference in turn depends upon indexical
reference. Deacon (453) concludes that a failure to appreciate the constitutive role of
lower forms leads to a perspective that “kicks the ladder away after climbing up to the
symbolic realm and then imagines that there never was a ladder in the first place.”

29
As put in Bickerton (1998: 353) “the creation of a new neural pathway in no way entails the extinction of
the previous one. The fact that we remain capable of functioning in the protolinguistic mode . . . indicates the
persistence of the older link.”
30
A rather concrete example of evolutionary layering and recency dominance comes from the adaptation
that led to black coloration in leopards, which still preserves the previous layer of orange spots
(Carroll 2005). Metaphorically speaking, the small clause grammar can be seen as orange spots still lurking
through the layer of the more recent, dominant black coloration of sentential/TP speech.

2.6 Conclusion
In the evolutionary proposal given in this chapter, Tense and TP (and higher projec-
tions, such as CP) did not emerge from scratch, but were superimposed upon what was
already there—the small clause layer—allowing small clauses to survive, but only in
marginalized, subordinated roles. This kind of incremental building of clausal structure
is arguably also evident in language acquisition (Section 2.5.1). The quirky (rather
than optimal) properties of modern clauses established above, attested crosslinguistically,
begin to make sense if they are seen as by-products of evolutionary tinkering.
Relying on the stable postulates of syntax, that TPs are built upon the foundation
of small clauses, one naturally arrives at the small clause stage in language evolution
by a method of internal reconstruction. By removing the TP layer of the clause, one
can get down to the SC layer. This same method of reconstruction will be used again
in the next chapter to reconstruct an intransitive stage in the evolution of syntax, by
peeling away the vP layer, associated with agents and transitivity. The proposal in the
next chapter also sheds light on various other puzzling properties of language design,
including unaccusativity and ergativity. Chapter 7 demonstrates how the property of
displacement, a design feature of human language, is supported by more complex
grammars.

The intransitive two-word stage: Absolutives, unaccusatives, and middles as precursors to transitivity
3.1 Introduction: The two-word stage

The previous chapter discussed (root) small clauses, which were argued to be
evolutionary precursors to more complex (TP) counterparts, with both still in use
to varied degrees in present-day languages. It is of note that all the small clause data
included in Chapter 2 involved intransitive clauses, that is, clauses with a subject but
no object. While small clauses in present-day languages can definitely be transitive,
my argument is that the proto-grammars in the paratactic (non-hierarchical) stage
were intransitive.1
There are many reasons to postulate that proto-syntax started intransitive. First of
all, children’s language acquisition proceeds through a two-word stage, as noted by
many (e.g. Bloom 1970); a two-word stage can accommodate a predicate with only
one argument, and thus cannot be transitive, at least not without positing various
null categories in order to bridge the gap between adult grammars and early child
grammars.2 Similarly, early stages of sign languages constructed from scratch also
seem to show a two-word, intransitive stage, as discussed below. Next, many fossils of
proto-grammars are intransitive two-word structures, including certain compounds
and unaccusative and passive-like small clauses (see also Chapter 2).

1
The following example of a TP-less incredulity small clause is transitive, containing both a subject
(him) and an object (his wife). Just like the intransitive small clauses from the previous chapter, this
example lacks tense, agreement, and structural nominative case, and shows the other properties of
small clause syntax:
(i) Him leave his wife?! (That is not possible!)
On the other hand, the unaccusative data from Serbian, as well as the passive-like (Problem solved) and
verbless (Me first!) small clauses from English, are necessarily intransitive in the sense that only one argument
can be structurally realized.
2
Section 2.5.1 in Chapter 2 offers some discussion of the so-called Continuity Hypothesis, which posits
that all the relevant categories in adult language are also there in child language, but are just null or covert.

Evolutionary Syntax. First edition. Ljiljana Progovac

© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

In addition, paratactic combinations of (small) clauses are almost always binary,
merging only two clauses at a time (1–2). Thus, two-word grammars creating a
(small) clause are paralleled by “two-clause” grammars:
(1) Nothing ventured, nothing gained.
(2) Come one, come all.
Combining more than two paratactic phrases/clauses typically leads to a processing
quagmire, as the following example helps illustrate:
(3) No come, no money, no shelter.
This example is an expanded version of attested binary examples from pidgin
languages, such as “No mani, no kom” from Hawaiian Pidgin English (Winford
2006). Unlike the relatively clear message behind “No money, no come,” it is hard to
know how to interpret (3). Is it that if you do not come, then you cannot get paid, or
get any shelter? Or does it mean that if you do not come with the money, then you
will not get any shelter? Or is it a prediction or a threat that you won’t come, won’t
get the money, and won’t get any shelter either? The grammar on its own cannot
decide among these options.
This is not just an example involving the familiar kind of ambiguity, as found in
e.g. “He saw the man with the binoculars,” where language users typically reach for
one interpretation and do not even consider the other(s). With the one in (3), we are
at a loss right away. It seems that our brains are just not prepared to readily assign
meanings to paratactic ternary structures such as (3). But we can handle binary
structures.
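One way to picture the contrast between binary and ternary combinations is to count the possible binary groupings of a sequence. The Python sketch below is only an illustration under stated assumptions: the function name `bracketings` is hypothetical, and treating each grouping as a nested pair is a simplification that says nothing about how the clauses are interpreted. It shows that two paratactic clauses admit a single grouping, whereas three clauses already admit two binary groupings, in addition to a flat, ungrouped reading.

```python
def bracketings(items):
    """All ways of grouping an ordered sequence into nested binary pairs."""
    items = list(items)
    if len(items) == 1:
        return [items[0]]
    results = []
    # Split the sequence at every position and pair up the groupings of each side.
    for i in range(1, len(items)):
        for left in bracketings(items[:i]):
            for right in bracketings(items[i:]):
                results.append((left, right))
    return results


binary = bracketings(["no money", "no come"])
ternary = bracketings(["no come", "no money", "no shelter"])

print(len(binary))    # 1
print(len(ternary))   # 2
for grouping in ternary:
    print(grouping)
# ('no come', ('no money', 'no shelter'))
# (('no come', 'no money'), 'no shelter')
```

The two groupings printed for the ternary case correspond only loosely to the first two paraphrases of (3) given above; the sketch is meant to show why a purely paratactic grammar, lacking any means of marking such grouping, leaves the choice open.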
In Minimalism (e.g. Chomsky 1995) and its predecessors, the central combinatorial
operation Merge is widely considered to be binary; that is, Merge can combine only
two elements at a time (see e.g. Kayne 1984). The same assumption
holds for the operation Adjoin, which is akin to paratactic attachment (see
Chapter 4).3 As a consequence of binary Merge, binary branching is taken in
Minimalism and its predecessors to be a syntactic universal, characteristic of all
languages. More accurately, because it was empirically determined that the vast
majority of syntactic structures across languages can be analyzed as involving
binary branching, the operation Merge was hypothesized to be able to combine only
two words/phrases at a time.
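
The contrast between the two operations can be made concrete with a minimal sketch. It assumes a toy representation that is not part of the theory itself: the class names Word and Phrase, the function names merge and adjoin, and the convention of forming a phrase label by adding "P" to the head's category are all illustrative choices. The sketch covers only head-complement Merge and shows that Merge projects a new headed constituent, while Adjoin merely expands its host.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Word:
    form: str
    category: str  # e.g. "T", "v", "V", "N", "Adv" (illustrative labels)


@dataclass(frozen=True)
class Phrase:
    label: str      # category of the whole constituent, e.g. "TP"
    left: "Node"
    right: "Node"


Node = Union[Word, Phrase]


def merge(head: Word, complement: Node) -> Phrase:
    """Binary Merge: exactly two inputs, and the head projects its category."""
    return Phrase(label=head.category + "P", left=head, right=complement)


def adjoin(adjunct: Node, host: Phrase) -> Phrase:
    """Adjoin: also binary, but the result keeps the host's label (no new head)."""
    return Phrase(label=host.label, left=adjunct, right=host)


vp = merge(Word("roll", "V"), Word("ball", "N"))   # a VP headed by the verb
v_p = merge(Word("v", "v"), vp)                    # a vP built on top of the VP
tp = merge(Word("will", "T"), v_p)                 # a TP, with Tense as the head
expanded = adjoin(Word("quickly", "Adv"), v_p)     # still a vP, only larger
```

Because merge and adjoin each accept exactly two inputs, a ternary combination cannot even be stated in this toy system; it would have to be built up in two binary steps.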

3
Very roughly speaking, operation Merge creates a headed structure, given that one of the merged
elements determines the category of the newly-created constituent. For example, in merging a Tense
element and a verb phrase, one creates a Tense Phrase, with Tense acting as the syntactic head. In contrast,
with operation Adjoin, which serves to attach e.g. adverbials, neither of the merged elements is treated as a
syntactic head (for discussion, see e.g. Adger 2003). For example, an adverb such as quickly can attach to a
vP, expanding that vP, but not creating a new headed structure. I return to the distinction between the two
operations in Chapter 4.

If so, then the initial proto-syntax, characterized by parataxis, could not have been
transitive in the modern sense of the word, given that transitivity involves three
obligatory constituents (subject, verb, object), and accommodating three such con-
stituents structurally would require hierarchical syntax.4 At least this is the claim in
Minimalism: on top of the small clause (or VP layer) in transitive structures one must
project another verbal layer, the layer of vP, as discussed later in the chapter. If, as
I argue, proto-syntax did not have hierarchical capabilities, then it could not have had
true transitivity.
But can there be languages without transitivity? How would one express the basic
notions such as “who does what to whom” in such languages? At first glance, such
grammars might seem impossible to imagine. However, as will be shown in this
chapter, there are many constructions in present-day languages that exhibit exactly
such non-transitive properties.
A good initial illustration is provided by the emergence of Nicaraguan Sign
Language (NSL) among deaf children in the 1970s and 1980s, to be discussed further in
Section 3.5 (see also Aronoff et al. 2008 for Al-Sayyid Bedouin Sign Language, which
exhibits similar properties). According to Kegl, Senghas, and Coppola (1999: 216–17),
the earliest stages of NSL, with the first generation of speakers, do not utilize transitive
N V N constructions, such as (4), at least not when both nouns are animate (Senghas
et al. 1997). Instead, the speakers resort to a sequence of two intransitive clauses, an
N V—N V sequence (5–6), clearly resembling the paratactic structures in (1–2):
(4) *WOMAN PUSH MAN.
(5) WOMAN PUSH—MAN REACT.
(6) WOMAN PUSH—MAN FALL.
Focusing on (6), one can say that the sign for WOMAN is the subject of PUSH, but
the sign for MAN here is not the object of PUSH, but instead the subject of FALL. In
this kind of grammar, there are no structural objects, as these structures are intransi-
tive.5 Similar considerations hold for Homesign syntax, as reported in e.g. Goldin-
Meadow (2005), to be discussed in Section 3.5. But, one can argue, this may just be a
phenomenon of early stages of sign languages, and nothing like that is possible in
spoken languages.6

4
This property of language, that its structures are necessarily binary-branching, may partly be a
consequence of the paratactic beginnings of language, and the processing constraints to which such
paratactic grammars seem to be subjected (see the discussion regarding the example in (3)).
5
One must also appreciate the relativity of the notions subject and object, to be discussed further in this
chapter: whether MAN/WOMAN in the above examples is subject-like or object-like depends on the
choice of the verb.
6
A reviewer has wondered why the acquisition of these sign languages is relevant for language
evolution. As pointed out for the same question raised for language acquisition in general (Section 2.5.1),
my approach postulates that the foundational layers of syntax need to be in place before one can build more

However, this chapter will go over a variety of present-day structures which blur
the subject/object distinction in this same way. One example of this kind of grammar
is the so-called exocentric VN compounds, which necessarily consist only of two
words (i.e. two free morphemes), a verb and a noun, resembling small clauses (see
Section 3.3.2 for more details). If Givón’s (1971) well-known slogan “today’s morph-
ology is yesterday’s syntax” has some truth to it, then it provides additional support
for the claim that the mold these compounds are poured into may be just fossilized
syntax of an earlier era. One more recent example of a compound which preserves a
stage of English syntax is the name for the plant forget-me-not. While English
speakers no longer use this kind of syntax in sentences (e.g. *You forgot me not), it
is preserved in this particular compound.7
In the underlined compounds in (7) the noun is subject-like, while in the rest of the
compounds it is object-like, as discussed below.
(7) scare-crow, kill-joy, pick-pocket, cry-baby, cut-purse, busy-body,
spoil-sport, turn-coat, rattle-snake, hunch-back, dare-devil,
wag-tail, tattle-tale, saw-bones, cut-throat, Burn-house, Love-joy,
Pinch-penny (miser), sink-hole, turn-table
Even though these compounds contain a verb, and the verb takes one argument (the
noun), which is typically object-like, it would be wrong to analyze such compounds
as transitive. First of all, clearly, there is no second argument in these compounds,
which would count as a subject. Also, the noun is not necessarily object-like, but can
also be subject-like, as is the case with the underlined compounds. While a scarecrow
is somebody who scares crows (crow is object-like), a rattlesnake is a snake that
rattles (thus subject-like), and a cry-baby is a baby (or somebody) who cries (again
subject-like). But the nouns in both of these cases appear in exactly the same position
and the same form in the compound, following the verb, and thus there is no formal
differentiation between object-like and subject-like arguments in this sense. This is
quite comparable to the clauses characterizing early stages of Nicaraguan Sign
Language, as illustrated above in (5–6). The VN compounds in other languages,

complex layers. If the acquisition of a sign language proceeds in stages, then these stages are expected to be
consistent with the postulated scaffolding.
7
According to a reviewer, Givón’s slogan is controversial. However, my approach does not use Givón’s
slogan as a reconstruction method, but rather just to give an extra dimension to the claim that (verbal)
compounds may have preserved a very old stage of syntax. In this respect, Anderson (1988) discusses
Givón’s slogan and concludes that while “it is impossible to identify all of today’s morphology with
yesterday’s syntax” (338), “there is every reason to believe that much morphology does in fact represent
the reanalysis of earlier syntactic complexity” (340), even though the relation between the two is not simple
and direct (see also Lightfoot 1979). According to Lightfoot (1979: 160), “the morphology is notoriously
slow to adapt to changing syntax” and may reflect syntactic patterns of great antiquity. If true, then this can
be helpful for my proposal, which attempts to reconstruct the earliest stages of human syntax.

including Serbian, show exactly the same properties in this respect, as discussed in
detail in Chapter 6.
Furthermore, the intransitive constructions in some modern ergative languages
share this property as well. In these languages, the subject of the intransitive clause is
structurally not distinguishable from the object, both appearing in the so-called
absolutive case, as illustrated in the following example from Tongan (an Austronesian
language spoken in Tonga; Tchekhoff 1979: 409):8
(8) ‘oku kai ‘ae iká.
PRES eat the.ABS fish
‘The fish eats.’
‘The fish is eaten.’
As the two distinct translations indicate, the only argument (the fish) can be
interpreted here as either the subject or the object of eating, once again illustrating
an intransitive grammar which does not make a formal distinction between subjects
and objects. As pointed out in Tchekhoff (1973), as well as by other researchers (e.g.
Authier and Haude 2012; Blake 1976; Mithun 1994: 247; Shibatani 1998: 120), the
subject/object distinction does not play a role in such ergative/absolutive patterns,
especially those which are both syntactically and morphologically ergative, as will be
explained below. In addition to these, several other absolutive-like constructions
found in present-day languages, in fact languages classified as nominative-accusative,
will be considered in this chapter, including unaccusatives, nominals, and middles.
The main proposal in this chapter is that the initial paratactic (non-hierarchical)
grammars were intransitive grammars, whose clauses consisted of just two (proto-)
words. In this proposal, transitivity is seen as an innovation brought about by
superimposing an additional layer of structure (perhaps the vP layer of Minimalism)
upon the foundational (absolutive) layer, with some intermediate “middle” construc-
tions paving the way toward transitivity. Not only can this approach shed light on the
ergative-absolutive and nominative-accusative dichotomy found across today’s lan-
guages, but it can also explain the availability of the foundational absolutive-like
patterns in various guises in primarily nominative-accusative languages. The recur-
ring theme of this monograph is that each stage preserves, and builds upon, the

8
While ABS does not appear in the gloss in the original, I have added it here because this would be
typically considered as absolutive case. Tchekhoff calls it the “first modifier,” as opposed to the “second
modifier,” which corresponds to an agent (ergative case).
Interestingly, as reported by Haiyong Liu (p.c. 2013), Chinese shows similar vagueness of expression,
especially when the perfective particle le is used (see also Section 3.3.3 for comparable data from Riau
Indonesian).
(i) Ji chi le.
chicken eat PERF
‘The chicken(s) have/has finished eating.’
‘The chicken was eaten.’

achievements of the previous stage(s). In this case, the proposal is that transitive
structures (vP shells), as well as middles, are built upon the foundation of intransitive
(absolutive-like) VPs (or small clauses), shedding light on a host of quirky phenom-
ena across languages.
As was the case with the small clause proposal in Chapter 2, this proposal also
involves an internal reconstruction based on the theoretical postulates within Min-
imalism. Just as one can peel the TP layer off a modern sentence (Chapter 2), one can
also peel off the vP layer, resulting in intransitive small clauses. Recall that the
reconstruction method used in this book is based on the hierarchy of functional
projections which allows a SC/VP to be composed without a TP or vP, but does not
allow either a vP or a TP to be composed without a VP/SC. This renders the proposed
progression of stages theoretically plausible.
In the process of evolving transitivity, i.e. grammaticalizing the syntactic positions
of more than one argument, I propose that there are/were various types of intermediate
steps, as discussed in Section 3.4. The evidence for these intermediate stages
includes various “middle” constructions, which straddle the boundaries between
transitivity and intransitivity and between passives and actives, and which neutralize
the distinction between subjects and objects. I exemplify this with se middle
constructions, to be introduced below, where se is analyzed as a meaningless
proto-transitive marker.
As with the analysis of small clauses in Chapter 2, the argument for the proposed
progression through stages (absolutive, to middle, to transitive) has three prongs to it:
(i) evidence of “tinkering” with the language design, so that fossils of one stage
provide foundation for the next, possibly through intermediate steps; (ii) identifying
“living fossils” of each stage in modern languages; (iii) existing or potential corroborating
evidence. Moreover, the goal is to show that each identified stage accrues
concrete and tangible advantages over the previous stage(s), advantages that are
significant enough to be targeted by natural/sexual selection.
In this respect, Section 3.2 shows how intransitive absolutive-like structures get built
into the transitive (vP) structures, thus providing evidence of evolutionary tinkering
with the language design. Section 3.3 introduces further living fossils of the postulated
absolutive-like stage in the evolution of syntax. Section 3.4 considers middle construc-
tions and serial verb constructions, both of which straddle the boundary between
intransitivity and transitivity. There is also some corroborating evidence for an
intransitive stage, as well as potential testing grounds, as discussed in Section 3.5.

3.2 Intransitive absolutives

This chapter postulates a stage in the evolution of syntax in which only intransitive
absolutive-like patterns were available, i.e. patterns in which a verb takes only one
argument, and in which the distinction between subjects and objects is neutralized, in
fact, irrelevant. This is the sense in which I am using the term “absolutive-like” in this
context, just to indicate that an intransitive structure does not distinguish subjects
from objects grammatically. This is not to imply in any way that there was a special
marking of an ergative argument, to contrast with the absolutive one. My proposal is
that this intransitive proto stage could only have one argument per clause. From
there, one can see how ergative and nominative languages would have diverged in the
way they express additional arguments in sentences. Ergative languages would have
kept the absolutive pattern for intransitive sentences, but added ergative arguments
to this absolutive structure in order to express transitive patterns. On the other hand,
nominative-accusative languages would have developed a special, accusative case
only for the lower argument, establishing a category of the object. It could be that
certain middle constructions in the latter languages paved the way toward developing
the accusative case, as discussed in Section 3.4.2.
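
The divergence just described can be summarized in a small sketch. The role labels S, A, and P (sole argument of an intransitive clause, agent-like argument, and patient-like argument of a transitive clause) are a common typological shorthand introduced here for illustration only, as are the table CASE_BY_ALIGNMENT and the function mark_cases; the entry for the proto stage is a hypothetical rendering of the one-argument grammar proposed in this chapter, not an attested system.

```python
# S = sole argument of an intransitive clause,
# A = agent-like and P = patient-like argument of a transitive clause.
# These labels and the table below are illustrative assumptions.
CASE_BY_ALIGNMENT = {
    "proto (one argument per clause)": {"S": "ABS-like"},
    "ergative-absolutive": {"S": "ABS", "P": "ABS", "A": "ERG"},
    "nominative-accusative": {"S": "NOM", "A": "NOM", "P": "ACC"},
}


def mark_cases(alignment, roles):
    """Return the case assigned to each role, or fail if the grammar has no slot for it."""
    table = CASE_BY_ALIGNMENT[alignment]
    missing = [role for role in roles if role not in table]
    if missing:
        raise ValueError(f"{alignment} has no slot for role(s): {missing}")
    return {role: table[role] for role in roles}


print(mark_cases("ergative-absolutive", ["S"]))         # sole argument: ABS
print(mark_cases("ergative-absolutive", ["A", "P"]))    # ERG added on top of ABS
print(mark_cases("nominative-accusative", ["A", "P"]))  # ACC reserved for the lower argument
```

On this rendering, the ergative system keeps the absolutive marking for both S and P and adds a new case for A, while the accusative system singles out P; the proto stage simply has no way of accommodating a second argument.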
It should be pointed out that this is a very different view from the one that would
advocate missing or null arguments. In this analysis, one is dealing with a two-slot
grammar with only one argument slot, and there is nothing missing or null syntac-
tically speaking.9 This is a perfectly coherent grammar, even if simpler than e.g.
transitive grammars. Developing such a grammar would have constituted an enor-
mous advantage over no grammar at all, but this kind of grammar has less expressive
power than a fully transitive grammar, exactly the kind of scenario that would allow
evolutionary forces such as natural selection to operate (see Chapters 2, 4, and 7).
Pressure to accommodate additional arguments would have been a powerful driving
force behind the evolution of more complex (transitive) patterns.
This proposal is entirely consistent with the analysis of transitivity in e.g. Minimalism,
where transitivity is considered to involve an additional layer of verb structure,
a vP shell (e.g. Chomsky 1995). In this analysis, the internal (lower) argument is
generated in the VP (or SC), and the external argument (e.g. agent) in the vP (9–10),
as discussed in Section 1.7.
(9) Maria will roll the ball.
(10) a. [SC/VP roll the ball] →
b. [vP Maria [SC/VP roll the ball]] →
c. [TP: Maria will [vP Maria [SC/VP roll the ball]]]
In deriving the sentence in (9), one starts with the basic, small clause layer in (10a).
Then, the agent (Maria) is merged in the higher vP layer (10b), which is responsible

9
Bickerton (1990, 1998) discusses pidgin languages, as well as child language, in the light of language
evolution, and concludes that these systems are not real languages. One of the reasons why these systems
are not treated as “real” language in Bickerton’s work is that they do not realize all the arguments that seem
to be obligatory in adult speech (see the discussion in Section 1.6). However, given that constructions with
“missing” arguments are also prevalent in adult languages, one cannot really conclude that this is not “real”
language. Instead, my argument is that languages are composites encompassing structures of various
degrees of syntactic layering, reflecting different stages in the evolution of human language.

not only for accommodating this additional argument, but also for assigning
(abstract) accusative case to the object (the ball). Finally, the TP layer is projected
on top of the vP layer, and “Maria,” the highest argument, moves to become the
subject of the TP (10c).
Thus, just as is the case with the small clause vs. TP distinction discussed in
Chapter 2, here as well we have a layer of structure (vP) superimposed upon the
foundational, absolutive (small clause) layer. In both cases, the small clause with one
argument is the foundation. In more elaborate grammars, full transitive sentences
will have all three layers, arranged in a hierarchy of projections (see e.g. Abney 1987):
(11) TP > vP > SC/VP.
Assuming this kind of structure building in Minimalism, my proposal in fact does an
internal reconstruction to arrive at the intransitive small clause proto-syntax, as
proposed in Chapters 1 and 2, and repeated below:

Internal reconstruction of clause structure (based on Minimalism)

A structure X is considered to be (evolutionarily) primary relative to a structure Y if
X can be composed independently of Y, but Y can only be built upon the foundation of X.
This hierarchy of functional projections is an influential theoretical construct, with
good empirical foundation, and it is significant that it can be used to reconstruct
proto-syntax.
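
The definition above can be checked mechanically against the three layers at issue. The following is a minimal sketch under one stated assumption taken from the discussion in this chapter: neither a vP nor a TP can be composed without an SC/VP foundation, while an SC/VP needs nothing beneath it. The names REQUIRES, well_formed, all_structures, and is_primary are illustrative only.

```python
from itertools import combinations

LAYERS = ("SC/VP", "vP", "TP")

# Assumption: what each layer must be built upon.
REQUIRES = {"SC/VP": set(), "vP": {"SC/VP"}, "TP": {"SC/VP"}}


def well_formed(structure):
    """A non-empty set of layers is well formed if every layer has its foundation."""
    return bool(structure) and all(REQUIRES[layer] <= structure for layer in structure)


def all_structures():
    """Enumerate every well-formed combination of layers."""
    for size in range(1, len(LAYERS) + 1):
        for combo in combinations(LAYERS, size):
            candidate = frozenset(combo)
            if well_formed(candidate):
                yield candidate


def is_primary(x, y):
    """X is primary relative to Y if X occurs without Y, but Y never occurs without X."""
    structures = list(all_structures())
    x_without_y = any(x in s and y not in s for s in structures)
    y_without_x = any(y in s and x not in s for s in structures)
    return x_without_y and not y_without_x


for x, y in [("SC/VP", "vP"), ("SC/VP", "TP"), ("vP", "TP"), ("TP", "vP")]:
    print(x, "primary relative to", y, ":", is_primary(x, y))
```

Under this assumption the SC/VP comes out as primary relative to both vP and TP, whereas neither vP nor TP is primary relative to the other, since each can be built directly on the small clause.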
While Chapter 2 provided evidence for living fossil structures without a TP, my
focus in this chapter is on the fossil structures without vPs. In fact, intransitives,
especially unaccusatives (see Section 3.3.1), can be accommodated without the vP
layer (12–13), as discussed in Section 1.7. In other words, the vP layer is optional.10
(12) The ball will roll.
(13) a. [SC/VP roll the ball]
b. [TP: The ball will [SC/VP roll the ball]]
Given that there is no agent, and no accusative case either, the vP shell need not
project in (13a). In English, the object-like argument (the ball) has to move to the TP
projection and become a structural subject (13b).11

10
Recognizing that vP is an optional layer means that the hierarchy TP > vP > SC/VP has to be seen in
the following way. The SC/VP serves as the necessary foundation for all clausal constructions. Transitivity (vP)
must have a SC/VP as its foundation. TP, on the other hand, must have either the SC/VP or the vP as its
foundation. If both vP and TP are present, then the TP will dominate vP. Because vP is considered to be
just a shell, another layer of the verb phrase, a unified generalization for TP is that it has to be built
upon the foundation of a verbal layer.
11
One exception in English is the type of fossil structure illustrated by the small clause Come winter in (i),
in which the subject does not move, and which closely parallels the structure of the unaccusative small
clauses in Serbian, discussed below in the text (see also Chapter 2).
(i) Come winter, he will go to Florida.

Verbs like roll, which participate in both transitive and intransitive patterns,
clearly show fluidity in the expression of subjecthood (see also Sections 1.7 and 3.4).
Observe that (9) and (12) start with exactly the same foundation, the small clauses in
(10a) and (13a), respectively. Whether the ball will be the object or the subject of the
sentence depends on whether or not there is an additional argument. What counts as
a subject is thus relative to the number of arguments expressed.12
Recall from Chapter 2 that in the absence of the TP layer in unaccusative small
clauses in Serbian of the kind in (14), only one layer of structure is available, the
[SC/VP] layer:
(14) [SC/VP Pala vlada.]
fell government
In conjunction with the examples above, we see a gradual progression toward more
syntactic complexity, from one single layer of structure in (14), to two layers of
structure in English tensed unaccusatives (13), to three layers of structure with
English tensed transitive clauses (10c), abstracting away from some other functional
projections that may be there. Crucially, this gradual increase in complexity is arrived
at not through impressionistic means, but by a precise method of internal recon-
struction based on theoretical considerations.
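
The progression from one to three layers can be traced with a minimal sketch. The function build_clause and its parameters are hypothetical names used only for illustration, and the bracketed output is a simplified rendering in which the lower copy of a moved argument is left in place, without marking it as unpronounced.

```python
def build_clause(verb, theme, agent=None, tense=None):
    """Build a bracketed clause bottom-up and report which layers were projected."""
    structure = f"[SC/VP {verb} {theme}]"   # foundational small clause layer
    layers = ["SC/VP"]
    if agent is not None:
        # The vP shell is projected only to host an additional argument.
        structure = f"[vP {agent} {structure}]"
        layers.append("vP")
    if tense is not None:
        # The highest available argument becomes the subject of the TP.
        subject = agent if agent is not None else theme
        structure = f"[TP {subject} {tense} {structure}]"
        layers.append("TP")
    return structure, layers


examples = [
    build_clause("Pala", "vlada"),                                # Serbian small clause
    build_clause("roll", "the ball", tense="will"),               # tensed unaccusative
    build_clause("roll", "the ball", agent="Maria", tense="will"),  # tensed transitive
]
for structure, layers in examples:
    print(len(layers), structure)
```

The same argument, the ball, ends up as the subject or the object depending only on whether an agent is supplied, which is the relativity of subjecthood noted in the text.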
Grammaticalizing transitivity in e.g. nominative-accusative languages, with a
structural accusative case and the vP/VP shell, would not have precluded some
other structures (e.g. unaccusatives, se clauses, nominals, compounds) from retaining
the absolutive-like flavor. If these simpler grammars are easier to process, then their
retention at least in some constructions is to be expected.
This section has shown that intransitive (absolutive-like) structures get built into
the transitive vP shells, providing the necessary foundation for transitivity, thus
offering evidence of evolutionary tinkering with the language design. The following
section introduces further types of living fossils of the postulated absolutive-like stage
in the evolution of syntax.

3.3 More on living fossils: What is it that unaccusatives, exocentrics, and absolutives have in common?

In this section I consider in more detail the following “living fossils” of the postulated
absolutive-like stage in the evolution of human language: unaccusatives (Section 3.3.1),
exocentric VN compounds (Section 3.3.2), absolutives in ergative-absolutive languages

12
Of note is also that Borer’s (1994) fully configurational approach to argument linking assumes that
the arguments within the VP are hierarchically unordered, and that there is no lexical distinction between
subjects and objects inside the VP. Such distinctions can only be made with the help of the functional
projections, such as vP. This is consistent with the proposal in this monograph that the foundational small
clause layer of structure is non-hierarchical and absolutive-like.

(Section 3.3.3), as well as (other) absolutive-like constructions found in nominative-accusative
languages, including nominals and dative subject clauses (Section 3.3.4).

3.3.1 Unaccusatives
Unaccusative small clauses were introduced in Chapter 2, where the focus was to
establish that such clauses are structures without a TP layer, showing neither Move
nor subordination. What is relevant about them in this chapter is that they are
intransitive structures which can be generated without projecting the vP layer either.
This kind of grammar is a good approximation of the hypothetical two-word stage, as
discussed in Section 3.1, as well as in Chapter 2. Moreover, this kind of grammar is
reminiscent of the grammar found in exocentric VN compounds, as discussed
further in Section 3.3.2, as well as in Chapter 6.
As pointed out in Section 3.2, unaccusatives can be accommodated without
projecting the vP layer:
(15) [TP: The ball will [SC/VP roll/fall the ball]]
Recall that vP is projected primarily in order to accommodate an additional argu-
ment, typically the agent, as well as the accusative case, but unaccusative structures
have only one argument and no accusative case (hence their name). Unaccusatives
can be roughly characterized as intransitive structures whose sole argument is
typically a theme, showing some object-like properties, including the postverbal
position in some cases (see e.g. Perlmutter 1978; Burzio 1981; Levin and Rappaport
Hovav 1995, for crosslinguistic manifestations and characterizations).
In Serbian, for example, there is a clear preference for unaccusative “subjects” to
follow the verb, the position typically associated with objects. If these unaccusatives
are at the same time TP-less small clauses (e.g. 16), this preference becomes more
rigid, with strong preference for the otherwise non-canonical VS order (see Progovac
2008a,b for details):
(16) a. Pala vlada.
fall.PART government
‘The government has (just) collapsed.’
b. Proš’o voz.
gone.PART train
‘The opportunity has passed.’
This provides support for the unaccusative analysis under which the “subjects” of the
unaccusative verbs (e.g. arrive, fall, come, appear) Merge as “objects” of the small
clause (e.g. Burzio 1981).
Recall that unaccusative small clauses in Serbian are analyzed as involving one
single layer of structure, the SC/VP layer, and that their subjects thus have no syntactic
position into which to move (Chapter 2; see also Section 3.2):
(17) [SC/VP Pala vlada]

Without a vP or TP layer, these are just rigid and flat two-word structures, reasonably
good approximations of the postulated two-word stage.13
Given this proposal, the unaccusativity phenomenon can be seen as an option to
retain (elements of) absolutive-type grammars in constructions which can be sup-
ported by such grammars, e.g. intransitive constructions with a single (internal)
argument (Casielles and Progovac 2010, 2012).14 To put it another way, if proto-
syntax involves less grammatical burden, and is less costly to process, then one can
expect to find it in constructions in which more complex grammars do not confer
much advantage. Transitive constructions, as well as intransitive constructions
involving agents in some languages, may need extra syntactic space, e.g. a vP shell,
and thus cannot be expressed as readily with this type of grammar.
Intransitive absolutive constructions in ergative languages, as discussed in
Section 3.3.3, as well as various absolutive-like constructions elsewhere, are again struc-
tures which blur the distinction between subjecthood and objecthood in the sense that
their only argument has characteristics of both.15 The only difference seems to be that
unaccusatives tend not to encompass agents, and this may be due to the special status
agents have acquired in nominative-accusative languages, possibly by being associated
with their own functional projection, such as vP in Minimalism. In other words, even if
there is an association of the role of agent with the vP in some modern languages today

13
A reviewer wonders why this stage could not have had a noun phrase in lieu of the noun, combining
with the verb, resulting in a multiple-word stage. Perhaps this kind of complexity, involving modification,
arose only later, as it would have created an asymmetrical structure. Also, the typical modifiers of nouns,
adjectives, would have evolved in a later stage, given Heine and Kuteva’s (2007) reconstruction, and
considering that not all languages distinguish the category of adjectives.
14
As pointed out in Section 1.6, Casielles and Progovac (2010, 2012) explore the connection between
unaccusatives and thetic statements. According to e.g. Marty (1918), categorical judgments (also referred to
as double judgments) involve two successive acts (choosing an entity and making a statement about it) and
are expressed by the traditional subject-predicate sentences (Vlada je pala ‘(the) government has col-
lapsed’). In contrast, thetic statements or simple judgments merely assert a state of affairs where a new
situation is presented as a whole. In these statements the entity involved in the event forms a unit with it
(Pao sneg ‘Fell snow’). There is a lot of overlap between thetic and unaccusative grammars. It would stand
to reason that grammars which generate thetic statements are evolutionarily more primary, as well as
simpler. In this respect, Gil (2012) has proposed that predication is a composite emergent entity, rather
than a primitive.
15
Comrie (1978) has made an argument that subjecthood across languages is not a rigid notion, but a
notion on a continuum. This can be accommodated within the evolutionary scenario explored here,
according to which this distinction was not there at all in the first stages of proto-syntax.
In Minimalism, subjecthood is characterized structurally/mechanically, based on the position of the
phrase, as well as on its agreement properties. Thus, very roughly speaking, if a phrase (in English) occupies
a certain syntactic position (e.g. a specifier position of a TP), and/or if it agrees with the Tense element,
then it is considered to be the subject, descriptively speaking. But in fact whether or not we call this phrase a
subject matters very little in this syntactic theory. Thus, the fluidity of the concept of subjecthood does not
seem to pose a problem for this theory. Where the problems lie is in the attempt to rigidly associate specific
thematic roles with specific syntactic positions, as addressed from this perspective in Progovac (2014b).

(but see Progovac 2014b), this association was certainly not there in the two-word
grammar stage. What the two phenomena have in common, absolutives and unaccusa-
tives, is the unavailability of the accusative case, that is, the structural case which is
reserved only for objects.16

3.3.2 Exocentric compounds

Another phenomenon that is difficult to explain given the postulates of modern
morpho-syntax are exocentric VN compounds of the kind illustrated in (18–19)
below for English and Serbian. However, their shape makes sense if they are seen
as fossils closely approximating a two-word absolutive-like (intransitive) stage in the
evolution of human language (see Progovac and Locke 2009; Progovac 2009a, 2012).
(18) scare-crow, kill-joy, pick-pocket, cry-baby, spoil-sport, turn-coat,
rattle-snake, hunch-back, dare-devil, wag-tail, tattle-tale,
drynk-pany (drink-penny; miser (a surname)), pinch-penny
(miser), sink-hole, turn-table, busy-body
(19) ispi-čutura (drink.up-flask—drunkard), guli-koža (peel-skin—who
rips you off), cepi-dlaka (split-hair—who splits hairs), muti-voda
(muddy-water—trouble-maker), jebi-vetar (screw-wind—
charlatan), vrti-guz (spin-butt—fidget), tuži-baba (whine-
old.woman; tattletale), pali-drvce (ignite-stick, matches)
The grammar behind these compounds is quite simple: it is a two-place mold that
can fit exactly one verb and one noun, whether the verb is semantically monovalent
or bivalent. Moreover the thematic role of the noun is underspecified. While the
noun in these compounds is often interpreted as an internal argument, correspond-
ing to an object in a sentence, it can also be external, corresponding to a subject in a
sentence, as is the case with the underlined compounds in both languages. As pointed
out in Section 3.1, the noun in crybaby is subject-like, while the noun in scarecrow is
object-like.
Once again, the grammar behind these compounds provides no morpho-syntactic
differentiation between subjecthood and objecthood, leaving room for vagueness.
For example, a rattlesnake is conventionally interpreted as a snake that rattles, but
one can imagine this word also used for somebody who routinely rattles snakes, on

16
It follows from this proposal that proto-clauses did not have structural case, whether accusative or
nominative, as discussed in Chapter 2. In Minimalism, structural nominative case is associated with the
functional projection of TP, and structural (accusative) case with the projection of vP. While nominative
and accusative noun phrases can have different morphological manifestations (e.g. he/who vs. him/whom
in English), the syntactic theory considers that even without such overt manifestations, there are abstract
case relationships between Tense (TP) and nominative, and between the light verb (vP) and the object in
the accusative-type languages.

analogy with e.g. pick-pocket.17 Notice that a more complex compound, snake-rattler,
which has a transitivity layer, is no longer vague in this way, and can only be
interpreted as somebody who rattles snakes.
Perhaps a better way to make this point would be to consider a contrast between a
turn-coat and a turn-table. Conventionally, a turncoat is somebody who turns his
coat/skin inside out (traitor), with the coat being object-like. On the other hand, a
turntable is a table that turns, where the table is subject-like. But if a turntable can be
a table that turns, then, in principle, grammatically speaking, a turncoat could be a
coat that turns, perhaps a coat that is reversible. Likewise, if a turncoat can be
somebody who turns his coat, then, in principle, a turntable could be someone
who (routinely) turns tables upside down, perhaps a rowdy regular in a bar. Again,
this flexibility is not there with syntactically more elaborated compounds, such as
table-turner, which cannot mean, not even in principle, a table that turns.
Exocentric VN compounds can thus be seen as absolutive-like constructions
which blur the distinction between subjecthood and objecthood, and which also
lack accusative case, the properties these compounds share with unaccusatives and
(other) absolutives.
It is of interest that exocentric VN compounds across languages seem to specialize
for derogatory reference when referring to humans, possibly implicating their ori-
ginal use in ritual insult. Chapter 6 explores the proposal that the ability to create
such compounds in the distant evolutionary past may have been sexually selected,
contributing to the consolidation of proto-syntax, as well as to vocabulary building.
As pointed out by a reviewer, there are many other compound types in English,
combining other categories, such as an adjective and a noun (blackbird), a noun and
a noun (snowman), a noun and an adjective (sky-blue). There are several reasons why
they are not considered in this monograph, although future research might reveal
relevance of some of these for evolutionary considerations, perhaps compounds of
the egghead type. First of all, even though N-N compounds in English may seem
simple and straightforward at first sight, they are typically not only headed (the
second element is the syntactic and semantic head of the whole compound), but
they are also recursive, producing: styrofoam snowman, or policy committee decision

17
Some compounds can even be simultaneously doubly interpreted in this respect: Serbian pali-drvce
(ignite-stick, matches) is at the same time a stick that ignites and a stick that gets ignited. In this case, the
vagueness is quite expressive and appropriate. Precision is not always desirable, and this can provide partial
explanation for the persistence of vague expressions. One example where vagueness is desirable involves
suppressing the agent of the action in passives, as in (i). In English, passive constructions serve this purpose
particularly well, while in other languages, such as Serbian, middles are used for this purpose as well
(Section 3.4).
(i) The policeman was wounded.
The point here is that one does not always want or care to express precisely who did what to whom, but just
to express that something happened.

process. While it may seem that such combinations of nouns directly reflect our
cognitive abilities for headed Merge and recursion, it is worth pointing out that not
all languages in fact use such compounds, and especially not recursively (see
Section 1.6; Chapter 6; also Snyder 2014).
In contrast, as will be discussed in Chapter 6, VN compounds are typically neither
headed (hence the name exocentric) nor recursive. Moreover, they are relevant for
the purposes of this book because they are combinations of a verb and a noun, typical
building blocks of clauses/sentences, and the first categories to emerge and be
differentiated (e.g. Heine and Kuteva 2007). Finally, these VN compounds reveal
evidence of ritual insult, rendering them of particular interest for evolutionary
considerations for that reason as well. Chapter 6 offers additional reasons for their
evolutionary significance.

3.3.3 Absolutives
The vagueness attested in exocentric VN compounds is also characteristic of intransi-
tive absolutives in some ergative-absolutive languages. Consider another example
from Tongan featuring an intransitive sentence with the absolutive case (Tchekhoff
1973: 283):18
(20) Oku ui ‘a Mele
PRES call ABS Mary
‘Mary calls.’
‘Mary is called.’
In this intransitive sentence, Mary can be interpreted either as the agent of the action, or
the patient/theme. But, as pointed out in Tchekhoff (1973), this sentence means neither
“Mary calls” nor “Mary is called” in Tongan, these being just two different translations
of one single underdetermined/underspecified structure in Tongan. In other words,
these translations reflect our nominative/accusative bias. Instead, all this sentence
means is that there is calling, and that Mary is implied in the process (Tchekhoff 1973:
284). This is also the essence of Gil’s analysis of vague sentences in Riau (Footnote 18), as
well as my proposal for middles in Serbian, and the idea of a proto-role (Section 3.4.2).

18
See also Gil (2005) for an extensive discussion of comparable vague clauses in Riau Indonesian:
(i) Ayam makan
chicken eat
‘The chicken is eating.’
‘Somebody is eating the chicken.’
Etc.
While Gil does not analyze Riau as an ergative/absolutive language, this may be simply because it does not
have a special ergative case marking, which would then contrast with an absolutive case. But, for all relevant
purposes, the structure in (i) above can be considered absolutive-like, as it exhibits the same properties
found in intransitive constructions in ergative-absolutive languages such as Tongan.

In ergative-absolutive patterns, the subject of an intransitive predicate is
morpho-syntactically equivalent to the object, both characterized as absolutive arguments (e.g.
Comrie 1978; Dixon 1994). Only agents of transitive verbs are marked distinctly, with
ergative case.19 It is only after the addition of the external ergative argument (e.g.
John below) that the role of Mary disambiguates and is necessarily patient/theme
(Tchekhoff 1973: 283). In other words, the addition of the external argument forces
the inner absolutive layer to distinguish itself from the external argument, resulting
in more precision.20
(21) Oku ui ‘e Sione ‘a Mele
PRES call ERG John ABS Mary
‘John calls Mary.’
The examples in (20–21) illustrate quite clearly how the ergative argument (John) is
inserted into the basic absolutive layer. They also illustrate something that has been
noted repeatedly in the typological literature, that the ergative-absolutive structures
resemble passive structures in nominative-accusative languages, in which the agent is
introduced as an oblique argument, e.g. as a by-phrase in English passives, as
discussed further below (see e.g. Hale 1970). These similarities extend to the nominal
domain as well, as discussed in the following section.
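
The alignment contrast just described can be made concrete with a small Python sketch. The role labels S (sole argument of an intransitive), A (agent of a transitive), and P (patient of a transitive) are the usual typological shorthand rather than notation used in this chapter, and the function name `case_of` is hypothetical. The sketch covers only idealized systems and ignores split patterns.

```python
def case_of(role: str, alignment: str) -> str:
    """Return an idealized case label for a core argument.

    role: "S" (sole intransitive argument), "A" (transitive agent),
          or "P" (transitive patient); shorthand assumed for this sketch.
    alignment: "ergative" or "accusative" (split systems are not modeled).
    """
    if role not in {"S", "A", "P"}:
        raise ValueError(f"unknown role: {role}")
    if alignment == "ergative":
        # Only the transitive agent is singled out; S and P pattern together.
        return "ERG" if role == "A" else "ABS"
    if alignment == "accusative":
        # Only the transitive patient is singled out; S and A pattern together.
        return "ACC" if role == "P" else "NOM"
    raise ValueError(f"unknown alignment: {alignment}")


# Tongan (20): a single argument, marked absolutive.
print([(name, case_of(role, "ergative")) for name, role in [("Mele", "S")]])
# Tongan (21): the added agent is ergative, and Mele remains absolutive.
print([(name, case_of(role, "ergative"))
       for name, role in [("Sione", "A"), ("Mele", "P")]])
```

Under the ergative setting, adding the agent leaves the marking of the original argument untouched, which is one way to picture the ergative argument as an addition to a basic absolutive layer.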
Dyirbal (Australian language spoken in northeast Queensland) is another ergative
language which, like Tongan (see also Dukes 1998), exhibits syntactic ergativity, in
the sense that the absolutive role even in transitive constructions continues to behave
in a subject-like fashion, as illustrated with a coordinated structure below (Dixon
1994: 155):
(22) nguma yabu-nggu buran banaganyu
father.ABS mother-ERG saw return
‘Mother saw father and (father) returned.’
This clearly contrasts with English (23) below, in which a comparable coordinated
structure yields the opposite result for the missing argument:
(23) Mother saw father and (mother) returned home.
In fact, if we were to coordinate a passive sentence and an active sentence in English,
we would get the pattern comparable to the one in (22) from Dyirbal:
(24) Father was seen by mother, and (father) returned home.

19
To put it slightly differently, as is often done in the literature on ergativity, the ergative alignment
involves formal singling out of the agent of transitive verbs in contrast to the patient of transitive verbs and
the single argument of intransitive verbs (e.g. Authier and Haude 2012; see also Comrie 1978; Dixon 1994).
20
Notice that the addition of -er in VN compounds has a comparable effect, as pointed out above.

This is the sense in which the ergative phrase can be likened to the passive by-phrase.
The by-phrase here, just like the ergative phrase in (22), is not the true, structural
subject, but only the “logical” subject, as will also be discussed with respect to the
noun phrases in Section 3.3.4.1. This is also the reason behind the proposals in Nash
(1996) and Alexiadou (2001) that ergative phrases may be attached by adjunction, in a
way similar to the attachment of the passive by-phrase in English.
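
The coordination facts in (22–24) can be summarized in a short Python sketch. The function name `shared_argument` and its parameters are hypothetical, and the sketch is a deliberate simplification: it handles only a transitive first conjunct followed by an intransitive one, and it does not model other voice alternations or languages with only morphological ergativity.

```python
def shared_argument(agent: str, patient: str, syntax: str,
                    voice: str = "active") -> str:
    """Pick the argument of a transitive first conjunct that is understood
    as the missing argument of a following intransitive conjunct.

    syntax: "ergative" (syntactically ergative, as described for Dyirbal)
            or "accusative" (as in English).
    voice:  "active" or "passive"; only consulted for accusative syntax.
    """
    if voice not in {"active", "passive"}:
        raise ValueError(f"unknown voice: {voice}")
    if syntax == "ergative":
        # The absolutive argument (the patient) behaves in a subject-like way.
        return patient
    if syntax == "accusative":
        # The structural subject is shared: the agent in the active, the
        # patient in the passive (where the agent is only a by-phrase).
        return agent if voice == "active" else patient
    raise ValueError(f"unknown syntax: {syntax}")


print(shared_argument("mother", "father", "ergative"))               # (22)
print(shared_argument("mother", "father", "accusative"))             # (23)
print(shared_argument("mother", "father", "accusative", "passive"))  # (24)
```

The first and third calls return the same referent, which is the sense in which the Dyirbal pattern in (22) lines up with the English passive in (24) rather than with the English active in (23).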
While Tongan and Dyirbal are analyzed as syntactically ergative languages, in the
sense that they exhibit both morphological and syntactic ergativity, there are many
ergative languages spoken today which exhibit only morphological ergativity, pat-
terning with English with respect to e.g. coordination (see Aldridge 2008 for an
overview and discussion; thanks also to Robert Henderson, p.c. 2013, for a discussion
on this). Likewise, ergative-absolutive languages typically show so-called split ergativity,
in the sense that they are ergative with some nouns/pronouns, but accusative
with other nouns/pronouns, as discussed in Section 7.3.3. Tongan has also
developed certain morphological constructions that can be analyzed as accusative
patterns (see e.g. Tchekhoff 1973). It may well be that every language has some ergative
and some accusative patterns, and it is only a matter of which patterns prevail.
Assuming that there was an intransitive absolutive-like (proto-syntactic) stage in
the evolution of human language, one can envision the subsequent development of
the two basic language types, primarily nominative-accusative and primarily erga-
tive-absolutive. Lehman (1985: 245) points to the gradient nature of the distinction
between the ergative and accusative types: “a language is never wholly and exclusively
either ergative or active or accusative, in all its grammatical patterns.” As pointed out
in this section and in the following sections, there are many absolutive-like construc-
tions in nominative-accusative languages. Likewise, so-called ergative languages
often develop nominative-accusative patterns in some domains, e.g. in the domain
of personal pronouns (which are higher on the animacy hierarchy), resulting in so-
called split ergativity (e.g. Trask 1979 and references there; see Chapter 7 for more
discussion). This overlap is what one would expect under the evolutionary approach
explored here.
Bringing unaccusativity and ergativity under the same umbrella, Bok-Bennema
(1991: 169) points out that ergativity and unaccusativity are both characterized by the
inability of transitive verbs to assign structural case to their deep objects. To put it
another way, neither ergative nor unaccusative structures can have true (syntactic)
objects, that is, objects distinguished from subjects by means of a specific structural
case (see Footnote 16). According to e.g. Alexiadou (2001: 18; also Hale 1970; Nash
1995), ergative/absolutive patterns are reflexes of a passive/unaccusative system.
Therefore, what all these phenomena have in common (absolutives, exocentric VN
compounds, unaccusatives, and passives) is that the verb is unable to assign struc-
tural case to its deep object. Given that the object does not receive a distinct
(accusative) marking, the distinction between subjecthood and objecthood is blurred.

These phenomena begin to make sense if they are seen as survivors from a two-
word proto-syntax stage, which could only accommodate one argument per verb,
and which did not have the means to distinguish between subjects and objects. As
pointed out above, it is perfectly plausible to expect that the absolutive-like patterns
will be preserved in some constructions, especially those in which subject/object
differentiation is not important. It is also conceivable under this approach that the
foundational absolutive-like patterns will be found in some guise or another in
nominative-accusative languages as well, as explored further in the following section.
Languages may vary considerably with respect to the degree to which they rely on the
foundational absolutive-like patterns, but my argument is that every language has at
least some constructions which are absolutive-like in nature, providing continuity
and common ground between the two language types.

3.3.4 More absolutive-like patterns in nominative/accusative languages

As noted in e.g. Authier and Haude (2012: 2) “some notoriously ‘accusative’ lan-
guages such as Latin, French, and in fact many Indo-European languages may have
some hints of ergativity” (see also Bauman 1979: 430; Lehman 1985). Such hints of
ergativity have already been introduced in this chapter for English and Serbian
exocentric compounds, as well as for unaccusatives. This section considers additional
constructions that can be seen in a similar light, including nominals (Section 3.3.4.1),
dative subjects (Section 3.3.4.2), and clausal complements (Section 3.3.4.3).
3.3.4.1 Nominals This section shows that even in English one finds, in
productive use, these absolutive-like structures which do not distinguish subjects
from objects, resulting in vagueness. According to e.g. Alexiadou (2001), nominals
across various languages are intransitive, as well as absolutive-like (passive-like). In
other words, all nominals, whether passive or not, have an intransitive base (see also
Picallo 1991; Bottari 1992; Alexiadou and Stavrou 1998). In passive nominals the agent
appears as an adjunct, as in (25) from Alexiadou (2001: 78).21
(25) the destruction of the city by the barbarians
In this analysis, by-phrases in derived nominals can only be interpreted as affectors
(agents, instruments, creators), rendering examples such as (26) not fully grammatical.22

21
Comrie (1978) suggests that nominalizations constitute a possible source for ergativity. Or perhaps it
is the other way around.
22
One reviewer does not find (26) completely ungrammatical. A native speaker I consulted likewise
finds this example marginal/awkward, but not fully ungrammatical. On the other hand, (25) is fully
grammatical, indicating that there exists some contrast here, although perhaps subtler than perceived in
Alexiadou (2001). Interestingly, a similar contrast is offered in Pesetsky and Block (1990: 751) in order to
challenge Pinker and Bloom (1990), as discussed in Section 7.4:
(i) the city’s destruction by the enemy
(ii) ?*the city’s sight by the enemy

According to the authors, unlike with the verbal domain, there is no structural external
argument in nominalizations, generated in a vP (or an nP, the nominal equivalent of a
vP), and the presence of the by-phrase seems to be lexically licensed. In that sense, the
external argument in the by-phrase resembles ergative case, which is also often analyzed
as lexical/prepositional case, rather than structural case (see above).23
(26) ??the receipt of the package by John
This is consistent with the proposal in this chapter that the intransitive, absolutive-
like/passive-like patterns provided a foundation for evolving transitive structures,
with ergativity and accusativity being different solutions to the same problem of
accommodating an additional, external argument.
3.3.4.2 Dative subjects Consider next dative “subjects” in Serbian, which co-occur
with nominative “objects” in what certainly looks like an ergative/absolutive pattern:
(27) Meni se pije kafa.
me.DAT SE drinks coffee.NOM
‘I feel like drinking coffee.’
Nominative on the “object” is like absolutive, being also the case of intransitive
subjects, while dative adds an external argument, akin to an ergative (see e.g.
Alexiadou 2001; Nash 1996, for an adjunction analysis of the ergative argument). As
pointed out in e.g. Trask (1979: 398), the ergative case is often identical to the genitive,
dative, or locative. According to Nash (1996: 171), ergative subjects, like dative subjects,
cannot co-occur with structural accusative, but instead appear with absolutive/nom-
inative “objects.” This is yet another construction in which the verb fails to assign
structural (accusative) case to what would be its object.
It is also of significance here that dative subjects in Serbian typically co-occur with
the (middle) pronoun se. As per the proposal in Section 3.4.2, se is associated with the
ancient absolutive-like pattern.
3.3.4.3 Clausal complements The clausal complements of the so-called raising
predicates, such as seem, appear, likely, as well as of predicates such as obvious,
are also absolutive-like/unaccusative-like in nature. While they are generated as

23
For additional references claiming that ergative is an inherent case, see e.g. Woolford (1997, 2006);
Legate (2008); Massam (2000, 2001). There are alternative analyses of ergative arguments. For example,
Otsuka (2011) treats ergative as structural, rather than inherent case, based on the analysis proposed by
Levin and Massam (1985), and further developed by Bobaljik (1993) and Laka (1993). According to that
analysis, both ergative and absolutive are structural cases, and the difference between accusative and
ergative languages is taken to be the choice of primary case between the two core structural cases, one
assigned by T and the other assigned by v. These references also suggest that the absolutive case is licensed
by v, which would not work with my analysis, according to which vP is not projected in intransitive
absolutive constructions.

complements of the verb, they do not receive accusative case, and there is no external
argument either, which is reminiscent of the unaccusative grammars.

(28) It seems/appears/is likely that John is in jail.

(29) It is obvious/surprising that John is in jail.

For purely grammatical purposes, the subject position of these sentences hosts an
expletive (meaningless) pronoun it, but this pronoun is certainly not an argument of
the verb. In fact, what looks like an external argument can optionally be added, as in:

(30) It seems to me that John is in jail.

(31) It is obvious to me that John is in jail.

Intriguingly, when it comes to comparable predicates in Serbian, their external
argument, if expressed, would appear as a dative subject:

(32) Čini mi se da je Jovan u zatvoru.

seems me.DAT SE that is John in jail

Both Serbian mi and English to me can be seen as a type of ergative case, added to the
otherwise absolutive foundation. This just shows that various quirky and exotic-looking
phenomena across languages can be understood in this evolutionary framework.

3.4 Precursors to transitivity

3.4.1 Serial verb constructions
As a reviewer rightly points out, also of relevance to this discussion are the so-called
serial verb constructions, widespread in Creole languages, in the languages of West
Africa, Southeast Asia, Amazonia, Oceania, and New Guinea. Serial verb construc-
tions can be characterized as sequences of verbs “which act together as a single
predicate, without any overt marker of coordination, subordination, or syntactic
dependency of any other sort,” describing what is conceptualized to be a single event
(e.g. Aikhenvald 2005: 1). What one observes in these examples again is that there is
one argument per verb, and the relationship of that argument to the verb seems
absolutive-like. According to Givón (1979: 220), serial verb constructions involve “a
concatenation of small propositions in which, roughly, a one-to-one correlation is
maintained between verbs and nominal arguments.”
Aikhenvald further states that these constructions are monoclausal, and that their
intonational properties are the same as those of a monoverbal clause, sharing just one
tense, aspect, and polarity value. Importantly, she also mentions that serial verbs do
not necessarily have to be next to each other, as they are in (34), but can also be
separated by other constituents, as in (33).

Anyi-Sanvi (Kwa family, Niger-Congo: Van Leynseele 1975: 191–2)

!
(33) cùá c̀i ákÓ dì
dog catch.HAB chicken eat
‘The dog eats (lit. catch-eat) a chicken’
Igbo (Igboid, Benue-Congo, Niger-Congo; Lord 1975: 27)
(34) ó tì-wà-rà ètèrè á
he hit-split.open-TENSE plate the
‘He shattered the plate.’
It is of note here that (33) in essence has an N V—N V structure, comparable to the
structures attested in Nicaraguan Sign Language (e.g. WOMAN PUSH – MAN
FALL) (see e.g. Section 1.6). Intriguingly, Senghas et al. (1997: 558) claim that the
N V – N V structures of the first generation of NSL signers tend to become N VV
N (WOMAN PUSH FALL MAN), or N N VV (WOMAN MAN PUSH FALL)
combinations, with the second generation. In other words, the second generation is
grouping the two verbs so that they are adjacent to each other, as is typically the case
with serial verb constructions. As Senghas et al. (1997: 560) conclude in the article, the next
stage may be a stage where transitivity emerges, with only one verb remaining to
support both nouns. This is essentially the path of grammaticalization envisioned in
e.g. Givón (1979: 220).
While I would like to leave a more detailed investigation of serial verb construc-
tions from this perspective for future research, it is worth pointing out that devel-
oping an intricate system of such constructions may have been yet another route
toward transitivity, by embracing the dual-verb structures. If so, it is significant that
the intransitive foundation (one absolutive-like argument per verb) provides the
common ground for such a wide range of strategies for expressing transitivity.24 In
fact, this astonishing variety of strategies for expressing transitivity across languages
and constructions seems to have nothing in common except for this paratactic,
absolutive-like foundation. If true, this provides significant support for the recon-
struction explored in this chapter, and for the gradualist approach to the evolution of
syntax more generally.

3.4.2 The “middle” ground

According to e.g. Kemmer (1994: 181), “the reflexive and the middle can be situated
as semantic categories intermediate in transitivity between one-participant and

24
If this is the origin of at least some serial verb constructions, then at least they should not be analyzed
on a par with compounds, or as some kind of freely Merged V-V combinations (as per the discussion in
Section 1.6). Instead, they should be seen as a by-product of, or as one kind of solution to, the emergence of
transitivity from paratactic combinations of intransitive small clauses.

two-participant events.”25 Here I consider just one representative example: se
constructions in Serbian that can be characterized as middles as they straddle the
boundary between the passive and active voice. In addition to dative subject clauses
introduced in the previous sections (3.3.4.2 and 3.3.4.3), se is also used in a wide
variety of other constructions in Serbian, and may well be one of the most frequently
used words in the language.
Where pragmatics allows, se constructions in Serbian exhibit astounding vague-
ness of meaning, and se clearly cannot be analyzed as a reflexive pronoun, reflexivity
being only one of the available interpretations, and not even a preferred one, as the
following examples illustrate:
(35) Deca se tuku.
children SE hit
‘The children are hitting each other/?themselves.’
‘The children are hitting somebody else.’
‘One hits/spanks children.’
(36) Pas se ujeda.
dog SE bites
‘The dog bites (someone).’
‘?The dog is biting itself.’
‘?One bites dogs.’
(37) Marko se udara loptom!
Marko SE hits ball.INST
‘Marko is hitting me with a ball.’
‘Marko is hitting somebody with a ball.’
?‘Marko is hitting himself with a ball.’
?‘One is supposed to hit Marko with a ball.’
If (37) is uttered with a sense of urgency, the most probable interpretation will involve
the most salient discourse participant, the speaker, even though there is no word or
morpheme corresponding to the first person at all! Even though (38) below offers an
unambiguous way of expressing the first reading of (37), (38) is much less likely to be
used in the heat of the moment, suggesting that se constructions are easier to process
than regular transitives:
(38) Marko me udara loptom!
Marko me hits ball.INST

25
Kemmer (1994: 184) points out that middle systems are quite widespread, being found in a large
number of genetically and areally divergent languages. According to Arce-Arenales, Axelrod, and Fox
(1994: 2–3), the “middle diathesis” is marked in all nom-acc languages, and many constructions which have
traditionally been analyzed in terms of passive voice could be better understood as middle diathesis.

It is significant that the vagueness in se clauses illustrated above is comparable to that
found with Tongan absolutives (20) and Riau intransitives (Footnote 18), as well as with
exocentric compounds. In (37), as apparent from the translations, Marko can be either
the subject (agent), or the object (patient), or both at the same time, the latter option
yielding the reflexive interpretation. This kind of ambivalence can only be a result of
underspecification, that is, of simple, unarticulated syntax and semantics.26 Given this,
the meaning of (37) and (38) can be roughly characterized in the following way:
(37’) There is an event of hitting with a ball, and Marko is a participant
in that event.
Logical formula: ∃e [H(e) ∧ Participant (Marko, e)]
(38’) There is an event of hitting with a ball, and Marko is the agent of
that event, and the speaker is the patient of that event.
Logical formula: ∃e [H(e) ∧ Agent (Marko, e) ∧ Patient (Me, e)]
It is probably more accurate to characterize (38’) as (38’’) below, building directly on
the middle pattern in (37’):
(38’’) There is an event of hitting with a ball, and Marko is the agent of
that event, and the speaker is the participant of that event.
Logical formula: ∃e [H(e) ∧ Agent (Marko, e) ∧ Participant
(Me, e)]
This would essentially mean, as discussed in this chapter, that the basic absolutive
layer is still preserved even in (38), and that it is by virtue of superimposing a higher
argument that the initial participant is now interpreted as a non-agent, in this case as
patient/theme. This is exactly what we see with the Dyirbal data in Section 3.3.3.
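
The contrast between the formulas in (37') and (38'') can be sketched in Python. The class name `Event`, the function `interpretations`, and the role labels returned are all hypothetical choices made for illustration; the sketch simply encodes the idea that a lone participant is open to several interpretations, while a superimposed agent narrows the participant to a non-agent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    predicate: str
    participant: str              # absolutive-like argument, role unspecified
    agent: Optional[str] = None   # superimposed external argument, if any


def interpretations(event: Event) -> list[str]:
    """List the roles the underspecified participant may be taken to bear."""
    if event.agent is None:
        # As in (37'): agent, patient, or both at once (the reflexive reading).
        return ["agent", "patient", "agent+patient"]
    # As in (38''): once an agent is added, the participant is a non-agent.
    return ["patient"]


se_clause = Event("hit-with-ball", participant="Marko")
transitive = Event("hit-with-ball", participant="speaker", agent="Marko")
print(interpretations(se_clause))
print(interpretations(transitive))
```

Nothing about the participant itself changes between the two calls; the narrower interpretation follows only from the presence of the added agent.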
Interestingly, Dowty (1991) also questions the rigidity and discreteness of theta
roles, and proposes that they can instead be seen as prototypes, or proto-roles, such
as proto-agent and proto-theme roles (thanks to a reviewer for leading me in this
direction). The participant role that I am using here can then be seen as an even more
underspecified role, just a proto-role. This is then how one can characterize proto-
predication – as involving a verb (predicate) and just one argument, with a proto-role
of a participant.
The presence of se simply implies that there is one more participant involved in the
event, in addition to the one surfacing, and typically its role can be inferred from
pragmatic context (e.g. 37). But the role of the expressed argument (e.g. Deca, Pas, or
Marko above) still remains absolutive-like, not grammatically specified as either
subject or object, giving rise to massive ambiguities (Progovac 2005a, 2013b, 2014a,b).

26
Recall that VN compounds, which are also analyzed as absolutive-like, are likewise vague in this
respect, with the noun acting either as an object, or as a subject, or as both at the same time in some cases.

The constructions in (35–36) once again illustrate a two-word grammar at work (this
time enhanced by the particle se). Even though, pragmatically speaking, one is
dealing here with an event with two participants, this kind of fossil syntax cannot
express both arguments, nor can it specify whether the only expressed argument is
subject or object.27
Comparable vagueness may also be found with cognate se constructions in other
Slavic languages, but also in Spanish (Arce-Arenales, Axelrod, and Fox 1994: 5),
clearly indicating that the phenomenon illustrated above is not just a quirk of
Serbian:28
(39) Juan se mató.
Juan SE killed
‘Juan got killed.’
‘Juan killed himself.’
Serbian se is analyzed in Franks (1995) and Progovac (2005a) as an expletive (mean-
ingless) pronoun, “absorbing” accusative case. Another way to look at it is to say that
se in these constructions is a proto-transitive/proto-accusative marker imposed upon
an ancient absolutive pattern, but being stuck in this intermediate stage between
absolutivity and transitivity. As pointed out by Maggie Tallerman (p.c. 2014), these se
constructions, as well as other constructions which I consider “transitional” in this
framework, are not transitional in the sense that they are unstable or in the process of
changing – they can only be transitional in the sense that they straddle the boundary
between transitivity and intransitivity.
It is hard to be sure how to analyze these se constructions by using the tools of
Minimalism, and the derivation in (40) is just a suggestion:
(40) [TP Deca [FP se [SC/VP deca tuku]]]
children SE children hit
Again, the idea is that the noun and the verb are first Merged in a SC/VP (Section 1.7).
Next, a proto-transitive functional word se is Merged with the SC/VP to create some

27
It is interesting to note in this respect that Otsuka (2011) analyzes some of the Tongan constructions
as involving a null SE anaphor, even though, as he mentions, Tongan does not have any overt anaphors!
The way I see it, the author is simply noticing a connection between absolutivity and se middles.
28
Consider also the vagueness of the English example below:
(i) The children got dressed.
As argued in e.g. Alexiadou (2012), these get-passive constructions should be analyzed as middles, that is,
constructions which have only one structural argument. In this respect, get-passives are non-canonical
passives, given that canonical passives are taken to have two structural arguments. Middles in English also
include examples such as (ii-iii), among others:
(ii) These apples sell well.
(iii) The glass cuts easily.

kind of functional projection, whose head is se, and which can be labeled as FP.29
Finally, assume that the TP is created, and the noun deca Moves to the specifier of the
TP. FP is still not a vP, as it does not introduce an agent, nor does it disambiguate the
role of the absolutive-type argument in the SC/VP, but it can be considered as a
precursor to vP. The next step(s) in developing vP-type transitivity in accusative
languages would be to associate this FP with an additional, external argument, such
as agent, and to associate the internal argument with the special (accusative) case.
Interestingly, without se, the absolutive pattern vanishes, and the only argument
has to be interpreted as subject/agent performing an action on an unspecified object,
as is also the case with English translations in (41) and (42), a familiar consequence of
accusative grammars:
(41) Deca tuku.
‘The children are hitting (somebody).’
(42) Pas ujeda.
‘The dog bites (someone).’
This suggests that the fossil absolutive-like structures in Serbian are only preserved
under the wing of se (as further explored in Progovac 2014b).
It seems, then, that the distinctions between subjecthood and objecthood, transi-
tivity and intransitivity, passive and active, can be neutralized, and can have a middle
ground. One way to make sense out of this is to postulate an intransitive absolutive-
like stage in the evolution of human language, a stage which provides a foundation
for any subsequent elaboration of argument structure.
Importantly, however, introducing transitivity with a structural accusative case
(vP/VP shell) to a language does not preclude some other constructions (e.g.
unaccusative small clauses, nominals, se constructions, compounds) from remaining
absolutive-like. What is also important to emphasize is that many of these founda-
tional structures still live inside/within the more complex structures. For example,
absolutives generated in small clauses/VPs arguably live inside nominals, se con-
structions, and transitives, and small clauses in general live inside TPs, as commonly
assumed in syntactic theory (Chapter 2). This reinforces the claim in this monograph
that small clauses and intransitive absolutives constitute the foundation, the platform
on top of which one can build (or not) more complex syntax, namely TPs and
transitivity, perhaps in the form of vP shells.
Transitivity in syntax thus need not be seen as a conceptual necessity, but rather as an
evolutionary innovation; it can be seen as an additional layer of structure superimposed
upon the foundational (absolutive) layer, leading to a variety of crosslinguistic strategies
for marking case relations, and reflected in the postulation of two verbal layers in
Minimalism (two vP shells). This renders syntax a quirky system, a product of tinkering,
rather than a system optimally designed from scratch. As was the case with the
small clause/TP distinction discussed in Chapter 2, the hypotheses explored in this
chapter are testable/falsifiable, as well as corroborated by evidence from other fields, as
discussed in the following section.

29
Se could have even started out as some kind of linker in the sense of Chapter 4.

3.5 Corroborating evidence and testing grounds

The strongest corroborating evidence for the proposal in this chapter comes from
language acquisition, involving both sign languages and spoken languages.
Neuroimaging would, once again, provide a good testing ground for the hypotheses
proposed in this chapter.
As pointed out in Section 3.1, the emergence of NSL provides excellent corrobor-
ation for the proposal. According to Kegl, Senghas, and Coppola (1999: 216–17), the
earliest stages of NSL, observed with the first generation of speakers, do not exhibit
transitive N V N constructions, such as (43) below, at least not when two animate
nouns are involved (Senghas et al. 1997). Instead, one finds what look like sequences
of two clauses of the kind (N V—N V) (44–45):
(43) *WOMAN PUSH MAN.
(44) WOMAN PUSH—MAN REACT.
(45) WOMAN PUSH—MAN FALL.
Aronoff et al. (2008, and references there) found a similar pattern for another sign
language that emerged spontaneously about 70 years ago, Al-Sayyid Bedouin Sign
Language (ABSL). They also report that there is a tendency toward one argument per
predicate, where e.g. transitive events involving two animate referents are rendered
by two or even three clauses.
These sequences can be analyzed as paratactic/symmetric combinations of two
intransitive (small) clauses, which are interpreted as the first one causing the sec-
ond.30 In this sense, this grammar is absolutive-like, and resembles the grammar
behind serial verb constructions and other absolutive-like constructions discussed in
this chapter, in that only intransitive structures are available, that is, each verb can
have only one argument.

30
This is not necessarily how the authors of the article would analyze these data. My personal
communication with Ann Senghas and Marie Coppola (p.c. 2014) revealed that they are revisiting those
early analyses, and that there are complexities involved. But, as far as I understand, the claim still stands
that in the earliest stages of NSL one finds these N V – N V types of constructions, in lieu of N V N
transitive constructions, when both Ns are animate. When one of the nouns is inanimate, then
apparently transitive structures are possible.

It is perhaps of interest to mention here that the overwhelming majority of the
world's languages are classified as either Subject-Verb-Object (SVO) or
Subject-Object-Verb (SOV). Both types can be derived easily from a binary N V – N V
pattern, comparable to the paratactic patterns in (44–45). If one starts with an N V –
N V sequence, and assigns the role of S (subject) to the first noun, and the role of
O (object) to the second noun, one can easily derive the two word orders above
by dropping one of the verbs (the dropped verb could then be grammaticalized as a
null light verb (v), as per the syntactic theory). There is another symmetric paratactic
possibility: V N – V N, the verb initial order being attested in e.g. unaccusatives
and VN compounds. If, again, the first noun is associated with S, and the second
noun with O (as per the Cause First principle discussed in e.g. Section 1.6), this
underlying pattern can easily yield the SVO order again, but also another possible
word order across languages, VSO. The other logically possible word orders are
extremely rare across languages. Needless to say, this is a rather speculative observation.
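
The derivation just described can be made concrete with a small sketch. The Python function below is purely illustrative and is not part of the original analysis: the name derive_word_orders and the string encoding of clause shapes ("NV", "VN") are assumptions introduced here. It labels the noun of the first clause S and the noun of the second clause O (as per Cause First), then drops one of the two verbs and records the resulting order.

```python
def derive_word_orders(clause_shape):
    """Derive word orders from two paratactic clauses of the same shape.

    clause_shape is "NV" or "VN" (a hypothetical encoding). The noun of the
    first clause is labeled S, the noun of the second clause is labeled O,
    and then one of the two verbs is dropped.
    """
    if clause_shape not in ("NV", "VN"):
        raise ValueError("clause_shape must be 'NV' or 'VN'")

    # Build the four-word sequence; verbs are tagged with their clause index.
    sequence = []
    for clause_index, noun_role in enumerate(("S", "O")):
        for symbol in clause_shape:
            if symbol == "N":
                sequence.append(noun_role)
            else:
                sequence.append(("V", clause_index))

    # Drop the verb of the first clause, then the verb of the second clause.
    orders = set()
    for dropped in (0, 1):
        kept = [w if isinstance(w, str) else "V"
                for w in sequence if w != ("V", dropped)]
        orders.add("".join(kept))
    return orders


print(sorted(derive_word_orders("NV")))  # ['SOV', 'SVO']
print(sorted(derive_word_orders("VN")))  # ['SVO', 'VSO']
```

Under these assumptions, the N V – N V pattern yields SVO and SOV, and the V N – V N pattern yields SVO and VSO, matching the orders named in the text. The sketch says nothing about why one verb rather than the other would be dropped.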
According to Goldin-Meadow (2005), the syntax of Homesign languages, self-
styled gestural communication systems spontaneously developed by deaf children
not exposed to sign language, also appears to be absolutive-like. In Homesign, both
patients/themes and intransitive agents tend to precede verbs, once again neutraliz-
ing the distinction between subjects and objects. Also, patients are more likely to be
expressed than agents, as is also the case with exocentric compounds and nominals
discussed in the previous sections. As Goldin-Meadow notes, both American and
Chinese deaf children are more likely to produce the sign for the eaten than for the
eater. In Zheng and Goldin-Meadow’s (2002: 171–2) study, the Chinese children
showed a bias to omit only the subjects of caused motions (agents), not the subjects
of spontaneous motions. Subjects of spontaneous motions were produced as often as
objects.
Considering that early stages of NSL, ABSL, and Homesign are languages arguably
constructed from scratch, the patterns of intransitivity and ergativity observed in
their creation are of evolutionary significance (see Section 2.5.1 for much more
discussion regarding the reasons why language acquisition can be relevant for
language evolution studies). At the very least, these considerations demonstrate
that there is a simpler way to break into syntax, starting with intransitive clauses
and blurring the distinction between subjecthood and objecthood.
Moreover, children acquiring spoken languages also go through a two-word stage
(Bloom 1970) which seems to be characterized by similar proto-syntactic patterns. It
is often claimed for child language acquisition that children “delete” arguments in
their speech, that is, that they do not express all the arguments that would typically be
required in the adult grammar.31 According to Zheng and Goldin-Meadow (2002:
171–2), such “deletions” are not random, but rather follow an ergative pattern. If
children in these cases are using absolutive-type intransitive grammars, as per the
proposal in this chapter, then they are not deleting anything, but rather just using the
syntactic mold in which there is room for expressing only one single argument.

31
As pointed out in Section 1.6, Bickerton (1990, 1998) takes this frequent omission of arguments to
indicate that children at this stage do not have “real” language.

Similar patterns in language acquisition of spoken languages have been reported
by other authors. For example, when hearing children are exposed to Korean (Clancy
1993) or Samoan (Ochs 1982), they too follow the deaf children’s pattern—they omit
transitive subjects and produce intransitive subjects and objects, exhibiting essen-
tially an absolutive pattern. Indeed, the same pattern has been observed for English
language acquisition (Goldin-Meadow and Mylander 1983: 63). As Zheng and
Goldin-Meadow (2002: 171–2) conclude, the ergative pattern is more robust, consid-
ering that the omission pattern found in all of these hearing children and the deaf
children is reminiscent of the alignment found in ergative languages. This ties in well
with the approach explored in this chapter.
As pointed out by a reviewer, the preferred discourse pattern in a variety of
languages is the pattern in which only one argument is given in full, while the
other arguments are either omitted altogether or occur in a reduced (affix) form
(see e.g. Newmeyer 2005: 132–3, and references there). For example, Du Bois (1985:
347–9) found that in Sacapultec, a Mayan language of Guatemala, most clauses in the
discourse contain only one full noun phrase, with zero noun phrases also very
common. The full NP that commonly occurs is the absolutive, consistently following
the verb, while the ergative full noun phrases are infrequent.
In addition, Du Bois (1987) has noted that the pattern in which the grown is
expressed more readily than the grower is common in the adult languages of the
world, as attested with the intransitive constructions in (b) from English:
(46) a. John grew tomatoes. b. John grew.
(47) a. John shook Bill. b. John shook.
While the transitive pattern in (a) necessarily takes John to be an agent, the
intransitive counterparts in (b) favor the interpretation in which John is undergoing
the action, as a theme/patient. In other words, there is avoidance of agents/external
arguments in these cases (see also Casielles and Progovac (2010, 2012) for the
significance of this phenomenon for language evolution, and in particular for the
Agent-First hypothesis).
Interestingly, the bonobo Kanzi has been reported to have mastered a VS (two-
word) syntax in his use of lexigrams and gestures, based on the description in
Greenfield and Savage-Rumbaugh (1990: 161), as well as Heine and Kuteva (2007:
145–7). First of all, Kanzi only uses two-word combinations, including creations with
one verb and just one argument, in a way that does not distinguish agents/subjects
from patients/objects, with both following the verb. While Kanzi’s initial combinations
(during the first month) show free word order (hide peanut, peanut hide), the later
combinations seem to converge on the productive VS order, even when the noun is the
agent, in the sense that the verb is followed by an agent gesture.32
There is a lot of controversy surrounding the interpretation of these and other
reports on primate communication, and it is not my intention to engage with these
controversies in this book. For now, suffice it to say that, if Kanzi is in principle
capable of (sporadic) two-word (intransitive) combinations, then it is conceivable
that at least some individuals of our common ancestor with bonobos were too. This
would have been enough to allow the process of natural selection for language.
Last but not least, as pointed out in Chapter 2, neuroimaging can provide a fertile
testing ground for the hypotheses explored in this chapter. The suggestion is that one
can use the subtraction and other neuro-linguistic methods to determine how proto-
syntactic structures are processed in comparison to their more complex counterparts,
in the hope of finding neuro-biological correlates of, for example, vP shells and
transitivity (see Progovac 2010b).
For the reasons given in the Appendix, while the processing of transitives with vP
shells is expected to show clear lateralization in the left hemisphere, with extensive
activation of specific Broca’s areas, the proto-syntactic structures, such as absolutive-
type constructions, as well as middle se constructions, are expected to show less
lateralization, and less involvement of Broca’s area, but more reliance on both
hemispheres, as well as, possibly, more reliance on the subcortical structures of the
brain. To take just one concrete example (not discussed in the Appendix), it follows
from the analysis presented in this chapter that se constructions (and middles in
general) are easier to process than regular transitives, given that they involve simpler,
less articulated syntax. This can be tested given the availability of minimally con-
trasting pairs in Serbian involving se constructions (48) vs. true transitive counter-
parts (49), as suggested in Progovac (2014a,b):
(48) Marko se udara!
Marko SE hits
(49) Marko me udara!
Marko me hits
If syntax evolved gradually, through several stages, then it is plausible to expect
that modern syntactic structures and operations decompose into evolutionary
primitives. If so, this will not only be measurable in the activation of the brain, but
without these evolutionary considerations it may not be possible to achieve a true
breakthrough in the field of neuro-linguistics (see Section 2.5.3).

32
Tallerman (2012: 453) points out that human syntax is far more than regularities in word
order, concluding that “at most we can agree that Kanzi has learned a productive proto-grammar.”
Tallerman (2012: 454) further elaborates that “certain properties that we might call proto-syntactic are
attested in animal language research. Words can be meaningfully combined, especially in novel ways . . . ”
This is where the reconstruction of syntax in this book should be helpful. It decomposes syntax all the way
down to the simplest syntactic strategy, which in turn allows one to find some continuity, however tenuous
it may be, with animal communication systems.

3.6 Conclusion
This chapter builds on the arguments of Chapter 2, and reconstructs a stage in the
evolution of human language which is characterized by intransitive small clauses,
lacking vP and TP structure, and allowing only one proto-argument per clause, that
is, an argument whose thematic role is underspecified. This stage is arrived at by
internal reconstruction based on the syntactic hierarchy of functional projections.
Peeling off the outer clausal layers, TP and then vP, one arrives at the basic
predication structure of an intransitive small clause. As with the proposal in
Chapter 2, there are three prongs to this argument. First, the absolutive-like pattern
is shown to provide a foundation upon which transitive structures are built. Second,
there is a variety of absolutive-like foundational structures even in nominative-
accusative languages. And, third, there is good corroborating evidence and promising
testing grounds for this proposal. Furthermore, postulating an intransitive absolu-
tive-like stage allows one to clearly identify the kinds of evolutionary pressures that
would have led to the rise of transitivity, as explored in Chapter 7.

Parataxis and coordination as precursors to hierarchy: Evolving recursive grammars

4.1 Hypothesized evolutionary stages of syntax

The goal of this chapter is to show that syntax can be decomposed into evolutionary
primitives/layers at an even more abstract level than explored in the previous
chapters, and that such decomposition can not only help identify the stages of
evolutionary progression of syntax, but also shed light on its very design. The intent
is also to show that the progression through such stages makes evolutionary sense,
i.e., that each new stage brings some concrete advantage(s) over the previous stage(s),
and that such advantage(s) could have been subject to natural selection.
Considering some present-day constructions, as well as the trends in grammat-
icalization processes, I propose the following three rough stages in the evolution of
syntax (i–iii), following a hypothetical non-syntactic one-word stage (0). My working
assumption, the simplest possible, and the least stipulative, is that any combination of
words/phrases into a single utterance involves syntax.

(0) One-word stage (no combinatorial power)1

It has been postulated that children go through a one-word stage as they acquire language
(e.g. Bloom 1970), but adult speech also sometimes involves single words meant as
complete utterances (as in e.g. Snake! Run! Out!). Since the one-word stage does not
involve syntax, it will not be discussed here, except to show why it would be beneficial to
advance from this stage to a proto-syntactic stage, as characterized in (i) below.

(i) Paratactic stage (proto-syntax), where prosody/supra-segmentals provide the only
glue for (proto-)Merge. In other words, in this stage there is prosodic evidence, but
not any segmental evidence, that the words/constituents are Merged. The paratactic
syntax of this stage can be characterized by an operation Conjoin, rather than Merge
proper. As explained below, Conjoin is an operation not distinct from Adjoin, as used
in e.g. Adger (2003) for the attachment of adjuncts. Unlike Merge proper, Conjoin
does not create headedness or hierarchy.

1
See Section 4.2.2 for some discussion regarding the issue of valence in a one-word stage.

Evolutionary Syntax. First edition. Ljiljana Progovac
© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

(ii) Proto-coordination stage, where, in addition to prosody, the conjunction/linker
provides all-purpose segmental glue to hold the utterance together. In this stage, the
evidence for (proto-)Merge is more robust, as it retains the prosodic evidence (the
only type of evidence available in the previous stage), and adds to it segmental
evidence (the linker), even though in this stage the linker does not add much more
than that to the interpretation. This stage is arguably still syntactically flat/non-hierarchical.

(iii) Specific functional category stage (hierarchical/subordination stage), where, in
addition to prosody and to segmental glue, specific functional categories emerge,
providing specialized syntactic glue for constituent cohesion, including tense particles
(copulas) and subordinators/complementizers. In other words, this stage
includes all the achievements of the previous stages, and adds another, which is to
use the segmental piece (linker) also to identify the type of constituent created by
Merge. To take just one example, a meaningless linker of the proto-coordination
stage, connecting the subject and its predicate, becomes a meaningful tense particle,
which can now build its own Tense Phrase (TP). I argue that it is only at this stage
that hierarchical structure, Move, and recursion become available, considering that
adjunction and coordination structures, characterizing the previous two stages, are
typically islands for Move (Chapter 5), and do not show true recursion.2
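
The contrast among the three stages can be pictured as a difference in what each combinatorial operation records. The Python sketch below is only an illustration under stated assumptions, not a formalization proposed in this book: the class Combo and the functions conjoin, link, merge_functional, is_headed, and labeled_layers are hypothetical names introduced here. Conjoin records nothing beyond the two parts, the linker stage adds a segmental piece without a label, and only the functional category stage projects a labeled layer over the structure it combines with.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Combo:
    parts: tuple                   # the two combined words or constituents
    linker: Optional[str] = None   # segmental glue, if any (stage ii)
    label: Optional[str] = None    # category projected by a functional head (stage iii)


def conjoin(a, b):
    """Stage (i): prosody is the only glue; no linker, no head, no label."""
    return Combo(parts=(a, b))


def link(a, b, linker):
    """Stage (ii): an all-purpose linker is added; the result is still unlabeled."""
    return Combo(parts=(a, b), linker=linker)


def merge_functional(head, category, complement):
    """Stage (iii): a specific functional head builds its own labeled phrase."""
    return Combo(parts=(head, complement), label=category)


def is_headed(c):
    """A combination counts as headed here only if it carries a label."""
    return c.label is not None


def labeled_layers(x):
    """Count labeled layers along the deepest path (plain words count as 0)."""
    if not isinstance(x, Combo):
        return 0
    own = 1 if x.label is not None else 0
    return own + max(labeled_layers(p) for p in x.parts)


small_clause = conjoin("problem", "solved")          # flat small clause
linked = link("problem", "solved", "LINKER")         # flat, with segmental glue
tense_phrase = merge_functional("is", "TP", small_clause)  # TP layer over the SC

print(is_headed(small_clause), is_headed(linked), is_headed(tense_phrase))
print(labeled_layers(small_clause), labeled_layers(tense_phrase))
```

In this toy encoding the small clause survives intact inside the TP, which mirrors the claim that later stages are built on top of earlier ones rather than replacing them. The encoding deliberately leaves out Move and recursion, which the text ties to the third stage.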

This is a progression from least syntactically elaborated (parataxis), to more elaborated
(coordination), to most elaborated (specialized functional categories/projections).
I consider that each of these grammars can operate both clause-internally,
e.g. to combine a subject and a predicate (into a small clause Me first!), and clause-
externally, to combine two such clauses into a single utterance (e.g. Nothing ventured,
nothing gained).

2
I should clarify here that I do not consider Move to be subsumable under Merge, the so-called Internal
Merge, as is typically assumed in the Minimalist Program today. Instead, the considerations in this book
lead me to conclude that Merge is just a necessary condition for realizing Move, but that Move requires
additional conditions to be met, as discussed below, as well as in Section 4.4.5. So does recursion.
As will be discussed in Section 4.3, the coordination stage and the subordination stage may not have
shown a clear chronological ordering in the evolution of human language, as they may have been
intertwined, just as they seem to be in today’s languages. Still, they can be ordered in terms of relative
complexity, as per the proposal in this book. I hope that future research will shed brighter light on this
issue.

As pointed out in the previous chapters, the argument for each proposed progres-
sion through stages has three prongs to it: (i) identifying “living fossils” of each stage
in modern languages; (ii) providing evidence of “tinkering” with the language design,
so that fossils of one stage can be shown to be integrated into the next, leading also to
composite structures incorporating constructions of various stages; (iii) identifying
existing or potential corroborating evidence from grammaticalization, language
acquisition, agrammatism, animal communication, neuroscience, and genetics.
Moreover, the goal is to show that each identified stage accrues concrete and tangible
advantages over the previous stage(s), advantages specific enough to be targeted by
natural/sexual selection.
One of the themes of this monograph is that the advent of a new stage does not
replace the previous stage(s), but rather that the older stages continue to co-exist,
often in specialized or marginalized roles, in addition to being built into the very
foundation of more complex structures (see Chapters 2 and 3). Evolution is known
not to throw a good thing away, but to build upon it, which is why one should expect
to find constructions of previous stages (fossils) in the later stages.
Prosody and intonation are of course still in use everywhere not only to signal
constituent cohesion, but also to signal grammatical function, such as interrogative
mood in (1). When they are used in conjunction with syntactic operations such as
Move (subject–auxiliary inversion), the result is redundancy and robustness, hall-
marks of evolutionary tinkering.
(1) Mary is already at home?
(2) Is Mary already at home?
There is also experimental evidence to show that prosody signals syntactic cohesion.
For example, Tyler and Warren (1987) have performed an experiment to see how
comprehension is affected by disrupting either syntactic or prosodic structure. Their
conclusion is that a disruption in prosody has a serious adverse effect on compre-
hension, suggesting that prosody even today plays a crucial role in achieving syntactic
cohesion. Tyler and Warren conclude that “prosody does not play the poor sister to
syntax, with prosodic information only used when there are syntactic options, such
as syntactically ambiguous phrases. Rather, prosodic information seems to be an
integral part of the comprehension process” (656). This is also consistent with
Deacon’s (1997) characterization of the role of prosody, as discussed in Section 4.5.1.3.
The progression of stages proposed above is consistent with what one finds with
the grammaticalization processes observed in the present times. The grammatical-
ization of e.g. finite subordination typically takes parataxis as a starting point and
possibly proceeds through a(n intermediate) coordination stage, as discussed in
Section 4.5. Here I extend this progression of stages even to the clause-internal level,
suggesting that predication may have also gone through a similar progression in its
evolutionary trajectory: (i) root small clause (SC) stage (created by parataxis/prosody
alone); (ii) proto-coordination stage (with a linking/conjunction-like element connecting
the subject and the predicate); and (iii) a specific functional category stage,
i.e. a hierarchical stage involving a specific functional category superimposing one
layer of structure (e.g. TP) over another (SC).
As will be discussed further below, while the evidence for a paratactic stage is
overwhelming, evidence for a coordination stage is not that robust. This may be
because the paratactic structures provide the necessary foundation for building both
coordination and subordination, while coordination may serve as an intermediate
stage only optionally, in some circumstances.
The following subsections explore each of the postulated stages of syntax, as well as
point to the possible communicative advantages of each. The following section will
consider some of the same data introduced in the previous two chapters to illustrate
the paratactic stage, namely intransitive small clauses and compounds, but the focus
in this chapter is on the nature of the bond between merged elements, as well as on
how that bond changes with the progression to the subsequent two stages, both
clause-internally and clause-externally.

4.2 Paratactic proto-syntax stage

4.2.1 Operation Conjoin: Clause-internally and clause-externally
As argued in Chapter 2, as well as in Chapter 3, the following types of small clauses
(3–4), clause combinations (5), and compounds (6) are reasonably good approxima-
tions of what the operation Conjoin (proto-Merge) could accomplish in the para-
tactic proto-syntactic stage. Recall that Conjoin can be characterized as an operation
which joins/unites two elements into a single utterance, but in so doing it does not
create headedness or hierarchy. What holds the bond together is only intonation/
prosody (i.e. supra-segmentals).
(3) Me first! Everybody out! Him apologize?! Me worry?!
(4) Case closed. Problem solved. Point taken. Mission accomplished. Crisis averted.
(5) Nothing ventured, nothing gained. Monkey see, monkey do. Come one, come all.
(6) pick-pocket, turn-coat, hunch-back, cry-baby, busy-body, rattle-snake
Of note here is also that certain root small clause types, in particular the clauses
illustrated in (3), are characterized by exaggerated intonation, possibly compensating
for the lack of functional categories or linkers, and thus tapping into the proto-
linguistic ability to use prosody/intonation for the purposes of conjoining. Given this,
it is not implausible to suggest that in this stage of proto-syntax prosody may have
been exaggerated, or perhaps even musical, in the sense of “prosodic protolanguage”
discussed in Fitch (2010: 475, and references there). However, Fitch’s proposal that
this kind of prosodic or musical protolanguage preceded words in the evolution of
language, and was devoid of propositional meaning, sort of like birdsong, was rightly
criticized by Tallerman (2013a). On the approach explored here, if there was such a
musical episode in the evolution of language, then it would have been most useful at
this (paratactic) juncture, where prosody/melody would have had a very specific
compositional function to hold the (proto-)words and utterances together (see also
Section 2.4).
Notice that adjunction, used abundantly in present-day languages, is taken to
involve a comparable kind of flat/non-hierarchical structure, essentially parataxis. In
(7) below, the adverb is traditionally analyzed as adjoining to the verbal projection
(but see Cinque 1999 for a specialized functional category approach to the attachment
of adverbs). Similarly, the adverbial clause in (8) is traditionally analyzed as adjoining
to the main clause. This kind of attachment does not create a new (functional)
category or layer, but rather loosely attaches to an already projected layer, expanding
it, as shown below.
(7) She [vP [vP worked] feverishly].
(8) [TP After considering all the options, [TP she ventured out.]]
This is what prompted Jackendoff (1999, 2002) to propose that adjunction structures
have proto-linguistic flavor, and that they can be seen as evolutionary fossils
(Section 1.6).
While it may look as if adjunction is creating an additional layer of structure in
(7–8), this is just an artifact of the lack of appropriate notation. The intent of the vP/
TP repetition in these examples is to capture the idea that the existing layer is only
expanded/stretched, not that a new layer is created. Just like conjuncts, adjuncts seem
to be in a different dimension, and have been seen as merging in a different plane
(e.g. Chomsky 2001; also Chomsky 2004; Citko 2011). It is for this reason that I do not
consider that the structures in (7–8) involve true recursion, in the sense that one
syntactic category is embedded/inserted within another, as its integral part. What we
have here instead is an adverb phrase loosely adjoining to a vP (7), and a clause
loosely adjoining to another clause (8). In the sense of Kinsella (2009), this should be
seen as iteration, rather than true recursion, as discussed further in Section 4.4.
It has been repeatedly noted in the syntactic literature that clausal adjuncts such as
the one in (8) are not fully integrated into the syntactic fabric. As put in An (2007), these
adjuncts sit in semi-integrated, “non-canonical” syntactic positions (see also Selkirk
1978; Stowell 1981; Nespor and Vogel 1986; Zec and Inkelas 1990 for the prosodic
properties of adjuncts). Whatever the analysis, it highlights the exceptional and
peripheral nature of the adjunction process. Chomsky (2004: 117) acknowledges that
“there has never [ . . . ] been a satisfactory theory of adjunction.” An evolutionary
approach provides a rationale: adjunction is not some well-designed (engineered)
property of human language, which stands in clear differentiation from all other
syntactic processes and operations, but it is rather just a fossil of the paratactic stage
in language evolution, which is thus neither the same as the more modern structures,
nor sharply differentiated from them.
In this sense, the operation Conjoin characterized above subsumes the operation
Adjoin, operative in modern languages (see e.g. Adger 2003). What they share is the
paratactic, non-hierarchical, non-headed nature of attachment. However, Conjoin, as
intended here, is a broader term than Adjoin. When it comes to modern adjuncts, by
definition, they are peripheral structures attached to the core structures, such as
adverbs attaching to the verb phrases. On the other hand, the operation Conjoin, in
the sense in which I am using it here, can also join two elements of equal status,
where neither element is subordinated to the other, resulting in symmetrical attach-
ment, which is often described as parataxis, or even as (asyndetic) coordination.
Consider the following examples from Kaqchikel (Mayan), spoken in Guatemala.3
(9) a. ru-te', ru-tata'
his-mother, his-father
‘his parents’
b. tiwila' i-juyub'al i-taq'ajal.
find your-mountains your-valleys
(Maxwell and Hill 2006: 30).

Harris and Campbell (1995: 283) also struggle with the distinction, and characterize
parataxis as “either asyndetic joining, or loose (imprecise) joining, or both at the
same time. Asyndetic joining is simply joining without a conjunction.”4

3
Thanks to Robert Henderson (p.c. 2013) for leading me to the Kaqchikel (Mayan) data. According to
Maxwell and Hill (2006: 25), “Maya writings have long shown parallelism in structure. The Mayan Codices
are replete with repetition. A set of registers may show one ﬁgure, a Chac (Rain Deity), in a variety of poses.
The accompanying texts will share a syntactic form, varying perhaps one content word, a noun or a verb.
But the substitute words will have all the same inﬂection as the originals. Such close parallelism appears in
modern spoken language in most formal genres, particularly public prayer. As the formality of speech
decreases, so does the strictness of the parallel structure.”
4
It is of note here that Givón’s (1979) pragmatic (asyntactic) mode of communication is characterized
by what he calls loose conjunction or parataxis (222–3). In addition, Gil (2005 and elsewhere) has in fact
argued that simple sentences in Riau Indonesian (i) are put together by an instance of coordination.
(i) Makan ayam
eat chicken
In the analysis pursued in this monograph, this would fall under operation Conjoin, which is meant to
capture the common ground between coordination and adjunction.

In clauses such as Me first! (see (3) repeated below), arguably created by Conjoin, it
is not clear what counts as the head (center), the pronoun or the adjective, or
something else, and that is precisely why these structures are still referred to as
small clauses, i.e. as syntactically undefined constituents (see Chapter 2 for competing
analyses of embedded small clauses). Similarly, exocentric compounds (6), which
are also arguably created by Conjoin, are traditionally considered to be headless—in
fact, the lack of headedness in these compounds is so salient that it is responsible for
their name, “exocentric,” that is, without a center/head (see Chapter 6 for details of
their analysis). Moreover, the paratactic combination of two clauses in (5) is also
headless, and is arguably also a product of Conjoin. It is obvious here that the two
clauses are on an equal footing structurally, neither one being structurally embedded
within the other. In fact, the nature of the link in the correlative constructions more
generally can be considered fossil-like in this respect, given that it involves parallel,
symmetric attachment, as discussed later in the text.
(3) Me first! Everybody out! Him apologize?! Me worry?
(5) Nothing ventured, nothing gained. Monkey see, monkey do.
Come one, come all.
(6) pick-pocket, turn-coat, cry-baby, busy-body, hunch-back, rattle-snake
In other words, unlike Merge proper, which is considered to create headedness and
hierarchy, Conjoin, subsuming Adjoin, can be seen as an operation creating flat,
exocentric (non-headed) structures. In this respect, Conjoin can be seen as creating
both conjoined constituents of equal status (parataxis/asyndetic coordination) and
conjoined constituents of unequal status (adjunction).5
Some syntacticians consider that modern Merge can be decomposed into two
operations: Concatenate and Label (e.g. Chomsky 1995; Hornstein 2009; but see Citko
2011). Given this idea, Conjoin can be taken to involve just Concatenate, but not
Label, while Merge proper can be considered to involve both. This would be in line
with the suggestion in Clark (2013) that labeling might be a later evolutionary
development. Labeling itself would be responsible for selecting one of the combined
elements to be the head/center of the whole composition, creating headedness and
asymmetry. For example, in a combination run marathons, the verb and the noun

5
According to e.g. Haspelmath (2004: 3–4), coordinating constructions can be identiﬁed on the basis of
their symmetry, and he includes here both paratactic constructions, without a coordinator, and those with
a coordinator. He also struggles with a differentiation between coordination in this sense, and subordination,
which involves asymmetry, concluding that “there are many constructions showing mixtures of
both, and we are only at the beginning of understanding what constraints there might be on such mixtures”
(37). As discussed throughout this chapter, the evolutionary approach explored here predicts that there
would be such overlap between stages, and that in fact a clear differentiation will never be possible. As
noted by a reviewer, these may pose a challenge for the Minimalist Program.

combine (by Concatenate/Conjoin), but then labeling renders the whole combination
a verb phrase, with the verb being selected as the head/center of the whole
composition. While this may be a promising direction to explore, here I stick with the
more traditional terminology in order to avoid the undesirable assumptions associated
with Merge in e.g. Hornstein’s (2009) view, including its inseparability from
Move and recursion.
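
The decomposition of Merge into Concatenate and Label can be made concrete with a toy sketch. The Python below is purely illustrative and is not part of the analysis in the text: the names Word, Combination, concatenate, label, conjoin, and merge, as well as the category tags "V" and "N", are assumptions introduced here for exposition. Under those assumptions, Conjoin is bare Concatenate, yielding a flat pair with no head, while Merge proper adds Label, which selects one of the two elements as the head.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Word:
    form: str
    category: str  # hypothetical category tag, e.g. "V" or "N"


@dataclass(frozen=True)
class Combination:
    parts: Tuple[Word, Word]     # binary: exactly two elements
    head: Optional[Word] = None  # None means exocentric (no head selected)

    @property
    def label(self) -> Optional[str]:
        # A labeled combination projects the category of its head.
        return None if self.head is None else self.head.category + "P"


def concatenate(a: Word, b: Word) -> Combination:
    """Join two elements into a flat, headless pair."""
    return Combination(parts=(a, b))


def label(combo: Combination, head: Word) -> Combination:
    """Select one of the two combined elements as the head."""
    if head not in combo.parts:
        raise ValueError("the head must be one of the combined elements")
    return Combination(parts=combo.parts, head=head)


def conjoin(a: Word, b: Word) -> Combination:
    """Conjoin (in this sketch): Concatenate only, no Label."""
    return concatenate(a, b)


def merge(a: Word, b: Word, head: Word) -> Combination:
    """Merge proper (in this sketch): Concatenate followed by Label."""
    return label(concatenate(a, b), head)


run = Word("run", "V")
marathons = Word("marathons", "N")

flat = conjoin(run, marathons)        # flat.label is None: no head
vp = merge(run, marathons, head=run)  # vp.label is "VP": the verb is the head
```

The sketch deliberately leaves out Move and recursion, in keeping with the point that these need not come packaged with the basic combinatorial operation.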
The choice of the term Conjoin may seem unfortunate at first sight, given that it can
be confused with coordination structures, which are also referred to as conjunction
structures. Here the term is used in its lay sense of joining together, or uniting. But this
term is often used in linguistic literature not only for structures involving conjunctions
(e.g. and), but also for paratactic structures without any conjunction, especially if these
structures are on equal footing, that is, symmetrical. Are these uses of linguistic terms
confusing? Yes, but there is a good reason why it is not possible to clearly delineate and
differentiate conjunction from adjunction/parataxis. If my proposal is on the right
track, then the initial paratactic Conjoin stage, without any coordinating words,
gradually integrated into the proto-coordination stage, in which proto-conjunctions
or linkers were used, for the sole purpose of solidifying the operation Conjoin, without
much difference in meaning. That is why the terms coordination/conjoin/parataxis are
often used interchangeably in linguistic literature. The overlap in terminology is the
result of the overlap in constructions: there is no clear differentiation among these
processes in present-day languages, as discussed further in Section 4.3. This is as
expected under the evolutionary approach explored here.
Recall from Chapter 2 that the paratactic small clauses discussed above cannot be
manipulated by Move (10), and that they are not subject to embedding/subordination
either (11):6
(10) a. *Where everybody?
b. *To whom him apologize?
c. *What solved?
d. *What ventured, nothing gained?
(11) a. *Him worry [me first].
b. *Sheila happy [problem solved].
If root small clauses found in present-day languages are indeed approximations of
proto-syntactic structures, then this is consistent with my claim that paratactic proto-
syntax was rigid, and that it did not have the operation Move, or the ability to embed
one clause within another. Arguably, both of these processes are facilitated by specific
functional categories, which provide a stronger bond between constituents.

6
The reader is referred to Section 2.4 for some discussion of questions such as Why worry? and How
come?

Clearly, given their behavior illustrated above, root small clauses instantiate a
distinct, simpler grammar, which cannot be reduced to superficial omissions of
functional categories. In any event, this paratactic grammar is exocentric and flat,
rather than hierarchical, and it lacks functional categories such as TP and CP, as well
as Move and subordination/recursion.
The clauses investigated in this chapter, as well as in the previous chapters,
typically consist of two words, a noun-like element and a verb-like element, which
I consider to be in a proto-predication relationship.7
As pointed out in Chapter 3, it is entirely conceivable that the first syntactic
combinations were two-word utterances, that is, that Conjoin could only combine
two elements at a time. In fact, all the evidence from present-day grammars points to
that conclusion. First of all, the central operation of modern grammars, Merge, is
widely considered to be binary, that is, able to combine only two elements at a
time. If Conjoin was a precursor to Merge, then Merge retained this important
property of Conjoin. The same assumption holds for the principle Adjoin, which
I consider to be just one facet of the paratactic principle Conjoin. In e.g. Minimalism,
binary branching (i.e. binary Merge) is considered to be a syntactic universal,
operative in all languages. Also, as discussed in Section 2.5.1, children typically
proceed from a one-word stage to a two-word stage, before they start combining
more than two words into single utterances.
Furthermore, even observationally speaking, the overwhelming majority of conventionalized
compounds across languages are binary, consisting of only two free
morphemes.8 In addition, where clauses are clearly combined paratactically (e.g. 12),
the number of clauses that combine is again typically just two. Combining more than
two clauses in this way becomes cumbersome to process (as discussed in Chapter 3):
(12) Nothing ventured, nothing gained. Easy come, easy go.
This adds plausibility to the argument that the initial clauses were two-word
(intransitive) combinations, and that only two such clauses could combine paratactically
into a conjoined union (see also the examples in 15–19 below).9

7
Hurford (2012) considers that the ﬁrst two-word utterances were of the topic-comment kind, and that
they only later grew into subject-predicate structures (see some discussion in Section 1.6).
8
Here, I am not considering recursive compound processes, such as English (i), but rather compounds
that are likely to be stored in the speakers’ lexicons, and found in the dictionaries, such as (ii).
(i) policy committee proposal discussion
(ii) bedroom, toothbrush, heartbeat
Moreover, in cases where the compound process is not recursive, only two words can combine by default,
and this is the case with e.g. English and Serbian VN exocentric compounds, as well as Serbian
compounds in general, as discussed further in Chapter 6 (see also Section 1.6).
9
Additionally, it is worth pointing out that the structure of ideophones, which can also be considered as
linguistic fossils, is paratactic and binary, suggesting that this kind of “grammar” might be working across

In sum, what I propose in this chapter is that both clausal combinations in (5) and
predicate–argument combinations in (3–4) can be created by the same type of
grammar—paratactic grammar, which is characterized by the operation Conjoin.
This parallelism between clause-internal and clause-external processes ﬁnds further
support in the consideration of the proto-coordination stage, which reveals that the
same proto-coordinator/linker can sometimes be used to connect both (Section 4.3).

4.2.2 Paratactic grammar vs. separate utterances

As pointed out in Section 4.1, each new stage should accrue some advantages over the
previous stage(s) in order to justify its evolutionary usefulness. In this respect,
consider (13) as a report from a business trip, with falling intonation rendering
these two clauses two separate utterances, not connected by Conjoin:
(13) Nothing ventured . . . Nothing gained.
The interpretation in (13) is that nothing was ventured, and that nothing was gained.
Crosslinguistically, falling intonation implies assertion/certainty/completion, while
rising intonation signals uncertainty/incompleteness (e.g. Burling 2005: 170 and
references there). In contrast, in (14), Conjoin combines the two clauses into a single
utterance, using rising intonation as the only glue. This invokes an interpretation that
assigns a (causal/conditional) connection to the utterance.
(14) Nothing ventured, nothing gained.
In the absence of speciﬁc functional glue of the hierarchical stage (e.g. If nothing is
ventured, then nothing is gained), concatenations such as (14) typically rely on
iconicity of word order to express temporal and/or causal relations (see
Section 4.2.3). While structurally neither of the clauses in (14) is embedded within
the other, pragmatically they are interpreted in such a way that the ﬁrst clause serves
as a condition for the second, “main” clause.
Such parallel, symmetric concatenations occur crosslinguistically, and are typically
preserved in formulaic, proverb-like sayings:10

linguistic modules. What is also of interest when it comes to ideophones is that the prototypical examples among
them are often iconic in the sense that they imitate the sounds (tick-tock) or the sights (zig-zag) in nature.
(i) tick-tock; zig-zag; flip-flop; willy-nilly (English)
(ii) tika-taka; cik-cak; trte-mrte (aha, you are scared!); (Serbian)
apa-drapa (unruly, disorderly); kuku-riku (rooster’s call)
(iii) mî mê (mosquitoes buzzing); (Hmong)
plĩ -plǒn (empty bottle submerged in water filling up)
The Hmong data are from Ratliff (2013; see also Ratliff 2010). Some languages, such as Korean, Japanese,
and Hmong, make a much more extensive use of ideophones than e.g. English or Serbian, and the speakers
of these ideophone-rich languages can create such expressives on the spot.
10
Comparable concatenations are quite prevalent in pidgin languages as well (e.g. No money, no come,
Winford 2006).

(15) a. Easy come, easy go.
b. Monkey see, monkey do.
c. Card laid, card played.
d. Come one, come all.
e. Like father, like son.
f. So far, so good.
(16) a. Na psu rana, na psu i zarasla.
on dog wound on dog and healed
‘No big deal!’ (Serbian)
b. Preko preče, naokolo bliže.
across shorter around closer
‘Shortcuts are not always best.’
c. Koliko para, toliko i muzike.
how-much money, that-much and music
‘How much you pay, is how much you can enjoy.’
d. Duga kosa, kratka pamet.
long hair, short intelligence
(17) a. Wo dua, wo twa. (Twi)11
you sow you reap
b. Wo hwehwea, wo hu.
you seek you ﬁnd
(18) a. Bene diagnoscitur, bene curatur. (Latin)
well diagnosed well cared-for
b. Cito maturum, cito putridum.
early ripe, early rotten
c. Qualis rex, talis grex.
like king, like people
d. Ubi fumus, ibi ignis.
‘Where there is smoke, there is ﬁre.’
(19) a. ua noj ua haus (Hmong)12
make eat make drink
‘to earn a living’

11
Twi is spoken in Ghana, and the examples were kindly provided by Kingsley Okai (p.c. 2011).
12
Hmong is spoken in southern China and northern Southeast Asia, and the data are taken from
Mottin (1978) and Johns and Strecker (1982). Thanks to Martha Ratliff (p.c. 2013) for leading me to the
Hmong data, as well as for providing the background for understanding them. Hmong has thousands of

b. kav teb kav chaw
rule land rule place
‘to rule a county’
c. cua daj cua dub
wind yellow wind black
‘a storm’
d. ua tsov ua rog
make tiger make war
‘make war’
e. kev tshaib kev nqhis
way hunger way thirst
‘famine’
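
As an illustrative aside, the AB AC shape shared by the Hmong expressions in (19), and by several of the English sayings in (15), can be checked mechanically. The sketch below is a toy under stated assumptions: the function names normalize and is_ab_ac are hypothetical, words are assumed to be separated by spaces, and the check looks only at surface word forms. It is not the procedure used in any corpus study cited in this chapter.

```python
import string
from typing import List


def normalize(expression: str) -> List[str]:
    """Lowercase the expression and strip punctuation from each word."""
    words = (w.strip(string.punctuation).lower() for w in expression.split())
    return [w for w in words if w]


def is_ab_ac(expression: str) -> bool:
    """Return True if the expression has exactly four words of the shape A B A C."""
    words = normalize(expression)
    if len(words) != 4:
        return False
    a1, b, a2, c = words
    # The repeated element A frames two distinct elements B and C.
    return a1 == a2 and b != c and a1 not in (b, c)


hmong = [
    "ua noj ua haus",
    "kav teb kav chaw",
    "cua daj cua dub",
    "ua tsov ua rog",
    "kev tshaib kev nqhis",
]
english = ["Easy come, easy go.", "Like father, like son."]

# Collect every expression that fits the AB AC template.
matches = [e for e in hmong + english if is_ab_ac(e)]
```

A surface check of this kind says nothing about interpretation; it only picks out the binary, parallel form that these expressions have in common.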
It is clear how paratactic Conjoin brings about an advantage over no combinatorial
capabilities: the emergence of the paratactic syntactic stage (e.g. 14) does not eradicate
the possibility in (13) to use the clauses as separate utterances. Rather, with two
possibilities now available, to Conjoin and not to Conjoin, we can now more easily
distinguish between two unrelated propositions (13), and utterances that introduce
two propositions in a causal/temporal relationship (14). If so, then Conjoin affords a
concrete expressive advantage which could have been targeted by natural selection.
Still, without specialized functional elements, this paratactic grammar cannot be fully
explicit about the nature of the relations between clauses. Instead, such relations
seem to be inferred iconically. Iconicity here is reﬂected in the requirement to express
the condition/cause before the outcome, as will be discussed further below. Deutscher
(2000) argues that the development of finite subordination had an adaptive advantage
in that it broke away from such iconicity.
A comparable kind of advantage is brought about clause-internally. While one can
express a variety of thoughts with one-word utterances (20), bringing two words
together by Conjoin into a single utterance (21) begins to create tighter connections,
paving the way toward (proto-)predication.
(20) Fall. Snow.

such creative binary paratactic creations. Even though the Hmong examples seem to create complex
vocabulary items as opposed to conditionals, their structure is parallel to the examples from English and
Twi, in that they are of the AB AC form. Some of these are frozen expressions (the one for “storm,” for
example) and are passed down from generation to generation, but good speakers will make up new ones
that are easily interpretable. Just 60–70 years ago Hmong was a totally unwritten language, so for millennia
language skill equaled oral skill, and making up new, good ones was highly valued (Martha Ratliff, p.c. 2013).
That Hmong speakers use these AB AC structures productively is shown by Mortensen (2014), who
considers a 17 million-word corpus based on Hmong text from the soc.culture.hmong Usenet group. The
corpus yielded 16,106 valid tokens and 3,253 types of AB AC expressions.

(21) Fall snow.

While the two separate utterances in (20) can express that there is somebody or
something that fell, perhaps because of the snow, or perhaps the snow fell, the small
clause in (21), created by Conjoin, is more likely to be interpreted as snow being
directly involved in the act of falling. Here, again, Conjoin provides an additional
expressive capability, which contrasts with the more vague possibility in (20), which
still remains available.
However, as pointed out by a reviewer, some syntacticians claim that words cannot
exist in isolation, as words have valence, that is, they are taken from the lexicon
(mental dictionary) with various grammatical features which dictate that they combine
with certain other words. In this particular case, the idea would be that the word
fall must combine with a subject (e.g. snow), as well as with Tense, in order to express
an assertion. In this view, neither (20) nor (21) can make assertions. For example,
Piattelli-Palmarini (2010: 160) considers that words themselves are syntactic entities
and that it is “illusory” to think that as such words can exist outside of full-blown
syntax, or that any protolanguage can be reconstructed in which words are used, but
not full-blown syntax.
On the other hand, Bickerton (2014: 89) points out that these early words could
have been different from modern words; perhaps they were “mere lexical shells,”
without grammatical features. It is also important to point out that even present-
day adult speakers often use one-word utterances (e.g. Me! Fire!), which need not
be analyzed as elliptical, that is, as derived from full sentences by ellipsis (see
Progovac 2013a for an overview of this issue and for references). It is well-known
that words sometimes have a special form when used in isolation, and such forms
have been analyzed as default forms (see e.g. Schütze 2001; Section 2.2), that is, as
forms with unspeciﬁed grammatical features. These default forms are used
exactly in those situations in which syntactic rules cannot reach them, including
in isolation. In other words, Piattelli-Palmarini’s claim is not necessary even
for present-day languages, as even present-day languages allow words to be
used in isolation. This leaves the door wide open for a one-word stage in language
evolution.
The usefulness of paratactic grammars is also very obvious with exocentric VN
compounds. Progovac (2009a, 2012) and Progovac and Locke (2009) argue that the
ability to use the kind of paratactic grammar characterizing exocentric compounds
may have been sexually selected, and that some modern languages may still preserve
evidence of such selection. As discussed in much detail in Chapter 6, these compounds
seem to be the only well-defined morpho-syntactic construction that specializes
for derogatory reference/insult. It is also the construction that most neatly
falls under the umbrella of a paratactic, exocentric, intransitive, absolutive-like
grammar (see Chapter 3 and Chapter 6).
(22) scare-crow, kill-joy, pick-pocket, cry-baby, spoil-sport, turn-coat,
rattle-snake, hunch-back, dare-devil, wag-tail, tattle-tale,
Drynk-pany (miser), Pinch-penny (miser), busy-body
(23) ispi-čutura (drink.up-flask—drunkard), guli-koža (peel-skin—who
rips you off), cepi-dlaka (split-hair—who splits hairs), muti-voda
(muddy-water—trouble-maker), jebi-vetar (screw-wind—
charlatan), vrti-guz (spin-butt—fidget), tuži-baba (whine-
old.woman; tattletale) (Serbian)
The argument is that one can create much more colorful and creative (ritual) insults
with two-word concatenations than one can ever do with just single words. The
reader can try to find one-word (non-compound) equivalents to the concepts
expressed with the metaphorical compounds above. The chances are that either the
alternatives of that kind do not exist, or if they do, that they are too dry or abstract,
and not likely to have existed in the initial stages of language. On the other hand, one
should notice that the pieces of exocentric compounds (e.g. wind, cry, rattle, wag,
peel) tend to be rather concrete concepts, much more likely to have existed at the
early stages of language evolution. It is fascinating to observe how the simplest
combinations of these most basic pieces are able to yield truly complex and abstract
concepts, which can serve amazingly well as insults. But such combinations can also
be useful for providing more precise descriptions of animals, as is the case with
e.g. rattlesnake. Paratactic grammar in this particular domain would have thus
constituted a true breakthrough in human expressive capabilities, clearly something
that could have, and would have, been subject to selection, as discussed in Chapter 6,
as well as in Chapter 7.

4.2.3 Absolutes and correlatives: More on Conjoin

As established in the previous section, when two small clauses combine paratactically,
they appear to be on an “equal footing” with respect to each other as far as syntax
is concerned, and their relationship is then interpreted as one of temporal ordering
and/or causation, expressed iconically by the relative ordering of the two clauses.
On the other hand, when a bare small clause attaches paratactically to a finite
sentence/TP, in an unequal act of union (24 below), such a small clause is perceived
as an adverbial/adjunct, which again usually receives temporal/causal, or some other
modifier-like interpretation. In his detailed consideration of absolute constructions
(not to be confused with absolutives), such as the final clause, her lucid eyes sparkling, in (24), which
can also be characterized as involving operation Conjoin, Stump (1985: 302) concludes
that the logical relation between the absolute and its superordinate clause is
often determined inferentially. He defines “inference” as “anything which is not part
of the literal meaning of some expression but which language users judge to be part of
the intended meaning of that expression” (304).
(24) She clapped her hands like a child, her lucid eyes sparkling.
(Stump 1985: 332)
Jackendoff (2002) also considers similar small clause attachments in (25) and (26),
suggesting a possible pre-TP stage in the evolution of human language:
(25) [Us having left], he reverted to his old ways.
(26) [Him having gone to Rome], I can now focus on my work.
As opposed to the symmetric parataxis illustrated in (15–19), the interpretation in this
case is no longer determined by the relative ordering of the two clauses, but is at least
partly determined by their unequal grammatical status, again iconically: the finite
clause serves as the main clause because it is grammatically the fuller one, and the
small clause just provides some temporal and/or causal modification of the main
clause. Even if the ordering is reversed in (27–28), the interpretation remains the
same. This is in contrast to symmetric clause combinations, which are directly
affected by such reversals of order (29):
(27) He reverted to his old ways, [us having left].
(28) I can now focus on my work, [him having gone to Rome].
(29) ?Nothing gained, nothing ventured.
?Easy go, easy come.
?Monkey do, monkey see.
?Come all, come one.
The analysis of (15–19) as simple concatenation/parataxis may be called into question
by some recent analyses of correlative constructions of the type illustrated in (30)
below:
(30) The more you read, the less you understand.
Culicover and Jackendoff (2005: 508) argue that such correlative constructions
involve a paratactic (quasi-coordinate) syntax with conditional semantics. However,
den Dikken (2005: 503) responds that their approach “condone(s) a mismatch
between syntax and semantics” and proposes a syntactically more complex derivation
(see also Smith 2010 and Citko 2011 for an overview of various approaches). The
conditional semantics, however, does not follow even from den Dikken’s treatment
of correlatives, as he himself acknowledges. But, at any rate, this same friction
between syntax and semantics seems to carry over to my examples in (15–19).
First of all, at least in the case of examples such as (15–19), one is not dealing with a
mismatch, but rather with underspecification/vagueness, just as one is not dealing
with a mismatch in the case of absolute constructions. The paratactic attachment
only signals that there is a relationship between the events in the two clauses, but it
does not specify the nature of that relationship. According to Culicover and
Jackendoff (2005: 528), parataxis is “grammatically the most primitive way to combine
linguistic elements, one that leaves the semantic relations among the elements to
be determined by their inherent semantic possibilities or by pragmatic considerations.”
As pointed out above, concatenations such as (15–19) typically rely on
iconicity of word order to express temporal and/or causal relations, rather than on
any syntactic devices (see also Stump 1985: 307; Deutscher 2000).
This is also the case with constructions akin to serial verb constructions, as
discussed in Chapter 3, Section 3.4.1. One example is the concatenation of two
intransitive clauses (e.g. WOMAN PUSH, MAN FALL), meant to express a transitive
event in Nicaraguan Sign Language. Comparable to the examples in (15–19), such
combinations are also interpreted iconically, in the sense that the ﬁrst clause acts as
the cause for the second (and the order is not reversible). If transitive constructions
ultimately derive from such paratactic sequences, then this would explain the overwhelming
tendency in world languages for agents (which are typically causers) to
precede patients/objects in transitive constructions. Perhaps this could even obviate
the need for a separate Agent-First principle, as discussed in Section 1.6.
Furthermore, the correlative structures in (30) are clearly more complex than the
paratactic attachment of small clauses in e.g. (15), both clause-internally and clause-
externally (see especially Smith 2010). Internally, both clauses in (30) are ﬁnite,
showing tense and agreement, as well as a left-peripheral position before the subject,
which may implicate Move, or at least a(n additional) functional projection above
TP. In contrast, the small clauses in (15) are just that: small clauses which show no
tense, no agreement, and no Move. Externally, each of the small clauses in (15) can be
a root construct on its own, not requiring another clause to complete it (e.g. Nothing
ventured!). This is not the case with the correlative constructions in (30), whose
individual clauses are clearly dependent (*The more you read), possibly suggesting
some additional external mechanism of clause cohesion, not available in (15).
This is not to deny the obvious similarities between the constructions in (15) and
the correlatives in (30). The correlatives in (30) may represent modern complications
of ancient correlatives, the latter more closely approximated by the examples in (15),
but the examples in (30) still showing some elements of proto-syntax. Citko (2011)
also concludes that correlative structures are somewhere between parataxis and
hypotaxis. Notice that such clauses still depend iconically on the relative ordering:
(31) The less you understand, the more you read.
(31) is interpreted very differently from (30). This is in contrast to clearly subordinated
structures (32) below, which do not depend on relative ordering:
(32) If you read more, you understand less.
You understand less if you read more.
The issue of vagueness and underspecification deserves special attention in an
evolutionary framework (see also Chapter 3). If language evolved gradually, then it
is to be expected that not all the grammatical tools that we use today to express
various relations with some precision were available in the previous stages of
grammar. This should not have prevented our ancestors from speaking in however
imprecise and vague ways. It is also important to keep in mind that, however precise
we may believe that our language is today, it is still extremely underspeciﬁed with
respect to so many distinctions that could in principle be made, and which are made
in some languages, but not in others (see e.g. Gil 2014).
The ever-increasing precision in what we can express with language, and the
increasing match between syntax and semantics, may have marked one of the
directions in which language evolved. But there is no reason to believe that a perfect
syntax–semantics match will ever be achieved (see e.g. Francis and Michaelis 2003,
which focuses on various incongruities of this kind), or that it is even desirable to
achieve (see e.g. Bouchard 2013). As pointed out throughout the book, even when
more precise means are available in languages, speakers often resort to simpler,
vaguer expressions, such as middles in Serbian. In languages like English, which
does not have comparable middles, one often uses passive forms, whose end result is
the suppression of the agent, that is, less precision in characterizing the argument
structure.
In conclusion, postulating a paratactic stage in the evolution of syntax is supported
by the “living fossils” of this stage found in abundance in modern languages (e.g. root
small clauses, their paratactic combinations, and exocentric compounds). In addition,
Section 4.5 identifies corroborating evidence from ancient languages, grammaticalization
processes, language acquisition, comparative animal studies, and
neurological studies, including the processing of intonation and prosody. It is of
note that paratactic grammars cannot be manipulated by Move. This feeds into my
proposal in Chapter 5 on Subjacency that Move is a later evolutionary innovation,
which arguably emerged together with hierarchy (also Section 4.4.5). The following
subsections, 4.3 and 4.4, consider how paratactic constructions provide a foundation,
a scaffolding, for the coordination and subordination structures.

4.3 The proto-coordination stage

As pointed out in the previous section, paratactic combinations rely solely on
prosodic, supra-segmental information to provide evidence of proto-Merge (Con-
join). If the advent of proto-Merge, that is, the beginning of proto-syntax, was a
particularly advantageous development in the evolution of human language, then
one can see how providing more robust and unambiguous evidence of such an
operation would have been beneﬁcial. The proposal here is that proto-conjunctions,
the all-purpose linking categories, evolved as a result of the pressures to consolidate
proto-Merge/Conjoin. Such proto-conjunctions/linkers added all-purpose segmental
glue to the already available prosodic glue characteristic of the paratactic stage,
providing now two indicators of proto-Merge, both segmental and prosodic. Two
mechanisms will necessarily yield more robust evidence for Merge than just one of
them alone (it is also possible that segmental glue (conjunction/linker) by itself
provides a more salient cohesive mechanism than prosody). It may have been only
later that such proto-coordinators/linkers differentiated into speciﬁc functional
categories, such as aspect markers, tense markers, or complementizers, as discussed
in Section 4.4.
There may have been other advantages to the emergence of (proto-)conjunctions,
such as the ability to use different types of conjunctions, not just the neutral connective
and. As pointed out in Payne (1985: 9) and references cited there, in languages such as
Vietnamese and Japanese, a coordinator is used for the adversative conjunction
comparable to English but, even though in non-adversatives the strategy involves
simple juxtaposition (parataxis) of the “conjuncts,” with no intervening conjunction.¹³
This highlights the continuity/ﬂuidity between parataxis and coordination, as already
discussed in Section 4.2.¹⁴
According to Payne, the paratactic strategy, where the “conjuncts” are simply
juxtaposed, with no additional markers of conjunction, is probably available to all
languages. This would be expected under the evolutionary scenario explored in this
book, according to which parataxis preceded, and provided the necessary scaffolding
for, both coordination and subordination structures. In many languages, such as for
example Turkish, parataxis is a normal alternative to coordination, existing side by
side with other strategies. The classical languages, including Sanskrit and Latin, also
widely use the juxtaposition (parataxis) strategy at the expense of coordination
(Payne 1985: 25). The two strategies are obviously in competition, and are not at all
clearly demarcated, exactly the kind of overlap expected under the evolutionary
scenario outlined in this book.
Just a cursory look at some very common data in English can illustrate the
ambivalence and overlap between the two processes:
(33) a. The tall, elegant lady carried a heavy suitcase.
b. The tall and elegant lady carried a heavy suitcase.

¹³ One also ﬁnds combinations of both the neutral conjunction (and) and an adversative conjunction in
e.g. English and yet and Standard Arabic wa lakin “and but,” as noted in Payne (1985: 15), suggesting that
the neutral coordinator can serve as a mere connector/linker, without a speciﬁed meaning.
¹⁴ There are also numerous examples across languages which seem to straddle the boundary between
parataxis and subordination/complementation. One example is serialization/complementation in Hmong
(Martha Ratliff, p.c. 2013). According to Jarkey (2006: 129), complementation in Hmong involves a serial-like
construction, “a step along a continuum between serialization and complementation in terms of the
closeness of the juncture.” Serial verbs are discussed in more detail in Section 3.4.1.

(34) a. He read the book quickly.
b. He read the book, and quickly.
(35) a. She visited many cities, including Prague, Paris, Rome,
Trieste, Vienna.
b. She visited many cities, including Prague, Paris, Rome,
Trieste, and Vienna.
(36) a. She is tall, elegant, and ambitious.
b. She is tall and elegant and ambitious.
Such examples show that there are contexts in which one can either use a conjunction
or not, in constructions that are typically characterized as coordination (see Progovac
2003 and references there for an extensive discussion of the phenomenon). Payne and
others refer to this as a paratactic strategy for coordination, again suggesting a lack of
clear differentiation between the two. Similar ambivalence is encountered when one
tries to distinguish between coordination and subordination in some cases, as
pointed out in the next section. This would be surprising if syntax were a perfect
and optimal system, engineered from scratch, with adjunction, coordination, and
subordination each having their own speciﬁc functions. On the other hand, this
ambivalence and overlap are exactly what one expects from evolutionary tinkering, if
parataxis gradually integrated into coordination, and coordination and parataxis
gradually integrated into subordination.
Clausal conjuncts (e.g. John is here, and Mary is gone), just like adjuncts (e.g. John
is here because Mary is gone), have been repeatedly noted in current syntactic
literature not to be fully integrated into syntactic fabric (Selkirk 1978; Stowell 1981;
Nespor and Vogel 1986; Zec and Inkelas 1990). This is consistent with them sitting in
semi-integrated, “non-canonical,” syntactic positions, as put in An (2007). Next,
conjuncts have been analyzed in syntax as sitting on parallel planes, that is, in a
different dimension (e.g. Goodall 1987), even though this analysis is not widely
accepted (see an overview of these issues in Progovac 2003; see also Crysmann
2006 and Citko 2011).
Moreover, c-command, the central postulate of syntax, does not seem to extend
into conjuncts or adjuncts in all cases (see Progovac 2003). As discussed in
Section 4.4.5, Move targets a hierarchically higher (c-commanding) position,
so that the Moved element can structurally command/identify its trace/gap.
C-command also regulates other structural relationships, such as the one between
noun phrases and co-referring pronouns (Footnote 15), and negation and
co-dependent negative polarity items (i.e. items that must be licensed by negation,
e.g. ever). It is thus of interest that c-command does not extend seamlessly into
conjuncts and adjuncts, suggesting that conjuncts and adjuncts are not fully inte-
grated into the layers of syntax. To take one example, while it is possible to license the
negative polarity item ever in an embedded subordinate clause (37), it is not possible
to do so in a conjunct clause (38) or an adjunct clause (39):¹⁵
(37) Mary did not say [that she ever met Peter]. Subordination
(38) *Mary did not say it, [but she ever met Peter]. Coordination
(cf. Mary did not say it but she never met Peter.)
(39) *Mary did not say it, [after she ever met Peter]. Adjunction
It is of note that Bruening (2014) has proposed that the principle of precedence is
relevant even for sentential grammars (see Footnote 15), and that instead of a purely
syntactic principle of c-command, one needs a conjunction of two principles:
Precede and Command. While he treats Precede as a syntactic principle, the fact
that it extends across sentence boundaries (Footnote 15) suggests that this principle
has a pragmatic source. Could it be that an ancient, pragmatic principle of prece-
dence got grammaticalized into a structural relation of c-command, whose effects are
fully observable only in the hierarchical, subordination stage? Interestingly, Bruening
proposes that this decomposition of c-command allows one to treat coordination
structures as symmetrical (352, fn. 7). According to him, what gives an effect of
asymmetry in coordinated structures is the precedence, rather than hierarchical
asymmetry of conjuncts.
In addition, several theoretical accounts invoke adjunction as an integral part of
the analysis of coordination, and/or liken it to subordination (e.g. Munn 1993;
Johannessen 1993; Kayne 1994).¹⁶ These analyses are technical, and introducing them
here would take us too far aﬁeld, but the reader is referred to Progovac (2003) for
a lengthy overview of various analyses of coordination. Sufﬁce it to say here that
theoretical analyses of coordination are not able to draw clear distinctions among
the three categories under discussion: adjunction, coordination, subordination. In

¹⁵ In a similar fashion, Principle C effects, clearly visible with subordination (i), do not seem to extend
into conjuncts (ii): while she and Mary cannot co-refer in (i), such co-reference is possible in (ii). The
judgment is less clear with an adjoined clause in (iii). To me, as well as a native speaker I consulted, it seems
that (iii) is slightly better than (i).
(i) *Sheᵢ never mentioned [that Maryᵢ is a bartender]. Subordination
(ii) Sheᵢ never mentioned it, [but Maryᵢ is a bartender]. Coordination
(iii) ?*Sheᵢ never mentioned it, [after Maryᵢ became a bartender]. Adjunction
To complicate matters further, some Principle C effects seem to overlap with the effects of the pragmatic
precedence principle, which operates across independent sentences (iv), and can thus not be reduced to
c-command, which is a sentence-internal principle (see Progovac 2003 for discussion):
(iv) ?*Heᵢ ﬁnally arrived. Johnᵢ’s cousin accompanied him.
Given this, it is not clear if it is syntactic c-command or precedence that excludes co-reference in
e.g. (i). Clearly, this issue deserves further investigation. It may be that the grammaticality status of the
examples introduced above reﬂects a curious interplay of more than one factor, including syntactic
command and pragmatic precedence, whose domains seem to partly overlap.
¹⁶ See also Schwartz (1989a,b) for the comitative/asymmetric conjuncts.

several other respects as well, the conjunction is a category unlike other functional
categories, straddling the boundary between adjunction and subordination. Considerations
like these give credence to the gradualist evolutionary approach, for they
provide evidence of continuity and overlap among adjunction/parataxis, coordin-
ation, and subordination.
Section 4.2 established that the operation Conjoin, which creates paratactic/exo-
centric structures, applies both clause-internally and between clauses. I would like to
extend this same idea to coordination, and tentatively suggest that even predication
may have, at least in some circumstances, passed through a proto-coordination
(linker) stage in the evolution of human language. The fossils of such processes are
not as easy to find as they are for clausal combinations, but there are some construc-
tions that can be considered as such fossils. For example, German incredulity root
small clauses take an optional conjunction (see Potts and Roeper 2006; also Progovac
2006, 2009b):
(40) Ich (und) Angst haben? (German)
I (and) fear have.INF
‘Me afraid?!’
The German small clause above seems to preserve both the paratactic option
(without a coordinator) and the coordination option, the latter just adding a mean-
ingless coordinator/linker to solidify the connection between the subject and the
predicate. In a similar fashion, Akkadian, a Semitic language spoken between c. 2,500
and 500 BC, used the coordinative particle –ma in predicative functions (41), as
reported in Deutscher (2000: 33f.). The absence of a verbal copula suggests the use of
root small clauses:
(41) napišti māt-im eql-um-ma
soul.of land.gen field.nom.conj
‘The soul of the land is the field.’
In addition, Bowers (1993) analyzes English as as a realization of the head of
Predication Phrase, whose purpose again is merely to link the subject with the
predicate:
(42) She regards [sc Mary as a fool/crazy.]
Of note here is that English as (as well as Akkadian –ma) can serve as glue both for
predication (intraclausally, as in (41, 42)) and for connecting clauses (extraclausally, as in
English (43)):¹⁷

¹⁷ It is also reported in Mous (2004: 121) that Alagwa (Cushitic language spoken in Tanzania) uses the
same morpheme for both conjunctive functions and as a copula. See also Newman and Newman (1977:
21–2) for the dà in Hausa, a Chadic language spoken in Africa; also Gil (2004) for various functions of sama
in Riau Indonesian.

(43) a. Peter will be late, as will John.
b. The door opened as she was approaching.
Note also that as is used to solidify/cement predication only in small clauses, where,
arguably, there are no specified functional projections, such as TP, which can serve
the purpose. It is also worth pointing out that as in (43) seems to straddle the
boundary between a coordinator and subordinator.
In descriptions of various linguistic phenomena one finds reference to “linking”
words or morphemes, which do not seem to have any specific meaning. To take just
one recent example, den Dikken (2006: 245) devotes his whole book Relators and
Linkers to such “meaningless elements (meaningless in the sense of having no
semantic load) that play an essential role in the establishment and syntactic manipu-
lation of predication relationships.” As he puts it, “relators and linkers are the vital
syntactic cement of predication relationships” (249). He goes on to say that all
subject-predicate relationships are mediated by such relators, whether overt or
covert.
Little words like as in English can also be seen as a kind of linker, whether it is used
to link a subject and its predicate (42), or to link two clauses (43). In its former
function it can be likened to a copula, while in the latter function it can be likened to a
conjunction, or subordinator, but the claim here is that this is a reflex of a pattern
in which such distinctions were not made.¹⁸ Moreover, as argued in e.g. Vossen
(2010: 47), there is a “linker” a in Central Khoisan, Kalahari branch of Khoe, spoken
in southern Africa, which, at least at the synchronic level “has no recognizable
meaning nor does it reveal a definite grammatical function.”¹⁹ This is so because it
is found “linking” various kinds of grammatical elements (all-purpose linker),
including the verbal base to the following tense-aspect marker; or a verb and the
dative argument (Daniel Ross, p.c. 2013).
(44) a. x’ũú-wá-hã
kill-LINKER-PRETERITE
b. gòm-á-mà
smoke-LINKER-DATIVE
‘to smoke for’
Heine (1986) has argued that the linker (or juncture, as it is sometimes referred to) is
a grammaticalized copula, which still exists as such in most Kalahari Khoe languages.

¹⁸ Also, the copular verb be appearing between the subject and a non-verbal predicate, as in English
John is happy, is traditionally referred to as a “linking verb.” This kind of verbal linker is typically absent
from small clauses (e.g. I consider John happy), and is also not used in all languages, or with all tenses in a
given language, as discussed in Chapter 7.
¹⁹ Another example of a linker is the (in)famous particle de in Chinese, as described in e.g. Cheng
(1986).

At the same time, Elderkin (1986) has argued that the linker derives historically from
a conjunction. Even though these two proposals seemingly compete with each other,
my approach suggests that they can both be correct: the linker was at one point the
proto-coordinator, used both to connect the subject and the predicate (copular use),
and to connect other constituents.
According to Schneider-Zioga (2013), the Bantu language Kinande, spoken in
Congo, has a linker which occurs between internal arguments (e.g. direct and indirect
objects), and sometimes between adjuncts or between an internal argument and an
adjunct. She shows that the function of this linker cannot be reduced either to a case
marker or to the distinctness condition, but rather remains just a copula, or rather a
linker, as copula is the term usually reserved for the linker between the subjects and
predicates. What these data show is that other constituents in a sentence, including
objects and adjuncts, can also be linked to the rest of the sentence via linkers/
specialized copulas.
The proposal of this chapter is that those kinds of all-purpose proto-linkers can
grammaticalize into more specialized functions, and moreover different functions in
different languages. The most prototypical of these linkers is arguably the conjunc-
tion such as and in English, which is still characterized by a significant amount of
promiscuity and possibly by the lack of meaning above and beyond mere linking. If
one can now imagine a grammar where this kind of linker is even more promiscuous
and devoid of meaning, and which can link any two constituents, including the
subject and the predicate, then this would be the all-purpose proto-conjunction
hypothesized in this chapter. The main reason to refer to these linkers as proto-
conjunctions is that, in modern languages, conjunctions are used in more linking
functions than other (functional) words, approximating the proto-conjunction
markers better than other words.
Likewise, one finds a variety of so-called linking morphemes in compounds across
languages (e.g. linking “o” in speed-o-meter; Graec-o-Roman; palat-o-alveolar
in English; kiš-o-bran (rain-o-guard, umbrella); kamen-o-rezac (stone-o-carver) in
Serbian; linking “s” in Germanic compounds, e.g. tabak-s-rook (tobacco smoke) in
Dutch; Himmel-s-tor (heaven’s door) in German; hunt-s-man in English). Linking “o”
is very common across Slavic, as well as in Greek. It is also found in Romance
languages. All these constructions may be frozen somewhere at an intermediate
proto-coordination stage, some place between parataxis and a specialized functional
category stage. Recall that the VN compounds such as pick-pocket, discussed in
Section 4.2, have no linking morphemes, and are characterized as paratactic, which
in my analysis renders them simpler, more primary than the compounds involving
linkers.
In sum, this section hypothesizes that (proto-)conjunctions/linkers may have been
among the first functional categories to emerge, for the primary function of solidi-
fying Merge, that is, of providing more robust evidence of Merge than just prosody
(supra-segmentals) can do. Finally, if the emergence of parataxis already proved
advantageous to our ancestors as they advanced to the first syntactic stage, then
providing more robust and unmistakable evidence of such (proto-)syntax, by intro-
ducing linkers (coordination stage), would have constituted a clear and concrete
benefit, which could have been subject to natural selection.
Section 4.5 considers some corroborating evidence for a proto-coordination stage,
mostly based on language acquisition. Evidence for a proto-coordination stage at the
level of predication is not as robust as evidence for a paratactic stage. It may be that
the proto-coordination stage at the predication level was a brief, fleeting stage, which
quickly led to the appearance of specific functional categories. It is also probable that
this stage was optional, that is, that not all languages/constructions needed to proceed
through this stage, as it should be possible to evolve a specific functional category
without first evolving a conjunction-type linker (see Section 4.5). In contrast, the
paratactic stage is the foundational stage, upon which everything rests, and the
evidence for parataxis providing a scaffolding for other structures is overwhelming.

4.4 The speciﬁc functional category stage

4.4.1 From linkers to speciﬁc functional categories
Finally, certain categories, including linkers/conjunctions of the proto-coordination
stage, would have grammaticalized into speciﬁc functional categories, such as predi-
cation head or tense or subordinator/complementizer—another syntactic break-
through and the beginning of modern, hierarchical syntax, which can now not
only use functional words as glue to connect words/phrases/clauses, but which can
also use them to build specialized, hierarchical functional projections, such as TP and
CP. A modern functional category such as a copular verb (head of TP), or a
complementizer (head of CP), can be seen as providing not only segmental evidence
of Merge (intraclausally with TPs and extraclausally with CPs), but also, simultaneously,
an expanded structural space, which can now accommodate Move.
These specialized functional projections not only provide landing sites for Move, but
also motivation for Move. Recall from Chapters 2 and 3 that the subject of the clause
Moves out of the SC layer only after the TP layer is projected on top of the SC. In other
words, the SC layer by itself does not exhibit the Move of the subject. In this case at
least, syntactic theory considers that Move is driven by the need of the higher layer, TP,
to have its own subject. In some very abstract sense, then, Move serves to connect the
layers of structure, and such layers only become available in the hierarchical stage.
Another example would be the Move of the verb (V) into the light verb (v) position,
and in some languages and some circumstances the verb is taken to Move even higher,
to T or to C (see Section 4.4.5 for more discussion). Likewise, the CP position is
typically the target of wh-movement, as discussed at length in Chapter 5.
For concreteness, suppose that a linker comparable to as was used as all-purpose glue
in the proto-coordination stage, both to connect words into clauses, and to connect
clauses, as the following repeated examples from the previous section illustrate.
(45) She regards [sc Mary as a fool/crazy.]
(46) Peter will be late, as will John.
(47) As she was approaching, the door opened.
In (45) as acts as a linker between a subject and its predicate; in (46) as is akin to a
conjunction, connecting two clauses; and in (47) as is more like a subordinator/
complementizer, even though the latter two functions are not clearly distinguished.²⁰
Now, the function (as well as the phonological shape) of this linker could have
diverged in these different positions to specialize either for predication/tense/aspect
marking, or for clause cohesion, the developments which would now signify the
beginning of the subordination stage, i.e. speciﬁc functional category stage. In this
stage, the linkers are not only there to provide segmental glue, but also to illuminate
the nature of the link (e.g. predication vs. clause combination), and to provide more
speciﬁc information about the link, such as information about present vs. past tense
in the case of predication, or causation vs. temporal event ordering in the case of
clause combination.
To appreciate the three stages, also reported in the processes of grammaticalization
of subordination in e.g. Traugott and Heine (1991) and Deutscher (2000), consider
the following examples which seem to range from least syntactically integrated (48,
parataxis), to most integrated (50, subordination), with coordination (49) straddling
the boundary between the two:²¹
(48) Marc is a linguist—(as) you know. Parataxis
(49) Marc is a linguist, and you know it. Coordination
(50) You know that Marc is a linguist. Subordination

²⁰ Potts (2002) analyzes as-clauses such as (i) below as syntactically quite complex, involving movement
and CP integration. In fact, he treats as in such clauses as a preposition, which selects a CP.
(i) As the FBI eventually discovered, Ames was a spy.
In contrast, others have analyzed parentheticals in general as involving a loose concatenation of two
independent sentences, which is how parataxis is often understood (e.g. Emonds 1976: 52–3; Haegeman,
Shaer, and Frey 2009). Asher (2000) also discusses as-parentheticals in this light. Resolving this issue is
beyond the scope of this book. The purpose here is only to illustrate how certain words in today’s languages
might approximate multi-functional linkers, rather than to provide an in-depth analysis of English as.
²¹ The following example may also be seen as involving parataxis, but in a clause-internal position:
(i) Marc, (as) you know, is a linguist.
Notice that as in (48) is itself an intermediate category bridging the gap between true parataxis (without as)
and true coordination (49).

In these combinations of only two clauses, the three different syntactic strategies
for clause union result in roughly the same interpretations. This is not surprising
under the evolutionary “tinkering” scenario explored in this monograph, in which
conjunctions/linkers emerge just to provide additional (segmental) evidence of
union, on top of the prosodic evidence which already characterizes parataxis.
The speciﬁed functional categories then arise, characterized by a more speciﬁc
meaning/function.
Such tinkering would have left us with multiple possibilities which partly overlap in
function, that is, with redundant means for expressing similar meanings (48–50). One
wonders, then, what concrete communicative advantages might have been gained by
the subordination stage (50) over the two previous stages (48, 49). The following
sections discuss this issue in relation to CP recursion (4.4.2) and DP recursion
(4.4.3), to shed light on the question of what it takes to realize recursion in syntax.

4.4.2 CP and recursion

Recall from Chapter 1 that I adopt the traditional characterization of recursion, also
adopted in Kinsella (2009), according to which recursion is deﬁned as the embedding
of a constituent of a certain syntactic category (e.g. a clause/CP) within another
constituent of the same category (another clause/CP). Traditionally, this operation is
taken to automatically apply in an unlimited fashion, given that one embedded CP
can always feature another embedded CP inside it, and so on, as in (53) below. This
was the traditional way to “prove” the existence of inﬁnite recursion. In other words,
recursion in this characterization has two components to it: the same category
embedding, and the unlimited reapplication of this kind of embedding. This char-
acterization coincides with what Heine and Kuteva (1987) call productive recursion,
as will be discussed in Section 4.4.3.
As it turns out, in addition to facilitating Move, including Move across clause
boundaries (see Chapter 5; Section 4.4.5), the subordination stage also provides a
recursive mechanism for embedding multiple viewpoints within one another, unavail-
able with either coordination or parataxis/adjunction, privileging in this respect (53)
over (51–52).
(51) a. ?Marc is a linguist—[you know,] [Mary knows].
b. Marc is a linguist—[you know it,] [Mary knows it].
(52) Marc is a linguist, [and you know it,] [and Mary knows it].
(53) Mary knows [that you know [that Marc is a linguist]].
Only (53) allows one to report on one person’s knowledge about another person’s
knowledge, unambiguously and recursively.
As the bracketing notation indicates, while in (53) each embedded clause is an
integral part (complement/object) of the higher clause, showing subordination
(hypotaxis), in (51) and (52) the clauses are strung next to each other (parataxis).
Kinsella (2009) discusses the distinction between iteration, characteristic of coord-
ination, and true embedded recursion. As she puts it, “the difference between
iteration and recursion is this: the former involves mere repetition of an action or
object, each repetition being a separate act that can exist in its entirety apart from the
other repetitions, while the latter involves the embedding of an action or object inside
another action or object of the same type, each embedding being dependent in some
way on the action/object it is embedded inside” (115).²² In this sense, (51–52) should
be analyzed as involving iteration, rather than true recursion.
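The contrast between iteration and embedded recursion can be made concrete with a small sketch. The Python code below is only an illustration under stated assumptions: the Clause class and the helper names render, depth, and coordinate are hypothetical, invented here, and are not part of Kinsella's (2009) account or of the formal apparatus of this book. Coordination is modeled as a flat sequence of self-contained clauses, as in (52), while subordination is modeled as a clause that contains another clause of the same type, as in (53).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clause:
    """A toy clause: subject, verb, optional remainder, optional embedded clause."""
    subject: str
    verb: str
    rest: str = ""
    complement: Optional["Clause"] = None  # same category embedded inside


def render(clause: Clause) -> str:
    """Spell out a clause, bracketing any embedded that-clause."""
    words = [clause.subject, clause.verb]
    if clause.rest:
        words.append(clause.rest)
    text = " ".join(words)
    if clause.complement is not None:
        # Recursion: the function calls itself on the embedded clause.
        text += " [that " + render(clause.complement) + "]"
    return text


def depth(clause: Clause) -> int:
    """Count clauses nested inside one another (1 means no embedding)."""
    if clause.complement is None:
        return 1
    return 1 + depth(clause.complement)


def coordinate(clauses: List[Clause]) -> str:
    """Iteration: string independent clauses together; none contains another."""
    return ", and ".join(render(c) for c in clauses)


# Iteration, modeled on (52): three separate clauses, each complete on its own.
flat = [
    Clause("Marc", "is", "a linguist"),
    Clause("you", "know", "it"),
    Clause("Mary", "knows", "it"),
]

# Recursion, modeled on (53): each clause is the complement of the next one up.
nested = Clause(
    "Mary", "knows",
    complement=Clause("you", "know",
                      complement=Clause("Marc", "is", "a linguist")),
)

print(coordinate(flat))   # Marc is a linguist, and you know it, and Mary knows it
print(render(nested))     # Mary knows [that you know [that Marc is a linguist]]
print(depth(nested))      # 3
```

On this toy model, every clause in the coordinated string has depth 1, however many clauses are added, whereas each additional that-clause in the subordinated structure adds one level of same-category embedding.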
However, a reviewer points out that, with some elaboration, coordination may
allow for multiple embedding of one viewpoint within another, as in the following
example, which places prosodic prominence on that:²³
(54) ?Marc is a linguist, and you know it, and Mary knows that.
So, at least with coordination, one can ﬁnd a way to tinker with the utterance until it
expresses two levels of embedding, with the help of prosody and the alternation
between it and that referring to the main clause, as per Footnote 23.²⁴
But even with these tools, ﬁrst of all, the sequence in (54) does not guarantee the
interpretation in (53), as other interpretations are possible, too. Second of all, the lack
of syntactic precision becomes even more obvious when one attempts recursion
beyond the two levels. Let us show this by attempting two more levels of embedding,
contrasting the coordination strategy in (55), with the subordination strategy in (56):
(55) ?Marc is a linguist, and you suspect it, and Mary knows that, and
Steven really believes that, and Peter wonders about that.
(56) Peter wonders [CP if Steven really believes [CP that Mary knows
[CP that you suspect [CP that Marc is a linguist]]]].

²² Kinsella further notes that, unlike iteration, embedded recursion involves keeping track or adding to
memory using a stack (116). In other words, tracking recursive structures poses a challenge to our
processing abilities the way that iteration does not, to be discussed further below in the text. This is why
it is so helpful to have a designated functional projection such as CP, which unambiguously tracks an
embedded recursive process.
²³ What makes this possible in (54), but much less so in (52), is the alternation between it and that, both
of which can refer to a clause. Using the same pronoun (it) suggests that one is referring to the same main
clause (Marc is a linguist) in both cases. On the other hand, alternating it and that, and placing special
emphasis on that in the second coordinated clause, suggests that whatever it refers to, that contrasts with it,
and refers to something else. This something else can then be a combination of the first two clauses,
although it need not, and other possibilities for the interpretation of that are certainly also available. This is
another example of an underspecified structure, subject to vagueness.
²⁴ To the extent to which there is a contrast in acceptability between (54) in the text and (i) below, it
might suggest that coordination is a bit more flexible than plain parataxis in this respect:
(i) ??Marc is a linguist, you know it, Mary knows that.

The more levels of embedding, the clearer it becomes that (55) is not a great
strategy for embedding multiple viewpoints one within another, while (56) is spe-
cialized to do just that, in an unambiguous and streamlined way. Whatever (55)
means, it is hard to see how it would be used successfully to express the meaning
in (56). (55) does not exhibit true recursion in the sense above.
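The bracket notation in (56) lends itself to a simple count of how many CPs are open at the same time. The Python sketch below is a hypothetical illustration, not a parsing model proposed in this book: the function name max_open_brackets is invented here, and the only assumption is that each "[" opens a constituent and each "]" closes one. It uses an explicit stack, in the spirit of Kinsella's remark (Footnote 22) that embedded recursion requires keeping track with a stack.

```python
def max_open_brackets(bracketed: str) -> int:
    """Return the largest number of constituents open at once in a bracketed string."""
    stack = []    # positions of brackets opened but not yet closed
    deepest = 0
    for position, char in enumerate(bracketed):
        if char == "[":
            stack.append(position)
            deepest = max(deepest, len(stack))
        elif char == "]":
            if not stack:
                raise ValueError(f"unmatched ']' at position {position}")
            stack.pop()
    if stack:
        raise ValueError(f"unmatched '[' at position {stack[-1]}")
    return deepest


# (55): coordination, with no clause bracketed inside another.
sentence_55 = ("Marc is a linguist, and you suspect it, and Mary knows that, "
               "and Steven really believes that, and Peter wonders about that.")

# (56): subordination, with each CP embedded inside the one above it.
sentence_56 = ("Peter wonders [CP if Steven really believes [CP that Mary knows "
               "[CP that you suspect [CP that Marc is a linguist]]]].")

print(max_open_brackets(sentence_55))  # 0
print(max_open_brackets(sentence_56))  # 4
```

The count for (56) grows by one with every added CP, while the coordinated string in (55) never requires more than a flat, clause-by-clause pass, whatever its length.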
In the same vein, in the paratactic example in (48), the two clauses should be
analyzed as occurring next to each other, loosely conjoined, in the sense of iteration,
rather than true recursion. The nature of the semantic link between the two clauses
will then be figured out pragmatically. However, if there are multiple links to figure
out, that is, multiple clauses strung together, then this becomes a processing game of
guessing, familiar from Section 3.1 with examples such as No come, no money, no
shelter. In that sense, a specialized, designated functional projection such as CP,
whose processing is streamlined, can circumvent the more scattered processing
strategies associated with Conjoin (see also Section 7.3.4).
Suffice it to say here that this is exactly what evolutionary forces can operate on:
there is already a precursor to recursion, that is, a precursor to the ability to embed
one viewpoint within another, but it is only good for one or two levels of such
embedding, and it is never unambiguous. In contrast, CP subordination, which
specializes for this kind of embedding, gives rise to infinite recursion, exactly because
it can circumvent the imprecise processing strategies based on Conjoin. This is the
sense in which gradual, step-by-step evolution should be understood: a new stage
does not bring about something totally new, but something just a bit more stream-
lined. The following section explores recursion associated with the Determiner
Phrase (DP).

4.4.3 DP and recursion

In addition to recursion associated with CP embedding, recursion is also often
illustrated for English with possessive structures, such as:
(57) a) Peter’s brother
b) Peter’s brother’s cat
c) Peter’s brother’s cat’s toy
Here, one DP (Peter’s) is embedded within another DP (Peter’s brother), which in
turn is embedded within another DP (Peter’s brother’s cat), and so on, illustrating a
true recursive process that can keep going. It exhibits both of the elements of the
traditional characterization of recursion: the same category condition, and the
potential for unlimited reapplicability. And from the point of view of an English
speaker, this may seem like no big deal—of course you can keep embedding one
possessive within another, and why not conclude from there that this is just
unbounded, recursive Merge at work? And why not also conclude that the recursive
Merge here reflects our recursive cognitive abilities?
Interestingly, however, Heine and Kuteva (2007) introduce a distinction between
productive recursion, as attested with English possessives (57), which can apply
multiple times, and one-level recursion (simple recursion), which can only apply
once, which they illustrate with German possessives (58):
(58) a) Peters Bruder
b) *Peters Bruders Auto
Despite great surface similarities between the two languages, German possessive
structures of this kind cannot be repeatedly embedded one within the other, and
the question is “why not?” Given the characterization of recursion I am adopting, the
German example does not in fact involve recursion, because one of the two conditions,
the potential for unlimited reapplicability, is clearly not met.
This in itself should be enough to illustrate that just having Merge is not enough to
guarantee recursion, and that there is crosslinguistic variation even between closely
related languages in this respect. It must be that English and German are using
different syntactic strategies to the same end: one happens to be recursive, and the
other is not. While determining exactly what these different syntactic strategies are is
an ongoing topic in syntactic theory, the discussion in the previous section suggests
that what Heine and Kuteva (2007) call simple recursion may be a symptom of a
paratactic (iterative) strategy, rather than true embedded recursion (see also
Section 4.5.1.1 for parataxis which can be mistaken for subordination in PIE).
To complicate matters further, Serbian, just like German, does not show recursion
of possessives (59), even though it is possible to express one level of possession (60):
(59) *Milenina mamina knjiga. / *moja mamina knjiga.
     Milena’s mother’s book     my mother’s book
(60) a. Milenina mama      moja mama
        Milena’s mother    my mother
     b. mamina knjiga      moja knjiga
        mother’s book      my book
In line with the idea that recursion needs to be facilitated by specific functional
projections, Bošković (2008, and subsequent work) has proposed that Serbian does
not have a DP, which correlates with the lack of definite articles such as the in
Serbian. This analysis is controversial, but if true, then Serbian cannot use the same
DP-within-DP strategy that is used in English for possessives. Instead, Bošković
analyzes the possessive in Serbian as an adjective adjoined to an NP. This would
then render the possessive attachment close to the Conjoin/Adjoin strategy, i.e. to the
iterative strategy in the sense of Kinsella (2009).
According to Bošković (2008), the reason Milenina mamina cannot form an
Adjective Phrase (AP) which adjoins to the NP is that adjectives cannot modify
other adjectives. In other words, Milenina is not interpreted as being inside the
phrase headed by mamina in (61), but rather as being next to it, and can thus not
yield true recursion, in contrast to English (62):
(61) *[NP [? Milenina mamina [NP knjiga]]] (Serbian)
(62) [DP [DP [DP Peter’s] brother’s] [NP car]] (English)
Even though German has definite articles, and is thus analyzable as a DP language, it
is possible that German uses a similar adjectival strategy for possessives, and treats
them as adjoined to an NP.
(63) *[DP [NP [? Peters Bruders] [NP Auto]]] (German)
One can see that the structure in (62) is truly recursive, as far as syntax is concerned,
because it involves a repeated insertion of one DP within another. However, the
structures in (61) and (63) are not recursive in this way, as they do not involve one DP
embedded within another DP, but an adjective adjoined to an NP, a paratactic
strategy.
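The contrast between (62) and (61)/(63) can be sketched in code. The following Python fragment is only an illustration and makes no claim about how these structures are actually represented: the class names `DP` and `AdjoinedNP` and the function `possessor_depth` are hypothetical, and the single-modifier slot is an assumption standing in for the restriction that a possessive adjective cannot modify another adjective.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DP:
    """Hypothetical DP: a head noun plus an optional possessor that is itself a DP."""
    noun: str
    possessor: Optional["DP"] = None

    def spell_out(self) -> str:
        # The possessor is spelled out first, so the nesting mirrors (62).
        if self.possessor is None:
            return self.noun
        return f"{self.possessor.spell_out()}'s {self.noun}"


def possessor_depth(dp: DP) -> int:
    """Count how many DPs are embedded inside this DP."""
    return 0 if dp.possessor is None else 1 + possessor_depth(dp.possessor)


@dataclass
class AdjoinedNP:
    """Hypothetical adjunction structure: an NP with at most one possessive adjective."""
    noun: str
    possessive_adjective: Optional[str] = None  # a single word, not a phrase

    def spell_out(self) -> str:
        if self.possessive_adjective is None:
            return self.noun
        return f"{self.possessive_adjective} {self.noun}"


# English-style strategy: a DP inside a DP inside a DP, as in (57c).
toy = DP("toy", DP("cat", DP("brother", DP("Peter"))))
print(toy.spell_out())        # Peter's brother's cat's toy
print(possessor_depth(toy))   # 3

# Serbian-style strategy: one level only, as in (60b).
book = AdjoinedNP("knjiga", "mamina")
print(book.spell_out())       # mamina knjiga
```

On this sketch, the possessor slot of `DP` accepts another `DP`, so nothing limits the depth of embedding, whereas the modifier slot of `AdjoinedNP` accepts only a word, so there is nowhere to place a second possessive.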
It seems that all hierarchical phenomena considered so far have an alternate,
paratactic route, including CP subordination and DP possessive expression. As
pointed out in Chapter 3 with respect to serial verb constructions (Section 3.4.1),
even transitivity can be expressed with an alternative, paratactic strategy (as opposed
to a hierarchical strategy). This is consistent with the proposal in this book that
parataxis provided, and continues to provide, a foundation and a precursor for
building hierarchical structures. The emergence of transitivity and TP layering will
be further discussed in Chapter 7.

4.4.4 Benefits of subordination

As discussed in the previous subsections, recursive syntactic mechanisms have two
basic advantages over parataxis and coordination: (i) they allow more levels of
(recursive) embedding, and (ii) they provide a more precise, unambiguous mechan-
ism for expressing recursive thoughts. Of course, the hierarchical stage in general has
many additional advantages, including the ability to streamline the expression of
transitivity and tense marking, as further discussed in Chapter 7.
The subordination stage, a hierarchical, specific functional category stage, can thus
be characterized as a stage that facilitates Move and provides a mechanism for
true recursion. As pointed out above, at an abstract level, one important function of
Move is to connect the layers of hierarchical structure (Section 4.4.5). However, it is
important to keep in mind that only those constructions that have reached this
hierarchical level can be recursive and subject to Move. Recall the proposal that
sentences in modern languages are composite structures, potentially incorporating
constructions of all three stages:
(64) As you may recall, her having left, Peter decided that he wanted
to buy a new house, but not in California.
Preceding the main clause (Peter decided . . . ), there is an adjoined full TP/CP
adjunct/parenthetical (as you may recall), followed by an adjoined small clause
adjunct (her having left), both attached paratactically by Adjoin/Conjoin. The main
clause contains a fully subordinated clause (CP), which moreover features coordin-
ation inside it (that he wanted to buy a new house, but not in California). As discussed
in Chapter 5, Move is typically not possible out of adjuncts and conjuncts, adjunction
and coordination being the most notorious islands for movement. In Chapter 5
I argue that these island/Subjacency effects are epiphenomena of evolutionary tinkering,
more precisely, of having such rigid, Move-less structures co-exist side by side with
more modern structures.
It is also worth pointing out here that the lack of recursion cannot be attributed to
cognitive capabilities, or rather to the lack thereof. Just as it was pointed out with
respect to the lack of recursion with small clauses, the inability to express true
recursion with parataxis and coordination has nothing to do with the speakers’
cognitive abilities, and everything to do with the structure of these constructions.
The claim in this monograph is that the kind of functional structure which enables
recursion evolved gradually in the evolution of human language, although it did not
emerge with every single construction in every single language (see also Heine and
Kuteva 2007).
In this respect, both German and Serbian speakers must be cognitively capable of
recursive thought, given that they make use of recursion elsewhere, and yet their
possessive structures, as discussed in Section 4.4.3, are not recursive. It cannot be that
language is just a passive reflection of thought, equipped with an unbounded Merge,
so that, if you can only think an unbounded thought, it will allow you to express it
through recursive syntactic means. Instead, language is patched together from
various bits and pieces to first allow paratactic precursors to recursion, and then, in
some special cases, unlimited recursion.25 This shows that a language can have
hierarchical syntax and Merge in the sense of Hauser, Chomsky, and Fitch (2002),
but there is still some tinkering to do before recursion in the traditional sense can
emerge.
Also consistent with the considerations in this chapter are reports that some
modern languages do not make use of finite subordination (e.g. Dixon 1994 for
Dyirbal; Mithun 1984, 2010 for various Native American languages). Most recently,
Everett (2005) has argued that Pirahã lacks recursion both in the domain of CP

25
It is, of course, misleading to talk here about precursors to recursion, as if German and Serbian and
Pirahã (see below) strategies are somehow unstable and awaiting recursion. It is only from the point of view
of how recursion comes about that these strategies can be seen as precursors.

subordination and in the domain of possessive recursion, a conclusion also echoed
in Sakel and Stapert’s work (e.g. 2010; see also Piantadosi et al. 2012).26 Newmeyer
(2005: 170–1) also leaves the door open for languages to lack subordination, suggest-
ing that this may be correlated with the lack of literacy, considering that CP
subordination is mostly used in written texts, and very rarely in everyday conversa-
tion. This can serve as a partial answer to a reviewer’s question concerning why a
human language would not have CP or clausal embedding, if human brains are
capable of it.27

4.4.5 Possible precursors to Move

As established so far, the proposed fossils of the paratactic (non-hierarchical) stage
are not subject to Move. This is not at all surprising given that Move in the theoretical
framework associated with Minimalism has to take a constituent in a certain
syntactic position and raise it to a hierarchically higher (c-commanding) position,
so that the raised constituent can hierarchically dominate its trace or copy left in the
original position. To put it slightly differently, in order to posit that there is a gap (left
by Move) in a certain syntactic position, there has to be an automatic, grammaticalized
way to identify the position and nature of the gap by a higher constituent, and
this kind of command relationship (c-command) is only relevant for hierarchical
grammars (see also Section 4.3). Nonetheless, there might be precursors to Move in
this paratactic stage, or rather structures that can provide the basis for Move.
One relatively straightforward case would involve a transition from a hypothetical
paratactic serial verb sequence (Section 3.4.1) of the kind in (65), to vP transitivity of
the accusative type, such as (66).28
(65) a) [SC Woman push], [SC man fall]
b) [SC Girl roll], [SC ball roll]

26
The analysis of Pirahã, as proposed in Everett (2005), has been contested by e.g. Nevins, Pesetsky, and
Rodrigues (2009) and references there, and is, in general, surrounded by a lot of unpleasant controversy
(see e.g. the characterization by Pullum 2012, and the comments there).
27
In this respect, it is important to keep in mind that, whether in evolution in general or in language
change, an innovation is typically due to chance, and is not predetermined or predestined. While human
beings are capable in principle of inventing a wheel, not all cultures have done that, and certainly not all
individuals. We often pose these negative questions, such as how come this language or person does not
have this? Or why do certain constructions lack Move? Or, why do certain constructions lack recursive
subordination? Or why do some languages lack DP? In fact, on this evolutionary approach, the questions to
be posed are of the opposite kind: why is it that certain constructions have Move (Chapter 5), and why is it,
and what kind of circumstances needed to be met, for some constructions to become recursive, and for
languages to acquire a DP? And what does it take to invent a wheel? For the absence of such rather bizarre
phenomena as far as nature is concerned is much easier to understand than their existence.
28
As pointed out in Footnote 25, these can be considered as precursors only from the point of view of
the vP accusative structures, which needed that foundation. But these structures are supported by coherent
grammars, which can be stable.

(66) a) [vP Woman push [VP/SC man push]]
b) [vP Girl roll [VP/SC ball roll]]
The second small clause in the vP structure (66) is now inserted into the higher
clause, rather than being next to it (65), which is what creates hierarchy and
subordination. In the syntactic theory I am following, the idea is that the lower
verb Moves from the position of V into the higher verb position, light v, leaving
behind a gap, which is now identified by the higher verb. In this case, the postulated
Move accompanies the building of hierarchical structure, whereby the Moved ele-
ment travels through the layers of structure, providing cohesion among them. This
kind of Move seems purely grammatical, without leading to a different semantic or
pragmatic interpretation.
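The requirement that the higher position c-command the gap can be made concrete with (65a) and (66a). The Python sketch below is illustrative only. It uses a simplified definition of c-command (a node's parent must dominate the other node, and neither node may dominate the other), which coincides with the usual definition only if every non-terminal node branches. The flat `Conjoin` node and the intermediate `v'` node are assumptions introduced for the illustration, as are the names `Node` and `c_commands`.

```python
class Node:
    """Hypothetical tree node: a label, its children, and a link to its parent."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

    def dominates(self, other):
        # A node dominates every node below it in the tree.
        return any(c is other or c.dominates(other) for c in self.children)


def c_commands(a, b):
    """Simplified c-command: neither node dominates the other, and a's parent dominates b."""
    if a is b or a.dominates(b) or b.dominates(a):
        return False
    return a.parent is not None and a.parent.dominates(b)


# (65a): two small clauses placed side by side under a flat Conjoin node.
push_1, fall = Node("push"), Node("fall")
paratactic = Node("Conjoin", [
    Node("SC", [Node("woman"), push_1]),
    Node("SC", [Node("man"), fall]),
])

# (66a): the small clause is inserted under the higher verb position.
push_high, push_low = Node("push"), Node("push")
hierarchical = Node("vP", [
    Node("woman"),
    Node("v'", [push_high, Node("VP/SC", [Node("man"), push_low])]),
])

print(c_commands(push_1, fall))          # False
print(c_commands(push_high, push_low))   # True
```

In the juxtaposed structure neither verb stands in a command relation to the other, so there is no structural way to identify a gap. In the layered structure the higher verb position commands the lower one, which is the configuration Move requires.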
Another instance of purely grammatical Move would be the Move of the subject of
the SC into the layer of TP in English, as discussed in Chapter 2. Suppose that we start
with a hypothetical paratactic sequence in (67), to arrive at the hierarchical structure
in (68), in which the initial verb go has been grammaticalized into a future tense
particle, a common occurrence in grammaticalization. Here, the repeated instance of
boy can be interpreted as a trace/copy of Move, and thus deleted.
(67) [SC Boy go], [SC boy hunt]
(68) [TP Boy go [VP/SC boy hunt]]
As pointed out before, what seems to be captured by Move is the observation that
some word or phrase is relevant/interpreted in more than one position in a sentence.
In some sense, then, Move is an epiphenomenon of the modest, two-word beginnings
of syntax, which could not accommodate all the arguments, or the temporal infor-
mation, into a single clause. Instead, these types of information can only be provided
in separate layers, and Move is that kind of operation which can, metaphorically
speaking, run through all these layers, providing syntactic cohesion.
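The step from (67) to (68), in which a repeated word is reinterpreted as a copy and left unpronounced, can be sketched as follows. This is a rough illustration under stated assumptions, not an implementation of any particular theory of copy deletion: the function name `pronounce`, the integer chain indices, the wrapper label `S`, and the convention that the leftmost copy is the highest one are all hypothetical choices made for the example.

```python
def pronounce(tree, seen=None):
    """Linearize a labeled bracketing, skipping lower copies of a moved element.

    `tree` is a list whose first item is a category label. The remaining items
    are sub-trees (lists) or (word, chain) pairs, where `chain` is None for an
    ordinary word and an integer shared by all copies of a moved word.
    """
    if seen is None:
        seen = set()
    words = []
    for item in tree[1:]:
        if isinstance(item, list):
            words.extend(pronounce(item, seen))
        else:
            word, chain = item
            if chain is not None:
                if chain in seen:
                    continue  # lower copy: interpreted, but not pronounced
                seen.add(chain)
            words.append(word)
    return words


# (67): two juxtaposed small clauses; the two instances of "boy" are unrelated.
paratactic = ["S", ["SC", ("boy", None), ("go", None)],
                   ["SC", ("boy", None), ("hunt", None)]]

# (68): one layered structure; the two instances of "boy" form a single chain.
hierarchical = ["TP", ("boy", 1), ("go", None),
                      ["VP/SC", ("boy", 1), ("hunt", None)]]

print(" ".join(pronounce(paratactic)))    # boy go boy hunt
print(" ".join(pronounce(hierarchical)))  # boy go hunt
```

The word strings that feed the two structures are identical. What differs is only whether the two instances of boy are treated as independent or as members of one chain, which is the sense in which repetition can serve as a precursor to Move.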
Syntactic theory also postulates Move in cases where the motivation for Move
seems to be pragmatic, for the purposes of foregrounding or backgrounding. To take
just one example, consider the case of topicalization:
(69) Mary, I don’t like Mary.
Here, Mary is taken to have Moved from its original object position to the left
periphery of the clause (possibly to CP) in order to serve as a topic of the sentence.
It is possible that such topicalization structures also have paratactic precursors, which
do not involve Move, such as (70) below, typically referred to as left-dislocation:
(70) Mary, I don’t like her.
The attachment of Mary in (70) could be by adjunction or parataxis (see e.g.
Haegeman, Shaer, and Frey 2009). What is important here is that the position of
Mary in (70) is typically not analyzed as a result of Move, given that there is no gap in
the rest of the sentence, but rather there is a (resumptive) pronoun her, which refers
back to Mary. In any event, the kind of Move postulated for (69) can only charac-
terize the stage of hierarchical syntax in which there is at least a TP, but possibly also
a CP, to provide a hierarchically higher landing site for Move.
When it comes to Move, McDaniel (2005) considers that protohumans initially
produced long, fluent, unstructured strings of words (e.g. 71), essentially Bickerton’s
(1990) protolanguage, but more fluent. According to McDaniel, when syntax fixed
the order (72), it was no longer possible to topicalize an object (e.g. baby), but this
becomes possible again if Move is introduced (73). The repetition of arguments
characteristic of protolanguage can be reinterpreted as copies of Move, and thus
provide a precursor for Move (see also the discussion in Bickerton 2012; Tallerman
2014b).
(71) baby tree leopard baby baby kill
(72) leopard tree kill baby
(73) baby [leopard tree kill baby]
At least given the theoretical framework associated with Minimalism, two conditions
would need to be met in order for (73) to constitute Move: first, as McDaniel suggests,
there would be deletion of the lower copy (avoidance of repetition); and, second,
hierarchical structure would already have to be in place for it to be possible to
identify the gap syntactically, as per the discussion above. For example, in order to
postulate a gap in the object position in (73), one needs to be certain that this is in fact
a transitive (vP) structure in the first place, given that an intransitive absolutive-like
structure would not have an object position. Likewise, in order for baby to be able to
c-command the gap in the object position, this already would have to be hierarchical
syntax, with baby appearing in the highest layer.
In this respect, it is also important to point out that any permutations of word
order in the two-word stage (e.g. Ball roll vs. Roll ball) cannot be considered as Move
in the technical sense of Move, as discussed above. Instead, these kinds of permuta-
tions would need to be considered as just instances of a single application of the
operation Conjoin, which does not impose linear ordering on the constituents.29 In
this sense as well, Conjoin is like Adjoin in that it can exhibit different word
orderings without implicating syntactic Move (e.g. Adger 2003). Thus, the adverbs

29
As such, different orderings could be used for different discourse purposes of backgrounding or
foregrounding, but they could not be considered to instantiate Move. In this respect, Hurford (2012)
considers that Move is driven by the desire to change the information structure, for example to express
topicalization, new vs. old information, questions, etc. Given my approach, Move can serve these purposes
only at a much later stage, necessarily hierarchical, as per the discussion in the text.

below are not analyzed as involving Move, but rather just attachment in different
syntactic positions.
(74) a) Unfortunately, they will have to retire.
b) They will have to retire, unfortunately.
(75) a) They quickly extinguished the fire.
b) They extinguished the fire quickly.
In other words, not every permutation of constituents is analyzed as syntactic
Move, but only those instances in which a constituent travels upwards through
(c-commanding) hierarchical layers.

4.4.6 Transitions and overlaps

Given this gradualist view, one can expect to find transitional constructions, those
straddling the boundary between coordination and subordination, and such con-
structions are not difficult to find (see Section 4.3 for the overlap between parataxis
and coordination). In addition to the word as, illustrated there, there are other words that are
difficult to classify as either coordinators or subordinators. In the examples below,
but is analyzed as a coordinator, and although as a subordinator. Notice, however,
that although introduces an adjunct clause, rather than a clause truly subordinated
into the matrix clause, once again showing a curious interplay among adjunction,
coordination, and subordination.
(76) He wants to get married again, but this time not in Las Vegas.
(77) He wants to get married again, although this time not in Las Vegas.
To take another example, the most neutral, prototypical of conjunctions, and, can
sometimes express subordinating relationships, as discussed in e.g. Culicover and
Jackendoff (2005).
(78) a. Give him an inch, and he will take an ell.
(Oxford English Dictionary)
b. Speak one word, and you are a dead man!
(Oxford English Dictionary)
c. One more can of beer and I am leaving.
(Culicover and Jackendoff 2005: 474)
In (78) above, the relationship between the two clauses is best paraphrased as
involving a conditional, if–then relationship. Culicover and Jackendoff (2005: 474)
call this use of and “left-subordinating and.” In this use, and can be seen as a pure
linker/proto-conjunction, linking two clauses. Recall from Section 4.2 that paratactic
clause combinations without any linkers (e.g. Nothing ventured, nothing gained) also
get interpreted as involving causal or conditional relations.
More recently, Ross (2013) has argued that English and has undergone a grammaticalization
process from a conjunction to a subordinator/complementizer in e.g. the try and
type of construction (I will try and do that). He ties this to comparable
processes of grammaticalization observed in e.g. !Xun (a southern African Khoisan
language), in which conjunctions become subordinators (Heine and Kuteva 2002: 44):
(79) yà /oa tcí ta yà ﬁa #èhi
he neg come and he PROG be.sick
‘He doesn’t come (because) he is sick.’
A similar process has been observed in Tok Pisin, an English-based creole spoken in Papua
New Guinea (Verhaar 1995), as illustrated in the following example from Ross (2013):
(80) Em (i) tra-im na help-im mi.
He PRED try-TRANS and help-TRANS me
‘He tries/tried to help me.’

Even within the subordination stage, one finds a variety of clausal subordination
types, with differing degrees of cohesion between clauses. These types of clausal
subordination range from those that involve most syntactic structure, finite subor-
dination with a CP (Complementizer Phrase) (81–82), to those which involve the
least structure, a small clause (83–84), abstracting away from intermediate cases, such
as infinitive clauses (see Progovac 2009c, 2010a).
(81) Mary believes [that he fell off his motorcycle].
(82) Mary believes [that John knows [that the neighbors noticed [that he
fell off his motorcycle]]].
(83) Let [it rain].
Peter saw [Mike fall].
I consider [the problem solved].
(84) ?I will let [John imagine [Peter see [Mike fall off his
motorcycle]]].
In contrast to finite (CP) subordination, which is fully recursive in the sense that one
clause can be embedded inside another, potentially ad infinitum (82), small clause
recursion seems to be somewhat more limited in this sense, as the marginal status of
(84) suggests. Of note is that the subjects of embedded SCs have a structural (case)
relationship with the matrix verb, the so-called ECM case, suggested by the required
adjacency with the verb (no intervening adverbials) (85), and by the required
determiner (86).30

30
The label ECM (Exceptional Case Marking) is due to the observation that the verb here assigns
structural case to a noun phrase which is not its object. Structural case is a grammatical case assigned to a

(85) *Peter saw yesterday [Mike fall].
*I consider crucially [the problem solved].
(86) I consider [*(the) problem solved].
I consider [*(the) class in session].
Both types of embedding exploit some functional glue to “cement” the relationship
between the two clauses: complementizers/subordinators (82) or structural case (84).
Complementizer glue is more specialized (only used for finite subordination), while
structural case is used for other purposes as well. This may be one reason why
recursion is freer with finite subordination (see also Deutscher 2000). In other
words, it may be that finite subordination allows recursion more freely because it is
more unambiguously marked for embedding than are ECM small clauses (see also
discussion in Section 4.4.2).
Thus, it seems that recursion itself is not an all-or-nothing phenomenon. Very
roughly speaking, extrapolating from the discussion in this section, as well as
previous chapters, recursion is structurally impossible without hierarchical func-
tional structure; it can be limitless with highly specialized functional categories
such as finite complementizers; and it is possible, although not limitless, with other
types of structures, such as ECM.
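The sense in which finite subordination is limitless can be illustrated with a short recursive procedure that reproduces the bracketing in (82). This is merely a sketch: the function name `embed` and the representation of each clause as a (subject, verb) pair are hypothetical, and the procedure models only the "same category, unlimited reapplicability" property, not the grammar of complementation.

```python
def embed(frames, innermost):
    """Wrap `innermost` in one finite CP per (subject, verb) frame, outermost first."""
    if not frames:
        return innermost
    subject, verb = frames[0]
    # Each call inserts a clause of the same type inside the previous one.
    return f"{subject} {verb} [that {embed(frames[1:], innermost)}]"


frames = [("Mary", "believes"), ("John", "knows"), ("the neighbors", "noticed")]
print(embed(frames, "he fell off his motorcycle"))
# Mary believes [that John knows [that the neighbors noticed [that he fell off his motorcycle]]]
```

Nothing in the procedure limits the length of `frames`, because the complementizer marks every level of embedding in the same unambiguous way. A comparable procedure for ECM small clauses, as in (84), would have no dedicated marker to insert at each level, which is one way of picturing why that strategy degrades after a few levels.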
The conclusion emerges that recursion is not just a ubiquitous omnipresent
phenomenon, which comes to language free with Merge, but rather it is a conse-
quence of fairly elaborate syntactic structure, which may be present in some, but not
other, language constructions. As pointed out in the previous sections, the absence
of recursion with a particular structure in a particular language is not an indicator
of a general cognitive (in)ability, but rather just an indicator of the less elaborated
syntax.
Finally, the functional category stage introduced in this section may have wit-
nessed more fine-grained sub-stages. Perhaps there was a stage in which aspect was
grammaticalized, but not yet tense (see also the discussion of PIE in the following
section; also Progovac 2008a,b for Serbian small clauses). Perhaps there was a stage in
which TP could be built, but not yet CP (see also Chapters 3 and 7 for the vP
projection and transitivity). Perhaps gender/number agreement (e.g. on participles)
emerged before person agreement (see Progovac 2008a,b). In this respect, Boeckx
(2008: 119) suggests that the Minimalist operation Agree may have emerged after
Merge. But my primary focus in this chapter is on the three rough syntactic stages, as
well as on envisioning what proto-grammars looked like in the initial stages, as well
as how these initial stages may have penetrated into the subsequent stages gradually

noun phrase (or DP) by e.g. a verb or a preposition in a certain syntactic configuration, often requiring
adjacency. DP is considered to be required for structural case in e.g. English and Italian (Longobardi 1994;
see also Chapter 2), which helps explain why the articles are obligatory in (86). For an elaborate argument
with regard to small clauses in this respect, see Progovac (2006).

paving the way toward layered/hierarchical and recursive syntax. My purpose was
also to show how postulating these stages can shed light on the quirks and complex-
ities of present-day syntax (see also Chapter 5 on Subjacency).

4.5 Corroborating evidence

4.5.1 Corroborating evidence for the paratactic stage
There is an abundance of corroborating evidence for a paratactic stage in the evolution
of human language, coming from e.g. ancient languages (4.5.1.1), grammaticalization
processes (4.5.1.2), animal communication (4.5.1.3), agrammatism (4.5.1.4), neurosci-
ence (4.5.1.5), and language acquisition (4.5.1.6).
4.5.1.1 Ancient languages While ancient languages are typically considered to be
full modern languages in the sense of language evolution, I consider them separately
here first because they are no longer spoken, and second because they just might have
preserved more syntactic fossils than present-day languages. Here I merely list some
of these fossil properties reported in the literature, leaving their evolutionary signifi-
cance for future research.
Kiparsky (1968) has argued convincingly that PIE syntax was characterized by
optional adverbial temporal particles, which did not build TPs. Such adverbs may
have been attached by Conjoin (or Adjoin). Similarly, when it comes to clause
combination in PIE, according to Kiparsky (1995: 155) (see also Hale 1987; Watkins
1976; Hock 1989), a major characteristic, best preserved in Sanskrit, Hittite, and
Old Latin, was that finite “subordinate” clauses were not embedded but adjoined.
Kiparsky further argues that IE protolanguage lacked the category of complementizer
and had no CP or any syntactically embedded sentences. What looked like finite
subordinate clauses, including relative clauses and sentential complements, were
syntactically adjoined to the main clause, still exhibiting main clause properties, such
as topicalization of constituents to clause-initial position. Kiparsky (1995: 145) calls
these adjoined finite clauses “embedded root clauses,” for they exhibit properties of
root clauses, and yet seem to be interpreted as embedded. According to Kiparsky’s
analysis, these correspond to the paratactic attachment, which does not yield recursion.
Kiparsky further claims that the introduction of complementizers coincided with
the shift from adjunction to subordination, which is in line with Kayne’s (1982) claim
that only CPs can function as true sentential arguments, i.e. as embedded clauses (see
also Holmberg 1986; Taraldsen 1986). If true, then ancient languages, such as PIE, as
well as Akkadian, as discussed in the previous section, can provide additional access
to evolutionary fossils of language.
4.5.1.2 Grammaticalization31 As already pointed out, the outlined progression of
stages is consistent with the grammaticalization processes observed in recent times:

31
Section 7.3.5 discusses why historical change may be of interest to evolutionary considerations.

e.g. transitions from paratactic to subordinate relationships are observed even in
present-day languages. According to e.g. Deutscher (2000) and Traugott and Heine
(1991), the grammaticalization of finite subordination typically takes parataxis as a
starting point. Mithun (2010) offers several examples from a variety of languages,
illustrating the “fluidity” of recursion, in the sense that not all languages exhibit it in
all possible constructions. Her findings support the idea that subordination often
arises from parataxis, and that parataxis is still used in some languages as the main
strategy for clause combination, utilizing intonation as the primary glue. The fol-
lowing example from Mohawk illustrates two independent clauses used in a
sequence, but pronounced “as a single intonational contour, beginning with a high
pitch reset and descending to a final fall only at the end of the second clause”
(Mithun 2010: 24):32
(87) Iah ki’ the’: tehoterièn:tare’ na’a:wen’ne’.
not in.fact at.all did he know (it) so it happened
‘In fact, he did not know it. It so happened.’
In English, a preferred way to express this would be by using subordination, as in (88)
below. But the use of parataxis in English is still widely attested, as the example in
(89) shows.
(88) In fact, he did not know [what happened].
(89) You know that. Marc is a linguist.
Such paratactic combinations of independent sentences into a single intonation unit
often result in the grammaticalization of demonstratives into complementizers. All
that needs to happen in (89) is for the demonstrative that to be reanalyzed as
introducing the following clause (90), rather than ending the previous clause, and
this is in fact a very frequent source of complementizers/subordinators, according to
e.g. Heine and Kuteva (2007) and references there:
(90) You know [that Marc is a linguist].
If this kind of progression from parataxis to subordination is a natural process that
occurs even in present times, then this is certainly not an implausible scenario for the
evolution of human language.
4.5.1.3 Comparative studies: Animal communication Given that parataxis involves
no markers of Merge other than intonation/prosody, it is also of interest here that

32
It is in fact a traditional view in historical linguistics that subordination (hypotaxis) develops from
parataxis (juxtaposition, coordination), advocated in e.g. Schlegel (1808), Bauer (1833), Delbrück (1893–1900),
among many others (see e.g. Harris and Campbell 1995 for criticism of this view, as well as for many additional
references). Harris and Campbell conclude, however, that subordinate clauses originated in relatively recent
times (308).

intonation and prosody, which are modulated analogically, rather than discretely,
have been proposed by many to have been available before syntax proper, given that
they seem to have analogs in other species (see e.g. Deacon 1997; Piattelli-Palmarini
and Uriagereka 2004; Burling 2005), and given that prosody emerges early in
language acquisition (Section 4.5.1.6).33 According to Deacon (1997), speech prosody
is essentially a mode of communication that provides a parallel channel to speech; it
is recruited from ancestral call functions.34 Like these systems, prosodic features are
primarily produced by the larynx and lungs, and not articulated by the mouth and
tongue. But unlike calls of other species, prosodic vocal modification is continuous
and highly correlated with the speech process (Deacon 1997: 418).35 The human
larynx must be controlled from higher brain systems involved in skeletal muscle
control, not just visceral control (243).
According to Deacon, it is as though we have not so much shifted control from
visceral to voluntary means but superimposed intentional cortical motor behaviors
over autonomous subcortical vocal behaviors. If this is on the right track, then this
would be another scenario consistent with the theme of this monograph, which is
that older strategies were integrated into more recent ones, rather than being replaced by
them, resulting in composite structures.
There have also been numerous reports that primates can combine two signs into a
meaningful utterance, even though, as pointed out by reviewers, the interpretations
of these findings are controversial. The problem seems to be that primates usually
produce a stream of signs without much evidence for cohesion (e.g. Kanzi, a bonobo,
as reported in Savage-Rumbaugh and Lewin 1994). The question then is whether
there are at least some sporadic attempts to put some of these signs together into
meaningful units. It has been reported that Washoe, a chimpanzee who learned how
to use signs of American Sign Language, combined the signs for water and bird to
describe a duck (Gardner, Gardner, and van Cantfort 1989). Kanzi has been reported
to be able to combine a lexigram and a gesture into a meaningful unit (Greenfield and
Savage-Rumbaugh 1990: 161), as discussed in Section 3.5.
Washoe’s and Kanzi’s ability to combine two elements into a meaningful unit
should not be taken to mean that they are using compounds or small clauses in the
same productive way that humans do today. Clearly, the use of such combinations by
non-humans is rare and sporadic. The relevant question here is not whether Washoe
reached a two-word or hierarchical stage of language, but rather whether our

33
In addition, intonation and prosody may remain intact even in cases of various kinds of aphasia (e.g.
Brain and Bannister 1992; Pulvermüller 2002; and references cited there).
34
Affective prosody has been reported to be strikingly similar in humans and other primates so that
human subjects having no previous experience with monkeys correctly identify the emotional content of
their screams (Linnankoski et al. 1994; Kotchoubey 2005: 136; see also Hurford 2007: 282).
35
This is also consistent with Tyler and Warren’s (1987) experimental finding that comprehension is
affected by disrupting either syntactic or prosodic structure (Section 4.1).

common ancestors were in principle capable of combining two signs. This kind of
basic ability, if it was there at the relevant juncture, would have greatly facilitated the
transition from the one-word stage to the two-word stage. In order for the selection
process to get off the ground, at least some of our common ancestors should have
been capable of producing and understanding such combinations. Those who were
just a bit better at it would have been the ones whose genes were passed on in the line
of descent leading to humans. It is important to point out that any continuity with
other primates is not to be sought in the most advanced features of human syntax,
such as recursive CPs or DPs, but rather in the most rudimentary of syntactic
structures, such as two-word paratactic combinations.
One important consequence of the syntactic reconstruction offered in this book is
that it decomposes syntax down to its most modest beginnings, revealing where
continuity with the abilities of non-humans is likely to be found. In this respect,
consider Yang’s (2013) study discussed in Section 2.5.1. It compares children’s com-
binations of articles (a and the) and nouns, with the sign combinations by non-
human primates, of the kind give X, or more X. It is not clear to me how these
structures are comparable, given that articles are highly abstract functional categories
(associated with DPs), late to emerge in children (e.g. Radford 1990), as well as in the
grammaticalization processes (e.g. Heine and Kuteva 2007). Recall also from
Section 4.4.3 that articles are not even available in all human languages. In any
event, this monograph suggests that continuity can only be expected with the most
rudimentary of syntactic structures. But even there, as pointed out above, one does
not expect human-like fluency with two-word combinations—not at all. After all,
humans had millions of years to undergo selection for language since the common
ancestor with chimpanzees. All one can hope to find in this respect is a precursor to
the ability to combine signs.
4.5.1.4 Agrammatism As discussed in Chapter 2, agrammatism offers another
source of corroborating evidence for small clause grammars, which are arguably
paratactic grammars. Consistent with the conclusion that agrammatic patients often
resort to small clause grammars, one expects them also to have difficulties with
embedding and recursion.
As found in e.g. Friedmann and Grodzinsky (1997), the use of subordination/CP is
also affected in the speech production of agrammatic aphasia, which typically
involves a lesion in the left inferior frontal gyrus (see also Friedmann 2002). While
the speakers in their study could produce simple sentences, they failed to produce
embedded sentences in sentence repetition and sentence elicitation tasks, as well as in
spontaneous speech. The study concludes that these agrammatics cannot project
their syntactic trees up to the CP node (their Tree-Pruning Hypothesis). This is
expected if agrammatics often resort to paratactic small clause grammars, with as few
functional projections as possible.
In addition, a recent neuroimaging study found that sentences with CPs involve
more activation in multiple loci, including Broca’s area, in comparison to those
without a CP (Shetreet, Friedmann, and Hadar 2009). The authors concluded that
the generation of syntactic layers is cognitively costly, which is fully compatible with
the proposal explored in this monograph.
4.5.1.5 Neuroscience Recent computational and brain-imaging work is consistent
with the notions explored in this book. To take one example, a PET study by
Indefrey, Brown et al. (2001) indicates that non-finite clauses do indeed require less
grammatical work. These authors presented German-speaking participants with
pictures of simple colored objects (squares, circles, and ellipses) in different spatial
configurations. The task of the participants was to describe the pictures, using one of
three different sentence formats. In the full-sentence condition, they had to produce a
full grammatical sentence, containing all relevant information (e.g. Das rote Viereck
stößt die blaue Ellipse weg ‘the red square pushes the blue ellipse away’). In the noun
phrase condition, they were required to use a non-finite phrase and to leave out the
determiner (e.g. Rotes Viereck, blaue Ellipse, wegstoßen ‘red square, blue ellipse,
pushing away’). In the word condition, participants were also required to produce
sub-sentential forms, but this time they needed to omit the inflection of the adjective
and put the adjective after the noun (e.g. Viereck rot, Ellipse blau, wegstoßen ‘square
red, ellipse blue, pushing away’). The latter two strategies involve paratactic attach-
ment, and not fully integrated syntax. The blood flow response varied as expected
between these conditions in the left operculum, a region just behind Broca’s area:
maximal response in the full-sentence condition, less strong in the noun phrase
condition, and less strong still in the word condition.
It is also of some interest that the data introduced in this monograph, the “living
fossils” of the paratactic stage, are characteristically formulaic/stereotypical expres-
sions (e.g. Case closed; Me first; Nothing ventured, nothing gained). According to e.g.
Code (2005: 317), non-propositional, stereotypical/formulaic uses of language might
represent fossilized clues to the evolutionary origins of human communication, given
that their processing involves more ancient processing patterns, including more
involvement of the basal ganglia, thalamus, limbic structures, and the right hemi-
sphere (see e.g. Lieberman 2000 for an extensive argument that subcortical struc-
tures, basal ganglia in particular, play a crucial role in syntax). According to Ullman
(2006: 480–1), Broca’s area is part of a larger circuit that involves the basal ganglia,
with the two parts of the brain densely interconnected. Basal-limbic structures are
phylogenetically old and the aspects of human communication associated with them
are considered to be ancient too (van Lancker and Cummings 1999; Bradshaw 2001).
The Appendix returns to this discussion.
Moreover, the proposals in this monograph are open to empirical verification.
Neuroimaging experiments can be devised in such a way as to distinguish
between paratactic (small clause) structures and hierarchical structures, as explored
in the Appendix (see also Chapter 2).
4.5.1.6 Acquisition As discussed in Chapter 2, syntax in language acquisition seems
to begin with a root small clause stage (or root infinitive stage), arguably a two-word
paratactic stage. Early stages of second language acquisition have been analyzed in a
similar fashion (see e.g. Klein and Perdue’s 1997 Basic Variety). Also, as shown in e.g.
Hua and Dodd (2000), prosody emerges early in language acquisition. On the other
hand, subordination, as well as Move, are rather late developments in children, as
reported in numerous references (e.g. Radford 1990; Lebeaux 1989; Ouhalla 1991;
Platzack 1990; Potts and Roeper 2006; Hollebrandse and Roeper 2007). This is
consistent with the claim in this monograph that subordination and Move are
unavailable in the paratactic stage.

4.5.2 Corroborating evidence for a proto-coordination stage

Evidence for a proto-coordination stage is not nearly as robust as evidence for a
paratactic stage. It is entirely possible that the proto-coordination stage was a brief
stage, which quickly led to grammaticalization of specific functional categories. This
stage also must have been optional in the sense that not all languages/constructions
had to pass through it. Recall from Section 4.5.1.2 that subordinators often gram-
maticalize from demonstratives and verbs like say, rather than from pure linkers (see
e.g. Heine and Kuteva 2007). The coordination stage may have been only a detour, a
direction taken only in some circumstances.
One piece of corroborating evidence for a coordination stage may come from
instances where grammaticalization of e.g. finite subordination proceeds through
a(n intermediate) coordination stage (see e.g. Deutscher 2000; Traugott and Heine
1991; and references cited there). While the references above often speak of coordi-
nation even where there is no overt coordinator, there are some concrete proposals
according to which an overt coordinator grammaticalizes into a subordinator.36 For
example, according to Harris and Campbell (1995: 290), the conditional marker da in
Mingrelian, a language spoken in Western Georgia, comes from the conjunction and.
Likewise, it is frequently reported in language acquisition literature that some
children use “fillers” in places where one would expect functional categories. While
researchers sometimes attribute the presence of such fillers to the presence of specific
functional categories, a more conservative approach is that these are just connectors
(proto-conjunctions or linkers), serving to connect words into phrases/clauses (see
e.g. Peters 1999; Peters and Menn 1993; Veneziano and Sinclair 2000; and references

36
As discussed in detail in this chapter, various languages and constructions do not mark with
conjunctions what can be considered as coordination, resulting in structures which straddle the boundary
between parataxis/adjunction and coordination.

cited there). Such fillers in English are often a syllabic nasal [m] or a schwa [ə], as the
following example illustrates (Peters and Menn 1993):
(91) [m] pick [ə] flowers. (English-learning boy, age 1;6)

According to the above authors, the fillers are vocalizations that do not correspond to
particular words/morphemes, and that initially seem to range over various kinds of
functional categories/positions.
Such fillers can thus be seen as proto-conjunctions, as per the proposal in this
chapter. It is only later that they transition into specific functional categories,
resulting in hierarchical structure. If this is on the right track, it can be seen as a
progression from a proto-coordination stage to a specific functional category stage.
In addition, Pérez-Leroux et al. (2012) found that young children in their study
frequently avoided producing recursive nominals with three nouns, such as Elmo’s
sister’s ball, which crucially rely on recursive hierarchical structure, and possibly on
the presence of a DP (Determiner Phrase) projection (see Section 4.4). In contrast, the
same children demonstrated facility integrating three nouns into coordinated struc-
tures, suggesting that coordination involves less syntactic complexity than embedded
recursion, consistent with the proposal in this chapter (see especially Section 4.4 for the
distinction between true recursion and iteration, as per Kinsella 2009).
Also, Jordens (2002) argues that there is a stage in the acquisition of Dutch where
all constituents are attached by adjunction/parataxis, but where certain modal verbs
and negation serve as proto-functional categories. These proto-functional categories,
according to Jordens, are linking elements between the topic and the predicate (744),
certainly analyzable as proto-conjunctions/linkers in the sense of this chapter. In the
next, “finite-linking stage,” these linkers are grammaticalized into auxiliaries, which
now serve as heads of hierarchical structures (750). This progression of stages fits well
with the proposal of this chapter, showing transitions from the adjunction/parataxis
stage to the proto-coordination and the specific functional category stages.

4.6 Concluding remarks

This chapter has proposed the following progression of syntactic stages in the
evolution of human language:

0. One-word stage (non-syntactic stage)

(i) Paratactic proto-syntax stage (flat, non-hierarchical stage), where prosody/suprasegmentals
provide the only glue for (proto-)Merge (Conjoin)

(ii) Proto-coordination stage, where, in addition to prosody, the (proto-)conjunction/linker
provides all-purpose segmental glue to hold the utterance together. It is only at
this stage that Move and recursion become available

(iii) Specific functional category stage (hierarchical stage), where, in addition to
prosody and to segmental glue, specific functional categories also provide specialized
syntactic glue for constituent cohesion, including tense elements and subordinators/
complementizers.

The progression of stages along these lines is being proposed both for predication
(clause-internally) and for clause combination. It is shown that each new stage offers
clear and concrete communicative advantages over the previous stage(s), and more-
over advantages specific enough to be responsive to natural selection. Significantly, in
their modern incarnations, the constructions of the three stages also overlap a great
deal, which is expected under a gradualist evolutionary scenario.
In Chapters 2 and 3 I argued that the capacity for two-word paratactic grammars
evolved due to natural selection, including sexual selection. As the reviewers point
out, the question now arises whether the capacity for hierarchical syntax evolved
through biological selection as well, or whether it just developed through the gram-
maticalization processes, once the paratactic stage was in place. My hypothesis here is
that the capacity to use hierarchical grammars evolved through biological processes
as well, although I am certainly not claiming that each specific functional projection
had to evolve that way. This will be further discussed in Chapter 7.
The following chapter on Subjacency builds directly on the proposals in this
chapter to explain why adjuncts and conjuncts constitute islands for Move.

Islandhood (Subjacency) as an epiphenomenon of evolutionary tinkering

5.1 Introductory note

This chapter builds directly on Chapter 4 in that it proposes to treat islandhood/
Subjacency as an epiphenomenon of the progression through the three evolutionary
stages, as outlined in that chapter. In particular, the proposal of the previous chapter
is that adjunction/parataxis and proto-coordination stages of syntax do not show full
syntactic integration and thus do not allow movement or subordination. This
approach not only directly captures the islandhood of adjunction and coordination,
but it can also shed light on some other island effects. In this view, Subjacency or
islandhood can be seen as the default, primary state of language, due to an evolu-
tionary base of language which was without Move. This default state can be overrid-
den in certain evolutionarily novel, fancy constructions, arising in the hierarchical
(subordination) stage(s). The constructions that allow Move form a natural class, and
can be characterized syntactically, while the constructions that do not allow Move
(islands) do not form a natural class at all. My conclusion is that Subjacency is not a
principle of syntax, but rather an epiphenomenon of the evolutionary trajectory of
syntax.

5.2 What is islandhood/subjacency?

Move(ment) plays a central role in Minimalism (e.g. Chomsky 1995) and its prede-
cessors. So, for example, wh-question formation in English is considered to involve
movement of the wh-word or phrase from its thematic (underlying) position to the
left periphery of the sentence. The following examples illustrate this:
(1) What do penguins eat what?
(2) What does Peter think [cp penguins eat what]?

Evolutionary Syntax. First edition. Ljiljana Progovac

© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

(3) Who(m) did Peter walk with who(m)?
(4) Who(m) did you say [cp Peter walked with who(m)]?
In (1)–(2) it is assumed that the wh-word what originates after eat as a complement/
object of eat (cf. echo questions such as Penguins eat what?), and that it subsequently
moves to the front of the sentence, to the position of the specifier of CP. (The
strikeout notation is used here to represent the original, pre-Move, copy of the
wh-word.) Similar considerations hold of the wh-word who(m) in the examples
(3)–(4). It is important to note here that wh-movement conceived in this way can
cross clausal (CP) boundaries, as is the case in (2) and (4).
In his seminal work, Ross (1967) noted that there are many types of syntactic
islands, that is, constructions out of which it is not possible to apply Move.1 One such
island is coordination—as illustrated with the minimal pairs below, while it is
possible to move a wh-word out of a Prepositional Phrase (PP) (5, 7), it is not possible
to move a wh-word out of a conjunct (6, 8):
(5) What did Peter eat ham with what?
(6) *What did Peter eat ham and what?
(7) Who did Peter see Richard with who(m)?
(8) *Who did Peter see Richard and who(m)?
Notice that the echo versions below are grammatical, suggesting that the problem
lies with the movement itself, rather than with the semantics.
(9) Peter ate ham and what?
(10) Peter saw Richard and who(m)?
Movement is also prohibited out of adjunct clauses, which are also considered to be
islands:2
(11) ?*What did Peter retire [cp after Mary said what?]
[cf. echo question: Peter retired after Mary said what?]
Likewise, movement out of subjects (12) is less acceptable than movement out of
objects (13), and subjects are for that reason also regarded as islands:

1
“We say that a phrase is an ‘island’ if it is immune to the application of rules that relate its parts to a
position outside of the island” (Chomsky 1980: 194).
2
As pointed out by a reviewer, there are some apparent exceptions to this observation in certain
well-defined contexts, as reported in Borgonovo and Neeleman (2000: 199–200):
(i) What did John arrive whistling what?

(12) ??Who did [np your loyalty to who(m)] appeal to Mary?
[cf. echo question: Your loyalty to who(m) appealed to Mary?]
(13) Who(m) did Bill question [np your loyalty to who(m)]?
[cf. echo question: Bill questioned your loyalty to who(m)?]
The following examples introduce two additional islands: Wh-Islands, where
wh-extraction is prohibited out of another wh-clause (14), and Complex NP Islands,
where Move is prohibited out of a noun phrase which includes a clause, either a
nominal complement clause (15), or a relative clause (16):
(14) ?*Which book did you ask John [cp where Bill bought which book]?
(15) *What did Bill reject [np the accusation [cp that John stole what]]?
(16) *Which book did Bill visit [np the store [cp that had which book in
stock]]?
Interestingly, there are languages (e.g. Japanese and Chinese) which keep their
wh-phrases in situ (i.e., not moved), and it is still an open theoretical question how to
analyze wh-questions in these languages. One line of research considers that wh-words
in fact do undergo Move even in these languages, but covertly/invisibly so (e.g. Huang
1982). However, just as is the case with English echo questions (9–10), wh-words in situ
in these languages do not show island effects, at least not when in argument positions.
This prompted e.g. Huang (1982) to propose that Subjacency does not hold for covert
wh-movement. In contrast, Tsai (1994) and Hagstrom (1998) rejected the idea that wh-
words themselves move covertly, but instead proposed a different strategy of deriving
such wh-questions. According to the proposal in Fukui (1986), the lack of wh-movement
in Japanese can be correlated with the lack of CP in the language.
While it is beyond the scope of this book to engage with the issue of covert
movement, suffice it to say here that approaches which do not invoke such movement
of wh-phrases are fully compatible with the evolutionary approach I am
adopting here. This is so because these approaches identify a different strategy for
expressing wh-questions, a strategy which does not require a CP layer, or Move.3 The
approach explored here highlights the existence of multiple routes to the same goal.
One of the central goals of syntactic theory has been to determine what differen-
tiates constructions that allow Move from those that do not. Overwhelmingly, the
assumption among syntacticians is that islandhood, that is, restrictions on Move, is
the marked case, in need of explanation. This assumption has led to the expectation
that there is some (abstract) principle of syntax, such as Subjacency, which accounts

3
As proposed in Radford (1990), a similar kind of strategy is needed to capture wh-questions in
child English, prior to the emergence of CP.

for all or most of the island effects. Research has thus concentrated on characterizing
and defining the principles that are taken to constrain Move, including Subjacency.4
Almost fifty years after Ross’ dissertation, no real progress has been made on this
front—there is still no principled characterization of islandhood.5
Most accounts stipulate which syntactic nodes (S, NP, CP, DP, etc.), and/or which
combination of nodes, and/or nodes in which syntactic positions, constitute obstacles
to Move (barriers/bounding nodes/phases). The classic accounts are Huang (1982);
Lasnik and Saito (1984); and Chomsky (1986). To take one example, very roughly
speaking, one can account for the Complex NP constraint (15)–(16) by assuming that
the NP is an obstacle to Move, to use neutral terminology. But the NP proves an
obstacle only in conjunction with a clause, given that movement is otherwise possible
either out of a clause as in (2) and (4), or out of an NP as in (13). Very roughly
speaking again, one needs to assume that clauses and NPs are both obstacles, but that
the wh-phrase can jump over one obstacle (at a time), even though not over two. So
far, so good. But then this analysis does not really carry over to other islands. When it
comes to the Subject Island, how does one explain why movement out of the subject
NP is illicit, while movement out of a comparable object NP is licit? In both cases, the
wh-phrases seem to be crossing the same number of obstacles. According to Huang
(1982), this is because the subjects (and adjuncts) are not “properly governed,” while
objects are. In Chomsky’s (1986) version, this is because subjects (and adjuncts)
are not L-marked, while objects are. The appeal to either proper government or
L-marking only stipulates that objects/complements are special/privileged in this
respect, implicating the importance of the structural position, in addition to the
nature and number of nodes crossed. But there is now no real unification of the
Complex NP Island, on the one hand, and subject or adjunct islands, on the other.6

4
Some more recent accounts (e.g. Boeckx 2008) adopt a pluralistic view of islandhood, that is, a view
that islandhood is a result of the application of various principles, not just one unified principle such as
Subjacency. Under this view, a unification of all islandhood is not pursued or expected. In fact, Boeckx
considers that the result of each Merge is an island, although typically not an absolute island. For him,
islandhood results if too much checking affects a single item. If features to be checked can be distributed
over more than one item, such as may be the case with movement leaving a resumptive pronoun, then
islandhood is voided or weakened (208). In other words, the islands are relativized to the amount of
checking relations established and their configurations. Boeckx (2008) does acknowledge, however, that
adjoined structures “have a freezing effect” on movement (233), as well as that the islandhood of
coordination is not captured by his, or any other syntactic theory (237).
5
This is not meant, in any way, to denigrate the quality of research done within this approach. For even
when one follows an ill-fated hypothesis, one gathers invaluable data and insights along the way. But
however fine and ingenious this research may have been otherwise, and however great its contributions, in
my view, it has not yielded progress on this particular front, that is, it has not provided a principled account
of islandhood, suggesting that a different angle is needed.
6
And this is looking at islandhood in only one language: English. There is variation in this respect
across languages, too (see e.g. Sprouse and Hornstein 2014: 4). To take just one example, Italian does not
seem to show wh-island effects. To account for this, Rizzi (1982) proposed that in Italian the obstacles for
movement are NPs and CPs, as opposed to NPs and IPs (i.e. TPs) in English. Also, as mentioned above, in

And the problems multiply as one considers additional islands, such as coordination
(see e.g. Postal 1997, 1998).7
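As a rough illustration of the obstacle-counting logic described above, the following Python sketch counts how many obstacle nodes a wh-phrase would cross and predicts Move to be licit only if it crosses at most one. It is not drawn from any of the accounts cited: the function names, the use of the label CP to stand in for "clause," and the one-obstacle limit are simplifying assumptions made for this illustration, and the paths simply follow the bracketing shown in examples (2), (12), (13), and (15).

```python
# Illustrative sketch only: the labels and the one-obstacle limit are
# assumptions standing in for the "very rough" account in the text.
OBSTACLES = {"CP", "NP"}  # hypothetical obstacle (bounding) nodes


def obstacles_crossed(path):
    """Count the obstacle nodes on the path from the gap to the landing site.

    `path` lists the labels of the bracketed constituents that enclose the
    gap, innermost first.
    """
    return sum(1 for label in path if label in OBSTACLES)


def predicted_licit(path, limit=1):
    """Predict Move to be licit if at most `limit` obstacles are crossed."""
    return obstacles_crossed(path) <= limit


# Paths follow the bracketing given in examples (2), (12), (13), and (15).
EXAMPLES = {
    "(2) out of a clause": ["CP"],
    "(13) out of an object NP": ["NP"],
    "(15) out of a complex NP": ["CP", "NP"],
    "(12) out of a subject NP": ["NP"],
}

if __name__ == "__main__":
    for name, path in EXAMPLES.items():
        verdict = "licit" if predicted_licit(path) else "illicit"
        print(f"{name}: predicted {verdict}")
```

On these assumptions the sketch separates (15) from (2) and (13), but it returns the same verdict for the subject NP in (12) as for the object NP in (13). Counting obstacles alone therefore cannot derive the Subject Island, which is why the accounts discussed above add a further stipulation about structural position.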
Within the Minimalist Program, in which proper government and L-marking of
the previous frameworks are not available as theoretical postulates, Chomsky (2001,
2008) attempts to capture some of the island effects by invoking new Minimalist
constructs, phases (impenetrable domains), again stipulating that CPs and DPs
(former NPs) are phases. As Boeckx and Grohmann (2007: 216) observe, these
most recent phase-based approaches to islandhood do not improve upon the previ-
ous approaches, and “phases are in many ways reincarnations of bounding nodes and
barriers.” Belletti and Rizzi (2000) report an interview with Chomsky, in which he
concludes that “there is no really principled account of many island conditions.”

5.3 Why there is no principled account of islandhood

The persistent view of islandhood/Subjacency (in Minimalism and predecessors)
considers Move to be the default option, while Subjacency (and other restrictions on
Move) are treated as a marked option, in need of explanation (Ross 1967; Huang 1982;
Lasnik and Saito 1984; Chomsky 1986, 2001; Stepanov 2007; Sprouse and Hornstein
2014). To be more accurate, Move in Minimalism is never completely free, as it is
taken to apply only if motivated by a need to check certain (strong uninterpretable)
features. But once such features are present in the derivation, it is considered that
Move applies freely, in the sense that it applies unless blocked by some specific
principle like Subjacency.
Significantly, this view fuels the influential language evolution hypothesis, accord-
ing to which Merge (which subsumes Move) was the only evolutionary breakthrough
for syntax: once it emerged, it was able to apply freely and recursively, automatically
yielding Move and subordination (Berwick 1998; Chomsky 2005; Berwick and
Chomsky 2011). In an early attempt to reconcile this view with a gradualist approach
to syntax, Newmeyer (1991) proposes that a grammar with Subjacency was specific-
ally targeted by natural/sexual selection, over a previous stage of grammar, which
presumably had no Subjacency. This implies that this previous stage was character-
ized by a much freer Move, and that the ungrammatical examples discussed in
Section 5.2 would have been grammatical in this stage. However, Lightfoot (1991: 69)

some languages wh-phrases do not show overt movement at all, and this introduces further complications
for the characterization of Subjacency.
7
In fact, coordination and adjunction seem to be the most difﬁcult islands to capture. For example,
Napoli (1993: 401, 409) notes that “while Subjacency accounts for the Complex NP Constraint, [ . . . ] the
Subject Condition, and the wh-islands, it cannot account for the ungrammaticality of movement out of
coordinate structures and out of adverbial clauses” (see also Footnote 4). The islandhood of coordination
and adjunction is the central focus of this chapter, and it is proposed here that it follows from a looser
integration of adjuncts and conjuncts into the fabric of syntactic structure (see Chapter 4).
OUP CORRECTED PROOF – FINAL, 8/5/2015, SPi

136 Islandhood (Subjacency) as an epiphenomenon of evolutionary tinkering

counters that “Subjacency has many virtues, but [ . . . ] it could not have increased the
chances of having fruitful sex.” In other words, it is not clear how or why a grammar
with Subjacency would have been naturally/sexually selected over a grammar without
Subjacency.
It is exactly based on these considerations that Berwick (1998: 338–9) concluded
that “there is no possibility of an ‘intermediate’ syntax between a non-combinatorial
one and full natural language—one either has Merge in all its generative glory, or one
has no combinatorial syntax at all” (see also Bickerton 1990, 1998, 2007; Berwick and
Chomsky 2011; Chapter 1). This reasoning, which is reminiscent of the old saw
“what use is half an eye,” has led some syntacticians to believe that syntax is an
all-or-nothing package, which could not have evolved gradually, and which must
have been, in its entirety, a product of one single sudden event, possibly one single
mutation, which Berwick and Chomsky (2011: 29) characterize as “minor.”
But there is no need for this drastic conclusion. In fact, there is an alternative
possibility to consider regarding Subjacency (mentioned in e.g. Cinque 1978; Postal
1997; Boeckx and Grohmann 2007; Progovac 2009b), that islandhood is the default
state of syntax. Given this view, permitting Move would be a special/marked option. In
fact, the constructions that prohibit Move are much more numerous and diverse than
those that allow it. Consider, again, the list of constructions which constitute islands
(for a long inventory of additional island constructions, see e.g. Postal 1997, 1998):

Subject Islands
(17) ??Who did [NP your loyalty to who] appeal to Mary?
Wh-Islands
(18) ?*Which book did you ask John [CP where Bill bought which book]?
Complex NP Islands
(19) *What did Bill reject [NP the accusation [CP that John stole what]]?
(20) *Which book did Bill visit [NP the store [CP that had which book in stock]]?
Adjunct Islands
(21) ?*What did Peter retire [CP after Mary said what]?
Conjunct Islands
(22) *What did Peter retire and [CP Mary said what]?
Typically, Move is possible only out of (a subset of) complements/objects, for
example, verbal (non-wh-)complements, whether clausal (23) or nominal (24):
(23) Which book did you tell John [CP that Bill bought which book]?
(24) Who did Bill question [NP your loyalty to who]?
What this means is that constructions which disallow Move (islands) do not form a
natural class, while those that allow Move seem to. If so, then any attempt to
characterize islandhood/Subjacency in unified terms is doomed to fail. On the
other hand, it should be possible to formulate a general characterization of non-
island constituents, as pointed out in Postal (1997). For example, in the case of
(23–24), Move proceeds through the hierarchy of projections where each new layer
c-commands the previous one, and where there are no adjunct or conjunct clause
boundaries on the way. Recall from Chapter 4 that c-command does not extend
seamlessly into adjuncts or conjuncts, and given that movement has to proceed to a
c-commanding position, any boundary that is not strictly hierarchical, subject to an
unbroken chain of c-command, can trip up Move.
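
The reasoning in the preceding paragraph can be made concrete with a small toy model. The following Python sketch is only an illustration under simplifying assumptions, not part of the analysis itself: the `Node` class, the attachment labels (`complement`, `adjunct`, `conjunct`), and the function names are all hypothetical. It checks just one necessary condition, namely that the path from the gap up to the root crosses no adjunct or conjunct boundary. It does not attempt to capture wh-islands or complex NP islands, which involve complements.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: these attachment types interrupt the chain of c-command.
OPAQUE_ATTACHMENTS = {"adjunct", "conjunct"}


@dataclass(eq=False)
class Node:
    """A constituent, labeled with how it attaches to its parent."""
    label: str
    attachment: str = "root"
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, label: str, attachment: str) -> "Node":
        """Attach a new child constituent and return it."""
        child = Node(label, attachment, parent=self)
        self.children.append(child)
        return child


def first_opaque_boundary(gap: Node) -> Optional[Node]:
    """Return the lowest adjunct/conjunct containing the gap, or None."""
    node = gap.parent
    while node is not None:
        if node.attachment in OPAQUE_ATTACHMENTS:
            return node
        node = node.parent
    return None


def path_is_hierarchical(gap: Node) -> bool:
    """True if no adjunct or conjunct boundary lies above the gap."""
    return first_opaque_boundary(gap) is None


# Simplified stand-ins for examples (23), (21), and (22).
clause_23 = Node("matrix clause")
gap_23 = clause_23.add("CP that Bill bought", "complement").add("which book", "complement")

clause_21 = Node("matrix clause")
gap_21 = clause_21.add("CP after Mary said", "adjunct").add("what", "complement")

clause_22 = Node("matrix clause")
gap_22 = clause_22.add("CP Mary said", "conjunct").add("what", "complement")

for name, gap in [("(23)", gap_23), ("(21)", gap_21), ("(22)", gap_22)]:
    print(name, path_is_hierarchical(gap))
```

Under these assumptions the sketch reports `True` for the complement clause in (23) and `False` for the adjunct in (21) and the conjunct in (22). A `True` result should be read only as "not ruled out by this condition," since hierarchy is taken here to be necessary, not sufficient, for Move.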
Furthermore, there are additional cases where Move is illicit, and I list them here to
anticipate the discussion in subsequent sections. For example, Move does not occur
across sentential boundaries, as is well-known, but not discussed in the context of
Subjacency:
(25) *Who did Mary see the movie. It featured who?
The idea is that the principles of syntax do not extend across sentence boundaries,
but it is worth noting here that some sentence-internal boundaries, such as parataxis,
resemble sentential boundaries in this and other respects.
Move is also prohibited from paratactically (loosely) attached parallel small clauses
(26), as well as from small clauses adjoined to finite clauses (27), the latter example,
but not the former, subsumable under Adjunct Islandhood:8
(26) a. *What nothing ventured, what gained?
(cf. Nothing ventured, nothing gained.)
b. *How easy come, how go?
(cf. Easy come, easy go.)
c. *Who monkey see, who do?
(cf. Monkey see, monkey do.)
(27) *Where can her having retired from where, we finally relax?
(cf. Her having retired from where, we can finally relax.)

8
As pointed out by the reviewers, these examples do not seem to allow even echo questions:
(i) ??Nothing ventured, what gained?
(ii) *Easy come, how go?

Both types of examples above feature a paratactic boundary across which the
wh-phrase would have to Move. If the paratactic glue is mainly intonational/prosodic
(Chapter 4), then the paratactic boundary is not unlike a sentence boundary. In
addition, if these are just small clauses, then they are not provided with the functional
categories and projections, such as CP, that would provide the landing sites for
wh-Move. It is typically considered that wh-movement targets the specifier position
of CP, and that if this position is not there, or is filled with some other material,
wh-movement cannot take place (see also Section 4.4.5).
The same considerations hold for single root small clauses in (28) below, discussed in
Chapters 2 and 3. If these clauses are just bare argument-predicate concatenations, then
they also lack the relevant syntactic space for Move to take place, such as TP or CP.
(28) *When problem solved when? (cf. Problem solved.)
*Who(m) worry? (cf. Jeanne worry?!)
With these additional examples, it becomes even clearer that constructions that
prohibit Move (islands) have no syntactic property in common, that is, that these
constructions do not form a natural class. It is thus not surprising that in spite of all
the effort, to date, there has been no principled analysis of islandhood/Subjacency, as
pointed out in Section 5.1 (see also Belletti and Rizzi 2000; Szabolcsi and den Dikken
2003; Boeckx and Grohmann 2007).
As mentioned in Footnote 4, yet another angle is possible, namely, to adopt a
pluralistic view in which islandhood is a result of several independent principles that
constrain Move (see e.g. Boeckx 2008). In addition to not being able to capture the
islandhood of coordination and adjunction, the central topics of this chapter, this
view is also not able to account for the generalization that non-islands seem to form a
natural class. Even though the correlation is not perfect, it still holds that if a
constituent is not a complement, then it is highly likely to be an island.9
For all these reasons, it would be prudent to explore an alternative track, an
approach that takes islandhood to be the default state of syntax, and Move a special
option, available only in certain privileged constructions. In this view, the question is
no longer why Move is impossible out of islands, but rather why Move is possible out
of certain complements, and indeed why Move is possible at all. But, first, before one
can pursue that question, it is important to establish the reason why No Move would
be the default state of syntax. The next section addresses that question.

9
There are many subtleties regarding islandhood, including distinguishing weak from strong islands,
which my approach does not address. I hope that future research will address this question within an
evolutionary framework, especially given that an evolutionary approach is well-equipped to deal with
graded grammaticality. In this respect, one would need to consider the three rough stages explored in this
monograph: Adjunction/Parataxis, Coordination, and Subordination, as just three idealized points in the
evolution of syntax, with a variety of transitional sub-stages certainly a possibility, as discussed in
Chapter 4. To the extent that the structures can be more or less syntactically integrated, the graded
judgments would then reflect the extent of that integration, which can vary not only across constructions,
but also across languages.

5.4 Subjacency in the light of evolution

My proposal is that proto-syntax, characterized by one-word utterances, (root) small
clauses (29), and paratactic combinations of such small clauses (30), did not have Move
or subordination (Progovac 2008a,b, 2009b, 2013b, 2014a), as discussed in Chapters 2–4.10
(29) a. Case closed. Problem solved. Point taken. Crisis averted.
Me first! Everybody out! Him apologize?!
(30) a. Nothing ventured, nothing gained.
b. Easy come, easy go.
c. Monkey see, monkey do.
d. Come one, come all.
The examples above are arguably present-day approximations of this stage of syntax.
Even though these present-day incarnations must be a bit more complex than the
proto-structures, they still do not allow Move (26–28), as established in the previous
sections, as well as in the previous chapters.
In this proposal, the kind of syntax illustrated in (29)–(30) was primary and
foundational, while Move was an evolutionary innovation. In agreement with
Newmeyer (1991), this proposal advocates a gradualist approach to the evolution
of syntax; however, recall that Newmeyer proposed that the previous stage(s) of
grammar had no restrictions on Move, and that Subjacency was an innovation
(Section 5.2).11 In my proposal, in contrast, the proto-stages of grammar were
characterized by islandhood, or lack of Move, with Move emerging only later, in
conjunction with layered, hierarchical syntax (Section 4.4.5), supported by special-
ized functional categories and projections, such as TP and CP. In Minimalism Move
is typically associated with functional projections. For example, Move of the under-
lying small clause subject targets the specifier of a TP (Chapter 2; Section 4.4.5), while
wh-movement targets the specifier of a CP, as illustrated in this chapter.12 In a small
clause based grammar which has neither TPs nor CPs, one does not expect to
encounter this type of Move, especially considering that Move has to target a
hierarchically higher position, as explained in Section 4.4.5.

10
As pointed out by Boeckx (2008), syntactic theories of Subjacency, and locality in general, should be
compatible with findings in neuroscience and evolutionary biology: “Up to now, compatibility with neuroscience
and evolutionary biology has been a rather weak constraint on theory construction in linguistics”
(Boeckx 2008: 4).
11
A similar idea can be found in, for example, Boeckx’s (2008: 244) statement that bounding nodes are
solutions that the language faculty has developed to ensure that syntactic objects are unambiguous.
12
A reviewer wonders if all Move operations target functional projections. I would say here that at least
the uncontroversial cases of Move do involve functional projections, such as subject raising to TP,
wh-raising to CP, V movement to v, etc. In fact, even adjunction of adverbials is sometimes claimed to
target only functional projections (see e.g. Adger 2003).

Going back to islands, we can now envision an answer to the question of why some
constructions still disallow Move (e.g. coordination and adjunction), while others
facilitate it (e.g. subordination). My claim is that our grammars, courtesy of gradual
evolutionary development, show a range of constructions that fall between the
two opposites: (i) completely independent utterances/sentences and (ii) syntactically
fully integrated expressions. The intermediate possibility is to be loosely attached
(semi-integrated) into sentential fabric, and this is arguably the case with parataxis/
adjunction and conjunction.13 Only the most integrated of constructions (e.g. com-
plements), which build a ladder, a scaffolding of functional projections, allow Move
to climb along this ladder.14 The metaphor of climbing is appropriate here given
that syntactic theory assumes that movement is always to a structurally higher
(c-commanding) position. Clausal conjuncts and adjuncts have been repeatedly
noted in the literature not to be fully integrated into syntactic fabric, as discussed
in Chapter 4. An evolutionary approach can shed novel light on these phenomena.
This evolutionary account also helps explain why human grammars should avail
themselves of redundant means for expressing clause combinations, and moreover
such “imperfect” means, as are coordination and adjunction. Recall from Chapter 4
how clauses are combined in the postulated three rough stages: parataxis (adjunc-
tion) (31), coordination (32), and subordination (33):
(31) He is a linguist—(as) you know. Parataxis
(32) He is a linguist, and you know it. Coordination
(33) You know that he is a linguist. Subordination
If comparable stages characterized language evolution, with adjunction and coord-
ination constituting intermediate steps between separate utterances (no syntactic
integration, no Move) and subordination (full integration, free(er) Move), then such
evolutionary “tinkering” left us with multiple possibilities which partly overlap in
function, that is, with redundant means for expressing similar meanings (31)–(33).15

13
Even though I will not discuss subject islands in this book, it is worth noting that syntactic theory
recognizes that subjects/specifiers are less tightly integrated than objects/complements. While objects/
complements are merged directly with the verbs (First Merge), subjects/specifiers are typically introduced
as sisters to intermediate projections (Second Merge). In addition, subjects typically undergo local Move
out of verbal projections, further contributing to their syntactic instability.
14
This is not to say that subordination was necessarily one big solid monolithic stage—as pointed out
repeatedly in this chapter, as well as in Chapter 4, sub-stages and transitions may well have existed, and
may account for a number of present-day constructions which are ambivalent and difficult to categorize.
15
My claim here is not that a hierarchical stage automatically licenses Move. I am only saying that
hierarchy is a necessary condition for Move, not a sufficient one. This is not surprising given that Move is
typically assumed to target a c-commanding position, that is, a structurally higher position. Other
conditions clearly need to be met to allow Move, including the existence of the appropriate and available
landing sites for Move (e.g. CP for wh-movement). Given this, the fact that not all subordinate constructions
allow Move, but only a subset of them do, is not directly a problem for my analysis. The analysis
proposed here poses a different question than the traditional analyses: the question here is not what
non-complements and complement islands have in common, the question pursued by Subjacency accounts, but
rather how complement islands differ from complement non-islands. Exploring this question further may
give new insights into the nature of Move, and language in general.

As pointed out in Chapter 1, evolution is taken not to throw a good thing away, but
to build upon it, or to add to it. So, if adjunction and conjunction proved to be useful
syntactic mechanisms in a proto-syntactic stage, the later stages did not have to
discard them, but could continue to use them in specialized functions. This is also
what happens in present times with grammaticalization of subordination, as well as
with grammaticalization processes in general (see e.g. Heine and Kuteva 2007).
Overlap and (partial) specialization are properties of evolutionary tinkering, rather
than of optimal design.
Grammaticalization is relevant for my approach because it shows that this type of
change is in principle possible (see also Fitch 2010). When processes of grammaticalization
happened for the first time, they would have driven biological selection
toward developing brains that can support the processing of such abstract categories
and their projections. Once the processing mechanisms evolved to a certain point,
then grammaticalization processes could, in principle, operate without biological
evolution. However, as discussed in much more detail in Chapter 7, there is no
guarantee that any of these processes will not, for some reason and in certain
circumstances, trigger genetic selection.
As pointed out in Chapter 4, there are concrete and tangible advantages to each
postulated stage of syntax. The conjunction stage has an advantage over the adjunction
stage in that it provides more robust evidence for proto-Merge, by including
the segmental glue. In addition to facilitating Move, the hierarchical, subordination
stage also provides a recursive mechanism for embedding multiple viewpoints one
within another, as discussed in detail in Chapter 4. Thus, if subordination (as well as
Move) is an innovation resulting from evolutionary tinkering, then subordination
would have significantly increased the expressive power of language, in a concrete
manner, and thus, unlike Subjacency, constitutes a plausible target for natural/sexual
selection.
In this evolutionary perspective, rather than a system designed from scratch in an
optimal way, syntax is seen as a patchwork of structures incorporating various stages
of its evolution, giving an impression, or an illusion, of Subjacency. It follows from
this approach that Subjacency is not a principle of syntax, or a principle of any kind,
but rather just an epiphenomenon. Subjacency or islandhood can be seen as the
default, primary state of language, due to an evolutionary base of language which was
without Move. This default state can be overridden in certain evolutionarily novel
constructions, such as subordination.

5.5 Conclusion
This chapter has pointed out that syntactic islands do not form a natural class, but
that non-islands do, and that, for this reason, there can never be a principled, unified
account of islandhood/Subjacency. My proposal is that Subjacency is not a specific
principle of syntax, but rather the default state of syntax, dating back in time to the
evolutionary beginnings of language, in which Move, and functional projections that
facilitate Move, were simply unavailable. I have hypothesized that two initial stages in
the evolution of syntax do not exhibit Move: the adjunction/parataxis stage, and the
coordination stage. In this analysis, Move and subordination are later innovations,
made possible by the emergence of specialized functional categories and their
projections, such as TP and CP. Present-day sentences can still include various fossil
constructs lacking Move, specifically adjuncts and conjuncts, which are then seen as
islands.
My proposal reverses the direction of syntactic evolution hypothesized in
Newmeyer (1991), who also explores a gradualist approach to syntax. While New-
meyer assumes that the initial stages of syntax were characterized by Move free of
Subjacency, I propose exactly the opposite, that islandhood (or the state with no
Move) was the norm in the previous stages, and that Move was an innovation. This
reversal allows me to kill three birds with one stone. First, it provides some rationale
for characterizing islandhood/Subjacency as the default state of grammar, rather than
as a constraint on grammars. Second, this allows me to explain why various fossilized
expressions (arguably “living fossils” of this proto-syntax stage) cannot be manipu-
lated by Move.
Third, and most importantly, this allows me to address the question of how or why
the progression took place from the proto-syntactic stages with no Move and no
subordination, to the stage(s) with Move and subordination. Instead of targeting the
abstract and obscure Subjacency by natural selection, as per Newmeyer’s (1991)
proposal, my proposal targets the emergence of subordination (Move emerging in
conjunction with it). In comparison to its more primary counterparts (adjunction
and coordination), subordination provides a clear and concrete advantage in the
expressive power of language. One such advantage is that subordination affords the
possibility to recursively and unambiguously embed/nest multiple viewpoints one
within another.
This chapter offers a hypothesis which is consistent with a lot of descriptive data,
with how grammaticalization processes work, as well as with many studies in
language acquisition and processing, as discussed in Chapter 4. Finally, an important
advantage of this proposal is that it does not force us into the conclusion that syntax
is all or nothing, and that the evolution of syntax as a whole had to have been a
sudden and passive event, passive in the sense that its evolution was parasitic on
some other event. For example, Gould (1987) and Chomsky (1988) have proposed
that syntax can just be a consequence of an increase in the size of the brain, or of
some general laws of growth. The approach explored here leaves open the possibility
that syntax played an active role in shaping human brains. Another important
advantage of this approach is that it reveals how the incremental nature of the
evolution of syntax can actually shed light on the very properties of its design.

Exocentric VN compounds: The best fossils

6.1 Introduction
This chapter looks at a host of surprising properties of VN compounds, such as pick-
pocket, turn-coat, spoil-sport, cry-baby, across a variety of languages, focusing pri-
marily on those found in English and Serbian. My argument is that the grammar
behind these exocentric compounds is a survivor (“living fossil”) of an early stage of
syntax in language evolution, and that by looking at their structure we can get a good
glimpse into the workings of proto-syntax. Jackendoff (1999, 2002) has proposed that
the evolution of syntax might have preserved “fossils” of previous stages in its later
stages (see also Bickerton 1990), mentioning in particular compounds (e.g. snowman)
as one such living fossil (see Section 1.6).
I have argued that specifically exocentric VN compounds constitute the most
plausible candidate for a syntactic fossil featuring a verb (Progovac 2009a, 2012).
When it comes to their structure (or the lack thereof), my argument is that VN
compounds, at least in English and Serbian, are a clear product of the paratactic
proto-grammar, as introduced in Chapters 2–4. These compounds are best analyzed
as involving a single application of (proto-)Merge/Conjoin (of Chapter 4), to exactly
two words, a verb and a noun, where the noun stands as the verb’s only (proto-)
argument. The thematic (theta) role of this noun, even though typically theme
(object-like), can be shown to be largely underdetermined, in fact absolutive-like,
corroborating the proposal that these compounds are flat, paratactic structures,
rather than hierarchical structures equipped with null projections and null argu-
ments (Section 6.2). I will argue that the relationship between the verb and the noun
in these compounds is that of proto-predication (see e.g. Gil 2012), a precursor to true
predication (for the notion of a proto-role, see Section 3.4.2; also discussion below).
Section 6.3 compares VN compounds with their more complex hierarchical coun-
terparts, bringing to light the sharp differences between them, but also continuity in
the sense that the former provide scaffolding for building the latter. Consistent with
the theme of this monograph, the structure of VN compounds integrates into the
structure of their more complex hierarchical counterparts.

Evolutionary Syntax. First edition. Ljiljana Progovac

© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

It will also be shown that the verb in these compounds surfaces in what at least
synchronically appears to be the imperative form, the kind of imperative that is also
found in other (frozen) expressions. This is unmistakably the case with Serbian VN
compounds (Section 6.4), but the same has also been proposed for VN compounds in
other languages, including other Slavic and Romance languages (Section 6.5). As will
be shown, VN compounds across languages, not only Indo-European (IE), but also
non-IE, exhibit striking parallelisms both in form and in imagery (Section 6.5).
Exocentric VN compounds specialize for derogatory reference when they refer to
humans, providing a good glimpse into how comparable expressions might have
played a role in (ancient) ritual insults, which is why these fossils are of significance
for sexual selection considerations (Progovac and Locke 2009), as addressed in
Section 6.6, as well as in Chapter 7. As pointed out in Chapter 1, the present-day
compounds, as well as all the other fossils discussed in this book, are only to be seen
as approximations of the structures once used by our ancestors. Some corroborating
evidence and testing grounds for the proposal in this chapter come from language
acquisition studies and language representation in the brain (Section 6.7). To the
extent that their structure and use can best be understood in an evolutionary
framework, these compounds constitute an argument for the gradualist approach
to the evolution of syntax, for the same reason that finding fossils elsewhere would.

6.2 Paratactic grammar behind VN compounds

This section focuses on the type of (proto-)Merge that characterizes VN compounds
and concludes that the grammar behind these compounds, including their absolu-
tive-like and exocentric nature, begins to make sense only if seen as a fossil of the very
beginnings of syntax, capable of producing only flat, paratactic structures.1 This
would be the hypothesized paratactic proto-syntax stage of Chapters 2–4. In VN
compounds, the noun is the verb’s only (proto-)argument, which is absolutive-like in
nature, as established in Section 6.2.1. The exocentric nature of VN compounds is
addressed in Section 6.2.2, where it is shown that absolutivity and exocentricity in this
case are just two sides of the same coin.

1
See Jackendoff (1999, 2002) for the proposal that adjunction/parataxis in general is a protosyntactic
fossil; see Chapter 4 of this volume for an extensive discussion of the reach of parataxis in proto-grammars
and modern grammars.

6.2.1 Absolutive-like proto-predication

As proposed in Chapter 3, the simplest possible grammar involving predication is a
flat, intransitive, absolutive-like grammar, the kind which licenses only one argument
per predicate, and which blurs the distinction between subjects and objects. If we
consider all the data involving VN compounds, it becomes obvious that the grammar
behind VN compounds is also absolutive-like in this sense. The noun in VN
compounds is not always the object or theme of the verb, as is typically assumed in
the literature. Consider the following data from English first:
(1) pick-pocket, scare-crow, turn-coat, dare-devil, hunch-back,
wag-tail, tattle-tale, kill-joy, cut-purse, spoil-sport, saw-bones,
Shake-speare, Burn-house, Drink-water, Bere-water, Drynk-pany
(drink-penny, miser), Pinch-penny (miser)
(2) rattle-snake, catch-phrase, cry-baby, stink-bug, worry-wart, copy-
cat, tumble-weed, scape-goat, turn-table
While the nouns in the compounds in (1) are object-like, the nouns in (2) are subject-like,
occasionally involving agents (e.g. cry-baby, copy-cat).2
The situation is the same in Serbian. While the composing nouns in the com-
pounds in (3) are object-like, those in (4) are subject-like. Even though the data in
(4a) can be considered as unaccusative, with the nouns analyzable as themes, those in
(4b) involve agents. It follows that it is not possible to describe these compounds as
uniformly involving objects, or even themes, and that the only unified description is
the one that invokes absolutive-type roles.
(3) cepi-dlaka [split-hair = hairsplitter]
deri-koža [rip-skin = person who rips you off]
ispi-čutura [empty-flask = drunkard]
kljuj-drvo [archaic: peck-wood = wood-pecker]
kosi-noga [skew-leg = person who limps]
muti-voda [muddy-water = one who muddies waters]
(4) a. duri-baba3 [sulk-old.woman = who sulks like an old woman]
kaži-prst [show-finger = index finger]
smrdi-buba [stink-bug = bug species that stinks; person who stinks]
smrdi-vrana [stink-crow = a species of crow]
tresi-baba [shake-old.woman = who shakes/scares like an old woman]
visi-baba [hang-old.woman = flower: snowdrop]
b. plači-drug [cry-friend = who commiserates with you]
plači-baba [cry-old.woman = cry-baby]
striži-buba [grate-bug = an insect which pecks trees]
tuži-baba [complain-old.woman = who complains like a woman]
trči-laža [run-lie = one who spreads lies]

2
The terms object-like and subject-like are used here in the sense that the noun arguments would
surface as objects or subjects, respectively, in a corresponding sentence. While the sentences A snake rattles,
or A baby cries, involve these nouns as subjects, the sentences He picks pockets or He kills joy involve these
nouns as objects. See Chapter 3 for various additional constructions across languages which do not clearly
distinguish between subjects and objects.
A reviewer points out that some VN compounds feature nouns that are not clearly either subject-like or
object-like, such as scatter-brain and jump-rope. This may still be consistent with the proto-role
characterization of proto-predication (Section 3.4.2). The reviewer also brings up compounds created by merging
prepositions and verbs, such as input, hand-out, follow-through. If Heine and Kuteva’s (2007) reconstruction
is correct, then the category of prepositions was a later evolutionary development, not characteristic of
the earliest proto-syntax stages.
3
As pointed out in Mihajlović (1992), baba is a difficult piece to translate since it involves layers of
meaning, including “woman,” “old woman,” and “witch.” In fact, many of these compounds are impossible
to translate accurately, given that they preserve older uses and meanings of morphemes, no longer
accessible to native speakers.

Recall from Chapter 3 how absolutivity works in e.g. Tongan (Tchekhoff 1979: 409).
When only one argument is present, an absolutive argument, it can be either subject-
like or object-like, resulting in vagueness:
(5) ‘oku kai ‘ae iká.
PRES eat the.ABS fish
“The fish eats.”
“The fish is eaten.”
The lack of theta role specification on the noun is also noticeable with compounds
such as turn-table and turn-coat, showing that, with one and the same verb, the noun
can be either subject-like or object-like (see Section 3.3.2). As discussed in Section 6.3,
hierarchical compounds, such as table-turner, show a precise thematic role assign-
ment, and the noun in such compounds can only be interpreted as object-like. In
other words, unlike with VN compounds, with -er compounds it is the grammar that
dictates that the noun must be interpreted as object-like, given that the -er piece is
associated with the role of agent.
The most expressive of VN compounds can in fact be doubly interpreted, allowing
the noun to play the role of the agent and theme at the same time, providing a strong
argument for the proto-linguistic character of these compounds. For example,
English dare-devil is the one who dares the devil, and can also be the one who is a
devil that dares.4 In Serbian pali-drvce [ignite-stick, matches], drvce is interpreted
as both a theme and an agent (the stick is both ignited and igniting). Since both

4
According to e.g. the Online Etymology Dictionary, dare-devil consists of the verb dare and the noun
devil, and “the devil might refer to the person, or the sense might be ‘one who dares the devil (compare
scare-crow, pick-pocket, cut-throat).’ ” Interestingly, some native speakers believe that only the former
interpretation is behind this compound, while others believe that the latter interpretation is there; there is
disagreement even among the reviewers of this book.

interpretations are available at the same time, this has to be a matter of vagueness/
underspecification, rather than ambiguity (Progovac and Locke 2009; Progovac
2012). While vagueness is typically associated with paucity of structure, ambiguity
is typically ascribed to distinct structural possibilities (see e.g. Kempson 1977 for the
distinction).
My proposal is that an absolutive-like grammar underlies VN compounds, and
that all the compounds illustrated above (1–4) involve the same kind of composition.
It would be an error to treat (1) and (3) distinctly from (2) and (4). A unified
(absolutive-like) analysis of VN compounds would immediately capture their iden-
tical morphological make-up, including the imperative morphology in Serbian
(Section 6.4), as well as their shared (derogatory) semantics (Section 6.6).5
As shown in Section 6.4, all the VN compounds in Serbian, (3) and (4) types alike,
feature exactly the same morpho-syntactic frame, complete with an imperative form
of the verb, calling for a unified analysis. While in English the form of the verb is
unmarked, the similarity in structure and interpretation between e.g. English rattle-
snake and Serbian tresi-baba; English worry-wart and Serbian duri-baba, strongly
suggests that the English VN compounds also form a unified class. Section 6.2.2 on
exocentricity provides further arguments for this unified analysis. In fact, the argu-
ments for the absolutive nature and for the exocentric nature of these compounds are
inextricably linked, and these compounds can only be understood if both of these
crucial properties are considered together, as they are two sides of the same coin.
In the spirit of Downing (1977), Gil (2005) suggests that root NN compounds (e.g.
toothbrush, snowman), as well as some other constructions in various languages,
involve an association operator semantically. On the other hand, the semantics of
VN compounds involve (a bit) more than just association; they involve a participant in
the event, and thus a relationship which can be characterized as a precursor to
predication, i.e., as proto-predication. In this respect, Gil (2012) has proposed that
predication is a composite emergent entity, rather than a primitive, and that it brings
together both thematic role assignment and headedness. In this light, VN compounds
exhibit a rudimentary thematic role assignment, involving just one participant, but
with no further theta-role specification, and with no headedness or hierarchy.6

5
While Carstairs-McCarthy (1992: 118) claims that the semantic relation between the noun and the verb
is free in VN compounds, and may include an internal argument (but need not), Ackema (1998: 128), based
on Dutch, claims that there are two types of VN compounds, depending on whether the noun is a
complement or not. The considerations in this chapter strongly support the former view, i.e. a unified
analysis of VN compounds.
6
As pointed out in Section 3.4.2, Dowty (1991) proposes that theta roles are not discrete, but can instead
be seen as prototypes, including proto-agent and proto-theme roles. The participant role I am using here
can be seen as the ultimate proto-role.
Proto-predication does not assume valence in the modern sense of the term, that is, it does not assume
that the verbs in this stage necessarily require a certain number of arguments, the assumption which is also
necessary to make for the one-word stage, as discussed in Section 4.2.2.

My conclusion is that the VN compounds in (1–4) are all instances of the same
paratactic, absolutive-like proto-strategy, where the noun’s thematic role is not
structurally specified. While it is typically a theme, it can also be an agent, attesting
to the proto-predication character of the compounding process. The next section
gives further support to this view, considering how the exocentric nature of VN
compounds is closely tied to their absolutive-like nature.

6.2.2 Exocentricity
It is typically reported in the linguistics literature, including textbooks, that VN
compounds of the kind illustrated in (1) are exceptional in that they are exocentric
(i.e. not headed), in contrast to the compounds illustrated in e.g. (6), which are headed
by the second/rightmost element in the compound (e.g. Spencer 1991; Selkirk 1982):
(1) pick-pocket, scare-crow, turn-coat, dare-devil, hunch-back, wag-tail,
tattletale
(6) toothbrush, headboard, bedroom, blackboard, navy-blue
While a bedroom is a kind of room, and navy-blue is a kind of blue (with room and
blue acting as heads), a turncoat is neither a kind of coat nor a kind of turn. It is rather
a person who (metaphorically speaking) turns his coat (a traitor), even though there
is no morphological piece, at least not an overt one, contributing to the meaning
person.
And even though the compounds in (2) and (4) discussed in the previous section at
first glance seem to pattern with those in (6), in the sense that a rattlesnake is a kind of
snake, and a show-finger is a kind of finger (cf. index finger in English), there is good
evidence for the view that they are in fact the same compound type as those in (1).
(2) rattle-snake, catch-phrase, cry-baby, stink-bug, worry-wart,
copy-cat, tumble-weed, scape-goat, turn-table
The clearest evidence is available in Serbian VN compounds, which feature the same
type of (imperative) morphology in both (3) and (4) type compounds, but never in
the compound type in (6). In fact, the compounds such as (6), considered to be root
compounds, consisting of just two roots, are practically non-existent in Serbian, as
mentioned in Section 1.6. Apart from very few creations, mostly borrowings, Serbian
cannot use the root compound strategy in (6) productively. For example, one cannot
create *krevet-soba (lit. bed-room), or *kafa-sto (lit. coffee-table) in Serbian. Instead,
one uses phrases of different kinds to express similar concepts, such as spavaća soba
(lit. sleeping room), or sto za kafu (lit. table for coffee). It is clear in Serbian that the
imperative compounds in (4) cannot be the product of the root compound strategy,
of the kind exemplified in English (6). Instead, this is the exact same strategy used to
form (imperative) compounds in (3).

To recapitulate, the argument for the absolutive-like, exocentric nature of VN
compounds is as follows. Given that Serbian does not really have root compounds,
and certainly not any root compounds with imperative morphology, and given that
compounds in (3) and (4) show completely identical morphological make-up,
including the imperative form of the verb, the conclusion must be that the com-
pounds in (3) and (4) in Serbian are a product of the same compounding strategy.
Given that some of these compounds involve object-like predication (3), and others
subject-like predication (4), the unified analysis of these compounds must invoke the
absolutive-like analysis. Even though in English there is no clear morphological
evidence to show if the compounds in (2) pattern with those in (1) or with those in
(6), I propose to extend the unified analysis to English VN compounds as well, for the
reasons mentioned below. However, if it turns out that the English data in (2) are just
(headed) root compounds, the data in Serbian still remain a clear absolutive-like
fossil.
I can offer the following reasons for treating all English VN compounds discussed
so far, including those in (2), as exocentric, absolutive-like creations. First of all,
unlike root compounds, which are extremely productive in English, the VN strategy,
both in (1) and (2), is highly restricted and unproductive.7 Second, when they refer to
humans, the compounds in (2) tend to be pejorative, just like their counterparts in
(1), as is obvious from e.g. cry-baby, worry-wart, copy-cat. Furthermore, as discussed
in Section 6.5, the VN compounds across different languages involve very similar
images and concepts, typically combining simple, basic words. As pointed out in the
previous section, there are clear parallels in the interpretation and imagery of English
rattle-snake and Serbian tresi-baba; English worry-wart and Serbian duri-baba,
suggesting that VN compounds in English should receive the same unified analysis
that is inescapable in the case of Serbian VN compounds.
Exocentricity is a surprising property, given that all morpho-syntactic structure
(derived by Merge) is considered to be headed, headedness and hierarchy con-
stituting the hallmarks of Merge. According to e.g. Williams (1981), compounds
and affixation in morphology are also subject to headedness, more specifically right-
hand headedness, with the rightmost morpheme serving as the head of the whole
compound/word. While Williams’ (1981) Righthand Headedness Rule seems applic-
able in describing the headed compounds in (6), it does not apply to the VN
compounds in (1). This is only one of several ways in which VN compounds are
nonconforming.

7
While these VN compounds are no longer productive in English, it is interesting that they are still
accessible to the brain. As pointed out to me by Ana Progovac (p.c. 2013), one can find an online fantasy
name-generator for insults (http://www.rinkworks.com/namegen/), which generates a list of potential
derogatory names for characters and a lot of them are in fact VN compounds. Another example of a
recent creation is sell-sword, used in the sense of mercenary, which figures in the title of the trilogy of
fantasy novels The Sellswords, written by R. A. Salvatore.

The few references that address the structure of VN compounds of type (1)
typically attempt to make them more streamlined, more conforming to the pre-
sent-day accusative-style grammars, by endowing them with null elements and
covert structure. Marchand (1969) proposed that VN compounds, which he calls
“pseudo-compounds,” are derived by a null affix which serves as their head (see also
Rohrer 1977 for French and Lieber 1992).8 More recently, Ferrari (2005), based on
Italian data, explores an analysis of VN compounds which posits a null head and an
Aspect Phrase inside these compounds, rendering them headed by a null affix.
I explore an approach to VN compounds, at least those found in English and Serbian,
which does not posit any covert structure or null elements, embracing the traditional
view of these compounds as exocentric (but see Section 6.5.2 for Romance languages
possibly being an exception in this respect). This in turn leads to an absolutive-like
analysis. As this chapter will show, there are many reasons to adhere to this view.
In addition to the observed ambivalence in theta-role assignment, Serbian VN
compounds are also ambivalent when it comes to determining what counts as head
with respect to agreement possibilities. In some sense, the noun inside Serbian VN
compounds seems to act as a morphological head of the whole compound, influen-
cing agreement possibilities, but in another sense, it does not, as illustrated in the
following table:
(7)  Nominative                         Accusative
     ta.F/taj.M (this) trči-laža.F      tu.F/tog.M trči-laž-u.F
     ta/taj ispi-čutura.F               tog/tu ispi-čutur-u.F
     taj jebi-vetar.M                   tog jebi-vetr-a.M          (animate)
     taj vadi-čep.M                     taj vadi-čep.M             (inanimate)
     to.N pali-drvce.N                  to.N pali-drvce.N
For the F(eminine) noun čutura [flask], the compound is declined as a simple F noun
would be, as demonstrated by the characteristic F accusative ending –u (čuturu). The
choice of the demonstrative is also influenced, although not determined, by the
F form of the noun: if the noun is F, the demonstrative for the whole compound
can be either F or M(asculine), the latter choice probably available by default (see
Ferrari 2005 for an important role played by default M gender in compounds
and word formation in general.)9 The M option suggests that the noun in a VN

8
The null affix can be seen as perhaps a null counterpart of the morpheme -man, or -er. Marchand’s
view is criticized in Langendoen (1971) and Ljung (1975), who favor the ellipsis approach (the term ellipsis is
also used in Jespersen 1954). Warren (1978: 27) uses the term “incomplete compound” for a host of different
types of compounds, including compounds such as egghead, which she analyzes as missing the morpho-
logical piece corresponding to man. Egghead type compounds may also be of evolutionary significance.
9
On the other hand, Ferrari (2005) reports that Italian VN compounds are uniformly M, suggesting
that they may have more morpho-syntactic structure, including possibly a null M suffix.

compound is not unambiguously its morphological head. Also, the F demonstrative
can be freely used with VN compounds even when they refer to males.
If the noun is inanimate M (vetar [wind]), and the compound human/animate, the
demonstrative must be M as well, but the whole compound in the accusative form
would follow an animate accusative pattern, with an ending in –a (jebi-vetr-a),
suggesting again that the final (inanimate) noun is not really the head of the
compound. On the other hand, if the compound as a whole refers to an instru-
ment/inanimate object (vadi-čep), and if, moreover, its noun is inanimate and M (čep
[cork]), then the demonstrative must be M, and the whole compound also follows the
accusative inanimate M pattern. When the noun is inanimate N(euter) (drvce
[stick]), and the whole compound is also inanimate, the demonstrative must be
N as well, and the accusative follows the inanimate pattern. These patterns point to
some unusual strategies and compromises in determining agreement, which would
be understandable if these compounds lack morpho-syntactic heads. The compari-
son between VN compounds and their hierarchical -er/-ac counterparts in the
following section reinforces the conclusions reached in this section.

6.3 A comparison with the hierarchical verbal compounds

Recall that the proposed analysis of exocentric compounds involves a flat, paratactic
combination of a verb and a noun, its only argument, making use of a single instance of
(proto)-Merge (or Conjoin in the sense of Chapter 4). Proto-Merge, creating non-
hierarchical, flat structures, arguably coincides with rudimentary predicate-argument
semantics, as established in the previous sections. From an evolutionary perspective, the
structures created by proto-Merge, including VN compounds, can be seen not only as
precursors, but also as a necessary foundation for building more elaborate, hierarchical
structures, including hierarchical verbal compounds discussed in this section.
Consider the following verbal (synthetic) compounds in English and Serbian, also
composed of a verb and a noun, but involving additional morphology and structure:
(8) truck-driver, meat-eater, brick-layer, story-teller, tax-payer, heart-
breaker
(9) kamen-o-rez-ac [stone-O-carve-AGENT, stone-carver]
srebr-o-ljub-ac [silver-O-love-AGENT, who admires
money]
žen-o-mrz-ac [woman-O-hate-AGENT, woman-hater,
misogynist]
ver-o-lom-ac [faith-O-break-AGENT, who converts]
brak-o-lom-ac [marriage-O-break-AGENT, who breaks
marriages]
rib-o-lov-ac [fish-O-hunt-AGENT, fisherman]

The two compound types, the exocentric VN strategy and the -er/-ac strategy illustrated
in (8–9), are comparable given that both utilize the same free morphemes, a verb
and a noun, to express similar concepts, which is especially clear with the following
minimal pairs, one involving a VN compound, and the other an -er/-ac compound:10
(10) a. der-i-koža [rip-IMP-skin, one who rips you off]
kož-o-der-ac [skin-O-rip-AGENT, skin-ripper, one who
rips you off]
b. liž-i-sahan [lick-IMP-basin, boot-licker]
čank-o-liz-ac [basin-O-lick-AGENT, boot-licker]
(11) kill-joy vs. joy-killer; Bere-water vs. water-bearer/carrier
The -er/-ac compounds not only have more morphological pieces than the VN
exocentric compounds, but they also show an obligatory rearrangement of the two
free morphemes, the verb and the noun. One approach to this is to take VN
compounds to reflect the underlying, basic word order (e.g. Lieber 1992; Murray
2004) and the -er/-ac compounds to involve a rearrangement/Move of constituents,
as illustrated below.
According to e.g. Roeper (1999: Footnote 32) and Progovac (2005b), -er/-ac
compounds have an additional layer of structure, the transitivity layer, possibly vP,
where the agentive morpheme -er/-ac is generated, the way agents are in the
Minimalist Program.11 Recall from Chapter 3 that transitive structures are analyzed
in Minimalism as involving a vP layer, while intransitive structures, especially
absolutive-like structures, need not have the vP layer. Given the flat/non-hierarchical
(basically small clause (SC)) analysis of VN compounds explored in this chapter,
these compounds certainly lack the vP layer. In this respect, they contrast with -er/-ac
compounds, which have hierarchical structure, and possibly also involve Move/
incorporation of the internal argument into the verb (e.g. Baker 1988; see also Lees
1960; Roeper and Siegel 1978; Lieber 1992).12

10
The two compounds in (10b), coming from two different dialects, clearly illustrate the distinction in
the use of the verb form: imperative in the VN compound (liži in both dialects), and the root form in -ac
compounds (liz in both dialects). The imperative morphology in VN compounds will be discussed at
length in Section 6.4.
11
For my purposes, the label for this projection is not as important as the need to capture the layering/
shelling effect of these compounds; a nominal equivalent of vP, an nP shell, would do just as well (see e.g.
Ferrari 2005).
12
For postulating VP in nominalizations, see e.g. Lees (1960); Lieber (1992); Fu, Roeper, and Borer
(2001); van Hout and Roeper (1998); for movement/incorporation in word formation, see e.g. Fabb (1984);
Sproat (1985); Roeper (1999). For some more recent syntactic approaches to word formation, see also Halle
and Marantz (1993); Marantz (1997); Josefsson (2001); Julien (2002); Lacarme (2002); Pylkkänen (2002);
Ferrari (2005); Roeper (2005); and references cited there.

(12) a) [SC kill joy] → [vP -er [SC kill joy]] → [joy-kill-er]
b) [SC der[i] koža] → [vP -ac [SC der koža]] → [kož-o-der-ac]
According to this analysis, just like the small clause in general provides the platform
for building the TP or vP (Chapters 2 and 3), the VN configuration provides the
foundation for building the more complex compound. As shown in (12), the -er/-ac
attaches to the small clause, building the complex compound upon the foundation of
the simpler one. What I am proposing is that the simpler, paratactic structures
literally provide a concrete syntactic basis upon which the more complex structures
have to be built. As seen in Chapter 2, there is empirical evidence that the TP is
superimposed upon the SC. The question now arises if there is any such empirical
evidence that the paratactic VN foundation provides the scaffolding for the -er/-ac
compounds. This question can best be answered by considering an alternative
analysis, which does not assume this kind of scaffolding.
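
The layering in (12) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not part of the analysis itself: the class and function names (SmallClause, AgentiveLayer, bracket, spell_out) are hypothetical, the morpheme forms (e.g. the root der and the stem kož) are supplied by hand rather than computed, and the bracket label vP simply follows (12).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmallClause:
    """Flat (paratactic) pairing of a verb and a noun: no head, no role label."""
    verb: str
    noun: str


@dataclass(frozen=True)
class AgentiveLayer:
    """An agentive suffix layered on top of a small clause, as in (12)."""
    suffix: str
    base: SmallClause
    linker: str = ""  # e.g. the Serbian vowel -o-; left empty for English


def bracket(node) -> str:
    """Render the structure in the bracket notation used in (12)."""
    if isinstance(node, SmallClause):
        return f"[SC {node.verb} {node.noun}]"
    return f"[vP -{node.suffix} {bracket(node.base)}]"


def spell_out(node) -> str:
    """Linearize: VN order for the flat clause, N-(linker)-V-suffix once layered."""
    if isinstance(node, SmallClause):
        return f"{node.verb}-{node.noun}"
    parts = [node.base.noun]
    if node.linker:
        parts.append(node.linker)
    parts += [node.base.verb, node.suffix]
    return "-".join(parts)


kill_joy = SmallClause("kill", "joy")
joy_killer = AgentiveLayer("er", kill_joy)
kozoderac = AgentiveLayer("ac", SmallClause("der", "kož"), linker="o")

print(bracket(kill_joy), "->", bracket(joy_killer), "->", spell_out(joy_killer))
print(spell_out(kozoderac))
```

The point of the sketch is that the layered object contains the flat one unchanged: the -er/-ac structure is built on the small clause rather than assembled independently of it.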
An alternative analysis of -er/-ac compounds would be to treat them as NN
compounds, truck + driver, where the second noun happens to be derived by the
suffix -er/-ac (e.g. Selkirk 1982; Spencer 1991). While this analysis may work for some
-er compounds, such as truck-driver in English, it does not work for many others.
Jespersen (1954: 293) points out that derivations such as sound-sleeper cannot be
derived by combining the adjective with the noun sleeper, but rather by adding -er to
the foundational combination [sound sleep]. The following examples illustrate that
neither English nor Serbian -er/-ac compounds can be uniformly derived through an
NN compounding process, given that the second noun often does not have a
derivation independently of the compound (see also Warren 1984: 233; Spencer
1991; Murray 2004):13
(13) brick-lay-er (*layer as Agent)
story-tell-er (*teller as human Agent)
tax-payer (*payer)
(14) kamen-o-rez-ac ‘stone-carver’ (*rezac)
srebr-o-ljub-ac ‘silver-lover’ (*ljubac)
žen-o-mrz-ac ‘woman-hater’ (*mrzac)
brak-o-lom-ac ‘marriage-breaker’ (*lomac)
Once again, just as is the case with small clause derivations of sentences (TPs)
discussed in Chapter 2, the layering/scaffolding approach, which takes the (VN)

13
As pointed out by a reviewer, it is not impossible to say “a layer of bricks,” or “a teller of tales.” But
these nouns still differ from other nouns in that they require such of-complements. In Serbian, even such
phrasal realizations are completely ungrammatical (*rezac kamena = “carver of stone”). In fact, Serbian -ac
specializes for attaching to the VN basis, and is only rarely found outside of compounds, that is, attached
directly to a verb. Instead, different derivational suffixes are used to derive nouns from just verbs, such as
-ač in pliv-ač (swimm-er).

small clause to be the foundation (see (12)), sheds light on the otherwise unexpected
properties of these compounds.
It is also of note here that -ac compounds in Serbian necessarily feature a vowel -o,
which is often seen as a linking vowel (but see Progovac 2005b for the default
agreement analysis). What this means is that -ac compounds in Serbian have four
pieces of morphology, certainly more than VN compounds. Recall the proposal in
Chapter 4 that the paratactic stage of grammar was followed by a proto-coordination
stage, characterized by linkers/coordinators, with little or no semantic import. An
interesting question then arises with respect to -ac compounds in Serbian: are they
created by the coordination/linker type of grammar, or by true hierarchical gram-
mar? Most likely, these compounds have elements of both, and represent fossilized
intermediate structures.
As an alternative to the derivation in (12), one can also consider an analysis
according to which the -er/-ac suffix is an ergative suffix (12’), added to the absolutive
compound base, and possibly attached by adjunction (see Chapter 3 for discussion
and references on the attachment of ergative phrases).
(12’) a) [SC kill joy] → [SC -er [SC kill-joy]]
b) [SC der[i] koža] → [SC -ac [SC der-koža]]
The added precision in theta-role assignment in -er/-ac compounds would come
from this added agentive argument, the morpheme -er/-ac, whether it is an agent in
vP (12), or an ergative adjunct (12’), necessitating that the lower (absolutive-like)
argument be a non-agent. In fact, the ergative analysis would have an added benefit
of explaining why -ac in Serbian can only attach to compounds (Footnote 13):
ergative arguments are typically only added to structures which already contain an
absolutive argument. If so, then Serbian -ac compounds are yet another example of
ergative syntax at work in Serbian (see Chapter 3 for more examples).
As pointed out in Section 6.2, the grammar of VN compounds resembles the
grammar of absolutive intransitives, as illustrated in Tongan (5). When only one
argument is present, the absolutive argument, it can be either the agent or the theme/
patient of the action. However, once a specifically marked agent is introduced
(ergative), its very presence renders the absolutive argument as semantic patient/
theme (see Chapter 3 for further examples and details). This is exactly what happens
with e.g. the compound dare-devil, which is less specified in comparison to devil-
darer. In other words, the one-argument proto-grammar is underspecified when it
comes to the nature of theta roles, but the addition of an external, agent argument
leads to more precision.
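
The narrowing of interpretation described above can be stated as a toy rule. The following is a minimal sketch, assuming just two role labels (agent and theme) and a hypothetical function name noun_roles; it models only the contrast between a one-argument compound and one with a marked agent, not any particular syntactic implementation.

```python
AGENT, THEME = "agent", "theme"


def noun_roles(has_marked_agent: bool) -> frozenset:
    """Roles open to the noun of a verb-noun compound.

    With no marked agent (dare-devil, turn-coat, turn-table), the noun is
    simply a participant, so both readings stay available (vagueness).
    Once an agentive piece such as -er/-ac is present (devil-darer), that
    piece takes the agent role and the noun is left with the theme role.
    """
    if has_marked_agent:
        return frozenset({THEME})
    return frozenset({AGENT, THEME})


print(sorted(noun_roles(has_marked_agent=False)))  # dare-devil
print(sorted(noun_roles(has_marked_agent=True)))   # devil-darer
```

The rule makes no reference to the verb or the noun themselves: the added precision comes entirely from the presence of the extra argument, as in the Tongan pattern in (5).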
Even though one compound type can be shown to be more complex than the
other, it is significant that there is continuity of structure between the two compound
types, where one type literally provides the (paratactic) scaffolding for the other, as
illustrated in (12) and (12’) above. This is consistent with the main theme of this

monograph, which is that simpler syntactic structures integrate into more complex
ones, serving as their foundation (see e.g. Chapters 2 and 3 for small clauses and TPs/
vPs). Some corroborating evidence for the continuity between VN and -er com-
pounds comes from the way children acquire compounds, as discussed in Section 6.7.
The following section provides yet another reason for treating VN compounds as
evolutionary fossils.

6.4 A surprising verb form: The imperative

Linguists and grammarians converge on the (surprising) conclusion that VN com-
pounds in Serbian consist of an imperative verb plus a noun (Stevanović 1956;
Mihajlović 1992; Maretić 1899; Belić 1949; Živanović 1904; Progovac 2005b, 2010c).
This is significant since it may push the ultimate analysis of this compound strategy
into the deep evolutionary past.
The imperative in Serbian has a characteristic i/j ending, as can be seen from the
examples below. Although there is an overlap with some verbs (marked as IMP/3SG
below), those verbs which have distinct endings for the base 3SG form (third person
singular present) and for the imperative (IMP) unmistakably use the imperative form
in these compounds, whether these compounds involve object-like nouns (15a) or
subject-like nouns (15b) (see also Section 6.2).14
(15) VN compounds as common nouns in Serbian
a) with object-like nouns
cepi-dlaka [split-hair = hairsplitter] IMP15
deri-koža [rip-skin = person who rips you off] IMP
ispi-čutura [empty-flask = drunkard] IMP/3SG
jebi-vetar [screw-wind = charlatan] IMP
jedi-vek [eat-life = one who constantly annoys] IMP
kljuj-drvo [archaic: peck-wood = wood-pecker] IMP
liži-sahan [dialectal: lick-basin = boot-licker] IMP
kosi-noga [skew-leg = person who limps] IMP
mami-para [lure-money = money-grabber] IMP/3SG
muti-voda [muddy-water = one who muddies waters] IMP/3SG
pali-drvce [burn-stick = matches] IMP/3SG
pali-kuća [burn-house = one who burns houses] IMP/3SG

14
As pointed out in Section 1.6, there are VV compounds in Macedonian which involve two imperative
verbs strung together, as in veži-dreši (tie-untie ‘an ignorant person’) (Olga Tomić, p.c. 2006).
15
The example cepi-dlaka seems problematic at first glance since the imperative form of cepati is cepaj,
and not cepi (the base, 3SG present tense form is cepa.) However, prefixed perfective counterparts of
the verb cepati, such as pre-cepiti, ot-cepiti, have the respective imperative forms as pre-cepi and ot-cepi. The
compound probably preserves a now obsolete imperative form cepi.

podvi-rep [fold-tail = someone who is crestfallen] IMP
priši-petlja [sew-loop = who clings onto another] IMP
probi-svet [break-world = wanderer] IMP
raspi-kuća [waste-house = who spends away property] IMP
razbi-briga [break-worry = game/entertainment] IMP
seci-kesa [cut-purse = pick-pocket] IMP
vadi-čep [extract-cork = corkscrew] IMP/3SG
vrti-guz [spin-butt = restless person, fidget] IMP/3SG
vrti-rep [wag-tail = restless person, fidget] IMP/3SG
vuci-batina [pull-whip = tramp, good-for-nothing] IMP
b) with subject-like nouns
kaži-prst [show-finger=index finger] IMP
tresi-baba [shake-old.woman=who shakes/scares like an old woman] IMP
visi-baba [hang-old.woman=flower: snowdrop] IMP/3SG
plači-drug [cry-friend=who commiserates with you] IMP
plači-baba [cry-old.woman=cry-baby] IMP
striži-buba [grate-bug=an insect which pecks trees] IMP
tuži-baba [complain-old.woman=who complains like a woman] IMP/3SG
In other words, all the compounds above marked as IMP can only be analyzed as
involving an imperative verb, while the compounds marked as IMP/3SG are ambigu-
ous between the two forms. Significantly, there are no compounds whose verb can be
analyzed as 3SG, but not as IMP. To illustrate, in seci-kesa, seci is unambiguously
IMP, as opposed to sek.ROOT, seći.INF, seče.3SG.PRES. Likewise, plači in plači-drug is
clearly IMP, as opposed to plakati.INF, plače.3SG.PRES, plak.ROOT. Any unified charac-
terization of the morphological make-up of Serbian VN compounds must therefore
refer to the imperative form:
(16) (Fossilized) imperative verb + noun (default case)16
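
The diagnostic behind the IMP and IMP/3SG labels in (15) can be sketched as a simple form comparison. This is an illustrative sketch only: the function name classify and the small lookup table are hypothetical, the forms for seci-kesa and plači-drug are those cited in the text, and the coinciding forms for pali-drvce and vadi-čep are an assumption read off their IMP/3SG marking in (15).

```python
# Imperative and 3SG present forms of the verb inside each compound.
FORMS = {
    "seci-kesa": {"IMP": "seci", "3SG": "seče"},
    "plači-drug": {"IMP": "plači", "3SG": "plače"},
    "pali-drvce": {"IMP": "pali", "3SG": "pali"},  # assumed to coincide
    "vadi-čep": {"IMP": "vadi", "3SG": "vadi"},    # assumed to coincide
}


def classify(compound: str) -> str:
    """Label the verbal piece of a VN compound as IMP, 3SG, or IMP/3SG."""
    verb = compound.split("-", 1)[0]
    forms = FORMS[compound]
    labels = [label for label in ("IMP", "3SG") if forms[label] == verb]
    return "/".join(labels) if labels else "no match"


for compound in FORMS:
    print(compound, classify(compound))
```

On this small sample, every compound comes out as either IMP or IMP/3SG and none as 3SG alone, which is the asymmetry that motivates the generalization in (16).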
It is significant to note that most of these compounds are derogatory when referring
to humans. The exceptions are compounds created in more recent times, for official

16
All the compounds in the citation form have their nouns in the default nominative case (seci-
kesa.NOM), and not in the accusative case (seci-kesu.ACC), which would be required in a sentential
imperative counterpart (Seci kesu!/*Seci kesa! “Cut the purse!”). When these compounds are used in a
sentence, the noun gets inflected for the appropriate case assigned to the position of the whole compound.
It is important to point out that these compounds in Serbian are not interpreted as involving commands
of any kind, whether their nouns are subject-like or object-like. Only the form here is imperative, and the
native speakers are typically not aware of this. I will give further arguments below for why this imperative
form should be analyzed as a fossilized imperative.

naming purposes, which are not derogatory, but in fact tend to express grand wishes
(see e.g. Yonge 1863: 441).
(17) VN compounds as non-derogatory names in Serbian
Bodi-roga [pierce-horn?] IMP
Bori-voj [fight-war] IMP/3SG
Brani-mir [defend-world?] IMP/3SG
Budi-mir [be-world?] IMP
Budi-sava [be-?] (town) IMP
Deli-blato [divide-mud] (town) IMP/3SG
Jezdi-mir [ride-world] IMP/3SG
Kolji-vratić [cut-throat] IMP
Kruni-slav [crown-glory] IMP/3SG
Pali-lula [burn-pipe/straw?] 17 IMP/3SG
Popi-voda [drink-water] IMP
Rasti-slav [grow-glory] IMP/3SG
Stani-mir [stay-world] IMP
Stani-slav [stay-glory] IMP
Sveti-mir [bless-world] IMP/3SG
Trpi-mir [endure-world] IMP/3SG
Strati-mir [waste-world] IMP/3SG
Veli-mir [command-world] IMP/3SG
Vladi-mir [rule-world] IMP/3SG
Zlati-bor [gild-pine] (mountain) IMP/3SG
Zlati-slav [gild-glory] IMP/3SG
As can be seen, these more recent creations also feature the imperative form of the
verb (see Appendix 2 for more examples).
Even though English does not distinguish imperative from base and root forms,
according to e.g. Jespersen (1954: 224), VN English compounds “often seem to
originate in an ironical imperative.” Following Darmesteter (1894, 1934), Weekley
(1916) also analyzes English VN compounds as consisting of the imperative verb +
object, and sometimes an adverb (e.g. Go-lightly). 18

17
Mihajlović (1992: 16, 136) suggests that Pali-lula, a place name, derives from Pali-lila, meaning ‘burn-
straw/hay,’ the ancient image dating back to a pre-Christian (Old Hittite) ritual. If so, then the present-day
form Pali-lula was derived by folk-etymology: lula means a (smoker’s) pipe, while lila has no meaning in
present-day Serbian.
18
These data include examples from Weekley (1916); Jespersen (1954: 223–4; 347–50); Lees (1960);
Marchand (1969: 380–2); Adams (1973); Groom (1937). For many more examples of English VN com-
pounds, the reader is referred to these references (see also Appendix 1 of this Chapter).
A reviewer disagrees with Jespersen’s claim, noticing that there is no imperative interpretation in English
compounds. However, Serbian VN compounds are also not interpreted as imperative, even though the
form is unmistakably imperative. This imperative form will be analyzed below as a fossilized form, akin to

(18) bang-straw (thresher), break-back, break-fast, break-neck, break-vow,
break-water, burn-bag, burst-cow (insect), carry-all, catch-fly
(plant), catch-penny, cease-fire, cover-shame (plant), cover-slut
(apron), cure-all, cut-finger (plant), cut-throat, cut-purse, cut-water,
do-nought, dread-nought (originally a person; later a battleship),
fill-belly (glutton), fill-pot, find-fault, hang-dog (originally a
person who hangs stray dogs), hang-man, heal-all (plant), hunch-
back, kill-joy, kill-lamb (plant), kill-time, know-little, know-
nothing, lack-brain, lack-bread, lack-grace, lack-land, lack-love,
lack-luster, lack-wit, lick-box, lick-dish, lick-ladle, lick-platter,
lick-pot, lick-spit, lock-jaw, make-mirth, make-peace, pass-port,
pas-time, pick-lock, pick-purse, pick-thank, pinch-back (miser),
pinch-belly, pinch-gut, pinch-penny, rake-hell (scoundrel, ruffian),
rake-shame, save-all, saw-bones, scape-gallows, scare-crow,
scatter-brain, scoff-law, scrape-gut (fiddler), shear-water (bird),
shuffle-wing (bird), skin-flint (miser), sling-shot, spend-thrift
(miser), spit-fire, spoil-sport, spurn-water, stay-stomach (snack),
stop-gap, sweep-stake, swish-tail (bird), tangle-foot (whiskey),
tear-thumb, tell-tale, toss-pot, tumble-dung (insect), turn-coat,
turn-key, turn-penny, turn-skin, turn-spit, turn-table, wag-tail (bird)
While it is hard to tell what all the English words in (18) mean, if one selects only
those for which the meaning is relatively clear, and which refer to humans as opposed
to plants or objects, the list includes the following:
(18’) cut-throat, cut-purse, do-nought, dread-nought (originally a person;
later a battleship), fill-belly (glutton), hunch-back, kill-joy, know-
little, know-nothing, lack-brain, lack-grace, lack-land, lack-wit,
lick-spit, pick-thank, pinch-back (miser), pinch-penny, rake-hell
(scoundrel, ruffian), saw-bones, scatter-brain, scoff-law, scrape-gut
(fiddler), skin-flint (miser), spend-thrift (miser), spoil-sport, tell-
tale, turn-coat
They all seem to be derogatory, but even if just the majority of them were, this would
still call for an explanation. This is in addition to previously introduced compounds,
such as tattle-tale, busy-body, cry-baby, crake-bone (crack-bone). I do not know of
any other morpho-syntactic process that has created so many pejorative terms.

English optative uses of verbs, as in Long Live the King, which show no agreement with the subject. As will
be shown, this same form is also used as Historical Imperative in some dialects of Serbian. This is then just
an ancient mood form that happens to coincide with the synchronic imperative morphology in Serbian.

As is obvious from Appendix 2, the older compounds in Serbian that refer to
humans are also derogatory, and there are many more like that in Mihajlović’s (1992)
book. The exception is the newly created class of names in Serbian, given in (17). It is
important, though, to focus on the compounds that refer to humans, for those that
describe instruments or animals would not be useful as insults. Even though in
medieval times these compounds sometimes showed “unquotable coarseness,”
Weekley maintains that this is a very expressive and convenient way of naming,
which flourished in the thirteenth and fourteenth centuries. Most reference books do
not include these compounds due to their obscene nature, driving them to virtual
extinction, not only in English, but in other languages as well (see also Lloyd 1968;
Darmesteter 1934; Mihajlović 1992).
Rolfe (1996) has hypothesized that humans initially used verbs to issue commands
(imperative), even in the one-word (pre-syntactic) stage, and long before using
verbs to make statements. The imperative in general is among the first productive
verbal forms used by children (e.g. Bar-Shalom and Snyder 1999) (see Section 6.7). It
also tends to be the least marked verbal form across languages, and/or to preserve
archaic patterns (see e.g. Dixon 1994: 189; Kuryłowicz 1964: 137).
Imperatives arguably also may provide some continuity with animal calls and
other communicative signals, in the sense that they are calls for action, typically in
the here-and-now. This would be consistent with e.g. Greenfield and Savage-
Rumbaugh’s (1990) and Tomasello’s (2008) claims that non-human primates use
communicative signals, both vocalizations and gestures, almost exclusively for
imperative purposes. In addition, Millikan (2004) has argued that animal commu-
nicative signals are both indicative and imperative in force.
If the capacity to use the VN compound strategy emerged at an early stage of
language evolution, when one-word utterances and imperatives ruled, then it is
plausible that the compound-like names would have been put together using what
was already there—the imperative-like verbs.19 But it is important to keep in mind
that one is dealing here with a proto-imperative form, not with what is meant by
imperative in present-day languages. This proto-imperative would have had a much
wider range of functions than the modern imperative has today. In this respect, the
quote by Speijer (1886: 271–3) regarding the imperative form in Sanskrit is useful:
“Sanskrit ‘imperative’ comprises more than is conveyed by its European name. It is
not only the equivalent of what we are wont to understand by this mood, but it is also
expressive of wishes, benedictions, possibility, and doubt . . . ”


In fact, this proto-imperative form is used in Serbian in other surprising functions
as well, all of which can be considered as fossils. For example, some petrified optative/
subjunctive forms such as English (19) are rendered as imperative forms in Serbian
(20), the same forms that are found inside VN compounds:20
(19) Long Live the King! God Forbid!
(20) a. Pomozi Bog!
Help.IMP God
b. Hvali Bog!
Praise.IMP God
Another example comes from the archaic Historical Imperative, which used to be
productive in narratives, but is now only used in some dialects of Serbian (Stevanović
1966: 412–13):
(21) A on ti skini motiku s ramena, zabij je u zemlju, ostavi fenjer kraj
sebe i sedi na ladju.
‘And he take off-IMP the spade from his shoulder, stick-IMP it
into the ground, leave-IMP the lantern by himself, and sit-IMP
onto the boat.’
It is perhaps relevant for these considerations that the Slavic imperative descended
from the optative mood expressing wishes (often indistinguishable from commands),
which in turn descended from the ancient PIE injunctive (e.g. Belić 1960; Kiparsky
1968; Kerns and Schwartz 1972: 23; Stevanović 1974).21 The injunctive was initially an
unmarked mood, but later specialized for non-indicative, “irrealis” moods (expressing
wishes, commands, and/or exclamations).22 It is possible that VN compounds
preserve approximations of this ancient mood morphology, which in Serbian hap-
pens to be rendered as imperative, via optative (see Progovac 2006, 2010c).
It is also relevant in this respect that swearing in present-day languages often
involves verb forms which look like imperatives, but are not true imperatives in the
modern sense of the word. These include e.g. Damn (you)! and Fuck (you)!, as discussed
in Dong (1971).23 Such uses of verbs in swearing in fact resemble optatives in the
sense that they impose wishes/curses upon someone. In that sense, such swear

20
There is a name in Polish that has exactly the same make-up as (20b), as illustrated in (22) in
Section 6.5.1.
21
According to Kiparsky (1968: 51), in Vedic, Greek, and Old Irish, injunctives are also a source of
historical present, equivalent to the Serbian Historical Imperative discussed in the text.
22
See Section 2.2 for the discussion of (tenseless) injunctive mood in PIE in connection with small
clauses, which are arguably tenseless creations.
23
Dong is the pseudonym for linguist James McCawley. Notice that present-day imperatives necessarily
feature reflexive pronouns, such as Wash yourself! Reflexives are also possible in some swear phrases
(e.g. Fuck yourself) on a different interpretation, although not with others (e.g. ??Damn yourself!).

expressions can be seen as negative versions of optative phrases such as Long Live the
King! in (19). These connections and overlaps with optatives and injunctives make it
more plausible to accept the (proto-)imperative analysis of VN compounds (see
Progovac 2010c for more details). Section 6.6 discusses the evolutionary significance of
swearing.
If indeed such VN compounds were among the first two-word creations involving
proto-Merge, then it stands to reason that early language would have made use of
what it already had at its disposal: (proto-imperative) verbs. The beginning of the
category verb in human language may have been the (one-word) imperative utter-
ance. The next section introduces data from additional languages, establishing
further crosslinguistic parallels in the structure and use of VN compounds.

6.5 Crosslinguistic distribution and parallels

VN compounds are found in a variety of languages, including non-IE, showing
striking parallels in form and imagery, as illustrated in this section.

6.5.1 VN compounds in other Slavic languages

In addition to the imperative analysis of Serbian compounds (previous section), the
imperative analysis of VN compounds has also been proposed for other Slavic
languages, including Bulgarian (Andreĭčin 1955) and Macedonian (e.g. Koneski
1954). For Polish, it is sometimes claimed that –i is a connecting/linking vowel (e.g.
Ułaszyn 1923). Polish lost the imperative in –i by the end of the sixteenth century, and
Mirowicz (1946) advocates a diachronic imperative analysis of VN compounds in
Polish. According to Klemensiewicz, Lehr-Spławiński, and Urbański (1964: 256–7),
VN compounds in Polish went through several stages, including an imperative stage.
In any event, the diverging analyses of Polish and Serbian compounds simply reflect
the fact that the marker i/j is still recognizable as an imperative marker in Serbian, but
no longer in Polish. Once Polish lost the connection with i/j as an imperative marker,
the original proto-Slavic paratactic VN strategy may have been reinterpreted as a
proto-coordination strategy, reanalyzing i/j as a meaningless linker (see Chapter 4 for
the progression of syntactic stages from parataxis to (proto-)coordination).
The following are some examples from Polish, Russian, and Macedonian, featur-
ing the same i/j ending.
(22) Polish (supplied and/or glossed by Paweł Rutkowski, p.c. 2006)
Chwali-bóg [praise-god] (name)
dusi-grosz [squeeze-penny, miser] (cf. English pinch-penny)
goli-broda [shave-beard, barber]
hulaj-dusza [roister-soul, reveler, rioter]
Kopaj-gród [dig-town] (place name)
łami-strajk [break-strike, strike-breaker]
łami-główka [break-head.DIM, puzzle, riddle]
mąci-woda [muddy-water, troublemaker, brawler] (cf. Serbian
muti-voda)
moczy-morda [soak-muzzle, sot, drunkard]
obieży-świat [trot-world, globe-trotter]
pali-woda [burn-water, flibbertigibbet, madcap]
pasi-brzuch [pasture-belly, glutton, lazybones]
pędzi-wiatr [drive-wind, flibbertigibbet, madcap]
rzezi-mieszek [cut-purse, pick-pocket]
wali-góra [topple-mountain, giant of Polish folklore who
could topple mountains]
wierci-pięta [wiggle-heel, fidget]
wozi-woda [carry-water, water-carrier]
(23) Russian (Yana Pugach, Maria Babyonyshev, Dina Brun, Natasha
Kondrashova, Asya Pereltsvaig, p.c. 2006)
lomi-golovka [break-head, brain-teaser/puzzle]
sorvi-golova [cut-off head, dare-devil]
perekati-pole [roll-over-field, tumbleweed]24
verti-hvostka [wag-tail, a bird]
(24) Macedonian (Olga Mišeska Tomić, p.c. 2006)
gazi-bara [tread-water]
isturi-čorba [stick-out broth, tactless person]
zajdi-sunce [set-sun, sunset]
6.5.2 VN compounds in Romance languages
Verb-noun compounds are productive in some Romance languages, including Span-
ish, Italian, and French, which suggests that they might have acquired additional
structure, at least the newly formed ones, and that they conform better to modern
syntactic patterns. However, VN compounds are still marginal in Rumanian, where
they “belong to affective and familiar language,” and where they are “exclusively
epithets applied to persons in a contemptuous fashion, as are the earliest examples in
the other Romance languages” (Lloyd 1968: 7).
Lloyd claims that Romance VN compounds were originally nicknames, usually
playful and humorous, and that it was only around the twelfth/thirteenth century
that the strategy was extended to names of places, instruments, occupations, plants. It
could be that Romance VN compounds acquired more structure at this juncture in

24
Tumble-weed itself belongs to the VN compound type, as per discussion in Section 6.2.

time, distinguishing themselves from the original pattern, still preserved in Germanic
and Slavic languages.25 Lloyd (1968: 20) believes that these compounds spread to
more neutral contexts due to their expressiveness, and also due to the lack of a
competing pattern, i.e. the lack of the English -er compound type (e.g. dish-washer).
According to Lloyd (1968), many of the original VN compounds were used by the
lowest classes of society, were coarse and humorous, and because of that did not enter
the texts and reference books.
Here are some examples from Spanish, Italian, and French.
(25) Spanish (Murray 2004; Eugenia Casielles, p.c. 2012)
calienta-pollas [heat-penises, a tease]
espanta-pájaros [scare-birds, scarecrow]
lame-culos [lick-asses]
lava-manos [wash-hands, bathroom sink]
lava-platos [wash-dishes, dishwasher]
para-brisas [stop-wind, windshield]
para-caídas [stop-falls, parachute]
para-sol [stop-sun, sunshade]
pica-flor [peck-flower, hummingbird]
rasca-cielos [scrape-sky, skyscraper]
rompe-cabezas [break-heads, puzzle] (cf. Polish and
Russian)
saca-corchos [extract-corks, corkscrew]
saca-muelas [extract-teeth, hack dentist]
(26) Italian (Hall 1948b: 175–6; Murray 2004)
akkatta-pane [beg-bread, beggar]
akkiappa-kani [catch-dog, dog-catcher]
faci-male [do-evil, evil-doer]
gratta-cielo [scrape-sky, skyscraper]

25
Not only are VN compounds in some Romance languages productive, but they also can be recursive,
and often contain plural nouns inside them (see e.g. Murray 2004; Ferrari 2005). One example of a
recursive Spanish VN compound is limpia-para-brisas ‘wipe-stop-wind, windshield wiper’ (Murray
2004). In English, only the complex -er counterparts are recursive (e.g. dishwasher user). A recursive V
[VN] combination seems to me to be completely out of reach for English and Serbian VN compounds
(*scare-pick-pocket (one who scares pick-pockets); *dare-spoil-sport (one who dares spoil-sports); *muti-
ispi-čutura (one who confuses drunkards)). Serbian and English VN compounds are neither productive
nor recursive, and are thus likely to be better approximations of the postulated proto-syntactic constructs
(for recursion, see Chapter 4). Italian and French VN compounds also differ from Serbian counterparts
with respect to gender specification, as discussed in Section 6.2. It is also of significance that the productive,
more recently created, VN compounds in Romance mostly refer to instruments, rather than people,
contrary to what one finds in English and Serbian fossil compounds, as well as in Rumanian, as pointed
out in the text. The idea is that the original creations of this kind targeted people, possibly for ritual insult
purposes (Section 6.6).

lava-piatti [wash-plate, dishwasher]
metti-male [put-evil, trouble-maker]
spremi-limoni [squeeze-lemon, lemon-squeezer]
(27) French (some from Nyrop 1908; also Kate Paesani, p.c. 2006)
accroche-coeur [engage-heart, flirt]
Boil-eau [drink-water] (name)
cache-col [hide-neck, scarf]
coupe-bourse [cut-purse, pick-pocket]
coupe-gorge [cut-neck, rough neighborhood]
essuie-glace [wipe-windshield, windshield wiper]
grippe-sou [seize-up-penny, skinflint]
perce-neige [pierce-snow, snowdrop]
porte-bonheur [carry-happiness, lucky charm]
saute-mouton [jump-sheep, leap-frog]
tire-bouchon [cork-screw]
It is intriguing that the imperative analysis has also been proposed for Romance VN
compounds by many, including Diez (1838); Shulze (1868) (this reference also
consults Sanskrit and Slavic and Germanic families); Darmesteter (1894) (a very
extensive and comprehensive study and defense of the imperative analysis in
Romance); Darmesteter (1934); Prati (1931, 1958); Migliorini (1946); Lloyd (1968)
(see the latter reference for an overview of the imperative and non-imperative
approaches to VN compounds). Most historical grammars followed Darmesteter’s
imperative analysis (e.g. Meyer-Lübke (1895: 213–14) and subsequent work;
Adams (1913); de Diego (1914); Rohlfs (1954)), while some advocated the third person
singular analysis (e.g. Bolufer 1920: 170). The imperative analysis is challenged and
argued against in Meunier (1875); Osthoff (1878) (Osthoff was criticized in Tobler
1886); Tollemache (1945); Heinimann (1949); Hall (1964, also 1948a,b). More recently,
while Floricic (2009) explores an imperative analysis, Ferrari (2005) has argued
against the imperative analysis, in a very thorough study of word formation in Italian
and other languages.
My claim here is that those compounds which approximate proto-syntactic
structures are more likely to exhibit ancient verb forms, including proto-imperative
forms.26 While preservation of structure in this close manner is by no means
guaranteed or necessary, it is nonetheless possible that this syntactic mold (even
though certainly not specific words) was passed on from generation to generation,
with only minor adjustments to the morphology of the verb form, to best approximate

26
In Romance, the (proto-)imperative analysis may be more appropriate for the original compounds
than for the recently coined, productive compounds referring mostly to instruments.

the original compound. It is thus intriguing that so many researchers, working on
diverse languages, converged on the conclusion that VN compounds involve a form that
looks like an imperative, even though such an analysis is obviously counterintuitive, “un
vero controsenso,” as put in Tollemache (1945: 181). By embracing the imperative
puzzle, as well as the traditional exocentric characterization of VN compounds, one is
led to seek an explanation in the deep evolutionary past.

6.5.3 VN compounds in non-Indo-European languages

VN compounds with comparable morphological make-up, and with comparable
metaphors, can also be found in non-IE languages, as illustrated in this section
with Tashelhit Berber, Twi, and Chinese.
(28) Tashelhit Berber (spoken in Morocco; Dris Soulaimani, p.c. 2007)
slm-aggrn [suck.in-flour, butterfly]
ssum-izi [suck-fly, thrifty person]
ssum-sitan [suck-cow, insect]
(29) Twi (spoken in Ghana; Kingsley Okai, p.c. 2011)
Atoto-botom [dip-in pocket, pick-pocket]
Kukru-bin [roll feces, beetle]
Nom-mmogya [suck blood, vampire]
Wodi-nii [kill person, killer]
(30) Chinese (Murray 2004; Haiyong Liu, p.c. 2006)
dean-shin [stay-stomach, refreshments]
liing-shyh [lead-affairs, consul]
ua-eel [dig-ear, ear-pick]
It should be noted in this respect that Tashelhit ssum-sitan [suck-cow] in (28) is
closely parallel to Old English burst-cow, which also meant “insect,” and the drinking
image for a miser drynk-pany [drink-penny] (1) is reminiscent of ssum-izi [suck-fly]
in Tashelhit (28). Likewise, Twi kukru-bin [roll feces, beetle] in (29) involves the same
image as English tumble-dung (insect), and there is also a clear parallel between the
Twi word atoto-botom (29) and English pick-pocket.
The following table reveals further parallels in morphological make-up and meta-
phorical expression; many more can be found across the data provided in this
chapter.
(31) English       Serbian        French         Polish          Twi
     Drink-water   Popi-voda      Boil-eau
     cut-purse     seci-kesa      coupe-bourse   rzezi-mieszek
     pick-pocket                                                 atoto-botom
     cut-throat    Kolji-vratić   coupe-gorge
     lick-pot      liži-sahan
     Bere-water                                  wozi-woda
     Burn-house    pali-kuća
     wag-tail      vrti-rep
     pinch-penny                  grippe-sou     dusi-grosz
     tumble-dung                                                 kukru-bin
According to Lloyd (1968), the original VN compounds described people who were
lazy, useless, careless in dress, idle, contemptible, criminal, stupid, uncultured,
bullies, busybodies, flatterers, gluttons, drunkards, gloomy, cheating and swindling,
misers, defective, of contemptuous professions.27 If these descriptive words were not
available to ancient humans, which is a reasonable assumption to make, then the VN
naming strategy would have increased their expressive power (as well as the insulting
power) enormously. In other words, the ability to use such a compounding strategy
successfully would have constituted an enormous expressive advantage over just
using single-word utterances, an advantage which could have been subject to sexual
selection, as discussed in the following section.

6.6 VN compounds and sexual selection

My proposal in this chapter is that VN compounds may represent the best fossils we
have for the postulated intransitive, paratactic, two-word grammar stage, which
moreover involves a verb(-like element) acting as a proto-predicate. Not only is the
structure of these compounds rudimentary and unsyntactic in almost every sense of
modern syntactic theory (flat structure, no headedness, no subject/object differenti-
ation, no recursion), but this compound strategy specializes for derogatory reference
and insult. Moreover, this strategy clearly illustrates how one can create hundreds of
complex and abstract concepts out of a handful of concrete base words. This alone
would have been enough to demonstrate to the ancient hominins the power of
(proto-)syntax, and to involve them in a possibly cut-throat race toward evolving
the capacity for syntax. This section looks into how creations comparable to VN
compounds would have contributed to the sexual selection of (proto-)syntax.
It is important to point out that I am not saying that this kind of naming/insult
strategy was the only benefit of proto-syntax, and the only reason for selecting syntax.
Not at all. There are just so many benefits of being able to combine words into larger
meaningful units that it would be trivial and pointless to list them here. What I am
saying is that I have isolated the data that point to just one of these benefits, and even
this one alone would have afforded such a significant concrete advantage that it could

27
Busy-body is probably another VN compound expressing a concept that can hardly be expressed so
succinctly and vividly in any other way.

have by itself triggered selection for proto-syntax. The more such factors at work, of
course, the faster and more complete the selection would have been. Chapter 7 offers
a concrete scenario which outlines how this process could have involved genes.
VN compounds across languages are typically playful, pejorative, and/or vulgar.
Their expressive potential seems unmatched by any other (nick)naming strategy. As
put in Darmesteter (1934: 443), the artistic beauty and richness of VN compounds (in
French) is inexhaustible.28 Mihajlović (1992) was equally impressed by the VN
compounds in Serbian. He devoted his career to traveling to remote places and
collecting over 500 Serbian place and people names in the form of VN compounds.
He reports that these condensed compositions pack in them frozen fairy tales,
proverbs, and ancient wisdoms and metaphors (1992: 8–9).
According to Progovac and Locke (2009), formation and use of VN compounds
may have been an adaptive way to compete for status and sex in ancient times. Their
successful use would have enhanced relative status first by derogating existing rivals
and placing prospective rivals on notice; and second by demonstrating verbal skills
and quick-wittedness (see Chapter 7 for a hypothetical scenario). Darwin (1874)
identified two distinct kinds of sexual selection: aggressive rivalry and mate choice
(see also Miller 2000), both of which seem relevant for the proposed use of exocentric
compounds. Darwin (1872) also pointed out that strong emotions expressed in
animals are those of lust and hostility, and that they may have been the first verbal
threats and intimidations uttered by humans (Code 2005: 322).
Throughout recorded history, sexually mature males have issued humorous
insults in public (Locke 2009; Locke and Bogin 2006). These “verbal duels” are
taken to discharge aggressive dispositions, and provide a way to compete for
status and mating opportunities without risking physical altercations (Marsh
1978; Parks 1990). In this respect it is significant that vulgar VN compounds in
Serbian target males. For example, jebi-vetar [screw-wind, charlatan] is typically
used to describe males. Even those compounds that seemingly describe females
are typically used in reference to males, for a doubly insulting effect (Mihajlović
1992): laj-kučka [bark-bitch, loud and obnoxious person]; lezi-baba [lie-(old.)
woman, loose woman or man].
In fact, it is hard to come up with an alternative explanation for the creation of
hundreds of such brilliant and humorous insults. The vast number of these com-
pounds (reported to have been in the thousands in medieval times) clearly exceeds
what is needed for just survival. Such excess is typically ascribed to sexual selection
forces. According to Miller (2000: 369): “if language evolved in part through sexual

28
In his own words, “at the time of Renaissance, Ronsard introduced [VN compounds] in a new and
original manner as epithets: Jupiter lance-tonnerre, le soleil donne-vie, Hercule porte-massue . . . It would be
well could French poets again make use in lofty poetry of this class of epithets; for they may attain Homeric
breadth” (Darmesteter 1934: 443).

choice as an ornament or indicator, it should be costly, excessive, luxuriant beyond
the demands.”29
In this respect, Franks and Rigby (2005) have performed experiments which found
that males increase their creativity with language not only in the presence of
attractive females, but also in the presence of male competitors. They did not find
either of these effects with female subjects. Their test phrases involved unusual
combinations of e.g. two nouns (such as book bicycle), and the subjects were asked
to provide possible referents for such noun phrases. As independently established,
the relation interpretation (such as “a bicycle for delivering books”) is considered by
females to be less creative than property interpretations (such as “a book with two
wheels”). Their study thus provides evidence that males even today display their
creativity and cognitive skills by using language. As pointed out in e.g. Miller (2000)
and Franks and Rigby (2005: 208), human mate selection often involves display of
cognitive traits by creativity in language use. Moreover, creativity is considered to be
highly correlated with intelligence (Miller 2000).
The possibility that sexual selection played a role in evolving syntax is consistent
with the findings reported in e.g. Ullman (2008) that there is a gender difference
when it comes to language processing (see also Pinker and Ullman 2002), as
discussed in Chapter 7.

6.7 Corroborating evidence and testing grounds

As argued in Section 6.3, the simple, paratactic VN compound structure provides a
foundation/scaffolding for building hierarchical -er/-ac compounds in English and
Serbian. Another reason to consider exocentric VN compounds as derivationally
related to their -er/-ac counterparts comes from language acquisition studies, as
pointed out in e.g. Lieber (1992). In addition, Clark, Hecht, and Mulford (1986)
conducted an experiment in which they prompted children to produce novel -er
compounds (see also Clark and Barron 1988). At around three, children consistently
produced VN compounds such as “grate-cheese” instead of “cheese-grater,” “rip-
paper” instead of “paper-ripper,” and “bounce-ball” instead of “ball-bouncer.”
According to the authors, children begin by forming compounds with VN predicate
order, basically from verb phrases (Clark, Hecht, and Mulford 1986: 26).30 This seems
to indicate that children start with the foundation, before they can build the
suprastructure.

29
As pointed out by a reviewer, there is no reason to believe that there were that many compounds at
the onset of the paratactic stage, and I am certainly not claiming that. The sheer number of these
compounds attests to their enormous creative potential, as well as to the fact that people got very good
at creating them at some point, for some reason.
30
While many have reconstructed SOV as the proto-world word order (e.g. Givón 1979; Newmeyer
2000; see also Aske 1998; Lightfoot 1979; and Section 3.1), according to Miller (1975), the oldest
reconstructible stage of IE (Indo-European) may have been VSO. Miller (1975: 32) notes that in IE the productive
compound type was SV, OV, but that VS, VO was archaic and residual. IE also had a marked conjunct
order, with the verb at the beginning (Watkins 1963), another residue of VS order. Lehmann (1969: 12f)

At the next stage, there is a tendency to produce compounds with a misplaced affix:
“dry-hairer”/“dryer-hair” (cf. the target “hair-dryer”) and “fix-biker”/“fixer-bike” (cf.
“bike-fixer”). It is only later that children begin to place the noun before the verbal
form, creating the adult NV-er order. At the very least, this finding supports the
proposal that VN compounds are more primary and simpler in structure than their
hierarchical counterparts, as they emerge earlier in language acquisition. Moreover,
the stages and struggles in the acquisition of these compounds reinforce the conclu-
sion that NV-er compounds are built upon the foundation of the paratactic VN
compounds (Section 6.3).
In addition, the imperative in general is among the first productive verbal forms
used by young children (Bar-Shalom and Snyder 1999). It is conceivable that the
imperative is a paradigm case of an unmarked mood form, and that for that reason it
emerges early, whether in evolution or acquisition. Moreover, as reported in e.g.
Bates et al. (1979), children’s early speech acts are manipulative, expressing wishes
and commands (the typical uses of optative, as per Section 6.4), while the informative
(declarative) speech acts emerge later. The acquisition data are thus consistent with
the view that the grammar behind VN compounds represents an evolutionarily
primary, foundational strategy.
In addition to language acquisition, there is some corroborating evidence from
language representation in the brain. It has been reported that swearwords are
processed by the more ancient structures of the brain, suggesting that they them-
selves might be ancient creations. According to e.g. Code (2005: 317), swearwords (as
well as some other non-propositional uses of language) might represent fossilized
clues to the evolutionary origins of human communication, given that their process-
ing involves the right hemisphere, basal ganglia, thalamus, and limbic structures.
Basal-limbic structures are phylogenetically old and the aspects of human communication
associated with them are considered to be ancient too (e.g. van Lancker and
Cummings 1999; Bradshaw 2001).31
Moreover, as pointed out in reference to other syntactic fossils, such as small
clauses (Chapter 2) and absolutives/unaccusatives (Chapter 3), neuroimaging experi-
ments can be devised to compare and contrast the processing of VN compounds and
their hierarchical -er counterparts (Progovac 2010b). The prediction of the proposal
in this chapter is that the processing of VN compounds involves less syntactic
activation in Broca’s areas, and less lateralization in the left hemisphere, but more
reliance on subcortical structures of the brain, and the right hemisphere. This would
especially be the case with the compounds involving swearwords, as per the discussion
above. The Appendix returns to this testing opportunity in more detail.

30 (continued)
claims that these verb-initial compounds are derived from underlying sentences having the order with the
verb preceding the object or subject. Perhaps (proto-)imperatives had a preference for initial verb order,
and the compounds that fossilize such imperatives are verb-initial. Needless to say, resolving this issue is
beyond the scope of this book. But, as pointed out in Section 4.4.5, word order in the two-word
proto-syntax stage was probably not fixed.
31
Note also that Tourette’s Syndrome, a disorder caused by basal ganglia-limbic connection dysfunction,
is characterized by involuntary production of obscene speech.

6.8 Concluding remarks

In conclusion, the grammar behind VN compounds is an excellent candidate for a
fossil of proto-syntax, involving the simplest possible merger of a verb-like and a
noun-like element. Little about these compounds makes sense except in the light of
evolution. They show rudimentary syntax, defying the most fundamental postulates
of modern morpho-syntax, including headedness and hierarchy. In addition, their
semantics is underspecified, with no differentiation between thematic roles, and
therefore between subjecthood and objecthood. Additionally, these compounds
specialize for derogatory reference, which invokes an explanation in terms of ritual
insult and sexual selection. Adding further to the exotic nature of VN compounds,
their verb surfaces in a (proto-)imperative form in some languages.
These crude compounds, typically exhibiting the most base and basic of vocabu-
lary, can nonetheless express abstract (human) traits not only with astounding
succinctness and vividness, but also with humor and playfulness. Using this kind
of compounding strategy at the dawn of language would have not only augmented
the expressive power of human language enormously (Chapter 7), but it would have
also provided a foundation for further vocabulary and structure building, in keeping
with the main theme of this monograph.

6.9 Appendix 1: Additional English VN compounds

As names in English (most are taken from Weekley 1916)

Ben-bow (bend-bow), Bere-water (bear-water), Bran-foot (possibly from brand-foot,
for animals/slaves), Break-speare, Burn-house, Catch-love (love = wolf), Cant-well,
Crake-bone, Cut-bush, Cut-fox, Cut-love (love = wolf), Cut-right, Culle-hare (culle = kill),
Culle-hog (culle = kill), Culle-bolloc (culle = kill), Do-best, Do-bet, Do-little,
Do-well, Doubt-fire (from arch. “dout” – in charge of furnace), Dread-nought,
Drink-low, Drynk-pany (drink penny), Drink-water, Eat-well, Gather-all,
Gather-cole (coal or cabbage), Gather-good (good = property, wealth), Go-lightly, Hab-good
(from “hap” = “to snatch”), Hack-block, Hack-wood, Hate-crist (crist = Christ),
Hop(e)-well (well = stream/pool), Hurl-bat, Kill-buck (place name in the state of New
York), Kis-sack, Lack-land, Lack-love, Love-gold, Love-good (probably good = God;
contrast with Hate-crist), Love-well, Make-joy, Make-mead, Make-peace, Mar-brow,
Mar-wood, Mean-well, Mend-market, Pass-field, Passe-low (cross-water),
Perce-forest (perce = pierce), Perce-val (pierce-vale), Pers-house (pers = pierce), Pil-beam
(pil = peel, barker of trees), Pinch-back, Porte-rose, Rack-straw (rack = rake), Rid-land
(rid = clear), Rid-wood (rid = clear), Save-all, Scare-devil, Scatter-good (good = wealth/property),
Shake-lady, Shake-lance, Shake-rose, Shak-shaft, Shake-speare,
Shake-staff, Shear-gold (coin-clipper), Shear-lock, Shear-wood, Shave-tail (shave = shove),
Spare-good (good = property, wealth), Spare-water, Spin-garn, Spyll-payn,
Stab-back, Stand-even, Stand-fast, Strangle-man, Swep-stak, Thack-well
(thatcher), Thumb-wood (cf. mar-wood; “thumb” archaic for “to handle clumsily”),
Tickle-penny, Tire-buck (tire = tear), Tread-away, Tread-gold, Tread-well (well = stream),
Trede-water, Trust-god, Tuck-well, Turn-bull, Turn-penny, Turn-pike,
Wage-spere, Wag-horn, Wag-staff, Wag-tail, Wast-all, Win-bow, Win-penny,
Win-rose, Wipe-tail, Wrynge-tail

Online dictionaries of slang; dictionary.com

(It is of note here that for some of these compounds it is not possible to tell if they are
VN or NN compounds.)

fuck-ass (fool), fuck-bag (disgusting person), fuck-ball, fuck-brain, fuck-buddy,
fuck-a-bush, fuck-chop (an imbecile), fuck-head, fuck-dog (dog-fucker), fuck-face, fuck-freak,
screw-ball, shit-ass, shit-bag, shit-bullets (terrified person; cf. Serbian Seri-sabljić,
Appendix 2), shit-bird, shit-head, shit-face, shit-stick

6.10 Appendix 2: Additional (mostly coarse) VN compounds as Serbian people and place names (taken from Mihajlović 1992)

Čepi-guz IMP/3SG ‘cork-butt’
Češi-guz IMP/3SG ‘scratch-butt’
Ćuli-brk IMP/3SG ‘stick-moustache’
Deri-gaća IMP ‘rip/tear-underpants’
Deri-kučka IMP ‘rip-bitch’
Deri-muda IMP ‘rip-balls’ (place name, a steep hill)
Draži-vaška IMP/3SG ‘tease-louse’
Gladi-kur IMP ‘stroke-dick’ (womanizer)
Gori-guzica IMP ‘burn-butt’ (person in trouble; cf. Burn-breeches)
Jebi-baba IMP ‘fuck-old.woman’ (unselective womanizer)
Jebi-sestra IMP ‘fuck-sister/cousin’
Jebi-vetar IMP ‘fuck-wind’ (charlatan)
Kapi-kur IMP ‘drip-dick’ (name of a slow spring)
Kosi-noga IMP ‘skew-leg’ (person who limps)
Kovrlji-guz IMP ‘drag-butt’
Kradi-gaća IMP ‘steal-underpants’
Krpi-tur IMP ‘patch-butt’ (poor person)
Laj-kučka IMP/3SG ‘bark-bitch’ (loud and obnoxious person)
Lezi-baba IMP/3SG ‘lie-old-woman’ (loose woman or man)
Lezi-tetka IMP/3SG ‘lie-aunt’ (loose woman or man)
Liz-guz IMP/3SG ‘lick-butt’
Muz-govno IMP/3SG ‘milk-shit’
Nabi-guz IMP/3SG ‘shove-butt’
Neper-gaća IMP/3SG ‘no-wash-underpants’
Peči-govno IMP/3SG ‘burn-shit’
Piš-kur IMP/3SG ‘piss-dick’
Plači-guz IMP/3SG ‘cry-butt’ (cf. cry-baby)
Plači-pička IMP ‘cry-cunt’ (vulgar version of cry-baby)
Plaši-vranac IMP/3SG ‘scare-crow’
Poj-kurić IMP/3SG ‘sing-dick’ (womanizer)
Prdi-kučka IMP/3SG ‘fart-bitch’
Prdi-vuk IMP/3SG ‘fart-wolf’
Prdi-zec IMP/3SG ‘fart-rabbit’
Prti-mud IMP/3SG ‘carry-balls’
Puš-kur IMP/3SG ‘smoke-dick’
Razbi-dupe IMP/3SG ‘break-butt’ (steep terrain)
Seri-sabljić IMP/3SG ‘shit-sword’
Seri-vuk IMP/3SG ‘shit-wolf’
Visi-guz IMP ‘hang-butt’
Vuci-guz IMP ‘drag-butt’ (slow-moving person)
Vuci-klašnja IMP ‘drag-stockings’ (carelessly dressed person)
Vuci-kuja IMP ‘drag-dog’ (stray dog)

The plausibility of natural selection for syntax

“Evolution is the only physical process that can create an eye because it is the only
physical process in which the criterion of being good at seeing can play a causal role”
(Pinker and Bloom 1990: 710)

7.1 Concrete and selectable advantages accrued by each stage

This chapter considers how each new postulated stage of syntax accrues concrete
communicative advantage(s) over the previous stage(s), and how such advantage(s)
would have been subject to natural/sexual selection. More specifically, I show how
the progression from the one-word stage (no syntax), to paratactic two-slot syntax
(Section 7.2), to the hierarchical vP/TP stage (Section 7.3), brings about clear incremental
communicative benefits. Section 7.4 details one concrete hypothetical scenario for
progressing from one stage to the next, invoking sexual selection. Section 7.5 con-
siders how these syntactic stages may fit into the timeline of human evolution.
Doubt has been repeatedly expressed regarding the possibility that aspects of
syntax were naturally/sexually selected. Most of the dismissive reactions mention
abstract syntactic postulates such as Subjacency (Chapter 5), or EPP (requirement
that every clause has a subject), pointing to the improbability of such principles being
sexually selected given that even their status in syntax is not clear, let alone their
usefulness to survival. As famously put by Lightfoot (1991: 69), “Subjacency has many
virtues, but [ . . . ] it could not have increased the chances of having fruitful sex.”
However, as concluded in this book, phenomena associated with Subjacency are
not the essence of syntax. In fact, islandhood effects are largely unexplained and
poorly understood phenomena in syntax, and their characterization still remains at
the level of observation and description, as discussed at length in Chapter 5. Given the
framework developed in this monograph, it transpires that islandhood is in fact an
epiphenomenon of evolutionary tinkering, that is, an ancient, foundational state of
grammar, which does not sanction Move. To put it slightly differently, islandhood is
the default state of proto-grammars, and only some relatively recent, innovative
constructions can override this default state, giving an illusion of Subjacency. On my
approach, which illuminates the issue from a shifted perspective, it is the ability to use
these innovative constructions, and the communicative benefits that come with
them, that would have been selected, rather than Subjacency. So, in order to explore
a gradualist approach to syntax, one needs to decompose syntax along dimensions
which are concrete and specific enough for selection to target, and at the same time
consistent with the basic theoretical postulates of syntax. This is the most important
pursuit of this monograph.

Evolutionary Syntax. First edition. Ljiljana Progovac
© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

Speaking in broad terms, my argument is that (more) complex syntax brings about
adaptive advantages in the following ways. First, it automatizes/streamlines the
expression of various syntactic phenomena, such as transitivity, tense, subordination,
which can otherwise be expressed, but only vaguely and with less speed and precision
(see also Pinker and Bloom 1990, among others). One rather amazing property of
human language is the speed with which we can talk and understand others talking.
The less we have to guess as to what goes with what, and the more we can rely on
subconscious, automatic processes to arrange the basic information, the faster and
more undistracted our speech will be. Syntax does its part by providing that frame
which organizes the information in a reliable and predictable manner.
Second, and related to the advantage identified above, by offering more precision in
expression, more complex syntax allows us to escape the vast vagueness associated with
underspecified proto-grammars and to break away from the here-and-now, as well
as from the prison of pragmatics more generally, enabling the famous displacement
property of human language (see e.g. Hockett 1960). Third, and related to the above, given
that a more complex grammar is much more self-sufficient and much less reliant on
pragmatics, it is much better at expressing strange, even non-existent concepts, contri-
buting to the capacity for novelty and creativity. Several other properties of language only
emerge in the later stages of syntax, including hierarchy and the capacity for recursion.
The following two sections break down these general advantages into specific ones.

7.2 From one-word to two-word utterances: Vagueness galore

A progression from one-word to two-word stage, i.e. from no syntax to rudimentary
(paratactic) syntax, would have brought about enormous communicative advantages
to our ancestors. While syntacticians often dismiss any precursors to complex syntax
as irrelevant or even impossible, the argument that I build throughout this mono-
graph is that this simple paratactic syntax is the foundation for any further develop-
ments with syntax, the scaffolding without which it would not have been possible to
reach the complex realm.1 As such, the emergence of productive two-slot syntax may
have been the most dramatic breakthrough in the evolution of human language. In
this section I focus on the advantages of this stage brought to light by the concrete
proposals and fossil evidence discussed in this monograph, but there is no doubt that
there would have been many more advantages of transitioning into a two-word stage.

1
Jackendoff (1999, 2002) also considers that previous stages of evolution, such as Bickerton’s (1990)
protolanguage, provided a foundation for subsequent stages. Jackendoff and Wittenberg (2014) also

Consider now some hypothetical examples that can shed light on the communi-
cative possibilities in the one-word stage (1). One should keep in mind that one-word
utterances would continue to be available in the two-word stage as well, as per the
theme of this monograph that the emergence of a new stage preserves the achieve-
ments of the previous stages:
(1) Snake! . . . Gone! . . . You! . . . Out! . . . Eve! . . . Run!
The string in (1) could mean that a snake has been spotted, and that you should be
gone and out, and that Eve should run, too. Or it could mean that the snake was
spotted, but is now gone, thanks to you, and now Eve should go out and run. Or
maybe you should run to save Eve. There are various other possibilities for (1) as well,
each conveying very different messages. The one-word stage would have been
frustratingly vague, at least from the point of view of the modern person. Still, at
the point when first words emerged, they would have been a source of joy, a novel
device for display, in addition to being somewhat informative. Darwin (1872) argued
that neophilia, i.e. love of novelty, was an important factor in the diversification and
rapid evolution of e.g. bird song. Primates in general are extremely neophilic, and this
is certainly the case with humans. This clearly has important implications for sexual
selection of language.
On the other hand, the two-word stage, as postulated in Chapters 2 to 4, would
have been able to express basic intransitive (absolutive-like) propositions (or predi-
cations) by combining a verb-like and a noun-like category, as illustrated in the
following examples, analyzed here as fossils of this stage:
(2) Come winter, . . . Problem solved.
(3) Pao sneg. Stigla pošta. (Serbian)
fell.PART snow arrived.PART mail
(4) Ayam makan (Riau Indonesian, Gil 2005)
chicken eat
‘The chicken is eating.’
‘Somebody is eating the chicken.’
(5) rattle-snake, cry-baby, scare-crow, hunch-back

1 (continued)
emphasize the layering and preservation of older stages. However, they characterize their stages and
layering differently, as discussed in Section 1.6.

Early child language also abounds in two-word utterances. Consider the following
hypothetical small clauses used in the two-word stage:
(6) Marie cut. Me go. Eve gone. Snake roll.
Even though two-word grammars are still quite underspecified, the vagueness is
significantly reduced in comparison to the one-word stage.2 Here, in the two-word
stage, it is at least clear which verb is associated with which noun, and it is also
typically the case that the referent of the noun is a major participant of the action
specified by the verb, providing evidence of (proto-)predication, as characterized in
Section 3.4.2.
If the proto-words from (6) were not grouped into (small) clauses (7), many more
options for interpretation would be readily available, including the following one,
highly unlikely for (6): “Look at Marie. She is cutting me. Go, Eve. The snake is gone.
Roll now.”
(7) Marie . . . Cut . . . Me . . . Go . . . Eve . . . Gone . . . Snake . . . Roll . . .
Still, as discussed in the following section, it takes hierarchical syntax, such as vP and
TP layers of structure, to unambiguously distinguish between e.g. subjects and
objects. This is exactly the scenario compatible with the incremental, step-by-step
evolution of syntactic complexity, in response to communicative pressures to reduce
vagueness in the expression of argument structure.
There is one more characteristic that goes hand-in-hand with vagueness and
reliance on context, especially when it comes to distinguishing subjects from objects.
The pragmatic context can easily give a clue as to who is eating what in sentences
such as (4). If we are observing a chicken walking in a yard, then the first interpret-
ation in (4) would make sense; if we are observing a chicken on a plate, then only the
second interpretation would make sense. However, the syntax of (4) on its own does
not distinguish these possibilities. But notice now that by utilizing the pragmatic
context we catch ourselves expecting the utterances to make pragmatic sense, to be
consistent with how the world is. If so, at the postulated absolutive-like stage, it would
have been hard to express something very strange, or plain impossible, such as a
chicken eating Tom, or an apple eating a chicken. This means that displacement,
understood not only as a shift away from the here-and-now, but also as a shift away
from the realm of probable or possible, would have been much harder to realize with
two-slot grammars, and especially with one-word utterances, as also discussed in the
following section. The property of displacement, understood in this way, would only
flourish with the hierarchical stage, and it might have in fact been a major driving
force behind the evolution of the hierarchical stage.

2
It is essential to have an appropriate standard of comparison whenever we talk about adaptiveness or
usefulness of syntax; that is, we need to ask the question: “more useful or more precise in comparison to
what?” Likewise, when one claims that language/syntax is not good for communication, the question is
again: “in comparison to what?”

However, this two-word stage can already piece together a way to express transi-
tive events. In addition to combining two words into a (small) clause, two-slot
grammars can also paratactically combine two such clauses. This most probably
would have been a later development, and perhaps a different sub-stage in the
paratactic stage in the evolution of syntax, but it is still in essence a paratactic,
symmetric stage. Recall from Chapter 3 that the earliest stages of Nicaraguan Sign
Language tend to use sequences of two intransitive clauses when dealing with
multiple animate arguments, such as (9) or (10), in lieu of transitive structures,
such as (8) (Kegl, Senghas, and Coppola 1999: 216–17; Senghas et al. 1997; see also
Aronoff et al. 2008 for Al-Sayyid Bedouin Sign Language):
(8) *WOMAN PUSH MAN.
(9) WOMAN PUSH—MAN REACT.
(10) WOMAN PUSH—MAN FALL.
The paratactic binary grammar above can already express transitivity, but in a
roundabout way, and not as directly and unambiguously as a true transitive sentence
such as (8) would. This is exactly the claim here, that language evolved in the
direction of streamlining and automatizing the expression of certain syntactic phe-
nomena, including transitivity, starting from a stage in which such grammatical
phenomena could be expressed, but only with vagueness/imprecision, and with the
help of context.
One can find such binary clause fossils in a variety of languages, as discussed in
Chapter 4. Most of them are used to express some temporal or causal relationships
between two clauses.3
(11) a. Come one, come all.
b. Easy come, easy go.
c. Garbage in, garbage out.
(12) a. Wo dua, wo twa (Twi)
you sow you reap
b. Wo hwehwea, wo hu
you seek you find
The symmetrically/paratactically attached clauses above (created by the Operation
Conjoin of Chapter 4) are interpreted as linked, but merely by iconic means. The
event of the first clause is taken to precede and/or cause the event of the second
clause. This is iconic because it mimics the intended ordering of the two events.
Reversing the order of these clauses would completely change their meaning (e.g.
??Easy go, easy come; ??You reap, you sow). In contrast, fully fledged hierarchical
counterparts are not subject to this ordering condition (e.g. You will reap (only) if you
sow), as they rely on functional categories and syntax to express their meaning in a
more self-reliant fashion. This iconicity of ordering is also relevant for the precursors
to transitivity, such as (9–10). There, again, the causing event is placed before the
caused event (as also discussed in Sections 3.4.1 and 1.6).

3
Thanks to Kingsley Okai (p.c. 2011) for the Twi data. See many more examples from other languages in
Section 4.2.

The data discussed in this section illustrate two points. First, unlike one-word
grammars, two-slot paratactic grammars can express, with some consistency, ele-
mentary predication by combining e.g. a verb and a noun, as well as some temporal/
causal relationships between two clauses. The two-slot grammars are thus more
precise and more expressive than one-word grammars, but less precise and less
expressive than hierarchical grammars, suggesting again an incremental increase in
communicative capabilities, exactly the scenario in which evolutionary forces can
operate.
But the communicative/expressive advantages of a two-word stage certainly do not
end here. As shown in Chapter 6, exocentric VN compounds are fossil structures
which specialize for derogatory reference, and which provide evidence of ritual
insult/sexual selection for (simple) syntax (see Progovac and Locke 2009; Progovac
2012). While it is certainly possible to insult somebody by using single words, one’s
ability to create stunning insults increases by leaps and bounds if one can combine
two proto-words (see Section 7.4 for a detailed evolutionary scenario).
(13) cry-baby, busy-body, turn-coat, kill-joy, pick-pocket, fuck-head
(14) ispi-čutura (drink.up-flask—drunkard), guli-koža (peel-skin—who
rips you off), cepi-dlaka (split-hair—who splits hairs), muti-voda
(muddy-water—trouble-maker), jebi-vetar (screw-wind—
charlatan), vrti-guz (spin-butt—fidget); tuži-baba (whine-
old.woman—tattletale) (Serbian)
The VN compound data taken from a variety of languages make it clear that these
compounds typically combine basic, concrete words, often denoting body parts and
functions, in order to express vivid and memorable abstract concepts. Selecting for the
ability to quickly produce (and interpret) such compounds on the spot would have gone
a long way toward not only solidifying the capacity to use paratactic two-slot grammars,
the foundation for more complex grammars, but also the capacity for building abstract
vocabulary. As discussed in Section 7.4, sexual selection for the capacity to produce and
interpret such compound insults would have been one of the factors driving the
progression from the one-word stage to the two-word paratactic stage.
In sum, having a bunch of single words to refer to individuals or actions without
using syntax (one-word stage) would have already been very useful to our ancestors,
much more useful than not having any words at all, but much less useful than having
two-word grammars, as the latter provide clear and concrete communicative benefits.
The next section considers the advantages of transitioning to the stage of
hierarchical syntax.

7.3 From the two-word stage to hierarchical syntax: Evolving transitivity, displacement, and recursion

7.3.1 Introductory remarks
If my proposal in this monograph is on the right track, then some of the design
features of human language, such as displacement (see e.g. Hockett 1960; Hockett and
Altmann 1968), emerged through evolutionary tinkering, that is, by gradually evolv-
ing hierarchical grammars (with e.g. vPs and TPs). Such hierarchical grammars offer
enough precision to be able to stand on their own, without much reliance on context,
and thus to describe situations that are distant, non-existent, or that even challenge
common sense. These capabilities increase significantly in the transition from the
two-word stage to the hierarchical stage, given that in this stage one does not have to
rely much on pragmatic context, which typically confines one to the available and the
observable. Both displacement and recursion would have been very difficult to
express, if possible at all, in the two-word stage, as discussed in this chapter.
My claim is thus that these amazing properties of human language did not just
materialize out of thin air, but that they had to evolve through a painstaking process
of scaffolding and tinkering, with two-word grammars providing the ultimate syn-
tactic foundation. This process was guided by the evolutionary pressures to tinker
with these grammars to be ever more and more expressive, ever more and more
precise, and to be able to communicate more and more kinds of ideas. It must have
taken constant and relentless acts of creativity and novelty on the part of our
ancestors to get us where we are now. And many good hominins, with otherwise
perfectly good genes, had to make room for those who happened to be just a bit better
at this game.4

4
More precisely, just a little bit better at whatever the local game with language was, in that particular
location, at that particular time.

7.3.2 Grammaticalizing tense

As discussed in Chapter 4, progressing from the paratactic (two-word) stage to the
hierarchical functional category stage may have proceeded through a linker/proto-conjunction
stage, where the linker initially served only to solidify proto-Merge, that
is, to provide more robust evidence of combinatorial syntax. Perhaps the initial
meaningless linkers occurring between a subject and a predicate of a small clause, or
between two clauses, gradually became tense particles, or subordinators/complement-
izers, enabling more automatic expression of tense/time reference, as well as subor-
dination. Or perhaps tense markers developed from other sources, such as verbs.
Whatever the source, the emergence of the sentential functional projection such as
TP renders more automatic and undistracted the expression of temporal, modal, and
other properties, allowing speakers to break away from the here-and-now more easily
and more efficiently. As also shown in Chapter 2, tenseless root small clauses in
modern languages tend to specialize for the here-and-now, and cannot be easily
modified by adverbials such as “three years ago,” either in Serbian or in English:
(15) a. *Stigla pošta pre tri godine.
arrived mail before three years
b. *Pao sneg pre tri godine.
fell snow before three years
(16) a. *Case closed three years ago.
b. *Me first three years ago!
This kind of specialization is possible in modern languages when they also exhibit the
alternative TP strategy:
(17) a. Pošta je stigla pre tri godine.
mail AUX arrived before three years
b. Sneg je pao pre tri godine.
snow AUX fell before three years
(18) a. The case was closed three years ago.
b. I was first three years ago!
In the literature on evolution, evolving multiple means to the same end is considered
to create the opportunity for the evolution of specialization through the division of
labor (e.g. Carroll 2005), as pointed out in Section 2.2. The retention of these small
clause fossils in the here-and-now contexts can be explained if more complex
grammars do not bring about a tangible advantage in these contexts, i.e. if they are
an overkill in these contexts. An example of a grammatical overkill would be to use
“The point is being taken,” in lieu of “Point taken”; or Serbian “Sneg je pao” (Snow is
fallen) in lieu of “Pao sneg.” Another more subtle example of a grammatical overkill
would be to use Serbian “On me udara!” (He me hits), as opposed to the middle “On
se udara!,” as per the discussion in Section 3.4.2. See also Du Bois (1985) for the
preference to use intransitive underspecified grammars in discourse, as discussed in
Section 3.5.
There are also languages across the world that do not show an obligatory grammaticalized
TAM (tense/aspect/mood) system, but can optionally express the relevant
temporal/modal properties through the use of adverbials (see e.g. Skou, a
Papuan language, and Riau Indonesian (Gil 2014), and Tongan (Churchward
1953)).5 Indeed, according to Gil’s (2014) study based on a sample of 868 languages,
377 are categorized as having optional TAM marking, while 491 are classified as
having obligatory/grammaticalized TAM marking. This shows that variation in this
respect is not only possible but widely attested. This is again consistent with the
gradualist approach explored here, which postulates only a paratactic, small clause
foundation as the common syntactic core. Beyond the paratactic platform, languages
will diverge with respect to where and what to build on top of this foundation (see
Section 7.5 for more discussion).
In various other modern languages, including Russian (e.g. Pesetsky 1982) and
Hebrew (Rothstein 1995), one finds mixed systems, or perhaps we should call them
split systems, on analogy with the split systems attested in case marking, as discussed
in the following section. In these systems the present tense in general remains
unmarked, while the other tenses show obligatory TAM markings:
(19) Ivan veren. (Russian)
Ivan faithful
(20) Ivan byl veren.
Ivan was faithful
The present tense in the formal semantics literature is normally characterized as
“coincident with the time of the context in which the sentence is produced”
(Chierchia and McConnell-Ginet 1990: 266). If so, then grammaticalizing present
tense is somewhat superfluous, a case of overkill, as it does not bring about a clear
advantage. Most importantly, this indicates that the TAM system in one single
language can be split/mixed, involving either TPs or TP-less structures, with the
split aligning with communicative considerations. As will be shown in Section 7.3.3,
similar mixed systems exist in the realm of transitivity, the so-called split-ergative
systems.
A hierarchical TP system makes it easier to express claims about the (distant) past,
as well as to make future and counterfactual claims, all hallmarks of displacement.
This is not to say that these notions cannot be expressed without functional categor-
ies and projections, perhaps through the use of loosely adjoined adverbs. This is only
to say that functional projections such as TP facilitate a more automatic, unambigu-
ous, and undistracted way of expressing such notions.

5
See also Section 2.2 for Kiparsky’s (1968) claim about pre-Indo-European in this respect.

7.3.3 Grammaticalizing transitivity

Breaking away from the here-and-now, and from the prison of pragmatics in general,
which is the essence of displacement, is much easier to achieve with more articulated
grammars. As pointed out above, the two-word stage does not distinguish between
subjects and objects, and it is typically pragmatics that sheds light on the intended
meaning of the sentence in this regard. The same certainly holds of one-word
“grammars.” So, imagine encountering the following one-word (21) and two-word
(22) utterance sequences in the early stages of language:
(21) Apple . . . Eat . . . Go . . . Tom.
(22) Apple eat . . . Go Tom.
These kinds of utterances are much less precise (more vague) than a corresponding
hierarchical sentence such as (23) below, and can receive many interpretations in
addition to the one in (23):
(23) Tom will (go and) eat the apple.
But it is important to point out that the meaning that does not readily come to mind
with respect to (21–22) is the one expressed in (24), even though there is nothing in
the form of these utterances that would preclude that interpretation.6
(24) The apple will (go and) eat Tom.
This suggests that pragmatically odd (or impossible) notions can hardly be expressed
without hierarchical syntax, given that simple underspecified structures are in close
alliance with pragmatics. In this respect, adding a layer of transitivity makes it
possible now to unambiguously make “the apple” the subject of eating a human
being (24), no matter what the context is, or what common sense tells us.
As discussed in Chapter 3, there is an abundance of “fossils” of this two-slot
absolutive-like stage across languages. For example, absolutive-like constructions
characterize nominals, certain middle constructions, as well as certain compounds,
even in nominative-accusative languages. Ergative-absolutive languages exhibit such
absolutive structures in the verbal domain as well, at least in some cases, given that
most of the languages classified as ergative are in fact not purely ergative, but
split-ergative systems.

6
A reviewer suggests that the reading in (24) should be obtainable even from (21/22). This is something
that can be subjected to psychological testing to determine the statistical likelihood of interpreting (21/22)
as something pragmatically implausible, such as (24).

Certain types of splits in split-ergative languages provide a compelling argument
for the claim that transitivity serves to alleviate vagueness. Many split-ergative
languages are mixed languages in the sense that they are ergative-absolutive with
some types of nominals, such as inanimates, but nominative-accusative with others,
such as animates (e.g. Gair 1970 for Sinhalese, a language spoken in Sri Lanka).7
Clearly, this kind of split aligns with communicative considerations, as the ambiguity
is much more likely to arise with the animates than with the inanimates. As
Tchekhoff (1973: 285) notes in connection to Tongan, “a yam cannot eat any more
than a box can dig a hole.”
Consider again some hypothetical examples from the postulated absolutive-like
stage:
(25) Yam eat. Tree cut. Water drink.
(26) Chicken eat. Chicken cut. Boy cut.
While there is little possibility of confusion with (25), where “yam,” “tree,” and
“water” are readily interpreted as objects of the expressed actions, there is often a
possibility of confusion with animates (26), as they can both cut and be cut, and they
can both eat and be eaten. Thus, the split in these cases is designed to reduce
vagueness where it is likely to arise, as per the discussion in e.g. Comrie (1989:
124–37).
Comrie (1989: 130) in fact observes that there are some languages, such as Hua
(spoken in Papua New Guinea), where the occurrence of a special transitivity marker
(e.g. accusative) is “conditioned not by any specific rigid cut-off point in the animacy
or definiteness hierarchies, but rather . . . [by] the assessment of likelihood of confusion,”
which is left to the speaker in the particular context. This is a clear case of
syntax responding to communicative considerations in modern times, which more-
over illustrates exactly the kind of transitional scenario that could have paved the way
toward grammaticalized transitivity. Aissen (2003) looks at a variety of languages
with what she terms “differential object marking (DOM),” which include ergative/
accusative splits, and concludes that DOM is a compromise between two contradict-
ory principles, Iconicity and Economy. For her, Iconicity is at work when overt case
marking occurs on an object which can easily be confused for a subject, while
Economy simply avoids any case marking.

7
The animacy scale that these processes typically target can be expressed as:
(i) Human > Animate > Inanimate (see e.g. Silverstein 1976; Aissen 2003)
Another dimension along which case marking can split is definiteness (Aissen 2003). As reported in e.g.
Dixon (1994), in the Australian language Dyirbal, pronouns denoting first and second person adopt the
nominative-accusative pattern, while the rest of the nominals, those lower on the animacy/definiteness
hierarchies, adopt an ergative-absolutive pattern.
There are various other kinds of ergative/accusative splits, including those which are based on aspect.
A discussion of these is outside the scope of this monograph. My goal here is simply to show that there exist
ergative/accusative splits which clearly align with communicative considerations. I am not claiming that all
splits necessarily do.

Recall from Section 7.3.2 that in mixed/split TAM systems, present tense usually
remains unmarked, possibly TP-less, as it is the context in which more complex TP
marking does not bring about a tangible advantage. The same considerations hold in
the realm of transitivity in this case: simpler, vP-less structures are used in those
(typically inanimate) contexts in which more complex vP structures yield no significant
communicative advantage. This ties in well with one of the themes of this book,
that simpler, fossil structures co-exist with more complex structures because they are
more economical, and because there are many situations in which more complex
structures are simply overkill.
The DOM discussed above is also found in nominative-accusative languages.
As pointed out in e.g. Comrie (1989: 132) and Gil (2014), in languages in which
accusative marking is optional, it typically occurs on animate and/or definite nouns,
but not on inanimates, as illustrated below for Spanish and Serbian:
(27) El hombre vio a la mujer. (Spanish)
the man saw ACC the woman
(28) El hombre vio la silla.
the man saw the chair
(29) Milan donosi jež-a. (Serbian)
Milan brings hedgehog.ACC
‘Milan is bringing a hedgehog.’
(30) Milan donosi Jež.
Milan brings hedgehog
‘Milan is bringing (the magazine called) Jež.’
This leaves enough room for the view that the key syntactic properties, including
transitivity and Tense/TP, emerged for communication purposes, and gradually so.
This is not consistent, however, with the view that syntax, in all its complexity, arose
only once, as a single event, shortly before the dispersal of H. sapiens out of Africa
(e.g. Chomsky 2005; Berwick and Chomsky 2011), as discussed in Section 7.5. Neither
is this consistent with the concomitant view that communicative considerations
cannot have anything to do with (the emergence of) syntax.
A reviewer suggests that the saltationist view mentioned above does not necessarily
exclude the possibility that vP, TP, and CP emerged later through grammaticalization
processes, and that what emerged as a single mutation were the “design features” of
language, such as Merge, Move, recursion. However, as I have argued, these design
features of syntax cannot be there without the functional categories in question. If the
whole package of syntax emerged as one single event, as per saltationist claims, then
this package would have certainly included the functional projections/categories, as
they are the postulates of syntax upon which all the other postulates rest. Take away
vP, TP, CP, DP, and other functional categories, and you get pretty much what I have
reconstructed here, a precursor to language operating with short (and flat) small
clauses, with hardly any syntax to speak of (see Section 1.1 for Berwick and
Chomsky’s 2011 view about precursors; also Section 1.6). But, much more import-
antly, one should not be second-guessing what these saltationists might have had in
mind, or what they might have in mind in the future. If somebody is proposing
something, then they should make their proposal fully explicit, as well as under-
standable and vulnerable to verification.
The reviewer further suggests that communicative considerations can play some
role in Chomsky’s (2010) and Berwick and Chomsky’s (2011) view as well, even
though their view is that language emerged (in full) to facilitate thought (inner
speech), rather than communication; once this thought system was externalized
(e.g. pronounced), then it could have proved useful for communication as well.
More precisely, according to Berwick and Chomsky (2011: 40–1), “in the very recent past,
maybe about 75,000 years ago, . . . an individual . . . underwent a minor mutation that
provided the operation Merge,” which brought about recursive structured thought. It
was at some later stage that the language of thought was connected to the external
speech, “quite possibly a task that involves no evolution at all.”
However, what I have proposed in this book is not just that once language in its full
complexity arose (for some other purpose), it so happened that it was also useful for
communication. My proposal here and elsewhere is that communication pressures
were the very reason why language evolved. These communication considerations
shaped the very design of human language, and determined each incremental step of
its evolution. Furthermore, on my approach, each new stage relies heavily on the
previous stage, and the innovations it introduces are small and incremental, so that
they can be understood and negotiated by the rest of the community as soon as they
arise. This incremental approach removes any rationale for the claim that language
could not have evolved for communication purposes.8

8
As for Berwick and Chomsky (2011), one reason for their proposal that syntax and Merge were initially
useful only for thought, but not for communication, has to do with that one person in their evolutionary
scenario who got the language mutation. Their argument is that this one single person would not have had
anybody to communicate with, and that communication could start only much later, after this mutation
was passed down through several generations. This kind of conundrum only arises if you insist that
language/syntax arose as one single event/mutation, but not if you envision an incremental, gradualist
approach, with precursors, as discussed in the preceding and following sections of this chapter.

Going back to the communicative benefits of the vP/TP-equipped syntax, it is
useful to recall that the more vague the expression, the more it relies on the
pragmatic context and on pragmatic plausibility in general, because of our tendency
to seek pragmatic sense. In contrast, more complex language/syntax is better able to
take us away from the concrete and observable in the here-and-now, toward what is
less concrete, and less observable, and ultimately to what is non-existent or plain
bizarre. While our ability to talk nonsense is clearly not always advantageous, one
would have to concede that it makes it easier to talk about things that do not exist, but
that might have existed, or might exist in the future. It is easier to describe a different
world, and then perhaps to change the world so it fits this new description. Or, if one
is skeptical about displacement being adaptive in this way, there certainly remains the
great potential for using language to amuse and surprise, i.e. for display purposes.
In other words, the evolutionary pressures to proceed to a transitive (hierarchical)
stage would have included the following tangible benefits: (i) more precision in the
expression of argument structure; which in turn leads to (ii) capacity for displace-
ment, not only in the sense of temporally/spatially removed (relevant also for TP),
but also removed from the ordinary, common sense, or plausible; which in turn
opens doors to (iii) the capacity to tell amazing and entertaining stories; or just to
(iv) stun and amuse with novel and fantastic claims. Again, it is entirely possible that
those who were just a bit more creative with using language in these ways left more
offspring than the rest, leading to the spread of this capacity. As pointed out in
e.g. Tallerman (2013b: 95), even in modern societies the most eloquent speakers tend
to be granted the highest status, which in turn is correlated with greater reproductive
success (e.g. Locke 2009 and references there). In this respect, Miller (2000: 350)
points out that the speaker benefits much more from holding the floor, than the
hearer benefits from listening.

7.3.4 Recursion
As argued in Chapters 2, 3, and 4, it is only in the hierarchical stage that true
recursion becomes available, making it possible to embed, repeatedly, e.g. one
point of view within another.9 Recall that fossil structures, such as tenseless root
small clauses in English (31) and Serbian (32), cannot embed at all, in contrast to full
finite CPs (33, 36), which show infinite recursion:
(31) *Him worry [me first]? *Him happy [problem solved]?
(32) *Ja mislim [(da) stigla pošta].
I think (that) arrived mail
(33) Ja mislim [da ti znaš [da je stigla pošta]].
I think that you know that AUX arrived mail
As pointed out in Chapter 4, full CP structure, with a designated complementizer, may
be necessary to realize the full (unlimited) recursion potential in the clausal domain.

9
As pointed out in the previous chapters, I adopt the typical, standard characterization of recursion in
linguistics, according to which recursion refers to the ability to repeatedly embed/insert one type of
category (e.g. a CP or a DP), within another category of the same type (for a detailed discussion of this,
see Section 4.4).

As shown in detail in Section 4.4, it is possible to identify, by looking at the three
strategies for syntactic integration, a gradual progression toward achieving unlimited
recursion. While parataxis already allows the interpretation of one view
embedded within another (34), and coordination may allow two such levels of
embedding, with some special tinkering (35), it is only subordination, with a
designated functional projection, that can achieve unlimited recursion (36). In
that sense, parataxis and coordination provide excellent precursors to CP recursion
(Bouchard 2013: 60 also observes that parataxis is an alternative way of expressing
what subordination does).
(34) a. ?Marc is a linguist—[you know,] [Mary knows].
b. Marc is a linguist—[you know it,] [Mary knows it].
(35) Marc is a linguist, [and you know it,] [and Mary knows that].
(36) You know [CP that Mary knows [CP that Marc is a linguist]].
As pointed out in Chapter 4, it seems that all hierarchical phenomena considered in
this book, including subordination and transitivity, have alternative routes, involving
non-hierarchical, paratactic structures. This is consistent with the proposal in this
book that parataxis provides a precursor, a foundation for building hierarchical
structures.
Going back to the paratactic example in (34), the clauses in it should be
analyzed as occurring next to each other, loosely Conjoined, in the sense of
iteration, rather than true recursion (Kinsella 2009; see also Section 4.4). The nature of
the semantic link between the clauses will then be figured out pragmatically.
Perhaps one way to implement this distinction is to consider that a specific
functional category, such as CP (36), is processed in a direct, streamlined way by
the specialized syntactic areas in the brain, such as Broca’s area. On the other
hand, Conjoin (responsible for parataxis) may be delegated to more general and
more scattered processing strategies, which work quite well when only two ele-
ments are Conjoined, but which are challenged when multiple combinations are
attempted, as is the case with e.g. No come, no money, no shelter, as discussed
earlier (see Sections 3.1 and 4.4.2).
This is again exactly what evolutionary forces can operate upon: there is already a
precursor to recursion, that is, a precursor to the ability to embed one viewpoint
within another, but it is only good for one or two levels of such embedding.
In contrast, CP subordination, which specializes for this kind of embedding, gives
rise to unlimited recursion. With CP subordination, unlike with coordination or
parataxis, you do not need to figure out and guess what various pieces might or might
not be referring to—the syntax gives you no choice but to interpret each CP as
embedded within the higher CP. It is a fool-proof strategy. This is the sense in which
gradual, step-by-step, evolution should be understood: a new stage does not bring
about something totally new, but something just a bit more streamlined.

This also goes a long way toward answering some more general questions that
linguists sometimes pose, as discussed briefly in Section 7.3.3. For example, a reviewer
wonders how an innovation gets to be received or interpreted by the rest of the
community. In other words, how do the listeners know that the speaker is using vP
transitivity or CP recursion, if they themselves do not have it (yet)? As proposed in
Chapter 3, transitivity is something that also emerges gradually, step by step, and
there are precursors to it, so that the listeners are prepared to recognize when a more
streamlined expression of transitivity is being presented.10 As shown above (as well as
in Chapter 4), the same is true for the emergence of CP recursion—the precursors are
already in place, and nothing totally new is being introduced. That is in fact a
powerful argument in favor of the claim that the evolution of language had to
proceed in small increments, so that something that is already available in one
stage can become just a bit more streamlined and unambiguous in the next. While
the issue raised by the reviewer poses a problem for saltationist accounts (see
Footnote 8 for the saltationist response to this question), the gradualist approaches
to evolution in general are designed to address exactly these types of concerns.
One also must keep in mind that evolution is not a predestined or predetermined
course of progression to ever higher and brighter realms. It is full of random twists
and turns, and full of attempts and failures (see Section 7.3.5 for some discussion). In
other words, it is not that the transitive stage emerges as soon as one person utters a
verb, a subject, and an object in one breath. Many conditions have to be met for a
community of speakers to converge on an innovation like that, and even when all
such conditions are met, it is still up to chance whether the innovation will take hold
or not. But this is also the case with grammaticalization processes that take place in
modern times. Also, as discussed in Chapter 3, there exist different solutions to the
problem of transitivity, not just one perfect solution.
The same goes for CP and other types of recursion. Languages that do not
make use of finite subordination have been reported to exist today (see e.g. Dixon
1995 for Dyirbal; Mithun 1984, 2010, for various Native American languages; Everett
2005 for Pirahã). As pointed out in Chapter 4, languages like German and Serbian do
not exhibit DP recursion with possessives, of the kind illustrated below for English:
(37) John’s mother’s friend’s kitten
It follows that recursion cannot be the defining property of human language, or an
automatic consequence of Merge, as it is perfectly possible to have coherent gram-
mars which make use of Merge, but which do not show recursion. Moreover, as seen
in the previous sections, it is also possible to have coherent (even if underspecified)
grammars without a vP or TP layer.

10
The use of the term “precursor” here is not meant to suggest that this is some kind of unstable
structure awaiting further evolution. These can be perfectly stable and persistent structures by themselves.
They are only seen as precursors from the perspective of transitive structures.

Finally, taking into account everything that has been discussed so far, it seems that
what needed to evolve through selection/adaptation when it comes to syntax, was,
first of all, fluency in the paratactic (flat) stage (Section 7.2), and then the hierarchical
stage (Sections 7.3.2 and 7.3.3), with a vP, TP, or an equivalent functional projection.
These two transitions would have constituted truly significant breakthroughs in the
evolution of human language. This does not necessarily mean that the addition of yet
another functional layer on top of this, such as CP, had to have involved natural
selection. Once the brain (and language) evolved sufficiently to be able to support two
or three levels of hierarchical layering, it may be that after that the third or fourth
layer of structure would have been accommodated with the existing capabilities.11
The next section discusses this.

7.3.5 Historical change vs. language evolution
First of all, it is maintained in e.g. Hurford (1990) and Fitch (2008, and references
cited there) that historical change is relevant for language evolution. Fitch (2008: 483)
points out that, for example, the historical loss of tone is relevant for evolutionary
considerations because it proves that such a change is possible in principle. Even if
that were all there is to it, historical change can at least provide corroborating
evidence for specific evolutionary proposals, such as the ones explored in this book.
However, the reviewers wonder how one can distinguish between just historical
change and language evolution regarding the postulated stages of language evolution.
Historical change is typically considered to be a change which has no genetic basis or
consequences. In contrast, language evolution (and evolution in general) is typically
associated with genetic changes and selection. However, these two processes may not
be as disjointed as one typically considers them to be. Let us look at one concrete,
although completely hypothetical, scenario suggesting how this distinction between
historical change and genetic evolution can get blurred.
Suppose we are in a community of speakers of a tone language, which is undergoing
a (historical?) change to a non-tone language, i.e., it is losing its tones.12 I have
chosen to discuss tone here because it has already been discussed in the context of
genetics, and because it seems easier to imagine selection for tone.13 Still, the same
logic could apply to any other historical change. For concreteness, let us say that this
community has 1,000 speakers. Suppose now that the large majority of this population,
say 90%, has a good genetic basis for learning and using tone, that is, for quickly
and effortlessly producing and discriminating the distinctions made by tone. Suppose
next that the remaining 10% are still fluent and functional speakers, but have something of a
speech impediment, which is observable in their less than optimal use of tone.14
Perhaps they speak in a hard-to-understand monotone. As pointed out to me by
Haiyong Liu (p.c. 2014), in Mandarin Chinese, a tonal language, there is a special
term for good speakers, which has to do with how dramatically they make the tone
distinctions: die dang qi fu (lit. up-down, fall-rise).

11
As discussed in the Appendix, several neuro-linguistic studies on syntax found that more hierarchical
layering involves more activation in the brain.
12
Very roughly speaking, tone can be characterized as the use of pitch (high, low, or contours thereof)
to distinguish not only the meaning of words, but also grammatical categories (e.g. Yip 2002). Some
languages, e.g. those of the Bantu family, use tone to distinguish tense categories. It is also of relevance
here that the historical change affecting tone typically goes in the direction of tone loss, rather than the
development of tone (see also Fitch 2010: 483, quoting Jespersen 1922). One salient example of such change
is the recent loss of tone in Swahili, a Bantu language.
13
For example, the papers by Dediu and Ladd (2007) and Dediu (2008) have reported that there is a
small genetic difference between populations speaking tone languages vs. those speaking non-tone
languages. Their particular take on this is that the new gene variants provide a small bias toward learning
a non-tone language, and against learning a tone language (but see e.g. Diller and Cann 2012 for criticism).

Going back to the hypothetical scenario, suppose now that those individuals who
speak in a monotone, or exhibit other imperfections with their use of tone, may not
have inherited all the genetic basis necessary for streamlined processing of tone,
but managed to survive anyway perhaps because they were fit in other ways.
Perhaps they were stronger or more attractive than most other people. The fact
that such a high proportion, 90% of the population, came to have this genetic basis for
tone, presumably through natural/sexual selection, would attest to the obsession
that humans seem to have with “perfect” use of language, which often trumps other
desirable traits.
But now suppose that tone is lost in this community of speakers—a historical
change has occurred. People who are perfect at it no longer hold an advantage over
those who are not. To use the terminology from Deacon (2003) (see Section 7.5.1), the
genetic basis for being good at tone is now masked; that is, it is no longer accessible to
selection processes, because it is no longer observable. This means that the tone-
challenged 10% are no longer at a disadvantage, and that the tone-savvy 90% are no
longer at an advantage. In fact, the opposite may now be true, because those 10% who
managed to survive in spite of being tone-challenged may be more attractive or
healthier people in general. Suppose now that after many, many generations the
pendulum starts to swing in favor of the 10%, and the population now becomes say
70% tone-challenged, losing the genetic basis that was originally selected for tone.
This would essentially constitute a genetic change that is inextricably linked to a
historical change.
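The arithmetic behind this kind of pendulum swing can be made concrete with a minimal sketch. It is an illustration only and rests on assumptions that go beyond the scenario as stated: a textbook one-locus haploid selection model, an invented fitness edge of 1% per generation for the tone-challenged group once tone is masked, and hypothetical function names (next_frequency, generations_until). Only the 90% starting share and the 30% tone-ready end point (i.e. 70% tone-challenged) come from the scenario itself.

```python
def next_frequency(p_ready, w_ready, w_challenged):
    """Return the tone-ready share after one generation of haploid selection.

    p_ready      -- current share of tone-ready speakers (0..1)
    w_ready      -- relative fitness of tone-ready speakers (assumed value)
    w_challenged -- relative fitness of tone-challenged speakers (assumed value)
    """
    mean_fitness = p_ready * w_ready + (1.0 - p_ready) * w_challenged
    return p_ready * w_ready / mean_fitness


def generations_until(p_start, p_target, w_ready, w_challenged, max_gens=100_000):
    """Count generations until the tone-ready share drops to p_target or below.

    Returns None if the target is not reached within max_gens generations.
    """
    p = p_start
    for generation in range(1, max_gens + 1):
        p = next_frequency(p, w_ready, w_challenged)
        if p <= p_target:
            return generation
    return None


if __name__ == "__main__":
    # After tone loss the tone-ready advantage is masked (fitness 1.0), while
    # the tone-challenged group keeps a small unrelated edge. The 1.01 figure
    # is purely hypothetical and chosen only for illustration.
    gens = generations_until(p_start=0.9, p_target=0.3,
                             w_ready=1.0, w_challenged=1.01)
    print("Generations for the tone-ready share to fall from 90% to 30%:", gens)

    # With no fitness difference at all, the share does not move under this
    # deterministic model, so the target is never reached.
    print("Neutral case:", generations_until(0.9, 0.3, 1.0, 1.0, max_gens=1000))
```

Under these assumed values the shift takes on the order of a few hundred generations, which fits the "many, many generations" of the scenario. A weaker edge would stretch the timescale further, and the deterministic model ignores drift, which could matter in a community of only 1,000 speakers.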
But this genetic change is not something that would be readily observable. While
we would observe the historical change, we would not necessarily observe any genetic
change associated with it. And if this new hypothetical generation, which is now only
30% tone-ready, were to acquire tone again (an unlikely scenario), the pendulum may
swing back again. But, crucially, I do not see how one can guarantee that natural
selection, including sexual selection, would not operate in such cases. As Fitch (2008:
522) puts it, “language change does not entail a cessation of selection.”

14
See e.g. Wong et al. (2009) and Nan, Sun, and Peretz (2010) for some discussion of tone and language
disorders, still a largely unexplored topic. The reader should keep in mind that the scenario I am
considering here is purely hypothetical, and is not meant to make any specific claims either about tone
disorders or the consequences of tone loss.

This hypothetical example can also help illustrate what I mean in this book by
evolution of syntax via natural selection. It is not some kind of straightforward
progression toward a clearly defined lofty goal, but rather it involves just small and
random local advantages, in competition with a host of other potential advantages,
which can swing back and forth. It is only in hindsight, and only by sifting through
a lot of variation, and a lot of twists and turns, that one can even discern a pattern,
if a pattern emerges at all. Evolution in this sense is as much about loss as it is
about gain.
In this respect, since I already got off track, perhaps one more (hypothetical)
observation is in order. If my reconstruction of proto-syntax is on the right track, and
if there was a paratactic stage in language evolution, perhaps lasting for a prolonged
period of time, then I would say that our ancestors at that point got to be really good
and creative with this paratactic language, including with VN compounding (cry-
baby, rattle-snake), and with AB-AC patterns (Easy come, easy go), which may or
may not have been accompanied by melodies (Sections 2.4 and 4.2). But very few of
us living today seem to be still capable of using language in such creative, poetic ways.
It could be that by going grammatical, and by becoming slaves to a host of tiny
grammatical categories and distinctions, we masked our other great abilities, includ-
ing poetic and possibly musical abilities, which then gradually got diminished, in a
scenario similar to the one described above.
Those few who are still capable of such artistic expression may be considered as
great orators in some cultures, as seems to be the case with skilled Hmong shamans
and preachers, whose productive use of lofty AB-AC patterns is highly valued
(Martha Ratliff, p.c. 2013). It is also reported in Maxwell and Hill (2006: 25) that
Maya writings have long shown parallelism in structure, but that such parallelism in
modern spoken language appears only in the most formal genres, particularly public
prayer (see Section 4.2).
In any event, my proposal is that the stages of syntax, as postulated in this book,
brought about incremental advantages, one over the other, advantages that could
have, in principle, been subject to selection. Not all of them had to be, of course. This
is an open empirical question that nowadays can be subjected to genetic and other
types of testing.15

15
According to Christiansen and Chater (2008), human genome-wide scans have revealed evidence of
recent positive selection for more than 250 genes (Voight, Kudaravalli, Wen, and Pritchard 2006), making
it possible that there exist genetic adaptations for language.

7.4 A detailed selection scenario

For the sake of concreteness, this section describes in detail one possible scenario for
how the capacity for the simplest two-word paratactic syntax could have been subject
to selection, the kind of syntax that, according to this book, provided a foundation for
all subsequent structure building. While this book discusses various fossils of this
stage of grammar, and any of them could be used for illustration purposes, VN
exocentric compounds are particularly illustrative in this respect, given that they rely
on the basic, concrete vocabulary to create abstract concepts, and given that they
specialize for naming and derogatory reference (insults), while at the same time clearly
exhibiting the properties of the two-slot paratactic grammar (Chapter 6). The goal of
this section is to envision how the ability to coin such insults on the spot would have
secured survival benefits in ancient times. Needless to say, this is a hypothetical
scenario. But I believe that it is important to get very specific about the details of one’s
proposal in order to make sure that the implementation is at least in principle plausible.
The significance of these compounds is exactly in that they make this particular
selection scenario plausible from the point of view of evolutionary biology.
Imagine, if only for the sake of argument, that we encounter a community of
hominin ancestors, such as perhaps the H. erectus (see Section 7.5), living in a society
of about 100, with an approximately equal number of males and females. The adults in
this community do not divide neatly into couples, but rather some males (and
females) mate with multiple partners. This is not a far-fetched scenario, given that
polygamy is practiced even in modern times, by humans and non-humans alike
(see e.g. Symons 1979). In this kind of situation, if the males with certain proto-syntax
capabilities left more offspring than the other males, consistently, over generations,
then they would have skewed the course of evolution toward the spread of the
mutation(s) responsible for that ability to the rest of the population. As discussed
in Section 7.5.1, the speed of the spread depends on how high the fitness of these
individuals was relative to the competitors. According to Stone and Lurquin (2007), if
relative fitness is high, the increase of the variant in the population can be fast, taking
just a few dozen generations for the variant frequency to increase tenfold.
Suppose now that at this point the vast majority of the population are only capable
of one-word utterances (pre-syntactic stage). Moreover, the words they use are basic
and concrete, numbering in dozens, perhaps up to 200 (see the examples in (38) and
(39) as a possible sample). This is the kind of protolanguage that primates such as
Kanzi, a bonobo, seem capable of (see e.g. Savage-Rumbaugh and Lewin 1994).
Anticipating the argument below, most of these words, nouns and verbs, are taken
from the VN compounds discussed in Chapter 6.16

16
Recall also that Heine and Kuteva’s (2007) reconstruction of proto-categories based on the theory of
grammaticalization leads to the conclusion that the first proto-words in the evolution of language were

(38) Verb-like proto-words

break, burn, burst, crack, cry, cut, drag, drink, drip, eat, fart, fill,
fold, fuck, hang, heck, hunch, kill, lick, lie, peck, pierce, pinch,
piss, rattle, rip, roll, run, scatter, scrape, scratch, shake, shit,
shove, skew, sing, sit, smoke, spin, spit, split, stink, stroke,
suck, sulk, tread, tumble, turn, wag, wipe
(39) Noun-like proto-words
ass, baby, back, balls, beard, belly, bird, brain, butt, dung, face,
finger, fire, hair, head, heel, leg, mustache, neck, old-woman,
penis, shit, skin, sky, snake, sun, tail, throat, vagina, water,
wind, wolf, wood
Suppose now that an innovation occurs in the community: one or two hominins
begin to merge these proto-words to create insults that succinctly characterize their
rivals in derogatory terms. While it would have no doubt been possible to insult with
single words, in a one-word stage one is severely limited to insults such as:
(40) ass, fart, old.woman, penis, piss, shit, snake, spit, stink, vagina
Now compare this one-word potential for insult with the possibilities that open up in
the paratactic two-slot stage (see Chapter 6 for many more examples from a variety of
languages):17
(41) kill-joy, turn-skin (cf. turn-coat), hunch-back, wag-tail,
tattle-tale, scatter-brain, cut-throat, mar-wood (bad carpenter),
heck-wood, busy-body, cry-baby, break-back, catch-fly (plant),
cut-finger (plant), fill-belly (glutton), lick-spit, pinch-back
(miser), shuffle-wing (bird), skin-flint (miser), spit-fire,
swish-tail (bird), tangle-foot (whiskey), tumble-dung (insect), bere-
water (bear-water), crake-bone (crack-bone), drink-water, shave-
tail (shove-tail), wipe-tail, wrynge-tail, fuck-ass, fuck-head, shit-
ass, shit-head
(42) cepi-dlaka ‘split-hair’ (hair-splitter); guli-koža ‘peel-skin’ (who
rips you off); vrti-guz ‘spin-butt’ (restless person, fidget); muti-
voda ‘muddy-water’ (trouble-maker); jebi-vetar ‘screw-wind’

noun-like and verb-like. In this respect as well, VN compounds count as good fossils, and a good starting
point for breaking into paratactic syntax. It is important to clarify here, however, that in this stage one can
only speak of proto-verbs (denoting actions, perhaps proto-imperative forms) and proto-nouns (denoting
static individuals). There is no claim here to the effect that nouns and verbs were distinguished morpho-
logically at this stage.
17
When such combinations are used to name animals or plants, unsurprisingly, they are not insults.

(charlatan); vuci-guz ‘drag-butt’ (slow-moving person); gori-guzica
‘burn-butt’ (a person in trouble, burn-breeches); kosi-noga
‘skew-leg’ (person who limps); lezi-baba ‘lie-old.woman’ (loose
woman or man); jedi-vek, ‘eat-life’ (one who constantly annoys);
kljuj-drvo ‘peck-wood’ (wood-pecker); podvi-rep ‘fold-tail’
(one who is crestfallen); čepi-guz ‘cork-butt;’ češi-guz ‘scratch-
butt;’ deri-muda ‘rip-balls’ (place name, a steep hill); gladi-kur
‘stroke-dick’ (womanizer); jebi-baba ‘fuck-old.woman’
(unselective womanizer); kapi-kur ‘drip-dick’ (name of a slow
water spring); kovrlji-guz ‘drag-butt;’ liz-guz ‘lick-butt;’ nabi-guz
‘shove-butt;’ peči-govno ‘burn-shit;’ piš-kur ‘piss-dick;’ plači-guz
‘cry-butt’ (cf. cry-baby); plači-pička ‘cry-cunt’ (vulgar version of cry-baby);
poj-kurić ‘sing-dick’ (womanizer); puš-kur ‘smoke-
dick;’ razbi-dupe ‘break-butt’ (steep terrain); seri-vuk ‘shit-wolf;’
visi-guz ‘hang-butt’
One goes from being able to utter a handful of very predictable and boring insults, to
suddenly having the power to create many more novel insults, abstract, witty, and
often humorous, combinations of words that have never been heard before. You are
suddenly able to capture a trait of a person, or perhaps his essence, with only two
basic proto-words. Maybe you first stumbled upon one or two combinations like this,
but then you started to actively seek new ones.
According to Progovac and Locke (2009), coining compounds akin to the ones
illustrated above would have been an adaptive way to compete for status and sex in
ancient times. Their successful use would have enhanced relative status first by
derogating existing rivals and placing prospective rivals on notice, and second by
demonstrating verbal skills and quick-wittedness. Darwin (1874) identified two
distinct kinds of sexual selection, aggressive rivalry and mate choice (see also
Miller 2000), both of which seem relevant for the proposed use of exocentric
compounds.18 There is no doubt that this ability would have attracted attention.
Insult and ritual insult still do, even today.
It should be noted that considering the simpler stages of grammar helps identify
some potential points of continuity with animal communication abilities. As
observed in e.g. Darwin (1874), the males of almost all the mammal species use
their voices much more during the breeding season, and some are absolutely mute
except at this season. If human language was used for display purposes from the very
start, then there is some continuity there.19 In addition, compounds used for insult

18
See Section 6.6 for Darwin’s (1872) suggestion that strong emotions expressed in animals are those of
lust and hostility, and that they may have been the first verbal threats and intimidations uttered by humans.
19
Darwin’s view in fact was that language evolved gradually through sexual selection, as an instinct to
acquire a particular method of verbal display similar to music (see e.g. Fitch 2010 for recent arguments for
musical protolanguage; see Sections 2.4 and 4.2).

often feature swearwords. Code (2005, and references there) provides neurological
evidence that swearwords are separately stored from the other words, using both the
part of the brain where digital language is processed, and the part of the brain which
processes laughing and crying. In that sense, swearwords straddle the boundary
between (animal) calls, which share many properties with laughing and crying, on
the one hand, and digital language, on the other (see e.g. Burling 2005; see also the
Appendix for more discussion on this).
In general, decomposing syntax into evolutionary primitives in this way has an
added bonus in that it can reveal some continuity, some points of contact, between
human language and other animal communication systems. Recall from Section 7.3.2
(also Chapter 2) that the paratactic (two-slot) syntax is tied to the here-and-now, and
does not show displacement or recursion. These properties are also difficult to find
across animal communication systems.
Let us now go back to the concrete scenario involving VN insults and our hominin
ancestors. Perhaps after a day of gathering and/or hunting, as well as evading
predators, the community would come together for some socializing. The group
would have been thoroughly entertained by the ability to use words in novel and
playful ways. Suppose for concreteness that those few men who could quickly and
efficiently coin VN-type compounds on the spot had a preexisting mutation that
makes this task easier for them.20 They can do it with less effort and with more
buoyancy. If the chances of these compound-savvy men having fruitful sex were
only 2–3% higher than for the rest, then it would have taken less than 10,000 years to
spread this mutation to the rest of the population (see Section 7.5.1 for some
calculations). As pointed out in e.g. Symons (1979), tribal chiefs are often both gifted
orators and highly polygynous. Consider that the H. erectus species existed for more
than a million and a half years. But notice that this rate of spread would have allowed enough time
for syntax to evolve even if the paratactic stage emerged with the H. heidelbergensis
species (see Section 3.5).
It is not my intention here to suggest that (paratactic) syntax evolved only, or even
primarily, for insult purposes. My intention is only to show that insults could have
played an important role in solidifying basic syntax. There is no doubt that the ability
to join words would have opened up many other communicative possibilities,
including the accumulation of (complex) vocabulary items. For example, compound
words could now be used to distinguish snakes (e.g. rattle-snake), to name animals
(e.g. swish-tail (bird), tumble-dung (insect)), as well as to describe people’s activities
and issue more specific (less vague) commands:

20
This ability may be attributable not just to one mutation, but perhaps a cluster of mutations, in which
case the selection would have targeted the whole cluster.

(43) Kill snake! Eat berry! Cut wolf! Sing baby! Run wolf! Rattle snake!21
Intriguingly, as discussed at length in Chapter 6, VN compounds across languages
seem to preserve traces of an imperative verb form.
The possibility that sexual selection played some role in evolving syntax is con-
sistent with the findings reported in e.g. Ullman (2008) that there is a gender
difference when it comes to relying on declarative vs. procedural memory in language
processing (see also Pinker and Ullman 2002).22 At the same time, as noted in e.g.
Darwin (1874), the law of the equal transmission of characters to both sexes prevails
with mammals, and ensures that characters of all kinds are inherited equally by the
males and females; we might therefore expect that with mankind any characters
gained by the females or by the males, through sexual selection, would commonly be
transferred to the offspring of both sexes. In other words, one would expect any
differences between sexes (sexual dimorphism) to be minor and subtle.
Finally, as Pinker and Bloom (1990) argue persuasively, human language is too
complicated and too specifically designed for communication to be a spandrel or a
by-product of some other development.23 The only way for a complex design such as
human language to evolve is through a sequence of mutations with small effects, and
through intermediate stages, with each stage useful enough and small enough to
trigger natural selection. Based on syntactic theory and linguistic fossils, this
monograph has reconstructed just such concrete intermediate stages of syntax
through which language evolution would have passed, and identified specific com-
municative benefits that each stage brought about, sufficient to trigger natural/sexual
selection.24

21
Interestingly, imperatives themselves can be quite vague. In modern languages we often
distinguish the noun in Kill wolf! as an object of the action, and the nouns in Run wolf! and Cry baby! as
vocatives (thus subjects of the actions). Without specific case markings for these categories, however, these
structures are ambiguous. Rattle snake! could in principle either be a (bizarre) command for a snake to rattle, or a
command for somebody to rattle a snake.
22
It has also been reported by many that the use of cursing and dirty words is more common in males
(e.g. Jay 1980, 1995; van Lancker and Cummings 1999), and this is true even in language disorders (Code
1982). As Code observes, such words are used for fundamental expression of deep emotion, including fear,
pain, frustration, as well as for sex and violence.
23
Gould (1987) and Chomsky (2002, 2005), among others, have claimed that human language/grammar
can be a by-product of other phenomena, such as the increase in brain size, or general laws of physics.
Chomsky’s arguments have to do with his views that syntax is not decomposable into stages, and that there
are no genetic differences among humans when it comes to language abilities (Chomsky 2002: 147).
Additionally, Chomsky (2002) considers that natural selection is messy and not properly understood. He
also considers that the evolutionary explanations that invoke natural selection via tinkering can be
symptomatic of the lack of understanding (“if you take a look at anything that you don’t understand, it’s
going to look like tinkering” (139)), and that when things are properly understood, one realizes that there is
much more order in nature.
24
As pointed out by a reviewer, Pinker and Bloom’s (1990) approach has been criticized on account
of the claim that the properties considered to be adaptive in language, such as recursive Merge, are not
complex, and vice versa (see e.g. Pesetsky and Block 1990). My approach shows that, when syntax is
decomposed into plausible evolutionary stages, this criticism dissipates. Interestingly, Pesetsky and Block

The following section considers how these stages might map onto the human line
of descent.

7.5 The timeline for the evolution of language

7.5.1 Was there enough time?
Many evolutionists have adopted the Baldwin Effect as an evolutionary force,
including Dawkins (1999). Pinker and Bloom (1990), Deacon (1997), and Briscoe
(e.g. 2000) have applied it to language evolution. According to Pinker and Bloom
(1990), this is a process whereby environmentally-induced responses set up selection
pressures for such responses to become innate, triggering conventional Darwinian
evolution (see also Deacon 1997; Hinton and Nowlan 1987).
As pointed out by Depew (2003), there is a variety of shifting and contested
theoretical ideas associated with the Baldwin Effect (see also Longa 2006). What
they all seem to have in common may be just the following: “learned behaviors can
affect the direction and rate of evolutionary change by natural selection” (Depew
2003: 3). This may lead to converting learned behaviors into genetic adaptations, or,
alternatively, it may lead to supporting learned behaviors by related genetic adapta-
tions (Depew 2003: 3).
Deacon (2003) considers that masking and unmasking of “preadaptations” plays
an important role. As language became more and more essential to successful
reproduction, “novel selection pressures unmasked selection on previously ‘neutral’
variants and created advantages for certain classes of mutations that might not
otherwise have been favored” (93–94). At the same time, this innovative tool “masked
selection on traits made less vital by being supplemented” by the innovative tool,
such as perhaps the inventory and specificity of human calls (94) (see Section 7.3.5 for
a hypothetical scenario along these lines). It is important to point out that the process
of unmasking can have “highly distributed parallel synergistic consequences, with the
potential to significantly amplify adaptations” (Deacon 2003: 95–6). As Deacon
clarifies in the postscript to his paper, his approach does not really deviate from
Darwin’s, given that the unmasking process is comparable to uncovering the so-
called preadaptations, associated with Darwinian evolution, or changes of function
(Godfrey-Smith, Dennett, and Deacon 2003: 110).
One example that is often associated with the Baldwin Effect is the emergence of
lactose tolerance among herding populations (but see Depew 2003: 26 for the claim
that “there is no theory neutral empirical phenomenon that can be named ‘the

(1990: 751) challenge Pinker and Bloom to explain why it is adaptive to allow “the city’s destruction by the
enemy” but not “the city’s sight by the enemy,” which, as they say, is not fully acceptable. Ironically,
examples like these turn out to be relevant for evolutionary considerations, even though, of course, at a
much more abstract level, as discussed in Section 3.3.4.1.

Baldwin Effect’”). As discussed in Deacon (1997), in these populations, alleles that
allow infants to digest milk are not shut down immediately after weaning, but instead
remain operative at increasingly deferred points in the life cycle. While this eventu-
ally reduces to classical Darwinian selection, as these alleles are just being discovered
or unmasked by this cultural habit, according to Deacon the emphasis here is on the
causal factor for selection, which is a cultural phenomenon.
Small selective advantages are sufficient for evolutionary change: according to
Haldane (1927), a variant that produces on average 1% more offspring than its
alternative allele would increase in frequency from 0.1% to 99.9% of the population
in just over 4,000 generations. As discussed in e.g. Stone and Lurquin (2007), the
speed of natural selection depends on the relative fitness of a trait/mutation. The time
necessary for a gene variant frequency to change is inversely proportional to the difference in
fitness of the variants competing in the population.
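
The way the time to spread depends on the selective advantage can be illustrated with a minimal sketch. It assumes the simplest textbook setting: a deterministic haploid model, non-overlapping generations, and a constant relative fitness of 1 + s for the favored variant. This model is an assumption introduced here for illustration only; it is not claimed to be the model behind the figures from Haldane (1927) or Stone and Lurquin (2007), and diploid models with dominance or heterozygote advantage yield different absolute numbers. The function name and its parameters are hypothetical.

```python
def generations_to_spread(s, p_start, p_target):
    """Count generations for a favored variant to rise from p_start to p_target.

    Assumes a deterministic haploid model in which carriers of the variant
    leave (1 + s) offspring for every 1 offspring left by non-carriers.
    """
    if s <= 0 or not (0 < p_start < p_target < 1):
        raise ValueError("need s > 0 and 0 < p_start < p_target < 1")
    p = p_start
    generations = 0
    while p < p_target:
        # One-generation update: the new frequency is the variant's
        # share of all offspring produced in this generation.
        p = p * (1 + s) / (1 + s * p)
        generations += 1
    return generations


if __name__ == "__main__":
    # Compare a 1%, 2%, and 10% reproductive advantage.
    for s in (0.01, 0.02, 0.10):
        n = generations_to_spread(s, 0.001, 0.999)
        print(f"s = {s:.2f}: {n} generations")
```

Under these simplifying assumptions, doubling the advantage s roughly halves the number of generations needed, which is the inverse relationship between spread time and fitness difference discussed above.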
As one example, the fitness of lactose tolerance is 2–3% higher in dairy areas. It
took about 5,000–10,000 years to reach the current rates of lactose tolerance among
northern Europeans, which is close to 100% in some cases. For sickle cell anemia the
fitness of the AS heterozygotes can be 9–10% greater, because they are clinically
normal, and because they are protected from malaria to some extent. According to
Stone and Lurquin (2007: 104–5), in this case it took only 2,000 to 3,000 years, or even
less, to reach the equilibrium seen today. Moreover, fixations of different genes can
go in parallel, and sexual selection can significantly speed up any of these processes,
triggering a runaway effect (Fisher 1930; see also Miller 2000 and Hurford 2007).
This suggests that there was enough time to evolve language gradually, in stages,
especially if the fitness value for each new stage of language was high. Given the scenario
for the evolution of syntax outlined in this monograph, there would have been at least
two major breakthroughs: (i) the emergence of the paratactic two-word stage out of a
one-word stage (Section 7.2); and (ii) the emergence of hierarchical grammars, with
transitivity and/or TAM marking (Section 7.3).25 For each progression, one can identify
several concrete communicative advantages, as per the previous sections.26

7.5.2 The timeline

At this point, one wonders if my proposal has anything to say about the timeline for
the evolution of human language. While the proposal as it is now cannot precisely

25
There must have been many more developments and detours, including possibly a proto-conjunction
stage, as discussed in Chapter 4, as well as many transitional stages, which left us with ambivalent
structures, such as middles. Here, I focus only on the major breakthroughs, for which the evidence is the
clearest, and leave the rest for future research.
26
This is also consistent with the idea of punctuated equilibrium, according to which evolutionary
change involves bursts of change that are relatively brief on the geological time scale, followed by long
periods of stasis (Eldredge and Gould 1972; Gould and Eldredge 1977). For example, it is possible that the
paratactic stage was stable for a long time.

place the stages of proto-syntax in evolutionary time, it is capable, even in this broad
outline, of excluding some hypotheses regarding language evolution, and supporting
others. This indicates that the level of granularity is appropriate, and that future
research along these lines, and at this level of granularity, can certainly shed further
light on this topic.
Consistent with the considerations of this monograph, it is likely, even though
not certain, that the paratactic proto-syntax stage already characterized the
H. heidelbergensis species, the common ancestor of both humans and Neanderthals,
which would place the emergence of the proto-syntactic stage as far back as half a million
years ago. In fact, my proposal cannot exclude the possibility that H. erectus also
had some form of proto-syntax, especially considering that their brain doubled in
size relative to that of the Australopithecus, who lived sometime between 4 million
years ago and 2 million years ago. The earliest fossil evidence for H. erectus goes back
to 1.8 million years ago and the most recent to about 140,000 years ago. It is conceivable
that the capacity for paratactic grammars triggered a speciation event, such as a
transition from H. erectus to H. heidelbergensis (or, if the deeper timeline for language
is correct, a transition from Australopithecus to the hominin species).27 Clearly, the
pressure to be able to use and memorize innovative language combinations and
abstract vocabulary would have required ever more mental capability,
and thus more brain capacity. There was nothing else at that juncture that would have
required as much brain capacity as the paratactic stage of language would have,
accompanied by an increase in vocabulary size (see Section 7.4).28
According to Deacon (1997), the unusually expanded prefrontal brain regions
(Footnote 27) are an evolutionary response to a sort of virtual input with increased
processing demands, suggesting that language forced the brain to evolve in this
particular way, or at least that it co-evolved with it (see also Diller and Cann 2013).
As put in Darwin (1874: 634), “a great stride in the development of intellect will have
followed, as soon as the half-art and half-instinct of language came into use; for the
continued use of language will have reacted on the brain and produced an inherited
effect; and this again will have reacted on the improvement of language. . . . The

27
According to Deacon (1997), symbolic language has been accruing from around the time that the
Australopithecines were replaced by the hominins, some 2 million years ago, when ancestors became
bipedal, freeing up their hands for tool use and gesture, and when brains expanded significantly. As he
notes, in the australopithecine-hominin transition, our brains did not get bigger proportionately, but,
rather, it was the forebrain, particularly the cerebellum and the cerebral cortex, which ballooned the most.
28
Another potentially relevant observation is that H. erectus was possibly the first hominin to move out
of Africa, as early as 1.7 million years ago, and spread as far as England, Georgia, India, Sri Lanka, China,
and Java. However, as pointed out by McBrearty (2007: 140), no genetic mutation enhancing intelligence
was necessary for hominins to migrate out of Africa, given that faunal exchanges between Africa and Asia
have occurred sporadically since the land bridge at Sinai was established 17 million years ago. Finlayson
(2009) also notes that having language in place, or a large brain, is not a necessary prerequisite for
dispersions of this kind to take place.

largeness of the brain in man, relatively to his body, compared with the lower
animals, may be attributed in chief part to the early use of some simple form of
language—that wonderful engine which afﬁxes signs to all sorts of objects and
qualities, and excites trains of thought which would never arise from the mere
impression of the senses . . . ” [emphasis mine].
Dediu and Levinson (2013) review a number of recent findings suggesting that at
least H. heidelbergensis had some form of language, based on the comparative
evidence among its descendants: H. sapiens, Denisovans, and Neanderthals, as also
suggested by Finlayson (2009: 116) (but see Berwick, Hauser, and Tattersall 2013 for
criticism of this view). According to Dediu and Levinson (2013: 10), “language as we
know it must then have originated within the ~1 million years between H. erectus
and the common ancestor of Neanderthals and us.” The most interesting evidence
comes from genetics, and Dediu and Levinson (2013: 5) conclude that Neanderthals
and Denisovans “had the basic genetic underpinning for recognizably modern
language and speech, but it is possible that modern humans may outstrip them in
some parameters (perhaps range of speech sounds or rapidity of speech, complexity
of syntax, size of vocabularies, or the like).” In addition to genetics, Dediu and
Levinson also review evidence from the skeletal morphology, the morphology of the
vocal tract, infant maturation, Broca’s area, brain size, cultural artifacts, and con-
clude that all the evidence is consistent with their proposal. According to them, the
H. heidelbergensis species might have even spoken complex languages, comparable
to human languages, which in my framework would imply a hierarchical stage.
Given the considerations in this monograph, it is much more likely that the
hierarchical stage of language evolved only in H. sapiens, after the dispersion from
Africa, or perhaps after a dispersion within Africa, and that H. heidelbergensis, as well
as Neanderthals and Denisovans, only commanded the paratactic stage, that is, two-
slot grammars, as well as one-word protolanguage. But, as it is clearly established in
this monograph, this “mere” paratactic stage of language has a remarkable commu-
nicative potential. Interestingly, Dediu and Levinson (2013: 11) hope “that some
combinations of structural features will prove so conservative that they will allow
deep reconstruction.” My hope is that this monograph has provided just such a
conservative structural feature which can be used for reconstruction, the two-slot
absolutive-like platform.
On this scenario, the second major breakthrough, the one that brought about
hierarchical grammars, would have originated with H. sapiens. On one view, the
H. sapiens species is taken to have emerged in Africa around 200,000 years ago, and
dispersed out of Africa about 60,000 years ago, to Asia and Europe, where the species
co-existed with Neanderthals for a while (see e.g. Stone and Lurquin 2007). Neanderthals
are thought to have left Africa long before H. sapiens, and to have lived in Europe
and Asia from at least 200,000 years ago, dying out about 20,000 years ago. In the
scenario outlined above, Neanderthals would have commanded the paratactic use of

grammar, inherited from the common ancestor with the H. sapiens, but would not
have inherited hierarchical grammars from the ancestor, as hierarchical grammars
would have only emerged in the H. sapiens. This of course does not mean that
Neanderthals could not have, independently, evolved layers of grammar on top of
the paratactic foundation, or even musical language (see e.g. Mithen 2006; Sections 2.4 and
4.2). It only means, under this scenario, that whatever they built on top of the paratactic
two-slot grammars, if anything, was not shared by the common ancestors.29
However, the stages broadly outlined in this monograph are also consistent with
the less likely possibility that paratactic grammars emerged only at the transition
from the H. heidelbergensis to the H. sapiens species, in which case H. heidelbergensis
(and possibly H. erectus and Australopithecus) would have been stuck in a one-word
stage, with some basic vocabulary but no syntax, or even in a stage without any words
at all. In that case, Neanderthals would not have inherited the paratactic grammar
from the common ancestor, as the common ancestor would not have had one, but,
again, it is possible, if not likely, that they could have developed some form of proto-
syntax on their own. Under this more recent scenario for the evolution of language, it
would be hard to explain why brains ballooned in the transition from
Australopithecus to H. erectus, as per previous discussion. But even this recent
scenario would have allowed enough time for syntax to evolve gradually in stages,
as pointed out above.30
Importantly, there are certain scenarios for the evolution of syntax that are not
consistent with the approach outlined in this monograph. For example, a great
degree of crosslinguistic variation in how different languages build upon the postu-
lated foundational paratactic stage suggests that the hierarchical stage did not emerge
in all its complexity and in a uniform fashion only once (in Africa), but instead
multiple times, and independently, either within Africa, or after the dispersion from
Africa. If it had emerged only once, before H. sapiens spread out, it would be difficult
to explain why there is so much variation across languages of the world in how they

29
Recall that paratactic grammars are characterized not only by compound insults and rudimentary
small clauses, but also by paratactic clause combinations, of the kind:
(i) Easy come, easy go. Come one, come all. You sow, you reap. You seek,
you find.
For what it is worth, such symmetric, parallel combinations would have been easy to fit onto simple
melodies, and to develop musical protolanguage from. As pointed out in Sections 2.4 and 4.2, such
paratactic structures rely on prosodic glue to hold them together, and if there was musical protolanguage
at any point in human evolution, then it would have been most useful in this stage.
30
As observed by Maggie Tallerman (p.c. 2014), in this scenario the transitions from one stage to the
next could have been so swift as to become close to saltationist views of the evolution of language. In my
view, what is important for distinguishing the gradualist, incremental approaches from saltationist
approaches is not so much the amount of time that elapsed from having no language to having hierarchical
language, but rather whether or not there were well-defined incremental stages, to provide the scaffolding
without which natural/sexual selection would not have been able to operate.
express transitivity (by ergative, accusative, or other means), or in whether or not
they grammaticalize tense/aspect/mood systems, to name just some parameters of
variation. Dediu and Levinson (2013) mention that interbreeding with Neanderthals
or Denisovans, or just the contact with their languages, may have contributed to the
differences among human languages. While this may be possible in principle, a
proposal of this kind would have to be made much more explicit in order to be
evaluable. At present, what my approach can say for sure is that all human languages
have the paratactic grammar as a common denominator, and, if anything, it would
have been this kind of paratactic grammar that was also shared by Neanderthals and
Denisovans, if it was indeed present in their common ancestor, H. heidelbergensis.
What all human languages and constructions undoubtedly have in common is the
paratactic platform, that is, the ability to combine two words or two clauses paratactically,
essentially the properties of the postulated flat, intransitive, absolutive-like
stage.31 As pointed out above, all the hierarchical phenomena discussed in this book,
including transitivity and subordination, have alternative routes, as well as pre-
cursors, in parataxis. This is a deep, conservative property of human language that
young children across cultures seem capable of, and that pidgin speakers and second
language learners seem to often resort to. However, as pointed out in Sections 7.2 and
7.3, modern languages vary with respect to whether they exhibit grammaticalized
expression of TAM (tense/aspect/mood), recursive clause embedding, and a particu-
lar type of transitivity, all properties of hierarchical syntax. In other words, if
transitivity and TAM emerged only after the dispersion of hominin populations,
we can explain the vast variation across languages in how they choose to express
them, or not. This would in turn mean that the common ancestor with Neanderthals
and Denisovans did not have hierarchical syntax.
The postulations in this monograph, as they stand now, are not capable of
choosing between the uniregional and multiregional hypotheses about human ori-
gins. According to the widely accepted uniregional hypothesis, the Asian and Euro-
pean H. erectus lineages went extinct in all the places into which the species migrated
(see e.g. Stone and Lurquin 2007). In the meantime, H. sapiens evolved only once
from H. erectus in Africa (around 200,000 years ago), where H. erectus also went
extinct. The African H. sapiens populations migrated out of Africa around 50,000 to
60,000 years ago, which would mark the second dispersion out of Africa (Out of
Africa II model). Within this scenario, my approach is consistent with H. sapiens
exhibiting basically only the paratactic grammar before the dispersion to different
geographical locations, as pointed out above. This paratactic grammar would have

31
The use of linkers/proto-conjunctions, as discussed in Chapter 4, may also be common to all or most
languages, and this would be another good topic for further research along these lines. If it turns out that
languages differ significantly in this respect, this might help situate the stages of language in time more
precisely.

provided the common platform upon which all languages could build further
complexities, often in diverging directions. Still, the syntactic variation among
languages would be largely constrained by the shared scaffolding that paratactic
(absolutive-like) grammars provide.
Recall that H. erectus traveled out of Africa around 1.7 million years ago,
spreading to Europe and Asia, where fossil evidence of the species was found.
The absence of older hominin fossils in Europe and Asia (and in the Americas) is
taken as evidence that H. erectus evolved only once, in Africa. According to the
much less accepted multiregional hypothesis regarding human origins, the local
H. erectus populations in Africa, Asia, and Europe differentiated into H. sapiens
independently, by a process of parallel evolution, as well as some admixture among
the populations (see e.g. Stone and Lurquin 2007).32 If this hypothesis turns out to
be correct, then, under my approach, one would have to say that H. erectus, prior
to the migrations out of Africa, already commanded the foundational paratactic
grammar, and that the more complex hierarchical grammars emerged separately in
different geographical locations, after the dispersion. On this scenario, the hier-
archical grammars could have emerged much earlier than with the uniregional
hypothesis, given that the dispersion took place much earlier, around 1.7 million
years ago. On the other hand, if the uniregional hypothesis is correct, then the
dating of the emergence of hierarchical syntax would be in a more shallow time
frame, sometime around 60,000 years ago, after the second dispersion out of Africa
took place, involving H. sapiens.33
As discussed in Section 2.5.4, initially, it was reported by Enard et al. (2002) that
the FOXP2 gene mutation in humans occurred at some point in the last 200,000 years,
which would have neatly coincided with the emergence of hierarchical syntax.
However, it has since been found that the same mutation characterizes Neanderthals
(Krause et al. 2007), which pushes the mutation back to at least the common
ancestor, about half a million years ago. This finding was a disappointment to the
adherents of the saltationist view, for whom the initial report by Enard et al. provided

32
As pointed out by Finlayson (2009), the distinctions between H. habilis, H. erectus, H. sapiens, and
other hominins are not as clear-cut as is typically assumed. For example, when it comes to the size of the
brain, while the brains of H. sapiens are certainly larger on average than the brains of H. erectus, Finlayson
(2009: 42–3) points out that the variation within species is so large that some H. erectus individuals were
within the human range. This speaks in favor of the gradualist approach to the evolution of language
and cognition.
33
There may be another possible scenario for the timeline for hierarchical syntax, which would allow
for an earlier timing of hierarchical syntax. Namely, it is possible that hierarchical syntax emerged
independently among different populations in Africa, and that, as these different populations migrated
to different parts of the world, they brought with them these diverse hierarchical grammars. According to
Stringer (2007: 17) and Finlayson (2009), there are still many uncertainties about the human timeline and
dispersals. Stringer mentions that there might have been an African version of multiregionalism, citing
“growing molecular evidence of deep divisions within African populations.”
encouragement (see Piattelli-Palmarini and Uriagereka 2011 for discussion). Diller
and Cann (2009; 2012: 171) even propose that the FOXP2 mutation should be dated
back to 1.8 to 1.9 million years ago, approximately the time when Homo (Homo
habilis, H. ergaster, and H. erectus) emerged.
Another type of evidence that has often been invoked in favor of the saltationist
view has to do with the postulation of the “Middle to Upper Paleolithic transition/
revolution.” Based on archeological findings, Mellars (2002) and others initially
suggested that there was a major transition/revolution around 43,000–35,000 BP, characterized
by major changes, all reflecting shifts in many different dimensions of
human culture and adaptation: new forms and complexity of stone, bone, and other
tools; explosion of explicitly decorative or ornamental items; representational art
carving of animal and human figures; increase in human population densities. To
many this “symbolic explosion” was exactly what one would expect from a major
shift in the complexity of language patterns, possibly associated with corresponding
shifts in the neurological structure of the human brain (Mellars 1991: 35; Bickerton
1995; Pinker 1995; Mithen 1996). These archeological findings were often interpreted
to mean that language (or syntax) in its entirety arose at this juncture, through one
single event, such as a mutation (see e.g. Chomsky 2005, 2010; Berwick and Chomsky
2011; Tattersall 2010).
However, the recent findings lead to the conclusion that there was no human
revolution, at least not at this particular juncture (see e.g. McBrearty and Brooks
2000; McBrearty 2007; and Mellars himself 2007: 3). According to Mellars (2007: 3)
“there is now ample evidence . . . that virtually the whole pattern of radical behav-
ioural changes as reflected in the archeological record of the classic Middle-to-Upper
Paleolithic transition in Eurasia is due entirely to the replacement of one human
population (that of the Eurasian Neanderthals) by the new, intrusive populations of
biologically and behaviourally modern humans, from an ultimately African source.”
Consequently, this archeological situation cannot reflect some in situ cultural or
evolutionary processes. According to McBrearty and Brooks (2000) and McBrearty
(2007), a much more gradual and piecemeal pattern of development of new techno-
logical innovations can be documented in Africa.
It should also be pointed out that even if there had been an explosion of cultural
artifacts in the archeological record at this or some other point, it would not have
followed that language or syntax emerged at that point, or that they emerged
suddenly. Definitive conclusions in this regard are especially difficult to draw given
the common assumption among linguists, based on present-day cultures, that it is
possible to have a highly complex language in the absence of any complex culture
(see e.g. Roebroeks and Verpoorte 2009; Tallerman 2014c; and references there). This
in turn means that a sudden emergence of culture does not imply a sudden emer-
gence of language, which means that this never was a plausible argument for the
saltationist views in the first place.

In summary, there are no real obstacles to studying the evolution of language/
syntax within the Darwinian adaptationist framework, along the lines proposed in
this book: there was plenty of evolutionary time to evolve syntax in stages, and each
stage can be shown to accrue concrete and important communicative advantages,
including precision in the expression of e.g. argument structure, as well as the
capacities for insult, displacement, and recursion. In addition, languages of the
world show variation consistent with the postulated stages, and there are fossils of
these stages to be found across languages and constructions. Furthermore, this
reconstruction can serve as a source of possible hypotheses for correlating linguistic
variation with genetic variation.

Conclusion

The basic proposal of this monograph is that the capacity for syntax evolved incre-
mentally, in stages, subject to selection pressures. Following an internal reconstruction
of syntax, based on the syntactic theory adopted in Minimalism and its predecessors,
this monograph arrives at the stage of human grammar which had no tense (no Tense
Phrase), and no transitivity (no vP), but only the rudimentary small clause structure
consisting of a verb and just one argument (typically a noun). This proto-grammar
could not differentiate between subjects and objects, and it knew of no Move or
recursion. This is essentially an absolutive-like, binary, two-slot grammar, which can
nonetheless create not only rudimentary small clauses (e.g. “Come winter, . . . ”), but
also paratactic binary combinations of such clauses (e.g. “Come one, come all.”). It can
also create some stunning insults in the form of compounds.
The internal reconstruction is based on stripping off the layers of functional
structure typically associated with a modern clause in Minimalism:
(1) CP > TP > vP > SC/VP
[where CP is a Complementizer Phrase, TP a Tense Phrase, vP a transitive (light)
Verb Phrase, VP the basic Verb Phrase, and SC a Small Clause.]
The logic behind the proposed reconstruction is straightforward: while VP/SC can be
composed without a vP or a TP layer, a vP or a TP can only be constructed upon the
foundation of a VP/SC. Moreover, while imposing an additional layer of structure
upon the foundational SC, whether it is a vP, a TP, or both, necessarily results in a
hierarchical construct, the SC itself can be a flat, headless, paratactic creation.
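The logic of this reconstruction can be illustrated with a small sketch. The Python fragment below is only an illustration, under the assumption that a clause can be represented as a list of layer labels, outermost first, following the ordering in (1). The names `LAYERS`, `is_buildable`, and `strip_outer` are hypothetical and are not part of any syntactic formalism.

```python
# Layers of a modern clause, outermost first, following (1): CP > TP > vP > SC/VP.
LAYERS = ["CP", "TP", "vP", "SC/VP"]


def is_buildable(layers):
    """Return True if the layers (outermost first) rest on an SC/VP foundation
    and respect the ordering in (1).

    Intermediate layers may be absent (e.g. a TP directly over SC/VP),
    but the SC/VP foundation may not.
    """
    if not layers or layers[-1] != "SC/VP":
        return False
    ranks = [LAYERS.index(layer) for layer in layers]
    # Each layer must sit above a strictly lower-ranked layer.
    return all(outer < inner for outer, inner in zip(ranks, ranks[1:]))


def strip_outer(layers):
    """One step of internal reconstruction: peel off the outermost layer.

    The bare foundation is returned unchanged, since nothing lies beneath it.
    """
    return layers[1:] if len(layers) > 1 else layers
```

On these assumptions, `is_buildable(["TP", "SC/VP"])` holds, whereas `is_buildable(["TP", "vP"])` does not, since the latter lacks the small clause foundation. Repeated application of `strip_outer` to a full clause ends at the bare `["SC/VP"]`, the postulated proto-grammar.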
Strikingly, as this monograph shows, languages of the world abound in the “fossil”
structures approximating this paratactic, two-slot, one-argument proto-grammar.
Such fossils are found among nominals, certain exocentric compounds, unaccusa-
tives, root small clauses, absolutive constructions, and absolutive-like constructions
in nominative-accusative languages, as well as among the so-called “middles,” the
structures that blur the boundary between intransitivity and transitivity, between
passives and actives, and between subjecthood and objecthood. Middles are just one
of several examples of transitional structures discussed in this monograph, which

Evolutionary Syntax. First edition. Ljiljana Progovac

© Ljiljana Progovac 2015. Published 2015 by Oxford University Press

straddle the boundary between stages, providing support for a gradualist approach to
the evolution of syntax.
That one should find fossils of previous stages in the structures of the more recent
stages is consistent with the recurring theme of this monograph, taking the advent of
each new stage to preserve the achievements of the previous stages. In addition to
fossil structures often being used side by side with more complex structures, this monograph
also shows that the fossils of proto-syntax are built into the very foundation of
modern syntactic structures. For example, a modern sentence (TP) is built upon the
foundation of the proto-syntactic small clause, as if the building of a modern
sentence retraces its evolutionary steps.
For each postulated major stage of the evolution of syntax, including the absolu-
tive-like two-word stage, and the hierarchical transitive (vP) and TP stages, this
monograph identifies clear and concrete communicative benefits which would have
driven natural/sexual selection in each case. Not only that, but the level of concrete-
ness and granularity of this proposal makes it possible to seek cross-fertilization
among the (sub)disciplines of syntactic theory, evolutionary biology, neuroscience,
language variation (typology), and even genetics, in pursuit of language origins. This
proposal is also specific enough to be able to shed light on the hominin timeline, as it
is able to discriminate among some competing hypotheses in this regard. One
hypothesis that is not compatible with the findings in this monograph is that syntax
emerged in all its complexity abruptly, as one single (minor) mutation/event.
By decomposing syntax into its evolutionary primitives, this monograph has
demystified some of the otherwise problematic syntactic postulates, including Sub-
jacency, recasting them in a completely novel light: in the light of evolution. For if
syntax evolved gradually, through stages, this progression had to have left a mark on
the very design of syntax, as well as on the way syntax is processed by the brain. This
monograph thus finds an explanation for certain properties of modern syntax in the
nature of its evolution, as well as outlines very specific neuroimaging experiments
designed to test the proposed hypotheses. If language structure arose in a drawn-out
coevolutionary process in which both brain and language structures would have
exerted selection pressures on one another, then “we should expect to find that
human brains exhibit species-unique modifications that tend to ‘fit’ the unique
processing demands imposed by language learning and use” (Deacon 2003: 86–7).
Importantly, the proposals and hypotheses of this monograph are compatible with
the forces of natural/sexual selection, as well as vulnerable to verification not only by
syntactic theory, but also by neuroscience and genetics.
When it comes to genetics, some recent experiments with mice suggest that the
specifically human FOXP2 mutations are responsible for increased synaptic plasti-
city, as well as for increased dendrite connectivity (Enard et al. 2009). While syntactic
theory can help identify proto-structures, and distinguish them from more complex
structures, neuroscience can test if these distinctions are correlated with a different
degree and distribution of brain activation, and genetics can, among other possibil-
ities, shed light on the role of some specific genes in making such connections in the
brain possible (see e.g. Vernes et al. 2007; Newbury and Monaco 2010).
Decomposing syntax into its evolutionary primitives is the only way to arrive at
concrete and testable hypotheses about language origins, as it is the only way to forge
synergy among the fields of syntax, neuro-linguistics, and genetics, by also taking
into account the geography of language variation. While each of these fields on its
own may provide glimpses into the origins of human language, any conclusive
account will ultimately have to be both based on a linguistic theory, and synergistic
with the other relevant disciplines.
In sum, there are several components to this proposal that set it apart from the
other approaches to the evolution of language. First, this approach pursues an
internal reconstruction of the stages of grammar based on the syntactic theory
associated with Minimalism, to arrive at very specific, tangible hypotheses. Second,
it provides an abundance of theoretically analyzed “living fossils” for each postulated
stage, drawn from a variety of languages. Third, and most importantly, this approach
shows how these fossils do not just co-exist side by side with more modern structures,
but that they are in fact literally built into the foundation of these more complex
structures. Fourth, the postulated stages, as well as fossils, are at the appropriate level
of granularity to reveal the selection pressures that would have driven the progression
through stages. Fifth, this approach offers a very specific experimental design for
testing the proposed hypotheses. Last but not least, it arrives at a reconstruction
which stands a chance of being meaningfully correlated with the hominin timeline, as
well as with the quickly accruing genetic evidence.
While this monograph provides a comprehensive framework for studying the
evolution of syntax based on a theory of syntax, it is only a framework, a program,
meant to stimulate further research and lead to better proposals. Further evidence
will need to come from (i) additional syntactic fossils from more languages; from
(ii) a better integration of language variation in syntactic theories; from (iii) neuro-
scientific experiments targeting specific hypotheses about language evolution; and
from (iv) the search for correlations between the geography of language variation and
genetics, but with all of these quests mediated by a coherent and comprehensive
evolutionary framework. While various pieces of the puzzle of the origins of human
language are certainly still missing, my hope is that this book has placed enough
pieces into the right spots to make the contour of the solution discernible.

Appendix
Testing grounds: Neuroimaging
CO-AUTHORED WITH NOA OFEN

1 Syntax and neuroimaging

Broadly speaking, this Appendix considers how evolutionary considerations can provide a
missing piece of the puzzle to bridge the gap between the theory of syntax and the field of
neuroscience. According to e.g. Poeppel and Embick (2005), what is needed but missing for
cross-fertilization between the two fields is a theoretical framework of how they should be
related. This monograph suggests that any such framework will have to take into account
evolutionary origins of syntax, especially if syntax co-evolved with the brain. According to
Deacon (2003: 86–7), if language structure arose in a drawn-out coevolutionary process in
which both brain and language structures would have exerted selection pressures on one
another, then “we should expect to find that human brains exhibit species-unique modifica-
tions that tend to ‘fit’ the unique processing demands imposed by language learning and
use . . . Reciprocally, we should expect languages to exhibit structures that optimize limits in
human working memory . . . ” This gives a rationale for why evolutionary considerations may
be the missing piece of the puzzle.
The same evolutionary considerations also promise to provide the necessary points of
contact between the fields of neuroscience and genetics. The data and analyses in this
monograph are presented in sufficient detail to allow for the formulation of specific hypotheses
based on minimally contrasting structures. The availability of such concrete and testable
hypotheses makes the proposals in this monograph vulnerable to falsification.
As pointed out throughout the monograph, neuroimaging methods involving subtraction or
correlation can provide a fertile testing ground for various specific hypotheses advanced in this
monograph. Roughly speaking, the subtraction neuro-scientific method is designed to com-
pare and contrast how certain inputs are processed in the brain by subtracting the brain image
reflecting the processing of one from that of another, isolating the differences between the two.
The correlation method can be roughly characterized as correlating the increase in the stimulus
complexity with the increase in brain activation. Both methods described above can employ
brain-imaging techniques such as functional magnetic resonance imaging (fMRI), which
measures differences in blood oxygenation levels accompanying neuronal activation.
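The two methods can be illustrated in a highly simplified form. The Python sketch below assumes that an activation map can be reduced to a list of numbers, one per voxel, and that stimulus complexity can be expressed as a single number per condition. All values are hypothetical and serve only to show the arithmetic; they are not measured data, and actual fMRI analyses involve preprocessing and statistical modeling well beyond this.

```python
from math import sqrt


def subtract(image_a, image_b):
    """Subtraction method: voxel-by-voxel difference between two activation maps."""
    if len(image_a) != len(image_b):
        raise ValueError("activation maps must cover the same voxels")
    return [a - b for a, b in zip(image_a, image_b)]


def pearson(xs, ys):
    """Correlation method: Pearson's r between stimulus complexity and activation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical activation values for three voxels under two conditions.
full_sentence = [0.9, 0.4, 0.7]
small_clause = [0.5, 0.4, 0.2]
difference = subtract(full_sentence, small_clause)

# Hypothetical complexity levels and one region's activation at each level.
complexity = [1, 2, 3, 4]
activation = [0.2, 0.4, 0.6, 0.8]
r = pearson(complexity, activation)
```

In this toy case the middle voxel drops out of the subtraction because it responds equally to both conditions, which is the sense in which subtraction isolates the differences between two inputs.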
Generally speaking, one can use these methods to determine how proto-syntactic structures
(e.g. root small clauses, middles, exocentric compounds) are processed in comparison to their
more complex hierarchical counterparts, in the hope of isolating neuro-biological correlates of,
for example, TP layering and vP shelling/transitivity. For the reasons discussed below, the
prediction is that the processing of TPs and transitives with vP shells will show clear lateralization
in the left hemisphere, with extensive activation in certain specific Broca’s areas, as well as
possibly in the anterior temporal lobes, while the corresponding proto-structures are expected
to show less lateralization, and less involvement of Broca’s areas, but more reliance on both
hemispheres, as well as, possibly, more reliance on the subcortical structures of the brain (see
Progovac 2010b for these hypotheses).
Despite the current impasse, neuro-linguistic research in the domain of syntax has yielded
enough solid results to serve as a springboard for continued search for knowledge in this area.1
There is growing consensus in the literature that language processing involves a large number
of small but clustered and interconnected modules, as well as that the right hemisphere is also
involved in language processing, more than previously thought (see e.g. Bookheimer 2002;
Embick et al. 2000; Friederici, Meyer, and von Cramon 2000; Moro et al. 2001; Brennan et al.
2012). More specifically, various findings suggest that syntax itself is not a monolith, but a
complex phenomenon that recruits multiple loci in the brain. In this respect, Moro et al. (2001:
117) point out that syntactic capacities are not implemented in a single area, but rather
“constitute an integrated system which involves both left and right neocortical areas, as well
as other portions of the brain, such as the basal ganglia and the cerebellum.” Grodzinsky and
Friederici (2006: 240) similarly conclude that each subpart of the linguistic system, including
syntax, “can be neurologically decomposed into subcomponents.” These findings are consist-
ent with, and expected under, the evolutionary considerations explored in this project.
There are already quite concrete and specific findings about how some syntactic phenomena
are processed. Neuroimaging findings support the claim that syntactic movement is associated
with increased involvement of the inferior frontal gyrus (IFG). More specifically, syntactic
movement is associated with increased activations in the left IFG, clustering around Broca’s
areas: Brodmann Areas (BA) 44 and 45, but also BA 46 and 47 (see e.g. Ben-Shachar, Palti, and
Grodzinsky 2004; Constable et al. 2004; Friederici et al. 2006; Grodzinsky 2010; Grodzinsky
and Friederici 2006; Stromswold et al. 1996).2 The neural investigations mentioned above focus
on the types of movement that involve visible rearrangements of the basic sentential constitu-
ents: the subject, the verb, and the object. For example, Ben-Shachar, Palti, and Grodzinsky
(2004) consider object preposing in topicalization (as in This paper, John dislikes) and
wh-questions (as in Which paper does John dislike?) in Hebrew and conclude that both types
of movement yield comparable activation in a consistent set of brain regions, including left
IFG. According to Grodzinsky and Friederici (2006: 244), complexity in these studies can be
measured as the number of moved constituents.3

1
According to Poeppel and Embick (2005), among others (also Poeppel 2008; Fedorenko and
Kanwisher 2009), current neuro-linguistic research in the domain of syntax presents a case of cross-
sterilization, rather than cross-fertilization. This is because, according to them, no meaningful correlates
have been found, nor are expected to be found, between biological units of neuroscience (e.g. neurons,
dendrites, axons) and the formal syntactic postulates such as Move, Subjacency, Theta-Criterion. The
proposal in this book (also in Progovac 2010b) is that the missing piece needed to bridge the two vastly
different fields is the consideration of the evolution of syntax. This Appendix elaborates on that idea.
2
In addition, syntactic movement poses specific comprehension difficulties for aphasic patients
suffering from a lesion in Broca’s region (e.g. Caramazza and Zurif 1976; Grodzinsky 2000; Zurif et al. 1993).
3
In assessing relative complexity, the literature on this topic typically uses as a starting point what are
referred to as basic, canonical sentences, such as John ran; John dislikes the paper. In contrast, this proposal

There is converging evidence in the literature showing that increased syntactic complexity
corresponds to increased neural activation in certain specific areas of the brain (see
e.g. Caplan 2001; Indefrey, Hagoort, et al. 2001; Just et al. 1996; Pallier, Devauchelle, and
Dehaene 2011; Brennan et al. 2012). The experiments performed by Pallier, Devauchelle,
and Dehaene (2011) and Brennan et al. (2012) found a positive correlation between the levels
of hierarchical structure and the degree of activation, even when keeping the number of words
constant. In Pallier, Devauchelle, and Dehaene’s experiment, the subjects were exposed to
twelve-word strings, but the conditions varied based on whether these twelve words were a
single sentence, two or more shorter sentences, or just random words. The cumulative building
of structure showed correlated accumulation of activation both in IFG areas, and in temporal
lobe areas (e.g. posterior superior temporal sulcus (pSTS)). Most accumulation occurred in the
single sentence condition, and the least accumulation with strings of random words, even
though the accumulation was logarithmic, rather than linear. Brennan et al. (2012) exposed
their subjects to a naturalistic twelve-minute story-telling experiment, in which the subjects
passively listened to a fairy tale. Each word in the story was analyzed for its level of hierarchical
embedding, and the degree of embedding was found to correlate with the amount of activation
in the anterior temporal lobes, as well as in the left posterior temporal lobe, left IFG, and medial
prefrontal cortex.
Section 2 of this Appendix considers root small clauses in Serbian and English, in contrast to
their full sentential counterparts. Section 3 considers flat exocentric compounds in contrast to
their hierarchical counterparts.

2 Small clauses vs. full sentences

Recall from Chapters 2 and 3 that Serbian unaccusative clauses are in productive use in three
syntactic patterns: unaccusative (TP-less) root small clauses with the underlying VS order (1),
TPs with the same V-T-S order (2), and TPs with subject movement, resulting in S-T-V order (3):

(1) a. Stigla pošta.
       arrived mail
    b. Pala vlada.
       collapsed government

(2) a. Stigla je pošta.
       arrived AUX mail
    b. Pala je vlada.
       collapsed AUX government

(3) a. Pošta je stigla.
       mail AUX arrived
    b. Vlada je pala.
       government AUX collapsed

advocates probing below this “basic” level, to the level of proto-syntax, in an attempt to compare the
processing of TPs, some even transitive, with the processing of fossil structures, which are arguably a
product of proto-syntax, still alive in the brain.

In Chapters 2 and 3, root small clauses such as (1) are analyzed as approximations of
proto-sentences used in a TP-less stage in language evolution, exhibiting paratactic, flat structure
assembled by the operation Conjoin. In contrast, the sentences in (2) and (3) are TPs, headed
by the tense auxiliary je, where (2) keeps the underlying VS order, but (3) additionally Moves
the postverbal subject to the specifier of TP position.
Given that TPs involve a layer of functional structure on top of the VP layer, the structures
in (2) and (3) are necessarily instances of hierarchical, headed syntax. At the very least, the
sentences in (2) and (3) involve more hierarchical layering than those in (1). According to the
proposed analysis in Chapter 2, the examples above differ incrementally in their syntactic
complexity, with (1) being the simplest, and (3) the most complex, because it involves not only
additional hierarchical structure, but also Move.4
This analysis can be subjected to neuroimaging testing by applying either the subtraction
method, or the correlation method, as introduced in Section 1. Starting with the contrast
between the two TPs in (2) and (3), and assuming that movement incurs a processing cost, as
established in many references discussed in Section 1, one can expect a difference in brain
activation between these two types of structures. Any additional activation with (3) should thus
reflect the neural correlates of syntactic movement of this kind. This finding would thus isolate
the processing strategies which support the operation Move. In other words, the hypothesis is
that sentences such as (3), in comparison to those in (2), will show more left-lateralization and
more activation in purely syntactic areas, including, but not limited to, left Broca’s areas.
More relevant to the proto-grammar considerations, one can also compare and contrast the
processing of basic intransitive TPs, such as (2) above, and minimally contrasting TP-less small
clauses in (1), arguably proto-syntactic creations. It is fortunate that these minimally
contrasting pairs share the same meaning and vocabulary, differing on the surface only with respect to
the presence vs. absence of the tiny functional word, auxiliary je, whose presence in this context
contributes no difference in meaning. They are both unaccusative intransitive sentences with
VS word order and with their subjects in situ, that is, not moved. Any detected difference in
their processing would thus isolate a neuro-biological correlate of TP layering, or more
generally, an incremental increase in hierarchical layering. Given that functional categories,
including TPs, are postulates of hierarchical syntax, the prediction is that TPs in (2) will show
more activation in the syntactic areas of the brain than their proto-syntactic counterparts in (1).
This would be a prediction associated with subtracting (1) from (2).5 Employing a correlation
method, such as the one used in Pallier, Devauchelle, and Dehaene (2011) or Brennan et al. (2012),
one can contrast blocks of sentences of type (1) with blocks of sentences of type (2) and look for
enhanced activation with type (2) in e.g. the anterior temporal lobe.

4
At first sight, the examples such as (1) can be seen as sentences whose trees have undergone “pruning,”
to use the metaphor explored in e.g. Friedmann and Grodzinsky (1997). According to the analysis pursued
in this monograph, these sentences are nothing but small clauses (SCs), the most basic (paratactic)
argument/predicate creations, which never were TPs. Instead of adopting the pruning metaphor, which
suggests that we start from the top with the full syntactic tree, and then shed various functional projections,
the view here is that such functional projections are never projected in these structures in the first place. It
is for that reason that the TP-less proto-structures are expected to show less syntactic activation in e.g.
Broca’s areas.
5
Kolk (2006, and several references cited there) has also found that sub-sentential speech in e.g.
German and Dutch, including small clauses, requires less processing time (is processed within a smaller
temporal window), and that it is thus frequently resorted to in agrammatic production as preventive
adaptation.

While the prediction for subtracting (1) from (2) is clear, it is not completely clear what to
predict for the subtraction of (2) from (1). The question is whether (2) completely subsumes (1),
without a residue, or if there is some activation present in (1) but not in (2). Recall the
argument from Chapters 2 and 3 that modern finite sentences are built upon the foundation
of small clauses. This analysis is at the heart of the current syntactic theory adopted in
Minimalism, as well as its predecessors. If this is directly reflected in the activation in the
brain, then it may be that the subtraction of (2) from (1) will be null. However, it is also possible
that there will be some additional activation in the brain associated with root small clauses, in
which case the subtraction of (2) from (1) may be non-null. This important issue can only begin
to be resolved by performing specific neuroimaging experiments of this kind, which will help
identify further hypotheses to be tested.
If there is a residue in the subtraction of (2) from (1), then the residue may involve
activation in the subcortical regions of the brain, as well as in the right hemisphere. One
reason to hypothesize subcortical/right hemisphere activation comes from the expectation
that the processing of proto-structures, those assembled by the operation Conjoin, would
involve more ancient and more scattered processing strategies, as also discussed in Chapter 2.
Another reason to expect this result comes from the observation that small clause structures
tend to be (semi-)formulaic, as evidenced in many English and Serbian examples (see
Chapters 2 and 3). According to e.g. Code (2005) and Wray (2002), formulaic speech in
general is processed by the more ancient structures of the brain, showing resilience in the
case of Broca’s aphasia.
This is consistent with the recent findings that language is not solely supported by Broca’s
and Wernicke’s areas of the brain, but also by the primitive subcortical basal ganglia, given
that damage to the basal ganglia can cause serious harm to linguistic processing (see e.g.
Gibson 1996; Lieberman 2000; Ullman 2006). According to Ullman (2006: 480–1), Broca’s
area is part of a larger circuit that involves the basal ganglia. The two parts of the brain are
densely interconnected, and both are implicated in language processing, including in
morphology and syntax. If the proto-syntactic structures (and the operation Conjoin)
postulated in this monograph provide a foundation for the rest of syntax, and if
proto-syntax is processed in part by subcortical structures of the brain, then it is expected that
damage to these areas in the brain would significantly affect language. To put it another way,
if the foundation is faulty, it will not be able to support the suprastructure. As pointed out in
the previous section, Moro et al.’s (2001) study also reveals activation of basal ganglia and the
cerebellum regions in syntactic processing, as well as the involvement of the right hemisphere
(see also Bookheimer 2002; Embick et al. 2000; Friederici, Meyer, and von Cramon 2000). The
hypothesis here is that this is so because modern syntactic structures still rest on the paratactic
foundation assembled by Conjoin, which in turn relies on the more scattered and more ancient
processing strategies.
In conclusion, while the predictions regarding subtracting (2) from (1) are ambivalent (but
testable), the subtraction of (1) from (2) is clearly expected to show increased activation in
Broca’s areas, and possibly also in anterior temporal lobes. If so, then neuroimaging
experiments in this case can isolate direct neural correlates of utilizing a functional projection (e.g.
TP), and with it hierarchy. In addition, due to the highly specific and concrete nature of these
hypotheses, neuroimaging testing in this case has the potential to tease apart movement (the
operation Move) from hierarchical layering, as well as to observe the contribution of each
hierarchical layer, one at a time.
If the theoretical predictions identified in this section get confirmed, the results will yield a
strictly controlled three-way distinction in graded syntactic complexity: first, flat proto-syntax
with no TP overlay and no movement possibilities, exhibiting only the operation Conjoin (1);
second, hierarchical syntax with a basic functional category, TP, but no movement performed,
exhibiting both Conjoin and Merge (2); and third, hierarchical syntax with both the basic
functional category TP and movement, exhibiting not only the operations Conjoin and Merge,
but also Move (3). This continuity of syntactic complexity, if found to correlate as predicted
with brain activation, would provide plausibility for a gradualist approach to the evolution of
syntax, as well as a promising new way of mediating between the fields of syntax and
neuroscience. It is also significant that this framework can serve as a point of contact, an
intermediary, between the fields of neuro-linguistics and genetics, as discussed in Sections 1.5
and 2.5.4.
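The logic of the subtraction method, as applied to this three-way distinction, can be sketched in a few lines of Python. The sketch treats each sentence type as the set of operations attributed to it in this section and models subtraction as set difference. It assumes, for illustration only, that each operation contributes separably to activation, which is exactly what the proposed experiments would need to test; the names `OPERATIONS` and `subtract` are introduced here and are not part of the analysis itself.

```python
# Operations attributed to each sentence type in this section.
OPERATIONS = {
    1: {"Conjoin"},                   # root small clause, e.g. "Stigla pošta."
    2: {"Conjoin", "Merge"},          # TP without movement, e.g. "Stigla je pošta."
    3: {"Conjoin", "Merge", "Move"},  # TP with subject movement, e.g. "Pošta je stigla."
}


def subtract(minuend: int, subtrahend: int) -> set[str]:
    """Operations present in type `minuend` but absent from type `subtrahend`."""
    return OPERATIONS[minuend] - OPERATIONS[subtrahend]


print(subtract(2, 1))  # {'Merge'}
print(subtract(3, 2))  # {'Move'}
print(subtract(1, 2))  # set()
```

On this idealization the subtraction of (2) from (1) is empty. A non-null residue in an actual experiment would indicate that root small clauses recruit processing resources that this simple subsumption model does not capture.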
One can use the same neuroimaging methods to test the processing of English root small
clauses in (4), in contrast to the full TP counterparts in (5), as per the proposal in Chapter 2.
The small clauses in (4) are expected to show properties similar to those of the Serbian small
clauses discussed above.

(4) Case closed. Problem solved. Point taken. Crisis averted. Mission accomplished.

(5) The case has been closed. The problem has been solved. The point has been taken.
The crisis has been averted. The mission has been accomplished.

Even though the clauses in (4) are certainly not exact equivalents of the Serbian unaccusative
small clauses in (1), they do show enough syntactic similarity to warrant a comparison. First,
these are passive-like structures, in which, just as is the case with unaccusatives, the subject is
not the agent. For that reason, passives and unaccusatives often receive a similar treatment in
syntactic theory (see e.g. Marantz 1984; Belletti 1988; Adger 2003).6 Another similarity is that
both the Serbian data in (1) and the English examples in (4) are rigid small clauses assembled
by the operation Conjoin. As such, they lack the Tense auxiliary verb (and TP), cannot have
their constituents questioned (e.g. *How problem solved?), and cannot embed into other clauses
(e.g. *I think (that) case closed.), as discussed in Chapter 2. Even though the full finite
counterparts of these clauses (5) appear rather wordy, the additions are just functional words
which, in this case, add little, if anything, to the meaning. The predictions regarding these
English data are then comparable to the predictions for Serbian unaccusatives, as outlined
above.

6
While the mainstream syntactic analysis would implicate movement in passive sentences, the
approach explored in this book would suggest that these proto-syntactic passives do not involve
movement, given the general rigidity of proto-syntactic structures in this respect (Chapters 2–4). The results of a
neuroimaging experiment like this can shed light on this matter as well.

The general prediction of this proposal is that the distinction between small clause proto-
syntax and hierarchical syntax cuts across a variety of data and even languages, and that one
can isolate this distinction by looking at various minimally contrasting data of this kind across
languages and constructions, including verbal compounds discussed in the following section.

3 Flat vs. hierarchical compounds

Neuroimaging experiments can also be devised to compare and contrast the processing of flat
proto-syntactic VN compounds (6) and their hierarchical counterparts (7), both in English and
Serbian, based on the analysis in Chapter 6.

(6) pick-pocket, scare-crow, turn-coat, dare-devil, hunch-back, wag-tail, tattle-tale, kill-joy,
cut-purse, spoil-sport, rattle-snake, catch-word, cry-baby, stink-bug, worry-wart, copy-cat,
turn-table

(7) joy-killer, head-turner, truck-driver, meat-eater, brick-layer, story-teller, tax-payer,
heart-breaker

The flat (fossil) characterization of VN compounds, and their association with the operation
Conjoin, predicts that they will exhibit less syntactic activation, and less lateralization in the left
hemisphere, but possibly more reliance on the right hemisphere and the subcortical structures
of the brain, such as basal ganglia, thalamus, and limbic structures, especially the compounds
involving swearwords. As discussed in Chapter 6, VN compounds specialize for derogatory
reference, and many among them are obscene. Code (2005) has provided some neurological
evidence that swearwords are stored separately from other words, as they can remain intact
even when e.g. aphasic patients cannot access the rest of language. According to Code (2005),
the processing of swearwords relies on the right hemisphere of the brain, and on the
subcortical structures, considered to be involved in emotional processing in general.
According to LeDoux (2000: 159), while the triune brain and the limbic theory for emotional
processing (e.g. MacLean 1949; Isaacson 1982) may not provide an adequate theory of the
specific brain circuits for emotion, “MacLean’s original ideas are very interesting in the context
of a general evolutionary explanation of emotion and the brain.7 In particular, the notion that
emotions involve relatively primitive circuits that are conserved throughout mammalian
evolution seems right on target.”
LeDoux (2000: 159) further acknowledges that it is possible that cognitive processes involve
other circuits, and that they might function relatively independently of emotional circuits.
The VN compounds thus bring together both proto-syntactic structure
assembled by the operation Conjoin and the subcortical underpinnings of swearing, rendering
these compounds of particular relevance for the study of language evolution. The approach to the
evolution of syntax outlined in this monograph provides some postulates which are at the right
level of granularity to help bridge the gap between the fields of syntax and neuroscience. This
approach may also provide an intermediary between genetic considerations and neuro-linguistic
considerations, another important piece of the puzzle for evolutionary considerations.

7
For MacLean’s notion of the triune brain, see Section 2.5.5.

Acknowledgments

This Appendix is based on an ongoing joint project with neuroscientist Noa Ofen, Institute of
Gerontology/Pediatrics, Wayne State University. The project, titled “In Search of Protosyntax
in the Brain,” is supported by the 2014 Marilyn Williamson Endowed Distinguished Faculty
Fellowship.

References
Abney, Steven P. (1987). The English Noun Phrase in its Sentential Aspect. Ph.D. Dissertation,
Massachusetts Institute of Technology, Cambridge, MA.
Ackema, Peter (1998). ‘The Nonuniform Structure of Dutch N-V Compounds,’ in Geert Booij
and Jaap van Marle (eds.), Yearbook of Morphology. Dordrecht: Kluwer Academic Publishers,
127–58.
Adams, Edward L. (1913). Word-Formation in Provençal. New York.
Adams, Valerie (1973). An Introduction to Modern English Word Formation. English Language
Series. London: Longman.
Adger, David (2003). Core Syntax: A Minimalist Approach. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. (2005). ‘Serial verb constructions in typological perspective,’ in
Alexandra Y. Aikhenvald and R.M.W. Dixon (eds.), Serial Verb Constructions: A Cross-
Linguistic Typology. Oxford: Oxford University Press, 1–68.
Aissen, Judith (2003). ‘Differential object marking: Iconicity vs. economy,’ Natural Language
and Linguistic Theory 21: 435–83.
Akmajian, Adrian (1984). ‘Sentence types and the form-function fit,’ Natural Language and
Linguistic Theory 2: 1–23.
Aldridge, Edith (2008). ‘Generative approaches to ergativity,’ Language and Linguistics
Compass 2/5: 966–95.
Alexiadou, Artemis (2001). Functional Structure in Nominals: Nominalization and Ergativity.
Amsterdam/Philadelphia: John Benjamins.
Alexiadou, Artemis (2012). ‘Noncanonical passives revisited: Parameters of nonactive voice,’
Linguistics 50–6: 1079–110.
Alexiadou, Artemis, and Melita Stavrou (1998). ‘On derived nominals in Greek,’ in B. Joseph,
G. Horrocks, and I. Philippaki-Warburton (eds.), Themes in Greek Linguistics II.
Amsterdam: John Benjamins, 101–29.
Anderson, Stephen R. (1988). ‘Morphological change,’ in Frederick J. Newmeyer (ed.),
Linguistics: The Cambridge Survey. Volume I: Linguistic Theory: Foundations. Cambridge:
Cambridge University Press, 324–62.
Andreĭčin, L. (1955). Български език. Sofia.
An, Duk-Ho (2007). ‘Clauses in non-canonical positions at the syntax–phonology interface,’
Syntax 10: 38–79.
Arce-Arenales, Manuel, Melissa Axelrod, and Barbara A. Fox (1994). ‘Active voice and middle
diathesis: A cross-linguistic perspective,’ in B. Fox and P. J. Hopper (eds.), Voice: Form and
Function. Typological Studies in Language 27. Amsterdam/Philadelphia: John Benjamins, 1–21.
Aronoff, Mark, Irit Meir, Carol Padden, and Wendy Sandler (2008). ‘The roots of linguistic
organization in a new language,’ Interaction Studies 9(1): 133–53. doi:10.1075/is.9.1.10aro.
Asher, Nicholas (2000). ‘Truth conditional discourse semantics for parentheticals,’ Journal of
Semantics 17: 31–50.

Aske, Jon (1998). Basque Word Order and Disorder: Principles, Variation, and Prospects.
Amsterdam: Benjamins.
Authier, Gilles, and Katharina Haude (2012). Ergativity, Valency and Voice. Berlin: De Gruyter
Mouton.
Baker, Mark (1988). Incorporation: A Theory of Grammatical Function Changing. Chicago:
University of Chicago Press.
Bar-Shalom, Eva, and William Snyder (1999). ‘On the relationship between Root Infinitives
and Imperatives in Early Child Russian,’ in Annabel Greenhill, Heather Littlefield, and
Cheryl Tano (eds.), Proceedings of Boston University Conference on Language Development
23: 56–67.
Barton, Ellen (1990). Nonsentential Constituents: A Theory of Grammatical Structure and
Pragmatic Interpretation. Amsterdam/Philadelphia: John Benjamins.
Barton, Ellen, and Ljiljana Progovac (2005). ‘Nonsententials in Minimalism,’ in Robert
Stainton and Reinaldo Elugardo (eds.), Ellipsis and Nonsentential Speech. Dordrecht: Kluwer,
71–93.
Bates, Elizabeth, Laura Benigni, Inge Bretherton, Luigia Camaioni, and Virginia Volterra
(1979). The Emergence of Symbols: Cognition and Communication in Infancy. New York,
NY: Academic Press.
Bauer, Heinrich (1833). Vollständige Grammatik der neuhochdeutschen Sprache. Berlin:
G. Reimer. (Reprinted in 1967, Berlin: Walter de Gruyter.)
Bauman, James (1979). ‘An historical perspective on ergativity in Tibeto-Burman,’ in F. Plank
(ed.), Ergativity: Towards a Theory of Grammatical Relations. London: Academic Press,
419–33.
Belić, Aleksandar (1949). Savremeni srpskohrvatski književni jezik II. Nauka o gradjenju reči.
Beograd: Naučna Knjiga.
Belić, Aleksandar (1960). Istorija srpskohrvatskog jezika. Knjiga II, Sveska 2: Reči sa
konjugacijom. Beograd: Naučna knjiga.
Belletti, Adriana (1988). ‘The case of unaccusatives,’ Linguistic Inquiry 19: 1–34.
Belletti, Adriana, and Luigi Rizzi (2000). An Interview on Minimalism, with Noam Chomsky,
the University of Siena, November 8–9, 1999; revised March 16, 2000.
[http://www.media.unisi.it/ciscl/pubblicazioni.htm.]
Ben-Shachar, Michal, Dafna Palti, and Yosef Grodzinsky (2004). ‘Neural correlates of syntactic
movement: Converging evidence from two fMRI experiments,’ NeuroImage 21: 1320–36.
Berwick, Robert C. (1998). ‘Language evolution and the Minimalist Program: The origins of
syntax,’ in James R. Hurford, Michael Studdert-Kennedy, and Chris Knight (eds.),
Approaches to the Evolution of Language: Social and Cognitive Bases. Cambridge: Cambridge
University Press, 320–40.
Berwick, Robert, and Noam Chomsky (2011). ‘The Biolinguistic Program. The current state of
its development,’ in Anna Maria Di Sciullo and Cedric Boeckx (eds.), The Biolinguistic
Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty.
Oxford: Oxford University Press, 19–41.
Berwick, Robert C., Angela D. Friederici, Noam Chomsky, and Johan J. Bolhuis (2013).
‘Evolution, brain, and the nature of language,’ Trends in Cognitive Sciences 17.2: 92–8.

Berwick, Robert C., Marc D. Hauser, and Ian Tattersall (2013). ‘Neanderthal language? Just-so
stories take center stage,’ Frontiers in Psychology 4. doi: 10.3389/fpsyg.2013.00671.
Bickerton, Derek (1990). Language and Species. Chicago: University of Chicago Press.
Bickerton, Derek (1995). Language and Human Behavior. Seattle, WA: University of
Washington Press.
Bickerton, Derek (1998). ‘Catastrophic evolution: The case for a single step from protolanguage
to full human language,’ in J.R. Hurford, M. Studdert-Kennedy, and C. Knight (eds.),
Approaches to the Evolution of Language: Social and Cognitive Bases. Cambridge: Cambridge
University Press, 341–58.
Bickerton, Derek (2007). ‘Language evolution: A brief guide for linguists,’ Lingua 117: 510–26.
Bickerton, Derek (2012). ‘The origins of syntactic language,’ in Maggie Tallerman and Kathleen
R. Gibson (eds.), The Oxford Handbook of Language Evolution. Oxford: Oxford University
Press, 456–68.
Bickerton, Derek (2014). ‘Some problems for Biolinguistics,’ Biolinguistics 8: 73–96.
Blake, Barry (1976). ‘On ergativity and the notion of subject,’ Lingua 39.4: 281–300.
Bloom, Lois (1970). Language Development: Form and Function in Emerging Grammars.
Cambridge, MA: MIT Press.
Bloom, Paul (ed.) (1994). Language Acquisition: Core Readings. Cambridge, MA: MIT Press.
Bobaljik, Jonathan (1993). ‘On ergativity and ergative unergatives,’ in C. Phillips (ed.), Papers
on Case and Agreement II. MIT Working Papers in Linguistics 19. Cambridge, MA, 45–88.
Boeckx, Cedric (2008). Bare Syntax. Oxford: Oxford University Press.
Boeckx, Cedric, and Kleanthes K. Grohmann (2007). ‘Putting phases in perspective,’ Syntax 10:
204–22.
Bok-Bennema, Reineke (1991). Case and Agreement in Inuit. Dordrecht: Foris.
Bolufer, José Alemany (1920). Tratado de la formación de palabras en la lengua castellana.
Madrid: Librería General de Victoriano Suárez. Preciados, núm. 48.
Bookheimer, Susan (2002). ‘Functional MRI of language: New approaches to understanding
the cortical organization of semantic processing,’ Annual Review of Neuroscience 25: 151–88.
Borer, Hagit (1994). ‘The projection of arguments,’ in E. Benedicto and J. Runner (eds.),
University of Massachusetts Occasional Papers in Linguistics 17. Amherst: GLSA, 19–47.
Borgonovo, Claudia, and Ad Neeleman (2000). ‘Transparent adjuncts,’ Canadian Journal of
Linguistics 45: 199–224.
Bošković, Željko (2008). ‘What will you have, DP or NP?’ in M. Walkow and E. Elfner (eds.),
Proceedings of NELS 37. Amherst: GLSA, 101–14.
Botha, Rudolf (2006). ‘On the windows approach to language evolution,’ Language and
Communication 26: 129–43.
Bottari, Piero (1992). ‘Romance passive nominals,’ Geneva Generative Papers 0.0: 66–80.
Bouchard, Denis (2013). The Nature and Origin of Language. Oxford: Oxford University Press.
Bowers, John (1993). ‘The syntax of predication,’ Linguistic Inquiry 24: 591–656.
Bradshaw, John L. (2001). Developmental Disorders of the Frontostriatal System. Hove:
Psychology Press.
Brain, Walter Russell, and Roger Bannister (1992). Clinical Neurology, 7th edition. Oxford and
New York: Oxford University Press.

Brennan, Jonathan, Y. Nir, U. Hasson, R. Malach, D.J. Haeger, and L. Pylkkänen (2012).
‘Syntactic structure building in the anterior temporal lobe during natural story listening,’
Brain and Language 120: 163–73.
Briscoe, Ted (2000). ‘Grammatical acquisition: Inductive bias and co-evolution of language
and the language acquisition device,’ Language 76: 245–96.
Broca, Paul (1878). ‘Anatomie comparée des circonvolutions cérébrales: Le grand lobe
limbique et la scissure limbique dans la série des mammifères,’ Revue anthropologique 1: 385–498.
Bruening, Benjamin (2014). ‘Precede-and-command revisited,’ Language 90.2: 342–88.
Burling, Robbins (2005). The Talking Ape: How Language Evolved. Oxford: Oxford University
Press.
Burzio, Luigi (1981). Intransitive verbs and Italian auxiliaries. Ph.D. Dissertation,
Massachusetts Institute of Technology, Cambridge, MA.
Burzio, Luigi (1986). Italian Syntax: A Government-Binding Approach. Dordrecht: Kluwer.
Caplan, David (2001). ‘Functional neuroimaging studies of syntactic processing,’ Journal of
Psycholinguistic Research 30: 297–320.
Caramazza, Alfonso, and Edgar B. Zurif (1976). ‘Dissociation of algorithmic and heuristic
processes in sentence comprehension: Evidence from aphasia,’ Brain and Language 3:
572–82.
Cardinaletti, Anna, and Maria Teresa Guasti (eds.) (1995). Small Clauses. San Diego, CA:
Academic Press.
Carroll, Sean B. (2005). Endless Forms Most Beautiful: The New Science of Evo Devo and the
Making of the Animal Kingdom. New York: W. W. Norton and Company.
Carstairs-McCarthy, Andrew (1992). Current Morphology. London: Routledge.
Casielles, Eugenia, and Ljiljana Progovac (2010). ‘On protolinguistic “fossils”: Subject-verb vs.
verb-subject structures,’ in Andrew D. M. Smith, Marieke Schouwstra, Bart de Boer, and
Kenny Smith (eds.), The Evolution of Language: Proceedings of the 8th International
Conference (EVOLANG 8). Hackensack, NJ: World Scientific, 66–73.
Casielles, Eugenia, and Ljiljana Progovac (2012). ‘Protosyntax: A thetic (unaccusative) stage,’
Theoria et Historia Scientiarum: An International Journal for Interdisciplinary Studies IX:
29–48.
Cheng, Lisa (1986). ‘De in Mandarin,’ Canadian Journal of Linguistics 31: 313–26.
Chierchia, Gennaro, and Sally McConnell-Ginet (1990). Meaning and Grammar. Cambridge,
MA: MIT Press.
Chomsky, Noam (1956). ‘Three models for the description of language,’ IRE Transactions on
Information Theory 2: 113–24.
Chomsky, Noam (1980). Rules and Representations. New York: Columbia University Press.
Chomsky, Noam (1981). Lectures on Government and Binding. Dordrecht: Foris.
Chomsky, Noam (1986). Barriers. Cambridge, MA: MIT Press.
Chomsky, Noam (1988). Language and Problems of Knowledge. The Managua Lectures.
Cambridge, MA: MIT Press.
Chomsky, Noam (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam (2000). ‘Minimalist inquiries: The framework,’ in Roger Martin, David
Michaels, and Juan Uriagereka (eds.), Step by Step: Essays on Minimalist Syntax in Honor
of Howard Lasnik. Cambridge, MA: MIT Press, 89–155.

Chomsky, Noam (2001). ‘Derivation by phase,’ in Michael Kenstowicz (ed.), Ken Hale: A Life
in Language. Cambridge, MA: MIT Press, 1–52.
Chomsky, Noam (2002). On Nature and Language. (Edited by Adriana Belletti and Luigi
Rizzi). Cambridge: Cambridge University Press.
Chomsky, Noam (2004). ‘Beyond explanatory adequacy,’ in Adriana Belletti (ed.), Structures
and Beyond: The Cartography of Syntactic Structures, vol. 3. Oxford: Oxford University
Press, 104–31.
Chomsky, Noam (2005). ‘Three factors in language design,’ Linguistic Inquiry 36: 1–22.
Chomsky, Noam (2008). ‘On phases,’ in Robert Freidin, Carlos P. Otero, and Maria Luisa
Zubizarreta (eds.), Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger
Vergnaud. Cambridge, MA: MIT Press, 133–66.
Chomsky, Noam (2010). ‘Some simple evo-devo theses: How true might they be for language?’
in R. K. Larson, V. M. Deprez, and H. Yamakido (eds.), Approaches to the Evolution of Language.
Cambridge: Cambridge University Press, 45–62.
Christiansen, Morten H., and Nick Chater (2008). ‘Language as shaped by the brain,’
Behavioral and Brain Sciences 31: 489–558.
Churchward, Maxwell C. (1953). Tongan Grammar. London: Oxford University Press.
Cinque, Guglielmo (1978). ‘Towards a unified treatment of island constraints,’ in Wolfgang
U. Dressler and Wolfgang Meid (eds.), Proceedings of the Twelfth International Congress of
Linguists, Vienna, August 28 – September 2, 1977. Innsbruck: Universität Innsbruck, 344–8.
Cinque, Guglielmo (1999). Adverbs and Functional Heads: A Cross-linguistic Perspective.
Oxford: Oxford University Press.
Citko, Barbara (2011). Symmetry in Syntax: Merge, Move, and Labels. Cambridge: Cambridge
University Press.
Clancy, Patricia M. (1993). ‘Preferred argument structure in Korean acquisition,’ in Eve Clark
(ed.), The Proceedings of the 25th Annual Child Language Research Forum. Stanford, CA:
Center for the Study of Language and Information, 307–14.
Clark, Brady (2013). ‘Syntactic theory and the evolution of syntax,’ Biolinguistics 7: 169–97.
Clark, Eve, Barbara Frant Hecht, and Randa C. Mulford (1986). ‘Coining complex compounds
in English: Affixes and word order in acquisition,’ Linguistics 24: 7–29.
Clark, Eve V., and Brigid J. S. Barron (1988). ‘A thrower button or a button thrower? Children’s
judgments of grammatical and ungrammatical compound nouns,’ Linguistics 26: 3–19.
Code, Chris (1982). ‘Neurolinguistic analysis of recurrent utterances in aphasia,’ Cortex 18:
141–52.
Code, Chris (2005). ‘First in, last out? The evolution of aphasic lexical speech automatisms to
agrammatism and the evolution of human communication,’ Interaction Studies 6: 311–34.
Cole, Desmond T. (1955). An Introduction to Tswana Grammar. London-New York:
Longmans Green.
Comrie, Bernard (1978). ‘Ergativity,’ in W. P. Lehmann (ed.), Syntactic Typology: Studies in the
Phenomenology of Language. Austin: University of Texas Press, 329–94.
Comrie, Bernard (1989). Language Universals and Linguistic Typology. 2nd edn. Oxford:
Blackwell.
Comrie, Bernard (2002). ‘Reconstruction, typology and reality,’ in R. Hickey (ed.), Motives for
Language Change. Cambridge: Cambridge University Press, 243–57.

Constable, R. T., K. R. Pugh, E. Berroya, W. E. Mencl, M. Westerveld, W. Ni, and D. Shankweiler
(2004). ‘Sentence complexity and input modality effects in sentence comprehension: An
fMRI study,’ NeuroImage 22: 11–21.
Crysmann, Berthold (2006). ‘Coordination,’ in Keith Brown (ed.), Encyclopedia of Language
and Linguistics, Vol. 3. Oxford: Elsevier, 183–96.
Culicover, Peter W. (1999). Syntactic Nuts. Oxford: Oxford University Press.
Culicover, Peter W., and Ray Jackendoff (2005). Simpler Syntax. New York: Oxford University
Press.
Darmesteter, Arsène (1894). Traité de la formation des mots composés dans la langue française
comparée aux autres langues romanes et au latin. Paris: Émile Bouillon.
Darmesteter, Arsène (1934). A Historical French Grammar. Authorized English edition by
Alphonse Hartog. First edition in 1899. London: Macmillan and Co.
Darwin, Charles (1859/1964). On the Origin of Species. 1964 facsimile edition. Cambridge, MA:
Harvard University Press. First published in 1859.
Darwin, Charles (1872). The Expression of the Emotions in Man and Animals. London: John
Murray.
Darwin, Charles M. A. (1874). The Descent of Man, and Selection in Relation to Sex. New
edition, revised and augmented. New York: Hurst and Company.
Dawkins, Richard (1996). The Blind Watchmaker: Why the Evidence of Evolution Reveals a
Universe without Design. New York: W. W. Norton and Company.
Dawkins, Richard (1999). The Extended Phenotype: The Long Reach of the Gene. Oxford:
Oxford University Press.
Deacon, Terrence W. (1997). The Symbolic Species. New York: Norton.
Deacon, Terrence W. (2003). ‘Multilevel selection in a complex adaptive system: The problem
of language origins,’ in Bruce H. Weber and David J. Depew (eds.), Evolution and Learning:
The Baldwin Effect Reconsidered. A Bradford Book. Cambridge, MA: The MIT Press, 81–106.
de Diego, Vicente Garcia (1914). Elementos de gramática histórica castellana. Burgos: El Monte
Carmelo.
Dediu, Dan (2008). ‘Causal correlations between genes and linguistic features: The mechanism
of gradual language evolution,’ in Andrew Smith, Kenneth Smith, and Ramon Ferrer i
Cancho (eds.), Evolution of Language: Proceedings of the 7th International Conference,
Barcelona, Spain, March 12–15. Hackensack, NJ: World Scientific, 83–90.
Dediu, Dan, and D. Robert Ladd (2007). ‘Linguistic tone is related to the population frequency
of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin,’ Proceedings
of the National Academy of Sciences of the USA 104: 10944–9.
Dediu, Dan, and Stephen C. Levinson (2013). ‘On the antiquity of language: The reinterpret-
ation of Neandertal linguistic capacities and its consequences,’ Frontiers in Psychology 4: 397.
doi: 10.3389/fpsyg.2013.00397.
Delbrück, Berthold (1893–1900). Vergleichende Syntax der Indogermanischen Sprachen. (Part 3,
Karl Brugmann and Berthold Delbrück. 1900. Grundriss der vergleichenden Grammatik der
indogermanischen Sprachen.) Strassburg: Karl J. Trübner.
den Dikken, Marcel (2005). ‘Comparative correlatives comparatively,’ Linguistic Inquiry 36:
497–532.
den Dikken, Marcel (2006). Relators and Linkers: The Syntax of Predication, Predicate Inver-
sion, and Copulas. Linguistic Inquiry Monographs 47. Cambridge, MA: MIT Press.
Depew, David J. (2003). ‘Baldwin and his many effects,’ in Bruce H. Weber and David J. Depew
(eds.), Evolution and Learning: The Baldwin Effect Reconsidered. A Bradford Book.
Cambridge, MA: MIT Press, 3–31.
Deutscher, Guy (2000). Syntactic Change in Akkadian: The Evolution of Sentential Comple-
mentation. Oxford: Oxford University Press.
Deutscher, Guy (2005). The Unfolding of Language: An Evolutionary Tour of Mankind’s
Greatest Invention. New York: Metropolitan Books.
Diez, Friedrich (1838). Grammatik der romanischen Sprachen. Bonn: Weber.
Diller, Karl C., and Rebecca L. Cann (2009). ‘Evidence against a genetic-based revolution in
language 50,000 years ago,’ in Rudolf Botha and Chris Knight (eds.), The Cradle of
Language. Oxford: Oxford University Press, 135–49.
Diller, Karl C., and Rebecca L. Cann (2012). ‘Genetic influence on language evolution: An
evaluation of the evidence,’ in Maggie Tallerman and Kathleen R. Gibson (eds.), The Oxford
Handbook of Language Evolution. Oxford: Oxford University Press, 168–75.
Diller, Karl C., and Rebecca L. Cann (2013). ‘Genetics, evolution, and the innateness of
language,’ in Rudolf Botha and Martin Everaert (eds.), The Evolutionary Emergence of
Language. Oxford: Oxford University Press, 244–58.
Dixon, Robert M. W. (1994). Ergativity. Cambridge Studies in Linguistics 69. Cambridge:
Cambridge University Press.
Dixon, Robert M. W. (1995). ‘Complement clauses and complementation strategies,’ in Frank
R. Palmer (ed.), Grammar and Meaning: Essays in Honour of Sir John Lyons. New York:
Cambridge University Press, 175–220.
Dixon, Robert M. W. (1997). The Rise and Fall of Language. Cambridge: Cambridge University
Press.
Dobzhansky, Theodosius (1973). ‘Nothing in biology makes sense except in the light of
evolution,’ American Biology Teacher 35: 125–9.
Dong, Quang P. (1971). ‘English sentences without overt grammatical subject,’ in A. M. Zwicky,
P. H. Salus, R. I. Binnick, and A. L. Vanek (eds.), Studies Out in Left Field: Defamatory Essays
Presented to James D. McCawley on his 33rd or 34th Birthday. Edmonton, AB, CA: Linguistic
Research, 3–20.
Downing, Pamela A. (1977). ‘On the creation and use of English compound nouns,’ Language
53.4: 810–42.
Dowty, David (1991). ‘Thematic proto-roles and argument selection,’ Language 67.3:
547–619.
Dubinsky, Stanley, Marie Egan, A. René Schmauder, and Matthew J. Traxler (2000). ‘Func-
tional projections of predicates: Experimental evidence from coordinate structure process-
ing,’ Syntax 3.3: 182–214.
Du Bois, John W. (1985). ‘Competing motivations,’ in John Haiman (ed.), Iconicity in Syntax.
Proceedings of a Symposium on Iconicity in Syntax, Stanford, June 24–26, 1983. Amsterdam:
John Benjamins, 343–65.
Du Bois, John W. (1987). ‘The discourse basis of ergativity,’ Language 63: 805–55.
Dukes, Michael (1998). ‘Evidence for grammatical functions in Tongan,’ in Miriam Butt and
Tracy Holloway King (eds.), Proceedings of the LFG 98 Conference, the University of
Queensland, Brisbane. Stanford: CSLI Publications, 144–61.
Dunbar, Robin I. M., Neil Duncan, and Anna Marriott (1997). ‘Human conversational behavior,’
Human Nature 8: 231–46.
Dwyer, David (1986). ‘What are chimpanzees telling us about language?’ Lingua 69: 219–44.
Elderkin, Edward D. (1986). ‘Khoe tone and Khoe “junctures,” ’ in R. Vossen and K. Keuth-
mann (eds.), Contemporary Studies on Khoisan. Volume 1. Hamburg: Helmut Buske, 225–35.
Eldredge, Niles, and Stephen Jay Gould (1972). ‘Punctuated equilibria: An alternative to
phyletic gradualism,’ in T. J. M. Schopf (ed.), Models in Paleobiology. San Francisco:
Freeman Cooper, 82–115.
Embick, D., A. Marantz, Y. Miyashita, W. O’Neil, and K. L. Sakai (2000). ‘A syntactic
specialization for Broca’s area,’ Proceedings of the National Academy of Sciences of the
United States of America 97: 6150–4.
Emonds, Joseph E. (1976). A Transformational Approach to English Syntax: Root, Structure
Preserving, and Local Transformations. San Diego: Academic Press.
Enard, Wolfgang, Molly Przeworski, Simon E. Fisher, Cecilia S. L. Lai, Victor Wiebe, Takashi
Kitano, Anthony P. Monaco, and Svante Pääbo (2002). ‘Molecular evolution of FOXP2, a
gene involved in speech and language,’ Nature 418: 869–72.
Enard, W., S. Gehre, K. Hammerschmidt, S. M. Hölter, T. Blass, M. Somel, M. K. Brückner,
S. Schreiweis, C. Winter, R. Sohr, L. Becker, V. Wiebe, B. Nickel, T. Giger, U. Müller,
M. Groszer, T. Adler, A. Aguilar, I. Bolle, J. Calzada-Wack, C. Dalke, N. Ehrhardt, J. Favor,
H. Fuchs, V. Gailus-Durner, W. Hans, G. Hölzlwimmer, A. Javaheri, S. Kalaydjiev,
M. Kallnik, E. Kling, S. Kunder, I. Moßbrugger, B. Naton, I. Racz, B. Rathkolb, J. Rozman,
A. Schrewe, D. H. Busch, J. Graw, B. Ivandic, M. Klingenspor, T. Klopstock, M. Ollert,
L. Quintanilla-Martinez, H. Schulz, E. Wolf, W. Wurst, A. Zimmer, S. E. Fisher,
R. Morgenstern, T. Arendt, M. Hrabé de Angelis, J. Fischer, J. Schwarz, S. Pääbo (2009).
‘A humanized version of FOXP2 affects cortico-basal ganglia circuits in mice,’ Cell 137:
961–7.
Epstein, Samuel D., Hisatsugu Kitahara, and T. Daniel Seely (2010). ‘Uninterpretable features:
What are they and what do they do?’ in Michael T. Putnam (ed.), Exploring Crash-Proof
Grammars. Amsterdam: John Benjamins, 125–42.
Everett, Dan (2005). ‘Cultural constraints on grammar and cognition in Pirahã: Another look
at the design features of human language,’ Current Anthropology 46.4: 621–46.
Fabb, Nigel (1984). Syntactic Affixation. Ph.D. Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
Fedorenko, Evelina, and Nancy Kanwisher (2009). ‘A new approach to investigating the
functional specificity of language regions in the brain.’ Poster at the Neurobiology of
Language Conference, Chicago.
Ferrari, Franca (2005). A Syntactic Analysis of the Nominal Systems of Italian and Luganda:
How Nouns Can Be Formed in the Syntax. Ph.D. Dissertation, New York University.
Finlayson, Clive (2009). The Humans Who Went Extinct: Why Neanderthals Died out and We
Survived. Oxford: Oxford University Press.
Fisher, Ronald A. (1930). The Genetical Theory of Natural Selection. Oxford: Clarendon Press.
Fitch, W. Tecumseh (2008). ‘Co-evolution of phylogeny and glossogeny: There is no “logical
problem of language evolution,” ’ Behavioral and Brain Sciences 31.5: 521–2.
Fitch, W. Tecumseh (2010). The Evolution of Language. Cambridge: Cambridge University
Press.
Fitch, W. Tecumseh, Marc D. Hauser, and Noam Chomsky (2005). ‘The evolution of the
language faculty: Clarifications and implications,’ Cognition 97: 179–210.
Floricic, Franck (2009). ‘The Italian verb-noun anthroponymic compounds at the
syntax/morphology interface,’ Morphology. DOI 10.1007/s11525-009-9126-9.
Fortin, Catherine (2007). ‘Some (not all) nonsententials are only a phase,’ Lingua 117: 67–94.
Francis, Elaine J., and Laura A. Michaelis (eds.) (2003). Mismatch: Form-Function Incongruity
and the Architecture of Grammar. Stanford, CA: CSLI Publications.
Franks, Bradley, and Kate Rigby (2005). ‘Deception and mate selection: Some implications for
relevance and the evolution of language,’ in Maggie Tallerman (ed.), Language Origins:
Perspectives on Evolution. Studies in the Evolution of Language. Oxford: Oxford University
Press, 208–29.
Franks, Steven (1995). Parameters of Slavic Morphosyntax. Oxford: Oxford University Press.
Friederici, Angela D., Christian J. Fiebach, Matthias Schlesewsky, Ina D. Bornkessel, and
D. Yves von Cramon (2006). ‘Processing linguistic complexity and grammaticality in the
Left Frontal Cortex,’ Cerebral Cortex 16: 1709–17.
Friederici, Angela D., Martin Meyer, and D. Yves von Cramon (2000). ‘Auditory language
comprehension: An event-related fMRI study on the processing of syntactic and lexical
information,’ Brain and Language 74: 289–300.
Friedmann, Na’ama (2002). ‘Question production in agrammatism: The tree–pruning hypoth-
esis,’ Brain and Language 80: 160–87.
Friedmann, Na’ama, and Yosef Grodzinsky (1997). ‘Tense and agreement in agrammatic
production: Pruning the syntactic tree,’ Brain and Language 56: 397–425.
Fu, Jingqi, Thomas Roeper, and Hagit Borer (2001). ‘The VP within nominalizations: Evidence
from adverbs and the VP anaphor do-so,’ Natural Language and Linguistic Theory 3: 549–82.
Fukui, Naoki (1986). A Theory of Category Projection and its Implications. Ph.D. Dissertation,
Massachusetts Institute of Technology, Cambridge, MA.
Gair, James (1970). Colloquial Sinhalese Clause Structures. The Hague: Mouton.
Gardner, R. Allen, Beatrix T. Gardner, and Thomas E. Van Cantfort (1989). Teaching Sign
Language to Chimpanzees. Albany, N.Y.: SUNY Press.
Gibson, Kathleen R. (1996). ‘The ontogeny and evolution of the brain, cognition, and
language,’ in Andrew Lock and Charles R. Peters (eds.), Handbook of Human Symbolic
Evolution. Oxford: Clarendon Press, 407–31.
Gil, David (2004). ‘Riau Indonesian sama: Explorations in macrofunctionality,’ in Martin
Haspelmath (ed.), Coordinating Constructions. Amsterdam/Philadelphia: Benjamins,
371–424.
Gil, David (2005). ‘Isolating-monocategorial-associational language,’ in H. Cohen and
C. Lefebvre (eds.), Handbook of Categorization in Cognitive Science. Amsterdam: Elsevier,
347–79.
Gil, David (2012). ‘Where does predication come from?’ Canadian Journal of Linguistics 57:
303–33.
Gil, David (2014). ‘Sign languages, Creoles, and the development of predication,’ in Frederick
J. Newmeyer and Laurel B. Preston (eds.), Measuring Grammatical Complexity. Oxford:
Oxford University Press, 37–64.
Givón, Talmy (1971). ‘Historical syntax and synchronic morphology: An archaeologist’s field
trip,’ in Papers from the 7th Regional Meeting of the Chicago Linguistic Society. Chicago:
Chicago Linguistic Society, 394–415.
Givón, Talmy (1979). On Understanding Grammar. New York: Academic Press.
Givón, Talmy (2002a). Bio-linguistics: The Santa Barbara Lectures. Amsterdam: John Benjamins.
Givón, Talmy (2002b). ‘The visual information-processing system as an evolutionary precursor
to human language,’ in Talmy Givón and Bertram F. Malle (eds.), The Evolution of Language
out of Pre-Language. Typological Studies in Language 53. Amsterdam: John Benjamins, 3–50.
Givón, Talmy (2009). The Genesis of Syntactic Complexity: Diachrony, Ontogeny, Neuro-
cognition, Evolution. Amsterdam/Philadelphia: John Benjamins.
Godfrey-Smith, Peter, Daniel Dennett, and Terrence W. Deacon (2003). ‘Postscript on the
Baldwin Effect and niche construction,’ in Bruce H. Weber and David J. Depew (eds.),
Evolution and Learning: The Baldwin Effect Reconsidered. A Bradford Book. Cambridge,
MA: The MIT Press, 107–12.
Goldin-Meadow, Susan (2005). ‘What language creation in the manual modality tells us about
the foundations of language,’ The Linguistic Review 22: 199–225.
Goldin-Meadow, Susan, and Carolyn Mylander (1983). ‘Gestural communication in deaf
children: Non-effect of parental input on language development,’ Science 221: 372–4.
Gonda, Jan (1956). The Character of the Indo-European Moods (With Special Regard to Greek
and Sanskrit). Wiesbaden: Harrassowitz.
Goodall, Grant (1987). Parallel Structures in Syntax: Coordination, Causatives, and Restructur-
ing. Cambridge: Cambridge University Press.
Gopnik, Myrna, and Martha B. Crago (1991). ‘Familial aggregation of a developmental lan-
guage disorder,’ Cognition 39: 1–50.
Gould, Stephen Jay (1987). ‘The limits of adaptation: Is language a spandrel of the human brain?’
Paper presented to the Cognitive Science Seminar, Center for Cognitive Science, MIT.
Gould, Stephen Jay, and Niles Eldredge (1977). ‘Punctuated equilibria: The tempo and mode of
evolution reconsidered,’ Paleobiology 3: 115–51.
Gray, Russell D., and Quentin D. Atkinson (2003). ‘Language-tree divergence times support
the Anatolian theory of Indo-European origin,’ Nature 426: 435–9.
Greenfield, Patricia M., and Sue Savage-Rumbaugh (1990). ‘Grammatical combination in
Pan paniscus: Process of learning and invention in the evolution and development of
language,’ in S. T. Parker and K. R. Gibson (eds.), Language and Intelligence in
Monkeys and Apes. Cambridge: Cambridge University Press, 540–79.
Grimshaw, Jane (2000). ‘Locality and Extended Projection,’ in Peter Coopmans, Martin
B. H. Everaert, and Jane Grimshaw (eds.), Lexical Specification and Insertion. Amsterdam:
John Benjamins, 115–33.
Grodzinsky, Yosef (2000). ‘The neurology of syntax: Language use without Broca’s area,’
Behavioral and Brain Sciences 23: 1–21.
Grodzinsky, Yosef (2010). ‘The picture of the linguistic brain: How sharp can it be? Reply to
Fedorenko and Kanwisher,’ Language and Linguistics Compass 4/8: 605–22.
Grodzinsky, Yosef, and Angela D. Friederici (2006). ‘Neuroimaging of syntax and syntactic
processing,’ Current Opinion in Neurobiology 16: 240–6.
Groom, Bernard (1937). ‘The formation and use of compound epithets in English poetry from
1579,’ S.P.E. Tract No. XLIX. Clarendon Press.
Gruber, Howard Ernest, and J. Jacques Vonèche (eds.) (1977). The Essential Piaget. New York:
Basic Books, Inc. Publishers.
Guasti, Maria Teresa (2002). Language Acquisition: The Growth of Grammar. Cambridge, MA:
MIT Press.
Guilfoyle, Eithne, and Michael Noonan (1992). ‘Functional categories in language acquisition,’
Canadian Journal of Linguistics 37: 241–72.
Haegeman, Liliane, Benjamin Shaer, and Werner Frey (2009). ‘Postscript: Problems and
solutions for orphan analyses,’ in Benjamin Shaer, Philippa Cook, Werner Frey, and Claudia
Maienborn (eds.), Dislocated Elements in Discourse: Syntactic, Semantic, and Pragmatic
Perspectives. New York: Routledge, 348–65.
Hagstrom, Paul (1998). Decomposing Questions. Ph.D. Dissertation. Massachusetts Institute of
Technology, Cambridge, MA.
Haldane, John Burdon Sanderson (1927). ‘A mathematical theory of natural and artificial
selection. Part V. Selection and Mutation,’ Proceedings of the Cambridge Philosophical
Society 23: 838–44.
Hale, Kenneth (1970). ‘The passive and ergative in language change: The Australian case,’ in
S. Wurm and D. Laycock (eds.), Pacific Linguistic Studies in Honor of Arthur Capell. Pacific
Linguistic Series. Canberra: Linguistics Circle of Canberra, 757–81.
Hale, Mark (1987). ‘Notes on Wackernagel’s Law in the language of the Rigveda,’ in Calvert
Watkins (ed.), Studies in Memory of Warren Cowgill. Berlin: De Gruyter, 38–50.
Hall, Robert A. Jr. (1948a). ‘Ancora i composti del tipo portabandiera, facidanno,’ Lingua
Nostra 9: 22–3.
Hall, Robert A. Jr. (1948b). Descriptive Italian Grammar. Cornell Romance Studies: Volume
II. Ithaca, New York: Cornell University Press and Linguistic Society of America.
Hall, Robert A. Jr. (1964). Introductory Linguistics. Philadelphia: Chilton Books.
Halle, Morris, and Alec Marantz (1993). ‘Distributed morphology and the pieces of inflection,’
in Kenneth Hale and Samuel J. Keyser (eds.), The View from Building 20. Cambridge, MA:
MIT Press, 111–76.
Harris, Alice C., and Lyle Campbell (1995). Historical Syntax in Cross-linguistic Perspective.
Cambridge: Cambridge University Press.
Haspelmath, Martin (2004). ‘Coordinating constructions: An overview,’ in Martin Haspelmath
(ed.), Coordinating Constructions. Amsterdam/Philadelphia: Benjamins, 3–39.
Hauser, Marc, Noam Chomsky, and W. Tecumseh Fitch (2002). ‘The language faculty: What is
it, who has it, and how did it evolve?’ Science 298: 1569–79.
Heine, Bernd (1986). ‘Bemerkungen zur Entwicklung der Verbaljunkturen im Khoe und
anderen Zentralkhoisan-Sprachen,’ in R. Vossen and K. Keuthmann (eds.), Contemporary
Studies on Khoisan, Volume 2. Hamburg: Helmut Buske, 9–21.
Heine, Bernd, and Tania Kuteva (2002). World Lexicon of Grammaticalization. Cambridge:
Cambridge University Press.
Heine, Bernd, and Tania Kuteva (2007). The Genesis of Grammar. A Reconstruction. Oxford:
Oxford University Press.
Heinimann, Siegfried (1949). ‘Die italienischen Imperativ-komposita,’ ASNS CLXXXVI:
136–43.
Hinton, Geoffrey E., and Steven J. Nowlan (1987). ‘How learning can guide evolution,’ Complex
Systems 1: 495–502.
Hinzen, Wolfram (2006). Mind Design and Minimal Syntax. Oxford: Oxford University Press.
Hock, Hans Henrich (1989). ‘Early Indo-European syntactic typology: A different approach.’
Paper presented at the Eighth East Coast Indo-European Conference, Cambridge,
MA. [Harvard University, 10 June 1989.]
Hock, Hans Henrich (1991). Principles of Historical Linguistics. Berlin: Mouton de Gruyter.
Hockett, Charles F. (1960). ‘The origin of speech,’ Scientific American 203: 88–96.
Hockett, Charles F., and Stuart Altmann (1968). ‘A note on design features,’ in Thomas
A. Sebeok (ed.), Animal Communication: Techniques of Study and Results of Research.
Bloomington: Indiana University Press, 61–72.
Hoenigswald, Henry M. (1944). ‘Internal reconstruction,’ Studies in Linguistics 2: 78–87.
Hollebrandse, Bart, and Tom Roeper (2007). ‘Recursion and propositional exclusivity.’
Manuscript.
Holmberg, Anders (1986). Word Order and Syntactic Features in the Scandinavian Languages
and English. Ph.D. Dissertation, University of Stockholm.
Hornstein, Norbert (2009). A Theory of Syntax: Minimal Operations and Universal Grammar.
Cambridge: Cambridge University Press.
Hua, Zhu, and Barbara Dodd (2000). ‘The phonological acquisition of Putonghua (Modern
Standard Chinese),’ Journal of Child Language 27: 3–42.
Huang, James (1982). Logical Relations in Chinese and the Theory of Grammar.
Ph.D. Dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Hurford, Jim R. (1990). ‘Nativist and functional explanations in language acquisition,’ in Iggy
M. Roca (ed.), Logical issues in language acquisition. Dordrecht: Foris, 85–136.
Hurford, Jim (2007). The Origins of Meaning: Language in the Light of Evolution. Oxford:
Oxford University Press.
Hurford, James R. (2012). The Origins of Grammar. Language in the Light of Evolution II.
Oxford: Oxford University Press.
Hurford, James R., and Dan Dediu (2009). ‘Diversity in languages, genes, and the language
faculty,’ in Rudolf Botha and Chris Knight (eds.), The Cradle of Language. Oxford: Oxford
University Press, 166–88.
Indefrey, Peter, Colin M. Brown, Frauke Hellwig, Katrin Amunts, Hans Herzog, Rüdiger
J. Seitz, and Peter Hagoort (2001). ‘A neural correlate of syntactic encoding during speech
production,’ Proceedings of the National Academy of Sciences of the United States of America
98: 5933–6.
Indefrey, Peter, Peter Hagoort, Hans Herzog, Rüdiger J. Seitz, and Colin M. Brown (2001).
‘Syntactic processing in left prefrontal cortex is independent of lexical meaning,’ Neuro-
Image 14: 546–55.
Isaacson, Robert L. (1982). The Limbic System. Second edition. New York and London: Plenum
Press.
Jackendoff, Ray (1999). ‘Possible stages in the evolution of the language capacity,’ Trends in
Cognitive Sciences 3: 272–9.
Jackendoff, Ray (2002). Foundations of Language: Brain, Meaning, Grammar, Evolution.
Oxford: Oxford University Press.
Jackendoff, Ray (2009). ‘Compounding in the Parallel Architecture and Conceptual Seman-
tics,’ in Rochelle Lieber and Pavol Štekauer (eds.), The Oxford Handbook of Compounding.
Oxford: Oxford University Press, 105–28.
Jackendoff, Ray, and Eva Wittenberg (2014). ‘What you can say without syntax: A hierarchy of
grammatical complexity,’ in Frederick J. Newmeyer and Laurel B. Preston (eds.), Measuring
Grammatical Complexity. Oxford: Oxford University Press, 65–82.
Jacob, François (1977). ‘Evolution and tinkering,’ Science 196: 1161–6.
Jarkey, Nerida (2006). ‘Complement clause types and complementation strategy in White
Hmong,’ in R. M. W. Dixon and Alexandra Y. Aikhenvald (eds.), Complementation:
A Cross-linguistic Typology. Oxford: Oxford University Press, 115–36.
Jay, Timothy (1980). ‘Sex roles and dirty word usage: A review of the literature and a reply to
Haas,’ Psychological Bulletin 88: 614–21.
Jay, Timothy (1995). ‘Cursing: A damned persistent lexicon,’ in Douglas Hermann, Marcia
K. Johnson, Cathy McEvoy, Chris Hertzog, and Paula Hertel (eds.), Basic and Applied
Memory: Research on Practical Aspects of Memory. Hillsdale, NJ: Erlbaum, 301–13.
Jespersen, Otto (1922). Language: Its Nature, Development, and Origin. New York:
W. W. Norton and Co.
Jespersen, Otto (1954). A Modern English Grammar. Part III: Syntax. London: Allen and
Unwin.
Johannessen, Janne Bondi (1993). Coordination: A Minimalist Approach. Ph.D. Dissertation,
University of Oslo.
Johns, Brenda, and David Strecker (1982). ‘Aesthetic language in White Hmong,’ in Bruce
T. Downing and Douglas P. Olney (eds.), The Hmong in the West: Observations and Reports.
Minneapolis: University of Minnesota Southeast Asian Refugee Studies Project, Center for
Urban and Regional Affairs, 160–9.
Johnson, David E., and Shalom Lappin (1999). Local Constraints vs. Economy. Stanford
Monographs in Linguistics.
Jordens, Peter (2002). ‘Finiteness in early child Dutch,’ Linguistics 40: 687–765.
Josefsson, Gunlög (2001). Minimal Words in a Minimal Syntax. Amsterdam: Benjamins.
Julien, Marit (2002). Syntactic Heads and Word Formation. New York: Oxford University
Press.
Just, Marcel A., Patricia A. Carpenter, Timothy A. Keller, William F. Eddy, and Keith
R. Thulborn (1996). ‘Brain activation modulated by sentence comprehension,’ Science 274:
114–16.
Kayne, Richard (1982). ‘Predicates and arguments, verbs and nouns,’ GLOW Newsletter 8: 24.
Kayne, Richard (1984). Connectedness and Binary Branching. Dordrecht: Foris.
Kayne, Richard S. (1994). The Antisymmetry of Syntax. Cambridge, MA: MIT Press.
Kegl, Judy, Ann Senghas, and Marie Coppola (1999). ‘Creation through contact: Sign language
emergence and sign language change in Nicaragua,’ in Michel DeGraff (ed.), Language
Creation and Language Change: Creolization, Diachrony, and Development. Cambridge,
MA: MIT Press, 179–237.
Kemmer, Suzanne (1994). ‘Middle voice, transitivity, and the elaboration of events,’ in B. Fox
and P. J. Hopper (eds.), Voice: Form and Function. Amsterdam/Philadelphia: John Benjamins,
179–230.
Kempson, Ruth M. (1977). Semantic Theory. Cambridge Textbooks in Linguistics. Cambridge:
Cambridge University Press.
Kerns, J. Alexander, and Benjamin Schwartz (1972). A Sketch of the Indo-European Finite Verb.
Leiden: E. J. Brill.
Kinsella, Anna R. (2009). Language Evolution and Syntactic Theory. Cambridge: Cambridge
University Press.
Kiparsky, Paul (1968). ‘Tense and mood in Indo-European syntax,’ Foundations of Language 4:
30–57.
Kiparsky, Paul (1995). ‘Indo-European origins of Germanic syntax,’ in Adrian Battye and Ian
Roberts (eds.), Clause Structure and Language Change. Oxford Studies in Comparative
Syntax. Oxford: Oxford University Press, 140–69.
Kitagawa, Yoshihisa (1985). ‘Small but clausal,’ Chicago Linguistic Society 21: 210–20.
Kitagawa, Yoshihisa (1986). Subjects in English and Japanese. Ph.D. Dissertation, University of
Massachusetts, Amherst.
Klein, Wolfgang, and Clive Perdue (1997). ‘The Basic Variety (or couldn’t natural languages be
much simpler?),’ Second Language Research 13: 301–47.
Klemensiewicz, Zenon, Tadeusz Lehr-Spławiński, and Stanisław Urbański (1964). Gramatyka
historyczna języka polskiego. Warszawa: Państwowe Wydawnictwo Naukowe.
Kolk, Herman H. J. (1995). ‘A time-based approach to agrammatic production,’ Brain and
Language 50: 282–303.
Kolk, Herman H. J. (2006). ‘How language adapts to the brain: An analysis of agrammatic
aphasia,’ in Progovac et al. (eds.), 229–58.
Kolk, Herman H. J., Marianne F. van Grunsven, and Antoine Keyser (1985). ‘On parallelism
between production and comprehension in agrammatism,’ in Mary-Louise Kean (ed.),
Agrammatism. New York: Academic Press, 165–206.
Koneski, Blaže (1954). Gramatika na makedonskiot literaturen jezik, del. II. Skopje.
Koopman, Hilda, and Dominique Sportiche (1991). ‘The position of subjects,’ Lingua 85: 211–58.
Kotchoubey, Boris (2005). ‘Pragmatics, prosody, and evolution: Language is more than a
symbolic system,’ Behavioural and Brain Sciences 28: 136–7.
Kratzer, Angelika (1996). ‘Severing the external argument from its verb,’ in Johan Rooryck and
Laurie Zaring (eds.), Phrase Structure and the Lexicon. Dordrecht: Kluwer, 109–37.
Kratzer, Angelika (2000). ‘Building statives,’ in L. Conathan, J. Good, D. Kavitskaya,
A. Wulf, and A. Yu (eds.), Berkeley Linguistic Society 26: 385–99.
Krause, J., C. Lalueza-Fox, L. Orlando, W. Enard, R. Green, H. Burbano, J-J. Hublin, C. Hänni,
J. Fortea, M. Rasilla, J. Bertranpetit, A. Rosas, and S. Pääbo (2007). ‘The derived FOXP2
variant of modern humans was shared with Neanderthals,’ Current Biology 17(1-5): 53–60.
Kuryłowicz, Jerzy (1925/1927). ‘Injonctif et subjonctif dans les Gāthās de l’Avesta,’ Rocznik
Orientalistyczny 3: 164–79.
Kuryłowicz, Jerzy (1964). The Inflectional Categories of Indo-European. Heidelberg: Carl
Winter Universitätsverlag.
Kuryłowicz, Jerzy (1973). ‘Internal reconstruction,’ in Thomas A. Sebeok (ed.), Current
Trends in Linguistics: Diachronic, Areal, and Typological Linguistics. The Hague: Mouton,
63–92.
Lecarme, Jacqueline (2002). ‘Gender polarity: Theoretical aspects of Somali nominal
morphology,’ in Paul Boucher and Marc Plenat (eds.), Many Morphologies. Somerville:
Cascadilla Press, 109–41.
Laka, Itziar (1993). ‘Unergatives that assign ergative, unaccusatives that assign accusative,’ in
J. Bobaljik and C. Phillips (eds.), Papers on Case and Agreement I (MIT Working Papers in
Linguistics 18). Cambridge, MA, 149–72.
Langendoen, D. Terence (1971). ‘Review of Marchand (1969),’ Language 47.3: 708–10.
Lasnik, Howard, and Mamoru Saito (1984). ‘On the nature of proper government,’ Linguistic
Inquiry 15: 235–89.
Lebeaux, David (1989). Language Acquisition and the Form of the Grammar. Ph.D. Dissertation,
University of Massachusetts, Amherst.
LeDoux, Joseph E. (2000). ‘Emotion circuits in the brain,’ Annual Review of Neuroscience 23:
155–84.
Lees, Robert B. (1960). The Grammar of English Nominalizations. The Hague: Mouton.
Legate, Julie A. (2008). ‘Morphological and abstract case,’ Linguistic Inquiry 39: 55–101.
Lehmann, Christian (1985). ‘Ergative and active traits in Latin,’ in Frans Plank (ed.), Relational
Typology. Berlin, New York, Amsterdam: Mouton Publishers, 243–55.
Lehmann, Winfred P. (1969). ‘Proto-Indo-European compounds in relation to other Proto-
Indo-European syntactic patterns,’ Acta Linguistica Hafniensia 12: 1–12.
Levin, Beth, and Malka Rappaport Hovav (1995). Unaccusativity at the Syntax-Lexical Seman-
tics Interface. Linguistic Inquiry Monograph 26. Cambridge, MA: MIT Press.
Levin, Juliette, and Diane Massam (1985). ‘Surface ergativity: Case/theta relations reexamined,’
in S. Berman, J-W. Choe, and J. McDonough (eds.), Proceedings of NELS 15. Amherst, MA: GLSA
Publications, 286–301.
Levinson, Stephen C., and Dan Dediu (2013). ‘The interplay of genetic and cultural factors in
ongoing language evolution,’ in Peter J. Richerson and Morten H. Christiansen (eds.),
Strüngmann Forum Reports, volume 12. Cambridge, MA: MIT Press, 229–32.
Lieber, Rochelle (1992). Deconstructing Morphology: Word Formation in Syntactic Theory.
Chicago and London: The University of Chicago Press.
Lieberman, Philip (2000). Human Language and Our Reptilian Brain: The Subcortical Bases of
Speech, Syntax, and Thought. Cambridge, MA: Harvard University Press.
Lieberman, Philip (2009). ‘FOXP2 and human cognition,’ Cell 137: 801–2.
Liégeois, Frédérique, Torsten Baldeweg, Alan Connelly, David G. Gadian, Mortimer Mishkin,
and Faraneh Vargha-Khadem (2003). ‘Language fMRI abnormalities associated with FOXP2
gene mutation,’ Nature Neuroscience 6.11: 1230–7.
Lightfoot, David (1979). Principles of Diachronic Syntax. Cambridge: Cambridge University
Press.
Lightfoot, David (1991). ‘Subjacency and sex,’ Language and Communication 11: 67–9.
Linnankoski, Ilkka, Maija Laakso, Reijo Aulanko, and Lea Leinonen (1994). ‘Recognition of
emotions in macaque vocalizations by children and adults,’ Language and Communication
14: 183–92.
Ljung, Magnus (1975). ‘Review of Marchand, Hans. 1974,’ in Dieter Kastovsky (ed.), Studies in
Syntax and Word-Formation, Selected Articles by Hans Marchand. München: W. Fink
Verlag.
Lloyd, Paul M. (1968). Verb-Complement Compounds in Spanish. Beihefte zur Zeitschrift für
romanische Philologie. 116. Heft. Tübingen: Max Niemeyer Verlag.
Locke, John L. (2009). ‘Evolutionary developmental linguistics: Naturalization of the faculty of
language,’ Language Sciences 31: 33–59.
Locke, John L., and Barry Bogin (2006). ‘Language and life history: A new perspective on the
evolution and development of linguistic communication,’ Behavioral and Brain Sciences 29:
259–325.
Longa, Victor M. (2006). ‘A misconception about the Baldwin Effect: Implications for language
evolution,’ Folia Linguistica 40: 305–18.
Longobardi, Giuseppe (1994). ‘Reference and proper names: A theory of N-movement in
syntax and Logical Form,’ Linguistic Inquiry 25: 609–65.
Lord, Carol (1975). ‘Igbo verb compounds and the lexicon,’ Studies in African Linguistics 6: 23–48.
MacLean, Paul D. (1949). ‘Psychosomatic disease and the “visceral brain:” Recent developments
bearing on the Papez theory of emotion,’ Psychosomatic Medicine 11: 338–53.
Marantz, Alec (1984). On the Nature of Grammatical Relations. Cambridge, MA: MIT Press.
Marantz, Alec (1997). ‘No escape from syntax: Don’t try morphological analysis in the privacy
of your own lexicon,’ U Penn Working Papers in Linguistics 4.2: 201–25.
Marchand, Hans (1969). The Categories and Types of Present-Day English Word-Formation: A
Synchronic-Diachronic Approach. Second, completely revised and enlarged edition.
München: C. H. Beck’sche Verlagsbuchhandlung.
Maretić, Tomislav (1899). Gramatika i stilistika hrvatskoga ili srpskoga jezika. Zagreb:
L. Hartman.
Marsh, Peter (1978). Aggro: The Illusion of Violence. London: Dent.
Marty, Anton (1918). Gesammelte Schriften. vol. II, part 1. Abteilung. Halle: Max Niemeyer Verlag.
Massam, Diane (2000). ‘VSO and VOS: Aspects of Niuean word order,’ in Andrew Carnie and
Eithne Guilfoyle (eds.), The Syntax of Verb Initial Languages. Oxford: Oxford University
Press, 97–116.
Massam, Diane (2001). ‘Pseudo noun incorporation in Niuean,’ Natural Language and Linguistic
Theory 19: 153–97.
Maxwell, Judith M., and Robert M. Hill II (2006). Kaqchikel Chronicles: The Definitive Edition.
Austin: UT Press.
McBrearty, Sally (2007). ‘Down with the revolution,’ in Paul Mellars, Katie Boyle, Ofer Bar-
Yosef, and Chris Stringer (eds.), Rethinking the Human Revolution: New Behavioral and
Biological Perspectives on the Origin and Dispersal of Modern Humans. University of
Cambridge: McDonald Institute for Archeological Research, 133–51.
McBrearty, Sally, and Alison Brooks (2000). ‘The revolution that wasn’t: A new interpretation
of the origin of modern human behavior,’ Journal of Human Evolution 39: 453–563.
McDaniel, Dana (2005). ‘The potential role of production in the evolution of syntax,’ in Maggie
Tallerman (ed.), Language Origins: Perspectives on Evolution. Studies in the Evolution of
Language. Oxford: Oxford University Press, 153–65.
Meinhof, Carl (1948). Grundzüge einer vergleichenden Grammatik der Bantusprachen. 2nd
edition. Hamburg: Verlag.
Mellars, Paul A. (1991). ‘Cognitive changes and the emergence of modern humans,’ Current
Anthropology 30: 349–85.
Mellars, Paul (2002). ‘Archeology and the origins of modern humans: European and African
perspectives,’ in Tim J. Crow (ed.), The Speciation of Modern Homo Sapiens. Oxford:
Oxford University Press, 31–47.
Mellars, Paul (2007). ‘Introduction: Rethinking the Human Revolution: Eurasian and African
perspectives,’ in Paul Mellars, Katie Boyle, Ofer Bar-Yosef, and Chris Stringer (eds.),
Rethinking the Human Revolution: New Behavioral and Biological Perspectives on the Origin
and Dispersal of Modern Humans. University of Cambridge: McDonald Institute for
Archeological Research, 1–11.
Meunier, Louis-Francis (1875). Les composés qui contiennent un verbe à un mode personnel en
latin, en français, en italien et en espagnol. Paris.
Meyer-Lübke, Wilhelm (1895). Grammaire des Langues Romanes. Tome II. Reprinted in 1923.
New York/Paris: G.E. Stechert and Co.
Migliorini, Bruno (1946). A Note in Lingua Nostra VII.
Mihajlović, Velimir (1992). Ime po zapovesti (Name by Command). Beograd: Nolit.
Miller, D. Gary (1975). ‘Indo-European: VSO, SOV, SVO, or all three?’ Lingua 37: 31–52.
Miller, Geoffrey F. (2000). The Mating Mind: How Sexual Choice Shaped the Evolution of
Human Nature. London: William Heinemann.
Millikan, Ruth Garrett (2004). Varieties of Meaning. The 2002 Jean Nicod Lectures. A Bradford
Book. Cambridge, MA: The MIT Press.
Mirowicz, Anatol (1946). ‘Wartość uczuciowa rozkaźnika a złożenia typu czyścibut,’ Język polski
XXV. W Krakowie.
Mithen, Steven (1996). The Prehistory of the Mind: A Search for the Origins of Art, Religion, and
Science. London: Thames and Hudson.
Mithen, Steven (2006). The Singing Neanderthals: The Origins of Music, Language, Mind, and
Body. Cambridge, MA: Harvard University Press.
Mithun, Marianne (1984). ‘How to avoid subordination,’ Proceedings of the Tenth Annual
Meeting of the Berkeley Linguistics Society, University of California, Berkeley, 493–509.
Mithun, Marianne (1994). ‘The implications of ergativity for a Philippine voice system,’ in
Barbara A. Fox and Paul J. Hopper (eds.), Voice: Form and Function. Amsterdam/Philadelphia:
Benjamins, 247–77.
Mithun, Marianne (2010). ‘The fluidity of recursion and its implications,’ in Harry van der
Hulst (ed.), Recursion and Human Language. Berlin: De Gruyter Mouton, 17–41.
Moro, Andrea (2008). The Boundaries of Babel: The Brain and the Enigma of Impossible
Languages. Cambridge, MA: The MIT Press.
Moro, Andrea, Marco Tettamanti, Daniela Perani, Caterina Donati, Stefano F. Cappa, and
Ferruccio Fazio (2001). ‘Syntax and the brain: Disentangling grammar by selective anomalies,’
NeuroImage 13: 110–18.
Mortensen, David (2014). ‘Learning phonological ordering generalizations for Hmong elaborate
expressions.’ Paper presented at the 2014 Linguistic Society of America Meeting in
Minneapolis.
Mottin, Jean (1978). Elements de Grammaire Hmong Blanc. Bangkok: Don Bosco Press.
Mous, Maarten (2004). ‘The grammar of conjunctive and disjunctive coordination in Iraqw,’ in
Martin Haspelmath (ed.), Coordinating Constructions. Amsterdam/Philadelphia: John
Benjamins, 109–22.
Munn, Alan B. (1993). Topics in the Syntax and Semantics of Coordinate Structures.
Ph.D. Dissertation. University of Maryland, College Park, MD.
Murray, Sarah (2004). ‘ “Truckdriver” and “scarecrow” compounds in English and Spanish:
A unified approach.’ MA Thesis. Wayne State University, Detroit.
Nan, Yun, Yanan Sun, and Isabelle Peretz (2010). ‘Congenital amusia in speakers of a tone language:
Association with lexical tone agnosia,’ Brain 133(9): 2635–42. doi: 10.1093/brain/awq178.
Napoli, Donna Jo (1993). Syntax. Oxford: Oxford University Press.
Nash, Lea (1995). Argument Scope and Case Marking in SOV and in Ergative Languages: The
Case of Georgian. Ph.D. Dissertation, Université Paris 8.
Nash, Lea (1996). ‘The internal ergative subject hypothesis,’ in Kiyomi Kusumoto (ed.), Proceedings
of NELS 26. Amherst: GLSA, 195–210.
Nespor, Marina, and Irene Vogel (1986). Prosodic Phonology. Dordrecht: Foris.
Nevins, Andrew, David Pesetsky, and Cilene Rodrigues (2009). ‘Pirahã exceptionality:
A reassessment,’ Language 85.2: 355–404.
Newbury, Dianne F., and Anthony P. Monaco (2010). ‘Genetic advances in the study of speech
and language disorders,’ Neuron 68: 309–20.
Newman, Paul (2014). ‘The range and beauty of internal reconstruction: Probing Hausa linguistic
history,’ Studies of the Department of African Languages and Cultures [Warsaw] 48: 13–32.
Newman, Paul, and Roxana M. Newman (1977). Modern Hausa-English Dictionary. Ibadan:
University Press.
Newmeyer, Frederick J. (1991). ‘Functional explanation in linguistics and the origin of language,’
Language and Communication 11: 1–28.
Newmeyer, Frederick J. (1998). ‘On the supposed “counterfunctionality” of Universal Grammar:
Some evolutionary implications,’ in James R. Hurford, Michael Studdert-Kennedy, and
Chris Knight (eds.), Approaches to the Evolution of Language: Social and Cognitive Bases.
Cambridge: Cambridge University Press, 305–19.
Newmeyer, Frederick J. (2000). ‘On the reconstruction of “proto-world” word order,’ in Chris
Knight, Michael Studdert-Kennedy, and James R. Hurford (eds.), The Evolutionary Emergence
of Language: Social Functions and the Origins of Linguistic Form. Cambridge: Cambridge
University Press, 372–88.
Newmeyer, Frederick J. (2005). Possible and Probable Languages: A Generative Perspective on
Linguistic Typology. Oxford: Oxford University Press.
Nyrop, Kristoffer (1908). Grammaire historique de la langue française III. Copenhague:
Gyldendal.
Ochs, Elinor (1982). ‘Ergativity and word order in Samoan child language,’ Language 58:
646–71.
Osthoff, Hermann (1878). Das Verbum in der Nominalkomposition im Deutschen, Griechischen,
Slavischen und Romanischen. Jena: H. Costenoble.
Otsuka, Yuko (2011). ‘PRO versus null SE: Case, tense, and empty categories in Tongan,’ Syntax
14.3: 265–96.
Ouhalla, Jamal (1991). Functional Categories and Parametric Variation. London: Routledge and
Kegan Paul.
Pallier, Christophe, Anne-Dominique Devauchelle, and Stanislas Dehaene (2011). ‘Cortical
representation of the constituent structure of sentences,’ Proceedings of the National Academy
of Sciences 108(6): 2522–7.
Parker, Anna R. (2006). Evolution as a Constraint on Theories of Syntax: The Case against
Minimalism. Ph.D. Dissertation, University of Edinburgh.
Parks, Ward (1990). Verbal Dueling in Heroic Narrative: The Homeric and Old English
Traditions. Princeton, NJ: Princeton University Press.
Payne, John (1985). ‘Complex phrases and complex sentences,’ in Timothy Shopen (ed.),
Language Typology and Syntactic Description, vol. 2: Complex Constructions. Cambridge:
Cambridge University Press, 3–41.
Pérez-Leroux, Ana T., Anny P. Castilla-Earls, Susana Bejar, and Diane Massam (2012). ‘Elmo’s
sister’s ball: The development of nominal recursion in children,’ Language Acquisition 19.4:
301–11.
Perlmutter, David (1978). ‘Impersonal passive and the Unaccusative Hypothesis,’ Berkeley
Linguistics Society 4: 159–89. University of California, Berkeley.
Pesetsky, David (1982). Paths and Categories. Ph.D. Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
Pesetsky, David, and Ned Block (1990). ‘Complexity and adaptation,’ Behavioral and Brain
Sciences 13: 750–2.
Peters, Ann M. (1999). ‘The emergence of so-called “functional categories” in English: A case
study of auxiliaries, modals, and quasi-modals,’ in Shin Ja Hwang and Arle R. Lommel
(eds.), LACUS (The Linguistic Association of Canada and the United States) Forum XXV,
Fullerton, CA, 179–88.
Peters, Ann M., and Lise Menn (1993). ‘False starts and filler syllables: Ways to learn grammatical
morphemes,’ Language 69: 742–77.
Piantadosi, Steven, Laura Stearns, Daniel L. Everett, and Edward Gibson (2012). ‘A corpus
analysis of Pirahã grammar: An investigation of recursion.’ Paper presented at the LSA
(Linguistic Society of America) Meeting. Portland, Oregon.
Piattelli-Palmarini, Massimo (2010). ‘What is language, that it may have evolved, and what is
evolution, that it may apply to language?’ in Richard K. Larson, Viviane Deprez, and Hiroko
Yamakido (eds.), The Evolution of Human Language: Biolinguistic Perspectives. Cambridge:
Cambridge University Press, 148–62.
Piattelli-Palmarini, Massimo, and Juan Uriagereka (2004). ‘Immune syntax: The evolution of
the language virus,’ in Lyle Jenkins (ed.), Variation and Universals in Biolinguistics. Oxford:
Elsevier, 341–77.
Piattelli-Palmarini, Massimo, and Juan Uriagereka (2011). ‘A geneticist’s dream, a linguist’s
nightmare: The case of FOXP2 gene,’ in Anna Maria Di Sciullo and Cedric Boeckx (eds.),
The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human
Language Faculty. Oxford: Oxford University Press, 100–25.
Picallo, M. Carme (1991). ‘Nominals and nominalizations in Catalan,’ Probus 3: 279–316.
Pinker, Steven (1995). The Language Instinct. Harmondsworth: Penguin Books.
Pinker, Steven (1996). Language Learnability and Language Development. Cambridge, MA:
Harvard University Press. First edition published in 1984.
Pinker, Steven, and Paul Bloom (1990). ‘Natural language and natural selection,’ Behavioral
and Brain Sciences 13: 707–84.
Pinker, Steven, and Michael T. Ullman (2002). ‘The past-tense debate: The past and future of
the past tense,’ Trends in Cognitive Sciences 6: 456–63.
Platzack, Christer (1990). ‘A grammar without functional categories: A syntactic study of early
child language,’ Nordic Journal of Linguistics 13: 107–26.
Poeppel, David (2008). ‘Linguistics and the future of the neurosciences.’ Paper presented at the
DGfS Workshop on Foundations of Language Comparison: Human Universals as Constraints
on Language Diversity, Bamberg, Germany.
Poeppel, David, and David Embick (2005). ‘Defining the relation between linguistics and
neuroscience,’ in Anne Cutler (ed.), Twenty-First Century Psycholinguistics: Four Cornerstones.
Mahwah, NJ: Lawrence Erlbaum, 103–18.
Postal, Paul M. (1997). ‘Islands.’ Manuscript, New York University.
Postal, Paul (1998). Three Investigations of Extraction. Cambridge, MA: MIT Press.
Potts, Christopher (2002). ‘The syntax and semantics of as-parentheticals,’ Natural Language
and Linguistic Theory 20: 623–89.
Potts, Christopher, and Tom Roeper (2006). ‘The narrowing acquisition path: From expressive
small clauses to declaratives,’ in Progovac et al. (eds.), 183–201.
Prati, Angelico (1931). ‘Composti imperativi quali casati e soprannomi,’ Revue de linguistique
romane 7: 250–64.
Prati, Angelico (1958). ‘Nomi composti con verbi,’ Revue de linguistique romane 12: 98–119.
Progovac, Ljiljana (2003). ‘Structure for coordination,’ in Lisa Cheng and Rint Sybesma (eds.),
The Second GLOT International State-of-the-Article Book: The Latest in Linguistics. The
Hague: de Gruyter, 241–88.
Progovac, Ljiljana (2005a). A Syntax of Serbian: Clausal Architecture. Bloomington, IN: Slavica
Publishers.
Progovac, Ljiljana (2005b). ‘Synthetic agent compounds in Serbian: An incorporation analysis,’ in
Mila Tasseva-Kurktchieva, Steven Franks, and Frank Gladney (eds.), Formal Approaches to Slavic
Linguistics 13: The Columbia Meeting 2004. Ann Arbor: Michigan Slavic Publications, 253–64.
Progovac, Ljiljana (2006). ‘The syntax of nonsententials: Small clauses and phrases at the root,’
in Progovac et al. (eds.), 33–71.
Progovac, Ljiljana (2008a). ‘What use is half a clause?’ in Andrew Smith, Kenneth Smith, and
Ramon Ferrer i Cancho (eds.), Evolution of Language: Proceedings of the 7th International
Conference, Barcelona, Spain, March 12–15. Hackensack, NJ: World Scientific, 259–66.
Progovac, Ljiljana (2008b). ‘Root small clauses with unaccusative verbs: A view from evolution,’
in Andrei Antonenko, John F. Bailyn, and Christina Y. Bethin (eds.), Proceedings of
FASL (Formal Approaches to Slavic Linguistics) 16. Ann Arbor: Michigan Slavic Publications,
359–73.
Progovac, Ljiljana (2009a). ‘Layering of grammar: Vestiges of proto-syntax in present-day
languages,’ in Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity
as an Evolving Variable. Oxford: Oxford University Press, 203–12.
Progovac, Ljiljana (2009b). ‘Sex and syntax: Subjacency revisited,’ Biolinguistics 3.2–3: 305–36.
Progovac, Ljiljana (2009c). ‘What is there when little words are not there: Possible implications
for evolutionary studies,’ in Ronald P. Leow, Héctor Campos, and Donna Lardiere (eds.),
Little Words: Their History, Phonology, Syntax, Semantics, Pragmatics, and Acquisition.
Washington, DC: Georgetown University Press, 99–108.
Progovac, Ljiljana (2010a). ‘When clauses refuse to be recursive: An evolutionary perspective,’
in Harry van der Hulst (ed.), Recursion and Human Language. Berlin: De Gruyter Mouton,
193–211.
Progovac, Ljiljana (2010b). ‘Syntax: Its evolution and its representation in the brain,’
Biolinguistics 4.2–3: 233–54.
Progovac, Ljiljana (2010c). ‘Imperative in compounds: Implications for historical and evolutionary
studies,’ in Petr Karlík (ed.), Development of Language through the Lens of Formal
Linguistics. Munich: Lincom Europa, 137–45.
Progovac, Ljiljana (2012). ‘Compounds and commands in the evolution of human language,’
Theoria et Historia Scientiarum: An International Journal for Interdisciplinary Studies IX:
49–70.
Progovac, Ljiljana (2013a). ‘Nonsentential vs. ellipsis approaches: Review and extensions,’
Language and Linguistics Compass 7/11: 597–617.
Progovac, Ljiljana (2013b). ‘Rigid syntax, rigid sense: Absolutives/unaccusatives as evolutionary
precursors,’ in Steven Franks, Markus Dickinson, George Fowler, Melissa Witcombe, and
Ksenia Zanon (eds.), Proceedings of Formal Approaches to Slavic Linguistics (FASL), The
Third Indiana Meeting, Bloomington, IN. Ann Arbor: Michigan Slavic Publications, 246–59.
Progovac, Ljiljana (2014a). ‘Degrees of complexity in syntax: A view from evolution,’ in
Frederick J. Newmeyer and Laurel B. Preston (eds.), Measuring Grammatical Complexity. Oxford:
Oxford University Press, 83–102.
Progovac, Ljiljana (2014b). ‘The absolutive basis of middles and the status of vP and UTAH.’
Paper presented at FASL 23 (Formal Approaches to Slavic Linguistics), University of
California, Berkeley. To appear in the Proceedings.
Progovac, Ljiljana, and John L. Locke (2009). ‘The urge to merge: Ritual insult and the
evolution of syntax,’ Biolinguistics 3.2–3: 337–54.
Progovac, Ljiljana, Kate Paesani, Eugenia Casielles, and Ellen Barton (eds.) (2006). The Syntax
of Non-Sententials: Multidisciplinary Perspectives. Amsterdam: John Benjamins.
Pullum, Geoffrey (2012). ‘The rise and fall of a venomous dispute.’ http://chronicle.com/blogs/
linguafranca/2012/03/28/poisonous-dispute/.
Pulvermüller, Friedemann (2002). The Neuroscience of Language: On Brain Circuits of Words
and Serial Order. Cambridge: Cambridge University Press.
Pylkkänen, Liina (2002). Introducing Arguments. Ph.D. Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
Radford, Andrew (1988). ‘Small children’s small clauses,’ Transactions of the Philological
Society 86: 1–43.
Radford, Andrew (1990). Syntactic Theory and the Acquisition of English Syntax. Oxford:
Blackwell.
Ratliff, Martha (2010). Meaningful Tone: A Study of Tonal Morphology in Compounds, Form
Classes, and Expressive Phrases in White Hmong. DeKalb, Illinois: Northern Illinois
University Press. Originally published in 1992 by Northern Illinois University Center for
Southeast Asian Studies.
Ratliff, Martha (2013). ‘White Hmong reduplicative expressives,’ in Jeffrey P. Williams (ed.),
The Aesthetics of Grammar: Sound and Meaning in the Languages of Mainland Southeast
Asia. Cambridge: Cambridge University Press, 179–88.
Renfrew, Colin (1987). Archeology and Language: The Puzzle of Indo-European Origins.
Cambridge: Cambridge University Press.
Ridley, Mark (1993). Evolution. Oxford: Blackwell Scientific Publications.
Ringe, Don (2003). ‘Internal reconstruction,’ in Brian D. Joseph and Richard D. Janda (eds.),
The Handbook of Historical Linguistics. Oxford: Blackwell, 244–61.
Rizzi, Luigi (1982). ‘Violations of the wh-island constraint and the subjacency conditions,’ in
Luigi Rizzi (ed.), Issues in Italian Syntax. Dordrecht: Foris, 49–76.
Rizzi, Luigi (1994). ‘Some notes on linguistic theory and language development: The case of
root inﬁnitives,’ Language Acquisition 3: 371–93.
Roebroeks, Wil, and Alexander Verpoorte (2009). ‘A “language-free” explanation for
differences between the European Middle and Upper Paleolithic Record,’ in Rudolf
Botha and Chris Knight (eds.), The Cradle of Language. Oxford: Oxford University
Press, 151–66.
Roeper, Thomas (1999). ‘Leftward movement in morphology,’ MIT Working Papers in Linguistics
34: 35–66.
Roeper, Tom (2005). ‘Chomsky’s Remarks and the transformationalist hypothesis,’ in Pavol
Štekauer and Rochelle Lieber (eds.), Handbook of Word Formation. Studies in Natural
Language and Linguistic Theory 64. The Netherlands: Springer, 125–46.
Roeper, Thomas, and Dorothy Siegel (1978). ‘A lexical transformation for verb compounds,’
Linguistic Inquiry 9: 199–260.
Rohlfs, Gerhard (1954). Historische Grammatik der Italienischen Sprache III. Bern: A. Francke
Ag. Verlag.
Rohrer, Christian (1977). Die Wortzusammensetzung im modernen Französisch. Tübingen:
TBL Verlag Gunter Narr.
Rolfe, Leonard (1996). ‘Theoretical stages in the prehistory of grammar,’ in Andrew Lock and
Charles R. Peters (eds.), Handbook of Human Symbolic Evolution. Oxford: Clarendon Press,
776–92.
Ross, Daniel (2013). ‘Try and: The development of subordination out of coordination.’ Paper
presented at the Workshop on Interfaces at the Left Periphery, LSA Institute, University of
Michigan, Ann Arbor.
Ross, John R. (1967). Constraints on Variables in Syntax. Ph.D. Dissertation, Massachusetts
Institute of Technology, Cambridge, MA.
Rothstein, Susan (1995). ‘Small clauses and copular constructions,’ in Anna Cardinaletti and
Maria Teresa Guasti (eds.), Syntax and Semantics 28: Small Clauses. San Diego: Academic
Press, 27–48.
Sakel, Jeanette, and Eugenie Stapert (2010). ‘Pirahã – in need of recursive syntax?’ in Harry van
der Hulst (ed.), Recursion and Human Language. Berlin: De Gruyter Mouton, 3–16.
Savage-Rumbaugh, E. Sue, and Roger Lewin (1994). Kanzi: The Ape at the Brink of the Human
Mind. New York: John Wiley and Sons.
Schirrmeister, Bettina E., Alexandre Antonelli, and Homayoun C. Bagheri (2011). ‘The origin of
multicellularity in cyanobacteria,’ BMC Evolutionary Biology 11.45: 1–21.
Schlegel, (Karl Wilhelm) Friedrich von (1808). Ueber die Sprache und Weisheit der Indier.
Heidelberg: Mohr und Zimmer. (Reprinted in 1977, Amsterdam Classics in Linguistics 1;
Introduction by Sebastiano Timpanaro, translation by Peter Maher. Amsterdam: John
Benjamins.)
Schneider-Zioga, Patricia (2013). ‘The linker in Kinande re-examined.’ Manuscript, California
State University, Fullerton.
Schulze, C. (1868). ‘Imperativisch gebildete Substantiva,’ Studium der Neueren Sprachen und
Literaturen (ASNS) 43: 13–40.
Schütze, Carson T. (2001). ‘On the nature of default case,’ Syntax 4: 205–38.
Schwartz, Linda (1989a). ‘Thematic linking in Hausa asymmetric coordination,’ Studies in
African Linguistics 20: 29–62.
Schwartz, Linda (1989b). ‘Asymmetrical syntax and symmetrical morphology in African
languages,’ in Paul Newman and Robert Botne (eds.), Current Approaches to African
Linguistics 5. Dordrecht: Foris, 21–33.
Selkirk, Elisabeth O. (1978). ‘On prosodic structure and its relation to syntactic structure,’ in
Thorstein Fretheim (ed.), Nordic Prosody II. Trondheim: TAPIR, 111–20.
Selkirk, Elisabeth O. (1982). The Syntax of Words. Linguistic Inquiry Monographs 7. Cambridge,
MA: MIT Press.
Senghas, Ann, Marie Coppola, Elissa L. Newport, and Ted Supalla (1997). ‘Argument structure
in Nicaraguan Sign Language: The emergence of grammatical devices,’ in Elizabeth Hughes,
Mary Hughes, and Annabel Greenhill (eds.), Proceedings of the Boston University Conference
on Language Development 21: 550–61. Boston: Cascadilla Press.
Shetreet, Einat, Naama Friedmann, and Uri Hadar (2009). ‘An fMRI study of syntactic layers:
Sentential and lexical aspects of embedding,’ NeuroImage 48: 707–16.
Shibatani, Masayoshi (1998). ‘Voice parameters,’ in Leonid Kulikov and Heinz Vater (eds.),
Typology of Verbal Categories. Papers Presented to Vladimir Nedjalkov on the Occasion of his
70th Birthday. Tübingen: Niemeyer, 117–38.
Silverstein, Michael (1976). ‘Shifters, linguistic categories, and cultural description,’ in Keith
H. Basso and Henry A. Selby (eds.), Meaning in Anthropology. Albuquerque: University of
New Mexico Press, 11–55.
Smith, Allyn E. (2010). Correlational Comparison in English. Ph.D. Dissertation, The Ohio
State University.
Snyder, William (2014). ‘On the very idea of a living linguistic fossil.’ Paper presented at the
Workshop on the Evolution of Syntax, University of Connecticut, Storrs, March 2014.
Speijer, Jacob S. (1886). Sanskrit Syntax. Delhi: Jayyad Press.
Spencer, Andrew (1991). Morphological Theory. Oxford: Basil Blackwell.
Sproat, Richard (1985). On Deriving the Lexicon. Ph.D. Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
Sprouse, Jon, and Norbert Hornstein (2014). ‘Experimental syntax and island effects: Toward a
comprehensive theory of islands,’ in Jon Sprouse and Norbert Hornstein (eds.), Experimental
Syntax and Island Effects. Cambridge: Cambridge University Press, 1–17.
Stepanov, Arthur (2007). ‘The end of CED? Minimalism and extraction domains,’ Syntax 10:
80–126.
Stevanović, Mihailo (1956). ‘Imperativne složenice,’ Naš Jezik, Nova Serija, VIII.1–2: 6–18.
Stevanović, Mihailo (1966). Gramatika srpskohrvatskog jezika. Cetinje: Obod.
Stevanović, Mihailo (1974). Savremeni srpskohrvatski jezik II: Sintaksa. Beograd: Naučna
Knjiga.
Stone, Linda, and Paul F. Lurquin (2007). Genes, Culture, and Human Evolution: A Synthesis.
Blackwell Publishing.
Stowell, Tim (1981). Origins of Phrase Structure. Ph.D. Dissertation, Massachusetts Institute of
Technology, Cambridge, MA.
Stowell, Tim (1983). ‘Subjects across categories,’ The Linguistic Review 2/3: 285–312.
Strickberger, Monroe W. (2000). Evolution. Boston: Jones and Bartlett Publishers.
Stringer, Chris (2007). ‘The origin and dispersal of Homo sapiens: Our current state of
knowledge,’ in Paul Mellars, Katie Boyle, Ofer Bar-Yosef, and Chris Stringer (eds.), Rethinking
the Human Revolution: New Behavioral and Biological Perspectives on the Origin and
Dispersal of Modern Humans. University of Cambridge: McDonald Institute for Archeological
Research, 15–20.
Stringer, Christopher B., and Peter Andrews (1988). ‘Genetic and fossil evidence for the origin
of modern humans,’ Science 239: 1263–68.
Stromswold, Karin, David Caplan, Nathaniel Alpert, and Scott Rauch (1996). ‘Localization of
syntactic comprehension by positron emission tomography,’ Brain and Language 52: 452–73.
Studdert-Kennedy, Michael (1991). ‘Language development from an evolutionary perspective,’
in Norman A. Krasnegor, Duane M. Rumbaugh, Richard L. Schiefelbusch, and Michael
Studdert-Kennedy (eds.), Biological and Behavioral Determinants of Language Development.
Hillsdale, NJ: Erlbaum, 5–28.
Stump, Gregory T. (1985). The Semantic Variability of Absolute Constructions. Dordrecht: D. Reidel.
Symons, Donald (1979). The Evolution of Human Sexuality. Oxford: Oxford University Press.
Szabolcsi, Anna, and Marcel den Dikken (2003). ‘Islands,’ in Lisa Cheng and Rint Sybesma
(eds.), The Second GLOT International State-of-the-Article Book: The Latest in Linguistics.
The Hague: de Gruyter, 213–40.
Tallerman, Maggie (2007). ‘Did our ancestors speak a holistic protolanguage?’ Lingua 117(3):
579–604.
Tallerman, Maggie (2012). ‘What is syntax?’ in Maggie Tallerman and Kathleen R. Gibson
(eds.), The Oxford Handbook of Language Evolution. Oxford: Oxford University Press,
442–55.
Tallerman, Maggie (2013a). ‘Join the dots: A musical interlude in the evolution of language?’
Journal of Linguistics 49: 455–87.
Tallerman, Maggie (2013b). ‘Kin selection, pedagogy, and linguistic complexity: Whence
protolanguage?’ in Rudolf Botha and Martin Everaert (eds.), The Evolutionary Emergence
of Language. Oxford: Oxford University Press, 77–96.
Tallerman, Maggie (2014a). ‘Is the syntax rubicon more of a mirage? A defence of pre-syntactic
protolanguage,’ The Evolution of Language: Proceedings of the 10th International Conference
(EVOLANG 10) 2014. Vienna: World Scientific, 318–25.
Tallerman, Maggie (2014b). ‘The Evolutionary Origins of Syntax,’ in Andrew Carnie, Yosuke
Sato, and Daniel Siddiqi (eds.), The Routledge Handbook of Syntax. London, UK: Routledge,
446–62.
Tallerman, Maggie (2014c). ‘No syntax saltation in language evolution,’ Language Sciences 46:
207–19.
Tallerman, Maggie, and Kathleen R. Gibson (eds.) (2012). The Oxford Handbook of Language
Evolution. Oxford: Oxford University Press.
Tang, Sze-Wing (2005). ‘A theory of licensing in English syntax and its applications,’ Korean
Journal of English Language and Linguistics 5: 1–25.
Taraldsen, Knut Tarald (1986). ‘On verb second and the functional content of syntactic
categories,’ in Hubert Haider and Martin Prinzhorn (eds.), Verb Second Phenomena in the
Germanic Languages. Dordrecht: Foris, 7–26.
Tattersall, Ian (2010). ‘A putative role for language in the origin of human consciousness,’ in
Richard K. Larson, Viviane Deprez, and Hiroko Yamakido (eds.), The Evolution of Human
Language: Biolinguistic Perspectives. Cambridge: Cambridge University Press, 193–8.
Tchekhoff, Claude (1973). ‘Some verbal patterns in Tongan,’ The Journal of the Polynesian
Society 82.3: 281–92.
Tchekhoff, Claude (1979). ‘From ergative to accusative in Tongan: An example of synchronic
dynamics,’ in Frans Plank (ed.), Ergativity: Towards a Theory of Grammatical Relations.
London: Academic Press, 407–18.
Thurneysen, Rudolf (1883). ‘Der indogermanische Imperativ,’ KZ 27: 172–80.
Tobler, Adolf (1886). Vermischte Beiträge zur Französischen Grammatik. Vol. 1. Leipzig: Verlag
von S. Hirzel.
Tollemache, Federico (1945). Le parole composte nella lingua italiana. Rome.
Tomalin, Marcus (2011). ‘Syntactic structures and recursive devices: A legacy of imprecision,’
Journal of Logic, Language, and Information 20: 297–315.
Tomasello, Michael (2008). Origins of Human Communication. A Bradford Book. Cambridge,
MA: MIT Press.
Trask, Robert L. (1979). ‘On the origins of ergativity,’ in Frans Plank (ed.), Ergativity: Towards
a Theory of Grammatical Relations. London: Academic Press, 385–404.
Traugott, Elisabeth C., and Bernd Heine (1991). Approaches to Grammaticalization, Vol. II:
Typological Studies in Language 19. Amsterdam: John Benjamins.
Tsai, W.-T. Dylan (1994). ‘On nominal islands and LF extraction in Chinese,’ Natural
Language and Linguistic Theory 12: 121–75.
Tyler, Lorraine Komisarjevsky, and Paul Warren (1987). ‘Local and global structure in spoken
language comprehension,’ Journal of Memory and Language 26: 638–57.
Ułaszyn, Henryk (1923). Gramatyka języka polskiego. Polskiej akademji umiejętności w Krakowie.
Słowotwórstwo.
Ullman, Michael T. (2006). ‘Is Broca’s area part of a basal ganglia thalamocortical circuit?’
Cortex 42: 480–5.
Ullman, Michael T. (2008). ‘Variability and redundancy in the neurocognition of language.’
Paper presented at the DGfS Workshop on Foundations of Language Comparison: Human
Universals as Constraints on Language Diversity, Bamberg, Germany.
Uriagereka, Juan (2008). Syntactic Anchors: On Semantic Structuring. Cambridge: Cambridge
University Press.
van Hout, Angelik, and Tom Roeper (1998). ‘Events and aspectual structure in derivational
morphology,’ in Heidi Harley (ed.), Roundtable on Argument Structure and Aspect.
MITWPL 32: 175–99.
van Lancker, Diana, and Jeffrey L. Cummings (1999). ‘Expletives: Neurolinguistic and
neurobehavioral perspectives on swearing,’ Brain Research Reviews 31: 83–104.
Van Leynseele, Helen (1975). ‘Restrictions on serial verb constructions in Anyi,’ Journal of West
African Languages X: 189–217.
Veneziano, Edy, and Hermine Sinclair (2000). ‘The changing status of “filler syllables” on the
way to grammatical morphemes,’ Journal of Child Language 27: 461–500.
Verhaar, John W. M. (1995). Towards a Reference Grammar of Tok Pisin: An Experiment in
Corpus Linguistics. Honolulu: University of Hawai’i Press.
Vernes, S. C., E. Spiteri, J. Nicod, M. Groszer, J. M. Taylor, K. E. Davies, D. H. Geschwind, and
S. E. Fisher (2007). ‘High-throughput analysis of promoter occupancy reveals direct neural
targets of FOXP2, a gene mutated in speech and language disorders,’ The American Journal
of Human Genetics 81: 1232–50.
Voight, Benjamin F., Sridhar Kudaravalli, Xiaoquan Wen, and Jonathan K. Pritchard (2006).
‘A map of recent positive selection in the human genome,’ PLOS Biology 4(3): e72.
doi: 10.1371/journal.pbio.0040072.
Vossen, Rainer (2010). ‘The verbal “linker” in Central Khoisan (Khoe) in the context of
deverbal derivation,’ Journal of Asian and African Studies 80: 47–60.
Vygotsky, Lev S. (1979). ‘The genesis of higher mental functions,’ in James V. Wertsch (ed.),
The Concept of Activity in Soviet Psychology. New York, M.E. Sharpe, 144–88.
Warren, Beatrice (1978). Semantic Patterns of Noun-Noun Compounds. Gothenburg Studies in
English 41. Göteborg, Sweden: Acta Universitatis Gothoburgensis.
Warren, Beatrice (1984). Classifying Adjectives. Göteborg, Sweden: Acta Universitatis
Gothoburgensis.
Watkins, Calvert (1963). ‘Preliminaries to a historical and comparative analysis of the syntax of
the Old Irish verb,’ Celtica 6: 1–49.
Watkins, Calvert (1976). ‘Towards Proto-Indo-European syntax: Problems and pseudo-problems,’
Chicago Linguistic Society Parasession on Diachronic Syntax, 305–26.
Weekley, Ernest (1916). Surnames. New York: E.P. Dutton and Co.
Williams, Edwin (1981). ‘On the notions “lexically related” and “head of a word,” ’ Linguistic
Inquiry 12.2: 245–74.
Winford, Donald (2006). ‘Reduced syntax in prototypical pidgins,’ in Progovac et al. (eds.),
283–307.
Wong, Patrick C. M., Tyler K. Perrachione, Geshri Gunasekera, and Bharath Chandrasekaran
(2009). ‘Communication disorders in speakers of tone languages: Etiological bases and
clinical considerations,’ Seminars in Speech and Language 30.3: 162–73.
Woolford, Ellen (1997). ‘Four-way case systems: Ergative, nominative, objective, and
accusative,’ Natural Language and Linguistic Theory 15: 181–227.
Woolford, Ellen (2006). ‘Lexical case, inherent case, and argument structure,’ Linguistic
Inquiry 37: 111–30.
Wray, Alison (2002). Formulaic Language and the Lexicon. Cambridge: Cambridge University
Press.
Yang, Charles (2013). ‘Ontogeny and phylogeny of language,’ Proceedings of the National
Academy of Sciences of the USA. Published online before print: 10.1073/pnas.1216803110
PNAS April 1, 2013.
Yip, Moira (2002). Tone. Cambridge: Cambridge University Press.
Yonge, Charlotte (1863). History of Christian Names. London: Parker, Son, and Bourn, West
Strand.
Zec, Draga, and Sharon Inkelas (1990). ‘Prosodically constrained syntax,’ in Sharon Inkelas and
Draga Zec (eds.), Phonology–Syntax Connection. Chicago: Chicago University Press, 365–78.
Zheng, Mingyu, and Susan Goldin-Meadow (2002). ‘Thought before language: How deaf and
hearing children express motion events across cultures,’ Cognition 85: 145–75.
Živanović, Jovan (1904). ‘Složene reči u srpskom jeziku,’ Glas Srpske Akademije Nauka / Glas
Srpske Kraljevske Akademije LXVIII: 175–207. Drugi Razred 42. Beograd: Državna
Štamparija.
Zurif, Edgar, David Swinney, Penny Prather, Julie Solomon, and Camille Bushell (1993). ‘An
online analysis of syntactic processing in Broca’s and Wernicke’s aphasia,’ Brain and
Language 45: 448–64.

Index of languages and language groups
Akkadian 106, 123
Al-Sayyid Bedouin Sign Language 59, 81, 178
Alagwa 106 n.
Ancient languages 18, 102, 123 see also Akkadian; Latin; Sanskrit
Anyi-Sanvi 76
Arabic 103 n. 13
Australian 71, 184 n.

Bantu 40 n. 14, 108, 190 n. 12
Basque 11 n.
Berber 166 see also Tashelhit
Bulgarian 162

Chadic 106 n.
Chinese 61 n., 82, 107 n. 19, 133, 166, 191 see also Mandarin
Creole 75, 121
Cushitic 106

Duala 40 n. 14
Dutch 52, 108, 129, 148 n. 5, 214 n. 5
Dyirbal 71, 72, 78, 116, 184 n., 189

English 22 n., 23 n., 24, 25, 27, 28, 29, 34, 36 n. 6, 37, 40 n. 16, 42, 44 n. 20, 46 n. 23, 49, 53, 57 n. 1, 60, 64, 67 n. 15, 68, 69, 71, 73, 75, 79 n. 28, 83, 94 n. 8, 95 n. 9, 103, 106, 107 n. 18, 108, 110, 113, 115, 120, 121, 124, 131, 133, 144, 146, 148, 149, 150, 152, 154, 158, 159, 164 n., 166, 171, 181, 187, 189, 216, 217

French 25, 73, 151, 163, 165, 166, 168

German 52, 106, 108, 114, 115, 116, 127, 189, 214 n. 5
Germanic 108, 164, 165
Greek 40 n. 15, 108, 161 n. 21

Hausa 106 n.
Hawaiian 58
Hebrew 182, 212
Herero 40 n. 14
Hittite 123, 158 n. 17
Hmong 6 n. 14, 8, 95 n. 9, 96, 97 n., 103 n. 14, 192
Hua 184

Igbo 76
Indo-European (IE) 40, 73, 145, 166, 169 n. 30
  pre-Indo-European 39, 182 n.
  Proto-Indo-European (PIE) 39, 40, 114, 122, 123, 161
Indonesian, Riau 12, 50, 61 n., 70 n., 91 n. 4, 106 n., 176, 182
Irish 161 n. 21
Italian 122 n., 134 n. 6, 151, 163, 164, 165

Japanese 95 n. 9, 103, 133

Kaqchikel 91
Khoe 107
Khoisan 107, 121 see also Khoe
Kinande 108
Korean 83, 95 n. 9
Kwa 76

Latin 73, 96, 103, 123

Macedonian 25, 156 n. 14, 162, 163
Mandarin 191
Mayan 83, 91
Mingrelian 128
Mohawk 124

Native American 116, 189
Nicaraguan Sign Language (NSL) 24, 59, 60, 76, 101, 178
Niger-Congo 76

Papuan 182
Pidgin 3 n. 8, 12, 21, 23, 58, 63 n., 95 n. 10, 203
Pirahã 116, 117 n. 26, 189
Polish 162, 163, 164, 166

Romance 108, 145, 151, 163, 164, 165
Rumanian 163, 164 n.
Russian 162, 163, 182

Sacapultec 83
Sanskrit 40 n. 15, 103, 123, 160, 165
Semitic 106
Serbian 7, 8, 22 n., 23 n., 25, 40–44, 46, 48, 65, 66, 68, 69 n., 74, 75, 77, 79, 80, 84, 95, 96, 99, 108, 114–16, 145–58, 161, 166, 168, 172, 173, 176, 179, 181, 185, 187, 213
Sinhalese 184
Skou 182
Slavic 79, 108, 145, 161, 162, 163, 165
Spanish 79, 163, 164, 185
Swahili 40 n. 14, 190 n. 12

Tashelhit 166 see also Berber
Tok Pisin 121
Tongan 61, 70, 71, 72, 78, 79 n. 27, 147, 182, 184
Tswana 40 n. 14
Turkish 103
Twi 48, 96, 97 n.; 166, 178

Vedic 161 n. 21 see also Sanskrit
Vietnamese 103

!Xun 121

Index of names
Authors of single-, double-, and triple-authored works are indexed here. For multiple-authored
works, only the first author is indexed.

Abney, Steven P. 29, 46, 64
Ackema, Peter 148 n. 5
Adams, Edward L. 165
Adams, Valerie 158 n. 18
Adger, David 26, 28, 46, 58 n., 87, 91, 119, 139 n. 12, 216
Aikhenvald, Alexandra Y. 75
Aissen, Judith 184
Akmajian, Adrian 34
Aldridge, Edith 72
Alexiadou, Artemis 72, 73, 74, 79 n. 28
Altmann, Stuart 17 n., 180
An, Duk-Ho 90, 104
Anderson, Stephen R. 60 n. 7
Andreĭčin, L. 162
Andrews, Peter 19
Antonelli, Alexandre 12
Arce-Arenales, Manuel 77 n., 79
Aronoff, Mark 59, 81, 178
Asher, Nicholas 110 n. 20
Aske, Jon 169 n. 30
Atkinson, Quentin D. 39 n. 13
Authier, Gilles 30 n. 38, 61, 71 n. 19, 73
Axelrod, Melissa 77 n., 79

Bagheri, Homayoun C. 12
Baker, Mark 153
Bannister, Roger 125 n. 33
Bar-On, Dorit xii
Bar-Shalom, Eva 160, 170
Barron, Brigid J. S. 169
Barton, Ellen xii, 35 n. 4
Bates, Elizabeth 170
Bauer, Heinrich 124 n.
Bauman, James 73
Belić, Aleksandar 156, 161
Belletti, Adriana 135, 138, 216
Ben-Shachar, Michal 212
Berwick, Robert C. 2, 3, 19 n. 27, 25, 51, 135, 136, 185, 186, 201, 205
Bickerton, Derek 3 n. 8, 4, 7, 19, 23–25, 33, 55 n. 29, 63 n., 82 n., 98, 119, 136, 144, 175 n., 205
Blake, Berry 30 n. 38, 61
Block, Ned 73 n. 22, 197 n. 24
Bloom, Lois 6, 49, 51, 57, 82, 86
Bloom, Paul 2, 6 n. 13, 15, 16, 18, 51, 73 n. 22, 174, 175, 197, 198
Bobaljik, Johnatan 74 n.
Boeckx, Cedric 122, 134 n. 4, 135, 136, 138, 139 nn. 10, 11
Bogin, Barry 50, 168
Bok-Bennema, Reineke 72
Bolufer, José Alemany 165
Bookheimer, Susan 212, 215
Borer, Hagit 65 n., 153 n. 12
Borgonovo, Claudia 132 n. 2
Bošković, Željko xii, 114
Botha, Rudolf 22
Bottari, Piero 73
Bouchard, Denis 102, 188
Bowers, John 106
Bradshaw, John L. 127, 170
Brain, Walter Russell 125 n. 33
Brennan, Jonathan 20, 212, 213, 214
Briscoe, Ted 198
Broca, Paul 55
Brooks, Alison 20, 205
Bruening, Benjamin 105
Burling, Robbins 7, 34, 50, 95, 125, 196
Burzio, Luigi 27, 42, 66

Campbell, Lyle 91, 124 n., 128
Cann, Rebecca L. 21 n. 31, 53, 54, 190 n. 13, 200, 205
Caplan, David 20, 213
Caramazza, Alfonso 212 n. 2
Cardinaletti, Anna 35
Carroll, Sean B. 39, 43 n. 19, 55 n. 30, 181
Carstairs-McCarthy, Andrew 148 n. 5
Casielles, Eugenia xii, 23 n., 67, 83, 164
Chater, Nick 14 n. 22, 54 n., 192 n.
Cheng, Lisa 107 n. 19
Chierchia, Gennaro 182
Chomsky, Noam 1, 2, 3, 9, 19, 25, 26, 27, 34, 36, 46 n. 24, 47, 51, 58, 63, 90, 91, 92, 116, 131, 134, 135, 136, 143, 185, 186, 197 n. 23, 205
Christiansen, Morten H. 14 n. 22, 54 n., 192 n.
Churchward, Maxwell C. 182
Cinque, Guglielmo 90, 136
Citko, Barbara 90, 92, 100, 101, 104
Clancy, Patricia M. 83
Clark, Brady xii, 2 n. 3, 92
Clark, Eve 169
Code, Chris 12, 127, 168, 170, 196, 197 n. 22, 215, 217
Cole, Desmond T. 40
Comrie, Bernard 10, 67 n. 15, 71, 73 n. 21, 184, 185
Constable, R.T. 212
Coppola, Marie 59, 81, 178
Crago, Martha B. 53, 54
Crysmann, Berthold 104
Culicover, Peter W. 2, 7, 20 n. 30
Cummings, Jeffrey L. 127, 170, 197 n. 22

Darmesteter, Arsène 158, 160, 165, 168
Darwin, Charles xvi, 1, 2, 4 n. 11, 15, 33, 42, 48, 168, 176, 195, 197, 198, 200
Dawkins, Richard 4 n. 11, 12, 198
de Diego, Vicente Garcia 165
Deacon, Terrence W. 16, 18, 53, 55, 56, 88, 125, 191, 198, 199, 200, 208, 211
Dediu, Dan 21 n. 31, 54, 109 n. 13, 201, 203
Dehaene, Stanislas 20, 213, 214
Delbrück, Berthold 124 n.
den Dikken, Marcel 100, 107, 138
Dennett, Daniel 198
Depew, David J. 198
Deutscher, Guy 7, 34, 97, 101, 106, 110, 122, 124, 128
Devauchelle, Anne-Dominique 20, 213, 214
Diez, Friedrich 165
Diller, Karl C. 21 n. 31, 53, 54, 190 n. 13, 200, 204
Dixon, Robert M. W. 40 n. 13, 71, 116, 160, 184 n., 189
Dobzhansky, Theodosius 2 n. 4
Dodd, Barbara 128
Dong, Quang P. 161 see also James D. McCawley
Downing, Pamela A. 148
Dowty, David 78, 148 n. 6
Du Bois, John W. 83, 181
Dubinsky, Stanley 35
Dukes, Michael 71
Dunbar, Robin I. M. 18
Duncan, Neil 18 n. 25
Dwyer, David 7, 33

Elderkin, Edward D. 108
Eldredge, Niles 199 n. 26
Embick, David 211, 212, 215
Emonds, Joseph E. 110 n. 20
Enard, Wolfgang 19, 21, 54, 55, 204, 208
Epstein, D. Samuel 3 n. 6
Everett, Dan xii, 116, 117 n. 26, 189

Fabb, Nigel 153 n. 12
Fedorenko, Evelina 212 n. 1
Ferrari, Franca 151, 153 n. 11, 12, 164 n., 165
Finlayson, Clive 200 n. 28, 201, 204 n. 32, 33
Fisher, Ronald A. 199
Fitch, W. Tecumseh xii, 4 n. 11, 15 n. 22, 21 n. 31, 46 n. 24, 47, 48, 54 n., 90, 116, 141, 190, 192, 195 n. 19
Floricic, Franck xii, 165
Fortin, Catherine 35 n. 4
Fox, Barbara A. 77 n., 79
Francis, Elaine J. 102
Franks, Bradley 169
Franks, Steven xii, 79
Frey, Werner 110 n. 20, 118
Friederici, Angela D. 212, 215
Friedmann, Na’ama 52, 126, 127, 214 n. 4
Fu, Jingqi 153 n. 12
Fukui, Naomi 133

Gair, James 184
Gardner, R. Allen 125
Gardner, Beatrix T. 125
Gibson, Kathleen, R. 2 n. 5, 215
Gil, David xii, 7, 12, 50, 51, 67 n. 14, 70, 91 n. 4, 102, 106 n., 144, 148, 176, 182, 185
Givón, Talmy 2, 3 n. 8, 7, 11, 22, 24, 33, 60, 75, 76, 91 n. 4, 169 n. 30
Godfrey-Smith, Peter 198
Goldin-Meadow, Susan 59, 82, 83
Gonda, Jan 40
Goodall, Grant 104
Gopnik, Myrna 53, 54
Gould, Stephen Jay 143, 197 n. 23, 199 n. 26
Gray, Russell D. 39 n. 13
Greenfield, Patricia M. 83, 125, 160
Grimshaw, Jane 46
Grodzinsky, Yosef 52, 126, 212, 214 n. 4
Grohmann, Kleanthes K. 135, 136, 138
Groom, Bernard 158 n. 18
Gruber, Howard Ernest 10, 55
Guasti, Maria Teresa 35, 49, 51
Guilfoyle, Eithne 49

Hadar, Uri 127
Haegeman, Liliane 110 n. 20, 118
Hagstrom, Paul 133
Haldane, John Burdon Sanderson 19, 199
Hale, Kenneth 71, 72
Hale, Mark 123
Hall, Robert A. Jr. 164, 165
Halle, Morris 153 n. 12
Harris, Alice C. 91, 124 n., 128
Haspelmath, Martin 92 n.
Haude, Katharina 30 n. 38, 61, 71 n. 19, 73
Hauser, Marc 46 n. 24, 47, 116, 201
Hecht, Barbara Frant 169
Heine, Bernd xii, 2, 5 n., 9 n. 17, 10, 11, 12, 14 n. 21, 25, 67 n. 13, 70, 83, 107, 110, 111, 114, 116, 121, 124, 126, 128, 141, 146 n. 2, 160 n., 193 n.
Heinimann, Siegfried 165
Henderson, Robert xii, 72, 91 n. 3
Hill, Robert M. II 91, 192
Hinton, Geoffrey E. 16, 18, 198
Hock, Hans Henrich 11 n., 123
Hockett, Charles F. 17 n., 175, 180
Hoenigswald, Henry M. 11 n.
Hollebrandse, Bart 50, 128
Holmberg, Anders 123
Hornstein, Norbert 14, 92, 93, 134 n. 6, 135
Hua, Zhu 128, 184
Huang, James 133, 134, 135
Hurford, Jim R. xii, 2, 7, 22, 34, 36, 38 n. 9, 54 n., 94 n. 7, 119 n., 125 n. 34, 160 n., 190, 199

Indefrey, Peter 20, 127, 213
Inkelas, Sharon 90, 104
Isaacson, Robert L. 55, 217

Jackendoff, Ray xii, 1, 2, 3, 4, 7, 21–6, 33, 34, 90, 100, 101, 120, 144, 175 n.
Jacob, François 2
Jarkey, Nerida 103 n. 14
Jay, Timothy 197 n. 22
Jespersen, Otto 151 n. 8, 154, 158, 190 n. 12
Johannessen, Janne Bondi 105
Johns, Brenda 96 n. 12
Johnson, David E. 3
Jordens, Peter 49, 129
Josefsson, Gunlög 153 n. 12
Julien, Marit 153 n. 12
Just, Marcel A. 20, 213

Kanwisher, Nancy 212 n. 1
Kayne, Richard xii, 58, 105, 123
Kegl, Judy 59, 81, 178
Kemmer, Suzanne 76, 77 n.
Kempson, Ruth M. 148
Kerns, J. Alexander 161
Kinsella, Anna R. 38, 90, 111, 112, 114, 129, 188 see also Anna R. Parker
Kiparsky, Paul 39 n. 13, 40, 123, 161, 182 n.
Kitagawa, Yoshihisa 27
Kitahara, Hisatsugu 3 n. 6
Klein, Wolfgang 50, 128
Klemensiewicz, Zenon 162
Kolk, Herman H.J. 52, 214 n. 5
Kondrashova, Natasha xii, 163
Koneski, Blaže 182
Koopman, Hilda 27, 39
Kotchoubey, Boris 125 n. 34
Kratzer, Angelika 27
Krause, J. 19, 204
Kuryłowicz, Jerzy 11 n., 40, 160
Kuteva, Tania 2, 5 n., 9 n. 17, 10, 11, 12, 14 n. 21, 25, 67 n. 13, 70, 83, 111, 114, 116, 121, 124, 126, 128, 141

Lacarme, Jacqueline 153 n. 12
Ladd, D. Robert 21 n. 31, 54, 190 n. 13
Laka, Itziar 74 n.
Langendoen, D. Terence 151 n. 8
Lappin, Shalom 3 n. 6
Lasnik, Howard 134, 135
Lebeaux, David 49, 50, 128
LeDoux, Joseph E. 217
Lees, Robert. B. 153, 158 n. 18
Legate, Julie A. 74 n.
Lehman, Christian 72, 73
Lehmann, Winfred P. 169 n. 30
Levin, Beth 66
Levin, Juliette 74 n.
Levinson, C. Stephen 54 n., 201, 203
Lewin, Roger 125, 193
Lieber, Rochelle 151, 153, 169
Lieberman, Philip 50, 55, 127, 215
Liégeois, Frédérique 54
Lightfoot, David 31, 67 n. 7, 135 n. 30, 174
Linnankoski, Ilkka 125 n. 34
Liu, Haiyong xii, 61, 166, 191
Ljung, Magnus 151 n. 8
Lloyd, Paul M. 160, 163, 164, 165, 167
Locke, John L. xii, 16, 17, 31, 50, 68, 98, 145, 148, 168, 179, 187, 195
Longa, Victor M. 198
Longobardi, Giuseppe 36 n. 6, 122 n.
Lord, Carol 76
Lurquin, Paul F. 19, 193, 199, 201, 203, 204

McBrearty, Sally 20, 200 n. 28, 205
McCawley, James D. 161 n. 23 see also Quang P. Dong
McConnell-Ginet, Sally 182
McDaniel, Dana 38 n. 9, 119
MacLean, Paul D. 10, 55, 217
Marantz, Alec 153 n. 12, 216
Marchand, Hans 151, 158 n. 18
Maretić, Tomislav 156
Marriot, Anna 18 n. 25
Marsh, Peter 168
Marty, Anton 67 n. 14
Massam, Diane 74 n.
Maxwell, Judith M. 91, 192
Meinhof, Carl 40 n. 14
Mellars, Paul A. 19, 20, 205
Menn, Lise 128, 129
Meunier, Louis-Francis 165
Meyer, Martin 212, 215
Meyer-Lübke, Wilhelm 165
Michaelis, Laura A. 102
Migliorini, Bruno 165
Mihajlović, Velimir 146 n. 3, 156, 158 n. 17, 160, 168, 172
Milićević, Jasmina xii
Miller, D. Gary 169 n. 30
Miller, Geoffrey A. 168, 169, 187, 195, 199
Millikan, Ruth Garrett 160
Mirowicz, Anatol 162
Mithen, Steven 202, 205
Mithun, Marianne 30 n. 38, 61, 116, 124, 189
Monaco, Anthony P. 21 n. 31, 54, 209
Moro, Andrea xii, 2, 47, 212, 215
Mortensen, David 97 n.
Mottin, Jean 96 n. 12
Mous, Maarten 106 n.
Mulford, Randa C. 169
Munn, Alan B. 105
Murray, Sarah 153, 164, 166
Mylander, Carolyn 83

Nan, Yun 191 n. 14
Napoli, Donna Jo 135 n. 7
Nash, Lea 72, 74
Neelman, Ad 132 n. 2
Nespor, Marina 90, 104
Nevins, Andrew xii, 117 n. 26
Newbury, Dianne F. 21 n. 31, 54, 209
Newman, Paul 11 n., 106 n.
Newman, Roxana M. 106 n.
Newmeyer, Frederick J. xii, 2, 83, 117, 135, 139, 142, 169
Noonan, Michael 49
Nowlan, Steven J. 16, 18, 198
Nyrop, Kristoffer 165

Ochs, Elinor 83
Ofen, Noa xii, 32, 211, 218
Osthoff, Hermann 165
Otsuka, Yuko 74 n., 79 n. 27
Ouhalla, Jamal 49, 50, 128

Paesani, Kate xii, 165
Pallier, Christophe 20, 213, 214
Palti, Dafna 212
Parker, Anna R. 38 n. 8 see also Anna R. Kinsella
Parks, Ward 168
Payne, John 103, 104
Perdue, Clive 50, 128
Pérez-Leroux, Ana T. 129
Perlmutter, David 42, 66
Pesetsky, David 73 n. 22, 117 n. 26, 182, 197 n. 24
Peters, Ann M. 128, 129
Piaget, Jean 10, 55
Piantadosi, Steven 117
Piattelli-Palmarini, Massimo 2, 3 n. 7, 19, 25, 53, 98, 125, 204
Picallo, M. Carme 73
Pinker, Steven 2, 15, 16, 18, 49, 51, 73 n. 22, 169, 174, 175, 197, 198, 205
Platzak, Christer 49, 50, 128
Poeppel, David 211, 212 n. 1
Postal, Paul M. 135, 136, 137
Potts, Christopher 49, 50, 106, 110 n. 20, 128
Prati, Angelico 165
Progovac, Ana xii, 150 n.
Progovac, Dušan xii
Progovac, Ljiljana 2, 7, 15 n. 23, 16, 23 n., 31, 33–6, 40, 42, 47, 51, 53, 66–8, 78, 80, 98, 104, 121, 122, 139, 144, 148, 153, 155, 161, 179, 212
Progovac, Stefan xii
Pullum, Geoffrey 117 n. 26
Pulvermüller, Friedemann 125 n. 33
Pylkkänen, Liina 153 n. 12

Radford, Andrew 49, 50, 126, 128, 133 n.
Rappaport-Hovav, Malka 66
Ratliff, Martha xii, 95 n. 9, 96 n. 12, 97 n., 103 n. 14, 192
Renfrew, Colin 39 n. 13
Ridley, Mark 3, 50
Rigby, Kate 169
Ringe, Don 11 n.
Rizzi, Luigi 49, 134 n. 6, 135, 138
Rodrigues, Cilene 117 n. 26
Roebroeks, Wil 205
Roeper, Thomas 49, 50, 106, 128, 153
Rohlfs, Gerhard 165
Rohrer, Christian 151
Rolfe, Leonard 50, 55, 160
Ross, Daniel xii, 107, 121
Ross, John R. 132, 134, 135
Rothstein, Susan 182
Rutkowski, Paweł xii, 162

Saito, Mamoru 134, 135
Sakel, Jeanette 117
Savage-Rumbaugh, E. Sue 83, 125, 160, 193
Schirrmeister, Bettina E. 12
Schlegel, (Karl Wilhelm) Friedrich von 124 n.
Schneider-Zioga, Patricia xii, 108
Schulze, C. 165
Schütze, T. Carson 36, 98
Schwartz, Benjamin 161
Schwartz, Linda 105 n. 16
Seely, Daniel xii, 3 n. 6
Selkirk, Elisabeth O. 90, 104, 149, 154
Senghas, Ann 24, 59, 76, 81, 178
Shaer, Benjamin 110 n. 20, 118
Shetreet, Einat 127
Shibatani, Masayoshi 30 n. 38, 61
Siegel, Dorothy 153
Silverstein, Michael 184 n.
Sinclair, Hermine 128
Smith, Allyn E. 100, 101
Snyder, William 25, 70, 160, 170
Speijer, Jacob S. 160
Spencer, Andrew 149, 154
Sportiche, Dominique 27, 39
Sproat, Richard 153 n. 12
Sprouse, Jon 134 n. 6, 135
Stapert, Eugenie 117
Stavrou, Melita 73
Stepanov, Arthur 135
Stevanović, Mihailo 156, 161
Stone, Linda 19, 193, 199, 201, 203, 204, 205
Stowell, Tim 27, 35 n. 5, 90, 104
Strecker, David 96 n. 12
Strickberger, Monroe W. 50, 55
Stringer, Christopher B. 19, 204 n. 33
Stromswold, Karin 212
Studdert-Kennedy, Michael 50
Stump, Gregory T. 99, 100, 101
Symons, Donald 193, 196
Szabolcsi, Anna 138

Tallerman, Maggie xii, 2, 7, 11, 18 n. 25, 34, 48, 53, 79, 84 n., 90, 119, 187, 202 n. 30, 205
Tang, Sze-Wing 35 n. 4
Taraldsen, Knut Tarald 123
Tattersall, Ian 201, 205
Tchekhoff, Claude 30 n. 38, 61, 70, 71, 72, 147, 184
Thurneysen, Rudolf 40
Tobler, Adolf 165
Tollemache, Federico 165, 166
Tomalin, Marcus 14 n. 21, 47
Tomasello, Michael 160
Trask, Robert L. 72, 74
Traugott, Elisabeth C. 110, 124, 128
Tsai, W.-T. Dylan 133
Tyler, Komisarjevsky Lorraine 88, 125 n. 35

Ułaszyn, Henryk 162
Ullman, Michael T. 127, 169, 197, 215
Uriagereka, Juan xii, 19, 36, 53, 125, 204

van Cantfort, Thomas E. 125
van Hout, Angelik 153 n. 12
van Lancker, Diana 127, 170, 197 n. 22
van Leynseele, Helen 76
Veneziano, Edy 128
Verhaar, John. W.M. 121
Vernes, S.C. 21, 54, 209
Verpoorte, Alexander 205
Vogel, Irene 90, 104
Voight, Benjamin F. 54 n., 192 n.
Vonèche, J. Jacques 10, 55
Vossen, Rainer 107
Vulanović, Relja xii
Vygotsky, Lev S. 10, 55

Warren, Beatrice 151 n. 8, 154
Warren, Paul 88, 125 n. 35
Watkins, Calvert 123, 169 n. 30
Weekley, Ernest 158, 160, 171
Williams, Edwin 150
Winford, Donald 58, 95
Wittenberg, Eva 7, 175 n.
Wong, Patrick C.M. 191 n.
Woolford, Ellen 74 n.
Wray, Alison 215

Yang, Charles 51, 126
Yip, Moira 190 n. 12
Yonge, Charlotte 158

Zec, Draga xii, 90, 104
Zheng, Mingyu 82, 83
Živanović, Jovan 156
Zurif, Edgar 212 n. 2

Index of subjects
absolutes 99, 100
absolutive 4, 23, 24, 30, 61–5, 70–4, 75, 79, 80, 81, 83, 145–9, 155, 183, 184, 201, 203, 204
accusative 24, 27, 30, 36, 37, 63, 64, 66, 68, 72, 79, 151, 184, 185 see also unaccusative
acquisition 6, 10, 12, 21, 49–51, 59 n. 6, 82–3, 128–9, 169–70 see also Basic Variety; Continuity Hypothesis; two-word/two-slot
adaptation 4 n. 11, 14–15, 18, 54 n., 55 n. 30, 190, 198, 205, 206 see also competition; fitness; selection
Adjoin 36, 58, 91, 94, 116, 119 see also adjunction; Conjoin
adjunction 13, 14, 36, 37, 72, 73, 87, 90–2, 99, 104–6, 111, 114, 115, 116, 120, 123, 129, 132, 134–8, 140–1, 155 see also Adjoin
Agent First 22–4, 83, 101 see also Cause First; thematic (theta) roles
agrammatism 8, 12, 52, 126 see also aphasia
Agree(ment) 44, 57 n. 1, 67 n. 15, 101, 122, 151, 152, 155, 159 n.
ambiguity 58, 148, 184 see also vagueness
Animacy Hierarchy 72, 184
animal calls 160, 196
animal communication 17 n., 84 n., 124–6, 195, 196 see also animal calls; comparative method
aphasia 12, 52, 125 n. 33, 215 see also agrammatism
approximation 8, 33, 44, 66, 89, 93, 161, 164 n. 25 see also fossil
archeology 19, 205
argument-predicate structure 9, 14, 23, 26, 35, 36, 57, 67, 75, 81, 88, 106, 107, 129 see also predication
aspect 8, 40 n. 16, 44, 103, 107, 122, 151, 184 n. see also tense/aspect/mood (TAM)
assertion 39, 40, 95, 98
Australopithecines 19
Australopithecus 200, 202 see also Australopithecines
autonomy of syntax 24

Baldwin Effect 16, 18, 198–9
basal ganglia 55, 127, 170, 212, 215 see also subcortical structures of the brain
Basic Variety 50, 128 see also acquisition
binary (branching) 6, 48, 58, 59 n. 4, 82, 94, 178 see also two-word/two-slot
biology 2 n. 4, 33 n. 1, 50, 53
  evolutionary 5, 139 n. 10, 193, 208
bonobos 83, 84, 125, 193 see also primates
brain activation 20 see also neural/neuronal activation/connectivity
brain evolution 46, 53, 54, 55, 127 see also triune brain
brain lateralization 32, 53, 54, 84, 171, 212, 214, 217 see also left hemisphere; right hemisphere
brain stratification 10, 55 see also subcortical structures; triune brain
Broca’s area 32, 53, 54, 84, 127, 171, 188, 201, 212, 214, 215

c-command 104–5, 117, 119, 120, 137, 140 see also precedence
case 8, 27, 36, 61, 63, 64, 65, 68, 70, 71, 72, 74, 108, 182, 184 see also absolutive; accusative; case alignment; dative; ergative; Exceptional Case Marking (ECM); genitive; nominative
  default case 36, 43, 157
  inherent/lexical case 74
  structural case 36, 37, 44, 57 n. 1, 68, 121, 122
case alignment
  ergative-absolutive 61, 65, 70, 71, 72, 183
  nominative-accusative 61, 63, 65, 67, 71, 72, 183, 184, 207
categorical statements 23 n., 67 n. 14 see also thetic statements
cause 95, 97, 99, 100, 101, 110, 120, 178, 179 see also Cause First
Cause First 24, 82 see also Agent First
child language see acquisition
cognition 19, 22, 47, 55, 70, 113, 116, 122, 127, 169, 217 see also thought
communicative benefits/advantages 5, 15, 16, 23, 130, 174–86, 196, 197
comparative method 124–6, 201 see also animal communication
competition 19, 168, 169, 192, 193, 195, 199 see also adaptation; fitness
Complementizer Phrase (CP) 9, 10, 14, 18 n. 26, 28, 46, 47, 94, 109, 111–19, 122, 123, 126, 127, 132, 133, 135, 139, 187, 188, 190
complexity 4, 8 n., 20, 32, 45, 55, 65, 87 n., 129, 177, 185, 186, 201, 205, 211, 212, 213, 214, 216
compounds
  exocentric 8, 22, 60, 68–70, 72, 82, 92, 98–9, 149–52, 166, 169, 193, 217 see also verb-noun compounds
  hierarchical 22, 152–6, 169, 217
  noun-noun 5 n., 25
  root 149, 150
  verb-noun 8, 31, 60, 68–70, 82, 108, 144, 146, 148, 149–52, 156–8, 162–7, 168, 170, 171–3, 179, 194, 195, 217 see also exocentric compounds
  verb-verb 5 n., 25
concatenation 3, 4, 7, 23, 25, 33, 47, 75, 92, 93, 95, 99–101, 110 n. 20, 138 see also Conjoin; parataxis
Conjoin 20 n. 29, 36, 37, 48, 51, 89–95, 97, 119, 188, 215 see also Adjoin; concatenation; conjunction; parataxis
conjunction 13, 14, 16, 31, 34, 36, 87, 89, 90, 91, 93, 102–9, 120, 121, 128, 129, 132, 136, 140 see also Conjoin; coordination; linker
context 43, 78, 177–8, 180–5, 186 see also pragmatics
continuity 84 n., 126, 160, 195, 196 see also animal communication
Continuity Hypotheses 51, 57 n. 2 see also acquisition
coordination 13, 14, 71, 72, 87, 89, 91–3, 100–12, 120, 128, 129, 132, 134 n. 4, 135, 140, 162, 188 see also conjunction; linker
copula 87, 106, 107, 108, 109 see also linker
correlatives 99–101

dative 74, 75, 107
Denisovans 201, 203
derogatory reference 31, 98, 145, 150 n., 159–60, 167–9, 179, 193–7, 217 see also insult; pejorative
design of language 31, 32, 34, 56, 141, 208 see also optimal design
  all-or-nothing package 14
  decomposable 14, 15, 53, 55, 84, 92, 105, 122, 126, 136, 175, 197 n. 23, 208, 212
  design features of language 180, 185 see also displacement
determiner 36 n. 6, 121, 127
Determiner Phrase (DP) 29, 113–115, 117 n. 27, 122 n., 126, 129, 135 see also noun phrase
differential object marking (DOM) 184, 185
dimorphism 197 see also gender difference
directionality of change 11
disorders (language) 21 n. 31, 53, 54, 55, 170 n. 31, 191 n., 197 n. 22 see also genetics of language
displacement 17 n., 18, 175, 177, 180, 183, 187 see also here-and-now
display 169, 176, 187, 195
division of labor 39, 43, 181 see also specialization

economy 67, 184, 185
ellipsis 35, 43, 45, 52, 98, 151 n. 8 see also nonsententials; null arguments/categories
ergative 30, 61, 71
ergativity 56, 61, 63, 70, 72, 74, 83, 155
  split ergativity 72, 183, 184
Exceptional Case-Marking (ECM) 121, 122
exocentricity 145, 149–152 see also exocentric compounds; headedness
expressivity 35, 63, 69 n., 95 n. 9, 98, 141, 160, 164, 168, 179
externalization of language 186

falsifiability 20 n. 29, 81, 211 see also testing grounds
feature checking 28, 36, 44, 134 n. 4
filler 128–129
finiteness 9, 37, 41, 43, 49, 53, 100, 121, 187, 215 see also Tense Phrase
fitness 19, 193, 199 see also adaptation; competition; selection
formulaic speech 41, 43, 45, 47, 95, 127, 215
fossil see also approximation; precursor
  living fossil 12, 33, 40, 64, 65, 142, 144
functional categories 9, 10, 11, 12, 16, 26, 29, 35, 37, 50, 51, 53, 80, 89, 103, 108, 109–11, 113, 122, 126, 128, 130, 139, 140, 185, 214 see also lexical categories

gender difference 169, 197 see also dimorphism
genes 19, 21 n. 31, 54, 126, 180, 192 n., 209 see also disorders; mutation; variation (genetic)
  ASPM 21 n. 31, 54
  CNTNAP2 21 n. 31, 54
  FOXP2 19, 53–5, 204, 205, 208
  Microcephalin 54
genetics 8, 21, 53, 54, 190, 201, 208, 209 see also genes
genitive 74
Government and Binding 26
gradualist evolution 1, 2 n. 3, 4 n. 11, 20, 30, 43 n., 51, 106, 120, 175, 182, 186 n., 189, 202 n. 30, 216 see also continuity; incremental evolution; intermediate (stages); saltationist approaches
grammaticalization 10–11, 39, 40, 76, 86, 88, 107, 109, 110, 118, 121–124, 128, 141, 182, 184, 193 n.

H. erectus 54, 193, 196, 200, 201, 202, 203, 204, 205
H. ergaster 54, 205
H. habilis 54, 204 n. 32, 205
H. heidelbergensis 196, 200, 201, 202, 203
H. sapiens 19, 185, 201, 202, 203, 204
headedness 36, 87, 89, 92, 148, 150 see also exocentricity
here-and-now 17, 39, 43, 46, 160, 175, 181, 183, 186 see also displacement
hierarchical structure 3, 5, 13, 25–9, 46, 47, 59, 104, 105, 109, 115–19, 137, 150, 180, 187, 190, 201, 202, 204, 211–14 see also compounds (hierarchical); layering; scaffolding
historical change 39, 190–3 see also language change
Homesign 59, 82
hominin timeline 4, 23, 208–9 see also hominins
hominins 54, 167, 180, 193, 194, 196, 200 n. 27, 28, 203, 204 see also Australopithecus; Denisovans; H. erectus; H. ergaster; H. habilis; H. heidelbergensis; H. sapiens; Neanderthals
humor 16, 163, 164, 168, 171, 195 see also playfulness
hypotaxis 101, 112, 124 n. see also parataxis; subordination

iconicity 55, 95, 97, 99, 100, 101, 178, 179, 184 see also ideophones; Cause First
ideophones 94 n. 9, 95 n. 9 see also iconicity
imperative see mood
incremental evolution 4 n. 11, 15, 23, 25, 27, 43 n., 143, 174, 177, 179, 186, 189, 202 n. 30, 214 see also gradualist evolution
inflection 91 n. 3, 127, 157 n. see also morphology
information structure 119 n. see also topic-comment
inner speech 186 see also thought
innovation 1 n. 2, 13, 16, 18, 37 n., 47, 61, 80, 102, 117 n. 27, 139, 141, 186, 189, 194, 205 see also neophilia; novelty
insult 69, 99, 150 n., 167–169, 193–196, 207 see also derogatory reference; pejorative reference
intermediate (stages) 2, 15, 25, 30, 43 n., 61, 76, 79, 88, 89, 108, 128, 136, 140, 155, 197 see also gradualist evolution; middle; transitional (stages)
intonation 13 n., 75, 88, 89, 95, 102, 124, 125, 137 see also prosody
intransitivity 4, 9, 23, 24, 27, 38, 57–65, 73, 76, 81, 84, 119, 181 see also middles; transitivity
island constraints 31, 131–142 see also Subjacency
  Adjunct Island 14, 116, 132, 134 n. 4, 137
  Complex NP Island 133, 134
  Coordination Island 14, 116, 132, 134 n. 4
  Subject Island 132, 133, 134, 140 n. 13
  Wh-Island 133, 134 n. 6

KE family 53, 54 see also FOXP2 gene; genetics

language change 11, 14 n. 22, 117 n. 27, 190–3 see also historical change
last in first out 12
layering 29, 30, 37–9, 44, 50, 53–5, 61, 63, 65, 78, 90, 109, 118, 127, 153, 190, 214 see also hierarchical structure; scaffolding
left hemisphere 32, 53, 84, 171, 212, 217 see also brain lateralization
left periphery 101, 118, 131
lexical categories 10, 29, 98 see also functional categories
lexicon 11, 94, 98 see also vocabulary
limbic brain 55, 127, 170, 217 see also subcortical structures of the brain
linear order 119 see also precedence; word order
linker 13, 87, 103, 106–8, 109–11, 120, 128, 155, 162 see also conjunction; copula; linking vowel
linking vowel 155, 162 see also linker

Merge 3, 13, 14, 20 n. 29, 28, 38, 46 n. 24, 47, 58, 87, 92, 108, 113, 114, 116, 135, 136, 140 n. 13, 185, 186, 189 see also proto-Merge
metaphor 11, 99, 149, 166, 168
middle 10, 62, 76–81, 84, 181, 211 see also intermediate stages; passive; reflexive; transitivity
Middle to Upper Paleolithic transition/revolution 19, 205 see also symbolic explosion
Minimalism/Minimalist Program 1, 2 n. 3, 3, 7, 8–12, 21, 26–29, 34, 37, 38, 47, 58, 67 n. 15, 68 n., 117, 119, 131, 135, 139
mood 39, 40, 88, 159–162
  imperative 25, 34, 40 n. 15, 145, 149, 150, 153 n. 10, 156–62, 165–6, 170, 194 n. 16, 197
  indicative 39, 160
  injunctive 40, 161, 162
  irrealis 39, 40, 161
  optative 34, 40 n. 15, 159–62, 170
  subjunctive 40 n. 14, 161
morphology 8, 21, 25, 33 n. 2, 52, 60, 61, 68 n., 72, 148, 150–3, 157, 159 n., 161, 166, 194 n. 16, 215 see also inflection
Move(ment) 13, 14, 18 nn. 26–8, 38, 39, 45–7, 50, 87, 104, 109, 116–20, 131, 135–7, 140, 153, 185, 212, 214 see also island constraints; Subjacency
  covert movement 133
  subject movement 27, 37, 140 n. 13
  topic movement 38, 118, 119, 123, 212 see also topic-comment
  verb movement 88, 118
  wh-movement 28, 132–8
music 192, 195 n. 19
musical protolanguage 48, 90, 195 n. 19, 202 see also protolanguage
mutation 3, 15, 19, 21, 25, 53, 54, 55, 136, 185, 186, 193, 196, 197, 198, 199, 200 n. 28, 204, 208 see also genes; genetics

naming 158, 160, 167, 168, 193
Neanderthals 19, 200, 201, 202, 203, 204, 205
neophilia 176 see also novelty
neural/neuronal activation/connectivity 20, 21, 54, 211 see also brain activation
neuroimaging 52–3, 84, 121, 127, 208, 211–17 see also neuroscience
neuro-linguistics 21, 53, 84, 190 n. 11, 212
neuroscience 5, 20, 102, 127, 139 n. 10, 196, 208 see also neuroimaging
nominal 30, 65, 71, 73, 74, 80, 129 see also noun phrase
nominative 30 n. 38, 37, 44, 57 n. 1, 63, 68 n., 74, 157
nonsententials 35 n. 4 see also ellipsis; null arguments/categories
noun phrase 38, 67 n. 13, 68, 72, 83, 122 n., 133 see also Determiner Phrase; nominal
novelty 175, 176, 180 see also innovation; neophilia
null arguments/categories 57, 63, 79 n. 27, 82, 144, 151 see also ellipsis; nonsententials

ontogeny 15 n. 22, 50, 51 see also phylogeny
optimal design 3, 20, 39, 44, 46, 81, 104, 141 see also design of language
  Strong Minimalist Thesis (SMT) 3 see also Minimalism/Minimalist Program

parallelism 91, 92, 95, 104, 137, 192, 202 n. 29 see also symmetry
parameters of variation 2, 203 see also case alignment; tense/aspect/mood (TAM) marking; variation (linguistic)
parataxis 3, 36, 38, 48, 54, 58, 76, 81, 89–102, 103, 104, 110, 111, 114, 115, 117, 123, 124, 127, 137, 138, 140, 154, 179, 188, 196, 199–204 see also concatenation; Conjoin; hypotaxis
passive 35 n. 3, 57, 62, 69, 71, 72, 77, 79 n. 28, 102, 216 see also middle
pejorative reference 150, 159, 168 see also derogatory reference; insult
phase 35 n. 4, 134, 135
phylogeny 15 n. 22, 50, 51 see also ontogeny
playfulness 17, 163, 168, 171, 196 see also humor
polygamy 193
polygyny 196
possessives 29, 113–17, 189 see also Determiner Phrase; recursion
pragmatic mode 22, 24, 91 n. 4
pragmatics 17, 38 n. 9, 77–9, 81, 95, 101, 105, 113, 118, 175, 177, 180, 183, 186, 188 see also context; pragmatic mode
precedence 105 see also linear order
precision 17, 18, 69 n., 71, 91, 102, 112, 113, 115, 147, 155, 175, 177, 179, 180, 183, 187 see also vagueness
precursor 3, 8 n., 20 n. 29, 22, 24, 25, 34, 57, 62–5, 75–81, 86, 94, 113, 115, 116, 117–20, 126, 144, 152, 175, 179, 186, 188, 189, 203
predication 35, 44, 67 n. 14, 106, 107, 109, 110, 148, 179 see also argument-predicate structure
  proto-predication 67 n. 14, 78, 94, 97, 106, 144–9, 179 see also proto-syntax
primates 84, 125, 126, 160, 176 see also bonobos
Principles and Parameters 26
processing 21, 32, 53, 54, 58, 102, 112 n. 22, 113, 127, 141, 169, 200, 211, 212, 214, 215, 217 see also brain activation
  automatic/subconscious processing 16, 117, 175, 181, 182
prosody 13, 14, 48, 86–89, 90, 102, 108, 112, 124, 125, 128 see also intonation
protolanguage 3 n. 7, 4, 23, 24, 25, 48, 90, 98, 119, 175 n., 193, 201 see also musical protolanguage; proto-syntax
proto-Merge 13, 16, 87, 92, 102, 141, 145, 194 see also Conjoin; Merge
proto-syntax 4–9, 17, 24, 37, 57, 64, 82, 84, 89–95, 139, 164 n., 200, 214 see also protolanguage
punctuated (equilibrium) 4 n. 11, 199 n. 26

quirkiness 20, 56, 62, 75, 81, 123

reconstruction 5, 8, 9, 21, 22, 25, 33, 47, 126, 146 n. 2, 169 n. 30, 192, 201
  comparative method 11
  internal reconstruction 4, 9, 10, 11 n., 33, 34, 38, 64
recursion 3, 14 n. 21, 44–9, 115, 121–2, 126, 129, 135, 186, 187–90, 203 see also subordination
  and cognitive abilities 113, 116–17
  and compounds 25, 69, 94 n. 8, 164 n.
  and CP 28, 111–13
  and DP 29, 113–15, 129
  definitions of 47, 90, 111–12, 114
redundancy 20, 44, 88, 111, 140 see also robustness
reflexive 76, 77, 78, 161 n. 23 see also middle
right hemisphere 127, 170, 171, 212, 215, 217 see also brain lateralization
robustness 12, 13, 34, 83, 87, 88, 103 see also redundancy
root infinitive 49, 52, 128 see also small clause

saltationist approaches 3 n. 7, 19, 23, 25, 185, 186, 189, 202 n. 30, 204, 205 see also gradualist evolution; incremental evolution
scaffolding 10, 22, 25, 38, 50, 102, 109, 140, 144, 154, 169, 175, 180, 202 n. 30, 204 see also hierarchical structure; layering
selection 4, 14–20, 23, 44, 54, 63, 84, 97, 109, 126, 130, 141, 142, 145, 167–9, 174–5, 179, 190–9, 202 n. 30, 211 see also adaptation; competition; fitness; survival
  natural vs. sexual 1, 169, 179, 195, 197
semantics 22, 24, 27, 51, 68, 78, 100, 101, 102, 107, 113, 118, 148, 152, 155, 182 see also autonomous syntax
serial verbs 25, 75–6, 81, 115, 117
small clause (SC)
  embedded small clause 34–6, 92
  half clause 15, 39, 42, 43, 44
  root small clause 7, 16, 34–44, 46, 49, 52, 53, 89, 94, 106, 128, 138, 181, 187, 213, 215 see also root infinitive
specialization 2, 31, 39, 40, 43, 88, 108, 109, 110, 113, 122, 181, 188 see also division of labor
subcortical structures of the brain 53, 84, 125, 127, 212, 215, 217
Subjacency 14, 31, 116, 131–42, 174, 175, 208 see also island constraints
subordination 9, 11, 26, 36, 45, 50, 87, 88, 92 n., 97, 105, 110–13, 115–17, 120, 121, 122, 124, 126, 128, 141, 181, 188 see also hypotaxis
survival 1 n. 1, 31, 33, 73, 144, 168, 174, 191, 193 see also selection
swearwords 170, 171, 196, 217 see also derogatory reference; insult
symbolic explosion 205 see also Middle to Upper Paleolithic transition/revolution
symmetry 24, 36, 47 n. 26, 48, 81, 82, 91, 92, 93, 95, 105, 178, 202 see also parallelism

tense 9, 11, 26, 40, 44, 68, 98, 110, 122, 180–183, 185, 190 n. 12, 203
Tense Phrase (TP) 9, 26–7, 34, 38, 39, 46, 53, 58 n. see also tense
tense-aspect-mood (TAM) 44, 110, 182, 203 see also aspect; mood; tense
testing grounds 49–56, 81–5, 169–71, 211–19 see also falsifiability
thematic (theta) roles 22, 67 n. 15, 68, 78, 144, 147, 148, 151, 155
  agent 9, 22, 23 n., 24, 26, 42, 56, 61 n., 64, 67, 70, 71, 73, 78, 82, 83, 101, 102, 146, 147, 153, 155 see also Agent First
  patient/theme 22, 24, 42, 66, 70, 71, 78, 82, 83, 101, 144, 146, 147, 155
thetic statements 23 n., 67 n. 14 see also categorical statements
thought 24, 47, 97, 115, 116, 186, 201 see also cognition; inner speech
tinkering in evolution 2, 20, 31, 34, 39, 44, 62, 81, 88, 104, 111, 112, 116, 140, 174, 180, 197 n. 23
tone 190, 191, 192
topic-comment 22, 23, 94 n. 7 see also information structure; Move(ment)
transitional (stages) 12, 16, 23, 25, 30, 49, 79, 117, 120–4, 126, 129, 138 n., 176, 184, 190, 202 n. 30 see also intermediate (stages); middle
transitivity 4, 5, 9, 17, 23, 24, 26–7, 59, 60, 62, 63, 65, 71, 75–6, 79, 80, 83, 101, 115, 119, 175, 179, 183–5, 187, 189, 203, 211 see also intransitivity; middle; verb phrase (light)
triune brain 10, 55, 217 see also brain evolution
two-word/two-slot 48, 51, 57, 58, 63, 66–8, 79, 82–4, 94, 99, 118, 119, 125, 126, 130, 175–80, 183, 194, 196, 202 see also binary (branching)
typology 23, 71, 208 see also variation (linguistic)

unaccusative 23 n., 27, 40–4, 64 n. 11, 66–8, 72, 74, 82, 146, 214, 216 see also accusative
underspecification 68, 70, 78, 100, 102, 112 n. 23, 148, 155, 177, 181, 183, 189 see also vagueness
Uniformity of Theta Role Assignment (UTAH) 67 n. 15 see also thematic (theta) roles
universals in syntax 6, 44 n. 20, 47, 58, 94

vagueness 17, 40 n. 15, 61 n., 68, 69, 70, 73, 77–9, 100, 102, 147, 148, 175–80, 183, 184, 186, 197 n. 21 see also precision; underspecification
valence 98, 148 n. 6
variation
  genetic 18, 54 n. 28, 204 n. 32, 206 see also disorders; genetics
  linguistic 4, 23, 114, 134 n. 6, 182, 202, 203, 204, 206 see also parameters of variation
Verb Phrase (VP) 9, 26, 29, 38, 62, 63, 65 n., 80, 153 n. 12, 207
  light verb phrase (vP) 9, 10, 17, 22, 27, 61–4, 66, 67, 74, 84, 118, 153, 186, 211, 212
  vP/VP shell 9, 26 n., 37 n., 63, 64, 65, 81, 211
verbal dueling 168 see also insult
vocabulary 11, 16, 69, 97 n., 171, 179, 193, 196, 200, 202 see also lexicon

word order 22–24, 40–42, 52, 82, 84 n., 95, 99–101, 119, 153, 169, 170, 213 see also Agent First; Cause First
  SOV 82, 169 n. 30
  SVO 42 n. 17, 82
  VS 40, 41–3, 66, 82, 83, 84, 169 n. 30, 214

Oxford Studies in the Evolution of Language
General Editors
Kathleen R. Gibson, University of Texas at Houston,
and Maggie Tallerman, Newcastle University
Published
1
The Origins of Vowel Systems
Bart de Boer
2
The Transition to Language
Edited by Alison Wray
3
Language Evolution
Edited by Morten H. Christiansen and Simon Kirby
4
Language Origins
Perspectives on Evolution
Edited by Maggie Tallerman
5
The Talking Ape
How Language Evolved
Robbins Burling
6
Self-Organization in the Evolution of Speech
Pierre-Yves Oudeyer
Translated by James R. Hurford
7
Why we Talk
The Evolutionary Origins of Human Communication
Jean-Louis Dessalles
Translated by James Grieve
8
The Origins of Meaning
Language in the Light of Evolution 1
James R. Hurford
9
The Genesis of Grammar
Bernd Heine and Tania Kuteva
10
The Origin of Speech
Peter F. MacNeilage
11
The Prehistory of Language
Edited by Rudolf Botha and Chris Knight
12
The Cradle of Language
Edited by Rudolf Botha and Chris Knight
13
Language Complexity as an Evolving Variable
Edited by Geoffrey Sampson, David Gil, and Peter Trudgill
14
The Evolution of Morphology
Andrew Carstairs-McCarthy
15
The Origins of Grammar
Language in the Light of Evolution 2
James R. Hurford
16
How the Brain Got Language
The Mirror System Hypothesis
Michael A. Arbib
17
The Evolutionary Emergence of Language
Edited by Rudolf Botha and Martin Everaert
18
The Nature and Origin of Language
Denis Bouchard
19
The Social Origins of Language
Edited by Daniel Dor, Chris Knight, and Jerome Lewis
20
Evolutionary Syntax
Ljiljana Progovac
In Preparation
Darwinian Linguistics
Evolution and the Logic of Linguistic Theory
Stephen R. Anderson
Published in Association with the Series
The Oxford Handbook of Language Evolution
edited by Maggie Tallerman and Kathleen R. Gibson
Language Diversity
Daniel Nettle
Function, Selection, and Innateness
The Emergence of Language Universals
Simon Kirby
The Origins of Complex Language
An Inquiry into the Evolutionary Beginnings of Sentences,
Syllables, and Truth
Andrew Carstairs-McCarthy