DOI: 10.26346/1120-2726-178
A quantitative analysis of counterfactual conditionals in
cross-linguistic perspective
Jesús Olguín Martínez,a Nicholas Lesterb
a University of California, Santa Barbara, United States <olguinmartinez@ucsb.edu>
b University of Zürich, Switzerland <nicholas.a.lester@gmail.com>
People often reason about states of the world that could have been, but which are
not, or those which could be, given that certain conditions are satisfied. When we
make statements about such relationships, we usually divide them into two parts:
the condition (protasis) and the result (the apodosis). While most languages signal
the relationship between protasis and apodosis in counterfactual conditional constructions explicitly, they vary widely in the structures they use to do so. The present study addresses several questions related to cross-linguistic variation in this
domain. How are the clauses marked for tense, aspect, and modality? What kind
of clause-linking strategies are used to combine them? Are the clauses marked
using the same or different morphosyntax? Through qualitative and quantitative
analysis of a large sample of carefully selected languages, we demonstrate widespread differences between languages. We also uncover general patterns of features that correlate both with the symmetry and the morphosyntax of protases and
apodoses in counterfactual conditionals across languages.
Keywords: counterfactual conditional, adverbial clause, irrealis, TAM markers, clause-linking devices, clause combining, linguistic typology, subordination,
coordination.
1. Introduction
Although adverbial clauses have long been a topic of strong interest to linguists, there have been relatively few studies of particular
semantic relations from a cross-linguistic perspective. One semantic
relation that has received little attention is that of counterfactual conditionals. Counterfactual conditionals convey the idea that the state of
affairs denoted by them did not happen or could not happen (cf. Givón
2001: 332-333).1 Various studies have addressed this semantic relation
in individual languages (e.g. Arkadiev 2020 on Kuban Kabardian), in
particular macro-areas (e.g. Nicolle 2017 on African languages), and in
particular language families (e.g. Bhatt 1998 on Indo-Aryan languages).
However, to the best of our knowledge, only a few broad cross-linguistic
studies have explored counterfactual conditionals (e.g. Comrie 1986;
Wierzbicka 1997; Haiman & Kuteva 2001; Xrakovskij 2005; Karawani
2014; Qian 2016).
Italian Journal of Linguistics, 33.2 (2021), p. 147-182
(received December 2019)
One important aspect to bear in mind is that those cross-linguistic
studies that have addressed counterfactual conditionals have examined
them in a monofactorial context (e.g. Haiman & Kuteva 2001 on the
symmetric and asymmetric patterns of counterfactual conditionals). In
monofactorial studies, one independent variable (at a time) is investigated without reference to any other independent variables. However,
one independent variable hardly ever accounts for all variation in a
dependent variable (Wulff et al. 2014: 276-277). There is only one crosslinguistic study that has explored counterfactual conditionals in a monofactorial context by taking into account several parameters independently. Qian (2016: 104), in his typological work, considers the formal type
of subordinating device, the order of the protasis and apodosis, and the
deranking status of the protasis. Although the study of these parameters
in a monofactorial context provides an important point of departure,
we should like to know how each behaves when the others have been
controlled for (Wulff et al. 2014: 276-277). Monofactorial methodologies
cannot shed light on this theoretical aspect.
This paper has two goals. First, we explore counterfactual conditionals by taking into account a genetically and areally balanced sample of
107 languages. As will be seen, counterfactual conditionals vary widely
across languages. To keep the scope of the discussion manageable, we
focus on three parameters: (i) the symmetric and asymmetric morphological patterns of counterfactual conditionals, (ii) the range of Tense-Aspect-Mood values (henceforth TAM) that tend to appear in the protasis and
apodosis of counterfactual conditional constructions, and (iii) the range of
clause-linking strategies used in the encoding of counterfactual conditionals. Next, we apply two statistical analyses of these distributions to a subset of the overall database (selected to maximize sample sizes across the
classes that we compare statistically). Here we test which factors lead to
symmetric vs asymmetric systems, as well as which morphosyntactic features distinguish protasis from apodosis within this subsample. Based on
these two studies, we hope (a) to uncover the cross-linguistic distribution
of the features associated with counterfactual conditionals, (b) to determine what conglomerations of factors produce symmetric vs asymmetric
systems, and (c) to see whether protases and apodoses show reliable morphosyntactic properties across languages.
The paper is structured as follows. In Study 1, we first introduce the
sample of languages. Then, we provide some theoretical remarks on counterfactual conditionals and propose a comparative concept required to compare this construction across languages. Based on this concept, we explore
the symmetric and asymmetric morphological patterns of counterfactual
conditionals in the worldʼs languages. We further document the range of
TAM values that tend to appear in both the protasis and apodosis and provide a concise overview of the clause-linking strategies used in the encoding
of counterfactual conditionals. In Study 2 (see section 3 for a detailed discussion), we report the results of two statistical analyses that test (i) which
variables best predict the morphological symmetry/asymmetry of this construction across languages and (ii) whether there are any typological associations between TAM and clause type within the construction. Finally, we
synthesize the results of the two studies to arrive at a statistically informed
picture of the typological variability of counterfactual conditionals.
2. Study 1: Typological variables associated with counterfactual conditionals
Since this study is one of the first attempts to explore counterfactual conditionals from a typological perspective, the ideal strategy is to
build a sample that is genetically stratified at the level of genus, which
aims at avoiding a genetic bias.2 Having a genetically balanced sample
is critical given that our main aim is to find statistical tendencies and
correlations. Genetic biases could produce misleading or unreliable statistical conclusions, that is, conclusions that can only be generalized to
specific (sub)groups of languages. Areal biases need to be
removed as well. In this research, we take into account a
genetically and areally balanced sample of 107 languages based on the
Genus-Macroarea method proposed by Miestamo et al. (2016). In this
method, the primary genealogical stratification is made at the genus level, and the primary areal stratification at the level of macroareas. While
the Genus-Macroarea method adopts the genealogical and areal stratification proposed by Dryer (2013), it does not follow the procedure(s)
Dryer adopts for the selection of languages in Dryer (1989).3 In Dryer’s
(1989) method of sampling, languages are first included in the sample
without a systematic method of selection, and this bottom-up approach
is then complemented by a more systematic stratification at the stage of
testing generalizations (Miestamo et al. 2016: 242). The structure and
motivations behind the selection of languages deserve some explanation.
In the Genus-Macroarea method, constructing a sample without predetermined sample size will, at its simplest, mean picking one language
from every genus. This means that we attempted to find one language
from each of Dryer’s genera for which the available literature gives sufficient information on the grammar of counterfactual conditionals.
Dryer’s (2013) classification in WALS contains 543 genera. It is important to mention that in Dryer’s classification each language belongs to a
genus and each genus belongs to a family. Note that a language can be
the only member of its genus, and a genus may form a family on its own
(Miestamo et al. 2016: 239). For some genera, we were not able to find
any language that meets that criterion. Therefore, these genera do not
(indeed, cannot) figure in our discussion. We were able to find sufficient
information on one language in each of exactly 165 genera (i.e. 165 genera out of 543, or 30%), which accounts for our Core Sample.4
One aspect to consider is that some macro-areas are better represented
than others in the languages of the Core Sample because of the availability
and quality of the sources. Bibliographic bias tends to introduce areal bias
(Miestamo et al. 2016: 251). This areal bias is problematic for the reason
that we are exploring relationships between linguistic parameters. In order
to avoid this areal bias, we followed the method for achieving a better
areal balance that was introduced in Miestamo (2005). In this regard, the
Restricted Sample is a subsample drawn from the Core Sample with the aim
of balancing the representation of each macro-area, and thereby avoiding
areal biases. In the Restricted Sample the number of genera of the least
well-represented area defines the maximal size of the Restricted Sample
that can be drawn from the Core Sample (Miestamo et al. 2016: 252).
In our study, as can be seen in Table 1, the least well represented
area is South America with 19.09% of its genera covered in the Core
Sample. The Restricted Sample will thus include 19.09% of the total
number of genera in each macro-area. For instance, 19.09% of the total
number of genera in Africa (77) gives the number of African languages
in the Restricted Sample as 15, 19.09% of the total number of genera
in Papunesia (136) gives the number of Papunesian languages in the
Restricted Sample as 26, etc.
Macro-area      Number of   Number of genera     Coverage   Number of genera in     Coverage
                genera      in the Core Sample              the Restricted Sample
Africa          77          23                   29.87%     15                      19.09%
Australia       43          17                   39.53%     10                      19.09%
Eurasia         82          27                   35.36%     17                      19.09%
North America   95          28                   31.57%     18                      19.09%
Papunesia       136         43                   32.35%     26                      19.09%
South America   110         21                   19.09%     21                      19.09%
Total           543         165                  —          107                     —
Table 1. Genera covered in the Core Sample and Restricted Sample.
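The quota arithmetic described for the Restricted Sample can be sketched as follows. This is a minimal illustration, not the authors' own procedure: the function name `restricted_quota` is hypothetical, and rounding to the nearest integer is an assumption, since the text does not state how fractional quotas are resolved. Only the figures quoted in the running text (South America, Africa, Papunesia) are used.

```python
def restricted_quota(genera_in_area: int,
                     min_core_genera: int,
                     min_area_genera: int) -> int:
    """Number of languages an area contributes to the Restricted Sample.

    The coverage of the least well-represented area (its Core Sample genera
    divided by its total genera) is applied to the total number of genera
    in the area of interest. Rounding to the nearest integer is an
    assumption made for this sketch.
    """
    coverage = min_core_genera / min_area_genera
    return round(genera_in_area * coverage)


# South America is the least well-represented area: 21 of its 110 genera.
south_america_core, south_america_total = 21, 110
coverage_pct = 100 * south_america_core / south_america_total
print(f"limiting coverage: {coverage_pct:.2f}%")  # 19.09%

# The two worked examples given in the text.
africa = restricted_quota(77, south_america_core, south_america_total)
papunesia = restricted_quota(136, south_america_core, south_america_total)
print(africa, papunesia)  # 15 26
```

The limiting area contributes all of its Core Sample genera by construction, which is why the South American figure is the same in both samples.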
With the areal bias removed, the Restricted Sample is better suited
to serve as a basis for quantitative analysis. Table 2 provides the list of
107 languages of the Restricted Sample arranged by macro-area.
Macro area      Sample languages                                                Sum
Africa          Bangime, Boko, Emai, Eton, Gumuz, Koyra Chiini, Lango,          15
                Lumun, Maba, Ngiti, Sandawe, Supyire, Tamashek, Tommo So,
                Ts’ixa
Australia       Bininj Gun-Wok, Gaagudju, Gooniyandi, Gurr-Goni, Kuku           10
                Yalanji, Malakmalak, Mangarrayi, Ngankikurungkurr,
                Ungarinjin, Wardaman
Eurasia         Armenian, English, Finnish, Georgian, Hungarian, Ket,           17
                Kharia, Kodava, Korean, Lao, Lezgian, Mongsen Ao, Palula,
                Spanish, Tangsa, Udihe, Yukaghir
North America   Ayutla Mixe, Barbareño Chumash, Buglere, Central Alaskan        18
                Yupik, Chol, Cree, Creek, Crow, Garifuna, Haida, Huasteca
                Nahuatl, Jamul Tiipay, Slave, Sochiapan Chinantec, Teribe,
                Wappo, Warihio, Yuchi
Papunesia       Abau, Awtuw, Balantak, Barai, Barupu, Dadibi, Daga, Duna,       26
                Golin, Ilocano, Imonda, Inanwatan, Kaluli, Kombio, Komnzo,
                Lavukaleve, Makasae, Manambu, Motuna, Paiwan, Rapanui,
                Rotokas, Sulka, Taba, Urama, Urim
South America   Aguaruna, Ashéninka Perené, Awa Pit, Baure, Bora, Epena         21
                Pedee, Guna, Hup, Iquito, Kokama-Kokamilla, Kwaza,
                Mapudungun, Mosetén, Movima, Murui Huitoto, Muylaq’
                Aymara, Puinave, Tariana, Tiriyó, Urarina, Yurakaré
Total                                                                           107
Table 2. List of languages of the Restricted Sample arranged by macro-area.
Before leaving the present section, some remarks on the languages that
were left outside the Restricted Sample are in order. We followed Miestamo
et al. (2016: 253), who explain that the languages left outside
the Restricted Sample should come from the family (or families) with the
greatest number of languages in the sample in each macro-area. This reduces
genetic bias by reducing the influence of large families. For instance, for
the African macro-area, we decided to leave out languages from different
genera of the Niger-Congo and Afro-Asiatic families, the two largest families
in Africa. This approach therefore maximizes the independence of the languages sampled in that it avoids including languages from different genera
of the same family which might share a feature inherited from the protolanguage of the family. Note that, unlike in this method, in Dryer’s (1989) method two or more languages from different genera of the same family may be
included, which may result in a sample not suitable for statistical analysis.
It is important to stress that whenever we noticed that two or more languages belonging to different families have been in contact, we decided to
choose other languages from the same families. However, sometimes it was
not possible to establish whether the languages belonging to different families shared a feature because of intense contact or chance. It is important to
mention that these cases are rather few and do not detract from the validity
of our overall statistical conclusions.
2.1. Theoretical remarks on counterfactual conditionals and comparative concept
Conditional clauses have been traditionally divided into different
types. For instance, Bennett (1908: 198) divides Latin conditional clauses into conditional clauses in which nothing is implied as to the reality
of the supposed case, hypothetical conditional clauses, which refer to
imagined states of affairs that might hypothetically happen, and counterfactual conditionals, which refer to imagined states of affairs that did not
happen. Smyth (1920: 517-520) considers different types of conditional
clauses in Greek, such as simple past and present conditionals (i.e. conditions which simply state a supposition with no implication as to its
reality or probability), present and past unreal conditionals (i.e. the protasis implies that the state of affairs cannot be realized because it is contrary
to a known fact), more vivid future conditionals (i.e. the speaker sets
forth a thought as prominent in his mind), less vivid future conditionals
(i.e. it expresses suppositions less distinctly conceived and of less immediate concern to the speaker), and general conditionals (i.e. they refer to
a state of affairs that is very likely to occur).
In a recent typological study, Qian (2016: 108) shows that most
languages tend to make a morpho-syntactic tripartite differentiation
in hypotheticality in conditional clauses. He mentions that only a few
languages lack a formal distinction among when-clauses, if-clauses and
if…would-clauses. In languages where the distinction is not encoded,
the differentiation between temporal and hypothetical constructions is
therefore contextually dependent. The focus of this paper is on counterfactual conditionals. Counterfactual conditionals are considered to be
semantic primitives in that every language should have a construction
which allows the speaker to express this meaning (Wierzbicka 1997:
28). Before launching into this discussion, the reader should bear in
mind one remaining general point.
One of the biggest challenges of typology is coping with different
terminological traditions across languages while exploring one particular phenomenon. For instance, different terminology has been used to
talk about non-finite adverbial forms, such as converbs in Altaic languages, gerunds and adverbial participles in languages from Europe,
medial verbs in languages from New Guinea, and conjunctive participles
in languages from South Asia (Haspelmath 1995: 23). Another example comes from apprehensive markers. The terminology used to refer
to apprehensive markers varies a lot, especially from one geographical
area to another and across language families. In this regard, Vuillermet
(2018: 259) explains that she has identified about 20 terms, such as
admonitive, avertive, warning clitic, timitive, volitive of fear or fear case
marker, among others. Unlike these phenomena, the terminology used in
different grammars to refer to counterfactual conditionals seems not to
vary a lot. Protasis and apodosis are the most common ways to refer to
the counterfactual conditional clause and the main clause respectively.
Other less common ways are antecedent and consequent, subordinate clause and matrix clause, and dependent clause and superordinate
clause. In the present study, we have chosen to use the terms ‘protasis’
and ‘apodosis’. This is due to the fact that, as explained by Traugott
(1985: 304), the concepts ‘protasis’ and ‘apodosis’ are the traditional
terms, whereas antecedent and consequent are associated more directly
with the philosophical tradition. We now turn to the definition of counterfactual conditionals adopted in this study.
The definition in (1) is the comparative concept put forward in this
paper.5 This definition facilitates cross-linguistic comparability and does not
impose any a priori restrictions on the form of counterfactual conditionals.
(1) Counterfactual conditional: A counterfactual conditional clause is a type of complex
sentence construction in which the relation between the protasis and apodosis is that of an
imagined state of affairs that did not happen.
There are two key components that can be highlighted from the definition in (1): complex sentence construction and imagined state of affairs
that did not happen. The first component (i.e. complex sentence construction) refers to a specific relationship between (at least) two states of affairs
in (at least) two clauses (Longacre 1985: 255; Croft 2001: 320-321).6
Complex sentence constructions are thus sentences that contain more than
one clause. A clause, in turn, can be defined as a unit minimally consisting of a predication that may be accompanied by its arguments and modifiers (Lehmann 1988: 182; Haspelmath 1995: 11; Gast & Diessel 2012: 4
– among many others). The syntactic relation between these two states of
affairs may be one of coordination or subordination, among others.7
Conceived of in this way, the notion of complex sentence construction is useful because it has enabled us to incorporate counterfactual
conditional constructions which show different types of syntactic relations
and are encoded by various types of clause-linking strategies. Therefore,
if the component complex sentence construction is substituted by a more
particular syntactic relation, such as subordination, many languages will
have to be excluded from the present study, such as the example in (2),
from Imonda (West Sepik). Note that in this example both the protasis
and apodosis are simply juxtaposed rather than marked by any specific
morpho-syntactic device(s). Complicating the picture further, the distinction between subordinate and main clauses is regarded by many linguists
to be gradual (Gast & Diessel 2012: 5; Lehmann 1988: 190 – among many
others), making it difficult to define or to compare across languages.
(2) Imonda (Seiler 1985: 206)
ka       heulõ-ta-ba,  ne-m        ka       eg-t.
1sg.sbj  hear-irr-top  2sg.obj-gl  1sg.sbj  follow-cf
‘If I had heard (you), I would have followed you.’
With this in mind, the following range of complex sentence constructions and clause-linking strategies are taken into account in the
present study. First, languages may encode counterfactual conditionals
by means of paratactic structures, as in (3). By parataxis is meant two
clauses without any structural element linking them. The relation arises
by implicature, usually due to contextual or common knowledge and/or
iconicity of sequencing (Greenberg 1966; Haiman 1980). As pointed out
by Mauri & Sansò (2009), it is not infrequent to find languages lacking
grammaticalized strategies (e.g. free adverbial subordinators) and expressing counterfactual conditional relations by means of paratactic constructions. Mauri & van der Auwera (2012: 396) explain that in this scenario
not all is left to inferential processes. Rather, if a language expresses
counterfactual conditionals by means of paratactic constructions, at least
one of the linked states of affairs has to be marked as irrealis (by means of
irrealis, dubitative, or hypothetical elements) in order for the counterfactual conditional relation to be inferable. Verstraete (2014: 223) mentions
that TAM markers, in paratactic counterfactual conditionals, may serve
as a pragmatic trigger of the counterfactual conditional interpretation.
This strategy has a clear areal distribution, as has been shown previously by
Haiman (1983). Its mainstay is Papua New Guinea and Australia (see section 2.4. for a similar cross-linguistic distribution attested in the present
study). In the Yimas (Lower Sepik) example in (3), two clauses appear
one after the other without any grammaticalized strategy. In order for the
counterfactual conditional relation to be inferable, it is necessary that the
two clauses are overtly marked as potential, otherwise the hearer could
interpret the construction as a purely temporal or causal relation.
(3) Yimas (Foley 1991: 442)
tuŋkurŋ    ant-ka-tay-c-mp-n,               ant-ka-tu-r-ak.
eye.vi.sg  pot-1sg.sbj-see-pfv-vii.sg-obl   pot-1sg.sbj-kill-pfv-vii.sg.obj
‘If I had seen the eye (of the crocodile), I would have killed it.’
Counterfactual conditionals may also be encoded by a general
coordinating device, as is shown in (4). General coordinating devices
are coordinating linkers, such as ‘and’ (Haspelmath 2004), that occur in
a biclausal construction, from which a counterfactual conditional relation is inferred due to iconicity of sequencing and/or contextual factors.
Given the underspecification of general coordinating devices, Mauri &
van der Auwera (2012: 396) also explain that not all is left to inferential
processes and at least one of the linked states of affairs has to be marked
as irrealis (by means of irrealis, dubitative, or hypothetical elements)
in order for the counterfactual conditional relation to be inferable, as is
shown in the Sulka (Isolate) example in (4). In this example, if one of
the clauses does not appear with -ngoe, the hearer could interpret the
construction as a purely temporal or causal relation (Tharp 1996: 153).
(4) Sulka (Tharp 1996: 153)
ip-ngoe              va   nap-ngoe.
2sg.sbj-go.pst.cond  and  3sg.sbj-go.pst.cond
‘If you had gone, then he would have gone.’
We also take into account, in the present study, counterfactual conditionals encoded by grammaticalized strategies, i.e. dedicated devices,
which explicitly encode the semantic relation of the adverbial clause to
the state of affairs expressed in the main clause. The most common dedicated devices by which counterfactual conditionals tend to be encoded
in the languages of the sample are dedicated adverbial subordinators
and specialized converbs. Some comments on the properties of these
devices and the challenges in defining them are in order.
A dedicated adverbial subordinator is a morpheme that marks a
subordinate adverbial clause for its semantic relationship to the main
clause. For the most part dedicated adverbial subordinators are associated with free subordinating items, illustrated in the San Andrés Otomi
(Oto-Manguean) example in (5), where the counterfactual conditional
relation is encoded by the free adverbial subordinator bɨ ‘if’. However,
there are languages in which dedicated adverbial subordinators may
be bound morphemes, as can be seen in the Rama (Chibchan) example
in (6), where the counterfactual conditional relation is encoded by the
bound adverbial subordinator -kata ‘if’.
(5) San Andrés Otomi (Lastra de Suárez 2001: 136)
bɨ  kʷa-nú,           kʷa-ó-hpí                 r˄   másčité.
if  1sg.sbj.subj-see  1sg.sbj.subj-ask-3sg.obj  art  machete
‘If I had seen (it), I would have asked him the machete.’
(6) Rama (Craig 1990: 165)
nah      maa      alkuk-kata,  nah      uwaik      siik-ut.
1sg.sbj  2sg.sbj  hear-if      1sg.sbj  long.time  come-irr
‘If I had heard (that) you (had come), I would have come a long time ago.’
The greatest obstacle in defining dedicated adverbial subordinators in the present study has been to define what a subordinate clause is
(Kortmann 1997: 57). However, given that subordination is a multidimensional phenomenon (Lehmann 1988) described by a set of independent formal parameters (e.g. dependent clause reduces its range of TAM
values, dependent clause increasingly acquires nominal properties),
there will be instances in which the dedicated adverbial subordinator
will clearly operate in a subordinate clause and others in which it will
not. In the Movima (Isolate) example in (7), the free adverbial subordinator disoy ‘if’ introduces a clause that is clearly subordinate in that it is
deprived of any TAM markers, it appears with nominalizing morphology
(i.e. the suffix -wa), and it occurs with the oblique article nokos, commonly found in nominal elements. The opposite situation is shown in
the Wardaman (Yangmanic) example in (8) in that the free adverbial
subordinator bujun ‘if’ appears in a dependent clause that is marked for
its own TAM markers and shows overt participant coding. This clause
shows the same properties as main clauses (Merlan 1994: 188).8
(7) Movima (Haude 2006: 532)
disoy  no-kos   dinkaye-wa-nkweɬ,
if     obl-art  hurry-nmlz-2pl.sbj
diʼ    man<a>ye=nkweɬ    ney   diːra.
hyp    meet<dr>=2pl.sbj  here  still
‘If you had hurried, you might still have met them here (but you didnʼt).’
(8) Wardaman (Merlan 1994: 188)
bujun  yi-ngan-wo-ndi                 ma-jad,
if     irr-3sg.sbj.1.sg.obj-give-pst  big-abs
yi-ngong-wo-ndi.
irr-2sg.sbj.1sg.obj-give-pst
‘If he had given me a lot, I would have given you (some).’
Another important aspect regarding dedicated adverbial subordinators should be mentioned here. Since language is not a static but rather
a dynamic system that is in a constant state of flux (Croft 2003: 283), it
is expected that languages may have dedicated adverbial subordinators
that are not (yet) fully grammaticalized. When building the sample of the
present study, we came across languages in which counterfactual conditionals are encoded by verbs meaning ‘to say’. Whether this form has become
grammaticalized as a dedicated adverbial subordinator or not is unclear
to us. For instance, in Anejom (Oceanic) the expression of counterfactual
conditionals by means of the verb ika ‘say’ is very frequent, as in (9). In
Araki (Oceanic) the form co de is a free adverbial subordinator related to
the verb ‘say’, as can be seen in (10). Note that de ‘say’ is accompanied
by the first person inclusive plural irrealis pronoun co which refers to the
speaker and his addressee. However, it may also be accompanied by other
types of person markers, which seems to suggest that it may not (yet) be fully
grammaticalized. François (2002: 177) explains that in this case co de has
to be understood as ‘let us say that’, in a very similar way to English ‘let us
suppose’. Other Oceanic languages in which this pattern is found are Bariai
(Gallagher & Baehr 2005: 160), Big Nambas (Fox 1979: 108-109), Daakaka
(Von Prince 2015: 378), Kwamera (Lindstrom & Lynch 1994: 35) and
Mangap-Mbula (Bugenhagen 1995: 404). For the sake of transparency, the
policy adopted in this study has been to exclude these instances from the
present research on the grounds that it has not been possible to determine
whether these strategies are dedicated adverbial subordinators or verbs. It
is important to stress that these problematic cases are rather few and do not
detract from the validity of our overall conclusions.
(9) Anejom (Lynch 2000: 161)
et       wut            ika  et       idim    itiyi  ehe,
3sg.aor  temp.conj.fut  say  3sg.aor  really  neg    rain
ek       pu   idim    apan  m-asjan-ya.
1sg.aor  fut  really  go    es-throw-line
‘If it really hadnʼt rained, I would have gone fishing.’
(10) Araki (François 2002: 178)
co          de   na       maci,  na           pa   avu.
1.incl.irr  say  1sg.sbj  bird   1sg.sbj.irr  seq  fly
‘If I were a bird, I would have flown.’
One important aspect to bear in mind is that, in some languages,
counterfactual conditionals may be encoded by two dedicated adverbial
subordinators. For instance, in the Urarina (Isolate) example in (11), the
dependent clause appears with baana ‘if’ and hananiane ‘if’.
(11) Urarina (Olawsky 2006: 255)
baana  itɕʉʉ-a=ne           hananiane,  raj   kalaui-tɕʉrʉ  mʉkʉ-akatɕe.
if     be.near-3sg.sbj=sub  if          poss  son-pl        catch-1pl.sbj
‘If its creatures had been near, we would have caught it (about a peccary).’
Counterfactual conditionals may also be encoded by specialized
converbs, that is, special verb forms that do not appear in independent
declarative clauses (Cristofaro 2003: Chapter 3) and mark the adverbial
clause for its semantic relationship to the main clause, as in the Ingush
(Nakh-Daghestanian) example in (12). Although specialized converbs
and bound adverbial subordinators may look similar at first glance,
there are some clear-cut differences between them. While specialized
converbs are part of the inflectional paradigm of verbs and thus in
paradigmatic contrast to other inflectional morphemes, bound adverbial
subordinators are not. What this means is that specialized converbs cannot be analyzed as a verb plus a subordinating affix (Haspelmath 1995:
4). Another important difference between these devices has to do with
their lexical autonomy. Specialized converbs never have the degree of
autonomy associated with the status of lexemes (Haspelmath 1995: 4),
but bound adverbial subordinators do. These criteria have played an
important role when exploring the sources of the sample.
(12) Ingush (Nichols 2011: 305)
ehw         dalaarie,        mocagha   hwa-dea            xuddar.
conscience  gend.be.irr.cvb  long_ago  deic-gend.ant.cvb  go.gend.cond
‘If they had had any conscience, they would have done it long ago.’
Another thought-provoking example comes from Chamacoco in
(13). In this language, counterfactual conditionals are encoded by parahypotaxis.9 In this example the protasis appears with a dedicated adverbial subordinator and the apodosis appears with a general coordinating
device that is obligatory. Interestingly, there are instances in which both
the protasis and apodosis appear with a dedicated adverbial subordinator, as in the Paiwan example in (14), and the Lango example in (15).
(13) Chamacoco (Bertinetto & Ciucci 2012: 98)
kẽhe,  uu        lɨke  ɨshɨr            lɨshɨ      sẽhe,
if     det.sg.m  this  indigenous.sg.m  poor.sg.m  want
teehe,  s-ohnɨmichɨ=ke,    hn   uhu       oy-ihye     ɨre.
interj  3.irr-get.off=pst  and  2sg.caus  1pl-arrest  3sg
‘If the indigenous had wanted to get off (the bus), you would have made us arrest him.’
(14) Paiwan (Chang 2006: 318)
kana  na=meLay          sa        Ɂudal,  kana=ken     a   vaik=anga.
cf1   perf=rain.stop.av  this.nom  rain    cf2=1sg.nom  lk  go.av=compl
‘If this rain had stopped, I would have already left.’
(15) Lango (Noonan 1992: 233)
kónô  ònwòŋò        àtíê                 cɛm,  kónô  àmîyí.
if    3sg.find.pfv  1sg.sbj.be.pres.hab  food  if    1sg.sbj-give-pfv-2sg.obj
‘If I had had food, I would have given it to you.’
Having explained the constructions that are included in the present study due to the notion of complex sentence construction, we turn
briefly to the constructions that are excluded due to this criterion. The
examples in (16) and (17) are discarded from the study because they do
not establish a relationship between two states of affairs, that is, both
examples lack an apodosis.
(16) Ma’di (Blackings & Fabb 2003: 143)
ɲɨ       drɨ   drɨ   dʒè   kū.
2sg.sbj  then  hand  wash  neg
‘Had you not washed your hand.’ (you’d have been in real trouble) (the event of washing took place a few moments ago)
(17) Hunzib (Van den Berg 1995: 106)
zuq’u-r  q’ədə  diɁi    y-at’əru         ʕadam.
be-pret  irr    me.dat  2-love-pst.ptcp  person
‘If I only had a lover.’
The second component of the comparative concept used in the
present study is that of an imagined state of affairs that did not happen. This component refers to past counterfactual conditionals, which
express a counterfactual state of affairs in the past (e.g. If John had
come yesterday, we would have had fun) and present counterfactual conditionals, which express a counterfactual state of affairs in the present
(e.g. If only John were here now, we would be happy). The sources of the
languages of the sample explain for the most part the encoding of past
counterfactual conditionals rather than present counterfactual conditionals.
Before leaving the present section, it is important to bear in mind
that we also take into account languages in which counterfactual conditionals and hypothetical conditionals are expressed in the same way
and therefore they leave the interpretation to be inferred from the
context. This theoretical fact has not gone unnoticed and echoes Qian
(2016: 101), who explains that in some languages (e.g. Mising, Hmong,
Tagalog, Dolakha Newar, Zuni, Vietnamese), there is a clear differentiation between real and hypothetical conditional clauses. However, in
these languages a hypothetical or a counterfactual conditional reading
is contextually dependent. This is shown in the Gumawana (Oceanic)
example in (18) and the Longgu (Oceanic) example in (19), in which
there is a construction that allows both a hypothetical and counterfactual conditional reading.
(18) Gumawana (Olson 1992: 360)
neta  i-tagona,  dedei-na,  ta-tupa.
if    3sg-offer  good-3sg   1pl.incl-sail
‘If he offered, then good, we would sail.’
‘If he had offered, then good, we would have sailed.’
(19) Longgu (Hill 1992: 286)
zuhu  no   beata  roporopo-i,  gaoa      ho   la  bweubweu.
if    irr  fine   morning-sg   1du.incl  irr  go  walking
‘If it were fine this morning, we would go for a walk.’
‘If it had been fine this morning, we would have gone for a walk.’
The general spirit of this section has been to bring greater conceptual clarity to the understanding of counterfactual conditionals. In doing
so, this section provided a brief survey of the main components of counterfactual conditionals in the light of cross-linguistic data. In the following sections, we explore the three parameters mentioned in §1.
2.2. Symmetric and asymmetric patterns of the protasis and apodosis
Cross-linguistically, the verbs in the protasis and the apodosis of
a counterfactual conditional may be encoded by different TAM values.
This property may be called the asymmetry of conditionals (Haiman
& Kuteva 2001: 101), as is illustrated by Mparntwe Arrernte (Pama-Nyungan) in (20), where the protasis appears with -ke and the apodosis
with -mere.10 However, sometimes the protasis and apodosis, irrespective
of their particular morphological form, have parallel structures, which
we refer to as a symmetric pattern. In the Quiegolani Zapotec (Oto-Manguean) example in (21), both the protasis and apodosis occur with
the counterfactual mood marker ny-. Interestingly, there are languages
in which counterfactual conditionals may be symmetric or asymmetric,
as is shown in the examples in (22) and (23), from Huasteca Nahuatl
(Uto-Aztecan). Another possibility is that neither the protasis nor the
apodosis shows any TAM values, as is illustrated in (24), from Tetun
(Austronesian). Note that these instances are treated as symmetric counterfactual conditionals because they show parallel structures.
(20) Mparntwe Arrernte (Wilkins 1989: 234)
unte     apmwerrke  petye-ke,
2sg.sbj  yesterday  come-pst.compl
arrayte  unte     te-nhe       are-mere.
true     2sg.sbj  3sg.obj-acc  see-hyp
‘If you had come yesterday, then you certainly would have seen her.’
(21) Quiegolani Zapotec (Black 1994: 44)
che-bel  ny-oon=t    Min,    ny-oon-t    Lawer.
when-if  cf-cry=neg  Jazmin  cf-cry-neg  Laura
‘If Jazmin had not cried, Laura would have cried.’
(22) Huasteca Nahuatl (Olguín Martínez 2016: 75)
tlan  kin-kuah-toskia       tama-li,   amo  mayana-toskia.
if    3pl.obj-eat-cond.pst  tamal-abs  neg  be.hungry-cond.pst
‘If he had eaten tamales, he would not have been hungry.’
(23) Huasteca Nahuatl (Olguín Martínez 2016: 76)
ach-ia-toya      okichpil  ilhui-tl,  ach-miki-toskia.
neg-go-pst.perf  boy       party-abs  neg-die-cond.pst
‘Had the boy not gone to the party, he wouldn’t have died.’
(24) Tetun (Van Klinken 1999: 312)
kalo  haʼu     feto,
if    1sg.sbj  woman
haʼu     la   bele  k-akur         tasi  wé-n.
1sg.sbj  neg  can   1sg.sbj-cross  sea   water-gen
‘If I were not a woman, I wouldn’t have been able to cross the sea.’
Map 1. Distribution of symmetric and asymmetric counterfactual conditionals.
Macro-area       Symmetric   Asymmetric   Both
Africa                   3           11      1
Australia                9            1      0
Eurasia                  5            9      3
North America            4           11      1
Papunesia               13           12      2
South America            4           17      1
Total                   38           61      8
Table 3. Distribution of symmetric and asymmetric counterfactuals per macro-area.
As can be observed in Map 1, asymmetric counterfactual conditionals are the most robust type (61/107=57%). They are found in all
macro-areas, but the preference for this type is especially strong in South
America in the languages of the sample (i.e. 17/61=27.86%), as is
shown in Table 3. Symmetric counterfactual conditionals are the second
most common type (38/107=35.51%). They are also found in all the
macro-areas, but they are mostly attested in languages from Papunesia
(13/38=34.21%) and Australia (9/38=23.68%), as is illustrated in Table
3. Haiman & Kuteva (2001: 109) explain that the symmetric morphological pattern of counterfactual conditionals is predominantly an areal
typological feature in languages from Papua New Guinea. The authors
mention that it occurs in almost every Papuan language they are aware
of. Brooks (2018: 187) mentions that this symmetric pattern may be due
to contact-induced language change by showing evidence from Chini and
other languages from Papua New Guinea. In this regard, he mentions that
the forms are not always cognate across Chini and other languages from
Papua New Guinea, but the symmetric pattern is the same.
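The proportions reported in this paragraph can be recomputed directly from the counts in Table 3. The following is a minimal sketch for checking them; the dictionary layout and the helper names `column_totals` and `share` are ours and purely illustrative, and only the counts come from Table 3. Note that the helper rounds, so the last digit may differ from figures in the text that appear to be truncated.

```python
# Counts copied from Table 3: (symmetric, asymmetric, both) per macro-area.
TABLE_3 = {
    "Africa": (3, 11, 1),
    "Australia": (9, 1, 0),
    "Eurasia": (5, 9, 3),
    "North America": (4, 11, 1),
    "Papunesia": (13, 12, 2),
    "South America": (4, 17, 1),
}
PATTERNS = ("symmetric", "asymmetric", "both")


def column_totals(table):
    """Sum each pattern over all macro-areas."""
    return tuple(sum(row[i] for row in table.values()) for i in range(len(PATTERNS)))


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


totals = column_totals(TABLE_3)
sample_size = sum(totals)

# Share of each pattern in the whole sample.
for name, total in zip(PATTERNS, totals):
    print(f"{name}: {total}/{sample_size} = {share(total, sample_size)}%")

# Share of each macro-area within the symmetric type.
for area, row in TABLE_3.items():
    print(f"{area}: {row[0]}/{totals[0]} = {share(row[0], totals[0])}%")
```

The same two helpers can be applied to any of the per-macro-area tables in this section by swapping in the corresponding counts.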
Having addressed the symmetric and asymmetric morphological
patterns of counterfactual conditionals, we now turn our attention to the
range of TAM values that tend to appear in these complex sentence constructions.
2.3. TAM values of counterfactual conditionals
Since counterfactual conditionals express non-actualized states of
affairs, one would expect them to appear with TAM markers
whose semantics is appropriate to the counterfactual conditional context,
such as irrealis markers, conditional mood markers, and counterfactual
mood markers, among others (Mithun 1995: 384). However, it has long
been observed that, across a large number of unrelated languages, past
tense markers and other TAM markers whose semantics does not harmonize with the counterfactual conditional meaning (e.g. perfective, completive) tend to appear in counterfactual conditional constructions (Comrie
1986). This is a clear mismatch because past tense marking,
perfective marking, and perfect marking tend to occur with states of affairs that are
actualized, whereas, as mentioned above, counterfactual conditionals
express non-actualized states of affairs. Linguists have tried to
explain this mismatch along two lines of reasoning: a remoteness-based approach and a back-shifting approach
(see von Prince 2019 for a detailed explanation). First, those who have
adopted the remoteness-based approach explain that past and counterfactuality share a semantic core of distance from the actual present (von
Prince 2019). For instance, Steele (1975) explains that the connection
between past tense and counterfactual conditionals is that the past tense
marker has as its basic meaning not past tense but something like distant
from present reality. Karawani (2014: 15) mentions that the connection
between past tense and counterfactual conditionals stems from the fact
that there is an inherent nature of the past as being closed and therefore
the condition is impossible or false. Second, von Prince (2019) explains
that past tense markers, in the back-shifting approach, “are thought to
push one’s perspective back in time so that developments that are no
longer possible become historically accessible.”
Our study shows that past tense markers and other TAM markers
whose semantics do not harmonize with the counterfactual conditional
meaning tend to occur in counterfactual conditional constructions.
However, there may be more to the story: past tense may
combine with another type of TAM marker expected to occur in non-actualized states of affairs (e.g. irrealis, counterfactual mood), showing
a mixed pattern. For instance, the protasis of the counterfactual conditional in the Papantla Veracruz Totonac (Totonacan) example in (25)
appears with semantically conflicting TAM values, viz. the past
tense marker ix- and the completive marker -li (expected to occur in actualized states of affairs) and the counterfactual mood marker -ti- (expected
to occur in non-actualized states of affairs).
(25) Papantla Veracruz Totonac (Levy 1990: 139)
para  ix-k-tiː-akxilh-li,       ix-k-tiː-maqskiˊ-lh       ixmachiːta.
if    pst-1sg.sbj-cf-see-compl  pst-1sg.sbj-cf-ask-compl  machete
‘If I had seen it, I would have asked him the machete.’
For the purposes of the present study, we discuss the TAM values of the
protasis and the apodosis separately. We use four
terms to describe these TAM values: actualized pattern, non-actualized pattern, mixed pattern, and
unmarked pattern.
Actualized patterns refer to those instances in which the protasis
or the apodosis is encoded by TAM values whose semantics do not harmonize with the counterfactual conditional context, such as past tense
marking, perfect marking, completive marking, and perfective marking.
For instance, the protasis of the Bangime (Isolate) example in (26) is
encoded by perfective marking and past tense marking. These TAM values are not expected to appear in counterfactual conditionals.
(26) Bangime (Heath & Hantgan 2017: 465)
sé  ŋ̀    jáá      Séédù   ŋījɛ̀       hīŋgà,
if  1sg  see.pfv  Seydou  yesterday  pst
ŋ̀    dɛ́gɛ́     ∅    náw.
1sg  hit.fut  1sg  fut
‘If I had seen Seydou yesterday, I’d have hit him.’
Non-actualized patterns refer to those instances in which the protasis or the apodosis is encoded by TAM values expected to occur in
the counterfactual conditional context, such as irrealis, potential mood
marking, conditional mood marking, counterfactual mood marking,
future tense marking, and hypothetical mood marking. An example
appears in (27) from Gooniyandi (Bunuban), where the protasis is
encoded by the subjunctive -ya- and the irrealis -ala.
(27) Gooniyandi (McGregor 1990: 432)
barlanyi  mila-ya-ala,          mangaddi  mood-gila-rni.
snake     see-subj-irr.1sg.sbj  neg       step.on-irr.1sg.sbj-pot
‘If I had seen the snake, I wouldn’t have stepped on it.’
One remark on the irrealis category is in order here. Mithun (1995:
384) explains that the notion irrealis portrays the state of affairs as
within the realm of thought, as knowable only through imagination. A
source of potential confusion in any discussion on irrealis is that it has
been applied to different concepts and constructions in languages from
many areas of the world. It is therefore important to clarify what is
meant when using this term. In this paper, we consider irrealis as specific markers (rather than notional descriptions of non-encoded meanings of constructions) in the forms of verbal affixes and clausal enclitics (Brooks 2018: 4). There seems to be a strong correlation between
counterfactual conditionals and irrealis marking because, as explained
by Mithun (1995: 384), when languages have a grammaticalized realis/
irrealis distinction, counterfactual conditionals tend to be encoded by
irrealis marking. This study supports this claim: in most
languages of the sample that have a grammaticalized realis/irrealis distinction, counterfactual conditionals tend to be marked by irrealis.
Mixed patterns refer to those instances in which the protasis or apodosis is encoded by a combination of two semantically conflicting TAM
values, as can be seen in the Hungarian (Uralic) example in (28), where
the protasis appears with the past tense marker -t and the conditional
mood marker volna.
(28) Hungarian (Kenesei et al. 1998: 52)
ha  Péter-alud-t     volna,  Anna  haragud-ott   volna.
if  Peter-sleep-pst  cond    Anna  be.angry-pst  cond
‘If Peter had been asleep, Anna would have been angry.’
Unmarked patterns refer to those instances in which the protasis or apodosis lacks TAM marking, as can be seen in the example in (29)
from Inanwatan (Marind). In this example, the protasis does not appear
with any TAM values.
(29) Inanwatan (de Vries 2004: 39)
lwáa-go         dókter-e  náwe  úra-y-aigo,
yesterday-circ  doctor-m  me    see-trans-neg
máiwo-go  nú-d-eqo.
now-circ  die-cf-1sg.sbj
‘If the doctor had not helped me yesterday, I would have died.’
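The four-way terminology introduced above amounts to a simple decision rule over the TAM values found in a clause. The sketch below is only an illustration of that rule: the gloss abbreviations and the function name `classify_tam` are our assumptions, and the two sets list just the TAM values named in the definitions of actualized and non-actualized patterns.

```python
# Hypothetical gloss labels for the TAM values named in the definitions above.
ACTUALIZED = {"pst", "perf", "compl", "pfv"}
NON_ACTUALIZED = {"irr", "pot", "cond", "cf", "fut", "hyp"}


def classify_tam(markers):
    """Return the TAM pattern of a clause, given the set of its TAM glosses."""
    markers = set(markers)
    unknown = markers - ACTUALIZED - NON_ACTUALIZED
    if unknown:
        # Refuse to guess for values the definitions do not mention.
        raise ValueError(f"unclassified TAM values: {sorted(unknown)}")
    has_actualized = bool(markers & ACTUALIZED)
    has_non_actualized = bool(markers & NON_ACTUALIZED)
    if has_actualized and has_non_actualized:
        return "mixed"
    if has_actualized:
        return "actualized"
    if has_non_actualized:
        return "non-actualized"
    return "unmarked"


print(classify_tam({"pfv", "pst"}))   # protasis of (26)
print(classify_tam({"pst", "cond"}))  # protasis of (28)
print(classify_tam(set()))            # protasis of (29)
```

Raising an error on unlisted values is a deliberate choice in this sketch: whether a given marker counts as actualized or non-actualized is an analytical decision, not something the code should settle silently.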
Before leaving the present section, one remark on actualized and
non-actualized patterns is in order here. There are languages in which
the protasis will be nominalized, but it may appear with TAM marking
that is actualized or non-actualized. This fact has not gone unnoticed
and echoes Qian (2016: 156), who explains that in different languages
the protasis or apodosis of counterfactual conditional constructions may
be nominalized, but may take TAM verbal inflections, as can be seen in
Table 4.11
Nominalization of protasis      Nominalization of apodosis
Hup, Kham, Macushi, Warekena    Afar, Kwazá, Movima, Pashto, Savosavo, Yimas
Table 4. Languages in which the protasis or apodosis of counterfactual conditional constructions is nominalized (Qian 2016: 158).
Having introduced the terminology that will be used in the following section, we can now proceed to explaining the most common TAM
values of both the protasis and apodosis in counterfactual conditional
constructions.
2.3.1. TAM values of the protasis in counterfactual conditionals
As can be observed in Map 2, the protases of counterfactual conditionals tend to appear with a non-actualized pattern (34/107=31.77%)
or actualized pattern (32/107=29.90%) in the languages of the sample.
While both types are found in all macro-areas, they seem to be more
frequent in particular macro-areas. As is shown in Table 5, actualized
protases seem to be slightly more common in Africa (9/32=28.12%) and
non-actualized protases in Papunesia (i.e. 13/34=38.23%). Some other
observations to be gleaned from Map 2 are the following. First, mixed protases are scattered in all macro-areas, but they seem to be slightly more
frequent in Eurasia (i.e. 6/23=26.08%). Second, unmarked protases are
mostly attested in Papunesia (i.e. 7/18=38.88%). Note that this type is
not found in Africa and Australia in the languages of the sample.
Map 2. TAM values of the protasis in counterfactual conditionals.
Macro-area       Actualized   Non-actualized   Mixed   Unmarked
Africa                    9                4       3          0
Australia                 1                5       3          0
Eurasia                   7                2       6          2
North America             7                2       4          4
South America             4                8       4          5
Papunesia                 4               13       3          7
Total                    32               34      23         18
Table 5. Distribution of TAM values of the protasis in counterfactual conditionals per macro-area.
2.3.2. TAM values of the apodosis in counterfactual conditionals
The first and most important finding, as can be observed in Map 3,
is that apodoses encoded by non-actualized patterns are the most common pattern worldwide. In the sample, 56 languages (56/107=52.33%)
show this pattern. In particular, this pattern seems to be more common
in Papunesia (16/56=28.57%) and South America (15/56=26.78%).
With respect to mixed patterns (25/107=23.36%), they are found in
all macro-areas, but they do not seem to cluster in any particular area.
Actualized apodoses (21/107=19.62%) are mostly
attested in Africa in the languages of the sample. Note that languages
tend not to have apodoses that are unmarked.
Map 3. TAM values of the apodosis of counterfactual conditionals.
Macro-area       Actualized   Non-actualized   Mixed   Unmarked
Africa                    7                3       6          0
Australia                 1                6       3          0
Eurasia                   1                8       6          1
North America             5                8       4          0
South America             1               15       4          1
Papunesia                 6               16       2          3
Total                    21               56      25          5
Table 6. Distribution of TAM values of the apodosis in counterfactual conditionals per macro-area.
In the following section we explain the last parameter addressed
in the present study, viz. the range of clause-linking devices used in the
encoding of counterfactual conditionals.
2.4. Clause-linking devices used in the encoding of counterfactual conditionals
Clause-linking devices are among the most important means used to
establish subordinative and coordinative relations (Hetterle 2015: 106).
These devices may sometimes shed light on the type of semantic relation that holds between clauses (e.g. adverbial subordinators, specialized
converbs) in that they serve as devices for labeling complex sentence
relations like causal, conditional or temporal relations (Verstraete 2014:
195). Counterfactual conditionals are encoded by different formal types
of clause-linking devices. For the purposes of this study, we classify
these strategies in the following way.
First, specialized devices refer to devices that are only used to
encode counterfactual conditionals. These include dedicated adverbial
subordinators and specialized converbs. In the example in (30) from Eton
(Niger-Congo), the free clause-linking device bɛ́n is only used to encode
counterfactual conditionals. Therefore, this device is specialized. Second,
non-specialized devices refer to devices that encode counterfactual conditionals and other semantic types of conditionals (e.g. real, generic, and
hypothetical). In Aguaruna (Jivaroan), all semantic types of conditionals
are encoded by the subordinating affix -ka as can be observed in (31) and
(32). This seems to indicate that -ka is a non-specialized device. Third,
parataxis refers to those languages in which counterfactual conditionals
and other semantic types of conditionals (e.g. real, generic, and hypothetical) do not appear with any clause-linking device, as can be observed in
the examples in (33) and (34) from Gaagudju (Isolate).
(30) Eton (Van de Velde 2008: 365)
bɛn  nâ    ɲɛ̋     à-dǐdìá      va̋,
if   comp  i.ppr  i-foc~being  here
mə̀-lɛ́dà-H          wɔ̀.
1sg.sbj-show-cons  2sg.nppri.ppr
‘If it had been here, I would have shown it to you.’
(31) Aguaruna (Overall 2017: 391)
wi       kaʃini    wi-a-ku-nu-ka,
1sg.sbj  tomorrow  go-ipfv-sim-1sg:ss-cond
taka-sa-tʃa-tata-ha-i.
work-att-neg-fut-1sg.sbj-decl
‘If I go tomorrow, I wonʼt work.’
(32) Aguaruna (Overall 2017: 507)
ami      wɨ-tʃau-aita-ku-mɨ-ĩ-ka,
2sg.sbj  go.pfv-neg:rel.cop-sim-2-cond
ʃiiha  anɨ-sa-nu            puhu-mai-inu-aita-ha-i.
well   be.happy-sub-1sg:ss  live-pot-nmlz-cop-1sg-decl
‘If you had not gone, I would be happy.’
(33) Gaagudju (Harvey 2002: 371)
i-rree-ma           biirndi  magaadja  arree-wagi.
3i-1sg.sbj-get.fut  money    that.iv   1sg.sbj-go.back
‘If/When I get money, I will go back there.’
(34) Gaagudju (Harvey 2002: 372)
ø-ng-goro-garraa-ri         arr-geenma-ri=ni.
3i-1sg.sbj.irr-see-aux-pst  1sg.sbj-say.irr-pst=3sg.m.ind.obj
‘If I had seen him, I would have told him.’
As Map 4 demonstrates, non-specialized devices are the most
common type (45/93=48.38%; indicated by blue dots). These are
attested in all macro-areas. However, they seem to be more frequent in
North America (11/45=24.44%), Eurasia (10/45=22.22%), and South
America (10/45=22.22%). The second most frequent type is that of
paratactic counterfactual conditionals (28/93=30.10%; indicated by
green dots). Interestingly, this type of clause-linking strategy shows a clear
areal skewing in that it is found mainly in two macro-areas, viz.
Australia (7/28=25%) and Papunesia (12/28=42.85%), in particular in
languages from Papua New Guinea. Note that paratactic counterfactual
conditionals are completely absent from Eurasia. The third and
least common type is that of specialized devices (20/93=21.50%;
indicated by red dots). They are attested in all macro-areas except Australia (see Table 7), but do not
seem to show any areal clusters. Note that we removed all languages
with unknown clause-linking devices (n=14) in order to explore the
cross-linguistic distribution of clause-linking devices used to express
counterfactual conditional constructions.
Map 4. Clause-linking devices used in the encoding of counterfactual conditionals.
Macro-area       Specialized   Non-specialized   Parataxis
Africa                     5                 4           3
Australia                  0                 3           7
Eurasia                    5                10           0
North America              2                11           2
South America              3                10           4
Papunesia                  5                 7          12
Total                     20                45          28
Table 7. Distribution of clause-linking devices used in the encoding of counterfactual conditionals per macro-area.
3. Study 2: Statistical analyses
In study 2, we perform two statistical analyses. The first aims to
uncover the variables that impact whether protases and apodoses are
encoded via symmetrical or asymmetrical patterns. The second tests
which, if any, TAM markers are distinctively associated with the protasis or apodosis across the languages in our sample. Prior to the analyses, we reduced the sample of languages. In particular, we removed all
languages for which it has not been possible to determine whether the
linking device is specialized (i.e. devices that are only used to encode
counterfactual conditionals) or non-specialized (i.e. devices that encode
counterfactual conditionals and other semantic types of conditionals,
e.g. real, generic, and hypothetical) (n=14). These languages account
for approximately 13% of the sample (14/107). We further removed those languages with systems that lacked TAM marking on the apodosis (n=5) or that
allow nominalization of the protasis (n=2). These trims were necessary
given issues of data sparsity. The final sample consisted of 86 languages.
3.1. Classification and Regression Tree (CART) analysis12
Our first goal is to discover which variables predict the presence of
symmetrical or asymmetrical systems for counterfactuals cross-linguistically. For this purpose, we use a technique from machine learning known
as Classification and Regression Tree (henceforth CART) analysis (our task
is one of classification). We have selected this analysis for several reasons.
First, we are dealing with a relatively small sample of labeled entities
(in this case, languages). Second, we have several categorical predictor
variables, each with several levels. Third, many of the cells in the cross-tabulated predictor space are sparsely populated or contain zeroes. That
is, we do not have enough observations of many of the variable combinations to make reliable estimates of their behavior with respect to our
dependent variable. All of these facts create problems for more common
methods of classification, such as binary logistic regression.13 We therefore
select the non-parametric classification algorithm known as CART. CART
analysis involves the recursive binary partitioning of a dataset based on
which predictor variable is most strongly associated with the outcome
variable. Associations are weighted using significance tests against the
null hypothesis that the predictor and outcome variables are unrelated.
At each potential decision point in the tree, all predictors are considered,
and the resulting set of p-values is corrected for multiple comparisons.
Here we apply the Bonferroni correction. As the partitions must be binary,
the levels of each categorical variable used for each split are divided into
two groups. Partitioning is stopped when all corrected p-values are greater
than the significance threshold (here, α = .05).
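To make the split-selection step concrete, the following is a minimal sketch of a single decision point: each predictor is tested against the outcome, the p-values are Bonferroni-corrected, and a split is made only if the smallest corrected p-value does not exceed the threshold. Several things here are assumptions rather than a description of the analysis reported in this paper: the significance test (a permutation test on Pearson's chi-square statistic), the toy data, and all function and variable names. The sketch also omits the binary grouping of levels and the recursion over the resulting partitions.

```python
import random
from collections import Counter


def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    rows, cols, cells = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat


def permutation_p(xs, ys, n_perm=999, seed=0):
    """P-value for the null hypothesis that xs and ys are unrelated."""
    rng = random.Random(seed)
    observed = chi_square(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square(xs, shuffled) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def choose_split(predictors, outcome, alpha=0.05):
    """Return the predictor to split on (or None) and all corrected p-values."""
    m = len(predictors)  # number of comparisons for the Bonferroni correction
    corrected = {
        name: min(1.0, m * permutation_p(values, outcome))
        for name, values in predictors.items()
    }
    best = min(corrected, key=corrected.get)
    return (best if corrected[best] <= alpha else None), corrected


# Toy data (hypothetical, not taken from the sample): 20 "languages".
outcome = ["asymmetric"] * 10 + ["symmetric"] * 10
predictors = {
    "protasis_tam": ["actualized"] * 10 + ["mixed"] * 10,   # tracks the outcome
    "linking": ["specialized", "non-specialized"] * 10,     # unrelated to it
}

best, corrected = choose_split(predictors, outcome)
print(best, corrected)
```

In a full tree, the same step would be repeated within each partition until no corrected p-value falls at or below the threshold.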
For this analysis, we included four predictors: macro-area, TAM marking on the apodosis and protasis (respectively), and clause-linking
strategy. The resulting model achieved 85% classification accuracy.
Simply guessing the most frequent symmetry label yields a performance of
57% (24% poorer than our model). Sampling randomly based on the true
distribution (i.e., sometimes guessing the less frequent outcome in proportion to the observed distribution; baseline = $p_{\text{symmetrical}}^{2} + p_{\text{asymmetrical}}^{2}$)
yields a performance of 51% (30% poorer than our model).
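The two baselines can be reproduced from the share of the most frequent label alone. The sketch below assumes a two-class outcome and takes the 57% figure from the text; the function and variable names are ours.

```python
def majority_baseline(p):
    """Accuracy of always guessing the more frequent of two labels."""
    return max(p, 1 - p)


def proportional_baseline(p):
    """Expected accuracy of guessing each label in proportion to its frequency:
    a guess is correct with probability p*p + (1-p)*(1-p)."""
    return p ** 2 + (1 - p) ** 2


p_majority = 0.57  # share of the most frequent symmetry label, as reported above

print(round(majority_baseline(p_majority), 2))
print(round(proportional_baseline(p_majority), 2))
```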
The classification tree is presented in Figure 1. The highest-level
split was made using the TAM marking on the protasis. This finding
alone is interesting, as it suggests that morphological (a)symmetry
depends most strongly on the properties of the protasis rather than the
apodosis of the counterfactual conditional construction. In particular,
languages with unmarked or actualized protases are reliably distinguished from those with mixed or non-actualized protases (p<.001).
The former group contains almost exclusively asymmetric languages
(bar graph for node 2). For languages with mixed or non-actualized
protases, the TAM of the apodosis further helped to predict (a)symmetry. Actualized and non-actualized apodoses were reliably distinguished
from mixed apodoses (p<.05). Both groups of languages overwhelmingly prefer symmetric marking (bar graphs for nodes four and five).
Figure 1. Results of the CART analysis predicting the (a)symmetry of the counterfactual
system across languages.
3.2. Contingency analysis
To determine the TAM properties that distinguish apodosis from
protasis, we perform a contingency analysis adapted from Gries &
Stefanowitsch (2004). This analysis involves a Fisher-Yates exact test
computed over a cross-tabulation of TAM marking strategies and type of
system. The contingency analysis works by constructing a series of 2×2
tables. Each cell contains a frequency. Columns represent the outcome
levels (apodosis vs protasis). Rows represent a given TAM value (e.g.
mixed) versus all other levels. The direction of any significant results is
derived from the difference between observed and expected frequencies
(we assume a uniform distribution as the null hypothesis). A positive
difference, or over-representation relative to the expected baseline, indicates affiliation; a negative difference, or under-representation, indicates
repulsion. The raw data for the analysis are provided in Table 8.
TAM              apodosis    protasis
non-actualized   47 (.62)    29 (.38)
unmarked          0 (.00)    15 (1.00)
actualized       19 (.42)    26 (.58)
mixed            20 (.56)    16 (.44)
Table 8. Frequency of TAM types per clause type (%).
It is immediately clear that unmarked TAM in this sample appears
exclusively on the protasis. Non-actualized TAM markers are roughly 1.6 times
as likely to occur on the apodosis as on the protasis (47 vs. 29). Actualized and mixed TAM marking are
more evenly distributed across the clause types. Table 9 shows the results of
the contingency analysis (only significant relationships are reported).
TAM              f_obs apo   f_obs pro   f_exp apo   f_exp pro   ∆P(type→TAM)   ∆P(TAM→type)   p       pref
non-actualized   47          29          38          38          0.21           0.21           <.001   apo
unmarked         0           15          7.5         7.5         0.17           0.55           <.001   pro
Table 9. Distinctive affiliation of TAM types to (a)symmetry of the counterfactual.
Table 9 provides several pieces of information. First, we have the
observed (f_obs) and expected (f_exp) frequencies of the apodosis (apo) and
protasis (pro) per TAM type. Next, we have a unidirectional measure of
association known as ∆P (Ellis 2006), taken both from the clause type
to the TAM marker, ∆P(type→TAM), and from the TAM marker to the
clause type, ∆P(TAM→type). ∆P describes the relationship between cues and
outcomes. In the present study, cues and outcomes may alternatively be
defined as values of the TAM or clause-type variables. ∆P equals 0 when
the cue is unrelated to the outcome. It approaches 1 as the cue and outcome are positively related (cue predicts presence of the outcome) and
-1 as they are negatively related (cue predicts absence of the outcome).14
Bidirectional relationships are indicated by similar values of ∆P(type→TAM)
and ∆P(TAM→type). Finally, we have the p-value produced by the Fisher-Yates exact test, along with the clause type that is distinctively associated with the TAM marker.
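The expected frequencies and the two ∆P values in Table 9 can be derived from the counts in Table 8. The following sketch assumes the usual definition ∆P = P(outcome | cue) − P(outcome | no cue) and a uniform null over the two clause types; the function names are ours, and the exact test itself is not reproduced here.

```python
# Observed counts from Table 8: TAM type -> (apodosis, protasis).
TABLE_8 = {
    "non-actualized": (47, 29),
    "unmarked": (0, 15),
    "actualized": (19, 26),
    "mixed": (20, 16),
}
APO, PRO = 0, 1


def delta_p(a, b, c, d):
    """Delta-P for a 2x2 table with cells a (cue, outcome), b (cue, no outcome),
    c (no cue, outcome), d (no cue, no outcome)."""
    return a / (a + b) - c / (c + d)


def summarize(tam, clause):
    """Expected frequency and both Delta-P values for one TAM type and clause type."""
    other = 1 - clause
    a = TABLE_8[tam][clause]  # this TAM type, this clause type
    b = TABLE_8[tam][other]   # this TAM type, other clause type
    c = sum(v[clause] for k, v in TABLE_8.items() if k != tam)  # other TAM, this clause
    d = sum(v[other] for k, v in TABLE_8.items() if k != tam)   # other TAM, other clause
    expected = (a + b) / 2             # uniform null over the two clause types
    type_to_tam = delta_p(a, c, b, d)  # cue: clause type, outcome: TAM type
    tam_to_type = delta_p(a, b, c, d)  # cue: TAM type, outcome: clause type
    return expected, round(type_to_tam, 2), round(tam_to_type, 2)


print(summarize("non-actualized", APO))
print(summarize("unmarked", PRO))
```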
First, we see that non-actualized TAM markers are significantly preferred by the apodosis cross-linguistically. As illustrated by the values
of ∆P, this relationship is largely bidirectional, meaning that non-actualized markers and apodoses are mutually strong cues of one another.
Second, we see that unmarked status is strongly associated with protasis
(unsurprising given the absence of any languages with unmarked apodoses in the sample). In this case, unmarked status is a much stronger
predictor of clause type than the other way around.
3.3. Discussion
Morphological (a)symmetry is best predicted by the types of TAM values in the protasis and apodosis. This is to be expected, as (a)symmetry is
defined relative to these properties. However, the result has two interesting
implications. First, morphological symmetry between clauses is more common for languages with mixed or non-actualized protases. Second, none of
the other variables was necessary to achieve a high degree of accuracy in
predicting (a)symmetry. However, for certain TAM configurations, one can
readily reconstruct the corresponding values of macro-area and clause-linking strategies. For instance, Papunesian languages with actualized apodoses
tend to be encoded by juxtaposed clause linkage.
The specific TAM affinities of protasis and apodosis are instructive about the general semantics of the counterfactual conditional construction. For example, protases tend to be morphologically unmarked
whereas apodoses tend to occur with non-actualized morphology. While
an unmarked clause offers no information about its relationship to reality, clauses marked with non-actualized morphology explicitly assert the
non-reality of the corresponding state of affairs. Moreover, the (a)symmetry of the overall system is best discriminated by splitting unmarked
and actualized from mixed or non-actualized protases (Figure 1). When
the protasis is unmarked or actualized, the outcome is almost certainly
an asymmetric system. Conversely, when the protasis may occur with
mixed or non-actualized morphology, the likelihood that the overall
system will be symmetric increases dramatically. Therefore, (a)symmetry seems to be a product of protases that behave like apodoses rather
than the other way around. In other words, symmetrical systems tend
to be those that treat the entire counterfactual conditional construction as ungrounded or hypothetical. Asymmetrical systems tend to be
those which afford special status (either grounded or unmarked) to the
protasis while leaving the apodosis non-actualized. The former kind of
system is consistent with the overall meaning of the counterfactual construction. The latter system is not, in principle, though it does follow a
certain logic. Protases serve as the background against which apodoses
are evaluated. Marking them as actualized grounds them conceptually,
hence treating them as a sort of given. Even though neither situation is
asserted to have occurred in reality, the situation encoded by the apodosis is treated as contingent on a world in which whatever is expressed in
the protasis did in fact occur. Actualized morphology thus anchors the
protasis to an imagined world as a precedent for the apodosis.
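The partition described above (unmarked and actualized protases on one side, mixed and non-actualized protases on the other) can be illustrated with a minimal sketch of how a CART-style procedure compares candidate binary splits of a categorical predictor. The sketch assumes Gini impurity as the splitting criterion, which is a common default but is not stated here to be the criterion used in the present analysis. The counts in `TOY_SAMPLE` are invented for illustration and are not the data of this study; the function names are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy sample of (protasis TAM type, morphological pattern) pairs.
# These counts are invented for illustration and are NOT the study's data.
TOY_SAMPLE = (
    [("unmarked", "asymmetric")] * 9
    + [("unmarked", "symmetric")] * 1
    + [("actualized", "asymmetric")] * 5
    + [("mixed", "symmetric")] * 4
    + [("mixed", "asymmetric")] * 1
    + [("non-actualized", "symmetric")] * 8
    + [("non-actualized", "asymmetric")] * 2
)


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def split_impurity(sample, left_values):
    """Weighted Gini impurity of the two child nodes produced by sending
    rows whose protasis type is in left_values to the left child."""
    left = [label for value, label in sample if value in left_values]
    right = [label for value, label in sample if value not in left_values]
    total = len(sample)
    return (len(left) / total) * gini(left) + (len(right) / total) * gini(right)


def best_split(sample):
    """Try every binary partition of the protasis types and return
    (impurity, left_values) for the partition with the lowest impurity."""
    values = sorted({value for value, _ in sample})
    best = None
    for size in range(1, len(values)):
        for left in combinations(values, size):
            # Each partition has a mirror image; keep only the version
            # whose left side contains the first value.
            if values[0] not in left:
                continue
            score = split_impurity(sample, set(left))
            if best is None or score < best[0]:
                best = (score, set(left))
    return best


def majority_label(sample, values):
    """Most frequent outcome among rows whose protasis type is in values."""
    labels = [label for value, label in sample if value in values]
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    score, left = best_split(TOY_SAMPLE)
    right = {value for value, _ in TOY_SAMPLE} - left
    print(sorted(left), "->", majority_label(TOY_SAMPLE, left))
    print(sorted(right), "->", majority_label(TOY_SAMPLE, right))
    print("weighted Gini impurity:", round(score, 4))
```

On toy counts of this shape, the lowest-impurity partition groups unmarked with actualized protases, and the majority outcome in that group is the asymmetric pattern. A full CART analysis would apply the same comparison recursively and across all predictors (TAM values, macro-area, clause-linking strategy), which this sketch does not attempt.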
4. General discussion
This paper set out to describe the cross-linguistic diversity of counterfactual conditionals by taking into account three parameters, viz. the
symmetric and asymmetric morphological patterns of counterfactual
conditionals, the range of TAM values that tend to appear in the protasis
and apodosis in counterfactual conditional constructions, and the range
of clause-linking devices used in counterfactual conditionals.
Through two statistical analyses, we find that morphological (a)symmetry
depends most strongly on the properties of the protasis rather than the
apodosis of the counterfactual conditional construction. In particular,
languages with unmarked or actualized protases are almost exclusively asymmetric, whereas languages with mixed or non-actualized protases overwhelmingly prefer symmetric marking.
Another finding is that non-actualized TAM markers are significantly
preferred by the apodosis cross-linguistically, while the unmarked status
is strongly associated with the protasis.
Now that counterfactual conditionals have been explored in a
genetically and areally balanced sample, the next step is to
examine particular large genera for which we could only take into account
one language (e.g. Oceanic). This will enable us to explore internal diversity
and to arrive at more fine-grained typological generalizations.
Abbreviations
1, 2, 3 = first, second, third person; abl = ablative; abs = absolutive; acc =
accusative; act = active; ad = adessive; anim = animate; ant = anterior; aor =
aoristic; art = article; asp = aspect; asr = assertive; assoc = associative; atn =
focus of attention; att = attenuative; attr = attributive; aux = auxiliary; av =
actor voice; bp = body part; caus = causative; cf = counterfactual; cfp = clause-final particle; circ = circumstantial; cnn = connective; com = communal aspect;
comit = comitative; comp = complementizer; compl = completive; cond = conditional; conj = conjunction; conneg = connegative; cons = consequential; cop
= copula; cvb = converb; dat = dative; decl = declarative; def = definite; deic
= deictic; dem = demonstrative; det = determiner; des = desiderative; dir =
directional; dist = distal; distr = distributive; dr = bivalent direct; ds = different
subject; du = dual; dur = durative; dynm = dynamic; emph = emphatic; ep =
epenthetic; erg = ergative; es = echo subject; ev = evidential; event = eventive;
ex = extended; excl = exclusive; f = feminine; fact = factual; fin = finite; foc
= focus; frust = frustrative; fut = future; g = general; gen = genitive; gend =
gender; hab = habitual; hyp = hypothetical; i = agreement prefix of agreement
pattern one; ign = ignorative; imag = imaginative; imp = imperative; imper =
impersonal; imperf = imperfect; inan = inanimate; inch = inchoative; incl =
inclusive; ind = indicative; indf = indefinite; inf = infinitive; infr = inferential;
inh = inherent; ins = instrumental; int = intentional; intr = intransitive; ipd =
impeditive; ipfv = imperfective; irr = irrealis; lim = limiter; lk = linker; loc =
locative; loczr = localizer; m = masculine; mid = middle; min = minimal number; mod = modal; mv = medial verb; nb = notable information; neg = negative;
nmlz = nominalizer; nom = nominative; nppr = personal pronominal; nr = near;
obl = oblique; obj = object; opt = optative; pass = passive; pat = patient; perf
= perfect; pfv = perfective; pl = plural; pol = polarity; poss = possessive; post
= postposition; pot = potential; pres = present; pret = preterite; prog = progressive; prosp = prospective; pst = past; ptcp = participle; punct = punctual;
qual = qualitative predication; rdp = reduplication; real = realis; rec = recent;
ref = referential; refl = reflexive; regr = regressive; reit = reiterative; rel =
relativizer; rem = remote; rep = reportative; res = resultative; rld = realized;
rsg = resigned; sbj = subject; seq = sequential; sg = singular; sim = simultaneous; ss = same subject; sub = subordinator; subj = subjunctive; sv = serial verb;
temp = temporal; term = terminative; them = thematic; top = topic; trans =
transitive; uaugm = unit augmented number; unacc = unaccusative; unspec =
unspecified; val = validational; ver = veridical.
Acknowledgements
Many thanks to Peter Arkadiev, Marianne Mithun, Bernard Comrie, and two anonymous reviewers for their comments. Any errors remain our responsibility.
Notes
1
As correctly pointed out by one reviewer, the notion of state of affairs is a useful concept
in that it is used unambiguously as a hypernym of different classes of predicates such as
situations, actions, events, and processes (see Dik 1997:105).
2
“[A] genus is a [maximal] group of languages whose relatedness is fairly obvious
without systematic comparative analysis” (Dryer 2013, slightly modified).
3
The present study is in line with other typological research that has also adopted
the genealogical and areal stratification proposed by Dryer (2013) without following the
procedure(s) he adopts in Dryer (1989). Some of these typological studies can be found in
Miestamo (2005) and Shagal (2019), to name but a few.
4
The Genus-Macroarea sampling method involves different samples or levels of sampling: the Genus Sample, the Core Sample, the Restricted Sample, and the Extended
Sample. Their selection depends on the type of research question(s).
5
Haspelmath (2010: 664) explains that comparative concepts are concepts created by
comparative linguists for the specific purpose of cross-linguistic comparison. They are based
on universal conceptual-semantic concepts and universal formal concepts. As pointed out
by one reviewer, it should be noted that comparative concepts were developed much earlier, and have been used by typologists for at least the past three decades (e.g. Stassen 1985:
14). This approach implies that any language should have a means to encode particular
conceptual states of affairs, though not necessarily a dedicated or a grammaticalized one.
6
It is important to bear in mind that the literature on complex sentence constructions
is vast. However, only a few studies have provided an explicit definition of what a complex sentence construction is.
7
We refer the interested reader to Van Valin & LaPolla (1997) on subordination, coordination and co-subordination, and Yuasa & Sadock (2002) on pseudo-subordination.
8
In a similar fashion, bound adverbial subordinators also operate in subordinate clauses
with various properties. However, in comparison to free adverbial subordinators, it is not infrequent to observe bound adverbial subordinators operating in clauses with properties similar to
those found in main clauses. In this regard, Hetterle (2015: 108) mentions that fully inflected
verbs are not as rare as one might suspect. In her typological study, she mentions that in 38 of
the 164 constructions with a bound adverbial subordinator, the verb of such a construction is
fully inflected and identical to a main clause verb.
9
The term para-hypotaxis is used by Romance linguists to refer to sentences containing a dependent clause with the main clause introduced by a coordinative conjunction.
According to Bertinetto & Ciucci (2012), this phenomenon was traditionally considered an idiosyncratic feature of Old Romance languages.
10
Note that asymmetric counterfactual conditionals may also include instances in which the
protasis is deprived of TAM marking while the apodosis appears with particular TAM values
(e.g. Warekena, Macushi, Pashto, and Kam, among others; cf. Qian 2016: 158) or instances in
which the protasis appears with particular TAM values and the apodosis is deprived of TAM
marking (e.g. Savosavo and Yimas; cf. Qian 2016: 158).
11
As correctly pointed out by one reviewer, nominalized verb forms must be considered a
subset of actualized and non-actualized patterns because nominalized verb forms can be
used for both actualized and non-actualized states of affairs.
12
The results of the CART analysis were verified by means of a random-forests analysis. We
present the CART results because they offer a clearer perspective on how the different predictors are weighted with respect to their reliability in partitioning the data.
13
Indeed, we attempted logistic regressions with both generalized linear and generalized additive models. While the results are similar to those presented in the CART analysis, the models performed quite poorly.
14
We only provide the absolute values here because we have just two outcomes: negative values indicate association with protasis, while positive values indicate association with apodosis.
Bibliographical References
Arkadiev, Peter 2020. Actionality, aspect, tense, and counterfactuality in Kuban
Kabardian. Studia Orientalia Electronica 8. 5-21.
Bennett, Charles E. 1908. A Latin grammar. Boston / Chicago: Allyn and Bacon.
Bertinetto, Pier Marco & Ciucci, Luca 2012. Parataxis, hypotaxis and para-hypotaxis
in the Zamucoan languages. Linguistic Discovery 10. 89-111.
Bhatt, Rajesh 1998. CF marking in the modern Indo-Aryan languages. Paper presented
at the University of Konstanz.
Black, Cheryl A. 1994. Quiegolani Zapotec syntax. PhD dissertation. University of
California, Santa Cruz.
Blackings, Mairi & Fabb, Nigel 2003. A grammar of Ma’di. Berlin / New York:
Mouton de Gruyter.
Brooks, Joseph 2018. Realis and irrealis: Chini verb morphology, clause chaining, and
discourse. PhD dissertation. University of California, Santa Barbara.
Bugenhagen, Robert D. 1995. A grammar of Mangap-Mbula: An Austronesian language of Papua New Guinea. Notes on Linguistics. Canberra: Research School of
Pacific and Asian Studies, Australian National University.
Chang, Anna Hsiou-chuan 2006. A reference grammar of Paiwan. PhD dissertation.
Australian National University, Canberra.
Comrie, Bernard 1986. Conditionals: A typology. In Traugott, Elizabeth; ter Meulen,
Alice; Reilly, Judy & Ferguson, Charles (eds.), On conditionals. Cambridge:
Cambridge University Press. 77-99.
Craig, Grinevald Colette 1990. A grammar of Rama. Report to National Science
Foundation.
Croft, William 2001. Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Croft, William 2003. Typology and universals, 2nd ed. Cambridge: Cambridge
University Press.
Dik, Simon 1997. The theory of functional grammar. The structure of the clause. Berlin
/ New York: Mouton de Gruyter.
Dryer, Matthew S. 1989. Large linguistic areas and language sampling. Studies in
Language 13. 257-292.
Dryer, Matthew S. 2013. Genealogical language list. In Dryer, Matthew S. &
Haspelmath, Martin (eds.), The world atlas of language structures online. Leipzig:
Max Planck Institute for Evolutionary Anthropology.
Ellis, Nick C. 2006. Language acquisition as rational contingency learning. Applied
Linguistics 27. 1-24.
Foley, William A. 1991. The Yimas language of Papua New Guinea. Stanford: Stanford
University Press.
Fox, Greg J. 1979. Big Nambas grammar. Canberra: Department of Linguistics,
Research School of Pacific Studies, Australian National University.
François, Alexandre 2002. Araki: A disappearing language of Vanuatu. Canberra:
Research School of Pacific and Asian Studies, Australian National University.
Gallagher, Steve & Baehr, Pierce 2005. Bariai grammar sketch. Data Papers on Papua
New Guinea Languages. Ukarumpa: Summer Institute of Linguistics.
Gast, Volker & Diessel, Holger 2012. The typology of clause linkage: Status quo,
challenges, prospects. In Gast, Volker & Diessel, Holger (eds.), Clause linkage in
cross-linguistic perspective: Data-driven approaches to cross-clausal syntax. Berlin /
New York: Mouton de Gruyter. 1-36.
Givón, Talmy 2001. Syntax: An introduction: Volume 2. Amsterdam / Philadelphia:
John Benjamins.
Greenberg, Joseph 1966. Language universals, with special reference to feature hierarchies. The Hague: Mouton.
Gries, Stefan & Stefanowitsch, Anatol 2004. Extending collostructional analysis:
A corpus-based perspective on ‘alternations’. International Journal of Corpus
Linguistics 9. 97-129.
Haiman, John 1980. The iconicity of grammar. Language 56. 515-540.
Haiman, John 1983. Paratactic if-clauses. Journal of Pragmatics 7. 263-281.
Haiman, John & Kuteva, Tania 2002. The symmetry of counterfactuals. In Bybee,
Joan & Noonan, Michael (eds.), Complex sentences in grammar and discourse.
Amsterdam / Philadelphia: John Benjamins. 101-124.
Harvey, Mark 2002. A grammar of Gaagudju. Berlin / New York: Mouton de Gruyter.
Haspelmath, Martin 1995. The converb as a cross-linguistically valid category. In
Haspelmath, Martin & König, Ekkehard (eds.), Converbs in cross-linguistic perspective. Berlin / New York: Mouton de Gruyter. 1-55.
Haspelmath, Martin 2004. Coordinating constructions: An overview. In Haspelmath,
Martin (ed.), Coordinating constructions. Amsterdam / Philadelphia: John
Benjamins. 3-39.
Haspelmath, Martin 2010. Comparative concepts and descriptive categories in crosslinguistic studies. Language 86. 663-687.
Haude, Katharina 2006. A grammar of Movima. PhD dissertation. Radboud
Universiteit, Nijmegen.
Heath, Jeffrey & Hantgan, Abbie 2018. A grammar of Bangime: Language isolate of
Mali. Berlin / Boston: De Gruyter Mouton.
Hetterle, Katja 2015. Adverbial clauses in cross-linguistic perspective. Berlin / Boston:
De Gruyter Mouton.
Hill, Deborah 1992. Longgu grammar. PhD dissertation. Australian National
University, Canberra.
Lynch, John 2000. A grammar of Anejom. Canberra: Research School of Pacific and
Asian Studies, Australian National University.
Nichols, Johanna 2011. Ingush grammar. Berkeley: University of California Press.
Karawani, Hadil 2014. The real, the fake, and the fake fake in counterfactual conditionals, crosslinguistically. Utrecht: Landelijke Onderzoekschool Taalwetenschap,
Netherlands National Graduate School of Linguistics.
Kenesei, István; Vago, Robert M. & Fenyvesi, Anna 1998. Hungarian. London:
Routledge.
Klinken, Catharina Lumien van 1999. A grammar of the Fehan Dialect of Tetun, an
Austronesian language of West Timor. Canberra: Research School of Pacific and
Asian Studies, Australian National University.
Kortmann, Bernd 1997. Adverbial subordination: A typology and history of adverbial subordinators based on European languages. Berlin / New York: Mouton de
Gruyter.
Lastra, Yolanda 1989. Otomí de San Andrés Cuexcontitlán. Mexico: El Colegio de
México.
Lehmann, Christian 1988. Towards a typology of clause linkage. In Haiman, John
& Thompson, Sandra A. (eds.), Clause combining in discourse and grammar.
Amsterdam / Philadelphia: John Benjamins. 181-225.
Levy, Paulette 1990. Totonaco de Papantla, Veracruz. México: Centro de
Investigación para la Integración Social.
Lindstrom, Lamont & Lynch, John 1994. Kwamera. München: Lincom.
Longacre, Robert E. 1985. Sentences as combinations of clauses. In Shopen, Timothy
(ed.), Language typology and syntactic description. Cambridge: Cambridge
University Press. 235-286.
Mauri, Caterina & Sansò, Andrea 2009. Irrealis and clause linkage. Paper presented
at the 8th Biennial Meeting of the Association of Linguistic Typology, Berkeley.
Mauri, Caterina & van der Auwera, Johan 2012. Connectives. In Allan, Keith &
Jaszczolt, Kasia M. (eds.), The Cambridge handbook of pragmatics. Cambridge:
Cambridge University Press. 347-402.
McGregor, William 1990. A functional grammar of Gooniyandi. Amsterdam /
Philadelphia: John Benjamins.
Merlan, Francesca C. 1994. A grammar of Wardaman, a language of the Northern Territory of Australia. Berlin / New York: Mouton de Gruyter.
Miestamo, Matti 2005. Standard negation: The negation of declarative verbal main
clauses in a typological perspective. Berlin / New York: Mouton de Gruyter.
Miestamo, Matti; Bakker, Dik & Arppe, Antti 2016. Sampling for variety. Linguistic
Typology 20. 233-296.
Mithun, Marianne 1995. On the relativity of irreality. In Bybee, Joan & Fleischman,
Suzanne (eds.), Modality in grammar and discourse. Amsterdam / Philadelphia:
John Benjamins. 367-388.
Nicolle, Steve 2017. Introduction to special issue on conditional constructions in
African languages. Studies in African Linguistics 46. 1-15.
Noonan, Michael 1992. A grammar of Lango. Berlin / New York: Mouton de Gruyter.
Olawsky, Knut J. 2006. A grammar of Urarina. Berlin / New York: Mouton de
Gruyter.
Olguín Martínez, Jesús 2016. Adverbial clauses in Veracruz Huasteca Nahuatl from a
functional-typological approach. MA thesis. University of Sonora, Hermosillo.
Olson, Clif 1992. Gumawana (Amphlett Islands, Papua New Guinea): Grammar
sketch and texts. In Ross, Malcolm D. (ed.), Papers in Austronesian Linguistics
2. Canberra: Research School of Pacific and Asian Studies, Australian National
University. 251-430.
Overall, Simon 2017. A grammar of Aguaruna (Iiniá Chicham). Berlin / Boston: De
Gruyter Mouton.
Prince, Kilu von 2015. A grammar of Daakaka. Berlin / Boston: De Gruyter Mouton.
Prince, Kilu von 2019. Counterfactuality and past. Linguistics and Philosophy 42. 577-615.
Qian, Yong 2016. A typology of counterfactual clauses. PhD dissertation. City
University of Hong Kong.
Sanders, Arden & Sanders, Joy 1994. Kamasau (Wand Tuan) grammar: Morpheme to
sentence. Ms.
Seiler, Walter 1985. Imonda, a Papuan language. Canberra: Research School of Pacific
and Asian Studies, Australian National University.
Shagal, Ksenia 2019. Participles: A typological study. Berlin / Boston: De Gruyter
Mouton.
Smyth, H. W. 1920. Greek grammar for colleges. New York: American Book Company.
Stassen, Leon 1985. Comparison and universal grammar. Oxford: Basil Blackwell.
Steele, Susan 1975. Past and irrealis: Just what does it all mean? International
Journal of American Linguistics 41. 200-217.
Tharp, Douglas 1996. Sulka grammar essentials. In Clifton, John M. (ed.), Two
non-Austronesian grammars from the islands. Ukarumpa, Papua New Guinea:
Summer Institute of Linguistics. 77-179.
Traugott, Elizabeth C. 1985. On conditionals. In Haiman, John (ed.), Iconicity in syntax. Amsterdam / Philadelphia: John Benjamins. 289-307.
Valin, Robert Jr. van & LaPolla, Randy 1997. Syntax. Cambridge: Cambridge
University Press.
Van den Berg, Helma 1995. A Grammar of Hunzib (with Texts and Lexicon).
München: Lincom.
Van de Velde, Mark 2008. A grammar of Eton. Berlin / New York: Mouton de
Gruyter.
Verstraete, Jean-Christophe 2014. The role of mood marking in complex sentences:
A case study of Australian languages. Word 57. 195-236.
Vries, Lourens de 2004. A short grammar of Inanwatan: An endangered language of the
Bird’s Head of Papua, Indonesia. Canberra: Research School of Pacific and Asian
Studies, Australian National University.
Vuillermet, Marine 2018. Grammatical fear morphemes in Ese Ejja: Making the
case for a morphosemantic apprehensional domain. In Ponsonnet, Maïa &
Vuillermet, Marine (eds.), Morphology and emotions across the world’s languages.
Special issue of Studies in Language 42. 256-293.
Wierzbicka, Anna 1997. Conditionals and counterfactuals: Conceptual primitives
and linguistic universals. In Athanasiadou, Angeliki & Dirven, René (eds.), On
conditionals again. Amsterdam / Philadelphia: John Benjamins. 15-59.
Wilkins, David P. 1989. Mparntwe Arrernte (Aranda): Studies in the structure and
semantics of grammar. PhD dissertation. Australian National University,
Canberra.
Wulff, Stefanie; Lester, Nicholas & Martinez Garcia, Maria 2014. That-variation in
German and Spanish L2 English. Language and Cognition 6. 271-299.
Xrakovskij, Viktor 2005. Conditional constructions: A theoretical description.
In Xrakovskij, Viktor (ed.), Typology of conditional constructions. München:
Lincom. 3-95.
Yuasa, Etsuyo & Sadock, Jerry M. 2002. Pseudo-subordination: A mismatch between
syntax and semantics. Journal of Linguistics 38. 87-111.