Bruggeman 2020
Lexical and postlexical prominence
in
Tashlhiyt Berber
and Moroccan Arabic
Inaugural dissertation
for the attainment of the doctoral degree
of the Faculty of Arts and Humanities of the University of Cologne
in the subject of Phonetics
submitted by
Anna Bruggeman
from Utrecht, the Netherlands
Bruggeman, Anna. 2020. Lexical and postlexical prominence in Tashlhiyt Berber and
Moroccan Arabic. University of Cologne PhD dissertation.
Abstract
Tashlhiyt Berber (Afro-Asiatic, Berber) and Moroccan Arabic (Afro-Asiatic, Semitic),
two languages spoken in Morocco, have been in contact for over 1200 years. The
influence of Berber languages on the lexicon and the segmental-phonological structure
of Moroccan Arabic is well-documented, whereas possible similarities in the prosodic-
phonological domain have not yet been addressed in detail.
This thesis brings together evidence from production and perception that bears on the
question of whether Tashlhiyt Berber and Moroccan Arabic also exhibit convergence in
the domain of phonological prominence. Experimental results are interpreted as showing
that neither language has lexical prominence asymmetries in the form of lexical
stress. This lack of stress in Moroccan Arabic is unlike the undisputed presence of lex-
ical stress in most other varieties of Arabic, which in turn suggests that this aspect of
the phonology of Moroccan Arabic has resulted from contact with (Tashlhiyt) Berber.
A further, theoretical contribution is made with respect to the possible correspond-
ence between lexical and postlexical prominence structure from a typological point of
view. One of the tenets of the Autosegmental Metrical approach to intonation analysis
holds that prominence-marking intonational events (pitch accents) associate with lexic-
ally stressed syllables. Exactly how prominence marking is achieved in languages that
lack lexical stress is little-understood, and this thesis’ discussion of postlexical promin-
ence in Tashlhiyt Berber and Moroccan Arabic provides new insights that bear on this
topic.
A first set of production experiments investigates, for both languages, if there are
acoustic correlates to what some researchers have considered to be lexically stressed
syllables. It is shown that neither language exhibits consistent acoustic enhancement
of presumed stressed syllables relative to unstressed syllables.
The second set of production experiments reports on the prosodic characteristics of
question word interrogatives in both languages. It is shown that question words are the
locus of postlexical prominence-marking events, which however do not associate
with any sub-lexical phonological unit.
A final perception experiment serves the goal of showing how native speakers of
Tashlhiyt Berber and Moroccan Arabic deal with the encoding of a postlexical prom-
inence contrast that is parasitic on a lexical prominence contrast. This is achieved by
means of a ‘stress deafness’ experiment, the results of which show that speakers of
neither language can reliably encode a lexically-specified prominence difference.
Results from all three types of experiment thus converge in suggesting that lexical
prominence asymmetries are not specified in the phonology of either language.
Contents
Abstract
Acknowledgments
Abbreviations
Part I Introduction
1 General introduction
1.1 Aims and goals of thesis
1.2 The linguistic landscape of Morocco
1.3 Tashlhiyt Berber and Moroccan Arabic
1.4 Overview of thesis
4.4 Discussion
4.5 Summary and conclusion
Appendices
Bibliography
Acknowledgments
First of all I thank the DAAD for their financial support for the past 3+ years.
There are very many people I am indebted to for helping me shape, start and complete
this PhD.
I owe Martine Grice a great deal more than I can express here. As a supervisor,
and as Martine, she supported, questioned and challenged me. She has contributed
enormously to how I have grown academically and personally. She identified and
created learning opportunities at every turn, opened doors of whose existence I was
not aware and made me steer clear of many potential hurdles. I am immensely
grateful for everything that she has done for me on all levels.
Sam Hellmuth has been pivotal in getting me back on track when I needed it the
most, as a supervisor and in all other possible capacities. I would not have managed
if it had not been for her unwavering belief in me, her contagious enthusiasm, and for
helping me see clearly through the forest of (metrical) trees. Not least of all, I owe her
for entrusting me with her Moroccan Arabic data. This thesis would simply not have
had its present form without her.
In Cambridge I owe much to Brechtje Post and Francis Nolan, who were exemplary
teachers and motivated me to take the path into phonetics. Their continued interest
and open doors during my PhD have been invaluable.
I am also grateful to Carlos Gussenhoven, Rachid Ridouane, Maarten Kossmann and
Harry Stroomer for their input, suggestions and interest throughout.
There are a number of other people who have had a profound influence on me aca-
demically. Timo Roettger, from and with whom I have learned a great deal, and with
whom working together on Tashlhiyt was always stimulating. Francesco Cangemi, for
his readiness to share ideas, methodologies and complex thought processes. Stefan Bau-
mann, for always making time when I needed input and for invariably being positive.
Bodo Winter, for demystifying statistics before I even started my PhD. András Bárány
belongs in this list too, for asking both obvious and obscure questions, and for showing
what a great PhD could look like.
Many other people in the last few years have supported me in various ways.
From the IfL Phonetik in Cologne I want to thank Anne, Aviad, Bastian, Christian,
Christine Ri., Christine Rö., Doris, Francesco, Henrik, Jane, Janina, Jessica, Luke, Le-
onie, Martina, Martine, Simon R., Simon W., Stefan, Tabea, Theo, Timo, and Phuong
for everything from a discussion to a laugh, and from a chat (on request even in Ger-
man!) to a dance. Martina, Christine and Tabea specifically for joint Muskelkaters, and
Henrik, Aviad, the Simons, Luke and Francesco for providing daily dosages of enter-
tainment and motivation.
In York a number of inspiring people helped shape the environment that brought me
back on track at the start of my third year, including Sam Hellmuth and Tamar Keren-
Portnoy who believed in me right from the start. I was especially lucky to meet Julia
Kolkmann and Miriam Aguilar: I owe them both for keeping me sane and motivated
at the same time. I also had the pleasure to meet Rana Almbark again, for Arabic
discussion and joint elation about sugar.
Elsewhere I owe my (at times) relatively well-kept sanity to a number of good friends.
Steffie, Maaike, Nicole and Maarten: Thanks for always being up for anything and
providing the nicest kind of changing continuity. Andrea, things have been tumultuous
for us in different ways, but I am glad we shared it. Camilla, Claudia B., Claudia N. and
Laura: Thank you for providing me with shelter and distraction in various countries
whenever I needed it.
A special thanks goes to all my contacts and participants in Morocco, and especially
to Abderrahme Charki, Sanae Oubraim and Nabila Louriz. Thank you for welcoming
me into your homes and lives. I simply could not have done this PhD without you.
Additionally, learning from you (and with you) about your languages and culture has
been one of the most fun and rewarding aspects of my PhD.
To my family in the Netherlands: Thank you very much for your support and rock-solid
confidence in me.
Finally, thanking András here does not do justice to the role he played in the process
that led to this thesis – he lit my way.
Abbreviations
1 first person
2 second person
3 third person
acc accusative
aor aorist
cl clitic
comp complementizer
dat dative
dist distal
fut future
f feminine
imp imperative
ipfv imperfective
m masculine
neg negative
pfv perfective
pl plural
poss possessive
q interrogative particle
rel relative
sg singular
AP Accentual Phrase
IP Intonational Phrase
MA Moroccan Arabic
PP Phonological Phrase
ST semitones
TB Tashlhiyt Berber
Part I
Introduction
1 General introduction
1.1 Aims and goals of thesis
This thesis investigates and compares aspects of lexical and postlexical prominence
structure in two Afro-Asiatic languages of Morocco, Tashlhiyt Berber (TB) and Mo-
roccan Arabic (MA). The main goal is to find out how prominence structure in these
languages should be characterised.
In doing so, two secondary goals can be identified. Firstly, this thesis serves as a
detailed comparison of aspects of phonological prominence structure in two distantly
related languages on different branches within the same language family. These lan-
guages have nevertheless been in contact for around 1200 years and are known to
exhibit convergence in many aspects of linguistic structure. One of the additional
goals therefore involves identifying similarities in these languages in the prosodic-
phonological domain. Secondly, the present thesis will provide a contribution to the
theoretical discussion about the mapping between lexical and postlexical prominence
structure. In brief, prominence is conceived of as a phonological phenomenon with
an abstract representation in the grammar, rather than a more surface-oriented un-
derstanding of prominence as acoustic-perceptual salience. A detailed discussion and a
precise definition of ‘prominence’ as used in this thesis will be given in the next chapter.
At the heart of the phonological definition of prominence is the question what struc-
tural linguistic elements are specified as such. This thesis will contribute insights into
the specification of prominence at the lexical level, and will discuss how the realisation
and distribution of postlexical prominences relates to these lexical prominence specific-
ations.
Our current understanding of postlexical prominence structure in languages that lack
lexical prominence is limited, mainly because there are only a few languages that are
convincingly argued to lack it. The best-known languages lacking any kind of lexical
prominence are French and Korean, which are often considered to lack ‘stress’ (note
that they also lack lexical tone and lexical pitch accent). Terminology is crucial here:
Plenty of languages are considered to lack ‘stress’, such as Tokyo Japanese, but such
languages may still have another type of lexical prominence specification (in the case
of Japanese this is lexical pitch accent). Again, a detailed discussion of the terminology
can be found in Chapter 2.
The two languages discussed in this thesis, TB and MA, form particularly interest-
ing case studies for claims about prominence at both lexical and postlexical levels of
phonological structure.
At the time of writing there appears to be consensus on the lack of lexical prominence
in Tashlhiyt Berber (Stumme 1899; Dell & Elmedlaoui 2002; Kossmann 2012; Ridouane
2014; Roettger, Bruggeman & Grice 2015), although it has previously also been claimed
that lexical stress is present (e.g. Sadiqi 1997; Gordon & Nafi 2012; Laoust 2012). For
Moroccan Arabic, the details of lexical prominence are a matter of disagreement and
remain as yet unresolved (e.g. Mitchell 1993; Benkirane 1998; Boudlal 2001; Watson
2011).1
While it is generally acknowledged that some languages lack lexical prominence struc-
ture, the question as to whether all languages have phonological prominence specific-
ations at the postlexical level has been addressed to a lesser extent. At this point lan-
guages do seem to exist in which postlexical structural prominence does not exist or
is unspecified, notably Ambonese Malay (Maskikit-Essed & Gussenhoven 2016) and
Korean (Jun 1993, 2005a). Most of the literature however focuses on the apparent ma-
jority of languages that exhibit structural prominence at both levels. In languages with
lexical stress, these levels are characterised by a clear correspondence, where postlex-
ical prominence in the form of pitch accents co-occurs with lexically stressed syllables.
In light of this, acknowledging the existence of languages lacking either or both levels
of structural prominence is important, not only for purposes of prosodic typology, but
also for our understanding of intonational structure in general, as most models have
attempted to link lexical prominence structure to postlexical prominence. For present
purposes, it should be noted that not much is known about the postlexical prominence
structure in TB and MA, beyond general observations that suggest it is rather variable
(Mitchell 1993 for MA, Dell & Elmedlaoui 2008 for TB). The investigation of the prin-
ciples guiding the placement of postlexical prominence in these languages is therefore
interesting in its own right already, but even more so by virtue of its being tradition-
ally linked to lexical stress. In this thesis, therefore, the theoretical implications of a
potential absence of a correspondence between the two levels of prominence structure
will also be addressed.
The general claim that I will make in this thesis, based on the results from five exper-
iments, is that lexical prominence asymmetries (in the form of lexical stress) are absent
in both languages. In the context of the aforementioned goals (describing lexical and
postlexical structure as well as any possible correspondences), the specific contributions
of the different experimental chapters can be categorised as follows:
1
As this thesis is officially published in 2020, it should be clarified that Chapter 3 is a more elaborate
analysis of the same experiment that has previously been published as Roettger, Bruggeman & Grice
(2015) and Roettger (2017). Similarly, the claim that stress is absent in Moroccan Arabic has in the
meantime been published in Bruggeman et al. (in press) (based on Chapter 4).
1.2 The linguistic landscape of Morocco
The specific questions and theoretical background motivating each of the experi-
ments will be discussed in more detail in the next chapter (Chapter 2: Approaches
to lexical and postlexical prominence).
(2004: Ch. 1) and Versteegh (2014). For specific reference to the Moroccan context see
Maas & Procházka (2012b: Sec. 3), Mitchell (1993) and El Aissati (2005).
In the following section, I will review claims about convergence between Moroccan
Arabic and Tashlhiyt Berber in light of the wider context of variation in Arabic and
Berber.
2
For the uninitiated reader, it is not always clear which variety of Berber is being discussed in a given
source due to confusing terminology. Tashlhiyt is known by various alternative spellings and names,
including Tachelhit(e), Shilha, Soussiya, or Tasoussit. The term Tamazight, in turn, has multiple uses,
as it may refer to the variety of Berber spoken in the Atlas as well as to the standardised version
of Berber taught in schools. Tamazight might also be used synonymously with Berber and denote the
language family in general.
3
Maas & Procházka (2012b: 330) go as far as to suggest that different varieties of Berber within a village
are spoken by individual families.
1.3 Tashlhiyt Berber and Moroccan Arabic
There has been a long-standing debate about how these sequences should be best rep-
resented, with notable disagreement about the status of the central vocoids that are
observed to break up long clusters, as either having phonological status or being phon-
etic transitions.
More recently, experimental work on Tashlhiyt intonation has been conducted, focus-
ing on yes–no question intonation and phrase-final focus (Grice, Ridouane & Roettger
2015; Roettger 2017). Some more impressionistic reference to topic and focus struc-
tures in Tashlhiyt can be found in Lafkioui (2010). This thesis adds to existing work on
Tashlhiyt intonation by discussing intonation in wh-questions, hitherto undiscussed, in
Chapter 6.
While there is thus some past work on Berber, existing research on Arabic has vast
dimensions. (Classical) Arabic has a long research tradition, but this has been supplemented
over the last century or two by work following from specific interest in the variability
between synchronic, spoken varieties. The following resources, and references
therein, form a good starting point into the literature: Bateson (1967), Holes (2004),
Owens (2013) and Versteegh (2014). Contemporary general descriptions of Moroccan
Arabic can be found in Harrell (1962), Maas (n.d.) and Maas & Procházka (2012b).
Aspects of MA phonology are addressed in Mitchell (1993), Heath (1997) and Dell
& Elmedlaoui (2002). Comparatively recently, some work has been done on intonation,
with a general description in Benkirane (1998), and experimental studies by Yeou
(2005), Yeou et al. (2007), Yeou, Embarki & Al-Maqtari (2007), Burdin et al. (2015)
and Hellmuth et al. (2015). See Chapter 7 for a more detailed discussion of the existing
literature on MA intonation.
It has been suggested that there is no common standard Moroccan Arabic yet, at
least not in terms of phonology, although if anything there is a trend towards a local
standard variety in the metropolitan areas of Casablanca and Rabat (Dell & Elmedlaoui
2002: 239). Nevertheless, it is well known that in general, the segmental phonology of
MA has been heavily influenced by contact with Berber (Mitchell 1993; Heath 1997;
Maas & Procházka 2012b; Maas n.d., see also Zellou 2010). In fact, MA and (Tashlhiyt)
Berber have been said to exhibit similar “surface phonologies” (Dell & Elmedlaoui 2002:
227). For example, the vowel inventory of most contemporary Arabic varieties typically
consists of the phonological vowels /i, a, u/ with contrastive use of length (Watson
2011), whereas MA lacks a phonological vowel length distinction (see also Chapter
4). Similarly, MA displays many complex consonantal clusters that are not observed in
other varieties of Arabic. Maas & Procházka (2012b) discuss several more phonological
properties that are shared between MA and Berber, including prosodic aspects such as
what they call “accent” (lexical stress).
Especially this reference to similarities in lexical prominence structure is of interest to
this thesis. As previously mentioned, the existence of lexical prominence specifications
in the form of stress has been denied for Tashlhiyt, and is subject to debate for Moroccan
Arabic. Additionally, the location of postlexical prominence in both languages is highly
elusive, along the lines of the following statement about Moroccan Arabic: “Prominence
among syllables in MA words is at present imponderable and seemingly lacks any close
2 Approaches to lexical and postlexical prominence
2.1 Introduction
This chapter serves to lay out the theoretical groundwork relating to prominence, lexical
and postlexical phonological structure, and the assumptions that underlie the questions
asked in this thesis.
Section 2.2 will begin with a few basic definitions, namely those of lexical and postlex-
ical prominence, as used throughout the rest of this thesis. In subsequent sections, the
literature on each of these will be discussed in further detail. Section 2.3, entitled ‘Lex-
ical prominence: Word stress’, will look at phonological prominence at the word level
and will review uses of the term ‘stress’. The next section, 2.4, entitled ‘Postlexical
prominence and intonation’ continues the discussion of prominence asymmetries at a
higher level of linguistic structure (i.e. above the word). This section will involve an
overview of the place of postlexical prominence in various models used for the analysis
of intonation. It includes an overview of what is currently the most standard theory
used in the modelling of intonational structure: Autosegmental Metrical (AM) phonology.
In Section 2.5, I will bring together the discussions of lexical and postlexical prominence
and highlight several topics related to their interaction that are especially relevant
to this thesis.
Section 2.6 serves to link the discussion of the current linguistic landscape of Morocco
in the previous chapter with the theoretical issues highlighted in the present one. This
section will motivate the overarching research questions of this thesis as well as the
individual research questions for each experimental chapter. I will also briefly sketch
how hypothesised findings may fit in with various typological approaches. This topic
is taken up again in Chapter 9.
2.3 Lexical prominence: Word stress
lexical stress (or to none of the words if the language in question lacks stress). Finally,
the position of stress in a given word is fixed and invariant, something which would
be reflected in the dictionary entry of a word (cf. Abercrombie 1976 in van der Hulst
2014a).
So far, the present definition of stress would cover both lexical stress and lexical
pitch accent systems.1 In this context, a relevant distinction is often made between
‘stress–accent’ and ‘non-stress–accent’ (cf. Hyman 1977; Beckman 1986). This distinc-
tion refers to the observation that lexical stress asymmetries result from acoustic en-
hancement in terms of multiple phonetic parameters (‘stress–accent’), whereas lexical
pitch accent asymmetries result from pitch only.2 Based on the available evidence
about correlates of stress, however, there seem to be languages that have stress but
nevertheless exhibit little or no acoustic enhancement of stressed syllables (see Section
2.3.3). It is desirable therefore to define lexical stress without reference to the acoustic
properties that may mark it, and instead define it to the exclusion of lexical prominence
asymmetries that solely involve lexical pitch accent. This way, languages like Tokyo Ja-
panese can be considered to have lexical prominence in the form of pitch accent but not
stress, and languages like Swedish and Norwegian have lexical prominence asymmet-
ries in the form of stress as well as in the form of pitch accent. In essence, then, lexical
stress is taken to refer to the property of lexically specified prominence asymmetries
that does not exclusively involve the lexical marking of pitch.
The present view of stress is similar to the definitions found in Ladd (2008: 50f.):
“an abstract phonological property of a syllable within a prosodic structure” and in
Hyman (2014: 56): “the phonological marking of one most prominent position in a
word” (note that Hyman uses the term ‘accent’ to denote this property).3 This definition
is also compatible with the one given on WALS (Goedemans & van der Hulst 2013).
The present definition however contrasts with two other commonly used definitions.
The first one is the understanding of stress in the literature on Metrical Stress Theory, as
in Hayes (1995: 8): “stress is the linguistic manifestation of rhythmic structure”. This
definition of stress overlaps to a large extent with the present one in the sense that it usu-
ally identifies the same syllable as stressed, and uses some of the same diagnostics (see
Section 2.3.2). The present definition crucially differs from it in renouncing the idea
1
In fact, evidence from stress deafness experiments (e.g. Rahmani, Rietveld & Gussenhoven 2015)
suggests that it is indeed appropriate to group these systems together for the purpose of lexical–
phonological representation.
2
Even many of those who use the term ‘stress–accent’ would not go as far as to suggest that the only
way of identifying stress is through its acoustic enhancement. However, since acoustic enhancement
is so often misanalysed, and there are plenty of alternative ways to identify stress, it seems safer not
to use it as part of a definition. If one does want to define a class of languages based on the presence
of ‘stress–accent’, as in Jun (2005b: 440): “A language is categorized to have a ‘stress–accent’ feature
if a certain syllable in a word is more prominent than other syllables by duration and/or amplitude”,
several languages that are normally thought to have stress but do not exhibit such enhancement would
have to be reconsidered.
3
The main reason why I use the term ‘lexical stress’ rather than ‘lexical accent’ is to avoid confusion
with the term ‘accent’ as used in intonational research, where it denotes concrete, measurable, pitch
protrusions.
4
It is an independent question if these authors believe in the existence of a more abstract level of lexical
prominence; Abercrombie for example did, and called it ‘accent’.
5
It can often be deduced that stress position is determined based on perceived prominence (by non-native
speaker linguists). This seems to be the case even in Hayes (1995), which despite featuring a list of
possible diagnostics to stress, cites many sources for which the process involved in identifying stress is
unclear.
4. A further diagnostic for stress is one that considers native speaker judgments. This
might involve metalinguistic questions about what is considered the most prominent
position in the word, or non-linguistic exercises involving tapping on the
‘beat’ in parallel to a stretch of speech. Interestingly, this diagnostic is sometimes
considered problematic or its results are simply ignored on the grounds that native
speakers disagree on the position of prominence or do not know what is meant
by it (see e.g. the discussion of native speaker judgments in Moroccan Arabic in
Chapter 4, and discussions of stress in Indonesian, cf. Goedemans & Van Zanten
2007; Maskikit-Essed & Gussenhoven 2016). In fact, it can be argued that the
very existence of non-converging or difficult-to-elicit native speaker judgments is
highly informative, since such findings suggest either that lexically-determined
prominence asymmetries do not exist in the language, or if they do, that they
represent a rather different phenomenon from what are comparatively straightforward
types of stress characterised by more consistent judgments (as in Germanic).
6
In fact, some recent studies still base the identification of stress solely on perceived prominence by
non-native speaker linguists, including Zuraw, Yu & Orfitelli (2014). While it is true that the coincidence
of pitch accents with stressed syllables is robust in some languages, and therefore that perceived
prominence on a specific syllable might be a reflection of stress in that position, it is by no means clear that
this is a reliable diagnostic crosslinguistically.
5. Finally, I would like to suggest that there is a diagnostic which may serve to
clarify whether a language has lexically specified prominence asymmetries in the
first place. It is based on the results of two decades’ worth of ‘stress deafness’
experiments, culminating in findings by Rahmani, Rietveld & Gussenhoven (2015)
and Hellmuth, Muradás-Taylor & Karrinton (to appear). These stress deafness
experiments have yielded the insight that a specific kind of memory task (a so-
called Sequence Recall Task, or SRT) reliably distinguishes between participants
who are native speakers of a language with lexical prominence (showing good
performance) and participants who are native speakers of languages that lack
lexical prominence (exhibiting relatively poor performance).
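The logic of this fifth diagnostic can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the function name `recall_accuracy`, the invented responses, and the scoring rule (a trial counts as correct only if the whole sequence is reproduced exactly), which is not necessarily the scoring used in the studies cited above.

```python
from statistics import mean


def recall_accuracy(trials):
    """Proportion of trials whose recalled sequence exactly matches the presented one.

    Each trial is a (presented, recalled) pair of strings. 'A' and 'B' stand for
    two hypothetical nonwords that differ only in the position of prominence.
    """
    return mean(1.0 if presented == recalled else 0.0
                for presented, recalled in trials)


# Invented responses for two hypothetical participants (not real data).
participants = {
    "speaker_1": [("ABBA", "ABBA"), ("BAAB", "BAAB"), ("ABAB", "ABAB"), ("BBAA", "BABA")],
    "speaker_2": [("ABBA", "ABAB"), ("BAAB", "BAAB"), ("ABAB", "BAAB"), ("BBAA", "BABA")],
}

scores = {name: recall_accuracy(trials) for name, trials in participants.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

On this toy data the first participant recalls most sequences correctly and the second does not. In an actual experiment, conclusions would be drawn from comparisons between groups of speakers rather than from individual scores of this kind.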
In the above, I showed that there are at least four diagnostics (1. to 4.) that may
serve to identify stressed syllables (or moras), and one additional, general diagnostic
(5.) for the very existence of lexical prominence asymmetries in a language. At this
point it is not clear to what extent any of these diagnostics are necessary or sufficient
criteria on their own. It is clear, however, that the more diagnostics converge, the
stronger any claims about stress are, keeping in mind that not all languages exhibit all
of the above diagnostics. There are languages that are considered to have lexical stress
but in which postlexical pitch accents apparently do not dock on stressed syllables (including Kuot,
Wolof and Chickasaw, see Section 2.5 for more detail). There are languages that exhibit
little if any acoustic enhancement of stressed syllables, like Hungarian (Varga 2002;
Szalontai et al. 2016, and see also Figure 2.1 in Section 2.3.3). Such languages might
still be characterised by consistent native speaker intuitions about stress and the co-
occurrence of pitch accents with stressed syllables. There are also languages that exhibit
most or all of these five diagnostics, such as Germanic languages. In these languages,
stressed syllables can receive postlexical pitch accents, but even in the absence of pitch
accents, stressed syllables are acoustically enhanced in terms of duration. Stressed
syllables are also those syllables that display the full set of contrastive vowels in the
language, as opposed to unstressed syllables. Finally, speakers of such languages agree
on what stressed syllables are, and they do not exhibit stress deafness on SRTs (Rahmani,
Rietveld & Gussenhoven 2015).
The observation that stress might be identified by possibly any combination of the
aforementioned diagnostics makes it difficult to convincingly argue that a given
language lacks lexical stress. Firstly, negative results for any single one of the diagnostics
cannot be taken as conclusive evidence that a given language lacks lexical prominence
asymmetries. Secondly, the discussion of negative results brings me to the inherent
problem of trying to prove a null hypothesis: if a diagnostic or test is negative, it is
not logically possible to conclude that the answer to the question asked with that
diagnostic or test is indeed negative. Not being able to answer with ‘yes’ might simply
mean one has looked in the wrong way; negative results can always be due to flawed design. The
solution for the purposes of this thesis (where I will argue in favour of the absence of
lexical stress in languages) is the following: if multiple diagnostics converge, in the
sense that not one is able to provide evidence supporting the existence of lexical stress,
it should be possible to conclude the opposite: that stress does not exist. Or: “If […] a
language makes it so hard to find the stress, one naturally has to ask whether stress is
phonologically activated at all” (Hyman 2014: 78).
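The reasoning in this paragraph can be summarised as a simple decision rule, sketched below in Python. The names `Outcome` and `assess_stress`, the diagnostic labels, and the threshold `min_tested=2` (one possible reading of ‘multiple diagnostics’) are assumptions introduced purely for illustration. The sketch is not a procedure applied in this thesis.

```python
from enum import Enum


class Outcome(Enum):
    POSITIVE = "evidence for lexical stress"
    NEGATIVE = "no evidence for lexical stress"
    UNTESTED = "diagnostic not applied"


def assess_stress(results, min_tested=2):
    """Summarise diagnostic outcomes following the convergence argument.

    A single negative result is never treated as conclusive. Only when several
    diagnostics were applied and none of them is positive is the absence of
    stress taken to be a defensible conclusion.
    """
    tested = [o for o in results.values() if o is not Outcome.UNTESTED]
    if any(o is Outcome.POSITIVE for o in tested):
        return "stress supported"
    if len(tested) >= min_tested:
        return "absence of stress argued from convergence"
    return "inconclusive"


# Hypothetical outcomes, labelled after diagnostics mentioned in the text.
example = {
    "acoustic_enhancement": Outcome.NEGATIVE,
    "native_speaker_judgments": Outcome.NEGATIVE,
    "sequence_recall_task": Outcome.NEGATIVE,
}
verdict = assess_stress(example)
single = assess_stress({"acoustic_enhancement": Outcome.NEGATIVE})
print(verdict)
print(single)
```

The rule deliberately returns ‘inconclusive’ for a single negative result, which mirrors the first point made above. The threshold itself is arbitrary and would need to be justified for any real application.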
consideration in this context is that not all enhanced F0 should be interpreted as pitch
accentuation or prominence marking, since enhanced F0 might result from a host of
factors, many of which have little to do with lexical stress. The best-known
alternative cause of pitch prominence is phrasing-related pitch movement. This
is especially relevant for languages that exhibit no acoustic correlates of stress (or other
diagnostics for that matter) but do exhibit F0 movements on most lexical words, such
as Korean. Normally considered to lack lexical stress, Korean has been misconstrued
as having lexical stress on the basis of F0 movement alone (cf. Jun 2005a). The same
will be shown to have happened for both Tashlhiyt Berber in Chapter 3 and Moroccan
Arabic in Chapter 4.
Some correlates of stress are generally held to be robust, or reliable. In a review of
more than 100 studies investigating correlates of stress in over 75 languages, Gordon &
Roettger (2017) show that duration is reported to be a common cue to stress, followed,
in order of importance, by F0, intensity and spectral measurements. Extending these
findings to all known languages or interpreting them as standardly held assumptions
(which to some extent they already are) about the crosslinguistic manifestation of correlates
of stress is dangerous. Quite apart from the aforementioned inherent issues
with F0 and intensity, several other things should be kept in mind. Firstly, the meta-study
reviews published works, and thus possibly reflects a bias towards publishing significant
results (indeed, the majority of studies found at least some effect). Secondly,
some further doubt is cast on the reliability of the original results by the authors’ dis-
cussion of the methodologies employed in data elicitation. For example, around half
of the studies reviewed were not clear on the exact context in which target words were
placed. Thirdly, many studies exhibited experimental flaws, ranging from very small
numbers of speakers to employing elicitation contexts in which postlexical prominence
as opposed to lexical prominence was investigated. In sum, while the results of the
individual studies are highly valuable in their own right, these observations cast doubt
not only on the reliability of some of the results on their own (as reflecting lexical
stress proper rather than enhancement due to postlexical prominence) but also on the
extendability of some correlates of stress as crosslinguistically common or reliable.
At this point, it can only safely be said that stress in a given language may but does
not have to be signalled by acoustic enhancement of the relevant syllable or mora. In
addition, it is by no means clear that when stressed positions are acoustically enhanced,
some correlates are generally preferred over others.
In order to illustrate different ways in which stress may or may not be cued acous-
tically, consider Dutch and Hungarian. Dutch is a typical Germanic language in the
sense that it has variable stress location and displays most of the known diagnostics of
stress, including rather robust acoustic correlates: Stressed syllables are longer, have
more peripheral vowel quality and are spectrally enhanced in the absence of postlex-
ical prominence (Sluijter & van Heuven 1996; Rietveld, Kerkhoff & Gussenhoven 2004).
Hungarian is also uncontroversially considered to have lexical stress, which is fixed in
word-initial position. Native speakers agree that stress is word-initial, and word-initial
syllables consistently attract pitch accents in sentence context. Its acoustic correlates
2.3 Lexical prominence: Word stress
in the absence of postlexical prominence, however, are minimal: Stressed syllables are
not longer and they do not have different vowel quality. Szalontai et al. (2016) show
that only intensity is a reliable marker of word-initial syllables, which supports earlier
observations that this might in fact be the only acoustic correlate of stress in Hungarian
(Varga 2002). Based on the previous discussion, it remains an open question whether
this is a perceptually retrievable correlate of stress.
Figure 2.1: Dutch and Hungarian words (Agatha and fekete, highlighted) embedded in
sentence context. Target words carry postlexical prominence in the form of
a pitch accent in the top panels. In the bottom panels target words do not
carry postlexical prominence. Recordings are the author’s own.
Figure 2.1 illustrates the acoustic realisation of a word in each language: [aːˈxaːtaː]
‘(the proper name) Agatha’ for Dutch and [ˈfɛkɛtɛ] ‘black’ for Hungarian.8 The top
panel gives a sentence in which the target word is focused and receives a postlexical
pitch accent that associates with its stressed syllable (in Dutch on the second syllable
ˈxaː and in Hungarian on the first syllable ˈfɛ). The bottom panel gives a context in
which the target words do not carry postlexical prominence. For Hungarian it is clear
that there is no durational asymmetry between the initial stressed syllable and the
other ones in the same word. These are the same phonological vowels and therefore
directly comparable. For Dutch, the vowels are phonologically comparable too, but
phonetically the non-stressed vowels are much reduced (the first vowel, for example,
could be transcribed as [ɐ] in a narrow phonetic transcription). In addition, the stressed
vowel is longer even in the absence of pitch prominence. Finally, especially in the case
of Dutch with segmentally identical sentences, it can be seen that the highlighted target
words in the top panels are longer overall than the target words in the bottom panels.
This shows that the simple presence of a pitch accent also results in lengthening (of
the entire word, although it will usually target the stressed syllable disproportionately).
Pitch-accent related lengthening in part explains the ubiquitous finding in the literature
that stress results in lengthening, at least when ‘stress’ is investigated by means of words
8
It is difficult to find Dutch native words that have full vowels in all syllables, hence the use of a proper
name.
2.4 Postlexical prominence and intonation
The question as to how prosody and intonation in general, and postlexical prom-
inence in particular, can be described categorically will be discussed in the follow-
ing. The current standard model of intonational structure is couched in Autosegmental
Metrical (AM) phonology (for AM see Goldsmith 1976; for AM analysis of intonation
see Pierrehumbert 1980; Beckman & Pierrehumbert 1986; Pierrehumbert & Beckman
1988; Gussenhoven 2004; Ladd 2008). In Section 2.4.2 I will first briefly review inter-
pretations of prominence in pre- and non-AM phonological approaches to intonational
analysis. In Section 2.4.3 I will then discuss the most important aspects of AM-style
analyses of intonation and prominence.
At this point I should clarify that I will only be concerned with a subset of models
that deal with the phonological representation of intonation, to the exclusion of models
that are primarily phonetic, for example those with the aim of speech synthesis, including
Paint-E (e.g. Möhler 1998), the Fujisaki model (e.g. Fujisaki & Hirose 1984) and
PENTA (Xu 2004a,b), the discussion of which is beyond the scope of this thesis. A more
comprehensive overview can be found in, for example, Reichel 2010.
prominence structure map onto each other as well as onto syntactic structure. On the
other hand, the tradition of Prosodic Phonology (Selkirk 1986; Nespor & Vogel 2007)
is more generally concerned with phonological structure and its interaction with other
domains of linguistic structure, including syntax and morphology. This approach ad-
dresses structural prominence relations as one of several factors involved in determining
the grouping of phonological units. As in Metrical Phonology, prominence is conceived
of as involving the alternation of (perceived) strong and weak elements.
The conception of prominence in approaches that aim to describe intonational phono-
logy is somewhat different. In contrast to both aforementioned phonological traditions,
the Autosegmental-Metrical (AM) approach to intonation is less concerned with the in-
teraction between various domains of linguistics. This approach is also less concerned
with degrees of perceived prominence (although perception has always continued to
play a role) and instead focuses on the phonetic realisation and phonological-categorical
classification of different F0 events.
Before moving on to a more detailed discussion of AM analysis of intonation, two
further approaches that aim(ed) to develop analyses of intonation for its own sake
should be mentioned. These are the British School and IPO (Instituut voor Perceptie-
Onderzoek) approaches.
The British School described phrasal intonation by appealing to the succession of
phonological units, ‘tone units’ (equivalent to Intonation Phrases, or IPs, in AM ana-
lyses), that are defined based on their forming a coherent unit of linguistic meaning
(Crystal 1972; O’Connor & Arnold 1973; Tench 1996). Each of these tone units has
a single locus of main prominence (‘nucleus’) which is associated with the stressed
syllable of the most perceptually prominent word. The distinctive pitch of the nuc-
leus takes the form of a glide, or occasionally a level. Postlexical prominence in this
tradition thus refers to the further enhancement of an already structurally prominent
position through a pitch glide or obtrusion.
The IPO or ‘Dutch School’ approach to intonation, like the British School, considers
‘rises and falls’ to be the basic primitives needed for a model of intonation (Cohen &
’t Hart 1968; ’t Hart & Cohen 1973; ’t Hart & Collier 1975; ’t Hart, Collier & Cohen
1990). Its researchers aimed to provide a rule-system that yields all possible Dutch in-
tonation contours, by specifying the combinations in which basic primitives may occur,
and by restricting how these combinations (‘configurations’) can combine into phrasal
intonation contours. They did not specify exactly how these intonation contours are
mapped onto text, although they do note the consistent co-occurrence of specific config-
urations with specific structural positions such as stressed syllables. The IPO approach
distinguishes between pitch and prominence (the latter conceived of as a purely percep-
tual phenomenon), considering only certain pitch events to have a prominence-lending
function.
The idea of glides or configurations as intonational primitives (be it the contours
from the British School or the contour primitives from the IPO) has since lost ground
to the idea that intonation can be analysed crosslinguistically with a relatively small
set of (level) tonal targets as primitives. This is the Autosegmental-Metrical approach
10
While AM analyses of intonation are primarily concerned with the structure of the F0 contour, it is widely
accepted that other prosodic properties are relevant too. Notational systems have been developed in
which degrees of perceived prominence and various types of perceived juncture can all be transcribed
(cf. Beckman, Hirschberg & Shattuck-Hufnagel 2005).
AM phonology. Several traditions have made the observation that aspects of speech
melody can be characterised as either prominence-marking (sometimes explicitly ac-
knowledged to be linked to co-occurrence with stressed syllables) or edge marking:
The main point of interest here is how AM phonology deals with the distinction
between intonational events that mark prominence and those that do not. In the context
of this distinction two further notions need defining: Alignment and association.
Alignment “refers to the temporal implementation of fundamental frequency (F0)
movements with respect to the segmental string” (Prieto 2011: 1185). This general
definition of alignment is related to the notion of ‘segmental anchoring’, an observation
about alignment patterns in a number of languages (originally termed the ‘Segmental
Anchoring Hypothesis’; Arvaniti, Ladd & Mennen 1998; Ladd, Mennen & Schepman
2000). Specifically, segmental anchoring refers to a situation in which the points mark-
ing the start and end point of a pitch movement are both found to temporally occur at
locations that can be defined with reference to some segmental landmark (Ladd 2006).
Over the years the term has also been used to describe situations in which individual
tonal targets (i.e. local high and low turning points) are consistently found in some spe-
cific segmental location. Whereas segmental anchoring refers to a type of consistent
alignment behaviour, the term alignment alone refers simply to the phonetic-temporal
location of an intonational event and does not presuppose consistency.
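The notion of alignment as a purely phonetic-temporal measure can be made concrete with a minimal sketch. The code below is an illustration only and is not part of the analyses reported in this thesis: the function names, the `Syllable` type and all numerical values are hypothetical, and taking the F0 maximum as the tonal target is a simplifying assumption. It expresses the location of an F0 peak both as a lag from a syllable onset and as a proportion of syllable duration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Syllable:
    """Hypothetical segmental landmark: syllable onset and offset times (s)."""
    onset: float
    offset: float


def f0_peak_time(times: Sequence[float], f0: Sequence[float]) -> float:
    """Return the time of the F0 maximum (a simplistic turning-point finder)."""
    if len(times) != len(f0) or not times:
        raise ValueError("times and f0 must be non-empty and of equal length")
    peak_index = max(range(len(f0)), key=lambda i: f0[i])
    return times[peak_index]


def alignment(peak_time: float, syllable: Syllable) -> Tuple[float, float]:
    """Alignment of a tonal target relative to a syllable.

    Returns (lag in s from syllable onset, lag as proportion of duration).
    A proportion above 1.0 means the peak falls after the syllable offset.
    """
    duration = syllable.offset - syllable.onset
    if duration <= 0:
        raise ValueError("syllable must have positive duration")
    lag = peak_time - syllable.onset
    return lag, lag / duration


# Invented example values, for illustration only.
times = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35]
f0 = [180.0, 195.0, 210.0, 230.0, 215.0, 200.0]
syll = Syllable(onset=0.10, offset=0.30)

lag_s, lag_prop = alignment(f0_peak_time(times, f0), syll)
print(f"peak lag: {lag_s:.2f} s, {lag_prop:.2f} of syllable duration")
```

A single value of this kind describes alignment only. Segmental anchoring would correspond to such values remaining stable across many tokens and contexts, which this sketch does not test.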
Association, on the other hand, refers to the phonological interpretation of F0 move-
ments as belonging with specific structural phonological positions, which may be do-
main edges, or Tone Bearing Units (TBUs) such as syllables or moras. In many lan-
guages, positions of metrical strength (stressed syllables) are common TBUs.
Both notions, alignment and association, are crucial to an understanding of the role of
prominence within intonational analysis, because whether some intonational primitive
(in the form of a tonal target) is considered to contribute to prominence depends in part
on its alignment and its interpreted association.
In AM phonology, those pitch movements that occur in the vicinity of stressed syl-
lables are considered prominence-marking and are called pitch accents, while move-
ments that are found elsewhere are considered to be edge-marking (edge or boundary
tones): “More or less by definition, a tone that seeks to associate with a lexically stressed
syllable is a pitch accent” (Ladd 2008: 145). The distinction between pitch accents and
boundary tones is in the first instance based on differing phonetic alignment, but it is to a
large extent backed by perception, since pitch excursions near stressed syllables result
of which an accent does not end up on the stressed syllable of each smaller constituent.
This idea of focus projection forms one of the reasons why there is no direct mapping
between semantic-pragmatic prominence and accentuation in most if not all languages.
Thus far, research has shown that in appealing to the combination of information
value paired with language-specific accent assignment rules, several types of crosslin-
guistic patterns of pitch prominence distribution can be accounted for, and I will use
the term prominence-marking (as opposed to prominence-lending) to reflect the relevant
property of pitch accents.
Finally, it should be noted that the Focus-to-Accent view might not be able to account
for every detail of the distribution of accentual prominence in a given language. It
seems possible that in some cases, pitch accents may occur on a constituent for reasons
that have nothing to do with information value (projected or not), with notions like
‘ornamental accents’ (Büring 2007), as well as the phrase accents previously discussed
perhaps being of some explanatory value.
• “[o]ne possible kind of phonemic stress is potential for pitch accent” (Bolinger
1958: 149)
• “Lexical stress provides the designated terminal elements for the assignment of
intonational tones (‘pitch accents’)” (Hyman 2014: 58)
While none of the above sources would claim that every stressed syllable will always
be pitch accented, the correspondence can account for the ‘conceptual merger’ (Gussen-
hoven 2015) as a result of which the F0 correlates of pitch accent are attributed to
lexical stress (see also Section 2.3.3).
In general, however, it is well known that the correspondence between lexical stress
and postlexical pitch accent is not one-to-one: Languages differ in the percentage of
lexically stressed syllables that actually receive a pitch accent, from around 30% in
German or English to virtually all lexical stresses (i.e. words) in Egyptian Arabic (Hellmuth
2006). Even varieties within a single language can differ considerably, from 17%
of stressed syllables accented in Southern European Portuguese, to 74% in Northern
European Portuguese, to almost 100% in Brazilian Portuguese (Frota & Vigário 2000;
Vigário & Frota 2003).
Additionally, there are specific contexts in which the correspondence between lexical
stress and postlexical pitch accent is not entirely clear. In the following, I present a
number of difficulties relating to this correspondence. The first three points have to do
with interpretative difficulties, while the last two points form (possible) exceptions to
the overall rule that postlexical prominence maps onto lexical prominence.
A main problem is that some prominence-marking intonational events are analysed
as being associated with a stressed syllable, while the phonetic alignment of the relev-
ant turning points in the contour is not synchronised in time with that specific stressed
syllable. This observation has caused considerable debate in the case of bitonal pitch
accents, in which one of the tonal targets is usually considered to be the more essential
11
Lindström & Remijsen (2005) even call it a “linguistic universal”.
2.5 Interaction between lexical and postlexical prominence
one, indicated by a following star ‘*’ and interpreted in terms of greater strength. If
neither of the tonal targets can be defined easily with reference to the phonological
unit it is thought to belong with, such as the edges of a stressed syllable, it is not clear
which of the tonal targets should be starred (detailed discussions and specific experi-
ments addressing this question can be found in Arvaniti, Ladd & Mennen 2000; Beck-
man, Hirschberg & Shattuck-Hufnagel 2005; Dilley, Ladd & Schepman 2005; Prieto,
D’Imperio & Fivela 2005; D’Imperio 2006; Prieto 2011).
Secondly, despite the existence of segmental anchoring (in some cases), pitch accents
differ in their phonetic realisation depending on context, making the uniform analysis
of prominence-marking pitch events difficult to sustain even within a single language.
For instance, the same pitch accent category will be aligned differently in different
words, depending on factors such as the segmental structure of the syllable with which
it associates, the relative position of that stressed syllable within the word, and the
length of the phrase or utterance as a whole. Even when the same pitch accent is asso-
ciated with the same syllable in a word, say L*+H with the first syllable in the English
word wonderful, it will align somewhat differently depending on the phrasal position in
which the word occurs, with earlier alignment of the turning points when the word is in
phrase-final position (as in That’s wonderful! as opposed to That’s a wonderful drawing!).
Another factor influencing realisational details is articulation rate, resulting in earlier
alignment in fast speech. There is therefore no one-to-one mapping between the phon-
etic alignment parameters of an intonational event and its phonological interpretation
as pitch accent type A or B. This holds very similarly for edge-marking intonational
events.
A third problem is the discussion surrounding phrase accents, a notion invoked by
Grice, Ladd & Arvaniti (2000) (and cf. Ladd 2008: Ch. 4) to describe the variable
realisation of the right edge marking of yes–no questions in a number of languages (in-
cluding Hungarian and Greek). Phrase accents designate the intonational events that
occur in a phrasal position after the nuclear accent and before the right-edge bound-
ary of the relevant domain. They are not typical pitch accents because in many cases
they do not surface on stressed syllables, but rather occur wherever there is space to
be realised, often adjacent to an edge tone and thus forming a complex boundary tone
sequence. They do not function like typical edge tones either since, when the length
of the metrical-segmental structure allows for it, they do seek out a prominent pos-
ition such as a primary or secondary stressed syllable. Together, these observations
suggest that intonational events exist that are neither solely prominence-lending nor
solely boundary marking, and instead may be both at the same time.
The fourth point, which might be one of the exceptions to the general rule of post-
lexical-lexical prominence correspondence, is so-called ‘stress shift’. Among Hayes’s
(1995) diagnostics for metrical stress is the ‘Rhythm Rule’, which applies in lexical compounds
with adjacent stress positions. At least in English, individual words can exhibit
a prominence shift when they occur in a clash context of this kind (compare ‘BAMboo
CHAIR’ with ‘bamBOO’ with final prominence in isolation). For the purposes of the
present definition of stress (stress being an invariant property of a lexical word), this
phenomenon should be called ‘prominence shift’ rather than ‘stress shift’, since the in-
herent relative prominence of the final syllable of bamboo cannot change by definition.
The shift can instead be interpreted as only concerning a particular surface realisation
of the word. The phenomenon indeed seems to involve a shift of the postlexical prom-
inence, which implies that the shift involves the perception of a pitch accent occurring
on a non-primary (but often secondary) stressed syllable (cf. Shattuck-Hufnagel 1994;
Gussenhoven 2004).
The fifth issue concerns another possible exception, and relates to the suggestion that
there are languages which, despite having lexical stress, in fact have pitch accents that
associate with structural positions other than these stressed syllables. More specifically,
this has been argued to be the case in Kuot (Lindström & Remijsen 2005) and several
native languages of North America (as reviewed in Gordon 2014).12 In the first place,
these statements cause terminological confusion since the standard definition of a pitch
accent presupposes that it is an intonational event that seeks out a stressed syllable.
Setting the definition aside, in none of these cases do there seem to be clear criteria or
robust acoustic evidence in favour of claims about prominence, be it lexical or postlex-
ical. While experimental in nature, the study on Kuot is based on only two speakers
and stress was identified by a non-native speaker linguist. Similar issues apply to the
claims about the Northern Iroquoian language Onondaga (Gordon 2014). Here, the
argument appeals to a different position for ‘pitch accent’ depending on the position of
the word in the phrase: If the word is final, the accent would target the penultimate
syllable, whereas if the word is phrase-medial, the accent would target the final syl-
lable, suggesting that pitch accents do not go to predetermined stressed syllables. Note
however that the term pitch accent in this context is somewhat underdefined, with the
only indication that it involves ‘raised F0’ (Gordon 2014: 89). Such evidence hardly
makes a convincing case for an interpretation along the lines of a prominence-marking
pitch accent, as it is well known that not all F0 protrusions serve structural prominence
marking. Moreover, the issues with interpreting stress location based on perceived pitch
prominence are well known (see Section 2.3) and it should be noted that the original
descriptions of (pitch) accent in the relevant North American languages (e.g. Chafe
1970; Foster 1982) date from before the widespread use of experimental methods with
which claims about perceived prominence can be cross-checked. Finally, the exact same
prominence pattern, involving varying final or penultimate lexical prominence depend-
ing on position of the word in the phrase, is found in at least two other cases in the
literature: Halim (1974) as cited in Maskikit-Essed & Gussenhoven (2016) on Indone-
sian, and Boudlal (2001) describing lexical stress assignment in Moroccan Arabic. For
the former, there is now a consensus that (varieties of) Indonesian lack lexical stress,
suggesting that it was not lexical stress that the original claim referred to. With respect
to the latter, as I will discuss in Chapter 4, this particular interpretation of stress is
12
Both these sources cite Rialland & Robert (2001) as another study which makes similar claims, but the
main argument of that study is that the language has stress but no postlexical prominence, which is a
different claim from the one that stress and postlexical prominence both exist, but do not necessarily
coincide.
13
West Greenlandic, as one of the aforementioned stressless languages, is left out of this discussion because
little reference can be found to its prominence structure, e.g. Arnhold (2014).
event occurring in this position has some link to structural prominence). A potentially
better characterisation would be that of a pitch accent with phrasal-metrical associ-
ation, or ‘phrasal pitch accent’/‘postlexical pitch accent’, which would contrast with
a ‘standard’ pitch accent which seeks lexical-metrical association. A similar analysis
might be appropriate for Mongolian. The existence of prominence asymmetries at the
postlexical level in Mongolian is less clear than in French, but there is some evidence
that the first syllable of an AP is privileged in terms of being the only position that
exhibits a paradigmatic vowel contrast (Karlsson & Svantesson 2004). While the tonal
movement marking the same position (a rising gesture at the left edge of each AP) is
not by definition considered to contribute to prominence, focus may be achieved by
enhancing the initial rising gesture, suggesting that the AP-initial position is indeed a
position of metrical prominence (cf. Karlsson 2014).
On the other hand there are languages like Korean and Ambonese Malay. For these
languages there is no evidence to support the existence even of postlexical promin-
ence. Apparently, in these languages, pitch is used neither in the sense of associating
to phrasal culminative positions (as in French), nor in the sense that edge-aligned pitch
movement can be interpreted as having a prominence-marking function.14 These languages
certainly have intonation, but it can be analysed exclusively in terms of edge-marking
tones without reference to functional prominence-marking.
As an aside, while it seems possible for languages without stress to also lack postlex-
ical prominence, it is unclear at this point whether languages with stress can lack
postlexical prominence. To my knowledge, this has only been suggested to be the
case for Wolof (Rialland & Robert 2001).
The case for Korean as one such language is made by Sun-Ah Jun. Jun (2005a) re-
views a number of studies concerned with the perception of prominence in Korean,
but all these studies involved non-native speakers who seemed to equate high F0 with
prominence. While surface differences in F0 clearly exist, these have nothing to do
with prominence. Firstly, there does not seem to be a role for structural prominence
asymmetries between syllables within an AP, as there is in French (Jun 2014a). Secondly, information
structure is signalled by means of phrasing, not accentuation of (some part of)
a specific constituent (Jun & Oh 1996; Jun 2014a, although a different, impressionistic,
view can be found in Choe 1995).
In addition to Korean, Ambonese Malay (Maskikit-Essed & Gussenhoven 2016) has
also been argued to be a language of the type that lacks prosodic postlexical prominence
altogether.15 Based on elicited sentences with phrase-final words occurring in different
information structural contexts, the authors conclude that the language simply does
not have postlexical prominence marking: Contrastively focused words in this position
were not realised differently from non-focused words.
14
A third, but novel, logical option, namely that prominence marking of a constituent may occur somewhat
more freely on the constituent, i.e. without the constraints of metrical anchoring, will be argued to be
the case for certain intonational patterns in TB and MA (see Chapters 6 and 7).
15
A difference between the languages concerns the interpretation of edge tones. In Korean, edge tones
seek association to a specific TBU, whereas Ambonese Malay edge tones do not. This difference has
little bearing on the interpretation of postlexical prominence.
2.6 Research questions
Table 2.1: Proposed typology of languages as a function of lexical and postlexical prom-
inence structure.
The idea that languages can be classified based on their prominence structural prop-
erties at lexical and postlexical levels is taken up again in the discussion in Chapter
9, with added insights from findings with respect to Tashlhiyt Berber and Moroccan
Arabic.
of this kind are considered to reveal ‘stress deafness’, and have been shown to
provide a reliable insight into the presence of lexical prominence asymmetries in
the lexical phonology of the native language (Chapter 8).
In addition to an answer to the main research question and the subquestions, a further
contribution is made with respect to language contact. The present experiments allow
for the near-direct comparison of results from Tashlhiyt Berber and Moroccan Arabic
in the case of the production experiments, and for a direct comparison in the case of
the perception experiment. These comparisons have the potential to shed light on the
degree of convergence between the two languages in terms of prosodic-phonological
aspects of linguistic structure.
2.7 Summary
This chapter has given the theoretical background for the five experiments that will be
reported on in the next chapters. The following key definitions were given:
• Lexical prominence is the phonological property of one syllable (or mora) within
a word that marks it as prominent in relation to the others. Stress is a type of
lexical prominence asymmetry that does not (exclusively) involve lexical pitch
accent.
This chapter’s discussion of lexical and postlexical prominence has moreover high-
lighted the complex interdependence between lexical and postlexical prominence spe-
cifications. On the one hand, many languages whose intonation has been analysed
within the AM phonological framework are characterised by a highly reliable corres-
pondence in the location of postlexical and lexical prominence. In contrast, in lan-
guages in which the question of lexical–postlexical prominence correspondence does
not arise, due to the non-existence of lexical stress, the mechanisms accounting for the
distribution of postlexical prominence are as yet poorly understood.
Part II
3 Acoustic correlates of word-level stress
in Tashlhiyt Berber
3.1 Introduction
3.1.1 Prior work on the lexical phonology of Tashlhiyt Berber
3.1.1.1 Stress
Prior to 2015, the question whether lexical stress is present in Tashlhiyt Berber had
been addressed in passing in a few grammatical descriptions of the language, and it
had also been addressed in one experiment.1
At the turn of the 20th century, Stumme (1899) observed that lexical prominence
(‘Wortaccent’) is highly variable in sentence context, and with this observation appears
to have been the first to propose that Tashlhiyt lacks lexical stress in the sense of lexical
prominence specified for each individual word. Later, similar observations were made
by Applegate (1958), following a failure to provide a uniform characterisation of word-
level and phrase-level prominence patterns, and by Dell & Elmedlaoui (2002), who
considered prominence asymmetries to exist only at a phrasal level rather than at the
lexical level.
In contrast to the above, the one experimental study that addresses questions about
the existence of stress in TB claims that it is in fact present (Gordon & Nafi 2012). The
design of the materials in this study however leaves some room for (re)interpretation of
the results. Half of the speech material analysed consisted of individual words produced
in isolation, which means these words would have been subject to phrase-level prosody.
Firstly, this phrase-level prosody could have manifested itself in terms of the presence
of phrase-level prominence (such as the nuclear pitch accent which by default occurs on
the final content word in Germanic languages). Secondly, the presence of phrase-level
prosody on the target words will most likely have resulted in the presence of phrase-
level edge marking in the form of continuation intonation. This is a crosslinguistically
common form of intonation used for items produced in lists and, judging from the pitch
tracks provided by Gordon & Nafi (2012), it was present in their data. In sum, these words were subject to the entire
range of intonational marking also found on larger phrases, in the present case however
condensed onto a single word. Unsurprisingly then, the authors found considerable
enhancement of word-final syllables: These had longer duration, greater intensity and
higher pitch than their non-final counterparts. While some of these are common, or
1
The data reported on in this chapter have previously been published, in slightly different format, in
Roettger, Bruggeman & Grice (2015) and in Roettger (2017: Ch. 4).
word. If, on the other hand, results provide a negative answer to this question, this
finding would be in line with the assumption that there is no lexical stress in Tashlhiyt
(or at least no stress-by-position).
3.1.3 Data
The present chapter reports on an experiment that was designed to test acoustic correl-
ates of stress in Tashlhiyt Berber, and specifically to compare the results with findings
by Gordon & Nafi (2012). Recordings were made by the author in November 2014 at
the Université Ibn Zohr in Agadir.
3.2 Methodology
3.2.1 Participants
The speakers in this experiment were 10 native speakers of Tashlhiyt, all students in
the Département des études amazighes at the Université Ibn Zohr in Agadir. All parti-
cipants were multilingual and spoke fluent Moroccan Arabic as a second language (see
also Section 1.2). Most additionally spoke some French. One speaker’s recording was
excluded from further analysis due to poor recording quality. In the results section,
individual speakers are referred to by the numbers 1–9 followed by the letter “M” for
male speakers, and “F” for female speakers.2
3.2.2 Procedure
Participants were given oral instructions about the task by a native speaker. They
were then seated in front of a laptop screen in slide presentation mode from which
they read out scripted mock dialogues that included the target sentence stimuli (see
details below). Dialogues with target items were interspersed with dialogues with filler
items. The experiment was part of a larger session which lasted 40–45
minutes. Participants did the present experiment reading out dialogues twice, with the
full set of dialogues occurring in two random orders, and performed other tasks in the
2
These codes match the ones used in Roettger (2017: Ch. 4).
interim. The experiment was self-paced and the two repetitions took 15–20 minutes
to complete. Recordings were made in a university office using a PreSonus Audiobox
solid-state recorder at a sampling rate of 44.1 kHz, and a head-mounted AKG C420 III
microphone.
Target words were placed in a carrier sentence which was in turn embedded within
a scripted mock dialogue consisting of three sentences. Sentences were presented on a
laptop screen in speech bubbles coming from smiley faces to represent the two imagin-
ary discourse participants.3
The context of the dialogue served to ensure that the target word in its carrier sen-
tence (1) was i) given and not marked by intonational prominence, because it was
explicitly mentioned in the previous sentence, ii) not marked by any other intonational
events, as it occurred in a yes-no question in which the edge-marking intonational
event occurs on the phrase-final two syllables (Grice, Ridouane & Roettger 2015), and
iii) non-final in the IP so as to avoid phrase-final lengthening effects.
Sentences were presented in Latin script, which is the most common way to write
Tashlhiyt. One mock dialogue with a phonological transcription is given in (1) with the
target word /dari/ ‘with me’:
3
I am grateful to Carlos Gussenhoven for sharing this methodology.
Figure 3.1: Example spectrogram and F0 contour (smoothed) for context sentence 2 and
target sentence (spoken by speaker 7F), with target word dari highlighted.
Target sentence
inːa [dari] ʁakudan
say.aor [dari] then
‘He said [dari] then?’
An example spectrogram and waveform are given in Figure 3.1, showing context
sentence 2 and the target sentence. As can be seen, the target word dari in the target
sentence occurs prior to the main pitch event in the sentence (the rise-fall at the right
edge of the phrase), and does not appear to be the locus of some phrasal or edge-marking
pitch event itself. This means that the target words occurred in the intended context to
examine correlates of lexical stress proper.
3.2.4 Analysis
The acoustic parameters investigated were duration, intensity and F0, matching the set
of correlates reported in Gordon & Nafi (2012). Annotation was performed manually.
Target sentences were segmented into words, and target words were further annotated
for the syllable boundary and onset and nucleus of both syllables. This was unproblem-
atic since all words had either the structure /CV.CV/ or /CVC.CVC/.
The theoretical number of target items was N=396 (9 speakers * 2 syllable positions
* 11 target syllables * 2 repetitions). Of these, N=288 were targetlike in the sense of
the correct target word being produced in a fluent sentence. Out of these, N=279 were
produced without any pausing, and the analysis will use this pause-less set of tokens.
F0 measurements are based on a hand-corrected version of the fundamental frequency
contour provided by the standard pitch-tracking algorithm in Praat (Boersma & Ween-
ink 2015). Manual correction was limited to the correction of pitch-tracking errors,
such as octave jumps and the tracking of pitch in cases of phonetically voiceless segments.
Two F0 measurements were taken: the peak in the vowel and the mean throughout the vowel.
For duration, measurements of target syllables and vowels were taken. Vowel dura-
tion was determined as the period of time following the initial consonant with strong
periodic energy across the second and third formants. Intervocalic /r/s were mostly
realised as either a trill or a tap; the onset of the /r/ was therefore determined as the start
of the (first) closure. Intensity was measured in terms of mean energy throughout the
vowel.
To control for variation between speakers, and to facilitate visual comparison, all
measurements were z-scored. These data are shown in the speaker-specific graphs.
The statistical models use the raw measurements, and account for speaker differences
with random intercepts and slopes.
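The per-speaker z-scoring mentioned above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function name, the toy values and the choice of the sample standard deviation are all assumptions, since the text does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev


def zscore_by_speaker(rows):
    """Return (speaker, z) pairs, standardising values within each speaker."""
    rows = list(rows)
    by_speaker = defaultdict(list)
    for speaker, value in rows:
        by_speaker[speaker].append(value)
    # Assumption: sample standard deviation (n - 1); the text does not say which.
    stats = {s: (mean(v), stdev(v)) for s, v in by_speaker.items()}
    return [(s, (x - stats[s][0]) / stats[s][1]) for s, x in rows]


# Hypothetical vowel durations in ms for two speakers (not measured data).
toy = [("1M", 80.0), ("1M", 100.0), ("1M", 120.0),
       ("7F", 60.0), ("7F", 70.0), ("7F", 80.0)]
z = zscore_by_speaker(toy)
print(z)
```

After standardisation each speaker's values are centred on zero with unit spread, so that differences in overall speech rate or loudness between speakers do not dominate the speaker-specific graphs.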
Statistical analysis was performed with linear mixed-effects regression models with
the package lme4 (Bates et al. 2015) in R (R Core Team 2016). Separate models were run
for each of the acoustic parameters under investigation, with the relevant parameter
as the dependent variable and syllable position as the fixed effect. To allow for potentially
varying interactions with the fixed effect, random intercepts and random slopes for items
(syllables) and speakers were included. Statistical significance was calculated by means of likelihood ratio tests
(LRTs) comparing hypothesised models with the corresponding null model that lacked
the relevant fixed effect or interaction term. The R syntax of the models that are used
is given in the footnotes.
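For readers less familiar with likelihood ratio tests, the computation behind a reported χ2 and p-value can be sketched as below. This is only an illustration under stated assumptions: the function names and the log-likelihood values are hypothetical, and the closed-form p-value holds only for one degree of freedom, i.e. when the full and null models differ by a single term.

```python
import math


def lrt_chi2(loglik_null: float, loglik_full: float) -> float:
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_null)


def chi2_sf_df1(x: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom.

    A chi-square(1) variable is a squared standard normal, so
    P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical log-likelihoods of a null and a full model (not from the thesis).
stat = lrt_chi2(loglik_null=-1200.0, loglik_full=-1198.0)
print(stat, chi2_sf_df1(stat))
```

The same function can be used to check that a reported statistic and p-value are mutually consistent, for instance that a χ2(1) of 2.45 corresponds to a p-value of about 0.12.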
3.3 Results
3.3.1 Duration
Figure 3.2 shows the raw durational measurements for the vowels in each syllable.
There are differences between syllables in whether the final or the penultimate position
results in longer duration: For e.g. ba and kaw the final vowel is longer, whereas for
e.g. kif and tam the penultimate vowel is somewhat longer. This indicates that there is
no systematic positional enhancement of syllables: Either the final or the penultimate
vowels would have to be consistently longer for this to be true.
Statistically, none of the durational differences are significant. Vowel duration was
not different as a function of whether it occurred in the final or penultimate syllable
(LRT: χ2 (1)=2.45, p=0.12). The same holds for syllable duration (LRT: χ2 (1)=0.04,
p=0.85).4
4
syllable/vowel duration ∼ syllable position + (0 + syllable position | speaker) + (0 + syllable position | syllable)
+ (1 | syllable) + (1 | speaker)
Figure 3.4: Intensity (dB) throughout the vowel for each syllable as a function of pos-
ition (penultimate/final in the word). Lines link productions by the same
speaker, large dots represent mean.
3.3.2 Intensity
Figure 3.4 shows the raw intensity measurements for the individual syllables. There are
few apparent differences between syllables in whether the vowel is relatively enhanced
in the final or the penultimate position. There is however a small statistical difference
in the sense that final syllables have somewhat lower vowel intensity overall than pen-
ultimate syllables (LRT: χ2 (1)=4.4, p=0.04).5 The estimated difference is 0.56 dB,
which arguably constitutes a marginal difference that lies well below the threshold for
5
vowel intensity ∼syllable position + (0+syllable position|speaker) + (0+syllable position|syllable)
+ (1|syllable) + (1|speaker)
just noticeable differences in intensity of around 1 dB (cf. Lehiste 1970; Beckman 1986
and see also discussion of the relevance of intensity as a cue to stress in Section 2.3.3).
Individual speakers also do not make a consistent distinction between syllables in
penultimate and final position in terms of intensity. As shown in Figure 3.5, there is no
systematic trend here: It is not the case that all final syllables exhibit lower intensity.
3.3.3 F0
As previously mentioned, F0 mean and peak measurements throughout the vowel were
taken. Figure 3.6 shows the mean F0, in semitones relative to 100 Hertz, for penul-
timate and final syllable nuclei. There are no large differences between mean F0 as
a function of syllable position, but the overall data nevertheless suggest a consistent
small difference between final and penultimate syllables. This is corroborated by a
statistical difference for both the mean F0 measurement (LRT: χ2 (1)=10.0, p<0.01),
and the peak F0 measurement (LRT: χ2 (1)=6.49, p<0.05).6 The predicted mean dif-
ference was 0.83 ST, and the peak difference 0.55 ST, with word-final nuclei having
the lower F0 values. I will return to this finding below.
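As a reminder of what the semitone scale relative to 100 Hz implies, the conversion can be written out as below. The function names are illustrative only; the formula itself is the standard one (12 semitones per doubling of frequency).

```python
import math


def hz_to_semitones(f_hz: float, ref_hz: float = 100.0) -> float:
    """Convert a frequency in Hz to semitones relative to a reference."""
    return 12.0 * math.log2(f_hz / ref_hz)


def semitone_ratio(st: float) -> float:
    """Frequency ratio that corresponds to an interval of `st` semitones."""
    return 2.0 ** (st / 12.0)


print(hz_to_semitones(200.0))   # one octave above the 100 Hz reference
print(semitone_ratio(0.83))     # ratio implied by the predicted mean difference
print(semitone_ratio(0.55))     # ratio implied by the predicted peak difference
```

On this scale a difference of 0.83 ST amounts to a frequency ratio of roughly 1.05, that is, about 5 Hz for a voice at around 100 Hz.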
Figure 3.6: F0 (ST) mean throughout vowel for each syllable as a function of position
(penultimate/final in the word). Lines link productions by the same speaker,
large dots represent mean.
reliable phonetic exponent of lexical stress. Firstly, in terms of perception, the more
liberal estimates for just noticeable pitch movements in dynamic speech stimuli start at
1 ST (for example, exceptionally good listeners in ’t Hart 1981). Most of the literature
however has suggested that larger pitch movements are required in order for listeners
to reliably identify differences (’t Hart 1976; d’Alessandro & Mertens 1995). Secondly,
stress is typically realised as acoustic enhancement. There is no definitive reason why
low pitch (rather than high pitch) should not be considered enhancement, but given
the known status of high pitch as the prototypical form of enhancement, it would at
the very least be unusual (or perhaps it should be taken to point to the interpretation
that the initial syllables are the enhanced ones). A third observation, which provides
the most likely explanation of the observed differences, concerns the possibility that
lower F0 in final syllables results from declination throughout the phrase: The lower
F0 values are found on the word-final syllable which coincides with a later position
in the phrase. Figure 3.8 shows all F0 contours for the target sentence that contains
the word yanyan ‘one by one’ in which both syllables can be compared directly. These
contours are characterised by a gradually declining slope throughout the utterance, at
least until the starting point of the final rise(-fall) that is typical of yes-no question
modality in TB (Grice, Ridouane & Roettger 2015).
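One way in which such a declining slope could be quantified is a least-squares line fitted to the portion of a time-normalised contour that precedes the final rise. The sketch below is not part of the analysis reported here; the function name and the contour values are hypothetical and serve only to illustrate the idea.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares regression line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical pre-rise contour: normalised time (0 to 1) and F0 in semitones.
times = [0.0, 0.25, 0.5, 0.75, 1.0]
f0_st = [8.0, 7.5, 7.0, 6.5, 6.0]
slope = ols_slope(times, f0_st)
print(slope)  # negative values indicate declination
```

Under declination of this kind, any syllable that occurs later in the phrase is expected to have somewhat lower F0 than an earlier one, independently of stress.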
These observations together suggest that the small F0 difference does not reflect en-
hancement of the penultimate syllable as a function of fixed stress in that position.
Rather, it forms a predictable phonetic correlate of postlexical (phrasal) structure, in
this case reflecting general facts of speech production (declination throughout a phrase
is often thought to result from a gradual reduction of pulmonic effort throughout an
utterance, cf. Ladefoged 1972; Ladd 2008).
Figure 3.8: Time-normalised contours for all speakers’ target sentence inːa janjan ʁakudan
‘he said [one by one] then’.
3.4 Discussion
Among the acoustic correlates discussed in this chapter (duration, intensity and F0),
there were no reliable differences between presumed stressed (final) and unstressed
(initial) syllables. The differences that did exist, i.e. slightly enhanced F0 and intensity
for initial syllables, are not compatible with the results and interpretation of final stress
given by Gordon & Nafi (2012). The fact that the enhancement involved both F0 and
intensity, but not duration, as well as the fact that the differences were very slight,
form indications that the relevant differences are unlikely to form exponents of lexical
stress. Firstly, it is well known that both intensity and F0 tend to gradually decline over
the course of an utterance. The fact that higher values of each are seen in word-initial
syllables, which occur in an earlier phrasal position, could therefore be interpreted as
following from this tendency. Secondly, global intensity measures like the present one
(mean across vowel) are highly correlated with F0, meaning that the effect of intensity
might simply follow from the higher F0 values in the same position. This would then
require an explanation for the F0 difference only, as for example above.
A few more observations can be made to discount ‘stress’ as causing a reliable posi-
tional distinction, zooming in on syllables and speakers separately. Not one of the 11
syllables showed consistent differentiation in terms of a combined effect of the three
possible correlates (the closest to a possible exception is the syllable kaw, which is
durationally differentiated — but the other parameters still do not conspire). For all
syllables, lines in Figures 3.2, 3.4 and 3.6 typically cross each other or stay level: For no
individual syllable was there a consistent, directional effect. Similarly, not one of the
9 speakers exhibited consistent differentiation between ‘stressed’ and ‘unstressed’ (final
and initial) syllables, for any of the syllable types. Rephrased: Not one speaker exhib-
ited behaviour that caused all lines (in the speaker-specific figures) to go in the same
direction. In sum, the degree of overlap in the present data distribution prevents the
conclusion that stress, here in the form of a specific lexical position, exhibits reliable
acoustic enhancement.7
Finally, the possibility should be raised that perhaps the present study failed to find
correlates of stress because stress in Tashlhiyt is not assigned by position, but rather
is assigned by weight, or is variable in position (rule-based and/or lexically specified).
While these scenarios of different stress assignment cannot be dismissed, simply be-
cause they were never explicitly tested, there is currently no reason to believe they are
very likely. On the one hand, specific stress predictions other than the one of fixed final
stress tested in Gordon & Nafi (2012) have not been suggested in the literature (and
Gordon & Nafi (2012) also did not set out with an explicit hypothesis about the position
of word stress). On the other hand, the possibility of the absence of lexical stress in TB,
as suggested in some literature (see 3.1.1.1), is compatible with evidence from inton-
ation suggesting that there is no fixed lexical position to which prominence-marking
intonational events associate (Grice, Ridouane & Roettger 2015). It seems more fruit-
ful therefore to try to find further support for the absence of stress from other angles
(see also Section 2.3.2). This is the purpose of some of the following chapters: Chapter
4 (which tests for stress in MA), Chapter 6 (which looks at the behaviour of another
type of intonational event in TB) and Chapter 8 (investigating perceptual clues to the
existence of lexical stress).
7
Concluding the opposite, namely that stress is absent because there is no consistent enhancement, would
be unwarranted. The impossibility of proving the null hypothesis is also addressed in Section 2.3.2.
4 Acoustic correlates of word-level stress
in Moroccan Arabic
4.1 Introduction
4.1.1 Prior work on the lexical phonology of Moroccan Arabic
4.1.1.1 Stress
The question whether Moroccan Arabic has lexical stress is subject to a long-standing
debate and is not currently resolved.1 Maas (2013), for example, reviews more than
10 sources published between 1894 and 2008 that all differ to some extent in their
views on the existence of word stress in MA. Complicating matters more, the different
authors he cites use varying terminology, including ‘Wortakzent’, ‘Akzent’ and ‘Accent’
(by German authors; Stumme & Socin 1894; Brockelmann 1908; Fischer 1917), ‘accent’
and ‘accent de mot’ (in French works; Cantineau 1960; Benhallam 1989), ‘accento
tonico’ (in Italian; Durand 1994) and finally, ‘stress’ (in English; Aguadé 2008). Some
of these terms may in fact be interpreted as referring to postlexical pitch prominence
(rather than lexical stress as defined here, see Chapter 2). Others do seem to refer to
inherent word-level prominence, despite what the choice of terminology suggests. In
sum, it is not entirely clear what is meant by these terms other than that they refer to
some sort of word prosody, which could be construed as either lexical or postlexical
prominence. In reviewing the evidence in detail, however, Maas (2013) argues that
the various positions can be allocated to two main groups: One group posits that MA
has word stress, the other that MA lacks word stress. It is this latter viewpoint which
is assumed by the majority of the authors he reviews.
The first group, the advocates of word stress, propose a range of stress rules and
generalisations (for a more comprehensive overview see Boudlal 2001). For example,
according to Benkirane (1998: 348f.), stress falls on the final syllable if it is heavy (i.e. a
closed syllable such as CVC) and on the penult otherwise. A similar view is found in
Nejmi (1993) as cited in Boudlal (2001). Others believe in a fixed position for stress,
such as simply ‘final prominence’ (Watson 2011: 7), or a number of authors cited by
Maas (2013) who predominantly posit penultimate stress. Yet another possibility is
rather more variable stress that may target an initial CV syllable in a trisyllabic word
(Benhallam 1990, as cited in Boudlal 2001).
Finally, a slightly more complicated picture is sketched by Boudlal (2001: 99) who
posits that “the location of stress depends on whether or not the items considered occur
1
This chapter has been published, with some adaptations, as Bruggeman et al. (in press).
in isolation or in context”. Accordingly, stress would be final when words are produced
in context, but can be captured by Benkirane’s generalisation for words produced in
isolation. I will return to some specific issues relating to this idea in the discussion in
Section 4.4. Assuming that the position of word stress is a fixed property, Boudlal’s
(2001) observation about varying stress positions for individual words forms a strong
indication that he was talking about postlexical prominence (since this can be variable
under the present definition). This interpretation is supported by some of the pitch
contours shown in his work. In any case, if word stress is considered an invariable
lexically specified property, the observation that lexical prominence is truly variable
would imply the very lack of word stress.
The incongruent analyses of word stress are matched by equally incongruous judgments
on the position of word stress by native speakers. A number of studies have
investigated stress in terms of where the perceptual prominence of a word lies. These
typically involve a few dozen participants underlining the stressed syllable in written
words presented in a list, and/or listening to isolated words (see Boudlal 2001: 101f.
for an overview). Most of these, including Boudlal’s (2001) own underlining test, find
that speakers disagree with each other on the location of stress in the same word. This
disagreement might in part be due to the fact that written Arabic will automatically
invoke Modern Standard Arabic (even if a task explicitly asks for stress judgments on
Moroccan Arabic), which has stress assignment that differs from (the many proposals
for) Moroccan Arabic. Nevertheless, disagreement among native speakers of the lan-
guage adds to the elusive nature of word stress in MA, suggested in the first place by
disagreement among (native) scholars.
Returning to the stress generalisations that have been proposed so far, Benkirane’s
is the most easily testable and it has also been adopted in recent work, including
Yeou, Embarki & Al-Maqtari (2007), Burdin et al. (2015) and Hellmuth et al. (2015).
Moreover, in most other varieties of Arabic (in which the existence of stress is un-
controversial) stress assignment is subject to weight and position, with stress typically
targeting the final superheavy syllable (e.g. CVCC), or a penultimate heavy syllable
(e.g. CVː) in its absence (Watson 2011). If one were to assume that MA has lexical
stress, and to take into account prior claims about stress position in this variety as well
as facts about stress in other varieties of Arabic, the weight of the final syllable would
be expected to play a role in determining its location.
In this chapter I will test correlates of stress as envisaged by Benkirane (and several
others with him), according to whom stress in MA targets either the penultimate or the
final heavy syllable.
nucleus resulting in heavier weight. Given that Moroccan Arabic does not have a vowel
length distinction (see next subsection), it is the number of consonantal slots in the coda
that determines the weight of the syllable as light (none), heavy (one), or, under some
analyses, superheavy (two).
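The weight classification just described, combined with Benkirane's generalisation (final stress if the final syllable is heavy, penultimate stress otherwise), can be sketched as follows. This is an illustration under explicit assumptions: syllables are given as CV skeletons, a superheavy final syllable is treated like a heavy one, monosyllables receive stress on their only syllable, and the special status of CəC under some analyses is not modelled. None of the function names come from the literature.

```python
def coda_length(skeleton: str) -> int:
    """Number of consonant slots after the vowel slot in a CV skeleton."""
    last_v = skeleton.rfind("V")
    if last_v == -1:
        raise ValueError(f"no vowel slot in {skeleton!r}")
    return len(skeleton) - last_v - 1


def weight(skeleton: str) -> str:
    """Light (no coda), heavy (one coda slot) or superheavy (two or more)."""
    n = coda_length(skeleton)
    if n == 0:
        return "light"
    return "heavy" if n == 1 else "superheavy"


def stressed_index(skeletons) -> int:
    """Index of the syllable predicted to carry stress.

    Assumption: any non-light final syllable attracts stress; otherwise
    stress falls on the penult (or on the only syllable of a monosyllable).
    """
    last = len(skeletons) - 1
    if last == 0 or weight(skeletons[-1]) != "light":
        return last
    return last - 1


print(stressed_index(["CV", "CV"]))    # e.g. a word shaped like muka
print(stressed_index(["CV", "CVC"]))   # e.g. a word shaped like sudan
```

Under this rule the same initial CV syllable is predicted to be stressed in a CV.CV word and unstressed in a CV.CVC word, which is what makes such word pairs suitable for comparison.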
A lot of work has been done on syllable structure in MA, and in various theoretical
traditions, yielding varying claims about its phonological representation (Boudlal 2001;
Dell & Elmedlaoui 2002; and Benkirane 1982 as cited in Benkirane 1998). What is clear
from all sources is that MA allows for more complex consonant clusters than most other
varieties of Arabic, but the representation of these clusters in terms of branching onsets
or codas is disputed. For example, Benkirane (1998) provides an inventory of syllable
types that includes CV, CCV, CCVC, and CCəCC, while Dell & Elmedlaoui (2002) ar-
gue that syllable onsets cannot be branching, and that codas can only be branching if
they consist of geminates. In order to account for what appear to be syllable-initial
clusters, Dell & Elmedlaoui (2002) instead propose a complicated general syllabifica-
tion algorithm that posits onsetless syllables and empty nuclei. Crucially, a distinction
is made in all works between heavy and light syllables, and sometimes superheavy syl-
lables. The degree of consensus is however limited to CV being considered light and
CVC heavy, with the exception of CəC (at least according to Dell & Elmedlaoui 2002).
4.1.3 Data
The present experiment forms part of the IVAr (Intonational Variation in Arabic) corpus
of speech data (Hellmuth & Almbark 2017).2 It includes comparable data from seven
varieties of Arabic. The corpus recordings of Moroccan Arabic, including the present
experiment which was specifically designed to test correlates of stress, were made in
the spring of 2014 by Sam Hellmuth and Nabila Louriz at the Université Hassan II in
Casablanca.
4.2 Methodology
4.2.1 Participants
Two groups of speakers were recorded for the present experiment. The first group
consisted of 12 native speakers of Moroccan Arabic who grew up with Moroccan Arabic
only at home (the ‘monolingual’ group).3 These speakers were aged 21-34. Ten of them
were born in Casablanca and had lived there all their life at the time of the recording,
one speaker moved to Casablanca aged two, and one speaker was born in very nearby
Kenitra. The second group consisted of 12 speakers of Moroccan Arabic who were
also native speakers of Tashlhiyt Berber through one or both parents (the ‘bilingual’
2
I am very grateful to Sam Hellmuth and Rana Almbark for their generosity in letting me use some of
their Moroccan data recordings.
3
I use quotation marks here because these speakers were not in fact monolingual. All were highly fluent
in multiple other languages, including Modern Standard Arabic and French, see also Chapter 1. For
ease of reference, however, I will use the terms monolingual and bilingual and refer this way to their
first, home language status.
group). Their ages ranged between 20 and 32. Nine speakers in this group were born
in Casablanca, the other three moved to the city at the respective ages of six, twelve
and fourteen. All speakers in both groups are fluent in Modern Standard Arabic and
French, and had received a number of years of English teaching in school.
4.2.2 Procedure
The present experiment was part of a larger recording session for the IVAr corpus which
consisted of a number of tasks, including reading out of a mock dialogue that included
the qword interrogatives discussed in Chapter 7. For the present experiment, parti-
cipants were recorded individually in a quiet university room with a Shure SM-10
headset microphone. Participants were first given oral instructions by a native speaker.
They were then given a print-out of the experimental stimuli, which consisted of mini-
monologues. They read these mini-monologues out loud (with fillers at the top and
bottom of each page), with the third sentence in each monologue forming a target utterance
(see below for details). The experiment was self-paced. There were no practice
items in order to minimise the duration of the session as a whole.
4
It is not usual for Moroccan Arabic to be written in Arabic script, which is reserved for Modern Standard
Arabic. The stimuli in this experiment however are written so as to mirror Moroccan Arabic lexical
items and pronunciation. For example, /ʒuʒ mərːat/ ‘twice’ as used in MA translates to /kamaːn
marːə/ in MSA so there was no confusion which variety was being tapped into. Moreover, speakers
were explicitly instructed to speak Moroccan Arabic throughout the experimental session.
Figure 4.1: Example spectrogram and F0 contour (smoothed) for scripted monologue,
with target word ˈmuka highlighted in the target phrase. IVAr file moca-
slb3-f5.
4.2.4 Analysis
The acoustic parameters investigated for correlates of stress were duration, intensity,
vowel quality and F0. Annotation proceeded as follows: Automatic segmentation of ut-
terances into words and segments was performed by means of the Prosodylab Aligner
algorithm (Gorman, Howell & Wagner 2011). The segmentation of target words was
then manually checked and corrected where needed, and coded for preceding and fol-
lowing pauses. Pauses were defined as periods of silence in the signal and were based
on the auditory impression of a pause supported by visual inspection of speech discon-
tinuity in the spectrogram (the auditory impression of a break caused by e.g. pitch reset
was thus not a sufficient criterion to annotate a pause).
The theoretical number of target items was N=384 (2 speaker groups * 12 speakers
* 2 stress conditions * 8 target syllables). Of these, N=360 were targetlike in the sense
that the correct target word was produced (MA bilinguals N=172, MA monolinguals
N=188). Out of these, N=251 were produced without any pauses; where pauses did occur,
their most typical location was right after the target word.
F0 measurements are based on a hand-corrected version of the fundamental frequency
contour provided by the standard pitch-tracking algorithm in Praat (Boersma & Ween-
ink 2015). Manual correction was limited to the correction of pitch-tracking errors,
such as octave jumps and the tracking of pitch in cases of phonetically voiceless seg-
ments.
For duration, measurements of target words and target vowels were taken. Vowel
duration was determined as the period of time following the initial consonant with
strong periodic energy across the second and third formant. For the segmentation of
intervocalic /r/, the onset of the /r/ was determined as the start of the (first) closure
(most /r/s were realised as either trill or tap, although there were some approximants
as well). To control for durational differences between speakers, duration was also
normalised by z-scoring target vowel duration per speaker.
Vowel quality was measured by F1 and F2 values taken at the midpoint of the vowel.
Measurements were extracted by means of the Burg method in Praat (Boersma & Weenink
2015), using a 25 ms Gaussian window and a 10 ms step. All values were verified
manually, corrected where needed, or excluded where reliable formant values could
not be extracted. Results are reported on both the raw F1 and F2 values as well as on
Lobanov-normalised values. The latter were calculated with the NORM vowel normal-
isation suite (Thomas & Kendall 2007) and the R package vowels (Kendall & Thomas
2014).
Two measurements each were taken for intensity and F0: i) the mean throughout the
target vowel, and ii) the maximum (peak) in the target vowel. As for duration, variation in
intensity and F0 between speakers was controlled for by z-scoring per speaker. To allow
for a more holistic analysis of F0 movements, phrasal F0 contours were also extracted.
This was done by means of taking F0 at 20 extraction points spaced equally per word,
with the exception of the target word for which 10 measurements were taken in each
syllable in order to be able to compare syllables directly.
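The time-normalised extraction described above can be illustrated with a short sketch. Two details are assumptions on my part rather than facts about the original scripts: extraction points are placed at the midpoints of equal-sized bins, and F0 values between pitch-track frames are obtained by linear interpolation. All names and values are hypothetical.

```python
from bisect import bisect_right


def sample_times(start: float, end: float, n: int):
    """n equally spaced time points, taken at the midpoints of n equal bins."""
    step = (end - start) / n
    return [start + (i + 0.5) * step for i in range(n)]


def interpolate(track, t: float) -> float:
    """Linearly interpolate a time-sorted list of (time, f0) pairs at time t."""
    times = [point[0] for point in track]
    i = bisect_right(times, t)
    if i == 0:
        return track[0][1]
    if i == len(track):
        return track[-1][1]
    (t0, f0), (t1, f1) = track[i - 1], track[i]
    return f0 + (f1 - f0) * (t - t0) / (t1 - t0)


# Hypothetical target word: two syllables, 10 extraction points in each.
syllable_bounds = [(0.50, 0.62), (0.62, 0.80)]
target_points = [t for (s, e) in syllable_bounds for t in sample_times(s, e, 10)]

# Hypothetical pitch track (time in s, F0 in Hz).
track = [(0.50, 120.0), (0.80, 114.0)]
contour = [interpolate(track, t) for t in target_points]
print(len(contour))
```

Because every word (or syllable) contributes a fixed number of points, contours from utterances of different durations can be overlaid and averaged point by point.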
Statistical analysis was performed with linear mixed-effects regression models with
the package lme4 (Bates et al. 2015) in R (R Core Team 2016). Models with the same
structure were run for each of the acoustic parameters under investigation with pre-
sumed stress status as the fixed effect of main interest and an interacting effect of group
(bilingual/monolingual). In the case of duration, pausing (present/absent) is taken into
account as an additional fixed effect. In some cases, mainly for vowel quality, models
are run on subsets of the data after the main model. To allow for potentially vary-
ing interactions with the fixed effect, random intercepts for items (target syllables or
words) and speakers were included (slopes led to overspecification). Statistical signi-
ficance was calculated by means of LRTs comparing main models with corresponding
null models that lacked the relevant fixed effect or interaction term. The R syntax of
the models that are used is given in the footnotes.
4.3 Results
4.3.1 Duration
The total number of targetlike target words was N=360, as mentioned above. Of this
number, 319 tokens actually had an initial vowel. The 41 tokens that did not were
all instances of the words sitta and sittat, which were typically produced as [sːtːə] and
[sːtːaːt]. This left only two tokens of stressed /si(tː)/, which both did have a vowel,
and no matching instances of unstressed /si(tː)/ that had a vowel. This syllable was
therefore also excluded from the statistical analysis (tokens with a vowel are however
included for reference in the figures). The remaining set of N=317 tokens was then
submitted to further analysis.
During annotation, differences appeared to exist between stressed and unstressed
tokens of the syllables si and su, in the sense that the stressed versions of these syllables
had long, dynamic vowels, whereas the unstressed ones had shorter, steady-state only
vowels. An explanation for this observation might be found in the segmental make-up
of the word pairs involved. In the target words in which these syllables are stressed
(sira and sura), there is a high vowel followed by [r] or [rˤ]. This rhotic considerably
affected the preceding vowel formant structure and resulted in a longer vocalic portion
than in the unstressed counterparts sinat and sudan, which had steady-state only initial
vowels (note that the other target syllables followed by /r/, i.e. mu and ma in murra
[mərːa] and marra [mərˤːa], respectively, are much shorter, but the target vowel in
these cases is central rather than high). This is a known effect: lengthening
of high vowels preceding rhotics is also observed in other languages, including Dutch
(Rietveld, Kerkhoff & Gussenhoven 2004). In an attempt to neutralise these segmental
effects, vowel duration was re-extracted using only the steady-state portion of the vowel.
The analysis below is based on this shorter vowel duration.5
Figure 4.2: Absolute duration (ms) of vowels, per syllable, as a function of status
(stressed/unstressed). Lines link productions by the same speaker, large
dots represent mean per syllable.
Figure 4.2 shows the distribution of absolute vowel duration, for each syllable separ-
ately and pooled across the groups. Firstly, there was no main effect of pause (present
or not) on any of the duration measures (for absolute vowel duration, LRT: χ2 (1)=1.73,
p=0.18).6 The interaction between stress and group was significant only for z-scored
syllable duration (LRT: χ2 (1)=4.46, p=0.03).7 Since the interaction was not significant
for any of the other three duration measures (absolute syllable duration, and
absolute/z-scored vowel duration), this effect will not be considered further. There was, however, a main effect
of stress on vowel duration (z-scored and absolute), although the predicted difference
involves ‘stressed’ vowels being 3 ms shorter than ‘unstressed’ ones (LRT: χ2 (1)=4.70,
p=0.03).8 This change is not in the expected direction if stress is considered to equal
enhancement, but more importantly a change of 3 ms on an average vowel duration
of around 77 ms does not reflect a meaningful change even if statistically significant.
5 Analysis was performed on the longer vowel duration measurements too, with the result of a significant
length difference between stressed and unstressed syllables. This effect disappeared entirely when
tokens of si and su were excluded.
6 duration (syllable/vowel, absolute/z-scored) ∼pause + stress * group + (1|speaker) + (1|syllable)
7 Same models as above, with model comparison of stress + group versus stress * group
Figure 4.3: Absolute duration (ms) of vowels, per speaker, as a function of status
(stressed/unstressed), tokens of si and su excluded. Lines link productions
of the same vowel, large dots represent mean per speaker.
In short, both the statistics and the above observations suggest that there is no evid-
ence to support an interpretation in terms of stress-induced vowel or syllable length-
ening in MA across the board. To confirm that there are also no individual speakers
who produce consistent durational enhancement of stressed vowels, Figure 4.3 shows
speaker-specific behaviour (tokens of si and su are removed). Unstressed and stressed
tokens of the same syllable are connected by lines (e.g. unstressed mu in mukat and
stressed mu in muka). The varying direction of these lines for almost all speakers, and
the large overlap between the presumed stressed and unstressed categories within each
speaker, indicate that speakers did not systematically differentiate between stressed
and unstressed vowels in terms of duration.
In sum, most syllables do not exhibit durational enhancement under presumed stress,
and speaker-specific results confirm that, in addition to the lack of an overall distinction, individual
speakers also do not differentiate consistently between stressed and unstressed vowels
in terms of duration.
formants could not be retrieved for N=15 items, resulting in N=304 left for further
analysis. Given the aforementioned observations about formant transitions preceding
/r/ in target words sira and sura, formant measurements for these particular words
were taken in the middle of the initial, steady-state part of the vowel (rather than at
the midpoint of the vowel as a whole, as was the case for the other target words).
There was no interaction of speaker group with stress (LRT: χ2 (1)=1.18, p=0.27).9
There was also no main effect of group (LRT: χ2 (1)=2.4, p=0.11).10
Figure 4.4: Mean formant values (Lobanov-normalised) for N=304 target vowels, el-
lipses indicate 1 SD.
Figure 4.4 shows the stressed and unstressed vowels within a Lobanov-normalised
vowel space. Firstly, there is a high degree of overlap for several matched syllable
pairs: i) sada∼sadat, ii) bashar∼bashart, iii) marra∼marrart, and iv) muka∼mukat.
On the other hand, there is an obvious separation between stressed/unstressed vow-
els in syllables si and su. This is confirmed statistically with stressed si having higher
F1 (LRT: χ2 (1)=5.20, p=0.02) and lower F2 (LRT: χ2 (1)=61.19, p<0.001), and su
also having higher F1 and lower F2 when stressed (LRTs: χ2 (1)=51.8, p<0.001, and
χ2 (1)=30.25, p<0.001, respectively).11 This effect persisted despite the effort to meas-
ure vowel quality in the steady-state portion of the vowel which presumably is less in-
fluenced by coarticulation than the exact midpoint. I will return to this finding below.
More surprising, perhaps, is the degree of overlap between the initial vowel in the
word pair murra∼murrin and the initial vowel in marra∼marrart. This overlap
indicates that speakers produce a vowel similar to phonological /a/ (which is phonet-
ically close to [ɑ]) in both cases. On the one hand, it is possible that some speakers
indeed produced the same phonological vowel in marra and murra, since there was no
9 Lobanov-normalised F1/F2 ∼stress * group + (1|speaker) + (1|syllable)
10 Lobanov-normalised F1/F2 ∼stress + group + (1|speaker) + (1|syllable)
11 Lobanov-normalised F1/F2 ∼stress + (1|speaker)
Figure 4.5: All individual tokens of target vowels for the 4 syllables mu(rː), sa, si and
su, with lines linking individual speakers’ stressed/unstressed renditions of
the same vowel. Lobanov-normalised values.
vowel diacritic in the written stimuli, and this is confirmed by auditory impressions.
On the other hand, irrespective of the phonological or phonetic status of the vowel,
it is clear that stress status does not result in clear differences between members of either
of these vowel pairs. For murra∼murrin, stress status did have some effect, with lower
F2 when the syllable is stressed (LRT: χ2 (1)=12.36, p<0.001).12
Finally, differences were found between stressed and unstressed sa, with stressed
sa having higher F1 (LRT: χ2 (1)=12.87, p<0.001) and lower F2 (LRT: χ2 (1)=6.0,
p=0.01).13 For none of the other syllable comparisons were there significant differ-
ences between stressed and unstressed vowel F1 and F2.
Figure 4.5 allows for a closer examination of the four vowel pairs that appear to differ
as a function of stress. Individual speakers consistently differentiate the stressed from
unstressed vowels in sira∼sinat and in sura∼sudan, which confirms that this is a robust
effect that needs explanation.
The differences between the non-high vowels /a/ in sada∼sadat and /u/ (realised
as [ɑ]) in murra∼murrin seem to be of a different kind. In murra∼murrin, most of the
effect seems to be carried by a small number of speakers. Additionally, the fact that this
difference concerns a front/backness distinction might be explained away by appealing
to coarticulation with the following vowel. Unstressed /u/ in murrin is followed by a
12 Lobanov-normalised F1/F2 ∼stress + (1|speaker)
13 Lobanov-normalised F1/F2 ∼stress + (1|speaker)
high front vowel, which might result in its raised F2 values compared to stressed /u/ in
murra, which is followed by a low or centralised vowel. The great degree of between-
group overlap for the target vowels in sada∼sadat however requires closer scrutiny.
While averages suggest that stressed /a/ had 41 Hz higher F1 values and 40 Hz lower
F2 values than its unstressed counterpart, the variation among speakers is high, with
some speakers producing the reverse pattern from others (i.e. a number of male speakers
producing higher F2 in stressed syllables). Whatever differences there are, they are not
robust across speakers. I take this to mean that any significant differences between
stressed and unstressed sa should not be interpreted as meaningful, as not all speakers
behave comparably (compare with speakers’ more uniform behaviour
with respect to si and su).
For these high vowels /i/ and /u/, the pattern of higher F1 and lower F2 for the
stressed member is reminiscent of the effect of pharyngealisation, which typically
occurs due to a preceding pharyngealised (emphatic) consonant (Al-Tamimi 2017). The
present stimuli did not contain pharyngealised consonants.14 A potential explanation
for the effect on F1 and F2 is anticipatory coarticulation with /r/, which follows both
stressed vowels (in sira and sura) but not the unstressed ones (in sinat and sudan).
In sum, for three out of seven syllable pairs (excluding the eighth syllable si(tː)) there
were no F1 and F2 differences between stressed and unstressed vowels. For the other
four, I argued that any apparent effects of stress on formant values could be explained
by coarticulation effects. These results together do not provide evidence to support the
hypothesis that vowel quality reliably distinguishes between ‘stressed’ and ‘unstressed’
syllables.
4.3.3 Intensity
Again, there was no interaction between stress and group (among the four models,
the one closest to significance was the one for z-scored peak value, LRT: χ2 (1)=1.51,
p=0.21).15 In the models with absolute values, there is a main effect of group, with
monolinguals producing higher intensity (e.g. for peak intensity, LRT: χ2 (1)=5.65,
p=0.02).16 This difference is not reproduced in the models using z-scored values
(e.g. for peak intensity, LRT: χ2 (1)=0.16, p=0.68).17 Even if monolinguals gener-
ally produce speech that results in increased loudness (which is by no means to be
concluded from these findings alone), the lack of an interaction with stress means that
it has little consequence for the present experiment. Group variability (or, at a more
essential level, speaker variability) in intensity patterns is to be expected anyway, and
14 While both pharyngealised [sˤuːrə] and non-pharyngealised [suːrə] are real Arabic words, the stimuli
were presented in written Arabic, which uses different letters for these sounds, so that it was clear that
the intended word was [suːrə], not [sˤuːrə]. [sˤiːrə] moreover is not a real word, as opposed to [siːrə].
All in all it is unlikely that pharyngealisation is the real explanation of the identical effect in both word
pairs.
15 intensity (mean/peak, absolute/z-scored) ∼group * stress + (1|speaker) + (1|syllable)
16 intensity (mean/peak, absolute) ∼group + stress + (1|speaker) + (1|syllable)
17 intensity (mean/peak, z-scored) ∼group + stress + (1|speaker) + (1|syllable)
Figure 4.6: Peak intensity (dB) of vowels, per syllable, as a function of status
(stressed/unstressed). Lines link productions
by the same speaker, large dots represent mean per syllable.
Figure 4.7: Peak intensity (dB) of vowels, per speaker, as a function of status
(stressed/unstressed), tokens of si and su excluded. Lines link productions
of the same vowel, large dots represent mean per speaker.
More interestingly, perhaps, there was also a significant main effect of stress on
the intensity of a vowel in all of the models (e.g. for z-scored peak intensity, LRT:
χ2 (1)=29.00, p < 0.001).18 Once the stressed/unstressed members of a given vowel
18 intensity (mean/peak, absolute/z-scored) ∼group + stress + (1|speaker) + (1|syllable)
pair are visualised, however, as in Figure 4.6, it appears that this effect is mainly carried
by the syllables si and su. Without these syllables, the significance disappears (e.g. for
z-scored peak intensity, LRT: χ2 (1)=1.69, p=0.19). This particular difference can eas-
ily be interpreted not as a correlate of stress, but rather as a side-effect of the formant
transition to /r/ in sira and sura, which causes an increase in sonority at the end of the
vowel relative to the unstressed member.
A depiction of individual speakers’ behaviour provides further evidence against the
hypothesis that presumed stressed syllables are enhanced. As can be seen in Figure 4.7,
most speakers do not clearly differentiate between stressed and unstressed vowels in
terms of intensity, reflected by the varying directions of the lines that link unstressed
and stressed members of a syllable pair. While there are a few speakers that do seem
to mark stressed syllables more consistently with higher intensity (bilinguals f6 and
m6), there is no overall trend towards a separation between stressed and unstressed
syllables.
One final aspect of the possible link between stress and (enhanced) intensity was
considered, by comparing the peak intensity in the target (initial) vowel with
that of the second vowel in the word. The possibility exists that any intensity asymmetry
between the initial and final syllable is augmented in cases where the first syllable is
stressed. Accordingly, the intensity differential between the target and second vowel
was calculated for the three word pairs bashar∼bashart, marra∼marrart and muka∼
mukat. These word pairs each have an initial and final vowel that stays constant across
stress conditions (i.e. only the coda changes in the second syllable). Figure 4.8 shows
the peak energy differential (first minus second vowel) as a function of the status of the
first vowel (stressed versus unstressed).
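The differential just described can be sketched as follows, under stated assumptions: the field names and all intensity values are hypothetical, and the sketch only shows the arithmetic (first minus second vowel, averaged per word pair and stress status), not the procedure actually used to produce Figure 4.8.

```python
from collections import defaultdict
from statistics import mean


def mean_intensity_differentials(tokens):
    """Mean peak-intensity differential (dB) per (word pair, stress status).

    The differential is the first (target) vowel minus the second vowel.
    """
    groups = defaultdict(list)
    for t in tokens:
        differential = t["peak_v1_db"] - t["peak_v2_db"]
        groups[(t["pair"], t["status"])].append(differential)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical peak intensities (dB), for illustration only.
example_tokens = [
    {"pair": "bashar", "status": "stressed", "peak_v1_db": 72.0, "peak_v2_db": 70.0},
    {"pair": "bashar", "status": "stressed", "peak_v1_db": 73.0, "peak_v2_db": 70.0},
    {"pair": "bashar", "status": "unstressed", "peak_v1_db": 71.0, "peak_v2_db": 69.0},
    {"pair": "bashar", "status": "unstressed", "peak_v1_db": 72.0, "peak_v2_db": 69.0},
    {"pair": "muka", "status": "stressed", "peak_v1_db": 68.0, "peak_v2_db": 70.0},
    {"pair": "muka", "status": "stressed", "peak_v1_db": 67.0, "peak_v2_db": 70.0},
    {"pair": "muka", "status": "unstressed", "peak_v1_db": 69.0, "peak_v2_db": 71.0},
    {"pair": "muka", "status": "unstressed", "peak_v1_db": 68.0, "peak_v2_db": 71.0},
]
differentials = mean_intensity_differentials(example_tokens)
```

If stress augmented the asymmetry between the two syllables, the stressed value for a given word pair would be consistently larger than the unstressed one; a difference between word pairs alone would not indicate an effect of stress.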
Figure 4.8: Intensity differential between initial and second syllable as a function of
initial syllable status (stressed/unstressed). Data subset consisting of the
three syllables ba, ma and mu(k).
Differences can be observed here between the word pairs but not as a function of stress,
i.e. within word pairs. For example, for ba in bashar∼bashart, the initial vowel almost
always has higher intensity than the second, reflected by positive values for the difference
in both the stressed and unstressed conditions. The reverse is true for mu in muka∼mukat.
The absence of a consistent effect as a function of stress in these cases lends further
credibility to the above results that also failed to reveal consistent stress-related differ-
ences in intensity between syllables.
4.3.4 F0
In this section I will first report on the global F0 contours characterising the utterances
in which target words were embedded. After this I will take a look at static scaling
properties of the F0 contours in terms of mean and peak measurements in target vowels.
As described in Section 4.2.4, global contours were normalised for syllable duration.
The syllable boundary is taken to be the onset of the second consonant, which occurred
in intervocalic position, the exceptions being sitta∼sittat, which, as mentioned previ-
ously, were typically realised as [sːtːə]∼[sːtːaːt]. In order to facilitate the comparison,
the onset of the geminate /tː/ was taken to be the syllable boundary in these words.
Figure 4.9: Phrasal intonation contours (averaged and normalised for duration) across
N=251 target utterances without pauses, male contours (bottom white and
black lines) and female contours (top black and white lines) separated.
Figure 4.9 shows time-normalised mean contours for all utterances in which target
words were not followed (or preceded) by a pause (N=251, as described above). A first
observation is that the part of the contour that occurs on the target word is similar in
all contexts, suggesting that there is no obvious effect of stress on the F0 characteristics
of target words. It is at least expected that the words in this dataset are not subject
to postlexical prominence marking, as they occurred in postfocal position, and at first
sight this is confirmed here. Nevertheless, there is one unexpected visual difference
between female and male speaker behaviour, for at least one syllable pair: The male-
speaker contour for unstressed mu in mukat (white) is realised at a higher level than
stressed mu in muka (black), while the female contour shows the opposite pattern. This
observation led to the inclusion of random slopes for the interaction of stress with
participant sex in the statistical models discussed below.
It can also be observed that target words exhibit a small rise at their right edge, or
somewhere on the second syllable in general, and there are several possible explana-
tions for this. One theoretical possibility, namely that the pitch movement represents
postlexical prominence, is unlikely given the focus structure of the phrase as just men-
tioned. Another is that the higher pitch on word-final syllables is in fact a marker of
word-final stress. I will return to this possibility below. A more likely possibility is that
the rise represents an edge-marking tonal event, rather than a prominence-marking
event, given that it occurs in a context where the target words are postfocal and
given. In this case, the observed patterns would highlight the difficulty in obtaining
pitch-neutral stimuli despite careful experimental design.
In the present experiment target syllables are word-initial and thus do not themselves
carry this rise. It would have been problematic, however, if some target syllables had
been word-final and others occurred in non-final position (as in fact in the stress exper-
iment conducted for TB in Chapter 3), as this would have implied that final syllables
are distinct from initial ones by virtue of the rise in the form of a positional effect. This
brings me to some of Boudlal’s (2001) results. As mentioned in Section 4.1.1.1, Boud-
lal proposes that stress targets the final syllable in cases where a word is produced in
isolation, while stress may be elsewhere in sentence context. Given that his words in
isolation were produced in a list, it is not surprising that these words were produced
with what looks like a continuation rise, see Figure 4.10, so that all target words were
characterised by high pitch in the final syllable. Therefore, what Boudlal (2001) con-
siders an effect of stress rather seems to be a positional difference, and high final pitch
in this context should not be interpreted as a correlate of stress.
Figure 4.10: Example F0 contour on word in isolation, from Boudlal (2001: 346).
In the present case, it cannot be concluded for certain that the pitch event is edge-
marking rather than serving the purpose of (lexical stress) prominence marking. For
the purpose of this experiment, the presence of a consistent prosodic event on the final
syllable, irrespective of its actual nature, cannot be considered evidence in favour of the
view that stress is weight-sensitive and variable between penultimate and final position.
Returning to the experimental question about correlates of stress in MA as conceived
of by Benkirane: The most important aspect for present purposes is that these target
(initial) syllables are comparable in terms of their pitch properties, and they seem to
be.
Several static measures of target vowels were taken (N=315), in the form of peak
and mean F0 (in ST), and as before models were run on both absolute and z-scored
values. Pausing is included in these models as a fixed effect. Firstly, there was no
interaction between pausing and stress (e.g. for absolute peak values, LRT: χ2 (1)=1.54,
p=0.21).19 There was also no main effect of the presence of a pause on the F0 values
of target vowels (LRT: χ2 (1)=0.68, p=0.17).20
There was however an interaction between group and stress status. In bilingual speakers,
stressed vowels had F0 values that were around 1 ST higher than unstressed vowels,
whereas the effect was less pronounced in monolingual speakers, who exhibited
a difference of 0.6 ST (the ‘least’ significant was z-scored mean F0, LRT: χ2 (1)=3.86,
p=0.05).21 The general effect therefore is one of slightly increased F0 on the stressed
syllables. Figure 4.11 shows the mean F0 values for the N=315 target vowels for which
pitch was correctly retrieved.
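To give a sense of the size of such differences, the sketch below converts between Hz and semitone intervals. It is illustrative only: the 200 Hz baseline is a hypothetical value, the function names are my own, and the interval formula is reference-free, so it makes no claim about the reference frequency used for the ST values in this chapter.

```python
import math


def semitone_difference(f0_a_hz, f0_b_hz):
    """Interval from f0_b_hz up to f0_a_hz in semitones (ST)."""
    if f0_a_hz <= 0 or f0_b_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_a_hz / f0_b_hz)


def shift_by_semitones(f0_hz, semitones):
    """F0 (Hz) that lies the given number of semitones above f0_hz."""
    return f0_hz * 2.0 ** (semitones / 12.0)


baseline_hz = 200.0  # hypothetical baseline, not taken from the data
for interval_st in (1.0, 0.6):
    shifted_hz = shift_by_semitones(baseline_hz, interval_st)
    print(f"{interval_st} ST above {baseline_hz:.0f} Hz is about {shifted_hz:.1f} Hz")
```

Because the semitone scale is logarithmic, the same interval corresponds to a larger difference in Hz for speakers with a higher baseline F0.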
Similarly to the F0 differences in the case of the experiment on Tashlhiyt (Chapter 3),
predicted differences of the above-mentioned magnitude might not represent perceptually
‘robust’ enhancement, as differences of 1 ST are taken to be an absolute minimum in order
for listeners to distinguish dynamic pitch movements. Nevertheless, a difference exists,
19 F0 (mean/peak, absolute/z-scored) ∼pause * stress + stress * group + (1|speaker) + (1|syllable)
20 F0 (mean/peak, absolute/z-scored) ∼pause + stress * group + (1|speaker) + (1|syllable)
21 F0 (mean/peak, absolute/z-scored) ∼pause + stress * group + (1|speaker) + (1|syllable)
Figure 4.12 shows that about half of the speakers consistently produce higher peak
F0 on ‘stressed’ syllables, including almost all bilingual female speakers, as well as
monolingual f2 and perhaps several others. These bilingual speakers thus appear to
carry much of the effect for the bilingual group according to which stressed syllables
are produced with increased F0. The remaining speakers display the otherwise familiar
pattern of making no consistent distinction between stressed and unstressed vowels.
It follows that, while there is a statistical effect of enhanced F0 on stressed syllables
in MA, this effect is not a very robust acoustic correlate, neither in terms of the scale
of the difference (involving no more than 1 ST) nor in terms of being observed reliably
across the speaker population.
4.4 Discussion
For none of the acoustic correlates discussed in this chapter (duration, vowel quality,
intensity and F0) were there convincing differences between presumed stressed and
unstressed syllables. In order to conclude that stressed syllables stand out acoustic-
ally, there would have to be consistent differences across the board. More concretely:
A lexicon-wide effect with acoustic enhancement of stressed positions would require
stressed syllable members to stand out from unstressed ones in most if not all words.
In the present experiment, the only (marginally) consistent differences were found for
the stressed and unstressed counterparts of the syllables si and su. The differences
in duration and intensity however could be attributed to segmental-contextual effects.
Vowel quality in these syllable pairs was also different between stressed and unstressed
members, but not in the expected direction, requiring an explanation in terms of coar-
ticulation rather than stress status. In short, any acoustic differences in general are best
interpreted as parasitic on this vowel difference.
Across the board, it is crucial to note that no acoustic parameter was used consistently
across syllable pairs to mark the distinction between stressed and unstressed syllables,
and no syllable was consistently enhanced by multiple acoustic parameters. Addition-
ally, for the differentiation of stressed and unstressed syllables to be robust, individual
speakers would be expected to systematically cue this distinction. In this experiment,
no speaker made a consistent distinction (either using multiple cues to enhance stressed
syllables, or using a single cue consistently), which suggests that speakers do not pro-
duce two acoustically distinct phonological categories ‘stressed’ and ‘unstressed’.
While the present findings can thus not be considered to provide evidence in favour
of the existence of lexical stress, they also cannot be taken to provide evidence against
it. This is a problem inherent to null results, but it is augmented further by the fact that
even if MA lacks lexical stress as envisaged by Benkirane (1998), lexical stress could
still, potentially, be captured by appealing to another stress rule. One other testable
prediction about stress that has been brought forward is the one that considers stress
to be final, irrespective of syllable weight (see Section 4.1.1.1). In order to test this
prediction, a different experimental set-up would be needed, where the stimuli should
contain identical syllables contrasted in a presumed stressed position and in a presumed
unstressed position. This could for example involve stimuli such as the following non-
words /daˈba/∼/baˈka/, in which the target syllable ba could be compared in final
(stressed) and in non-final (unstressed) position. Whether this stress rule is correct,
however, is somewhat doubtful: To my knowledge there is no concrete evidence that
suggests that MA has a stress-by-position system.
Returning to the present null results, in the context of past and present work these
appear to be best interpreted as accurately reflecting a situation in which MA lacks lex-
ical stress, as they are in fact compatible with multiple additional observations. Firstly,
native speakers have shown varied judgments on stress position in MA and there is
a century-long disagreement among scholars on the proper representation of stress.
Clearly, the concept of stress in MA is an elusive notion, which is already a good indica-
tion that it might not play much of a role in the phonology of the language. The second
observation has to do with the intonational phonology of MA and the generally held
assumption that stressed syllables serve as docking sites for postlexical pitch accents.
In an experiment conducted by Yeou, Embarki & Al-Maqtari (2007), in which specific
words in read sentences were contrastively focused, MA displayed somewhat different
patterns from Kuwaiti and Yemeni Arabic (varieties that uncontroversially have stress).
Specifically, the intonational movement accompanying the relevant words was not as
closely tied to the presumed stressed syllable in MA as it was in the other varieties.22
22 The authors do not indicate how they decided on the position of stress, but it seems that they took
syllables with long vowels to be stressed. Vowels that were long according to their transcription
occurred in penultimate position (/CVː.CV/) or in closed syllables in final position (/CV.CVːC/), so
that their understanding of stress for these words at least matches Benkirane’s (1998) predictions.
4.5 Summary and conclusion
Part III
5 Prominence in question word interrogatives
5.1 Introduction
5.1.1 Preliminaries
One of the aims of this thesis is to characterise the nature of prosodic prominence at the
postlexical level in Tashlhiyt Berber and Moroccan Arabic. This question is addressed
experimentally for the intonational marking of question words (qwords) in the follow-
ing two chapters. The present chapter serves to motivate why specifically question
words (or wh-words) form a good testing ground for the investigation of postlexical
prosodic prominence.
The intonation of questions in general has a long research history, but most of this
work has been concerned with one aspect of question intonation only, namely the right-
edge-marking of yes/no questions. As I will go on to show, qword questions have
received comparatively little attention. In particular, a detailed review of the prosodic
prominence of qwords is so far noticeably absent, a situation that the second part of
this chapter seeks to remedy (Section 5.3).
In stark contrast to the situation for intonation is the amount of work dedicated to
the syntactic and semantic properties of qword questions, which have been investig-
ated in great detail both crosslinguistically and for many individual languages. One
particularly interesting discussion concerns the structural or inherent prominence of
qwords, which is sometimes referred to directly in terms of focus (see Section 5.2.2.2).
Syntactically, the prominence of qwords comes about through syntactic movement (at
least in approaches that stipulate movement), with qwords often considered to move
to (or occur in) a specific position that allows for their intended interpretation. Se-
mantically, the prominence of qwords follows from their contribution to interrogative
meaning. Finally, some aspects of qwords’ (morpho-)syntactic behaviour also suggest
that qwords are similar to focused constituents.
In reviewing the various aspects of qword interrogative structure, I will also discuss
one of the more puzzling observations about the correspondence between the structural
(non-prosodic) and prosodic prominence of qwords. Specifically, the observation has
been made that the two types of prominence do not coincide in English, as qwords
are not typically the words with the highest degree of phrasal prosodic prominence
(i.e. they do not tend to get the nuclear accent). In many other languages, however,
qwords in interrogatives are consistently the most prominent words in the phrase. Un-
fortunately, the facts of English have for a long time confused researchers. In the next
sections, I will show that this presumed mismatch between structural and prosodic
prominence is not very pervasive crosslinguistically.
• Experimental studies (Section 5.3.2.1): This set includes the few works that ex-
plicitly investigate prosodic properties of qwords with some quantitative aspect
beyond mere listing or description of intonation patterns. Claims in these works
are based on well-described datasets involving data from multiple speakers.
• Descriptive studies (Section 5.3.2.2): This set contains all other resources, includ-
ing i) general descriptions of intonational systems with a few single examples
of qword contours (most contributions in Jun’s Prosodic Typology volumes (Jun
2005c, 2014b), and in Hirst & Di Cristo’s (1998) Intonation Systems), ii) more
elaborate descriptions of qword interrogatives in specific languages that do not
provide detailed information about the actual data (e.g. Varga 2002) or make
little attempt to quantify patterns (e.g. Frota 2002).
Following a discussion of the inventory of prosodic patterns that may be used cross-
linguistically in the marking of qwords, I will finish this chapter by raising some general
questions about qword prominence that will be answered for Tashlhiyt Berber and Mo-
roccan Arabic in the following chapters.
5.2 Syntax, semantics and pragmatics
5.2.1 Syntax
A large amount of work in syntax has been concerned with characterising the phrasal
positions in which qwords may occur in different languages. There seems to be a con-
sensus on the types of patterns that exist crosslinguistically, which can be allocated to
roughly two types of structural behaviour (Chomsky 1995: 68; Dik 1997b) (but see
discussion below on whether there might in fact be three groups):
Type 1 Languages in which qwords in interrogatives are found at (or close to) the left
edge of the phrase;
Type 2 Languages in which qwords are found in the same position as corresponding
non-qword constituents (‘in-situ’ in frameworks that presuppose syntactic movement).
Type 1 covers many languages, some 70% of the world’s languages according to Dik
(1997b: 283) (although no details are given to substantiate this claim).1 Depending
on the definition, this group of languages may include those that front all qwords in
multiple qword questions, as in Slavic, as well as languages like English, in which only
one qword is fronted.
Considering only questions with a single qword, the Type 1 group includes most Indo-
European languages as well as varieties of Berber and Arabic. An example illustrating
the canonical phrase-initial position for qwords in Tashlhiyt is given in (1). A typical
declarative, as in (2), has the object(s) corresponding to the qword ‘what’ in post-verbal
position.2
Type 2 covers languages such as Mongolian, Japanese, Chinese and Korean, where
the qword occurs in the same position as the matching constituent in the corresponding
declarative. Compare (3) and (4), based on Cheng (2003: 103):
Finally, there are languages not subsumed in either one of these categories because
they seem to allow qwords in multiple positions, such as Maltese (Vella 2007), and
Zulu, Malagasy, and French, which have also been called ‘optional in-situ’ languages
(Sabel & Zeller 2006). Whether these languages should be given their own category is
beyond the scope of this discussion. For present purposes it is important to note that
most languages of the Type 1 group in fact also allow qwords to occur in non-initial
position, and specifically in a position that is typical of qwords in Type 2 languages, so
that the English counterpart of (3) is also permissible, as in (5):
This utterance can still be interpreted as a direct interrogative in English, but it has a
somewhat different interpretation from a canonical question with an initial qword. In
syntactic work such questions are often relegated to the echo question category but per-
haps the better characterisation would be that non-canonical qword placement cannot
be used for out-of-the-blue questions (as isolated written examples tend to be). Corpus
studies or studies on interactional data often reveal that non-initial qwords in (what
are assumed to be) Type 1 languages occur not only as echo questions but also in other
contexts. For example, Germanic languages such as English are sometimes considered
to have a category called reference questions, as exemplified in (6) from Bartels (1997:
4):
Additionally, languages more closely related to Berber and Arabic, such as Hausa (Afro-
Asiatic, Chadic), clearly allow qwords to occur in non-initial default position (Jaggar
2006). In fact, Tashlhiyt Berber and Moroccan Arabic also allow for some variation,
based on my own observations of qword questions in daily interaction as well as in
the CoTaSS corpus (Bruggeman & Roettger 2017) and the IVAr corpus (Hellmuth &
Almbark 2017), respectively. The context in (7) exemplifies the usage of a non-initial
non-echo question in a Moroccan Arabic map task:3
observations that qwords syntactically behave like foci have been made for other types
of languages, such as Type 2 languages in which qwords are like foci in terms of their
(in-)sensitivity to so-called island constraints (Hagstrom 2003: 192, Krifka 2011).
A third argument has to do with the morphological marking of qwords. Some lan-
guages that have morphological markers that attach to focused constituents use this
same marker with qwords, such as Wambon (Dik 1997a), Akan (Genzel 2013), Gungbe
(Aboh 2007), and Lɛtɛ (Akrofi Ansah 2010).
A fourth similarity between qwords and foci is that qword interrogatives often re-
semble cleft constructions (clefted constituents being focused by definition). In the
generative tradition in particular, clefts and interrogatives with an initial qword are of-
ten analysed as involving movement that is motivated by a focus feature. The structural
similarity is nevertheless also observed by functionalists (Dik 1997b: Chapter 13). Par-
ticularly relevant to this thesis is that the syntactic parallel has been observed to hold in
Moroccan Arabic (Ouhalla 1999), and in Berber varieties such as Tarifit and Tamazight
(Stoyanova 2004, and Penchoen 1973 as cited in Dik 1997b: 328f). Relevant details
will be discussed in Chapters 6 and 7.
• ‘focus ambiguity’ of phrase-final nuclear accents. This kind of accent may indicate
all-new broad focus, as well as various kinds of narrow focus including on the
single word that carries the accent;
• second occurrence focus, where a focused constituent which has been previously
mentioned is not marked by the main pitch prominence.
Additionally, English has non-prosodic focus strategies, such as clefts and pseudo-clefts.
This can be considered another kind of focus without accent.5
On the other hand there are cases of ‘accent without focus’ in which pitch accents oc-
cur on constituents that are not necessarily focused themselves. Such accent placement
may be due to (metrical) phonological considerations or so-called focus projection (see
also Section 2.4.3.2 for a discussion of patterns of accent distribution).
The above set of arguments strongly suggests that the absence of intonational promin-
ence on a given constituent cannot alone serve as conclusive evidence against its being
focused. An alternative explanation of the typical absence of intonational prominence
on English qwords is that qwords function as non-prosodically marked foci. This op-
tion is not entertained by either of the aforementioned authors, but there is no obvious
reason why this possibility should be discarded. Firstly, qwords are a morphologic-
ally distinct group of words, which means that they could be identified as foci even
in the absence of intonational prominence (unlike, say, a random noun like ‘apple’).
Secondly, qwords are all the more distinct due to their occurrence in a specific syn-
tactic slot near the left phrasal edge, which also distinguishes them from homologous
relative pronouns.
In conclusion, it seems precipitous to require qwords to be prosodically the most
prominent in the phrase in order to qualify as foci, even in languages that usually
mark focus by means of intonational prominence.6 Ideally, there should be pragmatic
diagnostics to the focus status of a constituent, which then may or may not be supported
by intonational prominence or other grammatical markers of focus (see also Lambrecht
1994; Zimmermann & Onea 2011: 208).
In sum, the absence of prosodic prominence on a qword alone should not constitute
evidence against it being focused (in English or crosslinguistically). Taking this further,
concluding that qwords are focused by definition might not be warranted either. The
next section will address the issues relating to generalisations about the focal status of
qwords in some more detail.
5
Clefted constituents often, but do not necessarily, attract intonational prominence (Bolinger 1986).
6
It is moreover well-known that the presence of intonational prominence is not a crosslinguistic marker
of focus, cf. Erteschik-Shir (2007: 40): “not all languages use stress to mark the focus”.
The clefting structure in (10) indicates that in this particular case, the qword is most
definitely focused, but this raises questions about non-clefted qwords in English: Are
these also focused, and if so, should this lead us to posit different kinds of qword focus?
Does it follow that there is a difference in the focal status of qwords in languages that
treat all qwords like clefts, and the status of qwords in languages that may but need not
be clefted? Concrete answers to such questions have not been proposed in the literature,
but a good explanation would likely appeal to different focus types. Zimmermann &
Onea (2011) for example suggest that the difference between in-situ and ex-situ focus
in West African languages reflects the difference between contrastive and information
focus.
5.2.4 Summary
There are many arguments from (morpho-)syntax, semantics and pragmatics that sup-
port an interpretation of qwords as inherently focused in many individual languages.
It is one step further to assume that qwords in interrogatives are by definition focused
crosslinguistically, although this is in fact a very common view. If a uniform character-
isation of qwords in terms of focus is possible at all, semantic and pragmatic arguments
are likely to be needed: A mere glance at the possible crosslinguistic patterns for the
syntactic and prosodic marking of qwords is enough to abandon hopes for a uniform
characterisation from these corners.
At this point, it has to be concluded that qwords in interrogatives cannot be con-
sidered focused by definition. Instead, qwords’ potential status as foci must be determ-
ined on a language-specific basis, and likely even on a discourse-specific basis. Given
7
This question is further complicated by the variety of focus categories that have been proposed (e.g. in-
formation focus and contrastive focus), the discussion of which is beyond the scope of this chapter.
5.3 Prosodic marking of question word interrogatives
Benzmüller (2005) for German. Most of these languages also have qword questions in
which the right edge is high or rising, although this is less commonly attested than low
pitch. High pitch additionally seems more restricted in its occurrence than final low
pitch, being pragmatically or socially conditioned by specific interactional contexts or
speech style, or even phonologically conditioned by segmental structure (Bartels 1997;
Frota 2002).
An explanation for the different amount of attention yes–no and qword questions
have received might take into account the structural differences between the two types
of questions. In contrast to yes–no questions, qword questions are by definition lexically
and/or morphologically differentiated from statements. This follows from the use of an
interrogative pronoun, which moreover often occurs in phrase-initial position, which
leaves little uncertainty about the modality of the phrase.10 There is therefore no real
need for prosody in qword interrogatives to mark sentence modality the way it does
in yes–no interrogatives. Accordingly, it is perhaps the prominence marking, rather
than the edge marking within qword questions, which is more interesting to compare
crosslinguistically. So far prominence marking of questions has received little mention
in the literature, and I will therefore use the next section to provide an overview of
crosslinguistic patterns of qword intonational prominence.
first the discussion of European languages, then Tamil and finally Korean. The studies
involve varying numbers of speakers, ranging from 2 to 10. Most studies had a similar
set-up, with speakers reading out scripted questions following a prompting context.
In the experiments on European languages, qwords occurred in phrase-initial pos-
ition, with only Maltese also allowing for qwords in other positions. The qword in
all these languages is typically marked with the highest pitch or the most extreme
pitch excursion (rising towards a peak) in the phrase: This holds for Manchego Spanish
and Peninsular Spanish (Henriksen 2014 and Prieto 2004, respectively); Maltese (Vella
2007; Grice, Vella & Bruggeman 2019); Greek (Arvaniti & Ladd 2009); and Dutch (Haan
2002). Among these, Dutch exhibited the most variable patterns: In 15% of qword in-
terrogatives (28 of 185 cases), the qword did not receive the main pitch prominence
in the phrase. This means that even for the least consistent language, the qword was
in fact the most prominent in the vast majority of cases (85%). For all languages, the
intonation contour typically dropped right after the qword-related maximum and lacked
further prominence-marking pitch events. In some cases (the Dutch study and both
studies on Spanish) contour variants are observed where the high pitch in the vicinity
of the qword has the form of a plateau that extends beyond the right edge of the qword.
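The proportions reported above for Dutch can be checked with a few lines of arithmetic. The following is a minimal sketch only: the two counts are those given in the text for Haan (2002), while the variable names are illustrative and not taken from any of the cited studies.

```python
# Counts as reported in the text for Dutch (Haan 2002).
# Variable names are illustrative assumptions, not from the original study.
total_qword_interrogatives = 185
qword_not_most_prominent = 28

# Share of interrogatives in which the qword did NOT carry the main pitch prominence.
share_not_prominent = qword_not_most_prominent / total_qword_interrogatives
# Complementary share in which the qword WAS the most prominent word.
share_prominent = 1 - share_not_prominent

print(f"qword not most prominent: {share_not_prominent:.0%}")
print(f"qword most prominent:     {share_prominent:.0%}")
```

Rounded to whole percentages, the two shares correspond to the 15% and 85% cited in the text.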
In Tamil, a comparable pattern for qword prominence is observed (Keane 2006a).
Qwords in this language may occur in various phrasal positions. They are marked by
rising pitch movement which is more pronounced than that on the corresponding word
in a declarative. As most content morphemes in Tamil generally carry a rising contour,
qwords do not necessarily get the highest phrasal peak, especially when they occur in
non-initial position (presumably due in part to declination effects). However, qwords
are enhanced compared to their non-qword constituent counterparts.
Korean, finally, does not quite conform to the same pattern of qword enhancement.
Korean is analysed as lacking pitch accents and its intonation is instead analysed in
terms of sequences of boundary tones (Jun 2005a, cf. Section 2.5.2). In Jun & Oh
(1996), various sentence modalities are contrasted with each other, including qword
interrogatives (qwords occur in-situ in Korean). Words that are semantically ambigu-
ous between an interrogative pronoun (qword) and an indefinite pronoun were found to
be phrased differently depending on the function they performed. Specifically, qwords
formed an AP with the following verb, whereas indefinite pronouns were phrased as
a single-word AP. This phrasing resulted in the consistent occurrence of a pitch peak
around the second syllable of the qword, whereas the indefinite pronoun would get
a final peak. It is not clear what this means for an interpretation in terms of qword
prominence. Although Jun (2005a, 2014a) assumes that Korean lacks phonological,
postlexical prominence, other authors have mentioned the percept of enhanced prom-
inence of qwords in Korean, such as Choe (1995) (further references in Jun & Oh 1996).
Even if Korean qwords are not readily interpreted as being prosodically prominent, at
least in terms of surface pitch patterns they are not dissimilar to the aforementioned
languages in which the main phrasal prominence co-occurs with the qword.
Something to keep in mind about the intonational patterns described in this section
is that the findings of experimental studies might well differ from other types of studies
including other speech styles. For example, for Dutch, qualitative results from a corpus
study involving mostly spontaneous interactions (Chen 2012) differed somewhat
from those of Haan (2002). In the corpus study, about three quarters of the
90 qwords produced in interrogatives were reported to receive some kind of accentu-
ation (although not necessarily the most prominent phrasal accent). In Haan’s (2002)
work, almost all qwords carried the most prominent accent in the phrase. It seems
likely therefore that speech style will have an effect not only on the type of right edge
marking of interrogatives (as mentioned in Section 5.3.1) but also on the intonational
prominence of the qword.
11
For some of the languages in these volumes, qword questions are not mentioned at all (e.g. Italian and
Swedish in Hirst & Di Cristo (1998)). Tamil and Spanish are also left out of this count as more detailed
studies were discussed in the previous section.
syllables preceding the stressed syllable, or the number of voiced segments preceding
the peak. With this proviso in mind, a small number of languages nevertheless seems
to characterise qword prominence by means of a rise (with a peak and/or fall occurring
after the qword): Egyptian and Lebanese Arabic, Georgian, Bengali and Mongolian.
The remaining languages seem to have high pitch on the qword which is followed by a
fall which is also (in part) realised on the qword.
The fact that there are apparent differences in the phonetic realisation of ‘high F0
on the qword’ does not invalidate the observation that qword intonation is highly sim-
ilar crosslinguistically (at least based on the languages reviewed here). Firstly, align-
ment differences are not necessarily indicative of meaningful differences in the type
of pitch event used to mark qwords as prominent. Alignment may differ predictably
as a function of many factors, some of which reflect differences in phonological struc-
ture between qwords (such as length in syllables, segmental make-up, position of the
stressed syllable), rather than meaningful differences in the nature of prosodic promin-
ence marking. Moreover, even differences in the phonological labels used to describe
pitch events in different languages do not necessarily entail fundamental differences,
which is why I have refrained from referring to AM pitch accent categories as found in
the original sources.12
Secondly, irrespective of the precise details of the high region, most languages exhibit
similarities in the part of the intonation contour that follows the qword, which typic-
ally consists of a low-level flat stretch of F0 until the phrase end. In some cases this is
explicitly described as deaccentuation (e.g. for the Basque varieties, and for Greek). In
reviewing all the above resources, the phonetic resemblance between example F0 con-
tours from different languages was in fact striking. Two examples of similar contours
are given in Figure 5.1 for Northern Bizkaian Basque13 and Hungarian. The schematic
representation given in Figure 5.2 (from Haan 2002: 116) for a set of Dutch questions
is also highly similar to these examples.
Interestingly, in some languages a very similar contour is also used for statements
with (initial) focus. In those cases, the intonation contour is typically analysed as in-
volving a focal pitch accent on the focused constituent, followed by deaccentuation
and/or postfocal compression accounting for the flat F0 stretch (e.g. Elordieta & Hualde
2014 for Basque; Vanrell & Fernández Soriano 2013 for Spanish). These observations
form further support for an interpretation of qwords in interrogatives as foci.
Finally, in contrast to the aforementioned languages that have prosodically promin-
ent qwords, there are a number of languages that do not seem to associate the main
phrasal prominence with the qword. These include Jamaican Creole (Gooden 2014),
Chickasaw (Gordon 2005), Dalabon (Fletcher 2014), Japanese (Igarashi 2014) and Cur-
12
Differences between languages in the choice of pitch accent are nevertheless suggestive of some general
differences: The use of e.g. H*L for qwords in Northern Bizkaian Basque versus L*+H for Lebanese
Arabic suggests that it is the fall from a high point which is crucial in the former and the rise to a
high point which is crucial in the latter (even though both contours might share a global rising-falling
movement).
13
This variety of Basque contrasts words that have a lexical pitch accent with words that lack one.
Words with lexical accent are identifiable by an acute accent on the relevant syllable.
Figure 5.1: F0 contours for Basque and Hungarian qword questions, both with initial
qword. Northern Bizkaian Basque reproduced from Elordieta & Hualde
(2014: 435) based on original sound file, Hungarian recording author’s own.
Figure 5.2: Schematised F0 contour for Dutch qword questions with initial qword.
Reprinted from Haan (2002: 116). The solid line represents most of the contours
in the sample (71%, N=186); the dotted lines illustrate alternative realisations
with further accents or an optional final rise.
• Languages that typically mark the qword with the main phrasal pitch promin-
ence: European and Brazilian Portuguese, many if not all varieties of Spanish
and Basque, Catalan, Dutch, Italian, Greek, Maltese, Egyptian Arabic, Lebanese
Arabic, Moroccan Arabic, Romanian, Bulgarian, Hungarian, Russian, Georgian,
Bengali, Tamil, Bininj Gun-Wok;
14
There are many resources that deal with yes–no question intonation, including Michalsky (2017) for
German, but these tend to make little mention of qword questions.
• Languages that either do not typically mark the qword with the main phrasal pitch
prominence or variably do so: English, German, Jamaican Creole, Chickasaw,
Dalabon, Japanese, Curaçao Papiamentu.
The overall pattern, even if based on limited evidence only, is clear. Some Germanic
languages, including English and an English-based creole, and a handful of other lan-
guages form the exception in matters of qword interrogative intonation. Most other
languages require the main phrasal pitch prominence in the form of some sort of high
pitch to coincide with the qword.
As stated before, I have avoided framing descriptions in terms of pitch accentuation.
Consequently, qword pitch prominence patterns could be observed to look very similar
across a range of languages. This theory-neutral way of describing qword intonation
is less helpful when the aim is to analyse the prosodic properties as either serving
prominence marking purposes or edge marking. For this latter purpose, the role of
stressed syllables and phrasal edges in determining the location of the pitch movement
should prove insightful, which takes us into the domain of intonational phonology.
In AM approaches to intonation, pitch events that co-occur with stressed syllables are
considered to be prominence marking, whereas pitch events that occur at the edges of
phonological domains are considered edge marking. In the case of qwords, determining
whether their pitch properties are prominence marking or edge marking is somewhat
complicated by the fact that most languages require qwords to appear at the left edge
of some prosodic phrase. This makes it potentially difficult to distinguish phonetically
between pitch movement that is to be interpreted as edge marking and pitch movement
that serves to mark the prominence of the qword (see also Section 2.4.3.2). Helpfully,
most of the previously discussed studies mention that there is a role for stressed syllables
in determining the location of specific F0 turning points, and accordingly analyse the
pitch events as pitch accents rather than boundary tones. Additionally, in languages
where qwords can occur in initial as well as non-initial positions, the high pitch co-
occurs with the qword irrespective of its position. This is the case for Romanian and
Maltese (and also for Tashlhiyt Berber and Moroccan Arabic, as will be shown in the
next chapters).
In conclusion, phonetically and phonologically, it seems that the pitch properties
of qwords in many languages reflect intonational prominence marking, rather than edge
marking. This interpretation is moreover supported by the qword’s semantic-pragmatic
salience as discussed in previous sections.
5.4 Further points of interest
2. What is the prosodic prominence status of question words? How should any pros-
odic marking be analysed phonologically?
The following two chapters (6 and 7) will address these questions for Tashlhiyt Berber
and Moroccan Arabic, respectively.
There is a third question, which will not be addressed directly, but is relevant never-
theless: To what extent do Tashlhiyt Berber and Moroccan Arabic exhibit similarities
in question word interrogative marking? This question bears on the language contact
situation that characterises these languages (see Section 1.3). At this point, very little
is known about qword interrogatives in other varieties of Berber, making it difficult
to determine how any characteristics of Tashlhiyt compare to those of other varieties.
More information is available about qword interrogatives in other varieties of Arabic
(e.g. Chahal & Hellmuth 2014), but, to my knowledge, there are no controlled experi-
ments on the topic. A very promising possibility for future work in this direction lies
in the IVAr corpus (Hellmuth & Almbark 2017), which contains comparable data from
seven varieties of Arabic, including recordings of the same experiment that yielded the
MA qword interrogatives that will be discussed in Chapter 7. Any similarities between
TB and MA qword marking could be compared against qword marking in other Arabic
varieties in order to find out where the greater similarities lie.
6 Question word interrogative intonation
in Tashlhiyt Berber
6.1 Introduction
6.1.1 Prior work on the intonation of Tashlhiyt Berber
Prior to 2011, no work had ever directly addressed Tashlhiyt Berber intonation. Passing
remarks had been made in Stumme (1899) and Dell & Elmedlaoui (2002), and a hand-
ful of examples were given in Lafkioui (2010). Work on the intonation of Tashlhiyt
since then has focused on the tonal events occurring near the right edge of Intonation
Phrases (IPs) (cf. proceedings papers Grice et al. 2011; Roettger, Ridouane & Grice
2012; articles Grice, Ridouane & Roettger 2015; Roettger & Grice 2015; and the PhD
thesis Roettger 2017, which is based on some of the earlier publications).1
A finding shared by all of these past studies is that Tashlhiyt exhibits a great deal of
variability in tonal placement in IP-final position. Grice, Ridouane & Roettger (2015)
investigate two types of sentences produced in an experimental setting: Yes-no ques-
tions (marked with a question particle), and statements with a contrastively focused
word in IP-final position. Both are characterised by a rising-falling F0 movement close
to the right edge of the phrase and both these movements are analysed phonologically
as involving an H tone. Despite their presumed different functions as edge-marking (in
the case of the yes-no questions) and prominence-marking (in the case of the declarat-
ives with a contrastively focused word), H tones in both phrase types exhibit similar
behaviour in the sense that they can both dock onto either the penultimate or the fi-
nal syllable of the final word in the phrase. These patterns of variable realisation are
moreover very similar to the realisational variants Roettger & Grice (2015) report for
declarative questions (yes-no questions not marked by a question particle).
Grice, Ridouane & Roettger (2015) analyse the H tones at the right edge of questions
and contrastive statements as having primary association with the phrase edge, and
secondary association to a specific syllable. This explanation fits in well with the as-
sumption that the language does not have lexical stress, since this type of association
presupposes that there are no predetermined lexical anchors in the form of stressed
syllables to which postlexical tones would associate. This is not to say that tonal
association was entirely unconstrained: Instead of structural prominence factors, other
lexical–phonological metrical factors, namely syllable weight and nucleus sonority,
were invoked to explain the distribution of the H tones.
1
The present chapter has been published in a slightly different format as Bruggeman, Roettger & Grice
(2017).
The present chapter looks at the intonational events that co-occur with question
words (qwords) and will provide a detailed investigation of tonal alignment and as-
sociation in phrase-initial position.
2. What is the prosodic prominence status of question words? How should any pros-
odic marking be analysed phonologically?
Because very little is known about qword interrogative intonation in Berber in gen-
eral, and nothing about Tashlhiyt, a subquestion to the first main question is whether
and in what contexts qwords in Tashlhiyt are characterised by prosodic prominence
marking. Based on the discussion in Chapter 5, it is likely that qwords are marked
by an intonational prominence-marking event in direct interrogatives, as they are in
many other languages. The follow-up question would then concern how this postlex-
ical prominence should be interpreted phonologically, and how it stands in relation to
the prosodic characteristics of the rest of the interrogative (i.e. Question 1). In section
6.2 I will report on a pilot experiment which compares the qword manaɡu ‘when’ in
direct interrogatives, i.e. speech acts expressing a request for information, and in em-
bedding contexts. It is likely that the qword is intonationally prominent by means of a
pitch event (which likely conveys focus) in a direct interrogative (see Chapter 5.2.2.2).
At the same time, the expectation is that the same intonational marking is not found
for embedded qwords.
The second question pertains to how the intonational characteristics of qwords in
Tashlhiyt should be interpreted phonologically. After the pilot in section 6.2 has cla-
rified that qwords indeed are marked consistently with a pitch event, the main experi-
ment from section 6.3 onwards will serve to answer this second research question with
specific reference to qwords in phrase-initial position. This main experiment investig-
ates the properties of 11 different qwords (five simple and six complex), zooming in on
the alignment and scaling characteristics of the F0 event in question.
6.1.4 Data
All speech materials were developed in consultation with native speaker informants.
The data were recorded by the author in a quiet room in the Département des études
amazighes at the Université Ibn Zohr in Agadir, using a PreSonus Audiobox solid-state
recorder at a sampling rate of 44.1 kHz, and a head-mounted AKG C420 III microphone.
Pilot recordings took place in November 2014, recordings of the main experiment in
March 2015.
6.2 Pilot
6.2.1 Participants
The recordings for both experiments involved the same participants, with the exception
of one participant who completed the pilot but not the main experiment. Data from two
participants were excluded from analysis in both pilot and main experiment, which in
one case was due to reading difficulties and in the other case due to the speaker being
unable to finish recording. In the end, the pilot results are based on data from the seven
participants who took part in both experiments.
Table 6.1 gives more detailed information about the main seven participants (six
female, indicated by ‘f’, and one male, indicated by ‘m’) who did both pilot and main
experiment. Participants’ place of birth varied, as shown in Figure 6.1. The three
participants whose place of birth is marked by an asterisk moved to Agadir sometime
during their youth; the rest had come to Agadir for the purpose of their university
education.
All participants were native and dominant speakers of Tashlhiyt Berber, students at
the Département des études amazighes at the Université Ibn Zohr in Agadir at the time
of recording, and spoke Tashlhiyt regularly both with friends and family. Participants
were multilingual: They were all fluent in Moroccan Arabic, and had varying fluency
in Modern Standard Arabic, French and English as languages learnt in school (this was
expected, see also Section 1.2).
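[Figure 6.1: map of Morocco showing participants’ places of birth. Labels recovered from the figure: Rabat, Marrakesh, Agadir, Algeria; participants 1f, 2m, 3f, 4f, 6f, 8f, 9f.]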
statements).2 The basic expectation is that the qword is marked by intonational prom-
inence in a short direct interrogative, whereas it is probably not when it is embedded.
In order to elicit natural sounding instances of direct questions as well as embedding
contexts, a scripted mock telephone dialogue between two imaginary speakers was
used. The full script with gloss and translation is given in Appendix A. Participants were
familiarised with the content of the conversation beforehand and were then instructed
to read and act out both sides. The Tashlhiyt text was presented in Latin script a few
lines at a time on a screen. Pictures and speech bubbles were used to represent the
turns of the dialogue participants.
2
Dell & Elmedlaoui (2002: 185f.) note that the pronunciation of the final segment in manaɡu is subject
to free variation and may be produced either with a full high back vowel or with a labialised velar
stop, i.e. as [manaɡu] or [manaɡʷ]. In informal elicitation with all our participants the word was
consistently produced with a clearly identifiable final vowel, which motivated the transcription as
‘manaɡu’.
3
The scripted dialogue also included a test stimulus with an initial embedded qword: manaɡu a ra nː
aʃkʁ, sːnʁ is rad t tːinit ‘when (that) I come home, I know that you want to say it’. However, given that
most participants struggled to understand this construction upon presentation, this item was excluded
from further analysis.
Figure 6.2: Representative F0 contours and waveforms for interrogative qwords (a: ini-
tial and c: final) and embedded qwords (b: medial and d: final); target
word manaɡu ‘when’ highlighted in grey.
Figure 6.3: Two F0 contours and waveforms for questions with peninitial qword, with
target word manaɡu ‘when’ highlighted in grey.
to the left phrasal boundary and is not a target that needs to be realised on the qword
itself. Secondly, in final position, as in Figure 6.2c, the rising movement may be realised
largely prior to the qword. Following these observations of variability, the initial low
turning point appears to be less inherently part of the qword tune than the other two
that exhibit less variability in alignment. I will return to the role of phonetic
alignment in determining phonological representation in the discussion of the main
experiment in Section 6.5.
The aim of the main experiment, reported in the following sections, is to establish the
exact realisational details of this rising-falling F0 pattern on the qword. The next
section describes the methodology of the main experiment.
6.3 Methodology
6.3.1 Participants
Participants in the main experiment were the same as in the pilot, minus one speaker.
This resulted in data being analysed from a total of seven speakers. For details see
Section 6.2.1.
6.3.2 Procedure
Participants were instructed to act out the role of a primary school teacher doing an
exercise with their pupils that involved asking questions about pictures. The experiment
was presented to participants as a PowerPoint presentation in slideshow mode, with each
slide showing a picture and a brief textual description of the scene depicted. The
target question about the picture was printed at the bottom of the slide, underneath
the picture. Participants were instructed to read the picture description out loud and
then produce the question as if they were asking it of their pupils.
a phonemic transcription.
Context sentence 1
afrux ann iZRa yan aHuli gh ugharas.
afruχ anː izˤra jan aʜuli ʁ uʁaras.
boy dist 3sg.m-see.aor one sheep in road
‘The boy there sees a sheep on the road.’
Context sentence 2
tsaqsat imHDarn nnm :
tsaqsat imʜdarn nːm :
2sg.f-ask.aor students poss-2sg.f
‘You ask your students :’
Target question
mani gh iZRa aHuli?
mani ʁ izˤra aʜuli?
where in 3sg.m-see.aor sheep
‘Where does he see the sheep?’
The full set of 11 qwords (five simple and six complex) with their carrier sentences
are given in Table 6.2. Target questions had either qword-Verb-Object or qword-Verb-
Adverb structure and the number of syllables following the qword was always five.
Simple qwords varied in number of syllables from one to three, and included CV and
CVC syllables. Complex qwords consisted of the interrogative element man ‘which’
followed by either a disyllabic or a trisyllabic noun. Syllabification in Tashlhiyt has
been the subject of much previous work and is especially complex in the case of long
consonantal sequences (Dell & Elmedlaoui 2002, 2008; Ridouane 2008). In the case
of the simple qwords, which all had vocalic nuclei, the syllable boundary location is
uncontroversial. It is unclear if resyllabification takes place across the elements of a
complex qword constituent, which is why no syllabification is given for those qwords.
Picture slides were presented in blocks of 11, within which each target qword interrogative
occurred once. As the recording session involved a number of other tasks, speakers
completed a set of two blocks, each with a different semi-randomised stimulus order,
followed by another task, and finally the same set of two blocks again. Each set was
preceded by five practice items. To minimise the total duration of the session, no fillers
were included. Of the seven speakers, two completed only one set of two blocks, so that
their number of repetitions per stimulus is two instead of four, as shown in Table 6.1.
This led to a total of 24 repetitions per target word, resulting in 120 tokens of the
five simple qwords and 144 tokens of the six complex qwords. After the exclusion of
disfluent items (misreading, hesitation), 107 simple (89%) and 120 complex qwords (83%)
remained.4
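The token counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; the function names are hypothetical, and it assumes five speakers with four repetitions and two speakers with two repetitions, as stated in the text.

```python
def design_totals(n_full, n_partial, reps_full, reps_partial, n_words):
    """Return (repetitions per target word, total tokens) for one set of qwords."""
    reps_per_word = n_full * reps_full + n_partial * reps_partial
    return reps_per_word, reps_per_word * n_words


def retained_percent(kept, total):
    """Percentage of tokens kept after exclusions, rounded to a whole number."""
    return round(100 * kept / total)


# Five speakers with four repetitions, two speakers with two repetitions.
simple = design_totals(5, 2, 4, 2, n_words=5)     # (24, 120)
complex_ = design_totals(5, 2, 4, 2, n_words=6)   # (24, 144)
print(simple, complex_)

# Tokens remaining after the exclusion of disfluent items.
print(retained_percent(107, 120), retained_percent(120, 144))  # 89 83
```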
4
The relatively high number of exclusions is probably an artefact of speakers not being used to reading
Tashlhiyt aloud. The remaining productions, however, were judged to be natural sounding utterances
expressing the intended communicative function by two additional Tashlhiyt speakers who did not
participate in the experiments.
Table 6.2: Simple and complex qwords with their carrier sentences.
Simple
ma ‘what’                              ma iʃːt a ʁ umalu?          ‘what does he eat in the shade?’
mad ‘what’                             mad nzˤra ʁ umalu?          ‘what do we see in the shade?’
mani /ma.ni/ ‘where’                   mani ʁ izˤra aʜuli?         ‘where does he see the sheep?’
manwi /man.wi/ ‘who’                   manwi nzˤra ʁ umalu?        ‘who do we see in the shade?’
manaɡu /ma.na.ɡu/, /ma.naɡʷ/ ‘when’    manaɡu nzˤra aʜuli?         ‘when do we see the sheep?’
Complex
man anu ‘which well’                   man anu nzˤra ʁ umalu?      ‘which well do we see in the shade?’
man ananas ‘which pineapple’           man ananas nzˤra ʁ umalu?   ‘which pineapple do we see in the shade?’
man aʜuli ‘which sheep’                man aʜuli nzˤra ʁ umalu?    ‘which sheep do we see in the shade?’
man tizi ‘what time’                   man tizi nzˤra aʜuli?       ‘what time do we see the sheep?’
man tili ‘which ewe’                   man tili nzˤra ʁ umalu?     ‘which ewe do we see in the shade?’
man butili ‘which shepherd’            man butili nzˤra ʁ umalu?   ‘which shepherd do we see in the shade?’
6.3.4 Analysis
Scaling and alignment measurements were taken from the F0 contour provided by the
standard pitch tracking algorithm in Praat (Boersma & Weenink 2015), with manual
correction of spurious pitch points and tracking errors such as octave jumps.5
5
Utterances were characterised by considerable microprosodic effects at the transition between vowels
and nasals. In an attempt to control for these perturbations in the analysis of alignment, measurements
were taken both from the raw and a smoothed version of the F0 contour. To obtain the smoothed contour,
the raw contour (already hand-corrected) was processed in four additional steps with the customised
Praat script mausmooth (Cangemi 2015), which included two rounds of smoothing. Results from raw
and smoothed contour versions were highly similar and I will only report the results from the raw
contour here.
Two measurements were used in the quantification of the properties of the maximum
occurring on the qword: A single absolute F0 maximum, and a measure of a ‘high
region’. It has repeatedly been shown that small and gradual F0 displacement within
a dynamic contour does not lead speakers to perceive pitch differences (’t Hart 1976;
d’Alessandro & Mertens 1995). Given the possibility, then, of a region in the contour
within which pitch is perceptually equivalent, it is also conceivable that a larger region
is involved in systematic tune-text association.
In order to quantify the high region, a heuristic measure inspired by earlier work
on pitch plateaux was used. Different measurement strategies for plateaux have been
used in the past, and some will be briefly reviewed here. Knight & Nolan (2006) and
Knight (2008) for example use the start and end of a region delimited by 4% of F0
values in Hertz below the absolute maximum when measuring high plateau accents in
British English. A slightly different measure is adopted by Niebuhr & Hoekstra (2015),
who use a 1 ST difference criterion around the peak in their discussion of North Frisian
plateaux. Both of these sets of authors refer back to more general pitch perception
findings by ’t Hart (1981), who had to draw the somewhat unsatisfactory conclusion
that the perception of pitch movement was highly speaker-specific: For a reliable dis-
crimination between pairs of pitch movements different speakers needed differences
that ranged in size from 1 to 6 semitones. Given this variability in perceptual discrim-
ination ability among speakers, the exact definition of what constitutes a perceptually
relevant plateau seems somewhat arbitrary. Knight & Nolan (2006) argue that there is
little difference between plateaux delimited by 4% and 6% Hz values around the max-
imum. This motivated the choice to use 6% difference values as plateau measures in the
present study, especially because in most of the present speakers’ ranges, 6% in Hertz
values was very similar to the 1 ST criterion used by Niebuhr & Hoekstra (2015). Figure
6.4 schematically depicts the adopted plateau measurements (start and end point).
Figure 6.4: Schematic representation of peak measure (absolute F0 maximum) and plat-
eau measures (6% lower values in Hz around maximum).
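The peak and plateau measures can be illustrated with a short sketch. This is not the procedure used for the measurements reported here, which were taken in Praat; it is a minimal illustration under the assumption that the contour is a gap-free list of time and F0 samples and that the plateau is the contiguous stretch around the absolute maximum in which F0 stays within 6% (in Hz) of that maximum. The function names and the sample values are hypothetical. The helper `semitones` additionally shows that a 6% drop in Hz corresponds to roughly 1 ST.

```python
import math


def semitones(f_high, f_low):
    """Interval between two frequencies (Hz) in semitones."""
    return 12.0 * math.log2(f_high / f_low)


def plateau(times, f0, drop=0.06):
    """Return (peak_time, plateau_onset, plateau_offset).

    The peak is the first absolute F0 maximum. The plateau extends from the
    peak in both directions for as long as F0 stays at or above
    (1 - drop) * maximum. Assumes a contiguous, voiced contour.
    """
    if len(times) != len(f0) or not f0:
        raise ValueError("times and f0 must be non-empty and of equal length")
    peak_idx = max(range(len(f0)), key=f0.__getitem__)
    threshold = f0[peak_idx] * (1.0 - drop)
    start = peak_idx
    while start > 0 and f0[start - 1] >= threshold:
        start -= 1
    end = peak_idx
    while end < len(f0) - 1 and f0[end + 1] >= threshold:
        end += 1
    return times[peak_idx], times[start], times[end]


# Invented contour: peak of 250 Hz, so the 6% threshold lies at 235 Hz.
times = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
f0 = [200.0, 230.0, 245.0, 250.0, 248.0, 240.0, 236.0, 220.0, 205.0]
print(plateau(times, f0))             # (0.03, 0.02, 0.06)
print(round(semitones(100, 94), 2))   # 1.07
```

Because the criterion is a ratio, the semitone equivalent of a 6% drop is the same at any point in a speaker's range.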
6.4 Results
6.4.1 Global contours
Figure 6.5 below shows time-normalised phrasal contours for all individual qword in-
terrogatives with a simple qword.
Figure 6.5: Smoothed F0 contours for all target utterances with simple qwords
(N=107). Normalised duration based on ten equidistant measuring points
throughout each syllable. Dotted lines delimit target qword.
It can be seen that qwords are characterised by a region of high pitch, with a rise
towards a peak on the qword and a fall back to what seems to be a low baseline
immediately following the peak, around the right edge of the qword. Only in interrogatives
with manaɡu is the fall achieved somewhat earlier, before the right edge of the qword.
Interrogatives with mad, finally, diverge only marginally from the general pattern in
having an additional, further fall, a few syllables to the right of the qword. Taken
together, these interrogatives provide a highly coherent picture of a rising-falling F0
movement on the qword and a typical absence of additional intonational prominence
anywhere in the rest of the phrase.
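The time normalisation underlying Figure 6.5 (ten equidistant measuring points per syllable) can be sketched as follows. This is an illustration rather than the script used for the figure: it assumes linear interpolation between F0 samples and places each measuring point at the midpoint of one of ten equal sub-intervals of the syllable, neither of which is specified in the text. The function names are hypothetical.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate values at time t; clamp outside the sampled range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def normalise(times, f0, syllable_boundaries, points=10):
    """Sample F0 at `points` equidistant locations within each syllable.

    syllable_boundaries lists the start of the first syllable, every internal
    boundary, and the end of the last syllable, in seconds.
    """
    contour = []
    for start, end in zip(syllable_boundaries, syllable_boundaries[1:]):
        step = (end - start) / points
        for k in range(points):
            # Midpoint of the k-th sub-interval (an assumption, see above).
            contour.append(interpolate(times, f0, start + (k + 0.5) * step))
    return contour


# Invented two-syllable contour, sampled at two points per syllable for brevity.
print(normalise([0.0, 1.0, 2.0], [100.0, 200.0, 100.0], [0.0, 1.0, 2.0], points=2))
# [125.0, 175.0, 175.0, 125.0]
```

Every utterance then contributes the same number of values per syllable, so contours of different durations can be overlaid on a common axis.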
word (normalised duration), with word boundaries indicated by solid lines.6 Each dot
represents an utterance, i.e. one specific peak as realised on that qword.7
While maxima are not commonly realised in the very first part of the word, the
temporal domain across which they are realised is surprisingly large, and spans the
largest part of the word (excepting manaɡu, which exhibits an absence of peaks in the
second half of the word).
In order to investigate peak alignment in more detail, the location of F0 maxima
was considered in relation to individual segments. Figure 6.7 illustrates this for all five
qwords and for each speaker separately. The differing number of tokens per speaker is
a function of the speaker’s original number of repetitions, which was either two or four
(see Table 6.1), and subsequent exclusions.
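Expressing a peak location relative to segments and to the duration-normalised word can be sketched as below. The sketch assumes that segment boundaries are available as a list of time stamps; the function name and the boundary values in the example are hypothetical and do not come from the recordings.

```python
from bisect import bisect_right


def locate_peak(boundaries, labels, peak_time):
    """Return (segment number, label, position in segment, position in word).

    boundaries holds the word onset, all internal segment boundaries and the
    word offset; labels holds one label per segment. Both positions are
    proportions between 0 and 1. Peaks outside the word raise ValueError.
    """
    if len(boundaries) != len(labels) + 1:
        raise ValueError("need exactly one more boundary than labels")
    if not boundaries[0] <= peak_time <= boundaries[-1]:
        raise ValueError("peak lies outside the word")
    # A peak exactly at the word offset is assigned to the last segment.
    i = min(bisect_right(boundaries, peak_time) - 1, len(labels) - 1)
    seg_start, seg_end = boundaries[i], boundaries[i + 1]
    in_segment = (peak_time - seg_start) / (seg_end - seg_start)
    in_word = (peak_time - boundaries[0]) / (boundaries[-1] - boundaries[0])
    return i + 1, labels[i], in_segment, in_word


# Invented boundaries for the four segments of mani; peak halfway through /a/.
print(locate_peak([0.0, 0.25, 0.5, 0.75, 1.0], ["m", "a", "n", "i"], 0.375))
# (2, 'a', 0.5, 0.375)
```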
As expected, the distribution of peaks over a large part of the word translates into
maxima that variably occur on different segments. In absolute terms, monosyllabic
qwords ma and mad ‘what’ exhibit the greatest degree of uniformity. For these words,
all speakers tend to produce maxima that occur on the vowel /a/ across multiple repe-
titions. In the polysyllabic words, maxima are spread across different segments as well
as different syllables. Peaks are observed on any of the segments from the second to
the last segment in mani and manwi, and from the first to the fourth in manaɡu. The
apparent restriction on the occurrence of F0 maxima at the start of the second syllable
(i.e. around /n/ in manaɡu and mani, and /w/ in manwi) can be explained in terms of
microprosody: There is a predictable dip in the pitch contour on those segments. It is
nevertheless clear from the above that there is no stable segmental anchor for the peak.
6
The realisation of mad often underwent assimilation to the following consonant, whereby the target
sequence mad nzˤra became [man(ː)zˤra]. As it was impossible to distinguish between individual /d/
and /n/ segments, the whole sequence [man(ː)] is taken to reflect orthographic mad.
7
Four peaks in the case of manwi were realised just milliseconds after the qword end, but this was in
cases where the absolute maximum was difficult to identify as a single point that was higher than the
rest. In the following I will simply assume that peaks align systematically on the qword.
Figure 6.7: Alignment of absolute maximum with respect to segments (numbered and
separated by dashed lines) for all five simple qwords.
Peak distribution overall can moreover be characterised as exhibiting a gradient
spread rather than a categorical distribution, with most speakers producing a multi-
tude of alignment patterns. The attested variability can be classified along a number of
parameters:
• peaks that align with different syllables (e.g. 3f’s peaks in mani, and 4f’s peaks in
manaɡu)
• peaks that align with different segments within the same syllable (e.g. 1f’s and
9f’s peaks in manwi and mani, respectively)
• peaks that align with different syllables and with different segments within one
of these syllables (e.g. 3f’s peaks in manaɡu, 9f’s peaks in manwi)
The alignment patterns across words also display similarities, especially between
manaɡu and mani, both of which exhibit a general preference for maxima on the second
vowel. In both words, the earliest peaks occur around the segment boundary between
/m/ and /a/, and the latest peaks halfway through the second vowel. It appears, there-
fore, that manaɡu behaves like a disyllabic word in which the final syllable ɡu does not
count. A plausible explanation is that the word, despite being produced with a clear
final vowel, is treated as if it does not have a final phonological vowel that participates
in tune-text association, i.e. as manaɡʷ (see also 6.2.2).
Compared to mani and manaɡu, manwi exhibits slightly more categorical peak align-
ment, reflected also in the greatest within-speaker variability (e.g. speakers 4f and 9f
producing discretely different peak alignment, i.e. on different syllables, across repeti-
tions). Still, single speakers may produce any combination of ‘categorical’ peaks and
other more gradiently different peaks (excepting 3f). This suggests that manwi, too, is
characterised by a gradient distribution of peaks, which is simply obscured by the large
microprosodic perturbation of the labiovelar approximant in the middle. Additionally,
a comparison between late peaks in manwi and the right edge peaks in the complex
qwords (see Section 6.4.5) shows that the late peaks on manwi are not aligned as late
as the peaks marking the right edge of a complex qword. This suggests that the manwi-
peaks are qualitatively more like the peaks on the other simple qwords than like the
possible edge-marking strategy seen in complex qwords.
In sum, then, the alignment results presented so far indicate that there is little sys-
tematicity both within and across speakers in alignment of F0 maxima in TB qwords.
While peaks may occur on most segments in the word, and variably align within these
segments, the main consistent feature of all peaks is that they occur within the domain
delimited by the boundaries of the qword.
This degree of variation in alignment is unlike that in languages for which segmental
anchoring has been invoked as an explanation for the relatively stable alignment of
(specific) tonal targets in relation to the segmental string. To name one example of
consistent alignment results, Atterer & Ladd (2004) found that in rising L*+H accents
in two varieties of German, the low turning point characterising this accent was
aligned significantly later in one variety than in the other. Crucially, cast in absolute
terms, this alignment difference roughly spanned half a segment. Speakers of each
language variety behaved so uniformly that the resulting difference, realised on a very
small temporal scale, functioned as a significant predictor of the variety spoken. In
relation to these findings, the patterns discussed here are of an altogether different
nature. It might be objected that the present data come from speakers with perhaps a less
uniform background than Atterer & Ladd’s (although this is debatable given that their
dialect regions were rather broadly defined as ‘Northern’ and ‘Southern’). Two points may
nevertheless be raised in defence of considering the present study’s speakers together as
representing a single variety. Firstly, alignment patterns from the two speakers with
the same birthplace (2m and 6f, both of whose parents are also from that same town)
are not more similar than those of any two other speakers. In a similar vein, the three
speakers who grew up in Agadir (1f, 4f, 6f) do not behave more alike than any other
grouping of speakers. Secondly, even if variability in alignment between speakers could
be explained by attributing differences to specific subdialects, we would still not expect
to observe the degree of intraspeaker variability exhibited in the present data if
alignment patterns were characterised by segmental anchoring. It can be concluded that the
qword-related F0 maximum exhibits genuinely variable alignment, both within and across
speakers.
Figure 6.8: Plateau alignment for all individual repetitions of simple qwords by qword
and by speaker. Plateau onset and offset indicated by black dots, location of
absolute maximum within plateau indicated by orange dot, segments (dura-
tion normalised) separated by dashed lines and qword boundaries indicated
by solid lines.
8f), matching the observations about an additional fall (or later fall) in this particular
interrogative phrase (Figure 6.5). Further research will have to show what exactly the
location of this fall reflects, and specifically whether this apparent categorical difference
between early and late falls is meaningful. Other speakers produce plateaux on ma that
are more similar to the narrower plateaux characteristic of other qwords. Finally, the
qword manwi is characterised by another kind of variability in the realisation of its
high region, with both very wide plateaux and very narrow plateaux.
Considering all qword plateaux together, every single plateau parameter investigated
here seems to be characterised by gradient variation (with the exception of ma with two
categorically different patterns): The plateau onset, the plateau offset, the duration of
the plateau and the peak location within the plateau region. The latter measure is
crucial to a discussion that takes into account peak shape as an important perceptual
cue to listeners’ categorisation of contours (cf. Barnes et al. 2012a,b). In the present
data, the varying peak location within the plateau indicates that different peak shapes
are being produced, even for plateaux characterised by similar spans. Speaker 6f, for
example, produces plateaux on manaɡu that are similarly aligned in all her productions.
In one case, however, the peak is aligned differently from the rest, namely near the
end of the plateau, which indicates that there is a relatively shallow rise to a peak
and a steeper fall following it. Similar and more extreme variability is seen across
repetitions of the same word by most individual speakers. The different realisations
of the high region and, by extension, different peak shapes, once again suggest that
considerable variability is an intrinsic and non-spurious characteristic of intonational
events in Tashlhiyt Berber.
             1f           2m           3f           4f           6f           8f           9f
             N   w/ rise  N   w/ rise  N   w/ rise  N   w/ rise  N   w/ rise  N   w/ rise  N   w/ rise
man anu      4   4        2   0        4   0        4   2        2   1        2   2        2   1
man ananas   4   4        2   0        2   0        1   1        3   3        1   1        4   4
man aʜuli    4   4        2   0        4   0        4   0        3   3        2   0        3   2
man tizi     4   4        1   0        3   0        4   1        2   2        2   2        3   2
man tili     4   4        2   0        3   0        3   0        2   2        2   1        4   4
man butili   3   3        2   0        3   0        3   1        4   3        1   0        3   3
sum          23  23       11  0        19  0        19  5        16  14       10  6        21  16
Table 6.3: Number of tokens produced per speaker and number of these produced with
final rise.
and interactions between the two domains are well documented (e.g. Gussenhoven
2004) and have the potential to shed light on the reasons why peak alignment is so
variable. As was clear from Figure 6.10, scaling of the qword peak varies considerably
across subjects, with at the lowest end the single male speaker who produces maxima
around 200 Hertz, and at the highest end one of the female speakers, 3f, who produces
maxima in falsetto up to 850 Hertz. Such high values are not unusual for speakers of TB
(or speakers of MA for that matter), as large pitch excursions and the use of falsetto are
a recurrent feature of Moroccan speech (observed in semi-spontaneous recorded data
from Roettger & Grice (2015) and in general daily interactions).
Figure 6.11: Relatively stable scaling of F0 peak on qword for all simple and complex
qwords (N=227).
Figure 6.12: Somewhat lower scaling of F0 values at phrase onset with increasing peak
distance for all simple and complex qwords (N=227).
Returning to pitch scaling of the individual turning points, Figure 6.11 shows F0
height for the peak on the qword as a function of peak distance from the left phrasal
edge (which coincides with the start of the qword). The correlation is rather weak
overall: R² = 0.16. Speakers moreover appear to have individual preferences for the
scaling of this peak: Four speakers (2m, 3f, 4f and 9f) produce somewhat lower maxima
with increasing distance from the left phrasal edge. A further two speakers produce
more or less the same peak height irrespective of its alignment (1f and 6f) and the
remaining speaker (8f) produces marginally higher peaks with later alignment.
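For readers who wish to reproduce this kind of measure, the following sketch computes R² for a simple linear relation between two variables, here peak distance from the phrasal edge and peak height. It assumes that R² is the squared Pearson correlation of a simple linear regression, which the text does not state explicitly; the function name and the data points are invented for illustration and are not taken from the recordings.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("R squared is undefined when a variable is constant")
    return (sxy * sxy) / (sxx * syy)


# Invented example: peak distance from the left edge (ms) and peak height (Hz).
distance_ms = [60.0, 90.0, 120.0, 150.0, 180.0]
height_hz = [410.0, 395.0, 420.0, 380.0, 390.0]
print(round(r_squared(distance_ms, height_hz), 2))
```

Values close to 0 indicate that alignment accounts for little of the variance in scaling, values close to 1 that it accounts for most of it.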
Cases in which decreased peak height correlates with later alignment are especially
interesting with respect to speaker 3f, who produces the highest but also the earliest
aligned peaks. In order to produce a rise from the phrasal edge to a peak shortly after,
this speaker must produce very steep pitch rises, or limit rise excursion. While the other
speakers exhibit varying interactions between alignment and scaling of the peak, peak
height in absolute terms is not influenced much by its alignment. It seems that whereas
alignment is relatively unimportant, there is more of a requirement to produce a high
target (H) with some specific peak height.
In the main experiment, qwords were phrase-initial, so that the left qword boundary
coincided with the left IP boundary. In these cases the minimum value preceding the
peak on the qword marked the start of the rise. This stands in contrast to the pilot
experiment, where a minimum or turning point usually occurred before the left edge
of the qword when it was non-initial (Section 6.2.3). For phrase-initial qwords, the
scaling of the F0 minimum at the phrase onset can serve to shed light on whether a
compromise is made on the excursion of the rise.
Figure 6.12 shows that when the peak on the qword is aligned later, the preceding
low F0 value is typically realised lower, with the strongest effects for the two female
speakers with the highest ranges. The overall correlation of R² = 0.22 is not much
higher than for the scaling of the F0 peak but the trend is the same for all speakers. This
indicates that alignment of the peak and scaling of the low(er) F0 value at phrase onset
are in a trading relation: In cases that require a steep rise (when the peak is early), the
rise may be truncated by starting the rise at an initial F0 value that is somewhat higher.
The relative stability of peak scaling and the more variable scaling of the starting point
of the rise suggest that the peak is the more important turning point in the qword tune.
The properties of the low turning point following the peak on the qword were not
quantified here as microprosody would make elbow or minimum detection in the relev-
ant region unreliable. Based on observation, the fall following the peak usually entailed
a drop to a level similar to that of the phrase-onset F0 value, which is then maintained
until the end of the phrase, as illustrated with an example in Figure 6.13.
Figure 6.13: Example of an utterance with post-peak fall to baseline in mani ʁ izˤra
aʜuli? as spoken by speaker 4f.
6.5 Discussion
6.5.1 Towards a phonological analysis
The results reported in the previous sections have shown that qwords in direct inter-
rogatives in TB are consistently marked by a localised pitch event on the qword (see
especially Figure 6.5). There were no other noteworthy intonational events in the sen-
tence and this section will therefore be concerned primarily with the analysis of the
intonational movement that co-occurs with the qword.
All 227 tokens of qwords, 107 simple and 120 complex, were characterised by a local
maximum that occurred on the qword, or on the interrogative element man in the case
of the complex qwords. These peaks often aligned on vowels and tended not to be
realised at the very edges of the qword, especially in the case of the simple qwords.
There were no apparent phonological or other structural factors that systematically
governed peak alignment in qwords: Peaks did not consistently align with a specific
syllable in polysyllabic qwords, nor did they align in general with specific structural
units such as syllable rhymes or even specific segments. These observations about align-
ment, together with the context in which qwords were produced (with narrow focus
on the qword), strongly suggest that the peak reflects an H target that is somehow
linked to the qword, and that it should be interpreted as a prominence-marking in-
tonational event rather than an edge-marking event. This interpretation in terms of a
prominence-marking event is further supported by the arguments brought forward in
Chapter 5: Qwords are pragmatically prominent, and intonational events co-occurring
with qwords in other languages are also usually interpreted as prominence-marking.
The variability in peak alignment in this experiment however presents a puzzle. On
the one hand, its consistent occurrence on the qword, and its rather consistent scaling,
suggest that it is a central component of the qword tune. On the other hand, its align-
ment exhibits a type and degree of variability that poses problems for an analysis in
terms of association to a specific TBU.
As a first step to arrive at a phonological analysis, the peak alignment in the present
study can be compared with the variability observed in previous work on intonation
in TB. Grice, Ridouane & Roettger (2015) showed that in phrase-final position, both
in yes-no questions and statements with a contrastively focused phrase-final word, the
variability in alignment of a local high turning point in the contour could be captured
probabilistically by appealing to a number of constraints that favoured peak placement
on heavier and more sonorous syllables. This raises a number of questions with
respect to the present data. Why is the apparent tonal attraction effect of ‘prosodically
privileged’ syllables in final position, as previously reported, not observed in initial po-
sition? Unfortunately a direct comparison is not possible, as stimuli in Grice, Ridouane
& Roettger (2015) were designed to vary in terms of syllable structure and sonority,
factors that only vary in a limited way in the present qwords. Among the present
qwords, syllable weight only distinguishes two minimal word pairs, namely ma/mad
and mani/manwi. There is nevertheless little reason to believe that any differences in
peak alignment between members of these minimal word pairs should be attributed
to syllable weight, as the peak distribution in the present study and the one in Grice,
Ridouane & Roettger (2015) is fundamentally different. Grice, Ridouane & Roettger
(2015) find a discrete peak distribution, with the maximum being aligned either on
the penultimate syllable or on the final syllable, with an attraction effect for peaks
to occur on the heavier or the more sonorous of the two syllables. This categorical
tonal placement was additionally supported by the authors’ auditory impression that
the peak was located either on one syllable or the other, and by the peak’s system-
atic alignment in the rhyme of the syllable it occurred on. In the present experiment,
peaks on qwords exhibited a far more gradient distribution, with peaks also occurring
on intervocalic consonants in onset or coda position, and lacking systematic alignment
relative to any specific segment or sub-lexical structural unit. While many of the target
words in Grice, Ridouane & Roettger (2015) had obstruents in syllable onset position,
which would have prevented peaks from occurring on onsets, an explanation in terms of
segmental make-up can account for only some of the differences in alignment between
experiments. Other target words in the earlier study had liquids in onset position, and
no peaks occurred on any of these onsets either. In contrast, in the present data nasals
in syllable onset position did carry peaks.
Differences in tonal alignment between experiments could be due to many factors,
including the difference in phrasal position of target words (initial in the present study,
versus final in Grice, Ridouane & Roettger 2015), as well as differences in the function of
the intonational event (narrow focus in the present experiment versus yes-no question
modality marking and contrastive focus). Some of the differences in alignment patterns
might nevertheless still be due to differences in the segmental make-up of the words.
At this point, peak alignment patterns in qwords do not provide enough evidence
to argue for the association of the peak, in the form of an H tone, to a specific sub-
lexical unit. There are of course other potential tonal targets that might form part of
the qword tune and are perhaps more systematically aligned, such as potential L tones
on both sides of the H, which might be invoked to represent the potential requirement
to have a local rise or fall. Their alignment could unfortunately not be investigated with
the present data. The results from the pilot however can be used to argue that the rise
is not as integral a component of qword intonation as the peak. Rises in interrogatives
with non-initial qwords preferentially occurred prior to the qword, with the start of the
rise coinciding with the left phrasal edge. Alignment in these cases suggests that any
low turning point does not co-occur with the qword. Instead, if an L tonal target should
be posited to account for the presence of a steep rise, it could be interpreted as seeking
association to the phrasal edge.
Even more speculation applies to the realisation of the low turning point marking
the end of the fall. There were some differences between qwords, notably with ma
exhibiting a stepped contour down to some baseline level. All other words, including
the complex qwords, showed a pattern in which a local minimum was produced shortly
after the peak, suggesting that there does appear to be an L tonal target that accounts for
the steep fall. This will be left for future work, as the decision to include tonal targets
in a phonological representation should ideally also take into account paradigmatic
contrast. To this end, it would be useful to know whether a shallower fall, or a less
complete fall to a baseline, results in a different pragmatic interpretation from the steep
fall observed here. If so, this would form evidence to support an interpretation of an
HL tonal sequence marking the qword in the present case, in contrast to for example a
single H target to characterise cases with a shallower or no immediate fall.
In sum, the high turning point can be interpreted as representing an H tonal target
by virtue of its systematic occurrence on the qword. Since no further association to a
TBU below the level of the word could be determined, I suggest analysing the H tone
as a ‘non-metrical pitch accent’. The ‘non-metrical’ part of this characterisation follows
from the variable alignment of the H tone paired with the absence of lexically stressed
syllables, and the ‘pitch accent’ part follows the AM tradition of distinguishing between
delimitative ‘edge tones’ and culminative prominence-marking ‘pitch accents’. Even if
the analysis in terms of an H accent would have to be revised to include further tonal
targets, the absence of lexical stress in the language would still justify the use of the
term ‘non-metrical pitch accent’.
In the context of the discussion about the nature of the tonal event, it should be men-
tioned that rather different accounts of intonational movements have been proposed for
several other languages that are considered to lack stress. Intonational systems in such
languages are typically analysed in terms of predetermined tonal strings that associate
sequentially within small phrasal domains such as APs or PPs (for Korean: Jun 2005a,
for French: Post 2000; Jun & Fougeron 2002, for Mongolian: Karlsson 2014), or even
the IP (for Ambonese Malay: Maskikit-Essed & Gussenhoven 2016). The present results,
on the other hand, indicate that the intonational event marking qwords in TB can be
analysed rather straightforwardly as serving the purpose of prominence-marking, not
edge-marking. This makes the intonational event in question much like a pitch accent,
which by definition is prominence-marking, but unlike a pitch accent in the sense of its
lacking further association to a TBU, specifically a stressed syllable.
In sum, the present qword data exhibit a type of intonational marking that does not
refer to sub-lexical metrical structure. This means that Tashlhiyt Berber clearly differs,
on the one hand, from languages with lexical stress and postlexical pitch accent,
and, on the other hand, from languages like Mongolian, French and Korean, in which
there is a role for the mora or the syllable in determining the location of intonational
tones.
Finally, it is possible that an analysis in terms of non-metrical tonal association is ap-
propriate for other languages as well. A recent case has been made for Ambonese Malay,
which is another language that lacks lexical stress, as also exhibiting intonational tonal
alignment that does not involve reference to a “word-internal synchronisation point”
(Maskikit-Essed & Gussenhoven 2016). The intonational tones in question were ana-
lysed as boundary tone complexes associating with the right edge of an IP, but the
phenomenon of variable tonal alignment appears very similar to what is observed in
the present case.
2. What is the prosodic prominence status of question words? How should any pros-
odic marking be analysed phonologically?
Question 1 can be answered as follows: Qword interrogatives in TB are characterised
by a main phrasal intonational event which co-occurs with the qword. This tonal event
takes the form of a rising-falling pitch movement with a peak or high region, with the
F0 maximum always aligned on the qword. Both peak and high region were unsys-
tematically aligned, suggesting that the realisation of the intonational event is rather
unconstrained. Following the fall from the maximum there are no further intonational
events: The contour stays low until the right edge of the interrogative phrase. An exception
to this pattern is formed by complex qwords, which are optionally characterised by an
additional, edge-aligned rise at the right edge of the noun following the interrogative
word.
As for question 2, it is clear that qwords attract the main phrasal prominence, whether
the qword is in initial position (as in the main experiment) or in non-initial (medial/fi-
nal) position (as in the pilot). In the context of the main experiment, qwords were
produced under narrow focus, so that on a semantic-pragmatic account, qwords are
prominent. Consequently, the intonational event marking this constituent was con-
sidered to serve a prominence-marking rather than an edge-marking function. This
interpretation is supported by the alignment of the peak, which occurred relatively
freely within the domain of the qword, rather than near one of the edges.
While the role of low turning points in shaping the qword tune could not be analysed
in great detail, it was clear that an H target is among the defining aspects of the qword
tune in TB. This followed from i) alignment and scaling of the peak (which occurred on
the qword and was scaled at a consistent height for each speaker), ii) the scaling of the
start of the rise (which was more variable and depended in part on the alignment of
the peak), and iii) the alignment of the start of the rise (with the pilot suggesting that
the rise starts prior to the qword when enough segmental material is available).
6.6 Summary and conclusion
7 Question word interrogative intonation
in Moroccan Arabic
7.1 Introduction
7.1.1 Prior work on the intonation of Moroccan Arabic
Most of what is currently known about Moroccan Arabic intonation is limited to qualit-
ative observations, found primarily in Benkirane (1998) and Maas (n.d.: Ch. 10). The
former’s analysis includes a concise inventory of the prosodic properties that are charac-
teristic of various sentence types, including yes–no questions, declaratives, imperatives
and question word (qword) questions. Whereas Benkirane’s (1998) claims are based on
read sentences from various speakers and his observations as a native speaker, Maas’s (n.d.)
work is corpus-based, describing, on a case-by-case basis, examples of various sentences
in context.
Experimental work on MA is limited, too, and can be divided into two main thematic
areas: Prosodic marking of focus, and prosodic marking of yes–no question intonation.
For the former, two papers discuss MA focus marking with reference to other lan-
guage varieties, namely Burdin et al. (2015) and Yeou, Embarki & Al-Maqtari (2007).1
While both sets of authors consider their work to look at contrastive focus, results of the
two studies are not directly comparable. Burdin et al. (2015) investigate focus marking
in a game setting, contrasting three focus conditions in noun phrases consisting of a
noun+adjective: i) noun–only focus, ii) adjective–only focus and iii) full noun phrase
focus. Yeou, Embarki & Al-Maqtari (2007) look at contrastively focused single words
in read speech consisting of question–answer sentence pairs.
For the latter, the intonation of yes–no interrogatives, there is one study based on data
from both elicitation and semi-spontaneous interaction between speakers (Hellmuth et
al. 2015).
What is currently known about the topic of this chapter, qword interrogatives, is
limited to observations. According to Benkirane (1998: 354), the qword attracts the
phrasal pitch peak, and this peak is followed by “rapidly falling pitch on the rest of
the utterance”. This is in line, he observes, with the description of qword interrogatives
given by Rhiati-Salih (1984). Maas (n.d.: Ch. 10) also reports similar patterns,
supported by example F0 contours.
1
The same dataset is reported on, from a slightly different angle, in Yeou et al. (2007).
2. What is the prosodic prominence status of question words? How should any pros-
odic marking be analysed phonologically?
Based on the aforementioned observations there are a number of predictions for the
answer to Question 1. The first is that the qword is most likely to be prosodically
prominent and that it can be expected to carry the main pitch event in the sentence,
specifically in that it takes the form of a peak followed by a fall. The right phrasal edge
of the qword interrogative as a whole is expected to be low rather than rising.
The possible answer to Question 2 remains more open. Does the phrasal maximum
align on the qword, and how is this high turning point best represented? These ques-
tions are concerned with the nature of the relevant pitch event as edge marking (a
7.1.5 Data
This chapter reports on an experiment that involved questions being read aloud by par-
ticipants as part of a scripted mock dialogue, the transcription and translation of which
is given in Appendix B. As in Chapter 4, data are part of the IVAr corpus (Hellmuth &
Almbark 2017) and were kindly made available to me by the authors. Further details
about each specific interrogative are given in Section 7.2.3.
7.2 Methodology
7.2.1 Participants
There were 24 participants, identical to those in Chapter 4 on correlates of stress in MA
(for details see Section 4.2.1). All speakers were university students in Casablanca, with
the majority being born and raised there. Participants belonged to one of two groups
with 12 speakers each: One was a ‘monolingual’ group with participants who grew
up speaking only MA at home, and one ‘bilingual’ group with participants who spoke
Tashlhiyt as a first language in addition to MA. All speakers have some to near-native
proficiency in MSA and French.
7.2.2 Procedure
The present task was performed by pairs of participants as part of a set of recordings for
the IVAr corpus. Recordings took place in a quiet room at the Université Hassan II in
Casablanca, with participants each wearing a headset microphone and being recorded
on a separate channel. Participants were presented with a scripted dialogue (Appendix
B) printed on paper, and were instructed to play one of the roles. The dialogue was
performed twice by each pair of speakers, so that each participant performed each role
once. All sentences in the scripted dialogue are therefore available for each participant.
whq2 ʃkun
شكون ل ّ�ل� ��شه� على العرس الروم�؟
ʃkun lːi ʁajʃɦad ʕala lʕars rːumi
who rel fut.witness.ipfv.3sg.m on wedding civil
‘Who is it that is going to witness the civil wedding?’
whq3 imta
إيمتا عرس بنت عمك دينا؟
imta ʕars bint ʕamːək dina
when wedding daughter uncle.poss2sg Dina
‘When is the wedding of your cousin Dina?’
whq4 fʔaʃ
العرس غيكون فأش من مدينة؟
lʕars ʁajkun fʔaʃ min mədinə
wedding fut.be.ipfv.3sg.m in.which from city
‘In which city will the wedding be?’
whq5 fin
فين شافت دينا نبيل؟
fin ʃafit dina nabil
where see.pfv.3sg.f Dina Nabil
‘Where did Dina see Nabil?’
In addition to the factors listed in the table, whq4 differs from the other
questions in that its qword is morphologically complex, as fʔaʃ is historically decomposable
into fi ‘in’ + ʔaʃ ‘what, which’. Whq4 also differs from the rest with respect
to the type of qword, because fʔaʃ is part of the complex question constituent ‘in which
city’.2 The relevance of these factors to the realisation of qword interrogative intonation
will be discussed in Section 7.4.
7.2.4 Analysis
The theoretical total number of utterances was 144 (6 qwords × 12 speakers × 2 groups
= 144). Of these, 11 utterances were non-targetlike due to mispronunciations
or major disfluencies and were excluded. A further four utterances were excluded due to
smaller disfluencies in the vicinity of the qword. This resulted in a final number of
65 utterances analysed for the monolingual group and 64 utterances for the bilingual
group (total N=129).
2
Similarly, fin can historically be decomposed into fi ‘in’ and ʔajnə ‘where’, but this form has long been
grammaticalised.
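The utterance counts reported in Section 7.2.4 can be cross-checked with a few lines of arithmetic. The following is only a bookkeeping sketch; the variable names are illustrative and are not taken from the original analysis scripts.

```python
# Design of the read-speech task as described in the text.
QWORDS = 6
SPEAKERS_PER_GROUP = 12
GROUPS = 2

theoretical_total = QWORDS * SPEAKERS_PER_GROUP * GROUPS

excluded_major = 11  # mispronunciations or major disfluencies
excluded_minor = 4   # smaller disfluencies in the vicinity of the qword
analysed = theoretical_total - excluded_major - excluded_minor

# Per-group counts as reported in the text.
analysed_by_group = {"monolingual": 65, "bilingual": 64}

# The exclusions and the per-group counts should describe the same total.
counts_agree = analysed == sum(analysed_by_group.values())
print(theoretical_total, analysed, counts_agree)
```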
Acoustic measurements were taken in Praat (Boersma & Weenink 2015). Initial
segmentation of utterances into words and segments was performed automatically by
Prosodylab-Aligner (Gorman, Howell & Wagner 2011). Subsequently all utterances
were manually checked and segmentation was adjusted where needed. F0 measurements
are based on a version of Praat’s automatically generated pitch contour which
was manually corrected for pitch tracking errors and smoothed with a 15 Hz bandwidth.
Local minima and maxima were detected automatically and verified manually.
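To illustrate what automatic detection of turning points amounts to, the following minimal sketch finds local minima and maxima in a sampled F0 contour. It assumes that the corrected and smoothed contour has been exported as a list of (time, F0) pairs; the function name `find_turning_points` and the toy values are hypothetical, and this is not the Praat procedure used for the thesis.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (time in s, F0 in Hz)


def find_turning_points(contour: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Return (local minima, local maxima) of a sampled F0 contour.

    A sample counts as a turning point when it is strictly lower (or higher)
    than both of its neighbours; plateaus and the two endpoints are ignored.
    """
    minima: List[Sample] = []
    maxima: List[Sample] = []
    for prev, cur, nxt in zip(contour, contour[1:], contour[2:]):
        if cur[1] < prev[1] and cur[1] < nxt[1]:
            minima.append(cur)
        elif cur[1] > prev[1] and cur[1] > nxt[1]:
            maxima.append(cur)
    return minima, maxima


# Toy rise-fall contour (hypothetical values).
toy_contour = [(0.00, 180.0), (0.05, 172.0), (0.10, 190.0),
               (0.15, 240.0), (0.20, 215.0), (0.25, 170.0)]
minima, maxima = find_turning_points(toy_contour)
```

Candidates found in this way would still need the manual verification described above, since microprosodic perturbations also produce local extrema.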
Statistical analysis was performed in R (R Core Team 2016). For comparisons between
speaker groups, linear mixed models were run with the package lme4 (Bates
et al. 2015). Significance of individual predictors or interactions between predictors
was calculated by means of likelihood ratio tests (LRTs) between a main model and a
corresponding null model lacking the relevant interaction or predictor. The R syntax of
the main model is in each case given in a footnote.
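The logic of such a likelihood ratio test can be sketched independently of R. The example below assumes that the log-likelihoods of the null and main model are already known and that the two models differ by exactly one parameter; the function name and the log-likelihood values are hypothetical. For one degree of freedom, the upper tail of the chi-square distribution can be written with the complementary error function, so no external library is needed.

```python
import math
from typing import Tuple


def lrt_one_df(loglik_null: float, loglik_main: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models differing by one parameter.

    Returns the test statistic (twice the log-likelihood difference) and
    its p-value under a chi-square distribution with 1 degree of freedom.
    """
    chi2 = 2.0 * (loglik_main - loglik_null)
    # For 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))
    return chi2, p_value


# Hypothetical log-likelihoods of a null model and a main model.
chi2, p_value = lrt_one_df(loglik_null=-520.0, loglik_main=-518.0)
print(f"chi2(1) = {chi2:.1f}, p = {p_value:.3f}")
```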
7.3 Results
This section consists of three parts. First, Section 7.3.1 gives an overview of the
phrasal intonational patterns of the read sentences. What is observed there
serves to motivate the choice to focus on the alignment and scaling of specific turning
points, the initial low and the following peak, as discussed in Section 7.3.2, as well as
the time course of the fall, discussed in Section 7.3.3.
3
There was one exception in the whole dataset concerning a voiceless rendering of fʔaʃ as [fʔʃ] followed
by a rise–fall on the final word mədinə. This particular final contour is typical of yes–no questions
(Hellmuth et al. 2015) and I will assume that it was wrongly interpreted as a yes–no question (especially
as fʔaʃ can also be a question particle in a yes–no question). This particular utterance will further be
ignored.
Figure 7.1: F0 contours for all target utterances (N=129; normalised duration based
on 10 equidistant measuring points throughout each word). Dotted lines
delimit target qword.
due to four different speakers (two male, two female), but in the monolingual group
there was one female speaker, f3, who produced five of the total eight rises. Finally,
rises were spread across most utterances, although ʃnu (2) lacked final rises altogether.
Interrogatives with ʃnu (1), on the other hand, exhibited relatively many rises (27%, or
six out of a total 22).
In sum, final rises in qword questions in this experiment are optional and, at
only 10% of all qword questions, clearly the marginal option, dispreferred relative to
level or falling final intonation. The distribution of final rises also suggests
that there might be something specific about the interrogative with ʃnu (1) as opposed
to all other interrogatives (the attested six rises among the total 22 renderings for this
sentence represent a likelihood of 0.014, given the overall frequency of 10% rises). Plausible
explanations could appeal, on the one hand, to the notion of epistemic bias: This
particular question involved only given constituents that were all overtly mentioned in
the preceding turns, and the speaker would presuppose that the interlocutor knows the
answer (cf. Warren 2016 on similar contextual effects influencing final rising intona-
tion or ‘uptalk’ in English). On the other hand, final rises might be linked to the higher
social cost associated with asking a detailed follow-up question requesting highly per-
sonal information (see Chen 2012 for the suggestion that higher social cost of asking
certain questions might result in deviant accentuation of qwords in Dutch). In any case,
the presence of final rises does not in any obvious way correlate with any structural
linguistic factors, including information structure of the phrase.
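The likelihood figure for the ʃnu (1) interrogative can be approximated with a simple binomial calculation. The sketch below assumes that each rendering independently carries a final rise with probability 0.10 (the overall rate mentioned above); the function names are illustrative. Under this assumption, the probability of exactly six rises in 22 renderings rounds to 0.014, and the probability of six or more rises is only slightly larger.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))


OVERALL_RISE_RATE = 0.10  # assumed per-utterance probability of a final rise
RENDERINGS = 22           # renderings of the interrogative with ʃnu (1)
RISES = 6                 # attested final rises for this interrogative

p_exactly_six = binomial_pmf(RISES, RENDERINGS, OVERALL_RISE_RATE)
p_six_or_more = binomial_upper_tail(RISES, RENDERINGS, OVERALL_RISE_RATE)
print(f"{p_exactly_six:.4f} {p_six_or_more:.4f}")
```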
For the sake of a clearer between-group and between-qword comparison, the averaged
time-normalised F0 contours are shown in Figure 7.2 (N=13 with final rise
and N=1 with final rise–fall excluded). Just as in the above figure, these contours
were based on frequency sampling at ten equidistant measuring points per word, but
are plotted retaining information about average word duration as well. It can be seen
that the average duration of the interrogative phrase with imta was shortest, and that
among the qwords, ʃnu (2) had the shortest duration.4
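Time normalisation of this kind can be sketched as follows. The example assumes that each word is sampled at ten equidistant time points that include the word's start and end, and that F0 between measured samples is obtained by linear interpolation; both choices, as well as the function names and toy values, are assumptions made for illustration only.

```python
from bisect import bisect_right
from typing import List, Sequence


def interpolate(times: Sequence[float], values: Sequence[float], t: float) -> float:
    """Linearly interpolate a contour at time t (clamped at both ends)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def sample_word(times: Sequence[float], values: Sequence[float],
                start: float, end: float, n_points: int = 10) -> List[float]:
    """Sample F0 at n_points equidistant times between word start and end."""
    step = (end - start) / (n_points - 1)
    return [interpolate(times, values, start + k * step) for k in range(n_points)]


# Toy contour (hypothetical): rise to 210 Hz at 0.3 s, then fall to 180 Hz.
times = [0.0, 0.3, 0.9]
values = [150.0, 210.0, 180.0]
samples = sample_word(times, values, start=0.0, end=0.9)
```

Because every word contributes the same number of samples, contours from words of different durations can be averaged point by point and then plotted against mean word duration.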
Figure 7.2: Averaged and time-normalised interpolated F0 contours across all target
utterances without a final rise or rise–fall (N=105). Dotted lines delimit target
qword.
Two main observations can be made based on this figure. The first is that the global
contours for the different interrogative phrases look very similar to each other (except-
ing, for the moment, the phrase with medial qword fʔaʃ). Across the board, the qword
carries what seems to be the one and only intonational prominence in the phrase, and
it takes the form of a rising–falling pitch movement in most cases. The second point
is that these mean contours highlight the apparent absence of differences between the
two groups of speakers. There are no categorical differences, and the overall contours
look near-identical, suggesting that any differences in qword question intonation linked
to the linguistic background of speakers are expected to be subtle at best.
These preliminary findings are compatible with Benkirane’s (1998) observation that
the qword attracts the utterance maximum and is followed by “a rapid fall”. Clearly, a
rise–fall in the vicinity of the qword is a defining intonational characteristic of qword
questions.
4
The considerable dip in the F0 contour occurring around 500 ms. in interrogatives with fin is due to
microprosody (the mostly voiced sequence /t d/ in /ʃafit dina/) and does not represent a linguistically
meaningful intonational movement.
In the next sections I focus on the intonational movement characterising the qword,
or qword tune, trying to define any regularities defining the relationship between the
tune and the text for this particular word.
5
These frequency counts come from the smoothed contours but are almost identical to the peak distribu-
tion in absolute contours.
differences in absolute alignment relative to the qword start between the two groups
of speakers (LRT: χ²(1) = 1.8, p=0.17),6 and results are therefore shown pooled in
Figure 7.3. For all six qwords, the absolute distance between the word start (indicated
by dotted line) and the F0 maximum varies considerably. Mean peak alignment ranges
from 174 ms. for bilingual imta to 244 ms. for monolingual ʃkun, but these varying
means are not very informative given the large alignment differences within single words.
In fact, the smallest range of variation in alignment (latest minus earliest) for a single
word comprises 185 ms. for ʃnu (2).
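The alignment measures used here are simple to compute once peak times and word boundaries are known. The following sketch assumes time stamps in seconds and uses hypothetical tokens of a single qword; it illustrates absolute alignment relative to the word start, its mean, and the range of variation (latest minus earliest).

```python
from statistics import mean


def peak_alignment_ms(peak_time_s: float, edge_time_s: float) -> float:
    """Signed distance from a qword edge to the F0 peak, in ms."""
    return (peak_time_s - edge_time_s) * 1000.0


# Hypothetical tokens of one qword: (qword start, peak time), in seconds.
tokens = [(0.120, 0.300), (0.100, 0.395), (0.150, 0.330)]

alignments = [peak_alignment_ms(peak, start) for start, peak in tokens]
mean_alignment = mean(alignments)
alignment_range = max(alignments) - min(alignments)  # latest minus earliest
```

The same function yields alignment relative to the right edge when the qword end is passed as the reference point; peaks before that edge then receive negative values.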
Figure 7.3: Alignment of qword-related peaks for all six qwords relative to the left edge
of the qword (dotted line).
Nevertheless, maximum alignment is highly correlated with the word start. Predict-
ing maximum alignment (absolute values) based on the qword start revealed a signific-
6
peak alignment from qword start ∼group + (0+group|qword) + (1|speaker)
Figure 7.4: Alignment of qword-related peaks for all six qwords relative to the right
edge of the qword (dotted line).
The qword ʃkun in whq2 occurred in a cleft, and it seemed possible that peak align-
ment for this qword might be better correlated with the end of the larger constituent
that includes clefted /lːi/. A quick comparison shows that this is not the case. Maximum
alignment was better predicted with respect to the (simple) qword edge than with the
extended qword constituent edge (correlation coefficients of r=0.54 and r=0.34, re-
spectively).10
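The logic of this comparison can be illustrated with a plain Pearson correlation. This is a simplification: the coefficients reported above derive from the models given in the footnote, which additionally include group and a by-speaker intercept. All time values below are hypothetical and serve only to show how two candidate reference points can be compared.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical time stamps (s) for four tokens of a clefted qword.
peak_times = [0.30, 0.42, 0.35, 0.50]
simple_edge = [0.28, 0.40, 0.36, 0.47]    # end of the qword itself
extended_edge = [0.45, 0.50, 0.60, 0.55]  # end of qword plus clefted material

r_simple = pearson_r(peak_times, simple_edge)
r_extended = pearson_r(peak_times, extended_edge)
better_reference = "simple" if r_simple > r_extended else "extended"
```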
7
peak alignment ∼qword start + group + (0+qwordstart|qword) + (0+qwordstart|speaker)
8
peak alignment from qword end ∼group + (0+group|qword) + (1|speaker)
9
peak alignment ∼qword end + group + (0+qword end|qword) + (0+qword end|speaker)
10
peak alignment ∼qword end + group + (1|speaker) and peak alignment ∼extended qword end +
group + (1|speaker)
Table 7.3: Mean alignment values of maxima, relative to qword start and qword end
(ms.), SD (ms.) in brackets.
In order to directly compare the magnitude of this variation with reference to each
edge, Table 7.3 lists, per word, the mean alignment values across all participants. Two
main patterns are observed in the data, the first to do with average alignment and the
second with the degree of variability in the data.
Firstly, peak alignment is on average very close to the right qword edge, much closer
than to the left qword edge. In case of a global analysis only, this might be taken to
argue for right edge-alignment of the peak and subsequent interpretation as association
to the qword edge. However, given the overall variation and different behaviour of
individual words, this grand mean falsely suggests that all peaks align with the right
qword edge.
Secondly, the large standard deviations for both measures for each qword indicate
that the means are not very informative. Since there were high correlations for peak
alignment with both word edges, it seems that the edges play some role in peak align-
ment. The qword edges nevertheless do not constitute strong absolute limits on the
domain within which peaks are realised, nor do peaks align systematically within the
qword domain when they do occur within it. The range of absolute alignment values
seems especially large in comparison with results reported in earlier studies that set out
to test predictions of the segmental anchoring hypothesis (see Chapter 2).
In order to make this comparison explicit, I will quickly review the details on align-
ment from two much-cited studies. In Ladd, Mennen & Schepman (2000), the authors
investigate rising accents in Dutch by looking at the alignment of low and high turning
points marking the start and end of a rising movement on a stressed syllable. For the
high turning point, mean speaker alignment values ranged from 10.8 ms. (short vowel
condition) to 19.9 ms. (long vowel condition) (based on their Table 1). Similarly, in a
study on rising L*+H accents in two dialects of German, Atterer & Ladd (2004) found
that the temporal domain across which high turning points occurred spanned 50 ms.
with standard deviations of 13 and 17 ms. (values reflecting means per speaker/group
in their Table 1). The high turning point is subsequently interpreted as the phonetic
reflex of a H trailing tone (and the accent thus being L*+H). Trailing tones are gener-
ally analysed as such by virtue of being less systematically aligned than the preceding,
starred, tone. This suggests that the range of variation for H alignment attested in At-
terer & Ladd (2004) is already relatively large compared to tonal alignment of starred
tones.
While these measures of variation in alignment concern small groups of speakers,
and controlled speech under laboratory conditions, it does seem that the alignment
patterns in the present data are of an altogether different nature. In the present data,
the temporal domain across which maxima are realised is, as mentioned previously,
91.8 ms. in the case of the ‘smallest’ domain, and standard deviations for the variation
in alignment for individual words start at 49 ms. Even if not directly comparable, these
numbers are indicative of substantially different degrees of variability in alignment.
The next section will delve into alignment just a bit further, to see if relative measures
might still reveal a somewhat more systematic pattern otherwise missed. The phrase-
initial peak in MA qword interrogatives is apparently not systematically aligned to any
part within the qword, but the qword edges nevertheless delimit the domain within
which peaks are realised. So far, this is in line with the initial hypothesis that there
is no segmental anchoring for the peak in the qword rise–fall in MA. It could still be
conjectured, however, that some of the variation seen here can be explained away by
accounting for inter-speaker differences in speech rate. Additionally, the segmental
make-up of the words in the experiment varied and some words were more likely to
have post-qword peaks than others (see Table 7.2). This suggests that some more variation
can be accounted for by investigating peak alignment relative to segments.
In order to address whether one or both of the above factors might play a role in de-
termining peak alignment, the next subsection will investigate peak alignment relative
to segment duration.
11
Since maxima were extracted automatically based on smoothed and interpolated F0 contours, maxima
may also occur on voiceless segments.
12
normalised peak alignment ∼group + (0+group|qword) + (1|speaker)
Secondly, there is a word-specific effect for imta and fʔaʃ in that peaks in these words
typically align before the right edge of the qword. An obvious explanation would appeal
to the fact that imta is a disyllabic word and has a first segment that is phonetically voiced
(both unlike all other qwords). Interestingly, in none of its 21 renderings does the peak
actually occur on the initial vowel, with the /i/ instead being used to carry a rising
pitch movement. This suggests that if enough voiced segmental material is available,
both a rise and a peak will be realised on the qword. Under suboptimal circumstances,
as in all other qwords, the peak might be realised to the right of the qword, or the
initial rise gets truncated. To what extent either one of these possible strategies is more
consistent with the observed data will be explored in the next section in which scaling
is investigated.
Returning to the issues mentioned at the end of the previous section, it seems that
even if speech rate and segmental make-up are considered, peak alignment remains
highly variable. Based on both absolute and relative peak alignment results I conclude
that F0 peaks associated with MA qwords are not aligned systematically with reference
to segmental anchors. F0 peaks lacked consistent alignment in terms of:
• absolute distance from the start of the qword (left edge) (Figure 7.3)
• absolute distance from the end of the qword (right edge) (Figure 7.4)
Nevertheless, peak location does seem to be governed to some extent by the edges
of the qword domain, as peak alignment exhibited high correlations with both edges.
From a pragmatic point of view (see discussion of prominence and focus in qword
interrogatives, Chapter 5) the qword is a good candidate for a domain to which a pitch
event might seek to associate. If the qword domain is indeed relevant in governing tune-
to-text association, it is worth exploring whether properties of the qword tune beyond
alignment can explain why peaks in some cases are realised outside this domain.
13
2 ST is chosen here as a somewhat arbitrary cut-off point for perceptible changes in F0 (see also Section
6.3.4).
Maxima that are aligned later are not convincingly higher (weak correlations with R²
varying from 0.0001 for ʃkun to 0.19 for ʃnu (2)). Correlations can be expected in cases
where there are temporal constraints on the realisation of a series of tonal targets, especially
if low and high targets occur in quick succession. In such cases the scaling might
be compromised or alignment delayed (see also Chapter 6). The lack of a correlation
between scaling and alignment of the high turning points in MA qwords suggests no
such bidirectional effects on realisation.
To consider the rising movement more holistically, Figure 7.6 serves to show the
relation between the scaling of the start and the end of the rise (minimum F0 value
near left phrasal edge and peak on or after qword, respectively). There is a correlation
between these two, manifested as a clear linear relationship indicating that the
ratio of maximum relative to minimum is constant: For any given minimum, the following
maximum is scaled roughly 42% higher (overall R² = 0.86). Similarly, a linear
model taking into account by-item and by-speaker variation also finds a strong dependency
between the turning points in the scaling domain (LRT: χ²(1)=32.7, p<0.001,
β=1.01).14 Of course, this correlation on its own does not give any information about
the directionality of the effect.
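A constant ratio between maximum and minimum corresponds to a constant interval on a logarithmic scale. As a rough illustration, the sketch below converts the reported ratio of roughly 1.42 into semitones using the standard formula of 12 times the base-2 logarithm of the frequency ratio; the function name and the example frequencies are hypothetical.

```python
import math


def semitones(f_high_hz: float, f_low_hz: float) -> float:
    """Interval between two frequencies in semitones (ST)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# A maximum scaled 42% above the minimum, expressed as an interval.
rise_st = semitones(1.42, 1.0)

# The interval depends only on the ratio, not on the absolute register:
# a hypothetical low and high speaker with the same ratio rise equally far.
rise_low_speaker = semitones(142.0, 100.0)
rise_high_speaker = semitones(284.0, 200.0)
print(f"{rise_st:.2f} ST")
```

On this conversion the average rise spans about 6 ST, well above the 2 ST cut-off used elsewhere in this chapter for perceptible changes in F0.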
Figure 7.6: Scaling of qword minima and maxima, regression lines given for each qword
separately.
represent the rising movement. Moreover, if one of the defining prosodic characteristics
of qwords involves the presence of a rising F0 movement, and the rise starts at the
left edge of the qword (or soon thereafter at the onset of voicing), then variable peak
alignment can be analysed as following from the requirement to realise some degree of a
rise.
Similar argumentation, in which an L tonal target is posited based on a scaling dependency
with a following H target, is found in Ladd & Schepman (2003). These
authors argue for the existence of an L target based on a scaling dependency between
a low and high turning point in the context of a so-called sagging transition between
two H targets in British English. The particular low turning point they discuss had
previously been considered a mere transition effect with a phonetic explanation rather
than a phonological target.
• the F0 maximum marking the peak is aligned variably: It may occur on or after
the qword
I interpret these findings as providing evidence for the qword tune involving both an
L and an H target, in the form of an L+H bitonal accent. No starred tone is included
for two reasons, which largely overlap with the reasons given for the
absence of a starred tone in the TB qword tune (see Chapter 6).
Firstly, the alignment of the L is not entirely clear, and there is much attested variability
in the alignment of the H. If the alignment of the high turning point had been
consistent with respect to a specific sub-lexical phonological unit, then this would have
provided some evidence for phonological association to this unit. There is however no
apparent moraic or syllabic unit that serves the purpose of tonal anchoring.
Secondly, and more fundamentally, the existence of stress in MA is itself uncertain
(see Chapter 4). If MA has no lexical–metrical positions in the form of
stressed syllables, then these cannot be expected to play a role in determining tonal
association, including starredness, in the first place.
A possible analysis of the L+H tonal complex is that it seeks association to the qword,
and the fact that the peak is not always aligned within the domain
of the qword can then be explained by the requirement to realise at least some rising
movement. This discussion will be taken up in more detail in Section 7.4.
7.3 Results
It is clear that there is a steep fall following the peak in all cases: On average 28%
of the entire range is covered by the F0 distance between peak and immediately fol-
lowing vowel, and the 50% mark is reached around the second vowel after the peak
in all utterances. The trajectories then start to diverge. This seems best explained by
appealing to the number of vowels between peak and end of utterance. Fʔaʃ is followed
15
The 100% mark is not reached for all utterances because the means average across all individual
fall trajectories, including those that exhibit smaller rising movements at any point after the peak
(utterances with a final rise were excluded from the mean calculation).
7 Question word interrogative intonation in Moroccan Arabic
by only four vowels and has a much steeper fall than the longer utterances with ʃnu (2).
The other four utterances have different lengths but nevertheless display highly similar
trajectories, suggesting that the differences in steepness of the fall for the short and
long utterances are indeed best interpreted as adjustments from a default.
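The percentages reported here express how much of the total fall has been completed at a given vowel. A minimal sketch of such a normalisation is given below. It assumes one F0 value per post-peak vowel and a range running from the peak to the lowest post-peak value; the function name and the values in Hz are hypothetical and not taken from the dataset.

```python
def fall_percentages(peak_hz, vowel_f0_hz):
    """Return, for each post-peak vowel, the percentage of the
    peak-to-minimum F0 range that has been covered at that vowel."""
    minimum = min(vowel_f0_hz)
    total_range = peak_hz - minimum
    if total_range <= 0:
        raise ValueError("peak must lie above the post-peak minimum")
    return [100.0 * (peak_hz - f0) / total_range for f0 in vowel_f0_hz]


# Hypothetical utterance: peak at 300 Hz, four vowels after the peak.
percentages = fall_percentages(300.0, [272.0, 250.0, 220.0, 200.0])
```

Expressing each trajectory relative to its own peak and minimum is what makes utterances with different absolute F0 values comparable.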
The results for the fall are informative with respect to the earlier finding that peak
alignment was not systematic. The fall trajectory shows that once the peak is considered
relative to what follows (i.e. in controlling for varying alignment by taking the peak as
reference point), the observed contour is comparable across utterances, with the traject-
ory of the fall being highly similar for the first two vowels following the peak. Results
also indicate that there is some effect of utterance length in that there are adjustments
in steepness of fall after the second vowel following the peak: The longer the utterance,
the shallower the fall. Utterances are nevertheless alike in exhibiting a similarly steep
fall immediately following the qword peak.
On the one hand, this could be taken to argue in favour of the presence of another low
tonal target shortly after the peak. Further support for this comes from the idealised
scenario with a linear fall between the peak and the phrasal minimum at the end of
the utterance. This linearly-interpolated fall scenario is illustrated for the longest and
shortest utterances in Figure 7.7 with a dashed line. It can be seen that the trajectories
for the falls as they are attested in the present dataset deviate considerably from a linear
fall. While no interpolation between intonational tonal targets is ever really linear, the
difference seems appreciable enough to look like there should be another low target
causing the rather steep fall.
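The comparison with a linear fall can be sketched as follows. The sketch makes the simplifying assumption that the post-peak vowels are equally spaced between the peak and the end of the utterance; the function names and the observed percentages are illustrative only and do not reproduce Figure 7.7.

```python
def linear_fall_percentages(n_vowels):
    """Percentage of the fall completed at each post-peak vowel if F0
    fell linearly from the peak to the final vowel (equal spacing assumed)."""
    return [100.0 * i / n_vowels for i in range(1, n_vowels + 1)]


def deviation_from_linear(observed):
    """Observed minus linear percentage at each vowel; positive values
    mean the fall is ahead of (steeper than) the linear scenario."""
    linear = linear_fall_percentages(len(observed))
    return [obs - lin for obs, lin in zip(observed, linear)]


# Hypothetical observed percentages for an utterance with eight post-peak vowels.
observed = [28.0, 50.0, 60.0, 68.0, 76.0, 84.0, 92.0, 100.0]
deviations = deviation_from_linear(observed)
```

In this made-up example the deviation is largest around the second vowel after the peak and shrinks towards the end of the utterance, which is the kind of pattern that would motivate positing an additional low target.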
On the other hand, the inflection points in this figure do not necessarily represent
turning points in the actual F0 contours. At the very best, they give an indication as to
where steeper falls turn into shallower falls. Further work will be needed to determine
the location of an actual low turning point following a qword peak in MA, and to
investigate the possibility that there is a second L phonological target involved in the
MA qword tune.
7.4 Discussion
7.4.1 Towards a phonological analysis
The above findings showed that the MA qword tune consists of a sharp rise, starting at
the left qword edge, to a peak reached on or shortly after the qword, and a relatively
steep fall following this peak. Earlier in this chapter (in Section 7.2.3), it was men-
tioned that there were differences between the interrogatives investigated in terms of
phonological and syntactic structure, and there might have also been pragmatic differ-
ences (since each sentence occurred at a different point in the scripted dialogue). None
of these factors had an obvious effect on the intonational structure of the interrogat-
ives, since i) qwords in all interrogatives received the main phrasal pitch prominence,
ii) there were no other prominence-marking events in the utterance, and iii) the degree
of variability in phonetic realisation within and between utterances suggests that there
• For a preceding L: The local character of the rise towards the peak. There was
a steep rise preceding the peak on the qword. This is especially informative in
the case of whq4 where the rise towards the phrase-medial qword peak could
theoretically have started much earlier than at the onset of the qword.
• For a trailing L: The relatively steep fall immediately following the peak (Figure
7.7), and the notable absence of contours in which the high region on the qword
was extended into a high plateau, as in the case of TB (Chapter 6), or as reported
for Spanish (Prieto 2004; Henriksen 2014).
These observations could be taken as support for various analyses, for example along
the lines of LH, HL, or even LHL non-metrical pitch accents.16
16
Tritonal pitch accents are typically avoided in AM-style analyses but are proposed in a few cases, e.g. El
Zarka 2011 for Egyptian Arabic.
this point it is simply not known whether there is an interpretative difference between
a gradual and a steep fall following the qword peak, and by extension, if a trailing L
should be posited to distinguish one pragmatic meaning from another.
As for syntagmatic contrast, on the other hand, all work so far has noted the existence
of local rising–falling movements (rather than, for example, extended high regions,
rises without sharp falls or falls without sharp rises). It is not clear however whether
Yeou, Embarki & Al-Maqtari’s (2007) contrastive focus movement is any different from
the present qword rise–fall either in phonetic realisation or in terms of the domain of
association (although it seems that both rise–falls take words rather than individual
syllables as the domain over which they are realised, see also Section 7.1.2).
2. What is the prosodic prominence status of question words? How should any pros-
odic marking be analysed phonologically?
3. Are there any speaker group differences between Tashlhiyt/Moroccan Arabic bi-
linguals and non-Berber Moroccan Arabic speakers?
7.5 Summary and conclusion
ation of the tonal movement. Secondly, the absence of a role for stressed syllables in
determining tune-to-text association strongly suggests that phonological tune-to-text as-
sociation does not need to refer to TBUs below the lexical level. In conclusion, qword
interrogatives in MA are interpreted as involving a non-metrical H accent that associ-
ates to the qword.
Part IV
Prominence perception
8 Prominence deafness in Tashlhiyt
Berber and Moroccan Arabic speakers
8.1 Introduction
8.1.1 Prior work
8.1.1.1 The phenomenon of stress deafness
Over the last two decades, a very productive line of research has investigated the
perception of prosodic contrasts reflecting lexical stress in participants with different
language backgrounds. Much work has been done by Dupoux, Peperkamp and col-
leagues (Dupoux et al. 1997; Dupoux, Peperkamp & Sebastián-Gallés 2001; Dupoux
& Peperkamp 2002; Peperkamp & Dupoux 2002; Dupoux et al. 2008; Skoruppa et al.
2009; Dupoux, Peperkamp & Sebastián-Gallés 2010; Peperkamp, Vendelin & Dupoux
2010). More recently, other groups of researchers have also conducted experiments
inspired by this work (Correia et al. 2015; Rahmani, Rietveld & Gussenhoven 2015;
Hellmuth, Muradás-Taylor & Karrinton to appear).
The main shared finding across all experiments is that certain groups of participants
(i.e. native speakers of specific languages) exhibit ‘stress deafness’. A prototypical stress
deaf group is formed by native French speakers, in contrast to for example Spanish
natives. Stress deaf listeners struggle with reliably distinguishing between stimuli that
vary in the location of prosodic prominence. Their relative inability to deal with a
prosodic prominence contrast is all the more robust because this behaviour is not only
different from other listener groups, but also stands in contrast to the same listeners’
ability to deal with segmental contrasts. Stress deaf and non-stress deaf participants
typically perform similarly well on segmental-phonological contrasts.
Over the years, stress deafness has been tested in slightly varying ways, and the
perceptual tasks and stimuli have varied considerably between experiments. In the
following, I will describe a typical stress deafness experiment as conducted in the later
studies (i.e. Dupoux et al. 2008; Dupoux, Peperkamp & Sebastián-Gallés 2010; Rahmani,
Rietveld & Gussenhoven 2015).
The task that has proved to reliably yield stress deafness effects is a so-called sequence
recall task (SRT). In an SRT, listeners are presented with sequences of words (typically
nonce words in the native language of the participants) that differ only in terms of
where the main prominence is located. Two example word pairs that have been used to
test the ‘stress’ or prosodic contrast are /ˈnumi/ ∼/nuˈmi/ and /ˈmipa/ ∼/miˈpa/ (as in
Dupoux et al. 2008; Dupoux, Peperkamp & Sebastián-Gallés 2010). The exact phonetic
correlates of the difference between these words will be discussed below. Example word
pairs exhibiting a segmental contrast as used in SRTs are /ˈfiku/ ∼/ˈfitu/ and /ˈmunu/
∼/ˈmuku/.
The first part of the task involves a learning phase that serves to ensure that parti-
cipants correctly associate the contrasting members of a minimal stress pair with des-
ignated keyboard keys (e.g. key ‘1’ for initial stress, and key ‘2’ for final stress). In this
phase, individual words are presented to the listener, and the answer involves a forced
choice between the two categories.
The actual experiment, the SRT, follows the learning phase and involves longer se-
quences of these words, ranging from two to six words. An individual experimental
trial thus could involve a three-word sequence such as /ˈnumi/ /nuˈmi/ /ˈnumi/, or
/ˈmuku/ /ˈmunu/ /ˈmuku/, followed by the word “OK”. The playing of the word “OK”,
typically in another voice, serves to prevent listeners from using acoustic memory, so
that the task taps into short-term memory and by extension reflects listeners’ categor-
ical representations. Listeners respond to what they have just heard by typing in what
they think are the appropriate keys in matching order (e.g. 121 for the aforementioned
sequences).
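The response coding described above can be sketched as follows. The sketch assumes all-or-nothing scoring, in which a trial counts as correct only if the whole key sequence matches; the function names are hypothetical and are not taken from any of the cited studies.

```python
# Key assignment as in the example above: '1' for initial, '2' for final stress.
KEY_FOR_WORD = {"ˈnumi": "1", "nuˈmi": "2"}


def expected_keys(sequence):
    """Map a sequence of presented words to the string of keys a
    listener should type."""
    return "".join(KEY_FOR_WORD[word] for word in sequence)


def is_correct(sequence, response):
    """A trial counts as correct only if every key matches, in order."""
    return response == expected_keys(sequence)


# The three-word prosodic trial given as an example in the text.
trial = ["ˈnumi", "nuˈmi", "ˈnumi"]
```

Under this scoring, a response of 121 to the example trial is correct, whereas 112 is an error even though two of its three keys are in the right position.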
The crucial finding is that on the trials involving more than two words, different
groups of listeners exhibit markedly different behaviour, where those that exhibit low
accuracy on the prosodic contrast are considered to be stress deaf. This low accur-
acy is understood in two ways: i) low compared to the same participants’ scores on
the SRT involving the segmental contrast, and ii) low compared to the scores of other
groups of participants (with a different native language) on the prosodic SRT. A second
finding of these experiments is that response accuracy decreases at longer sequences
for all groups of listeners, irrespective of whether they are considered stress deaf or
not, and irrespective of whether the contrast is segmental or prosodic. The overall
difference in performance between stress deaf and non-stress deaf participants on the
prosodic SRT is nevertheless maintained even at longer sequences. Both types of ef-
fects (asymmetry between groups of participants and decreasing performance with in-
creasing sequence length) are shown for the five groups tested in Rahmani, Rietveld &
Gussenhoven (2015), in Figure 8.1.
This graph illustrates a typical stress deafness effect as occurring in some participant
groups as opposed to others. Firstly, all listener groups give correct responses for more
than half of the trials on the segmental SRT, irrespective of sequence length (although
accuracy does decrease somewhat with increasing sequence length). Secondly, clear
differences between the groups arise in the case of the prosodic, but not the segmental
SRT. Of the five groups that featured in the study by Rahmani, Rietveld & Gussen-
hoven, participants whose native language is Persian (Farsi), French or Indonesian are
considered stress deaf, whereas the Dutch and Japanese participants are not. Dutch
and Japanese participants give correct responses to more than half of the trials with se-
quence lengths of three and four words, and to about half of the trials with a sequence
length of five words. In contrast, at no sequence length do the other three groups
reliably recall sequences of words that differ in the location of prosodic prominence.
Figure 8.1: Mean accuracy scores for segmental and prosodic SRTs, per native language
group and per sequence length, based on average participant scores from
Rahmani, Rietveld & Gussenhoven 2015 (Appendix). Dotted line reflects
50% or half of the responses correct.
It should be noted that stress deaf participants can nevertheless perceive the prosodic
difference between stimuli such as /ˈnumi/ and /nuˈmi/: This is why most of them
pass the learning phase of the task, where they learn to associate prosodically differing
words with a specific key. Words presented for discrimination in this phase are presen-
ted in isolation, which removes the element of memory storage seen in the sequence
recall phase. Additionally, an earlier study also showed that French (stress deaf) par-
ticipants’ performance on a task that appeals to purely phonetic–acoustic distinction
skills, such as an AX discrimination task, is clearly better than on SRTs (Dupoux et al.
1997). These observations highlight that stress deaf listeners are not incapable of hear-
ing a phonetic difference between prosodically different stimuli. Rather, they struggle
to encode the difference at a more abstract level, so that in more demanding tasks that
tap into categorical–phonological representation their performance deteriorates. Ex-
actly how linguistic background contributes to causing this stress deafness effect will
be discussed in Section 8.1.1.3.
same definition of stress as the one given in this thesis (‘the phonological property of
a syllable within a word of being special’). In fact, it is crucial to note that the surface
prosodic contrast between stimuli used in most stress deafness experiments is only partly
due to lexical stress proper. Specifically, typical stimuli used across all experiments are
(modified versions of) words spoken in isolation by a Dutch native speaker. Isolated
words form an IP on their own and are, therefore, subject to sentence-level prosody.
Isolated words thus exhibit what I will refer to as ‘double enhancement’: Characterist-
ics of lexical stress proper paired with characteristics of phrase-level prosody. Lexical
stress, in the case of Dutch, manifests itself as differences between the syllables in terms
of spectral tilt, vowel quality and duration. Phrase-level prosody involves a nuclear
pitch accent associated with the stressed syllable, and phrase-final lengthening primar-
ily targeting the final rhyme of the final syllable (cf. Cambier-Langeveld 2000). ‘Double
enhancement’ defined this way can be contrasted with ‘single enhancement’, referring
to acoustic differences caused by lexical stress proper, in the absence of phrase-level
prominence. This single enhancement is often referred to as ‘acoustic correlates of
stress’, as also discussed in Section 2.3.3.
In sum, the stimuli used in SRTs are typically of the double enhancement type.1
The acoustic properties of one such set of stimuli are exemplified in Figure 8.2 for one
realisation of the words /nuˈmi/ and /ˈnumi/ each, spoken by a Dutch male speaker.
These particular stimuli are from Rahmani, Rietveld & Gussenhoven, but very similar
Dutch stimuli were used in a number of earlier studies (Dupoux et al. 2008; Dupoux,
Peperkamp & Sebastián-Gallés 2010; Peperkamp, Vendelin & Dupoux 2010).
Figure 8.2: Spectrogram, waveform and F0 contour for Dutch pseudo-words /nuˈmi/
(left) and /ˈnumi/ (right) spoken by a male Dutch native speaker. Stimuli
from Rahmani, Rietveld & Gussenhoven (2015)
Clearly, the surface acoustics of these stimuli exhibit big differences: The stressed
syllable is longer than its unstressed counterpart in the same position, the stressed
syllable carries most of the pitch prominence on the word and it is spectrally enhanced.
Nevertheless, the individual words are by no means mirror images of each other in
terms of prominence, i.e. it is not the case that initial stressed syllables are marked by
exactly the same type of enhancement as final stressed syllables. As this is relevant for
1
The main exception being Correia et al. (2015), which will be discussed below.
the discussion in Section 8.4.3, a brief dissection of the prosodic properties associated
with both words is given here.
Firstly, while the nature of the pitch accent associating to the stressed syllable is the
same for initial and final stressed syllables (according to the ToDI annotation system
for Dutch intonation it would be H*L in both cases, cf. Gussenhoven 2005), this pitch
prominence is phonetically realised differently in the two words, with a much steeper
rise-fall on the word with final stress than on the word with initial stress. This is mainly
due to the phrasal position of the syllable the pitch accent belongs with. Final lexical
stress in /nuˈmi/ corresponds to absolute phrase-final position, while penultimate stress
in /ˈnumi/ coincides with penultimate position in the phrase. This results in the differ-
ence in the pitch movement, which in /nuˈmi/ is a rise-fall fully realised on the stressed
syllable, as opposed to in /ˈnumi/ where the fall is realised mostly on the non-stressed,
final syllable. This difference can be accounted for if we consider that the phrase-final
syllable also carries the modality marking L% IP boundary tone used for declarative ut-
terances. This results in a quick succession of tonal targets (H*L L%, i.e. a pitch accent
followed by boundary tones) on the final syllable in the case where it is stressed (i.e. in
/nuˈmi/).
Secondly, the durational asymmetry between the first and second syllable in /nuˈmi/
is more pronounced than the one in /ˈnumi/, a fact that can be explained by appeal-
ing to the interaction of phrase-final lengthening and accent-induced lengthening: In
/nuˈmi/, the position of the pitch-accented syllable is phrase-final, which means the fi-
nal syllable is subject to both of the just mentioned lengthening effects. In /ˈnumi/, on
the other hand, the lengthening effects are separated: The initial syllable receives most
of the accentual lengthening while the final syllable receives the bulk of phrase-final
lengthening, resulting in both syllables ending up with similar duration (cf. Cambier-
Langeveld 2000).
Based on the preceding discussion, it would thus seem that stress deafness as it
is typically conceived of really might better be called ‘failure to represent a phrase-
level prominence contrast parasitic on lexical stress’.2 Alternatively, one could say
that ‘stress deafness’ in its common use refers to a perceptual insensitivity relating to
the ‘double enhancement’ type of stimuli. Why it is so crucial to distinguish between
‘double’ and ‘single’ enhancement is highlighted by the results reported in Correia et al.
(2015). These authors report on two SRTs performed with native Portuguese speaking
participants. In the first SRT (their experiment 2), there were two sets of stimuli (all
spoken by Portuguese speakers): One set consisted of isolated words (double enhance-
ment stimuli) while the other set consisted of words excised from a phrase, where the
word in question was postfocal and non-IP-final (single enhancement stimuli). For the
single enhancement type stimuli, the enhancement consisted primarily of longer dur-
2
Although stress deafness effects have repeatedly been shown for unmodified or little-modified Dutch
stimuli, several stress deafness studies used stimuli that were strongly manipulated in terms of duration
and/or pitch, or stimuli spoken by native speakers of languages other than Dutch (including Spanish,
French, Persian and English). Since similar effects were found in most cases, the phenomenon of stress
deafness seems rather robust, although it must be admitted that these manipulations make it impossible
to directly compare the relative strength of the deafness effects in each case.
ation for the stressed syllable (vowel quality differences might also be a correlate of
stress in Portuguese, but the vowels to which this applies were not used in this set of
stimuli, which incidentally also involved the words /ˈnumi/ ∼/nuˈmi/). Notably, these
stimuli were not characterised by the presence of a pitch accent, phrase-final lengthen-
ing, or boundary marking. The double enhancement stimuli, in contrast, additionally
did exhibit this latter type of enhancement.
Results indicated that among the prosodic stimuli, the presence of postlexical prom-
inence resulted in considerably better SRT performance. Specifically, participants gave
about 9% correct responses for the single enhancement type target words, and around
22% correct responses for double enhancement type words. They had more correct re-
sponses on the phoneme contrast (50%), but here there were no differences as a function
of enhancement type. Despite the relatively better performance on the double enhance-
ment type stimuli for the prosodic contrast, the error rates are nevertheless very high
for a group of participants who speak a language with variable stress and thus should
not exhibit scores that are reminiscent of stress deafness. The authors follow this result
up with a second SRT (their experiment 3).
In contrast to the high vowels in the /numi/ word pair, other Portuguese vowels ex-
hibit reduction when unstressed, and the authors’ follow-up experiment aimed to test
the relative contribution of vowel quality as a cue to stress position. The words used
for the prosodic contrast in this task were /ˈnemi/ ∼/neˈmi/ (with /e/ realised as [e]
when stressed and [ɨ] when unstressed). The words for the segmental contrast were the
same as in the previous experiment. The average percentage correct on the segmental
contrast, as expected, was the same as in the first SRT, namely around 50%. The aver-
age percentage correct on the prosodic contrast, however, was much higher than in the
previous SRT: 45% for the double enhancement type stimuli, and 25% for the single
enhancement type. Clearly, response accuracy on the prosodic SRT improves when
vowel quality differences are present, which is interpreted by the authors as indicating
that vowel quality is a necessary cue to reliably encode stress position in Portuguese,
whereas the presence of double enhancement as opposed to single enhancement plays
a secondary role. One observation however remains to be explained: Portuguese par-
ticipants still make about 75% errors on the single enhancement type stimuli in which
vowel quality cues are present. While this is a clear improvement with respect to the
91% errors (9% correct) without the vowel quality cue, a 75% error rate still seems
worse than the ca. 50% error rate on the phonemic contrast. This difference is also
not explained by the authors.3 It seems, therefore, that it cannot be concluded from
Correia et al. (2015) that vowel quality is a sufficient cue for reliable stress encoding
in Portuguese participants. Instead, it is safe to say that vowel quality in Portuguese is
an important cue to stress, but the same can still be said for the presence of postlexical
enhancement.
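The relative contribution of the two cues can be laid out side by side using the approximate percentages correct reported above. The sketch below is a purely descriptive comparison, not a statistical analysis, and the condition labels are my own.

```python
# Approximate percentages correct on the prosodic contrast, as reported above.
correct = {
    ("no vowel quality cue", "single"): 9,
    ("no vowel quality cue", "double"): 22,
    ("vowel quality cue", "single"): 25,
    ("vowel quality cue", "double"): 45,
}

# Error rate is the complement of the percentage correct.
error = {cond: 100 - pct for cond, pct in correct.items()}

# Gain from adding postlexical (double) enhancement, per vowel quality condition.
gain_enhancement = {
    vq: correct[(vq, "double")] - correct[(vq, "single")]
    for vq in ("no vowel quality cue", "vowel quality cue")
}

# Gain from adding the vowel quality cue, per enhancement type.
gain_vowel_quality = {
    enh: correct[("vowel quality cue", enh)] - correct[("no vowel quality cue", enh)]
    for enh in ("single", "double")
}
```

On these approximate figures, adding postlexical enhancement raises accuracy by 13 to 20 percentage points and adding the vowel quality cue raises it by 16 to 23 points, so the two gains are of a similar order of magnitude.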
These results once more highlight two important points that have perhaps not re-
ceived due attention in the literature. The first is that the use of the term ‘stress’ in
3
I have so far refrained from reporting any statistics here since they might be problematic to interpret:
The authors used ANOVAs on non-normally distributed data that were not transformed.
‘stress deafness’ might, to some, rather misleadingly suggest that the relevant parti-
cipants are deaf to a lexical stress prominence asymmetry. This is clearly not the case,
since, all else being equal, the presence versus absence of postlexical prominence can
account for a considerable part of the stress deafness effect: Single enhancement type
stimuli result in considerably worse accuracy on SRTs than double enhancement type
stimuli, as the Portuguese results showed. Portuguese participants were in fact rather
deaf to correlates of lexical stress alone: 25% correct scores on stimuli that exhibit
only correlates of lexical stress (duration and vowel quality) is rather poor. The second
point to take away is the importance of considering the native language’s cues to lexical
stress in relation to the specific acoustic–phonetic properties of the stress contrast par-
ticipants are presented with. Portuguese listeners could be expected to perform worse
on a Spanish lexical stress contrast (which is not cued by vowel quality) than on a
Germanic stress contrast (which is cued by considerable vowel reduction in unstressed
syllables). Past studies have not always kept these points in mind.
In the following, I will discuss some of the past work and the explanations that have
been brought forward for ‘stress deafness’.
• The status of lexical stress as predictable. This explanation overlaps with the
previous one. In Peperkamp, Vendelin & Dupoux (2010), it is argued that southern
French listeners exhibit stress deafness due to the fact that they are native speakers
of a language with predictable stress (in this variety, stress is presumed to be final,
but penultimate in the case of a final schwa). According to this explanation, their
stress deafness could be explained in terms of the absence of a need to represent
stress in the mental lexicon. An issue with this explanation concerns the general
consensus that lexical stress in French (whichever Hexagonal variety) does not
exist, in the sense that prominence asymmetries are assigned not lexically but
postlexically (see also Section 2.3). Better examples of languages with predictable
lexical stress would be most varieties of Arabic, in which the stressed position can
be derived by rule based on syllable weight. None of the stress deafness studies
so far have tested uncontroversial predictable stress languages.
• The domain of ‘stress’ assignment. If the native language assigns ‘stress’ postlex-
ically (or in my use of the term: If lexical stress is absent), listeners will exhibit
stress deafness. In later papers, Dupoux, Peperkamp and colleagues (e.g. Dupoux
et al. 2008) use this explanation to account for findings with regard to French listeners.
The ‘domain of stress’ explanation is also used by Rahmani, Rietveld & Gussen-
hoven (2015) and accounts for the stress deafness effect they find for Persian,
Indonesian and French listeners as opposed to the absence of the effect in Dutch
and Japanese participants.4
The common denominator of the (partly) successful explanations is the idea that a
lack of a grammaticalised contrast for ‘stress’ in the native language leads to stress
deafness. It is important to note, however, that the absence of a lexical stress contrast
in a language may have various underlying explanations, with different phonological
properties leading to the same surface lack of the contrast. This part of the explanation
for stress deafness has so far not been made explicit. In what follows, I will briefly sketch
three different ‘reasons’ why languages may lack a surface lexical stress contrast.5
Firstly, languages like Finnish and Hungarian, whose speakers exhibit stress deafness
in some of the earlier experiments, are commonly considered to have lexical stress. In
these languages, the issue seems to relate to the fact that lexical stress is fixed (and thus
by definition not contrastive).
Secondly, languages like French and Indonesian, whose speakers also exhibit stress
deafness, lack contrastive lexical stress in a different way. In this case the absence of
contrast is due to the simple fact that lexical stress is absent altogether.
Thirdly, native speakers of Japanese were found not to be stress deaf (Rahmani, Ri-
etveld & Gussenhoven 2015). Several varieties of Japanese are commonly considered
to lack lexical stress in the sense that lexical prominence asymmetries are not of the
stress, but of the lexical pitch accent type. Thus, Japanese lexical phonology does en-
code lexical prominence asymmetries.
4
Japanese has lexical prominence asymmetries in the form of lexical pitch accent rather than lexical
stress.
5
Presumably, more reasons, such as the presence of lexical tone, can also be included, but no work has
so far been conducted to test speakers of lexical tone languages.
6
Although it does not, of course, exclude other possible causes for a stress deafness effect.
8.2 Methodology
8.2.1 Participants
In order to test effects of prominence deafness as a function of a given native language,
participants are ideally monolingual speakers of this language. This is an impossible
task when recruiting among university students in Morocco, so a number of measures
were taken to arrive at target groups of participants that would, despite not being
monolingual, still reflect perception effects that could be attributed to a single native
language.
In addition to the diglossic situation that characterises present-day Morocco, with
Berber varieties being used alongside Moroccan Arabic, there was the added complica-
tion of foreign languages learnt in school. These include French, English, and Modern
Standard Arabic, the latter of which Moroccans have semi-regular exposure to on ra-
dio and TV, and in written form for the purpose of official communications. Among
these three languages, only familiarity with French was deemed inconsequential, given
the past results on stress deafness in native French speakers (e.g. Dupoux, Peperkamp
& Sebastián-Gallés 2001; Dupoux et al. 2008; Peperkamp, Vendelin & Dupoux 2010).
On the other hand, familiarity with Modern Standard Arabic and English, as languages
that do have lexical stress, is potentially more problematic. Care was therefore taken
to select participants with minimal proficiency in and exposure to these two (and other
stress) languages.
Two target groups of participants were recruited for the present study: One group
of native speakers of Tashlhiyt Berber and one group of native speakers of Moroccan
Arabic. While it was possible to find MA natives with zero proficiency in Berber, the
reverse was not possible. Care was taken therefore to find those TB speakers that almost
exclusively spoke TB and very little MA. To this end, prior to participation all participants
filled out a questionnaire which included questions about their proficiency in various
languages as well as the regularity with which they spoke these. No participants were
recorded who did not meet the criteria of clear dominance in either Tashlhiyt Berber or
Moroccan Arabic respectively. Participants also had to have no more than intermediate
proficiency in any language with lexical stress (these were English, Spanish and German;
participants all understood MSA but did not speak it often).
All participants were recruited and recorded at the universities at which the experi-
ments took place (Université Ibn Tofail in Kenitra for the MA group and Université Ibn
Zohr in Agadir for the TB group).
In the TB group (N=39) none of the participants spoke MA as a first language (age
of onset for learning MA was six to eight) and all participants primarily spoke Tashlhiyt
with their family, their friends, and at university or work. Participants’ self-reported use
of Moroccan Arabic typically involved around 20% of their social interactions. Most
of the participants in this group were either students in the Département des études
amazighes, where the main language of instruction is Berber, or, in case of a few gradu-
ates, primary school teachers of Standard Berber (see also Section 1.2). For the MA
group (N=37), participants were selected from the French and physics departments,
where the main language of education is French (as opposed to MSA or English).
In both groups, a number of participants had to be excluded, resulting in a grand
total of 62 participants, with 31 in each group. See Section 8.2.3 for the reasons why
these were excluded prior to analysis. Mean age in the resulting Tashlhiyt-speaking
group was 22 (range: 18 to 33), with 13 female and 18 male participants. Mean age
in the Moroccan Arabic group was 20 (range: 17 to 36), with 16 female and 15 male
participants.
8.2.2 Stimuli
The stimuli used in the experiment are the exact same ones as those used in the exper-
iment by Rahmani, Rietveld & Gussenhoven (2015).7
The experiment consisted of two parts, a segmental and a prosodic part, which were
separated by a voluntary break. The order of presentation was counterbalanced across
participants, see Procedure below. The segmental part tested listeners’ ability to discern
the contrast in the segmental minimal pair /ˈmuku/ ∼/ˈmunu/. These are non-words in
both Tashlhiyt Berber and Moroccan Arabic. The prosodic part tested listeners’ ability
to discern the contrast in the prosodic minimal pair /ˈnumi/ ∼/nuˈmi/. The word
/numi/ is a non-word in both Tashlhiyt Berber and Moroccan Arabic. Several phonetic
variants of these four words were used in the experiment: 12 per word as spoken three
times each by a female and male speaker of Dutch, and a female and male speaker
of Persian. The tokens were time-compressed so that all stimuli had a comparable
duration of around 450 ms. The word “OK” which concluded every sequence of target
words (see 8.1.1.1) was spoken by a different female speaker.
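The token counts given above can be checked with a short enumeration. This is only an illustrative sketch: the speaker labels are borrowed from the predictor levels named in Section 8.2.4, and the ASCII word labels (with an apostrophe marking stress) and the function name are assumptions made here, not part of the original stimulus set.

```python
from itertools import product

# Illustrative labels only; an apostrophe precedes the stressed syllable.
CONTRASTS = {
    "segmental": ("'muku", "'munu"),
    "prosodic": ("'numi", "nu'mi"),
}
SPEAKERS = ("DutchF", "DutchM", "PersianF", "PersianM")
REPETITIONS = (1, 2, 3)  # each speaker produced every word three times


def build_inventory():
    """Return one (contrast, word, speaker, repetition) tuple per token."""
    return [
        (contrast, word, speaker, rep)
        for contrast, words in CONTRASTS.items()
        for word, speaker, rep in product(words, SPEAKERS, REPETITIONS)
    ]


inventory = build_inventory()
```

Under these assumptions each word has 4 speakers x 3 repetitions = 12 tokens, so each contrast has 24 tokens and the full set has 48.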
The acoustic differences between the members of the prosodic minimal pair /ˈnumi/
∼/nuˈmi/ were discussed in Section 8.1.1.2 and were visualised for one example pair
in Figure 8.2.
The exact order of words used in the sequences was the same as in Rahmani,
Rietveld & Gussenhoven (2015), i.e. there were five different sequences for each sequence
length:
Each sequence was made up of phonetic variants from a single speaker, and no phonetic
token occurred more than once in each sequence. Sequences such as 1211, with three
tokens of the same word, thus exhausted the three phonetic variants of that word as
uttered by a single speaker. Each of the above 15 sequences occurred twice in each
SRT: Once as instantiated by tokens from one of the two Dutch speakers and once by
tokens from one of the two Persian speakers. The total of 30 test trials (per SRT) was
7
I am very grateful to the authors for letting me use their stimuli.
8.2.3 Procedure
The two parts of the experiment, the prosodic and the segmental part, had the same
structure, with four subparts. In the segmental part, participants had to perform dis-
crimination on the stimulus pair with the segmental contrast /ˈmuku/ ∼/ˈmunu/. In
the prosodic part, participants did the same for the stimulus pair with the prosodic con-
trast /ˈnumi/ ∼/nuˈmi/. The four phases were the following, with 4. being the main
experimental task, or SRT test phase:
In phase 1, the word presentation phase, participants were presented first with all 12
phonetic tokens of a member of a contrasting word pair (upon pressing a designated
key, ‘1’), and then with all 12 phonetic tokens of the other member of the pair (upon
pressing the other designated key, ‘2’). After this, participants could press either key
as often as they wanted in order to listen to single instances of the words. They could
choose to continue to the next phase whenever they felt they had learned the key-to-
word association.
In phase 2, the word identification phase, participants had to achieve a number of
correct responses before they could proceed to the next phase. They were presented
with randomly selected tokens of the 24 phonetic variants for the contrast. For each
token they had to press the matching key, after which they received feedback as to
whether their response was correct or not. Only once a participant had given eight
correct responses in a row were they allowed to move on to the next phase. This
proved to be very difficult for some participants in the case of the prosodic contrast.
The average number of tokens participants listened to in order to get eight correct was
34 for the MA group, and 41 for the TB group. In order to get eight correct on the
segmental contrast, these groups required only 14 and 10 tries, respectively. As in
Rahmani, Rietveld & Gussenhoven (2015), participants who needed more than 150
tokens before reaching eight correct in a row were excluded (six in the MA group, eight
in the TB group).
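The phase 2 criterion and the exclusion rule can be sketched as follows. The function names are hypothetical, and it is an assumption here that the token count includes the eight tokens of the final correct run; the experiment itself was run in E-Prime, so this is not the author's implementation.

```python
def tokens_to_criterion(responses, run_length=8):
    """Return the 1-based index of the token that completes the first run of
    `run_length` consecutive correct responses, or None if no such run occurs."""
    streak = 0
    for n, correct in enumerate(responses, start=1):
        streak = streak + 1 if correct else 0  # any error resets the run
        if streak == run_length:
            return n
    return None


def is_excluded(responses, run_length=8, max_tokens=150):
    """Exclude a participant who never reaches criterion, or who needs more
    than `max_tokens` tokens to do so."""
    n = tokens_to_criterion(responses, run_length)
    return n is None or n > max_tokens
```

For example, a participant who makes one error on the eighth token and then gives eight correct responses reaches criterion on token 16.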
Phase 3, the SRT practice phase (‘warm-up block’ in Rahmani, Rietveld & Gussen-
hoven 2015), served to train participants in doing the recall task. Participants heard
eight sequences of two words with each of the four possible sequences (i.e. 11, 12, 21
and 22) occurring once in each stimulus language (Dutch/Persian). Each sequence was
followed by the word “OK” as spoken by a different female speaker. All words (target
word 1, target word 2 and “OK”) were separated from each other by a pause of 120
ms. Participants gave their response by entering the numbers (keys) in a dialogue box.
They could only enter their response after the word OK, and they could check if they
had entered their intended response. They pressed enter to confirm their response. A
new sequence was presented following a 1500 ms. pause. Participants received feed-
back as to whether their response was correct or not, and trials that were responded to
incorrectly were presented again until the correct response was given.
Phase 4 was the main experimental SRT and used the same procedure as the practice
SRT. Sequence lengths in this phase however involved three, four or five words, and
participants received no feedback on the accuracy of their response. Sequences were
randomly selected from the pool of 30 trials, 15 of which were Dutch speaker stimuli
and 15 Persian speaker stimuli. Phonetic variants for each target word were selected
randomly among the realisations from a single speaker. The order in which the pros-
odic and segmental task were performed was counterbalanced: Half of the participants
performed the prosodic part first, followed by the segmental part, and the other half
performed the segmental part before the prosodic part.
The perception experiment took place in consecutive weeks in November 2017 in
quiet rooms at the Université Ibn Tofail in Kenitra for the MA group, and at the Uni-
versité Ibn Zohr in Agadir for the TB group. The experiment was run with E-Prime
3.0 (2017) on a laptop computer, with the stimuli presented over headphones. Pre-
experiment instructions were given orally in TB or MA, and on screen during the exper-
iment in French. The average duration to complete the whole experiment was compar-
able in both groups and was around 30 minutes (including an optional break).
8.2.4 Analysis
Responses were classified as either correct or incorrect. Incorrect responses included
reversals such as 121 for a target response 212.8
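The response coding can be sketched as below. A reversal is taken here to mean a response in which the two keys are swapped throughout (as in 121 for target 212); the function names are hypothetical and the three-way labelling is only a convenience for inspecting reversals before they are scored as incorrect.

```python
# Translation table that swaps the two response keys.
_SWAP_KEYS = str.maketrans("12", "21")


def classify_response(target, response):
    """Label a typed key sequence as 'correct', 'reversal' or 'incorrect'."""
    if response == target:
        return "correct"
    if response == target.translate(_SWAP_KEYS):
        return "reversal"  # every key swapped, e.g. 121 for 212
    return "incorrect"


def score(target, response):
    """Binary accuracy for the statistical model: reversals count as 0."""
    return 1 if classify_response(target, response) == "correct" else 0
```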
Statistical analysis was performed with binomial Generalized Linear Mixed Models
(GLMMs) in R (R Core Team 2016) with the package lme4 (Bates et al. 2015). The
accuracy of individual responses was modelled as a function of the predictors group
(MA/TB), contrast (segmental/prosodic), sequence length (3/4/5) and stimulus
speaker (DutchF/DutchM/PersianF/PersianM). Specific interactions between main ef-
fects were included, as well as random slopes allowing for interactions of main effects
with participant. The R syntax for each model is given in a footnote. The inclusion of a
8
The decision to exclude reversals along with incorrect responses was motivated by the relative number
of reversals per participant. Three participants (two MA and one TB) produced more reversals than
correct responses on the prosodic SRT (eight reversals versus only six correct (and the rest incorrect),
and seven versus four, and two versus zero, respectively). Six participants produced more reversals
than correct responses on the segmental SRT (differences of one, two, two, two, seven and ten). It is
not clear from these numbers whether some cases involve the incorrect, but systematic, association
of words with keys, as suggested in some of the earlier studies. Exclusion seemed to be the most
sensible treatment of these patterns, and mirrors what was done in most previous studies.
main effect of stimulus speaker in the model followed from participants’ comments,
as well as my own impression, that the difficulty of some of the contrasts depended on
the speaker.
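The model structure described above can be written out schematically as a mixed-effects logistic regression. The notation is a sketch introduced here for illustration (the symbols y, x, c, β, u and v do not appear in the original), and the random-effects part is an interpretation of the R formula given in the footnote in Section 8.3, not a statement of the author's own formulation.

```latex
\[
\operatorname{logit}\,\Pr(y_{i} = 1)
  = \mathbf{x}_{i}^{\top}\boldsymbol{\beta}
  + \mathbf{c}_{i}^{\top}\mathbf{u}_{p(i)}
  + \mathbf{c}_{i}^{\top}\mathbf{v}_{s(i)},
\qquad
\mathbf{u}_{p} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{u}),\quad
\mathbf{v}_{s} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_{v})
\]
```

Here y_i is the accuracy (1 or 0) of response i, x_i codes the fixed effects (group, contrast, stimulus speaker, sequence length and the two interactions with contrast), c_i is an indicator vector for the two levels of contrast, and p(i) and s(i) pick out the participant and the stimulus speaker for that response. A term of the form (0+Contrast|Participant) gives each participant one adjustment per contrast level, which is how participant-specific differences between the segmental and prosodic SRT enter the model.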
The binomial models used here are rather different from analysis methods in previ-
ous work, which are typically ANOVAs on averaged accuracy per participant per SRT,
expressed in percentages. There are some disadvantages to this approach, including the
fact that such data is bounded (0-100%). This means that the variance is not constant
across the scale and by consequence, that ANOVAs are a suboptimal, if not inappro-
priate method of analysis. Moreover, the aggregation of data achieved by averaging
error scores per participant, rather than using raw responses (correct/incorrect), leads
to an increased Type I error rate, i.e. an increased likelihood of reporting false positives.
These issues occur in all previous studies, with the exception of Rahmani, Rietveld &
Gussenhoven (2015), who applied an arcsine transformation to the data before running
ANOVAs, eliminating the equal variance issue (although not the aggregation issue).
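The variance argument can be made explicit. The following is a standard textbook derivation given as a sketch; the exact variant of the arcsine transformation used by Rahmani, Rietveld & Gussenhoven (2015) is not specified here, so the form below is an assumption.

```latex
\[
\operatorname{Var}(\hat{p}) = \frac{p(1-p)}{n},
\qquad
\phi = \arcsin\sqrt{\hat{p}},
\qquad
\operatorname{Var}(\phi) \approx
\left(\frac{1}{2\sqrt{p(1-p)}}\right)^{2}\frac{p(1-p)}{n}
= \frac{1}{4n}
\]
```

The variance of an untransformed proportion depends on p and shrinks towards the boundaries of the scale, whereas the (approximate) variance of the transformed score depends only on the number of trials n. The transformation is still applied to one aggregated score per participant, which is why it leaves the aggregation issue untouched.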
In any case, the comparison between the present and previous results should be made
carefully. The approach taken here is to compare the present data with the raw data
from Rahmani, Rietveld & Gussenhoven (2015) (after a reconstruction of single re-
sponses based on the aggregated scores per participant). The model that serves the
direct comparison between the TB/MA and the Rahmani, Rietveld & Gussenhoven data
(the ‘combined model’), has the same structure as the model run on the present data
only (‘the TB/MA model’), with one difference: The combined model has the predictor
stimulus language (Dutch/Persian) instead of stimulus speaker, since the exact
speaker for each trial is not available for the other dataset. Further details will be
discussed in Section 8.4.1.
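One way in which single responses could be reconstructed from an aggregated score is sketched below. This is an assumption about the general idea, not a description of the procedure actually used (for which see Section 8.4.1); the function name and the cell sizes in the example are hypothetical.

```python
def reconstruct_responses(proportion_correct, n_trials):
    """Expand an aggregated accuracy score into n_trials binary responses.

    Only the number of correct and incorrect responses in a cell can be
    recovered; their order and any trial-level properties are lost.
    """
    n_correct = round(proportion_correct * n_trials)
    return [1] * n_correct + [0] * (n_trials - n_correct)


# Hypothetical cell: 80% correct over 30 trials gives 24 ones and 6 zeros.
example_cell = reconstruct_responses(0.8, 30)
```

Because trial-level properties cannot be recovered in this way, a predictor such as the exact stimulus speaker is unavailable for reconstructed data, in line with the use of stimulus language in the combined model.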
In order to compare accuracy on the prosodic SRT, many previous studies check whet-
her the behaviour of groups is similar on the segmental SRT. Presumably, the segmental
contrast is equally easy for different participant groups as long as it reflects a phonemic
difference in the relevant native languages. If scores on the segmental SRT are consider-
ably lower for one group this might form an indication that groups are not comparable
in terms of working memory or other cognitive factors. In practice, however, there are
many reasons why the contrast might still be more difficult for one group than the other,
relating to e.g. linguistic factors such as frequency of occurrence of the relevant phon-
emes in different native languages. Previous studies (Peperkamp, Vendelin & Dupoux
2010; Hellmuth, Muradás-Taylor & White n.d.) have used a so-called stress deafness
index, defined as the error rate in the prosodic SRT minus the error rate in the seg-
mental SRT to control for participant variability in terms of such general performance
factors. The present way of accounting for participant- and group-specific differences
is by including random slopes for the interaction of the main effect of contrast with
individual participants.
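For comparison, the stress deafness index as defined above (error rate on the prosodic SRT minus error rate on the segmental SRT) can be computed per participant as in the following sketch. The function names and the example scores are hypothetical.

```python
def error_rate(scores):
    """Proportion of incorrect responses in a list of 0/1 accuracy scores."""
    return 1 - sum(scores) / len(scores)


def stress_deafness_index(prosodic_scores, segmental_scores):
    """Prosodic error rate minus segmental error rate for one participant."""
    return error_rate(prosodic_scores) - error_rate(segmental_scores)


# Hypothetical participant: 75% errors prosodic, 25% errors segmental.
example_index = stress_deafness_index([1, 0, 0, 0], [1, 1, 1, 0])
```

An index near zero indicates that the prosodic contrast was no harder than the segmental one for that participant; larger positive values indicate greater relative difficulty with the prosodic contrast.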
8.3 Results
Mean scores for both participant groups, averaged across the four stimulus speakers,
are shown in Figure 8.3.
Figure 8.3: Mean accuracy per participant group for both contrasts and for each se-
quence length.
Mean accuracy is, on average, higher on the segmental SRT for both groups. The
difference between the participant groups seems minimal for the prosodic SRT, but
on the segmental SRT the TB group scores somewhat lower than the MA group. This
difference is however not statistically significant, see below. As was discussed in the
previous section, a difference on the segmental SRT precludes the reliable use of a
stress deafness index for which the segmental SRT serves as a baseline. Since Rahmani,
Rietveld & Gussenhoven (2015) also did not make use of difference scores, I decided
to base the comparison between the present and their study on the raw scores on the
prosodic SRT only. Most importantly, Tashlhiyt participants score well over half of their
responses correct on the segmental SRT (which is better than pure chance performance
given that a mistake in any of the three to five items in a sequence results in an incorrect
response for the entire sequence). Finally, cognitive differences between groups are
highly unlikely given the recruitment among university students in both cases.
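The chance level referred to above can be made concrete under a simplifying assumption that is not stated in the original: a guessing participant enters a sequence of the right length and chooses between the two keys independently and with equal probability for every item.

```latex
\[
P(\text{sequence correct} \mid n) = \left(\tfrac{1}{2}\right)^{n}:
\qquad
n = 3:\ 0.125, \qquad
n = 4:\ 0.0625, \qquad
n = 5:\ 0.03125
\]
```

With equal numbers of trials at each sequence length, the expected accuracy from guessing would be the mean of these three values, roughly 0.07, which is far below a score of well over half of responses correct.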
I will turn now to the statistical comparison of accuracy on both the SRTs. A binomial
model was run with the aforementioned main effects, interactions between group and
contrast, and stimulus speaker and contrast, and random slopes for interactions
of contrast and stimulus speaker with participants.9 The ‘mixed’ function from the
afex R package (Singmann et al. 2015) and LRTs were used to obtain main effects in
the case of interactions, and the lsmeans package (Lenth & Hervé 2015) was used to
perform multiple comparisons (Tukey), yielding the following results:
9
score ~ Group + Contrast + StimulusSpeaker + SequenceLength + Contrast:Group +
Contrast:StimulusSpeaker + (0+Contrast|Participant) + (0+Contrast|StimulusSpeaker)
• A main effect of contrast, in the sense that overall accuracy was lower on the
prosodic SRT: (χ² = 38.21, p<0.001);
• A main effect of stimulus speaker (but see interaction below): (χ² = 23.04,
p<0.001);
• A main effect of sequence length: (χ² = 228.1, p<0.001). Post-hoc comparis-
ons indicate that all pairwise comparisons are significantly different at p<0.001
(i.e. accuracy is lower for five words than for four words, which in turn is lower
than for three words);
• An interaction of contrast with stimulus speaker: (χ² = 38.97, p<0.001).
Post-hoc comparisons show that the Dutch female speaker’s stimuli result in sig-
nificantly lower accuracy than all other three speakers’ stimuli in the prosodic
SRT (p<0.001). There is an additional difference concerning the two Persian
speakers in the segmental SRT, where the Persian male speaker’s stimuli cause
lower accuracy than the female speaker’s (p<0.01).
Importantly, there is no effect of group, that is, the scores of TB and MA participants
do not differ overall. There is also no interaction of contrast with group, that is,
TB participants do not perform significantly worse on the segmental SRT than the MA
participants (χ² = 1.29, p=0.26).
Since the main effect of stimulus speaker was also involved in an interaction this
requires some further scrutiny. Post-hoc comparisons showed that of the four speakers,
the Dutch female speaker in the prosodic SRT carries most of this effect, for both parti-
cipant groups. The difference is visualised in terms of model predictions and confidence
intervals (on the prosodic SRT) for the different speakers in Figure 8.4.
8.4 Discussion
One of the crucial results so far is that there are no differences between the TB and
MA participant groups. That is, both groups exhibit the same degree of ‘deafness’ to
lexical prominence. Exactly to what degree this term applies will be discussed in the
following. I will compare the present results to earlier work on stress deafness (Section
8.4.1), and specifically to Rahmani, Rietveld & Gussenhoven (2015), whose experiment
was replicated here and therefore provides comparable reference groups. Following this
comparison I will sketch the implications for the interpretation of lexical prominence
structure in TB and MA (section 8.4.2), and finally I will try to explain the effect of
stimulus type by comparing the phonetic properties of the stimuli that cause this
effect (Section 8.4.3).
Figure 8.4: Predicted probability of an accurate response, with 95% confidence interval
(based on model fixed effects only), per stimulus length and speaker group,
for the prosodic SRT. The Dutch female speaker’s stimuli result in reliably
lower accuracy than the other three speakers’.
difficult, with participants scoring fewer trials correct than on the segmental SRT. In
their analysis, this effect held irrespective of group, that is, stress deaf and non-stress
deaf groups performed alike in having worse scores on the prosodic SRT (i.e. there was
a significant main effect of contrast in their ANOVA, in addition to an interaction
which meant that some groups performed even worse on the prosodic SRT than others).
The groups who performed relatively badly on the prosodic SRT were consequently
considered the ‘stress deaf’ groups: Native speakers of French, Indonesian and Persian.
These observations are somewhat different to results by Dupoux et al. (2008), who
found that Spanish participants, representing a non-stress deaf population, performed
a little better on the prosodic contrast (scoring 80% correct) than the phoneme contrast
(75% correct). The phonemic contrast in this study concerned /t/ and /k/ in the seg-
mental minimal pair /fiku/ ∼/fitu/, a segmental-phonological contrast only of place
of articulation, which is arguably more difficult than the contrast between the voiceless
stop and the (voiced) nasal in the present /muku/ ∼/munu/, a difference in both manner
and place of articulation, as well as voicing. In any case, despite the small differences
in means, the Spanish scores did not differ significantly as a function of contrast (pros-
odic/segmental) in their model.
The next important question is how, in absolute terms, the TB and MA groups com-
pare with ‘stress deaf’ and ‘non-stress deaf’ participants. Since the present study was
designed to be comparable with the Rahmani, Rietveld & Gussenhoven (2015) study
for this very purpose, I will consider their raw scores in relation to the present ones. A
similar model to the above was run on this combined dataset (MA/TB plus Dutch/Japa-
Figure 8.5: Predicted probability (dots) and 95% confidence intervals of accurate re-
sponse, separated by contrast and sequence length (collapsed across stim-
ulus language Dutch/Persian), for all native participant groups tested in
Rahmani, Rietveld & Gussenhoven (2015) (Dutch, Japanese, Indonesian,
French, Persian) and the present experiment (Tashlhiyt Berber, Moroccan
Arabic).
It is clear that while there are some differences among these groups on the segmental
contrast, notably with Tashlhiyt participants scoring lower than the other participants,
the great divide concerns, as expected, the prosodic contrast. On the prosodic contrast,
the TB and MA participant groups score no differently from the Indonesian, Persian and
10
score ~ ParticipantLanguage + Contrast + StimulusLanguage + StimulusLength +
ParticipantLanguage:Contrast + StimulusLanguage:Contrast + StimulusLanguage:ParticipantLanguage +
(0+Contrast|Participant) + (0+StimulusLanguage|Participant)
French participant groups, as suggested by the overlap in these groups’ confidence in-
tervals. In addition, at no sequence length is there any overlap in confidence intervals
between the ‘stress-deaf’ groups and the ‘non-stress deaf’ groups, confirming the ori-
ginal finding of a similarly clear-cut split by Rahmani, Rietveld & Gussenhoven (2015)
(despite the differences in statistical approach).
In sum, it is clear that TB and MA participants exhibit ‘lexical prominence deafness’.
The present results confirm many previous findings in showing how robust the differ-
ences are between groups, as a function of their native language, in their ability to
categorise a prosodic prominence contrast.
11
The male Dutch speaker’s stimuli shown here are the same as those in Figure 8.2.
Figure 8.6: Spectrogram and F0 contour for /nuˈmi/ (left) and /ˈnumi/ (right). Ex-
amples show one out of three phonetic variants for each word by each of
the four speakers.
The Dutch female speaker’s initial syllable is on average 131 ms. long when it is
unstressed, and 139 ms. long when stressed, a negligible difference of 8 ms. which
means that for her stimuli, syllable duration differences cannot serve as a cue that helps
listeners reliably distinguish between /ˈnumi/ and /nuˈmi/. The Dutch male speaker, in
contrast, produces a stress-induced difference of 30 ms.12 In fact, not only is duration
missing as a reliable cue to stress position for the Dutch female speaker, it is potentially
even a misleading cue, since initial stress /ˈnumi/ displays relative durational properties
more appropriate for final stress.
12
Rahmani, Rietveld & Gussenhoven (2015) report mean syllable duration per stimulus language and per
stress contrast, but their values differ considerably from my own measurements. According to them,
the Dutch initial syllable in /ˈnumi/ would be 215 ms. long (as opposed to my measured average of
167 ms.).
In addition to the durational differences, the Dutch female speaker also produces
somewhat diverging F0 patterns for /nuˈmi/, also visible in Figure 8.6. All three of her
phonetic variants of /nuˈmi/ are like the one pictured, terminating high as opposed to
falling, as is the case for all other speakers. It is not obvious that this F0 pattern con-
tributes to participants’ difficulties with these stimuli. While the non-falling terminal
intonation certainly differs from the other speakers’ realisations of word-final stress, it
is still a pattern exclusively used for /nuˈmi/ (not /ˈnumi/) and therefore could presum-
ably serve as a cue to final stress.
The question remains why the potentially conflicting durational cues (and/or final
non-falling terminal intonation for tokens with final stress) would negatively influence
TB and MA speakers’ scores, while other listener groups do not appear to be influenced
by them. Stimulus language, after all, was not a significant effect in Rahmani, Rietveld
& Gussenhoven (2015). Part of the explanation may relate to the fact that Rahmani,
Rietveld & Gussenhoven took stimulus language, not stimulus speaker as a pre-
dictor, thereby collapsing scores across trials with stimuli from the Dutch male and
female speaker. It is possible that the male speaker’s accuracy scores were high enough
to cancel out any potential effect relating to the female speaker.
In order to explore the possibility that there is no effect of stimulus language on
the prosodic contrast when collapsing across speakers, yet another GLMM was run on
the prosodic SRT results from all seven languages combined.13
Figure 8.7 visualises the effect of stimulus language (Persian/Dutch) for each lan-
guage, on the prosodic contrast (note that the predictions for TB/MA are marginally
different from the ones in Figure 8.4, which were based on a different GLMM with stim-
ulus speaker as a predictor). The only participant groups for which there was no ef-
fect of stimulus language were Japanese (z=-0.18, p=0.85) and Persian (z=1.231,
p=0.21). The Dutch group performed marginally worse on the Dutch stimuli than
the Persian stimuli (z=-1.79, p=0.07). Indonesian and French groups performed bet-
ter on the Dutch contrast, whereas the Moroccan Arabic and Tashlhiyt Berber groups
performed better on the Persian contrast.
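The relation between the z statistics and p-values reported above can be checked with a short calculation, assuming that the p-values are two-sided and based on the standard normal distribution (the usual Wald test for GLMM coefficients). The function name is hypothetical, and the sketch is in Python rather than the R used for the analysis.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for Z ~ N(0, 1)
    return math.erfc(abs(z) / math.sqrt(2))


# z values as reported for the Japanese, Persian and Dutch groups.
reported_z = {"Japanese": -0.18, "Persian": 1.231, "Dutch": -1.79}
p_values = {group: two_sided_p(z) for group, z in reported_z.items()}
```

Under this assumption the three values come out at approximately 0.86, 0.22 and 0.07, which is in line with the reported figures if those were truncated to two decimals.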
13
score ~ ParticipantLanguage + Contrast + StimulusLanguage + StimulusLength +
ParticipantLanguage:Contrast + StimulusLanguage:Contrast + StimulusLanguage:ParticipantLanguage +
(0+Contrast|Participant) + (0+StimulusLanguage|Participant)
Figure 8.7: Predicted probability of an accurate response, per participant group and per
stimulus language, on the prosodic SRT.
These results are not necessarily at odds with the lack of an effect of stimulus lan-
guage reported in the original study, but they do highlight the importance of differ-
ences in statistical tools. For their five original participant groups the present model
finds a clear effect of stimulus language only for the French and Indonesian groups
(better performance on the Dutch stimuli), which will have been counteracted in the
original model by tendencies in the opposite direction for Dutch and to a lesser extent,
Japanese.
The present results indicate that there is no straightforward effect of the language
in which stimuli are spoken per se. Rather, there are differences between participant
groups in the relative difficulty of specific stimuli. This in turn suggests that the acoustic
details of the prominence contrast, and speaker- or voice-related differences play an
important role in determining response accuracy, at least across participant groups. It
follows that, in general, it is crucial to ensure variability in test stimuli before making
claims about language-specific effects (cf. Roettger et al. 2014 for different speaker
voices).
At this point it is unclear if the female Dutch speaker’s prosodic contrast causes a
similar effect in different participant groups alike: The effect was observed for TB and
MA groups because stimulus speaker was logged, but it is possible that this specific
speaker’s effect also applied to (some of) the participant groups in Rahmani, Rietveld
& Gussenhoven (2015). In the absence of this information, it can only be conjectured
8.5 Summary and conclusion
In conclusion, the results from the present experiment lend further credibility to
claims that were brought forward in earlier parts of this thesis, namely that there is a
lack of lexical prominence asymmetries in the form of lexical stress in both Tashlhiyt
Berber and Moroccan Arabic.
Part V
Conclusion
9 Summary and general discussion
9.1 Summary of results
In this thesis I have reported results from various experiments that bear on the question
of phonological prominence structure in Tashlhiyt Berber and Moroccan Arabic.
Part II was concerned with correlates of lexical prominence in both languages. The
specific question asked here was whether, based on arguments relating to acoustic en-
hancement of syllables, TB and MA can be considered to lack lexical stress. For Tashl-
hiyt Berber (Chapter 3), the final syllable was not consistently enhanced relative to the
penultimate syllable in disyllabic words. This result could not confirm an earlier claim
that there is fixed final-position stress in Tashlhiyt. For Moroccan Arabic (Chapter 4),
no consistent enhancement was found of syllables that are presumed stressed under a
weight-sensitive view of stress in this language. This result provided no support for
the claim that stress in MA targets either the penultimate or final syllable. In conclu-
sion, the results of these experiments are compatible with the absence of stress in both
languages, a possibility that has also been raised independently by several authors for
each language.
Part III was concerned with the postlexical prosodic prominence structure of both lan-
guages, with specific focus on intonational prominence in interrogative phrases with a
question word (qword). Qwords in both TB and MA were shown to attract the main pros-
odic prominence-marking event in the phrase: A rise–fall realised entirely or partly on
the qword. In neither language did the rise–fall exhibit phonetic properties compatible
with an interpretation along the lines of a pitch accent associating with a predetermined,
lexically stressed syllable. Based on considerations of semantic-pragmatic prominence
(focus) in qword interrogatives, and on the phonetic properties of the intonational event,
it was argued that the rise–fall functions as a postlexical prominence-marking event in
both cases. Unlike in languages that exhibit lexical prominence asymmetries, in which
stressed syllables are the typical TBUs, the pitch event was interpreted as associating
with a higher structural unit: The phonological domain of the qword. This aspect of
postlexical prominence structure is compatible with, if not a direct corollary of, the
absence of lexical prominence structure.
Part IV served to investigate the perception of prominence by native TB and MA
speakers. Chapter 8 reported on a ‘stress deafness’ experiment in which the perceptual
sensitivity of participants was tested with respect to a postlexical prominence contrast
parasitic on a lexical prominence contrast. Native speakers of both TB and MA were
shown to perform poorly on this task. Specifically, they performed significantly worse
than speakers of languages that do have lexical prominence structure. At the same time
their behaviour was similar to that of speakers of (other) languages that are known to lack
lexical prominence. Participants’ perceptual behaviour therefore is one more result that
provides support for the idea that lexical stress is absent in both TB and MA.
In the following, I will consider these converging results in light of the two secondary
goals of this thesis, as mentioned in Chapter 1.1, in shedding further light on:
• The result of language contact between Tashlhiyt Berber and Moroccan Arabic in
the (prosodic)-phonological domain
• The possible mappings between lexical and postlexical prominence structure and
the theoretical implications of the present findings
9.2 Language contact: Prominence structure in TB and MA
of stress in Arabic is given in Watson (2011), including the observation that Modern
Standard Arabic is considered to have stress, like its precursor Classical Arabic.1
Based on these facts about Berber and Arabic it seems plausible that the lack of stress
in Moroccan Arabic is the product of language contact with (Tashlhiyt) Berber. Similar
reasoning is found in Zellou (2010), which is concerned with the origin of consonant
harmony in Moroccan Arabic, a feature not found in other varieties of Arabic but at-
tested in Berber. Unfortunately, little is known about the phonological prominence
structure of Proto-Berber, which was presumably spoken at some time after 1000 BCE
(Kossmann 1999, 2012). This makes it unclear whether, and if so at what point, (some
varieties of) Berber lost stress. If Proto-Berber was stressless, some varieties of Berber
including Zwara must have developed it at some point. If Proto-Berber had stress, some
varieties including Tashlhiyt must have lost it.
These open issues moreover raise the general question as to whether and how lan-
guages may ‘lose’ or ‘develop’ lexical stress, which has in fact rarely if ever been ad-
dressed.2 French is uncontroversially considered a language that has ‘lost’ stress, as it
is known to have evolved from Latin, the lexical phonology of which did include stress
(cf. Jun & Fougeron 2002). In contrast, there are no specific reports of stress coming
into existence from an originally stressless word prosodic system, although a general
hypothesis about language change is brought forward in Hyman (1977), taken up again
by Gordon (2014). According to this view of word stress, particularly in systems that
have fixed penultimate stress, stress might arise as the result of a generalisation of
phrase-level prosodic patterns. Neither author gives a very detailed description of such
a process, but both presumably have in mind something like the following: The starting point is a
situation in which a language exhibits a consistent location of postlexical pitch (whether
this should be interpreted as prominence- or edge-marking is not clear). It most likely
takes the form of some sort of high pitch within a basic rising-falling intonational pat-
tern, which seems to be observed in some form or another in every language.3 When
an intonational pattern like a rise–fall is realised on a short one-word IP, some syllable
near the right edge, for example the penultimate, will be characterised by the pitch
peak and/or considerable pitch movement. This is schematised in Figure 9.1:
1 This claim should nevertheless be evaluated carefully due to the fact that neither MSA nor
Classical Arabic is a native language and both are to a large extent literary. When spoken they
should be considered second language varieties that most likely exhibit prosodic features of the
first language of the speaker, typically a ‘colloquial’ variety of Arabic.
2 The changes in lexical phonological systems that lead to the development of lexical tone and pitch
accent, as opposed to stress, are much better researched. See among many others: Coetzee, Beddor &
Wissing (2014) and Coetzee et al. (2018) for tonogenesis in Afrikaans; Kingston (2005) for Athabaskan;
Kirby (2014) for Khmer; Kang & Han (2013) for (renewed) tonogenesis in Korean.
3 The privileged status of high pitch crosslinguistically has been observed in many places
(e.g. Gussenhoven 2004; Ladd 2008) and is also reflected in the title of a special issue of
Journal of Phonetics: “What’s so special about H(igh)?” (Evans 2015).
9 Summary and general discussion
[[ σ σ σ ]ω ]IP
Figure 9.1: Schematised rising-falling intonational contour as typically realised on isol-
ated (phonological) words that form an IP.
The resulting pitch pattern could then be generalised to words that are not produced
in isolation, so that pitch prominence occurs in the penultimate position on each lexical
word. Presumably, the pitch-prominent syllable becomes reinterpreted as reflecting
fixed word-level prominence in that position, after which the pitch correlate might
give way to other correlates of prominence and other lexical–phonological interactions
that single this syllable out as prominent.
Some additional speculation can shed light on how exactly high pitch might diachron-
ically lose its primacy in singling out the syllable that is becoming stressed. A plausible
scenario is one in which high pitch involves increased articulatory effort, involving
greater jaw and lip opening, which may in turn also yield greater loudness (cf. Żygis,
Fuchs & Stoltmann 2017). It is also known that the presence of pitch movement tends
to cause durational expansion of the syllable on which it is realised (e.g. ‘accentual
lengthening’, see Cambier-Langeveld 2000). Once the point is reached at which this
prominence is reanalysed as being positionally determined or due primarily to non-
pitch properties, one could call it lexical stress. This last part of the process, in which
stress becomes phonologised, is reminiscent of the idea that differences in the quality
of vowels that may occur in stressed and unstressed syllables in Germanic languages
are the phonologised result of stress-induced enhancement (e.g. stressed [iː, aː] versus
unstressed [ɪ, ɑ] in Dutch, Gussenhoven 2004).
The above hypothesis about the phonologisation of stress involves a top-down explan-
ation, with higher-level prosodic structure percolating down to lexical-level structure.
The reverse, a bottom-up explanation, could presumably be invoked to explain the lack
of stress in French. The fact that it is considered a language with fixed prominence in
phrase-final position suggests that the lack of stress might have arisen as a result of
a generalisation of what at some stage would have been fixed word-final stress to a fixed
position for prominence at the phrasal level.
At this point, not much is understood about how and why languages may come to
lose word stress, how they might develop it, and to what extent the location of postlexical
prosodic events, either in the form of prominence-marking or edge-marking, is
involved in bringing about lexical prominence structure. It is clear, however, that the
interaction between levels of prosodic structure plays some role, if not a crucial one,
in determining the presence or absence of stress. The next section will discuss the syn-
chronic correspondence between lexical and postlexical prominence in further detail.
For the two languages investigated here, the directionality of change is not entirely
certain at this point. Did TB and MA lose stress or did they never develop it in the first
place? With respect to the relationship between the two languages, however, it does
seem likely that a prior absence of stress in TB has impacted on the lexical prosodic
structure of MA, causing it to develop away from some earlier state in which it did
have lexical stress, like other varieties of Arabic. Given the relative rarity of stressless
languages and given the well-documented influence of Berber on Arabic, it seems likely
that these similarities reflect linguistic convergence due to language contact, with
Moroccan Arabic being influenced by Tashlhiyt Berber, rather than independent
language-internal pathways resulting in comparable structures (see also Zellou 2010).
9.3 Phonological theory: Lexical and postlexical prominence structure
different patterns of alignment were reported, with yet other constraints that governed
tonal association. For Tashlhiyt Berber, Grice, Ridouane & Roettger (2015) showed
that syllable sonority and weight govern the alignment of phrase-final peaks as found
at the right phrasal edge of yes–no interrogatives, implying that to some extent low-
level phonetic factors influence the realisation of intonational movement. Perhaps this
finding is not that unexpected: Many languages assign lexical stress based on
stress-to-weight principles, which might be a result of the phonologisation of similar
constraints. For Moroccan Arabic, Hellmuth et al. (2015) proposed that there is a role
for the foot in the realisation of a phrase-final rise–fall in yes–no interrogatives. The
rise–fall in this case was interpreted to associate with both an edge (the right phrasal
edge) and a metrical position (the start of the phrase-final foot). While Hellmuth et
al.’s (2015) interpretation referred to lexical stress, the patterns could just as well be
interpreted with reference to syllable weight alone, since stress position according to
Benkirane (1998) is determined based on syllable weight. Specifically, the start of the
final foot would then simply coincide with the start of the final two moras (i.e. the start
of a heavy final syllable, or the start of a penultimate syllable in the absence of a heavy
final syllable). This would make syllable weight a factor constraining tonal association
in both TB and MA.
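As an illustration only, the weight-based reading of the final foot described above can be
sketched in a few lines of Python. The function name `final_foot_start` and the encoding of
each syllable as a mora count (1 for light, 2 for heavy) are assumptions made for this sketch
and are not taken from Hellmuth et al. (2015) or Benkirane (1998). The treatment of a light
monosyllable is likewise an assumption, since the text does not address that case.

```python
from typing import List


def final_foot_start(moras: List[int]) -> int:
    """Return the index of the syllable in which the final foot would start.

    `moras` lists the mora count of each syllable in a word, left to right
    (1 = light, 2 = heavy). This encoding is a hypothetical simplification.
    """
    if not moras:
        raise ValueError("a word must contain at least one syllable")
    if any(m not in (1, 2) for m in moras):
        raise ValueError("each syllable must have 1 or 2 moras")

    last = len(moras) - 1
    # Heavy final syllable: the final two moras begin at the start of it.
    if moras[last] == 2:
        return last
    # No heavy final syllable: the anchor is the start of the penult.
    # A light monosyllable has no penult; returning its only syllable
    # is an assumption of this sketch.
    return max(last - 1, 0)
```

Under these assumptions, a word encoded as `[1, 1, 2]` (heavy final syllable) yields the final
syllable, whereas `[1, 2, 1]` and `[1, 1, 1]` both yield the penultimate syllable. No reference
to stress is needed to compute the anchor, which is the point made in the text.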
One of the implications of this interpretation would be that metrical structure does in-
deed play a role in tonal association in languages lacking stress, since tonal association
that makes reference to syllable weight entails reference to metrical structure. It is nev-
ertheless clear that there is a difference between metrically-constrained association that
makes reference to metrical structure in terms of weight and metrically-constrained as-
sociation that makes reference to stress (which may in turn refer to weight in languages
with stress-by-weight systems).
A more fundamental difference between findings on TB and MA intonation concerns
the nature of the intonational events. In the present thesis, the intonational event
associated with qwords is clearly prominence-marking, whereas in the aforementioned
studies, rising or high pitch at the right edge of yes–no questions functions as a
modality-marking edge tone. In contrast, the nature of the intonational movements that
co-occur with contrastively focused words in phrase-final position in TB appears less clear
(cf. Grice, Ridouane & Roettger 2015). While these pitch events are likely candidates to
serve prominence-marking purposes, the fact that they occurred in phrase-final position
makes it difficult to determine whether they in part (also) served right-edge IP-marking.
Grice, Ridouane & Roettger’s (2015) interpretation leaves this question open, and the
intonational marking in both yes–no questions and contrastive statements is interpreted
in terms of an H boundary tone that has secondary association to either the final or the
penultimate syllable of the contrastively focused word (there was in fact clear evidence
for alignment with a specific TBU, unlike in the present qword data, see also Chapter
6).
The postlexical prominence associated with qwords in the experiments presented
in this thesis did not exhibit clear distributional patterns below the word level, but the
materials were not designed to test non-stress related metrical influence. This leaves the
9.3.2 Typology
Finally, the present discussion is also relevant to the analysis of postlexical prominence
across languages. Intonational events in stressless languages can still be prominence-
marking even if they do not associate to lexically stressed syllables. Using the term
‘non-metrical pitch accent’ to describe such events makes this explicit, and precludes
the need to resort to edge tones and phrase accents as the remaining options among AM
categories. Yet another alternative term could be ‘phrasal pitch accent’ in order to de-
scribe cases like French and possibly Mongolian, in which postlexical prominence seeks
a culminative position of strength at the phrasal level (as opposed to at the lexical level,
which would be the case for a standard pitch accent). The present view stands in con-
trast to the implicit idea in e.g. Jun (2014a) that if, for a given language, the category
of pitch accent is not available by virtue of the absence of lexical stress, all intonational
events must be interpreted as boundary tones. While the analysis of intonation in terms
of boundary tones only may be appropriate in the case of specific languages (e.g. Korean
and Ambonese Malay, see Section 2.5.2), the results from qword interrogative inton-
ation in TB and MA clearly indicate that postlexical prominence in these languages
exists. Analysing the relevant intonational events in terms of edge association does not
do justice to the facts and intuitions of several authors about prominence, including
Dell & Elmedlaoui (2002), and would wrongly suggest that these events are functionally
different from pitch accents. As argued here, the main difference between the
intonational events under discussion and a standard pitch accent is one of form rather
than of function: The present cases are simply characterised by the absence of a metrical
anchor in the form of a lexically stressed syllable.
The view on lexical and postlexical prominence brought forward here has conse-
quences for a typology that includes both levels of prominence, as shown in the fol-
lowing table (this is an updated version of Table 2.1 in Section 2.5.2, where a more
detailed discussion of the relevant languages can be found).
Table 9.1: Proposed typology of languages as a function of lexical and postlexical prom-
inence structure.
The left column represents languages with lexical stress, the right column languages
that lack it. The top left cell represents many languages whose intonation is well-
documented. These are languages with lexical stress that have pitch accents associat-
ing to stressed syllables. The top right cell represents languages without stress in which
prominence-marking seeks out positions of (metrical) prominence above the word level.
In French this is AP- or PP-final position and in Mongolian it seems to be AP-initial
position. As previously mentioned, postlexical prominence in these cases is sometimes
termed ‘phrasal pitch accent’. The left cell in the middle row represents those languages
in which stress presumably exists, but in which intonational prominence-marking does
not seek association to these same syllables. The existence of this type of language
was called into question in Chapter 2.5.1. The right cell in the middle row contains
Tashlhiyt Berber and Moroccan Arabic as languages that lack lexical stress and exhibit
postlexical prominence that does not take account of metrical prominence structure
either above or below the word level. It was argued that the intonational prominence
associated with qwords in these languages could be characterised as ‘non-metrical
pitch accent’. Future work, especially the comparison with intonational marking of
other types of prominence (such as contrastive focus) might nevertheless shed further
light on whether the languages really lack postlexical–metrical prominence structure
altogether. The bottom row contains languages that are considered to lack any form
of postlexical phonological prominence. On the one hand, Wolof has been argued to
have lexical stress but no marking of intonational prominence at all. On the other
hand, there are languages that seemingly lack any phonological prominence, at both
the lexical and postlexical level: Korean and Ambonese Malay.
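The six cells just described can also be written down as a small lookup structure. The
following Python sketch is only a restatement of the prose above: the cell labels are
shorthand introduced here for illustration, the name `classify` is hypothetical, and cells
for which the text names no specific language are left empty rather than filled in.

```python
from typing import Dict, List, Tuple

# Key: (language has lexical stress, type of postlexical prominence).
# Value: languages named in the text for that cell.
# The string labels are illustrative shorthand, not the thesis' own terms,
# except where quoted in the text ('phrasal' and 'non-metrical' pitch accent).
TYPOLOGY: Dict[Tuple[bool, str], List[str]] = {
    (True, "pitch accent on stressed syllable"): [],  # "many languages", none named here
    (False, "phrasal pitch accent"): ["French", "Mongolian"],
    (True, "prominence not associated with stress"): [],  # existence called into question
    (False, "non-metrical pitch accent"): ["Tashlhiyt Berber", "Moroccan Arabic"],
    (True, "no postlexical prominence"): ["Wolof"],
    (False, "no postlexical prominence"): ["Korean", "Ambonese Malay"],
}


def classify(language: str) -> Tuple[bool, str]:
    """Return the (has_lexical_stress, postlexical_type) cell for a named language."""
    for cell, languages in TYPOLOGY.items():
        if language in languages:
            return cell
    raise KeyError(f"{language!r} is not named in this sketch of the typology")
```

Looking up `classify("Moroccan Arabic")` returns the cell for languages without lexical stress
that have a non-metrical pitch accent, mirroring the right cell in the middle row.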
A final note concerns the classification of a language as lacking both lexical and
postlexical prominence, as this does not entail that the language lacks prosodic means
altogether to mark differences in information structure, including, in particular, focus.
Korean, for example, is known to mark focus by means of phrasing. The crucial point
here is that it would lack a culminative position at the lexical, AP or other phrasal
level that serves as the predetermined location for a localised pitch event that
contributes to the percept of prominence.
9.4 Directions for future work
to date, including Chapters 6 and 7) will be informative. Edge adjacency might impose
production constraints on the realisation of tonal targets that obscure any potential
metrical (e.g. weight-related) factors that would otherwise play a more important role.
At this point it is known that contrastive focus in phrase-medial position, at least in MA,
is cued by local F0 protrusions (Yeou, Embarki & Al-Maqtari 2007). This experiment
yielded the insight that the rising–falling F0 contour in MA was less localised
than a comparable event that occurred in the same context in two other varieties
of Arabic that do have stress.
Similarly, the results on intonational prominence in TB reported in Chapter 6 raised
many questions about the language’s intonation system. What are the intonational
characteristics of contrastively focused words that occur phrase-medially? Do factors
like syllable weight and sonority play a role in determining tonal alignment (as in
Grice, Ridouane & Roettger 2015) or do intonational movements rather exhibit gradient
alignment (as in the case study on qwords in Chapter 6)? More detailed studies of tonal
alignment in both TB and MA will be required to gain a better insight into how both
intonational systems deal with the realisation of postlexical prominence in the absence
of lexical stress.
More generally, it is hoped that future work on other languages that lack lexical
prominence structure will uncover which possible correspondences there are between
metrically prominent positions at lexical and postlexical levels of phonological struc-
ture.
Appendices
A Tashlhiyt Berber scripted telephone
conversation
The following gives the English translation of the scripted telephone dialogue between two
speakers, M(other) and S(aid). The four target sentences (printed in bold) are glossed
as well; in order, the lines represent i) (in italics) the Latin script version as presented
to participants, ii) a phonemic transcription, and iii) a morphological gloss.
[… The line is breaking up … As for Said, his mother cannot hear him very well.]
B Moroccan Arabic scripted dialogue
The dialogue is given in Arabic on the right and the English translation is given on the
left. The two speakers in the dialogue are referred to as A and B (if the participant was
male, they were given a version with initial call to Ziyaad, if female to Manaal). Target
qword questions as reported on in Chapter 7 are printed in bold and identified by their
IVAr code.
B | I’ve been tired for two days. | �ع ّ�انة هادي �ومن
B | These days, we are inviting for the wedding of my uncle Maazin’s daughter. | هادي �ومن� و ا�نا كنعرضو على الناس لعرس د�ال بن� عم� مازن
A | Oh, is your cousin getting married? Congratulations. | بن� عمك �ا�� ّوج مبروك
B | Thank you and I wish the same for you. | اهلل �بارك ف�ك و العقبال�كم
A | Who is getting married, Dina or Mayyaada? | ّ شكون ل ّ�ل� �ا�� ّوج د�نا وال م ّ��ادة؟
B | Dina is the one who is getting married. And how are you? | و كِ �اَ� َرة ن��ا؟.ل ّ�ل� �ا�� ّوج ه� د�نا
A | Thank god, everything is fine. What have you been up to today? | شنو در�� ل�وم؟.ل�م� هلل كل ش� ال باس
B | Early evening, we played sport with Layla. And we got countryside flowers on our way from the shop Manaal. | مش�نا ل�انوت منال.فلعش�ة مش�نا لكلوب مع ل�لى ن��بو الورد البِل ْ�ي
A | Did you go to the Yamani sport centre? | لسوق ال�من�؟ ّ ���واش مش
B | No, I went to the Japanese sport centre. | �لسوق ال�ابان ّ ��ال مش
A | Did you go to the sport center with Layla or Lina? | ّ لسوق مع ل�لى وال ل�نا؟ ّ ���معا من مش
B | I went to the sport center with Lina. | لسوق معا ل�نا ّ ��مش
A (whq3) | When is the wedding of your cousin Dina? | إ�م�ا عرس بن� عمك د�نا؟
A | Dina’s wedding is in two days. | �عرس د�نا من هنا �ومن
B (whq1) | What is the name of the Yemeni man? | شنو سم�� ذا الرا�ل ال�من�؟
A | The Yemeni man’s name is Nabil Al-Badawi. | هذا الرا�ل ال�من� سم��و نب�ل الب�وي
B | Did they meet through Zeena? | واش القا�هم ز�نة؟
B | Does that mean that she met him in the cafe which is in the Morocco centre? | زعما شاف�و فهاد�ك لقهوة ل ّ�ل� كا�نة فاملول امل�رب�؟
A | Yes, she met him at the Al-Maghribi shopping centre. | �أه �عرف� عل�ه ف� املول امل�رب
B | Is it going to be a religious or civil marriage? | ّ واش العرس بل�ي وال روم�؟
A | The marriage is mostly civil. | �العرس �الب ًا روم
B (whq2) | Who is going to witness the civil marriage? | شكون ل ّ�ل� ��شه� على العرس الروم�؟
A | Noor and Zeen are going to witness the civil marriage. | ���ش ْه�و على العرس الروم ِ نور و ز�ن
B (whq4) | In which city will the wedding be held? | العرس ��كون فأش من م��نة؟
A | The wedding is most likely in Dubai. Will you be able to go to Dubai? | �ق�ري �مش� ل�ب�؟.�العرس �الب ًا ف� دب
B | I will try because Dina is very dear to me. | د�نا عز�زة عل�ة بزّاف. و اهلل �ن�اول
A | This wedding will be a mixture of urban and bedouin styles. | �هاذ العرس م�لط بن� البل�ي و الروم
A | Is the civil wedding going to be held at the Al-Baladi building? | العرس ��كون فالقصر البل�ي؟
B | Yes, Dina’s wedding is at the Al-Baladi building. | أه عرس د�نا ��كون فالقصر البل�ي
A | Is the party going to be in Layalina or Bayaan hall? | ّ واش ال�فلة ��كون فقاعة ل�ال�نا وال ب�ان؟
B | They will most likely book Al-Bayaan hall. | �الب ًا ��ر�زر �و فب�ان
A | Are they going to Dubai or to Lebanon after the wedding? | ّ �واش ��مش�و من بع� العرس ل�ب وال لبنان؟
B | They are going to spend the days after marriage in Lebanon. | ���وزو ش� �امات من بع� العرس فلبنان
A | Does it mean that she will visit her sister Layali? | زعما ��مش� �شوف ��ها ل�ال�؟
B | Yes, she will visit her sister Layali for a few days. | أه ��مش� �شوف ��ها ل�ال� ش� �امات
A | Is Nabil’s father going to be there? | ّبات نب�ل ��كون مو�ود؟
A (whq6) | What are you going to get for Dina from Dalaal’s shop? | شنو ����ب� ل��نا من �انوت دالل؟
B | I am not going to Dalaal’s, I am thinking of getting her a teddy bear from the White Flower. Today you’ll get the invitation with Walid. | ِّ .أنا ما �اداش عن� دالل فكرن��ب ل�ها نونوس من �� ال�وم �ادي �عرض عل�ك ول.عن� الوردة الب�ضا
A | Yes sure, anyway, congratulations. I will call her and congratulate her now. | �نع ّ�ط ل�ها و نبارك ل�ها دابا.أه بالص�؟ املهم مبروك
Bibliography
Abercrombie, David. 1976. Stress and some other terms. Work in Progress 9. 51–53.
Aboh, Enoch O. 2007. Focused versus non-focused wh-phrases. In Enoch O. Aboh, Kath-
arina Hartmann & Malte Zimmermann (eds.), Focus strategies in African languages: The
interaction of focus and grammar in Niger-Congo and Afro-Asiatic, vol. 191 (Trends in
Linguistics, Studies and Monographs), 287–314. Berlin: Mouton de Gruyter.
Aguadé, Jordi. 2008. Moroccan Arabic. In Kees Versteegh (ed.), Encyclopedia of Arabic
language and linguistics, vol. III, 273–297. Leiden: Brill.
Akrofi Ansah, Mercy. 2010. Focused constituent interrogatives in Lɛtɛ (Larteh). Nordic
Journal of African studies 19(2). 98–107.
Al-Ani, Salman H. 1970. Arabic phonology: An acoustical and physiological investigation.
Indiana University PhD dissertation.
Almbark, Rana, Nadia Bouchhioua & Sam Hellmuth. 2014. Acquiring the phonetics and
phonology of English word stress: Comparing learners from different L1 backgrounds.
Concordia Working Papers in Applied Linguistics 5. 19–35.
Al-Tamimi, Jalal. 2009. Effect of pharyngealisation on vowels revisited: Static and dy-
namic analyses of vowels in Moroccan and Jordanian Arabic. In Workshop on Pharyn-
geals and Pharyngealisation. Newcastle.
Al-Tamimi, Jalal. 2017. Revisiting acoustic correlates of pharyngealization in Jordanian
and Moroccan Arabic: Implications for formal representations. Laboratory Phonology:
Journal of the Association for Laboratory Phonology 8(1). 1–40.
https://doi.org/10.5334/labphon.19.
Antônio de Moraes, João. 1998. Intonation in Brazilian Portuguese. In Daniel Hirst
& Albert Di Cristo (eds.), Intonation systems: A survey of twenty languages, 179–194.
Cambridge: Cambridge University Press.
Applegate, Joseph R. 1958. An outline of the structure of Shilha. New York: American
Council of Learned Societies.
Arnhold, Anja. 2014. Prosodic structure and focus realization in West Greenlandic. In
Sun-Ah Jun (ed.), Prosodic Typology II: The phonology of intonation and phrasing, 216–
251. Oxford: Oxford University Press.
Arvaniti, Amalia & D. Robert Ladd. 2009. Greek wh-questions and the phonology of
intonation. Phonology 26(1). 43–74. https://doi.org/10.1017/S0952675709001717.
Arvaniti, Amalia, D. Robert Ladd & Ineke Mennen. 1998. Stability of tonal alignment:
the case of Greek prenuclear accents. Journal of Phonetics 26(1). 3–25.
https://doi.org/10.1006/jpho.1997.0063.
Arvaniti, Amalia, D. Robert Ladd & Ineke Mennen. 2000. What is a starred tone? Evidence
from Greek. In Michael J. Broe & Janet B. Pierrehumbert (eds.), Papers in Laboratory
Phonology V: Acquisition and the lexicon, 119–131. Cambridge: Cambridge University Press.
Aspinion, Robert. 1953. Apprenons le berbère: Initiation aux dialectes chleuhs. Rabat: Mon-
cho.
Athanasopoulou, Angeliki & Irene Vogel. 2016. The acoustic manifestation of promin-
ence in stressless languages. In Proceedings of Interspeech 2016, 82–86. San Francisco.
https://doi.org/10.21437/Interspeech.2016-1424.
Atterer, Michaela & D. Robert Ladd. 2004. On the phonetics and phonology of “seg-
mental anchoring” of F0: Evidence from German. Journal of Phonetics 32(2). 177–
197. https://doi.org/10.1016/S0095-4470(03)00039-1.
Barnes, Jonathan, Alejna Brugos, Stefanie Shattuck-Hufnagel & Nanette Veilleux. 2012a.
On the nature of perceptual differences between accentual peaks and plateaux. In
Oliver Niebuhr (ed.), Understanding prosody: The role of context, function and commu-
nication (Language, Context, and Cognition), 93–118. Berlin: De Gruyter.
Barnes, Jonathan, Nanette Veilleux, Alejna Brugos & Stefanie Shattuck-Hufnagel.
2012b. Tonal Center of Gravity. Journal of Laboratory Phonology 3(2). 337–383.
https://doi.org/10.1515/lp-2012-0017.
Bartels, Christine. 1997. The pragmatics of wh-question intonation in English. University
of Pennsylvania Working Papers in Linguistics 4(2).
http://repository.upenn.edu/pwpl/vol4/iss2/1.
Bates, Douglas, Martin Maechler, Ben Bolker & Steven Walker. 2015. lme4: Linear
mixed-effects models using Eigen and S4. R package. https://github.com/lme4/lme4/.
Bateson, Mary Catherine. 1967. Arabic language handbook. Vol. 3. Washington, D.C.:
Georgetown University Press.
Baunaz, Lena & Cédric Patin. 2009. Prosody refers to semantic factors: Evidence from
French wh-words. In Proceedings of IDP 09, 93–107.
Beckman, Mary E. 1986. Stress and non-stress accent. Dordrecht: Foris Publications Hol-
land.
Beckman, Mary E., Julia Hirschberg & Stefanie Shattuck-Hufnagel. 2005. The original
ToBI system and the evolution of the ToBI framework. In Sun-Ah Jun (ed.), Prosodic
Typology: The phonology of intonation and phrasing, 9–54. Oxford: Oxford University
Press.
Beckman, Mary E. & Janet B. Pierrehumbert. 1986. Intonational structure in Japanese
and English. Phonology Yearbook 3. 255–309.
Benhallam, Abderrafi. 1989. Aspects de la recherche en phonologie de l’arabe marocain.
In Langue et société au Maghreb, 13–23. Rabat: Publications de la faculté des
Lettres et des Sciences Humaines.
Benhallam, Abderrafi. 1990. Native speaker intuitions about Moroccan Arabic stress.
In Jochen Pleines (ed.), Maghreb linguistics, 91–109. Rabat: Éditions Okad.
Benkirane, Thami. 1982. Étude phonétique et fonctions de la syllabe en arabe marocain.
Université de Provence Aix-Marseille I PhD dissertation.
Benkirane, Thami. 1998. Intonation in Western Arabic (Morocco). In Daniel Hirst &
Albert Di Cristo (eds.), Intonation systems: A survey of twenty languages, 345–359. Cam-
bridge: Cambridge University Press.
Bishop, Judith & Janet Fletcher. 2005. Intonation in six dialects of Bininj Gun-Wok. In
Sun-Ah Jun (ed.), Prosodic Typology: The phonology of intonation and phrasing, 331–
361. Oxford: Oxford University Press.
Blevins, Juliette. 2004. Evolutionary phonology: The emergence of sound patterns. Cam-
bridge: Cambridge University Press.
Bocci, Giuliano & Cinzia Avesani. 2011. Phrasal prominences do not need pitch move-
ments: Postfocal phrasal heads in Italian. In Proceedings of Interspeech 2011. Florence.
Boersma, Paul & David Weenink. 2015. Praat: Doing phonetics by computer. Software.
Version 6.0.22.
Bolinger, Dwight (ed.). 1972. Intonation: Selected readings. Harmondsworth: Penguin
Education.
Bolinger, Dwight. 1978. Intonation across languages. In Joseph H. Greenberg (ed.),
Universals of human language, 471–524. Stanford: Stanford University Press.
Bolinger, Dwight. 1986. Intonation and its parts: Melody in spoken English. London: Ed-
ward Arnold.
Bolinger, Dwight L. 1958. A theory of pitch accent in English. Word 14(2–3). 109–149.
https://doi.org/10.1080/00437956.1958.11659660.
Bouchhioua, Nadia. 2008. The acoustic correlates of stress and accent in Tunisian Arabic:
A comparative study with English. Université de 7 Novembre: l’Institut Supérieur des
Langues de Tunis PhD dissertation.
Boudlal, Abdelaziz. 2001. Constraint interaction in the phonology and morphology of Cas-
ablanca Moroccan Arabic. Université Mohammed V PhD dissertation.
Brockelmann, Carl. 1908. Grundriss der vergleichenden Grammatik der semitischen Sprac-
hen. Berlin: Reuther & Reichard.
Bruce, Gösta. 1977. Swedish word accents in sentence perspective. Malmo: LiberLärome-
del/Gleerup.
Bruggeman, Anna, Nabila Louriz, Rana Almbark & Sam Hellmuth. in press. Acoustic
correlates of lexical stress in Moroccan Arabic. Journal of the International Phonetic
Association.
Bruggeman, Anna & Timo Benjamin Roettger. 2017. CoTaSS: Corpus of Tashlhiyt Semi-
spontaneous Speech. Online speech database. http://cotass.uni-koeln.de.
Bruggeman, Anna, Timo Benjamin Roettger & Martine Grice. 2017. Question word in-
tonation in Tashlhiyt: Is ‘high’ good enough? Laboratory Phonology: Journal of the Asso-
ciation for Laboratory Phonology 8(1), 5. https://doi.org/10.5334/labphon.79.
Brunelle, Marc. 2017. Stress and phrasal prominence in tone languages: The case of
Southern Vietnamese. Journal of the International Phonetic Association 47(3). 283–320.
https://doi.org/10.1017/S0025100316000402.
Burdin, Rachel Steindel, Sara Phillips-Bourass, Rory Turnbull, Murat Yasavul,
Cynthia G. Clopper & Judith Tonhauser. 2015. Variation in the prosody of focus in
Coleman, John. 1999. The nature of vocoids associated with syllabic consonants in
Tashlhiyt Berber. In Proceedings of the International Congress of Phonetic Sciences XIV,
735–738. San Francisco.
Coleman, John. 2001. The phonetics and phonology of Tashlhiyt Berber syllabic
consonants. Transactions of the Philological Society 99(1). 29–64.
https://doi.org/10.1111/1467-968X.00073.
Correia, Susana, Joseph Butler, Marina Vigário & Sónia Frota. 2015. A stress “deafness”
effect in European Portuguese. Language and Speech 58(1). 48–67.
https://doi.org/10.1177/0023830914565193.
Crosswhite, Katherine. 2001. Vowel reduction in Optimality Theory (Outstanding disser-
tations in linguistics). New York: Routledge.
Cruttenden, Alan. 1986. Intonation. Cambridge: Cambridge University Press.
Crystal, David. 1972. The intonation system of English. In Dwight Bolinger (ed.), Inton-
ation: Selected readings, 110–137. Harmondsworth: Penguin Education.
Culicover, Peter W. & Michael Rochemont. 1983. Stress and focus in English. Language
59(1). 123–165. https://doi.org/10.2307/414063.
Cutler, Anne & Dennis Norris. 1988. The role of strong syllables in segmentation for
lexical access. Journal of Experimental Psychology: Human perception and performance
14(1). 113–121. https://doi.org/10.1037/0096-1523.14.1.113.
d’Alessandro, Christophe & Piet Mertens. 1995. Automatic pitch contour stylization
using a model of tonal perception. Computer Speech and Language 9. 257–288.
https://doi.org/10.1006/csla.1995.0013.
Dascalu-Jinga, Laurentia. 1998. Intonation in Romanian. In Daniel Hirst & Albert Di
Cristo (eds.), Intonation systems: A survey of twenty languages, 239–260. Cambridge:
Cambridge University Press.
Dell, François & Mohamed Elmedlaoui. 1985. Syllabic consonants and syllabification in
Imdlawn Tashlhiyt Berber. Journal of African Languages and Linguistics 7(2). 105–130.
https://doi.org/10.1515/jall.1985.7.2.105.
Dell, François & Mohamed Elmedlaoui. 2002. Syllables in Tashlhiyt Berber and in Moroc-
can Arabic. Dordrecht: Kluwer.
Dell, François & Mohamed Elmedlaoui. 2008. Poetic meter and musical form in Tashlhiyt
Berber songs. Köln: Rüdiger Köppe Verlag.
Dik, Simon C. 1997a. The theory of Functional Grammar: Part 1, The structure of the clause
(Functional Grammar Series). Berlin: De Gruyter.
Dik, Simon C. 1997b. The theory of Functional Grammar: Part 2, Complex and derived
constructions (Functional Grammar Series). Berlin: De Gruyter.
Dilley, Laura C., D. Robert Ladd & Astrid Schepman. 2005. Alignment of L and H in
bitonal pitch accents: Testing two hypotheses. Journal of Phonetics 33(1). 115–119.
https://doi.org/10.1016/j.wocn.2004.02.003.
D’Imperio, Mariapaola. 2006. Preface. Rivista di Linguistica 18(1). 3–18.
Dupoux, Emmanuel, Christophe Pallier, Nuria Sebastián & Jacques Mehler. 1997. A
destressing “deafness” in French? Journal of Memory and Language 36(3). 406–421.
https://doi.org/10.1006/jmla.1996.2500.
Bibliography
Dupoux, Emmanuel & Sharon Peperkamp. 2002. Fossil markers of language develop-
ment: phonological ‘deafnesses’ in adult speech processing. In Jacques Durand & Bern-
ard Laks (eds.), Phonetics, phonology, and cognition, 168–190. Oxford: Oxford Univer-
sity Press.
Dupoux, Emmanuel, Sharon Peperkamp & Núria Sebastián-Gallés. 2001. A robust me-
thod to study stress “deafness”. The Journal of the Acoustical Society of America 110(3).
1606–1618. https://doi.org/10.1121/1.1380437.
Dupoux, Emmanuel, Sharon Peperkamp & Núria Sebastián-Gallés. 2010. Limits on bilin-
gualism revisited: Stress ‘deafness’ in simultaneous French-Spanish bilinguals. Cogni-
tion 114(2). 266–275. https://doi.org/10.1016/j.cognition.2009.10.001.
Dupoux, Emmanuel, Núria Sebastián-Gallés, Eduardo Navarrete & Sharon Peperkamp.
2008. Persistent stress ‘deafness’: The case of French learners of Spanish. Cognition
106(2). 682–706. https://doi.org/10.1016/j.cognition.2007.04.001.
Durand, Olivier. 1994. Profilo di arabo marocchino. Roma: Università degli Studi ‘La
Sapienza’.
É. Kiss, Katalin. 1995. Discourse configurational languages: Introduction. In Katalin
É. Kiss (ed.), Discourse configurational languages, 3–27. Oxford: Oxford University
Press.
El Aissati, Abderrahman. 2005. A socio-historical perspective on the Amazigh (Berber)
cultural movement in North Africa. Afrika Focus 18(1/2). 59–72.
El Zarka, Dina. 2011. Leading, linking, and closing tones and tunes in Egyptian Arabic:
What a simple intonation system tells us about the nature of intonation. In Ellen
Broselow & Hamid Ouali (eds.), Perspectives on Arabic linguistics: Papers from the annual
symposia on Arabic linguistics volume XXII-XXIII (College Park, Maryland, 2008 and
Milwaukee, Wisconsin, 2009), 57–74. Amsterdam: John Benjamins.
Elordieta, Gorka. 1998. Intonation in a pitch accent variety of Basque. International
Journal of Basque Linguistics and Philology 32. 511–569.
Elordieta, Gorka & José Ignacio Hualde. 2003. Tonal and durational correlates of ac-
cent in contexts of downstep in Lekeitio Basque. Journal of the International Phonetic
Association 33(2). 195–209. https://doi.org/10.1017/S0025100303001294.
Elordieta, Gorka & José Ignacio Hualde. 2014. Intonation in Basque. In Sun-Ah Jun
(ed.), Prosodic Typology II: The phonology of intonation and phrasing, 405–463. Oxford:
Oxford University Press.
E-Prime 3.0. 2017. Software. Pittsburgh, PA: Psychology Software Tools, Inc.
http://www.pstnet.com.
Erteschik-Shir, Nomi. 1986. Wh-questions and focus. Linguistics and Philosophy 9(2).
117–149. https://doi.org/10.1007/BF00635608.
Erteschik-Shir, Nomi. 2007. Information structure: The syntax-discourse interface. Oxford:
Oxford University Press.
Evans, Jonathan (ed.). 2015. What’s so special about H(igh)? Multi-disciplinary perspect-
ives on the linguistic functions of raised pitch. Special issue of Journal of Phonetics.
Fischer, August. 1917. Zur Lautlehre des Marokkanisch-Arabischen. Leipzig: Edelmann.
Fletcher, Janet. 2014. Intonation and prosody in Dalabon. In Sun-Ah Jun (ed.), Pros-
odic Typology II: The phonology of intonation and phrasing, 252–272. Oxford: Oxford
University Press.
Foster, Michael K. 1982. Alternating weak and strong syllables in Cayuga words. Inter-
national Journal of American Linguistics 48(1). 59–72.
Frota, Sónia. 2002. Nuclear falls and rises in European Portuguese: A phonological
analysis of declarative and question intonation. Probus 14(1). 113–146. https://
doi.org/10.1515/prbs.2002.001.
Frota, Sónia & Marina Vigário. 2000. Aspectos de prosódia comparada: Ritmo e
entoação no PE e no PB. In Actas do XV encontro da Associação Portuguesa de Lin-
guística, 533–555. Braga: Associação Portuguesa de Linguística.
Fry, Dennis B. 1955. Duration and intensity as physical correlates of linguistic stress.
The Journal of the Acoustical Society of America 27(4). 765–768. https://doi.org/
10.1121/1.1908022.
Fry, Dennis B. 1958. Experiments in the perception of stress. Language and Speech 1(2).
126–152. https://doi.org/10.1177/002383095800100207.
Fujisaki, Hiroya & Keikichi Hirose. 1984. Analysis of voice fundamental frequency con-
tours for declarative sentences of Japanese. Journal of the Acoustical Society of Japan
(E.) 5(4). 233–242. https://doi.org/10.1250/ast.5.233.
Genzel, Susanne. 2013. Lexical and post-lexical tones in Akan. Universität Potsdam PhD
dissertation.
Goedemans, Rob & Harry van der Hulst. 2013. Fixed stress locations. In Matthew S.
Dryer & Martin Haspelmath (eds.), The World Atlas of Language Structures Online.
Leipzig: Max Planck Institute for Evolutionary Anthropology. http://wals.info/
chapter/14.
Goedemans, Rob & Ellen Van Zanten. 2007. Stress and accent in Indonesian. LOT Occa-
sional series 9. 35–62.
Goldsmith, John A. 1976. Autosegmental phonology. Bloomington, IN: Indiana University
Linguistics Club Bloomington.
Gooden, Shelome. 2014. Aspects of the intonational phonology of Jamaican Creole. In
Sun-Ah Jun (ed.), Prosodic Typology II: The phonology of intonation and phrasing, 273–
301. Oxford: Oxford University Press.
Gordon, Matthew. 2005. Intonational phonology of Chickasaw. In Sun-Ah Jun (ed.),
Prosodic Typology: The phonology of intonation and phrasing, 301–330. Oxford: Oxford
University Press.
Gordon, Matthew. 2014. Disentangling stress and pitch accent: A typology of promin-
ence at different prosodic levels. In Harry van der Hulst (ed.), Word stress: Theoretical
and typological issues, 83–118. Cambridge: Cambridge University Press.
Gordon, Matthew & Latifa Nafi. 2012. Acoustic correlates of stress and pitch accent
in Tashelhiyt Berber. Journal of Phonetics 40(5). 706–724. https://doi.org/10.
1016/j.wocn.2012.04.003.
Gordon, Matthew & Timo Roettger. 2017. Acoustic correlates of word stress: A cross-
linguistic survey. Linguistics Vanguard 3(1). https://doi.org/10.1515/lingvan-
2017-0007.
Gorman, Kyle, Jonathan Howell & Michael Wagner. 2011. Prosodylab-Aligner: A tool
for forced alignment of laboratory speech. Canadian Acoustics 39(3). 192–193. jcaa.
caa-aca.ca/index.php/jcaa/article/view/2476.
Grice, Martine, Stefan Baumann & Ralf Benzmüller. 2005. German intonation in Auto-
segmental-Metrical phonology. In Sun-Ah Jun (ed.), Prosodic Typology: The phonology
of intonation and phrasing, 55–83. Oxford: Oxford University Press.
Grice, Martine, D. Robert Ladd & Amalia Arvaniti. 2000. On the place of phrase accents
in intonational phonology. Phonology 17(2). 143–185. https://doi.org/10.1017/
S0952675700003924.
Grice, Martine, Rachid Ridouane & Timo Benjamin Roettger. 2015. Tonal association
in Tashlhiyt Berber: Evidence from polar questions and contrastive statements. Phon-
ology 32(2). 241–266. https://doi.org/10.1017/S0952675715000147.
Grice, Martine, Simon Ritter, Henrik Niemann & Timo B. Roettger. 2017. Integrating
the discreteness and continuity of intonational categories. Journal of Phonetics 64(1).
90–107. https://doi.org/10.1016/j.wocn.2017.03.003.
Grice, Martine, Timo Benjamin Roettger, Rachid Ridouane & Cécile Fougeron. 2011.
The association of tones in Tashlhiyt Berber. In Proceedings of the International Congress
of Phonetic Sciences XVII. Hong Kong.
Grice, Martine, Alexandra Vella & Anna Bruggeman. 2019. Stress, pitch accent, and
beyond: Intonation in Maltese questions. Journal of Phonetics 76. 100913.
https://doi.org/10.1016/j.wocn.2019.100913.
Gussenhoven, Carlos. 1983. Focus, mode and the nucleus. Journal of Linguistics 19(2).
377–417. https://doi.org/10.1017/S0022226700007799.
Gussenhoven, Carlos. 2004. The phonology of tone and intonation (Research Surveys in
Linguistics). Cambridge: Cambridge University Press.
Gussenhoven, Carlos. 2005. Transcription of Dutch intonation. In Sun-Ah Jun (ed.),
Prosodic Typology: The phonology of intonation and phrasing, 118–145. Oxford: Oxford
University Press.
Gussenhoven, Carlos. 2015. Does phonological prominence exist? Lingue e Linguaggio
XIV(1). 7–24.
Gussenhoven, Carlos. 2017. Zwara (Zuwārah) Berber. Journal of the International Phon-
etic Association. 1–17. https://doi.org/10.1017/S0025100317000135.
Haan, Judith. 2002. Speaking of questions: An exploration of Dutch question intonation
(LOT series). The Hague: Holland Academic Graphics.
Hagstrom, Paul. 2003. What questions mean. Glot International 7. 188–201.
Halim, Amran. 1974. Intonation in relation to syntax in bahasa Indonesia. Jakarta: Djam-
batan.
Hamdi, Rachid. 1991. Etude phonologique et expérimentale de l’emphase en arabe marocain
de Casablanca. Université Lyon 2 PhD dissertation.
Harrell, Richard Slade. 1962. A short reference grammar of Moroccan Arabic: With an
appendix of texts in Urban Moroccan Arabic by Louis Brunot. Washington, D.C.: Geor-
getown University Press.
Hayes, Bruce. 1995. Metrical stress theory: Principles and case studies. Chicago: The Uni-
versity of Chicago Press.
Hayes, Bruce & Aditi Lahiri. 1991. Bengali intonational phonology. Natural Language
and Linguistic Theory 9(1). 47–96. https://doi.org/10.1007/BF00133326.
Heath, Jeffrey. 1997. Moroccan Arabic Phonology. In Alan S. Kaye (ed.), Phonologies of
Asia and Africa (including the Caucasus), 205–217. Winona Lake (IN): Eisenbrauns.
Heath, Jeffrey. 2005. A grammar of Tamashek (Tuareg of Mali) (Mouton grammar library
35). Berlin: De Gruyter.
Hellmuth, Sam. 2006. Intonational pitch accent distribution in Egyptian Arabic. SOAS Uni-
versity of London PhD dissertation.
Hellmuth, Sam & Rana Almbark. 2017. IVAr: Intonational Variation in Arabic Corpus.
Data collection. Colchester. https://dx.doi.org/10.5255/UKDA-SN-852878.
Hellmuth, Sam, Nabila Louriz, Basma Chlaihani & Rana Almbark. 2015. F0 peak align-
ment in Moroccan Arabic polar questions. In Proceedings of the International Congress
of Phonetic Sciences XVIII. Glasgow, UK.
Hellmuth, Sam, Becky Muradás-Taylor & Bethany Karrinton. to appear. Cue dependent
stress perception in English listeners: The effect of linguistic experience.
Hellmuth, Sam, Becky Muradás-Taylor & Bethany White. N.d. Non-persistent stress deaf-
ness in English listeners.
Henriksen, Nicholas. 2014. Initial peaks and final falls in the intonation of Manchego
Spanish wh-questions. Probus 26(1). 83–133. https://doi.org/10.1515/probus-
2013-0003.
Hirschberg, Julia. 2000. A corpus-based approach to the study of speaking style. In
Merle Horne (ed.), Prosody: Theory and experiment. Studies presented to Gösta Bruce,
335–350. Dordrecht: Kluwer Academic Publishers.
Hirst, Daniel & Albert Di Cristo (eds.). 1998. Intonation systems: A survey of twenty lan-
guages. Cambridge: Cambridge University Press.
Holes, Clive. 2004. Modern Arabic: Structures, functions, and varieties. Revised edition.
Washington, D.C.: Georgetown University Press.
Horvath, Julia. 1986. FOCUS in the theory of grammar and the syntax of Hungarian.
Dordrecht: Foris Publications.
Hyman, Larry. 1977. On the nature of linguistic stress. In Larry Hyman (ed.), Studies in
stress and accent (Southern California Occasional Papers in Linguistics 4), 37–82. Los
Angeles: University of Southern California.
Hyman, Larry. 2014. Do all languages have word accent? In Harry van der Hulst (ed.),
Word stress: Theoretical and typological issues, 56–82. Cambridge: Cambridge Univer-
sity Press.
Igarashi, Yosuke. 2014. Typology of intonational phrasing in Japanese dialects. In Sun-
Ah Jun (ed.), Prosodic Typology II: The phonology of intonation and phrasing, 464–492.
Oxford: Oxford University Press.
Jacobsen, Birgitte. 2000. The question of ‘stress’ in West Greenlandic. Phonetica 57(1).
40–67. https://doi.org/10.1159/000028458.
Jaggar, Philip J. 2006. More on in-situ wh- and focus constructions in Hausa. In Dymitr
Ibriszimow, Henry Tourneux & Ekkehard H. Wolff (eds.), Chadic Linguistics/Linguis-
tique Tchadique/Tschadistik, vol. 3, 49–73. Köln: Rüdiger Köppe Verlag.
Jun, Sun-Ah. 1993. The phonetics and phonology of Korean prosody. The Ohio State Uni-
versity PhD dissertation.
Jun, Sun-Ah. 2005a. Korean intonational phonology and prosodic transcription. In Sun-
Ah Jun (ed.), Prosodic Typology: The phonology of intonation and phrasing, 201–229.
Oxford: Oxford University Press.
Jun, Sun-Ah. 2005b. Prosodic Typology. In Sun-Ah Jun (ed.), Prosodic Typology: The
phonology of intonation and phrasing, 430–458. Oxford: Oxford University Press.
Jun, Sun-Ah (ed.). 2005c. Prosodic Typology: The phonology of intonation and phrasing.
Oxford: Oxford University Press.
Jun, Sun-Ah. 2014a. Prosodic typology: By prominence type, word prosody, and macro-
rhythm. In Sun-Ah Jun (ed.), Prosodic Typology II: The phonology of intonation and
phrasing, 520–539. Oxford: Oxford University Press.
Jun, Sun-Ah (ed.). 2014b. Prosodic Typology II: The phonology of intonation and phrasing.
Oxford: Oxford University Press.
Jun, Sun-Ah & Cécile Fougeron. 2002. Realisations of accentual phrase in French inton-
ation. Probus 14(1). 147–172. https://doi.org/10.1515/prbs.2002.002.
Jun, Sun-Ah & Mira Oh. 1996. A prosodic analysis of three types of wh-phrases in
Korean. Language and Speech 39(1). 37–61.
https://doi.org/10.1177/002383099603900103.
Kabak, Barış & Irene Vogel. 2001. The phonological word and stress assignment in Turk-
ish. Phonology 18(3). 315–360. https://doi.org/10.1017/S0952675701004201.
Kang, Yoonjung & Sungwoo Han. 2013. Tonogenesis in early Contemporary Seoul Kore-
an: A longitudinal case study. Lingua 134(1). 62–74. https://doi.org/10.1016/
j.lingua.2013.06.002.
Karlsson, Anastasia. 2014. The intonational phonology of Mongolian. In Sun-Ah Jun
(ed.), Prosodic Typology II: The phonology of intonation and phrasing, 187–215. Oxford:
Oxford University Press.
Karlsson, Anastasia & Jan-Olof Svantesson. 2004. Prominence and mora in Mongolian.
In Proceedings of Speech Prosody 2. Nara, Japan.
Keane, Elinor. 2003. Word-level prominence distinctions in Tamil. In Proceedings of the
International Congress of Phonetic Sciences XV, 1257–1260. Barcelona.
Keane, Elinor. 2006a. Phonetics vs. phonology in Tamil wh-questions. In Proceedings of
Speech Prosody 3. Dresden.
Keane, Elinor. 2006b. Prominence in Tamil. Journal of the International Phonetic Associ-
ation 36(1). 1–20. https://doi.org/10.1017/S0025100306002337.
Kendall, Tyler & Erik R. Thomas. 2014. vowels. R package.
http://ncslaap.lib.ncsu.edu/tools/norm/.
Essais sur des variations dialectales et autres articles (Berber Studies 28). Köln: Rüdiger
Köppe Verlag.
Lambrecht, Knud. 1994. Information structure and sentence form: Topic, focus, and the
mental representations of discourse referents (Cambridge Studies in Linguistics 71). Cam-
bridge: Cambridge University Press.
Laoust, Emile. 2012. Cours de berbère marocain. Deuxième Édition [1936, 1920]. Casab-
lanca: Éditions Frontispice.
Lehiste, Ilse. 1970. Suprasegmental features of speech. In Norman J. Lass (ed.), Contem-
porary issues in experimental phonetics, 225–239. New York: Academic Press.
Lenth, Russell V. & Maxime Hervé. 2015. lsmeans: Least-Squares Means. R package.
Levi, Susannah V. 2005. Acoustic correlates of lexical accent in Turkish. Journal of
the International Phonetic Association 35(1). 73–97. https://doi.org/10.1017/
S0025100305001921.
Liberman, Mark & Alan Prince. 1977. On stress and linguistic rhythm. Linguistic Inquiry
8(2). 249–336. https://www.jstor.org/stable/4177987.
Lindström, Eva & Bert Remijsen. 2005. Aspects of the prosody of Kuot, a language
where intonation ignores stress. Linguistics 43(4). 839–870. https://doi.org/10.
1515/ling.2005.43.4.839.
Lux, Cécile. 2014. Focalization process and intonation in Meridional Berber: The case
of Tamasheq and Tetserret. STUF – Language Typology and Universals 67(1). 113–126.
https://doi.org/10.1515/stuf-2014-0009.
Maas, Utz. 2013. Die marokkanische Akzentuierung. In Renaud Kuty, Ulrich Seeger &
Shabo Talay (eds.), Nicht nur mit Engelszungen: Beiträge zur semitischen Dialektologie:
Festschrift für Werner Arnold zum 60. Geburtstag, 223–234. Wiesbaden: Harrassowitz
Verlag.
Maas, Utz. N.d. Marokkanisches Arabisch: Zur Struktur einer Sprache im Werden. Ms., Uni-
versität Graz. https://zentrum.virtuos.uni-osnabrueck.de/utz.maas/
Main/MarokkanischesArabischZurStrukturEinerSpracheImWerden.
Maas, Utz & Stefan Procházka. 2012a. Introduction: Moroccan Arabic in typological
perspective. STUF – Language Typology and Universals 65(4). 321–328. https://doi.
org/10.1524/stuf.2012.0020.
Maas, Utz & Stefan Procházka. 2012b. Moroccan Arabic in its wider linguistic and social
contexts. STUF – Language Typology and Universals 65(4). 329–357. https://doi.
org/10.1524/stuf.2012.0021.
Mahjani, Behzad. 2003. An instrumental study of prosodic features and intonation in mod-
ern Farsi (Persian). University of Edinburgh Master’s thesis.
Maskikit-Essed, Raechel & Carlos Gussenhoven. 2016. No stress, no pitch accent, no
prosodic focus: The case of Ambonese Malay. Phonology 33(2). 1–37. https://doi.
org/10.1017/S0952675716000154.
McClelland, James L. & Jeffrey L. Elman. 1986. The TRACE model of speech perception.
Cognitive psychology 18(1). 1–86. https://doi.org/10.1016/0010-0285(86)90015-0.
Pierrehumbert, Janet. 1980. The phonology and phonetics of English intonation. Massachu-
setts Institute of Technology PhD dissertation.
Pierrehumbert, Janet & Mary Beckman. 1988. Japanese tone structure (Linguistic Inquiry
Monographs). Cambridge, MA: MIT Press.
Post, Brechtje. 2000. Tonal and phrasal structures in French intonation (LOT series). The
Hague: Holland Academic Graphics.
Prieto, Pilar. 2004. The search for phonological targets in the tonal space: H1 scaling
and alignment in five sentence-types in Peninsular Spanish. In Timothy Face (ed.),
Laboratory approaches to Spanish phonology, 29–59. Berlin: De Gruyter.
Prieto, Pilar. 2011. Tonal alignment. In Marc Van Oostendorp, Colin Ewen, Beth
Hume & Keren Rice (eds.), Companion to Phonology, 1185–1203. Chichester: Wiley-
Blackwell.
Prieto, Pilar. 2014. The intonational phonology of Catalan. In Sun-Ah Jun (ed.), Prosodic
Typology II: The phonology of intonation and phrasing, 43–80. Oxford: Oxford University
Press.
Prieto, Pilar, Mariapaola D’Imperio & Barbara Gili Fivela. 2005. Pitch accent alignment
in Romance: Primary and secondary associations with metrical structure. Language
and Speech 48(4). 359–396. https://doi.org/10.1177/00238309050480040301.
R Core Team. 2016. R: A Language and Environment for Statistical Computing. Software.
Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.
org/.
Rahmani, Hamed, Toni Rietveld & Carlos Gussenhoven. 2015. Stress ‘deafness’ reveals
lexical marking of stress and tone in the adult grammar. PLoS ONE 10(12). https:
//doi.org/10.1371/journal.pone.0143968.
Reichel, Uwe D. 2010. Datenbasierte und linguistisch interpretierbare Intonationsmodellier-
ung. University of Munich PhD dissertation.
https://edoc.ub.uni-muenchen.de/12650/1/Reichel_Uwe.pdf.
Remijsen, Bert, Fabienne Martis & Ronald Severing. 2014. The marked accentuation
pattern of Curaçao Papiamentu. In Sun-Ah Jun (ed.), Prosodic Typology II: The phono-
logy of intonation and phrasing, 302–323. Oxford: Oxford University Press.
Rhiati-Salih, Najib. 1984. Etude de l’interrogation en arabe marocain. Université de Paris
III, Sorbonne-Nouvelle PhD dissertation.
Rialland, Annie & Stéphane Robert. 2001. The intonational system of Wolof. Linguistics
39(5). 893–940. https://doi.org/10.1515/ling.2001.038.
Ridouane, Rachid. 2008. Syllables without vowels: Phonetic and phonological evidence
from Tashlhiyt Berber. Phonology 25(2). 321–359. https://doi.org/10.1017/
S0952675708001498.
Ridouane, Rachid. 2014. Tashlhiyt Berber. Journal of the International Phonetic Associ-
ation 44(2). 207–221. https://doi.org/10.1017/S0025100313000388.
Rietveld, Toni, Joop Kerkhoff & Carlos Gussenhoven. 2004. Word prosodic structure
and vowel duration in Dutch. Journal of Phonetics 32(3). 349–371. https://doi.
org/10.1016/j.wocn.2003.08.002.
Roettger, Timo B. 2017. Tonal placement in Tashlhiyt: How an intonation system accom-
modates to adverse phonological environments (Studies in Laboratory Phonology 3). Ber-
lin: Language Science Press. https://doi.org/10.5281/zenodo.814472.
Roettger, Timo B., Bodo Winter, Sven Grawunder, James Kirby & Martine Grice. 2014.
Assessing incomplete neutralization of final devoicing in German. Journal of Phonetics
43(1). 11–25. https://doi.org/10.1016/j.wocn.2014.01.002.
Roettger, Timo Benjamin, Anna Bruggeman & Martine Grice. 2015. Word stress in Tashl-
hiyt: Postlexical prominence in disguise? In Proceedings of the International Congress
of Phonetic Sciences XVIII. Glasgow, UK.
Roettger, Timo Benjamin & Martine Grice. 2015. The role of high pitch in Tashlhiyt
Tamazight (Berber): Evidence from production and perception. Journal of Phonetics
51(1). 36–49. https://doi.org/10.1016/j.wocn.2014.12.004.
Roettger, Timo Benjamin, Rachid Ridouane & Martine Grice. 2012. Sonority and syl-
lable weight determine tonal association in Tashlhiyt. In Proceedings of Speech Prosody
6. Shanghai.
Rooth, Mats. 1985. Association with focus. University of Massachusetts PhD dissertation.
Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1(1).
75–116. https://doi.org/10.1007/BF02342617.
Sabel, Joachim & Jochen Zeller. 2006. Wh-question formation in Nguni. In Selected
proceedings of the 35th annual conference on African linguistics, 271–283.
Sadeghi, Vahid. 2011. Acoustic correlates of lexical stress in Persian. In Proceedings of
the International Congress of Phonetic Sciences XVII, 1738–1741. Hong Kong.
Sadiqi, Fatima. 1997. Grammaire du berbère. Paris: Editions l’Harmattan.
Selkirk, Elisabeth O. 1986. Phonology and syntax: The relationship between sound and
structure. Cambridge, MA: MIT Press.
Shattuck-Hufnagel, Stefanie. 1994. “Stress shift” as early placement of pitch accents: A
comment on Beckman and Edwards. In Patricia Keating (ed.), Papers in Laboratory
Phonology III: Phonological structure and phonetic form, 34–43. Cambridge: Cambridge
University Press.
Simons, Gary F. & Charles D. Fennig (eds.). 2017. Ethnologue: Languages of the World.
Dallas, Texas. http://www.ethnologue.com.
Singmann, Henrik, Ben Bolker, Jake Westfall & F. Aust. 2015. afex: Analysis of factorial
experiments. R package.
Skoruppa, Katrin, Ferran Pons, Anne Christophe, Laura Bosch, Emmanuel Dupoux,
Núria Sebastián-Gallés, Rita Alves Limissuri & Sharon Peperkamp. 2009. Language-
specific stress perception by 9-month-old French and Spanish infants. Developmental
Science 12(6). 914–919. https://doi.org/10.1111/j.1467-7687.2009.00835.
x.
Sluijter, Agaath & Vincent van Heuven. 1996. Notes on the phonetics of word prosody.
In Rob Goedemans, Harry van der Hulst & Ellis Visch (eds.), Stress patterns of the
world, 233–269. The Hague: Holland Academic Graphics.
Sluijter, Agaath, Vincent van Heuven & Jos Pacilly. 1997. Spectral balance as a cue in
the perception of linguistic stress. Journal of the Acoustical Society of America 101(1).
503–513. https://doi.org/10.1121/1.417994.
Song, Sanghoun. 2017. Modelling information structure in a cross-linguistic perspect-
ive (Topics at the Grammar-Discourse Interface 1). Berlin: Language Science Press.
https://doi.org/10.5281/zenodo.818365.
Sosa, Juan Manuel. 2003. Wh-questions in Spanish: Meanings and configuration vari-
ability. Catalan Journal of Linguistics 2. 229–247. https://doi.org/10.5565/rev/
catjl.51.
Stoyanova, Marina. 2004. The typology of multiple wh-questions and language vari-
ation. In Sylvia Blaho, Luis Vicente & Mark de Vos (eds.), ConSOLE XII: Proceedings
of the conference of the Student Organisation of Linguistics in Europe XII.
Stumme, Hans. 1899. Handbuch des Schilhischen von Tazerwalt. Leipzig: J.C. Hin-
richs’sche Buchhandlung.
Stumme, Hans & Albert Socin. 1894. Der arabische Dialekt der Houwara des Wad Sus in
Marokko. Leipzig: Hirzel.
Svetozarova, Natalia. 1998. Intonation in Russian. In Daniel Hirst & Albert Di Cristo
(eds.), Intonation systems: A survey of twenty languages, 261–274. Cambridge: Cam-
bridge University Press.
Swerts, Marc. 2007. Contrast and accent in Dutch and Romanian. Journal of Phonetics
35(3). 380–397. https://doi.org/10.1016/j.wocn.2006.07.001.
Swerts, Marc, Emiel Krahmer & Cinzia Avesani. 2002. Prosodic marking of information
status in Dutch and Italian: A comparative analysis. Journal of Phonetics 30(4). 629–
654. https://doi.org/10.1006/jpho.2002.0178.
Szalontai, Ádám, Petra Wagner, Katalin Mády & Andreas Windmann. 2016. Teasing
apart lexical stress and sentence accent in Hungarian and German. In Christoph
Draxler & Felicitas Kleber (eds.), Tagungsband der 12. Tagung Phonetik und Phonologie
im Deutschsprachigen Raum, 215–218. Munich University.
’t Hart, Johan. 1976. Psychoacoustic backgrounds of pitch contour stylization. In IPO
annual progress report, vol. 11. Eindhoven: IPO.
’t Hart, Johan. 1981. Differential sensitivity to pitch distance, particularly in speech.
Journal of the Acoustical Society of America 69(3). 811–821. https://doi.org/10.
1121/1.385592.
’t Hart, Johan & Antonie Cohen. 1973. Intonation by rule: A perceptual quest. Journal
of Phonetics 1. 309–327. https://doi.org/10.1016/S0095-4470(19)31400-7.
’t Hart, Johan & René Collier. 1975. Integrating different levels of intonation analysis.
Journal of Phonetics 3(4). 235–255.
’t Hart, Johan, René Collier & Antonie Cohen. 1990. A perceptual study of intonation: An
experimental-phonetic approach to speech melody (Cambridge Studies in Speech Science
and Communication). Cambridge: Cambridge University Press.
Tench, Paul. 1996. The intonation systems of English. London: Cassell Academic.
Thomas, Erik R. & Tyler Kendall. 2007. NORM: The vowel normalization and plotting suite.
Online resource. http://ncslaap.lib.ncsu.edu/tools/norm/.
Trager, George L. & Henry Lee Smith. 1951. An outline of English structure. In Studies
in linguistics: Occasional papers, vol. 3. Norman, OK: Battenberg Press.
Trubetzkoy, Nikolai Sergeevich. 1969. Principles of phonology. [1958, 1939]. Berkeley:
University of California Press.
Ultan, Russell. 1978. Some general characteristics of interrogative systems. In Universals
of human language, vol. 4: Syntax, 211–248. Stanford: Stanford University Press.
Van den Boogert, Nico. 1997. The Berber literary tradition of the Sous (Publications of
the “De Goeie Fonds” 27). Leiden: Nederlands Instituut voor het Nabije Oosten.
van der Hulst, Harry. 2014a. The study of word accent and stress: Past, present and
future. In Harry van der Hulst (ed.), Word stress: Theoretical and typological issues, 3–
55. Cambridge: Cambridge University Press.
van der Hulst, Harry (ed.). 2014b. Word stress: Theoretical and typological issues. Cam-
bridge: Cambridge University Press.
van der Wal, Jenneke. 2016. Diagnosing focus. Studies in Language 40(2). 259–301.
https://doi.org/10.1075/sl.40.2.01van.
Van Heuven, Vincent J. & Vera Faust. 2009. Are Indonesians sensitive to contrastive
accentuation below the word level? Wacana 11(2). 226–240.
Vanrell, Maria del Mar & Olga Fernández Soriano. 2013. Variation at the interfaces
in Ibero-Romance. Catalan and Spanish prosody and word order. Catalan Journal of
Linguistics 12. 253–282. https://doi.org/10.5565/rev/catjl.63.
Varga, László. 2002. Intonation and stress: Evidence from Hungarian. Houndmills/New
York: Palgrave Macmillan.
Vella, Alexandra. 2007. The phonetics and phonology of wh-question intonation in
Maltese. In Proceedings of the International Congress of Phonetic Sciences XVI, 1285–
1289. Saarbrücken.
Versteegh, Kees. 2014. The Arabic language. Second edition. Edinburgh: Edinburgh Uni-
versity Press.
Vicenik, Chad & Sun-Ah Jun. 2014. An Autosegmental Metrical analysis of Georgian
intonation. In Sun-Ah Jun (ed.), Prosodic Typology II: The phonology of intonation and
phrasing, 154–186. Oxford: Oxford University Press.
Vigário, Marina & Sónia Frota. 2003. The intonation of Standard and Northern
European Portuguese: A comparative intonational phonology approach. Journal of
Portuguese Linguistics 2(2). 115–137. https://doi.org/10.5334/jpl.31.
Vilkuna, Maria. 1995. Discourse configurationality in Finnish. In Katalin É. Kiss (ed.),
Discourse configurational languages, 244–268. Oxford: Oxford University Press.
Vycichl, Werner. 2005. A sketch of Siwi Berber (Egypt). Köln: Rüdiger Köppe Verlag.
Warren, Paul. 2016. Uptalk: The phenomenon of rising intonation. Cambridge: Cambridge
University Press.
Watson, Janet C. E. 2011. Word stress in Arabic. In Marc van Oostendorp, Colin J. Ewen,
Elizabeth V. Hume & Keren Rice (eds.), The Blackwell Companion to Phonology, 2990–
3019. Oxford: Wiley-Blackwell.
Welby, Pauline. 2006. French intonational structure: Evidence from tonal alignment.
Journal of Phonetics 34(3). 343–371. https://doi.org/10.1016/j.wocn.2005.
09.001.
Willis, Erik. 2007. Utterance signaling and tonal levels in Dominican Spanish declarat-
ives and interrogatives. Journal of Portuguese Linguistics 5/6. 179–202. https://doi.
org/10.5334/jpl.149.
Xu, Yi. 2004a. The PENTA model of speech melody: Transmitting multiple communic-
ative functions in parallel. Proceedings of From sound to sense 50. 91–96.
Xu, Yi. 2004b. Transmitting tone and intonation simultaneously — the parallel encod-
ing and target approximation (PENTA) model: With emphasis on tone languages. In
International symposium on Tonal Aspects of Languages.
Yeou, Mohamed. 2005. Variability of F0 peak alignment in Moroccan Arabic accentual
focus. In Proceedings of Interspeech 2005, 1433–1436. Lisbon.
Yeou, Mohamed, Mohamed Embarki, Sallal Al Maqtari & Christelle Dodane. 2007. F0
alignment patterns in Arabic dialects. In Proceedings of International Congress of Phon-
etic Sciences XVI, 1493–1496. Saarbrücken.
Yeou, Mohamed, Mohamed Embarki & Sallal Al-Maqtari. 2007. Contrastive focus and
F0 patterns in three Arabic dialects. Nouveaux cahiers de linguistique française 28. 317–
326.
Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press.
Zellou, Georgia. 2010. Moroccan Arabic consonant harmony: A multiple causation hy-
pothesis. Toronto Working Papers in Linguistics 33(1).
Zimmermann, Malte & Edgar Onea. 2011. Focus marking and focus interpretation. Lin-
gua 121. 1651–1670. https://doi.org/10.1016/j.lingua.2011.06.002.
Zuraw, Kie, Kristine M. Yu & Robyn Orfitelli. 2014. The word-level prosody of Samoan.
Phonology 31(2). 271–327. https://doi.org/10.1017/S095267571400013X.
Żygis, Marzena, Susanne Fuchs & Katarzyna Stoltmann. 2017. Orofacial expressions in
German questions and statements in voiced and whispered speech. Journal of Mul-
timodal Communication Studies 4(1–2). 87–92.
Curriculum Vitae
Name Anna Maria Bruggeman
Date of Birth 29.09.1990