
Phonetics


Ministry of Higher Education

& Scientific Research


University of Tikrit
College of Education for Humanities
Department of English
M.A. Studies/ Linguistics: Phonetics

Phonetics:
Phonetics is the scientific study of speech sounds: it describes and categorizes
human speech sounds, seeks to understand how they are created, and compares and
contrasts sounds across languages. It deals with the sounds of speech and their
production, combination, description, and representation by written symbols. Why
Everyone Should Learn Phonetics: In today's era, where communication plays an
important part in every area, such as teaching, education, and much more, here are
some reasons why you should learn phonetics:
-It makes you a more skilled speaker.
-Phonetics helps you analyze words and pronounce them correctly.
-It reduces mumbling and fumbling and helps strengthen your confidence.
-It improves your fluency and accent: phonetics inculcates the skill of analyzing a
word and recognizing it by its sounds, which with practice improves fluency and accent.
-It helps in developing reading skills.
Phonetics and phonology are the branches of linguistics concerned with sounds; thus the
main object of investigation in this course is the sound. The English alphabet comprises
26 letters, while the sound system of English contains 44 sounds, or phonemes. The term
sound is often regarded as imprecise in the fields of phonetics and phonology and is thus
replaced by the term phone. Sound could mean any noise, while phone is restricted to the
human voice (‘phone’ comes from the Greek word phōnē, ‘human voice’) and is regarded
as a speech sound which can be cut out from the speech stream. Crystal defines a phone as
“the smallest perceptible discrete segment of sound in a stream of speech” (2008: 361).
A phoneme includes all the phonetic specifications of phones and is the smallest
independent unit that can bring about a change in meaning. Roach (2009) calls phonemes
“abstract sounds”, as there may be slightly different ways to realize the same phoneme. An
example is the sound /t/ in the words team and steam: the /t/ in team is aspirated [tʰ],
while the /t/ in steam is unaspirated [t]. Phones that belong to the same phoneme, such as
[t] and [tʰ] for English /t/, are called allophones. Substituting one allophone for another
does not affect the meaning of a word, while substituting a phoneme can bring about a
semantic change. For example, team pronounced with any allophone of the phoneme /t/
maintains its meaning, but if /t/ is substituted with the phoneme /b/, the meaning changes.
These two words (team /tiːm/ and beam /biːm/) then form a minimal pair: an opposition of
two words demonstrating the existence of these two phonemes.
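
The logic of a minimal pair is mechanical enough to be checked automatically. The
following sketch (a Python toy with a hypothetical four-word lexicon of broad phonemic
transcriptions, not part of the original text) finds all pairs of words whose
transcriptions differ in exactly one phoneme:

from itertools import combinations

# Hypothetical toy lexicon: word -> broad phonemic transcription.
lexicon = {
    "team": ["t", "iː", "m"],
    "beam": ["b", "iː", "m"],
    "bead": ["b", "iː", "d"],
    "bad":  ["b", "æ", "d"],
}

def is_minimal_pair(p, q):
    # A minimal pair has the same number of phonemes, exactly one differing.
    return len(p) == len(q) and sum(a != b for a, b in zip(p, q)) == 1

for (w1, p1), (w2, p2) in combinations(lexicon.items(), 2):
    if is_minimal_pair(p1, p2):
        print(w1, "/", w2)   # team / beam, beam / bead, bead / bad

Each printed pair demonstrates a phonemic opposition: team/beam shows /t/ versus /b/,
and bead/bad shows /iː/ versus /æ/.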
For a set of words to form a minimal pair, they must differ in one phoneme only. Phonemes
cannot, in fact, be pronounced: in actual speech, they are realized through allophones. The
two branches investigate sounds from different perspectives (Balcytyte, 2014: 13):

Phonetics is concerned with the physical manifestation of language in sound waves and
how they are produced, transmitted, and perceived, and also “provides methods for their
description, classification, and transcription” (Crystal 2008: 363).
Phonology “studies the sound systems of languages” (ibid: 365) and how sounds function
in relation to each other in a language. Although phonetics and phonology are closely
intertwined, the scope of these pages deals essentially with phonetics and only touches
upon a few concepts in phonology for practical purposes.

The Branches of Phonetics:


Phonetics can be viewed as investigating three distinct areas, represented in the
following branches of phonetics (Balcytyte, 2014: 14):

Articulatory phonetics, which studies the ways the vocal organs are used to
produce speech sounds; in other words, the branch of phonetics which studies the
organs of speech and their use in producing speech sounds;

Acoustic phonetics, which investigates the physical properties of speech sounds
(duration, frequency, intensity, and quality), generally measured with spectrographs
and depicted as waveforms and spectrograms; in other words, the branch of phonetics
which deals with the physical characteristics of the sound waves that carry speech
sounds between mouth and ear. Acoustic phonetics makes heavy use of a battery
of electronic instruments, perhaps most notably the sound spectrograph; these
days it also makes considerable use of computers for analysis and modelling (a
small computational sketch follows this list);

Auditory phonetics, which is concerned with how people perceive speech sounds,
i.e. how the sound waves activate the listener’s eardrum and how the message is
carried to the brain in the form of nerve impulses; in other words, the branch of
phonetics dealing with the way in which the human ear and brain process and
interpret speech sounds.
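
As a concrete illustration of the measurements acoustic phonetics deals with, the
following sketch computes a duration, an RMS intensity, and a rough fundamental
frequency. It is a minimal example in Python with NumPy, using a synthetic 120 Hz
pulse-like signal so that it stays self-contained; the sample rate and signal are
illustrative assumptions, not data from the text:

import numpy as np

sr = 16000                                       # sample rate in Hz (illustrative)
t = np.arange(0, 0.5, 1.0 / sr)                  # 0.5 s time axis
x = 0.6 * np.sign(np.sin(2 * np.pi * 120 * t))   # crude 120 Hz source-like signal

duration = len(x) / sr                                   # duration in seconds
level = 20 * np.log10(np.sqrt(np.mean(x ** 2)))          # RMS intensity in dB re 1.0

# Rough f0 estimate: the autocorrelation lag (above 2 ms) with the highest peak.
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
min_lag = int(0.002 * sr)
f0 = sr / (min_lag + np.argmax(ac[min_lag:]))

print(f"duration {duration:.2f} s, level {level:.1f} dB, f0 ~ {f0:.0f} Hz")

A spectrograph performs a far richer version of the same task, displaying how the
energy at each frequency changes over time.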

Historically, phonetics is classified into two types: taxonomic and scientific.
Taxonomic phonetics provides two basic tools for dealing with speech sounds:
first, uniformity in naming and classifying speech sounds, and, second, transcribing
them. This type has led to the rise of the International Phonetic Alphabet (IPA),
a system used for transcribing the individual speech sounds of any language
and based primarily on the Latin alphabet. Scientific phonetics, on the other hand,
seeks to understand how speech works at all levels, from the brain of the speaker to
the brain of the hearer.

Phonology:
In order to understand the differences and similarities between phonetics and
phonology, it is helpful to mention some definitions of them. In his book, Roach
(1992: 81) defines phonetics as the scientific study of speech and states that its
central concerns are the discovery of how speech sounds are produced,
how they are used in spoken language, how we can record speech sounds with
written symbols, and how we hear and recognize different sounds. Phonology, on
the other hand, he describes (ibid: 82) as the study of the sound system of
languages; the most basic activity in phonology is phonemic analysis, whose
objective is to establish the phonemic inventory of a language.

Another definition is given by Yule (2006: 30), who introduces phonetics as the general
study of the characteristics of speech sounds. Phonology, he states (2006: 43-44), is
essentially the description of the systems and patterns of speech sounds in a
language. It is, in effect, based on a theory of what every speaker of a language
unconsciously knows about the sound patterns of that language. Because of this
theoretical status, he continues (ibid), phonology is concerned with the abstract or
mental aspect of the sounds in language rather than with the actual physical
articulation of speech sounds.

The relationship between Phonetics and Phonology:

Kelly (2000: 9) declares that the study of pronunciation consists of two fields, namely
phonetics and phonology. Phonetics refers to the study of sounds; a phonetician
usually works in one or more of the following areas: physiological phonetics,
articulatory phonetics, acoustic phonetics, auditory phonetics, and perceptual
phonetics. He states (ibid) that if phonetics deals with the physical reality of speech
sounds, then phonology, on the other hand, is concerned with how we interpret and
systematize sounds. Phonology deals with the system and patterns of the sounds
which exist within particular languages. Historically, one of the intractable problems
has been to define the proper relation between phonology and phonetics. Concerning
this issue, there are three basic perspectives. First, phonetics and phonology are
unified: they are parallel, with a direct relation between them (Flemming,
2001). The second view claims that there is no interface between phonetics and
phonology, the two being fully autonomous (Hjelmslev, 1953; Foley, 1977).
The last view lies somewhere between the two: there is a separation between
phonetics and phonology; however, they are strongly connected with each other
(Pierrehumbert, 1990).

Phonetics and Phonology as Unified Model


According to this view, phonetics and phonology are integrated into a single unit
which is essentially phonetic. The basic idea of this position is that the
properties of phonetics and phonology should be interpreted together in a unified
(integrated) single module rather than in two separate modules (Pierrehumbert and
Steele, 1987, 1990). Similarly, Odden (2005: 2) claims that phonetics is closely
related to phonology and believes that phonology can be properly understood only
when studied together with other subjects. He maintains that "a better
understanding of many issues in phonology requires that you bring phonetics into
consideration, just as a phonological analysis is a prerequisite for any phonetic study
of language". Flemming (2001) also considers phonetics and phonology to be
integrated and postulates that there are many similarities between the two.
Additionally, he states that the division between phonetics and phonology should be
ignored, despite the belief that they differ from each other in nature.

Phonetics and Phonology as Different Fields


Around the beginning of the 20th century, phonetics and phonology came to be
regarded as two separate fields. This is because phonetics is bio-
physical (concrete) in nature: it is concerned with the
discovery of how speech sounds are produced, how they are used in spoken
language, how we can record speech sounds with written symbols, and how we hear
and recognize different sounds (Chomsky and Halle, 1968: 450; Halle, 1970).
Phonology, on the other hand, is cognitive (abstract): it is about establishing
what the phonemes of a given language are, i.e. those sounds that can bring about a
difference in meaning between two words. A phoneme is a phonic segment with a
meaning-distinguishing value, as shown by minimal pairs (pat-bat, hat-had). Thus
each category stands by itself.

Different but conditionally interfaced
In contrast to the previous views, it is assumed here that phonetics and phonology are
distinct from each other but that there is also a significant interaction between them.
Within this position, there is a constrained mapping between phonology and phonetics,
which implies that phonological elements are universally related to phonetic ones to
some extent. In general, there are two arguments supporting this view. The first one
can be found in The Sound Pattern of English (Chomsky and Halle, 1968), whereby
phonological and phonetic representations are related by rules; the general
properties of phonological representations represent the best compromise between
concrete phonetic transcription and abstract representation. The second argument is
a semantic one: the mapping principles have the same general character as the
principles relating ordinary nouns or adjectives to their meanings in the real world.
Take the word dog: the concept DOG refers to the whole class of dogs, and the
pronunciation /dog/ is associated with this concept. The claim is that the relationship
between DOG as a concept and its pronunciation is arbitrary (Boersma, 1998: 467).

Speech Production:
The Anatomy of Speech Production:
Speech is the vocal aspect of communication made by human beings. Human
beings are able to express their ideas, feelings and emotions to the listener through
speech generated using vocal articulators (NIDCD, June 2010). Development of
speech is a constant process and requires a lot of practice. Communication is a string
of events which allows the speaker to express their feelings and emotions and the
listener to understand them. Speech communication can be seen as thought
transformed into language for expression. The mechanism of speech production
is very complex, and before conducting the analysis of any language it is important
to understand the processes of production and perception of speech (D.
B. Fry, 1979). Ladefoged (2015: 2) mentions that sound is the basic requirement for
speech production: most sounds are the result of movements of the tongue and the
lips. These movements are gestures forming particular sounds. Making speech
gestures audible involves pushing air out of the lungs while producing a noise in the
throat or mouth; these basic noises are changed by the actions of the tongue and
lips. The actions of the tongue are among the fastest and most precise physical
movements that people can make. The capability of human beings for articulate
sound distinguishes them from other species. The parts of the human body
which are directly involved in the production of speech are usually termed the
organs of speech. There are three main groups of speech organs: respiratory organs,
phonatory organs, and articulatory organs.
1. Respiratory Organs:

The most important function of the lungs relevant to speech production is respiration: the
lungs are responsible for the movement of air. They are controlled by a set of muscles which
make them expand and contract alternately, so that air from outside is drawn in and pushed out
in turn. When the air is pushed out of the lungs, it passes through the windpipe or trachea,
which has at its top the larynx. The glottis is the passage between two horizontal folds of
elastic muscular tissue called the vocal folds (vocal cords), which, like a pair of lips, can close
or open the glottis (i.e., the passage to the lungs through the windpipe). The main physiological
function of the vocal folds is to close off the lungs for their protection during eating or drinking,
so that no solid particles of food or liquid enter the lungs through the windpipe.

By virtue of their tissue structure, the vocal cords are capable of vibrating at different
frequencies when air passes through them, and this vibration is called voice. After passing
through the glottis and reaching the pharynx, the outgoing air stream can escape either through
the nasal cavity, with its exit at the nostrils, or through the oral cavity, with its exit at the
mouth. When the air escapes through the mouth, the nasal cavity can be closed by bringing the
back part of the soft palate (the velum) into close contact with the pharyngeal wall. Such a
closure of the nasal passage is called velic closure; when there is velic closure, the air can
escape only from the mouth. The nasal cavity can also be kept open while air passes through
the mouth, allowing part of the air to pass through the nose as well. The oral passage can also
be closed at times, so that the outgoing air is temporarily shut up in the pharyngeal cavity; in
such cases, the air escapes through the nostrils, creating the nasal consonants (W.
Tecumseh Fitch, 2010).

2. Phonation Process:

During phonation, each cycle of vocal fold vibration is caused both by the subglottal air
pressure that builds up to separate the folds and by the Bernoulli effect, which states that as
air rushes through the glottis at great velocity it creates a region of low pressure along the
inner sides of each fold, bringing them together again (Ray D. Kent & Charles Read, 2002). The
whole process is made possible by the fact that the folds themselves are elastic. Their elasticity
not only permits them to be blown open for each cycle; the elastic recoil force (the force that
restores any elastic body to its resting place) also works along with the Bernoulli effect to close
the folds for each cycle of vibration. The vocal folds move in a fairly periodic way. During
sustained vowels, for example, the folds open and close in a pattern of movement that repeats
itself. This action produces a train of airbursts that sets up an audible pressure wave (sound) at
the glottis; the pressure wave of sound is likewise periodic, its pattern repeating itself. Richard
L. Klevans and Robert O. Rodman (1997) note that a person’s ability to control the vocal tract
muscles during utterance is learned in childhood. These habits affect the range of sounds that
may be effectively produced by an individual: the range of sounds is the subset of the set of
possible sounds that an individual could create with his or her personal vocal tract, and it is
not easy for an individual to change these physical characteristics voluntarily. Like all sound
sources that vibrate in a complex periodic fashion, the vocal folds generate a harmonic series,
consisting of a fundamental frequency and many whole-number multiples of that fundamental
frequency (harmonics). The fundamental frequency is the number of glottal openings and
closings per second.
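
For example, if the vocal folds open and close 120 times per second (an illustrative
value, roughly typical of an adult male speaking voice), the fundamental frequency is
120 Hz and the harmonics lie at whole-number multiples of it: 240 Hz, 360 Hz, 480 Hz,
and so on.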

Modes of phonation:
Voiced sound: in this type of speech, the vocal cords vibrate and thus produce sound waves.
These vibrations occur along most of the length of the glottis, and their frequency is determined
by the tension in the vocal folds (Kenneth N. Stevens, 1998). All vowels and diphthongs, together
with some consonants such as /b, d, g, m, n, v, l, j, r/, are voiced sounds.

Unvoiced sound: an unvoiced sound is characterized by the absence of phonation. In such
cases the vocal folds remain separated and the glottis is held open at all times. The opening lets
the airflow pass through without creating any vibrations, but still accelerates the air by being
narrower than the trachea (Michael Dobrovolsky & Francis Katamba).
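
The voiced/unvoiced distinction described above is also the basis of a classic
signal-processing heuristic: voiced frames are periodic and energetic, with few zero
crossings, while unvoiced frames are noise-like, with many. The sketch below is a toy
classifier in Python with NumPy; the thresholds and test signals are illustrative
assumptions, not values from the text:

import numpy as np

def classify_frame(frame, energy_thresh=0.01, zcr_thresh=0.25):
    energy = np.mean(frame ** 2)                          # mean squared amplitude
    zcr = np.mean(np.abs(np.diff(np.sign(frame))) > 0)    # zero-crossing rate
    if energy < energy_thresh:
        return "silence"
    return "voiced" if zcr < zcr_thresh else "unvoiced"

sr = 16000
t = np.arange(0, 0.02, 1.0 / sr)               # one 20 ms frame
voiced = 0.5 * np.sin(2 * np.pi * 120 * t)     # periodic: low zero-crossing rate
unvoiced = 0.2 * np.random.randn(len(t))       # noise-like: high zero-crossing rate
print(classify_frame(voiced), classify_frame(unvoiced))   # voiced unvoiced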

3. Articulatory Process:

Articulation is the process resulting in the production of speech sounds. It consists of a series of
movements by a set of organs of speech called the articulators. The articulators that move during
the process of articulation are called active articulators; organs of speech which remain relatively
motionless are called passive articulators. The point toward which an articulator moves, or with
which it comes into contact, is the place of articulation. The type or nature of movement made by
the articulator is the manner of articulation.

Most of the articulators are attached to the movable lower jaw and as such lie on the lower side,
or floor, of the mouth. The points of articulation, or most of them, are attached to the immovable
upper jaw and so lie on the roof of the mouth. Nearly every articulatory description of a speech
sound therefore has to take into consideration the articulator, the point of articulation, and the
manner of articulation (Laurel J. Brinton, 2000). In describing the articulators and the points of
articulation, it is convenient to take the points of articulation first. The main points of
articulation are the upper lip, the upper teeth, the alveolar ridge, the hard palate, and the soft
palate, which is also called the velum (Henry Rogers, 2000). The upper lip and the upper teeth are
easily identifiable parts of the mouth. The alveolar ridge is the rugged, uneven, elevated part
just behind the upper teeth. The hard palate is the hard bony structure with a membranous covering
which immediately follows the alveolar ridge. The hard palate is immediately followed by the soft
palate, or velum: a soft muscular sheet attached to the hard palate at one end and ending in a
pendulum-like soft muscular projection, called the uvula, at the other (Philipp Strazny, 2005).
Besides the above points of articulation, the pharyngeal wall may also be considered a point of
articulation. The two most important articulators are the lower lip and the tongue. The tongue,
owing to its mobility, is the most versatile of the articulators; its surface is relatively large,
and its different points are capable of moving toward different places or points of articulation.
It may conveniently be divided into different parts, viz. the front, the center, the blade, the
back, and the root. When the tongue is at rest behind the lower teeth, the part of the tongue
that lies below the hard palate, toward the incisor teeth, is called the front of the tongue
(Philipp Strazny, 2005). The part which faces the soft palate is called the back, and the region
where the front and back meet is known as the center. The whole upper surface of the tongue,
i.e. the part lying below the hard and soft palate, is called by some scholars the dorsum.

Finally, the articulation process is the most obvious one: it takes place in the mouth, and it is
the process through which we differentiate most speech sounds. In the mouth we can distinguish
between the oral cavity, which acts as a resonator, and the articulators, which can be active or
passive: the upper and lower lips, the upper and lower teeth, the tongue (tip, blade, front, back),
and the roof of the mouth (alveolar ridge, palate, and velum). Speech sounds are thus distinguished
from one another in terms of the place where, and the manner in which, they are articulated.

Summary: Organs and processes

Most speech is produced by an air stream that originates in the lungs and is
pushed upwards through the trachea (the windpipe) and the oral and nasal cavities.
During its passage, the air stream is modified by the various organs of speech. Each
such modification has different acoustic effects, which are used for the
differentiation of sounds. The production of a speech sound may be divided into four
separate but interrelated processes: the initiation of the air stream, normally in the
lungs; its phonation in the larynx through the operation of the vocal folds; its
direction by the velum into either the oral cavity or the nasal cavity (the oro-
nasal process); and finally its articulation, mainly by the tongue, in the oral cavity.
We shall deal with each of the four processes in turn.
Theories of Speech Production:
1- Source/Filter Theory:

Human beings are able to control, more or less independently, phonation with the
larynx (the source) and articulation with the vocal tract (the filter). Thus, we can
assume that speech sounds are the response of a vocal-tract system: a sound
source is fed into, and filtered by, the resonance characteristics of the vocal tract. This
kind of modeling by a linear system is called the source-filter theory of speech
production.

The source-filter theory describes speech production as a two-stage process
involving the generation of a sound source, with its own spectral shape and spectral
fine structure, which is then shaped or filtered by the resonant properties of the vocal
tract.

Most of the filtering of a source spectrum is carried out by the part of the vocal tract
anterior to the sound source. In the case of a glottal source, the filter is the entire
supraglottal vocal tract. The vocal tract filter always includes some part of the oral
cavity and can also, optionally, include the nasal cavity (depending upon whether
the velum is open or closed). Sound sources can be either periodic or aperiodic.
Glottal sound sources can be periodic (voiced), aperiodic (whisper and /h/), or mixed
(e.g. breathy voice). Supraglottal sound sources that are used contrastively in speech
are aperiodic (i.e. random noise), although some trill sounds can resemble periodic
sources to some extent. A voiced glottal source has its own spectrum, which includes
spectral fine structure (harmonics and some noise) and a characteristic spectral slope
(sloping downwards at approximately -12 dB/octave). An aperiodic source (glottal or
supraglottal) likewise has its own spectrum, with spectral fine structure (random
spectral components) and a characteristic spectral slope. Periodic and aperiodic
sources can be generated simultaneously to produce the mixed voiced-plus-aperiodic
speech typical of sounds such as voiced fricatives. In voiced speech, the fundamental
frequency (perceived as vocal pitch) is a characteristic of the glottal source acoustics,
whilst features such as vowel formants are characteristics of the vocal tract filter
(resonances). For vowels, the sound source is a glottal sound produced by vocal fold
vibration; the glottal sound governs pitch and voice quality. When the vocal-tract
configuration changes, the resonance characteristics change as well, and with them
the vowel quality of the output sound.
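
As a concrete illustration of the two-stage model, the sketch below synthesizes a crude
vowel: a 100 Hz impulse train stands in for the glottal source, and a cascade of three
two-pole resonators stands in for the vocal tract filter. This is a minimal Python/NumPy
toy, not the text's own example; the formant frequencies and bandwidths (roughly
schwa-like) are illustrative assumptions:

import numpy as np
from scipy.io import wavfile

sr = 16000
f0 = 100                                   # source: fundamental frequency in Hz
source = np.zeros(sr // 2)                 # 0.5 s of signal
source[::sr // f0] = 1.0                   # impulse train = idealized glottal pulses

def resonator(x, freq, bw, sr):
    # Standard two-pole digital resonator: y[t] = x[t] + b1*y[t-1] + b2*y[t-2]
    r = np.exp(-np.pi * bw / sr)
    b1 = 2 * r * np.cos(2 * np.pi * freq / sr)
    b2 = -r * r
    y = np.zeros_like(x)
    for t in range(len(x)):
        y[t] = x[t] + b1 * (y[t - 1] if t >= 1 else 0.0) \
                    + b2 * (y[t - 2] if t >= 2 else 0.0)
    return y

# Filter: a cascade of three formant resonators (illustrative schwa-like values).
out = source
for freq, bw in [(500, 80), (1500, 100), (2500, 120)]:
    out = resonator(out, freq, bw, sr)

out /= np.max(np.abs(out))                 # normalize
wavfile.write("vowel.wav", sr, (out * 32767).astype(np.int16))

Changing f0 changes the pitch while leaving the vowel quality intact; changing the
three resonator frequencies changes the vowel while leaving the pitch intact, which is
exactly the separation the source-filter theory describes.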

Note: When we talk about speech sounds, whether vowels or consonants, there are
four sound sources: the glottal (or phonation) source, the aspiration source, the
frication source, and the transient source. When we produce speech sounds, one of
these sources, or a combination of them, becomes the input to the vocal-tract filter,
and a vowel or a consonant can be viewed as the response of that filter. Thus, the
source-filter theory can be expanded and applied not only to vowels, but to any
speech sound, including consonants.

Although the source-filter theory is a good approximation of speech sounds, we have
to remember that it is only an approximation: the actual process of speech
production is non-linear and time-variant, and there is in fact interaction between
a source and the vocal-tract filter. When such issues must be discussed in a strict
sense, the approximation should be treated with caution. Nevertheless, the theory
usually gives reasonable approximations, and many applications in speech technology
are therefore based on it.

From the perspective of the source/filter theory, Ladefoged (2015: 197) notes that speech
sounds differ in pitch, in loudness, and in quality. When discussing differences in
quality, we noted that the quality of a vowel depends on its overtone structure. We
can say that a vowel sound contains a number of different pitches simultaneously:
there is the pitch at which it is actually spoken, and there are the various overtone
pitches that give it its distinctive quality. We distinguish one vowel from another by
the differences in these overtones. The overtones are called formants; a formant
is a resonating frequency of the air in the vocal tract. Vowels are characterized by
three formants.
Figure: Spectrogram of the utterance “First campaign I worked in was for John
Kennedy in nineteen-sixty.”

The lowest three formants distinguish vowels from one another. The lowest, formant
one, symbolized as F1, can be heard by tapping on your throat: open your mouth, make
a glottal stop, and flick a finger against your neck just to the side of and below the
jaw, and you will hear a note, just as you would if you tapped on a bottle. If you
tilt your head slightly backward so that the skin of the neck is stretched while you
tap, you may be able to hear this note somewhat better.
If you check a complete set of vowel positions with this technique, you should hear
the pitch of the first formant going up for the first four vowels and down for the
second four. The second formant, F2, goes down in pitch across the series of
vowels, as can be heard more easily when these vowels are whispered. The third
formant, F3, adds to quality distinctions, but there is no easy way of making it more
evident.
How do these formants arise? The answer is that they are echoes in the vocal tract.
The air in the vocal tract acts like the air in an organ pipe, or in a bottle. Sound travels
from a noise-making source (in voiced sounds, this is the vocal fold vibration) to the
lips. Then, at the lips, most of the sound energy radiates away from the lips for a
listener to hear, while some of the sound energy reflects back into the vocal tract—
it echoes. The addition of the reflected sound energy with the source energy tends to
amplify energy at some frequencies and damp energy at others, depending on the
length and shape of the vocal tract. The vocal folds are then a source of sound energy,
and the vocal tract (due to the interaction of the reflected sound waves in it) is a
frequency filter altering the timbre of the vocal fold sound. In phonetics, the timbre
of a vowel is called the vowel quality. This same source/filter mechanism is at work
in many musical instruments. In the brass instruments, for example, the noise source
is the vibrating lips in the mouthpiece of the instrument, and the filter is provided by
the long brass tube. You can verify for yourself that the instrument changes the sound
produced by the lips by listening to the lip vibration with the mouthpiece alone
(make a circle with your index finger and thumb for a simulated trombone
mouthpiece). Similarly, in a marimba, the sound source is produced by striking one
of the keys of the instrument, and the filter is provided by the tubes that are mounted
underneath each key. One reason the marimba is so much bulkier than a trombone
is that it has a separate source/filter system for each note in the scale, in the
trombone, there is only one source (lips) and one filter (the tube of the instrument),
and both are variable. The human voice is more like the trombone—our vocal fold
sound source can be made to vibrate at different pitches and amplitudes, and our
vocal tract filter can be made to enhance or damp different frequencies, giving us
the many different timbres that we hear as different vowels. We said above that the
filtering action of the vocal tract tends to amplify energy at some frequencies and
damp energy at others, depending on the length and shape of the vocal tract. The
length factor is pretty easy to describe when the shape of the vocal tract is
simple.

The length of the resonating portion of the vocal tract differs substantially for
different speech sounds. In vowels, the whole vocal tract, from glottis to lips, serves
as the acoustic filter for the noise generated by the vibrating vocal folds. In fricatives,
the resonating portion of the vocal tract is shorter: in [ s ], for example, the portion
of the vocal tract that serves as the acoustic filter runs from the alveolar ridge to the
lips. Thus, the lowest formant in [ s ] (with a resonating cavity of only 2 or 3 cm)
will have a much higher frequency than the F1 found in vowels. This explains why
the fricative noises were so noticeable in the high-pass filtered version of the
utterance in the figure. The only fricative that does not have higher resonant
frequencies than those found in vowels is the glottal fricative [ h ], for which the whole
vocal tract, from glottis to lips, is involved. In addition to the length of the vocal
tract, the frequencies of the resonant overtones, the formants, are determined by the
shape of the vocal tract. In nasal consonants, we have numerous side cavities
branching off the main passageway from glottis to nose: the sinus cavities, as well
as the mouth cavity. Similarly, in lateral sounds, the shape of the vocal tract is
complex. The acoustics of vowels can be described in two ways: with tube models
and with perturbation theory.
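
Before turning to those models, a back-of-envelope calculation shows why the [ s ]
resonance is so high. Treating the short cavity in front of the alveolar constriction as
a tube closed at the constriction and open at the lips, the quarter-wavelength formula
used in the tube models below gives, for an assumed length of 2.5 cm and a speed of
sound of roughly 35,000 cm/s:

F = c / 4L = 35,000 / (4 × 2.5) ≈ 3,500 Hz

That is several times higher than a typical vowel F1 of a few hundred hertz. (The
2.5 cm length and the quarter-wave idealization are illustrative simplifications, not
measurements from the text.)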

2- Tube Models Theory:

Ladefoged (2015:200) states that:

1. The formants that characterize different vowels are the result of the different
shapes of the vocal tract.

2. Any body of air, such as that in the vocal tract or that in a bottle, will vibrate
in a way that depends on its size and shape. If you blow across the top of an
empty bottle, you can produce a low-pitched note. If you partially fill the
bottle with water so that the volume of air is smaller, you will be able to
produce a note with a higher pitch. Smaller bodies of air are similar to smaller
piano strings or smaller organ pipes in that they produce higher pitches.

3. In the case of vowel sounds, the vocal tract has a complex shape so that the
different bodies of air produce a number of overtones.

4. The air in the vocal tract is set in vibration by the action of the vocal folds.
Every time the vocal folds open and close, there is a pulse of acoustic energy.
These pulses act like sharp taps on the air in the vocal tract, setting the
resonating cavities into vibration so that they produce a number of different
frequencies, just as if you were tapping on a number of different bottles at the
same time.

5. Irrespective of the rate of vibration of the vocal folds, the air in the vocal tract
will resonate at these frequencies as long as the position of the vocal organs
remains the same. Because of the complex shape of the tract, the air will
vibrate in more than one way at once. It’s as if the air in the back of the vocal
tract might vibrate one way, producing a low-frequency waveform, while the
air in front of the tongue, a smaller cavity, might vibrate in another way,
producing a higher frequency waveform. A third mode of vibration of the air
in the vocal tract might produce a sound of even higher frequency. What we
actually hear in vowels is the sum of these waveforms added together.

The relationship between resonant frequencies and vocal tract shape is


actually much more complicated than the air in the back part of the vocal tract
vibrating in one way and the air in other parts vibrating in another. Here we
will just concentrate on the fact that in most voiced sounds, three formants are
produced every time the vocal folds vibrate. Note that the resonance in the
vocal tract is independent of the rate of vibration of the vocal folds. The vocal
folds may vibrate faster or slower, giving the sound a higher or lower pitch,
but the formant frequencies will remain the same as long as there are no
changes in the shape of the vocal tract.
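
For the simplest case, a uniform tube closed at the glottis and open at the lips, the
resonances can actually be computed: such a tube resonates at odd multiples of c/4L.
The sketch below assumes a 17.5 cm tract and a speed of sound of 35,000 cm/s,
textbook-typical values rather than figures taken from this text:

c = 35_000.0   # speed of sound in warm, moist air, in cm/s (approximate)
L = 17.5       # assumed vocal tract length in cm (a typical adult value)

# Quarter-wavelength resonances of a tube closed at one end, open at the other:
for n in (1, 2, 3):
    f = (2 * n - 1) * c / (4 * L)
    print(f"F{n} = {f:.0f} Hz")            # prints 500, 1500, 2500 Hz

These are, not coincidentally, the schwa-like formant values often cited for a neutral
vocal tract.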

There is nothing particularly new about this way of analyzing vowel sounds.
The general theory of formants was stated by the great German scientist
Hermann Helmholtz about one hundred fifty years ago. Even earlier, in 1829,
the English physicist Robert Willis said, “A given vowel is merely the rapid
repetition of its peculiar note.” We would nowadays say that a vowel is the
rapid repetition (corresponding to the vibrations of the vocal folds) of its
peculiar two or three notes (corresponding to its formants). We can, in fact,
go even further and say that not only vowels but all voiced sounds are
distinguishable from one another by their formant frequencies.

3. Perturbation Theory:

A tube of uniform diameter has several simultaneous resonance frequencies
(several pitches at the same time), and these resonance frequencies change in
predictable ways when the tube is squeezed at various locations. This fact means
that we can model the acoustics of vowels in terms of perturbations of the uniform
tube. For example, when the lips are rounded, the diameter of the vocal tract is
smaller at the lips than at other locations in the vocal tract. With perturbation
theory, we know the acoustic effect of a constriction at the lips, so we can predict
the formant frequency differences between rounded and unrounded vowels. Here is
how perturbation theory works: for each formant, there are locations in the vocal
tract where a constriction will cause the formant frequency to rise, and locations
where a constriction will cause the frequency to fall.

This figure shows these locations for F1, F2, and F3. The vocal tract is pictured
three times, once for each formant, and is represented as a tube that has the same
diameter for its whole length and is closed at the glottis and open at the lips. This is
approximately the shape of the vocal tract during the vowel [ə]. The letters “P” and
“V” in the F1, F2, and F3 tubes indicate the locations of pressure maxima (P) and
velocity maxima (V) in the resonant waves that bounce back and forth between the lips
and glottis during a vowel. The fact that three resonant waves can be present in the
vocal tract at the same time is difficult to appreciate, but true. Perturbation theory
says that if there is a constriction at a velocity maximum (V) in a resonant wave,
the frequency of that resonance will decrease, and if there is a constriction at a
point of maximum pressure (P), the frequency of that resonance will increase.
Given these simple rules for how resonant frequency changes when the shape of the
resonator changes, consider how to change the F1 frequency in vowels. A constriction
near the glottis (as found in low vowels) is closer to a pressure maximum (P) than to
a velocity maximum (V), so the F1 frequency will be higher in low vowels than in
schwa. A constriction near the lips (as found in high vowels and round vowels) is
closer to a velocity maximum, so the F1 frequency will be lower in high vowels than
in schwa. The rules apply in the same way to change the frequencies of F2 and F3.
For example, there are two ways to raise the frequency of F2: one involves a rather
difficult constriction near the glottis, without tongue root constriction (which would
fall near the first V in the F2 resonance wave); the other involves constriction with
the tongue against the roof of the mouth, which is the most common maneuver used to
raise the F2 frequency.
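
The positions of those pressure and velocity maxima follow directly from the
standing-wave pattern of the idealized quarter-wave tube used above (closed at the
glottis, open at the lips); they are computed below rather than taken from the text's
figure. A constriction near a V lowers the formant; a constriction near a P raises it:

# x runs from 0 (glottis) to 1 (lips), as a fraction of the tube length.
for n in (1, 2, 3):
    k = 2 * n - 1                                              # mode number
    p_maxima = [round(2 * m / k, 2) for m in range(n)]         # pressure maxima
    v_maxima = [round((2 * m + 1) / k, 2) for m in range(n)]   # velocity maxima
    print(f"F{n}: P at {p_maxima}, V at {v_maxima}")

# Output:
# F1: P at [0.0], V at [1.0]
# F2: P at [0.0, 0.67], V at [0.33, 1.0]
# F3: P at [0.0, 0.4, 0.8], V at [0.2, 0.6, 1.0]

Reading off the F1 line reproduces the rules stated above: a constriction near the
glottis (x = 0, a pressure maximum for F1) raises F1, as in low vowels, and a
constriction near the lips (x = 1, a velocity maximum for F1) lowers it, as in high
and rounded vowels.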

Speech Perception:

Introduction
Speech perception refers to the ability to perceive linguistic structure in the acoustic
speech signal. During the course of acquiring a native language infants must
discover several levels of language structure in the speech signal, including
phonemes (speech sounds) which are the smallest units of speech. Although
phonemes have no meaning in themselves, they are the building blocks of higher-
level, meaningful linguistic units or structures, including morphemes, words,
phrases, and sentences. Each of the higher-level units are composed of units at the
next lower level using rules that are specific to each language (i.e., morphology,
grammar, or syntax). Thus, sentences are made up of phrases, phrases are composed
of words, and words are made up of morphemes. Each of the meaningful units are
composed of one or more phonemes. In a very real sense, the ability to perceive
differences between and categorize phonemes provides the underlying capacity for
the discovery of the higher levels of language structure in the speech signal. In this
way, infants’ speech perception abilities play a fundamental role in language
acquisition. Although infant speech perception has traditionally focused on
discrimination and categorization at the phoneme level, research over the past two
decades has shown that infants are also beginning to become sensitive to a variety
of higher-level linguistic structures in speech. With adults, speech perception
refers to the earliest levels of processing involved in mapping from the acoustics of
spoken language to meaning. Despite the ease with which adults perceive speech,
a number of complex perceptual and cognitive tasks are involved in
accomplishing this mapping. These issues include the extreme context dependence
of speech, the influence of experience on the perception of speech, and the effects of
higher-level and cross-modal linguistic information on speech perception. The goal of
speech perception is understanding a speaker's message; to achieve this, listeners
must recognize the words that comprise a spoken utterance. In speech perception
tasks, listeners focus attention on the sounds of speech and notice phonetic details
about pronunciation that often go unnoticed in normal speech communication. For
example, listeners will often not hear, or not seem to hear, a speech error or
deliberate mispronunciation in ordinary conversation, but will notice those same
errors when instructed to listen for mispronunciations.

References:

- Al-Hindawi, F. H. and Al-Juwaid, W. R. Phonetics and Phonology: Different
Dimensions. Scholar's Press.
- Coleman, John (2001). The Vocal Tract and Larynx. Available from
http://www.phon.ox.ac.uk/~jcoleman/phonation.htm
- Kelly, G. (2000). How to Teach Pronunciation. Harlow: Longman.
- Ladefoged, P. and Johnson, K. (2015). A Course in Phonetics. Cengage Learning.
- Monaghan, Alex (1998). Phonetics: Processes of Speech Production.
- Odden, D. (2005). Introducing Phonology. Cambridge: Cambridge University Press.
- Roach, P. (1992). Introducing Phonetics. London: Penguin English.
