The Pedagogy-Technology
2002, Vol. 00, No. 0, pp. 000–000 © Swets & Zeitlinger
ABSTRACT
In this paper, we examine the relationship between pedagogy and technology in Computer
Assisted Pronunciation Training (CAPT) courseware. First, we will analyse available literature
on second language pronunciation teaching and learning in order to derive some general
guidelines for effective training. Second, we will present an appraisal of various CAPT systems
with a view to establishing whether they meet pedagogical requirements. In this respect, we
will show that many commercial systems tend to prefer technological novelties to the detriment
of pedagogical criteria that could benefit the learner more. While examining the limitations of
today’s technology, we will consider possible ways to deal with these shortcomings. Finally, we
will combine the information thus gathered to suggest some recommendations for future CAPT.
1. INTRODUCTION
The advantages that Computer Assisted Language Learning (CALL) can offer
are nowadays well-known to educators struggling with traditional language
classroom constraints. Computer Assisted Pronunciation Training (CAPT), in
particular, can be beneficial to second language learning as it provides a
private, stress-free environment in which students can access virtually
unlimited input, practise at their own pace and, through the integration of
Automatic Speech Recognition (ASR), receive individualised, instantaneous
feedback. It is not surprising, then, that a wealth of CAPT systems have been
developed, many of which are available on the market for the language teacher
or the individual learner.
When examined carefully, however, the display of products may not look
entirely satisfactory. Many authors describe commercially available programs
Address correspondence to: A. Neri, A2RT, Department of Language and Speech, University of
Nijmegen, The Netherlands. E-mail: A.Neri@let.kun.nl or {C.Cucchiarini, H.Strik, L.Boves}
@let.kun.nl
2 A. NERI ET AL.
as fancy-looking systems that may at first impress student and teacher alike,
but eventually fail to meet sound pedagogical requirements (Murray &
Barnes, 1998; Pennington, 1999; Price, 1998; Warschauer & Healey, 1998;
Watts, 1997). These systems, which do not fully exploit the potential of
CAPT, look more like the result of a technology push than of a demand
pull. This may not necessarily be due to a lack of willingness, on the part of the
developers, to include pedagogical guidelines in the design. It may simply be
due to a failure to adopt a multidisciplinary approach involving speech
technologists, linguists and language teachers (Cole et al., 1998; Price, 1998),
or more fundamentally, to the absence of clear pedagogical guidelines that suit
these types of environments.
What are, then, the guidelines that should be considered when developing a
pedagogically sound CAPT system? We believe that research on second
language acquisition and teaching can already provide us with some
indications on which ingredients are needed for effective pronunciation
training. Although much work still needs to be done, especially with respect to
the issue of feedback, we feel that it is possible to suggest ways to blend these
ingredients in order to obtain the optimal outcome. However, incorporating
this knowledge within state-of-the-art technology may not be as straightforward
as educators hope. Current ASR technology, for instance, still suffers
from several limitations that pose constraints on the design of CAPT, as is
exemplified by the occasional provision of erroneous feedback.
In this paper, we will first analyse available literature on traditional
pronunciation training in order to identify the basic pedagogical criteria that a
system should ideally meet. Second, we will provide a critical evaluation of
those CAPT systems that more closely fulfill those demands, with a view to
establishing which pedagogical aims can be achieved with state-of-the-art
technology. In doing so, we will focus in particular on the issue of feedback.
Finally, we will combine the information thus gathered in an attempt to
provide some recommendations for the development of CAPT systems that
employ state-of-the-art technology in order to meet pedagogical requirements.
2.1. Input
According to interactionist theories, the basic ingredient for successful
language acquisition is input. Students must be able to access large quantities
of input, so that target models become available. Although the majority of the
studies on the impact of different types of input have addressed the acquisition
of linguistic aspects other than pronunciation (see Schachter, 1998), there are
reasons to believe that input can benefit pronunciation learning. As pointed out
by Leather and James (1996), the initial production of new speech patterns,
whether in L1 or L2, implies some phonetic representation in auditory-
perceptual space that must have been previously derived from exemplars
PEDAGOGY-TECHNOLOGY INTERFACE IN CAPT 5
available in the community or explicitly presented during training. Just as in
the acquisition of L1 sounds, multiple-talker models seem to be particularly
effective in improving perception of novel contrasts, as the inherent variability
allows for induction of general phonetic categories (Logan, Lively, & Pisoni,
1991). To this end, it may be important that lip movements be visible for the
students, as both seeing and hearing a sound that is being articulated has been
shown to improve production and perception (Jones, 1997; Massaro, 1987).
It has also been suggested that specific instruction on different pronunciation
aspects can lead to improvement of those aspects (Bongaerts, 2001;
Derwing, Munro, & Wiebe, 1998; Flege, 1999). This may be taken as an
indication that metalinguistic awareness is conducive to learning gains in
pronunciation. With regard to the way input should be presented, teachers
should try to contextualize input, as meaningful learning, that is, learning
through associations, generally facilitates long-term retention (Ausubel,
1968). Furthermore, input that is meaningful to a learner is perceived by the
learner as relevant to his/her needs, a factor that can stimulate intrinsic
motivation and thus indirectly favour learning (Dörnyei, 1998; Keller, 1983).
Another way to stimulate learner motivation is to present the student with
engaging input that also accommodates different learning styles (Crookes &
Schmidt, 1991; Oxford & Anderson, 1995). For instance, input could be
presented in written, aural and audio-visual form (e.g., a radio interview and a
short film episode).
2.2. Output
Although essential, mere exposure to the L2 does not appear to be a sufficient
condition for pronunciation improvement, as is exemplified by long-term
foreign residents who retain a strong accent and are hardly intelligible in the
L2 (Morley, 1991). As a matter of fact, it is now generally accepted in second
language acquisition research that, if the learners’ aim is to speak the foreign
language fluently and accurately, it is necessary for them to practise speaking
it (Hendrik, 1997; Swain, 1985; Swain & Lapkin, 1995). By producing
speech, learners can test their hypotheses on the L2 sounds. Learners can
compare their own output with the input model and consequently form correct
L2 representations. Through production, speakers receive a first, proprioceptive
feedback on their own performance: auditory and tactile feedback is
available from air- and bone-conducted pressure changes and from contact
surfaces of articulators, while feedback from the joints, tendons, and muscles
provides a sense of articulatory positions and movements; motor programs are
2.3. Feedback
The issue of feedback is still controversial: There appears to be a general
disagreement on the definition of corrective, implicit, explicit, or metalinguistic
feedback, on whether different types of feedback should be considered
as a form of positive or negative evidence, and on what constitutes evidence
for the effectiveness of this factor, especially where pronunciation is
indications are still lacking, it appears that both segmental and supra-segmental
factors are important (see Derwing et al., 1998 for an overview).
Segmental errors can preclude full intelligibility of speech (Derwing &
Munro, 1997; Rogers & Dalby, 1996). On the other hand, lexical stress and
intonation are important too, as they help listeners to process the segmental
content by adding structure to the complex and continuously varying speech
signals (Celce-Murcia et al., 1996). Furthermore, both levels are so tightly
interwoven that, while they can be separated and measured instrumentally, in
reality they influence each other, as the case of stress placement well
illustrates.
2.4. Conclusions
On the basis of this brief synopsis, we can outline some basic recommendations
for the ideal design of effective pronunciation teaching and learning.
Learning must take place in a stress-free environment in which students can be
exposed to considerable and meaningful input, are stimulated to actively
practise oral skills and can receive immediate feedback on individual errors.
Input should pertain to real-world language situations, include
multiple-speaker models and allow the learner to get a sense of the
articulatory movements involved in the production of L2 speech. Oral
production should be elicited with realistic material and exercises catering for
different learning styles, and should include pronunciation of full sentences.
Pertinent and comprehensible feedback should be provided individually and
with minimum delay and should focus on those segmental and supra-segmental
aspects that affect intelligibility most.
3. CAPT SYSTEMS
as long as they want and at self-paced speed. Third, as some studies suggest
(Murray, 1999), the privacy and the self-directed kind of learning offered by
these environments may lead to a reduction of foreign language anxiety – a
phenomenon strongly linked to social-judgement factors (Young, 1990) – and
thus indirectly favour learning. Furthermore, student profiles can be stored by
the system in a log-file so that the students themselves can monitor problems
and improvements, which in turn might result in increased motivation.
Alternatively, the teacher can refer to the logs and suggest appropriate
remedial steps. Finally, the student might in certain cases receive feedback on
oral performance from the program itself, in real-time. On account of these
advantages, there have been various attempts to develop CAPT systems.
However, the ideal requirements that we sketched in the previous section are
not often met by existing CAPT systems.
with the model one. The criticism of these kinds of displays is all the more
appropriate in the case of waveforms, since these are even more variable and
less informative than spectrograms. Other systems, like Talk to Me (TTM,
2002) and the more comprehensive Tell Me More series (Auralog, 2000), are
not exclusively based on waveforms as a form of feedback, in that a global
score is also provided and words that are incorrectly pronounced within a
sentence are colour-coded. However, the graphical importance the waveforms
have on the screen suggests that they are presented because of their flashy
look, to impress the users – that is, the buyers.
A much-praised system, WinPitchLTL (Germain-Rutherford & Martin,
2000; WinPitch, 2002), has been developed by two phoneticians working on
speech technology and pedagogy, as an authoring tool for different learning
environments. This system is able to analyse recorded speech of a maximum
duration of 12 min and display the pitch curve, the intensity curve and the
‘speech signal’ (in the form of a waveform or of a spectrogram). The main
advantage of this system is that it features ‘word-processing’ facilities: the
teacher can easily segment the speech signal displayed, label it by adding text
on the display, highlight with different colours relevant segments in the
melodic curve or significant cues on a spectrogram, thereby making important
information easily visible and retraceable for the student. These are operations
that the system cannot perform automatically as the technology that underlies
it cannot segment a complex speech signal. WinPitchLTL also contains a
synthesis feature that allows the teacher to modify the prosodic parameters of
a student’s utterance and redesign its acoustic properties within a given range
on the basis of the target model. In this way, the student can hear the correct
prosodic contours with his/her own voice, which has been shown to help the
student to better perceive important deviations (Nagano & Ozawa, 1990).
However, the effectiveness of this system relies entirely on the teacher: a
teacher must be available who has received sufficient training in
phonetics and acoustics and who is able to pass on that information to the
students by editing the speech signal, which, of course, is not commonly the
case (Price, 1998). While this system has the stated advantage of helping
teachers clearly indicate what a pronunciation problem was and how it can be
remedied, it is unlikely that a teacher will be able to edit a large number of
utterances in such a detailed way. In other words, feedback will be subject,
once again, to time constraints and unfavourable teacher-to-student ratios.
Sometimes graphic displays of pitch contours, without the addition of the
oscillogram or spectrogram, are used to give feedback on intonational patterns
(see Chun, 1998). Like other systems using displays, these programs
presuppose some degree of training in interpreting the displays. However,
pitch contours are easier to interpret than spectrograms or oscillograms. In
addition, while it is doubtful whether attempting to match a spectrogram or an
oscillogram is a meaningful exercise, trying to approach a pitch contour does
certainly make sense. Komissarchik and Komissarchik (2000) have
discussed the shortcomings of various forms of supra-segmental feedback and
have developed a system for teaching American English prosody to non-native
speakers of English, BetterAccentTutor, in which readily accessible feedback
is provided. Visual feedback is provided on all three components of prosody:
intonation, stress and rhythm. The students listen to a native speaker’s
recording studying its intonation, stress and rhythm patterns, utter a phrase
and receive immediate audio-visual feedback from the system. Both the
students’ and the natives’ patterns are displayed on the screen so that the
students can compare them and notice the most relevant features they should
match. The system offers two major, easy-to-interpret visualisation modes:
intonation – visualised as a pitch graph on vowels and semivowels – and
intensity/rhythm – visualised as steps (syllables) of various length (duration)
and height (vowel’s energy). This program, however, does not address
segmental errors. The rationale behind the system is based on the assumption
that ‘‘the three factors that have the biggest impact on intelligibility of speech
are intonation, stress and rhythm’’ (Betteraccent, 2002), but no hierarchy
of factors affecting speech intelligibility has yet been established, and research has
shown that segmental errors can be detrimental to comprehension too.
pronounced with the correct vowel (Hillenbrand, Getty, Clark, & Wheeler,
1995).
The ISLE system provides feedback by highlighting the locus of the error in
the word. In addition, the learner is shown, and can listen to, example words
in which the correct sound to imitate and the sound corresponding to the
mispronounced version are highlighted. While this feedback design
seems satisfactory, the system yields poor performance results. The authors
report that only 25% of the errors are detected by the system and that over
5% of correct phones are incorrectly classified as errors. As the authors
comment, with such a performance ‘‘students will more frequently be given
erroneous discouraging feedback than they will be given helpful diagnoses’’
(Menzel et al., 2000, p. 54). Thus, future CAPT systems that use ASR to detect
pronunciation errors should focus on errors that can be detected with a high
degree of robustness. In addition, it would help if more carefully transcribed
non-native speech in different L2s became available: this could be used to
train an ASR system for the specific task of detecting typical pronunciation
errors. Nevertheless, even if the performance of an ASR system is optimised,
it will never be perfect and, consequently, erroneous feedback will occasionally
be provided.
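The scale of the problem can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: it takes the two figures reported above (25% of errors detected; 5% of correct phones flagged, which the authors give as a lower bound) and treats them as fixed per-phone rates. The learner's error rate and the function names are hypothetical and do not come from the ISLE evaluation.

```python
def feedback_counts(n_phones, error_rate,
                    detection_rate=0.25, false_alarm_rate=0.05):
    """Expected number of justified and unjustified error flags.

    detection_rate and false_alarm_rate default to the figures quoted
    in the text; error_rate (the share of phones the learner actually
    mispronounces) is an assumed input, not a reported value.
    """
    n_errors = n_phones * error_rate
    n_correct = n_phones - n_errors
    true_flags = n_errors * detection_rate       # real errors that are flagged
    false_flags = n_correct * false_alarm_rate   # correct phones wrongly flagged
    return true_flags, false_flags


def break_even_error_rate(detection_rate=0.25, false_alarm_rate=0.05):
    """Learner error rate at which false flags equal true flags.

    Solves detection_rate * p == false_alarm_rate * (1 - p) for p.
    Below this rate, erroneous flags outnumber helpful ones.
    """
    return false_alarm_rate / (detection_rate + false_alarm_rate)


if __name__ == "__main__":
    true_flags, false_flags = feedback_counts(n_phones=1000, error_rate=0.10)
    print(f"justified flags: {true_flags:.1f}, unjustified flags: {false_flags:.1f}")
    print(f"break-even learner error rate: {break_even_error_rate():.3f}")
```

Under these assumptions, a learner who mispronounces 10% of 1000 phones would receive about 25 justified flags and about 45 unjustified ones, and unjustified flags outnumber justified ones whenever the learner's error rate falls below roughly one in six.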
Erroneous feedback is a common problem in CAPT systems using ASR
technology (see for instance the evaluation of TriplePlayPlus and Learn
German Now! in the CALICO Software Reviews, 2002; LLT Software
Reviews, 2002). Patently wrong error detection can be so frustrating for the
student that Wachowicz and Scott (1999) recommend using implicit
rather than explicit, judgemental feedback. For example, a system that only
indicates the part of a word or utterance that was mispronounced, without
indicating exactly which erroneous sounds it recognised, is likely to
make fewer errors than the ISLE system, simply because it makes only
half the number of decisions. And, as some have suggested with regard to
recasts, telling the student that some areas in his/her utterance were
incorrect and offering him/her the possibility to listen to the correct version
– without attempting to also play a version of the confusable counterpart –
might just be sufficient feedback. Still, it would be necessary to focus on
pronunciation problems that can be detected robustly. It goes without saying that
those are errors where the distance between the wrong and correct
pronunciations is relatively large. Even if these errors do not cause confusions
between words, they are so conspicuous for a listener that they are likely to
affect intelligibility.
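The reasoning about the number of decisions can be made concrete with a small sketch. It assumes, purely for illustration, that each decision the system makes (locating an error, then identifying the sound that was produced) is correct independently and with the same probability. The 0.9 figure and the function name are hypothetical, not measurements from any of the systems discussed.

```python
def p_feedback_correct(per_decision_accuracy: float, n_decisions: int) -> float:
    """Probability that every decision behind one feedback message is right.

    Assumes independent decisions with equal accuracy, which is a
    simplification made only for this illustration.
    """
    if not 0.0 <= per_decision_accuracy <= 1.0:
        raise ValueError("per_decision_accuracy must lie in [0, 1]")
    if n_decisions < 1:
        raise ValueError("n_decisions must be at least 1")
    return per_decision_accuracy ** n_decisions


if __name__ == "__main__":
    # Implicit feedback: only the location of the error is decided.
    locate_only = p_feedback_correct(0.9, 1)
    # Explicit feedback: location plus identity of the erroneous sound.
    locate_and_identify = p_feedback_correct(0.9, 2)
    print(f"locate only: {locate_only:.2f}")
    print(f"locate and identify: {locate_and_identify:.2f}")
```

With an assumed accuracy of 0.9 per decision, feedback that only locates the error is right about 90% of the time, whereas feedback that also names the erroneous sound is right only about 81% of the time.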
3.3. Conclusions
To summarize, this overview of available CAPT systems has identified a
number of pros and cons of these systems, which should be taken into
consideration when developing new prototypes. On the whole, we have seen
that an ideal system should provide input, output and feedback, and should
incorporate ASR technology.
With regard to input and output, we have observed that presently available
technology is sufficiently advanced to match the pedagogical requirements
sketched in Section 2. The technology can now even offer possibilities that are
not available in traditional classroom learning. The limitations of those
systems that make use of outdated or less effective multimedia are only
attributable to economic constraints or choices made by the developers, and
not to problems inherent in the technology.
What still remains problematic is the issue of feedback: its implementation
in CAPT systems needs to be studied carefully. We have seen that it is only
through the integration of ASR technology and pedagogical guidelines that we
can design programs providing real-time, pertinent and easy-to-interpret
feedback both on segmental and supra-segmental aspects. However, the
limitations in current ASR technology imply that error diagnosis will only be
possible with a limited degree of detail. Even if pedagogically desirable,
detailed diagnosis is simply not feasible because the performance levels
attained are too poor. Reliability is crucial in language learning: nothing could
be more confusing for a learner than a system reacting in different ways to
successive realizations of the same mistake. It therefore seems that, if we want
to reach an ideal compromise between technology and demand, we will have
to settle for something that is less ambitious, but that can guarantee correct
feedback at least in the majority of the cases.
We hope that the suggestions we have given for future work can contribute to
ameliorating CAPT design. However, further research is needed to establish
the effectiveness of specific systems that employ the functionalities suggested
here.
ACKNOWLEDGEMENTS
REFERENCES
Celce-Murcia, M., Brinton, D.M., & Goodwin, J.M. (1996). Teaching pronunciation.
Cambridge: CUP.
Chapelle, C.A. (1997). CALL in the year 2000: Still in search of research paradigms? Language
Learning and Technology, 1, 19–43 [On-line] [Last consulted 27/02/2002]. Available:
http://llt.msu.edu/vol1num1/chapelle/default.html
Chapelle, C.A. (2001). Innovative language learning: Achieving the vision. ReCALL, 13,
3–14.
Chaudron, C. (1977). A descriptive model of discourse in the corrective treatment of learner’s
errors. Language Learning, 27, 29–46.
Chun, D.M. (1998). Signal analysis software for teaching discourse intonation. Language
Learning and Technology, 2, 61–77 [On-line] [Last consulted 27/02/2002]. Available:
http://llt.msu.edu/vol2num1/article4/index.html
Cole, R., Carmell, T., Connors, P., Macon, M., Wouters, J., De Villiers, J., Tarachow, A.,
Massaro, D., Cohen, M., Beskow, J., Yang, J., Meier, U., Waibel, A., Stone, P., Fortier,
G., Davies, A., & Soland, C. (1998). Intelligent animated agents for interactive language
learning. Proceedings of InSTILL (pp. 163–166). Marholmen, Sweden.
Crompton, P., & Rodrigues, S. (2001). The role and nature of feedback on students learning
grammar: A small scale study on the use of feedback in CALL in language learning.
Proceedings of the workshop on Computer Assisted Language Learning, Artificial
Intelligence in Education Conference (pp. 70–82). San Antonio, TX.
Crookes, G., & Schmidt, R.W. (1991). Motivation: Reopening the research agenda. Language
Learning, 41, 469–512.
Cucchiarini, C., Strik, H., & Boves, L. (2000). Different aspects of pronunciation quality ratings
and their relation to scores produced by speech recognition algorithms. Speech
Communication, 30, 109–119.
De Bot, K. (1983). Visual feedback of intonation I: Effectiveness and induced practice behavior.
Language and Speech, 26, 331–350.
De Bot, K. (1996). The psycholinguistics of the Output Hypothesis. Language Learning, 46,
529–555.
Derwing, T.M., & Munro, M.J. (1997). Accent, intelligibility, and comprehensibility. Studies in
Second Language Acquisition, 20, 1–16.
Derwing, T.M., Munro, M.J., & Wiebe, G. (1998). Evidence in favour of a broad framework for
pronunciation instruction. Language Learning, 48, 393–410.
Dörnyei, Z. (1998). Motivation in second and foreign language learning. Language Teaching,
31, 117–135.
Ehsani, F., & Knodt, E. (1998). Speech technology in computer-aided learning: Strengths and
limitations of a new CALL paradigm. Language Learning and Technology, 2, 45–60
[On-line] [Last consulted 27/02/2002]. Available: http://llt.msu.edu/vol2num1/article3/
index.html
Eskenazi, M. (1999). Using automatic speech processing for foreign language pronunciation
tutoring: Some issues and a prototype. Language Learning and Technology, 2, 62–76
[On-line] [Last consulted 27/02/2002]. Available: http://llt.msu.edu/vol2num2/article3/
index.html
Eurotalk (2002). [Last consulted 27/02/2002]. http://www.eurotalk.co.uk/ETWebPages/
Products/DVDF.html
Ferrier, L., & Reid, L. (2000). Accent modification training in The Internet Way1. Proceedings
of InSTILL (pp. 69–72). Dundee, Scotland.
Flege, J.E. (1995). Second-language speech learning: Findings and problems. In W. Strange
(Ed.), Speech perception and linguistic experience: Theoretical and methodological
issues (pp. 233–273). Timonium, MD: York Press.
Flege, J.E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second
language acquisition and the critical period hypothesis (pp. 101–131). Mahwah, NJ:
Lawrence Erlbaum.
Franco, H., Neumeyer, L., Digalakis, V., & Ronen, O. (2000). Combination of machine scores
for automatic grading of pronunciation quality. Speech Communication, 30, 121–130.
Germain-Rutherford, A., & Martin, P. (2000). Presentation d’un logiciel de visualisation pour
l’apprentissage de l’oral en langue seconde. ALSIC, 3, 61–76 [On-line] [Last consulted
25/02/2002]. Available: http://alsic.u-strasbg.fr/Menus/frameder.htm
Glearner (2001). [Last consulted 10/05/2001]. http://www.glearner.com
Guiora, A.Z., Brannon, R.C., & Dull, C.Y. (1972). Empathy and second language learning.
Language Learning, 22, 111–130.
Hendrik, H. (1997). Keep them talking! A project for improving students’ L2 pronunciation.
System, 25, 545–560.
Hillenbrand, J., Getty, L.A., Clark, M.J., & Wheeler, K. (1995). Acoustic characteristics
of American English vowels. Journal of the Acoustical Society of America, 97,
3099–3111.
Holland, V.M., Kaplan, J.D., & Sabol, M.A. (1999). Preliminary tests of language learning in a
speech-interactive graphics microworld. CALICO Journal, 16, 339–359.
ILT (1997). Interactive language tour. München: Digital Publishing.
ISLE 1.4. (1999). Pronunciation training: Requirements and solutions [On-line] [Last consulted
27/02/2002]. ISLE Deliverable 1.4. Available: http://nats-www.informatik.uni-hamburg.
de/isle/public/D14/D14.html
ISLE 4.5. (2001). Error diagnosis for spoken language, ISLE Deliverable 4.5 [On-line] [Last
consulted 27/02/2002]. Available at http://nats-www.informatik.uni-hamburg.
de/isle/public/D45/D45.html
Jones, R.H. (1997). Beyond ‘Listen and Repeat’: Pronunciation teaching materials and theories
of Second Language Acquisition. System, 25, 103–112.
Kay. (2002). Kay [On-line] [Last consulted 26/02/2002]. Available: http://www.
kayelemetrics.com.
Keller, J.M. (1983). Motivational design of instruction. In C.M. Reigeluth (Ed.), Instructional
design theories and models: An overview of their current status (pp. 383–434). Hillsdale,
NJ: Lawrence Erlbaum.
Komissarchik, J., & Komissarchik, E. (2000). Better accent tutor – Analysis and visualization
of speech prosody. Proceedings of InSTILL 2000 (pp. 86–89). Dundee, Scotland.
Krashen, S.D. (1981). Second language acquisition and second language learning. Oxford:
Pergamon Press.
Krashen, S.D., & Terrell, T.D. (1983). The natural approach: Language acquisition in the
classroom. Oxford: Pergamon Press.
Lambacher, S. (1999). A CALL tool for improving second language acquisition of
English consonants by Japanese learners. Computer Assisted Language Learning, 12,
137–156.
LaRocca, S., Morgan, J., & Bellinger, S. (2001). Optimizing speech recognition for use by
learners of less commonly taught languages. Show and Tell presentation, EuroCALL.
Nijmegen, The Netherlands.
Leather, J., & James, A. (1996). Second language speech. In W.C. Ritchie & T.K. Bhatia (Eds.),
Handbook of second language acquisition (pp. 269–316). San Diego, CA: Academic
Press.
Levy, M. (1997). Computer-assisted Language Learning: Context and conceptualization.
Oxford: Clarendon Press.
Lightbown, P.M. (2001). Input filters in second language acquisition. EUROSLA Yearbook 1
(pp. 79–97). Amsterdam: John Benjamins.
Liontas, J. (2002). CALLMedia digital technology: Whither in the new millennium. CALICO
Journal, 19, 315–330.
LLT Software Reviews. (2002). LLT Archives – Software Reviews [Last consulted 27/02/
2002]. Available: http://llt.msu.edu/archives/software.html.
Logan, J.S., Lively, S.E., & Pisoni, D.B. (1991). Training Japanese listeners to identify English
/r/ and /l/ III: Long-term retention of new phonetic categories. Journal of the Acoustical
Society of America, 89, 874–886.
Long, M.H. (1996). The role of the linguistic environment in second language acquisition.
In W.C. Ritchie & T.K. Bhatia (Eds.), Handbook of second language acquisition (pp.
413–468). San Diego, CA: Academic Press.
Lyster, R. (1998). Negotiation of form, recasts, and explicit correction in relation to error types
and learner repair in immersion classrooms. Language Learning, 48, 183–218.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake. Studies in Second
Language Acquisition, 19, 37–66.
Massaro, D.W. (1987). Speech perception by ear and eye: A Paradigm for psychological
enquiry. Hillsdale, NJ: Lawrence Erlbaum.
Menzel, W., Herron, D., Bonaventura, P., & Morton, R. (2000). Automatic detection and
correction of non-native English pronunciations. Proceedings of InSTILL (pp. 49–56).
Dundee, Scotland.
MITAS. (2002). Multimedia Instructional Tutoring and Authoring System with 3D [Last
consulted 25/06/2002]. Available: http://www.maad.com/MaadWeb/products/mitas/
mitasma.htm
Molholt, G. (1988). Computer-assisted instruction in pronunciation for Chinese speakers of
American English. TESOL Quarterly, 22, 91–111.
Molholt, G. (2001). Three Modes of Visualization. Paper presented at InSTILL. EuroCALL,
Nijmegen, The Netherlands.
Morley, J. (1991). The pronunciation component in teaching English to speakers of other
languages. TESOL Quarterly, 25, 481–519.
Munro, M.J., & Derwing, T.M. (1995). Foreign accent, comprehensibility and intelligibility in
the speech of second language learners. Language Learning, 45, 73–97.
Murray, G.L. (1999). Autonomy in language learning in a simulated environment. System, 27,
295–308.
Murray, L., & Barnes, A. (1998). Beyond the ‘wow’ factor – Evaluating multimedia language
learning software from a pedagogical point of view. System, 26, 249–259.
Nagano, K., & Ozawa, K. (1990). English speech training using voice conversion. Proceedings
of ICSLP, Kobe, 1169–1172.
Nagata, N. (1993). Intelligent computer feedback for second language instruction. The Modern
Language Journal, 77, 330–339.
Nicholas, H., Lightbown, P.M., & Spada, N. (2001). Recasts as feedback to language learners.
Language Learning, 51, 719–758.
Nieuwe Buren. (2002). Nieuwe Buren [Last consulted 26/02/2002]. Available: http://
www.nieuweburen.nl
Nouza, J. (1998). Training speech through visual feedback patterns. Proceedings of ICSLP,
Sydney, Australia.
Nunan, D. (1989). Designing tasks for the communicative classroom. Cambridge, UK:
CUP.
O’Malley, J.M., & Chamot, A.U. (1990). Learning strategies in second language acquisition.
Cambridge: CUP.
Oxford, R.L., & Anderson, N.J. (1995). A crosscultural view of learning styles. Language
Teaching, 28, 201–215.
Pennington, M.C. (1999). Computer-aided pronunciation pedagogy: Promise, limitations,
directions. Computer Assisted Language Learning, 12, 427–440.
Piske, T., MacKay, I.R.A., & Flege, J.E. (2001). Factors affecting degree of foreign accent in an
L2: A review. Journal of Phonetics, 29, 191–215.
Precoda, K., Halverson, C.A., & Franco, H. (2000). Effects of speech recognition-based
pronunciation feedback on second-language pronunciation ability. Proceedings of
InSTILL (pp. 102–105). Dundee, Scotland.
Price, P. (1998). How can speech technology replicate and complement skills of good language
teachers in ways that help people to learn language? Proceedings of InSTILL (pp. 81–86).
Marholmen, Sweden.
Pro-nunciation. (2002). Products [Last consulted 26/02/2002]. Available: http://users.
zipworld.com.au/pronunce/products.htm
Pujola, J.-T. (2001). Did CALL feedback feed back? Researching learners’ use of feedback.
ReCALL, 13, 79–98.
Rogers, C., & Dalby, J. (1996). Prediction of foreign-accented speech intelligibility from
segmental contrast measures. Journal of the Acoustical Society of America, 100 (Pt. 2),
2725 (A).
Ross, K. (2001). Teaching Languages With Asynchronous Voice Over the Internet. Paper
presented at InSTILL, EuroCALL, Nijmegen, The Netherlands.
Rubin, J. (1987). Learning strategies: Theoretical assumptions, research history and typology.
In A.L. Wenden & J. Rubin (Eds.), Learner strategies in language learning (pp. 15–30).
Englewood Cliffs, NJ: Prentice Hall.
Schachter, J. (1998). Recent research in language learning studies: Promises and problems.
Language Learning, 48, 557–583.
Schmidt, R.W. (1990). The role of consciousness in second language learning. Applied
Linguistics, 11, 129–158.
Scovel, T. (1988). A time to speak. A psycholinguistic inquiry into the critical period for human
speech. Rowley, MA: Newbury House.
Swain, M. (1985). Communicative competence: Some roles of comprehensible input and
comprehensible output in its development. In M.A. Gass & C.G. Madden (Eds.), Input in
second language acquisition (pp. 235–253). Rowley, MA: Newbury House.
Swain, M., & Lapkin, S. (1995). Problems in output and the cognitive process they generate: A
step towards second language learning. Applied Linguistics, 16, 371–391.
TraciTalk (2002). Traci Talk, The Mystery [Last consulted 24/06/2002]. Available:
http://www.encomium.com/CPI/CPITTTM.html
TTM. (2002). Talk to Me, the Conversation Method [Last consulted 26/02/2002]. Available:
http://www.auralog.com/en/talktome.html
Tutsui, M., Masashi, K., & Mohr, B. (1999). Closing the gap between practice environments
and reality: An interactive multimedia program for oral communication training in
Japanese. Computer Assisted Language Learning, 11, 125–151.
Van de Voort, M. (1999). Gluren naar de buren. Nijmegen: UTN.
Van Heuven, V.J.J.P., Kruyt, J.G., & de Vries, J.W. (1981). Buitenlandsheid en begrijpelijkheid
in het Nederlands van buitenlandse arbeiders; een verkennende studie. Forum der
Letteren, 22, 171–178.
Wachowicz, K., & Scott, B. (1999). Software that listens: It’s not a question of whether, it’s a
question of how. CALICO Journal, 16, 253–276.
Warschauer, M., & Healey, D. (1998). Computers and language learning: An overview.
Language Teaching, 31, 57–71.
Watts, N. (1997). A learner-based design model for interactive multimedia language learning
packages. System, 25, 1–8.
Wharton, G. (2000). Language learning strategy use of bilingual foreign language learners in
Singapore. Language Learning, 50, 203–243.
WinPitch (2002). Pitch Instruments Inc. [Last consulted 26/02/2002]. Available:
http://www.winpitch.com.
Young, D.J. (1990). An investigation of students’ perspectives on anxiety and speaking. Foreign
Language Annals, 23, 539–553.