
Word Error Rate Minimization Using an Integrated Confidence Measure

Published: 01 May 2007

Abstract

This paper describes a new criterion for speech recognition that uses an integrated confidence measure to minimize the word error rate (WER). Conventional criteria for WER minimization estimate the expected WER of a sentence hypothesis merely by comparing it with the other hypotheses in an n-best list. The proposed criterion instead estimates the expected WER using an integrated confidence measure based on word posterior probabilities for a given acoustic input. The integrated confidence measure, implemented as a classifier based on maximum entropy (ME) modeling or support vector machines (SVMs), yields probabilities that each word hypothesis is correct. The classifier combines a variety of confidence measures and can handle a temporal sequence of them to attain a more reliable confidence. The proposed criterion achieved a WER of 9.8%, a 3.9% relative reduction over conventional n-best rescoring methods, in transcribing Japanese broadcast news under varied conditions, including noisy field and spontaneous speech.
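As a rough illustration of the minimum expected-WER idea described above, the sketch below scores each n-best hypothesis by its expected number of word errors, taking each word's confidence as the probability that the word is correct and picking the hypothesis with the lowest expected error rate. The function names and toy data are illustrative assumptions, not from the paper; in the actual system the per-word confidences would come from the ME- or SVM-based classifier over multiple confidence features.

```python
# Minimal sketch of minimum expected-WER selection from an n-best list.
# Assumes each word hypothesis carries a confidence score in [0, 1]
# interpreted as P(word is correct); names and data are illustrative.

def expected_word_errors(word_confidences):
    """Expected number of word errors: sum of (1 - P(correct)) over words."""
    return sum(1.0 - c for c in word_confidences)

def select_min_expected_wer(nbest):
    """Pick the hypothesis minimizing expected errors per word.

    `nbest` is a list of (words, confidences) pairs, one per hypothesis.
    """
    def score(hyp):
        words, confs = hyp
        # Normalize by hypothesis length so short and long candidates compare fairly.
        return expected_word_errors(confs) / max(len(words), 1)
    return min(nbest, key=score)

# Toy two-hypothesis n-best list: the second word differs between candidates.
nbest = [
    (["the", "cat", "sat"], [0.9, 0.6, 0.8]),
    (["the", "cap", "sat"], [0.9, 0.3, 0.8]),
]
best_words, _ = select_min_expected_wer(nbest)
```

Here the first hypothesis wins because its middle word carries the higher confidence; a simple 1-best decoder would never revisit that choice, whereas the expected-WER criterion rescores all candidates.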

References

[1]
A. Ando, T. Imai, A. Kobayashi, S. Homma, J. Goto, N. Seiyama, T. Mishima, T. Kobayakawa, S. Sato, K. Onoe, H. Segi, A. Imai, A. Matsui, A. Nakamura, H. Tanaka, T. Takagi, E. Miyasaka, and H. Isono, “Simultaneous subtitling system for broadcast news programs with a speech recognizer,” IEICE Trans. Inf. & Syst., vol.E86-D, no.1, pp.15–25, Jan. 2003.
[2]
G. Riccardi and D. Hakkani-Tür, “Active and unsupervised learning for automatic speech recognition,” Proc. Eurospeech, pp.1825–1828, 2003.
[3]
M. Nakano, “Using untranscribed user utterances for improving language models based on confidence scoring,” Proc. Eurospeech, pp.417–420, 2003.
[4]
A. Stolcke, Y. Konig, and M. Weintraub, “Explicit word error minimization in N-best list rescoring,” Proc. Eurospeech, pp.163–166, 1997.
[5]
V. Goel, W.J. Byrne, and S.P. Khudanpur, “LVCSR rescoring with modified loss functions: A decision-theoretic perspective,” Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pp.425–428, 1998.
[6]
F. Wessel, R. Schlüter, K. Macherey, and H. Ney, “Confidence measures for large vocabulary continuous speech recognition,” IEEE Trans. Speech Audio Process., vol.9, no.3, pp.288–298, 2001.
[7]
T. Kemp and T. Schaaf, “Estimating confidence using word lattices,” Proc. Eurospeech, pp.827–830, 1997.
[8]
G. Evermann and P. Woodland, “Posterior probability decoding, confidence estimation and system combination,” Proc. NIST Speech Transcription Workshop, http://www.nist.gov/speech/publications/tw00/html/cp230/cp230.htm, 2000.
[9]
G. Riccardi and D. Hakkani-Tür, “Active learning: Theory and applications to automatic speech recognition,” IEEE Trans. Speech Audio Process., vol.13, no.4, pp.504–511, 2005.
[10]
A. Berger, S. Della Pietra, and V. Della Pietra, “A maximum entropy approach to natural language processing,” Computational Linguistics, vol.22, pp.39–71, 1996.
[11]
J. Darroch and D. Ratcliff, “Generalized iterative scaling for log-linear models,” The Annals of Mathematical Statistics, pp.1470–1480, 1972.
[12]
S.F. Chen and R. Rosenfeld, “A Gaussian prior for smoothing maximum entropy models,” Technical Report CMU-CS-99-108, Carnegie Mellon University, 1999.
[13]
T.J. Hazen, S. Seneff, and J. Polifroni, “Recognition confidence scoring and its use in speech understanding systems,” Comput. Speech Lang., vol.16, pp.49–67, 2002.
[14]
P.J. Moreno, B. Logan, and B. Raj, “A boosting approach for confidence scoring,” Proc. Eurospeech, pp.2109–2112, 2001.
[15]
T. Joachims, Learning to classify text using support vector machines, Kluwer Academic Publishers, Boston, 2002.
[16]
T. Joachims, “Introduction to support vector learning,” in Advances in Kernel Methods, ed. B. Schölkopf, C.J. Burges, and A.J. Smola, MIT Press, 1999.
[17]
J. Platt, “Probabilistic outputs for support vector machines and comparison to regularized likelihood methods,” in Advances in Large Margin Classifiers, ed. A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, pp.61–74, MIT Press, 2000.
[18]
T. Joachims, “Making large-scale SVM learning practical,” in Advances in Kernel Methods, ed. B. Schölkopf, C.J. Burges, and A.J. Smola, MIT Press, 1999.
[19]
A. Stolcke, “SRILM - An extensible language modeling toolkit,” Proc. Int. Conf. Spoken Language Processing, pp.901–904, 2002.
[20]
T. Zeppenfeld, M. Finke, M. Westphal, K. Ries, and A. Waibel, “Recognition of conversational telephone speech using the Janus speech engine,” Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pp.1815–1818, 1997.
[21]
F. Weng, A. Stolcke, and A. Sankar, “Efficient lattice representation and generation,” Proc. Int. Conf. Spoken Language Processing, pp.2531–2534, 1998.
[22]
H. Hermansky and N. Morgan, “RASTA processing of speech,” IEEE Trans. Speech Audio Process., vol.2, pp.578–589, 1994.
[23]
A. Kobayashi, K. Onoe, T. Imai, and A. Ando, “Time dependent language model for broadcast news transcription and its post-correction,” Proc. Int. Conf. Spoken Language Processing, pp.2435–2438, 1998.
[24]
L. Gillick and S.J. Cox, “Some statistical issues in the comparison of speech recognition algorithms,” Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, pp.532–535, 1989.

Published In

IEICE Transactions on Information and Systems, Volume E90-D, Issue 5
May 2007
80 pages
ISSN:0916-8532
EISSN:1745-1361

Publisher

Oxford University Press, Inc.

United States

Author Tags

  1. maximum entropy
  2. n-best rescoring
  3. support vector machines
  4. word error rate minimization
