ASR Independent Hybrid Recurrent Neural Network Based Error Correction for Dialog System Applications

Choi, Junhwi; Ryu, Seonghan; Lee, Kyusong; Kim, Yonghee; Koo, Sangjun; Bang, Jeesoo; Park, Seonyeong; Lee, Gary Geunbae

doi:10.1007/978-3-319-15557-9_7

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8757)

Included in the following conference series:

International Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction

Abstract

We propose an automatic speech recognition (ASR) error correction method for dialog system applications that combines word sequence matching with a recurrent neural network. ASR errors are corrected primarily by word sequence matching, whereas the remaining out-of-vocabulary (OOV) errors are corrected by a secondary method based on recurrent neural network syllable prediction. We evaluated the method on a Korean parallel test corpus consisting of ASR results and their correct transcriptions. The overall results indicate that the method effectively decreases the word error rate of the ASR results. The method can correct ASR errors using only a text corpus, without the corresponding speech recognition results, which means that it is independent of the ASR engine. The method is general and can be applied to any speech-based application, such as spoken dialog systems.
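
The evaluation above is reported in terms of word error rate. As a reminder of how that metric is conventionally defined, the following is a minimal sketch that computes it as the word-level edit distance between a reference transcription and an ASR hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the scoring setup actually used in the paper is not described on this page and may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace. This is an illustrative
    convention, not necessarily the tokenization used in the paper.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

An error correction step is effective under this metric if the value computed for the corrected output against the reference is lower than the value computed for the raw ASR output.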

Notes

1. In some cases, the syllable prediction length is longer than the number of syllables of the detected erroneous word. Then, to predict a syllable, the method must predict a phoneme first.
2. The number of syllables and phonemes is constant for Korean.
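
To illustrate why the syllable and phoneme inventories of Korean can be treated as fixed, the sketch below splits a precomposed Hangul syllable into its initial, medial, and final components using the arithmetic of the Unicode Hangul Syllables block (19 initials, 21 medials, and 28 finals including "no final", giving 11,172 syllables). Using the Unicode inventory is an assumption made here for illustration; the notes do not state which inventory the method uses, and the names `decompose` and `compose` are hypothetical.

```python
from typing import Tuple

# Unicode Hangul Syllables block starts at U+AC00.
HANGUL_BASE = 0xAC00
NUM_INITIALS = 19
NUM_MEDIALS = 21
NUM_FINALS = 28  # includes the "no final consonant" case at index 0
NUM_SYLLABLES = NUM_INITIALS * NUM_MEDIALS * NUM_FINALS  # 11172


def decompose(syllable: str) -> Tuple[int, int, int]:
    """Return (initial, medial, final) indices of one precomposed syllable."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < NUM_SYLLABLES:
        raise ValueError("not a precomposed Hangul syllable")
    initial, rest = divmod(index, NUM_MEDIALS * NUM_FINALS)
    medial, final = divmod(rest, NUM_FINALS)
    return initial, medial, final


def compose(initial: int, medial: int, final: int) -> str:
    """Inverse of decompose: build the syllable from its component indices."""
    return chr(HANGUL_BASE + (initial * NUM_MEDIALS + medial) * NUM_FINALS + final)
```

Because both inventories are closed sets, a predictor over syllables or over phonemes has a fixed output vocabulary, in contrast to word-level prediction, where out-of-vocabulary items can occur.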

Acknowledgements

This work was partly supported by the ICT R&D program of MSIP/IITP [14-824-09-014, Basic Software Research in Human-level Lifelong Machine Learning (Machine Learning Center)] and by the National Research Foundation of Korea (NRF) [NRF-2014R1A2A1A01003041].

Author information

Authors and Affiliations

Pohang University of Science and Technology, 77 Cheongam-Ro, Nam-Gu, Pohang, Gyeongbuk, Korea
Junhwi Choi, Seonghan Ryu, Kyusong Lee, Yonghee Kim, Sangjun Koo, Jeesoo Bang, Seonyeong Park & Gary Geunbae Lee

Corresponding author

Correspondence to Junhwi Choi.

Editor information

Editors and Affiliations

Otto von Guericke University, Magdeburg, Germany
Ronald Böck
Trinity College, Dublin, Ireland
Francesca Bonin
Trinity College, Dublin, Ireland
Nick Campbell
Utrecht University, Utrecht, The Netherlands
Ronald Poppe

About this paper

Cite this paper

Choi, J. et al. (2015). ASR Independent Hybrid Recurrent Neural Network Based Error Correction for Dialog System Applications. In: Böck, R., Bonin, F., Campbell, N., Poppe, R. (eds) Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. MA3HMI 2014. Lecture Notes in Computer Science, vol 8757. Springer, Cham. https://doi.org/10.1007/978-3-319-15557-9_7

DOI: https://doi.org/10.1007/978-3-319-15557-9_7
Published: 12 February 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15556-2
Online ISBN: 978-3-319-15557-9
eBook Packages: Computer Science, Computer Science (R0)
