Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data

Abad, Alberto; Meinedo, Hugo; Neto, João

doi:10.1007/978-3-540-85980-2_18

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5190)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

Automatic transcription of telephone speech poses additional challenges compared with wideband data processing, mainly because of channel limitations and the particular characteristics of conversational telephone speech. In TV speech recognition applications, such as automatic transcription of broadcast news, telephone data is nearly insignificant (less than 1%), whereas in most radio broadcast stations the share of telephone speech is significantly larger. Transcription of telephone speech therefore deserves special attention in radio broadcast applications. In this work, we describe our initial efforts to tackle this problem. First, a telephone channel classifier is proposed to automatically detect telephone segments. Then, some strategies for increasing the robustness of the automatic transcription system are investigated.
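
The abstract does not say how the telephone channel classifier works. As a purely illustrative sketch, and not the authors' method, one simple heuristic exploits the fact that narrowband telephone channels carry almost no energy above roughly 4 kHz: a segment sampled at 16 kHz can be flagged as telephone speech when its high-band energy share is very small. The function names, the 4 kHz cutoff, the 256-sample frame length, and the 0.01 threshold below are all assumptions chosen for the example.

```python
import cmath
import math


def high_band_energy_ratio(frame, sample_rate, cutoff_hz=4000.0):
    """Return the fraction of a frame's spectral energy at or above cutoff_hz.

    Uses a naive DFT over the positive-frequency bins (DC is skipped),
    which is slow but keeps the sketch dependency-free.
    """
    n = len(frame)
    total = 0.0
    high = 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        power = abs(coeff) ** 2
        total += power
        if k * sample_rate / n >= cutoff_hz:
            high += power
    return high / total if total > 0 else 0.0


def is_telephone_segment(samples, sample_rate=16000, frame_len=256, threshold=0.01):
    """Flag a segment as telephone-like when its mean high-band ratio is tiny.

    The threshold is an assumed value for illustration, not a tuned one.
    """
    frames = [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not frames:
        raise ValueError("segment is shorter than one frame")
    ratios = [high_band_energy_ratio(f, sample_rate) for f in frames]
    return sum(ratios) / len(ratios) < threshold


def make_tone_mix(freqs_hz, sample_rate=16000, num_samples=1024):
    """Synthetic test signal: a sum of unit-amplitude sine tones."""
    return [
        sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs_hz)
        for t in range(num_samples)
    ]


# Tones inside the telephone band only versus tones that also reach 6 kHz.
narrowband = make_tone_mix([1000.0, 2500.0])
wideband = make_tone_mix([1000.0, 2500.0, 6000.0])
print(is_telephone_segment(narrowband), is_telephone_segment(wideband))
```

A practical classifier would need to handle silence, music, and band-limited studio material, so a trained model over spectral features is the more likely choice in a real system; the sketch only illustrates the kind of channel cue such a detector can rely on.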

Author information

Authors and Affiliations

L2F - Spoken Language Systems Lab, INESC-ID / IST, Lisboa, Portugal
Alberto Abad, Hugo Meinedo & João Neto

Editor information

António Teixeira, Vera Lúcia Strube de Lima, Luís Caldas de Oliveira, Paulo Quaresma

About this paper

Cite this paper

Abad, A., Meinedo, H., Neto, J. (2008). Automatic Classification and Transcription of Telephone Speech in Radio Broadcast Data. In: Teixeira, A., de Lima, V.L.S., de Oliveira, L.C., Quaresma, P. (eds) Computational Processing of the Portuguese Language. PROPOR 2008. Lecture Notes in Computer Science, vol 5190. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85980-2_18

DOI: https://doi.org/10.1007/978-3-540-85980-2_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85979-6
Online ISBN: 978-3-540-85980-2
eBook Packages: Computer Science, Computer Science (R0)
