Abstract
This paper presents a set of exploratory experiments that analyse and evaluate the performance of baseline speech processing components in European Portuguese for distant voice command recognition in domestic environments. The analysis, conducted in a multi-channel, multi-room scenario, shows the importance of adequate room detection and channel selection strategies for obtaining acceptable performance. Two computationally inexpensive channel selection measures have been investigated for room detection, channel selection and cluster selection. Experimental results show that the strategies based on the envelope-variance measure consistently outperformed the other methods investigated and, in particular, that channel selection strategies can be preferable to baseline beamforming methods, such as delay-and-sum, in this type of multi-room scenario.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Matos, M., Abad, A., Astudillo, R., Trancoso, I. (2014). Recognition of Distant Voice Commands for Home Applications in Portuguese. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_19
DOI: https://doi.org/10.1007/978-3-319-13623-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science, Computer Science (R0)