Isolate Speech Recognition Based on Time-Frequency Analysis Methods

Mantilla-Caeiros, Alfredo; Nakano Miyatake, Mariko; Perez-Meana, Hector

doi:10.1007/978-3-642-10268-4_35


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5856)

Included in the following conference series: Iberoamerican Congress on Pattern Recognition


Abstract

A feature extraction method for isolated speech recognition is proposed. It is based on a time-frequency analysis that uses a critical-band concept similar to that of the inner ear model: it emulates the behavior of the inner ear by decomposing the signal in a manner similar to that carried out by the basilar membrane. Evaluation results show that the proposed method performs better than other previously proposed feature extraction methods when it is used to characterize normal as well as esophageal speech signals.
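The abstract does not give the details of the decomposition, so the following is only a minimal sketch of the general critical-band idea, not the authors' inner ear model. It assumes a plain DFT power spectrum whose bins are grouped on the Bark scale (Zwicker-Terhardt approximation) to produce one log-energy feature per band. The function names, the Hann window, and the 8 kHz test tone are all illustrative assumptions.

```python
import cmath
import math


def hz_to_bark(freq_hz):
    """Zwicker-Terhardt approximation of the Bark (critical-band) scale."""
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


def critical_band_log_energies(frame, sample_rate):
    """Return one log-energy value per Bark band for a single frame (hypothetical helper)."""
    n = len(frame)
    # Hann window to limit spectral leakage between neighbouring bands.
    windowed = [x * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # Number of whole Bark bands below the Nyquist frequency.
    n_bands = int(hz_to_bark(sample_rate / 2.0)) + 1
    energies = [0.0] * n_bands
    # Naive DFT over the non-negative frequencies (adequate for short frames).
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        band = min(int(hz_to_bark(k * sample_rate / n)), n_bands - 1)
        energies[band] += abs(coeff) ** 2
    # A small floor avoids log(0) in silent bands.
    return [math.log(e + 1e-12) for e in energies]


# Illustrative input: a 1 kHz tone sampled at 8 kHz (assumed values).
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(256)]
features = critical_band_log_energies(frame, sample_rate)
loudest_band = max(range(len(features)), key=features.__getitem__)
```

In this sketch the feature vector for each frame would be passed to a classifier. A filter bank or wavelet decomposition with band-dependent time resolution would be closer to how the basilar membrane behaves than the fixed-length DFT used here.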



Author information

Authors and Affiliations

Instituto Tecnologico de Monterrey, Campus Ciudad de Mexico, Av. Del Puente, Mexico, D.F.
Alfredo Mantilla-Caeiros
ESIME Culhuacan, Instituto Politécnico Nacional, Av. Santa Ana 1000, 04430, Mexico, D.F., Mexico
Mariko Nakano Miyatake & Hector Perez-Meana


Editor information

Editors and Affiliations

Departamento de Ingeniería Eléctrica y Ciencias de la Computación, CINVESTAV, Unidad Guadalajara, Jalisco, México
Eduardo Bayro-Corrochano
Computer Vision and Active Perception Laboratory, CSC, KTH, SE-100 44, Stockholm, Sweden
Jan-Olof Eklundh


Cite this paper

Mantilla-Caeiros, A., Nakano Miyatake, M., Perez-Meana, H. (2009). Isolate Speech Recognition Based on Time-Frequency Analysis Methods. In: Bayro-Corrochano, E., Eklundh, JO. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2009. Lecture Notes in Computer Science, vol 5856. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10268-4_35


DOI: https://doi.org/10.1007/978-3-642-10268-4_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10267-7
Online ISBN: 978-3-642-10268-4
eBook Packages: Computer Science, Computer Science (R0)
