Abstract
In this paper, we describe an environment compensation approach based on maximum a posteriori (MAP) estimation, assuming that the noise can be modeled as a single Gaussian distribution. The approach uses prior information about the noise to deal with environmental variability. The acoustically distorted environment model in the cepstral domain is approximated by a truncated first-order vector Taylor series (VTS) expansion, and the clean speech model is trained with a Self-Organizing Map (SOM) neural network under the assumption that the speech can be well represented by a multivariate Gaussian mixture model (GMM) with diagonal covariances. With this environment model approximation and effective clustering for the clean speech model, the noise estimate is refined using the batch EM algorithm under the MAP criterion. Experiments on large-vocabulary, speaker-independent continuous speech recognition show that this approach considerably improves recognition performance.
This research was sponsored by NSFC (National Natural Science Foundation of China) under Grant No. 60475007, the Foundation of China Education Ministry for Century Spanning Talent, and the BUPT Education Foundation.
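The first-order VTS approximation named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly used additive-noise relation in the log-spectral domain, y = x + log(1 + exp(n - x)), applied per dimension, whereas the paper works in the cepstral domain, which would add a DCT matrix. The function names `mismatch`, `vts_first_order`, and `noisy_gaussian` are hypothetical and chosen only for illustration.

```python
import numpy as np

def mismatch(x, n):
    """Assumed log-spectral relation between clean speech x, noise n, and noisy speech y."""
    return x + np.log1p(np.exp(n - x))

def vts_first_order(x0, n0):
    """First-order Taylor expansion of mismatch() around (x0, n0).

    Returns (y0, A, B) such that y ~= y0 + A * (x - x0) + B * (n - n0),
    where all operations are element-wise (diagonal assumption).
    """
    y0 = mismatch(x0, n0)
    A = 1.0 / (1.0 + np.exp(n0 - x0))  # partial derivative dy/dx at (x0, n0)
    B = 1.0 - A                        # partial derivative dy/dn at (x0, n0)
    return y0, A, B

def noisy_gaussian(mu_x, var_x, mu_n, var_n):
    """Approximate mean and variance of y for independent Gaussian x and n.

    Under the linearization, y is Gaussian with mean mismatch(mu_x, mu_n)
    and variance A^2 * var_x + B^2 * var_n (per dimension).
    """
    y0, A, B = vts_first_order(mu_x, mu_n)
    return y0, A**2 * var_x + B**2 * var_n

if __name__ == "__main__":
    # Illustrative values only: one clean-speech mixture component and one noise Gaussian.
    mu_x = np.array([5.0, 3.0])
    var_x = np.array([1.0, 0.5])
    mu_n = np.array([2.0, 3.0])
    var_n = np.array([0.2, 0.2])
    mu_y, var_y = noisy_gaussian(mu_x, var_x, mu_n, var_n)
    print("noisy mean:", mu_y)
    print("noisy variance:", var_y)
```

In a scheme like the one the abstract outlines, a linearization of this kind would be applied to each Gaussian component of the clean speech model, so that the noisy-speech likelihood stays Gaussian and the noise parameters can be re-estimated iteratively.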
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shen, H., Guo, J., Liu, G., Huang, P., Li, Q. (2005). Environment Compensation Based on Maximum a Posteriori Estimation for Improved Speech Recognition. In: Gelbukh, A., de Albornoz, Á., Terashima-Marín, H. (eds) MICAI 2005: Advances in Artificial Intelligence. MICAI 2005. Lecture Notes in Computer Science, vol 3789. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11579427_87
DOI: https://doi.org/10.1007/11579427_87
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29896-0
Online ISBN: 978-3-540-31653-4
eBook Packages: Computer Science (R0)