Abstract
In this paper, we present a robust voice activity detection (VAD) algorithm that combines short-term and long-term spectral patterns. We analyze the benefits of each type of pattern when applied to robust VAD. Based on this analysis, we find that in noisy environments the combination achieves higher VAD accuracy than either type of pattern alone. We evaluate the algorithm under four noise types and six signal-to-noise ratio (SNR) conditions. Compared with standard VAD schemes, the evaluation shows largely promising results, with the proposed scheme performing comparably or favorably across the whole test set for various VAD evaluation criteria.
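The abstract does not say how the noisy test conditions were built, so the following is only a minimal sketch of one common way to create a test signal at a chosen SNR: scale the noise so that the ratio of speech power to noise power matches the target, then add it to the clean speech. The function names (`mean_power`, `mix_at_snr`, `measured_snr_db`), the synthetic signals, and the SNR values in the loop are all hypothetical and are not taken from the paper.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def measured_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal of equal length."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` to reach `target_snr_db` relative to `speech`, then add.

    Returns (noisy_signal, scaled_noise). Power is measured over the whole
    signal; this is an assumption, since some evaluations measure speech
    power over active speech segments only.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    p_speech = mean_power(speech)
    p_noise = mean_power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Required noise power is p_speech / 10^(SNR/10); gain is its square-root ratio.
    gain = math.sqrt(p_speech / (p_noise * 10 ** (target_snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


# Synthetic stand-ins for real recordings (hypothetical data).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(8000)]

# Illustrative SNR levels only; the paper's six conditions are not listed in the abstract.
for target in (-5, 0, 5, 10, 15, 20):
    noisy, scaled = mix_at_snr(speech, noise, target)
    print(target, round(measured_snr_db(speech, scaled), 6))
```

In a real evaluation the synthetic sine and uniform noise would be replaced by clean speech and recorded noise, and the detector's frame decisions on `noisy` would be compared against reference labels from the clean speech.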
This research was supported in part by the National Natural Science Foundation of China (No. 91120303, No. 61273267, No. 90820011 and No. 90820303).
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tan, YW., Liu, WJ. (2014). Robust Voice Activity Detection Using the Combination of Short-Term and Long-Term Spectral Patterns. In: Li, S., Liu, C., Wang, Y. (eds) Pattern Recognition. CCPR 2014. Communications in Computer and Information Science, vol 484. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45643-9_45
DOI: https://doi.org/10.1007/978-3-662-45643-9_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45642-2
Online ISBN: 978-3-662-45643-9
eBook Packages: Computer Science (R0)