Using geometric spectral subtraction approach for feature extraction for DSR front-end Arabic system

International Journal of Speech Technology

Abstract

Noise robustness and the Arabic language remain major challenges for speech recognition in mobile environments. This paper addresses both by proposing a new robust Distributed Speech Recognition (DSR) system for Arabic. A speech enhancement algorithm is applied to the noisy speech as a front-end pre-processing stage to improve recognition performance, while an isolated-word Arabic recognition engine, designed and built with hidden Markov models (HMMs), performs recognition at the back-end. To test the engine, clean, noisy, and enhanced noisy speech were investigated in both speaker-dependent and speaker-independent tasks. In experiments on the noisy database, multi-condition training outperforms clean training for all noise types in terms of recognition rate. The results also indicate that the enhancement method increases the DSR accuracy of the system under severe noise, especially at low SNRs down to 10 dB.
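
To make the SNR figure quoted above concrete, here is a minimal sketch of how a noisy test signal at a chosen SNR can be built. It is not the authors' procedure or database: the sine tone, the Gaussian noise, the 8 kHz sampling rate, and the helper names `power`, `mix_at_snr`, and `measure_snr_db` are all assumptions made for illustration. It relies only on the standard definition SNR(dB) = 10 log10(P_signal / P_noise).

```python
import math
import random


def power(signal):
    """Mean power (average of squared samples) of a sequence."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + scaled noise has the requested SNR.

    Hypothetical helper, not taken from the paper.
    """
    # SNR_dB = 10 * log10(P_clean / P_noise), so the required noise power is:
    target_noise_power = power(clean) / (10 ** (target_snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


def measure_snr_db(clean, noise):
    """SNR in dB between a clean signal and an additive noise signal."""
    return 10 * math.log10(power(clean) / power(noise))


# Assumed stand-ins: a 440 Hz tone for speech, white Gaussian noise,
# one second of audio at an assumed 8 kHz sampling rate.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

noisy, scaled_noise = mix_at_snr(clean, noise, target_snr_db=10)
measured = measure_snr_db(clean, scaled_noise)
```

At 10 dB the signal power is ten times the noise power; lowering `target_snr_db` produces the more severe conditions the abstract refers to.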

Author information

Corresponding author

Correspondence to Zied Sakka.

About this article

Cite this article

Sakka, Z., Techini, E. & Bouhlel, M. Using geometric spectral subtraction approach for feature extraction for DSR front-end Arabic system. Int J Speech Technol 20, 645–650 (2017). https://doi.org/10.1007/s10772-017-9433-1

  • DOI: https://doi.org/10.1007/s10772-017-9433-1
