An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm

  • Published in: Circuits, Systems, and Signal Processing

Abstract

At present, several communication tools are employed to express human emotions. Among the numerous modes of communication, speech is the most predominant one for communicating with people effectively and efficiently. Speech emotion recognition (SER) therefore plays a significant role in several signal processing applications. However, identifying the emotions expressed in Indian regional languages remains challenging, both in selecting appropriate features through feature selection (FS) methods and in building reliable classifiers. In this work, a novel SER framework is proposed to classify different speech emotions. The framework first applies a preprocessing phase to alleviate the background noise and artifacts present in the input speech signal. Two new speech attributes related to energy and phase are then integrated with state-of-the-art attributes to examine speech emotion characteristics. A threshold-based feature selection (TFS) algorithm is introduced to determine the optimal features using a statistical approach. A Tamil Emotional dataset, covering an Indian regional language, has been created to evaluate the proposed framework with standard machine learning and deep learning classifiers. The proposed TFS technique is well suited to Indian regional languages, since it exhibits superior performance, with 97.96% accuracy, compared with the Indian English and Malayalam datasets.
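
The TFS step is described here only at a high level, so the following minimal sketch illustrates the general idea of keeping features whose per-feature statistical score exceeds a threshold. The choice of score (an ANOVA F-score), the percentile-based threshold, and the helper name tfs_select are illustrative assumptions, not the authors' published algorithm.

    # Minimal sketch of threshold-based feature selection.
    # Assumption: the paper's "statistical approach" is approximated here by an
    # ANOVA F-score with a percentile threshold; tfs_select is a hypothetical helper.
    import numpy as np
    from sklearn.feature_selection import f_classif

    def tfs_select(X, y, percentile=75):
        """Keep features whose class-separability score exceeds a percentile threshold."""
        f_scores, _ = f_classif(X, y)             # per-feature ANOVA F-score across emotion classes
        threshold = np.percentile(f_scores, percentile)
        mask = f_scores >= threshold              # boolean mask of retained features
        return X[:, mask], mask

    # Example with stand-in data: 200 utterances, 120 acoustic features, 5 emotion classes
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 120))
    y = rng.integers(0, 5, size=200)
    X_selected, mask = tfs_select(X, y, percentile=75)
    print(X_selected.shape, int(mask.sum()))

In practice, any per-feature statistic (variance, mutual information, or the paper's own criterion) could replace the F-score; only the thresholding pattern is the point of the sketch.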



Data availability

The dataset used for extracting emotional features is retrieved from the English dataset listed in [19].

References

  1. L. Abdel-Hamid, Egyptian Arabic speech emotion recognition using prosodic, spectral and wavelet features. Speech Commun. 122, 9–30 (2020)


  2. G. Agarwal, H. Om, Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition. Multimed. Tools Appl. 80, 9961–9992 (2021)


  3. A. Bhowmick, A. Biswas, Identification/segmentation of Indian regional languages with singular value decomposition based feature embedding. Appl. Acoust. 176, 1–15 (2021)


  4. Y. Caiming, Q. Tian, F. Cheng, S. Zhang, Speech emotion recognition using support vector machines, in Proceedings of the International Conference on Computer Science and Information Engineering, pp. 215–220 (2011).

  5. S. Chattopadhyay, A. Dey, P.K. Singh, A. Ahmadian, R. Sarkar, A feature selection model for speech emotion recognition using clustering-based population generation with hybrid of equilibrium optimizer and atom search optimization algorithm. Multimed. Tools Appl. 24, 1–34 (2022)


  6. A. Dey, S. Chattopadhyay, P. Singh, A hybrid meta-heuristic feature selection method using golden ratio and equilibrium optimization algorithms for speech emotion recognition. IEEE Access, 8 (2020).

  7. M. Hasan, M. Hossain, Effect of vocal tract dynamics on neural network-based speech recognition: A Bengali language-based study. Expert. Syst. 39, 1–22 (2022)


  8. S. Jayachitra, A. Prasanth, Multi-feature analysis for automated brain stroke classification using weighted Gaussian Naïve Bayes classifier. J. Circuits Syst. Comp. 30, 1–26 (2021)


  9. S. Kalli, T. Suresh, An effective motion object detection using adaptive background modeling mechanism in video surveillance system. J. Intell. Fuzzy Syst. 41, 777–1789 (2021)


  10. B. Kaur, S. Rathi, R.K. Agrawal, Enhanced depression detection from speech using Quantum Whale Optimization Algorithm for feature selection. Comput. Biol. Med. 150, 1–15 (2022)


  11. K. Kaur, P. Singh, Impact of feature extraction and feature selection algorithms on Punjabi speech emotion recognition using convolutional neural network. Trans. Asian Low-Resour. Lang. Inform. Process. 21, 1–23 (2022)


  12. A. Koduru, H.B. Valiveti, A.K. Budati, Feature extraction algorithms to improve the speech emotion recognition rate. Int. J. Speech Technol. 23, 45–55 (2020)


  13. S. Langari, H. Marvi, M. Zahedi, Efficient speech emotion recognition using modified feature extraction. Inform. Med. Unlocked 20 (2020).

  14. S. Lavanya, A. Prasanth, S. Jayachitra, A Tuned classification approach for efficient heterogeneous fault diagnosis in IoT-enabled WSN applications. Measurement 183, 1–16 (2021)


  15. K.R. Lekshmi, E. Sherly, An acoustic model and linguistic analysis for Malayalam disyllabic words: a low resource language. Int. J. Speech Technol. 24, 483–495 (2021)


  16. W. Lim, J. Daeyoung, L. Taejin, Speech emotion recognition using convolutional and recurrent neural networks, in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), IEEE, pp. 1–4 (2016).

  17. K. Manohar, E. Logashanmugam, Hybrid deep learning with optimal feature selection for speech emotion recognition using improved meta-heuristic algorithm. Knowl.-Based Syst. 246, 1–25 (2022)


  18. K. Mrinalini, P. Vijayalakshmi, T. Nagarajan, Feature-weighted AdaBoost classifier for punctuation prediction in Tamil and Hindi NLP systems. Expert. Syst. 39, 1–17 (2022)


  19. https://superkogito.github.io/SER-datasets/

  20. A. Prasanth, Certain investigations on energy-efficient fault detection and recovery management in underwater wireless sensor networks. J. Circuits Syst. Comput. 30, 1–20 (2021)


  21. S. Radhika, A. Prasanth, A survey of human emotion recognition using speech signals: current trends and future perspectives. Micro-Electronics and Telecommunication Engineering: Proceedings of 6th ICMETE 2022, Singapore (2023), pp. 509–518.

  22. J. Rong, G. Li, Y.P.P. Chen, Acoustic feature selection for automatic emotion recognition from speech. Inform. Process. Manage. 45, 315–328 (2009).

  23. R. Sathya, S. Ananthi, K. Vaidehi, A hybrid location-dependent ultra convolutional neural network-based vehicle number plate recognition approach for intelligent transportation systems. Concurr. Comput. Pract. Experience 35(8), 1–25 (2023)


  24. N. Sebe, M.S. Lew, I. Cohen, A. Garg, T.S. Huang, Object recognition supported by user interaction for service robots—emotion recognition using a Cauchy Naive Bayes classifier, in IEEE Computer Society 16th International Conference on Pattern Recognition, Quebec City, Quebec, Canada, vol. 1, pp. 17–20 (2002).

  25. J. Sekar, P. Aruchamy, An efficient clinical support system for heart disease prediction using TANFIS classifier. Comput. Intell. 38, 610–640 (2022)


  26. L. Sun, Q. Li, S. Fu, P. Li, Speech emotion recognition based on genetic algorithm–decision tree fusion of deep and acoustic features. ETRI J. 7, 15–29 (2022)


  27. T. Jha, R. Kavya, J. Christopher, V. Arunachalam, Machine learning techniques for speech emotion recognition using paralinguistic acoustic features. Int. J. Speech Technol. 25, 707–725 (2022).

  28. W. Wang, P.A. Watters, X. Cao, L. Shen, B. Li, Significance of phonological features in speech emotion recognition. Int. J. Speech Technol. 23, 633–642 (2020)


  29. K. Wang, G. Su, L. Liu, S. Wang, Wavelet packet analysis for speaker-independent emotion recognition. Neurocomputing 398, 257–264 (2020)


  30. P. Xiao, K. Ma, L. Gu, Inter-subject prediction of pediatric emergence delirium using feature selection and classification from spontaneous EEG signals. Biomed. Signal Process. Control 80, 1–19 (2023)


  31. X. Xu, J. Deng, Z. Zhang, Rethinking auditory affective descriptors through zero-shot emotion recognition in speech. IEEE Trans. Comput. Soc. Syst. 9, 1530–1541 (2022)


  32. P. Yadav, G. Aggarwal, Speech emotion classification using machine learning. Int. J. Comp. Appl. 118(13), (2015).

  33. S. Zhang, Z. Xiaoming, C. Yuelong, G. Wenping, C. Ying, Feature learning via deep belief network for Chinese speech emotion recognition, in Pattern Recognition: 7th Chinese Conference, Chengdu, China, November 5–7, 2016, Proceedings, Part II, vol. 7, pp. 645–651 (2016).

  34. Z. Zhang, Speech feature selection and emotion recognition based on weighted binary cuckoo search. Alex. Eng. J. 60, 1499–1507 (2021)


  35. Y. Zhou, X. Liang, Y. Gu, Multi-classifier interactive learning for ambiguous speech emotion recognition. IEEE/ACM Trans. Audio Speech Lang Process. 30, 695–705 (2022)



Funding

The authors did not receive support from any organization for the submitted work.

Author information


Contributions

RS was involved in conceptualization, visualization, validation, and methodology. PA was involved in software, investigation, and writing of the original draft.

Corresponding author

Correspondence to Radhika Subramanian.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Ethical Approval

This material is the authors' own original work, which has not been previously published elsewhere and is not currently being considered for publication elsewhere. The paper presents the authors' own research work and analysis in a truthful and complete manner.

Consent for Publication

This manuscript is approved by all authors for publication.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Subramanian, R., Aruchamy, P. An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm. Circuits Syst Signal Process 43, 2477–2506 (2024). https://doi.org/10.1007/s00034-023-02571-4

