Abstract
At present, several communication tools are employed to express human emotions. Among the numerous modes of communication, speech is the most predominant for communicating with people effectively and efficiently. Speech emotion recognition (SER) plays a significant role in several signal-processing applications. However, selecting appropriate features and reliable classifiers remains a challenge in identifying the emotions expressed in Indian regional languages. In this work, a novel SER framework is proposed to classify different speech emotions. First, the framework applies a preprocessing phase to alleviate the background noise and artifacts present in the input speech signal. Two new speech attributes related to energy and phase are then integrated with state-of-the-art attributes to examine speech emotion characteristics. A threshold-based feature selection (TFS) algorithm is introduced to determine the optimal features using a statistical approach. A Tamil emotional speech dataset, representing an Indian regional language, has been created to evaluate the proposed framework with standard machine learning and deep learning classifiers. The proposed TFS technique is well suited to Indian regional languages, exhibiting superior performance with 97.96% accuracy on the Tamil dataset compared to the Indian English and Malayalam datasets.
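The abstract does not specify which statistic the TFS algorithm thresholds, so the following is only a minimal illustrative sketch of the general threshold-based selection idea, assuming a normalized-variance score; the function name, the score definition, and the threshold value are all hypothetical, not the authors' method.

```python
import numpy as np

def threshold_feature_selection(X, threshold=0.5):
    """Keep features whose variance, scaled by the largest feature
    variance, meets or exceeds the threshold.

    X: (n_samples, n_features) matrix of extracted speech features.
    Returns the reduced feature matrix and the boolean selection mask.
    """
    variances = X.var(axis=0)
    scores = variances / variances.max()  # normalize scores into [0, 1]
    mask = scores >= threshold
    return X[:, mask], mask

# Example: a constant (zero-variance) feature column is discarded,
# and only the column dominating the variance survives a 0.5 threshold.
X = np.array([[1.0, 0.0, 10.0],
              [2.0, 0.0, 20.0],
              [3.0, 0.0, 30.0]])
X_sel, mask = threshold_feature_selection(X, threshold=0.5)
```

In practice the score could equally be an F-statistic or mutual-information estimate per feature; the thresholding step stays the same.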
Data availability
The dataset used for extracting emotional features is retrieved from the English dataset [19].
References
L. Abdel-Hamid, Egyptian Arabic speech emotion recognition using prosodic, spectral and wavelet features. Speech Commun. 122, 9–30 (2020)
G. Agarwal, H. Om, Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition. Multimed. Tools Appl. 80, 9961–9992 (2021)
A. Bhowmick, A. Biswas, Identification/segmentation of Indian regional languages with singular value decomposition based feature embedding. Appl. Acoust. 176, 1–15 (2021)
Y. Caiming, Q. Tian, F. Cheng, S. Zhang, Speech emotion recognition using support vector machines, in Proceedings of the International Conference on Computer Science and Information Engineering (2011), pp. 215–220
S. Chattopadhyay, A. Dey, P.K. Singh, A. Ahmadian, R. Sarkar, A feature selection model for speech emotion recognition using clustering-based population generation with hybrid of equilibrium optimizer and atom search optimization algorithm. Multimed. Tools Appl. 24, 1–34 (2022)
A. Dey, S. Chattopadhyay, P. Singh, A hybrid meta-heuristic feature selection method using golden ratio and equilibrium optimization algorithms for speech emotion recognition. IEEE Access 8 (2020)
M. Hasan, M. Hossain, Effect of vocal tract dynamics on neural network-based speech recognition: A Bengali language-based study. Expert. Syst. 39, 1–22 (2022)
S. Jayachitra, A. Prasanth, Multi-feature analysis for automated brain stroke classification using weighted Gaussian Naïve Bayes classifier. J. Circuits Syst. Comp. 30, 1–26 (2021)
S. Kalli, T. Suresh, An effective motion object detection using adaptive background modeling mechanism in video surveillance system. J. Intell. Fuzzy Syst. 41, 777–1789 (2021)
B. Kaur, S. Rathi, R.K. Agrawal, Enhanced depression detection from speech using Quantum Whale Optimization Algorithm for feature selection. Comput. Biol. Med. 150, 1–15 (2022)
K. Kaur, P. Singh, Impact of feature extraction and feature selection algorithms on Punjabi speech emotion recognition using convolutional neural network. Trans. Asian Low-Resour. Lang. Inform. Process. 21, 1–23 (2022)
A. Koduru, H.B. Valiveti, A.K. Budati, Feature extraction algorithms to improve the speech emotion recognition rate. Int. J. Speech Technol. 23, 45–55 (2020)
S. Langari, H. Marvi, M. Zahedi, Efficient speech emotion recognition using modified feature extraction. Inform. Med. Unlocked 20 (2020)
S. Lavanya, A. Prasanth, S. Jayachitra, A Tuned classification approach for efficient heterogeneous fault diagnosis in IoT-enabled WSN applications. Measurement 183, 1–16 (2021)
K.R. Lekshmi, E. Sherly, An acoustic model and linguistic analysis for Malayalam disyllabic words: a low resource language. Int. J. Speech Technol. 24, 483–495 (2021)
W. Lim, J. Daeyoung, L. Taejin, Speech emotion recognition using convolutional and recurrent neural networks, in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) (IEEE, 2016), pp. 1–4
K. Manohar, E. Logashanmugam, Hybrid deep learning with optimal feature selection for speech emotion recognition using improved meta-heuristic algorithm. Knowl.-Based Syst. 246, 1–25 (2022)
K. Mrinalini, P. Vijayalakshmi, T. Nagarajan, Feature-weighted AdaBoost classifier for punctuation prediction in Tamil and Hindi NLP systems. Expert. Syst. 39, 1–17 (2022)
A. Prasanth, Certain investigations on energy-efficient fault detection and recovery management in underwater wireless sensor networks. J. Circuits Syst. Comput. 30, 1–20 (2021)
S. Radhika, A. Prasanth, A survey of human emotion recognition using speech signals: current trends and future perspectives. Micro-Electronics and Telecommunication Engineering: Proceedings of 6th ICMETE 2022, Singapore (2023), pp. 509–518.
J. Rong, G. Li, Y.P.P. Chen, Acoustic feature selection for automatic emotion recognition from speech. Inform. Process. Manage. 45, 315–328 (2009)
R. Sathya, S. Ananthi, K. Vaidehi, A hybrid location-dependent ultra convolutional neural network-based vehicle number plate recognition approach for intelligent transportation systems. Concurr. Comput. Pract. Experience 35(8), 1–25 (2023)
N. Sebe, M.S. Lew, I. Cohen, A. Garg, T.S. Huang, Object recognition supported by user interaction for service robots—emotion recognition using a Cauchy Naive Bayes classifier, in Proceedings of the 16th International Conference on Pattern Recognition, Quebec City, Canada, vol. 1 (IEEE, 2002), pp. 17–20
J. Sekar, P. Aruchamy, An efficient clinical support system for heart disease prediction using TANFIS classifier. Comput. Intell. 38, 610–640 (2022)
L. Sun, Q. Li, S. Fu, P. Li, Speech emotion recognition based on genetic algorithm–decision tree fusion of deep and acoustic features. ETRI J. 7, 15–29 (2022)
T. Jha, R. Kavya, J. Christopher, V. Arunachalam, Machine learning techniques for speech emotion recognition using paralinguistic acoustic features. Int. J. Speech Technol. 25, 707–725 (2022).
W. Wang, P.A. Watters, X. Cao, L. Shen, B. Li, Significance of phonological features in speech emotion recognition. Int. J. Speech Technol. 23, 633–642 (2020)
K. Wang, G. Su, L. Liu, S. Wang, Wavelet packet analysis for speaker-independent emotion recognition. Neurocomputing 398, 257–264 (2020)
P. Xiao, K. Ma, L. Gu, Inter-subject prediction of pediatric emergence delirium using feature selection and classification from spontaneous EEG signals. Biomed. Signal Process. Control 80, 1–19 (2023)
X. Xu, J. Deng, Z. Zhang, Rethinking auditory affective descriptors through zero-shot emotion recognition in speech. IEEE Trans. Comput. Soc. Syst. 9, 1530–1541 (2022)
P. Yadav, G. Aggarwal, Speech emotion classification using machine learning. Int. J. Comput. Appl. 118(13) (2015)
S. Zhang, Z. Xiaoming, C. Yuelong, G. Wenping, C. Ying, Feature learning via deep belief network for Chinese speech emotion recognition, in Pattern Recognition: 7th Chinese Conference, Chengdu, China, November 5–7, 2016, Proceedings, Part II, vol. 7 (2016), pp. 645–651
Z. Zhang, Speech feature selection and emotion recognition based on weighted binary cuckoo search. Alex. Eng. J. 60, 1499–1507 (2021)
Y. Zhou, X. Liang, Y. Gu, Multi-classifier interactive learning for ambiguous speech emotion recognition. IEEE/ACM Trans. Audio Speech Lang Process. 30, 695–705 (2022)
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Contributions
RS contributed to conceptualization, visualization, validation and methodology. PA contributed to software, investigation and writing (original draft preparation).
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Ethical Approval
This material is the authors’ own original work, which has not been previously published elsewhere. This paper is not currently being considered for publication elsewhere. The paper presents the authors’ own research and analysis in a truthful and complete manner.
Consent for Publication
This manuscript is approved by all authors for publication.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Subramanian, R., Aruchamy, P. An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm. Circuits Syst Signal Process 43, 2477–2506 (2024). https://doi.org/10.1007/s00034-023-02571-4