Abstract
At present, several communication tools are employed to express human emotions. Among the numerous modes of communication, speech is the most predominant for communicating with people effectively and efficiently. Speech emotion recognition (SER) plays a significant role in several signal-processing applications. However, selecting appropriate features and reliable classifiers remains a challenge in identifying the emotions expressed in Indian regional languages. In this work, a novel SER framework is proposed to classify different speech emotions. First, the framework applies a preprocessing phase to alleviate the background noise and artifacts present in the input speech signal. Two new speech attributes related to energy and phase are then integrated with state-of-the-art attributes to examine speech emotion characteristics. A threshold-based feature selection (TFS) algorithm is introduced to determine the optimal features using a statistical approach. A Tamil emotional speech dataset, representing an Indian regional language, has been created to evaluate the proposed framework with standard machine learning and deep learning classifiers. The proposed TFS technique is well suited to Indian regional languages, exhibiting superior performance with 97.96% accuracy on the Tamil dataset compared to the Indian English and Malayalam datasets.
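The abstract does not specify which statistic the TFS algorithm thresholds, so the following is only a minimal illustrative sketch of the general threshold-based selection idea, assuming a normalized-variance score; the function name, the score definition, and the threshold value are all hypothetical, not the authors' method.

```python
import numpy as np

def threshold_feature_selection(X, threshold=0.5):
    """Keep features whose variance, scaled by the largest feature
    variance, meets or exceeds the threshold.

    X: (n_samples, n_features) matrix of extracted speech features.
    Returns the reduced feature matrix and the boolean selection mask.
    """
    variances = X.var(axis=0)
    scores = variances / variances.max()  # normalize scores into [0, 1]
    mask = scores >= threshold
    return X[:, mask], mask

# Example: a constant (zero-variance) feature column is discarded,
# and only the column dominating the variance survives a 0.5 threshold.
X = np.array([[1.0, 0.0, 10.0],
              [2.0, 0.0, 20.0],
              [3.0, 0.0, 30.0]])
X_sel, mask = threshold_feature_selection(X, threshold=0.5)
```

In practice the score could equally be an F-statistic or mutual-information estimate per feature; the thresholding step stays the same.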
Data availability
The dataset used for extracting emotional features is retrieved from the English dataset [19].
References
L. Abdel-Hamid, Egyptian Arabic speech emotion recognition using prosodic, spectral and wavelet features. Speech Commun. 122, 9–30 (2020)
G. Agarwal, H. Om, Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition. Multimed. Tools Appl. 80, 9961–9992 (2021)
A. Bhowmick, A. Biswas, Identification/segmentation of Indian regional languages with singular value decomposition based feature embedding. Appl. Acoust. 176, 1–15 (2021)
Y. Caiming, Q. Tian, F. Cheng, S. Zhang, Speech emotion recognition using support vector machines, in Proceedings of the International Conference on Computer Science and Information Engineering (2011), pp. 215–220
S. Chattopadhyay, A. Dey, P.K. Singh, A. Ahmadian, R. Sarkar, A feature selection model for speech emotion recognition using clustering-based population generation with hybrid of equilibrium optimizer and atom search optimization algorithm. Multimed. Tools Appl. 24, 1–34 (2022)
A. Dey, S. Chattopadhyay, P. Singh, A hybrid meta-heuristic feature selection method using golden ratio and equilibrium optimization algorithms for speech emotion recognition. IEEE Access 8 (2020)
M. Hasan, M. Hossain, Effect of vocal tract dynamics on neural network-based speech recognition: A Bengali language-based study. Expert. Syst. 39, 1–22 (2022)
S. Jayachitra, A. Prasanth, Multi-feature analysis for automated brain stroke classification using weighted Gaussian Naïve Bayes classifier. J. Circuits Syst. Comp. 30, 1–26 (2021)
S. Kalli, T. Suresh, An effective motion object detection using adaptive background modeling mechanism in video surveillance system. J. Intell. Fuzzy Syst. 41, 777–1789 (2021)
B. Kaur, S. Rathi, R.K. Agrawal, Enhanced depression detection from speech using Quantum Whale Optimization Algorithm for feature selection. Comput. Biol. Med. 150, 1–15 (2022)
K. Kaur, P. Singh, Impact of feature extraction and feature selection algorithms on Punjabi speech emotion recognition using convolutional neural network. Trans. Asian Low-Resour. Lang. Inform. Process. 21, 1–23 (2022)
A. Koduru, H.B. Valiveti, A.K. Budati, Feature extraction algorithms to improve the speech emotion recognition rate. Int. J. Speech Technol. 23, 45–55 (2020)
S. Langari, H. Marvi, M. Zahedi, Efficient speech emotion recognition using modified feature extraction. Inform. Med. Unlocked 20 (2020)
S. Lavanya, A. Prasanth, S. Jayachitra, A Tuned classification approach for efficient heterogeneous fault diagnosis in IoT-enabled WSN applications. Measurement 183, 1–16 (2021)
K.R. Lekshmi, E. Sherly, An acoustic model and linguistic analysis for Malayalam disyllabic words: a low resource language. Int. J. Speech Technol. 24, 483–495 (2021)
W. Lim, J. Daeyoung, L. Taejin, Speech emotion recognition using convolutional and recurrent neural networks, in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA) (IEEE, 2016), pp. 1–4
K. Manohar, E. Logashanmugam, Hybrid deep learning with optimal feature selection for speech emotion recognition using improved meta-heuristic algorithm. Knowl.-Based Syst. 246, 1–25 (2022)
K. Mrinalini, P. Vijayalakshmi, T. Nagarajan, Feature-weighted AdaBoost classifier for punctuation prediction in Tamil and Hindi NLP systems. Expert. Syst. 39, 1–17 (2022)
A. Prasanth, Certain investigations on energy-efficient fault detection and recovery management in underwater wireless sensor networks. J. Circuits Syst. Comput. 30, 1–20 (2021)
S. Radhika, A. Prasanth, A survey of human emotion recognition using speech signals: current trends and future perspectives. Micro-Electronics and Telecommunication Engineering: Proceedings of 6th ICMETE 2022, Singapore (2023), pp. 509–518.
J. Rong, G. Li, Y.P.P. Chen, Acoustic feature selection for automatic emotion recognition from speech. Inform. Process. Manage. 45, 315–328 (2009)
R. Sathya, S. Ananthi, K. Vaidehi, A hybrid location-dependent ultra convolutional neural network-based vehicle number plate recognition approach for intelligent transportation systems. Concurr. Comput. Pract. Experience 35(8), 1–25 (2023)
N. Sebe, M.S. Lew, I. Cohen, A. Garg, T.S. Huang, Object recognition supported by user interaction for service robots—emotion recognition using a Cauchy Naive Bayes classifier, in Proceedings of the 16th International Conference on Pattern Recognition, Quebec City, Canada, vol. 1 (IEEE, 2002), pp. 17–20
J. Sekar, P. Aruchamy, An efficient clinical support system for heart disease prediction using TANFIS classifier. Comput. Intell. 38, 610–640 (2022)
L. Sun, Q. Li, S. Fu, P. Li, Speech emotion recognition based on genetic algorithm–decision tree fusion of deep and acoustic features. ETRI J. 7, 15–29 (2022)
T. Jha, R. Kavya, J. Christopher, V. Arunachalam, Machine learning techniques for speech emotion recognition using paralinguistic acoustic features. Int. J. Speech Technol. 25, 707–725 (2022).
W. Wang, P.A. Watters, X. Cao, L. Shen, B. Li, Significance of phonological features in speech emotion recognition. Int. J. Speech Technol. 23, 633–642 (2020)
K. Wang, G. Su, L. Liu, S. Wang, Wavelet packet analysis for speaker-independent emotion recognition. Neurocomputing 398, 257–264 (2020)
P. Xiao, K. Ma, L. Gu, Inter-subject prediction of pediatric emergence delirium using feature selection and classification from spontaneous EEG signals. Biomed. Signal Process. Control 80, 1–19 (2023)
X. Xu, J. Deng, Z. Zhang, Rethinking auditory affective descriptors through zero-shot emotion recognition in speech. IEEE Trans. Comput. Soc. Syst. 9, 1530–1541 (2022)
P. Yadav, G. Aggarwal, Speech emotion classification using machine learning. Int. J. Comput. Appl. 118(13) (2015)
S. Zhang, Z. Xiaoming, C. Yuelong, G. Wenping, C. Ying, Feature learning via deep belief network for Chinese speech emotion recognition, in Pattern Recognition: 7th Chinese Conference, Chengdu, China, November 5–7, 2016, Proceedings, Part II, vol. 7 (2016), pp. 645–651
Z. Zhang, Speech feature selection and emotion recognition based on weighted binary cuckoo search. Alex. Eng. J. 60, 1499–1507 (2021)
Y. Zhou, X. Liang, Y. Gu, Multi-classifier interactive learning for ambiguous speech emotion recognition. IEEE/ACM Trans. Audio Speech Lang Process. 30, 695–705 (2022)
Funding
The authors did not receive support from any organization for the submitted work.
Author information
Contributions
RS contributed to conceptualization, visualization, validation and methodology. PA contributed to software, investigation and writing (original draft preparation).
Ethics declarations
Conflict of interest
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Ethical Approval
This material is the authors’ own original work, which has not been previously published elsewhere. This paper is not currently being considered for publication elsewhere. The paper presents the authors’ own research and analysis in a truthful and complete manner.
Consent for Publication
This manuscript is approved by all authors for publication.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Subramanian, R., Aruchamy, P. An Effective Speech Emotion Recognition Model for Multi-Regional Languages Using Threshold-based Feature Selection Algorithm. Circuits Syst Signal Process 43, 2477–2506 (2024). https://doi.org/10.1007/s00034-023-02571-4