Continuous Wavelet Transform for Severity-Level Classification of Dysarthria

Published: 14 November 2022

Abstract

Dysarthria is a neuro-motor speech disorder that renders speech unintelligible and is often imperceptible to human listeners at various severity levels. Dysarthric speech classification is used as a diagnostic method to assess the progression of a patient's condition, as well as to aid automatic dysarthric speech recognition systems (an important assistive speech technology). This study investigates the significance of Generalized Morse Wavelet (GMW)-based scalogram features for capturing the discriminative acoustic cues of dysarthric severity-level classification in the low-frequency regions, using a Convolutional Neural Network (CNN). The performance of scalogram-based features is compared with Short-Time Fourier Transform (STFT)-based and Mel spectrogram-based features. Compared to the STFT-based baseline features, which give a classification accuracy of 91.76%, the proposed Continuous Wavelet Transform (CWT)-based scalogram features achieve a significantly improved classification accuracy of 95.17% on the standard and statistically meaningful UA-Speech corpus. These improved results indicate that the information in the low-frequency regions is more discriminative for dysarthric severity-level classification, because the proposed CWT-based time-frequency representation (scalogram) has high frequency resolution at low frequencies. STFT-based representations, by contrast, have constant resolution across all frequency bands and are therefore not as well suited for dysarthric severity-level classification as the proposed Morse wavelet-based CWT features. We also perform experiments on the Mel spectrogram to show that, even though it likewise has high frequency resolution at low frequencies, it reaches a classification accuracy of only 92.65%, so the proposed system remains better suited. The proposed system improves classification accuracy by 3.41% and 2.52% over the STFT and the Mel spectrogram, respectively. To that effect, the performance of the STFT, Mel spectrogram, and scalogram features is analyzed using F1-Score, Matthews Correlation Coefficient (MCC), Jaccard Index, Hamming Loss, and Linear Discriminant Analysis (LDA) scatter plots.
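To make the feature-extraction idea concrete, the following is a minimal NumPy sketch of a GMW-based scalogram computed in the frequency domain, where the wavelet family is Ψ_{β,γ}(ω) ∝ ω^β e^{−ω^γ} for ω > 0. The parameter values β = 20, γ = 3, the peak normalization, and the log-spaced 50 Hz–4 kHz analysis grid are illustrative assumptions, not the paper's settings; in practice a library implementation (e.g. MATLAB's `cwt`, which uses Morse wavelets by default) would likely be used, and this sketch only makes the construction explicit.

```python
import numpy as np

def gmw_scalogram(x, fs, freqs, beta=20.0, gamma=3.0):
    """|CWT| of x using frequency-domain Generalized Morse Wavelets.

    x     : 1-D signal
    fs    : sampling rate in Hz
    freqs : analysis frequencies in Hz (dense at the low end for speech)
    """
    n = len(x)
    X = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n)        # angular frequency, rad/sample
    peak = (beta / gamma) ** (1.0 / gamma)         # wavelet peak frequency omega_p
    scalo = np.empty((len(freqs), n))
    for i, f in enumerate(freqs):
        s = peak * fs / (2.0 * np.pi * f)          # scale placing the peak at f Hz
        w = s * omega
        psi = np.zeros(n)
        pos = w > 0                                # analytic wavelet: positive freqs only
        # peak-normalised GMW: 2 * (w/peak)^beta * exp(peak^gamma - w^gamma)
        psi[pos] = 2.0 * np.exp(beta * np.log(w[pos] / peak)
                                + peak ** gamma - w[pos] ** gamma)
        scalo[i] = np.abs(np.fft.ifft(X * psi))    # one CWT row per frequency
    return scalo

# Illustrative usage: a log-spaced grid emphasises low frequencies; the
# log-compressed scalogram can then be fed to a CNN as a 1-channel image.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 220 * t)                    # stand-in for a speech signal
freqs = np.logspace(np.log10(50), np.log10(4000), 128)
S = np.log1p(gmw_scalogram(x, fs, freqs))
print(S.shape)                                     # (128, 16000)
```

The log-spaced grid is what gives the scalogram its high frequency resolution at low frequencies, which the abstract identifies as the property that the constant-resolution STFT lacks.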

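The evaluation metrics listed at the end of the abstract all have direct scikit-learn counterparts. The sketch below is a minimal mapping, assuming integer severity labels; the macro averaging mode and the helper names `severity_metrics` / `lda_projection` are assumptions for illustration, not the authors' code.

```python
import numpy as np
from sklearn.metrics import (f1_score, matthews_corrcoef,
                             jaccard_score, hamming_loss)
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def severity_metrics(y_true, y_pred):
    """Scalar scores for a multi-class severity-level classifier."""
    return {
        "F1 (macro)": f1_score(y_true, y_pred, average="macro"),
        "MCC": matthews_corrcoef(y_true, y_pred),
        "Jaccard (macro)": jaccard_score(y_true, y_pred, average="macro"),
        "Hamming loss": hamming_loss(y_true, y_pred),
    }

def lda_projection(features, labels, n_components=2):
    """Project features (e.g. flattened scalograms) onto at most
    n_classes - 1 discriminant axes for LDA scatter plots."""
    lda = LinearDiscriminantAnalysis(n_components=n_components)
    return lda.fit_transform(features, labels)

# Toy usage with four severity levels:
y_true = np.array([0, 1, 2, 3, 2, 1, 0, 3])
y_pred = np.array([0, 1, 2, 2, 2, 1, 0, 3])
print(severity_metrics(y_true, y_pred))
```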

Published In

Speech and Computer: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings
Springer-Verlag, Berlin, Heidelberg, November 2022, 736 pages
ISBN: 978-3-031-20979-6
DOI: 10.1007/978-3-031-20980-2 (chapter DOI: 10.1007/978-3-031-20980-2_27)

Author Tags

1. Wavelet transform
2. Dysarthria
3. UA-Speech corpus
4. Morse wavelet
5. CNN
