DOI: 10.1007/978-3-031-20980-2_27

Continuous Wavelet Transform for Severity-Level Classification of Dysarthria

Published: 14 November 2022


Dysarthria is a neuro-motor speech defect that makes speech unintelligible and is largely unnoticeable to humans at various severity-levels. Dysarthric speech classification serves as a diagnostic tool for tracking how the severity of a patient’s condition progresses, and it also aids automatic dysarthric speech recognition systems (an important assistive speech technology). This study investigates how well Generalized Morse Wavelet (GMW)-based scalogram features capture the discriminative acoustic cues in low-frequency regions for dysarthric severity-level classification, using a Convolutional Neural Network (CNN). The performance of scalogram-based features is compared with Short-Time Fourier Transform (STFT)-based features and Mel spectrogram-based features. Compared to the STFT-based baseline features, which reach a classification accuracy of 91.76%, the proposed Continuous Wavelet Transform (CWT)-based scalogram features achieve a significantly improved classification accuracy of 95.17% on the standard and statistically meaningful UA-Speech corpus. The improved results indicate that information in the low-frequency regions is more discriminative for dysarthric severity-level classification, since the proposed CWT-based time-frequency representation (scalogram) has high frequency resolution at lower frequencies. STFT-based representations, on the other hand, have constant resolution across all frequency bands and are therefore not as well suited to dysarthric severity-level classification as the proposed Morse wavelet-based CWT features. In addition, we perform experiments on the Mel spectrogram to demonstrate that, even though it also has high frequency resolution at lower frequencies and reaches a classification accuracy of 92.65%, the proposed system is better suited. The proposed system improves classification accuracy by 3.41 and 2.52 percentage points over the STFT and Mel spectrogram features, respectively. Finally, the performance of the STFT, Mel spectrogram, and scalogram features is analyzed using F1-Score, Matthews Correlation Coefficient (MCC), Jaccard Index, Hamming Loss, and Linear Discriminant Analysis (LDA) scatter plots.
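The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes single-label, multi-class predictions and macro averaging over classes; the abstract does not state which averaging the authors used, and the toy labels below (four hypothetical severity classes) are illustrative only, not data from the paper. The function names are likewise hypothetical. The MCC uses the standard multi-class generalization computed from per-class true and predicted counts.

```python
from collections import Counter
from math import sqrt

# Toy labels for four hypothetical severity classes (0..3).
# These are illustrative only and are NOT results from the paper.
Y_TRUE = [0, 0, 1, 1, 2, 2, 3, 3]
Y_PRED = [0, 0, 1, 2, 2, 2, 3, 0]


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for single-label multi-class data."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for k in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == k and p == k)
        fp = sum(1 for t, p in pairs if t != k and p == k)
        fn = sum(1 for t, p in pairs if t == k and p != k)
        counts[k] = (tp, fp, fn)
    return counts


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def macro_jaccard(y_true, y_pred):
    """Unweighted mean of per-class Jaccard = TP / (TP + FP + FN)."""
    scores = []
    for tp, fp, fn in per_class_counts(y_true, y_pred).values():
        denom = tp + fp + fn
        scores.append(tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def hamming_loss(y_true, y_pred):
    """Fraction of misclassified samples (1 - accuracy for single-label data)."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def mcc(y_true, y_pred):
    """Multi-class Matthews Correlation Coefficient from class counts."""
    s = len(y_true)                                    # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct samples
    t_k = Counter(y_true)                              # true count per class
    p_k = Counter(y_pred)                              # predicted count per class
    num = c * s - sum(t_k[k] * p_k[k] for k in set(t_k) | set(p_k))
    den = sqrt((s * s - sum(v * v for v in p_k.values()))
               * (s * s - sum(v * v for v in t_k.values())))
    return num / den if den else 0.0


if __name__ == "__main__":
    print("macro F1     :", round(macro_f1(Y_TRUE, Y_PRED), 4))
    print("MCC          :", round(mcc(Y_TRUE, Y_PRED), 4))
    print("macro Jaccard:", round(macro_jaccard(Y_TRUE, Y_PRED), 4))
    print("Hamming loss :", round(hamming_loss(Y_TRUE, Y_PRED), 4))
```

The accuracy gains quoted in the abstract are plain differences between the reported accuracies: 95.17 - 91.76 = 3.41 and 95.17 - 92.65 = 2.52.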



Published In

Speech and Computer: 24th International Conference, SPECOM 2022, Gurugram, India, November 14–16, 2022, Proceedings
Nov 2022
736 pages
Berlin, Heidelberg

Author Tags

1. Wavelet transform
2. Dysarthria
3. UA-Speech corpus
4. Morse wavelet
5. CNN

