research-article

A deep transfer learning approach for improved post-traumatic stress disorder diagnosis

Published: 01 September 2019

Abstract

Post-traumatic stress disorder (PTSD) is a traumatic-stressor-related disorder that develops after exposure to a traumatic or adverse environmental event that caused serious harm or injury. The structured interview is the only widely accepted clinical practice for PTSD diagnosis, but it suffers from several limitations, including the stigma associated with the disease. Diagnosing PTSD by analyzing speech signals has been investigated as an alternative in recent years: speech signals are processed to extract frequency features, and these features are then fed into a classification model. In this paper, we developed a deep belief network (DBN) model combined with a transfer learning (TL) strategy for PTSD diagnosis. We computed three categories of speech features and used the DBN model to fuse them. The TL strategy transfers knowledge learned from a large speech recognition database, TIMIT, to PTSD detection, where patient data are difficult to collect. We evaluated the proposed methods on two PTSD speech databases, each consisting of audio recordings from 26 patients. In a comparison with other popular methods, the state-of-the-art support vector machine (SVM) classifier achieved an accuracy of only 57.68%, whereas the TL strategy boosted the performance of the DBN from 61.53% to 74.99%. Altogether, our method provides a pragmatic and promising tool for PTSD diagnosis. Preliminary results of this study were presented in Banerjee et al. [1] (2017 IEEE International Conference on Data Mining, ICDM).
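
The transfer step described in the abstract can be illustrated with a minimal sketch. The code below is not the authors' DBN or their TIMIT pipeline. It assumes a tiny feed-forward network trained by plain stochastic gradient descent, synthetic four-dimensional data standing in for speech features, and hypothetical names (`TinyNet`, `make_data`, `transfer_hidden_layer`). It shows only the general idea: pretrain on a large source set, copy the hidden layer, and fine-tune on a small target set.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """One sigmoid hidden layer plus one sigmoid output unit (illustrative only)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def hidden(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W1, self.b1)]

    def predict(self, x):
        h = self.hidden(x)
        return sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)

    def loss(self, data):
        """Mean binary cross-entropy over (x, y) pairs."""
        eps = 1e-12
        total = 0.0
        for x, y in data:
            p = self.predict(x)
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
        return total / len(data)

    def fit(self, data, epochs, lr):
        """Per-sample gradient descent on the cross-entropy loss."""
        for _ in range(epochs):
            for x, y in data:
                h = self.hidden(x)
                p = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
                d_out = p - y  # gradient w.r.t. the output pre-activation
                for j, hj in enumerate(h):
                    # Backpropagate through the old output weight before updating it.
                    d_hid = d_out * self.w2[j] * hj * (1.0 - hj)
                    self.w2[j] -= lr * d_out * hj
                    for i, xi in enumerate(x):
                        self.W1[j][i] -= lr * d_hid * xi
                    self.b1[j] -= lr * d_hid
                self.b2 -= lr * d_out


def make_data(n, rng, weight):
    """Synthetic samples: the label is 1 when x[0] + weight * x[1] > 0."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        data.append((x, 1 if x[0] + weight * x[1] > 0 else 0))
    return data


def transfer_hidden_layer(source, rng):
    """New net whose hidden layer is a copy of the source's; output layer is fresh."""
    target = TinyNet(len(source.W1[0]), len(source.W1), rng)
    target.W1 = [row[:] for row in source.W1]
    target.b1 = source.b1[:]
    return target


rng = random.Random(0)
source_data = make_data(200, rng, 1.0)   # large, related source task
target_data = make_data(20, rng, 0.5)    # small target task

source_net = TinyNet(4, 4, rng)
source_net.fit(source_data, epochs=30, lr=0.3)

target_net = transfer_hidden_layer(source_net, rng)
loss_before = target_net.loss(target_data)
target_net.fit(target_data, epochs=30, lr=0.3)   # fine-tune all layers
loss_after = target_net.loss(target_data)

print("target loss before fine-tuning:", round(loss_before, 4))
print("target loss after fine-tuning: ", round(loss_after, 4))
```

Copying the hidden layer rather than sharing it keeps the source model unchanged during fine-tuning, which makes it easy to compare a transferred model against one trained from scratch on the same small target set.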

References

[1]
Banerjee D, Islam K, Mei G, Xiao L, Zhang G, Xu R, Ji S, Li J (2017) A deep transfer learning approach for improved post-traumatic stress disorder diagnosis. In: 2017 IEEE international conference on data mining (ICDM), IEEE, pp 11–20
[2]
Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2(1):1–127
[3]
Bijleveld H-A (2015) Post-traumatic stress disorder and stuttering: a diagnostic challenge in a case study. Proc Soc Behav Sci 193:37–43
[4]
Brown SM, Webb A, Mangoubi R, Dy JG (2015) A sparse combined regression-classification formulation for learning a physiological alternative to clinical post-traumatic stress disorder scores. In: AAAI, pp 1700–1706
[5]
Calvo RA, D’Mello S (2010) Affect detection: an interdisciplinary review of models, methods, and their applications. IEEE Trans Affect Comput 1(1):18–37
[6]
Deng L, Li J, Huang J-T, Yao K, Yu D, Seide F, Seltzer M, Zweig G, He X, Williams J, et al (2013) Recent advances in deep learning for speech research at Microsoft. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 8604–8608
[7]
Dieleman S, Schrauwen B (2014) End-to-end learning for music audio. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 6964–6968
[8]
Edwards AL (1948) Note on the correction for continuity in testing the significance of the difference between correlated proportions. Psychometrika 13(3):185–187
[9]
Farrús M, Hernando J, Ejarque P (2007) Jitter and shimmer measurements for speaker recognition. In: Eighth annual conference of the international speech communication association
[10]
Foa EB, Steketee G, Rothbaum BO (1989) Behavioral/cognitive conceptualizations of post-traumatic stress disorder. Behav Ther 20(2):155–176
[11]
Friedman MJ (2007) PTSD history and overview. United States Department of Veterans Affairs
[12]
Galatzer-Levy IR, Ma S, Statnikov A, Yehuda R, Shalev AY (2017) Utilization of machine learning for prediction of post-traumatic stress: a re-examination of cortisol in the prediction and pathways to non-remitting ptsd. Transl Psychiatr 7(3):e1070
[13]
Galatzer-Levy IR, Karstoft KI, Statnikov A, Shalev AY (2014) Quantitative forecasting of ptsd from early trauma responses: a machine learning application. J Psychiatr Res 59:68–76
[14]
Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL, Zue V (1993) TIMIT acoustic-phonetic continuous speech corpus. Linguistic Data Consortium, Philadelphia
[15]
Grinage BD (2003) Diagnosis and management of post-traumatic stress disorder. Am Fam Phys 68(12):2401–2408
[16]
Gulzar T, Singh A, Sharma S (2014) Comparative analysis of IPCC, MFCC and BFCC for the recognition of Hindi words using artificial neural networks. Int J Comput Appl 101(12):22–27
[18]
Hansen JHL, Kim W, Rahurkar M, Ruzanski E, Meyerhoff J (2011) Robust emotional stressed speech detection using weighted frequency subbands. EURASIP J Adv Signal Process 2011(1):906789
[19]
Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554
[20]
Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507
[21]
Hovens JE, Van der Ploeg HM, Klaarenbeek MTA, Bramsen I, Schreuder JN, Rivero VV (1994) The assessment of posttraumatic stress disorder: with the clinician administered ptsd scale: Dutch results. J Clin Psychol 50(3):325–340
[22]
Kamishima T, Hamasaki M, Akaho S (2009) Trbagg: a simple transfer learning method and its application to personalization in collaborative tagging. In: Ninth IEEE international conference on data mining, 2009, ICDM’09, IEEE, pp 219–228
[23]
Karstoft K-I, Galatzer-Levy IR, Statnikov A, Li Z, Shalev AY (2015) Bridging a translational gap: using machine learning to improve the prediction of ptsd. BMC Psychiatr 15(1):30
[24]
Kessler RC, Rose S, Koenen KC, Karam EG, Stang PE, Stein DJ, Heeringa SG, Hill ED, Liberzon I, McLaughlin KA (2014) How well can post-traumatic stress disorder be predicted from pre-trauma risk factors? An exploratory study in the who world mental health surveys. World Psychiatr 13(3):265–274
[25]
Kim J-H, Woodland PC (2001) The use of prosody in a combined system for punctuation generation and speech recognition. In: Seventh European conference on speech communication and technology
[26]
Knoth B, Vergyri D, Shriberg E, Mitra V, Mclaren V, Kathol A, Richey C, Graciarena M (2018) Systems for speech-based assessment of a patient’s state-of-mind. US Patent WO2016028495 A1
[27]
Krothapalli SR, Koolagudi SG (2013) Characterization and recognition of emotions from speech using excitation source information. Int J Speech Technol 16(2):181–201
[28]
Kumaraswamy R, Odom P, Kersting K, Leake D, Natarajan S (2015) Transfer learning via relational type matching. In: 2015 IEEE international conference on data mining (ICDM), IEEE, pp 811–816
[29]
Kunze J, Kirsch L, Kurenkov I, Krug A, Johannsmeier J, Stober S (2017) Transfer learning for speech recognition on a budget. ArXiv preprint arXiv:1706.00290
[30]
Li X, Tao J, Johnson MT, Soltis J, Savage A, Leong KM, Newman JD (2007) Stress and emotion classification using jitter and shimmer features. In: IEEE international conference on acoustics, speech and signal processing, 2007, ICASSP 2007, vol 4. IEEE, pp IV–1081
[31]
Litman DJ, Hirschberg JB, Swerts M (2000) Predicting automatic speech recognition performance using prosodic cues. In: Proceedings of the 1st North American chapter of the association for computational linguistics conference. Association for Computational Linguistics, pp 218–225
[32]
Marinić I, Supek F, Kovačić Z, Rukavina L, Jendričko T, Kozarić-Kovačić D (2007) Posttraumatic stress disorder: diagnostic data analysis by data mining methodology. Croat Med J 48(2):185–197
[33]
Muda L, Begam M, Elamvazuthi I (2010) Voice recognition algorithms using mel frequency cepstral coefficient (mfcc) and dynamic time warping (dtw) techniques. ArXiv preprint arXiv:1003.4083
[34]
Omurca S, Ekinci E (2015) An alternative evaluation of post traumatic stress disorder with machine learning methods. In: 2015 International symposium on innovations in intelligent systems and applications (INISTA), IEEE, pp 1–7
[35]
Ooi KEB, Low LSA, Lech M, Allen N (2012) Early prediction of major depression in adolescents using glottal wave characteristics and Teager energy parameters. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 4613–4616
[38]
Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
[39]
Pitman RK (1989) Post-traumatic stress disorder, hormones, and memory. Biol Psychiatr 26(3):221–223
[40]
Pratt LY (1993) Discriminability-based transfer between neural networks. In: Advances in neural information processing systems, pp 204–211
[41]
Ramaswamy S, Madaan V, Qadri F, Heaney CJ, North TC, Padala PR, Sattar SP, Petty F (2005) A primary care perspective of posttraumatic stress disorder for the department of veterans affairs. Prim Care Compan J Clin Psychiatr 7(4):180
[42]
Rozgic V, Vazquez-Reina A, Crystal M, Srivastava A, Tan V, Berka C (2014) Multi-modal prediction of ptsd and stress indicators. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 3636–3640
[43]
Scherer S, Lucas GM, Gratch J, Rizzo AS, Morency L-P (2016) Self-reported symptoms of depression and ptsd are associated with reduced vowel space in screening interviews. IEEE Trans Affect Comput 7(1):59–73
[44]
Scherer S, Stratou G, Gratch J, Morency L-P (2013) Investigating voice quality as a speaker-independent indicator of depression and ptsd. In: Interspeech, pp 847–851
[45]
Razavian AS, Azizpour H, Sullivan J, Carlsson S (2014) CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 806–813
[46]
Sparr LF, Bremner JD (2005) Post-traumatic stress disorder and memory: prescient medicolegal testimony at the international war crimes tribunal? J Am Acad Psychiatr Law Online 33(1):71–78
[47]
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958
[48]
van den Broek EL, van der Sluis F, Dijkstra T (2010) Telling the story and re-living the past: how speech analysis can reveal emotions in post-traumatic stress disorder (ptsd) patients. In: Sensing emotions, Springer, pp 153–180
[49]
Vergyri D, Knoth B, Shriberg E, Mitra V, McLaren M, Ferrer L, Garcia P, Marmar C (2015) Speech-based assessment of ptsd in a military population using diverse feature classes. In: Sixteenth annual conference of the international speech communication association
[50]
Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83
[51]
Young A (1997) The harmony of illusions: inventing post-traumatic stress disorder. Princeton University Press, Princeton
[52]
Zhang Q, Wu Q, Zhu H, He L, Huang H, Zhang J, Zhang W (2016) Multimodal MRI-based classification of trauma survivors with and without post-traumatic stress disorder. Front Neurosci 10:292
[53]
Zhang W, Li R, Zeng T, Sun Q, Kumar S, Ye J, Ji S (2016) Deep model based transfer and multi-task learning for biological image analysis. In: IEEE transactions on big data
[54]
Zhuang X, Rozgić V, Crystal M, Marx BP (2014) Improving speech-based ptsd detection via multi-view learning. In: Spoken language technology workshop (SLT), 2014 IEEE, pp 260–265

Published In

Knowledge and Information Systems, Volume 60, Issue 3
Sep 2019
581 pages

Publisher

Springer-Verlag
Berlin, Heidelberg

Author Tags

1. Speech based PTSD diagnosis
2. Deep belief network
3. Deep learning
4. Transfer learning

Cited By

  • (2024) Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey. ACM Computing Surveys 56:9 (1-45). DOI: 10.1145/3649448. Online publication date: 25-Apr-2024
  • (2023) Combining Deep Learning with Signal-image Encoding for Multi-Modal Mental Wellbeing Classification. ACM Transactions on Computing for Healthcare 5:1 (1-23). DOI: 10.1145/3631618. Online publication date: 3-Nov-2023
  • (2022) Stress emotion recognition with discrepancy reduction using transfer learning. Multimedia Tools and Applications 82:4 (5949-5963). DOI: 10.1007/s11042-022-13593-6. Online publication date: 1-Aug-2022
  • (2021) Public opinion mining using natural language processing technique for improvisation towards smart city. International Journal of Speech Technology 24:3 (561-569). DOI: 10.1007/s10772-020-09766-z. Online publication date: 1-Sep-2021
  • (2021) Metro passengers counting and density estimation via dilated-transposed fully convolutional neural network. Knowledge and Information Systems 63:6 (1557-1575). DOI: 10.1007/s10115-021-01563-7. Online publication date: 1-Jun-2021
  • (2021) Transfer learning for fine-grained entity typing. Knowledge and Information Systems 63:4 (845-866). DOI: 10.1007/s10115-021-01549-5. Online publication date: 1-Apr-2021
  • (2020) Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (3474-3484). DOI: 10.1145/3394486.3412865. Online publication date: 23-Aug-2020