research-article

A deep transfer learning approach for improved post-traumatic stress disorder diagnosis

Authors:

Debrup Banerjee,

Guangfan Zhang,

Jiang Li

Volume 60, Issue 3

Pages 1693 - 1724

https://doi.org/10.1007/s10115-019-01337-2

Published: 01 September 2019

Abstract

Post-traumatic stress disorder (PTSD) is a traumatic-stressor-related disorder that develops after exposure to a traumatic or adverse environmental event that caused serious harm or injury. The structured interview is the only widely accepted clinical practice for PTSD diagnosis, but it suffers from several limitations, including the stigma associated with the disease. Diagnosing PTSD by analyzing speech signals has been investigated as an alternative in recent years: speech signals are processed to extract frequency features, and these features are then fed into a classification model. In this paper, we develop a deep belief network (DBN) model combined with a transfer learning (TL) strategy for PTSD diagnosis. We computed three categories of speech features and used the DBN model to fuse them. The TL strategy transfers knowledge learned from a large speech recognition database, TIMIT, to PTSD detection, where patient data are difficult to collect. We evaluated the proposed methods on two PTSD speech databases, each consisting of audio recordings from 26 patients. In a comparison with other popular methods, the state-of-the-art support vector machine (SVM) classifier achieved an accuracy of only 57.68%, while the TL strategy boosted the accuracy of the DBN from 61.53% to 74.99%. Altogether, our method provides a pragmatic and promising tool for PTSD diagnosis. Preliminary results of this study were presented in Banerjee et al. [1] (2017 IEEE International Conference on Data Mining (ICDM), IEEE, 2017).
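
The transfer learning idea in the abstract (learn a representation on a large source corpus, then reuse it on a small target set) can be illustrated with a minimal sketch. This is not the authors' DBN or their feature pipeline: a plain one-hidden-layer network trained by gradient descent is swapped in for the DBN, and the data are synthetic stand-ins for TIMIT and the PTSD recordings. All function names, layer sizes, and hyperparameters below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_net(n_in, n_hidden, n_out):
    """Small random weights for a one-hidden-layer network (hypothetical sizes)."""
    return {
        "W1": rng.normal(0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(net, X):
    """Return hidden activations and softmax class probabilities."""
    H = sigmoid(X @ net["W1"] + net["b1"])
    logits = H @ net["W2"] + net["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return H, P / P.sum(axis=1, keepdims=True)

def train(net, X, y, epochs=200, lr=0.5, freeze_hidden=False):
    """Full-batch gradient descent on softmax cross-entropy."""
    Y = np.eye(net["W2"].shape[1])[y]  # one-hot targets
    for _ in range(epochs):
        H, P = forward(net, X)
        d_logits = (P - Y) / len(X)
        if not freeze_hidden:
            # Backpropagate into the hidden layer before W2 is updated.
            dH = (d_logits @ net["W2"].T) * H * (1.0 - H)
            net["W1"] -= lr * (X.T @ dH)
            net["b1"] -= lr * dH.sum(axis=0)
        net["W2"] -= lr * (H.T @ d_logits)
        net["b2"] -= lr * d_logits.sum(axis=0)
    return net

def transfer(source_net, n_target_classes):
    """Copy the source hidden layer and attach a fresh output layer."""
    n_hidden = source_net["W1"].shape[1]
    return {
        "W1": source_net["W1"].copy(),
        "b1": source_net["b1"].copy(),
        "W2": rng.normal(0, 0.1, (n_hidden, n_target_classes)),
        "b2": np.zeros(n_target_classes),
    }

# Synthetic stand-ins: a large 5-class source task and a 26-sample binary target task.
A = rng.normal(size=(20, 5))
X_src = rng.normal(size=(2000, 20))
y_src = (X_src @ A).argmax(axis=1)
X_tgt = rng.normal(size=(26, 20))
y_tgt = (X_tgt @ A[:, 0] > 0).astype(int)

# 1) Pretrain on the source task. 2) Reuse the hidden layer on the target task,
#    training only the new output layer because the target set is tiny.
source_net = train(init_net(20, 16, 5), X_src, y_src)
target_net = train(transfer(source_net, 2), X_tgt, y_tgt,
                   epochs=100, freeze_hidden=True)
_, P_tgt = forward(target_net, X_tgt)
```

Freezing the transferred layer is only one possible choice; fine-tuning all layers with a small learning rate is a common alternative, and the abstract does not say which variant the paper uses.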

References

[1]

Banerjee D, Islam K, Mei G, Xiao L, Zhang G, Xu R, Ji S, Li J (2017) A deep transfer learning approach for improved post-traumatic stress disorder diagnosis. In: 2017 IEEE international conference on data mining (ICDM), IEEE, pp 11–20

[2]

Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2(1):1–127

[3]

Bijleveld H-A (2015) Post-traumatic stress disorder and stuttering: a diagnostic challenge in a case study. Proc Soc Behav Sci 193:37–43

[4]

Brown SM, Webb A, Mangoubi R, Dy JG (2015) A sparse combined regression-classification formulation for learning a physiological alternative to clinical post-traumatic stress disorder scores. In: AAAI, pp 1700–1706

[5]

Calvo RA, D’Mello S (2010) Affect detection: an interdisciplinary review of models, methods, and their applications. IEEE Trans Affect Comput 1(1):18–37


[6]

Deng L, Li J, Huang J-T, Yao K, Yu D, Seide F, Seltzer M, Zweig G, He X, Williams J, et al (2013) Recent advances in deep learning for speech research at Microsoft. In: 2013 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 8604–8608

[7]

Dieleman S, Schrauwen B (2014) End-to-end learning for music audio. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 6964–6968

[8]

Edwards AL (1948) Note on the correction for continuity in testing the significance of the difference between correlated proportions. Psychometrika 13(3):185–187

[9]

Farrús M, Hernando J, Ejarque P (2007) Jitter and shimmer measurements for speaker recognition. In: Eighth annual conference of the international speech communication association

[10]

Foa EB, Steketee G, Rothbaum BO (1989) Behavioral/cognitive conceptualizations of post-traumatic stress disorder. Behav Ther 20(2):155–176

[11]

Friedman MJ (2007) PTSD history and overview. United States Department of Veterans Affairs

[12]

Galatzer-Levy IR, Ma S, Statnikov A, Yehuda R, Shalev AY (2017) Utilization of machine learning for prediction of post-traumatic stress: a re-examination of cortisol in the prediction and pathways to non-remitting PTSD. Transl Psychiatr 7(3):e1070

[13]

Galatzer-Levy IR, Karstoft KI, Statnikov A, Shalev AY (2014) Quantitative forecasting of PTSD from early trauma responses: a machine learning application. J Psychiatr Res 59:68–76

[14]

Garofolo JS, Lamel LF, Fisher WM, Fiscus JG, Pallett DS, Dahlgren NL, Zue V (1993) TIMIT acoustic-phonetic continuous speech corpus. Linguistic Data Consortium, Philadelphia

[15]

Grinage BD (2003) Diagnosis and management of post-traumatic stress disorder. Am Fam Phys 68(12):2401–2408

[16]

Gulzar T, Singh A, Sharma S (2014) Comparative analysis of IPCC, MFCC and BFCC for the recognition of Hindi words using artificial neural networks. Int J Comput Appl 101(12):22–27

[17]

How common is PTSD (2018) https://www.ptsd.va.gov/public/ptsd-overview/basics/how-common-is-ptsd.asp. Accessed 20 June 2018

[18]

Hansen JHL, Kim W, Rahurkar M, Ruzanski E, Meyerhoff J (2011) Robust emotional stressed speech detection using weighted frequency subbands. EURASIP J Adv Signal Process 2011(1):906789

[19]

Hinton GE, Osindero S, Teh Y-W (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554


[20]

Hinton GE, Salakhutdinov RR (2006) Reducing the dimensionality of data with neural networks. Science 313(5786):504–507

[21]

Hovens JE, Van der Ploeg HM, Klaarenbeek MTA, Bramsen I, Schreuder JN, Rivero VV (1994) The assessment of posttraumatic stress disorder: with the clinician administered PTSD scale: Dutch results. J Clin Psychol 50(3):325–340

[22]

Kamishima T, Hamasaki M, Akaho S (2009) Trbagg: a simple transfer learning method and its application to personalization in collaborative tagging. In: Ninth IEEE international conference on data mining, 2009, ICDM’09, IEEE, pp 219–228

[23]

Karstoft K-I, Galatzer-Levy IR, Statnikov A, Li Z, Shalev AY (2015) Bridging a translational gap: using machine learning to improve the prediction of PTSD. BMC Psychiatr 15(1):30

[24]

Kessler RC, Rose S, Koenen KC, Karam EG, Stang PE, Stein DJ, Heeringa SG, Hill ED, Liberzon I, McLaughlin KA (2014) How well can post-traumatic stress disorder be predicted from pre-trauma risk factors? An exploratory study in the WHO world mental health surveys. World Psychiatr 13(3):265–274

[25]

Kim J-H, Woodland PC (2001) The use of prosody in a combined system for punctuation generation and speech recognition. In: Seventh European conference on speech communication and technology

[26]

Knoth B, Vergyri D, Shriberg E, Mitra V, Mclaren V, Kathol A, Richey C, Graciarena M (2018) Systems for speech-based assessment of a patient’s state-of-mind. US Patent WO2016028495 A1

[27]

Krothapalli SR, Koolagudi SG (2013) Characterization and recognition of emotions from speech using excitation source information. Int J Speech Technol 16(2):181–201


[28]

Kumaraswamy R, Odom P, Kersting K, Leake D, Natarajan S (2015) Transfer learning via relational type matching. In: 2015 IEEE international conference on data mining (ICDM), IEEE, pp 811–816

[29]

Kunze J, Kirsch L, Kurenkov I, Krug A, Johannsmeier J, Stober S (2017) Transfer learning for speech recognition on a budget. ArXiv preprint arXiv:1706.00290

[30]

Li X, Tao J, Johnson MT, Soltis J, Savage A, Leong KM, Newman JD (2007) Stress and emotion classification using jitter and shimmer features. In: IEEE international conference on acoustics, speech and signal processing, 2007, ICASSP 2007, vol 4. IEEE, pp IV–1081

[31]

Litman DJ, Hirschberg JB, Swerts M (2000) Predicting automatic speech recognition performance using prosodic cues. In: Proceedings of the 1st North American chapter of the association for computational linguistics conference. Association for Computational Linguistics, pp 218–225

[32]

Marinić I, Supek F, Kovačić Z, Rukavina L, Jendričko T, Kozarić-Kovačić D (2007) Posttraumatic stress disorder: diagnostic data analysis by data mining methodology. Croat Med J 48(2):185–197

[33]

Muda L, Begam M, Elamvazuthi I (2010) Voice recognition algorithms using mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. ArXiv preprint arXiv:1003.4083

[34]

Omurca S, Ekinci E (2015) An alternative evaluation of post traumatic stress disorder with machine learning methods. In: 2015 International symposium on innovations in intelligent systems and applications (INISTA), IEEE, pp 1–7

[35]

Ooi KEB, Low LSA, Lech M, Allen N (2012) Early prediction of major depression in adolescents using glottal wave characteristics and Teager energy parameters. In: 2012 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 4613–4616

[36]

PTSD and DSM-5 (2016) http://www.ptsd.va.gov/professional/PTSD-overview/dsm_criteria_ptsd.asp. Accessed 10 July 2016

[37]

PTSD and symptoms (2018) https://www.ptsd.va.gov/public/ptsd-overview/basics/symptoms_of_ptsd.asp. Accessed 20 June 2018

[38]

Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

[39]

Pitman RK (1989) Post-traumatic stress disorder, hormones, and memory. Biol Psychiatr 26(3):221–223

[40]

Pratt LY (1993) Discriminability-based transfer between neural networks. In: Advances in neural information processing systems, pp 204–211

[41]

Ramaswamy S, Madaan V, Qadri F, Heaney CJ, North TC, Padala PR, Sattar SP, Petty F (2005) A primary care perspective of posttraumatic stress disorder for the department of veterans affairs. Prim Care Compan J Clin Psychiatr 7(4):180

[42]

Rozgic V, Vazquez-Reina A, Crystal M, Srivastava A, Tan V, Berka C (2014) Multi-modal prediction of PTSD and stress indicators. In: 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), IEEE, pp 3636–3640

[43]

Scherer S, Lucas GM, Gratch J, Rizzo AS, Morency L-P (2016) Self-reported symptoms of depression and PTSD are associated with reduced vowel space in screening interviews. IEEE Trans Affect Comput 7(1):59–73

[44]

Scherer S, Stratou G, Gratch J, Morency L-P (2013) Investigating voice quality as a speaker-independent indicator of depression and PTSD. In: Interspeech, pp 847–851

[45]

Razavian AS, Azizpour H, Sullivan J, Carlsson S (2014) CNN features off-the-shelf: an astounding baseline for recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp 806–813


[46]

Sparr LF, Bremner JD (2005) Post-traumatic stress disorder and memory: prescient medicolegal testimony at the international war crimes tribunal? J Am Acad Psychiatr Law Online 33(1):71–78

[47]

Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958


[48]

van den Broek EL, van der Sluis F, Dijkstra T (2010) Telling the story and re-living the past: how speech analysis can reveal emotions in post-traumatic stress disorder (PTSD) patients. In: Sensing emotions, Springer, pp 153–180

[49]

Vergyri D, Knoth B, Shriberg E, Mitra V, McLaren M, Ferrer L, Garcia P, Marmar C (2015) Speech-based assessment of PTSD in a military population using diverse feature classes. In: Sixteenth annual conference of the international speech communication association

[50]

Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1(6):80–83

[51]

Young A (1997) The harmony of illusions: inventing post-traumatic stress disorder. Princeton University Press, Princeton

[52]

Zhang Q, Wu Q, Zhu H, He L, Huang H, Zhang J, Zhang W (2016) Multimodal MRI-based classification of trauma survivors with and without post-traumatic stress disorder. Front Neurosci 10:292

[53]

Zhang W, Li R, Zeng T, Sun Q, Kumar S, Ye J, Ji S (2016) Deep model based transfer and multi-task learning for biological image analysis. In: IEEE transactions on big data

[54]

Zhuang X, Rozgić V, Crystal M, Marx BP (2014) Improving speech-based PTSD detection via multi-view learning. In: Spoken language technology workshop (SLT), 2014 IEEE, pp 260–265

Index Terms

A deep transfer learning approach for improved post-traumatic stress disorder diagnosis
1. Applied computing
  1. Life and medical sciences
2. Computing methodologies
  1. Artificial intelligence
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

Knowledge and Information Systems Volume 60, Issue 3

Sep 2019

581 pages

ISSN:0219-1377

Copyright © 2019 Springer-Verlag London Ltd., part of Springer Nature.

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 September 2019

Citations

Cited By

Mohammadi Foumani N, Miller L, Tan C, Webb G, Forestier G, Salehi M (2024) Deep Learning for Time Series Classification and Extrinsic Regression: A Current Survey. ACM Computing Surveys 56(9):1–45. doi:10.1145/3649448. Online publication date: 25-Apr-2024
https://dl.acm.org/doi/10.1145/3649448
Woodward K, Kanjo E, Tsanas A (2023) Combining Deep Learning with Signal-image Encoding for Multi-Modal Mental Wellbeing Classification. ACM Transactions on Computing for Healthcare 5(1):1–23. doi:10.1145/3631618. Online publication date: 3-Nov-2023
https://dl.acm.org/doi/10.1145/3631618
Theerthagiri P (2022) Stress emotion recognition with discrepancy reduction using transfer learning. Multimedia Tools and Applications 82(4):5949–5963. doi:10.1007/s11042-022-13593-6. Online publication date: 1-Aug-2022
https://dl.acm.org/doi/10.1007/s11042-022-13593-6
Leelavathy S, Nithya M (2021) Public opinion mining using natural language processing technique for improvisation towards smart city. International Journal of Speech Technology 24(3):561–569. doi:10.1007/s10772-020-09766-z. Online publication date: 1-Sep-2021
https://dl.acm.org/doi/10.1007/s10772-020-09766-z
Zhu G, Zeng X, Jin X, Zhang J (2021) Metro passengers counting and density estimation via dilated-transposed fully convolutional neural network. Knowledge and Information Systems 63(6):1557–1575. doi:10.1007/s10115-021-01563-7. Online publication date: 1-Jun-2021
https://dl.acm.org/doi/10.1007/s10115-021-01563-7
Hou F, Wang R, Zhou Y (2021) Transfer learning for fine-grained entity typing. Knowledge and Information Systems 63(4):845–866. doi:10.1007/s10115-021-01549-5. Online publication date: 1-Apr-2021
https://dl.acm.org/doi/10.1007/s10115-021-01549-5
Brown C, Chauhan J, Grammenos A, Han J, Hasthanasombat A, Spathis D, Xia T, Cicuta P, Mascolo C, Gupta R, Liu Y, Shah M, Rajan S, Tang J, Prakash B (2020) Exploring Automatic Diagnosis of COVID-19 from Crowdsourced Respiratory Sound Data. Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp 3474–3484. doi:10.1145/3394486.3412865. Online publication date: 23-Aug-2020
https://dl.acm.org/doi/10.1145/3394486.3412865
