DOI: 10.1145/3269206.3271732
research-article

"Let Me Tell You About Your Mental Health!": Contextualized Classification of Reddit Posts to DSM-5 for Web-based Intervention

Published: 17 October 2018

    Abstract

    Social media platforms are increasingly being used to share and seek advice on mental health issues. In particular, Reddit users freely discuss such issues on various subreddits, whose structure and content can be leveraged to formally interpret and relate subreddits and their posts in terms of mental health diagnostic categories. There is prior research on the extraction of mental health-related information, including symptoms, diagnoses, and treatments, from social media; however, our approach can additionally provide actionable information to clinicians about the mental health of a patient in diagnostic terms for web-based intervention. Specifically, we provide a detailed analysis of the nature of subreddit content from a domain expert's perspective and introduce a novel approach to map each subreddit to the best-matching DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th Edition) category using a multi-class classifier. Our classification algorithm analyzes all the posts of a subreddit by adapting topic modeling and word-embedding techniques, and by utilizing curated medical knowledge bases to quantify their relationship to DSM-5 categories. Our semantic encoding-decoding optimization approach reduces the false-alarm rate from 30% to 2.5% over a comparable heuristic baseline, and our mapping results have been verified by domain experts, achieving a kappa score of 0.84.
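
The two evaluation figures quoted in the abstract, the false-alarm rate and the kappa score, can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the false-alarm rate is taken as FP / (FP + TN) for one category treated as positive, and the kappa score is taken to be Cohen's kappa between two raters. The function names and the toy labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def false_alarm_rate(y_true, y_pred, positive):
    """Assumed definition: FP / (FP + TN), with `positive` as the target class."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels (not from the paper): one category per subreddit.
truth = ["depression", "anxiety", "depression", "bipolar", "anxiety", "depression"]
pred = ["depression", "depression", "depression", "bipolar", "anxiety", "anxiety"]

far = false_alarm_rate(truth, pred, positive="depression")
kappa = cohen_kappa(truth, pred)
print(round(far, 3))    # 0.333
print(round(kappa, 3))  # 0.455
```

In this toy example, one of the three non-depression items is wrongly labeled as depression, giving a false-alarm rate of 1/3; the raters agree on 4 of 6 items against a chance agreement of 14/36, giving a kappa of 5/11.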

            Published In

            CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management
            October 2018
            2362 pages
            ISBN: 9781450360142
            DOI: 10.1145/3269206

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Author Tags

            1. drug abuse ontology
            2. dsm-5
            3. medical knowledge bases
            4. mental health
            5. reddit
            6. semantic encoding and decoding
            7. semantic social computing

            Qualifiers

            • Research-article

            Acceptance Rates

            CIKM '18 paper acceptance rate: 147 of 826 submissions (18%)
            Overall acceptance rate: 1,861 of 8,427 submissions (22%)
