DOI: 10.1145/3079452.3079470
research-article

Using Machine Learning for Automatic Identification of Evidence-Based Health Information on the Web

Published: 02 July 2017
  • Abstract

    Automatic assessment of the quality of online health information is increasingly needed given the massive growth of online content. In this paper, we present an approach that assesses the quality of health webpages based on their content rather than on purely technical features, by applying machine learning techniques to the automatic identification of evidence-based health information. Several machine learning approaches were applied to learn classifiers using different combinations of features. Three datasets were used, one for each of three diseases: shingles, flu, and migraine. The classifiers gave promising results in terms of precision and recall, especially for diseases with few different pathogenic mechanisms.
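    The general idea of learning a content-based classifier and scoring it by precision and recall can be sketched as follows. This is only an illustration, not the method used in the paper: the abstract does not say which classifiers or features were used, so a bag-of-words multinomial naive Bayes model is assumed here, and the training sentences, the labels "evidence" and "other", and all function names are invented for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple feature extractor)."""
    return text.lower().split()


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class for multinomial naive Bayes."""
    class_counts = Counter(labels)
    word_counts = {label: Counter() for label in class_counts}
    for doc, label in zip(docs, labels):
        word_counts[label].update(tokenize(doc))
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict(model, doc):
    """Return the class with the highest log-posterior, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(doc):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented toy data, for illustration only (not taken from the paper's datasets).
TRAIN_DOCS = [
    "randomized controlled trial showed vaccine reduced shingles risk",
    "systematic review of clinical trial data on migraine treatment",
    "miracle cure melts flu away overnight guaranteed",
    "secret remedy doctors hate cures migraine instantly",
]
TRAIN_LABELS = ["evidence", "evidence", "other", "other"]

model = train_naive_bayes(TRAIN_DOCS, TRAIN_LABELS)

if __name__ == "__main__":
    test_docs = ["clinical trial of flu vaccine", "miracle remedy cures flu overnight"]
    test_labels = ["evidence", "other"]
    predictions = [predict(model, doc) for doc in test_docs]
    print(predictions)
    print(precision_recall(test_labels, predictions, positive="evidence"))
```

    Precision is the share of pages flagged as evidence-based that really are; recall is the share of truly evidence-based pages that the classifier finds. A real evaluation would hold out test pages or use cross-validation rather than a handful of hand-written sentences.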


        Published In

        DH '17: Proceedings of the 2017 International Conference on Digital Health
        July 2017
        256 pages
        ISBN:9781450352499
        DOI:10.1145/3079452
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. assessing health web pages
        2. evidence-based medicine
        3. machine learning
        4. online health information
        5. text classification

        Qualifiers

        • Research-article

        Conference

        DH '17
        DH '17: International Conference on Digital Health
        July 2 - 5, 2017
        London, United Kingdom

        Cited By

        • (2024) Modeling Health Video Consumption Behaviors on Social Media: Activities, Challenges, and Characteristics. Proceedings of the ACM on Human-Computer Interaction 8:CSCW1 (1-28). DOI: 10.1145/3653699. Online publication date: 26-Apr-2024
        • (2023) Evaluating online health information quality using machine learning and deep learning: A systematic literature review. DIGITAL HEALTH 9. DOI: 10.1177/20552076231212296. Online publication date: 20-Nov-2023
        • (2023) Vec4Cred: a model for health misinformation detection in web pages. Multimedia Tools and Applications 82:4 (5271-5290). DOI: 10.1007/s11042-022-13368-z. Online publication date: 1-Feb-2023
        • (2023) Assessing Depression Health Information Using Machine Learning. Internet of Things (45-53). DOI: 10.1007/978-3-031-28475-5_5. Online publication date: 30-Mar-2023
        • (2021) Using Patient Descriptions of 20 Most Common Diseases in Text Classification for Evidence-based Medicine. 2021 Mohammad Ali Jinnah University International Conference on Computing (MAJICC) (1-8). DOI: 10.1109/MAJICC53071.2021.9526252. Online publication date: 15-Jul-2021
        • (2020) Automatic Identification of Information Quality Metrics in Health News Stories. Frontiers in Public Health 8. DOI: 10.3389/fpubh.2020.515347. Online publication date: 18-Dec-2020
        • (2020) Interventions to Support Consumer Evaluation of Online Health Information Credibility: A Scoping Review. International Journal of Medical Informatics (104321). DOI: 10.1016/j.ijmedinf.2020.104321. Online publication date: Nov-2020
