research-article

Global Surveillance of COVID-19 by mining news media using a multi-source dynamic embedded topic model

Published: 10 November 2020

    Abstract

    As the COVID-19 pandemic continues to unfold, understanding the global impact of non-pharmacological interventions (NPI) is important for formulating effective intervention strategies, particularly as many countries prepare for future waves. We used a machine learning approach to distill latent topics related to NPI from large-scale international news media. We hypothesize that these topics are informative about the timing and nature of implemented NPI, dependent on the source of the information (e.g., local news versus official government announcements) and the target countries. Given a set of latent topics associated with NPI (e.g., self-quarantine, social distancing, online education), we assume that countries and media sources have different prior distributions over these topics, which are sampled to generate the news articles. To model the source-specific topic priors, we developed a semi-supervised, multi-source, dynamic, embedded topic model. Our model simultaneously infers latent topics and learns a linear classifier that predicts NPI labels, using each news article's topic mixture as input. To learn these models, we developed an efficient end-to-end amortized variational inference algorithm. We applied our models to news data collected and labelled by the World Health Organization (WHO) and the Global Public Health Intelligence Network (GPHIN). Through comprehensive experiments, we observed superior topic quality and intervention prediction accuracy compared to the baseline embedded topic models, which ignore information on media source and intervention labels. The inferred latent topics reveal distinct policies and media framing in different countries and media sources, and also characterize reaction to COVID-19 and NPI in a semantically meaningful manner. Our PyTorch code is available on GitHub (https://github.com/li-lab-mcgill/covid19_media).
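The generative assumption described in the abstract can be illustrated with a small sketch. This is not the authors' PyTorch implementation: it is a minimal, pure-Python illustration that assumes a logistic-normal topic prior (as in standard embedded topic models), omits the dynamic (time-dependent) component and the variational inference entirely, and uses a softmax classifier as a stand-in for the linear NPI predictor. All names, dimensions, and parameter values (`source_priors`, `W`, `b`, and so on) are hypothetical.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def topic_word_dists(word_emb, topic_emb):
    """Embedded topics: each topic's word distribution is a softmax over
    inner products between word embeddings and that topic's embedding."""
    return [softmax([dot(w, t) for w in word_emb]) for t in topic_emb]

def sample_theta(prior_mean, rng, sigma=1.0):
    """Assumed logistic-normal prior: Gaussian draw around a
    source-specific mean, mapped to the simplex with softmax."""
    return softmax([rng.gauss(mu, sigma) for mu in prior_mean])

def generate_doc(theta, beta, n_words, rng):
    """Draw word ids from the mixture p(w) = sum_k theta[k] * beta[k][w]."""
    vocab_size = len(beta[0])
    p = [sum(theta[k] * beta[k][v] for k in range(len(theta)))
         for v in range(vocab_size)]
    return rng.choices(range(vocab_size), weights=p, k=n_words)

def predict_label_probs(theta, W, b):
    """Linear classifier on the topic mixture (softmax is an assumption)."""
    return softmax([dot(row, theta) + bias for row, bias in zip(W, b)])

# Toy setup: K topics, V vocabulary words, D embedding dimensions.
rng = random.Random(0)
K, V, D = 3, 6, 2
word_emb = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(V)]
topic_emb = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
beta = topic_word_dists(word_emb, topic_emb)

# Hypothetical source-specific prior means over the K topics.
source_priors = {
    "local_news": [2.0, 0.0, -1.0],
    "government": [-1.0, 0.0, 2.0],
}
theta = sample_theta(source_priors["local_news"], rng)
doc = generate_doc(theta, beta, 20, rng)

# Hypothetical classifier weights for two NPI labels.
W = [[1.5, 0.0, -1.5], [-1.5, 0.0, 1.5]]
b = [0.0, 0.0]
label_probs = predict_label_probs(theta, W, b)
```

In the sketch, the media source changes only the prior mean passed to `sample_theta`, so two sources share the same topics but weight them differently, which is the intuition behind source-specific topic priors.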

    Published In

    BCB '20: Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
    September 2020
    193 pages
    ISBN: 9781450379649
    DOI: 10.1145/3388440
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Bayesian inference
    2. Topic models
    3. coronavirus
    4. media news
    5. text mining

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    BCB '20

    Acceptance Rates

    Overall Acceptance Rate 254 of 885 submissions, 29%

    Article Metrics

    • Downloads (last 12 months): 49
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 09 Aug 2024.

    Cited By

    • (2024) Dynamic topic modelling for exploring the scientific literature on coronavirus: an unsupervised labelling technique. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-024-00610-0. Online publication date: 13-Aug-2024.
    • (2024) A survey on neural topic models: methods, applications, and challenges. Artificial Intelligence Review 57:2. DOI: 10.1007/s10462-023-10661-7. Online publication date: 25-Jan-2024.
    • (2023) User feedback on the NHS Test & Trace Service during COVID-19: The use of machine learning to analyse free-text data from 37,914 England adults. Public Health in Practice 6 (100401). DOI: 10.1016/j.puhip.2023.100401. Online publication date: Dec-2023.
    • (2023) Deep learning-based user experience evaluation in distance learning. Cluster Computing 27:1 (443-455). DOI: 10.1007/s10586-022-03918-3. Online publication date: 8-Jan-2023.
    • (2022) Galileo, a data platform for viewing news on social networks. El Profesional de la información. DOI: 10.3145/epi.2022.sep.12. Online publication date: 3-Oct-2022.
    • (2022) Revealing the Reflections of the Pandemic by Investigating COVID-19 Related News Articles Using Machine Learning and Network Analysis. Bilişim Teknolojileri Dergisi 15:2 (209-220). DOI: 10.17671/gazibtd.949599. Online publication date: 30-Apr-2022.
    • (2022) Multilingual Topic Labelling of News Topics Using Ontological Mapping. Advances in Information Retrieval (248-256). DOI: 10.1007/978-3-030-99739-7_29. Online publication date: 5-Apr-2022.
    • (2021) Monitoring non-pharmaceutical public health interventions during the COVID-19 pandemic. Scientific Data 8:1. DOI: 10.1038/s41597-021-01001-x. Online publication date: 24-Aug-2021.
    • (2021) Lifespan Perspective on Congenital Heart Disease Research. Journal of the American College of Cardiology 77:17 (2219-2235). DOI: 10.1016/j.jacc.2021.03.012. Online publication date: May-2021.
