DOI: 10.1145/3340531.3412880

ReCOVery: A Multimodal Repository for COVID-19 News Credibility Research

Published: 19 October 2020
    Abstract

    First identified in Wuhan, China, in December 2019, the COVID-19 outbreak was declared a global emergency in January 2020 and a pandemic in March 2020 by the World Health Organization (WHO). Alongside the pandemic, we are also experiencing an "infodemic" of low-credibility information such as fake news and conspiracy theories. In this work, we present ReCOVery, a repository designed and constructed to facilitate research on combating such information about COVID-19. We first broadly search and investigate ~2,000 news publishers, from which 60 are identified as having extreme (high or low) levels of credibility. Each article inherits the credibility of the publisher on which it appeared; in this way, a total of 2,029 news articles on coronavirus, published from January to May 2020, are collected in the repository, along with 140,820 tweets that reveal how these articles spread on Twitter. The repository provides multimodal information on each news article, including textual, visual, temporal, and network information. Obtaining news credibility in this way allows a trade-off between dataset scalability and label accuracy. Extensive experiments present data statistics and distributions and provide baseline performances for predicting news credibility, so that future methods can be compared. Our repository is available at http://coronavirus-fakenews.com.
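
    The publisher-level labeling described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the domains, the label table, and the helper names (publisher_domain, label_articles) are hypothetical, and the sketch only assumes that each article takes the label of the publisher whose domain hosts it, while articles from unrated publishers are left out.

```python
from urllib.parse import urlparse

# Hypothetical publisher-level labels: 1 = reliable, 0 = unreliable.
# These domains are placeholders, not entries from the ReCOVery publisher list.
PUBLISHER_CREDIBILITY = {
    "reliable-news.example": 1,
    "unreliable-news.example": 0,
}


def publisher_domain(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def label_articles(article_urls, publisher_labels):
    """Assign each article the label of its publisher.

    Articles whose publisher is not in the label table are skipped,
    so only sources with a known (extreme) credibility contribute.
    """
    labeled = []
    for url in article_urls:
        domain = publisher_domain(url)
        if domain in publisher_labels:
            labeled.append((url, publisher_labels[domain]))
    return labeled


articles = [
    "https://www.reliable-news.example/covid-19-update",
    "https://unreliable-news.example/miracle-cure",
    "https://unrated-news.example/story",
]
labeled_articles = label_articles(articles, PUBLISHER_CREDIBILITY)
```

    Labeling at the publisher level scales to many articles without per-article fact-checking, at the cost of occasionally mislabeling an individual article; this is the scalability versus accuracy trade-off the abstract mentions.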

    Supplementary Material

    MP4 File (3340531.3412880.mp4)
    We present ReCOVery, a repository designed and constructed to combat the "infodemic" of low-credibility information such as fake news and conspiracy theories. We first broadly search and investigate ~2,000 news publishers, from which 60 are identified as having extreme levels of credibility. Each article inherits the credibility of the publisher on which it appeared; in this way, a total of 2,029 news articles on coronavirus, published from January to May 2020, are collected, along with 140,820 tweets revealing how these articles spread on Twitter. The repository provides multimodal information on each news article, including textual, visual, temporal, and network information. Obtaining news credibility in this way allows a trade-off between dataset scalability and label accuracy. Extensive experiments present data statistics and distributions and provide baseline performances for predicting news credibility. Our repository is available at http://coronavirus-fakenews.com.

    Published In

    CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
    October 2020
    3619 pages
    ISBN:9781450368599
    DOI:10.1145/3340531
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. coronavirus
    2. covid-19
    3. fake news
    4. infodemic
    5. information credibility
    6. multimodal
    7. pandemic
    8. repository
    9. social media

    Qualifiers

    • Research-article

    Funding Sources

    • DARPA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2024) Building a framework for fake news detection in the health domain. PLOS ONE, 19:7 (e0305362). DOI: 10.1371/journal.pone.0305362. Online publication date: 8-Jul-2024
    • (2024) Analyzing the Interplay between Diversity of News Recommendations and Misinformation Spread in Social Media. Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization (80-85). DOI: 10.1145/3631700.3664870. Online publication date: 27-Jun-2024
    • (2024) Heterogeneous Subgraph Transformer for Fake News Detection. Proceedings of the ACM on Web Conference 2024 (1272-1282). DOI: 10.1145/3589334.3645680. Online publication date: 13-May-2024
    • (2024) Comment-Context Dual Collaborative Masked Transformer Network for Fake News Detection. IEEE Transactions on Multimedia, 26 (5170-5180). DOI: 10.1109/TMM.2023.3330074. Online publication date: 2024
    • (2024) A Review of Deep Learning Techniques for Multimodal Fake News and Harmful Languages Detection. IEEE Access, 12 (76133-76153). DOI: 10.1109/ACCESS.2024.3406258. Online publication date: 2024
    • (2024) DANES. Knowledge-Based Systems, 294:C. DOI: 10.1016/j.knosys.2024.111715. Online publication date: 21-Jun-2024
    • (2024) Sensational stories: The role of narrative characteristics in distinguishing real and fake news and predicting their spread. Journal of Business Research, 170 (114289). DOI: 10.1016/j.jbusres.2023.114289. Online publication date: Jan-2024
    • (2024) Predicting the virality of fake news at the early stage of dissemination. Expert Systems with Applications: An International Journal, 248:C. DOI: 10.1016/j.eswa.2024.123390. Online publication date: 15-Aug-2024
    • (2024) BRaG: a hybrid multi-feature framework for fake news detection on social media. Social Network Analysis and Mining, 14:1. DOI: 10.1007/s13278-023-01185-7. Online publication date: 29-Jan-2024
    • (2024) Amplifying Voices in the Pandemic: A Critical Analysis of Citizen Journalism's Emotional Narrative During COVID-19. Journal of the Knowledge Economy. DOI: 10.1007/s13132-024-02149-8. Online publication date: 25-Jun-2024
