research-article

A Benchmark for Fact Checking Algorithms Built on Knowledge Bases

Authors: Viet-Phi Huynh, Paolo Papotti

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 689 - 698

https://doi.org/10.1145/3357384.3358036

Published: 03 November 2019

Abstract

Fact checking is the task of determining whether a given claim holds. Several algorithms have been developed to check claims against reference information in the form of facts in a knowledge base. While individual algorithms have been experimentally evaluated in the past, we provide the first comprehensive and publicly available benchmark infrastructure for evaluating methods across a wide range of assumptions about the claims and the reference information. We show that changing the popularity, transparency, homogeneity, and functionality properties of the facts in an experiment can significantly influence the performance of fact checking algorithms. We introduce a benchmark framework that systematically enforces such properties in training and testing datasets, with fine-tuned control over them. We then use our benchmark to compare fact checking algorithms with one another, as well as with methods that solve the link prediction task in knowledge bases. Our evaluation shows the impact of the four data properties on the qualitative performance of the fact checking solutions and reveals a number of new insights concerning their applicability and performance.
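To make two of the four properties more concrete, the following is a minimal Python sketch on a toy knowledge base of (subject, predicate, object) triples. It is not the benchmark's implementation. The abstract does not define the properties, so the sketch assumes a common notion of functionality (distinct subjects divided by number of facts for a predicate) and approximates popularity by how many triples mention an entity. The data and the names `KB`, `functionality`, `popularity`, and `split_by_popularity` are hypothetical.

```python
from collections import Counter

# Toy knowledge base of (subject, predicate, object) triples; all data is made up.
KB = [
    ("alice", "bornIn", "paris"),
    ("bob", "bornIn", "rome"),
    ("carol", "bornIn", "paris"),
    ("alice", "speaks", "french"),
    ("alice", "speaks", "english"),
    ("bob", "speaks", "italian"),
]


def functionality(kb, predicate):
    """Assumed definition: distinct subjects / number of facts (1.0 means functional)."""
    facts = [(s, o) for s, p, o in kb if p == predicate]
    if not facts:
        return 0.0
    return len({s for s, _ in facts}) / len(facts)


def popularity(kb):
    """Assumed proxy: number of triples in which each entity appears."""
    degree = Counter()
    for s, _, o in kb:
        degree[s] += 1
        degree[o] += 1
    return degree


def split_by_popularity(kb, predicate, threshold):
    """Partition the facts of one predicate by the popularity of their subject."""
    degree = popularity(kb)
    popular, rare = [], []
    for s, p, o in kb:
        if p != predicate:
            continue
        (popular if degree[s] >= threshold else rare).append((s, p, o))
    return popular, rare


if __name__ == "__main__":
    print(functionality(KB, "bornIn"))
    print(functionality(KB, "speaks"))
    print(split_by_popularity(KB, "bornIn", threshold=2))
```

Under these assumptions, a test set could be drawn only from the `rare` partition, or only from predicates with low functionality, to observe how a fact checking method behaves when such a property of the data changes.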

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019

3373 pages

ISBN:9781450369763

DOI:10.1145/3357384

General Chairs:
Wenwu Zhu (Tsinghua University, China)
Dacheng Tao (University of Massachusetts, USA)
Xueqi Cheng (Institute of Computing Technology, CAS, China)

Program Chairs:
Peng Cui (Tsinghua University, China)
Elke Rundensteiner (Worcester Polytechnic Institute, USA)
David Carmel (Amazon Research, USA)
Qi He (LinkedIn, USA)
Jeffrey Xu Yu (Chinese University of Hong Kong, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Agence Nationale de la Recherche

Conference

CIKM '19

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
