DOI: 10.1145/3357384.3358036
research-article

A Benchmark for Fact Checking Algorithms Built on Knowledge Bases

Published: 03 November 2019

Abstract

Fact checking is the task of determining whether a given claim holds. Several algorithms have been developed to check claims against reference information in the form of facts in a knowledge base. While individual algorithms have been experimentally evaluated in the past, we provide the first comprehensive and publicly available benchmark infrastructure for evaluating methods across a wide range of assumptions about the claims and the reference information. We show how, by changing the popularity, transparency, homogeneity, and functionality properties of the facts in an experiment, it is possible to significantly influence the performance of fact checking algorithms. We introduce a benchmark framework that systematically enforces such properties in training and testing datasets, with fine-tuned control over each of them. We then use our benchmark to compare fact checking algorithms with one another, as well as with methods that solve the link prediction task in knowledge bases. Our evaluation shows the impact of the four data properties on the qualitative performance of the fact checking solutions and reveals a number of new insights concerning their applicability and performance.
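
The abstract names functionality as one of the four controlled properties but does not define it. As a rough illustration only, the sketch below uses the definition common in the rule-mining literature: the number of distinct subjects of a predicate divided by its number of facts. The triples, the function name `functionality`, and the toy knowledge base are all hypothetical and are not taken from the benchmark itself.

```python
from collections import defaultdict


def functionality(triples):
    """Map each predicate to (#distinct subjects) / (#distinct facts).

    Assumed definition (not confirmed by the abstract): a score of 1.0
    means every subject has exactly one object for that predicate.
    """
    subjects = defaultdict(set)
    fact_counts = defaultdict(int)
    # Deduplicate so repeated triples do not skew the ratio.
    for subj, pred, _obj in set(triples):
        subjects[pred].add(subj)
        fact_counts[pred] += 1
    return {pred: len(subjects[pred]) / fact_counts[pred] for pred in fact_counts}


# Hypothetical toy knowledge base of (subject, predicate, object) triples.
toy_kb = [
    ("Paris", "capitalOf", "France"),
    ("Rome", "capitalOf", "Italy"),
    ("Alice", "speaks", "English"),
    ("Alice", "speaks", "French"),
    ("Bob", "speaks", "English"),
]

scores = functionality(toy_kb)
for pred, score in sorted(scores.items()):
    print(f"{pred}: {score:.2f}")
```

Under this assumed definition, `capitalOf` is fully functional in the toy data, while `speaks` scores lower because one subject has two objects. A benchmark could use such a score to split predicates into groups before sampling claims.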

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN:9781450369763
DOI:10.1145/3357384
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. benchmark
  2. claim verification
  3. data generation
  4. evaluation
  5. fact-checking
  6. graph embeddings
  7. knowledge bases
  8. logical rules
  9. trust

Qualifiers

  • Research-article

Funding Sources

  • Agence Nationale de la Recherche

Conference

CIKM '19
Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Cited By

  • (2024) Data Void Exploits: Tracking & Mitigation Strategies. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 1627-1637. DOI: 10.1145/3627673.3679781. Online publication date: 21-Oct-2024
  • (2023) Internet Fact-Checking using RTSVU (Rating Trust Score by Verified Users). 2023 IEEE 5th International Conference on Cybernetics, Cognition and Machine Learning Applications (ICCCMLA), 487-492. DOI: 10.1109/ICCCMLA58983.2023.10346853. Online publication date: 7-Oct-2023
  • (2023) Knowledge-Based Techniques for Document Fraud Detection: A Comprehensive Study. Computational Linguistics and Intelligent Text Processing, 17-33. DOI: 10.1007/978-3-031-24337-0_2. Online publication date: 26-Feb-2023
  • (2022) Knowledge Graphs and Explainable AI in Healthcare. Information, 13:10, 459. DOI: 10.3390/info13100459. Online publication date: 28-Sep-2022
  • (2022) Kelpie. Proceedings of the VLDB Endowment, 15:12, 3566-3569. DOI: 10.14778/3554821.3554845. Online publication date: 1-Aug-2022
  • (2022) Explaining Link Prediction Systems based on Knowledge Graph Embeddings. Proceedings of the 2022 International Conference on Management of Data, 2062-2075. DOI: 10.1145/3514221.3517887. Online publication date: 10-Jun-2022
  • (2022) What Matters for Shoppers: Investigating Key Attributes for Online Product Comparison. Advances in Information Retrieval, 231-239. DOI: 10.1007/978-3-030-99739-7_27. Online publication date: 5-Apr-2022
  • (2021) Modeling Inter-Claim Interactions for Verifying Multiple Claims. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 3503-3507. DOI: 10.1145/3459637.3482144. Online publication date: 26-Oct-2021
  • (2021) Knowledge Graph Embedding for Link Prediction. ACM Transactions on Knowledge Discovery from Data, 15:2, 1-49. DOI: 10.1145/3424672. Online publication date: 4-Jan-2021
  • (2020) RuleHub. Journal of Data and Information Quality, 12:4, 1-22. DOI: 10.1145/3409384. Online publication date: 15-Oct-2020
