
Discovering Patterns for Fact Checking in Knowledge Graphs

Published: 07 May 2019

Abstract

This article presents a new framework that incorporates graph patterns to support fact checking in knowledge graphs. Our method discovers discriminant graph patterns to construct classifiers for fact prediction. First, we propose a class of graph fact checking rules (GFCs). A GFC incorporates graph patterns that best distinguish true and false facts of generalized fact statements. We provide statistical measures to characterize useful patterns that are both discriminant and diversified. Second, we show that it is feasible to discover GFCs in large graphs with optimality guarantees. We develop an algorithm that performs localized search to generate a stream of graph patterns and dynamically assembles the best GFCs from multiple GFC sets, where each set ensures quality scores within a certain range. The algorithm guarantees a (1/2−ϵ) approximation when it terminates, even if it terminates early. We also develop a space-efficient alternative that dynamically spawns prioritized patterns with the best marginal gains relative to the verified GFCs. It guarantees a (1−1/e) approximation. Both strategies guarantee a bounded time cost independent of the size of the underlying graph. Third, to support fact checking, we develop two classifiers, which use top-ranked GFCs as predictive rules or as instance-level features of the pattern matches induced by GFCs, respectively. Using real-world data, we experimentally verify the efficiency and effectiveness of GFC-based techniques for fact checking in knowledge graphs and demonstrate their application in knowledge exploration and news prediction.
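
The idea of picking patterns by best marginal gain can be illustrated with the classic greedy procedure for maximizing a monotone submodular set function under a size limit, which is known to achieve a (1−1/e) approximation. The sketch below is not the article's algorithm: it assumes a simple stand-in objective (the number of distinct facts covered by the chosen patterns), and the names `coverage`, `greedy_select`, and the toy `patterns` table are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def coverage(selected: List[str], covers: Dict[str, FrozenSet[int]]) -> int:
    """Stand-in quality score: number of distinct facts covered."""
    covered: Set[int] = set()
    for name in selected:
        covered |= covers[name]
    return len(covered)


def greedy_select(covers: Dict[str, FrozenSet[int]], k: int) -> List[str]:
    """Pick up to k patterns, each time taking the largest marginal gain."""
    selected: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        best_name, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for name in sorted(covers):
            if name in selected:
                continue
            # Marginal gain: facts this pattern adds beyond those already covered.
            gain = len(covers[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # no remaining pattern improves the score
        selected.append(best_name)
        covered |= covers[best_name]
    return selected


# Hypothetical toy data: each pattern maps to the ids of facts it covers.
patterns = {
    "P1": frozenset({1, 2, 3, 4}),
    "P2": frozenset({3, 4, 5}),
    "P3": frozenset({5, 6}),
    "P4": frozenset({1, 2}),
}
chosen = greedy_select(patterns, k=2)
print(chosen, coverage(chosen, patterns))
```

In this toy run the second pick is `P3` rather than the larger `P2`, because `P2` mostly overlaps with what `P1` already covers. That preference for non-redundant patterns is the intuition behind rewarding patterns that are both discriminant and diversified.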



    Published In

    Journal of Data and Information Quality  Volume 11, Issue 3
    Special Issue on Combating Digital Misinformation and Disinformation and On the Horizon
    September 2019
    160 pages
    ISSN:1936-1955
    EISSN:1936-1963
    DOI:10.1145/3331015
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 May 2019
    Accepted: 01 October 2018
    Revised: 01 August 2018
    Received: 01 May 2018
    Published in JDIQ Volume 11, Issue 3


    Author Tags

    1. Fact checking
    2. knowledge graph
    3. supervised graph pattern mining

    Qualifiers

    • Research-article
    • Research
    • Refereed


    Cited By

    • (2024) View-based Explanations for Graph Neural Networks. Proceedings of the ACM on Management of Data 2:1 (1-27). DOI: 10.1145/3639295. Online publication date: 26-Mar-2024
    • (2024) Automated Fact Checking Using A Knowledge Graph-based Model. 2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) (709-716). DOI: 10.1109/ICAIIC60209.2024.10463196. Online publication date: 19-Feb-2024
    • (2024) Correcting Inconsistencies in Knowledge Graphs with Correlated Knowledge. Big Data Research (100450). DOI: 10.1016/j.bdr.2024.100450. Online publication date: Mar-2024
    • (2024) Fact-Checking Generative AI: Ontology-Driven Biological Graphs for Disease-Gene Link Verification. Computational Science – ICCS 2024 (130-137). DOI: 10.1007/978-3-031-63772-8_12. Online publication date: 2-Jul-2024
    • (2023) A semantic metric for concepts similarity in knowledge graphs. Journal of Information Science 49:3 (778-791). DOI: 10.1177/01655515211020580. Online publication date: 1-Jun-2023
    • (2023) Reinforcement Learning-based Knowledge Graph Reasoning for Explainable Fact-checking. Proceedings of the International Conference on Advances in Social Networks Analysis and Mining (164-170). DOI: 10.1145/3625007.3627593. Online publication date: 6-Nov-2023
    • (2023) Discovering Frequency Bursting Patterns in Temporal Graphs. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (599-611). DOI: 10.1109/ICDE55515.2023.00052. Online publication date: Apr-2023
    • (2021) Modeling Inter-Claim Interactions for Verifying Multiple Claims. Proceedings of the 30th ACM International Conference on Information & Knowledge Management (3503-3507). DOI: 10.1145/3459637.3482144. Online publication date: 26-Oct-2021
    • (2021) Dynamic Relation Repairing for Knowledge Enhancement. IEEE Transactions on Knowledge and Data Engineering (1-1). DOI: 10.1109/TKDE.2021.3101237. Online publication date: 2021
    • (2021) Explaining Missing Data in Graphs: A Constraint-based Approach. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (1476-1487). DOI: 10.1109/ICDE51399.2021.00131. Online publication date: Apr-2021
