research-article

Deep Active Alignment of Knowledge Graph Entities and Schemata

Published: 20 June 2023

Abstract

Knowledge graphs (KGs) store rich facts about the real world. In this paper, we study KG alignment, which aims to find alignments not only between entities but also between relations and classes in different KGs. Alignment at the entity level can cross-fertilize alignment at the schema level. We propose a new KG alignment approach, called DAAKG, based on deep learning and active learning. With deep learning, it learns the embeddings of entities, relations and classes, and jointly aligns them in a semi-supervised manner. With active learning, it estimates how likely an entity, relation or class pair can be inferred, and selects the best batch for human labeling. We design two approximation algorithms that solve batch selection efficiently. Our experiments on benchmark datasets show the superior accuracy and generalization of DAAKG and validate the effectiveness of all its modules.
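
To make the batch-selection idea concrete, here is a minimal sketch of greedy batch selection in Python. It is not DAAKG's algorithm, which the abstract does not spell out. It assumes a hypothetical table `infer_prob` giving the probability that labeling one candidate pair lets another be inferred, treats those inferences as independent, and scores a batch by the expected number of candidate pairs it resolves. The names `Pair`, `expected_coverage`, `greedy_batch` and `infer_prob` are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate pair: (element in KG 1, element in KG 2).
Pair = Tuple[str, str]


def expected_coverage(batch: List[Pair],
                      candidates: List[Pair],
                      infer_prob: Dict[Tuple[Pair, Pair], float]) -> float:
    """Expected number of candidate pairs resolved if `batch` is labeled.

    Assumption (not from the paper): a pair is resolved when it is labeled
    directly, or when at least one labeled pair infers it, with inferences
    treated as independent events.
    """
    labeled = set(batch)
    total = 0.0
    for q in candidates:
        if q in labeled:
            total += 1.0
            continue
        miss = 1.0  # probability that no labeled pair infers q
        for p in batch:
            miss *= 1.0 - infer_prob.get((p, q), 0.0)
        total += 1.0 - miss
    return total


def greedy_batch(candidates: List[Pair],
                 infer_prob: Dict[Tuple[Pair, Pair], float],
                 budget: int) -> List[Pair]:
    """Pick up to `budget` pairs, each time adding the largest marginal gain."""
    batch: List[Pair] = []
    for _ in range(min(budget, len(candidates))):
        base = expected_coverage(batch, candidates, infer_prob)
        best, best_gain = None, float("-inf")
        for p in candidates:
            if p in batch:
                continue
            gain = expected_coverage(batch + [p], candidates, infer_prob) - base
            if gain > best_gain:  # ties keep the earlier candidate
                best, best_gain = p, gain
        batch.append(best)
    return batch
```

Under these assumptions the objective is monotone and submodular, so plain greedy selection is a standard approximation strategy for it. Each round here rescans every remaining candidate; faster variants trade some accuracy for fewer evaluations.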

Supplemental Material

MP4 File
Presentation video for SIGMOD 2023

Published In

Proceedings of the ACM on Management of Data  Volume 1, Issue 2
PACMMOD
June 2023
2310 pages
EISSN:2836-6573
DOI:10.1145/3605748
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. deep neural networks
  3. entity alignment
  4. knowledge graph
  5. schema matching

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 93
  • Downloads (Last 6 weeks): 15
Reflects downloads up to 31 Jan 2025

Cited By

  • (2024) Window Function Expression: Let the Self-Join Enter. Proceedings of the VLDB Endowment 17:9 (2162-2174). DOI: 10.14778/3665844.3665848. Online publication date: 6-Aug-2024
  • (2024) Proximity Queries on Point Clouds using Rapid Construction Path Oracle. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639261. Online publication date: 26-Mar-2024
  • (2024) DiffusionE: Reasoning on Knowledge Graphs via Diffusion-based Graph Neural Networks. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (222-230). DOI: 10.1145/3637528.3671997. Online publication date: 25-Aug-2024
  • (2024) ReliK: A Reliability Measure for Knowledge Graph Embeddings. Proceedings of the ACM Web Conference 2024 (2009-2019). DOI: 10.1145/3589334.3645430. Online publication date: 13-May-2024
  • (2024) Position-Aware Active Learning for Multi-Modal Entity Alignment. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (8215-8219). DOI: 10.1109/ICASSP48485.2024.10447624. Online publication date: 14-Apr-2024
  • (2023) Specification Mining over Temporal Data. Computers 12:9 (185). DOI: 10.3390/computers12090185. Online publication date: 14-Sep-2023
