research-article

Knowledge Base Embedding for Sampling-Based Prediction

Authors:

Yongyi Mao

ACM Transactions on Information Systems, Volume 41, Issue 2

Article No.: 28, Pages 1 - 25

https://doi.org/10.1145/3533769

Published: 08 April 2023

Abstract

Link prediction tasks differ in how many answers they admit. One task may expect only a couple of answers, while another may expect nearly a hundred. Given this, the performance of a link prediction model can be estimated more accurately if a flexible number of returned answers is evaluated instead of a predefined number. Motivated by this observation, in this article we analyze two evaluation criteria for link prediction tasks: the ranking-based protocol and the sampling-based protocol. We further study two classes of link prediction models, direct models and latent-variable models, and demonstrate that latent-variable models perform better under the sampling-based protocol. We then propose a latent-variable model built on the Conditional Variational AutoEncoder (CVAE) framework. Experiments suggest that the proposed model performs comparably to the current state of the art even under the conventional ranking-based protocol, and that under the sampling-based protocol it outperforms various state-of-the-art models.
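To make the contrast between a predefined and a flexible number of answers concrete, the following is a minimal toy sketch and not the protocols as defined in the full article. It assumes a hypothetical query with six candidate entities and two correct answers; the function names (`top_k_answers`, `sampled_answers`, `precision_recall`) and the toy scores are illustrative assumptions only.

```python
import random


def top_k_answers(scores, k):
    """Ranking-style prediction: always return exactly k candidates
    (or all candidates if fewer than k exist), highest score first."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


def sampled_answers(probs, num_samples, rng):
    """Sampling-style prediction: draw candidates from a distribution and
    keep the distinct ones, so the answer-set size adapts to the query."""
    candidates = list(probs)
    weights = [probs[c] for c in candidates]
    draws = rng.choices(candidates, weights=weights, k=num_samples)
    return set(draws)


def precision_recall(predicted, gold):
    """Precision and recall of a predicted answer set against the gold set."""
    predicted = set(predicted)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical model output for one query: two plausible answers, four unlikely ones.
    scores = {"a": 0.50, "b": 0.45, "c": 0.02, "d": 0.01, "e": 0.01, "f": 0.01}
    gold = {"a", "b"}

    # A fixed k = 5 forces three low-scoring candidates into the answer list.
    print("top-5:", precision_recall(top_k_answers(scores, 5), gold))

    # Sampling returns however many distinct candidates the draws produce.
    rng = random.Random(0)
    sampled = sampled_answers(scores, 10, rng)
    print("sampled:", sorted(sampled), precision_recall(sampled, gold))
```

In this toy setting, a fixed cutoff that is larger than the true number of answers pads the output with unlikely candidates, whereas the size of the sampled set follows the shape of the model's distribution. How the article actually scores sampled answers is not stated in the abstract.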

Index Terms

Knowledge Base Embedding for Sampling-Based Prediction
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking
      1. Probabilistic retrieval models


Published In

ACM Transactions on Information Systems Volume 41, Issue 2

April 2023

770 pages

ISSN:1046-8188

EISSN:1558-2868

DOI:10.1145/3568971

Editor:
Min Zhang
Tsinghua University, China

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 April 2023

Online AM: 11 June 2022

Accepted: 11 April 2022

Revised: 06 March 2022

Received: 21 June 2021

Published in TOIS Volume 41, Issue 2

Qualifiers

Research-article

Funding Sources

National Key R&D Program of China
Fundamental Research Funds for the Central Universities
State Key Laboratory of Software Development Environment

Bibliometrics

Total Citations: 15
Total Downloads: 295
Downloads (Last 12 months): 131
Downloads (Last 6 weeks): 23

Reflects downloads up to 15 Oct 2024
