DOI: 10.1145/3637528.3671857
research-article
Open access

Ontology Enrichment for Effective Fine-grained Entity Typing

Published: 24 August 2024

Abstract

Fine-grained entity typing (FET) is the task of identifying specific entity types at a fine-grained level for entity mentions based on their contextual information. Conventional methods for FET require extensive human annotation, which is time-consuming and costly given the massive scale of data. Recent studies have developed weakly supervised or zero-shot approaches. We study the zero-shot FET setting where only an ontology is provided. However, most existing ontology structures lack rich supporting information and even contain ambiguous relations, making them ineffective in guiding FET. Recently developed language models, though promising in various few-shot and zero-shot NLP tasks, may face challenges in zero-shot FET due to their lack of interaction with task-specific ontologies. In this study, we propose a method in which we (1) enrich each node in the ontology structure with two categories of extra information: instance information for training sample augmentation, and topic information to relate types with contexts, and (2) develop a coarse-to-fine typing algorithm that exploits the enriched information by training an entailment model with contrasting topics and instance-based augmented training samples. Our experiments show that our method achieves high-quality fine-grained entity typing without human annotation, outperforming existing zero-shot methods by a large margin and rivaling supervised methods. It also enjoys strong transferability to unseen and finer-grained types. We will open-source this work upon acceptance.
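The coarse-to-fine typing idea in the abstract can be illustrated with a small, self-contained sketch. Everything below is a hypothetical assumption for illustration: the toy ontology, the keyword-overlap scorer standing in for the paper's trained entailment model, and the threshold are invented here and are not the paper's implementation.

```python
# A minimal sketch (not the paper's code) of coarse-to-fine ontology-guided
# typing: walk the ontology from coarse to fine types, keeping at each level
# the child type best "entailed" by the mention's context.

ONTOLOGY = {  # parent type -> child types (a toy two-level ontology)
    "entity": ["person", "organization", "location"],
    "person": ["athlete", "politician"],
    "organization": ["company", "university"],
    "location": ["city", "country"],
}

# Toy stand-in for an entailment model: the fraction of a type's cue words
# that appear in the context. A real system would instead score a hypothesis
# like "<mention> is a <type>" against the context with an NLI model.
CUE_WORDS = {
    "person": ["he", "his", "she", "her"],
    "organization": ["company", "founded"],
    "location": ["city", "based"],
    "athlete": ["scored", "match", "team"],
    "politician": ["elected", "senator"],
    "company": ["profit", "shares"],
    "university": ["students", "campus"],
    "city": ["mayor", "downtown"],
    "country": ["border", "capital"],
}

def entailment_score(context: str, mention: str, type_name: str) -> float:
    """Score how plausible it is that `mention` has type `type_name` in
    `context` (toy keyword overlap; the mention itself is unused here)."""
    words = set(context.lower().split())
    cues = CUE_WORDS.get(type_name, [])
    return sum(1 for w in cues if w in words) / (len(cues) or 1)

def coarse_to_fine_type(context: str, mention: str,
                        root: str = "entity", threshold: float = 0.2) -> str:
    """Descend the ontology: at each level pick the highest-scoring child;
    stop when no child clears the threshold or a leaf type is reached."""
    current = root
    while current in ONTOLOGY:
        best_score, best_child = max(
            (entailment_score(context, mention, c), c)
            for c in ONTOLOGY[current]
        )
        if best_score < threshold:
            break  # no finer type is confidently entailed; stop at this level
        current = best_child
    return current

print(coarse_to_fine_type("Messi scored twice as his team won the match", "Messi"))
# with these toy cue words, descends entity -> person -> athlete
```

The threshold controls the coarse-to-fine trade-off: when no child type clears it, the traversal stops and returns the last confidently entailed coarse type rather than guessing a fine-grained one.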

Supplemental Material

MP4 File - promotional video
This is a short promotional video of the paper "Ontology Enrichment for Effective Fine-grained Entity Typing" published in the proceedings of KDD 2024.


Cited By

  • (2024) Grounding Language Models for Visual Entity Recognition. Computer Vision – ECCV 2024, pp. 393–411. https://doi.org/10.1007/978-3-031-73247-8_23. Online publication date: 1 Nov 2024.

Published In
KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2024
6901 pages
ISBN:9798400704901
DOI:10.1145/3637528
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. fine-grained entity typing
  2. language models
  3. natural language inference
  4. zero-shot learning


Conference

KDD '24

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


Article Metrics

  • Downloads (last 12 months): 333
  • Downloads (last 6 weeks): 89
Reflects downloads up to 15 Jan 2025

