
Document-Level Relation Extraction Based on Machine Reading Comprehension and Hybrid Pointer-sequence Labeling

Published: 26 June 2024

Abstract

Document-level relation extraction requires reading, memorization, and reasoning to discover relevant facts spread across multiple sentences. Current hierarchical-network and graph-network methods struggle to fully capture the structural information underlying a document and to reason naturally from context. Unlike previous methods, this article recasts relation extraction as a machine reading comprehension task: each entity pair and candidate relation is expressed as a question template, and extracting entities and relations becomes identifying answers in the context. To strengthen the extraction model's context comprehension and achieve more precise extraction, we introduce large language models (LLMs) during question construction to generate exemplary answers. In addition, to handle the multi-label and multi-entity problems in documents, we propose a new answer extraction model based on hybrid pointer-sequence labeling, which improves the model's reasoning ability and allows zero or multiple answers to be extracted from a document. Extensive experiments on three public datasets show that the proposed method is effective.
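To make the answer-extraction idea concrete, below is a minimal sketch (not the authors' released code) of a hybrid pointer/sequence-labeling head in PyTorch. It assumes token-level hidden states from a BERT-style encoder have already been computed for the concatenated question-plus-document input; the class and function names (HybridAnswerHead, decode_spans) are illustrative assumptions.

```python
# A minimal sketch, assuming PyTorch and a BERT-style encoder that has already
# produced token-level hidden states for the concatenated question + document.
# Class and function names are illustrative, not the authors' implementation.
import torch
import torch.nn as nn


class HybridAnswerHead(nn.Module):
    """Combines a pointer component (per-token start/end logits) with a
    BIO sequence-labeling component so that zero, one, or many answer
    spans can be decoded from a single document."""

    def __init__(self, hidden_size: int, num_bio_tags: int = 3):
        super().__init__()
        self.start_classifier = nn.Linear(hidden_size, 1)            # pointer: span starts
        self.end_classifier = nn.Linear(hidden_size, 1)              # pointer: span ends
        self.bio_classifier = nn.Linear(hidden_size, num_bio_tags)   # sequence labels: O/B/I

    def forward(self, hidden_states: torch.Tensor):
        # hidden_states: (batch, seq_len, hidden_size) from the encoder
        start_logits = self.start_classifier(hidden_states).squeeze(-1)  # (batch, seq_len)
        end_logits = self.end_classifier(hidden_states).squeeze(-1)      # (batch, seq_len)
        bio_logits = self.bio_classifier(hidden_states)                  # (batch, seq_len, tags)
        return start_logits, end_logits, bio_logits


def decode_spans(start_logits, end_logits, bio_tags, threshold=0.5):
    """Greedy decoding for one sequence (1-D logits, list of BIO tag ids, 0 = 'O').
    A span is kept only when both pointer heads fire above the threshold AND the
    BIO tags agree, so the model can legitimately return an empty answer set."""
    starts = (torch.sigmoid(start_logits) > threshold).nonzero(as_tuple=True)[0].tolist()
    ends = (torch.sigmoid(end_logits) > threshold).nonzero(as_tuple=True)[0].tolist()
    spans = []
    for s in starts:
        candidates = [e for e in ends if e >= s]        # nearest end not before the start
        if not candidates:
            continue
        e = min(candidates)
        if all(tag != 0 for tag in bio_tags[s:e + 1]):  # every token tagged B or I
            spans.append((s, e))
    return spans


# Example usage with random tensors (batch of 1, 128 tokens, hidden size 768):
head = HybridAnswerHead(hidden_size=768)
hidden = torch.randn(1, 128, 768)
start_logits, end_logits, bio_logits = head(hidden)
bio_tags = bio_logits[0].argmax(dim=-1).tolist()
answers = decode_spans(start_logits[0], end_logits[0], bio_tags)  # may be an empty list
```

The intersection rule in decode_spans is only the simplest way to show how the pointer and sequence-labeling views can complement each other; the fusion actually used in the paper may differ.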



    Published In

    ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 23, Issue 7, July 2024, 254 pages
    EISSN: 2375-4702
    DOI: 10.1145/3613605
    Editor: Imed Zitouni

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Publication History

    Received: 13 March 2023
    Revised: 20 March 2024
    Accepted: 21 May 2024
    Online AM: 01 June 2024
    Published: 26 June 2024
    Published in TALLIP Volume 23, Issue 7


    Author Tags

    1. Document-level relation extraction
    2. machine reading comprehension
    3. large language model

    Qualifiers

    • Research-article

    Funding Sources

    • National Key Research and Development Program of China
    • National Natural Science Foundation of China
    • 14th Five-Year Scientific Research Plan of the National Language Commission

