research-article

MABERT: Mask-Attention-Based BERT for Chinese Event Extraction

Published: 20 July 2023

Abstract

Event extraction is an essential but challenging task in information extraction, and it has benefited considerably from pre-trained language models such as BERT. However, in languages without natural word delimiters, triggers can be mismatched with word boundaries (the trigger-word mismatch problem), and existing methods do not use lexical information to complement BERT. In addition, when one sentence contains multiple events, the inherent multi-role noise problem can limit performance. In this article, we propose a Mask-Attention-based BERT (MABERT) framework for Chinese event extraction to address both problems. First, to avoid trigger-word mismatch and integrate lexical features directly into the BERT layers, we devise a mask-attention-based transformer, augmented with two mask matrices, to replace the original transformer in BERT. With this transformer, the character sequence interacts fully with external lexical semantics while keeping its structural information. Second, to counter the multi-role noise problem, we use event type information to enrich entity features in two respects, representation and classification, by proposing type markers and an event-schema-based mask matrix. Experimental results on the widely used ACE2005 dataset show that MABERT is effective on the Chinese event extraction task compared with other state-of-the-art methods.
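
The abstract does not say how the two mask matrices are built, so the following is only an illustrative sketch of the general idea of mask-restricted attention over a joint character-and-word sequence, not the MABERT implementation. The function names `build_mask` and `masked_attention`, the `word_spans` input, and the visibility rule (every character sees every character; a matched lexicon word and a character see each other only when the word covers that character) are all assumptions made for this example.

```python
import math


def build_mask(num_chars, word_spans):
    """Return a 0/1 visibility matrix over [chars..., words...].

    word_spans is a hypothetical input: (start, end) character offsets,
    end exclusive, for each lexicon word matched in the sentence.
    """
    n = num_chars + len(word_spans)
    mask = [[0] * n for _ in range(n)]
    # Characters see every character, so the character sequence stays intact.
    for i in range(num_chars):
        for j in range(num_chars):
            mask[i][j] = 1
    # A word sees itself and the characters it covers, and vice versa.
    for w, (start, end) in enumerate(word_spans):
        k = num_chars + w
        mask[k][k] = 1
        for i in range(start, end):
            mask[i][k] = 1
            mask[k][i] = 1
    return mask


def masked_attention(queries, keys, values, mask):
    """Scaled dot-product attention; masked-out positions get zero weight."""
    d = len(queries[0])
    outputs = []
    for i, q in enumerate(queries):
        # Blocked positions receive -inf so that softmax assigns them 0.
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if mask[i][j] else -math.inf
            for j, k in enumerate(keys)
        ]
        top = max(scores)  # finite: every position sees at least itself
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Usage: four characters plus one lexicon word covering characters 1 and 2.
mask = build_mask(4, [(1, 3)])
vectors = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]
attended = masked_attention(vectors, vectors, vectors, mask)
```

In this sketch, the word position influences only the characters it spans, which is one plausible way to let lexical information reach the character sequence without changing its length or order.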

Cited By

  • (2024) MaskDGNets: Masked-attention guided dynamic graph aggregation network for event extraction. PLOS ONE 19:11 (e0306673). DOI: 10.1371/journal.pone.0306673. Online publication date: 15-Nov-2024
  • (2023) Causal Knowledge Integrated with Attention for Interpretable Event Detection. 2023 9th International Conference on Big Data and Information Analytics (BigDIA) (707-714). DOI: 10.1109/BigDIA60676.2023.10429095. Online publication date: 15-Dec-2023
  • (2023) Modeling Character–Word Interaction via a Novel Mesh Transformer for Chinese Event Detection. Neural Processing Letters 55:8 (11429-11448). DOI: 10.1007/s11063-023-11382-2. Online publication date: 11-Sep-2023

      Published In

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 7
      July 2023
      422 pages
      ISSN:2375-4699
      EISSN:2375-4702
      DOI:10.1145/3610376

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 20 July 2023
      Online AM: 19 May 2023
      Accepted: 12 May 2023
      Revised: 27 February 2023
      Received: 11 December 2021
      Published in TALLIP Volume 22, Issue 7

      Author Tags

      1. Event extraction
      2. mask-attention-based transformer
      3. event type markers
      4. event ontology

      Qualifiers

      • Research-article

      Funding Sources

      • National Natural Science Foundation of China

      Article Metrics

      • Downloads (Last 12 months): 198
      • Downloads (Last 6 weeks): 13
      Reflects downloads up to 22 Dec 2024
