research-article

MABERT: Mask-Attention-Based BERT for Chinese Event Extraction

Authors:

Yang Xiang

ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 7

Article No.: 192, Pages 1 - 21

https://doi.org/10.1145/3597455

Published: 20 July 2023

Abstract

Event extraction is an essential but challenging task in information extraction, and it has benefited considerably from pre-trained language models such as BERT. However, when addressing the trigger-word mismatch problem in languages without natural delimiters, existing methods overlook lexical information as a complement to BERT. In addition, the inherent multi-role noise problem can limit performance when one sentence contains multiple events. In this article, we propose a Mask-Attention-based BERT (MABERT) framework for Chinese event extraction to address both problems. First, to avoid trigger-word mismatch and integrate lexical features directly into the BERT layers, we devise a mask-attention-based transformer, augmented with two mask matrices, to replace the original transformer in BERT. With this transformer, the character sequence interacts sufficiently with external lexical semantics while retaining its structural information. Second, to counter the multi-role noise problem, we use event type information at two levels, representation and classification, to enrich entity features, proposing type markers and an event-schema-based mask matrix. Experimental results on the widely used ACE2005 dataset show the effectiveness of MABERT on the Chinese event extraction task compared with other state-of-the-art methods.
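
The idea of restricting attention with a mask matrix can be illustrated with a small NumPy sketch. This is not the MABERT implementation: the abstract does not say how the two mask matrices are built, so the masking rule below is an assumption chosen only for illustration (characters attend to all characters and to the lexicon words that cover them; each word attends to itself and to the characters it covers). The function names `masked_attention` and `build_char_word_mask`, the word spans, and the dimensions are all hypothetical.

```python
import numpy as np

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention; mask[i, j] == 1 lets position i attend to j."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Disallowed pairs get a large negative score, so softmax gives them ~0 weight.
    scores = np.where(mask.astype(bool), scores, -1e9)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

def build_char_word_mask(n_chars, word_spans):
    """Hypothetical mask over [characters ; matched lexicon words] (an assumption)."""
    n = n_chars + len(word_spans)
    mask = np.zeros((n, n), dtype=int)
    mask[:n_chars, :n_chars] = 1          # characters see every character
    for w, (start, end) in enumerate(word_spans):
        idx = n_chars + w
        mask[idx, idx] = 1                # a word sees itself
        mask[start:end, idx] = 1          # covered characters see the word
        mask[idx, start:end] = 1          # the word sees the characters it covers
    return mask

# Toy input: 4 characters plus 2 lexicon words covering characters [0, 2) and [2, 4).
rng = np.random.default_rng(0)
n_chars, word_spans, dim = 4, [(0, 2), (2, 4)], 8
x = rng.normal(size=(n_chars + len(word_spans), dim))
mask = build_char_word_mask(n_chars, word_spans)
out, weights = masked_attention(x, x, x, mask)
```

In this toy setup the two word positions never attend to each other, and a character receives lexical signal only from words that contain it, which is one plausible way to let lexical features interact with the character sequence without disturbing its order.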

Cited By

Zhang G, Xie F, Yu L. (2024). MaskDGNets: Masked-attention guided dynamic graph aggregation network for event extraction. PLOS ONE 19:11 (e0306673). Online publication date: 15-Nov-2024.
https://doi.org/10.1371/journal.pone.0306673
Sun J, Xiao K, Hu S, Tang J, Zhao R. (2023). Causal Knowledge Integrated with Attention for Interpretable Event Detection. 2023 9th International Conference on Big Data and Information Analytics (BigDIA) (707-714). Online publication date: 15-Dec-2023.
https://doi.org/10.1109/BigDIA60676.2023.10429095
Dai L, Wang B, Xiang W, Mo Y. (2023). Modeling Character–Word Interaction via a Novel Mesh Transformer for Chinese Event Detection. Neural Processing Letters 55:8 (11429-11448). Online publication date: 11-Sep-2023.
https://doi.org/10.1007/s11063-023-11382-2

Index Terms

MABERT: Mask-Attention-Based BERT for Chinese Event Extraction
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction
      2. Lexical semantics

Published In

ACM Transactions on Asian and Low-Resource Language Information Processing Volume 22, Issue 7

July 2023

422 pages

ISSN:2375-4699

EISSN:2375-4702

DOI:10.1145/3610376

Editor:
Imed Zitouni
Google, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 July 2023

Online AM: 19 May 2023

Accepted: 12 May 2023

Revised: 27 February 2023

Received: 11 December 2021

Published in TALLIP Volume 22, Issue 7

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
