DOI: 10.1145/3442381.3449995
research-article

Effective Named Entity Recognition with Boundary-aware Bidirectional Neural Networks

Published: 03 June 2021

Abstract

Named Entity Recognition (NER) is a fundamental problem in Natural Language Processing and has received much research attention. Although current neural NER approaches achieve state-of-the-art performance, their architectures still suffer from one or more of three problems: (1) boundary tag sparsity, (2) lack of global decoding information, and (3) boundary error propagation. In this paper, we propose Boundary-aware Bidirectional Neural Networks (Ba-BNN), a novel model that tackles these problems for neural NER. Ba-BNN is built on the structure of pointer networks, which addresses the first problem, boundary tag sparsity. We also use a boundary-aware binary classifier to capture global decoding information, which is fed as input to the decoders. Ba-BNN uses two decoders that process the information in opposite directions (left-to-right and right-to-left); the final hidden states of the left-to-right decoder are obtained by incorporating the hidden states of the right-to-left decoder during decoding. In addition, we propose a boundary retraining strategy to help reduce the boundary error propagation that pointer networks cause in boundary detection and entity classification. Extensive experiments on three NER benchmark datasets show that Ba-BNN outperforms the current state-of-the-art models.
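
As a rough illustration of the pointer-network mechanism the abstract builds on, the sketch below scores every token from a segment's start position onward with additive attention and "points" at the most likely end boundary. It is not the Ba-BNN implementation: the function names, the toy weights `w_enc`, `w_dec`, and `v`, and the use of plain Python lists instead of learned tensors are all assumptions made for illustration. The bidirectional decoders, the boundary-aware classifier, and boundary retraining are not modeled.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Matrix = Sequence[Vector]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix: Matrix, vector: Vector) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def pointer_distribution(encoder_states: Sequence[Vector], decoder_state: Vector,
                         w_enc: Matrix, w_dec: Matrix, v: Vector,
                         start: int) -> List[float]:
    """Additive-attention pointer over token positions start..n-1.

    Hypothetical helper: score_j = v . tanh(w_enc @ e_j + w_dec @ d),
    normalized with softmax over the candidate end positions.
    """
    if not 0 <= start < len(encoder_states):
        raise ValueError("start must index a token in the sentence")
    dec_proj = matvec(w_dec, decoder_state)
    scores = []
    for enc in encoder_states[start:]:
        enc_proj = matvec(w_enc, enc)
        hidden = [math.tanh(a + b) for a, b in zip(enc_proj, dec_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    return softmax(scores)


def point_to_segment_end(encoder_states: Sequence[Vector], decoder_state: Vector,
                         w_enc: Matrix, w_dec: Matrix, v: Vector,
                         start: int) -> int:
    """Return the absolute index of the most probable end boundary."""
    probs = pointer_distribution(encoder_states, decoder_state,
                                 w_enc, w_dec, v, start)
    best_offset = max(range(len(probs)), key=probs.__getitem__)
    return start + best_offset


if __name__ == "__main__":
    # Toy values only; a trained model would learn these parameters.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    states = [[0.1, 0.0], [0.9, 0.0], [2.0, 0.0], [-1.0, 0.0]]
    end = point_to_segment_end(states, [0.0, 0.0], identity, identity,
                               [1.0, 0.0], start=1)
    print(f"segment starting at token 1 ends at token {end}")
```

Because the pointer outputs a position rather than a per-token tag, each segment contributes one boundary decision, which is the usual motivation for using pointer networks when boundary tags are sparse.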

    Published In

    WWW '21: Proceedings of the Web Conference 2021
    April 2021
    4054 pages
    ISBN:9781450383127
    DOI:10.1145/3442381
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Named entity recognition
    2. bidirectional decoding
    3. boundary retraining
    4. pointer networks

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '21: The Web Conference 2021
    April 19 - 23, 2021
    Ljubljana, Slovenia

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 25
    • Downloads (last 6 weeks): 7
    Reflects downloads up to 21 Jan 2025

    Cited By

    • (2024) Dual Contrastive Learning for Cross-Domain Named Entity Recognition. ACM Transactions on Information Systems 42:6 (1-33). DOI: 10.1145/3678879. Online publication date: 18-Oct-2024
    • (2024) LinkNER: Linking Local Named Entity Recognition Models to Large Language Models using Uncertainty. Proceedings of the ACM Web Conference 2024 (4047-4058). DOI: 10.1145/3589334.3645414. Online publication date: 13-May-2024
    • (2024) MuJo-SF: Multimodal Joint Slot Filling for Attribute Value Prediction of E-Commerce Commodities. IEEE Transactions on Multimedia 26 (10354-10366). DOI: 10.1109/TMM.2024.3407667. Online publication date: 2024
    • (2024) Named Entity Recognition in User-Generated Text: A Systematic Literature Review. IEEE Access 12 (136330-136353). DOI: 10.1109/ACCESS.2024.3427714. Online publication date: 2024
    • (2024) Causal Relationship Extraction Combined Boundary Detection and Information Interaction. Knowledge Science, Engineering and Management (165-175). DOI: 10.1007/978-981-97-5489-2_15. Online publication date: 27-Jul-2024
    • (2023) Cross-Modality Graph-based Language and Sensor Data Co-Learning of Human-Mobility Interaction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:3 (1-25). DOI: 10.1145/3610904. Online publication date: 27-Sep-2023
    • (2023) Representation and Labeling Gap Bridging for Cross-lingual Named Entity Recognition. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (1230-1240). DOI: 10.1145/3539618.3591757. Online publication date: 19-Jul-2023
    • (2023) Online Anomalous Subtrajectory Detection on Road Networks with Deep Reinforcement Learning. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (246-258). DOI: 10.1109/ICDE55515.2023.00026. Online publication date: Apr-2023
    • (2023) Domain-Specific Entity Recognition as Token-Pair Relation Classification. IEEE Access 11 (118363-118371). DOI: 10.1109/ACCESS.2023.3327074. Online publication date: 2023
    • (2022) Exploring Modular Task Decomposition in Cross-domain Named Entity Recognition. Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (301-311). DOI: 10.1145/3477495.3531976. Online publication date: 6-Jul-2022
