Research article · ICMLC Conference Proceedings
DOI: 10.1145/3651671.3651709

BERT-Based Models with Attention Mechanism and Lambda Layer for Biomedical Named Entity Recognition

Published: 07 June 2024

Abstract

Biomedical named entity recognition (NER) is a crucial subtask of information extraction in natural language processing (NLP). Its objective is to identify and classify entities in biomedical text, and it plays a pivotal role in applications such as medical information retrieval and biomedical knowledge discovery. In this paper, we propose several enhanced versions of BERT-BiLSTM-CRF and BERT-IDCNN-CRF that incorporate an attention mechanism or a lambda layer to improve entity recognition accuracy. Specifically, the attention mechanism lets the model learn interrelationships among all words in the input sequence, while the lambda layer strengthens the model's capacity to capture semantic relationships between words while accounting for word order. We evaluate the proposed methods on the i2b2 2010 dataset and six additional biomedical datasets from the Biomedical Language Understanding and Reasoning Benchmark (BLURB): JNLPBA, BC2GM, BC5CDR, AnatEM, BioNLP-CG, and NCBI-disease. Experimental results show that our methods achieve higher accuracy than the original models, indicating stronger medical knowledge extraction capabilities.
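The two components named in the abstract differ mainly in how they mix information across the sequence. As a rough, framework-free sketch (plain NumPy, not the paper's actual implementation): scaled dot-product self-attention builds an n×n map of pairwise word interactions, whereas a content-only lambda layer (position lambdas omitted here for brevity) first summarizes keys and values into a small k×v matrix and then applies it to every query, which is linear rather than quadratic in sequence length.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    # Scaled dot-product attention: every position attends to every other,
    # so the model captures pairwise interrelationships across the sequence.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n, n) interaction map
    return softmax(scores, axis=-1) @ V

def lambda_content(Q, K, V):
    # Content lambda only (no position lambdas): keys are normalized over
    # sequence positions and condensed with the values into a (k, v) matrix,
    # which is then applied to each query independently.
    lam = softmax(K, axis=0).T @ V            # (k, v) content summary
    return Q @ lam                            # (n, v)

rng = np.random.default_rng(0)
n, d = 5, 8
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(self_attention(Q, K, V).shape)   # (5, 8)
print(lambda_content(Q, K, V).shape)   # (5, 8)
```

In a BERT-BiLSTM-CRF or BERT-IDCNN-CRF pipeline, a block like either of these would sit between the contextual encoder and the CRF decoding layer; the lambda variant trades the explicit n×n attention map for a fixed-size summary, which is why it scales better on long inputs.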


Published In

ICMLC '24: Proceedings of the 2024 16th International Conference on Machine Learning and Computing
February 2024, 757 pages
ISBN: 9798400709234
DOI: 10.1145/3651671

Publisher: Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. Attention Mechanism
    2. BERT-BiLSTM-CRF
    3. BERT-IDCNN-CRF
    4. Deep Learning
    5. Lambda Layer
    6. Named Entity Recognition
