Just Accepted

MedNER: Enhanced Named Entity Recognition in Medical Corpus via Optimized Balanced and Deep Active Learning

Authors: Yan Zhuang, Junyan Zhang, Ruogu Lu, Kunlun He, and Xiuxing Li

ACM Transactions on Intelligent Systems and Technology

Accepted on 27 April 2024

https://doi.org/10.1145/3678178

Online AM: 17 July 2024

Abstract

Ever-growing electronic medical corpora provide unprecedented opportunities for researchers to analyze patient conditions and drug effects. At the same time, processing large-scale electronic medical records raises severe challenges. First, emerging medical terms, including informal descriptions, are difficult to recognize. Second, although deep models can help extract entities from medical texts, they require large-scale labeled data, which is time-consuming to obtain and not always available in the medical domain. When massive numbers of unseen concepts appear or labeled data is insufficient, the performance of existing algorithms declines to an intolerable degree. In this paper, we propose MedNER, a balanced and deep active learning framework for Named Entity Recognition in medical corpora, to alleviate the above problems. Specifically, to describe our selection strategy precisely, we first define the uncertainty of a medical sentence as the labeling loss predicted by a loss-prediction module, and define diversity as the smallest text distance between any pair of sentences in a sample batch, computed from word-morpheme embeddings. Furthermore, to trade off uncertainty against diversity, we formulate a Distinct-K optimization problem that maximizes the minimum uncertainty and diversity of the chosen sentences. Finally, we propose a threshold-based approximate selection algorithm, Distinct-K Filter, which selects the most beneficial training samples by balancing diversity and uncertainty. Extensive experimental results on real datasets demonstrate that MedNER significantly outperforms existing approaches.
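To make the selection idea in the abstract concrete, the following is a minimal sketch of a threshold-based batch selector. It is not the paper's Distinct-K Filter, whose details are not given here. It assumes that per-sentence uncertainty scores are already available (standing in for the output of a loss-prediction module), that each sentence is represented by a fixed-length vector (standing in for word-morpheme embeddings), and that Euclidean distance serves as the text distance. The names `threshold_filter`, `select_batch`, `min_unc`, and `dist_thresholds` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def threshold_filter(
    uncertainty: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int,
    min_unc: float,
    min_dist: float,
) -> List[int]:
    """Greedily pick up to k indices whose uncertainty is at least min_unc and
    whose distance to every already-picked item is at least min_dist.
    Euclidean distance is an assumption made for this sketch."""
    # Visit candidates from most to least uncertain.
    order = sorted(range(len(uncertainty)), key=lambda i: -uncertainty[i])
    chosen: List[int] = []
    for i in order:
        if uncertainty[i] < min_unc:
            break  # every remaining candidate is even less uncertain
        if all(math.dist(embeddings[i], embeddings[j]) >= min_dist for j in chosen):
            chosen.append(i)
            if len(chosen) == k:
                break
    return chosen


def select_batch(
    uncertainty: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    k: int,
    min_unc: float,
    dist_thresholds: Sequence[float],
) -> Tuple[List[int], float]:
    """Try distance thresholds from strictest to loosest and return the first
    selection that fills the batch, together with the threshold that worked."""
    for min_dist in sorted(dist_thresholds, reverse=True):
        chosen = threshold_filter(uncertainty, embeddings, k, min_unc, min_dist)
        if len(chosen) == k:
            return chosen, min_dist
    # Fall back to uncertainty-only selection if no threshold fills the batch.
    return threshold_filter(uncertainty, embeddings, k, min_unc, 0.0), 0.0


# Toy data: five "sentences" with made-up scores and 2-D embeddings.
scores = [0.9, 0.85, 0.8, 0.3, 0.7]
vectors = [(0, 0), (0.1, 0), (3, 4), (10, 10), (6, 8)]
print(select_batch(scores, vectors, k=3, min_unc=0.5, dist_thresholds=[6.0, 4.0, 1.0]))
# → ([0, 2, 4], 4.0)
```

In this toy run, sentence 1 is nearly identical to sentence 0 and is skipped despite its high score, while sentence 3 is far from the others but too certain to be worth labeling. This illustrates why a selector might enforce both a minimum uncertainty and a minimum pairwise distance rather than ranking by either criterion alone.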

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Information extraction

Published In

ACM Transactions on Intelligent Systems and Technology, Just Accepted

ISSN: 2157-6904

EISSN: 2157-6912

Copyright © 2024 held by the owner/author(s).

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 17 July 2024

Accepted: 27 April 2024

Revised: 19 April 2024

Received: 10 April 2023

Qualifiers

Research-article
