
Recognition of Patient-Related Named Entities in Noisy Tele-Health Texts

Authors: Mi-Young Kim, Ying Xu, Osmar R. Zaiane, Randy Goebel

ACM Transactions on Intelligent Systems and Technology (TIST), Volume 6, Issue 4

Article No.: 59, Pages 1 - 23

https://doi.org/10.1145/2651444

Published: 24 July 2015


Abstract

We explore methods for extracting information from clinical narratives captured by a public health telephone consultation service called HealthLink. Our research applies state-of-the-art natural language processing and machine learning to these narratives to extract information of interest. The currently available data consist of dialogues recorded by nurses while consulting patients by phone. Because the data are interviews transcribed by nurses during phone conversations, they contain a significant volume and variety of noise. To extract patient-related information from the noisy data, we have to remove or correct at least two kinds of noise: explicit noise, which includes spelling errors, unfinished sentences, omitted sentence delimiters, and term variants; and implicit noise, which includes non-patient information and untrustworthy patient information. To filter explicit noise, we propose our own biomedical term detection and normalization method, which resolves misspellings, term variations, and arbitrary abbreviations introduced by nurses. To detect temporal terms, temperature, and other types of named entities (which convey patients’ personal information such as age and sex), we propose a bootstrapping-based pattern learning process that captures a variety of arbitrary variations of named entities. To address implicit noise, we propose a dependency path-based filtering method. The result of our denoising is normalized patient information, and we visualize the named entities by constructing a graph that shows the relations between them. The objective of this knowledge discovery task is to identify associations between biomedical terms and to clearly expose trends in patients’ symptoms and concerns; the experimental results show that we achieve reasonable performance with our noise reduction methods.
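
To make the idea of explicit-noise filtering concrete, the following is a minimal sketch of term normalization, not the authors' method: the abstract does not specify their algorithm or term resources. It assumes a tiny hypothetical lexicon (`LEXICON`) and abbreviation table (`ABBREVIATIONS`), and swaps in string-similarity matching from Python's standard `difflib` module to map misspelled or abbreviated tokens to a canonical term. The names `normalize_term` and `normalize_note` and the `cutoff` value are likewise illustrative assumptions.

```python
from difflib import get_close_matches

# Hypothetical mini-lexicon of canonical terms (assumption for illustration;
# the paper's actual biomedical resources are not given in the abstract).
LEXICON = ["fever", "headache", "vomiting", "diarrhea", "cough", "temperature"]

# Hypothetical table of shorthand a note-taker might use (assumption).
ABBREVIATIONS = {"temp": "temperature", "h/a": "headache"}


def normalize_term(token, cutoff=0.8):
    """Map a noisy token to a canonical lexicon term, or None if no match."""
    word = token.lower().strip(".,;:!?")
    # 1. Expand known abbreviations.
    if word in ABBREVIATIONS:
        return ABBREVIATIONS[word]
    # 2. Accept exact lexicon hits.
    if word in LEXICON:
        return word
    # 3. Otherwise fall back to the most similar lexicon entry, if any
    #    reaches the similarity cutoff (difflib's SequenceMatcher ratio).
    matches = get_close_matches(word, LEXICON, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def normalize_note(text):
    """Return the canonical terms found in a whitespace-tokenized note."""
    normalized = (normalize_term(tok) for tok in text.split())
    return [term for term in normalized if term is not None]


if __name__ == "__main__":
    print(normalize_note("Pt has feever and headach, temp high"))
```

A stricter `cutoff` reduces false matches on ordinary words at the cost of missing heavier misspellings; a real system would presumably also need multi-word terms and a much larger vocabulary, which this sketch leaves out.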

