DOI: 10.1145/2837185.2837239
research-article

REMed: automatic relation extraction from medical documents

Published: 11 December 2015

Abstract

The large volume of unstructured medical documents written in natural language holds a massive quantity of knowledge whose extraction is useful. An automatic relation identification strategy leads to the discovery of relations, (possibly unknown) interactions, and associations between medical conditions, investigations, and treatments. This paper introduces REMed, a learning-based approach for the automatic discovery of relations between medical concepts. We propose an original list of features grouped into four categories, distributed as follows: lexical (3), context (6), grammatical (4), and syntactic (4). We analyzed the influence of each category on classification performance and determined that the performance of REMed is comparable with that of similar solutions. We report an overall F-measure of 74.9%, which outperforms the best result reported for similar systems by 1.2%. This performance was achieved mostly by the features from the lexical and context categories.
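
The abstract gives the feature counts per category and an overall F-measure, but not the formula behind the score. The sketch below tallies the stated counts and shows the standard F-measure as a weighted harmonic mean of precision and recall. It assumes the balanced variant (beta = 1), which the abstract does not confirm. The names `FEATURE_GROUPS` and `f_measure` are illustrative and do not come from the paper.

```python
# Feature counts per category, as stated in the abstract.
FEATURE_GROUPS = {"lexical": 3, "context": 6, "grammatical": 4, "syntactic": 4}


def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F1 score. Using F1 here is an assumption;
    the abstract only says "F-measure".
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing is predicted correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Total number of proposed features across the four categories.
total_features = sum(FEATURE_GROUPS.values())
```

For example, `f_measure(0.5, 0.5)` evaluates to 0.5. These inputs are hypothetical: the abstract reports only the combined score, not the underlying precision and recall.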

Cited By

  • (2018) Health social network analytics: utilizing social media to detect the outcome of chronic diseases (Preprint). Journal of Medical Internet Research. DOI: 10.2196/12876. Online publication date: 22-Nov-2018
  • (2018) Utilizing soft constraints to enhance medical relation extraction from the history of present illness in electronic medical records. Journal of Biomedical Informatics 87 (108-117). DOI: 10.1016/j.jbi.2018.09.013. Online publication date: Nov-2018
  • (2016) Learning relations using semantic-based vector similarity. 2016 IEEE 12th International Conference on Intelligent Computer Communication and Processing (ICCP) (69-75). DOI: 10.1109/ICCP.2016.7737125. Online publication date: Sep-2016

Published In

iiWAS '15: Proceedings of the 17th International Conference on Information Integration and Web-based Applications & Services
December 2015
704 pages
ISBN:9781450334914
DOI:10.1145/2837185
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. concept relation
  2. data correlation
  3. data mining
  4. dependency tree parser
  5. text mining

Qualifiers

  • Research-article

Conference

iiWAS '15
