DOI: 10.1145/2110363.2110378
research-article

A statistical medical summary translation system

Published: 28 January 2012

Abstract

In a hospital, a medical summary is indispensable to both clinicians and patients. In some countries where English is not the native language, however, summaries are written in English, which makes them hard for patients to read. In this paper we propose a framework for the rapid acquisition of bilingual medical summaries using machine translation (MT) techniques. We describe a medical summary corpus and several terminological databases prepared for the framework. We then discuss the challenges of adapting MT from a generic domain to a specific one, and propose a pattern translation scheme that achieves domain adaptation on top of a background statistical MT system. We identify significant patterns that capture the writing styles specific to medical summaries. The patterns are then translated with the involvement of doctors. Our main concern is to reduce the cost of translation and to better allocate the effort of domain experts. Experimental results show that the proposed methods are effective in terms of the significance and diversity of the patterns. Approaches to integrating the mined patterns into the background MT system are also discussed.


Cited By

  • (2014) Mining Professional Knowledge from Medical Records. Brain Informatics and Health, pages 152-163. DOI: 10.1007/978-3-319-09891-3_15
  • (2012) Outpatient Department Recommendation Based on Medical Summaries. Information Retrieval Technology, pages 518-527. DOI: 10.1007/978-3-642-35341-3_47


Published In

IHI '12: Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
January 2012
914 pages
ISBN: 9781450307819
DOI: 10.1145/2110363
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. machine translation
  2. medical summary
  3. pattern identification

Qualifiers

  • Research-article

Conference

IHI '12: ACM International Health Informatics Symposium
January 28 - 30, 2012
Miami, Florida, USA
