Robust approach to abbreviating terms: a discriminative latent variable model with global information

Published: 02 August 2009

Abstract

This paper describes a robust approach to abbreviating terms. To incorporate non-local information into abbreviation generation, we present an implicit solution and an explicit one: a latent variable model and a label encoding approach with global information, respectively. Although the two approaches compete with each other, we demonstrate that they are also complementary. In experiments, the abbreviation generator that combines them achieved the best results for both Chinese and English. Moreover, we directly apply our generator to a very different task, abbreviation recognition. Experiments showed that the proposed model is robust and outperformed five of six state-of-the-art abbreviation recognizers.

Cited By

  • (2013) Learning Abbreviations from Chinese and English Terms by Modeling Non-Local Information. ACM Transactions on Asian Language Information Processing (TALIP), 12:2 (1-17). DOI: 10.1145/2461316.2461317. Online publication date: 1-Jun-2013
  • (2013) Probabilistic Chinese word segmentation with non-local information and stochastic training. Information Processing and Management: an International Journal, 49:3 (626-636). DOI: 10.1016/j.ipm.2012.12.003. Online publication date: 1-May-2013

Published In

ACL '09: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2
August 2009
595 pages
ISBN:9781932432466
  • General Chair: Keh-Yih Su

Publisher

Association for Computational Linguistics

United States

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 85 of 443 submissions, 19%
