DOI: 10.3115/1220575.1220672

Word-sense disambiguation for machine translation

Published: 06 October 2005

Abstract

In word sense disambiguation, a system attempts to determine the sense of a word from contextual features. Major barriers to building a high-performing word sense disambiguation system include the difficulty of labeling data for this task and of predicting fine-grained sense distinctions. These issues stem partly from the fact that the task is being treated in isolation from possible uses of automatically disambiguated data. In this paper, we consider the related task of word translation, where we wish to determine the correct translation of a word from context. We can use parallel language corpora as a large supply of partially labeled data for this task. We present algorithms for solving the word translation problem and demonstrate a significant improvement over a baseline system. We then show that the word-translation system can be used to improve performance on a simplified machine-translation task and can effectively and accurately prune the set of candidate translations for a word.

Published In

HLT '05: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
October 2005
1054 pages

Publisher

Association for Computational Linguistics

United States

Acceptance Rates

HLT '05 paper acceptance rate: 127 of 402 submissions (32%)
Overall acceptance rate: 240 of 768 submissions (31%)
