DOI: 10.3115/1118627.1118629

Learning a translation lexicon from monolingual corpora

Published: 12 July 2002

Abstract

This paper presents work on the task of constructing a word-level translation lexicon purely from unrelated monolingual corpora. We combine various clues such as cognates, similar context, preservation of word similarity, and word frequency. Experimental results for the construction of a German-English noun lexicon are reported. A noun translation accuracy of 39%, scored against a parallel test corpus, was achieved.

Published In

ULA '02: Proceedings of the ACL-02 workshop on Unsupervised lexical acquisition - Volume 9
July 2002
77 pages

Publisher

Association for Computational Linguistics

United States
