Bilingual LSA-based adaptation for statistical machine translation

Tam, Yik-Cheung; Lane, Ian; Schultz, Tanja

doi:10.1007/s10590-008-9045-2

Published: 19 November 2008

Volume 21, pages 187–207 (2007)
Machine Translation

Abstract

We propose a novel approach to cross-lingual language model and translation lexicon adaptation for statistical machine translation (SMT) based on bilingual latent semantic analysis (LSA). Bilingual LSA enables latent topic distributions to be transferred efficiently across languages by enforcing a one-to-one topic correspondence during training. Within the proposed bilingual LSA framework, model adaptation is performed by first inferring the topic posterior distribution of the source text and then applying the inferred distribution to an n-gram language model of the target language and to the translation lexicon via marginal adaptation. The background phrase table is enhanced with additional phrase scores computed using the adapted translation lexicon. The framework also supports rapid bootstrapping of LSA models for new languages from a source LSA model of another language. Our approach is evaluated on the Chinese–English MT06 test set using a medium-scale SMT system and the GALE SMT system, measured in BLEU and NIST scores. Both scores improve on both systems when the adapted language model and the adapted translation lexicon are applied individually. When the two are applied simultaneously, the gains are additive. Judged against the 95% confidence interval of the unadapted baseline system, the gain in both scores is statistically significant for the medium-scale SMT system, while the gain in the NIST score is statistically significant for the GALE SMT system.
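The abstract gives no formulas, so the following is only a minimal sketch of the language model side of the pipeline it describes, under stated assumptions. Topic weights for the source text are estimated with a simple EM fold-in over fixed topic-word distributions (a stand-in, not necessarily the inference procedure used in the article). The weights are then reused with target-language topics that share the same topic indices, and the resulting unigram marginal rescales a background bigram model in the usual dynamic-marginals form, p_adapted(w|h) ∝ (p_lsa(w) / p_bg(w))^β · p_bg(w|h). All function names, the toy vocabularies, the probabilities, and the value of β are hypothetical.

```python
from typing import Dict, List

Dist = Dict[str, float]


def infer_topic_weights(counts: Dict[str, int], src_topics: List[Dist],
                        iters: int = 50) -> List[float]:
    """EM fold-in: estimate topic weights of a source text while keeping the
    source topic-word distributions fixed (assumed stand-in for the paper's
    posterior inference)."""
    k = len(src_topics)
    theta = [1.0 / k] * k
    total = sum(counts.values())
    for _ in range(iters):
        expected = [0.0] * k
        for word, n in counts.items():
            # Responsibility of each topic for this word under current weights.
            joint = [theta[j] * src_topics[j].get(word, 1e-12) for j in range(k)]
            z = sum(joint)
            for j in range(k):
                expected[j] += n * joint[j] / z
        theta = [v / total for v in expected]
    return theta


def lsa_marginal(theta: List[float], tgt_topics: List[Dist],
                 vocab: List[str]) -> Dist:
    """Target-language unigram marginal: topic k in the source is assumed to
    correspond to topic k in the target (one-to-one correspondence)."""
    return {w: sum(t * topic.get(w, 0.0) for t, topic in zip(theta, tgt_topics))
            for w in vocab}


def adapt_bigram(bg_cond: Dict[str, Dist], bg_unigram: Dist,
                 marginal: Dist, beta: float = 0.7) -> Dict[str, Dist]:
    """Marginal adaptation: rescale each background probability by
    (p_lsa(w) / p_bg(w)) ** beta, then renormalize per history."""
    scale = {w: (marginal[w] / bg_unigram[w]) ** beta for w in bg_unigram}
    adapted = {}
    for history, dist in bg_cond.items():
        unnorm = {w: scale[w] * p for w, p in dist.items()}
        z = sum(unnorm.values())
        adapted[history] = {w: p / z for w, p in unnorm.items()}
    return adapted


# Toy data (hypothetical): topic 0 is "finance", topic 1 is "sports".
# Source words are romanized placeholders.
src_topics = [{"yinhang": 0.6, "bisai": 0.1, "de": 0.3},
              {"yinhang": 0.1, "bisai": 0.6, "de": 0.3}]
tgt_topics = [{"bank": 0.5, "match": 0.1, "the": 0.4},
              {"bank": 0.1, "match": 0.5, "the": 0.4}]
bg_unigram = {"bank": 0.2, "match": 0.2, "the": 0.6}
bg_cond = {"the": {"bank": 0.5, "match": 0.5}}

source_counts = {"yinhang": 3, "de": 2}  # a finance-flavoured source text
theta = infer_topic_weights(source_counts, src_topics)
marginal = lsa_marginal(theta, tgt_topics, list(bg_unigram))
adapted = adapt_bigram(bg_cond, bg_unigram, marginal)
print(theta, adapted["the"])
```

In this toy setup the source text contains only finance-related words, so the adapted model shifts probability mass after "the" from "match" toward "bank". The same topic weights could in principle rescale a translation lexicon in an analogous way, but the abstract does not specify how, so that step is not sketched here.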

Author information

Authors and Affiliations

Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA
Yik-Cheung Tam, Ian Lane & Tanja Schultz

Corresponding author

Correspondence to Yik-Cheung Tam.

About this article

Cite this article

Tam, YC., Lane, I. & Schultz, T. Bilingual LSA-based adaptation for statistical machine translation. Machine Translation 21, 187–207 (2007). https://doi.org/10.1007/s10590-008-9045-2

Received: 27 March 2008
Accepted: 31 October 2008
Published: 19 November 2008
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10590-008-9045-2
