Lattice BLEU oracles in machine translation

Published: 03 January 2014

Abstract

The search space of Phrase-Based Statistical Machine Translation (PBSMT) systems can be represented as a directed acyclic graph (lattice). By exploring this search space, it is possible to analyze and understand the failures of PBSMT systems. Indeed, useful diagnoses can be obtained by computing the so-called oracle hypotheses, which are the hypotheses in the search space that have the highest quality score. For standard SMT metrics, however, this problem is NP-hard and can only be solved approximately. In this work, we present two new methods for efficiently computing oracles on lattices: the first is based on a linear approximation of the corpus BLEU score and is solved using generic shortest distance algorithms; the second relies on an Integer Linear Programming (ILP) formulation of oracle decoding that incorporates count clipping constraints. This formulation can be solved either directly with a standard ILP solver or with Lagrangian relaxation techniques. These new decoders are evaluated and compared with several alternatives from the literature for three language pairs, using lattices produced by two PBSMT systems.
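
As a rough illustration of the first approach, the following Python sketch scores lattice paths with a linear, unclipped n-gram match function and finds the best path by dynamic programming over a topologically ordered DAG. It is not the authors' implementation: the function name `linear_oracle`, the weights `theta0` and `theta`, the restriction to unigrams and bigrams, and the toy lattice are all assumptions made for this example.

```python
from collections import defaultdict


def linear_oracle(edges, start, final, reference, theta0=-0.1, theta=(1.0, 2.0)):
    """Return (score, words) of the best start-to-final path.

    edges: (src, dst, word) triples; node ids are assumed to be integers
    numbered in topological order (src < dst), and final must be reachable.
    The score is linear in n-gram matches: theta0 per word, theta[0] per
    unigram found in the reference, theta[1] per bigram found in it.
    Matches are NOT clipped (hypothetical weights, illustration only).
    """
    ref = reference.split()
    ref_unigrams = set(ref)
    ref_bigrams = set(zip(ref, ref[1:]))

    outgoing = defaultdict(list)
    nodes = {start, final}
    for src, dst, word in edges:
        outgoing[src].append((dst, word))
        nodes.update((src, dst))

    # best[node][last_word] = (score, words): the last word is part of the
    # state so that bigram matches can be scored locally on each edge.
    best = defaultdict(dict)
    best[start][None] = (0.0, [])
    for node in sorted(nodes):
        for prev, (score, words) in list(best[node].items()):
            for dst, word in outgoing[node]:
                gain = theta0
                if word in ref_unigrams:
                    gain += theta[0]
                if (prev, word) in ref_bigrams:
                    gain += theta[1]
                candidate = (score + gain, words + [word])
                if word not in best[dst] or candidate[0] > best[dst][word][0]:
                    best[dst][word] = candidate
    return max(best[final].values(), key=lambda item: item[0])


# Toy lattice (assumed for illustration): two alternatives at each position.
EDGES = [
    (0, 1, "the"), (0, 1, "a"),
    (1, 2, "cat"), (1, 2, "dog"),
    (2, 3, "sat"), (2, 3, "sits"),
]
score, words = linear_oracle(EDGES, 0, 3, "the cat sat")
print(" ".join(words))  # the cat sat
```

Because the score decomposes over edges once the previous word is part of the state, a single best-path pass suffices. The price is that matches are unclipped: a path repeating a reference word is rewarded each time, which is the kind of effect the count clipping constraints of the ILP formulation are meant to rule out.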


Cited By

  • (2014) Learning to translate queries for CLIR. Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. 10.1145/2600428.2609539 (1179-1182). Online publication date: 3-Jul-2014

Published In

ACM Transactions on Speech and Language Processing, Volume 10, Issue 4
December 2013
206 pages
ISSN:1550-4875
EISSN:1550-4883
DOI:10.1145/2560566
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 03 January 2014
Accepted: 01 July 2013
Revised: 01 May 2013
Received: 01 January 2013
Published in TSLP Volume 10, Issue 4

Author Tags

  1. BLEU
  2. Integer linear programming
  3. lattices
  4. machine translation
  5. oracle decoding

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • OSEO under the Quaero program
