Syntax-Based Post-Ordering for Efficient Japanese-to-English Translation

Published: 01 August 2013
    Abstract

    This article proposes a novel reordering method for efficient two-step Japanese-to-English statistical machine translation (SMT). The method isolates reordering from lexical translation and solves it afterward. This reordering problem, called post-ordering, is treated as an SMT problem from Head-Final English (HFE) to English. HFE is English that has been reordered on the basis of syntax, and it has been used very successfully for reordering in English-to-Japanese SMT. The proposed method carries that advantage over to the reverse direction, Japanese-to-English, and solves the post-ordering problem with accurate syntax-based SMT that uses target-language syntax. Because the intermediate HFE gives a good approximation, two-step SMT with the proposed post-ordering empirically reduces the decoding time of accurate but slow syntax-based SMT. In Japanese-to-English patent translation experiments, the proposed method decodes about six times faster than syntax-based SMT with comparable translation accuracy.
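
    To make the idea of a head-final intermediate word order concrete, the following is a minimal toy sketch, not the article's system. It assumes a hand-built binary tree in which each node records which child is its head, and it moves every head to the end of its constituent. The names `Node`, `english_order`, and `head_final_order` are hypothetical, and the actual HFE construction and the SMT-based post-ordering step (HFE back to English) involve details that this toy omits.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """A binary constituent; `head` is the index (0 or 1) of its head child."""
    children: List["Tree"]
    head: int


Tree = Union[Node, str]  # a leaf is simply a word


def english_order(tree: Tree) -> List[str]:
    """Read the words off the tree in their original left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree.children for word in english_order(child)]


def head_final_order(tree: Tree) -> List[str]:
    """Read the words off the tree with every head placed after its dependent."""
    if isinstance(tree, str):
        return [tree]
    head = tree.children[tree.head]
    dependent = tree.children[1 - tree.head]
    return head_final_order(dependent) + head_final_order(head)


# Hand-annotated toy tree for "John hit a ball" (an assumption for illustration):
# the sentence is headed by its verb phrase, the verb phrase by the verb,
# and the noun phrase by the noun.
sentence = Node(
    ["John", Node(["hit", Node(["a", "ball"], head=1)], head=0)],
    head=1,
)

print(" ".join(english_order(sentence)))     # John hit a ball
print(" ".join(head_final_order(sentence)))  # John a ball hit
```

    The second line of output follows a verb-final order resembling Japanese word order. In a two-step setup of the kind the abstract describes, lexical translation would produce a sequence in this kind of order, and a separate post-ordering step would then restore English word order.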

      Published In

      ACM Transactions on Asian Language Information Processing, Volume 12, Issue 3
      August 2013
      76 pages
      ISSN:1530-0226
      EISSN:1558-3430
      DOI:10.1145/2499955

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 01 August 2013
      Accepted: 01 December 2012
      Revised: 01 November 2012
      Received: 01 February 2012
      Published in TALIP Volume 12, Issue 3

      Author Tags

      1. Japanese-to-English translation
      2. long-distance reordering
      3. post-ordering
      4. statistical machine translation

      Qualifiers

      • Research-article
      • Research
      • Refereed
