DOI: 10.3115/1220175.1220257

Word alignment in English-Hindi parallel corpus using recency-vector approach: some studies

Published: 17 July 2006

Abstract

Word alignment using recency-vector based approaches has recently become popular. One major advantage of these techniques is that, unlike other approaches, they perform well even when the parallel corpus is small. This makes them worth studying for languages where resources are scarce. In this work we studied the performance of two widely used recency-vector based approaches, proposed in (Fung and McKeown, 1994) and (Somers, 1998), for word alignment in an English-Hindi parallel corpus. The performance of these algorithms was not found to be satisfactory. However, the subsequent addition of some new constraints improved the performance of the recency-vector based alignment technique significantly for this corpus. The present paper discusses the new version of the algorithm and its performance in detail.
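As a rough illustration of the general idea behind recency-vector alignment (not the modified algorithm this paper proposes, whose constraints are not given in the abstract), the sketch below builds a recency vector for a word as the gaps between its successive occurrences, then compares two such vectors with dynamic time warping, the matching technique named in the title of Fung and McKeown (1994). The function names, the absolute-difference local cost, and the token offsets are all assumptions made for the example.

```python
from typing import List, Sequence


def recency_vector(positions: Sequence[int]) -> List[int]:
    """Gaps between successive occurrences of one word (token offsets in the text)."""
    return [b - a for a, b in zip(positions, positions[1:])]


def dtw_distance(u: Sequence[int], v: Sequence[int]) -> float:
    """Plain dynamic time warping cost; local cost is |u[i] - v[j]| (an assumption)."""
    inf = float("inf")
    n, m = len(u), len(v)
    # cost[i][j] = cheapest warping of u[:i] onto v[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(u[i - 1] - v[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # stretch v
                cost[i][j - 1],      # stretch u
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# Hypothetical token offsets, invented purely for illustration.
english_positions = [10, 50, 120, 300]
hindi_candidate_a = [12, 55, 118, 310]   # occurs in roughly the same places
hindi_candidate_b = [5, 200, 210, 400]   # occurs in a different pattern

english_vec = recency_vector(english_positions)
score_a = dtw_distance(english_vec, recency_vector(hindi_candidate_a))
score_b = dtw_distance(english_vec, recency_vector(hindi_candidate_b))

# The candidate with the lower warping cost is the more plausible translation.
best = "a" if score_a < score_b else "b"
```

Because the comparison depends only on where words occur, not on co-occurrence counts over many sentence pairs, a method of this kind can plausibly be applied to small corpora, which is the advantage the abstract highlights.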

References

[1] L. Ahrenberg, M. Merkel, A. Sagvall Hein, and J. Tiedemann. 2000. Evaluation of word alignment systems. In Proc. 2nd International Conference on Linguistic Resources and Evaluation (LREC-2000), volume 3, pages 1255-1261, Athens, Greece.
[2] P. Brown, S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer. 1993. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2):263-311.
[3] I. Dagan, K. W. Church, and W. A. Gale. 1993. Robust bilingual word alignment for machine aided translation. In Proc. Workshop on Very Large Corpora: Academic and Industrial Perspectives, pages 1-8, Columbus, Ohio.
[4] P. Fung and K. McKeown. 1994. Aligning noisy parallel corpora across language groups: Word pair feature matching by dynamic time warping. In Technology Partnerships for Crossing the Language Barrier: Proc. First Conference of the Association for Machine Translation in the Americas, pages 81-88, Columbia, Maryland.
[5] W. A. Gale and K. W. Church. 1991. Identifying word correspondences in parallel texts. In Proc. Fourth DARPA Workshop on Speech and Natural Language, pages 152-157. Morgan Kaufmann Publishers, Inc.
[6] Jin-Xia Huang and Key-Sun Choi. 2000. Chinese-Korean word alignment based on linguistic comparison. In Proc. 38th Annual Meeting of the Association for Computational Linguistics, pages 392-399, Hong Kong.
[7] Ananthakrishnan Ramanathan and Durgesh D. Rao. 2003. A lightweight stemmer for Hindi. In Proc. Workshop of Computational Linguistics for South Asian Languages - Expanding Synergies with Europe, EACL-2003, pages 42-48, Budapest, Hungary.
[8] H. Somers. 1998. Further experiments in bilingual text alignment. International Journal of Corpus Linguistics, 3:115-150.
[9] Jörg Tiedemann. 2003. Combining clues for word alignment. In Proc. 10th Conference of the European Chapter of the Association for Computational Linguistics, pages 339-346, Budapest, Hungary.
      Published In

      ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
      July 2006
      1214 pages

      Publisher

      Association for Computational Linguistics

      United States

      Acceptance Rates

      Overall Acceptance Rate 85 of 443 submissions, 19%
