
Mining translations of OOV terms from the web through cross-lingual query expansion

Published: 15 August 2005
    Abstract

    Translating out-of-vocabulary (OOV) terms is a major challenge for cross-lingual information retrieval and data-driven machine translation systems. Several approaches have been proposed to mine translations of OOV terms from the web, especially from pages that mix languages. In this paper, we propose a novel approach that translates OOV terms automatically, on the fly, through cross-lingual query expansion. The approach requires no web crawling and achieves an inclusion rate of 95% and an overall translation accuracy of 90%, outperforming state-of-the-art OOV translation techniques.

      Published In

      SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
      August 2005
      708 pages
      ISBN: 1595930345
      DOI: 10.1145/1076034
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. OOV terms
      2. automatic translation
      3. cross-lingual IR
      4. query expansion

      Qualifiers

      • Article

      Conference

      SIGIR05
      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%
