DOI: 10.3115/1220355.1220472

Cognate mapping: a heuristic strategy for the semi-supervised acquisition of a Spanish lexicon from a Portuguese seed lexicon

Published: 23 August 2004

Abstract

We deal with the automated acquisition of a Spanish medical subword lexicon from an already existing Portuguese seed lexicon. Using two non-parallel monolingual corpora, we determined Spanish lexeme candidates from Portuguese seed lexicon entries by heuristic cognate mapping. We validated the emergent lexical translation hypotheses by determining the similarity of fixed-window context vectors on the basis of Portuguese and Spanish text corpora.
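
The two steps named in the abstract, candidate generation by cognate mapping and validation by context-vector similarity, can be illustrated with a small sketch. It is not the authors' implementation: the substitution rules, the toy corpora, the window size, the seed pairs, and the function names (`cognate_candidates`, `context_vector`, `translate_vector`, `cosine`) are all assumptions made for illustration. The sketch rewrites a Portuguese entry with string substitution rules, keeps only the candidates attested in a Spanish corpus, and compares fixed-window context vectors after projecting the Portuguese contexts through already known translation pairs.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Hypothetical Portuguese -> Spanish substitution rules (illustrative only,
# not taken from the paper).
RULES = [("ção", "ción"), ("nh", "ñ"), ("lh", "j")]


def cognate_candidates(pt_word, rules=RULES):
    """Return every string obtained by applying any subset of the rules."""
    candidates = set()
    for mask in product([False, True], repeat=len(rules)):
        word = pt_word
        for apply_rule, (src, tgt) in zip(mask, rules):
            if apply_rule:
                word = word.replace(src, tgt)
        candidates.add(word)
    return candidates


def context_vector(tokens, target, window=2):
    """Count tokens found within `window` positions of each `target` hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts


def translate_vector(vec, seed):
    """Project a Portuguese context vector onto Spanish via known pairs."""
    out = Counter()
    for tok, n in vec.items():
        if tok in seed:
            out[seed[tok]] += n
    return out


def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Toy corpora and seed pairs (assumptions for this sketch).
pt_tokens = "a infecção causa febre e a infecção causa dor".split()
es_tokens = "la infección causa fiebre y la infección causa dolor".split()
seed = {"causa": "causa", "febre": "fiebre", "dor": "dolor"}

# Step 1: generate candidates and keep those attested in the Spanish corpus.
attested = cognate_candidates("infecção") & set(es_tokens)

# Step 2: score each attested candidate by context similarity.
pt_vec = translate_vector(context_vector(pt_tokens, "infecção"), seed)
scores = {c: cosine(pt_vec, context_vector(es_tokens, c)) for c in attested}
print(scores)
```

Applying every subset of rules grows exponentially with the number of rules, which is acceptable for three toy rules but would need pruning for a realistic rule set.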


        Published In

        COLING '04: Proceedings of the 20th international conference on Computational Linguistics
        August 2004
        1411 pages

        Publisher

        Association for Computational Linguistics

        United States

        Qualifiers

        • Article

        Acceptance Rates

        COLING '04 Paper Acceptance Rate 1,411 of 1,411 submissions, 100%;
        Overall Acceptance Rate 1,537 of 1,537 submissions, 100%

        Bibliometrics & Citations

        Bibliometrics

        Article Metrics

        • Downloads (Last 12 months): 60
        • Downloads (Last 6 weeks): 5
        Reflects downloads up to 31 Dec 2024

        Cited By

        • (2015) A Constraint Approach to Pivot-Based Bilingual Dictionary Induction. ACM Transactions on Asian and Low-Resource Language Information Processing 15:1 (1-26). DOI: 10.1145/2723144. Online publication date: 21-Nov-2015
        • (2012) Customizing search results for non-native speakers. Proceedings of the 21st ACM international conference on Information and knowledge management (1829-1833). DOI: 10.1145/2396761.2398526. Online publication date: 29-Oct-2012
        • (2005) Cross-language mining for acronyms and their completions from the web. Proceedings of the 8th international conference on Discovery Science (113-123). DOI: 10.1007/11563983_11. Online publication date: 8-Oct-2005
        • (2005) Translating biomedical terms by inferring transducers. Proceedings of the 10th conference on Artificial Intelligence in Medicine (236-240). DOI: 10.1007/11527770_34. Online publication date: 23-Jul-2005
