Abstract
A central issue in Cross Language Information Retrieval (CLIR), and one that directly affects system performance, is how to exploit the available translation resources. The issue becomes more challenging for a language that lacks adequate translation resources. A second factor affecting CLIR performance is the degree of ambiguity of the query words. In this paper, we propose combining different translation resources for CLIR. We also propose two methods that exploit phrases during query translation to reduce the ambiguity of query words. Our evaluation on English-Persian CLIR shows that the phrase-based and combined-resource translation methods outperform the other CLIR methods evaluated.
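As an illustration of what combining translation resources can look like, the sketch below merges the candidate translations that several resources propose for a single query term. It assumes weighted linear interpolation of per-resource scores, which is a common choice but is not stated in the abstract as the authors' method. The function name `combine_translations`, the weights, and the placeholder target tokens are all hypothetical.

```python
from collections import defaultdict


def combine_translations(resources, weights):
    """Merge per-resource translation scores for one query term.

    resources: list of dicts mapping a target-language candidate to a
               non-negative score (hypothetical data, e.g. from a bilingual
               dictionary or a parallel corpus).
    weights:   one interpolation weight per resource (an assumption; the
               abstract does not say how resources are weighted).
    """
    if len(resources) != len(weights):
        raise ValueError("one weight per resource is required")

    combined = defaultdict(float)
    for scores, weight in zip(resources, weights):
        total = sum(scores.values())
        if total == 0:
            continue  # this resource offers no candidates for the term
        for target, score in scores.items():
            # Normalize within the resource, then apply its weight.
            combined[target] += weight * score / total

    # Renormalize so the merged scores form a probability distribution.
    norm = sum(combined.values())
    return {t: s / norm for t, s in combined.items()} if norm else {}


# Hypothetical candidates for one English query term.
dictionary_scores = {"fa_sense_a": 1.0, "fa_sense_b": 1.0}  # unweighted entries
parallel_scores = {"fa_sense_a": 0.8, "fa_sense_b": 0.2}    # corpus estimates
result = combine_translations([dictionary_scores, parallel_scores], [0.5, 0.5])
```

In this sketch the dictionary alone cannot rank the two candidates, while the corpus-derived scores break the tie, which is the general motivation for using more than one resource.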
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Azarbonyad, H., Shakery, A., Faili, H. (2013). Exploiting Multiple Translation Resources for English-Persian Cross Language Information Retrieval. In: Forner, P., Müller, H., Paredes, R., Rosso, P., Stein, B. (eds) Information Access Evaluation. Multilinguality, Multimodality, and Visualization. CLEF 2013. Lecture Notes in Computer Science, vol 8138. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40802-1_11
DOI: https://doi.org/10.1007/978-3-642-40802-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40801-4
Online ISBN: 978-3-642-40802-1
eBook Packages: Computer Science (R0)