Abstract
This paper reports experiments performed in the course of the CLEF’09 Intellectual Property track, where our main goal was to study automatic query generation from patent documents. Two simple word weighting algorithms (a modified RATF formula and tf·idf) for selecting query keys from the patent documents were tested. We also investigated using different parts of the patent documents as sources of query keys. Our best runs placed relatively well compared to the other CLEF-IP’09 participants’ runs. This suggests that the tested approaches to automatic query generation could be useful and should be developed further. For three topics, the performance of the automatically extracted queries was compared to that of queries produced by three patent experts, to see whether the automatic keyword extraction algorithms appear able to extract relevant words from the topics.
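To make the idea of weighting-based query key selection concrete, the following is a minimal sketch of picking the top-weighted words from a topic document with a generic textbook tf·idf variant. It is not the weighting used in the paper: the function names (`tokenize`, `select_query_keys`), the smoothed idf form, the whitespace tokenizer, and the toy collection are all assumptions made for illustration, and the modified RATF formula is not shown because the abstract does not specify it.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def select_query_keys(topic_text, collection, k=5):
    """Return the k words of topic_text with the highest tf*idf weight.

    Assumed weighting: tf(w) * ln((N + 1) / (df(w) + 1)), where N is the
    number of documents in the collection and df(w) is the number of
    documents containing w.
    """
    n_docs = len(collection)

    # Document frequency: count each word once per document.
    df = Counter()
    for doc in collection:
        df.update(set(tokenize(doc)))

    # Term frequency within the topic document.
    tf = Counter(tokenize(topic_text))

    weights = {w: tf[w] * math.log((n_docs + 1) / (df[w] + 1)) for w in tf}

    # Sort by descending weight; break ties alphabetically for determinism.
    return sorted(weights, key=lambda w: (-weights[w], w))[:k]


# Toy data, invented for this example only.
collection = [
    "a valve controls fluid flow",
    "a pump moves fluid",
    "a sensor measures flow",
]
topic = "valve seat valve stem valve fluid a"
keys = select_query_keys(topic, collection, k=3)
print(keys)
```

In this toy setting, words that are frequent in the topic but rare in the collection rise to the top, while a word that occurs in every document receives a weight of zero. Restricting `topic_text` to a single field of a patent (for example, only the claims) would correspond to using different parts of the document as the source of query keys.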
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Järvelin, A., Järvelin, A., Hansen, P. (2010). UTA and SICS at CLEF-IP’09. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_55
DOI: https://doi.org/10.1007/978-3-642-15754-7_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15753-0
Online ISBN: 978-3-642-15754-7
eBook Packages: Computer Science, Computer Science (R0)