Abstract
This paper presents a new method for finding documents whose topics are similar to those of an existing document set. It is designed to help mobile device users search a peer-to-peer environment for documents with topics similar to those on the user's own device. The algorithms are designed for slow processors, small memory, and low data traffic between devices. These properties make the method applicable in an environment of mobile devices such as phones or PDAs. The keyword-list-based topic comparison is enhanced with cascading, which leads to a series of document-searching elements, each specialized in documents not selected by the previous stages. The architecture, the employed algorithms, and the experimental results are presented in this paper.
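The cascading idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the `Stage` class, the `min_hits` threshold, the whitespace tokenization, and the function name `cascade_search` are all assumptions made for illustration. The sketch only shows the general pattern in which each stage holds its own keyword list and examines only the documents that earlier stages did not select.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One hypothetical cascade element: a keyword list plus a hit threshold."""
    keywords: frozenset
    min_hits: int = 1  # assumed selection rule, not taken from the paper

    def selects(self, text):
        # Naive tokenization (lowercase, split on whitespace) keeps the
        # per-document cost low, in the spirit of a constrained device.
        words = set(text.lower().split())
        return len(self.keywords & words) >= self.min_hits


def cascade_search(stages, documents):
    """Run the stages in order; each one sees only the still-unselected documents."""
    selected = []
    remaining = list(documents)
    for index, stage in enumerate(stages):
        still_unselected = []
        for doc in remaining:
            if stage.selects(doc):
                selected.append((index, doc))
            else:
                still_unselected.append(doc)
        remaining = still_unselected
    return selected, remaining


# Illustrative data only; the keyword lists and documents are made up.
stages = [
    Stage(frozenset({"network", "peer"}), min_hits=2),
    Stage(frozenset({"wireless", "mobile"}), min_hits=1),
]
docs = [
    "peer to peer network routing",
    "mobile phone battery",
    "cooking pasta recipes",
]
selected, rest = cascade_search(stages, docs)
```

In this sketch a later stage never re-examines a document that an earlier stage accepted, which is one way to read "specialized on documents not selected by previous stages"; how the real stages are trained and how their keyword lists are chosen is described in the paper itself.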
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Csorba, K., Vajk, I. (2008). Iterative Search for Similar Documents on Mobile Devices. In: Dengel, A.R., Berns, K., Breuel, T.M., Bomarius, F., Roth-Berghofer, T.R. (eds) KI 2008: Advances in Artificial Intelligence. KI 2008. Lecture Notes in Computer Science, vol 5243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85845-4_5
DOI: https://doi.org/10.1007/978-3-540-85845-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85844-7
Online ISBN: 978-3-540-85845-4
eBook Packages: Computer Science (R0)