Abstract
Collaborative Filtering is one of the most popular recommendation algorithms. Most Collaborative Filtering algorithms work with a static set of data. This paper introduces a novel approach to providing recommendations with Collaborative Filtering when user ratings arrive over an incoming data stream. In such a stream, massive amounts of data arrive rapidly, making it impossible to save all the records for later analysis. By dynamically building a decision tree for every item as data arrive, the approach uses the incoming stream effectively, although it introduces an inevitable trade-off between accuracy and the amount of memory used. Adding a simple personalization step based on a hierarchy of the items improves the ratings predicted by each decision tree and makes it possible to generate recommendations in real time. Empirical studies with the dynamically built decision trees show that the personalization step improves the overall prediction accuracy.
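The abstract does not say how each per-item tree decides when to grow. One widely used criterion for building decision trees on streams is the Hoeffding bound of Domingos and Hulten (2000), and the sketch below illustrates that test only; it is not taken from the paper. The function names, the five-level rating scale, and the default `delta` are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the mean of
    n observations lies within epsilon of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n_seen: int,
                 n_classes: int = 5, delta: float = 1e-6) -> bool:
    """Decide whether a leaf has seen enough ratings to split.

    best_gain and second_gain are the information gains of the two best
    candidate attributes, computed from counts kept at the leaf.
    n_classes = 5 assumes a 1-5 rating scale (hypothetical).
    """
    # Information gain is bounded above by log2(number of classes).
    value_range = math.log2(n_classes)
    epsilon = hoeffding_bound(value_range, delta, n_seen)
    # Split only when the best attribute leads by more than epsilon.
    return (best_gain - second_gain) > epsilon


if __name__ == "__main__":
    print(should_split(0.30, 0.10, n_seen=50))
    print(should_split(0.30, 0.10, n_seen=5000))
```

Because the leaf keeps only sufficient statistics (counts) rather than the ratings themselves, the memory needed stays bounded as the stream grows. A looser `delta` makes leaves split sooner, at the price of more splits on an attribute that only looked best by chance.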
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barajas, J.M., Li, X. (2005). Collaborative Filtering on Data Streams. In: Jorge, A.M., Torgo, L., Brazdil, P., Camacho, R., Gama, J. (eds) Knowledge Discovery in Databases: PKDD 2005. Lecture Notes in Computer Science, vol 3721. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564126_42
DOI: https://doi.org/10.1007/11564126_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29244-9
Online ISBN: 978-3-540-31665-7
eBook Packages: Computer Science (R0)