Abstract
With the advent of Web 2.0, users are producing ever larger amounts of diverse data, which are stored in a wide variety of systems. Since users’ data spaces are scattered across these independent systems, data sharing becomes a challenging problem. Distributed search and recommendation provides a general solution for data sharing, and among its various alternatives, gossip-based approaches are particularly interesting as they provide scalability, dynamicity, autonomy and decentralized control. Generally, in these approaches each participant maintains a cluster of “relevant” users, which are later employed in query processing. However, as we show in the paper, considering only relevance in the construction of the cluster introduces a significant amount of redundancy among users, which in turn leads to reduced recall. Indeed, when a query is submitted, the high similarity among the users in a cluster increases the probability of retrieving the same set of relevant items, thus limiting the number of distinct results that can be obtained. In this paper, we propose a gossip-based search and recommendation approach that is based on diversity-based clustering scores. We present the resulting gossip-based clustering algorithms and validate them through experimental evaluation over four real datasets, based on MovieLens-small, MovieLens, LastFM and Delicious. Compared with state-of-the-art solutions, we show that taking diversity-based clustering scores into account yields major gains in recall while reducing the number of users involved in query processing.
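The redundancy argument above can be illustrated with a toy sketch. It is not the clustering score defined in the chapter: it assumes a simple greedy selection in the style of maximal marginal relevance, with Jaccard similarity over item sets standing in for both relevance and redundancy. The names `select_cluster`, `diversity_weight`, and the example profiles are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two item sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_cluster(me: Set[str], candidates: Dict[str, Set[str]],
                   k: int, diversity_weight: float) -> List[str]:
    """Greedily pick up to k users for the cluster.

    Assumption: an MMR-style score, not the chapter's own definition.
    diversity_weight = 0.0 reduces to relevance-only selection.
    """
    cluster: List[str] = []
    remaining = dict(candidates)
    while remaining and len(cluster) < k:
        def score(name: str) -> float:
            relevance = jaccard(me, remaining[name])
            # Redundancy: similarity to the closest user already selected.
            redundancy = max(
                (jaccard(remaining[name], candidates[c]) for c in cluster),
                default=0.0,
            )
            return ((1 - diversity_weight) * relevance
                    - diversity_weight * redundancy)

        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=score)
        cluster.append(best)
        del remaining[best]
    return cluster


def coverage(candidates: Dict[str, Set[str]], cluster: List[str]) -> Set[str]:
    """Distinct items reachable through the users in the cluster."""
    items: Set[str] = set()
    for name in cluster:
        items |= candidates[name]
    return items


# Hypothetical profiles: u1 and u2 are both relevant but nearly identical.
me = {"a", "b", "c", "d"}
candidates = {
    "u1": {"a", "b", "c", "x"},
    "u2": {"a", "b", "c", "y"},
    "u3": {"c", "d", "p", "q"},
    "u4": {"d", "r", "s", "t"},
}

relevance_only = select_cluster(me, candidates, k=2, diversity_weight=0.0)
diversified = select_cluster(me, candidates, k=2, diversity_weight=0.5)
print(relevance_only, len(coverage(candidates, relevance_only)))
print(diversified, len(coverage(candidates, diversified)))
```

On this toy data, relevance-only selection keeps the two near-duplicate users, while the diversified selection swaps one of them for a less similar user and reaches more distinct items with the same cluster size.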
Work conducted within the Institut de Biologie Computationnelle and partially funded by the labex NUMEV and the CNRS project Mastodons.
Acknowledgments
Experiments presented in this paper were carried out using the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Servajean, M., Pacitti, E., Liroz-Gistau, M., Amer-Yahia, S., El Abbadi, A. (2015). Increasing Coverage in Distributed Search and Recommendation with Profile Diversity. In: Hameurlain, A., Küng, J., Wagner, R. (eds) Transactions on Large-Scale Data- and Knowledge-Centered Systems XXII. Lecture Notes in Computer Science, vol 9430. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-48567-5_4
DOI: https://doi.org/10.1007/978-3-662-48567-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-48566-8
Online ISBN: 978-3-662-48567-5
eBook Packages: Computer Science, Computer Science (R0)