Abstract
The rapid growth of the so-called Web 2.0 has changed the behavior of web users. A new democratic vision has emerged, in which users can actively contribute to the evolution of the Web by producing new content or enriching existing content with user-generated metadata. In this context, the use of tags, keywords freely chosen by users to describe and organize resources, has spread as a model for browsing and retrieving web content. The success of this collaborative model rests on two factors: first, information is organized in a way that closely reflects the users’ mental model; second, the absence of a controlled vocabulary lowers the users’ learning curve and allows vocabularies to evolve. However, since tags are handled in a purely syntactic way, the annotations provided by users generate a very sparse and noisy tag space, which limits the effectiveness of tagging for complex tasks. Consequently, tag recommenders, which suggest the most suitable tags for the resource being annotated, have recently emerged as a way of speeding up tag convergence. The contribution of this work is a tag recommender system that implements both a collaborative and a content-based recommendation technique. The former exploits user and community tagging behavior to produce recommendations, while the latter uses heuristics to extract tags directly from the textual content of resources. Experiments carried out on a dataset gathered from Bibsonomy show that hybrid recommendation strategies can outperform single ones, and that the way the two techniques are combined matters for obtaining more accurate results.
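To make the idea of a hybrid strategy concrete, the following is a minimal sketch and not the system evaluated in the article. The abstract does not specify the heuristics or the combination scheme, so everything here is an assumption made for illustration: the function names, the stopword list, the use of plain term frequency as the content-based heuristic, the use of tag frequency as the collaborative signal, and the weight `alpha` of the linear combination.

```python
import re
from collections import Counter

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "in", "to", "with"}


def normalized_frequencies(items):
    """Map each item to its count divided by the highest count (range 0..1)."""
    counts = Counter(items)
    if not counts:
        return {}
    top = max(counts.values())
    return {item: count / top for item, count in counts.items()}


def content_based_scores(text):
    """Assumed content heuristic: score candidate tags by term frequency in the text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    candidates = [t for t in tokens if t not in STOPWORDS and len(t) > 2]
    return normalized_frequencies(candidates)


def collaborative_scores(tags_used_on_similar_resources):
    """Assumed collaborative signal: how often each tag was used by the user or community."""
    return normalized_frequencies(tags_used_on_similar_resources)


def combine(collab, content, alpha=0.5, k=5):
    """Linear combination of the two score maps; alpha weights the collaborative side."""
    tags = set(collab) | set(content)
    merged = {
        t: alpha * collab.get(t, 0.0) + (1 - alpha) * content.get(t, 0.0)
        for t in tags
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(merged, key=lambda t: (-merged[t], t))[:k]


if __name__ == "__main__":
    collab = collaborative_scores(["web", "python", "python"])
    content = content_based_scores("Tag recommendation for the web: web tagging")
    print(combine(collab, content, alpha=0.5, k=3))
```

Changing `alpha` shifts the ranking toward one source or the other, which is one simple way to see why the choice of combination can affect accuracy.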
Cite this article
Lops, P., de Gemmis, M., Semeraro, G. et al. Content-based and collaborative techniques for tag recommendation: an empirical evaluation. J Intell Inf Syst 40, 41–61 (2013). https://doi.org/10.1007/s10844-012-0215-6
DOI: https://doi.org/10.1007/s10844-012-0215-6