DOI: 10.1145/2124295.2124329

Online selection of diverse results

Published: 08 February 2012

Abstract

The phenomenal growth in the volume of easily accessible information via various web-based services has made it essential for service providers to provide users with personalized representative summaries of such information. Further, online commercial services including social networking and micro-blogging websites, e-commerce portals, leisure and entertainment websites, etc. recommend interesting content to users that is simultaneously diverse on many different axes such as topic, geographic specificity, etc. The key algorithmic question in all these applications is the generation of a succinct, representative, and relevant summary from a large stream of data coming from a variety of sources. In this paper, we formally model this optimization problem, identify its key structural characteristics, and use these observations to design an extremely scalable and efficient algorithm. We analyze the algorithm using theoretical techniques to show that it always produces a nearly optimal solution. In addition, we perform large-scale experiments on both real-world and synthetically generated datasets, which confirm that our algorithm performs even better than its analytical guarantees in practice, and also outperforms other candidate algorithms for the problem by a wide margin.

Supplementary Material

JPG File (wsdm_day2_session1_3.jpg)
MP4 File (wsdm_day2_session1_3.mp4)

      Published In

      WSDM '12: Proceedings of the fifth ACM international conference on Web search and data mining
      February 2012
      792 pages
      ISBN: 9781450307475
      DOI: 10.1145/2124295

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. online algorithm
      2. result diversity

      Qualifiers

      • Research-article

      Acceptance Rates

      Overall Acceptance Rate 498 of 2,863 submissions, 17%

      Cited By

      • (2022) Online algorithms for the Santa Claus problem. Proceedings of the 36th International Conference on Neural Information Processing Systems, pp. 30732-30743. DOI: 10.5555/3600270.3602498. Online publication date: 28-Nov-2022
      • (2020) Serendipity-based Points-of-Interest Navigation. ACM Transactions on Internet Technology 20:4, pp. 1-32. DOI: 10.1145/3391197. Online publication date: 1-Oct-2020
      • (2020) Navigation leads for exploratory search and navigation in digital libraries. Knowledge and Information Systems. DOI: 10.1007/s10115-019-01434-2. Online publication date: 31-Jan-2020
      • (2019) Diversity in Machine Learning. IEEE Access 7, pp. 64323-64350. DOI: 10.1109/ACCESS.2019.2917620. Online publication date: 2019
      • (2018) Advisory Search and Security on Data Mining using Clustering Approaches. 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT), pp. 256-261. DOI: 10.1109/ICICCT.2018.8473252. Online publication date: Apr-2018
      • (2018) Relevant Filtering in a Distributed Content-based Publish/Subscribe System. NoSQL Data Models, pp. 203-244. DOI: 10.1002/9781119528227.ch7. Online publication date: 6-Aug-2018
      • (2016) Diversified set monitoring over distributed data streams. Proceedings of the 10th ACM International Conference on Distributed and Event-based Systems, pp. 1-12. DOI: 10.1145/2933267.2933298. Online publication date: 13-Jun-2016
      • (2015) TDV-based Filter for Novelty and Diversity in a Real-time Pub/Sub System. Proceedings of the 19th International Database Engineering & Applications Symposium, pp. 136-145. DOI: 10.1145/2790755.2790768. Online publication date: 13-Jul-2015
      • (2015) Diversity-Aware Top-k Publish/Subscribe for Text Stream. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 347-362. DOI: 10.1145/2723372.2749451. Online publication date: 27-May-2015
      • (2015) Multiple Radii DisC Diversity. ACM Transactions on Database Systems 40:1, pp. 1-43. DOI: 10.1145/2699499. Online publication date: 25-Mar-2015
