
On the Complexity of Query Result Diversification

Published: 26 May 2014

Abstract

Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q, and a positive integer k, the task is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query and, at the same time, as diverse as possible with respect to one another. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval, and operations research.
This article investigates the complexity of result diversification for relational queries. (1) We identify three problems in connection with query result diversification: determining whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, assessing the rank of a given k-element set, and counting how many k-element sets are ranked above a given bound based on an objective function. (2) We study these problems for a variety of query languages and for the three objective functions proposed in Gollapudi and Sharma [2009]. We establish upper and lower bounds for these problems, all matching, for both combined complexity and data complexity. (3) We also investigate several special settings of these problems, identifying tractable cases. (4) Moreover, we reinvestigate these problems in the presence of compatibility constraints commonly found in practice, and provide their complexity in all these settings.
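To make the three problems concrete, the following is a minimal brute-force sketch in Python. It is not taken from the article: the function names, the toy data, and the exact form of the max-sum style objective (a weighted sum of relevance and pairwise distance with a hypothetical trade-off parameter `lam`) are assumptions made for illustration, and "ranked above a bound" is read here as "score at least the bound". The enumeration is exponential in k and is meant only for tiny inputs.

```python
from itertools import combinations


def max_sum_score(subset, relevance, distance, lam=0.5):
    """Illustrative max-sum style objective (assumed form, not the article's).

    Combines total relevance with total pairwise distance; lam in [0, 1]
    trades relevance against diversity.
    """
    k = len(subset)
    rel = sum(relevance[t] for t in subset)
    div = sum(distance(a, b) for a, b in combinations(subset, 2))
    return (k - 1) * (1 - lam) * rel + 2 * lam * div


def k_subsets(answers, k):
    """Enumerate all k-element subsets of the query answers Q(D)."""
    return combinations(sorted(answers), k)


def exists_above(answers, k, bound, score):
    """Decision problem: does some k-element set score at least `bound`?"""
    return any(score(s) >= bound for s in k_subsets(answers, k))


def rank_of(candidate, answers, score):
    """Rank assessment: 1 + number of same-size sets scoring strictly higher."""
    target = score(candidate)
    k = len(candidate)
    return 1 + sum(1 for s in k_subsets(answers, k) if score(s) > target)


def count_above(answers, k, bound, score):
    """Counting problem: how many k-element sets score at least `bound`?"""
    return sum(1 for s in k_subsets(answers, k) if score(s) >= bound)


# Hypothetical toy instance: four answer tuples placed on a line.
points = {"a": 0, "b": 1, "c": 5, "d": 6}
relevance = {"a": 3, "b": 2, "c": 2, "d": 1}


def score(subset):
    # Distance between two tuples is the gap between their positions.
    return max_sum_score(
        subset, relevance, lambda x, y: abs(points[x] - points[y])
    )


print(exists_above(points, 2, 8, score))   # True: {a, d} scores 8.0
print(rank_of(("a", "c"), points, score))  # 2: only {a, d} scores higher
print(count_above(points, 2, 6, score))    # 4
```

In this toy instance the most relevant pair, {a, b}, scores only 3.5 because its two tuples are close together, while {a, d} wins by being far apart. This is the relevance versus diversity trade-off the objective function is meant to capture.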

Supplementary Material

a15-deng-apndx.pdf (deng.zip)
Supplemental movie, appendix, image, and software files for "On the Complexity of Query Result Diversification"

References

[1]
S. Abiteboul, R. Hull, and V. Vianu. 1995. Foundations of Databases. Addison-Wesley.
[2]
G. Adomavicius and A. Tuzhilin. 2005. Towards the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Engin. 17, 6, 734--749.
[3]
R. Agrawal, S. Gollapudi, A. Halverson, and S. Leong. 2009. Diversifying search results. In Proceedings of the International Conference on Web Search and Web Data Mining (WSDM'09). 5--14.
[4]
S. Amer-Yahia. 2011. Recommendation projects at Yahoo! IEEE Data Engin. Bull. 34, 2, 69--77.
[5]
S. Amer-Yahia, F. Bonchi, C. Castillo, E. Feuerstein, I. Mendez-Diaz, and P. Zabala. 2013. Complexity and algorithms for composite retrieval. In Proceedings of the International Conference on World Wide Web (WWW'13). 79--80.
[6]
G. Berbeglia and G. Hahn. 2010. Counting feasible solutions of the traveling salesman problem with pickups and deliveries is #P-complete. Discr. Appl. Math. 157, 11, 2541--2547.
[7]
A. Borodin, H. C. Lee, and Y. Ye. 2012. Max-sum diversification, monotone submodular functions and dynamic updates. In Proceedings of the ACM SIGACT-SIGMOD Symposium on Principles of Database Systems (PODS'12). 155--166.
[8]
G. Capannini, F. M. Nardini, R. Perego, and F. Silvestri. 2011. Efficient diversification of web search results. Proc. VLDB Endow. 4, 7, 451--459.
[9]
Z. Chen and T. Li. 2007. Addressing diverse user preferences in SQL-query-result navigation. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'07). 641--652.
[10]
E. Demidova, P. Fankhauser, X. Zhou, and W. Nejdl. 2010. DivQ: Diversification for keyword search over structured databases. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'10). 331--338.
[11]
T. Deng and W. Fan. 2013. On the complexity of query result diversification. Proc. VLDB Endow. 6, 8, 577--588.
[12]
T. Deng, W. Fan, and F. Geerts. 2012. On the complexity of package recommendation problems. In Proceedings of the ACM SIGACT-SIGMOD Symposium on Principles of Database Systems (PODS'12). 261--272.
[13]
M. Drosou and E. Pitoura. 2009. Diversity over continuous data. IEEE Data Engin. Bull. 32, 4.
[14]
M. Drosou and E. Pitoura. 2010. Search result diversification. SIGMOD Rec. 39, 1, 41--47.
[15]
A. Durand, M. Hermann, and P. G. Kolaitis. 2005. Subtractive reductions and complete problems for counting complexity classes. Theor. Comput. Sci. 340, 3, 496--513.
[16]
R. Fagin, A. Lotem, and M. Naor. 2003. Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66, 4, 614--656.
[17]
E. Feuerstein, P. A. Heiber, J. Martinez-Viademonte, and R. A. Baeza-Yates. 2007. New stochastic algorithms for scheduling ads in sponsored search. In Proceedings of the 5th Latin American Web Congress (LA-WEB'07). 22--31.
[18]
P. Fraternali, D. Martinenghi, and M. Tagliasacchi. 2012. Top-k bounded diversification. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'12). 421--432.
[19]
S. Gollapudi and A. Sharma. 2009. An axiomatic approach for result diversification. In Proceedings of the 18th International Conference on World Wide Web (WWW'09). 381--390.
[20]
L. A. Hemaspaandra and H. Vollmer. 1995. The satanic notations: Counting classes beyond #P and other definitional adventures. SIGACT News 26, 1, 2--13.
[21]
I. F. Ilyas, G. Beskales, and M. A. Soliman. 2008. A survey of top-k query processing techniques in relational database systems. ACM Comput. Surv. 40, 4, 11:1--11:58.
[22]
W. Jin and J. M. Patel. 2011. Efficient and generic evaluation of ranked queries. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'11). 601--612.
[23]
G. Koutrika, B. Bercovitz, and H. Garcia-Molina. 2009. FlexRecs: Expressing and combining flexible recommendations. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD'09). 745--758.
[24]
R. E. Ladner. 1989. Polynomial space counting problems. SIAM J. Comput. 18, 6, 1087--1097.
[25]
T. Lappas, K. Liu, and E. Terzi. 2009. Finding a team of experts in social networks. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'09). 467--476.
[26]
C. Li, M. A. Soliman, K. C.-C. Chang, and I. F. Ilyas. 2005. RankSQL: Supporting ranking queries in relational database management systems. In Proceedings of the 31st International Conference on Very Large Data Bases (VLDB'05). 1342--1345.
[27]
Z. Liu, P. Sun, and Y. Chen. 2009. Structured search result differentiation. Proc. VLDB Endow. 2, 1, 313--324.
[28]
E. Minack, G. Demartini, and W. Nejdl. 2009. Current approaches to search result diversification. In Proceedings of the 1st International Workshop on Living Web.
[29]
C. H. Papadimitriou. 1994. Computational Complexity. Addison-Wesley.
[30]
A. G. Parameswaran, H. Garcia-Molina, and J. D. Ullman. 2010. Evaluating, combining and generalizing recommendations with prerequisites. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM'10). 919--928.
[31]
A. G. Parameswaran, P. Venetis, and H. Garcia-Molina. 2011. Recommendation systems with complex constraints: A course recommendation perspective. ACM Trans. Inf. Syst. 29, 4.
[32]
O. A. Prokopyev, N. Kong, and D. L. Martinez-Torres. 2009. The equitable dispersion problem. Euro. J. Oper. Res. 197, 1, 59--67.
[33]
K. Schnaitter and N. Polyzotis. 2008. Evaluating rank joins with optimal cost. In Proceedings of the 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS'08). 43--52.
[34]
K. Stefanidis, M. Drosou, and E. Pitoura. 2010. Perk: Personalized keyword search in relational databases through preferences. In Proceedings of the 13th International Conference on Extending Database Technology (EDBT'10). 585--596.
[35]
L. Valiant. 1979. The complexity of computing the permanent. Theor. Comput. Sci. 8, 2, 189--201.
[36]
M. Y. Vardi. 1982. The complexity of relational query languages. In Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC'82). 137--146.
[37]
E. Vee, U. Srivastava, J. Shanmugasundaram, P. Bhat, and S. A. Yahia. 2008. Efficient computation of diverse query results. In Proceedings of the 24th IEEE International Conference on Data Engineering (ICDE'08). 228--236.
[38]
M. R. Vieira, H. L. Razente, M. C. N. Barioni, M. Hadjieleftheriou, D. Srivastava, C. Traina Jr., and V. J. Tsotras. 2011. On query result diversification. In Proceedings of the 27th IEEE International Conference on Data Engineering (ICDE'11). 1163--1174.
[39]
M. Xie, L. V. S. Lakshmanan, and P. T. Wood. 2012. Composite recommendations: From items to packages. Frontiers Comput. Sci. 6, 3, 264--277.
[40]
C. Yu, L. Lakshmanan, and S. Amer-Yahia. 2009a. It takes variety to make a world: Diversification in recommender systems. In Proceedings of the 12th International Conference on Extending Database Technology (EDBT'09). 368--378.
[41]
C. Yu, L. V. Lakshmanan, and S. Amer-Yahia. 2009b. Recommendation diversification using explanations. In Proceedings of the IEEE International Conference on Data Engineering (ICDE'09). 1299--1302.
[42]
M. Zhang and N. Hurley. 2008. Avoiding monotony: Improving the diversity of recommendation lists. In Proceedings of the ACM Conference on Recommender Systems (RecSys'08). 123--130.
[43]
C.-N. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen. 2005. Improving recommendation lists through topic diversification. In Proceedings of the 14th International Conference on World Wide Web (WWW'05). 22--32.


Published In

ACM Transactions on Database Systems  Volume 39, Issue 2
May 2014
336 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/2627748
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 May 2014
Accepted: 01 March 2014
Revised: 01 January 2014
Received: 01 May 2013
Published in TODS Volume 39, Issue 2


Author Tags

  1. Result diversification
  2. combined complexity
  3. counting problems
  4. data complexity
  5. database queries
  6. diversity
  7. recommender systems
  8. relevance

Qualifiers

  • Research-article
  • Research
  • Refereed


Cited By

  • (2024) Towards Tractability of the Diversity of Query Answers: Ultrametrics to the Rescue. Proceedings of the ACM on Management of Data 2:5 (1-26). DOI: 10.1145/3695833. Online publication date: 7-Nov-2024
  • (2024) Discovering Top-k Relevant and Diversified Rules. Proceedings of the ACM on Management of Data 2:4 (1-28). DOI: 10.1145/3677131. Online publication date: 30-Sep-2024
  • (2024) Query Refinement for Diverse Top-k Selection. Proceedings of the ACM on Management of Data 2:3 (1-27). DOI: 10.1145/3654969. Online publication date: 30-May-2024
  • (2022) Representative Query Results by Voting. Proceedings of the 2022 International Conference on Management of Data (1741-1754). DOI: 10.1145/3514221.3517858. Online publication date: 10-Jun-2022
  • (2019) Interactive summarization and exploration of top aggregate query answers. Proceedings of the VLDB Endowment 11:13 (2196-2208). DOI: 10.14778/3275366.3284965. Online publication date: 17-Jan-2019
  • (2019) Fusion-based methods for result diversification in web search. Information Fusion 45 (16-26). DOI: 10.1016/j.inffus.2018.01.006. Online publication date: Jan-2019
  • (2019) Towards both Local and Global Query Result Diversification. Database Systems for Advanced Applications (464-481). DOI: 10.1007/978-3-030-18579-4_28. Online publication date: 22-Apr-2019
  • (2018) Interactive summarization and exploration of top aggregate query answers. Proceedings of the VLDB Endowment 11:13 (2196-2208). DOI: 10.5555/3275366.3284965. Online publication date: 1-Sep-2018
  • (2018) RC-index. Proceedings of the VLDB Endowment 11:7 (773-786). DOI: 10.14778/3192965.3192969. Online publication date: 1-Mar-2018
  • (2018) Exploring Diversified Similarity with Kundaha. Proceedings of the 27th ACM International Conference on Information and Knowledge Management (1903-1906). DOI: 10.1145/3269206.3269220. Online publication date: 17-Oct-2018
