DOI: 10.1145/3448016.3452832

On m-Impact Regions and Standing Top-k Influence Problems

Published: 18 June 2021

Abstract

In this paper, we study the m-impact region problem (mIR). In a context where users look for available products with top-k queries, mIR identifies the part of the product space that attracts the most user attention. Specifically, mIR determines the attribute values that place a (new or existing) product in the top-k result for at least a given fraction of the user population. mIR has several applications, ranging from effective marketing to product improvement. Importantly, it also leads to exact and efficient solutions for standing top-k impact problems, which were previously solved only heuristically, or whose current solutions face serious scalability limitations. Our experiments, which include data mined from actual user reviews of real products, demonstrate the practicality and efficiency of our algorithms, both for mIR and for standing top-k impact problems.
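To make the membership condition in the abstract concrete, the following is a minimal brute-force sketch, not the algorithm from the paper. It assumes that each user is modeled by a weight vector, that products are scored by a linear weighted sum where higher is better, and that ties are resolved in favor of the candidate; none of these details is stated in the abstract. The function names (`score`, `in_top_k`, `covers_fraction`) and the toy data are hypothetical.

```python
from typing import Sequence

Vector = Sequence[float]


def score(weights: Vector, product: Vector) -> float:
    """Linear utility of a product for one user (assumed scoring model)."""
    return sum(w * x for w, x in zip(weights, product))


def in_top_k(candidate: Vector, products: Sequence[Vector],
             weights: Vector, k: int) -> bool:
    """True if fewer than k existing products strictly outscore the candidate."""
    s = score(weights, candidate)
    better = sum(1 for p in products if score(weights, p) > s)
    return better < k


def covers_fraction(candidate: Vector, products: Sequence[Vector],
                    users: Sequence[Vector], k: int, fraction: float) -> bool:
    """True if the candidate is in the top-k result of at least
    `fraction` of the users, checked by scanning every user."""
    hits = sum(1 for w in users if in_top_k(candidate, products, w, k))
    return hits >= fraction * len(users)


# Toy data (hypothetical): three existing products, three users, two attributes.
products = [(0.9, 0.2), (0.2, 0.9), (0.6, 0.6)]
users = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]
candidate = (0.7, 0.7)

reaches_half_with_k2 = covers_fraction(candidate, products, users, k=2, fraction=0.5)
reaches_half_with_k1 = covers_fraction(candidate, products, users, k=1, fraction=0.5)
```

This check tests a single point of the product space. The problem described in the abstract asks for the whole region of such points, which a point-by-point scan cannot enumerate.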

Supplementary Material

MP4 File (3448016.3452832.mp4)

Cited By

  • (2023) rkHit: Representative Query with Uncertain Preference. Proceedings of the ACM on Management of Data 1:2 (1-26). DOI: 10.1145/3589271. Online publication date: 20-Jun-2023
  • (2023) Quantifying the competitiveness of a dataset in relation to general preferences. The VLDB Journal — The International Journal on Very Large Data Bases 33:1 (231-250). DOI: 10.1007/s00778-023-00804-1. Online publication date: 8-Aug-2023

Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN: 9781450383431
DOI: 10.1145/3448016
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. market analysis
  2. multi-dimensional datasets
  3. top-k query

Qualifiers

  • Research-article

Funding Sources

  • Education Department of Guangdong
  • NSFC
  • Guangdong Provincial Key Laboratory

Conference

SIGMOD/PODS '21

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
