
Reverse skyline search in uncertain databases

Published: 15 February 2010

Abstract

Reverse skyline queries over uncertain databases have many important applications, such as sensor data monitoring and business planning. Because uncertainty is pervasive in real-world data, answering reverse skyline queries accurately and efficiently over uncertain data has become increasingly important. In this article, we formalize the probabilistic reverse skyline query over uncertain data in both the monochromatic and bichromatic cases, and we propose two effective pruning methods, spatial pruning and probabilistic pruning, to reduce the search space of reverse skyline query processing. We also present efficient query procedures that seamlessly integrate the proposed pruning methods. Furthermore, we propose and tackle a novel query type, the Probabilistic Reverse Furthest Skyline (PRFS) query, under the “the larger, the better” dominance semantics of skyline. Finally, we propose and tackle variants of the probabilistic reverse skyline, including those that return the objects with the top-k highest probabilities and those that retrieve the top-k reverse skylines. Extensive experiments demonstrate the efficiency and effectiveness of our approaches under various experimental settings.
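
To make the monochromatic query in the abstract concrete, the following is a minimal brute-force sketch in Python. It is not the article's algorithm and uses none of its pruning methods. It assumes a discrete-sample uncertainty model (each uncertain object is a list of weighted sample points), independence between objects, and a probability threshold alpha; the names `dyn_dominates`, `reverse_skyline_probability`, and `probabilistic_reverse_skyline` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]
# Assumption: an uncertain object is a list of (sample point, probability)
# pairs whose probabilities sum to 1.
UncertainObject = List[Tuple[Point, float]]


def dyn_dominates(a: Point, b: Point, ref: Point) -> bool:
    """True if `a` dynamically dominates `b` with respect to `ref`:
    `a` is at least as close to `ref` as `b` in every dimension and
    strictly closer in at least one."""
    da = [abs(x - r) for x, r in zip(a, ref)]
    db = [abs(x - r) for x, r in zip(b, ref)]
    return all(x <= y for x, y in zip(da, db)) and any(
        x < y for x, y in zip(da, db)
    )


def reverse_skyline_probability(
    name: str, q: Point, db: Dict[str, UncertainObject]
) -> float:
    """Probability that object `name` is a reverse skyline point of `q`,
    i.e. that no other object dynamically dominates `q` with respect to
    `name`. Objects are assumed to be independent."""
    total = 0.0
    for u, pu in db[name]:
        survive = 1.0  # probability that no other object dominates q w.r.t. u
        for other, samples in db.items():
            if other == name:
                continue
            p_dom = sum(p for o, p in samples if dyn_dominates(o, q, u))
            survive *= 1.0 - p_dom
        total += pu * survive
    return total


def probabilistic_reverse_skyline(
    q: Point, db: Dict[str, UncertainObject], alpha: float
) -> List[str]:
    """Names of all objects whose reverse-skyline probability is >= alpha."""
    return sorted(
        n for n in db if reverse_skyline_probability(n, q, db) >= alpha
    )


if __name__ == "__main__":
    demo = {
        "A": [((1.0, 1.0), 1.0)],
        "B": [((4.0, 4.0), 0.5), ((9.0, 9.0), 0.5)],
    }
    print(probabilistic_reverse_skyline((5.0, 5.0), demo, alpha=0.6))
```

This sketch compares every sample of every object against every other object, so its cost grows quadratically with the number of objects. That cost is the kind of search space the spatial and probabilistic pruning methods mentioned in the abstract are meant to reduce.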

Supplementary Material

Lian Appendix (a3-lian-apndx.pdf)
Online appendix to reverse skyline search in uncertain databases on article 03.


Published In

ACM Transactions on Database Systems, Volume 35, Issue 1
February 2010
310 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1670243

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Accepted: 01 August 2009
Revised: 01 May 2009
Received: 01 August 2008
Published: 15 February 2010
Published in TODS Volume 35, Issue 1

Author Tags

  1. Uncertain database
  2. bichromatic reverse skyline
  3. monochromatic reverse skyline

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2025) Achieving Efficient and Privacy-Preserving Reverse Skyline Query Over Single Cloud. IEEE Transactions on Knowledge and Data Engineering 37:1 (29-44). DOI: 10.1109/TKDE.2024.3487646. Online publication date: Jan-2025
  • (2024) Parallel continuous skyline query over high-dimensional data stream windows. Distributed and Parallel Databases 42:4 (469-524). DOI: 10.1007/s10619-024-07443-7. Online publication date: 1-Dec-2024
  • (2022) ProbSky: Efficient Computation of Probabilistic Skyline Queries over Distributed Data. IEEE Transactions on Knowledge and Data Engineering (1-1). DOI: 10.1109/TKDE.2022.3151740. Online publication date: 2022
  • (2020) Skylines and Other Dominance-Based Queries. Synthesis Lectures on Data Management 15:2 (1-158). DOI: 10.2200/S01048ED1V01Y202009DTM063. Online publication date: 19-Nov-2020
  • (2020) Skyline Diagram: Efficient Space Partitioning for Skyline Queries. IEEE Transactions on Knowledge and Data Engineering 33:1 (271-286). DOI: 10.1109/TKDE.2019.2923914. Online publication date: 7-Dec-2020
  • (2020) Efficient column-oriented processing for mutual subspace skyline queries. Soft Computing - A Fusion of Foundations, Methodologies and Applications 24:20 (15427-15445). DOI: 10.1007/s00500-020-04875-y. Online publication date: 1-Oct-2020
  • (2019) An Adaptive Parallel PI-Skyline Query for Probabilistic and Incomplete Database. International Journal of Computational Methods 17:07 (1950036). DOI: 10.1142/S0219876219500361. Online publication date: 31-May-2019
  • (2019) Top-k Skyline Result Optimization Algorithm in MapReduce. 2019 14th International Conference on Computer Science & Education (ICCSE) (466-471). DOI: 10.1109/ICCSE.2019.8845361. Online publication date: Aug-2019
  • (2019) Skyline Recommendation with Uncertain Preferences. Pattern Recognition Letters. DOI: 10.1016/j.patrec.2019.06.002. Online publication date: Jun-2019
  • (2018) Efficient Computation of G-Skyline Groups. IEEE Transactions on Knowledge and Data Engineering 30:4 (674-688). DOI: 10.1109/TKDE.2017.2777994. Online publication date: 1-Apr-2018
