DOI: 10.1145/1516360.1516480

Efficient skyline retrieval with arbitrary similarity measures

Published: 24 March 2009

    Abstract

    A skyline query returns the set of objects that are not dominated by any other object. An object is said to dominate another if it is closer to the query than the latter on all factors under consideration. In this paper, we consider the case where the similarity measures may be arbitrary and do not necessarily come from a metric space. We first explore middleware algorithms, analyze how skyline retrieval for non-metric spaces can be done on the middleware backend, and lay down a necessary and sufficient stopping condition for middleware-based skyline algorithms. We develop the Balanced Access Algorithm (BAA), which is provably more IO-friendly than the state-of-the-art algorithm for skyline query processing on middleware, and show that BAA outperforms the latter by orders of magnitude. We also show that, without prior knowledge about data distributions, a middleware algorithm that is more IO-friendly than BAA is unlikely to exist. In fact, we empirically show that BAA is very close to the absolute lower bound of IO costs for middleware algorithms. Further, we explore the non-middleware setting and devise an online algorithm for skyline retrieval that uses a recently proposed value-space index over non-metric spaces (the AL-Tree [10]). The AL-Tree-based algorithm is able to prune subspaces and efficiently maintain candidate sets, leading to better performance. We compare our algorithms to existing ones that can work with arbitrary similarity measures and show that our approaches are better in terms of computational and disk access costs, leading to significantly better response times.
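
    The dominance definition in the abstract can be illustrated with a naive quadratic scan. This is only a minimal sketch under stated assumptions: it is not BAA or the AL-Tree-based algorithm, and the names `dominates`, `skyline`, `abs_gap`, and `squared_gap`, as well as the sample data, are hypothetical. The second measure (a squared difference) violates the triangle inequality, so it stands in for an "arbitrary" non-metric similarity measure.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types for illustration: an object is a pair of integers,
# and a measure maps (object, query) to a distance (smaller = closer).
Point = Tuple[int, int]
Measure = Callable[[Point, Point], float]


def dominates(a: Point, b: Point, query: Point,
              measures: Sequence[Measure]) -> bool:
    """True if `a` is strictly closer to `query` than `b` on every measure.

    This follows the abstract's wording ("closer ... on all factors").
    """
    return all(m(a, query) < m(b, query) for m in measures)


def skyline(objects: Sequence[Point], query: Point,
            measures: Sequence[Measure]) -> List[Point]:
    """Naive O(n^2) scan: keep each object that no other object dominates."""
    return [o for o in objects
            if not any(dominates(p, o, query, measures) for p in objects)]


def abs_gap(x: Point, q: Point) -> float:
    # Absolute difference on the first attribute (a metric).
    return abs(x[0] - q[0])


def squared_gap(x: Point, q: Point) -> float:
    # Squared difference on the second attribute. Not a metric:
    # d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2.
    return (x[1] - q[1]) ** 2


objects = [(1, 9), (5, 5), (9, 1), (6, 6), (9, 9)]
query = (0, 0)
result = skyline(objects, query, [abs_gap, squared_gap])
print(result)  # [(1, 9), (5, 5), (9, 1)]
```

    Here (6, 6) and (9, 9) are dropped because (5, 5) is closer to the query on both measures. The scan only compares distance values and never relies on metric properties, which is why it still applies in the non-metric setting; it does nothing to reduce IO or computation, which is the problem the paper's algorithms address.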

    References

    [1] How fast is your disk? http://www.linuxinsight.com/how_fast_is_your_disk.html, January 2007.
    [2] A. Asuncion and D. Newman. UCI machine learning repository, 2007.
    [3] W.-T. Balke, U. Güntzer, and J. X. Zheng. Efficient distributed skylining for web information systems. In EDBT, pages 256--273, 2004.
    [4] H. Bast, D. Majumdar, R. Schenkel, M. Theobald, and G. Weikum. IO-Top-k: Index-access optimized top-k query processing. In VLDB. ACM, 2006.
    [5] J. L. Bentley. Multidimensional binary search trees used for associative searching. Commun. ACM, 18(9):509--517, 1975.
    [6] S. Börzsönyi, D. Kossmann, and K. Stocker. The skyline operator. In ICDE, 2001.
    [7] J. Chomicki, P. Godfrey, J. Gryz, and D. Liang. Skyline with presorting. In ICDE, 2003.
    [8] W. Chung, Gray, and Horst. Windows 2000 disk IO performance. Microsoft Research TR, June 2000.
    [9] K. Deng, X. Zhou, and H. T. Shen. Multi-source skyline query processing in road networks. In ICDE, 2007.
    [10] P. Deshpande, Deepak, and K. Kummamuru. Efficient online top-k retrieval with arbitrary similarity measures. In EDBT, 2008.
    [11] R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. In PODS. ACM, 2001.
    [12] R. Fagin, A. Lotem, and M. Naor. Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci., 66(4):614--656, 2003.
    [13] P. Godfrey, R. Shipley, and J. Gryz. Maximal vector computation in large data sets. In VLDB, 2005.
    [14] K. Goh, B. Li, and E. Chang. DynDex: A dynamic and nonmetric space indexer. In ACM Intl. Conference on Multimedia, 2002.
    [15] D. Kossmann, F. Ramsak, and S. Rost. Shooting stars in the sky: An online algorithm for skyline queries. In VLDB, pages 275--286. Morgan Kaufmann, 2002.
    [16] D. Papadias, Y. Tao, G. Fu, and B. Seeger. An optimal and progressive algorithm for skyline queries. In SIGMOD Conference, 2003.
    [17] D. Papadias, Y. Tao, G. Fu, and B. Seeger. Progressive skyline computation in database systems. ACM Trans. Database Syst., 30(1):41--82, 2005.
    [18] K.-L. Tan, P.-K. Eng, and B. C. Ooi. Efficient progressive skyline computation. In VLDB, pages 301--310. Morgan Kaufmann, 2001.
    [19] Y. Tao, X. Xiao, and J. Pei. SUBSKY: Efficient computation of skylines in subspaces. In ICDE, page 65. IEEE Computer Society, 2006.
    [20] S. Wang, B. C. Ooi, A. K. H. Tung, and L. Xu. Efficient skyline query processing on peer-to-peer networks. In ICDE, pages 1126--1135, 2007.
    [21] P. Zezula, G. Amato, V. Dohnal, and M. Batko. Similarity Search - The Metric Space Approach. Springer, 2005.

    Cited By

    • Finding desirable objects under group categorical preferences. Knowledge and Information Systems, 49(1):273--313, 2016. DOI: 10.1007/s10115-015-0886-8. Online publication date: 1-Oct-2016
    • Reconciling Multiple Categorical Preferences with Double Pareto-Based Aggregation. Database Systems for Advanced Applications, pages 266--281, 2014. DOI: 10.1007/978-3-319-05810-8_18. Online publication date: 2014

    Published In

    EDBT '09: Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
    March 2009
    1180 pages
    ISBN: 9781605584225
    DOI: 10.1145/1516360
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    EDBT/ICDT '09 joint conference
    March 24 - 26, 2009
    Saint Petersburg, Russia

    Acceptance Rates

    Overall Acceptance Rate 7 of 10 submissions, 70%
