DOI: 10.1145/967900.968054
Article

Replicated declustering for arbitrary queries

Published: 14 March 2004

Abstract

Declustering has attracted a lot of interest over the last couple of years. Recently, declustering using replication has been proposed to reduce the additive overhead of declustering. Most work on declustering focuses on spatial range queries. However, in many scenarios, including multi-user environments, query shapes can be arbitrary. In this paper, we explore replicated declustering for arbitrary queries. Replication reduces the cost of arbitrary queries to manageable levels. First, we investigate theoretically what is possible using replication for arbitrary queries. Then, we propose a 2-copy replication strategy that achieves the theoretical limit and is therefore the best possible scheme. Using the proposed scheme, an arbitrary query containing b buckets requires a number of disk accesses bounded by [√b]. This is a significant improvement, especially for small queries, because with a single copy b buckets require min(b, N) disk accesses in the worst case. The proposed scheme works for nonuniform data as well as uniform data. Finally, we extend the proposed scheme to a partial replication scheme to achieve the best performance using limited replication.
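
The gap between one copy and two copies can be made concrete with a small sketch. Everything in it is an assumption made for illustration. N is read as the number of disks, and the buckets are taken to form an N x N grid; the abstract states neither. The two allocation rules, (i + j) mod N and (i + 2j) mod N, are toy choices and not the scheme proposed in the paper. The cost of a query is computed with a standard capacity-constrained bipartite matching, which may differ from the paper's retrieval method. The sketch makes no claim that this toy allocation achieves the [√b] bound.

```python
from itertools import combinations, product


def min_accesses(query, copies):
    """Fewest parallel rounds needed to read every bucket in `query`.

    `copies[b]` lists the disks that hold a replica of bucket `b`.  The
    result is the smallest k for which each bucket can be read from one of
    its replicas while no disk serves more than k buckets.
    """
    buckets = list(query)
    if not buckets:
        return 0

    def feasible(k):
        load = {}  # disk -> buckets currently assigned to it

        def assign(bucket, seen):
            # Augmenting-path search: place `bucket`, moving others if needed.
            for disk in copies[bucket]:
                if disk in seen:
                    continue
                seen.add(disk)
                assigned = load.setdefault(disk, [])
                if len(assigned) < k:
                    assigned.append(bucket)
                    return True
                for slot, other in enumerate(assigned):
                    if assign(other, seen):
                        assigned[slot] = bucket
                        return True
            return False

        return all(assign(b, set()) for b in buckets)

    k = 1
    while not feasible(k):
        k += 1
    return k


def worst_case(copies, b):
    """Largest cost over all queries of exactly b buckets (brute force)."""
    return max(min_accesses(q, copies) for q in combinations(copies, b))


N = 4  # assumed: N disks and an N x N grid of buckets
grid = list(product(range(N), repeat=2))

# Illustrative allocation rules, not the scheme from the paper.
single = {(i, j): [(i + j) % N] for i, j in grid}
double = {(i, j): [(i + j) % N, (i + 2 * j) % N] for i, j in grid}

# An arbitrary (non-rectangular) query: every bucket whose first copy is on disk 0.
query = [(i, j) for i, j in grid if (i + j) % N == 0]

print(min_accesses(query, single))  # 4: all four buckets sit on disk 0
print(min_accesses(query, double))  # 1: second copies fall on disks 0, 3, 2, 1
```

With the single-copy rule each disk holds exactly N buckets, so `worst_case(single, b)` comes out as min(b, N) for small b, matching the single-copy figure quoted in the abstract. The brute-force search is only practical for very small grids.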

    Published In

    SAC '04: Proceedings of the 2004 ACM symposium on Applied computing
    March 2004
    1733 pages
    ISBN:1581138121
    DOI:10.1145/967900
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. arbitrary query
    2. declustering
    3. replication

    Qualifiers

    • Article

    Conference

    SAC04
    SAC04: The 2004 ACM Symposium on Applied Computing
    March 14 - 17, 2004
    Nicosia, Cyprus

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

    Cited By

    • (2018) A Cost-Effective Data Replica Placement Strategy Based on Hybrid Genetic Algorithm for Cloud Services. Research and Practical Issues of Enterprise Information Systems, pp. 43-56. DOI: 10.1007/978-3-319-99040-8_4. Online publication date: 15-Aug-2018
    • (2016) Multithreaded Maximum Flow Based Optimal Replica Selection Algorithm for Heterogeneous Storage Architectures. IEEE Transactions on Computers, 65:5, pp. 1543-1557. DOI: 10.1109/TC.2015.2451620. Online publication date: 1-May-2016
    • (2016) Exploiting Replication for Energy Efficiency of Heterogeneous Storage Systems. 2016 IEEE 24th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), pp. 79-84. DOI: 10.1109/MASCOTS.2016.70. Online publication date: Sep-2016
    • (2014) SWORD. The VLDB Journal - The International Journal on Very Large Data Bases, 23:6, pp. 845-870. DOI: 10.1007/s00778-014-0362-1. Online publication date: 1-Dec-2014
    • (2013) Generalized Optimal Response Time Retrieval of Replicated Data from Storage Arrays. ACM Transactions on Storage, 9:2, pp. 1-36. DOI: 10.1145/2491472.2491474. Online publication date: 1-Jul-2013
    • (2013) Query-Log Aware Replicated Declustering. IEEE Transactions on Parallel and Distributed Systems, 24:5, pp. 987-995. DOI: 10.1109/TPDS.2012.113. Online publication date: 1-May-2013
    • (2012) Equivalent Disk Allocations. IEEE Transactions on Parallel and Distributed Systems, 23:3, pp. 538-546. DOI: 10.1109/TPDS.2011.177. Online publication date: 1-Mar-2012
    • (2012) Integrated Maximum Flow Algorithm for Optimal Response Time Retrieval of Replicated Data. Proceedings of the 2012 41st International Conference on Parallel Processing, pp. 11-20. DOI: 10.1109/ICPP.2012.34. Online publication date: 10-Sep-2012
    • (2012) Replication Based QoS Framework for Flash Arrays. Proceedings of the 2012 IEEE International Conference on Cluster Computing, pp. 182-190. DOI: 10.1109/CLUSTER.2012.53. Online publication date: 24-Sep-2012
    • (2009) Divide-and-conquer scheme for strictly optimal retrieval of range queries. ACM Transactions on Storage, 5:3, pp. 1-32. DOI: 10.1145/1629075.1629077. Online publication date: 30-Nov-2009
