DOI: 10.1145/1007912.1007920

Consistent and compact data management in distributed storage systems

Published: 27 June 2004

    Abstract

    In this paper we consider the problem of maintaining a consistent mapping of a virtual object space to a set of memory modules, i.e., the object space can be decomposed into a set of ranges where every module is responsible for exactly one range. A module owning some range R is responsible for storing all objects in R. Besides consistency, we require the mapping to be compact, i.e., any object or consecutive range of objects should be spread out over as few memory modules as possible. A compact mapping is important for many applications, such as efficiently executing programs that use a large amount of space, or complex search queries such as semi-group range queries. Our main result assumes a static set of memory modules of uniform capacity, but we also show how to extend this to a dynamic set of memory modules of non-uniform capacity in a decentralized environment.

    In both settings, new objects may be added, old objects may be deleted, or objects may be modified over time. Each object consists of a set of data blocks of uniform size, so insert, delete, or modify operations on objects can be seen as insert or delete operations of data blocks. Each module can send or receive at most one data block in each unit of time, and the injection of insert or delete requests for data blocks is under adversarial control. We prove asymptotically tight upper and lower bounds on the maximum rate at which the adversary can inject requests into the system so that a consistent and compact placement can be preserved without exceeding the capacity of a module at any time. Specifically, we show that in a (1-ε)-utilized system (i.e., all but an ε fraction of the available space is in use) the maximum injection rate that can be sustained is Θ(ε).
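The two definitions in the abstract, consistency (every module owns exactly one range) and compactness (a range of objects touches few modules), can be illustrated with a small sketch. This is not the paper's algorithm: the class `RangeMap`, its methods, and the integer key space are hypothetical names and assumptions chosen only to make the definitions concrete.

```python
from bisect import bisect_right


class RangeMap:
    """Hypothetical illustration of a consistent range-to-module mapping.

    Module i owns the keys in [starts[i], starts[i + 1]); the last module
    owns everything from starts[-1] upward. Each module therefore owns
    exactly one contiguous range, which is the consistency property.
    """

    def __init__(self, starts):
        starts = list(starts)
        if not starts:
            raise ValueError("at least one module is required")
        if starts != sorted(set(starts)):
            raise ValueError("range start keys must be strictly increasing")
        self.starts = starts

    def module_of(self, key):
        """Return the index of the module whose range contains `key`."""
        if key < self.starts[0]:
            raise KeyError(f"key {key} lies below the object space")
        # bisect_right counts the start keys <= key; subtract 1 for the index.
        return bisect_right(self.starts, key) - 1

    def modules_spanned(self, lo, hi):
        """Count the modules touched by the inclusive key range [lo, hi].

        A compact mapping keeps this count small for any object or
        consecutive range of objects.
        """
        if lo > hi:
            raise ValueError("lo must not exceed hi")
        return self.module_of(hi) - self.module_of(lo) + 1


if __name__ == "__main__":
    # Four modules (assumed), each owning 100 consecutive keys.
    mapping = RangeMap([0, 100, 200, 300])
    print(mapping.module_of(150))
    print(mapping.modules_spanned(50, 250))
```

Because the ranges are contiguous and ordered, a lookup is a binary search over the range boundaries, and the number of modules a query touches is the difference between the owners of its two endpoints plus one.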


    Cited By

    • (2019) Survey of research towards robust peer-to-peer networks. Computer Networks: The International Journal of Computer and Telecommunications Networking 50:17 (3485-3521). DOI: 10.1016/j.comnet.2006.02.001. Online publication date: 5-Jan-2019
    • (2018) $D^2$-Tree. Algorithmica 72:3 (860-883). DOI: 10.1007/s00453-014-9878-4. Online publication date: 31-Dec-2018
    • (2008) Data Distribution Algorithm Using Time Based Weighted Distributed Hash Tables. Proceedings of the 2008 Seventh International Conference on Grid and Cooperative Computing (210-213). DOI: 10.1109/GCC.2008.44. Online publication date: 24-Oct-2008

    Reviews

    David Gary Hill

    This paper proves theorems and describes an algorithm for efficiently mapping data objects to memory modules. Objects are composed of data blocks, so insert, delete, or update requests on objects become insert or delete requests on data blocks, which the memory modules receive. The paper shows how to preserve both consistent and compact placement of data blocks without exceeding the capacity of a module at any time. Consistency is critical because the integrity of the data must be preserved. The paper deals with both operational consistency (the return of the system to a consistent state after an operation on the data) and the more difficult transient consistency (the maintenance of the consistency of a storage system even during the execution of an insert/delete operation). Transient consistency is critical for the ability to read at any time and the ability to recover from a crash. A compact mapping is important for executing many applications efficiently, such as programs that use a large amount of space, or complex search queries. Efficiency translates into increased performance through lower overhead, which in turn translates into lower cost. The paper goes technically deep to substantiate its assertions. Researchers or designers of distributed memory or storage systems should find this mathematically rich treatise useful.

    Online Computing Reviews Service

    Published In

    SPAA '04: Proceedings of the sixteenth annual ACM symposium on Parallelism in algorithms and architectures
    June 2004
    332 pages
    ISBN:1581138407
    DOI:10.1145/1007912

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. distributed data management
    2. load balancing
    3. peer-to-peer systems
    4. range queries

    Qualifiers

    • Article

    Conference

    SPAA04

    Acceptance Rates

    Overall Acceptance Rate 447 of 1,461 submissions, 31%

