Layered List Labeling

Published: 14 May 2024

Abstract

The list-labeling problem is one of the most basic and well-studied algorithmic primitives in data structures, with an extensive literature spanning upper bounds, lower bounds, and data management applications. The classical algorithm for this problem, dating back to 1981, has amortized cost $O(\log^2 n)$. Subsequent work has led to improvements in three directions: low-latency (worst-case) bounds; high-throughput (expected) bounds; and (adaptive) bounds for important workloads.
Perhaps surprisingly, these three directions of research have remained almost entirely disjoint. The reason is that, so far, the techniques that allow for progress in one direction have forced worse bounds in the others. Thus there would appear to be a tension between worst-case, adaptive, and expected bounds. List labeling has been proposed for use in databases at least as early as PODS'99, but a database needs good throughput and good response time and must adapt to common workloads (e.g., bulk loads), and no current list-labeling algorithm achieves good bounds for all three.
We show that this tension is not fundamental. In fact, with the help of new data-structural techniques, one can actually combine any three list-labeling solutions in order to cherry-pick the best worst-case, adaptive, and expected bounds from each of them.
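
For readers unfamiliar with the problem itself, the following is a minimal sketch of the list-labeling setting and its cost model: items are kept in sorted order in an array with empty slots, and the cost of an insertion is the number of items written or moved. It is an illustration only. It is neither the classical 1981 algorithm nor the layered construction of this paper, and the names `NaiveListLabeling`, `insert_key`, and `moves` are hypothetical. The rebuild strategy is deliberately naive (a full even respread whenever no gap is available), and each rebuild is charged the full number of items as an upper bound on the moves.

```python
import bisect


class NaiveListLabeling:
    """Keep items in sorted order in an array of m slots (None = empty)."""

    def __init__(self, m):
        self.slots = [None] * m
        self.moves = 0  # total cost: items written or (upper bound) moved

    def items(self):
        """Return the stored items in array order."""
        return [x for x in self.slots if x is not None]

    def insert(self, rank, item):
        """Insert item so that it becomes the rank-th item (0-indexed)."""
        occupied = [i for i, x in enumerate(self.slots) if x is not None]
        if len(occupied) == len(self.slots):
            raise OverflowError("array is full")
        if not 0 <= rank <= len(occupied):
            raise IndexError("rank out of range")
        # Slots of the predecessor and successor (sentinels at both ends).
        lo = occupied[rank - 1] if rank > 0 else -1
        hi = occupied[rank] if rank < len(occupied) else len(self.slots)
        if hi - lo > 1:
            # A free slot exists between the neighbors: cost 1.
            self.slots[(lo + hi) // 2] = item
            self.moves += 1
            return
        # No gap: respread all items evenly over the whole array.
        seq = self.items()
        seq.insert(rank, item)
        self._spread(seq)

    def _spread(self, seq):
        m, n = len(self.slots), len(seq)
        new_slots = [None] * m
        for k, x in enumerate(seq):
            # Consecutive targets are m/n >= 1 apart, so they never collide.
            new_slots[(2 * k + 1) * m // (2 * n)] = x
        self.slots = new_slots
        self.moves += n  # charge every item in the rebuilt range


def insert_key(table, key):
    """Insert a comparable key at its sorted position."""
    table.insert(bisect.bisect_left(table.items(), key), key)
```

With this naive strategy an adversarial sequence (for example, repeated insertions at the front) can force a rebuild costing up to $n$ on many operations; the bounds discussed in the abstract come from far more careful rebalancing than this sketch performs.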

Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 2
PODS
May 2024
852 pages
EISSN:2836-6573
DOI:10.1145/3665155
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 14 May 2024
Published in PACMMOD Volume 2, Issue 2

Author Tags

  1. algorithms
  2. data structures
  3. history independence
  4. online algorithms
  5. randomized algorithms

Qualifiers

  • Research-article
