DOI: 10.1145/2902251.2902276

Anti-Persistence on Persistent Storage: History-Independent Sparse Tables and Dictionaries

Published: 15 June 2016

Abstract

We present history-independent alternatives to a B-tree, the primary indexing data structure used in databases. A data structure is history independent (HI) if examining its bit representation reveals no information beyond what is already available through the API. We show how to build a history-independent cache-oblivious B-tree and a history-independent external-memory skip list. One of the main contributions is a data structure we build on the way: a history-independent packed-memory array (PMA). The PMA supports efficient range queries, one of the most important operations for answering database queries.
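As a toy illustration of the definition (not one of the paper's constructions), the Python sketch below stores the same set of keys two ways. The names `NaiveBST`, `bst_layout`, and `sorted_array` are hypothetical. An unbalanced binary search tree ends up with a shape that depends on insertion order, so its memory layout can leak history; a sorted array has one canonical layout per set, which is one simple (deterministic) way to satisfy the definition.

```python
class NaiveBST:
    """Unbalanced binary search tree; its shape depends on insertion order."""

    def __init__(self):
        self.root = None  # each node is a list: [key, left, right]

    def insert(self, key):
        if self.root is None:
            self.root = [key, None, None]
            return
        node = self.root
        while True:
            child = 1 if key < node[0] else 2  # 1 = left, 2 = right
            if node[child] is None:
                node[child] = [key, None, None]
                return
            node = node[child]


def _shape(node):
    """Return the tree as nested tuples so two layouts can be compared."""
    if node is None:
        return None
    return (node[0], _shape(node[1]), _shape(node[2]))


def bst_layout(insertion_order):
    tree = NaiveBST()
    for key in insertion_order:
        tree.insert(key)
    return _shape(tree.root)


def sorted_array(insertion_order):
    """Canonical layout: depends only on the set of keys, not their order."""
    return sorted(insertion_order)


# Same set {1, 2, 3}, two different histories.
history_a = [1, 2, 3]
history_b = [2, 1, 3]

print(bst_layout(history_a) == bst_layout(history_b))      # False: layout leaks order
print(sorted_array(history_a) == sorted_array(history_b))  # True: layout is canonical
```

The structures in the paper are randomized and considerably more involved; the sketch only shows what "deducing history from the representation" means in the simplest setting.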
Our HI PMA matches the asymptotic bounds of prior non-HI packed-memory arrays and sparse tables. Specifically, a PMA maintains a dynamic set of elements in sorted order in a linear-sized array. Inserts and deletes take amortized $O(\log^2 N)$ element moves with high probability. Simple experiments with our implementation of HI PMAs corroborate our theoretical analysis. Comparisons to regular PMAs give preliminary indications that the practical cost of adding history-independence is not too large.
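To make the sparse-table idea concrete, here is a minimal Python sketch of a classic, non-HI packed-memory array. It is not the paper's history-independent construction. The class name `ToyPMA`, the single fixed density threshold `MAX_DENSITY`, the starting window width of 4, and the linear-scan search are all simplifying assumptions; real PMAs use level-dependent thresholds to obtain the amortized bound on element moves.

```python
import bisect


class ToyPMA:
    """Sorted elements kept in an array with gaps (None marks an empty slot)."""

    MAX_DENSITY = 0.75  # assumed single threshold, for brevity

    def __init__(self):
        self.slots = [None] * 8  # capacity stays a power of two
        self.moves = 0           # counts element writes during rebalances

    def items(self):
        """All stored elements in sorted order (a full range query)."""
        return [v for v in self.slots if v is not None]

    def _rewrite(self, lo, hi, elems):
        """Spread elems evenly over slots[lo:hi]."""
        width = hi - lo
        self.slots[lo:hi] = [None] * width
        for j, value in enumerate(elems):
            self.slots[lo + (j * width) // len(elems)] = value
        self.moves += len(elems)

    def insert(self, x):
        # Slot of the predecessor of x (largest element <= x), or 0 if none.
        pos = 0
        for i, v in enumerate(self.slots):
            if v is not None and v <= x:
                pos = i
        # Grow an aligned window around pos until it has room for x.
        width = 4
        while width <= len(self.slots):
            lo = pos - pos % width
            window = [v for v in self.slots[lo:lo + width] if v is not None]
            if len(window) + 1 <= self.MAX_DENSITY * width:
                bisect.insort(window, x)
                self._rewrite(lo, lo + width, window)
                return
            width *= 2
        # The whole array is too dense: double the capacity and respread.
        everything = self.items()
        bisect.insort(everything, x)
        self.slots = [None] * (2 * len(self.slots))
        self._rewrite(0, len(self.slots), everything)


pma = ToyPMA()
for key in [5, 1, 9, 3, 7, 2, 8, 6, 4, 0]:
    pma.insert(key)
print(pma.items())  # elements come back in sorted order
print(pma.slots)    # the underlying array still contains gaps
```

Because the elements sit in sorted order in one contiguous array, a range query is a single left-to-right scan, which is the property the abstract highlights.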
Our HI cache-oblivious B-tree bounds match those of prior non-HI cache-oblivious B-trees. Searches take $O(\log_B N)$ I/Os; inserts and deletes take $O((\log^2 N)/B + \log_B N)$ amortized I/Os with high probability; and range queries returning $k$ elements take $O(\log_B N + k/B)$ I/Os.
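For a rough sense of scale, the following Python sketch evaluates the expressions inside the big-O bounds for sample parameters. The function names, the choice of base-2 logarithm for the $\log^2 N$ term, and the sample values of `n`, `b`, and `k` are assumptions made for illustration; hidden constant factors are ignored, so the numbers are not predicted I/O counts.

```python
import math


def log_base(n, b):
    """log_b(n), computed via base-2 logs."""
    return math.log2(n) / math.log2(b)


def search_term(n, b):
    return log_base(n, b)


def update_term(n, b):
    return math.log2(n) ** 2 / b + log_base(n, b)


def range_term(n, b, k):
    return log_base(n, b) + k / b


n = 2 ** 30  # about a billion elements (assumed)
b = 2 ** 10  # 1024 elements per block (assumed)
k = 4096     # elements returned by the range query (assumed)

print(search_term(n, b))    # 3.0
print(update_term(n, b))    # 3.87890625
print(range_term(n, b, k))  # 7.0
```

With these values the $(\log^2 N)/B$ term contributes less than one to the update expression, so the $\log_B N$ search term dominates.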
Our HI external-memory skip list achieves optimal bounds with high probability, analogous to in-memory skip lists: $O(\log_B N)$ I/Os for point queries and amortized $O(\log_B N)$ I/Os for inserts/deletes. Range queries returning $k$ elements run in $O(\log_B N + k/B)$ I/Os. In contrast, the best possible high-probability bound for inserting into the folklore B-skip list, which promotes elements with probability $1/B$, is just $\Theta(\log N)$ I/Os. This is no better than the bounds one gets from running an in-memory skip list in external memory.
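One intuition for the weak high-probability bound of the folklore B-skip list, offered here as an assumption rather than as the paper's proof, is that promoting each element independently with probability $1/B$ gives runs of unpromoted elements whose average length is about $B$ but whose longest instance is considerably larger, so some leaf-level runs span many blocks. The Python sketch below simulates only the bottom level; `run_lengths`, the seed, and the parameter values are hypothetical.

```python
import random


def run_lengths(n, block_size, seed=0):
    """Simulate bottom-level promotion with probability 1/block_size.

    Each promoted element starts a new run; unpromoted elements extend
    the current run. Returns the list of non-empty run lengths.
    """
    rng = random.Random(seed)
    runs = [0]
    for _ in range(n):
        if rng.random() < 1.0 / block_size:
            runs.append(1)   # promoted element opens a new run
        else:
            runs[-1] += 1    # unpromoted element joins the current run
    return [r for r in runs if r > 0]


n, block_size = 100_000, 64
runs = run_lengths(n, block_size)
mean_run = sum(runs) / len(runs)
longest = max(runs)

print(f"runs: {len(runs)}")
print(f"mean run length: {mean_run:.1f} elements")
print(f"longest run: {longest} elements "
      f"(about {longest / block_size:.1f} blocks)")
```

Scanning the longest run costs several block transfers even though the average run fits in roughly one block, which is the kind of gap between expected and high-probability behavior the abstract points to.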


Published In

PODS '16: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems
June 2016
504 pages
ISBN:9781450341912
DOI:10.1145/2902251
  • General Chair: Tova Milo
  • Program Chair: Wang-Chiew Tan
© 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. B-tree
  2. cache-oblivious
  3. data structures
  4. external memory
  5. history-independence
  6. online list labelling
  7. packed-memory array
  8. sequential file maintenance

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California, USA

Acceptance Rates

PODS '16 Paper Acceptance Rate 31 of 94 submissions, 33%;
Overall Acceptance Rate 642 of 2,707 submissions, 24%

Article Metrics

  • Downloads (Last 12 months): 85
  • Downloads (Last 6 weeks): 11
Reflects downloads up to 10 Oct 2024

Cited By

  • (2024) History-Independent Concurrent Objects. Proceedings of the 43rd ACM Symposium on Principles of Distributed Computing, pages 14-24. DOI: 10.1145/3662158.3662814. Online publication date: 17-Jun-2024
  • (2024) History-Independent Dynamic Partitioning: Operation-Order Privacy in Ordered Data Structures. Proceedings of the ACM on Management of Data 2:2, pages 1-27. DOI: 10.1145/3651609. Online publication date: 14-May-2024
  • (2024) Layered List Labeling. Proceedings of the ACM on Management of Data 2:2, pages 1-19. DOI: 10.1145/3651602. Online publication date: 14-May-2024
  • (2023) Strongly History-Independent Storage Allocation: New Upper and Lower Bounds. 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS), pages 1822-1841. DOI: 10.1109/FOCS57990.2023.00111. Online publication date: 6-Nov-2023
  • (2022) Online List Labeling: Breaking the $\log^2 n$ Barrier. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pages 980-990. DOI: 10.1109/FOCS54457.2022.00096. Online publication date: Oct-2022
  • (2021) External-memory Dictionaries in the Affine and PDAM Models. ACM Transactions on Parallel Computing 8:3, pages 1-20. DOI: 10.1145/3470635. Online publication date: 20-Sep-2021
  • (2019) Small Refinements to the DAM Can Have Big Consequences for Data-Structure Design. The 31st ACM Symposium on Parallelism in Algorithms and Architectures, pages 265-274. DOI: 10.1145/3323165.3323210. Online publication date: 17-Jun-2019
  • (2019) eHIFS. Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, pages 573-585. DOI: 10.1145/3321705.3329839. Online publication date: 2-Jul-2019
  • (2017) Cost-Oblivious Storage Reallocation. ACM Transactions on Algorithms 13:3, pages 1-20. DOI: 10.1145/3070693. Online publication date: 26-May-2017
  • (2017) Write-Optimized Skip Lists. Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 69-78. DOI: 10.1145/3034786.3056117. Online publication date: 9-May-2017
