DOI: 10.1007/978-3-540-70575-8_32

Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract)

Published: 07 July 2008

Abstract

The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function $f\colon U\to \{0,1\}^r$ that has specified values on the elements of a given set $S\subseteq U$, $|S|=n$, but may have any value on elements outside $S$. All known methods (e.g. those based on perfect hash functions) induce a space overhead of $\Theta(n)$ bits over the optimum, regardless of the evaluation time. We show that for any $k$, query time $O(k)$ can be achieved using space that is within a factor $1+e^{-k}$ of optimal, asymptotically for large $n$. The time to construct the data structure is $O(n)$, expected. If we allow logarithmic evaluation time, the additive overhead can be reduced to $O(\log\log n)$ bits whp. A general reduction transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Thus we obtain space bounds arbitrarily close to the lower bound for this problem as well. The evaluation procedures of our data structures are extremely simple. For the results stated above we assume free access to fully random hash functions. This assumption can be justified using space $o(n)$ to simulate full randomness on a RAM.
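To make the two problems concrete, here is a minimal Python sketch of a retrieval structure and of the reduction from approximate membership to retrieval. It is an illustration under stated assumptions, not the construction from the paper: each key is mapped to k table cells whose XOR must equal the stored value, and the table is found by plain Gaussian elimination over GF(2), which is much slower than the O(n) expected construction time and wastes more space than the bounds claimed above. The names `build_retrieval`, `retrieve`, `build_filter`, `maybe_contains`, the slack factor 1.3, and the use of BLAKE2b as a stand-in for fully random hash functions are all assumptions made for this example.

```python
import hashlib

def _h(tag, key, mod):
    """Hash (tag, key) into [0, mod); a stand-in for a fully random hash."""
    digest = hashlib.blake2b(f"{tag}|{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % mod

def _positions(key, seed, k, m):
    """The k table cells that key reads (hypothetical hashing scheme)."""
    return [_h(f"pos{seed}.{i}", key, m) for i in range(k)]

def build_retrieval(f, k=3, slack=1.3, max_tries=50):
    """Return (seed, k, table) with retrieve(...) == f[x] for every key x in f."""
    m = max(k, int(slack * len(f)) + 1)   # table size; slack is an assumption
    for seed in range(max_tries):
        pivots = {}                       # pivot column -> (row bitmask, value)
        consistent = True
        for key, val in f.items():
            mask = 0
            for p in _positions(key, seed, k, m):
                mask ^= 1 << p            # repeated cells cancel, as in retrieve()
            while mask:                   # eliminate against the rows found so far
                col = (mask & -mask).bit_length() - 1   # lowest set column
                if col not in pivots:
                    pivots[col] = (mask, val)
                    break
                pmask, pval = pivots[col]
                mask ^= pmask
                val ^= pval
            else:                         # row reduced to the equation 0 = val
                if val != 0:
                    consistent = False    # unsolvable with this seed; retry
                    break
        if not consistent:
            continue
        table = [0] * m                   # free cells stay 0
        for col in sorted(pivots, reverse=True):        # back substitution
            mask, val = pivots[col]
            rest = mask ^ (1 << col)
            while rest:
                low = rest & -rest
                val ^= table[low.bit_length() - 1]
                rest ^= low
            table[col] = val
        return seed, k, table
    raise RuntimeError("no solvable system found; increase slack or max_tries")

def retrieve(struct, key):
    """XOR of k cells: f(key) for keys in S, an arbitrary value otherwise."""
    seed, k, table = struct
    out = 0
    for p in _positions(key, seed, k, len(table)):
        out ^= table[p]
    return out

def fingerprint(key, r):
    """An r-bit fingerprint, hashed independently of the cell positions."""
    return _h("fp", key, 1 << r)

def build_filter(keys, r=8):
    """Approximate membership: store each key's fingerprint as its value."""
    return r, build_retrieval({x: fingerprint(x, r) for x in keys})

def maybe_contains(filt, key):
    """Always True for stored keys; True for other keys with probability about 2**-r."""
    r, struct = filt
    return retrieve(struct, key) == fingerprint(key, r)

# Example use with made-up data.
colors = {"apple": 0b101, "pear": 0b011, "plum": 0b110}
structure = build_retrieval(colors)
filt = build_filter(colors.keys(), r=8)
print(retrieve(structure, "pear"), maybe_contains(filt, "plum"))
```

Note that the structure never stores the keys themselves, which is why a query for a key outside $S$ returns an arbitrary value, and why the filter built on top of it can report false positives but never false negatives.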



Published In

ICALP '08: Proceedings of the 35th International Colloquium on Automata, Languages and Programming - Volume Part I
July 2008
892 pages
ISBN: 3540705740
Editors: Luca Aceto, Ivan Damgård, Leslie Ann Goldberg, Magnús M. Halldórsson, Anna Ingólfsdóttir, Igor Walukiewicz

Publisher

Springer-Verlag

Berlin, Heidelberg


Cited By

  • (2024) Space Lower Bounds for Dynamic Filters and Value-Dynamic Retrieval. Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pp. 1153-1164. DOI: 10.1145/3618260.3649649. Online publication date: 10-Jun-2024
  • (2023) Forward Security with Crash Recovery for Secure Logs. ACM Transactions on Privacy and Security 27:1, pp. 1-28. DOI: 10.1145/3631524. Online publication date: 3-Nov-2023
  • (2022) Practical Volume-Hiding Encrypted Multi-Maps with Optimal Overhead and Beyond. Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 2825-2839. DOI: 10.1145/3548606.3559345. Online publication date: 7-Nov-2022
  • (2022) Binary Fuse Filters: Fast and Smaller Than Xor Filters. ACM Journal of Experimental Algorithmics 27, pp. 1-15. DOI: 10.1145/3510449. Online publication date: 4-Mar-2022
  • (2021) Peeling close to the orientability threshold. Proceedings of the Thirty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2194-2211. DOI: 10.5555/3458064.3458195. Online publication date: 10-Jan-2021
  • (2021) A four-dimensional analysis of partitioned approximate filters. Proceedings of the VLDB Endowment 14:11, pp. 2355-2368. DOI: 10.14778/3476249.3476286. Online publication date: 27-Oct-2021
  • (2020) Cuckoo index. Proceedings of the VLDB Endowment 13:13, pp. 3559-3572. DOI: 10.14778/3424573.3424577. Online publication date: 27-Oct-2020
  • (2019) Data domain cloud tier. Proceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference, pp. 647-660. DOI: 10.5555/3358807.3358862. Online publication date: 10-Jul-2019
  • (2019) Probabilistic Data Structures in Adversarial Environments. Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, pp. 1317-1334. DOI: 10.1145/3319535.3354235. Online publication date: 6-Nov-2019
  • (2019) Bloom Filters in Adversarial Environments. ACM Transactions on Algorithms 15:3, pp. 1-30. DOI: 10.1145/3306193. Online publication date: 7-Jun-2019
