Linear hash functions

Published: 01 September 1999

Abstract

Consider the set ℋ of all linear (or affine) transformations between two vector spaces over a finite field F. We study how good ℋ is as a class of hash functions, namely we consider hashing a set S of size n into a range having the same cardinality n by a randomly chosen function from ℋ and look at the expected size of the largest hash bucket. ℋ is a universal class of hash functions for any finite field, but with respect to our measure different fields behave differently.
If the finite field F has n elements, then there is a bad set S ⊆ F² of size n with expected maximal bucket size Ω(n^(1/3)). If n is a perfect square, then there is even a bad set with largest bucket size always at least √n. (This is worst possible, since with respect to a universal class of hash functions every set of size n has expected largest bucket size below √n + 1/2.)
If, however, we consider the field of two elements, then we get much better bounds. The best previously known upper bound on the expected size of the largest bucket for this class was O(2^√(log n)). We reduce this upper bound to O(log n log log n). Note that this is not far from the guarantee for a random function. There, the average largest bucket would be Θ(log n / log log n).
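The quantity studied here can be explored numerically. The following is a minimal sketch, not the authors' code: it samples a random linear map over the field of two elements, hashes a key set with it, and averages the largest bucket size over several trials. The function names, the bitmask representation of matrix rows, and the parameters `dim_in`, `dim_out`, and `trials` are all assumptions made for illustration. Only linear maps are sampled; adding a constant offset (the affine case) merely permutes the buckets and leaves their sizes unchanged.

```python
import random
from collections import Counter


def random_linear_map(dim_in, dim_out, rng):
    """Sample a uniformly random dim_out x dim_in matrix over GF(2).

    Each row is stored as an integer bitmask of dim_in bits.
    """
    return [rng.getrandbits(dim_in) for _ in range(dim_out)]


def apply_map(rows, x):
    """Matrix-vector product over GF(2).

    Bit i of the result is the inner product of rows[i] and x modulo 2.
    """
    y = 0
    for i, row in enumerate(rows):
        y |= (bin(row & x).count("1") & 1) << i
    return y


def largest_bucket(rows, keys):
    """Size of the fullest bucket when keys are hashed with the given map."""
    return max(Counter(apply_map(rows, x) for x in keys).values())


def average_largest_bucket(keys, dim_in, dim_out, trials, seed=0):
    """Empirical mean of the largest bucket size over random linear maps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += largest_bucket(random_linear_map(dim_in, dim_out, rng), keys)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(1)
    n_bits = 10                                   # n = 2**10 keys and 2**10 buckets
    keys = rng.sample(range(2 ** 32), 2 ** n_bits)
    print(average_largest_bucket(keys, dim_in=32, dim_out=n_bits, trials=50))
```

A random key set, as in the usage lines above, only illustrates typical behaviour; the bounds in the abstract concern the worst-case set S, which this sketch does not attempt to construct.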
In the course of our proof we develop a tool which may be of independent interest. Suppose we have a subset S of a vector space D over Z₂, and consider a random linear mapping of D to a smaller vector space R. If the cardinality of S is larger than c_ε |R| log |R|, then with probability 1 − ε, the image of S will cover all elements in the range.
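The covering statement can likewise be tried out empirically. The sketch below, again an illustration under assumed names rather than anything taken from the paper, estimates how often the image of a key set under a random linear map over the field of two elements hits every element of the range. The abstract does not give the constant c_ε, so the sketch leaves the set size as a free input instead of guessing a threshold.

```python
import random


def parity(v):
    """Return the number of set bits of v modulo 2."""
    return bin(v).count("1") & 1


def image_of(rows, keys):
    """Image of keys under the GF(2) linear map whose rows are bitmasks."""
    return {
        sum(parity(row & x) << i for i, row in enumerate(rows))
        for x in keys
    }


def covers_range(rows, keys):
    """True if every element of the range {0,1}^len(rows) is hit."""
    return len(image_of(rows, keys)) == 1 << len(rows)


def estimate_cover_probability(keys, dim_in, dim_out, trials, seed=0):
    """Fraction of random linear maps whose image of keys covers the range."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        rows = [rng.getrandbits(dim_in) for _ in range(dim_out)]
        hits += covers_range(rows, keys)
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(2)
    dim_in, dim_out = 24, 6                      # |R| = 2**6 = 64
    for size in (64, 256, 1024):
        keys = rng.sample(range(2 ** dim_in), size)
        print(size, estimate_cover_probability(keys, dim_in, dim_out, trials=200))
```

Running it for increasing set sizes shows how the estimated covering probability changes as |S| grows relative to |R| log |R|.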

Published In

Journal of the ACM  Volume 46, Issue 5
Sept. 1999
210 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/324133
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hashing via linear maps
  2. universal hashing

Qualifiers

  • Article

Cited By

  • (2022) Linear Hashing with ℓ∞ guarantees and two-sided Kakeya bounds. 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS), pp. 419-428. DOI: 10.1109/FOCS54457.2022.00047. Online publication date: Oct-2022
  • (2021) HalftimeHash: Modern Hashing Without 64-Bit Multipliers or Finite Fields. Algorithms and Data Structures, pp. 101-114. DOI: 10.1007/978-3-030-83508-8_8. Online publication date: 31-Jul-2021
  • (2021) Public-Coin Statistical Zero-Knowledge Batch Verification Against Malicious Verifiers. Advances in Cryptology – EUROCRYPT 2021, pp. 219-246. DOI: 10.1007/978-3-030-77883-5_8. Online publication date: 17-Oct-2021
  • (2020) Extractors and Secret Sharing Against Bounded Collusion Protocols. 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pp. 1226-1242. DOI: 10.1109/FOCS46700.2020.00117. Online publication date: Nov-2020
  • (2019) Derandomized balanced allocation. Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 2513-2526. DOI: 10.5555/3310435.3310589. Online publication date: 6-Jan-2019
  • (2019) Linear Hashing Is Awesome. SIAM Journal on Computing 48:2, pp. 736-741. DOI: 10.1137/17M1126801. Online publication date: 30-Apr-2019
  • (2018) Multi-Collision Resistant Hash Functions and Their Applications. Advances in Cryptology – EUROCRYPT 2018, pp. 133-161. DOI: 10.1007/978-3-319-78375-8_5. Online publication date: 31-Mar-2018
  • (2016) Linear Hashing Is Awesome. 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pp. 345-352. DOI: 10.1109/FOCS.2016.45. Online publication date: Oct-2016
  • (2015) On the k-Independence Required by Linear Probing and Minwise Independence. ACM Transactions on Algorithms 12:1, pp. 1-27. DOI: 10.1145/2716317. Online publication date: 16-Nov-2015
  • (2014) Fast Pseudorandomness for Independence and Load Balancing. Automata, Languages, and Programming, pp. 859-870. DOI: 10.1007/978-3-662-43948-7_71. Online publication date: 2014
