Hash Tables
Hash tables are a data structure used to store and retrieve data efficiently. In the worst case,
an operation on a hash table takes time proportional to the amount of data stored, or O(n). On
average, however, an operation takes constant time, or O(1): it completes in roughly the same
time no matter how much data the table holds.
Hash tables characteristically use key-value storage: each value is stored under a key and is
looked up by that key. For example, many websites use usernames and passwords to identify
authorized users. During authorization, such a site can look up the username in a hash table
rather than checking every existing user for a match.
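As a rough illustration of the key-value idea, the sketch below compares a linear scan with a lookup in Python's built-in dict, which is implemented as a hash table. The usernames, the placeholder records, and the function names are made up for this example; a real site would store salted password hashes, not strings like these.

```python
# Hypothetical user records: username (key) -> stored record (value).
users = {
    "alice": "record-for-alice",
    "bob": "record-for-bob",
    "carol": "record-for-carol",
}

def find_user_linear(user_list, username):
    """Check every (name, record) pair until a match is found: O(n)."""
    for name, record in user_list:
        if name == username:
            return record
    return None

def find_user_hashed(user_table, username):
    """Go to the record through the hash table: O(1) on average."""
    return user_table.get(username)

user_list = list(users.items())
print(find_user_linear(user_list, "carol"))  # scans alice, bob, then carol
print(find_user_hashed(users, "carol"))      # no scan over other users
print(find_user_hashed(users, "dave"))       # unknown user -> None
```

Both functions return the same answer; the difference is how much work they do as the number of users grows.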
Two concepts are central to how hash tables work: hashing and collisions.
Hashing
Hashing is the process of passing a key through a hash function, which transforms the key into
an index into an array of buckets, where the information is stored. (Think of a bucket as a
faster place to search for things than one really long list.)
Hashing works both ways: when storing data, the hash function determines where to store it, and
when retrieving data, the same function points to where to find it inside the hash table. The key
to a good hash table is choosing a good hash function. This will be addressed later in the lesson.
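To make the "works both ways" point concrete, here is a minimal sketch in which the same function picks the bucket for both storing and retrieving. The table size of 8, the character-sum hash, and the names bucket_index, store, and retrieve are all assumptions chosen for illustration, not a recommended design.

```python
NUM_BUCKETS = 8  # assumed table size for this sketch

def bucket_index(key, num_buckets=NUM_BUCKETS):
    """Turn a string key into a bucket index in the range 0..num_buckets-1."""
    # Illustrative hash only: sum of character codes, reduced to the table size.
    return sum(ord(ch) for ch in key) % num_buckets

# The table is an array of buckets; each bucket holds (key, value) pairs.
buckets = [[] for _ in range(NUM_BUCKETS)]

def store(key, value):
    """Use the hash function to decide which bucket receives the pair."""
    bucket = buckets[bucket_index(key)]
    for i, (stored_key, _) in enumerate(bucket):
        if stored_key == key:          # key already present: overwrite its value
            bucket[i] = (key, value)
            return
    bucket.append((key, value))

def retrieve(key):
    """Use the same hash function to decide which bucket to search."""
    for stored_key, value in buckets[bucket_index(key)]:
        if stored_key == key:
            return value
    return None

store("apple", 3)
store("pear", 7)
print(retrieve("apple"))   # 3
print(retrieve("grape"))   # None, never stored
```

Only one short bucket is searched on retrieval, rather than every stored pair.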
Collisions
Collisions are a big part of hash tables. A collision occurs when the hash function assigns a key
an index that has already been assigned to a different key. Collisions are difficult to avoid and
are bound to happen, so the key to a good hash table is collision resolution.
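The following sketch shows a collision happening. It assumes a deliberately simple character-sum hash (a hypothetical choice for this example), under which any two anagrams receive the same index. The function names and the table size of 7 are also illustrative.

```python
from collections import defaultdict

def bucket_index(key, num_buckets):
    """Illustrative hash: sum of character codes modulo the table size."""
    return sum(ord(ch) for ch in key) % num_buckets

def find_collisions(keys, num_buckets):
    """Group keys by index and keep only the indices shared by 2+ keys."""
    keys_by_index = defaultdict(list)
    for key in keys:
        keys_by_index[bucket_index(key, num_buckets)].append(key)
    return {i: ks for i, ks in keys_by_index.items() if len(ks) > 1}

# "listen"/"silent" contain the same letters, so their character sums match.
collisions = find_collisions(["listen", "silent", "banana"], 7)
for index, colliding_keys in collisions.items():
    print(index, colliding_keys)
```

A better hash function would separate anagrams, but no function can rule out collisions entirely once there are more possible keys than indices.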
Hash Functions
As mentioned earlier, the key to an efficient hash table is choosing a good hash function. For a
hash table capable of holding n key-value pairs, you need a hash function that transforms each
key into an index between 0 and n-1. This function should be easy to compute and should
distribute the keys uniformly across those indices.
One of the most common hashing methods is modular hashing. In this method, the number of
buckets available for key-value pairs (M) should be a prime number (to minimize collisions). For
any positive integer key (k), the function computes the remainder after dividing k by M (k % M).
This disperses the keys across the indices 0 to M-1.
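A short sketch of modular hashing follows. The bucket count M = 11 and the sample keys are arbitrary choices for illustration, and the function name modular_hash is hypothetical. The second half hints at why a prime M helps: keys that share a factor with M pile into a few buckets.

```python
M = 11  # number of buckets; a prime, chosen here only for illustration

def modular_hash(k, m=M):
    """Map a non-negative integer key k to a bucket index in 0..m-1."""
    if k < 0:
        raise ValueError("this sketch assumes non-negative integer keys")
    return k % m

for k in [22, 35, 47, 100, 1234]:
    print(k, "->", modular_hash(k))

# Keys that are all multiples of 10, hashed with a non-prime and a prime M.
patterned_keys = list(range(0, 100, 10))
used_with_10 = {modular_hash(k, 10) for k in patterned_keys}
used_with_11 = {modular_hash(k, 11) for k in patterned_keys}
print(len(used_with_10), "bucket(s) used when M = 10")
print(len(used_with_11), "bucket(s) used when M = 11")
```

With M = 10 every one of these keys lands in bucket 0, while M = 11 spreads them over ten different buckets.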
Even with a hash function that distributes keys perfectly uniformly, collisions are very likely.
For example, by the birthday problem, if 2,450 keys are hashed into 1,000,000 buckets, there is
roughly a 95% chance that at least two keys will be assigned the same index. Therefore, it is
important to have a good collision resolution method.
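The likelihood of a collision can be estimated with the classic birthday problem calculation, sketched below under the assumption that every key is placed uniformly at random and independently. The setting of 2,450 keys and 1,000,000 buckets is an illustrative choice that yields roughly 95%; the function name is hypothetical.

```python
def collision_probability(num_keys, num_buckets):
    """Chance that at least two of num_keys uniformly placed keys share a bucket."""
    if num_keys > num_buckets:
        return 1.0  # more keys than buckets: a collision is guaranteed
    p_no_collision = 1.0
    for already_placed in range(num_keys):
        # The next key must avoid every bucket that is already occupied.
        p_no_collision *= (num_buckets - already_placed) / num_buckets
    return 1.0 - p_no_collision

p = collision_probability(2450, 1_000_000)
print(f"{p:.2%}")
```

Even though fewer than 0.25% of the buckets are in use, a collision is nearly certain, which is why a resolution method is needed regardless of the hash function.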