Hashing
Definition
In search techniques such as linear search, binary search and
search trees, the time required to search for an element depends
on the total number of elements in the data structure
(for example O(n) or O(lg n)).
In all these search techniques, as the number of elements
increases, the time required to search for an element also
increases.
Hashing is another approach, in which the time required to
search for an element does not depend on the number of elements.
Using a hashing data structure, an element is searched
with constant time complexity, O(1), on average.
Hashing is an effective way to reduce the number of
comparisons needed to search for an element in a data structure.
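To make the difference in comparison counts concrete, the following is a small sketch, not part of the original slides. The function names are hypothetical, and Python's built-in set is used only as a readily available hash-based structure.

```python
def linear_search_comparisons(items, target):
    """Return how many elements are examined before target is found."""
    for count, item in enumerate(items, start=1):
        if item == target:
            return count
    return len(items)


def binary_search_comparisons(sorted_items, target):
    """Return how many midpoints are examined by a binary search."""
    low, high, count = 0, len(sorted_items) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        count += 1
        if sorted_items[mid] == target:
            return count
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return count


data = list(range(1000))                      # sorted keys 0..999
print(linear_search_comparisons(data, 999))   # 1000
print(binary_search_comparisons(data, 999))   # 10
print(999 in set(data))                       # True (hash-based lookup)
```

The first two counts grow with the number of elements, while the set lookup does not need to walk through the stored elements.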
Hashing is the process of indexing and retrieving
an element (data) in a data structure to provide a
faster way of finding the element using the hash
key. (The hash key is a value which provides the index
where the actual data is likely to be stored in
the data structure.)
Hash Table
In hashing, we use a data structure called a hash
table to store data.
It is based on the concept of hashing, in which the address
of each key is determined by a hash function.
A hash function is a mathematical function which
determines the address of a key in terms of its value.
A hash table is a data structure in which the location of a
data item is determined directly as a function of the data
item itself rather than by a sequence of comparisons.
The time required to locate a data item in a hash
table is O(1) on average; that is, it is constant and does
not depend on the number of data items stored.
Hash Function
A hash function is a mathematical formula or
function which takes a piece of data (i.e. the key) as
input and outputs an integer (i.e. the hash value)
which maps the data to a particular index in the
hash table.
Main considerations while choosing a hash
function:
1. It should be possible to compute it efficiently.
2. It should distribute the keys uniformly across the hash
table, to keep collisions to a minimum.
3. It should always give the same index for a given key.
Different Hash Functions
1. Division Method:
In the division method, the key k is mapped onto
one of the m slots in the hash table.
The key is divided by m and the
remainder of this division is taken as an
index into the hash table.
H(k) = k mod m (index starts from 0)
H(k) = (k mod m) + 1 (index starts from 1)
Example:
k = 21
m = 5
H(21) = 21 mod 5
= 1
Numerical:
Consider a hash table with 5 slots. Map the key 21 into the
hash table if the starting index is 1.
H(21) = (21 mod 5) + 1 = 2 (2nd slot)
Advantages:
• It is not difficult to calculate.
• It can use the full hash table, because the remainder can take
any value from 0 to m-1.
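The division method can be sketched in a few lines. The function name division_hash and the start_index parameter are assumptions made for illustration; they do not come from the slides.

```python
def division_hash(key, m, start_index=0):
    """Map an integer key to one of m slots using the remainder of key / m.

    start_index is 0 for tables indexed from 0 and 1 for tables indexed from 1.
    """
    return key % m + start_index


print(division_hash(21, 5))                 # 1
print(division_hash(21, 5, start_index=1))  # 2
```

With m = 5 the results always fall in 0..4 (or 1..5 when indexing from 1), so every slot of the table can be reached.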
2. Mid-square Method:
It is a two-step method.
In the first step, the square of the key value
k is taken.
In the second step, the hash value is
obtained by deleting digits from both ends of the
squared value, that is, k^2.
H(k) = s
where s is obtained by deleting digits from both
sides of k^2.
Example:
Consider a hash table with 100 slots and
the key values 3205, 7148, 2345.
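The following sketch works through these three keys. The text does not say which digits of k^2 are kept, so the choice here (the fourth and fifth digits counted from the right, giving a two-digit value for a 100-slot table) is an assumption, as is the function name mid_square_hash.

```python
def mid_square_hash(key, drop_right=3, keep=2):
    """Square the key, drop `drop_right` low-order digits, keep the next `keep` digits.

    The defaults select the 4th and 5th digits from the right of key**2
    (an assumed choice; other digit positions are equally valid).
    """
    squared = key * key
    return (squared // 10 ** drop_right) % 10 ** keep


for key in (3205, 7148, 2345):
    print(key, key * key, mid_square_hash(key))
# 3205 10272025 72
# 7148 51093904 93
# 2345 5499025 99
```

Whatever positions are chosen, the same positions must be used for every key.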
Linear Probing
H'(k) = (h(k) + i) mod m
where i is the number of attempts to resolve the collision (i = 0, 1, 2, ...).
Example:
Insert data with keys 28, 19, 59, 68, 89 into
a table of size 10.
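A minimal sketch of this insertion sequence follows. It assumes the primary hash function is h(k) = k mod m, which the example does not state explicitly; the function name is hypothetical.

```python
def insert_linear_probing(table, key):
    """Insert key using H'(k) = (h(k) + i) mod m and return the slot used.

    Assumes h(k) = k mod m, where m is the table size.
    """
    m = len(table)
    for i in range(m):
        slot = (key % m + i) % m
        if table[slot] is None:      # free slot found
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 10
for key in (28, 19, 59, 68, 89):
    print(key, "->", insert_linear_probing(table, key))
# 28 -> 8
# 19 -> 9
# 59 -> 0
# 68 -> 1
# 89 -> 2
print(table)
# [59, 68, 89, None, None, None, None, None, 28, 19]
```

Keys 59, 68 and 89 all collide and end up in consecutive slots 0, 1 and 2, right after the occupied slots 8 and 9.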
Disadvantage of Linear Probing
A problem with the linear probe method
is that blocks of data (occupied slots that
appear next to each other) can form when
collisions are resolved. This is
known as primary clustering.
This means that any key that hashes into
the cluster will require several attempts to
resolve the collision.
Quadratic Probing
To resolve the primary clustering
problem, quadratic probing can be used.
With quadratic probing, rather than always
moving one spot, move i^2 spots from the
point of collision, where i is the number of
attempts to resolve the collision.
H'(k) = (h(k) + c1*i + c2*i^2) mod m,
where c1 and c2 are auxiliary constants.
Exercise
Consider inserting the keys
76, 26, 37, 59, 21 into a hash table of size 11
slots using quadratic probing (c1 = 1 and
c2 = 3). Further consider that the primary
hash function is h(k) = k mod m.
Ans: 76 -> 10, 26 -> 4, 37 -> 8 (i = 1), 59 -> 7 (i = 2), 21 -> 3 (i = 1)
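The exercise can be checked with a short sketch. The function name is hypothetical, and the limit of m probes per key is an assumption made here, not something the slides specify.

```python
def insert_quadratic_probing(table, key, c1=1, c2=3):
    """Insert key using H'(k) = (h(k) + c1*i + c2*i^2) mod m, with h(k) = k mod m.

    Returns the slot used. Gives up after m probes (an assumed limit).
    """
    m = len(table)
    for i in range(m):
        slot = (key % m + c1 * i + c2 * i * i) % m
        if table[slot] is None:      # free slot found
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within m probes")


table = [None] * 11
for key in (76, 26, 37, 59, 21):
    print(key, "->", insert_quadratic_probing(table, key))
# 76 -> 10
# 26 -> 4
# 37 -> 8
# 59 -> 7
# 21 -> 3
```

Keys 26, 37 and 59 all hash to slot 4; the second and third are moved to slots 8 and 7 instead of filling neighbouring slots. Depending on c1, c2 and m, a quadratic probe sequence may not visit every slot, which is why the sketch can fail even when the table is not full.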
Double Hashing
Double hashing uses the idea of applying a
second hash function to the key when a
collision occurs.
The result of the second hash function will
be the number of positions from the point of
collision to insert.
H(k) = (h1(k) + i*h2(k)) mod m
where h1(k) = k mod m,
h2(k) = k mod (m-1), and i is the number of
attempts to resolve the collision.
Exercise:
Consider inserting the keys
76, 26, 37, 59, 21, 65, 88 into a hash table of
size 11 slots using double hashing. Further
consider that the auxiliary hash functions
are h1(k) = k mod 11 and h2(k) = k mod 9.
Ans: 10, 4, 5, 9, 2, 1, 0
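The exercise can be traced with the sketch below, which uses the two auxiliary functions given in the exercise. The function name and the h2_modulus parameter are hypothetical, and the limit of m probes is an assumption.

```python
def insert_double_hashing(table, key, h2_modulus=9):
    """Insert key using H(k) = (h1(k) + i*h2(k)) mod m and return the slot used.

    h1(k) = k mod m and h2(k) = k mod h2_modulus, as in the exercise.
    Gives up after m probes (an assumed limit).
    """
    m = len(table)
    h1 = key % m
    h2 = key % h2_modulus
    for i in range(m):
        slot = (h1 + i * h2) % m
        if table[slot] is None:      # free slot found
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within m probes")


table = [None] * 11
for key in (76, 26, 37, 59, 21, 65, 88):
    print(key, "->", insert_double_hashing(table, key))
# 76 -> 10
# 26 -> 4
# 37 -> 5
# 59 -> 9
# 21 -> 2
# 65 -> 1
# 88 -> 0
```

One caveat of this choice of h2: a key that is a multiple of 9 gives h2(k) = 0, so every probe lands on the same slot. The sketch simply raises an error in that case.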