
Hashing


HASHING

Definition
In search techniques such as linear search, binary search and
search trees, the time required to search for an element depends
on the total number of elements n in the data structure
(time complexity O(n) for linear search, O(lg n) for binary search
and balanced search trees).
In all these search techniques, as the number of elements
increases, the time required to search for an element also
increases.
Hashing is another approach, in which the time required to
search for an element does not depend on the number of elements.
Using a hashing data structure, an element is searched
with constant time complexity, O(1), on average.
Hashing is an effective way to reduce the number of
comparisons needed to search for an element in a data structure.
Hashing is the process of indexing and retrieving an
element (data) in a data structure to provide a
faster way of finding the element using the hash
key. (The hash key is a value which provides the index
where the actual data is likely to be stored in
the data structure.)
Hash Table
In this approach, data is stored in a structure called a hash
table.
It is based on the concept of hashing, in which the address
of each key is determined by a hash function.
A hash function is a mathematical function which
determines the address of a key in terms of its value.
A hash table is a data structure in which the location of a
data item is determined directly as a function of the data
item itself rather than by a sequence of comparisons.
The time required to locate a data item in a hash
table is O(1) on average; that is, it is constant and does not depend
on the number of data items stored.
Hash Function
A hash function is a mathematical formula or
function which takes a piece of data (i.e. the key) as
input and outputs an integer (i.e. the hash value)
which maps the data to a particular index in the
hash table.
Main considerations while choosing a hash
function:
1. It should be possible to compute it efficiently.
2. It should distribute the keys uniformly across the hash
table to keep collisions to a minimum.
3. It should be deterministic: a given key must always map to the same index.
Different Hash Functions
1. Division Method:
In the division method, the key k is mapped onto
one of the m slots in the hash table.
The key is divided by m and the
remainder of this division is taken as the
index into the hash table.
H(k) = k mod m (index starts from 0)
H(k) = (k mod m) + 1 (index starts from 1)
Example:
k = 21
m = 5
H(21) = 21 mod 5
= 1

Numerical:
Consider a hash table with 5 slots. Map the key 21 into the
hash table if the starting index is 1.
H(21) = (21 mod 5) + 1 = 2 (2nd slot)
Advantages:
• It is not difficult to calculate.
• It can use the full hash table, because k mod m takes every value
from 0 to m-1.
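The two division-method formulas can be sketched in a few lines of Python. The function name `division_hash` and the `start_index` parameter are illustrative choices, not part of the slides; the sketch assumes non-negative integer keys.

```python
def division_hash(key: int, m: int, start_index: int = 0) -> int:
    """Division-method hash: remainder of key / m, shifted by the first index.

    start_index=0 gives H(k) = k mod m; start_index=1 gives H(k) = (k mod m) + 1.
    """
    if m <= 0:
        raise ValueError("table size m must be positive")
    return key % m + start_index


print(division_hash(21, 5))       # 1  (slots numbered 0..4)
print(division_hash(21, 5, 1))    # 2  (slots numbered 1..5)
```

Either way the key 21 lands in the second slot of a 5-slot table; only the numbering of the slots differs.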
2. Mid-square Method:
It is a two-step method.
In the first step, the square of the key value
k is taken.
In the second step, the hash value is
obtained by deleting digits from both ends of the
squared value, that is k^2.
H(k) = s
where s is obtained by deleting digits from both
sides of k^2.
Example:
Consider a hash table with 100 slots and
the key values 3205, 7148, 2345.

Note: remove the same number of digits for every key, i.e. use the same digit positions of k^2 each time.
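A minimal sketch of the mid-square method for the three keys above. The slides do not say which digits of k^2 to keep; this sketch assumes the 4th and 5th digits counted from the right (a common textbook choice), which gives a two-digit value suitable for 100 slots. The function and parameter names are hypothetical.

```python
def mid_square_hash(key: int, drop_right: int = 3, keep: int = 2) -> int:
    """Square the key, drop `drop_right` digits on the right, keep the next `keep` digits.

    Assumption: the same digit positions are used for every key.
    """
    squared = key * key
    return (squared // 10 ** drop_right) % 10 ** keep


for k in (3205, 7148, 2345):
    print(k, k * k, mid_square_hash(k))
# 3205 10272025 72
# 7148 51093904 93
# 2345 5499025 99
```

A different choice of digit positions would give different hash values; what matters is that the choice is fixed for the whole table.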


3. Folding Method:
It is a two-step method.
In the first step, the key value k is divided
into a number of parts k1, k2, k3, ..., kr, where
each part has the same number of digits
except the last part, which can have fewer
digits.
In the second step, these parts are added
together and the hash value is obtained by
ignoring the last carry, if any.
Example:
Q. Consider a hash table with 100 slots (m = 100) and
key values k = 9235, 714, 71458. (Ans: 27, 75, 24)
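The folding steps can be checked with a short Python sketch. It assumes non-negative integer keys split into two-digit parts from the left, and treats "ignoring the last carry" as keeping only the last two digits of the sum; `folding_hash` and `part_digits` are illustrative names.

```python
def folding_hash(key: int, part_digits: int = 2) -> int:
    """Split the key's decimal digits into parts, add them, and drop the carry."""
    digits = str(key)
    parts = [int(digits[i:i + part_digits])
             for i in range(0, len(digits), part_digits)]
    # Dropping the final carry is the same as reducing modulo 10**part_digits.
    return sum(parts) % 10 ** part_digits


print([folding_hash(k) for k in (9235, 714, 71458)])  # [27, 75, 24]
```

For example, 9235 is split into 92 and 35, whose sum is 127; dropping the carry leaves 27.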
Disadvantage of Hashing
Collision:
A collision is one of the limitations of
hashing.
It is said to occur when two or more
unequal keys hash to the same slot.
To resolve collisions, we have two
separate schemes:
1. Collision resolution by separate chaining.
2. Collision resolution by open addressing.
Collision Resolution Techniques
1. Collision Resolution by Chaining:
When a collision occurs, elements with
the same hash key will
be chained together.
A chain is simply a linked list of all the
elements with the same hash key.
The hash table slots will no longer hold a
table element. They will now hold the
address of a table element.
Example:

Disadvantage: Collisions are resolved at the cost of
time complexity, since the list also has to be searched
now.
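A minimal sketch of separate chaining, assuming integer keys and the division hash k mod m. Python lists stand in for the linked lists described above, and the class name, method names and sample keys are chosen only for illustration.

```python
class ChainedHashTable:
    """Hash table where each slot holds a chain of [key, value] pairs."""

    def __init__(self, m: int = 10):
        self.m = m
        self.slots = [[] for _ in range(m)]  # one (initially empty) chain per slot

    def _index(self, key: int) -> int:
        return key % self.m  # assumed hash function: division method

    def insert(self, key: int, value) -> None:
        chain = self.slots[self._index(key)]
        for pair in chain:
            if pair[0] == key:      # key already present: overwrite its value
                pair[1] = value
                return
        chain.append([key, value])  # otherwise add to the end of the chain

    def search(self, key: int):
        # Only the chain of one slot is scanned, not the whole table.
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None


table = ChainedHashTable(m=10)
for key in (28, 19, 59, 68, 89):
    table.insert(key, f"item-{key}")
print(table.slots[8])    # [[28, 'item-28'], [68, 'item-68']]
print(table.slots[9])    # [[19, 'item-19'], [59, 'item-59'], [89, 'item-89']]
print(table.search(59))  # item-59
```

The output shows the cost mentioned above: finding 59 requires walking part of the chain in slot 9 rather than a single array access.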
2. Collision Resolution by Open Addressing
1. Linear Probing
2. Quadratic Probing
3. Double Hashing (rehashing)
Linear Probing
When using a linear probe, the item will be stored in the next
available slot in the table, assuming that the table is not
already full.
This is implemented via a linear search for an empty slot,
from the point of collision. If the physical end of table is
reached during the linear search, the search will wrap around
to the beginning of the table and continue from there.
If an empty slot is not found before the search returns to the point of
collision, the table is full.

H'(k, i) = (h(k) + i) mod m, where i = 0, 1, ..., m-1 is the probe number
Example:
Insert data with keys 28, 19, 59, 68, 89 into
a table of size 10.
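The example can be traced with a short sketch. The slides do not state the primary hash function for this example; the sketch assumes h(k) = k mod m, and `linear_probe_insert` is a hypothetical helper name.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert key using linear probing and return the slot it was stored in."""
    m = len(table)
    home = key % m                 # assumed primary hash h(k) = k mod m
    for i in range(m):             # at most m probes, wrapping around the table
        slot = (home + i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


table = [None] * 10
print([linear_probe_insert(table, k) for k in (28, 19, 59, 68, 89)])
# [8, 9, 0, 1, 2]
print(table)
# [59, 68, 89, None, None, None, None, None, 28, 19]
```

Keys 59, 68 and 89 all collide near the end of the table and wrap around to slots 0, 1 and 2, forming one contiguous block of occupied slots.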
Disadvantage of Linear Probing
A problem with the linear probe method
is that blocks of occupied slots can
form next to each other when collisions are resolved. This is
known as primary clustering.
This means that any key that hashes into
the cluster will require several attempts to
resolve the collision.
Quadratic Probing
To resolve the primary clustering
problem, quadratic probing can be used.
With quadratic probing, rather than always
moving one spot, move i^2 spots from the
point of collision, where i is the number of
attempts to resolve the collision.
H'(k, i) = (h(k) + c1*i + c2*i^2) mod m,
where c1 and c2 are auxiliary constants.
Example:
Exercise
Consider inserting the keys
76, 26, 37, 59, 21 into a hash table of size 11
slots using quadratic probing (c1 = 1 and
c2 = 3). Further consider that the primary
hash function is k mod m.
Ans = 10, 4, 8, 7, 3
(37 and 59 both collide at slot 4; 21 collides at slot 10.)
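The exercise can be worked through with a small sketch of the probe formula (h(k) + c1*i + c2*i^2) mod m, with the probe number i starting at 0 and h(k) = k mod m as stated. The helper name `quadratic_probe_insert` is hypothetical.

```python
def quadratic_probe_insert(table: list, key: int, c1: int = 1, c2: int = 3) -> int:
    """Insert key using quadratic probing and return the slot it was stored in."""
    m = len(table)
    home = key % m                                  # primary hash k mod m
    for i in range(m):                              # i = number of the attempt
        slot = (home + c1 * i + c2 * i * i) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    # Quadratic probing may fail to find a free slot even if one exists.
    raise OverflowError("no free slot found within m probes")


table = [None] * 11
print([quadratic_probe_insert(table, k) for k in (76, 26, 37, 59, 21)])
# [10, 4, 8, 7, 3]
```

Tracing by hand: 37 hashes to 4 (occupied by 26) and moves to (4 + 1 + 3) mod 11 = 8; 59 hashes to 4, then 8, then (4 + 2 + 12) mod 11 = 7; 21 hashes to 10 (occupied by 76) and moves to (10 + 1 + 3) mod 11 = 3.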
Double Hashing
Double hashing uses the idea of applying a
second hash function to the key when a
collision occurs.
The result of the second hash function will
be the number of positions from the point of
collision to insert.
H(k, i) = (h1(k) + i*h2(k)) mod m
where h1(k) = k mod m
h2(k) = k mod (m-1)
Exercise:
Consider inserting the keys
76, 26, 37, 59, 21, 65, 88 into a hash table of
size 11 slots using double hashing. Further
consider that the auxiliary hash functions
are h1(k) = k mod 11 and h2(k) = k mod 9.
Ans = 10, 4, 5, 9, 2, 1, 0
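The exercise answer can be reproduced with a short sketch using the stated functions h1(k) = k mod 11 and h2(k) = k mod 9. The helper name `double_hash_insert` and the `h2_mod` parameter are illustrative. Note that with this h2, a key that is a multiple of 9 has a step of 0 and would never move off its home slot, so the loop is bounded to m probes.

```python
def double_hash_insert(table: list, key: int, h2_mod: int = 9) -> int:
    """Insert key using double hashing and return the slot it was stored in."""
    m = len(table)
    h1 = key % m           # primary hash: home slot
    h2 = key % h2_mod      # secondary hash: step size between probes
    for i in range(m):     # bounded, since a step of 0 would otherwise loop forever
        slot = (h1 + i * h2) % m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no free slot found within m probes")


table = [None] * 11
print([double_hash_insert(table, k) for k in (76, 26, 37, 59, 21, 65, 88)])
# [10, 4, 5, 9, 2, 1, 0]
```

For instance, 65 hashes to slot 10 (occupied by 76); its step is 65 mod 9 = 2, so the next probe is (10 + 2) mod 11 = 1.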
