Introduction to Hashing Last Updated : 21 Mar, 2025 Comments Improve Suggest changes Like Article Like Report Hashing refers to the process of generating a small sized output (that can be used as index in a table) from an input of typically large and variable size. Hashing uses mathematical formulas known as hash functions to do the transformation. This technique determines an index or location for the storage of an item in a data structure called Hash Table.Introduction to HashingHash Table Data Structure OverviewIt is one of the most widely used data structure after arrays. It mainly supports search, insert and delete in O(1) time on average which is more efficient than other popular data structures like arrays, Linked List and Self Balancing BST.We use hashing for dictionaries, frequency counting, maintaining data for quick access by key, etc. Real World Applications include Database Indexing, Cryptography, Caches, Symbol Table and Dictionaries.There are mainly two forms of hash typically implemented in programming languages. Hash Set : Collection of unique keys (Implemented as Set in Python, Set in JavaScrtipt, unordered_set in C++ and HashSet in Java.Hash Map : Collection of key value pairs with keys being unique (Implemented as dictionary in Python, Map in JavaScript, unordered_map in C++ and HashMap in Java)Situations Where Hash is not UsedNeed to maintain sorted data along with search, insert and delete. We use a self balancing BST in these cases.When Strings are keys and we need operations like prefix search along with search, insert and delete. We use Trie in these cases.When we need operations like floor and ceiling along with search, insert and/or delete. We use Self Balancing BST in these cases.Components of Hashing There are majorly three components of hashing: Key: A Key can be anything string or integer which is fed as input in the hash function the technique that determines an index or location for storage of an item in a data structure. Hash Function: Receives the input key and returns the index of an element in an array called a hash table. The index is known as the hash index . Hash Table: Hash table is typically an array of lists. It stores values corresponding to the keys. Hash stores the data in an associative manner in an array where each data value has its own unique index. How does Hashing work? Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a table. Step 1: We know that hash functions (which is some mathematical formula) are used to calculate the hash value which acts as the index of the data structure where the value will be stored. Step 2: So, let's assign “a” = 1, “b”=2, .. etc, to all alphabetical characters. Step 3: Therefore, the numerical value by summation of all characters of the string: “ab” = 1 + 2 = 3, “cd” = 3 + 4 = 7 , “efg” = 5 + 6 + 7 = 18 Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function that is used here is the sum of the characters in key mod Table size . We can compute the location of the string in the array by taking the sum(string) mod 7 . Step 5: So we will then store “ab” in 3 mod 7 = 3, “cd” in 7 mod 7 = 0, and “efg” in 18 mod 7 = 4. The above technique enables us to calculate the location of a given string by using a simple hash function and rapidly find the value that is stored in that location. Therefore the idea of hashing seems like a great way to store (key, value) pairs of the data in a table. What is a Hash function? A hash function creates a mapping from an input key to an index in hash table, this is done through the use of mathematical formulas known as hash functions. For example: Consider phone numbers as keys and a hash table of size 100. A simple example hash function can be to consider the last two digits of phone numbers so that we have valid array indexes as output. A good hash function should have the following properties: Efficient Should uniformly distribute the keys to each index of hash table.Should minimize collisions (This and the below are mainly derived from the above 2nd point)Should have a low load factor (number of items in the table divided by the size of the table). What is Collision in Hashing?When two or more keys have the same hash value, a collision happens. If we consider the above example, the hash function we used is the sum of the letters, but if we examined the hash function closely then the problem can be easily visualised that for different strings same hash value is being generated by the hash function. For example: {“ab”, “ba”} both have the same hash value, and string {“cd”,”be”} also generate the same hash value, etc. This is known as collision and it creates problem in searching, insertion, deletion, and updating of value. Collision in HashingThe probability of a hash collision depends on the size of the algorithm, the distribution of hash values and the efficiency of Hash function. To handle this collision, we use Collision Resolution Techniques. What is meant by Load Factor in Hashing? The load factor of the hash table can be defined as the number of items the hash table contains divided by the size of the hash table. Load factor is the decisive parameter that is used when we want to rehash the previous hash function or want to add more elements to the existing hash table. It helps us in determining the efficiency of the hash function i.e. it tells whether the hash function which we are using is distributing the keys uniformly or not in the hash table. Load Factor = Total elements in hash table/ Size of hash table What is Rehashing? As the name suggests, rehashing means hashing again. Basically, when the load factor increases to more than its predefined value (the default value of the load factor is 0.75), the complexity increases. So to overcome this, the size of the array is increased (doubled) and all the values are hashed again and stored in the new double-sized array to maintain a low load factor and low complexity. How to Create Your Own Hash Table?You Own Hash Table with Chaining Your Own Hash Table with Linear Probing in Open Addressing Your Own Hash Table with Quadratic Probing in Open Addressing Comment More infoAdvertise with us Next Article What is Hashing? kartik Follow Improve Article Tags : Algorithms Hash Data Structures DSA HashTable HashSet Hash Tutorials DSA-Blogs DSA Tutorials +6 More Practice Tags : AlgorithmsData StructuresHashHash Similar Reads Hashing in Data Structure Hashing is a technique used in data structures that efficiently stores and retrieves data in a way that allows for quick access. Hashing involves mapping data to a specific index in a hash table (an array of items) using a hash function. It enables fast retrieval of information based on its key. The 3 min read Introduction to Hashing Hashing refers to the process of generating a small sized output (that can be used as index in a table) from an input of typically large and variable size. Hashing uses mathematical formulas known as hash functions to do the transformation. This technique determines an index or location for the stor 7 min read What is Hashing? Hashing refers to the process of generating a fixed-size output from an input of variable size using the mathematical formulas known as hash functions. This technique determines an index or location for the storage of an item in a data structure. Need for Hash data structureThe amount of data on the 3 min read Index Mapping (or Trivial Hashing) with negatives allowed Index Mapping (also known as Trivial Hashing) is a simple form of hashing where the data is directly mapped to an index in a hash table. The hash function used in this method is typically the identity function, which maps the input data to itself. In this case, the key of the data is used as the ind 7 min read Separate Chaining Collision Handling Technique in Hashing Separate Chaining is a collision handling technique. Separate chaining is one of the most popular and commonly used techniques in order to handle collisions. In this article, we will discuss about what is Separate Chain collision handling technique, its advantages, disadvantages, etc.There are mainl 3 min read Open Addressing Collision Handling technique in Hashing Open Addressing is a method for handling collisions. In Open Addressing, all elements are stored in the hash table itself. So at any point, the size of the table must be greater than or equal to the total number of keys (Note that we can increase table size by copying old data if needed). This appro 7 min read Double Hashing Double hashing is a collision resolution technique used in hash tables. It works by using two hash functions to compute two different hash values for a given key. The first hash function is used to compute the initial hash value, and the second hash function is used to compute the step size for the 15+ min read Load Factor and Rehashing Prerequisites: Hashing Introduction and Collision handling by separate chaining How hashing works: For insertion of a key(K) - value(V) pair into a hash map, 2 steps are required: K is converted into a small integer (called its hash code) using a hash function.The hash code is used to find an index 15+ min read Easy problems on HashingCheck if an array is subset of another arrayGiven two arrays a[] and b[] of size m and n respectively, the task is to determine whether b[] is a subset of a[]. Both arrays are not sorted, and elements are distinct.Examples: Input: a[] = [11, 1, 13, 21, 3, 7], b[] = [11, 3, 7, 1] Output: trueInput: a[]= [1, 2, 3, 4, 5, 6], b = [1, 2, 4] Output 13 min read Union and Intersection of two Linked List using HashingGiven two singly Linked Lists, create union and intersection lists that contain the union and intersection of the elements present in the given lists. Each of the two linked lists contains distinct node values.Note: The order of elements in output lists doesnât matter. Examples:Input:head1 : 10 - 10 min read Two Sum - Pair with given SumGiven an array arr[] of n integers and a target value, the task is to find whether there is a pair of elements in the array whose sum is equal to target. This problem is a variation of 2Sum problem.Examples: Input: arr[] = [0, -1, 2, -3, 1], target = -2Output: trueExplanation: There is a pair (1, -3 15+ min read Max Distance Between Two OccurrencesGiven an array arr[], the task is to find the maximum distance between two occurrences of any element. If no element occurs twice, return 0.Examples: Input: arr = [1, 1, 2, 2, 2, 1]Output: 5Explanation: distance for 1 is: 5-0 = 5, distance for 2 is: 4-2 = 2, So max distance is 5.Input : arr[] = [3, 8 min read Most frequent element in an arrayGiven an array, the task is to find the most frequent element in it. If there are multiple elements that appear a maximum number of times, return the maximum element.Examples: Input : arr[] = [1, 3, 2, 1, 4, 1]Output : 1Explanation: 1 appears three times in array which is maximum frequency.Input : a 10 min read Only Repeating From 1 To n-1Given an array arr[] of size n filled with numbers from 1 to n-1 in random order. The array has only one repetitive element. The task is to find the repetitive element.Examples:Input: arr[] = [1, 3, 2, 3, 4]Output: 3Explanation: The number 3 is the only repeating element.Input: arr[] = [1, 5, 1, 2, 15+ min read Check for Disjoint Arrays or SetsGiven two arrays a and b, check if they are disjoint, i.e., there is no element common between both the arrays.Examples: Input: a[] = {12, 34, 11, 9, 3}, b[] = {2, 1, 3, 5} Output: FalseExplanation: 3 is common in both the arrays.Input: a[] = {12, 34, 11, 9, 3}, b[] = {7, 2, 1, 5} Output: True Expla 11 min read Non-overlapping sum of two setsGiven two arrays A[] and B[] of size n. It is given that both array individually contains distinct elements. We need to find the sum of all elements that are not common.Examples: Input : A[] = {1, 5, 3, 8} B[] = {5, 4, 6, 7}Output : 291 + 3 + 4 + 6 + 7 + 8 = 29Input : A[] = {1, 5, 3, 8} B[] = {5, 1, 9 min read Check if two arrays are equal or notGiven two arrays, a and b of equal length. The task is to determine if the given arrays are equal or not. Two arrays are considered equal if:Both arrays contain the same set of elements.The arrangements (or permutations) of elements may be different.If there are repeated elements, the counts of each 6 min read Find missing elements of a rangeGiven an array, arr[0..n-1] of distinct elements and a range [low, high], find all numbers that are in a range, but not the array. The missing elements should be printed in sorted order.Examples: Input: arr[] = {10, 12, 11, 15}, low = 10, high = 15Output: 13, 14Input: arr[] = {1, 14, 11, 51, 15}, lo 15+ min read Minimum Subsets with Distinct ElementsYou are given an array of n-element. You have to make subsets from the array such that no subset contain duplicate elements. Find out minimum number of subset possible.Examples : Input : arr[] = {1, 2, 3, 4}Output :1Explanation : A single subset can contains all values and all values are distinct.In 9 min read Remove minimum elements such that no common elements exist in two arraysGiven two arrays arr1[] and arr2[] consisting of n and m elements respectively. The task is to find the minimum number of elements to remove from each array such that intersection of both arrays becomes empty and both arrays become mutually exclusive.Examples: Input: arr[] = { 1, 2, 3, 4}, arr2[] = 8 min read 2 Sum - Count pairs with given sumGiven an array arr[] of n integers and a target value, the task is to find the number of pairs of integers in the array whose sum is equal to target.Examples: Input: arr[] = {1, 5, 7, -1, 5}, target = 6Output: 3Explanation: Pairs with sum 6 are (1, 5), (7, -1) & (1, 5). Input: arr[] = {1, 1, 1, 9 min read Count quadruples from four sorted arrays whose sum is equal to a given value xGiven four sorted arrays each of size n of distinct elements. Given a value x. The problem is to count all quadruples(group of four numbers) from all the four arrays whose sum is equal to x.Note: The quadruple has an element from each of the four arrays. Examples: Input : arr1 = {1, 4, 5, 6}, arr2 = 15+ min read Sort elements by frequency | Set 4 (Efficient approach using hash)Print the elements of an array in the decreasing frequency if 2 numbers have the same frequency then print the one which came first. Examples: Input : arr[] = {2, 5, 2, 8, 5, 6, 8, 8} Output : arr[] = {8, 8, 8, 2, 2, 5, 5, 6} Input : arr[] = {2, 5, 2, 6, -1, 9999999, 5, 8, 8, 8} Output : arr[] = {8, 12 min read Find all pairs (a, b) in an array such that a % b = kGiven an array with distinct elements, the task is to find the pairs in the array such that a % b = k, where k is a given integer. You may assume that a and b are in small range Examples : Input : arr[] = {2, 3, 5, 4, 7} k = 3Output : (7, 4), (3, 4), (3, 5), (3, 7)7 % 4 = 33 % 4 = 33 % 5 = 33 % 7 = 15 min read Group words with same set of charactersGiven a list of words with lower cases. Implement a function to find all Words that have the same unique character set. Example: Input: words[] = { "may", "student", "students", "dog", "studentssess", "god", "cat", "act", "tab", "bat", "flow", "wolf", "lambs", "amy", "yam", "balms", "looped", "poodl 8 min read k-th distinct (or non-repeating) element among unique elements in an array.Given an integer array arr[], print kth distinct element in this array. The given array may contain duplicates and the output should print the k-th element among all unique elements. If k is more than the number of distinct elements, print -1.Examples:Input: arr[] = {1, 2, 1, 3, 4, 2}, k = 2Output: 7 min read Intermediate problems on HashingFind Itinerary from a given list of ticketsGiven a list of tickets, find the itinerary in order using the given list.Note: It may be assumed that the input list of tickets is not cyclic and there is one ticket from every city except the final destination.Examples:Input: "Chennai" -> "Bangalore" "Bombay" -> "Delhi" "Goa" -> "Chennai" 11 min read Find number of Employees Under every ManagerGiven a 2d matrix of strings arr[][] of order n * 2, where each array arr[i] contains two strings, where the first string arr[i][0] is the employee and arr[i][1] is his manager. The task is to find the count of the number of employees under each manager in the hierarchy and not just their direct rep 9 min read Longest Subarray With Sum Divisible By KGiven an arr[] containing n integers and a positive integer k, he problem is to find the longest subarray's length with the sum of the elements divisible by k.Examples:Input: arr[] = [2, 7, 6, 1, 4, 5], k = 3Output: 4Explanation: The subarray [7, 6, 1, 4] has sum = 18, which is divisible by 3.Input: 10 min read Longest Subarray with 0 Sum Given an array arr[] of size n, the task is to find the length of the longest subarray with sum equal to 0.Examples:Input: arr[] = {15, -2, 2, -8, 1, 7, 10, 23}Output: 5Explanation: The longest subarray with sum equals to 0 is {-2, 2, -8, 1, 7}Input: arr[] = {1, 2, 3}Output: 0Explanation: There is n 10 min read Longest Increasing consecutive subsequenceGiven N elements, write a program that prints the length of the longest increasing consecutive subsequence. Examples: Input : a[] = {3, 10, 3, 11, 4, 5, 6, 7, 8, 12} Output : 6 Explanation: 3, 4, 5, 6, 7, 8 is the longest increasing subsequence whose adjacent element differs by one. Input : a[] = {6 10 min read Count Distinct Elements In Every Window of Size KGiven an array arr[] of size n and an integer k, return the count of distinct numbers in all windows of size k. Examples: Input: arr[] = [1, 2, 1, 3, 4, 2, 3], k = 4Output: [3, 4, 4, 3]Explanation: First window is [1, 2, 1, 3], count of distinct numbers is 3. Second window is [2, 1, 3, 4] count of d 10 min read Design a data structure that supports insert, delete, search and getRandom in constant timeDesign a data structure that supports the following operations in O(1) time.insert(x): Inserts an item x to the data structure if not already present.remove(x): Removes item x from the data structure if present. search(x): Searches an item x in the data structure.getRandom(): Returns a random elemen 5 min read Subarray with Given Sum - Handles Negative NumbersGiven an unsorted array of integers, find a subarray that adds to a given number. If there is more than one subarray with the sum of the given number, print any of them.Examples: Input: arr[] = {1, 4, 20, 3, 10, 5}, sum = 33Output: Sum found between indexes 2 and 4Explanation: Sum of elements betwee 13 min read Implementing our Own Hash Table with Separate Chaining in JavaAll data structure has their own special characteristics, for example, a BST is used when quick searching of an element (in log(n)) is required. A heap or a priority queue is used when the minimum or maximum element needs to be fetched in constant time. Similarly, a hash table is used to fetch, add 10 min read Implementing own Hash Table with Open Addressing Linear ProbingIn Open Addressing, all elements are stored in the hash table itself. So at any point, size of table must be greater than or equal to total number of keys (Note that we can increase table size by copying old data if needed).Insert(k) - Keep probing until an empty slot is found. Once an empty slot is 13 min read Maximum possible difference of two subsets of an arrayGiven an array of n-integers. The array may contain repetitive elements but the highest frequency of any element must not exceed two. You have to make two subsets such that the difference of the sum of their elements is maximum and both of them jointly contain all elements of the given array along w 15+ min read Sorting using trivial hash functionWe have read about various sorting algorithms such as heap sort, bubble sort, merge sort and others. Here we will see how can we sort N elements using a hash array. But this algorithm has a limitation. We can sort only those N elements, where the value of elements is not large (typically not above 1 15+ min read Smallest subarray with k distinct numbersWe are given an array consisting of n integers and an integer k. We need to find the smallest subarray [l, r] (both l and r are inclusive) such that there are exactly k different numbers. If no such subarray exists, print -1 and If multiple subarrays meet the criteria, return the one with the smalle 14 min read Like