
Implement Bitap Algorithm for String Matching in C++



The bitap algorithm is a fuzzy string matching algorithm that finds approximate matches between a pattern and a text.

The algorithm determines whether a given text contains a substring that is "approximately equal" to a given pattern. Approximate equality is defined in terms of Levenshtein distance (the number of edits): if the substring and the pattern are within a given distance k of each other, the algorithm treats them as equal. It begins by precomputing a set of bitmasks, one per character of the alphabet, each containing one bit for each position of the pattern. Most of the work can then be done with bitwise operations on a machine word, which are very fast.
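As an illustration of the precomputation step, the following is a minimal sketch of how the bitmasks might be built. The helper name `build_pattern_masks` is hypothetical, and the sketch assumes 8-bit characters and a 64-bit word. It follows the inverted convention used later in this article, where a 0 bit means "the pattern has this character at this position".

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the original program).
// Returns one 64-bit mask per byte value. Bit i of masks[c] is 0 when
// pattern[i] == c and 1 otherwise. Only the first 64 positions are encoded.
std::array<std::uint64_t, 256> build_pattern_masks(const std::string& pattern) {
    std::array<std::uint64_t, 256> masks;
    masks.fill(~std::uint64_t{0});               // start with every bit set
    for (std::size_t i = 0; i < pattern.size() && i < 64; ++i) {
        const auto c = static_cast<unsigned char>(pattern[i]);
        masks[c] &= ~(std::uint64_t{1} << i);    // clear bit i for character c
    }
    return masks;
}
```

For the pattern "abca", for example, the mask for 'a' has bits 0 and 3 cleared, the mask for 'b' has bit 1 cleared, and the mask of any character that does not occur in the pattern keeps all bits set.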

Algorithm to Implement Bitap Algorithm for String Matching

Following is the bitap algorithm:

Begin
Take the text t and the pattern p as input.
function bitmap_search(t, p):
   m = length of p
   if m == 0 or m > 63
      return -1
   Initialize the bit array: A = ~1
   Initialize the pattern bitmasks p_mask, one entry per character value:
   for each character value c
      p_mask[c] = ~0
   for i = 0 to m-1
      p_mask[p[i]] and= ~(1 left shift i)
   Update the bit array for each character of the text:
   for i = 0 to t.length()-1
      A |= p_mask[t[i]]
      A <<= 1
      if ((A and (1 left shift m)) == 0)
         return i - m + 1
   return -1
End

C++ Program to Implement Bitap Algorithm for String Matching

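In this example, we efficiently search for a small string (pattern) inside a larger string (text) using the bitmasking technique. The program performs exact matching, which is the k = 0 case of the algorithm, and supports patterns of up to 63 characters: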

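#include <cstdint>
#include <iostream>
#include <string>
using namespace std;

// Returns the index of the first exact occurrence of p in t, or -1 if there is none.
int bitmap_search(const string& t, const string& p) {
   const size_t m = p.length();
   if (m == 0)
      return -1;
   if (m > 63) {   // bit m must fit in a 64-bit word
      cout << "Pattern is too long!";
      return -1;
   }
   // p_mask[c] has bit i cleared wherever p[i] == c; all other bits stay set.
   // Indexing through unsigned char avoids negative indices for non-ASCII bytes.
   uint64_t p_mask[256];
   for (int i = 0; i < 256; ++i)
      p_mask[i] = ~uint64_t(0);
   for (size_t i = 0; i < m; ++i)
      p_mask[static_cast<unsigned char>(p[i])] &= ~(uint64_t(1) << i);
   // A cleared bit j in A means the first j characters of p match the text read so far.
   uint64_t A = ~uint64_t(1);
   for (size_t i = 0; i < t.length(); ++i) {
      A |= p_mask[static_cast<unsigned char>(t[i])];
      A <<= 1;
      if ((A & (uint64_t(1) << m)) == 0)   // all m characters matched
         return static_cast<int>(i - m + 1);
   }
   return -1;
}

void findPattern(const string& t, const string& p) {
   int position = bitmap_search(t, p);
   if (position == -1)
      cout << "\nNo Match\n";
   else
      cout << "\nPattern found at position: " << position;
}

int main() {
   string t = "tutorialspoint";
   string p = "point";

   findPattern(t, p);
   return 0;
}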

Following is the output:

Pattern found at position: 9
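
The program above only finds exact matches. As a rough sketch of how the same bit-parallel idea can be extended towards the approximate matching described at the start of this article, the function below keeps one bit array per allowed error count. It handles substitutions only (Hamming distance), not the insertions and deletions that full Levenshtein distance also allows, which would need additional terms in the update step. The name `fuzzy_bitap_search` and the parameter `k` (the maximum number of substituted characters) are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical function (not part of the original program).
// Returns the start index of the first window of text that equals pattern
// except for at most k substituted characters, or -1 if there is none.
int fuzzy_bitap_search(const std::string& text, const std::string& pattern, int k) {
    const std::size_t m = pattern.size();
    if (m == 0 || m > 63 || k < 0) return -1;

    // Same inverted masks as the exact search: a 0 bit marks a matching position.
    std::uint64_t mask[256];
    for (auto& w : mask) w = ~std::uint64_t{0};
    for (std::size_t i = 0; i < m; ++i)
        mask[static_cast<unsigned char>(pattern[i])] &= ~(std::uint64_t{1} << i);

    // R[d] tracks pattern prefixes matched with at most d substitutions.
    std::vector<std::uint64_t> R(static_cast<std::size_t>(k) + 1, ~std::uint64_t{1});

    for (std::size_t i = 0; i < text.size(); ++i) {
        const std::uint64_t cm = mask[static_cast<unsigned char>(text[i])];
        std::uint64_t prev_old = R[0];           // R[d-1] before this step
        R[0] = (R[0] | cm) << 1;                 // exact-match update
        for (std::size_t d = 1; d < R.size(); ++d) {
            const std::uint64_t cur_old = R[d];
            // Either extend a d-error prefix with a matching character,
            // or extend a (d-1)-error prefix by substituting this character.
            R[d] = ((R[d] | cm) & prev_old) << 1;
            prev_old = cur_old;
        }
        if ((R.back() & (std::uint64_t{1} << m)) == 0)
            return static_cast<int>(i + 1 - m);
    }
    return -1;
}
```

With this sketch, searching "tutorialspoint" for "paint" with k = 1 returns 9, because "point" differs from "paint" in a single character, while the same search with k = 0 returns -1.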
Updated on: 2025-05-20T19:24:35+05:30
