
Implement Bitap Algorithm for String Matching in C++
The bitap algorithm is a fuzzy string matching algorithm that is used to find approximate matches between a pattern and a text.
The algorithm determines whether a given text contains a substring that is "approximately equal" to a given pattern. Approximate equality is defined in terms of Levenshtein distance (the number of edits): if the substring and the pattern are within a given distance k of each other, the algorithm treats them as equal. It begins by precomputing a set of bitmasks, one for each character of the alphabet, each containing one bit for each element of the pattern. Most of the work can then be done with bitwise operations, which are extremely fast. The program in this article implements the exact-matching case (k = 0), so it reports a match only when the pattern occurs in the text unchanged. Because the whole pattern must fit in one machine word, the pattern length is limited to 63 characters here.
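To make the precomputation step concrete, here is a minimal sketch of how the bitmasks could be built. The helper name build_pattern_masks and the 256-entry table (one entry per byte value) are assumptions made for this illustration and are not part of the program further down. A 0 bit at position i means that the character occurs at index i of the pattern.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: builds one 64-bit mask per byte value.
// Bit i of masks[c] is 0 if p[i] == c, and 1 otherwise.
std::array<std::uint64_t, 256> build_pattern_masks(const std::string& p) {
   assert(p.length() <= 63);        // the pattern must fit in one machine word
   std::array<std::uint64_t, 256> masks;
   masks.fill(~std::uint64_t(0));   // start with "this character occurs nowhere"
   for (std::size_t i = 0; i < p.length(); ++i)
      masks[static_cast<unsigned char>(p[i])] &= ~(std::uint64_t(1) << i);
   return masks;
}
```

For the pattern "point", the mask for 'p' has only bit 0 cleared, the mask for 't' has only bit 4 cleared, and a character that does not appear in the pattern keeps all of its bits set.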
Algorithm to Implement Bitap Algorithm for String Matching
Following is the bitap algorithm:
Begin
   Take the string and pattern as input.
   function bitmap_search() takes a text string t and a pattern string p:
      m = length of p
      Initialize the bit array A.
      Initialize the pattern bitmasks, p_mask[300]:
         for i = 0 to 299
            p_mask[i] = ~0
         for i = 0 to m-1
            p_mask[p[i]] and= ~(1L left shift i)
      Update the bit array:
         for i = 0 to t.length()-1
            A |= p_mask[t[i]]
            A <<= 1
            if ((A and (1L left shift m)) == 0)
               return i - m + 1
      return -1
End
C++ Program to Implement Bitap Algorithm for String Matching
In this example, we efficiently search for a small string (the pattern) inside a larger string (the text) using a bitmasking technique:
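#include <cstdint>
#include <iostream>
#include <string>
using namespace std;

// Returns the index of the first exact occurrence of p in t, or -1 if none.
int bitmap_search(const string& t, const string& p) {
   const size_t m = p.length();
   if (m == 0)
      return -1;
   if (m > 63) {
      cout << "Pattern is too long!";
      return -1;
   }
   // A fixed 64-bit type is used because long is only 32 bits on some platforms.
   uint64_t p_mask[300];
   // Bit i of p_mask[c] is 0 where p[i] == c, and 1 everywhere else.
   for (int i = 0; i <= 299; ++i)
      p_mask[i] = ~uint64_t(0);
   // The cast keeps the index non-negative when char is a signed type.
   for (size_t i = 0; i < m; ++i)
      p_mask[static_cast<unsigned char>(p[i])] &= ~(uint64_t(1) << i);
   // A 0 in bit j of A means the last j characters read match p[0..j-1].
   uint64_t A = ~uint64_t(1);
   for (size_t i = 0; i < t.length(); ++i) {
      A |= p_mask[static_cast<unsigned char>(t[i])];
      A <<= 1;
      // Bit m cleared: the whole pattern ends at index i.
      if ((A & (uint64_t(1) << m)) == 0)
         return static_cast<int>(i - m + 1);
   }
   return -1;
}

void findPattern(const string& t, const string& p) {
   int position = bitmap_search(t, p);
   if (position == -1)
      cout << "\nNo Match\n";
   else
      cout << "\nPattern found at position: " << position;
}

int main() {
   string t = "tutorialspoint";
   string p = "point";
   findPattern(t, p);
   return 0;
}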
Following is the output:
Pattern found at position: 9
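The program above only finds exact occurrences. As a rough sketch of how the same bit-parallel idea can tolerate errors, the function below keeps one state word per allowed error count and accepts up to k substituted characters. This is a simplification: it handles substitutions only, not the insertions and deletions that full Levenshtein distance also allows. The name fuzzy_bitmap_search and the parameter k are assumptions introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: index of the first window of t that matches p with
// at most k substituted characters, or -1 if there is none.
int fuzzy_bitmap_search(const std::string& t, const std::string& p, int k) {
   const std::size_t m = p.length();
   assert(k >= 0);
   if (m == 0 || m > 63)
      return -1;

   // Same pattern masks as in the exact search: bit i is 0 where p[i] == c.
   std::uint64_t p_mask[256];
   for (int c = 0; c < 256; ++c)
      p_mask[c] = ~std::uint64_t(0);
   for (std::size_t i = 0; i < m; ++i)
      p_mask[static_cast<unsigned char>(p[i])] &= ~(std::uint64_t(1) << i);

   // R[d]: a 0 in bit j means the last j text characters match the first
   // j pattern characters with at most d mismatches.
   std::vector<std::uint64_t> R(k + 1, ~std::uint64_t(1));

   for (std::size_t i = 0; i < t.length(); ++i) {
      const std::uint64_t mask = p_mask[static_cast<unsigned char>(t[i])];
      std::uint64_t prev = R[0];   // value of R[d-1] before this step
      R[0] = (R[0] | mask) << 1;
      for (int d = 1; d <= k; ++d) {
         const std::uint64_t cur = R[d];
         // Either extend with a matching character at the same error count,
         // or spend one substitution on top of a state with d-1 errors.
         R[d] = (prev & (cur | mask)) << 1;
         prev = cur;
      }
      if ((R[k] & (std::uint64_t(1) << m)) == 0)
         return static_cast<int>(i - m + 1);
   }
   return -1;
}
```

With the text "tutorialspoint" and the pattern "paint", a call with k = 1 returns 9, because "point" differs from "paint" in a single character, while a call with k = 0 returns -1.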