CSES solution - Counting Patterns
Last Updated: 23 Apr, 2024
Given a string S and patterns[], count for each pattern the number of positions where it appears in the string.
Examples:
Input: S = "aybabtu", patterns[] = {"bab", "abc", "a"}
Output:
1
0
2
Explanation:
- "bab" occurs 1 time in "aybabtu", that is from S[2...4].
- "abc" does not occur in "aybabtu".
- "a" occurs 2 times in "aybabtu", that is from S[0...0] and from S[3...3].
Input: S = "geeksforgeeks", patterns[] = {"geeks", "for", "gfg"}
Output:
2
1
0
Explanation:
- "geeks" occurs 2 times in "geeksforgeeks", that is from S[0...4] and S[8...12].
- "for" occurs 1 time in "geeksforgeeks", that is from S[5...7].
- "gfg" does not occur in "geeksforgeeks".
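As a reference point for the expected outputs above, here is a minimal brute-force sketch. It is not the approach used in this article; the helper name `count_occurrences` is hypothetical, and it simply counts every starting position (overlaps included) with `str.find`.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count the positions in text where pattern starts (overlaps allowed)."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        # Resume one character later so overlapping matches are counted too.
        start = text.find(pattern, start + 1)
    return count


for p in ["bab", "abc", "a"]:
    print(count_occurrences("aybabtu", p))
for p in ["geeks", "for", "gfg"]:
    print(count_occurrences("geeksforgeeks", p))
```

This costs roughly O(n * k) per pattern in the worst case, which is why a precomputed index such as a suffix array is attractive when there are many patterns.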
Approach: To solve the problem, follow the below idea:
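The idea is to use a suffix array: a sorted array of all suffixes of a given string, stored as their starting indices. Every occurrence of a pattern is a prefix of some suffix, so all suffixes that start with the pattern form one contiguous block in the suffix array, and the size of that block is the number of occurrences.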
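To make the definition concrete, the following sketch builds the suffix array of the first example string by sorting the suffixes directly. The function name `naive_suffix_array` is hypothetical, and this O(n^2 log n) construction is only for illustration, not the construction used in the implementation.

```python
def naive_suffix_array(s: str) -> list:
    """Return the starting indices of all suffixes of s in sorted order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


s = "aybabtu"
for index, start in enumerate(naive_suffix_array(s)):
    # index in the suffix array, starting position in s, the suffix itself
    print(index, start, s[start:])
```

For "aybabtu" the sorted suffixes are abtu, aybabtu, babtu, btu, tu, u, ybabtu, so the suffix array is [3, 0, 2, 4, 5, 6, 1]. The two suffixes starting with "a" sit next to each other at indices 0 and 1.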
Let's look at the intuition step by step:
Suffix Array Construction: The first part of the solution is to build a suffix array for the given string. The buildSuffixArray() function starts by initializing the suffix array and the position array, where position[i] holds the current rank (i.e., lexicographic order) of the suffix starting at index i. It then repeatedly sorts the suffixes by their current rank and, on a tie, by the rank of the suffix that starts gap positions later, doubling gap after each round until all ranks are unique.
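The doubling idea can be sketched compactly with tuple sort keys. This is an illustrative variant under my own naming (`prefix_doubling`, `rank`, `rounds` are hypothetical), not the article's exact code: it uses -1 as the rank of a suffix that runs past the end of the string, and it also returns how many rounds were needed.

```python
def prefix_doubling(s: str):
    """Build the suffix array by prefix doubling; return (suffix_array, rounds)."""
    n = len(s)
    rank = [ord(c) for c in s]      # round 0: rank by first character
    sa = list(range(n))
    gap, rounds = 1, 0
    while True:
        # Key = (rank of suffix i, rank of suffix i + gap), -1 if past the end.
        def key(i):
            return (rank[i], rank[i + gap] if i + gap < n else -1)

        sa.sort(key=key)
        new_rank = [0] * n
        for j in range(1, n):
            # Rank increases only when the key is strictly larger.
            new_rank[sa[j]] = new_rank[sa[j - 1]] + (key(sa[j - 1]) < key(sa[j]))
        rank = new_rank
        rounds += 1
        if n == 0 or rank[sa[-1]] == n - 1:   # all ranks are distinct
            break
        gap *= 2
    return sa, rounds


print(prefix_doubling("aybabtu"))
print(prefix_doubling("aaaa"))
```

For "aybabtu" a single round already separates all suffixes, while "aaaa" needs two rounds because the first round cannot tell the suffixes starting at 0, 1 and 2 apart.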
Pattern Checking: The checkPattern() function compares the pattern, character by character, with the suffix stored at a given index of the suffix array. It returns -1 if the suffix is lexicographically smaller than the pattern (this includes a suffix that is shorter than the pattern and matches it up to its end), 1 if the suffix is larger, and 0 if the suffix starts with the pattern.
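The three-way result can be illustrated by comparing the pattern with each suffix truncated to the pattern's length. This is a sketch with a hypothetical helper name (`compare_suffix_to_pattern`), and it uses slicing instead of the character loop in the implementation; the suffix array is built naively so the block runs on its own.

```python
def compare_suffix_to_pattern(s: str, start: int, pattern: str) -> int:
    """Return -1, 0 or 1 as the suffix at start is smaller than, starts with,
    or is larger than pattern."""
    window = s[start:start + len(pattern)]   # suffix cut to the pattern length
    if window < pattern:
        return -1   # also covers a too-short suffix that is a prefix of pattern
    if window > pattern:
        return 1
    return 0


s = "aybabtu"
suffix_array = sorted(range(len(s)), key=lambda i: s[i:])   # naive build
results = [compare_suffix_to_pattern(s, start, "bab") for start in suffix_array]
print(results)
```

Walking the suffix array from left to right, the results never decrease: first the -1 entries, then the 0 entries, then the 1 entries. That monotonic shape is what makes binary search applicable.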
Pattern Searching: The solve() function runs two binary searches over the suffix array using checkPattern(): one for the leftmost and one for the rightmost suffix that starts with the pattern. The difference between the rightmost and leftmost index plus one gives the number of occurrences of the pattern in the string; if no suffix matches, the answer is 0.
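The two binary searches correspond to a lower-bound and an upper-bound query. The sketch below expresses that with Python's `bisect` module over the truncated suffixes; the function name `count_with_suffix_array` is hypothetical, and materializing the `windows` list costs O(n * k), so this only illustrates the idea and is not a replacement for the binary search in the implementation.

```python
from bisect import bisect_left, bisect_right


def count_with_suffix_array(s: str, suffix_array: list, pattern: str) -> int:
    """Count occurrences of pattern as the width of its block in the suffix array."""
    k = len(pattern)
    # Suffixes in sorted order, each cut to the pattern length (still sorted).
    windows = [s[start:start + k] for start in suffix_array]
    left = bisect_left(windows, pattern)     # first index with window >= pattern
    right = bisect_right(windows, pattern)   # first index with window > pattern
    return right - left


s = "aybabtu"
suffix_array = sorted(range(len(s)), key=lambda i: s[i:])   # naive build
for p in ["bab", "abc", "a"]:
    print(count_with_suffix_array(s, suffix_array, p))
```

Here `right` is one past the last match, so `right - left` equals the article's "rightmost - leftmost + 1" and is 0 when the pattern is absent.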
Step-by-step algorithm:
- Comparison Function (compareSuffixes):
  - Compares two suffixes by their current ranks and, on a tie, by the ranks of the suffixes that start gap positions later.
  - If the ranks are different, returns true if the first rank is smaller.
  - If the ranks are equal, compares the ranks at offset gap; a suffix that runs past the end of the string is treated as smaller.
- Build Suffix Array Function (buildSuffixArray):
  - Initializes the suffix array and positions based on the characters of the input string.
  - Uses a loop to build the suffix array:
    - Sorts the suffix array based on the comparison function.
    - Updates the positions based on the sorted order.
    - Checks if all suffixes have distinct ranks; if so, exits the loop.
- Pattern Check Function (checkPattern):
  - Compares the pattern with the suffix at a given index of the suffix array.
  - Returns -1 if the suffix is smaller than the pattern, 0 if the suffix starts with the pattern, and 1 if the suffix is greater.
- Pattern Search Function (solve):
  - Uses binary search to find the range where the pattern appears in the suffix array.
  - Initializes left and right boundaries.
  - Finds the leftmost occurrence using binary search and updates the left boundary.
  - Finds the rightmost occurrence using binary search and updates the right boundary.
  - Calculates and prints the count of occurrences.
Below is the implementation of the algorithm:
C++
#include <bits/stdc++.h>
using namespace std;
#define int long long
#define endl '\n'
const int maxN = 1e5 + 5;
int suffixArray[maxN], position[maxN], temp[maxN];
int gap, n;
string s;
// Function to compare two suffixes
bool compareSuffixes(int x, int y)
{
    // Compare the positions of two suffixes
    if (position[x] != position[y])
        return position[x] < position[y];
    // Move to the next positions with a gap and
    // compare again
    x += gap;
    y += gap;
    return (x < n && y < n) ? position[x] < position[y]
                            : x > y;
}
// Function to build the suffix array
void buildSuffixArray()
{
    // Initialize the suffix array and positions based on
    // the characters of the string
    for (int i = 0; i < n; i++)
        suffixArray[i] = i, position[i] = s[i];
    // Build the suffix array using repeated sorting and
    // updating positions
    for (gap = 1;; gap <<= 1) {
        // Sort the suffix array based on the comparison
        // function
        sort(suffixArray, suffixArray + n, compareSuffixes);
        // Update the temporary array with cumulative
        // comparisons
        for (int i = 0; i < n - 1; i++)
            temp[i + 1]
                = temp[i]
                  + compareSuffixes(suffixArray[i],
                                    suffixArray[i + 1]);
        // Update the positions based on the sorted order
        for (int i = 0; i < n; i++)
            position[suffixArray[i]] = temp[i];
        // Check if all suffixes have distinct ranks; if so,
        // exit the loop
        if (temp[n - 1] == n - 1)
            break;
    }
}
// Function to compare the suffix at index mid of the suffix
// array with the pattern: returns -1 if the suffix is
// smaller, 1 if it is larger, 0 if it starts with the pattern
int checkPattern(int mid, string& pattern)
{
    int flag = -1, patternSize = pattern.size(),
        suffixStart = suffixArray[mid];
    // Check if the suffix can contain the entire pattern
    if (n - suffixStart >= patternSize)
        flag = 0;
    // Compare characters of the pattern and suffix
    for (int i = 0; i < min(n - suffixStart, patternSize);
         i++) {
        if (s[suffixStart + i] < pattern[i])
            return -1;
        if (s[suffixStart + i] > pattern[i])
            return 1;
    }
    return flag;
}
// Function to find and print the count of occurrences of a
// pattern in the string
void solve(string& pattern)
{
    int left = 0, right = n - 1;
    int answer = -1, l = left, r = right;
    // Binary search for the leftmost occurrence of the
    // pattern
    while (l <= r) {
        int mid = l + (r - l) / 2;
        int check = checkPattern(mid, pattern);
        if (check == 0) {
            answer = mid;
            r = mid - 1;
        }
        else if (check == 1)
            r = mid - 1;
        else
            l = mid + 1;
    }
    // If the pattern is not found, print 0 and return
    if (answer == -1) {
        cout << 0 << endl;
        return;
    }
    // Update the left boundary for the next binary search
    left = answer, l = left, r = right;
    // Binary search for the rightmost occurrence of the
    // pattern
    while (l <= r) {
        int mid = l + (r - l) / 2;
        int check = checkPattern(mid, pattern);
        if (check == 0) {
            answer = mid;
            l = mid + 1;
        }
        else if (check == -1)
            l = mid + 1;
        else
            r = mid - 1;
    }
    // Update the right boundary
    right = answer;
    // Print the count of occurrences
    cout << right - left + 1 << endl;
}
// Main function
signed main()
{
    // Set the input string and its size
    s = "aybabtu";
    n = s.size();
    // Build the suffix array
    buildSuffixArray();
    // Define patterns to search for
    vector<string> patterns = { "bab", "abc", "a" };
    // For each pattern, call the solve function to find and
    // print the count of occurrences
    for (auto pattern : patterns) {
        solve(pattern);
    }
}
Java
import java.util.*;
public class Main {
    // Function to build the suffix array
    static List<Integer> buildSuffixArray(String s) {
        int n = s.length();
        List<Integer> suffixArray = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            suffixArray.add(i);
        }
        int[] position = new int[n];
        for (int i = 0; i < n; i++) {
            position[i] = s.charAt(i);
        }
        int[] temp = new int[n];
        int[] gap = {1}; // Use an array to hold the value of gap
        while (true) {
            suffixArray.sort((a, b) -> {
                if (position[a] != position[b]) {
                    return position[a] - position[b];
                }
                int aNextPos = (a + gap[0] < n) ? position[a + gap[0]] : -1;
                int bNextPos = (b + gap[0] < n) ? position[b + gap[0]] : -1;
                return aNextPos - bNextPos;
            });
            for (int i = 0; i < n - 1; i++) {
                boolean less = compareSuffixes(suffixArray.get(i),
                        suffixArray.get(i + 1), position, gap[0], n);
                temp[i + 1] = temp[i] + (less ? 1 : 0);
            }
            for (int i = 0; i < n; i++) {
                position[suffixArray.get(i)] = temp[i];
            }
            if (temp[n - 1] == n - 1) {
                break;
            }
            gap[0] <<= 1;
        }
        return suffixArray;
    }
    // Function to compare two suffixes
    static boolean compareSuffixes(int x, int y, int[] position, int gap, int n) {
        if (position[x] != position[y]) {
            return position[x] < position[y];
        }
        x += gap;
        y += gap;
        return (x < n && y < n) ? (position[x] < position[y]) : (x > y);
    }
    // Function to check if a pattern is present at a given position in the suffix array
    static int checkPattern(int mid, String pattern, String s, List<Integer> suffixArray) {
        int flag = -1;
        int patternSize = pattern.length();
        int suffixStart = suffixArray.get(mid);
        if (s.length() - suffixStart >= patternSize) {
            flag = 0;
        }
        for (int i = 0; i < Math.min(s.length() - suffixStart, patternSize); i++) {
            if (s.charAt(suffixStart + i) < pattern.charAt(i)) {
                return -1;
            }
            if (s.charAt(suffixStart + i) > pattern.charAt(i)) {
                return 1;
            }
        }
        return flag;
    }
    // Function to find and print the count of occurrences of a pattern in the string
    static void solve(String pattern, String s, List<Integer> suffixArray) {
        int left = 0;
        int right = s.length() - 1;
        int answer = -1;
        int l = left;
        int r = right;
        // Binary search for the leftmost occurrence of the pattern
        while (l <= r) {
            int mid = l + (r - l) / 2;
            int check = checkPattern(mid, pattern, s, suffixArray);
            if (check == 0) {
                answer = mid;
                r = mid - 1;
            } else if (check == 1) {
                r = mid - 1;
            } else {
                l = mid + 1;
            }
        }
        // If the pattern is not found, print 0 and return
        if (answer == -1) {
            System.out.println(0);
            return;
        }
        left = answer;
        l = left;
        r = right;
        // Binary search for the rightmost occurrence of the pattern
        while (l <= r) {
            int mid = l + (r - l) / 2;
            int check = checkPattern(mid, pattern, s, suffixArray);
            if (check == 0) {
                answer = mid;
                l = mid + 1;
            } else if (check == -1) {
                l = mid + 1;
            } else {
                r = mid - 1;
            }
        }
        right = answer;
        System.out.println(right - left + 1);
    }
    // Main function
    public static void main(String[] args) {
        String s = "aybabtu";
        List<Integer> suffixArray = buildSuffixArray(s);
        List<String> patterns = Arrays.asList("bab", "abc", "a");
        for (String pattern : patterns) {
            solve(pattern, s, suffixArray);
        }
    }
}
Python3
# Importing the required libraries
from typing import List


# Function to compare two suffixes
def compare_suffixes(x: int, y: int, position: List[int], gap: int, n: int) -> bool:
    # Compare the positions of two suffixes
    if position[x] != position[y]:
        return position[x] < position[y]
    # Move to the next positions with a gap and compare again
    x += gap
    y += gap
    # Only index position[] when both offsets are inside the string
    return position[x] < position[y] if (x < n and y < n) else x > y
# Function to build the suffix array
def build_suffix_array(s: str) -> List[int]:
    n = len(s)
    suffix_array = list(range(n))
    position = [ord(char) for char in s]
    temp = [0] * n
    # Build the suffix array using repeated sorting and updating positions
    gap = 1
    while True:
        suffix_array.sort(key=lambda x: (position[x], position[x + gap] if x + gap < n else -1))
        # Update the temporary array with cumulative comparisons
        for i in range(n - 1):
            temp[i + 1] = temp[i] + compare_suffixes(suffix_array[i], suffix_array[i + 1], position, gap, n)
        # Update the positions based on the sorted order
        for i in range(n):
            position[suffix_array[i]] = temp[i]
        # Check if all suffixes have distinct ranks; if so, exit the loop
        if temp[n - 1] == n - 1:
            break
        gap <<= 1
    return suffix_array
# Function to check if a pattern is present at a given position in the suffix array
def check_pattern(mid: int, pattern: str, s: str, suffix_array: List[int]) -> int:
    flag = -1
    pattern_size = len(pattern)
    suffix_start = suffix_array[mid]
    # Check if the suffix can contain the entire pattern
    if len(s) - suffix_start >= pattern_size:
        flag = 0
    # Compare characters of the pattern and suffix
    for i in range(min(len(s) - suffix_start, pattern_size)):
        if s[suffix_start + i] < pattern[i]:
            return -1
        if s[suffix_start + i] > pattern[i]:
            return 1
    return flag
# Function to find and print the count of occurrences of a pattern in the string
def solve(pattern: str, s: str, suffix_array: List[int]) -> None:
    left = 0
    right = len(s) - 1
    answer = -1
    l = left
    r = right
    # Binary search for the leftmost occurrence of the pattern
    while l <= r:
        mid = l + (r - l) // 2
        check = check_pattern(mid, pattern, s, suffix_array)
        if check == 0:
            answer = mid
            r = mid - 1
        elif check == 1:
            r = mid - 1
        else:
            l = mid + 1
    # If the pattern is not found, print 0 and return
    if answer == -1:
        print(0)
        return
    # Update the left boundary for the next binary search
    left = answer
    l = left
    r = right
    # Binary search for the rightmost occurrence of the pattern
    while l <= r:
        mid = l + (r - l) // 2
        check = check_pattern(mid, pattern, s, suffix_array)
        if check == 0:
            answer = mid
            l = mid + 1
        elif check == -1:
            l = mid + 1
        else:
            r = mid - 1
    # Update the right boundary
    right = answer
    # Print the count of occurrences
    print(right - left + 1)
# Main function
def main():
    # Set the input string
    s = "aybabtu"
    # Build the suffix array
    suffix_array = build_suffix_array(s)
    # Define patterns to search for
    patterns = ["bab", "abc", "a"]
    # For each pattern, call the solve function to find and print the count of occurrences
    for pattern in patterns:
        solve(pattern, s, suffix_array)


if __name__ == "__main__":
    main()
JavaScript
// Function to build the suffix array
function buildSuffixArray(s) {
    let n = s.length;
    let suffixArray = Array.from({ length: n }, (_, i) => i);
    let position = Array.from(s, char => char.charCodeAt(0));
    let temp = new Array(n).fill(0);
    let gap = 1;
    while (true) {
        suffixArray.sort((a, b) => {
            if (position[a] !== position[b]) {
                return position[a] - position[b];
            }
            let aNextPos = (a + gap < n) ? position[a + gap] : -1;
            let bNextPos = (b + gap < n) ? position[b + gap] : -1;
            return aNextPos - bNextPos;
        });
        for (let i = 0; i < n - 1; i++) {
            let less = compareSuffixes(suffixArray[i], suffixArray[i + 1], position, gap, n);
            temp[i + 1] = temp[i] + (less ? 1 : 0);
        }
        for (let i = 0; i < n; i++) {
            position[suffixArray[i]] = temp[i];
        }
        if (temp[n - 1] === n - 1) {
            break;
        }
        gap <<= 1;
    }
    return suffixArray;
}
// Function to compare two suffixes
function compareSuffixes(x, y, position, gap, n) {
    if (position[x] !== position[y]) {
        return position[x] < position[y];
    }
    x += gap;
    y += gap;
    return (x < n && y < n) ? (position[x] < position[y]) : (x > y);
}
// Function to check if a pattern is present at a given position in the suffix array
function checkPattern(mid, pattern, s, suffixArray) {
    let flag = -1;
    let patternSize = pattern.length;
    let suffixStart = suffixArray[mid];
    if (s.length - suffixStart >= patternSize) {
        flag = 0;
    }
    for (let i = 0; i < Math.min(s.length - suffixStart, patternSize); i++) {
        if (s[suffixStart + i] < pattern[i]) {
            return -1;
        }
        if (s[suffixStart + i] > pattern[i]) {
            return 1;
        }
    }
    return flag;
}
// Function to find and print the count of occurrences of a pattern in the string
function solve(pattern, s, suffixArray) {
    let left = 0;
    let right = s.length - 1;
    let answer = -1;
    let l = left;
    let r = right;
    // Binary search for the leftmost occurrence of the pattern
    while (l <= r) {
        let mid = l + Math.floor((r - l) / 2);
        let check = checkPattern(mid, pattern, s, suffixArray);
        if (check === 0) {
            answer = mid;
            r = mid - 1;
        } else if (check === 1) {
            r = mid - 1;
        } else {
            l = mid + 1;
        }
    }
    // If the pattern is not found, print 0 and return
    if (answer === -1) {
        console.log(0);
        return;
    }
    left = answer;
    l = left;
    r = right;
    // Binary search for the rightmost occurrence of the pattern
    while (l <= r) {
        let mid = l + Math.floor((r - l) / 2);
        let check = checkPattern(mid, pattern, s, suffixArray);
        if (check === 0) {
            answer = mid;
            l = mid + 1;
        } else if (check === -1) {
            l = mid + 1;
        } else {
            r = mid - 1;
        }
    }
    right = answer;
    console.log(right - left + 1);
}
// Main function
function main() {
    let s = "aybabtu";
    let suffixArray = buildSuffixArray(s);
    let patterns = ["bab", "abc", "a"];
    for (let pattern of patterns) {
        solve(pattern, s, suffixArray);
    }
}

main();
Time Complexity:
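Building Suffix Array: O(n log² n), since each of the O(log n) doubling rounds performs an O(n log n) sort.
Checking each Pattern: O(k log n), where k is the length of the pattern, since each of the O(log n) binary search steps compares up to k characters.
Overall Time Complexity: O(n log² n + m * k * log n), where m is the number of patterns, k is the maximum pattern length, and n is the length of the input string.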
Auxiliary Space Complexity: O(n) due to the arrays suffixArray, position, and temp. These arrays are used to store information about the suffix array and the intermediate steps in its construction.