G5 Advanced String Algorithms Lecture (No Code)

Uploaded by

tmrtkebede

Advanced String Algorithms
Substring search
Lecture Outline
● Prerequisites
● Substring Search (The Naive Way)
● Rabin-Karp Algorithm
● Knuth-Morris-Pratt Algorithm
● Applications of the Rabin-Karp and Knuth-Morris-Pratt Algorithms
● Additional String Algorithms
● Quote of the Day

Pre-requisites
● Math II
● String manipulation in Python
● Time and Space complexity analysis
What is a substring search?
Naive Method
String  : a b c d a b c d f
          1 2 3 4 5 6 7 8 9

Pattern : a b c d f
          1 2 3 4 5

Pointer i walks over the string and pointer j walks over the pattern.

Start at position 1: a, b, c, d match, then `a` does not match `f`.
Okay, let’s try again
Start at position 2: `b` does not match `a`.
Failed yet again.
AGAIN !
Start at position 3: `c` does not match `a`.
AGAINNN !!!
Start at position 4: `d` does not match `a`.
Hmm :/
Again ?
Start at position 5: a, b, c, d, f all match.
Okay, we got somewhere, but how long did it take us?
O(n * m), where n is the length of the string and m is the length of the pattern
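The walk-through above can be sketched in a few lines of Python. This is only an illustration: the function name `naive_search` is an assumption of this sketch, and it uses 0-based indices, so the match the slides find at position 5 is reported as index 4.

```python
def naive_search(text: str, pattern: str) -> int:
    """Return the 0-based index of the first occurrence of pattern, or -1."""
    n, m = len(text), len(pattern)
    # Try every possible starting position of the pattern in the text.
    for start in range(n - m + 1):
        j = 0
        # Advance j while the characters keep matching.
        while j < m and text[start + j] == pattern[j]:
            j += 1
        if j == m:  # the whole pattern matched
            return start
    return -1


print(naive_search("abcdabcdf", "abcdf"))  # 4
```

The outer loop runs up to n times and the inner loop up to m times, which is where the O(n * m) bound comes from.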
Practice Problem

Find the index of the first occurrence in a string

Rabin-Karp
Algorithm
Average O(n + m) Time
What is Hashing ?
Why do we need to
encode strings ?
Encoding Strings

For s = “abcad”, let’s treat the string as a number in base 26 (the size of the alphabet):

`a` * 26^4 + `b` * 26^3 + `c` * 26^2 + `a` * 26^1 + `d` * 26^0

We need to find some values to represent each of the above letters.
Any ideas ?
Encoding Strings

`a` = 0
`b` = 1
`c` = 2
`d` = 3
.
.
`z` = 25
This will result in an edge case if we represent strings this way: leading `a`s contribute nothing, so different strings get the same value.

“aaa” => 0 * 26^2 + 0 * 26^1 + 0 * 26^0 = 0

“aa” => 0 * 26^1 + 0 * 26^0 = 0
There are two ways to fix this problem

1. Encode the length in the hash (messy)

2. Don’t use 0: let `a` = 1, `b` = 2, ..., `z` = 26 and use (alphabet size + 1) = 27 as the base
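A minimal sketch of the second fix, assuming lowercase English letters only. The names `BASE` and `encode` are made up for this illustration.

```python
BASE = 27  # alphabet size + 1, so that no letter is encoded as 0


def encode(s: str) -> int:
    """Polynomial encoding of a lowercase string with a = 1, ..., z = 26."""
    h = 0
    for ch in s:
        # Shift the value so far by one "digit" and append the new letter.
        h = h * BASE + (ord(ch) - ord("a") + 1)
    return h


print(encode("aa"), encode("aaa"))  # 28 757
```

With this encoding, “aa” and “aaa” no longer collide.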
Operations on Hashes
Operation: addLast
let 𝞪 = 26 + 1

“abc” + “x” = ?

“abc” => (1 * 𝞪^2 + 2 * 𝞪^1 + 3 * 𝞪^0)
“x” => (24 * 𝞪^0)
“abc” + “x” => (1 * 𝞪^2 + 2 * 𝞪^1 + 3 * 𝞪^0) * 𝞪 + (24 * 𝞪^0)
“abcx” => 1 * 𝞪^3 + 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0
Operation: pollFirst
let 𝞪 = 26 + 1

“abcx” = let’s try to remove the `a` ?

“abcx” => 1 * 𝞪^3 + 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0

“bcx” => (1 * 𝞪^3 + 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0) - (1 * 𝞪^3)

“bcx” => 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0
For Rabin-Karp, the above two
operations suffice for most cases
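The two operations can be sketched as follows, without a modulus yet to keep the arithmetic visible. The helper names `val`, `encode`, `add_last` and `poll_first` are assumptions of this sketch, and lowercase letters are assumed.

```python
BASE = 27  # 26 letters + 1


def val(ch: str) -> int:
    """Letter value with a = 1, ..., z = 26."""
    return ord(ch) - ord("a") + 1


def encode(s: str) -> int:
    h = 0
    for ch in s:
        h = h * BASE + val(ch)
    return h


def add_last(h: int, ch: str) -> int:
    """Hash of (string + ch), given the hash h of string."""
    return h * BASE + val(ch)


def poll_first(h: int, ch: str, length: int) -> int:
    """Hash of the string without its first character ch.

    length is the length of the string before the removal.
    """
    return h - val(ch) * BASE ** (length - 1)


h = add_last(encode("abc"), "x")
print(h == encode("abcx"))                      # True
print(poll_first(h, "a", 4) == encode("bcx"))   # True
```

Sliding a window one step to the right is then a `poll_first` followed by an `add_last`.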
Operation: addFirst
let 𝞪 = 26 + 1

“x” + “abc” = ?
“x” => (24 * 𝞪^0)
“abc” => (1 * 𝞪^2 + 2 * 𝞪^1 + 3 * 𝞪^0)

“xabc” => (24 * 𝞪^0) * 𝞪^3 + (1 * 𝞪^2 + 2 * 𝞪^1 + 3 * 𝞪^0)

Operation: pollLast
let 𝞪 = 26 + 1

“abcx” = let’s try to remove the `x` ?

“abcx” => 1 * 𝞪^3 + 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0

“abc” => ((1 * 𝞪^3 + 2 * 𝞪^2 + 3 * 𝞪^1 + 24 * 𝞪^0) - (24 * 𝞪^0)) / 𝞪

“abc” => 1 * 𝞪^2 + 2 * 𝞪^1 + 3 * 𝞪^0
Most of the time, the hash values are very large numbers, so we need to keep them under a modulus.
Therefore, the last operation (pollLast) is trickier than we made it look, since it involves division under mod.
TIP: Precompute all 𝞪^k
TIP: Pick a prime number for the modulus.

Typically, 10 ** 9 + 7

(Fermat’s Little Theorem)

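Why Choose a Prime Modulus in Rabin-Karp?
● Reduces Hash Collisions: A large prime modulus tends to spread hash values more evenly than a composite one.
● Keeps Values Small: A large prime modulus like 10 ** 9 + 7 keeps hash values within fixed limits (and prevents overflow in fixed-width integer types).
● Fermat's Little Theorem: For a prime modulus p and a base not divisible by p, base ** (p - 2) mod p is the modular inverse of the base, which enables efficient division in rolling hashes.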
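As a sketch of how division under mod can work for pollLast, the snippet below uses Python's built-in three-argument `pow` to get the inverse of the base via Fermat's Little Theorem. The names `MOD`, `BASE`, `INV_BASE`, `hash_mod` and `poll_last` are assumptions of this illustration.

```python
MOD = 10**9 + 7   # prime modulus
BASE = 27         # 26 letters + 1

# Fermat's Little Theorem: BASE ** (MOD - 2) is the inverse of BASE modulo MOD.
INV_BASE = pow(BASE, MOD - 2, MOD)


def val(ch: str) -> int:
    return ord(ch) - ord("a") + 1


def hash_mod(s: str) -> int:
    """Polynomial hash of s, reduced modulo MOD at every step."""
    h = 0
    for ch in s:
        h = (h * BASE + val(ch)) % MOD
    return h


def poll_last(h: int, ch: str) -> int:
    """Hash of the string without its last character ch."""
    # Subtract the last letter, then "divide" by BASE using its inverse.
    return (h - val(ch)) * INV_BASE % MOD


print(poll_last(hash_mod("abcx"), "x") == hash_mod("abc"))  # True
```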
TIP: Use multiple primes to decrease
the chance of collisions
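One way to follow this tip is to hash the string under two different prime moduli and compare the pair. The choice of moduli and the name `double_hash` are assumptions of this sketch.

```python
MODS = (10**9 + 7, 10**9 + 9)  # two different primes
BASE = 27


def double_hash(s: str) -> tuple:
    """Return the polynomial hash of s under each modulus in MODS."""
    hashes = []
    for mod in MODS:
        h = 0
        for ch in s:
            h = (h * BASE + (ord(ch) - ord("a") + 1)) % mod
        hashes.append(h)
    return tuple(hashes)


print(double_hash("abc"))  # (786, 786)
```

Two strings are then treated as candidates for equality only if both components agree.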
Rabin-Karp: Demonstration

String: abacdabazxywp
pattern: abaz

Hash of the first window “abac”:

(1 * 𝞪^3 + 2 * 𝞪^2 + 1 * 𝞪^1 + 3 * 𝞪^0)

To slide the window one character to the right: pollFirst, then addLast.
Practice Problem
Find the index of the first occurrence in a string
Note: If you have to do things under mod given your constraints,
a hash match doesn’t necessarily mean you found the string.
Note: You have to do a string equality check just to be sure.
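Putting the pieces together, here is a minimal sketch of the search with a rolling hash and the equality check from the notes above. The function name `rabin_karp` and its default parameters are assumptions of this illustration, and the letter values assume lowercase input.

```python
def rabin_karp(text: str, pattern: str, base: int = 27, mod: int = 10**9 + 7) -> int:
    """Return the 0-based index of the first occurrence of pattern, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    def val(ch: str) -> int:
        return ord(ch) - ord("a") + 1

    high = pow(base, m - 1, mod)  # weight of the first character in a window

    # Hash of the pattern and of the first window of the text.
    p_hash = w_hash = 0
    for k in range(m):
        p_hash = (p_hash * base + val(pattern[k])) % mod
        w_hash = (w_hash * base + val(text[k])) % mod

    for start in range(n - m + 1):
        # A hash match is only a candidate: confirm with a real comparison.
        if w_hash == p_hash and text[start:start + m] == pattern:
            return start
        if start + m < n:
            w_hash = (w_hash - val(text[start]) * high) % mod       # pollFirst
            w_hash = (w_hash * base + val(text[start + m])) % mod   # addLast
    return -1


print(rabin_karp("abacdabazxywp", "abaz"))  # 5
```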
Most people don’t feel confident after writing a probabilistic algorithm such as Rabin-Karp.
But the way you should see it is this: if you can bring the probability of your algorithm getting it wrong below the probability of the hardware failing while running your code, you should be able to submit and sleep at night.
Knuth–Morris–Pratt
algorithm
Guaranteed O(n + m) Time
This algorithm was invented by Donald Knuth and Vaughan Pratt, and independently by James H. Morris
Key Idea : Take advantage of the successful comparisons
we make between the string and the pattern.
Example

S = adsgwadsdsgwadsgz
P = dsgwadsgz
The KMP algorithm avoids going back in the string S and throwing away the progress made in matching the pattern.
So it looks for the longest proper suffix of the matched substring (the part before the mismatch) that is also a prefix of it.

Matched substring: dsgwads (prefix `ds` = suffix `ds`)

We know the substring `ds` appears in our string S right before the mismatch, and `ds` is also the start of P. Thanks to this, the algorithm can work out how far it needs to go back in P to continue matching without undoing the progress that was made.
In our example, we jump back to `g` (index 2) in P and we do not go back in S.
Example

S = adsgwadsdsgwadsgz
P = dsgwadsgz
This `g` does not match the current character of S either. Since the substring `ds` has no proper suffix that is also a prefix, we now go back to the beginning of P.
The algorithm mainly has two parts to achieve this
efficiently.

1. Preprocessing
2. Matching
1. Preprocessing

Some vocabulary first :)

Prefix: A substring that starts at the beginning of the string. The empty string ("") is a prefix of every string.

● "", "a", "ab", "aba", "abac", "abaca", "abacab" are prefixes of "abacab"
● "", "a", "ab", "aba", "abab", "ababa", "ababab", "abababa" are prefixes of "abababa"

Suffix: A substring that ends at the end of the string. The empty string ("") is a suffix of every string.

● "abacab", "bacab", "acab", "cab", "ab", "b", "" are suffixes of "abacab"
● "abababa", "bababa", "ababa", "baba", "aba", "ba", "a", "" are suffixes of "abababa"
1. Preprocessing

Proper Prefix: A prefix that is not equal to the string itself.

● "", "a", "ab", "aba", "abac", "abaca" are proper prefixes of "abacab"
● "", "a", "ab", "aba", "abab", "ababa", "ababab" are proper prefixes of "abababa"

Proper Suffix: A suffix that is not equal to the string itself.

● "bacab", "acab", "cab", "ab", "b", "" are proper suffixes of "abacab"
● "bababa", "ababa", "baba", "aba", "ba", "a", "" are proper suffixes of "abababa"

Border: A substring that is both a proper prefix and a proper suffix of the string. The length of a border is sometimes called its Width, although that term is rarely used.

● "", "ab" are borders of "abacab"

● "", "a", "aba", "ababa" are borders of "abababa"
1. Preprocessing

longest_border: Array that stores, for every prefix of the string, the length of its longest proper prefix that is also a suffix. More precisely, longest_border[i] is the length of the longest border of string[0...i]
1. Preprocessing

The Longest Border Array (also called LPS, π-table, or Prefix Table) is used in multiple algorithms. The naïve approach to build it is O(m^3): follow the definition and, for every index, search for the longest proper prefix that is also a suffix (m indices, up to m candidate lengths each, and O(m) per comparison).

for i = 1 to m-1
    for k = 0 to i
        if needle[0..k-1] == needle[i-(k-1)..i]
            longest_border[i] = k

However, we can follow a greedy approach and build it in linear time.
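For comparison, the naïve pseudocode above might look like this in Python. The function name `longest_border_naive` is an assumption of this sketch, and the loop over k starts at 1 because a border of length 0 is already the default.

```python
def longest_border_naive(needle: str) -> list:
    """Build the longest-border array straight from the definition, in O(m^3)."""
    m = len(needle)
    longest_border = [0] * m
    for i in range(1, m):
        # Candidate border lengths for needle[0..i]; k <= i keeps the prefix proper.
        for k in range(1, i + 1):
            # Compare the prefix of length k with the suffix of length k ending at i.
            if needle[:k] == needle[i - k + 1:i + 1]:
                longest_border[i] = k  # later (larger) k overwrites smaller ones
    return longest_border


print(longest_border_naive("aaacaaaa"))  # [0, 1, 2, 0, 1, 2, 3, 3]
```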
1. Preprocessing

P    d s g w a d s g z
LPS

LPS[i] = where to start matching in P after a mismatch at i + 1.

In other words, the length of the longest proper prefix that is also a suffix of P[0...i]
1. Preprocessing

i points just past the current border (on the prefix side); j is the position being filled in.

P    d s g w a d s g z
LPS  0
LPS  0 0
LPS  0 0 0
LPS  0 0 0 0
LPS  0 0 0 0 0 1
LPS  0 0 0 0 0 1 2
LPS  0 0 0 0 0 1 2 3 0

Now that `w` and `z` don’t match, i becomes LPS[i - 1].

This is because, if the border of length three cannot be extended, we want to try shorter borders before going back to zero.
1. Preprocessing

P    a a a c a a a a
LPS  0 1 2 0 1 2 3

Here you can see that `c` and `a` don’t match, so we can’t have a border of 4, but we clearly have a border of 3. That is why we need to switch to i = LPS[i - 1] and then compare again. Here LPS[i - 1] = 2.

P    a a a c a a a a
LPS  0 1 2 0 1 2 3 3

And since `a` matches `a`, LPS[j] = i + 1 = 3

1. Practice: complete the stub below so that it generates the LPS table

def KMP_part_one(p : str) -> list:
    # todo
    pass

assert KMP_part_one('aaacaaaa') == [0, 1, 2, 0, 1, 2, 3, 3]
assert KMP_part_one('dsgwadsgz') == [0, 0, 0, 0, 0, 1, 2, 3, 0]
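One possible solution to the practice, shown as a sketch rather than the reference answer. The name `build_lps` is an assumption; i and j play the same roles as in the slides (i is the length of the current border, j is the position being filled in).

```python
def build_lps(p: str) -> list:
    """Return the LPS (longest border) table of p in linear time."""
    lps = [0] * len(p)
    i = 0  # length of the current border, also the next prefix index to compare
    j = 1  # position whose LPS value is being computed
    while j < len(p):
        if p[j] == p[i]:
            # The current border extends by one character.
            i += 1
            lps[j] = i
            j += 1
        elif i > 0:
            # Mismatch: fall back to the next shorter border and retry.
            i = lps[i - 1]
        else:
            # No border at all for this position.
            lps[j] = 0
            j += 1
    return lps


print(build_lps("dsgwadsgz"))  # [0, 0, 0, 0, 0, 1, 2, 3, 0]
```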
What is the time complexity of building
the LPS table this way?
Interestingly enough it’s linear.
O(length of the pattern)
Why O(M)?
● The scanning pointer j only moves forward, so it advances at most M times.
● The border pointer i grows by at most 1 each time j advances, so it grows at most M times in total.
● Every fallback i = LPS[i - 1] strictly decreases i, and i never drops below 0, so the total number of fallbacks over the whole run is also bounded by M.
2. Matching

P    d s g w a d s g z
LPS  0 0 0 0 0 1 2 3 0

S = adsgwadsdsgwadsgz

i points into P and j points into S (0-based). j never moves backwards.

● j = 0: `a` does not match P[0] = `d` and i = 0, so j moves on.
● j = 1 to 7: `dsgwads` matches P[0...6], so i reaches 7.
● j = 8: `d` does not match P[7] = `g`, so i = LPS[i - 1] = LPS[6] = 2.
● j = 8: `d` does not match P[2] = `g`, so i = LPS[i - 1] = LPS[1] = 0.
● j = 8 to 16: `dsgwadsgz` matches P[0...8], so i reaches the end of P.

MATCH (P occurs in S starting at index 8)
What is the time complexity of this
Matching process?
Once again it’s linear.
O(length of the text)

Hint: Notice the behavior of the pointers during the construction of the LPS array and compare it with the way the pointers move during the pattern matching process
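A minimal sketch of both parts together, to make the pointer behavior concrete. The names `build_lps` and `kmp_search` are assumptions of this illustration; i indexes the pattern and j indexes the text, and j never moves backwards.

```python
def build_lps(p: str) -> list:
    """LPS table: lps[j] is the length of the longest border of p[0..j]."""
    lps = [0] * len(p)
    i = 0
    for j in range(1, len(p)):
        while i > 0 and p[j] != p[i]:
            i = lps[i - 1]  # try the next shorter border
        if p[j] == p[i]:
            i += 1
        lps[j] = i
    return lps


def kmp_search(s: str, p: str) -> int:
    """Return the 0-based index of the first occurrence of p in s, or -1."""
    if not p:
        return 0
    lps = build_lps(p)
    i = 0  # number of pattern characters matched so far
    for j, ch in enumerate(s):
        while i > 0 and ch != p[i]:
            i = lps[i - 1]  # fall back in the pattern, never in the text
        if ch == p[i]:
            i += 1
        if i == len(p):
            return j - len(p) + 1
    return -1


print(kmp_search("adsgwadsdsgwadsgz", "dsgwadsgz"))  # 8
```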
Practice Problem

Rotate String
Efficiency of the KMP algorithm
● Since the two portions of the algorithm have, respectively, complexities
of O(m) and O(n), the complexity of the overall algorithm is O(m + n).
● These complexities are the same, no matter how many repetitive
patterns are in P or S.
Applications of RK and KMP
● Spell Checker
● Plagiarism Detection
● Text Editors
● Spam Filters
● Digital Forensics
● Matching DNA Sequences
● Intrusion Detection
● Search Engines
● Bioinformatics and Cheminformatics
● Information Retrieval System
● Language Syntax Checker
Additional String
Algorithms
Z Algorithm

● Closely resembles KMP, but is often considered simpler and more versatile.

● Mostly used to find
○ Periodicity of a string
○ All Occurrences of a substring
● Relatively great at handling multiple patterns
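As a sketch of the idea, the Z-array stores, for each position i, the length of the longest substring starting at i that matches a prefix of the string. The names `z_function` and `find_all`, the convention z[0] = 0, and the separator character are assumptions of this illustration; the separator must not occur in either string and the pattern is assumed non-empty.

```python
def z_function(s: str) -> list:
    """z[i] = length of the longest common prefix of s and s[i:], with z[0] = 0."""
    n = len(s)
    z = [0] * n
    l = r = 0  # [l, r) is the rightmost segment known to match a prefix
    for i in range(1, n):
        if i < r:
            # Reuse the value computed for the mirrored position inside the segment.
            z[i] = min(r - i, z[i - l])
        # Extend the match character by character.
        while i + z[i] < n and s[z[i]] == s[i + z[i]]:
            z[i] += 1
        if i + z[i] > r:
            l, r = i, i + z[i]
    return z


def find_all(text: str, pattern: str, sep: str = "\x00") -> list:
    """All 0-based start indices of pattern in text, via the Z-array."""
    m = len(pattern)
    z = z_function(pattern + sep + text)
    return [i - m - 1 for i in range(m + 1, len(z)) if z[i] == m]


print(find_all("abababa", "aba"))  # [0, 2, 4]
```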
Manacher's Algorithm

● Finds the longest palindromic substring of a given string in linear time.

● Can be used to count all pairs (i, j) such that the substring s[i…j] is a palindrome, also in linear time.
Suffix Array

● Efficiently solves pattern matching, lexicographic order problems, and LCP (Longest Common Prefix) queries.

● Applications: fast substring queries, string compression, DNA sequence alignment.
Practice Problems
● Repeated String Match
● Longest Happy Prefix
● Find the index of the first occurrence in a string
● Permutation in String
● Find Substring with a given hash value
● Division + LCP (easy version)
Resources
● Pattern Search with the Knuth-Morris-Pratt (KMP) algorithm
● Prefix function. Knuth–Morris–Pratt algorithm
● Knuth–Morris–Pratt (KMP) Pattern Matching Substring Search - First Occurrence Of Substring
● Algorithms live : Rolling hash and bloom filters
● String Searching | USACO GUIDE
Quote of the day
