2. Analysis and Design of Algorithms
Pattern searching
Naive Pattern Searching
Regular Expression
Pattern searching is an important problem in
computer science. Whenever we search for a string in
a text editor, a word-processor document, a browser,
or a database, a pattern searching algorithm produces
the search results.
Naive pattern searching: slide the pattern over the
text one position at a time and check for a match at
each position. If a match is found, slide by 1 again
to check for subsequent matches.
Example 1:
Input: txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"
Output: Pattern found at index 10
Example 2:
Input: txt[] = "AABAACAADAABAABA"
pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12
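The sliding procedure and the two examples above can be reproduced with a short Python sketch. This is an illustration, not code taken from the slides: the function name naive_search and its list return value are assumptions made for the example, while txt and pat follow the names used in the examples.

```python
def naive_search(txt, pat):
    """Return every index at which pat occurs in txt (naive method)."""
    n, m = len(txt), len(pat)
    found = []
    # Try every shift i at which the whole pattern still fits in the text.
    for i in range(n - m + 1):
        j = 0
        # Compare pat[j] with txt[i + j] until a mismatch or the pattern ends.
        while j < m and txt[i + j] == pat[j]:
            j += 1
        if j == m:  # all m characters matched at this shift
            found.append(i)
    return found


for index in naive_search("AABAACAADAABAABA", "AABA"):
    print("Pattern found at index", index)
```

Run on Example 2, the loop reports indices 0, 9 and 12. Returning a list rather than printing inside the function keeps the search reusable.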
Walkthrough for txt[] = "AABAACAADAABAABA", pat[] = "AABA"
(i is the current shift of the pattern; j is the position within the pattern)
i = 0: compare pat[j] with txt[i+j] for j = 0, 1, 2, 3; all match. Pattern found at index 0
i = 1 to 8: compare; no match at any of these shifts
i = 9: Pattern found at index 9
i = 10, 11: compare; no match
i = 12: Pattern found at index 12
i = 13, 14: compare; the pattern no longer fits in the remaining text, so no match
What is the best case?
The best case occurs when the first character of the pattern
is not present in the text at all. Every shift then fails on its
first comparison, so a text of length n needs only O(n) comparisons.
txt[] = "AABCCAADDEE"
pat[] = "FAA"
What is the worst case?
1) When all characters of the text and the pattern are the same.
txt[] = "AAAAAAAAAAAAAAAAAA"
pat[] = "AAAAA"
2) When only the last character is different.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"
In both cases every shift compares all m characters of the pattern,
so a text of length n needs O(m(n-m+1)) comparisons.
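To make the best and worst cases concrete, the sketch below counts the character comparisons the naive method performs on the inputs above. The helper name count_comparisons is an assumption introduced for this illustration only.

```python
def count_comparisons(txt, pat):
    """Count character comparisons made by the naive method."""
    n, m = len(txt), len(pat)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if txt[i + j] != pat[j]:
                break  # mismatch: move on to the next shift
    return comparisons


best = count_comparisons("AABCCAADDEE", "FAA")
worst_same = count_comparisons("AAAAAAAAAAAAAAAAAA", "AAAAA")
worst_last = count_comparisons("AAAAAAAAAAAAAAAAAB", "AAAAB")
print(best, worst_same, worst_last)
```

For the best-case input the count equals the number of shifts, n - m + 1. For both worst-case inputs it equals m(n - m + 1), because every shift examines all m pattern characters.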
Regular expressions are a
powerful language for matching
text patterns.
re.match() checks for a match only at the beginning of
the string.
re.search() checks for a match anywhere in the string.
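The difference between the two functions can be seen on the text from Example 1. This is a small sketch using Python's standard re module; the variable names are chosen for illustration.

```python
import re

txt = "THIS IS A TEST TEXT"

# re.match only tries the pattern at position 0 of the string.
at_start = re.match("TEST", txt)
print(at_start)                      # None: the text does not begin with TEST

# re.search scans forward and returns the first match anywhere.
anywhere = re.search("TEST", txt)
print(anywhere.start())              # 10

# When the pattern does occur at the beginning, both functions succeed.
print(re.match("THIS", txt) is not None)   # True
```

Both functions return a match object on success and None on failure, so the result should be checked before calling methods such as start().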
[^...] : matches any single character except those listed inside the brackets.
{n} : matches exactly n occurrences of the preceding expression.
{n,m} : matches from n to m occurrences of the preceding expression.
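A brief sketch of the negated character class and the two counted quantifiers, using Python's re module. The sample strings are made up for the illustration.

```python
import re

# [^...] matches one character that is NOT listed inside the brackets.
not_ab = re.findall(r"[^AB]", "AABAACAAD")
print(not_ab)                                        # ['C', 'D']

# {n}: exactly n repetitions of the preceding item.
print(re.fullmatch(r"A{3}", "AAA") is not None)      # True
print(re.fullmatch(r"A{3}", "AAAA") is not None)     # False

# {n,m}: between n and m repetitions, both bounds included.
print(re.fullmatch(r"A{2,4}", "AAAA") is not None)   # True
print(re.fullmatch(r"A{2,4}", "AAAAA") is not None)  # False
```

re.fullmatch is used here so that the quantifier has to account for the entire string; with re.search, A{3} would also succeed inside "AAAA".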
Syntax  Description                               Equivalent
\d      Matches any decimal digit                 [0-9]
\D      Matches any non-digit character           [^0-9]
\s      Matches any whitespace character          [ \t\n\r\f\v]
\S      Matches any non-whitespace character      [^ \t\n\r\f\v]
\w      Matches any alphanumeric character        [a-zA-Z0-9_]
\W      Matches any non-alphanumeric character    [^a-zA-Z0-9_]
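The equivalences in the table can be checked on a small ASCII sample, as sketched below. One caveat: for Python 3 string patterns, the shorthand classes also match non-ASCII Unicode digits, letters and whitespace unless the re.ASCII flag is given, so the table holds exactly only for ASCII text or with that flag. The sample string is made up for the illustration.

```python
import re

sample = "Room 42,\tfloor_7\n"

digits = re.findall(r"\d", sample)
spaces = re.findall(r"\s", sample)
non_word = re.findall(r"\W", sample)
print(digits)     # ['4', '2', '7']
print(spaces)     # [' ', '\t', '\n']
print(non_word)   # [' ', ',', '\t', '\n']

# Each shorthand paired with the bracketed class from the table.
pairs = [
    (r"\d", r"[0-9]"),
    (r"\D", r"[^0-9]"),
    (r"\s", r"[ \t\n\r\f\v]"),
    (r"\S", r"[^ \t\n\r\f\v]"),
    (r"\w", r"[a-zA-Z0-9_]"),
    (r"\W", r"[^a-zA-Z0-9_]"),
]
for shorthand, explicit in pairs:
    same = re.findall(shorthand, sample) == re.findall(explicit, sample)
    print(shorthand, explicit, same)
```

Note that the underscore counts as a word character, which is why it does not appear in the \W result.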
Operators   Description
.           Matches any single character except newline '\n'
?           Matches 0 or 1 occurrence of the pattern to its left
+           Matches 1 or more occurrences of the pattern to its left
*           Matches 0 or more occurrences of the pattern to its left
\w          Matches an alphanumeric character
\W          Matches a non-alphanumeric character
\d          Matches a digit [0-9]
\D          Matches a non-digit
\s          Matches a single whitespace character (space, newline, tab)
\S          Matches any non-whitespace character
[..]        Matches any single character inside the square brackets
[^..]       Matches any single character not inside the square brackets
\           Escapes characters that have a special meaning
^ and $     Match the start and end of the string, respectively
{n,m}       Matches at least n and at most m occurrences of the preceding expression
a|b         Matches either a or b
( )         Groups regular expressions and returns the matched text
\t, \n, \r  Match tab, newline, and carriage return
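Several of the operators listed above are exercised in the sketch below, using Python's re module. The sample strings are made up for the illustration and are not from the slides.

```python
import re

# . matches any single character except a newline.
dot = re.findall(r"A.A", "ABA ACA A\nA")
print(dot)                                           # ['ABA', 'ACA']

# ?, + and * repeat the item to their left 0-1, 1+ and 0+ times.
print(re.fullmatch(r"AB?A", "AA") is not None)       # True
print(re.fullmatch(r"AB+A", "AA") is not None)       # False
print(re.fullmatch(r"AB*A", "ABBBA") is not None)    # True

# ^ and $ anchor the match to the start and end of the string.
print(re.search(r"^TEST", "THIS IS A TEST") is None)       # True
print(re.search(r"TEST$", "THIS IS A TEST") is not None)   # True

# a|b matches either alternative; ( ) captures the matched text.
grouped = re.search(r"(TEST|TEXT) (\w+)", "THIS IS A TEST TEXT")
print(grouped.group(1), grouped.group(2))            # TEST TEXT

# A backslash removes the special meaning of the next character.
periods = re.findall(r"\.", "a.b.c")
print(periods)                                       # ['.', '.']
```

Writing patterns as raw strings (the r prefix) keeps Python from interpreting the backslashes before the regular expression engine sees them.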
Validate a phone number (the number must have exactly
10 digits and start with 8 or 9)
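One possible way to express this rule as a regular expression is sketched below. The helper name is_valid_phone and the constant PHONE_PATTERN are hypothetical names introduced for the example, and the pattern is one reading of the stated rule: a first digit of 8 or 9 followed by exactly nine more digits, with nothing else in the string.

```python
import re

# Hypothetical pattern: first digit 8 or 9, then exactly nine more digits.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
PHONE_PATTERN = re.compile(r"[89][0-9]{9}")


def is_valid_phone(number):
    """Return True if number is exactly 10 digits and starts with 8 or 9."""
    return PHONE_PATTERN.fullmatch(number) is not None


print(is_valid_phone("9876543210"))    # True
print(is_valid_phone("7876543210"))    # False: starts with 7
print(is_valid_phone("98765432"))      # False: only 8 digits
print(is_valid_phone("98765432101"))   # False: 11 digits
```

fullmatch is used rather than search so that extra characters before or after the ten digits cause the validation to fail.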