Analysis and Design of Algorithms
Pattern Algorithms
Pattern searching
Naive Pattern Searching
Regular Expression
Pattern searching is an important problem in
computer science. When we search for a string in a
Notepad or Word file, a browser, or a database,
pattern searching algorithms are used to show the
search results.
Naive Pattern Searching
Slide the pattern over the text one position at a
time and check for a match at each position. If a
match is found, slide by 1 again to check for
subsequent matches.
Example 1:
Input: txt[] = "THIS IS A TEST TEXT"
pat[] = "TEST"
Output: Pattern found at index 10
Example 2:
Input: txt[] = "AABAACAADAABAABA"
pat[] = "AABA"
Output: Pattern found at index 0
Pattern found at index 9
Pattern found at index 12
[Animation: the pattern A A B A is slid over the text
A A B A A C A A D A A B A A B A one position at a time.
i is the current shift in the text and j is the index in
the pattern, so txt[i+j] is compared with pat[j].
Matches are reported at index 0, index 9 and index 12.]
Python Code:
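The code shown on this slide is not preserved in the text, so the following is only a minimal sketch of the sliding comparison described earlier, not necessarily the lecture's own listing. The function name `naive_search` and the choice to return a list of indices are assumptions made for illustration.

```python
def naive_search(pat, txt):
    """Return every index at which pat occurs in txt (naive method)."""
    m, n = len(pat), len(txt)
    matches = []
    # Try every shift i at which the whole pattern still fits in the text.
    for i in range(n - m + 1):
        j = 0
        # Compare txt[i + j] with pat[j] until a mismatch or the pattern ends.
        while j < m and txt[i + j] == pat[j]:
            j += 1
        if j == m:  # all m characters matched at shift i
            matches.append(i)
    return matches


if __name__ == "__main__":
    for index in naive_search("AABA", "AABAACAADAABAABA"):
        print("Pattern found at index", index)
```

After a match the loop still advances by only one position, which is why overlapping occurrences are found.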
What is the best case?
The best case occurs when the first character of the pattern
is not present in the text at all. Every shift is then rejected
after a single comparison, so the number of comparisons is O(n).
txt[] = "AABCCAADDEE"
pat[] = "FAA"
What is the worst case?
1) When all characters of the text and the pattern are the same.
txt[] = "AAAAAAAAAAAAAAAAAA"
pat[] = "AAAAA"
2) Worst case also occurs when only the last character is
different.
txt[] = "AAAAAAAAAAAAAAAAAB"
pat[] = "AAAAB"
The worst case is O(m*(n-m+1)) character comparisons,
where n is the length of the text and m is the length
of the pattern.
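One way to check these cases is to count character comparisons directly. The sketch below uses a hypothetical helper, `count_comparisons`, that runs the naive method and counts each comparison of a text character with a pattern character. The input strings are the ones from the best-case and worst-case examples.

```python
def count_comparisons(pat, txt):
    """Count character comparisons made by the naive method."""
    m, n = len(pat), len(txt)
    comparisons = 0
    for i in range(n - m + 1):
        for j in range(m):
            comparisons += 1
            if txt[i + j] != pat[j]:
                break  # mismatch: move on to the next shift
    return comparisons


cases = [
    ("FAA", "AABCCAADDEE"),           # best case: first character never matches
    ("AAAAA", "AAAAAAAAAAAAAAAAAA"),  # worst case 1: all characters the same
    ("AAAAB", "AAAAAAAAAAAAAAAAAB"),  # worst case 2: only the last one differs
]
for pat, txt in cases:
    shifts = len(txt) - len(pat) + 1
    print(pat, count_comparisons(pat, txt), "comparisons over", shifts, "shifts")
```

In the best case the count equals the number of shifts, n-m+1. In both worst cases it equals m*(n-m+1).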
Regular Expression
Regular expressions are a
powerful language for matching
text patterns.
re.match() checks for a match only at the beginning of
the string.
re.search() checks for a match anywhere in the string.
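The difference between the two calls can be seen in a short sketch. The sample text is reused from Example 1, and the variable names are chosen here for illustration.

```python
import re

text = "THIS IS A TEST TEXT"

# re.match() only tries the pattern at position 0 of the string.
at_start = re.match("TEST", text)   # no match: the text starts with "THIS"
start_ok = re.match("THIS", text)   # match at the beginning

# re.search() scans forward and returns the first match anywhere.
anywhere = re.search("TEST", text)

print(at_start)
print(start_ok.group())
print(anywhere.start())
```

Both functions return `None` when there is no match, so the result should be checked before calling methods on it.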
Import Regular Expression (Python)
In Python, a regular expression search is
typically written as:
    match = re.search(pat, str)
Simple match
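The example from this slide is not preserved in the text. As a minimal sketch of a simple match, the hypothetical helper `first_match` below wraps `re.search()` and returns the matched text, or `None` when the pattern is not found. The variable is named `text` rather than `str` to avoid shadowing Python's built-in `str`.

```python
import re


def first_match(pat, text):
    """Return the text of the first match of pat in text, or None."""
    match = re.search(pat, text)
    # re.search() returns a match object on success and None on failure.
    return match.group() if match else None


print(first_match(r"TEST", "THIS IS A TEST TEXT"))
print(first_match(r"QUIZ", "THIS IS A TEST TEXT"))
```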
\w (lowercase w): matches a "word" character:
a letter, digit, or underscore [a-zA-Z0-9_].
\W (uppercase W): matches any non-word
character.
\d (lowercase d): matches a decimal digit [0-9].
\s (lowercase s): matches a single whitespace
character: space, newline, return, tab, or form feed
[ \n\r\t\f].
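A short sketch of these character classes, using `re.findall()` to list every single-character match. The sample string is made up for illustration.

```python
import re

sample = "a_1 b-2\n"

word_chars = re.findall(r"\w", sample)      # letters, digits, underscore
non_word_chars = re.findall(r"\W", sample)  # everything else
digits = re.findall(r"\d", sample)          # decimal digits
whitespace = re.findall(r"\s", sample)      # space and newline here

print(word_chars)
print(non_word_chars)
print(digits)
print(whitespace)
```

The patterns are written as raw strings (`r"..."`) so that Python passes the backslash through to the `re` module unchanged.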
. (dot) : matches any single character except
newline '\n'.
^ (start) : matches the start of the string.
The match itself is empty.
$ (end) : matches the end of the string.
The match itself is empty.
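The dot and the two anchors can be tried out as follows. This is a sketch with made-up sample strings, where the last two lines show that an anchor on its own matches a position rather than any characters.

```python
import re

text = "THIS IS A TEST TEXT"

# . matches any one character except a newline.
dot_hits = re.findall(r"T.ST", "TEST TOST T\nST")

# ^ anchors the pattern to the start of the string.
starts_with_this = re.search(r"^THIS", text) is not None
starts_with_test = re.search(r"^TEST", text) is not None

# $ anchors the pattern to the end of the string.
ends_with_text = re.search(r"TEXT$", text) is not None

# On their own, ^ and $ match a position, so the matched text is empty.
empty_at_start = re.search(r"^", text).group()
end_position = re.search(r"$", text).start()

print(dot_hits, starts_with_this, starts_with_test, ends_with_text)
print(repr(empty_at_start), end_position == len(text))
```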
[ ] : matches any one character from a set. A range
of characters can be indicated by giving two
characters and separating them by a "-".
[^ ] : matches any single character except those
listed in the brackets.
{n} : matches exactly n occurrences of the pattern
to its left.
{n,m} : matches from n to m occurrences of the
pattern to its left.
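A brief sketch of sets, negated sets, and counted repetition. The sample strings are chosen for illustration, and the date fragment is taken from the exercise later in the lecture.

```python
import re

in_range = re.findall(r"[a-c]", "abcdef")       # characters a through c
not_in_range = re.findall(r"[^a-c]", "abcdef")  # anything except a through c

exactly_four = re.findall(r"\d{4}", "12-05-2007")   # runs of exactly 4 digits
two_to_four = re.findall(r"\d{2,4}", "12-05-2007")  # runs of 2 to 4 digits

print(in_range)
print(not_in_range)
print(exactly_four)
print(two_to_four)
```

Counted repetition is greedy: `\d{2,4}` takes as many digits as it can, up to four, before moving on.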
Syntax  Description                              Equivalent
\d      Matches any decimal digit                [0-9]
\D      Matches any non-digit character          [^0-9]
\s      Matches any whitespace character         [ \t\n\r\f\v]
\S      Matches any non-whitespace character     [^ \t\n\r\f\v]
\w      Matches any alphanumeric character       [a-zA-Z0-9_]
\W      Matches any non-alphanumeric character   [^a-zA-Z0-9_]
+ : matches 1 or more occurrences of the pattern to its left
* : matches 0 or more occurrences of the pattern to its left
? : matches 0 or 1 occurrences of the pattern to its left
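The three quantifiers can be compared on one input. In this sketch each pattern is an "A" followed by a quantified "B", and the sample string is made up for illustration.

```python
import re

sample = "A AB ABB"

one_or_more = re.findall(r"AB+", sample)   # A followed by at least one B
zero_or_more = re.findall(r"AB*", sample)  # A followed by any number of Bs
zero_or_one = re.findall(r"AB?", sample)   # A followed by at most one B

print(one_or_more)
print(zero_or_more)
print(zero_or_one)
```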
Emails:
Square brackets can be used to indicate a set of chars, so [abc]
matches 'a' or 'b' or 'c'.
findall() finds all the matches and returns them as a list of strings.
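The email pattern used on the slides is not preserved in the text. The sketch below assumes a simple pattern, `[\w.-]+@[\w.-]+`, which is loose and not a full email validator, and uses made-up example.com addresses.

```python
import re

text = "Contact alice@example.com or bob.smith@mail.example.org today"

# [\w.-]+ is a set: one or more word characters, dots, or hyphens.
# Inside square brackets the dot is literal and needs no backslash.
emails = re.findall(r"[\w.-]+@[\w.-]+", text)

for email in emails:
    print(email)
```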
( ) : groups part of a pattern so that the text it matched can be extracted
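As a sketch of grouping, the assumed email pattern below is split into two groups, one for the user name and one for the host. The addresses are made up, and the pattern is illustrative rather than a complete email grammar.

```python
import re

text = "Contact alice@example.com or bob.smith@mail.example.org today"
pattern = r"([\w.-]+)@([\w.-]+)"

first = re.search(pattern, text)
whole = first.group()   # the entire match
user = first.group(1)   # text matched by the first ( )
host = first.group(2)   # text matched by the second ( )

# With groups in the pattern, findall() returns one tuple per match.
pairs = re.findall(pattern, text)

print(whole, user, host)
print(pairs)
```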
Operators   Description
.           Matches any single character except newline '\n'
?           Matches 0 or 1 occurrence of the pattern to its left
+           Matches 1 or more occurrences of the pattern to its left
*           Matches 0 or more occurrences of the pattern to its left
\w          Matches an alphanumeric character
\W          Matches a non-alphanumeric character
\d          Matches a digit [0-9]
\D          Matches a non-digit
\s          Matches a single whitespace character (space, newline, tab)
\S          Matches any non-whitespace character
[..]        Matches any single character in the square brackets
[^..]       Matches any single character not in the square brackets
\           Escapes characters that have a special meaning
^ and $     Match the start and the end of the string, respectively
{n,m}       Matches at least n and at most m occurrences of the expression
a|b         Matches either a or b
( )         Groups regular expressions and returns the matched text
\t, \n, \r  Match tab, newline, and return
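Two operators from the summary that have no example of their own are alternation and the escaping backslash. A small sketch with made-up sample strings:

```python
import re

# a|b matches either alternative.
pets = re.findall(r"cat|dog", "cat dog bird")

# An unescaped dot matches any character; \. matches only a literal dot.
any_char = re.findall(r".", "a.b")
literal_dot = re.findall(r"\.", "a.b")

# \t matches a tab character, used here as a field separator.
fields = re.split(r"\t", "x\ty")

print(pets)
print(any_char)
print(literal_dot)
print(fields)
```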
Validate a phone number (the phone number must have
10 digits and start with 8 or 9)
Solution
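The solution shown on this slide is not preserved in the text. One possible sketch, assuming the number arrives as a string with no spaces or separators, uses `re.fullmatch()` so that the whole string must fit the pattern. The function name `is_valid_phone` is hypothetical.

```python
import re


def is_valid_phone(number):
    """True if number is exactly 10 digits and starts with 8 or 9."""
    # [89] is the first digit; \d{9} is the remaining nine digits.
    return re.fullmatch(r"[89]\d{9}", number) is not None


for candidate in ["9876543210", "7876543210", "98765", "98765432101"]:
    print(candidate, is_valid_phone(candidate))
```

With `re.search()` instead of `re.fullmatch()`, a longer string containing a valid 10-digit run would also be accepted, which is why the full-string match is used here.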
Return the dates from the given string
str = "Amit 34-3456 12-05-2007, XYZ 56-4532 11-11-2011"
Solution
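The solution shown on this slide is not preserved in the text. A possible sketch, assuming every date has the form dd-mm-yyyy, combines `\d` with counted repetition and collects the results with `findall()`. The variable is named `text` rather than `str` to avoid shadowing the built-in.

```python
import re

text = "Amit 34-3456 12-05-2007, XYZ 56-4532 11-11-2011"

# Two digits, a hyphen, two digits, a hyphen, four digits.
dates = re.findall(r"\d{2}-\d{2}-\d{4}", text)

print(dates)
```

The codes 34-3456 and 56-4532 are skipped because they have only one hyphen, so the pattern cannot complete on them.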
facebook.com/mloey
mohamedloey@gmail.com
twitter.com/mloey
linkedin.com/in/mloey
mloey@fci.bu.edu.eg
mloey.github.io
THANKS FOR
YOUR TIME

Algorithms Lecture 8: Pattern Algorithms
