Module4 DataAnalyticsLanguages
31/07/2024 Slide 1
History
Regular Expressions
http://en.wikipedia.org/wiki/Regular_expression
Python Regular Expressions
^ Matches the beginning of a line
$ Matches the end of the line
. Matches any character
\s Matches whitespace
\S Matches any non-whitespace character
* Repeats a character zero or more times
*? Repeats a character zero or more times (non-greedy)
+ Repeats a character one or more times
+? Repeats a character one or more times (non-greedy)
[aeiou] Matches a single character in the listed set
[^XYZ] Matches a single character not in the listed set
[a-z0-9] The set of characters can include a range
( Indicates where string extraction is to start
) Indicates where string extraction is to end
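A few of the symbols in the table above, tried out on small strings. This is a minimal sketch: the sample strings and variable names are made up for illustration and are not from the slides.

```python
import re

line = "From: alice@example.com  Sat Jan  5"

# ^ anchors the pattern to the beginning of the line.
starts_with_from = re.search(r"^From:", line) is not None

# $ anchors the pattern to the end of the line.
ends_with_digit = re.search(r"[0-9]$", line) is not None

# [aeiou] matches a single character from the listed set.
vowels = re.findall(r"[aeiou]", "data")

# [0-9]+ matches one or more characters from the range 0-9.
digits = re.findall(r"[0-9]+", "room 12, floor 3")

# \S+ matches a run of non-whitespace characters; the parentheses mark
# the part of the match that should be extracted.
addresses = re.findall(r"^From: (\S+)", line)

print(starts_with_from)  # True
print(ends_with_digit)   # True
print(vowels)            # ['a', 'a']
print(digits)            # ['12', '3']
print(addresses)         # ['alice@example.com']
```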
The Regular Expression Module
• Before you can use regular expressions in your
program, you must import the library using
"import re"
• You can use re.search() to see if a string matches a
regular expression, similar to using the find()
method for strings
• You can use re.findall() to extract the portions of a
string that match your regular expression, similar to
a combination of find() and slicing: var[5:10]
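The comparison in the bullets above can be sketched side by side. The sample text and variable names are assumptions chosen for illustration.

```python
import re

text = "Invoice 42 was paid on day 7 of month 12"

# str.find() returns the index of the first occurrence, or -1 if absent.
pos = text.find("paid")

# re.search() returns a match object, or None if nothing matches.
match = re.search(r"paid", text)
found = match is not None
start = match.start()  # the same index that find() reported

# find() plus slicing pulls out one piece at a known position...
sliced = text[8:10]

# ...while re.findall() returns every matching portion as a list.
numbers = re.findall(r"[0-9]+", text)

print(pos, start)  # 15 15
print(found)       # True
print(sliced)      # 42
print(numbers)     # ['42', '7', '12']
```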
Wild-Card Characters
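A small sketch of the wild-card character "." from the table of symbols, alone and combined with "*". The header-style sample lines are assumptions for illustration only.

```python
import re

line = "X-Sieve: CMU Sieve 2.3"
other = "X-Plane is behind schedule: two weeks"

# "." matches any single character, so "X.S" matches "X-S".
one_char = re.search(r"^X.S", line) is not None

# ".*" means any character, zero or more times: both lines start
# with "X" and contain ":" somewhere later, so both match.
loose_line = re.search(r"^X.*:", line) is not None
loose_other = re.search(r"^X.*:", other) is not None

# Replacing "." with "\S" allows only non-whitespace before the colon,
# so the second line no longer matches.
strict_line = re.search(r"^X-\S+:", line) is not None
strict_other = re.search(r"^X-\S+:", other) is not None

print(one_char)                   # True
print(loose_line, loose_other)    # True True
print(strict_line, strict_other)  # True False
```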
Greedy Matching
Non-Greedy Matching
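Greedy and non-greedy repetition differ only in how far the match extends. A minimal sketch, assuming a sample line that contains two colons:

```python
import re

line = "From: Using the : character"

# Greedy: ".+" grabs as much as it can, so the match runs to the LAST colon.
greedy = re.findall(r"^F.+:", line)

# Non-greedy: ".+?" stops as soon as the rest of the pattern can match,
# so the match ends at the FIRST colon.
non_greedy = re.findall(r"^F.+?:", line)

print(greedy)      # ['From: Using the :']
print(non_greedy)  # ['From:']
```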
Python Slicing
String Slices
• >>> fruit = "apple"
• >>> fruit[1:3]
• 'pp'
• >>> fruit[1:]
• 'pple'
• >>> fruit[:4]
• 'appl'
• >>> fruit[:]
• 'apple'
List Slices
• >>> b
• [3, 4, 5, 6]
• >>> b[0:3]
• [3, 4, 5]
• b[0:j] with j > 3 and b[0:] are the same, because a stop index past the end is clamped to the length of the list
• >>> b[:2]
• [3, 4]
• >>> b[2:2]
• []
• b[i:j:k] is a subset of b[i:j] with elements
picked in steps of k
• >>> b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
• >>> b[0:10:3]
• [1, 4, 7, 10]
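The step rule can be checked by building the same result from explicit indices. A short sketch (the helper variable names are illustrative):

```python
b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# b[i:j:k] visits indices i, i+k, i+2k, ... that are below j.
stepped = b[0:10:3]
by_index = [b[n] for n in range(0, 10, 3)]

# A stop index beyond the end is clamped to len(b) instead of raising.
clamped = b[0:100]

# Equal start and stop select nothing.
empty = b[2:2]

print(stepped)              # [1, 4, 7, 10]
print(stepped == by_index)  # True
print(clamped == b)         # True
print(empty)                # []
```

Indices 0, 3, 6 and 9 are all below the stop value 10, so the slice has four elements.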
NumPy array slicing
• 1-D array slicing and indexing are similar to slicing and indexing Python lists
• import numpy as np
• arr1 = np.array([1, 2, 5, 6, 4, 3])
• arr1[2:4] = 99
• arr1
• Out[8]: array([ 1,  2, 99, 99,  4,  3])
eLahe Technologies 2020
www.elahetech.com
NumPy array slicing
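• A slice of a NumPy array is a view, so changing the slice changes the original array
• arr2 = arr1[2:4]
• arr2[0] = 88
• arr1
• Out[13]: array([ 1,  2, 88, 99,  4,  3])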
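The output above is what you would see if arr2 was created as a slice of arr1, for example arr1[2:4], since NumPy slices are views that share memory with the original. The sketch below rests on that assumption and contrasts it with an explicit copy and with a plain Python list; the names view, independent, lst and piece are illustrative.

```python
import numpy as np

arr1 = np.array([1, 2, 5, 6, 4, 3])

# A NumPy slice is a view: it shares memory with the original array.
view = arr1[2:4]
view[0] = 88
shares = np.shares_memory(view, arr1)

# .copy() gives an independent array, so arr1 is left untouched.
independent = arr1[2:4].copy()
independent[0] = -1

# Slicing a Python list always builds a new list.
lst = [1, 2, 5, 6, 4, 3]
piece = lst[2:4]
piece[0] = 88

print(arr1.tolist())  # [1, 2, 88, 6, 4, 3]
print(shares)         # True
print(lst)            # [1, 2, 5, 6, 4, 3]
```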
in and not in
• >>> setA = {1, 3, 5, 7}
• >>> 3 in setA
• True
• >>> 3 not in setA
• False
• >>> 4 not in setA
• True
Subset
• >>> setA = {1, 3, 5, 7}
• >>> setB = {1, 3, 5, 7, 9}
• >>> setC = {1, 3, 5, 9, 10}
• >>> setA.issubset(setB)
• True
• >>> setA.issubset(setC)
• False
Superset
• >>> setA = {1, 3, 5, 7}
• >>> setB = {1, 3, 5, 7, 9}
• >>> setC = {1, 3, 5, 9, 10}
• >>> setA.issuperset(setB)
• False
• >>> setB.issuperset(setA)
• True
• >>> setC.issuperset(setA)
• False
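Subset and superset tests are method calls on the set. Python also provides comparison operators for the same checks; they are not on the slides and are shown here as a brief sketch.

```python
setA = {1, 3, 5, 7}
setB = {1, 3, 5, 7, 9}
setC = {1, 3, 5, 9, 10}

a_in_b = setA.issubset(setB)      # every element of setA is in setB
a_in_c = setA.issubset(setC)      # 7 is missing from setC
b_over_a = setB.issuperset(setA)

# Operator spellings: <= for subset, >= for superset.
same_as_methods = (setA <= setB) == a_in_b and (setB >= setA) == b_over_a

# A set is a subset of itself; < tests for a proper (strict) subset.
self_subset = setA <= setA
proper_self = setA < setA

print(a_in_b, a_in_c, b_over_a)  # True False True
print(same_as_methods)           # True
print(self_subset, proper_self)  # True False
```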
Set Union
• >>> setA = {1, 3, 5, 7}
• >>> setB = {7, 5, 9}
• >>> setA.union(setB)
• {1, 3, 5, 7, 9}
• >>> setA | setB
• {1, 3, 5, 7, 9}
Set Intersection
• >>> setA = {1, 3, 5, 7}
• >>> setB = {7, 5, 9}
• >>> setA.intersection(setB)
• {5, 7}
• >>> setA & setB
• {5, 7}
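Union and intersection both return a new set and leave their operands unchanged. A short sketch that checks this; the results are sorted before printing because sets have no guaranteed display order.

```python
setA = {1, 3, 5, 7}
setB = {7, 5, 9}

union = setA | setB    # same result as setA.union(setB)
common = setA & setB   # same result as setA.intersection(setB)

# The method forms accept any iterable, not only sets.
from_list = setA.union([9, 11])

# The originals are not modified by either operation.
unchanged = setA == {1, 3, 5, 7} and setB == {5, 7, 9}

print(sorted(union))      # [1, 3, 5, 7, 9]
print(sorted(common))     # [5, 7]
print(sorted(from_list))  # [1, 3, 5, 7, 9, 11]
print(unchanged)          # True
```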
Dictionaries
• Lists index their entries based on the position in the list
• Dictionaries are like bags: entries have no positional index (since Python 3.7 they do keep insertion order)
• So we index the things we put in the dictionary with a "lookup tag"
>>> purse = dict()
>>> purse['money'] = 12
>>> purse['candy'] = 3
>>> purse['tissues'] = 75
>>> print(purse)
{'money': 12, 'candy': 3, 'tissues': 75}
>>> print(purse['candy'])
3
>>> purse['candy'] = purse['candy'] + 2
>>> print(purse)
{'money': 12, 'candy': 5, 'tissues': 75}
Comparing Lists and Dictionaries
Dictionaries are like lists except that they use keys instead of
numbers to look up values
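The contrast can be sketched with the same three items stored both ways. The list name items and the extra key 'coins' are illustrative additions, not from the slides.

```python
# A list looks values up by integer position.
items = ["money", "candy", "tissues"]
second = items[1]

# A dictionary looks values up by key.
purse = {"money": 12, "candy": 3, "tissues": 75}
candy_count = purse["candy"]

# Assigning to an existing key updates its value; a new key adds an entry.
purse["candy"] = purse["candy"] + 2
purse["coins"] = 4

# Looking up a missing key raises KeyError, so test with "in" first.
has_keys = "keys" in purse

print(second)          # candy
print(candy_count)     # 3
print(purse["candy"])  # 5
print(len(purse))      # 4
print(has_keys)        # False
```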