Data Analysis With Python - Day2
1
Agenda
3 18 Sept 2021
9.00-10.00 Data Analysis with Python: import packages, read data
10.00-11.00 Data Analysis with Python: clean data, wrangle data, analysis (numerical and categorical data)
11.00-12.00 Data Analysis with Python: data visualization
2
Learning Objectives
3
Jupyter Notebook / Colab
Tools for coding
4
Jupyter Notebook
● A Jupyter notebook is a web interface that lets us use formatting alongside our code. It is
extremely common and very useful!
● You can launch it by typing: jupyter notebook
5
Jupyter Notebook
● Cells:
○ Markdown for notes
○ Code for Python
● Modes
○ Blue for commands
○ Green for editing
● Execution
○ Shift + return
● Output
○ Print (all)
○ Return values (last)
6
Jupyter Notebook Errors
7
Quick Review: Jupyter Notebook
8
Colab
9
Python Essentials
10
Programming and Programming Languages
Programming:
Programming Languages
11
Python Indentation
12
Image source: https://www.faceprep.in/python/python-indentation/
What we’ve learned from the previous class
13
Variables
and Variable Types
Python Essentials
14
Variables
myInt = 4
myReal = 2.5
myChar = "a"
myString = "hello"
15
Creating variables
x = 5
y = "John"
print(x)
print(y)
5
John
16
Get the Type
• You can get the data type of a variable with the type() function.
x = 5
y = "John"
print(type(x))
print(type(y))
<class 'int'>
<class 'str'>
17
Variable Types
Python has the following data types built-in by default, in these categories:
18
Example
x = 20          # int
x = 20.5        # float
x = range(6)    # range
x = True        # bool
19
Casting
• If you want to specify the data type of a variable, this can be done with
casting.
3
3
3.0
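The code for this slide did not survive extraction. A minimal sketch that would print the three values shown above, assuming the casting in question uses the built-in str(), int() and float() functions:

```python
x = str(3)      # x is the string '3'
y = int(3)      # y is the integer 3
z = float(3)    # z is the float 3.0

print(x)
print(y)
print(z)
```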
20
Assign Multiple Values
Orange
Banana
Cherry
Make sure the number of variables matches the number of values, or else you will get an error.
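The assignment itself is missing from the slide. A sketch of what it may have looked like, with the variable names x, y, z chosen here for illustration, followed by the error you get when the counts do not match:

```python
# Three variables, three values: each name gets one value.
x, y, z = "Orange", "Banana", "Cherry"
print(x)
print(y)
print(z)

# Two variables, three values: Python raises a ValueError.
try:
    a, b = "Orange", "Banana", "Cherry"
except ValueError as err:
    print("ValueError:", err)
```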
21
Declaring Strings
22
Declaring Strings
23
String Concatenation (+ operator)
first_name = "Doc"
last_name = "Brown"
full_name = first_name + last_name
24
Spaces in Concatenation
To begin:
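The rest of this slide is missing. As a small sketch of the likely point: the + operator joins strings exactly as they are, so any space has to be added explicitly.

```python
first_name = "Doc"
last_name = "Brown"

no_space = first_name + last_name            # 'DocBrown'
with_space = first_name + " " + last_name    # 'Doc Brown'

print(no_space)
print(with_space)
```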
25
Strings and Printing: Review
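name = "Doc"          # example value (assumed; not on the slide)
car = "DeLorean"      # example value (assumed)
speed = "at 88 mph"   # example value (assumed)
sentence = name + " is driving his " + car + " " + speed
string_numbers = "88" + "51"
# string_numbers is the string "8851", not the number 8851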
26
Discussion: Some Common Mistakes: 1
my_num
print(my_num)
27
Discussion: Some Common Mistakes: 2
my_num = 5
print()
28
Discussion: Some Common Mistakes: 3
my_num = 5
my_string = "Hello"
print(my_num + my_string)
29
Discussion: Some Common Mistakes: 4
my_num1 = "10"
my_num2 = "20"
print(my_num1 + my_num2)
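For discussion, here is a sketch of what Python does with three of these mistakes and one possible fix for each. The name undefined_num is hypothetical and stands in for any variable that was never assigned.

```python
# Mistake 1: using a variable before giving it a value.
try:
    print(undefined_num)          # hypothetical name, never assigned
except NameError as err:
    print("NameError:", err)

# Mistake 3: a number and a string cannot be added directly.
my_num = 5
my_string = "Hello"
try:
    print(my_num + my_string)
except TypeError as err:
    print("TypeError:", err)
fixed = str(my_num) + my_string   # convert the number to a string first
print(fixed)

# Mistake 4: "+" joins strings; it does not add them as numbers.
my_num1 = "10"
my_num2 = "20"
joined = my_num1 + my_num2              # '1020'
total = int(my_num1) + int(my_num2)     # 30
print(joined, total)
```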
30
Quick Review: Variables
31
String
Python Essentials
32
String
● String is a sequence made up of one or more individual characters that could consist of letters,
numbers, whitespace characters, or symbols.
● Strings in Python are surrounded by either single quotation marks or double quotation marks.
"Hello"
'Hello'
● Assign string to a variable
a = "Hello"
print(a)
33
String index
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
34
Accessing Characters by Positive Index Number
● By referencing index numbers, we can isolate one of the characters in a string by putting the index
numbers in square brackets.
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
35
Accessing Characters by Negative Index Number
● We can also count backwards from the end of the string, starting at the index number -1.
S   t   r   i   n   g   s       a   r   e       i   n   d   e   x   e   d   !
-20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9  -8  -7  -6  -5  -4  -3  -2  -1
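A short sketch of positive and negative indexing on the same string. The variable name ss is an assumption; the slide code was not preserved.

```python
ss = "Strings are indexed!"

print(ss[0])     # S  (first character)
print(ss[4])     # n
print(ss[-1])    # !  (last character)
print(ss[-3])    # e
```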
36
Slicing Strings
● With slices, we can extract several characters at once by giving a range of index numbers separated by
a colon [x:y]:
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
● The first index number is where the slice starts (inclusive), and the second index number is where the
slice ends (exclusive).
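A minimal sketch of slicing, again assuming the string is stored in a variable named ss:

```python
ss = "Strings are indexed!"

print(ss[8:11])   # are       (indexes 8, 9, 10; index 11 is excluded)
print(ss[:7])     # Strings   (an omitted start means "from the beginning")
print(ss[12:])    # indexed!  (an omitted end means "to the end")
```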
37
Specifying Stride while Slicing Strings
● String slicing can accept a third parameter in addition to two index numbers.
● The third parameter specifies the stride, which refers to how many characters to move forward after
the first character is retrieved from the string.
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
38
Stride to -1
● You can also give a negative stride. Setting the stride to -1 returns the original string in reverse order:
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
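A sketch of the stride parameter in both directions, using the assumed variable name ss:

```python
ss = "Strings are indexed!"

every_second = ss[0:7:2]   # indexes 0, 2, 4, 6
backwards = ss[::-1]       # whole string, stepping from the end to the start

print(every_second)        # Srns
print(backwards)           # !dexedni era sgnirtS
```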
39
len() function
S  t  r  i  n  g  s     a  r  e     i  n  d  e  x  e  d  !
0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19
40
Discussion: index and len()
0 1 2 3
● What happens when this code runs?
name[len(name)]
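The string used on the slide was lost; only its four index numbers remain. A sketch with a hypothetical four-character value shows why name[len(name)] fails: the last valid index is always len(name) - 1.

```python
name = "Emma"    # hypothetical 4-character value; the slide's string is unknown

print(len(name))              # 4
print(name[len(name) - 1])    # a  (the last character)

try:
    name[len(name)]           # index 4 does not exist
except IndexError as err:
    print("IndexError:", err)
```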
41
String concatenation
43
String Methods
Python Essentials - String
44
String Methods (String functions)
● Python has built-in functions for performing various operations on strings, which can be called in
the following format:
variable.function_name()
45
String Methods
● str.isupper(), str.islower()
46
isupper(), islower()
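The example for this slide is missing. A small sketch with made-up strings: isupper() and islower() look only at the letters in the string and ignore digits, spaces and punctuation.

```python
print("HELLO".isupper())       # True
print("Hello".isupper())       # False: not every letter is uppercase
print("hello".islower())       # True
print("hello 123".islower())   # True: digits and spaces are ignored
```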
47
String Methods
● str.isalpha(), str.isdigit()
48
isalpha(), isdigit()
text = '02-111-2222'
print(text.isdigit())   # False: the hyphens are not digits
text = '98765'
print(text.isdigit())   # True: every character is a digit
49
String Methods
● str.upper(), str.lower()
50
upper(), lower()
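text = "Hello World"        # example value (assumed; the slide's string was lost)
upperText = text.upper()    # 'HELLO WORLD'
lowerText = text.lower()    # 'hello world'
print(text)
print(upperText)
print(lowerText)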
51
Quick Review: String Methods
53
String Iteration
Python Essentials - String
54
How to Access Each Character in a String?
55
Using for Loop to Traverse a String
H  e  l  l  o     W  o  r  l  d  !
0  1  2  3  4  5  6  7  8  9  10 11
56
Using Range to Iterate over a String
H  e  l  l  o     W  o  r  l  d  !
0  1  2  3  4  5  6  7  8  9  10 11
57
Using While Loop to Traverse a String
H  e  l  l  o     W  o  r  l  d  !
0  1  2  3  4  5  6  7  8  9  10 11
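The code for the three traversal slides was not preserved. A sketch of the three approaches side by side, assuming the string is stored in a variable named text:

```python
text = "Hello World!"

# 1. for loop: visit each character directly
for ch in text:
    print(ch)

# 2. range(): visit each index, then look up the character
for i in range(len(text)):
    print(i, text[i])

# 3. while loop: advance the index by hand
i = 0
while i < len(text):
    print(i, text[i])
    i += 1
```

The plain for loop is usually the simplest choice; the index-based versions are useful when the position itself is needed.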
58
Quick Review: String Iteration
59
Data Structures
Python Essentials
60
How to Store the Data?
61
How to Store the Data?
62
List
Python Essentials – Data Structures
63
List
● A list is a Python data structure that is a mutable, or changeable, ordered sequence of
elements.
● Each element or value that is inside of a list is called an item.
● Lists are defined by having values between square brackets [ ].
● Lists are great to use when you want to work with many related values.
64
Create a list
65
List index
● Each item in a list corresponds to an index number, which is an integer value, starting with the index
number 0.
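The examples for creating, indexing and measuring a list are missing from the slides. A minimal sketch, using a hypothetical list named fruits:

```python
fruits = ["apple", "banana", "cherry"]   # hypothetical example values

print(fruits[0])      # apple   (first item)
print(fruits[2])      # cherry
print(fruits[-1])     # cherry  (negative indexes count from the end)
print(len(fruits))    # 3

fruits[1] = "mango"   # lists are mutable, so an item can be replaced
print(fruits)         # ['apple', 'mango', 'cherry']
```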
66
Access List Elements
67
List Length
68
In Operator
Python's in operator checks whether a collection (such as a list or a tuple) contains a member that is
equal to the given item.
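A short sketch of the in operator on a hypothetical list named fruits:

```python
fruits = ["apple", "banana", "cherry"]   # hypothetical example values

has_banana = "banana" in fruits    # True
has_grape = "grape" in fruits      # False
print(has_banana, has_grape)

if "grape" not in fruits:
    print("No grapes in the list")
```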
69
In Operator
70
Quick Review: List
● A list is a Python data structure that is a mutable, ordered sequence of elements.
● A list can be created using square brackets.
● List indexes start at 0.
● Each item is accessed using its index.
● len() returns the size of the list.
● The in operator checks whether an item is in the list.
71
List Iteration
Python Essentials – Data Structures
72
Loop on List
73
Loop on List (using index)
74
Loop on List (element-wise)
A for loop does not need an index.
Example.
myList = [1, 3, 5]
for element in myList:
    print(element)
myList: 1 3 5
index:  0 1 2
77
Quick Review: List Iteration
80
List Operations
Python Essentials – Data Structures
81
List Property
82
List Operations
83
Add Element into List
0 1 2 3
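The code for this slide is missing. A sketch of adding an element to the end of a list with append(), using hypothetical values that give the four index positions shown above:

```python
listA = ['a', 'b', 'c']   # hypothetical starting values
listA.append('d')         # adds 'd' at the end, at index 3
print(listA)              # ['a', 'b', 'c', 'd']
```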
84
Insert Element into List (anywhere)
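listA = ['a','c','d']
print(listA)            # ['a', 'c', 'd']
# list.insert(i, elem) puts elem at index i and shifts later items right
listA.insert(1,'b')
print(listA)            # ['a', 'b', 'c', 'd']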
85
Remove Last Element from List
[Diagram: last = 3.14]
91
Remove Element from Anywhere in the List
removed: 'c'
listA: 'a' 'd'
index: 0   1
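The code for the removal slides was lost; only the diagram labels remain. A sketch that would produce those results, assuming pop() was the method shown and reusing the lists that appear elsewhere in the deck:

```python
# pop() with no argument removes and returns the last item.
myList = [0, 'Hello World', 3.14]
last = myList.pop()
print(last)       # 3.14
print(myList)     # [0, 'Hello World']

# pop(i) removes and returns the item at index i.
listA = ['a', 'c', 'd']
removed = listA.pop(1)
print(removed)    # c
print(listA)      # ['a', 'd']
```

If you know the value rather than the position, listA.remove('c') deletes the first matching item instead.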
92
Append 2 Lists
listA = [1, 2, 3]
listA = listA + [4,5]
print(listA)
listA 1 2 3 4 5
93
Append 2 Lists
2. Use .extend()
listA 1 2 3
listA = [1, 2, 3]
listA.extend([4, 5])
print(listA)
listA 1 2 3 4 5
94
Challenge: + vs Extend
# Using +
a = [1, 2, 3]
b = [4, 5]
c = a + b

# Using extend()
a = [1, 2, 3]
b = [4, 5]
a.extend(b)
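One way to explore the challenge is to print the lists after each step. In this sketch, + builds a new list and leaves a untouched, while extend() changes a in place:

```python
a = [1, 2, 3]
b = [4, 5]

c = a + b                   # builds a new list
a_after_plus = list(a)      # snapshot of a for comparison
print(a_after_plus)         # [1, 2, 3]
print(c)                    # [1, 2, 3, 4, 5]

result = a.extend(b)        # modifies a in place
print(a)                    # [1, 2, 3, 4, 5]
print(result)               # None: extend() does not return the list
```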
95
Use List as Parameter
Recap
96
List as Parameters
What is listX?
97
List as Parameters
[Diagram: listX and listA both refer to the same list, [1]]
98
List as Parameters
def listOp(listA):
    listA.extend([4, 5])

listX = [1]
listOp(listX)
# listX and listA refer to the same list, which is now [1, 4, 5]
99
Caution When Using a List as a Parameter
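The content of this slide is missing. As a sketch of the likely caution: a function that modifies its list parameter also modifies the caller's list, and copying the list first is one way to avoid that. The function names here are hypothetical.

```python
def list_op_in_place(items):
    items.extend([4, 5])       # changes the caller's list

def list_op_copy(items):
    result = items.copy()      # work on a copy instead
    result.extend([4, 5])
    return result

listX = [1]
listY = list_op_copy(listX)
x_after_copy = list(listX)     # snapshot of listX for comparison
print(x_after_copy)            # [1]
print(listY)                   # [1, 4, 5]

list_op_in_place(listX)
print(listX)                   # [1, 4, 5]
```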
100
Quick Review: List Operations
● Create a list:
○ Put items in square brackets
○ Separate items with commas
myList = [0, 'Hello World', 3.14]
● Print out specific elements in a list
for i in range(3):
    element = myList[i]
    print(element)
101
Range
Python Essentials – Data Structures
102
range() function
for i in range(5):
    print(i)
103
range() function
for i in range(101):
    if i >= 5:
        if i % 5 == 0:
            print(i)
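range() also accepts a start and a step, which can replace the nested if statements in the loop above. A small sketch:

```python
print(list(range(5)))           # [0, 1, 2, 3, 4]
print(list(range(2, 6)))        # [2, 3, 4, 5]  (start included, stop excluded)
print(list(range(5, 26, 5)))    # [5, 10, 15, 20, 25]

# The multiples of 5 from 5 to 100, using start, stop and step
multiples = list(range(5, 101, 5))
for i in multiples:
    print(i)
```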
104
range() function
105
range -> slice
106
Find Reverse String
word = 'camp'
reverse = word[::-1]
107
Quick Review: Range
108
Common Pattern with List
Python Essentials – Data Structures
109
List Operations
110
Common Things We Do with Lists
111
Turn a String (sentence) into a List of Strings (words)
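The example for this slide is missing. A minimal sketch using the string method split(), with a hypothetical sentence:

```python
sentence = "the quick brown fox"   # hypothetical example sentence
words = sentence.split()           # splits on whitespace by default
print(words)                       # ['the', 'quick', 'brown', 'fox']
print(len(words))                  # 4
```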
112
Count Certain Things in a List
score_list: 5 6 1 10 2
Using an index:
score_list = [5, 6, 1, 10, 2]
total_count = 0
for i in range(len(score_list)):
    if score_list[i] > 4:
        total_count += 1
print(total_count)
Without an index:
total_count = 0
for number in score_list:
    if number > 4:
        total_count += 1
print(total_count)
114
Filter Things in a List
score_list 5 6 1 10 2
print(new_list)
new_list 5 6 10
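The filtering code itself did not survive. A sketch that would produce the new_list shown above, assuming the condition is "greater than 4" as in the counting example:

```python
score_list = [5, 6, 1, 10, 2]
new_list = []
for number in score_list:
    if number > 4:               # assumed condition, chosen to match the result shown
        new_list.append(number)
print(new_list)                  # [5, 6, 10]
```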
115
Process Things in a List in the Sorted Order
116
Process things in a list from right to left
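The examples for sorted order and right-to-left processing are missing. One possible sketch uses the built-in sorted() and reversed() functions, neither of which changes the original list:

```python
score_list = [5, 6, 1, 10, 2]

for number in sorted(score_list):     # smallest to largest
    print(number)

for number in reversed(score_list):   # right to left
    print(number)

in_order = sorted(score_list)            # [1, 2, 5, 6, 10]
right_to_left = list(reversed(score_list))   # [2, 10, 1, 6, 5]
print(score_list)                        # [5, 6, 1, 10, 2], unchanged
```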
117
Quick Review: Common Patterns with List
118
Dictionary
Python Essentials – Data Structures
119
Introducing Dictionaries
Think about dictionaries — they're filled with words and definitions that are paired together.
120
Introducing Dictionaries
121
Declaring a Dictionary
# And in action...
my_dictionary = {"Puppy": "Furry, energetic animal", "Pineapple": "Acidic tropical fruit", "Tea": "Herb-infused drink"}
print(my_dictionary)
# Prints the whole dictionary
print(my_dictionary["Puppy"])
# => Prints Puppy's value: "Furry, energetic animal"
122
Dictionaries and Quick Tips
● In Python 3.7 and later, keys are printed in the order you entered them. In older versions the order may differ. That's fine!
● You can't have the same key twice. Imagine having two "puppies" in a real dictionary! If you
try, the last value will be the one that's kept.
● What's more, looking up a key that doesn't exist raises an error (KeyError).
● Let's create a dictionary together.
123
Dictionary Syntax
What if we have new things to add? It's the same syntax as changing the value, just with a new key:
my_dictionary["Yoga"] = "Peaceful".
print(my_dictionary)
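A sketch of adding a key, changing a value, and handling a missing key. The replacement value for "Tea" and the key "Coffee" are made up for illustration.

```python
my_dictionary = {
    "Puppy": "Furry, energetic animal",
    "Tea": "Herb-infused drink",
}

my_dictionary["Yoga"] = "Peaceful"     # new key: the pair is added
my_dictionary["Tea"] = "Hot drink"     # existing key: the value is replaced (made-up value)
print(my_dictionary)

# Looking up a key that does not exist raises a KeyError.
try:
    my_dictionary["Coffee"]
except KeyError as err:
    print("KeyError:", err)

# get() returns a default value instead of raising an error.
fallback = my_dictionary.get("Coffee", "not found")
print(fallback)
```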
124
Quick Review: Dictionaries
● Make a dictionary.
● Print a dictionary.
● Print one key's value.
● Change a key's value.
Here's a best practice: Declare your dictionary across multiple lines for readability.
Which is better?
# This works but is not proper style.
my_dictionary = {"Puppy": "Furry, energetic animal", "Pineapple": "Acidic tropical fruit", "Tea":
"Herb-infused drink"}
# Do this instead!
my_dictionary = {
    "Puppy": "Furry, energetic animal",
    "Pineapple": "Acidic tropical fruit",
    "Tea": "Herb-infused drink"
}
125
Dictionary Iteration
Python Essentials – Data Structures
126
Looping through Dictionaries
We can print a dictionary with print(my_dictionary), but, like a list, we can also loop through the
items with a for loop:
my_dictionary = {
"Puppy": "Furry energetic animal",
"Pineapple": "Acidic tropical fruit",
"Tea": "Herb infused drink"
}
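The loop itself is missing from the slide. A sketch of two common ways to loop through a dictionary, either over the keys alone or over key-value pairs with items():

```python
my_dictionary = {
    "Puppy": "Furry energetic animal",
    "Pineapple": "Acidic tropical fruit",
    "Tea": "Herb infused drink",
}

# Looping over a dictionary gives its keys.
for key in my_dictionary:
    print(key, "->", my_dictionary[key])

# items() gives each key together with its value.
pairs = []
for key, value in my_dictionary.items():
    print(key, "->", value)
    pairs.append((key, value))
```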
127
Other Values
128
Reverse Lookup
Finding the value from a key is easy: my_dictionary[key]. But what if you only have the value and want
to find the key?
Your task is to write a function, reverse_lookup(), that takes a value and returns the
corresponding key in the dictionary.
For example:
state_capitals = {
    "Alaska" : "Juneau",
    "Colorado" : "Denver",
    "Oregon" : "Salem",
    "Texas" : "Austin"
}
print(reverse_lookup("Denver"))
# Prints Colorado
Hint: The items() method returns a view object. The view object contains the key-value pairs
of the dictionary as (key, value) tuples.
dictionary = {'george': 16, 'amber': 19}
search_age = int(input("Provide age"))
for name, age in dictionary.items():
    if age == search_age:
        print(name)
129
Solution
state_capitals = {
    "Alaska" : "Juneau",
    "Colorado" : "Denver",
    "Oregon" : "Salem",
    "Texas" : "Austin"
}
def reverse_lookup(value):
    # Return the first key whose value matches; returns None if nothing matches.
    for k, v in state_capitals.items():
        if value == v:
            return k
● Dictionaries:
○ Are another kind of collection, instead of a list.
○ Use keys to access values, not indices!
○ Should be used instead of lists when:
■ You don't care about the order of the items.
■ You'd prefer more meaningful keys than just index numbers.
my_dictionary = {
"Puppy": "Furry, energetic animal",
"Pineapple": "Acidic tropical fruit",
"Tea": "Herb-infused drink"
}
131
File I/O
Python Essentials
132
Entering data into the program
● Embed data directly into the source code of a program (hard code).
133
File Types
134
Quick Review: File I/O
135
Reading a File
Python Essentials - File I/O
136
Reading a File
137
Approaches of Reading a File Line by Line
● Before reading a file, you need to open it using the open() function.
● The open() function returns a file object that provides methods and attributes for performing various
operations on the file.
138
Approach#1: Iterate on File
song_file = open('lyrics.txt')
for line in song_file:
    if line != '\n':
        print(line)
139
Why Doesn’t It Print Two Times?
song_file = open('lyrics.txt')
for line in song_file:
    if line != '\n':
        print(line)
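The second loop from this slide appears to be missing. One likely reading of the question is that looping over the same file object a second time prints nothing, because the first loop has already read to the end of the file. A sketch under that assumption, using a temporary stand-in for lyrics.txt so that it runs on its own:

```python
import os
import tempfile

# Create a small stand-in for lyrics.txt (assumed content).
path = os.path.join(tempfile.mkdtemp(), "lyrics.txt")
with open(path, "w") as f:
    f.write("Twinkle twinkle little star\nHow I wonder what you are\n")

song_file = open(path)
first_pass = [line for line in song_file]     # reads to the end of the file
second_pass = [line for line in song_file]    # nothing is left to read
print(len(first_pass), len(second_pass))

song_file.seek(0)                             # move back to the start of the file
third_pass = [line for line in song_file]
print(len(third_pass))
song_file.close()
```

Calling seek(0), or simply opening the file again, lets you read it a second time.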
140
readline()
The readline() method reads one line of the file and returns it as a string.
song_file = open('lyrics.txt')
first = song_file.readline()
second = song_file.readline()
for line in song_file:    # continues from the third line
    if line != '\n':
        print(line)
141
Approach#2: With Statement
with open('lyrics.txt') as song_file:
    for line in song_file:
        if line != '\n':
            print(line)
142
Approach#3: Open a File in For Loop
for line in open('lyrics.txt'):
    if line != '\n':
        print(line)
143
Reading Data from File
144
Read the Whole File to a String
The read() method returns the read data in the form of a string.
whole_file = open('lyrics.txt').read()
print(whole_file)
145
Reading Data from File
146
Read the Whole File to a List
The readlines() method reads all the lines at once and returns them as a list in which each line is a
string element.
lines = open('lyrics.txt').readlines()
print(lines)
['Twinkle twinkle little star\n', 'How I wonder what you are\n', 'Up above the world so high\n',
'Like a diamond in the sky \n', '\n', 'Twinkle twinkle little star\n', 'How I wonder what you are']
147
Quick Review: Reading a File
● Python can handle two types of files: normal text files and binary files
(written in binary language, 0s and 1s).
● Before reading a file, open it using the open() function, then...
○ Read line by line
○ Read the whole file to a string
○ Read the whole file to a list
148
Writing to Files
Python Essentials - File I/O
149
Writing to Files: Similar to Reading a File
150
caution#1: mode = 'w'
# Caution: without mode = 'w' the file is opened read-only, so f.write() raises an error.
f = open('my_fav_song.txt')
f.write('Silent night, holy night \n')
f.write('All is calm and all is bright')
f.close()
151
caution#2: write strings only
f = open('my_math.txt', mode = 'w')
f.write(123456789)    # TypeError: write() accepts only strings; use str(123456789)
f.close()
152
caution#3: close the file
If you write to a file without closing it, the data may not make it to the target
file.
f = open('my_fav_song.txt', mode = 'w')
f.write('Silent night, holy night \n')
f.write('All is calm and all is bright')
f.close()
Contents of my_fav_song.txt:
Silent night, holy night
All is calm and all is bright
154
caution#4: \n
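The content of this slide is missing. The likely point is that write() does not add a line break for you, so '\n' has to be written explicitly. The sketch below combines all four cautions, assuming a temporary stand-in path so that it runs on its own:

```python
import os
import tempfile

# Temporary stand-in for my_math.txt (assumed location).
path = os.path.join(tempfile.mkdtemp(), "my_math.txt")

# Caution 1 and 3: mode="w" opens the file for writing, and the
# with statement closes it automatically at the end of the block.
with open(path, mode="w") as f:
    f.write(str(123456789))    # caution 2: convert numbers to strings first
    f.write("\n")              # caution 4: write() adds no newline by itself
    f.write("second line")

with open(path) as f:
    content = f.read()
print(content)
```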
155
Quick Review: Writing to Files
156
Summary
We’ve learned:
• Variables and variable types
• String
• Data structures
• List
• Range
• Dictionary
• File I/O
157
Next
Data Analysis with Python
158
See You
159