Python For Data Science and AI
Types:
A type is how Python represents different types of data. In this video, we will discuss some widely
used types in Python. You can have different types in Python. They can be integers like 11, real
numbers like 21.213, they can even be words. Integers, real numbers, and words can be expressed
as different data types. The following chart summarizes three data types for the last examples.
The first column indicates the expression. The second column indicates the data type. We can see
the actual data type in Python by using the type command. We can have int, which stands for an
integer, and float, which stands for floating point, essentially a real number. The type str, short
for string, is a sequence of characters. Here are some integers. Integers can be negative or positive.
In Python 3, the range of integers is limited only by the available memory. Floats are real numbers.
They include the
integers but also numbers in between the integers. Consider the numbers between 0 and 1. We can
select numbers in between them. These numbers are floats. Similarly, consider the numbers
between 0.5 and 0.6. We can select numbers in between them. These are floats as well. We can
continue the process zooming in for different numbers. Of course there is a limit but it is quite
small. You can change the type of an expression in Python; this is called typecasting.
You can convert an int to a float. For example, you can convert or cast the integer 2 to the float 2.0.
Nothing really changes. If you cast a float to an integer, you must be careful. For example, if you
cast the float 1.1 to an integer, you get 1 and lose some information. If a string contains an integer
value, you can convert it to int. If we convert a string that contains a non-integer value, we get an
error. Check
out more examples in the lab. You can convert an int to a string or a float to a string. Boolean is
another important type in Python. A Boolean can take on two values. The first value is True, just
remember we use an uppercase T. Boolean values can also be False with an uppercase F. Using
the type command on a Boolean value, we obtain the term bool. This is short for Boolean. If we
cast a Boolean True to an integer or float, we will get a 1. If we cast a Boolean False to an integer
or float, we get a 0. If you cast a 1 to a Boolean, you get a True. Similarly, if you cast a 0 to a
Boolean, you get a False. Check the labs for more examples or check Python.org for other kinds
of types in Python.
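The type and casting behavior described above can be checked with a short sketch. The numeric values mirror the ones mentioned in the text; the strings and all variable names are assumptions made for illustration.

```python
# Inspect the type of a few literals with the built-in type() function.
int_type = type(11)            # int
float_type = type(21.213)      # float
str_type = type("Hello")       # str ("Hello" is an illustrative value)
bool_type = type(True)         # bool

# Typecasting between types.
two_as_float = float(2)        # 2.0: nothing is lost
one_from_float = int(1.1)      # 1: the fractional part is dropped
one_from_string = int("1")     # a string holding an integer value converts cleanly
text_from_int = str(1)         # "1"
text_from_float = str(1.2)     # "1.2"

# Casting a string with a non-integer value raises a ValueError.
try:
    int("1 or 2 people")       # illustrative non-integer string
    cast_failed = False
except ValueError:
    cast_failed = True

# Booleans cast to 1 and 0, and back again.
true_as_int = int(True)        # 1
false_as_float = float(False)  # 0.0
one_as_bool = bool(1)          # True
zero_as_bool = bool(0)         # False
```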
Expressions describe a type of operation that computers perform. Expressions are operations that
Python performs, for example, basic arithmetic operations like adding multiple numbers. The result
in this case is 160. We call the numbers operands, and the math symbols, in this case addition, are
called operators. We can perform operations such as subtraction using the minus sign. In this
case the result is a negative number.
We can perform multiplication operations using the asterisk. The result is 25. In this case, the
operators are the minus sign and the asterisk. We can also perform division with the forward slash.
25 / 5 is 5.0. 25 / 6 is approximately 4.167. In Python 3, the version we will be using in this course,
both will result in a float. We can use double slash for integer division, where the result is rounded
down to the nearest integer.
Be aware, in some cases the results are not the same as regular division. Python follows
mathematical conventions when performing mathematical expressions. The following operations
are in a different order.
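As a rough illustration of the operators and the order of operations discussed here, consider the sketch below. Apart from 25 / 5 and 25 / 6, the operands are assumptions chosen to reproduce the results quoted in the transcript, since the slides themselves are not available.

```python
# Addition and subtraction (operands are illustrative).
total = 43 + 60 + 16 + 41   # 160
difference = 50 - 60        # -10, a negative number

# Multiplication with the asterisk.
product = 5 * 5             # 25

# Regular division always returns a float in Python 3.
exact = 25 / 5              # 5.0
approx = 25 / 6             # about 4.167

# Floor division with the double slash rounds down.
floor_exact = 25 // 5       # 5
floor_approx = 25 // 6      # 4, not the same as 25 / 6

# Multiplication happens before addition, whatever the written order.
first = 2 * 60 + 30         # 150
second = 30 + 2 * 60        # 150

# Parentheses are evaluated first.
grouped = (30 + 2) * 60     # 1920
```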
In both cases Python performs multiplication then addition to obtain the final result. There are a
lot more operations you can do with Python; check the labs for more examples. We will also be
covering more complex operations throughout the course. The expressions in parentheses are
performed first. We then multiply the result by 60. The result is 1,920. Now, let's look at variables.
We can use variables to store values. In this case, we assign a value of 1 to the variable my_variable
using the assignment operator, i.e., the equal sign. We can then use the value somewhere else in
the code by typing the exact name of the variable. We will use a colon to denote the value of the
variable.
We can assign a new value to my_variable using the assignment operator. We assign a value of
10, the variable now has a value of 10. The old value of the variable is not important. We can store
the results of expressions. For example, we add several values and assign the result to x; x now
stores the result. We can also perform operations on x and save the result to a new variable, y. The
variable y now has a value of 2.666. We can also perform operations on x and assign the result back
to x. The variable x now has a value of 2.666. As before, the old value of x is not important. We can
use the type command on variables as well. It's good practice to use meaningful variable names, so
you don't
have to keep track of what the variable is doing.
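The assignment steps above can be sketched as follows. The name my_variable comes from the transcript; the values added into x are an assumption, chosen so that the sum is 160 and the quotient matches the 2.666 quoted in the text.

```python
my_variable = 1         # assign the value 1 with the assignment operator
my_variable = 10        # reassign: the old value is no longer kept

# Store the result of an expression (operands are illustrative).
x = 43 + 60 + 16 + 41   # 160
y = x / 60              # 2.666..., saved in a new variable
x = x / 60              # overwrite x with the result of the operation

x_type = type(x)        # float
```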
Let's say we would like to convert the number of minutes in the highlighted example to the number of
hours in the following music dataset. We call the variable that contains the total number of minutes
total_min. It's common to use the underscore to represent the start of a new word; you could also
use a capital letter. We call the variable that contains the total number of hours, total_hour. We
can obtain the total number of hours by dividing total_min by 60. The result is approximately
2.367 hours. If we modify the value of the first variable, the value of the variable will change. The
final result values change accordingly, but we do not have to modify the rest of the code.
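A minimal sketch of this conversion is shown below. The variable names total_min and total_hour are from the transcript; the value 142 is an assumption, picked because it gives roughly the 2.367 hours quoted above.

```python
total_min = 142               # assumed value, not taken from the dataset
total_hour = total_min / 60   # about 2.367 hours

# Changing the first variable and running the same line again
# updates the result without touching the rest of the code.
total_min = 180               # another illustrative value
total_hour_updated = total_min / 60   # 3.0
```

Note that total_hour only changes when the division is executed again; the assignment stores a value, not a formula.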
Lists and tuples are called compound data types and are one of the key types of data structures in
Python. Tuples are an ordered sequence. Here is a tuple, ratings. Tuples are expressed as comma
separated elements within parentheses. These are values inside the parentheses.
In Python, there are different types, strings, integer, float. They can all be contained in a tuple.
But the type of the variable is tuple. Each element of a tuple can be accessed via an index. The
following table represents the relationship between the index and the elements in the tuple. The
first element can be accessed by the name of the tuple, followed by a square bracket with the
index number, in this case, 0. We can access the second element as follows. We can also access
the last element. In Python, we can use a negative index. The relationship is as follows; the
corresponding values are shown here. We can concatenate or combine tuples by adding them. The
result is the following, with the following indexes.
If we would like multiple elements from a tuple, we could also slice tuples. For example, if we
want the first three elements, we use the following command. The last index is one larger than the
index you want. Similarly, if we want the last two elements, we use the following command.
Notice how the end of the slice is one larger than the last index of the tuple, that is, equal to its
length. We can use the len command
to obtain the length of the tuple. As there are five elements, the result is five.
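The indexing, concatenation, and slicing rules above can be tried out with a small sketch. The tuple contents and the names tuple1 and tuple2 are assumptions for illustration; only the five-element length matches the text.

```python
# Illustrative tuple mixing a string, an integer, and a float.
tuple1 = ("disco", 10, 1.2)
tuple_type = type(tuple1)            # tuple

first = tuple1[0]                    # "disco"
second = tuple1[1]                   # 10
last = tuple1[-1]                    # 1.2, negative indexes count from the end

# Concatenation creates a new, longer tuple.
tuple2 = tuple1 + ("hard rock", 10)

first_three = tuple2[0:3]            # indexes 0, 1, 2; index 3 is excluded
last_two = tuple2[3:5]               # indexes 3 and 4
length = len(tuple2)                 # 5
```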
Tuples are immutable, which means we can't change them. To see why this is important, let's see
what happens when we set the variable ratings1 to ratings. Let's use the image to provide a
simplified explanation of what's going on. Each variable does not contain a tuple but references
the same immutable tuple object. See the objects and classes module for more about objects.
Let's say we want to change the element at index 2. Because tuples are immutable, we can't.
Therefore, ratings1 will not be affected by a change in ratings. Because the tuple is immutable, i.e.,
we can't change it, we can only assign a different tuple to the ratings variable. The variable ratings
now references another tuple. As a consequence of immutability, if we would like to manipulate
a tuple, we must create a new tuple instead. For example, if we would like to sort a tuple, we use
the function sorted. The input is the original tuple. The output is a new sorted list, which can be
converted back to a tuple. For more on functions, see our video on functions. A tuple can contain
other tuples as well as other complex data types; this is called nesting. We can access these
elements using the standard indexing
methods. If we select an index with the tuple, the same index convention applies. As such, we
can then access values in the tuple. For example, we could access the second element, we can
apply this indexing directly to the tuple variable NT.
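Here is a hedged sketch of immutability, sorting, and nesting. The names ratings and NT appear in the transcript and ratings1 is our reading of the spoken "ratings one"; all the values are assumptions for illustration.

```python
ratings = (10, 9, 6, 5, 10, 8, 9, 6, 2)   # illustrative values
ratings1 = ratings                         # both names reference the same tuple

# Item assignment is not allowed on a tuple.
try:
    ratings[2] = 4
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Rebinding ratings to a different tuple leaves ratings1 untouched.
ratings = (2, 10, 1)

# sorted() returns a new list; tuple() turns it back into a tuple.
ratings_sorted = tuple(sorted(ratings1))

# Nesting: elements of a tuple can themselves be tuples.
NT = (1, 2, ("pop", "rock"), (3, 4), ("disco", (1, 2)))
inner = NT[2]               # ("pop", "rock")
inner_second = NT[2][1]     # "rock"
first_char = NT[2][1][0]    # "r", one level deeper into the string
deep = NT[4][1][0]          # 1, inside the tuple nested in the last element
```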
It is helpful to visualize this nesting as a tree. The tuple has the
following indexes. If we consider indexes with other tuples, we see the tuple at index 2 contains
a tuple with two elements. We can access those two indexes. The same convention applies to
index 3. We can access the elements in those tuples as well. We can continue the process, we can
even access deeper levels of the tree by adding another square bracket. We can access different
characters in the string or various elements in the second tuple contained in the first. Lists are
also a popular data structure in Python. Lists are also an ordered sequence.
Here is a list L. A list is represented with square brackets. In many respects, lists are like tuples.
One key difference is they are mutable. Lists can contain strings, floats, integers. We can nest
other lists. We can also nest tuples and other data structures. The same indexing conventions apply
for nesting. Like tuples, each element of a list can be accessed via an index. The following table
represents the relationship between the index and the elements in the list. The first element can
be accessed by the name of the list followed by a square bracket with the index number, in this
case, 0. We can access the second element as follows. We can also access the last element. In
Python, we can use a negative index. The relationship is as follows. The corresponding indexes
are as follows. We can also perform slicing in lists. For example, if we want the last two
elements in this list, we use the following command. Notice how the end of the slice is one larger
than the last index of the list. The index conventions for lists and tuples are identical. Check the labs
for more examples. We can concatenate or combine lists by adding them. The result is the following.
The new list has the following indices. Lists are mutable; therefore, we can change them. For
example, we apply the method extend by adding a dot followed by the name of the method, then
parentheses. The argument inside the parentheses is a new list that we are going to concatenate to
the original list. In this case, instead of creating a new list L1, the original list L is modified by
adding two new elements. To learn more about methods, check out our video on objects and
classes. Another similar method is append. If we apply append instead of extend, we add one
element to the list. If we look at the index, there is only one more element. Index 3 includes the
list that we appended.
Every time we apply a method, the list changes. If we apply extend, we add two new elements to
the list. The list L is modified by adding two new elements. If we append the string A, we further
change the list, adding the string A.
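The difference between extend and append can be seen in a short sketch. The list contents are assumptions for illustration; M is a hypothetical second list used only to show append on its own.

```python
L = ["Michael Jackson", 10.1, 1982]   # illustrative list
L.extend(["pop", 10])                 # adds two elements; L itself is modified
length_after_extend = len(L)          # 5

M = ["Michael Jackson", 10.1, 1982]
M.append(["pop", 10])                 # adds ONE element: the list ["pop", 10]
length_after_append = len(M)          # 4
appended_item = M[3]                  # ["pop", 10]

# Every method call changes the list further.
L.append("A")
length_after_both = len(L)            # 6
```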
As lists are mutable, we can change them. For example, we can change the first element as
follows. The list now becomes hard rock, 10, 1.2. We can delete an element of a list using the
del command. We simply indicate the list item we would like to remove as an argument. For
example, if we would like to remove the first element, the result becomes 10, 1.2. We can delete
the second element. This operation removes the second element of the list. We can convert a
string to a list using split. For example, the method split converts every group of characters
separated by space into an element of a list. We can use the split function to separate strings on a
specific character known as a delimiter. We simply pass the delimiter we would like to split on
as an argument, in this case, a comma. The result is a list, each element corresponds to a set of
characters that have been separated by a comma. When we set one variable B equal to A, both A
and B are referencing the same list. Multiple names referring to the same object is known as
aliasing. We know from the list slide that the first element in B is set as hard rock. If we change
the first element in A to banana, we get a side effect. The value of B would change as a
consequence.
A and B are referencing the same list, therefore, if we change A, list B also changes. If we check
the first element of B after changing list A, we get banana instead of hard rock.
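The list operations described above (changing an element, del, split, and aliasing) can be sketched as follows. The values "hard rock", 10, 1.2, and "banana" come from the transcript; the starting element "disco" and the strings passed to split are assumptions.

```python
A = ["disco", 10, 1.2]              # illustrative starting list
A[0] = "hard rock"                  # lists are mutable
after_change = list(A)              # ["hard rock", 10, 1.2]

del A[0]                            # remove the first element
after_del = list(A)                 # [10, 1.2]

# split() with no argument separates on whitespace.
words = "hard rock".split()         # ["hard", "rock"]
# Passing a delimiter splits on that character instead.
letters = "A,B,C,D".split(",")      # ["A", "B", "C", "D"]

# Aliasing: B is another name for the same list object.
A = ["hard rock", 10, 1.2]
B = A
A[0] = "banana"
b_first = B[0]                      # "banana", the side effect of changing A
same_object = A is B                # True
```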
You can clone list A by using the following syntax. Variable A references one list. Variable B
references a new copy or clone of the original list. Now, if you change A, B will not change. We
can get more info on lists, tuples, and many other objects in Python using the help command. Simply
pass in the list, tuple, or any other Python object. See the labs for more things you can do with
lists.
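The transcript does not show the cloning syntax, so the sketch below assumes one common way to do it, a full slice. The list values are the same illustrative ones used for aliasing.

```python
A = ["hard rock", 10, 1.2]
B = A[:]                      # full slice: a new list with the same elements

A[0] = "banana"
a_first = A[0]                # "banana"
b_first = B[0]                # still "hard rock"
different_objects = A is not B
```

The slice makes a shallow copy: the new list is independent, but any nested mutable elements would still be shared. Calling help(A) in an interactive session prints the documentation for the list type.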
Dictionaries
Let's cover Dictionaries in Python. Dictionaries are a type of collection in Python. If you recall, a
list has integer indexes. These are like addresses. A list also has elements. A dictionary has keys and
values. The keys are analogous to the indexes; they are like addresses, but they don't have to be
integers. They are usually strings. The values are similar to the elements in a list and contain
information.
To create a dictionary, we use curly brackets. The keys are the first elements. They must be
immutable and unique. Each key is followed by a value separated by a colon. The values can be
immutable, mutable, and duplicates. Each key and value pair is separated by a comma.
Consider the following example of a dictionary. The album title is the key and the value is the
release date. We can use yellow to highlight the keys and leave the values in white. It is helpful
to use the table to visualize a dictionary where the first column represents the keys, and the second
column represents the values. We can add a few more examples to the dictionary. We can also
assign the dictionary to a variable.
The key is used to look up the value. We use square brackets. The argument is the key. This outputs
the value. Using the key "Back in Black" returns the value 1980. The key "The Dark
Side Of The Moon" gives us the value 1973. Using the key "The bodyguard" gives us the
value 1992, and so on. We can add a new entry to the dictionary as follows. This will add the value
2007 with a new key called "Graduation". We can delete an entry as follows. This gets rid of the
key "Thriller" and its value.
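The dictionary operations above can be sketched as follows. The keys and years are the ones mentioned in the transcript, except for the release year paired with "Thriller", which is filled in here; the variable name release_year_dict is an assumption.

```python
# Album title -> release year.
release_year_dict = {
    "Thriller": 1982,
    "Back in Black": 1980,
    "The Dark Side Of The Moon": 1973,
    "The bodyguard": 1992,
}

# Look up a value by its key using square brackets.
back_in_black_year = release_year_dict["Back in Black"]   # 1980

# Add a new entry.
release_year_dict["Graduation"] = 2007

# Delete an entry: the key and its value are both removed.
del release_year_dict["Thriller"]
entry_count = len(release_year_dict)                      # 4
```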
We can verify if an element is in the dictionary using the "in" command as follows. The command
checks the keys. If the key is in the dictionary, the command returns a true. If we try the same
command with a key that is not in the dictionary, we get a false. If we try another key that is not in
the dictionary, we again get a false. In order to see all the keys in the dictionary, we can use the
method keys. The output is a list-like object with all the keys. In the same way, we can obtain the
values
using the method values. Check out the labs for more examples and info on dictionaries.
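Membership tests and the keys and values methods can be sketched like this. The dictionary name and the missing key "Starboy" are assumptions for illustration.

```python
release_year_dict = {
    "Back in Black": 1980,
    "The Dark Side Of The Moon": 1973,
    "The bodyguard": 1992,
}

# "in" checks the keys, not the values.
has_bodyguard = "The bodyguard" in release_year_dict   # True
has_starboy = "Starboy" in release_year_dict           # False
year_is_key = 1980 in release_year_dict                # False, 1980 is a value

# keys() and values() return list-like view objects;
# list() turns them into ordinary lists.
all_keys = list(release_year_dict.keys())
all_values = list(release_year_dict.values())
```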
Sets:
Let's cover sets. Sets are also a type of collection. This means that
like lists and tuples, you can input different Python types. Unlike lists and tuples, they are
unordered. This means sets do not record element position. Sets only have unique elements. This
means there is only one of a particular element in a set.
To define a set, you use curly brackets. You place the elements of a set within the curly brackets.
You may notice there are duplicate items. When the actual set is created, duplicate items will not be
present. You can convert a list to a set by using the function set; this is called typecasting. You
simply use the list as the input to the function set. The result will be a list converted to a set.
Let's go over an example. We start off with a list. We input the list to the function set. The function
set returns a set. Notice how there are no duplicate elements. Let's go over set operations. These
could be used to change the set. Consider the set A. Let's represent this set with a circle. If you are
familiar with sets, this could be part of a Venn diagram. A Venn diagram is a tool that uses shapes,
usually circles, to represent sets. We can add an item to a set using the add method. We just put the
set name followed by a dot, then the add method. The argument is the new element of the set we
would like to add, in this case, NSYNC. The set A now has NSYNC as an item. If we add the
same item twice, nothing will happen as there can be no duplicates in a set.
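Creating sets and adding items can be sketched as follows. The set name A and the item "NSYNC" come from the transcript; the other names and values are assumptions for illustration.

```python
# Duplicates in the literal disappear when the set is created.
set1 = {"pop", "rock", "soul", "hard rock", "rock", "R&B", "rock", "disco"}
unique_count = len(set1)                 # 6

# Converting a list to a set with set() also removes duplicates.
album_list = ["Michael Jackson", "Thriller", "Thriller", 1982]
album_set = set(album_list)
album_set_size = len(album_set)          # 3

# add() inserts a new element; adding it again changes nothing.
A = {"Thriller", "Back in Black", "AC/DC"}
A.add("NSYNC")
size_after_first_add = len(A)            # 4
A.add("NSYNC")
size_after_second_add = len(A)           # still 4
```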
Let's say we would like to remove NSYNC from set A. We can also remove an item from a set
using the remove method. We just put the set name followed by a dot, then the remove method.
The argument is the element of the set we would like to remove, in this case, NSYNC. After the
remove method is applied to the set, set A does not contain the item NSYNC. You can use this
method for any item in the set. We can verify if an element is in the set using the in command as
follows. The command checks whether the item, in this case AC/DC, is in the set. If the item is in the
set, it returns true. If we look for an item that is not in the set, in this case the item Who, then as
the item is not in the set, we will get a false. These are types of mathematical set operations. There
are lots of other useful mathematical operations we can do between
sets. Let's define the set album set one. We can represent it using a red circle in a Venn diagram.
Similarly, we can define the set album set two. We can also represent it using a blue circle or Venn
diagram. The intersection of two sets is a new set containing elements which are in both of those
sets. It's helpful to use Venn diagrams. The two circles that represent the sets overlap, and the overlap
represents the new set. As the overlap is made up of both the red circle and the blue circle, we define
the intersection in terms of and.
In Python, we use an ampersand to find the intersection of two sets. If we overlay the values of the set
over the circle, placing the common elements in the overlapping area, we see the correspondence.
After applying the intersection operation, all the items that are not in both sets disappear. In Python,
we simply place the ampersand between the two sets. We see that AC/DC and Back in
Black are in both sets. The result is a new set, album set three, containing only the elements that are
in both album set one and album set two.
The union of two sets is the new set of elements which contain all the items in both sets. We can
find the union of the sets album set one and album set two as follows. The result is a new set that
has all the elements of album set one and album set two. This new set is represented in green.
Consider the new album set, album set three. The set contains the elements AC/DC and Back in
Black. We can represent this with a Venn diagram. As all the elements in album set three are in
album set one, the circle representing album set one encapsulates the circle representing album
set three. We can check if a set is a subset using the issubset method. As album set three is a subset
of the album set one, the result is true. There is a lot more you can do with sets.
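The set operations above can be sketched as follows. The items "AC/DC", "Back in Black", "NSYNC", and "Who" come from the transcript; the remaining items and the variable names (album_set1 and so on, our spelling of the spoken "album set one") are assumptions.

```python
A = {"Thriller", "Back in Black", "AC/DC", "NSYNC"}
A.remove("NSYNC")                      # remove an item from the set
has_acdc = "AC/DC" in A                # True
has_who = "Who" in A                   # False

album_set1 = {"AC/DC", "Back in Black", "Thriller"}
album_set2 = {"AC/DC", "Back in Black", "The Dark Side of the Moon"}

# Intersection: only the elements that are in both sets.
album_set3 = album_set1 & album_set2

# Union: every element that appears in either set.
all_albums = album_set1.union(album_set2)

# Subset check: every element of album_set3 is also in album_set1.
is_subset = album_set3.issubset(album_set1)
```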
In this video, you will learn about conditions and branching. Comparison operations compare
some values or operands, then based on some condition, they produce a Boolean. Let's say we assign
the value six to a. We can use the equality operator, denoted with two equal signs, to determine if
two values are equal, in this case, whether a is equal to seven. As six is not equal to seven,
the result is false. If we performed an equality test for the value of six, the two values would be
equal. As a result, we would get a true. Consider the following greater-than comparison operator. If
the value of the left operand, in this case the variable i, is greater than the value of the right operand,
in this case five, the condition becomes true, or else we get a false. Let's display some values for i
on the left. Let's mark the values greater than five in green and the rest in red. If we set i equal to
six, we see that six is larger than five and as a result, we get a true. We can also apply the same
operations to floats. If we modify the operator as follows, then if the left operand i is greater than or
equal to the value of the right operand, in this case five, the condition becomes true. In this
case, we include the value of five in the number line and the color changes to green accordingly.
If we set the value of i equal to five, the operator will produce a true. If we set the value of i to two,
we would get a false because two is less than five. We can change the inequality: if the value of the
left operand, in this case i, is less than the value of the right operand, in this case six, then the
condition becomes true. Again, we can represent this with a colored number line. The areas where
the inequality is true are marked in green, and red where the inequality is false. If the value for i is
set to two, the result is a true, as two is less than six. The inequality test uses an exclamation mark
preceding the equal sign. If two operands are not equal, then the condition becomes true. We can
use a number line. When the condition is true, the corresponding numbers are marked in green, and
red for where the condition is false. If we set i equal to two, the operator is true, as two is not equal
to six. We can compare strings as well. Comparing ACDC and Michael Jackson using the equality
test, we get a false, as the strings are not the same. Using the inequality test, we get a true, as the
strings are different. See the labs for more examples. Branching allows us to run different
statements for a different input. It's helpful to think of an if statement as a locked room. If this
statement is true, you can enter the room and your program can run some predefined task. If the
statement is false, your program will skip the task. For example, consider the blue rectangle
representing an ACDC concert. If the individual is 18 or older, they can enter the ACDC concert.
If they are under the age of 18, they cannot enter the concert. An individual proceeds to the concert.
Their age is 17; therefore, they are not granted access to the concert and they must move on. If the
individual is 19, the condition is true. They can enter the concert then they can move on. This is
the syntax of the if statement from our previous example. We have the if statement. We have the
expression that can be true or false. The brackets are not necessary. We have a colon. Within an
indent, we have the expression that is run if the condition is true. The statements after the if
statement will run regardless if the condition is true or false. For the case where the age is 17, we
set the value of the variable age to 17. We check the if statement, the statement is false. Therefore
the program will not execute the statement to print, "you will enter". In this case, it will just print
"move on". For the case where the age is 19, we set the value of the variable age to 19. We check
the if statement. The statement is true. Therefore, the program will execute the statement to print
"you will enter". Then it will just print "move on". The else statement will run a different block of
code if the same condition is false. Let's use the ACDC concert analogy again. If the user is 17,
they cannot go to the ACDC concert but they can go to the Meat Loaf concert represented by the
purple square. If the individual is 19, the condition is true, they can enter the ACDC concert then
they can move on as before. The syntax of the else statement is similar. We simply append the
statement else. We then add the expression we would like to execute with an indent. For the case
where the age is 17, we set the value of the variable age to 17. We check the if statement, the
statement is false. Therefore, we progress to the else statement. We run the statement in the indent.
This corresponds to the individual attending the Meat Loaf concert. The program will then
continue running. For the case where the age is 19, we set the value of the variable age to 19. We
check the if statement, the statement is true. Therefore, the program will execute the statement to
print "you will enter". The program skips the expressions in the else statement and continues to
run the rest of the expressions. The elif statement, short for else if, allows us to check additional
conditions if the preceding condition is false. If the condition is true, the alternate expressions will
be run. Consider the concert example, if the individual is 18, they will go to the Pink Floyd concert
instead of attending the ACDC or Meat Loaf concerts. The person of 18 years of age enters the
area. As they are not over 18 years of age, they cannot see ACDC, but as they are 18, they attend
Pink Floyd. After seeing Pink Floyd, they move on. The syntax of the elif statement is similar.
We simply add the statement elif with the condition. We then add the expression we would like
to execute if the statement is true with an indent. Let's illustrate the code on the left. An 18 year
old enters. They are not older than 18 years of age. Therefore, the condition is false. So the
condition of the else if statement is checked. The condition is true. So then we would print "go see
Pink Floyd". Then we would move on as before. If the variable age was 17, the statement "go see
Meat Loaf" would print. Similarly, if the age was greater than 18, the statement "you can enter"
would print. Check the labs for more examples. Now let's take a look at logic operators. Logic
operations take Boolean values and produce different Boolean values. The first operation is the
not operator. If the input is true, the result is a false. Similarly, if the input is false, the result is a
true. Let A and B represent Boolean variables. The OR operator takes in the two values and
produces a new Boolean value. We can use this table to represent the different values. The first
column represents the possible values of A. The second column represents the possible values of
B. The final column represents the result of applying the OR operation. We see the OR operator
only produces a false if all the Boolean values are false. The following lines of code will print out
"This album was made in the 70's or 90's" if the variable album year does not fall in the 80s. Let's
see what happens when we set the album year to 1990. The colored number line is green when the
condition is true and red when the condition is false. In this case, the first condition is false, as 1990
does not fall before the 80s. Examining the second condition, we see that 1990 is greater than 1989.
So the second condition is true. We can
verify by examining the corresponding second number line. In the final number line, the green
region indicates where the area is true. This region corresponds to where at least one statement is
true. We see that 1990 falls in the area. Therefore, we execute the statement. Let A and B represent
Boolean variables. The AND operator takes in the two values and produces a new Boolean value.
We can use this table to represent the different values. The first column represents the possible
values of A. The second column represents the possible values of B. The final column represents
the result of applying the AND operation. We see the AND operator only produces a true if all the
Boolean values are true. The following lines of code will print out "This album was made in the
80's" if the variable album year is between 1980 and 1989. Let's see what happens when we set the
album year to 1983. As before, we can use the colored number line to examine where the condition
is true. In this case, 1983 is larger than 1980, so the condition is true. Examining the second
condition, we see that 1990 is greater than 1983. So this condition is also true. We can verify by
examining the corresponding second number line. In the final number line, the green region
indicates where the area is true. Similarly, this region corresponds to where both statements are
true. We see that 1983 falls in the area. Therefore, we execute the statement. Branching allows us
to run different statements for different inputs.
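The comparisons, branching, and logic operators covered in this section can be sketched together. The messages and threshold values follow the transcript where it states them; the function name concert_message, the variable names, and the exact form of each condition (for example age > 18) are assumptions, since the slides' code is not reproduced here.

```python
# Comparison operators produce Booleans.
a = 6
a_equals_7 = a == 7              # False
a_equals_6 = a == 6              # True
i = 6
greater = i > 5                  # True
i = 5
greater_or_equal = i >= 5        # True
i = 2
less = i < 6                     # True
not_equal = i != 6               # True
same_string = "ACDC" == "Michael Jackson"        # False
different_string = "ACDC" != "Michael Jackson"   # True

# Branching with if, elif, and else (hypothetical helper function).
def concert_message(age):
    if age > 18:
        return "you can enter"
    elif age == 18:
        return "go see Pink Floyd"
    else:
        return "go see Meat Loaf"

# Logic operators combine Boolean values.
album_year = 1990
made_in_70s_or_90s = album_year < 1980 or album_year > 1989   # True

album_year = 1983
made_in_80s = album_year > 1979 and album_year < 1990         # True

negated = not True                                             # False
```

Wrapping the branches in a function is only a convenience for trying several ages; the same if, elif, and else lines work at the top level of a script.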
Loops
In this video we will cover loops, in particular for loops and while loops. We will use many visual
examples in this video. See the labs for examples with data. Before we talk about loops, let's go
over the range function. The range function outputs an ordered sequence. If the input is
a positive integer, the output is a sequence. The sequence contains the same number of elements
as the input but starts at zero. For example, if the input is three, the output is the sequence zero, one,
two. If the range function has two inputs, where the first input is smaller than the second input, the
output is a sequence that starts at the first input. Then the sequence iterates up to but not including
the second number. For the inputs 10 and 15, we get the following sequence. See the labs for more
capabilities of the range function. Please note that if you use Python 3, the range function will not
generate a list explicitly like in Python 2. In this section, we will cover for loops. We will focus
on lists, but many of the procedures can be used on tuples. Loops perform a task over and over.
Consider the group of colored squares. Let's say we would like to replace each colored square with
a white square. Let's give each square a number to make things a little easier and refer to all the
group of squares as squares. If we wanted to tell someone to replace squares zero with a white
square we would say equals replace square zero with a white square or we can say four squares
zero in squares square zero equals white square. Similarly for the next square we can say for square
one in squares, square one equals white square. For the next square we can say for square two in
squares, square two equals white square. We repeat the process for each square. The only thing
that changes is the index of the square we are referring to. If we're going to perform a similar task
in Python we cannot use actual squares. So let's use a list to represent the boxes. Each element in
the list is a string representing the color. We want to change the name of the color in each element
to white. Each element in the list has the following index. This is the syntax to perform a loop in
Python. Notice the indent. The range function generates a sequence of indexes. The code will simply
repeat
everything in the indent five times. If you were to change the value to six it would do it 6 times.
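A minimal sketch of the range function and the loop described here is shown below. The list name squares follows the transcript; the colors are assumptions for illustration.

```python
# range() produces a sequence; list() makes its contents visible.
first_three = list(range(3))           # [0, 1, 2]
ten_to_fourteen = list(range(10, 15))  # [10, 11, 12, 13, 14]

# Replace every colored square with a white one, index by index.
squares = ["red", "yellow", "green", "purple", "blue"]
for i in range(0, 5):
    squares[i] = "white"               # runs once for i = 0, 1, 2, 3, 4
```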
However, the value of i is incremented by one each time. In this segment we change the i-th element
of the list to the string white. The value of i is set to zero. Each iteration of the loop starts at the
beginning of the indent. We then run everything in the indent. The first element in the list is set to
white. We then go to the start of the indent and progress down each line. When we reach the line
to change the value of the list, we set the value of index one to white. The value of i increases by
one. We repeat the process for index two. The process continues for the next index, until we've
reached the final element. We can also iterate through a list or tuple directly in Python; we do not
even need to use indices. Here is the list squares. In each iteration of the loop, we pass one element of
the list squares to the variable square. Let's display the value of the variable square in this section.
For the first iteration, the value of square is red. We then start the second iteration. For the second
iteration, the value of square is yellow. We then start the third iteration. For the final iteration, the
value of square is green. A useful function for iterating data is enumerate. It can be used to obtain
the index and the element in the list. Let's use the box analogy with the numbers representing the
index of each square. This is the syntax to iterate through a list and provide the index of each
element. We use the list squares and use the names of the colors to represent the colored squares.
The argument of the function enumerate is the list, in this case squares. The variable i is the index
and the variable square is the corresponding element in the list. Let's use the left part of the screen
to display the different values of the variables square and i for the various iterations of the loop.
For the first iteration, the value of the variable square is red, corresponding to the zeroth index, and
the value for i is zero. For the second iteration, the value of the variable square is yellow and the value
of i corresponds to its index, i.e., 1. We repeat the process for the last index. While loops are
similar to for loops, but instead of executing a statement a set number of times, a while loop will
only run while a condition is met. Let's say we would like to copy all the orange squares from the list
squares to the list new squares. But we would like to stop if we encounter a non-orange square.
We don't know the value of the squares beforehand. We would simply continue the process while
the square is orange, or check if the square equals orange. If not, we would stop. For the first example,
we would check if the square was orange. It satisfies the condition, so we would copy the square.
We repeat the process for the second square. The condition is met, so we copy the square. In the
next iteration, we encounter a purple square. The condition is not met, so we stop the process. This
is essentially what a while loop does. Let's use the figure on the left to represent the code. We will
use a list with the names of the colors to represent the different squares. We create an empty list,
new squares. In reality the list is of indeterminate size. We start the index at zero. The while
statement will repeatedly execute the statements within the indent until the condition inside the
bracket is false. We append the value of the first element of the list squares to the list, new squares.
We increase the value of i by one. We append the value of the second element of the list squares
to the list new squares. We increment the value of i. Now the value in the list squares is purple;
therefore, the condition for the while statement is false and we exit the loop. Check out the labs for
more examples of loops, many with real data.
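Direct iteration, enumerate, and the while loop described above can be sketched as follows. The colors red, yellow, green, orange, and purple come from the transcript; the remaining values and the names visited, indexed, and new_squares are assumptions. The length check in the while condition is an addition for safety, so the loop also stops if every square is orange.

```python
# Iterate over the elements directly, without indexes.
squares = ["red", "yellow", "green"]
visited = []
for square in squares:
    visited.append(square)          # square is "red", then "yellow", then "green"

# enumerate() yields the index together with the element.
indexed = []
for i, square in enumerate(squares):
    indexed.append((i, square))     # (0, "red"), (1, "yellow"), (2, "green")

# Copy orange squares until a non-orange square is encountered.
squares = ["orange", "orange", "purple", "orange", "blue"]
new_squares = []
i = 0
while i < len(squares) and squares[i] == "orange":
    new_squares.append(squares[i])
    i += 1                          # move on to the next square
```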