Week 6: Python and Excel, pandas read and write
Dr. Li Ruozhu
City University of Hong Kong
ruozhuli@cityu.edu.hk
Data Type
• Variables can store data of different types, and different types can do
different things.
• Python has the following data types built-in by default, in these categories:
• 9 of them are the most common, and they're the ones we need to remember and use
in this course.
Data Type
• 9 most commonly used data types:
• You can evaluate any expression in Python, and get one of two answers, True
or False.
• When you compare two values, the expression is evaluated and Python
returns the Boolean answer:
• List items are indexed, the first item has index [0], the second
item has index [1] etc.
• Tuple items are indexed, the first item has index [0], the second item has
index [1] etc.
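A minimal sketch of the three points above (Boolean comparisons, list indexing, tuple indexing). The variable names and values are illustrative only and do not come from the slides.

```python
# Comparing two values evaluates to a Boolean answer.
print(10 > 9)    # True
print(10 == 9)   # False
print(10 < 9)    # False

# Lists and tuples are both indexed, starting from 0.
fruits_list = ["apple", "banana", "cherry"]
fruits_tuple = ("apple", "banana", "cherry")

first_in_list = fruits_list[0]      # first item has index [0]
second_in_tuple = fruits_tuple[1]   # second item has index [1]

print(first_in_list)     # apple
print(second_in_tuple)   # banana
```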
Data Type for a set of values
Set
• Sets are used to store multiple items in a single variable.
• A set is a collection which is unordered and unindexed.
• Individual set items cannot be changed in place, but you can remove
items and add new items.
• Sets are written with curly brackets.
• fruits = {"apple", "banana", "cherry"}
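The set behaviour described above can be tried with a short sketch like the one below. The variable is called `fruits` here (an illustrative choice) so that it does not hide Python's built-in `set` type.

```python
# A set is written with curly brackets.
fruits = {"apple", "banana", "cherry"}

# Items cannot be edited in place, but they can be added and removed.
fruits.add("orange")
fruits.remove("banana")

# Membership tests work, and len() counts the items.
print("apple" in fruits)   # True
print(len(fruits))         # 3

# Sets are unindexed: fruits[0] would raise a TypeError.
```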
Data Type for a set of values
Dictionary
• Dictionaries are used to store data values in key:value pairs.
• At this point, for instance, you can explore "collection data types" further if you
want to. You can read information available online, such as this link:
• https://medium.com/analytics-vidhya/collection-data-types-in-python-3a3f9c0b554
• Additionally, you can search the web and engage with AI to gain further insights.
• Ultimately, I hope you'll gradually build the awareness and ability to self-learn, which will benefit
you throughout your life.
• After all, one of the key goals of this course is to foster your ability to be lifelong learners in the
information age.
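As a starting point for that exploration, here is a minimal sketch of the key:value idea from the Dictionary slide. The keys and values (`name`, `university`, `score1`, `major`) are hypothetical and chosen only for illustration.

```python
# A dictionary stores data values in key:value pairs.
student = {"name": "Amy", "university": "X", "score1": 90}

# Look a value up by its key.
print(student["name"])   # Amy

# Change an existing value, then add a new key:value pair.
student["score1"] = 95
student["major"] = "Finance"

print(list(student.keys()))   # ['name', 'university', 'score1', 'major']
```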
Data Type
• Variables can store data of different types, and different types can do
different things.
• Python has the following data types built-in by default, in these categories:
• 9 of them are the most common, and they're the ones we ask you to remember and
use in this course.
More about range
range parameters
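• The range() function can be called in three different ways, or you can
think of them as three sets of range parameters:
• range(stop_value) : This generates the sequence from 0 up to, but not including, the
stop value.
• range(start_value, stop_value) : This generates the sequence from the start value up to,
but not including, the stop value.
• range(start_value, stop_value, step_size) : This generates the sequence from the start
value towards the stop value, moving by the step size each time.
• range() only works when every specified value is an integer (a whole number). It does not
support the float data type or the string data type. However, you can pass both positive
and negative integer values to it.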
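A short sketch of the three ways of calling range(), including the one-argument and three-argument forms from standard Python. The numbers are arbitrary; list() is used only to make the generated values visible.

```python
# One parameter: stop value only (counting starts at 0).
only_stop = list(range(5))
print(only_stop)          # [0, 1, 2, 3, 4]

# Two parameters: start value and stop value.
start_stop = list(range(2, 6))
print(start_stop)         # [2, 3, 4, 5]

# Three parameters: start value, stop value and step size.
with_step = list(range(0, 10, 3))
print(with_step)          # [0, 3, 6, 9]

# Negative integers are accepted.
negative = list(range(-3, 2))
print(negative)           # [-3, -2, -1, 0, 1]

# Floats are not accepted: this raises a TypeError.
try:
    range(1.5)
except TypeError:
    print("range() does not accept a float")
```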
How to use parameters of range
• Specify both the start and the stop value
• The step size can be negative (when start value > stop value)
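To illustrate the negative step size, here is a small sketch with arbitrary numbers. It also shows what happens when the step is negative but the start value is not larger than the stop value.

```python
# Start value > stop value with a negative step: count downwards.
countdown = list(range(10, 0, -2))
print(countdown)   # [10, 8, 6, 4, 2]

# Negative step but start value < stop value: nothing is generated.
nothing = list(range(0, 10, -1))
print(nothing)     # []
```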
Play with range (will appear in the test)
• The len() function returns the number of items in an object. The object
must be a sequence or a collection.
• When the object is a string, the len() function returns the number of
characters in the string.
• Please also try other data types: list, tuple, dict, range.
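One way to try len() on the data types listed above is sketched below. The example values are arbitrary.

```python
# len() on a string counts characters.
string_length = len("hello")
print(string_length)    # 5

# len() on collections counts items.
list_length = len([1, 2, 3])
tuple_length = len(("a", "b"))
dict_length = len({"x": 1, "y": 2})   # counts key:value pairs
range_length = len(range(3, 8))       # items 3, 4, 5, 6, 7

print(list_length, tuple_length, dict_length, range_length)   # 3 2 2 5
```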
Play with range
• Let's finish our study of range.
• Accessing range() with an index value
• A range is indexed.
• For example:
• range(3, 8)
• Items are: 3, 4, 5, 6, 7
• Their indexes: [0] [1] [2] [3] [4]
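The index table above can be checked directly in Python. This sketch uses the same range(3, 8); the variable name `numbers` is illustrative.

```python
numbers = range(3, 8)   # items: 3, 4, 5, 6, 7

first_item = numbers[0]    # index [0]
last_item = numbers[4]     # index [4]
also_last = numbers[-1]    # negative indexes count from the end

print(first_item, last_item, also_last)   # 3 7 7

# There is no index [5] in this range, so numbers[5] raises an IndexError.
try:
    numbers[5]
except IndexError:
    print("index [5] is out of range")
```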
Be careful: pip is itself a piece of software, so please keep it up to date. If this prompt appears, copy the green text,
paste it into the terminal window, and press the Enter key to run it:
If done:
How to start to use a third party package and its modules?
• You will get a new Excel file in the same folder, and the first sheet
contains the content of df.
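A minimal sketch of writing a DataFrame to an Excel file, assuming pandas and the openpyxl package are installed. The file name `output.xlsx`, the column names, and the data are hypothetical; `index=False` is an optional choice made here to leave out the row numbers.

```python
import pandas as pd

# A small DataFrame named df, with made-up example data.
df = pd.DataFrame({"name": ["Amy", "Ben"], "score1": [90, 78]})

# Write df to a new Excel file in the current folder.
# index=False means the row index (0, 1, ...) is not written as a column.
df.to_excel("output.xlsx", index=False)

# Read the file back to confirm what was written.
check = pd.read_excel("output.xlsx")
print(check)
```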
pandas: write
• You can specify the name of the sheet
• You will get a new Excel file in the same folder, and the content of
the first sheet is df, the name of this sheet is “try”.
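Naming the sheet can be sketched with the `sheet_name` parameter of `to_excel`. As before, the file name and data are hypothetical, and pandas plus openpyxl are assumed to be installed.

```python
import pandas as pd

# Made-up example data.
df = pd.DataFrame({"name": ["Amy", "Ben"], "score1": [90, 78]})

# Write df to a sheet called "try" instead of the default sheet name.
df.to_excel("output_try.xlsx", sheet_name="try", index=False)

# List the sheet names in the new file to confirm.
with pd.ExcelFile("output_try.xlsx") as workbook:
    sheet_names = workbook.sheet_names
print(sheet_names)   # ['try']
```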
pandas: write
• You can write a DataFrame with no data (only the columns defined in
the first line) to the Excel file
• In this way, you can create a new Excel file with new sheets.
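One possible way to create a new Excel file with several empty sheets is sketched below. It uses `pd.ExcelWriter`, which the slides may or may not have used; the file name, sheet names, and column names are hypothetical.

```python
import pandas as pd

# A DataFrame with column names but no rows.
empty_df = pd.DataFrame(columns=["name", "university", "score1"])

# ExcelWriter lets one file hold more than one sheet.
with pd.ExcelWriter("empty_template.xlsx") as writer:
    empty_df.to_excel(writer, sheet_name="first", index=False)
    empty_df.to_excel(writer, sheet_name="second", index=False)

# Confirm the sheets exist and contain only the header line.
with pd.ExcelFile("empty_template.xlsx") as workbook:
    template_sheets = workbook.sheet_names
first_sheet = pd.read_excel("empty_template.xlsx", sheet_name="first")
print(template_sheets)
print(list(first_sheet.columns), len(first_sheet))
```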
pandas: filter
• You can filter the data based on a certain condition. For example:
• You can get all the students who come from Y university, and put their
information into a new DataFrame.
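The filtering idea can be sketched with a Boolean condition inside square brackets. The column name `university` and the sample rows are assumptions made for illustration; the real file may use different names.

```python
import pandas as pd

# Made-up student data standing in for the real file.
df = pd.DataFrame({
    "name": ["Amy", "Ben", "Cat", "Dan"],
    "university": ["X", "Y", "Y", "Z"],
    "score1": [90, 78, 88, 92],
})

# The condition produces True/False for each row;
# only the rows marked True are kept in the new DataFrame.
y_students = df[df["university"] == "Y"]
print(y_students)
```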
Task
• In the student score file (download it from Canvas), please find all the
students whose score1 is larger than 85, and then put their
information into a new Excel file (named “high score1”)
Task
• In the student score file, please find all the students whose score1 is
larger than 85, and then put their information into a new Excel
file (named “high score1”)
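One possible approach to this task is sketched below as a read, filter, write sequence. Because the real Canvas file is not available here, the sketch first creates a small stand-in file called `sample student score.xlsx`; that file name, the column names, and the data are all hypothetical. With the real file, only the `pd.read_excel` path would change.

```python
import pandas as pd

# Stand-in for the downloaded file (made-up data).
sample = pd.DataFrame({
    "name": ["Amy", "Ben", "Cat", "Dan", "Eve"],
    "score1": [90, 78, 88, 92, 85],
})
sample.to_excel("sample student score.xlsx", index=False)

# Step 1: read the Excel file into a DataFrame.
scores = pd.read_excel("sample student score.xlsx")

# Step 2: keep only the rows where score1 is larger than 85.
high_score1 = scores[scores["score1"] > 85]

# Step 3: write the result to a new Excel file.
high_score1.to_excel("high score1.xlsx", index=False)
print(high_score1)
```

Note that a score of exactly 85 is not included, because the condition is "larger than 85".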
Task:
• You have a student score file. Please split this file into multiple
files by university name so that each file contains information about
students from only one university:
• please find all the students who come from X university, and put their
information into a new Excel file named “X university students.xlsx”;
• please find all the students who come from Y university, and put their
information into a new Excel file named “Y university students.xlsx”;
• please find all the students who come from Z university, and put their
information into a new Excel file named “Z university students.xlsx”.
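A hedged sketch of one way to do the split: repeat the same filter in a loop, once per university. The column name `university` and the sample rows are assumptions; in practice the DataFrame would come from `pd.read_excel` on the real file.

```python
import pandas as pd

# Made-up data standing in for the student score file.
students = pd.DataFrame({
    "name": ["Amy", "Ben", "Cat", "Dan", "Eve"],
    "university": ["X", "Y", "Z", "Y", "X"],
    "score1": [90, 78, 88, 92, 81],
})

written_files = []
for university in ["X", "Y", "Z"]:
    # Keep only the students from this university.
    one_university = students[students["university"] == university]
    # Build the file name and write the filtered rows.
    file_name = f"{university} university students.xlsx"
    one_university.to_excel(file_name, index=False)
    written_files.append(file_name)

print(written_files)
```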
Task:
• You have a student score file. Please split this file into multiple
files by major so that each file contains information of students
from only one major.
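When the list of majors is not known in advance, `groupby` is one option: it hands back each group of rows together with its key. This is a sketch of an alternative technique, not necessarily the method intended in the course; the column name `major`, the file naming pattern, and the data are hypothetical.

```python
import pandas as pd

# Made-up data standing in for the student score file.
students = pd.DataFrame({
    "name": ["Amy", "Ben", "Cat", "Dan"],
    "major": ["Finance", "Marketing", "Finance", "Accounting"],
    "score1": [90, 78, 88, 92],
})

rows_per_major = {}
# groupby yields (major, rows for that major) pairs, one per distinct major.
for major, group in students.groupby("major"):
    group.to_excel(f"{major} students.xlsx", index=False)
    rows_per_major[major] = len(group)

print(rows_per_major)
```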
Lifelong study
• Want to learn more pandas skills on your own?
• Learning to code can be a lifelong study, so in this course you are also
encouraged to build self-learning skills to support your sustainable lifelong
learning ambitions.