Course Pack - Programming For Data Science

The document provides information about a Python programming course, including: 1) The course aims to teach Python programming fundamentals like variables, data types, control structures, and functions. Students will learn to write clear and efficient Python code. 2) Students will explore Python features like data structures, file input/output, object-oriented programming, and debugging techniques. They will learn to break down problems and develop modular, reusable code. 3) The course objectives are for students to understand Python syntax, develop problem-solving skills, and gain hands-on experience through programming exercises and projects.

Uploaded by

nitish.patil
Copyright
© All Rights Reserved

School of Data Science Programming for Data Science

Study Material

Bachelor in
Data Science

Subject
Programming for
Data Science

Faculty
Nitish Patil

School of Data Science


Asian School of Media Studies


A. COURSE DESCRIPTION

The Python Programming course is a comprehensive and hands-on program designed to
introduce students to the fundamental concepts, syntax, and applications of the Python
programming language. This course is suitable for beginners with no prior programming
experience as well as individuals looking to expand their programming skills.
The course begins by providing students with a solid foundation in programming principles
and logic. Students will learn the basics of variables, data types, control structures, and
functions, gaining a deep understanding of how to write clear and efficient Python code.
Next, students will explore the powerful features and capabilities of the Python language.
They will learn about data structures such as lists, tuples, dictionaries, and sets, and how to
manipulate and iterate over them. Students will also delve into file input/output operations,
exception handling, and the concept of object-oriented programming in Python. The course
covers essential topics such as modular programming, code reusability, and debugging
techniques. Students will learn how to break down complex problems into smaller,
manageable tasks, and develop modular and reusable code using functions and classes. They
will also gain proficiency in identifying and resolving errors and bugs through effective
debugging strategies.

B. LEARNING OBJECTIVES

Students will be able to:

● Understand the basic syntax and structure of the Python programming language.

● Learn fundamental programming concepts such as variables, data types, control flow, and
functions.

● Develop problem-solving skills through the implementation of algorithms and logical
thinking.

● Gain hands-on experience by working on real-world programming exercises and projects.

● Acquire a solid foundation in Python programming that can be built upon for more
advanced topics.

C. LEARNING OUTCOMES

At the end of this course, participants will be able to:

● Recall and explain Python syntax, built-in functions, and standard library modules.

● Demonstrate a deep understanding of Python programming concepts and
principles.

● Practice Python programming techniques to solve real-world problems and automate tasks.

● Interpret and debug Python code, identify errors, and propose appropriate solutions.

● Evaluate the efficiency and effectiveness of Python programs, identify areas for
improvement, and suggest optimizations.

● Design and develop Python programs and applications that meet specific requirements,
demonstrating creativity and problem-solving skills.

Projects
● Writing a Python program to calculate and display the Fibonacci sequence.

● Developing a simple web scraping script using Python's BeautifulSoup library.

● Creating a command-line tool that performs file manipulation tasks, such as renaming and
organizing files.

● Implementing a basic chatbot using natural language processing libraries in Python.

● Building a data visualization application using Python's Matplotlib or Plotly library.

D. LEARNING RESOURCE MATERIAL


Online References:

● "Python.org" - Official website of the Python programming language. Available at:
www.python.org

● "W3Schools Python Tutorial" - Comprehensive tutorials and examples for learning
Python. Available at: www.w3schools.com/python

● "Real Python" - Online tutorials, articles, and resources for Python programming.
Available at: realpython.com

● "Python Crash Course" by Eric Matthes. Available at:
www.nostarch.com/pythoncrashcourse

Suggested Readings:

● "Python Crash Course" by Eric Matthes

● "Learn Python the Hard Way" by Zed A. Shaw

● "Python Programming: An Introduction to Computer Science" by John Zelle

● "Automate the Boring Stuff with Python" by Al Sweigart

● "Python Cookbook" by David Beazley and Brian K. Jones

● "Fluent Python" by Luciano Ramalho


Unit-1: Introduction to Python Programming


What is Python?

Python is a popular programming language. It was created by Guido van Rossum, and released in
1991.

It is used for:

● Web development (server-side)
● Software development
● Mathematics
● System scripting

What can Python do?

● Python can be used on a server to create web applications.
● Python can be used alongside software to create workflows.
● Python can connect to database systems. It can also read and modify files.
● Python can be used to handle big data and perform complex mathematics.
● Python can be used for rapid prototyping, or for production-ready software development.

Why Python?

● Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.).
● Python has a simple syntax similar to the English language.
● Python has syntax that allows developers to write programs with fewer lines than some
other programming languages.
● Python runs on an interpreter system, meaning that code can be executed as soon as it is
written. This means that prototyping can be very quick.
● Python can be treated in a procedural way, an object-oriented way or a functional way.

Variables

Variables are containers for storing data values.

Creating Variables

Python has no command for declaring a variable.

A variable is created the moment you first assign a value to it.


Example
x = 5
y = "John"
print(x)
print(y)

Variables do not need to be declared with any particular type, and can even change type after they
have been set.

Example
x = 4        # x is of type int
x = "Sally"  # x is now of type str
print(x)

Casting

If you want to specify the data type of a variable, this can be done with casting.

Example
x = str(3) # x will be '3'
y = int(3) # y will be 3
z = float(3) # z will be 3.0
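
Casting is also the usual way to turn text into numbers. The short sketch below is illustrative only (the variable names are not from the course material). It shows a few common conversions and that a conversion which cannot succeed raises a ValueError:

```python
# Convert between types with the built-in constructors.
count = int("42")      # str -> int
price = float("3.5")   # str -> float
label = str(42)        # int -> str
whole = int(3.9)       # float -> int truncates toward zero

# A string that does not look like an integer cannot be cast to int.
try:
    int("forty-two")
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(count, price, label, whole, conversion_failed)
```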

Get the Type

You can get the data type of a variable with the type() function.

Example
x = 5
y = "John"
print(type(x))
print(type(y))

Single or Double Quotes?


String variables can be declared either by using single or double quotes:

Example
x = "John"
# is the same as
x = 'John'

Case-Sensitive

Variable names are case-sensitive.

Example

This will create two variables:

a = 4
A = "Sally"  # A will not overwrite a

Python Operators
Operators are used to perform operations on variables and values.

In the example below, we use the + operator to add together two values:

Example
print(10 + 5)

Python divides the operators in the following groups:

● Arithmetic operators
● Assignment operators
● Comparison operators
● Logical operators
● Identity operators
● Membership operators
● Bitwise operators

Python Arithmetic Operators


Arithmetic operators are used with numeric values to perform common mathematical operations:

Operator    Name
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulus
**          Exponentiation
//          Floor division
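
As a quick illustration of the operators listed above, the following sketch applies each one to the same pair of numbers. The values 7 and 2 and the variable names are chosen here only for demonstration:

```python
a, b = 7, 2

total = a + b        # 9
difference = a - b   # 5
product = a * b      # 14
quotient = a / b     # 3.5 (division always returns a float)
remainder = a % b    # 1
power = a ** b       # 49
floored = a // b     # 3 (result is rounded down)

# Floor division rounds toward negative infinity, not toward zero.
floored_negative = -7 // 2   # -4

print(total, difference, product, quotient, remainder, power, floored, floored_negative)
```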

Python Assignment Operators

Assignment operators are used to assign values to variables:

Operator    Example
=           x = 5
+=          x += 3
-=          x -= 3
*=          x *= 3
/=          x /= 3
%=          x %= 3
//=         x //= 3
**=         x **= 3
&=          x &= 3
|=          x |= 3
^=          x ^= 3
>>=         x >>= 3
<<=         x <<= 3

Python Comparison Operators

Comparison operators are used to compare two values:

Operator    Name
==          Equal
!=          Not equal
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to

Python Logical Operators

Logical operators are used to combine conditional statements:

Operator    Description
and         Returns True if both statements are true
or          Returns True if at least one of the statements is true
not         Reverses the result; returns False if the result is true

Python Identity Operators

Identity operators are used to compare the objects, not if they are equal, but if they are actually the
same object, with the same memory location:

Operator    Description
is          Returns True if both variables are the same object
is not      Returns True if both variables are not the same object
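
The difference between equality (==) and identity (is) is easiest to see with two lists that hold the same items. This is a minimal sketch with illustrative variable names:

```python
first = ["apple", "banana"]
second = ["apple", "banana"]   # equal contents, but a separate object
alias = first                  # a second name for the same object

same_value = first == second      # True: the contents are equal
same_object = first is second     # False: two different objects
alias_is_first = alias is first   # True: both names refer to one object
differs = first is not second     # True

print(same_value, same_object, alias_is_first, differs)
```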

Python Membership Operators

Membership operators are used to test whether a value is present in a sequence or other collection:

Operator    Description
in          Returns True if the specified value is present in the object
not in      Returns True if the specified value is not present in the object
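
The sketch below tries the membership operators on a list, a string and a dictionary. The sample data is made up for illustration. Note that for a dictionary, in tests the keys, not the values:

```python
fruits = ["apple", "banana", "cherry"]
has_banana = "banana" in fruits     # True
no_mango = "mango" not in fruits    # True

# For strings, "in" checks for a substring.
has_substring = "nan" in "banana"   # True

# For dictionaries, "in" checks the keys only.
prices = {"apple": 3, "banana": 1}
key_found = "apple" in prices       # True
value_found = 3 in prices           # False, because 3 is a value and not a key

print(has_banana, no_mango, has_substring, key_found, value_found)
```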

Python Bitwise Operators

Bitwise operators are used to operate on the individual bits of (binary) numbers:

Operator    Name                    Description
&           AND                     Sets each bit to 1 if both bits are 1
|           OR                      Sets each bit to 1 if at least one of the two bits is 1
^           XOR                     Sets each bit to 1 if only one of the two bits is 1
~           NOT                     Inverts all the bits
<<          Zero fill left shift    Shifts left by pushing zeros in from the right (Python integers
                                    are unbounded, so no leftmost bits are lost)
>>          Signed right shift      Shifts right by pushing copies of the leftmost bit in from the
                                    left; the rightmost bits fall off
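
To make the bitwise operators concrete, the sketch below applies them to two small numbers written in binary notation. The values are arbitrary and chosen only so that the bit patterns are easy to follow:

```python
a = 0b1100   # 12
b = 0b1010   # 10

and_result = a & b   # 0b1000 = 8
or_result = a | b    # 0b1110 = 14
xor_result = a ^ b   # 0b0110 = 6
not_result = ~a      # -13, because ~x equals -x - 1 for Python integers
left = a << 2        # 0b110000 = 48
right = a >> 2       # 0b11 = 3

print(bin(and_result), bin(or_result), bin(xor_result), not_result, left, right)
```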

Operator Precedence

Operator precedence describes the order in which operations are performed.

Example

Parentheses have the highest precedence, meaning that expressions inside parentheses must be
evaluated first:

print((6 + 3) - (6 + 3))

Example

Multiplication * has higher precedence than addition +, and therefore multiplications are evaluated
before additions:

print(100 + 5 * 3)

The precedence order is described in the table below, starting with the highest precedence at the
top:

Operator                                         Description
()                                               Parentheses
**                                               Exponentiation
+x  -x  ~x                                       Unary plus, unary minus, and bitwise NOT
*  /  //  %                                      Multiplication, division, floor division, and modulus
+  -                                             Addition and subtraction
<<  >>                                           Bitwise left and right shifts
&                                                Bitwise AND
^                                                Bitwise XOR
|                                                Bitwise OR
==  !=  >  >=  <  <=  is  is not  in  not in     Comparisons, identity, and membership operators
not                                              Logical NOT
and                                              AND
or                                               OR
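
A few expressions can help confirm how the precedence table is applied. The sketch below is illustrative; the expressions are not from the course material:

```python
r1 = 2 + 3 * 4     # 14: multiplication before addition
r2 = (2 + 3) * 4   # 20: parentheses first
r3 = -2 ** 2       # -4: ** binds tighter than unary minus
r4 = 2 ** 3 ** 2   # 512: ** groups right to left, so this is 2 ** 9
r5 = not 1 == 2    # True: the comparison runs before not
r6 = True or False and False   # True: and is evaluated before or

print(r1, r2, r3, r4, r5, r6)
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.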

Built-in Data Types

In programming, data type is an important concept.

Variables can store data of different types, and different types can do different things.

Python has the following data types built-in by default, in these categories:

Text Type: str

Numeric Types: int, float, complex

Sequence Types: list, tuple, range

Mapping Type: dict

Set Types: set, frozenset

Boolean Type: bool

Binary Types: bytes, bytearray, memoryview

None Type: NoneType

Python Numbers


There are three numeric types in Python:

● int
● float
● complex

Variables of numeric types are created when you assign a value to them:

Example
x = 1 # int
y = 2.8 # float
z = 1j # complex

Python Lists

mylist = ["apple", "banana", "cherry"]

List

Lists are used to store multiple items in a single variable.

Lists are one of 4 built-in data types in Python used to store collections of data, the other 3
are Tuple, Set, and Dictionary, all with different qualities and usage.

Lists are created using square brackets:

Example

Create a List:

thislist = ["apple", "banana", "cherry"]
print(thislist)

List Items

List items are ordered, changeable, and allow duplicate values.

List items are indexed, the first item has index [0], the second item has index [1] etc.

Ordered


When we say that lists are ordered, it means that the items have a defined order, and that order will
not change.

If you add new items to a list, the new items will be placed at the end of the list.

Changeable

The list is changeable, meaning that we can change, add, and remove items in a list after it has been
created.

Allow Duplicates

Since lists are indexed, lists can have items with the same value:

Example

Lists allow duplicate values:

thislist = ["apple", "banana", "cherry", "apple", "cherry"]
print(thislist)

List Length

To determine how many items a list has, use the len() function:

Example

Print the number of items in the list:

thislist = ["apple", "banana", "cherry"]
print(len(thislist))

List Items - Data Types

List items can be of any data type:


Example

String, int and boolean data types:

list1 = ["apple", "banana", "cherry"]
list2 = [1, 5, 7, 9, 3]
list3 = [True, False, False]

A list can contain different data types:

Example

A list with strings, integers and boolean values:

list1 = ["abc", 34, True, 40, "male"]

type()

From Python's perspective, lists are defined as objects with the data type 'list':

<class 'list'>

Example

What is the data type of a list?

mylist = ["apple", "banana", "cherry"]
print(type(mylist))

The list() Constructor

It is also possible to use the list() constructor when creating a new list.

Example

Using the list() constructor to make a List:

thislist = list(("apple", "banana", "cherry"))
print(thislist)

Python Collections (Arrays)

There are four collection data types in the Python programming language:

● List is a collection which is ordered and changeable. Allows duplicate members.
● Tuple is a collection which is ordered and unchangeable. Allows duplicate members.
● Set is a collection which is unordered and unindexed. Existing items cannot be changed in
place, but items can be added and removed. No duplicate members.
● Dictionary is a collection which is ordered (as of Python 3.7) and changeable. No duplicate
keys.

When choosing a collection type, it is useful to understand the properties of that type. Choosing the
right type for a particular data set can preserve its meaning, and it can also improve efficiency or
security.
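
The properties of the four collection types can be compared side by side. The following is a minimal sketch with made-up fruit data; the try/except block is used here only to show that changing a tuple raises a TypeError:

```python
fruits_list = ["apple", "banana", "apple"]    # ordered, changeable, duplicates kept
fruits_tuple = ("apple", "banana", "apple")   # ordered, unchangeable, duplicates kept
fruits_set = {"apple", "banana", "apple"}     # unordered, the duplicate is dropped
fruit_prices = {"apple": 3, "banana": 1}      # key: value pairs, keys are unique

# Lists, sets and dictionaries can grow after creation.
fruits_list.append("cherry")
fruits_set.add("cherry")
fruit_prices["cherry"] = 5

# Tuples cannot be changed after creation.
try:
    fruits_tuple[0] = "kiwi"
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(len(fruits_list), len(fruits_tuple), len(fruits_set), len(fruit_prices), tuple_changed)
```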

Python Tuples

mytuple = ("apple", "banana", "cherry")

Tuple

Tuples are used to store multiple items in a single variable.

Tuple is one of 4 built-in data types in Python used to store collections of data, the other 3
are List, Set, and Dictionary, all with different qualities and usage.

A tuple is a collection which is ordered and unchangeable.

Tuples are written with round brackets.

Example

Create a Tuple:

thistuple = ("apple", "banana", "cherry")
print(thistuple)

Tuple Items

Tuple items are ordered, unchangeable, and allow duplicate values.

Tuple items are indexed, the first item has index [0], the second item has index [1] etc.

Ordered

When we say that tuples are ordered, it means that the items have a defined order, and that order
will not change.


Unchangeable

Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple has
been created.

Allow Duplicates

Since tuples are indexed, they can have items with the same value:

Example

Tuples allow duplicate values:

thistuple = ("apple", "banana", "cherry", "apple", "cherry")
print(thistuple)

Tuple Length

To determine how many items a tuple has, use the len() function:

Example

Print the number of items in the tuple:

thistuple = ("apple", "banana", "cherry")
print(len(thistuple))

Create Tuple With One Item

To create a tuple with only one item, you have to add a comma after the item, otherwise Python will
not recognize it as a tuple.

Example

One item tuple, remember the comma:

thistuple = ("apple",)
print(type(thistuple))

# NOT a tuple: without the comma this is just a string in parentheses
thistuple = ("apple")
print(type(thistuple))


Tuple Items - Data Types

Tuple items can be of any data type:

Example

String, int and boolean data types:

tuple1 = ("apple", "banana", "cherry")
tuple2 = (1, 5, 7, 9, 3)
tuple3 = (True, False, False)

A tuple can contain different data types:

Example

A tuple with strings, integers and boolean values:

tuple1 = ("abc", 34, True, 40, "male")

type()

From Python's perspective, tuples are defined as objects with the data type 'tuple':

<class 'tuple'>

Example

What is the data type of a tuple?

mytuple = ("apple", "banana", "cherry")
print(type(mytuple))

The tuple() Constructor

It is also possible to use the tuple() constructor to make a tuple.

Example

Using the tuple() constructor to make a tuple:

thistuple = tuple(("apple", "banana", "cherry"))
print(thistuple)


Python Dictionaries

thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}

Dictionary

Dictionaries are used to store data values in key: value pairs.

A dictionary is a collection which is ordered, changeable and does not allow duplicate keys.

Dictionaries are written with curly brackets, and have keys and values:

Example

Create and print a dictionary:

thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
print(thisdict)

Dictionary Items

Dictionary items are ordered, changeable, and do not allow duplicate keys.

Dictionary items are presented in key:value pairs, and can be referred to by using the key name.

Example

Print the "brand" value of the dictionary:

thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
print(thisdict["brand"])

Ordered or Unordered?

As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries
are unordered.

When we say that dictionaries are ordered, it means that the items have a defined order, and that
order will not change.

Unordered means that the items do not have a defined order, so you cannot refer to an item by using
an index.
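
The ordering guarantee can be checked directly. This sketch assumes Python 3.7 or later and uses an illustrative dictionary named car; it shows that keys come back in the order in which they were inserted:

```python
car = {"brand": "Ford", "model": "Mustang"}
car["year"] = 1964             # new items are placed at the end

keys_in_order = list(car)      # iterating a dictionary yields its keys in insertion order
first_key = next(iter(car))    # the first key that was inserted

print(keys_in_order, first_key)
```

Even though the order is defined, items are still looked up by key, not by a numeric position.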

Changeable

Dictionaries are changeable, meaning that we can change, add or remove items after the dictionary
has been created.

Duplicates Not Allowed

Dictionaries cannot have two items with the same key:

Example

A duplicate key will overwrite the existing value:

thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964,
    "year": 2020
}
print(thisdict)


Dictionary Length

To determine how many items a dictionary has, use the len() function:

Example

Print the number of items in the dictionary:

print(len(thisdict))

Dictionary Items - Data Types

The values in dictionary items can be of any data type:

Example

String, int, boolean, and list data types:

thisdict = {
    "brand": "Ford",
    "electric": False,
    "year": 1964,
    "colors": ["red", "white", "blue"]
}

type()

From Python's perspective, dictionaries are defined as objects with the data type 'dict':

<class 'dict'>

Example

Print the data type of a dictionary:

thisdict = {
    "brand": "Ford",
    "model": "Mustang",
    "year": 1964
}
print(type(thisdict))

The dict() Constructor

It is also possible to use the dict() constructor to make a dictionary.

Example

Using the dict() constructor to make a dictionary:

thisdict = dict(name = "John", age = 36, country = "Norway")
print(thisdict)

Python - List Comprehension


List Comprehension

List comprehension offers a shorter syntax when you want to create a new list based on the values
of an existing list.

Example:

Based on a list of fruits, you want a new list, containing only the fruits with the letter "a" in the
name.


Without list comprehension you will have to write a for statement with a conditional test inside:

Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
    if "a" in x:
        newlist.append(x)

print(newlist)

With list comprehension you can do all that with only one line of code:

Example
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]

newlist = [x for x in fruits if "a" in x]

print(newlist)

The Syntax
newlist = [expression for item in iterable if condition == True]

The return value is a new list, leaving the old list unchanged.

Condition

The condition is like a filter that only accepts the items that evaluate to True.

Example

Only accept items that are not "apple":

newlist = [x for x in fruits if x != "apple"]


The condition if x != "apple" will return True for all elements other than "apple", making the new
list contain all fruits except "apple".

The condition is optional and can be omitted:

Example

With no if statement:

newlist = [x for x in fruits]

Iterable

The iterable can be any iterable object, like a list, tuple, set etc.

Example

You can use the range() function to create an iterable:

newlist = [x for x in range(10)]

Same example, but with a condition:

Example

Accept only numbers lower than 5:

newlist = [x for x in range(10) if x < 5]

Expression

The expression is the current item in the iteration, but it is also the outcome, which you can
manipulate before it ends up like a list item in the new list:

Example

Set the values in the new list to upper case:

newlist = [x.upper() for x in fruits]


You can set the outcome to whatever you like:

Example

Set all values in the new list to 'hello':

newlist = ['hello' for x in fruits]

The expression can also contain conditions, not like a filter, but as a way to manipulate the
outcome:

Example

Return "orange" instead of "banana":

newlist = [x if x != "banana" else "orange" for x in fruits]

The expression in the example above says:

"Return the item if it is not banana, if it is banana return orange".

Nested list comprehension

A nested list is a list within a list. Python handles nested lists gracefully and lets you apply
common functions to manipulate them. This section shows how to use list comprehension to
create and use nested lists in Python.
Creating a Matrix

Creating a matrix involves creating a series of rows and columns. We can build the rows and
columns by placing one list comprehension with a for loop inside another list comprehension
with a for loop.
Example
matrix = [[m for m in range(4)] for n in range(3)]
print(matrix)
Running the above code gives us the following result:
[[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]
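
Nested comprehensions can also read from an existing matrix. The sketch below is an illustration with hypothetical names (rows, cols, flat, transposed); it builds a matrix whose rows differ, flattens it into a single list, and then swaps rows with columns:

```python
rows, cols = 3, 4

# Each inner comprehension builds one row; the outer one repeats it per row.
matrix = [[r * cols + c for c in range(cols)] for r in range(rows)]

# Two for clauses in one comprehension walk the matrix row by row.
flat = [value for row in matrix for value in row]

# Transpose: the outer loop picks a column index, the inner loop visits each row.
transposed = [[row[c] for row in matrix] for c in range(cols)]

print(matrix)
print(flat)
print(transposed)
```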

Python Dictionary Comprehension


A dictionary is a data type in Python that allows us to store data in key/value pairs. For example:

my_dict = {1: 'apple', 2: 'ball'}

What is Dictionary Comprehension in Python?

Dictionary comprehension is an elegant and concise way to create dictionaries.

Example 1: Dictionary Comprehension

Consider the following code:

square_dict = dict()
for num in range(1, 11):
    square_dict[num] = num*num
print(square_dict)

The same dictionary can be created with a dictionary comprehension:

# dictionary comprehension example
square_dict = {num: num*num for num in range(1, 11)}
print(square_dict)

The output of both programs will be the same.

{1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81, 10: 100}
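
Like a list comprehension, a dictionary comprehension can take a condition and can be built from any iterable. A short sketch, with illustrative names that are not part of the course example:

```python
# Keep only the even numbers as keys.
even_squares = {num: num * num for num in range(1, 11) if num % 2 == 0}

# Build a dictionary from a list: each fruit maps to the length of its name.
fruits = ["apple", "banana", "kiwi"]
name_lengths = {fruit: len(fruit) for fruit in fruits}

print(even_squares)
print(name_lengths)
```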


Python Strings
Strings

Strings in python are surrounded by either single quotation marks, or double quotation marks.

'hello' is the same as "hello".

You can display a string literal with the print() function:

Example


print("Hello")
print('Hello')

Assign String to a Variable

Assigning a string to a variable is done with the variable name followed by an equal sign and the
string:

Example
a = "Hello"
print(a)

Multiline Strings

You can assign a multiline string to a variable by using three quotes:

Example

You can use three double quotes:

a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)

Or three single quotes:


Example
a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)

Strings are Arrays

Like strings in many other popular programming languages, strings in Python are sequences of
Unicode characters.

However, Python does not have a character data type; a single character is simply a string with a
length of 1.

Square brackets can be used to access elements of the string.

Example

Get the character at position 1 (remember that the first character has the position 0):

a = "Hello, World!"
print(a[1])

Looping Through a String

Since strings are sequences, we can loop through the characters in a string with a for loop.

Example

Loop through the letters in the word "banana":

for x in "banana":
    print(x)

String Length

To get the length of a string, use the len() function.

Example

The len() function returns the length of a string:


a = "Hello, World!"
print(len(a))

Check String

To check if a certain phrase or character is present in a string, we can use the keyword in.

Example

Check if "free" is present in the following text:

txt = "The best things in life are free!"
print("free" in txt)

Use it in an if statement:

Example

Print only if "free" is present:

txt = "The best things in life are free!"
if "free" in txt:
    print("Yes, 'free' is present.")

Check if NOT

To check if a certain phrase or character is NOT present in a string, we can use the keyword not in.

Example

Check if "expensive" is NOT present in the following text:

txt = "The best things in life are free!"
print("expensive" not in txt)

Use it in an if statement:

Example

Print only if "expensive" is NOT present:

txt = "The best things in life are free!"
if "expensive" not in txt:
    print("No, 'expensive' is NOT present.")

Slicing

You can return a range of characters by using the slice syntax.

Specify the start index and the end index, separated by a colon, to return a part of the string.

Example

Get the characters from position 2 to position 5 (not included):

b = "Hello, World!"
print(b[2:5])
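
The start and end index are both optional, negative indexes count from the end of the string, and a third value sets the step. The sketch below reuses the string from the example above; the result names are illustrative:

```python
b = "Hello, World!"

first_five = b[:5]        # "Hello": from the start up to (not including) position 5
from_seven = b[7:]        # "World!": from position 7 to the end
last_word = b[-6:-1]      # "World": negative indexes count from the end
every_other = b[::2]      # "Hlo ol!": every second character
reversed_text = b[::-1]   # "!dlroW ,olleH": a step of -1 walks backwards

print(first_five, from_seven, last_word, every_other, reversed_text)
```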

String Concatenation

To concatenate, or combine, two strings you can use the + operator.

Example

Merge variable a with variable b into variable c:

a = "Hello"
b = "World"
c = a + b
print(c)

Unit 2: Functions & Modules in Python


Python Functions

A function is a block of code which only runs when it is called.


You can pass data, known as parameters, into a function.

A function can return data as a result.

Creating a Function

In Python a function is defined using the def keyword:

Example
def my_function():
    print("Hello from a function")

Calling a Function

To call a function, use the function name followed by parentheses:

Example
def my_function():
    print("Hello from a function")

my_function()

Arguments

Information can be passed into functions as arguments.

Arguments are specified after the function name, inside the parentheses. You can add as many
arguments as you want, just separate them with a comma.

The following example has a function with one argument (fname). When the function is called, we
pass along a first name, which is used inside the function to print the full name:

Example
def my_function(fname):
    print(fname + " Refsnes")

my_function("Emil")
my_function("Tobias")
my_function("Linus")

Arguments are often shortened to args in Python documentation.

Parameters or Arguments?

The terms parameter and argument can be used for the same thing: information that is passed into
a function.

A parameter is the variable listed inside the parentheses in the function definition.

An argument is the value that is sent to the function when it is called.

Number of Arguments

By default, a function must be called with the correct number of arguments. This means that if your
function expects 2 arguments, you have to call the function with exactly 2 arguments, no more and
no fewer.

Example

This function expects 2 arguments, and gets 2 arguments:

def my_function(fname, lname):
    print(fname + " " + lname)

my_function("Emil", "Refsnes")

If you try to call the function with 1 or 3 arguments, you will get an error:

Example

This function expects 2 arguments, but gets only 1:

def my_function(fname, lname):
    print(fname + " " + lname)

my_function("Emil")
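
The error raised by such a call is a TypeError. The sketch below is a variation written for illustration: the function returns the name instead of printing it, and the try/except block (not part of the course example) captures the error so that its message can be inspected:

```python
def my_function(fname, lname):
    return fname + " " + lname

try:
    my_function("Emil")          # one argument is missing
    error_message = None
except TypeError as exc:
    error_message = str(exc)     # the message names the missing parameter

print(error_message)
```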

Arbitrary Arguments, *args

If you do not know how many arguments will be passed into your function, add a * before the
parameter name in the function definition.

This way the function will receive a tuple of arguments, and can access the items accordingly:

Example

If the number of arguments is unknown, add a * before the parameter name:

def my_function(*kids):
    print("The youngest child is " + kids[2])

my_function("Emil", "Tobias", "Linus")

Arbitrary Arguments are often shortened to *args in Python documentation.

Keyword Arguments

You can also send arguments with the key = value syntax.

This way the order of the arguments does not matter.

Example
def my_function(child3, child2, child1):
    print("The youngest child is " + child3)

my_function(child1 = "Emil", child2 = "Tobias", child3 = "Linus")

Arbitrary Keyword Arguments, **kwargs

If you do not know how many keyword arguments will be passed into your function, add two
asterisks (**) before the parameter name in the function definition.

This way the function will receive a dictionary of arguments, and can access the items accordingly:


Example

If the number of keyword arguments is unknown, add a double ** before the parameter name:

def my_function(**kid):
    print("His last name is " + kid["lname"])

my_function(fname = "Tobias", lname = "Refsnes")

Arbitrary keyword arguments are often shortened to **kwargs in Python documentation.
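
The two forms can be combined in one definition. The following is a minimal sketch, assuming a hypothetical function describe_family() and made-up values such as city="Oslo"; it shows that extra positional arguments arrive as a tuple and extra keyword arguments arrive as a dictionary.

```python
def describe_family(surname, *kids, **details):
    """Return a one-line summary built from all three kinds of argument."""
    # kids is a tuple holding the extra positional arguments
    # details is a dictionary holding the extra keyword arguments
    names = ", ".join(kids)
    extras = ", ".join(f"{key}={value}" for key, value in details.items())
    return f"{surname}: {len(kids)} kids ({names}); {extras}"

summary = describe_family("Refsnes", "Emil", "Tobias", "Linus", city="Oslo", pets=2)
print(summary)
```

The regular parameter must come first, then *kids, then **details; Python rejects any other order in the definition.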

Default Parameter Value

The following example shows how to use a default parameter value.

If we call the function without an argument, it uses the default value:

Example
def my_function(country = "Norway"):
    print("I am from " + country)

my_function("Sweden")
my_function("India")
my_function()
my_function("Brazil")

Passing a List as an Argument

You can send an argument of any data type to a function (string, number, list, dictionary etc.), and it
will be treated as the same data type inside the function.

E.g. if you send a List as an argument, it will still be a List when it reaches the function:

Example
def my_function(food):
    for x in food:
        print(x)

fruits = ["apple", "banana", "cherry"]

my_function(fruits)

Return Values

To let a function return a value, use the return statement:

Example
def my_function(x):
    return 5 * x

print(my_function(3))
print(my_function(5))
print(my_function(9))

The pass Statement

Function definitions cannot be empty, but if you for some reason have a function definition with no
content, put in the pass statement to avoid getting an error.

Example
def myfunction():
    pass

Recursion

Python also supports function recursion, which means a defined function can call itself.

Recursion is a common mathematical and programming concept. Its benefit is that you can loop
through data to reach a result.

The developer should be very careful with recursion as it can be quite easy to slip into writing a
function which never terminates, or one that uses excess amounts of memory or processor power.
However, when written correctly recursion can be a very efficient and mathematically-elegant
approach to programming.

In this example, tri_recursion() is a function that we have defined to call itself ("recurse"). We use
the k variable as the data, which decrements (-1) every time we recurse. The recursion ends when
k is no longer greater than 0 (i.e. when it is 0).

For a new developer it can take some time to work out exactly how this works; the best way to find
out is by testing and modifying it.

Example

Recursion Example

def tri_recursion(k):
    if k > 0:
        result = k + tri_recursion(k - 1)
        print(result)
    else:
        result = 0
    return result

print("\n\nRecursion Example Results")
tri_recursion(6)  # 6 is an illustrative argument
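
To see why the recursion terminates and what it computes, the sketch below strips the print calls and compares the recursive result with the closed formula k(k + 1)/2 for the sum 1 + 2 + ... + k. The names tri_sum() and tri_closed_form() are hypothetical helpers, not part of the example above.

```python
def tri_sum(k):
    """Recursive sum 1 + 2 + ... + k."""
    if k <= 0:               # base case: this is what stops the recursion
        return 0
    return k + tri_sum(k - 1)

def tri_closed_form(k):
    """Same value computed directly from k * (k + 1) / 2."""
    return k * (k + 1) // 2

for k in range(7):
    print(k, tri_sum(k), tri_closed_form(k))
```

If the base case were removed, every call would make another call and Python would eventually stop the program with a RecursionError.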

Python Lambda

A lambda function is a small anonymous function.

A lambda function can take any number of arguments, but can only have one expression.

Syntax
lambda arguments : expression

The expression is executed and the result is returned:

Example
Add 10 to argument a, and return the result:

x = lambda a : a + 10
print(x(5))

Lambda functions can take any number of arguments:

Example
Multiply argument a with argument b and return the result:

x = lambda a, b : a * b
print(x(5, 6))

Example
Sum arguments a, b, and c and return the result:

x = lambda a, b, c : a + b + c
print(x(5, 6, 2))

Why Use Lambda Functions?


The power of lambda is better shown when you use it as an anonymous function inside another
function.

Say you have a function definition that takes one argument, and that argument will be multiplied by
an unknown number:

def myfunc(n):
    return lambda a : a * n

Use that function definition to make a function that always doubles the number you send in:

Example
def myfunc(n):
    return lambda a : a * n

mydoubler = myfunc(2)

print(mydoubler(11))
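
Another common place for a lambda is as a short argument to a built-in function. The following sketch, using made-up names and ages, passes a lambda as the key of sorted() and as the function applied by map().

```python
people = [("Emil", 12), ("Tobias", 9), ("Linus", 5)]

# Sort by the second item of each tuple (the age), using a lambda as the key
by_age = sorted(people, key=lambda person: person[1])

# Apply a lambda to every item of a list with map()
doubled = list(map(lambda n: n * 2, [1, 2, 3]))

print(by_age)
print(doubled)
```

In both calls the lambda is used once and never needs a name, which is the situation lambdas are intended for.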


Unit 3: Looping Through Python


Python Conditions and If statements

Python supports the usual logical conditions from mathematics:

• Equals: a == b
• Not Equals: a != b
• Less than: a < b
• Less than or equal to: a <= b
• Greater than: a > b
• Greater than or equal to: a >= b

These conditions can be used in several ways, most commonly in "if statements" and loops.

An "if statement" is written by using the if keyword.

Example

If statement:

a = 33
b = 200
if b > a:
    print("b is greater than a")

In this example we use two variables, a and b, which are used as part of the if statement to test
whether b is greater than a. As a is 33, and b is 200, we know that 200 is greater than 33, and so we
print to screen that "b is greater than a".

Indentation

Python relies on indentation (whitespace at the beginning of a line) to define scope in the code.
Other programming languages often use curly-brackets for this purpose.

Example

If statement, without indentation (will raise an error):

a = 33
b = 200
if b > a:
print("b is greater than a")

Elif

The elif keyword is Python's way of saying "if the previous conditions were not true, then try this
condition".

Example
a = 33
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")

In this example a is equal to b, so the first condition is not true, but the elif condition is true, so we
print to screen that "a and b are equal".

Else

The else keyword catches anything which isn't caught by the preceding conditions.

Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")

In this example a is greater than b, so the first condition is not true, also the elif condition is not
true, so we go to the else condition and print to screen that "a is greater than b".

You can also have an else without the elif:

Example
a = 200
b = 33
if b > a:
    print("b is greater than a")
else:
    print("b is not greater than a")

Short Hand If

If you have only one statement to execute, you can put it on the same line as the if statement.

Example

One line if statement:

if a > b: print("a is greater than b")

Short Hand If ... Else

If you have one statement to execute for if and one for else, you can put it all on the same
line:

Example

One line if else statement:

a = 2
b = 330
print("A") if a > b else print("B")

You can also have multiple else statements on the same line:

Example

One line if else statement, with 3 conditions:

a = 330
b = 330
print("A") if a > b else print("=") if a == b else print("B")
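
The one-line form is a conditional expression, so it produces a value and can be used on the right-hand side of an assignment. A minimal sketch, with illustrative variable names label and larger:

```python
a = 330
b = 330

# The conditional expression yields a value that can be stored
label = "A" if a > b else "=" if a == b else "B"
print(label)

# A common use: pick the larger of two values
larger = a if a > b else b
print(larger)
```

Chaining more than two or three conditions this way quickly becomes hard to read; an ordinary if ... elif ... else block is usually clearer in that case.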

And

The and keyword is a logical operator, and is used to combine conditional statements:

Example

Test if a is greater than b, AND if c is greater than a:

a = 200
b = 33
c = 500
if a > b and c > a:
    print("Both conditions are True")

Or

The or keyword is a logical operator, and is used to combine conditional statements:

Example

Test if a is greater than b, OR if a is greater than c:

a = 200
b = 33
c = 500
if a > b or a > c:
    print("At least one of the conditions is True")

Not

The not keyword is a logical operator, and is used to reverse the result of the conditional statement:

Example

Test if a is NOT greater than b:

a = 33
b = 200
if not a > b:
    print("a is NOT greater than b")

Nested If

You can have if statements inside if statements. These are called nested if statements.

Example
x = 41

if x > 10:
    print("Above ten,")
    if x > 20:
        print("and also above 20!")
    else:
        print("but not above 20.")

The pass Statement

An if statement cannot be empty, but if you for some reason have an if statement with no content,
put in the pass statement to avoid getting an error.

Example
a = 33
b = 200

if b > a:
    pass


Python While Loops


Python Loops

Python has two primitive loop commands:

• while loops
• for loops

The while Loop

With the while loop we can execute a set of statements as long as a condition is true.

Example

Print i as long as i is less than 6:

i = 1
while i < 6:
    print(i)
    i += 1

The while loop requires relevant variables to be ready. In this example we need to define an
indexing variable, i, which we set to 1.

The break Statement

With the break statement we can stop the loop even if the while condition is true:

Example

Exit the loop when i is 3:

i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1

The continue Statement

With the continue statement we can stop the current iteration, and continue with the next:

Example

Continue to the next iteration if i is 3:

i = 0
while i < 6:
    i += 1
    if i == 3:
        continue
    print(i)

The else Statement

With the else statement we can run a block of code once when the condition no longer is true:

Example

Print a message once the condition is false:

i = 1
while i < 6:
    print(i)
    i += 1
else:
    print("i is no longer less than 6")
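
The else block of a while loop runs only when the loop ends because its condition became false; it is skipped when the loop is left with break. The sketch below uses this to report a failed search. The function find_first_multiple() is a hypothetical helper written for this illustration.

```python
def find_first_multiple(numbers, divisor):
    """Return the first item divisible by divisor, or None if there is none."""
    i = 0
    found = None
    while i < len(numbers):
        if numbers[i] % divisor == 0:
            found = numbers[i]
            break                      # leaving via break skips the else block
        i += 1
    else:
        print("No multiple of", divisor, "found")
    return found

print(find_first_multiple([3, 8, 10], 4))
print(find_first_multiple([1, 3], 2))
```

The first call stops at 8 without printing the message; the second call exhausts the list, so the else block runs and the function returns None.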

Python For Loops


A for loop is used for iterating over a sequence (that is either a list, a tuple, a dictionary, a set, or a
string).

This is less like the for keyword in other programming languages, and works more like an iterator
method as found in other object-oriented programming languages.

With the for loop we can execute a set of statements, once for each item in a list, tuple, set etc.

Example
Print each fruit in a fruit list:

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    print(x)

The for loop does not require an indexing variable to be set beforehand.

Looping Through a String


Even strings are iterable objects, because they contain a sequence of characters:

Example
Loop through the letters in the word "banana":

for x in "banana":
    print(x)

The break Statement


With the break statement we can stop the loop before it has looped through all the items:

Example
Exit the loop when x is "banana":

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    print(x)
    if x == "banana":
        break

Example
Exit the loop when x is "banana", but this time the break comes before the print:

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    if x == "banana":
        break
    print(x)

The continue Statement


With the continue statement we can stop the current iteration of the loop, and continue with the next:

Example
Do not print banana:

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    if x == "banana":
        continue
    print(x)

The range() Function


To loop through a set of code a specified number of times, we can use the range() function.

The range() function returns a sequence of numbers, starting from 0 by default, incrementing by 1 (by
default), and ending before a specified number.

Example
Using the range() function:

for x in range(6):
    print(x)

Note that range(6) is not the values of 0 to 6, but the values 0 to 5.

The range() function defaults to 0 as a starting value, however it is possible to specify the starting value
by adding a parameter: range(2, 6), which means values from 2 to 6 (but not including 6):

Example
Using the start parameter:

for x in range(2, 6):
    print(x)

The range() function defaults to increment the sequence by 1, however it is possible to specify the
increment value by adding a third parameter: range(2, 30, 3):

Example
Increment the sequence with 3 (default is 1):

for x in range(2, 30, 3):
    print(x)
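
A quick way to check what a range() call produces is to convert it to a list. The sketch below does this for the three forms shown above and adds a negative step, which counts downwards; the variable names are illustrative.

```python
stop_only = list(range(6))           # start defaults to 0, step to 1
start_stop = list(range(2, 6))       # the stop value 6 is not included
with_step = list(range(2, 30, 3))    # every third number from 2
countdown = list(range(5, 0, -1))    # a negative step counts down

print(stop_only)
print(start_stop)
print(with_step)
print(countdown)
```

In every case the stop value itself is left out of the sequence.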

Else in For Loop


The else keyword in a for loop specifies a block of code to be executed when the loop is finished:

Example
Print all numbers from 0 to 5, and print a message when the loop has ended:

for x in range(6):
    print(x)
else:
    print("Finally finished!")

Note: The else block will NOT be executed if the loop is stopped by a break statement.

Example
Break the loop when x is 3, and see what happens with the else block:

for x in range(6):
    if x == 3: break
    print(x)
else:
    print("Finally finished!")

Nested Loops
A nested loop is a loop inside a loop.

The "inner loop" will be executed one time for each iteration of the "outer loop":

Example
Print each adjective for every fruit:

adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for x in adj:
    for y in fruits:
        print(x, y)

The pass Statement


A for loop cannot be empty, but if you for some reason have a for loop with no content, put in
the pass statement to avoid getting an error.

Example
for x in [0, 1, 2]:
    pass

Unit 4: Python Libraries used in Data Science

NumPy, which stands for Numerical Python, is a Python package consisting of multidimensional
array objects and a collection of routines for processing those arrays. Using NumPy, mathematical
and logical operations on arrays can be performed.
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was
also developed, with some additional functionality. In 2005, Travis Oliphant created the NumPy
package by incorporating the features of Numarray into the Numeric package. There are many
contributors to this open source project.
Operations using NumPy
Using NumPy, a developer can perform the following operations −
• Mathematical and logical operations on arrays.
• Fourier transforms and routines for shape manipulation.
• Operations related to linear algebra. NumPy has in-built functions for linear algebra and
random number generation.
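
The operations listed above can be sketched in a few lines. This is a minimal illustration with made-up values, assuming NumPy version 1.17 or later is installed (np.random.default_rng was added in that version).

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

# Mathematical and logical operations work element by element
squared = a ** 2
is_large = a > 2

# Shape manipulation: view the four values as a 2 x 2 matrix
matrix = a.reshape(2, 2)

# Linear algebra: solve matrix @ x = b for x
b = np.array([5.0, 11.0])
x = np.linalg.solve(matrix, b)

# Fourier transform of the original array
spectrum = np.fft.fft(a)

# Random number generation with a fixed seed for repeatability
rng = np.random.default_rng(0)
samples = rng.random(3)

print(squared, is_large, x, spectrum, samples, sep="\n")
```

Each operation returns a new ndarray, so the results can be combined or passed on to further NumPy routines.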

The contents of an ndarray object can be accessed and modified by indexing or slicing, just like
Python's in-built container objects.
Items in an ndarray object follow a zero-based index. Three types of indexing methods are
available − field access, basic slicing and advanced indexing.
Basic slicing is an extension of Python's basic concept of slicing to n dimensions. A Python slice
object is constructed by giving start, stop, and step parameters to the built-in slice function. This
slice object is passed to the array to extract a part of the array.
Example 1
import numpy as np
a = np.arange(10)
s = slice(2,7,2)
print(a[s])
Its output is as follows −
[2 4 6]
In the above example, an ndarray object is prepared by arange() function. Then a slice object is
defined with start, stop, and step values 2, 7, and 2 respectively. When this slice object is passed to
the ndarray, a part of it starting with index 2 up to 7 with a step of 2 is sliced.
The same result can also be obtained by giving the slicing parameters separated by a colon :
(start:stop:step) directly to the ndarray object.
Example 2

import numpy as np
a = np.arange(10)
b = a[2:7:2]
print(b)
Here, we will get the same output −
[2 4 6]
If only one parameter is given, a single item corresponding to the index will be returned. If a : is
placed after it, all items from that index onwards will be extracted. If two parameters (with :
between them) are used, items between the two indexes (not including the stop index) with the
default step of one are sliced.
Example 3
# slice single item
import numpy as np

a = np.arange(10)
b = a[5]
print(b)
Its output is as follows −
5
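
The remaining slicing forms can be tried in the same way. The sketch below is illustrative and assumes NumPy is installed; it also shows how slicing extends to two dimensions by giving one slice per axis, using a hypothetical 3 x 4 array named grid.

```python
import numpy as np

a = np.arange(10)

single = a[5]         # one item
tail = a[2:]          # from index 2 to the end
middle = a[2:5]       # index 2 up to, but not including, index 5

# Slicing in two dimensions: one slice per axis
grid = np.arange(12).reshape(3, 4)
block = grid[1:, :2]  # rows 1 onwards, first two columns

print(single, tail, middle)
print(block)
```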

Pandas

The name Pandas is derived from the term "panel data", an econometrics term for
multidimensional data sets. It was created in 2008 by Wes McKinney and is used for data
analysis in Python.

Data analysis requires a lot of processing, such as restructuring, cleaning, merging, etc. NumPy,
SciPy, Cython, and Pandas are just a few of the fast data processing tools available. We prefer
Pandas because working with it is quick, simple, and more expressive than with other tools.

Since Pandas is built on top of the NumPy package, NumPy is required for Pandas to work.

Before Pandas, Python was capable of data preparation, but it offered only limited support for
data analysis. Pandas came into the picture and enhanced Python's data analysis capabilities.
Regardless of the source of the data, it can carry out the five crucial steps that are necessary for
processing and analyzing it: load, manipulate, prepare, model, and analyze.

Key Features of Pandas

o It has a fast and efficient DataFrame object with both default and custom indexing.
o Used for reshaping and pivoting of data sets.
o Groups data for aggregations and transformations.
o Aligns data and provides integrated handling of missing data.
o Provides time series functionality.
o Processes a variety of data sets in various formats, such as matrix data, heterogeneous tabular
data, and time series.
o Handles multiple operations on data sets, including subsetting, slicing, filtering, groupBy,
reordering, and reshaping.
o Integrates with other libraries such as SciPy and scikit-learn.
o Performs quickly, and Cython can be used to accelerate it even further.
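
A few of these features can be sketched on a small table. The data below (cities, years, and sales figures) is made up for illustration, and the sketch assumes pandas is installed; it fills a missing value, groups and aggregates, and pivots the table.

```python
import pandas as pd

# Hypothetical sales records; None marks a missing value
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi"],
    "year": [2022, 2023, 2022, 2023],
    "sales": [10.0, None, 7.0, 9.0],
})

# Handle missing data: replace the gap with 0
filled = df.fillna({"sales": 0.0})

# Group by a column and aggregate
totals = filled.groupby("city")["sales"].sum()

# Reshape (pivot): one row per city, one column per year
wide = filled.pivot(index="city", columns="year", values="sales")

print(totals)
print(wide)
```

Replacing a missing value with 0 is only one possible choice; dropping the row or filling with an average are alternatives, depending on the analysis.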

Benefits of Pandas

The following are the advantages of Pandas over other tools:

Representation of Data: Through its DataFrame and Series, it presents the data in a manner that is
appropriate for data analysis.

Clear code: Pandas' clear API lets you concentrate on the most important part of the code. In this
way, it lets the user write clear and concise code.

DataFrame and Series are the two data structures that Pandas provides for processing data. These
data structures are discussed below:

1) Series

A Series is defined as a one-dimensional array capable of storing a variety of data types. The row
labels of a Series are called the "index". We can easily convert a list, tuple, or dictionary into a
Series using the Series() method. A Series cannot contain multiple columns. Its main parameter
is:

Data: It can be any list, dictionary, or scalar value.

Creating Series from Array:

Before creating a Series from an array, we first have to import the numpy module and then use its
array() function in the program.

import pandas as pd
import numpy as np
info = np.array(['P','a','n','d','a','s'])
a = pd.Series(info)
print(a)

Output

0 P
1 a
2 n
3 d
4 a
5 s
dtype: object

Explanation: In this code, we first import the pandas and numpy libraries with the aliases pd and
np. Then we create a variable named "info" that holds an array of values. We pass the info variable
to the Series() method and store the result in the variable "a". Finally, the Series is printed by
calling print(a).

Python Pandas DataFrame

It is a widely used data structure of pandas and works with a two-dimensional array with labeled
axes (rows and columns). As a standard way to store data, a DataFrame has two distinct indexes:
a row index and a column index. It has the following characteristics:

The columns can be of heterogeneous types such as int, bool, etc.

It can be thought of as a dictionary of Series structures in which both the rows and the columns
are indexed. The labels are called "columns" for the columns and "index" for the rows.

Create a DataFrame using List:

We can easily create a DataFrame in Pandas from a list.

import pandas as pd

# a list of strings
x = ['Python', 'Pandas']

# Calling DataFrame constructor on list
df = pd.DataFrame(x)
print(df)

Output

0
0 Python
1 Pandas
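
A DataFrame is more often built from a dictionary, where each key becomes a column. The following sketch uses made-up student data and hypothetical row labels s1 to s3, and assumes pandas is installed; it shows the row index, the column index, and columns of different types side by side.

```python
import pandas as pd

# Hypothetical data: each dictionary key becomes a column
data = {
    "name": ["Asha", "Ravi", "Meera"],
    "age": [21, 23, 22],
    "enrolled": [True, False, True],
}
df = pd.DataFrame(data, index=["s1", "s2", "s3"])

print(df.columns.tolist())    # column labels
print(df.index.tolist())      # row labels
print(df.dtypes)              # one data type per column
print(df.loc["s2", "age"])    # select by row label and column label
```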

Matplotlib is a plotting library for Python. It is used along with NumPy to provide an
environment that is an effective open source alternative to MATLAB. It can also be used
with graphics toolkits like PyQt and wxPython.
The Matplotlib module was first written by John D. Hunter. Since 2012, Michael Droettboom
has been the principal developer. When this material was written, Matplotlib ver. 1.5.1 was
the stable version available. The package is available in binary distribution as well as in
source code form on www.matplotlib.org.
Conventionally, the package is imported into the Python script by adding the following
statement −
from matplotlib import pyplot as plt
Here pyplot is the most important submodule of the matplotlib library; its functions are used
to plot 2D data. The following script plots the equation y = 2x + 5
Example
import numpy as np
from matplotlib import pyplot as plt

x = np.arange(1,11)
y = 2 * x + 5
plt.title("Matplotlib demo")
plt.xlabel("x axis caption")
plt.ylabel("y axis caption")
plt.plot(x,y)
plt.show()
An ndarray object x is created from np.arange() function as the values on the x axis. The
corresponding values on the y axis are stored in another ndarray object y. These values
are plotted using plot() function of pyplot submodule of matplotlib package.
The graphical representation is displayed by show() function.
The above code should produce a straight-line graph of y = 2x + 5 for x from 1 to 10, with the
title and axis captions set in the script.

Instead of the linear graph, the values can be displayed discretely by adding a format
string to the plot() function, for example "ob" for blue circle markers.
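
A format string combines a marker character, a line style, and a color character. The sketch below is a minimal illustration, assuming matplotlib and NumPy are installed; it selects the non-interactive "Agg" backend so that it runs without a display, and the output file name format_demo.png is a placeholder.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available, so draw to a file
import numpy as np
from matplotlib import pyplot as plt

x = np.arange(1, 11)
y = 2 * x + 5

fig, ax = plt.subplots()
# "ob": 'o' = circle markers, 'b' = blue, no connecting line
(points,) = ax.plot(x, y, "ob", label="markers only")
# "--r": dashed line, 'r' = red
(dashed,) = ax.plot(x, y + 5, "--r", label="dashed line")
ax.legend()
fig.savefig("format_demo.png")  # placeholder output file name
```

On a machine with a display, the matplotlib.use("Agg") line can be removed and plt.show() called instead of savefig().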

Plotly is an interactive, open-source Python library. It can be a very helpful tool
for data visualization and for understanding data simply and easily. Plotly graph
objects are a high-level interface to Plotly that is easy to use. Plotly can plot
various types of graphs and charts such as scatter plots, line charts, bar charts,
box plots, histograms, pie charts, etc.
Why choose Plotly over other visualization tools or libraries? Some reasons:
• Plotly has hover tool capabilities that allow us to detect any outliers or
anomalies in a large number of data points.
• Its plots are visually attractive and suit a wide range of audiences.
• It allows endless customization of our graphs, which makes our plots more
meaningful and understandable for others.

Seaborn is a Python data visualization library based on matplotlib. It provides a high-level
interface for drawing attractive and informative statistical graphics.

For a brief introduction to the ideas behind the library, see the introductory notes or the paper
on the seaborn website. The site's installation page explains how to download the package and
get started with it. Its example gallery shows some of the things that you can do with seaborn,
and the tutorials and API reference explain how.

To see the code or report a bug, visit seaborn's GitHub repository. General support questions
are most at home on Stack Overflow, which has a dedicated channel for seaborn.

Unit 5: Data Processing & File Handling


File handling is an important part of any web application.

Python has several functions for creating, reading, updating, and deleting files.

File Handling

The key function for working with files in Python is the open() function.

The open() function takes two main parameters: filename and mode.

There are four different methods (modes) for opening a file:

"r" - Read - Default value. Opens a file for reading, error if the file does not exist

"a" - Append - Opens a file for appending, creates the file if it does not exist

"w" - Write - Opens a file for writing, creates the file if it does not exist

"x" - Create - Creates the specified file, returns an error if the file exists

In addition, you can specify whether the file should be handled in binary or text mode:

"t" - Text - Default value. Text mode

"b" - Binary - Binary mode (e.g. images)

Syntax

To open a file for reading it is enough to specify the name of the file:

f = open("demofile.txt")

The code above is the same as:

f = open("demofile.txt", "rt")

Because "r" for read, and "t" for text are the default values, you do not need to specify them.

Open a File on the Server

Assume we have the following file, located in the same folder as Python:

demofile.txt

Hello! Welcome to demofile.txt
This file is for testing purposes.
Good Luck!

To open the file, use the built-in open() function.

The open() function returns a file object, which has a read() method for reading the content of the
file:

Example
f = open("demofile.txt", "r")
print(f.read())

If the file is located in a different location, you will have to specify the file path, like this:

Example

Open a file on a different location:

f = open("D:\\myfiles\\welcome.txt", "r")
print(f.read())

Read Only Parts of the File

By default the read() method returns the whole text, but you can also specify how many characters
you want to return:

Example

Return the 5 first characters of the file:

f = open("demofile.txt", "r")
print(f.read(5))

Read Lines

You can return one line by using the readline() method:

Example

Read one line of the file:

f = open("demofile.txt", "r")
print(f.readline())

By calling readline() two times, you can read the first two lines:

Example

Read two lines of the file:

f = open("demofile.txt", "r")
print(f.readline())
print(f.readline())

By looping through the lines of the file, you can read the whole file, line by line:

Example

Loop through the file line by line:

f = open("demofile.txt", "r")
for x in f:
    print(x)

Close Files

It is a good practice to always close the file when you are done with it.

Example

Close the file when you are finished with it:

f = open("demofile.txt", "r")
print(f.readline())
f.close()
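
Python can also close the file for you. The sketch below uses the with statement, which closes the file automatically when the block ends, even if an error occurs inside it. To avoid touching real files, it writes its own copy of demofile.txt into a temporary folder; that folder and the variable names are assumptions of this illustration.

```python
import os
import tempfile

# Work in a temporary folder so the sketch does not touch real files
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demofile.txt")

with open(path, "w") as f:
    f.write("Hello! Welcome to demofile.txt\n")

# The file is closed automatically when the with block ends
with open(path, "r") as f:
    first_line = f.readline()

print(first_line)
print(f.closed)
```

With this form there is no f.close() call to forget, which is why it is generally preferred for reading and writing files.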

Write to an Existing File

To write to an existing file, you must add a parameter to the open() function:

"a" - Append - will append to the end of the file

"w" - Write - will overwrite any existing content

Example

Open the file "demofile2.txt" and append content to the file:

f = open("demofile2.txt", "a")
f.write("Now the file has more content!")
f.close()

#open and read the file after the appending:


f = open("demofile2.txt", "r")
print(f.read())

Example

Open the file "demofile3.txt" and overwrite the content:

f = open("demofile3.txt", "w")
f.write("Woops! I have deleted the content!")
f.close()

#open and read the file after the overwriting:


f = open("demofile3.txt", "r")
print(f.read())

Note: the "w" mode will overwrite the entire file.

Create a New File

To create a new file in Python, use the open() function, with one of the following parameters:

"x" - Create - will create a file, returns an error if the file exists

"a" - Append - will create a file if the specified file does not exist

"w" - Write - will create a file if the specified file does not exist

Example

Create a file called "myfile.txt":

f = open("myfile.txt", "x")

Result: a new empty file is created!

Example

Create a new file if it does not exist:

f = open("myfile.txt", "w")

Delete a File

To delete a file, you must import the os module, and run its os.remove() function:

Example

Remove the file "demofile.txt":

import os
os.remove("demofile.txt")

Check if File Exists

To avoid getting an error, you might want to check if the file exists before you try to delete it:

Example

Check if file exists, then delete it:

import os
if os.path.exists("demofile.txt"):
    os.remove("demofile.txt")
else:
    print("The file does not exist")

Delete Folder

To delete an entire folder, use the os.rmdir() method. Note that it can only remove empty folders:

Example

Remove the folder "myfolder":

import os
os.rmdir("myfolder")
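
os.rmdir() raises an error when the folder still contains files. The sketch below demonstrates this in a temporary location and then removes the folder together with its contents using shutil.rmtree() from the standard library. The file name notes.txt and the variable names are illustrative.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                 # temporary location for the demo
folder = os.path.join(base, "myfolder")
os.mkdir(folder)
with open(os.path.join(folder, "notes.txt"), "w") as f:
    f.write("some content")

try:
    os.rmdir(folder)                      # fails: the folder is not empty
    removed_by_rmdir = True
except OSError:
    removed_by_rmdir = False

shutil.rmtree(folder)                     # removes the folder and everything in it
print(removed_by_rmdir, os.path.exists(folder))
```

Because shutil.rmtree() deletes everything below the given path without asking, it is worth double-checking the path before calling it.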

Exception Handling

When an error occurs, or exception as we call it, Python will normally stop and generate an error
message.

These exceptions can be handled using the try statement:

Example

The try block will generate an exception, because x is not defined:

try:
    print(x)
except:
    print("An exception occurred")

Since the try block raises an error, the except block will be executed.

Without the try block, the program will crash and raise an error:

Example

This statement will raise an error, because x is not defined:

print(x)

Many Exceptions

You can define as many exception blocks as you want, e.g. if you want to execute a special block of
code for a special kind of error:

Example

Print one message if the try block raises a NameError and another for other errors:

try:
    print(x)
except NameError:
    print("Variable x is not defined")
except:
    print("Something else went wrong")

Else

You can use the else keyword to define a block of code to be executed if no errors were raised:

Example

In this example, the try block does not generate any error:

try:
    print("Hello")
except:
    print("Something went wrong")
else:
    print("Nothing went wrong")

Finally

The finally block, if specified, will be executed regardless of whether the try block raises an error or not.

Example
try:
    print(x)
except:
    print("Something went wrong")
finally:
    print("The 'try except' is finished")

This can be useful to close objects and clean up resources:

Example

Try to open and write to a file that is not writable:

try:
    f = open("demofile.txt")
    try:
        f.write("Lorum Ipsum")
    except:
        print("Something went wrong when writing to the file")
    finally:
        f.close()
except:
    print("Something went wrong when opening the file")

The program can continue, without leaving the file object open.

Raise an exception

As a Python developer you can choose to throw an exception if a condition occurs.

To throw (or raise) an exception, use the raise keyword.

Example

Raise an error and stop the program if x is lower than 0:

x = -1

if x < 0:
    raise Exception("Sorry, no numbers below zero")

The raise keyword is used to raise an exception.

You can define what kind of error to raise, and the text to print to the user.

Example

Raise a TypeError if x is not an integer:

x = "hello"

if not type(x) is int:
    raise TypeError("Only integers are allowed")
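
Raising and handling usually happen in different places: a function raises the exception, and the code that calls it decides what to do. The sketch below combines the two checks above in a hypothetical function check_age() and catches the errors with "except ... as err", which gives access to the exception object and its message.

```python
def check_age(value):
    """Return value if it is a non-negative integer, otherwise raise."""
    if not isinstance(value, int):
        raise TypeError("Only integers are allowed")
    if value < 0:
        raise ValueError("Sorry, no numbers below zero")
    return value

messages = []
for candidate in [21, -1, "hello"]:
    try:
        check_age(candidate)
        messages.append("ok")
    except (TypeError, ValueError) as err:
        # err is the exception object; str(err) is the text passed to raise
        messages.append(type(err).__name__ + ": " + str(err))

print(messages)
```

Using ValueError for the negative number is a choice made for this sketch; it lets the caller tell the two problems apart by exception type.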
