Python - Data Engineering

Python data types include strings, integers, floats, lists, tuples, dictionaries, sets, booleans, bytes and NoneType. Variables in Python take the data type of the value assigned to them. Operators like arithmetic, comparison, assignment and logical operators are used to perform operations on variables and values of different data types in Python. Indentation through spaces is used to indicate code blocks in Python.

Uploaded by

Chetan Patil
Copyright
© All Rights Reserved

Data Engineering

Python

Basic syntax and variables

We can type directly at the Python command line:

>>> print("Hello World!")
Hello World!

or create a Python file on the server, using the .py file extension, and run it from the command line.

In Python, a variable is created when you assign a value to it:

>>> x = 5
>>> y = "Hello World!"

Indentation

Indentation refers to the spaces at the beginning of a line of code.

Python uses indentation to indicate a block of code.
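
As a minimal sketch of how indentation marks a block: the names temperature and messages below are hypothetical and only serve the illustration. The two indented lines belong to the if block, and the last append does not.

```python
# The indented lines form the body of the "if" block.
temperature = 30  # hypothetical value, chosen for illustration

messages = []
if temperature > 25:
    messages.append("It is warm.")        # inside the block
    messages.append("Drink some water.")  # still inside the block
messages.append("Done.")                  # not indented: runs regardless

print(messages)
```

If the indentation of the two inner lines differed from each other, Python would stop with an IndentationError instead of running the code.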

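For commenting, use the following:

1) # - to comment out a single line

2) ''' ''' - to comment out multiple lines (this is a multi-line string that is not assigned to a variable, so it has no effect when the code runs)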

Data Types

1) Built-in data types

Type              Name of Data Type

Text Type         str
Numeric Types     int, float, complex
Sequence Types    list, tuple, range
Mapping Type      dict
Set Types         set, frozenset
Boolean Type      bool
Binary Types      bytes, bytearray, memoryview
None Type         NoneType

2) Setting the Data type

In Python, the data type is set when you assign a value to a variable.

Example                                          Data Type

x = "Hello World"                                str
x = 20                                           int
x = 20.5                                         float
x = 1j                                           complex
x = ["apple", "banana", "cherry"]                list
x = ("apple", "banana", "cherry")                tuple
x = range(6)                                     range
x = {"name" : "John", "age" : 36}                dict
x = {"apple", "banana", "cherry"}                set
x = frozenset({"apple", "banana", "cherry"})     frozenset
x = True                                         bool
x = b"Hello"                                     bytes
x = bytearray(5)                                 bytearray
x = memoryview(bytes(5))                         memoryview
x = None                                         NoneType
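
The table above can be checked with the built-in type() function. This is a small sketch; the variable names type_of_int, type_of_float and type_of_list are made up for the illustration.

```python
# type() reports the data type Python inferred from the assigned value.
x = 20
type_of_int = type(x).__name__

x = 20.5  # reassigning changes the type of the value x refers to
type_of_float = type(x).__name__

x = ["apple", "banana", "cherry"]
type_of_list = type(x).__name__

print(type_of_int, type_of_float, type_of_list)  # int float list
```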

3) Setting the specific Data Type

If you want to specify the data type, you can use the following constructor functions.

Example                                          Data Type

x = str("Hello World")                           str
x = int(20)                                      int
x = float(20.5)                                  float
x = complex(1j)                                  complex
x = list(["apple", "banana", "cherry"])          list
x = tuple(("apple", "banana", "cherry"))         tuple
x = range(6)                                     range
x = dict(name="John", age=36)                    dict
x = set(("apple", "banana", "cherry"))           set
x = frozenset(("apple", "banana", "cherry"))     frozenset
x = bool(5)                                      bool
x = bytes(5)                                     bytes
x = bytearray(5)                                 bytearray
x = memoryview(bytes(5))                         memoryview

Operators

Operators are used to perform operations on variables and values.

1) Arithmetic Operators
2) Assignment Operators
3) Comparison Operators
4) Logical Operators
5) Identity Operators
6) Membership Operators
7) Bitwise Operators

1) Arithmetic Operators

Arithmetic operators are used with numeric values to perform common mathematical operations.

Operator   Name             Example

+          Addition         x + y
-          Subtraction      x - y
*          Multiplication   x * y
/          Division         x / y
%          Modulus          x % y
**         Exponentiation   x ** y
//         Floor division   x // y
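
A short worked example may help to tell /, // and % apart. The values 7 and 2 are arbitrary, and the result names are only for illustration.

```python
x = 7
y = 2

quotient = x / y    # true division always returns a float
floored = x // y    # floor division rounds down to a whole number
remainder = x % y   # modulus: what is left over after floor division
power = x ** y      # exponentiation: 7 to the power of 2

print(quotient, floored, remainder, power)  # 3.5 3 1 49
```

Floor division and modulus fit together: floored * y + remainder gives back x.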

2) Assignment Operators

Assignment operators are used to assign values to variables.

Operator   Example    Same as

=          x = 5      x = 5
+=         x += 3     x = x + 3
-=         x -= 3     x = x - 3
*=         x *= 3     x = x * 3
/=         x /= 3     x = x / 3
%=         x %= 3     x = x % 3
//=        x //= 3    x = x // 3
**=        x **= 3    x = x ** 3
&=         x &= 3     x = x & 3
|=         x |= 3     x = x | 3
^=         x ^= 3     x = x ^ 3
>>=        x >>= 3    x = x >> 3
<<=        x <<= 3    x = x << 3

3) Comparison Operators

Comparison operators are used to compare two values.

Operator   Name                       Example

==         Equal                      x == y
!=         Not equal                  x != y
>          Greater than               x > y
<          Less than                  x < y
>=         Greater than or equal to   x >= y
<=         Less than or equal to      x <= y

4) Logical Operators

Logical operators are used to combine conditional statements.

Operator   Description                                       Example

and        Returns True if both statements are true.         x < 5 and x < 10
or         Returns True if one of the statements is true.    x < 5 or x < 4
not        Reverses the result, returns False if the         not(x < 5 and x < 10)
           result is true.

5) Identity Operators

Identity operators are used to compare objects: not to check whether they are equal, but whether they are actually the same object, with the same memory location.

Operator   Description                                               Example

is         Returns True if both variables are the same object        x is y
is not     Returns True if both variables are not the same object    x is not y
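
The difference between "equal" and "the same object" is easiest to see with two lists. This is an illustrative sketch, and the names a, b and c are arbitrary.

```python
a = ["apple", "banana"]
b = ["apple", "banana"]   # a separate list with equal contents
c = a                     # a second name for the same list object

equal_values = (a == b)   # True: the contents match
same_object = (a is b)    # False: two distinct objects
alias_is_same = (a is c)  # True: both names refer to one object

print(equal_values, same_object, alias_is_same)  # True False True
```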

6) Membership Operators

Membership operators are used to test if a sequence is present in an object.

Operator   Description                                                Example

in         Returns True if a sequence with the specified value        x in y
           is present in the object.
not in     Returns True if a sequence with the specified value        x not in y
           is not present in the object.
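
A brief sketch of both membership operators, using a hypothetical fruits list and a string:

```python
fruits = ["apple", "banana", "cherry"]

has_banana = "banana" in fruits      # True: the value is in the list
lacks_mango = "mango" not in fruits  # True: the value is absent
has_substring = "nan" in "banana"    # True: on strings, "in" tests for a substring

print(has_banana, lacks_mango, has_substring)  # True True True
```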

7) Bitwise Operators

Bitwise operators are used to work with the individual bits of (binary) numbers.

Operator   Name                    Description

&          AND                     Sets each bit to 1 if both bits are 1.
|          OR                      Sets each bit to 1 if one of two bits is 1.
^          XOR                     Sets each bit to 1 if only one of two bits is 1.
~          NOT                     Inverts all the bits.
<<         Zero fill left shift    Shifts left by pushing zeros in from the right and
                                   letting the leftmost bits fall off.
>>         Signed right shift      Shifts right by pushing copies of the leftmost bit in
                                   from the left and letting the rightmost bits fall off.
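
As a worked sketch of the table above, the two values below are written in binary so the bits can be compared by eye. The values and result names are chosen only for illustration.

```python
x = 0b1100  # 12
y = 0b1010  # 10

and_result = x & y    # 0b1000  -> 8
or_result = x | y     # 0b1110  -> 14
xor_result = x ^ y    # 0b0110  -> 6
not_result = ~x       # -13, because ~x equals -x - 1 for Python integers
left_shift = x << 2   # 0b110000 -> 48
right_shift = x >> 2  # 0b11    -> 3

print(and_result, or_result, xor_result, not_result, left_shift, right_shift)
```

Note that Python integers have no fixed width, so a left shift simply makes the number larger rather than dropping bits.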

List

Create a list:

mylist = ["apple", "banana", "cherry"]

List items

List items are ordered, changeable and allow duplicate values.

List items are indexed: the first item has index [0], the second has index [1], and so on.

Ordered
List items have a defined order, and that order will not change.
Newly added items are placed at the end of the list.

Changeable
The list is changeable, meaning that we can change, add and remove items in a list after it has been created.
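
The three kinds of change can be sketched as follows. The replacement values are arbitrary; append() and remove() are the standard list methods.

```python
mylist = ["apple", "banana", "cherry"]

mylist[1] = "blackcurrant"  # change an item by index
mylist.append("orange")     # add: new items go to the end
mylist.remove("apple")      # remove an item by value

print(mylist)       # ['blackcurrant', 'cherry', 'orange']
print(len(mylist))  # 3
```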

Allow duplicates
Since lists are indexed, lists can have items with the same value.

List length
To determine how many items a list has, use the len() function.
ex - len(mylist)

List data type

List items can be of any data type, and a list can contain different data types.

ex - newlist = ["abc", 34, True, 40.5, "male"]

Type
From Python's perspective, lists are defined as objects with the data type 'list':

<class 'list'>

The list constructor

It is also possible to use the list() constructor when creating a new list.

ex -
thisislist = list(("apple", "banana", "cherry"))  # note the double round-brackets
print(thisislist)

Python Collections(Arrays)

There are four collection data types in the Python programming language:

1) List - is a collection which is ordered and changeable. Allows duplicate members.

2) Tuple - is a collection which is ordered and unchangeable. Allows duplicate members.

3) Set - is a collection which is unordered, unchangeable*, and unindexed. No duplicate members.

4) Dictionary - is a collection which is ordered** and changeable. No duplicate members.

* Set items are unchangeable, but you can remove items and add new items.
** As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

Tuple

Create a tuple:

mytuple = ("apple", "banana", "cherry")

Tuples are used to store multiple items in a single variable.

A tuple is a collection which is ordered and unchangeable.

Tuples are written with round brackets.

Tuple items

Tuple items are ordered, unchangeable and allow duplicate values.
Tuple items are indexed: the first item has index [0], the second has index [1], and so on.

Ordered
Tuple items have a defined order, and that order will not change.

Unchangeable
Tuples are unchangeable, meaning that we cannot change, add or remove items after the tuple has been created.

Allow duplicates
Since tuples are indexed, tuples can have items with the same value.

Tuple length
To determine how many items a tuple has, use the len() function.
ex - len(mytuple)
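
A minimal sketch of what "unchangeable" means in practice: trying to assign to a tuple item raises a TypeError. The flag name changed is hypothetical.

```python
mytuple = ("apple", "banana", "cherry")

try:
    mytuple[0] = "kiwi"  # item assignment is not supported on tuples
    changed = True
except TypeError:
    changed = False

print(changed)       # False
print(len(mytuple))  # 3
```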

Set

Create a set:
myset = {"apple", "banana", "cherry"}

Sets are used to store multiple items in a single variable.

Set items
A set is a collection which is unordered, unchangeable and does not allow duplicate values.

Note - Set items are unchangeable, but you can remove items and add new items.

Unordered
The items in a set do not have a defined order.

Set items can appear in a different order every time you use them, and cannot be referred to by index or key.

Unchangeable
You cannot change the items of a set, but you can remove items and add new items.

Duplicates not allowed

Sets cannot have two items with the same value.
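
The following sketch shows a duplicate being dropped, and items being added and removed. It uses the standard add() and discard() set methods; the name size_after_creation is only for illustration.

```python
myset = {"apple", "banana", "cherry", "apple"}  # the duplicate is dropped

size_after_creation = len(myset)  # 3, not 4

myset.add("orange")      # add a new item
myset.discard("banana")  # remove an item (no error if it is missing)

# sorted() returns a list in a stable order, which is easier to read
# than the arbitrary order of the set itself.
print(sorted(myset))  # ['apple', 'cherry', 'orange']
```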

Set length
To determine how many items a set has, use the len() function.

Set data type

Set items can be of any data type, and a set can contain different data types.

ex - newset = {"abc", 34, True, 40.5, "male"}

Type
From Python's perspective, sets are defined as objects with the data type 'set':

<class 'set'>

The set constructor

It is also possible to use the set() constructor when creating a new set.

ex -
thisisset = set(("apple", "banana", "cherry"))  # note the double round-brackets
print(thisisset)

Dictionaries

Create a dictionary:
thisisdict = {
    "brand": "ford",
    "model": "Mustang",
    "year": "1964"
}

Dictionary
Dictionaries are used to store data values in key:value pairs.
A dictionary is a collection which is ordered*, changeable and does not allow duplicates.
* As of Python version 3.7, dictionaries are ordered. In Python 3.6 and earlier, dictionaries are unordered.

Dictionaries are written with curly brackets and have key:value pairs.

Dictionary items
Dictionary items are ordered, changeable and do not allow duplicates.

Dictionary items are presented in key:value pairs, and can be referred to by using the key name.

ex -
# Print the "brand" value of the dictionary
print(thisisdict["brand"])

Ordered or unordered?

When we say that dictionaries are ordered, it means that the items have a defined order, and that order will not change.

Unordered means that the items do not have a defined order, so you cannot refer to an item by using an index.

Changeable
We can change, add or remove items after the dictionary has been created.

Duplicates Not Allowed

Dictionaries cannot have two items with the same key.
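
The two properties above can be sketched as follows. The "color" key and the duplicate_keys dictionary are hypothetical additions for illustration.

```python
thisisdict = {"brand": "ford", "model": "Mustang", "year": "1964"}

thisisdict["year"] = "2020"  # change the value of an existing key
thisisdict["color"] = "red"  # add a new key:value pair
del thisisdict["model"]      # remove an item by key

# Repeating a key in a dictionary literal keeps only the last value.
duplicate_keys = {"year": "1964", "year": "2020"}

print(thisisdict)      # {'brand': 'ford', 'year': '2020', 'color': 'red'}
print(duplicate_keys)  # {'year': '2020'}
```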

Dict length
To determine how many items a dict has, use the len() function.

Dict data type

Dict values can be of any data type, and a dict can contain different data types.

thisisdict = {
    "brand": "ford",
    "model": "Mustang",
    "year": "1964"
}

Type
From Python's perspective, dicts are defined as objects with the data type 'dict':

<class 'dict'>


Conditional statements (If...Else)

Python conditional statements

Python supports the usual logical conditions from mathematics:

Equals: a == b
Not Equals: a != b
Less than: a < b
Less than or equal to: a <= b
Greater than: a > b
Greater than or equal to: a >= b

These conditions can be used in several ways, most commonly in "if statements" and loops.

If statement

An "if statement" is written by using the if keyword.

ex -

a = 33
b = 200
if b > a:
    print("b is greater than a")

Elif

The elif keyword is Python's way of saying "if the previous conditions were not true, then try this condition".

ex -

a = 33
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")

Else

The else keyword catches anything which isn't caught by the preceding conditions.

ex -
a = 200
b = 33
if b > a:
    print("b is greater than a")
elif a == b:
    print("a and b are equal")
else:
    print("a is greater than b")

Short hand if

If you have only one statement to execute, you can put it on the same line as the if statement.

Ex -
Q - One line if statement:

if a > b: print("a is greater than b")

Short hand if...else

If you have only one statement to execute for if and one for else, you can put it all on the same line:

Ex -
Q - One line if...else statement:

a = 2
b = 330
print("A") if a > b else print("B")

And

The and keyword is a logical operator, and is used to combine conditional statements.

Ex -
Q - Test if a is greater than b AND if c is greater than a:

a = 200
b = 33
c = 500
if a > b and c > a:
    print("Both conditions are True")

Or

The or keyword is a logical operator, and is used to combine conditional statements.

Ex -
Q - Test if a is greater than b, OR if a is greater than c:

a = 200
b = 33
c = 500
if a > b or a > c:
    print("At least one of the conditions is True")

Nested if

You can have if statements inside if statements. These are called nested if statements.

Ex -

x = 41

if x > 10:
    print("Above ten,")
    if x > 20:
        print("and also above 20!")
    else:
        print("but not above 20.")

The Pass statement

if statements cannot be empty, but if you for some reason have an if statement with no content, put in the pass statement to avoid getting an error.

Ex -
a = 33
b = 200

if b > a:
    pass

Python Loops

Python has two primitive loop commands:

• while loops
• for loops

1) The While Loop

With the while loop we can execute a set of statements as long as a condition is True.

Ex -
Q - Print i as long as i is less than 6:

i = 1
while i < 6:
    print(i)
    i += 1

Note: Remember to increment i, or else the loop will continue forever.

The while loop requires the relevant variables to be ready. In this example we need to define an indexing variable, i, which we set to 1.

The Break statement

With the break statement we can stop the loop even if the while condition is True:

Ex -

Q - Exit the loop when i is 3:

i = 1
while i < 6:
    print(i)
    if i == 3:
        break
    i += 1

The Continue statement

With the continue statement we can stop the current iteration, and continue with the next:

Ex -

Q - Continue to the next iteration if i is 3:

i = 0
while i < 6:
    i += 1
    if i == 3:
        continue
    print(i)

The else statement

With the else statement we can run a block of code once when the condition is no longer true:

Ex -
Q - Print a message once the condition is false:

i = 1
while i < 6:
    print(i)
    i += 1
else:
    print("i is no longer less than 6")

2) The For Loop

A for loop is used for iterating over a sequence (that is, either a list, a tuple, a dictionary, a set, or a string).

This is less like the for keyword in other programming languages, and works more like an iterator method as found in other object-oriented programming languages.

With the for loop we can execute a set of statements, once for each item in a list, tuple, set etc.

Ex -

Q - Print each fruit in a fruit list:

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    print(x)

Note: The for loop does not require an indexing variable to be set beforehand.

Looping through a string

Even strings are iterable objects; they contain a sequence of characters:

Ex -

Q - Loop through the letters in the word "banana":

for x in "banana":
    print(x)

The break statement

With the break statement we can stop the loop before it has looped through all the items:

Ex - 1

Q - Exit the loop when x is "banana":

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    print(x)
    if x == "banana":
        break

Ex - 2

Q - Exit the loop when x is "banana", but this time the break comes before the print:

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    if x == "banana":
        break
    print(x)

The continue statement

With the continue statement we can stop the current iteration of the loop, and continue with the next:

Ex -

Q - Do not print "banana":

fruits = ["apple", "banana", "cherry"]

for x in fruits:
    if x == "banana":
        continue
    print(x)

The Range() function

To loop through a set of code a specified number of times, we can use the range() function.

The range() function returns a sequence of numbers, starting from 0 by default, incrementing by 1 (by default), and ending before a specified number.

Ex -

Q - Using the range() function:

for x in range(6):
    print(x)

Note - range(6) is not the values 0 to 6, but the values 0 to 5.

The range() function defaults to 0 as a starting value; however, it is possible to specify the starting value by adding a parameter: range(2, 6), which means values from 2 to 6 (but not including 6):

Ex - Using the start parameter:

for x in range(2, 6):
    print(x)

The range() function defaults to incrementing the sequence by 1; however, it is possible to specify the increment value by adding a third parameter: range(2, 30, 3)

Ex -

Q - Increment the sequence by 3 (default is 1):

for x in range(2, 30, 3):
    print(x)
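
To see all three forms of range() side by side without a loop, each range can be converted to a list. This is only a sketch, and the variable names are made up for the illustration.

```python
# list() expands each range so the generated numbers are visible.
first_six = list(range(6))          # stop only: starts at 0, ends before 6
from_two = list(range(2, 6))        # start and stop
step_three = list(range(2, 30, 3))  # start, stop and step

print(first_six)   # [0, 1, 2, 3, 4, 5]
print(from_two)    # [2, 3, 4, 5]
print(step_three)  # [2, 5, 8, 11, 14, 17, 20, 23, 26, 29]
```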

Else in for loop

The else keyword in a for loop specifies a block of code to be executed when the loop is finished:

Ex -

Q - Print all numbers from 0 to 5, and print a message when the loop has ended:

for x in range(6):
    print(x)
else:
    print("Finally finished!")

Note - The else block will NOT be executed if the loop is stopped by a break statement.

Ex -

Q - Break the loop when x is 3, and see what happens with the else block:

for x in range(6):
    if x == 3: break
    print(x)
else:
    print("Finally finished!")

Nested loops

A nested loop is a loop inside a loop.

The "inner loop" will be executed one time for each iteration of the "outer loop":

Ex -

Q - Print each adjective for every fruit:

adj = ["red", "big", "tasty"]
fruits = ["apple", "banana", "cherry"]

for x in adj:
    for y in fruits:
        print(x, y)

The pass statement

for loops cannot be empty, but if you for some reason have a for loop with no content, put in the pass statement to avoid getting an error.

Ex -

for x in [0, 1, 2]:
    pass

Try Except

The try block lets you test a block of code for errors.
The except block lets you handle the error.
The else block lets you execute code when there is no error.
The finally block lets you execute code, regardless of the result of the try and except blocks.

Exception Handling

When an error occurs, or exception as we call it, Python will normally stop and generate an error message.

These exceptions can be handled by using the try statement.

Ex -

Q - The try block will generate an exception, because x is not defined:

try:
    print(x)
except:
    print("An exception occurred!")

Since the try block raises an error, the except block will be executed.
Without the try block, the program will crash and raise an error.

Many exceptions

You can define as many exception blocks as you want, e.g. if you want to execute a special block of code for a special kind of error:

Ex -

Q - Print one message if the try block raises a NameError and another for other errors:

try:
    print(x)
except NameError:
    print("Variable x is not defined")
except:
    print("Something else went wrong")

Else

You can use the else keyword to define a block of code to be executed if no errors were raised:

Ex -

Q - In this example, the try block does not generate any error:

try:
    print("Hello")
except:
    print("Something went wrong")
else:
    print("Nothing went wrong")

Finally

The finally block, if specified, will be executed regardless of whether the try block raises an error or not.

Ex - 1

try:
    print(x)
except:
    print("Something went wrong")
finally:
    print("The 'try except' is finished")

This can be useful to close objects and clean up resources:

Ex - 2

Q - Try to open and write to a file that is not writable:

try:
    f = open("demofile.txt")
    try:
        f.write("Lorum Ipsum")
    except:
        print("Something went wrong when writing to the file")
    finally:
        f.close()
except:
    print("Something went wrong when opening the file")

Raise an exception

As a Python developer you can choose to throw an exception if a condition occurs.
To throw (or raise) an exception, use the raise keyword.

Ex -

Q - Raise an error and stop the program if x is lower than 0:

x = -1

if x < 0:
    raise Exception("Sorry, no numbers below zero")

The raise keyword is used to raise an exception.

You can define what kind of error to raise, and the text to print to the user.

Ex -

Q - Raise a TypeError if x is not an integer:

x = "Hello"

if not type(x) is int:
    raise TypeError("Only integers are allowed")

Python File Handling


Python file open

The key function for working with files in Python is the open() function.
The open() function takes two parameters: filename and mode.
There are four different methods (modes) for opening a file:

"r" - Read   - Default value. Opens a file for reading, error if the file does not exist.
"a" - Append - Opens a file for appending, creates the file if it does not exist.
"w" - Write  - Opens a file for writing, creates the file if it does not exist.
"x" - Create - Creates the specified file, returns an error if the file exists.

In addition you can specify if the file should be handled in binary or text mode.

"t" - Text   - Default value. Text mode.
"b" - Binary - Binary mode (e.g. images).

To open a file for reading it is enough to specify the name of the file:

Syntax -
f = open("demofile.txt")
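
The modes can be tried out safely with a sketch like the one below. It makes two assumptions that are not part of these notes: the file is created in a temporary folder (via the standard tempfile module) so that no real file is touched, and the with statement is used so that each file is closed automatically.

```python
import os
import tempfile

# Hypothetical file inside a temporary folder, so nothing real is touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demofile.txt")

with open(path, "w") as f:  # "w" creates the file
    f.write("Hello!\n")

with open(path, "a") as f:  # "a" appends to the end
    f.write("More content.\n")

with open(path) as f:       # mode defaults to "r" (read, text)
    content = f.read()

try:
    open(path, "x")         # "x" fails because the file already exists
    create_failed = False
except FileExistsError:
    create_failed = True

print(content)
print(create_failed)  # True

# Clean up the temporary file and folder.
os.remove(path)
os.rmdir(folder)
```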

Open a file on the server

Open a file from a particular location:

f = open("D:\\myfiles\\welcome.txt", "r")
print(f.read())

read() and readline()

We can specify the number of characters to return in the parentheses of read().

Ex -
f = open("demofile.txt", "r")
print(f.read(5))

We can also call readline() several times in a row. Each call reads the next line, so the number of lines read equals the number of readline() calls.

Ex -
f = open("demofile.txt", "r")
print(f.readline())
print(f.readline())

Python file write

To write to an existing file, you must add a parameter to the open() function.

"a" - Append - will append to the end of the file.

Ex -
f = open("demofile2.txt", "a")
f.write("Now the file has more content!")
f.close()

"w" - Write - will overwrite any existing content. It also creates the file if the specified file does not exist.

Ex -
f = open("demofile3.txt", "w")
f.write("Whoops! I have deleted the content!")
f.close()



Delete a File or Folder

To delete a file, you must import the os module and run its os.remove() function.

Ex -
Q - Check that the file exists before deleting it, to avoid getting an error:

import os
if os.path.exists("demofile.txt"):
    os.remove("demofile.txt")
else:
    print("The file does not exist")

To delete an entire folder, use the os.rmdir() method. Note that os.rmdir() only removes empty folders.

Ex -

import os
os.rmdir("myfolder")

Python functions
A function is a block of code which only runs when it is called.
You can pass data, known as parameters, into a function.
A function can return data as a result.

Creating a function

In Python a function is defined using the def keyword:

Ex -
def my_function():
    print("Hello from a function")

Calling a function

To call a function, use the function name followed by parentheses.

Ex -
def my_function():
    print("Hello")

my_function()

Arguments

Information can be passed into functions as arguments.

Arguments are specified after the function name, inside the parentheses. You can add as many arguments as you want; just separate them with commas.

The following example has a function with one argument (fname). When the function is called, we pass along a first name, which is used inside the function to print the full name:

Ex -
def my_function(fname):
    print(fname + " Refsnes")

my_function("Emil")
my_function("Tobias")

Parameters or Arguments?

The terms parameter and argument can be used for the same thing: information that is passed into a function.

Note -
From a function's perspective:

A parameter is the variable listed inside the parentheses in the function definition.

An argument is the value that is sent to the function when it is called.

Number of arguments

By default, a function must be called with the correct number of arguments. This means that if your function expects 2 arguments, you have to call the function with 2 arguments, not more, and not less.

Ex -
Q - This function expects 2 arguments, and gets 2 arguments:

def my_function(fname, lname):
    print(fname + " " + lname)

my_function("Emil", "Refsnes")

If you try to call the function with 1 or 3 arguments, you will get an error.
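
The error in question is a TypeError. As a small sketch, the version of my_function below returns the name instead of printing it so that the result can be stored; the names full_name and error_name are hypothetical.

```python
def my_function(fname, lname):
    return fname + " " + lname

full_name = my_function("Emil", "Refsnes")  # correct: 2 arguments

try:
    my_function("Emil")  # only 1 argument: lname is missing
    error_name = None
except TypeError as error:
    error_name = type(error).__name__

print(full_name)   # Emil Refsnes
print(error_name)  # TypeError
```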

Arbitrary Arguments, *args

If you do not know how many arguments will be passed into your function, add a * before the parameter name in the function definition.

This way the function will receive a tuple of arguments, and can access the items accordingly:

Ex -
Q - If the number of arguments is unknown, add a * before the parameter name:

def my_function(*kids):
    print("The youngest child is " + kids[2])

my_function("Emil", "Tobias", "Linus")

Note - Arbitrary arguments are often shortened to *args in Python documentation.

Keyword Arguments

You can also send arguments with the key = value syntax.
This way the order of the arguments does not matter.

Ex -

def my_function(child3, child2, child1):
    print("The youngest child is " + child3)

my_function(child1 = "Emil", child2 = "Tobias", child3 = "Linus")

Note - The phrase keyword arguments is often shortened to kwargs in Python documentation.

Arbitrary Keyword arguments, **kwargs

If you do not know how many keyword arguments will be passed into your function, add two asterisks (**) before the parameter name in the function definition.

This way the function will receive a dictionary of arguments, and can access the items accordingly.

Ex -
Q - If the number of keyword arguments is unknown, add a double ** before the parameter name:

def my_function(**kid):
    print("His last name is " + kid["lname"])

my_function(fname = "Tobias", lname = "Refsnes")

Note - Arbitrary keyword arguments are often shortened to **kwargs in Python documentation.

Default Parameter Value

The following example shows how to use a default parameter value.
If we call the function without an argument, it uses the default value.

Ex -

def my_function(country = "Norway"):
    print("I am from " + country)

my_function("Sweden")
my_function()
my_function("Brazil")

Passing a list as an Argument

You can send any data type as an argument to a function (string, number, list, dictionary etc.), and it will be treated as the same data type inside the function.

E.g. if you send a list as an argument, it will still be a list when it reaches the function:

Ex -

def my_function(food):
    for x in food:
        print(x)

fruits = ["apple", "banana", "cherry"]

my_function(fruits)

Return Values

To let a function return a value, use the return statement:

def my_function(x):
    return 5 * x

print(my_function(3))