Duratech Python Introduction
Features of Python
file:///home/duratech/Downloads/Duratech_Python_Introduction.html 1/108
03/09/2019 Duratech_Python_Introduction
Java Code
C++ Code
#include <iostream>
int main()
{
    std::cout << "Hello Duratech";
    return 0;
}
Python Code
print("Hello Duratech")
Example 2
Consider a set of numbers from 1 to 10, where every element is to be multiplied by 2. The same
task is written in Java and in Python.
Java Code
import java.util.ArrayList;
import java.util.Arrays;
Python Code
import numpy as np
numbers = np.array([1, 2, 3, 4, 5, 6,7, 8,9,10])
numbers*2
List Comprehension
List comprehensions are a concise way to create lists based on existing lists. With a list comprehension,
a list can be built from any iterable.
The code below creates a list, iterates over its items, and filters them.
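The original cell is not shown here, so the following is only a minimal sketch of a list comprehension with iteration and filtering. The list name numbers and its contents are assumptions made for illustration.

```python
# Hypothetical data: the numbers 1 to 10
numbers = list(range(1, 11))

# Iterate over an existing list and double every element
doubled = [n * 2 for n in numbers]

# Add a filter: keep only the even numbers, then square them
even_squares = [n ** 2 for n in numbers if n % 2 == 0]

print(doubled)
print(even_squares)
```

The part after if is the filter; leaving it out keeps every element.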
myprogram.py
$ python myprogram.py
Symbol names
Variable names in Python can contain the alphanumeric characters a-z , A-Z , 0-9 and the
underscore _ . Variable names must start with a letter or an underscore.
By convention, variable names start with a lower-case letter, and class names start with a capital letter.
In addition, there are a number of Python keywords that cannot be used as variable names. In Python 3 these
keywords are:
False, None, True, and, as, assert, async, await, break, class, continue,
def, del, elif, else, except, finally, for, from, global, if, import, in,
is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield
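The reserved words of the running interpreter can be listed with the standard keyword module. The sketch below assumes Python 3, where print is an ordinary function and not a keyword.

```python
import keyword

# Full list of reserved words for this interpreter version
print(keyword.kwlist)

# Check individual names
lambda_reserved = keyword.iskeyword("lambda")  # a reserved word
print_reserved = keyword.iskeyword("print")    # a built-in function in Python 3
print(lambda_reserved, print_reserved)
```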
Assignment
The assignment operator in Python is = . Python is a dynamically typed language, so the user does not need
to specify the type of a variable.
In [1]:
# variable assignments
x = 1.0
Fundamental types
Python has various data types
Integer
Float
String
Boolean
Complex
In [2]:
# integers
x = 1
type(x)
Out[2]:
int
In [3]:
# float
x = 1.0
type(x)
Out[3]:
float
In [4]:
# boolean
b1 = True
b2 = False
type(b1)
Out[4]:
bool
In [5]:
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)
Out[5]:
complex
In [6]:
print(x)
(1-1j)
In [7]:
print(x.real, x.imag)
1.0 -1.0
If a variable is used before it has been defined, a NameError occurs:
In [8]:
print(y)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-8-d9183e048de3> in <module>
----> 1 print(y)
The built-in functions type() and isinstance() can be used to test whether variables are of
certain types:
In [1]:
x = 1.0
Out[1]:
True
In [2]:
Out[2]:
False
In [3]:
isinstance(x, float)
Out[3]:
True
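Some of the cells above lost their code in extraction. A sketch of type tests that would give the same True / False / True results, assuming x = 1.0 as in the first cell:

```python
x = 1.0

is_float = type(x) is float          # exact type comparison
is_int = type(x) is int
also_float = isinstance(x, float)    # also accepts subclasses

# isinstance can test against several types at once
is_number = isinstance(x, (int, float))

print(is_float, is_int, also_float, is_number)
```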
Type casting
In [4]:
x = 1.5
print(x, type(x))
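The table of casting functions is missing from this copy. As a sketch, the usual built-in conversions are int(), float(), complex(), str() and bool(); the variable names below are illustrative only.

```python
x = 1.5

as_int = int(x)            # drops the fractional part
as_complex = complex(x)    # real number as a complex value
as_str = str(x)            # text representation
from_text = float("2.5")   # parse a number from a string
as_bool = bool(x)          # any non-zero number is True

print(as_int, as_complex, as_str, from_text, as_bool)
```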
Function Description
Arithmetic Operators
The following arithmetic operators are available in Python:
operators Description
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus or Remainder
** Power
In [5]:
# Addition
1 + 2
Out[5]:
In [6]:
# Subtraction
20 - 10
Out[6]:
10
In [7]:
# Multiplication
30 * 10
Out[7]:
300
In [8]:
# Division
1000/125
Out[8]:
8.0
In [9]:
Out[9]:
22.5
In [10]:
Out[10]:
23
In [11]:
Out[11]:
In [12]:
# Power operator
9**3
Out[12]:
729
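Several of the cells above have lost their code. The sketch below runs the operators from the table on hypothetical operands, and adds the floor-division operator // , which is not listed in the table above.

```python
a, b = 45, 2   # hypothetical operands

quotient = a / b     # true division always returns a float
floored = a // b     # floor division
remainder = a % b    # modulus
power = a ** b       # exponentiation

print(quotient, floored, remainder, power)
```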
Logical Operator
Operators Description
or Logical Or
In [13]:
Out[13]:
False
In [14]:
not False
Out[14]:
True
In [15]:
True or False
Out[15]:
True
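As an illustration, the logical operators are usually combined with comparisons. The values of a and b here are assumptions for the example.

```python
a, b = 10, 5

both = a > 0 and b > 0     # True only if both sides are true
either = a < 0 or b > 0    # True if at least one side is true
negated = not (a > b)      # inverts the result

print(both, either, negated)
```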
Relational Operator
Operators Description
== Equality
!= Not Equal
In [16]:
a=10
b=5
c=5
a < b, b <= c, a > b, a >= b, b==c, b!=c
Out[16]:
In [17]:
# equality
[1,2] == [1,2]
Out[17]:
True
In [18]:
# objects identical?
l1 = l2 = [1,2]
l1 is l2
Out[18]:
True
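The difference between == (equal values) and is (same object) shows up when two lists are created separately. A small sketch:

```python
l1 = [1, 2]
l2 = [1, 2]   # a separate list with the same contents
l3 = l1       # a second name for the same list

equal = l1 == l2    # values are compared
same = l1 is l2     # identities are compared
alias = l1 is l3

print(equal, same, alias)
```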
Bitwise Operator
In [ ]:
a = 20 # 20 = 0001 0100
b = 13 # 13 = 0000 1101
In [19]:
# Bitwise AND
a & b
Out[19]:
4
In [20]:
# Bitwise OR
a | b
Out[20]:
29
In [21]:
# Bitwise XOR
a ^ b
Out[21]:
25
In [22]:
# Bitwise complement
~a
Out[22]:
-21
input()
This function reads a line of input from the user and returns it as a string. The result must be converted (for example with int()) before it is used as a number.
In [19]:
In [23]:
Out[23]:
str
In [24]:
Out[24]:
int
In [30]:
Multiple values can be entered on one line with a space as separator and stored as a list. Lists are dealt
with later in this document.
In [32]:
Enter values: 43 65 76 87 32
Out[32]:
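The cell that reads the values is missing here. A possible sketch is shown below; the string raw stands in for the text returned by input("Enter values: ") so that the example runs without a keyboard.

```python
# Stand-in for: raw = input("Enter values: ")
raw = "43 65 76 87 32"

parts = raw.split()                 # split on whitespace -> list of strings
values = [int(p) for p in parts]    # convert each piece to an integer

print(parts)
print(values)
```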
Output
Output is produced with the print() function.
print()
In [36]:
x= "Python"
print("Duratech")
print(10)
print(x)
Duratech
10
Python
In [42]:
print("Nadal "*3)
In [46]:
print(10,20,30)
10 20 30
Cricket 2019 True
Football,Worldcup,2022
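The cells that produced the last two lines are not shown. They are consistent with the sep parameter of print(), sketched below. The output is written to an in-memory buffer here only so that it can be inspected; that buffer is an addition for this example.

```python
import io

buf = io.StringIO()
print("Cricket", 2019, True, file=buf)                  # default separator is a space
print("Football", "Worldcup", 2022, sep=",", file=buf)  # custom separator
print("Hello", end=" ", file=buf)                       # end replaces the newline
print("Duratech", file=buf)

output = buf.getvalue()
print(output, end="")
```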
In [68]:
country="Qatar"
print("Next world cup football takes place in ",country)
In [69]:
In [67]:
Control Flow
The Python syntax for conditional execution of code uses the keywords if , elif (else if), else :
Indentation
In Python, indentation is used to mark a block of code. To indicate a block, indent each line of the
block by the same amount; four spaces per level is the usual convention.
if condition1:
    if condition2:
        statement
else:
    statements
If Statement
if condition:
    statements
In [23]:
a = 200
b = 100
if a > b:
    print("a is greater than b")
a is greater than b
If else Statement
The statements under if execute only if the specified condition is true; otherwise the else part is
executed.
if condition:
    statement 1
else:
    statement 2
In [24]:
a = 100
b = 200
if a > b:
    print("a is greater")
else:
    print("b is greater")
b is greater
If-elif-else Statement
If there are several conditions to be tested in turn, the if-elif-else statement should be used.
if condition1:
    statement 1
elif condition2:
    statement 2
else:
    default statement
In [25]:
num=200
if num > 0:
    print("Positive Number")
elif num < 0:
    print("Negative Number")
else:
    print("Zero")
Positive Number
Loops
In Python, loops can be programmed in a number of different ways. The two kinds of loops used here are
for and while.
for loops:
The most commonly used loop is the for loop, which is used together with iterable objects, such as lists.
The basic syntax is:
for val in sequence:
    statements
Here val is the variable that takes the value of the item inside the sequence on each iteration.
In [26]:
for x in [1,2,3]:
    print(x)
1
2
3
In [27]:
Cat
Dog
Elephant
In [28]:
0
1
2
3
4
5
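The code for the two cells above did not survive extraction. A sketch that would print the same values, with the list name animals chosen for illustration:

```python
# Looping over a list of strings
animals = ["Cat", "Dog", "Elephant"]
for animal in animals:
    print(animal)

# Looping over a range of integers: range(6) yields 0, 1, 2, 3, 4, 5
for i in range(6):
    print(i)

counted = list(range(6))
```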
while loops:
A while loop keeps running as long as its condition is true.
while condition:
    statements
In [29]:
i = 1
while i <= 5:
    print(i)
    i = i + 1
1
2
3
4
5
Continue
continue skips the remaining statements in the current iteration and moves on to the next iteration of the loop.
if a == 3:
    continue
In [30]:
a = 0
while a < 5:
    a = a + 1
    if a == 3:
        continue
    print(a)
1
2
4
5
Break Statement
The break keyword stops a loop as soon as a given condition is met.
In the example below, when num reaches 5 the loop is broken and execution of the loop stops.
In [31]:
num = 0
while num < 10:
    if num == 5:
        break
    num = num + 1
    print(num)
1
2
3
4
5
Data Structure
A data structure is a collection of related data. It is a way of organizing and storing data so that it can be accessed
efficiently. Python provides the following five built-in data structures.
List
Tuples
Dictionary
Set
Frozen set
List
Lists are ordered collections of items of any data type. A list is mutable, which means that it can be modified at any
time.
In [32]:
# a list can hold multiple values of mixed types: strings, numbers, booleans, etc.
s2 = [12,"red",False]
s2
Out[32]:
In [33]:
Out[33]:
[1, 4, 7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46, 49]
In [34]:
s4
Hello world
Out[34]:
['H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd']
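The code of the two cells above is missing. The outputs are consistent with building lists from a range and from a string, as in this sketch (the name s3 is an assumption):

```python
# A list from a range: start at 1, stop before 50, step 3
s3 = list(range(1, 50, 3))

# A list from a string: one element per character
s4 = list("Hello world")

print(s3)
print(s4)
```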
List Functions
Function Description
In [35]:
print(a1)
In [36]:
a1.insert(1,15)
print(a1)
In [37]:
s=a1.pop(2)
print(s,a1)
In [38]:
a1.remove(60)
a1
Out[38]:
In [39]:
a1.reverse()
a1
c=a1.reverse()
In [40]:
a1.index(10)
Out[40]:
In [41]:
a1.sort()
a1
Out[41]:
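The definition of a1 and the cell outputs are not visible above. The following sketch walks through the same methods on a hypothetical starting list, with the state of the list noted after each step.

```python
a1 = [10, 60, 30, 20]      # hypothetical starting values

a1.append(50)              # [10, 60, 30, 20, 50]
a1.insert(1, 15)           # [10, 15, 60, 30, 20, 50]
s = a1.pop(2)              # removes and returns the item at index 2
a1.remove(30)              # removes the first item equal to 30
a1.reverse()               # reverses in place and returns None
position = a1.index(10)    # index of the first item equal to 10
a1.sort()                  # sorts in place

print(s, position, a1)
```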
append adds its argument to the list as a single element, whereas extend iterates over its argument and adds each element to the list.
In [42]:
x = [1, 2, 3]
x.append([4, 5])
print (x)
In [43]:
x = [1, 2, 3]
x.extend([4, 5])
print (x)
[1, 2, 3, 4, 5]
In [44]:
Out[44]:
In [45]:
Out[45]:
In [46]:
l = [ 10,20,30,40,50,60,70,80,90,100,110]
print(l)
del l[7] # removes 80
del l[6] # removes 70
print(l)
[10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
[10, 20, 30, 40, 50, 60, 90, 100, 110]
In [47]:
# Concatenating two lists
lst1 = [0, 1, 2]
lst2 = [3, 4, 5]
list3 = lst1 + lst2
list3
Out[47]:
[0, 1, 2, 3, 4, 5]
In Operator
The in operator returns True if an element is present in a list. It can be applied to numbers, strings, and other types.
In [9]:
15 in [1,2,8,10,15,30]
Out[9]:
True
In [10]:
Out[10]:
False
In [11]:
Out[11]:
True
Tuples
Tuples are like lists, except that they cannot be modified once created, that is they are immutable.
In Python, tuples are created using the syntax (..., ..., ...) , or even ..., ... :
In [12]:
print(point, type(point))
In [13]:
print(point1, type(point1))
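The assignments to point and point1 are missing above. A sketch of the two creation syntaxes, with assumed values, plus a check that a tuple cannot be modified:

```python
point = (10, 20)     # with parentheses
point1 = 10, 20      # parentheses are optional

# Unpacking a tuple into separate variables
x, y = point

# Tuples are immutable: item assignment raises TypeError
try:
    point[0] = 99
except TypeError as err:
    error_message = str(err)

print(point, type(point), x, y)
print(error_message)
```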
Tuple Function
Function Description
In [14]:
Out[14]:
True
In [15]:
Out[15]:
False
In [16]:
Out[16]:
In [17]:
Out[17]:
In [18]:
Out[18]:
In [19]:
tup1 = (10,20,30,40)
tup2 = ("Red","Blue")
tup3 = (True,False)
Out[19]:
In [20]:
# Creating a tuple
Out[20]:
In [21]:
In [22]:
tup
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-22-95b80b2375ef> in <module>
----> 1 tup
Set
A set is an unordered collection with no duplicate elements. It removes duplicate entries and supports set
operations such as intersection, union and difference. The set type is mutable.
A set can be created in two ways: from a list with set(), or with curly brackets { } around the elements. Empty curly brackets create a dictionary, so an empty set must be created with set().
# e.g.
x = {1, 2, 3}
In [70]:
#Example
my_set = set([1,2,3,2])
my_set
Out[70]:
{1, 2, 3}
In [71]:
type(my_set)
Out[71]:
set
In [72]:
A = {1, 2, 3, 4, 5}
A
Out[72]:
{1, 2, 3, 4, 5}
In [73]:
type(A)
Out[73]:
set
set Function
Function Description
In [74]:
# Copying a set
my_set = set([1,2,3,2])
new_set = my_set.copy()
new_set
Out[74]:
{1, 2, 3}
In [75]:
# Adding an element
my_set = set([1,2,3,2])
print(my_set)
my_set.add(100)
my_set
{1, 2, 3}
Out[75]:
{1, 2, 3, 100}
In [76]:
B = {100,200,300}
A.update(B)
A
{1, 2, 3}
Out[76]:
In [77]:
# Discarding an element
my_set = set([1,2,3,2])
print(my_set)
# discard() removes the element if it is present
my_set.discard(2)
my_set
{1, 2, 3}
Out[77]:
{1, 3}
In [78]:
# Remove an element
A = set([1,2,3,2])
A.remove(3)
A
Out[78]:
{1, 2}
In [79]:
my_set = set("HelloWorld")
print(my_set)
# pop an element
# Output: random element
print(my_set.pop())
In [80]:
# Union
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
# union with the | operator
print(A | B)
# union with the union() method
print(A.union(B))
{1, 2, 3, 4, 5, 6, 7, 8}
{1, 2, 3, 4, 5, 6, 7, 8}
In [81]:
# Intersection
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
A.intersection(B)
Out[81]:
{4, 5}
In [82]:
Out[82]:
{1, 2, 3}
In [83]:
Out[83]:
{1, 2, 3, 6, 7, 8}
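The last two cells have lost their code. Their outputs match the difference and symmetric difference of the sets A and B used above, which can be sketched as:

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Difference: elements of A that are not in B
diff = A - B                    # same as A.difference(B)

# Symmetric difference: elements in exactly one of the two sets
sym = A ^ B                     # same as A.symmetric_difference(B)

print(diff)
print(sym)
```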
FrozenSet
A frozen set is an immutable version of a Python set object. Elements of a frozen set cannot be modified after creation, whereas
elements of an ordinary set can.
In [84]:
fSet = frozenset(vowels)
fSet
Out[84]:
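The definition of vowels is not shown. Assuming it is a list of vowel letters, a sketch of creating a frozen set and confirming that it cannot be changed:

```python
vowels = ["a", "e", "i", "o", "u"]   # assumed contents
fSet = frozenset(vowels)

# A frozenset has no add() method, so modification fails
try:
    fSet.add("y")
    can_modify = True
except AttributeError:
    can_modify = False

# Because it is immutable (and hashable), it can be a dictionary key
lookup = {fSet: "vowels"}

print(fSet, can_modify)
```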
Dictionary
A dictionary is a changeable collection of key:value pairs that is indexed by key. Since Python 3.7 a dictionary preserves the insertion order of its keys. In Python,
dictionaries are written with curly brackets.
The syntax :
In [85]:
sampledict ={
"brand": "Ford",
"model": "Fiesta",
"year": 2005
}
sampledict
Out[85]:
In [86]:
Out[86]:
'Fiesta'
In [87]:
1
9
25
49
Dictionary Function
Function Description
In [88]:
# create a dictionary
student = {"Name":"George", "RollNo":1234, "Age":20, "Marks":50}
student
Out[88]:
In [89]:
Out[89]:
In [90]:
Out[90]:
In [91]:
Out[91]:
In [92]:
George
Out[92]:
In [93]:
Out[93]:
20
In [94]:
Out[94]:
In [95]:
x= student.copy()
x
Out[95]:
In [96]:
Out[96]:
In [97]:
Out[97]:
{}
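Most of the dictionary cells above are empty in this copy. The sketch below applies common dictionary methods to the student dictionary defined earlier. Which methods the original cells used is an assumption, apart from copy(), which appears above.

```python
student = {"Name": "George", "RollNo": 1234, "Age": 20, "Marks": 50}

keys = list(student.keys())               # all keys
values = list(student.values())           # all values
name = student.get("Name")                # value for a key
missing = student.get("City", "unknown")  # default when the key is absent
age = student.pop("Age")                  # remove a key and return its value
student["Marks"] = 75                     # update an existing key
backup = student.copy()                   # shallow copy
student.clear()                           # remove every item

print(name, age, missing)
print(backup, student)
```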
Functions
Python has two kinds of functions
Built in functions
User Defined functions
Built in functions
Python has a number of built-in functions, e.g. abs and sum. There are also many packages available, each
containing a collection of functions. These predefined functions can be made available with the import statement.
In [98]:
import math
math.sqrt(64)
Out[98]:
8.0
Math Function
Python has many built-in functions, and one of the most useful modules is the math module.
Function Description
In [99]:
import math
abs(-10)
Out[99]:
10
In [100]:
max(10,40,60,43)
Out[100]:
60
In [101]:
min(10,40,43,60)
Out[101]:
10
In [102]:
pow(4,2)
Out[102]:
16
In [103]:
math.ceil(10.3)
Out[103]:
11
In [104]:
math.floor(10.4)
Out[104]:
10
In [105]:
math.degrees(math.pi)
Out[105]:
180.0
In [106]:
math.radians(180)
Out[106]:
3.141592653589793
In [107]:
round(100.4356,2)
Out[107]:
100.44
In [108]:
math.log(5)
Out[108]:
1.6094379124341003
In [109]:
math.exp(1.609)
Out[109]:
4.997810917177775
String Functions
Strings are the variable type used for storing text. Python has many functions related to
strings.
In [110]:
s = "Hello world"
type(s)
Out[110]:
str
In [111]:
Out[111]:
11
In [112]:
Hello test
In [113]:
In [114]:
print(s2)
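The code of the cells above is missing. The outputs 11 and Hello test are consistent with len() and replace(), sketched here together with slicing:

```python
s = "Hello world"

length = len(s)                    # number of characters
s2 = s.replace("world", "test")    # returns a new string; s is unchanged
first = s[0]                       # indexing starts at 0
hello = s[0:5]                     # slice: characters 0 to 4

print(length)
print(s2)
print(first, hello)
```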
Testing Function
Function Description
In [115]:
s4 = "1234"
s4.isdigit()
Out[115]:
True
In [116]:
s4 = "Sachin"
s4.istitle()
Out[116]:
True
In [117]:
s5="Bombay"
s5.isalpha()
Out[117]:
True
In [118]:
s6="DELHI"
s6.isupper()
Out[118]:
True
In [119]:
s7=" "
s7.isspace()
Out[119]:
True
Search Function
Function Description
rindex(x) return the index of the last occurrence of x (searching from the right), or raise ValueError if x is not found
In [120]:
Out[120]:
In [121]:
Out[121]:
13
In [122]:
s3 = "abec"
s3.index("e")
Out[122]:
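Most rows of the search table and several cells are missing. A sketch of the standard search methods follows; the string text is a made-up example.

```python
s3 = "abec"
e_index = s3.index("e")            # position of the first "e"

text = "Hello world, hello"        # hypothetical example string
first_o = text.find("o")           # first occurrence from the left
last_o = text.rfind("o")           # first occurrence from the right
not_found = text.find("z")         # find returns -1 when nothing matches

# index/rindex raise ValueError instead of returning -1
try:
    text.index("z")
    raised = False
except ValueError:
    raised = True

print(e_index, first_o, last_o, not_found, raised)
```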
Split Function
Function Description
In [123]:
Out[123]:
In [124]:
Out[124]:
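The split cells are empty in this copy. As a sketch, split() breaks a string into a list and join() does the reverse. The input strings are assumptions.

```python
date = "10/11/2018"                 # hypothetical input

parts = date.split("/")             # split at each "/"
words = "Hello world".split()       # no argument: split on whitespace
joined = "-".join(parts)            # glue the pieces back with "-"

print(parts, words, joined)
```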
Other Functions
In [125]:
Out[125]:
'10-11-2018'
In [126]:
Out[126]:
'uNITED sTATES'
In [127]:
Out[127]:
'0000012346'
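The calls behind these three outputs are not shown. One set of string methods that would produce them is sketched below; the input strings are guesses chosen to match the outputs.

```python
date = "10/11/2018".replace("/", "-")     # replace every "/" with "-"
swapped = "United States".swapcase()      # swap upper and lower case
padded = "12346".zfill(10)                # pad with leading zeros to width 10

print(date, swapped, padded)
```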
Regular Expressions
Regular expressions are used to match strings against a specified pattern and to perform searching,
replacement and splitting.
Python has a built-in package called re, which can be used to work with Regular Expressions.
Function Description
split Returns a list where the string has been split at each match
In [128]:
import re
# Search for all values of la
txt = "Cricket has 11 players, they play for 5 days "
re.findall("la", txt)
Out[128]:
['la', 'la']
In [129]:
Out[129]:
In [130]:
# Match Command
import re
# raw string, so that \d reaches the regex engine unchanged
pattern = re.compile(r"^(\d{2})-(\d{2})-(\d{4})$")
valid = pattern.match("01-01-2000")
if valid:
    print("Valid date!")
else:
    print("Invalid Date!")
Valid date!
In [131]:
Out[131]:
In [132]:
Out[132]:
['e', 'a', 'd', 'a', 'e', 'e', 'd', 'c', 'a', 'a']
In [133]:
import re
txt = "Washington"
# Search for a sequence that starts with "Wa", followed by any two characters, and an "i":
re.findall("Wa..i", txt)
Out[133]:
['Washi']
In [134]:
Out[134]:
['June']
In [135]:
# The 'r' in front tells Python the expression is a raw string. In a raw string, escape sequences are not parsed.
# For example, '\n' is a single newline character. But r'\n' is two characters: a backslash and an 'n'.
# The variable is named text rather than str, so the built-in str type is not shadowed.
text = 'My email is abc123@google.com, His email is cde@gmail.com Her mail is sss@dasd.in'
emails = re.findall(r'[\w\.-]+@[\w\.-]+', text)
for email in emails:
    print(email)
abc123@google.com
cde@gmail.com
sss@dasd.in
In [136]:
def testfunction():
    print("test")
In [137]:
testfunction()
test
In [138]:
def square(x):
    """
    Return the square of x.
    """
    return x ** 2
In [139]:
square(4)
Out[139]:
16
In [140]:
def fact(x):
    # accumulate the product in a local variable that does not reuse the function name
    result = 1
    while x > 1:
        result = result * x
        x -= 1
    return result
In [141]:
fact(4)
Out[141]:
24
In [142]:
If a value for the y argument is not provided when calling the function add, it defaults to the value (10)
given in the function definition:
In [143]:
# Here x is 5; no value is provided for y, so it takes the default value 10
add(5)
Out[143]:
15
In [144]:
Out[144]:
300
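The definition of add is missing from this copy. A definition consistent with the text (y defaults to 10) might look like the sketch below. The call add(100, 200) is an assumption that would explain the output 300.

```python
def add(x, y=10):
    """Return x + y; y defaults to 10 when it is omitted."""
    return x + y

default_result = add(5)        # y takes the default value
explicit = add(100, 200)       # both arguments given
by_keyword = add(y=1, x=2)     # arguments passed by name, in any order

print(default_result, explicit, by_keyword)
```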
Errors in Python are of two kinds: syntax errors and exceptions.
Syntax Error
In [145]:
# Syntax error
c=[1,2,3,45])
Indentation Error
In Python, indentation is mandatory. Unlike many other languages, Python does not use brackets to delimit blocks;
blocks are marked by indentation instead. Four spaces per level is the usual convention.
In [146]:
# Indentation error
# This is due to wrong indentation. In Python, indentation is mandatory
a= 10
if a>0:
print("Positive")
else:
print("Negative")
Correct indentation
a = 10
if a > 0:
    print("Positive")
else:
    print("Negative")
Exceptions
Even if a statement or expression is syntactically correct, there may be an error when an attempt is made to
execute it. Errors detected during execution are called exceptions. Some of the exceptions are given below.
In [147]:
# Division by zero
1/0
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-147-71233faae7dc> in <module>
      1 # Division by zero
----> 2 1/0
In [49]:
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-49-5f18184f0cc1> in <module>
1 # variable not found
----> 2 v + 4
Handling Exceptions
Exceptions are handled using a try block. If an error is encountered, execution of the try block stops and
control is transferred to the matching except block. An optional finally block can follow; the code in the finally block is executed
regardless of whether an exception occurs.
In [50]:
# example
try:
    x = 1/0
except ZeroDivisionError:
    print("You can't divide by zero!")
IOError
If the file cannot be opened.
ImportError
If python cannot find the module
ValueError
Raised when a built-in operation or function receives an argument that has the right type but an inappropriate
value
KeyboardInterrupt
Raised when the user hits the interrupt key (normally Control-C or Delete)
ZeroDivisionError
Raised when the denominator is zero in a division
EOFError
Raised when the built-in function input() hits an end-of-file condition (EOF) without
reading any data
In [148]:
In [149]:
Out[149]:
'Invalid Input'
In [150]:
# when 0 is in the denominator
divide(1,0)
Out[150]:
In [151]:
#NO errors
divide(10,2)
Out[151]:
5.0
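The definition of divide is not visible above. The sketch below is one possible version that returns 'Invalid Input' for non-numeric arguments. The message for division by zero is an assumption, since that output is missing.

```python
def divide(a, b):
    """Return a / b, or a message when the division cannot be done."""
    try:
        return a / b
    except TypeError:
        # e.g. a string was passed instead of a number
        return "Invalid Input"
    except ZeroDivisionError:
        # hypothetical message; the original output is not shown
        return "Cannot divide by zero"

ok = divide(10, 2)
bad_type = divide("a", 2)
by_zero = divide(1, 0)
print(ok, bad_type, by_zero)
```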
File Operations
Python has built-in functions to create and manipulate files.
Opening a file
open()
The built-in Python function open() is used to open a file.
try:
    data = open("D:\\data.txt")
except IOError:
    print("File not found or path is incorrect")
finally:
    print("exit")
Access Modes
Access modes define the way in which the file is opened. They specify where reading or
writing starts in the file.
Mode  Function
r     Open a file in read-only mode. Starts reading from the beginning of the file. This is the default mode.
rb    Open a file for reading in binary format. Starts reading from the beginning of the file.
r+    Open a file for reading and writing. File pointer placed at the beginning of the file.
w     Open a file for writing only. File pointer placed at the beginning of the file. Overwrites the existing file or creates a new one if it does not exist.
a     Open a file for appending. Starts writing at the end of the file. Creates a new file if the file does not exist.
ab    Same as a but in binary format. Creates a new file if the file does not exist.
Reads lines
data=open("D:\\data1.txt")
data.read(3)
data.close()
data=open("D:\\data1.txt","r")
for line in data:
    print(line)
data.close()
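As a complement to the explicit close() calls above, a file can be opened in a with block, which closes it automatically. The sketch below writes, appends and reads a small file in a temporary directory; the path and the file contents are made up for the example.

```python
import os
import tempfile

# Hypothetical file location inside a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "data1.txt")

with open(path, "w") as f:          # "w": create or overwrite
    f.write("first line\nsecond line\n")

with open(path, "a") as f:          # "a": append at the end
    f.write("third line\n")

with open(path) as f:               # default mode "r"
    lines = [line.rstrip("\n") for line in f]

os.remove(path)                     # clean up the example file
print(lines)
```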
Deleting a File
A file can be removed using os.remove() function
To avoid an error, it is good practice to check that the file exists before trying to delete it:
import os
if os.path.exists("file.txt"):
    os.remove("file.txt")
else:
    print("The file does not exist")
Classes are the key features of object-oriented programming. A class is a structure for representing an object
and the operations that can be performed on the object.
An object is a collection of data (variables) and methods (functions) that act on that data. The class is a
blueprint for the object.
A class is defined almost like a function, but uses the class keyword, and the class definition usually
contains a number of class method definitions (a function in a class).
In [152]:
In [153]:
# Example
class Student():
    courses = ["English", "Mathematics", "Maths"]
    age = 15
    def ageIncrement(self):
        """This method increments the age of the instance."""
        self.age += 1
Creating an Object
Object is an instance of a class.
In [154]:
john = Student()
john.age
Out[154]:
15
In [155]:
john.ageIncrement()
john.age
Out[155]:
16
In [156]:
class Student():
    def __init__(self, courses, age, sex):
        self.courses = courses
        self.age = age
        self.sex = sex
    def ageIncrement(self):
        self.age += 1
In [157]:
def get_perimeter(self):
    return 2 * (self.length + self.breadth)
def get_area(self):
    return self.length * self.breadth
def calculate_cost(self):
    area = self.get_area()
    return area * self.unit_cost
In [158]:
In [159]:
Area: 12000
Cost: Rs.12000000
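The class header and constructor for the methods above are missing in this copy. A complete sketch is given below; the class name Rectangle and the dimensions 120 x 100 with a unit cost of 1000 are assumptions chosen to reproduce the printed area and cost.

```python
class Rectangle:
    """Hypothetical class wrapping the three methods shown above."""

    def __init__(self, length, breadth, unit_cost=0):
        self.length = length
        self.breadth = breadth
        self.unit_cost = unit_cost

    def get_perimeter(self):
        return 2 * (self.length + self.breadth)

    def get_area(self):
        return self.length * self.breadth

    def calculate_cost(self):
        area = self.get_area()
        return area * self.unit_cost

# Assumed values: 120 x 100 units at Rs.1000 per square unit
r = Rectangle(120, 100, 1000)
print("Area:", r.get_area())
print("Cost: Rs.%s" % r.calculate_cost())
```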
Inheritance
Python supports inheritance, which involves two classes:
base class
derived class
Syntax
class BaseClass:
    statements
class DerivedClass(BaseClass):
    statements
In [160]:
def getColour(self):
    return self.__colour
def getName(self):
    return self.__name
In [161]:
class Truck(TruckClass):
    def getDescription(self):
        return "Name: " + self.getName() + " Model :" + self.__model + " Colour " + self.getColour()
In [162]:
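The base class definition and the constructors are not shown above. The following self-contained sketch illustrates the same idea; the base class name Vehicle, the constructor arguments and the sample values are all hypothetical.

```python
class Vehicle:
    """Hypothetical base class holding a name and a colour."""

    def __init__(self, name, colour):
        self.__name = name        # double underscore: private to Vehicle
        self.__colour = colour

    def getColour(self):
        return self.__colour

    def getName(self):
        return self.__name

class Truck(Vehicle):
    """Derived class: inherits getName and getColour from Vehicle."""

    def __init__(self, name, colour, model):
        super().__init__(name, colour)   # initialise the base class part
        self.__model = model

    def getDescription(self):
        return ("Name: " + self.getName() + " Model: " + self.__model
                + " Colour: " + self.getColour())

t = Truck("Hauler", "Red", "T100")
description = t.getDescription()
print(description)
```

The derived class reads the private name and colour through the inherited getter methods, because attributes that start with two underscores are only directly accessible inside the class that defines them.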
Polymorphism
Polymorphism is the ability to take various forms. If a program has more than one class, a method with the same name
can behave differently for objects of different classes.
In [163]:
In [164]:
# Calling as car
c = Car()
c.wheel()
4 wheelers
In [165]:
# Calling as Truck
t = Truck()
t.wheel()
8 wheelers
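The class definitions for Car and Truck are missing above. A sketch that gives the same printed lines follows; here the methods return the text and the caller prints it, which is a choice made for this example.

```python
class Car:
    def wheel(self):
        return "4 wheelers"

class Truck:
    def wheel(self):
        return "8 wheelers"

# The same method name behaves differently depending on the object
descriptions = [vehicle.wheel() for vehicle in (Car(), Truck())]
for d in descriptions:
    print(d)
```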
Numpy
Introduction
The numpy package (module) is used in almost all numerical computation using Python. It is a package
that provides high-performance vector, matrix and higher-dimensional data structures for Python. NumPy is a
library for performing mathematical and statistical operations, and it works well for multi-dimensional
arrays and matrix multiplication.
NumPy is memory efficient: an array stores elements of a single type in a compact block of memory, so it can
handle large amounts of data more economically than built-in Python lists. NumPy is also convenient to work with,
especially for matrix multiplication and reshaping.
NumPy is the fundamental package for scientific computing with Python. It contains
import numpy as np
Arrays
A numpy array is a grid of values, all of the same type. The shape of an array is a tuple of integers giving the
size of the array along each dimension.
NumPy operations are faster than iterating through a Python loop, because the work is done in compiled code rather than element by element in the interpreter.
Consider the following example
In [166]:
import numpy as np
a = np.array([1,2])
b = np.array([2,1])
dot = 0
dot
Out[166]:
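The loop in the cell above was lost in extraction. A sketch of the comparison it appears to make, a dot product computed with a loop and with NumPy:

```python
import numpy as np

a = np.array([1, 2])
b = np.array([2, 1])

# Loop version: multiply element by element and accumulate
dot = 0
for x, y in zip(a, b):
    dot += x * y

# Vectorised version: a single NumPy call
vectorised = np.dot(a, b)

print(dot, vectorised)
```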
In [167]:
Out[167]:
In [168]:
Out[168]:
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [169]:
Out[169]:
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [170]:
# consider an example of adding 2 to each number in the numpy array created above.
# The addition is applied to each value in the numpy array
npa2=npa + 2
npa2
Out[170]:
In [171]:
10
In [172]:
In [173]:
Out[173]:
In [174]:
Out[174]:
(2, 3)
In [175]:
print(a[1,2])
print(a[1,1])
60
50
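The definition of a is missing above. The shape (2, 3) and the printed values 60 and 50 are consistent with an array like the one in this sketch; the remaining values are assumptions.

```python
import numpy as np

# Assumed contents: 2 rows, 3 columns
a = np.array([[10, 20, 30],
              [40, 50, 60]])

shape = a.shape          # (rows, columns)
print(shape)
print(a[1, 2])           # row 1, column 2 (both counted from 0)
print(a[1, 1])           # row 1, column 1
```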
In [176]:
import numpy as np
e= np.array([(1,2,3), (4,5,6),(7,8,9),(10,11,12)])
print(e)
[[ 1 2 3]
[ 4 5 6]
[ 7 8 9]
[10 11 12]]
In [177]:
Out[177]:
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12]])
In [178]:
Out[178]:
In [179]:
import numpy as np
a = np.zeros((2, 2))
a
Out[179]:
array([[0., 0.],
[0., 0.]])
In [180]:
a = np.ones((2, 3))
print(a)
[[1. 1. 1.]
[1. 1. 1.]]
In [181]:
Out[181]:
array([[8, 8, 8],
[8, 8, 8],
[8, 8, 8]])
In [182]:
Out[182]:
In [183]:
import numpy as np
e = np.random.random((2,2)) # Create an array filled with random values
e
Out[183]:
array([[0.10842964, 0.86275331],
[0.75740158, 0.04158337]])
Array indexing
In [184]:
import numpy as np
s = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(s)
[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]]
In [185]:
# First row
print (s[ :1])
[[1 2 3 4]]
In [186]:
print(s[1:])
[[ 5 6 7 8]
[ 9 10 11 12]]
Index slicing
Index slicing is the technical name for the syntax [lower:upper:step] to extract part of an array:
In [187]:
Out[187]:
array([[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])
In [188]:
Out[188]:
array([ 2, 5, 8, 11])
In [189]:
data[:, 1:2]
Out[189]:
array([[ 2],
[ 5],
[ 8],
[11]])
In [190]:
# Use slicing to get the data of first 2 rows and columns 1 and 2;
b = data[:2, 1:3]
b
Out[190]:
array([[2, 3],
[5, 6]])
In [52]:
import numpy as np
boolean_idx = (a > 2)
boolean_idx
Out[52]:
array([[False, False],
[ True, True],
[ True, True]])
In [192]:
# Filtering
print(a[boolean_idx]) # Prints "[3 4 5 6]"
[3 4 5 6]
[3 4 5 6]
In [54]:
[[1 2]
[3 4]
[5 6]]
Out[54]:
array([[100, 100],
[100, 50],
[ 50, 50]])
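The array definition and the last cell's code are missing. The outputs are consistent with the sketch below, where np.where(a < 4, 100, 50) is a guess at the call that produced the final array.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])

# Boolean mask: True where the condition holds
boolean_idx = a > 2

# Indexing with the mask returns the matching elements as a 1-D array
filtered = a[boolean_idx]

# np.where picks 100 where the condition is True and 50 elsewhere
replaced = np.where(a < 4, 100, 50)

print(filtered)
print(replaced)
```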
Datatypes
Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric datatypes
that you can use to construct arrays.
In [194]:
import numpy as np
int64
float64
int64
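The code of this cell is not shown. A sketch of how dtypes are inspected and set is given below. The default integer type depends on the platform and NumPy version, so only the explicitly requested types are fixed.

```python
import numpy as np

x = np.array([1, 2])                    # integer dtype chosen by NumPy
y = np.array([1.0, 2.0])                # float64
z = np.array([1, 2], dtype=np.int64)    # dtype requested explicitly
w = y.astype(np.int64)                  # convert an existing array

print(x.dtype, y.dtype, z.dtype, w.dtype)
```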
Operation on Numpy
In [195]:
x = np.array([[1,2],[4,5]])
y = np.array([[3,4],[6,7]])
# Addition
np.add(x, y)
Out[195]:
array([[ 4, 6],
[10, 12]])
In [196]:
# subtraction
np.subtract(x,y)
Out[196]:
array([[-2, -2],
[-2, -2]])
In [197]:
# Multiplication
np.multiply(x,y)
Out[197]:
array([[ 3, 8],
[24, 35]])
In [198]:
# Division
np.divide(x, y)
Out[198]:
array([[0.33333333, 0.5 ],
[0.66666667, 0.71428571]])
In [199]:
Out[199]:
array([[15, 18],
[42, 51]])
In [200]:
Out[200]:
array([[ 3, 4, 6, 7],
[ 6, 8, 12, 14],
[12, 16, 24, 28],
[15, 20, 30, 35]])
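The code of the last two cells is missing. With x and y as defined above, the outputs match the matrix product np.dot and the outer product np.outer, as this sketch shows:

```python
import numpy as np

x = np.array([[1, 2], [4, 5]])
y = np.array([[3, 4], [6, 7]])

# Matrix product (rows of x times columns of y)
product = np.dot(x, y)

# Outer product: both inputs are flattened first
outer = np.outer(x, y)

print(product)
print(outer)
```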
In [201]:
1
3
6
In [202]:
#Square Root
a=np.array([(1,2,3),(3,4,5,)])
print(np.sqrt(a))
In [203]:
# Trigonometric operation
a=np.array([(1,2,3),(3,4,5,)])
np.sin(a)
Out[203]:
In [204]:
# Logarithmic operation
a=np.array([(1,2,3),(3,4,5,)])
np.log(a)
Out[204]:
In [205]:
# Exponential operation
a=np.array([(1,2,3),(3,4,5,)])
np.exp(a)
Out[205]:
In [206]:
# Mean median
data = [1,2,3,5,6,7,8]
print(np.mean(data))
print(np.median(data))
4.571428571428571
5.0
In [207]:
std=np.std(data)
print(std)
variance = np.var(data)
print(variance)
2.4411439272335804
5.959183673469389
In [208]:
np.quantile(data,0.9)
Out[208]:
7.4
In [209]:
# Product
np.prod(data)
Out[209]:
10080
In [210]:
x = np.array([[1,2],[3,4]])
print(x)
[[1 2]
[3 4]]
In [211]:
y = np.array([[5,6],[7,8]])
print(y)
[[5 6]
[7 8]]
In [212]:
np.unique(names)
Out[212]:
Matrix Operation
Matrix operations such as addition, subtraction, multiplication, inverse and determinant can be performed
using numpy.
In [213]:
In [214]:
# Matrix Addition
np.add(a,b)
Out[214]:
array([[5, 1],
[2, 3]])
In [215]:
# Matrix Subtraction
np.subtract(a,b)
Out[215]:
array([[-3, -1],
[-2, -1]])
Matrix multiplication
In [216]:
np.matmul(a, b)
Out[216]:
array([[4, 1],
[2, 2]])
In [217]:
#inverse of matrix
ainv = np.linalg.inv(a)
ainv
Out[217]:
array([[1., 0.],
[0., 1.]])
In [218]:
#matrix determinant
np.linalg.det(a)
Out[218]:
1.0
In [219]:
#matrix diagonal
print("The matrix diagonal ",np.diag(a))
Out[219]:
array([[1, 0],
[0, 2]])
In [220]:
Out[220]:
In [221]:
In [222]:
Out[222]:
In [223]:
Out[223]:
In [224]:
Out[224]:
In [225]:
Out[225]:
In [226]:
Out[226]:
In [227]:
Out[227]:
Random Numbers
In [228]:
Out[228]:
In [229]:
Out[229]:
In [230]:
Out[230]:
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
In [231]:
### linspace
# Generates 10 evenly spaced values from 1 to 10 (both ends included)
lin = np.linspace(1.0, 10, num=10)
lin
Out[231]:
array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
In [232]:
### logspace
# Generates 4 values, evenly spaced on a log scale, from 10^3 to 10^4
log1 = np.logspace(3.0, 4.0, num=4)
log1
Out[232]:
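As a small sketch of how linspace and logspace relate (the variable name same is illustrative), logspace(a, b, n) gives the same values as raising 10 to the power of linspace(a, b, n):

```python
import numpy as np

# linspace: 10 evenly spaced values between 1 and 10, both ends included
lin = np.linspace(1.0, 10.0, num=10)

# logspace: 4 values evenly spaced on a log scale between 10**3 and 10**4
log1 = np.logspace(3.0, 4.0, num=4)

# The exponents are evenly spaced, not the values themselves
same = np.allclose(log1, 10 ** np.linspace(3.0, 4.0, num=4))

print(lin)
print(log1)
print(same)
```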
In [233]:
# Horizontal stack
np.hstack((f, g))
Out[233]:
array([1, 2, 3, 4, 5, 6])
In [234]:
Out[234]:
array([[1, 2, 3],
[4, 5, 6]])
String operations
NumPy provides a set of vectorized string operations, in the numpy.char module, for arrays of type
numpy.string_ or numpy.unicode_ (named numpy.bytes_ and numpy.str_ in recent NumPy releases). Here
are some of the character functions:
In [235]:
x= "Hello India"
In [236]:
# To upper
np.char.upper(x)
Out[236]:
In [237]:
# To lower
np.char.lower(x)
Out[237]:
In [238]:
# Splitting a string
np.char.split('Hello, Hi, How', sep = ',')
Out[238]:
In [239]:
# Join
np.char.join(['-', ':'], ['Goodday', ' all'])
Out[239]:
In [240]:
#counting a substring
a=np.array(['Hi', 'How', 'How'])
np.char.count(a, 'How')
Out[240]:
array([0, 1, 1])
Function Description
In [241]:
# Capitalize
import numpy as np
x= "new delhi"
np.char.capitalize(x)
Out[241]:
In [242]:
np.char.add(x,y)
Out[242]:
array('NewDelhi', dtype='<U8')
In [243]:
Out[243]:
array(6)
In [244]:
Out[244]:
array(False)
In [245]:
Out[245]:
array(True)
In [246]:
# Returns True if x >= y in lexicographic (dictionary) order; a string is greater than any of its own prefixes
x= "madras"
y = "mad"
np.char.greater_equal(x,y)
Out[246]:
array(True)
In [247]:
# Returns True if x <= y in lexicographic (dictionary) order; a prefix sorts before the longer string
x= "del"
y = "delhi"
np.char.less_equal(x,y)
Out[247]:
array(True)
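The comparison functions above compare strings in lexicographic order, character by character. A short sketch with a few illustrative inputs (the third pair is an added example, not from the notebook):

```python
import numpy as np

# "madras" starts with "mad" and is longer, so it sorts after "mad"
a = np.char.greater_equal("madras", "mad")

# "del" is a prefix of "delhi", so it sorts before it
b = np.char.less_equal("del", "delhi")

# "a" sorts before "b", so "apple" is not >= "banana"
c = np.char.greater_equal("apple", "banana")

print(bool(a), bool(b), bool(c))
```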
In [248]:
arr = np.arange(10)
np.save('filename',arr)
In [249]:
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
In [250]:
Pandas
Pandas is an open-source, BSD-licensed Python library that provides high-performance, easy-to-use data
structures and high-level data analysis tools for the Python programming language.
Series
A Series is a one-dimensional labeled array that can hold any data type (integers, strings, floats, Python objects,
etc.). The axis labels are collectively called the index.
General Syntax
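The syntax itself is missing from this copy. As a hedged sketch, a Series is usually built with pd.Series(data, index=..., dtype=..., name=...); the variable name marks and the name argument below are illustrative, while the values match the marks output shown later in this section.

```python
import pandas as pd

# pd.Series(data, index=..., dtype=..., name=...)
marks = pd.Series(
    [178, 200, 199, 197],                                 # data
    index=["maths", "chemistry", "biology", "physics"],   # axis labels
    dtype="int64",                                        # optional explicit type
    name="marks",                                         # optional name
)

print(marks)
print(marks["maths"])      # label-based access
print(list(marks.index))
```

If index is omitted, pandas uses a default integer index starting at 0.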
In [251]:
import pandas as pd
data = np.array(['a','b','c','d'])
s = pd.Series(data)
print(s)
0 a
1 b
2 c
3 d
dtype: object
In [252]:
import pandas as pd
from pandas import Series, DataFrame
obj = Series([1,2,3,4,5,6])
print(obj)
print(obj.values)
print(obj.index)
0 1
1 2
2 3
3 4
4 5
5 6
dtype: int64
[1 2 3 4 5 6]
RangeIndex(start=0, stop=6, step=1)
In [253]:
Out[253]:
maths 178
chemistry 200
biology 199
physics 197
dtype: int64
In [254]:
Out[254]:
178
In [255]:
#Filtering
mymarks[mymarks > 180]
Out[255]:
chemistry 200
biology 199
physics 197
dtype: int64
In [256]:
print(case2)
True
False
In [257]:
In [258]:
maths 178
chemistry 200
biology 199
physics 197
dtype: int64
In [259]:
Out[259]:
biology 199.0
chemistry 200.0
maths 178.0
physics 197.0
english NaN
dtype: float64
In [260]:
Out[260]:
biology False
chemistry False
maths False
physics False
english True
dtype: bool
In [261]:
Out[261]:
biology True
chemistry True
maths True
physics True
english False
dtype: bool
In [262]:
Out[262]:
Marks
biology 199.0
chemistry 200.0
maths 178.0
physics 197.0
english NaN
Name: My Public results, dtype: float64
In [263]:
Out[263]:
A 1
B 2
C 3
D 4
dtype: int64
In [264]:
Out[264]:
In [265]:
myindex[2]
Out[265]:
'C'
In [266]:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-266-94d4f6b3af02> in <module>
1 #Direct modification of a index in not possible, error
----> 2 myindex[2] = 'C1'
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in __setitem__(self, key, value)
3936
3937 def __setitem__(self, key, value):
-> 3938 raise TypeError("Index does not support mutable operations")
3939
3940 def __getitem__(self, key):
Reindexing
Reindexing changes the row labels and column labels of a DataFrame.
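A minimal sketch of reindexing on a Series (the names s, r1 and r2 are illustrative). Labels that do not exist in the original index get NaN by default, or the value passed as fill_value:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=["A", "B", "C", "D"])

# Missing label "E" is filled with NaN, which also converts the values to float
r1 = s.reindex(["A", "B", "C", "D", "E"])

# fill_value supplies a replacement for missing labels instead of NaN
r2 = s.reindex(["A", "B", "C", "D", "E"], fill_value=0)

# reindex returns a new object; the original Series is not modified
print(r1)
print(r2)
print(s)
```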
In [267]:
Out[267]:
A 1.0
B 2.0
C 3.0
D 4.0
F NaN
E NaN
z NaN
dtype: float64
In [268]:
# reindex returns a new Series; the result is not assigned here, so newdf1 itself is unchanged
newdf1.reindex(['A','B','C','D','F','E','z','N'], fill_value=0)
newdf1
Out[268]:
A 1.0
B 2.0
C 3.0
D 4.0
F NaN
E NaN
z NaN
dtype: float64
In [269]:
# Creating a series
newdf2 = Series(['India', 'China', 'Malaysia'], index=[0,5,10])
newdf2
Out[269]:
0 India
5 China
10 Malaysia
dtype: object
In [270]:
ser1 = Series(np.arange(3),index=['A','B','C'])
ser1 = 2*ser1
ser1
Out[270]:
A 0
B 2
C 4
dtype: int64
In [271]:
Out[271]:
In [272]:
Out[272]:
In [273]:
ser1[0:3]
Out[273]:
A 0
B 2
C 4
dtype: int64
In [274]:
Out[274]:
A 0
B 2
C 4
dtype: int64
DATAFRAME:
DataFrame is the most widely used data structure in pandas. It holds two-dimensional, tabular
data and has two indexes: a row index and a column index.
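A small sketch showing the two indexes of a DataFrame. The row labels r1, r2, r3 are hypothetical; the names and ages are the ones used in the example further down.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Roger", "Andy", "Rafael"], "Age": [10, 12, 13]},
    index=["r1", "r2", "r3"],   # row index (hypothetical labels)
)

print(df.index)     # row labels
print(df.columns)   # column labels
print(df.shape)     # (number of rows, number of columns)
```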
In [275]:
0
0 1
1 2
2 3
3 4
4 5
In [276]:
data = [['Roger',10],['Andy',12],['Rafael',13]]
df = pd.DataFrame(data,columns=['Name','Age'])
print(df)
Name Age
0 Roger 10
1 Andy 12
2 Rafael 13
Pandas Operation
One of the essential features of NumPy is the ability to perform quick element-wise operations: basic
arithmetic (addition, subtraction, multiplication, etc.), trigonometric functions, exponential and logarithmic
functions, and so on. Pandas inherits most of this functionality from NumPy.
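As a sketch of what this means in practice (the index labels p0, p1, p2 and the Series other are illustrative), a NumPy function applied to a Series keeps the index, and arithmetic between two Series aligns on labels:

```python
import numpy as np
import pandas as pd

fare = pd.Series([7.25, 71.2833, 7.925], index=["p0", "p1", "p2"])

# A NumPy ufunc works element by element and returns a Series with the same index
root = np.sqrt(fare)

# Arithmetic between Series aligns on index labels, not on positions
other = pd.Series([1.0, 1.0], index=["p0", "p2"])
total = fare + other   # "p1" has no partner in other, so the result there is NaN

print(root)
print(total)
```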
In [277]:
In [278]:
Out[278]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
Selecting a column
In [279]:
Out[279]:
0 7.2500
1 71.2833
2 7.9250
3 53.1000
4 8.0500
Name: fare, dtype: float64
Removing a column
In [280]:
# Removing a column
# using the del statement
print ("Deleting the last column using the del statement:")
del data['alone']
data.head()
Out[280]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
Slicing
Slicing is a computationally fast way to methodically access parts of your data.
slicing by columns
loc selects by label (row and column names); iloc selects by integer position
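A minimal sketch of the difference, using a tiny hypothetical frame with row labels a, b, c. Note that a label slice includes its end label, while an integer slice excludes its end position.

```python
import pandas as pd

df = pd.DataFrame(
    {"pclass": [3, 1, 3],
     "sex": ["male", "female", "female"],
     "age": [22.0, 38.0, 26.0]},
    index=["a", "b", "c"],
)

# loc: labels; the slice "a":"b" includes row "b"
by_label = df.loc["a":"b", ["pclass", "sex"]]

# iloc: positions; the slice 0:2 stops before position 2
by_position = df.iloc[0:2, 0:2]

print(by_label)
print(by_position)
```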
In [281]:
Out[281]:
In [282]:
Out[282]:
0 3 male 22.0
1 1 female 38.0
2 3 female 26.0
3 1 female 35.0
4 3 male 35.0
In [283]:
Out[283]:
0 0 3 male
1 1 1 female
2 1 3 female
3 1 1 female
4 0 3 male
Slicing by rows
In [284]:
Out[284]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [285]:
Out[285]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [286]:
Out[286]:
survived pclass sex age sibsp parch fare embarked class who adult_male
In [287]:
Out[287]:
pclass sex
0 3 male
1 1 female
2 3 female
In [288]:
Out[288]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [289]:
data["class"].value_counts(ascending=True)
Out[289]:
Second 184
First 216
Third 491
Name: class, dtype: int64
In [290]:
#Crosstab
#A crosstab creates a bivariate frequency distribution.
pd.crosstab(data.sex,data.alive)
Out[290]:
alive no yes
sex
female 81 233
Continuous variables
In [291]:
# Getting mean
data["fare"].mean()
Out[291]:
32.204207968574636
In [292]:
# Getting sum
data["fare"].sum()
Out[292]:
28693.9493
In [293]:
Out[293]:
77.9583
In [294]:
Out[294]:
count 891.000000
mean 32.204208
std 49.693429
min 0.000000
25% 7.910400
50% 14.454200
75% 31.000000
max 512.329200
Name: fare, dtype: float64
In [295]:
Out[295]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [296]:
# Sorting a dataframe
data.sort_values('fare',ascending=False).head()
Out[296]:
survived pclass sex age sibsp parch fare embarked class who adult
Aggregation in Dataframe
Aggregation works much as it does in SQL. Here we use the groupby function.
In [297]:
## Groupby Function
data.groupby('sex').fare.min()
Out[297]:
sex
female 6.75
male 0.00
Name: fare, dtype: float64
In [298]:
Out[298]:
sex
In [299]:
Out[299]:
female 314
male 577
Name: sex, dtype: int64
Transform
Transform computes the same group-wise result as a groupby aggregation, but returns it for every row of the dataframe, so the output has the same length as the input
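A short sketch of the difference on a small hypothetical frame: the aggregation returns one row per group, while transform repeats the group total on every row of that group.

```python
import pandas as pd

df = pd.DataFrame({"sex": ["male", "female", "female", "male"],
                   "fare": [7.25, 71.28, 7.92, 8.05]})

# Aggregation: one value per group
per_group = df.groupby("sex")["fare"].sum()

# transform: the group total is placed on each original row
df["group_total"] = df.groupby("sex")["fare"].transform("sum")

print(per_group)
print(df)
```

This makes transform convenient for adding a group-level statistic as a new column without a separate merge.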
In [300]:
data['new_fare']=data.groupby('sex')['fare'].transform(sum)
data.head()
Out[300]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
Map
The map function expects a function object and one or more iterables (list, dictionary, etc.). It calls the
function object on each element of the iterable and, in Python 3, returns an iterator over the results; wrap it
in list() to get a list.
Basic syntax
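The syntax cell is missing from this copy. As a minimal sketch, the built-in is called as map(function, iterable, ...); the lists and their values below are illustrative.

```python
# map(function, iterable, ...) returns a lazy iterator in Python 3
numbers = [1, 2, 3, 4]
doubled = map(lambda n: n * 2, numbers)
doubled_list = list(doubled)   # consume the iterator to get a list

# With several iterables, the function receives one item from each
list_a = [1, 2, 3]
list_b = [10, 20, 30]
sums = list(map(lambda x, y: x + y, list_a, list_b))

print(doubled_list)
print(sums)
```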
In [4]:
import warnings
warnings.filterwarnings('ignore')
In [5]:
Out[5]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [302]:
df['Sex_num']=df.sex.map({'female':0, 'male':1})
df.head()
Out[302]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [303]:
Out[303]:
In [304]:
Out[304]:
In [306]:
x=map(lambda x, y: x + y,list_a,list_b)
list(x)
Out[306]:
In [305]:
Out[305]:
0 14.5000
1 142.5666
2 15.8500
3 106.2000
4 16.1000
Name: fare, dtype: float64
In [7]:
Out[7]:
survived pclass sex age sibsp parch fare embarked class who adult_ma
In [11]:
Out[11]:
0 2.692582
1 8.442944
2 2.815138
3 7.286975
4 2.837252
Name: fare, dtype: float64
Apply
apply() calls a function on each value of a column (Series.apply) or along either axis of a
dataframe (DataFrame.apply): axis=0 passes each column to the function, axis=1 passes each row.
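A small sketch of the three common ways to call apply, on a hypothetical two-column frame (the lambda functions are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"fare": [4.0, 9.0, 16.0], "age": [25.0, 36.0, 49.0]})

# Series.apply: the function is called on each element of the column
col_root = df["fare"].apply(np.sqrt)

# DataFrame.apply with axis=0 (default): the function receives each column
col_range = df.apply(lambda col: col.max() - col.min())

# DataFrame.apply with axis=1: the function receives each row
row_sum = df.apply(lambda row: row["fare"] + row["age"], axis=1)

print(col_root)
print(col_range)
print(row_sum)
```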
In [307]:
Out[307]:
0 2.692582
1 8.442944
2 2.815138
3 7.286975
4 2.837252
Name: fare, dtype: float64
Applymap
In [8]:
# Apply a square root function to every single cell in the whole data frame
# applymap() applies a function to every single element in the entire dataframe.
import numpy as np
df[['fare','age']].applymap(np.sqrt).head()
Out[8]:
fare age
0 2.692582 4.690416
1 8.442944 6.164414
2 2.815138 5.099020
3 7.286975 5.916080
4 2.837252 5.916080
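map: applies a function (or a dictionary mapping) to each element of a Series
applymap: applies a function to each element of a DataFrame
apply: applies a function to each column or row of a DataFrame, or to each element of a Series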
Filter
Syntax
filter(function_object, iterable)
The filter function expects two arguments: a function_object and an iterable. function_object returns a boolean
value and is called for each element of the iterable; filter returns only those elements for which
function_object returns True.
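A minimal sketch of filter on a plain list (the input list is illustrative). Like map, filter returns an iterator in Python 3, so the result is wrapped in list():

```python
numbers = [1, 2, 3, 4, 5, 6]

# Keep only the elements for which the function returns True
evens = list(filter(lambda n: n % 2 == 0, numbers))

# The same result written as a list comprehension
evens_lc = [n for n in numbers if n % 2 == 0]

print(evens)
print(evens_lc)
```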
In [309]:
Out[309]:
survived pclass sex age sibsp parch fare embarked class who adult_male
In [310]:
Out[310]:
[2, 4, 6]
In [311]:
Out[311]:
Reduce
The function reduce(func, seq) applies the two-argument function func() cumulatively to the items of the
sequence seq, from left to right, and returns a single value.
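To make the intermediate steps visible, the sketch below pairs reduce with itertools.accumulate, which returns every running result instead of only the last one. The use of accumulate and the name largest are additions for illustration.

```python
import functools
import itertools

values = [47, 11, 42, 13]

# reduce computes ((47 + 11) + 42) + 13
total = functools.reduce(lambda x, y: x + y, values)

# accumulate shows the running results of the same function
steps = list(itertools.accumulate(values, lambda x, y: x + y))

# Any two-argument function works, for example keeping the larger value
largest = functools.reduce(lambda x, y: x if x > y else y, values)

print(total)
print(steps)
print(largest)
```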
In [312]:
import functools
functools.reduce(lambda x,y: x+y, [47,11,42,13])
Out[312]:
113
In [313]:
Out[313]:
102
In [314]:
Out[314]:
5050
Here we create two dataframes to merge. In pandas, merge and join refer to the same kind of operation
In [315]:
import pandas as pd
df1= pd.DataFrame({
'id':[1,2,3,4,5],
'Name': ['Rahul', 'Sachin', 'VVS', 'Saurav', 'Anil'],
'role':['Batsman','All rounder','Batsman','All rounder','Bowler']})
df1
Out[315]:
id Name role
0 1 Rahul Batsman
1 2 Sachin All rounder
2 3 VVS Batsman
3 4 Saurav All rounder
4 5 Anil Bowler
In [316]:
df2= pd.DataFrame({
'id':[1,2,3,4,5],
'State': ['Karnataka', 'Maharashtra', 'Andhra Pradesh', 'West Bengal',
'Karnataka']})
df2
Out[316]:
id State
0 1 Karnataka
1 2 Maharashtra
2 3 Andhra Pradesh
3 4 West Bengal
4 5 Karnataka
In [317]:
df3 =pd.merge(df1,df2,on='id')
df3
Out[317]:
Consider the two dataframes df_left and df_right for performing join operations
In [318]:
import pandas as pd
df_left = pd.DataFrame({
'id':[1,2,3,4,5],
'Name': ['Johnny', 'George', 'Cook', 'Remo', 'Mike'],
'subject_id':['History','Maths','Social','French','English']})
df_left
Out[318]:
id Name subject_id
0 1 Johnny History
1 2 George Maths
2 3 Cook Social
3 4 Remo French
4 5 Mike English
In [319]:
df_right = pd.DataFrame(
{'id':[1,2,3,4,5],
'Name': ['gates', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['Maths','Social','Science','French','English']})
df_right
Out[319]:
id Name subject_id
0 1 gates Maths
1 2 Brian Social
2 3 Bran Science
3 4 Bryce French
4 5 Betty English
In [320]:
# left Join
pd.merge(df_left, df_right, on='subject_id', how='left')
Out[320]:
In [321]:
# Right join
pd.merge(df_left, df_right, on='subject_id', how='right')
Out[321]:
In [322]:
# Outer join
pd.merge(df_left, df_right, how='outer', on='subject_id')
Out[322]:
In [323]:
# Inner Join
pd.merge(df_left, df_right, on='subject_id', how='inner')
Out[323]:
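The outputs of the four joins are missing from this copy. The sketch below uses smaller, hypothetical versions of the two frames to show how the join type changes which rows are kept; the suffixes and indicator arguments are additions that make the source of each row visible.

```python
import pandas as pd

left = pd.DataFrame({"Name": ["Johnny", "George", "Cook"],
                     "subject_id": ["History", "Maths", "Social"]})
right = pd.DataFrame({"Name": ["gates", "Brian", "Bran"],
                      "subject_id": ["Maths", "Social", "Science"]})

counts = {}
for how in ["left", "right", "outer", "inner"]:
    merged = pd.merge(left, right, on="subject_id", how=how,
                      suffixes=("_left", "_right"),   # disambiguate the two Name columns
                      indicator=True)                 # adds a _merge column: left_only, right_only or both
    counts[how] = len(merged)
    print(how)
    print(merged)

print(counts)
```

A left join keeps every row of the left frame, a right join every row of the right frame, an outer join keeps all rows from both, and an inner join keeps only the keys present in both.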
Data Cleaning
In [324]:
# Data cleaning
from pandas import DataFrame
# Creating data frame with NA
import numpy as np
import pandas as pd
dframe = DataFrame([[1,2,3],[np.nan,5,6],[7,np.nan,9],[np.nan,np.nan,np.nan]])
dframe
Out[324]:
0 1 2
In [325]:
# Dropping NA
clean_dframe = dframe.dropna()
clean_dframe
Out[325]:
0 1 2
In [326]:
Out[326]:
0 1 2
In [55]:
import pandas as pd
# Creating a dataframe with NA
dframe2 = pd.DataFrame([[1,2,3,np.nan],[2,np.nan,5,6],[np.nan,7,np.nan,9],
                        [1,np.nan,np.nan,np.nan]])
print("Original dataframe")
print(dframe2)
Original dataframe
0 1 2 3
0 1.0 2.0 3.0 NaN
1 2.0 NaN 5.0 6.0
2 NaN 7.0 NaN 9.0
3 1.0 NaN NaN NaN
In [57]:
In [58]:
In [328]:
Original dataframe
0 1 2 3
0 1.0 2.0 3.0 NaN
1 2.0 NaN 5.0 6.0
2 NaN 7.0 NaN 9.0
3 1.0 NaN NaN NaN
In [329]:
# fillna returns a new dataframe; the result is not assigned here, so dframe2 still contains NaN
dframe2.fillna({0:0,1:1,2:2,3:3})
print(dframe2)
dframe2
0 1 2 3
0 1.0 2.0 3.0 NaN
1 2.0 NaN 5.0 6.0
2 NaN 7.0 NaN 9.0
3 1.0 NaN NaN NaN
Out[329]:
0 1 2 3
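As a sketch of the cleaning options discussed in this section, the code below rebuilds the same 4x4 frame and applies dropna and fillna to it. The thresh argument and the result variable names are additions for illustration.

```python
import numpy as np
import pandas as pd

dframe = pd.DataFrame([[1, 2, 3, np.nan],
                       [2, np.nan, 5, 6],
                       [np.nan, 7, np.nan, 9],
                       [1, np.nan, np.nan, np.nan]])

# Drop every row that contains at least one NaN (here: all of them)
any_na = dframe.dropna()

# Keep only rows with at least 3 non-NaN values
at_least_3 = dframe.dropna(thresh=3)

# Fill NaN with a different value per column: {column label: replacement}
filled = dframe.fillna({0: 0, 1: 1, 2: 2, 3: 3})

# Both methods return new objects; dframe itself still contains its NaN values
remaining_nan = int(dframe.isna().sum().sum())

print(any_na)
print(at_least_3)
print(filled)
print(remaining_nan)
```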
Reshaping Pandas
Reshaping is done using the stack and unstack functions.
The stack() function in pandas converts the data into stacked format, i.e. the column labels are moved into the row index.
When there is more than one level of column headers, a specific level can be stacked by passing the
level argument.
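A minimal sketch of stack and unstack on a two-level column header, using a cut-down, two-player version of the runs table (the result variable names are illustrative):

```python
import pandas as pd

header = pd.MultiIndex.from_product([["2017", "2018"], ["IPL", "Ranji"]])
df = pd.DataFrame([[212, 145, 267, 156], [278, 189, 145, 167]],
                  index=["Virat", "Rohit"], columns=header)

# Default: the innermost column level (IPL/Ranji) moves into the row index
stacked = df.stack()

# unstack reverses it: the innermost row level moves back into the columns
restored = stacked.unstack()

# level=0 stacks the outer column level (the year) instead
by_year = df.stack(level=0)

print(stacked)
print(restored)
print(by_year)
```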
In [330]:
import pandas as pd
import numpy as np
header = pd.MultiIndex.from_product([['2017','2018'],['IPL','Ranji']])
runs=([[212,145,267,156],[278,189,145,167],[345,267,189,390],[167,144,156,355]])
df = pd.DataFrame(runs,
index=['Virat','Rohit','Sachin','Ganguly'],
columns=header)
df
Out[330]:
2017 2018
In [331]:
Out[331]:
In [332]:
Out[332]:
In [333]:
stacked_df=df.stack()
stacked_df
Out[333]:
2017 2018
In [334]:
Out[334]:
2017 2018
In [335]:
df['2017']
Out[335]:
IPL Ranji
In [336]:
df = DataFrame(np.arange(16).reshape(4,4),
index=[['a','a','b','b'],[1,2,1,2]],
columns=[['Delhi','Delhi','Mum','Chn'],['cold','hot','hot',
'cold']])
df
Out[336]:
a 1 0 1 2 3
2 4 5 6 7
b 1 8 9 10 11
2 12 13 14 15
In [337]:
Out[337]:
INDEX_1 INDEX_2
a 1 0 1 2 3
2 4 5 6 7
b 1 8 9 10 11
2 12 13 14 15
In [338]:
df.columns.names = ['Cities','Temp']
df
Out[338]:
INDEX_1 INDEX_2
a 1 0 1 2 3
2 4 5 6 7
b 1 8 9 10 11
2 12 13 14 15
In [339]:
df.swaplevel('Cities','Temp',axis=1)
Out[339]:
INDEX_1 INDEX_2
a 1 0 1 2 3
2 4 5 6 7
b 1 8 9 10 11
2 12 13 14 15
In [340]:
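# Note: the level argument of sum() was removed in pandas 2.0;
# there, df.T.groupby(level='Temp').sum().T gives the same result.
df.sum(level='Temp',axis=1)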
Out[340]:
INDEX_1 INDEX_2
a 1 3 3
2 11 11
b 1 19 19
2 27 27
In [341]:
Out[341]:
0 India
5 China
10 Malaysia
dtype: object
In [342]:
newrange = range(15)
#Forward fill index
newdf2.reindex(newrange, method='ffill')
Out[342]:
0 India
1 India
2 India
3 India
4 India
5 China
6 China
7 China
8 China
9 China
10 Malaysia
11 Malaysia
12 Malaysia
13 Malaysia
14 Malaysia
dtype: object
In [343]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
from numpy.random import randn
dframe = DataFrame(randn(25).reshape((5,5)),index=['A','B','D','E','F'],columns=
['col1','col2','col3','col4','col5'])
dframe
dframe2 = dframe.reindex(['A','B','C','D','E','F'])
new_columns = ['col1','col2','col3','col4','col5','col6']
dframe2.reindex(columns=new_columns)
Out[343]:
In [344]:
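# Note: from pandas 1.0, .loc raises a KeyError when any listed label is missing ('C' and 'col6' here);
# use reindex, as in the previous cell, to add missing labels.
dframe.loc[['A','B','C','D','E','F'],new_columns]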
Out[344]:
In [345]:
dframe2.drop('C')
dframe2=dframe2.drop('C')
dframe2
Out[345]:
In [346]:
#axis=0 is default
dframe2.drop('col5',axis=1)
Out[346]:
The first step in any data science project is to import the data. Pandas provides a wide range of input/output
formats. A few of them are given below
delimited files
SQL database
Excel
HDF5
json
html
pickle
sas
stata
Note
The file path can be an absolute path; if the file is in the working directory, the file name alone is sufficient.
Any valid string path is acceptable. The string could also be a URL. Valid URL schemes include http, ftp, s3, and
file.
For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv
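A self-contained sketch of the two path styles. It writes a small CSV file to a temporary directory first; the file content and the variable names are illustrative.

```python
import os
import tempfile
import pandas as pd

# Write a small CSV file to a temporary directory so the sketch can run anywhere
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "data.csv")
with open(csv_path, "w") as f:
    f.write("id,Name,role\n1,Rahul,Batsman\n2,Sachin,All rounder\n")

# Absolute path
df_abs = pd.read_csv(csv_path)

# File name only: resolved against the current working directory
old_cwd = os.getcwd()
os.chdir(tmp_dir)
try:
    df_rel = pd.read_csv("data.csv")
finally:
    os.chdir(old_cwd)   # always restore the working directory

print(df_abs)
print(df_rel)
```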
In [347]:
Out[347]:
In [348]:
# Load the JSON file into a data frame
df = pd.read_json(url, orient='columns')
df.head(n=5)
Out[348]:
0 5 2015-01-01 00:00:00 0
1 5 2015-01-01 00:00:01 0
10 5 2015-01-01 00:00:10 0
11 5 2015-01-01 00:00:11 0
12 8 2015-01-01 00:00:12 0
In [349]:
# Load the first sheet of the Excel file into a data frame
df = pd.read_excel(filePath, sheet_name=0, header=0)
Out[349]:
In [352]:
df
Out[352]:
id Name role
0 1 Rahul Batsman
2 3 VVS Batsman
4 5 Anil Bowler
In [360]:
df1 = pd.read_csv('data.csv',sep=",")
df1.head()
Out[360]:
id Name role
0 1 Rahul Batsman
2 3 VVS Batsman
4 5 Anil Bowler
Create a connection to the database, using the username and password of your MySQL database
# Creating a connection
import mysql.connector
mydb = mysql.connector.connect(
host="127.0.0.1",
port="3306",
user="root",
passwd="root"
)
Data Export
Similar to import, Pandas provides a wide range of options to export the data into various output formats. A few
of them are given below
delimited files
SQL database
Excel
HDF5
json
html
pickle
sas
stata
In [362]:
In [363]:
In [364]:
orient : string
Indication of expected JSON string format.
‘split’ : dict like {‘index’ -> [index], ‘columns’ -> [columns], ‘data’ -> [values]}
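A short sketch comparing three orient values on a hypothetical two-row frame. The output is parsed back with the json module only so the structure of each format is easy to see.

```python
import json
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "Name": ["Rahul", "Sachin"]})

# Without a file path, to_json returns the JSON text as a string
as_columns = json.loads(df1.to_json(orient="columns"))   # {column -> {index -> value}}
as_records = json.loads(df1.to_json(orient="records"))   # [{column -> value}, ...]
as_split = json.loads(df1.to_json(orient="split"))       # {"columns": [...], "index": [...], "data": [...]}

print(as_columns)
print(as_records)
print(as_split)
```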
In [365]:
df1.to_json("data.json",orient="columns")
# Creating a connection
import mysql.connector
mydb = mysql.connector.connect(
host="127.0.0.1",
port="3306",
user="root",
passwd="root"
)
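# This will insert the df1 dataframe into the MySQL table tablename, replacing the table if it exists.
# Note: pandas to_sql expects a SQLAlchemy engine/connection (or a sqlite3 connection) as con,
# so a SQLAlchemy engine for the same database is needed in place of the raw mysql.connector connection.
df1.to_sql('tablename', con=mydb,if_exists="replace")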
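Because a MySQL server may not be at hand, the sketch below uses an in-memory SQLite database from the standard library as a stand-in, which is an assumption made only so the example can run anywhere. The to_sql call is otherwise the same; the frame content is illustrative.

```python
import sqlite3
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2], "Name": ["Rahul", "Sachin"]})

# In-memory SQLite database, standing in for the MySQL server
con = sqlite3.connect(":memory:")

# if_exists="replace" drops and recreates the table, so calling it twice does not duplicate rows
df1.to_sql("tablename", con=con, if_exists="replace", index=False)
df1.to_sql("tablename", con=con, if_exists="replace", index=False)

# Read the table back to check what was stored
back = pd.read_sql("SELECT * FROM tablename", con)
con.close()

print(back)
```

Other values for if_exists are "fail" (the default) and "append".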
In [366]:
In [367]:
#Persist a model
# sklearn.externals.joblib was removed in scikit-learn 0.23; import joblib directly instead
import joblib
joblib.dump(dict_a, 'sample.joblib')
Out[367]:
['sample.joblib']
In [368]:
Out[368]: