
Traversing Dataframe Elements Using: Iterrows, Iteritems and Itertuples


Part 2: DataFrame Continued

5. Traversing DataFrame Elements using iterrows(), iteritems() and itertuples()

To iterate over a DataFrame, we can use the following functions −

iteritems() − iterates over the columns as (column name, Series) pairs. It was removed in pandas 2.0; use items() there instead.

iterrows() − iterates over the rows as (index, Series) pairs

itertuples() − iterates over the rows as named tuples

Let's take an example:

>>> import pandas as pd
>>> import numpy as np
>>> d = {'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'], 'Age': [23, 56, 54, 34]}
>>> df = pd.DataFrame(d)
>>> print(df)
      Name  Age
0  Shalini   23
1   Varsha   56
2   Shanti   54
3    Madhu   34

5.1 USE of iterrows()


>>> print('ITER ROWS')
>>> for key, values in df.iterrows():   # one (index, Series) pair per row
...     for val in values:
...         print('Hello', val)

ITER ROWS
Hello Shalini
Hello 23
Hello Varsha
Hello 56
Hello Shanti
Hello 54
Hello Madhu
Hello 34
5.2 USE of iteritems()

>>> print('ITER ITEMS')
>>> for key, values in df.iteritems():   # use df.items() on pandas 2.0 and later
...     for val in values:
...         print('Hello', val)

ITER ITEMS
Hello Shalini
Hello Varsha
Hello Shanti
Hello Madhu
Hello 23
Hello 56
Hello 54
Hello 34
www.pythonclassroomdiary.wordpress.com by Sangeeta M Chauhan, PGT CS , KV NO.3 Gwalior
5.3 USE of itertuples()
>>> print('ITER TUPLES')
>>> for rows in df.itertuples():   # one named tuple per row
...     print(rows)

ITER TUPLES
Pandas(Index=0, Name='Shalini', Age=23)
Pandas(Index=1, Name='Varsha', Age=56)
Pandas(Index=2, Name='Shanti', Age=54)
Pandas(Index=3, Name='Madhu', Age=34)
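
The three traversal functions above can be compared side by side with a short script. This is only a sketch: it rebuilds the same df as in the example, and it uses items() in place of iteritems() on the assumption that a recent pandas version (2.0 or later) is installed. The variable names column_names, row_names and ages are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Shalini', 'Varsha', 'Shanti', 'Madhu'],
                   'Age': [23, 56, 54, 34]})

# items() yields one (column name, Series) pair per column.
# It replaces iteritems() in pandas 2.0 and later.
column_names = [name for name, column in df.items()]

# iterrows() yields one (index, Series) pair per row.
row_names = [row['Name'] for index, row in df.iterrows()]

# itertuples() yields one named tuple per row; fields are read as attributes.
ages = [row.Age for row in df.itertuples()]

print(column_names)
print(row_names)
print(ages)
```

Note that items() walks the DataFrame column by column, while iterrows() and itertuples() walk it row by row.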

6. Binary Operations in a DataFrame (add, sub, mul, div, radd, rsub):

Let's take DataFrames with numeric data. The following creates three DataFrames named dfr1, dfr2 and dfr3:

>>> s1 = [[1, 2, 3], [4, 5, 6]]
>>> s2 = [[3, 2, 5], [5, 7, 8]]
>>> s3 = [[5, 5, 5], [4, 4, 4]]
>>> dfr1 = pd.DataFrame(s1)
>>> dfr2 = pd.DataFrame(s2)
>>> dfr3 = pd.DataFrame(s3)

6.1 ADDITION
>>> dfr1
   0  1  2
0  1  2  3
1  4  5  6

>>> dfr2
   0  1  2
0  3  2  5
1  5  7  8

>>> dfr3
   0  1  2
0  5  5  5
1  4  4  4

A single value or another DataFrame can be added to a DataFrame.

>>> dfr1+2
   0  1  2
0  3  4  5
1  6  7  8

Here 2 is added to each element of DataFrame dfr1.



>>> dfr1+dfr2
   0   1   2
0  4   4   8
1  9  12  14

The corresponding elements of dfr1 and dfr2 are added.

>>> dfr1.add(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

It adds the corresponding elements of dfr1 and dfr2 (dfr1 + dfr2).

>>> dfr1.radd(dfr2)
   0   1   2
0  4   4   8
1  9  12  14

Here 'r' stands for reverse: the operands are swapped, so the result is dfr2 + dfr1.

>>> dfr3+dfr1+dfr2
    0   1   2
0   9   9  13
1  13  16  18

It adds the corresponding elements of dfr1, dfr2 and dfr3.

6.2 SUBTRACTION

>>> dfr1-dfr2
   0  1  2
0 -2  0 -2
1 -1 -2 -2

It subtracts each element of dfr2 from the corresponding element of dfr1.

>>> dfr1.sub(dfr2)
   0  1  2
0 -2  0 -2
1 -1 -2 -2

This gives the same result as dfr1 - dfr2.

>>> dfr1.rsub(dfr2)
   0  1  2
0  2  0  2
1  1  2  2

Here 'r' stands for reverse: each element of dfr1 is subtracted from the corresponding element of dfr2 (dfr2 - dfr1).

>>> dfr1-2
   0  1  2
0 -1  0  1
1  2  3  4

Here 2 is subtracted from each element of DataFrame dfr1.



In the same way, multiplication can be done with the * operator or the mul() function, and division with the / operator or the div() function.
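
Multiplication and division are not shown in the listings above, so here is a small sketch using the same dfr1 and dfr2 values. The variable names are made up for illustration, and rdiv() is included on the assumption that the reader wants the reversed form as with radd() and rsub().

```python
import pandas as pd

dfr1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]])
dfr2 = pd.DataFrame([[3, 2, 5], [5, 7, 8]])

# Element-wise multiplication: operator form and function form.
product_op = dfr1 * dfr2
product_fn = dfr1.mul(dfr2)

# Element-wise division by a single value: operator form and function form.
quotient_op = dfr1 / 2
quotient_fn = dfr1.div(2)

# Reversed division: computes 2 / dfr1 instead of dfr1 / 2.
reverse_quotient = dfr1.rdiv(2)

print(product_op)
print(quotient_op)
print(reverse_quotient)
```

Division always produces floating point results, even when both operands hold integers.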

7. Matching and Broadcasting Operations:

7.1 Matching : Whenever we perform arithmetic operations on DataFrames, the data is first aligned on the basis of matching indexes (row and column labels) and then the arithmetic is performed. For non-overlapping indexes the result of the operation is NaN. This default behavior of data alignment on the basis of matching indexes is known as MATCHING.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
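
If NaN is not wanted for the non-overlapping column, the add() function accepts a fill_value argument. The sketch below reuses the dfr1 and dfr2 values from the matching example; the names plain and filled are made up for illustration.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])
dfr2 = pd.DataFrame([[34, 4], [4, 6]])

# Default matching: column 2 has no partner in dfr2, so the result is NaN there.
plain = dfr1 + dfr2

# With fill_value=0 the cells missing from dfr2 are treated as 0 before adding.
filled = dfr1.add(dfr2, fill_value=0)

print(plain)
print(filled)
```

With fill_value=0, column 2 of the result simply keeps the values of dfr1.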

7.2 Broadcasting : Enlarging the smaller object in a binary operation by replicating its elements so as to match the shape of the larger object.

import pandas as pd
s1=[[21,52,43],[41,55,66]]
s2=[[34,4],[4,6]]
dfr1=pd.DataFrame(s1)
dfr2=pd.DataFrame(s2)
print('Data Frame 1')
print(dfr1)
print('Data Frame 2')
print(dfr2)
print('Matching is done')
print(dfr1+dfr2)
print('Broadcasting is done')
s3=[3,4,5]
print(dfr1+s3)

Output

Data Frame 1
    0   1   2
0  21  52  43
1  41  55  66
Data Frame 2
    0  1
0  34  4
1   4  6
Matching is done
    0   1   2
0  55  56 NaN
1  45  61 NaN
Broadcasting is done
    0   1   2
0  24  56  48
1  44  59  71
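
In the example above the list s3 has one value per column, so it is repeated for every row. A value per row can be broadcast as well by using add() with axis=0. The following is a sketch only; the Series per_row and its values 100 and 200 are hypothetical.

```python
import pandas as pd

dfr1 = pd.DataFrame([[21, 52, 43], [41, 55, 66]])

# A list with one value per column is repeated for every row.
by_columns = dfr1 + [3, 4, 5]

# A Series with one value per row is repeated for every column
# when add() is told to match on the row index (axis=0).
per_row = pd.Series([100, 200])
by_rows = dfr1.add(per_row, axis=0)

print(by_columns)
print(by_rows)
```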

8. Handling Missing Data :
As data comes in many shapes and forms, pandas aims to be flexible with regard to handling
missing data. While NaN is the default missing value marker for reasons of computational speed
and convenience, we need to be able to easily detect this value with data of different types:
floating point, integer, boolean, and general object. In many cases, however, the
Python None will arise and we wish to also consider that “missing” or “not available” or “NA”.

Function Name                Use

isnull()                     Returns True or False for each value in a pandas object, depending on whether it is a missing value
notnull()                    Returns True or False for each value in a pandas object, depending on whether it is a data value
dropna()                     Removes (drops) all the rows which contain a NaN value anywhere in the row
dropna(how='all')            Removes only those rows in which all the values are NaN
fillna(<dictionary values>)  Fills missing values with the values specified in the dictionary

>>>import pandas as pd
>>>KV_shift1={'Computer':[20,25,22,50],'Projectors':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>>dfr1=pd.DataFrame(KV_shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>>print(dfr1)
>>>KV_shift2={'Computer':[20,25,22,50],'Visualizers':[1,1,1,14],'iPad':[1,1,1,7],'AppleTv':[1,1,1,7]}
>>>dfr2=pd.DataFrame(KV_shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
>>>print(dfr2)
>>>KV3Gwl=dfr1+dfr2
>>>print(KV3Gwl)

DataFrame : dfr1
            Computer  Projectors  iPad  AppleTv
SrCompLab         20           1     1        1
SecCompLab        25           1     1        1
PriLab            22           1     1        1
Others            50          14     7        7

DataFrame : dfr2
            Computer  Visualizers  iPad  AppleTv
SrCompLab         20            1     1        1
SecCompLab        25            1     1        1
PriLab            22            1     1        1
Others            50           14     7        7

DataFrame : KV3Gwl
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         NaN          NaN     2
SecCompLab        2        50         NaN          NaN     2
PriLab            2        44         NaN          NaN     2
Others           14       100         NaN          NaN    14

8.1 Use of isnull() and notnull()
print('ISNULL ()')
print(KV3Gwl.isnull())
print('NOTNULL ()')
print(KV3Gwl.notnull())

isnull() gives True if the corresponding element contains NaN.
notnull() gives True if the corresponding element contains data.

ISNULL ()
            AppleTv  Computer  Projectors  Visualizers   iPad
SrCompLab     False     False        True         True  False
SecCompLab    False     False        True         True  False
PriLab        False     False        True         True  False
Others        False     False        True         True  False

NOTNULL ()
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab      True      True       False        False  True
SecCompLab     True      True       False        False  True
PriLab         True      True       False        False  True
Others         True      True       False        False  True

8.2 Use of dropna()

For this example, suppose KV3Gwl holds the following values, with only two NaN entries:

>>> KV3Gwl
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         NaN          5.0     2
PriLab            2        44         7.0          NaN     2
Others           14       100         5.0          2.0    14

>>> KV3Gwl.dropna()
           AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab        2        40         6.0          2.0     2
Others          14       100         5.0          2.0    14

8.3 Use of fillna()


>>> KV3Gwl.fillna({'Projectors':0,'Visualizers':0})
            AppleTv  Computer  Projectors  Visualizers  iPad
SrCompLab         2        40         6.0          2.0     2
SecCompLab        2        50         0.0          5.0     2
PriLab            2        44         7.0          0.0     2
Others           14       100         5.0          2.0    14
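
The table of functions mentions dropna(how='all'), which has no example above. The sketch below shows how it differs from plain dropna(). The DataFrame stock and its row labels A, B and C are hypothetical data made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 'B' is entirely missing, row 'C' is partly missing.
stock = pd.DataFrame({'Projectors': [6.0, np.nan, 7.0],
                      'Visualizers': [2.0, np.nan, np.nan]},
                     index=['A', 'B', 'C'])

# dropna() removes every row that has at least one NaN: only 'A' remains.
any_missing_dropped = stock.dropna()

# dropna(how='all') removes a row only if all its values are NaN: 'B' goes.
all_missing_dropped = stock.dropna(how='all')

# fillna() with a dictionary can use a different fill value for each column.
filled = stock.fillna({'Projectors': 0, 'Visualizers': 1})

# isnull() combined with sum() counts the missing values in each column.
missing_per_column = stock.isnull().sum()

print(any_missing_dropped)
print(all_missing_dropped)
print(filled)
print(missing_per_column)
```

None of these calls changes stock itself; each one returns a new object.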

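9. Comparison among Pandas Objects (Series, DataFrame)

We can compare pandas objects using the == operator or the equals() function. The difference between the two is that == compares each element of the first DataFrame with the corresponding element of the second and returns a DataFrame of True/False values, in which NaN is never equal to NaN. The equals() function returns a single True or False for the whole object and treats NaN values in the same location as equal.

Let's make this clear with the following example: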

import pandas as pd
import numpy as np

KV_Shift1={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr1=pd.DataFrame(KV_Shift1,index=['SrCompLab','SecCompLab','PriLab','Others'])
KV_Shift2={'Computer':[20,25,22,50],'Projectors':[1,1,np.nan,14],'iPad':[np.nan,1,2,7],'AppleTv':[1,1,1,7]}
dfr2=pd.DataFrame(KV_Shift2,index=['SrCompLab','SecCompLab','PriLab','Others'])
print('Data Frame 1 :')
print(dfr1)
print('Data Frame 2 :')
print(dfr2)
print('Checking Equality using == operator')
print(dfr1==dfr2)   # element-wise comparison; NaN == NaN gives False
print('Checking Equality using equals() function')
print(dfr1.equals(dfr2))   # single result; NaN in the same position counts as equal

Data Frame 1 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7

Data Frame 2 :
            Computer  Projectors  iPad  AppleTv
SrCompLab         20         1.0   NaN        1
SecCompLab        25         1.0   1.0        1
PriLab            22         NaN   2.0        1
Others            50        14.0   7.0        7

Checking Equality using == operator
            Computer  Projectors   iPad  AppleTv
SrCompLab       True        True  False     True
SecCompLab      True        True   True     True
PriLab          True       False   True     True
Others          True        True   True     True

Checking Equality using equals() function
True
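
The different treatment of NaN by == and equals() can be checked on a much smaller DataFrame. This is a sketch under the assumption that both objects hold identical data; the names a, b, elementwise, all_cells_equal and whole_object are made up for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'iPad': [np.nan, 1.0], 'AppleTv': [1, 7]})
b = a.copy()

# == compares cell by cell; the NaN cell compares as False even though
# both DataFrames hold NaN in that position.
elementwise = (a == b)
all_cells_equal = bool(elementwise.all().all())

# equals() gives one result for the whole object and treats NaN values
# in the same position as equal.
whole_object = a.equals(b)

print(elementwise)
print(all_cells_equal)
print(whole_object)
```

So reducing the result of == to a single value is not a safe test for equality when the data contains NaN; equals() is the better choice for that purpose.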

10. Boolean Reduction :

With Boolean reduction, you can get the overall result for a row or a column as a single True or False. For this purpose pandas offers the following Boolean reduction functions and attributes:

10.1 empty : Tells whether the DataFrame is empty.
10.2 any() : Returns True if any element is True over the requested axis.
10.3 all() : Returns True if all the values over the requested axis are True, that is, if all of them satisfy the condition.



import pandas as pd
import numpy as np
df1=pd.DataFrame()
s1=[[2,5,8],[10,5,2]]
df2=pd.DataFrame(s1)

if df1.empty:
    print('Data1 Frame is Empty')
if df2.empty:
    print('Data Frame2 is Empty')
else:
    print('Data Frame2 is not Empty')

print('Data Frame')
print(df2)
print('Used function all()')
print((df2<5).all())
print('Used function any()')
print((df2<5).any())

Output

Data1 Frame is Empty
Data Frame2 is not Empty
Data Frame
    0  1  2
0   2  5  8
1  10  5  2
Used function all()
0    False
1    False
2    False
dtype: bool
Used function any()
0     True
1    False
2     True
dtype: bool
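
In the output above, any() and all() give one result per column because that is the default axis. To get one result per row, axis=1 can be passed. The sketch below reuses the df2 values from the example; the variable names are made up for illustration.

```python
import pandas as pd

df2 = pd.DataFrame([[2, 5, 8], [10, 5, 2]])
below_five = df2 < 5

# axis=1 reduces across the columns, giving one True/False per row.
any_per_row = below_five.any(axis=1)
all_per_row = below_five.all(axis=1)

# Applying any() twice reduces the whole DataFrame to a single value.
any_overall = bool(below_five.any().any())

print(any_per_row)
print(all_per_row)
print(any_overall)
```

Each row of df2 has exactly one value below 5, so any() is True and all() is False for both rows.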
