Chapter-2 Python Pandas
Introduction :
Selecting/Accessing a Column :
Selecting/Accessing a Subset from a DataFrame using
Row/Column Names :
Syntax :
<DataFrameObject>.loc[<startrow>:<endrow>]
<DataFrameObject>.loc[<startrow>:<endrow>, <startcolumn>:<endcolumn>]
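The two forms of the syntax above can be tried on a small dataframe. This is a minimal sketch; the student names, subjects and marks are hypothetical and only serve as row and column labels. Note that with loc both the start label and the end label are included in the result.

```python
import pandas as pd

# Hypothetical marks table; names and subjects are illustrative only.
df = pd.DataFrame(
    {
        "Maths": [36, 40, 30, 32],
        "Science": [45, 38, 41, 29],
        "English": [50, 44, 39, 47],
    },
    index=["Amit", "Bina", "Chetan", "Divya"],
)

# Rows from "Bina" to "Chetan" (both included), all columns.
rows = df.loc["Bina":"Chetan"]

# Same rows, but only the columns from "Maths" to "Science" (both included).
block = df.loc["Bina":"Chetan", "Maths":"Science"]

print(rows)
print(block)
```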
Parameters :
mode() – Returns the mode value (i.e., the value that appears
most often) from a set of values.
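A short sketch of mode() on a Series and on a DataFrame, using made-up values. Because more than one value can be tied for "most often", mode() returns a Series (or a DataFrame) rather than a single number.

```python
import pandas as pd

# Illustrative values: 36 appears twice, every other value once.
marks = pd.Series([36, 40, 30, 36, 32])
marks_mode = marks.mode()      # a Series holding the most frequent value(s)
print(marks_mode)

# On a DataFrame, mode() is computed separately for each column.
df = pd.DataFrame({"A": [1, 2, 2, 3], "B": [5, 5, 6, 7]})
col_modes = df.mode()
print(col_modes)
```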
Parameters :
<dataframe>[<column name>]
<dataframe>.loc[<row index>, :]
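The two access forms above can be sketched as follows: square brackets with a column name give one column, and loc with a row label followed by a colon gives one complete row. The labels and numbers are hypothetical.

```python
import pandas as pd

# Hypothetical data used only to illustrate the two access forms.
df = pd.DataFrame(
    {"Maths": [36, 40, 30], "Science": [45, 38, 41]},
    index=["Amit", "Bina", "Chetan"],
)

maths = df["Maths"]            # one column, returned as a Series
bina = df.loc["Bina", :]       # one row (all columns), returned as a Series

print(maths)
print(bina)
```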
1. pivoting
2. sorting
3. aggregation
1. Pivoting : Pivoting is a summary technique that works on
tabular data (i.e., data in rows and columns). It rearranges
the data, possibly aggregating data from multiple sources, into
a report form (with rows transposed into columns) so that the
data can be viewed from a different perspective.
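A minimal sketch of pivoting with DataFrame.pivot(). The tutoring records below are hypothetical (the column names Tutor, Country and Classes are assumed for illustration); each Tutor/Country pair occurs only once, which pivot() requires.

```python
import pandas as pd

# Hypothetical one-quarter records: one row per tutor and country.
data = pd.DataFrame(
    {
        "Tutor": ["Tahira", "Gurjyot", "Anusha", "Tahira"],
        "Country": ["USA", "UK", "Japan", "UK"],
        "Classes": [28, 36, 12, 20],
    }
)

# Tutors become the row labels, countries become the column labels.
by_tutor = data.pivot(index="Tutor", columns="Country", values="Classes")
print(by_tutor)   # cells with no matching record are shown as NaN
```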
Now change the rows and columns, i.e., interchange the index
and columns arguments.
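Interchanging the two arguments can be sketched as below, again on hypothetical tutoring records with assumed column names. The result is the same table turned on its side: countries as rows and tutors as columns.

```python
import pandas as pd

# Hypothetical records; column names are assumed for illustration.
records = pd.DataFrame(
    {
        "Tutor": ["Tahira", "Gurjyot", "Tahira"],
        "Country": ["USA", "UK", "UK"],
        "Classes": [28, 36, 20],
    }
)

# index and columns swapped: countries are now the rows.
by_country = records.pivot(index="Country", columns="Tutor", values="Classes")
print(by_country)
```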
The above data is for one quarter only. The online tutoring
company has data for the entire year as shown below :
The index, i.e., the rows, is specified as ‘Tutor’ and the
columns as ‘Country’. There are multiple entries for the same
tutor and the same country, each with different values.
Try to create a row for tutor Tahira from the above data with
columns as Country.
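When the same Tutor/Country pair occurs more than once, pivot() cannot fit the values into a single cell and raises a ValueError, whereas pivot_table() combines them with an aggregation function. The sketch below uses hypothetical yearly records (not the company's actual data) and assumes that summing the entries is the desired aggregation.

```python
import pandas as pd

# Hypothetical yearly records: Tahira has two entries for USA.
yearly = pd.DataFrame(
    {
        "Tutor": ["Tahira", "Tahira", "Tahira", "Gurjyot"],
        "Country": ["USA", "USA", "UK", "UK"],
        "Classes": [28, 15, 20, 36],
    }
)

# pivot() fails because (Tahira, USA) is duplicated.
pivot_failed = False
try:
    yearly.pivot(index="Tutor", columns="Country", values="Classes")
except ValueError as err:
    pivot_failed = True
    print("pivot() failed:", err)

# pivot_table() aggregates the duplicates (here: by summing them).
summary = yearly.pivot_table(
    index="Tutor", columns="Country", values="Classes", aggfunc="sum"
)

# One row for tutor Tahira, with countries as columns.
tahira_row = summary.loc[["Tahira"]]
print(tahira_row)
```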
37 28 38 44 53 69 74 53 35 38 66 46 24 45 92 48 51 62 58 57
Bin       Frequency   Ages included in Bin
20-30     2           28, 24
30-40     4           37, 38, 35, 38
40-50     4           44, 46, 45, 48
50-60     5           53, 53, 51, 58, 57
60-70     3           69, 66, 62
70-80     1           74
80-90     0           --
90-100    1           92
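The bin frequencies for the list of ages above can be checked with pd.cut(). This is a sketch under one assumption: each bin includes its lower edge and excludes its upper edge (for example, 30 would fall in 30-40). No age in the list lies exactly on an edge, so this choice does not affect the counts here.

```python
import pandas as pd

# The 20 ages listed above.
ages = [37, 28, 38, 44, 53, 69, 74, 53, 35, 38,
        66, 46, 24, 45, 92, 48, 51, 62, 58, 57]

# Bin edges 20, 30, ..., 100; right=False makes each bin [low, high).
edges = list(range(20, 101, 10))
binned = pd.cut(ages, bins=edges, right=False)

# Count the ages in each bin, keeping the bins in ascending order.
freq = pd.Series(binned).value_counts().sort_index()
print(freq)
```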
Function Application : A function (a library function or a
user-defined function) may be applied on a dataframe in
multiple ways:
Syntax - <dataframe>.apply(<funcname>, axis = 0)
(applies the function to each column)
Syntax - <dataframe>.applymap(<funcname>)
(applies the function to each element)
To use apply() row-wise, write :
<dataframe>.apply(<func>, axis = 1)
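A short sketch of these calls on made-up numbers. value_range is a hypothetical user-defined function, and np.sqrt stands in for a library function. To my understanding, newer pandas releases rename applymap() to DataFrame.map(), so the sketch uses map() when it is available and falls back to applymap() otherwise.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 4, 9], "B": [16, 25, 36]})

def value_range(values):
    """User-defined function: largest value minus smallest value."""
    return values.max() - values.min()

# axis=0: the function receives one column at a time.
col_range = df.apply(value_range, axis=0)

# axis=1: the function receives one row at a time.
row_range = df.apply(value_range, axis=1)

# Element-wise application: the function receives one value at a time.
elementwise = df.map if hasattr(df, "map") else df.applymap
roots = elementwise(np.sqrt)

print(col_range)
print(row_range)
print(roots)
```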
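Mean :-
Values : 36, 40, 30, 32
(36 + 40 + 30 + 32) / 4 = 138 / 4 = 34.5
Median :-
Sorted values : 30, 32, 36, 40 (n = 4, an even count)
n/2 = 4/2 = 2, so the 2nd value is 32
(n/2) + 1 = (4/2) + 1 = 3, so the 3rd value is 36
Median = (32 + 36) / 2 = 34
Sum :-
36 + 40 + 30 + 32 = 138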
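The same three results can be reproduced with the corresponding pandas functions, as a quick check of the arithmetic for the values 36, 40, 30 and 32.

```python
import pandas as pd

marks = pd.Series([36, 40, 30, 32])

mean_value = marks.mean()       # sum of the values divided by their count
median_value = marks.median()   # average of the two middle values (n is even)
total = marks.sum()

print("Mean   :", mean_value)
print("Median :", median_value)
print("Sum    :", total)
```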
Adding indexes :
(iii) reindex_like() – for creating indexes/column-labels
based on another dataframe object.
<dataframe>.reindex_like(other)
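A minimal sketch of reindex_like() with two hypothetical dataframes. The result takes its row and column labels from the other dataframe; values are kept only where a label exists in both, labels missing in the original are filled with NaN, and labels not present in the other dataframe are dropped.

```python
import pandas as pd

# Hypothetical dataframes with partly overlapping labels.
df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}, index=[0, 1, 2])
df2 = pd.DataFrame({"A": [10, 20], "C": [30, 40]}, index=[1, 3])

# df2 is reshaped to have the same index and columns as df1.
result = df2.reindex_like(df1)
print(result)
```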
Solved Problems :
import pandas as pd
df1 = pd.DataFrame([1,2,3])
df2 = pd.DataFrame([[1,2,3]])
print("df1")
print(df1)
print("df2")
print(df2)
Ans :