Python Pandas-Data Frames

Uploaded by

Pari Khanuja

Data Frames

DataFrame

A DataFrame is a two-dimensional labelled data structure, similar to a table in MySQL.
It contains rows and columns, and therefore has both a row index and a column index.
Creation of DataFrame
(A) Creation of an empty DataFrame
>>> import pandas as pd
>>> dFrameEmt = pd.DataFrame()
>>> dFrameEmt
Empty DataFrame
Columns: []
Index: []
Creation of DataFrame
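import pandas as pd
# Each inner list becomes one row; the columns parameter supplies the column labels
a1 = [1, 2, 3, 4]
a2 = [10, 20, 30, 40]
a3 = [20, 45, 67, 8]
x = pd.DataFrame([a1, a2, a3], columns=['A', 'B', 'C', 'D'])
print(x)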
(B) Creation of DataFrame from NumPy ndarrays
>>> import numpy as np
>>> array1 = np.array([10,20,30])
>>> array2 = np.array([100,200,300])
>>> array3 = np.array([-10,-20,-30, -40])
>>> dFrame4 = pd.DataFrame(array1)
>>> dFrame4
    0
0  10
1  20
2  30
>>> dFrame5 = pd.DataFrame([array1, array3, array2], columns=['A', 'B', 'C', 'D'])
>>> dFrame5
     A    B    C      D
0   10   20   30    NaN
1  -10  -20  -30  -40.0
2  100  200  300    NaN
(C) Creation of DataFrame from List of Dictionaries
# Create list of dictionaries
>>> listDict = [{'a':10, 'b':20}, {'a':5, 'b':10, 'c':20}]
>>> dFrameListDict = pd.DataFrame(listDict)
>>> dFrameListDict
    a   b     c
0  10  20   NaN
1   5  10  20.0
Here, the dictionary keys are taken as column labels, and each dictionary supplies the values for one row.
There will be as many rows as the number of dictionaries present in the list.
The number of columns in the DataFrame is equal to the number of distinct keys across all dictionaries of the list.
A key that is missing from a dictionary gives NaN in that row.
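The column count follows the distinct keys across the whole list, not the size of any single dictionary. A minimal sketch with made-up data (the names records and df are only for illustration), where each dictionary has one key but the DataFrame has two columns:

```python
import pandas as pd

# Hypothetical data: two dictionaries that share no key
records = [{'a': 1}, {'b': 2}]
df = pd.DataFrame(records)

# One row per dictionary, one column per distinct key
print(df.shape)
# Each row is missing one key, so each row holds one NaN
print(df.isna().sum().sum())
```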
(D) Creation of DataFrame from Dictionary of Lists
>>> dictForest = {'State': ['Assam', 'Delhi', 'Kerala'], 'GArea': [78438, 1483, 38852], 'VDF': [2797, 6.72, 1663]}
>>> dFrameForest = pd.DataFrame(dictForest)
>>> dFrameForest
    State  GArea      VDF
0   Assam  78438  2797.00
1   Delhi   1483     6.72
2  Kerala  38852  1663.00
We can change the sequence of columns in a DataFrame.
>>> dFrameForest1 = pd.DataFrame(dictForest, columns=['State', 'VDF', 'GArea'])
>>> dFrameForest1
    State      VDF  GArea
0   Assam  2797.00  78438
1   Delhi     6.72   1483
2  Kerala  1663.00  38852
(E) Creation of DataFrame from Series
>>> seriesA = pd.Series([1,2,3,4,5], index=['a', 'b', 'c', 'd', 'e'])
>>> seriesB = pd.Series([1000,2000,-1000,-5000,1000], index=['a', 'b', 'c', 'd', 'e'])
>>> seriesC = pd.Series([10,20,-10,-50,100], index=['z', 'y', 'a', 'c', 'e'])
>>> dFrame6 = pd.DataFrame(seriesA)
>>> dFrame6
   0
a  1
b  2
c  3
d  4
e  5
>>> dFrame7 = pd.DataFrame([seriesA, seriesB])
>>> dFrame7
      a     b      c      d     e
0     1     2      3      4     5
1  1000  2000  -1000  -5000  1000
>>> dFrame8 = pd.DataFrame([seriesA, seriesC])
>>> dFrame8
       a    b      c    d      e     z     y
0    1.0  2.0    3.0  4.0    5.0   NaN   NaN
1  -10.0  NaN  -50.0  NaN  100.0  10.0  20.0
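When a list of Series is passed, each Series becomes one row and the Series labels become columns. A small sketch with made-up values; the Series names 'first' and 'second' are assumptions added here to show that named Series also supply the row labels:

```python
import pandas as pd

# Hypothetical Series with partly different labels
s1 = pd.Series([1, 2, 3], index=['a', 'b', 'c'], name='first')
s2 = pd.Series([10, 30, 40], index=['a', 'c', 'd'], name='second')

df = pd.DataFrame([s1, s2])

# Two rows (one per Series), four columns (all labels seen in either Series)
print(df.shape)
# 'b' is absent from s2, so that cell is NaN
print(df.loc['second', 'b'])
```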
(F) Creation of DataFrame from Dictionary of Series
>>> ResultSheet = {
...     'Arnab': pd.Series([90, 91, 97], index=['Maths','Science','Hindi']),
...     'Ramit': pd.Series([92, 81, 96], index=['Maths','Science','Hindi']),
...     'Samridhi': pd.Series([89, 91, 88], index=['Maths','Science','Hindi']),
...     'Riya': pd.Series([81, 71, 67], index=['Maths','Science','Hindi']),
...     'Mallika': pd.Series([94, 95, 99], index=['Maths','Science','Hindi'])}
>>> ResultDF = pd.DataFrame(ResultSheet)
>>> ResultDF
         Arnab  Ramit  Samridhi  Riya  Mallika
Maths       90     92        89    81       94
Science     91     81        91    71       95
Hindi       97     96        88    67       99

When a DataFrame is created from a Dictionary of Series, the resulting index
or row labels are a union of all series indexes used to create the DataFrame.
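The union of indexes is easier to see when the Series do not share all their labels. A minimal sketch with a cut-down, hypothetical version of the marks data:

```python
import pandas as pd

# Hypothetical data: the two Series only share the 'Science' label
marks = {
    'Arnab': pd.Series([90, 91], index=['Maths', 'Science']),
    'Riya': pd.Series([71, 67], index=['Science', 'Hindi']),
}
df = pd.DataFrame(marks)

# Row labels are the union of both indexes; missing marks become NaN
print(df)
```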
Operations on rows and columns in DataFrames
(A) Adding a New Column to a DataFrame
>>> ResultDF['Preeti'] = [89, 78, 76]
>>> ResultDF

Assigning values to a new column label that does not exist
will create a new column at the end. If the column label already
exists, the assignment replaces the values of that column instead:
>>> ResultDF['Ramit'] = [99, 98, 78]
>>> ResultDF

>>> ResultDF['Arnab'] = 90 # To change the value of the entire column
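The three kinds of column assignment can be run together on a small frame. This is a sketch on a two-student subset of the marks data (the name df is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96]},
                  index=['Maths', 'Science', 'Hindi'])

df['Preeti'] = [89, 78, 76]   # new label: a column is added at the end
df['Ramit'] = [99, 98, 78]    # existing label: the values are replaced
df['Arnab'] = 90              # single value: repeated for every row

print(df)
```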

(B) Adding a New Row to a DataFrame
We can add a new row to a DataFrame using DataFrame.loc[ ].
>>> ResultDF.loc['English'] = [85, 86, 83, 80, 90, 89]
>>> ResultDF
We cannot use this method to add a second row under an index label
that already exists. In such a case, the row with this index label
is updated. DataFrame.loc[ ] can therefore also be used to change
the data values of a row to a particular value.
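A short sketch of both behaviours of loc[ ] assignment, on a two-student subset of the marks data (the values are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96]},
                  index=['Maths', 'Science', 'Hindi'])

df.loc['English'] = [85, 86]   # label not present: a new row is added
df.loc['Maths'] = [95, 96]     # label already present: the row is updated
df.loc['Hindi'] = 0            # single value: set for every column of the row

print(df)
```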
If we try to add a row with fewer values than the number of
columns in the DataFrame, it results in a ValueError, with the error
message: ValueError: Cannot set a row with mismatched columns.
Similarly, if we try to add a column with fewer values than the
number of rows in the DataFrame, it results in a ValueError, with the
error message: ValueError: Length of values does not match length
of index. The exact wording varies between pandas versions; newer
releases also include the two lengths in the second message.
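Both errors can be reproduced safely by catching the ValueError. The sketch below uses a small made-up frame and only prints whatever message the installed pandas version produces:

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96],
                   'Riya': [81, 71, 67]},
                  index=['Maths', 'Science', 'Hindi'])

errors = []
try:
    df.loc['English'] = [85, 86]      # 2 values for 3 columns
except ValueError as exc:
    errors.append(str(exc))

try:
    df['Preeti'] = [89, 78]           # 2 values for 3 rows
except ValueError as exc:
    errors.append(str(exc))

for message in errors:
    print(message)
```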
Further, we can set all values of a DataFrame to a particular value,
for example:
>>> ResultDF[:] = 0 # Set all values in ResultDF to 0
>>> ResultDF
         Arnab  Ramit  Samridhi  Riya  Mallika  Preeti
Maths        0      0         0     0        0       0
Science      0      0         0     0        0       0
Hindi        0      0         0     0        0       0
English      0      0         0     0        0       0
(C) Deleting Rows or Columns from a DataFrame
We can use the DataFrame.drop() method to delete rows
and columns from a DataFrame.
We need to specify the labels to be dropped and the axis from
which they are to be dropped. To delete a row, the parameter axis
is assigned the value 0, and to delete a column, the parameter axis
is assigned the value 1. By default drop() returns a new DataFrame,
so the result is assigned back to ResultDF.

>>> ResultDF = ResultDF.drop('Science', axis=0)

>>> ResultDF = ResultDF.drop(['Samridhi','Ramit','Riya'], axis=1)
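A minimal sketch of drop() along both axes, using a three-student subset of the marks data. The names no_science and only_arnab are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96],
                   'Riya': [81, 71, 67]},
                  index=['Maths', 'Science', 'Hindi'])

no_science = df.drop('Science', axis=0)          # delete one row
only_arnab = df.drop(['Ramit', 'Riya'], axis=1)  # delete two columns

print(no_science)
print(only_arnab)
# The original frame is untouched because the results were not assigned back
print(df.shape)
```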
(D) Renaming Row Labels of a DataFrame
We can change the labels of rows and columns in a DataFrame
using the DataFrame.rename() method.
>>> ResultDF = ResultDF.rename({'Maths':'Sub1', 'Science':'Sub2', 'English':'Sub3', 'Hindi':'Sub4'}, axis='index')
>>> print(ResultDF)
The parameter axis='index' is used to specify that the row labels are
to be changed. If no new label is passed corresponding to an existing
label, the existing row label is left as it is, for example:
>>> ResultDF = ResultDF.rename({'Maths':'Sub1', 'Science':'Sub2', 'Hindi':'Sub4'}, axis='index') # 'English' is not in the mapping, so that label is unchanged
>>> print(ResultDF)
(E) Renaming Column Labels of a DataFrame
To alter the column names of ResultDF we can again use
the rename() method. The parameter axis='columns' implies we
want to change the column labels:

>>> ResultDF = ResultDF.rename({'Arnab':'Student1', 'Ramit':'Student2', 'Samridhi':'Student3', 'Mallika':'Student4'}, axis='columns')
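Renaming along both axes can be checked on a small frame. In this sketch (made-up subset of the marks data) the labels that are left out of the mapping keep their old names:

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96]},
                  index=['Maths', 'Science', 'Hindi'])

# 'Science' is not in the mapping, so it is left as it is
renamed = df.rename({'Maths': 'Sub1', 'Hindi': 'Sub4'}, axis='index')
# 'Ramit' is not in the mapping, so it is left as it is
renamed = renamed.rename({'Arnab': 'Student1'}, axis='columns')

print(renamed)
```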
Accessing DataFrames Element through Indexing
There are two ways of indexing DataFrames:
label based indexing and Boolean indexing.
(A) Label Based Indexing: DataFrame.loc[ ] is used for label based
indexing with DataFrames.
>>> ResultDF.loc['Science']
Arnab       91
Ramit       81
Samridhi    91
Riya        71
Mallika     95
Name: Science, dtype: int64
Also, note that when the row label is passed as an integer value, it
is interpreted as a label of the index and not as an integer position
along the index, for example:
>>> dFrame10Multiples = pd.DataFrame([10,20,30,40,50])
>>> dFrame10Multiples.loc[2]
0    30
Name: 2, dtype: int64
When a single column label is passed, it returns the column
as a Series.
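With the default index, labels and positions are the same numbers, so the difference is hard to see. A sketch with a made-up, reversed integer index makes it visible; iloc[ ], the position based counterpart of loc[ ], is used here only for contrast:

```python
import pandas as pd

# Hypothetical index: the labels run 5, 4, 3, 2, 1
df = pd.DataFrame([10, 20, 30, 40, 50], index=[5, 4, 3, 2, 1])

by_label = df.loc[2]      # the row whose label is 2 (the fourth row)
by_position = df.iloc[2]  # the row at position 2 (the third row)

print(by_label.iloc[0])
print(by_position.iloc[0])
```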
>>> ResultDF.loc[:,'Arnab'] # marks of 'Arnab' in all the subjects; ResultDF['Arnab'] gives the same result
>>> ResultDF.loc[['Science', 'Hindi']] # To read more than one row from a DataFrame
(B) Boolean Indexing: Boolean means a binary variable that can
represent either of the two states, True (indicated by 1) or False
(indicated by 0). In Boolean indexing, we can select subsets of
data based on the actual values in the DataFrame rather than their
row/column labels.
>>> ResultDF.loc['Maths'] > 90
Arnab       False
Ramit        True
Samridhi    False
Riya        False
Mallika      True
Name: Maths, dtype: bool
>>> ResultDF.loc[:,'Arnab'] > 90 # To check in which subjects 'Arnab' has scored more than 90
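A comparison like the ones above only produces True/False values. To actually select data, the Boolean result can be passed back to loc[ ]. A minimal sketch on the marks data, where the names mask and toppers are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96],
                   'Samridhi': [89, 91, 88], 'Riya': [81, 71, 67],
                   'Mallika': [94, 95, 99]},
                  index=['Maths', 'Science', 'Hindi'])

# One True/False value per student, based on the Maths marks
mask = df.loc['Maths'] > 90

# Keep only the columns (students) for which the mask is True
toppers = df.loc[:, mask]
print(toppers)
```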
Accessing DataFrames Element through Slicing
>>> ResultDF.loc['Maths': 'Science']
>>> ResultDF.loc['Maths': 'Science', 'Arnab']
Maths      90
Science    91
Name: Arnab, dtype: int64
>>> ResultDF.loc['Maths': 'Science', 'Arnab':'Samridhi']
>>> ResultDF.loc['Maths': 'Science', ['Arnab','Samridhi']]
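Unlike ordinary Python slices, label slices with loc[ ] include both end labels. A short sketch on the marks data (the names rows and block are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96],
                   'Samridhi': [89, 91, 88], 'Riya': [81, 71, 67],
                   'Mallika': [94, 95, 99]},
                  index=['Maths', 'Science', 'Hindi'])

# Both 'Maths' and 'Science' are part of the result
rows = df.loc['Maths':'Science']

# Row slice and column slice together; 'Samridhi' is included as well
block = df.loc['Maths':'Science', 'Arnab':'Samridhi']

print(rows)
print(block)
```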
Filtering Rows in DataFrames
In DataFrames, Boolean values like True (1) and False (0) can be
associated with indices. They can also be used to filter the records
using the DataFrame.loc[] method.
In order to select or omit particular row(s), we can use a Boolean list
specifying True for the rows to be shown and False for the ones to be
omitted in the output. The list must contain one value for every row.
>>> ResultDF.loc[[True, False, True]] # the row with index label Science is omitted
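The Boolean list can be written by hand or computed from a condition on a column. A minimal sketch on a two-student subset of the marks data; the names kept and above_90 are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91, 97], 'Ramit': [92, 81, 96]},
                  index=['Maths', 'Science', 'Hindi'])

# Hand-written list: one value per row, Science is left out
kept = df.loc[[True, False, True]]

# Computed filter: rows (subjects) in which Arnab scored more than 90
above_90 = df.loc[df['Arnab'] > 90]

print(kept)
print(above_90)
```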
Joining, Merging and Concatenation of DataFrames

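We can use the pandas.DataFrame.append() method to merge two
DataFrames. It appends rows of the second DataFrame at the end of
the first DataFrame. Note that append() was deprecated in pandas 1.4
and removed in pandas 2.0, so the calls below run only on older
versions; pandas.concat() is the replacement.

# append dFrame1 to dFrame2
>>> dFrame2 = dFrame2.append(dFrame1, sort=True)
>>> dFrame2 = dFrame2.append(dFrame1, sort=False)

The parameter sort takes a Boolean value (True or False, not a string)
and controls whether the column labels are sorted when the columns of
the two DataFrames are not aligned.

The parameter verify_integrity of the append() method may be set to True
when we want to raise an error if the row labels are duplicate. By
default, verify_integrity=False.

The parameter ignore_index of the append() method may be set to True
when we do not want to keep the existing row index labels; the rows
are then numbered 0, 1, 2 and so on. By default, ignore_index=False.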
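On pandas 2.0 and later, where append() is no longer available, the same row-wise merge can be sketched with pandas.concat(), which accepts the sort, ignore_index and verify_integrity parameters as well. The contents of dFrame1 and dFrame2 below are made up for illustration:

```python
import pandas as pd

# Hypothetical frames with partly different columns and one shared row label
dFrame1 = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                       columns=['C1', 'C2', 'C3'], index=['R1', 'R2'])
dFrame2 = pd.DataFrame([[10, 20], [30, 40]],
                       columns=['C2', 'C5'], index=['R4', 'R2'])

# Rows of dFrame1 are placed after the rows of dFrame2
combined = pd.concat([dFrame2, dFrame1], sort=False)
sorted_columns = pd.concat([dFrame2, dFrame1], sort=True)

# Discard the old row labels and number the rows afresh
renumbered = pd.concat([dFrame2, dFrame1], ignore_index=True)

# 'R2' occurs in both frames, so verify_integrity=True raises an error
duplicate_found = False
try:
    pd.concat([dFrame2, dFrame1], verify_integrity=True)
except ValueError:
    duplicate_found = True

print(combined)
print(renumbered)
```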
Attributes of DataFrames
Consider the following DataFrame, created from a dictionary of Series,
>>> ForestArea = {
...     'Assam': pd.Series([78438, 2797, 10192, 15116], index=['GeoArea', 'VeryDense', 'ModeratelyDense', 'OpenForest']),
...     'Kerala': pd.Series([38852, 1663, 9407, 9251], index=['GeoArea', 'VeryDense', 'ModeratelyDense', 'OpenForest']),
...     'Delhi': pd.Series([1483, 6.72, 56.24, 129.45], index=['GeoArea', 'VeryDense', 'ModeratelyDense', 'OpenForest'])}

>>> ForestAreaDF = pd.DataFrame(ForestArea)
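The attribute listings for this section are not reproduced in the text. As a hedged sketch, a few attributes that pandas DataFrames are known to provide (index, columns, dtypes, shape, size and T) can be tried on ForestAreaDF; this selection is an assumption and may not match the original slides:

```python
import pandas as pd

labels = ['GeoArea', 'VeryDense', 'ModeratelyDense', 'OpenForest']
ForestArea = {
    'Assam': pd.Series([78438, 2797, 10192, 15116], index=labels),
    'Kerala': pd.Series([38852, 1663, 9407, 9251], index=labels),
    'Delhi': pd.Series([1483, 6.72, 56.24, 129.45], index=labels),
}
ForestAreaDF = pd.DataFrame(ForestArea)

print(ForestAreaDF.index)    # row labels
print(ForestAreaDF.columns)  # column labels
print(ForestAreaDF.dtypes)   # data type of each column
print(ForestAreaDF.shape)    # (number of rows, number of columns)
print(ForestAreaDF.size)     # total number of values
print(ForestAreaDF.T)        # transpose: rows and columns swapped
```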

Importing and Exporting Data between CSV Files and DataFrames
(For Practicals only)
Importing a CSV file to a DataFrame
>>> marks = pd.read_csv("C:/NCERT/ResultData.csv", sep=",", header=0)
★ The first parameter to read_csv() is the name of the comma
separated data file along with its path.
★ The parameter sep specifies whether the values are separated by
comma, semicolon, tab, or any other character. The default value for
sep is a comma.
★ The parameter header specifies the number of the row whose values
are to be used as the column names. header=0 implies that column
names are taken from the first line of the file. By default
(header='infer'), read_csv() behaves as if header=0 when no column
names are passed.
★ We can explicitly specify column names using the parameter
names while creating the DataFrame using the read_csv() function.
When names is given without header, the first line of the file is
read as data and not as column names.
>>> marks1 = pd.read_csv("C:/NCERT/ResultData1.csv", sep=",", names=['RNo','StudentName', 'Sub1','Sub2'])
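The effect of header and names can be tried without a file on disk by reading from an in-memory string. In this sketch the CSV text and the column names are made up, and io.StringIO stands in for the file path:

```python
import io
import pandas as pd

# Hypothetical file content: one header line followed by two records
csv_text = "RNo,Name,Sub1\n1,Asha,90\n2,Ravi,85\n"

# Defaults: comma separator, column names taken from the first line
with_header = pd.read_csv(io.StringIO(csv_text))

# names without header: the first line is treated as a data row
named = pd.read_csv(io.StringIO(csv_text), names=['A', 'B', 'C'])

# names with header=0: the first line is read as a header and then replaced
replaced = pd.read_csv(io.StringIO(csv_text), names=['A', 'B', 'C'], header=0)

print(with_header.shape, named.shape, replaced.shape)
```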
Exporting a DataFrame to a CSV file
We can use the to_csv() method to save a DataFrame to a text or csv
file.
>>> ResultDF.to_csv(path_or_buf='C:/NCERT/resultout.csv', sep=',')
➔ This creates a file by the name resultout.csv in the folder C:/NCERT
on the hard disk.
➔ In case we do not want the column names to be saved to the file we
may use the parameter header=False.
➔ Another parameter index=False is used when we do not want the row
labels to be written to the file on disk.
>>> ResultDF.to_csv('C:/NCERT/resultonly.txt', sep='@', header=False, index=False)
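To see exactly what to_csv() writes, the output can be sent to an in-memory buffer instead of a file. This is a sketch with a made-up two-student frame; io.StringIO replaces the path used above:

```python
import io
import pandas as pd

df = pd.DataFrame({'Arnab': [90, 91], 'Ramit': [92, 81]},
                  index=['Maths', 'Science'])

# Default settings: column names and row labels are both written
full_lines = df.to_csv().splitlines()

# '@' as separator, no column names, no row labels
buffer = io.StringIO()
df.to_csv(buffer, sep='@', header=False, index=False)
bare_lines = buffer.getvalue().splitlines()

print(full_lines)
print(bare_lines)
```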