Program Dataframe

Uploaded by

yogitry007


1. Write a program to create different DataFrames:

# DataFrame using a nested list
import pandas as pd
l = [[1, 10, 'a'], [2, 20, 'b'], [3, 30, 'c']]
df = pd.DataFrame(l)
print(df)

Output:
   0   1  2
0  1  10  a
1  2  20  b
2  3  30  c
# DataFrame using a nested list with row and column labels
import pandas as pd
l = [[1, 3, -9], [4, 5, 6], [7, 5, 8]]
df = pd.DataFrame(l, index=["r1", "r2", "r3"], columns=["c1", "c2", "c3"])
print(df)

Output:
    c1  c2  c3
r1   1   3  -9
r2   4   5   6
r3   7   5   8
# DataFrame using a dictionary
import pandas as pd
dic = {'one': [1, 2, 3, 4], 'two': [5, 6, 7, 8], 'three': [9, 10, 11, 12]}
df = pd.DataFrame(dic)
print(df)

Output:
   one  two  three
0    1    5      9
1    2    6     10
2    3    7     11
3    4    8     12
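# DataFrame using a dictionary with column names that are not keys.
# columns= picks dictionary keys by name; none of c1, c2, c3 is a key,
# so the result has no rows.
import pandas as pd
dic = {'one': [1, 2, 3, 4], 'two': [5, 6, 7, 8], 'three': [9, 10, 11, 12]}
df = pd.DataFrame(dic, columns=['c1', 'c2', 'c3'])
print(df)

Output:
Empty DataFrame
Columns: [c1, c2, c3]
Index: []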
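The empty result above follows from how pandas treats `columns` when the data is a dictionary: it selects keys rather than renaming them. A minimal sketch of the difference; the variable names `selected` and `renamed` are illustrative and not part of the original program.

```python
import pandas as pd

dic = {'one': [1, 2, 3, 4], 'two': [5, 6, 7, 8], 'three': [9, 10, 11, 12]}

# columns= selects dictionary keys; none of 'c1', 'c2', 'c3' is a key,
# so the result has three columns and no rows.
selected = pd.DataFrame(dic, columns=['c1', 'c2', 'c3'])
print(selected.empty)      # True

# To get new labels, build the frame first and then rename the columns.
renamed = pd.DataFrame(dic).rename(
    columns={'one': 'c1', 'two': 'c2', 'three': 'c3'})
print(renamed)
```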
# DataFrame using Series
import pandas as pd
s1 = pd.Series([1, 2, 3, 4])
s2 = pd.Series([5, 6, 7, 8])
s3 = pd.Series([9, 10, 11, 12])
df = pd.DataFrame([s1, s2, s3])
print(df)

Output:
   0   1   2   3
0  1   2   3   4
1  5   6   7   8
2  9  10  11  12
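In the program above each Series ends up as one row. A short sketch contrasting that with a dictionary of Series, where each Series becomes a column instead; the names `by_rows` and `by_cols` and the column labels are assumptions chosen for illustration.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3, 4])
s2 = pd.Series([5, 6, 7, 8])
s3 = pd.Series([9, 10, 11, 12])

# A list of Series: each Series becomes one row (3 rows x 4 columns).
by_rows = pd.DataFrame([s1, s2, s3])

# A dictionary of Series: each Series becomes one column (4 rows x 3 columns).
by_cols = pd.DataFrame({'s1': s1, 's2': s2, 's3': s3})

print(by_rows.shape)   # (3, 4)
print(by_cols.shape)   # (4, 3)
```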
# DataFrame using a text file
import pandas as pd
df = pd.read_csv("data.txt", header=0)
print(df)

Output:
   regno  roll    name
0      1    34  Rajesh
1      2    42   Suman
# DataFrame using a CSV file, with the first column as the row index
import pandas as pd
df = pd.read_csv("student1.csv", index_col=0)
print(df)

Output:
            Name  Eng  Phy  Chem  Maths  IP
Roll No
12       Kishore   23   54    36     56  54
44         Tarun   34   65    45     46  52
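The file-based programs above cannot be run without the data files. The sketch below reproduces the same `read_csv` call on in-memory text; the CSV contents are hypothetical, made up only to mirror the student1.csv output shown, and `io.StringIO` stands in for the file.

```python
import io
import pandas as pd

# Hypothetical file contents, invented to match the output shown above.
csv_text = """Roll No,Name,Eng,Phy,Chem,Maths,IP
12,Kishore,23,54,36,56,54
44,Tarun,34,65,45,46,52
"""

# header=0 (the default) takes column names from the first line;
# index_col=0 uses the first column ("Roll No") as the row index.
df = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)
print(df)
print(df.loc[44, 'Name'])   # Tarun
```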

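# Writing a DataFrame out as CSV text
import pandas as pd
df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
                   'mask': ['red', 'purple'],
                   'weapon': ['sai', 'bo staff']})
# With no file path, to_csv() returns the CSV as a string
print(df.to_csv(index=False))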
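When no path is given, `to_csv` returns the CSV text instead of writing a file, so the result has to be printed or stored to be seen. A small sketch of what `index=False` changes; `csv_text` and `with_index` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
                   'mask': ['red', 'purple'],
                   'weapon': ['sai', 'bo staff']})

# With no path, to_csv() returns the CSV text as a string.
csv_text = df.to_csv(index=False)
print(csv_text)

# index=True (the default) adds the row labels as a leading, unnamed column.
with_index = df.to_csv()
print(with_index)
```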

1. Write a program to create different DataFrames:

• An empty DataFrame named empty_df.
import pandas as pd
empty_df = pd.DataFrame()
print(empty_df)
• A DataFrame named students, built from a list of the names of 5 students.
import pandas as pd
students = ["Ram","Aman","Akash","Ramesh","Virat"]
students = pd.DataFrame(students,columns=["Name"])
print(students)
• A DataFrame named players, built from a nested list of names and scores from the previous three matches.
import pandas as pd
data = [["Virat", 55, 66, 31], ["Rohit", 88, 66, 43], ["Samson", 99, 101, 68]]
players = pd.DataFrame(data, columns=["Name", "Match-1", "Match-2", "Match-3"])
print(players)
• The same players DataFrame, using a dictionary of lists:
import pandas as pd
# Each key is a column name and each list holds that column's values
data = {"Name": ["Virat", "Rohit", "Samson"],
        "Match-1": [55, 88, 99],
        "Match-2": [66, 66, 101],
        "Match-3": [31, 43, 68]}
players = pd.DataFrame(data)
print(players)
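When a dictionary is keyed by player name rather than by column name, passing `columns=` to the constructor selects keys instead of labelling them, and the result comes out empty. One possible way to handle that shape, sketched here under the assumption that each list holds the three match scores in order, is `DataFrame.from_dict` with `orient="index"`.

```python
import pandas as pd

# One key per player; each list holds three match scores.
data = {"Virat": [55, 66, 31], "Rohit": [88, 66, 43], "Samson": [99, 101, 68]}

# orient="index" turns each key into a row label; columns= names the scores.
players = pd.DataFrame.from_dict(data, orient="index",
                                 columns=["Match-1", "Match-2", "Match-3"])

# Move the player names out of the index into an ordinary "Name" column.
players = players.rename_axis("Name").reset_index()
print(players)
```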
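• A DataFrame built from the Series sales_person, which stores salesman names and the quantity of sales for August.
import pandas as pd
# Series indexed by salesman name, holding the quantity sold in August
sales_person = pd.Series([55, 22, 67, 44], index=["Ram", "Ajay", "Vijay", "Sachin"])
salesman = pd.DataFrame({"Name": sales_person.index,
                         "Sales(August)": sales_person.values})
print(salesman)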
• A DataFrame built from a dictionary that stores country names, capitals and populations.
import pandas as pd
country_data = {"Country Name":["India","Canada","Australia"],
"Capital": ["New Delhi","Ottawa","Canberra"],
"Population" : ["136 Cr","10 Cr","50 Cr"]
}
countries = pd.DataFrame(country_data)
print(countries)

2. Programs based on Select and Access data

Consider the following data and write programs to do the following:
SNo  Batsman         Test  ODI   T20
1    Virat Kohli     3543  2245  1925
2    Ajinkya Rahane  2578  2165  1853
3    Rohit Sharma    2280  2080  1522
4    Shikhar Dhawan  2158  1957  1020
5    Hardik Pandya   1879  1856   980

• Print the batsman name along with runs scored in Test and T20 using
column names and dot notation.
import pandas as pd

# Creating the Data

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
print(data.Name)
print(data.Test)
print(data.T20)
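Dot notation works above because Name, Test and T20 are valid Python identifiers that do not collide with DataFrame attributes. A brief sketch of where it stops working; the "T20I Runs" and "count" columns are added purely for illustration.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Rohit Sharma"],
                     "T20": [1925, 1522],
                     "T20I Runs": [1925, 1522],   # label containing a space
                     "count": [1, 2]})            # label that clashes with a method

print(data.T20.tolist())            # dot notation works: valid identifier
print(data["T20I Runs"].tolist())   # brackets are required for this label
print(callable(data.count))         # True: this is DataFrame.count, not the column
print(data["count"].tolist())       # brackets always reach the column
```

Bracket notation is therefore the safer default whenever column names come from external data.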
• Display the Batsman name along with runs scored in ODI using loc.
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# A list of labels selects several columns; ':' keeps every row
print(data.loc[:, ['Name', 'ODI']])
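Because the index was shifted to start at 1, row labels and row positions no longer coincide, which is where `loc` and `iloc` differ. A minimal sketch on a three-row subset of the data:

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma"],
                     "ODI": [2245, 2165, 2080]})
data.index = data.index + 1          # labels are now 1, 2, 3

# loc is label-based: 1 means "the row labelled 1",
# and a label slice includes its end label.
print(data.loc[1, "Name"])           # Virat Kohli
print(data.loc[1:2, ["Name", "ODI"]])

# iloc is position-based: 1 means "the second row",
# and a position slice excludes its end.
print(data.iloc[1, 0])               # Ajinkya Rahane
print(data.iloc[0:2, [0, 1]])
```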
• Display the details of batsmen who scored:
i. More than 2000 in ODI
ii. Less than 2500 in Test
iii. More than 1500 in T20
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# More than 2000 runs in ODI
print("---- Runs greater than 2000 in ODI ---------")
print(data.loc[data['ODI']>2000, ['Name']])
# Less than 2500 runs in test
print("---- Runs less than 2500 in Test ---------")
print(data.loc[data['Test']<2500, ['Name']])
# More than 1500 runs in T20
print("---- Runs more than 1500 in T20 ---------")
print(data.loc[data['T20']>1500, ['Name']])
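Each filter above uses a single boolean condition. The sketch below shows how two conditions might be combined in one selection; the variable names `both` and `either` are illustrative.

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)

# Each comparison yields a boolean Series; & (and) and | (or) combine them.
# The parentheses are required because & and | bind tighter than > and <.
both = data[(data["ODI"] > 2000) & (data["Test"] < 2500)]
either = data[(data["ODI"] > 2000) | (data["T20"] > 1500)]

print(both["Name"].tolist())     # ['Rohit Sharma']
print(either["Name"].tolist())   # ['Virat Kohli', 'Ajinkya Rahane', 'Rohit Sharma']
```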
• Display columns using column index numbers such as 0 and 2 (this DataFrame has only four columns, numbered 0 to 3, so there is no column 4).
import pandas as pd

# Creating the Data

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# =======================
print(data[data.columns[0]])
print(data[data.columns[2]])
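Selecting through `data.columns[i]` looks the label up first and then indexes by label. As an alternative sketch, `iloc` can take the positions directly, and it raises an error for a position that does not exist; the `out_of_bounds` flag is only there to make that visible.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Rohit Sharma"],
                     "Test": [3543, 2280],
                     "ODI": [2245, 2080],
                     "T20": [1925, 1522]})

# Positions 0 and 2 pick the first and third columns in one call.
subset = data.iloc[:, [0, 2]]
print(subset.columns.tolist())       # ['Name', 'ODI']

# There are four columns (positions 0 to 3), so position 4 is out of bounds.
out_of_bounds = False
try:
    data.iloc[:, [0, 2, 4]]
except IndexError as err:
    out_of_bounds = True
    print("IndexError:", err)
```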
• Reindex the DataFrame created above by batsman name, then delete the data of Hardik Pandya and Shikhar Dhawan by their index labels.
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ----------------------
# Use the Name column as the row index
data = data.set_index('Name')
print(data)
# Now deleting records by their index labels
print("Records after Deleting the values:")
data = data.drop(['Shikhar Dhawan', 'Hardik Pandya'])
print(data)
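The program above reassigns the result of `drop()` because `drop()` returns a new DataFrame rather than changing the original. A short sketch of that behaviour and of what happens with a label that is absent; "Not A Player" is a hypothetical label used only to trigger the case.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Shikhar Dhawan", "Hardik Pandya"],
                     "Test": [3543, 2158, 1879]}).set_index("Name")

# drop() returns a new DataFrame; the original is unchanged unless reassigned.
trimmed = data.drop(["Shikhar Dhawan", "Hardik Pandya"])
print(len(data), len(trimmed))   # 3 1

# A label that is not in the index raises KeyError by default;
# errors="ignore" skips missing labels instead.
same = data.drop(["Not A Player"], errors="ignore")
print(len(same))                 # 3
```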
• Insert 2 rows in the dataframe and delete rows whose index is 1 and
4.
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}

data = pd.DataFrame(player_data)
# ----------------------
values = {'Name': ['Rishabh Pant', 'Shreyas Iyer'],
          'Test': [1500, 1459], 'ODI': [1980, 1342], 'T20': [2300, 1988]}
# DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
data = pd.concat([data, pd.DataFrame(values)], ignore_index=True)
print(data)

# -----
print("Data after deleting index 1 and 4")
data = data.drop([1,4])
print(data)
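The `ignore_index=True` argument matters for the deletion step, because `drop()` removes rows by label, not by position. The sketch below uses `pd.concat`, the replacement for `DataFrame.append` (removed in pandas 2.0), on small placeholder rows that are not the worksheet's data.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["A", "B", "C"], "ODI": [1, 2, 3]})   # placeholder rows
extra = pd.DataFrame({"Name": ["D", "E"], "ODI": [4, 5]})

# Without ignore_index the original labels are kept, so labels repeat.
kept = pd.concat([data, extra])
print(kept.index.tolist())                 # [0, 1, 2, 0, 1]
print(len(kept.drop([1])))                 # 3: both rows labelled 1 are removed

# With ignore_index=True the rows are renumbered 0..n-1,
# so each label identifies exactly one row.
combined = pd.concat([data, extra], ignore_index=True)
after = combined.drop([1, 4])
print(after.index.tolist())                # [0, 2, 3]
print(after["Name"].tolist())              # ['A', 'C', 'D']
```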
• Delete the column Test, then add one more column at the end (next to the T20 column) holding the total of ODI and T20 runs.
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ----------------------
data = data.drop('Test', axis = 1)
print(data)
# ---------------
total = data['ODI'] + data["T20"]
data['Total'] = total
print(data)
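Assigning to a new column name, as above, always places the column at the right-hand end, which is what the exercise asks for. For comparison, a sketch of `insert()`, which places a column at a chosen position; the "Average" column is an illustrative addition, not part of the exercise.

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Rohit Sharma"],
                     "ODI": [2245, 2080],
                     "T20": [1925, 1522]})

# Plain assignment appends the new column at the right-hand end.
data["Total"] = data["ODI"] + data["T20"]
print(data.columns.tolist())     # ['Name', 'ODI', 'T20', 'Total']
print(data["Total"].tolist())    # [4170, 3602]

# insert() modifies the DataFrame in place and puts the column at position 1.
data.insert(1, "Average", data["Total"] / 2)
print(data.columns.tolist())     # ['Name', 'Average', 'ODI', 'T20', 'Total']
```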
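• Delete the columns T20 and Total using the columns parameter of the drop() function.
import pandas as pd
# Creating the Data
player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ---------------
total = data['ODI'] + data["T20"]
data['Total'] = total
# columns= names the columns to remove, so axis=1 is not needed
data = data.drop(columns=['T20', 'Total'])
print(data)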
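Columns can be dropped either by passing the labels with `axis=1` or by naming them through the `columns` parameter. A small sketch, on a two-row subset of the data, checking that the two forms give the same result:

```python
import pandas as pd

data = pd.DataFrame({"Name": ["Virat Kohli", "Rohit Sharma"],
                     "ODI": [2245, 2080],
                     "T20": [1925, 1522]})
data["Total"] = data["ODI"] + data["T20"]

# Two equivalent ways of dropping columns.
by_axis = data.drop(["T20", "Total"], axis=1)
by_columns = data.drop(columns=["T20", "Total"])

print(by_axis.equals(by_columns))      # True
print(by_columns.columns.tolist())     # ['Name', 'ODI']
```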
• Rename the column T20 to “T20I Runs”.
import pandas as pd