
Program Dataframe


1. Write a program to create different dataframes:

# Dataframe using nested list
import pandas as pd
l = [[1, 10, 'a'], [2, 20, 'b'], [3, 30, 'c']]
df = pd.DataFrame(l)
print(df)

Output:
   0   1  2
0  1  10  a
1  2  20  b
2  3  30  c

# Dataframe using nested list with row and column labels
import pandas as pd
l=[[1,3,-9],[4,5,6],[7,5,8]]
df = pd.DataFrame(l,index=["r1","r2","r3"],columns=["c1","c2","c3"])
print(df)

Output:
    c1  c2  c3
r1   1   3  -9
r2   4   5   6
r3   7   5   8
# Dataframe using dictionary
import pandas as pd
dic = {'one':[1,2,3,4],'two':[5,6,7,8],'three':[9,10,11,12]}
df = pd.DataFrame(dic)
print(df)

Output:
   one  two  three
0    1    5      9
1    2    6     10
2    3    7     11
3    4    8     12

# Column names that do not match any dictionary key give an empty dataframe
import pandas as pd
dic = {'one':[1,2,3,4],'two':[5,6,7,8],'three':[9,10,11,12]}
df = pd.DataFrame(dic,columns=['c1','c2','c3'])
print(df)

Output:
Empty DataFrame
Columns: [c1, c2, c3]
Index: []
# Dataframe using Series
import pandas as pd
s1 = pd.Series([1,2,3,4])
s2 = pd.Series([5,6,7,8])
s3 = pd.Series([9,10,11,12])
df = pd.DataFrame([s1,s2,s3])
print(df)

Output:
   0   1   2   3
0  1   2   3   4
1  5   6   7   8
2  9  10  11  12
# Dataframe using Text file
import pandas as pd
df=pd.read_csv("data.txt",header=0)
print(df)

Output:
   regno  roll    name
0      1    34  Rajesh
1      2    42   Suman

# Dataframe using CSV file, with the first column as the index
import pandas as pd
df=pd.read_csv("student1.csv",index_col=0)
print(df)

Output:
            Name  Eng  Phy  Chem  Maths  IP
Roll No
12       Kishore   23   54    36     56  54
44         Tarun   34   65    45     46  52
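The files data.txt and student1.csv are not included with these notes. As a minimal sketch, assuming data.txt holds comma-separated values with a header row (the contents below are inferred from the printed output, not taken from the actual file), the same read can be tried from an in-memory string:

```python
import io
import pandas as pd

# Assumed contents of data.txt, inferred from the printed output
text = "regno,roll,name\n1,34,Rajesh\n2,42,Suman\n"

# io.StringIO lets read_csv treat a string like an open file
df = pd.read_csv(io.StringIO(text), header=0)
print(df)
```

Passing header=0 tells read_csv that the first line holds the column names, which is also the default behaviour.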

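# Dataframe written out as CSV text
df = pd.DataFrame({'name': ['Raphael', 'Donatello'],
'mask': ['red', 'purple'],
'weapon': ['sai', 'bo staff']})
# With no file path, to_csv() returns the CSV text as a string
print(df.to_csv(index=False))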

1. Write a program to create different dataframes:


• an empty dataframe named as empty_df.
import pandas as pd
empty_df = pd.DataFrame()
print(empty_df)
• dataframe named as students using a list of names of 5 students.
import pandas as pd
students = ["Ram","Aman","Akash","Ramesh","Virat"]
students = pd.DataFrame(students,columns=["Name"])
print(students)
• dataframe players using a nested list of names and scores of the
previous three matches.
import pandas as pd
data = [["Virat",55,66,31],["Rohit",88,66,43],["Samson",99,101,68]]
players = pd.DataFrame(data, columns = ["Name","Match-1",
"Match-2", "Match-3"])
print(players)
• Using a dictionary of lists:
import pandas as pd
# Each key becomes a column name, so the keys must be the column labels
data = {"Name":["Virat","Rohit","Samson"],
"Match-1":[55,88,99],
"Match-2":[66,66,101],
"Match-3":[31,43,68]}
players = pd.DataFrame(data)
print(players)
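When the scores are stored with the player names as dictionary keys, one option is DataFrame.from_dict with orient="index", which turns each key into a row label. The sketch below assumes the same sample scores; the variable name scores and the index name "Name" are chosen here for illustration.

```python
import pandas as pd

# Player names as keys, scores of three matches as values
scores = {"Virat": [55, 66, 31], "Rohit": [88, 66, 43], "Samson": [99, 101, 68]}

# orient="index" makes each dictionary key a row instead of a column
players = pd.DataFrame.from_dict(scores, orient="index",
                                 columns=["Match-1", "Match-2", "Match-3"])
players.index.name = "Name"
print(players)
```

Passing the same dictionary straight to pd.DataFrame() would instead create one column per player.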
• using the Series sales_person, which stores salesman names and
their quantity of sales for August.
import pandas as pd
# Salesman names form the index of the Series, August sales the values
sales_person = pd.Series([55,22,67,44],index=["Ram","Ajay","Vijay","Sachin"])
salesman = pd.DataFrame({"Name":sales_person.index,
"Sales(August)":sales_person.values})
print(salesman)
• using a dictionary which stores country names, capitals and
populations of the countries.
import pandas as pd
country_data = {"Country Name":["India","Canada","Australia"],
"Capital": ["New Delhi","Ottawa","Canberra"],
"Population" : ["136 Cr","10 Cr","50 Cr"]
}
countries = pd.DataFrame(country_data)
print(countries)

2. Programs based on Select and Access data


Consider the following data and write a program to do the following:

SNO  Batsman         Test  ODI   T20
1    Virat Kohli     3543  2245  1925
2    Ajinkya Rahane  2578  2165  1853
3    Rohit Sharma    2280  2080  1522
4    Shikhar Dhawan  2158  1957  1020
5    Hardik Pandya   1879  1856  980

• Print the batsman name along with runs scored in Test and T20 using
column names and dot notation.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
print(data.Name)
print(data.Test)
print(data.T20)
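Dot notation returns one column at a time and only works when the column name is a valid Python identifier. As a hedged sketch of the column-name form, a list of names inside square brackets selects the three columns together (a shortened copy of the sample data is assumed here):

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma"],
               "Test": [3543, 2578, 2280],
               "ODI": [2245, 2165, 2080],
               "T20": [1925, 1853, 1522]}
data = pd.DataFrame(player_data)

# A list of column names inside [] returns a new dataframe with those columns
subset = data[["Name", "Test", "T20"]]
print(subset)
```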
• Display the Batsman name along with runs scored in ODI using loc.
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
print(data.loc[:,['Name','ODI']])
• Display the details of the batsmen who scored:
i. More than 2000 in ODI ii. Less than 2500 in Test iii. More than
1500 in T20
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# More than 2000 runs in ODI
print("---- Runs greater than 2000 in ODI ---------")
print(data.loc[data['ODI']>2000, ['Name']])
# Less than 2500 runs in test
print("---- Runs less than 2500 in Test ---------")
print(data.loc[data['Test']<2500, ['Name']])
# More than 1500 runs in T20
print("---- Runs more than 1500 in T20 ---------")
print(data.loc[data['T20']>1500, ['Name']])
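The three filters above are applied one at a time. Conditions can also be combined in a single loc call; the sketch below is one way to do it, and the variable name both is chosen here for illustration.

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma",
                        "Shikhar Dhawan", "Hardik Pandya"],
               "Test": [3543, 2578, 2280, 2158, 1879],
               "ODI": [2245, 2165, 2080, 1957, 1856],
               "T20": [1925, 1853, 1522, 1020, 980]}
data = pd.DataFrame(player_data)

# Each condition needs its own parentheses; & means "and", | means "or"
both = data.loc[(data["ODI"] > 2000) & (data["Test"] < 2500),
                ["Name", "ODI", "Test"]]
print(both)
```

With this sample data only Rohit Sharma satisfies both conditions.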
• Display the columns using column index numbers such as 0 and 2 (this
dataframe has four columns, numbered 0 to 3).
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# =======================
print(data[data.columns[0]])
print(data[data.columns[2]])
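Looking up the label through data.columns works, and iloc offers a more direct position-based route. A minimal sketch, assuming a shortened copy of the sample data (the variable name by_position is illustrative):

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma"],
               "Test": [3543, 2578, 2280],
               "ODI": [2245, 2165, 2080],
               "T20": [1925, 1853, 1522]}
data = pd.DataFrame(player_data)

# iloc selects by integer position: all rows, columns at positions 0 and 2
by_position = data.iloc[:, [0, 2]]
print(by_position)
```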
• Reindex the dataframe created above with the batsman name and delete
the data of Hardik Pandya and Shikhar Dhawan by their index from the
original dataframe.
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ----------------------
# Use the batsman name as the row index
data = data.set_index('Name')
print(data)
# Now deleting records by their index labels
print("Records after Deleting the values:")
data = data.drop(['Shikhar Dhawan','Hardik Pandya'])
print(data)
• Insert 2 rows in the dataframe and delete rows whose index is 1 and
4.
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# ----------------------
values = {'Name':['Rishabh Pant','Shreyas Iyer'],'Test':[1500,1459],
'ODI':[1980,1342],'T20':[2300,1988]}
# DataFrame.append() was removed in pandas 2.0; pd.concat() does the same job
data = pd.concat([data, pd.DataFrame(values)], ignore_index=True)
print(data)

# -----
print("Data after deleting index 1 and 4")
data = data.drop([1,4])
print(data)
• Delete a column Test, add one more column at last (next to T20
column), make total of ODI and T20 runs in that column.
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}
data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ----------------------
data = data.drop('Test', axis = 1)
print(data)
# ---------------
total = data['ODI'] + data["T20"]
data['Total'] = total
print(data)
• Delete the T20 and Total columns using the columns parameter of the
drop() function.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1
# ---------------
total = data['ODI'] + data["T20"]
data['Total'] = total
data = data.drop(columns=['T20','Total'])
print(data)
• Rename column T20 with “T20I Runs”.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
data.rename(columns={'T20':'T20I Runs'}, inplace = True)
print(data)
• Rename all the columns of dataframe with your choice of column
names.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
data.rename(columns={'T20':'T20I runs','Name':'PlayerName',
'Test':'TestRuns','ODI':'ODI Runs'}, inplace = True)
print(data)
• Rename the index with the prefix IND followed by a number such as 001, 002 and so on.
import pandas as pd
# Creating the Data
player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
# Build labels such as IND001 from the existing 1-based index
data.index = ["IND" + str(i).zfill(3) for i in data.index]
print(data)
3. Display the first two rows and last two rows.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
print(data.head(2))
print(data.tail(2))
• Count the total number of rows and columns of the dataframe.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
total_rows = len(data.axes[0])
total_col = len(data.axes[1])

print("Total Rows: " + str(total_rows))


print("Total Columns: " + str(total_col))
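The same counts are also available through the shape attribute, which returns a (rows, columns) tuple. A short sketch, assuming a shortened copy of the sample data:

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma"],
               "Test": [3543, 2578, 2280],
               "ODI": [2245, 2165, 2080],
               "T20": [1925, 1853, 1522]}
data = pd.DataFrame(player_data)

# shape is a tuple: (number of rows, number of columns)
total_rows, total_col = data.shape
print("Total Rows: " + str(total_rows))
print("Total Columns: " + str(total_col))
```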
4. Print the dataframe without headers. More than one option is possible.
import pandas as pd

# Creating the Data


player_data = {"Name":["Virat Kohli","Ajinkya Rahane","Rohit Sharma",
"Shikhar Dhawan","Hardik Pandya"],
"Test" : [3543,2578,2280,2158,1879],
"ODI" : [2245,2165,2080,1957,1856],
"T20" : [1925,1853,1522,1020,980]
}

data = pd.DataFrame(player_data)
# The following line is used to start the index from 1
data.index = data.index + 1

# ----------------------
print(data.to_csv(header=None,index=False))
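Besides to_csv(), two other options that may be worth trying are to_string() with header=False and the values attribute. The sketch below assumes a shortened copy of the sample data; the variable name text is illustrative.

```python
import pandas as pd

player_data = {"Name": ["Virat Kohli", "Ajinkya Rahane", "Rohit Sharma"],
               "Test": [3543, 2578, 2280],
               "ODI": [2245, 2165, 2080],
               "T20": [1925, 1853, 1522]}
data = pd.DataFrame(player_data)

# Option 1: to_string() renders the table as text; header=False drops column names
text = data.to_string(header=False, index=False)
print(text)

# Option 2: values is a NumPy array holding only the data, with no labels
print(data.values)
```

to_string() keeps the columns aligned, while values prints the rows as a nested array.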
