IP Practical File Project
Practical Question – 1
Pandas Series Question-1
Study the following series -
1 Sunday
2 Monday
3 Tuesday
4 Wednesday
5 Thursday
Create the same series using (i) an ndarray and (ii) a dictionary.
Code (i)
import pandas as pd
import numpy as np
days = ['Sunday','Monday','Tuesday','Wednesday','Thursday']
array = np.array(days)
s1 = pd.Series (array, index = [1, 2, 3, 4, 5])
print (s1)
Code(ii)
import pandas as pd
dict1 = {1:'Sunday', 2:'Monday', 3:'Tuesday',
4:'Wednesday',5:'Thursday'}
s2 = pd.Series (dict1)
print (s2)
Output
(i)
(ii)
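As a quick check on the two approaches above, the following sketch builds the series both ways and compares labels and values. The names from_array and from_dict are illustrative and do not appear in the original code.

```python
import numpy as np
import pandas as pd

days = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday']

# (i) from an ndarray, with the index supplied separately
from_array = pd.Series(np.array(days), index=[1, 2, 3, 4, 5])

# (ii) from a dictionary, where the keys become the index
from_dict = pd.Series(dict(zip(range(1, 6), days)))

# Both constructions should carry the same labels and the same values
same_labels = list(from_array.index) == list(from_dict.index)
same_values = from_array.tolist() == from_dict.tolist()
print(same_labels, same_values)
```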
Practical Question – 2
Pandas Series Question-2
A series that stores the average marks scored by 10 students is as follows:
[90, 89, 78, 91, 80, 88, 95, 98, 75, 97]
Write a code to :
(i) Create a series using the given dataset with index values (1-10) generated using arange ( ).
(ii) Give name to the series as ‘AVERAGES’ and index values as ‘ROLL NUMBER’.
(iii) Display the top three averages.
(iv) Display all mark averages less than 80.
(v) Update the mark averages of roll number (index) 5 to 82 and display the series.
(vi) Display mark detail of roll number 7, 8 and 9.
Code
import pandas as pd ; import numpy as np
dataset = [90, 89, 78, 91, 80, 88, 95, 98, 75, 97]
index_array = np.arange(1, 11, 1)
s1 = pd.Series (dataset, index = index_array)
print (s1) #Q(i)
s1.name = 'AVERAGES' ; s1.index.name = 'ROLL NUMBER'
print (s1) #Q(ii)
print (s1.nlargest(3)) #Q(iii) head(3) would only return the first three rows
print(s1[s1<80]) #Q(iv)
s1[5] = 82; print (s1) #Q(v)
print(s1.loc[7 : 9]) #Q(vi) label-based slice, so roll numbers 7, 8 and 9
Output
(i)
(ii)
(iii)
(iv)
(v)
(vi)
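Parts (iii) and (vi) are easy to get wrong, because head() and iloc work by position while the question asks about values and roll numbers. The sketch below contrasts the two; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

averages = pd.Series([90, 89, 78, 91, 80, 88, 95, 98, 75, 97],
                     index=np.arange(1, 11))

first_three = averages.head(3)     # first three rows: 90, 89, 78
top_three = averages.nlargest(3)   # three highest values: 98, 97, 95

by_label = averages.loc[7:9]       # labels 7 to 9, end label included
by_position = averages.iloc[7:10]  # positions 7 to 9, i.e. roll numbers 8 to 10

print(top_three)
print(by_label)
```

Because the index starts at 1, position n corresponds to roll number n + 1, which is why the iloc slice is shifted by one.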
Practical Question-3
Pandas Series Question-3
Write a program to store employees’ salary data of one year. Write a code to do the following :
Salary_data = [120000, 120000, 130000, 115000, 300000,
150000, 100000, 250000, 160000, 400000, 250000, 350000]
Index = Jan, Feb, March, April, May, June, July, Aug,
Sep, Oct, Nov, Dec
(i) Display the salary data by slicing it into 4 parts.
(ii) Display the salary for the month of April.
(iii) Apply an increment of 10% to all salary values.
(iv) Give an arrear of 2400 to employees in the month of April.
Code
import pandas as pd
salary_data = [120000, 120000, 130000, 115000, 300000,
150000, 100000, 250000, 160000, 400000, 250000, 350000]
index_data = ['Jan', 'Feb', 'March', 'April', 'May',
'June', 'July', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
s1 = pd.Series (salary_data, index = index_data)
print (s1)
print ("First Quarter Salary Details : ")
print (s1.loc['Jan' : 'March'])
print ("Second Quarter Salary Details : ")
print (s1.loc['April' : 'June'])
print ("Third Quarter Salary Details : ")
print (s1.loc['July' : 'Sep'])
print ("Fourth Quarter Salary Details : ")
print (s1.loc['Oct' : 'Dec'])
#Q(ii)
print (s1.loc['April':'April'])
#Q(iii)
s2 = s1 + (s1*0.1); print (s2)
#Q(iv)
s1.loc['April'] = s1.loc['April'] + 2400
print (s1)
Output
(i)
(ii)
(iii)
(iv)
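The four quarterly slices in part (i) can also be produced in a loop with positional slicing. This is only a sketch of an alternative to the label-based slices used above; the names salaries and quarters are illustrative.

```python
import pandas as pd

salary_data = [120000, 120000, 130000, 115000, 300000, 150000,
               100000, 250000, 160000, 400000, 250000, 350000]
index_data = ['Jan', 'Feb', 'March', 'April', 'May', 'June',
              'July', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
salaries = pd.Series(salary_data, index=index_data)

# Step through the series three months at a time
quarters = [salaries.iloc[start:start + 3]
            for start in range(0, len(salaries), 3)]

for number, part in enumerate(quarters, start=1):
    print(f"Quarter {number} total: {part.sum()}")
```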
Practical Question-4
Pandas Series Question-4
Create a Series as follows :
1 5
2 10
3 15
4 20
5 25
(i) Create a similar series but with indices 5, 4, 3, 2, 1
(ii) Remove the entry with index 5
Code
import pandas as pd
import numpy as np
a = np.arange(5, 26, 5)
s1 = pd.Series(a, index = [1, 2, 3, 4, 5])
print (s1)
s2 = s1.reindex ([5, 4, 3, 2, 1])
print (s2)
s3 = s1.drop(5)
print (s3)
Output
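Part (i) can be read in two ways, and the sketch below shows both. reindex() reorders the rows and keeps each value attached to its original label, whereas building a new Series with the reversed index attaches new labels to the same values. The names reordered and relabelled are illustrative.

```python
import numpy as np
import pandas as pd

values = np.arange(5, 26, 5)                    # 5, 10, 15, 20, 25
s1 = pd.Series(values, index=[1, 2, 3, 4, 5])

# Same label-value pairs, listed in reverse order
reordered = s1.reindex([5, 4, 3, 2, 1])

# Same values in the same order, but with new labels
relabelled = pd.Series(values, index=[5, 4, 3, 2, 1])

print(reordered)
print(relabelled)
```

Which reading the question intends is not stated; the code above this note uses the reindex() interpretation.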
Practical Question-5
Data Frame-Creation
Study the following Data Frame representing quarterly sales data of 2017, 2018 and 2019 and create the same using (i) Dictionary of Series and (ii) List of Dictionaries.
Code
import pandas as pd
s1 = pd.Series([400000, 350000, 470000, 450000], index
= ['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'])
s2 = pd.Series([420000, 370000, 490000, 470000], index
= ['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'])
s3 = pd.Series([430000, 380000, 500000, 480000], index
= ['Qtr1', 'Qtr2', 'Qtr3', 'Qtr4'])
data = {2017 : s1, 2018 : s2, 2019 : s3}
df1 = pd.DataFrame (data)
print (df1)
L = [{2017 : 400000, 2018 : 420000, 2019 : 430000},
{2017 : 350000, 2018 : 370000, 2019 : 380000},
{2017 : 470000, 2018 : 490000, 2019 : 500000},
{2017 : 450000, 2018 : 470000, 2019 : 480000}]
df2 = pd.DataFrame (L, index = ['Qtr1', 'Qtr2', 'Qtr3',
'Qtr4']) #List of Dictionaries
print (df2)
Output
Practical Question-6
Data Frame-Add and Remove Operations
Create a Data Frame showing details of employees with Name, Department, Salary, and Bonus amount. The employee code should be the indices of the Data Frame as shown.
Practical Question-7
Data Frame-Iterations
Create a Data Frame as shown using list of dictionaries.
Practical Question – 8
Data Frame-Iteration and Updation
Write a program to iterate over a Data Frame containing names and marks, then calculate grades as per marks (as per guidelines below) and add them to the grade column.
Code
import pandas as pd
import numpy as np
# Create initial data (np.nan marks the grades that are not yet known)
data = {'Name': ['Sajeev', 'Rajeev', 'Sanjay', 'Abhay'],
        'Marks': [76, 86, 55, 54],
        'Grade': [np.nan, np.nan, np.nan, np.nan]}
print("**** DataFrame before updation ****")
df1 = pd.DataFrame(data)
print(df1)
# Let the Grade column hold text, so that string grades can be stored in it
df1['Grade'] = df1['Grade'].astype(object)
# Update grades based on marks
for index, row in df1.iterrows():
    if row['Marks'] >= 90:
        df1.loc[index, 'Grade'] = 'A+'
    elif row['Marks'] >= 70 and row['Marks'] < 90:
        df1.loc[index, 'Grade'] = 'A'
    elif row['Marks'] >= 60 and row['Marks'] < 70:
        df1.loc[index, 'Grade'] = 'B'
    elif row['Marks'] >= 50 and row['Marks'] < 60:
        df1.loc[index, 'Grade'] = 'C'
    elif row['Marks'] >= 40 and row['Marks'] < 50:
        df1.loc[index, 'Grade'] = 'D'
    elif row['Marks'] < 40:
        df1.loc[index, 'Grade'] = 'F'
print("**** DataFrame after updation ****")
print(df1)
Output
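The question asks for iteration, but the same grade bands can also be applied without a loop. The sketch below uses pd.cut() as an alternative and is not the method the exercise requires. The bin edges are taken from the conditions in the code above, and the names bins, labels and df are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Sajeev', 'Rajeev', 'Sanjay', 'Abhay'],
                   'Marks': [76, 86, 55, 54]})

# Lower edge of each band is included, upper edge excluded (right=False)
bins = [-float('inf'), 40, 50, 60, 70, 90, float('inf')]
labels = ['F', 'D', 'C', 'B', 'A', 'A+']

df['Grade'] = pd.cut(df['Marks'], bins=bins, labels=labels, right=False)
print(df)
```

With right=False a mark of exactly 90 falls in the A+ band and a mark of exactly 70 in the A band, matching the >= comparisons in the loop.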
Practical Question-9
Data Frame-Statistical Functions
Create a Data Frame named ‘Cricket’ and perform all statistical functions on the same.
Code
import pandas as pd
data = {'Name' : ['Sachin', 'Dhoni', 'Virat', 'Rohit',
'Shikhar'], 'Age' : [26, 25, 25, 24, 31], 'Score' : [87,
67, 89, 55, 47]}
df1 = pd.DataFrame (data); print (df1)
print ("Max score : ", df1['Score'].max())
print ("Min score : ", df1['Score'].min())
print ("Sum of score : ", df1['Score'].sum())
print ("Mean/Avg of score : ", df1['Score'].mean())
print ("Mode of score : ", df1['Score'].mode())
print ("Standard deviation of score : ",
df1['Score'].std())
print ("Variance of score : ", df1['Score'].var())
Output
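Several of these statistics can be obtained in one call with describe(). The sketch below uses the same data as the code above and also shows median(), which that code does not call; the name summary is illustrative.

```python
import pandas as pd

data = {'Name': ['Sachin', 'Dhoni', 'Virat', 'Rohit', 'Shikhar'],
        'Age': [26, 25, 25, 24, 31],
        'Score': [87, 67, 89, 55, 47]}
df = pd.DataFrame(data)

# count, mean, std, min, quartiles and max in a single Series
summary = df['Score'].describe()
print(summary)

print("Median of score : ", df['Score'].median())
# mode() returns a Series because several values can tie for most frequent
print("Mode of age : ", df['Age'].mode().tolist())
```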
Practical Question-10
Data Frame-Addition of Rows and Columns
Consider the following Data Frame.
Create the above Data Frame and add the following information using the append() function. (DataFrame.append() was removed in pandas 2.0, so the code below uses pd.concat() instead.)
Add job information as follows : Engr, Engr, Dr, Dr, HR, Analyst, HR
Code
import pandas as pd
# Create initial data
data = {'Name': ['Jack', 'Riti', 'Vikas', 'Neelu', 'John'],'Age': [34, 30, 31,
32, 16],'City': ['Sydney', 'Delhi', 'Mumbai', 'Banglore', 'New York'],
'Country': ['Australia', 'India', 'India', 'India', 'US']}
# Create DataFrame
df1 = pd.DataFrame(data)
print("**** Initial DataFrame ****")
print(df1)
# Create new rows as DataFrames
new_rows = pd.DataFrame({'Name': ['Mike', 'Saahil'],'Age': [17, 12],
'City': ['Las Vegas', 'Mumbai'],'Country': ['US', 'India']})
# Concatenate DataFrames
df2 = pd.concat([df1, new_rows], ignore_index=True)
print("**** DataFrame after adding new rows ****")
print(df2)
# Add a new column 'Job'
df2['Job'] = ['Engr', 'Engr', 'Dr', 'Dr', 'HR', 'Analyst', 'HR']
print("**** Final DataFrame ****")
print(df2)
Output
Practical Question – 11
Data Frame-Removal of Rows and Columns
Consider the following Data Frame.
Output
Practical Question-12
Data Frame-Accessing Rows and Columns
Consider the given Data Frame :
Output
Practical Question-13
Data Frame-Accessing Elements using Operators
Create the Data Frame shown :
Practical Question-14
Data Frame-Boolean Indexing
Create a Data Frame containing online classes information as follows :
Output
Practical Question-15
CSV Import and Export
Import the following data from the CSV File “PriceList”.
Increase the price of all items by 2% and export the updated data to another CSV File “PriceList_Updated”.
Code
import pandas as pd
# Create a DataFrame from the dictionary
data = {'P_ID': [101, 102, 103, 104, 105],'Product_Name': ['Computer',
'Laptop', 'Monitor', 'Tablet', 'Printer'],'Price': [800, 1200, 300, 450, 150]}
df = pd.DataFrame(data)
# Print the original DataFrame
print("Original DataFrame:")
print(df)
# Update the Price column by adding 2%
df['Price'] = df['Price'] + (df['Price'] * 0.02)
# Print the updated DataFrame
print("\nUpdated DataFrame:")
print(df)
# Save the updated DataFrame to a new CSV file
df.to_csv(r"PriceList_Updated.csv", index=False)
Output
(i)
(ii)
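The code above builds the data from a dictionary rather than reading it from a file. A minimal sketch of the full import and export round trip follows. It assumes a file named PriceList.csv with the columns P_ID, Product_Name and Price. That file is created in a temporary folder here only so that the example can run on its own.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "PriceList.csv")
    target = os.path.join(folder, "PriceList_Updated.csv")

    # Stand-in for the real PriceList.csv (assumed column layout)
    pd.DataFrame({'P_ID': [101, 102, 103, 104, 105],
                  'Product_Name': ['Computer', 'Laptop', 'Monitor',
                                   'Tablet', 'Printer'],
                  'Price': [800, 1200, 300, 450, 150]}).to_csv(source, index=False)

    # Import, raise every price by 2%, then export
    prices = pd.read_csv(source)
    prices['Price'] = (prices['Price'] * 1.02).round(2)
    prices.to_csv(target, index=False)

    # Read the exported file back to confirm what was written
    reloaded = pd.read_csv(target)

print(reloaded)
```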
Practical Question-16
Data Handling Using CSV Files
Create a menu driven program to perform the following:
1. Add details and create a file “stu.csv”.
2. Update details and modify csv.
3. Delete details.
4. View details.
5. Display Graph.
Code
import pandas as pd
import matplotlib.pyplot as plt
import os
def main():
    print("**** MENU ****")
    print("1. Add Details\n2. Update Details\n3. Delete Details\n"
          "4. View all Details\n5. Display Graph\n")
    choice = int(input("Enter choice (1-5): "))
    if choice == 1:
        # Adding new student details
        df = pd.DataFrame(columns=['Roll No.', 'Name', 'Marks'])
        n = int(input("Enter number of students: "))
        for i in range(n):
            rn = int(input("Roll number: "))
            name = input("Enter name: ")
            marks = float(input("Enter marks: "))
            df.loc[i] = [rn, name, marks]
        print(df)
        df.to_csv("Stu.csv", index=False)
    elif choice == 2:
        # Updating existing student details
        if os.path.exists("Stu.csv"):
            df = pd.read_csv("Stu.csv")
            rn = int(input("Enter Roll number to update: "))
            if rn in df['Roll No.'].values:
                name = input("Enter new name: ")
                marks = float(input("Enter new marks: "))
                index = df[df['Roll No.'] == rn].index[0]
                df.loc[index, 'Name'] = name
                df.loc[index, 'Marks'] = marks
                print("Updated DataFrame:")
                print(df)
                df.to_csv("Stu.csv", index=False)
            else:
                print("Roll number not found.")
        else:
            print("No data file found. Please add details first.")
    elif choice == 3:
        # Deleting a student record
        if os.path.exists("Stu.csv"):
            df = pd.read_csv("Stu.csv")
            rn = int(input("Enter Roll number to delete: "))
            if rn in df['Roll No.'].values:
                df = df[df['Roll No.'] != rn]
                print("Updated DataFrame after deletion:")
                print(df)
                df.to_csv("Stu.csv", index=False)
            else:
                print("Roll number not found.")
        else:
            print("No data file found. Please add details first.")
    elif choice == 4:
        # Viewing all student details
        if os.path.exists("Stu.csv"):
            df = pd.read_csv("Stu.csv")
            print("All Student Details:")
            print(df)
        else:
            print("No data file found.")
    elif choice == 5:
        # Displaying a graph of student marks
        if os.path.exists("Stu.csv"):
            df = pd.read_csv("Stu.csv")
            print("**** Your Graph ****")
            x = df['Name'].values.tolist()
            y = df['Marks'].values.tolist()
            plt.bar(x, y, width=0.5)
            plt.xlabel("Name")
            plt.ylabel("Marks")
            plt.title("Students vs. Marks")
            plt.show()
        else:
            print("No data file found.")
    else:
        print("Invalid choice. Please select a number between 1 and 5.")
if __name__ == "__main__":
    main()
Output
(ii)
(iii)
Practical Question-18
Data Visualization Multiline and Multibar Charts
Consider the data given below. Using the above data, plot the following:
Output
(i)
(ii)
Practical Question-19
Data Visualisation-School Results Analysis
Given the school result data. Analyze the performance of students using
data visualization techniques.
i) Draw a bar chart to represent the above data with appropriate labels and title.
ii) Given subject average data for 3 years, draw a multi-bar chart to represent the above data with appropriate labels, title, and legend.
iii) Plot a histogram for Marks data of 20 students of a class. The data (Marks of 20 students) is as follows:
[90, 99, 95, 92, 92, 90, 85, 82, 75, 78, 83, 82, 85, 90, 92, 98, 99, 100]
Code(i)
import matplotlib.pyplot as plt
import numpy as np
# Average marks for each subject
avg20 = [85, 88, 87, 73, 80, 90]
# Create an array for the x-axis
x_axis = np.arange(len(avg20))
# Create the bar chart
plt.bar(x_axis, avg20, width=0.4, color='khaki')  # Adjusted width for better visibility
# Set the x-ticks and labels
plt.xticks(x_axis, ['Eng', 'Eco', 'Bst', 'Acc', 'Entre', 'Mgt'])  # Ensure all subjects are unique
# Add title and labels
plt.title('Subject-wise Average Marks in 2020')
plt.xlabel('Subjects')
plt.ylabel('Average Marks')
# Add grid for better readability
plt.grid(axis='y', linestyle='--', alpha=0.7)
# Show the plot
plt.show()
Code(ii)
import matplotlib.pyplot as plt
import numpy as np
avg18 = [87, 88, 90, 76, 82, 90]
avg19 = [86, 87, 87, 74, 81, 91]
avg20 = [85, 88, 87, 73, 80, 90]
x_axis = np.arange(len(avg18)) # [0, 1, 2, 3, 4, 5]
plt.bar(x_axis, avg18, width=0.2, label='2018')
plt.bar(x_axis + 0.2, avg19, width=0.2, label='2019')
plt.bar(x_axis + 0.4, avg20, width=0.2, label='2020')
plt.legend(loc='upper left')
plt.xticks(x_axis + 0.2, ['Eng', 'Eco', 'Bst', 'Acc', 'Entre', 'Mgt'])  # centre each label under its group of three bars
plt.title('Subjects Average Marks of 3 Years')
plt.xlabel('Subjects')
plt.ylabel('Average Marks')
plt.grid(True)
plt.show()
Code(iii)
import matplotlib.pyplot as plt
class1 = [90, 99, 95, 92, 92, 90, 85, 82, 75, 78, 83, 82, 85, 90, 92, 98, 99,
100]
plt.hist(class1, bins=[75, 80, 85, 90, 95, 100], color="springgreen",
edgecolor="darkslategrey", linewidth=2)
plt.xlabel('Marks')
plt.ylabel('Frequency')
plt.title('Histogram of Class Marks') # Optional: Add a title for clarity
plt.show()
Output
(i)
(ii)
(iii)
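The bar heights in the histogram of part (iii) can be checked without drawing anything. The sketch below counts the marks per bin with np.histogram(), using the same bin edges as Code(iii); the names counts and edges are illustrative.

```python
import numpy as np

marks = [90, 99, 95, 92, 92, 90, 85, 82, 75, 78, 83, 82, 85, 90, 92, 98, 99, 100]
bins = [75, 80, 85, 90, 95, 100]

# Each bin includes its lower edge; only the last bin also includes its upper edge
counts, edges = np.histogram(marks, bins=bins)

for low, high, count in zip(edges[:-1], edges[1:], counts):
    print(f"{low}-{high}: {count}")
```

The counts add up to the number of marks in the list as given, which is a useful check that no value fell outside the bins.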
Practical Question – 20
Covid Data Analysis
Import the CSV File “CovidData.csv” and plot a bar chart with Country vs. Total Confirmed cases.
Code
import pandas as pd
import matplotlib.pyplot as plt
data = {'Country': ['Austria', 'Belgium', 'China', 'Denmark', 'France',
                    'Germany', 'Iran', 'Italy', 'Japan', 'South Korea', 'Malaysia',
                    'Netherlands', 'Norway', 'Spain', 'Sweden', 'Switzerland', 'US', 'UK'],
        'Total Confirmed Cases': [39309, 103000, 85297, 24257, 468000, 276000,
                                  429000, 301000, 79438, 23106, 10358, 95995,
                                  13005, 682000, 89436, 50378, 6890000, 404000],
        'Percentage': [0.13, 0.33, 0.27, 0.08, 1.49, 0.88, 1.37, 0.96, 0.25,
                       0.07, 0.03, 0.31, 0.04, 2.17, 0.28, 0.16, 21.94, 1.29]}
df = pd.DataFrame(data)
print(df)
country = df['Country'].tolist()
confirmedcases = df['Total Confirmed Cases'].tolist()
plt.bar(country, confirmedcases, width=0.4, align='center',
color='midnightblue')
plt.title('Total Number of Covid Cases')
plt.xlabel('Country'); plt.ylabel('Total Cases')
plt.xticks(rotation=90)
plt.show()
Output
Practical Question-21
Company Sales Data Analysis
Import the CSV File “CompanySalesData.csv” and plot a line chart with month number against total profit.
Code
import pandas as pd
import matplotlib.pyplot as plt
# Create the DataFrame from the dictionary
data = {'month_num': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],'facecream':
[2500, 2630, 2140, 3400, 3600, 2760, 2980, 3700, 3540, 1990, 2340,
2900],'facewash': [1500, 1200, 1340, 1130, 1740, 1555, 1120, 1400,
1780, 1890, 2100, 1760],'toothpaste': [5200, 5100, 4550, 5870, 4560,
4890, 4780, 5860, 6100, 8300, 7300, 7400],'bathingsoap': [9200, 6100,
9550, 8870, 7760, 7490, 8980, 9960, 8100, 10300, 13300,
14400],'shampoo': [1200, 2100, 3550, 1870, 1560, 1890, 1780, 2860,
2100, 2300, 2400, 1800],'moisturizer': [1500, 1200, 1340, 1130, 1740,
1555, 1120, 1400, 1780, 1890, 2100, 1760]}
df = pd.DataFrame(data)
# Calculate total profit for each month
df['total_profit'] = df[['facecream', 'facewash', 'toothpaste', 'bathingsoap',
'shampoo', 'moisturizer']].sum(axis=1)
# Prepare data for plotting
x = df['month_num'].tolist()
y = df['total_profit'].tolist()
# Create the plot
plt.plot(x, y, color='royalblue', linewidth=1.0, marker='*', ms=10,
mec='black', mfc='midnightblue')
# Set labels and title
plt.xlabel('Month Number')
plt.ylabel('Total Profit')
plt.title('Company Sales Data')
plt.xticks(x)  # Optional: Set x-ticks to be the month numbers
plt.grid(True)  # Optional: Add grid for better readability
# Show the plot
plt.show()
Output
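Before reading the chart, the plotted totals can be verified numerically. The sketch below recomputes the monthly totals from the same data and finds the month with the highest value. Using month_num as the index is a choice made for this example, and the names totals and best_month are illustrative.

```python
import pandas as pd

data = {'month_num': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
        'facecream': [2500, 2630, 2140, 3400, 3600, 2760, 2980, 3700, 3540, 1990, 2340, 2900],
        'facewash': [1500, 1200, 1340, 1130, 1740, 1555, 1120, 1400, 1780, 1890, 2100, 1760],
        'toothpaste': [5200, 5100, 4550, 5870, 4560, 4890, 4780, 5860, 6100, 8300, 7300, 7400],
        'bathingsoap': [9200, 6100, 9550, 8870, 7760, 7490, 8980, 9960, 8100, 10300, 13300, 14400],
        'shampoo': [1200, 2100, 3550, 1870, 1560, 1890, 1780, 2860, 2100, 2300, 2400, 1800],
        'moisturizer': [1500, 1200, 1340, 1130, 1740, 1555, 1120, 1400, 1780, 1890, 2100, 1760]}

# With month_num as the index, every remaining column is a product
df = pd.DataFrame(data).set_index('month_num')

totals = df.sum(axis=1)        # one total per month
best_month = totals.idxmax()   # index label of the largest total

print(totals)
print("Highest total in month", best_month, ":", totals.max())
```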