Python Programs
Introduction:
Pandas (the name derives from "panel data") is a
popular Python library for data manipulation and
analysis.
Its rich set of functions makes it easy to import
and export data, and it keeps common data-analysis
tasks simple.
A data structure is a collection of data values
and the operations that can be applied to that
data. It enables efficient storage, retrieval,
and modification to the data.
Pandas offers three data structures:
1. Series
2. DataFrame
3. Panel (deprecated and since removed from current
pandas releases)
A key difference between a Pandas Series and a
NumPy ndarray is that operations between Series
automatically align the data on labels.
We can therefore write computations without
checking whether the Series involved share the
same labels; a label present in only one operand
simply yields NaN. Operations between ndarrays of
mismatched shapes, in contrast, raise an error.
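The label alignment described above can be seen in a minimal sketch. The labels 'x', 'y', 'z' and the values are illustrative assumptions, not data from this file.

```python
import numpy as np
import pandas as pd

# Two Series whose labels only partly overlap (illustrative values).
a = pd.Series([10, 20, 30], index=['x', 'y', 'z'])
b = pd.Series([1, 2], index=['y', 'z'])

# Series addition aligns on labels; 'x' has no partner, so it becomes NaN.
total = a + b
print(total)

# The same data as plain ndarrays: shapes (3,) and (2,) cannot be combined.
try:
    a.to_numpy() + b.to_numpy()
except ValueError as err:
    print('ndarray addition failed:', err)
```

Only the labels present in both Series produce a number; the ndarray version fails because NumPy has no labels to align on.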
CODE 1
Write a code to create a series using the Python
sequence [4,6,8,10]. Assume that Pandas is imported as
pd.
import pandas as pd
print('CODE 1')
s1= pd.Series([4,6,8,10])
print('Series object 1: ')
print(s1)
print('_______________________________________________')
OUTPUT (1)
CODE 2
Write a code to create a Series object using the python
sequence (11,21,31,41). Assume that Pandas is imported
as pd.
import pandas as pd
print('CODE 2')
s2=pd.Series((11,21,31,41))
print('Series object 2:')
print(s2)
print('_______________________________________________')
OUTPUT (2)
CODE 3
Write a program to create a series object using
individual characters ‘o’,’h’,’o’. Assume that Pandas is
imported as pd.
import pandas as pd
print('CODE 3')
s3=pd.Series(['o','h','o'])
print('Series object: ')
print(s3)
print('_______________________________________________')
OUTPUT (3)
CODE 4
Write a program to create a Series object using a
string: ‘So funny’. Assume that Pandas is imported as
pd.
import pandas as pd
print('CODE 4')
print('series object: ')
s4=pd.Series('So funny')
print(s4)
print('_______________________________________________')
OUTPUT (4)
CODE 5
Write a program to create a Series object using three
different words: ‘I’, ‘am’, ‘laughing’. Assume that
Pandas is imported as pd.
import pandas as pd
print('CODE 5')
s5=pd.Series(['I','am','laughing'])
print('Series object: ')
print(s5)
print('_______________________________________________')
OUTPUT (5)
CODE 6
Write a program to create a series object using an ndarray
that has 5 elements in the range 24 to 64.
import pandas as pd
import numpy as np
print('CODE 6')
s6=pd.Series(np.linspace(24,64,5))
print(s6)
print('_______________________________________________')
OUTPUT (6)
CODE 7
Write a program to create a Series object using an
ndarray that is created by tiling a list[3,5] twice.
import pandas as pd
import numpy as np
print('CODE 7')
s7=pd.Series(np.tile([3,5],2))
print(s7)
print('_______________________________________________')
OUTPUT (7)
CODE 8
Write a program to create a Series object using a
dictionary that stores the number of students in each
section of class 12 in your school.
import pandas as pd
print('CODE 8')
stu={'A':39,'B':41,'C':42,'D':44}
s8=pd.Series(stu)
print(s8)
print('_______________________________________________')
OUTPUT (8)
CODE 9
Write a program to create a Series object that stores
the initial budget allocated five hundred thousand each
for the four quarters of the year Qtr1, Qtr2, Qtr3 and
Qtr4.
import pandas as pd
print('CODE 9')
s9=pd.Series(500000,index=['Qtr1','Qtr2','Qtr3','Qtr4'])
print(s9)
print('_______________________________________________')
OUTPUT (9)
CODE 10
Total number of medals to be won is two hundred in the
Inter University games held every alternate year. Write
code to create a series object that stores these medals
for games to be held in the decade 2020-29.
import pandas as pd
print('CODE 10')
s10=pd.Series(200, index=range(2020,2029,2))
print(s10)
print('_______________________________________________')
OUTPUT (10)
CODE 11
A python list, namely section stores the section names
‘A’,’B’,’C’,’D’ of class 12 in your school. Another list
contri stores the contribution made by these students to
a charity fund. Write a code to create a Series object
that stores the contribution amount as the values and
the section names as the indexes.
import pandas as pd
print('CODE 11')
section=['A','B','C','D']
contri=[6700,5600,5000,5200]
s11=pd.Series(data=contri,index=section)
print(s11)
print('_______________________________________________')
OUTPUT (11)
CODE 12
Sequences section and contri1 store the section
names ('A','B','C','D','E') and the contributions made by
them (6700, 5600, 5000, 5200, nil) respectively for a
charity. Your school has decided to donate as much as
each section contributes, i.e., the donation will be
doubled.
Write a code to create a Series object that stores the
contribution amount as the values and the section names
as the indexes, with datatype float32.
import pandas as pd
import numpy as np
print('CODE 12')
section=['A','B','C','D','E']
contri=np.array([6700,5600,5000,5200,np.nan])
s12=pd.Series(data=contri*2,index=section,dtype=np.float32)
print(s12)
print('_______________________________________________')
OUTPUT (12)
CODE 13
Consider the two series objects s11 and s12 that you
created in examples 11 and 12 respectively. Print the
attributes of both these objects in a report form as
shown below:
Attribute names      object s11    object s12
data type
shape
no of bytes
no of dimensions
item size
has NaNs?
empty?
import pandas as pd
import numpy as np
section=['A','B','C','D']
contri=[6700,5600,5000,5200]
s11=pd.Series(data=contri,index=section)
section=['A','B','C','D','E']
contri=np.array([6700,5600,5000,5200,np.nan])
s12=pd.Series(data=contri*2,index=section,dtype=np.float32)
print('CODE 13')
print('attribute name| Object s11| Object s12')
print('------------','------------','------------')
print('Data type(.dtype)',s11.dtype,s12.dtype)
print('Shape(.shape)',s11.shape,s12.shape)
print('no of bytes(.nbytes)',s11.nbytes,s12.nbytes)
print('no of dimensions(.ndim)',s11.ndim,s12.ndim)
print('item size(.dtype.itemsize)',s11.dtype.itemsize,s12.dtype.itemsize)
print('Has NaNs?(.hasnans)',s11.hasnans,s12.hasnans)
print('empty?(.empty)',s11.empty,s12.empty)
print('_______________________________________________')
OUTPUT (13)
CODE 14
Consider a Series object s8 that stores the number of
students in each section of class 12 as shown below
A 39
B 41
C 42
D 44
First two sections have been given a task of selling
tickets @ 100/- per ticket as part of a social
experiment. Write a code to display how much they have
collected.
import pandas as pd
stu={'A':39,'B':41,'C':42,'D':44}
s8=pd.Series(stu)
print('CODE 14')
print('Tickets amount: ')
print(s8[:2]*100)
print('_______________________________________________')
OUTPUT (14)
CODE 15
Consider the Series object s13 that stores the
contribution of each section as shown below:
A 6700
B 5600
C 5000
D 5200
import pandas as pd
s13=pd.Series([6700,5600,5000,5200])
s13[0]=7600
s13[2:]=7000
print('Series object after modifying amounts')
print(s13)
print('_______________________________________________')
OUTPUT(15)
CODE 16
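A Series object trdata consists of around 10 rows of
data. Write a program to print the following details: i)
first 4 rows ii) last 5 rows of data
import pandas as pd
# trdata is assumed to be an existing Series object
# i) first 4 rows
print(trdata.head(4))
# ii) last 5 rows (tail() returns 5 rows by default)
print(trdata.tail())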
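Since trdata itself is not defined in this file, here is a runnable sketch that uses a hypothetical ten-row stand-in. The values are an assumption chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for trdata: ten values with default integer labels.
trdata = pd.Series(range(100, 200, 10))

first_four = trdata.head(4)   # i) first 4 rows
last_five = trdata.tail()     # ii) tail() returns the last 5 rows by default
print(first_four)
print(last_five)
```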
CODE 17
Number of students in classes 11 and 12 in three streams
(science, commerce, and humanities) are stored in two
Series objects c11 and c12. Write a code to find total
number of students in classes 11 and 12, streamwise.
import pandas as pd
print('CODE 17')
c11=pd.Series(data=[30,40,50],
              index=['Science','Commerce','Humanities'])
c12=pd.Series(data=[37,44,45],
              index=['Science','Commerce','Humanities'])
print('Total no. of students ')
print(c11+c12)
print('_______________________________________')
OUTPUT (17)
CODE 18
Object1 Population stores the details of population in
four metro cities of India and Object2 AvgIncome stores
the total average income reported in previous year in
each of these metros. Calculate income per capita for
each of these metro cities.
import pandas as pd
print('CODE 18')
Population=pd.Series([10927986,12691836,4631392,4328863],
                     index=['Delhi','Mumbai','Kolkata','Chennai'])
AvgIncome=pd.Series([72167810927986,85087812691836,
                     4226784631392,5261784328863],
                    index=['Delhi','Mumbai','Kolkata','Chennai'])
perCapita = AvgIncome / Population
print("Population in four metro cities")
print(Population)
print("Avg. Income in four metro cities")
print(AvgIncome)
print("Per Capita Income in four metro cities")
print(perCapita)
print('___________________________________________')
OUTPUT (18)
CODE 19
What will be the output produced by the following
program?
import pandas as pd
info=pd.Series(data=[31,41,51])
print(info)
print(info>40)
print(info[info>40])
OUTPUT(19)
CODE 20
Series object s11 stores the charity contribution made
by each section
A 6700
B 5600
C 5000
D 5200
import pandas as pd
s11=pd.Series([6700,5600,5000,5200],
index=['A','B','C','D'])
print('Contribution >5500 by :')
print(s11[s11>5500])
OUTPUT(20)
CODE 21
Given a dictionary that stores the section names' list as
the value of the 'Section' key and the contribution
amounts' list as the value of the 'Contri' key:
dict1={'Section':['A','B','C','D'],
'Contri':[6700,5600,5000,5200]}
Write a code to create and display the data frame using
above dictionary.
import pandas as pd
dict1={'Section':['A','B','C','D'],
'Contri':[6700,5600,5000,5200]}
df1=pd.DataFrame(dict1)
print(df1)
OUTPUT(21)
CODE 22
Create and display a DataFrame from a 2D dictionary,
Sales, which stores the quarter-wise sales as inner
dictionary for two years as shown below:
Sales={'yr1':{'Qtr1':34500,'Qtr2':56000,'Qtr3':47000,'Qtr4':49000},
'yr2':{'Qtr1':44900,'Qtr2':46100,'Qtr3':57000,'Qtr4':59000}}
import pandas as pd
sales={'yr1':{'Qtr1':34500,'Qtr2':56000,'Qtr3':47000,'Qtr4':49000},
       'yr2':{'Qtr1':44900,'Qtr2':46100,'Qtr3':57000,'Qtr4':59000}}
dfsales=pd.DataFrame(sales)
print(dfsales)
OUTPUT(22)
CODE 23
Read the following code:
import pandas as pd
yr1={'Qtr1':44900,'Qtr2':46100,'Q3':57000,'Q4':59000}
yr2={'A':54500,'B':51000,'Qtr4':57000}
disales={1:yr1,2:yr2}
df3=pd.DataFrame(disales)
i) list the index labels of the dataframe df3
ii) list the column names of dataframe df3
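One way to check both answers is to build df3 and inspect its labels directly. This is a sketch: the index is the union of the inner dictionaries' keys and the columns are the outer keys, with cells missing from one year filled with NaN. The order in which the index labels appear may differ between pandas versions.

```python
import pandas as pd

yr1 = {'Qtr1': 44900, 'Qtr2': 46100, 'Q3': 57000, 'Q4': 59000}
yr2 = {'A': 54500, 'B': 51000, 'Qtr4': 57000}
df3 = pd.DataFrame({1: yr1, 2: yr2})

# i) index labels: every key that appears in either inner dictionary
print(list(df3.index))
# ii) column names: the keys of the outer dictionary
print(list(df3.columns))
```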
OUTPUT (24)
CODE 25
Write a program that creates a dataframe for a 2D list.
Specify own index labels.
import pandas as pd
list2=[[25,45,60],[34,67,89],[88,90,56]]
df2=pd.DataFrame(list2,index=['row1','row2','row3'])
print(df2)
OUTPUT (25)
CODE 26
Write a program to create a dataframe for a list containing
2 lists, each containing Target and actual Sales figures
for four zonal offices. Give appropriate row labels.
import pandas as pd
Target = [56000, 70000, 75000, 60000]
Sales = [58000, 68000, 78000, 61000]
ZoneSales = [Target, Sales]
zsaleDf = pd.DataFrame(ZoneSales, columns = ['ZoneA',
'ZoneB', 'ZoneC', 'ZoneD'], index = ['Target', 'Sales'])
print(zsaleDf)
OUTPUT (26)
CODE 27
What is the output of the following code?
import pandas as pd
import numpy as np
arr1=np.array([[11,12],[13,14],[15,16]],np.int32)
dtf2=pd.DataFrame(arr1)
print(dtf2)
OUTPUT (27)
CODE 28
Write a program to create a DataFrame from a 2D array as
shown below:
import pandas as pd
import numpy as np
arr2=np.array([[101,113,124],[130,140,200],
[115,216,217]])
df3=pd.DataFrame(arr2)
print(df3)
OUTPUT (28)
CODE 29
Consider two series objects staff and salaries that
store the number of people in various office branches
and salaries distributed in these branches,
respectively. Write a program to create another Series
object that stores average salary per branch and then
create a DataFrame object from these Series objects.
import pandas as pd
import numpy as np
staff=pd.Series([20,36,44])
sal=pd.Series([279000,396800,563000])
avg=sal/staff
org={'people':staff,'amount':sal,'average':avg}
df5=pd.DataFrame(org)
print(df5)
OUTPUT (29)
CODE 30
Write a program to create a DataFrame to store weight,
age and names of 3 people. Print the DataFrame and its
transpose.
import pandas as pd
df=pd.DataFrame({'Weight':[42,75,66],
'Name':['Arnav','Charles','Guru'],
'Age':[15,22,35]})
print('Original DataFrame')
print(df)
print('Transpose: ')
print(df.T)
OUTPUT (30)
CODE 31
Given a dataframe namely aid that stores the aid by NGOs
for different states:
        Toys  Books  Uniform  Shoes
Andhra  7916   6189      610   8810
Odisha  8508   8208      508   6798
M.P.    7226   6149      611   9611
U.P.    7617   6157      457   6457
Write a program to display the aid for
I) Books and Uniform only
II) Shoes only
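import pandas as pd
# build the aid DataFrame from the table above
aid=pd.DataFrame({'Toys':[7916,8508,7226,7617],
                  'Books':[6189,8208,6149,6157],
                  'Uniform':[610,508,611,457],
                  'Shoes':[8810,6798,9611,6457]},
                 index=['Andhra','Odisha','M.P.','U.P.'])
print('Aid for books and uniform ')
print(aid[['Books', 'Uniform']])
print('Aid for shoes')
print(aid.Shoes)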
CODE 32
Given a dataframe namely aid that stores the aid by NGOs
for different states:
        Toys  Books  Uniform  Shoes
Andhra  7916   6189      610   8810
Odisha  8508   8208      508   6798
M.P.    7226   6149      611   9611
U.P.    7617   6157      457   6457
Write a program to display the aid for states ‘Andhra’
and ‘Odisha’ for Books and Uniform only.
import pandas as pd
# aid is assumed to hold the DataFrame shown in the table above
print(aid.loc[['Andhra','Odisha'],'Books':'Uniform'])
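For a version that runs on its own, the sketch below rebuilds aid from the table and applies the same selection. In .loc the row part is a list of labels and the column part is a label slice, which includes both end labels.

```python
import pandas as pd

# aid rebuilt from the table of NGO aid given in the question.
aid = pd.DataFrame(
    {'Toys': [7916, 8508, 7226, 7617],
     'Books': [6189, 8208, 6149, 6157],
     'Uniform': [610, 508, 611, 457],
     'Shoes': [8810, 6798, 9611, 6457]},
    index=['Andhra', 'Odisha', 'M.P.', 'U.P.'])

# Rows by label list, columns by label slice ('Books' through 'Uniform').
subset = aid.loc[['Andhra', 'Odisha'], 'Books':'Uniform']
print(subset)
```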
CODE 33
Consider a dataframe df as shown below.
CODE 35
Given a Series object s5. Write a program to store the
squares of the Series values in object s6. Display s6's
values which are > 15.
import pandas as pd
# s5 is not defined in this file; the sample values are an assumption
s5=pd.Series([2,4,6,8])
print("Series object s5 : ")
print(s5)
s6=s5**2
print("Values in s6 > 15: ")
print(s6[s6>15])
OUTPUT (35)
CODE 36
Write a program to add a column namely Orders having
values 6000, 6700, 6200, 6000 for zones A, B, C, D. Also
add a new row (zoneE) with some dummy values.
import pandas as pd
d={'Target':[56000,70000,75000,60000],
'Sales':[58000,68000,78000,61000]}
df=pd.DataFrame(d,index=['zoneA','zoneB','zoneC','zoneD'])
df['Orders']=[6000,6700,6200,6000]
df.loc['zoneE',:]=[50000,45000,5000]
print(df)
OUTPUT (36)
CODE 37
From the dtf5 used above, create another DataFrame and it
must not contain the column ‘Population’ and the row
Bangalore.
import pandas as pd
d={'Population':
[10927986.0,12691836.0,4631392.0,4328063.0,678097.0],
'Hospitals':[189.0,208.0,149.0,157.0,1200.0],
'Schools':[7916.0,8508.0,7226.0,7617.0,1200.0]}
dtf5=pd.DataFrame(d,index=['Delhi','Mumbai','Kolkata','Chennai',
                           'Bangalore'])
# work on a copy so that dtf5 keeps all its rows and columns
dtf6=dtf5.copy()
del dtf6['Population']
dtf6=dtf6.drop(['Bangalore'])
print(dtf6)
OUTPUT (37)
CODE 38
Consider the saleDf shown below
Target Sales
zoneA 56000 58000
zoneB 70000 68000
zoneC 75000 78000
zoneD 60000 61000
Write a program to rename indexes 'zoneC' and 'zoneD'
as 'Central' and 'Dakshin' respectively and the column
names Target and Sales as 'Targeted' and
'Achieved' respectively.
import pandas as pd
d={'Target':[56000,70000,75000,60000],
'Sales':[58000,68000,78000,61000]}
df=pd.DataFrame(d,index=['zoneA','zoneB','zoneC','zoneD'])
print(df.rename(index={'zoneC':'Central','zoneD':'Dakshin'},
                columns={'Target':'Targeted','Sales':'Achieved'}))
OUTPUT (38)
PYTHON - PANDAS – II
OUTPUT (1)
CODE 2
Use iterrows() to extract row-wise Series objects:
import pandas as pd
disales={'yr1':
{'Qtr1':34500,'Qtr2':56000,'Qtr3':47000,'Qtr4':49000},
'yr2':
{'Qtr1':44900,'Qtr2':46100,'Qtr3':57000,'Qtr4':59000},
'yr3':
{'Qtr1':54500,'Qtr2':51000,'Qtr3':57000,'Qtr4':48500}}
df1=pd.DataFrame(disales)
for (row, rowSeries) in df1.iterrows():
    print('Row index:',row)
    print('containing: ')
    i=0
    for val in rowSeries:
        print('At',i,'position: ',val)
        i=i+1
OUTPUT (2)
CODE 3
Use items() (named iteritems() in pandas versions before
2.0) to extract data from a dataframe column-wise.
import pandas as pd
disales={'yr1':
         {'Qtr1':34500,'Qtr2':56000,'Qtr3':47000,'Qtr4':49000},
         'yr2':
         {'Qtr1':44900,'Qtr2':46100,'Qtr3':57000,'Qtr4':59000},
         'yr3':
         {'Qtr1':54500,'Qtr2':51000,'Qtr3':57000,'Qtr4':48500}}
df1=pd.DataFrame(disales)
# iteritems() was removed in pandas 2.0; items() does the same job
for (col, colSeries) in df1.items():
    print('Column index: ',col)
    print('containing: ')
    print(colSeries)
OUTPUT (3)
CODE 4
Write a program to print the DataFrame df, one row at a
time.
import pandas as pd
# named data rather than dict, which would shadow the built-in type
data={'Name':['Ram','Pam','Sam'],
      'Marks':[70,95,80]}
df=pd.DataFrame(data,index=['Rno.1','Rno.2','Rno.3'])
for i,j in df.iterrows():
    print(j)
    print('________')
OUTPUT (4)
CODE 5
Write a program to print the DataFrame df one column at
a time.
import pandas as pd
data={'Name':['Ram','Pam','Sam'],
      'Marks':[70,95,80]}
df=pd.DataFrame(data,index=['Rno.1','Rno.2','Rno.3'])
# items() replaces iteritems(), which was removed in pandas 2.0
for i,j in df.items():
    print(j)
    print('______________')
OUTPUT (5)
CODE 6
Write a program to print only the values from the marks
column for each row.
import pandas as pd
data={'Name':['Ram','Pam','Sam'],
      'Marks':[70,95,80]}
df=pd.DataFrame(data,index=['Rno.1','Rno.2','Rno.3'])
for r, row in df.iterrows():
    print(row['Marks'])
    print('______________')
OUTPUT (6)
CODE 7
Write a program to calculate total points earned by both
the teams in each round.
import pandas as pd
d1={'p1':{'1':700,'2':975,'3':970,'4':900},
'p2':{'1':490,'2':460,'3':570,'4':590}}
d2={'p1':{'1':1100,'2':1275,'3':1270,'4':1400},
'p2':{'1':1400,'2':1260,'3':1500,'4':1190}}
df1=pd.DataFrame(d1)
df2=pd.DataFrame(d2)
print("Team1's performance")
print(df1)
print("Team2's performance")
print(df2)
print("Points earned by both teams:")
print(df1+df2)
OUTPUT (7)
CODE 8
Consider the points earned by two teams. Display how
much point difference Team2 has with Team1.
import pandas as pd
d1={'p1':{'1':700,'2':975,'3':970,'4':900},
'p2':{'1':490,'2':460,'3':570,'4':590}}
d2={'p1':{'1':1100,'2':1275,'3':1270,'4':1400},
'p2':{'1':1400,'2':1260,'3':1500,'4':1190}}
df1=pd.DataFrame(d1)
df2=pd.DataFrame(d2)
print("Team1's performance")
print(df1)
print("Team2's performance")
print(df2)
print("team2's points' difference")
print(df1.rsub(df2))
OUTPUT (8)
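rsub() is the reversed form of subtraction, so df1.rsub(df2) computes df2 - df1. The sketch below checks that equivalence using only the first two rounds of the points data above.

```python
import pandas as pd

# First two rounds of each team's points, taken from the program above.
df1 = pd.DataFrame({'p1': {'1': 700, '2': 975}, 'p2': {'1': 490, '2': 460}})
df2 = pd.DataFrame({'p1': {'1': 1100, '2': 1275}, 'p2': {'1': 1400, '2': 1260}})

diff_rsub = df1.rsub(df2)   # reversed subtraction: df2 - df1
diff_plain = df2 - df1      # the same calculation written with the operator
print(diff_rsub)
print(diff_rsub.equals(diff_plain))
```

Writing df2 - df1 is usually easier to read; rsub() is mainly useful when extra arguments such as fill_value are needed.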
CODE 9
Given two DataFrames storing points of a 2-player team
in four rounds, write a program to calculate average
points obtained by each player in each round.
import pandas as pd
d1={'p1':{'1':700,'2':975,'3':970,'4':900},
'p2':{'1':490,'2':460,'3':570,'4':590}}
d2={'p1':{'1':1100,'2':1275,'3':1270,'4':1400},
'p2':{'1':1400,'2':1260,'3':1500,'4':1190}}
df1=pd.DataFrame(d1)
df2=pd.DataFrame(d2)
print("Team1's performance")
print(df1)
print("Team2's performance")
print(df2)
print('average scored by each player')
av=(df1+df2)/2
print(av)
OUTPUT (9)
CODE 10
Consider the DataFrame (dfmks) given. Write a program to
print the maximum marks scored in each subject across
all sections.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
dfmks=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('Max marks scored in each sub')
print(dfmks.max(axis=1))
OUTPUT (10)
CODE 11
Which statement would you change in the above program so
that it considers only non-NaN values for calculation
purposes?
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
dfmks=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print(dfmks)
print(dfmks.max(axis=1,skipna=True))
OUTPUT (11)
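skipna=True is already the default for max(), so NaN values are skipped unless skipna=False is passed. The sketch below compares the two settings on the IP and Math rows of sections A and B from the marks data.

```python
import numpy as np
import pandas as pd

# IP and Math rows of sections A and B from the marks data.
marks = pd.DataFrame({'A': [94, 97], 'B': [np.nan, 100.0]},
                     index=['IP', 'Math'])

with_skip = marks.max(axis=1)                   # NaN skipped (default)
without_skip = marks.max(axis=1, skipna=False)  # NaN propagates to the result
print(with_skip)
print(without_skip)
```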
CODE 12
Consider the marks’ DataFrame(dfmks). Write a program to
print the maximum marks scored in a section, across all
subjects.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
dfmks=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('Max marks scored in each sub')
print(dfmks.max(axis=1))
print(dfmks)
print('max marks scored in a section')
print(dfmks.max())
OUTPUT (12)
CODE 13
Consider the DataFrame df. Write a program to calculate
mode for each section.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 13')
print('mode for every section')
print(df.mode())
print('___________')
OUTPUT (13)
CODE 14
Consider the DataFrame df. Write a program to calculate
mode for each subject.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 14')
print('mode for every subject')
print(df.mode(axis=1))
print('___________')
OUTPUT (14)
CODE 15
Consider the DataFrame df. Write a program to calculate
median value for each section.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 15')
print('median value for each section')
print(df.median())
print('___________')
OUTPUT (15)
CODE 16
Consider the DataFrame df. Write a program to calculate
median value for each subject.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 16')
print('median value for each subject')
print(df.median(axis=1))
print('___________')
OUTPUT (16)
CODE 17
Consider the DataFrame df. Write a program to calculate
average for each section.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 17')
print('Average marks for each section')
print(df.mean())
print('___________')
OUTPUT (17)
CODE 18
Consider the DataFrame df. Write a program to calculate
average for each subject.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 18')
print(df.mean(axis=1))
print('___________')
OUTPUT (18)
CODE 19
Consider the DataFrame df. Write a program to calculate
how many exams were conducted, for each section.
import pandas as pd
import numpy as np
d={'A':[99,90,95,94,97],
   'B':[94.0,94.0,89.0,np.nan,100.0],
   'C':[92,92,91,99,99],
   'D':[97.0,97.0,89.0,95.0,np.nan]}
df=pd.DataFrame(d,index=['Acct','Eco','Eng','IP','Math'])
print('CODE 19')
print('no. of exams conducted for each section')
print(df.count())
print('___________')
OUTPUT (19)
CODE 20
Consider the DataFrame df. Assuming that in each
section, only 1 student scored maximum marks in each
subject, write a program to calculate the topper’s total
marks in each section.
OUTPUT (20)
CODE 21
Given a DataFrame dtf6
         Hospitals  Schools
Delhi        189.0   7916.0
Mumbai       208.0   8508.0
Kolkata      149.0   7226.0
Chennai      157.0   7617.0
Write a program to display top two rows’ values of
‘Schools’ column and last 3 values of ‘Hospitals’
column.
import pandas as pd
d={'Hospitals':[189.0,208.0,149.0,157.0],
   'Schools':[7916.0,8508.0,7226.0,7617.0]}
df=pd.DataFrame(d,index=['Delhi','Mumbai','Kolkata','Chennai'])
print(df.Schools.head(2))
print(df.Hospitals.tail(3))
OUTPUT (21)
CODE 22
Consider the DataFrame prodf that stores some
agriculture statistics of some Indian states. Write a
program to find out how many states produce wheat crop.
import pandas as pd
import numpy as np
d={'Rice':{'Andhra P.':7452.4,'Gujarat':1930.0,'Kerala':2604.8,
           'Punjab':11586.2,'Tripura':814.6,'Uttar P.':13754.0},
   'Wheat':{'Andhra P.':np.nan,'Gujarat':2737.0,'Kerala':np.nan,
            'Punjab':16440.5,'Tripura':0.5,'Uttar P.':30056.0},
   'Pulses':{'Andhra P.':931.0,'Gujarat':818.0,'Kerala':1.7,
             'Punjab':33.0,'Tripura':23.2,'Uttar P.':2184.4},
   'Fruits':{'Andhra P.':7830.0,'Gujarat':11950.0,'Kerala':113.1,
             'Punjab':7152.0,'Tripura':44.1,'Uttar P.':140169.2}}
prodf=pd.DataFrame(d)
print('wheat crop is produced by',prodf['Wheat'].count(),'states.')
OUTPUT (22)
CODE 23
Consider the DataFrame prodf that stores some
agriculture statistics of some Indian states. Write a
program to find the quantity of wheat and rice crop
produced.
print('wheat and rice crops are produced by these many states')
print(prodf[['Wheat','Rice']].count())
OUTPUT (23)
CODE 24
Consider the DataFrame prodf that stores some
agriculture statistics of some Indian states. Write a
program to find how many crops are produced by Andhra
Pradesh.
print('crops produced by andhra pradesh are')
print(prodf.loc['Andhra P.',:].count())
OUTPUT (24)
CODE 25
write a program to print the maximum produce of the
middle 2 states listed(Kerala and Punjab)
print('Maximum produce by Kerala and Punjab: ')
print(prodf.iloc[2:4,:].max(axis=1))
OUTPUT (25)
CODE 26
Consider a DataFrame ndf as shown below:
Name      Sex  Position    City       Age  Projects
Rabia     F    Manager     Bangalore  30   13
Evan      M    Programmer  New Delhi  27   17
Jia       F    Manager     Chennai    32   16
Lalit     M    Manager     Mumbai     40   20
Jaspreet  M    Programmer  Chennai    28   21
Suji      F    Programmer  Bangalore  32   14
Write a program to summarize how many projects are being
handled by each position for each city. Also write a
program to summarize information position wise, about
how many projects are being handled by them for each
city. Print zero in case the data is missing for some
column/row.
import pandas as pd
# values follow the table given in the question
d={'Name':['Rabia','Evan','Jia','Lalit','Jaspreet','Suji'],
   'Sex':['F','M','F','M','M','F'],
   'Position':['Manager','Programmer','Manager','Manager',
               'Programmer','Programmer'],
   'City':['Bangalore','New Delhi','Chennai','Mumbai','Chennai',
           'Bangalore'],
   'Age':[30,27,32,40,28,32],
   'Projects':[13,17,16,20,21,14]}
ndf=pd.DataFrame(d)
print(ndf.pivot(index='Position',columns='City',values='Projects'))
print('---------------------')
OUTPUT (26)
CODE 27
print(ndf.pivot(index='Position',columns='City',
                values='Projects').fillna(0))
print('---------------------')
OUTPUT (27)
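pivot() only reshapes data and raises an error if a Position/City pair occurs more than once. When a real summary is wanted, pivot_table() with an aggregation function is a possible alternative. The sketch below assumes the rows listed in the table of the question and uses fill_value=0 to print zero for missing combinations.

```python
import pandas as pd

# Position, City and Projects columns as listed in the table of the question.
ndf = pd.DataFrame({
    'Position': ['Manager', 'Programmer', 'Manager', 'Manager',
                 'Programmer', 'Programmer'],
    'City': ['Bangalore', 'New Delhi', 'Chennai', 'Mumbai',
             'Chennai', 'Bangalore'],
    'Projects': [13, 17, 16, 20, 21, 14]})

# Sum the projects per Position/City pair; absent pairs are shown as 0.
summary = ndf.pivot_table(index='Position', columns='City',
                          values='Projects', aggfunc='sum', fill_value=0)
print(summary)
```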
CODE 28
Consider the prodf DataFrame storing crop production
statistics of various states. Write a program to sort
the values of prodf in the order of wheat production by
them.
import pandas as pd
import numpy as np
d={'Rice':{'Andhra P.':7452.4,'Gujarat':1930.0,'Kerala':2604.8,
           'Punjab':11586.2,'Tripura':814.6,'Uttar P.':13754.0},
   'Wheat':{'Andhra P.':np.nan,'Gujarat':2737.0,'Kerala':np.nan,
            'Punjab':16440.5,'Tripura':0.5,'Uttar P.':30056.0},
   'Pulses':{'Andhra P.':931.0,'Gujarat':818.0,'Kerala':1.7,
             'Punjab':33.0,'Tripura':23.2,'Uttar P.':2184.4},
   'Fruits':{'Andhra P.':7830.0,'Gujarat':11950.0,'Kerala':113.1,
             'Punjab':7152.0,'Tripura':44.1,'Uttar P.':140169.2}}
prodf=pd.DataFrame(d)
print(prodf.sort_values(by=['Wheat']))
print('----------------')
OUTPUT (28)
CODE 29
import pandas as pd
import numpy as np
d={'Rice':{'Andhra P.':7452.4,'Gujarat':1930.0,'Kerala':2604.8,
           'Punjab':11586.2,'Tripura':814.6,'Uttar P.':13754.0},
   'Wheat':{'Andhra P.':np.nan,'Gujarat':2737.0,'Kerala':np.nan,
            'Punjab':16440.5,'Tripura':0.5,'Uttar P.':30056.0},
   'Pulses':{'Andhra P.':931.0,'Gujarat':818.0,'Kerala':1.7,
             'Punjab':33.0,'Tripura':23.2,'Uttar P.':2184.4},
   'Fruits':{'Andhra P.':7830.0,'Gujarat':11950.0,'Kerala':113.1,
             'Punjab':7152.0,'Tripura':44.1,'Uttar P.':140169.2}}
prodf=pd.DataFrame(d)
print(prodf.sort_values(by=['Fruits'],ascending=False))
print('----------------')
OUTPUT (29)
CODE 30
import pandas as pd
import numpy as np
d={'Rice':{'Andhra P.':7452.4,'Gujarat':1930.0,'Kerala':2604.8,
           'Punjab':11586.2,'Tripura':814.6,'Uttar P.':13754.0},
   'Wheat':{'Andhra P.':np.nan,'Gujarat':2737.0,'Kerala':np.nan,
            'Punjab':16440.5,'Tripura':0.5,'Uttar P.':30056.0},
   'Pulses':{'Andhra P.':931.0,'Gujarat':818.0,'Kerala':1.7,
             'Punjab':33.0,'Tripura':23.2,'Uttar P.':2184.4},
   'Fruits':{'Andhra P.':7830.0,'Gujarat':11950.0,'Kerala':113.1,
             'Punjab':7152.0,'Tripura':44.1,'Uttar P.':140169.2}}
prodf=pd.DataFrame(d)
print(prodf.sort_index(ascending=False))
print('----------------')
OUTPUT (30)
CODE 31
Write a program to create a merged dataframe using the
cust and order dataframes which is merged on common
field cust_id and contains only the rows having matching
cust_id values.
import pandas as pd
d={'cust_id':[1,2,3,4,5],
   'first_name':['Daniel','Nisha','Thomas','Shubhi','Ishpreet'],
   'last_name':['Shah','Jain','Madison','Pai','Singh'],
   'email':['dshah@kbc.com','jainn@uru.com','madison@imp.com',
            'pais@abc.com','ips@xya.biz']}
d1={'order_id':[1,2,3,4,5,6],
    'amount':[23234.56,62378.50,32124.00,8365.0,1232.50,12614.40],
    'cust_id':[1,3,2,3,10,9]}
cust=pd.DataFrame(d)
order=pd.DataFrame(d1)
res0=pd.merge(order,cust,on='cust_id')
print(res0)
print('------------------')
OUTPUT (31)
CODE 32
Write a program to create a merged dataframe using the
cust and order dataframes which is merged on common
field cust_id and contains all the rows for order
dataframe and from cust dataframe only the rows having
matching cust_id.
import pandas as pd
d={'cust_id':[1,2,3,4,5],
   'first_name':['Daniel','Nisha','Thomas','Shubhi','Ishpreet'],
   'last_name':['Shah','Jain','Madison','Pai','Singh'],
   'email':['dshah@kbc.com','jainn@uru.com','madison@imp.com',
            'pais@abc.com','ips@xya.biz']}
d1={'order_id':[1,2,3,4,5,6],
    'amount':[23234.56,62378.50,32124.00,8365.0,1232.50,12614.40],
    'cust_id':[1,3,2,3,10,9]}
cust=pd.DataFrame(d)
order=pd.DataFrame(d1)
res=pd.merge(order,cust,on='cust_id',how='left')
print(res)
print('------------------')
OUTPUT (32)
CODE 33
Write a program to create a merged dataframe using the
cust and order dataframes which is merged on common
field cust_id and contains all the rows both for order
and cust dataframes.
import pandas as pd
d={'cust_id':[1,2,3,4,5],
   'first_name':['Daniel','Nisha','Thomas','Shubhi','Ishpreet'],
   'last_name':['Shah','Jain','Madison','Pai','Singh'],
   'email':['dshah@kbc.com','jainn@uru.com','madison@imp.com',
            'pais@abc.com','ips@xya.biz']}
d1={'order_id':[1,2,3,4,5,6],
    'amount':[23234.56,62378.50,32124.00,8365.0,1232.50,12614.40],
    'cust_id':[1,3,2,3,10,9]}
cust=pd.DataFrame(d)
order=pd.DataFrame(d1)
res2=pd.merge(order,cust,on='cust_id',how='outer')
print(res2)
print('------------------')
OUTPUT (33)
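The three merge programs differ only in the how argument. The sketch below compares the row counts for each setting using just the cust_id and order_id columns of the data above. The indicator=True flag adds a _merge column showing which table each row came from.

```python
import pandas as pd

# Only the key columns of the cust and order data are kept for brevity.
cust = pd.DataFrame({'cust_id': [1, 2, 3, 4, 5]})
order = pd.DataFrame({'order_id': [1, 2, 3, 4, 5, 6],
                      'cust_id': [1, 3, 2, 3, 10, 9]})

row_counts = {}
for how in ('inner', 'left', 'outer'):
    # _merge is 'both', 'left_only' or 'right_only' for each row
    merged = pd.merge(order, cust, on='cust_id', how=how, indicator=True)
    row_counts[how] = len(merged)
    print(how, 'merge gives', len(merged), 'rows')
```

The default (inner) keeps only matching cust_id values, left keeps every order, and outer keeps every order plus the customers who placed none.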
CODE 34
Create a dataframe qtrsales where each row contains
the item category, item name and expenditure. Group the
rows by the category and print the total expenditure per
category.
import pandas as pd
qtrsales=pd.DataFrame({'Item Category':
                       ['A','B','A','A','B','C','B','C'],
                       'Item Name':
                       ['iPad','LCD','iPhone','iWatch','Projector',
                        'Hard disk','Smartboard','Pen drive'],
                       'Expenditure':
                       [288000,356000,497000,315000,413000,45000,
                        211000,21000]})
print(qtrsales)
print('total expenditure category wise: ')
print(qtrsales.groupby('Item Category')['Expenditure'].sum())
print('----------------')
OUTPUT (34)
CODE 35
Consider the same dataframe of the previous code. Group
the rows by the category and print the average
expenditure per category.
import pandas as pd
qtrsales=pd.DataFrame({'Item Category':
                       ['A','B','A','A','B','C','B','C'],
                       'Item Name':
                       ['iPad','LCD','iPhone','iWatch','Projector',
                        'Hard disk','Smartboard','Pen drive'],
                       'Expenditure':
                       [288000,356000,497000,315000,413000,45000,
                        211000,21000]})
print('average expenditure category wise: ')
print(qtrsales.groupby('Item Category')['Expenditure'].mean())
print('----------------')
OUTPUT (35)
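The total and the average per category can also be obtained in a single pass by handing both function names to agg(). This is a sketch of an alternative, not a replacement for the two programs above; the Item Name column is left out because it is not needed for the calculation.

```python
import pandas as pd

# Category and expenditure columns of the qtrsales data used above.
qtrsales = pd.DataFrame({
    'Item Category': ['A', 'B', 'A', 'A', 'B', 'C', 'B', 'C'],
    'Expenditure': [288000, 356000, 497000, 315000, 413000, 45000,
                    211000, 21000]})

# One row per category, one column per aggregate.
stats = qtrsales.groupby('Item Category')['Expenditure'].agg(['sum', 'mean'])
print(stats)
```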
PLOTTING WITH PYPLOT
OUTPUT (1)
CODE 2
Marks is a list that stores marks of a student in 10
unit tests. Write a program to plot the student’s
performance in these 10 units.
import matplotlib.pyplot as plt
week=[1,2,3,4,5,6,7,8,9,10]
marks=[12,10,10,15,17,25,12,22,35,40]
plt.plot(week,marks)
plt.xlabel('Week')
plt.ylabel('UT marks')
plt.show()
OUTPUT (2)
CODE 3
Tanushree is doing some research. She has stored a line
of Pascal's triangle numbers as ar2 as given below:
ar2=[1,7,21,35,35,21,7,1]
Write a program to plot the sine, cosine and tangent
values of these numbers on the same plot.
import matplotlib.pyplot as plt
import numpy as np
ar2=[1,7,21,35,35,21,7,1]
s2=np.sin(ar2)
c2=np.cos(ar2)
t2=np.tan(ar2)
plt.figure(figsize=(15,7))
plt.plot(ar2,s2,'c')
plt.plot(ar2,c2,'r')
plt.plot(ar2,t2,'k',linestyle='dashed')
plt.xlabel('Array values')
plt.ylabel('Sine, Cosine and Tangent Values')
plt.show()
OUTPUT (3)
CODE 4
First 10 terms of a Fibonacci series are stored in a list
namely fib.
fib=[0,1,1,2,3,5,8,13,21,34]
Write a program to plot Fibonacci terms and their square
roots with two separate lines on the same plot.
The Fibonacci series should be plotted as a cyan line with
‘o’ markers having size 5 and edge color red.
The square root series should be plotted as a black line
with ‘+’ markers having size 7 and edge color as red.
import matplotlib.pyplot as plt
import numpy as np
fib=[0,1,1,2,3,5,8,13,21,34]
sqfib= np.sqrt(fib)
plt.figure(figsize=(10,7))
plt.plot(range(1,11),fib,'co',markersize=5,linestyle='solid',markeredgecolor='r')
plt.plot(range(1,11),sqfib,'k+',markersize=7,linestyle='solid',markeredgecolor='r')
plt.show()
OUTPUT (4)
CODE 5
Given a series nfib that contains sign-reversed Fibonacci
numbers followed by the Fibonacci numbers, as shown below:
nfib=[0,-1,-1,-2,-3,-5,-8,-13,-21,-34,0,1,1,2,3,5,8,13,21,34]
Write a program to plot nfib with:
the line color being magenta
the marker edge color being black with size 5
the grid displayed.
import matplotlib.pyplot as plt
import numpy as np
nfib=[0,-1,-1,-2,-3,-5,-8,-13,-21,-34,0,1,1,2,3,5,8,13,21,34]
plt.plot(range(-10,10),nfib,'mo',markersize=5,markeredgecolor='k',linestyle='solid')
plt.grid(True)
plt.show()
OUTPUT (5)
CODE 6
Create an array a in the range 1 to 20 with values 1.25
apart. Another array b contains the log values of the
elements in the array a.
Write a program to create a scatter plot of the first
array vs. the second with red circle markers. Specify the
x-axis title as 'random values' and the y-axis title as
'logarithm values'.
import matplotlib.pyplot as plt
import numpy as np
a=np.arange(1,20,1.25)
b=np.log(a)
plt.plot(a,b,'ro')
plt.xlabel('random values')
plt.ylabel('logarithm values')
plt.show()
OUTPUT (6)
CODE 7
Consider the arrays of the previous example and create
an array c that stores the log10 values of the elements
in the array a. Write a program to modify the previous
example so that a scatter plot of array a vs. c is also
plotted, with blue triangular markers.
import matplotlib.pyplot as plt
import numpy as np
a=np.arange(1,20,1.25)
b=np.log(a)
c=np.log10(a)
plt.plot(a,b,'ro')
plt.plot(a,c,'b^')
plt.xlabel('Random Values')
plt.ylabel('Logarithm Values')
plt.show()
OUTPUT (7)
CODE 8
Write a program to plot a scatter graph of two random
distributions X and Y, each of shape 100 and holding
randomly generated integers, plotted against each
other.
import matplotlib.pyplot as plt
import numpy as np
x=np.random.randint(1,100,size=(100,))
y=np.random.randint(1,100,size=(100,))
plt.scatter(x,y,color='r')
plt.xlabel('x values')
plt.ylabel('y values')
plt.show()
OUTPUT (8)
CODE 11
Consider the reference and write a program to plot a bar
chart from the medals of Australia.
import matplotlib.pyplot as plt
info=['gold','silver','bronze','total']
aus=[80,59,59,198]
plt.bar(info,aus)
plt.xlabel('australia medal count')
plt.show()
OUTPUT (11)
CODE 12
Consider the reference and write a program to plot a bar
chart from the medals of Australia. In the same chart
plot medals won by India too.
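No program survives for this exercise. The following is a minimal sketch, assuming the medal tallies used in CODE 11 (Australia) and CODE 14 (India); the bar width of 0.35 and the colors are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

info = ['gold', 'silver', 'bronze', 'total']
aus = [80, 59, 59, 198]   # tally from CODE 11
ind = [26, 20, 20, 66]    # tally from CODE 14

x = np.arange(len(info))  # one slot per medal type
w = 0.35                  # assumed bar width

# shift the two sets of bars left and right of each slot
# so that they stand side by side
bars_aus = plt.bar(x - w/2, aus, width=w, color='gold', label='Australia')
bars_ind = plt.bar(x + w/2, ind, width=w, color='green', label='India')
plt.xticks(x, info)
plt.xlabel('medal type')
plt.ylabel('medal count')
plt.legend(loc='upper left')
```

Calling plt.show() after these statements displays the chart.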
OUTPUT (12)
CODE 13
Consider the reference and write a program to plot a bar
chart from the medals of Australia. Make sure that Gold,
Silver and Bronze and total tally is represented through
different widths.
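No program survives for this exercise. The following is a minimal sketch, assuming the Australian tally from CODE 11; the four width values are arbitrary and only need to differ from each other.

```python
import matplotlib.pyplot as plt

info = ['gold', 'silver', 'bronze', 'total']
aus = [80, 59, 59, 198]          # tally from CODE 11
widths = [0.7, 0.5, 0.3, 0.9]    # assumed widths, one per bar

# width accepts a sequence, giving every bar its own width
bars = plt.bar(info, aus, width=widths)
plt.xlabel('australia medal count')
```

Calling plt.show() after these statements displays the chart.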
OUTPUT (13)
CODE 14
Consider the reference and write a program to plot a bar
chart from the medals of India. Make sure that Gold,
Silver and Bronze and total tally is represented through
different colors.
import matplotlib.pyplot as plt
info=['gold','silver','bronze','total']
ind=[26,20,20,66]
plt.bar(info,ind,color=['gold','silver','brown','black'])
plt.xlabel('medal type')
plt.ylabel('medal count')
plt.show()
OUTPUT (14)
CODE 15
val is a list having three lists inside it. It contains
summarised data of three different surveys conducted by
a company. Create a bar chart that plots these three
sublists of val in a single chart. Keep the width of
each bar as 0.25.
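Neither the values of val nor a program survive for this exercise. The following is a minimal sketch in which the survey figures, the labels and the title are hypothetical; only the bar width of 0.25 comes from the exercise.

```python
import matplotlib.pyplot as plt
import numpy as np

# hypothetical survey figures; the exercise does not list the values of val
val = [[5., 25., 45., 20.], [4., 23., 49., 17.], [6., 22., 47., 19.]]
x = np.arange(4)   # one slot per value in a sublist
w = 0.25           # bar width required by the exercise

# offset every sublist by one bar width so that the three bars
# of a slot stand next to each other
groups = []
for i, sub in enumerate(val):
    groups.append(plt.bar(x + i*w, sub, width=w, label='survey %d' % (i + 1)))
plt.legend(loc='upper left')
plt.title('survey results')
```

Calling plt.show() after these statements displays the chart.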
OUTPUT (17)
CODE 18
TSS school celebrated volunteering week where each
section of class XI dedicated a day for collecting
amount for charity being supported by the school.
Section A volunteered on Monday, B on Tuesday, C on
Wednesday and so on. There are six sections in class XI.
The amounts collected by sections A to F are 8000, 12000,
9800, 11200, 15500, 7300.
Write a program to create a pie chart showing the
collected amount section wise.
import matplotlib.pyplot as plt
col=[8000,12000,9800,11200,15500,7300]
section=['a','b','c','d','e','f']
plt.title('Volunteering Week Collection')
plt.pie(col,labels=section)
plt.show()
OUTPUT (18)
CODE 19
Considering the TSS school’s charity, write a program to
create a pie chart showing collection amount percentage
section wise. Also make sure that the pie chart is
perfectly circular.
import matplotlib.pyplot as plt
col=[8000,12000,9800,11200,15500,7300]
section=['a','b','c','d','e','f']
plt.title('Volunteering Week Collection')
plt.axis('equal')
plt.pie(col,labels=section,autopct='%5.2f%%')
plt.show()
OUTPUT (19)
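The autopct string is an ordinary printf-style format that pyplot applies to the percentage of each slice. As a rough check of what the labels will read, the percentages can be computed by hand; the names total and labels below are introduced only for this sketch.

```python
col = [8000, 12000, 9800, 11200, 15500, 7300]
section = ['a', 'b', 'c', 'd', 'e', 'f']

total = sum(col)
# '%5.2f' pads the number to at least 5 characters with 2 decimals;
# '%%' prints a literal percent sign
labels = ['%5.2f%%' % (100 * amount / total) for amount in col]
for name, label in zip(section, labels):
    print(name, label)
```

Leaving out the leading % of the format turns the string into plain text with no placeholder, so the format has to start with % for the percentage to be filled in.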
CODE 20
Considering the TSS school’s charity collection, write a
program to create a pie chart showing collection amount
percentage section wise.
Make sure that the pie chart is perfectly circular.
Show the collection of section C and E with exploded
pies.
import matplotlib.pyplot as plt
col=[8000,12000,9800,11200,15500,7300]
section=['a','b','c','d','e','f']
expl=[0,0,0.15,0,0.2,0]
colors=['cyan','gold','violet','lightgreen','pink','silver']
plt.title('Volunteering week collection')
plt.axis('equal')
plt.pie(col,labels=section,explode=expl,colors=colors,autopct='%5.2f%%')
plt.show()
OUTPUT (20)
CODE 21
Generally, ten different prices of a stock are stored.
However, for abc co. only 5 prices are available for a
day. [74.25,76.06,69.5,72.55,81.5]
Write a program to create a bar chart with the given
prices but the graph should be plotted between the
limits -2 to 10 on x-axis.
import matplotlib.pyplot as plt
pr=[74.25,76.06,69.5,72.55,81.5]
plt.bar(range(len(pr)),pr,width=0.4)
plt.xlim(-2,10)
plt.title('prices of abc co.')
plt.ylabel('prices')
plt.show()
OUTPUT (21)
CODE 22
Generally, 10 different prices of stock are stored.
However, for abc co. only 5 prices are available for a
day. [74.25,76.06,69.5,72.55,81.5]
Write a program to create a bar chart with the given
prices:
The graph should be plotted within the limits -2 to 10
on x-axis.
There should be a tick for every plotted point.
import matplotlib.pyplot as plt
pr=[74.25,76.06,69.5,72.55,81.5]
plt.bar(range(len(pr)),pr,width=0.4,color='m')
plt.xlim(-2,10)
plt.title('prices of abc co.')
plt.xticks(range(-2,10))
plt.ylabel('prices')
plt.show()
OUTPUT (22)
CODE 23
TSS school celebrated volunteering week where each
section of class XI dedicated a day for collecting
amount for charity being supported by the school.
Section A volunteered on Monday, B on Tuesday, C on
Wednesday, etc. There are six sections in class XI.
The amounts collected by sections A to F are 8000, 12000,
9800, 11200, 15500, 7300.
Write a program to create a bar chart showing collection
amount. The graph should have proper titles and axes
titles.
import matplotlib.pyplot as plt
import numpy as np
col=[8000,12000,9800,11200,15500,7300]
x=np.arange(6)
plt.title('volunteering week collection')
plt.bar(x,col,color='r',width=0.25)
plt.xlabel('sections')
plt.ylabel('collection')
plt.show()
OUTPUT (23)
CODE 24
Considering the TSS school’s charity collection, write a
program to plot the collected amount vs days using a bar
chart. The ticks on x-axis should have day names. The
graph should have proper title and axes titles.
import matplotlib.pyplot as plt
import numpy as np
col=[8000,12000,9800,11200,15500,7300]
x=np.arange(6)
plt.title('volunteering week collection')
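plt.bar(x,col,color='olive',width=0.25)
# sections A to F volunteered on Monday to Saturday
plt.xticks(x,['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday'])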
plt.xlabel('days')
plt.ylabel('collection')
plt.show()
OUTPUT (24)
CODE 25
Considering the TSS school’s charity collection, write a
program to plot the collected amount vs sections using a
bar chart. The ticks on x-axis should have section names.
The graph should have proper title and axes titles. Make
sure that the limits for y-axis are in the range 6000 to
20000.
import matplotlib.pyplot as plt
import numpy as np
col=[8000,12000,9800,11200,15500,7300]
x=np.arange(6)
plt.title('volunteering week collection')
plt.bar(x,col,color='cyan',width=0.25)
plt.xticks(x,['A','B','C','D','E','F'])
plt.ylim(6000,20000)
plt.xlabel('section')
plt.ylabel('collection')
plt.show()
OUTPUT (25)
CODE 26
Create multiple line charts on common plot where three
data ranges are plotted on same chart. The data range(s)
to be plotted is/are:
data=[5.,25.,45.,20.],[8.,13.,29.,27.],[9.,29.,27.,39.]
import matplotlib.pyplot as plt
import numpy as np
data=[5.,25.,45.,20.],[8.,13.,29.,27.],[9.,29.,27.,39.]
x=np.arange(4)
plt.plot(x,data[0],color='b',label='range1')
plt.plot(x,data[1],color='g',label='range2')
plt.plot(x,data[2],color='r',label='range3')
plt.legend(loc='upper left')
plt.title('multirange line chart')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
OUTPUT (26)
CODE 27
A survey gathered the height and weight of 100
participants and recorded the participants’ ages as:
ages = [1, 1,2,3,5,7,8,9,10, 10,11,13,13,15,16,17,18,
19, 20, 21, 21, 23, 24, 24, 24, 25, 25, 25, 25, 26, 26,
26, 27, 27, 27, 27, 27, 29, 30, 30,30,30,31,33,34, 34,
34, 35, 36, 36, 37, 37,37,38, 38, 39, 40,40,41,41, 42,
43,45,45,46, 46, 46, 47,48,48,49,50,51,51, 52, 52, 53,
54,55,56, 57, 58, 60, 61, 63,65,66,68,70,72,74, 75,
77,81,83,84,87,89,90,91]
Write a program to plot a histogram of the ages with 20
bins.
import matplotlib.pyplot as plt
ages = [1, 1,2,3,5,7,8,9,10, 10,11,13,13,15,16,17,18,
19, 20, 21, 21, 23, 24, 24, 24, 25, 25, 25, 25, 26, 26,
26, 27, 27, 27, 27, 27, 29, 30, 30,30,30,31,33,34, 34,
34, 35, 36, 36, 37, 37,37,38, 38, 39, 40,40,41,41, 42,
43,45,45,46, 46, 46, 47,48,48,49,50,51,51, 52, 52, 53,
54,55,56, 57, 58, 60, 61, 63,65,66,68,70,72,74, 75,
77,81,83,84,87,89,90,91]
plt.hist (ages, bins = 20)
plt.title ("Participants' Ages Histogram")
plt.show()
OUTPUT (27)
CODE 28
Prof Awasthi is doing some research in the field of
Environment. For some plotting purposes, he has
generated some data as:
mu=100
sigma = 15
x=mu + sigma * np.random.randn (10000)
Write a program to plot a horizontal histogram of x with
30 bins.
import numpy as np
import matplotlib.pyplot as plt
mu=100
sigma = 15
x=mu + sigma * np.random.randn (10000)
plt.hist(x, bins = 30, orientation = 'horizontal')
plt.title('Research data Histogram')
plt.show()
OUTPUT (28)
CODE 29
Prof Awasthi is doing some research in the field of
Environment. For some plotting purposes, he has generated
some data as:
mu=100
sigma = 15
x=mu + sigma * np.random.randn (10000)
y=mu + 30*np.random.randn(10000)
Write a program to plot a stacked histogram of x and y
with 100 bins.
import numpy as np
import matplotlib.pyplot as plt
mu = 100
sigma = 15
x = mu + sigma * np.random.randn(10000)
y = mu + 30* np.random.randn (10000)
plt.hist([x,y], bins = 100, histtype = 'barstacked')
plt.title('Research data Histogram')
plt.show()
OUTPUT (29)
CODE 30
Create a frequency polygon using Prof Awasthi’s
research data.
import numpy as np
import matplotlib.pyplot as plt
mu = 100
sigma = 15
x = mu + sigma * np.random.randn(100)
plt.figure(figsize = (10, 7))
y = np.arange(len(x))
plt.hist(x, bins = 40, histtype = 'step')
plt.title('research data histogram')
plt.show()
OUTPUT (30)
CODE 31
Write a program to plot a histogram using Prof Awasthi’s
research data and then draw a frequency polygon by
joining the edges’ midpoints through a line chart.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
mu = 100
sigma = 15
x= mu + sigma* np.random.randn(100)
plt.figure(figsize = (10, 7) )
y = np.arange(len(x))
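# n holds the bin counts, edges the bin boundaries
n, edges, p = plt.hist(x, bins = 40, histtype = 'step')
# midpoint of every bin
m = 0.5*(edges[1:] + edges[:-1])
m = m.tolist()
# pad both ends with a zero-count point so that the polygon
# starts and ends on the x-axis
m.insert(0, m[0] - 10)
m.append(m[-1] + 10)
n = n.tolist()
n.insert(0,0)
n.append(0)
plt.plot(m, n, '-^')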
plt.title('Research data Histogram')
plt.show()
OUTPUT (31)
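The midpoint arithmetic behind the frequency polygon of CODE 31 can be checked without drawing anything. The following is a minimal sketch, assuming np.histogram is used to obtain the counts and edges, a seeded generator (the seed 0 is an arbitrary choice) so that the run is repeatable, and a padding of one bin width at each end instead of the fixed 10 used above.

```python
import numpy as np

rng = np.random.default_rng(0)            # seed chosen arbitrarily
x = 100 + 15 * rng.standard_normal(100)   # mu = 100, sigma = 15

# counts per bin and the 41 boundaries of the 40 bins
n, edges = np.histogram(x, bins=40)
# each midpoint is the average of a bin's left and right edge
m = 0.5 * (edges[1:] + edges[:-1])

# pad both ends with a zero-count point one bin width away,
# so that the polygon starts and ends on the x-axis
width = edges[1] - edges[0]
m = np.concatenate(([m[0] - width], m, [m[-1] + width]))
n = np.concatenate(([0], n, [0]))
print(len(m), len(n), n.sum())
```

Passing m and n to plt.plot then draws the polygon as a line chart.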
CODE 32
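Consider a dataframe mksdf as shown below:
    Name  Age  PreBoardMarks  BoardMarks
0  Karan   17              4          25
1   Alex   19             24          94
2    Ani   18             31          57
3  Javed   18              2          62
4  Amrit   17              3          70
Write a program to plot PreBoardMarks and BoardMarks
against the row index in a scatter chart, with the marker
sizes taken from Age.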
import pandas as pd
import matplotlib.pyplot as plt
d={'Name':['Karan','Alex','Ani','Javed','Amrit'],
'Age':[17,19,18,18,17],
'PreBoardMarks':[4,24,31,2,3],
'BoardMarks':[25,94,57,62,70]}
mksdf=pd.DataFrame(d)
plt.scatter(x=mksdf.index,y=mksdf.PreBoardMarks,c='g',s=mksdf.Age)
plt.scatter(x=mksdf.index,y=mksdf.BoardMarks,c='r',s=mksdf.Age)
plt.xlabel('X-axis')
plt.ylabel('Marks')
plt.show()
OUTPUT (32)
CODE 33
Consider the dataframe mksdf. Write a program to plot
‘BoardMarks’ against ‘PreBoardMarks’ in a scatter chart.
import pandas as pd
import matplotlib.pyplot as plt
d={'Name':['Karan','Alex','Ani','Javed','Amrit'],
'Age':[17,19,18,18,17],
'PreBoardMarks':[4,24,31,2,3],
'BoardMarks':[25,94,57,62,70]}
mksdf=pd.DataFrame(d)
mksdf.plot(kind='scatter',x='PreBoardMarks',y='BoardMarks')
plt.xlabel('PreBoardMarks')
plt.ylabel('BoardMarks')
plt.show()
OUTPUT (33)
SIMPLE QUERIES IN SQL
Creating a table
MySQL Queries
To select all columns from a table
Simple Problems
Aggregate Functions
Sum() function
Avg() function
Max() function
Min() function
Count() function
Distinct keyword
Between condition
Order by clause
Queries
Write a query to display employee name, job,
salary whose job is Manager or Analyst or
Salesman
Write a query to select ename where M is 1st
letter and R is last letter
String functions
Concatenation
Lower() function
Upper() function
Substring() function
Ltrim() function
Rtrim() function
Trim() function
Instr() function
Length() function
Left() Function
Right() Function
Mid() Function
Mathematical Functions
Mod() Function
Power() Function
Round() Function
Sign() Function
Sqrt() Function
Truncate() Function
Date and Time Functions
Current date [curdate()]
Month()
Year()
DayName()
DayofMonth()
Dayofyear()
Present time
System time
Sleep() function
Write a query to find total salary of department
30
Group by
Write a query to calculate the number of employees in
each job and the sum of their salaries
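The grouping can be tried out from Python. The sketch below uses the standard-library sqlite3 module with an in-memory database rather than MySQL; the table name emp, its columns and its rows are hypothetical and serve only to show the shape of the GROUP BY query.

```python
import sqlite3

con = sqlite3.connect(':memory:')
# hypothetical employee table and rows, for illustration only
con.execute('CREATE TABLE emp (ename TEXT, job TEXT, sal INTEGER)')
con.executemany('INSERT INTO emp VALUES (?, ?, ?)', [
    ('Asha', 'Manager', 5000), ('Ravi', 'Analyst', 4000),
    ('Meena', 'Manager', 5500), ('John', 'Salesman', 2500),
    ('Kiran', 'Salesman', 2700)])

# one output row per job: head count and total salary
rows = con.execute(
    'SELECT job, COUNT(*), SUM(sal) FROM emp GROUP BY job ORDER BY job'
).fetchall()
for row in rows:
    print(row)
con.close()
```

Every column in the SELECT list is either the grouping column or an aggregate, which is what lets each job collapse into a single row.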
Write a query to display the employees department
wise
Truncate command
Drop table
Update command
Write a query to set hobby to 'gardening' for the
employee whose empid is 2 in employee1.
Write a query to set hobby to 'painting' and doj to
curdate() for the employee whose ename is 'Srinivas' in
employee1.
Table Graduate
Write a query to count the number of students with
physics and comp sc as their subject from graduate.
Write a query to add a new column grade of type
varchar(2) to graduate.