Fundamentals of Data Visualization
Visualization in Python
plt.savefig('ScatterPlot.png')
Image Plot, Box Plot, Violin Plot, Stream Plot, Quiver Plot, Area Plot,
Pie Plot, and Donut Plot
Easy integration with Pandas and NumPy.
Line Plot
import numpy as np
import matplotlib.pyplot as plt
# allows plots to be displayed below the notebook cell
%matplotlib inline
# defining the dataset
x = np.arange(0, 10, 0.1)
y = 3 * x + 5
# plotting the datapoints
plt.plot(x, y)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
# allows plots to be displayed below the notebook cell
%matplotlib inline
# defining the dataset
x = np.arange(0, 10, 1)
y = 3 * x + 5
# plotting the datapoints with a dotted, semi-transparent yellow line and circle markers
plt.plot(x, y, linewidth=2.0, linestyle=":", color='y', alpha=0.7, marker='o')
plt.title("Line Plot Demo")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.legend(['line1'], loc='best')
plt.grid(True)
plt.show()
Figure Size
...
y=3*x+5
# changing the figure size: figsize is (width, height) in inches
fig = plt.figure(figsize=(10, 5))
# plotting the datapoints
...
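As a minimal self-contained sketch of the figure-size idea, assuming the same line y = 3x + 5 used in the line plot examples (the title is only illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Same illustrative dataset as the line plot examples (an assumption here)
x = np.arange(0, 10, 1)
y = 3 * x + 5

# Create the figure first so the plot below is drawn on it;
# figsize is (width, height) in inches
fig = plt.figure(figsize=(10, 5))
plt.plot(x, y)
plt.title("Figure Size Demo")
plt.show()
```

Swapping the tuple to (5, 10) would give a tall, narrow figure instead of a wide one.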
Subplots
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
x = np.arange(0, 10, 1)
y1 = 2 * x + 5
y2 = 3 * x + 10
# arguments are (number of rows, number of columns, index of this plot)
plt.subplot(2, 1, 1)  # A: plots stacked vertically
# B: plt.subplot(1, 2, 1) places the plots side by side
plt.plot(x, y1)
plt.title('Graph1')
plt.subplot(2, 1, 2)  # A
# B: plt.subplot(1, 2, 2)
plt.plot(x, y2)
plt.title('Graph2')
plt.show()
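The same two-panel layout can also be sketched with plt.subplots, which creates the figure and all axes in one call. This is an alternative style, not the approach used above; the figsize value is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10, 1)
y1 = 2 * x + 5
y2 = 3 * x + 10

# 2 rows, 1 column; axes is an array holding one Axes object per panel
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(6, 6))
axes[0].plot(x, y1)
axes[0].set_title("Graph1")
axes[1].plot(x, y2)
axes[1].set_title("Graph2")

# Adjust spacing so the lower title does not collide with the upper panel
fig.tight_layout()
plt.show()
```

Because each panel is addressed through its own Axes object, there is no need to keep track of which subplot is currently active.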
Bar Plot
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
data = {'apples': 20, 'Mangoes': 15,
        'lemon': 30, 'Oranges': 10}
names = list(data.keys())
values = list(data.values())
# vertical bars in the top row of a 3-row grid
plt.subplot(3, 1, 1)
#fig = plt.figure(figsize=(10, 5))
plt.bar(names, values, color="orange")
plt.title("Bar Graph Demo")
plt.xlabel("Fruits")
plt.ylabel("Quantity")
# horizontal bars in the bottom row; the middle row is left empty as spacing
plt.subplot(3, 1, 3)
plt.barh(names, values, color="orange")
plt.title("Bar Graph Demo")
# the axes are swapped for a horizontal bar plot
plt.xlabel("Quantity")
plt.ylabel("Fruits")
plt.show()
Scatter Plots
import matplotlib.pyplot as plt
%matplotlib inline
# dataset (note: a, y1 and y2 should all be the same size)
a = [10, 20, 30, 40, 50, 60, 70, 80]
y1 = [2, 3, 5, 6, 1, 4, 5, 3]
y2 = [1, 2, 3, 4, 5, 5, 1, 3]
plt.scatter(a, y1)
plt.scatter(a, y2)
plt.show()
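A scatter plot like the one above can also be written to an image file with plt.savefig. The sketch below reuses the same data; the temporary-directory output path, the dpi value, and the bbox_inches setting are illustrative choices.

```python
import os
import tempfile
import matplotlib.pyplot as plt

a = [10, 20, 30, 40, 50, 60, 70, 80]
y1 = [2, 3, 5, 6, 1, 4, 5, 3]
y2 = [1, 2, 3, 4, 5, 5, 1, 3]

plt.scatter(a, y1)
plt.scatter(a, y2)

# Illustrative location: the system temp directory; any writable path works
out_path = os.path.join(tempfile.gettempdir(), "ScatterPlot.png")

# Save before plt.show(): in a notebook the figure may be cleared once shown
plt.savefig(out_path, dpi=150, bbox_inches="tight")
plt.show()
```

The file format is inferred from the extension, so a name ending in .pdf or .svg would produce a vector image instead.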
Histogram
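A histogram groups numeric values into bins and draws one bar per bin, with the bar height giving the count. A minimal sketch, assuming synthetic normally distributed data (the sample, seed, and bin count are illustrative, not taken from a real dataset):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic sample for illustration only
rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=10, size=1000)

# plt.hist returns the per-bin counts, the bin edges, and the drawn bars
counts, bin_edges, patches = plt.hist(data, bins=20,
                                      color="orange", edgecolor="black")
plt.title("Histogram Demo")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
```

With 20 bins there are 21 bin edges, and the counts add up to the sample size.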
A box plot summarizes a dataset compactly: it shows the median and
quartiles of the data and marks outliers as individual points.
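As a rough sketch of a box plot, assuming two synthetic groups (the group names, seed, and distributions are made up for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic samples for illustration only
rng = np.random.default_rng(1)
group_a = rng.normal(50, 10, 200)
group_b = rng.normal(60, 5, 200)

# The quartiles that each box encodes can be computed directly
q1, median, q3 = np.percentile(group_a, [25, 50, 75])

# One box per dataset, drawn at x positions 1 and 2
result = plt.boxplot([group_a, group_b])
plt.xticks([1, 2], ["GroupA", "GroupB"])
plt.title("Box Plot Demo")
plt.ylabel("Value")
plt.show()
```

Each box spans the first to third quartile with a line at the median; points beyond the whiskers are drawn individually as outliers.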
A violin plot is used for large amounts of data, where showing each
individual data point is not practical; it shows the shape of the distribution instead.
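A violin plot could be sketched as follows, again assuming synthetic data (group names, seed, and sample sizes are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic samples for illustration only
rng = np.random.default_rng(2)
group_a = rng.normal(50, 10, 5000)
group_b = rng.normal(60, 5, 5000)

# One violin per dataset at x positions 1 and 2; showmedians adds a median line
parts = plt.violinplot([group_a, group_b], showmedians=True)
plt.xticks([1, 2], ["GroupA", "GroupB"])
plt.title("Violin Plot Demo")
plt.ylabel("Value")
plt.show()
```

The width of each violin at a given height reflects the estimated density of values there, so the narrower GroupB spread shows up as a shorter, wider shape.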
Customization
# Donut plot
import matplotlib.pyplot as plt
%matplotlib inline
group_names = ["GroupA", "GroupB", "GroupC"]
group_size = [20, 30, 50]
size_centre = [5]
# colors
colors = ['#ff9999', '#66b3ff', '#99ff99', '#ffcc99']
# outer pie holds the data
pie1 = plt.pie(group_size, labels=group_names, radius=1.5, colors=colors)
# a smaller white pie drawn on top creates the hole
pie2 = plt.pie(size_centre, radius=1.0, colors=['w'])
plt.show()
Area Plots
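An area plot fills the region under one or more lines. A minimal sketch using plt.stackplot, with made-up series values and labels:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up values for illustration only
x = np.arange(1, 6)
y1 = np.array([1, 3, 2, 4, 5])
y2 = np.array([2, 2, 3, 3, 4])

# stackplot draws each series on top of the previous one and fills the areas
polys = plt.stackplot(x, y1, y2, labels=["Series1", "Series2"], alpha=0.7)
plt.legend(loc="upper left")
plt.title("Area Plot Demo")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.show()
```

For a single filled series, plt.fill_between(x, y1) gives a similar result without stacking.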
few customizations
Dataset: https://www.dropbox.com/s/v3ux6vy7ajvltz0/Customerdata.csv?dl=0
1. Build a bar plot for the dataset. x-axis: Contract type, y-axis: count
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
customer = pd.read_csv(r'Customerdata.csv')
# count how many customers have each contract type
grp = customer.Contract.value_counts()
x = grp.keys()
y = grp.values
print(type(grp.values))
plt.bar(x, y, color="orange")
plt.title("Distribution of Contract in dataset")
plt.xlabel("Contract Type of Customer")
plt.ylabel("count")
plt.show()
Hint:
a = customer[customer['PaymentMethod'] == 'Electronic Check']
b = customer[customer['PaymentMethod'] == 'Mailed Check']
c = customer[customer['PaymentMethod'] == 'Bank transfer']
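The hint relies on boolean-mask filtering. As a small sketch of how that works, the block below uses a made-up DataFrame in place of Customerdata.csv; the rows and the MonthlyCharges column are hypothetical, and only the PaymentMethod column name and its three values come from the hint.

```python
import pandas as pd

# Hypothetical stand-in for Customerdata.csv; these rows are made up
customer = pd.DataFrame({
    "PaymentMethod": ["Electronic Check", "Mailed Check", "Bank transfer",
                      "Electronic Check", "Mailed Check", "Electronic Check"],
    "MonthlyCharges": [70.5, 20.0, 55.3, 89.1, 25.4, 99.0],
})

# The comparison yields a True/False Series; indexing with it keeps the True rows
a = customer[customer["PaymentMethod"] == "Electronic Check"]
b = customer[customer["PaymentMethod"] == "Mailed Check"]
c = customer[customer["PaymentMethod"] == "Bank transfer"]

print(len(a), len(b), len(c))  # 3 2 1
```

The comparison is case-sensitive and must match the stored strings exactly, so it is worth checking customer["PaymentMethod"].unique() on the real data before filtering.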