Visualizing Using Matplotlib
What is Visualization?
Data visualization means the graphical or pictorial representation of data using graphs, charts, etc.
It helps to communicate information effectively to the intended users. The purpose of plotting
data is to visualize variation or show relationships between variables.
E.g. traffic symbols, ultrasound reports, the speedometer of a vehicle, an atlas of maps.
What is Matplotlib?
Matplotlib is a powerful plotting library in Python used for creating static, animated, and
interactive visualizations. Matplotlib’s primary purpose is to provide users with the tools and
functionality to represent data graphically, making it easier to analyze and understand. It was
originally developed by John D. Hunter in 2003 and is now maintained by a large community
of developers.
Key Features of Matplotlib:
1. Versatility: Matplotlib can generate a wide range of plots, including line plots, scatter
plots, bar plots, histograms, pie charts, and more.
2. Customization: It offers extensive customization options to control every aspect of the
plot, such as line styles, colors, markers, labels, and annotations.
3. Integration with NumPy: Matplotlib integrates seamlessly with NumPy, making it
easy to plot data arrays directly.
4. Publication Quality: Matplotlib produces high-quality plots suitable for publication
with fine-grained control over the plot aesthetics.
5. Extensible: Matplotlib is highly extensible, with a large ecosystem of add-on toolkits
and extensions like Seaborn, Pandas plotting functions, and Basemap for geographical
plotting.
6. Cross-Platform: It is platform-independent and can run on various operating systems,
including Windows, macOS, and Linux.
7. Interactive Plots: Matplotlib supports interactive plotting through the use of widgets
and event handling, enabling users to explore data dynamically.
Difference between MATLAB and Matplotlib:
1. MATLAB: uses a proprietary scripting language.
   Matplotlib: is a Python library that uses Python's syntax.
3. MATLAB: requires additional toolboxes or manual effort to integrate with external
   libraries and to achieve similar functionality.
   Matplotlib: integrates seamlessly with other Python libraries such as NumPy and Pandas,
   which allows for efficient data handling and manipulation.
4. MATLAB: is commercial software that requires a paid license to access its full
   capabilities.
   Matplotlib: is an open-source library that is freely available and can be easily
   installed using Python's package manager.
5. MATLAB: provides an integrated development environment (IDE) that offers a
   user-friendly interface, debugging, and direct execution of code blocks.
   Matplotlib: being a Python library, primarily relies on third-party IDEs or notebooks
   (e.g., Jupyter) for interactive coding and execution.
Applications of Matplotlib:
1. Scientific Research: For plotting experimental results and visualizations that describe
the data more effectively.
2. Finance: For creating financial charts to analyze market trends and movements.
3. Data Analysis: For exploratory data analysis in fields such as data science and machine
learning.
4. Education: For teaching complex concepts in mathematics, physics, and statistics
through visual aids.
5. Engineering: For visualizing engineering simulations and results.
Disadvantages of Matplotlib:
1. Steep Learning Curve
2. Verbose Syntax
3. Default Aesthetics
4. Limited Interactivity
5. Limited 3D Plotting Capabilities
6. Performance Issues with Large Datasets
7. Documentation and Error Messages
Mr. K. Sathish AP/CSE, ESCET
Syntax:
matplotlib.pyplot.plot(*args, scalex=True, scaley=True, data=None, **kwargs)
Parameters:
This function accepts parameters that let us set the axis scales and format the graph. In the
calls below, plt refers to matplotlib.pyplot. The most common usages are:
plt.plot(x, y): plots y against x using the default line style and color.
plt.axis([xmin, xmax, ymin, ymax]): sets the x-axis range from xmin to xmax and the y-axis
range from ymin to ymax.
plt.plot(x, y, color='green', marker='o', linestyle='dashed', linewidth=2,
markersize=12):
the x and y coordinates are marked with circular markers of size 12 and joined by a green
dashed line of width 2.
plt.xlabel('X-axis'): names the x-axis.
plt.ylabel('Y-axis'): names the y-axis.
plt.plot(x, y, label='Sample line'): labels the plotted line "Sample line"; the label is
displayed in the legend once plt.legend() is called.
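The calls listed above can be combined into one short script. The following is a minimal sketch, not a definitive recipe: the data values and the axis limits are made up for illustration.

```python
import matplotlib.pyplot as plt

# Illustrative data (assumed for this sketch, not taken from the notes).
x = [1, 2, 3, 4, 5]
y = [2, 4, 1, 5, 3]

# Green dashed line of width 2 with circular markers of size 12.
# plt.plot returns a list of lines; we unpack the single line it created.
(line,) = plt.plot(x, y, color="green", marker="o", linestyle="dashed",
                   linewidth=2, markersize=12, label="Sample line")

plt.axis([0, 6, 0, 6])   # [xmin, xmax, ymin, ymax]
plt.xlabel("X-axis")     # name the x-axis
plt.ylabel("Y-axis")     # name the y-axis
plt.legend()             # display the label passed to plt.plot()
# plt.show()             # uncomment to display the figure
```

Keeping the returned line object is optional; it is only useful if the line's style needs to be inspected or changed later.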
What is a Matplotlib Figure?
In Matplotlib, a figure is the top-level container that holds all the elements of a plot. It
represents the entire window or page where the plot is drawn.
the relationship or trend between data points and can be styled with different colors,
widths, and styles to convey additional information.
6. Matplotlib Title: The title is a text element that provides a descriptive title for the plot.
It typically appears at the top of the figure and provides context or information about
the data being visualized.
7. Axis Labels in Matplotlib: Labels are text elements that provide descriptions for the
x-axis and y-axis. They help identify the data being plotted and provide units or other
relevant information.
8. Ticks: Tick marks are small marks along the axis that indicate specific data points or
intervals. They help users interpret the scale of the plot and locate specific data values.
9. Tick Labels: Tick labels are text elements that provide labels for the tick marks. They
usually display the data values corresponding to each tick mark and can be customized
to show specific formatting or units.
10. Matplotlib Legend: Legends provide a key to the symbols or colors used in the plot to
represent different data series or categories. They help users interpret the plot and
understand the meaning of each element.
11. Matplotlib Grid Lines: Grid lines are horizontal and vertical lines that extend across
the plot, corresponding to specific data intervals or divisions. They provide a visual
guide to the data and help users identify patterns or trends.
12. Spines of Matplotlib Figures: Spines are the lines that form the borders of the plot
area. They separate the plot from the surrounding whitespace and can be customized to
change the appearance of the plot borders.
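Several of the components described above (title, axis labels, ticks, tick labels, legend, grid lines, and spines) can be set on a single plot. The sketch below uses Matplotlib's object-oriented interface; the data and the label texts are assumptions chosen only for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()                                # figure with one plot area
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="squares")    # a line with a legend label

ax.set_title("Anatomy of a plot")                       # title
ax.set_xlabel("x")                                      # axis labels
ax.set_ylabel("y")
ax.set_xticks([0, 1, 2, 3])                             # tick positions
ax.set_xticklabels(["zero", "one", "two", "three"])     # tick labels
ax.legend()                                             # legend
ax.grid(True)                                           # grid lines
ax.spines["top"].set_visible(False)                     # hide the top and right spines
ax.spines["right"].set_visible(False)
# plt.show()                                            # uncomment to display the figure
```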
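import random
import matplotlib.pyplot as plt

students = ["Jane","Joe","Beck","Tom",
            "Sam","Eva","Samuel","Jack",
            "Dana","Ester","Carla","Steve",
            "Fallon","Liam","Culhane","Candance",
            "Ana","Mari","Steffi","Adam"]
# One random mark from 0 to 100 per student (randint includes both end points).
marks = []
for i in range(0, len(students)):
    marks.append(random.randint(0, 100))
plt.xlabel("Students")
plt.ylabel("Marks")
plt.title("CLASS RECORDS")
plt.plot(students, marks, 'm--')   # 'm--' means a magenta dashed line
plt.show()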
Output:
Character Definition
- Solid line
-- Dashed line
-. Dash-dot line
: Dotted line
. Point marker
o Circle marker
, Pixel marker
v triangle_down marker
^ triangle_up marker
1 tri_down marker
2 tri_up marker
3 tri_left marker
4 tri_right marker
s square marker
p pentagon marker
* star marker
h hexagon1 marker
H hexagon2 marker
+ Plus marker
x X marker
D Diamond marker
d thin_diamond marker
| vline marker
_ hline marker
Color codes:
The table below lists the color codes that can be used to change the color of plotted
data. We can pass either the character code or the color name as the value of the color
parameter of plot().
Codes Description
b blue
g green
r red
c cyan
m magenta
y yellow
k black
w white
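The line, marker, and color characters from the two tables above can be combined into a single format string passed as the third argument of plot(). The following is a small sketch with made-up data; the variable names are chosen only for this example.

```python
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]   # illustrative data

# Each format string combines a color code, a marker, and a line style.
(red_circles,) = plt.plot(x, [v + 3 for v in x], "ro--")   # red, circle, dashed
(blue_squares,) = plt.plot(x, [v + 2 for v in x], "bs:")   # blue, square, dotted
(black_stars,) = plt.plot(x, [v + 1 for v in x], "k*-.")   # black, star, dash-dot
(green_line,) = plt.plot(x, x, "g-")                        # green, solid, no marker
# plt.show()                                                # uncomment to display
```

A format string is a shorthand; the same result can be obtained with the color, marker, and linestyle keyword arguments.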
import random
import matplotlib.pyplot as plt

students = ["Jane","Joe","Beck","Tom","Sam",
            "Eva","Samuel","Jack","Dana","Ester",
            "Carla","Steve","Fallon","Liam","Culhane",
            "Candance","Ana","Mari","Steffi","Adam"]
# One random mark from 0 to 100 per student (randint includes both end points).
marks = []
for i in range(0, len(students)):
    marks.append(random.randint(0, 100))
plt.xlabel("Students")
plt.ylabel("Marks")
plt.title("CLASS RECORDS")
# Solid green line with large circular markers filled in red.
plt.plot(students, marks, color = 'green',
         linestyle = 'solid', marker = 'o',
         markerfacecolor = 'red', markersize = 12)
plt.show()
Output:
In this example, we use Matplotlib to visualize the marks of 20 students in a class. The marks
are displayed using a dashed magenta line graph. Grid lines are added to provide better
readability and reference across the plot.
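import random
import matplotlib.pyplot as plt

students = ["Jane","Joe","Beck","Tom","Sam",
            "Eva","Samuel","Jack","Dana","Ester",
            "Carla","Steve","Fallon","Liam","Culhane",
            "Candance","Ana","Mari","Steffi","Adam"]
marks = []
for i in range(0, len(students)):
    marks.append(random.randint(0, 100))   # random mark from 0 to 100
plt.xlabel("Students")
plt.ylabel("Marks")
plt.title("CLASS RECORDS")
plt.plot(students, marks, 'm--')   # dashed magenta line
plt.grid(True)                     # add grid lines for readability
plt.show()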
Output:
# Example data
x = range(10)
y1 = [xi**2 for xi in x]
y2 = [xi**1.5 for xi in x]
With plt.subplot(), several plots can be drawn in one figure. The layout is organized in
rows and columns, which are given by the first and second arguments.
The third argument is the index of the current plot.
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.
Example:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(1, 2, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(1, 2, 2)
plt.plot(x,y)
plt.show()
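The same three arguments also describe a vertical layout. As a sketch, the variation below stacks two plots in 2 rows and 1 column; the data and the titles are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt

# Illustrative data.
x = list(range(10))
y1 = [xi**2 for xi in x]
y2 = [xi**1.5 for xi in x]

plt.figure()                     # start from a fresh figure

top = plt.subplot(2, 1, 1)       # 2 rows, 1 column, first plot (top)
plt.plot(x, y1)
plt.title("x squared")

bottom = plt.subplot(2, 1, 2)    # 2 rows, 1 column, second plot (bottom)
plt.plot(x, y2)
plt.title("x to the power 1.5")

plt.tight_layout()               # keep titles and axes from overlapping
# plt.show()                     # uncomment to display the figure
```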
plot() Vs scatter()
Matplotlib's plt.plot() is a general-purpose plotting function that lets you create a
variety of line and marker plots.
The primary difference between plt.scatter and plt.plot is that plt.scatter can create scatter
plots where the properties of each individual point (size, face color, edge color, etc.) can be
individually controlled or mapped to data.
plt.plot can be noticeably more efficient than plt.scatter. The reason is
that plt.scatter has the capability to render a different size and/or color for each point, so the
renderer must do the extra work of constructing each point individually.
In plt.plot, on the other hand, the points are always essentially clones of each other, so
the work of determining the appearance of the points is done only once for the entire set of
data.
For large datasets, the difference between these two can lead to vastly different
performance, and for this reason, plt.plot should be preferred over plt.scatter for large datasets.
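The difference described above shows up in the objects the two functions return. The following is a minimal sketch with made-up data: plot() creates one line whose markers all share the same style, while scatter() creates a collection that stores one size per point.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (assumed for this sketch).
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 3, 5, 7, 11])
sizes = np.array([20, 60, 120, 200, 300])   # one marker size per point

fig, (left, right) = plt.subplots(1, 2)

# plot(): a single Line2D object; every marker is drawn with the same style.
lines = left.plot(x, y, "o", markersize=8, color="blue")

# scatter(): a PathCollection; here each point gets its own size.
points = right.scatter(x, y, s=sizes, c="blue")
# plt.show()   # uncomment to display the figure
```

When every point looks the same, the plot() form on the left is sufficient and, as noted above, cheaper to render for large datasets.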
Size and Color
You can change the size of the dots with the s argument. Make sure the array of sizes has
the same length as the arrays for the x- and y-axis.
Example
Set your own size for the markers:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])
plt.scatter(x, y, s=sizes)
plt.show()
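You can set a specific color for each dot by passing an array of colors to the c argument.
You cannot use the color argument for this, only the c argument.
Example
Set your own color of the markers:
# Reuses the imports and the x and y arrays (13 points each) from the previous example.
colors = np.array(["red", "green", "blue", "yellow", "pink", "black", "orange",
                   "purple", "beige", "brown", "gray", "cyan", "magenta"])
plt.scatter(x, y, c=colors)
plt.show()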
COLORMAP
The Matplotlib module has a number of available colormaps. A colormap is like a list of colors,
where each color is associated with a value; in this section the values range from 0 to 100.
One such colormap is called 'viridis'. It ranges from 0, which is a purple
color, up to 100, which is a yellow color.
You can specify the colormap with the keyword argument cmap with the value of the
colormap, in this case 'viridis' which is one of the built-in colormaps available in
Matplotlib.
In addition you have to create an array with values (from 0 to 100), one value for each
point in the scatter plot:
Example:
Create a color array, and specify a colormap in the scatter plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
colors = np.array([0, 10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 100])
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()
plt.show()
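To see how a value is turned into a color, the colormap can be queried directly. The sketch below is an illustration under one assumption: the values are scaled from the range 0 to 100 with an explicit Normalize object, which mirrors the example above where the smallest value is 0 and the largest is 100.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("viridis")        # built-in colormap
norm = Normalize(vmin=0, vmax=100)    # scale 0..100 onto the colormap's 0..1 range

low = cmap(norm(0))      # RGBA tuple at the purple end
mid = cmap(norm(50))     # RGBA tuple in the middle of the map
high = cmap(norm(100))   # RGBA tuple at the yellow end

# Print each value with its (red, green, blue, alpha) components.
for value, rgba in [(0, low), (50, mid), (100, high)]:
    print(value, tuple(round(float(c), 2) for c in rgba))
```

Each result is a tuple of four numbers between 0 and 1: red, green, blue, and alpha (opacity).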
ALPHA
You can adjust the transparency of the dots with the alpha argument. Just like colors, make
sure the array for sizes has the same length as the arrays for the x- and y-axis:
Example
Set your own size for the markers:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
sizes = np.array([20,50,100,200,500,1000,60,90,10,300,600,800,75])