Data Visualization in Python
Matplotlib
Instructor:
Sayed Shahid Hussain
Research Associate,
AI in Healthcare, National Center of AI
UET Peshawar
Email: sayedshahid310@gmail.com
Data Visualization in Python
• Data visualization combines data with visual elements such as images,
charts, and diagrams to communicate a message to different audiences.
• It presents data, however complex, in pictorial form, which makes trends
and relationships among variables easier to analyze.
• Data visualization has the following advantages:
• Represents complex data more simply
• Highlights well-performing and poorly performing areas
• Explores relationships between data points
• Identifies patterns even in large datasets
Python Libraries
• Many Python libraries can be used to build visualizations, such as
matplotlib, vispy, bokeh, seaborn, pygal, folium, and plotly.
• Of these, matplotlib and seaborn seem to be the most widely used for basic
to intermediate visualizations.
• Matplotlib
• Matplotlib is a popular Python library for displaying data and creating static, animated,
and interactive plots.
• It lets you draw appealing and informative graphics such as line plots, scatter plots,
histograms, and bar charts.
• Matplotlib Doc: https://matplotlib.org/stable/index.html
• Seaborn
• Seaborn is built on top of matplotlib.
• It keeps much of matplotlib's flavor while adding visualization features of
its own.
Pyplot
• Pyplot is a collection of functions that make matplotlib work like MATLAB.
• Most Matplotlib utilities lie under the pyplot submodule, which is usually imported
under the plt alias (import matplotlib.pyplot as plt).
• plt.figure(figsize=(10, 10)) (call this before plotting so the size applies to the plot)
• plt.plot(x, y)
• plt.title("Title")
• plt.xlabel("X_label")
• plt.ylabel("Y_label")
• We can change the style of the plots. To list all available styles:
• plt.style.available
• To apply one of them:
• plt.style.use('dark_background')
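The pyplot calls listed on this slide can be combined into one short script. This is a minimal sketch rather than a definitive recipe: the x and y values are made-up sample data, and the style is applied before the figure is created so that it takes effect.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data, chosen only for illustration
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# Apply a style first so it affects the figure created afterwards
plt.style.use("dark_background")

# Create the figure before plotting so that figsize applies to this plot
fig = plt.figure(figsize=(10, 10))

plt.plot(x, y)           # line plot of y against x
plt.title("Title")       # text shown above the plot
plt.xlabel("X_label")    # label of the horizontal axis
plt.ylabel("Y_label")    # label of the vertical axis

ax = plt.gca()           # the Axes object that pyplot drew on
```

In a plain script, calling plt.show() at the end opens the figure window; in a notebook the figure is usually displayed automatically.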
Bar Plot
• A bar plot (or bar chart) is one of the most common types of graphic.
• It shows the relationship between a numeric and a categorical variable.
• Each category of the categorical variable is represented as a bar.
• The height of the bar represents its numeric value.
• plt.bar(x, y)
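As an illustration of plt.bar, the sketch below draws one bar per category. The category names and values are invented for the example and do not come from any real dataset.

```python
import matplotlib.pyplot as plt

# Hypothetical categorical variable and its numeric values
categories = ["A", "B", "C", "D"]
values = [23, 17, 35, 29]

plt.figure()
# One bar per category; the height of each bar is its numeric value
bars = plt.bar(categories, values)
plt.title("Value per category")
plt.xlabel("Category")
plt.ylabel("Value")
```

plt.bar returns a container of the drawn rectangles, which can be used afterwards to restyle or annotate individual bars.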
Pie Chart
• A pie chart (or a circle chart) is a circular statistical graphic, which is divided
into slices to illustrate numerical proportion.
• In a pie chart, the arc length of each slice (and consequently its central
angle and area) is proportional to the quantity it represents.
• plt.pie(x)
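The proportionality described above can be checked with a small example. This is only a sketch with made-up quantities and labels; the labels and autopct arguments are optional extras of plt.pie used here to annotate the slices.

```python
import matplotlib.pyplot as plt

# Hypothetical quantities and labels, for illustration only
sizes = [40, 30, 20, 10]
labels = ["North", "South", "East", "West"]

plt.figure()
# Each slice's angle is proportional to its share of sum(sizes);
# autopct prints that share as a percentage on the slice
plt.pie(sizes, labels=labels, autopct="%1.1f%%")
plt.title("Share per region")

# The wedges drawn by plt.pie are stored as patches on the current Axes
ax = plt.gca()
angles = [w.theta2 - w.theta1 for w in ax.patches]  # central angles in degrees
```

Here the first slice holds 40 of 100 units, so its central angle is 40% of 360 degrees.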
Stack plot & Histograms
• Stack Plot
• A stack plot shows “parts of a whole” over time; it is like a pie chart that
changes over time.
• We can use the built-in plt.stackplot() function.
• Histograms
• A histogram is a graph showing a frequency distribution.
• It shows the number of observations within each given interval.
• We use the plt.hist() function to create histograms.
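Both plot types on this slide can be sketched with a few lines each. The data below is hypothetical: the stack plot splits the 24 hours of each day into four invented activities, and the histogram counts a made-up list of ages per 10-year interval.

```python
import matplotlib.pyplot as plt

# --- Stack plot: parts of a whole over time (hypothetical hours per day) ---
days = [1, 2, 3, 4, 5]
sleeping = [7, 8, 6, 11, 7]
eating = [2, 3, 4, 3, 2]
working = [7, 8, 7, 2, 2]
playing = [8, 5, 7, 8, 13]   # the four activities add up to 24 on every day

plt.figure()
layers = plt.stackplot(
    days, sleeping, eating, working, playing,
    labels=["Sleeping", "Eating", "Working", "Playing"],
)
plt.legend(loc="upper left")
plt.xlabel("Day")
plt.ylabel("Hours")

# --- Histogram: number of observations per interval (hypothetical ages) ---
ages = [22, 25, 27, 31, 34, 35, 38, 41, 42, 45, 47, 52, 55, 58, 63, 67]
bins = [20, 30, 40, 50, 60, 70]   # interval edges

plt.figure()
# counts[i] is the number of ages that fall between bins[i] and bins[i + 1]
counts, bin_edges, patches = plt.hist(ages, bins=bins, edgecolor="black")
plt.xlabel("Age")
plt.ylabel("Number of people")
```

Passing explicit bin edges keeps the intervals easy to read; an integer such as bins=5 would instead let Matplotlib choose equally wide intervals.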