Advanced Visualizations
Advanced Visualizations & Dashboards: Visualization Using Python
Table of Contents
Course Overview
Relevance of Data Visualization for Business
Libraries for Data Visualization in Python
Python Data Visualization Environment Configuration
Matplotlib Libraries for Visualization
Bar Chart Using ggplot
Bokeh and Pygal
Select Visualization Libraries
Interactive Graphs and Image Files
Plot Graphs
Multiple Lines in Graphs
Exercise: Create Line Charts with Pygal
Course Overview
[Video description begins] Topic title: Course Overview. [Video description
ends]
Hi, my name is Niranjan Pandey, and I'm part of the Alchemy Software Solutions
e-learning and training content development team.
In this course, I will explore how to implement advanced visualizations, plots,
and graphs using Python. I will discuss the scenarios and approaches for
building visualization plots and graphs using essential Python libraries,
including Matplotlib, ggplot, Bokeh, and Pygal.
Business processes are driven by data. If data is handled well, it helps a
business grow by supporting different calculations and manipulations of that
data. Data visualization is one technique for revealing the nature of the data
and providing insight into the business. It enables decision makers to take
swift action after analyzing the data through visualizations. It also lets
decision makers identify key business trends and base their forecasts on them.
Apart from that, business decision makers and business drivers are able to
interpret the data quickly and concisely. Data visualization facilitates direct
and customized interaction with data, provides summarized data with good
visual depictions, and helps with forecasting and predictions. Once data is
understood by the various stakeholders, it directly improves productivity,
which in turn leads to better sales. Now let's try to understand the relevance
of data visualization for businesses.
Apart from that, if you provide data in a concise way, it increases
productivity and, above all, saves a lot of time spent on data manipulation.
When data is presented as a visualization, it is easier to gather ideas from
the team, identify the gaps, and create a shared view that aligns
decision-making. Now let's try to understand some of the visualization types.
You have the line chart, which depicts how an identified parameter has grown
over time. Then you have the gauge, which tells you the current relative
business development. You can also have a vertical or horizontal bar graph,
and you can have a pie graph, which lets you identify the share of different
components within a single chart.
The objective of this video is to list the essential libraries that we can use
in Python to implement data visualization.
Python is one of the most useful tools used by data scientists. It gives you
the capability to curate, extract, elaborate, and wrangle data, along with rich
visualization. Some of the important Python libraries are as follows. Pandas
provides data manipulation and analysis. Seaborn provides statistical
graphical representations. Pygal lets you create SVGs.
Then we have Plotly, which helps you plot different types of charts and
graphs and can be used from various languages. Then you have Bokeh, which is
an interactive visualization library that targets modern web browsers for
presentation. Let's understand Pandas in greater detail. Pandas is highly
optimized and has built-in options for plotting visualizations. Apart from
that, the name itself is derived from panel data, meaning datasets that hold
observations over multiple time periods for the same individuals. It
integrates well with other Python libraries.
Then you have Seaborn, which is used for statistical graphics. It's an open
source library that builds statistical graphics on top of Matplotlib and
integrates closely with Pandas. Apart from that, it comes with built-in themes
for styling the information for better visibility, and a rich interface for
plotting the data. It also lets you draw complicated plots in a simple way.
Then you have Bokeh, which is meant for real-time interactive data
presentation in a modern web browser.
Anaconda is an open source, easy-to-use platform for doing Python and R data
science and machine learning on various platforms. It ships with many
packages and, along with them, a package manager tool called Conda. It also
facilitates developing machine learning and deep learning models using
various libraries. Apart from that, we can build visualizations using
libraries like Matplotlib, Bokeh, Datashader, HoloViews, and others. Let's go
and download Anaconda from www.anaconda.com/distribution by clicking on
Download.
[Video description begins] He clicks the Windows tab. The page heading is
Anaconda 2018.12 for Windows Installer. Two versions are displayed: Python
3.7 and Python 2.7. Each version has a Download button. [Video description
ends]
Once you get to Anaconda 2018.12 for Windows Installer, click on Download.
Once you click on Download, it lets you download the exe. Click on Save File
and wait for the download to complete. Once it is downloaded, double-click
the downloaded file.
[Video description begins] He clicks the I Agree button. The third step of the
Setup is displayed: Select Installation Type. Two options with radio buttons
are displayed for Install for: Just Me and All Users. At the bottom Back, Next,
and Cancel buttons are displayed. [Video description ends]
Next, you can decide whether this installation will be available to one user
or to all users. We'll select All Users and click on Next. Once you click on
Yes, it will ask you to select the destination folder.
[Video description begins] The fourth step of the Setup: Choose Install
Location is displayed. It has an input box for file destination. A browse button
is placed next to it. At the bottom there are three buttons: Back, Next, and
Cancel. [Video description ends]
Let's select the destination folder. We'll go to Browse and select a
destination of our choice.
[Video description begins] He clicks the Browse button. A file explorer titled
Browse For Folder appears. It has a list of folders. At the bottom there are
three buttons: Make New Folder, OK, and Cancel. He selects the folder
named anaconda in the D Drive of the PC. He clicks the OK button. The file
explorer closes. The fourth step of the Setup window is open. The Destination
Folder is set as: D:\anaconda. [Video description ends]
Then you can select whether you want to add Anaconda to the path. We'll say
Yes, though it is not recommended.
[Video description begins] He selects the first option. Both the options are
now selected. [Video description ends]
Once you click on Install, it starts the installation process, during which it
extracts various libraries and DLL files.
[Video description begins] He clicks the Show details button. It expands to
display a list of libraries and files. [Video description ends]
Once you get the message indicating that installation is complete, it means
all the files have been downloaded and executed to help build your
environment. We'll click on Next.
You can also install Microsoft VSCode if you wish to use it as your editor.
We'll say Skip for now.
[Video description begins] He clicks the Skip button. The final page of the
Setup is displayed. The heading is: Thanks for installing Anaconda3! At the
bottom, Back, Finish, and Cancel buttons are displayed. [Video description
ends]
Once installation is successful, various tools are installed. One of the tools
is Spyder, which lets you write applications, test them, and visualize graphs
if you are using the appropriate libraries.
You have to allow access. And finally, you get Spyder, which is divided into
multiple parts.
[Video description begins] He clicks the Allow access button. The Windows
Security Alert box closes. Spyder (Python 3.7) IDE opens on the screen. It
has a menu bar with options for File, Edit, Search, etc. It has a tool bar with
options for New, Open, Save, etc. The Spyder screen has three sections. The
first section is the Editor, in which the file untitled0.py is open. Next to it, there
is a second section which is the Console. It has a search bar and a Usage
box. Below it, there is a third section with the following three tabs: Help,
Variable explorer, and File explorer. The Help tab is open. It has a tab titled
Console 1/A . [Video description ends]
You have the editor with one file called untitled0. On the right-hand side,
you have two panels. One is the console. The other displays variables under
Variable explorer and files under File explorer. You can click on Run to run
the file and see the outcome in the console.
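As a quick check once Spyder is open, a short script like the one below can
report which of the libraries named in this course are installed in the
current environment. This is a minimal sketch and not part of the recorded
demo: the LIBRARIES list and the installed_versions helper are hypothetical
names chosen for illustration, and the script assumes Python 3.8 or later
(newer than the 3.7 build shown in the video) because it relies on the
standard importlib.metadata module.

```python
from importlib.metadata import PackageNotFoundError, version

# Libraries mentioned in this course; adjust the list to match your setup.
LIBRARIES = ["numpy", "pandas", "matplotlib", "seaborn", "bokeh", "pygal"]


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


report = installed_versions(LIBRARIES)
for name, found in report.items():
    print(f"{name:12s} {found if found else 'not installed'}")
```

Pasting this into the editor and clicking Run prints one line per library in
the console. Anything reported as not installed can then be added with Conda.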
The objective of this video is to list the prominent data visualization
libraries that we can use with Matplotlib. Matplotlib is a standard plotting
library used by Python developers. It works with Python's numerical
mathematics extensions. For example, it supports NumPy. It also provides an
object-oriented API. It is considered one of the oldest and most popular
InfoVis libraries, and it provides an extensive range of 2D plot types and
output formats.
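To make the object-oriented API and the choice of output formats concrete,
here is a minimal sketch. The sine-wave data, the labels, and the in-memory
buffers are illustrative assumptions and do not come from the video. The Agg
backend is selected only so that the script also runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to see the plot in Spyder
import matplotlib.pyplot as plt
import numpy as np

# NumPy arrays can be passed straight to Matplotlib.
x = np.linspace(0, 2 * np.pi, 50)

# Object-oriented API: work with explicit Figure and Axes objects.
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line plot with the object-oriented API")
ax.legend()

# The same figure can be written in different output formats.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")  # bitmap output
svg_buffer = io.BytesIO()
fig.savefig(svg_buffer, format="svg")  # vector output
plt.close(fig)
```

Passing a file name such as "plot.png" to savefig instead of a buffer writes
the image to disk.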
[Video description begins] Topic title: Bar Chart Using ggplot. Your host for
this session is Niranjan Pandey. [Video description ends]
[Video description begins] He hits the play icon to execute the code. [Video
description ends]
Once you start execution, it installs ggplot and makes it available to the
other cells within the same session.
[Video description begins] It starts installing the required packages in the code
cell. [Video description ends]
Our next objective is to prepare a bar chart using ggplot.
[Video description begins] In the second code cell, the first line reads: from
ggplot import *, the second line reads: import pandas as pd, the third line
reads: df = pd.DataFrame({"data1": [1,2,3,4], "data2": [10,30,40,50] }), and the
last line reads: ggplot(aes(x="data1", weight="data2"), df) +
geom_bar(fill="#FF6666"). [Video description ends]
[Video description begins] He points to the first code line in the second code
cell. [Video description ends]
[Video description begins] He points to the second code line in the second
code cell. [Video description ends]
Here we are using static data to build a Pandas DataFrame. We have written
df = pd.DataFrame, and we are creating data1 and data2 as two different
columns.
[Video description begins] He points to the third code line in the second code
cell. [Video description ends]
[Video description begins] He points to the fourth code line in the second code
cell. [Video description ends]
Then we specify some of the properties. For example, we want the bars in this
graph to be filled with a particular color, which you set with fill. After you
have written the code, execute the current cell.
[Video description begins] He hits the play icon to execute the code. A graph
with data2 on y-axis and data1 on x-axis appears in the code cell. [Video
description ends]
Once you execute the current cell, you'll find data2 on the Y axis, marked at
10, 20, 30, 40, and 50, and data1 on the X axis, which is 1, 2, 3, 4. And you
can clearly see that the bar chart is displayed in whatever color you applied
in fill inside geom_bar.
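The ggplot package used in the video is a legacy package and may not install
or import cleanly in current environments. As an alternative, the sketch
below draws the same bar chart from the same DataFrame using Pandas' built-in
plotting on top of Matplotlib. This is a swapped-in technique, not the
presenter's code. The heights variable is added only to show what was drawn.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Same static data as in the video.
df = pd.DataFrame({"data1": [1, 2, 3, 4], "data2": [10, 30, 40, 50]})

# data1 supplies the bar positions, data2 the bar heights.
ax = df.plot.bar(x="data1", y="data2", color="#FF6666", legend=False)
ax.set_xlabel("data1")
ax.set_ylabel("data2")

# Each bar is a rectangle patch on the axes.
heights = [bar.get_height() for bar in ax.patches]
print(heights)
```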
Bokeh and Pygal
[Video description begins] Topic title: Bokeh and Pygal. Your host for this
session is Niranjan Pandey. [Video description ends]
We have created an HTML template in which we have specified the HTML doctype
and the opening and closing HTML tags.
[Video description begins] He talks about code lines 5 and 6. Code line 5 is:
<!DOCTYPE html>. Code line 6 is: <html>. [Video description ends]
Then we specify the title of the chart, Product usage evolution in
percentage. Then we provide the labels, map(str, range(2002, 2013)), which
are the years over which we are going to display the product evolution.
[Video description begins] He talks about code line 16. It reads: line_chart.title
= "Product usage evolution (in %)". [Video description ends]
The products that we have added to our chart are Product1, Product2,
Product3, and MISC, along with their data.
[Video description begins] He talks about code lines 18 to 21. Code line 18 is:
line_chart.add("Product1", [None, None, 0, 16.6, 25, 31, 36.4, 45.5, 46.3,
42.8, 37.1]). Code line 19 is: line_chart.add("Product2", [None, None, None,
None, None, None, 0, 3.9, 10.8, 23.8, 35.3]). Code line 20 is:
line_chart.add("Product3", [85.8, 84.6, 84.7, 74.5, 66, 58.6, 54.7, 44.8, 36.2,
26.6, 20.1]). Code line 21 is: line_chart.add("MISC", [14.2, 15.4, 15.3, 8.9, 9,
10.4, 8.9, 5.8, 6.7, 6.8, 7.5]). [Video description ends]
Finally, we use HTML to launch the chart, specifying pygal_render, which is
used in the HTML template that we specified in the lines above. And then
finally, we call .render, passing is_unicode=True.
When we run this cell, it generates a line chart using the capabilities of
Pygal.
Apart from this, you can also store this chart as an SVG file. For that, you
use a different render function, render_to_file. So depending on your need,
you can manipulate, evaluate, and store Pygal charts.
[Video description begins] He scrolls down the page and goes to the third
code cell. [Video description ends]
[Video description begins] He talks about code line 1. It reads: import numpy
as np. [Video description ends]
Then we take figure and show, which are in bokeh.plotting. And finally, we
import output_notebook because we want the chart to be displayed in the
notebook itself.
[Video description begins] He talks about code lines 2 and 3. Code line 2 is:
from bokeh.plotting import figure, show. Code line 3 is: from bokeh.io import
output_notebook. [Video description ends]
[Video description begins] He talks about code lines 4 to 6. Code line 4 is: N =
100. Code line 5 is: x = np.random.random(size=N) * 50. Code line 6 is: y =
np.random.random(size=N) * 50. [Video description ends]
Then we select the colors that we want to use to plot the chart.
[Video description begins] He talks about code line 8. It reads: colors =
["#%02x%02x%02x" % (r, g, 150) for r, g in zip(np.floor(50+2*x).astype(int),
np.floor(30+2*y).astype(int))]. [Video description ends]
The chart is plotted using p = figure and p.circle, specifying x and y, which
are the data points. radius is set to radii, which we have calculated using
an np.random function. For fill_color we use colors, the list we built in
the colors variable. And finally, fill_alpha is set to 0.6 and line_color to
None.
[Video description begins] He talks about code lines 10 to 12. Code line 10
reads: p = figure(). Code line 11 reads: p.circle(x, y, radius=radii,
fill_color=colors, fill_alpha=0.6, line_color=None). Code line 12 reads:
show(p). [Video description ends]
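The color expression on code line 8 is easier to follow when run on its own.
The sketch below uses only NumPy to build the same inputs and show what the
colors list contains. The radii formula is an assumption, since the
transcript only says that radii comes from an np.random function.

```python
import numpy as np

N = 100
x = np.random.random(size=N) * 50  # values in [0, 50)
y = np.random.random(size=N) * 50  # values in [0, 50)
radii = np.random.random(size=N) * 1.5  # assumed scale; not shown in the transcript

# Red grows with x (50 to 149), green grows with y (30 to 129), blue is fixed at 150.
reds = np.floor(50 + 2 * x).astype(int)
greens = np.floor(30 + 2 * y).astype(int)

# "%02x" formats each channel as two hexadecimal digits, giving "#rrggbb".
colors = ["#%02x%02x%02x" % (r, g, 150) for r, g in zip(reds, greens)]
print(colors[:3])
```

Each point therefore gets its own hex color string, and lists like x, y,
radii, and colors are what p.circle receives in the code described above.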
show does the magic of embedding the chart created with Bokeh into the
notebook. To display the chart in the notebook, we execute the cell.
Once you execute the cell, you'll find that it loads BokehJS and provides the
browser view of your chart.
[Video description begins] A graph appears in the output below the code lines.
It has the following text: BokehJS 1.0.4 successfully loaded. There is a list of
icons present on the right side of the graph. These icons are Save, Reset,
Refresh etc. [Video description ends]
You can save it. You can also refresh, so that if the data changes, the
change is reflected in your chart as well.
Once you identify the domain, you need to find out whether UI/UX design and
customization is essential. If it is, it is better to build vector files,
which are SVG. But if UI/UX customization doesn't matter, you need to
identify whether resolution scale matters. If it does, you select Canvas,
which is Bitmap.
Then you need to focus on the volume of data and whether you have to manage
data at rest or real-time data. If you have to manage big data or real-time
data, you can again select Canvas, which is Bitmap. If you are selecting
Bitmap or Canvas, you can use ChartJS or ECharts. If you are going with SVG,
you can select from Highcharts or Plotly.
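The selection steps above can be sketched as a small decision helper. This is
only one possible reading of the narration: the function names, the order in
which the questions are checked, and the fallback to SVG when none of the
conditions apply are assumptions.

```python
# Libraries the narration associates with each graphics type.
LIBRARIES_BY_TYPE = {
    "canvas": ["ChartJS", "ECharts"],   # bitmap
    "svg": ["Highcharts", "Plotly"],    # vector
}


def choose_graphics_type(custom_ui_essential, resolution_scale_matters,
                         big_or_realtime_data):
    """Return "svg" or "canvas" following the questions in the narration."""
    if custom_ui_essential:
        return "svg"
    if resolution_scale_matters or big_or_realtime_data:
        return "canvas"
    # Assumption: with no special requirement, default to vector output.
    return "svg"


def candidate_libraries(custom_ui_essential, resolution_scale_matters,
                        big_or_realtime_data):
    """Return the chosen graphics type and the libraries suggested for it."""
    graphics_type = choose_graphics_type(
        custom_ui_essential, resolution_scale_matters, big_or_realtime_data
    )
    return graphics_type, LIBRARIES_BY_TYPE[graphics_type]


# Example: a real-time dashboard with no custom UI requirement.
print(candidate_libraries(False, False, True))
```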
One of the tasks is to select the right graphics type, that is, whether you
should go for Bitmap or Vector. If you need high performance and you're
working with filters, rays, and tracers, you can probably go with Canvas.
But if you are working with high-fidelity documents, which are important for
viewing and printing, or you want static images, SVG is probably the better
option.
[Video description begins] An illustration is shown. In it, two images
representing canvas and SVG are connected to each other via a double-sided
arrow. The arrow is labelled on one side as High performance and Image and
on the other side as Hi-fi format. Screen title: Chart Library Comparison
Matrix. [Video description ends]
Now let's try to understand various libraries and their capabilities through
a comparison matrix. The first feature we have selected is graphics type:
ChartJS is Bitmap, while Highcharts, Plotly, and Britecharts are Vector. When
it comes to dependencies, ChartJS depends on Color and Moment. Highcharts has
no dependency.
Some of the essential interactive charts are as follows. A heat map is a
graphical representation of data where the individual values contained in a
matrix are represented with different colors. Heat map is a newer term, but
shading matrices have existed for a long time. Then we have the geographical
map. Information related to geography can be represented using a
geographical map.
For example, if you want to point out sales across different geographies, you
will probably go with a geographical map. Then you have the link chart. A
link chart's responsibility is to show the magnitude and direction of
relationships between two or more categorical variables.
Link charts are used in link analysis to identify relationships between nodes
that are not easy to see from the raw data. Now let's try to understand Image
File Graphs. They are typically infographics that are posted as files. An
image file graph focuses on a specific data story rather than changing and
providing multiple data stories in a single visualization. It provides a
single view for exploration.
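As an illustration of the heat map idea described earlier, where each value
in a matrix is shown as a color, here is a minimal static sketch with
Matplotlib. The matrix holds random placeholder numbers, and the region and
quarter labels are hypothetical. It is not an interactive chart.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to display the figure
import matplotlib.pyplot as plt
import numpy as np

# Placeholder matrix: 4 rows by 3 columns of made-up integer values.
rng = np.random.default_rng(0)
data = rng.integers(0, 100, size=(4, 3))

fig, ax = plt.subplots()
image = ax.imshow(data, cmap="viridis")  # each cell is colored by its value
fig.colorbar(image, ax=ax, label="value")

# Hypothetical row and column labels.
ax.set_xticks(range(3))
ax.set_xticklabels(["Q1", "Q2", "Q3"])
ax.set_yticks(range(4))
ax.set_yticklabels(["North", "South", "East", "West"])
ax.set_title("Heat map of a 4 x 3 matrix")
```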
Plot Graphs
[Video description begins] Topic title: Plot Graphs. Your host for this session
is Niranjan Pandey. [Video description ends]
The objective of this video is to demonstrate how to plot graphs using lines
and markers.
[Video description begins] An RStudio IDE opens. At the top, it has a menu
bar with the following options: File, Edit, Code, View, etc. The IDE has four
sections. The first section is the Code Editor. Inside the Code Editor, there is
a panel with a few buttons. Some of these are: Save, Source on Save, Run,
etc. A tab titled Untitled is open here. It has fourteen lines of code. The
second section is below the Code Editor. It has two tabs: Console and
Terminal. The Console tab is open. The third section lies on the right of the
Code Editor. It has three tabs: Environment, History, and Connections. The
Environment tab is open. Below this section lies the fourth section. It has five
tabs: Files, Plots, Packages, Help, and Viewers. The Plots tab is open. [Video
description ends]
We're using the ggplot2 and reshape2 libraries in RStudio. They are R
packages, so you can install a package if it is not already installed. After
installing a package, you have to load the library in your program. In lines
1 and 2, we load the libraries that we are going to use in this program.
[Video description begins] He talks about code line 4. It reads: set.seed <-
1. [Video description ends]
We are creating a data frame using data.frame, to which we pass cbind. cbind
is an R function that combines vectors, matrices, or data frames by columns.
We pass it the sequence from 1 to 10 in steps of 1 and then a matrix built
with rnorm. rnorm generates normally distributed random numbers, taking a
mean and a standard deviation as arguments. For uniformly distributed
numbers, R provides runif instead.
Then we call melt on df, passing id=x1. Once this call has initialized d, we
have the data in the form we need to plot the lines.
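For readers following along in Python rather than R, the reshaping step can
be sketched with Pandas. This is an analogue and not the presenter's code.
The column names X1 to X5 and the ten rows are inferred from the graph
described in the video, and the names wide and long_df are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # fixed seed so the numbers are repeatable

# Wide format: X1 is the sequence 1..10, X2..X5 are normally distributed values.
wide = pd.DataFrame(rng.normal(size=(10, 4)), columns=["X2", "X3", "X4", "X5"])
wide.insert(0, "X1", np.arange(1, 11))

# Long format: one row per (X1, variable) pair, with the number in "value".
long_df = wide.melt(id_vars=["X1"])
print(long_df.head())
```

The melted frame has the columns X1, variable, and value, which correspond to
the axis and legend names that appear on the plotted graph.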
To plot lines with different styles, we use ggplot. We pass the data, and we
specify the x-axis and y-axis along with color set to variable. In other
words, it chooses the colors on its own depending on the number of lines.
Then we specify that linetype is variable and size is 1. We'll execute this
code to see how the lines are created.
[Video description begins] He points to code line 10. It reads:
geom_line(aes(linetype=variable), size=1). [Video description ends]
[Video description begins] He selects code lines 1 to 10 and hits the Run
button. The selected code lines appear in the console pane. [Video
description ends]
Once we click on Run, it executes and plots the graph, drawing the variables
X2, X3, X4, and X5 as different lines.
[Video description begins] A graph appears in the Plots tab. It has value on
the y-axis and X1 on the x-axis. Four variable lines are present in the graph.
These are: X2, X3, X4, and X5. [Video description ends]
Now our objective is to see what happens when we add different markers for
each line.
[Video description begins] He talks about code line 12. It reads: # Line Styles
with markers 2: different markers for each line. [Video description ends]
Let's execute the complete program now. We'll Select All and click on Run.
[Video description begins] He selects code lines 1 to 14 and clicks the Run
button. [Video description ends]
Once we click on Run, you'll find that it generates two charts and brings the
most recent one to the front in the Plots tab.
[Video description begins] A graph appears in the Plots tab. It has value on
the y-axis and X1 on the x-axis. Four variable lines are present in the graph.
These are: X2, X3, X4, and X5. The pink line on the graph, labelled X2,
shows pink circles as markers at each new point in the line. The green line on
the graph, labelled X3, shows green triangles as markers at each new point in
the line. The blue line on the graph, labelled X4, shows blue rectangles as
markers at each new point in the line. The purple line on the graph, labelled
X5, shows purple cross signs as markers at each new point in the line. [Video
description ends]
You can see different marker shapes for the different lines: triangle,
rectangle, and circle. But when you go to the previous chart, it shows simple
lines without any markers. That chart is the result of lines 8 and 9.
[Video description begins] He presses the backward arrow button in the Plots
tab and shifts to the previous graph. [Video description ends]
And the second chart, which comes with markers, is the result of lines 13 and
14.
[Video description begins] He presses the forward arrow button in the Plots
tab and shifts to the new graph. [Video description ends]
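A comparable chart can be sketched in Python with Matplotlib, giving each
line its own line style and marker. This is an analogue of the R demo and
not the presenter's code. The random data and the particular style and marker
choices are assumptions, and Matplotlib's linewidth and markersize units do
not match ggplot2's size values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to display the figure
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(1, 11)

# One (line style, marker) pair per series: circle, triangle, square, cross.
styles = {"X2": ("-", "o"), "X3": ("--", "^"), "X4": (":", "s"), "X5": ("-.", "x")}

fig, ax = plt.subplots()
for name, (line_style, marker) in styles.items():
    ax.plot(x, rng.normal(size=x.size), linestyle=line_style, marker=marker,
            linewidth=1, markersize=10, label=name)

ax.set_xlabel("X1")
ax.set_ylabel("value")
ax.legend(title="variable")
```

Raising linewidth makes the lines thicker, and markersize controls how large
the markers are drawn.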
The objective of this video is to plot several lines in a single graph using
different line styles and markers. To plot several lines in a single graph,
the first task is to load the libraries.
[Video description begins] An RStudio IDE opens. At the top, it has a menu
bar with the following options: File, Edit, Code, View, etc. The IDE has four
sections. The 1st section is the Code Editor. Inside the Code Editor, there is a
panel with a few buttons. Some of these are: Save, Source on Save, Run, etc.
A tab titled 1_11.R is open here. It has eleven lines of code. The 2nd section
is below the Code Editor. It has two tabs: Console and Terminal. The Console
tab is open. The 3rd section lies on the right of the Code Editor. It has three
tabs: Environment, History, and Connections. The Environment tab is open.
Below this section lies the 4th section. It has 5 tabs: Files, Plots, Packages,
Help, and Viewers. The Plots tab is open. [Video description ends]
In lines 1 and 2, we load the ggplot2 and reshape2 libraries.
In line 4, we set the seed value. In line 5, we generate a random sequence
and matrices, which we bind together in order to build a data frame.
After building the data frame, we derive the dataset d, which we use in
line 9 to start plotting.
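The transcript does not show how d is derived. The names variable and value in
the resulting plot suggest a wide-to-long reshape of the kind reshape2 offers,
but that is an assumption. As a rough illustration of the idea only, here is a
minimal Python sketch; the column names, the numbers, and the melt helper are
all hypothetical and are not taken from the script in the video.

```python
# Hypothetical wide-format rows: X1 is the x value, X2 and X3 are two series.
wide_rows = [
    {"X1": 1, "X2": 0.5, "X3": 1.5},
    {"X1": 2, "X2": 0.7, "X3": 1.2},
]


def melt(rows, id_var):
    """Turn every non-id column into its own (id, variable, value) row."""
    long_rows = []
    for row in rows:
        for name, value in row.items():
            if name != id_var:
                long_rows.append(
                    {id_var: row[id_var], "variable": name, "value": value}
                )
    return long_rows


# Each input row yields one output row per series column.
long_rows = melt(wide_rows, "X1")
for r in long_rows:
    print(r)
```

In the long format, one column names the series and another holds its value,
which is what lets a plotting call map color, line type, or marker shape to the
series name.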
In line 11, we specify the marker, which marks the points of each line with a
different shape. We have set size to 10, which controls the size of the
marker.
Let's execute it to see the outcome. We'll select the entire code that we have
written in the script and click the Run button.
[Video description begins] He selects code lines 1 to 11 and clicks the Run
button. The selected code lines appear in the console pane. [Video
description ends]
You'll see that it generates different lines in a single graph, each with its
own style and marker. You'll also find different colors. In case you want to
increase the thickness of the lines, you can change the value from size 1 to
size 10. Re-select the code and run the same script again.
[Video description begins] In code line 10, he changes the size to 10. Code
line 10 now reads: geom_line(aes(linetype=variable), size=10). He again
selects code lines 1 to 11 and clicks the Run button. [Video description ends]
Once you run the same script again, you'll find that the line thickness
changes. This is only because we have changed the size of the line to 10.
[Video description begins] A graph appears in the Plots tab. It has value on
the y-axis and X1 on the x-axis. Four variable lines are present in the graph
with size 10 and marker size 10. These are: X2, X3, X4, and X5. The
thickness of the lines in the new graph has increased. The pink line on the
graph, labelled X2, shows pink circles as markers at each new point in the
line. The green line on the graph, labelled X3, shows green triangles as
markers at each new point in the line. The blue line on the graph, labelled X4,
shows blue rectangles as markers at each new point in the line. The purple
line on the graph, labelled X5, shows purple cross signs as markers at each
new point in the line. [Video description ends]
We'll revert the line size back to 1, and now we'll change the marker size to 1.
[Video description begins] He reverts the changes in code line 10. In code line
11, he changes the size to 1. Code line 11 now reads:
geom_point(aes(shape=variable, size=1)). [Video description ends]
Once we click Run, you'll find that the line size is back to 1 and that the
markers have reduced in size.
[Video description begins] A graph appears in the Plots tab. It has value on
the y-axis and X1 on the x-axis. Four variable lines are present in the graph
with size 1 and marker size 1. These are: X2, X3, X4, and X5. The thickness
of the lines has decreased. The pink line on the graph, labelled X2, shows
pink circles as markers at each new point in the line. The green line on the
graph, labelled X3, shows green triangles as markers at each new point in the
line. The blue line on the graph, labelled X4, shows blue rectangles as
markers at each new point in the line. The purple line on the graph, labelled
X5, shows purple cross signs as markers at each new point in the line. [Video
description ends]
Now it's time to test your knowledge and understanding of what we have
learnt in this course. In this exercise, you will create a line chart using
Pygal in Python, create an HTML directive to render the chart, and finally
render the Pygal chart using that HTML directive. Now I want you to pause this
video and attempt all of the exercises, and then come back to view the solution
so you understand how well you have done.
And finally, we set up the HTML directive, which starts with the DOCTYPE
declaration followed by the opening and closing html tags, and we store it in
html_pygal.
[Video description begins] He points to code lines 4 to 12. Code line 4 is:
html_pygal = u""" . Code line 5 is: <!DOCTYPE html>. Code line 6 is: <html>.
Code line 7 is: <head>. Code line 8 is: <script type="text/javascript"
src="http://kozea.github.com/pygal.js/javascripts/svg.jquery.js"></script>.
Code line 9 is: <script type="text/javascript"
src="http://kozea.github.com/pygal.js/javascripts/pygal-tooltips.js"></script>.
Code line 10 is: </head>. Code line 11 is:
<body><figure>{pygal_render}</figure></body>. Code line 12 is:
</html>. [Video description ends]
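The {pygal_render} marker in the template is a named placeholder that Python's
str.format fills in later. The sketch below shows only that mechanism, using a
trimmed-down template and a stand-in string in place of real chart output; the
names html_template and fake_svg are illustrative and do not appear in the
exercise.

```python
# Trimmed-down template: {pygal_render} is a named placeholder for str.format.
html_template = u"""
<!DOCTYPE html>
<html>
  <body><figure>{pygal_render}</figure></body>
</html>
"""

# Stand-in for the SVG markup a chart would produce (assumption, not real output).
fake_svg = "<svg><text>chart goes here</text></svg>"

# format() substitutes the keyword argument wherever the placeholder appears.
page = html_template.format(pygal_render=fake_svg)
print(page)
```

Because str.format treats every pair of curly braces as a placeholder, a
template that contained literal braces, for example inline CSS or JavaScript,
would need them doubled as {{ and }}.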
Our next objective is to build the line chart, which we do by specifying
line_chart = pygal.Line(). We give the chart a title, and we add the x-axis
labels and the different series that we want to plot.
[Video description begins] He points to code lines 15 to 21. Code line 15 is:
line_chart = pygal.Line(). Code line 16 is: line_chart.title = "Product usage
evolution (in %)". Code line 17 is: line_chart.x_labels = map(str, range(2002,
2013)). Code line 18 is: line_chart.add("Product1", [None, None, 0, 16.6, 25,
31, 36.4, 45.5, 46.3, 42.8, 37.1]). Code line 19 is: line_chart.add("Product2",
[None, None, None, None, None, None, 0, 3.9, 10.8, 23.8, 35.3]). Code line
20 is: line_chart.add("Product3", [85.8, 84.6, 84.7, 74.5, 66, 58.6, 54.7, 44.8,
36.2, 26.6, 20.1]). Code line 21 is: line_chart.add("MISC", [14.2, 15.4, 15.3,
8.9, 9, 10.4, 8.9, 5.8, 6.7, 6.8, 7.5]). [Video description ends]
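One detail worth checking in code line 17: in Python 3, map returns a lazy
iterator rather than a list. Whether a given Pygal version accepts the iterator
directly is not shown in the video, so wrapping it in list() is a conservative
choice, not a requirement stated by the course. The sketch below uses only the
standard library and also confirms that the labels line up with the eleven
values in each series.

```python
# Year labels as in code line 17, materialized into a list for Python 3.
x_labels = list(map(str, range(2002, 2013)))

# The Product1 series from code line 18; None marks years with no data.
product1 = [None, None, 0, 16.6, 25, 31, 36.4, 45.5, 46.3, 42.8, 37.1]

# range(2002, 2013) stops before 2013, so the labels run from 2002 to 2012.
print(x_labels[0], x_labels[-1], len(x_labels))

# Each series should supply exactly one value per label.
print(len(product1) == len(x_labels))
```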
After creating line_chart, our objective is to plot it in the HTML that we
created earlier. For that, we use HTML and pass it the variable in which we
stored the directive, which is html_pygal. We format it by applying .format
and passing pygal_render=line_chart.render, specifying is_unicode=True.
[Video description begins] He points to code line 22. It reads:
HTML(html_pygal.format(pygal_render=line_chart.render(is_unicode=True))).
[Video description ends]
Once you have done that, you have created the chart and specified the
statements that are required to render it in the HTML output of your current
Google Colab notebook, which runs Python code in cells. We'll click on Run
cell.
Once you click on Run cell, it starts executing the statements. And finally,
you'll find that it renders the chart, which is line_chart, showing the products
and miscellaneous. It also shows the title at the top, Product usage evolution,
along with the different lines that indicate the different products.
[Video description begins] A graph output appears below the code cell. It is
titled: Product usage evolution (in %). The x-axis ranges from 2002 to 2012
and the y-axis ranges from 10 to 80. In the graph, Product1 is marked in red,
Product2 is marked in purple, Product3 is marked in green, and MISC is
marked in yellow. [Video description ends]