
Data Visualisation With Tableau

In this course, you will learn how to analyse and
display data using Tableau and make better, more
data-driven decisions.
Our goal as Data Analysts is to arrange the insights of our data in
such a way that everybody who sees them is able to understand
their implications and how to act on them clearly.

Tableau is a data analytics and visualization tool used widely in the
industry today. Many businesses even consider it indispensable for
data-science-related work. Tableau's ease of use comes from its
drag-and-drop interface, which makes tasks like sorting, comparing,
and analyzing quick and easy. Tableau is also compatible with multiple
sources, including Excel, SQL Server, and cloud-based data
repositories, which makes it an excellent choice for Data Scientists.

This tutorial will cover the following topics:

 1. Introduction to Tableau
o Overview

o Installation

 2. Getting Started
o Tableau Workspace

o Connecting to a Data Source

o Creating a view

o Refining the view

 3. Emphasizing the Results


o Adding Filters to the view

o Adding Colors to the view

o Key Findings

 4. Map View
o Building a Map View

o Getting into details

o Identifying the Key points

 5. Dashboard
o Creating a dashboard

o Adding Interactiveness

 6. Story
o Building a Story

o Making a Conclusion

 7. Tableau's integration with R, Python & SQL Server


o Tableau & R

o Tableau & Python

o Tableau & SQL Server

1. Introduction to Tableau
Overview

Tableau Software is a software company headquartered in Seattle, Washington that
produces interactive data visualization products focused on business intelligence.
Tableau was established at Stanford University’s Department of Computer Science
between 1997 and 2002 (Wikipedia).

The main products offered by Tableau are Tableau Desktop, Tableau
Public, and Tableau Online. All of them support creating data
visualizations, and the choice depends on the type of work.

In this tutorial, we will be working with Tableau Desktop. The link for
the source of reference is here.

Installation

Depending upon the choice of product, download the software on to the computer.
After accepting the license agreement, you can verify the installation by clicking the
Tableau Icon. If the following screen appears, you are good to go.
2. Getting Started
In this section, we will learn some basic operations in Tableau to get accustomed to its
interface.

Tableau Workspace

The Tableau workspace is a collection of worksheets, menu bar, toolbar, marks card,
shelves and a lot of other elements about which we will learn in sections to come.
Sheets can be worksheets, dashboards, or stories. The image below highlights the
major components of the workspace. However, more familiarity will be achieved once
we work with actual data.
Source: Tableau.com

Connecting to a Data Source

To begin working with Tableau, we need to connect Tableau to the data source.
Tableau is compatible with a lot of data sources. The data sources supported by
Tableau appear on the left side of the opening screen. Some commonly used data
sources are Excel files, text files, relational databases, and data hosted on a server.
One can also connect to a cloud data source such as Google Analytics or Amazon Redshift.

The launch screen of Tableau Desktop shows the available data sources that one can
connect to. The list also depends on the version of Tableau, since the paid version
offers more possibilities. On the left side of the screen, there is a Connect pane which
highlights the available sources. File types are listed first, followed by common server
types, or the servers that have been recently connected. You can open previously
created workbooks under the Open tab. Tableau Desktop also provides some sample
workbooks under Sample Workbooks.
Hands On

Connecting to the Sample-Superstore data set

We shall be working with a sample data set named Superstore, which comes
pre-loaded with Tableau. However, we will be downloading the file from here so that
we can get an idea of connecting to an Excel data source. The data is that of a
superstore. It contains information about products, sales, profits, etc. Our aim as Data
Analysts is to analyze the data and find critical areas of improvement within this
fictitious company.

Steps

1. Import the data into the Tableau workspace from the computer.

2. Under the Sheets tab, three sheets become visible, namely
Orders, People, and Returns. However, we will focus only on the
Orders data. Double-click the Orders sheet, and it opens up just
like a spreadsheet.

3. We observe that the first three rows of data look a bit different
and are not in the desired format. Here we make use of the Data
Interpreter, also present under the Sheets tab. By clicking on it,
we get a nicely formatted sheet.
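
As a rough analogy for the clean-up step above, the sketch below skips a few stray rows that sit above the real header of a sheet. It is not how Tableau's Data Interpreter works internally; the function names and the sample rows are made up for illustration.

```python
def find_header_row(rows, expected_columns):
    """Return the index of the first row that contains every expected column name."""
    for index, row in enumerate(rows):
        if set(expected_columns).issubset(row):
            return index
    raise ValueError("no header row found")


def to_records(rows, expected_columns):
    """Drop everything above the header row and return the rest as dictionaries."""
    header_index = find_header_row(rows, expected_columns)
    header = rows[header_index]
    return [dict(zip(header, row)) for row in rows[header_index + 1:]]


# Hypothetical sheet content: three stray rows sit above the real header.
raw_rows = [
    ["Superstore export", "", ""],
    ["", "", ""],
    ["Demo data only", "", ""],
    ["Order ID", "Category", "Sales"],
    ["A-1", "Furniture", 120.0],
    ["A-2", "Technology", 300.0],
]

records = to_records(raw_rows, ["Order ID", "Sales"])
print(records[0])
```

Looking for the header by column name, rather than hard-coding "skip three rows", keeps the sketch usable if the number of stray rows changes.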
Hands On

Creating a View

We will start by generating a simple chart. In this section, we will get to know our data
and will begin to ask questions about the data to gain insights. There are some
important terms that we will encounter in this section.

Dimension

Measures

Aggregation

Dimensions are qualitative data, such as a name or date. By default, Tableau
automatically classifies data that contains qualitative or categorical information as a
dimension, for example, any field with text or date values. These fields generally
appear as column headers for rows of data, such as Customer Name or Order Date,
and also define the level of granularity that shows in the view.

Measures are quantitative numerical data. By default, Tableau treats any field
containing this kind of data as a measure, for example, sales transactions or profit.
Data that is classified as a measure can be aggregated based on a given dimension,
for example, total sales (Measure) by region (Dimension).
Aggregation is the row-level data rolled up to a higher category, such as the sum of
sales or total profit.

Tableau automatically sorts fields into Measures and Dimensions. However, if a field
ends up in the wrong group, one can change it manually too.
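
The three terms can be sketched in a few lines of Python. The classification rule below is a deliberate simplification (numbers are measures, everything else is a dimension) and is only an assumption for illustration; the rows and the helper names are made up.

```python
from collections import defaultdict

# Made-up rows for illustration; these are not values from the Superstore file.
orders = [
    {"Region": "South", "Customer Name": "A. Smith", "Sales": 100.0},
    {"Region": "South", "Customer Name": "B. Jones", "Sales": 50.0},
    {"Region": "West", "Customer Name": "C. Lee", "Sales": 200.0},
]


def classify(value):
    """Simplified rule: numbers are measures, everything else is a dimension."""
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    return "measure" if is_number else "dimension"


def sum_by(rows, dimension, measure):
    """Aggregate: roll row-level values of `measure` up to one total per `dimension` value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)


roles = {field: classify(value) for field, value in orders[0].items()}
sales_by_region = sum_by(orders, "Region", "Sales")
print(roles)
print(sales_by_region)
```

Here `sum_by(orders, "Region", "Sales")` corresponds to "total sales (Measure) by region (Dimension)" in the text.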

Steps

1. Go to the worksheet. Click the Sheet 1 tab at the bottom
left of the Tableau workspace.

2. Once you are in the worksheet, from Dimensions under the
Data pane, drag Order Date to the Columns shelf.
On dragging Order Date to the Columns shelf, a column for
each year of orders is created in the view. An 'Abc' indicator
is visible under each column, which implies that numerical or
text data can be dragged here. If we pulled Sales here, for
example, a cross-tab would be created showing the total Sales
for each year.

3. Similarly, from the Measures tab, drag the Sales field onto
the Rows shelf.

Tableau populates a chart with sales aggregated as a sum. The total aggregated sales for
each year, by order date, are displayed. By default, Tableau draws a line chart for a view
that includes a time field, which in this example is Order Date.
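
What the view computes can be sketched as a sum of Sales grouped by the year of Order Date. This is only an illustration of the aggregation, with made-up rows and a hypothetical helper name, not a description of Tableau's internals.

```python
from collections import defaultdict
from datetime import date

# Made-up rows for illustration only.
orders = [
    {"Order Date": date(2013, 3, 1), "Sales": 100.0},
    {"Order Date": date(2013, 11, 20), "Sales": 40.0},
    {"Order Date": date(2014, 6, 5), "Sales": 210.0},
    {"Order Date": date(2015, 1, 9), "Sales": 260.0},
]


def sales_by_year(rows):
    """Return (year, total sales) pairs in chronological order, one point per year."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Order Date"].year] += row["Sales"]
    return sorted(totals.items())


series = sales_by_year(orders)
for year, total in series:
    print(year, total)
```

Each pair in `series` corresponds to one point on the line chart.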
Hands On

What does the line chart above convey? Well, it shows that the sales look
quite promising and appear to be increasing with time. This is a valuable
insight, but it hardly says much about the products which are contributing
to increased Sales. Let us delve further to get more insights.

Refining the View

Let us delve deeper and try to find out more insights regarding which products drive
more sales. Let's start by adding the product categories to look at sales totals in a
different way.

Steps

1. Category is present under the Dimensions pane. Drag it to the
Columns shelf and place it next to YEAR(Order Date). Category
should be placed to the right of Year. In doing so, the view
immediately changes from a line chart to a bar chart. The chart
shows the overall Sales for every product category by year.
Learn More
To view information about each data point (that is, mark) in the
view, hover over one of the bars to reveal a tooltip. The tooltip
displays total sales for that category. Here is the tooltip for the
Office Supplies category for 2016:

To add labels to the view, click Show Mark Labels on the toolbar.

The bar chart can also be displayed horizontally instead of vertically.
Click Swap on the toolbar to do so.
The view above nicely shows sales by category, i.e., furniture, office supplies, and
technology. We can also infer that furniture sales are growing faster than sales of
office supplies except for 2016. Hence it will be wise to focus sales efforts on furniture
instead of office supplies. But furniture is a vast category and consists of many
different items. How can we identify which furniture item is contributing towards
maximum sales?

To help us answer that question, we decide to look at products by Sub-Category to see
which items are the big sellers. Let's say that, for the Furniture category, we want to look
at details about only bookcases, chairs, furnishings, and tables. We will double-click or
drag the Sub-Category dimension to the Columns shelf.

The sub-category is another discrete field. It further dissects the Category and displays
a bar for every sub-category broken down by category and year. However, it is a
humongous amount of data to make sense of visually. In the next section, we will
learn about filters, color and other ways to make the view more comprehensible.
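
The growth in the number of bars can be sketched by grouping the same rows at three levels of detail: year, then year and category, then year, category, and sub-category. The rows and the helper name are made up for illustration.

```python
from collections import defaultdict

# Made-up rows: (year, category, sub_category, sales).
rows = [
    (2015, "Furniture", "Chairs", 120.0),
    (2015, "Furniture", "Tables", 80.0),
    (2015, "Technology", "Phones", 200.0),
    (2016, "Furniture", "Chairs", 150.0),
    (2016, "Furniture", "Chairs", 30.0),
    (2016, "Technology", "Machines", 90.0),
]


def totals_at_level(rows, depth):
    """Sum sales using the first `depth` fields (year, category, sub-category) as the key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[:depth]] += row[3]
    return dict(totals)


by_year = totals_at_level(rows, 1)
by_category = totals_at_level(rows, 2)
by_sub_category = totals_at_level(rows, 3)

# Each key is one mark (bar) in the view, so the counts grow with every added dimension.
print(len(by_year), len(by_category), len(by_sub_category))
```

Every extra discrete field on the shelf splits the existing bars further, which is why the view becomes hard to read without filters.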
Hands On

3. Emphasizing the Results


In this section, we will try to focus on specific results. Filters and colors are ways to
add more focus to the details that interest us.

Adding filters to the view

Filters can be used to include or exclude values in the view. Here we try to add two
simple filters to the worksheet to make it easier to look at product sales by sub-
category for a specific year.

Steps

In the Data pane, under Dimensions, right-click Order Date and select Show
Filter. Repeat for the Sub-Category field.
Filters are a type of card and can be moved around the worksheet by
simple drag and drop.
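
Including and excluding values can be sketched as follows. The function, its parameters, and the rows are hypothetical and only illustrate the idea of a filter; they do not mirror any Tableau API.

```python
# Made-up rows for illustration only.
rows = [
    {"Year": 2015, "Sub-Category": "Chairs", "Sales": 120.0},
    {"Year": 2015, "Sub-Category": "Tables", "Sales": 80.0},
    {"Year": 2016, "Sub-Category": "Chairs", "Sales": 150.0},
    {"Year": 2016, "Sub-Category": "Machines", "Sales": 90.0},
]


def apply_filters(rows, include=None, exclude=None):
    """Keep rows whose values are in `include` and not in `exclude`.

    Both arguments map a field name to a set of values.
    """
    include = include or {}
    exclude = exclude or {}
    kept = []
    for row in rows:
        included = all(row[field] in values for field, values in include.items())
        excluded = any(row[field] in values for field, values in exclude.items())
        if included and not excluded:
            kept.append(row)
    return kept


# Show only 2016 and hide Machines.
view = apply_filters(
    rows,
    include={"Year": {2016}},
    exclude={"Sub-Category": {"Machines"}},
)
print(view)
```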
Adding colors to the view

Colors can be helpful in the visual identification of a pattern.

Steps

In the Data pane, under Measures, drag Profit to Color on the Marks card.
It can be seen that Bookcases, Tables, and even Machines contribute negative profit, i.e., a loss. A powerful insight.
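
The idea behind coloring by Profit can be sketched as summing profit per sub-category and flagging the totals below zero. Tableau uses a continuous color palette; the two labels used here are a simplification, and the rows are made up.

```python
from collections import defaultdict

# Made-up rows: (sub_category, profit).
rows = [
    ("Chairs", 40.0),
    ("Tables", -60.0),
    ("Tables", 10.0),
    ("Bookcases", -15.0),
    ("Phones", 70.0),
]


def profit_by_sub_category(rows):
    """Sum profit per sub-category."""
    totals = defaultdict(float)
    for sub_category, profit in rows:
        totals[sub_category] += profit
    return dict(totals)


def color_for(profit):
    """Two-bucket stand-in for a color scale: losses versus gains."""
    return "loss" if profit < 0 else "gain"


totals = profit_by_sub_category(rows)
colors = {name: color_for(profit) for name, profit in totals.items()}
# Loss-making sub-categories, worst first.
loss_makers = sorted((n for n in totals if totals[n] < 0), key=totals.get)
print(colors)
print(loss_makers)
```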

Hands On

Key Findings

Let's take a closer look at the filters to find out more about the unprofitable products.

Steps
In the view, in the Sub-Category filter card, uncheck all boxes except Bookcases, Tables, and Machines. This brings to
light an interesting fact: in some years, Bookcases and Machines were actually profitable. However, in 2016,
Machines became unprofitable.

Select All in the Sub-Category filter card to show all the subcategories again.

From Dimensions, drag Region to the Rows shelf and place it to the left of the Sum(Sales) tab. We notice that
machines in the South are reporting a higher negative profit overall than in the other regions.

Let us now give a name to the sheet. At the bottom-left of the workspace, double-click Sheet 1 and type Sales by
Product and Region.

In order to preserve the view, Tableau allows us to duplicate our worksheet so that we can continue in another sheet
from where we left off.

In your workbook, right-click the Sales by Product and Region sheet and select Duplicate and rename the duplicated
sheet to Sales-South.

In the new worksheet, from Dimensions, drag Region to the Filters shelf to add it as a filter in the view.

In the Filter Region dialogue box, clear all check boxes except South and then click OK. Now we can focus on sales
and profit in the South. We find that machine sales had a negative profit in 2014 and again in 2016. We will
investigate this in the next section.

Lastly, do not forget to save the results by selecting File > Save As. Let us name our workbook Regional Sales and
Profits.
4. Map View
Creating a Map View

Map views are beneficial when we are looking at geographic data (the Region field). In the current
example, Tableau automatically recognizes that the Country, State, City, and Postal Code fields contain
geographical information.

Steps

Create a new worksheet.

Add State and Country under Data pane to Detail on the Marks card. We obtain the map view.

Drag Region to the Filters shelf, and then filter down to South only. The map view now zooms in to the
South region only, and a mark represents each state.

Drag the Sales measure to the Color tab on the Marks card. We obtain a filled map with the colors showing
the range of sales in each state.

We can change the color scheme by clicking Color on the Marks card and selecting Edit Colors. We can
experiment with the available palettes.

We observe that Florida is performing the best regarding Sales. If we hover over Florida, it shows a total of
89,474 USD in sales, as compared to South Carolina, for example, which has only 8,482 USD in sales. Let us
gauge the performance by Profit now since Profit is a better indicator than Sales alone.

Drag Profit to Color on the Marks card. We now see that Tennessee, North Carolina, and Florida have
negative profit, even though they appeared to be doing well in Sales. Rename the sheet as Profit Map.
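
The gap between the two measures can be sketched by totalling both per state and listing the states that rank high on sales while their profit is negative. The state names and figures below are made up and are not the Superstore values.

```python
from collections import defaultdict

# Made-up rows: (state, sales, profit).
rows = [
    ("State A", 900.0, -120.0),
    ("State A", 300.0, 20.0),
    ("State B", 200.0, 60.0),
    ("State C", 500.0, -10.0),
]


def totals_by_state(rows):
    """Sum sales and profit per state."""
    totals = defaultdict(lambda: {"sales": 0.0, "profit": 0.0})
    for state, sales, profit in rows:
        totals[state]["sales"] += sales
        totals[state]["profit"] += profit
    return dict(totals)


totals = totals_by_state(rows)
# States ordered from highest to lowest sales.
by_sales = sorted(totals, key=lambda state: totals[state]["sales"], reverse=True)
# Of those, the ones that lose money overall.
losing = [state for state in by_sales if totals[state]["profit"] < 0]
print(by_sales)
print(losing)
```

In this made-up data the top state by sales is also the biggest loss maker, which is the kind of pattern the Profit coloring exposes on the map.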
Hands On

Getting into the details

Maps empower us to visualize the data broadly. In the last step, we discovered that Tennessee,
North Carolina, and Florida have a negative profit. In this section, let us draw a bar chart to explore the reason for the
negative profit.

Steps

Duplicate the Profit Map worksheet and name it Negative Profit Bar Chart.

Click Show Me on the Negative Profit Bar Chart worksheet. Show Me presents the number of ways in which a graph
can be plotted between items mentioned in the worksheet. From Show Me select the horizontal bar option and the
view updates to horizontal from vertical bars instantly.

We can select more than one bar at a time by simply clicking and dragging the cursor over them. We want to focus
only on the three states, i.e., Tennessee, North Carolina, and Florida. Hence, we will only select the bars pertaining
to them.

Learn More

Creating Hierarchies:
Hierarchies come in handy when we want to group similar fields so that we can quickly drill down between levels in
the viz.

In the Data pane, drag a field and drop it directly on top of another field or right-click the field and select
Drag any additional fields into the hierarchy. Fields can also be re-ordered in the hierarchy by simply dragging them
to a new position. In the current viz. we will create the following hierarchies: Location, Order, and Product.

On the Rows Shelf, click the plus-shaped icon on the State Field to drill-down to the City level.

That's a lot of data. We can use a Top N filter to reveal the weakest performers. For that, drag City from the Data
pane to the Filters shelf. Click By field, then click the Top drop-down and select Bottom to reveal the weakest
performers. Type 5 in the text box to show the bottom 5 performers in the data set.

We now see that Jacksonville and Miami, Florida; Burlington, North Carolina; and Knoxville and Memphis, Tennessee
are the poorest performing cities by profit. There is one other mark in the view—Jacksonville, North Carolina—that
doesn't belong here since it has profitable sales. This means there is an issue in the filter we applied. To resolve it,
we will turn to Tableau's Order of Operations.

On the Filters shelf, right-click the Inclusions (Country, State) set and select Add to Context. We find that
Concord (North Carolina) now appears in the view while Miami (Florida) has disappeared. This makes sense now.
But Jacksonville (North Carolina) is still present, which is incorrect. On the Rows shelf, click the plus-shaped icon on
the City tab to drill down to the Postal Code level. Right-click the postal code for Jacksonville, NC, 28540, and then select
Exclude to exclude Jacksonville manually.

Drag Postal Code off the Rows shelf. This is the final view.
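
Why adding a filter to context changes the result can be sketched as a difference in evaluation order. In Tableau's order of operations, context filters are applied before Top N filters, which in turn are applied before ordinary dimension filters. The sketch below mimics that with plain functions; the states, cities, and profits are made up, and the helper names are hypothetical.

```python
from collections import defaultdict

# Made-up rows: (state, city, profit).
rows = [
    ("State A", "City 1", -500.0),
    ("State A", "City 2", -400.0),
    ("State B", "City 3", -300.0),
    ("State B", "City 4", 50.0),
    ("State C", "City 5", -200.0),
]


def bottom_n_cities(rows, n):
    """Return the names of the n cities with the lowest total profit."""
    totals = defaultdict(float)
    for _state, city, profit in rows:
        totals[city] += profit
    return set(sorted(totals, key=totals.get)[:n])


def keep_states(rows, states):
    """Keep only the rows that belong to the given states."""
    return [row for row in rows if row[0] in states]


selected = {"State B", "State C"}

# State filter as an ordinary dimension filter: Bottom N is computed over ALL rows first.
bottom = bottom_n_cities(rows, 2)
without_context = [row for row in keep_states(rows, selected) if row[1] in bottom]

# State filter added to context: it is applied first, then Bottom N sees only those rows.
in_context = keep_states(rows, selected)
bottom_in_context = bottom_n_cities(in_context, 2)
with_context = [row for row in in_context if row[1] in bottom_in_context]

print([row[1] for row in without_context])
print([row[1] for row in with_context])
```

Without the context filter, the two weakest cities overall fall outside the selected states, so the view ends up empty; with it, the weakest cities are chosen from the selected states only.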

Hands On
Key Findings

Let us now focus only on the loss-making entities, i.e., the Products and also let us identify the locations where such
products are sold.

Steps

Drag Sub-Category to the Rows to further drill down.

Similarly, drag the Profit to Color on the Marks card. This enables us to spot products with negative profit quickly.

Right-click the Order Date and select Show Filter. It seems that Machines, Tables, and Binders are performing poorly.
So what should we do? One solution would be to stop the sale of these products in Jacksonville, Concord, Burlington,
Knoxville, and Memphis. Let's verify whether our decision is right.

Let us head back to the previously created Profit Map sheet tab.

Now, click on the Sub-Category field to select the Show Filter option.

Drag Profit from under Measures onto Label on the Marks card.

Again, click on the Order Date and select Show Filter. From the filter, let us clear the items that we think are
contributing to negative profit. So, uncheck the boxes in front of Binders, Machines, and Tables. Now we
are left with only the profit-making entities. This shows that Binders, Machines, and Tables were
actually causing losses in some areas and that we were right in our findings.
5. Dashboard
A dashboard is a collection of several views, enabling one to compare a variety of data simultaneously.

Creating a Dashboard

Steps

Click the New dashboard button.

Drag Sales in the South to the empty dashboard

Drag Profit Map to the dashboard, and drop it on top of the Sales in the South view. Both views can be seen at once.
To present the data so that others can understand it, we can arrange the dashboard to our liking.

On the Sales South worksheet in the dashboard view, click under the Region and clear off the Show Header. Repeat
the same process for all the other headers. This helps to emphasize only what is needed and hides away the not so
important information.

On the Profit Map, Hide the Title as well and perform the same Steps for the Sales South map.

We can see that the Sub-Category filter card and Year of Order Date have been repeated on the right-side. Let us get
rid of the extra ones by simply crossing them out. Finally, click on the Year of Order Date. A drop-down arrow appears;
click it and select Single Value (Slider). Now let the magic unfold. Experiment by choosing different years on the
slider, and the Sales vary accordingly.

Drag the SUM(Profit) filter to the bottom of the dashboard below Sales in South for a better view.

Hands On
Adding Interactiveness

To make the dashboard more interactive, for example to see which sub-categories are profitable in which states, a
few changes need to be made.

Steps

Let's start with the Profit Map. On clicking the map, a Use as filter icon appears in the upper right. Click on it. If we
select any state on the map, Sales corresponding to that state will be highlighted in the Sales-South map.
For the Year of Order Date, click on the drop-down option and go to Apply to Worksheets > Selected Worksheets. A
dialog box opens up. Select the All option followed by OK. What does this option do? It applies filters to all the
worksheets having the same data source.

Explore and experiment. In the visualization below, we can filter the Sales South map to view products that are being
sold in North Carolina only. We can then easily explore the profits yearly.

Rename the Dashboard to Regional Sales and Profit.

Hands On

Thus, selling machines in North Carolina did not bring any profit to the company.
6. Story
A dashboard is a cool feature, but Tableau also lets us showcase our results in presentation mode in the form of
stories, which we will discuss in this section.

Building a Story

Steps

Click the New story button.

From the Story pane on the left, drag the Sales in the South worksheet (created earlier) onto the view.

Edit the text in the gray box above the worksheet. This is the caption. Name it as Sales and profit by year.

Stories are quite specific. Here we will tell a story about selling machines in North Carolina. In the Story pane, click on
Duplicate to duplicate the first caption, or you may even create a new one.

In the Sub-Category filter, select only Machines. This helps to gauge sales and profit of machines by year.

Rename the caption to Machine sales and profit by year.

Hands On
Making a Conclusion

It is clear that machines in North Carolina are leading to loss of profit. However, this cannot be demonstrated by
looking at Profit and Sales on the whole. For this, we need regional Profit.

Steps

In the Story pane, select Blank. Drag the already created dashboard Regional Sales and Profit onto the canvas.

Caption it as Low performing items in the South.

Select Duplicate to create another story point with the Regional Profit dashboard. Select North Carolina on the bar
chart since we are interested in showing more about it.
Select All the years.

Add a caption for clarity, like, Profit in NC : 2013-2016.

Select any year like 2014. Add a caption, for example, Profit in NC : 2014 and then click on the Duplicate tab. Repeat
the same step for all the remaining years.

Click on the presentation mode and let the story unfold.

Hands On

Now we have an idea of which products were introduced to the North Carolina market, when, and how they
performed. Not only have we identified a way to address negative profit, but we have also successfully managed to back
it with data. This is the advantage of Story in Tableau.
