Module 4 Data Science Visualization Tools

Data visualization is crucial for discovering trends, providing context, and saving time in data analysis. Various tools like D3.js, Google Charts, and MapReduce are discussed for creating interactive visualizations and processing large datasets. The document also highlights the pros and cons of developing custom reporting applications versus using existing company tools.


MODULE 3

VISUALIZATION
TOOLS
Data visualization is the graphical representation of
information and data.

Why is Data Visualization Important?


1. Data Visualization Discovers the Trends in Data
2. Data Visualization Provides a Perspective on the Data
3. Data Visualization Puts the Data into the Correct Context
4. Data Visualization Saves Time
5. Data Visualization Tells a Data Story
JavaScript Libraries / Dashboard Development Tools

Highcharts
Google Charts
Chartkick
d3.js
Data scientists must deliver their new insights to the end user, and results can be communicated in several ways:

• A one-time presentation
• A new viewport on your data
• A real-time dashboard

A one-time presentation
Research questions are one-shot deals because the business decision derived from them will bind the organization to a certain course for many years to come.

Take, for example, company investment decisions: Do we distribute our goods from two distribution centers or only one? Where do they need to be located for optimal efficiency?

When the decision is made, the exercise may not be repeated until you’ve retired. In this case, the results are communicated in a one-time presentation.
A new viewport on your data

The most obvious example here is customer segmentation. Sure, the segments themselves will be communicated via reports and presentations, but in essence they form tools, not the end result itself.

When a clear and relevant customer segmentation is discovered, it can be fed back to the database as a new dimension on the data from which it was derived.

From then on, people can make their own reports, such as how many products were sold to each segment of customers.
A real-time dashboard

Sometimes your task as a data scientist doesn’t end when you’ve discovered the new information you were looking for.

You can send your information back to the database and be done with it. But when other people start making reports on this newly discovered gold nugget, they might interpret it incorrectly and make reports that don’t make sense.

As the data scientist who discovered this new information, you must set the example: make the first refreshable report so others, mainly reporters and IT, can understand it and follow in your footsteps.

Making the first dashboard is also a way to shorten the delivery time of your insights to the end user who wants to use it on an everyday basis.
Data Visualization Options
(For delivering dashboards to end users)

D3.js, or D3, is a free, open-source JavaScript library that allows users to create interactive data visualizations for web browsers.

D3.js is built on web standards and uses HTML5, Cascading Style Sheets (CSS), and Scalable Vector Graphics (SVG).

D3.js is used to:

• Attach data to Document Object Model (DOM) elements
• Use CSS, HTML, and SVG to showcase data
• Make data interactive with D3.js data-driven transformations and transitions

dc.js (http://dc-js.github.io/)

An open-source JavaScript library for custom dynamic visualizations with unparalleled flexibility and expressiveness.
MapReduce is a programming model and software framework for processing large amounts of data in parallel:

• How it works
MapReduce breaks down large data sets into smaller chunks and processes them in parallel. This makes it faster and easier to process large amounts of data.

• Phases
MapReduce has two phases: Map and Reduce. The Map phase splits and maps data, while the Reduce phase shuffles and reduces the data.

• Fault-tolerant
MapReduce is fault-tolerant, which means it can maintain reliable operations and output even if it’s interrupted during processing.

MapReduce is part of the Apache Hadoop Ecosystem and uses the Hadoop Distributed File System (HDFS) for input and output. Hadoop can run MapReduce programs written in various languages, including Python, Java, and C++.
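The two phases can be sketched in a few lines of plain JavaScript. This is only an illustration of the model, not Hadoop itself: the map phase emits key/value pairs, a shuffle step groups them by key, and the reduce phase collapses each group to one result.

```javascript
// Minimal sketch of the MapReduce model in plain JavaScript (not Hadoop).

function mapPhase(records, mapper) {
  // Apply the mapper to every input record; each call may emit many pairs.
  return records.flatMap(mapper);
}

function shuffle(pairs) {
  // Group emitted [key, value] pairs by key, as happens between the phases.
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

function reducePhase(groups, reducer) {
  // Collapse each key's list of values to a single result.
  const out = {};
  for (const [key, values] of groups) out[key] = reducer(key, values);
  return out;
}

// Word count, the classic MapReduce example.
const lines = ["map reduce map", "reduce map"];
const counts = reducePhase(
  shuffle(mapPhase(lines, line => line.split(" ").map(w => [w, 1]))),
  (word, ones) => ones.length
);
console.log(counts); // { map: 3, reduce: 2 }
```

On a real cluster the mapper and reducer run on many machines at once, which is where the speedup on large data sets comes from.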
Flow diagram for MapReduce
Numerical: MovieLens Data

USER_ID  MOVIE_ID  RATING  TIMESTAMP
196      242       3       881250949
186      302       3       891717742
196      377       1       878887116
244      51        2       880606923
166      346       1       886397596
186      474       4       884182806
186      265       2       881171488

Solution:

Step 1 – First, map the values as user:movie pairs; this happens in the first (Map) phase of the MapReduce model.
196:242 ; 186:302 ; 196:377 ; 244:51 ; 166:346 ; 186:474 ; 186:265

Step 2 – After mapping, shuffle and sort the values, grouping them by key.
166:346 ; 186:302,474,265 ; 196:242,377 ; 244:51

Step 3 – After completion of steps 1 and 2, reduce each key’s values, here by counting how many movies each user rated.
166:1 ; 186:3 ; 196:2 ; 244:1
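The three steps above can be run as a small plain-JavaScript sketch on the user and movie ids from the table (note the table's third entry for user 186 is movie 474), with the reduce step counting movies per user:

```javascript
// The MovieLens numerical worked in plain JavaScript (not an actual Hadoop job).
const rows = [
  [196, 242], [186, 302], [196, 377], [244, 51],
  [166, 346], [186, 474], [186, 265],
];

// Step 1 – map each row to a user:movie pair.
const pairs = rows.map(([user, movie]) => [user, movie]);

// Step 2 – shuffle and sort: group movie ids under each user id.
const grouped = {};
for (const [user, movie] of pairs) (grouped[user] ??= []).push(movie);

// Step 3 – reduce each key's values, here to a count of movies rated.
const counted = Object.fromEntries(
  Object.entries(grouped).map(([user, movies]) => [user, movies.length])
);
console.log(grouped); // { '166': [ 346 ], '186': [ 302, 474, 265 ], ... }
console.log(counted); // { '166': 1, '186': 3, '196': 2, '244': 1 }
```

Any other reduction (a sum or average of ratings per user, say) would only change the function applied in step 3.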

Solution

Do not send enormous loads of data over the internet or even your internal network though, for these reasons:

■ Sending a bulk of data will tax the network to the point where it will bother other users.

■ The browser is on the receiving end, and while loading in the data it will temporarily freeze. For small amounts of data this is unnoticeable, but when you start looking at 100,000 lines, it can become a visible lag. When you go over 1,000,000 lines, depending on the width of your data, your browser could give up on you.
Crossfilter, the JavaScript MapReduce library

Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records.

Example: Airline on-time performance
https://square.github.io/crossfilter
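The core idea behind Crossfilter can be mimicked in a few lines of plain JavaScript. The toy sketch below is not the real library's API, and the flight records are made up for illustration: a filter is applied on one dimension of the data, and every coordinated view is then recomputed from the surviving records.

```javascript
// Toy illustration of the crossfilter idea (not the library's actual API).
// Hypothetical flight records: departure hour and arrival delay in minutes.
const flights = [
  { hour: 9,  delay: 12 },
  { hour: 9,  delay: -3 },
  { hour: 17, delay: 48 },
  { hour: 17, delay: 5  },
  { hour: 23, delay: 0  },
];

// Filter on the "hour" dimension: keep only afternoon/evening departures.
const afternoon = flights.filter(f => f.hour >= 12);

// A coordinated view then groups the remaining records,
// here counting flights delayed by more than 15 minutes per hour.
const delayedByHour = {};
for (const f of afternoon) {
  delayedByHour[f.hour] = (delayedByHour[f.hour] ?? 0) + (f.delay > 15 ? 1 : 0);
}
console.log(delayedByHour); // { '17': 1, '23': 0 }
```

The real library precomputes indexes per dimension so that adding or removing a filter updates all views incrementally instead of rescanning the whole dataset, which is what keeps interaction under 30ms on a million records.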
It’s time to build the actual application, and the ingredients of our small dc.js application are as follows:

■ jQuery—To handle the interactivity
■ Crossfilter.js—A MapReduce library and prerequisite to dc.js
■ d3.js—A popular data visualization library and prerequisite to dc.js
■ dc.js—The visualization library you will use to create your interactive dashboard
■ Bootstrap—A widely used layout library you’ll use to make it all look better

You’ll write only three files:

■ index.html—The HTML page that contains your application
There are multiple reasons why you’d create your own custom reports instead of opting for the (often more expensive) company tools out there:

• No budget—Startups can’t always afford every tool
• High accessibility—Everyone has a browser
• Available talent—(Comparatively) easy access to JavaScript developers
• Quick release—IT cycles can take a while
• Prototyping—A prototype application can provide a working stopgap and leave time for IT to build the production version.
There are reasons against developing your own application:

• Company policy—Application proliferation isn’t a good thing, and the company might want to prevent this by restricting local development.
• Mature reporting team—If you have a good reporting department, why would you still bother?
• Customization is satisfactory—Not everyone wants the shiny stuff; basic can be enough.
