Introduction of Big Data & Applications

This document provides an overview of big data including its definition, what makes data big, applications of big data, and who is generating big data. It discusses the 3Vs of big data, tools used for big data analytics, skills needed, common job profiles, and domains that use big data analytics. Examples are also provided of how big data is being used in applications like personalized marketing, recommendation engines, sentiment analysis, mobile advertising, consumer growth, biomedical research, smart cities, and wildfire prediction.

Uploaded by

Moazzam Ali
Copyright
© All Rights Reserved
Lecture 01- Introduction of Big Data

What is Big Data?
What makes data “Big” Data?
Applications of Big Data

Adeel Ahmed, PhD.

Big Data Definition

“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it.

Big data analytics examines large and varied types of data to uncover hidden patterns, correlations, and other insights.

What’s Big Data?

No single definition; here is one from Wikipedia:

• Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.
• The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.

Big Data: 3V’s

Who’s Generating Big Data

• Mobile devices (tracking all objects all the time)
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)

• Progress and innovation are no longer hindered by the ability to collect data
• But by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
The Model Has Changed…
• The model of generating and consuming data has changed

Old model: a few companies generate data; all others consume data

New model: all of us generate data, and all of us consume data

Harnessing Big Data

• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
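
To make the first two categories concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the helper names are hypothetical and chosen only for illustration; a real warehouse or real-time system would use very different technology. The OLTP-style function writes one row in a short transaction, while the OLAP-style function scans and aggregates many rows for reporting. RTAP is not shown, since it would require a streaming engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical `orders` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, amount REAL)"
    )
    return conn


def record_order(conn, customer, region, amount):
    """OLTP-style work: one short transaction that touches a single row."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT INTO orders (customer, region, amount) VALUES (?, ?, ?)",
            (customer, region, amount),
        )
    return cur.lastrowid


def revenue_by_region(conn):
    """OLAP-style work: scan and aggregate many rows for a report."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


conn = build_demo_db()
record_order(conn, "alice", "north", 120.0)
record_order(conn, "bob", "south", 80.0)
record_order(conn, "carol", "north", 30.0)
print(revenue_by_region(conn))
```

The same table serves both workloads here only because the data is tiny; the split into OLTP and OLAP systems exists because the two access patterns compete once the data grows.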

What’s driving Big Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More real-time

- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets

Big Data Analytics

• Big data is more real-time in nature than traditional DW applications
• Traditional DW architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
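
The shared-nothing idea in the last bullet can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: threads stand in for separate machines, and the function names and the sample `logs` list are made up for the example. Each worker counts words in its own partition without touching any other partition, and the partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_workers):
    """Split records into n_workers disjoint partitions (round-robin)."""
    return [records[i::n_workers] for i in range(n_workers)]


def count_words(part):
    """Local work: a worker reads only its own partition."""
    counts = Counter()
    for line in part:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(records, n_workers=4):
    """Count words per partition in parallel, then merge the partial counts."""
    parts = partition(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, parts))
    total = Counter()
    for partial in partials:
        total.update(partial)  # merge step: add up per-worker counts
    return total


logs = ["big data big value", "data in motion", "big models"]
print(parallel_word_count(logs, n_workers=2))
```

Scaling out then means adding partitions and workers rather than buying a larger single machine; the merge step is the only point where results are combined.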

Big Data Landscape
Shortlist few
Tools Used in Big Data Analytics
Soft Skills
Skills
Two Big Jobs
Similarity of Data Scientist and Big Data Engineer
Big Data Job Profiles

• Chief Data Officer
• Big Data Manager
• Big Data Scientist
• Big Data Analyst
• Big Data Solutions Architect
• Big Data Visualizer
• Big Data Consultant
• Big Data Researcher

What Does a Big Data Professional Do?
Data Analyst Salary Distribution in Pakistan

http://www.salaryexplorer.com/salary-survey.php?loc=164&loctype=1&jobtype=3&job=805
https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2019/01/KPMG-report.png
Domains Using Big Data Analytics
Data Sets
• https://www.kaggle.com/datasets
• https://archive.ics.uci.edu/ml/datasets.html
• https://opendata.com.pk/
• http://www.kdnuggets.com/2011/02/free-public-datasets.html
• http://aws.amazon.com/publicdatasets/
• http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
• http://stackoverflow.com/questions/2674421/free-large-datasets-to-experiment-with-hadoop
• http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data/
• http://www.gapminder.org/data/
• https://data.worldbank.org/
Stages in Big Data Analytics
What launched the Big Data era?
New Opportunities
Changing Times

Data Science → #1 catalyst for economic growth (McKinsey)
McKinsey Report (2013)

Cloud Computing
• Computing anywhere and anytime
• On-demand computing
• Dynamic and scalable data analysis

Data Torrent
Computing Anytime, Anywhere
Big Data Era


Applications:
What Makes Big Data Valuable?

Big Data
Better Models
Higher Precision

Big Data
Personalized marketing data
Happy customers! (“Awesome! Just what I was looking for!”)

“I like blue plates!”
“I like gold plates!”

Personalized Marketing
Recommendation Engines
Sentiment Analysis
“I love my new blue plates! Let me write a review.”

Natural Language Processing
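
As a rough illustration of sentiment analysis on review text, the sketch below scores a review by counting words from two small word lists. The word lists and function names are assumptions made up for this example; production systems typically rely on trained natural language processing models rather than fixed lexicons.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
POSITIVE = {"love", "great", "awesome", "happy", "like"}
NEGATIVE = {"hate", "broken", "terrible", "bad", "disappointed"}


def sentiment_score(text):
    """Return (#positive words - #negative words) found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    return pos - neg


def sentiment_label(text):
    """Map the numeric score to a coarse label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love my new blue plates!"))
```

A word-count approach like this misses negation and sarcasm ("not great"), which is one reason real systems use richer language models.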


Mobile Advertising
“Paint on sale at Home Depot!”
• Geolocation data
• Customer profile
• Recent purchases
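
One way these three inputs might be combined is sketched below. Everything here is hypothetical: the offer records, the 2 km radius, and the rule that an offer is shown only when the store is nearby and its category matches the customer's interests. The distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pick_offers(user_location, interests, offers, radius_km=2.0):
    """Return messages for offers that are both nearby and relevant.

    `interests` is a set of categories derived (by assumption) from the
    customer profile and recent purchases.
    """
    lat, lon = user_location
    messages = []
    for offer in offers:
        near = haversine_km(lat, lon, offer["lat"], offer["lon"]) <= radius_km
        relevant = offer["category"] in interests
        if near and relevant:
            messages.append(offer["message"])
    return messages


# Made-up sample data: one store close to the user, one far away.
offers = [
    {"lat": 32.7200, "lon": -117.1600, "category": "paint",
     "message": "Paint on sale!"},
    {"lat": 34.0500, "lon": -118.2400, "category": "paint",
     "message": "Paint on sale (other city)!"},
    {"lat": 32.7200, "lon": -117.1600, "category": "garden",
     "message": "Garden tools on sale!"},
]
print(pick_offers((32.7157, -117.1611), {"paint"}, offers))
```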
Consumer Growth to Guide Product Growth
Collective Consumer Behavior
“We like morning flights!”
Flight Schedule
Biomedical Applications
• 2 billion human genomes sequenced by 2025
• Up to 40 exabytes in storage!
• Up to 10,000 trillion CPU hours for processing!
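
A quick back-of-the-envelope calculation shows what these upper bounds imply per genome. The sketch assumes decimal units (1 exabyte = 10^18 bytes, 1 gigabyte = 10^9 bytes) and simply divides the slide's totals by the number of genomes; the variable names are illustrative.

```python
EXABYTE = 10**18   # bytes, decimal definition (assumption)
GIGABYTE = 10**9   # bytes, decimal definition (assumption)

genomes = 2 * 10**9              # 2 billion genomes
storage_bytes = 40 * EXABYTE     # up to 40 exabytes
cpu_hours = 10_000 * 10**12      # up to 10,000 trillion CPU hours

# Per-genome upper bounds implied by the totals above.
storage_gb_per_genome = storage_bytes / genomes / GIGABYTE
cpu_hours_per_genome = cpu_hours / genomes

print(f"Storage per genome: {storage_gb_per_genome:.0f} GB")
print(f"CPU hours per genome: {cpu_hours_per_genome:,.0f}")
```

Under these assumptions the totals work out to at most about 20 GB of storage and 5 million CPU hours per genome.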
Personalized Cancer Treatment
Common treatment
Perform large-scale analysis of patient genes and tumor growth
Customized treatment
The Biomedical Big Data Challenge
How to integrate sources to gain further insight
Big Data-Driven Cities
Smart Cities
• Use city-wide sensor data to
  • Lower energy costs, pollution
  • Improve services, traffic, safety, …
How are other applications using Big Data?
Smart and personalized business!
Big Data
• Personalized Marketing
• Personalized Medicine
• Smart Cities
• and more!
Better Models
Higher Precision
Saving Lives with Big Data: Wildfire Prediction and Emergency Response (W1.1)
Why important?
San Diego County, May 14, 2014
• 14 fires burning
• area of San Francisco
• 6 injured + 1 death
• $60 million USD
Why can Big Data help?
• People
• Sensors
• Organizations
Integration of diverse data
Integration of diverse streams
• see new things
• develop predictive analytics
Diverse Data Sources
Machine Data
Organizational Data
San Diego County Emergency
Map displaying fire perimeter information from NICS.
People
Monitoring
Big Data Visualization
Fire Modeling
Wildfire Command Centers of the Future
Saving Lives with Big Data: Precision Medicine and Health Informatics
Why is this important now?
[Chart: cost to sequence a genome is decreasing, shown alongside Moore’s law; axis marks at $100M, $1M, $10K]
Genome Data Storage
AGTTA 700MB
200GB
1 day
Health Records → Digital
Digital records
Paper records
120 Terabytes in 2013
2X more than in 2011
Recent changes → Big Data for Healthcare
• Reduced Cost Analysis
• Cheap, Large Data Storage
• Digitization of Records
Why can Big Data help?
Integration
• People
• Sensors
• Organizations
Sensor Data
101100010 →
More sensors, more places
Data → Storage & Analysis
Fitness Device Industry
[Chart: 2014 fitness device figures for Nike, Jawbone, and Fitbit; axis marks at 0.5M and 3.5M]
Data Generated?
2-5 GB / day
Save health care costs?
Organization Data
Scientific Data and Knowledge-base
• Experimental Data
• Computed Data
National Center for Biotechnology Information
People Data
Mobile Health Apps
Webby
>100,000 health apps (iTunes & Google Play)
By 2017 → $26 billion market?
A story:
The impact of novel people-generated data
“Have you had any reactions to your medications?”
“It’s been a month… Was that a reaction?”
Today → Self-Reported Data
Social Media
Integration → Personalization → Precision
Integration
• People
• Sensors
• Organizations
