Introduction to Big Data & Applications
Big Data Definition
What’s Big Data?
Big Data: 3V’s
Who’s Generating Big Data
• Mobile devices (tracking all objects all the time)
• Social media and networks (all of us are generating data)
• Scientific instruments (collecting all sorts of data)
• Sensor technology and networks (measuring all kinds of data)
• Progress and innovation are no longer hindered by the ability to collect data
• Instead, they are limited by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
The Model Has Changed…
• The Model of Generating/Consuming Data has Changed
Old model: a few companies generate data; all others consume it
New model: all of us generate data, and all of us consume it
Harnessing Big Data
What’s driving Big Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- Closer to real-time processing
Big Data Analytics
Big Data Landscape
A Shortlist of Tools Used in Big Data Analytics
Soft Skills
Skills
Two Big Jobs
Similarities Between a Data Scientist and a Big Data Engineer
Big Data Job Profiles
• Chief Data Officer
• Big Data Manager
• Big Data Scientist
• Big Data Analyst
• Big Data Solutions Architect
• Big Data Visualizer
• Big Data Consultant
• Big Data Researcher
What Does a Big Data Professional Do?
Data Analyst Salary Distribution in Pakistan
http://www.salaryexplorer.com/salary-survey.php?loc=164&loctype=1&jobtype=3&job=805
https://2s7gjr373w3x22jf92z99mgm5w-wpengine.netdna-ssl.com/wp-content/uploads/2019/01/KPMG-report.png
Domains Using Big Data Analytics
Data Sets
• https://www.kaggle.com/datasets
• https://archive.ics.uci.edu/ml/datasets.html
• https://opendata.com.pk/
• http://www.kdnuggets.com/2011/02/free-public-datasets.html
• http://aws.amazon.com/publicdatasets/
• http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
• http://stackoverflow.com/questions/2674421/free-large-datasets-to-experiment-with-hadoop
• http://www.ll.mit.edu/mission/communications/cyber/CSTcorpora/ideval/data/
• http://www.gapminder.org/data/
• https://data.worldbank.org/
Stages in Big Data Analytics
What launched the
Big Data era?
New Opportunities
Changing Times
Data Science
New Opportunities
#1 catalyst for economic growth (McKinsey)
Changing Times
McKinsey Report (2013)
Cloud Computing
Computing
anywhere and anytime
On-Demand
Computing
Big Data
Better Models
Higher Precision
What Makes Big Data Valuable?
Big Data
"Awesome! Just what I was looking for!"
Personalized marketing data
Happy customers!
"I like blue plates!"
"Paint on sale at Home Depot!"
Mobile Advertising
Geolocation data
Customer profile
Recent purchases
Consumer Growth to Guide
Product Growth
Collective Consumer
Behavior
"We like morning flights!"
Flight Schedule
Biomedical Applications
2 billion human genomes
sequenced by 2025
Up to 40 exabytes
in storage!
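As a rough sanity check on the two figures above, the projected storage can be divided by the projected number of genomes to get the implied storage per genome. The sketch below is only illustrative: it assumes decimal units (1 exabyte = 10^18 bytes), treats both slide figures as upper-bound estimates, and uses a hypothetical helper name.

```python
# Assumption: decimal (SI) units, 1 EB = 10**18 bytes and 1 GB = 10**9 bytes.
EXABYTE = 10**18
GIGABYTE = 10**9


def storage_per_genome_gb(total_exabytes: float, genomes: float) -> float:
    """Return the average storage per genome, in gigabytes."""
    return total_exabytes * EXABYTE / genomes / GIGABYTE


# Slide figures: up to 40 exabytes for 2 billion genomes.
per_genome = storage_per_genome_gb(40, 2_000_000_000)
print(f"{per_genome:.0f} GB per genome")
```

Under these assumptions the estimate works out to about 20 GB per genome, which shows why storage and analysis at this scale is a big data problem rather than a single-machine one.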
Personalized Cancer Treatment
Common treatment
Perform large-scale analysis of patient genes and tumor growth
Customized treatment
The Biomedical Big Data Challenge
Higher Precision
Saving Lives with Big Data: Wildfire Prediction and Emergency Response
Why important?
San Diego County, May 14, 2014
14 fires burning
area of
San Francisco
6 injured
+ 1 death
$60 million USD
Why can Big Data help?
People
Sensors
Organizations
Integration of diverse data
Integration of diverse streams
Moore’s law
AGTTA 700MB
200GB
1 day
Health Records → Digital
Paper records → Digital records
120 terabytes in 2013
2X more than in 2011
Recent changes → Big Data for Healthcare
Organizations
Sensor Data
Nike
Jawbone
0.5M Fitbit
2014
Data Generated?
2-5 GB / day
Save health care costs?
Organization Data
Scientific Data and Knowledge Base
Experimental Data
Computed Data
Organizations
Integration → Personalization → Precision