Chapter 4 - Big Data Analytics Part 3

The document discusses the role of data science in the Internet of Things (IoT), highlighting the challenges and opportunities presented by dynamic data-driven application systems. It emphasizes the importance of real-time data analytics, knowledge discovery, and decision support to create value from big data in various sectors such as retail, healthcare, and smart cities. The document also outlines the MIPS framework for managing streaming data and the potential ROI from innovative applications of IoT data.
Data Science for Dynamic Data-Driven Application Systems
in the Internet of Things (IoT)
Kirk Borne
@KirkDBorne
Principal Data Scientist
Booz Allen Hamilton, Strategic Innovation Group
http://www.boozallen.com/datascience
https://www.nsf.gov/news/news_images.jsp?cntn_id=122028

Internet of Things
Everything Interconnected
http://www.beechamresearch.com/article.aspx?id=4
http://www.datasciencecentral.com/profiles/blogs/what-is-the-internet-of-everything-ioe
Internet of Things (IoT): so many challenges and benefits!
1) The 3 E’s of an IoT World
▪ Everywhere a sensor
▪ Everything quantified and tracked (temporally & spatially)
▪ Erosion of Data Privacy
2) The 3 V’s of IoT Data (Deep, Wide, Fast all the time)
▪ (Volume) Deep Data = Ubiquitous Sensors everywhere
▪ (Variety) Wide Data = Diverse and Complex Data Types
▪ (Velocity) Fast Data = Streaming time series all the time
3) The 3 D2D’s of IoT ROI (Return On Innovation)
▪ Data2Discovery –– Data2Decisions –– Data2Dollars
(or Data2Dividends)
IoT Use Case examples
■ Retail (Dynamic Pricing, Smart Supply Chain, Precision Demand Forecasting)
■ Marketing (Personalized Real-time Ad Campaigns for Next Best Offer)
■ Smart Highways (monitoring vehicles, weather, road conditions, closures)
■ Precision Traffic (Self-driving & Self-parking Connected Cars)
■ Smart Cities (Growth, Dynamic Street-lighting, Smart Energy Usage)
■ Law Enforcement (Predictive, Prescriptive personnel & resource placements)
■ Healthcare (Wearables, Personalized Medicine, Patient/Provider Monitoring)
■ Online Education (Personalized Learning, Real-time interventions)
■ Forests, Farms, Vineyards,… (Precision Planning, Nurturing, Harvesting)
■ Financial / Banking / Insurance (Real-time Risk Mitigation, Fraud detection)
■ Organizations (Smart Ergonomics, Improved Employee Workflow, Process Mining for Efficiencies)
■ Invisibles (under-the-skin smart sensors – not only measure, but also learn, react, and proactively respond) = The Internet of Emotions!
■ Machines (Early Warning, Prescriptive Maintenance, Smart Obsolescence, M2M, IIoT = Industrial IoT)
In a nutshell … the XYZ of IoT:
Intelligence at the edge of the network
(at the point of data collection)
■ Smart X
■ Precision Y
■ Personalized Z

http://blog.autoserviceintelligence.com/a-smart-crm-in-action-building-consumer-trust-part-2

The Internet of Things (IoT) is an interconnected universe
of Dynamic Data-Driven Application Systems (DDDAS)
The BIG Big Data Challenge in the IoT
• Streaming Data Analytics:
❖ Real-Time Event Mining for Actionable Intelligence :
❑ Identifying, characterizing, & responding to millions of events in real-time streaming data
❑ Deciding which events (out of millions) need investigation and/or response (= Triage!)
• Web Analytics example:
❑ Web Behavior Modeling and Automated System Response (from online interactions & web browse patterns, behavioral analytics, user segmentation, data-driven discovery, …)
• Many other examples (Risk Mitigation):
❖ Health alerts (from EHRs and national health systems)
❖ Tsunami alerts (from geo sensors everywhere)
❖ Cybersecurity alerts (from network logs)
❖ Social event alerts or early warnings (from social media)
❖ Preventive Fraud alerts (from financial applications)
❖ Predictive Maintenance alerts (from machine / engine sensors)
❖ Infrastructure Monitoring alerts (from ubiquitous sensors)
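
The triage step above (deciding which events, out of millions, need investigation) can be sketched as a bounded top-k selection over a stream. This is only an illustration: the `Event` type, its `score` field, and the `triage` function are hypothetical names, and the slides do not say how events are actually scored or ranked.

```python
import heapq
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    event_id: str
    score: float  # hypothetical "interestingness" score from an upstream model


def triage(stream: Iterable[Event], k: int) -> list[Event]:
    """Keep only the k highest-scoring events seen in the stream.

    A min-heap of size k costs O(log k) per event, so the full
    stream never has to be held in memory.
    """
    heap: list[tuple[float, str]] = []
    for ev in stream:
        item = (ev.score, ev.event_id)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # New event beats the weakest one currently kept.
            heapq.heapreplace(heap, item)
    # Return the survivors, highest score first.
    return [Event(eid, s) for s, eid in sorted(heap, reverse=True)]
```

Because only k events are retained at any moment, the same pattern works whether the stream holds a thousand events or ten million.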
Big Data and the fundamental
business conflict:
RISK versus REWARD

http://www.telegraph.co.uk/news/worldnews/europe/russia/10061780/Russian-convicts-beat-Americans-in-cyber-chess-battle.html
Creating Rewards (Value) from
Big Data in the IoT : The 3 D2D’s
o Knowledge Discovery
– Data-to-Discovery (D2D)
o Data-driven Decision Support
– Data-to-Decisions (D2D)
o Big ROI (Return On Innovation)
– Data-to-Dollars or Data-to-Dividends (D2D)
– Innovative Applications of sense-making from
IoT sensors and sentinels everywhere

Data Science = 4 Types of Discovery
Learning from Data in the IoT
1) Class Discovery
■ Finding new classes of objects, events, and behaviors
■ Learning the rules that constrain class boundaries
2) Correlation (Predictive Power!) Discovery
■ Finding patterns and dependencies, which reveal new
governing principles or behavioral patterns (the “DNA”)
3) Association (or Link) Discovery
■ Finding unusual (improbable) co-occurring associations
4) Novelty (Surprise!) Discovery
■ Finding new, rare, one-in-a-[million / billion / trillion]
objects and events
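
As one possible illustration of Novelty (Surprise!) Discovery, the sketch below flags values that sit far from the bulk of the data. It assumes a single numeric feature and uses the modified z-score based on the median absolute deviation; the function name, the 0.6745 scaling constant, and the 3.5 cutoff are a common convention chosen here for illustration, not something taken from the slides.

```python
import statistics


def find_novelties(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return indices of values whose modified z-score exceeds threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less distorted by the outliers themselves than the mean and stdev.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]
```

For example, in a run of readings clustered near 10 with a single reading of 25, only the index of the 25 would be returned.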

Consider 3 Case Studies
from the universe of IoT possibilities,
as metaphors for your IoT / DDDAS* challenges
[*Dynamic Data-Driven Application Systems]
Case Study #1:
Astronomy Big Data

The LSST (Large Synoptic Survey Telescope)


LSST = Large Synoptic Survey Telescope
http://www.lsst.org/
– 8.4-meter diameter primary mirror (mirror funded by private donors)
– 10 square degrees field of view
– Construction began August 2014 (funded by NSF and DOE)
– 100-200 Petabyte image archive
– 20-40 Petabyte database catalog

LSST Key Science Drivers: Mapping the Dynamic Universe
– Complete inventory of the Solar System (Near-Earth Objects; killer asteroids???)
– Nature of Dark Energy (Cosmology; Supernovae at edge of the known Universe)
– Optical transients (10 million daily event notifications sent within 60 seconds)
– Digital Milky Way (Dark Matter; Locations and velocities of 20 billion stars!)

LSST in time and space:
– When? ~2022-2032
– Where? Cerro Pachon, Chile
(Image: architect’s design of LSST Observatory)
LSST Summary
http://www.lsst.org/
• 3-Gigapixel camera
• One 6-Gigabyte image every 20 seconds
• 30 Terabytes every night for 10 years
• Repeat images of the entire night sky every
3 nights: Celestial Cinematography
• 100-200 Petabyte final image data archive
anticipated – all data are public!!!
• 20-40 Petabyte final database catalog
anticipated
• Real-Time Event Mining: ~10 million events
per night, every night, for 10 years!
– Follow-up observations required to classify these
– Which ones should we follow up? … TRIAGE!
… Decisions! Decisions! ( = D2D !)
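
The figures above can be cross-checked with some quick arithmetic. The sketch below assumes observing on all 365 nights of every year and 2 bytes (16 bits) per pixel; both are simplifying assumptions, not statements from the slides.

```python
TB_PER_NIGHT = 30               # from the slide
YEARS = 10                      # from the slide
EVENTS_PER_NIGHT = 10_000_000   # from the slide
NIGHTS_PER_YEAR = 365           # assumption: no nights lost to weather or maintenance
BYTES_PER_PIXEL = 2             # assumption: 16-bit pixels

total_nights = NIGHTS_PER_YEAR * YEARS
archive_pb = TB_PER_NIGHT * total_nights / 1000   # 1 PB = 1000 TB
total_events = EVENTS_PER_NIGHT * total_nights
image_gb = 3e9 * BYTES_PER_PIXEL / 1e9            # 3-Gigapixel camera

print(f"{archive_pb:.1f} PB of nightly data over the survey")   # 109.5 PB
print(f"{total_events:,} events to triage")                     # 36,500,000,000
print(f"{image_gb:.0f} GB per image")                           # 6 GB
```

Under these assumptions the nightly data alone comes to roughly 110 Petabytes, which sits at the lower end of the quoted 100-200 Petabyte archive, and the triage problem covers tens of billions of events over the survey.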
LSST Database = massive complexity challenge
http://www.lsst.org/
• ~30-Petabyte final database catalog anticipated
• ~30 trillion rows (astronomical sources), 200-300 columns
• How can we find descriptive and predictive models of
trillions of astrophysical sources? …
• … Soft10ware.com’s fast statistical modeling technology
offers an excellent approach:
– Non-parametric, multi-model generation
– Rapid model evaluation & ranking

Case Study #2:
Mars Rovers

Mars Rover: intelligent data-gatherer, mobile data mining
agent, and autonomous decision-support system
Rove around the surface of Mars and take samples of rocks (experimental
data type: mass spectroscopy = data histogram = feature vector)
Intelligent Data Operations in Action:
• Classification (assign rocks to known classes)
• Supervised Learning (search for rocks with known compositions)
• Unsupervised Learning (discover what types of rocks are present,
without preconceived biases)
• Clustering (find the set of unique classes of rocks)
• Association Mining (find unusual associations)
• Deviation/Outlier Detection (one-of-a-kind; interesting?)
• On-board Intelligent Data Understanding & Decision Support Systems (Fuzzy Logic & Decision Trees & Case-Based Reasoning) = Science Goal Monitoring:
– “stay here and do more” ; or else “follow trend to most interesting location”
– “send results to Earth immediately” ; or “send results later”
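
A toy version of this on-board loop might look like the following. It treats each rock sample as a short normalized histogram (the feature vector), classifies it by nearest known class, and treats anything far from every known class as a possible novelty. The class names, reference spectra, `novelty_radius`, and the pairing of outcomes with actions are all hypothetical choices made for illustration; the slides do not specify the rover's actual decision logic.

```python
import math

# Hypothetical reference spectra (normalized histograms) for known rock classes.
KNOWN_CLASSES = {
    "basalt":   [0.6, 0.3, 0.1],
    "hematite": [0.1, 0.2, 0.7],
}


def analyze_sample(spectrum, novelty_radius=0.3):
    """Classify one sample and pick the rover's next actions.

    Returns (label, movement decision, downlink decision).
    """
    # Classification: nearest known class by Euclidean distance.
    label, dist = min(
        ((name, math.dist(spectrum, ref)) for name, ref in KNOWN_CLASSES.items()),
        key=lambda pair: pair[1],
    )
    if dist > novelty_radius:
        # Outlier: far from every known class, possibly a new rock type.
        return ("unknown", "stay here and do more",
                "send results to Earth immediately")
    return (label, "follow trend to most interesting location",
            "send results later")
```

A sample close to the basalt reference would be labeled and queued for later transmission, while a sample unlike any known class keeps the rover in place and triggers an immediate report.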
Smart Sensors & Sentinels for Data-Driven
Sense-Making and Decision Support
From Sensors to Sentinels to Sense
(for any application domain with streaming data from sensors)
• New knowledge and insights are acquired by monitoring and
mining actionable data from all digital inputs (Sensors!)
• Alerts are triggered autonomously, without intervention (if
permitted), applying machine learning and actionable business
decision rules for pattern detection and diagnosis. (Sentinels!)
• “Smart Sensors” (powered by Machine Learning-enabled
sentinels) deliver actionable intelligence (Sense!)
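
One way to picture a sentinel is as a set of decision rules evaluated against every incoming sensor reading. In the sketch below, `Rule`, `sentinel`, and the reading fields are hypothetical names; a rule's `condition` could wrap a hand-written business rule or a trained model, and the `autonomous` flag stands in for the "if permitted" condition above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[dict], bool]  # business rule or model-backed test
    autonomous: bool                   # may the sentinel act without a human?


def sentinel(readings: Iterable[dict],
             rules: list[Rule]) -> Iterator[tuple[str, str, dict]]:
    """Yield (rule name, response mode, reading) for every triggered rule."""
    for reading in readings:
        for rule in rules:
            if rule.condition(reading):
                mode = "auto-respond" if rule.autonomous else "notify operator"
                yield (rule.name, mode, reading)
```

With a rule such as `Rule("overheat", lambda r: r["temp_c"] > 90, autonomous=True)`, a reading of 95 degrees yields an automatic response, while a reading of 70 degrees produces no alert at all.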

http://legacy.samsi.info/200506/astro/presentations/tut1loredo-7.pdf
The MIPS Architecture Framework
for Dynamic Data-Driven Application Systems (DDDAS)
http://dddas.org
• MIPS =
– Measurement – Inference – Prediction – Steering
• This applies to any Network of Sensors:
– Web user interactions & actions (web analytics data), Cyber network
usage logs, Social network sentiment, Machine logs (of any kind),
Manufacturing sensors, Health & Epidemic monitoring systems,
Financial transactions, National Security, Utilities and Energy, Remote
Sensing, Tsunami warnings, Weather/Climate events, Astronomical
sky events, …
• Machine Learning enables the “IP” part of MIPS:
– Autonomous (or semi-autonomous) Classification
– Intelligent Data Understanding
– Rule-based
– Model-based
– Neural Networks
– Markov Models
– Bayes Inference Engines
• Alert & Response systems:
– Actionable insights from streaming business data
– Automation of any data-driven operational …
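
The four MIPS stages can be read as a feedback loop. The sketch below is a deliberately simple stand-in: inference is an exponential moving average, prediction is a one-step linear extrapolation, and steering is a three-way command. The function name, `alpha`, `setpoint`, and the command strings are assumptions made for illustration; a real DDDAS would replace each stage with its own models and could also steer the measurement process itself.

```python
def mips_loop(measurements, setpoint, alpha=0.5):
    """Run Measurement -> Inference -> Prediction -> Steering per reading.

    Returns the list of steering commands issued, one per measurement.
    """
    estimate = None
    commands = []
    for m in measurements:                       # Measurement
        previous = estimate
        # Inference: smooth noisy readings with an exponential moving average.
        estimate = m if estimate is None else alpha * m + (1 - alpha) * estimate
        # Prediction: extrapolate the latest trend one step ahead.
        trend = 0.0 if previous is None else estimate - previous
        predicted = estimate + trend
        # Steering: act on the forecast, not just on the current estimate.
        if predicted > setpoint:
            commands.append("decrease")
        elif predicted < setpoint:
            commands.append("increase")
        else:
            commands.append("hold")
    return commands
```

Steering on the predicted value rather than the current estimate is the point of the "P" stage: the system can start correcting before the setpoint is actually crossed.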
Big Data Analytics in the IoT:
Take Data to Information to Knowledge to Insights (and Action!)
in the Internet of Things using the MIPS Framework for DDDAS

✓ From Sensors (Measurement & Data Collection)…

✓ … to Sentinels (Monitoring & Alerts) …

✓ … to Sense-making (Data Science) …

✓ … to Cents-making (Business ROI)


… Actionizing and Productizing your Big Data
Case Study #3:
Digital Marketing Analytics

http://www.webtwit.com/digital-marketing-company-india.html

Digital Marketing Analytics – on fast streaming big data…
… from Devices…
… Intentions…
… Demographics…
… Location, weather, and other geographic attributes…

© SYNTASA.com 2015
Analytics-as-a-Service: Automating Domain Modeling & Event Response
• Based on the SYNTASA.com Marketing Analytics-as-a-Service™ (MAaaS)
• Your own “Smart Sentinel (Mars Rover) in a box”
– Your business rules determine the goals, decision points, alerts, and responses.
– Moving beyond historical hindsight and oversight (Descriptive & Diagnostic Analytics) to a new world of insight and foresight (Predictive & Prescriptive AaaS), eventually achieving right sight (Cognitive Analytics = the 360 view, enabling the right action, for the right customer, at the right place, at the right time, in the right context).
• Mining multi-channel big data streams (across your organization)
• Target Marketing, Personalization, and Segmentation (“segment of one”)
• Decision Automation in a rich content (Big Data) environment
Based on Marketing Analytics-as-a-Service™ (MAaaS) from http://www.syntasa.com/
Summary

http://art-of-stories.com/writing-tools-narrative-summary/

Data Science for Dynamic Data-Driven
Application Systems in the Internet of Things
• Learning from data (Data Science):
– Clustering (= New Class discovery, Segmentation)
– Correlation & Association (Link) discovery
– Classification, Diagnosis (Predictive power discovery)
– Outlier / Anomaly / Novelty / Surprise discovery

• … for D2D in IoT big data:


– Data-to-Discoveries
– Data-to-Decisions
– Data-to-Dividends
(big ROI = Return on Innovation)
http://www.dataev.com/it-experts-blog/bid/297713/The-Big-Data-Challenges-of-a-Biotechnology-Startup-Company

http://www.boozallen.com/datascience
