Chapter 4 - Big Data Analytics Part 3
Internet of Things
https://www.nsf.gov/news/news_images.jsp?cntn_id=122028
Everything Interconnected
http://www.beechamresearch.com/article.aspx?id=4
http://www.datasciencecentral.com/profiles/blogs/what-is-the-internet-of-everything-ioe
Internet of Things (IoT) :
so many challenges and benefits!
1) The 3 E’s of an IoT World
▪ Everywhere a sensor
▪ Everything quantified and tracked (temporally & spatially)
▪ Erosion of Data Privacy
2) The 3 V’s of IoT Data (Deep, Wide, Fast all the time)
▪ (Volume) Deep Data = Ubiquitous Sensors everywhere
▪ (Variety) Wide Data = Diverse and Complex Data Types
▪ (Velocity) Fast Data = Streaming time series all the time
3) The 3 D2D’s of IoT ROI (Return On Innovation)
▪ Data2Discovery –– Data2Decisions –– Data2Dollars
(or Data2Dividends)
IoT Use Case examples
■ Retail (Dynamic Pricing, Smart Supply Chain, Precision Demand Forecasting)
■ Marketing (Personalized Real-time Ad Campaigns for Next Best Offer)
■ Smart Highways (monitoring vehicles, weather, road conditions, closures)
■ Precision Traffic (Self-driving & Self-parking Connected Cars)
■ Smart Cities (Growth, Dynamic Street-lighting, Smart Energy Usage)
■ Law Enforcement (Predictive, Prescriptive personnel & resource placements)
■ Healthcare (Wearables, Personalized Medicine, Patient/Provider Monitoring)
■ Online Education (Personalized Learning, Real-time interventions)
■ Forests, Farms, Vineyards,… (Precision Planning, Nurturing, Harvesting)
■ Financial / Banking / Insurance (Real-time Risk Mitigation, Fraud detection)
■ Organizations (Smart Ergonomics, Improved Employee Workflow, Process
Mining for Efficiencies)
■ Invisibles (under-the-skin smart sensors – not only measure, but also learn,
react, and proactively respond) = The Internet of Emotions!
■ Machines (Early Warning, Prescriptive Maintenance, Smart Obsolescence,
M2M, IIoT = Industrial IoT)
In a nutshell … the XYZ of IoT:
Intelligence at the edge of the network
(at the point of data collection)
■ Smart X
■ Precision Y
■ Personalized Z
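One way to picture "intelligence at the edge" is a sensor node that decides locally which readings are worth transmitting. The sketch below is only an illustration under assumptions: the function name `edge_filter`, the `deadband` parameter, and the sample readings are hypothetical and do not come from the slides.

```python
def edge_filter(readings, deadband):
    """Yield only readings that differ from the last transmitted value
    by more than `deadband` (a simple deadband filter, assumed here)."""
    last_sent = None
    for value in readings:
        if last_sent is None or abs(value - last_sent) > deadband:
            last_sent = value
            yield value


# Hypothetical temperature readings taken at the point of data collection.
readings = [20.0, 20.1, 20.05, 21.5, 21.6, 19.0]
sent = list(edge_filter(readings, deadband=1.0))
print(sent)  # [20.0, 21.5, 19.0]
```

Only three of the six readings leave the device, so the decision about what matters is made where the data is collected.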
http://blog.autoserviceintelligence.com/a-smart-crm-in-action-building-consumer-trust-part-2
Risk Mitigation
❖ Tsunami alerts (from geo sensors everywhere)
❖ Cybersecurity alerts (from network logs)
❖ Social event alerts or early warnings (from social media)
❖ Preventive Fraud alerts (from financial applications)
❖ Predictive Maintenance alerts (from machine / engine sensors)
❖ Infrastructure Monitoring alerts (from ubiquitous sensors)
Big Data and the fundamental
business conflict:
RISK versus REWARD
http://www.telegraph.co.uk/news/worldnews/europe/russia/10061780/Russian-convicts-beat-Americans-in-cyber-chess-battle.html
Creating Rewards (Value) from
Big Data in the IoT : The 3 D2D’s
o Knowledge Discovery
– Data-to-Discovery (D2D)
o Data-driven Decision Support
– Data-to-Decisions (D2D)
o Big ROI (Return On Innovation)
– Data-to-Dollars or Data-to-Dividends (D2D)
– Innovative Applications of sense-making from
IoT sensors and sentinels everywhere
Data Science = 4 Types of Discovery
Learning from Data in the IoT
1) Class Discovery
■ Finding new classes of objects, events, and behaviors
■ Learning the rules that constrain class boundaries
2) Correlation (Predictive Power!) Discovery
■ Finding patterns and dependencies, which reveal new
governing principles or behavioral patterns (the “DNA”)
3) Association (or Link) Discovery
■ Finding unusual (improbable) co-occurring associations
4) Novelty (Surprise!) Discovery
■ Finding new, rare, one-in-a-[million / billion / trillion]
objects and events
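As a small illustration of Association (Link) Discovery, the sketch below scores co-occurring sensor events by lift: how much more often a pair appears together than it would if the two events were independent. Lift is one common choice for this and is an assumption here, not something the slides prescribe; the event names and the function `pair_lift` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_lift(transactions):
    """Return {(a, b): lift} for every pair of items seen together.

    lift = P(a and b) / (P(a) * P(b)); values well above 1 suggest the
    pair co-occurs more often than independence would predict.
    """
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    return {
        (a, b): (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
        for (a, b), count in pair_counts.items()
    }


# Hypothetical sets of events observed in the same time window.
windows = [
    {"door_open", "motion"},
    {"door_open", "motion"},
    {"motion"},
    {"motion", "light_on"},
    {"vibration", "overheat"},
    {"light_on"},
    {"motion", "light_on"},
    {"vibration", "overheat"},
]

lifts = pair_lift(windows)
strongest = max(lifts, key=lifts.get)
print(strongest, lifts[strongest])  # ('overheat', 'vibration') 4.0
```

The rare pair (vibration with overheat) ranks highest even though motion events are far more frequent, which is the point of looking for improbable co-occurrences.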
(mirror funded by private donors)
LSST Key Science Drivers: Mapping the Dynamic Universe
– Complete inventory of the Solar System (Near-Earth Objects; killer asteroids???)
– Nature of Dark Energy (Cosmology; Supernovae at edge of the known Universe)
– Optical transients (10 million daily event notifications sent within 60 seconds)
– Digital Milky Way (Dark Matter; Locations and velocities of 20 billion stars!)
LSST in time and space:
– When? ~2022-2032
– Where? Cerro Pachon, Chile
(Image: architect’s design of LSST Observatory)
LSST Summary
http://www.lsst.org/
• 3-Gigapixel camera
• One 6-Gigabyte image every 20 seconds
• 30 Terabytes every night for 10 years
• Repeat images of the entire night sky every
3 nights: Celestial Cinematography
• 100-200 Petabyte final image data archive
anticipated – all data are public!!!
• 20-40 Petabyte final database catalog
anticipated
• Real-Time Event Mining: ~10 million events
per night, every night, for 10 years!
– Follow-up observations required to classify these
– Which ones should we follow up? … TRIAGE!
… Decisions! Decisions! ( = D2D !)
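The triage question above can be sketched as a top-k selection: given more events than can be followed up, keep only the highest-priority ones. This is a minimal sketch under assumptions; the `priority` score, the event IDs, and the `follow_up_budget` parameter are hypothetical, and the slides do not say how LSST actually ranks events.

```python
import heapq


def triage(events, follow_up_budget):
    """Return the `follow_up_budget` events with the highest priority."""
    return heapq.nlargest(follow_up_budget, events, key=lambda e: e["priority"])


# Hypothetical events with a made-up priority score in [0, 1].
tonight = [
    {"id": "evt-001", "priority": 0.12},
    {"id": "evt-002", "priority": 0.97},
    {"id": "evt-003", "priority": 0.55},
    {"id": "evt-004", "priority": 0.81},
]

chosen = [e["id"] for e in triage(tonight, follow_up_budget=2)]
print(chosen)  # ['evt-002', 'evt-004']
```

In practice the hard part is computing a meaningful priority, which is where the classification and novelty-discovery methods come in.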
LSST Database = massive complexity challenge
http://www.lsst.org/
• ~30-Petabyte final database catalog anticipated
• ~30 trillion rows (astronomical sources), 200-300 columns
• How can we find descriptive and predictive models of
trillions of astrophysical sources? …
• … Soft10ware.com’s fast statistical modeling technology
offers an excellent approach:
– Non-parametric, multi-model generation
– Rapid model evaluation & ranking
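The sizes quoted on the LSST slides can be cross-checked with simple arithmetic. The sketch below assumes observing on all 365 nights of each year (an upper bound), decimal units (1 PB = 1000 TB), and an average of 4 bytes per catalog value; none of these three assumptions comes from the slides.

```python
# Figures quoted on the LSST slides.
TB_PER_NIGHT = 30
EVENTS_PER_NIGHT = 10_000_000
SURVEY_YEARS = 10
CATALOG_ROWS = 30 * 10**12  # ~30 trillion rows

# Assumptions for illustration only (not from the slides).
NIGHTS_PER_YEAR = 365   # observing every night: an upper bound
BYTES_PER_VALUE = 4     # average storage per catalog column value

nights = SURVEY_YEARS * NIGHTS_PER_YEAR
image_archive_pb = TB_PER_NIGHT * nights / 1000  # TB -> PB
total_events = EVENTS_PER_NIGHT * nights


def catalog_size_pb(columns):
    """Estimated catalog size in petabytes for a given column count."""
    return CATALOG_ROWS * columns * BYTES_PER_VALUE / 10**15


catalog_pb = {columns: catalog_size_pb(columns) for columns in (200, 250, 300)}

print(image_archive_pb)  # 109.5
print(total_events)      # 36500000000
print(catalog_pb)        # {200: 24.0, 250: 30.0, 300: 36.0}
```

Under these assumptions the nightly rate gives roughly 110 PB of images, inside the 100-200 PB range quoted, and 200-300 columns give 24-36 PB, which brackets the ~30 PB catalog figure.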
Case Study #2: Mars Rovers
Mars Rover: intelligent data-gatherer, mobile data mining
agent, and autonomous decision-support system
Rove around the surface of Mars and take samples of rocks (experimental
data type: mass spectroscopy = data histogram = feature vector)
Intelligent Data Operations in Action:
• Classification (assign rocks to known classes)
• Supervised Learning (search for rocks with known compositions)
• Unsupervised Learning (discover what types of rocks are present,
without preconceived biases)
• Clustering (find the set of unique classes of rocks)
• Association Mining (find unusual associations)
• Deviation/Outlier Detection (one-of-a-kind; interesting?)
• On-board Intelligent Data Understanding & Decision Support Systems
(Fuzzy Logic & Decision Trees & Case-Based Reasoning)
= Science Goal Monitoring:
– “stay here and do more” ; or else “follow trend to most interesting location”
– “send results to Earth immediately” ; or “send results later”
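The rover's operations above can be sketched as a nearest-reference classifier with an outlier check feeding a simple decision rule. Everything specific here is an assumption for illustration: the three-bin "spectra", the class names, the distance threshold, and the mapping from an outlier to "stay here" and "send immediately" are hypothetical, not the actual on-board logic.

```python
import math

# Hypothetical reference feature vectors (normalized spectrum histograms).
KNOWN_CLASSES = {
    "basalt": [0.6, 0.3, 0.1],
    "hematite": [0.1, 0.7, 0.2],
}


def classify(sample, outlier_threshold=0.3):
    """Assign the sample to the nearest known class, or flag it as an
    outlier ("unknown") when it is far from every reference vector."""
    label, dist = min(
        ((name, math.dist(sample, ref)) for name, ref in KNOWN_CLASSES.items()),
        key=lambda pair: pair[1],
    )
    return ("unknown", dist) if dist > outlier_threshold else (label, dist)


def decide(sample):
    """Science goal monitoring: choose the next action from the label."""
    label, _ = classify(sample)
    if label == "unknown":  # one-of-a-kind rock: assumed to be interesting
        return label, "stay here and do more", "send results to Earth immediately"
    return label, "follow trend to most interesting location", "send results later"


print(decide([0.58, 0.32, 0.10]))  # close to the basalt reference
print(decide([0.30, 0.30, 0.40]))  # far from both references
```

The first sample is classified as basalt and the rover moves on; the second matches nothing known, so the sketch keeps the rover in place and prioritizes the downlink.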
Smart Sensors & Sentinels for Data-Driven
Sense-Making and Decision Support
From Sensors to Sentinels to Sense
(for any application domain with streaming data from sensors)
• New knowledge and insights are acquired by monitoring and
mining actionable data from all digital inputs (Sensors!)
• Alerts are triggered autonomously, without intervention (if
permitted), applying machine learning and actionable business
decision rules for pattern detection and diagnosis. (Sentinels!)
• “Smart Sensors” (powered by Machine Learning-enabled
sentinels) deliver actionable intelligence (Sense!)
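A sentinel in this sense can be sketched as a small object that watches a stream and raises an alert when a new reading departs sharply from recent history. The rolling z-score test, the `Sentinel` class, its parameters, and the sample stream below are all assumptions chosen for illustration; a real deployment would combine such a detector with the business decision rules mentioned above.

```python
from collections import deque
from statistics import mean, stdev


class Sentinel:
    """Watch a sensor stream and alert on large deviations (rolling z-score)."""

    def __init__(self, window=5, z_threshold=3.0):
        self.history = deque(maxlen=window)
        self.z_threshold = z_threshold

    def observe(self, value):
        """Return an alert string for an anomalous value, else None."""
        alert = None
        if len(self.history) == self.history.maxlen:
            mu = mean(self.history)
            sigma = stdev(self.history)
            if sigma > 0 and abs(value - mu) / sigma > self.z_threshold:
                alert = f"ALERT: {value} deviates from recent mean {mu:.2f}"
        self.history.append(value)
        return alert


# Hypothetical stream with one spike.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0]
sentinel = Sentinel(window=5, z_threshold=3.0)
alerts = [a for a in map(sentinel.observe, stream) if a is not None]
print(alerts)
```

Only the spike at 25.0 triggers an alert. Note that the spike then enters the window and inflates the spread for the next few readings, one reason production systems often use more robust statistics.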
http://legacy.samsi.info/200506/astro/presentations/tut1loredo-7.pdf
The MIPS Architecture Framework
for Dynamic Data-Driven Application Systems (DDDAS)
http://dddas.org
• MIPS =
– Measurement – Inference – Prediction – Steering
• This applies to any Network of Sensors:
– Web user interactions & actions (web analytics data), Cyber network
usage logs, Social network sentiment, Machine logs (of any kind),
Manufacturing sensors, Health & Epidemic monitoring systems,
Financial transactions, National Security, Utilities and Energy, Remote
Sensing, Tsunami warnings, Weather/Climate events, Astronomical
sky events, …
• Machine Learning enables the “IP” part of MIPS:
– Autonomous (or semi-autonomous) Classification
– Intelligent Data Understanding:
    – Rule-based
    – Model-based
    – Neural Networks
    – Markov Models
    – Bayes Inference Engines
Alert & Response systems:
• Actionable insights from streaming business data
• Automation of any data-driven operational …
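The four MIPS stages can be sketched as a feedback loop in which each new measurement updates an estimate, the estimate drives a forecast, and the forecast steers how the system measures next. This is a minimal sketch under assumptions: the last-difference trend estimate, the one-step linear forecast, and the "fast"/"normal" sampling modes are hypothetical stand-ins for the inference and steering methods a real DDDAS would use.

```python
def infer(history):
    """Inference: estimate the current level and trend from measurements."""
    level = history[-1]
    trend = history[-1] - history[-2] if len(history) > 1 else 0.0
    return level, trend


def predict(level, trend, steps=1):
    """Prediction: linear extrapolation `steps` ahead."""
    return level + trend * steps


def steer(forecast, limit):
    """Steering: sample faster when the forecast reaches the limit."""
    return "fast" if forecast >= limit else "normal"


def mips_loop(measurements, limit):
    """Run Measurement -> Inference -> Prediction -> Steering per reading."""
    history, modes = [], []
    for reading in measurements:              # Measurement
        history.append(reading)
        level, trend = infer(history)         # Inference
        forecast = predict(level, trend)      # Prediction
        modes.append(steer(forecast, limit))  # Steering
    return modes


# Hypothetical sensor readings and alert limit.
modes = mips_loop([1.0, 1.5, 2.5, 4.0, 3.0], limit=5.0)
print(modes)  # ['normal', 'normal', 'normal', 'fast', 'normal']
```

The loop switches to fast sampling at the fourth reading because the forecast (5.5) crosses the limit before any measurement does, which is the value of putting prediction between inference and steering.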
http://www.webtwit.com/digital-marketing-company-india.html
Digital Marketing Analytics – on fast streaming big data…
… from Devices… … Intentions…
© SYNTASA.com 2015
Automating Analytics-as-a-Service: Domain Modeling & Event Response
• Based on the SYNTASA.com Marketing Analytics-as-a-Service™ (MAaaS)
• Your own “Smart Sentinel (Mars Rover) in a box”
– Your business rules determine the goals, decision points, alerts, and responses.
– Moving beyond historical hindsight and oversight (Descriptive & Diagnostic
Analytics) to a new world of insight and foresight (Predictive & Prescriptive AaaS),
eventually achieving right sight (Cognitive Analytics = the 360 view, enabling the
right action, for the right customer, at the right place, at the right time, in the right context).
• Mining multi-channel big data streams (across your organization)
• Target Marketing, Personalization, and Segmentation (“segment of one”)
• Decision Automation in a rich content (Big Data) environment
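The idea that business rules determine the alerts and responses can be sketched as a small rule table evaluated against each incoming event. This is not the SYNTASA API; the rule names, event fields, thresholds, and responses below are all hypothetical and serve only to show the pattern.

```python
# Each rule: (name, condition on an event dict, response to trigger).
# All names, fields, and thresholds are hypothetical.
RULES = [
    (
        "cart_abandoned",
        lambda e: e.get("cart_value", 0) > 0 and e.get("minutes_idle", 0) >= 30,
        "send reminder offer",
    ),
    (
        "high_engagement",
        lambda e: e.get("pages_viewed", 0) >= 10,
        "notify sales team",
    ),
]


def respond(event):
    """Return the responses of every rule whose condition the event meets."""
    return [response for _, condition, response in RULES if condition(event)]


print(respond({"cart_value": 80, "minutes_idle": 45, "pages_viewed": 3}))
print(respond({"pages_viewed": 12}))
print(respond({"pages_viewed": 1}))
```

Keeping the rules in data rather than in code is one way to let business users change goals and responses without touching the event pipeline.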
Based on Marketing Analytics-as-a-Service™ (MAaaS) from http://www.syntasa.com/
Summary
http://art-of-stories.com/writing-tools-narrative-summary/
Data Science for Dynamic Data-Driven
Application Systems in the Internet of Things
• Learning from data (Data Science):
– Clustering (= New Class discovery, Segmentation)
– Correlation & Association (Link) discovery
– Classification, Diagnosis (Predictive power discovery)
– Outlier / Anomaly / Novelty / Surprise discovery
http://www.boozallen.com/datascience