Apache Spark:
The Analytics Operating System
Anjul Bhambhri
Vice President, IBM Big Data Engineering
Deep Blue SQL RISC
DNA Transistor Magnetic Tape Linux PC
Fortran DRAM Mainframe Watson
Floppy Disk UPC
Punch Card
IBM: 100 years of (supporting) innovation
The Analytics Operating System
Apache Spark
Enhance it! Offer it! Leverage it!
Spark Technology Center @ SF
On-prem and on the cloud
Inside our products
At IBM, We Love Spark!
IBM Cloud Data Services, now featuring Spark, is open for data
IBM is Building on Apache Spark
• IBM Analytics
• IBM Commerce
• IBM Watson
• IBM Research
• IBM Cloud
Quarks from IBM
Announced Feb 2016
• Open-source platform for building IoT applications
• Light-weight & embeddable
• Integrates with Spark
• Lambda Architecture and Spark enable efficient batch and streaming analytics
• Visualization at every step of data discovery enables better self service
The Weather Company clusters running hot:
• ~30 billion API requests per day
• ~120 million active mobile users
• #3 most active mobile user base
• Billions of events per day (1.3M/sec)
• ~360 PB of traffic daily
• Need to keep data forever
The use case:
Efficient batch + streaming analysis
Self-serve data science
BI / visualization tool support
An IBM Business
Spark for daily weather
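A minimal sketch of the Lambda Architecture idea named on this slide, written in plain Python rather than Spark so it runs anywhere. The batch layer recomputes a view from the full event log, the speed layer covers events that arrived since the last batch run, and a query merges the two. The function names, the `station` field, and the sample events are assumptions made for illustration; they are not taken from The Weather Company's system.

```python
from collections import Counter


def batch_view(master_events):
    """Batch layer: recompute per-station totals from the full event log."""
    return Counter(event["station"] for event in master_events)


def speed_view(recent_events):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(event["station"] for event in recent_events)


def query(batch, speed, station):
    """Serving layer: merge the batch and speed views at query time."""
    # Counter returns 0 for keys it has never seen, so unknown stations are safe.
    return batch[station] + speed[station]


# Hypothetical sample data: the stored log plus events not yet batched.
master = [{"station": "SFO"}, {"station": "SFO"}, {"station": "JFK"}]
recent = [{"station": "SFO"}, {"station": "BOS"}]

batch = batch_view(master)
speed = speed_view(recent)
print(query(batch, speed, "SFO"))  # prints 3
```

In a Spark deployment the batch view would typically come from a scheduled batch job and the speed view from a streaming job; the merge step at query time stays the same.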
Spark in Health Care
Health Care Data Lakes
• Improve how healthcare is delivered
• Collect and combine data from dozens of sources
• Clinical, Operational, Financial
• Inside and outside your enterprise
Benefits
• Better medical outcomes for patients
• Control cost and improve quality
SystemML on Spark
• Predictive Risk Modeling
• Right patient intervention relating to adverse health events
Spark in Telecom
The challenge:
• Improve customer satisfaction rates
• Multiple channels for customer interactions
• Very large data volumes
The need:
• Create a 360 degree view of a customer
• Stitch all interactions across channels – “Customer Experience Journey”
• Classify interaction sentiment and take necessary actions
• Spark Streaming brings all the data together
• Spark Core is used to process and transform text and voice data
• Spark MLlib algorithms stitch interactions on a journey and score “sentiment”
• Spark SQL drives interactive queries via visual dashboards
PUB / SUB
MQTT / WebSockets / Flume / Kafka
Journey Dashboards
Interaction & Journey Data
Voice & Text Data
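The "stitch and score" step described on this slide can be sketched in plain Python. Interactions from several channels are grouped per customer, ordered by time into a journey, and each journey gets a sentiment score. This is an illustration only: the keyword lists stand in for a trained model and are not MLlib, and the field names (`customer_id`, `ts`, `channel`, `text`) and sample records are assumptions.

```python
from collections import defaultdict

# Hypothetical keyword lexicon; a real system would use a trained classifier.
NEGATIVE = {"dropped", "slow", "cancel"}
POSITIVE = {"thanks", "great", "resolved"}


def stitch_journeys(interactions):
    """Group interactions by customer and order each group by timestamp."""
    journeys = defaultdict(list)
    for item in interactions:
        journeys[item["customer_id"]].append(item)
    return {
        cid: sorted(items, key=lambda i: i["ts"])
        for cid, items in journeys.items()
    }


def sentiment(text):
    """Score one interaction: positive keyword hits minus negative ones."""
    words = set(text.lower().split())
    return len(words & POSITIVE) - len(words & NEGATIVE)


def score_journey(journey):
    """Score a journey as the sum of its interaction scores."""
    return sum(sentiment(item["text"]) for item in journey)


# Hypothetical interactions arriving out of order from different channels.
interactions = [
    {"customer_id": "c1", "ts": 2, "channel": "chat", "text": "call dropped again"},
    {"customer_id": "c1", "ts": 1, "channel": "voice", "text": "network is slow"},
    {"customer_id": "c2", "ts": 1, "channel": "email", "text": "issue resolved thanks"},
]

journeys = stitch_journeys(interactions)
scores = {cid: score_journey(journey) for cid, journey in journeys.items()}
print(scores)  # prints {'c1': -2, 'c2': 2}
```

A negative journey score such as the one for `c1` is the kind of signal that could trigger the "necessary actions" the slide mentions.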
Apache Spark:
The Analytics Operating System
THANK YOU!


Spark Summit East Keynote by Anjul Bhambhri