Using Continuous ETL with Real-Time Queries to Eliminate MySQL Bottlenecks

Damian.Black@sqlstream.com
Julian.Hyde@sqlstream.com

April 2009

Agenda


»  Background

»  Real-time Data Challenges

»  SQLstream’s Solution

»  Applications of SQLstream

»  Live Demo
SQLstream Company


Corporate:

»  Founded 2003, product launched 2008

»  Co-founded Eigenbase

»  Patented software technology

»  Experienced team

»  Presence in California, Colorado, UK

»  Privately funded
The Business Pain

»  Rising data volumes

»  Data Warehouse always out of date
   »  Poor Visibility into data still arriving from apps & users

   »  Painful Latency – data warehouse always out of date

»  Scaling for real-time performance proves costly
   »  Custom solutions, specialized hardware, bespoke integration

»  Scaling for massively distributed data is impossible
The SQLstream Solution


»  Fundamentally better way of processing real-time data
   »  Enhances the Data Warehouse performance and functionality

   »  Eliminates MySQL bottlenecks with Continuous ETL in declarative SQL

»  Simplifies Data Integration
   »  Continuous, real-time data integration yielding early visibility

   »  High level language, very productive and easy to manage & maintain

»  Built on ISO and Industry standards
   »  Eigenbase and SQL:2003/SQL:2008

   »  Eclipse-based UI, standards-based drivers, metadata, SQL/MED

»  Query The Future™
SQLstream Eliminates Business Latency

»  Traditional data warehouse

»  Elimination of high latency processing stages via a pipelined approach

»  Classic approach delivers results the next day; SQLstream produces results continuously

[Diagram labels: Collect; Stage; Query; Process; Query; Deliver; SQLstream Innovation]

SQLstream Enhances the Data Warehouse

»  Continuous ETL and keeping DW updated

»  Offloads the data warehouse from ELT, RT queries

»  Closes the loop: Data mining used for Real-time Detection

»  Continuous, RT business answers with near zero latency

[Diagram labels: data (×4); Data Warehouse]
Streaming SQL – an example



 CREATE VIEW compliant_orders AS
  SELECT STREAM *
    FROM orders OVER sla
    JOIN shipments
    ON orders.id = shipments.orderid
    WHERE city = 'New York'
    WINDOW sla AS (RANGE INTERVAL '1' HOUR PRECEDING)



 »  Produces a stream of orders from New York that shipped
   within a service level agreement of 1 hour
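
The join above can be approximated in plain Python to show what the one-hour window does. This is a minimal in-memory sketch, not SQLstream's engine: the function name, the event tuples, and the `rowtime` field are assumptions made for illustration, while `id`, `orderid`, and `city` come from the view definition.

```python
from collections import deque
from datetime import datetime, timedelta


def compliant_orders(events, sla=timedelta(hours=1)):
    """Yield New York orders whose shipment arrives within `sla` of the order.

    `events` is an iterable of (kind, row) pairs sorted by row["rowtime"],
    where kind is "order" or "shipment" (an assumed encoding of two streams).
    """
    window = deque()  # orders seen during the last `sla`, oldest first
    for kind, row in events:
        now = row["rowtime"]
        # Evict orders that have fallen out of the 1-hour window.
        while window and now - window[0]["rowtime"] > sla:
            window.popleft()
        if kind == "order":
            window.append(row)
        else:
            # A shipment joins only with orders still inside the window.
            for order in window:
                if order["id"] == row["orderid"] and order["city"] == "New York":
                    yield {"id": order["id"], "city": order["city"],
                           "ordered": order["rowtime"], "shipped": now}
```

Orders older than one hour are dropped before each shipment is matched, so a late shipment finds nothing to join with and produces no output row.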
Streaming SQL


»  Built upon standard SQL:2003
   »  Familiar & declarative

»  Basics:
   »  Streams
   »  Tables
   »  Views

»  Streaming versions of relational operators:
   »  Projections and Filters (SELECT … FROM … WHERE)
   »  Windowed join (JOIN … OVER)
   »  Windowed aggregation
   »  Streaming aggregation (GROUP BY)
   »  Union
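
As a rough illustration of windowed aggregation, the sketch below keeps a running sum over a sliding time window and emits one result per input row. It is a simplified Python simulation under assumed names (`sliding_sum`, integer timestamps), not SQLstream syntax or its implementation.

```python
from collections import deque


def sliding_sum(rows, width):
    """For each (time, value) row, yield (time, sum of values in [time - width, time]).

    `rows` must be ordered by time; `width` uses the same unit as the timestamps.
    """
    window = deque()  # rows currently inside the window, oldest first
    total = 0
    for t, value in rows:
        window.append((t, value))
        total += value
        # Subtract rows that have slid out of the window instead of re-summing.
        while window[0][0] < t - width:
            _, old_value = window.popleft()
            total -= old_value
        yield t, total
```

Each arriving row adjusts the running total, so results are produced continuously rather than by rescanning stored data.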
Mondrian

»  Open-source OLAP engine

»  Part of Pentaho Suite

»  Julian Hyde is lead developer

»  “ROLAP with caching”

»  Aggregate tables

»  Cache-control API

[Diagram labels: Viewers; JEE Application Server; Mondrian; cube (×3); JDBC (×3); Cube Schema XML; RDBMS (×2)]
Mondrian schema


A dimensional model (logical)
   »  Cubes & virtual cubes

   »  Shared & private dimensions

   »  Measures

… mapped onto a star/
  snowflake schema
  (physical)
   »  Fact table

   »  Dimension tables

   »  Joined by foreign key
     relationships
   »  Aggregate tables
ETL Process for OLAP

[Diagram labels: Operational database; Conventional ETL; Data warehouse; Aggregate tables populated from DW; OLAP; OLAP cache flushed after load]

SQLstream Inc. © 2009

Continuous ETL for Real-time OLAP

[Diagram labels: Operational database; SQLstream Continuous ETL; Data warehouse; Aggregate tables populated incrementally; OLAP; OLAP cache flushed proactively]
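
The idea of populating aggregate tables incrementally and flushing the OLAP cache proactively can be sketched as follows. This is an assumption-laden Python illustration: the class `IncrementalAggregate`, its fields, and the `on_change` callback are hypothetical stand-ins, and the callback only marks where a real cache-control call would go.

```python
from collections import defaultdict


class IncrementalAggregate:
    """Keeps one aggregate row per key and updates it as each fact arrives."""

    def __init__(self, key_fields, measure, on_change=None):
        self.key_fields = key_fields  # dimension columns to group by
        self.measure = measure        # fact column to sum
        # Hypothetical hook: called with the affected key after each update,
        # e.g. to flush just that region of an OLAP cache.
        self.on_change = on_change or (lambda key: None)
        self.rows = defaultdict(lambda: {"fact_count": 0, "total": 0})

    def apply(self, fact):
        """Fold a single new fact row into the aggregate (no full reload)."""
        key = tuple(fact[f] for f in self.key_fields)
        row = self.rows[key]
        row["fact_count"] += 1
        row["total"] += fact[self.measure]
        self.on_change(key)
```

Only the aggregate row touched by the new fact changes, and only that key is reported for cache invalidation, in contrast to rebuilding the aggregates and flushing the whole cache after a batch load.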

Real-time charts and alerts

[Diagram labels: Operational database; SQLstream Continuous ETL; Data warehouse; OLAP; Real-time alerts; Charts generated from SQLstream]

»  Demo
  »  Moving charts

  »  Mondrian

  »  SQLstream Studio
Where Real-time DW / OLAP really helps


»  Advertising
   »  Measuring results in real-time to manage budgets, ROI

   »  Finding costly errors ASAP

   »  Promoting & demoting campaigns

   »  Matching punters to products: win impulse buyers, get ahead of rivals

»  Social Networking
   »  Above plus: adapting content to real-time activity, interests

»  Commerce
   »  Above plus: pricing that reacts to inventory, competition

   »  Creating bundles dynamically

   »  Smart loyalty programs
The SQLstream Advantage: Do More with Less


»  Changing the Economics of ETL and Data Integration
   »  Leverages SQL skill sets in new ways

      »  Fewer and cheaper consultants for real-time integration

      »  Much lower development and maintenance costs

   »  Offloads existing Data Warehouses

      »  Reduces and defers infrastructure upgrades

      »  Enhances DW performance
   »  Make better business decisions faster

      »  Data Warehouses kept always up-to-date

      »  Continuous & real-time alerts and analytics
Questions?
Thank you for attending!

     www.sqlstream.com

