Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
Big Data
Jean-Pierre Dijcks
Team Lead – Big Data Product Management
Agenda
 Big Data Implementation Patterns
 Big Data Products
 Q&A
Big Data Implementations
Big Data Usage Pattern
ETL and Batch Processing Workloads on Hadoop
Integrate
SQL
SQL
NoSQL
• Scalable
• Flexible
• Cost Effective
DW & BI
Analytics
Web
Ad-hoc
Big Data Usage Pattern
Scale-out Information Discovery
• Online
• Scalable
• Flexible
• Cost Effective
Data Factory
Continuous On-Demand
Big Data Usage Pattern
Expand Data Warehouse with Granular Data Store
Marts
Data Warehouse
Σ Σ
Business
Intelligence
Archiving
• Online
• Scalable
• Flexible
• Cost Effective
Data Factory
Big Data Usage Pattern
Instant Responses to Streaming Data based on Historical Analysis
Data Warehouse
Business
Intelligence
• Online
• Scalable
• Flexible
• Cost Effective
Data Factory
Event Decisions
NoSQL
Oracle Big Data Solution
Stream Acquire – Organize – Analyze
In-Database
Analytics
Data
Warehouse
Oracle
Advanced
Analytics
Oracle
Database
Oracle BI
Enterprise Edition
Oracle Real-Time
Decisions
Endeca Information
Discovery
Decide
Oracle Event
Processing
Apache
Flume
Applications
Oracle
NoSQL
Database
Cloudera
Hadoop
Oracle R
Distribution
Oracle Big Data
Connectors
Oracle Data
Integrator
• Complete
• Integrated
• Scalable
Big Data Products
Big Data Appliance X3-2
Sun Oracle X3-2L Servers with per server:
• 2 * 8 Core Intel Xeon E5 Processors
• 64 GB Memory
• 36TB Disk space
Totals per Full Rack:
• 288 Processor Cores
• 1152 GB of Memory
• 648TB Available Disk space
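The Full Rack totals follow from the per-server figures if a Full Rack holds 18 servers. That count is inferred here from 288 cores / 16 cores per server and is not stated on the slide. A minimal Python sketch of the arithmetic, using a hypothetical helper named `rack_totals`:

```python
# Per-server figures taken from the slide.
CORES_PER_SERVER = 2 * 8        # 2 processors with 8 cores each
MEMORY_GB_PER_SERVER = 64
DISK_TB_PER_SERVER = 36

# Assumption: 18 servers per Full Rack (inferred from 288 / 16, not stated on the slide).
FULL_RACK_SERVERS = 18


def rack_totals(servers):
    """Return aggregate cores, memory, and disk for a given number of servers."""
    return {
        "cores": servers * CORES_PER_SERVER,
        "memory_gb": servers * MEMORY_GB_PER_SERVER,
        "disk_tb": servers * DISK_TB_PER_SERVER,
    }


full_rack = rack_totals(FULL_RACK_SERVERS)
six_servers = rack_totals(6)  # one six-server block
print(full_rack)
print(six_servers)
```

Under that assumption the computed values match the totals listed on the slide, and a six-server block works out to one third of each total.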
Big Data Appliance Software Stack
Integrated Software:
 Oracle Linux 5.8 with UEK
 Cloudera CDH 4.2 & Cloudera Manager 4.5
 Big Data Appliance Enterprise Manager Plug-In
 Oracle R Distribution
All integrated software is supported as part of Premier Support for
Systems and Premier Support for Operating Systems
Optional Software:
 Oracle NoSQL Database 2.x
 Oracle Big Data Connectors 2.x
BDA in Infrastructure as a Service
 Procurement option for H/W
 Low monthly fee spread out
over 3 to 5 years
 Ownership of the system
stays with Oracle
 Applies to all Engineered
Systems
 BDA Full Racks only
Big Data Appliance Product Family
 Starter Rack is fully cabled and
configured for growth with 6 servers
 In-Rack Expansion delivers 6 server
modular expansion block
 Full Rack delivers optimal blend of
capacity and expansion options
 Grow by adding rack – up to 18 racks
without additional switches
Big Data Appliance X3-2 Starter Rack
 6 Nodes fully cabled in Starter Rack
• 96 Intel® Xeon® E5 processor cores
• 384 GB total memory
• 216TB total raw storage capacity
 6 Nodes In-Rack Expansion added in-rack
• 96 Intel® Xeon® E5 processor cores
• 384 GB total memory
• 216TB total raw storage capacity
Start and grow in increments of six servers
Why Oracle Big Data Appliance?
 Beats DIY Clusters on:
– Initial Cost and Time to Value
– Performance and Scalability
 Pre-configured with leading Hadoop Distribution
– Proven at large scale
– Contributors across all components for better support
 Better Integration with your Oracle ecosystem with:
– High-performance connectivity to Exadata
– Unified analytics API (SQL, R, MapReduce etc.)
– Single Enterprise Manager Framework
Flexible Configurations
Divide Full Rack BDA into multiple clusters
Provide more flexible configurations for customers
Automatic reconfiguration when expanding the cluster
Example Configuration: 6 Node Cluster, 12 Node Cluster
Engineered for Quicker Time to Value at Lower Cost
http://www.oracle.com/us/corporate/analystreports/industries/esg-big-data-wp-1914112.pdf
ESG believes that a "buy" versus "do-it-yourself"
approach will yield roughly one-third faster
time-to-market benefit improvement...
[Chart: Time to Market (Weeks), Oracle Big Data Appliance vs. Build it yourself]
[Chart: Cost: Initial Infrastructure/Tasks, Oracle Big Data Appliance vs. Build it yourself]
[…] nearly 40% cost savings versus IT
architecting, designing, procuring, configuring,
and implementing its own big data infrastructure.
Engineered for Performance
Compared with a DIY Cluster
[Chart: Time (hours), Big Data Appliance vs. DIY Hadoop Cluster]
 Configured for exceptional
performance on delivery
 6x faster than custom 20-node
Hadoop cluster for large batch
transformation jobs
 Engineering done by Oracle and
Cloudera:
– OS and File System Tuning
– Java Virtual Machine Tuning
– Hadoop Configuration and Setup
Engineered by Oracle and Cloudera
Why Cloudera and Cloudera CDH?
 Proven Track Record with the largest Hadoop Installed Base
 Proven in large scale enterprise implementations
 Demonstrated Leadership in Hadoop Community
– Breadth and Depth across the Hadoop ecosystem and products
– Fast evolution in critical features
 Managed Distribution
– Components certified to work together and on Oracle Big Data Appliance in
regular updates
– Industry Leading Management Framework for Hadoop integrated with
Oracle Enterprise Manager
Engineered by Oracle and Cloudera
 Cloudera’s Hadoop Knowledge Engineered into the system:
– Master service lay-out designed for large clusters based on
experience with many large systems
– Optimized data block size for MapReduce workloads
– Optimized number of Map and Reduce slots fitting the system
capacity
– Optimized settings for a large number of Hadoop parameters
 Tested at Oracle and Cloudera on the same hardware/software
stack as our customers
Market Leading Hadoop Distribution Pre-configured
Engineered by Oracle and Cloudera
 Multi-Homing for Hadoop
– To leverage BDA’s InfiniBand and 10GigE network, Hadoop needed to be able to
support multiple networks and IP addresses
– Committed to Apache Hadoop by Cloudera
 Highly Available NameNode Solution
– Remove dependency on an HA Filer to enable HA without requiring additional
hardware
– Build a journaling-based HA solution for NameNode with automatic fail-over
 System Administration
– Tight integration between Oracle Enterprise Manager (Hardware and High-Level
Software Monitoring) and Cloudera Manager (Hadoop Details)
Driving Enterprise Class Requirements for Hadoop
Integrated Management Framework
Management Infrastructure combines EM and CM
Quick view of Hardware and Software status
in Oracle Enterprise Manager
Big Data Connectors
Optimized integration of Hadoop with Oracle Database
and Oracle Exadata
• Oracle Loader for Hadoop
• Oracle SQL Connector for Hadoop Distributed File System
(HDFS)
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle R Connector for Hadoop
• Does not require Big Data Appliance – can be licensed for Hadoop
running on non-Oracle hardware
Analyze Data across your Oracle Systems
SQL Analytics on ALL data
SQL
Hadoop Oracle Database
IB
 Expand the data pool for
analytics leveraging Hadoop
 Stream Hadoop resident data
through Big Data Connectors
for SQL processing
 Use the full power of Oracle
SQL on all data
 Or use Oracle Loader for
Hadoop to integrate data in
Oracle Database
Analyze Data across your Oracle Systems
R Analytics on ALL data
R
Hadoop Oracle Database
IB
 Expand the data pool for
analytics leveraging Hadoop
 Improve scalability and
performance for R without
changes to your programs
 Dynamically leverage Hadoop
through Big Data Connectors
to execute R analytics
Oracle Data Integrator
Simplify Map Reduce
OLH & OSCH
Oracle
Data
Integrator
 Automatically generates
MapReduce code
 High performance loads into
Data Warehouse leveraging
both OLH and OSCH
 Manages the process across
platforms
Oracle NoSQL Database
Scalable, Highly Available, Key-Value Database
Application
Storage Nodes
Datacenter B
Storage Nodes
Datacenter A
Application
NoSQL DB Driver
Application
NoSQL DB Driver
Application
 Simple Key-Value Data Model
 Horizontally Scalable
 Highly Available
 Simple administration
 ACID Transactions at scale
 Transparent load balancing
 Elastic Configuration
 Commercial grade software and
support
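To illustrate the simple key-value model and how keys can be spread across storage nodes, here is a toy Python sketch. It is not the Oracle NoSQL Database driver API: the class `ToyKVStore`, the node names, and the MD5-based placement are assumptions made for illustration only, and the sketch omits replication, transactions, and elastic reconfiguration.

```python
import hashlib


class ToyKVStore:
    """Toy key-value store that spreads keys over a fixed list of storage nodes."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Each "storage node" is just an in-memory dictionary here.
        self.nodes = {name: {} for name in self.node_names}

    def _node_for(self, key):
        # Hash the key so that placement is deterministic and roughly even.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(self.node_names)
        return self.node_names[index]

    def put(self, key, value):
        self.nodes[self._node_for(key)][key] = value

    def get(self, key):
        # The caller never names a node; routing happens inside the store.
        return self.nodes[self._node_for(key)].get(key)


store = ToyKVStore(["node-a1", "node-a2", "node-b1"])  # hypothetical node names
store.put("user/42/profile", {"name": "Ada"})
print(store.get("user/42/profile"))
```

The application only calls `put` and `get`; which node holds a key is decided inside the store, which is the idea behind the driver-side load balancing named on the slide.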
Oracle NoSQL Database Use Cases
NoSQL DB Driver
Application
Oracle Event
Processor
Event
Stream
Web Scale Transaction Processing
• High velocity, High volume, High variety, Low information density data capture
• Uses Hadoop and/or Data Warehouse for analytics
• Applications: Web browsing, Web Retail, CDR processing, Sensor data capture
Last Mile Content Delivery
• Platform for real-time content delivery
• Content & market segmentation Acquired and Analyzed in Hadoop & RDBMS
• NoSQL provides low latency content lookup and delivery to end-customers
• OEP rules perform low latency lookups to Oracle NoSQL DB for additional data
Real Time Event Processing
• Real time events trigger rule execution in Oracle Event Processing
• OEP rules perform low latency lookups to Oracle NoSQL DB for additional data
• OEP actions are triggered
• Applications: Medical Monitoring, Factory Automation, Oil & Gas, Geo-location
Rule Action
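The real-time event processing flow described above (an event arrives, a rule looks up additional data, an action is triggered) can be sketched in plain Python. This is not Oracle Event Processing or Oracle NoSQL Database code: the dictionary `PATIENT_LIMITS` stands in for the low-latency lookup, and every name and value below is hypothetical.

```python
# Hypothetical reference data standing in for the NoSQL lookup.
PATIENT_LIMITS = {"patient-7": {"max_heart_rate": 120}}


def lookup(key):
    """Stand-in for a low-latency key-value lookup."""
    return PATIENT_LIMITS.get(key)


def heart_rate_rule(event):
    """Return an action string if the event breaches the stored limit, else None."""
    limits = lookup(event["patient_id"])
    if limits is None:
        return None
    if event["heart_rate"] > limits["max_heart_rate"]:
        return f"ALERT {event['patient_id']}: heart rate {event['heart_rate']}"
    return None


def process(events, rule):
    """Apply the rule to each event in order and collect the triggered actions."""
    actions = []
    for event in events:
        action = rule(event)
        if action is not None:
            actions.append(action)
    return actions


events = [
    {"patient_id": "patient-7", "heart_rate": 95},
    {"patient_id": "patient-7", "heart_rate": 131},
]
actions = process(events, heart_rate_rule)
print(actions)
```

Only the second event exceeds the stored limit, so it is the only one that produces an action.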