Tame Big Data with Oracle Data Integration
Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
Oracle Data Integration: CON7922
Tame Big Data with Oracle Data Integration
Alex Kotopoulis
Senior Principal Product Manager
Oracle Fusion Middleware, Data Integration Solutions
Michael Rainey
Principal Consultant
Rittman Mead
Oracle OpenWorld 2014 2
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information
purposes only, and may not be incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon in making purchasing decisions.
The development, release, and timing of any features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Agenda
1 Oracle Data Integration Overview
2 Customer Cases and Best Practices
3 Big Data Demo
4 Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources
Oracle Data Integration Solutions and Proven Benefits
• Improve Agility
– Deploy Projects Faster
– Reliable Real-Time
• Reduce Risk
– Popular, Proven Tools
– Open, Not Proprietary
• Reduce Costs
– Better Productivity
– Eliminate ETL Servers
Analytic Data Integration
• Big Data Integration & Governance
• Data Warehouse Integration
• Business Intelligence Applications
Enterprise Data Integration and Governance
• Enterprise Data Quality and Profiling
• Comprehensive, Heterogeneous Data Integration
• Business Glossary and Metadata Management
Business Continuity
• Active-Active for Maximum Availability
• Zero Downtime Migrations
• Data Consolidation / Application Modernization
24 x 7 x 365
Comprehensive Data Integration & Governance Capabilities
Real-Time Data Movement
– Low impact capture, stage in Hadoop
– Continuous data availability
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance
Foundation
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Differentiated Technical Approach
Dynamic Data Movement
– Real-time CDC by default, not ETL
– Least invasive on sources
– Proven best performance
– Integrated Oracle capture/apply
No ETL Engines
– Take the processing to the data;
don’t move the data to the process
– Leverage your data engines for the
workloads (Hadoop or SQL)
Most Heterogeneous
– Leverage open source Hadoop, not
proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards
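The "take the processing to the data" point can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for the data engine (an assumption; the slides refer to Hadoop or SQL engines), and the table names are hypothetical.

```python
import sqlite3


def elt_pushdown(conn):
    """Aggregate inside the engine: one INSERT ... SELECT, no rows leave the database."""
    conn.execute(
        "INSERT INTO sales_by_country (country, total) "
        "SELECT country, SUM(amount) FROM sales GROUP BY country"
    )


def etl_middle_tier(conn):
    """Contrast: pull every row into the client, aggregate there, write results back."""
    totals = {}
    for country, amount in conn.execute("SELECT country, amount FROM sales"):
        totals[country] = totals.get(country, 0) + amount
    conn.executemany(
        "INSERT INTO sales_by_country (country, total) VALUES (?, ?)",
        list(totals.items()),
    )


def demo(load=elt_pushdown):
    """Build a tiny in-memory database, run one load strategy, return the result rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (country TEXT, amount INTEGER)")
    conn.execute("CREATE TABLE sales_by_country (country TEXT, total INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)", [("US", 10), ("US", 5), ("DE", 7)]
    )
    load(conn)
    return sorted(conn.execute("SELECT country, total FROM sales_by_country"))
```

Both strategies produce the same rows; the difference is where the work happens. In the pushdown variant the engine does the aggregation and only a statement crosses the network, which is the argument the slide makes against a separate ETL engine.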
Data Reservoir Use Case with Oracle Data Integration
Oracle Data
Integrator
Logs
OLTP Databases
Social
Media
Sensor
Data
Data Warehouses,
Datamarts
Pig
Sqoop Initial Load Sqoop Load
OLH / OSCH
Big Data SQL
File Load
CDC to HDFS, Hive,
Flume, HBase
Oracle GoldenGate
Oracle Enterprise
Metadata Management
Oracle Data Service
Integrator
Federated Queries
Oracle Enterprise
Data Quality
Impala
Transformations with
HDFS, Hive, HBase, Pig
Logical and Physical Design with ODI
Logical
Design
Oracle
MySQL
Hive
Physical
Design
Sqoop
Sqoop
IKM
LKM
LKM
Oracle
Hive
MySQL
Hive
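The split between logical and physical design can be sketched in a few lines of Python. This is only an illustration of the idea, not ODI's API: the `LogicalMapping` class, the `physical_plan` function, the datastore names, and the knowledge-module labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogicalMapping:
    """Technology-neutral description of a data flow."""
    sources: tuple  # (datastore name, technology) pairs
    target: tuple   # (datastore name, technology)


def physical_plan(mapping, load_kms):
    """Derive ordered physical steps: one load step per remote source, then integration."""
    target_name, target_tech = mapping.target
    steps = []
    for name, tech in mapping.sources:
        if tech != target_tech:  # data already on the target engine needs no load step
            km = load_kms[(tech, target_tech)]
            steps.append(f"{km}: load {name} from {tech} into {target_tech}")
    steps.append(f"IKM: integrate into {target_name} on {target_tech}")
    return steps


# Hypothetical example: two relational sources and one Hive source feeding a Hive target.
MAPPING = LogicalMapping(
    sources=(("customers", "Oracle"), ("orders", "MySQL"), ("activity", "Hive")),
    target=("sales_summary", "Hive"),
)
HIVE_KMS = {("Oracle", "Hive"): "LKM (Sqoop)", ("MySQL", "Hive"): "LKM (Sqoop)"}
```

The logical mapping never names Sqoop. Swapping the `load_kms` table changes how data moves without touching the mapping itself.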
Design Once, Run Anywhere
• Use native technologies for any data
source
– Data Locality
– Optimal performance, reduced
network traffic
• No proprietary middle tier
– Reduced infrastructure cost and
maintenance effort
• Declarative design
– Simplified development
– Reusable across technologies
Hive
Agent
Languages and Tools
Runtime
Environments
Sqoop
Big Data
SQL
Future
Languages
Future Runtime
Engines
OLH
OSCH
Oracle GoldenGate Adapter – Big Data Use Cases
Java
Adapter
HDFS
file
Capture
Parameter
File
Adapter
Property file
Adapter
Jar file
Source
Database
Pump
Parameter file
Hive
HBase
Flume
Source Channel Sink
Other
Custom
Targets
Log File Pump
Trail
File
Capture
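What it means to apply captured changes to a target can be sketched as follows. The record layout (operation code, key, column values) is a hypothetical simplification; it is not the GoldenGate trail format or the adapter's Java interface.

```python
def apply_changes(target, changes):
    """Apply insert/update/delete records, in order, to a dict keyed by primary key."""
    for op, key, row in changes:
        if op == "I":
            target[key] = dict(row)
        elif op == "U":
            # Updates carry only the changed columns; merge them into the current row.
            target[key] = {**target.get(key, {}), **row}
        elif op == "D":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


# Hypothetical change stream for a two-row customer table.
CHANGES = [
    ("I", 1, {"name": "Ann", "city": "Oslo"}),
    ("U", 1, {"city": "Bergen"}),
    ("I", 2, {"name": "Bob", "city": "Rome"}),
    ("D", 2, None),
]
```

Because only changed rows are shipped and applied in commit order, the target stays current without repeatedly re-extracting whole tables.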
Introduction
• Michael Rainey
• Principal Consultant - Rittman Mead
• Oracle Data Integration expert
– Oracle Data Integrator and Oracle GoldenGate
• Oracle ACE
• Twitter: @mRainey
About Rittman Mead
• Oracle Gold partner
– World-leading specialist partner for technical excellence, solutions delivery and
innovation in Oracle BI
– Provide consulting, training, managed services for customers worldwide
• 120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1
Oracle ACE Associate
– All experts in Oracle BI, DW, EPM and Analytics tech
– Skills in broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle
OLAP, GoldenGate, Exadata, Endeca
• Blog: www.rittmanmead.com/blog Twitter: @rittmanmead
Customer Challenge
• Company has subscribers with in-home devices
• Company wishes to improve customer experience
• Log data can potentially help identify issues, but is difficult to access and read
• …and there’s a lot of data!
Big Data Solution
• 6 Node Big Data Appliance (BDA)
bin/hadoop dfs -copyFromLocal
Process scheduled via cron jobs
• Extract data from XML logs via Python script
• Load data to HDFS using copyFromLocal command
• Filter, format, sort data using Oracle R
• Aggregate & transform data using Python scripts & HiveQL
• Load to Oracle DB via Sqoop
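A minimal sketch of the scripted steps above, assuming a hypothetical log layout (`<event>` elements with `device` and `time` attributes); the customer's real log format is not shown in the slides. The shell commands are only built as argument lists, not executed, and use `hadoop fs -copyFromLocal` and `sqoop export` with their standard options.

```python
import xml.etree.ElementTree as ET


def extract_events(xml_text):
    """Flatten <event> elements into tab-separated lines, one line per event."""
    root = ET.fromstring(xml_text)
    lines = []
    for event in root.iter("event"):
        fields = [
            event.get("device", ""),
            event.get("time", ""),
            (event.text or "").strip(),
        ]
        lines.append("\t".join(fields))
    return lines


def hdfs_load_command(local_path, hdfs_dir):
    """Argument list for copying a local file into HDFS."""
    return ["hadoop", "fs", "-copyFromLocal", local_path, hdfs_dir]


def sqoop_export_command(jdbc_url, table, export_dir):
    """Argument list for exporting an HDFS directory into a database table."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--table", table,
        "--export-dir", export_dir,
    ]
```

Each list could be passed to `subprocess.run` on a machine with the Hadoop and Sqoop clients installed. Keeping command construction separate from execution makes the steps easy to inspect before anything is scheduled.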
Wait, this looks familiar…
• Looks like a standard data integration project!
• Scripts written to extract, load, and transform data
• Source data and transformations evolving
• But something is missing
– Scheduling, process flow, monitoring, data quality
– Standardization and maintainability
Transition to an ETL tool
• Initial thought…Informatica
– Client has experience with product
• Why Oracle Data Integrator?
– Extensibility - “Design Once…”
– No middle ETL engine
– Data Quality
• And…it’s licensed with their BDA!
Big Data Solution using ODI 12c
bin/hadoop dfs -copyFromLocal
• Extract data from XML logs via Python script
• Load data to HDFS using copyFromLocal command
• Filter, format, sort data using Oracle R
• Aggregate & transform data using Python scripts & HiveQL
• Load to Oracle DB via Sqoop
ODI components used: ODI Procedure, IKM Hive Transform, IKM Hive Control Append, IKM File-Hive to SQL (SQOOP)
What we learned along the way…
• HiveQL <> Oracle SQL
– For Hive KMs, check the Generate ANSI Syntax checkbox; Hive expects table joins in
ANSI format rather than the “Oracle” format.
• Begin with scripts, but have ODI Application Adapters for Hadoop in mind
• Utilize the skills your available resources have
– Not everyone can write MapReduce code
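The join-syntax difference behind the first lesson can be shown side by side. The two helper functions and the table names below are hypothetical; they only render the two styles of the same join as strings.

```python
def oracle_style_join(left, right, key):
    """Comma-separated tables with the join condition in the WHERE clause."""
    return f"SELECT * FROM {left} l, {right} r WHERE l.{key} = r.{key}"


def ansi_join(left, right, key, kind="JOIN"):
    """Explicit JOIN ... ON syntax, the form the slide says Hive expects."""
    return f"SELECT * FROM {left} l {kind} {right} r ON (l.{key} = r.{key})"


print(oracle_style_join("activity", "customer", "cust_id"))
print(ansi_join("activity", "customer", "cust_id"))
print(ansi_join("activity", "customer", "cust_id", kind="LEFT OUTER JOIN"))
```

Checking the Generate ANSI Syntax option makes the generated code take the second form, so the same mapping can target Hive without hand-editing the join.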
Data Integration Demo
Oracle Data
Integrator
Oracle
GoldenGate
Flume
Process Activity
(Hive)
Application
Logs
Activity
Load Oracle
Big Data SQL
ActivityClean CountrySales
Load Oracle
OLH/OSCH
MySQL DB
SQOOP
OGG
(HDFS/Flume)
Movie Movie MovieRating MovieRating
Customer
Calculate Rating
(Hive)
Sessionize Activity
(Pig OS Call)
Customer SessionStats
Calc Purchases
(Oracle)
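The demo runs its sessionization step as a Pig script through an OS call; that script is not shown in the slides. As a rough Python sketch of the idea only, the function below groups each customer's activity into sessions, starting a new session after a period of inactivity. The 30-minute gap and the event layout are assumptions.

```python
from datetime import datetime, timedelta


def sessionize(events, gap=timedelta(minutes=30)):
    """Number each customer's sessions; a new session starts after `gap` of inactivity.

    `events` is an iterable of (customer_id, timestamp) pairs in any order.
    Returns (customer_id, timestamp, session_number) tuples sorted by customer and time.
    """
    last_seen = {}
    session_no = {}
    out = []
    for cust, ts in sorted(events):
        if cust not in last_seen or ts - last_seen[cust] > gap:
            session_no[cust] = session_no.get(cust, 0) + 1
        last_seen[cust] = ts
        out.append((cust, ts, session_no[cust]))
    return out


def at(hour, minute):
    """Helper for building sample timestamps on a single day."""
    return datetime(2014, 9, 30, hour, minute)


SAMPLE = [(1, at(10, 0)), (2, at(10, 5)), (1, at(10, 10)), (1, at(11, 0))]
```

Per-session statistics such as session length or event count can then be computed by grouping on customer and session number.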
2014 Oracle Excellence Award Ceremony
for Fusion Middleware Innovation
ORACLE FUSION MIDDLEWARE:
CELEBRATE THIS YEAR'S MOST INNOVATIVE
CUSTOMER SOLUTIONS
Tuesday, September 30, 2014 5:00-5:45pm
YBCA Theater (next to Moscone North)
Session ID: CON7029
Resources
Oracle Data Integration Oracle Data Integration OracleGoldenGateORCL DataIntegration blogs.oracle.com/dataintegration
Oracle Data Integrator
Oracle GoldenGate
Oracle Enterprise Data Quality
Oracle Enterprise Metadata Management
Oracle Data Service Integrator
http://www.oracle.com/us/products/middleware/data-integration/overview/index.html
Data Integration
Questions and Answers
Oracle DIS Session @ OOW ’14 – Oracle GoldenGate
2:45PM – CON7717 Oracle GoldenGate New Features & Options Product Update
4:00PM – CON7716 Oracle GoldenGate 12c for Oracle Database 12c
5:15PM – CON7719 Enabling Real-Time Data Integration for Big Data
10:45AM – CON7715 Oracle Active Data Guard & Oracle GoldenGate for HA
12:00PM – CON7328 Near-Zero Downtime Unicode Migration for Oracle
12:00PM – CON774 Oracle GoldenGate for Cloud
6:00PM – BOF9597 International Oracle GoldenGate User Group Meeting
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
4:45PM – CON7773 Oracle GoldenGate Performance Tuning for Oracle Database
10:45AM – CON7655 Achieving Zero Downtime During Oracle Application Upgrades & System Migrations
1:15PM – CON7718 Managing & Monitoring Oracle GoldenGate
MON TUE WED THU
Oracle DIS Session @ OOW ’14 – Oracle Data Integrator
4:00PM – CON7899 Oracle Data Integrator: Product Update and Future Strategy
5:00PM – CON7820 Making the Move from Oracle Warehouse Builder to Oracle Data Integrator
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
9:30AM – CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration
10:45AM – CON7923 Oracle Data Integration & Metadata Management for Seamless Enterprise
2:30PM – CON7921 Insight into Action: Business Intelligence Applications and Oracle Data Integrator
MON TUE WED THU
Oracle DIS Session @ OOW ’14 – Enterprise Data Quality
11:45AM – CON7776 Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality
10:45AM – CON7780 Oracle Enterprise Data Quality: Product Overview and Roadmap
2:00PM – CON7775 The Essential Core of Data Governance with Oracle Enterprise Data Quality
3:30PM – CON7934 Tapping into the Big Data Reserve with All Data
4:45PM – CON7922 Tame Big Data with Oracle Data Integration
12:00PM – CON7931 Solving Big Data’s Big Problem with Data Preparation & Enrichment in the Cloud
MON TUE WED THU
Oracle DIS Hands-on Labs @ OOW ’14
Hotel Nikko, Nikko Ballroom II, 22 Mason Street
ODI
Tuesday 3:45PM – HOL9439
• Oracle Data Integrator 12c New Features Deep Dive
Tuesday 5:15PM – HOL9414
• Oracle Data Integrator for Big Data
OGG
Monday 1:15PM – HOL9437
• Oracle GoldenGate 12c New Features Deep Dive
Wednesday 4:15PM – HOL9436
• Pushing Transactions to JCache with Coherence and GoldenGate
Thursday 10AM – HOL9413
• Oracle GoldenGate Heterogeneous Replication
EDQ
Monday 2:45PM – HOL9438
• Oracle Enterprise Data Quality Introduction


Editor's Notes

  1. Big Data is the New Fuel for the Enterprise. It’s a clean fuel that comes from renewable sources. It’s perishable if not used regularly. It’s combustible and can have explosive impacts.