Tame Big Data with Oracle Data Integration

Upload: michael-rainey

Post on 12-Apr-2017

Page 1: Tame Big Data with Oracle Data Integration
Page 2: Tame Big Data with Oracle Data Integration

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

Oracle Data Integration: CON7922
Tame Big Data with Oracle Data Integration

Alex Kotopoulis
Senior Principal Product Manager
Oracle Fusion Middleware, Data Integration Solutions

Michael Rainey
Principal Consultant
Rittman Mead

Oracle OpenWorld 2014

Page 3: Tame Big Data with Oracle Data Integration

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Page 4: Tame Big Data with Oracle Data Integration

Agenda

1. Oracle Data Integration Overview
2. Customer Cases and Best Practices
3. Big Data Demo
4. Q&A and For More Information
• OOW Data Integration Sessions and Additional Resources

Page 5: Tame Big Data with Oracle Data Integration

Oracle Data Integration Solutions and Proven Benefits

Improve Agility
• Deploy Projects Faster
• Reliable Real-Time

Reduce Risk
• Popular, Proven Tools
• Open, Not Proprietary

Reduce Costs
• Better Productivity
• Eliminate ETL Servers

Analytic Data Integration
• Big Data Integration & Governance
• Data Warehouse Integration
• Business Intelligence Applications

Enterprise Data Integration and Governance
• Enterprise Data Quality and Profiling
• Comprehensive, Heterogeneous Data Integration
• Business Glossary and Metadata Management

Business Continuity
• Active-Active for Maximum Availability
• Zero Downtime Migrations
• Data Consolidation / Application Modernization

24 x 7 x 365

Page 6: Tame Big Data with Oracle Data Integration

Comprehensive Data Integration & Governance Capabilities

Real-Time Data Movement
– Low impact capture, stage in Hadoop
– Continuous data availability

Data Transformation
– Bulk data movement
– Pushdown data processing

Data Federation
– Virtualized Data Services

Data Quality & Verification
– Fix quality at the source
– Verify data consistency

Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics

Data Governance Foundation
• Oracle Data Integrator (Transformation)
• Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate)
• Oracle GoldenGate (Movement)
• Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
• Data Service Integrator (Federation)
• GoldenGate Veridata (Online Data Verification)

Diagram labels: FastLoad; ELT Processing on Hadoop or SQL; Continuous Availability

Page 7: Tame Big Data with Oracle Data Integration

Differentiated Technical Approach

Dynamic Data Movement
– Real-time CDC by default, not ETL
– Least invasive on sources
– Proven best performance
– Integrated Oracle capture/apply

No ETL Engines
– Take the processing to the data; don’t move the data to the process
– Leverage your data engines for the workloads (Hadoop or SQL)

Most Heterogeneous
– Leverage open source Hadoop, not proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards

Page 8: Tame Big Data with Oracle Data Integration

Data Reservoir Use Case with Oracle Data Integration

Oracle Confidential – Internal/Restricted/Highly Restricted 8

Diagram labels:
• Products: Oracle Data Integrator, Oracle GoldenGate, Oracle Enterprise Metadata Management, Oracle Data Service Integrator, Oracle Enterprise Data Quality
• Data sources and stores: Logs, OLTP Databases, Social Media, Sensor Data, Data Warehouses, Datamarts
• Technologies and flows: Pig; Sqoop Initial Load; Sqoop Load; OLH / OSCH; Big Data SQL; File Load; CDC to HDFS, Hive, Flume, HBase; Federated Queries; Impala; Transformations with HDFS, Hive, HBase, Pig

Page 9: Tame Big Data with Oracle Data Integration

Logical and Physical Design with ODI

Diagram labels:
• Logical Design: Oracle, MySQL, Hive
• Physical Design: Sqoop, Sqoop, IKM, LKM, LKM, Oracle, Hive, MySQL, Hive

Page 10: Tame Big Data with Oracle Data Integration

Design Once, Run Anywhere
• Use native technologies for any data source
– Data Locality
– Optimal performance, reduced network traffic
• No proprietary middle tier
– Reduced infrastructure cost and maintenance effort
• Declarative design
– Simplified development
– Reusable across technologies

Diagram labels: Hive; Agent; Languages and Tools; Runtime Environments; Sqoop; Big Data SQL; Future Languages; Future Runtime Engines; OLH; OSCH

Page 11: Tame Big Data with Oracle Data Integration

Oracle GoldenGate Adapter – Big Data Use Cases

Diagram labels: Source Database; Log File; Capture; Capture Parameter File; Trail File; Pump; Pump Parameter file; Java Adapter; Adapter Property file; Adapter Jar file; HDFS file; Hive; HBase; Flume (Source, Channel, Sink); Other Custom Targets

Page 12: Tame Big Data with Oracle Data Integration

Page 13: Tame Big Data with Oracle Data Integration

Introduction
• Michael Rainey
• Principal Consultant - Rittman Mead
• Oracle Data Integration expert
– Oracle Data Integrator and Oracle GoldenGate
• Oracle ACE
• Twitter: @mRainey

Page 14: Tame Big Data with Oracle Data Integration

About Rittman Mead
• Oracle Gold partner
– World leading specialist partner for technical excellence, solutions delivery and innovation in Oracle BI
– Provide consulting, training, managed services for customers worldwide
• 120+ consultants including 1 Oracle ACE Director, 3 Oracle ACEs and 1 Oracle ACE Associate
– All expert in Oracle BI, DW, EPM and Analytics tech
– Skills in broad range of supporting Oracle tools: OBIEE, OBIA, ODIEE, Essbase, Oracle OLAP, GoldenGate, Exadata, Endeca
• Blog: www.rittmanmead.com/blog
• Twitter: @rittmanmead

Page 15: Tame Big Data with Oracle Data Integration

Customer Challenge
• Company has subscribers with in-home devices
• Company wishes to improve customer experience
• Log data can potentially help identify issues, but is difficult to access and read
• …and there’s a lot of data!

Page 16: Tame Big Data with Oracle Data Integration

Big Data Solution
• 6 Node Big Data Appliance (BDA)

Extract data from XML logs via python script

Load data to HDFS using copyFromLocal command

Filter, format, sort data using Oracle R

Aggregate & transform data using python scripts & HiveQL

Load to Oracle DB via Sqoop
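The first step of the flow above (extracting data from XML logs with a Python script) might look roughly like the sketch below. The slides do not show the customer's log schema or script, so the `<event>` element, its attributes, and the helper names `extract_rows` and `to_delimited` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical log layout: the slides do not show the real XML schema.
SAMPLE_LOG = """<log>
  <event device="A100" ts="2014-09-30T17:00:00" code="E42">Tuner reset</event>
  <event device="B200" ts="2014-09-30T17:00:05" code="I01">Heartbeat</event>
</log>"""

# Assumed attribute names to pull from each event.
FIELDS = ("device", "ts", "code")


def extract_rows(xml_text):
    """Flatten each <event> element into a tuple of attribute values plus its text."""
    root = ET.fromstring(xml_text)
    rows = []
    for event in root.iter("event"):
        values = [event.get(name, "") for name in FIELDS]
        values.append((event.text or "").strip())
        rows.append(tuple(values))
    return rows


def to_delimited(rows, sep="\t"):
    """Render rows as delimited text, one record per line."""
    return "\n".join(sep.join(row) for row in rows)


rows = extract_rows(SAMPLE_LOG)
print(to_delimited(rows))
```

A tab-delimited file written this way could then be copied into HDFS with the `copyFromLocal` command named on the slide and exposed to Hive as a delimited-text table.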

Page 17: Tame Big Data with Oracle Data Integration

Wait, this looks familiar…
• Looks like a standard data integration project!
• Scripts written to extract, load, and transform data
• Source data and transformations evolving
• But something is missing
– Scheduling, process flow, monitoring, data quality
– Standardization and maintainability

Page 18: Tame Big Data with Oracle Data Integration

Transition to an ETL tool
• Initial thought…Informatica
– Client has experience with product
• Why Oracle Data Integrator?
– Extensibility - “Design Once…”
– No middle ETL engine
– Data Quality
• And…it’s licensed with their BDA!

Page 19: Tame Big Data with Oracle Data Integration

Big Data Solution using ODI 12c

ODI Procedure

IKM Hive Transform

IKM File-Hive to SQL (SQOOP)

Extract data from XML logs via python script

Load data to HDFS using copyFromLocal command

Filter, format, sort data using Oracle R

Aggregate & transform data using python scripts & HiveQL

Load to Oracle DB via Sqoop

IKM Hive Control Append

Page 20: Tame Big Data with Oracle Data Integration

What we learned along the way…
• HiveQL <> Oracle SQL
– In the Hive KMs, check the Generate ANSI Syntax checkbox; Hive expects table joins in ANSI format rather than the “Oracle” format.
• Begin with scripts, but have ODI Application Adapters for Hadoop in mind
• Utilize the skills your available resources have
– Not everyone can write MapReduce code
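To make the HiveQL versus Oracle SQL point concrete, the sketch below builds the same two-table join in both styles. It only illustrates the syntax difference the slide describes; it is not the code ODI generates, and the table and column names (`device_log`, `subscriber`, `device_id`) are hypothetical.

```python
def ansi_join(left, right, key):
    """Explicit JOIN ... ON form, the ANSI style the slide says Hive expects."""
    return (
        f"SELECT l.*, r.* FROM {left} l "
        f"JOIN {right} r ON l.{key} = r.{key}"
    )


def oracle_style_join(left, right, key):
    """Comma-separated tables with the join condition in WHERE (the "Oracle" format)."""
    return (
        f"SELECT l.*, r.* FROM {left} l, {right} r "
        f"WHERE l.{key} = r.{key}"
    )


# Hypothetical table and column names, for illustration only.
print(ansi_join("device_log", "subscriber", "device_id"))
print(oracle_style_join("device_log", "subscriber", "device_id"))
```

Both statements describe the same inner join; only the first uses the explicit `JOIN ... ON` form that the Generate ANSI Syntax option is meant to produce.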

Page 21: Tame Big Data with Oracle Data Integration

Page 22: Tame Big Data with Oracle Data Integration

Data Integration Demo

Diagram labels:
• Products and sources: Oracle Data Integrator; Oracle GoldenGate; Flume; Application Logs; MySQL DB
• Steps: Process Activity (Hive); Calculate Rating (Hive); Sessionize Activity (Pig OS Call); Calc Purchases (Oracle); Load Oracle Big Data SQL; Load Oracle OLH/OSCH; SQOOP; OGG (HDFS/Flume)
• Data sets: Activity; ActivityClean; CountrySales; Movie; MovieRating; Customer; SessionStats
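The demo diagram labels its sessionization step as a Pig OS call, and the slides do not show the script. As a rough illustration of what "sessionizing" activity records means, here is a minimal sketch in Python rather than Pig. The 30-minute inactivity gap and the `(customer_id, epoch_seconds)` record shape are assumptions, not details from the demo.

```python
from collections import defaultdict

# Assumed inactivity timeout; the slides do not state the rule the demo uses.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Assign a per-customer session number to (customer_id, epoch_seconds) events.

    A new session starts whenever the time since the customer's previous
    event exceeds `gap` seconds.
    """
    by_customer = defaultdict(list)
    for customer_id, ts in events:
        by_customer[customer_id].append(ts)

    result = []
    for customer_id, stamps in sorted(by_customer.items()):
        session = 0
        previous = None
        for ts in sorted(stamps):
            if previous is None or ts - previous > gap:
                session += 1
            result.append((customer_id, ts, session))
            previous = ts
    return result


# Hypothetical activity records: (customer_id, epoch_seconds)
activity = [(1, 0), (1, 600), (1, 4000), (2, 100)]
for row in sessionize(activity):
    print(row)
```

With the sample data, customer 1's third event falls more than 30 minutes after the second and therefore opens a second session. Per-session statistics could then be aggregated from the tagged rows.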

Page 23: Tame Big Data with Oracle Data Integration

Page 24: Tame Big Data with Oracle Data Integration

2014 Oracle Excellence Award Ceremony for Fusion Middleware Innovation

ORACLE FUSION MIDDLEWARE: CELEBRATE THIS YEAR'S MOST INNOVATIVE CUSTOMER SOLUTIONS

Tuesday, September 30, 2014, 5:00-5:45pm
YBCA Theater (next to Moscone North)
Session ID: CON7029

Page 25: Tame Big Data with Oracle Data Integration

Resources

Oracle Data Integration
Oracle Data Integration
OracleGoldenGate
ORCL DataIntegration
blogs.oracle.com/dataintegration

Oracle Data Integrator

Oracle GoldenGate

Oracle Enterprise Data Quality

Oracle Enterprise Metadata Management

Oracle Data Services Integrator

http://www.oracle.com/us/products/middleware/data-integration/overview/index.html

Data Integration

Page 26: Tame Big Data with Oracle Data Integration

Questions and Answers

Page 27: Tame Big Data with Oracle Data Integration

Oracle DIS Session @ OOW ’14 – Oracle GoldenGate

2:45PM – CON7717 Oracle GoldenGate New Features & Options Product Update

4:00PM – CON7716 Oracle GoldenGate 12c for Oracle Database 12c

5:15PM – CON7719 Enabling Real-Time Data Integration for Big Data

10:45AM – CON7715 Oracle Active Data Guard & Oracle GoldenGate for HA

12:00PM – CON7328 Near-Zero Downtime Unicode Migration for Oracle

12:00PM – CON774 Oracle GoldenGate for Cloud

6:00PM – BOF9597 International Oracle GoldenGate User Group Meeting

3:30PM – CON7934 Tapping into the Big Data Reserve with All Data

4:45PM – CON7922 Tame Big Data with Oracle Data Integration

4:45PM – CON7773 Oracle GoldenGate Performance Tuning for Oracle Database

10:45AM – CON7655 Achieving Zero Downtime During Oracle Application Upgrades & System Migrations

1:15PM – CON7718 Managing & Monitoring Oracle GoldenGate

MON TUE WED THU

Page 28: Tame Big Data with Oracle Data Integration

Oracle DIS Session @ OOW ’14 – Oracle Data Integrator

4:00PM – CON7899 Oracle Data Integrator: Product Update and Future Strategy

5:00PM – CON7820 Making the Move from Oracle Warehouse Builder to Oracle Data Integrator

3:30PM – CON7934 Tapping into the Big Data Reserve with All Data

4:45PM – CON7922 Tame Big Data with Oracle Data Integration

9:30AM – CON7926 Oracle Data Integration: A Crucial Ingredient for Cloud Integration

10:45AM – CON7923 Oracle Data Integration & Metadata Management for Seamless Enterprise

2:30PM – CON7921 Insight into Action: Business Intelligence Applications and Oracle Data Integrator

MON TUE WED THU

Page 29: Tame Big Data with Oracle Data Integration

Oracle DIS Session @ OOW ’14 – Enterprise Data Quality

11:45AM – CON7776 Data Quality Maturity Journey: Building Toward Strong Enterprise Data Quality

10:45AM – CON7780 Oracle Enterprise Data Quality: Product Overview and Roadmap

2:00PM – CON7775 The Essential Core of Data Governance with Oracle Enterprise Data Quality

3:30PM – CON7934 Tapping into the Big Data Reserve with All Data

4:45PM – CON7922 Tame Big Data with Oracle Data Integration

12:00PM – CON7931 Solving Big Data’s Big Problem with Data Preparation & Enrichment in the Cloud

MON TUE WED THU

Page 30: Tame Big Data with Oracle Data Integration

Oracle DIS Hands-on Labs @ OOW ’14

Hotel Nikko, Nikko Ballroom II, 22 Mason Street

ODI
• Tuesday 3:45PM – HOL9439: Oracle Data Integrator 12c New Features Deep Dive
• Tuesday 5:15PM – HOL9414: Oracle Data Integrator for Big Data

OGG
• Monday 1:15PM – HOL9437: Oracle GoldenGate 12c New Features Deep Dive
• Wednesday 4:15PM – HOL9436: Pushing Transactions to JCache with Coherence and GoldenGate
• Thursday 10AM – HOL9413: Oracle GoldenGate Heterogeneous Replication

EDQ
• Monday 2:45PM – HOL9438: Oracle Enterprise Data Quality Introduction

Page 31: Tame Big Data with Oracle Data Integration
Page 32: Tame Big Data with Oracle Data Integration
