Transcript

©2014 Cloudera, Inc. All rights reserved.

©2014 Cloudera, Syncsort, Tableau Inc. All rights reserved.

Agenda

• Data Warehouse Vision & Reality
• What is legacy data & why an Enterprise Data Hub
• Offloading legacy data and workloads to Hadoop
• Transform all types of data into self-service analytics
• Live Demonstration
• Customer case study
• Q&A


What is this?


The Data Warehouse Vision - 1998

Data Integration & ETL Tools would enable a Single, Consistent Version of the Truth

[Diagram labels: Real-Time, Mainframe, Oracle, ERP, File, XML; ETL; Data Warehouse; Data Mart (x3)]

Data Warehouse Reality 2014

Data Integration & ETL Tools would enable a Single, Consistent Version of the Truth

[Diagram labels: Real-Time, Mainframe, Oracle, ERP, File, XML; ETL; Data Mart (x3); Dormant Data; Staging / ELT; New Reports; SLA’s; New Column; Complete History]

The Data Warehouse Vision vs Reality

Vision                | Reality
Fresher data          | Longer ELT batch windows
Longer history data   | Shorter data retention
Faster analytics      | Slower queries
More data sources     | Weeks/months just to add new data fields
Lower costs           | Growing costs

Mainframes | A Critical Source of Big Data

• Top 25 World Banks
• 9 of World’s Top Insurers
• 23 of Top 25 US Retailers
• 71% Fortune 500
• 30 Billion Bus. Transactions / day

Suits & Hoodies – Working Together

Integration Gaps
• Connectivity
• Data conversion (EBCDIC vs ASCII)

Expertise Gaps
• COBOL appeared in 1959, Hadoop in 2005
• Mainframe & Hadoop skills shortage

Security Gaps
• Hosts mission critical sensitive data
• Very difficult to install new software on MF

Costs Gaps
• Mainframe data is (expensive) Big Data
• Even FTP costs CPU cycles (MIPS)

Suits & Hoodies idea: Merv Adrian, Gartner Research.
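
The "Data conversion (EBCDIC vs ASCII)" gap can be illustrated with a minimal Python sketch. It assumes the mainframe text uses EBCDIC code page 037 (Python's built-in `cp037` codec); real files may use a different code page, and the sample bytes and function name are hypothetical. It handles plain character fields only, not binary or packed numeric fields.

```python
# Hypothetical sample: the text "HELLO 123" as EBCDIC bytes.
# Code page 037 is an assumption; confirm the code page used by the source system.
EBCDIC_SAMPLE = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6, 0x40, 0xF1, 0xF2, 0xF3])


def ebcdic_to_ascii(raw: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC character data and re-encode it as ASCII bytes."""
    text = raw.decode(codepage)   # EBCDIC bytes -> Python string
    return text.encode("ascii")   # string -> ASCII bytes (raises on non-ASCII characters)


if __name__ == "__main__":
    print(ebcdic_to_ascii(EBCDIC_SAMPLE))
```

The same letters and digits have different byte values in the two encodings, which is why a byte-for-byte file transfer is not enough on its own.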


Expanding Data Requires A New Approach

1980s: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only

Now: Bring Compute to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types

[Diagram: relative size & complexity of Data and Compute]

From Apache Hadoop to an enterprise data hub

Open Source, Scalable, Flexible, Cost-Effective, Managed, Open Architecture, Secure and Governed

[Diagram: core Apache Hadoop]
• BATCH PROCESSING: MAPREDUCE
• STORAGE FOR ANY TYPE OF DATA (UNIFIED, ELASTIC, RESILIENT, SECURE)
• FILESYSTEM: HDFS

Core Apache Hadoop is great, but…

1) Hard to use and manage.

2) Only supports batch processing.

3) Not comprehensively secure.

From Apache Hadoop to an enterprise data hub

Open Source, Scalable, Flexible, Cost-Effective, Managed, Open Architecture, Secure and Governed

CLOUDERA’S ENTERPRISE DATA HUB

[Diagram, built up over several slides]
• BATCH PROCESSING: MAPREDUCE
• ANALYTIC SQL: IMPALA
• SEARCH ENGINE: SOLR
• MACHINE LEARNING: SPARK
• STREAM PROCESSING: SPARK STREAMING
• 3RD PARTY APPS
• WORKLOAD MANAGEMENT: YARN
• STORAGE FOR ANY TYPE OF DATA (UNIFIED, ELASTIC, RESILIENT, SECURE): FILESYSTEM (HDFS), ONLINE NOSQL (HBASE)
• DATA MANAGEMENT: CLOUDERA NAVIGATOR
• SYSTEM MANAGEMENT: CLOUDERA MANAGER
• SENTRY

Cloudera: Your Trusted Advisor for Big Data

Advance from Strategy to ROI with Best Practices and Peak Performance

• Partners
• Proactive & Predictive Support
• Professional Services
• Training


The Impact of ELT & Dormant Data on the EDW

• ELT drives up to 80% of database capacity
• Dormant (rarely used) data wastes premium storage
• ETL/ELT processes on dormant data waste premium CPU cycles

[Diagram labels: Hot, Warm, Cold Data; Transformations (ELT) of unused data]
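
One way to make the hot / warm / cold distinction concrete is to bucket tables by how recently they were last queried. The sketch below is illustrative only: the 30-day and 180-day thresholds, the table names, and the function names are assumptions, not figures from the presentation, and a real assessment would read last-access dates from the warehouse's own query logs.

```python
from datetime import date
from typing import Dict, List

# Hypothetical thresholds in days since last access; tune to local policy.
HOT_MAX_DAYS = 30
WARM_MAX_DAYS = 180


def classify_temperature(last_accessed: date, today: date) -> str:
    """Return 'hot', 'warm', or 'cold' based on days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"
    if age_days <= WARM_MAX_DAYS:
        return "warm"
    return "cold"


def dormant_tables(last_access_by_table: Dict[str, date], today: date) -> List[str]:
    """List the tables classified as cold, i.e. candidates for offload."""
    return sorted(
        name
        for name, last_accessed in last_access_by_table.items()
        if classify_temperature(last_accessed, today) == "cold"
    )


if __name__ == "__main__":
    # Hypothetical table names and last-access dates.
    sample = {
        "orders_2014": date(2014, 5, 20),
        "orders_2013": date(2014, 1, 15),
        "orders_2009": date(2010, 1, 1),
    }
    print(dormant_tables(sample, today=date(2014, 6, 1)))
```

Tables that land in the cold bucket, and the ELT jobs that still transform them, are the natural first candidates to review.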


Where to Start?

• How to identify dormant data?
• What workloads will deliver the biggest impact?
• How will you access & move all your data?
• Can you secure the new environment?
• How do you optimize it?
• How do you manage it?
• How do you make it business-class?
• What tools do you need?
• How will you leverage all your data, including mainframes?


Offload Legacy Data & Workloads to The Enterprise Data Hub

One Framework. Blazing Performance, Iron-Clad Security, Disruptive Economics

Phase I: Identify
• Identify data & workloads most suitable for offload
• Focus on those that will deliver maximum savings & performance

Phase II: Offload
• Access and move virtually any data (e.g. mainframe) to Enterprise Data Hub with one tool
• Easily replicate existing staging workloads in Hadoop using a graphical user interface

Phase III: Optimize & Secure
• Deploy on premises and in Cloud
• Optimize the new environment
• Manage & secure all your data with business class tools
• Deliver self-service reporting


The Problem: Volume of Data
Businesses are struggling to unlock exploding data

The Problem: Diverse Data
Businesses and their people are struggling to unlock diverse data

The Problem: Old School Software
Traditional technologies are complicated, inflexible and slow moving

The Tableau Revolution
Fast and easy analytics for everyone

Flexible
Transform all types of data into self-service analytics

For Everyone
Ease of use leads to adoption across all departments and use cases

LIVE DEMO

Case Study: Optimize EDW Leading Financial Org

Mainframe Offload (74-page COBOL copybook)

[Chart: Elapsed Time (min). HiveQL: 217 min; Syncsort DMX-h: 9 min]

Development Effort
• Syncsort DMX-h: 4 hrs.
• Manual Coding: Weeks!

Benefits:
• Cut development time from weeks to hours
• Reduced complexity: 47 HiveQL scripts to 4 DMX-h graphical jobs
• Easily validate COBOL copybooks and find errors
• Mainframe Data available to business for analytics
• Staging & ELT moved out of RDBMS – Queries run faster
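
The case-study figures can be turned into ratios with a few lines of arithmetic. This sketch uses only the numbers quoted on the slide (217 min vs 9 min elapsed, 47 scripts vs 4 jobs); the function names are illustrative.

```python
def speedup(baseline: float, improved: float) -> float:
    """How many times smaller the improved figure is than the baseline."""
    return baseline / improved


def percent_reduction(baseline: float, improved: float) -> float:
    """Reduction relative to the baseline, as a percentage."""
    return (1 - improved / baseline) * 100


# Figures quoted in the case study.
HIVEQL_MINUTES = 217
DMXH_MINUTES = 9
HIVEQL_SCRIPTS = 47
DMXH_JOBS = 4

if __name__ == "__main__":
    print(f"Elapsed time speedup: {speedup(HIVEQL_MINUTES, DMXH_MINUTES):.1f}x")
    print(f"Elapsed time reduction: {percent_reduction(HIVEQL_MINUTES, DMXH_MINUTES):.1f}%")
    print(f"Scripts per job: {speedup(HIVEQL_SCRIPTS, DMXH_JOBS):.2f}")
```

On these figures the elapsed time drops by a factor of roughly 24 (about 96% less), and each graphical job replaces on average just under 12 HiveQL scripts.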


Final Thoughts

Rusty Sears, Vice President of Enterprise Data Services and Big Data at Regions Financial Corporation


QUESTIONS?
