To The Cloud and Back: A Look at Hybrid Analytics

Keith Manthey

Upload: hadoop-summit

Post on 07-Jan-2017


Page 1: To The Cloud and Back: A Look At Hybrid Analytics

To The Cloud and Back: A Look at Hybrid Analytics

Keith Manthey

Page 2: To The Cloud and Back: A Look At Hybrid Analytics

© Dell EMC 2016 - All rights reserved

CIO challenge: next-generation infrastructures are needed

Traditional

• ITIL-based IT processes
• Client-server scale-up apps
• Infrastructure resiliency

Page 3: To The Cloud and Back: A Look At Hybrid Analytics

CIO challenge: next-generation infrastructures are needed

Coexisting IT paradigms

Traditional

• ITIL-based IT processes
• Client-server scale-up apps
• Infrastructure resiliency

Cloud-native

• DevOps-based IT processes
• Distributed scale-out apps
• Application resiliency

Page 4: To The Cloud and Back: A Look At Hybrid Analytics


Traditional Cloud-native

Modern IT

Off-Premises

On-Premises

Available both on and off premises

Page 5: To The Cloud and Back: A Look At Hybrid Analytics

[Diagram labels: EDGE, CORE DATACENTER, DATA LAKE, CLOUD ENABLED STORAGE, PRIVATE, PUBLIC, HOSTED]

Page 6: To The Cloud and Back: A Look At Hybrid Analytics

[Diagram labels: EDGE, CORE DATACENTER, DATA LAKE, CLOUD ENABLED STORAGE, PRIVATE, PUBLIC, HOSTED]

Through 2020, IDC predicts that spending on cloud-based big data analytics will grow 4.5x faster than spending on on-premises solutions

By 2020, IDC predicts that usage of big data analytics for actionable intelligence will double from today's level

Hybrid analytics spanning on-premises and cloud will become a dominant use of the technology

More specifically: moving data from on-premises to the cloud and back

Page 7: To The Cloud and Back: A Look At Hybrid Analytics

The journey to here

Page 8: To The Cloud and Back: A Look At Hybrid Analytics

DATA LAKE SOLUTION FOR EDW MODERNISATION

[Diagram. Recoverable labels follow.]

Existing sources: ERP, CRM

New sources: Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs

Platform labels: Hortonworks Data Platform, Hadoop Core, Data Services, Operational Services, Commodity Compute, Isilon

Application labels: Business Analytics, Visualization & Dashboards, IT Applications, Data Management

Numbered callouts (1 to 5): ETL/ELT Offload; Active Archive; Enrich with New Data Types; Multi-Protocol Access (NFS, SMB, HTTP, Swift); Enterprise-Grade Data Management

Legend: New Data Flow, Current Data Flow

Page 9: To The Cloud and Back: A Look At Hybrid Analytics

DATA LAKE BENEFITS

1. Active Archive: Optimise Enterprise Data Warehouse (EDW) storage by archiving cold data while still analysing it as needed
2. ETL Offload: Improve EDW performance by offloading ETL processing to Hadoop
3. Semi/Unstructured Data Analytics: Increase confidence in business decisions with new data sources
4. Multi-protocol Access: Enable applications to access/update Hadoop data using NFS, SMB, HTTP, Swift and other file/object based access methods
5. Data Management: Enterprise-grade data management at Hadoop economics

Unique to Isilon

Page 10: To The Cloud and Back: A Look At Hybrid Analytics

ISILON MOMENTUM

[Timeline: 2003 to 2015]

• >12,000 Clusters in the Field
• Approaching 1PB Average Cluster Capacity
• 1.3EB of Hadoop Usage
• 7,000+ Customers World Wide
• 1,300 New Customers in 2015
• #1 in Data Lakes

>1,200 HDFS Customers

Scale-Out NAS Leader

Page 11: To The Cloud and Back: A Look At Hybrid Analytics

A look forward

Page 12: To The Cloud and Back: A Look At Hybrid Analytics

To The Cloud and Back

Ability to:

• Enact Policy Based Data Movement
• Migrate data from On-Prem to Cloud and vice versa
• Leverage Hybrid Analytics to greatest effect

ISILON SD

Page 13: To The Cloud and Back: A Look At Hybrid Analytics

Limits: 100 Gb/sec

1 Petabyte (1 million gigabytes)

~1 Day to transfer one way

To The Cloud and Back

ISILON SD
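The "~1 day" figure on this slide can be checked with a short calculation. The sketch below is illustrative only: the function name and the `efficiency` parameter are assumptions introduced here, not part of the talk. It treats a petabyte as 10^15 bytes (matching "1 million gigabytes") and assumes the 100 Gb/sec link is fully utilised, which real transfers rarely achieve.

```python
def transfer_time_seconds(data_bytes: float,
                          link_bits_per_second: float,
                          efficiency: float = 1.0) -> float:
    """Ideal one-way transfer time for data_bytes over a link.

    efficiency is a hypothetical utilisation factor in (0, 1];
    1.0 means the link is saturated for the whole transfer.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # Convert bytes to bits, then divide by the usable bit rate.
    return (data_bytes * 8) / (link_bits_per_second * efficiency)


PETABYTE = 10**15            # decimal petabyte = 1 million gigabytes
LINK_100_GBPS = 100 * 10**9  # 100 gigabits per second

seconds = transfer_time_seconds(PETABYTE, LINK_100_GBPS)
print(f"{seconds / 3600:.1f} hours")  # prints "22.2 hours"
```

At full line rate the transfer takes 80,000 seconds, or roughly 22 hours, which is consistent with the slide's "~1 day" one-way estimate. Any utilisation below 100% pushes the figure past a day.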

Page 14: To The Cloud and Back: A Look At Hybrid Analytics

Data Silos

On Prem: only files

Cloud: only files

Strong Bandwidth

Typically Hand Scripted

Expensive Extraction

Limit outbound data movement

Page 15: To The Cloud and Back: A Look At Hybrid Analytics

Our Research

Page 16: To The Cloud and Back: A Look At Hybrid Analytics

Problem Domain

1. File movement automation in place today is labor intensive and fraught with peril

2. Non-prescriptive file movement out of the cloud will be extremely expensive with limited value in return.

3. Most file movements might not return to exactly the same target location. For example:

Location 1: File 1

Location 2: File 1 begets File 2

Location 3: File 2

Page 17: To The Cloud and Back: A Look At Hybrid Analytics

What if...

1. There was a way to move files by policy (i.e., rules based) to various locations, including the cloud and back
   1. For example: only target net new files created in the cloud in certain directories for movement.
2. The rules based file movement could allow for multiple targets and/or destinations for files.
   1. For example: move net new files in one directory to a single target and net new files in a second directory to three targets.
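One way to picture the rules described on this slide is as a small table of movement policies evaluated against each file. The sketch below is purely illustrative and does not represent any OneFS or Isilon API. Every name in it (`FileEvent`, `MovementRule`, `plan_moves`, the directory patterns, and the target names) is a hypothetical introduced for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import List, Tuple


@dataclass(frozen=True)
class FileEvent:
    path: str       # full path of the file
    location: str   # where the file currently lives, e.g. "cloud"
    is_new: bool    # True for a net new file


@dataclass(frozen=True)
class MovementRule:
    source_location: str        # location the rule watches
    directory_glob: str         # directory pattern, e.g. "/results/*"
    targets: Tuple[str, ...]    # one or more destinations
    new_files_only: bool = True

    def matches(self, event: FileEvent) -> bool:
        if event.location != self.source_location:
            return False
        if self.new_files_only and not event.is_new:
            return False
        return fnmatch(event.path, self.directory_glob)


def plan_moves(event: FileEvent,
               rules: List[MovementRule]) -> List[Tuple[str, str]]:
    """Return (path, target) pairs for every rule the event matches."""
    moves = []
    for rule in rules:
        if rule.matches(event):
            moves.extend((event.path, target) for target in rule.targets)
    return moves


# Hypothetical policies: one directory goes to a single target,
# a second directory fans out to three targets.
RULES = [
    MovementRule("cloud", "/results/*", ("on-prem-lake",)),
    MovementRule("cloud", "/models/*", ("on-prem-lake", "dr-site", "edge-cache")),
]

for move in plan_moves(FileEvent("/models/churn.bin", "cloud", True), RULES):
    print(move)
```

Keeping the rules as plain data separates the policy from the mechanism that actually copies files, so the same policy list could in principle drive movement in either direction.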

Page 18: To The Cloud and Back: A Look At Hybrid Analytics

OneFS in the Cloud

Page 19: To The Cloud and Back: A Look At Hybrid Analytics

Our Journey to Here

1. OneFS already is a software defined asset.
   1. SDEdge is a software defined storage offering
   2. CloudPools is a software feature in SDEdge that uses the Public Cloud (Azure, AWS, Google) as a backup target
2. OneFS has been loaded and run in the public cloud as noted in the previous slide.
3. Dell EMC has a long history of data mobility across our product suite (replication, backups, etc.)
4. Dell EMC has a long history of policy based file movement features.

Page 20: To The Cloud and Back: A Look At Hybrid Analytics

Futures

1. We are exploring what OneFS with all its features and abilities to move data up to the cloud and back would look like.

2. What about a cloud environment that contained a OneFS daemon that allowed policy based file movement?

3. The cloud environment file movement (up and back) could be controlled by an Isilon cluster.

Page 21: To The Cloud and Back: A Look At Hybrid Analytics

Future State?

On-Prem Files

Movement policies

ISILON SD

Cloud Files

OneFS Daemon

Page 22: To The Cloud and Back: A Look At Hybrid Analytics