cloudera enterprise in the cloud - sys-con...

Hadoop in the Cloud

David Tishgart, Director of Product Marketing, Cloudera

© Cloudera, Inc. All rights reserved.

Post on 09-Sep-2018


Cloudera Snapshot

MARKET & POSITIONING: The modern platform for data management and analytics, built on Apache Hadoop, Spark, and the latest open source tech

CUSTOMERS: >1,000 enterprise subscription software customers

LARGEST ECOSYSTEM: >2,600 partners

GLOBAL EXPANSION: Operations in 27 countries; 24x7 global support and education; customers in >50 countries

EMPLOYEES: >1,400

FUNDING-TO-DATE: $670 million (Intel, Accel Partners, Institutional Investors)

Cloudera — Confidential

Drive Customer Insights

Improve Product & Services Efficiency

Lower Business Risk

The world’s largest taxi company owns ZERO vehicles.

The world’s largest accommodation provider owns ZERO real estate.

The world’s most popular media owner creates ZERO content.

The world’s leading music platform owns no music.

Our relationship with data is changing

Data is abundant …and cheap.

Keep all data online as long as needed.

Computation is affordable.

Ask bigger questions as fast as you can.

Internet of Things (IoT) – A Revolution In The Making

IoT Markets - 2020

$1.7 trillion in value

20% annual growth

30 billion things

250 million connected vehicles

Source - IDC & Gartner Estimates

A modern data architecture is needed to drive success from data

What is Hadoop?

Current Data Architectures

Limited data. Single access. Platform silos.

[Diagram labels: INTEGRATE & PROCESS; ANALYZE; SERVE]

Modern Data Platform

Unlimited data. Diverse access. One platform.

[Diagram labels: OPERATIONS; DATA MANAGEMENT; UNIFIED SERVICES; PROCESS, ANALYZE, SERVE; STORE; INTEGRATE]

[Diagram labels: INTEGRATE & PROCESS; ANALYZE; SERVE]

[Diagram labels: Distributed Compute; Distributed Data; OPERATIONS; DATA MANAGEMENT; UNIFIED SERVICES; PROCESS, ANALYZE, SERVE; STORE; INTEGRATE]

What’s Driving Hadoop to the Cloud?

Enterprise customers using cloud for big data analytics

Hadoop deployments in cloud are accelerating:

● Executive mandate: minimize on-prem datacenter footprint

● Perceived lower overall TCO

● Increased agility: end-user self-service

● Elasticity: optimize infrastructure usage

Common workloads in the cloud

ETL/Modeling (Data Engineering): Reduce Operating Costs

Only pay for what you need, when you need it

▪ Transient clusters

▪ Elastic workload

▪ Object storage centric

▪ Cloud-native deployment

BI/Analytics (Analytic Database): New Insights, New Revenue

Explore and analyze all data, wherever it lives

▪ Transient or persistent clusters

▪ Sized to demand

▪ HDFS or object storage

▪ Lift-and-shift or cloud-native deployment

App Delivery (Operational Database): Run Without Risk

Enterprise-grade to protect your business, no matter what

▪ Fixed clusters

▪ Periodic sync

▪ All HDFS storage

▪ Lift-and-shift deployment

Crunching 1,000+ Business Metrics per Customer with Sub-Second Responses

• Enables granular targeting of customers

• 50% reduction in marketing execution cost at one bank by focusing on high-potential customers

• Stores and processes thousands of critical events at scale at a low cost

• Provides the flexibility and agility to support customer needs with Cloudera on Amazon Web Services and on premises

CUSTOMER 360

FINANCIAL SERVICES » BEHAVIORAL ANALYTICS » PREDICTIVE ANALYTICS » SCALABLE PROCESSING

Providing a complete view of consumer watching and buying habits

• Helps customers optimize their ad spend for greater campaign ROI

• Improves processing performance as data volumes double

• Boosts agility and flexibility and reduces risk with hybrid and multi-cloud strategy

CUSTOMER 360

MEDIA » CUSTOMER 360° » OMNI-CHANNEL ANALYTICS » SCALABLE PROCESSING

Measure user interaction across the ecosystem, help direct R&D and development spend

• Virtuous cycle: Identify features that facilitate sharing of content that drive new customers

• Real-time streaming and batch data from product logs, web analytics, channel data and ERP

• Impala connects to third-party data wrangling and BI tools for fast reporting

DATA-DRIVEN PRODUCTS

MANUFACTURING » CUSTOMER 360 » DATA DRIVEN PRODUCTS » DATA DRIVEN SERVICES

Key Requirements of Big Data in the Cloud

Elastic: Size compute and storage independently, grow and shrink clusters dynamically, and pay only for what you use on ad-hoc, transient workloads

Hybrid/Multi-Cloud: Preserve business flexibility and data portability and minimize cloud lock-in by running in any one of the three major public cloud providers or in private cloud

Enterprise Grade: Reduce risk with comprehensive manageability, availability, security, and governance required for production big data workloads

How do you do Hadoop in the cloud?

Patterns of Cloud-Native Applications

Flexibility, Self-Service Models, and New Cost Dynamics

● Embrace Transience for Lower Costs

● Decoupled Storage and Compute for Elastic Scale

● Compartmentalize for Greater Isolation

[Diagram labels: Object Store; COMPUTE; 1hr; SPIN UP; SPIN DOWN]
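
The cost argument behind transience can be sketched with simple arithmetic. The hourly rate, node count, and job schedule below are hypothetical numbers chosen for illustration; they are not figures from this presentation or from any provider's price list.

```python
# Hypothetical figures for illustration only; not real provider pricing.
HOURLY_RATE_PER_NODE = 0.50   # assumed cost per node-hour, in dollars
NODES = 20                    # assumed cluster size
HOURS_PER_MONTH = 730         # average number of hours in a month


def always_on_cost(nodes, rate, hours=HOURS_PER_MONTH):
    """Monthly compute cost of a cluster that runs around the clock."""
    return nodes * rate * hours


def transient_cost(nodes, rate, job_hours, runs_per_month):
    """Monthly compute cost when the cluster is spun up per job and spun down afterwards."""
    return nodes * rate * job_hours * runs_per_month


persistent = always_on_cost(NODES, HOURLY_RATE_PER_NODE)
# Assumed schedule: one 1-hour job per day, 30 days a month.
transient = transient_cost(NODES, HOURLY_RATE_PER_NODE, job_hours=1, runs_per_month=30)

print(f"always-on: ${persistent:,.2f}/month")
print(f"transient: ${transient:,.2f}/month")
```

The sketch counts compute only. It leaves out object store charges and cluster provisioning time, both of which would narrow the gap in practice.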

Transient Applications

Transient cluster requirements:

● Object store integration

● Fast cluster provisioning

● Cluster metadata persistence

● Usage-based pricing

Examples of transient clusters in the cloud:

● ETL workflows

● Model training

● Ad hoc analytics

● Dev and test workflows
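
The lifecycle of a transient ETL workflow might be sketched as follows. This is a plain-Python simulation, not Cloudera Director's API or any real cloud SDK: `ObjectStore`, `transient_cluster`, and `run_etl` are hypothetical names, and a dictionary stands in for the object store. The point is only that data lives outside the cluster, so the cluster can be torn down as soon as the job ends.

```python
from contextlib import contextmanager


class ObjectStore:
    """In-memory stand-in for an object store such as S3 (hypothetical)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, value):
        self._objects[key] = value

    def get(self, key):
        return self._objects[key]


@contextmanager
def transient_cluster(name, log):
    """Simulate provisioning a cluster, then always tearing it down."""
    log.append(f"spin up {name}")
    try:
        yield name
    finally:
        # Teardown runs even if the job fails, so no compute is left running.
        log.append(f"spin down {name}")


def run_etl(store, log):
    """Read raw records from the store, clean them, and write the result back."""
    with transient_cluster("etl-cluster", log):
        raw = store.get("raw/events")
        cleaned = [line.strip().lower() for line in raw if line.strip()]
        store.put("clean/events", cleaned)


store = ObjectStore()
store.put("raw/events", [" Click ", "", "VIEW", "  purchase"])
events_log = []
run_etl(store, events_log)

print(events_log)
print(store.get("clean/events"))
```

Because the input and output both sit in the object store, nothing of value is lost when the cluster disappears.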

Persistent Applications

Persistent clusters have similar requirements to on-prem clusters:

• High availability and disaster recovery

• Cluster operational management

• Resource management

• Security

Acquire some benefits from cloud infrastructure:

• Cluster auto-scaling for peak demand

• Ad-hoc dev & test environments

• Capitalize on cheaper “blade” instances

Examples of persistent use cases in the cloud:

• HBase, Solr clusters

• Kafka clusters

• BI analytics

• Busy, multi-user clusters
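
Auto-scaling for peak demand can be illustrated with a small sizing rule. This is a hypothetical policy written for illustration, not the behavior of any Cloudera product: the function name, the tasks-per-worker ratio, and the worker bounds are all assumptions.

```python
import math


def desired_workers(pending_tasks, tasks_per_worker, min_workers, max_workers):
    """Return a worker count sized to the backlog, clamped to fixed bounds.

    A persistent cluster keeps at least `min_workers` running at all times
    and never grows past `max_workers`, whatever the backlog.
    """
    needed = math.ceil(pending_tasks / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


# Assumed bounds: between 3 and 20 workers, 10 tasks per worker.
print(desired_workers(0, 10, 3, 20))     # idle: stays at the floor
print(desired_workers(95, 10, 3, 20))    # moderate load: scales with backlog
print(desired_workers(1000, 10, 3, 20))  # peak: capped at the ceiling
```

The floor reflects the difference from a transient cluster: a persistent cluster shrinks during quiet periods but is never shut down entirely.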

Persistent application

Streaming Architecture

[Diagram labels: Data Sources; Kafka/Flume; Spark Streaming; HBase or Impala/Kudu (beta); Real-Time Serving; Kafka; Application; S3; Hive/Spark/HoS; Batch Data Transformations; Impala; Analytics]

Transient application

Batch Analytics

[Diagram labels: Data Sources; Kafka/Flume; Spark Streaming; HBase or Impala/Kudu (beta); Real-Time Serving; Kafka; Application; S3; Hive/Spark/HoS; Batch Data Transformations; S3; Impala; BI & Analytics]

Combining the two: lambda architecture

[Diagram labels: Data Sources; Kafka/Flume; Spark Streaming; HBase or Impala/Kudu (beta); Real-Time Serving; Kafka; Application; S3; Hive/Spark/HoS; Batch Data Transformations; S3; Impala; BI & Analytics]
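
The idea behind a lambda architecture can be sketched in a few lines: a batch layer periodically recomputes a view over all historical data, a speed layer covers events that arrived since the last batch run, and queries merge the two. The sketch below uses plain Python lists as stand-ins for the S3-backed batch path and the Kafka/Spark Streaming path named on the slide. The function names and sample events are hypothetical.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute page counts from the full historical dataset."""
    return Counter(event["page"] for event in master_dataset)


def speed_view(recent_events):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(event["page"] for event in recent_events)


def query(page, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[page] + speed[page]


# Hypothetical sample data.
historical = [{"page": "home"}, {"page": "home"}, {"page": "pricing"}]
recent = [{"page": "home"}, {"page": "docs"}]

batch = batch_view(historical)
speed = speed_view(recent)

for page in ("home", "docs", "pricing"):
    print(page, query(page, batch, speed))
```

Once a new batch run has absorbed the recent events, the speed view for that period can be discarded. This fits the pairing on the slide, where the batch path can run on transient clusters and the streaming path stays persistent.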

Get started with Cloudera Enterprise in the cloud

Cloudera Director: Deploy and manage Cloudera Enterprise in the cloud environment of your choice

AWS Quickstart: Deploy an enterprise data hub on AWS

Azure Marketplace: Provision and deploy Cloudera Enterprise on the Azure Marketplace

Thank You