DEVNET-1141: Dynamic Dockerized Hadoop Provisioning

Docker-Based Hadoop Provisioning on Cisco InterCloud. Speakers: Dmitri Chtchourov (Innovation Architect, CIS CTO Group, Cisco) and Rakesh Saha (Product Management, Hortonworks).

Upload: cisco-devnet

Post on 08-Jan-2017


Page 1: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Docker-Based Hadoop Provisioning on Cisco InterCloud

Dmitri Chtchourov, Innovation Architect, CIS CTO Group, Cisco

Rakesh Saha, Product Management, Hortonworks

Page 2: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

© Hortonworks Inc. 2011 – 2015. All Rights Reserved

Cautionary Statement Regarding Forward-Looking Statements

This presentation contains forward-looking statements involving risks and uncertainties. Such forward-looking statements in this presentation generally relate to future events, our ability to increase the number of support subscription customers, the growth in usage of the Hadoop framework, our ability to innovate and develop the various open source projects that will enhance the capabilities of the Hortonworks Data Platform, anticipated customer benefits and general business outlook. In some cases, you can identify forward-looking statements because they contain words such as “may,” “will,” “should,” “expects,” “plans,” “anticipates,” “could,” “intends,” “target,” “projects,” “contemplates,” “believes,” “estimates,” “predicts,” “potential” or “continue” or similar terms or expressions that concern our expectations, strategy, plans or intentions. You should not rely upon forward-looking statements as predictions of future events. We have based the forward-looking statements contained in this presentation primarily on our current expectations and projections about future events and trends that we believe may affect our business, financial condition and prospects. We cannot assure you that the results, events and circumstances reflected in the forward-looking statements will be achieved or occur, and actual results, events, or circumstances could differ materially from those described in the forward-looking statements.

The forward-looking statements made in this prospectus relate only to events as of the date on which the statements are made and we undertake no obligation to update any of the information in this presentation.

Trademarks

Hortonworks is a trademark of Hortonworks, Inc. in the United States and other jurisdictions.  Other names used herein may be trademarks of their respective owners.

Page 3: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

© 2014 Cisco and/or its affiliates. All rights reserved. Cisco Confidential

Speakers

Rakesh Saha, Product Management, Hortonworks

Dmitri Chtchourov, Innovation Architect, CIS CTO Group, Cisco

Page 4: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Agenda

• About Hortonworks
• Cloudbreak – Docker-based Hadoop provisioning tool
• Introduction to Docker
• Hadoop Provisioning using Docker
• Cisco and Hortonworks Collaboration

Page 5: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

About Hortonworks

• Only 100% open source Apache Hadoop data platform
• Founded in 2011
• First Hadoop distribution to go public: IPO Fall 2014 (NASDAQ: HDP)
• 322 subscription customers
• 600+ employees across 17 countries
• 1000+ technology partners

Page 6: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Hortonworks Mission:

Power your Modern Data Architecture with HDP and Enterprise Apache Hadoop

Customer Momentum
• 300+ customers in seven quarters, growing at 75+/quarter
• Two thirds of customers come from F1000

Hortonworks and Hadoop at Scale
• HDP in production on largest clusters on planet
• Multiple 1000+ node clusters, including 35,000 nodes at Yahoo!, 800 nodes at Spotify

Company
• Founded in 2011
• Original 24 architects, developers, operators of Hadoop from Yahoo!
• We are leaders in Hadoop community
• 500+ employees

Page 7: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

HDP is deeply integrated in the data center

[Diagram: HDP within the data center]
• Sources: existing systems, clickstream, web & social, geolocation, sensor & machine, server logs, unstructured
• Data system: RDBMS, EDW, MPP, HDP
• Applications
• Surrounding layers: operational tools, dev & data tools, infrastructure
• HDP components: Governance & Integration, Security, Operations, Data Access, Data Management, YARN

Deep Partnerships
Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Cisco, Microsoft, EMC, Pivotal, Teradata, Red Hat, SAS & SAP.

Broad Partnerships
Over 1,000 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users.

Page 8: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Agenda

• Cloudbreak
• Docker
• Provisioning
• Collaboration

Page 9: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Cloudbreak

• Developed by SequenceIQ
• Open source with Apache 2.0 license [ Apache project soon ]
• Deploys selected services to public and private cloud via Ambari Blueprints
• Elastic – can spin up any number of nodes, add/remove on the fly
• Provides full cloud lifecycle management post-deployment

Page 10: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Launch HDP on Any Cloud for Any Application

Cloudbreak
1. Pick a Blueprint
2. Choose a Cloud
3. Launch HDP!

Example Ambari Blueprints:
• IoT Apps (Storm, HBase, Hive)
• BI / Analytics (Hive)
• Data Science (Spark)
• Dev / Test (all HDP services)
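
An Ambari Blueprint is a JSON document that names a stack and lists host groups together with the components each group runs. The sketch below builds a small one in Python to show that shape. The blueprint name, host-group names, component selection, and stack version are illustrative assumptions and do not come from the slides.

```python
import json

def make_blueprint(name, stack_version, host_groups):
    """Build an Ambari-style blueprint document as a plain dict.

    host_groups maps a group name to (cardinality, [component names]).
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for group, (cardinality, components) in host_groups.items()
        ],
    }

# Hypothetical "Data Science" layout: one master group, three workers.
blueprint = make_blueprint(
    "data-science-sketch",  # assumed name
    "2.2",                  # assumed HDP version
    {
        "master": (1, ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"]),
        "worker": (3, ["DATANODE", "NODEMANAGER"]),
    },
)

print(json.dumps(blueprint, indent=2))
```

This layout is abbreviated for readability; a deployable blueprint usually lists more components and may carry a configurations section.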

Page 11: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Hadoop in Cloud Provisioning with Cloudbreak

1. Create Templates
2. Provide Blueprint
3. Associate Credentials
4. Launch Cluster
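
The four steps above can be pictured as assembling one request object before anything is launched. The sketch below models that in Python. Every class, field, and value is hypothetical and chosen for illustration; it is not the Cloudbreak API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Template:
    """Infrastructure shape: what kind of VMs to request (fields are assumed)."""
    name: str
    instance_type: str
    volume_count: int

@dataclass(frozen=True)
class Credential:
    """Reference to a cloud account; a real tool would store secrets securely."""
    name: str
    provider: str

@dataclass(frozen=True)
class ClusterRequest:
    """Everything that has to be chosen before launch."""
    template: Template
    blueprint_name: str
    credential: Credential
    node_count: int

def launch_plan(request):
    """Return the ordered actions a provisioner might take; nothing is executed."""
    return [
        f"use credential '{request.credential.name}' on {request.credential.provider}",
        f"create {request.node_count} x {request.template.instance_type} "
        f"from template '{request.template.name}'",
        f"apply blueprint '{request.blueprint_name}'",
    ]

request = ClusterRequest(
    template=Template("small-workers", "m1.large", volume_count=2),
    blueprint_name="bi-analytics",
    credential=Credential("team-account", "openstack"),
    node_count=4,
)
for step in launch_plan(request):
    print(step)
```

Keeping the template, blueprint, and credential as separate objects mirrors the slide: each can be reused across clusters and combined at launch time.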

Page 12: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Provisioning: Template

Create Template → Provide Blueprint → Associate Credentials → Launch Cluster

Page 13: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Provisioning: Blueprint

Create Template → Provide Blueprint → Associate Credentials → Launch Cluster

Page 14: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Provisioning: Provider Credentials

Create Template → Provide Blueprint → Associate Credentials → Launch Cluster

Page 15: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Provisioning: Launch

Create Template → Provide Blueprint → Associate Credentials → Launch Cluster

Page 16: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Specialized Blueprints

Quick productivity with pre-configured cluster blueprints
• Lambda Architecture
• Machine Learning
• Batch ETL
• …

Page 17: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Optimize cloud usage via Elastic Clusters

Autoscaling Policy
• Policies based on any Ambari metrics
• Coordinates with YARN
• Policies are based on Metrics or Time
• Scaling can be service or component type specific

[Diagram: example workloads: BI / Analytics (Hive), IoT Apps (Storm, HBase, Hive), Dev / Test (all HDP services), Data Science (Spark)]
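
To make the distinction between metric-based and time-based policies concrete, here is a minimal sketch of how a scaler might evaluate both kinds. The class names, the metric name, the thresholds, and the "first policy that fires wins" rule are all assumptions for illustration, not Cloudbreak's actual policy engine.

```python
from dataclasses import dataclass
from datetime import time

@dataclass(frozen=True)
class MetricPolicy:
    """Scale when a metric exceeds a threshold (metric name is illustrative)."""
    metric: str
    threshold: float
    adjustment: int  # nodes to add (positive) or remove (negative)

    def decide(self, metrics, now):
        value = metrics.get(self.metric)
        if value is not None and value > self.threshold:
            return self.adjustment
        return 0

@dataclass(frozen=True)
class TimePolicy:
    """Scale while the clock is inside a daily window."""
    start: time
    end: time
    adjustment: int

    def decide(self, metrics, now):
        return self.adjustment if self.start <= now < self.end else 0

def target_size(current, policies, metrics, now, minimum=1, maximum=20):
    """Apply the first policy that fires, clamped to [minimum, maximum]."""
    for policy in policies:
        delta = policy.decide(metrics, now)
        if delta:
            return max(minimum, min(maximum, current + delta))
    return current

policies = [
    MetricPolicy("yarn.pending_containers", threshold=50, adjustment=3),
    TimePolicy(time(22, 0), time(23, 59), adjustment=-2),
]

# Busy queue in the morning: the metric policy fires.
print(target_size(5, policies, {"yarn.pending_containers": 80}, time(10, 0)))
# Quiet queue late in the evening: the time policy fires.
print(target_size(5, policies, {"yarn.pending_containers": 10}, time(22, 30)))
```

The minimum and maximum bounds keep a misbehaving metric from shrinking the cluster to nothing or growing it without limit.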

Page 18: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Scaling for Static and Dynamic Clusters

[Diagram: Cloudbreak provisions static and dynamic clusters; each cluster runs Ambari and has its own auto-scale policy; YARN, Ambari Metrics, and Ambari Alerts are shown as inputs]
• Metrics and Alerts feed Cloudbreak
• Cloudbreak enforces policies and scales cluster/YARN apps

Page 19: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Provisioning – How it works

1. Start VMs, each with a running Docker daemon
2. Start Ambari servers/agents via the Swarm API
3. Post Blueprint
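
The final step, posting the blueprint, goes to Ambari's REST API: one POST registers the blueprint and a second POST creates a cluster from it. The sketch below only builds those two requests and never sends them. The host names, credentials, blueprint name, cluster name, and the abbreviated blueprint body are assumptions for illustration, and the earlier steps (starting VMs and Ambari containers) are not shown.

```python
import base64
import json
import urllib.request

def ambari_post(base_url, path, payload, user, password):
    """Build (but do not send) a POST request for the Ambari REST API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/{path}",
        data=json.dumps(payload).encode(),
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari expects this header on modifying calls.
            "X-Requested-By": "ambari",
        },
    )

# Abbreviated blueprint body; too small to deploy as-is.
blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.2"},
    "host_groups": [
        {
            "name": "all",
            "cardinality": "1",
            "components": [{"name": "NAMENODE"}, {"name": "DATANODE"}],
        }
    ],
}

# Cluster creation maps each host group to concrete hosts.
cluster = {
    "blueprint": "sketch",
    "host_groups": [
        {"name": "all", "hosts": [{"fqdn": "amb0.example.internal"}]}
    ],
}

base = "http://ambari.example.internal:8080"  # hypothetical Ambari server
register = ambari_post(base, "blueprints/sketch", blueprint, "admin", "admin")
create = ambari_post(base, "clusters/demo", cluster, "admin", "admin")

for req in (register, create):
    print(req.get_method(), req.full_url)
# Passing a request to urllib.request.urlopen would send it; omitted on purpose.
```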

Page 20: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Agenda

• Cloudbreak
• Docker
• Provisioning
• Collaboration

Page 21: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Docker is a “Shipping Container” System for Code

• Multiplicity of stacks: static website, web frontend, user DB, queue, analytics DB
• Multiplicity of hardware environments: development VM, QA server, public cloud, contributor’s laptop, production cluster, customer data center

An engine that enables any payload to be encapsulated as a lightweight, portable, self-sufficient container

Page 22: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Docker

• Lightweight, portable
• Build once, run anywhere
• VM – without the overhead of a VM
• Isolated containers
• Automated and scripted

Page 23: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Why Is Docker So Exciting?

For Developers: Build once…run anywhere
• A clean, safe, and portable runtime environment for your app
• No missing dependencies, packages, etc.
• Run each app in its own isolated container
• Automate testing, integration, packaging
• Reduce/eliminate concerns about compatibility on different platforms
• Cheap, zero-penalty containers to deploy services

For DevOps: Configure once…run anything
• Make the entire lifecycle more efficient, consistent, and repeatable
• Eliminate inconsistencies between SDLC stages
• Support segregation of duties
• Significantly improves the speed and reliability of CI/CD
• Significantly more lightweight than VMs

Page 24: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Docker: Containers vs. VMs

[Diagram: two stacks compared]
• VM: Server → Host OS → Hypervisor (Type 2) → one Guest OS per VM, each with its own Bins/Libs and app (App A, App A’, App B)
• Container: Server → Host OS kernel → Docker → containers holding App A and App B with their bins/libs, and no guest OS

Containers are isolated, share only the kernel

…result is significantly faster deployment, much less overhead, easier migration, faster restart

Page 25: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Agenda

• Cloudbreak
• Docker
• Provisioning
• Collaboration

Page 26: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

HDP as Docker Containers via Cloudbreak

• Running Ambari Cluster in Containers
• Use Blueprint to define services
• All HDP services share a single container

Run Hadoop as Docker Containers

[Diagram: Cloudbreak, Ambari, and HDP over Docker on Linux VMs, on a cloud provider or bare metal]
• Provisions VMs from cloud providers
• Installs Ambari on the VMs
• Instructs Ambari to build HDP cluster

Page 27: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Swarm + Consul for Placement and Discovery

Page 28: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Run Hadoop as Docker containers

[Diagram: Cloudbreak with six Docker hosts]

Page 29: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Run Hadoop as Docker containers

[Diagram: Cloudbreak with six Docker hosts; one runs the Ambari server container (amb-ser), the other five run Ambari agent containers (amb-agn); a Blueprint is supplied]

Page 30: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Run Hadoop as Docker containers

[Diagram: Cloudbreak with six Docker hosts, labeled with the containers they run]
• amb-ser
• amb-agn-hdfs-hbase
• amb-agn-hdfs-hive
• amb-agn-hdfs-yarn
• amb-agn-hdfs-zookpr
• amb-agn-nmnode-hdfs

Page 31: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Benefits of running Hadoop on Docker

• Quick installation with pre-pulled RPMs
• Same process/images for dev/qa/prod
• Same process for single/multi-node

Page 32: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Demo

Page 42: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Agenda

• Cloudbreak
• Docker
• Provisioning
• Collaboration

Page 43: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Cisco and Hortonworks’ Partnership

• 100% open source Hadoop Distribution, Support and Training
• Integrated Infrastructures for Big Data

Cisco and Hortonworks are partnering to help you build your big data solution and reach massive scalability, superior efficiency and dramatically lower total cost of ownership thanks to a validated joint architecture.

Page 44: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Results of the collaboration

• Efficient Hadoop as a service
• Adoption of Docker for enterprise Hadoop deployment

Task                                  Cisco InterCloud   Public Cloud Provider
HDP installation                      15:04 mins         11:55 mins
Teragen (avg of 3 executions)         7:08 mins          22:15 mins
Terasort (avg of 3 executions)        32:09 mins         60:12 mins
Teravalidate (avg of 3 executions)    2:31 mins          10:40 mins
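
The timings above are easier to compare as ratios. The short sketch below converts each MM:SS value to seconds and divides the public cloud time by the Cisco InterCloud time. It assumes the first time in each row belongs to Cisco InterCloud and the second to the public cloud provider, which is the column order as read from the slide.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# (task, Cisco InterCloud, public cloud provider); column order is assumed.
results = [
    ("HDP installation", "15:04", "11:55"),
    ("Teragen", "7:08", "22:15"),
    ("Terasort", "32:09", "60:12"),
    ("Teravalidate", "2:31", "10:40"),
]

def ratios(rows):
    """Return {task: public time / InterCloud time}; above 1 means InterCloud was faster."""
    return {
        task: to_seconds(public) / to_seconds(intercloud)
        for task, intercloud, public in rows
    }

for task, ratio in ratios(results).items():
    print(f"{task}: {ratio:.2f}x")
```

Under that reading, the three Tera benchmarks ran roughly 1.9 to 4.2 times faster on Cisco InterCloud, while HDP installation took about three minutes longer there.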

Page 45: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Observations
• Docker is maturing inside enterprises
• Interest in running Docker on top of bare metal
• Big data app developers are leaning towards containerization of apps
• YARN is becoming an application deployment platform beyond big data apps
• Demand for native, containerized, fully managed apps on YARN

Future Collaboration
• Run Docker natively on OpenStack
• Run Docker on YARN
• OpenStack bare metal

Page 46: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Conclusion

HDP + Cisco InterCloud - Efficient Hadoop-as-a-service

[Diagram: Blueprints (Data Science, IoT, BI / Analytics, Dev / Test) feeding HDP]

Page 47: DEVNET-1141 Dynamic Dockerized Hadoop Provisioning

Learn More

• Download the Hortonworks Sandbox
• Learn Hadoop
• Build Your Analytic App
• Try Hadoop 2

More about Cisco & Hortonworks: http://hortonworks.com/partner/cisco/

More about Hortonworks’ Acquisition of SequenceIQ: http://bit.ly/1R1ktxO