2013 Nov 20 Toronto Hadoop User Group (THUG) - Hadoop 2.2.0


DESCRIPTION

Our Hadoop 2.2.0 Overview for the Toronto Hadoop User Group. Go THUG life.

TRANSCRIPT


Hadoop 2.2.0: Hadoop grows up

Page 1

Adam Muise


Rob Ford says…

…turn off your #*@!#%!!! Mobile Phones!

Page 2


YARN: Yet Another Resource Negotiator


A new abstraction layer

HADOOP 1.0: MapReduce (cluster resource management & data processing) running directly on HDFS (redundant, reliable storage). A single-use system: batch apps.

HADOOP 2.0: YARN (cluster resource management) on HDFS2 (redundant, reliable storage), with MapReduce and other frameworks (data processing) on top. A multi-purpose platform: batch, interactive, online, streaming, …

Page 4


Concepts

• Application
  – An application is a job submitted to the framework
  – Example: a MapReduce job
• Container
  – Basic unit of allocation
  – Fine-grained resource allocation across multiple resource types (memory, CPU, disk, network, GPU, etc.), e.g.:
    – container_0 = 2 GB, 1 CPU
    – container_1 = 1 GB, 6 CPU
  – Replaces the fixed map/reduce slots (see the request sketch below)
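
A minimal sketch of such a fine-grained ask against the YARN client API (assuming the Hadoop 2.2.0 AMRMClient; the 2 GB / 1 vcore values mirror container_0 above, and ApplicationMaster registration/heartbeat plumbing is omitted):

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ContainerAsk {
  public static void main(String[] args) {
    AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
    rm.init(new YarnConfiguration());
    rm.start();
    // container_0 from the slide: 2 GB of memory and 1 virtual core
    Resource capability = Resource.newInstance(2048, 1);
    // nodes/racks left null: no locality constraint on this ask
    rm.addContainerRequest(new ContainerRequest(capability, null, null, Priority.newInstance(0)));
    // (registerApplicationMaster and the allocate() heartbeat are omitted in this sketch)
  }
}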

Page 5


YARN Architecture

• ResourceManager
  – Global resource scheduler
  – Hierarchical queues
• NodeManager
  – Per-machine agent
  – Manages the life-cycle of containers
  – Container resource monitoring
• ApplicationMaster
  – Per-application
  – Manages application scheduling and task execution
  – E.g., the MapReduce ApplicationMaster

Page 6


YARN Architecture - Walkthrough

[Diagram: a client (Client2) submits an application to the ResourceManager's Scheduler; a per-application ApplicationMaster (AM 1, AM 2) is launched on a NodeManager, and each AM is then granted containers (Container 1.1-1.3, Container 2.1-2.4) spread across the fleet of NodeManagers.]


YARN as OS for the Data Lake

[Diagram: one ResourceManager/Scheduler drives mixed workloads across the same NodeManagers: Batch MapReduce (map 1.1, map 1.2, reduce 1.1), Interactive SQL (vertex 1.1.1, 1.1.2, 1.2.1, 1.2.2), and Real-Time Storm (nimbus0, nimbus1, nimbus2), side by side.]


Multi-Tenant YARN

[Diagram: the ResourceManager Scheduler splits cluster capacity over a queue hierarchy rooted at root (Adhoc 10%, DW 60%, Mrkting 30%), with nested sub-queues such as Dev 10% / Reserved 20% / Prod 70%, Prod 80% / Dev 20%, and P0 70% / P1 30%.]


Multi-Tenancy with New Capacity Scheduler

•  Queues
  – Hierarchical queues
•  Economics as queue-capacity
  – SLAs
  – Preemption
•  Resource isolation
  – Linux: cgroups
  – MS Windows: Job Control
  – Roadmap: virtualization (Xen, KVM)
•  Administration
  – Queue ACLs
  – Run-time re-configuration for queues
  – Charge-back

Page 10

[Diagram: the ResourceManager Scheduler's queue tree again, here with root split into Adhoc 10%, DW 70% and Mrkting 20%, plus nested sub-queues (Dev 10% / Reserved 20% / Prod 70%, Prod 80% / Dev 20%, P0 70% / P1 30%).]

Capacity Scheduler: hierarchical queues (a configuration sketch follows below).
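
A sketch of how such a hierarchy is expressed; the queue names mirror the diagram, and in a real cluster these yarn.scheduler.capacity.* properties live in capacity-scheduler.xml rather than being set in code:

import org.apache.hadoop.conf.Configuration;

public class QueueCapacities {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Children of root, per the diagram: Adhoc 10%, DW 70%, Mrkting 20%
    conf.set("yarn.scheduler.capacity.root.queues", "adhoc,dw,mrkting");
    conf.set("yarn.scheduler.capacity.root.adhoc.capacity", "10");
    conf.set("yarn.scheduler.capacity.root.dw.capacity", "70");
    conf.set("yarn.scheduler.capacity.root.mrkting.capacity", "20");
    // DW splits again: Dev 10%, Reserved 20%, Prod 70%
    conf.set("yarn.scheduler.capacity.root.dw.queues", "dev,reserved,prod");
    conf.set("yarn.scheduler.capacity.root.dw.dev.capacity", "10");
    conf.set("yarn.scheduler.capacity.root.dw.reserved.capacity", "20");
    conf.set("yarn.scheduler.capacity.root.dw.prod.capacity", "70");
    System.out.println("root children: " + conf.get("yarn.scheduler.capacity.root.queues"));
  }
}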


MapReduce v2: Changes to MapReduce on YARN


MapReduce v2 is a library now
•  MapReduce runs on YARN like all other Hadoop 2.x applications
  – Gone are the map and reduce slots; that is up to containers in YARN now
  – Gone is the JobTracker, replaced by the YARN ApplicationMaster library
•  Multiple versions of MapReduce
  – The older mapred APIs work without modification or recompilation
  – The newer mapreduce APIs may need to be recompiled
•  Still has one master server component: the Job History Server
  – The Job History Server stores the execution history of jobs
  – Used to audit prior execution of jobs
  – Will also be used by the YARN framework to store charge-backs at that level
A minimal driver sketch follows below.
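
For illustration, a minimal driver against the newer mapreduce API; the same code that targeted an MRv1 JobTracker runs unchanged under a YARN ApplicationMaster (the input/output paths and map-only job are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MiniDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "mrv2-demo");
    job.setJarByClass(MiniDriver.class);
    job.setNumReduceTasks(0); // map-only: identity mappers by default
    FileInputFormat.addInputPath(job, new Path("/weblogs/in"));    // hypothetical
    FileOutputFormat.setOutputPath(job, new Path("/weblogs/out")); // hypothetical
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}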

Page 12


Shuffle in MapReduce v2
•  Faster shuffle
  – Better embedded server: Netty
•  Encrypted shuffle
  – Secures the shuffle phase as data moves across the cluster
  – Requires 2-way HTTPS, with certificates on both sides
  – Incurs significant CPU overhead; reserve 1 core for this work
  – Certs are stored on each node (provisioned with the cluster) and refreshed every 10 secs
•  Pluggable shuffle/sort
  – Shuffle is the first phase in MapReduce that is guaranteed not to be data-local
  – Pluggable shuffle/sort lets intrepid application or hardware developers intercept the network-heavy workload and optimize it
  – Typical implementations pair hardware components (like fast networks) with software components (like sorting algorithms)
  – The API will change with future versions of Hadoop
A configuration sketch for the encrypted shuffle follows below.
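
A hedged sketch of enabling the encrypted shuffle described above (property names per the Hadoop 2.x encrypted-shuffle documentation; the keystore/truststore material in ssl-server.xml and ssl-client.xml is assumed and not shown):

import org.apache.hadoop.conf.Configuration;

public class EncryptedShuffleConfig {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setBoolean("mapreduce.shuffle.ssl.enabled", true);  // reducers fetch map output over HTTPS
    conf.setBoolean("hadoop.ssl.require.client.cert", true); // 2-way HTTPS: certs on both sides
    System.out.println("encrypted shuffle: " + conf.get("mapreduce.shuffle.ssl.enabled"));
  }
}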

Page 13


Efficiency Gains of MRv2

• Key optimizations
  – No hard segmentation of resources into map and reduce slots
  – The YARN scheduler is more efficient
  – The MRv2 framework is more efficient than MRv1; the shuffle phase in MRv2 is faster thanks to Netty

• Yahoo has over 30,000 nodes running YARN across over 365 PB of data.

• They calculate running about 400,000 jobs per day for about 10 million hours of compute time.

• They also estimate a 60-150% improvement in node usage per day.

• Yahoo retired a whole colo (a 10,000-node datacenter) because of the increased utilization.


HDFS v2 in a Nutshell


HA

Page 16


HDFS Snapshots: Feature Overview

•  Admins can create point-in-time snapshots of HDFS
  – Of the entire file system (/root)
  – Of a specific data set (a sub-tree directory of the file system)
•  Restore the state of the entire file system or of a data set to a snapshot (like Apple Time Machine)
  – Protects against user errors
•  Snapshot diffs identify changes made to a data set
  – Keep track of how raw or derived/analytical data changes over time
A client API sketch follows below.
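
A small sketch of the client-side snapshot call (the /data/raw directory and snapshot name are hypothetical, and an admin must first mark the directory snapshottable with hdfs dfsadmin -allowSnapshot):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnapshotSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path dir = new Path("/data/raw");        // hypothetical snapshottable directory
    fs.createSnapshot(dir, "before-reload"); // point-in-time, read-only view
    // The snapshot is then visible under /data/raw/.snapshot/before-reload
    // and can be used to restore files or to diff against later state.
  }
}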

Page 17


NFS Gateway: Feature Overview

•  NFS v3 standard
•  Supports all HDFS commands
  – List files
  – Copy, move files
  – Create and delete directories
•  Ingest for large-scale analytical workloads
  – Load immutable files as source for analytical processing
  – No random writes
•  Stream files into HDFS
  – Log ingest by applications writing directly to the HDFS client mount (a toy ingest sketch follows below)
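
As a toy illustration, once the gateway export is mounted, ordinary file APIs stream data into HDFS (the /hdfs mount point and log line are hypothetical):

import java.io.FileWriter;
import java.io.PrintWriter;

public class NfsIngest {
  public static void main(String[] args) throws Exception {
    // Append-only writes through the NFS mount land in HDFS (no random writes)
    try (PrintWriter log = new PrintWriter(new FileWriter("/hdfs/weblogs/app.log", true))) {
      log.println("2013-11-20T19:00:00Z GET /index.html 200");
    }
  }
}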


Federation

Page 19


Managing Namespaces

Page 20


Performance

Page 21


Other Features

Page 22


Apache Tez: A New Hadoop Data Processing Framework

Page 23


Moving Hadoop Beyond MapReduce
•  Low-level data-processing execution engine
•  Built on YARN
•  Enables pipelining of jobs
•  Removes task and job launch times
•  Does not write intermediate output to HDFS
  – Much lighter disk and network usage
•  New base for MapReduce, Hive, Pig, Cascading, etc.
•  Hive and Pig jobs no longer need to move to the end of the queue between steps in the pipeline


Apache Tez as the new Primitive

HADOOP 1.0: Pig (data flow), Hive (SQL) and others (Cascading) run on MapReduce (cluster resource management & data processing) over HDFS (redundant, reliable storage). MapReduce is the base.

HADOOP 2.0: Pig (data flow), Hive (SQL) and others (Cascading) run on Tez (execution engine) over YARN (cluster resource management) and HDFS2 (redundant, reliable storage), alongside batch MapReduce, real-time stream processing (Storm) and online data processing (HBase, Accumulo). Apache Tez is the base.


Hive-on-MR vs. Hive-on-Tez

SELECT a.x, AVERAGE(b.y) AS avg FROM a JOIN b ON (a.id = b.id) GROUP BY a
UNION SELECT x, AVERAGE(y) AS AVG FROM c GROUP BY x
ORDER BY AVG;

[Diagram: Hive-on-MR runs the plan as a chain of separate MapReduce jobs (SELECT a.state / SELECT b.id → JOIN(a, b), GROUP BY a.state, COUNT(*) → SELECT c.price → JOIN(a, c) → AVERAGE(c.price)), each job writing its intermediate output to HDFS for the next to read back. Hive-on-Tez runs the same plan as a single DAG (SELECT a.state, c.itemId / SELECT b.id → JOIN(a, b), GROUP BY a.state, COUNT(*) → JOIN(a, c), AVERAGE(c.price)): Tez avoids unneeded writes to HDFS.]


Apache Tez (“Speed”)
• Replaces MapReduce as the primitive for Pig, Hive, Cascading, etc.
  – Lower latency for interactive queries
  – Higher throughput for batch queries
  – 22 contributors: Hortonworks (13), Facebook, Twitter, Yahoo, Microsoft
• A YARN ApplicationMaster runs a DAG of Tez tasks
• Each task has a pluggable Input, Processor and Output: Tez Task = <Input, Processor, Output>
A toy model of the triple follows below.
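
To make the triple concrete, here is a toy model of Task = <Input, Processor, Output>; these interfaces are illustrative only, not Tez's actual classes:

import java.util.Arrays;
import java.util.List;

interface Input<T> { Iterable<T> read(); }
interface Output<T> { void write(T record); }
interface Processor<I, O> { void run(Input<I> in, Output<O> out); }

public class TezTriple {
  public static void main(String[] args) {
    List<String> lines = Arrays.asList("hello", "tez");
    Input<String> in = () -> lines;                  // stands in for an HDFS or Shuffle input
    Output<String> out = r -> System.out.println(r); // stands in for a Sorted or HDFS output
    Processor<String, String> upper = (i, o) -> {    // the pluggable processing logic
      for (String s : i.read()) o.write(s.toUpperCase());
    };
    upper.run(in, out);
  }
}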


Tez: Building blocks for scalable data processing

Classical ‘Map’: a Map Processor with an HDFS Input and a Sorted Output.

Classical ‘Reduce’: a Reduce Processor with a Shuffle Input and an HDFS Output.

Intermediate ‘Reduce’ for Map-Reduce-Reduce: a Reduce Processor with a Shuffle Input and a Sorted Output.


Hive

Page 29


SQL: Enhancing SQL Semantics

Hive SQL datatypes: INT, TINYINT, SMALLINT, BIGINT, BOOLEAN, FLOAT, DOUBLE, STRING, TIMESTAMP, BINARY, DECIMAL, ARRAY, MAP, STRUCT, UNION, DATE, VARCHAR, CHAR.

Hive SQL semantics: SELECT, INSERT; GROUP BY, ORDER BY, SORT BY; JOIN on explicit join key; inner, outer, cross and semi joins; sub-queries in the FROM clause; ROLLUP and CUBE; UNION; windowing functions (OVER, RANK, etc.); custom Java UDFs; standard aggregation (SUM, AVG, etc.); advanced UDFs (ngram, XPath, URL); sub-queries in WHERE and HAVING; expanded JOIN syntax; SQL-compliant security (GRANT, etc.); INSERT/UPDATE/DELETE (ACID).

(Slide legend: each item is marked either Available in Hive 0.12 or Roadmap.)

SQL Compliance: Hive 0.12 provides a wide array of SQL datatypes and semantics so your existing tools integrate more seamlessly with Hadoop. A JDBC sketch follows below.
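
Since these semantics surface through HiveServer2, a hedged JDBC sketch (the host, port, credentials and sales table are hypothetical; RANK() OVER is one of the windowing functions listed above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbc {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver"); // HiveServer2 JDBC driver
    try (Connection conn = DriverManager.getConnection("jdbc:hive2://hs2:10000/default", "hdfs", "");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "SELECT state, price, RANK() OVER (PARTITION BY state ORDER BY price DESC) AS r FROM sales")) {
      while (rs.next()) {
        System.out.println(rs.getString(1) + "\t" + rs.getDouble(2) + "\t" + rs.getInt(3));
      }
    }
  }
}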


SPEED: Increasing Hive Performance

Performance improvements included in Hive 0.12:
  – Base and advanced query optimization
  – Startup time improvement
  – Join optimizations

Interactive query times across ALL use cases:
•  Simple and advanced queries in seconds
•  Integrates seamlessly with existing tools
•  Currently a >100x improvement in just nine months


Tez on YARN

[Diagram: the earlier cluster view, now with Hive/Tez (SQL) vertices (vertex1.1.1, 1.1.2, 1.2.1, 1.2.2) running alongside batch MapReduce (map 1.1, map 1.2, reduce 1.1) and real-time Storm (nimbus0, nimbus1, nimbus2), all under one ResourceManager/Scheduler.]


Apache Falcon: Data Lifecycle Management for Hadoop


Data Lifecycle on Hadoop is Challenging

Data management needs: data processing, replication, retention, scheduling, reprocessing, multi-cluster management.
Tools: Oozie, Sqoop, DistCp, Flume, MapReduce, Hive and Pig jobs.

Problem: a patchwork of tools complicates data lifecycle management. Result: long development cycles and quality challenges.


Falcon: One-stop Shop for Data Lifecycle

Apache Falcon provides and orchestrates:
Data management needs: data processing, replication, retention, scheduling, reprocessing, multi-cluster management.
Tools: Oozie, Sqoop, DistCp, Flume, MapReduce, Hive and Pig jobs.

Falcon provides a single interface to orchestrate the data lifecycle. Sophisticated DLM is easily added to Hadoop applications.


Falcon Core Capabilities
•  Core functionality
  – Pipeline processing
  – Replication
  – Retention
  – Late data handling
•  Automates
  – Scheduling and retry
  – Recording audit, lineage and metrics
•  Operations and management
  – Monitoring, management, metering
  – Alerts and notifications
  – Multi-cluster federation
•  CLI and REST API (a REST sketch follows below)
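
A hedged sketch of that REST surface, assuming the entity-listing endpoint from the Apache Falcon docs and a hypothetical host, port and user:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class FalconList {
  public static void main(String[] args) throws Exception {
    // List registered feed entities: GET api/entities/list/:entity-type
    URL url = new URL("http://falcon:15000/api/entities/list/feed?user.name=falcon");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      for (String line; (line = in.readLine()) != null; ) System.out.println(line);
    }
  }
}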


Falcon At A Glance

>  Falcon offers a high-level abstraction of key services for Hadoop data management needs.
>  Complex data processing logic is handled by Falcon instead of being hard-coded in data processing apps.
>  Falcon enables faster development of ETL, reporting and other data processing apps on Hadoop.

[Diagram: data processing applications sit on the Falcon data management framework, which provides data import and replication, scheduling and coordination, data lifecycle policies, multi-cluster management, and SLA management.]


Falcon Example: Replication

>  Falcon manages workflow and replication.
>  Enables business continuity without requiring full data representation.
>  Failover clusters can be smaller than primary clusters.

[Diagram: data sets (staged, cleansed, conformed, processed, access) on the primary cluster, with two Falcon replication flows copying selected stages to the failover cluster.]


Falcon Example: Retention

>  Sophisticated retention policies expressed in one place.
>  Simplify data retention for audit, compliance, or data re-processing.

  – Staged data: retain 20 years
  – Cleansed data: retain 3 years
  – Conformed data: retain 3 years
  – Access data: retain last copy only


Falcon Example: Late Data Handling

>  Processing waits until all required input data is available.
>  Checks for late data arrivals and retriggers processing as necessary.
>  Eliminates writing complex data-handling rules within applications.

[Diagram: online transaction data (via Sqoop) and web log data (via FTP) land as staged data and are joined into a combined dataset; the process waits up to 4 hours for the FTP data to arrive.]


Examples

Page 43


Example: Cluster Specification

<?xml version="1.0"?>
<!-- My Local Cluster specification -->
<cluster colo="my-local-cluster" description="" name="cluster-alpha">
  <interfaces>
    <!-- readonly and write point at the NameNode -->
    <interface type="readonly" endpoint="hftp://nn:50070" version="2.2.0"/>
    <interface type="write" endpoint="hdfs://nn:8020" version="2.2.0"/>
    <!-- execute points at the Resource Manager -->
    <interface type="execute" endpoint="rm:8050" version="2.2.0"/>
    <!-- workflow points at the Oozie server -->
    <interface type="workflow" endpoint="http://os:11000/oozie/" version="4.0.0"/>
    <interface type="messaging" endpoint="tcp://mq:61616?daemon=true" version="5.1.6"/>
  </interfaces>
  <locations>
    <location name="staging" path="/apps/falcon/cluster-alpha/staging"/>
    <location name="temp" path="/tmp"/>
    <location name="working" path="/apps/falcon/cluster-alpha/working"/>
  </locations>
</cluster>

Page 44



Example: Weblogs Replication and Retention

Page 45


Example 1: Weblogs
•  Weblogs land hourly in my primary cluster
•  The HDFS location is /weblogs/{date}
•  I want to:
  – Evict weblogs from the primary cluster after 1 day

Page 46


Feed Specification 1: Weblogs

Page 47

<feed description="" name="feed-weblogs1" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <!-- the cluster where the data is located -->
    <cluster name="cluster-primary" type="source">
      <validity start="2013-10-24T00:00Z" end="2014-12-31T00:00Z"/>
      <!-- retention policy: 1 day -->
      <retention limit="days(1)" action="delete"/>
    </cluster>
  </clusters>
  <!-- the location of the data -->
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="hdfs" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>


Example 2: Weblogs
•  Weblogs land hourly in my primary cluster
•  The HDFS location is /weblogs/{date}
•  I want to:
  – Replicate weblogs to my secondary cluster
  – Evict weblogs from the primary cluster after 2 days
  – Evict weblogs from the secondary cluster after 1 week

Page 48


Feed Specification 2: Weblogs

<feed description="" name="feed-weblogs2" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <!-- the cluster where the data is located; retention policy: 2 days -->
    <cluster name="cluster-primary" type="source">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(2)" action="delete"/>
    </cluster>
    <!-- the cluster where the data will be replicated; retention policy: 1 week -->
    <cluster name="cluster-secondary" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
    </cluster>
  </clusters>
  <!-- the location of the data -->
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="hdfs" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>


Example 3: Weblogs
•  Weblogs land hourly in my primary cluster
•  The HDFS location is /weblogs/{date}
•  I want to:
  – Replicate weblogs to a discovery cluster
  – Replicate weblogs to a BCP cluster
  – Evict weblogs from the primary cluster after 2 days
  – Evict weblogs from the discovery cluster after 1 week
  – Evict weblogs from the BCP cluster after 3 months

Page 50


Feed Specification 3: Weblogs

<feed description="" name="feed-weblogs" xmlns="uri:falcon:feed:0.1">
  <frequency>hours(1)</frequency>
  <clusters>
    <cluster name="cluster-primary" type="source">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(2)" action="delete"/>
    </cluster>
    <cluster name="cluster-discovery" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="days(7)" action="delete"/>
      <!-- cluster-specific location -->
      <locations>
        <location type="data" path="/projects/recommendations/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
      </locations>
    </cluster>
    <cluster name="cluster-bcp" type="target">
      <validity start="2012-01-01T00:00Z" end="2099-12-31T00:00Z"/>
      <retention limit="months(3)" action="delete"/>
      <!-- cluster-specific location -->
      <locations>
        <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
      </locations>
    </cluster>
  </clusters>
  <locations>
    <location type="data" path="/weblogs/${YEAR}-${MONTH}-${DAY}-${HOUR}"/>
  </locations>
  <ACL owner="hdfs" group="users" permission="0755"/>
  <schema location="/none" provider="none"/>
</feed>


Apache Knox: Secure Access to Hadoop


Connecting to the Cluster: Edge Nodes
•  What is an edge node?
  – A node in a DMZ that has access to the cluster; the only way to access the cluster
  – Hadoop client APIs and MR/Pig/Hive jobs are executed from these edge nodes
  – Users SSH to the edge node, upload all job artifacts, and then execute API calls and commands from a shell

Page 53

[Diagram: a Hadoop user connects over SSH to an edge node in front of the cluster.]

•  Challenges
  – SSH, edge node, and job maintenance nightmare
  – Difficult to integrate with applications


Connecting to the Cluster: REST API

•  Useful for connecting to Hadoop from outside the cluster
•  When more client-language flexibility is required
  – i.e., when a Java binding is not an option
•  Challenges
  – The client must have knowledge of the cluster topology
  – Ports must be opened outside the cluster (in some cases, on every host)

Page 54

Service: WebHDFS. API: supports HDFS user operations including reading files, writing to files, making directories, changing permissions and renaming.
Service: WebHCat. API: job control for MapReduce, Pig and Hive jobs, and HCatalog DDL commands.
Service: Oozie. API: job submission and management, and Oozie administration.
A WebHDFS sketch follows below.
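
For instance, a WebHDFS read is one plain HTTP call; a minimal sketch (the namenode host/port and file path are hypothetical; this is the kind of request Knox fronts):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class WebHdfsRead {
  public static void main(String[] args) throws Exception {
    // WebHDFS operation OPEN: read a file over HTTP
    URL url = new URL("http://nn:50070/webhdfs/v1/weblogs/2013-11-20-00?op=OPEN&user.name=hdfs");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setInstanceFollowRedirects(true); // the NN redirects the read to a DN
    try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
      for (String line; (line = in.readLine()) != null; ) System.out.println(line);
    }
  }
}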


Apache Knox Gateway – Perimeter Security

Page 55

Simplified Access
•  Single Hadoop access point
•  Rationalized REST API hierarchy
•  Consolidated API calls
•  Multi-cluster support
•  Client DSL

Centralized Security
•  Eliminate the SSH “edge node”
•  LDAP and ActiveDirectory auth
•  Central API management + audit


Knox Gateway Network Architecture

Page 56

[Diagram: REST clients, JDBC clients, Ambari clients and browsers cross a firewall into a DMZ hosting a stateless cluster of Knox Gateway reverse-proxy instances (GW x3), wired to identity providers (Kerberos/enterprise identity provider, enterprise/cloud SSO provider) and to Ambari/Hue servers. Behind a second firewall sit Secure Hadoop Clusters 1 and 2, each with masters (JT, NN, WebHCat, Oozie, YARN, HBase, Hive) and workers (DN, TT). Requests are streamed through the gateway to Hadoop services after authentication, and URLs are rewritten to refer to the gateway.]


Wot no 2.2.0? Where can I get the Hadoop 2.2.0 fix?

Page 57


Like the Truth, Hadoop 2.2.0 is out there…

Page 58

Component         HDP 2.0   CDH4         CDH5 Beta   Intel IDH 3.0   MapR 3   IBM BigInsights 2.1
Hadoop Common     2.2.0     2.0.0        2.2.0       2.0.4           N/A      1.1.1
Hive + HCatalog   0.12      0.10 + 0.5   0.11        0.10 + 0.5      0.11     0.9 + 0.4
Pig               0.12      0.11         0.11        0.10            0.11     0.10
Mahout            0.8       0.7          0.8         0.8             0.8      N/A
Flume             1.4.0     1.4.0        1.4.0       1.3.0           1.4.0    1.3.0
Oozie             4.0.0     3.3.2        4.0.0       3.3.0           3.3.2    3.2.0
Sqoop             1.4.4     1.4.3        1.4.4       1.4.3           1.4.4    1.4.2
HBase             0.96.0    0.94.6       0.95.2      0.94.7          0.94.9   0.94.3


Thank You THUG Life
