Apache Hadoop YARN: Past, Present and Future


TRANSCRIPT

Page 1: Apache Hadoop YARN: Past, Present and Future

1 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Apache Hadoop YARN: Past, Present and Future
Dublin, April 2016
Varun Vasudev

Page 2: Apache Hadoop YARN: Past, Present and Future


About myself

⬢Apache Hadoop contributor since 2014

⬢Apache Hadoop committer

⬢Currently working for Hortonworks

[email protected]

Page 3: Apache Hadoop YARN: Past, Present and Future


Introduction to Apache Hadoop YARN

[Architecture diagram: YARN – the Data Operating System (cluster resource management) – sits on top of HDFS (Hadoop Distributed File System) and underneath the processing engines: Script (Pig) and SQL (Hive) on Tez, Java/Scala (Cascading) on Tez, In-Memory (Spark), Stream (Storm), Search (Solr), and NoSQL (HBase, Accumulo) via Slider, plus other ISV engines – batch, interactive & real-time data access.]

YARN – The Architectural Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases

Page 4: Apache Hadoop YARN: Past, Present and Future


Introduction to Apache Hadoop YARN

⬢Architectural center of big data workloads

⬢Enterprise adoption accelerating
–Secure mode becoming more widespread
–Multi-tenant support
–Diverse workloads

⬢SLAs
–Tolerance for slow-running jobs decreasing
–Consistent performance desired

Page 5: Apache Hadoop YARN: Past, Present and Future


Past – Apache Hadoop 2.6, 2.7

Page 6: Apache Hadoop YARN: Past, Present and Future


Apache Hadoop YARN

[Diagram: ResourceManager (active) and ResourceManager (standby) manage NodeManager 1–4; each node reports its resources (e.g. 128G, 16 vcores) and an optional node label (e.g. "SAS").]

Page 7: Apache Hadoop YARN: Past, Present and Future


Scheduler

[Diagram: a user submits an application ("Application, Queue A, 4G, 1 vcore") to the active ResourceManager. The scheduler splits capacity across queues (Queue A – 50%, Queue B – 25%, Queue C – 25%) with inter-queue pre-emption, FIFO ordering within a queue, an exclusive node label ("SAS"), and reservations for applications.]

Page 8: Apache Hadoop YARN: Past, Present and Future


Application Lifecycle

[Diagram: Node 1 runs a NodeManager with 128G, 16 vcores. The Application 1 AM process is launched via a ContainerExecutor (DCE, LCE, or WSCE), with memory and CPU monitored and isolated. The AM requests containers from the active ResourceManager; allocated containers (Container 1 and Container 2 processes) are launched on nodes via DCE, LCE, or WSCE, again with memory and CPU monitoring/isolation. Logs are aggregated to HDFS, and history is served by the History Server (ATS – leveldb, JHS – HDFS).]

Page 9: Apache Hadoop YARN: Past, Present and Future


Operational support

⬢Support added for work-preserving restarts in the RM and the NM

⬢Support added for rolling upgrades and downgrades from 2.6 onwards

Page 10: Apache Hadoop YARN: Past, Present and Future


Recent releases

⬢2.6 and 2.7 maintenance releases are carried out
–Only blockers and critical fixes are added

⬢Apache Hadoop 2.7
–2.7.3 should be out soon
–2.7.2 released in January 2016
–2.7.1 released in July 2015

⬢Apache Hadoop 2.6
–2.6.4 released in February 2016
–2.6.3 released in December 2015
–2.6.2 released in October 2015

Page 11: Apache Hadoop YARN: Past, Present and Future


Present – Apache Hadoop 2.8

Page 12: Apache Hadoop YARN: Past, Present and Future


YARN

[Diagram: the same ResourceManager (active/standby) and NodeManager 1–4 architecture, with two 2.8 additions – node resources (128G, 16 vcores) can be auto-calculated from the hardware, and NodeManager resources can be reconfigured dynamically; node label "SAS" unchanged.]

Page 13: Apache Hadoop YARN: Past, Present and Future


NodeManager resource management

⬢Options to report NM resources based on node hardware
–YARN-160
–Restart of the NM required to enable the feature

⬢Alternatively, admins can use the rmadmin command to update a node's resources
–YARN-291
–Reads dynamic-resource.xml
–No restart of the NM or the RM required
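As a rough illustration of the second option: a dynamic-resource.xml along the lines below describes a node's new capacity for the RM to pick up. The property-name pattern here is an assumption based on YARN-291 – verify it against the documentation for your Hadoop release. (The related CLI path is `yarn rmadmin -updateNodeResource <NodeID> <MemSize> <vCores>`.)

```xml
<!-- Illustrative dynamic-resource.xml (property names assumed from YARN-291;
     check your release's docs). Sets node h1:1234 to 64G / 16 vcores. -->
<configuration>
  <property>
    <name>yarn.resource.dynamic.nodes</name>
    <value>h1:1234</value>
  </property>
  <property>
    <name>yarn.resource.dynamic.h1:1234.memory</name>
    <value>65536</value>
  </property>
  <property>
    <name>yarn.resource.dynamic.h1:1234.vcores</name>
    <value>16</value>
  </property>
</configuration>
```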

Page 14: Apache Hadoop YARN: Past, Present and Future


YARN Scheduler

[Diagram: the scheduler picture from Page 7, annotated with 2.8 changes – improvements to inter-queue pre-emption, the "SAS" label now non-exclusive, priority/FIFO and fair ordering within a queue, support for application priority on submission ("Application, Queue A, 4G, 1 vcore"), and a cost-based placement agent for reservations.]

Page 15: Apache Hadoop YARN: Past, Present and Future


Scheduler

⬢Support for application priority within a queue
–YARN-1963
–Users can specify application priority
–Specified as an integer; a higher number means higher priority
–Application priority can be updated while the application is running

⬢Improvements to reservations
–YARN-2572
–Support for a cost-based placement agent added in addition to the greedy one

⬢Queue allocation policy can be switched to fair sharing
–YARN-3319
–Containers allocated on a fair-share basis instead of FIFO
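To make the FIFO-vs-fair-sharing distinction concrete, here is a toy sketch (not YARN code – app names and numbers are made up) of how the two ordering policies divide the next free containers among applications in one queue:

```python
# Toy sketch of intra-queue ordering policies: FIFO lets the oldest app take
# everything it asked for; fair sharing spreads containers across apps.

def fifo_allocate(apps, free_containers):
    """apps: list of (name, pending demand), already ordered by submit time.
    The first app takes all it can before later apps get anything."""
    grants = {name: 0 for name, _ in apps}
    for name, demand in apps:
        take = min(demand, free_containers)
        grants[name] = take
        free_containers -= take
    return grants

def fair_allocate(apps, free_containers):
    """Hand out one container at a time round-robin, approximating fair share."""
    grants = {name: 0 for name, _ in apps}
    pending = dict(apps)
    while free_containers > 0 and any(pending.values()):
        for name in pending:
            if free_containers == 0:
                break
            if pending[name] > 0:
                grants[name] += 1
                pending[name] -= 1
                free_containers -= 1
    return grants

apps = [("app-1", 8), ("app-2", 4), ("app-3", 4)]  # (name, pending containers)
print(fifo_allocate(apps, 8))  # {'app-1': 8, 'app-2': 0, 'app-3': 0}
print(fair_allocate(apps, 8))  # {'app-1': 3, 'app-2': 3, 'app-3': 2}
```

With FIFO, a large early application starves later ones; with fair sharing the same 8 containers are spread across all three applications, which is why YARN-3319 matters for multi-tenant queues.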

Page 16: Apache Hadoop YARN: Past, Present and Future


Scheduler

⬢Support for non-exclusive node labels
–YARN-3214
–Improvement over the exclusive partitions that existed earlier
–Better for cluster utilization

⬢Improvements to pre-emption

Page 17: Apache Hadoop YARN: Past, Present and Future


Application Lifecycle

[Diagram: the lifecycle picture from Page 8, annotated with 2.8 changes – support added for graceful decommissioning of the NodeManager; the Application 1 AM and the containers can run as Docker containers (alpha); disk and network isolation via CGroups (alpha) added alongside memory/CPU monitoring and isolation; support added to resize containers; the History Server is ATS 1.5 (leveldb + HDFS) and JHS (HDFS); logs aggregated to HDFS.]

Page 18: Apache Hadoop YARN: Past, Present and Future


Application Lifecycle

⬢Graceful decommissioning of NodeManagers
–YARN-914
–Drains a node that's being decommissioned to allow running containers to finish

⬢Resource isolation support for disk and network
–YARN-2619, YARN-2140
–Containers get a fair share of disk and network resources using CGroups
–Alpha feature

⬢Docker support in LinuxContainerExecutor
–YARN-3853
–Support to launch Docker containers alongside process containers
–Alpha feature
–Talk by Sidharta Seethana at 12:20 tomorrow in Liffey A

Page 19: Apache Hadoop YARN: Past, Present and Future


Application Lifecycle

⬢Support for container resizing
–YARN-1197
–Allows applications to change the size of an existing container

⬢ATS 1.5
–YARN-4233
–Stores timeline events on HDFS
–Better scalability and reliability

Page 20: Apache Hadoop YARN: Past, Present and Future


Operational support

⬢Improvements to existing tools (like yarn logs)

⬢New tools added (yarn top)

⬢Improvements to the RM UI to expose more details about running applications

Page 21: Apache Hadoop YARN: Past, Present and Future


Future

Page 22: Apache Hadoop YARN: Past, Present and Future


Drivers for changes

⬢Changing workload types
–Workloads have moved from batch to batch + interactive
–Workloads will change to batch + interactive + services

⬢Big data workloads continue to evolve
–Spark on YARN is the most popular way to run Spark in production

⬢Containerization has taken off
–Docker becoming extremely popular

⬢Improve ease of operations
–Easier to debug application failures/poor performance
–Make overall cluster management easier
–Improve existing tools such as yarn logs, yarn top, etc.

Page 23: Apache Hadoop YARN: Past, Present and Future


Apache Hadoop YARN

[Diagram: the ResourceManager/NodeManager architecture, annotated with future work – add support for arbitrary resource types beyond "128G, 16 vcores", add support for federation to let YARN scale out, and a new RM UI; node label "SAS" unchanged.]

Page 24: Apache Hadoop YARN: Past, Present and Future


Future work

⬢Support for arbitrary resource types and resource profiles
–YARN-3926
–Admins can add arbitrary resource types for scheduling
–Users can specify a resource profile name instead of individual resources

⬢YARN federation
–YARN-2915
–Allows YARN to scale out to tens of thousands of nodes
–A cluster of clusters which appears as a single cluster to an end user

⬢New RM UI
–YARN-3368
–Enhanced usability
–Easier to add new features
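The resource-profile idea above can be sketched as a named mapping from a profile to a full resource vector – a toy illustration, not the YARN-3926 API; the profile names and resource types are made-up examples:

```python
# Toy sketch of resource profiles: users request a profile name ("small")
# instead of spelling out every resource type; admins define the mapping,
# which can include arbitrary resource types beyond memory and vcores.

PROFILES = {
    "small":  {"memory-mb": 2048, "vcores": 1},
    "medium": {"memory-mb": 4096, "vcores": 2},
    "large":  {"memory-mb": 8192, "vcores": 4, "gpus": 1},  # arbitrary type
}

def resolve_request(profile, overrides=None):
    """Expand a profile name into a concrete resource vector; individual
    resources can still be overridden explicitly."""
    request = dict(PROFILES[profile])
    request.update(overrides or {})
    return request

print(resolve_request("medium"))                 # full vector for "medium"
print(resolve_request("small", {"vcores": 2}))   # profile with an override
```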

Page 25: Apache Hadoop YARN: Past, Present and Future


Scheduler

[Diagram: the scheduler picture, annotated with future work – support for intra-queue pre-emption in addition to inter-queue pre-emption, resource profiles on application submission ("Application, Queue A"), a new scheduler API, and scheduling based on actual resource usage.]

Page 26: Apache Hadoop YARN: Past, Present and Future


Future work

⬢New scheduler features
–YARN-4902
–Support richer placement strategies such as affinity and anti-affinity

⬢Support pre-emption within a queue
–YARN-4781

⬢More improvements to pre-emption
–YARN-4108, YARN-4390

⬢Scheduling based on actual resource usage
–YARN-1011
–Nodes report actual memory and CPU usage so the scheduler can make better decisions
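What an anti-affinity placement constraint means can be shown with a toy sketch (not the YARN-4902 API – node names and the container tag are made-up examples): never co-locate two containers carrying the same tag on one node.

```python
# Toy sketch of anti-affinity placement: pick the first node that does not
# already run a container with the given tag, so replicas spread out.

def place_anti_affinity(tag, nodes, placements):
    """`placements` maps node -> set of tags already running there.
    Returns the chosen node, or None if the constraint can't be met."""
    for node in nodes:
        if tag not in placements.get(node, set()):
            placements.setdefault(node, set()).add(tag)
            return node
    return None

nodes = ["node1", "node2", "node3"]
placements = {}
print(place_anti_affinity("hbase-region", nodes, placements))  # node1
print(place_anti_affinity("hbase-region", nodes, placements))  # node2
print(place_anti_affinity("hbase-region", nodes, placements))  # node3
print(place_anti_affinity("hbase-region", nodes, placements))  # None
```

Affinity is the mirror image: prefer a node that already hosts the tag (e.g. to keep chatty containers together).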

Page 27: Apache Hadoop YARN: Past, Present and Future


Application Lifecycle

[Diagram: the lifecycle picture, annotated with future work – distributed scheduling added on the NodeManager; a new scheduler API allowing far more powerful placement strategies; the AM and containers run as processes or Docker containers with container restart supported; disk and network isolation alongside memory/CPU; History Server on ATS v2 (HBase) and JHS (HDFS); log aggregation to HDFS; and a DNS service.]

Page 28: Apache Hadoop YARN: Past, Present and Future


Future work

⬢Distributed scheduling
–YARN-2877, YARN-4742
–NMs run a local scheduler
–Allows faster scheduling turnaround

⬢Better support for disk and network isolation
–Tied to supporting arbitrary resource types

⬢Enhance Docker support
–YARN-3611
–Support to mount volumes
–Isolate containers using CGroups

Page 29: Apache Hadoop YARN: Past, Present and Future


Future work – support for services

⬢YARN-4692

⬢Container restart
–YARN-3988
–Allow container restart without losing the allocation

⬢Service discovery via DNS
–YARN-4757
–Running services can be discovered via DNS

⬢Allocation re-use
–YARN-4726
–Allow AMs to stop a container without losing resources on the node
–Required for application upgrades
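The DNS-discovery idea can be sketched as deriving a stable, well-formed DNS name from the user, service, and component – a toy illustration only; the naming scheme and domain below are assumptions, not the YARN-4757 format:

```python
# Toy sketch of DNS-based service discovery: compose a predictable hostname
# for a service component so clients can find it with an ordinary DNS lookup.
# DNS labels allow only lowercase alphanumerics and hyphens, so sanitize.

def service_dns_name(user, service, component, domain="ycluster.example.com"):
    """Build '<component>.<service>.<user>.<domain>' with sanitized labels."""
    def label(s):
        return "".join(c if c.isalnum() or c == "-" else "-" for c in s.lower())
    return ".".join([label(component), label(service), label(user), domain])

print(service_dns_name("alice", "hbase_prod", "master"))
# master.hbase-prod.alice.ycluster.example.com
```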

Page 30: Apache Hadoop YARN: Past, Present and Future


Future work

⬢ATS v2
–YARN-2928
–Run the timeline service on HBase
–Support for more data, better performance

⬢Also in the pipeline
–Switch to Java 8 with Hadoop 3.0
–Add support for GPU isolation
–Better tools to detect limping nodes

Page 31: Apache Hadoop YARN: Past, Present and Future


Thank you!