applied research group - university of washington · applied research group systems+database people...

85

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 2: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 3: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Applied research groupSystems+database people building prototypes, publishing papers

Page 4: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Applied research groupSystems+database people building prototypes, publishing papers

Collaborating with Big Data product group at MSShipping our code to production

Page 5: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Applied research groupSystems+database people building prototypes, publishing papers

Collaborating with Big Data product group at MSShipping our code to production

Open-sourcing our codeApache Hadoop, REEF, Heron

Page 6: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Resource

management

Distributed

tiered storage

Query

optimization

Log analyticsStream

processing

Page 7: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Resource

management

Distributed

tiered storage

Query

optimization

Log analyticsStream

processing

Page 8: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

Page 9: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

Page 10: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

Page 11: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

Page 12: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

Page 13: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

Page 14: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Page 15: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Page 16: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Page 17: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Node Manager

Node Manager

Node Manager

1. Request

2. Allocation

3. Start task

Do we really need a Resource Manager?

Page 18: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Ad-hocapp

Ad-hocapp

Ad-hocapp

Ad-hocApps

YARN

MR

v2Tez Giraph Storm Dryad

REEF

...

Hive / Pig

Hadoop 1.x

(MapReduce)

MR v1

Hive / Pig

Users

Application

Frameworks

Programming

Model(s)

Cluster OS

(Resource

Management)

Hadoop 1 World Hadoop 2 World

File System

HDFS 1 HDFS 2

Hardware

Ad-hocapp

Ad-hocapp

Scope

on

YARNSpark

monolithic

Heron

Page 19: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Ad-hocapp

Ad-hocapp

Ad-hocapp

Ad-hocApps

YARN

MR

v2Tez Giraph Storm Dryad

REEF

...

Hive / Pig

Hadoop 1.x

(MapReduce)

MR v1

Hive / Pig

Users

Application

Frameworks

Programming

Model(s)

Cluster OS

(Resource

Management)

Hadoop 1 World Hadoop 2 World

File System

HDFS 1 HDFS 2

Hardware

Ad-hocapp

Ad-hocapp

Scope

on

YARNSpark

monolithic

• Reuse of RM component

Heron

Page 20: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Ad-hocapp

Ad-hocapp

Ad-hocapp

Ad-hocApps

YARN

MR

v2Tez Giraph Storm Dryad

REEF

...

Hive / Pig

Hadoop 1.x

(MapReduce)

MR v1

Hive / Pig

Users

Application

Frameworks

Programming

Model(s)

Cluster OS

(Resource

Management)

Hadoop 1 World Hadoop 2 World

File System

HDFS 1 HDFS 2

Hardware

Ad-hocapp

Ad-hocapp

Scope

on

YARNSpark

monolithic

• Reuse of RM component

• YARN

layering abstractions

Heron

Page 21: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 22: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

But is all this good enough for the Microsoft clusters?

Page 23: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

High resource

utilizationScalability

Workload

heterogeneity

Production jobs

and

predictability

Page 24: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

100% Utilization

Page 25: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 26: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

0

Page 27: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Wide variety

Page 28: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Wide variety

Page 29: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Wide variety

Page 30: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

deadlines

recurring>60%

• Predictability

over-provisioned

Page 31: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Rayon/Morpheus:

• Mercury/Yaq:

• YARN Federation:

• Medea:

4 Hadoop committers in CISL

404 patches as of last night

Page 32: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Rayon/Morpheus:

• Mercury/Yaq:

• YARN Federation:

• Medea:

4 Hadoop committers in CISL

404 patches as of last night

Page 33: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

[Hadoop 3.0; ATC 2015, EuroSys 2016]

Page 34: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

Page 35: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j1

Page 36: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j1

Page 37: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 38: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 39: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 40: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 41: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 42: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

• Feedback delays

idle between allocations

Page 43: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Feedback delays

idle between allocations

5 sec 10 sec 50 sec Mixed-5-50 Cosmos-gm

60.59% 78.35% 92.38% 78.54% 83.38%

N1 N2

RM

j2

Page 44: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Feedback delays

idle between allocations

• Actual

5 sec 10 sec 50 sec Mixed-5-50 Cosmos-gm

60.59% 78.35% 92.38% 78.54% 83.38%

N1 N2

RM

j2

Page 45: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Introduce task queuing at nodes• Mask feedback delays

• Improve cluster utilization

• Improve task throughput (by up to 40%)

• Container types• GUARANTEED and OPPORTUNISTIC

• Keep guarantees for important jobs

• Use opportunistic execution to improve utilization

Page 46: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

Page 47: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

Page 48: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j1

Page 49: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j1

Page 50: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 51: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 52: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 53: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 54: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 55: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 56: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2

RM

j2

Page 57: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 58: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Page 59: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Page 60: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

So all we need to do is use long queues?

Page 61: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers
Page 62: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

can be detrimental for job completion times• Despite the utilization gains

Page 63: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

can be detrimental for job completion times• Despite the utilization gains

Proper queue management techniques are required

Page 64: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

Page 65: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

Page 66: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

Page 67: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

Page 68: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Place tasks to

node queues

Prioritize task

execution

(queue reordering)

Bound queue

lengths

Page 69: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

Place tasks to

node queues

Prioritize task

execution

(queue reordering)

Bound queue

lengths

Yaq improves median job completion time by 1.7x over YARN

Page 70: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

RM

Page 71: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

RM

queue length

Page 72: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

RM

queue length

Page 73: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

queue length

N1 N2 N3

RM

Page 74: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

RM

queue length

queue wait time

Page 75: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

N1 N2 N3

RM

queue length

queue wait time

Page 76: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Shortest Remaining Job First (SRJF)

• Least Remaining Tasks First (LRTF)

Page 77: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Shortest Remaining Job First (SRJF)

• Least Remaining Tasks First (LRTF)

N1 N2 N3

RM

j2: 5 tasks

j3: 9 tasks

j1: 21 tasks

Page 78: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Shortest Remaining Job First (SRJF)

• Least Remaining Tasks First (LRTF)

N1 N2 N3

RM

j2: 5 tasks

j3: 9 tasks

j1: 21 tasks

Page 79: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Shortest Remaining Job First (SRJF)

• Least Remaining Tasks First (LRTF)

N1 N2 N3

RM

j2: 5 tasks

j3: 9 tasks

j1: 21 tasks

Page 80: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Shortest Remaining Job First (SRJF)

• Least Remaining Tasks First (LRTF)

job-aware

N1 N2 N3

RM

j2: 5 tasks

j3: 9 tasks

j1: 21 tasks

Page 81: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

lower throughput

longer job completion times

Page 82: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• 1.7x improvement in median JCT over YARN

Page 83: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

• Container types

distributed scheduling

any distributed scheduler

over-commitment

multi-tenancy

• Pricing

Page 84: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers

cluster utilization

queue management techniques

job completion time

Page 85: Applied research group - University of Washington · Applied research group Systems+database people building prototypes, publishing papers