
TRANSCRIPT

Page 1: Performance Isolation and Fairness for Multi-Tenant Cloud Storage


Performance Isolation and Fairness for Multi-Tenant Cloud Storage

David Shue*, Michael Freedman*, and Anees Shaikh✦

*Princeton ✦IBM Research

Page 2: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Setting: Shared Storage in the Cloud

[Figure: tenants Z, Y, T, and F issuing requests to a shared key-value storage service, alongside other shared services such as S3, EBS, and SQS]

Page 3: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention

[Figure: requests from tenants Z, Y, T, and F contending on the data nodes of the shared key-value storage]

Page 4: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention

Fair queuing @ big iron

[Figure: in a centralized "big iron" system, fair queuing at the single storage box resolves tenant contention]

Page 5: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation

[Figure: tenant requests spread across many storage servers (S), so no single queue sees all traffic]

Page 6: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation

[Figure: each tenant keyspace (Z, Y, T, F) is split into data partitions of varying popularity and distributed across the storage servers]

Page 7: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Predictable Performance is Hard

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: workload mix of 1kB GETs (large reads), 10B GETs (small reads), 1kB SETs (large writes), and 10B SETs (small writes)]

Page 8: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Tenants Want System-wide Resource Guarantees

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: tenants Zynga, Yelp, TP, and Foursquare sharing the key-value store, with system-wide guarantees of 80, 120, 160, and 40 kreq/s across the four tenants and demand_F = 120 kreq/s]

Page 9: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Provides Weighted Fair-shares

Multiple co-located tenants ⇒ resource contention
Distributed system ⇒ distributed resource allocation
Skewed object popularity ⇒ variable per-node demand
Disparate workloads ⇒ different bottleneck resources

[Figure: per-tenant weights w_z = 20%, w_y = 30%, w_t = 10%, w_f = 40%, with demand_z = 30% and demand_f = 30%]

Page 10: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces: Predictable Shared Cloud Storage

• Pisces
- Per-tenant max-min fair shares of system-wide resources ~ min guarantees, high utilization
- Arbitrary object popularity
- Different resource bottlenecks

• Amazon DynamoDB
- Per-tenant provisioned rates ~ rate limited, non-work conserving
- Uniform object popularity
- Single resource (1kB requests)
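Weighted max-min fair sharing admits the classic water-filling construction: repeatedly grant each unsatisfied tenant its weighted share of the remaining capacity, capping tenants at their demand. A minimal sketch of that computation, with weights taken from slide 9 but capacity and the Y/T demands made up for illustration:

```python
def weighted_max_min(capacity, weights, demands):
    """Water-filling: tenants are capped at their demand; leftover
    capacity is redistributed among unsatisfied tenants by weight."""
    alloc = {t: 0.0 for t in weights}
    active = set(weights)
    remaining = capacity
    while active and remaining > 1e-9:
        total_w = sum(weights[t] for t in active)
        # Tentative weighted share of the remaining capacity.
        share = {t: remaining * weights[t] / total_w for t in active}
        satisfied = {t for t in active if alloc[t] + share[t] >= demands[t]}
        if not satisfied:
            for t in active:
                alloc[t] += share[t]
            break
        for t in satisfied:
            remaining -= demands[t] - alloc[t]
            alloc[t] = demands[t]
        active -= satisfied
    return alloc

# Weights mirror slide 9; capacity and Y/T demands are hypothetical.
print(weighted_max_min(
    capacity=400,  # kreq/s
    weights={"Z": 0.2, "Y": 0.3, "T": 0.1, "F": 0.4},
    demands={"Z": 120, "Y": 150, "T": 30, "F": 120},
))  # -> Z: 100, Y: 150, T: 30, F: 120
```

Because T and F demand less than their weighted shares, their surplus flows back to Z and Y; this is the work-conserving behavior that distinguishes Pisces from DynamoDB's non-work-conserving provisioned rates.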

Page 11: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Tenant A

Predictable Multi-Tenant Key-Value Storage

[Figure: Tenant A and Tenant B VMs issue GET requests through request routers (RR) to storage nodes; a central controller drives partition placement (PP), while replica selection (RS) runs at the routers and fair queuing (FQ) at the storage nodes]

Page 12: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Tenant A

Predictable Multi-Tenant Key-Value Storage

[Figure: the same architecture with global per-tenant weights (Weight_A, Weight_B) decomposed by weight allocation (WA) into local weights W_A1, W_B1 and W_A2, W_B2 at each storage node]

Page 13: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Strawman: Place Partitions Randomly

[Figure: partitions of Tenant A and Tenant B assigned to random storage nodes by the controller]

Page 14: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Strawman: Place Partitions Randomly

[Figure: random placement concentrates popular partitions, leaving one storage node overloaded]

Page 15: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces: Place Partitions By Fairness Constraints

Collect per-partition tenant demand, then bin-pack partitions across nodes.

[Figure: the controller migrates partitions according to measured per-partition demand]

Page 16: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces: Place Partitions By Fairness Constraints

Results in a feasible partition placement.

[Figure: partitions rebalanced so every node satisfies its tenants' fairness and capacity constraints]
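The talk summarizes this step as demand collection plus bin-packing; it does not give the algorithm itself. A minimal greedy sketch of the idea, assuming hypothetical per-partition demand estimates and a uniform node capacity, places each partition on the currently least-loaded node:

```python
def place_partitions(partitions, node_capacity, num_nodes):
    """Greedy bin-packing sketch: take partitions in decreasing demand
    order, put each on the least-loaded node, and fail if the capacity
    constraint would be violated (i.e. no feasible placement found)."""
    load = [0.0] * num_nodes
    placement = {}
    for pid, demand in sorted(partitions.items(), key=lambda kv: -kv[1]):
        node = min(range(num_nodes), key=lambda n: load[n])
        if load[node] + demand > node_capacity:
            raise RuntimeError(f"no feasible placement for {pid}")
        load[node] += demand
        placement[pid] = node
    return placement

# Hypothetical per-partition demands (kreq/s) packed onto 4 nodes.
demands = {"p0": 90, "p1": 70, "p2": 60, "p3": 40, "p4": 40, "p5": 30}
print(place_partitions(demands, node_capacity=110.0, num_nodes=4))
```

Sorting by decreasing demand places the heavy partitions first, which is what keeps any single node from ending up overloaded as in the random strawman.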

Page 17: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Strawman: Allocate Local Weights Evenly

W_A1 = W_B1 and W_A2 = W_B2

[Figure: even local weights ignore skewed per-node tenant demand, overloading one node]

Page 18: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces: Allocate Local Weights By Tenant Demand

Compute each tenant's +/- mismatch between local demand and local weight, find the maximum mismatch, and perform a reciprocal weight swap: starting from W_A1 = W_B1 and W_A2 = W_B2, tenant A takes weight from B on node 1 (A←B) and cedes weight to B on node 2 (A→B), ending with W_A1 > W_B1 and W_A2 < W_B2.
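A highly simplified sketch of one swap round for two tenants follows; the real mechanism computes a maximum bottleneck flow over all nodes (see page 23), and the step size and data layout here are assumptions:

```python
def reciprocal_swap(w, d, step=0.05):
    """One round of a (simplified) reciprocal weight swap.
    w[node][tenant]: local weight fractions; d[node][tenant]: observed
    local demand fractions. Assumes exactly two tenants per node."""
    # Positive mismatch: the tenant's demand exceeds its local weight.
    m = {n: {t: d[n][t] - w[n][t] for t in w[n]} for n in w}
    # The node/tenant pair with the largest unmet demand gains weight...
    n1, gainer = max(((n, t) for n in m for t in m[n]),
                     key=lambda nt: m[nt[0]][nt[1]])
    loser = next(t for t in w[n1] if t != gainer)
    # ...and gives the same amount back on the node where it has the
    # most surplus, so each tenant's global (summed) weight is unchanged.
    n2 = min((n for n in w if n != n1), key=lambda n: m[n][gainer])
    w[n1][gainer] += step; w[n1][loser] -= step
    w[n2][gainer] -= step; w[n2][loser] += step
    return w

# Slide scenario: even weights, but A's demand is skewed toward node 1.
w = {1: {"A": 0.5, "B": 0.5}, 2: {"A": 0.5, "B": 0.5}}
d = {1: {"A": 0.7, "B": 0.3}, 2: {"A": 0.3, "B": 0.7}}
print(reciprocal_swap(w, d))  # -> W_A1 > W_B1 and W_A2 < W_B2
```

The swap is what preserves each tenant's global share while reshaping its local shares to match where its demand actually lands.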

Page 19: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Strawman: Select Replicas Evenly

[Figure: the request router splits tenant A's GETs 50%/50% across replicas even though the local weights are uneven (W_A1 > W_B1, W_A2 < W_B2)]

Page 20: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Tenant A

Pisces: Select Replicas By Local Weight

Detect weight mismatch by request latency.

[Figure: the request router shifts from a 50%/50% to a 60%/40% replica split, matching the local weights W_A1 > W_B1 and W_A2 < W_B2]
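The talk says replica selection is FAST-TCP based (page 23) and driven by observed request latency, but does not show the update rule. A sketch in that spirit, scaling each replica's traffic share by the ratio of a baseline latency to its observed latency (gamma, the baseline, and the latency values are all illustrative):

```python
def update_replica_shares(shares, latency, base_latency, gamma=0.2):
    """FAST-TCP-inspired multiplicative update (sketch): replicas whose
    observed latency exceeds the baseline lose traffic share; the shares
    are then renormalized to sum to 1."""
    raw = {r: s * ((1 - gamma) + gamma * base_latency / latency[r])
           for r, s in shares.items()}
    total = sum(raw.values())
    return {r: v / total for r, v in raw.items()}

# node2's queue is longer (higher latency), so the router drifts tenant
# A's traffic from 50/50 toward the 60/40 split of the slide. In a real
# system the latencies respond to the shifted load, so shares settle.
shares = {"node1": 0.5, "node2": 0.5}
for _ in range(5):
    shares = update_replica_shares(
        shares, {"node1": 10.0, "node2": 15.0}, base_latency=10.0)
print(shares)
```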

Page 21: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Strawman: Queue Tenants By Single Resource

[Figure: tenant A is bandwidth limited, tenant B is request limited; queuing on a single bottleneck resource (out bytes) gives a fair share of that resource only, leaving the other resource misallocated]

Page 22: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces: Queue Tenants By Dominant Resource

Track a per-tenant resource vector (out bytes, requests) and fair-share each tenant's dominant resource: the bandwidth-limited tenant is bottlenecked by out bytes, the request-limited tenant by request rate.
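A sketch of the dominant-resource idea: charge each request as a vector of resource usage, define a tenant's cost as its largest share of any resource, and always serve the backlogged tenant with the smallest weighted dominant share. This illustrates DRF-style scheduling rather than the talk's DRR token-based implementation (page 23); the capacities and request sizes are made up:

```python
def drfq_order(requests, capacity, weights):
    """Dominant-resource fair scheduling (sketch). Each request consumes
    a vector of resources, e.g. (bytes out, request slots). A tenant's
    dominant share is max over resources of usage/capacity."""
    used = {t: [0.0] * len(capacity) for t in weights}
    pending = {t: list(reqs) for t, reqs in requests.items()}
    order = []
    while any(pending.values()):
        # Serve the tenant with the minimum weighted dominant share.
        t = min((t for t in pending if pending[t]),
                key=lambda t: max(u / c for u, c in zip(used[t], capacity))
                              / weights[t])
        for i, amount in enumerate(pending[t].pop(0)):
            used[t][i] += amount
        order.append(t)
    return order

# Tenant A sends bandwidth-heavy 1kB GETs, tenant B request-heavy 10B
# GETs; the resource vector is (bytes out, request slots).
print(drfq_order({"A": [(1000, 1)] * 4, "B": [(10, 1)] * 4},
                 capacity=(4000.0, 8.0), weights={"A": 1.0, "B": 1.0}))
```

In the example the scheduler interleaves B's small requests between A's large ones until both tenants have consumed an equal share of their respective dominant resources, which is exactly the behavior the single-resource strawman cannot produce.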

Page 23: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Mechanisms Solve For Global Fairness

The four mechanisms operate at different timescales and levels of system visibility, from global and slow to local and fast:

- Partition Placement (PP): minutes, global visibility; fairness and capacity constraints (controller)
- Weight Allocation (WA): seconds, global visibility; demand-driven maximum bottleneck flow weight exchange
- Replica Selection (RS): seconds, local visibility; FAST-TCP based replica selection (request routers)
- Fair Queuing (FQ): microseconds, local visibility; DRR token-based DRFQ scheduler (storage nodes)

Page 24: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Evaluation

• Does Pisces achieve (even) system-wide fairness?
- Is each Pisces mechanism necessary for fairness?
- What is the overhead of using Pisces?
• Does Pisces handle mixed workloads?
• Does Pisces provide weighted system-wide fairness?
• Does Pisces provide local dominant resource fairness?
• Does Pisces handle dynamic demand?
• Does Pisces adapt to changes in object popularity?


Page 26: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Achieves System-wide Per-tenant Fairness

Setup: 8 tenants, 8 clients, 8 storage nodes; Zipfian object popularity distribution; ideal fair share of 110 kreq/s (1kB requests).
Metric: Min-Max Ratio (MMR) = min rate / max rate, in (0, 1].

[Figure: per-tenant throughput over time; unmodified Membase achieves 0.57 MMR vs. 0.98 MMR for Pisces]
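The MMR metric itself is a one-liner; for instance:

```python
def min_max_ratio(rates):
    """Min-Max Ratio (MMR): min tenant rate / max tenant rate, in (0, 1].
    1.0 means perfectly even shares; small values mean unfair shares."""
    return min(rates) / max(rates)

# Illustrative rates (kreq/s): eight tenants near the 110 kreq/s ideal
# share vs. a skewed run. The numbers are made up, not from the talk.
print(min_max_ratio([108, 111, 109, 112, 110, 107, 113, 110]))  # ~0.95
print(min_max_ratio([60, 150, 90, 140, 70, 160, 120, 90]))      # ~0.38
```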

Page 27: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Each Pisces Mechanism Contributes to System-wide Fairness and Isolation

[Figure: MMR for unmodified Membase vs. Pisces with mechanisms incrementally enabled (FQ; PP+FQ; WA+PP+FQ; RS+WA+PP+FQ), each under 1x and 2x tenant demand. Fairness degrades sharply without the full stack (values as low as 0.36 MMR) and stays high with all four mechanisms (0.97-0.98 MMR)]

Page 28: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Imposes Low Overhead

[Figure: throughput overhead relative to unmodified Membase; < 5% in one measured workload and > 19% in the other]

Page 29: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Achieves System-wide Weighted Fairness

[Figure: weighted fair shares across 4 heavy-hitter, 20 moderate-demand, and 40 low-demand tenants; reported MMRs of 0.56, 0.89, 0.91, 0.91, and 0.98 across the configurations]

Page 30: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Achieves Dominant Resource Fairness

[Figure: bandwidth (Mb/s) and GET request rate (kreq/s) over time; the bandwidth-limited 1kB workload receives 76% of the bandwidth but only 24% of the request rate, while the request-limited 10B workload receives 76% of the request rate]

Page 31: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Pisces Adapts to Dynamic Demand

[Figure: tenant demand patterns (constant, bursty, and diurnal at 2x weight); the 2x-weight tenant receives ~2x the throughput while the equal-weight tenants stay even]

Page 32: Performance Isolation and Fairness for Multi-Tenant Cloud Storage

Conclusion

• Pisces contributions
- Per-tenant weighted max-min fair shares of system-wide resources w/ high utilization
- Arbitrary object distributions
- Different resource bottlenecks
- Novel decomposition into 4 complementary mechanisms: Partition Placement (PP), Weight Allocation (WA), Replica Selection (RS), Fair Queuing (FQ)