predicting web service levels during vm live migrations · predicting web service levels during vm...

31
Predicting Web Service Levels During VM Live Migrations Predicting Web Service Levels During VM Live Migrations 5th International DMTF Academic Alliance Workshop on Systems and Virtualization Management: Standards and the Cloud Helmut Hlavacs, Thomas Treutner 24. 10. 2011 1 / 31

Upload: others

Post on 11-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Predicting Web Service Levels During VM Live

Migrations

5th International DMTF Academic Alliance Workshop on Systems

and Virtualization Management: Standards and the Cloud

Helmut Hlavacs, Thomas Treutner

24. 10. 2011

1 / 31

Page 2: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Table of Contents

1 Scenario

2 Experiment

3 Data overview

4 Model selection

5 Conclusions

2 / 31

Page 3: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Abstract

• VM live migration important for energy efficiency• Establish energy efficient target distribution of VMs

• No perceivable downtime while live migrating?

• Live migration is resource intensive (iterative page copying)

• Experiments: Influence on service levels while migrating?• Modelling: Predict service levels based on utilization?

3 / 31

Page 4: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Scenario

Scenario

Scenario

• Virtualized data center, static consolidation (P2V)

• Provisioning for peak load, still bad energy efficiency

• E.g., 9-5-cycles, very low utilization at night

• Live migration enables dynamic consolidation

• But: Seldom used, fear of possible side effects

• ⇒ Identify and quantify effects on (web) service levels

4 / 31

Page 5: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Scenario

Scenario

5 / 31

Page 6: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Experiment

Experiment

Experiment

• 2 servers, 1 VM, migrating forth and back

• VM disk image on central node (Gbit, open-iscsi)

• qemu-kvm VM: Linux, Apache2, PHP5, MediaWiki

• SQL VM and load generation on an extra nodes

• Logging utilization of servers and VM, >100 variables

• Rise load from 50 to 600 concurrent virtual users and back

• Migrate every 15min, track response time of last 5min (SLA=1s)

6 / 31

Page 7: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Experiment

Experiment

7 / 31

Page 8: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

CPU utilization

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

0 20 40 60 80 100 120 140 160 180 200

CP

U U

tiliz

atio

n (

VM

no

rme

d)

Interval

Abbildung: Effect of increasing/decreasing HTTP workload in equal steps on

the VM’s CPU utilization

8 / 31

Page 9: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Load average

0

1

2

3

4

5

6

0 20 40 60 80 100 120 140 160 180 200

Lo

ad

5 A

vera

ge

Interval

Abbildung: Effect of increasing/decreasing HTTP workload in equal steps on

the VM’s UNIX load average.

9 / 31

Page 10: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Markov Queues

• UNIX load as approximation to Q, the average number of jobs

in an M/M/1 queue, CPU utilization as ρ, the system utilization

• QM/M/1 = ρ2

1−ρ

• UNIX load exponentially averaged by definition, service times

are not necessarily exactly exponentially distributed, a

systematic deviation from the theoretical solution is expectable.

• QM/G/1 = ρ2

1−ρ · (1+c2

B)2

, cB: coefficient of service time variation

• Exponentially distributed service times: c2

B = 1

• Deterministic service times: c2

B = 0

• Simple linear regression: c2

B = 0.42

10 / 31

Page 11: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

0

1

2

3

4

5

6

7

8

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

Lo

ad

5 A

vera

ge

CPU Utilization (VM normed)

Measured

M/M/1

M/D/1

M/G/1, c2B=0.42

Abbildung: Queuing issues are rapidly increasing at 60-70% CPU utilization.

The measured data follow the trend of the theoretical solution very closely.

11 / 31

Page 12: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Response Time

0

5000

10000

15000

20000

25000

06 12 18 00 06 12 18 00 06 12

Re

spo

nse

Tim

e [

ms]

Hours

Abbildung: HTTP response times are massively increased during phases

with higher workload.

12 / 31

Page 13: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Service Level

0.82

0.84

0.86

0.88

0.9

0.92

0.94

0.96

0.98

1

0 20 40 60 80 100 120 140 160 180 200

Se

rvic

e L

eve

l R

atio

Interval

Abbildung: Service levels decrease significantly during phases of high

workload. The maximum allowed response time is 1 s, which is extremely

conservative.13 / 31

Page 14: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Load average vs Service Level

0.82

0.84

0.86

0.88

0.9

0.92

0.94

0.96

0.98

1

0 1 2 3 4 5 6

Se

rvic

e L

eve

l

Load5 Average

MeasuredTheoretic

Abbildung: Higher load average is an indicator for queuing issues which are

causing decreased service level ratios. The measured data follow the

theoretical solution closely for higher load values.14 / 31

Page 15: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Response Time: Low Load

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

0.95 0.96 0.97 0.98 0.99 1

Resp

onse

Tim

e [s]

SLA

Ratio

Time

HTTP Response TimeMigration started

Migration finishedSLA Ratio

Maximum allowed response time

Abbildung: Influence of live migration on HTTP response time and service

level during low load.

15 / 31

Page 16: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Response Time: Medium Load

0

2

4

6

8

10

12

14

16

0.85

0.9

0.95

1

Resp

onse

Tim

e [s]

SLA

Ratio

Time

HTTP Response TimeMigration started

Migration finishedSLA Ratio

Maximum allowed response time

Abbildung: Influence of Live Migration on HTTP Response Time and service

level during medium load.

16 / 31

Page 17: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Data overview

Response Time: High Load

0

5

10

15

20

0.5

0.6

0.7

0.8

0.9

1

Resp

onse

Tim

e [s]

SLA

Ratio

Time

HTTP Response TimeMigration started

Migration finishedSLA Ratio

Maximum allowed response time

Abbildung: Influence of Live Migration on HTTP Response Time and service

level during high load.

17 / 31

Page 18: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

Model selection

Stepwise: AIC

• Akaike Information Criterion (AIC), lower values are better

• AIC = 2k − 2 ln(L)• k : the number of independent parameters of a model

• L: the maximum likelihood

• Stepwise model selection approach

• Find trade-off between number of parameters (model size) and

goodness of fit (model quality)

• Start with empty model, in each step a variable can be

added/removed, until the AIC can not be decreased further.

• Does not guarantee to find a global optimum, but typically gives a

near-optimum result within reasonable time.

18 / 31

Page 19: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

Model selection

Exhaustive: LEAPS

• Comparison with the stepwise AIC approach

• Exhaustive all-subsets-regression

• Find best of all possible models for given range of model size

• Computationally intensive even if number of variables is limited

19 / 31

Page 20: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

AIC

-4550

-4500

-4450

-4400

-4350

-4300

-4250

-4200

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

AIC

va

lue

Step

AIC value in current stepFinal AIC value

Abbildung: The AIC value quickly improves during the first steps and then

slowly converges to the final value, thus allowing a trade-off between model

complexity and precision.20 / 31

Page 21: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

AIC

0.013

0.014

0.015

0.016

0.017

0.018

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Sta

nd

ard

De

via

tion

of

Re

sid

ua

ls

Step

Best value

Abbildung: The residuals’ standard deviation quickly improves during the first

steps and then slowly converges to the final value, when using stepwise AIC.

21 / 31

Page 22: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

AIC

Abbildung: QQ-Plot of standardized residuals for the model produced by

stepwise AIC.

22 / 31

Page 23: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

AIC

Abbildung: There is no visible correlation between the residuals’ variance

and the time series.

23 / 31

Page 24: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

LEAPS

0.9

0.91

0.92

0.93

0.94

0.95

0.96

0.97

1 2 3 4 5 6 7 8 9 10 11 12

R2 A

dj

Model Size

Best value

Abbildung: Higher model complexity yields better model quality when using

LEAPS, but the increase is rather small.

24 / 31

Page 25: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

LEAPS

0.015

0.016

0.017

0.018

0.019

0.02

0.021

0.022

0.023

0.024

0.025

1 2 3 4 5 6 7 8 9 10 11 12

Sta

nd

ard

De

via

tio

n o

f R

esid

ua

ls

Model Size

Best value

Abbildung: Higher model complexity yields better regression error when

using LEAPS.

25 / 31

Page 26: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Model selection

AIC: Most influencing variables of final model

Variable Meaning Estimate Std. Error Pr(> |t|)Intercept 2.395e+00 5.069e-01 3.00e-06

wp01 load5 VM UNIX load5 -1.871e-02 2.627e-03 3.67e-12

wp01 swapUsed VM swap used -7.656e-07 8.809e-08 < 2e-16

wp01 residentSize SQ The squared amount of res-

ident memory used by the

qemu-kvm process.

-4.652e-14 1.652e-14 0.00506

src host cpu proc s Tasks created/s, source

host.

-2.475e+00 9.166e-01 0.00716

src host cpu proc s SQ 1.091e+00 4.133e-01 0.00856

wp01 cpu util vmnorm SQ -1.328e-01 3.187e-02 3.63e-05

wp01 cpu util vmnorm CPU util measured inside

VM

9.517e-02 2.316e-02 4.64e-05

wp01 load5 SQ The squared UNIX load5of the VM.

-1.140e-03 1.918e-04 5.22e-09

wp01 freeMemRatio SQ The squared ratio of free

memory inside the VM.

1.976e-02 9.462e-03 0.03727

Tabelle: The most influencing and significant variables of the final model

produced by stepwise AIC.

26 / 31

Page 27: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Conclusions

Conclusions

Qualitative

1 Impact of live migration on SL depends on amount of workload

2 Tighter SLAs can be fulfilled for low workload situations

3 High workload: SL decreases massively

4 ⇒ Live migrating when VM has only medium load

5 Avoid multiple live migrations within short time

6 ⇒ Inertia effect for recently migrated VMs

7 Avoid senseless flip-flop migrations and stabilize service level

27 / 31

Page 28: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Conclusions

Conclusions

Quantitative

1 SL variance during a live migration to 90% predictable using only

a single variable, the UNIX load5 average.

2 Models with 12 variables can explain 95% of variance

3 Models selected by AIC/LEAPS are of comparable quality and

robust to statistical tests

4 AIC is way faster, so LEAPS is not really necessary

28 / 31

Page 29: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Conclusions

Conclusions

Technical

1 Systems using live migration as a mechanism to realize a more

energy efficient target distribution need to consider the UNIXload average, at least for web servers and comparable

2 Typical hypervisors do not collect and export this information

3 Needs to be done by VM introspection

4 Related efforts by qemu-kvm and libvirt developers to

pass-through the VMs’ memory utilization (free mem)

5 Should be extended to export load information

29 / 31

Page 30: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Conclusions

Conclusions

Future Work

1 Influence of additional VMs

• idle

• utilized

• and a mixed scenario

2 Linux and qemu-kvm: Kernel Samepage Merging (KSM)

3 Database VM migration

4 Bandwidth limits, maximum allowed downtime

5 Migration delay, energy consumption, service downtime

30 / 31

Page 31: Predicting Web Service Levels During VM Live Migrations · Predicting Web Service Levels During VM Live Migrations Experiment Experiment Experiment • 2 servers, 1 VM, migrating

Predicting Web Service Levels During VM Live Migrations

Conclusions

Q&A

Thank you for your attention!

31 / 31