Understanding and Predicting Host Load Peter A. Dinda Carnegie Mellon University http://www.cs.cmu.edu/~pdinda


Page 1: Understanding and Predicting Host Load

Understanding and Predicting Host Load

Peter A. Dinda
Carnegie Mellon University

http://www.cs.cmu.edu/~pdinda

Page 2: Understanding and Predicting Host Load

Talk in a Nutshell

• Load is self-similar
• Load exhibits epochal behavior
• Load prediction benefits from capturing self-similarity

Statistical analysis of two sets of week-long, 1 Hz resolution traces of load on ~40 machines, and evaluation of linear time series models for load prediction

Page 3: Understanding and Predicting Host Load

Why Study Load?

Load partially determines execution time

We want to model and predict load

[Diagram labels: Interactive Application; Short tasks with deadlines; [tmin, tmax] ??; Unmodified Distributed System]

Page 4: Understanding and Predicting Host Load

Load and Execution Time

[Figure: scatter plot of execution time (seconds) versus measured load; 42,000 points, coefficient of correlation = 0.998]

$$\int_{t_{now}}^{t_{now}+t_{exec}} \frac{1}{1 + load(t)}\,dt = t_{nominal}$$
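As a rough illustration of the relation between load and execution time on this slide, the sketch below finds the execution time at which the integral of 1/(1 + load) reaches the nominal time. It assumes the load is piecewise constant between samples; the function name `execution_time` and the parameter `dt` are hypothetical and not from the talk.

```python
def execution_time(load_samples, t_nominal, dt=1.0):
    """Return t_exec such that the integral of 1/(1 + load(t)) over
    [t_now, t_now + t_exec] equals t_nominal.

    Assumption: load_samples[i] is held constant for dt seconds,
    starting at t_now (a piecewise-constant reading of a sampled trace).
    """
    done = 0.0      # nominal work completed so far (seconds)
    elapsed = 0.0   # wall-clock time used so far (seconds)
    for load in load_samples:
        rate = 1.0 / (1.0 + load)  # nominal seconds completed per wall-clock second
        if done + rate * dt >= t_nominal:
            # The work finishes inside this interval.
            return elapsed + (t_nominal - done) / rate
        done += rate * dt
        elapsed += dt
    raise ValueError("load trace too short for the requested work")


# With a constant load of 1.0, work that nominally takes 2 s takes 4 s.
print(execution_time([1.0] * 10, 2.0))
```

Under a constant load this reduces to t_exec = t_nominal × (1 + load).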

Page 5: Understanding and Predicting Host Load

Outline

• Measurement methodology
• Load traces
• Load variance
• New Results
  – Self-similarity
  – Epochal behavior
• Benefits of capturing self-similarity in linear models
• Conclusions

Page 6: Understanding and Predicting Host Load

Measurement Methodology

[Diagram: in the Digital Unix kernel, the ready queue length (RUN) is sampled with T = 2 seconds, giving len_t, len_{t-T}, len_{t-2T}, ..., len_{t-29T}, len_{t-30T}, ...; these feed an exponential average (the 1 minute load "average"): avg_t, avg_{t-0.5T}, avg_{t-T}, ...; a user-level measurement tool takes our measurements at a 1 Hz sample rate]
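The exponential averaging step in this diagram might look roughly like the sketch below. The slide gives only the sample period (T = 2 seconds) and the one-minute horizon; the update rule with decay factor exp(-T/60) is the classic Unix form and is an assumption here, as are the names `load_average`, `sample_period`, and `time_constant`.

```python
import math


def load_average(queue_lengths, sample_period=2.0, time_constant=60.0):
    """Exponentially average ready-queue lengths sampled every sample_period seconds.

    Assumption: avg_t = decay * avg_{t-T} + (1 - decay) * len_t with
    decay = exp(-T / time_constant). The slide does not state the weights.
    """
    decay = math.exp(-sample_period / time_constant)
    avg = 0.0
    averages = []
    for length in queue_lengths:
        avg = decay * avg + (1.0 - decay) * length
        averages.append(avg)
    return averages


# One runnable process for 5 minutes: the average climbs toward 1.0.
trace = load_average([1] * 150)
print(round(trace[0], 4), round(trace[-1], 4))
```

The smoothing means a measured load value reflects roughly the last minute of queue activity rather than the instantaneous queue length.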

Page 7: Understanding and Predicting Host Load

Load Traces

Trace set      | Machines                                                                   | Duration
August 1997    | 13 production cluster, 8 research cluster, 2 compute servers, 15 desktops | ~ one week (over one million samples)
February 1998  | 13 production cluster, 8 research cluster, 2 compute servers, 11 desktops | ~ one week (over one million samples)

Page 8: Understanding and Predicting Host Load

Absolute Variation

[Figure: Mean, +SDev, and -SDev of load per host; hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 9: Understanding and Predicting Host Load

Relative Variation

[Figure: relative variation of load per host; hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 10: Understanding and Predicting Host Load

[Figure: three panels for series axp7.19.day$NormLoad: Load versus Time (0 to 80000), Autocorrelation (ACF) versus Lag (0 to 600), and Periodogram (raw periodogram, spectrum in dB) versus Frequency (0.0 to 0.5); bandwidth = 3.34114e-006, 95% C.I. is (-5.87588, 17.5667) dB]

Page 11: Understanding and Predicting Host Load


Visual Self-Similarity Here

Page 12: Understanding and Predicting Host Load

The Hurst Parameter

[Figure: log-log plot against Log(Frequency), 0.0001 to 1, with reference lines for H=0.375, H=0.5, H=0.625, and H=0.875]

H = (1 - slope)/2
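One way to apply H = (1 - slope)/2 is to fit a line to the log-log periodogram and plug in its slope. The sketch below does this with NumPy. Fitting only the lowest 10% of frequencies is an assumption of this example (the slide does not say which range was used), and `hurst_from_periodogram` and `low_fraction` are hypothetical names.

```python
import numpy as np


def hurst_from_periodogram(x, low_fraction=0.1):
    """Estimate H from the slope of the log-log periodogram: H = (1 - slope) / 2.

    Assumption: the line is fitted to the lowest `low_fraction` of the
    nonzero frequencies, where long-range behavior dominates.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2 / n   # raw periodogram
    freqs = np.fft.rfftfreq(n)                # cycles per sample
    k = max(2, int(low_fraction * (len(freqs) - 1)))
    f = freqs[1:k + 1]                        # skip the zero frequency
    p = power[1:k + 1]
    slope, _intercept = np.polyfit(np.log10(f), np.log10(p), 1)
    return (1.0 - slope) / 2.0


# White noise has a flat spectrum, so the estimate should be near 0.5.
rng = np.random.default_rng(0)
print(hurst_from_periodogram(rng.standard_normal(65536)))
```

A raw periodogram is noisy, so an estimate from a single trace should be read as approximate.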

Page 13: Understanding and Predicting Host Load

Self-similarity Statistics

[Figure: self-similarity statistics per host (Mean, +SDev, -SDev); hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 14: Understanding and Predicting Host Load

Why is Self-Similarity Important?

• Complex structure
  – Not completely random, nor independent
  – Short range dependence
    • Excellent for history-based prediction
  – Long range dependence
    • Possibly a problem
• Modeling Implications
  – Suggests models that can capture
    • ARFIMA, FGN, TAR

Page 15: Understanding and Predicting Host Load

Load Exhibits Epochal Behavior

[Figure: axp7_tue_19.eps]

[Figure: axp7_19_day_time.eps]

Page 16: Understanding and Predicting Host Load

Epoch Length Statistics

[Figure: epoch length per host (Mean, +SDev, -SDev); hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 17: Understanding and Predicting Host Load

Why is Epochal Behavior Important?

• Complex structure
  – Non-stationary
• Modeling Implications
  – Suggests models
    • ARIMA, ARFIMA, etc.
    • Non-parametric spectral methods
  – Suggests problem decomposition

Page 18: Understanding and Predicting Host Load

Linear Time Series Models

[Diagram: Unpredictable Random Sequence -> Fixed Linear Filter -> Partially Predictable Load Sequence; both sequences are plotted against Time (0 to 80000)]

$$z_t = \mu + a_t + \sum_{j=1}^{\infty} \psi_j a_{t-j}$$

$$a_t \sim \mathrm{WhiteNoise}(0, \sigma_a^2), \qquad z_t \sim (\mu, \sigma_z^2), \qquad \sigma_a^2 < \sigma_z^2$$

Choose weights $\psi_j$ to minimize $\sigma_a^2$

$\sigma_a$ is the confidence interval for t+1 predictions
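To make the idea of choosing weights that minimize the one-step prediction error concrete, here is a minimal sketch that fits an AR(p) model by ordinary least squares and reports the residual standard deviation next to the standard deviation of the series. Least squares is swapped in for illustration and is not necessarily the fitting method used in the talk; `fit_ar` and `predict_next` are hypothetical names.

```python
import numpy as np


def fit_ar(z, p):
    """Least-squares AR(p) fit. Returns (mean, coefficients, sigma_a)."""
    z = np.asarray(z, dtype=float)
    mu = z.mean()
    x = z - mu
    # Column k holds x[t-k-1] for t = p .. len(x)-1.
    lagged = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    target = x[p:]
    phi, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    residuals = target - lagged @ phi
    return mu, phi, residuals.std()


def predict_next(z, mu, phi):
    """One-step-ahead (t+1) prediction from the last len(phi) samples."""
    z = np.asarray(z, dtype=float)
    recent = z[-1:-len(phi) - 1:-1] - mu   # z[t], z[t-1], ..., most recent first
    return mu + float(phi @ recent)


# Synthetic AR(1) series standing in for a load trace (not real data).
rng = np.random.default_rng(1)
noise = rng.standard_normal(20000)
series = np.empty_like(noise)
series[0] = noise[0]
for t in range(1, len(series)):
    series[t] = 0.9 * series[t - 1] + noise[t]
mu, phi, sigma_a = fit_ar(series, 1)
print(sigma_a < series.std())   # the model narrows the prediction error
```

The ratio of the residual deviation to the series deviation indicates how much of the variation the linear model accounts for.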

Page 19: Understanding and Predicting Host Load

Realizable Pole-Zero Models

ARFIMA(p,d,q): Self Similarity, d related to Hurst
ARIMA(p,d,q): Non-stationarity, d integer
ARMA(p,q)
AR(p), MA(q)

p,q are numbers of parameters
d is degree of differencing
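The difference between integer d (ARIMA) and fractional d (ARFIMA) can be seen in the weights of the differencing operator (1 - B)^d, where B shifts the series back one step. The sketch below expands those weights with the standard binomial recursion. The function names are hypothetical, and truncating the expansion to a fixed number of weights is an assumption of this example. For stationary ARFIMA models, d is commonly related to the Hurst parameter by d = H - 1/2.

```python
def frac_diff_weights(d, n_weights):
    """First n_weights coefficients of (1 - B)^d via w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(series, d, n_weights=100):
    """Apply truncated (fractional) differencing of degree d to a sequence.

    Early outputs use fewer terms because less history is available.
    """
    w = frac_diff_weights(d, n_weights)
    out = []
    for t in range(len(series)):
        span = min(t + 1, n_weights)
        out.append(sum(w[k] * series[t - k] for k in range(span)))
    return out


print(frac_diff_weights(1, 4))    # integer d: ordinary first difference
print(frac_diff_weights(0.5, 3))  # fractional d: slowly decaying weights
```

With d = 1 only two weights are nonzero, while a fractional d gives weights that decay slowly across the whole history, which is how such models represent long-range dependence.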

Page 20: Understanding and Predicting Host Load

Real World Benefits of Models

$\sigma_a$ is the confidence interval for t+1 predictions

Map work that would take 100 ms at zero load

axp0: $\sigma_z$=0.54, $\mu$=1.0, $\sigma_a$(ARMA(4,4))=0.109, $\sigma_a$(ARFIMA(4,d,4))=0.108
  no model: 1.0 +/- 1.06 (95%) => 100 to 306 ms
  ARMA: 1.0 +/- 0.22 (95%) => 178 to 222 ms
  ARFIMA: 1.0 +/- 0.21 (95%) => 179 to 221 ms

axp7: $\sigma_z$=0.14, $\mu$=0.12, $\sigma_a$(ARMA(4,4))=0.041, $\sigma_a$(ARFIMA(4,d,4))=0.025
  no model: 0.12 +/- 0.27 (95%) => 100 to 139 ms
  ARMA: 0.12 +/- 0.08 (95%) => 104 to 120 ms
  ARFIMA: 0.12 +/- 0.05 (95%) => 107 to 117 ms

Callouts: 1 % (axp0), 40 % (axp7)
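The millisecond ranges on this slide appear consistent with scaling the zero-load time by (1 + load) at each end of the load interval, with the lower end clipped at zero load. The sketch below assumes that mapping; `exec_time_range` is a hypothetical helper, not something named in the talk.

```python
def exec_time_range(nominal_ms, predicted_load, half_width):
    """Map a load confidence interval to an execution time range.

    Assumption: time = nominal * (1 + load), and load cannot go below zero.
    """
    low_load = max(0.0, predicted_load - half_width)
    high_load = predicted_load + half_width
    return nominal_ms * (1.0 + low_load), nominal_ms * (1.0 + high_load)


# axp0 figures from the slide: no model, then ARMA.
for label, half_width in [("no model", 1.06), ("ARMA", 0.22)]:
    low, high = exec_time_range(100, 1.0, half_width)
    print(label, round(low), "to", round(high), "ms")
```

A narrower load interval translates directly into a narrower execution time estimate.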

Page 21: Understanding and Predicting Host Load

t+1 prediction

[Figure: t+1 prediction results per host; hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 22: Understanding and Predicting Host Load

t+8 prediction

[Figure: t+8 prediction results per host; hosts grouped as Production Cluster, Research Cluster, Desktops]

Page 23: Understanding and Predicting Host Load

Conclusions

• Load has high variance
• Load is self-similar
• Load exhibits epochal behavior
• Capturing self-similarity in linear time series models improves predictability

Page 24: Understanding and Predicting Host Load

Load Traces

• Would a web-accessible load trace database be useful?
• Would you like to contribute?