Maximum Likelihood Estimation of Closed Queueing Network Demands from Queue Length Data

ACM/SPEC ICPE 2016 - March 14th, 2016

Weikun Wang (Imperial College London, UK)

Giuliano Casale (Imperial College London, UK)

Ajay Kattepur (TCS Innovation Labs, India)

Manoj Nambiar (TCS Innovation Labs, India)

2

SPE & Model-Driven Engineering

Platform-Indep. Model

Architecture Model

Platform-Specific Model

Performance & Reliability Models

3

CQN Characterization

[Figure: service demand at CPU-1 over time, for Mix 1, Mix 2, and Mix 3]

Service demand of a request

CPU time, bandwidth consumed, …

Multi-threaded software

e.g., web servers

4

Drawbacks of Existing Methods

Utilization based approaches

Regression based on utilization and throughput

Issues: collinearities, load-dependence, outliers, utilization unreliable/unavailable, …

5

State observations

Dataset (L points):

CQN State:

Queue length samples

7

Assume product-form state probabilities

Computationally challenging to evaluate

Maximum likelihood estimation?

Infer demands with the probability


Service demand

Normalizing constant

Queue length
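To make the product-form expression and its cost concrete, here is a minimal sketch for a small single-class closed network. It assumes one delay (think-time) station plus load-independent queueing stations; the function names are illustrative and do not come from the slides. The normalizing constant is obtained by brute-force enumeration of all states, which is the step that becomes expensive as the population and the number of stations grow. The likelihood is accumulated in log form, since a product of many small probabilities would underflow.

```python
from itertools import product
from math import factorial, log


def unnormalized_weight(state, demands, think_time, total_jobs):
    """Product-form weight of a state (n_1, ..., n_M) at the queueing stations.

    Assumption: single class; jobs not queued are at one delay station.
    """
    n_delay = total_jobs - sum(state)
    weight = think_time ** n_delay / factorial(n_delay)
    for n_i, theta_i in zip(state, demands):
        weight *= theta_i ** n_i
    return weight


def feasible_states(num_stations, total_jobs):
    """Enumerate queue-length vectors holding at most total_jobs jobs."""
    for state in product(range(total_jobs + 1), repeat=num_stations):
        if sum(state) <= total_jobs:
            yield state


def normalizing_constant(demands, think_time, total_jobs):
    """Sum of the weights of all feasible states (brute force)."""
    return sum(
        unnormalized_weight(s, demands, think_time, total_jobs)
        for s in feasible_states(len(demands), total_jobs)
    )


def state_probability(state, demands, think_time, total_jobs):
    g = normalizing_constant(demands, think_time, total_jobs)
    return unnormalized_weight(state, demands, think_time, total_jobs) / g


def log_likelihood(samples, demands, think_time, total_jobs):
    """Log-likelihood of independent queue-length samples under given demands."""
    log_g = log(normalizing_constant(demands, think_time, total_jobs))
    return sum(
        log(unnormalized_weight(s, demands, think_time, total_jobs)) - log_g
        for s in samples
    )
```

The number of enumerated states grows combinatorially with the number of stations and jobs, so this direct route is only practical for very small models.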

8

Maximum likelihood estimation (MLE)

Problem with direct computation

Evaluation of the state probability for each observation

Slow due to the need for computing the normalizing constant

Very small probabilities when L is large

Any other solution?

Maximum Likelihood Estimation

[Figure: likelihood over the parameter space]

9

A necessary condition for a point inside the parameter space to be an MLE is that the theoretical mean queue length equals the observed mean queue length

How to find the MLE?

Change the value of the demands until the mean queue lengths predicted with MVA match the observed ones

Fixed point iteration or an optimization program

Maximum Likelihood Estimation

observed mean queue length

theoretical mean queue length

Only the mean queue length is required!
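The matching condition can be illustrated with a small sketch: run exact MVA for the current demands, compare the predicted mean queue lengths with the observed ones, and adjust the demands until they agree. The sketch assumes a single class and a known, positive think time. The damped multiplicative update is one simple choice made here for illustration, not necessarily the iteration used by the authors, and its convergence is not guaranteed in general (heavily loaded models may need a smaller `damping`).

```python
def mva(demands, think_time, total_jobs):
    """Exact single-class MVA with one delay station.

    Returns (mean queue length per station, throughput) at population total_jobs.
    """
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, total_jobs + 1):
        # Arrival theorem: an arriving job sees the network with n - 1 jobs.
        response = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(response))
        queue = [throughput * r for r in response]
    return queue, throughput


def estimate_demands(observed_queue, think_time, total_jobs,
                     damping=0.5, tol=1e-10, max_iter=10000):
    """Adjust demands until MVA reproduces the observed mean queue lengths.

    Assumption: all observed mean queue lengths are strictly positive.
    """
    demands = [1.0] * len(observed_queue)  # arbitrary positive starting point
    for _ in range(max_iter):
        predicted, _ = mva(demands, think_time, total_jobs)
        gap = max(abs(p - o) for p, o in zip(predicted, observed_queue))
        if gap < tol:
            break
        # Raise a demand whose station is predicted too empty, lower it otherwise.
        demands = [d * (o / p) ** damping
                   for d, o, p in zip(demands, observed_queue, predicted)]
    return demands
```

A quick self-check is to generate mean queue lengths with `mva` for known demands and confirm that `estimate_demands` recovers them.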

10

Confidence Intervals

Assume the MLE to be asymptotically normal

Confidence intervals for the MLE demands

is the Fisher Information matrix

is the Hessian matrix

works with mean queue length only!

Obtained by using standard MVA, no probabilities!
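For reference, the asymptotic-normality assumption leads to the usual Wald-type interval. The notation is assumed here, since the slide's own symbols did not survive extraction: $\hat{\theta}$ is the MLE, $\ell$ the log-likelihood of the dataset, $H$ its Hessian, $I$ the Fisher information, and $z_{1-\alpha/2}$ the standard normal quantile.

```latex
\hat{\theta} \;\approx\; \mathcal{N}\!\left(\theta,\; I(\theta)^{-1}\right),
\qquad
I(\theta) = -\,\mathbb{E}\!\left[ H(\theta) \right],
\qquad
H(\theta) = \nabla^{2}_{\theta}\, \ell(\theta)

\hat{\theta}_k \;\pm\; z_{1-\alpha/2}\, \sqrt{\left[ I(\hat{\theta})^{-1} \right]_{kk}}
```

In practice the information matrix is evaluated at the estimate $\hat{\theta}$, which is why only quantities computable at that point are needed.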

11

QMLE Approximation

Exact MLE can be found by direct search

Fixed-point iteration tends to be effective

A simple approximation of the MLE:

Consider the demand vector where

Then it must be

observed mean queue length
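The slide's own expressions are missing from this text, so the following is not the authors' formula. It is a sketch of how a closed-form approximation of this kind can be derived, under assumptions stated here: a single class, a known think time, Little's law at the delay station to obtain the throughput, and the Bard-Schweitzer approximation for the queue length seen on arrival. The function name is hypothetical.

```python
def closed_form_demands(mean_queue, think_time, total_jobs):
    """Approximate demands directly from observed mean queue lengths.

    Assumptions (not from the slides): single class, known think time,
    Bard-Schweitzer approximation of the arrival-instant queue length.
    """
    # Mean number of jobs at the delay station, then Little's law there.
    jobs_thinking = total_jobs - sum(mean_queue)
    throughput = jobs_thinking / think_time
    # Bard-Schweitzer: Q_i(N - 1) is approximated by (N - 1) / N * Q_i(N).
    shrink = (total_jobs - 1) / total_jobs
    # Invert the MVA relation Q_i = X * theta_i * (1 + Q_i(N - 1)).
    return [q / (throughput * (1.0 + shrink * q)) for q in mean_queue]


# Mean queue lengths close to those of a model with demands (0.5, 1.0),
# think time 5 and 3 jobs.
approx = closed_form_demands([0.251453, 0.582573], think_time=5.0, total_jobs=3)
```

No iteration is involved, which is what makes an approximation of this type cheap compared with an exact search.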

12

Validation - Existing methods

CI: Complete Information [J.F. Perez et al., IEEE Trans. Sw. Eng.’15]

Full knowledge of sample path

Baseline approach

ERPS: Extended Regression for Processor Sharing

[J.F. Perez et al., IEEE Trans. Sw. Eng.’15]

Based on mean response times and the queue lengths seen on arrival

GQL: Gibbs Sampling for Queue Lengths

[W. Wang et al., Accepted to appear in ACM TOMACS]

Gibbs sampling based on queue length samples

Many iterations until convergence

13

Validation

≈20000 random models

Randomized number of stations, classes, jobs

Focus on QMLE instead of exact analysis

Results

All the algorithms: error below 10%

QMLE has less than 4% error

Confidence interval validated

Number of observations

14

[Figure: Execution time (s), log scale from 1E-04 to 1E+04, for CI, ERPS, GQL, and QMLE]

15

Mean demand varies under different load

Real world system behavior

e.g. multi-core servers

Load-dependent (LD) extension

16

A scaling factor function

load-independent :

Product-form still holds

MLE

Load-dependent (LD) extension

new term
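As an illustration of why the product form survives load dependence, the sketch below computes the queue-length distribution of one load-dependent station in a closed network with a delay station. The convention is an assumption made here, because the slide's definition is not recoverable from this text: with k jobs present the station serves at rate gamma(k) / demand, and gamma(k) = min(k, cores) mimics a multi-core server. Function names are hypothetical.

```python
from math import factorial


def multicore_scaling(jobs, cores):
    """Assumed scaling factor: service speed grows until all cores are busy."""
    return min(jobs, cores)


def ld_queue_length_distribution(demand, think_time, total_jobs, cores):
    """P(n jobs at the station), n = 0..total_jobs, single class, one delay station."""
    weights = []
    for n in range(total_jobs + 1):
        # Extra term introduced by load dependence: product of scaling factors.
        scaling_product = 1.0
        for k in range(1, n + 1):
            scaling_product *= multicore_scaling(k, cores)
        n_delay = total_jobs - n
        weight = (think_time ** n_delay / factorial(n_delay)
                  * demand ** n / scaling_product)
        weights.append(weight)
    g = sum(weights)  # normalizing constant
    return [w / g for w in weights]
```

With `cores=1` the scaling product is always 1 and the load-independent model is recovered. With `cores` at least equal to the population the station never queues, so the number of jobs in it follows a binomial distribution.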

17

Direct computation is infeasible

A necessary condition for a point inside the parameter space to be an MLE is that

and

Works with marginal probability only!

MLE characterization

Empirical marginal queue length probability

Theoretical marginal queue length probability
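The empirical side of this condition is simple to compute from the dataset. A minimal sketch, assuming each sample is a tuple of per-station queue lengths (the function names are illustrative):

```python
from collections import Counter


def empirical_marginals(samples, station, total_jobs):
    """Fraction of samples in which the station holds k jobs, k = 0..total_jobs."""
    counts = Counter(state[station] for state in samples)
    return [counts.get(k, 0) / len(samples) for k in range(total_jobs + 1)]


def empirical_mean_queue_length(samples, station):
    """Average number of jobs observed at the station."""
    return sum(state[station] for state in samples) / len(samples)


# Four observed states of a two-station network with two jobs.
observations = [(0, 1), (1, 1), (2, 0), (1, 0)]
marginals = empirical_marginals(observations, station=0, total_jobs=2)
mean_length = empirical_mean_queue_length(observations, station=0)
```

These quantities are the targets that the theoretical marginal probabilities and mean queue lengths are matched against.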

18

How to find the MLE?

Solve by optimization program

Confidence intervals

Hessian matrix can still be derived

Computation requires marginal probabilities and mean queue length only

Drawback

Computationally expensive because of LD-MVA

MLE characterization

19

Validation

Random models validation

2 stations, 2 classes, 8 jobs, different think time

MATLAB fmincon solver

Compare the estimated values against the exact ones

Considered scaling factors

: resembles multi-core feature

– number of CPUs in queueing station i.

20

[Figure: Estimation error on γ (scaling factors), in %, versus number of observed samples (L), for L = 5000, 10000, 50000, 100000]

21

[Figure: Execution time (s) versus number of observed samples (L)]

Progress: We found a new approximation method for efficiently evaluating marginal probabilities, which reduces the execution time to < 20s on average!

22

Case Study (MyBatis JPetStore)

3-tier commercial application

Transactions grouped in R=1 class

5 GB user data

[Figure: testbed diagram with labels User 1, Worker, Database, Web/Application server, Workload Generator, Dispatcher, Database server]

23

Observed performance matching

Exact demand unknown

Estimated demands using QMLE

Validate observed throughput with estimated demands

[Figure: Error on throughput (%) versus think time (s) (0.1, 0.5, 1, 5, All), ERPS vs. QMLE]

24

Conclusion

Demand estimation from queue length

Efficient

Confidence interval characterization

Load-dependent extension

Ongoing work

Accelerate the load-dependent estimation

More experimental evaluations

Funded by FP7 MODAClouds, H2020 DICE, EPSRC OptiMAM

25

Thanks!
