Maximum Likelihood Estimation of Closed Queueing Network Demands from Queue Length Data
ACM/SPEC ICPE 2016 - March 14th, 2016
Weikun Wang (Imperial College London, UK)
Giuliano Casale (Imperial College London, UK)
Ajay Kattepur (TCS Innovation Labs, India)
Manoj Nambiar (TCS Innovation Labs, India)
2
SPE & Model-Driven Engineering
Platform-Independent Model
Architecture Model
Platform-Specific Model
Performance & Reliability Models
3
CQN Characterization
[Figure: service demand at CPU-1 over time, under Mix 1, Mix 2 and Mix 3]
Service demand of a request
CPU time, bandwidth consumed, …
Multi-threaded software
e.g., web servers
4
Drawbacks of Existing Methods
Utilization based approaches
Regression based on utilization and throughput
Issues: collinearities, load-dependence, outliers, utilization unreliable/unavailable, …
7
Assume product-form state probabilities
Computationally challenging to evaluate
Maximum likelihood estimation?
Infer demands with the probability
Queue length samples
Service demand
Normalizing constant
Queue length
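For reference, the product-form probability that the labels above annotate can be sketched as follows. This is the standard BCMP form for a closed network with M single-server queueing stations, R job classes and a think-time (delay) station. The symbols are notation assumed here and may differ from the slide: \theta_{ir} is the service demand, n_{ir} the queue length, G the normalizing constant, Z_r the think time, and L the number of queue length samples.

```latex
\[
P(\mathbf{n}\mid\boldsymbol\theta)
  = \frac{1}{G(\boldsymbol\theta)}
    \prod_{r=1}^{R}\frac{Z_r^{\,n_{0r}}}{n_{0r}!}
    \prod_{i=1}^{M}\Bigl( n_i!\,\prod_{r=1}^{R}\frac{\theta_{ir}^{\,n_{ir}}}{n_{ir}!}\Bigr),
  \qquad n_i=\sum_{r=1}^{R} n_{ir}
\]
% Likelihood of L queue length samples, assumed independent:
\[
\mathcal{L}(\boldsymbol\theta)
  = \prod_{l=1}^{L} P\bigl(\mathbf{n}^{(l)}\mid\boldsymbol\theta\bigr)
\]
```

G sums the unnormalized terms over every feasible state, which is why evaluating the probability directly is expensive.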
8
Maximum likelihood estimation (MLE)
Problem with direct computation
Evaluation of the state probability for each observation
Slow due to the need for computing the normalizing constant
Very small probabilities when the number of observations L is large
Any other solution?
Maximum Likelihood Estimation
Parameter space
Likelihood
9
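A necessary condition for a point inside the parameter space to be an MLE is that the observed mean queue lengths equal the theoretical mean queue lengths
How to find the MLE?
Change the value of the demands until the mean queue lengths predicted with MVA match the observed ones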
Fixed point iteration or an optimization program
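The fixed-point idea can be sketched in a few lines of Python. This is a minimal illustration for a single-class network, not the authors' implementation: the function names, the multiplicative update rule and the starting point are all assumptions made here. Convergence is not guaranteed in general (heavily loaded models may need damping), and every observed mean queue length must be positive.

```python
def mva(demands, think_time, population):
    """Exact single-class MVA; returns (throughput, mean queue lengths)."""
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, population + 1):
        # Arrival theorem: a job sees the queue of a network with n-1 jobs.
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(resid))
        queue = [throughput * r for r in resid]
    return throughput, queue


def fit_demands(observed_q, think_time, population,
                start=None, tol=1e-12, max_iter=10000):
    """Rescale demands until MVA queue lengths match the observed means."""
    if start is not None:
        theta = list(start)
    else:
        # Hypothetical starting point; any positive vector can be tried.
        theta = [think_time / population] * len(observed_q)
    for _ in range(max_iter):
        _, predicted = mva(theta, think_time, population)
        # Assumed update rule: scale each demand by observed / predicted.
        updated = [t * obs / pred
                   for t, obs, pred in zip(theta, observed_q, predicted)]
        change = max(abs(u - t) / t for u, t in zip(updated, theta))
        theta = updated
        if change < tol:
            break
    return theta


if __name__ == "__main__":
    # Synthetic check: generate mean queue lengths from known demands,
    # then try to recover the demands from those means alone.
    true_demands = [0.05, 0.03]
    _, q_bar = mva(true_demands, think_time=1.0, population=5)
    print(fit_demands(q_bar, think_time=1.0, population=5))
```

Only the mean queue lengths enter the loop; no state probabilities or normalizing constants are evaluated.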
Maximum Likelihood Estimation
observed mean queue length
theoretical mean queue length
Only mean queue length is required!
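A short derivation shows why mean queue lengths suffice. It assumes the standard product form, in which the probability of a state is proportional to a product of terms \theta_{ir}^{n_{ir}} divided by a normalizing constant G, and L independent samples. The symbols \bar{n}_{ir} (observed mean) and Q_{ir} (theoretical mean) are notation assumed here.

```latex
% Log-likelihood of L samples, up to terms that do not depend on \theta:
\[
\log\mathcal{L}(\boldsymbol\theta)
  = \sum_{l=1}^{L}\sum_{i=1}^{M}\sum_{r=1}^{R} n_{ir}^{(l)}\log\theta_{ir}
    \;-\; L\,\log G(\boldsymbol\theta) \;+\; \text{const}
\]
% Differentiating each term of G brings down a factor n_{ir}/\theta_{ir}:
\[
\theta_{ir}\,\frac{\partial G(\boldsymbol\theta)}{\partial\theta_{ir}}
  = G(\boldsymbol\theta)\,Q_{ir}(\boldsymbol\theta)
\]
% Hence the score vanishes exactly when the means match:
\[
\frac{\partial\log\mathcal{L}}{\partial\theta_{ir}}
  = \frac{L}{\theta_{ir}}\bigl(\bar{n}_{ir}-Q_{ir}(\boldsymbol\theta)\bigr)=0
  \;\Longleftrightarrow\;
  \bar{n}_{ir}=Q_{ir}(\boldsymbol\theta),
  \qquad \bar{n}_{ir}=\frac{1}{L}\sum_{l=1}^{L} n_{ir}^{(l)}
\]
```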
10
Confidence Intervals
Assume the MLE to be asymptotically normal
Confidence intervals for the MLE demands
Uses the Fisher information matrix, obtained from the Hessian matrix of the log-likelihood
Works with mean queue length only!
Obtained by using standard MVA, no probabilities!
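As a sketch, the usual Wald-type interval under asymptotic normality is shown below. Whether the slide uses exactly this form is an assumption. Here k and j index (station, class) pairs, z is a standard normal quantile, and the Hessian entries follow from differentiating the product-form score (L / \theta_k)(\bar{n}_k - Q_k), whose second part vanishes at an interior MLE.

```latex
\[
\hat\theta_k \;\pm\; z_{1-\alpha/2}\,
  \sqrt{\bigl[\mathbf{I}(\hat{\boldsymbol\theta})^{-1}\bigr]_{kk}},
\qquad
\mathbf{I}(\hat{\boldsymbol\theta}) = -\,\mathbf{H}(\hat{\boldsymbol\theta})
\]
\[
H_{kj}(\hat{\boldsymbol\theta})
  = \left.\frac{\partial^2\log\mathcal{L}}{\partial\theta_k\,\partial\theta_j}
    \right|_{\boldsymbol\theta=\hat{\boldsymbol\theta}}
  = -\,\frac{L}{\hat\theta_k}\,
    \left.\frac{\partial Q_k(\boldsymbol\theta)}{\partial\theta_j}
    \right|_{\boldsymbol\theta=\hat{\boldsymbol\theta}}
\]
```

The Hessian depends on the data only through how the theoretical mean queue lengths respond to the demands, so no state probabilities are needed.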
11
QMLE Approximation
Exact MLE can be found by direct search
Fixed-point iteration tends to be effective
A simple approximation of the MLE:
Consider the demand vector where
Then it must be
observed mean queue length
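To illustrate what a closed-form approximation of this kind can look like, here is a minimal single-class sketch. It is not necessarily the QMLE formula from the slide: it is derived here from the arrival theorem, Little's law at the think-time station, and the Bard-Schweitzer approximation. The function name and the example inputs are hypothetical.

```python
def approx_demands(observed_q, think_time, population):
    """Closed-form demand estimate from mean queue lengths (single class)."""
    # Mean number of jobs in the think (delay) state.
    thinking = population - sum(observed_q)
    # Little's law at the delay station gives the throughput.
    throughput = thinking / think_time
    # Bard-Schweitzer assumption: an arriving job sees about (N-1)/N
    # of the mean queue length observed with all N jobs present.
    scale = (population - 1) / population
    return [q / (throughput * (1.0 + scale * q)) for q in observed_q]


if __name__ == "__main__":
    # Hypothetical mean queue lengths for 2 stations, 5 jobs, 1 s think time.
    print(approx_demands([0.27675, 0.153503], think_time=1.0, population=5))
```

The hypothetical inputs correspond to demands of about 0.05 and 0.03, so the printed estimates should land close to those values. No iteration is required, which is the appeal of an approximation in this style.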
12
Validation - Existing methods
CI: Complete Information [J.F. Perez et al., IEEE Trans. Sw. Eng.’15]
Full knowledge of sample path
Baseline approach
ERPS: Extended Regression for Processor Sharing [J.F. Perez et al., IEEE Trans. Sw. Eng.’15]
Based on mean response times and queue lengths seen on arrival
GQL: Gibbs Sampling for Queue Lengths
[W. Wang et al., Accepted to appear in ACM TOMACS]
Gibbs sampling based on queue length samples
Many iterations until convergence
13
Validation
≈20000 random models
Randomized number of stations, classes, jobs
Focus on QMLE instead of exact analysis
Results
All algorithms: error below 10%
QMLE has less than 4% error
Confidence interval validated
Number of observations
14
[Figure: Execution time. Y axis: execution time (s), log scale from 1E-04 to 1E+04; methods compared: CI, ERPS, GQL, QMLE]
15
Mean demand varies under different load
Real world system behavior
e.g. multi-core servers
Load-dependent (LD) extension
16
A scaling factor function
load-independent :
Product-form still holds
MLE
Load-dependent (LD) extension
new term
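For orientation, the standard load-dependent product form is sketched below. It assumes the convention that \gamma_i(k) multiplies the service rate of station i when k jobs are present, so that \gamma_i(k) = 1 recovers the load-independent case. If the slide defines the scaling factor as the reciprocal, the last product is inverted. All symbols are notation assumed here.

```latex
\[
P(\mathbf{n}\mid\boldsymbol\theta,\boldsymbol\gamma)
  = \frac{1}{G(\boldsymbol\theta,\boldsymbol\gamma)}
    \prod_{r=1}^{R}\frac{Z_r^{\,n_{0r}}}{n_{0r}!}
    \prod_{i=1}^{M}\Bigl(
      n_i!\,\prod_{r=1}^{R}\frac{\theta_{ir}^{\,n_{ir}}}{n_{ir}!}
      \;\prod_{k=1}^{n_i}\frac{1}{\gamma_i(k)}
    \Bigr)
\]
```

Relative to the load-independent form, the only extra factor is the product over \gamma_i(k), which depends on the total queue length n_i at each station rather than on the per-class counts.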
17
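Direct computation is infeasible
A necessary condition for a point inside the parameter space to be an MLE is that the observed mean queue lengths equal the theoretical ones and the empirical marginal queue length probabilities equal the theoretical ones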
Works with marginal probability only!
MLE characterization
Empirical marginal queue length probability
Theoretical marginal queue length probability
18
How to find the MLE?
Solve by optimization program
Confidence intervals
Hessian matrix can still be derived
Computation requires marginal probabilities and mean queue length only
Drawback
Computationally expensive because of LD-MVA
MLE characterization
19
Validation
Random models validation
2 stations, 2 classes, 8 jobs, different think times
MATLAB fmincon solver
Compare the estimated values against the exact ones
Considered scaling factors
: resembles multi-core feature
– number of CPUs in queueing station i.
20
[Figure: Estimation error on γ (scaling factors). Y axis: error on γ (%), 0 to 40; X axis: number of observed samples (L): 5000, 10000, 50000, 100000]
21
[Figure: Execution time. Y axis: execution time (s), 0.0E+00 to 3.0E+04; X axis: number of observed samples (L): 5000, 10000, 50000, 100000]
Progress: We found a new approximation method for efficiently evaluating marginal probabilities, which reduces the execution time to < 20s on average!
22
Case Study (MyBatis JPetStore)
3-tier commercial application
Transactions grouped in R=1 class
5 GB user data
[Diagram components: Workload Generator, User 1, Dispatcher, Worker, Web/Application server, Database, Database server]
23
Observed performance matching
Exact demand unknown
Estimated demands using QMLE
Validate observed throughput with estimated demands
[Figure: Error on throughput (%), 0 to 25, for ERPS and QMLE. X axis: think time (s): 0.1, 0.5, 1, 5, All]
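The validation step above can be sketched as follows: feed the estimated demands to a single-class MVA (the case study uses R=1 class), predict the throughput, and compare it with the measured one. The function names and all numbers below are hypothetical and are not taken from the case study.

```python
def mva_throughput(demands, think_time, population):
    """Throughput of a single-class closed network via exact MVA."""
    queue = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, population + 1):
        # Residence time at each station with n jobs in the network.
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        throughput = n / (think_time + sum(resid))
        queue = [throughput * r for r in resid]
    return throughput


def throughput_error_pct(predicted, observed):
    """Relative error of the predicted throughput, in percent."""
    return 100.0 * abs(predicted - observed) / observed


if __name__ == "__main__":
    # Hypothetical estimated demands (s) for the web and database tiers.
    estimated = [0.05, 0.03]
    predicted = mva_throughput(estimated, think_time=1.0, population=5)
    observed = 4.4  # hypothetical measured throughput (requests/s)
    print(predicted, throughput_error_pct(predicted, observed))
```

Because the exact demands are unknown in a real system, the throughput error is an indirect check on the quality of the estimates.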
24
Conclusion
Demand estimation from queue length
Efficient
Confidence interval characterization
Load-dependent extension
Ongoing work
Accelerate the load-dependent estimation
More experimental evaluations
Funded by FP7 MODAClouds, H2020 DICE, EPSRC OptiMAM