NetQuest: A Flexible Framework for Large-Scale Network Measurement
Lili Qiu
University of Texas at Austin
[email protected]@cs.utexas.edu
Joint work with Han Hee Song and Yin Zhang
ACM SIGMETRICS 2006
June 27, 2006
2
Motivating Scenario I: Network Diagnosis
[Figure: ISP networks C&W, UUNet, AOL, Earthlink, Sprint, Qwest, and AT&T; a user asks "Why is it so slow?"]
3
Motivating Scenario II: Performance Monitoring in ISP Networks
[Figure: an ISP network]
4
Key Requirements
• Scalable: work for large networks (e.g., thousands of nodes)
• Flexible: accommodate different applications
  – Differentiated design
    • Different quantities have different importance, e.g., a subset of paths belongs to a major customer
  – Augmented design
    • Conduct additional experiments given existing observations, e.g., after measurement failures
  – Multi-user design
    • Multiple users are interested in different parts of the network or have different objective functions
Q: Which measurements should we conduct to estimate the quantities of interest?
5
What We Want
A function f(x) of link performance x
– We use a linear function f(x)=F*x in this talk
[Figure: example network with 7 nodes (1 to 7) and 11 links labeled x1 to x11]
Ex. 1: average link delay f(x) = (x1+…+x11)/11
Ex. 2: end-to-end delay
$f(x) = Fx$, with $x = (x_1, x_2, \ldots, x_{11})^T$ and each row of $F$ a 0/1 vector marking the links on one path
Applies to any additive metric, e.g., log(1 – loss rate)
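The two examples above can be sketched in a few lines of NumPy. The link delays and the two paths below are made-up assumptions for illustration; they are not the routes drawn on the slide.

```python
import numpy as np

# Hypothetical per-link delays (ms) for the 11 links x1..x11; values are made up.
x = np.array([5.0, 3.0, 8.0, 2.0, 4.0, 6.0, 1.0, 7.0, 9.0, 2.0, 3.0])

# Ex. 1: average link delay is a single-row F with every entry 1/11.
F_avg = np.full((1, 11), 1.0 / 11.0)

# Ex. 2: end-to-end delay; each row of F marks the links on one path.
# These two paths are assumptions, not the paths in the slide's figure.
F_paths = np.zeros((2, 11))
F_paths[0, [0, 1]] = 1.0       # hypothetical path over links x1, x2
F_paths[1, [3, 4, 9]] = 1.0    # hypothetical path over links x4, x5, x10

avg_delay = F_avg @ x          # f(x) for Ex. 1
path_delay = F_paths @ x       # f(x) for Ex. 2, one entry per path
```

For loss rates, the same code applies after replacing each link value with log(1 – loss rate), since that quantity is additive along a path.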
6
Problem Formulation
What we can measure: e2e performance
Network performance estimation
– Goal: e2e performance on some paths f(x)
  • Input: $y_S$ (end-to-end performance on a subset of paths S), $A_S$, and $y_S = A_S x$
  • Output: f(x)
– Design of measurement experiments
  • Select which subset of paths S for active probing
– Network inference
  • Infer x based on partial indirect observations
7
Design of Experiments
• State of the art
  – Probe every path (e.g., RON)
    • Not scalable since # paths grows quadratically with # nodes
  – Rank-based approach [sigcomm04]
    • Let A denote the routing matrix
    • Monitor rank(A) paths that are linearly independent to exactly reconstruct end-to-end path properties
    • Still very expensive
• Our work
  – If we can tolerate some error, we can significantly reduce measurement cost.
  – How to select a given # paths to probe to estimate f(x) as accurately as possible?
    • Need a metric to quantify goodness of a given set of paths
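A minimal sketch of the rank-based idea, assuming a toy 4-path, 4-link routing matrix (not from the talk): keep rank(A) linearly independent paths, then rebuild the unmonitored path from the monitored ones. The helper name `independent_paths` is hypothetical.

```python
import numpy as np

def independent_paths(A):
    """Greedily keep rows of A that raise the rank of the kept set."""
    chosen = []
    for i in range(A.shape[0]):
        candidate = chosen + [i]
        if np.linalg.matrix_rank(A[candidate]) == len(candidate):
            chosen = candidate
    return chosen

# Toy routing matrix (an assumption): 4 paths (rows) over 4 links (columns).
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],   # equals row 0 + row 1 - row 2
], dtype=float)

x = np.array([1.0, 2.0, 3.0, 4.0])   # made-up link delays
y = A @ x                             # end-to-end delays on all 4 paths

monitored = independent_paths(A)      # rank(A) paths to probe

# Express the unmonitored path as a combination of monitored paths,
# then rebuild its delay from the monitored measurements alone.
coeffs, *_ = np.linalg.lstsq(A[monitored].T, A[3], rcond=None)
y3_rebuilt = coeffs @ y[monitored]
```

Here three of the four paths suffice, but in real topologies rank(A) can still be in the thousands, which is the cost the talk aims to reduce.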
8
Bayesian Experimental Design
• Many practical applications
  – Car crash test, medicine design, software testing
• A good design should maximize the expected utility under the optimal inference algorithm
• Different utility functions yield different design criteria
  – We use Bayesian A-optimal and Bayesian D-optimal design criteria
9
Bayesian A-Optimal Design
• Goal
  – Minimize the squared error $\|Fx - F\hat{x}\|_2^2$
  – Maximize the following expected utility
    $U_A(\xi) = -\iint (Fx - F\hat{x})^T (Fx - F\hat{x})\, p(y, x \mid \xi)\, dx\, dy$
• Let $D(\xi) = (A_S^T A_S + R)^{-1}$, where $\sigma^2 R^{-1}$ is the covariance matrix of x
• Assuming a normal linear system, the Bayesian procedure yields
  $U_A(\xi) = -\sigma^2\, \mathrm{trace}\{F D(\xi) F^T\}$
• This is equivalent to minimizing
  $\phi_A(\xi) = \mathrm{trace}\{F D(\xi) F^T\}$
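As a rough illustration, the A-optimal criterion trace{F D F^T} can be evaluated directly. The routing matrix, the choice F = A, and the identity prior R are assumptions made only for this sketch.

```python
import numpy as np

def phi_A(A_S, F, R):
    """A-optimal criterion trace{F D F^T} with D = (A_S^T A_S + R)^{-1}."""
    D = np.linalg.inv(A_S.T @ A_S + R)
    return float(np.trace(F @ D @ F.T))

# Toy routing matrix (an assumption): 4 paths over 4 links.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)
F = A            # assumed goal: estimate the delay of every path
R = np.eye(4)    # assumed prior term; the talk does not give a value

phi_none = phi_A(A[:0], F, R)   # empty design: no path probed
phi_one = phi_A(A[:1], F, R)    # probe path 0 only
phi_two = phi_A(A[:2], F, R)    # probe paths 0 and 1
```

Each added path shrinks D and therefore the criterion, which is what makes it usable as a measure of how good a set of paths is.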
10
Bayesian D-Optimal Design
• Goal
  – Maximize the expected gain in Shannon information
    $U_D(\xi) = \iint \log\{p(Fx \mid y, \xi)\}\, p(y, x \mid \xi)\, dx\, dy$
• Let $D(\xi) = (A_S^T A_S + R)^{-1}$, where $\sigma^2 R^{-1}$ is the covariance matrix of x
• Assuming a normal linear system, the Bayesian procedure yields
  $U_D(\xi) = -\frac{n}{2}\log(2\pi) - \frac{n}{2} - \frac{1}{2}\log\det\{F D(\xi) F^T\}$
• This is equivalent to minimizing
  $\phi_D(\xi) = \det\{F D(\xi) F^T\}$
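A comparable sketch for the D-optimal criterion, computed in log form for numerical stability. The toy routing matrix, the identity prior R, and the choice of F (three linearly independent paths, so the determinant is nonzero) are assumptions for illustration.

```python
import numpy as np

def log_phi_D(A_S, F, R):
    """log det{F D F^T} with D = (A_S^T A_S + R)^{-1}."""
    D = np.linalg.inv(A_S.T @ A_S + R)
    _sign, logdet = np.linalg.slogdet(F @ D @ F.T)
    return float(logdet)

# Toy routing matrix (an assumption): 4 paths over 4 links.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)
F = A[:3]        # assumed quantities of interest: three independent paths
R = np.eye(4)    # assumed prior term

d_none = log_phi_D(A[:0], F, R)   # empty design
d_one = log_phi_D(A[:1], F, R)    # probe path 0
d_two = log_phi_D(A[:2], F, R)    # probe paths 0 and 1
```

Minimizing the log-determinant is equivalent to minimizing the determinant itself, because the logarithm is monotone.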
11
Search Algorithm
• Given a design criterion $\phi(\xi)$, the next step is to find s rows of A that minimize $\phi(\xi)$
  – This problem is NP-hard
  – We use a sequential search algorithm
    • Start with an empty initial design
    • Sequentially add the row that results in the largest reduction in $\phi(\xi)$
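The sequential search above can be sketched as a plain greedy loop. This is a minimal, unoptimized version that re-evaluates the criterion from scratch for every candidate; the function names, toy matrix, and prior R are assumptions.

```python
import numpy as np

def phi_A(A_S, F, R):
    """A-optimal criterion trace{F D F^T} with D = (A_S^T A_S + R)^{-1}."""
    D = np.linalg.inv(A_S.T @ A_S + R)
    return float(np.trace(F @ D @ F.T))

def sequential_design(A, F, R, s, criterion=phi_A):
    """Start from an empty design; add the row giving the lowest criterion, s times."""
    selected = []
    remaining = list(range(A.shape[0]))
    for _ in range(s):
        best = min(remaining, key=lambda i: criterion(A[selected + [i]], F, R))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy routing matrix (an assumption): 4 paths over 4 links.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)
F = A[[2]]       # assumed goal: only the delay of path 2 matters
R = np.eye(4)    # assumed prior term

design = sequential_design(A, F, R, s=2)
```

With this F, the first path picked is path 2 itself, since measuring it directly reduces the criterion the most. A practical implementation would update D incrementally rather than inverting a matrix per candidate.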
12
Flexibility
Differentiated design
– Give higher weights to the important rows of matrix F
Augmented design
– Identify the additional paths to probe such that, in conjunction with previously monitored paths, they maximize the utility
Multi-user design
– New design criterion: a linear combination of different users’ design criteria
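One way the three variants might look in code, reusing the A-optimal criterion. All helper names, the user weights, and the toy data are hypothetical; the talk does not specify how weights are applied, so row scaling is an assumption.

```python
import numpy as np

def phi_A(A_S, F, R):
    D = np.linalg.inv(A_S.T @ A_S + R)
    return float(np.trace(F @ D @ F.T))

def differentiated_F(F, weights):
    """Scale row i of F by weights[i]; its share of the trace grows by weights[i]**2."""
    return np.asarray(weights, dtype=float)[:, None] * F

def multi_user_phi(A_S, Fs, R, alphas):
    """Linear combination of per-user criteria (alphas are assumed user weights)."""
    return sum(a * phi_A(A_S, F, R) for a, F in zip(alphas, Fs))

def augment(A, existing, k, criterion):
    """Greedily add k paths on top of an already monitored set."""
    design = list(existing)
    for _ in range(k):
        candidates = [i for i in range(A.shape[0]) if i not in design]
        best = min(candidates, key=lambda i: criterion(A[design + [i]]))
        design.append(best)
    return design[len(existing):]

# Toy routing matrix (an assumption): 4 paths over 4 links.
A = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)
R = np.eye(4)                      # assumed prior term
F_user1, F_user2 = A[[0]], A[[3]]  # each user cares about one path

# Differentiated: path 0 counts twice as much as path 3.
F_weighted = differentiated_F(A[[0, 3]], [2.0, 1.0])
phi_weighted = phi_A(A[[2]], F_weighted, R)

# Multi-user + augmented: path 2 is already monitored; add one more path.
combined = lambda A_S: multi_user_phi(A_S, [F_user1, F_user2], R, [0.5, 0.5])
added = augment(A, existing=[2], k=1, criterion=combined)
```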
13
Network Inference
Goal: infer x s.t. $y_S = A_S x$
Main challenge: under-constrained problem
L2-norm minimization
  $\min_x \; \lambda^2 \|x\|_2^2 + \|y - Ax\|_2^2$
L1-norm minimization
  $\min_x \; \lambda \|x\|_1 + \|y - Ax\|_1$
Maximum entropy estimation
  $\min_x \; \lambda^2 \sum_i x_i \log x_i + \|y - Ax\|_2^2$
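For the L2-norm variant, setting the gradient to zero gives the closed form $x = (A^T A + \lambda^2 I)^{-1} A^T y$. A minimal sketch follows; the two probed paths, the link delays, and the value of λ are assumptions.

```python
import numpy as np

def infer_l2(A_S, y_S, lam):
    """Solve min_x lam^2 ||x||_2^2 + ||y_S - A_S x||_2^2 in closed form."""
    n_links = A_S.shape[1]
    return np.linalg.solve(A_S.T @ A_S + lam**2 * np.eye(n_links), A_S.T @ y_S)

# Two probed paths over four links: fewer equations than unknowns.
A_S = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 1],
], dtype=float)
x_true = np.array([1.0, 2.0, 3.0, 4.0])   # made-up link delays
y_S = A_S @ x_true                         # observed end-to-end delays

x_hat = infer_l2(A_S, y_S, lam=1e-3)
```

The estimate reproduces the observed path delays almost exactly but splits each path's delay evenly across its links, which shows why the problem is under-constrained: many x values explain the same observations.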
14
Evaluation Methodology
Data sets

Data set          # nodes  # overlay nodes  # paths  # links  Rank
PlanetLab-RTT     2514     61               3657     5467     769
PlanetLab-loss    1795     60               3270     4628     690
Brite-n1000-o200  1000     200              39800    2883     2051
Brite-n5000-o600  5000     600              359400   14698    9729

Accuracy metric
  $\text{normalized MAE} = \dfrac{\sum_i |\text{infer}_i - \text{actual}_i|}{\sum_i \text{actual}_i}$

Only show the RTT results from PlanetLab
– Refer to our paper for more extensive results, e.g., loss estimation and comparison of inference algorithms
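The normalized MAE metric on this slide is straightforward to compute. The sample RTT values below are made up for illustration.

```python
import numpy as np

def normalized_mae(inferred, actual):
    """Sum of absolute errors divided by the sum of actual values."""
    inferred = np.asarray(inferred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.abs(inferred - actual).sum() / actual.sum())

actual_rtt = [10.0, 20.0, 30.0]     # hypothetical measured RTTs (ms)
inferred_rtt = [12.0, 18.0, 33.0]   # hypothetical inferred RTTs (ms)
error = normalized_mae(inferred_rtt, actual_rtt)
```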
15
Comparison of DOE Algorithms: Estimating Network-Wide Mean RTT
A-optimal yields the lowest error.
16
Comparison of DOE Algorithms: Estimating Per-Path RTT
A-optimal yields the lowest error.
17
Differentiated Design: Inference Error on Preferred Paths
Lower error on the paths with higher weights.
18
Differentiated Design: Inference Error on the Remaining Paths
Error on the remaining paths increases slightly.
19
Augmented Design
A-optimal is most effective in augmenting an existing design.
20
Multi-user Design
A-optimal yields the lowest error.
21
Summary
Our contributions
– Apply Bayesian experimental design to large-scale network performance monitoring
– Develop a flexible framework to accommodate different design requirements
– Develop a toolkit on PlanetLab to measure and estimate network performance
– Our results
  • Higher or comparable accuracy
  • Flexible: differentiated design, augmented design, multi-user design
  • Scalable: can handle 1,000,000 paths and 50,000 links
Future work
– Make measurement design fault tolerant
– Extend our framework to incorporate additional design constraints
– Apply our technique to traffic matrix estimation
22
Thank you!