measurement errors 2
TRANSCRIPT
-
7/27/2019 Measurement Errors 2
1/53
MEASUREMENT ERRORS
Copyright 2004 David J. Lilja
1
BY:
DAVID J.LITJA
-
7/27/2019 Measurement Errors 2
2/53
Errors in Experimental Measurements2
Sources of errors
Accuracy, precision, resolution
A mathematical model of errors
Confidence intervals For means
For proportions
How many measurements are needed for desirederror?
-
7/27/2019 Measurement Errors 2
3/53
Why do we need statistics?
1. Noise, noise, noise, noise, noise!
Copyright 2004 David J. Lilja 3
OK not really this type of noise
-
7/27/2019 Measurement Errors 2
4/53
Why do we need statistics?
2. Aggregate data into
meaningful
information.
...x
Copyright 2004 David J. Lilja 4
445 446 397 226388 3445 188 1002
47762 432 54 12
98 345 2245 8839
77492 472 565 999
1 34 882 545 4022827 572 597 364
-
7/27/2019 Measurement Errors 2
5/53
What is a statistic?
Copyright 2004 David J. Lilja
5
A quantity that is computed from a sample [ofdata].
Merriam-Webster
A single number used to summarize a largercollection of values.
-
7/27/2019 Measurement Errors 2
6/53
What are statistics?
Copyright 2004 David J. Lilja
6
A branch of mathematics dealing with thecollection, analysis, interpretation, andpresentation of masses of numerical data.
Merriam-WebsterWe are most interested in analysis and
interpretation here.
Lies, damn lies, and statistics!
-
7/27/2019 Measurement Errors 2
7/53
Goals
Copyright 2004 David J. Lilja
7
Provide intuitive conceptual background for somestandard statistical tools.
Draw meaningful conclusions in presence of noisy
measurements. Allow you to correctly and intelligently apply techniques in
new situations.
Dont simply plug and crank from a formula.
-
7/27/2019 Measurement Errors 2
8/53
Goals
Copyright 2004 David J. Lilja
8
Present techniques for aggregating large quantitiesof data.
Obtain a big-picture view of your results.
Obtain new insights from complex measurement andsimulation results.
E.g. How does a new feature impact the overallsystem?
-
7/27/2019 Measurement Errors 2
9/53
Sources of Experimental Errors
Accuracy, precision, resolution
Copyright 2004 David J. Lilja 9
-
7/27/2019 Measurement Errors 2
10/53
Experimental errors
Copyright 2004 David J. Lilja
10
Errorsnoise in measured values
Systematic errors Result of an experimental mistake
Typically produce constant or slowly varying bias
Controlled through skill of experimenter
Examples Temperature change causes clock drift
Forget to clear cache before timing run
-
7/27/2019 Measurement Errors 2
11/53
Experimental errors
Copyright 2004 David J. Lilja
11
Random errors Unpredictable, non-deterministic Unbiased equal probability of increasing or decreasing
measured value
Result of Limitations of measuring tool Observer reading output of tool Random processes within system
Typically cannot be controlled
Use statistical tools to characterize and quantify
-
7/27/2019 Measurement Errors 2
12/53
Example: Quantization Random error
Copyright 2004 David J. Lilja 12
-
7/27/2019 Measurement Errors 2
13/53
Quantization error
Timer resolution
quantization error
Repeated measurements
X
Completely unpredictable
Copyright 2004 David J. Lilja 13
-
7/27/2019 Measurement Errors 2
14/53
A Model of Errors
Error Measured
value
Probability
-E x E
+E x+ E
Copyright 2004 David J. Lilja 14
-
7/27/2019 Measurement Errors 2
15/53
A Model of Errors
Error 1 Error 2 Measured
value
Probability
-E -E x 2E
-E+E x
+E -E x
+E +E x+ 2E
Copyright 2004 David J. Lilja 15
-
7/27/2019 Measurement Errors 2
16/53
A Model of Errors
Probability
0
0.1
0.2
0.3
0.4
0.5
0.6
x-E x x+E
Measured value
Copyright 2004 David J. Lilja 16
-
7/27/2019 Measurement Errors 2
17/53
Probability of Obtaining a Specific MeasuredValue
Copyright 2004 David J. Lilja 17
-
7/27/2019 Measurement Errors 2
18/53
A Model of Errors
Copyright 2004 David J. Lilja
18
Pr(X=xi) = Pr(measurexi)
= number of paths from real value toxi
Pr(X=xi) ~ binomial distribution
As number of error sources becomes large n,
BinomialGaussian (Normal)
Thus, thebell curve
-
7/27/2019 Measurement Errors 2
19/53
Frequency of Measuring Specific Values
Copyright 2004 David J. Lilja
19
Mean of measured values
True value
Resolution
Precision
Accuracy
-
7/27/2019 Measurement Errors 2
20/53
Accuracy, Precision, Resolution
Copyright 2004 David J. Lilja
20
Systematic errorsaccuracy
How close mean of measured values is to true value
Random errors
precision Repeatability of measurements
Characteristics of toolsresolution
Smallest increment between measured values
-
7/27/2019 Measurement Errors 2
21/53
Quantifying Accuracy, Precision, Resolution
Copyright 2004 David J. Lilja
21
Accuracy
Hard to determine true accuracy
Relative to a predefined standard
E.g. definition of a second
Resolution
Dependent on tools
Precision
Quantify amount ofimprecision using statistical tools
-
7/27/2019 Measurement Errors 2
22/53
Confidence Interval for the Mean
Copyright 2004 David J. Lilja 22
c1 c2
1-
/2 /2
-
7/27/2019 Measurement Errors 2
23/53
Normalizex
1)(deviationstandard
mean
tsmeasuremenofnumber
/
n
1i
2
1
nxxs
xx
n
ns
xxz
i
n
i
i
Copyright 2004 David J. Lilja 23
-
7/27/2019 Measurement Errors 2
24/53
Confidence Interval for the Mean
Normalized zfollows a Students tdistribution (n-1) degrees of freedom
Area left of c2 = 1 /2
Tabulated values for t
Copyright 2004 David J. Lilja 24
c1 c2
1-
/2 /2
-
7/27/2019 Measurement Errors 2
25/53
Confidence Interval for the Mean
As n, normalized distribution becomesGaussian (normal)
Copyright 2004 David J. Lilja 25
c1 c2
1-
/2 /2
-
7/27/2019 Measurement Errors 2
26/53
Confidence Interval for the Mean
Copyright 2004 David J. Lilja
26
1)Pr(
Then,
21
1;2/12
1;2/11
cxc
n
stxc
n
stxc
n
n
-
7/27/2019 Measurement Errors 2
27/53
An Example
Experiment Measured value
1 8.0 s
2 7.0 s
3 5.0 s
4 9.0 s
5 9.5 s
6 11.3 s7 5.2 s
8 8.5 s
Copyright 2004 David J. Lilja 27
-
7/27/2019 Measurement Errors 2
28/53
An Example (cont.)
14.2deviationstandardsample
94.71
s
n
xx
n
i i
Copyright 2004 David J. Lilja 28
-
7/27/2019 Measurement Errors 2
29/53
An Example (cont.)
90% CI 90% chance actual value in interval
90% CI = 0.10 1 - /2 = 0.95
n = 8 7 degrees of freedom
Copyright 2004 David J. Lilja
29
c1 c2
1-
/2 /2
-
7/27/2019 Measurement Errors 2
30/53
90% Confidence Interval
a
n 0.90 0.95 0.975
5 1.476 2.015 2.571
6 1.440 1.943 2.447
7 1.415 1.895 2.365
1.282 1.645 1.960
4.98
)14.2(895.194.7
5.68
)14.2(895.194.7
895.1
95.02/10.012/1
2
1
7;95.01;
c
c
tt
a
na
Copyright 2004 David J. Lilja
30
-
7/27/2019 Measurement Errors 2
31/53
95% Confidence Interval
a
n 0.90 0.95 0.975
5 1.476 2.015 2.571
6 1.440 1.943 2.447
7 1.415 1.895 2.365
1.282 1.645 1.960
7.98
)14.2(365.294.7
1.68
)14.2(365.294.7
365.2
975.02/10.012/1
2
1
7;975.01;
c
c
tt
a
na
Copyright 2004 David J. Lilja
31
-
7/27/2019 Measurement Errors 2
32/53
What does it mean?
Copyright 2004 David J. Lilja
32
90% CI = [6.5, 9.4] 90% chance real value is between 6.5, 9.4
95% CI = [6.1, 9.7]
95% chance real value is between 6.1, 9.7 Why is interval wider when we are more confident?
-
7/27/2019 Measurement Errors 2
33/53
Higher ConfidenceWider Interval?
Copyright 2004 David J. Lilja
33
6.5 9.4
90%
6.1 9.7
95%
-
7/27/2019 Measurement Errors 2
34/53
Key Assumption
Measurement errors areNormally distributed.
Is this true for most
measurements on realcomputer systems?
Copyright 2004 David J. Lilja
34
c1 c2
1-
/2 /2
-
7/27/2019 Measurement Errors 2
35/53
Key Assumption
Copyright 2004 David J. Lilja
35
Saved by the Central Limit Theorem
Sum of a large number of values from anydistribution will be Normally (Gaussian)
distributed. What is a large number? Typically assumed to be > 6 or 7.
-
7/27/2019 Measurement Errors 2
36/53
How many measurements?
Width of interval inversely proportional to n
Want to minimize number of measurements
Find confidence interval for mean, such that:
Pr(actual mean in interval) = (1 )
xexecc )1(,)1(),( 21
Copyright 2004 David J. Lilja
36
-
7/27/2019 Measurement Errors 2
37/53
How many measurements?
Copyright 2004 David J. Lilja
37
2
2/1
2/1
2/1
21 )1(),(
ex
szn
exn
sz
n
szx
xecc
-
7/27/2019 Measurement Errors 2
38/53
How many measurements?
Copyright 2004 David J. Lilja
38
But n depends on knowing mean and standarddeviation!
Estimate s with small number of measurements
Use this s to find n needed for desired interval width
-
7/27/2019 Measurement Errors 2
39/53
How many measurements?
Copyright 2004 David J. Lilja
39
Mean = 7.94 s
Standard deviation = 2.14 s
Want 90% confidence mean is within 7% of actual
mean.
-
7/27/2019 Measurement Errors 2
40/53
How many measurements?
Copyright 2004 David J. Lilja
40
Mean = 7.94 s
Standard deviation = 2.14 s
Want 90% confidence mean is within 7% of actual
mean. = 0.90
(1-/2) = 0.95
Error = 3.5%
e = 0.035
-
7/27/2019 Measurement Errors 2
41/53
How many measurements?
9.212
)94.7(035.0
)14.2(895.12
2/1
ex
szn
213 measurements
90% chance true mean is within
3.5% interval
Copyright 2004 David J. Lilja
41
-
7/27/2019 Measurement Errors 2
42/53
Proportions
Copyright 2004 David J. Lilja
42
p = Pr(success) in n trials of binomial experiment
Estimate proportion: p = m/n m = number of successes
n = total number of trials
-
7/27/2019 Measurement Errors 2
43/53
Proportions
Copyright 2004 David J. Lilja
43
n
pp
zpc
n
pp
zpc
)1(
)1(
2/12
2/11
-
7/27/2019 Measurement Errors 2
44/53
Proportions
Copyright 2004 David J. Lilja
44
How much time does processor spend in OS?
Interrupt every 10 ms
Increment counters
n = number of interrupts m = number of interrupts when PC within OS
-
7/27/2019 Measurement Errors 2
45/53
Proportions
Copyright 2004 David J. Lilja
45
How much time does processor spend in OS? Interrupt every 10 ms
Increment counters n = number of interrupts
m = number of interrupts when PC within OS
Run for 1 minute n = 6000
m = 658
-
7/27/2019 Measurement Errors 2
46/53
Proportions
)1176.0,1018.0(6000
)1097.01(1097.096.11097.0
)1(),( 2/121
n
ppzpcc
95% confidence interval for proportion
So 95% certain processor spends 10.2-11.8% of itstime in OS
Copyright 2004 David J. Lilja
46
-
7/27/2019 Measurement Errors 2
47/53
Number of measurements for proportions
Copyright 2004 David J. Lilja
47
2
2
2/1
2/1
2/1
)(
)1(
)1(
)1()1(
pe
ppzn
n
ppzpe
n
ppzppe
-
7/27/2019 Measurement Errors 2
48/53
Number of measurements for proportions
Copyright 2004 David J. Lilja
48
How long to run OS experiment?
Want 95% confidence
0.5%
-
7/27/2019 Measurement Errors 2
49/53
Number of measurements for proportions
Copyright 2004 David J. Lilja
49
How long to run OS experiment?
Want 95% confidence
0.5%
e = 0.005 p = 0.1097
-
7/27/2019 Measurement Errors 2
50/53
Number of measurements for proportions
102,247,1
)1097.0(005.0)1097.01)(1097.0()960.1(
)(
)1(
2
2
2
2
2/1
pe
ppzn
10 ms interrupts
3.46 hours
Copyright 2004 David J. Lilja
50
-
7/27/2019 Measurement Errors 2
51/53
Important Points
Copyright 2004 David J. Lilja
51
Use statistics to Deal with noisy measurements
Aggregate large amounts of data
Errors in measurements are due to: Accuracy, precision, resolution of tools
Other sources of noise
Systematic, random errors
Important Points: Model errors with bell
-
7/27/2019 Measurement Errors 2
52/53
Important Points: Model errors with bellcurve
Copyright 2004 David J. Lilja
52True value
Precision
Mean of measured values
Resolution
Accuracy
-
7/27/2019 Measurement Errors 2
53/53
Important Points53
Use confidence intervals to quantify precision
Confidence intervals for Mean ofn samples
Proportions
Confidence level Pr(actual mean within computed interval)
Compute number of measurements needed for
desired interval width