2-16-2012
CPS 424/552 Discrete-Event Simulation Techniques, Spring 2012
Chapter 4.1 Sample Statistics
Zhongmei Yao
Department of Computer Science
University of Dayton
Review: Chapter 1 Models
1.1 Introduction
– Model characterization, development
1.2 A Single-Server Queue
1) Conceptual model
2) Specification model
3) Output statistics
4) Computational model
1.3 A Simple Inventory System
– Conceptual model, specification model
– Output statistics
– Computational model
Textbook copyright © 2006, Prentice Hall
Review: Chapter 2 RNG
2.1 Lehmer Random-Number Generators
– Introduction
2.2 Lehmer Random-Number Generators
– Implementation
2.3 Monte Carlo Simulation
2.4 Monte Carlo Simulation Examples
2.5 Finite-State Sequences
Review: Chapter 3 DES
3.1 Discrete-event simulation
– Exponential random variate, geometric random variate
3.2 Multi-stream Lehmer RNGs
– Streams, examples
3.3 Discrete-event simulation examples
– SSQ with immediate feedback
– Simple inventory systems with delivery lag
– Single-server machine shop
Chapter 4 Statistics
4.1 Sample statistics
– Sample mean, sample standard deviation, examples
4.2 Discrete-data histograms
– Histograms, empirical cumulative distribution functions
4.3 Continuous-data histograms
– Histograms, empirical cumulative distribution functions
4.4 Correlation
Chapter Overview
• Discrete-event simulations generate a large amount of experimental data
• This chapter considers how to compress that data into meaningful statistics and how to interpret those sample statistics
• A sample is data collected from a much larger population
• If the sample is small, essentially all that can be done is to compute the sample mean and standard deviation
– Section 4.1
• If the sample is not small, a sample-data histogram can be computed and then used to analyze the distribution of the data in the sample
– Sections 4.2 and 4.3
Sample Mean and Standard Deviation
• How is data collected in DES?
– Within-the-run (e.g., the job-averaged and time-averaged statistics used to characterize the performance of an SSQ system)
– Between-the-run: simulate the system repeatedly, changing only the initial seed from run to run
• Def. 4.1.1: Given a sample $x_1, x_2, \ldots, x_n$ (continuous or discrete)
– Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
– Sample variance: $s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
– Sample standard deviation: $s = \sqrt{s^2}$
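Definition 4.1.1 can be sketched directly in code. This is a minimal illustration, assuming the 1/n form of the variance that these slides adopt; the function name `sample_stats` and the data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def sample_stats(xs):
    """Return (mean, variance, std) for a sample, using the 1/n variance."""
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    mean = sum(xs) / n                                 # first pass: sample mean
    variance = sum((x - mean) ** 2 for x in xs) / n    # second pass: squared deviations
    return mean, variance, math.sqrt(variance)

# Hypothetical sample, not data from the course.
mean, variance, std = sample_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, variance, std)
```

Note that the function reads the data twice, once for the mean and once for the squared deviations, so the whole sample has to be held in memory.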
Sample Mean and Standard Deviation
• Sample mean: a measure of the central tendency of the data values
• Sample variance and sample standard deviation: measures of dispersion
– The spread of the data about the sample mean
– If the data are in seconds, then the sample mean and sample standard deviation are in seconds as well (the sample variance is in seconds squared)
[Figure: normal distribution probability density curves, annotated with the mean and the variance. From http://en.wikipedia.org/wiki/File:Normal_Distribution_PDF.svg]
Sample Variance
• A common alternative definition of the sample variance $s^2$:
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ rather than $s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$
• The $1/(n-1)$ version appears almost universally in statistics texts
– With this version, $s^2$ is undefined for $n = 1$
– The $1/(n-1)$ form is an unbiased estimate of the population variance (its expected value equals the population variance)
• Why consider the $1/n$ form?
– The sample size $n$ is typically large in simulations
– If $n$ is large, the difference is negligible
– We will use the $1/n$ version
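How quickly the two definitions agree as n grows can be checked numerically. The sketch below is illustrative only: the function names, the seed, and the uniform test data are assumptions, not part of the course material. Algebraically the relative gap between the two forms is exactly 1/n.

```python
import random

def variance_biased(xs):
    """Sample variance with the 1/n divisor (the version the slides adopt)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_unbiased(xs):
    """Sample variance with the 1/(n - 1) divisor; undefined for n = 1."""
    n = len(xs)
    if n < 2:
        raise ValueError("the 1/(n - 1) form needs at least two observations")
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def relative_gap(xs):
    """(unbiased - biased) / unbiased, which equals 1/n algebraically."""
    unbiased = variance_unbiased(xs)
    return (unbiased - variance_biased(xs)) / unbiased

rng = random.Random(2012)  # arbitrary seed, for repeatability only
for n in (10, 100, 10_000):
    sample = [rng.random() for _ in range(n)]
    print(n, relative_gap(sample))
```

For the sample sizes typical of a simulation run, the gap is far below the statistical noise in the data, which is the argument the slide makes for using the simpler 1/n form.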
Relating the Mean and Standard Deviation
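• The root-mean-square (rms) function $d(x)$ measures dispersion about any value $x$: $d(x) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - x)^2}$
• Theorem 4.1.1
– The sample mean $\bar{x}$ gives the smallest possible value of $d(x)$
– The standard deviation $s$ is that smallest value: $d(\bar{x}) = \min_x d(x) = s$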
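One way to see why Theorem 4.1.1 holds is to expand the square inside the rms function. The following is a sketch of the standard argument, assuming the 1/n definition of $s^2$ and writing $d(x)$ for the rms dispersion of the sample about $x$:

```latex
\begin{aligned}
d^2(x) &= \frac{1}{n}\sum_{i=1}^{n} (x_i - x)^2
        = \frac{1}{n}\sum_{i=1}^{n} \bigl((x_i - \bar{x}) + (\bar{x} - x)\bigr)^2 \\
       &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2
          + 2(\bar{x} - x)\,\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})
          + (\bar{x} - x)^2 \\
       &= s^2 + (\bar{x} - x)^2
          \qquad \text{since } \sum_{i=1}^{n} (x_i - \bar{x}) = 0 .
\end{aligned}
```

Because $(\bar{x} - x)^2 \ge 0$, it follows that $d(x) \ge s$ for every $x$, with equality exactly when $x = \bar{x}$.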
Relating the Mean and Standard Deviation
• Example 4.1.1:
– Collect 50 observations
– The sample mean is 1.095
– The sample standard deviation is 0.354
– The smallest value of d(x) is s, as shown in the figure
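The behavior of d(x) in this example can be reproduced numerically. The sketch below does not use the 50 observations from Example 4.1.1, which are not given in the slides; the data values and the function name `rms_about` are hypothetical.

```python
import math

def rms_about(xs, x):
    """Root-mean-square dispersion d(x) of the sample xs about the value x."""
    return math.sqrt(sum((xi - x) ** 2 for xi in xs) / len(xs))

data = [0.8, 1.3, 1.0, 1.6, 0.7, 1.2]      # hypothetical observations
mean = sum(data) / len(data)
s = rms_about(data, mean)                  # d evaluated at the mean is s

# Evaluate d(x) at a few points on either side of the sample mean.
for delta in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"x = mean {delta:+.1f}: d(x) = {rms_about(data, mean + delta):.4f}")
print(f"s = {s:.4f}")
```

Every value of d(x) printed is at least s, and the minimum occurs at delta = 0, which is the curve shape the slide's figure illustrates.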
Chebyshev’s Inequality
• To better understand how the mean $\bar{x}$ and $s$ are related, consider the number of points that lie within $k$ standard deviations of the mean
– The parameter $k > 1$
• Let the set $S_k$ contain the points satisfying: $\bar{x} - ks < x_i < \bar{x} + ks$ (an interval of width $2ks$)
• Let $p_k = |S_k| / n$ be the proportion of the $x_i$ that lie within $ks$ of the mean
• Chebyshev’s inequality states: $p_k \ge 1 - 1/k^2$
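Chebyshev's inequality can be checked empirically by counting the points within k standard deviations of the mean. This is a minimal sketch: the function name `proportion_within`, the seed, and the exponential test data are assumptions for illustration, and the 1/n form of s is assumed as in the slides.

```python
import math
import random

def proportion_within(xs, k):
    """Proportion p_k of points strictly within k standard deviations of the mean."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    inside = sum(1 for x in xs if abs(x - mean) < k * s)
    return inside / n

rng = random.Random(2012)                              # arbitrary seed
sample = [rng.expovariate(1.0) for _ in range(1000)]   # hypothetical skewed data

for k in (2, 3):
    p_k = proportion_within(sample, k)
    bound = 1 - 1 / k ** 2
    print(f"k = {k}: p_k = {p_k:.3f}, Chebyshev lower bound = {bound:.3f}")
```

The bound holds for any sample with s > 0, whatever its shape; for typical data the observed proportion is well above it, which is why the inequality is described as conservative.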
Chebyshev’s Inequality
• For $k = 2$, Chebyshev’s inequality gives $p_k \ge 1 - 1/4 = 75\%$
– For any sample, at least 75% of the data values lie within $2s$ of the sample mean. What is the bound on $p_k$ for $k = 3$?
– Example 4.1.1: 95% of points lie within $2s$ of the sample mean
– Chebyshev’s inequality is very conservative for $k = 2$
• Chebyshev’s inequality and practical experience suggest that the interval $\bar{x} \pm 2s$, of width $4s$, is the “effective width” of a sample
– Most (but not all) points will lie in this interval
– Outliers must be viewed with suspicion
Linear Data Transformation
• Often the output data generated by a simulation must be converted to different units
– Example 4.1.2: Suppose $x_1, x_2, \ldots, x_n$ are measured in seconds. To convert to minutes, we let $x_i' = x_i / 60$
• Let $x_i' = a x_i + b$ be the new data
• Sample mean: $\bar{x}' = a \bar{x} + b$
• Sample variance: $(s')^2 = a^2 s^2$
• Sample standard deviation: $s' = |a| s$
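The rules for a linear transformation, mean a·x̄ + b and standard deviation |a|·s, can be verified by transforming the data and recomputing the statistics. In this sketch the coefficients, the data, and the helper name `mean_std` are hypothetical; a negative `a` is used on purpose to show why the absolute value is needed.

```python
import math

def mean_std(xs):
    """Sample mean and 1/n sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    return mean, math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

a, b = -2.0, 3.0                         # hypothetical coefficients
data = [1.0, 4.0, 2.0, 8.0, 5.0]         # hypothetical sample
new_data = [a * x + b for x in data]

mean, s = mean_std(data)
new_mean, new_s = mean_std(new_data)

print(new_mean, a * mean + b)            # the two means agree
print(new_s, abs(a) * s)                 # the two standard deviations agree
```

The practical consequence is that a unit conversion never requires another pass over the data: the statistics can be converted directly.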
Linear Data Transformation
• Example 4.1.2: Suppose $x_1, x_2, \ldots, x_n$ are measured in seconds. To convert to minutes, we let $x_i' = x_i / 60$
– Given $\bar{x}$ is 45 sec, what is $\bar{x}'$?
– Given $s$ is 15 sec, what is $s'$?
• Example 4.1.3: Standardize data by subtracting the sample mean and dividing the result by $s$
– For the sample $x_1, x_2, \ldots, x_n$, the standardized sample is $x_i' = (x_i - \bar{x}) / s$, for $i = 1, 2, \ldots, n$
– Used to avoid problems with very large (or very small) valued data
– What is $\bar{x}'$?
– What is $s'$?
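Both examples are instances of the linear transformation $x_i' = a x_i + b$, for which the mean becomes $a\bar{x} + b$ and the standard deviation becomes $|a| s$. The sketch below works them through; the helper names and the standardization data are hypothetical, and the 1/n form of s is assumed.

```python
import math

def mean_std(xs):
    """Sample mean and 1/n sample standard deviation."""
    n = len(xs)
    mean = sum(xs) / n
    return mean, math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

def standardize(xs):
    """Subtract the sample mean and divide by s (a = 1/s, b = -mean/s)."""
    mean, s = mean_std(xs)
    if s == 0:
        raise ValueError("cannot standardize a sample with zero spread")
    return [(x - mean) / s for x in xs]

# Example 4.1.2: seconds to minutes, so a = 1/60 and b = 0.
mean_minutes = 45 / 60       # 0.75 min
s_minutes = 15 / 60          # 0.25 min
print(mean_minutes, s_minutes)

# Example 4.1.3: a standardized sample has mean 0 and standard deviation 1.
z = standardize([1.0, 2.0, 3.0, 4.0, 10.0])   # hypothetical data
z_mean, z_std = mean_std(z)
print(z_mean, z_std)
```

Up to floating-point roundoff, the standardized sample always has mean 0 and standard deviation 1, regardless of the original units or scale.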
Nonlinear Data Transformation
• When data is used to generate a Boolean (1 or 0) outcome, a nonlinear data transformation is needed
– The value of $x_i$ is not as important as the effect
– E.g., consider the effect "it will rain tomorrow": how much rain will fall is not important
• Let $A$ be a fixed set and $x_i' = \begin{cases} 1 & x_i \in A \\ 0 & \text{otherwise} \end{cases}$
• Let $p$ be the proportion of the $x_i$ that fall in $A$
• Then $\bar{x}' = p$ and $s' = \sqrt{p(1 - p)}$
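The indicator transformation can be sketched as follows: each value is mapped to 1 if it falls in the set A and to 0 otherwise, and the statistics of the resulting 0/1 sample are computed with the 1/n definitions. The function name `indicator_stats`, the membership test, and the delay values are all hypothetical.

```python
import math

def indicator_stats(xs, in_A):
    """Mean and 1/n standard deviation of the 0/1 sample x_i' = 1 if x_i is in A."""
    flags = [1 if in_A(x) else 0 for x in xs]
    n = len(flags)
    mean = sum(flags) / n
    std = math.sqrt(sum((f - mean) ** 2 for f in flags) / n)
    return mean, std

# Hypothetical queueing delays; A is the set of positive numbers.
delays = [0.0, 1.5, 0.0, 2.0, 0.3, 0.0, 4.1, 0.9, 0.0, 1.1]
p, s = indicator_stats(delays, lambda d: d > 0)

print(p)                                # proportion of delays that are positive
print(s, math.sqrt(p * (1 - p)))        # both expressions give the same value
```

The mean of the 0/1 sample is the proportion p, and its standard deviation depends on p alone, so no second statistic needs to be accumulated for a Boolean outcome.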
Nonlinear Data Transformation
• Example 4.1.4: An SSQ system
– Let $x_i = d_i$ be the queueing delay for job $i$
– Let $A = \mathbb{R}^+$ be the set of all positive numbers
– Then $x_i' = 1$ if and only if $d_i > 0$
– From Exercise 1.2.3, the proportion of jobs delayed is $p = 0.723$
– Therefore, $\bar{x}' = 0.723$
– What about $s'$?
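The closing question can be worked out from the 0/1 structure of the transformed data. As a sketch, assuming the 1/n definition of the variance: every $x_i'$ is 0 or 1, so $(x_i')^2 = x_i'$ and the mean of the squares is again $p$.

```latex
\begin{aligned}
(s')^2 &= \frac{1}{n}\sum_{i=1}^{n} (x_i')^2 - (\bar{x}')^2 = p - p^2 = p(1 - p), \\
s'     &= \sqrt{p(1 - p)} = \sqrt{(0.723)(0.277)} \approx 0.448 .
\end{aligned}
```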
Computational Considerations
• Recall that the sample standard deviation is given by $s = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
– This requires two passes through the data
1. Compute the sample mean
2. Compute the squared differences
• The two-pass approach is undesirable for large $n$ since the data must be stored temporarily
– Can we find a one-pass algorithm for computing $s$?
Conventional One-Pass Algorithm
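• A one-pass equation for $s^2$: $s^2 = \left( \frac{1}{n} \sum_{i=1}^{n} x_i^2 \right) - \bar{x}^2$
– Thus, $s^2$ can be computed in one pass by accumulating these two partial sums: $\sum_{i=1}^{n} x_i$ and $\sum_{i=1}^{n} x_i^2$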
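The conventional one-pass approach can be sketched as a loop that keeps only a count and two running sums, the sum of the values and the sum of their squares, and then forms the variance as the mean of the squares minus the square of the mean. The function name `one_pass_stats` and the data are hypothetical, and the clamp to zero is an added safeguard rather than part of the slide's equation.

```python
import math

def one_pass_stats(stream):
    """Return (n, mean, std) from a single pass, using the 1/n variance."""
    n = 0
    total = 0.0        # running sum of x_i
    total_sq = 0.0     # running sum of x_i squared
    for x in stream:
        n += 1
        total += x
        total_sq += x * x
    if n == 0:
        raise ValueError("stream must contain at least one value")
    mean = total / n
    variance = total_sq / n - mean * mean
    # Added safeguard: roundoff can push a tiny variance slightly below zero.
    variance = max(variance, 0.0)
    return n, mean, math.sqrt(variance)

# The data arrives as an iterator, so nothing is stored between observations.
print(one_pass_stats(iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])))
```

One design caveat: this form subtracts two quantities that can be nearly equal when the mean is large relative to the spread, so it can lose precision in floating point.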
Next Time
• Section 4.1
– Welford’s one-pass algorithm
– Time-Averaged Sample Statistics
• Section 4.2