04/19/23
(c) 2001, Ron S. Kenett, Ph.D. 1
Sampling for Estimation
Instructor: Ron S. Kenett
Email: [email protected]
Course Website: www.kpa.co.il/biostat
Course textbook: MODERN INDUSTRIAL STATISTICS, Kenett and Zacks, Duxbury Press, 1998
Course Syllabus
• Understanding Variability
• Variability in Several Dimensions
• Basic Models of Probability
• Sampling for Estimation of Population Quantities
• Parametric Statistical Inference
• Computer Intensive Techniques
• Multiple Linear Regression
• Statistical Process Control
• Design of Experiments
Key Terms
• Error: sampling, nonsampling
• Standard error: of the mean, of the proportion
• Standardized: individual value, sample mean
• Finite Population Correction (FPC)
• Probability sample: simple random sample, systematic sample, stratified sample, cluster sample
• Nonprobability sample: convenience sample, quota sample, purposive sample, judgment sample
Key Terms
• Unbiased estimator
• Point estimates
• Interval estimates
• Interval limits
• Confidence coefficient
• Confidence level
• Accuracy
• Degrees of freedom (df)
• Maximum likely sampling error
Types of Samples
Probability, or Scientific, Samples: Each element to be sampled has a known (or calculable) chance of being selected.
• Simple random: Every person has an equal chance of being selected. Best when a roster of the population exists.
• Systematic: Randomly enter a stream of elements and sample every kth element. Best when elements are randomly ordered, with no cyclic variation.
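A minimal sketch of simple random and systematic sampling, assuming the population is available as a Python list. The function names and the roster of 100 ID numbers are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(roster, n, seed=None):
    """Draw n elements without replacement; every element is equally likely."""
    rng = random.Random(seed)
    return rng.sample(roster, n)

def systematic_sample(stream, k, seed=None):
    """Pick a random start among the first k elements, then take every kth one."""
    rng = random.Random(seed)
    start = rng.randrange(k)
    return stream[start::k]

roster = list(range(1, 101))  # hypothetical population of 100 ID numbers
print(simple_random_sample(roster, 10, seed=1))
print(systematic_sample(roster, 10, seed=1))
```

The systematic sample is only as good as the ordering of the stream: if the list has a cycle of length k, every selected element falls at the same point in the cycle.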
Types of Samples
Probability, or Scientific, Samples: Each element to be sampled has a known (or calculable) chance of being selected.
• Stratified: Randomly sample elements from every layer, or stratum, of the population. Best when elements within strata are homogeneous.
• Cluster: Randomly sample elements within some of the clusters (groups) of the population. Best when elements within each cluster are heterogeneous.
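The difference between the two designs can be sketched as follows, assuming the population is already split into named groups. The function names and the four groups of 25 IDs are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Randomly sample n_per_stratum elements from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly choose some of the clusters, then sample elements within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

groups = {  # hypothetical population split into four groups of 25 IDs
    "A": list(range(0, 25)),
    "B": list(range(25, 50)),
    "C": list(range(50, 75)),
    "D": list(range(75, 100)),
}
print(stratified_sample(groups, 3, seed=0))   # every group is represented
print(cluster_sample(groups, 2, 5, seed=0))   # only two groups are visited
```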
Types of Samples
Nonprobability Samples: Not every element has a chance to be sampled. The selection process usually involves subjectivity.
• Convenience: Elements are sampled because of ease and availability.
• Quota: Elements are sampled, but not randomly, from every layer, or stratum, of the population.
Types of Samples
Nonprobability Samples: Not every element has a chance to be sampled. The selection process usually involves subjectivity.
• Purposive: Elements are sampled because they are atypical, not representative of the population.
• Judgment: Elements are sampled because the researcher believes the members are representative of the population.
Distribution of the Mean
When the population is normally distributed:
• Shape: Regardless of sample size, the distribution of sample means will be normally distributed.
• Center: The mean of the distribution of sample means is the mean of the population, µ. Sample size does not affect the center of the distribution.
• Spread: The standard deviation of the distribution of sample means, or the standard error, is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
The Standardized Mean
The standardized z-score measures how far the sample mean lies above or below the population mean, in units of standard error.
• "How far above or below": sample mean minus µ
• "In units of standard error": divide by $\sigma / \sqrt{n}$
Standardized sample mean:
$z = \frac{\text{sample mean} - \mu}{\text{standard error}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
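A short worked sketch of the standardized sample mean, using only the Python standard library. The function names and the numbers (µ = 100, σ = 15, n = 25, sample mean 106) are hypothetical.

```python
import math
from statistics import NormalDist

def standard_error_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def standardized_mean(x_bar, mu, sigma, n):
    """z-score of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / standard_error_mean(sigma, n)

# Hypothetical numbers: population mean 100, sigma 15, sample of n = 25.
z = standardized_mean(x_bar=106, mu=100, sigma=15, n=25)
print(z)                    # 2.0: the sample mean is 2 standard errors above mu
print(NormalDist().cdf(z))  # probability of a sample mean at or below 106
```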
Distribution of the Mean
When the population is not normally distributed:
• Shape: When the sample size taken from such a population is sufficiently large, the distribution of its sample means will be approximately normally distributed, regardless of the shape of the underlying population. According to the Central Limit Theorem, the larger the sample size, the more nearly normal the distribution of sample means becomes.
Distribution of the Mean
When the population is not normally distributed:
• Center: The mean of the distribution of sample means is the mean of the population, µ. Sample size does not affect the center of the distribution.
• Spread: The standard deviation of the distribution of sample means, or the standard error, is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
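The center and spread claims can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with rate 1 (strongly skewed, with mean 1 and standard deviation 1), and the function name, sample size, and number of repetitions are hypothetical.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Return reps sample means, each from n draws of an exponential population."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(reps)]

means = simulate_sample_means(n=30, reps=5000)
print(statistics.fmean(means))  # close to the population mean, 1.0
print(statistics.stdev(means))  # close to sigma / sqrt(n) = 1 / sqrt(30)
```

A histogram of `means` would look roughly bell-shaped even though the population itself is far from normal.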
Distribution of the Proportion
When the sample statistic is generated by a count, not a measurement, the proportion of successes in a sample of n trials is p.
• Shape: Whenever both nπ and n(1 – π) are greater than or equal to 5, where π is the population proportion, the distribution of sample proportions will be approximately normally distributed.
Distribution of the Proportion
When the sample proportion of successes in a sample of n trials is p:
• Center: The center of the distribution of sample proportions is the population proportion, π.
• Spread: The standard deviation of the distribution of sample proportions, or the standard error, is $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$.
Distribution of the Proportion
The standardized z-score measures how far the sample proportion lies above or below the population proportion, in units of standard error.
• "How far above or below": sample p – π
• "In units of standard error": divide by $\sqrt{\pi(1-\pi)/n}$
Standardized sample proportion:
$z = \frac{\text{sample proportion} - \pi}{\text{standard error}} = \frac{p - \pi}{\sqrt{\pi(1-\pi)/n}}$
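A minimal sketch of the standardized sample proportion, including the rule-of-thumb check for the normal approximation. The symbol π for the population proportion appears as `pi_` in the code; the function names and the numbers (π = 0.4, n = 100, p = 0.46) are hypothetical.

```python
import math

def normal_approx_ok(pi_, n):
    """Rule of thumb: both n*pi and n*(1 - pi) must be at least 5."""
    return n * pi_ >= 5 and n * (1 - pi_) >= 5

def standard_error_proportion(pi_, n):
    """Standard error of the sample proportion: sqrt(pi * (1 - pi) / n)."""
    return math.sqrt(pi_ * (1 - pi_) / n)

def standardized_proportion(p, pi_, n):
    """z-score of a sample proportion: (p - pi) / standard error."""
    return (p - pi_) / standard_error_proportion(pi_, n)

# Hypothetical numbers: population proportion 0.4, n = 100 trials, sample p = 0.46.
print(normal_approx_ok(0.4, 100))                         # True
print(round(standardized_proportion(0.46, 0.4, 100), 3))  # 1.225
```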
Finite Population Correction
Finite Population Correction (FPC) Factor: $\text{FPC} = \sqrt{\frac{N - n}{N - 1}}$
Rule of Thumb: Use FPC when n > 5% of N.
Apply to: Standard errors of mean and proportion.
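A small sketch of the correction factor and the 5% rule of thumb. The function names and the population of N = 1000 with a sample of n = 100 are hypothetical.

```python
import math

def needs_fpc(n, N):
    """Rule of thumb: apply the correction when n exceeds 5% of N."""
    return n > 0.05 * N

def fpc(n, N):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

# Hypothetical case: sample of 100 from a population of 1000.
print(needs_fpc(100, 1000))  # True, since 100 > 50
print(fpc(100, 1000))        # a little under 1, so the standard error shrinks
```

Multiplying a standard error by this factor always reduces it, and the factor approaches 1 as N grows relative to n.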
Unbiased Point Estimates
Population Parameter | Sample Statistic | Formula
Mean, µ | $\bar{x}$ | $\bar{x} = \frac{\sum x_i}{n}$
Variance, σ² | $s^2$ | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Proportion, π | p | $p = \frac{x \text{ successes}}{n \text{ trials}}$
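The three point estimates can be computed with the Python standard library, as in this sketch. The data values and the counts of successes and trials are hypothetical.

```python
import statistics

data = [4, 8, 6, 5, 7]          # hypothetical measurements
x_bar = statistics.mean(data)   # sum of x_i divided by n
s2 = statistics.variance(data)  # sum of squared deviations divided by n - 1

successes, trials = 12, 40      # hypothetical count data
p = successes / trials          # sample proportion

print(x_bar, s2, p)  # 6 2.5 0.3
```

Note that `statistics.variance` uses the n – 1 denominator, which is what makes it an unbiased estimator of the population variance; `statistics.pvariance` divides by n instead.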
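Confidence Intervals: µ, σ Known
$\bar{x} - z \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z \frac{\sigma}{\sqrt{n}}$
where
$\bar{x}$ = sample mean
σ = population standard deviation
n = sample size
z = standard normal score for area in tail = α/2
ASSUMPTION: infinite population
[Figure: z scale marked –z, 0, +z, with the corresponding $\bar{x}$ scale]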
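A minimal sketch of the interval for µ when σ is known, using `statistics.NormalDist` for the z-score. The function name and the numbers (sample mean 50, σ = 8, n = 64) are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf=0.95):
    """Confidence interval for mu when sigma is known (infinite population)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # tail area alpha/2
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

low, high = z_interval(x_bar=50, sigma=8, n=64)
print(round(low, 2), round(high, 2))  # 48.04 51.96
```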
Confidence Intervals: µ, σ Unknown
$\bar{x} - t \frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t \frac{s}{\sqrt{n}}$
where
$\bar{x}$ = sample mean
s = sample standard deviation
n = sample size
t = t-score for area in tail = α/2, with df = n – 1
ASSUMPTION: population approximately normal and infinite
[Figure: t scale marked –t, 0, +t, with the corresponding $\bar{x}$ scale]
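A sketch of the interval for µ when σ is unknown. The Python standard library has no t distribution, so this version assumes the t-score is looked up in a t table and passed in; where SciPy is available, `scipy.stats.t.ppf` could supply it instead. The function name and the numbers (sample mean 20, s = 3, n = 9) are hypothetical.

```python
import math

def t_interval(x_bar, s, n, t_value):
    """Confidence interval for mu when sigma is unknown.

    t_value is the t-score for tail area alpha/2 with df = n - 1,
    taken from a t table by the caller.
    """
    half_width = t_value * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# n = 9 gives df = 8; the tabled t-score for 95% confidence is 2.306.
low, high = t_interval(x_bar=20, s=3, n=9, t_value=2.306)
print(round(low, 3), round(high, 3))  # 17.694 22.306
```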
Confidence Intervals on π
$p - z \sqrt{\frac{p(1-p)}{n}} \le \pi \le p + z \sqrt{\frac{p(1-p)}{n}}$
where
p = sample proportion
n = sample size
z = standard normal score for area in tail = α/2
ASSUMPTION: n•p ≥ 5, n•(1–p) ≥ 5, and population infinite
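A minimal sketch of the interval for a population proportion, with the n•p and n•(1–p) assumption checked before the normal approximation is used. The function name and the numbers (p = 0.3, n = 200) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(p, n, conf=0.95):
    """Confidence interval for a population proportion (infinite population)."""
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation needs n*p >= 5 and n*(1-p) >= 5")
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # tail area alpha/2
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

low, high = proportion_interval(p=0.3, n=200)
print(round(low, 3), round(high, 3))  # 0.236 0.364
```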
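Confidence Intervals for Finite Populations
Mean:
$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
or
$\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
Proportion:
$p \pm z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}}$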
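The finite-population intervals multiply the usual half-width by the correction factor, which this sketch illustrates for the mean (σ known) and the proportion. The function names and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def fpc(n, N):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def finite_z_interval(x_bar, sigma, n, N, conf=0.95):
    """Interval for mu, sigma known, sampling from a population of size N."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / math.sqrt(n) * fpc(n, N)
    return x_bar - half_width, x_bar + half_width

def finite_proportion_interval(p, n, N, conf=0.95):
    """Interval for a proportion, sampling from a population of size N."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n) * fpc(n, N)
    return p - half_width, p + half_width

# Hypothetical case: n = 64 drawn from N = 500, so n is more than 5% of N.
print(finite_z_interval(x_bar=50, sigma=8, n=64, N=500))
print(finite_proportion_interval(p=0.3, n=64, N=500))
```

Both intervals are narrower than their infinite-population counterparts, because the correction factor is below 1.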
Interpretation of Confidence Intervals
If repeated samples of size n are taken from the same population and a confidence interval is computed from each, about 100(1–α)% of those intervals will contain the population parameter.
OR
We can be 100(1–α)% confident that the population parameter falls within the stated confidence interval.
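The repeated-sampling reading of a confidence interval can be illustrated by simulation: build many intervals from independent samples and count how often they contain the true mean. This is a sketch assuming a normal population with known σ; the function name and the parameter values are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, reps, conf=0.95, seed=0):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / reps

print(coverage(mu=100, sigma=15, n=25, reps=2000))  # typically near 0.95
```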
Sample Size Determination for Infinite Populations
Mean: Note σ is known and e, the bound within which you want to estimate µ, is given.
The interval half-width is e, also called the maximum likely error: $e = z \frac{\sigma}{\sqrt{n}}$
Solving for n, we find: $n = \frac{z^2 \sigma^2}{e^2}$
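A short sketch of the sample size calculation for estimating a mean, rounding up to the next whole observation. The function name and the numbers (σ = 15, e = 2, 95% confidence) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_mean(sigma, e, conf=0.95):
    """Smallest n with z * sigma / sqrt(n) <= e, i.e. n = z^2 sigma^2 / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * sigma ** 2 / e ** 2)

print(sample_size_mean(sigma=15, e=2))  # 217
```

Halving e roughly quadruples the required sample size, since n grows with 1/e².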
Sample Size Determination for Finite Populations
Mean: Note σ is known and e, the bound within which you want to estimate µ, is given.
$n = \frac{\sigma^2}{\frac{e^2}{z^2} + \frac{\sigma^2}{N}}$
where
n = required sample size
N = population size
z = z-score for 100(1–α)% confidence
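A sketch of the finite-population sample size for a mean, using the formula n = σ² / (e²/z² + σ²/N). The function name and the numbers (σ = 15, e = 2, N = 1000) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_mean_finite(sigma, e, N, conf=0.95):
    """Required n for a mean when sampling from a population of size N."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(sigma ** 2 / (e ** 2 / z ** 2 + sigma ** 2 / N))

print(sample_size_mean_finite(sigma=15, e=2, N=1000))  # 178
```

With the same σ and e, an infinite population would require 217 observations, so knowing that N = 1000 saves a noticeable amount of sampling.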
Sample Size Determination of π for Infinite Populations
Proportion: Note e, the bound within which you want to estimate π, is given.
The interval half-width is e, also called the maximum likely error: $e = z \sqrt{\frac{p(1-p)}{n}}$
Solving for n, we find: $n = \frac{z^2 \, p(1-p)}{e^2}$
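A minimal sketch of the sample size calculation for a proportion. The function name is hypothetical, and the example uses p = 0.5, which maximizes p(1 – p) and so gives the most conservative n when no prior estimate is available.

```python
import math
from statistics import NormalDist

def sample_size_proportion(p, e, conf=0.95):
    """Smallest n with z * sqrt(p(1-p)/n) <= e, i.e. n = z^2 p(1-p) / e^2."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

print(sample_size_proportion(p=0.5, e=0.05))  # 385
```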
Sample Size Determination of π for Finite Populations
Proportion: Note e, the bound within which you want to estimate π, is given.
$n = \frac{p(1-p)}{\frac{e^2}{z^2} + \frac{p(1-p)}{N}}$
where
n = required sample size
N = population size
z = z-score for 100(1–α)% confidence
p = sample estimator of π
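A sketch of the finite-population sample size for a proportion, using the formula n = p(1 – p) / (e²/z² + p(1 – p)/N). The function name and the numbers (p = 0.5, e = 0.05, N = 2000) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_proportion_finite(p, e, N, conf=0.95):
    """Required n for a proportion when sampling from a population of size N."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    pq = p * (1 - p)
    return math.ceil(pq / (e ** 2 / z ** 2 + pq / N))

print(sample_size_proportion_finite(p=0.5, e=0.05, N=2000))  # 323
```

As N grows, the term p(1 – p)/N vanishes and the result approaches the infinite-population value of 385.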