chapter 9 sampling distributions ap statistics st. francis high school fr. chris, 2001
TRANSCRIPT
Chapter 9: Sampling Distributions
AP Statistics
St. Francis High School
Fr. Chris, 2001
Two Key Ideas
A Statistic is a Random Variable
As such, its mean and standard deviation can be found by combining the basic random variables that make up the statistic.
Pick Pennies from a Hat
Recall how we did this. Try it again:
– Pick at random
– Note the year
– Compute the mean and standard deviation of your sample
– NEW: Compute what you think the mean and standard deviation of the entire hat are!
Formulas
$\mu(\bar{x}) = \mu$
$\sigma(\bar{x}) = \frac{\sigma}{\sqrt{n}}$
$\mu(\hat{p}) = p$
$\sigma(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$
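The two standard deviation formulas above translate directly into code. The following is a minimal sketch in Python; the function names are illustrative and not part of the slides, and the example values are borrowed from problems later in this chapter.

```python
from math import sqrt

def sd_of_sample_mean(sigma, n):
    """Standard deviation of x-bar: population sigma divided by sqrt(n)."""
    return sigma / sqrt(n)

def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat: sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)

# Cola cans: sigma = 0.4 oz, n = 50
print(round(sd_of_sample_mean(0.4, 50), 4))
# Gallup poll: p = 0.4, n = 1785
print(round(sd_of_sample_proportion(0.4, 1785), 4))
```

The means need no function: the mean of x-bar is the population mean, and the mean of p-hat is p.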
Statistic vs. Parameter
A Statistic describes a sample and is a way to estimate a parameter
A Parameter describes a population
Which is a statistic, which is a parameter?
42% of today’s 15 year-old girls will get pregnant in their teens
37% said they would vote for Joan Smith; on election day, 41% actually did.
The NIH reports that the mean systolic blood pressure for males 35-44 years of age is 128 and the standard deviation is 15. A group of 72 male stockbrokers in this age group has a mean blood pressure of 126.07.
42%: parameter
37%: statistic; 41%: parameter
128, 15: parameters; 126.07: statistic
Bias vs. Variability
Bias: Is your statistic centered around the population’s parameter?
Variability: Is your sample distribution scattered or focused?
Identify the bias and variability of each:
[Figure: sampling distributions to compare, each marked "Population Parameter"]
What about your sample?
Is it variable?
Is it biased? How can you tell?
http://www.mathorama.com/stat/penny97hist.html
http://www.mathorama.com/stat/penny99hist.html
Confidence Intervals
By hand: http://www.mathorama.com/stat/Confidence.html
Computer Simulation: http://www.mathorama.com/stat/RandomSamp.html
Use your sample statistics and what you know of the Central Limit Theorem to make an assertion about the population parameter.
$\bar{x} \pm z \cdot (\text{std. error})$, where $z$ is the z-score for the desired %
[Figure: standard normal curve $y = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$, plotted for x from -3 to 3]
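As an illustration of the interval formula on this slide, here is a small Python sketch. It assumes the population standard deviation is known, so the standard error is sigma over root n; the function names and the penny-sample numbers are hypothetical, not data from the class.

```python
from math import sqrt
from statistics import NormalDist

def z_for_confidence(level):
    """z-score that leaves `level` of the standard normal area in the middle."""
    return NormalDist().inv_cdf(0.5 + level / 2)

def confidence_interval(x_bar, sigma, n, level=0.95):
    """Interval x_bar +/- z * (sigma / sqrt(n)), assuming sigma is known."""
    margin = z_for_confidence(level) * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical penny sample: mean year 1988.2, sigma = 9 years, n = 25
low, high = confidence_interval(1988.2, 9, 25)
print(round(z_for_confidence(0.95), 2), round(low, 1), round(high, 1))
```

For a 95% interval the z-score comes out near 1.96, which is the value usually read from Table A.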
What about a proportion?
The Gallup poll asked a probability sample of 1785 adults whether they attended church or synagogue during the past week. Suppose 40% did attend. How likely is it that a SRS of 1785 would be within 3% of this actual value?
$\mu(\hat{p}) = p$
$\sigma(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$
Two rules of thumb:
The population must be at least 10 times as large as your sample to use this formula for the standard deviation.
np > 10 and n(1-p) > 10 in order to use the normal curve to approximate the distribution of p-hat.
Compute the standard deviation
Since the population is more than 10 times 1785,
$\sigma(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{.4(.6)}{1785}} = 0.0116$
The Probability that p-hat is between 37% and 43%
Since (.4)(1785) > 10 and (.6)(1785) > 10, we can convert to z-scores and use the normal curve.
$z = \frac{x - \mu}{\sigma}$
$\frac{.37 - .40}{.0116} = -2.586$
$\frac{.43 - .40}{.0116} = +2.586$
Using the Normal Distribution…
P(-2.586 < Z < 2.586)
= P(Z < 2.586) - P(Z < -2.586)
= normalcdf(-2.586, 2.586)
= normalcdf(.37, .43, .4, 0.0116)
= .9903!
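The calculator's normalcdf step can be reproduced off the calculator. This is a sketch using Python's standard-library `statistics.NormalDist` in place of normalcdf; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.40, 1785
sd = sqrt(p * (1 - p) / n)           # standard deviation of p-hat

# Sampling distribution of p-hat, approximated by a normal curve
p_hat = NormalDist(mu=p, sigma=sd)

# Area between 37% and 43%, like normalcdf(.37, .43, .4, sd)
prob = p_hat.cdf(0.43) - p_hat.cdf(0.37)
print(round(sd, 4), round(prob, 4))
```

The result agrees with the slide's .9903 to four decimal places.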
Okay, what if you flip a coin 20 times and it's heads 14 times?
Is it a fair coin? How can you justify your answer?
Did you mention sample variability? Bias?
Do the rules of thumb apply to find a sigma? To use the normal distribution?
If you suspect that 70% is this coin’s true proportion, how many times should we flip it so we can use the normal curve?
$(.3)(n) > 10$
$n > \frac{10}{.3} \approx 33.3$
so $n \ge 34$
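The same rule-of-thumb check can be written as a short search. This is a sketch only; the function names are made up for illustration, and the threshold of 10 is the one stated in the rules of thumb earlier.

```python
def rule_of_thumb_ok(n, p, threshold=10):
    """True when both np and n(1-p) exceed the threshold."""
    return n * p > threshold and n * (1 - p) > threshold

def min_flips_for_normal(p, threshold=10):
    """Smallest n for which the rule of thumb holds."""
    n = 1
    while not rule_of_thumb_ok(n, p, threshold):
        n += 1
    return n

print(rule_of_thumb_ok(20, 0.7))    # 20 flips with suspected p = 0.7 -> False
print(min_flips_for_normal(0.7))    # -> 34
```

With only 20 flips, n(1-p) is about 6, so the normal curve is not yet appropriate.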
Dishonest Cola?
DC Cola is suspected of underfilling its cans of cola. They say each can has 12 ounces, with a standard deviation of 0.4 oz.
If this is true, how likely is it to get an average of 11.9 oz. or less by taking a random sample of 50 cans?
Work it out...
Z score?
$z = \frac{11.9 - 12}{0.4/\sqrt{50}} \approx -1.77$
Look up -1.77 in Table A, or
normalcdf(-1E99, -1.77)
Or normalcdf(-1E99, 11.9, 12, .4 / √50)
= .0384
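Both routes to the answer (the Table A lookup on a rounded z, and the direct normalcdf call) can be sketched in Python with the standard library's `statistics.NormalDist` standing in for the calculator. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.0, 0.4, 50
se = sigma / sqrt(n)                 # standard deviation of x-bar

z = (11.9 - mu) / se                 # z-score of the observed mean
p_table = NormalDist().cdf(-1.77)    # Table A lookup with z rounded to -1.77
p_direct = NormalDist(mu, se).cdf(11.9)  # like normalcdf(-1E99, 11.9, 12, se)

print(round(z, 2), round(p_table, 4), round(p_direct, 4))
```

The two probabilities differ slightly in the fourth decimal place because the table route rounds z before looking it up.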
This leads to inference...
If these were your results and the parameter really is where the company says it is (12 oz.), there is still about a 3.8% chance that sample variation alone would lead you to a result of 11.9 oz. or less.
At what point do you reject the company’s claim? At 5%? 1%? 0.1%?
Inferential Statistics
We choose a level of rejection (alpha).
We assume that our results are no different, and any variation is from chance (Null Hypothesis).
If it is unlikely (less than our chosen alpha), we reject the “Null Hypothesis”.
Then claim our results are SIGNIFICANTLY different.
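To see how the choice of alpha changes the conclusion, the cola probability can be compared against the three levels mentioned earlier. This is a minimal sketch; the `decide` helper is hypothetical and simply encodes the rule "reject when the probability is below alpha".

```python
from statistics import NormalDist

def decide(p_value, alpha):
    """Reject the null hypothesis when the result is less likely than alpha."""
    return "reject" if p_value < alpha else "fail to reject"

# Cola example: chance of a sample mean this low if the claim of 12 oz. is true
p_value = NormalDist().cdf(-1.77)

for alpha in (0.05, 0.01, 0.001):
    print(alpha, decide(p_value, alpha))
```

The same evidence rejects the company's claim at the 5% level but not at 1% or 0.1%, which is why alpha is chosen before looking at the data.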
Central Limit Theorem
Draw an SRS of size n from any population whatsoever with mean µ and a finite standard deviation σ. When n is large, the sampling distribution of the sample mean x-bar is close to the normal distribution N(µ, σ/√n) (page 488).
Law of Large Numbers
Draw observations at random from any population with finite mean µ. As the number of observations drawn increases, the mean x-bar of the observed values gets closer and closer to µ.
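Both statements can be checked with a quick simulation. The sketch below is only an illustration: the skewed exponential population (mean 1, standard deviation 1), the sample size of 50, and the fixed seed are arbitrary choices, not from the slides.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2001)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from an exponential population (mu = 1, sigma = 1)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

# Central Limit Theorem: many sample means of size n = 50
n = 50
x_bars = [sample_mean(n) for _ in range(2000)]
center = mean(x_bars)    # should be close to mu = 1
spread = stdev(x_bars)   # should be close to sigma / sqrt(n)
print(round(center, 3), round(spread, 3), round(1 / sqrt(n), 3))

# Law of Large Numbers: one very large sample
big = sample_mean(100_000)
print(round(big, 3))     # should be close to mu = 1
```

A histogram of `x_bars` would look roughly bell-shaped even though the population itself is strongly skewed.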
Homework 9.1-9.4 (489)
Parameter or a statistic?
$\mu = 2.5003$: parameter
$\bar{x} = 2.5009$: statistic
$\hat{p} = 7.2\%$: statistic
$\hat{p} = 48\%$: statistic
$p = 52\%$: parameter
$\bar{x}_1 = 335$: statistic
$\bar{x}_2 = 289$: statistic
9.5 (492) Tumbling Toast
Toss a coin 20 times. P-hat =
10 more times… make a histogram of your p-hats…. Is the center close to .5?
Pool your work.. Is the center near .5?
Is it normal?
9.9 (500) Dead Guinea Pigs
9.10 (510)
A) Large Bias, Large Variability
B) Small Bias, Small Variability
C) Small Bias, Large Variability
D) Large Bias, Small Variability
9.17 (503) School Vouchers
Assuming the poll’s sample size is less than 780,000 (10% of the population of NJ), the variability would be about the same.
9.19 (511) Got Milk?
$n = 1012$, $\hat{p} = .67$, $p = .7$
$\mu(\hat{p}) = p = .7$; $\sigma(\hat{p}) = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.7)(.3)}{1012}} = .0144$
US population $> 10 \times 1012 = 10120$
$np = (1012)(.7) = 708.4 \ge 10$
$n(1-p) = (1012)(.3) = 303.6 \ge 10$
$P(\hat{p} \le .67) = P(Z \le -2.08) = .0186$
$4 \times 1012 = 4048$
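The arithmetic in this problem can be checked with a few lines of Python. This is a sketch with illustrative variable names; reading the final line of the solution as "quadruple the sample size to halve the standard deviation" is an assumption based on the formula for the standard deviation of p-hat.

```python
from math import sqrt
from statistics import NormalDist

p, n, observed = 0.7, 1012, 0.67

sd = sqrt(p * (1 - p) / n)        # standard deviation of p-hat
z = (observed - p) / sd           # z-score of the observed proportion
prob = NormalDist().cdf(z)        # P(p-hat <= .67)
print(round(sd, 4), round(z, 2), round(prob, 4))

# Quadrupling n (4 * 1012 = 4048) cuts the standard deviation in half
sd_4n = sqrt(p * (1 - p) / (4 * n))
print(round(sd_4n, 4))
```

Because n sits under a square root, halving the spread always costs four times the sample size.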
9.33 (519) Juan’s results
$\sigma = 10$
$\frac{\sigma}{\sqrt{n}} = \frac{10}{\sqrt{3}} = 5.7735$ mg
$\frac{10}{\sqrt{n}} = 3$ gives $n \approx 11.1$, so $n = 12$
9.35 (524) Bad Rug
Mean = 1.6, sd = 1.2
normalcdf(2, 9999999, 1.6, 1.2 / √200) ≈ 0
9.39 (525) Cheap Cola
$\mu = 298$, $\sigma = 3$. $P(X < 295)$? $P(\bar{x} < 295)$ with $n = 6$?
$P(X < 295) = P\left(Z < \frac{295 - 298}{3} = -1\right) = .1587$
$P(\bar{x} < 295) = P\left(Z < \frac{295 - 298}{3/\sqrt{6}} = -2.4495\right) = .0072$
9.41 (526) What a Wreck!
$\mu = 2.2$, $\sigma = 1.4$. Not normal, but the distribution of x-bar is (approximately)!
$N\left(2.2,\ \frac{1.4}{\sqrt{52}} = .1941\right)$
$P(\bar{x} < 2) = P\left(Z < \frac{2 - 2.2}{1.4/\sqrt{52}} = -1.0302\right) = .1515$
$P\left(\bar{x} < \frac{100}{52}\right) \approx P\left(Z < \frac{\frac{100}{52} - 2.2}{1.4/\sqrt{52}} = -1.4267\right) = .0768$
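The last step turns a question about a total (fewer than 100 over 52 observations) into a question about a mean by dividing by n. Here is a sketch of both calculations in Python, using `statistics.NormalDist` in place of the calculator; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# Approximate sampling distribution of x-bar, by the Central Limit Theorem
x_bar = NormalDist(mu, sigma / sqrt(n))

p_mean_below_2 = x_bar.cdf(2)            # P(x-bar < 2)
p_total_below_100 = x_bar.cdf(100 / n)   # total < 100 is the same as x-bar < 100/52

print(round(sigma / sqrt(n), 4))
print(round(p_mean_below_2, 4), round(p_total_below_100, 4))
```

Small differences in the last decimal place relative to the hand calculation come from rounding the z-score.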