

Upload: buck-long

Post on 03-Jan-2016



TRANSCRIPT


9-1: Sampling Distributions

Preparing for Inference!

Parameter: A number that describes the population (usually not known)

Statistic: A number that can be computed from the sample data without making use of any unknown parameters.


Symbols:


Example

Sample surveys show that fewer people enjoy shopping than in the past. A recent survey asked a nationwide random sample of 2500 adults if they agreed or disagreed that “I like buying new clothes but shopping is often frustrating and time-consuming.” Of the respondents, 1650, or 66%, said they agreed.


Example cont’d:

p-hat = 66% = statistic = sample proportion

Population = what we want to draw conclusions about = all US residents aged 18 and older

Parameter = % of all adult US residents who agreed
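The arithmetic behind the statistic can be checked in a few lines. This is a minimal sketch; the variable names are illustrative and do not come from the slides.

```python
# Sample proportion for the shopping survey (counts taken from the example).
n_respondents = 2500   # size of the nationwide random sample
n_agreed = 1650        # respondents who agreed with the statement

# p-hat is a statistic: it is computed from the sample data alone,
# without using the unknown population parameter.
p_hat = n_agreed / n_respondents

print(f"p-hat = {p_hat:.2f}")  # prints "p-hat = 0.66"
```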


Sampling Variability

Sampling variability: the value of a statistic varies in repeated random sampling.

Simulation, Example 9.3 (p. 565)


Simulation, Example 9.3 (p. 565)


Figure 9.1 (p. 566): Sampling distribution of p-hat

Histogram of the values of p-hat from 1000 SRSs of size 100 drawn from a population with p = .70

The sampling distribution is the ideal pattern that would emerge if we looked at all possible samples of size 100 from our population; the histogram of 1000 samples approximates it.
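A simulation like the one behind Figure 9.1 can be sketched as follows. This is an illustration, not the textbook's own procedure: the function name `simulate_p_hats`, the seed, and the text-histogram layout are assumptions. Each individual is modeled as an independent success with probability p, which approximates an SRS when the population is much larger than the sample.

```python
import random
import statistics
from collections import Counter

def simulate_p_hats(p, n, num_samples, seed=0):
    """Return num_samples values of p-hat, each from a simulated sample of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Count the sampled individuals who are "successes".
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# 1000 samples of size 100 from a population with p = 0.70, as in Figure 9.1.
p_hats = simulate_p_hats(p=0.70, n=100, num_samples=1000)

# Text histogram: bins two percentage points wide, one '*' per 5 samples.
bins = Counter(round(ph * 100) // 2 * 2 for ph in p_hats)
for lower in sorted(bins):
    bar = "*" * (bins[lower] // 5)
    print(f"{lower / 100:.2f}-{(lower + 1) / 100:.2f} | {bar}")

print("mean of p-hat:", round(statistics.mean(p_hats), 3))
print("std. dev. of p-hat:", round(statistics.stdev(p_hats), 3))
```

The printed histogram should look roughly symmetric and centered near .70; rerunning with a different seed gives a slightly different picture, which is sampling variability in action.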


Describing Sampling Distributions

Overall shape: symmetric, approximately normal

Outliers/deviations from overall pattern: none

Center: close to the true value of p

Spread: the values of p-hat have a large spread, but because the distribution is close to normal, we can use the standard deviation (sigma) to describe the spread.


Are you a Survivor Fan?

Suppose that the true proportion of US adults who watched Survivor II is p = .37. The graph shows the results of drawing 1000 SRSs of size n = 100 from a population with p = .37.

Shape:

Center:

Spread:

Outliers/Deviations:


Top: Results of drawing 1000 SRSs of size n = 1000 from a population with p = .37

Bottom: Results of drawing 1000 SRSs of size n = 100 from a population with p = .37

What happened when we took n = 1000 vs. n = 100?

Notes on top picture:

Center: close to .37

Spread: small; range is .321 to .421.

Shape: hard to see, since values of p-hat cluster so tightly about .37
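The effect of sample size can be explored with a short simulation. This is a sketch under assumptions: the helper name, the seed, and the independent-success model of an SRS are illustrative choices, so the exact range printed will not match the .321 to .421 reported on the slide.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, rng):
    """Return num_samples simulated values of p-hat for samples of size n."""
    return [
        sum(1 for _ in range(n) if rng.random() < p) / n
        for _ in range(num_samples)
    ]

rng = random.Random(1)  # fixed seed so the run is repeatable
results = {}
for n in (100, 1000):
    p_hats = simulate_p_hats(p=0.37, n=n, num_samples=1000, rng=rng)
    # Store (center, spread, smallest p-hat, largest p-hat) for each sample size.
    results[n] = (
        statistics.mean(p_hats),
        statistics.stdev(p_hats),
        min(p_hats),
        max(p_hats),
    )

for n, (center, spread, low, high) in results.items():
    print(f"n={n:4d}: center={center:.3f}  spread (SD)={spread:.4f}  "
          f"range={low:.3f} to {high:.3f}")
```

Both centers should land close to .37, while the spread for n = 1000 should be clearly smaller than for n = 100.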


Random sampling…

…gives us regular and predictable shapes

…patterns of behavior over many repetitions

…these distributions are approximately normal.


Unbiased Statistic

Bias: Concerns the center of the sampling distribution

A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated.


Examples of Unbiased Estimators

If we draw an SRS from a population in which 60% find shopping frustrating, the mean of the sampling distribution of p-hat is: 0.60

If we draw an SRS from a population in which 50% find shopping frustrating, the mean of the sampling distribution of p-hat is: 0.50
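Unbiasedness of p-hat can be checked empirically: the average of many simulated p-hat values should sit close to the population proportion. This is a minimal sketch; the sample size n = 100, the number of repetitions, and the seed are assumptions, since the slide does not state them.

```python
import random
import statistics

def mean_of_p_hat(p, n, num_samples, rng):
    """Average p-hat over num_samples simulated samples of size n."""
    p_hats = [
        sum(1 for _ in range(n) if rng.random() < p) / n
        for _ in range(num_samples)
    ]
    return statistics.mean(p_hats)

rng = random.Random(2)  # fixed seed so the run is repeatable
means = {}
for true_p in (0.60, 0.50):
    # n=100 and num_samples=2000 are illustrative choices, not from the slides.
    means[true_p] = mean_of_p_hat(true_p, n=100, num_samples=2000, rng=rng)
    print(f"true p = {true_p:.2f}, mean of simulated p-hat = {means[true_p]:.3f}")
```

A simulated mean will not equal p exactly, because only finitely many samples are drawn, but it should differ from p by only a small amount.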


Variability of a statistic…

As long as the candy is well mixed (so that each scoop is a random sample), the variability of the result depends only on the size of the scoop (the sample) and not on the size of the container (the population).
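The scoop-and-container idea can be illustrated by drawing SRSs without replacement from populations of very different sizes. This is a sketch under assumptions: the population sizes, p = .50, the sample sizes, and all names are hypothetical, and both containers are chosen to be much larger than the scoop.

```python
import random
import statistics

def sd_of_p_hat(population_size, p, n, num_samples, rng):
    """Std. dev. of p-hat over repeated SRSs drawn without replacement."""
    # Label the units 0..population_size-1; the first num_successes are "successes".
    num_successes = round(population_size * p)
    p_hats = []
    for _ in range(num_samples):
        sample = rng.sample(range(population_size), n)  # SRS of unit labels
        p_hats.append(sum(1 for unit in sample if unit < num_successes) / n)
    return statistics.stdev(p_hats)

rng = random.Random(4)  # fixed seed so the run is repeatable
sds = {}
for population_size in (10_000, 1_000_000):   # size of the container
    for n in (100, 400):                      # size of the scoop
        sds[(population_size, n)] = sd_of_p_hat(
            population_size, p=0.50, n=n, num_samples=1000, rng=rng
        )
        print(f"container={population_size:>9,}  scoop={n:3d}  "
              f"SD of p-hat={sds[(population_size, n)]:.4f}")
```

For a fixed scoop size the two containers should give nearly the same spread, while the larger scoop should give a noticeably smaller spread in both.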


Bull’s-Eye Analogy

True value of the population parameter: the bull’s-eye. Sample statistic: an arrow fired at the target.

Bias: our aim is off; we consistently miss the bull’s-eye in the same direction.

High variability: repeated shots are widely scattered on the target.

Goal: low bias, low variability. Take random samples (to avoid bias) with big n (to reduce variability)!
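The analogy can be made concrete by measuring how far the average shot lands from the bull's-eye (bias) and how scattered the shots are (variability). This is a hypothetical illustration: the true value p = .60, the sample sizes, and especially the "shifted" statistic, which adds .10 to p-hat purely to manufacture bias, are assumptions and not part of the slides.

```python
import random
import statistics

TRUE_P = 0.60  # hypothetical bull's-eye

def p_hat(n, rng):
    """One simulated sample proportion from a sample of size n."""
    return sum(rng.random() < TRUE_P for _ in range(n)) / n

# Three hypothetical "archers", each a rule for turning a sample into an estimate.
archers = {
    "unbiased, high variability (n=25)": lambda rng: p_hat(25, rng),
    "unbiased, low variability (n=400)": lambda rng: p_hat(400, rng),
    # Artificial statistic that always overshoots by 0.10, to show bias.
    "biased, low variability (n=400, shifted)": lambda rng: p_hat(400, rng) + 0.10,
}

rng = random.Random(3)  # fixed seed so the run is repeatable
summary = {}
for label, shoot in archers.items():
    shots = [shoot(rng) for _ in range(1000)]
    bias = statistics.mean(shots) - TRUE_P   # how far off our aim is on average
    spread = statistics.stdev(shots)         # how scattered the shots are
    summary[label] = (bias, spread)
    print(f"{label:42s} bias={bias:+.3f}  spread={spread:.3f}")
```

Only the second archer meets the stated goal of low bias and low variability.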


In items 1–3, classify each underlined number as a parameter or statistic. Give the appropriate notation for each.

1. Forty-two percent of today’s 15-year-old girls will get pregnant in their teens.

2. A 1993 survey conducted by the Richmond Times-Dispatch one week before election day asked voters which candidate for the state’s attorney general they would vote for. Thirty-seven percent of the respondents said they would vote for the Democratic candidate. On election day, 41% actually voted for the Democratic candidate.

3. The National Center for Health Statistics reports that the mean systolic blood pressure for males 35 to 44 years of age is 128 and the standard deviation is 15. The medical director of a large company looks at the medical records of 72 executives in this age group and finds that the mean systolic blood pressure for these executives is 126.07.


Below are histograms of the values taken by three sample statistics in several hundred samples from the same population. The true value of the population parameter is marked on each histogram.

4. Which statistic has the largest bias among these three? Justify your answer.

5. Which statistic has the lowest variability among these three?

6. Based on the performance of the three statistics in many samples, which is preferred as an estimate of the parameter? Why?
