lecture 21: chapter 9, section 1 inference for...

©2011 Brooks/Cole, CengageLearning

Elementary Statistics: Looking at the Big Picture 1

Lecture 21: Chapter 9, Section 1Inference for Categorical Variable:Confidence Intervals

3 Forms of InferenceProbability vs. ConfidenceConstructing Confidence IntervalSample Size; Level of Confidence

©2011 Brooks/Cole,Cengage Learning

Elementary Statistics: Looking at the Big Picture L21.2

Looking Back: Review

4 Stages of Statistics Data Production (discussed in Lectures 1-4) Displaying and Summarizing (Lectures 5-12) Probability (discussed in Lectures 13-20) Statistical Inference

1 categorical 1 quantitative categorical and quantitative 2 categorical 2 quantitative



Four Processes of Statistics



Summarizing Categorical Sample Data (Review)

What proportion of sampled students ate breakfastthe day of the survey?

Looking Back: In Part 2, we summarized sample data forsingle variables or relationships.

Looking Ahead: In Part 4, our goal is to go beyond sampledata and draw conclusions about the larger populationfrom which the sample was obtained.



Three Types of Inference Problem

In a sample of 446 students, 0.55 ate breakfast.1. What is our best guess for the population

proportion of students who eat breakfast?Point Estimate

2. What interval should contain the populationproportion of students who eat breakfast?Confidence Interval

3. Is the population proportion of students who eatbreakfast more than half (50%)?Hypothesis Test



Behavior of Sample Proportion (Review)For random sample of size n from population

with p in category of interest, sampleproportion has

mean p is unbiased estimator of p

(sample must be random)



Example: Checking if Estimator is Unbiased

Background: Survey produced sample proportion ofintro stat students (various ages and times of day) ata university who’d eaten breakfast.

Questions: Is the sample representative of all college students? All

students at that university? Were the values of the variable (breakfast or not) recorded

without bias? Responses:

Differences among college cafeterias, etc.______________________________________________

Question not sensitive __________________________

nancyp

Text Box

Practice: 9.6 p.407



Example: Point Estimate for p

Background: In a representative sample ofstudents, 0.55 ate breakfast.

Question: What is our best guess for theproportion of all students at that universitywho eat breakfast?

Response: unbiased estimator for p_____ is best guess for p

nancyp

Text Box

Practice: 9.5b p.407



Example: Point Estimate Inadequate Background: Our best guess for p, population

proportion eating breakfast, is sample proportion 0.55. Questions:

Are we pretty sure the population proportion is 0.55? By approximately what amount is our guess “off”? Are we pretty sure population proportion is > 0.50?

Responses: ___________________________________________ ______________________ ______________________



Beyond a Point Estimate

Sample proportion from unbiased sample isbest estimate for population proportion.Looking Ahead: For point estimate we don’tneed sample size or info about spread. Theseare required for confidence intervals andhypothesis tests, to quantify how good ourpoint estimate is.



Probability vs. Confidence Probability: given population proportion,

how does sample proportion behave? Confidence: given sample proportion,

what is a range of plausible values forpopulation proportion?



Example: Probability Statement

Background: If students pick numbers from 1 to 20at random, p=0.05 should pick #7. For n=400, has mean 0.05 standard deviation shape approximately normal

Question: What does the “95” part of the 68-95-99.7Rule tell us about ?

Response: Probability is approximately 0.95 thatfalls within _________ of ______.

Looking Ahead: This statement about sample proportion is correct but notvery useful for practical purposes. In most real-life problems, we want to drawconclusions about an unknown population proportion.

nancyp

Text Box

Practice: 8.57a p.376



Example: How Far is One from the Other?

Background: An instructor can say abouthis/her position in the classroom:“I’m within 10 feet of this particular student.”

Question: What can be said about where thatstudent is in relation to the instructor?

Response:



DefinitionsMargin of Error: Distance around a sample

statistic, within which we have reason to believethe corresponding parameter falls.

A common margin of error is 2 s.d.s.Confidence Interval for parameter: Interval within

which we have reason to believe the parameterfalls = range of plausible values

A common confidence interval is sample statisticplus or minus 2 s.d.s.

A Closer Look: A parameter is not a R.V. It does not obey thelaws of probability so we must use the word “confidence”.



Example: Confidence Interval for p

Background: 30/400=0.075 students picked #7“at random” from 1 to 20. Let’s assume sampleproportion for n=400 has s.d. 0.01.

Question: What can we claim about populationproportion p picking #7?

Response: We’re pretty sure p is__________________________________________

Looking Back: In Part I, we learned about biased samples. The data suggest p>0.05:students were apparently biased in favor of #7. Their selections were haphazard, notrandom. If sampling individuals or assigning them to experimental treatments is notrandomized, then we produce a confidence interval that is not centered at p.

nancyp

Text Box

Practice: 9.7 p.407



Level of Confidence Corresponds to Multiplier

By “pretty sure”, we mean “95% confident”,because 95% is the probability of sampleproportion within 2 s.d.s of p (for largeenough n).

Looking Back: Our probability statement claimed sampleproportion should fall within 2 s.d.s of population proportion.Now, the inference statement claims population proportionshould be within 2 s.d.s of sample proportion.



Probability Interval for Picking #7



Confidence Interval for p Picking #7We do not sketch a curve showing probabilities for population proportionbecause it is not a random variable.

A Closer Look:How do we knowthe margin oferror?



Behavior of Sample Proportion (Review)For random sample of size n from population

with p in category of interest, sampleproportion has

mean p standard deviationWe do inference because p is unknown; how

can we know the standard deviation, whichinvolves p?



DefinitionStandard error: estimated standard

deviation of a sampling distribution.We estimate standard deviation of

with standard error .

Looking Ahead: In many situations throughoutinference, when needed information about thepopulation is unknown, we substitute knowninformation about the sample.



Definition95% confidence interval for p: (approx.)

estimate= sample proportion

standard error

margin of error =2 standard errors

95% confidence intervalfor population proportion



Confidence Interval Formula: Conditions95% confidence interval for p: (approx.)

Sample must be unbiased(otherwise interval is not really centered at )

n must be large enough so is approx. normal(otherwise multiplier 2 from 68-95-99.7 Rule isincorrect)

Population size must be at least 10n(otherwise formula for s.d., which requiresindependence, is incorrect)



Conditions for Normality in Confidence Interval

Multiplier 2 from normal dist. approximatelycorrect if np and n(1-p) both at least 10.

But p is unknown so substitute :Require

Sample count in (X) and out (n-X) of categoryof interest should both be at least 10.



Example: Checking Sample Size

Background: 30/400=0.075 students picked#7 “at random” from 1 to 20.

Question: Do the data satisfy requirement forapproximate normality of sample proportion?

Response:

nancyp

Text Box




Example: Checking Population Size

Background: To draw conclusions about criminalhistories of a city’s 750 bus drivers, a randomsample of 100 drivers was used.

Question: Is there approximate independence inspite of sampling without replacement, so formulafor standard error is accurate?

Response:

nancyp

Text Box

Practice: 9.9f p.408



Example: Revisiting Original Question

Background: In sample of 446 college students,246 (proportion 0.55) ate breakfast.

Question: Assuming sample is representative, whatinterval should contain proportion of all students atthat university who eat breakfast?

Response: Approx. 95% confidence interval for p isLooking Back:Earlier we wondered ifa majority of studentseat breakfast. Theinterval suggests this isthe case, since it isentirely above 0.50.

nancyp

Text Box

Practice: 9.9a,c p.407



Example: Role of Sample Size Background: 95% confidence intervals based on

sample proportion 0.54 from various sample sizes:

Question: What happens as n increases? Response:

nancyp

Text Box




Example: A Common Margin of Error Background: Pollsters most often report a 3% error margin.

Question: What is the most common sample size for polls? Response: Approximately _______.

nancyp

Text Box

Practice: 9.17c p.409



Other Levels of ConfidenceConfidence level 95% uses multiplier 2. Other levels use

other multipliers, based on normal curve:

z



Other Levels of ConfidenceConfidence level 95% uses multiplier 2. Otherlevels use other multipliers, based on normal curve.More precise multiplier for 95% is 1.96 instead of 2.



Example: Other Levels of Confidence Background: Of 108 students in committed

relationships, 0.70 said they took comfort by sniffingout-of-town partner’s clothing. Standard error canbe found to be 0.04.

Question: How do 90%, 95%, 98%, 99%confidence intervals compare?

Response: 90% C.I. is ___________________=(0.63, 0.77) 95% C.I. is ___________________=(0.62, 0.78) 98% C.I. is ___________________=(0.61, 0.79) 99% C.I. is ___________________=(0.60, 0.80)

nancyp

Text Box

Practice: 9.18a p.409



Example: Other Levels of Confidence

Intervals get _____as confidence level increases:



Confidence Interval and Long-Run Behavior

Repeatedly set up 95% confidence interval forproportion of heads, based on 20 coinflips.

In the long run, 95% of the intervals shouldcontain population proportion of heads, 0.5.



Confidence Interval and Long-Run Behavior



Example: Confidence in the Long Run Background: “President-elect Barack Obama's campaign

strategists weren't the only ones vindicated Tuesday.Pollsters came out looking pretty good, too. Of 27 polls ofPennsylvania voters released in the campaign's final twoweeks, only seven missed Obama's 10.3-point victory bymore than their margins of error. Obama's national victory ofabout 6 points was within the error margins of 16 of the 21national polls released in the final week.”

Question: Should pollsters be pleased with success rates of20/27=16/21= 75% ?

Response:

pittsburghlive.com/x/pittsburghtrib/news/cityregion/s_597288.html

nancyp

Text Box

Practice: 9.35 p.413



Lecture Summary(Inference for Proportions: Confidence Interval)

3 forms of inference; focus on confidence interval Probability vs. confidence Constructing confidence interval

Margin of error based on standard error Conditions

Role of sample size Confidence at other levels Confidence interval in the long run

lecture 21: chapter 9, section 1 inference for...

Documents