The Practice of Statistics, Third Edition
Chapter 10: Estimating with Confidence
Copyright © 2008 by W. H. Freeman & Company
Daniel S. Yates
Ex.
Suppose a sample of 50 men had a mean score of 109 on an intelligence test.
• We can estimate that the population mean, μ, is approximately 109.
• x-bar is approximately normally distributed.
• The mean of the sampling distribution of x-bar is equal to the unknown population mean μ.
• The standard deviation of x-bar for an SRS of 50, given the population standard deviation σ = 15, is 15/√50 ≈ 2.1.
• The 68 – 95 – 99.7 rule states that about 95% of all possible sample means x-bar will be within 2 standard deviations (2 × 2.1 = 4.2) of the population mean.
• In 95% of all possible samples, the unknown μ lies within x-bar ± 4.2.
• We are 95% confident that μ lies within 109 ± 4.2; that is, in (104.8, 113.2).
• There are only two possibilities:
1. The interval between 104.8 and 113.2 contains the true population mean
2. Our SRS was one of the few samples for which x-bar is not within 4.2 points of the true μ. Only 5% of all samples give such inaccurate results.
The method we used gives the correct result 95% of the time.
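The arithmetic in this example can be checked with a short Python sketch. The function name `mean_confidence_interval` is a hypothetical helper, not something from the slides, and z* = 2 comes from the 68 – 95 – 99.7 rule rather than from a normal table.

```python
import math

def mean_confidence_interval(x_bar, sigma, n, z_star):
    """Return (low, high) for x_bar ± z_star * sigma / sqrt(n), with sigma known."""
    margin = z_star * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the example: x-bar = 109, sigma = 15, n = 50.
# z* = 2 is the rough 95% multiplier from the 68-95-99.7 rule.
low, high = mean_confidence_interval(109, 15, 50, 2)
print(round(low, 1), round(high, 1))  # 104.8 113.2
```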
Applet showing confidence intervals:
http://onlinestatbook.com/stat_sim/conf_interval/index.html
Suppose you want to construct an 80% confidence interval
Confidence level    Tail area    z*
80%                 0.1          1.282
90%                 0.05         1.645
95%                 0.025        1.960
99%                 0.005        2.576
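The critical values in this table can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_star` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* leaving (1 - confidence) / 2 in each tail."""
    tail_area = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.80, 0.90, 0.95, 0.99):
    tail_area = (1 - level) / 2
    print(f"{level:.0%}  tail area = {tail_area:.3f}  z* = {z_star(level):.3f}")
```

Each printed z* should agree with the table to three decimal places.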
The confidence level is usually chosen to be 90% or higher.
Confidence interval: estimate ± margin of error
x-bar ± z* σ/√n
Ex.
A questionnaire of 160 hotel managers asked how long they had been with their current company. The average time was reported as 11.78 years. Give a 99% confidence interval for the mean number of years that the entire population of managers has been with their current company. Assume the standard deviation of the population is σ = 3.2 years.
11.78 ± 2.576(3.2/√160) = 11.78 ± 0.652 = (11.128, 12.432)
We are 99% confident that the true population mean lies between 11.128 and 12.432.
The method we used will give the correct result 99% of the time.
Margin of error decreases when:
1) z* gets smaller, but this makes the confidence level smaller.
2) σ is smaller: the sample is drawn from a less spread-out population.
3) n, the sample size, gets larger. Quadrupling the sample size cuts the margin of error in half.
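The effect of sample size can be seen by recomputing the hotel-manager margin of error (z* = 2.576, σ = 3.2) at n = 160 and again at four times that size. This is a minimal sketch; `margin_of_error` is a hypothetical helper name.

```python
import math

def margin_of_error(z_star, sigma, n):
    """Margin of error z* * sigma / sqrt(n) for a mean with sigma known."""
    return z_star * sigma / math.sqrt(n)

m_160 = margin_of_error(2.576, 3.2, 160)   # sample size from the example
m_640 = margin_of_error(2.576, 3.2, 640)   # four times as many managers
print(round(m_160, 3), round(m_640, 3))    # 0.652 0.326
print(round(m_640 / m_160, 3))             # 0.5
```

Because n sits under a square root, halving the margin of error always costs four times the sample size.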
• Ideally, we would like:
1) high confidence: the method almost always gives the right result; and
2) a small margin of error: the population parameter is estimated very precisely.
How to choose a sample size for a desired margin of error.
Ex.
How many observations must be made to produce results accurate to within ± 0.005 with 95% confidence? Assume the population standard deviation σ is known.
z* σ/√n ≤ 0.005  =>  n ≥ (z* σ/0.005)²  =>  7.1 < n; choose n greater than or equal to 8.
You must round up to the next integer.
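The sample-size rule can be sketched in Python as follows. The value σ = 0.0068 is an assumption: it is not stated in the transcript, but it is the standard deviation that reproduces the bound n > 7.1 with z* = 1.96 and a margin of 0.005. The function name is also hypothetical.

```python
import math

def required_sample_size(sigma, margin, z_star):
    """Smallest integer n with z_star * sigma / sqrt(n) <= margin."""
    n_exact = (z_star * sigma / margin) ** 2
    return math.ceil(n_exact)  # always round up, never to the nearest integer

# ASSUMPTION: sigma = 0.0068 is inferred from the result n > 7.1, not given in the text.
sigma_assumed = 0.0068
n_exact = (1.96 * sigma_assumed / 0.005) ** 2
print(round(n_exact, 1))                                   # 7.1
print(required_sample_size(sigma_assumed, 0.005, 1.96))    # 8
```

Rounding down to 7 would leave the margin of error slightly larger than the 0.005 target, which is why the result is rounded up.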
It is incorrect to say that the probability is 95% that the true mean lies within a certain interval.
We can say that we are 95% confident that the mean lies within a certain interval, or that the method we used to calculate the interval gives the correct result in 95% of all possible samples of a particular size.
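The "95% of all possible samples" reading can be illustrated with a small simulation. This is a sketch under stated assumptions: the true mean of 110, the seed, and the number of repetitions are arbitrary illustrative choices, and the population is taken to be normal with σ = 15.

```python
import math
import random

def coverage_rate(mu, sigma, n, z_star, repetitions, seed=0):
    """Fraction of simulated intervals x_bar ± z* sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / repetitions

# ASSUMPTION: mu = 110 is a made-up "true" mean used only for the simulation.
rate = coverage_rate(mu=110, sigma=15, n=50, z_star=1.96, repetitions=2000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The fraction should land close to 0.95. Any single interval either contains μ or does not; the 95% describes the method over many samples.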
Tests of Significance
• Significance tests assess the evidence provided by the data in favor of some claim about the population.
• Significance tests begin by stating a hypothesis about a population parameter.
• The null hypothesis, Ho, is always stated as an equality.
Ho: μ = μo
• The alternative hypothesis, Ha, can be stated in one of three ways.
Ha: μ ≠ μo
Ha: μ < μo
Ha: μ > μo
Ex.
A car manufacturer claims that one of its car models gets 33 mpg. A random sample of 30 cars is selected, and the mean gas mileage of this sample, x-bar, is calculated to be 31 mpg. Can we refute the claim of the automaker? Assume σ = 3.5 mpg.
Ho: μ = 33 mpg
Ha: μ < 33 mpg
x-bar = 31 mpg; standard deviation of x-bar = σ/√n = 3.5/√30 = 0.639
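[Figure: normal curve for x-bar centered at 33 with standard deviation 3.5/√30 = 0.639; one standard deviation below the center is 32.361. The area to the left of x-bar = 31 is 0.00087.]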
• x-bar = 31 is way out in the lower tail of the normal curve, so far out that a result this small would almost never occur by chance if the true μ = 33 mpg.
• This is good evidence that the automaker's claim should be rejected in favor of the alternative hypothesis, μ < 33 mpg.
• Generally, P-values < 0.05 are considered small enough to reject Ho. Such a result is statistically significant.
z = (31 - 33)/0.639 = -3.13; P(z < -3.13) = 0.00087
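The test statistic and P-value for the mileage example can be recomputed with the standard library. This is a minimal sketch; `z_statistic` is a hypothetical helper name, and the lower-tail P-value reflects the one-sided alternative that the true mean is below 33 mpg.

```python
import math
from statistics import NormalDist

def z_statistic(x_bar, mu0, sigma, n):
    """Standardized distance of x_bar from the hypothesized mean mu0."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

# Values from the example: x-bar = 31, claimed mean 33, sigma = 3.5, n = 30.
z = z_statistic(31, 33, 3.5, 30)
p_value = NormalDist().cdf(z)  # area to the left of z under the standard normal curve
print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```

Carrying full precision gives z close to -3.13 and a P-value a little under 0.0009, far below the usual 0.05 cutoff.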
Significance level
• We compare the P-value with a fixed value that we regard as decisive.
• The decisive value of P is called the significance level. Its symbol is α.
• Choosing α = 0.05 requires that the data give evidence against Ho so extreme that it would happen in no more than 5% of the possible samples if Ho is true. Choosing α = 0.01 requires that the data give evidence against Ho so extreme that it would happen in no more than 1% of the possible samples if Ho is true.
• If the P-value is as small as or smaller than α, we say that the data are statistically significant at level α. The null hypothesis should be rejected in favor of the alternative hypothesis.
If the P-value is low, reject the Ho.
[Figure panels: One-sided test; Two-sided test]
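The difference between one-sided and two-sided tests can be sketched as below. The function name `p_value`, the string labels for the alternatives, and the test statistic z = -1.8 are all hypothetical choices made for illustration; none of them come from the slides.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value of a z statistic for Ha: 'less', 'greater', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "less":
        return std_normal.cdf(z)
    if alternative == "greater":
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative}")

alpha = 0.05
z = -1.8  # ASSUMPTION: an illustrative test statistic, not from the text
for alternative in ("less", "two-sided"):
    p = p_value(z, alternative)
    decision = "reject Ho" if p <= alpha else "fail to reject Ho"
    print(f"{alternative}: P-value = {p:.4f}, {decision} at alpha = {alpha}")
```

With the same data, the two-sided P-value is twice the one-sided one, so a result can be significant at level 0.05 for a one-sided alternative and not for a two-sided one.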
Choosing an α level in significance tests
• If Ho represents an assumption that the people you must convince have believed for a long time, strong evidence (small α) is needed to persuade them.
• If the consequences of rejecting Ho are drastic (i.e., expensive or final), you may want strong evidence (small α).
• It may be more useful to report the P-value so that each reader can decide for themselves.
• Significance levels of 0.10, 0.05, and 0.01 have been used traditionally, but the border between significant and not significant is not black and white. There is not much difference between P-values of 0.049 and 0.051.
• No significance level is sacred.
Inference as decision
Type I and Type II errors
• If we reject Ho (accept Ha) when Ho is really true, this is a Type I error.
• If we reject Ha (accept Ho) when Ha is really true, this is a Type II error.
             Ho true             Ha true
Reject Ho    Type I error        Correct decision
Reject Ha    Correct decision    Type II error
Significance and Type I error
• The significance level of any fixed level significance test is equal to the probability of making a Type I error.
• The value of α is the probability that the test will reject the null hypothesis Ho when Ho is really true.
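The link between the significance level and the Type I error rate can be checked by simulation. The sketch below assumes a one-sided lower-tail z test using the mileage numbers (hypothesized mean 33, σ = 3.5, n = 30); the seed and repetition count are arbitrary, and the function name is hypothetical.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(mu0, sigma, n, alpha, repetitions, seed=0):
    """Fraction of simulated samples that reject Ho when Ho is actually true."""
    rng = random.Random(seed)
    standard_error = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(repetitions):
        # Ho is true here, so x-bar is drawn from N(mu0, sigma / sqrt(n)).
        x_bar = rng.gauss(mu0, standard_error)
        p_value = NormalDist().cdf((x_bar - mu0) / standard_error)
        if p_value <= alpha:
            rejections += 1
    return rejections / repetitions

rate = type_i_error_rate(mu0=33, sigma=3.5, n=30, alpha=0.05, repetitions=4000)
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rejection rate should be close to the chosen level of 0.05.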
Power of the test
• The probability that a fixed level α significance test will reject Ho when Ha is true is called the power of the test.
• Increasing the sample size n increases the power of the test.
• Increasing the significance level α increases the power of the test.
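Both statements about power can be checked numerically for a one-sided lower-tail z test. This is a sketch under stated assumptions: the true mean of 32 mpg is a made-up alternative chosen for illustration, and `power_lower_tail` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def power_lower_tail(mu0, mu_alt, sigma, n, alpha):
    """Power of a level-alpha z test of Ho: mu = mu0 vs Ha: mu < mu0 when mu = mu_alt."""
    standard_error = sigma / math.sqrt(n)
    # Reject Ho when x-bar falls below this cutoff (inv_cdf(alpha) is negative).
    critical_x_bar = mu0 + NormalDist().inv_cdf(alpha) * standard_error
    # Probability of landing below the cutoff when the true mean is mu_alt.
    return NormalDist(mu_alt, standard_error).cdf(critical_x_bar)

# ASSUMPTION: mu_alt = 32 is an illustrative alternative, not from the text.
base = power_lower_tail(33, 32, 3.5, n=30, alpha=0.05)
larger_n = power_lower_tail(33, 32, 3.5, n=60, alpha=0.05)
larger_alpha = power_lower_tail(33, 32, 3.5, n=30, alpha=0.10)
print(f"n=30, alpha=0.05: {base:.3f}")
print(f"n=60, alpha=0.05: {larger_n:.3f}")
print(f"n=30, alpha=0.10: {larger_alpha:.3f}")
```

Power is computed against one specific alternative value of the mean; when that value equals the null value, the function returns α, which is the Type I error rate.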