Chapter 20
Confidence Intervals and Hypothesis Tests for a Population Mean; t distributions
• t distributions
• confidence intervals for a population mean using t distributions
• hypothesis tests for a population mean using t distributions
The Importance of the Central Limit Theorem
When we select simple random samples of size n, the sample means we find will vary from sample to sample. We can model the distribution of these sample means with a probability model that is $N(\mu, \sigma/\sqrt{n})$.
Since the sampling model for $\bar{x}$ is the normal model, when we standardize $\bar{x}$ we get the standard normal z:
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Note that $SD(\bar{x}) = \sigma/\sqrt{n}$.
If μ is unknown, we probably don’t know σ either.
The sample standard deviation s provides an estimate of the population standard deviation σ.
For a sample of size n, the sample standard deviation s is:
$s = \sqrt{\dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
n − 1 is the “degrees of freedom.”
The value s/√n is called the standard error of $\bar{x}$, denoted SE($\bar{x}$):
$SD(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}, \qquad SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$
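The definitions of s and SE can be checked numerically. The following is a minimal sketch using only Python's standard library; the data are the ten sweetness-loss scores from this chapter's cola example, and the variable names are illustrative.

```python
import math
import statistics

# Sweetness-loss scores for the 10 tasters in this chapter's cola example.
losses = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]

n = len(losses)
xbar = statistics.mean(losses)

# Sample standard deviation from the formula: divide by n - 1, not n.
s_manual = math.sqrt(sum((x - xbar) ** 2 for x in losses) / (n - 1))

# statistics.stdev uses the same n - 1 divisor.
s = statistics.stdev(losses)

# Standard error of the sample mean.
se = s / math.sqrt(n)

print(f"n = {n}, mean = {xbar:.2f}, s = {s:.3f}, SE = {se:.3f}")
```

The manual formula and `statistics.stdev` agree because both divide by the degrees of freedom n − 1.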
Standardize using s for σ
Substitute s (sample standard deviation) for σ:
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} \;\longrightarrow\; \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
Not quite correct: not knowing σ means using z is no longer correct.
t-distributions
Suppose that a Simple Random Sample of size n is drawn from a population whose distribution can be approximated by a N(µ, σ) model.
When σ is known, the sampling model for the mean $\bar{x}$ is N(µ, σ/√n).
When σ is estimated from the sample standard deviation s, the sampling model for the mean $\bar{x}$ follows a t distribution t(µ, s/√n) with degrees of freedom n − 1.
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$ is the 1-sample t statistic.
Confidence Interval Estimates
CONFIDENCE INTERVAL for µ:
$\bar{x} \pm t\,\dfrac{s}{\sqrt{n}}$
where:
t = critical value from the t-distribution with n − 1 degrees of freedom
$\bar{x}$ = sample mean
s = sample standard deviation
n = sample size
For very small samples (n < 15), the data should follow a Normal model very closely.
For moderate sample sizes (n between 15 and 40), t methods will work well as long as the data are unimodal and reasonably symmetric.
For sample sizes larger than 40, t methods are safe to use unless the data are extremely skewed. If outliers are present, analyses can be performed twice, with the outliers and without.
t distributions
Very similar to z ~ N(0, 1)
Sometimes called Student’s t distribution; Gosset, brewery employee
Properties:
i) symmetric around 0 (like z)
ii) degrees of freedom ν:
if ν > 1, E(t_ν) = 0
if ν > 2, SD(t_ν) = √(ν/(ν − 2)), which is always bigger than 1.
Student’s t Distribution
[Figure 11.3, Page 372: density curves of the standard normal Z and of t with 1 and 7 degrees of freedom (t1, t7), drawn on a common horizontal axis from −3 to 3.]
$z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}}, \quad \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \qquad\qquad t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}, \quad s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
Degrees of Freedom
$s = \sqrt{s^2}, \qquad s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1}$
t-Table (text: inside back cover)

Degrees of Freedom   0.80     0.90     0.95     0.98     0.99    (confidence level)
1                    3.0777   6.314    12.706   31.821   63.657
2                    1.8856   2.9200   4.3027   6.9645   9.9250
.                    .        .        .        .        .
10                   1.3722   1.8125   2.2281   2.7638   3.1693
.                    .        .        .        .        .
100                  1.2901   1.6604   1.9840   2.3642   2.6259
∞                    1.282    1.6449   1.9600   2.3263   2.5758
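90% confidence interval; df = n − 1 = 10
90% confidence interval: $\bar{x} \pm 1.8125\,\dfrac{s}{\sqrt{11}}$
Student’s t Distribution
[Figure: t10 density with central area .90 between −1.8125 and 1.8125 and area .05 in each tail.]
P(t > 1.8125) = .05
P(t < −1.8125) = .05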
Comparing t and z Critical Values

Conf. level   z          t (n = 30, df = 29)
90%           1.645      1.6991
95%           1.96       2.0452
98%           2.33       2.4620
99%           2.58       2.7564
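The comparison can be reproduced without a printed table. The sketch below is one possible approach, not the method used for the textbook table: it integrates the t density with Simpson's rule and finds each critical value by bisection. The function names `t_pdf`, `t_central_area`, and `t_critical` are illustrative, and the search assumes the critical value lies below 100 (true for confidence levels up to 0.99).

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_central_area(t, df, steps=2000):
    """P(-t < T < t) for t >= 0: Simpson's rule on [0, t], doubled by symmetry."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_critical(conf_level, df):
    """Critical value whose central area equals conf_level (bisection on [0, 100])."""
    lo, hi = 0.0, 100.0  # assumption: the critical value is below 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_central_area(mid, df) < conf_level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# z critical values as listed in the comparison table above.
z_values = {0.90: 1.645, 0.95: 1.96, 0.98: 2.33, 0.99: 2.58}
for level, z in z_values.items():
    print(f"{level:.0%}: z = {z}, t(df=29) = {t_critical(level, 29):.4f}, "
          f"t(df=100) = {t_critical(level, 100):.4f}")
```

At every confidence level the t critical value exceeds the z value, and the gap shrinks as the degrees of freedom grow.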
Example
– An investor is trying to estimate the return on investment in companies that won quality awards last year.
– A random sample of 41 such companies is selected, and the return on investment is recorded for each company. The data for the 41 companies have $\bar{x} = 14.75$, s = 8.18.
– Construct a 95% confidence interval for the mean return.
degrees of freedom = 41 − 1 = 40
from the t-table, t = 2.0211
$\bar{x} \pm t\,\dfrac{s}{\sqrt{n}} = 14.75 \pm 2.0211 \cdot \dfrac{8.18}{\sqrt{41}} = 14.75 \pm 2.58 = (12.17,\ 17.33)$, d.f. = n − 1
We are 95% confident that the interval (12.17, 17.33) contains the population mean return on investment for companies that win quality awards.
Example
Because cardiac deaths increase after heavy snowfalls, a study was conducted to measure the cardiac demands of shoveling snow by hand.
The maximum heart rates for 10 adult males were recorded while shoveling snow. The sample mean and sample standard deviation were $\bar{x} = 175$, s = 15.
Find a 90% CI for the population mean max. heart rate for those who shovel snow.
Solution
$\bar{x} = 175$, s = 15, n = 10, d.f. = n − 1 = 9
From the t-table, t = 1.8331
$\bar{x} \pm t\,\dfrac{s}{\sqrt{n}} = 175 \pm 1.8331 \cdot \dfrac{15}{\sqrt{10}} = 175 \pm 8.70 = (166.30,\ 183.70)$
We are 90% confident that the interval (166.30, 183.70) contains the mean maximum heart rate for snow shovelers.
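The arithmetic in this solution can be checked with a few lines of Python. This is a minimal sketch; `t_interval` is an illustrative helper name, and the critical value 1.8331 is taken from the t-table rather than computed.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar ± t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Snow-shoveling example: n = 10, so df = 9 and the 90% critical value is 1.8331.
lower, upper = t_interval(xbar=175, s=15, n=10, t_star=1.8331)
print(f"90% CI: ({lower:.2f}, {upper:.2f})")  # 90% CI: (166.30, 183.70)
```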
EXAMPLE: Consumer Protection Agency
Selected random sample of 16 packages of a product whose packages are marked as weighing 1 pound.
From the 16 packages: $\bar{x} = 1.10$ pounds, s = .36 pound
a. find a 95% CI for the mean weight µ of the 1-pound packages
b. should the company’s claim that the mean weight µ is 1 pound be challenged?
EXAMPLE
95% CI, n = 16, df = 15, $\bar{x} = 1.10$, s = .36
critical value of t is 2.1315
$\bar{x} \pm t\,\dfrac{s}{\sqrt{n}}$ becomes $1.10 \pm (2.1315)\,\dfrac{.36}{\sqrt{16}} = 1.10 \pm .19 = (.91,\ 1.29)$
Since 1 pound is in the interval, the company's claim appears reasonable.
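Part (b) amounts to asking whether the claimed mean falls inside the interval from part (a). A short sketch of that check follows; the helper names `t_interval` and `claim_in_interval` are illustrative, and the critical value 2.1315 comes from the t-table.

```python
import math

def t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar ± t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

def claim_in_interval(claimed_mean, interval):
    """True when the claimed mean lies inside the confidence interval."""
    lower, upper = interval
    return lower <= claimed_mean <= upper

# Package weights: n = 16, df = 15, 95% critical value 2.1315.
ci = t_interval(xbar=1.10, s=0.36, n=16, t_star=2.1315)
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print("claim of 1 pound is inside the interval:", claim_in_interval(1.0, ci))
```

A claimed value inside the interval is consistent with the data at this confidence level; it is not proof that the claim is true.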
Chapter 20
Testing Hypotheses about Means
Sweetness in cola soft drinks
Cola manufacturers want to test how much the sweetness of cola drinks is affected by storage. The sweetness loss due to storage was evaluated by 10 professional tasters by comparing the sweetness before and after storage (a positive value indicates a loss of sweetness):

Taster   Sweetness loss
1         2.0
2         0.4
3         0.7
4         2.0
5        −0.4
6         2.2
7        −1.3
8         1.2
9         1.1
10        2.3

We want to test if storage results in a loss of sweetness, thus:
H0: µ = 0 versus HA: µ > 0
where µ is the mean sweetness loss due to storage.
We also do not know the population parameter σ, the standard deviation of the sweetness loss.
The one-sample t-test
As in any hypothesis test, a hypothesis test for µ requires a few steps:
1. State the null and alternative hypotheses (H0 versus HA)
a) Decide on a one-sided or two-sided test
2. Calculate the test statistic t and determine its degrees of freedom
3. Find the area under the t distribution with the t-table or technology
4. State the P-value (or find bounds on the P-value) and interpret the result
The one-sample t-test; hypotheses
Step 1:
1. State the null and alternative hypotheses (H0 versus HA)
a) Decide on a one-sided or two-sided test
H0: µ = µ0 versus HA: µ > µ0 (1-tail test)
H0: µ = µ0 versus HA: µ < µ0 (1-tail test)
H0: µ = µ0 versus HA: µ ≠ µ0 (2-tail test)
The one-sample t-test; test statistic
We perform a hypothesis test with null hypothesis H0: µ = µ0 using the test statistic
$t = \dfrac{\bar{y} - \mu_0}{SE(\bar{y})}$
where the standard error of $\bar{y}$ is $SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$.
When the null hypothesis is true, the test statistic follows a t distribution with n − 1 degrees of freedom. We use that model to obtain a P-value.
The one-sample t-test; P-Values
Recall: The P-value is the probability, calculated assuming the null hypothesis H0 is true, of observing a value of the test statistic as extreme as or more extreme than the value we actually observed.
The calculation of the P-value depends on whether the hypothesis test is 1-tailed (that is, the alternative hypothesis is HA: µ < µ0 or HA: µ > µ0) or 2-tailed (that is, the alternative hypothesis is HA: µ ≠ µ0).
P-Values
Assume the value of the test statistic t is t0.
If HA: µ > µ0, then P-value = P(t > t0)
If HA: µ < µ0, then P-value = P(t < t0)
If HA: µ ≠ µ0, then P-value = 2P(t > |t0|)
Sweetening colas (continued)
Is there evidence that storage results in sweetness loss in colas?
H0: µ = 0 versus HA: µ > 0 (one-sided test)

Taster   Sweetness loss
1         2.0
2         0.4
3         0.7
4         2.0
5        −0.4
6         2.2
7        −1.3
8         1.2
9         1.1
10        2.3
___________________________
Average              1.02
Standard deviation   1.196
Degrees of freedom   n − 1 = 9

$t = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}} = \dfrac{1.02 - 0}{1.196/\sqrt{10}} = 2.70$
P-value = P(t9 > 2.70)

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
9             0.1293  0.3979  0.7027  1.0997  1.3830  1.8331  2.2622  2.8214  3.2498

2.2622 < t = 2.70 < 2.8214; thus 0.01 < P-value < 0.025.
Since P-value < .05, we reject H0. There is a significant loss of sweetness, on average, following storage.
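Reading bounds on the P-value off one row of the t-table can be sketched as a lookup. The code below is illustrative only: the row for df = 9 and the one-tail areas are copied from the table excerpt in this example, and `bracket_one_tail_p` is a hypothetical helper name.

```python
import bisect
import math

# One-tail areas and the df = 9 row, copied from the t-table excerpt for this example.
ONE_TAIL = [0.45, 0.35, 0.25, 0.15, 0.1, 0.05, 0.025, 0.01, 0.005]
T_ROW_DF9 = [0.1293, 0.3979, 0.7027, 1.0997, 1.3830, 1.8331, 2.2622, 2.8214, 3.2498]

def bracket_one_tail_p(t_stat, t_row, tail_areas):
    """Return (low, high) bounds on P(T > t_stat) for t_stat >= 0 from one table row."""
    i = bisect.bisect_right(t_row, t_stat)          # number of table entries <= t_stat
    high = tail_areas[i - 1] if i > 0 else 0.5      # area beyond the entry just below
    low = tail_areas[i] if i < len(t_row) else 0.0  # area beyond the entry just above
    return low, high

ybar, s, n, mu0 = 1.02, 1.196, 10, 0
t_stat = (ybar - mu0) / (s / math.sqrt(n))
low, high = bracket_one_tail_p(t_stat, T_ROW_DF9, ONE_TAIL)
print(f"t = {t_stat:.2f}; {low} < P-value < {high}")  # t = 2.70; 0.01 < P-value < 0.025
```

The lookup only bounds the P-value; an exact value needs software or a finer table.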
Finding P-values with Excel
TDIST(x, deg_freedom, tails)
With tails = 1, TDIST = P(t > x) for a random variable t following the t distribution (x positive). Use it in place of the t-table to obtain the P-value.
– x is the absolute value of the test statistic.
– Deg_freedom is an integer indicating the number of degrees of freedom.
– Tails specifies the number of distribution tails to return. If tails = 1, TDIST returns the one-tailed P-value. If tails = 2, TDIST returns the two-tailed P-value.
Sweetness in cola soft drinks (cont.)
$t = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}} = \dfrac{1.02 - 0}{1.196/\sqrt{10}} = 2.70$
2.2622 < t = 2.70 < 2.8214; thus 0.01 < p < 0.025.
New York City Hotel Room Costs
The NYC Visitors Bureau claims that the average cost of a hotel room is $168 per night. A random sample of 25 hotels resulted in $\bar{y}$ = $172.50 and s = $15.40.
H0: μ = 168
HA: μ ≠ 168
n = 25; df = 24
New York City Hotel Room Costs
$\bar{y}$ = $172.50, s = $15.40
H0: μ = 168
HA: μ ≠ 168
$t = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}} = \dfrac{172.50 - 168}{15.40/\sqrt{25}} = 1.46$
P-value = 2P(t > 1.46), t with 24 df
[Figure: t density with 24 df; area .079 in each tail, below −1.46 and above 1.46.]

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
24            0.1270  0.3900  0.6848  1.0593  1.3178  1.7109  2.0639  2.4922  2.7969

0.1 ≤ P-value ≤ 0.2
P-value = .158
Do not reject H0: not sufficient evidence that true mean cost is different than $168
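The three P-value rules (upper tail, lower tail, two tails) can be combined in one function that works from summary statistics. This is a sketch under stated assumptions rather than a reference implementation: tail areas are obtained by Simpson's rule on the t density instead of a table or Excel, and the names `t_pdf`, `t_upper_tail`, and `one_sample_t_test` are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t0, df, steps=2000):
    """P(T > t0): 0.5 minus the Simpson's-rule area on [0, |t0|], with symmetry for t0 < 0."""
    a = abs(t0)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3
    return tail if t0 >= 0 else 1.0 - tail

def one_sample_t_test(ybar, s, n, mu0, alternative):
    """Return (t statistic, P-value); alternative is 'greater', 'less', or 'two-sided'."""
    t_stat = (ybar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if alternative == "greater":
        p = t_upper_tail(t_stat, df)
    elif alternative == "less":
        p = t_upper_tail(-t_stat, df)   # P(T < t) = P(T > -t) by symmetry
    elif alternative == "two-sided":
        p = 2 * t_upper_tail(abs(t_stat), df)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t_stat, p

# Hotel example: two-sided test of H0: mu = 168 with n = 25 (df = 24).
t_stat, p_value = one_sample_t_test(172.50, 15.40, 25, 168, "two-sided")
print(f"t = {t_stat:.2f}, two-sided P-value = {p_value:.3f}")
```

The computed P-value should fall between the table bounds 0.1 and 0.2 quoted above.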
Microwave Popcorn
A popcorn maker wants a combination of microwave time and power that delivers high-quality popped corn with less than 10% unpopped kernels, on average. After testing, the research department determines that power 9 at 4 minutes is optimum. The company president tests 8 bags in his office microwave and finds the following percentages of unpopped kernels: 7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2.
Do the data provide evidence that the mean percentage of unpopped kernels is less than 10%?
H0: μ = 10
HA: μ < 10
where μ is true unknown mean percentage of unpopped kernels
n = 8; df = 7
Microwave Popcorn
$\bar{y} = 6.775$, s = 3.64
H0: μ = 10
HA: μ < 10
$t = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}} = \dfrac{6.775 - 10}{3.64/\sqrt{8}} = -2.51$
P-value = P(t < −2.51), t with 7 df
[Figure: t density with 7 df; area .02 in the lower tail below −2.51.]

Conf. Level   0.1     0.3     0.5     0.7     0.8     0.9     0.95    0.98    0.99
Two Tail      0.9     0.7     0.5     0.3     0.2     0.1     0.05    0.02    0.01
One Tail      0.45    0.35    0.25    0.15    0.1     0.05    0.025   0.01    0.005
df            Values of t
7             0.1303  0.4015  0.7111  1.1192  1.4149  1.8946  2.3646  2.9980  3.4995

Exact P-value = .02
Reject H0: there is sufficient evidence that true mean percentage of unpopped kernels is less than 10%
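The popcorn test can also be run end to end from the eight raw percentages. The sketch below uses only the standard library; the lower-tail area is computed by Simpson's rule on the t density, which is a stand-in for the table or software behind the slide, and the helper names are illustrative.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_lower_tail(t0, df, steps=2000):
    """P(T < t0): Simpson's rule on [0, |t0|], then symmetry about 0."""
    a = abs(t0)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    beyond = 0.5 - total * h / 3          # area beyond |t0| in one tail
    return beyond if t0 <= 0 else 1.0 - beyond

# Percentages of unpopped kernels in the 8 bags.
unpopped = [7, 13.2, 10, 6, 7.8, 2.8, 2.2, 5.2]
mu0 = 10

n = len(unpopped)
ybar = statistics.mean(unpopped)
s = statistics.stdev(unpopped)
t_stat = (ybar - mu0) / (s / math.sqrt(n))
p_value = t_lower_tail(t_stat, n - 1)     # HA: mu < 10, so use the lower tail

print(f"ybar = {ybar:.3f}, s = {s:.2f}, t = {t_stat:.2f}, P-value = {p_value:.3f}")
```

With only 8 bags, the result also depends on the data being close to Normal, as noted in the sample-size guidance earlier in the chapter.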