Module 17: Two-Sample t-tests, with equal variances for the two populations

This module describes one of the most widely used statistical tests, the two-sample t-test conducted under the assumption that the two populations from which the two samples were selected have the same variance.

Reviewed 11 May 05 /MODULE 17


The General Situation

Up to this point, the focus has been on a single population, for which the observations had a normal distribution with a population mean μ and standard deviation σ. From this population, a random sample of size n provided the sample statistics $\bar{x}$ and s as estimates of μ and σ, respectively.

We created confidence intervals and tested hypotheses concerning the population mean μ, using the normal distribution when the value of σ was available and the t distribution when it was not and we used the sample estimate s instead. This circumstance is often described as the one-sample situation.

Clearly, we are often faced with making judgments for circumstances that involve more than one population and sample. For the moment, we will focus on the so-called two-sample situation. That is, we consider two populations.

| | City A | City B |
|---|---|---|
| Mean | μA | μB |
| SD | σA | σB |

Question:

Do you believe the two populations have the same mean?

Two Sample Hypotheses

H0: μA = μB versus H1: μA ≠ μB

or equivalently

H0: Δ = μA - μB = 0 versus H1: Δ = μA - μB ≠ 0.

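Parameters vs. Estimates

| | Population 1 parameter | Population 1 estimate | Population 2 parameter | Population 2 estimate |
|---|---|---|---|---|
| Populations of individual values | | | | |
| Mean | μ1 | $\bar{x}_1$ | μ2 | $\bar{x}_2$ |
| Variance | σ1² | s1² | σ2² | s2² |
| Standard deviation | σ1 | s1 | σ2 | s2 |
| Populations of means, samples of size n1 and n2 | | | | |
| Mean | μ1 | $\bar{x}_1$ | μ2 | $\bar{x}_2$ |
| Variance | σ1²/n1 | s1²/n1 | σ2²/n2 | s2²/n2 |
| Standard deviation | σ1/√n1 | s1/√n1 | σ2/√n2 | s2/√n2 |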

We are interested in

Δ = µ1 - µ2

with estimate $d = \bar{x}_1 - \bar{x}_2$.

If the samples are independent, then

$$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \mathrm{Var}(\bar{x}_1) + \mathrm{Var}(\bar{x}_2)$$

$$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

When $\sigma_1^2 = \sigma_2^2 = \sigma^2$,

$$\mathrm{Var}(\bar{x}_1 - \bar{x}_2) = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$
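The variance formula for the difference of two independent sample means can be checked numerically. The sketch below is only an illustration: the values σ = 2, n1 = 5, n2 = 8, the number of replications, and the function names are all assumptions chosen for the demonstration, not part of the module.

```python
import random
import statistics


def var_diff_of_means(sigma: float, n1: int, n2: int) -> float:
    """Theoretical Var(xbar1 - xbar2) when both populations share variance sigma**2."""
    return sigma ** 2 * (1 / n1 + 1 / n2)


def simulate_var_diff(sigma: float, n1: int, n2: int, reps: int = 20000, seed: int = 0) -> float:
    """Estimate Var(xbar1 - xbar2) from many pairs of independent normal samples."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    diffs = []
    for _ in range(reps):
        xbar1 = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n1))
        xbar2 = statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n2))
        diffs.append(xbar1 - xbar2)
    return statistics.variance(diffs)


# Illustrative (made-up) settings: sigma = 2, n1 = 5, n2 = 8.
theory = var_diff_of_means(sigma=2.0, n1=5, n2=8)
simulated = simulate_var_diff(sigma=2.0, n1=5, n2=8)
print(f"theoretical: {theory:.3f}, simulated: {simulated:.3f}")
```

The simulated value should land close to the theoretical one; the two will not match exactly because the simulation uses a finite number of replications.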

Estimating σ²

When σ1² = σ2² = σ², we have two estimates of σ², one from sample 1, namely s1², and one from sample 2, namely s2². How can we best use these two estimates of the same thing? One obvious answer is to use the average of the two; however, it may be desirable to take into account that the two samples may not be the same size. If they are not the same size, then we may want the larger one to count more.

Pooled Average

Hence, we use the weighted average of the two sample variances, with the weighting done according to sample size (each variance is weighted by its degrees of freedom, n - 1). This weighted average is called the pooled estimate:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
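A minimal sketch of the pooled estimate follows. The function name `pooled_variance` and the input values are hypothetical, picked only to show how the larger sample pulls the pooled value toward its own variance compared with a simple average.

```python
def pooled_variance(s1_sq: float, n1: int, s2_sq: float, n2: int) -> float:
    """Pooled estimate of a common variance, weighting each sample variance by n - 1."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)


# Hypothetical inputs (not from the module): a small sample and a larger one.
simple_average = (4.0 + 10.0) / 2
pooled = pooled_variance(s1_sq=4.0, n1=5, s2_sq=10.0, n2=21)
print(f"simple average: {simple_average}, pooled: {pooled}")
```

When the two samples have the same size, the weights are equal and the pooled estimate reduces to the simple average of the two variances.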

Estimate of Var($\bar{x}_1 - \bar{x}_2$)

To estimate Var($\bar{x}_1 - \bar{x}_2$), we can use

$$s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)$$

Example 1: Blood Pressures of Children

To investigate the question of whether the children of city A and city B have the same systolic blood pressure, a random sample of n = 10 children was selected from each city and their blood pressures measured. These samples provided the following data:

| Statistic | City A | City B |
|---|---|---|
| n | 10 | 10 |
| $\bar{x}$ (mmHg) | 105.8 | 97.2 |
| s² (mmHg)² | 78.62 | 22.40 |
| s (mmHg) | 8.87 | 4.73 |

We are interested in the difference:

Δ = μA - μB

and we have $\bar{x}_A$ as an estimate of μA and $\bar{x}_B$ as an estimate of μB; hence it is reasonable to use:

$d = \bar{x}_A - \bar{x}_B$ = 105.8 - 97.2 = 8.6 (mm Hg)

as an estimate of Δ = μA - μB.

We then can ask whether this observed difference of 8.6 mm Hg is sufficiently large for us to question whether the two population means could be the same, that is, μA = μB. Clearly, if the two population means are truly equal, that is, if μA = μB is true, then we would expect the two sample means also to be equal, that is $\bar{x}_A = \bar{x}_B$, except for the random error that occurs as a consequence of using random samples to represent the entire populations. The question before us is whether this observed difference of 8.6 mm Hg is larger than could be reasonably attributed to this random error and thus reflects true differences between the population means.

Confidence Interval for μA - μB, using sp

$$df = (n_A - 1) + (n_B - 1) = 18, \qquad \bar{x}_A - \bar{x}_B = 8.6, \qquad t_{0.975}(df) = 2.1009$$

$$C\left[(\bar{x}_A - \bar{x}_B) - t_{0.975}\, s_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}} \le \mu_A - \mu_B \le (\bar{x}_A - \bar{x}_B) + t_{0.975}\, s_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}\right] = 0.95$$

$$s_p^2 = \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{(n_A - 1) + (n_B - 1)} = \frac{9(78.62) + 9(22.4)}{18} = 50.51$$

$$s_p = \sqrt{50.51} = 7.11$$

$$C\left[8.6 - 2.1009(7.11)\sqrt{\frac{1}{10} + \frac{1}{10}} \le \mu_A - \mu_B \le 8.6 + 2.1009(7.11)\sqrt{\frac{1}{10} + \frac{1}{10}}\right] = 0.95$$

$$C\left[1.92 \le \mu_A - \mu_B \le 15.27\right] = 0.95$$
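The interval for Example 1 can be reproduced from the summary statistics alone. This is a sketch using only the standard library; the critical value 2.1009 is copied from the module rather than computed, and the variable names are illustrative.

```python
import math

# Summary statistics from Example 1 (systolic blood pressure, mmHg).
n_a, mean_a, var_a = 10, 105.8, 78.62
n_b, mean_b, var_b = 10, 97.2, 22.40

# Critical value t_0.975(18) as quoted in the module, hard-coded rather than computed.
T_CRIT = 2.1009

df = (n_a - 1) + (n_b - 1)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
std_err = math.sqrt(pooled_var) * math.sqrt(1 / n_a + 1 / n_b)

diff = mean_a - mean_b
lower = diff - T_CRIT * std_err
upper = diff + T_CRIT * std_err
print(f"df = {df}, s_p^2 = {pooled_var:.2f}")
print(f"95% CI for mu_A - mu_B: ({lower:.2f}, {upper:.2f})")
```

Because the code carries full precision instead of rounding sp to 7.11, the limits may differ from the hand calculation in the last decimal place. Either way the interval excludes 0, which agrees with rejecting H0: μA = μB at α = 0.05.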

Example 2: AJPH, April 1994; 84:p644

n1 = 31, n2 = 223

| Statistic | Group 1: OCCP Prog | Group 2: Non OCCP Prog |
|---|---|---|
| n | 31 | 223 |
| Mean | 4.1 | 3.4 |
| SD | 1.2 | 1.5 |
| s² | 1.44 | 2.25 |

1. The hypothesis: H0: μ1 = μ2 vs. H1: μ1 ≠ μ2

2. The assumptions: Independent random samples from normal distributions, σ1² = σ2² = σ²

3. The α level: α = 0.05

4. The test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

5. The critical region: Reject H0 if t is not between ± t0.975(252) = 1.97

6. Test result:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{30(1.44) + 222(2.25)}{30 + 222} = \frac{542.7}{252} = 2.154$$

$$s_p = \sqrt{2.154} = 1.47, \qquad \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = \sqrt{\frac{1}{31} + \frac{1}{223}} = 0.19$$

$$t = \frac{4.1 - 3.4}{1.47(0.19)} = 2.5$$

7. The Conclusion: Reject H0 since t = 2.5 is not between ± 1.97; 0.01 < p < 0.02
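The seven steps for Example 2 can be sketched as a small function that works from summary statistics. The function name `pooled_t_from_summary` is hypothetical, and the critical value 1.97 is taken from the module rather than computed.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for the equal-variance two-sample t-test from summary statistics."""
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


T_CRIT = 1.97  # t_0.975(252) as quoted in the module
t_stat, df = pooled_t_from_summary(4.1, 1.2, 31, 3.4, 1.5, 223)
reject_h0 = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f} on {df} df, reject H0: {reject_h0}")
```

Comparing |t| with the critical value implements the two-sided rule "reject if t is not between ± 1.97".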

Example 3: AJPH July 1994; 89:1068

| Statistic | Mainland Puerto Ricans | Cuban Americans |
|---|---|---|
| n | 1,383 | 357 |
| Mean | 3.3 | 2.4 |
| SE | 0.2 | 0.7 |
| s = SE√n | (0.2)√1383 = 7.44 | (0.7)√357 = 13.23 |
| s² | 55.35 | 175.03 |

Source: AJPH, July 1994; 89:1068

1. The hypothesis: H0: µ1 = µ2 vs. H1: µ1 ≠ µ2

2. The assumptions: Independent random samples from normal distributions, σ1² = σ2² = σ²

3. The α level: α = 0.05

4. The test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

5. The critical region: Reject H0 if t is not between ± t0.975(1738) = 1.96

6. The Result:

$$s_p^2 = \frac{1382(55.35) + 356(175.03)}{1382 + 356} = 79.86, \qquad s_p = \sqrt{79.86} = 8.94$$

$$\sqrt{\frac{1}{1383} + \frac{1}{357}} = 0.059$$

$$t = \frac{3.3 - 2.4}{8.94(0.059)} = \frac{0.90}{0.527} = 1.71$$

7. Conclusion: Do not reject H0: μ1 = μ2, since t = 1.71 lies between ± 1.96 and p > 0.05; 0.05 < p < 0.10
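Example 3 reports standard errors rather than standard deviations, so the first step is to recover s = SE·√n for each group. The sketch below carries that conversion through to the t statistic; the helper name `sd_from_se` is hypothetical and the critical value 1.96 is copied from the module.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Recover the sample standard deviation from a reported standard error of the mean."""
    return se * math.sqrt(n)


# Summary statistics from Example 3.
n1, mean1, se1 = 1383, 3.3, 0.2
n2, mean2, se2 = 357, 2.4, 0.7

s1 = sd_from_se(se1, n1)
s2 = sd_from_se(se2, n2)

df = (n1 - 1) + (n2 - 1)
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

T_CRIT = 1.96  # t_0.975(1738) as quoted in the module
print(f"s1 = {s1:.2f}, s2 = {s2:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRIT else "do not reject H0")
```

Carrying full precision gives a t value slightly below the hand-computed 1.71, because the slide rounds s and the square-root term at intermediate steps. The conclusion is unchanged.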

Example 4: AJPH July 1994; 89:1068

1. The hypothesis: H0: μSSS = μNHS vs. H1: μSSS ≠ μNHS

2. The α level: α = 0.05

3. The assumptions: Independent Samples, Normal Distribution, σSSS² = σNHS²

4. The test statistic:

$$t = \frac{\bar{X}_{SSS} - \bar{X}_{NHS}}{S_p\sqrt{\frac{1}{n_{SSS}} + \frac{1}{n_{NHS}}}}$$

5. The critical region: Reject if t is not between ± 2.1315

6. The result:

$$s_p^2 = \frac{(n_{SSS} - 1)s_{SSS}^2 + (n_{NHS} - 1)s_{NHS}^2}{(n_{SSS} - 1) + (n_{NHS} - 1)} = \frac{6(142.5)^2 + 9(334.1)^2}{6 + 9} = 75{,}096$$

$$s_p = 274.0$$

$$t = \frac{1427.6 - 1057.5}{274.0\sqrt{\frac{1}{7} + \frac{1}{10}}} = \frac{370.1}{274.0(0.49)} = 2.75$$

7. The conclusion: Reject H0: μSSS = μNHS; 0.01 < p < 0.02

Independent Random Samples from Two Populations of Serum Uric Acid Values

| | Sample 1 | Sample 2 |
|---|---|---|
| | 1.2 | 1.7 |
| | 0.8 | 1.5 |
| | 1.1 | 2.0 |
| | 0.7 | 2.1 |
| | 0.9 | 1.1 |
| | 1.1 | 0.9 |
| | 1.5 | 2.2 |
| | 0.8 | 1.8 |
| | 1.6 | 1.3 |
| | 0.9 | 1.5 |
| Sum | 10.6 | 16.1 |
| Mean | 1.06 | 1.61 |

Serum Uric Acid Worksheet

| | Sample 1: x | Sample 1: x² | Sample 2: x | Sample 2: x² |
|---|---|---|---|---|
| | 1.2 | 1.44 | 1.7 | 2.89 |
| | 0.8 | 0.64 | 1.5 | 2.25 |
| | 1.1 | 1.21 | 2.0 | 4.00 |
| | 0.7 | 0.49 | 2.1 | 4.41 |
| | 0.9 | 0.81 | 1.1 | 1.21 |
| | 1.1 | 1.21 | 0.9 | 0.81 |
| | 1.5 | 2.25 | 2.2 | 4.84 |
| | 0.8 | 0.64 | 1.8 | 3.24 |
| | 1.6 | 2.56 | 1.3 | 1.69 |
| | 0.9 | 0.81 | 1.5 | 2.25 |
| Sum | 10.6 | 12.06 | 16.1 | 27.59 |
| Mean | 1.06 | | 1.61 | |
| Sum²/n | 11.236 | | 25.921 | |
| SS | 0.824 | | 1.669 | |
| Variance | 0.092 | | 0.185 | |
| SD | 0.303 | | 0.431 | |
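The worksheet rows follow the shortcut SS = Σx² - (Σx)²/n, with the variance obtained as SS/(n - 1). A small sketch that reproduces those rows is given below; the function name `worksheet` and the dictionary keys are illustrative choices, and `statistics.variance` is used only as an independent check.

```python
import math
import statistics

sample1 = [1.2, 0.8, 1.1, 0.7, 0.9, 1.1, 1.5, 0.8, 1.6, 0.9]
sample2 = [1.7, 1.5, 2.0, 2.1, 1.1, 0.9, 2.2, 1.8, 1.3, 1.5]


def worksheet(values):
    """Reproduce the worksheet rows: sum, sum of squares, Sum^2/n, SS, variance, SD."""
    n = len(values)
    total = sum(values)
    sum_sq = sum(x * x for x in values)
    correction = total ** 2 / n      # the "Sum^2/n" row
    ss = sum_sq - correction         # corrected sum of squares
    variance = ss / (n - 1)
    return {
        "sum": total,
        "sum_sq": sum_sq,
        "correction": correction,
        "ss": ss,
        "variance": variance,
        "sd": math.sqrt(variance),
    }


w1, w2 = worksheet(sample1), worksheet(sample2)
for name, w in (("Sample 1", w1), ("Sample 2", w2)):
    print(name, {key: round(value, 3) for key, value in w.items()})

# Independent check against the standard library's sample variance.
assert abs(w1["variance"] - statistics.variance(sample1)) < 1e-9
assert abs(w2["variance"] - statistics.variance(sample2)) < 1e-9
```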

s1² = 0.09, s2² = 0.19

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(9)(0.09) + (9)(0.19)}{9 + 9} = \frac{0.81 + 1.71}{18} = 0.14$$

$$s_p = 0.37$$

Testing the Hypothesis That The Two Serum Uric Acid Populations Have The Same Mean

1. The hypothesis: H0: μ1 = μ2 vs. H1: μ1 ≠ μ2

2. The α-level: α = 0.05

3. The assumptions: Independent Random Samples, Normal Distribution, σ1² = σ2²

4. The test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

5. The rejection region: Reject H0: μ1 = μ2 if t is not between ± t0.975(18) = 2.1009

6. The result:

$$t = \frac{1.06 - 1.61}{0.37\sqrt{\frac{1}{10} + \frac{1}{10}}} = \frac{-0.55}{0.37(0.45)} = -3.3$$

7. The conclusion: Reject H0: μ1 = μ2, since t is not between ± 2.1009
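Starting from the raw serum uric acid values rather than the rounded summary figures, the same test can be sketched with the standard library's `statistics` module. The function name `pooled_t_test` is hypothetical, and the critical value 2.1009 is copied from the module.

```python
import math
import statistics

sample1 = [1.2, 0.8, 1.1, 0.7, 0.9, 1.1, 1.5, 0.8, 1.6, 0.9]
sample2 = [1.7, 1.5, 2.0, 2.1, 1.1, 0.9, 2.2, 1.8, 1.3, 1.5]


def pooled_t_test(x, y):
    """Return (t, df, s_p) for the equal-variance two-sample t-test on raw data."""
    n1, n2 = len(x), len(y)
    df = (n1 - 1) + (n2 - 1)
    pooled_var = ((n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)) / df
    s_p = math.sqrt(pooled_var)
    t = (statistics.fmean(x) - statistics.fmean(y)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, df, s_p


T_CRIT = 2.1009  # t_0.975(18) as quoted in the module
t_stat, df, s_p = pooled_t_test(sample1, sample2)
print(f"s_p = {s_p:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRIT else "do not reject H0")
```

The sign of t is negative only because sample 1 has the smaller mean; the two-sided decision depends on |t|.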

Serum Uric Acid Values Before And After a Special Meal

| | Before | After |
|---|---|---|
| | 1.2 | 1.7 |
| | 0.8 | 1.5 |
| | 1.1 | 2.0 |
| | 0.7 | 2.1 |
| | 0.9 | 1.1 |
| | 1.1 | 0.9 |
| | 1.5 | 2.2 |
| | 0.8 | 1.8 |
| | 1.6 | 1.3 |
| | 0.9 | 1.5 |
| Sum | 10.6 | 16.1 |
| Mean | 1.06 | 1.61 |

Serum Uric Acid Values Before And After A Special Meal

Worksheet

| Person | Before | After | d | d² |
|---|---|---|---|---|
| 1 | 1.2 | 1.7 | 0.5 | 0.25 |
| 2 | 0.8 | 1.5 | 0.7 | 0.49 |
| 3 | 1.1 | 2.0 | 0.9 | 0.81 |
| 4 | 0.7 | 2.1 | 1.4 | 1.96 |
| 5 | 0.9 | 1.1 | 0.2 | 0.04 |
| 6 | 1.1 | 0.9 | -0.2 | 0.04 |
| 7 | 1.5 | 2.2 | 0.7 | 0.49 |
| 8 | 0.8 | 1.8 | 1.0 | 1.00 |
| 9 | 1.6 | 1.3 | -0.3 | 0.09 |
| 10 | 0.9 | 1.5 | 0.6 | 0.36 |
| Sum | 10.6 | 16.1 | 5.5 | 5.53 |
| Mean | 1.06 | 1.61 | 0.55 | |
| Sum²/n | | | 3.025 | |
| SS | | | 2.505 | |
| Variance | | | 0.278 | |
| SD | | | 0.528 | |

d = After - Before

Testing the Hypothesis That The Serum Uric Acid Levels Before and After A Special Meal Are The Same

1. The hypothesis: H0: Δ = 0 vs. H1: Δ ≠ 0, where Δ = μAfter - μBefore

2. The α-level: α = 0.05

3. The assumptions: Random Sample of Differences, Normal Distribution

4. The test statistic:

$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{x}_{After} - \bar{x}_{Before}}{s_d/\sqrt{n}}$$

5. The rejection region: Reject H0: Δ = 0, if t is not between ± t0.975(9) = 2.26

6. The result:

$$t = \frac{0.55}{0.528/\sqrt{10}} = \frac{0.55}{0.528/(3.16)} = 3.29$$

7. The conclusion: Reject H0: Δ = 0 since t is not between ± 2.26
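When the same ten people are measured before and after the meal, the test works on the within-person differences. A minimal sketch of that calculation is shown below; the function name `paired_t_test` is hypothetical, and the critical value 2.26 is copied from the module.

```python
import math
import statistics

before = [1.2, 0.8, 1.1, 0.7, 0.9, 1.1, 1.5, 0.8, 1.6, 0.9]
after = [1.7, 1.5, 2.0, 2.1, 1.1, 0.9, 2.2, 1.8, 1.3, 1.5]


def paired_t_test(first, second):
    """Return (t, df, mean_d, sd_d) for a paired t-test on d = second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)   # sample SD of the differences (n - 1 divisor)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1, mean_d, sd_d


T_CRIT = 2.26  # t_0.975(9) as quoted in the module
t_stat, df, mean_d, sd_d = paired_t_test(before, after)
print(f"mean d = {mean_d:.2f}, s_d = {sd_d:.3f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if abs(t_stat) > T_CRIT else "do not reject H0")
```

Note that the degrees of freedom are n - 1 = 9, based on the number of pairs, not on the total number of observations.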