Transcript

Hypothesis Testing and Comparison of Two Populations

Dr. Burton

If the heights of male teenagers are normally distributed with a mean of 60 inches and a standard deviation of 10 inches, and the sample size is 25, what percentage of sample mean heights (in inches) would be:

Between 57 and 63

Less than 58

61 or larger

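7.2a

[Figure: normal curve with Height and Z axes; mean height 60 at Z = 0; heights 57 and 63 marked; shaded area shown as a percentage.]

$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

$Z = \dfrac{57 - 60}{10 / \sqrt{25}} = -1.5$, area = .4332

$Z = \dfrac{63 - 60}{10 / \sqrt{25}} = 1.5$, area = .4332

.4332 + .4332 = .8664 = 86.6%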

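7.2b

[Figure: normal curve with Height and Z axes; mean height 60 at Z = 0; height 58 marked at Z = -1.0; shaded area shown as a percentage.]

$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

$Z = \dfrac{58 - 60}{10 / \sqrt{25}} = -1.0$, area = .3413

.5000 - .3413 = .1587 = 15.9% (about 16%)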

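7.2c

[Figure: normal curve with Height and Z axes; mean height 60 at Z = 0; height 61 marked at Z = 0.5; shaded area shown as a percentage.]

$Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

$Z = \dfrac{61 - 60}{10 / \sqrt{25}} = 0.50$, area = .1915

.5000 - .1915 = .3085 = 30.9%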
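The three parts of problem 7.2 can be checked without a Z table. The following is a minimal sketch using Python's standard library, assuming (as the worked solutions do) that the question concerns the sampling distribution of the mean, with standard error 10/√25 = 2. The variable names are illustrative only.

```python
from statistics import NormalDist

# Values taken from the problem statement
mu = 60.0      # population mean height (inches)
sigma = 10.0   # population standard deviation (inches)
n = 25         # sample size

# Standard error of the mean: sigma / sqrt(n) = 2.0
se = sigma / n ** 0.5
sampling_dist = NormalDist(mu, se)

# 7.2a: between 57 and 63
p_between = sampling_dist.cdf(63) - sampling_dist.cdf(57)
# 7.2b: less than 58
p_below = sampling_dist.cdf(58)
# 7.2c: 61 or larger
p_at_least = 1 - sampling_dist.cdf(61)

print(f"7.2a: {p_between:.1%}")
print(f"7.2b: {p_below:.1%}")
print(f"7.2c: {p_at_least:.1%}")
```

The results should agree with the table-based answers to within rounding.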

Hypothesis Testing

Hypothesis: A statement of belief…

Null Hypothesis, H0: …there is no difference between the population mean μ and the hypothesized value μ0.

Alternative Hypothesis, Ha: …reject the null hypothesis and accept that there is a difference between the population mean μ and the hypothesized value μ0.

Probabilities of Type I and Type II errors

                 Truth
Test result      H0 True                    H0 False
Accept H0        Correct results (1 - α)    Type II Error (β)
Reject H0        Type I Error (α)           Correct results (1 - β)

Differences:

H0 True = statistically insignificant
H0 False = statistically significant
Accept H0 = statistically insignificant
Reject H0 = statistically significant

Cell labels:

a b
c d

http://en.wikipedia.org/wiki/False_positive

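[Figure: Probability distribution for a two-tailed test. Horizontal axis: magnitude of (XE – XC) in SE units, from -3 to 3, with XE < XC on the left and XE > XC on the right. Rejection regions lie beyond ±1.96 SE; α = 0.05, with 0.025 in each tail.]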

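[Figure: Probability distribution for a one-tailed test. Horizontal axis: magnitude of (XE – XC) in SE units, from -3 to 3, with XE < XC on the left and XE > XC on the right. The rejection region lies beyond 1.645 SE in a single tail; α = 0.05.]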
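The cutoffs of 1.96 SE and 1.645 SE follow from the standard normal distribution at α = 0.05. A minimal sketch with Python's standard library, where the variable names are illustrative:

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: alpha is split evenly, 0.025 in each tail
z_two_tailed = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed test: all of alpha sits in a single tail
z_one_tailed = std_normal.inv_cdf(1 - alpha)

print(f"two-tailed cutoff: {z_two_tailed:.3f} SE")
print(f"one-tailed cutoff: {z_one_tailed:.3f} SE")
```

Because the one-tailed cutoff is smaller, the same observed difference reaches significance more easily in a one-tailed test.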

Box 10-5

t = Distance between the means / Variation around the means

[Figure: panels A, B, and C.]

t-Tests

• Student's t-test is used if:
– two samples come from two different groups.
– e.g. a group of students and a group of professors.

• Paired t-test is used if:
– two samples come from the same group.
– e.g. a pre- and post-test on the same group of subjects.

One-Tailed vs. Two Tailed Tests

• The Key Question: “Am I interested in the deviation of the sample mean from the population mean in one direction or in both directions?”

• If you want to determine whether one mean is significantly different from the other, perform a two-tailed test.

• If you want to determine whether one mean is significantly larger, or significantly smaller, perform a one-tailed test.

t-Test (Two-Tailed)

Independent Sample Means

$t = \dfrac{\bar{x}_A - \bar{x}_B - 0}{S_p \sqrt{(1/N_A) + (1/N_B)}}$

$df = N_A + N_B - 2$

Independent Sample Means

Sample A    (A – Mean)²
26          34.34
24          14.90
18           4.58
17           9.86
18           4.58
20            .02
18           4.58

Mean = 20.14
ΣA² = 2913
N = 7
Σ(A – Mean)² = 72.86
Var = 12.14
s = 3.48

Sample B    (B – Mean)²
38          113.85
26            1.77
24           11.09
24           11.09
30            7.13
22           28.41

Mean = 27.33
ΣB² = 4656
N = 6
Σ(B – Mean)² = 173.34
Var = 34.67
s = 5.89
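The summary statistics for the two samples can be reproduced with Python's standard statistics module. This is a minimal sketch, assuming the observations read from the slide are 26, 24, 18, 17, 18, 20, 18 for Sample A and 38, 26, 24, 24, 30, 22 for Sample B. The helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev, variance

# Observations as read from the slide (assumed)
sample_a = [26, 24, 18, 17, 18, 20, 18]
sample_b = [38, 26, 24, 24, 30, 22]

def summarize(data):
    """Return N, mean, sample variance (N - 1 denominator) and sample SD."""
    return {
        "n": len(data),
        "mean": mean(data),
        "var": variance(data),
        "sd": stdev(data),
    }

for label, data in (("A", sample_a), ("B", sample_b)):
    s = summarize(data)
    print(f"Sample {label}: N = {s['n']}, mean = {s['mean']:.2f}, "
          f"var = {s['var']:.2f}, s = {s['sd']:.2f}")
```

Both `variance` and `stdev` divide by N - 1, which matches the Var and s values on the slide.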

Standard error of the difference between the means (SED)

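Theoretical (population):

$\text{SED of } \mu_E - \mu_C = \sqrt{\dfrac{\sigma_A^2}{N_A} + \dfrac{\sigma_B^2}{N_B}}$

Estimate (sample):

$\text{Estimate of the SED of } \bar{x}_E - \bar{x}_C = \sqrt{\dfrac{s_A^2}{N_A} + \dfrac{s_B^2}{N_B}}$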

Pooled estimate of the SED (SEDp)

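$\text{Estimate of the SED}_p \text{ of } \bar{x}_A - \bar{x}_B = S_p \sqrt{\dfrac{1}{N_A} + \dfrac{1}{N_B}}$

$S_p = \sqrt{\dfrac{s_A^2 (n_A - 1) + s_B^2 (n_B - 1)}{n_A + n_B - 2}}$

$S_p = \sqrt{\dfrac{12.14(6) + 34.67(5)}{7 + 6 - 2}} = \sqrt{22.38} = 4.73$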

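t-Test (Two-Tailed)

$df = N_A + N_B - 2 = 7 + 6 - 2 = 11$

$t = \dfrac{\bar{x}_A - \bar{x}_B - 0}{S_p \sqrt{(1/N_A) + (1/N_B)}} = \dfrac{20.14 - 27.33 - 0}{4.73 \sqrt{(1/7) + (1/6)}} = -2.73$

Critical Value 95% = 2.201

Because |t| = 2.73 exceeds the critical value of 2.201, the null hypothesis is rejected at the 0.05 level.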
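The pooled two-sample t statistic can be recomputed directly from the raw observations. This is a minimal sketch, assuming the Sample A and Sample B values read from the earlier slide. The function name `pooled_t` is hypothetical, and the critical value 2.201 is copied from the slide, not computed.

```python
from math import sqrt
from statistics import mean, variance

# Observations as read from the slide (assumed)
sample_a = [26, 24, 18, 17, 18, 20, 18]
sample_b = [38, 26, 24, 24, 30, 22]

def pooled_t(a, b):
    """Two-sample t statistic using the pooled standard deviation Sp."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled SD: each sample variance weighted by its degrees of freedom
    sp = sqrt(((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df)
    # Pooled standard error of the difference between the means
    sed_p = sp * sqrt(1 / n_a + 1 / n_b)
    t = (mean(a) - mean(b) - 0) / sed_p
    return t, df, sp

t, df, sp = pooled_t(sample_a, sample_b)
critical_value = 2.201  # two-tailed, 95%, df = 11 (from the slide)
reject_h0 = abs(t) > critical_value

print(f"Sp = {sp:.2f}, df = {df}, t = {t:.2f}, reject H0: {reject_h0}")
```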

One-tailed and two-tailed t-tests

• A two-tailed test is generally recommended because differences in either direction need to be known.

Paired t-test

$t_{paired} = t_p = \dfrac{\bar{d} - 0}{\text{Standard error of } \bar{d}} = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}}$

$df = N - 1$

$\bar{d} = \sum D / N$

$\sum d^2 = \sum D^2 - (\sum D)^2 / N$

$S_d^2 = \sum d^2 / (N - 1)$

Pre/post attitude assessment

Student   Before   After   Difference   D squared
1         25       28       3             9
2         23       19      -4            16
3         30       34       4            16
4          7       10       3             9
5          3        6       3             9
6         22       26       4            16
7         12       13       1             1
8         30       47      17           289
9          5       16      11           121
10        14        9      -5            25
Total    171      208      ΣD = 37      ΣD² = 511

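Pre/post attitude assessment

Totals: Before = 171, After = 208, ΣD = 37, ΣD² = 511

N = 10

$\bar{d} = \sum D / N = 37/10 = 3.7$

$\sum d^2 = \sum D^2 - (\sum D)^2 / N = 511 - 1369/10 = 374.1$

$S_d^2 = \sum d^2 / (N - 1) = 374.1 / (10 - 1) = 41.5667$

$t_p = \dfrac{\bar{d} - 0}{\sqrt{S_d^2 / N}} = \dfrac{3.7}{\sqrt{41.5667 / 10}} = \dfrac{3.7}{\sqrt{4.15667}} = \dfrac{3.7}{2.0387} = 1.815$

df = N – 1 = 9

Critical value of t at 0.05 (one-tailed, df = 9) = 1.833. Because 1.815 < 1.833, p > 0.05 and the difference is not statistically significant at the 0.05 level.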
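The paired t calculation can be reproduced step by step from the before and after scores in the pre/post attitude assessment table. A minimal sketch, with variable names that are illustrative only:

```python
from math import sqrt

# Scores for students 1-10 from the pre/post attitude assessment table
before = [25, 23, 30, 7, 3, 22, 12, 30, 5, 14]
after = [28, 19, 34, 10, 6, 26, 13, 47, 16, 9]

# D = after - before for each student
diffs = [post - pre for pre, post in zip(before, after)]
n = len(diffs)

sum_d = sum(diffs)                     # sum of D
sum_d_sq = sum(d * d for d in diffs)   # sum of D squared

d_bar = sum_d / n                      # mean difference
ss_d = sum_d_sq - sum_d ** 2 / n       # sum of squared deviations of D
var_d = ss_d / (n - 1)                 # variance of the differences
se_d = sqrt(var_d / n)                 # standard error of the mean difference

t_paired = (d_bar - 0) / se_d
df = n - 1

print(f"d_bar = {d_bar}, S_d^2 = {var_d:.4f}, t = {t_paired:.3f}, df = {df}")
```

Working from the differences means only the within-student change enters the test, which is the reason for pairing.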

Standard 2 X 2 table

a = subjects with both the risk factor and the disease
b = subjects with the risk factor but not the disease
c = subjects with the disease but not the risk factor
d = subjects with neither the risk factor nor the disease
a + b = all subjects with the risk factor
c + d = all subjects without the risk factor
a + c = all subjects with the disease
b + d = all subjects without the disease
a + b + c + d = all study subjects

                       Disease status
Risk Factor Status     Present    Absent     Total
Present                a          b          a + b
Absent                 c          d          c + d
Total                  a + c      b + d      a + b + c + d

Standard 2 X 2 table

Sensitivity = a/(a + c)

Specificity = d/(b + d)

Diabetic Screening Program

                       Disease status
Risk Factor Status     Diabetic   Nondiabetic   Total
>125 mg/100 ml         5          13            18
<125 mg/100 ml         1          81            82
Total                  6          94            100

Sensitivity = a/(a + c) = 100 × 5/6 = 83.3% (16.7% false neg.)

Specificity = d/(b + d) = 100 × 81/94 = 86.2% (13.8% false pos.)
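The sensitivity and specificity of the diabetic screening program follow directly from the four cell counts. A minimal sketch, using the a, b, c, d labels of the standard 2 X 2 table; the other variable names are illustrative.

```python
# Cell counts from the diabetic screening table
a = 5    # diabetic, screening result >125 mg/100 ml (true positives)
b = 13   # nondiabetic, screening result >125 mg/100 ml (false positives)
c = 1    # diabetic, screening result <125 mg/100 ml (false negatives)
d = 81   # nondiabetic, screening result <125 mg/100 ml (true negatives)

sensitivity = 100 * a / (a + c)   # percent of diabetics detected
specificity = 100 * d / (b + d)   # percent of nondiabetics cleared

false_negative_rate = 100 - sensitivity
false_positive_rate = 100 - specificity

print(f"Sensitivity = {sensitivity:.1f}% ({false_negative_rate:.1f}% false neg.)")
print(f"Specificity = {specificity:.1f}% ({false_positive_rate:.1f}% false pos.)")
```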

