Hypothesis Testing

[Figure: Fertility of American Women (aged 15-44), 1980-2006. Vertical axis: births per 1,000 women (0 to 90); horizontal axis: year (1980 to 2006); series shown: UT and NH.]

SOURCE: www.census.gov/population/www/socdemo/fertility.html#hist

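$$FRate_{yr} = \beta_0 + \beta_1 YR_{yr} + \varepsilon_{yr}, \qquad yr = 0, \ldots, 26$$

Fitted line: estimated intercept 71.17, estimated slope 0.42 in absolute value (the sign of the slope is not recoverable from the transcript).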

[Figure repeated: births per 1,000 women (0 to 90) by year, 1980 to 2006, for UT and NH.]

$$CRate_{s,t} = \beta_0 + \beta_1 \ln FRate_{s,t} + \beta_2 Older_{s,t} + u_{s,t}$$

$s = 1, \ldots, 47$; $t = 1, \ldots, 13$ (1970 to 82)

Hypothesis Testing of $\hat{\beta}_1$

Suppose you are asked to empirically investigate the charge that obstetricians are guilty of inducing demand by performing unnecessary cesarean sections.

Empirical Model (linear-log)

Panel data

Log-log model (the coefficient is an elasticity):

$$\ln QCig_i = \gamma_0 + \gamma_1 \ln PCig_i + \mu_i$$

$$\gamma_1 = \frac{\%\Delta QCig}{\%\Delta PCig} = \frac{(\%\Delta QCig / 100)}{(\%\Delta PCig / 100)}$$

Linear-log model:

$$\beta_1 = \frac{\Delta CRate}{(\%\Delta FRate / 100)} = \frac{100 \cdot \Delta CRate}{\%\Delta FRate}$$

$$\frac{\Delta CRate}{\%\Delta FRate} = \frac{\beta_1}{100}$$
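The step from the linear-log model to $\beta_1 / 100$ can be written out. This is a short derivation sketch, assuming a small change in FRate (so that the change in the log is approximately the proportional change) and holding Older fixed:

```latex
\begin{aligned}
\Delta CRate &= \beta_1 \, \Delta \ln FRate
  \approx \beta_1 \, \frac{\Delta FRate}{FRate}
  = \beta_1 \, \frac{\%\Delta FRate}{100} \\[4pt]
\Rightarrow \quad \frac{\Delta CRate}{\%\Delta FRate} &= \frac{\beta_1}{100}
\end{aligned}
```

The approximation is what makes the coefficient read as "change in CRate per 1 percent change in FRate" only after dividing by 100.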

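Empirical Model (linear-log)

$$CRate_{s,t} = \beta_0 + \beta_1 \ln FRate_{s,t} + \beta_2 Older_{s,t} + u_{s,t}$$

$s = 1, \ldots, 47$; $t = 1, \ldots, 13$ (1970 to 82)

Gruber and Owings (1996) OLS regression

$$\widehat{CRate}_{s,t} = \hat{\beta}_0 - 0.097 \ln FRate_{s,t} + \hat{\beta}_2 Older_{s,t}$$

$\hat{\beta}_1 = -0.097$

Standard error of $\hat{\beta}_1$ = (0.021)

where $\overline{CRate} = 0.119$: 11.9% of deliveries are by cesarean section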

The hypothesis test of $\hat{\beta}_1$ answers the question of whether $\hat{\beta}_1$ is negative enough to be convincing evidence that the true coefficient is negative, rather than simply being negative due to chance. But first, let's interpret what the estimated coefficient means, assuming temporarily that it is statistically significant.

Gruber and Owings interpretation of $\hat{\beta}_1$: “A fall in the fertility rate of 10 percent is associated with an increase in the likelihood of cesarean delivery of 0.97 percentage points.” (p. 113)

$$\frac{\Delta \widehat{CRate}}{\%\Delta FRate} = \frac{\hat{\beta}_1}{100} = \frac{-0.097}{100} = -0.00097$$

What is the predicted effect of a 10% decrease in the fertility rate?

$$\Delta \widehat{CRate} = (-10) \times (-0.00097) = 0.0097$$

That is, $\widehat{CRate}$ rises by 0.97 percentage point.
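The arithmetic behind the predicted effect can be checked in a few lines. This is a minimal sketch that uses only the reported slope of -0.097; the function name `predicted_crate_change` is introduced here for illustration.

```python
# Slope on ln(FRate) reported for the Gruber and Owings (1996) regression.
beta1_hat = -0.097


def predicted_crate_change(pct_change_frate, beta1=beta1_hat):
    """Approximate change in CRate (as a proportion) for a given
    percent change in FRate in a linear-log model: (beta1 / 100) * %change."""
    return (beta1 / 100) * pct_change_frate


# A 10 percent fall in the fertility rate.
change = predicted_crate_change(-10)
print(f"Change in CRate: {change:.4f} ({100 * change:.2f} percentage points)")
```

Relative to the sample mean of 0.119, a rise of 0.0097 would move the cesarean rate to roughly 0.129.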

Set H0 and HA

H0: null hypothesis (straw man)

HA: alternative hypothesis (what we believe)

Does the empirical evidence convincingly knock H0 down?

Testing for demand inducement by obstetricians (OBs):

H0: $\beta_1 \ge 0$

HA: $\beta_1 < 0$

Gruber and Owings use economics to tell a story of why this might be true.

Two types of potential errors

                                        Truth about $\beta_1$
Statistical inference                   $\beta_1 < 0$ (OBs are guilty)    $\beta_1 \ge 0$ (OBs are innocent)
$\beta_1 < 0$ (OBs are guilty)          no error                          Type I error
$\beta_1 \ge 0$ (OBs are innocent)      Type II error                     no error

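Variance of $\hat{\beta}_1$ for a regression on $PCig_i$ with error term $u_i$:

$$\sigma^2_{\hat{\beta}_1} = \frac{1}{n} \cdot \frac{\operatorname{var}\big[(PCig_i - E(PCig_i))\, u_i\big]}{\big[\operatorname{var}(PCig_i)\big]^2}$$

Estimated variance (its square root is the standard error of $\hat{\beta}_1$):

$$\hat{\sigma}^2_{\hat{\beta}_1} = \frac{1}{n} \cdot \frac{\dfrac{1}{n-2} \sum_{i=1}^{n} (PCig_i - \overline{PCig})^2 \, \hat{u}_i^2}{\left[\dfrac{1}{n} \sum_{i=1}^{n} (PCig_i - \overline{PCig})^2\right]^2}$$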
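To see how a standard error for a slope is computed from data, the sketch below implements a heteroskedasticity-robust variance estimator for a one-regressor OLS model (residuals weighted by squared deviations of the regressor, with an n - 2 correction). The data are hypothetical and generated only to exercise the function; the names `robust_se_slope`, `pcig`, and `qcig` are assumptions made for this example.

```python
import random


def robust_se_slope(x, y):
    """Return the OLS slope of y on x and a heteroskedasticity-robust
    standard error for that slope (one regressor plus an intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    # OLS estimates and residuals.
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Numerator: (1 / (n - 2)) * sum of (x_i - x_bar)^2 * u_hat_i^2.
    numerator = sum((xi - x_bar) ** 2 * ui ** 2 for xi, ui in zip(x, resid)) / (n - 2)
    # Denominator: [(1 / n) * sum of (x_i - x_bar)^2]^2.
    denominator = (sxx / n) ** 2

    variance = (numerator / denominator) / n
    return slope, variance ** 0.5


# Hypothetical data: a true slope of -0.5 plus noise.
random.seed(0)
pcig = [random.uniform(0.5, 2.0) for _ in range(200)]
qcig = [3.0 - 0.5 * p + random.gauss(0, 0.2) for p in pcig]

slope, se = robust_se_slope(pcig, qcig)
print(f"slope = {slope:.3f}, standard error = {se:.3f}")
```

With perfectly linear data the residuals are all zero, so the standard error collapses to zero; noise in the residuals is what makes it positive.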

Hypothesis Testing of $\hat{\beta}_1$

Normally, a negative sign for $\hat{\beta}_1$ is not sufficient to convince us to reject the null hypothesis and, in this case, conclude that OBs are guilty of inducing demand by carrying out unnecessary c-sections. Instead, $\hat{\beta}_1$ has to be sufficiently smaller than zero for us to be relatively confident that the true $\beta_1$ is negative. In other words, $\hat{\beta}_1$ has to be smaller than some critical negative value, call it $C$, for us to be willing to reject the null hypothesis.

Decision Rule

Condition                  Decision            Find OBs
$\hat{\beta}_1 \ge C$      Do not reject H0    Innocent
$\hat{\beta}_1 < C$        Reject H0           Guilty

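Choose $C$ by setting Prob(Type I error) = significance level of the test.

Type I error: reject a true H0.

Suppose $\beta_1 = 0$. Then $\hat{\beta}_1 \sim N(0, \sigma^2_{\hat{\beta}_1})$.

[Figure: sampling distribution of $\hat{\beta}_1$ centered at 0, with the critical value $C$ marked to the left of 0; reject H0 to the left of $C$, accept H0 to the right.]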

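Given $C$, how would you illustrate the probability of making a Type II error?

Type II error: accept a false H0.

Suppose $\beta_1 = \beta_1^* < 0$. Then $\hat{\beta}_1 \sim N(\beta_1^*, \sigma^2_{\hat{\beta}_1})$.

[Figure: sampling distribution of $\hat{\beta}_1$ centered at $\beta_1^*$, with $C$ and 0 marked on the axis; reject H0 to the left of $C$, accept H0 to the right.]

Note: There is no way to measure Type II error without knowing the true value of $\beta_1$.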
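Both error probabilities are areas under a normal curve on one side of the critical value, so they can be computed once a true value is assumed. In this sketch the reported standard error of 0.021 is treated as if it were the true standard deviation, and `beta1_star = -0.05` is a purely hypothetical true value chosen for illustration.

```python
from statistics import NormalDist

sigma = 0.021        # reported standard error, treated here as the true sigma (assumption)
alpha = 0.05         # chosen significance level
beta1_star = -0.05   # hypothetical true value under HA (assumption)

# Critical value: the alpha quantile of N(0, sigma^2).
C = NormalDist(0, sigma).inv_cdf(alpha)

# Type I error: the estimate falls below C even though beta1 = 0.
p_type1 = NormalDist(0, sigma).cdf(C)

# Type II error: the estimate falls at or above C even though beta1 = beta1_star.
p_type2 = 1 - NormalDist(beta1_star, sigma).cdf(C)

print(f"C = {C:.4f}")
print(f"prob(Type I error)  = {p_type1:.3f}")
print(f"prob(Type II error) = {p_type2:.3f}")
```

Changing `beta1_star` changes only the Type II probability, which is why that probability cannot be reported without taking a stand on the true coefficient.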

As the significance level of the test becomes more stringent, what happens to the prob(Type I error) and prob(Type II error)?

As $C$ becomes more negative: prob(Type I error) falls and prob(Type II error) rises.
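The trade-off can be tabulated for the three usual significance levels. This is a sketch under stated assumptions: sigma is set to the reported standard error of 0.021, and the true coefficient under HA is a hypothetical -0.05; the helper `error_probabilities` is introduced for this example only.

```python
from statistics import NormalDist

sigma = 0.021        # reported standard error, treated as the true sigma (assumption)
beta1_star = -0.05   # hypothetical true coefficient under HA (assumption)


def error_probabilities(alpha):
    """Return the critical value C for a one-sided test at level alpha
    and the Type II error probability when the truth is beta1_star."""
    C = NormalDist(0, sigma).inv_cdf(alpha)
    type2 = 1 - NormalDist(beta1_star, sigma).cdf(C)
    return C, type2


results = [(alpha, *error_probabilities(alpha)) for alpha in (0.10, 0.05, 0.01)]
for alpha, C, type2 in results:
    print(f"prob(Type I) = {alpha:.2f}   C = {C:.4f}   prob(Type II) = {type2:.3f}")
```

Each step down in the Type I probability pushes $C$ further below zero and raises the Type II probability.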

Given that changing the critical value decreases the probability of making one type of error while increasing the probability of making the other type of error, how should we set $C$?

Setting $C$: think about the cost of making each type of error.

prob(Type I error): reject true H0

• innocent OBs are tainted as demand inducers

• unnecessary policies may be introduced

prob(Type II error): accept false H0

• unnecessary c-sections

Typical practice: set the significance level so that the probability of making a Type I error is small.

Prob(Type I error) is typically set at 0.10, 0.05, or 0.01.

But you should always question whether the typical practice is appropriate.

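How do you find $C$?

[Figure: sampling distribution $\hat{\beta}_1 \sim N(0, \sigma^2_{\hat{\beta}_1})$ with the critical value $C$ marked to the left of 0.]

One possibility: solve for $C$ where

$$\int_{-\infty}^{C} N(0, \sigma^2_{\hat{\beta}_1}) \, d\hat{\beta}_1 = 0.05$$

Too cumbersome, requiring that we solve difficult problems for every new hypothesis.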

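Easier process: transform the test statistic into a standard normal one. Define the z statistic as:

$$z = \frac{\hat{\beta}_1 - E(\hat{\beta}_1)}{\sigma_{\hat{\beta}_1}} \sim N(0, 1)$$

Standard normal distribution

But if H0 is true, then $E(\hat{\beta}_1) = 0$, so

$$z = \frac{\hat{\beta}_1}{\sigma_{\hat{\beta}_1}} \sim N(0, 1)$$

[Figure: standard normal distribution of $z$ centered at 0, with the critical value $z_C$ marked.]

Much easier: a table of the standard normal distribution can be used to find the critical value for a variety of hypothesis tests.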
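The two routes to a critical value give the same decision, which a short check makes concrete. This sketch assumes sigma is known and equal to the reported standard error of 0.021, and uses the reported estimate of -0.097; the standard library's `NormalDist` stands in for a printed normal table.

```python
from statistics import NormalDist

sigma = 0.021       # treated as known here (assumption)
alpha = 0.05
beta1_hat = -0.097  # reported estimate

# Route 1: critical value on the scale of the estimate, specific to this sigma.
C = NormalDist(0, sigma).inv_cdf(alpha)

# Route 2: critical value on the standard normal scale, reusable for any test at this alpha.
z_C = NormalDist().inv_cdf(alpha)

# Under H0 the z statistic is the estimate divided by its standard deviation.
z = beta1_hat / sigma

reject_by_C = beta1_hat < C
reject_by_z = z < z_C

print(f"C = {C:.4f}, z_C = {z_C:.4f}, z = {z:.3f}")
print(f"reject H0 (route 1): {reject_by_C}, reject H0 (route 2): {reject_by_z}")
```

Because $C = z_C \cdot \sigma_{\hat{\beta}_1}$, comparing $\hat{\beta}_1$ with $C$ is the same test as comparing $z$ with $z_C$.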

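BUT how do we calculate $z = \hat{\beta}_1 / \sigma_{\hat{\beta}_1}$ when $\sigma_{\hat{\beta}_1}$, the standard deviation of $\hat{\beta}_1$, is unknown?

Solution: substitute $\hat{\sigma}_{\hat{\beta}_1}$, which is the standard error of $\hat{\beta}_1$, for $\sigma_{\hat{\beta}_1}$.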

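t statistic:

$$t = \frac{\hat{\beta}_1 - \beta_1^{H_0}}{\hat{\sigma}_{\hat{\beta}_1}} \sim t_{(n-k-1)}$$

The statistic follows the t distribution with $n - k - 1$ degrees of freedom, where

n = sample size

k+1 = number of parameters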

The shape of the t distribution depends on the number of degrees of freedom. It has slightly fatter tails than the standard normal distribution because of the added variation from estimating $\sigma_{\hat{\beta}_1}$ with $\hat{\sigma}_{\hat{\beta}_1}$.

Decision Rule: if $|t| > t_C$ and $\hat{\beta}_1$ has the right sign, then reject H0; otherwise do not reject it.

The critical value, $t_C$, depends on

1. Whether it is a one-sided or two-sided test.

2. The significance level of the test (usually 0.10, 0.05, or 0.01).

3. The degrees of freedom ($n - k - 1$).
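The decision rule can be applied to the reported Gruber and Owings estimate and standard error. The number of regressors is not given in this transcript, so the sketch below does not compute exact t critical values; it substitutes the standard normal quantile, an assumption that is reasonable only when the degrees of freedom are large (a balanced panel of 47 states over 13 years would have 611 observations). The function name `one_sided_test` is introduced for this example.

```python
from statistics import NormalDist


def one_sided_test(beta_hat, se, alpha, expected_sign=-1, beta_null=0.0):
    """Apply the rule: reject H0 if |t| > t_C and the estimate has the
    sign implied by HA. Returns (t, t_C, reject)."""
    t = (beta_hat - beta_null) / se
    # Large-sample stand-in for the one-sided t critical value (assumption).
    t_c = NormalDist().inv_cdf(1 - alpha)
    right_sign = beta_hat * expected_sign > 0
    reject = abs(t) > t_c and right_sign
    return t, t_c, reject


for alpha in (0.10, 0.05, 0.01):
    t, t_c, reject = one_sided_test(-0.097, 0.021, alpha)
    print(f"alpha = {alpha:.2f}   t = {t:.2f}   t_C = {t_c:.3f}   reject H0: {reject}")
```

With small samples the normal quantile understates the critical value, so a t table or a statistics library should be used with the actual $n - k - 1$.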

