Hypothesis Tests for Means The context “Statistical significance” Hypothesis tests and confidence intervals The steps Hypothesis Test statistic Distribution Alpha, and the rejection region Result p-Values One-sided vs. two-sided tests Hypothesis tests for proportions

Post on 21-Dec-2015

Page 1: Hypothesis Tests for Means The context “Statistical significance” Hypothesis tests and confidence intervals The steps Hypothesis Test statistic Distribution

Hypothesis Tests for Means

The context
    “Statistical significance”
    Hypothesis tests and confidence intervals

The steps
    Hypothesis
    Test statistic
    Distribution
    Alpha, and the rejection region
    Result

p-Values
One-sided vs. two-sided tests

Hypothesis tests for proportions

The context

PARAMETERS
μ = population mean (unknown)
σ = population SD (might be known)

STATISTICS
n = sample size
x̄ = sample mean
s = sample SD (using n-1)

ALSO
μ0 = conjectured value of μ

Statistical significance

We’re trying to decide whether μ is equal to μ0.

As usual we use x̄ as an estimate of μ. Usually x̄ is at least a little different from μ0. But could the difference be due to random variation?

IF YES – then we DO NOT REJECT the hypothesis that μ is really equal to μ0. We say that x̄ is not significantly different from μ0.

IF NO – then we REJECT the hypothesis that μ = μ0. We say that x̄ IS significantly different from μ0.

Hypothesis tests are just confidence intervals

If we only cared about hypothesis tests for means, we could make this a lot simpler.

Just construct a confidence interval for μ, based on n, x̄, s (or σ) and your favorite confidence level C.

If μ0 is outside the confidence interval, then we reject the hypothesis that μ = μ0. The significance level is α = 1 – C.

That’s all there is to it. So why all the complex ritual of a hypothesis test?

Because there are other hypothesis tests, for other hypotheses (difference of two means, for example). For those tests, we need the ritual.
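
A minimal Python sketch of the confidence-interval shortcut described on this slide, assuming σ is known so that normal critical values apply. The function names and the numbers in the usage lines are made up for illustration; they are not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for mu when sigma is known."""
    alpha = 1 - confidence
    z_star = NormalDist().inv_cdf(1 - alpha / 2)  # normal critical value
    margin = z_star * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def rejects_via_interval(xbar, sigma, n, mu0, confidence=0.95):
    """Reject H0: mu = mu0 exactly when mu0 falls outside the interval."""
    low, high = z_confidence_interval(xbar, sigma, n, confidence)
    return not (low <= mu0 <= high)


# Hypothetical data: sigma = 10, n = 25, conjectured mean mu0 = 50.
print(rejects_via_interval(xbar=52.0, sigma=10.0, n=25, mu0=50.0))
print(rejects_via_interval(xbar=55.0, sigma=10.0, n=25, mu0=50.0))
```

With C = 0.95 the implied significance level is α = 0.05, matching the rule α = 1 – C.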

Hypothesis Test for μ

Cookbook using rejection regions

1. Choose hypotheses – H0 and HA.

2. Define a test statistic.

3. Predict the distribution of the test statistic, assuming that H0 is true.

4. Choose C and α. Pick a rejection region.

5. Look at the observed value of the test statistic. Is it in the rejection region? If so, reject H0.

Choose hypotheses

Two-sided test:
H0: μ = μ0    HA: μ ≠ μ0

One-sided tests:
H0: μ = μ0    HA: μ > μ0

or
H0: μ = μ0    HA: μ < μ0

Working rule: Always use two-sided tests.

Define a test statistic

Choose

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

or

$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

Do you know σ? Maybe it comes with the null hypothesis. If so, use it.
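
The two test statistics can be sketched in Python as follows. This is only an illustration: the function names and the sample values are assumptions, not part of the slides.

```python
from math import sqrt
from statistics import mean, stdev


def z_statistic(xbar, mu0, sigma, n):
    """z = (xbar - mu0) / (sigma / sqrt(n)); use when sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))


def t_statistic(sample, mu0):
    """t = (xbar - mu0) / (s / sqrt(n)); s is the sample SD (n-1 version)."""
    n = len(sample)
    xbar = mean(sample)
    s = stdev(sample)  # statistics.stdev divides by n-1
    return (xbar - mu0) / (s / sqrt(n))


# Hypothetical sample, testing against a conjectured mean of 5.0.
sample = [4.8, 5.1, 5.4, 4.9, 5.3]
print(z_statistic(mean(sample), mu0=5.0, sigma=0.25, n=len(sample)))
print(t_statistic(sample, mu0=5.0))
```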

Distribution of the test statistic

ASSUME H0 IS TRUE.

Then (if you know σ) z has a STANDARD NORMAL distribution.

Or (if you’re using s) t has a “t” distribution with n-1 degrees of freedom.

(Standard normal case)

The rejection region is a range (or double-range) of values of the test statistic that are
(a) UNLIKELY if H0 is true
(b) roughly consistent with the alternative HA.

The rejection region should have probability α (given H0).

Two-sided case:

[Figure: standard normal curve with both tails shaded, below −z*(α/2) and above z*(α/2)]

Rejection region consists of two parts, each with probability α/2.

Predicting the distribution

• If you’re using t, just use t-critical values.

• For the one-sided case:

[Figure: standard normal curve with one tail shaded beyond z*]

Rejection region has probability α, all in one tail.
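
A rough Python sketch of the rejection regions for the standard normal case. The function names and the "two-sided" / "greater" / "less" labels are assumptions chosen for this example. For the t case the same logic applies with t-critical values, which need a t quantile function (for instance `scipy.stats.t.ppf`, if SciPy is available).

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=True):
    """z* that leaves alpha/2 (two-sided) or alpha (one-sided) in the upper tail."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def in_rejection_region(z, alpha, alternative="two-sided"):
    """True when the observed z falls in the rejection region."""
    if alternative == "two-sided":
        return abs(z) > critical_value(alpha, two_sided=True)
    z_star = critical_value(alpha, two_sided=False)
    if alternative == "greater":   # HA: mu > mu0, upper tail only
        return z > z_star
    if alternative == "less":      # HA: mu < mu0, lower tail only
        return z < -z_star
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")


print(round(critical_value(0.05), 3))                   # two-sided cutoff
print(round(critical_value(0.05, two_sided=False), 3))  # one-sided cutoff
print(in_rejection_region(1.8, 0.05))
print(in_rejection_region(1.8, 0.05, alternative="greater"))
```

The last two lines show why the choice of alternative matters: the same z = 1.8 is outside the two-sided region but inside the one-sided one.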

Chance of a Type I error

Note:

IF H0 is actually true, then there is still a probability of α that you will reject the null hypothesis.

[Figure: standard normal curve with both tails shaded, below −z*(α/2) and above z*(α/2)]
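
One way to see this is a small simulation: draw many samples with H0 true and count how often the two-sided z test rejects anyway. The sketch below assumes normally distributed data with known σ; the parameter values and the function name are made up.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The data really come from a population with mean mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        z = (xbar - mu0) / (sigma / sqrt(n))
        if abs(z) > z_star:
            rejections += 1
    return rejections / trials


# The result should land near alpha = 0.05, up to simulation noise.
print(type_one_error_rate(mu0=100.0, sigma=15.0, n=20, alpha=0.05, trials=5000))
```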

Chance of a Type I error

There are two possible bad results:

TYPE I ERROR (“act of commission”) – reject H0, when H0 is actually true.

The probability of a Type I error is α (given that H0 is true).

TYPE II ERROR (“act of omission”) – don’t reject H0, when H0 is actually false.

The probability of a Type II error depends on the actual value of μ.
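
To illustrate how the Type II error probability depends on the actual μ, here is a sketch for the two-sided z test with known σ. Under a true mean μ, the z statistic is normal with mean (μ − μ0)/(σ/√n) and SD 1, so the chance of landing outside the rejection region can be computed directly. The function name and the numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_two_error(mu_true, mu0, sigma, n, alpha=0.05):
    """P(do not reject H0) for a two-sided z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # mean of z under mu_true
    # Probability that z lands between -z_star and z_star.
    return std_normal.cdf(z_star - shift) - std_normal.cdf(-z_star - shift)


# Hypothetical setting: mu0 = 50, sigma = 10, n = 25.
for mu_true in (50.0, 52.0, 54.0, 56.0):
    print(mu_true, round(type_two_error(mu_true, 50.0, 10.0, 25), 3))
```

The further the true μ is from μ0, the smaller the chance of a Type II error. At μ = μ0 the value is 1 – α, which is the chance of correctly not rejecting.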

Tradeoff

High α (say, 10%): you have a good chance of getting a statistically significant result, but it won’t impress anyone.

MORE TYPE I ERRORS

Low α (say, 1%): your significant results are more convincing, but you’ll have fewer of them.

MORE TYPE II ERRORS

Is there a way to avoid choosing α in advance?

Determine p-value

The “p-value” is the answer to this question:

What fraction of x̄’s would be more extreme than the one you actually obtained (assuming H0 is true)?

If HA: μ ≠ μ0 this means, what fraction are further from μ0 than the value you obtained?

If HA: μ > μ0 this means, what fraction are greater than the value you obtained?

If HA: μ < μ0 this means, what fraction are less than the value you obtained?

Determine p-value

Example:

Do a test of H0: μ = μ0 vs. HA: μ ≠ μ0.

Get test statistic z = 2.30. What’s the p-value?

Probability of seeing 2.30 OR MORE: 0.0107
Probability of seeing 2.30 OR MORE EXTREME: 0.0214
p-value for 2-sided test: 0.0214

[Figure: standard normal curve; the upper tail beyond z = 2.30 has area 0.0107]

Determine p-value

Keep it simple?

p-value =
(for 1-sided test with z) = 1 - NORMSDIST( |z| )
(for 2-sided test with z) = 2 × (1 - NORMSDIST( |z| ))

(for 1-sided test with t) = TDIST( |t|, n-1, 1 )
(for 2-sided test with t) = TDIST( |t|, n-1, 2 )

In TDIST, the second argument is the degrees of freedom (df) and the third is the number of tails.
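
For readers working outside a spreadsheet, the z formulas above might be mirrored in Python like this. `p_value_z` is a name invented for this sketch; `NormalDist().cdf` plays the role of NORMSDIST. The t versions need a t distribution, which the standard library does not provide; with SciPy installed, `scipy.stats.t.sf(abs(t), df)` gives the one-tail area.

```python
from statistics import NormalDist


def p_value_z(z, tails=2):
    """p-value from a z statistic: upper-tail area beyond |z|, times the number of tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return tails * upper_tail


print(round(p_value_z(2.30, tails=1), 4))  # 0.0107
print(round(p_value_z(2.30, tails=2), 4))  # 0.0214
```

These reproduce the tail areas from the z = 2.30 example.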

Determine p-value

The p-value is the border between α’s for which we reject H0 and α’s for which we do not reject H0.

REJECTION REGION VERSION: Pick α, and the rejection region, in advance. In this story, the p-value is an afterthought.

p-VALUE FIRST VERSION: Find the p-value first. Then if anyone has a favorite α, you can…

Reject H0 if p < α.

Do not reject if p > α.

Example: 1969 Draft Lottery

Null hypothesis (informally): The numbers for the second half of the year were drawn randomly from the population 1, 2, …, 366.

(Note: The mean of these numbers is 183.5, and their standard deviation is 105.6547. )

Null hypothesis (formally): H0 : μ = 183.5

(and this is one of those cases where σ = 105.6547 comes with the null hypothesis)

Alternative: HA : μ ≠ 183.5
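
The mean and standard deviation quoted for the population 1, 2, …, 366 can be checked with a few lines of Python. This sketch uses the population SD (dividing by N, not N-1), since the 366 numbers are the whole population here.

```python
from statistics import mean, pstdev

numbers = range(1, 367)   # the lottery numbers 1, 2, ..., 366

mu0 = mean(numbers)       # population mean
sigma = pstdev(numbers)   # population SD (divides by N)

print(mu0, round(sigma, 4))
```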

Example: 1969 Draft Lottery

H0 : μ = 183.5    HA : μ ≠ 183.5

μ0 = 183.5    σ = 105.6547

Experiment: n = 184, x̄ = 160.92

Test statistic:

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{\bar{x} - 183.5}{7.789} = -2.898 $$

p-value: 0.00375

Conclusion: REJECT H0 (even at 1% significance level)
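
The arithmetic for this example can be retraced in Python. This is a sketch using the rounded inputs shown on the slide, so the last digit may differ slightly from the slide’s −2.898 and 0.00375, which were presumably computed from unrounded data.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 183.5, 105.6547   # given by the null hypothesis
n, xbar = 184, 160.92          # observed sample size and sample mean

standard_error = sigma / sqrt(n)              # about 7.789
z = (xbar - mu0) / standard_error             # about -2.90
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(round(standard_error, 3), round(z, 3), round(p_value, 5))
print("reject H0 at the 1% level:", p_value < 0.01)
```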

Hypothesis tests for proportions

PARAMETER
p = population proportion

STATISTICS
n = sample size
k = number of “hits”
p̂ = k / n = sample proportion

Hypothesis tests for proportions

Test statistic:

$$ z = \frac{\hat{p} - p_0}{\mathrm{SE}(\hat{p})} = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}} $$

(Minor subtlety: The distribution of the test statistic is based on H0, so we use p0 in the formula for SE. This is different from what we do in confidence intervals, but not by much.)
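
A small Python sketch of this test statistic, with the SE computed from p0 as the slide notes. The function name and the numbers in the usage line (58 hits in 100 trials against p0 = 0.5) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(k, n, p0):
    """Two-sided z test of H0: p = p0, given k hits in n trials."""
    p_hat = k / n
    se = sqrt(p0 * (1 - p0) / n)   # SE uses p0, because we assume H0 is true
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, se, z, p_value


p_hat, se, z, p_value = proportion_z_test(k=58, n=100, p0=0.5)
print(p_hat, se, round(z, 2), round(p_value, 4))
```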

Another example

Suppose we have flipped 10000 coins, and obtained 5100 heads. Is this result statistically significant?

Choose:
H0: p = 0.50    HA: p ≠ 0.50

Conditions? OK.

Distribution of p̂, given H0:

Normal, mean 0.50, SD = 0.005

Another example

Our value of p̂ is 0.51. That’s 2.0 SD’s above the mean.

What fraction of p̂ values would be further from 0.50 than 0.51?

ABOUT 4.5%, counting both tails. So, P-value is 0.045.
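
The numbers in the coin example can be retraced step by step. This sketch uses only the values given above (10000 flips, 5100 heads, p0 = 0.50); the variable names are chosen for this example.

```python
from math import sqrt
from statistics import NormalDist

n, k, p0 = 10_000, 5_100, 0.50

p_hat = k / n                        # sample proportion, 0.51
sd = sqrt(p0 * (1 - p0) / n)         # SD of p-hat under H0, 0.005
z = (p_hat - p0) / sd                # about 2.0 SDs above the mean
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails, about 0.0455

print(p_hat, sd, round(z, 2), round(p_value, 4))
```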

Result of test

Is a P-value of 0.045 good enough to reject H0?

If we choose α = 0.05, then yes. But that’s a very mild test for such an extraordinary claim.

If we pick α = 0.05, then 5% of all our experiments will end in rejecting H0, even though H0 is true every time.

So we should choose a lower value of α. In this case, our result isn’t really “statistically significant.”

We need a bigger sample!
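
A hypothetical illustration of why a bigger sample helps: if the sample proportion stayed at 0.51 (an assumption made purely for this sketch, not a result), the z statistic would grow with the square root of n and the p-value would shrink accordingly.

```python
from math import sqrt
from statistics import NormalDist


def z_and_p_value(p_hat, p0, n):
    """z statistic and two-sided p-value for H0: p = p0."""
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


# Assumed scenario: p-hat remains 0.51 while the number of flips increases.
for n in (10_000, 40_000, 100_000):
    z, p = z_and_p_value(0.51, 0.50, n)
    print(n, round(z, 2), p)
```

Of course, if the coin is fair, a larger sample would be expected to pull p̂ back toward 0.50 rather than leave it at 0.51.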