1 today null and alternative hypotheses 1- and 2-tailed tests regions of rejection sampling...

1 Today Null and alternative hypotheses 1- and 2-tailed tests Regions of rejection Sampling distributions The Central Limit Theorem Standard errors z-tests for sample means The 5 steps of hypothesis-testing Type I and Type II error (not necessarily in this order)

Upload: stephany-robinson

Post on 28-Dec-2015


Page 1

Today

• Null and alternative hypotheses

• 1- and 2-tailed tests

• Regions of rejection

• Sampling distributions

• The Central Limit Theorem

• Standard errors

• z-tests for sample means

• The 5 steps of hypothesis-testing

• Type I and Type II error

(not necessarily in this order)

Page 2

Hypothesis testing

• Approach hypothesis testing from the standpoint of theory.

• If our theory about some phenomenon is correct, then things should be a certain way.

• If the commercial really works, then we should see an increase in sales (that cannot easily be attributed to chance).

• Hypotheses are stated in terms of parameters (e.g., “the average difference between Groups A and B is zero in the population”).

Page 3

Hypothesis testing

• We will always observe some kind of effect, even if nothing interesting is going on.

• It could be due to chance fluctuations, or sampling error... or there really could be an effect in the population.

• Inferential statistics help us decide.

• If we conclude, on the basis of statistics, that an effect should not be attributed to chance, the effect is termed statistically significant.

Page 4

[Figure: histogram of heights, Gender: F. x-axis: “How tall are you in inches?” (56 to 70); y-axis: Frequency (0 to 10). Mean = 64.28, Std. Dev. = 3.077, N = 47.]

• Say we know $\mu$ and $\sigma$, and that they are $\mu$ = 64.28” and $\sigma$ = 3.1”, like in the female sample.

• We want to know if the 74-inch-tall person is female.

• Use logic to make a good guess.

Page 5

[Figure: the same histogram of female heights as on the previous page (Mean = 64.28, Std. Dev. = 3.077, N = 47).]

• If the person is female, then her distribution has $\mu$ = 64.28” and $\sigma$ = 3.1” (assuming normality).

• That implies that “her” z-score is:

$$z = \frac{x - \mu}{\sigma} = \frac{74 - 64.28}{3.1} = 3.14, \qquad p < .001$$

• Very unlikely that this person is female!

• We could do this because we made the assumption of normality, and assumed $\mu$ = 64.28” and $\sigma$ = 3.1”.
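The z-score above can be checked numerically. Here is a minimal Python sketch, assuming (as the slide does) a normal distribution with $\mu$ = 64.28 and $\sigma$ = 3.1; the helper name `z_score` is an illustrative choice, not something from the lecture.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize a raw score x against a population with mean mu and SD sigma."""
    return (x - mu) / sigma

# Values from the slide: a 74-inch person compared with the female distribution.
z = z_score(74, mu=64.28, sigma=3.1)

# Upper-tail area: probability of a z at least this large under the standard normal.
p_upper = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, upper-tail p = {p_upper:.5f}")
```

The tail area comes out below .001, which is why the slide calls a 74-inch female "very unlikely" under these assumptions.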

Page 6

Hypothesis testing

• A hypothesis is a theory-based prediction about population parameters.

• Researchers begin with a theory.

• Then they define the implications of the theory.

• Then they test the implications using if-then logic (e.g., if the theory is true, then the population mean should be greater than 3.8).

Page 7

Hypothesis testing

• Null hypothesis – Represents the “status quo” situation. Usually, the hypothesis of no difference or no relationship. E.g., $H_0: \mu = 0$

• Alternative hypothesis – What we are predicting will occur. Usually, the most scientifically interesting hypothesis. E.g., $H_1: \mu \neq 0$

Page 8

Conventions

• By convention, the null and alternative hypotheses are mutually exclusive and exhaustive. E.g., $H_0: \mu = 40\%$ and $H_1: \mu \neq 40\%$

• Not everyone follows this convention.

Page 9

Hypothesis testing

• This is an example of a 2-tailed hypothesis test:

$H_0: \mu = 40\%$

$H_1: \mu \neq 40\%$

• Null distribution:

[Figure: Null Distribution for Recidivism. x-axis: Recidivism Percentage (0 to 100).]

Page 10

1-tailed tests

• Say we had the following hypotheses:

$H_0: \mu \leq 0$

$H_1: \mu > 0$

• We would reject the null hypothesis only if the observed mean is sufficiently positive.

• “Sufficiently” because sample means will always differ. We care about the population, not samples.

• If we conclude that chance variability isn’t driving the effect, then we say the effect is statistically significant.

Page 11

An example...

• Say we want to know if UNC students’ IQ differs from the national average. We know:

$\mu = 500$, $\sigma = 100$

• We pick a student at random (our “sample”), and give her an IQ test. She scores 700.

• Was her score drawn from the U.S. population at large, or from another (more intelligent) distribution?

Page 12

An example...

• The null hypothesis is that she is part of the U.S. population distribution of IQ test-takers. Nothing special.

• The alternative is that she is from some other (more intelligent) population distribution.

• 1-tailed, because we are interested only if UNC students are more intelligent than average.

$H_0: \mu \leq 500$

$H_1: \mu > 500$

Page 13

An example...

• First, draw the null distribution:

[Figure: Null Distribution of IQ scores (U.S. population). x-axis: IQ (100 to 900).]

• Then define the region(s) of rejection:

[Figure: Null Distribution of IQ scores (U.S. population), repeated for the region(s) of rejection. x-axis: IQ (100 to 900).]

Page 14

An example...

[Figure: Null Distribution of IQ scores (U.S. population). x-axis: IQ (100 to 900).]

Page 15

An example...

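• How did I find the “critical value” of IQ? By knowing alpha, knowing how to use Table E10, and a little algebra...

• First, find z given p, then...

$$
\begin{aligned}
z &= \frac{x - \mu_0}{\sigma_x} \\
z\,\sigma_x &= x - \mu_0 \\
x &= z\,\sigma_x + \mu_0 \\
x &= 1.645(100) + 500 \\
x &= 664.5
\end{aligned}
$$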
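The table lookup and the algebra for the critical value can be mirrored in a few lines of Python. This is a sketch only, assuming an upper-tailed test with alpha = .05, $\mu_0$ = 500, and $\sigma_x$ = 100 as in the IQ example; `critical_raw_score` is a hypothetical helper name, and `NormalDist().inv_cdf` stands in for the printed z table.

```python
from statistics import NormalDist

def critical_raw_score(alpha, mu0, sigma):
    """Return (critical z, critical raw score) for an upper-tailed test."""
    # z with an area of alpha above it (the "find z given p" step).
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # Solve z = (x - mu0) / sigma for x.
    return z_crit, z_crit * sigma + mu0

z_crit, x_crit = critical_raw_score(alpha=0.05, mu0=500, sigma=100)
print(f"critical z = {z_crit:.3f}, critical IQ = {x_crit:.1f}")
```

Any score above the returned raw-score cutoff falls in the region of rejection.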

Page 16

An example...

• Our student’s IQ score is 700. Does it fall in the region of rejection?

...Yes!

[Figure: Null Distribution of IQ scores (U.S. population). x-axis: IQ (100 to 900).]

Page 17

An example...

• We could have done this by comparing z-scores instead of raw scores.

$$z = \frac{x - \mu_0}{\sigma_x} = \frac{700 - 500}{100} = 2.0$$

• 2.0 > 1.645, so we reject H0.

Page 18

An example...

• We also could have done this by comparing a p-value to $\alpha$ instead of comparing raw scores or z-scores.

• The p-value corresponding to a z-score of 2.0 is .0228.

• .0228 < .05, so we reject H0.

• A UNC student with an IQ of 700 would be very rare if drawn from the null population with $\mu$ = 500. In fact, even more rare than we are willing to tolerate (remember, $\alpha$ = .05).

Page 19

3 Decision rules in this example

• We need to know if we should reject H0. These three rules all yield the same conclusion. Reject H0 if...

$x_{obs} > x_{critical}$

$z_{obs} > z_{critical}$

$p < \alpha$
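To see that the three decision rules agree, the IQ example can be run through all of them at once. The sketch below assumes the upper-tailed test from the example ($\mu_0$ = 500, $\sigma_x$ = 100, alpha = .05, observed score 700); the variable names are illustrative.

```python
from statistics import NormalDist

mu0, sigma, alpha = 500, 100, 0.05
x_obs = 700
std_normal = NormalDist()

# Critical values on the z scale and on the raw-score scale.
z_crit = std_normal.inv_cdf(1 - alpha)
x_crit = mu0 + z_crit * sigma

# Observed z and its upper-tail p-value.
z_obs = (x_obs - mu0) / sigma
p_value = 1 - std_normal.cdf(z_obs)

decisions = {
    "raw score: x_obs > x_critical": x_obs > x_crit,
    "z-score:   z_obs > z_critical": z_obs > z_crit,
    "p-value:   p < alpha": p_value < alpha,
}
for rule, reject in decisions.items():
    print(f"{rule} -> reject H0: {reject}")
```

The rules always agree for a test like this because each is a monotonic re-expression of the same comparison: a larger raw score means a larger z and a smaller upper-tail p.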

Page 20

But...

• Wait a minute – we did all that with only one student??

• The sample was too small (N = 1) to make such bold claims about UNC.

• We need a representative sample, N >> 1.

• The logic of hypothesis testing is exactly the same with samples as it is with individuals.

• But, we need to know about sampling distributions...

Page 21

Sampling distributions

• Sampling distribution: A distribution of some statistic.

• “Sampling distribution of _____” (mean / variance / z, t, etc.)

Page 22

The Central Limit Theorem

• Given a population with mean $\mu$ and variance $\sigma_x^2$, the sampling distribution of the mean (the distribution of sample means) will have a mean equal to $\mu$ and a variance equal to:

$$\sigma_{\bar{x}}^2 = \frac{\sigma_x^2}{N}$$

...and thus a standard deviation of:

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}}$$

The distribution will approach normality as N increases. [from Howell, p. 267]

Page 23

The Central Limit Theorem

• $\sigma_{\bar{x}} = \sigma_x / \sqrt{N}$ is called the standard error of the mean, or simply standard error.
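The Central Limit Theorem's claims about the mean and spread of sample means can be illustrated by simulation. The sketch below is not from the lecture: it assumes a hypothetical skewed population (exponential, with mean 1 and standard deviation 1), a sample size of N = 25, and a fixed random seed so the run is repeatable.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability

def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, SD 1)."""
    return fmean(rng.expovariate(1.0) for _ in range(n))

N = 25
means = [sample_mean(N) for _ in range(20_000)]

expected_se = 1 / N ** 0.5  # sigma / sqrt(N) with sigma = 1
print(f"mean of sample means = {fmean(means):.3f} (population mean = 1)")
print(f"SD of sample means   = {pstdev(means):.3f} (sigma / sqrt(N) = {expected_se:.3f})")
```

The simulated standard deviation of the sample means should land close to the standard error formula, even though the population itself is far from normal.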

Page 24

The Central Limit Theorem

[Figure: three panels, each titled “Null Distribution of IQ scores (U.S. population)”, for N = 1, N = 5, and N = 20. x-axis: IQ (100 to 900); the vertical axis tops out at 0.005, 0.010, and 0.020 respectively.]

• As sample size increases, the standard error decreases.
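The shrinking standard error can be tabulated directly for the three panel sizes. A small sketch, assuming $\sigma_x$ = 100 as in the IQ example; `standard_error` is an illustrative helper name.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)

# Sample sizes taken from the three figure panels.
for n in (1, 5, 20):
    print(f"N = {n:>2}: standard error = {standard_error(100, n):.2f}")
```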

Page 25

The Central Limit Theorem

• Another example...

Page 26

Back to the UNC IQ example...

• Let’s say that we collect a sample of N = 4 UNC students.

• Their IQs are 700, 710, 680, and 670.

• Now the mean is $\bar{x} = 690$.

• Is there enough evidence to claim that UNC students are brighter than average?

• Now the question is, “If the population mean is 500, how extreme would a sample mean of 690 be (given that N = 4)?”

Page 27

In terms of z-scores...

• The critical value for z is still +1.645 (because it’s a 1-tailed test and $\alpha$ = .05).

$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma_x / \sqrt{N}} = \frac{690 - 500}{100 / \sqrt{4}} = 3.8$$

• 3.8 > 1.645, so reject H0.

• Conclusion: UNC students are likely brighter than average (we’ll never really know for sure).
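The same z-test for a sample mean can be written as a short function. This is a sketch under the example's assumptions (upper-tailed test, known $\sigma_x$ = 100, $\mu_0$ = 500, alpha = .05); `z_test_upper` is a hypothetical name, not a library function.

```python
import math
from statistics import NormalDist, fmean

def z_test_upper(sample, mu0, sigma, alpha=0.05):
    """Upper-tailed z-test for a sample mean when sigma is known."""
    se = sigma / math.sqrt(len(sample))       # standard error of the mean
    z = (fmean(sample) - mu0) / se            # observed test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical z for the upper tail
    return z, z_crit, z > z_crit

# IQ scores of the four UNC students from the example.
z, z_crit, reject = z_test_upper([700, 710, 680, 670], mu0=500, sigma=100)
print(f"z = {z:.1f}, critical z = {z_crit:.3f}, reject H0: {reject}")
```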

Page 28

Another example

• Your theory says that Benadryl should alter reaction time on some task, but you are not sure how. The null and alternative hypotheses might be:

$H_0: \mu = 0.09$ sec

$H_1: \mu \neq 0.09$ sec

• We’re given that $\sigma$ = .032 seconds

• We’re given that N = 400

• We’re given that $\alpha$ = .01

Page 29

Finding critical z’s for a 2-tailed test

[Figure: Standard Normal Distribution. x-axis: z (-5 to 5); y-axis: Density (0.0 to 0.5). An area of $\alpha/2$ is marked in each tail, beyond z = -2.575 and z = +2.575.]

Page 30

Finding critical reaction times

[Figure: Reaction Time Sampling Distribution of the Mean. x-axis: seconds (0.082 to 0.098); y-axis: Density (0 to 300). An area of $\alpha/2$ is marked in each tail.]

$\sigma = .032$ sec

$$\sigma_{\bar{x}} = \frac{.032}{\sqrt{400}} = .032 / 20 = .0016$$
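The critical reaction times follow from the critical z's and the standard error. Here is a minimal sketch, assuming the 2-tailed Benadryl setup ($\mu_0$ = .09 sec, $\sigma$ = .032 sec, N = 400, alpha = .01). The cutoffs it prints are computed here and do not appear on the slides, and the exact critical z is about 2.576 rather than the table-rounded 2.575.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 0.09, 0.032, 400, 0.01

se = sigma / math.sqrt(n)                     # .032 / 20
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha is split across the two tails

# Convert the critical z's back to the reaction-time scale.
lower = mu0 - z_crit * se
upper = mu0 + z_crit * se

print(f"SE = {se:.4f}, critical z = +/-{z_crit:.3f}")
print(f"reject H0 if mean RT < {lower:.4f} or > {upper:.4f} seconds")
```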

Page 31

Another example

• We collect data from our 400 subjects and find the mean RT to be .097 seconds.

• .097 is different from .09, but different enough?

$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{.097 - .09}{.032 / \sqrt{400}} = \frac{.007}{.0016} = 4.375$$

• 4.375 > 2.575, so reject H0. Benadryl probably does have an effect on reaction time. Specifically, it slows people down.
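For a 2-tailed test, the observed z is compared with the critical value in absolute terms, and the p-value counts both tails. A sketch with the Benadryl numbers; `z_test_two_tailed` is an illustrative name, and the p-value is an addition that the slides do not report.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(xbar, mu0, sigma, n, alpha):
    """Two-tailed z-test for a sample mean when sigma is known."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    z = (xbar - mu0) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))  # area in both tails beyond |z|
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return z, p, abs(z) > z_crit

z, p, reject = z_test_two_tailed(xbar=0.097, mu0=0.09, sigma=0.032, n=400, alpha=0.01)
print(f"z = {z:.3f}, two-tailed p = {p:.6f}, reject H0: {reject}")
```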

Page 32

N = 1: a special case?

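$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma_x / \sqrt{N}}$$

• When N = 1,

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{N}} = \frac{\sigma_x}{\sqrt{1}} = \sigma_x$$

...and:

$$z = \frac{x - \mu_0}{\sigma_x}$$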

Page 33

The 5 steps of hypothesis testing

1. Specify null and alternative hypotheses.

2. Identify a test statistic.

3. Specify the sampling distribution and sample size.

4. Specify alpha and the region(s) of rejection.

5. Collect data, compute the test statistic, and make a decision regarding H0.

Page 34

1. Null and alternative hypotheses

• Specify H0 and H1 in terms of population parameters.

• H0 is presumed to be true in the absence of evidence against it.

• H1 is adopted if H0 is rejected.

$H_0: \mu = 0.09$ sec

$H_1: \mu \neq 0.09$ sec

Page 35

2. Identify a test statistic

• Identify a test statistic that is useful for discriminating between different hypotheses about the population parameter of interest, taking into account the hypothesis being tested and the information known.

• E.g., z, t, F, and $\chi^2$.

Page 36

3. Sampling distribution and N

• Specify the sampling distribution and sample size.

• The sampling distribution here refers to the distribution of all possible values of the test statistic obtained under the assumption that H0 is true.

• E.g., “N = 48. The sampling distribution is the standard normal distribution (distribution of z statistics), because we are testing a hypothesis about the population mean when $\sigma$ is known.”

Page 37

4. Specify $\alpha$ and the rejection regions

• Alpha ($\alpha$) is the probability of incorrectly rejecting H0 (rejecting the null hypothesis when it is really true).

• Regions of rejection are those ranges of the test statistic’s sampling distribution which, if encountered, would lead to rejecting H0.

• The regions of rejection are determined by $\alpha$ and by whether the test is 1-tailed or 2-tailed.
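How alpha and the number of tails jointly fix the rejection regions can be sketched for a z statistic. The function below is illustrative only: the name `rejection_regions` and the tail labels "upper", "lower", and "two" are choices made for this sketch, not terminology from the slides.

```python
from statistics import NormalDist

def rejection_regions(alpha, tail):
    """Rejection region(s) on the z scale, as a list of (low, high) intervals."""
    inf = float("inf")
    std_normal = NormalDist()
    if tail == "upper":
        # All of alpha sits in the upper tail.
        return [(std_normal.inv_cdf(1 - alpha), inf)]
    if tail == "lower":
        # All of alpha sits in the lower tail.
        return [(-inf, std_normal.inv_cdf(alpha))]
    if tail == "two":
        # alpha / 2 in each tail.
        z = std_normal.inv_cdf(1 - alpha / 2)
        return [(-inf, -z), (z, inf)]
    raise ValueError(f"unknown tail: {tail!r}")

for tail in ("upper", "lower", "two"):
    print(tail, rejection_regions(0.05, tail))
```

With alpha = .05 the cutoffs are the familiar 1.645 for a 1-tailed test and 1.96 for a 2-tailed test.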

Page 38

5. Collect data, compute the test statistic, make a decision

• For example...

$$z = \frac{\bar{x} - \mu_0}{\sigma_x / \sqrt{N}} = \frac{2 - 0}{5 / \sqrt{48}} = \frac{2}{.722} = 2.77$$

• E.g., “2.77 > 1.96, so reject H0 and conclude that...”

• Always couch the conclusion in terms of the original problem.
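The step-5 arithmetic can be reproduced in code. The sketch below assumes the numbers readable from the slide's example: an observed mean of 2, a null value of 0, N = 48, and $\sigma_x$ = 5 (inferred from the .722 standard error); the 2-tailed alpha = .05 is inferred from the 1.96 critical value rather than stated outright.

```python
import math
from statistics import NormalDist

# Steps 1-4: null value, known sigma, sample size, alpha (2-tailed, assumed).
mu0, sigma, n, alpha = 0, 5, 48, 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: compute the test statistic from the observed mean and decide.
xbar = 2
se = sigma / math.sqrt(n)
z = (xbar - mu0) / se
reject = abs(z) > z_crit

print(f"SE = {se:.3f}, z = {z:.2f}, critical z = +/-{z_crit:.2f}, reject H0: {reject}")
```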

Page 39

The 5 steps: Example

• Let’s say you think a certain standardized achievement test is biased against Asian-Americans. You know that for the non-Asian-American population...

$\mu = 100$, $\sigma = 10$

• In the sample...

$N = 28$

Page 40

The 5 steps: Example

1. Specify null and alternative hypotheses.

$H_0: \mu \geq 100$

$H_1: \mu < 100$

2. Identify a test statistic.

We want to compare a sample mean to a hypothesized value, and we know $\sigma$, so we use a z-test.

Page 41

The 5 steps: Example

3. Specify the sampling distribution and sample size.

The sampling distribution of z is the standard normal distribution.

$N = 28$

4. Specify alpha and the region(s) of rejection.

$\alpha = .05$

The regions of rejection are harder...

Page 42

The 5 steps: Example

5. Collect data, compute the test statistic, make a decision.

We collect data. Say the mean is 97.1. Does 97.1 fall in the region of rejection?

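$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma_x / \sqrt{N}} = \frac{97.1 - 100}{10 / \sqrt{28}} = \frac{-2.9}{1.89} = -1.53$$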
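The slide leaves the final comparison as a question, so the sketch below checks the observed z against both a lower-tailed and a 2-tailed rejection region at alpha = .05. Which of the two the lecture intended is an assumption here; the other inputs ($\mu$ = 100, $\sigma_x$ = 10, N = 28, sample mean 97.1) come from the example.

```python
import math
from statistics import NormalDist

mu0, sigma, n, alpha = 100, 10, 28, 0.05
xbar = 97.1

se = sigma / math.sqrt(n)
z = (xbar - mu0) / se

std_normal = NormalDist()
# 1-tailed test with the whole rejection region in the lower tail.
reject_lower = z < std_normal.inv_cdf(alpha)
# 2-tailed test with alpha / 2 in each tail.
reject_two = abs(z) > std_normal.inv_cdf(1 - alpha / 2)

print(f"SE = {se:.2f}, z = {z:.2f}")
print(f"reject H0 (lower-tailed): {reject_lower}")
print(f"reject H0 (2-tailed):     {reject_two}")
```

Under either reading, a z of about -1.53 does not reach the region of rejection, so H0 would be retained with these numbers.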

Page 43

Type I and Type II errors

• There are two ways to make an incorrect decision in hypothesis testing: Type I and Type II errors.

• Type I error: Concluding that the null hypothesis is false when it is really true.

• We control the probability of making a Type I error (alpha).

• Alpha ($\alpha$): The risk of incorrectly rejecting a true null hypothesis.

• Why not make $\alpha$ really, really small? The smaller we make $\alpha$, the more likely it becomes we will encounter a Type II error.

Page 44

Type I and Type II errors

• Type II error: Concluding the null hypothesis is true when it is really false.

• Beta ($\beta$): The probability of incorrectly retaining a false null hypothesis.
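The trade-off between the two error rates can be made concrete by computing beta for several values of alpha. The sketch below assumes an upper-tailed z-test with the IQ example's $\mu_0$ = 500, $\sigma_x$ = 100, and N = 4; the true population mean of 600 is purely hypothetical, picked only so that H0 is false, and `beta_upper_tailed` is an illustrative name.

```python
import math
from statistics import NormalDist

def beta_upper_tailed(alpha, mu0, mu_true, sigma, n):
    """P(Type II error) for an upper-tailed z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / math.sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha)
    # H0 is retained when the sample mean falls below mu0 + z_crit * se;
    # standardize that cutoff against the true sampling distribution.
    return std_normal.cdf(z_crit - (mu_true - mu0) / se)

alphas = (0.10, 0.05, 0.01, 0.001)
betas = [beta_upper_tailed(a, mu0=500, mu_true=600, sigma=100, n=4) for a in alphas]

for a, b in zip(alphas, betas):
    print(f"alpha = {a:<5}  beta = {b:.3f}")
```

As alpha shrinks, the critical value moves further into the tail, so more samples from the true distribution fall short of it and beta grows.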

Page 45

Next time...

• Power

• Effect size

• Statistical significance vs. practical significance

• Confidence intervals