1 today null and alternative hypotheses 1- and 2-tailed tests regions of rejection sampling...

1

Today

• Null and alternative hypotheses

• 1- and 2-tailed tests

• Regions of rejection

• Sampling distributions

• The Central Limit Theorem

• Standard errors

• z-tests for sample means

• The 5 steps of hypothesis-testing

• Type I and Type II error

(not necessarily in this order)

2

Hypothesis testing

• Approach hypothesis testing from the standpoint of theory.

• If our theory about some phenomenon is correct, then things should be a certain way.

• If the commercial really works, then we should see an increase in sales (that cannot easily be attributed to chance).

• Hypotheses are stated in terms of parameters (e.g., “the average difference between Groups A and B is zero in the population”).

3

Hypothesis testing

• We will always observe some kind of effect, even if nothing interesting is going on.

• It could be due to chance fluctuations, or sampling error... or there really could be an effect in the population.

• Inferential statistics help us decide.

• If we conclude, on the basis of statistics, that an effect should not be attributed to chance, the effect is termed statistically significant.

4

[Figure: histogram of responses to “How tall are you in inches?” for Gender: F; x-axis: 56 to 70 inches; y-axis: Frequency. Mean = 64.28, Std. Dev. = 3.077, N = 47]

• Say we know $\mu$ and $\sigma$, and that they are $\mu$ = 64.28” and $\sigma$ = 3.1”, like in the female sample.

• We want to know if the 74-inch-tall person is female.

• Use logic to make a good guess.

5

[Figure: histogram of responses to “How tall are you in inches?” for Gender: F; x-axis: 56 to 70 inches; y-axis: Frequency. Mean = 64.28, Std. Dev. = 3.077, N = 47]

• If the person is female, then her distribution has $\mu$ = 64.28” and $\sigma$ = 3.1” (assuming normality).

• That implies that “her” z-score is:

$$z = \frac{x - \mu}{\sigma} = \frac{74 - 64.28}{3.1} = 3.14, \quad p < .001$$

• Very unlikely that this person is female!

• We could do this because we made the assumption of normality, and assumed $\mu$ = 64.28” and $\sigma$ = 3.1”.
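The height calculation can be checked numerically. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` in place of a z table; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 64.28, 3.1   # population mean and SD assumed on the slide (inches)
x = 74.0                 # height of the person in question

# Convert the raw height to a z-score
z = (x - mu) / sigma

# Upper-tail probability: chance of a height at least this large under normality
p_upper = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, upper-tail p = {p_upper:.4f}")
```

The tail probability comes out below .001, which is why the person is judged unlikely to belong to this distribution.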

6

Hypothesis testing

• A hypothesis is a theory-based prediction about population parameters.

• Researchers begin with a theory.

• Then they define the implications of the theory.

• Then they test the implications using if-then logic (e.g., if the theory is true, then the population mean should be greater than 3.8).

7

Hypothesis testing

• Null hypothesis – Represents the “status quo” situation. Usually, the hypothesis of no difference or no relationship. E.g. ...

• Alternative hypothesis – what we are predicting will occur. Usually, the most scientifically interesting hypothesis. E.g. ...

0 : 0H

1 : 0H

8

Conventions

• By convention, the null and alternative hypotheses are mutually exclusive and exhaustive. E.g. ...

• Not everyone follows this convention.

$$H_0: \mu = 40\%$$

$$H_1: \mu \neq 40\%$$

9

Hypothesis testing

• This is an example of a 2-tailed hypothesis test:

$$H_0: \mu = 40\% \qquad H_1: \mu \neq 40\%$$

• Null distribution:

[Figure: Null Distribution for Recidivism; x-axis: Recidivism Percentage, 0 to 100; y-axis: 0.00 to 0.05]

10

1-tailed tests

• Say we had the following hypotheses:

$$H_0: \mu \le 0$$

$$H_1: \mu > 0$$

• We would reject the null hypothesis only if the observed mean is sufficiently positive.

• “Sufficiently” because sample means will always differ. We care about the population, not samples.

• If we conclude that chance variability isn’t driving the effect, then we say the effect is statistically significant.

11

An example...

• Say we want to know if UNC students’ IQ differs from the national average. We know:

$$\mu = 500$$

$$\sigma = 100$$

• We pick a student at random (our “sample”), and give her an IQ test. She scores 700.

• Was her score drawn from the U.S. population at large, or from another (more intelligent) distribution?

12

An example...

• The null hypothesis is that she is part of the U.S. population distribution of IQ test-takers. Nothing special.

• The alternative is that she is from some other (more intelligent) population distribution.

• 1-tailed, because we are interested only if UNC students are more intelligent than average.

$$H_0: \mu \le 500$$

$$H_1: \mu > 500$$

13

An example...

• First, draw the null distribution:

[Figure: Null Distribution of IQ scores (U.S. population); x-axis: IQ, 100 to 900; y-axis: 0.000 to 0.005]

• Then define the region(s) of rejection:

[Figure: null distribution of IQ scores; x-axis: IQ, 100 to 900; y-axis: 0.000 to 0.005]

14


An example...

[Figure: null distribution of IQ scores; x-axis: IQ, 100 to 900; y-axis: 0.000 to 0.005]

15

An example...

• How did I find the “critical value” of IQ? By knowing alpha, knowing how to use Table E10, and a little algebra...

• First, find z given p, then...

$$z = \frac{x - \mu_0}{\sigma}$$

$$z\,\sigma = x - \mu_0$$

$$x = z\,\sigma + \mu_0$$

$$x = 1.645(100) + 500$$

$$x = 664.5$$
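The same critical value can be obtained in code. A minimal sketch, assuming the standard-library `statistics.NormalDist.inv_cdf` stands in for the table lookup and that the test is 1-tailed in the upper tail with alpha = .05:

```python
from statistics import NormalDist

mu0, sigma = 500.0, 100.0   # null mean and SD from the example
alpha = 0.05                # 1-tailed test, upper region of rejection

# Step 1: find z given p (the z that cuts off the top 5% of the distribution)
z_crit = NormalDist().inv_cdf(1.0 - alpha)

# Step 2: undo the z transformation to get the critical raw score
x_crit = z_crit * sigma + mu0

print(f"critical z = {z_crit:.3f}, critical IQ = {x_crit:.1f}")
```

Any observed score above `x_crit` falls in the region of rejection.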

16

An example...

• Our student’s IQ score is 700. Does it fall in the region of rejection?

...Yes!


[Figure: null distribution of IQ scores; x-axis: IQ, 100 to 900; y-axis: 0.000 to 0.005]

17

An example...

• We could have done this by comparing z-scores instead of raw scores.

$$z = \frac{x - \mu_0}{\sigma} = \frac{700 - 500}{100} = 2.0$$

• 2.0 > 1.645, so we reject H0.

18

An example...

• We also could have done this by comparing a p-value to $\alpha$ instead of comparing raw scores or z-scores.

• The p-value corresponding to a z-score of 2.0 is .0228.

• .0228 < .05, so we reject H0.

• A UNC student with an IQ of 700 would be very rare if drawn from the null population with $\mu$ = 500. In fact, even more rare than we are willing to tolerate (remember, $\alpha$ = .05).

19

3 Decision rules in this example

• We need to know if we should reject H0. These three rules all yield the same conclusion. Reject H0 if...

$$x_{obs} > x_{critical}$$

$$z_{obs} > z_{critical}$$

$$p < \alpha$$
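To see that the three rules agree, they can be evaluated side by side for the student who scored 700. This is a sketch under the example's assumptions (null mean 500, SD 100, 1-tailed alpha = .05); the variable names are illustrative.

```python
from statistics import NormalDist

mu0, sigma, alpha = 500.0, 100.0, 0.05
x_obs = 700.0                                # the observed score

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1.0 - alpha)     # critical z for the upper tail
x_crit = mu0 + z_crit * sigma                # critical raw score
z_obs = (x_obs - mu0) / sigma                # observed z
p_obs = 1.0 - std_normal.cdf(z_obs)          # upper-tail p-value

rule_raw = x_obs > x_crit    # compare raw scores
rule_z = z_obs > z_crit      # compare z-scores
rule_p = p_obs < alpha       # compare p to alpha

print(rule_raw, rule_z, rule_p)
```

All three comparisons are monotone transformations of one another, so they can never disagree.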

20

But...

• Wait a minute – we did all that with only one student??

• The sample was too small (N = 1) to make such bold claims about UNC.

• We need a representative sample, N >> 1.

• The logic of hypothesis testing is exactly the same with samples as it is with individuals.

• But, we need to know about sampling distributions...

21

Sampling distributions

• Sampling distribution: A distribution of some statistic.

• “Sampling distribution of _____” (mean / variance / z, t, etc.)

22

The Central Limit Theorem

• Given a population with mean $\mu$ and variance $\sigma^2$, the sampling distribution of the mean (the distribution of sample means) will have a mean equal to $\mu$ and a variance equal to:

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{N}$$

...and thus a standard deviation of:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$$

The distribution will approach normality as N increases. [from Howell, p. 267]
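A small simulation can make the theorem concrete. The sketch below is an illustration only: it assumes a skewed (exponential) population with hypothetical mean and SD of 100, draws many samples of size N, and compares the SD of the sample means with $\sigma / \sqrt{N}$.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)            # fixed seed so the run is repeatable
mu = sigma = 100.0        # an exponential population has mean = SD (hypothetical values)
n_samples = 20000         # simulated samples per sample size

results = {}
for n in (1, 5, 20):
    # Each entry is the mean of one simulated sample of size n
    sample_means = [
        sum(random.expovariate(1.0 / mu) for _ in range(n)) / n
        for _ in range(n_samples)
    ]
    # (mean of the sample means, their SD, and the theoretical standard error)
    results[n] = (mean(sample_means), pstdev(sample_means), sigma / sqrt(n))

for n, (m, sd, se) in results.items():
    print(f"N = {n:2d}: mean of means = {m:6.2f}, SD of means = {sd:6.2f}, sigma/sqrt(N) = {se:6.2f}")
```

The mean of the sample means stays near the population mean, while their spread shrinks in line with the formula, even though the population itself is not normal.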

23


$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}$$

• ...is called the standard error of the mean, or simply standard error.

24


The Central Limit Theorem

[Figure: three panels showing the sampling distribution of the mean for N = 1, N = 5, and N = 20; x-axis: IQ, 100 to 900]

• As sample size increases, the standard error decreases.

25


• Another example...

26

Back to the UNC IQ example...

• Let’s say that we collect a sample of N = 4 UNC students.

• Their IQs are 700, 710, 680, and 670.

• Now the mean is $\bar{x} = 690$.

• Is there enough evidence to claim that UNC students are brighter than average?

• Now the question is, “If the population mean is 500, how extreme would a sample mean of 690 be (given that N = 4)?”

27

In terms of z-scores...

• The critical value for z is still +1.645 (because it’s a 1-tailed test and $\alpha$ = .05).

$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{N}} = \frac{690 - 500}{100 / \sqrt{4}} = 3.8$$

• 3.8 > 1.645, so reject H0.

• Conclusion: UNC students are likely brighter than average (we’ll never really know for sure).
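The z-test for a sample mean can be written out directly. A minimal sketch using the four scores from the example, with the population SD treated as known and a 1-tailed alpha of .05:

```python
from math import sqrt
from statistics import NormalDist, mean

scores = [700, 710, 680, 670]            # the N = 4 sampled IQ scores
mu0, sigma, alpha = 500.0, 100.0, 0.05   # null mean, known SD, 1-tailed alpha

x_bar = mean(scores)                     # sample mean
se = sigma / sqrt(len(scores))           # standard error of the mean
z_obs = (x_bar - mu0) / se               # observed z for the sample mean

z_crit = NormalDist().inv_cdf(1.0 - alpha)
reject_h0 = z_obs > z_crit

print(f"mean = {x_bar}, SE = {se}, z = {z_obs:.1f}, reject H0: {reject_h0}")
```

The only change from the single-student case is that the denominator is the standard error rather than the population SD.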

28

Another example

• Your theory says that Benadryl should alter reaction time on some task, but you are not sure how. The null and alternative hypotheses might be:

$$H_0: \mu = 0.09 \text{ sec}$$

$$H_1: \mu \neq 0.09 \text{ sec}$$

• We’re given that $\sigma$ = .032 seconds

• We’re given that N = 400

• We’re given that $\alpha$ = .01

29

Finding critical z’s for a 2-tailed test

[Figure: Standard Normal Distribution; x-axis: z, -5 to 5; y-axis: Density. An area of $\alpha/2$ lies in each tail, beyond z = -2.575 and z = +2.575]

30

Finding critical reaction times

[Figure: Reaction Time Sampling Distribution of the Mean; x-axis: seconds, 0.082 to 0.098; y-axis: Density. An area of $\alpha/2$ lies in each tail]

$$\sigma = .032 \text{ sec}$$

$$\sigma_{\bar{x}} = \frac{.032}{\sqrt{400}} = \frac{.032}{20} = .0016$$
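The critical reaction times follow from the critical z's and the standard error. The sketch below assumes the Benadryl example's values (null mean .09 sec, SD .032 sec, N = 400, 2-tailed alpha = .01) and uses `statistics.NormalDist` rather than a table, so the critical z is about 2.576 instead of the tabled 2.575.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 0.09, 0.032, 400, 0.01

se = sigma / sqrt(n)                             # standard error of the mean
z_crit = NormalDist().inv_cdf(1.0 - alpha / 2)   # alpha is split across two tails

# Convert the critical z's back to the reaction-time scale
lower = mu0 - z_crit * se
upper = mu0 + z_crit * se

print(f"SE = {se:.4f}, critical z = +/-{z_crit:.3f}")
print(f"reject H0 if mean RT < {lower:.4f} or > {upper:.4f} seconds")
```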

31

Another example

• We collect data from our 400 subjects and find the mean RT to be .097 seconds.

• .097 is different from .09, but different enough?

$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{.097 - .09}{.032 / \sqrt{400}} = \frac{.007}{.0016} = 4.375$$

• 4.375 > 2.575, so reject H0. Benadryl probably does have an effect on reaction time. Specifically, it slows people down.
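The 2-tailed decision can also be expressed as a p-value. A minimal sketch, assuming the example's values and doubling the upper-tail area because extreme results in either direction count against H0:

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 0.09, 0.032, 400, 0.01
x_bar = 0.097                                   # observed mean reaction time

z_obs = (x_bar - mu0) / (sigma / sqrt(n))       # observed z for the sample mean

# Two-tailed p-value: area beyond |z| in both tails
p_two_tailed = 2.0 * (1.0 - NormalDist().cdf(abs(z_obs)))
reject_h0 = p_two_tailed < alpha

print(f"z = {z_obs:.3f}, two-tailed p = {p_two_tailed:.6f}, reject H0: {reject_h0}")
```

The sign of `z_obs` is positive, which matches the conclusion that reaction times got slower.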

32

N = 1: a special case?

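$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{N}}$$

• When N = 1,

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}} = \frac{\sigma}{\sqrt{1}} = \sigma$$

...and:

$$z = \frac{x - \mu_0}{\sigma}$$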

33

The 5 steps of hypothesis testing

1. Specify null and alternative hypotheses.

2. Identify a test statistic.

3. Specify the sampling distribution and sample size.

4. Specify alpha and the region(s) of rejection.

5. Collect data, compute the test statistic, and make a decision regarding H0.

34

1. Null and alternative hypotheses

• Specify H0 and H1 in terms of population parameters.

• H0 is presumed to be true in the absence of evidence against it.

• H1 is adopted if H0 is rejected.

$$H_0: \mu = 0.09 \text{ sec}$$

$$H_1: \mu \neq 0.09 \text{ sec}$$

35

2. Identify a test statistic

• Identify a test statistic that is useful for discriminating between different hypotheses about the population parameter of interest, taking into account the hypothesis being tested and the information known.

• E.g., z, t, F, and $\chi^2$.

36

3. Sampling distribution and N

• Specify the sampling distribution and sample size.

• The sampling distribution here refers to the distribution of all possible values of the test statistic obtained under the assumption that H0 is true.

• E.g., “N = 48. The sampling distribution is the standard normal distribution (distribution of z statistics), because we are testing a hypothesis about the population mean when $\sigma$ is known.”

37

4. Specify $\alpha$ and the rejection regions

• Alpha ($\alpha$) is the probability of incorrectly rejecting H0 (rejecting the null hypothesis when it is really true).

• Regions of rejection are those ranges of the test statistic’s sampling distribution which, if encountered, would lead to rejecting H0.

• The regions of rejection are determined by $\alpha$ and by whether the test is 1-tailed or 2-tailed.

38

5. Collect data, compute the test statistic, make a decision

• For example...

$$z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{N}} = \frac{2 - 0}{5 / \sqrt{48}} = \frac{2}{.722} = 2.77$$

• E.g., “2.77 > 1.96, so reject H0 and conclude that...”

• Always couch the conclusion in terms of the original problem.

39

The 5 steps: Example

• Let’s say you think a certain standardized achievement test is biased against Asian-Americans. You know that for the non-Asian-American population...

$$\mu = 100$$

$$\sigma = 10$$

• In the sample...

$$N = 28$$

40


1. Specify null and alternative hypotheses.

0 : 100H 1 : 100H

2. Identify a test statistic.

We want to compare a sample mean to a hypothesized value, and we know $\sigma$, so we use a z-test.

41


3. Specify the sampling distribution and sample size.

The sampling distribution of z is the standard normal distribution.

$$N = 28$$

4. Specify alpha and the region(s) of rejection.

The regions of rejection are harder...

$$\alpha = .05$$

42


5. Collect data, compute the test statistic, make a decision.

We collect data. Say the mean is 97.1. Does 97.1 fall in the region of rejection?

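$$z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{N}} = \frac{97.1 - 100}{10 / \sqrt{28}} = \frac{-2.9}{1.89} = -1.53$$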
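Step 5 can be sketched in code for this example. The tail of the test is an assumption here, so the sketch evaluates both a lower-tailed cutoff and a 2-tailed cutoff at alpha = .05; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 100.0, 10.0, 28, 0.05
x_bar = 97.1                                   # observed sample mean

se = sigma / sqrt(n)                           # standard error of the mean
z_obs = (x_bar - mu0) / se                     # observed test statistic

std_normal = NormalDist()
# Assumption: both cutoffs are computed because the direction of H1 is not fixed here
z_crit_lower = std_normal.inv_cdf(alpha)            # 1-tailed test, lower tail
z_crit_two = std_normal.inv_cdf(1.0 - alpha / 2)    # 2-tailed test

reject_lower_tailed = z_obs < z_crit_lower
reject_two_tailed = abs(z_obs) > z_crit_two

print(f"SE = {se:.2f}, z = {z_obs:.2f}")
print(f"lower-tailed: reject H0 = {reject_lower_tailed}; 2-tailed: reject H0 = {reject_two_tailed}")
```

Under either cutoff the observed z does not reach the region of rejection, so H0 is retained.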

43

Type I and Type II errors

• There are two ways to make an incorrect decision in hypothesis testing: Type I and Type II errors.

• Type I error: Concluding that the null hypothesis is false when it is really true.

• We control the probability of making a Type I error (alpha).

• Alpha ($\alpha$): The risk of incorrectly rejecting a true null hypothesis.

• Why not make $\alpha$ really, really small? The smaller we make $\alpha$, the more likely it becomes we will encounter a Type II error.
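The meaning of alpha as a long-run error rate can be illustrated by simulation. This is a sketch only: it assumes a normal population for which H0 is exactly true (mean 500, SD 100, borrowed from the IQ example), repeats a 1-tailed z-test many times, and counts how often H0 is wrongly rejected.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)                                   # fixed seed so the run is repeatable
mu0, sigma, n, alpha = 500.0, 100.0, 4, 0.05
z_crit = NormalDist().inv_cdf(1.0 - alpha)       # upper-tail critical z
n_experiments = 20000                            # number of simulated studies

rejections = 0
for _ in range(n_experiments):
    # Draw a sample from the null population, so any rejection is a Type I error
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    z_obs = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    if z_obs > z_crit:
        rejections += 1

type_i_rate = rejections / n_experiments
print(f"proportion of false rejections = {type_i_rate:.3f}")
```

The proportion of false rejections should land close to the chosen alpha.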

44

Type I and Type II errors

• Type II error: Concluding the null hypothesis is true when it is really false.

• Beta ($\beta$): The probability of incorrectly retaining a false null hypothesis.
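The trade-off between the two error types can be shown with a short calculation. The sketch below assumes a 1-tailed z-test with the IQ example's null mean and SD, N = 4, and a hypothetical true mean of 600 (not from the slides); it computes beta for several values of alpha.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 500.0, 100.0, 4
mu_true = 600.0            # hypothetical true population mean (H0 is false)

std_normal = NormalDist()
se = sigma / sqrt(n)       # standard error of the mean

def beta(alpha):
    """Probability of retaining H0 when the true mean is mu_true."""
    z_crit = std_normal.inv_cdf(1.0 - alpha)
    # z_obs is centered at (mu_true - mu0) / se, so shift the cutoff by that amount
    return std_normal.cdf(z_crit - (mu_true - mu0) / se)

betas = {a: beta(a) for a in (0.10, 0.05, 0.01, 0.001)}
for a, b in betas.items():
    print(f"alpha = {a:<5}  beta = {b:.3f}")
```

As alpha shrinks, the critical value moves further out and beta grows, which is why alpha is not simply set as small as possible.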

45

Next time...

• Power

• Effect size

• Statistical significance vs. practical significance

• Confidence intervals
