5 hypothesis testing

7/27/2019 5 Hypothesis Testing

1/30

Hypothesis Testing

Objectives:

Students should be able to identify the null and alternative (research) hypotheses in a statistical test

Students should know the difference between one- and two-directional hypothesis testing

Students should know what alpha, beta, power, and p-values are

Students should be able to identify/define type I and type II errors

Students should understand the differences between statistical significance and clinical importance

Students should know how to determine statistical significance given alpha and a calculated p-value OR given alpha and a corresponding confidence interval


2/30

Hypothesis Testing

The second type of inferential statistics

Hypothesis testing is a statistical method used to make comparisons between a single sample and a population, or between 2 or more samples.

The result of a statistical hypothesis test is a probability, called a p-value, of obtaining the observed results (or more extreme results) from the sample if the null hypothesis were true in the population, that is, if there really were no difference or relationship.


3/30


4/30

Hypothesis Testing

In all hypothesis testing, the numerical result from the statistical test is compared to a probability distribution to determine the probability of obtaining that result (or a more extreme one) if the null hypothesis is true in the population.

Examples of two probability distributions: the normal and t-distributions

[Figure: the normal distribution and the t distribution plotted on the same axis, from -4 to 4]
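The comparison of the two curves can be sketched numerically. This is a minimal illustration using only the Python standard library; the helper `t_pdf` and the choice of 5 degrees of freedom are assumptions made for the example, since the slide does not say which t distribution was plotted.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


standard_normal = NormalDist()  # mean 0, standard deviation 1
DF = 5  # hypothetical degrees of freedom, chosen only for illustration

# Print both densities at the tick marks on the right half of the axis.
for x in (0, 1, 2, 3, 4):
    print(f"x = {x}: normal = {standard_normal.pdf(x):.4f}, t(df={DF}) = {t_pdf(x, DF):.4f}")
```

The t distribution is lower at the center and higher in the tails than the normal distribution, which is why t critical values are larger than normal critical values for small samples.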


5/30

Steps in Statistical Hypothesis Testing

1. Formulate null and research hypotheses

2. Set alpha error (Type I error) and beta error (Type II error)

3. Compute statistical test and determine statistical significance

4. Draw conclusion


6/30

Step 1: Formulate Null and Research Hypotheses

Null Hypothesis (H0):

There is no difference between groups; there is no relationship between the independent and dependent variable(s).

Research Hypothesis (HR):

There is a difference between groups; there is a relationship between the independent and dependent variable(s).


7/30

Directional vs Non-directional Hypotheses

Null and research hypotheses are either non-directional (two-tailed) or directional (one-tailed):

Non-directional (two-tailed):

H0: Drug A = Drug B

HR: Drug A ≠ Drug B

Directional (one-tailed):

H0: Drug A ≤ Drug B

HR: Drug A > Drug B

or

H0: Drug A ≥ Drug B

HR: Drug A < Drug B

[Figure: two-tailed test with a central non-rejection region and a 2.5% rejection region in each tail; one-tailed test with a non-rejection region and a single 5.0% rejection region in one tail]
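The practical effect of the one-tailed and two-tailed rejection regions above can be sketched with a normal (z) test statistic. The value `z_stat = 1.80` is hypothetical and chosen only to show a result that falls in the 5.0% one-tailed rejection region but not in either 2.5% two-tailed region.

```python
from statistics import NormalDist

standard_normal = NormalDist()


def p_two_tailed(z: float) -> float:
    """Area in both tails beyond |z| (non-directional test)."""
    return 2 * (1 - standard_normal.cdf(abs(z)))


def p_one_tailed_greater(z: float) -> float:
    """Area in the upper tail beyond z (directional test, HR: A > B)."""
    return 1 - standard_normal.cdf(z)


z_stat = 1.80  # hypothetical test statistic, for illustration only
p_two = p_two_tailed(z_stat)
p_one = p_one_tailed_greater(z_stat)

print(f"two-tailed p = {p_two:.4f}, one-tailed p = {p_one:.4f}")
```

For a statistic in the hypothesized direction, the one-tailed p-value is half the two-tailed p-value, so the choice between the two must be made before the data are examined.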


8/30

Example: Directional vs Non-directional

Research question: Does age of onset of paranoid schizophrenia differ for males and females?

Non-directional (two-tailed):

H0: Male Age = Female Age

HR: Male Age ≠ Female Age

Directional (one-tailed):

H0: Male Age ≤ Female Age

HR: Male Age > Female Age

(or the opposite)

[Figure: two-tailed test with a central non-rejection region and a 2.5% rejection region in each tail; one-tailed test with a non-rejection region and a single 5.0% rejection region in one tail]


9/30

Step 2: Set Alpha (Type I) and Beta (Type II) Errors

Alpha (α) is the level of significance in hypothesis testing:

Alpha is a probability specified before the test is performed.

Alpha is the probability of rejecting the null hypothesis when it is true.

By convention, typical values of alpha specified in medical research are 0.05 and 0.01.

Alphas have corresponding critical values, the same ones used to calculate confidence intervals: 1.96 for alpha = 0.05 and 2.575 for alpha = 0.01 (two-sided, normal distribution).
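The link between alpha and its critical value can be checked with the standard normal distribution. This is a small sketch; the function name `two_sided_critical_z` is introduced here for illustration. The computed value for alpha = 0.01 rounds to 2.576; the 2.575 quoted on the slide is a common textbook approximation of the same number.

```python
from statistics import NormalDist


def two_sided_critical_z(alpha: float) -> float:
    """Critical z that leaves alpha/2 in each tail of the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: critical z = {two_sided_critical_z(alpha):.3f}")
```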


10/30

Step 2: Set Alpha (Type I) and Beta (Type II) Errors

Beta (β) is the probability of accepting (failing to reject) the null hypothesis when it is false.

Typical values for beta are 0.10 to 0.20

Beta is directly related to the power of a statistical test:

Power is the probability of correctly rejecting the null hypothesis when it is false. Power = 1 - Beta

A type II error occurs when a false null hypothesis is accepted.
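The relationship Power = 1 - Beta can be illustrated for a two-sided z test. This is a sketch under stated assumptions: the test statistic is taken to be normally distributed, and the true difference and standard error passed in below are hypothetical values, not figures from any study in these slides.

```python
from statistics import NormalDist


def power_two_sided_z(true_diff: float, se: float, alpha: float = 0.05) -> float:
    """Probability that a two-sided z test rejects H0 when the true difference is true_diff."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = true_diff / se  # where the test statistic is centered when H0 is false
    upper_tail = 1 - normal.cdf(z_crit - shift)  # lands in the upper rejection region
    lower_tail = normal.cdf(-z_crit - shift)     # lands in the lower rejection region
    return upper_tail + lower_tail


# Hypothetical inputs: a true difference 2.8 standard errors away from zero.
power = power_two_sided_z(true_diff=2.8, se=1.0)
beta = 1 - power
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true difference is zero, the same function returns alpha, because every rejection is then a type I error.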


11/30

P-values

P-values are the actual probabilities calculated from a statistical test, and are compared against alpha to determine whether to reject the null hypothesis or not.

Example:

alpha = 0.05; calculated p-value = 0.008; reject null hypothesis

alpha = 0.05; calculated p-value = 0.110; do not reject null hypothesis

A type I error occurs when a true null hypothesis is rejected.
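The decision rule in the example above is simple enough to write down directly. The function name `decide` is introduced for illustration, and treating a p-value exactly equal to alpha as "do not reject" is an assumption; conventions differ on that boundary case.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a calculated p-value against the pre-specified alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


# The two p-values from the slide's example
for p in (0.008, 0.110):
    print(f"alpha = 0.05; p-value = {p:.3f}; {decide(p)}")
```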


12/30

Possible Outcomes in Statistical Testing

Null hypothesis: Treatment A = Treatment B

Decision based on              Null hypothesis in the POPULATION
inferential statistical test   True (no difference)        False (difference)

Accept H0 (no difference)      Correct decision            Type II error (beta (β) error)

Reject H0 (difference)         Type I error                Correct decision
                               (alpha (α) error)           Power (1 - β)


13/30


14/30

Post treatment mortality in CABG/PTCA study:

What are the null and alternative hypotheses for a two-tailed test?

Null Hypothesis (H0)

There is no difference in post treatment mortality between the CABG and PTCA groups (the post treatment mortality is equal, i.e. P1 = P2)

Research Hypothesis (HR)

There is a difference in post treatment mortality between the CABG and PTCA groups (the post treatment mortality is not equal, i.e. P1 ≠ P2)


15/30


16/30

Hypothesis Testing

Step 3

Compute statistical test and determine statistical significance

Calculations for statistical tests are different depending on the type of test

All involve determining a value of a test statistic that is then converted to a probability of obtaining that test statistic if the null hypothesis is true.

The value of a test statistic is determined from the measurement being tested, and the variability of the measurement in the sample (the SE of the measurement).


17/30


18/30

Example of a statistical test: two-sample t test

Does age of onset of paranoid schizophrenia differ for males and females?

H0: Male Age = Female Age

HR: Male Age ≠ Female Age

          n    mean age    SD
Male      12   26.8        5.8
Female    12   29.6        6.2

Test statistic:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{(\bar{x}_1 - \bar{x}_2)}}$$


19/30

Example of a statistical test: two-sample t test

Does age of onset of paranoid schizophrenia differ for males and females?

calculated test statistic: t = -1.142

Critical value of t for alpha = 0.05: ±1.960 (the large-sample value; with 22 degrees of freedom the critical value is about ±2.074, which leads to the same decision)

The absolute value of the computed t does not exceed the critical value, so the null hypothesis of no difference in age is not rejected (the p value is greater than 0.05)

Conclusion: The mean age of onset is not statistically significantly different for males versus females
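The t statistic for this example can be reproduced from the summary table. The sketch below assumes the standard error of the difference is sqrt(SD1^2/n1 + SD2^2/n2) and that the test uses n1 + n2 - 2 = 22 degrees of freedom (equal variances); the slides do not show the calculation, so both are assumptions. The helper names are introduced here, and the p-value is obtained by numerically integrating the t density with Simpson's rule rather than from a statistics library.

```python
import math


def t_pdf(x: float, df: float) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_t_p_value(t: float, df: float, steps: int = 2000) -> float:
    """Two-sided p-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t


# Summary statistics from the slide
n_male, mean_male, sd_male = 12, 26.8, 5.8
n_female, mean_female, sd_female = 12, 29.6, 6.2

se_diff = math.sqrt(sd_male**2 / n_male + sd_female**2 / n_female)
t_stat = (mean_male - mean_female) / se_diff
df = n_male + n_female - 2  # assumes equal variances in the two groups
p_value = two_sided_t_p_value(t_stat, df)

print(f"SE = {se_diff:.2f}, t = {t_stat:.3f}, df = {df}, two-sided p = {p_value:.3f}")
```

The computed t agrees with the slide's -1.142, and the p-value is well above 0.05, consistent with not rejecting the null hypothesis.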


20/30

Is the post treatment mortality different for patients receiving CABG compared to patients receiving PTCA?

There are a number of statistical tests that can be used:

2 examples are 1) chi-square test, or 2) z test for proportions. For a comparison of two proportions, the resulting p values will be the same regardless of which of these tests is used (provided no continuity correction is applied).

The researchers used a z test:

the p value from the test was 0.3508.

If alpha = 0.05, what did they conclude?


21/30

Is the post treatment mortality different for patients receiving CABG compared to patients receiving PTCA?

The p value is 0.3508; this is > 0.05, so the conclusion from the study is that there is no statistically significant difference in post treatment mortality.

If there is truly no difference between CABG and PTCA, the probability of obtaining a difference of 0.6% or larger is ~35%.
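The statement that the chi-square test and the z test for proportions give the same p value can be checked on any 2x2 table. The counts below are hypothetical (the study's actual counts are not given in these slides), and the function names are introduced for illustration. The sketch uses a pooled standard error for the z test and the Pearson chi-square without continuity correction.

```python
import math
from statistics import NormalDist


def two_proportion_z(deaths1: int, n1: int, deaths2: int, n2: int):
    """Two-sided z test for the difference between two proportions (pooled SE)."""
    p1, p2 = deaths1 / n1, deaths2 / n2
    pooled = (deaths1 + deaths2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * (1 - NormalDist().cdf(abs(z)))


def chi_square_2x2(deaths1: int, n1: int, deaths2: int, n2: int):
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    observed = [[deaths1, n1 - deaths1], [deaths2, n2 - deaths2]]
    row_totals = [n1, n2]
    col_totals = [deaths1 + deaths2, (n1 - deaths1) + (n2 - deaths2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, for illustration only: 30/1000 deaths vs 40/1000 deaths
z, p_z = two_proportion_z(30, 1000, 40, 1000)
chi2, p_chi = chi_square_2x2(30, 1000, 40, 1000)
print(f"z = {z:.4f}, z^2 = {z * z:.4f}, chi-square = {chi2:.4f}")
print(f"p (z test) = {p_z:.4f}, p (chi-square) = {p_chi:.4f}")
```

The chi-square statistic equals the square of z, which is why the two p values coincide.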


22/30

Hypothesis Testing

Step 4

Draw conclusion about the population based on the results of the statistical test on the sample

Statistical conclusion: the results either are or are not statistically significant

BUT

You need to interpret the results in a meaningful (and not just statistical) way


23/30

Principles for Statistical Significance

1. The size of a p value does not indicate importance of the result.

2. Interpret nonsignificance cautiously.

a. finding no difference may be important

b. statistically nonsignificant ≠ clinically unimportant

3. Results may be statistically significant but clinically trivial.


24/30

P Values vs. Confidence Intervals

There is a direct relationship between levels of alpha set for a statistical test and the level set for constructing a confidence interval.

For example, alpha = 0.05 for a 2-sided statistical test is equivalent to a 95% confidence interval

[Figure: two-tailed test with a 2.5% rejection region in each tail; the central non-rejection region corresponds to the 95% confidence interval]


25/30


Statistical significance can be obtained from a confidence interval as well as a hypothesis test

AND

Confidence intervals convey more information than p values

For this reason, most medical journals now prefer that results be presented with confidence intervals rather than p values.

If the NULL VALUE for a statistical hypothesis test using alpha = 0.05 is contained within the 95% confidence interval, we can conclude NO statistical significance at alpha = 0.05 without doing the hypothesis test:


26/30


Example:

For differences between means or proportions, the null hypothesis is that the difference is equal to zero:

If the 95% CI includes the value zero, the differences are not statistically significant at alpha = 0.05.

For the test comparing the ages of males and females for onset of paranoid schizophrenia, the null hypothesis is that the difference in age is zero years.


27/30


Example:

          n    mean age    SD
Male      12   26.8        5.8
Female    12   29.6        6.2

The difference in age obtained from the sample is:

26.8 - 29.6 = -2.8 years

The standard error of the difference is 2.45 (calculation not shown)

The 95% confidence interval is:

-2.8 +/- 2(2.45) = -2.8 +/- 4.9 = -7.7 to 2.1 years


28/30


The 95% confidence interval is:

-2.8 +/- 2(2.45) = -2.8 +/- 4.9 = -7.7 to 2.1 years

This means we can be 95% confident that the true population mean difference in age is somewhere between males being 7.7 years younger and males being 2.1 years older than females

The 95% CI includes 0 years, so there is no statistically significant difference in age. In addition, we have information about the precision of our estimate of the difference, which cannot be obtained from p values alone.

Note: This is a relatively wide confidence interval because the sample size is small
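The interval arithmetic above can be written out as a short sketch. The function name `confidence_interval` is introduced for illustration. The multiplier of 2 follows the slide, which rounds the normal critical value of 1.96; a t-based multiplier for this small sample would be somewhat larger and would widen the interval.

```python
def confidence_interval(estimate: float, se: float, multiplier: float = 2.0):
    """Return (lower, upper) as estimate +/- multiplier * standard error."""
    margin = multiplier * se
    return estimate - margin, estimate + margin


diff = 26.8 - 29.6  # male mean age minus female mean age, from the slide
lower, upper = confidence_interval(diff, se=2.45)
includes_zero = lower <= 0 <= upper

print(f"95% CI: {lower:.1f} to {upper:.1f} years; includes zero: {includes_zero}")
```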


29/30


For the CABG/PTCA result:

The 95% CI is -0.6% to 1.7%

We can be 95% confident that the true difference in mortality between CABG and PTCA is between -0.6% and +1.7%

This confidence interval contains the value zero; therefore, we could have concluded that the mortality is not different based on the confidence interval alone.


30/30


For ratio variables, such as relative risk and odds ratio, the value one represents equality.

The null hypothesis is that the ratio is equal to one:

If the 95% CI includes the value one, the difference is not statistically significant at alpha = 0.05.
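The same check works for differences and for ratios; only the null value changes. In this sketch the function name `significant_from_ci` is introduced for illustration, the CABG/PTCA interval comes from the slides, and the two relative-risk intervals are hypothetical.

```python
def significant_from_ci(lower: float, upper: float, null_value: float) -> bool:
    """True when the confidence interval excludes the null value."""
    return not (lower <= null_value <= upper)


# Difference in mortality (percent), CABG vs PTCA: null value is 0
print(significant_from_ci(-0.6, 1.7, null_value=0.0))

# Hypothetical relative-risk intervals: null value is 1
print(significant_from_ci(0.8, 1.4, null_value=1.0))
print(significant_from_ci(1.2, 2.5, null_value=1.0))
```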
