5 hypothesis testing
7/27/2019 5 Hypothesis Testing
Hypothesis Testing
Objectives:
Students should be able to identify the null and alternative (research) hypotheses in a statistical test
Students should know the difference between one- and two-directional hypothesis testing
Students should know what alpha, beta, power, and p-values are
Students should be able to identify/define type I and type II errors
Students should understand the differences between statistical significance and clinical importance
Students should know how to determine statistical significance given alpha and a calculated p-value OR given alpha and a corresponding confidence interval
Hypothesis Testing
The second type of inferential statistics
Hypothesis testing is a statistical method used to make comparisons between a single sample and a population, or between 2 or more samples.
The result of a statistical hypothesis test is a probability, called a p-value: the probability of obtaining the sample results (or more extreme results) if there really were no difference or relationship in the population, that is, if the null hypothesis were true.
Hypothesis Testing
In all hypothesis testing, the numerical result from the statistical test is compared to a probability distribution to determine the probability of obtaining that result (or a more extreme one) if the null hypothesis is true in the population.
Examples of two probability distributions: the normal and t-distributions
[Figure: normal distribution and t distribution curves plotted over the range -4 to 4]
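
To make the comparison between the two curves concrete, the following minimal sketch evaluates both densities at the tick marks from -4 to 4. The function names and the choice of 5 degrees of freedom are assumptions made only for illustration; the slide does not say which t distribution was drawn.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2.0) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2.0)
    )
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2.0)


if __name__ == "__main__":
    df = 5  # assumed value, chosen only for illustration
    for x in range(-4, 5):
        print(f"{x:+d}  normal={normal_pdf(x):.4f}  t(df={df})={t_pdf(x, df):.4f}")
```

The t distribution is lower at the center and higher in the tails than the normal distribution, which is why its critical values are larger for small samples.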
Steps in Statistical Hypothesis Testing
1. Formulate null and research hypotheses
2. Set alpha error (Type I error) and beta error (Type II error)
3. Compute statistical test and determine statistical significance
4. Draw conclusion
Step 1: Formulate Null and Research Hypotheses
Null Hypothesis (H0):
There is no difference between groups;
there is no relationship between the independent and dependent variable(s).
Research Hypothesis (HR):
There is a difference between groups;
there is a relationship between the independent and dependent variable(s).
Directional vs Non-directional Hypotheses
Null and research hypotheses are either non-directional (two-tailed) or directional (one-tailed):
Non-directional (two-tailed):
H0: Drug A = Drug B
HR: Drug A ≠ Drug B
Directional (one-tailed):
H0: Drug A ≤ Drug B
HR: Drug A > Drug B
or
H0: Drug A ≥ Drug B
HR: Drug A < Drug B
[Figure: two-tailed test with a central non-rejection region and a 2.5% rejection region in each tail; one-tailed test with a non-rejection region and a single 5.0% rejection region]
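
The rejection regions above translate into critical values of the test statistic. As a minimal sketch, assuming a test statistic that follows the standard normal distribution, the hypothetical helper `critical_z` below shows how splitting alpha across two tails (2.5% each) or placing it all in one tail (5.0%) changes the cutoff.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=True):
    """Critical value of a standard normal test statistic.

    A two-tailed test splits alpha evenly between both tails;
    a one-tailed test puts all of alpha in a single tail.
    """
    tail_area = alpha / 2.0 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        two = critical_z(alpha, two_tailed=True)
        one = critical_z(alpha, two_tailed=False)
        print(f"alpha={alpha}: two-tailed +/-{two:.3f}, one-tailed {one:.3f}")
```

For alpha = 0.05 the one-tailed cutoff (about 1.645) is smaller than the two-tailed cutoff (about 1.960), so a directional test rejects more easily in the predicted direction and cannot reject in the other.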
Example: Directional vs Non-directional
Research question: Does age of onset of paranoid schizophrenia differ for males and females?
Non-directional (two-tailed):
H0: Male Age = Female Age
HR: Male Age ≠ Female Age
Directional (one-tailed):
H0: Male Age ≤ Female Age
HR: Male Age > Female Age
(or the opposite)
[Figure: two-tailed test with a central non-rejection region and a 2.5% rejection region in each tail; one-tailed test with a non-rejection region and a single 5.0% rejection region]
Step 2: Set Alpha (Type I) and Beta (Type II) Errors
Alpha (α) is the level of significance in hypothesis testing:
Alpha is a probability specified before the test is performed.
Alpha is the probability of rejecting the null hypothesis when it is true.
By convention, typical values of alpha specified in medical research are 0.05 and 0.01.
Alphas have corresponding critical values, the same ones used to calculate confidence intervals: 1.96 for alpha = 0.05 and 2.575 for alpha = 0.01 (two-tailed).
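
One way to see what "the probability of rejecting the null hypothesis when it is true" means is to simulate many studies in which the null hypothesis really is true and count how often a test rejects it. The sketch below is only an illustration under stated assumptions: a one-sample z test on data drawn from a normal population with mean 0 and SD 1, and a hypothetical function name, sample size, trial count, and seed.

```python
import random
from statistics import NormalDist, fmean


def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Fraction of simulated studies that reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-tailed cutoff
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (population mean = 0) is true by construction.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # SE of the mean is 1/sqrt(n) when SD = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    print(f"Observed type I error rate: {type_i_error_rate():.3f}")
```

The observed rejection rate should land close to the chosen alpha, which is the sense in which alpha is set before the test is performed.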
Step 2: Set Alpha (Type I) and Beta (Type II) Errors
Beta (β) is the probability of accepting (failing to reject) the null hypothesis when it is false.
Typical values for beta are 0.10 to 0.20.
Beta is directly related to the power of a statistical test:
Power is the probability of correctly rejecting the null hypothesis when it is false. Power = 1 - Beta
A type II error occurs when a false null hypothesis is accepted.
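
The relationship Power = 1 - Beta can be sketched numerically. The example below assumes a two-tailed z test in which the estimate is normally distributed with a known standard error; the function name and the input values are hypothetical and are not taken from any study in these slides.

```python
from statistics import NormalDist


def power_two_sided_z(true_diff, se, alpha=0.05):
    """Power of a two-tailed z test when the true difference is true_diff.

    Assumes the estimated difference is normally distributed around
    true_diff with standard error se.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = true_diff / se  # true difference measured in standard errors
    # Probability that the test statistic falls in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)


if __name__ == "__main__":
    power = power_two_sided_z(true_diff=2.8, se=1.0)  # illustrative inputs
    beta = 1.0 - power
    print(f"power={power:.3f}, beta={beta:.3f}")
```

When the true difference is zero, the function returns alpha, since every rejection is then a type I error. A true difference of about 2.8 standard errors gives power near 0.80, that is, beta near 0.20.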
P-values
P-values are the actual probabilities calculated from a statistical test, and are compared against alpha to determine whether to reject the null hypothesis or not.
Example:
alpha = 0.05; calculated p-value = 0.008; reject null hypothesis
alpha = 0.05; calculated p-value = 0.110; do not reject null hypothesis
A type I error occurs when a true null hypothesis is rejected.
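
The decision rule in the example above can be written as a short sketch. The function name `decide` is hypothetical, and treating a p-value exactly equal to alpha as "do not reject" is an assumption, since conventions differ on that boundary case.

```python
def decide(p_value, alpha=0.05):
    """Compare a calculated p-value against a pre-specified alpha."""
    # Assumption: reject only when p is strictly less than alpha.
    return "reject H0" if p_value < alpha else "do not reject H0"


if __name__ == "__main__":
    for p in (0.008, 0.110):
        print(f"alpha=0.05, p={p:.3f}: {decide(p)}")
```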
Possible Outcomes in Statistical Testing
Null Hypothesis (Treatment A = Treatment B)

Decision based on inferential    POPULATION: H0 true               POPULATION: H0 false
statistical test                 (No difference)                   (Difference)
Accept H0 (No difference)        Correct decision                  Type II error (beta (β) error)
Reject H0 (Difference)           Type I error (alpha (α) error)    Correct decision; Power (1 - β)
Post treatment mortality in CABG/PTCA study:
What are the null and alternative hypotheses for a two-tailed test?
Null Hypothesis (H0)
There is no difference in post treatment mortality between the CABG and PTCA groups (the post treatment mortality is equal, i.e. P1 = P2)
Research Hypothesis (HR)
There is a difference in post treatment mortality between the CABG and PTCA groups (the post treatment mortality is not equal, i.e. P1 ≠ P2)
Hypothesis Testing
Step 3
Compute statistical test and determine statistical significance
Calculations for statistical tests are different depending on the type of test
All involve determining a value of a test statistic that is then converted to a probability of obtaining that test statistic (or a more extreme one) if the null hypothesis is true.
The value of a test statistic is determined from the measurement being tested, and the variability of the measurement in the sample (the SE of the measurement).
Example of a statistical test: two-sample t test
Does age of onset of paranoid schizophrenia differ for males and females?
H0: Male Age = Female Age
HR: Male Age ≠ Female Age

          n     mean age    SD
Male      12    26.8        5.8
Female    12    29.6        6.2

Test statistic:
$$ t = \frac{\bar{x}_1 - \bar{x}_2}{SE_{(\bar{x}_1 - \bar{x}_2)}} $$
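
The test statistic can be computed directly from the summary table. The slides do not show how the standard error was obtained, so the sketch below assumes the common formula SE = sqrt(SD1^2/n1 + SD2^2/n2); with equal group sizes, as here, the pooled-variance formula gives the same value. The function name is hypothetical.

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary data.

    Assumption: SE of the difference is sqrt(sd1^2/n1 + sd2^2/n2).
    Returns (t, se).
    """
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t = (mean1 - mean2) / se
    return t, se


if __name__ == "__main__":
    # Males first, then females, using the values in the table above.
    t, se = two_sample_t(26.8, 5.8, 12, 29.6, 6.2, 12)
    print(f"SE of difference = {se:.2f}, t = {t:.3f}")
```

Under this assumption the standard error is about 2.45 and t is about -1.142.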
Example of a statistical test: two-sample t test
Does age of onset of paranoid schizophrenia differ for males and females?
Calculated test statistic: t = -1.142
Critical value of t for alpha = 0.05: ±1.960 (the large-sample value; with 22 degrees of freedom the critical value is about ±2.074, which leads to the same decision)
The absolute value of the computed t does not exceed the critical value, so the null hypothesis of no difference in age is not rejected (the p value is greater than 0.05)
Conclusion:
The mean age of onset is not statistically significantly different for males versus females
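
As a rough check on the statement that the p value is greater than 0.05, the sketch below converts the test statistic to a two-sided p value using the standard normal distribution, in line with the 1.960 critical value quoted above. The function name is hypothetical, and the normal distribution is an approximation here: with only 22 degrees of freedom the exact t-based p value would be somewhat larger.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Two-sided p value for a test statistic, using the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    p = two_sided_p_from_z(-1.142)
    print(f"approximate two-sided p value = {p:.3f}")
    print("reject H0" if p < 0.05 else "do not reject H0")
```

The approximate p value is about 0.25, well above alpha = 0.05, so the null hypothesis is not rejected.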
Is the post treatment mortality different for patients receiving CABG compared to patients receiving PTCA?
There are a number of statistical tests that can be used:
2 examples are 1) chi-square test, or 2) z test for proportions. The resulting p values will be the same regardless of the test used (for a two-sided comparison of two proportions, the chi-square statistic without continuity correction is the square of the z statistic).
The researchers used a z test:
the p value from the test was 0.3508.
If alpha = 0.05, what did they conclude?
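
The agreement between the two tests can be illustrated with a minimal sketch. The counts used below are hypothetical and are NOT the CABG/PTCA data, which these slides do not report; the function names are also illustrative. The z test uses the pooled proportion for its standard error, and the chi-square test is computed without a continuity correction.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test for two proportions (pooled SE). Returns (z, p)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


def chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square for a 2x2 table (no continuity correction).

    Returns (chi2, p); with 1 degree of freedom, P(X > c) = 2*(1 - Phi(sqrt(c))).
    """
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    p_value = 2.0 * (1.0 - NormalDist().cdf(math.sqrt(chi2)))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 30 events out of 200 versus 20 events out of 200.
    z, p_z = two_proportion_z_test(30, 200, 20, 200)
    chi2, p_chi = chi_square_2x2(30, 200, 20, 200)
    print(f"z = {z:.4f}, z^2 = {z * z:.4f}, chi-square = {chi2:.4f}")
    print(f"p (z test) = {p_z:.4f}, p (chi-square) = {p_chi:.4f}")
```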
Is the post treatment mortality different for patients receiving CABG compared to patients receiving PTCA?
The p value is 0.3508. This is > 0.05, so the conclusion from the study is that there is no statistically significant difference.
If there is truly no difference between CABG and PTCA, the probability of obtaining a difference of 0.6% or larger is ~35%
Hypothesis Testing
Step 4
Draw conclusion about the population based on the results of the statistical test on the sample
Statistical conclusion: the results either are or are not statistically significant
BUT
You need to interpret the results in a meaningful (and not just statistical) way
Principles for Statistical Significance
1. The size of a p value does not indicate importance of the result.
2. Interpret nonsignificance cautiously.
a. finding no difference may be important
b. statistically nonsignificant ≠ clinically unimportant
3. Results may be statistically significant but clinically trivial.
P Values vs. Confidence Intervals
There is a direct relationship between levels of alpha set for a statistical test and the level set for constructing a confidence interval.
For example, alpha = 0.05 for a 2-sided statistical test is equivalent to a 95% confidence interval
[Figure: 95% confidence interval spanning the central non-rejection region, with a 2.5% rejection region in each tail]
P Values vs. Confidence Intervals
Statistical significance can be obtained from a confidence interval as
well as a hypothesis test
AND
Confidence intervals convey more information than p values
For this reason, most medical journals now prefer that results be
presented with confidence intervals rather than p values.
If the NULL VALUE for a statistical hypothesis test using alpha = 0.05 is contained within the 95% confidence interval, we can conclude NO statistical significance at alpha = 0.05 without doing the hypothesis test.
P Values vs. Confidence Intervals
Example:
For differences between means or proportions, the null hypothesis is that the difference is equal to zero:
If the 95% CI includes the value zero, the differences are not statistically significant at alpha = 0.05.
For the test comparing the ages of males and females for onset of paranoid schizophrenia, the null hypothesis is that the difference in age is zero years.
P Values vs. Confidence Intervals
Example:

          n     mean age    SD
Male      12    26.8        5.8
Female    12    29.6        6.2

The difference in age obtained from the sample is:
26.8 - 29.6 = -2.8 years
The standard error of the difference is 2.45 (calculation not shown)
The 95% confidence interval is:
-2.8 +/- 2(2.45) = -2.8 +/- 4.9 = -7.7 to 2.1 years
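
The interval arithmetic above can be reproduced with a short sketch. The function name is hypothetical. The default multiplier of 2 follows the slide, which rounds the 1.96 critical value; passing 1.96 gives a slightly narrower interval.

```python
def confidence_interval(estimate, se, multiplier=2.0):
    """Return (lower, upper) as estimate +/- multiplier * se."""
    margin = multiplier * se
    return estimate - margin, estimate + margin


if __name__ == "__main__":
    low, high = confidence_interval(-2.8, 2.45)  # multiplier of 2, as on the slide
    print(f"95% CI (multiplier 2):    {low:.1f} to {high:.1f} years")
    low, high = confidence_interval(-2.8, 2.45, multiplier=1.96)
    print(f"95% CI (multiplier 1.96): {low:.1f} to {high:.1f} years")
```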
P Values vs. Confidence Intervals
The 95% confidence interval is:
-2.8 +/- 2(2.45) = -2.8 +/- 4.9 = -7.7 to 2.1 years
This means we can be 95% confident that the true population mean difference in age is somewhere between males being 7.7 years younger and males being 2.1 years older than females
The 95% CI includes 0 years, so there is no statistically significant difference in age. In addition, we have information about the precision of our estimate of the difference, which cannot be obtained from p values alone.
Note: This is a relatively wide confidence interval because the sample size is small
P Values vs. Confidence Intervals
For the CABG/PTCA result:
The 95% CI is -0.6% to 1.7%
We can be 95% confident that the true difference in mortality between CABG and PTCA is between -0.6% and +1.7%
This confidence interval contains the value zero; therefore, we could have concluded that the mortality is not significantly different based on the confidence interval alone.
P Values vs. Confidence Intervals
For ratio variables, such as relative risk and odds ratio, the value one represents equality.
The null hypothesis is that the ratio is equal to one:
If the 95% CI includes the value one, the difference is not statistically significant at alpha = 0.05.
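
The rule for reading significance from a confidence interval is the same for differences and ratios; only the null value changes (zero for differences, one for ratios). A minimal sketch follows, with a hypothetical function name. The two difference intervals are the ones given in these slides; the relative risk interval is invented purely for illustration.

```python
def significant_from_ci(ci_low, ci_high, null_value):
    """True if the null value lies outside the confidence interval."""
    return not (ci_low <= null_value <= ci_high)


if __name__ == "__main__":
    examples = [
        ("Age difference (years)", -7.7, 2.1, 0.0),
        ("CABG/PTCA mortality difference (%)", -0.6, 1.7, 0.0),
        ("Relative risk (hypothetical CI)", 1.2, 2.5, 1.0),
    ]
    for label, low, high, null_value in examples:
        verdict = (
            "significant"
            if significant_from_ci(low, high, null_value)
            else "not significant"
        )
        print(f"{label}: 95% CI {low} to {high} -> {verdict} at alpha = 0.05")
```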