l08_ch 08.ppt
TRANSCRIPT
1 Copyright F. Michael Speed1
Learning Outcomes
• You will learn
– How to identify if a problem fits an ANOVA model
– How to set up an ANOVA model
– How to interpret the terms of the model
– What are the important hypotheses to be tested
– What is the difference between “clean” & “dirty” models
2
A Statistical Test About More Than Two Population Means:
An Analysis of Variance
3
REGRESSION – ANOVA
Predictors X1, X2, X3, …, Xp
• All scale: Regression
• All factors: ANOVA
4
Poll
• Is the following an ANOVA or a regression?
Creatinine clearance (Y) is an important measure of kidney function, but it is difficult to obtain in a clinical office setting because it requires 24-hour urine collection. To determine whether this measure can be predicted from some data that are easily available, a kidney specialist obtained the data that follow from 33 male subjects. The predictor variable is serum creatinine concentration (X1).
5
Poll
• Is the following an ANOVA or a regression?
The two most crucial factors that influence the strength of solders used in cementing computer chips into the mother board of the guidance system of an airplane are identified as the machine used to insert the solder and the operator of the machine. Only three qualified operators were available, and four solder machines were randomly selected from the many solder machines available at the company’s plants. Each operator made two solders on each of the four machines. The resulting strength determinations of the solders are given here.
6
ANOVA
No X’s
[Figure: response Y for three populations with means μ1, μ3, μ2]
7
New way to look at Data
[Figure: Y plotted by population, with means μ1, μ3, μ2]
8
FIGURE 8.5 Distributions of four populations that satisfy AOV assumptions
9
Applet
• http://bcs.whfreeman.com/ips4e/cat_010/applets/anova.html
10
A Treatment (Factor) 5 Levels
• A
• B
• C
• D
• E
• How many populations?
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
11
Multiple t tests
Null Hypotheses
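$\mu_1 = \mu_2$, $\mu_1 = \mu_4$, $\mu_2 = \mu_3$, $\mu_2 = \mu_5$, $\mu_3 = \mu_5$
$\mu_1 = \mu_3$, $\mu_1 = \mu_5$, $\mu_2 = \mu_4$, $\mu_3 = \mu_4$, $\mu_4 = \mu_5$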
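The count of pairwise null hypotheses can be checked by enumeration. The following is a minimal sketch, assuming only that the populations are labeled 1 through t; the function name `pairwise_nulls` is hypothetical and used for illustration.

```python
from itertools import combinations


def pairwise_nulls(t):
    """Return every pair (i, j) with i < j among populations 1..t."""
    return list(combinations(range(1, t + 1), 2))


pairs = pairwise_nulls(5)
for i, j in pairs:
    print(f"H0: mu{i} = mu{j}")
print(len(pairs), "pairwise t tests would be needed")
```

With 5 levels this lists t(t-1)/2 = 10 separate tests, which is the motivation for a single overall test.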
12
Analysis of Variance Procedures
1. Each of the five populations has a normal distribution. Use residuals to test this.
2. The variances of the five populations are equal; that is, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 = \sigma_5^2 = \sigma^2$.
3. The five sets of measurements are independent random samples from their respective populations.
13
The Null and Alternative Hypotheses:
$H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_t$ (i.e., the t population means are equal)
$H_a$: At least one of the t population means differs from the rest.
14
Table 8.6 An example of an AOV table for a completely randomized design
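The entries of an AOV table for a completely randomized design can be computed directly from the samples. The sketch below assumes the usual between/within decomposition (SSB, SSW) and uses made-up data; the function name `aov_table` and the sample values are hypothetical, not taken from the slides.

```python
def aov_table(groups):
    """Return sums of squares, degrees of freedom, mean squares and F
    for a one-way (completely randomized) design."""
    t = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-sample variation: group means around the grand mean.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-sample variation: observations around their own group mean.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_b = t - 1
    df_w = n_total - t
    msb = ssb / df_b
    msw = ssw / df_w
    return {"SSB": ssb, "SSW": ssw, "df_B": df_b, "df_W": df_w,
            "MSB": msb, "MSW": msw, "F": msb / msw}


# Hypothetical data: three treatments, three observations each.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
for name, value in aov_table(samples).items():
    print(f"{name}: {value:.3f}")
```

The F value is compared with an F distribution on (t - 1, n_T - t) degrees of freedom; looking up that critical value or p-value is left to a statistics package.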
15
Model
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$ (Dirty)
or
$y_{ij} = \mu_i + \varepsilon_{ij}$ (Clean)
16
Poll
• In the “dirty” model, the parameters are population parameters
• Yes - No
• In the “clean” model, the parameters are population parameters
• Yes - No
17
TABLE 8.11 Summary of some of the assumptions for a completely randomized design
Population   Population Mean   Population Variance   Sample Measurements
1            $\mu_1$           $\sigma^2$            $y_{11}, y_{12}, \ldots, y_{1n_1}$
2            $\mu_2$           $\sigma^2$            $y_{21}, y_{22}, \ldots, y_{2n_2}$
⋮
t            $\mu_t$           $\sigma^2$            $y_{t1}, y_{t2}, \ldots, y_{tn_t}$
18
Checking on the AOV Conditions
• Residuals analysis
• Levene’s test for equality of variances
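Both checks can be sketched in a few lines. The example below assumes the mean-centered form of Levene's test, in which the statistic is the one-way F ratio computed on the absolute residuals; the function names and the data are hypothetical.

```python
def residuals(groups):
    """Residuals y_ij - ybar_i for each group, used for residual analysis."""
    return [[y - sum(g) / len(g) for y in g] for g in groups]


def levene_statistic(groups):
    """Levene's W (mean-centered): one-way F on absolute residuals."""
    z = [[abs(r) for r in res] for res in residuals(groups)]
    t = len(z)
    n_total = sum(len(g) for g in z)
    grand_mean = sum(sum(g) for g in z) / n_total
    means = [sum(g) / len(g) for g in z]
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(z, means))
    ssw = sum((v - m) ** 2 for g, m in zip(z, means) for v in g)
    return (ssb / (t - 1)) / (ssw / (n_total - t))


# Hypothetical data with equal means but visibly different spreads.
samples = [[4, 5, 6], [2, 5, 8], [0, 5, 10]]
print("residuals:", residuals(samples))
print("Levene W:", round(levene_statistic(samples), 4))
```

A large W relative to an F distribution on (t - 1, n_T - t) degrees of freedom suggests unequal variances. Statistical packages often center on the median instead of the mean, so their values can differ from this sketch.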
19
Reporting Conclusions
1. Statement of objective for study
2. Description of study design and data collection procedures
3. Discussion of why the results from 11 of the 100 patients were not included in the data analysis
4. Numerical and graphical summaries of data sets
5. Description of all inference methodologies:
– AOV table and F-test
– t-based confidence intervals on means
– Verification that all necessary conditions for using inference techniques were satisfied
20
6. Discussion of results and conclusions
7. Interpretation of findings relative to previous studies
8. Recommendations for future studies
9. Listing of data sets
21
• This demonstration illustrates ...
Demonstration
22
• This exercise reinforces the concepts discussed previously.
Exercises
23
24
25
Multiple Comparisons
But Which Means Are Different?
Chapter 9
26
Elementary, Watson
27
Linear Contrasts - LMATRIX
$l = a_1\mu_1 + a_2\mu_2 + \cdots + a_t\mu_t = \sum_{i=1}^{t} a_i\mu_i$
$l = \mu_1 - \mu_2$
$l = \mu_1 - \frac{(\mu_2 + \mu_3)}{2}$
28
DEFINITION 9.1
$\hat{l} = a_1\bar{y}_{1.} + a_2\bar{y}_{2.} + \cdots + a_t\bar{y}_{t.} = \sum_i a_i\bar{y}_{i.}$
is called a linear contrast among the t sample means and can be used to estimate $l = \sum_i a_i\mu_i$. The $a_i$s are constants satisfying the constraint $\sum_i a_i = 0$.
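A linear contrast estimate is a weighted sum of the sample means whose weights add to zero. The following is a minimal sketch of that definition; the function name `contrast_estimate` and the sample means are hypothetical.

```python
def contrast_estimate(coeffs, sample_means, tol=1e-12):
    """Return l-hat = sum(a_i * ybar_i) after checking sum(a_i) = 0."""
    if len(coeffs) != len(sample_means):
        raise ValueError("need one coefficient per sample mean")
    if abs(sum(coeffs)) > tol:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(a * m for a, m in zip(coeffs, sample_means))


# Hypothetical sample means for t = 3 treatments.
ybar = [2.0, 3.0, 6.0]
# Compare treatment 1 with the average of treatments 2 and 3.
print(contrast_estimate([1, -0.5, -0.5], ybar))
# Compare treatment 1 with treatment 2.
print(contrast_estimate([1, -1, 0], ybar))
```

Coefficients that do not sum to zero are rejected, since such a combination is not a contrast.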
29
Which Error Rate Is Controlled?
• Individual comparisons
• Experimentwise error rate
• Bonferroni inequality
• Fisher’s protected LSD
• Tukey
• And on and on and on ….
30
Individual Comparison
$H_0: l = a_1\mu_1 + a_2\mu_2 + \cdots + a_t\mu_t = 0$
$H_a: l = a_1\mu_1 + a_2\mu_2 + \cdots + a_t\mu_t \neq 0$
T.S.: $F = \frac{SSC}{MS_{Error}}$
The $\alpha$ level is correct.
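The test statistic for a single contrast can be sketched as follows. This assumes the standard contrast sum of squares, SSC = l-hat squared divided by the sum of a_i²/n_i, with MS Error taken as the within-sample mean square from the AOV; the function name and data are hypothetical.

```python
def contrast_f(groups, coeffs):
    """F = SSC / MS_Error for one linear contrast among group means."""
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    t = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]

    # Estimated contrast and its sum of squares.
    l_hat = sum(a * m for a, m in zip(coeffs, means))
    ssc = l_hat ** 2 / sum(a ** 2 / len(g) for a, g in zip(coeffs, groups))

    # MS_Error: pooled within-sample variance on n_total - t df.
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_error = ssw / (n_total - t)
    return ssc / ms_error


# Hypothetical data: three treatments, three observations each.
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(contrast_f(samples, [1, -1, 0]))       # treatment 1 vs treatment 2
print(contrast_f(samples, [1, -0.5, -0.5]))  # treatment 1 vs average of 2 and 3
```

Each F value is referred to an F distribution on 1 and n_T - t degrees of freedom.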
31
2 Comparisons
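Suppose that we want to test 2 comparisons $L_1$ and $L_2$. Let $\alpha_L$ be P{rejecting the null hypothesis for a comparison | that null hypothesis is true} = .1.
$H_{1o}: L_1 = 0$
$H_{2o}: L_2 = 0$
If the two tests are independent, the probability of making a TYPE I error on at least one of the null hypotheses is
P{at least 1 error} = $1 - (1 - .1)^2 = 1 - .81 = .19$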
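The same arithmetic extends to any number of comparisons. The sketch below assumes the tests are independent, which is what the 1 - (1 - .1)² calculation relies on; the function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha_per_test, m):
    """P(at least one Type I error) across m independent tests."""
    return 1 - (1 - alpha_per_test) ** m


# Two comparisons at level .1 each, as in the slide.
print(round(familywise_error(0.1, 2), 4))

# The error rate grows quickly with the number of comparisons.
for m in (1, 2, 5, 10):
    print(m, round(familywise_error(0.05, m), 4))
```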
32
Table 9.4
33
Experimentwise Error Rate
$\alpha_E$
Bonferroni Inequality
If we want to test m hypotheses, then use $\alpha_L = \alpha_E / m$.
This will guarantee that the chance of a TYPE I error is at most $\alpha_E$.
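The Bonferroni adjustment is a one-line calculation. The sketch below divides the desired experimentwise rate by the number of hypotheses and then, as an illustration only, checks the resulting error rate for the special case of independent tests; the Bonferroni bound itself does not need independence. The function names are hypothetical.

```python
def bonferroni_level(alpha_experimentwise, m):
    """Per-comparison level that keeps the experimentwise rate at most
    alpha_experimentwise when m hypotheses are tested."""
    return alpha_experimentwise / m


def familywise_if_independent(alpha_per_test, m):
    """Experimentwise error rate in the special case of independent tests."""
    return 1 - (1 - alpha_per_test) ** m


alpha_e = 0.05
for m in (2, 5, 10):
    alpha_l = bonferroni_level(alpha_e, m)
    achieved = familywise_if_independent(alpha_l, m)
    print(m, alpha_l, round(achieved, 5), achieved <= alpha_e)
```

The achieved rate sits slightly below the target, which shows that the adjustment is conservative.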
34
Fisher’s Least Significant Difference Procedure
1. Perform an analysis of variance to test $H_0: \mu_1 = \mu_2 = \cdots = \mu_t$ against the alternative hypothesis that at least one of the means differs from the rest.
2. If there is insufficient evidence to reject $H_0$ using F = MSB/MSW, proceed no further.
3. If $H_0$ is rejected, define the least significant difference (LSD) to be the observed difference between two sample means necessary to declare the corresponding population means different.
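Step 3 can be sketched numerically. The slide defines the LSD only in words; the code assumes the usual formula LSD = t × sqrt(MSW × (1/n_i + 1/n_j)), where t is the upper α/2 critical value on n_T - t degrees of freedom. The critical value is passed in as an argument (2.447 is the tabled value for 6 degrees of freedom at α = .05), and the means, sample sizes and MSW are hypothetical.

```python
from itertools import combinations
from math import sqrt


def lsd(t_crit, msw, n_i, n_j):
    """Least significant difference for comparing means i and j."""
    return t_crit * sqrt(msw * (1 / n_i + 1 / n_j))


def lsd_pairs(means, sizes, msw, t_crit):
    """Pairs (1-based) whose sample means differ by at least the LSD.
    Only meaningful after the overall F test has rejected H0."""
    declared = []
    for i, j in combinations(range(len(means)), 2):
        if abs(means[i] - means[j]) >= lsd(t_crit, msw, sizes[i], sizes[j]):
            declared.append((i + 1, j + 1))
    return declared


# Hypothetical summary: 3 treatments, n = 3 each, MSW = 1.0 on 6 df.
means = [2.0, 3.0, 6.0]
sizes = [3, 3, 3]
print("LSD:", round(lsd(2.447, 1.0, 3, 3), 3))
print("pairs declared different:", lsd_pairs(means, sizes, 1.0, 2.447))
```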
35
Testing What You Want To Test
Is there a difference?
$H_0: \mu_1 = \mu_2$
$H_0: \mu_1 = \mu_3$
and
$H_0: \mu_1 = \mu_2,\ \mu_1 = \mu_3$
36
Testing What You Want To Test - Continued
$H_0: \mu_1 + 2\mu_2 = 2\mu_3 + \mu_4$
$H_0: \mu_1 - \mu_2 = \mu_3 - \mu_4$
$H_0: \mu_1 = (\mu_3 + \mu_4)/2$
37
Testing What You Want To Test - Continued
Rewrite as:
$\mu_1$  $\mu_2$  $\mu_3$  $\mu_4$
 1    2   -2   -1
 1   -1   -1    1
 2    0   -1   -1
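Each row of coefficients above can be checked and applied to a set of means. The sketch below verifies that every row sums to zero, as a contrast requires, and evaluates each row on made-up sample means; the function names and the means are hypothetical.

```python
# Coefficient rows from the slide, one per hypothesis, columns mu1..mu4.
contrasts = [
    [1, 2, -2, -1],
    [1, -1, -1, 1],
    [2, 0, -1, -1],
]


def row_sums(matrix):
    """Sum of coefficients in each row; a contrast needs a sum of zero."""
    return [sum(row) for row in matrix]


def apply_contrasts(matrix, means):
    """Estimated value of each contrast, sum(a_i * mean_i) per row."""
    return [sum(a * m for a, m in zip(row, means)) for row in matrix]


# Hypothetical sample means for the four treatments.
ybar = [10, 12, 11, 9]
print("row sums:", row_sums(contrasts))
print("estimates:", apply_contrasts(contrasts, ybar))
```

An estimate near zero is consistent with the corresponding null hypothesis; a formal decision still requires the F test for that contrast.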
38
• This demonstration illustrates ...
Demonstration
39
• This exercise reinforces the concepts discussed previously.
Exercises
40
41