
Post on 22-Dec-2015


1

Experimental design and statistical analyses of data

Lesson 5: Mixed models, nested anovas, split-plot designs

2

Randomized block design

• All treatments are allocated to the same experimental units

• Treatments are allocated at random

Layout: each column is a block (b = 3), and each block contains all treatments (a = 4) in random order.

Block 1   Block 2   Block 3
   B         C         B
   A         B         D
   D         A         A
   C         D         C

3

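Treatments (drugs) in columns, blocks (patients) in rows:

Patient   A             B             C             D             Average
1         $y_{A1}$      $y_{B1}$      $y_{C1}$      $y_{D1}$      $\bar{y}_1$
2         $y_{A2}$      $y_{B2}$      $y_{C2}$      $y_{D2}$      $\bar{y}_2$
3         $y_{A3}$      $y_{B3}$      $y_{C3}$      $y_{D3}$      $\bar{y}_3$
Average   $\bar{y}_A$   $\bar{y}_B$   $\bar{y}_C$   $\bar{y}_D$   $\bar{y}$

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$

$x_1, x_2$: blocks (patients); $x_3, x_4, x_5$: treatments (drugs)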

4

An alternative way of writing a GLM

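$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$

$y_{ij}$: response of patient j receiving drug i

$\mu$: overall mean

$\alpha_i$: effect of drug i, $\alpha_i = \mu_i - \mu$

$\beta_j$: effect of patient j, $\beta_j = \mu_j - \mu$

$\varepsilon_{ij}$: residual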

5

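Predicted value of y (the response of patient j receiving drug i):

$\hat{y}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j$

$\alpha_i = \mu_i - \mu$ is estimated by $\hat{\alpha}_i = \bar{y}_i - \bar{y}$

$\beta_j = \mu_j - \mu$ is estimated by $\hat{\beta}_j = \bar{y}_j - \bar{y}$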

6

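Treatments in columns, patients in rows:

Patient       A       B       C       D       $\bar{y}_j$
1             5.17    5.21    4.91    4.74    5.008
2             6.23    7.34    6.18    6.31    6.515
3             4.93    4.55    4.64    4.61    4.683
$\bar{y}_i$   5.443   5.700   5.243   5.220   5.402

$\hat{\mu} = \bar{y} = 5.402$

$\hat{\alpha}_A = \bar{y}_A - \bar{y} = 5.443 - 5.402 = 0.042$

$\hat{\alpha}_B = \bar{y}_B - \bar{y} = 5.700 - 5.402 = 0.298$

$\hat{\alpha}_C = \bar{y}_C - \bar{y} = 5.243 - 5.402 = -0.158$

$\hat{\alpha}_D = \bar{y}_D - \bar{y} = 5.220 - 5.402 = -0.182$

$\sum_i \hat{\alpha}_i = 0$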

7

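Treatments in columns, patients in rows:

Patient       A       B       C       D       $\bar{y}_j$
1             5.17    5.21    4.91    4.74    5.008
2             6.23    7.34    6.18    6.31    6.515
3             4.93    4.55    4.64    4.61    4.683
$\bar{y}_i$   5.443   5.700   5.243   5.220   5.402

$\hat{\mu} = \bar{y} = 5.402$

$\hat{\beta}_1 = \bar{y}_1 - \bar{y} = 5.008 - 5.402 = -0.394$

$\hat{\beta}_2 = \bar{y}_2 - \bar{y} = 6.515 - 5.402 = 1.113$

$\hat{\beta}_3 = \bar{y}_3 - \bar{y} = 4.683 - 5.402 = -0.719$

$\sum_j \hat{\beta}_j = 0$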

8

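$\hat{\mu} = \bar{y} = 5.402$

Effects of drugs:

$\hat{\alpha}_A = \bar{y}_A - \bar{y} = 5.443 - 5.402 = 0.042$

$\hat{\alpha}_B = \bar{y}_B - \bar{y} = 5.700 - 5.402 = 0.298$

$\hat{\alpha}_C = \bar{y}_C - \bar{y} = 5.243 - 5.402 = -0.158$

$\hat{\alpha}_D = \bar{y}_D - \bar{y} = 5.220 - 5.402 = -0.182$

Effects of patients:

$\hat{\beta}_1 = \bar{y}_1 - \bar{y} = 5.008 - 5.402 = -0.394$

$\hat{\beta}_2 = \bar{y}_2 - \bar{y} = 6.515 - 5.402 = 1.113$

$\hat{\beta}_3 = \bar{y}_3 - \bar{y} = 4.683 - 5.402 = -0.719$

Ex: Patient 2 receiving treatment C:

$\hat{y}_{C2} = \hat{\mu} + \hat{\alpha}_C + \hat{\beta}_2 = 5.402 - 0.158 + 1.113 = 6.357$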
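The estimates on this slide can be reproduced directly from the data table. The Python sketch below is only an illustration: the variable names and the helper `predict` are made up here and are not part of the lecture, which uses SAS. It computes the overall mean, the drug and patient effects, and the predicted value for patient 2 receiving drug C.

```python
# Data from the table: rows are patients 1-3, columns are drugs A-D.
drugs = ["A", "B", "C", "D"]
y = [
    [5.17, 5.21, 4.91, 4.74],  # patient 1
    [6.23, 7.34, 6.18, 6.31],  # patient 2
    [4.93, 4.55, 4.64, 4.61],  # patient 3
]
a, b = len(drugs), len(y)  # a = 4 treatments, b = 3 blocks

# Overall mean (estimate of mu).
mu_hat = sum(sum(row) for row in y) / (a * b)

# Drug effects: column mean minus overall mean.
alpha_hat = {d: sum(row[i] for row in y) / b - mu_hat for i, d in enumerate(drugs)}

# Patient effects: row mean minus overall mean.
beta_hat = [sum(row) / a - mu_hat for row in y]


def predict(drug, patient):
    """Predicted response for a patient (numbered from 1) receiving `drug`."""
    return mu_hat + alpha_hat[drug] + beta_hat[patient - 1]


print(f"mu_hat = {mu_hat:.3f}")
for d in drugs:
    print(f"alpha_hat[{d}] = {alpha_hat[d]:+.3f}")
for j, bj in enumerate(beta_hat, start=1):
    print(f"beta_hat[{j}] = {bj:+.3f}")
print(f"predicted y for patient 2, drug C = {predict('C', 2):.3f}")
```

Because each set of effects is a deviation from the overall mean, the drug effects and the patient effects each sum to zero, up to floating-point rounding.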

9

Consider the two questions:

• Are the three patients different?

• Are patients in general different?

• In the first case, ”patients” is considered as a fixed factor

• In the second case, ”patients” is considered as a random factor

10

”Patients” is a random effect:

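$y = \mu + \alpha_i + \beta_j + \varepsilon$

If patients are randomly chosen, $\beta_j$ will be a stochastic variable.

$\beta_j$ is assumed to be iid ND(0, $\sigma_b^2$), i.e. independently and identically normally distributed with zero mean and variance $\sigma_b^2$.

[Figure: normal probability curve for β centred on 0; x-axis: value of β, y-axis: probability]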

11

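$V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon) = \sigma_a^2 + \sigma_b^2 + \sigma^2$

Variances:

$\sigma_a^2$: variance due to drug (factor a)

$\sigma_b^2$: variance due to patient (factor b)

$\sigma^2$: residual variance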

12

Both factors are fixed

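$V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon)$

With both factors fixed, $\alpha_i$ and $\beta_j$ are constants, so $\sigma_a^2$ and $\sigma_b^2$ drop out and only the residual variance remains.

Variance of a single observation: $V(y) = \sigma^2$

Variance of an average: $V(\bar{y}) = V(y)/n = \sigma^2/(ab)$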

13

”Patients” is a random factor (mixed anova)

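$V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon)$

With drugs fixed and patients random, $\sigma_a^2$ drops out while $\sigma_b^2$ remains.

Variance of a single observation: $V(y) = \sigma_b^2 + \sigma^2$

Variance of an average: $V(\bar{y}) = \frac{\sigma_b^2}{b} + \frac{\sigma^2}{ab} = \frac{a\sigma_b^2 + \sigma^2}{ab}$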

14

Both factors are random

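$V(y) = V(\mu + \alpha_i + \beta_j + \varepsilon) = V(\mu) + V(\alpha_i) + V(\beta_j) + V(\varepsilon)$

Variance of a single observation: $V(y) = \sigma_a^2 + \sigma_b^2 + \sigma^2$

Variance of an average: $V(\bar{y}) = \frac{\sigma_a^2}{a} + \frac{\sigma_b^2}{b} + \frac{\sigma^2}{ab} = \frac{b\sigma_a^2 + a\sigma_b^2 + \sigma^2}{ab}$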

15

Source     df           MS    E[MS]   F
Drugs      a-1          MSa
Patients   b-1          MSb
Error      (a-1)(b-1)   MSe
Total      ab-1

E[MS]: Expected Mean Squares

16

Expected Mean Squares

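$E[MS_a] = b\sigma_a^2 + \sigma^2$, df = a-1

$E[MS_b] = a\sigma_b^2 + \sigma^2$, df = b-1

$E[MS_e] = \sigma^2$, df = (a-1)(b-1)

$H_0$: $\alpha_A = \alpha_B = \alpha_C = \alpha_D = 0$ → $\sigma_a^2 = 0$ → $F = MS_a/MS_e$

$H_0$: $\beta_1 = \beta_2 = \beta_3 = 0$ → $\sigma_b^2 = 0$ → $F = MS_b/MS_e$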

17

Source     df           MS    E[MS]                        F
Drugs      a-1          MSa   $b\sigma_a^2 + \sigma^2$     MSa/MSe
Patients   b-1          MSb   $a\sigma_b^2 + \sigma^2$     MSb/MSe
Error      (a-1)(b-1)   MSe   $\sigma^2$
Total      ab-1

18

Source     df   MS      E[MS]                        F
Drugs      3    0.149   $b\sigma_a^2 + \sigma^2$     MSa/MSe
Patients   2    3.824   $a\sigma_b^2 + \sigma^2$     MSb/MSe
Error      6    0.117   $\sigma^2$
Total      11

19

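If ”patients” is a random factor, $\sigma_b^2$ is estimated from

$E[MS_b] = a\sigma_b^2 + \sigma^2$ → $s_b^2 = \frac{MS_b - s^2}{a} = \frac{MS_b - MS_e}{a}$

$s_b^2 = \frac{3.824 - 0.117}{4} = 0.927$

Variance of a single observation: $V(y) = \sigma_b^2 + \sigma^2 = 0.927 + 0.117 = 1.044$

Variance of the average: $V(\bar{y}) = \frac{s_b^2}{b} + \frac{s^2}{ab} = \frac{0.927}{3} + \frac{0.117}{12} = 0.309 + 0.010 = 0.319$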
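The mean squares and the variance component above can be checked by hand from the same 12 observations. The following Python sketch is an illustration under the assumption of the balanced randomized block layout shown earlier (one observation per drug and patient); the variable names are invented for this example. It computes the sums of squares, the mean squares, and the estimate of the patient variance component.

```python
# Rows are patients 1-3, columns are drugs A-D (data from the example).
y = [
    [5.17, 5.21, 4.91, 4.74],
    [6.23, 7.34, 6.18, 6.31],
    [4.93, 4.55, 4.64, 4.61],
]
a, b = 4, 3  # a drugs, b patients

grand = sum(sum(row) for row in y) / (a * b)
drug_means = [sum(row[i] for row in y) / b for i in range(a)]
patient_means = [sum(row) / a for row in y]

# Sums of squares for a two-way layout without replication.
ss_a = b * sum((m - grand) ** 2 for m in drug_means)
ss_b = a * sum((m - grand) ** 2 for m in patient_means)
ss_total = sum((v - grand) ** 2 for row in y for v in row)
ss_e = ss_total - ss_a - ss_b

# Mean squares.
ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_e = ss_e / ((a - 1) * (b - 1))

# Patients random: E[MSb] = a*sigma_b^2 + sigma^2, solved for sigma_b^2.
s2_b = (ms_b - ms_e) / a
var_single = s2_b + ms_e              # variance of a single observation
var_mean = s2_b / b + ms_e / (a * b)  # variance of the overall average

print(f"MSa = {ms_a:.3f}, MSb = {ms_b:.3f}, MSe = {ms_e:.3f}")
print(f"F(drugs) = {ms_a / ms_e:.2f}, F(patients) = {ms_b / ms_e:.2f}")
print(f"s_b^2 = {s2_b:.3f}, V(y) = {var_single:.3f}, V(ybar) = {var_mean:.3f}")
```

The subtraction `ss_total - ss_a - ss_b` is valid only because the design is balanced; with missing cells the sums of squares would no longer be additive in this simple way.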

20

How to do it with SAS

21

DATA eks5_1;

INPUT pat $ treat $ y; /* reads the data */

CARDS; /* the data follow; they can also be read from a file */

1 A 5.17

2 A 6.23

3 A 4.93

1 B 5.21

2 B 7.34

3 B 4.55

1 C 4.91

2 C 6.18

3 C 4.64

1 D 4.74

2 D 6.31

3 D 4.61

;

PROC GLM; /* procedure General Linear Models */

TITLE 'Eksempel 5.1'; /* include if a title is wanted */

CLASS pat treat; /* pat and treat are class (qualitative) variables */

MODEL y = pat treat;

RANDOM pat; /* patients are a random factor */

RUN;

22

Eksempel 5.1                                   8
                  13:18 Monday, November 5, 2001

General Linear Models Procedure

Dependent Variable: Y

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              5       8.09475000    1.61895000     13.80   0.0031
Error              6       0.70401667    0.11733611
Corrected Total   11       8.79876667

R-Square   C.V.       Root MSE     Y Mean
0.919987   6.341443   0.34254359   5.40166667

Source   DF   Type I SS    Mean Square   F Value   Pr > F
PAT       2   7.64831667   3.82415833      32.59   0.0006
TREAT     3   0.44643333   0.14881111       1.27   0.3666

Source   DF   Type III SS   Mean Square   F Value   Pr > F
PAT       2   7.64831667    3.82415833      32.59   0.0006
TREAT     3   0.44643333    0.14881111       1.27   0.3666

MSb is the mean square for PAT, MSa the mean square for TREAT, and MSe the mean square for Error.

23

Eksempel 5.1 18

09:00 Friday, November 16, 2001

General Linear Models Procedure

Source Type III Expected Mean Square

PAT Var(Error) + 4 Var(PAT)

TREAT Var(Error) + Q(TREAT)

24

Nested designs

25

[Figure: Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under each drug; Replicate: 1, 2 under each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

Patient j is the same for all drugs

[Figure: the same tree of Factor A (drug), Factor B (patient) and Replicate, but with different patients under each drug]

Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$

Patient j is not the same for all drugs

Patients are said to be nested within drugs

Replicates can also be regarded as nested within drugs and patients

26

Rules for finding the EMS (after Dunn and Clark)

1. For each effect, write down every possible variance component containing every letter of the effect name. For example, in a two-way design with r replicates per cell, the EMS for factor A includes $\sigma_a^2$, $\sigma_{ab}^2$ and $\sigma_{(ab)e}^2$, but not $\sigma_b^2$.

2. For any nested factor, add in parentheses to the effect name the name(s) of the factor(s) within which it is nested, e.g. if B is nested in A, $\sigma_{(a)b}^2$ is the variance of $\beta_{(i)j}$.

3. For the coefficient of each variance component, use all letters not in the subscripts of the variance component.

4. For each variance component, look at any subscripts outside parentheses that are not in the effect name; if any of these letters corresponds to a fixed effect, omit that variance component.

27

Two-way anova (A and B fixed)

[Figure: Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under each drug; Replicate: 1, 2 under each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$(\alpha\beta)_{ij}$: interaction between drug and patient

$\varepsilon_{(ij)k}$: residual of the kth replicate nested within drug i and patient j

28

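Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i \sum_j (\alpha\beta)_{ij} = 0$

(1) For each effect, write down every possible variance component containing every letter of the effect name. For example, in a two-way design with r replicates per cell, the EMS for factor A includes $\sigma_a^2$, $\sigma_{ab}^2$ and $\sigma_{(ab)e}^2$, but not $\sigma_b^2$.

Factor A: $\sigma_a^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor B: $\sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor AB: $\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Residual: $\sigma_{(ab)e}^2$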

29

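Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i \sum_j (\alpha\beta)_{ij} = 0$

(2) For any nested factor, add in parentheses to the effect name the name(s) of the factor(s) within which it is nested, e.g. if B is nested in A, $\sigma_{(a)b}^2$ is the variance of $\beta_{(i)j}$.

Factor A: $\sigma_a^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor B: $\sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor AB: $\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Residual: $\sigma_{(ab)e}^2$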

30

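Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i \sum_j (\alpha\beta)_{ij} = 0$

(3) For the coefficient of each variance component, use all letters not in the subscripts of the variance component.

Factor A: $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor B: $ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor AB: $r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Residual: $\sigma_{(ab)e}^2$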

31

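Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i \sum_j (\alpha\beta)_{ij} = 0$

(4) For each variance component, look at any subscripts outside parentheses that are not in the effect name; if any of these letters corresponds to a fixed effect, omit that variance component.

Factor A: $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor B: $ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Factor AB: $r\sigma_{ab}^2 + \sigma_{(ab)e}^2$

Residual: $\sigma_{(ab)e}^2$

With A and B both fixed, rule 4 removes $r\sigma_{ab}^2$ from the EMS of factor A (the extra subscript b is a fixed effect) and from the EMS of factor B (the extra subscript a is a fixed effect).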

32

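Two-way anova (A and B fixed)

[Figure: Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under each drug; Replicate: 1, 2 under each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, $\sum_i \sum_j (\alpha\beta)_{ij} = 0$

$V(y) = \sigma_a^2 + \sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)r}^2$, of which only the residual variance $\sigma_{(ab)r}^2 = \sigma^2$ remains when A and B are fixed

$V(\bar{y}) = \sigma^2/(abr)$

Source   df           MS     E[MS]                              F
A        a-1          MSa    $br\sigma_a^2 + \sigma^2$          MSa/MSe
B        b-1          MSb    $ar\sigma_b^2 + \sigma^2$          MSb/MSe
AB       (a-1)(b-1)   MSab   $r\sigma_{ab}^2 + \sigma^2$        MSab/MSe
Error    ab(r-1)      MSe    $\sigma^2$

The term $r\sigma_{ab}^2$ is omitted from E[MSa] and E[MSb] by rule 4 because both factors are fixed, which is why MSe is the denominator in all three F tests.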

33

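Two-way anova (A fixed, B random)

[Figure: Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 under each drug; Replicate: 1, 2 under each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$, $\sum_i (\alpha\beta)_{ij} = 0$

$\beta_j$ is ND(0, $\sigma_b^2$); $(\alpha\beta)_{ij}$ is ND(0, $\sigma_{ab}^2(1 - 1/a)$)

$V(y) = \sigma_a^2 + \sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)r}^2$, where $\sigma_a^2$ drops out because A is fixed

$V(\bar{y}) = \frac{\sigma_b^2}{b} + \frac{\sigma^2}{abr} = \frac{ar\sigma_b^2 + \sigma^2}{abr}$

Source   df           MS     E[MS]                                              F
A        a-1          MSa    $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma^2$         MSa/MSab
B        b-1          MSb    $ar\sigma_b^2 + \sigma^2$                          MSb/MSe
AB       (a-1)(b-1)   MSab   $r\sigma_{ab}^2 + \sigma^2$                        MSab/MSe
Error    ab(r-1)      MSe    $\sigma^2$

NB! Factor A is tested against MSab, not MSe. The term $r\sigma_{ab}^2$ is omitted from E[MSb] by rule 4 because A is fixed, so factor B is still tested against MSe.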

34

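Two-way anova (A and B random)

[Figure: Factor A: A, B, C, D; Factor B: 1, 2, 3 under each level of A; Replicate: 1, 2 under each level of B]

Model: $y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\alpha_i$ is ND(0, $\sigma_a^2$); $\beta_j$ is ND(0, $\sigma_b^2$); $(\alpha\beta)_{ij}$ is ND(0, $\sigma_{ab}^2$)

$V(y) = \sigma_a^2 + \sigma_b^2 + \sigma_{ab}^2 + \sigma_{(ab)r}^2$

$V(\bar{y}) = \frac{br\sigma_a^2 + ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma^2}{abr}$

Source   df           MS     E[MS]                                              F
A        a-1          MSa    $br\sigma_a^2 + r\sigma_{ab}^2 + \sigma^2$         MSa/MSab
B        b-1          MSb    $ar\sigma_b^2 + r\sigma_{ab}^2 + \sigma^2$         MSb/MSab
AB       (a-1)(b-1)   MSab   $r\sigma_{ab}^2 + \sigma^2$                        MSab/MSe
Error    ab(r-1)      MSe    $\sigma^2$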

35

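Nested anova (A fixed, B random)

[Figure: Factor A (drug): A, B, C, D; Factor B (patient): 1, 2, 3 nested within each drug; Replicate: 1, 2 nested within each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$

$\sum_i \alpha_i = 0$; $\beta_{(i)j}$ is ND(0, $\sigma_{(a)b}^2$)

$V(y) = \sigma_a^2 + \sigma_{(a)b}^2 + \sigma_{(ab)r}^2$, where $\sigma_a^2$ drops out because A is fixed

$V(\bar{y}) = \frac{\sigma_{(a)b}^2}{ab} + \frac{\sigma^2}{abr} = \frac{r\sigma_{(a)b}^2 + \sigma^2}{abr}$

Source   df        MS       E[MS]                                               F
A        a-1       MSa      $br\sigma_a^2 + r\sigma_{(a)b}^2 + \sigma^2$        MSa/MS(a)b
B(A)     a(b-1)    MS(a)b   $r\sigma_{(a)b}^2 + \sigma^2$                       MS(a)b/MSe
Error    ab(r-1)   MSe      $\sigma^2$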

36

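Nested anova (A and B random)

[Figure: Factor A (doctor): A, B, C, D; Factor B (patient): 1, 2, 3 nested within each doctor; Replicate: 1, 2 nested within each patient]

Model: $y_{ijk} = \mu + \alpha_i + \beta_{(i)j} + \varepsilon_{(ij)k}$

$\alpha_i$ is ND(0, $\sigma_a^2$); $\beta_{(i)j}$ is ND(0, $\sigma_{(a)b}^2$)

$V(y) = \sigma_a^2 + \sigma_{(a)b}^2 + \sigma_{(ab)r}^2$

$V(\bar{y}) = \frac{\sigma_a^2}{a} + \frac{\sigma_{(a)b}^2}{ab} + \frac{\sigma^2}{abr} = \frac{br\sigma_a^2 + r\sigma_{(a)b}^2 + \sigma^2}{abr}$

Source   df        MS       E[MS]                                               F
A        a-1       MSa      $br\sigma_a^2 + r\sigma_{(a)b}^2 + \sigma^2$        MSa/MS(a)b
B(A)     a(b-1)    MS(a)b   $r\sigma_{(a)b}^2 + \sigma^2$                       MS(a)b/MSe
Error    ab(r-1)   MSe      $\sigma^2$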

37

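Four level nested anova

[Figure: Treatment (a = 3): 40%, 20%, 0%; Tree (b = 2): 1, 2 nested within each treatment; Leaf (c = 3): 1, 2, 3 nested within each tree; Replicate (r = 2): 1, 2 nested within each leaf]

Model: $y_{ijkr} = \mu + \alpha_i + \beta_{(i)j} + \gamma_{(ij)k} + \varepsilon_{(ijk)r}$

$\sum_i \alpha_i = 0$; $\beta_{(i)j}$ is ND(0, $\sigma_{(a)b}^2$); $\gamma_{(ij)k}$ is ND(0, $\sigma_{(ab)c}^2$)

$V(y) = \sigma_a^2 + \sigma_{(a)b}^2 + \sigma_{(ab)c}^2 + \sigma_{(abc)r}^2$

$V(\bar{y}) = \frac{\sigma_{(a)b}^2}{ab} + \frac{\sigma_{(ab)c}^2}{abc} + \frac{\sigma^2}{abcr} = \frac{cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2}{abcr}$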

38

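Source       df         MS        E[MS]                                                                    F
Treatments   a-1        MSa       $bcr\sigma_a^2 + cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2$       MSa/MS(a)b
Trees        a(b-1)     MS(a)b    $cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2$                       MS(a)b/MS(ab)c
Leaves       ab(c-1)    MS(ab)c   $r\sigma_{(ab)c}^2 + \sigma^2$                                           MS(ab)c/MSe
Error        abc(r-1)   MSe       $\sigma^2$

$MS_{(ab)c} = r s_{(ab)c}^2 + s^2$ → $s_{(ab)c}^2 = \frac{MS_{(ab)c} - MS_e}{r}$

$MS_{(a)b} = cr\, s_{(a)b}^2 + r s_{(ab)c}^2 + s^2 = cr\, s_{(a)b}^2 + MS_{(ab)c}$ → $s_{(a)b}^2 = \frac{MS_{(a)b} - MS_{(ab)c}}{cr}$

$MS_a = bcr\, s_a^2 + cr\, s_{(a)b}^2 + r s_{(ab)c}^2 + s^2 = bcr\, s_a^2 + MS_{(a)b}$ → $s_a^2 = \frac{MS_a - MS_{(a)b}}{bcr}$

$V(y) = \sigma_a^2 + \sigma_{(a)b}^2 + \sigma_{(ab)c}^2 + \sigma_{(abc)r}^2$

$\hat{V}(y) = s_a^2 + s_{(a)b}^2 + s_{(ab)c}^2 + s^2$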

39

How to do it with SAS

40

DATA nested; /* Nested anova (example 6-4 in the lecture notes) */
INFILE 'H:\lin-mod\eks6x.prn' firstobs =2 ;
INPUT treat $ tree $ leaf $ disc $ Nitro ;

PROC GLM;
CLASS treat tree leaf disc;
MODEL Nitro = treat tree(treat) leaf(tree treat);
/* treatment is a fixed factor, while trees and leaves are random */
RANDOM tree(treat) leaf(tree treat);
/* gives the expected mean squares */
RUN;

41

General Linear Models Procedure

Dependent Variable: NITRO

Source DF Sum of Squares Mean Square F Value Pr > F

Model 17 134.04000000 7.88470588 8.00 0.0001

Error 18 17.75000000 0.98611111

Corrected Total 35 151.79000000

R-Square C.V. Root MSE NITRO Mean

0.883062 3.271932 0.99303127 30.35000000

Source DF Type I SS Mean Square F Value Pr > F

TREAT 2 71.78000000 35.89000000 36.40 0.0001

TREE(TREAT) 3 36.04666667 12.01555556 12.18 0.0001

LEAF(TREAT*TREE) 12 26.21333333 2.18444444 2.22 0.0618

Source DF Type III SS Mean Square F Value Pr > F

TREAT 2 71.78000000 35.89000000 36.40 0.0001

TREE(TREAT) 3 36.04666667 12.01555556 12.18 0.0001

LEAF(TREAT*TREE) 12 26.21333333 2.18444444 2.22 0.0618

NB! These F values and P-values all use MSe as the error term, which is wrong for TREAT and TREE(TREAT)!

42

DATA nested; /* Nested anova (example 6-4 in the lecture notes) */
INFILE 'H:\lin-mod\eks6x.prn' firstobs =2 ;
INPUT treat $ tree $ leaf $ disc $ Nitro ;

PROC GLM;
CLASS treat tree leaf disc;
MODEL Nitro = treat tree(treat) leaf(tree treat);
/* treatment is a fixed factor, while trees and leaves are random */
RANDOM tree(treat) leaf(tree treat);
/* gives the expected mean squares */
RUN;

43

General Linear Models Procedure

Source Type III Expected Mean Square

TREAT Var(Error) + 2 Var(LEAF(TREAT*TREE)) + 6 Var(TREE(TREAT))

+ Q(TREAT)

TREE(TREAT) Var(Error) + 2 Var(LEAF(TREAT*TREE)) + 6 Var(TREE(TREAT))

LEAF(TREAT*TREE) Var(Error) + 2 Var(LEAF(TREAT*TREE))

44

PROC GLM;

CLASS treat tree leaf disc;

MODEL Nitro = treat tree(treat) leaf(tree treat);

/* treatment is a fixed factor, while trees and leaves are random */

RANDOM tree(treat) leaf(tree treat);

/* gives the expected mean squares */

TEST h=treat e= tree(treat);

/* tests for the difference between treatments with MS for tree(treat) as denominator */

TEST h= tree(treat) e=leaf(tree treat);

/* tests for the difference between trees with MS for leaf(tree treat) as denominator*/

45

General Linear Models Procedure

Dependent Variable: NITRO

Tests of Hypotheses using the Type III MS for TREE(TREAT) as an error term

Source DF Type III SS Mean Square F Value Pr > F

TREAT 2 71.78000000 35.89000000 2.99 0.1933

Tests of Hypotheses using the Type III MS for LEAF(TREAT*TREE) as an error term

Source DF Type III SS Mean Square F Value Pr > F

TREE(TREAT) 3 36.04666667 12.01555556 5.50 0.0130
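The F values reported by the TEST statements follow from dividing each mean square by the mean square one level further down in the nesting hierarchy. The Python sketch below is an illustration only (the dictionary names are invented here); it takes the mean squares printed in the SAS output for the nitrogen example and compares the F ratios with the correct error terms against those obtained when MSe is used throughout.

```python
# Mean squares copied from the SAS output for the nested nitrogen example.
ms = {
    "treat": 35.89000000,
    "tree(treat)": 12.01555556,
    "leaf(treat*tree)": 2.18444444,
    "error": 0.98611111,
}

# Each effect is tested against the effect nested immediately below it.
error_term = {
    "treat": "tree(treat)",
    "tree(treat)": "leaf(treat*tree)",
    "leaf(treat*tree)": "error",
}

f_correct = {effect: ms[effect] / ms[err] for effect, err in error_term.items()}
f_naive = {effect: ms[effect] / ms["error"] for effect in error_term}

for effect in error_term:
    print(f"{effect:18s} F (correct error term) = {f_correct[effect]:6.2f}   "
          f"F (MSe as error term) = {f_naive[effect]:6.2f}")
```

Using MSe for the treatment test inflates F from about 2.99 to about 36.40, which is the difference between the default GLM table and the output of the TEST statements.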

46

PROC GLM;

CLASS treat tree leaf disc;

MODEL Nitro = treat tree(treat) leaf(tree treat);

/* treatment is a fixed factor, while trees and leaves are random */

RANDOM tree(treat) leaf(tree treat);

/* gives the expected mean squares */

TEST h=treat e= tree(treat);

/* tests for the difference between treatments with MS for tree(treat) as denominator */

TEST h= tree(treat) e=leaf(tree treat);

/* tests for the difference between trees with MS for leaf(tree treat) as denominator*/

MEANS treat / Tukey Dunnett('Control') e= tree(treat) cldiff;

/* finds significant differences among all treatments (Tukey) and between each treatment and the control (Dunnett) */

RUN;

47

Tukey's Studentized Range (HSD) Test for variable: NITRO

NOTE: This test controls the type I experimentwise error rate.

Alpha= 0.05 Confidence= 0.95 df= 3 MSE= 12.01556

Critical Value of Studentized Range= 5.910

Minimum Significant Difference= 5.9134

Comparisons significant at the 0.05 level are indicated by '***'.

Simultaneous Simultaneous

Lower Difference Upper

TREAT Confidence Between Confidence

Comparison Limit Means Limit

20% - 40% -3.663 2.250 8.163

20% - Control -2.513 3.400 9.313

40% - 20% -8.163 -2.250 3.663

40% - Control -4.763 1.150 7.063

Control - 20% -9.313 -3.400 2.513

Control - 40% -7.063 -1.150 4.763

48

Dunnett's T tests for variable: NITRO

NOTE: This tests controls the type I experimentwise error for

comparisons of all treatments against a control.

Alpha= 0.05 Confidence= 0.95 df= 3 MSE= 12.01556

Critical Value of Dunnett's T= 3.866

Minimum Significant Difference= 5.4714

Comparisons significant at the 0.05 level are indicated by '***'.

Simultaneous Simultaneous

Lower Difference Upper

TREAT Confidence Between Confidence

Comparison Limit Means Limit

20% - Control -2.071 3.400 8.871

40% - Control -4.321 1.150 6.621

49

PROC NESTED;

CLASS treat tree leaf;

VAR Nitro;

RUN;

50

Coefficients of Expected Mean Squares

Source TREAT TREE LEAF ERROR

TREAT 12 6 2 1

TREE 0 6 2 1

LEAF 0 0 2 1

ERROR 0 0 0 1

Source       df         MS        E[MS]                                                                    F
Treatments   a-1        MSa       $bcr\sigma_a^2 + cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2$       MSa/MS(a)b
Trees        a(b-1)     MS(a)b    $cr\sigma_{(a)b}^2 + r\sigma_{(ab)c}^2 + \sigma^2$                       MS(a)b/MS(ab)c
Leaves       ab(c-1)    MS(ab)c   $r\sigma_{(ab)c}^2 + \sigma^2$                                           MS(ab)c/MSe
Error        abc(r-1)   MSe       $\sigma^2$

51

Nested Random Effects Analysis of Variance for Variable NITRO

Degrees

Variance of Sum of Error

Source Freedom Squares F Value Pr > F Term

TOTAL 35 151.790000

TREAT 2 71.780000 2.987 0.1933 TREE

TREE 3 36.046667 5.501 0.0130 LEAF

LEAF 12 26.213333 2.215 0.0618 ERROR

ERROR 18 17.750000

Variance Variance Percent

Source Mean Square Component of Total

TOTAL 4.336857 5.213333 100.0000

TREAT 35.890000 1.989537 38.1625

TREE 12.015556 1.638519 31.4294

LEAF 2.184444 0.599167 11.4930

ERROR 0.986111 0.986111 18.9152

Mean 30.35000000

Standard error of mean 0.99847105

$s_{(ab)c}^2 = \frac{MS_{(ab)c} - MS_e}{r}$

$s_{(a)b}^2 = \frac{MS_{(a)b} - MS_{(ab)c}}{cr}$

$s_a^2 = \frac{MS_a - MS_{(a)b}}{bcr}$
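The variance components in the PROC NESTED output can be recovered from the mean squares with the three formulas above. The following Python sketch is an illustration with invented variable names; it uses the mean squares from the SAS listing and the design sizes of the example (b = 2 trees, c = 3 leaves, r = 2 replicates), and like PROC NESTED it treats every level, including treatment, as random.

```python
import math

# Design sizes from the four-level nested example.
a, b, c, r = 3, 2, 3, 2

# Mean squares from the PROC NESTED output.
ms_treat, ms_tree, ms_leaf, ms_error = 35.890000, 12.015556, 2.184444, 0.986111

# Solve the expected mean squares from the bottom level upwards.
s2_error = ms_error
s2_leaf = (ms_leaf - ms_error) / r
s2_tree = (ms_tree - ms_leaf) / (c * r)
s2_treat = (ms_treat - ms_tree) / (b * c * r)

components = {"treat": s2_treat, "tree": s2_tree, "leaf": s2_leaf, "error": s2_error}
total = sum(components.values())

for name, value in components.items():
    print(f"{name:6s} component = {value:.6f}  ({100 * value / total:.4f} % of total)")
print(f"total variance = {total:.6f}")

# With all levels random, the variance of the overall mean is MS_treat / (abcr).
se_mean = math.sqrt(ms_treat / (a * b * c * r))
print(f"standard error of mean = {se_mean:.6f}")
```

The same back-substitution works for any balanced, fully nested design: each component is the difference between two adjacent mean squares divided by the number of observations per unit at that level.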

52

The problem of pseudoreplication

53

Two-way anova (A fixed, B random)

[Figure: Factor A (drug): A, B, C; Factor B (patient): 1, 2, 3 under each drug; Replicate: 1, 2 under each patient; 18 measurements in total]

If we want to increase the power of the analysis, we may e.g. double the number of measurements

But be careful about what you do!

54

Design 1

[Figure: drugs A, B, C; patients 1, 2, 3 under each drug; replicates 1, 2, 3, 4 under each patient]

3 experimental units/treatment; the extra replicates within each patient are pseudoreplicates

Design 2

[Figure: drugs A, B, C; patients 1 to 6 under each drug; replicates 1, 2 under each patient]

6 experimental units/treatment

Both experiments have 36 measurements

Design 2 is best because it uses 6 experimental units/treatment
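Why Design 2 is better can be seen from the variance of a treatment mean. The Python sketch below is an illustration under stated assumptions: it uses the nested model with random patients from the earlier slide, so that the variance of a treatment mean is the patient component divided by the number of patients plus the residual variance divided by the number of measurements per treatment, and the two variance components are hypothetical numbers chosen only for the comparison. The function name is also invented.

```python
def var_treatment_mean(sigma2_patient, sigma2_resid, n_patients, n_reps):
    """Variance of one treatment mean when patients are random and nested in treatments."""
    return sigma2_patient / n_patients + sigma2_resid / (n_patients * n_reps)


# Hypothetical variance components (assumed values, not from the lecture).
sigma2_patient = 1.0
sigma2_resid = 0.5

# Both designs use 12 measurements per treatment (36 in total).
design1 = var_treatment_mean(sigma2_patient, sigma2_resid, n_patients=3, n_reps=4)
design2 = var_treatment_mean(sigma2_patient, sigma2_resid, n_patients=6, n_reps=2)

print(f"Design 1 (3 patients x 4 replicates): V(treatment mean) = {design1:.4f}")
print(f"Design 2 (6 patients x 2 replicates): V(treatment mean) = {design2:.4f}")
```

The residual term is identical in the two designs because the number of measurements per treatment is the same; only additional patients reduce the contribution of the patient variance, and they also raise the degrees of freedom of the error term used to test treatments.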

55

Four level nested anova

[Figure: Treatment (a = 3): 40%, 20%, 0%; Tree (b = 2): 1, 2 nested within each treatment; Leaf (c = 3): 1, 2, 3 nested within each tree; Replicate (r = 2): 1, 2 nested within each leaf]

Trees are the experimental units (2 replicates/treatment)

Leaves and replicates within a tree are pseudoreplicates

56

Split-plot designs

• Three types of fertilizers

• Two types of soil treatment

• Interactions between fertilizers and soil treatment

57

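[Figure: four blocks (Block 1 to Block 4), each divided into 2 whole-plots; the soil treatments A1 and A2 are assigned to the whole-plots within each block]

2 whole-plots within each block

Soil treatments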

58

[Figure: the same four blocks with soil treatments A1 and A2; each whole-plot is divided into 3 sub-plots that receive the fertilizer treatments]

Fertilizer treatments

3 sub-plots within each whole-plot

59

Analysis of whole-plots

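Factor               df               MS     E[MS]                             F
Soil treatment (A)   a-1 = 1          MSa    $b\sigma_a^2 + \sigma_{ab}^2$     MSa/MSab
Block (B)            b-1 = 3          MSb    $a\sigma_b^2 + \sigma_{ab}^2$     MSb/MSab
Soil*Block (AB)      (a-1)(b-1) = 3   MSab   $\sigma_{ab}^2$
Error                0
Total                ab-1 = 7

$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{(ij)k}$

$\alpha_i$: effect of soil treatment, $\sum_i \alpha_i = 0$

$\beta_j$: effect of block, $\beta_j$ is ND(0, $\sigma_b^2$)

$(\alpha\beta)_{ij}$: interaction between soil and block

Interaction term serves as error term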

60

Analysis of sub-plots

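Factor                   df                    MS      E[MS]                                  F
Whole plots              ab-1 = 7
Fertilizer (C)           c-1 = 2               MSc     $ab\sigma_c^2 + \sigma_{abc}^2$        MSc/MSabc
Soil*Fertilizer (AC)     (a-1)(c-1) = 2        MSac    $b\sigma_{ac}^2 + \sigma_{abc}^2$      MSac/MSabc
Block*Fert. (BC)         (b-1)(c-1) = 6        MSbc    $a\sigma_{bc}^2 + \sigma_{abc}^2$      MSbc/MSabc
Soil*Block*Fert. (ABC)   (a-1)(b-1)(c-1) = 6   MSabc   $\sigma_{abc}^2$
Error                    0
Total                    abc-1 = 23

$\hat{y}_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}$

$\gamma_k$: effect of fertilizer, $\sum_k \gamma_k = 0$

$(\alpha\gamma)_{ik}$: interaction between soil treatment and fertilizer, $\sum_i \sum_k (\alpha\gamma)_{ik} = 0$

$(\beta\gamma)_{jk}$: interaction between block and fertilizer, $(\beta\gamma)_{jk}$ is ND(0, $\sigma_{bc}^2(1 - 1/c)$)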

61

Analysis of sub-plots

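Factor                   df                    MS      E[MS]                               F
Whole plots              ab-1 = 7
Fertilizer (C)           c-1 = 2               MSc     $ab\sigma_c^2 + \sigma^2$           MSc/MSe
Soil*Fertilizer (AC)     (a-1)(c-1) = 2        MSac    $b\sigma_{ac}^2 + \sigma^2$         MSac/MSe
Block*Fert. (BC)         (b-1)(c-1) = 6        MSbc    $a\sigma_{bc}^2 + \sigma^2$         MSbc/MSe
Soil*Block*Fert. (ABC)   (a-1)(b-1)(c-1) = 6   MSabc   $\sigma^2$
Error                    0
Total                    abc-1 = 23

The three-way interaction mean square MSabc serves as the error mean square MSe.

$\hat{y}_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}$

$\sum_k \gamma_k = 0$; $\sum_i \sum_k (\alpha\gamma)_{ik} = 0$; $(\beta\gamma)_{jk}$ is ND(0, $\sigma_{bc}^2(1 - 1/c)$)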

62

Analysis of sub-plots

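Factor                   df                    MS      E[MS]                               F
Whole plots              ab-1 = 7
Fertilizer (C)           c-1 = 2               MSc     $ab\sigma_c^2 + \sigma^2$           MSc/MSe
Soil*Fertilizer (AC)     (a-1)(c-1) = 2        MSac    $b\sigma_{ac}^2 + \sigma^2$         MSac/MSe
Block*Fert. (BC)         (b-1)(c-1) = 6        MSbc
Soil*Block*Fert. (ABC)   (a-1)(b-1)(c-1) = 6   MSabc   $\sigma^2$
Error                    0
Total                    abc-1 = 23

Block*Fert. no longer has its own expected mean square or F test here.

$\hat{y}_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_k + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk}$

$\sum_k \gamma_k = 0$; $\sum_i \sum_k (\alpha\gamma)_{ik} = 0$; $(\beta\gamma)_{jk}$ is ND(0, $\sigma_{bc}^2(1 - 1/c)$)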

63

How to do it with SAS

64

DATA SplitPlt;

/* Example 6-8 in the lecture notes */

/* block = block effect (random factor) */

/* soil = effect of soil treatment (whole-plot effect) */

/* fert = effect of fertilizer (subplot effect) */

/* yield = dependent variable */

INFILE 'h:\lin-mod\eks6-8.prn';

INPUT soil $ block $ fert $ yield;

PROC GLM;

TITLE 'Split plot - full model';

CLASS block soil fert;

MODEL yield= block soil block*soil fert soil*fert block*fert ;

RANDOM block; /* declare block as a random effect */

TEST h = soil e = block*soil; /* tests effect of wholeplot */

TEST h = block e = block*soil; /* tests effect of blocks */

RUN;

65

Split plot - full model

The GLM Procedure

Dependent Variable: yield

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             17      32796.58333    1929.21078      3.24   0.0764
Error              6       3575.41667     595.90278
Corrected Total   23      36372.00000

R-Square   Coeff Var   Root MSE   yield Mean
0.901699   15.02223    24.41112   162.5000

Source       DF   Type III SS   Mean Square   F Value   Pr > F
block         3     588.33333     196.11111      0.33   0.8050
soil          1    7848.16667    7848.16667     13.17   0.0110
block*soil    3    3740.83333    1246.94444      2.09   0.2027
fert          2   10950.75000    5475.37500      9.19   0.0149
soil*fert     2     462.58333     231.29167      0.39   0.6942
block*fert    6    9205.91667    1534.31944      2.57   0.1373

Tests of Hypotheses Using the Type III MS for block*soil as an Error Term

Source   DF   Type III SS   Mean Square   F Value   Pr > F
soil      1   7848.166667   7848.166667      6.29   0.0870
block     3    588.333333    196.111111      0.16   0.9185

Sub-plot effects: fert, soil*fert and block*fert in the Type III table.

NB! The P-values for block and soil in the Type III table cannot be used! Instead use the whole-plot results from the tests with block*soil as the error term.

Whole-plot effects: soil and block in the table of tests with block*soil as the error term.
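The difference between the two sets of whole-plot F values comes only from the choice of denominator. The Python sketch below is an illustration with invented variable names; it takes the mean squares from the SAS output of the full split-plot model and forms the F ratios for soil and block, first with the residual mean square and then with the block*soil mean square that the whole-plot analysis calls for.

```python
# Mean squares from the SAS output of the full split-plot model.
ms_block = 196.11111
ms_soil = 7848.16667
ms_block_soil = 1246.94444  # whole-plot error term
ms_error = 595.90278        # residual used by default in the Type III table

# Default GLM tests: every effect is divided by the residual mean square.
f_soil_default = ms_soil / ms_error
f_block_default = ms_block / ms_error

# Whole-plot tests: soil and block are divided by the block*soil mean square.
f_soil_whole_plot = ms_soil / ms_block_soil
f_block_whole_plot = ms_block / ms_block_soil

print(f"soil : F = {f_soil_default:.2f} (residual) vs {f_soil_whole_plot:.2f} (block*soil)")
print(f"block: F = {f_block_default:.2f} (residual) vs {f_block_whole_plot:.2f} (block*soil)")
```

The whole-plot F for soil has only 1 and 3 degrees of freedom, which is why its P-value (0.0870) is so much larger than the one in the default table (0.0110).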

66

PROC GLM;

TITLE 'Split plot - reduced model block*fert omitted';

CLASS block soil fert;

MODEL yield= block soil block*soil fert soil*fert;

RANDOM block;

TEST h = soil e = block*soil; /* tests effect of wholeplot */

TEST h = block e = block*soil; /* tests effect of blocks */

RUN;

67

Split plot - reduced model block*fert omitted

The GLM Procedure

Dependent Variable: yield

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             11      23590.66667    2144.60606      2.01   0.1224
Error             12      12781.33333    1065.11111
Corrected Total   23      36372.00000

R-Square   Coeff Var   Root MSE   yield Mean
0.648594   20.08372    32.63604   162.5000

Source       DF   Type III SS   Mean Square   F Value   Pr > F
block         3     588.33333     196.11111      0.18   0.9051
soil          1    7848.16667    7848.16667      7.37   0.0188
block*soil    3    3740.83333    1246.94444      1.17   0.3615
fert          2   10950.75000    5475.37500      5.14   0.0244
soil*fert     2     462.58333     231.29167      0.22   0.8079

68

PROC GLM;

TITLE 'Split plot - reduced model block*fert and soil*fert omitted';

CLASS block soil fert;

MODEL yield= block soil block*soil fert;

RANDOM block;

TEST h = soil e = block*soil; /* tests effect of wholeplot */

TEST h = block e = block*soil; /* tests effect of blocks */

MEANS soil /TUKEY e= block*soil CLM CLDIFF; /* confidence limits for wholeplot effects */

MEANS fert /TUKEY CLM CLDIFF; /* confidence limits for subplot effects */

RUN;

69

Split plot - reduced model block*fert and soil*fert omitted

Dependent Variable: yield

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              9      23128.08333    2569.78704      2.72   0.0457
Error             14      13243.91667     945.99405
Corrected Total   23      36372.00000

R-Square   Coeff Var   Root MSE   yield Mean
0.635876   18.92739    30.75702   162.5000

Source       DF   Type III SS   Mean Square   F Value   Pr > F
block         3     588.33333     196.11111      0.21   0.8896
soil          1    7848.16667    7848.16667      8.30   0.0121
block*soil    3    3740.83333    1246.94444      1.32   0.3079
fert          2   10950.75000    5475.37500      5.79   0.0147

70

The GLM Procedure

Tukey's Studentized Range (HSD) Test for yield

NOTE: This test controls the Type I experimentwise error rate.

Alpha                                     0.05
Error Degrees of Freedom                     3
Error Mean Square                     1246.944
Critical Value of Studentized Range    4.50067
Minimum Significant Difference          45.879

Comparisons significant at the 0.05 level are indicated by ***.

soil         Difference      Simultaneous 95%
Comparison   Between Means   Confidence Limits
2 - 1         36.17           -9.71    82.05
1 - 2        -36.17          -82.05     9.71

71

The GLM Procedure

Tukey's Studentized Range (HSD) Test for yield

NOTE: This test controls the Type I experimentwise error rate.

Alpha                                     0.05
Error Degrees of Freedom                    14
Error Mean Square                      945.994
Critical Value of Studentized Range    3.70139
Minimum Significant Difference           40.25

Comparisons significant at the 0.05 level are indicated by ***.

fert         Difference      Simultaneous 95%
Comparison   Between Means   Confidence Limits
3 - 2         15.38          -24.87    55.62
3 - 1         51.00           10.75    91.25   ***
2 - 3        -15.38          -55.62    24.87
2 - 1         35.63           -4.62    75.87
1 - 3        -51.00          -91.25   -10.75   ***
1 - 2        -35.63          -75.87     4.62