small and medium scale

7/27/2019 Small and Medium Scale

Statistical Modelling Chapter VII

VII. Factorial experiments

VII.A Design of factorial experiments
VII.B Advantages of factorial experiments
VII.C An example two-factor CRD experiment
VII.D Indicator-variable models and estimation for factorial experiments
VII.E Hypothesis testing using the ANOVA method for factorial experiments
VII.F Treatment differences
VII.G Nested factorial structures
VII.H Models and hypothesis testing for three-factor experiments


Factorial experiments

There will often be more than one factor of interest to the experimenter.

Definition VII.1: Experiments that involve more than one randomized or treatment factor are called factorial experiments.

In general, the number of treatments in a factorial experiment is the product of the numbers of levels of the treatment factors.

Given the number of treatments, the experiment could be laid out as a Completely Randomized Design, a Randomized Complete Block Design or a Latin Square with that number of treatments.

BIBDs or Youden Squares are not suitable.


VII.A Design of factorial experiments

a) Obtaining a layout for a factorial experiment in R

Layouts for factorial experiments can be obtained in R using the same expressions as for the chosen design when only a single factor is involved.

The difference with factorial experiments is that several treatment factors are entered. Their values can be generated using fac.gen.

fac.gen(generate, each=1, times=1, order="standard")

It is likely to be necessary to use either the each or times arguments to generate the replicate combinations.

The syntax of fac.gen and examples are given in Appendix B, Randomized layouts and sample size computations in R.
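The idea behind generating factor combinations and then replicating them can be sketched in base R. This is not fac.gen itself: it uses expand.grid as a stand-in, assumes that standard order means the first factor varies slowest, and borrows the nitrogen and phosphorus levels of Example VII.1. The seed and the object names (trts, trts3, layout) are arbitrary choices made for this illustration.

```r
# Base-R sketch; fac.gen (package dae) is deliberately not used here.
# expand.grid() varies its FIRST argument fastest, so P is listed first
# and the columns are then reordered to put N (the slowest factor) on the left.
trts <- expand.grid(P = c(0, 20), N = c(0, 30, 60))[, c("N", "P")]
rownames(trts) <- NULL
print(trts)   # the 3 x 2 = 6 treatment combinations

# Three copies of the whole set, analogous to asking for times = 3.
reps  <- 3
trts3 <- trts[rep(seq_len(nrow(trts)), times = reps), ]
rownames(trts3) <- NULL

# A CRD allocation: permute the 18 treatment rows over the 18 units.
set.seed(1)   # arbitrary seed, used only to make the sketch repeatable
layout <- data.frame(Seedling = 1:18, trts3[sample(nrow(trts3)), ])
rownames(layout) <- NULL
print(layout)
```

Each of the six combinations appears exactly three times in layout, whatever the seed; only the order over the seedlings changes.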


Example VII.1 Fertilizing oranges

Suppose an experimenter is interested in investigating the effect of nitrogen and phosphorus fertilizer on yield of oranges.

Investigate 3 levels of Nitrogen (viz. 0, 30, 60 kg/ha) and 2 levels of Phosphorus (viz. 0, 20 kg/ha).

The yield after six months was measured.

Treatments are all possible combinations of the 3 Nitrogen × 2 Phosphorus levels: 3 × 2 = 6 treatments.

The treatment combinations, arranged in standard order, are:

Treatment    N    P
    1        0    0
    2        0   20
    3       30    0
    4       30   20
    5       60    0
    6       60   20


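Generating a layout in R for a CRD with 3 reps

> #
> # CRD
> #
> n <- 18
> CRDFac2.unit <- list(Seedling = n)
> CRDFac2.ran <- fac.gen(list(N = c(0, 30, 60), P = c(0, 20)), times = 3)
> CRDFac2.lay <- fac.layout(unrandomized = CRDFac2.unit, randomized = CRDFac2.ran)
> remove("CRDFac2.unit", "CRDFac2.ran")

(The right-hand sides of the assignments are missing from this copy; those shown are inferred from the annotations below and from the fac.gen syntax, and any seed that was supplied is not shown.)

n and CRDFac2.unit: specifies units indexed by Seedling with 18 levels.

CRDFac2.ran: creates 3 copies of the levels combinations of N and P, with 3 and 2 levels; stores these in the data.frame CRDFac2.ran.

CRDFac2.lay: does the CRD randomization.

remove: removes excess objects.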


The layout

> CRDFac2.lay
   Units Permutation Seedling  N  P
1      1           2        1 30 20
2      2          18        2  0  0
3      3           4        3 30  0
4      4           5        4 30  0
5      5           7        5 30 20
6      6          12        6 30  0
7      7          15        7 60  0
8      8          13        8  0  0
9      9           6        9 60  0
10    10           1       10 60  0
11    11          10       11 30 20
12    12          16       12 60 20
13    13           8       13  0 20
14    14          14       14  0 20
15    15           3       15  0  0
16    16          11       16 60 20
17    17           9       17 60 20
18    18          17       18  0 20


What about an RCBD?

Suppose we decide on an RCBD with three blocks: how many units per block would be required? Answer: 6.

Factorial experiments are not limited to two factors. Thus we may have looked at Potassium at 2 levels as well. How many treatments in this case? Answer: 3 × 2 × 2 = 12.


VII.B Advantages of factorial experiments

a) Interaction in factorial experiments

The major advantage of factorial experiments is that they allow the detection of interaction.

Definition VII.2: Two factors are said to interact if the effect of one, on the response variable, depends upon the level of the other.

If they do not interact, they are said to be independent.

To investigate whether two factors interact, the simple effects are computed.


Effects

Definition VII.3: A simple effect, for the means computed for each combination of at least two factors, is the difference between two of these means having different levels of one of the factors but the same levels for all other factors.

We talk of the simple effects of a factor for the levels of the other factors.

If there is an interaction, compute an interaction effect from the simple effects to measure the size of the interaction.

Definition VII.4: An interaction effect is half the difference of two simple effects for two different levels of just one factor (or is half the difference of two interaction effects).

If there is not an interaction, can separately compute the main effects to see how each factor affects the response.

Definition VII.5: A main effect of a factor is the difference between two means with different levels of that factor, each mean having been formed from all observations having the same level of the factor.



Example VII.2 Chemical reactor experiment

Investigate the effect of catalyst and temperature on the yield of chemical from a chemical reactor.

Table of means from the experiment was as follows:

              Temperature (°C)
Catalyst       160     180
   A            60      72
   B            52      64

For A the temperature effect is 72 - 60 = 12.

For B the temperature effect is 64 - 52 = 12.

These are called the simple effects of temperature.

Clearly, the difference between (effect of) the temperatures is independent of which catalyst is used.

Interaction effect: [12 - 12]/2 = 0
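The arithmetic of Definitions VII.3 and VII.4 can be reproduced in a few lines of R. The sketch below enters the table of means from Example VII.2 and computes the simple effects of each factor and the interaction effect; the object names are chosen for this illustration only.

```r
# Cell means from Example VII.2 (rows: catalyst, columns: temperature in deg C).
means <- matrix(c(60, 72,
                  52, 64),
                nrow = 2, byrow = TRUE,
                dimnames = list(Catalyst = c("A", "B"),
                                Temperature = c("160", "180")))

# Simple effects of temperature, one per catalyst: mean at 180 minus mean at 160.
temp_effects <- means[, "180"] - means[, "160"]

# Simple effects of catalyst, one per temperature: B minus A.
cat_effects <- means["B", ] - means["A", ]

# Interaction effect: half the difference of two simple effects (Definition VII.4).
interaction_effect <- unname(temp_effects["A"] - temp_effects["B"]) / 2

print(temp_effects)        # A: 12, B: 12
print(cat_effects)         # 160: -8, 180: -8
print(interaction_effect)  # 0
```

Replacing the entry for catalyst B at 180 by a different value changes one simple effect of each factor and makes the interaction effect nonzero.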



Illustrate using an interaction plot

A set of parallel lines indicates no interaction.

[Figure: interaction plot "Effects of catalyst and temperature on yield - no interaction": Yield against Temperature, with one line for each of catalysts A and B.]



Interaction & independence are symmetrical in factors

Thus, the simple catalyst effect at 160°C is 52 - 60 = -8 and the simple catalyst effect at 180°C is 64 - 72 = -8.

Thus the difference between (effect of) the catalysts is independent of which temperature is used.

Interaction effect is still 0 and factors are additive.



Interaction plot

Clearly an interaction as lines have different slopes.

So cannot use overall means.

[Figure: interaction plot "Effects of catalyst and temperature on yield - interaction": Yield against Temperature, with one line for each of catalysts A and B.]



Why using overall means is inappropriate

Means for the combinations of the factors:

              Temperature (°C)
Catalyst       160     180
   A            60      72
   B            52      83

Overall means are:

Temperature (°C)    160    180
Mean                 56    77.5

Catalyst      A      B
Mean         66     67.5

Main effects:
cannot be equal to simple effects as these differ;
have no practical interpretation.

Look at means for the combinations of the factors.

Interaction effect:
[(72 - 60) - (83 - 52)]/2 = [12 - 31]/2 = -9.5
or [(52 - 60) - (83 - 72)]/2 = [-8 - 11]/2 = -9.5.

The case of two non-interacting factors is the simpler.
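To see why the main effects are uninformative here, they can be set beside the simple effects. The lines below are only arithmetic on the overall means and cell means quoted for this example.

```latex
\begin{align*}
\text{Temperature main effect} &= 77.5 - 56 = 21.5,
  &\text{simple effects: } & 72 - 60 = 12 \ (\text{A}), \quad 83 - 52 = 31 \ (\text{B}),\\
\text{Catalyst main effect} &= 67.5 - 66 = 1.5,
  &\text{simple effects: } & 52 - 60 = -8 \ (160^\circ\text{C}), \quad 83 - 72 = 11 \ (180^\circ\text{C}).
\end{align*}
```

Each main effect is the average of two simple effects that differ markedly, for the catalysts even in sign, so it describes neither of them.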

b) Advantages over one-factor-at-a-time experiments

It is sometimes suggested that it is better to keep it simple and investigate one factor at a time. However, this is wrong.

Unable to determine whether or not there is an interaction.

Take the temperature-catalyst experiment at the 2nd reactor. Experiments 1 and 2 between them give:

              Temperature (°C)
Catalyst       160     180
   A            60       ?
   B            52      83

WELL YOU HAVE ONLY APPLIED THREE OF THE FOUR POSSIBLE COMBINATIONS OF THE TWO FACTORS.

Catalyst A at 180°C has not been tested but catalyst B at 160°C has been tested twice, as indicated above.



Limitation of inability to detect interaction

The results of the experiments would indicate that:
temperature increases yield by 31 gms;
the catalysts differ by 8 gms in yield.

If we presume the factors act additively, predict the yield for catalyst A at 180°C to be: 60 + 31 = 83 + 8 = 91.

This is quite clearly erroneous: the mean for catalyst A at 180°C in the factorial experiment was 72.

Need the factorial experiment to determine if there is an interaction.



Same resources but more info

Exactly the same total amount of resources is involved in the two alternative strategies, assuming the number of replicates is the same in all the experiments.

In addition, if the factors are additive then the main effects are estimated with greater precision in factorial experiments.

In the one-factor-at-a-time experiments the effect of a particular factor is estimated as the difference between two means, each based on r observations.

In the factorial experiment the main effects of the factors are the difference between two means based on 2r observations, which represents a $\sqrt{2}$ increase in precision.

The improvement in precision will be greater for more factors and more levels.
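The $\sqrt{2}$ figure follows from the variance of a difference of two means. The sketch below assumes independent observations with a common variance, written here as $\sigma^2$ (a symbol introduced only for this illustration).

```latex
\begin{align*}
\text{one factor at a time:}\quad
  \operatorname{var}(\bar{Y}_1 - \bar{Y}_2) &= \frac{\sigma^2}{r} + \frac{\sigma^2}{r} = \frac{2\sigma^2}{r},\\
\text{factorial:}\quad
  \operatorname{var}(\bar{Y}_1 - \bar{Y}_2) &= \frac{\sigma^2}{2r} + \frac{\sigma^2}{2r} = \frac{\sigma^2}{r},\\
\text{ratio of standard errors:}\quad
  \sqrt{\frac{2\sigma^2/r}{\sigma^2/r}} &= \sqrt{2}.
\end{align*}
```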



Summary of advantages of factorial experiments

If the factors interact, factorial experiments allow this to be detected and estimates of the interaction effect can be obtained, and

if the factors are independent, factorial experiments result in the estimation of the main effects with greater precision.



VII.C An example two-factor CRD experiment

Modification of ANOVA: instead of a single source for treatments, will have a source for each factor and one for each possible combination of factors.



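c) Sources derived from the structure formulae

Units = Units
A*B = A + B + A#B

d) Degrees of freedom and sums of squares

Hasse diagrams for this study, with degrees of freedom and with M and Q matrices, give the following for each term.

Unrandomized terms
Term    Levels   df      Source   M                  Q
μ       1        1                $\mathbf{M}_{G}$   $\mathbf{M}_{G}$
Unit    n        n - 1   U        $\mathbf{M}_{U}$   $\mathbf{M}_{U} - \mathbf{M}_{G}$

Randomized terms
Term    Levels   df               Source   M                   Q
μ       1        1                         $\mathbf{M}_{G}$    $\mathbf{M}_{G}$
A       a        a - 1            A        $\mathbf{M}_{A}$    $\mathbf{M}_{A} - \mathbf{M}_{G}$
B       b        b - 1            B        $\mathbf{M}_{B}$    $\mathbf{M}_{B} - \mathbf{M}_{G}$
A∧B     ab       (a - 1)(b - 1)   A#B      $\mathbf{M}_{AB}$   $\mathbf{M}_{AB} - \mathbf{M}_{A} - \mathbf{M}_{B} + \mathbf{M}_{G}$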



f) Maximal expectation and variation models

Assume the randomized factors are fixed and that the unrandomized factor is a random factor.

Then the potential expectation terms are A, B and A∧B.

The variation term is: Units.

The maximal expectation model is

$\boldsymbol{\psi} = E[\mathbf{Y}]$ = A∧B

and the variation model is var[Y] = Units.



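g) The expected mean squares

The Hasse diagrams, with contributions to expected mean squares, for this study give:

Unrandomized terms
Term    Source   Contribution to E[MSq]
μ                1
Unit    U        $\sigma^2_U$

Randomized terms
Term    Source   Contribution to E[MSq]
μ                1
A       A        $q_A(\boldsymbol{\psi})$
B       B        $q_B(\boldsymbol{\psi})$
A∧B     A#B      $q_{AB}(\boldsymbol{\psi})$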



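ANOVA table with E[MSq]

Source       df               SSq                                               E[MSq]
Units        n - 1            $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$
  A          a - 1            $\mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}$             $\sigma^2_U + q_A(\boldsymbol{\psi})$
  B          b - 1            $\mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}$             $\sigma^2_U + q_B(\boldsymbol{\psi})$
  A#B        (a - 1)(b - 1)   $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$            $\sigma^2_U + q_{AB}(\boldsymbol{\psi})$
  Residual   ab(r - 1)        $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$       $\sigma^2_U$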



b) Analysis of an example

Example VII.4 Animal survival experiment

To demonstrate the analysis I will use the example from Box, Hunter and Hunter (sec. 7.7).

In this experiment three poisons and four treatments (antidotes) were investigated. The 12 combinations of poisons and treatments were applied to animals using a CRD and the survival times of the animals measured (in units of 10 hours).

                 Treatment
Poison      1      2      3      4
  I       0.31   0.82   0.43   0.45
          0.45   1.10   0.45   0.71
          0.46   0.88   0.63   0.66
          0.43   0.72   0.76   0.62
  II      0.36   0.92   0.44   0.56
          0.29   0.61   0.35   1.02
          0.40   0.49   0.31   0.71
          0.23   1.24   0.40   0.38
  III     0.22   0.30   0.23   0.30
          0.21   0.37   0.25   0.36
          0.18   0.38   0.24   0.31
          0.23   0.29   0.22   0.33



A. Description of pertinent features of the study

1. Observational unit: an animal
2. Response variable: Survival Time
3. Unrandomized factors: Animals
4. Randomized factors: Treatments, Poisons
5. Type of study: Two-factor CRD

B. The experimental structure

Structure       Formula
unrandomized    48 Animals
randomized      3 Poisons * 4 Treatments

These are the steps that need to be performed before R is used to obtain the analysis. The remaining steps are left as an exercise for you.



Interaction plot

There is some evidence of an interaction in that the traces for each level of Treat look to be different.

[Figure: interaction plot of mean of Surv.Time against Poison (1, 2, 3), with one trace for each level of Treat (legend order 2, 4, 3, 1).]


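Hypothesis test for the example (continued)

Step 2: Calculate test statistics

The ANOVA table for a two-factor CRD, with random factors being the unrandomized factors and fixed factors the randomized factors, is:

Source           df   SSq      MSq      E[MSq]                                       F       Prob
Animals          47   3.0051
  Poison          2   1.0330   0.5165   $\sigma^2_A + q_P(\boldsymbol{\psi})$        23.22   0.0000
  Treat           3   0.9212   0.3071   $\sigma^2_A + q_T(\boldsymbol{\psi})$        13.81   0.0000
  Poison#Treat    6   0.2501   0.0417   $\sigma^2_A + q_{PT}(\boldsymbol{\psi})$      1.87   0.1123
  Residual       36   0.8007   0.0222   $\sigma^2_A$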


Hypothesis test for the example (continued)

Step 3: Decide between hypotheses

Interaction of Poison and Treatment is not significant, so there is no evidence of an interaction.

Both main effects are highly significant, so both factors affect the response. More about models soon.

Also, it remains to perform the usual diagnostic checking.


VII.D Indicator-variable models and estimation for factorial experiments

The models for the factorial experiments will depend on the design used in assigning the treatments, that is, CRD, RCBD or LS.

The design will determine the unrandomized factors and the terms to be included involving those factors.

They will also depend on the number of randomized factors.

Let the total number of observations be n and the factors be A and B with a and b levels, respectively.

Suppose that the combinations of A and B are each replicated r times, that is, n = a × b × r.


a) Maximal model for two-factor CRD experiments

The maximal model used for a two-factor CRD experiment, where the two randomized factors A and B are fixed, is:

$\boldsymbol{\psi}_{AB} = E[\mathbf{Y}] = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$ and $\mathbf{V} = \sigma^2_U \mathbf{I}_n$

where

$\mathbf{Y}$ is the n-vector of random variables for the response variable observations,

$(\boldsymbol{\alpha\beta})$ is the ab-vector of parameters for the A-B combinations,

$\mathbf{X}_{AB}$ is the n × ab matrix giving the combinations of A and B that occurred on each unit, i.e. the X matrix for A∧B,

$\sigma^2_U$ is the variability arising from different units.

Our model also assumes $\mathbf{Y} \sim N(\boldsymbol{\psi}_{AB}, \mathbf{V})$.


Standard order

Expression for the X matrix in terms of direct products of Is and 1s when A and B are in standard order.

Previously used standard order; the general definition is in the notes.

The values of the k factors $A_1, A_2, \ldots, A_k$ with $a_1, a_2, \ldots, a_k$ levels, respectively, are systematically ordered in a hierarchical fashion: they are ordered according to $A_1$, then $A_2$, then $A_3$, … and then $A_k$.

Suppose the elements of the Y vector are arranged so that the values of the factors A, B and the replicates are in standard order, as for a systematic layout.

Then $\mathbf{X}_{AB} = \mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{1}_r$.


Example VII.5 2 × 2 Factorial experiment

Suppose A and B have 2 levels each and that each combination of A and B has 3 replicates.

Hence, a = b = 2, r = 3 and n = 12. Then

$(\boldsymbol{\alpha\beta})' = [(\alpha\beta)_{11} \ (\alpha\beta)_{12} \ (\alpha\beta)_{21} \ (\alpha\beta)_{22}]$

Now Y is arranged so that the values of A, B and the reps are in standard order, that is

$\mathbf{Y}' = [Y_{111} \ Y_{112} \ Y_{113} \ Y_{121} \ Y_{122} \ Y_{123} \ Y_{211} \ Y_{212} \ Y_{213} \ Y_{221} \ Y_{222} \ Y_{223}]$

Then $\mathbf{X}_{AB} = \mathbf{I}_2 \otimes \mathbf{I}_2 \otimes \mathbf{1}_3$, so that $\mathbf{X}_{AB}$ for the 4-level A∧B is:

A    1  1  2  2
B    1  2  1  2

     1  0  0  0
     1  0  0  0
     1  0  0  0
     0  1  0  0
     0  1  0  0
     0  1  0  0
     0  0  1  0
     0  0  1  0
     0  0  1  0
     0  0  0  1
     0  0  0  1
     0  0  0  1


b) Alternative expectation models: marginality-compliant models

Rule VII.1: The set of expectation models corresponds to the set of all possible combinations of potential expectation terms, subject to the restriction that terms marginal to another expectation term are excluded from the model;

it includes the minimal model that consists of a single term for the grand mean.

For marginality of terms refer to the Hasse diagrams; it can be deduced using definition VI.9.

This definition states that one generalized factor is marginal to another if the factors in the marginal generalized factor are a subset of those in the other, and this will occur irrespective of the replication of the levels of the generalized factors.


Two-factor CRD

For all randomized factors fixed, the potential expectation terms are A, B and A∧B.

Maximal model includes all terms: E[Y] = A + B + A∧B. However, marginal terms must be removed, so the maximal model reduces to E[Y] = A∧B.

Next model leaves out A∧B, giving the additive model E[Y] = A + B; there are no marginal terms in this model.

A simpler model than this is either E[Y] = A or E[Y] = B.

The only other possible model is one with neither A nor B: E[Y] = G.

[Figure: Hasse diagrams of the unrandomized term (Unit, n levels, n - 1 df) and the randomized terms (A, B and A∧B, with a, b and ab levels and a - 1, b - 1 and (a - 1)(b - 1) df).]


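Alternative expectation models in terms of matrices

$\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$   A and B interact in effect on response

$\boldsymbol{\psi}_{A+B} = \mathbf{X}_{A}\boldsymbol{\alpha} + \mathbf{X}_{B}\boldsymbol{\beta}$   A and B independently affect response

$\boldsymbol{\psi}_{A} = \mathbf{X}_{A}\boldsymbol{\alpha}$   A only affects response

$\boldsymbol{\psi}_{B} = \mathbf{X}_{B}\boldsymbol{\beta}$   B only affects response

$\boldsymbol{\psi}_{G} = \mathbf{X}_{G}\mu$   no factors affect response

The X matrices can be expressed in terms of direct products of Is and 1s when A and B are in standard order.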


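X matrices

Again suppose the elements of the Y vector are arranged so that the values of the factors A, B and the replicates are in standard order, as for a systematic layout.

Then the X matrices can be written as the following direct products:

$\mathbf{X}_{G} = \mathbf{1}_a \otimes \mathbf{1}_b \otimes \mathbf{1}_r = \mathbf{1}_{abr}$

$\mathbf{X}_{A} = \mathbf{I}_a \otimes \mathbf{1}_b \otimes \mathbf{1}_r$

$\mathbf{X}_{B} = \mathbf{1}_a \otimes \mathbf{I}_b \otimes \mathbf{1}_r$

$\mathbf{X}_{AB} = \mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{1}_r$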


Example VII.5 2 × 2 Factorial experiment (continued)

Remember A and B have two levels each and that each combination of A and B is replicated 3 times. Hence, a = b = 2, r = 3 and n = 12. Then

$\boldsymbol{\alpha}' = [\alpha_1 \ \alpha_2]$

$\boldsymbol{\beta}' = [\beta_1 \ \beta_2]$

$(\boldsymbol{\alpha\beta})' = [(\alpha\beta)_{11} \ (\alpha\beta)_{12} \ (\alpha\beta)_{21} \ (\alpha\beta)_{22}]$

Suppose Y is arranged so that the values of A, B and the replicates are in standard order, that is

$\mathbf{Y}' = [Y_{111} \ Y_{112} \ Y_{113} \ Y_{121} \ Y_{122} \ Y_{123} \ Y_{211} \ Y_{212} \ Y_{213} \ Y_{221} \ Y_{222} \ Y_{223}]$


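Example VII.5 2 × 2 Factorial experiment (continued)

Then

$\mathbf{X}_{G} = \mathbf{1}_2 \otimes \mathbf{1}_2 \otimes \mathbf{1}_3 = \mathbf{1}_{12}$
$\mathbf{X}_{A} = \mathbf{I}_2 \otimes \mathbf{1}_2 \otimes \mathbf{1}_3$
$\mathbf{X}_{B} = \mathbf{1}_2 \otimes \mathbf{I}_2 \otimes \mathbf{1}_3$
$\mathbf{X}_{AB} = \mathbf{I}_2 \otimes \mathbf{I}_2 \otimes \mathbf{1}_3$

     X_G    X_A     X_B     X_AB
A           1  2            1  1  2  2
B                   1  2    1  2  1  2

      1     1  0    1  0    1  0  0  0
      1     1  0    1  0    1  0  0  0
      1     1  0    1  0    1  0  0  0
      1     1  0    0  1    0  1  0  0
      1     1  0    0  1    0  1  0  0
      1     1  0    0  1    0  1  0  0
      1     0  1    1  0    0  0  1  0
      1     0  1    1  0    0  0  1  0
      1     0  1    1  0    0  0  1  0
      1     0  1    0  1    0  0  0  1
      1     0  1    0  1    0  0  0  1
      1     0  1    0  1    0  0  0  1

Notice, irrespective of the replication of the levels of A∧B,
$\mathbf{X}_{G}$ can be written as a linear combination of the columns of each of the other three;
$\mathbf{X}_{A}$ and $\mathbf{X}_{B}$ can be written as linear combinations of the columns of $\mathbf{X}_{AB}$.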
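The direct-product expressions can be checked numerically with base R's Kronecker operator %x%. The sketch below builds the four X matrices for a = b = 2 and r = 3 and then confirms the marginality relations by summing columns of X_AB; the helper one() and the object names are introduced here for illustration.

```r
# Dimensions from Example VII.5.
a <- 2; b <- 2; r <- 3

one <- function(k) matrix(1, nrow = k, ncol = 1)   # the k-vector of ones

# Direct (Kronecker) products; %x% is base R's operator form of kronecker().
X_G  <- one(a)  %x% one(b)  %x% one(r)
X_A  <- diag(a) %x% one(b)  %x% one(r)
X_B  <- one(a)  %x% diag(b) %x% one(r)
X_AB <- diag(a) %x% diag(b) %x% one(r)

# Each column of X_A, X_B and X_G is a sum of columns of X_AB,
# which is the marginality relation noted in the text.
A_from_AB <- X_AB %*% (diag(a) %x% one(b))   # add AB columns within each level of A
B_from_AB <- X_AB %*% (one(a) %x% diag(b))   # add AB columns within each level of B
G_from_AB <- X_AB %*% one(a * b)             # add all AB columns

print(cbind(X_G, X_A, X_B, X_AB))            # the 12 x 9 array of indicator columns
```

Changing r alters only the number of rows; the column relations hold for any replication, in line with the remark about marginality.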


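Estimators of the expected values for the expectation models

They are all functions of means.

So they can be written in terms of mean operators, Ms.

If Y is arranged so that the associated factors A, B and the replicates are in standard order, the M operators can be written as the direct product of I and J matrices:

Model                                                                                               Estimator
$\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$                                $\hat{\boldsymbol{\psi}}_{AB} = \overline{\mathbf{A}\wedge\mathbf{B}} = \mathbf{M}_{AB}\mathbf{Y} = r^{-1}(\mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_r)\mathbf{Y}$
$\boldsymbol{\psi}_{A+B} = \mathbf{X}_{A}\boldsymbol{\alpha} + \mathbf{X}_{B}\boldsymbol{\beta}$    $\hat{\boldsymbol{\psi}}_{A+B} = \bar{\mathbf{A}} + \bar{\mathbf{B}} - \bar{\mathbf{G}}$
$\boldsymbol{\psi}_{A} = \mathbf{X}_{A}\boldsymbol{\alpha}$                                         $\hat{\boldsymbol{\psi}}_{A} = \bar{\mathbf{A}} = \mathbf{M}_{A}\mathbf{Y} = (br)^{-1}(\mathbf{I}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_r)\mathbf{Y}$
$\boldsymbol{\psi}_{B} = \mathbf{X}_{B}\boldsymbol{\beta}$                                          $\hat{\boldsymbol{\psi}}_{B} = \bar{\mathbf{B}} = \mathbf{M}_{B}\mathbf{Y} = (ar)^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_r)\mathbf{Y}$
$\boldsymbol{\psi}_{G} = \mathbf{X}_{G}\mu$                                                         $\hat{\boldsymbol{\psi}}_{G} = \bar{\mathbf{G}} = \mathbf{M}_{G}\mathbf{Y} = n^{-1}(\mathbf{J}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_r)\mathbf{Y}$

$\bar{\mathbf{G}}$, $\bar{\mathbf{A}}$, $\bar{\mathbf{B}}$ and $\overline{\mathbf{A}\wedge\mathbf{B}}$ are the n-vectors of means, the latter for the combinations of A and B, that is for the generalized factor A∧B.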
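A small numerical sketch may help to show what the M operators do. It builds them from I and J matrices for a = b = 2 and r = 3 and applies them to a response vector in standard order. The twelve response values are invented purely to exercise the operators, and the helper names (J, Id, is_projector) are hypothetical.

```r
a <- 2; b <- 2; r <- 3; n <- a * b * r
J  <- function(k) matrix(1, k, k)   # k x k matrix of ones
Id <- function(k) diag(k)           # k x k identity matrix

# Mean operators as direct products of I and J matrices.
M_G  <- (J(a)  %x% J(b)  %x% J(r)) / n
M_A  <- (Id(a) %x% J(b)  %x% J(r)) / (b * r)
M_B  <- (J(a)  %x% Id(b) %x% J(r)) / (a * r)
M_AB <- (Id(a) %x% Id(b) %x% J(r)) / r

# Hypothetical responses in standard order (A, then B, then replicate).
y <- c(10, 12, 11,  14, 15, 13,  20, 19, 21,  30, 29, 31)

A_bar  <- drop(M_A  %*% y)   # each unit gets the mean of its level of A
AB_bar <- drop(M_AB %*% y)   # each unit gets the mean of its A-B combination
fit_additive <- drop((M_A + M_B - M_G) %*% y)   # estimator under the additive model

# A mean operator should be symmetric and idempotent.
is_projector <- function(M) {
  isTRUE(all.equal(M, t(M))) && isTRUE(all.equal(M %*% M, M))
}

print(A_bar)
print(AB_bar)
print(sapply(list(M_G, M_A, M_B, M_AB), is_projector))
```

For these made-up values the A means are 12.5 and 25, and the combination means are 11, 14, 20 and 30, each repeated over the units that share the level or combination.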

Example VII.5 2 × 2 Factorial experiment (continued)

The mean vectors, produced by an MY, are as follows (each row is repeated for the 3 replicates of the combination):

A  B     $\bar{\mathbf{G}}$   $\bar{\mathbf{A}}$   $\bar{\mathbf{B}}$   $\overline{\mathbf{A}\wedge\mathbf{B}}$
1  1     $\bar{G}$            $\bar{A}_1$          $\bar{B}_1$          $\overline{A\wedge B}_{11}$
1  2     $\bar{G}$            $\bar{A}_1$          $\bar{B}_2$          $\overline{A\wedge B}_{12}$
2  1     $\bar{G}$            $\bar{A}_2$          $\bar{B}_1$          $\overline{A\wedge B}_{21}$
2  2     $\bar{G}$            $\bar{A}_2$          $\bar{B}_2$          $\overline{A\wedge B}_{22}$


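VII.E Hypothesis testing using the ANOVA method for factorial experiments

Use ANOVA to choose between models.

In this section we will use the generic names A, B and Units for the factors.

Recall the ANOVA for a two-factor CRD.

Source       df               SSq                                               E[MSq]
Units        n - 1            $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$
  A          a - 1            $\mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}$             $\sigma^2_U + q_A(\boldsymbol{\psi})$
  B          b - 1            $\mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}$             $\sigma^2_U + q_B(\boldsymbol{\psi})$
  A#B        (a - 1)(b - 1)   $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$            $\sigma^2_U + q_{AB}(\boldsymbol{\psi})$
  Residual   ab(r - 1)        $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$       $\sigma^2_U$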


a) Sums of squares for the analysis of variance

Require estimators of the following SSqs for a two-factor CRD ANOVA: Total or Units; A; B; A#B and Residual.

Use the Hasse diagram.


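Vectors for sums of squares

Total or Units SSq: $\mathbf{D}_{G} = \mathbf{Q}_{U}\mathbf{Y} = (\mathbf{M}_{U} - \mathbf{M}_{G})\mathbf{Y} = \mathbf{Y} - \bar{\mathbf{G}}$

A SSq: $\mathbf{A}_{e} = \mathbf{Q}_{A}\mathbf{Y} = (\mathbf{M}_{A} - \mathbf{M}_{G})\mathbf{Y} = \bar{\mathbf{A}} - \bar{\mathbf{G}}$

B SSq: $\mathbf{B}_{e} = \mathbf{Q}_{B}\mathbf{Y} = (\mathbf{M}_{B} - \mathbf{M}_{G})\mathbf{Y} = \bar{\mathbf{B}} - \bar{\mathbf{G}}$

A#B SSq: $(\mathbf{A}\wedge\mathbf{B})_{e} = \mathbf{Q}_{AB}\mathbf{Y} = (\mathbf{M}_{AB} - \mathbf{M}_{A} - \mathbf{M}_{B} + \mathbf{M}_{G})\mathbf{Y} = \overline{\mathbf{A}\wedge\mathbf{B}} - \bar{\mathbf{A}} - \bar{\mathbf{B}} + \bar{\mathbf{G}}$

Residual SSq: $\mathbf{D}_{AB} = \mathbf{Q}_{U_{Res}}\mathbf{Y} = (\mathbf{M}_{U} - \mathbf{M}_{AB})\mathbf{Y} = \mathbf{Y} - \overline{\mathbf{A}\wedge\mathbf{B}} = \mathbf{Y} - \mathbf{A}_{e} - \mathbf{B}_{e} - (\mathbf{A}\wedge\mathbf{B})_{e} - \bar{\mathbf{G}}$

All the Ms and Qs are symmetric and idempotent.

[Figure: Hasse diagrams for Unit and for A, B and A∧B showing the M matrices ($\mathbf{M}_{G}$, $\mathbf{M}_{U}$, $\mathbf{M}_{A}$, $\mathbf{M}_{B}$, $\mathbf{M}_{AB}$) and the Q matrices ($\mathbf{M}_{U} - \mathbf{M}_{G}$, $\mathbf{M}_{A} - \mathbf{M}_{G}$, $\mathbf{M}_{B} - \mathbf{M}_{G}$, $\mathbf{M}_{AB} - \mathbf{M}_{A} - \mathbf{M}_{B} + \mathbf{M}_{G}$).]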


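SSq (continued)

From section VII.D, Indicator-variable models and estimation for factorial experiments, we have that

$\bar{\mathbf{G}} = \mathbf{M}_{G}\mathbf{Y} = n^{-1}(\mathbf{J}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_r)\mathbf{Y}$

$\bar{\mathbf{A}} = \mathbf{M}_{A}\mathbf{Y} = (br)^{-1}(\mathbf{I}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_r)\mathbf{Y}$

$\bar{\mathbf{B}} = \mathbf{M}_{B}\mathbf{Y} = (ar)^{-1}(\mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_r)\mathbf{Y}$

$\overline{\mathbf{A}\wedge\mathbf{B}} = \mathbf{M}_{AB}\mathbf{Y} = r^{-1}(\mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_r)\mathbf{Y}$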


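SSq (continued)

So the SSqs for the ANOVA are given by

$\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y} = \mathbf{D}_{G}'\mathbf{D}_{G}$

$\mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y} = \mathbf{A}_{e}'\mathbf{A}_{e}$

$\mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y} = \mathbf{B}_{e}'\mathbf{B}_{e}$

$\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y} = (\mathbf{A}\wedge\mathbf{B})_{e}'(\mathbf{A}\wedge\mathbf{B})_{e}$

$\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y} = \mathbf{D}_{AB}'\mathbf{D}_{AB}$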
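These expressions can be tried on the survival times of Example VII.4. The sketch below does not form the M matrices explicitly; it assumes that base R's ave(), which replaces each observation by the mean of its group, plays the role of applying an M operator. The object names are chosen for illustration.

```r
# Survival times from Example VII.4, entered poison by poison; within each
# poison there are four replicate rows, each running over Treat 1 to 4.
surv <- c(0.31, 0.82, 0.43, 0.45,  0.45, 1.10, 0.45, 0.71,
          0.46, 0.88, 0.63, 0.66,  0.43, 0.72, 0.76, 0.62,
          0.36, 0.92, 0.44, 0.56,  0.29, 0.61, 0.35, 1.02,
          0.40, 0.49, 0.31, 0.71,  0.23, 1.24, 0.40, 0.38,
          0.22, 0.30, 0.23, 0.30,  0.21, 0.37, 0.25, 0.36,
          0.18, 0.38, 0.24, 0.31,  0.23, 0.29, 0.22, 0.33)
Poison <- factor(rep(1:3, each = 16))
Treat  <- factor(rep(1:4, times = 12))

# Vectors of means: grand mean, factor means and combination means.
G_bar  <- rep(mean(surv), length(surv))
A_bar  <- ave(surv, Poison)
B_bar  <- ave(surv, Treat)
AB_bar <- ave(surv, Poison, Treat)

# Sums of squares as squared lengths of the effect and deviation vectors.
ssq <- c(Total    = sum((surv - G_bar)^2),
         Poison   = sum((A_bar - G_bar)^2),
         Treat    = sum((B_bar - G_bar)^2),
         Interact = sum((AB_bar - A_bar - B_bar + G_bar)^2),
         Residual = sum((surv - AB_bar)^2))
print(round(ssq, 4))
```

The printed values can be compared with the sums of squares reported for this example elsewhere in the chapter (1.0330 for Poison and a total of 3.0051).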

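ANOVA table constructed as follows:

Source       df               SSq                                           MSq                                                                           E[MSq]                                     F                          p
Units        n - 1            $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$
  A          a - 1            $\mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}$         $s^2_A = \mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}/(a-1)$                           $\sigma^2_U + q_A(\boldsymbol{\psi})$      $s^2_A/s^2_{U_{Res}}$      $p_A$
  B          b - 1            $\mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}$         $s^2_B = \mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}/(b-1)$                           $\sigma^2_U + q_B(\boldsymbol{\psi})$      $s^2_B/s^2_{U_{Res}}$      $p_B$
  A#B        (a - 1)(b - 1)   $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$        $s^2_{AB} = \mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}/[(a-1)(b-1)]$                $\sigma^2_U + q_{AB}(\boldsymbol{\psi})$   $s^2_{AB}/s^2_{U_{Res}}$   $p_{AB}$
  Residual   ab(r - 1)        $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$   $s^2_{U_{Res}} = \mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}/[ab(r-1)]$         $\sigma^2_U$
Total        abr - 1          $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$

Can compute the SSqs by decomposing y as follows:

$\mathbf{y} = \bar{\mathbf{g}} + \mathbf{a}_{e} + \mathbf{b}_{e} + (\mathbf{a}\wedge\mathbf{b})_{e} + \mathbf{d}_{AB}$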

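d) Expected mean squares

The E[MSq]s involve three quadratic functions of the expectation vector:

$q_A(\boldsymbol{\psi}) = \boldsymbol{\psi}'\mathbf{Q}_{A}\boldsymbol{\psi}/(a-1)$,

$q_B(\boldsymbol{\psi}) = \boldsymbol{\psi}'\mathbf{Q}_{B}\boldsymbol{\psi}/(b-1)$,

$q_{AB}(\boldsymbol{\psi}) = \boldsymbol{\psi}'\mathbf{Q}_{AB}\boldsymbol{\psi}/[(a-1)(b-1)]$.

That is, the numerators are SSqs of $\mathbf{Q}_{A}\boldsymbol{\psi} = (\mathbf{M}_{A} - \mathbf{M}_{G})\boldsymbol{\psi}$, $\mathbf{Q}_{B}\boldsymbol{\psi} = (\mathbf{M}_{B} - \mathbf{M}_{G})\boldsymbol{\psi}$ and $\mathbf{Q}_{AB}\boldsymbol{\psi} = (\mathbf{M}_{AB} - \mathbf{M}_{A} - \mathbf{M}_{B} + \mathbf{M}_{G})\boldsymbol{\psi}$, where $\boldsymbol{\psi}$ is one of the models

$\boldsymbol{\psi}_{G} = \mathbf{X}_{G}\mu$
$\boldsymbol{\psi}_{A} = \mathbf{X}_{A}\boldsymbol{\alpha}$
$\boldsymbol{\psi}_{B} = \mathbf{X}_{B}\boldsymbol{\beta}$
$\boldsymbol{\psi}_{A+B} = \mathbf{X}_{A}\boldsymbol{\alpha} + \mathbf{X}_{B}\boldsymbol{\beta}$
$\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$

Require expressions for the quadratic functions under each of these models.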

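Zero & nonzero quadratic functions

                                                                                                    Source
Expectation model                                                                                   A                                   B                                   A#B
$\boldsymbol{\psi}_{G} = \mathbf{X}_{G}\mu$                                                         $q_A(\boldsymbol{\psi}_{G}) = 0$    $q_B(\boldsymbol{\psi}_{G}) = 0$    $q_{AB}(\boldsymbol{\psi}_{G}) = 0$
$\boldsymbol{\psi}_{A} = \mathbf{X}_{A}\boldsymbol{\alpha}$                                         $q_A(\boldsymbol{\psi}_{A})$        $q_B(\boldsymbol{\psi}_{A}) = 0$    $q_{AB}(\boldsymbol{\psi}_{A}) = 0$
$\boldsymbol{\psi}_{B} = \mathbf{X}_{B}\boldsymbol{\beta}$                                          $q_A(\boldsymbol{\psi}_{B}) = 0$    $q_B(\boldsymbol{\psi}_{B})$        $q_{AB}(\boldsymbol{\psi}_{B}) = 0$
$\boldsymbol{\psi}_{A+B} = \mathbf{X}_{A}\boldsymbol{\alpha} + \mathbf{X}_{B}\boldsymbol{\beta}$    $q_A(\boldsymbol{\psi}_{A+B})$      $q_B(\boldsymbol{\psi}_{A+B})$      $q_{AB}(\boldsymbol{\psi}_{A+B}) = 0$
$\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$                                $q_A(\boldsymbol{\psi}_{AB})$       $q_B(\boldsymbol{\psi}_{AB})$       $q_{AB}(\boldsymbol{\psi}_{AB})$

Firstly, considering the column for source A#B, the only model for which $q_{AB}(\boldsymbol{\psi}) \neq 0$ is $\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$. Consequently, a significant A#B indicates that $q_{AB}(\boldsymbol{\psi}) > 0$ and that the maximal model is the appropriate model.

Secondly, considering the column for source A, $q_A(\boldsymbol{\psi}) \neq 0$ implies either a model that includes $\mathbf{X}_{A}\boldsymbol{\alpha}$ or the maximal model $\mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$:
if A#B is significant, we know we need the maximal model and the test for A is irrelevant;
if A#B is not significant, we know the maximal model is not required and so a significant A indicates that the model should include $\mathbf{X}_{A}\boldsymbol{\alpha}$.

Thirdly, for source B, provided A#B is not significant, a significant B indicates that the model should include $\mathbf{X}_{B}\boldsymbol{\beta}$.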

Choosing an expectation model for a two-factor CRD

A#B hypothesis

p ≤ α: reject H0. Factors interact in their effect on the response variable. Use the maximal model $\boldsymbol{\psi}_{AB} = \mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$.

p > α: retain H0. Go on to the A and B hypotheses.

A and B hypotheses

One or both p ≤ α: reject H0(s). Factors are independent in their effect on the response variable. Use a model that includes the significant terms, that is a single-factor or additive model.

For both p > α: retain all H0s. Factors have no effect on the response variable. Use the minimal model $\boldsymbol{\psi}_{G} = \mathbf{X}_{G}\mu$.
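The decision sequence can be written as a small function. This is only a sketch of the rule as laid out above: the function name choose_model and the labels it returns are invented for this illustration, and the p-values in the usage line are those reported by R for the survival times of Example VII.4.

```r
# Pick an expectation model from the three ANOVA p-values.
# Returns a label: "AB" (maximal), "A + B", "A", "B" or "G" (minimal).
choose_model <- function(p_AB, p_A, p_B, alpha = 0.05) {
  # Interaction significant: the maximal model is needed and the
  # tests for A and B are irrelevant.
  if (p_AB <= alpha) return("AB")
  # Otherwise keep whichever main effects are significant.
  terms <- c("A", "B")[c(p_A, p_B) <= alpha]
  if (length(terms) == 0) return("G")
  paste(terms, collapse = " + ")
}

# Example VII.4 (untransformed survival times): Poison as A, Treat as B.
print(choose_model(p_AB = 0.1123, p_A = 3.331e-07, p_B = 3.777e-06))
```

With those p-values the function returns the additive model, which agrees with the conclusion reached for the example in Step 3.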

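Nonzero quadratic functions

In the notes it is shown that the non-zero q-functions are given by:

$q_A(\boldsymbol{\psi}_{A}) = q_A(\boldsymbol{\psi}_{A+B}) = rb\sum_{i=1}^{a}(\alpha_i - \bar{\alpha}_{.})^2 / (a-1)$

$q_B(\boldsymbol{\psi}_{B}) = q_B(\boldsymbol{\psi}_{A+B}) = ra\sum_{j=1}^{b}(\beta_j - \bar{\beta}_{.})^2 / (b-1)$

$q_{AB}(\boldsymbol{\psi}_{AB}) = r\sum_{i=1}^{a}\sum_{j=1}^{b}\left((\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..}\right)^2 / [(a-1)(b-1)]$

So the q-functions are zero when the expressions in parentheses are zero. That is, when

$\alpha_i = \bar{\alpha}_{.}$, $\beta_j = \bar{\beta}_{.}$ and $(\alpha\beta)_{ij} = \overline{(\alpha\beta)}_{i.} + \overline{(\alpha\beta)}_{.j} - \overline{(\alpha\beta)}_{..}$

That is, equality or an additive pattern obtains.

These, or equivalent, expressions are given for H0.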

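e) Summary of the hypothesis test

Step 1: Set up hypotheses

a) H0: there is no interaction between A and B (or a model simpler than $\mathbf{X}_{AB}(\boldsymbol{\alpha\beta})$ is adequate), that is
$(\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..} = 0$ for all i, j

H1: there is an interaction between A and B, that is
$(\alpha\beta)_{ij} - \overline{(\alpha\beta)}_{i.} - \overline{(\alpha\beta)}_{.j} + \overline{(\alpha\beta)}_{..} \neq 0$ for some i, j

b) H0: $\alpha_1 = \alpha_2 = \ldots = \alpha_a$ (or $\mathbf{X}_{A}\boldsymbol{\alpha}$ not required in model)
H1: not all population A means are equal

c) H0: $\beta_1 = \beta_2 = \ldots = \beta_b$ (or $\mathbf{X}_{B}\boldsymbol{\beta}$ not required in model)
H1: not all population B means are equal

Set α = 0.05.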

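Summary of the hypothesis test (continued)

Step 2: Calculate test statistics

Source       df               SSq                                           MSq                                                                           E[MSq]                                     F                          p
Units        n - 1            $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$
  A          a - 1            $\mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}$         $s^2_A = \mathbf{Y}'\mathbf{Q}_{A}\mathbf{Y}/(a-1)$                           $\sigma^2_U + q_A(\boldsymbol{\psi})$      $s^2_A/s^2_{U_{Res}}$      $p_A$
  B          b - 1            $\mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}$         $s^2_B = \mathbf{Y}'\mathbf{Q}_{B}\mathbf{Y}/(b-1)$                           $\sigma^2_U + q_B(\boldsymbol{\psi})$      $s^2_B/s^2_{U_{Res}}$      $p_B$
  A#B        (a - 1)(b - 1)   $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$        $s^2_{AB} = \mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}/[(a-1)(b-1)]$                $\sigma^2_U + q_{AB}(\boldsymbol{\psi})$   $s^2_{AB}/s^2_{U_{Res}}$   $p_{AB}$
  Residual   ab(r - 1)        $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$   $s^2_{U_{Res}} = \mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}/[ab(r-1)]$         $\sigma^2_U$
Total        abr - 1          $\mathbf{Y}'\mathbf{Q}_{U}\mathbf{Y}$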


f) Computation of ANOVA and diagnostic checking in R

The assumptions underlying a factorial experiment, and hence the diagnostic checking, will be the same as for the basic design employed, except that residuals-versus-factor plots are also produced for all the factors in the experiment.


R instructions

Fac2Pois.dat


R output

> Fac2Pois.dat
   Animals Poison Treat Surv.Time
1        1      1     1      0.31
2        2      1     2      0.82
3        3      1     3      0.43
4        4      1     4      0.45
5        5      1     1      0.45
6        6      1     2      1.10
7        7      1     3      0.45
8        8      1     4      0.71
9        9      1     1      0.46
10      10      1     2      0.88
11      11      1     3      0.63
12      12      1     4      0.66
13      13      1     1      0.43
14      14      1     2      0.72
15      15      1     3      0.76
16      16      1     4      0.62
17      17      2     1      0.36
18      18      2     2      0.92
19      19      2     3      0.44
20      20      2     4      0.56
21      21      2     1      0.29
22      22      2     2      0.61
23      23      2     3      0.35
24      24      2     4      1.02
25      25      2     1      0.40
26      26      2     2      0.49
27      27      2     3      0.31
28      28      2     4      0.71
29      29      2     1      0.23
30      30      2     2      1.24
31      31      2     3      0.40
32      32      2     4      0.38
33      33      3     1      0.22
34      34      3     2      0.30
35      35      3     3      0.23
36      36      3     4      0.30
37      37      3     1      0.21
38      38      3     2      0.37
39      39      3     3      0.25
40      40      3     4      0.36
41      41      3     1      0.18
42      42      3     2      0.38
43      43      3     3      0.24
44      44      3     4      0.31
45      45      3     1      0.23
46      46      3     2      0.29
47      47      3     3      0.22
48      48      3     4      0.33

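R instructions and output

interaction.plot(Poison, Treat, Surv.Time, lwd=4)
Fac2Pois.aov <- aov(Surv.Time ~ Poison * Treat + Error(Animals), Fac2Pois.dat)
summary(Fac2Pois.aov)

(The right-hand side of the Fac2Pois.aov assignment is missing from this copy; the call shown is inferred from the "Error: Animals" line of the output.)

> interaction.plot(Poison, Treat, Surv.Time, lwd = 4)
> Fac2Pois.aov <- aov(Surv.Time ~ Poison * Treat + Error(Animals), Fac2Pois.dat)
> summary(Fac2Pois.aov)

Error: Animals
             Df  Sum Sq Mean Sq F value    Pr(>F)
Poison        2 1.03301 0.51651 23.2217 3.331e-07
Treat         3 0.92121 0.30707 13.8056 3.777e-06
Poison:Treat  6 0.25014 0.04169  1.8743    0.1123
Residuals    36 0.80073 0.02224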
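The F and Pr(>F) columns can be recovered from the sums of squares and degrees of freedom alone, which makes explicit that every line is tested against the Residual mean square. The sketch below copies the Sum Sq and Df columns from the output above; the object names are chosen for this illustration.

```r
# Sums of squares and degrees of freedom copied from the summary output.
ssq <- c(Poison = 1.03301, Treat = 0.92121, "Poison:Treat" = 0.25014)
df  <- c(Poison = 2,       Treat = 3,       "Poison:Treat" = 6)
res_ssq <- 0.80073
res_df  <- 36

msq     <- ssq / df            # mean squares for the randomized sources
res_msq <- res_ssq / res_df    # Residual mean square
F_value <- msq / res_msq       # each source is tested against the Residual
p_value <- pf(F_value, df, res_df, lower.tail = FALSE)

print(round(cbind(df, ssq, msq, F_value), 4))
print(signif(p_value, 4))
```

Small differences from the printed output in the last decimal place are to be expected, because the sums of squares entered here are already rounded.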

Diagnostic checking

As the experiment was set up as a CRD, the assumptions underlying its analysis will be the same as for the CRD.

Diagnostic checking is the same; in particular, Tukey's one-degree-of-freedom-for-nonadditivity cannot be computed.

The R output produced by the expressions that deal with diagnostic checking is as follows:

> #
> # Diagnostic checking
> #
> res <- resid.errors(Fac2Pois.aov)
> fit <- fitted.errors(Fac2Pois.aov)
> plot(fit, res, pch=16)
> plot(as.numeric(Poison), res, pch=16)
> plot(as.numeric(Treat), res, pch=16)
> qqnorm(res, pch=16)
> qqline(res)

(The right-hand sides of the res and fit assignments are missing from this copy; those shown are inferred.)

Diagnostic checking (continued)

All plots indicate a problem with the assumptions; will a transformation fix the problem?

[Figure: four diagnostic plots: res against fit; res against as.numeric(Poison); res against as.numeric(Treat); Normal Q-Q Plot of Sample Quantiles against Theoretical Quantiles.]

g) Box-Cox transformations for correcting transformable non-additivity

Box, Hunter and Hunter (sec. 7.9) describe the Box-Cox procedure for determining the appropriate power transformation for a set of data.

It has been implemented in the R function boxcox supplied in the MASS library that comes with R.

When you run this procedure you obtain a plot of the log-likelihood of λ, the power of the transformation to be used (for λ = 0 use the ln transformation).

However, the function does not work with aovlist objects and so the aov function must be repeated without the Error function.
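For orientation, the power family usually meant by "Box-Cox" is written out below. This is the standard textbook form, given as an aid to reading the log-likelihood plot rather than as a quotation from these notes.

```latex
\[
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[2ex]
\ln y, & \lambda = 0.
\end{cases}
\]
```

For $\lambda = -1$ this gives $1 - 1/y$, a linear function of the reciprocal, so an analysis of $1/y$ is equivalent apart from the sign and origin of the scale.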


Repeat the analysis on the reciprocals

detach the Fac2Pois.dat data.frame;

add Death.Rate to the data.frame;

reattach the data.frame to refresh info in R;

repeat the expressions from the original analysis with Surv.Time replaced by Death.Rate appropriately.
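The transformation itself is a one-liner. The sketch below assumes that Death.Rate is simply the reciprocal of Surv.Time, as the slide title indicates, and applies it to the four survival times recorded for Poison 1 with Treat 1 in Example VII.4; the lower-case object names are chosen for this illustration.

```r
# Survival times for Poison 1, Treat 1 in Example VII.4.
surv_time <- c(0.31, 0.45, 0.46, 0.43)

# Reciprocal transformation: a short survival time becomes a large rate.
death_rate <- 1 / surv_time

print(round(death_rate, 3))
print(round(mean(death_rate), 3))   # mean death rate for this combination
```

The mean for this combination rounds to 2.487, which is the Poison 1, Treat 1 entry in the table of Poison by Treat means given later.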

Looking like no interaction.

[Figure: interaction plot of mean of Death.Rate against Poison (1, 2, 3), with one trace for each level of Treat (legend order 1, 3, 4, 2).]


Repeat the analysis on the reciprocals (continued)

Looking good.

[Figure: four diagnostic plots for the transformed data: res against fit; res against as.numeric(Poison); res against as.numeric(Treat); Normal Q-Q Plot of Sample Quantiles.]

Comparison of untransformed and transformed analyses

The analysis of the transformed data indicates that there is no interaction on the transformed scale, which confirms the plot.

The main effect mean squares are even larger than before, indicating that we are able to separate the treatments even more on the transformed scale.

Diagnostic checking now indicates all assumptions are met.

               UNTRANSFORMED                    TRANSFORMED
Source     df    MSq      F       Prob      df    MSq       F
Animals    47                               47
  Poison    2    0.5165   23.22   0.0000     2    17.439    72.63


75/140


VII.F Treatment differences

As usual the examination of treatmentdifferences can be based on multiplecomparisons or submodels.

a) Multiple comparison procedures

For two-factor experiments, there will be altogether three tables of means, namely one for each of A, B and A∧B.

Which table is of interest depends on the results of the hypothesis tests outlined above.

However, in all cases Tukey's HSD procedure will be employed to determine which means are significantly different.

A#B interaction significant

In this case you look at the table of means for the A∧B combinations.

               A
B      1    2    3   . . .   a
1      x    x    x   . . .   x
2      x    x    x   . . .   x
.      .    .    .   . . .   .
.      .    .    .   . . .   .
b      x    x    x   . . .   x

$w(5\%) = \frac{q_{ab,\nu,0.05}}{\sqrt{2}}\, s_{\bar{x}_d} = \frac{q_{ab,\nu,0.05}}{\sqrt{2}} \sqrt{s^2 \times \frac{2}{r}}$

where ν is the Residual degrees of freedom and $s^2$ is the Residual mean square.

In this case you look at differences between means for different A∧B combinations.

A#B interaction not significant

In this case examine the A and B tables of means for the significant lines.

         A
         1    2    3   . . .   a
Means    x    x    x   . . .   x

$w(5\%) = \frac{q_{a,\nu,0.05}}{\sqrt{2}}\, s_{\bar{x}_d} = \frac{q_{a,\nu,0.05}}{\sqrt{2}} \sqrt{s^2 \times \frac{2}{rb}}$

         B
         1    2    3   . . .   b
Means    x    x    x   . . .   x

$w(5\%) = \frac{q_{b,\nu,0.05}}{\sqrt{2}}\, s_{\bar{x}_d} = \frac{q_{b,\nu,0.05}}{\sqrt{2}} \sqrt{s^2 \times \frac{2}{ra}}$

That is, we examine each factor separately, using main effects.


Example VII.4 Animal survival experiment (continued)

For our example, as the interaction is not significant, the overall tables of means are examined.

For the Poison means

Poison      1       2       3
Mean      1.801   2.269   3.797

$w(5\%) = \frac{3.456758}{\sqrt{2}} \sqrt{0.240 \times \frac{2}{16}} = 0.42$

All Poison means are significantly different.

For the Treat means

Treat       1       2       3       4
Mean      3.519   1.862   2.947   2.161

$w(5\%) = \frac{3.808798}{\sqrt{2}} \sqrt{0.240 \times \frac{2}{12}} = 0.54$

All but Treats 2 and 4 are different.
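The two values of w(5%) can be reproduced with base R's qtukey, which returns quantiles of the studentized range. The sketch below assumes 36 Residual degrees of freedom and takes the Residual mean square of 0.240 from the calculations above; the function name tukey_w and the other object names are invented for this illustration.

```r
# Tukey's HSD yardstick: w = q / sqrt(2) * sqrt(s2 * 2 / nobs),
# where nobs is the number of observations in each mean being compared.
tukey_w <- function(nmeans, df, s2, nobs, alpha = 0.05) {
  q <- qtukey(1 - alpha, nmeans = nmeans, df = df)
  q / sqrt(2) * sqrt(s2 * 2 / nobs)
}

s2     <- 0.240   # Residual mean square on the Death.Rate scale
res_df <- 36      # Residual degrees of freedom (assumed: 12 combinations x 3)

w_poison <- tukey_w(nmeans = 3, df = res_df, s2 = s2, nobs = 16)
w_treat  <- tukey_w(nmeans = 4, df = res_df, s2 = s2, nobs = 12)

poison_means <- c("1" = 1.801, "2" = 2.269, "3" = 3.797)
treat_means  <- c("1" = 3.519, "2" = 1.862, "3" = 2.947, "4" = 2.161)

# Pairs of means whose absolute difference exceeds w are declared different.
poison_diff <- abs(outer(poison_means, poison_means, "-")) > w_poison
treat_diff  <- abs(outer(treat_means,  treat_means,  "-")) > w_treat

print(round(c(w_poison = w_poison, w_treat = w_treat), 2))
print(treat_diff)
```

In the logical matrix for Treat, the only off-diagonal FALSE entries are those for the pair 2 and 4, matching the conclusion stated above.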

Plotting the means in a bar chart

> #
> # Plotting means
> #
> Fac2Pois.DR.tab <- ...            # right-hand side missing from this copy
> Fac2Pois.DR.Poison.Means <- ...   # right-hand side missing from this copy
> barchart(Death.Rate ~ Poison, main="Fitted values for Death rate",
+          ylim=c(0,4), data=Fac2Pois.DR.Poison.Means)
> Fac2Pois.DR.Treat.Means <- ...    # right-hand side missing from this copy
> barchart(Death.Rate ~ Treat, main="Fitted values for Death rate",
+          ylim=c(0,4), data=Fac2Pois.DR.Treat.Means)

Max death rate with Poison 3 and Treat 1.

Min death rate with Poison 1 and either Treat 2 or 4.

[Figure: two bar charts titled "Fitted values for Death rate": Death.Rate for Treat 1 to 4 and Death.Rate for Poison 1 to 3.]

If interaction significant, 2 possibilities

Possible researcher's objective(s):

i. finding the levels combination(s) of the factors that maximize (or minimize) the response variable, or describing the response variable differences between all levels combinations of the factors;

ii. for each level of one factor, finding the level of the other factor that maximizes (or minimizes) the response variable, or describing the response variable differences between the levels of the other factor;

iii. finding a level of one factor for which there is no difference between the levels of the other factor.

For i: examine all possible pairs of differences between all means.

For ii & iii: examine pairs of mean differences between levels of one factor for each level of the other factor, i.e. in slices for each level of the other factor (= examining simple effects).

Table of Poison by Treat means

Poison:Treat
                   Treat
Poison      1       2       3       4
  1       2.487   1.163   1.863   1.690
  2       3.268   1.393   2.714   1.702
  3       4.803   3.029   4.265   3.092

Look for the overall max or the max in each column.

Do not do this for this example as the interaction is not significant.

$w(5\%) = \frac{4.93606}{\sqrt{2}} \sqrt{0.240 \times \frac{2}{4}} = 1.21$

b) Polynomial submodels

As stated previously, the formal expression for the maximal indicator-variable model for a two-factor CRD experiment, where the two randomized factors A and B are fixed, is:

\[ \psi_{AB} = E[\mathbf{Y}] = \mathbf{X}_{AB}(\alpha\beta) \quad \text{and} \quad \mathbf{V} = \sigma^2_U \mathbf{I}_n \]

In respect of fitting polynomial submodels, two situations are possible:
i. one factor only is quantitative, or
ii. both factors are quantitative.


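Matrix expressions for models

\begin{align*}
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) && \text{depends on combination of A and B} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{A1}(\alpha\gamma)_1 + \mathbf{X}_{A2}(\alpha\gamma)_2 && \text{quadratic response to B, differing for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{A1}(\alpha\gamma)_1 && \text{linear response to B, differing for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{B}\beta && \text{nonsmooth, independent response to A \& B} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{1}\gamma_1 + \mathbf{X}_{2}\gamma_2 && \text{quadratic response to B, intercept differs for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{1}\gamma_1 && \text{linear response to B, intercept differs for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha && \text{nonsmooth response, depends on A only} \\
E[\mathbf{Y}] &= \mathbf{X}_{B}\beta && \text{nonsmooth response, depends on B only} \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu + \mathbf{X}_{1}\gamma_1 + \mathbf{X}_{2}\gamma_2 && \text{quadratic response to B, A has no effect} \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu + \mathbf{X}_{1}\gamma_1 && \text{linear response to B, A has no effect} \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu && \text{neither factor affects the response}
\end{align*}

where
  $(\alpha\beta) = \{(\alpha\beta)_{ij}\}$ is an $ab$-vector of effects,
  $(\alpha\gamma)_1 = \{(\alpha\gamma)_{i1}\}$ is an $a$-vector of linear coefficients,
  $(\alpha\gamma)_2 = \{(\alpha\gamma)_{i2}\}$ is an $a$-vector of quadratic coefficients,
  $\alpha = \{\alpha_i\}$ is an $a$-vector of effects,
  $\beta = \{\beta_j\}$ is a $b$-vector of effects,
  $\mathbf{X}_1$ is an $n$-vector containing the values of the levels of B,
  $\mathbf{X}_2$ is an $n$-vector containing the (values)$^2$ of the levels of B,
  $\mathbf{X}_{A1}$ is an $n \times a$ matrix whose $i$th column contains the values of the levels of B for just those units that received the $i$th level of A,
  $\mathbf{X}_{A2}$ is an $n \times a$ matrix whose $i$th column contains the (values)$^2$ of the levels of B for just those units that received the $i$th level of A.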

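Example VII.6 Effect of operating temperature on light output of an oscilloscope tube

Suppose an experiment is conducted to investigate the effect of the operating temperatures 75, 100, 125 and 150, for three glass types, on the light output of an oscilloscope tube.

Further suppose that this was done using a CRD with 2 reps. Then the X matrices for the analysis of the experiment are as follows (one row per unit; the three columns of $\mathbf{X}_{A1}$ and $\mathbf{X}_{A2}$ correspond to the three glass types):

 X_1 |  X_A1            |   X_2 |  X_A2
  75 |   75     0     0 |  5625 |  5625      0      0
  75 |   75     0     0 |  5625 |  5625      0      0
 100 |  100     0     0 | 10000 | 10000      0      0
 100 |  100     0     0 | 10000 | 10000      0      0
 125 |  125     0     0 | 15625 | 15625      0      0
 125 |  125     0     0 | 15625 | 15625      0      0
 150 |  150     0     0 | 22500 | 22500      0      0
 150 |  150     0     0 | 22500 | 22500      0      0
  75 |    0    75     0 |  5625 |     0   5625      0
  75 |    0    75     0 |  5625 |     0   5625      0
 100 |    0   100     0 | 10000 |     0  10000      0
 100 |    0   100     0 | 10000 |     0  10000      0
 125 |    0   125     0 | 15625 |     0  15625      0
 125 |    0   125     0 | 15625 |     0  15625      0
 150 |    0   150     0 | 22500 |     0  22500      0
 150 |    0   150     0 | 22500 |     0  22500      0
  75 |    0     0    75 |  5625 |     0      0   5625
  75 |    0     0    75 |  5625 |     0      0   5625
 100 |    0     0   100 | 10000 |     0      0  10000
 100 |    0     0   100 | 10000 |     0      0  10000
 125 |    0     0   125 | 15625 |     0      0  15625
 125 |    0     0   125 | 15625 |     0      0  15625
 150 |    0     0   150 | 22500 |     0      0  22500
 150 |    0     0   150 | 22500 |     0      0  22500

These matrices enter the models through the terms $\mathbf{X}_1\gamma_1$, $\mathbf{X}_{A1}(\alpha\gamma)_1$, $\mathbf{X}_2\gamma_2$ and $\mathbf{X}_{A2}(\alpha\gamma)_2$, where

\[ (\alpha\gamma)_1 = \begin{bmatrix} (\alpha\gamma)_{11} \\ (\alpha\gamma)_{21} \\ (\alpha\gamma)_{31} \end{bmatrix}, \qquad (\alpha\gamma)_2 = \begin{bmatrix} (\alpha\gamma)_{12} \\ (\alpha\gamma)_{22} \\ (\alpha\gamma)_{32} \end{bmatrix} \]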
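The four X matrices for this example can be generated rather than typed. The following is a sketch only, assuming the units are in standard order (glass type changing slowest, then temperature, then replicate); the object names Glass, Temp, XA, X1, X2, XA1 and XA2 are illustrative.

```r
temps <- c(75, 100, 125, 150)
a <- 3   # glass types
r <- 2   # replicates

# Units in standard order: glass type, then temperature, then replicate
Glass <- factor(rep(1:a, each = length(temps) * r))
Temp  <- rep(rep(temps, each = r), times = a)

X1 <- matrix(Temp, ncol = 1)        # n-vector of the levels of B
X2 <- X1^2                          # n-vector of the squared levels of B
XA <- model.matrix(~ Glass - 1)     # n x a indicator matrix for A
XA1 <- XA * Temp                    # column i: levels of B for units with glass i
XA2 <- XA * Temp^2                  # column i: squared levels for glass i
```

Multiplying the indicator matrix elementwise by Temp works because R recycles the n-vector down each column.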

Why this set of expectation models?

As before, $\gamma$s are used for the coefficients of polynomial terms: a numeric subscript for each quantitative fixed factor in the experiment is placed on the $\gamma$s to indicate the degree(s) to which the factor(s) is(are) raised.

The above models are ordered from the most complex to the simplest. They obey two rules:

Rule VII.1: The set of expectation models corresponds to the set of all possible combinations of potential expectation terms, subject to the restriction that terms marginal to another expectation term are excluded from the model;

Rule VII.2: An expectation model must include all polynomial terms of lower degree than a polynomial term that has been put in the model.

Definitions to determine if a polynomial term is of lower degree

Definition VII.7: A polynomial term is one in which the X matrix involves the quantitative levels of a factor(s).

Definition VII.8: The degree for a polynomial term with respect to a quantitative factor is the power to which levels of that factor are to be raised in this term.

Definition VII.9: A polynomial term is said to be of lower degree than a second polynomial term if,
  for each quantitative factor in the first term, its degree is less than or equal to its degree in the second term, and
  the degree of at least one factor in the first term is less than that of the same factor in the second term.
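Definition VII.9 can be written as a small predicate. This is an illustrative sketch: each term is represented by a vector of degrees, one element per quantitative factor, with 0 for a factor that is not in the term; is.lower.degree is a hypothetical name.

```r
# TRUE if the term with degrees d1 is of lower degree than the term with degrees d2
is.lower.degree <- function(d1, d2) {
  in.first <- d1 > 0   # only factors present in the first term are compared
  all(d1[in.first] <= d2[in.first]) && any(d1[in.first] < d2[in.first])
}

x11 <- c(A = 1, B = 1)   # A linear by B linear
x12 <- c(A = 1, B = 2)   # A linear by B quadratic
x21 <- c(A = 2, B = 1)   # A quadratic by B linear

is.lower.degree(x11, x12)   # x11 is of lower degree than x12
is.lower.degree(x12, x21)   # neither of x12 and x21 is of lower degree
is.lower.degree(x21, x12)
```

Under Rule VII.2, a model containing the x12 term must therefore also contain the x11 term, but need not contain the x21 term.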

Marginality of terms and models

Note that the term $\mathbf{X}_1\gamma_1$ is not marginal to $\mathbf{X}_2\gamma_2$: the column $\mathbf{X}_1$ is not a linear combination of the column $\mathbf{X}_2$.

However, the degree of $\mathbf{X}_1\gamma_1$ is less than that of $\mathbf{X}_2\gamma_2$, so the degree rule above implies that if the term $\mathbf{X}_2\gamma_2$ is included in the model, so must the term $\mathbf{X}_1\gamma_1$.

As far as the marginality of models is concerned, the model involving just $\mathbf{X}_1\gamma_1$ is marginal to the model consisting of $\mathbf{X}_1\gamma_1$ and $\mathbf{X}_2\gamma_2$; that is,

\[ E[\mathbf{Y}] = \mathbf{X}_G\mu + \mathbf{X}_1\gamma_1 \quad \text{is marginal to} \quad E[\mathbf{Y}] = \mathbf{X}_G\mu + \mathbf{X}_1\gamma_1 + \mathbf{X}_2\gamma_2 \]

Marginality of terms and models (cont'd)

Also note that the term $\mathbf{X}_1\gamma_1$ is marginal to $\mathbf{X}_{A1}(\alpha\gamma)_1$, since $\mathbf{X}_1$ is the sum of the columns of $\mathbf{X}_{A1}$.

Consequently, a model containing $\mathbf{X}_{A1}(\alpha\gamma)_1$ will not contain $\mathbf{X}_1\gamma_1$.

In general, the models to which a particular model is marginal will be found above it in the list.

The matrices $\mathbf{X}_1$, $\mathbf{X}_{A1}$, $\mathbf{X}_2$ and $\mathbf{X}_{A2}$ referred to here are those given for Example VII.6.
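The two marginality statements can be checked numerically for the oscilloscope example by comparing matrix ranks. This is a sketch under the assumption of units in standard order; in.col.space is a hypothetical helper that reports whether a column lies in the column space of a matrix.

```r
temps <- rep(rep(c(75, 100, 125, 150), each = 2), times = 3)
glass <- factor(rep(1:3, each = 8))

X1  <- temps
X2  <- temps^2
XA1 <- model.matrix(~ glass - 1) * X1   # n x a matrix of levels of B by glass type

# TRUE if x is a linear combination of the columns of M
in.col.space <- function(x, M) qr(cbind(M, x))$rank == qr(M)$rank

x1.in.XA1 <- in.col.space(X1, XA1)          # X1 is the sum of the columns of XA1
x1.in.X2  <- in.col.space(X1, matrix(X2))   # X1 is not a multiple of X2
```

Here x1.in.XA1 is TRUE and x1.in.X2 is FALSE, matching the statements that the linear term is marginal to the A-specific linear term but not to the quadratic term.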

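ANOVA table for a two-factor CRD with one quantitative factor

Source               df             SSq
Units                n-1            $\mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$
  A                  a-1            $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$
  B                  b-1            $\mathbf{Y}'\mathbf{Q}_B\mathbf{Y}$
    Linear           1              $\mathbf{Y}'\mathbf{Q}_{B_L}\mathbf{Y}$
    Quadratic        1              $\mathbf{Y}'\mathbf{Q}_{B_Q}\mathbf{Y}$
    Deviations       b-3            $\mathbf{Y}'\mathbf{Q}_{B_{Dev}}\mathbf{Y}$
  A#B                (a-1)(b-1)     $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$
    A#B Linear       a-1            $\mathbf{Y}'\mathbf{Q}_{AB_L}\mathbf{Y}$
    A#B Quadratic    a-1            $\mathbf{Y}'\mathbf{Q}_{AB_Q}\mathbf{Y}$
    Deviations       (a-1)(b-3)     $\mathbf{Y}'\mathbf{Q}_{AB_{Dev}}\mathbf{Y}$
  Residual           ab(r-1)        $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$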

Strategy in determining models to be used to describe the data

For Deviations
  Only if the terms to which a term is marginal are not significant then, if $P(F \ge F_{calc}) \le \alpha$, the evidence suggests that $H_0$ be rejected and the term must be incorporated in the model.
  Deviations for B is marginal to Deviations for A#B so that if the latter is significant, the Deviations for B is not tested; indeed no further testing occurs as the maximal model has to be used to describe the data.

For A#B Linear and A#B Quadratic
  Only if the polynomial terms are not of lower degree than a significant polynomial term then, if $P(F \ge F_{calc}) \le \alpha$, the evidence suggests that $H_0$ be rejected and the term be incorporated in the model.
  A#B Linear is of lower degree than A#B Quadratic so that if the latter is significant, A#B Linear is not tested.

For A, Linear for B, Quadratic for B
  Only if the terms to which a term is marginal and the polynomial terms of higher degree are not significant then, if $P(F \ge F_{calc}) \le \alpha$, the evidence suggests that $H_0$ be rejected and the term be incorporated in the model.
  For example, the Linear term for B is of lower degree than the Quadratic term for B and is marginal to A#B Linear, so that if either of these is significant, Linear for B is not tested.

Both factors quantitative

Example VII.7 Muzzle velocity of an antipersonnel weapon

In a two-factor CRD experiment with two replicates, the effect of Vent volume and Discharge hole area on the muzzle velocity of a mortar-like antipersonnel weapon was investigated.

Vent            Discharge hole area
volume     0.016    0.03   0.048   0.062
0.29       294.9   295.0   270.5   258.6
           294.1   301.1   263.2   255.9
0.40       301.7   293.1   278.6   257.1
           307.9   300.6   267.9   263.6
0.59       285.5   285.0   269.6   262.6
           298.6   289.1   269.1   260.3
0.91       303.1   277.8   262.2   305.3
           305.3   266.4   263.2   304.9
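For readers who want to reproduce the analysis, the table can be entered directly. This is a sketch only: the data frame layout and the model formula are assumptions (in particular, no Error term is included, which for a CRD should leave the sums of squares unchanged); the names Fac2Muzzle.dat and Fac2Muzzle.aov follow the names used elsewhere in this example.

```r
Fac2Muzzle.dat <- data.frame(
  Vent.Vol  = factor(rep(c(0.29, 0.40, 0.59, 0.91), each = 8)),
  Hole.Area = factor(rep(c(0.016, 0.03, 0.048, 0.062), times = 8)),
  # Within each vent volume: first replicate row, then second replicate row
  Velocity  = c(294.9, 295.0, 270.5, 258.6, 294.1, 301.1, 263.2, 255.9,
                301.7, 293.1, 278.6, 257.1, 307.9, 300.6, 267.9, 263.6,
                285.5, 285.0, 269.6, 262.6, 298.6, 289.1, 269.1, 260.3,
                303.1, 277.8, 262.2, 305.3, 305.3, 266.4, 263.2, 304.9))

Fac2Muzzle.aov <- aov(Velocity ~ Vent.Vol * Hole.Area, data = Fac2Muzzle.dat)
anova.tab <- summary(Fac2Muzzle.aov)[[1]]   # rows: Vent.Vol, Hole.Area, interaction, Residuals
cell.means <- with(Fac2Muzzle.dat, tapply(Velocity, list(Vent.Vol, Hole.Area), mean))
```

The cell means show the pattern behind the interaction: for vent volume 0.91 the mean velocity at hole area 0.062 is about as high as at 0.016, whereas for the other vent volumes it declines steadily.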

Interaction plot produced using R

Pretty clear that there is an interaction.

[Figure: interaction plot of mean of Velocity against Vent.Vol (0.29, 0.4, 0.59, 0.91), with one line for each Hole.Area (0.016, 0.03, 0.048, 0.062)]

Maximal polynomial submodel, in terms of a single observation

\begin{align*}
E[Y_{ijk}] = \mu &+ \gamma_{10} x_{\alpha_i} + \gamma_{20} x_{\alpha_i}^2 + \gamma_{01} x_{\beta_j} + \gamma_{02} x_{\beta_j}^2 \\
&+ \gamma_{11} x_{\alpha_i} x_{\beta_j} + \gamma_{12} x_{\alpha_i} x_{\beta_j}^2 + \gamma_{21} x_{\alpha_i}^2 x_{\beta_j} + \gamma_{22} x_{\alpha_i}^2 x_{\beta_j}^2
\end{align*}

where
  $Y_{ijk}$ is the random variable representing the response variable for the $k$th unit that received the $i$th level of factor A and the $j$th level of factor B,
  $\mu$ is the overall level of the response variable in the experiment,
  $x_{\alpha_i}$ is the value of the $i$th level of factor A,
  $x_{\beta_j}$ is the value of the $j$th level of factor B,
  the $\gamma$s are the coefficients of the equation describing the change in response as the levels of A and/or B change, with the first subscript indicating the degree with respect to factor A and the second subscript indicating the degree with respect to factor B.

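Maximal polynomial submodel, in matrix terms

\[ E[\mathbf{Y}] = \mathbf{X}_G\mu + \mathbf{X}_{22}\gamma_{22} \]

where
\[ \gamma_{22}' = \begin{bmatrix} \gamma_{10} & \gamma_{20} & \gamma_{01} & \gamma_{02} & \gamma_{11} & \gamma_{12} & \gamma_{21} & \gamma_{22} \end{bmatrix} \]
\[ \mathbf{X}_{22} = \begin{bmatrix} \mathbf{X}_{10} & \mathbf{X}_{20} & \mathbf{X}_{01} & \mathbf{X}_{02} & \mathbf{X}_{11} & \mathbf{X}_{12} & \mathbf{X}_{21} & \mathbf{X}_{22} \end{bmatrix} \]

$\mathbf{X}_{22}$ is an $n \times 8$ matrix whose columns are the products of the values of the levels of A and B as indicated by the subscripts on the $\mathbf{X}$s.

For example,
  the 3rd column consists of the values of the levels of B;
  the 7th column consists of the product of the squared values of the levels of A with the values of the levels of B.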
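As an illustration, the eight columns can be built for the muzzle velocity layout. This is a sketch, assuming the 32 units are ordered with vent volume changing slowest; xa, xb and X22 are illustrative object names.

```r
xa <- rep(c(0.29, 0.40, 0.59, 0.91), each = 8)        # values of the levels of A
xb <- rep(c(0.016, 0.03, 0.048, 0.062), times = 8)    # values of the levels of B

# Columns in the order of the subscripts: 10, 20, 01, 02, 11, 12, 21, 22
X22 <- cbind(X10 = xa,      X20 = xa^2,
             X01 = xb,      X02 = xb^2,
             X11 = xa * xb, X12 = xa * xb^2,
             X21 = xa^2 * xb, X22 = xa^2 * xb^2)
```

Each column name gives the degree in A followed by the degree in B, so column 3 (X01) holds the levels of B and column 7 (X21) holds the squared levels of A times the levels of B.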

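Set of expectation models considered when both factors are quantitative

\begin{align*}
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) && \text{depends on combination of A and B} \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu + \mathbf{X}_{22}\gamma_{22} && \text{smooth response in A and B} \\
& && \text{(or some subset of } \mathbf{X}_{22}\gamma_{22} \text{ that obeys the degrees rule)} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{B}\beta && \text{nonsmooth, independent response to A \& B} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{01}\gamma_{01} + \mathbf{X}_{02}\gamma_{02} && \text{quadratic response for B, intercept differs for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{01}\gamma_{01} && \text{linear response for B, intercept differs for A} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha && \text{nonsmooth response, depends on A only} \\
E[\mathbf{Y}] &= \mathbf{X}_{B}\beta + \mathbf{X}_{10}\gamma_{10} + \mathbf{X}_{20}\gamma_{20} && \text{quadratic response for A, intercept differs for B} \\
E[\mathbf{Y}] &= \mathbf{X}_{B}\beta + \mathbf{X}_{10}\gamma_{10} && \text{linear response for A, intercept differs for B} \\
E[\mathbf{Y}] &= \mathbf{X}_{B}\beta && \text{nonsmooth response, depends on B only}
\end{align*}

where
  $\gamma_{22}' = \begin{bmatrix} \gamma_{10} & \gamma_{20} & \gamma_{01} & \gamma_{02} & \gamma_{11} & \gamma_{12} & \gamma_{21} & \gamma_{22} \end{bmatrix}$,
  $(\alpha\beta) = \{(\alpha\beta)_{ij}\}$,
  $\alpha = \{\alpha_i\}$,
  $\beta = \{\beta_j\}$.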

Set of expectation models (continued)

Again, rules VII.1 and VII.2 were used in deriving this set of models.

Also, the subsets of terms from $\mathbf{X}_{22}\gamma_{22}$ mentioned above include the null subset and must conform to rule VII.2, so that whenever a term from $\mathbf{X}_{22}\gamma_{22}$ is added to the subset, all terms of lower degree must also be included in the subset.
  $\mathbf{X}_{11}\gamma_{11}$ is of lower degree than $\mathbf{X}_{12}\gamma_{12}$, so a model with $\mathbf{X}_{12}\gamma_{12}$ must include $\mathbf{X}_{11}\gamma_{11}$.
  $\mathbf{X}_{21}\gamma_{21}$ is not of lower degree than $\mathbf{X}_{12}\gamma_{12}$, so a model with $\mathbf{X}_{12}\gamma_{12}$ does not need $\mathbf{X}_{21}\gamma_{21}$.

Further, if for a term the Deviation for a marginal term is significant, polynomial terms are not considered for it.

Interpreting the fitted models

Models in which there are only single-factor polynomial terms define
  a plane if both terms are linear,
  a parabolic tunnel if one term is linear and the other quadratic,
  a paraboloid if both involve quadratic terms.

Models including interaction submodels define nonlinear surfaces:
  they will be monotonic for factors involving only linear terms;
  for interactions involving quadratic terms, some candidate shapes are:

[Figure: candidate surface shapes]

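ANOVA table for a two-factor CRD with both factors quantitative

Source                          df                 SSq
Units                           n-1                $\mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$
  A                             a-1                $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$
    Linear                      1                  $\mathbf{Y}'\mathbf{Q}_{A_L}\mathbf{Y}$
    Quadratic                   1                  $\mathbf{Y}'\mathbf{Q}_{A_Q}\mathbf{Y}$
    Deviations                  a-3                $\mathbf{Y}'\mathbf{Q}_{A_{Dev}}\mathbf{Y}$
  B                             b-1                $\mathbf{Y}'\mathbf{Q}_B\mathbf{Y}$
    Linear                      1                  $\mathbf{Y}'\mathbf{Q}_{B_L}\mathbf{Y}$
    Quadratic                   1                  $\mathbf{Y}'\mathbf{Q}_{B_Q}\mathbf{Y}$
    Deviations                  b-3                $\mathbf{Y}'\mathbf{Q}_{B_{Dev}}\mathbf{Y}$
  A#B                           (a-1)(b-1)         $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$
    A Linear # B Linear         1                  $\mathbf{Y}'\mathbf{Q}_{A_L B_L}\mathbf{Y}$
    A Linear # B Quadratic      1                  $\mathbf{Y}'\mathbf{Q}_{A_L B_Q}\mathbf{Y}$
    A Quadratic # B Linear      1                  $\mathbf{Y}'\mathbf{Q}_{A_Q B_L}\mathbf{Y}$
    A Quadratic # B Quadratic   1                  $\mathbf{Y}'\mathbf{Q}_{A_Q B_Q}\mathbf{Y}$
    Deviations                  (a-1)(b-1)-4       $\mathbf{Y}'\mathbf{Q}_{AB_{Dev}}\mathbf{Y}$
  Residual                      ab(r-1)            $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$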

Step 3: Decide between hypotheses

For Deviations
  Only if the terms to which a term is marginal are not significant then, if $\Pr\{F \ge F_0\} = p \le \alpha$, the evidence suggests that $H_0$ be rejected and the term must be incorporated in the model.
  Deviations for A and B are marginal to Deviations for A#B so that if the latter is significant, neither the Deviations for A nor for B is tested; indeed no further testing occurs as the maximal model has to be used to describe the data.

For all Linear and Quadratic terms
  Only if the polynomial terms are not of lower degree than a significant polynomial term and the terms to which the term is marginal are not significant then, if $\Pr\{F \ge F_0\} = p \le \alpha$, the evidence suggests that $H_0$ be rejected; the term and all polynomial terms of lower degree must be incorporated in the model.
  For example, A Linear # B Linear is marginal to A#B and is of lower degree than all other polynomial interaction terms, and so is not tested if any of them is significant.

Example VII.7 Muzzle velocity of an antipersonnel weapon (continued)

Here is the analysis produced using R, where both factors are converted to ordered factors and polynomial contrasts for unequally-spaced levels are obtained.

> attach(Fac2Muzzle.dat)
> interaction.plot(Vent.Vol, Hole.Area, Velocity, lwd=4)
> Vent.Vol.lev <- ...
> Fac2Muzzle.dat$Vent.Vol <- ...
> contrasts(Fac2Muzzle.dat$Vent.Vol) <- ...
> contrasts(Fac2Muzzle.dat$Vent.Vol)
> Hole.Area.lev <- ...
> Fac2Muzzle.dat$Hole.Area <- ...
> contrasts(Fac2Muzzle.dat$Hole.Area) <- ...
> contrasts(Fac2Muzzle.dat$Hole.Area)
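One way to obtain polynomial contrasts for unequally-spaced levels is contr.poly with its scores argument. The sketch below is an assumption about how the conversion could be done, not necessarily the code used for these notes; it works on standalone factors rather than on Fac2Muzzle.dat.

```r
Vent.Vol.lev  <- c(0.29, 0.40, 0.59, 0.91)
Hole.Area.lev <- c(0.016, 0.03, 0.048, 0.062)

# Ordered factors for 32 units, vent volume changing slowest
Vent.Vol  <- ordered(rep(Vent.Vol.lev, each = 8), levels = Vent.Vol.lev)
Hole.Area <- ordered(rep(Hole.Area.lev, times = 8), levels = Hole.Area.lev)

# Orthonormal polynomial contrasts that respect the actual spacing of the levels
contrasts(Vent.Vol)  <- contr.poly(4, scores = Vent.Vol.lev)
contrasts(Hole.Area) <- contr.poly(4, scores = Hole.Area.lev)

cV <- contrasts(Vent.Vol)
cH <- contrasts(Hole.Area)
```

The columns .L, .Q and .C are the linear, quadratic and cubic contrasts; because the levels are unequally spaced, the linear contrast for Vent.Vol is not symmetric about zero.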

Contrasts

> contrasts(Fac2Muzzle.dat$Vent.Vol)
              .L         .Q          .C
0.29 -0.54740790  0.5321858 -0.40880670
0.4  -0.31356375 -0.1895091  0.78470636
0.59  0.09034888 -0.7290797 -0.45856278
0.91  0.77062277  0.3864031  0.08266312
> contrasts(Fac2Muzzle.dat$Hole.Area)
              .L   .Q         .C
0.016 -0.6584881  0.5 -0.2576693
0.03  -0.2576693 -0.5  0.6584881
0.048  0.2576693 -0.5 -0.6584881
0.062  0.6584881  0.5  0.2576693
> summary(Fac2Muzzle.aov, split = list(
+   Vent.Vol = list(L=1, Q=2, Dev=3),
+   Hole.Area = list(L=1, Q=2, Dev=3),
+   "Vent.Vol:Hole.Area" = list(L.L=1, L.Q=2, Q.L=4, Q.Q=5, Dev=c(3,6:9))))

                              Factor B
              Contrast        1         2         3
   Contrast   Label           L         Q         Dev
A  1          L               L.L(1)    L.Q(2)    Dev(3)
   2          Q               Q.L(4)    Q.Q(5)    Dev(6)
   3          Dev             Dev(7)    Dev(8)    Dev(9)

Table shows numbering of contrasts (standard order; by rows).

R ANOVA (split used for both factors and interactions in the summary function)

> summary(Fac2Muzzle.aov, split = list(
+   Vent.Vol = list(L=1, Q=2, Dev=3),
+   Hole.Area = list(L=1, Q=2, Dev=3),
+   "Vent.Vol:Hole.Area" = list(L.L=1, L.Q=2, Q.L=4, Q.Q=5, Dev=c(3,6:9))))

Error: Test
                          Df Sum Sq Mean Sq  F value    Pr(>F)
Vent.Vol                   3  379.5   126.5   5.9541 0.0063117
  Vent.Vol: L              1  108.2   108.2   5.0940 0.0383455
  Vent.Vol: Q              1   72.0    72.0   3.3911 0.0841639
  Vent.Vol: Dev            1  199.2   199.2   9.3771 0.0074462
Hole.Area                  3 5137.2  1712.4  80.6092 7.138e-10
  Hole.Area: L             1 4461.2  4461.2 210.0078 1.280e-10
  Hole.Area: Q             1  357.8   357.8  16.8422 0.0008297
  Hole.Area: Dev           1  318.2   318.2  14.9776 0.0013566
Vent.Vol:Hole.Area         9 3973.5   441.5  20.7830 3.365e-07
  Vent.Vol:Hole.Area: L.L  1 1277.2  1277.2  60.1219 8.298e-07
  Vent.Vol:Hole.Area: L.Q  1   89.1    89.1   4.1962 0.0572893
  Vent.Vol:Hole.Area: Q.L  1 2171.4  2171.4 102.2166 2.358e-08
  Vent.Vol:Hole.Area: Q.Q  1  308.5   308.5  14.5243 0.0015364
  Vent.Vol:Hole.Area: Dev  5  127.2    25.4   1.1975 0.3541807
Residuals                 16  339.9    21.2

Analysis summary

Source                        df     SSq      MSq       F       p
Runs                          31
  Vent.Vol                     3    379.5    126.5     5.95   0.006
    Linear                     1    108.2    108.2     5.09   0.038
    Quadratic                  1     72.0     72.0     3.39   0.084
    Deviations                 1    199.2    199.2     9.38   0.007
  Hole.Area                    3   5137.2   1712.4    80.61   0.000
    Linear                     1   4461.2   4461.2   210.01   0.000
    Quadratic                  1    357.8    357.8    16.84   0.001
    Deviations                 1    318.2    318.2    14.98   0.001
  Vent.Vol#Hole.Area           9   3973.5    441.5    20.78   0.000
    V Linear # H Linear        1   1277.2   1277.2    60.12   0.000
    V Linear # H Quadratic     1     89.1     89.1     4.20   0.057
    V Quadratic # H Linear     1   2171.4   2171.4   102.22   0.000
    V Quadratic # H Quadratic  1    308.5    308.5    14.52   0.002
    Deviations                 5    127.2     25.4     1.20   0.354
  Residual                    16    339.9     21.2

The 5 interaction Deviations lines have been pooled: their df and SSq have been added together.

While the Deviations for the interaction is not significant (p = 0.354), those for both the main effects are significant (p = 0.007 and p = 0.001). Hence a smooth response function cannot be fitted.

Furthermore, the V Quadratic # H Quadratic source is significant (p = 0.002) so that interaction terms are required.

In this case, revert to the maximal model and use multiple comparisons.
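The single-df interaction sums of squares can also be computed directly from the cell means, which gives an independent check on which contrast each number belongs to. This is a sketch under the assumption that the data are as tabulated for Example VII.7; with orthonormal contrasts and r = 2 replicates, each sum of squares is r times the squared contrast of the cell means. In this direct calculation 2171.4 comes out as Vent.Vol linear by Hole.Area quadratic and 89.1 as Vent.Vol quadratic by Hole.Area linear, so it may be worth checking the L.Q and Q.L labels against the order in which R numbers interaction contrasts (the first factor's contrasts change fastest).

```r
vent <- c(0.29, 0.40, 0.59, 0.91)
hole <- c(0.016, 0.03, 0.048, 0.062)
velocity <- c(294.9, 295.0, 270.5, 258.6, 294.1, 301.1, 263.2, 255.9,
              301.7, 293.1, 278.6, 257.1, 307.9, 300.6, 267.9, 263.6,
              285.5, 285.0, 269.6, 262.6, 298.6, 289.1, 269.1, 260.3,
              303.1, 277.8, 262.2, 305.3, 305.3, 266.4, 263.2, 304.9)
r <- 2

# velocity is ordered hole area (fastest), replicate, vent volume (slowest)
y <- array(velocity, dim = c(4, r, 4))
cell.means <- t(apply(y, c(1, 3), mean))   # rows: vent volume, columns: hole area

cV <- contr.poly(4, scores = vent)         # columns: linear, quadratic, cubic
cH <- contr.poly(4, scores = hole)

# Sum of squares for each (vent contrast, hole contrast) pair
ss <- r * (t(cV) %*% cell.means %*% cH)^2
dimnames(ss) <- list(Vent.Vol = c("L", "Q", "C"), Hole.Area = c("L", "Q", "C"))
round(ss, 1)
```

The nine entries of ss add to the Vent.Vol#Hole.Area sum of squares, and the five entries involving a cubic contrast add to the pooled Deviations line.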

Fitting these submodels in R

Extension of the procedure for a single factor:
  having specified polynomial contrasts for each quantitative factor, the split argument of the summary function, which takes a list, is used to obtain the SSqs.

The general form of the summary function for one factor, B say, quantitative is (details in Appendix C.5, Factorial experiments):

summary(Experiment.aov, split = list(
    B = list(L = 1, Q = 2, Dev = 3:(b-1)),
    "A:B" = list(L = 1, Q = 2, Dev = 3:(b-1))))

and for two factors, A and B say, quantitative is

summary(Experiment.aov, split = list(
    A = list(L = 1, Q = 2, Dev = 3:(a-1)),
    B = list(L = 1, Q = 2, Dev = 3:(b-1)),
    "A:B" = list(L.L = 1, L.Q = 2, Q.L = b, Q.Q = (b+1),
                 Dev = c(3:(b-1), (b+2):((a-1)*(b-1))))))

(drop Dev terms for b = 3 or a = 3)

VII.G Nested factorial structures

Nested factorial structures commonly arise when
  a control treatment is included, or
  an interaction can be described in terms of one cell being different to the others.

Set up
  a factor (One say) with two levels: one for the control treatment or the different cell, and one for the other treatments or cells;
  a second factor (Treats say) with the same number of levels as there are treatments or cells.

The structure for these two factors is One/Treats. The terms in the analysis are One + Treats[One].
  One compares the control or single cell with the mean of the others.
  Treats[One] reflects the differences between the other treatments or cells.

This can be achieved using an orthogonal contrast, but nested factors are more convenient.

General nested factorial structure set-up

An analysis in which there is:
  a term that reflects the average differences between g groups;
  a term that reflects the differences within groups, or several terms each one of which reflects the differences within a group.

Example VII.8 Grafting experiment

For example, consider the following RCBD experiment involving two factors each at two levels.

The response is the percent grafts that take.

           B       1          2
           A     1    2     1    2
Block  I        64   23    30   15
       II       75   14    50   33
       III      76   12    41   17
       IV       73   33    25   10

One observation is missing; a value has been inserted so that the residual is zero.

Example VII.8 Grafting experiment (continued)

a) Description of pertinent features of the study
1. Observational unit: a plot
2. Response variable: % Take
3. Unrandomized factors: Blocks, Plots
4. Randomized factors: A, B
5. Type of study: Two-factor RCBD

b) The experimental structure
Structure       Formula
unrandomized    4 Blocks/4 Plots
randomized      2 A*2 B

R output

> attach(Fac2Take.dat)
> Fac2Take.dat
   Blocks Plots A B Take
1       1     1 1 1   64
2       1     2 2 1   23
3       1     3 1 2   30
4       1     4 2 2   15
5       2     1 1 1   75
6       2     2 2 1   14
7       2     3 1 2   50
8       2     4 2 2   33
9       3     1 1 1   76
10      3     2 2 1   12
11      3     3 1 2   41
12      3     4 2 2   17
13      4     1 1 1   73
14      4     2 2 1   33
15      4     3 1 2   25
16      4     4 2 2   10
> interaction.plot(A, B, Take, lwd=4)

An interaction

[Figure: interaction plot of mean of Take against A (levels 1 and 2), with one line for each level of B]

R output (continued)

> Fac2Take.aov <- ...
> summary(Fac2Take.aov)

Error: Blocks
       Df  Sum Sq Mean Sq
Blocks  3 221.188  73.729

Error: Blocks:Plots
          Df Sum Sq Mean Sq F value    Pr(>F)
A          1 4795.6  4795.6  52.662 4.781e-05
B          1 1387.6  1387.6  15.238  0.003600
A:B        1 1139.1  1139.1  12.509  0.006346
Residuals  9  819.6    91.1



> res <- ...
> fit <- ...
> plot(fit, res, pch=16)
> plot(as.numeric(A), res, pch=16)
> plot(as.numeric(B), res, pch=16)
> qqnorm(res, pch=16)
> qqline(res)
> tukey.1df(Fac2Take.aov, Fac2Take.dat,
+           error.term = "Blocks:Plots")
$Tukey.SS
[1] 2.879712

$Tukey.F
[1] 0.02820886

$Tukey.p
[1] 0.870787

$Devn.SS
[1] 816.6828

Recompute for missing value

Recalculate either in R or in Excel. See notes for Excel details.

> #
> # recompute for missing value
> #
> MSq <- ...
> Res <- ...
> df.num <- ...
> df.den <- ...
> Fvalue <- ...
> pvalue <- ...
> data.frame(MSq,Res,df.num,df.den,Fvalue,pvalue)
        MSq      Res df.num df.den      Fvalue       pvalue
1   73.7290 102.4500      3      8  0.71965837 0.5677335580
2 4795.6000 102.4500      1      8 46.80917521 0.0001320942
3 1387.6000 102.4500      1      8 13.54416789 0.0062170009
4 1139.1000 102.4500      1      8 11.11859444 0.0103158259
5    2.8797 116.6690      1      7  0.02468266 0.8795959255
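The recomputation amounts to dividing the same sums of squares by one fewer Residual degree of freedom. The following sketch reproduces the printed table; the right-hand sides are inferred from the printed values (819.6/8 = 102.45 and 816.6828/7 = 116.669) and are not necessarily the original commands.

```r
# Rows: Blocks, A, B, A:B and Tukey's one-degree-of-freedom for nonadditivity
MSq    <- c(73.729, 4795.6, 1387.6, 1139.1, 2.8797)
df.num <- c(3, 1, 1, 1, 1)
df.den <- c(8, 8, 8, 8, 7)        # Residual df reduced by 1 for the missing value

# Residual mean squares recomputed with the reduced df
Res <- c(rep(819.6 / 8, 4), 816.6828 / 7)

Fvalue <- MSq / Res
pvalue <- 1 - pf(Fvalue, df.num, df.den)
data.frame(MSq, Res, df.num, df.den, Fvalue, pvalue)
```

Compared with the unadjusted output, the F values are smaller and the p values larger, but A, B and A:B all remain significant at the 5% level.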

Diagnostic checking

[Figure: four diagnostic plots: res against as.numeric(A), res against as.numeric(B), res against fit, and a Normal Q-Q plot of the residuals (Sample Quantiles)]

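Hypothesis test for this example

Step 1: Set up hypotheses

a) $H_0$: $(\alpha\beta)_{21} - (\alpha\beta)_{11} - (\alpha\beta)_{22} + (\alpha\beta)_{12} = 0$
   $H_1$: $(\alpha\beta)_{21} - (\alpha\beta)_{11} - (\alpha\beta)_{22} + (\alpha\beta)_{12} \ne 0$

b) $H_0$: $\alpha_1 = \alpha_2$
   $H_1$: $\alpha_1 \ne \alpha_2$

c) $H_0$: $\beta_1 = \beta_2$
   $H_1$: $\beta_1 \ne \beta_2$

Set $\alpha = 0.05$.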

Hypothesis test for this example (continued)

Step 2: Calculate test statistics

The ANOVA table for the two-factor RCBD is:

Source           df    SSq      MSq      E[MSq]                           F       Prob
Blocks            3    221.2     73.7    $\sigma^2_{BP} + 4\sigma^2_B$     0.72   0.568
Plots[Blocks]    12   8141.8
  A               1   4795.6   4795.6    $\sigma^2_{BP} + q_A(\psi)$      46.81


Suppose the researcher wants to determine the level of A that has the greatest take for each level of B.

> #
> # multiple comparisons
> #
> Fac2Take.tab <- ...
> Fac2Take.tab$tables$"A:B"
   B
A       1     2
  1 72.00 36.50
  2 20.50 18.75
> q <- ...
> q
[1] 4.52881

\[ w(5\%) = \frac{4.52881}{\sqrt{2}} \times \sqrt{\frac{102.4 \times 2}{4}} = 22.91 \]

There is no difference between the levels of A at level two of B.
There is an A difference at level one of B; level one of A maximizes.
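The simple-effects comparison can be carried out in a few lines. This is a sketch, assuming a Residual df of 8 (9 less 1 for the missing value) and using the unrounded Residual mean square 819.6/8, so w differs from the hand calculation only in the second decimal place.

```r
cell.means <- matrix(c(72.00, 20.50, 36.50, 18.75), nrow = 2,
                     dimnames = list(A = c("1", "2"), B = c("1", "2")))

resid.df  <- 8                # Residual df after allowing for the missing value
resid.msq <- 819.6 / 8
r <- 4                        # each cell mean is based on 4 blocks

q <- qtukey(0.95, nmeans = 4, df = resid.df)
w <- q / sqrt(2) * sqrt(resid.msq * 2 / r)

# Difference between the levels of A within each level of B (simple effects)
a.diff <- cell.means["1", ] - cell.means["2", ]
a.differs <- abs(a.diff) > w
```

Only the difference at level one of B (51.5) exceeds w; the difference at level two of B (17.75) does not.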

Best description

> Fac2Take.tab$tables$"A:B"
Dim 1 : A
Dim 2 : B
      1     2
1 72.00 36.50
2 20.50 18.75

A and B both at level 1 is different from either A or B not at level 1.

However, the results are only approximate because of the missing value.

Testing for this can be achieved by setting up a factor for the 4 treatments and a two-level factor that compares the cell with A and B both at level 1 with the remaining treatments.

The four-level factor for treatments is then specified as nested within the two-level factor.

Re-analysis achieved in R

> Fac2Take.dat$Cell.1.1 <- ...
> Fac2Take.dat$Treats <- ...
> detach(Fac2Take.dat)
> attach(Fac2Take.dat)
> Fac2Take.dat
   Blocks Plots A B Take Cell.1.1 Treats
1       1     1 1 1   64        1      1
2       1     2 2 1   23        2      3
3       1     3 1 2   30        2      2
4       1     4 2 2   15        2      4
5       2     1 1 1   75        1      1
6       2     2 2 1   14        2      3
7       2     3 1 2   50        2      2
8       2     4 2 2   33        2      4
9       3     1 1 1   76        1      1
10      3     2 2 1   12        2      3
11      3     3 1 2   41        2      2
12      3     4 2 2   17        2      4
13      4     1 1 1   73        1      1
14      4     2 2 1   33        2      3
15      4     3 1 2   25        2      2
16      4     4 2 2   10        2      4

Re-analysis (continued)

> Fac2Take.aov <- ...
+                 Error(Blocks/Plots), Fac2Take.dat)
> summary(Fac2Take.aov)

Error: Blocks
       Df  Sum Sq Mean Sq
Blocks  3 221.188  73.729

Error: Blocks:Plots
                Df Sum Sq Mean Sq F value    Pr(>F)
Cell.1.1         1 6556.7  6556.7 72.0021 1.378e-05
Cell.1.1:Treats  2  765.5   382.8  4.2032   0.05139
Residuals        9  819.6    91.1
> # recompute for missing value
> MSq <- ...
> Res <- ...
> df.num <- ...
> Fvalue <- ...
> pvalue <- ...
> data.frame(MSq,Res,df.num,Fvalue,pvalue)
       MSq    Res df.num     Fvalue       pvalue
1   73.729 102.45      3  0.7196584 5.677336e-01
2 6556.700 102.45      1 63.9990239 4.367066e-05
3  382.800 102.45      2  3.7364568 7.146140e-02
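The nested re-analysis can be sketched from the data listing. The way Cell.1.1 and Treats are constructed below, and the model formula, are assumptions chosen to match the printed factor values and output; they are not necessarily the commands used for these notes.

```r
Fac2Take.dat <- data.frame(
  Blocks = factor(rep(1:4, each = 4)),
  Plots  = factor(rep(1:4, times = 4)),
  A      = factor(rep(c(1, 2, 1, 2), times = 4)),
  B      = factor(rep(c(1, 1, 2, 2), times = 4)),
  Take   = c(64, 23, 30, 15, 75, 14, 50, 33, 76, 12, 41, 17, 73, 33, 25, 10))

# Two-level factor: level 1 for the cell with A and B both at level 1, level 2 otherwise
Fac2Take.dat$Cell.1.1 <- factor(ifelse(Fac2Take.dat$A == 1 & Fac2Take.dat$B == 1, 1, 2))
# Four-level treatment factor, numbered as in the listing (B changing fastest within A)
Fac2Take.dat$Treats <- factor(2 * (as.numeric(Fac2Take.dat$A) - 1) +
                              as.numeric(Fac2Take.dat$B))

# Treats nested within Cell.1.1
Fac2Take.aov <- aov(Take ~ Blocks + Cell.1.1/Treats + Error(Blocks/Plots),
                    data = Fac2Take.dat)
strata <- summary(Fac2Take.aov)
plots.tab <- strata[[length(strata)]][[1]]   # Blocks:Plots stratum
```

The first row of plots.tab is the single-df comparison of cell 1,1 with the rest, and the second row (2 df) measures the differences among the other three treatments.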

Revised analysis of variance table

Source               df    SSq      MSq       F      Prob
Blocks                3    221.2     73.7     0.72   0.568
Plots[Blocks]        12   8141.8
  Cell 1,1 vs rest    1   6556.7   6556.7    64.00

It appears that the difference between the treatments is best summarized in terms of this single degree of freedom contrast between cell 1,1 and the others.

The mean for cell 1,1 is 72.0 and, for the other three treatments, the mean is 25.2, a difference of 46.8.

Such one-cell interactions are a very common form of interaction.


An experiment was conducted to investigate the effects of tractor speed and spray pressure on the quality of dried sultanas.

The response was the lightness of the dried sultanas, measured using a Hunterlab D25 L colour difference meter.

There were three tractor speeds and two spray pressures, resulting in 6 treatment combinations which were applied to 6 plots, each consisting of 12 vines, using a RCBD with 3 blocks.

However, these 6 treatment combinations resulted in only 4 rates of spray application, as indicated in the following table.

Table of application rates for the sprayer experiment

                    Tractor Speed (km hour^-1)
Pressure (kPa)      3.6     2.6     1.8
140                2090    2930    4120
330                2930    4120    5770

Set up for analysis

To analyze this experiment set up:
  a factor, Rates, with 4 levels to compare the 4 rate means;
  two factors with 3 levels, Rate2 and Rate3, each of which compares the means of 2 treatment combinations with the same rate.

Table of factor levels for Rate2 and Rate3 in the sprayer experiment

                                 Rate2              Rate3
Tractor Speed (km hour^-1)    3.6  2.6  1.8      3.6  2.6  1.8
Pressure (kPa)
140                            1    2    1        1    1    2
330                            3    1    1        1    3    1

The experimental structure for this experiment is:

Structure       Formula
unrandomized    3 Blocks/6 Plots
randomized      4 Rates/(3 Rate2 + 3 Rate3)

Sources in the ANOVA table

Source             df    E[MSq]
Blocks              2    $\sigma^2_{BP} + 6\sigma^2_B$
Plots[Blocks]      15
  Rates             3    $\sigma^2_{BP} + q_R(\psi)$
  Rate2[Rates]      1    $\sigma^2_{BP} + q_{R2}(\psi)$
  Rate3[Rates]      1    $\sigma^2_{BP} + q_{R3}(\psi)$
  Residual         10    $\sigma^2_{BP}$
Total              17

VII.H Models and hypothesis testing for three-factor experiments

Experiment with
  factors A, B and C with a, b and c levels;
  each of the abc combinations of A, B and C replicated r times;
  n = abcr observations.

The analysis is an extension of that for a two-factor CRD.

Initial exploration uses interaction plots of two factors for each level of the third factor:
  use interaction.ABC.plot from the DAE library.

a) Using the rules to determine the ANOVA table for a 3-factor CRD experiment

a) Description of pertinent features of the study
1. Observational unit: a unit
2. Response variable: Y
3. Unrandomized factors: Units
4. Randomized factors: A, B, C
5. Type of study: Three-factor CRD

b) The experimental structure
Structure       Formula
unrandomized    n Units
randomized      a A*b B*c C

c) Sources derived from the structure formulae

Randomized only:
A*B*C = A + (B*C) + A#(B*C)
      = A + B + C + B#C + A#B + A#C + A#B#C

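d) Degrees of freedom and sums of squares

The df can be derived by the cross product rule: for each factor in the term, calculate the number of levels minus one and multiply these together.

The Q matrices for the sums of squares are obtained from the mean operators (M matrices) as follows.

Unrandomized factors
Source     M matrix                Q matrix
(mean)     $\mathbf{M}_G$          $\mathbf{M}_G$
Units      $\mathbf{M}_U$          $\mathbf{M}_U - \mathbf{M}_G$

Randomized factors
Source     M matrix                Q matrix
(mean)     $\mathbf{M}_G$          $\mathbf{M}_G$
A          $\mathbf{M}_A$          $\mathbf{M}_A - \mathbf{M}_G$
B          $\mathbf{M}_B$          $\mathbf{M}_B - \mathbf{M}_G$
C          $\mathbf{M}_C$          $\mathbf{M}_C - \mathbf{M}_G$
A#B        $\mathbf{M}_{AB}$       $\mathbf{M}_{AB} - \mathbf{M}_A - \mathbf{M}_B + \mathbf{M}_G$
A#C        $\mathbf{M}_{AC}$       $\mathbf{M}_{AC} - \mathbf{M}_A - \mathbf{M}_C + \mathbf{M}_G$
B#C        $\mathbf{M}_{BC}$       $\mathbf{M}_{BC} - \mathbf{M}_B - \mathbf{M}_C + \mathbf{M}_G$
A#B#C      $\mathbf{M}_{ABC}$      $\mathbf{M}_{ABC} - \mathbf{M}_{AB} - \mathbf{M}_{AC} - \mathbf{M}_{BC} + \mathbf{M}_A + \mathbf{M}_B + \mathbf{M}_C - \mathbf{M}_G$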
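The cross product rule is easy to apply mechanically. The following sketch uses hypothetical numbers of levels (a = 2, b = 3, c = 4); term.df is an illustrative helper, not a function from the DAE library.

```r
n.levels <- c(A = 2, B = 3, C = 4)   # hypothetical numbers of levels

# df for a source: product over its factors of (number of levels - 1)
term.df <- function(term, n.levels) {
  factors <- strsplit(term, "#", fixed = TRUE)[[1]]
  prod(n.levels[factors] - 1)
}

sources <- c("A", "B", "C", "A#B", "A#C", "B#C", "A#B#C")
df <- sapply(sources, term.df, n.levels = n.levels)
df
```

The df of the seven randomized sources add to abc - 1, the df for differences among all the treatment combinations.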

e) The analysis of variance table

Enter the sources for the study, their degrees of freedom and quadratic forms, into the ANOVA table below.

f) Maximal expectation and variation models

Given that the only random factor is Units, the following are the symbolic expressions for the maximal expectation and variation models:
  var[Y] = Units
  $\psi$ = E[Y] = A∧B∧C


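b) Indicator-variable models and estimation for the three-factor CRD

The models for the expectation:

\begin{align*}
E[\mathbf{Y}] &= \mathbf{X}_{ABC}(\alpha\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) + \mathbf{X}_{BC}(\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) && \text{and equivalent models with a pair of two-factor interactions} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{BC}(\beta\delta) && \text{and equivalent models with two factors interacting and one factor independent} \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) && \text{and equivalent models with two factors interacting} \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{B}\beta + \mathbf{X}_{C}\delta && \text{and other models consisting of only main effects} \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu
\end{align*}

where
  $(\alpha\beta\delta) = \{(\alpha\beta\delta)_{ijk}\}$,
  $(\alpha\beta) = \{(\alpha\beta)_{ij}\}$,
  $(\beta\delta) = \{(\beta\delta)_{jk}\}$,
  $(\alpha\delta) = \{(\alpha\delta)_{ik}\}$,
  $\alpha = \{\alpha_i\}$,
  $\beta = \{\beta_j\}$,
  $\delta = \{\delta_k\}$.

Altogether 19 different models.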
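The count of 19 can be confirmed by enumeration: a model is any set of the seven terms in which no term is marginal to another, with the empty set standing for the model containing only the grand mean, G. This is an illustrative sketch; model.terms, is.marginal and obeys.rule are hypothetical names.

```r
model.terms <- list(A = "A", B = "B", C = "C",
                    AB = c("A", "B"), AC = c("A", "C"), BC = c("B", "C"),
                    ABC = c("A", "B", "C"))

# Term s is marginal to term t if its factors are a proper subset of those of t
is.marginal <- function(s, t) length(s) < length(t) && all(s %in% t)

# A set of terms is allowed if no term in it is marginal to another term in it
obeys.rule <- function(idx) {
  for (i in idx) for (j in idx)
    if (is.marginal(model.terms[[i]], model.terms[[j]])) return(FALSE)
  TRUE
}

models <- character(0)
for (k in 0:(2^7 - 1)) {
  idx <- which((k %/% 2^(0:6)) %% 2 == 1)   # subset of terms encoded by k
  if (obeys.rule(idx))
    models <- c(models, if (length(idx) == 0) "G"
                        else paste(names(model.terms)[idx], collapse = " + "))
}
length(models)
```

The enumeration gives one model with no terms (G), 7 with a single term, 9 with two terms and 2 with three terms.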

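Estimators of expected values

Expressions for the estimators for each model are given in terms of the following mean vectors: the vectors of A, B, A∧B, C, A∧C, B∧C and A∧B∧C means.

Being means vectors, they can be written in terms of mean operators, Ms.

Further, if Y is arranged so that the associated factors A, B, C and the replicates are in standard order, the M operators can be written as the direct product of I and J matrices as follows:

\begin{align*}
\mathbf{M}_G &= \frac{1}{abcr}\, \mathbf{J}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_c \otimes \mathbf{J}_r \\
\mathbf{M}_A &= \frac{1}{bcr}\, \mathbf{I}_a \otimes \mathbf{J}_b \otimes \mathbf{J}_c \otimes \mathbf{J}_r \\
\mathbf{M}_B &= \frac{1}{acr}\, \mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_c \otimes \mathbf{J}_r \\
\mathbf{M}_{AB} &= \frac{1}{cr}\, \mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{J}_c \otimes \mathbf{J}_r \\
\mathbf{M}_C &= \frac{1}{abr}\, \mathbf{J}_a \otimes \mathbf{J}_b \otimes \mathbf{I}_c \otimes \mathbf{J}_r \\
\mathbf{M}_{AC} &= \frac{1}{br}\, \mathbf{I}_a \otimes \mathbf{J}_b \otimes \mathbf{I}_c \otimes \mathbf{J}_r \\
\mathbf{M}_{BC} &= \frac{1}{ar}\, \mathbf{J}_a \otimes \mathbf{I}_b \otimes \mathbf{I}_c \otimes \mathbf{J}_r \\
\mathbf{M}_{ABC} &= \frac{1}{r}\, \mathbf{I}_a \otimes \mathbf{I}_b \otimes \mathbf{I}_c \otimes \mathbf{J}_r
\end{align*}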
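The direct-product expressions translate directly into R's Kronecker operator %x%. This is a sketch with hypothetical dimensions (a = 2, b = 3, c = 2, r = 2) and simulated data; Imat and Jmat are illustrative helper names.

```r
a <- 2; b <- 3; c.lev <- 2; r <- 2    # hypothetical numbers of levels and replicates
n <- a * b * c.lev * r

Imat <- function(k) diag(k)             # identity matrix of order k
Jmat <- function(k) matrix(1, k, k)     # k x k matrix of ones

M.G  <- (Jmat(a) %x% Jmat(b) %x% Jmat(c.lev) %x% Jmat(r)) / (a * b * c.lev * r)
M.A  <- (Imat(a) %x% Jmat(b) %x% Jmat(c.lev) %x% Jmat(r)) / (b * c.lev * r)
M.B  <- (Jmat(a) %x% Imat(b) %x% Jmat(c.lev) %x% Jmat(r)) / (a * c.lev * r)
M.AB <- (Imat(a) %x% Imat(b) %x% Jmat(c.lev) %x% Jmat(r)) / (c.lev * r)

# Q matrices for the A and A#B sources
Q.A  <- M.A - M.G
Q.AB <- M.AB - M.A - M.B + M.G

# Simulated response in standard order (A changing slowest, replicates fastest)
set.seed(1)
y <- rnorm(n)
A <- factor(rep(1:a, each = b * c.lev * r))
A.means <- as.vector(M.A %*% y)   # each unit's entry is the mean for its level of A
```

The trace of each Q matrix equals the df of its source, and M.A applied to y replaces each observation by the mean for its level of A.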

c) Expected mean squares under alternative models

Previously given the E[MSq]s under the maximal model.

Also need to consider them under alternative models so that we know what models are indicated by the various hypothesis tests.

Basically, need to know under which models $q(\psi) = 0$.

From the two-factor case, $q(\psi) = 0$ only when the model does not include a term to which the term for the source is marginal.

So, when doing the hypothesis test for a MSq for a fixed term,
  provided terms to which it is marginal have been ruled out by prior tests,
  it tests whether the expectation term corresponding to it is zero.

For example, consider the A#B mean square.

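An example: the A#B mean square

Its expected value is $\sigma^2_U + q_{AB}(\psi)$.

Now, $q_{AB}(\psi) \ne 0$ for models involving the A∧B term or terms to which the A∧B term is marginal.

All models are of the form

\begin{align*}
E[\mathbf{Y}] &= \mathbf{X}_{ABC}(\alpha\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) + \mathbf{X}_{BC}(\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{BC}(\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) \\
E[\mathbf{Y}] &= \mathbf{X}_{A}\alpha + \mathbf{X}_{B}\beta + \mathbf{X}_{C}\delta \\
E[\mathbf{Y}] &= \mathbf{X}_{G}\mu
\end{align*}

So $q_{AB}(\psi) \ne 0$ for the models

\begin{align*}
E[\mathbf{Y}] &= \mathbf{X}_{ABC}(\alpha\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) + \mathbf{X}_{BC}(\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{BC}(\beta\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{AC}(\alpha\delta) \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta) + \mathbf{X}_{C}\delta \\
E[\mathbf{Y}] &= \mathbf{X}_{AB}(\alpha\beta)
\end{align*}

Hence the test for A#B provides a test for whether A∧B should be included in the model, provided that the test for A#B#C has already indicated that A∧B∧C can be omitted.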

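d) The hypothesis test

Step 1: Set up hypotheses (the term being tested is given in brackets)

a) (A∧B∧C)
   $H_0$: $(\alpha\beta\delta)_{ijk} - (\alpha\beta\delta)_{ij.} - (\alpha\beta\delta)_{i.k} - (\alpha\beta\delta)_{.jk} + (\alpha\beta\delta)_{i..} + (\alpha\beta\delta)_{.j.} + (\alpha\beta\delta)_{..k} - (\alpha\beta\delta)_{...} = 0$ for all i,j,k
   $H_1$: $(\alpha\beta\delta)_{ijk} - (\alpha\beta\delta)_{ij.} - (\alpha\beta\delta)_{i.k} - (\alpha\beta\delta)_{.jk} + (\alpha\beta\delta)_{i..} + (\alpha\beta\delta)_{.j.} + (\alpha\beta\delta)_{..k} - (\alpha\beta\delta)_{...} \ne 0$ for some i,j,k

b) (A∧B)
   $H_0$: $(\alpha\beta)_{ij} - (\alpha\beta)_{i.} - (\alpha\beta)_{.j} + (\alpha\beta)_{..} = 0$ for all i,j
   $H_1$: $(\alpha\beta)_{ij} - (\alpha\beta)_{i.} - (\alpha\beta)_{.j} + (\alpha\beta)_{..} \ne 0$ for some i,j

c) (A∧C)
   $H_0$: $(\alpha\delta)_{ik} - (\alpha\delta)_{i.} - (\alpha\delta)_{.k} + (\alpha\delta)_{..} = 0$ for all i,k
   $H_1$: $(\alpha\delta)_{ik} - (\alpha\delta)_{i.} - (\alpha\delta)_{.k} + (\alpha\delta)_{..} \ne 0$ for some i,k

d) (B∧C)
   $H_0$: $(\beta\delta)_{jk} - (\beta\delta)_{j.} - (\beta\delta)_{.k} + (\beta\delta)_{..} = 0$ for all j,k
   $H_1$: $(\beta\delta)_{jk} - (\beta\delta)_{j.} - (\beta\delta)_{.k} + (\beta\delta)_{..} \ne 0$ for some j,k

e) (A)
   $H_0$: $\alpha_1 = \alpha_2 = \dots = \alpha_a$
   $H_1$: not all population A means are equal

f) (B)
   $H_0$: $\beta_1 = \beta_2 = \dots = \beta_b$
   $H_1$: not all population B means are equal

g) (C)
   $H_0$: $\delta_1 = \delta_2 = \dots = \delta_c$
   $H_1$: not all population C means are equal

Set $\alpha = 0.05$.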

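Step 2: Calculate test statistics

Source        df                   SSq                                         E[MSq]
Units         n-1                  $\mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$
  A           a-1                  $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$         $\sigma^2_U + q_A(\psi)$
  B           b-1                  $\mathbf{Y}'\mathbf{Q}_B\mathbf{Y}$         $\sigma^2_U + q_B(\psi)$
  A#B         (a-1)(b-1)           $\mathbf{Y}'\mathbf{Q}_{AB}\mathbf{Y}$      $\sigma^2_U + q_{AB}(\psi)$
  C           c-1                  $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$         $\sigma^2_U + q_C(\psi)$
  A#C         (a-1)(c-1)           $\mathbf{Y}'\mathbf{Q}_{AC}\mathbf{Y}$      $\sigma^2_U + q_{AC}(\psi)$
  B#C         (b-1)(c-1)           $\mathbf{Y}'\mathbf{Q}_{BC}\mathbf{Y}$      $\sigma^2_U + q_{BC}(\psi)$
  A#B#C       (a-1)(b-1)(c-1)      $\mathbf{Y}'\mathbf{Q}_{ABC}\mathbf{Y}$     $\sigma^2_U + q_{ABC}(\psi)$
  Residual    abc(r-1)             $\mathbf{Y}'\mathbf{Q}_{U_{Res}}\mathbf{Y}$ $\sigma^2_U$
Total         abcr-1               $\mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$

MSqs would be added to this table by taking each SSq and dividing by its df,
F statistics computed by dividing all MSqs, except the Residual MSq, by the Residual MSq, and
p values obtained for each F statistic.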

Step 3: Decide between hypotheses

For A#B#C interaction source
  If $\Pr\{F \ge F_0\} = p \le \alpha$, the evidence suggests that $H_0$ be rejected and the term should be incorporated in the model.

For A#B, A#C and B#C interaction sources
  Only if A#B#C is not significant, then if $\Pr\{F \ge F_0\} = p \le \alpha$, the evidence suggests that $H_0$ be rejected and the term corresponding to the significant source should be incorporated in the model.

For A, B and C sources
  For each term, only if the interactions involving the term are not significant, then if $\Pr\{F \ge F_0\} = p \le \alpha$, the evidence suggests that $H_0$ be rejected and the term corresponding to the significant source should be incorporated in the model.

VII.J Exercises

Ex. VII-1 asks for the complete analysis of a factorial experiment with qualitative factors
Ex. VII-2 involves a nested factorial analysis (not examinable)
Ex. VII-3 asks for the complete analysis of a factorial experiment with both factors quantitative
Ex. VII-4 asks for the complete analysis of a factorial experiment with random treatment factors
Ex. VII-5 asks for the complete analysis of a factorial experiment with interactions between