
Upload: russell-weaver

Post on 02-Jan-2016


Module 11: Experiments to study Variances-Variance Components Analysis and Nested Models

In the previous study, we were interested in comparing treatment means. The example we discussed was the strength of concrete made with five sand sizes, which cover essentially all of the sand sizes used for forming concrete. The purpose of that study was to find out which sand size produces the strongest concrete. In some applications, however, we are interested in identifying the major sources of variation in a system and estimating their variances. Because of (1) the purpose of the study, (2) the treatment structure, (3) the experimental protocol, and (4) the statistical inference made from the results, the effects in the model are considered random effects, and the statistical models are referred to as random effects models.

Example (Kuehl, 2000): A metal alloy is produced in a high-temperature casting process. Each casting is broken down into smaller individual bars that are used for applications requiring small amounts of the alloy. The tensile strength of the alloy is critical to its intended use.

The casting process is designed to produce bars with an average tensile strength above minimum specifications. Excessive variation among bars does not meet the specification, and further improvement of the production process will be needed to reduce the variation among alloy bars.


Through cause-and-effect analysis, two components that contribute to the total variation in the tensile strength of the bars were identified:

1. variability among castings, and

2. inconsistency within the casting process that affects bars from the same casting.

Purpose of the study: to quantify the uncertainty due to each component, so that the causes of the uncertainty can be investigated further and actions can be taken to reduce it.

Experiment Design: Three castings fabricated in the same facility were randomly selected. Each casting was broken into individual bars, and ten randomly selected bars from each casting were tested. The interest is in identifying the variation in tensile strength caused by castings in the facility and by bars within a casting, not in the mean differences among the three castings.

Row cast 1 cast 2 cast 3

1 88.0 85.9 94.2

2 88.0 88.6 91.5

3 94.8 90.0 92.0

4 90.0 87.1 96.5

5 93.0 85.6 95.6

6 89.0 86.0 93.8

7 86.0 91.0 92.5

8 92.9 89.6 93.2

9 89.0 93.0 96.2

10 93.0 87.5 92.5

The statistical model for identifying the two sources of variation in this random effects experiment is

$$y_{ij} = \mu + a_i + e_{ij}, \quad i = 1, 2, \ldots, t; \; j = 1, 2, \ldots, r,$$

where $\mu$ is the process mean, the $a_i$ are the random effects due to castings, and the $e_{ij}$ are the random errors due to bars within castings.

The distribution assumptions are:

$$a_i \sim N(0, \sigma_a^2), \quad e_{ij} \sim N(0, \sigma_e^2), \quad \text{and both are independent.}$$

The total variance of an observation may be expressed by

$$\sigma_y^2 = \sigma_a^2 + \sigma_e^2,$$

where $\sigma_a^2$ and $\sigma_e^2$ are the two variance components.

The ANOVA table and expected mean squares (EMS) for the random effects model:

Source                       Df     SS     MS                 F         P-value   EMS
Among Castings               t-1    SSA    MSA = SSA/(t-1)    MSA/MSW             $\sigma_e^2 + r\sigma_a^2$
Among Bars within Casting    N-t    SSW    MSW = SSW/(N-t)                        $\sigma_e^2$
Total                        N-1    SSTO

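Estimates of the variance components:

$$MSA = \hat\sigma_e^2 + r\hat\sigma_a^2 \quad\text{and}\quad MSW = \hat\sigma_e^2, \quad\text{therefore}\quad \hat\sigma_a^2 = \frac{MSA - MSW}{r}.$$

Interval estimates can also be obtained for both variance components. (NOTE: This is what we called 'Expanded Uncertainty' before.) In the formulas below, the subscripts of $\chi^2$ and $F$ denote upper-tail probabilities.

100(1-$\alpha$)% confidence interval for $\sigma_e^2$:

$$\text{Lower Bound} = \frac{SSW}{\chi^2_{(\alpha/2,\,N-t)}}, \qquad \text{Upper Bound} = \frac{SSW}{\chi^2_{(1-\alpha/2,\,N-t)}}$$

100(1-2$\alpha$)% confidence interval for $\sigma_a^2$:

$$\text{Lower Bound} = \frac{SSA\,(1 - F_U/F_0)}{r\,\chi^2_{(\alpha/2,\,t-1)}}, \qquad \text{Upper Bound} = \frac{SSA\,(1 - F_L/F_0)}{r\,\chi^2_{(1-\alpha/2,\,t-1)}}$$

where $F_0 = MSA/MSW$ is the observed F statistic, $F_U = F_{(\alpha/2,\,(t-1),\,(N-t))}$ and $F_L = F_{(1-\alpha/2,\,(t-1),\,(N-t))}$.

NOTE: The confidence level is 100(1-2$\alpha$)%, NOT 100(1-$\alpha$)%, for $\sigma_a^2$.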
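As an illustration of the interval formulas for the two variance components, the following minimal Python sketch applies them to the casting example. The sums of squares are the values reported in the Minitab ANOVA table for the casting data. The quantiles are an assumption of this sketch: they are upper-tail values read from standard chi-square and F tables with $\alpha$ = 0.05, hard-coded so that only the standard library is needed. All variable names are illustrative.

```python
# Interval estimates for the variance components in the casting example.
SSA, SSW = 147.885, 157.102      # sums of squares from the Minitab ANOVA table
t, r = 3, 10                     # number of castings, bars per casting
N = t * r
MSA = SSA / (t - 1)
MSW = SSW / (N - t)
F0 = MSA / MSW                   # observed F statistic

# ASSUMPTION: upper-tail quantiles read from standard tables, alpha = 0.05.
CHI2_27_UP, CHI2_27_LO = 43.19, 14.57   # chi-square(27): upper 0.025 and 0.975 points
CHI2_2_UP, CHI2_2_LO = 7.378, 0.0506    # chi-square(2):  upper 0.025 and 0.975 points
F_U, F_L = 4.242, 0.0253                # F(2, 27):       upper 0.025 and 0.975 points

# 100(1 - alpha)% = 95% interval for sigma_e^2
e_lower = SSW / CHI2_27_UP
e_upper = SSW / CHI2_27_LO

# 100(1 - 2*alpha)% = 90% interval for sigma_a^2
a_lower = SSA * (1 - F_U / F0) / (r * CHI2_2_UP)
a_upper = SSA * (1 - F_L / F0) / (r * CHI2_2_LO)

print(f"sigma_e^2: ({e_lower:.2f}, {e_upper:.2f})")
print(f"sigma_a^2: ({a_lower:.2f}, {a_upper:.2f})")
```

The interval for $\sigma_a^2$ is very wide because it rests on only t - 1 = 2 degrees of freedom; sampling more castings would narrow it.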

Estimate of Intra-class Correlation and its Interpretation

Another important quantity when estimating the variance components is the intraclass correlation:

$$\rho_I = \frac{\sigma_a^2}{\sigma_a^2 + \sigma_e^2}$$

The intraclass correlation measures the similarity of observations within groups relative to that among groups. When observations within groups are similar, the variance component $\sigma_e^2$ is small and, relatively, $\sigma_a^2$ is large. Consequently, $\rho_I$ is close to one.

The estimate of $\rho_I$ is given by

$$\hat\rho_I = \frac{MSA - MSW}{MSA + (r-1)\,MSW},$$

and its 100(1-$\alpha$)% confidence interval is

$$\text{Lower Bound} = \frac{F_0 - F_U}{F_0 + (r-1)F_U}, \qquad \text{Upper Bound} = \frac{F_0 - F_L}{F_0 + (r-1)F_L}.$$
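The intraclass correlation estimate and its interval can be sketched in a few lines of Python for the casting example. The mean squares are those reported in the Minitab output for the casting data. The two F(2, 27) quantiles are an assumption of this sketch (upper-tail values from standard tables for a 95% interval), and the variable names are illustrative only.

```python
# Intraclass correlation for the casting example (one-way random effects model).
MSA, MSW = 73.942, 5.819     # mean squares from the Minitab ANOVA table
r = 10                       # bars per casting
F0 = MSA / MSW               # observed F statistic

rho_hat = (MSA - MSW) / (MSA + (r - 1) * MSW)

# ASSUMPTION: upper-tail F(2, 27) quantiles from standard tables (0.025 and 0.975 points).
F_U, F_L = 4.242, 0.0253
rho_lower = (F0 - F_U) / (F0 + (r - 1) * F_U)
rho_upper = (F0 - F_L) / (F0 + (r - 1) * F_L)

print(f"rho_hat = {rho_hat:.3f}, 95% CI = ({rho_lower:.3f}, {rho_upper:.3f})")
```

The point estimate equals the casting variance component divided by the sum of the two variance components, 6.812/(6.812 + 5.819), so a little over half of the total variation is attributable to castings.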


Analyzing the Casting Process Data

Row cast 1 cast 2 cast 3

1 88.0 85.9 94.2

2 88.0 88.6 91.5

3 94.8 90.0 92.0

4 90.0 87.1 96.5

5 93.0 85.6 95.6

6 89.0 86.0 93.8

7 86.0 91.0 92.5

8 92.9 89.6 93.2

9 89.0 93.0 96.2

10 93.0 87.5 92.5

[Figure: Dotplots of Strength by Casting (x-axis: Casting; y-axis: Strength)]

[Figure: Residuals Versus the Fitted Values (response is Strength); x-axis: Fitted Value; y-axis: Standardized Residual]

Casting N Mean StDev

cast 1 10 90.37 2.869

cast 2 10 88.43 2.456

cast 3 10 93.80 1.786


Factor Type Levels Values

Casting random 3 cast 1 cast 2 cast 3

Analysis of Variance for Strength, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P

Casting 2 147.885 147.885 73.942 12.71 0.000

Error 27 157.102 157.102 5.819

Total 29 304.987

Expected Mean Squares, using Adjusted SS

Source Expected Mean Square for Each Term

1 Casting (2) + 10.0000(1)

2 Error (2)

Error Terms for Tests, using Adjusted SS

Source Error DF Error MS Synthesis of Error MS

1 Casting 27.00 5.819 (2)

Variance Components, using Adjusted SS

Source Estimated Value

Casting 6.812

Error 5.819

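The EMS tells us how to conduct the appropriate test. In the Minitab notation:

(2) is the variance of the random error due to bars, $\sigma_e^2$.

(1) is the variance due to castings, $\sigma_a^2$.

The EMS for Casting is therefore $\sigma_e^2 + 10\sigma_a^2$, so Casting is tested against Error.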
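To see where the numbers in the Minitab output come from, the one-way random effects ANOVA can be computed from first principles. The following is a minimal Python sketch using the casting data from the table above; the variable names are illustrative and only the standard library is used.

```python
# One-way random effects ANOVA for the casting data, computed by hand.
castings = {
    "cast 1": [88.0, 88.0, 94.8, 90.0, 93.0, 89.0, 86.0, 92.9, 89.0, 93.0],
    "cast 2": [85.9, 88.6, 90.0, 87.1, 85.6, 86.0, 91.0, 89.6, 93.0, 87.5],
    "cast 3": [94.2, 91.5, 92.0, 96.5, 95.6, 93.8, 92.5, 93.2, 96.2, 92.5],
}
t = len(castings)            # number of castings
r = 10                       # bars per casting
N = t * r

grand_mean = sum(sum(bars) for bars in castings.values()) / N
means = {name: sum(bars) / r for name, bars in castings.items()}

# Sum of squares among castings and among bars within castings
SSA = r * sum((m - grand_mean) ** 2 for m in means.values())
SSW = sum((y - means[name]) ** 2 for name, bars in castings.items() for y in bars)

MSA = SSA / (t - 1)
MSW = SSW / (N - t)

# Variance component estimates from the expected mean squares
var_e = MSW                  # bars within casting
var_a = (MSA - MSW) / r      # among castings

print(f"SSA = {SSA:.3f}, SSW = {SSW:.3f}, F = {MSA / MSW:.2f}")
print(f"var_a = {var_a:.3f}, var_e = {var_e:.3f}")
```

Solving the two EMS equations for the variance components is what Minitab does in its "Variance Components" section.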

Hands-on Activity

•Find the estimates of the two variance components, and suggest directions for further investigation to reduce the variation.

•Compute a 95% confidence interval for the component $\sigma_e^2$.

•Compute a 90% confidence interval for the component $\sigma_a^2$.

•Find the intra-class correlation and compute a 95% confidence interval for it.

Sub-samples in Lab Testing Studies

In many lab testing studies, a sample may be split into several sub-samples for testing. Sub-sampling introduces another source of random variation, among sub-samples, in addition to the random variation among experimental units. This source of variation must be identified in the study.

Case Study: Pesticide Residue on Vegetables

A concern is the pesticide residue that remains on the plants after a period of time. The residues are evaluated with chemical assays in the laboratory, using plant samples from field plots treated with the pesticide.

Hypothesis of the Study: Two standard chemical assays are used to evaluate the residues. Which method recovers the pesticide residues better?

Treatment Design: Two standard chemical methods, A and B, that are used on a regular basis.

Experimental Design: Six batches of plants, each batch from a single field plot, were sampled and prepared for the residue analysis. Three batches were randomly assigned to each method in a completely randomized design. Within each batch prepared for testing, two sub-samples were taken for testing.

NOTE: The experimental unit for each method is the batch. The testing unit is the sub-sample within each batch.

The statistical model that describes the experimental design with sub-sampling for this study is:

$$y_{ijk} = \mu + \tau_i + e_{ij} + d_{ijk}, \quad i = 1, 2, \ldots, t; \; j = 1, 2, \ldots, r; \; k = 1, 2, \ldots, n,$$

where $\mu$ is the overall mean,

$\tau_i$ is the fixed effect of the ith treatment,

$e_{ij}$ is the random experimental error for the jth experimental unit within the ith treatment, and

$d_{ijk}$ is the kth random subsample error of the jth experimental unit within the ith treatment.

$$e_{ij} \sim N(0, \sigma_e^2), \quad\text{and}\quad d_{ijk} \sim N(0, \sigma_d^2).$$

Method N Mean StDev

1 6 120.00 14.14

2 6 69.83 4.26

Method Batch Sample Residue

1 1 1 120

1 1 2 110

1 2 3 120

1 2 4 100

1 3 5 140

1 3 6 130

2 4 7 71

2 4 8 71

2 5 9 70

2 5 10 76

2 6 11 63

2 6 12 68

(Data Source: Kuehl, 2000)

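ANOVA Table and the corresponding EMS

Source           Df         SS      MS                       F           P-value   EMS
Method           t-1        SSTR    MSTR = SSTR/(t-1)        MSTR/MSE              $\sigma_d^2 + n\sigma_e^2 + rn\sum_i \tau_i^2/(t-1)$
Batch(Method)    t(r-1)     SSE     MSE = SSE/[t(r-1)]       MSE/MSSA              $\sigma_d^2 + n\sigma_e^2$
Sampling         tr(n-1)    SSSA    MSSA = SSSA/[tr(n-1)]                          $\sigma_d^2$
Total            trn-1      SSTO

Since two sources of variation contribute to the observations, the variance of the mean of each method is

$$\sigma^2_{\bar y_{i..}} = \frac{\sigma_d^2}{rn} + \frac{\sigma_e^2}{r} = \frac{\sigma_d^2 + n\sigma_e^2}{rn},$$

since $\bar y_{i..} = \frac{1}{rn}\sum_j \sum_k y_{ijk}$ and

$$V(\bar y_{i..}) = \frac{1}{(rn)^2}\, V\Big(\sum_j \sum_k y_{ijk}\Big) = \frac{1}{(rn)^2}\big[\, rn\,\sigma_d^2 + r n^2 \sigma_e^2 \,\big],$$

and the estimate is

$$s^2_{\bar y_{i..}} = \frac{MSE}{rn}, \qquad s_{\bar y_{i..}} = \sqrt{\frac{MSE}{rn}}.$$

NOTE: A model such as this is called a Mixed Model. It consists of both fixed and random effects components.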
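The estimated variance of a method mean, MSE/(rn), can be illustrated numerically for the pesticide study. This is a minimal Python sketch: the Batch(Method) mean square and the method means are taken from the Minitab output for this study, while the t quantile is an assumption of the sketch (a tabled value for 4 degrees of freedom, the degrees of freedom of MSE). The interval uses the pooled MSE and so relies on the constant variance assumption.

```python
import math

# Standard error of a method mean in the design with subsamples.
MSE = 190.1          # Batch(Method) mean square from the Minitab ANOVA table
r, n = 3, 2          # batches per method, subsamples per batch

var_mean = MSE / (r * n)        # estimate of (sigma_d^2 + n*sigma_e^2) / (r*n)
se_mean = math.sqrt(var_mean)

# ASSUMPTION: tabled t quantile, upper 0.025 point with t(r-1) = 4 degrees of freedom.
T_025_4 = 2.776
half_width = T_025_4 * se_mean

for method, mean in [(1, 120.00), (2, 69.83)]:
    print(f"Method {method}: {mean:.2f} +/- {half_width:.2f}")
```

Note that the subsample mean square MSSA is not the right yardstick here: the batch, not the subsample, is the experimental unit.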

Estimating the variance components for the experiment with subsamples

$$MSSA = \hat\sigma_d^2 \quad\text{and}\quad MSE = \hat\sigma_d^2 + n\hat\sigma_e^2. \quad\text{Therefore,}\quad \hat\sigma_e^2 = \frac{MSE - MSSA}{n}.$$

NOTE: Chemical Method is a fixed effect. There is no random variation component. The component in the EMS (Expected Mean Square) for the Treatment is a constant term, $\sum_i \tau_i^2/(t-1)$ multiplied by $rn$, where $\tau_i$ is the fixed effect of the ith treatment. The larger this term is, the greater the differences among the treatment means.

Analysis of the Pesticide Residue Data

[Figure: Dotplots of Residue by Method (x-axis: Method; y-axis: Residue)]

Method N Mean StDev

1 6 120.00 14.14

2 6 69.83 4.26

Factor Type Levels Values

Method fixed 2 1 2

Batch(Method) random 6 1 2 3 4 5 6

Analysis of Variance for Residue, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P

Method 1 7550.1 7550.1 7550.1 39.72 0.003

Batch(Method) 4 760.3 760.3 190.1 3.45 0.086

Error 6 330.5 330.5 55.1

(due to subsample)

Total 11 8640.9


Expected Mean Squares, using Adjusted SS

Source Expected Mean Square for Each Term

1 Method (3) + 2.0000(2) + Q[1]

2 Batch(Method) (3) + 2.0000(2)

3 Error (3)

(due to subsample)

Error Terms for Tests, using Adjusted SS

Source Error DF Error MS Synthesis of Error MS

1 Method 4.00 190.1 (2)

2 Batch(Method) 6.00 55.1 (3)

Variance Components, using Adjusted SS

Source Estimated Value

Batch(Method) 67.50

Error(due to Subsample) 55.08

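To test Method, use Source (2), Batch(Method), as the error term.

To test Batch(Method), use Source (3), Error (due to subsample), as the error term.

The two variance components are: $\hat\sigma_e^2 = 67.50$ (batches within method) and $\hat\sigma_d^2 = 55.08$ (subsamples within batch).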
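The nested ANOVA for the pesticide residue data can be reproduced by hand, which makes clear which mean square serves as the error term for each test. The following is a minimal Python sketch using the raw data listed earlier (Kuehl, 2000); the dictionary layout and variable names are illustrative only.

```python
# Nested ANOVA with subsamples: residue[method][batch] -> two subsample readings
residue = {
    1: {1: [120, 110], 2: [120, 100], 3: [140, 130]},
    2: {4: [71, 71], 5: [70, 76], 6: [63, 68]},
}
t, r, n = 2, 3, 2    # methods, batches per method, subsamples per batch

all_obs = [y for batches in residue.values() for ys in batches.values() for y in ys]
grand = sum(all_obs) / len(all_obs)

SSTR = SSE = SSSA = 0.0
for batches in residue.values():
    method_obs = [y for ys in batches.values() for y in ys]
    method_mean = sum(method_obs) / len(method_obs)
    SSTR += len(method_obs) * (method_mean - grand) ** 2       # among methods
    for ys in batches.values():
        batch_mean = sum(ys) / len(ys)
        SSE += len(ys) * (batch_mean - method_mean) ** 2        # batches within method
        SSSA += sum((y - batch_mean) ** 2 for y in ys)          # subsamples within batch

MSTR = SSTR / (t - 1)
MSE = SSE / (t * (r - 1))
MSSA = SSSA / (t * r * (n - 1))

F_method = MSTR / MSE        # Method is tested against Batch(Method)
F_batch = MSE / MSSA         # Batch(Method) is tested against subsampling error

var_d = MSSA                 # subsample variance component
var_e = (MSE - MSSA) / n     # batch (experimental error) variance component

print(f"F(Method) = {F_method:.2f}, F(Batch) = {F_batch:.2f}")
print(f"var_e = {var_e:.2f}, var_d = {var_d:.2f}")
```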


Least Squares Means for Residue

Method Mean

1 120.00

2 69.83

(Method)Batch

1 1 115.00

1 2 110.00

1 3 135.00

2 4 71.00

2 5 73.00

2 6 65.50

[Figure: Residuals Versus the Fitted Values (response is Residue); x-axis: Fitted Value; y-axis: Residual]

[Figure: Test for Equal Variances for Raw Residues, with panels "95% Confidence Intervals for Sigmas" and "Boxplots of Raw Data" by factor level (Method 1 and 2); plotted variable: SRES1]

F-Test: Test Statistic: 9.836, P-Value: 0.025

Levene's Test: Test Statistic: 16.050, P-Value: 0.002

NOTE: The residual analysis reveals a serious violation of the constant variance assumption. What should we do?

Hands-On Activity

1. Find an appropriate transformation, transform the data, and reanalyze the transformed data.

2. Compare the results between the transformed data and the raw data. Do we notice any specific changes in our conclusions?

3. Obtain a 95% confidence interval for the mean of each chemical method using the raw data.

4. Obtain a 95% confidence interval for the mean difference between the two chemical methods using the raw data.

Nested Factor Experimental Designs and Mixed Factor Designs

In inter-laboratory testing studies, this type of design occurs naturally.

Example: A study of a chromatographic method was conducted for determining malathion. Ten labs participated in the study; each lab received a subsample of a technical grade malathion (Tech), two wettable powders (25% WP and 50% WP), an emulsifiable concentrate (58% EC), and a dust. Each participant also received an internally tested standard of malathion (99.1%) along with the analytical method (Wernimont, 1985).

The steps involved in this type of inter-lab study include at least:

1. Planning: Requirements for a protocol – objectives, analytical method, participating labs, preparation and distribution of materials, replication scheme.

2. Executing: Specific rules and guidelines for participating labs, training of operators, analyzing material, recording and reporting the data.

3. Screening data: Taking care of missing data, detecting biased labs, identifying outlying observations. Box plots, h-plots, Youden's plots, and scatter plots are some common tools for screening.

4. Analyzing data: Setting up statistical models, using appropriate statistical methods, diagnosing assumptions, properly interpreting the results.

5. Preparing the final report.

For the malathion testing study, one major purpose is to compare the performance of the labs using a variety of materials. One can also study the mean differences among materials and the interaction between materials and labs.

Different types of designs may be chosen. The simplest one for comparing lab performance is a two variance component model for testing one material.

1. Experimental design with two component model for testing a material:

Ten labs participated in the study. Each participating lab tested four samples of the same material.

The statistical model is

$$y_{ij} = \mu + L_i + e_{ij}, \quad i = 1, 2, \ldots, t; \; j = 1, 2, \ldots, r.$$

$L_i$ is assumed to follow $N(0, \sigma_L^2)$, and $e_{ij} \sim N(0, \sigma_e^2)$.

In this type of study, we are interested in identifying unusual labs and in estimating the between-lab variance component and the within-lab variance component.

Note: If the estimates of the variance components will be used for computing combined uncertainty, it is important to make sure that the testing procedure and testing results are under statistical control, so that the uncertainty components better reflect the true variance components.

The overall variance of a measurement from any lab, assuming that labs are independent and materials are chosen at random, is

$$\sigma_T^2 = \sigma_L^2 + \sigma_e^2.$$

The variance of a given Laboratory mean is:

$$\sigma^2_{\bar y_{i.}} = \frac{\sigma_e^2}{r}.$$

A 100(1-$\alpha$)% confidence interval for each lab mean is given by:

$$\bar y_{i.} \pm t_{(\alpha/2,\; t(r-1))}\,\hat\sigma_{\bar y_{i.}} \;=\; \bar y_{i.} \pm t_{(\alpha/2,\; t(r-1))}\sqrt{\frac{MSE}{r}}.$$

Row lab Rep WP25 WP50

1 1 1 26.17 50.76

2 1 2 26.22 50.67

3 1 3 25.85 50.81

4 1 4 25.80 50.72

5 2 1 26.44 50.82

6 2 2 26.57 50.90

7 2 3 25.80 51.04

8 2 4 26.06 50.96

9 3 1 26.95 52.53

10 3 2 26.91 52.54

11 3 3 26.98 52.55

12 3 4 26.91 52.47

13 5 1 26.23 50.20

14 5 2 26.00 50.47

15 5 3 26.22 50.39

16 5 4 26.18 50.43

17 6 1 25.45 51.65

18 6 2 25.62 51.67


19 6 3 27.01 51.72

20 6 4 25.72 52.07

21 7 1 26.14 50.53

22 7 2 26.78 50.75

23 7 3 26.04 49.99

24 7 4 25.97 50.92

25 8 1 25.70 50.00

26 8 2 25.90 50.30

27 8 3 25.80 50.50

28 8 4 25.70 50.60

29 9 1 26.13 50.26

30 9 2 26.13 50.36

31 9 3 25.91 50.97

32 9 4 25.86 50.44

33 10 1 26.22 50.23

34 10 2 26.20 50.27

35 10 3 25.84 50.29

36 10 4 25.84 49.97

Raw Data for the Malathion Interlaboratory Study

[Figure: Dotplots of WP50 by lab (x-axis: lab; y-axis: WP50)]

lab N Mean Median TrMean StDev

1 4 50.740 50.740 50.740 0.059

2 4 50.930 50.930 50.930 0.093

3 4 52.523 52.535 52.523 0.036

5 4 50.373 50.410 50.373 0.120

6 4 51.777 51.695 51.777 0.197

7 4 50.548 50.640 50.548 0.405

8 4 50.350 50.400 50.350 0.265

9 4 50.508 50.400 50.508 0.317

10 4 50.190 50.250 50.190 0.149


Analysis of Variance for WP50%_1

Source DF SS MS F P

laborato 8 19.1570 2.3946 50.958 0.000

Error 27 1.2688 0.0470

Total 35 20.4258

Variance Components

Source Var Comp. % of Total StDev

laborato 0.587 92.59 0.766

Error 0.047 7.41 0.217

Total 0.634 0.796

Expected Mean Squares

1 laborato 1.00(2) + 4.00(1)

2 Error 1.00(2)

$$\sigma_y^2 = \sigma_L^2 + \sigma_e^2$$

The analysis clearly indicates that 92.6% of the total variance of each observation is due to between-lab variation. That is, the lab averages are very different. This is also shown by the X-bar chart. It makes sense, since operators are familiar with their own lab facilities.

A further action is to

[Figure: Test for Equal Variances for WP50%_1, showing 95% Confidence Intervals for Sigmas by factor level (labs 1, 2, 3, 5, 6, 7, 8, 9, 10)]

Bartlett's Test: Test Statistic: 20.290, P-Value: 0.009

Levene's Test: Test Statistic: 1.324, P-Value: 0.274

Bonferroni confidence intervals for standard deviations

Lower Sigma Upper N Factor Levels

0.027423 0.059442 0.46874 4 1

0.042948 0.093095 0.73412 4 2

0.016580 0.035940 0.28341 4 3

0.055152 0.119548 0.94272 4 5

0.090980 0.197210 1.55514 4 6

0.186613 0.404506 3.18983 4 7

0.122058 0.264575 2.08637 4 8

0.146246 0.317004 2.49981 4 9

0.068634 0.148773 1.17318 4 10

[Figure: Xbar/R Chart for WP50%_1, subgroups by laboratory. Sample Mean chart: Mean = 50.88, UCL = 51.18, LCL = 50.58. Sample Range chart: R = 0.41, UCL = 0.9353, LCL = 0]

Quantify the uncertainty of an individual lab mean and determine a 95% confidence interval for the lab mean.

The ANOVA technique assumes constant variance; therefore, there is a common variance for each lab mean. The variance of a given Laboratory mean is:

$$\sigma^2_{\bar y_{i.}} = \frac{\sigma_e^2}{r}.$$

A 100(1-$\alpha$)% confidence interval for each lab mean is given by:

$$\bar y_{i.} \pm t_{(\alpha/2,\; t(r-1))}\,\hat\sigma_{\bar y_{i.}} \;=\; \bar y_{i.} \pm t_{(\alpha/2,\; t(r-1))}\sqrt{\frac{MSE}{r}}.$$

Hence, the estimate from the data is $s^2_{\bar y_{i.}} = 0.047/4 = 0.012$.

The error degrees of freedom are t(r-1) = 9(3) = 27, since nine labs reported data. The 95% CI for Lab 1 is

$$\bar y_{1.} \pm t_{(.025,\,27)}\,\hat\sigma_{\bar y_{1.}} = 50.74 \pm 2.052\sqrt{0.012} = 50.74 \pm 2.052(0.108) = 50.74 \pm 0.22.$$
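The confidence interval for a single lab mean can be checked with a short calculation. This is a minimal Python sketch for Lab 1 of the WP50 data: the error mean square and the lab mean come from the Minitab output above, and the t quantile is an assumption of the sketch, a tabled value for 27 degrees of freedom (the Error DF in the ANOVA table).

```python
import math

# 95% confidence interval for the mean of Lab 1 (WP50 data).
MSE = 0.0470         # error mean square from the ANOVA table
r = 4                # replicates per lab
lab1_mean = 50.740   # Lab 1 mean from the descriptive summary

var_lab_mean = MSE / r
se_lab_mean = math.sqrt(var_lab_mean)

# ASSUMPTION: tabled t quantile, upper 0.025 point with 27 degrees of freedom.
T_025_27 = 2.052
half_width = T_025_27 * se_lab_mean

lower, upper = lab1_mean - half_width, lab1_mean + half_width
print(f"Lab 1: {lab1_mean:.2f} +/- {half_width:.2f} = ({lower:.2f}, {upper:.2f})")
```

Because MSE is pooled over all labs, every lab gets the same half-width; this is reasonable only if the within-lab variances are roughly equal.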

Hands-on Activity

Conduct the same analysis for the WP25% variable:

1. Set up a statistical model for this experiment.

2. Produce a descriptive summary with dot plots by lab.

3. Test for constant variance, and find Bonferroni confidence intervals for the within-lab s.d.

4. Conduct an Xbar- and R-chart analysis.

5. Conduct ANOVA and estimate the variance components.

6. Find the variance of a lab mean and obtain a 95% CI for the mean of Lab 3.

2. Experimental design with three variance components

In the two-component model, we assume that the testing process is consistent from day to day. In some situations, however, lab testing may not be consistent from day to day. A three variance component design may be used if there is doubt about the day-to-day consistency.

Each participating lab tests four samples of the material per day and is asked to conduct the same test on three randomly chosen days, not known in advance to the lab. A statistical model based on this design is:

$$y_{ijk} = \mu + L_i + D_{j(i)} + e_{k(ij)}, \quad i = 1, 2, \ldots, t; \; j = 1, 2, \ldots, d; \; k = 1, 2, \ldots, r.$$

$L_i$ is assumed to follow $N(0, \sigma_L^2)$, $D_{j(i)} \sim N(0, \sigma_D^2)$, and $e_{k(ij)} \sim N(0, \sigma_e^2)$.

$D_{j(i)}$ is the random effect of Day nested within the ith Lab.

$e_{k(ij)}$ is the random effect of replications nested within the jth Day of the ith Lab.

In this type of study, we are interested in identifying unusual labs, in studying the day-to-day consistency within a lab, and in estimating the between-lab and within-lab variance components.

The overall variance of a measurement from any lab is

$$\sigma_y^2 = \sigma_L^2 + \sigma_D^2 + \sigma_e^2.$$

The variance of a given Laboratory mean is:

$$\sigma^2_{\bar y_{i..}} = \frac{\sigma_D^2}{d} + \frac{\sigma_e^2}{dr}.$$

A general setting of a design with three nested factors

The inter-laboratory testing study is one example of a nested design. In a general design with three nested factors, the effects can be a mixture of fixed and random effects, depending on the actual experiment. The corresponding statistical model for three nested factors is

$$y_{ijk} = \mu + a_i + b_{j(i)} + c_{k(ij)}, \quad i = 1, 2, \ldots, a; \; j = 1, 2, \ldots, b; \; k = 1, 2, \ldots, c.$$

The effects can be fixed or random, depending on the actual experiment. If all three effects are random, we have:

$a_i$ is assumed to follow $N(0, \sigma_a^2)$, $b_{j(i)} \sim N(0, \sigma_{b(a)}^2)$, and $c_{k(ij)} \sim N(0, \sigma_{c(b)}^2)$.

$b_{j(i)}$ is the random effect of factor B nested within the ith level of factor A.

$c_{k(ij)}$ is the random effect of factor C nested within the jth level of factor B.

The overall variance of a measurement is

$$\sigma_y^2 = \sigma_a^2 + \sigma_{b(a)}^2 + \sigma_{c(b)}^2.$$

NOTE: It is difficult to estimate the components or conduct proper F-tests for each factor when the model is mixed and complicated. Fortunately, Minitab can provide the EMS and show how the F-tests are conducted in the ANOVA table.

ANOVA Table and the EMS for the Three Nested Random Factor Experiment

Source    Df         SS        MS        F                P-value   EMS
A         a-1        SSA       MSA       MSA/MSB(A)                 $\sigma_{c(b)}^2 + c\,\sigma_{b(a)}^2 + bc\,\sigma_a^2$
B(A)      a(b-1)     SSB(A)    MSB(A)    MSB(A)/MSC(B)              $\sigma_{c(b)}^2 + c\,\sigma_{b(a)}^2$
C(B)      ab(c-1)    SSC(B)    MSC(B)                               $\sigma_{c(b)}^2$
Total     abc-1      SSTO

A Case Study of Three Nested Random Factor Experiment

A laboratory performs serum assays critical for correct medical diagnosis. It is important to maintain a program to monitor the performance of the assays and ensure accurate information for diagnosis.

The important sources of variation in the assays are the days on which the assays are conducted, the replicate runs within days, and the replicate serum sample preparations within runs. The quality control program requires that a spectrophotometer be tested in several runs on each of several days with serum standards used in the laboratory for control runs. Replicate serum preparations are evaluated within each of the runs. The data in the following table are observations from a design for the analysis of glucose standards. There are three replications (c = 3) of the standard prepared for each of two runs (b = 2) on each of three days (a = 3).

Row Day Run Sample glucose

1 1 1 1 42.5

2 1 1 2 43.3

3 1 1 3 42.9

4 1 2 1 42.2

5 1 2 2 41.4

6 1 2 3 41.8

7 2 3 1 48.0

8 2 3 2 44.6

9 2 3 3 43.7

10 2 4 1 42.0

11 2 4 2 42.8

12 2 4 3 42.8

13 3 5 1 41.7

14 3 5 2 43.4

15 3 5 3 42.5

16 3 6 1 40.6

17 3 6 2 41.8

18 3 6 3 41.8

(Data from Kuehl, 2000. Unit: mg/dl)

Three factors: Day, Run within Day, and Replicates within Run are all random effect factors.

Bonferroni confidence intervals for standard deviations

Lower Sigma Upper N Run Day

0.170862 0.40000 6.1903 3 1 1

0.170862 0.40000 6.1903 3 2 1

0.968739 2.26789 35.0974 3 3 2

0.197294 0.46188 7.1480 3 4 2

0.363290 0.85049 13.1620 3 5 3

0.295941 0.69282 10.7219 3 6 3

On Day 2, the three replications for run 3 are very different, especially the one with glucose = 48. A close check is needed. In this analysis, we assume there were no special causes for this observation.

Factor Type Levels Values

Day random 3 1 2 3

Run(Day) random 6 1 2 3 4 5 6

Analysis of Variance for glucose, using Adjusted SS for Tests

Source DF Seq SS Adj SS Adj MS F P

Day 2 13.763 13.763 6.882 1.26 0.400

Run(Day) 3 16.357 16.357 5.452 4.75 0.021

Error 12 13.760 13.760 1.147

Total 17 43.880

Unusual Observations for glucose

Obs glucose Fit SE Fit Residual St Resid

7 48.0000 45.4333 0.6182 2.5667 2.94R

R denotes an observation with a large standardized residual.

•The analysis indicates that there is no significant day-to-day variation, but there is significant variation among the runs within a day.

•The 1st replicate of the 3rd run (on the 2nd day) is an unusual case.

Source Expected Mean Square for Each Term

1 Day (3) + 3.0000(2) + 6.0000(1)

2 Run(Day) (3) + 3.0000(2)

3 Error (3)

Error Terms for Tests, using Adjusted SS

Source Error DF Error MS Synthesis of Error MS

1 Day 3.00 5.452 (2)

2 Run(Day) 12.00 1.147 (3)

Variance Components, using Adjusted SS

Source Estimated Value

Day 0.2382

Run(Day) 1.4352

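Error 1.1467

The three variance components are $\hat\sigma_a^2 = 0.2382$ (Day), $\hat\sigma_{b(a)}^2 = 1.4352$ (Run within Day), and $\hat\sigma_{c(b)}^2 = 1.1467$ (replicates within Run).

The error term of the F-test for testing the significance of $\sigma_a^2$ (Day) is Run(Day). The error term for testing Run(Day) is 'Error'.

The variance component for Day is non-significant. The variance components for Run(Day) and for the random error due to replications within Run are similar in size.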
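The three variance components can also be obtained by hand from the raw glucose data by solving the expected mean square equations from the bottom up. The following is a minimal Python sketch under the all-random three-stage nested model; the dictionary layout and names are illustrative only.

```python
# Three-stage nested random effects ANOVA: glucose[day][run] -> replicate readings (mg/dl)
glucose = {
    1: {1: [42.5, 43.3, 42.9], 2: [42.2, 41.4, 41.8]},
    2: {3: [48.0, 44.6, 43.7], 4: [42.0, 42.8, 42.8]},
    3: {5: [41.7, 43.4, 42.5], 6: [40.6, 41.8, 41.8]},
}
a, b, c = 3, 2, 3    # days, runs per day, replicates per run


def mean(values):
    return sum(values) / len(values)


grand = mean([y for runs in glucose.values() for ys in runs.values() for y in ys])

SSA = SSB = SSC = 0.0
for runs in glucose.values():
    day_mean = mean([y for ys in runs.values() for y in ys])
    SSA += b * c * (day_mean - grand) ** 2                 # Day
    for ys in runs.values():
        run_mean = mean(ys)
        SSB += c * (run_mean - day_mean) ** 2              # Run(Day)
        SSC += sum((y - run_mean) ** 2 for y in ys)        # replicates within Run

MSA = SSA / (a - 1)
MSB = SSB / (a * (b - 1))
MSC = SSC / (a * b * (c - 1))

# Solve the EMS equations: MSC -> MSB -> MSA
var_c = MSC
var_b = (MSB - MSC) / c
var_a = (MSA - MSB) / (b * c)

print(f"F(Day) = {MSA / MSB:.2f}, F(Run(Day)) = {MSB / MSC:.2f}")
print(f"Day: {var_a:.4f}, Run(Day): {var_b:.4f}, Error: {var_c:.4f}")
```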


Least Squares Means for glucose

Day Mean

1 42.35

2 43.98

3 41.97

(Day)Run

1 1 42.90

1 2 41.80

2 3 45.43

2 4 42.53

3 5 42.53

3 6 41.40

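The standard errors for the grand mean and for the mean of the ith level of factor A are based on the following variances:

$$\sigma^2_{\bar y_{...}} = \frac{\sigma_{c(b)}^2}{abc} + \frac{\sigma_{b(a)}^2}{ab} + \frac{\sigma_a^2}{a}, \quad\text{which is estimated by}\quad s^2_{\bar y_{...}} = \frac{MSA}{abc},$$

$$\sigma^2_{\bar y_{i..}} = \frac{\sigma_{c(b)}^2}{bc} + \frac{\sigma_{b(a)}^2}{b}, \quad\text{which is estimated by}\quad s^2_{\bar y_{i..}} = \frac{MSB(A)}{bc}.$$

100(1-$\alpha$)% confidence interval for the mean of the ith level of factor A:

$$\bar y_{i..} \pm t_{(\alpha/2,\; a(b-1))}\sqrt{\frac{MSB(A)}{bc}}$$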
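The variance estimates for the grand mean and for a day mean can be evaluated for the glucose data as follows. This is a minimal Python sketch: the mean squares and day means are taken from the Minitab output above, and the t quantile is an assumption of the sketch, a tabled value for a(b-1) = 3 degrees of freedom.

```python
import math

# Variance of the grand mean and of a day mean in the three-stage nested design.
MSA, MSB_A = 6.882, 5.452    # Day and Run(Day) mean squares from the ANOVA table
a, b, c = 3, 2, 3            # days, runs per day, replicates per run

var_grand_mean = MSA / (a * b * c)
var_day_mean = MSB_A / (b * c)

# ASSUMPTION: tabled t quantile, upper 0.025 point with a(b-1) = 3 degrees of freedom.
T_025_3 = 3.182
half_width = T_025_3 * math.sqrt(var_day_mean)

day_means = {1: 42.35, 2: 43.98, 3: 41.97}   # least squares means from the output
for day, m in day_means.items():
    print(f"Day {day}: {m:.2f} +/- {half_width:.2f}")
```

With only three degrees of freedom for Run(Day), the intervals are wide; more runs per day or more days would tighten them.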

Hands-on Activity

1. Obtain each variance component by hand.

2. Obtain the estimates of the variance of the grand mean glucose and of a day mean glucose.

3. Obtain 95% confidence intervals for the grand mean glucose and for a day mean glucose.

4. Compute the proportion of the total variance due to each variance component.

5. Exclude case 7 (value = 48) and re-analyze the data.