Analysis of Variance (ANOVA) and Multivariate Analysis of Variance (MANOVA)

Session 6

• Using Statistics.

• The Hypothesis Test of Analysis of Variance.

• The Theory and Computations of ANOVA.

• The ANOVA Table and Examples.

• Further Analysis.

• Models, Factors, and Designs.

• Two-Way Analysis of Variance.

• Blocking Designs.

• Using the Computer.

• Summary and Review of Terms.

Analysis of Variance

• ANOVA (ANalysis Of VAriance) is a statistical method for determining the existence of differences among several population means.

ANOVA is designed to detect differences among means from populations subject to different treatments.

ANOVA is a joint test

The equality of several population means is tested simultaneously or jointly.

ANOVA tests for the equality of several population means by looking at two estimators of the population variance (hence, analysis of variance).

6-1 ANOVA: Using Statistics

• In an analysis of variance:

We have r independent random samples, each one corresponding to a population subject to a different treatment.

We have n = n1 + n2 + n3 + ... + nr total observations.

r sample means: $\bar{x}_1, \bar{x}_2, \bar{x}_3, \dots, \bar{x}_r$. These r sample means can be used to calculate an estimator of the population variance. If the population means are equal, we expect the variance among the sample means to be small.

r sample variances: $s_1^2, s_2^2, s_3^2, \dots, s_r^2$. These sample variances can be used to find a pooled estimator of the population variance.

Analysis of Variance: Using Statistics (continued)

• We assume independent random sampling from each of the r populations

• We assume that the r populations under study are normally distributed, with means $\mu_i$ that may or may not be equal, but with equal variances, $\sigma_i^2 = \sigma^2$.

[Figure: normal distributions of Population 1, Population 2, and Population 3, with means $\mu_1$, $\mu_2$, $\mu_3$]

Analysis of Variance: Assumptions

The test statistic of analysis of variance:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$

That is, the test statistic in an analysis of variance is based on the ratio of two estimators of a population variance, and is therefore based on the F distribution, with (r-1) degrees of freedom in the numerator and (n-r) degrees of freedom in the denominator.

The hypothesis test of analysis of variance:

H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_r$

H1: Not all $\mu_i$ (i = 1, ..., r) are equal.

6-2 The Hypothesis Test of Analysis of Variance

[Figure: three sample means lying close to one another]

When the null hypothesis is true:

We would expect the sample means to be nearly equal, as in this illustration. And we would expect the variation among the sample means (between sample) to be small, relative to the variation found around the individual sample means (within sample).

If the null hypothesis is true, the numerator in the test statistic is expected to be small, relative to the denominator:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$


When the Null Hypothesis Is True

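[Figure: sample means that are not all close to one another]

When the null hypothesis is false: $\mu_1$ is equal to $\mu_2$ but not to $\mu_3$, $\mu_1$ is equal to $\mu_3$ but not to $\mu_2$, $\mu_2$ is equal to $\mu_3$ but not to $\mu_1$, or $\mu_1$, $\mu_2$, and $\mu_3$ are all unequal.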

In any of these situations, we would not expect the sample means to all be nearly equal. We would expect the variation among the sample means (between sample) to be large, relative to the variation around the individual sample means (within sample).

If the null hypothesis is false, the numerator in the test statistic is expected to be large, relative to the denominator:

$$F_{(r-1,\,n-r)} = \frac{\text{Estimate of variance based on means from } r \text{ samples}}{\text{Estimate of variance based on all sample observations}}$$


When the Null Hypothesis Is False

• Suppose we have 4 populations, from each of which we draw an independent random sample, with n1 + n2 + n3 + n4 = 54. Then our test statistic is:

$$F_{(4-1,\,54-4)} = F_{(3,\,50)} = \frac{\text{Estimate of variance based on means from 4 samples}}{\text{Estimate of variance based on all 54 sample observations}}$$

[Figure: F Distribution with 3 and 50 Degrees of Freedom; f(F) plotted against F, with the critical point 2.79 cutting off a right-tail area of α = 0.05]

The nonrejection region (for α = 0.05) in this instance is F ≤ 2.79, and the rejection region is F > 2.79. If the test statistic is less than or equal to 2.79, we would not reject the null hypothesis that the four population means are equal. If the test statistic is greater than 2.79, we would reject the null hypothesis and conclude that the four population means are not all equal.

The ANOVA Test Statistic for r = 4 Populations and n = 54 Total Sample Observations

Randomly chosen groups of customers were served different types of coffee and asked to rate the coffee on a scale of 0 to 100: 21 were served pure Brazilian coffee, 20 were served pure Colombian coffee, and 22 were served pure African-grown coffee.

The resulting test statistic was F = 2.02.

H0: $\mu_1 = \mu_2 = \mu_3$

H1: Not all three means are equal

n1 = 21, n2 = 20, n3 = 22, n = 21 + 20 + 22 = 63, r = 3

The critical point for α = 0.05 is:

$$F_{(r-1,\,n-r)} = F_{(3-1,\,63-3)} = F_{(2,\,60)} = 3.15$$

$$F = 2.02 < F_{(2,\,60)} = 3.15$$

H0 cannot be rejected, and we cannot conclude that any of the population means differs significantly from the others.

[Figure: F distribution, f(F) plotted against F; α = 0.05, test statistic = 2.02, critical point F(2,60) = 3.15]

Example 6-1

The grand mean, $\bar{\bar{x}}$, is the mean of all n = n1 + n2 + n3 + ... + nr observations in all r samples.

The mean of sample i (i = 1, 2, 3, ..., r):

$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}$$

The grand mean, the mean of all data points:

$$\bar{\bar{x}} = \frac{1}{n}\sum_{i=1}^{r}\sum_{j=1}^{n_i} x_{ij} = \frac{1}{n}\sum_{i=1}^{r} n_i \bar{x}_i$$

where $x_{ij}$ is the particular data point in position j within the sample from population i. The subscript i denotes the population, or treatment, and runs from 1 to r. The subscript j denotes the data point within the sample from population i; thus, j runs from 1 to $n_i$.

6-3 The Theory and Computations of ANOVA: The Grand Mean
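The two formulas for the grand mean can be checked numerically. The following is a minimal Python sketch using the triangle, square, and circle values of Table 6-1; the variable names are illustrative and not part of the original slides.

```python
# Table 6-1 data: three treatments (triangles, squares, circles).
samples = {
    "triangle": [4, 5, 7, 8],
    "square": [10, 11, 12, 13],
    "circle": [1, 2, 3],
}

# Mean of sample i: sum of x_ij over j, divided by n_i.
sample_means = {name: sum(xs) / len(xs) for name, xs in samples.items()}

# Grand mean: sum of every data point divided by the total count n.
n = sum(len(xs) for xs in samples.values())
grand_mean = sum(sum(xs) for xs in samples.values()) / n

# Equivalent form: average of the sample means weighted by n_i.
grand_mean_weighted = sum(
    len(xs) * sample_means[name] for name, xs in samples.items()
) / n

print(sample_means)
print(round(grand_mean, 3), round(grand_mean_weighted, 3))
```

Both forms give the same value because each sample mean, multiplied by its sample size, recovers that sample's total.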

Using the Grand Mean: Table 6-1

[Figure: data points plotted on a vertical axis, showing the distance from each data point to its sample mean and the distance from each sample mean to the grand mean; $\bar{x}_1 = 6$, $\bar{x}_2 = 11.5$, $\bar{x}_3 = 2$, $\bar{\bar{x}} = 6.909$]

Treatment (i)   Sample point (j)   Value ($x_{ij}$)
i = 1           Triangle 1          4
                Triangle 2          5
                Triangle 3          7
                Triangle 4          8
                Mean of Triangles   6
i = 2           Square 1           10
                Square 2           11
                Square 3           12
                Square 4           13
                Mean of Squares    11.5
i = 3           Circle 1            1
                Circle 2            2
                Circle 3            3
                Mean of Circles     2
Grand mean of all data points       6.909

If the r population means are different (that is, at least two of the population means are not equal), then it is likely that the variation of the data points about their respective sample means (within sample variation) will be small relative to the variation of the r sample means about the grand mean (between sample variation).

We define an error deviation as the difference between a data point and its sample mean. Errors are denoted by e, and we have:

$$e_{ij} = x_{ij} - \bar{x}_i$$

We define a treatment deviation as the deviation of a sample mean from the grand mean. Treatment deviations, $t_i$, are given by:

$$t_i = \bar{x}_i - \bar{\bar{x}}$$

The ANOVA principle says: When the population means are not equal, the “average” error (within sample) is relatively small compared with the “average” treatment (between sample) deviation.

The Theory and Computations of ANOVA: Error Deviation and Treatment Deviation

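Consider data point $x_{24} = 13$ from Table 6-1. The mean of sample 2 is 11.5, and the grand mean is 6.909, so:

$$e_{24} = x_{24} - \bar{x}_2 = 13 - 11.5 = 1.5$$

$$t_2 = \bar{x}_2 - \bar{\bar{x}} = 11.5 - 6.909 = 4.591$$

$$Tot_{24} = t_2 + e_{24} = 4.591 + 1.5 = 6.091$$

or

$$Tot_{24} = x_{24} - \bar{\bar{x}} = 13 - 6.909 = 6.091$$

[Figure: data point $x_{24} = 13$, sample mean $\bar{x}_2 = 11.5$, and grand mean $\bar{\bar{x}} = 6.909$ on one axis, showing the total deviation $Tot_{24} = x_{24} - \bar{\bar{x}} = 6.091$, the treatment deviation $t_2 = \bar{x}_2 - \bar{\bar{x}} = 4.591$, and the error deviation $e_{24} = x_{24} - \bar{x}_2 = 1.5$]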

The total deviation ($Tot_{ij}$) is the difference between a data point ($x_{ij}$) and the grand mean ($\bar{\bar{x}}$):

$$Tot_{ij} = x_{ij} - \bar{\bar{x}}$$

For any data point $x_{ij}$:

$$Tot_{ij} = t_i + e_{ij}$$

That is:

Total Deviation = Treatment Deviation + Error Deviation

The Theory and Computations of ANOVA: The Total Deviation

Total Deviation = Treatment Deviation + Error Deviation

Squared Deviations

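The total deviation is the sum of the treatment deviation and the error deviation:

$$t_i + e_{ij} = (\bar{x}_i - \bar{\bar{x}}) + (x_{ij} - \bar{x}_i) = x_{ij} - \bar{\bar{x}} = Tot_{ij}$$

Notice that the sample mean term ($\bar{x}_i$) cancels out in the above addition, which simplifies the equation.

The corresponding squared deviations are:

$$t_i^2 + e_{ij}^2 = (\bar{x}_i - \bar{\bar{x}})^2 + (x_{ij} - \bar{x}_i)^2$$

$$Tot_{ij}^2 = (x_{ij} - \bar{\bar{x}})^2$$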

The Theory and Computations of ANOVA: Squared Deviations

Sums of Squared Deviations

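$$\sum_{i=1}^{r}\sum_{j=1}^{n_i} Tot_{ij}^2 = \sum_{i=1}^{r} n_i t_i^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i} e_{ij}^2$$

$$\sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2 = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$$

SST = SSTR + SSE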

The Sum of Squares Principle: The total sum of squares (SST) is the sum of two terms: the sum of squares for treatment (SSTR) and the sum of squares for error (SSE).

SST = SSTR + SSE

The Theory and Computations of ANOVA: The Sum of Squares Principle
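As an illustration of the sum of squares principle, the sketch below computes SST, SSTR, and SSE directly from their definitions for the Table 6-1 data and confirms that they add up. It is a minimal example with illustrative variable names, not a general-purpose routine.

```python
# Table 6-1 data: triangles, squares, circles.
samples = [[4, 5, 7, 8], [10, 11, 12, 13], [1, 2, 3]]

n = sum(len(s) for s in samples)
grand_mean = sum(sum(s) for s in samples) / n
means = [sum(s) / len(s) for s in samples]

# SST: squared deviations of every data point from the grand mean.
sst = sum((x - grand_mean) ** 2 for s in samples for x in s)

# SSTR: squared deviations of each sample mean from the grand mean,
# weighted by the sample size n_i.
sstr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))

# SSE: squared deviations of every data point from its own sample mean.
sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)

print(round(sst, 2), round(sstr, 2), round(sse, 2))
```

The cross-product terms vanish when the squared total deviations are summed, which is why the sums of squares are additive even though the individual squared deviations are not.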

[Figure: SST partitioned into SSTR and SSE]

SST measures the total variation in the data set, the variation of all individual data points from the grand mean.

SSTR measures the explained variation, the variation of individual sample means from the grand mean. It is that part of the variation that is possibly expected, or explained, because the data points are drawn from different populations. It’s the variation between groups of data points.

SSE measures unexplained variation, the variation within each group that cannot be explained by possible differences between the groups.

The Theory and Computations of ANOVA: Picturing The Sum of Squares Principle

The number of degrees of freedom associated with SST is (n - 1): n total observations in all r groups, less one degree of freedom lost with the calculation of the grand mean.

The number of degrees of freedom associated with SSTR is (r - 1): r sample means, less one degree of freedom lost with the calculation of the grand mean.

The number of degrees of freedom associated with SSE is (n - r): n total observations in all groups, less one degree of freedom lost with the calculation of the sample mean from each of r groups.

The degrees of freedom are additive in the same way as are the sums of squares:

df(total) = df(treatment) + df(error)

(n - 1) = (r - 1) + (n - r)

The Theory and Computations of ANOVA: Degrees of Freedom

Recall that the calculation of the sample variance involves the division of the sum of squared deviations from the sample mean by the number of degrees of freedom. This principle is applied as well to find the mean squared deviations within the analysis of variance.

Mean square treatment (MSTR):

$$MSTR = \frac{SSTR}{r - 1}$$

Mean square error (MSE):

$$MSE = \frac{SSE}{n - r}$$

Mean square total (MST):

$$MST = \frac{SST}{n - 1}$$

(Note that the additive properties of sums of squares do not extend to the mean squares: MST ≠ MSTR + MSE.)

The Theory and Computations of ANOVA: The Mean Squares

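$$E(MSE) = \sigma^2$$

and

$$E(MSTR) = \sigma^2 + \frac{\sum_{i=1}^{r} n_i(\mu_i - \mu)^2}{r - 1}$$

which is equal to $\sigma^2$ when the null hypothesis is true, and greater than $\sigma^2$ when the null hypothesis is false,

where $\mu_i$ is the mean of population i and $\mu$ is the combined mean of all r populations.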

That is, the expected mean square error (MSE) is simply the common population variance (remember the assumption of equal population variances), but the expected mean square treatment (MSTR) is the common population variance plus a term related to the variation of the individual population means around the grand population mean.

If the null hypothesis is true so that the population means are all equal, the second term in the E(MSTR) formulation is zero, and E(MSTR) is equal to the common population variance.

The Theory and Computations of ANOVA: The Expected Mean Squares

When the null hypothesis of ANOVA is true and all r population means are equal, MSTR and MSE are two independent, unbiased estimators of the common population variance $\sigma^2$.

On the other hand, when the null hypothesis is false, then MSTR will tend to be larger than MSE.

So the ratio of MSTR and MSE can be used as an indicator of the equality or inequality of the r population means.

This ratio (MSTR/MSE) will tend to be near to 1 if the null hypothesis is true, and greater than 1 if the null hypothesis is false. The ANOVA test, finally, is a test of whether (MSTR/MSE) is equal to, or greater than, 1.

Expected Mean Squares and the ANOVA Principle

Under the assumptions of ANOVA, the ratio (MSTR/MSE) possesses an F distribution with (r - 1) degrees of freedom for the numerator and (n - r) degrees of freedom for the denominator when the null hypothesis is true.

The test statistic in analysis of variance:

$$F_{(r-1,\,n-r)} = \frac{MSTR}{MSE}$$

The Theory and Computations of ANOVA: The F Statistic
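A short sketch of the F ratio computation follows, assuming the sums of squares are already known. The helper name `one_way_f` is hypothetical; the inputs are the SSTR and SSE values of the triangle, square, and circle data (r = 3 groups, n = 11 observations), and the critical point 8.65 is the F(2, 8) value for α = 0.01 quoted in the slides.

```python
def one_way_f(sstr, sse, r, n):
    """Return (MSTR, MSE, F) for a one-way ANOVA with r groups and n observations."""
    mstr = sstr / (r - 1)  # mean square treatment, r - 1 degrees of freedom
    mse = sse / (n - r)    # mean square error, n - r degrees of freedom
    return mstr, mse, mstr / mse


mstr, mse, f_ratio = one_way_f(sstr=159.909091, sse=17.0, r=3, n=11)

critical_point = 8.65  # F(2, 8) at alpha = 0.01, taken from the slides
reject_h0 = f_ratio > critical_point

print(round(mstr, 3), round(mse, 3), round(f_ratio, 3), reject_h0)
```

Carrying full precision gives an F ratio slightly different in the last digit from a hand calculation that rounds MSTR to 79.95 first, which explains small discrepancies between the hand-computed table and software output.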

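$$SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = 17$$

$$SSTR = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 = 159.9$$

$$MSTR = \frac{SSTR}{r - 1} = \frac{159.9}{3 - 1} = 79.95$$

$$MSE = \frac{SSE}{n - r} = \frac{17}{8} = 2.125$$

$$F_{(2,\,8)} = \frac{MSTR}{MSE} = \frac{79.95}{2.125} = 37.62$$

Critical point (α = 0.01): 8.65

H0 may be rejected at the 0.01 level of significance.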

Treatment (i)   i   j   Value ($x_{ij}$)   $x_{ij} - \bar{x}_i$   $(x_{ij} - \bar{x}_i)^2$
Triangle        1   1    4                 -2                      4
Triangle        1   2    5                 -1                      1
Triangle        1   3    7                  1                      1
Triangle        1   4    8                  2                      4
Square          2   1   10                 -1.5                    2.25
Square          2   2   11                 -0.5                    0.25
Square          2   3   12                  0.5                    0.25
Square          2   4   13                  1.5                    2.25
Circle          3   1    1                 -1                      1
Circle          3   2    2                  0                      0
Circle          3   3    3                  1                      1
Sum                     76                  0                     17

Treatment   $\bar{x}_i - \bar{\bar{x}}$   $(\bar{x}_i - \bar{\bar{x}})^2$   $n_i(\bar{x}_i - \bar{\bar{x}})^2$
Triangle    -0.909                         0.826281                          3.305124
Square       4.591                        21.077281                         84.309124
Circle      -4.909                        24.098281                         72.294843
Sum                                                                        159.909091

6-4 The ANOVA Table and Examples

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Treatment             SSTR = 159.9     (r - 1) = 2          MSTR = 79.95   37.62
Error                 SSE = 17.0       (n - r) = 8          MSE = 2.125
Total                 SST = 176.9      (n - 1) = 10         MST = 17.69

[Figure: F Distribution for 2 and 8 Degrees of Freedom; f(F) plotted against F, with the critical point 8.65 cutting off a right-tail area of 0.01 and the computed test statistic = 37.62]

The ANOVA Table summarizes the ANOVA calculations. In this instance, since the test statistic is greater than the critical point for an α = 0.01 level of significance, the null hypothesis may be rejected, and we may conclude that the means for triangles, squares, and circles are not all equal.

ANOVA Table

Treat   Value
1        4
1        5
1        7
1        8
2       10
2       11
2       12
2       13
3        1
3        2
3        3

MTB > Oneway 'Value' 'Treat'.

One-Way Analysis of Variance

Analysis of Variance on Value
Source   DF       SS      MS       F       p
Treat     2   159.91   79.95   37.63   0.000
Error     8    17.00    2.12
Total    10   176.91

The MINITAB output includes not only the ANOVA table and the test statistic, but it also gives a p-value corresponding to the calculated F-ratio. In this instance the p-value is approximately 0, so the null hypothesis can be rejected at any common level of significance.

Using the Computer

The EXCEL output is created by selecting the ANOVA: SINGLE FACTOR option from the DATA ANALYSIS toolkit. The critical F value is based on α = 0.01. The p-value is very small, so again the null hypothesis can be rejected at any common level of significance.

Anova: Single Factor

SUMMARY
Groups     Count   Sum   Average   Variance
TRIANGLE   4       24    6         3.333333333
SQUARE     4       46    11.5      1.666666667
CIRCLE     3       6     2         1

ANOVA
Source of Variation   SS            df   MS            F             P-value       F crit
Between Groups        159.9090909   2    79.95454545   37.62566845   8.52698E-05   8.64906724
Within Groups         17            8    2.125
Total                 176.9090909   10

Using the Computer

Club Med has conducted a test to determine whether its Caribbean resorts are equally well liked by vacationing club members. The analysis was based on a survey questionnaire (general satisfaction, on a scale from 0 to 100) filled out by a random sample of 40 respondents from each of 5 resorts.

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square     F Ratio
Treatment             SSTR = 14208     (r - 1) = 4          MSTR = 3552     7.04
Error                 SSE = 98356      (n - r) = 195        MSE = 504.39
Total                 SST = 112564     (n - 1) = 199        MST = 565.65

Resort            Mean Response ($\bar{x}_i$)
Guadeloupe        89
Martinique        75
Eleuthra          73
Paradise Island   91
St. Lucia         85

SST = 112564, SSE = 98356

[Figure: F(4,200) distribution; f(F) plotted against F, with the critical point 3.41 cutting off a right-tail area of 0.01 and the computed test statistic = 7.04]

The resultant F ratio is larger than the critical point for α = 0.01, so the null hypothesis may be rejected.

Example 6-2: Club Med

Source of Variation   Sum of Squares    Degrees of Freedom   Mean Square    F Ratio
Treatment             SSTR = 879.3      (r - 1) = 3          MSTR = 293.1   8.52
Error                 SSE = 18541.6     (n - r) = 539        MSE = 34.4
Total                 SST = 19420.9     (n - 1) = 542        MST = 35.83

Given the total number of observations (n = 543), the number of groups (r = 4), the MSE (34.4), and the F ratio (8.52), the remainder of the ANOVA table can be completed. The critical point of the F distribution for α = 0.01 and (3, 400) degrees of freedom is 3.83. The test statistic in this example is much larger than this critical point, so the p-value associated with this test statistic is less than 0.01, and the null hypothesis may be rejected.

Example 6-3: Job Involvement
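The way the rest of the table in Example 6-3 follows from n, r, MSE, and F can be sketched in a few lines. This is an illustrative reconstruction that works backward from the four given quantities; the variable names are not from the original.

```python
# Given quantities from Example 6-3.
n, r = 543, 4
mse = 34.4
f_ratio = 8.52

# Degrees of freedom.
df_treatment = r - 1   # 3
df_error = n - r       # 539
df_total = n - 1       # 542

# Work backward: F = MSTR / MSE, and each mean square is SS / df.
mstr = f_ratio * mse
sstr = mstr * df_treatment
sse = mse * df_error
sst = sstr + sse       # sums of squares are additive
mst = sst / df_total

print(round(mstr, 1), round(sstr, 1), round(sse, 1), round(sst, 1), round(mst, 2))
```

Only the sums of squares and degrees of freedom add up across rows; MST has to be computed from SST and its own degrees of freedom.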



Groups    Count   Sum    Average       Variance
Michael   21      1979   94.23809524   8.59047619
Damon     21      1644   78.28571429   41.11428571
Allen     21      1381   65.76190476   352.8904762

ANOVA
Source of Variation   SS            df   MS            F             P-value       F crit
Between Groups        8555.52381    2    4277.761905   31.87639718   3.69732E-10   3.150411487
Within Groups         8051.904762   60   134.1984127
Total                 16607.42857   62

The test statistic value is 31.8764, far above the critical point for F(2, 60) of 3.15 when α = 0.05.

The GM should do whatever it takes to sign Michael.

See text for data and information on the problem

Example 6-4: NBA Franchise

Data → ANOVA → Do Not Reject H0 → Stop

Data → ANOVA → Reject H0 → Further Analysis

The sample means are unbiased estimators of the population means.

The mean square error (MSE) is an unbiased estimator of the common population variance.

Further Analysis

Confidence Intervals for Population Means

Tukey Pairwise Comparisons Test

The ANOVA Diagram

6-5 Further Analysis

A (1 - α)100% confidence interval for $\mu_i$, the mean of population i:

$$\bar{x}_i \pm t_{\alpha/2}\sqrt{\frac{MSE}{n_i}}$$

where $t_{\alpha/2}$ is the value of the t distribution with (n - r) degrees of freedom that cuts off a right-tailed area of α/2.

For the Club Med example:

$$\bar{x}_i \pm t_{\alpha/2}\sqrt{\frac{MSE}{n_i}} = \bar{x}_i \pm 1.96\sqrt{\frac{504.39}{40}} = \bar{x}_i \pm 6.96$$

89 ± 6.96 = [82.04, 95.96]
75 ± 6.96 = [68.04, 81.96]
73 ± 6.96 = [66.04, 79.96]
91 ± 6.96 = [84.04, 97.96]
85 ± 6.96 = [78.04, 91.96]

Resort Mean Response (x i)

Guadeloupe 89

Martinique 75

Eleuthra 73

Paradise Island 91

St. Lucia 85

SST = 112564 SSE = 98356

ni = 40 n = (5)(40) = 200

MSE = 504.39
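The interval arithmetic for the five resorts can be reproduced with a short sketch. It assumes the t value of 1.96 used in the slides for (n - r) = 195 degrees of freedom; the dictionary and variable names are illustrative.

```python
import math

# Club Med sample means, with MSE and common sample size from the ANOVA table.
resort_means = {
    "Guadeloupe": 89,
    "Martinique": 75,
    "Eleuthra": 73,
    "Paradise Island": 91,
    "St. Lucia": 85,
}
mse = 504.39
n_i = 40
t_value = 1.96  # value used in the slides for a 95% interval with 195 df

# Half-width of each interval: t * sqrt(MSE / n_i).
half_width = t_value * math.sqrt(mse / n_i)

intervals = {
    resort: (mean - half_width, mean + half_width)
    for resort, mean in resort_means.items()
}

for resort, (low, high) in intervals.items():
    print(f"{resort}: [{low:.2f}, {high:.2f}]")
```

Because every resort has the same sample size and the pooled MSE is used for all of them, all five intervals share one half-width.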


The Tukey Pairwise Comparison test, or Honestly Significant Differences (HSD) test, allows us to compare every pair of population means with a single level of significance.

It is based on the studentized range distribution, q, with r and (n-r) degrees of freedom.

The critical point in a Tukey Pairwise Comparisons test is the Tukey Criterion:

$$T = q_\alpha\sqrt{\frac{MSE}{n_i}}$$

where $n_i$ is the smallest of the r sample sizes.

The test statistic is the absolute value of the difference between the appropriate sample means, and the null hypothesis is rejected if the test statistic is greater than the critical point of the Tukey Criterion.

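Note that there are $\binom{r}{2} = \frac{r!}{2!\,(r-2)!}$ pairs of population means to compare. For example, if r = 3, there are 3 pairs:

H0: $\mu_1 = \mu_2$     H0: $\mu_1 = \mu_3$     H0: $\mu_2 = \mu_3$

H1: $\mu_1 \ne \mu_2$     H1: $\mu_1 \ne \mu_3$     H1: $\mu_2 \ne \mu_3$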

The Tukey Pairwise Comparison Test

The test statistic for each pairwise test is the absolute difference between the appropriate sample means.

i   Resort         Mean
1   Guadeloupe     89
2   Martinique     75
3   Eleuthra       73
4   Paradise Is.   91
5   St. Lucia      85

The critical point $T_{0.05}$ for r = 5 and (n - r) = 195 degrees of freedom is:

$$T = q_\alpha\sqrt{\frac{MSE}{n_i}} = 3.86\sqrt{\frac{504.4}{40}} = 13.7$$

Test    H0                   H1                     Test statistic
I.      $\mu_1 = \mu_2$      $\mu_1 \ne \mu_2$      |89 - 75| = 14 > 13.7 *
II.     $\mu_1 = \mu_3$      $\mu_1 \ne \mu_3$      |89 - 73| = 16 > 13.7 *
III.    $\mu_1 = \mu_4$      $\mu_1 \ne \mu_4$      |89 - 91| = 2 < 13.7
IV.     $\mu_1 = \mu_5$      $\mu_1 \ne \mu_5$      |89 - 85| = 4 < 13.7
V.      $\mu_2 = \mu_3$      $\mu_2 \ne \mu_3$      |75 - 73| = 2 < 13.7
VI.     $\mu_2 = \mu_4$      $\mu_2 \ne \mu_4$      |75 - 91| = 16 > 13.7 *
VII.    $\mu_2 = \mu_5$      $\mu_2 \ne \mu_5$      |75 - 85| = 10 < 13.7
VIII.   $\mu_3 = \mu_4$      $\mu_3 \ne \mu_4$      |73 - 91| = 18 > 13.7 *
IX.     $\mu_3 = \mu_5$      $\mu_3 \ne \mu_5$      |73 - 85| = 12 < 13.7
X.      $\mu_4 = \mu_5$      $\mu_4 \ne \mu_5$      |91 - 85| = 6 < 13.7

Reject the null hypothesis if the absolute value of the difference between the sample means is greater than the critical value of T. (The hypotheses marked with * are rejected.)

The Tukey Pairwise Comparison Test: The Club Med Example
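The ten pairwise comparisons can be enumerated mechanically. The sketch below takes the studentized range value q = 3.86 and MSE = 504.4 from the slides as given (they are not computed here) and flags every pair whose absolute difference in sample means exceeds the Tukey Criterion. Variable names are illustrative.

```python
import math
from itertools import combinations

# Club Med sample means, indexed by population number as in the slides.
means = {1: 89, 2: 75, 3: 73, 4: 91, 5: 85}

q_alpha = 3.86  # studentized range value quoted in the slides (r = 5, 195 df)
mse = 504.4
n_i = 40        # smallest (here: common) sample size

# Tukey Criterion: T = q * sqrt(MSE / n_i).
tukey_t = q_alpha * math.sqrt(mse / n_i)

# Compare every pair (i, k) with i < k; reject H0 when |mean_i - mean_k| > T.
rejected = [
    (i, k)
    for i, k in combinations(sorted(means), 2)
    if abs(means[i] - means[k]) > tukey_t
]

print(round(tukey_t, 1))
print(rejected)
```

With r = 5 populations, `combinations` yields the 5!/(2! 3!) = 10 pairs listed in the slide, in the same order I through X.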

We rejected the null hypotheses comparing the means of populations 1 and 2, 1 and 3, 2 and 4, and 3 and 4. On the other hand, we did not reject the null hypotheses of the equality of the means of populations 1 and 4, 1 and 5, 2 and 3, 2 and 5, 3 and 5, and 4 and 5.

The bars indicate the three groupings of populations with possibly equal means: 2 and 3; 2, 3, and 5; and 1, 4, and 5.

[Figure: populations 1 to 5 with bars joining the groups of populations whose means are possibly equal]

Picturing the Results of a Tukey Pairwise Comparisons Test: The Club Med Example

• A statistical model is a set of equations and assumptions that capture the essential characteristics of a real-world situation.

The one-factor ANOVA model:

$$x_{ij} = \mu_i + \varepsilon_{ij} = \mu + \tau_i + \varepsilon_{ij}$$

where $\varepsilon_{ij}$ is the error associated with the jth member of the ith population. The errors are assumed to be normally distributed with mean 0 and variance $\sigma^2$.

• A factor is a set of populations or treatments of a single kind. For example:

One-factor models based on sets of resorts, types of airplanes, or kinds of sweaters.

Two-factor models based on firm and location.

Three-factor models based on color and shape and size of an ad.

• Fixed-Effects and Random-Effects Models

A fixed-effects model is one in which the levels of the factor under study (the treatments) are fixed in advance. Inference is valid only for the levels under study.

A random-effects model is one in which the levels of the factor under study are randomly chosen from an entire population of levels (treatments). Inference is valid for the entire population of levels.

6-6 Models, Factors and Designs

• A completely-randomized design is one in which the elements are assigned to treatments completely at random. That is, any element chosen for the study has an equal chance of being assigned to any treatment.

• In a blocking design, elements are assigned to treatments after first being collected into homogeneous groups.

In a completely randomized block design, all members of each block (homogeneous group) are randomly assigned to the treatment levels.

In a repeated measures design, each member of each block is assigned to all treatment levels.

Experimental Design

• In a two-way ANOVA, the effects of two factors or treatments can be investigated simultaneously. Two-way ANOVA also permits the investigation of the effects of either factor alone and of the two factors together.

The effect on the population mean that can be attributed to the levels of either factor alone is called a main effect.

An interaction effect between two factors occurs if the total effect at some pair of levels of the two factors or treatments differs significantly from the simple addition of the two main effects. Factors that do not interact are called additive.

• Three questions answerable by two-way ANOVA: Are there any factor A main effects? Are there any factor B main effects? Are there any interaction effects between factors A and B?

• For example, we might investigate the effects on vacationers’ ratings of resorts by looking at five different resorts (factor A) and four different resort attributes (factor B). In addition to the five main factor A treatment levels and the four main factor B treatment levels, there are (5 × 4 = 20) interaction treatment levels.

6-7 Two-Way Analysis of Variance

• $x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$

where $\mu$ is the overall mean;

$\alpha_i$ is the effect of level i (i = 1, ..., a) of factor A;

$\beta_j$ is the effect of level j (j = 1, ..., b) of factor B;

$(\alpha\beta)_{ij}$ is the interaction effect of levels i and j;

$\varepsilon_{ijk}$ is the error associated with the kth data point from level i of factor A and level j of factor B.

$\varepsilon_{ijk}$ is assumed to be distributed normally with mean zero and variance $\sigma^2$ for all i, j, and k.

The Two-Way ANOVA Model

Factor A: Resort (columns); Factor B: Attribute (rows)

             Guadeloupe   Martinique   Eleuthra   Paradise Island   St. Lucia
Friendship   n11          n21          n31        n41               n51
Sports       n12          n22          n32        n42               n52
Culture      n13          n23          n33        n43               n53
Excitement   n14          n24          n34        n44               n54

Graphical Display of Effects

[Figure: two plots of Rating. In the first, rating is plotted against Resort (Eleuthra, Martinique, St. Lucia, Guadeloupe, Paradise Island) with one line per attribute (Friendship, Excitement, Sports, Culture). In the second, rating is plotted against Attribute with one line per resort. Annotation: Eleuthra/sports interaction: combined effect greater than additive main effects.]

Two-Way ANOVA Data Layout: Club Med Example

• Factor A main effects test:
H0: $\alpha_i = 0$ for all i = 1, 2, ..., a
H1: Not all $\alpha_i$ are 0

• Factor B main effects test:
H0: $\beta_j = 0$ for all j = 1, 2, ..., b
H1: Not all $\beta_j$ are 0

• Test for (AB) interactions:
H0: $(\alpha\beta)_{ij} = 0$ for all i = 1, 2, ..., a and j = 1, 2, ..., b
H1: Not all $(\alpha\beta)_{ij}$ are 0

Hypothesis Tests in a Two-Way ANOVA

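In a two-way ANOVA:

$$x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$$

SST = SSTR + SSE

SST = SSA + SSB + SS(AB) + SSE

With all sums taken over every data point (i, j, k), where $\bar{x}_{ij}$ is the mean of cell (i, j), $\bar{x}_i$ is the mean of level i of factor A, and $\bar{x}_j$ is the mean of level j of factor B:

SST = SSTR + SSE:

$$\sum (x - \bar{\bar{x}})^2 = \sum (\bar{x}_{ij} - \bar{\bar{x}})^2 + \sum (x - \bar{x}_{ij})^2$$

SSTR = SSA + SSB + SS(AB):

$$\sum (\bar{x}_{ij} - \bar{\bar{x}})^2 = \sum (\bar{x}_i - \bar{\bar{x}})^2 + \sum (\bar{x}_j - \bar{\bar{x}})^2 + \sum (\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{\bar{x}})^2$$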

Sums of Squares

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                        F Ratio
Factor A              SSA              a - 1                MSA = SSA/(a - 1)                  F = MSA/MSE
Factor B              SSB              b - 1                MSB = SSB/(b - 1)                  F = MSB/MSE
Interaction           SS(AB)           (a - 1)(b - 1)       MS(AB) = SS(AB)/[(a - 1)(b - 1)]   F = MS(AB)/MSE
Error                 SSE              ab(n - 1)            MSE = SSE/[ab(n - 1)]
Total                 SST              abn - 1

A Main Effect Test: F(a - 1, ab(n - 1))

B Main Effect Test: F(b - 1, ab(n - 1))

(AB) Interaction Effect Test: F((a - 1)(b - 1), ab(n - 1))

The Two-Way ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Location              1824             2                    912           8.94 *
Artist                2230             2                    1115          10.93 *
Interaction           804              4                    201           1.97
Error                 8262             81                   102
Total                 13120            89

α = 0.01, F(2,81) = 4.88: both main effect null hypotheses are rejected.

α = 0.05, F(4,81) = 2.48: the interaction effect null hypothesis is not rejected.

Example 6-4: Two-Way ANOVA (Location and Artist)
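The mean squares and F ratios in the location and artist table follow directly from the sums of squares and degrees of freedom. In the sketch below, the values a = 3, b = 3, and n = 10 are assumptions inferred from the degrees of freedom in the table (2, 2, 4, and 81); they are not stated in the slides.

```python
# Assumed design sizes, inferred from df = 2, 2, 4, 81 in the table.
a, b, n = 3, 3, 10  # levels of each factor and observations per cell

sums_of_squares = {
    "Location": 1824,
    "Artist": 2230,
    "Interaction": 804,
    "Error": 8262,
}
degrees_of_freedom = {
    "Location": a - 1,
    "Artist": b - 1,
    "Interaction": (a - 1) * (b - 1),
    "Error": a * b * (n - 1),
}

# Mean square = sum of squares / degrees of freedom.
mean_squares = {
    source: sums_of_squares[source] / degrees_of_freedom[source]
    for source in sums_of_squares
}

# Each effect is tested against the same mean square error.
f_ratios = {
    source: mean_squares[source] / mean_squares["Error"]
    for source in ("Location", "Artist", "Interaction")
}

for source, f_value in f_ratios.items():
    print(source, degrees_of_freedom[source], round(f_value, 2))
```

Note that the interaction F ratio has (a - 1)(b - 1) = 4 numerator degrees of freedom, so it is compared with a different critical point than the two main effects.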

[Figure: F distribution, f(F) plotted against F; α = 0.01, critical point F0.01 = 4.88; location test statistic = 8.94, artist test statistic = 10.93]

[Figure: F distribution, f(F) plotted against F; α = 0.05, critical point F0.05 = 2.48; interaction test statistic = 1.97]

Hypothesis Tests

Kimball’s Inequality gives an upper limit on the true probability of at least one Type I error in the three tests of a two-way analysis:

$$\alpha \le 1 - (1 - \alpha_1)(1 - \alpha_2)(1 - \alpha_3)$$

Tukey Criterion for factor A:

$$T = q_\alpha\sqrt{\frac{MSE}{bn}}$$

where the degrees of freedom of the q distribution are now a and ab(n - 1). Note that MSE is divided by bn.

Overall Significance Level and Tukey Method for Two-Way ANOVA
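As a quick numerical illustration of Kimball's bound, the sketch below evaluates it for the significance levels used in the location and artist example (0.01 for each main effect test and 0.05 for the interaction test). The function name `kimball_bound` is hypothetical.

```python
def kimball_bound(alphas):
    """Upper limit on the probability of at least one Type I error
    across tests run at the given individual significance levels."""
    product = 1.0
    for alpha in alphas:
        product *= 1.0 - alpha
    return 1.0 - product


# Two main-effect tests at 0.01 and one interaction test at 0.05.
overall = kimball_bound([0.01, 0.01, 0.05])
print(round(overall, 6))
```

The bound is always a little smaller than the simple sum of the three levels, and it grows quickly when all three tests are run at 0.05.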

Source of Variation   Sum of Squares   Degrees of Freedom       Mean Square                                   F Ratio
Factor A              SSA              a - 1                    MSA = SSA/(a - 1)                             F = MSA/MSE
Factor B              SSB              b - 1                    MSB = SSB/(b - 1)                             F = MSB/MSE
Factor C              SSC              c - 1                    MSC = SSC/(c - 1)                             F = MSC/MSE
Interaction (AB)      SS(AB)           (a - 1)(b - 1)           MS(AB) = SS(AB)/[(a - 1)(b - 1)]              F = MS(AB)/MSE
Interaction (AC)      SS(AC)           (a - 1)(c - 1)           MS(AC) = SS(AC)/[(a - 1)(c - 1)]              F = MS(AC)/MSE
Interaction (BC)      SS(BC)           (b - 1)(c - 1)           MS(BC) = SS(BC)/[(b - 1)(c - 1)]              F = MS(BC)/MSE
Interaction (ABC)     SS(ABC)          (a - 1)(b - 1)(c - 1)    MS(ABC) = SS(ABC)/[(a - 1)(b - 1)(c - 1)]     F = MS(ABC)/MSE
Error                 SSE              abc(n - 1)               MSE = SSE/[abc(n - 1)]
Total                 SST              abcn - 1

Three-Way ANOVA Table

• A block is a homogeneous set of subjects, grouped to minimize within-group differences.

• A completely-randomized design is one in which the elements are assigned to treatments completely at random. That is, any element chosen for the study has an equal chance of being assigned to any treatment.

• In a blocking design, elements are assigned to treatments after first being collected into homogeneous groups.

– In a completely randomized block design, all members of each block (homogeneous group) are randomly assigned to the treatment levels.

– In a repeated measures design, each member of each block is assigned to all treatment levels.

6-8 Blocking Designs

Source of Variation   Sum of Squares   df    Mean Square   F Ratio
Blocks                2750             39    70.51         0.69
Treatments            2640             2     1320          12.93
Error                 7960             78    102.05
Total                 13350            119

α = 0.01, F(2, 78) = 4.88

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square             F Ratio
Blocks                SSBL             n - 1                MSBL = SSBL/(n - 1)     F = MSBL/MSE
Treatments            SSTR             r - 1                MSTR = SSTR/(r - 1)     F = MSTR/MSE
Error                 SSE              (n - 1)(r - 1)
Total                 SST              nr - 1

ANOVA Table for Blocking Designs: Example 6-5

MSE = SSE/[(n - 1)(r - 1)]
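The blocking-design table of Example 6-5 can be reproduced from its sums of squares. In this sketch the number of blocks (n = 40) and the number of treatments (r = 3) are assumptions inferred from the degrees of freedom 39 and 2 in the table; the variable names are illustrative.

```python
# Assumed sizes, inferred from df = 39 (blocks) and df = 2 (treatments).
n_blocks, r_treatments = 40, 3

ss_blocks, ss_treatments, ss_error = 2750, 2640, 7960

df_blocks = n_blocks - 1
df_treatments = r_treatments - 1
df_error = (n_blocks - 1) * (r_treatments - 1)
df_total = n_blocks * r_treatments - 1

ms_blocks = ss_blocks / df_blocks
ms_treatments = ss_treatments / df_treatments
ms_error = ss_error / df_error  # MSE = SSE / [(n - 1)(r - 1)]

f_blocks = ms_blocks / ms_error
f_treatments = ms_treatments / ms_error

critical_point = 4.88  # F(2, 78) at alpha = 0.01, as quoted in the slides
print(round(ms_error, 2), round(f_blocks, 2), round(f_treatments, 2))
print(f_treatments > critical_point)
```

Removing the block sum of squares from the error term is what distinguishes this table from a one-way ANOVA on the same data.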

MTB > ONEWAY C1, C2;
SUBC> TUKEY 0.05.

One-Way Analysis of Variance

Analysis of Variance on C1
Source   DF        SS       MS       F       p
Method    2   1348.45   674.23   69.42   0.000
Error    63    611.91     9.71
Total    65   1960.36
                                 Individual 95% CIs For Mean
                                 Based on Pooled StDev
Level    N     Mean   StDev   ----+---------+---------+---------+--
1       22   22.773   3.131           (--*--)
2       22   30.636   3.824                              (---*--)
3       22   19.955   2.171    (--*--)
                              ----+---------+---------+---------+--
Pooled StDev = 3.117           20.0      24.0      28.0      32.0

Tukey's pairwise comparisons

Family error rate = 0.0500
Individual error rate = 0.0193

Critical value = 3.39

Intervals for (column level mean) - (row level mean)

            1          2
2     -10.116
       -5.611
3       0.566      8.429
        5.071     12.934

Using the Computer



Groups     Count   Sum   Average       Variance
METHOD A   22      501   22.77272727   9.803030303
METHOD B   22      674   30.63636364   14.62337662
METHOD C   22      439   19.95454545   4.712121212

ANOVA
Source of Variation   SS            df   MS            F             P-value       F crit
Between Groups        1348.454545   2    674.2272727   69.41606002   1.18041E-16   3.14280868
Within Groups         611.9090909   63   9.712842713
Total                 1960.363636   65

95% Confidence Intervals on Means
              Method A   Method B   Method C
LOWER BOUND   21.44493   29.309     18.62675
UPPER BOUND   24.10052   31.964     21.28234

Using the Computer: Example 6-6 Using Excel

• Introduction.

• The Multivariate Normal Distribution.

• Discriminant Analysis.

• Principal Components and Factor Analysis.

• Using the Computer.

• Summary and Review of Terms.

6-9 Multivariate Analysis

• A k-dimensional (vector) random variable X: X = (X1, X2, X3, ..., Xk)

• A realization of a k-dimensional random variable X: x = (x1, x2, x3, ..., xk)

• A joint cumulative probability distribution function of a k-dimensional random variable X: F(x1, x2, x3, ..., xk) = P(X1 ≤ x1, X2 ≤ x2, ..., Xk ≤ xk)

6-10 The Multivariate Normal Distribution

A multivariate normal random variable has the following probability density function:

$$f(x_1, x_2, \dots, x_k) = \frac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}}\; e^{-\frac{1}{2}(X - \mu)'\,\Sigma^{-1}(X - \mu)}$$

where X is the vector random variable, the term $\mu = (\mu_1, \mu_2, \dots, \mu_k)$ is the vector of means of the component variables $X_i$, and $\Sigma$ is the variance-covariance matrix. The operations ' and $^{-1}$ are transposition and inversion of matrices, respectively, and | | denotes the determinant of a matrix.

The Multivariate Normal Distribution
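To make the density formula concrete, here is a minimal sketch for the bivariate case (k = 2), where the determinant and inverse of the variance-covariance matrix can be written out by hand. The function name and the example inputs are illustrative assumptions, not values from the slides.

```python
import math


def bivariate_normal_pdf(x, mu, cov):
    """Density of a 2-dimensional normal at point x, for mean vector mu
    and 2x2 variance-covariance matrix cov (given as nested lists)."""
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = [[s22 / det, -s12 / det], [-s21 / det, s11 / det]]
    d = [x[0] - mu[0], x[1] - mu[1]]
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu).
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    k = 2
    return math.exp(-0.5 * quad) / ((2 * math.pi) ** (k / 2) * math.sqrt(det))


# Density at the mean of a standard bivariate normal (identity covariance).
peak = bivariate_normal_pdf([0.0, 0.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(round(peak, 5))
```

At the mean the exponent is zero, so the density reduces to the normalizing constant, which shrinks as the determinant of the covariance matrix grows.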

[Figure: surface plot of the bivariate normal density f(x1, x2) over the x1 and x2 axes]

Picturing the Bivariate Normal Distribution

[Figure: scatter of Group 1 and Group 2 in the (X1, X2) plane, with group means $\mu_1$ and $\mu_2$ and Line L]

In a discriminant analysis, observations are classified into two or more groups, depending on the value of a multivariate discriminant function.

As the figure illustrates, it may be easier to classify observations by looking at them from another direction. The groups appear more separated when viewed from a point perpendicular to Line L, rather than from a point perpendicular to the X1 or X2 axis. The discriminant function gives the direction that maximizes the separation between the groups.

6-11 Discriminant Analysis

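[Figure: overlapping distributions of discriminant scores for Group 1 and Group 2, with the cutting score C at their intersection]

The form of the estimated prediction equation:

D = b0 + b1X1 + b2X2 + ... + bkXk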

where the bi are the discriminant weights. b0 is a constant.

The intersection of the normal marginal distributions of two groups gives the cutting score, which is used to assign observations to groups. Observations with scores less than C are assigned to group 1, and observations with scores greater than C are assigned to group 2. Since the distributions may overlap, some observations may be misclassified.

The model may be evaluated in terms of the percentages of observations assigned correctly and incorrectly.

The Discriminant Function

Discriminant 'Repay' 'Assets' 'Debt' 'Famsize'.

Group     0    1
Count    14   18

Summary of Classification

Put into    ....True Group....
Group           0        1
0              10        5
1               4       13
Total N        14       18
N Correct      10       13
Proport.    0.714    0.722

N = 32   N Correct = 23   Prop. Correct = 0.719

Linear Discriminant Function for Group
                 0         1
Constant   -7.0443   -5.4077
Assets      0.0019    0.0548
Debt        0.0758    0.0113
Famsize     3.5833    2.8570

Discriminant Analysis: Example 6-7 (Minitab)
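The proportions in the classification summary can be recomputed from the counts in the table. The sketch below stores the counts as a nested dictionary keyed by true group and then by the group the observation was put into; this layout is an illustrative choice, not Minitab's.

```python
# Counts from the Summary of Classification table:
# classification[true_group][put_into_group]
classification = {
    0: {0: 10, 1: 4},
    1: {0: 5, 1: 13},
}

# Proportion correctly classified within each true group.
proportion_correct = {
    group: counts[group] / sum(counts.values())
    for group, counts in classification.items()
}

# Overall proportion correct across both groups.
n_total = sum(sum(counts.values()) for counts in classification.values())
n_correct = sum(counts[group] for group, counts in classification.items())
overall_correct = n_correct / n_total

print({g: round(p, 3) for g, p in proportion_correct.items()})
print(n_total, n_correct, round(overall_correct, 3))
```

The overall proportion is a count-weighted average of the two group proportions, so it sits between them.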

Summary of Misclassified Observations

Observation   True Group   Pred Group   Group   Sqrd Distnc   Probability
 4 **         1            0            0       6.966         0.515
                                        1       7.083         0.485
 7 **         1            0            0       0.9790        0.599
                                        1       1.7780        0.401
21 **         0            1            0       2.940         0.348
                                        1       1.681         0.652
22 **         1            0            0       0.3812        0.775
                                        1       2.8539        0.225
24 **         0            1            0       5.371         0.454
                                        1       5.002         0.546
27 **         0            1            0       2.617         0.370
                                        1       1.551         0.630
28 **         1            0            0       1.250         0.656
                                        1       2.542         0.344
29 **         1            0            0       1.703         0.782
                                        1       4.259         0.218
32 **         0            1            0       1.84529       0.288
                                        1       0.03091       0.712

Example 6-7: Misclassified Observations

 1  0  set width 80
 2  data list free / assets income debt famsize job repay
 3  begin data
35  end data
36  discriminant groups = repay(0,1)
37    /variables assets income debt famsize job
38    /method = wilks
39    /fin = 1
40    /fout = 1
41    /plot
42    /statistics = all

Number of cases by group

          Number of cases
REPAY   Unweighted   Weighted   Label
0       14           14.0
1       18           18.0
Total   32           32.0

Example 6-7: SPSS Output (1)

- - - - - - - -  D I S C R I M I N A N T   A N A L Y S I S  - - - - - - - -

On groups defined by REPAY

Analysis number 1

Stepwise variable selection
  Selection rule: minimize Wilks' Lambda
  Maximum number of steps..................   10
  Minimum tolerance level..................   .00100
  Minimum F to enter.......................   1.00000
  Maximum F to remove......................   1.00000

Canonical Discriminant Functions
  Maximum number of functions..............   1
  Minimum cumulative percent of variance...   100.00
  Maximum significance of Wilks' Lambda....   1.0000

Prior probability for each group is .50000


---------------- Variables not in the Analysis after Step 0 ----------------

                          Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
ASSETS      1.0000000    1.0000000    6.6151550       .8193329
INCOME      1.0000000    1.0000000    3.0672181       .9072429
DEBT        1.0000000    1.0000000    5.2263180       .8516360
FAMSIZE     1.0000000    1.0000000    2.5291715       .9222491
JOB         1.0000000    1.0000000     .2445652       .9919137

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

At step 1, ASSETS was included in the analysis.

                             Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .81933      1    1    30.0
Equivalent F      6.61516           1    30.0       .0153


---------------- Variables in the Analysis after Step 1 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS      1.0000000      6.6152

---------------- Variables not in the Analysis after Step 1 ------------

                          Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME       .5784563     .5784563     .0090821       .8190764
DEBT         .9706667     .9706667    6.0661878       .6775944
FAMSIZE      .9492947     .9492947    3.9269288       .7216177
JOB          .9631433     .9631433     .0000005       .8193329

At step 2, DEBT was included in the analysis.

                             Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .67759      2    1    30.0
Equivalent F      6.89923           2    29.0       .0035


----------------- Variables in the Analysis after Step 2 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS       .9706667      7.4487         .8516360
DEBT         .9706667      6.0662         .8193329

-------------- Variables not in the Analysis after Step 2 -------------

                          Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME       .5728383     .5568120     .0175244       .6771706
FAMSIZE      .9323959     .9308959    2.2214373       .6277876
JOB          .9105435     .9105435     .2791429       .6709059

At step 3, FAMSIZE was included in the analysis.

                             Degrees of Freedom    Signif.    Between Groups
Wilks' Lambda      .62779      3    1    30.0
Equivalent F      5.53369           3    28.0       .0041


------------- Variables in the Analysis after Step 3 ----------------

Variable    Tolerance    F to Remove    Wilks' Lambda
ASSETS       .9308959      8.4282         .8167558
DEBT         .9533874      4.1849         .7216177
FAMSIZE      .9323959      2.2214         .6775944

------------- Variables not in the Analysis after Step 3 ------------

                          Minimum
Variable    Tolerance    Tolerance    F to Enter    Wilks' Lambda
INCOME       .5725772     .5410775     .0240984       .6272278
JOB          .8333526     .8333526     .0086952       .6275855

Summary Table

          Action           Vars    Wilks'
Step   Entered  Removed     in     Lambda     Sig.    Label
  1    ASSETS                1     .81933    .0153
  2    DEBT                  2     .67759    .0035
  3    FAMSIZE               3     .62779    .0041
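The "Equivalent F" values in the stepwise output can be recovered from Wilks' Lambda. A sketch for the two-group case only, assuming the standard exact conversion F = ((1 - Λ)/Λ) · (n - p - 1)/p with p and n - p - 1 degrees of freedom, where n = 32 cases and p is the number of variables in the model; the function name is illustrative.

```python
def wilks_to_f(wilks_lambda, n_obs, n_vars):
    """Convert Wilks' Lambda to an F statistic for a two-group analysis.

    Returns (F, df1, df2) with df1 = n_vars and df2 = n_obs - n_vars - 1.
    Valid for two groups only (an assumption of this sketch).
    """
    df1 = n_vars
    df2 = n_obs - n_vars - 1
    f_value = (1.0 - wilks_lambda) / wilks_lambda * df2 / df1
    return f_value, df1, df2


# Lambda values from the Summary Table above, n = 32 cases.
step_1 = wilks_to_f(0.81933, 32, 1)
step_2 = wilks_to_f(0.67759, 32, 2)
step_3 = wilks_to_f(0.62779, 32, 3)
```

The small differences from the printed F values come from Lambda being rounded to five decimals in the output.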


Classification function coefficients
(Fisher's linear discriminant functions)

REPAY =            0             1
ASSETS          .0018509      .0547891
DEBT            .0758239      .0113348
FAMSIZE        3.5833063     2.8570101
(Constant)    -7.7374079    -6.1008660

Unstandardized canonical discriminant function coefficients

                 Func 1
ASSETS         -.0352245
DEBT            .0429103
FAMSIZE         .4832695
(Constant)     -.9950070


Case    Mis        Actual      Highest Probability      2nd Highest     Discrim
Number  Val  Sel   Group       Group  P(D/G)  P(G/D)    Group  P(G/D)   Scores
   1                 1           1    .1798   .9587       0    .0413   -1.9990
   2                 1           1    .3357   .9293       0    .0707   -1.6202
   3                 1           1    .8840   .7939       0    .2061    -.8034
   4                 1    **     0    .4761   .5146       1    .4854     .1328
   5                 1           1    .3368   .9291       0    .0709   -1.6181
   6                 1           1    .5571   .5614       0    .4386    -.0704
   7                 1    **     0    .6272   .5986       1    .4014     .3598
   8                 1           1    .7236   .6452       0    .3548    -.3039
...........................................................................
  20                 0           0    .1122   .9712       1    .0288    2.4338
  21                 0    **     1    .7395   .6524       0    .3476    -.3250
  22                 1    **     0    .9432   .7749       1    .2251     .9166
  23                 1           1    .7819   .6711       0    .3289    -.3807
  24                 0    **     1    .5294   .5459       0    .4541    -.0286
  25                 1           1    .5673   .8796       0    .1204   -1.2296
  26                 1           1    .1964   .9557       0    .0443   -1.9494
  27                 0    **     1    .6916   .6302       0    .3698    -.2608
  28                 1    **     0    .7479   .6562       1    .3438     .5240
  29                 1    **     0    .9211   .7822       1    .2178     .9445
  30                 1           1    .4276   .9107       0    .0893   -1.4509
  31                 1           1    .8188   .8136       0    .1864    -.8866
  32                 0    **     1    .8825   .7124       0    .2876    -.5097


Classification results -

                  No. of    Predicted Group Membership
Actual Group      Cases          0           1
--------------    ------     --------    --------
Group 0             14          10           4
                               71.4%       28.6%
Group 1             18           5          13
                               27.8%       72.2%

Percent of "grouped" cases correctly classified: 71.88%
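The percentages in the classification table follow directly from the counts. A short sketch that recomputes them; the counts are copied from the table above, and the function and variable names are illustrative.

```python
# confusion[actual][predicted] = number of cases, from the table above.
confusion = {
    0: {0: 10, 1: 4},
    1: {0: 5, 1: 13},
}


def classification_rates(table):
    """Return (per-group proportion correct, overall proportion correct)."""
    per_group = {}
    total_cases = 0
    total_correct = 0
    for actual, row in table.items():
        n_group = sum(row.values())
        per_group[actual] = row[actual] / n_group
        total_cases += n_group
        total_correct += row[actual]
    return per_group, total_correct / total_cases


per_group, overall = classification_rates(confusion)
```

Here `per_group` holds 10/14 and 13/18, and `overall` is 23/32, matching the 71.4%, 72.2%, and 71.88% reported in the output.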


[Figure: All-groups stacked histogram of Canonical Discriminant Function 1. Vertical axis: frequency. Horizontal axis: discriminant score, from -2.0 to 2.0. Class assignments and group centroids are marked below the axis.]


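[Figure: data plotted on the x and y axes, with the first and second components drawn through the points.]

[Figure: total variance, and the variance remaining after extraction of the first, second, and third components.]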

6-12 Principal Components and Factor Analysis

The k original variables $X_i$ are written as linear combinations of a smaller set of m common factors and a unique component for each variable:

$X_1 = b_{11}F_1 + b_{12}F_2 + \dots + b_{1m}F_m + U_1$

$X_2 = b_{21}F_1 + b_{22}F_2 + \dots + b_{2m}F_m + U_2$

$\vdots$

$X_k = b_{k1}F_1 + b_{k2}F_2 + \dots + b_{km}F_m + U_k$

The $F_j$ are the common factors. Each $U_i$ is the unique component of variable $X_i$. The coefficients $b_{ij}$ are called the factor loadings.

Total variance in the data is decomposed into the communality (the common-factor component) and the specific part.
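The decomposition can be written out for a single variable. This is a sketch under assumptions the slides do not state explicitly: the common factors are mutually uncorrelated with unit variance, and each unique component is uncorrelated with the factors. The symbol $h_i^2$ for the communality is notation introduced here.

```latex
\operatorname{Var}(X_i)
  = \underbrace{\sum_{j=1}^{m} b_{ij}^{2}}_{\text{communality } h_i^{2}}
  + \underbrace{\operatorname{Var}(U_i)}_{\text{specific part}}
```

If the variables are also standardized so that each has variance 1, the specific part of variable i is $1 - h_i^2$.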

Factor Analysis

[Figure: Orthogonal Rotation. Axes Factor 1 and Factor 2, with Rotated Factor 1 and Rotated Factor 2.]

[Figure: Oblique Rotation. Axes Factor 1 and Factor 2, with Rotated Factor 1 and Rotated Factor 2.]

Rotation of Factors
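A minimal sketch of an orthogonal rotation for the two-factor case, using made-up loadings and an arbitrary angle. It illustrates a standard property of orthogonal rotation: the loadings on individual factors change, but each variable's sum of squared loadings (its communality) does not. Oblique rotation is not covered by this sketch.

```python
import math


def rotate_loadings(loadings, angle_degrees):
    """Rotate two-factor loadings by the given angle (orthogonal rotation)."""
    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [(b1 * c + b2 * s, -b1 * s + b2 * c) for b1, b2 in loadings]


def sum_squared_loadings(loadings):
    """Sum of squared loadings for each variable."""
    return [b1 * b1 + b2 * b2 for b1, b2 in loadings]


# Hypothetical unrotated loadings for four variables on two factors.
unrotated = [(0.70, 0.50), (0.65, 0.55), (0.60, -0.50), (0.55, -0.60)]
rotated = rotate_loadings(unrotated, 45.0)

before = sum_squared_loadings(unrotated)
after = sum_squared_loadings(rotated)
```

After the 45-degree rotation each made-up variable loads mainly on one factor, which is the usual motivation for rotating: a loading pattern that is easier to interpret.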

                            Factor Loadings
Satisfaction with:       1       2       3       4     Communality
Information
   1                   0.87    0.19    0.13    0.22      0.8583
   2                   0.88    0.14    0.15    0.13      0.8334
   3                   0.92    0.09    0.11    0.12      0.8810
   4                   0.65    0.29    0.31    0.15      0.6252
Variety
   5                   0.13    0.82    0.07    0.17      0.7231
   6                   0.17    0.59    0.45    0.14      0.5991
   7                   0.18    0.48    0.32    0.22      0.4136
   8                   0.11    0.75    0.02    0.12      0.5894
   9                   0.17    0.62    0.46    0.12      0.6393
  10                   0.20    0.62    0.47    0.06      0.6489
Closure
  11                   0.17    0.21    0.76    0.11      0.6627
  12                   0.12    0.10    0.71    0.12      0.5429
Pay
  13                   0.17    0.14    0.05    0.51      0.3111
  14                   0.10    0.11    0.15    0.66      0.4802

Factor Analysis of Satisfaction Items
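The Communality column can be checked as the sum of squared loadings across the four factors. A sketch using a few rows copied from the table; the function names are illustrative.

```python
# Loadings on factors 1-4 for selected items, copied from the table.
loadings = {
    1: [0.87, 0.19, 0.13, 0.22],
    5: [0.13, 0.82, 0.07, 0.17],
    13: [0.17, 0.14, 0.05, 0.51],
    14: [0.10, 0.11, 0.15, 0.66],
}


def communality(row):
    """Sum of squared factor loadings for one item."""
    return sum(b * b for b in row)


def dominant_factor(row):
    """1-based index of the factor with the largest absolute loading."""
    return max(range(len(row)), key=lambda j: abs(row[j])) + 1


communalities = {item: communality(row) for item, row in loadings.items()}
dominant = {item: dominant_factor(row) for item, row in loadings.items()}
```

For these rows the computed communalities agree with the table to four decimals, and each item's largest loading falls on the factor matching its heading (Information on factor 1, Variety on factor 2, Pay on factor 4).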
