analysis of variance
DESCRIPTION
Analysis of Variance. Chapter 13 BA 303. An Introduction to Analysis of Variance. A factor is a variable that the experimenter has selected for investigation. A treatment is a level of a factor. Example: - PowerPoint PPT PresentationTRANSCRIPT
1 Slide
Analysis of Variance
Chapter 13BA 303
2 Slide
An Introduction to Analysis of Variance A factor is a variable that the experimenter
has selected for investigation. A treatment is a level of a factor. Example:
A company president is interested in the number of hours worked by managers at three different plants: Gulfport, Long Beach, and Biloxi.
Factor: Plant Treatment: Gulfport, Long Beach, and
Biloxi
3 Slide
Analysis of Variance: A Conceptual Overview
Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.
Data obtained from observational or experimental studies can be used for the analysis.
We want to use the sample results to test the following hypotheses:
H0: 1=2=3=. . . = k
Ha: Not all population means are equal
4 Slide
H0: 1=2=3=. . . = k
Ha: Not all population means are equal
If H0 is rejected, we cannot conclude that all population means are different.
Rejecting H0 means that at least two population means have different values.
Analysis of Variance: A Conceptual Overview
5 Slide
For each population, the response (dependent) variable is normally distributed.
The variance of the response variable, denoted 2, is the same for all of the populations.
The observations must be independent.
Assumptions for Analysis of Variance
Analysis of Variance: A Conceptual Overview
6 Slide
Sampling Distribution of Given H0 is Truex
1x 3x2x
Sample means are close together because there is only one sampling distribution when H0 is true.
Analysis of Variance: A Conceptual Overview
7 Slide
Sampling Distribution of Given H0 is Falsex
3 1x 2x3x 1 2
Sample means come from different sampling distributions and are not as close together when H0
is false.
Analysis of Variance: A Conceptual Overview
8 Slide
Analysis of Variance
Between-Treatments Estimate of Population Variance Within-Treatments Estimate of Population Variance Comparing the Variance Estimates: The F Test ANOVA Table
9 Slide
2
1( )
MSTR 1
k
j jj
n x x
k
Between-Treatments Estimateof Population Variance 2
Denominator is thedegrees of freedom
associated with SSTR
Numerator is calledthe sum of squares
dueto treatments (SSTR)
The estimate of 2 based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.
10 Slide
The estimate of 2 based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.
Within-Treatments Estimateof Population Variance 2
Denominator is thedegrees of freedom
associated with SSE
Numerator is called
the sum of squares
due to error (SSE)
MSE
( )n s
n k
j jj
k
T
1 21MSE
( )n s
n k
j jj
k
T
1 21
11 Slide
F Test Statistic
MSEMSTRF
1 MSTR d.f k
knT MSE d.f
12 Slide
Comparing the Variance Estimates: The F Test
If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to k - 1 and MSE d.f. equal to nT - k. If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates 2. Hence, we will reject H0 if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.
13 Slide
F Distribution
14 Slide
Sampling Distribution of MSTR/MSE
Do Not Reject H0
Reject H0
MSTR/MSE
Critical ValueF
Sampling Distributionof MSTR/MSE
Comparing the Variance Estimates: The F Test
15 Slide
MSTR SSTR-
k 1MSTR SSTR-
k 1MSE SSE
-n kT
MSE SSE-
n kT
MSTRMSE
MSTRMSE
Source ofVariation
Sum ofSquares
Degrees ofFreedom
MeanSquare F
Treatments
Error
Total
k - 1
nT - 1
SSTR
SSE
SST
nT - k
SST is partitionedinto SSTR and
SSE.
SST’s degrees of freedom
(d.f.) are partitioned into
SSTR’s d.f. and SSE’s d.f.
ANOVA Tablefor a Completely Randomized Design
p-Value
16 Slide
SST divided by its degrees of freedom nT – 1 is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:
2
1 1SST ( ) SSTR SSE
jnk
ijj i
x x
ANOVA Tablefor a Completely Randomized Design
17 Slide
ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates and the F value used to test the hypothesis of equal population means.
ANOVA Tablefor a Completely Randomized Design
18 Slide
Test for the Equality of k Population Means
F = MSTR/MSE
H0: 1=2=3=. . . = k
Ha: Not all population means are equal
Hypotheses
Test Statistic
19 Slide
Test for the Equality of k Population Means
Rejection Rule
where the value of F is based on anF distribution with k - 1 numerator d.f.and nT - k denominator d.f.
Reject H0 if p-value < p-value Approach:
Critical Value Approach: Reject H0 if F > F
20 Slide
Reed Manufacturing Janet Reed would like to know if there is anysignificant difference in the mean number of hoursworked per week for the department managers at herthree manufacturing plants (in Buffalo, Pittsburgh,and Detroit). An F test will be conducted using = .05.
Testing for the Equality of k Population Means
21 Slide
Reed Manufacturing A simple random sample of five managers fromeach of the three plants was taken and the number ofhours worked by each manager in the previous weekis shown on the next slide.
Testing for the Equality of k Population Means
Factor . . . Manufacturing plantTreatments . . . Buffalo, Pittsburgh, DetroitExperimental units . . . ManagersResponse variable . . . Number of hours worked
22 Slide
12345
4854575462
7363666474
5163615456
Plant 1Buffalo
Plant 2Pittsburgh
Plant 3DetroitObservation
Sample MeanSample Variance
55 68 5726.0 26.5 24.5
Testing for the Equality of k Population Means
23 Slide
H0: 1=2=3
Ha: Not all the means are equalwhere: 1 = mean number of hours worked per
week by the managers at Plant 1 2 = mean number of hours worked per week by the managers at Plant 23 = mean number of hours worked per week by the managers at Plant 3
1. Develop the hypotheses.
p -Value and Critical Value Approaches
Testing for the Equality of k Population Means
24 Slide
2. Specify the level of significance. = .05
p -Value and Critical Value Approaches
3. Compute the value of the test statistic.
MSTR = 490/(3 - 1) = 245SSTR = 5(55 - 60)2 + 5(68 - 60)2 + 5(57 - 60)2 = 490
(Sample sizes are all equal.)Mean Square Due to Treatments
Testing for the Equality of k Population Means
= (55 + 68 + 57)/3 = 60x
25 Slide
3. Compute the value of the test statistic.
MSE = 308/(15 - 3) = 25.667SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308Mean Square Due to Error
(con’t.)
F = MSTR/MSE = 245/25.667 = 9.55
p -Value and Critical Value Approaches
Testing for the Equality of k Population Means
26 Slide
TreatmentErrorTotal
490308798
21214
24525.667
Source ofVariation
Sum ofSquares
Degrees ofFreedom
MeanSquare
9.55F
ANOVA Table
Testing for the Equality of k Population Means
p-Value.0033
27 Slide
5. Determine whether to reject H0.
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plant.
The p-value < .05, so we reject H0.
With 2 numerator d.f. and 12 denominator d.f.,the p-value is .01 for F = 6.93. Therefore, thep-value is less than .01 for F = 9.55.
p –Value Approach4. Compute the p –value.
Testing for the Equality of k Population Means
28 Slide
5. Determine whether to reject H0.Because F = 9.55 > 3.89, we reject H0.
Critical Value Approach4. Determine the critical value and rejection rule.
Reject H0 if F > 3.89
We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plant.
Based on an F distribution with 2 numeratord.f. and 12 denominator d.f., F.05 = 3.89.
Testing for the Equality of k Population Means
29 Slide
ANOVA PRACTICE
30 Slide
Perform an ANOVA
Using the data in the table on the next slide, compute the mean and standard deviation for the Direct treatment.Compute the sum of the squares due to treatments (SSTR) and the mean square due to treatments (MSTR).Compute the sum of the squares due to the error (SSE) and the mean square due to the error (MSE).Compute the F test statistic.Using =0.05, what do you conclude about the null hypothesis that the means of the treatments are equal?Determine the approximate p-value.Present the results in an ANOVA table.
31 Slide
Direct Indirect Combination
1 17.0 16.6 14.44 25.2 0.042 18.5 22.2 3.24 24.0 1.003 15.8 20.5 0.01 21.5 12.254 18.2 18.3 4.41 26.8 3.245 20.2 24.2 14.44 27.5 6.256 16.0 19.8 0.36 25.8 0.647 13.3 21.2 0.64 24.2 0.64
20.4 25.06.26 4.01
x
2)( xxi 2)( xxi ixixix
2s
2)( xxi