statr session 19 and 20


Upload: ruruchowdhury

Post on 14-Apr-2017



TRANSCRIPT

Page 1: Statr session 19 and 20

Learning Objectives
• Understand the differences between various experimental designs and when to use them.
• Compute and interpret the results of a one-way ANOVA.
• Compute and interpret the results of a randomized block design.
• Compute and interpret the results of a two-way ANOVA.
• Understand and interpret interactions between variables.
• Know when and how to use multiple comparison techniques.

Page 2: Statr session 19 and 20

Introduction to Design of Experiments

• Experimental Design – a plan and a structure to test hypotheses in which the researcher controls or manipulates one or more variables.

Page 3: Statr session 19 and 20

Introduction to Design of Experiments

Independent Variable
• Treatment variable - one that the experimenter controls or modifies in the experiment.
• Classification variable - a characteristic of the experimental subjects that was present prior to the experiment and is not a result of the experimenter’s manipulations or control.
• Levels or Classifications - the subcategories of the independent variable used by the researcher in the experimental design.
• Independent variables are also referred to as factors.

Page 4: Statr session 19 and 20

Independent Variable

• Manipulation of the independent variable depends on the concept being studied

• Researcher studies the phenomenon under conditions of varying aspects of the variable

Page 5: Statr session 19 and 20

Introduction to Design of Experiments

• Dependent Variable - the response to the different levels of the independent variable.
• Analysis of Variance (ANOVA) – a group of statistical techniques used to analyze experimental designs.
- ANOVA begins with the notion that the individual items being studied are all the same.

Page 6: Statr session 19 and 20

Three Types of Experimental Designs

• Completely Randomized Design – subjects are assigned randomly to treatments; single independent variable.

• Randomized Block Design – includes a blocking variable; single independent variable.

• Factorial Experiments – two or more independent variables are explored at the same time; every level of each factor is studied under every level of all other factors.

Page 7: Statr session 19 and 20

Completely Randomized Design

• The completely randomized design contains only one independent variable with two or more treatment levels.

• If only two treatment levels of the independent variable are present, the design is the same as the one used to test the difference in means of two independent populations, which uses the t test to analyze the data.

Page 8: Statr session 19 and 20

Completely Randomized Design

[Figure: layout of a completely randomized design. Columns: Machine Operator 1 to 4 (independent variable). Cell entries: valve opening measurements (dependent variable).]

Page 9: Statr session 19 and 20

Completely Randomized Design

• A technique has been developed that analyzes all the sample means at one time and so avoids the buildup of the Type I error rate that comes from running many separate pairwise t tests: ANOVA.

• A completely randomized design is analyzed by one-way analysis of variance (one-way ANOVA).
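
The error buildup mentioned above can be illustrated with a rough calculation. This is a minimal sketch, assuming the pairwise t tests are independent (a simplification) and using a hypothetical helper name `familywise_error_rate`; it shows how quickly the chance of at least one false rejection grows when four groups are compared pair by pair at alpha = 0.05.

```python
from math import comb


def familywise_error_rate(num_groups: int, alpha: float) -> tuple[int, float]:
    """Return (number of pairwise tests, chance of at least one Type I error).

    Assumption: the pairwise tests are treated as independent, which is
    only an approximation for tests that share the same samples.
    """
    num_tests = comb(num_groups, 2)          # every pair of groups is tested once
    rate = 1 - (1 - alpha) ** num_tests      # P(at least one false rejection)
    return num_tests, rate


tests, rate = familywise_error_rate(num_groups=4, alpha=0.05)
print(tests, round(rate, 4))
```

With four groups there are six pairwise tests, so the overall error rate under this assumption is far above the nominal 0.05 of any single test, which is the motivation for a single overall F test.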

Page 10: Statr session 19 and 20

One-Way ANOVA: Procedural Overview

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$

$H_a$: at least one of the means is different from the others

$F = \dfrac{MSC}{MSE}$

If $F > F_c$, reject $H_0$. If $F \le F_c$, do not reject $H_0$.

Page 11: Statr session 19 and 20

Analysis of Variance

• The null hypothesis states that the population means for all treatment levels are equal.

• If even one of the population means is different from the others, the null hypothesis is rejected.

• Testing the hypothesis is done by partitioning the total variance of the data into the following two variances:
- Variance resulting from the treatment (columns)
- Error variance, or that portion of the total variance unexplained by the treatment

Page 12: Statr session 19 and 20

One-Way ANOVA: Sums of Squares Definitions

Page 13: Statr session 19 and 20

Analysis of Variance

• The total sum of squares of variation is partitioned into the sum of squares of treatment columns and the sum of squares of error.

• ANOVA compares the relative sizes of the treatment variation and the error variation.

• The error variation is unaccounted-for variation and can be viewed at this point as variation due to individual differences within the groups.

• If a significant difference in treatment is present, the treatment variation should be large relative to the error variation.

Page 14: Statr session 19 and 20

One-Way ANOVA: Computational Formulas

• ANOVA is used to determine statistically whether the variance between the treatment level means is greater than the variances within levels (error variance)

• Assumptions underlying ANOVA:
- Normally distributed populations
- Observations represent random samples from the population
- Variances of the populations are equal

Page 15: Statr session 19 and 20

One-Way ANOVA: Computational Formulas

ANOVA is computed with the three sums of squares:
• Total – Total Sum of Squares (SST); a measure of all variations in the dependent variable
• Treatment – Sum of Squares Columns (SSC); measures the variations between treatments or columns, since independent variable levels are present in columns
• Error – Sum of Squares of Error (SSE); yields the variations within treatments (or columns)
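
In symbols, the three sums of squares described above are usually written as follows. This is the standard textbook form, given here as a sketch; the notation is an assumption: $X_{ij}$ is observation $i$ in treatment column $j$, $\bar{X}_j$ is a column mean, $\bar{X}$ is the grand mean, $n_j$ is the sample size of column $j$, and $C$ is the number of columns.

```latex
\begin{align*}
SST &= \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 \\
SSC &= \sum_{j=1}^{C} n_j (\bar{X}_j - \bar{X})^2 \\
SSE &= \sum_{j=1}^{C} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \\
SST &= SSC + SSE
\end{align*}
```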

Page 16: Statr session 19 and 20

One-Way ANOVA: Preliminary Calculations

        1           2           3           4           Total
        6.33        6.26        6.44        6.29
        6.26        6.36        6.38        6.23
        6.31        6.23        6.58        6.19
        6.29        6.27        6.54        6.21
        6.40        6.19        6.56
                    6.50        6.34
                    6.19        6.58
                    6.22
Tj      T1 = 31.59  T2 = 50.22  T3 = 45.42  T4 = 24.92  T = 152.15
nj      n1 = 5      n2 = 8      n3 = 7      n4 = 4      N = 24
Mean    6.318000    6.277500    6.488571    6.230000    6.339583

Page 17: Statr session 19 and 20

One-Way ANOVA: Sum of Squares Calculations

Page 18: Statr session 19 and 20

One-Way ANOVA: Sum of Squares Calculations

Page 19: Statr session 19 and 20

One-Way ANOVA: Computational Formulas

• Other items
- MSC – Mean Squares Columns
- MSE – Mean Squares Error
- MST – Mean Squares Total
• F value – determined by dividing the treatment variance (MSC) by the error variance (MSE)
- F value is a ratio of the treatment variance to the error variance

Page 20: Statr session 19 and 20

One-Way ANOVA: Mean Square and F Calculations

Page 21: Statr session 19 and 20

Analysis of Variance for Valve Openings

Source of Variance   df   SS        MS         F
Between               3   0.23658   0.078860   10.18
Error                20   0.15492   0.007746
Total                23   0.39150

Page 22: Statr session 19 and 20

F Table

• The F distribution table is in Table A7.
• Associated with every F value in the table are two unique df values: degrees of freedom in the numerator and degrees of freedom in the denominator.
• Statistical computer software packages for computing ANOVA usually give a probability for the F value, which allows hypothesis testing decisions for any value of alpha.

Page 23: Statr session 19 and 20

A portion of F Table

Columns: df1 (numerator degrees of freedom). Rows: df2 (denominator degrees of freedom).

df2        1        2        3        4        5        6        7        8        9
 1    161.45   199.50   215.71   224.58   230.16   233.99   236.77   238.88   240.54
 …         …        …        …        …        …        …        …        …        …
18      4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46
19      4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42
20      4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39
21      4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37

$F_{.05,3,20} = 3.10$

Page 24: Statr session 19 and 20

One-Way ANOVA: Procedural Summary

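Since $F = 10.18 > F_c = 3.10$, reject $H_0$.

Critical value: $F_{.05,3,20} = 3.10$, with $\nu_1 = 3$ and $\nu_2 = 20$.

[Figure: F distribution with the nonrejection region to the left of the critical value and the rejection region ($\alpha = .05$) to the right.]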
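
The one-way ANOVA for the valve-opening data can be reproduced with a few lines of plain Python. This is a minimal sketch rather than the procedure used to build the slides; the function name `one_way_anova` and the dictionary layout are illustrative choices, and the critical value 3.10 is simply read from the F table portion shown in the slides.

```python
# Valve-opening measurements per machine operator, copied from the data slide.
valve_openings = {
    1: [6.33, 6.26, 6.31, 6.29, 6.40],
    2: [6.26, 6.36, 6.23, 6.27, 6.19, 6.50, 6.19, 6.22],
    3: [6.44, 6.38, 6.58, 6.54, 6.56, 6.34, 6.58],
    4: [6.29, 6.23, 6.19, 6.21],
}

# Critical value F(.05; 3, 20), taken from the slide's F table (not computed).
F_CRITICAL = 3.10


def one_way_anova(groups):
    """Return sums of squares, mean squares and F for a dict of samples."""
    all_values = [x for sample in groups.values() for x in sample]
    total_n = len(all_values)
    num_levels = len(groups)
    grand_mean = sum(all_values) / total_n

    ssc = 0.0  # between-treatment (column) variation
    sse = 0.0  # within-treatment (error) variation
    for sample in groups.values():
        level_mean = sum(sample) / len(sample)
        ssc += len(sample) * (level_mean - grand_mean) ** 2
        sse += sum((x - level_mean) ** 2 for x in sample)

    msc = ssc / (num_levels - 1)        # df = C - 1
    mse = sse / (total_n - num_levels)  # df = N - C
    # SST is obtained from the partition SST = SSC + SSE.
    return {"SSC": ssc, "SSE": sse, "SST": ssc + sse,
            "MSC": msc, "MSE": mse, "F": msc / mse}


anova = one_way_anova(valve_openings)
reject_h0 = anova["F"] > F_CRITICAL
print({name: round(value, 5) for name, value in anova.items()}, reject_h0)
```

The computed sums of squares and F value agree with the ANOVA table for valve openings to the precision shown there, and the decision is to reject the null hypothesis.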

Page 25: Statr session 19 and 20

Multiple Comparison Tests

• ANOVA techniques are useful in testing hypotheses about differences of means in multiple groups.

• Advantage: Probability of committing a Type I error is controlled.

• Multiple Comparison techniques are used to identify which pairs of means are significantly different given that the ANOVA test reveals overall significance.

Page 26: Statr session 19 and 20

Multiple Comparison Tests

• Multiple comparisons are used when an overall significant difference between groups has been determined using the F value of the analysis of variance

• Tukey’s honestly significant difference (HSD) test requires equal sample sizes.
- Takes into consideration the number of treatment levels, the value of mean square error, and the sample size

Page 27: Statr session 19 and 20

Multiple Comparison Tests

• Tukey’s Honestly Significant Difference (HSD) test – also known as Tukey’s T method – examines the absolute value of all differences between pairs of means from treatment levels to determine if there is a significant difference.

• Tukey-Kramer Procedure is used when sample sizes are unequal.

Page 28: Statr session 19 and 20

Tukey’s Honestly Significant Difference (HSD) Test

If the absolute difference between a pair of means is greater than HSD, then the means of the two treatment levels are significantly different.
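
A sketch of the usual textbook form of the HSD threshold, assuming the slide follows the standard definition. Here $q_{\alpha,C,N-C}$ is the critical value of the studentized range distribution, $C$ is the number of treatment levels, $N - C$ is the error degrees of freedom, $MSE$ is the mean square error, and $n$ is the common sample size.

```latex
HSD = q_{\alpha,\,C,\,N-C} \sqrt{\frac{MSE}{n}}
```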

Page 29: Statr session 19 and 20

Demonstration Example Problem

A company has three manufacturing plants, and company officials want to determine whether there is a difference in the average age of workers at the three locations. The following data are the ages of five randomly selected workers at each plant. Perform a one-way ANOVA to determine whether there is a significant difference in the mean ages of the workers at the three plants. Use α = 0.01 and note that the sample sizes are equal.

Page 30: Statr session 19 and 20

Data from Demonstration Example

PLANT (Employee Age)
               1       2       3
              29      32      25
              27      33      24
              30      31      24
              27      34      25
              28      30      26
Group Means   28.2    32.0    24.8
nj             5       5       5

C = 3
dfE = N - C = 12
MSE = 1.63

Page 31: Statr session 19 and 20

Tukey’s HSD test

• Since sample sizes are equal, Tukey’s HSD test can be used to compute multiple comparison tests between groups.

• To compute the HSD, the values of MSE, n, and q must be determined.

Page 32: Statr session 19 and 20

q Values for α = 0.01

Degrees of      Number of Populations
Freedom         2       3       4       5
 1             90     135     164     186
 2             14      19    22.3    24.7
 3           8.26    10.6    12.2    13.3
 4           6.51    8.12    9.17    9.96
 .              .       .       .       .
11           4.39    5.14    5.62    5.97
12           4.32    5.04    5.50    5.84

$q_{.01,3,12} = 5.04$

Page 33: Statr session 19 and 20

Tukey’s HSD Test for the Employee Age Data

All three comparisons are greater than 2.88. Thus the mean ages between any and all pairs of plants are significantly different.
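
The threshold of 2.88 and the three comparisons can be checked with a short calculation. This is a minimal sketch under stated assumptions: it uses the standard form HSD = q · sqrt(MSE / n), takes q = 5.04 from the q table for α = 0.01 with 3 populations and 12 error degrees of freedom, and recomputes MSE from the employee age data; the variable names are illustrative.

```python
from itertools import combinations
from math import sqrt

# Ages of five randomly selected workers at each plant (from the data slide).
ages = {
    1: [29, 27, 30, 27, 28],
    2: [32, 33, 31, 34, 30],
    3: [25, 24, 24, 25, 26],
}

# q(.01; C = 3, N - C = 12), read from the slide's q table (not computed).
Q_CRITICAL = 5.04

n = 5                                   # common sample size per plant
num_levels = len(ages)                  # C
total_n = sum(len(v) for v in ages.values())

means = {plant: sum(v) / len(v) for plant, v in ages.items()}

# Mean square error: within-plant variation divided by N - C.
sse = sum((x - means[plant]) ** 2 for plant, v in ages.items() for x in v)
mse = sse / (total_n - num_levels)

hsd = Q_CRITICAL * sqrt(mse / n)

# Absolute difference for every pair of plants, compared against HSD.
differences = {
    (a, b): abs(means[a] - means[b])
    for a, b in combinations(sorted(means), 2)
}
significant = [pair for pair, diff in differences.items() if diff > hsd]

print(round(mse, 2), round(hsd, 2), significant)
```

The absolute differences are about 3.8, 3.4, and 7.2, so every pair exceeds the threshold, in line with the conclusion stated on the slide.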

Page 34: Statr session 19 and 20

Tukey-Kramer Procedure: The Case of Unequal Sample Sizes
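
A sketch of the usual textbook form of the Tukey-Kramer critical difference, assuming the slide follows the standard definition. Here $n_r$ and $n_s$ are the sample sizes of the two treatment levels being compared, and $q_{\alpha,C,N-C}$, $C$, $N$, and $MSE$ are as in Tukey’s HSD test. A pair of means is declared significantly different when the absolute difference between them exceeds this value.

```latex
\text{critical difference} = q_{\alpha,\,C,\,N-C} \sqrt{\frac{MSE}{2}\left(\frac{1}{n_r} + \frac{1}{n_s}\right)}
```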

Page 35: Statr session 19 and 20

Example: Mean Valve openings produced by four operators

A valve manufacturer wants to test whether there are any differences in the mean valve openings produced by four different machine operators. The data follow.

Page 36: Statr session 19 and 20

Example: Mean Valve openings produced by four operators

Operator   Sample Size   Mean
1          5             6.3180
2          8             6.2775
3          7             6.4886
4          4             6.2300

Page 37: Statr session 19 and 20

Example: Tukey-Kramer Results for the Four Operators

Pair       Critical Difference   |Actual Difference|
1 and 2    .1405                 .0405
1 and 3    .1443                 .1706*
1 and 4    .1653                 .0880
2 and 3    .1275                 .2111*
2 and 4    .1509                 .0475
3 and 4    .1545                 .2586*

*denotes significant at α = .05
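
The critical differences in the table can be reproduced with the standard Tukey-Kramer form, q · sqrt((MSE / 2)(1/n_r + 1/n_s)). This is a minimal sketch under stated assumptions: MSE = 0.007746 comes from the valve-opening ANOVA table, while q = 3.96 for α = .05 with 4 populations and 20 error degrees of freedom is taken from a standard studentized range table and does not appear in the slides.

```python
from itertools import combinations
from math import sqrt

# Sample sizes and means for the four operators (from the summary slide).
sample_sizes = {1: 5, 2: 8, 3: 7, 4: 4}
means = {1: 6.3180, 2: 6.2775, 3: 6.4886, 4: 6.2300}

MSE = 0.007746     # mean square error from the valve-opening ANOVA table
Q_CRITICAL = 3.96  # assumed q(.05; 4, 20) from a standard table, not the slides


def critical_difference(n_r: int, n_s: int) -> float:
    """Tukey-Kramer critical difference for two groups of sizes n_r and n_s."""
    return Q_CRITICAL * sqrt(MSE / 2 * (1 / n_r + 1 / n_s))


results = {}
for r, s in combinations(sorted(means), 2):
    crit = critical_difference(sample_sizes[r], sample_sizes[s])
    actual = abs(means[r] - means[s])
    results[(r, s)] = (crit, actual, actual > crit)

significant_pairs = [pair for pair, (_, _, sig) in results.items() if sig]

for pair, (crit, actual, sig) in results.items():
    print(pair, round(crit, 4), round(actual, 4), "*" if sig else "")
```

Under these assumptions the pairs flagged as significant are operators 1 and 3, 2 and 3, and 3 and 4, matching the starred rows of the table.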

Page 38: Statr session 19 and 20

Randomized Block Design

• Randomized block design - focuses on one independent variable (treatment variable) of interest.

• Includes a second variable (blocking variable) used to control for confounding or concomitant variables.

• A Blocking Variable can have an effect on the outcome of the treatment being studied

• A blocking variable is a variable that the researcher wants to control but that is not a treatment variable of interest.

Page 39: Statr session 19 and 20

Randomized Block Design

[Figure: layout of a randomized block design. Columns: levels of the single independent variable. Rows: levels of the blocking variable. Cell entries: individual observations.]

Page 40: Statr session 19 and 20

Examples: Blocking Variable

• In the study of growth patterns of varieties of seeds for a given type of plant, different plots of ground work as blocks.

• Machine number, worker, shift, day of the week, etc.
• Gender, age, intelligence, economic level of subjects
• Brand, supplier, vehicle, etc.

Page 41: Statr session 19 and 20

Randomized Block Design

• Repeated measures design - is a design in which each block level is an individual item or person, and that person or item is measured across all treatments

• A special case of Randomized Block Design

Page 42: Statr session 19 and 20

Randomized Block Design

• The sum of squares in a completely randomized design is
SST = SSC + SSE
• In a randomized block design, the sum of squares is
SST = SSC + SSR + SSE
• SSR (blocking effects) comes out of the SSE
- Some of the error variation in a completely randomized design is due to blocking effects, which the randomized block design separates out

Page 43: Statr session 19 and 20

Randomized Block Design Treatment Effects: Procedural Overview

• The observed F value for treatments computed using the randomized block design formula is tested by comparing it to a table F value.

• If the observed F value is greater than the table value, the null hypothesis is rejected for that alpha value.

• If the F value for blocks is greater than the critical F value, the null hypothesis that all block population means are equal is rejected.

Page 44: Statr session 19 and 20

Randomized Block Design Treatment Effects: Procedural Overview

Page 45: Statr session 19 and 20

Randomized Block Design: Computational Formulas

$SSC = n \sum_{j=1}^{C} (\bar{X}_j - \bar{X})^2$, with $df_C = C - 1$

$SSR = C \sum_{i=1}^{n} (\bar{X}_i - \bar{X})^2$, with $df_R = n - 1$

$SSE = \sum_{i=1}^{n} \sum_{j=1}^{C} (X_{ij} - \bar{X}_j - \bar{X}_i + \bar{X})^2$, with $df_E = (C - 1)(n - 1) = N - n - C + 1$

$SST = \sum_{i=1}^{n} \sum_{j=1}^{C} (X_{ij} - \bar{X})^2$, with $df_T = N - 1$

$MSC = \dfrac{SSC}{C - 1}$

$MSR = \dfrac{SSR}{n - 1}$

$MSE = \dfrac{SSE}{N - n - C + 1}$

$F_{treatments} = \dfrac{MSC}{MSE}$

$F_{blocks} = \dfrac{MSR}{MSE}$

where:
i = block group (row)
j = a treatment level (column)
C = number of treatment levels (columns)
n = number of observations in each treatment level (number of blocks - rows)
$X_{ij}$ = individual observation
$\bar{X}_j$ = treatment (column) mean
$\bar{X}_i$ = block (row) mean
$\bar{X}$ = grand mean
N = total number of observations
SSC = sum of squares columns (treatment)
SSR = sum of squares rows (blocking)
SSE = sum of squares error
SST = sum of squares total

Page 46: Statr session 19 and 20

Randomized Block Design: Tread-Wear Example

As an example of the application of the randomized block design, consider a tire company that developed a new tire. The company conducted tread-wear tests on the tire to determine whether there is a significant difference in tread wear if the average speed with which the automobile is driven varies. The company set up an experiment in which the independent variable was speed of automobile. There were three treatment levels.

Page 47: Statr session 19 and 20

Randomized Block Design: Tread-Wear Example

                         Speed
Supplier                 Slow    Medium   Fast    Block Means ($\bar{X}_i$)
1                        3.7     4.5      3.1     3.77
2                        3.4     3.9      2.8     3.37
3                        3.5     4.1      3.0     3.53
4                        3.2     3.5      2.6     3.10
5                        3.9     4.8      3.4     4.03
Treatment Means ($\bar{X}_j$)   3.54    4.16     2.98    3.56 (grand mean $\bar{X}$)

n = 5
C = 3
N = 15

Page 48: Statr session 19 and 20

Randomized Block Design: Sum of Squares Calculations (Part 1)

Page 49: Statr session 19 and 20

Randomized Block Design: Sum of Squares Calculations (Part 2)

Page 50: Statr session 19 and 20

Randomized Block Design: Mean Square Calculations

Page 51: Statr session 19 and 20

Analysis of Variance for the Tread-Wear Example

Source of Variance   SS      df   MS         F
Treatment            3.484    2   1.742      97.45
Block                1.541    4   0.38525    21.72
Error                0.143    8   0.017875
Total                5.176   14
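
The randomized block computations for the tread-wear data can be sketched in plain Python using the standard formulas for SSC, SSR, SSE, and SST. This is a minimal sketch with illustrative variable names, working from unrounded means. For that reason a few results differ slightly from the table above: it gives SSR of about 1.549 and a treatment F of about 97.7, where the table shows 1.541 and 97.45, values that appear to come from block means rounded to two decimals.

```python
# Tread wear by supplier (blocks, rows) and speed (treatments, columns:
# slow, medium, fast), copied from the tread-wear data slide.
tread_wear = [
    [3.7, 4.5, 3.1],
    [3.4, 3.9, 2.8],
    [3.5, 4.1, 3.0],
    [3.2, 3.5, 2.6],
    [3.9, 4.8, 3.4],
]

n = len(tread_wear)        # number of blocks (rows)
C = len(tread_wear[0])     # number of treatment levels (columns)
N = n * C                  # total number of observations

grand_mean = sum(sum(row) for row in tread_wear) / N
col_means = [sum(row[j] for row in tread_wear) / n for j in range(C)]
row_means = [sum(row) / C for row in tread_wear]

# Sums of squares for treatments, blocks, error and total.
ssc = n * sum((m - grand_mean) ** 2 for m in col_means)
ssr = C * sum((m - grand_mean) ** 2 for m in row_means)
sse = sum(
    (tread_wear[i][j] - col_means[j] - row_means[i] + grand_mean) ** 2
    for i in range(n)
    for j in range(C)
)
sst = sum((x - grand_mean) ** 2 for row in tread_wear for x in row)

# Mean squares and the two F ratios.
msc = ssc / (C - 1)
msr = ssr / (n - 1)
mse = sse / ((C - 1) * (n - 1))
f_treatments = msc / mse
f_blocks = msr / mse

print(round(ssc, 3), round(ssr, 3), round(sse, 3), round(sst, 3))
print(round(f_treatments, 2), round(f_blocks, 2))
```

Either way, the treatment F value is far larger than what the same data would give without the blocking term, which is the point of the design.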

Page 52: Statr session 19 and 20

Randomized Block Design Treatment Effects: Procedural Summary

Page 53: Statr session 19 and 20

Randomized Block Design Treatment Effects: Procedural Overview

Page 54: Statr session 19 and 20

Randomized Block Design: Tread-Wear Example

• Because the observed value of F for treatment (97.45) is greater than the critical F value, the null hypothesis is rejected.
- At least one of the population means of the treatment levels is not the same as the others.
- There is a significant difference in tread wear for cars driven at different speeds.
• The F value for treatment with the blocking was 97.45 and without the blocking was 12.44.
- By using the randomized block design, a much larger observed F value was obtained.

Page 55: Statr session 19 and 20

Factorial Design (Two-Way ANOVA)

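[Figure: layout of a two-factor factorial design. Columns: levels of the column treatment. Rows: levels of the row treatment. Each row-column intersection is a cell of observations.]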

Page 56: Statr session 19 and 20

Two-Way ANOVA: Hypotheses

Page 57: Statr session 19 and 20

Formulas for Computing a Two-Way ANOVA