Stat 470-5 Today: General Linear Model Assignment 1:

Post on 20-Dec-2015


Stat 470-5

• Today: General Linear Model

• Assignment 1:

General Linear Model

• ANOVA model can be viewed as a special case of the general linear model or regression model

• Suppose we have a response, y, which is thought to be related to p predictors (sometimes called explanatory variables or regressors)

• Predictors: x1, x2,…,xp

• Model:
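
In its usual form (a sketch in standard notation; the symbols $\beta_j$, $\varepsilon$ and $\sigma^2$ are assumed here rather than taken from the slide), the general linear model with p predictors reads:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^2)
```

Taking the predictors to be indicator variables for treatment levels gives the ANOVA model as a special case.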

Example: Rainfall (Exercise 2.16)

• In winter, a plastic rain gauge cannot be used to collect precipitation because it will freeze and crack. Instead, metal cans are used to collect snowfall and the snow is allowed to melt indoors. The water is then poured into a plastic rain gauge and a measurement recorded. An estimate of snowfall is obtained by multiplying this measurement by 0.44.

• One observer questions this and decides to collect data to test the validity of this approach

• For each rainfall over one summer, she measures the amount collected (i) in a plastic rain gauge and (ii) in a metal can

• What is the current model being used?

Example: Rainfall (Exercise 2.16)

[Figure: Scatter Plot of Rainfall Data. Horizontal axis: Rain Collected in Metal Can (x), 0 to 7. Vertical axis: Rain Collected in Plastic Gauge, 0.0 to 4.0.]

Example: Rainfall (Exercise 2.16)

• Seems to be a linear relationship

• Will use regression to establish linear relationship between x and y

• What should the slope be?
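
A minimal sketch of how a simple linear regression of y on x could be fitted by least squares. The function name `fit_line` and the five data points are made up for illustration (they are not the Exercise 2.16 measurements); the points are placed on the line y = 0.04 + 0.44x so that the fit is easy to check by hand.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the model y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Corrected sum of squares of x and corrected cross-product of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical can readings (x) and gauge readings (y), NOT the real data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.04 + 0.44 * xi for xi in x]

b0, b1 = fit_line(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

Because the made-up points lie on a straight line, the fit should recover an intercept of about 0.04 and a slope of about 0.44.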

Example: Rainfall (Exercise 2.16)

Coefficients (Dependent Variable: Y)

Model 1      Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)   3.579E-02          .012                             2.931    .005
X            .444               .006         .995                76.264   .000

ANOVA (Predictors: (Constant), X; Dependent Variable: Y)

Model 1      Sum of Squares   df   Mean Square   F          Sig.
Regression   25.860           1    25.860        5816.213   .000
Residual     .245             55   .004
Total        26.105           56

Model Summary (Predictors: (Constant), X; Dependent Variable: Y)

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .995   .991       .990                .06668
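
If the current practice corresponds to the model y = 0.44x (zero intercept, slope 0.44), the output can be used to check both parts of that claim. The sketch below uses only the rounded values printed in the output, so its t-values differ slightly from the printed ones. The critical value 2.004 for a two-sided 5% t-test on 55 degrees of freedom is an approximate table value, and the variable names are illustrative.

```python
# Rounded values copied from the regression output.
b0, se_b0 = 0.03579, 0.012   # intercept and its standard error
b1, se_b1 = 0.444, 0.006     # slope and its standard error
ss_reg, ss_total = 25.860, 26.105
t_crit = 2.004               # approx. two-sided 5% point of t with 55 df (table value)

# H0: slope = 0.44 (the multiplier used in practice).
t_slope = (b1 - 0.44) / se_b1

# H0: intercept = 0 (no systematic offset between can and gauge).
t_intercept = b0 / se_b0

# Proportion of variation in y explained by x.
r_squared = ss_reg / ss_total

print(f"t for slope = 0.44:  {t_slope:.2f}")
print(f"t for intercept = 0: {t_intercept:.2f}")
print(f"R squared:           {r_squared:.3f}")
print("slope differs from 0.44:", abs(t_slope) > t_crit)
print("intercept differs from 0:", abs(t_intercept) > t_crit)
```

With these rounded inputs the slope is consistent with 0.44, while the intercept appears to differ from zero, in line with the Sig. value of .005 reported for the constant.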

Example: Rainfall (Exercise 2.16)

[Figure: Normal Q-Q Plot of Residuals. Horizontal axis: Observed Value, -.2 to .4. Vertical axis: Expected Normal Value, -.2 to .2.]

Example: Rainfall (Exercise 2.16)

[Figure: Plot of Residuals vs X. Horizontal axis: X, -1 to 7. Vertical axis: Residuals, -.1 to .4.]

Example: Rainfall (Exercise 2.16)

[Figure: Residuals vs Predicted. Horizontal axis: Predicted Value, 0.0 to 3.5. Vertical axis: Residuals, -.1 to .4.]

Comments

• General linear model may have many predictors

• Is suitable for many situations

• Easily done in all stats packages

Designs So Far…

• Have considered 1-factor designs:

– Paired comparisons (paired t-test)

– Completely randomized design (ANOVA)

• Frequently have more than one factor

• We will learn to design and analyze such experiments

Example: Penicillin Experiment

• Objective: Compare four processes for making penicillin

• The raw material used in the process is thought to vary substantially from batch to batch

• Experiment Design:

– Use five separately produced batches of raw material

– Divide each batch into four sub-batches

– Randomly assign each process to one sub-batch.

– Randomize the production order within each batch

– Measure the yield (%)
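
The randomization in this design can be sketched as follows: one random permutation of the four processes is drawn independently for each batch and read as the order in which that batch's sub-batches are processed. The function name `randomize_rcb`, the labels and the fixed seed are illustrative assumptions; the seed is there only to make the example reproducible.

```python
import random


def randomize_rcb(treatments, blocks, seed=None):
    """Return {block: treatments in random run order} for a randomized block layout."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)  # every treatment appears exactly once per block
        rng.shuffle(order)        # independent shuffle within each block
        layout[block] = order
    return layout


processes = ["A", "B", "C", "D"]
batches = ["B1", "B2", "B3", "B4", "B5"]

layout = randomize_rcb(processes, batches, seed=470)
for batch, order in layout.items():
    print(batch, "->", " ".join(order))
```

Each batch receives every process once, so batch-to-batch differences in raw material affect all four processes equally.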

Blocking

• Paired comparisons (Section 2.1) is a special case of a Randomized Complete Block (RCB) design

• More generally:

– Have k treatments

– have b blocks

– each of the k treatments is applied (in random order) to each block

Blocking

• Units within a block are more homogeneous than units between blocks

• Can remove variability due to blocks (e.g., boy to boy variability) from the comparison of treatments

Model

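• $y_{ij} = \eta + \alpha_i + \tau_j + \varepsilon_{ij}$, with $\varepsilon_{ij} \sim N(0, \sigma^2)$

• $i = 1, 2, \ldots, b$ indexes blocks ($\alpha_i$ is the effect of block $i$)

• $j = 1, 2, \ldots, k$ indexes treatments ($\tau_j$ is the effect of treatment $j$)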

ANOVA Table

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Squares   F
Block                 b-1
Treatment             k-1
Residual              (b-1)(k-1)
Total                 bk-1
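
The sums of squares for this table follow the standard decomposition for a randomized complete block design. In the usual notation (assumed here: $\bar{y}_{i\cdot}$ is the mean of block $i$, $\bar{y}_{\cdot j}$ the mean of treatment $j$, and $\bar{y}_{\cdot\cdot}$ the grand mean), a sketch of the entries is:

```latex
\begin{aligned}
SS_{\text{block}} &= k \sum_{i=1}^{b} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2, \\
SS_{\text{treatment}} &= b \sum_{j=1}^{k} (\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot})^2, \\
SS_{\text{residual}} &= \sum_{i=1}^{b} \sum_{j=1}^{k}
    (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2, \\
SS_{\text{total}} &= \sum_{i=1}^{b} \sum_{j=1}^{k} (y_{ij} - \bar{y}_{\cdot\cdot})^2
    = SS_{\text{block}} + SS_{\text{treatment}} + SS_{\text{residual}}, \\
F_{\text{treatment}} &= \frac{SS_{\text{treatment}} / (k-1)}
    {SS_{\text{residual}} / \{(b-1)(k-1)\}}.
\end{aligned}
```

Each mean square is its sum of squares divided by the corresponding degrees of freedom, and the F statistic for blocks is formed in the same way with $SS_{\text{block}} / (b-1)$ in the numerator.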

Hypothesis Tests

Multiple Comparisons

Example: Penicillin Experiment

• Objective: Compare four processes for making penicillin

• The raw material used in the process is thought to vary substantially from batch to batch

• Experiment Design:

– Use five separately produced batches of raw material

– Divide each batch into four sub-batches

– Randomly assign each process to one sub-batch.

– Randomize the production order within each batch

– Measure the yield (%)

• This is an RCB design with b = 5 blocks (batches) and k = 4 treatments (processes)

Data: Penicillin Example

Penicillin Experiment: yield (%) by process and batch

                          Batch
Process      B1   B2   B3   B4   B5   Proc. ave.
A            89   84   81   87   79   84
B            88   77   87   92   81   85
C            97   92   87   89   80   89
D            94   79   85   84   88   86
Batch ave.   92   83   85   88   82   86

Yield versus Process (grouped by blocks)

[Figure: Data: Penicillin Experiment. Horizontal axis: Process (A, B, C, D). Vertical axis: Yield (%), 70 to 100. One series per batch, B1 to B5.]

Observations:

• Some consistent differences among batches: generally, B1 high, B5 low

• No apparent consistent differences among processes

ANOVA – Randomized Block Design

ANOVA - Penicillin Experiment

Source of Variation   SS      df   MS      F      P-value   F crit
Processes             70.0    3    23.33   1.24   0.34      3.490
Batches               264.0   4    66.00   3.50   0.04      3.259
Error                 226.0   12   18.83
Total                 560.0   19
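
The sums of squares, mean squares and F ratios in this table can be reproduced from the yield data with a short script. This is a sketch in plain Python with illustrative variable names; the P-values and critical values are not recomputed because that would need an F-distribution routine.

```python
# Yields (%) from the penicillin data: rows are processes, columns are batches B1..B5.
yields = {
    "A": [89, 84, 81, 87, 79],
    "B": [88, 77, 87, 92, 81],
    "C": [97, 92, 87, 89, 80],
    "D": [94, 79, 85, 84, 88],
}
k = len(yields)                       # number of treatments (processes)
b = len(next(iter(yields.values())))  # number of blocks (batches)

all_values = [y for row in yields.values() for y in row]
grand_mean = sum(all_values) / (b * k)
process_means = [sum(row) / b for row in yields.values()]
batch_means = [sum(row[i] for row in yields.values()) / k for i in range(b)]

# Sum-of-squares decomposition for a randomized complete block design.
ss_process = b * sum((m - grand_mean) ** 2 for m in process_means)
ss_batch = k * sum((m - grand_mean) ** 2 for m in batch_means)
ss_total = sum((y - grand_mean) ** 2 for y in all_values)
ss_error = ss_total - ss_process - ss_batch

df_process, df_batch = k - 1, b - 1
df_error = df_process * df_batch
ms_error = ss_error / df_error
f_process = (ss_process / df_process) / ms_error
f_batch = (ss_batch / df_batch) / ms_error

print(f"Processes: SS={ss_process:.1f} df={df_process} F={f_process:.2f}")
print(f"Batches:   SS={ss_batch:.1f} df={df_batch} F={f_batch:.2f}")
print(f"Error:     SS={ss_error:.1f} df={df_error} MS={ms_error:.2f}")
```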

Conclusions

• F-value for Processes is not significant at $\alpha = 0.05$

• F-value for Batches (P = .04) is significant at $\alpha = 0.05$ … indicates some differences among batches of raw material

• We suspected batch differences; that’s why the design was done this way. In this case the result is neither a surprise nor of particular interest.

• Which would you use?

Diagnostic Checking

• Residual plots -- penicillin experiment

– To check Normality assumption:

• plot all residuals: dot chart, histogram, Normal prob. plot

– To check assumption of equal variances:

• dot plot of residuals by Treatment

• dot plot of residuals by Block

– Other possible checks:

• plot residuals vs. testing order

• plot residuals vs. other potential sources of variability

– e.g., vs. technician, or machine, etc.
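
A sketch of how the residuals for these plots could be computed, assuming the additive block-plus-treatment model, so that each residual is the observation minus its process mean minus its batch mean plus the grand mean. Names are illustrative, and plotting is left out to keep the example free of third-party packages.

```python
processes = ["A", "B", "C", "D"]
batches = ["B1", "B2", "B3", "B4", "B5"]
# Yields (%) from the penicillin data; rows follow `processes`, columns follow `batches`.
y = [
    [89, 84, 81, 87, 79],
    [88, 77, 87, 92, 81],
    [97, 92, 87, 89, 80],
    [94, 79, 85, 84, 88],
]
k, b = len(processes), len(batches)

grand = sum(map(sum, y)) / (k * b)
proc_mean = [sum(row) / b for row in y]
batch_mean = [sum(y[j][i] for j in range(k)) / k for i in range(b)]

# Residual = observed - fitted, with fitted = process mean + batch mean - grand mean.
resid = [
    [y[j][i] - proc_mean[j] - batch_mean[i] + grand for i in range(b)]
    for j in range(k)
]

# Group residuals the way the dot plots would: by treatment and by block.
by_process = {p: resid[j] for j, p in enumerate(processes)}
by_batch = {bt: [resid[j][i] for j in range(k)] for i, bt in enumerate(batches)}

for p, r in by_process.items():
    print(p, [f"{v:+.0f}" for v in r])
```

The residuals sum to zero within every process and within every batch, and their squares add up to the error sum of squares in the ANOVA table, which gives a quick check on the arithmetic.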

Randomized Block Design -- Summary

• Objective:

– Compare several treatments for a factor

– eliminate a source of variability from the comparison of treatments

– broaden conclusions

• Experimental Method:

– create b blocks, each with k experimental units

– in each block, randomly assign each treatment to one experimental unit

• Analysis:

– ANOVA: Blocks, Treatments, Error are sources of variation

Why Bother?

• Can remove variability due to blocks (e.g., boy to boy variability) from the comparison of treatments

• Removing a source of variability often increases the power to detect treatment differences

• Make comparisons on more homogeneous units
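
The gain from blocking can be illustrated with the penicillin ANOVA figures (sums of squares 70 for processes, 264 for batches, 226 for error). If the batches had been ignored and the same data analysed as a completely randomized design, the batch sum of squares and its 4 degrees of freedom would be pooled into the error term. This is a rough illustration on one data set, not a formal power calculation, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the penicillin ANOVA table.
ss_process, df_process = 70.0, 3
ss_batch, df_batch = 264.0, 4
ss_error, df_error = 226.0, 12

ms_process = ss_process / df_process

# Randomized block analysis: batch variation is removed from the error term.
ms_error_rcb = ss_error / df_error
f_rcb = ms_process / ms_error_rcb

# Ignoring blocks: batch variation is pooled into the error term.
ms_error_crd = (ss_error + ss_batch) / (df_error + df_batch)
f_crd = ms_process / ms_error_crd

print(f"Error MS with blocking:    {ms_error_rcb:.2f}, F = {f_rcb:.2f}")
print(f"Error MS without blocking: {ms_error_crd:.2f}, F = {f_crd:.2f}")
```

Here the error mean square drops from about 30.6 to about 18.8 once batches are accounted for, although the process effect is not significant in either analysis.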

Examples of Blocking Variables

• Blocks are units that can be sub-divided into sub-units

– Time:

– Space

– People:

– Batches: