
Post on 19-Dec-2015


Linear statistical models 2009

Models for continuous, binary and binomial responses

Simple linear models regarded as special cases of GLMs

Simple linear regression

One-way ANOVA

Two-way ANOVA with or without interaction effects

Some useful continuous distributions

Binary and binomial responses


A simple linear regression model

Heart rate in the common leopard frog

[Figure: scatter plot of heart rate (beats/minute) against temperature (°C) with the fitted line y = 1.775x + 2.1389; r = 0.967]

Temp. (°C)   Heart rate (beats/min)
    2              5
    4             11
    6             11
    8             14
   10             22
   12             23
   14             32
   16             29
   18             32
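The fitted line quoted on this slide can be checked directly from the nine data points. The following is a minimal sketch using the ordinary least squares formulas; the variable names are illustrative and not part of the original slides.

```python
# Frog data as read from the slide's table
# (temperature in degrees C, heart rate in beats/min).
temp = [2, 4, 6, 8, 10, 12, 14, 16, 18]
heart_rate = [5, 11, 11, 14, 22, 23, 32, 29, 32]

n = len(temp)
mean_x = sum(temp) / n
mean_y = sum(heart_rate) / n

# Corrected sum of squares of x and corrected cross product of x and y.
sxx = sum((x - mean_x) ** 2 for x in temp)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, heart_rate))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"y = {slope:.4f}x + {intercept:.4f}")  # prints: y = 1.7750x + 2.1389
```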


GENMOD implementation of simple linear regression

proc genmod data=linear.heartrate;

model heart_rate = temp /dist=normal link=identity;

run;

Analysis of Parameter Estimates

                           Standard     Wald 95% Confidence      Chi-
Parameter   DF   Estimate     Error           Limits           Square   Pr > ChiSq
Intercept    1     2.1389    1.6906     -1.1746      5.4524      1.60       0.2058
Temp         1     1.7750    0.1502      1.4806      2.0694    139.63       <.0001
Scale        1     2.3271    0.5485      1.4662      3.6936

NOTE: The scale parameter was estimated by maximum likelihood.
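The Wald columns of the output above can be reproduced from the Estimate and Standard Error columns. The sketch below assumes the usual Wald construction: limits are estimate ± z·SE with z the 97.5% standard normal quantile, and the chi-square statistic is (estimate/SE)² with 1 degree of freedom. The dictionary and variable names are illustrative. Small deviations from the printed values are expected because the inputs are rounded.

```python
import math
from statistics import NormalDist

# (estimate, standard error) copied from the GENMOD output on the slide.
params = {"Intercept": (2.1389, 1.6906), "Temp": (1.7750, 0.1502)}

z = NormalDist().inv_cdf(0.975)  # about 1.96

results = {}
for name, (estimate, std_error) in params.items():
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    wald_chi2 = (estimate / std_error) ** 2
    # For a chi-square variable with 1 df: P(X > c) = erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(wald_chi2 / 2))
    results[name] = (lower, upper, wald_chi2, p_value)
    print(f"{name}: [{lower:.4f}, {upper:.4f}]  chi2 = {wald_chi2:.2f}  p = {p_value:.4f}")
```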


Comparison of GENMOD and MINITAB’s simple linear regression

GENMOD Analysis of Parameter Estimates

                           Standard     Wald 95% Confidence      Chi-
Parameter   DF   Estimate     Error           Limits           Square   Pr > ChiSq
Intercept    1     2.1389    1.6906     -1.1746      5.4524      1.60       0.2058
Temp         1     1.7750    0.1502      1.4806      2.0694    139.63       <.0001
Scale        1     2.3271    0.5485      1.4662      3.6936

NOTE: The scale parameter was estimated by maximum likelihood.

MINITAB Regression Analysis: Heart_rate versus Temp

The regression equation is Heart_rate = 2.14 + 1.77 Temp

Predictor     Coef   SE Coef       T       P
Constant     2.139     1.917    1.12   0.301
Temp        1.7750    0.1703   10.42   0.000

S = 2.63869   R-Sq = 93.9%   R-Sq(adj) = 93.1%


Comparison of GENMOD and MINITAB’s simple linear regression

The point estimates of the fitted line are identical

The deviance $D = 2\,\bigl(l(y; \phi; y) - l(\hat{\mu}; \phi; y)\bigr)$ in GENMOD is equal to the error sum of squares

The estimates of the standard deviation are different: $\hat{\sigma}^2_{ML} = \frac{n-p}{n}\,\hat{\sigma}^2$

The Wald-tests and the t-tests are different
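The differences listed on this slide can be traced to the divisor used for the error sum of squares. The sketch below recomputes both versions from the frog data, assuming that GENMOD's Scale is sqrt(SSE/n) and MINITAB's S is sqrt(SSE/(n − p)) with p = 2 regression parameters. The variable names are illustrative.

```python
import math

temp = [2, 4, 6, 8, 10, 12, 14, 16, 18]
heart_rate = [5, 11, 11, 14, 22, 23, 32, 29, 32]
n, p = len(temp), 2  # p counts the intercept and the slope

mean_x = sum(temp) / n
mean_y = sum(heart_rate) / n
sxx = sum((x - mean_x) ** 2 for x in temp)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temp, heart_rate))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Error sum of squares (the deviance of the normal model).
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(temp, heart_rate))

sigma_ml = math.sqrt(sse / n)              # compare with GENMOD "Scale"
sigma_unbiased = math.sqrt(sse / (n - p))  # compare with MINITAB "S"

se_slope_ml = sigma_ml / math.sqrt(sxx)
se_slope_unbiased = sigma_unbiased / math.sqrt(sxx)

wald_chi2 = (slope / se_slope_ml) ** 2   # GENMOD Chi-Square for Temp
t_stat = slope / se_slope_unbiased       # MINITAB T for Temp

print(f"ML:       sigma = {sigma_ml:.4f}, SE = {se_slope_ml:.4f}, chi2 = {wald_chi2:.2f}")
print(f"Unbiased: sigma = {sigma_unbiased:.5f}, SE = {se_slope_unbiased:.4f}, t = {t_stat:.2f}")
```

The two standard deviation estimates differ only by the factor sqrt((n − p)/n), which is why the point estimates agree while the standard errors and test statistics do not.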


One-way ANOVA

Zr_content   Temperature   Sample   Hardness_Gpa
1            400           1        29.92954
1            400           1        29.62364
1            400           1        26.29459
1            400           1        28.10428
1            400           1        30.17624
1            400           1        30.81787
1            400           1        27.20422
1            400           1        26.8358
1            400           1        28.96704
1            400           1        28.27351
1            400           1        21.92773
1            400           1        28.15674
1            400           1        30.61629
1            400           1        27.6701
1            400           1        29.91087
1            400           1        28.16759

Measurement of hardness for nine groups of samples (3 levels of Zr, 3 temperature levels)


GENMOD implementation of one-way ANOVA

proc genmod data=linear.hardness;

class zr_content temperature sample;

model hardness_Gpa = sample /dist=normal link=identity;

run;

Analysis Of Parameter Estimates

                            Standard     Wald 95% Confidence       Chi-
Parameter    DF   Estimate     Error           Limits            Square   Pr > ChiSq
Intercept     1    35.9016    0.4357     35.0476     36.7555    6789.78       <.0001
Sample 1      1    -6.8011    0.6130     -8.0024     -5.5997     123.11       <.0001
Sample 2      1    -6.8303    0.6195     -8.0446     -5.6161     121.56       <.0001
Sample 3      1    -8.1457    0.6130     -9.3471     -6.9443     176.61       <.0001
Sample 4      1   -13.4144    0.6195    -14.6286    -12.2002     468.86       <.0001
Sample 5      1    -8.6257    0.4800     -9.5665     -7.6850     322.95       <.0001
Sample 6      1   -10.4443    0.6099    -11.6396     -9.2490     293.30       <.0001
Sample 7      1    -8.5459    0.6162     -9.7535     -7.3382     192.36       <.0001
Sample 8      1    -3.1868    0.6565     -4.4735     -1.9001      23.56       <.0001
Sample 9      0     0.0000    0.0000      0.0000      0.0000        .          .
Scale         1     2.9870    0.0871      2.8211      3.1627

NOTE: The scale parameter was estimated by maximum likelihood.


GENMOD implementation of one-way ANOVA

GENMOD

                            Standard     Wald 95% Confidence       Chi-
Parameter    DF   Estimate     Error           Limits            Square   Pr > ChiSq
Intercept     1    35.9016    0.4357     35.0476     36.7555    6789.78       <.0001
Sample 1      1    -6.8011    0.6130     -8.0024     -5.5997     123.11       <.0001
Sample 2      1    -6.8303    0.6195     -8.0446     -5.6161     121.56       <.0001
Sample 3      1    -8.1457    0.6130     -9.3471     -6.9443     176.61       <.0001
Sample 4      1   -13.4144    0.6195    -14.6286    -12.2002     468.86       <.0001
Sample 5      1    -8.6257    0.4800     -9.5665     -7.6850     322.95       <.0001
Sample 6      1   -10.4443    0.6099    -11.6396     -9.2490     293.30       <.0001
Sample 7      1    -8.5459    0.6162     -9.7535     -7.3382     192.36       <.0001
Sample 8      1    -3.1868    0.6565     -4.4735     -1.9001      23.56       <.0001
Sample 9      0     0.0000    0.0000      0.0000      0.0000        .          .
Scale         1     2.9870    0.0871      2.8211      3.1627

MINITAB ANOVA

Level     N     Mean   StDev
1        48   29.100   2.770
2        46   29.071   2.605
3        48   27.756   1.777
4        46   22.487   2.842
5       220   27.276   2.699
6        49   25.457   3.385
7        47   27.356   4.465
8        37   32.715   3.815
9        47   35.902   3.236

[Plot: interval for each level mean based on the pooled StDev; axis from 24.0 to 36.0]

Pooled StDev = 3.010
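The two outputs describe the same fit in different parameterisations. The sketch below, using only numbers printed on the slide, checks three links between them: each MINITAB level mean equals the GENMOD intercept plus the corresponding Sample effect (level 9 is the reference), the ML scale equals the pooled StDev times sqrt((n − p)/n), and the intercept's standard error is the scale divided by the square root of the reference group size. The dictionary names are illustrative.

```python
import math

# GENMOD estimates: the intercept is the mean of reference level 9.
intercept = 35.9016
effects = {1: -6.8011, 2: -6.8303, 3: -8.1457, 4: -13.4144, 5: -8.6257,
           6: -10.4443, 7: -8.5459, 8: -3.1868, 9: 0.0}
scale_ml = 2.9870

# MINITAB group sizes, means and pooled standard deviation.
group_n = {1: 48, 2: 46, 3: 48, 4: 46, 5: 220, 6: 49, 7: 47, 8: 37, 9: 47}
group_mean = {1: 29.100, 2: 29.071, 3: 27.756, 4: 22.487, 5: 27.276,
              6: 25.457, 7: 27.356, 8: 32.715, 9: 35.902}
pooled_sd = 3.010

# Level means implied by the GENMOD parameterisation.
fitted_mean = {level: intercept + effect for level, effect in effects.items()}

n_total = sum(group_n.values())
p = len(group_n)  # nine mean parameters
scale_from_pooled = pooled_sd * math.sqrt((n_total - p) / n_total)
se_intercept = scale_ml / math.sqrt(group_n[9])

for level in sorted(fitted_mean):
    print(f"Level {level}: GENMOD {fitted_mean[level]:.4f}  MINITAB {group_mean[level]:.3f}")
print(f"Scale from pooled StDev: {scale_from_pooled:.4f}")
print(f"SE of intercept: {se_intercept:.4f}")
```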


GENMOD implementation of two-way ANOVA

proc genmod data=linear.hardness;

class zr_content temperature sample;

model hardness_Gpa = zr_content temperature zr_content * temperature/dist=normal link=identity;

run;

Analysis Of Parameter Estimates

                                                Standard     Wald 95% Confidence       Chi-
Parameter                        DF   Estimate     Error           Limits            Square   Pr > ChiSq
Intercept                         1    27.7559    0.4311     26.9108     28.6009    4144.59       <.0001
Zr_content            0.17        1     8.1457    0.6130      6.9443      9.3471     176.61       <.0001
Zr_content            0.5         1    -2.2986    0.6066     -3.4875     -1.1097      14.36       0.0002
Zr_content            1           0     0.0000    0.0000      0.0000      0.0000        .          .
Temperature           400         1     1.3446    0.6097      0.1496      2.5397       4.86       0.0274
Temperature           800         1     1.3154    0.6163      0.1074      2.5233       4.56       0.0328
Temperature           1000        0     0.0000    0.0000      0.0000      0.0000        .          .
Zr_conten*Temperatur  0.17 400    1    -9.8905    0.8668    -11.5895     -8.1915     130.18       <.0001
Zr_conten*Temperatur  0.17 800    1    -4.5022    0.9004     -6.2670     -2.7373      25.00       <.0001
Zr_conten*Temperatur  0.17 1000   0     0.0000    0.0000      0.0000      0.0000        .          .
Zr_conten*Temperatur  0.5  400    1    -4.3147    0.8648     -6.0096     -2.6199      24.90       <.0001
Zr_conten*Temperatur  0.5  800    1     0.5032    0.7762     -1.0181      2.0245       0.42       0.5168
Zr_conten*Temperatur  0.5  1000   0     0.0000    0.0000      0.0000      0.0000        .          .
Zr_conten*Temperatur  1    400    0     0.0000    0.0000      0.0000      0.0000        .          .
Zr_conten*Temperatur  1    800    0     0.0000    0.0000      0.0000      0.0000        .          .
Zr_conten*Temperatur  1    1000   0     0.0000    0.0000      0.0000      0.0000        .          .
Scale                             1     2.9870    0.0871      2.8211      3.1627
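With the interaction included, the two-way model has nine mean parameters and therefore reproduces the nine group means of the one-way analysis; the Scale estimate is the same in both outputs. The sketch below rebuilds the cell means from the printed estimates. The function and dictionary names are illustrative, and parameters fixed at zero are the reference levels (Zr_content 1, Temperature 1000).

```python
intercept = 27.7559
zr_effect = {"0.17": 8.1457, "0.5": -2.2986, "1": 0.0}
temp_effect = {"400": 1.3446, "800": 1.3154, "1000": 0.0}
# Interaction terms not listed here are fixed at zero in the output.
interaction = {("0.17", "400"): -9.8905, ("0.17", "800"): -4.5022,
               ("0.5", "400"): -4.3147, ("0.5", "800"): 0.5032}


def cell_mean(zr, temp):
    """Fitted mean hardness for one Zr_content / Temperature combination."""
    return (intercept + zr_effect[zr] + temp_effect[temp]
            + interaction.get((zr, temp), 0.0))


cell_means = {(zr, t): cell_mean(zr, t) for zr in zr_effect for t in temp_effect}

# Level means from the MINITAB one-way output, for comparison.
one_way_means = [29.100, 29.071, 27.756, 22.487, 27.276,
                 25.457, 27.356, 32.715, 35.902]

for (zr, t), m in cell_means.items():
    print(f"Zr_content {zr:>4}, Temperature {t:>4}: {m:.4f}")
```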


The gamma distribution

Expected value: $\alpha\beta$

Variance: $\alpha\beta^{2}$

$f(y; \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, y^{\alpha-1} \exp(-y/\beta)$

[Figure: gamma density function]
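The moments of the gamma distribution can be checked numerically. The sketch below assumes the shape/scale parameterisation (shape `alpha`, scale `beta`), for which the mean is alpha·beta and the variance is alpha·beta². The parameter values, the integration limit and the helper `integrate` are illustrative choices, not taken from the slides.

```python
import math


def gamma_pdf(y, alpha, beta):
    """Gamma density with shape alpha > 0 and scale beta > 0, for y > 0."""
    return y ** (alpha - 1) * math.exp(-y / beta) / (math.gamma(alpha) * beta ** alpha)


def integrate(f, a, b, steps=20_000):
    """Midpoint rule on [a, b]; adequate for this smooth integrand."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


alpha, beta = 3.0, 4.0   # illustrative values
upper = 400.0            # far enough into the right tail for these values

total = integrate(lambda y: gamma_pdf(y, alpha, beta), 0.0, upper)
mean = integrate(lambda y: y * gamma_pdf(y, alpha, beta), 0.0, upper)
second_moment = integrate(lambda y: y * y * gamma_pdf(y, alpha, beta), 0.0, upper)
variance = second_moment - mean ** 2

print(f"integral = {total:.6f}, mean = {mean:.4f}, variance = {variance:.4f}")
```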


The χ² distribution

Expected value: p

Variance: 2p

Special case of the gamma distribution

Sum of p independent squared standard normal variables

$f(y; p) = \frac{1}{\Gamma(p/2)\, 2^{p/2}}\, y^{(p/2)-1} \exp(-y/2)$

[Figure: χ² density function]
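The "special case" statement can be made concrete: with p degrees of freedom, the χ² density coincides with a gamma density with shape p/2 and scale 2, which gives mean (p/2)·2 = p and variance (p/2)·2² = 2p. The sketch below compares the two densities at a few points; the function names and the chosen p are illustrative.

```python
import math


def gamma_pdf(y, shape, scale):
    """Gamma density in the shape/scale parameterisation, for y > 0."""
    return y ** (shape - 1) * math.exp(-y / scale) / (math.gamma(shape) * scale ** shape)


def chi2_pdf(y, p):
    """Chi-square density with p degrees of freedom, for y > 0."""
    return y ** (p / 2 - 1) * math.exp(-y / 2) / (math.gamma(p / 2) * 2 ** (p / 2))


p = 5  # illustrative degrees of freedom
points = [0.5, 1.0, 3.0, 10.0]
densities = [(chi2_pdf(y, p), gamma_pdf(y, p / 2, 2.0)) for y in points]

for y, (c, g) in zip(points, densities):
    print(f"y = {y:5.1f}: chi2 = {c:.6f}, gamma = {g:.6f}")
```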


A model of the mean of a gamma distribution

proc genmod data=linear.clottingtime;

model clotting_time = lconc agent lconc * agent/dist=gamma link=power(-1) residuals;

output out=linear.clottingout resdev=resdev pred=pred;

run;

Conc   Lconc   Agent   Clotting_time
5      1.61    1       118
10     2.30    1       58
15     2.71    1       42
20     3.00    1       35
30     3.40    1       27
40     3.69    1       25
60     4.09    1       21
80     4.38    1       19
100    4.61    1       18
5      1.61    0       69
10     2.30    0       35
15     2.71    0       26
20     3.00    0       21
30     3.40    0       18
40     3.69    0       16
60     4.09    0       13
80     4.38    0       12
100    4.61    0       12
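The link `power(-1)` in the model statement means that 1/mean is modelled as linear in the predictors. As a rough plausibility check, one can regress the reciprocal of the observed clotting time on Lconc separately for each agent. This is ordinary least squares on a transformed response, not the maximum likelihood gamma fit that GENMOD performs, so the coefficients are only approximate. The helper `ols` is an illustrative name.

```python
lconc = [1.61, 2.30, 2.71, 3.00, 3.40, 3.69, 4.09, 4.38, 4.61]
clotting_time = {
    1: [118, 58, 42, 35, 27, 25, 21, 19, 18],
    0: [69, 35, 26, 21, 18, 16, 13, 12, 12],
}


def ols(x, z):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    return mean_z - slope * mean_x, slope


# Regress 1/clotting_time on lconc within each agent.
fits = {agent: ols(lconc, [1 / t for t in times])
        for agent, times in clotting_time.items()}

for agent, (a, b) in fits.items():
    print(f"Agent {agent}: 1/time is roughly {a:.4f} + {b:.4f} * lconc")
```

The slopes differ between the two agents, which is what the `lconc * agent` interaction term in the model allows for.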


Binary and binomial responses

The response probabilities are modelled as functions of the predictors

Link functions:

the probit link: $\mathrm{probit}(p) = \Phi^{-1}(p)$

the logit link: $\mathrm{logit}(p) = \log\frac{p}{1-p}$

the complementary log-log link: $\mathrm{CLL}(p) = \log(-\log(1-p))$
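The three link functions and their inverses can be written out directly. The sketch below uses only the Python standard library; the function names are illustrative. Each link maps a probability in (0, 1) to the whole real line, and the inverse maps a linear predictor back to a probability.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def logit(p):
    return math.log(p / (1 - p))


def probit(p):
    return _STD_NORMAL.inv_cdf(p)


def cloglog(p):
    return math.log(-math.log(1 - p))


def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    return 1 - math.exp(-math.exp(eta))


for p in (0.1, 0.5, 0.9):
    print(f"p = {p}: logit = {logit(p):.4f}, probit = {probit(p):.4f}, "
          f"cloglog = {cloglog(p):.4f}")
```

The logit and probit links are symmetric around p = 0.5, where both equal zero; the complementary log-log link is not, and is negative at p = 0.5.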