STA 617 – Chp12 Generalized Linear Mixed Models

Upload: sibyl-bruce

Post on 21-Jan-2016

SAS for Model (12.3) with Matched Pairs from Table 12.1

REPLICATE Statement

The REPLICATE statement provides a way to accommodate models in which different subjects have identical data. PROC NLMIXED assumes that its value indicates the number of subjects having data identical to those for the current value of the SUBJECT= variable (specified in the RANDOM statement).

Only the last observation of the REPLICATE variable for each subject is used, and the replicate variable must have only positive integer values.

Note that the REPLICATE mechanism is different from using a FREQ statement in other statistical modeling procedures, such as PROC GLM, GENMOD, GLIMMIX, and LOGISTIC.

A FREQ variable is used to identify grouped values for observations, essentially multiplying the log likelihood or sum of squares contribution for the observation.

A REPLICATE variable is used to multiply the contribution of a subject that comprises one or more observations.
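The subject-level weighting described above can be illustrated outside SAS. The following is a minimal Python sketch, not PROC NLMIXED's actual algorithm: it integrates a normal random intercept out of each subject's likelihood on a simple grid, then checks that multiplying a subject's log-likelihood by a replicate count gives the same total as listing that subject repeatedly. The response patterns, counts, linear predictors `etas`, and `sigma` are all hypothetical values chosen for illustration.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def subject_loglik(responses, etas, sigma, n_grid=2001, half_width=8.0):
    """Marginal log-likelihood of one subject's binary responses.

    The random intercept u ~ N(0, sigma^2) is integrated out with the
    trapezoid rule on an evenly spaced grid (a crude stand-in for quadrature).
    """
    lo, hi = -half_width * sigma, half_width * sigma
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * step
        cond = 1.0  # conditional probability of all responses given u
        for r, eta in zip(responses, etas):
            p = inv_logit(eta + u)
            cond *= p if r == 1 else 1.0 - p
        dens = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * cond * dens * step
    return math.log(total)

# Hypothetical grouped data: (responses on three items, number of identical subjects)
patterns = [((1, 1, 1), 4), ((1, 0, 0), 2), ((0, 0, 0), 3)]
etas = (0.5, 0.0, -0.5)  # assumed fixed-effect linear predictors, one per item
sigma = 1.5              # assumed random-effect standard deviation

# REPLICATE-style: one row per distinct subject, weighted by its count
weighted = sum(count * subject_loglik(resp, etas, sigma) for resp, count in patterns)

# Expanded data: every subject listed individually
expanded = sum(subject_loglik(resp, etas, sigma)
               for resp, count in patterns for _ in range(count))

print(weighted, expanded)
```

The weight applies to the whole subject (all three responses together), which is the distinction from a per-observation FREQ weight.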

12.3 EXAMPLES OF RANDOM EFFECTS MODELS FOR BINARY DATA

Random effects models:

12.3.1 Small-Area Estimation of Binomial Proportions
12.3.2 Modeling Repeated Binary Responses
12.3.4 Modeling Heterogeneity among Multicenter Clinical Trials
12.3.5 Alternative Formulations of Random Effects Models
12.3.6 Capture–Recapture Modeling to Predict Population Size

12.3.1 Small-Area Estimation of Binomial Proportions

Small-area estimation refers to estimation of parameters for a large number of geographical areas when each has relatively few observations.

For instance, one might want county-specific estimates of characteristics such as the unemployment rate or the proportion of families having health insurance coverage.

With a national or statewide survey, some counties may have few observations. Then, sample proportions in the counties may poorly estimate the true countywide proportions.

Random effects models that treat each county as a cluster can provide improved estimates.

In assuming that the true proportions vary according to some distribution, the fitting process "borrows from the whole": it uses data from all the counties to estimate the proportion in any given one.

Example

The data are a simulated sample of size 2000, constructed to mimic a poll taken before the 1996 U.S. presidential election.

For $T_i$ observations in state $i$ ($i = 1, \ldots, 51$, where $i = 51$ is DC), $y_i$ is $\text{bin}(T_i, \pi_i)$, where $\pi_i$ is the actual proportion of votes in state $i$ for Bill Clinton in the 1996 election, conditional on voting for Clinton or the Republican candidate, Bob Dole.

Here, $T_i$ is proportional to the state's population size, subject to $\sum_i T_i = 2000$.

Table 12.2 shows $T_i$, $\pi_i$, and $p_i = y_i/T_i$.

Fixed-effects model

Problem: some states have few observations. Then, sample proportions in the states may poorly estimate the true statewide proportions.

General notation: Let $\pi_i$ denote the true proportion in area $i$, $i = 1, \ldots, n$.

These areas may be all the ones of interest, or only a sample.

Let $\{y_i\}$ denote independent $\text{bin}(T_i, \pi_i)$ variates; that is, $y_i = \sum_{t=1}^{T_i} y_{it}$, where $\{y_{it},\ t = 1, \ldots, T_i\}$ are independent with $P(Y_{it}=1) = \pi_i$ and $P(Y_{it}=0) = 1 - \pi_i$.

The sample proportions $p_i = y_i/T_i$ are ML estimates of $\pi_i$ for the fixed-effects model, which gives each area its own parameter: $\operatorname{logit}(\pi_i) = \beta_i$, $i = 1, \ldots, n$.

Problem with the fixed-effects model

For small $\{T_i\}$, the $\{p_i\}$ have large standard errors.

Thus, the $\{p_i\}$ may display much more variability than the $\{\pi_i\}$, especially when the $\{\pi_i\}$ are similar.

It is helpful to shrink the $\{p_i\}$ toward their overall mean.

This motivates random effects.

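Random effects model

$$\operatorname{logit}(\pi_i) = \alpha + u_i, \qquad u_i \sim N(0, \sigma^2) \text{ independently} \tag{12.9}$$

If $\hat\sigma = 0$, then all $\hat u_i = 0$ and the random effects estimate of each $\pi_i$ is the overall sample proportion. When all the $\pi_i$ are truly equal, this is a much better estimator of that common value than the sample proportion from a single sample.

Generally, the random effects model estimators shrink the separate sample proportions toward the overall sample proportion. The amount of shrinkage decreases as $\hat\sigma$ increases.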

The predicted random effect is the estimated mean of the distribution of ui , given the data.

This prediction depends on all the data, not just data from area i.

A benefit is potential reduction in the mean-squared error of the estimates around the true values.
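To make the shrinkage concrete, here is a minimal Python sketch of the idea, not the PROC NLMIXED computation itself. It assumes a logit-normal model, logit(pi) = alpha + u with u ~ N(0, sigma^2), and approximates the mean of u given the data by a weighted sum over a grid. The value `alpha = 0.1633` is the logit-scale value quoted in these slides; `sigma = 0.3` and the two example areas are assumptions made only for illustration.

```python
import math

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def predicted_random_effect(y, T, alpha, sigma, n_grid=2001, half_width=8.0):
    """Approximate mean of u given y successes in T trials.

    Model: logit(pi) = alpha + u, u ~ N(0, sigma^2). The posterior is
    evaluated on an evenly spaced grid of u values.
    """
    us, logw = [], []
    for k in range(n_grid):
        u = sigma * half_width * (2.0 * k / (n_grid - 1) - 1.0)
        p = inv_logit(alpha + u)
        # binomial kernel times normal density, on the log scale
        logw.append(y * math.log(p) + (T - y) * math.log(1.0 - p)
                    - 0.5 * (u / sigma) ** 2)
        us.append(u)
    top = max(logw)  # subtract the maximum to avoid underflow
    w = [math.exp(v - top) for v in logw]
    return sum(u * wi for u, wi in zip(us, w)) / sum(w)

alpha = 0.1633  # logit-scale value quoted in the slides
sigma = 0.3     # assumed for illustration only

def shrunken_proportion(y, T):
    return inv_logit(alpha + predicted_random_effect(y, T, alpha, sigma))

# Two hypothetical areas with the same sample proportion (0.30) but different sizes
small_area = shrunken_proportion(3, 10)
large_area = shrunken_proportion(300, 1000)
overall = inv_logit(alpha)

print(round(small_area, 3), round(large_area, 3), round(overall, 3))
```

Both estimates land between the sample proportion 0.30 and the overall level, and the area with only 10 observations is pulled much further toward the overall level than the area with 1000.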

SAS GLMM Analyses of Election Data in Table 12.2

For the ML fit of model (12.9)

The predicted random effect values obtained using PROC NLMIXED in SAS give estimates that vary from 0.468 to 0.696. These show considerable shrinkage from the sample proportions toward the overall proportion supporting Clinton, which was 0.548. [A logit of 0.1633 corresponds to exp(0.1633)/(1+exp(0.1633)) = 0.5407.]

The sample proportions vary between 0.111 and 1.0.

Sample proportions based on fewer observations, such as DC, tended to shrink more.

Although the estimates incorporating random effects are relatively homogeneous, they tend to be closer than the sample proportions to the true values.
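The bracketed conversion from the logit scale to a proportion can be checked directly. A minimal Python sketch; the only input is the value 0.1633 quoted above.

```python
import math

logit_value = 0.1633  # logit-scale value quoted in the slides

# inverse logit: exp(x) / (1 + exp(x))
proportion = math.exp(logit_value) / (1.0 + math.exp(logit_value))

print(round(proportion, 4))  # 0.5407
```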

How to simulate the data?

/*new simulation*/

data vote1; set vote;

/*simulate the data based on true prob in each state*/

y=rand("BINOMIAL", truep, n);

run;
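For readers without SAS, the same simulation idea can be sketched in Python with the standard library only: draw one binomial count per area from its true proportion. The area names, sample sizes, and true proportions below are hypothetical placeholders, not the values in Table 12.2, and the helper `simulate_counts` is introduced here only for illustration.

```python
import random

def simulate_counts(areas, seed=617):
    """Draw y ~ bin(T, true_p) for each area as a sum of T Bernoulli trials."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    counts = {}
    for name, (T, true_p) in areas.items():
        counts[name] = sum(1 for _ in range(T) if rng.random() < true_p)
    return counts

# Hypothetical areas: name -> (number of observations T, true proportion)
areas = {"A": (5, 0.85), "B": (120, 0.55), "C": (40, 0.47)}

simulated = simulate_counts(areas)
for name, y in simulated.items():
    T = areas[name][0]
    print(name, y, T, round(y / T, 3))  # count, size, sample proportion
```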

12.3.2 Modeling Repeated Binary Responses: Incorporating Covariates

The subjects indicated whether they supported legalizing abortion in each of three situations. Items are (1=yes, 2=no):

(1) if the family has a very low income and cannot afford any more children

(2) when the woman is not married and does not want to marry the man

(3) when the woman wants the abortion for any reason.

Let yit denote the response for subject i on item t, with yit=1 representing support.

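Consider the model, written here in the parameterization used by the SAS code for this example (intercept $\alpha$, with $\beta_3 = 0$ for the third item):

$$\operatorname{logit}[P(Y_{it} = 1 \mid u_i)] = \alpha + \beta_t + \gamma x_i + u_i$$

where $x_i = 1$ for females and 0 for males, and where the $\{u_i\}$ are independent $N(0, \sigma^2)$. The gender effect $\gamma$ is assumed the same for each item, and the $\{\beta_t\}$ refer to the items.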

data new;
  input sex poor single any count;
  datalines;
1 1 1 1 342
1 1 1 0 26
1 1 0 1 11
1 1 0 0 32
1 0 1 1 6
1 0 1 0 21
1 0 0 1 19
1 0 0 0 356
2 1 1 1 440
2 1 1 0 25
2 1 0 1 14
2 1 0 0 47
2 0 1 1 14
2 0 1 0 18
2 0 0 1 22
2 0 0 0 457
;
run;

/* One record per response pattern and item; q1 and q2 indicate items 1 and 2. */
/* The output is named new1 because the GEE step reads new1.                   */
data new1;
  set new;
  sex = sex - 1;
  case = _n_;
  q1 = 1; q2 = 0; resp = poor;   output;
  q1 = 0; q2 = 1; resp = single; output;
  q1 = 0; q2 = 0; resp = any;    output;
  drop poor single any;
run;

proc nlmixed data=new1 qpoints=50;
  parms alpha=0 beta1=.8 beta2=.3 gamma=0 sigma=8.6;
  eta = alpha + beta1*q1 + beta2*q2 + gamma*sex + u;
  p = exp(eta)/(1 + exp(eta));
  model resp ~ binary(p);
  random u ~ normal(0, sigma*sigma) subject=case;
  replicate count;   /* each case stands for count identical subjects */
  estimate 'diff1-2' beta1-beta2;
run;

GEE

/* Expand each response pattern into count individual subjects */
data new2;
  set new1;
  do i = 1 to count;
    id = compress(case || "|" || i);
    output;
  end;
run;

/* Split the responses by item */
data q1 q2 q3;
  set new2;
  if q1=1 and q2=0 then output q1;
  else if q1=0 and q2=1 then output q2;
  else output q3;
run;

data qq;
  merge q1(rename=(resp=qq1)) q2(rename=(resp=qq2)) q3(rename=(resp=qq3));
run;

proc corr;
  var qq1 qq2 qq3;
run;

proc genmod desc data=new2;
  class id;
  model resp = q1 q2 sex / link=logit dist=bin covb maxiter=500;
  repeated subject=id / type=exch;
  estimate 'diff12' q1 1 q2 -1;
run;

For a given subject of either gender, for instance, the estimated odds of supporting legalized abortion for item 1 equal exp(0.83) = 2.3 times the estimated odds for item 3.

For each item, the estimated probability of supporting legalized abortion is similar for females and males with similar random effect values ($\hat\gamma = 0.01$).

For these data, subjects are highly heterogeneous ($\hat\sigma = 8.6$). Thus, strong associations exist among responses on the three items.

This is reflected by 1595 of the 1850 subjects making the same response on all three items: that is, response patterns (0, 0, 0) and (1, 1, 1).
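The two counts quoted here can be recovered from the grouped data used in the SAS step for this example. A small Python check; the dictionary simply restates those data lines as (sex, poor, single, any) mapped to count.

```python
# (sex, poor, single, any) -> count, copied from the SAS datalines for this example
table = {
    (1, 1, 1, 1): 342, (1, 1, 1, 0): 26, (1, 1, 0, 1): 11, (1, 1, 0, 0): 32,
    (1, 0, 1, 1): 6,   (1, 0, 1, 0): 21, (1, 0, 0, 1): 19, (1, 0, 0, 0): 356,
    (2, 1, 1, 1): 440, (2, 1, 1, 0): 25, (2, 1, 0, 1): 14, (2, 1, 0, 0): 47,
    (2, 0, 1, 1): 14,  (2, 0, 1, 0): 18, (2, 0, 0, 1): 22, (2, 0, 0, 0): 457,
}

total = sum(table.values())

# subjects whose three item responses are all equal: (0, 0, 0) or (1, 1, 1)
same_on_all_items = sum(
    n for (sex, poor, single, any_reason), n in table.items()
    if poor == single == any_reason
)

print(same_on_all_items, total)  # 1595 1850
```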

Such a large $\hat\sigma$ implies tremendous variability in between-subject odds ratios.

From (12.7), for different subjects of a given gender, the middle 50% of odds ratios comparing items 1 and 3 are estimated to vary between about exp(0.83 − 0.95 × 8.6) and exp(0.83 + 0.95 × 8.6), that is, between about 0.0006 and 8100.
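One way to see where a multiplier of about 0.95 can come from (a reading assumed here, since equation (12.7) itself is not reproduced in these slides): if two subjects have independent random effects, the difference $u_i - u_j$ is $N(0, 2\sigma^2)$, so its quartiles sit at $\pm 0.674\sqrt{2}\,\sigma \approx \pm 0.95\sigma$. The Python sketch below checks that multiplier and evaluates the quoted bounds using the slide's rounded values 0.83, 0.95, and 8.6.

```python
import math
from statistics import NormalDist

# 75th percentile of the standard normal, about 0.674
z75 = NormalDist().inv_cdf(0.75)

# quartile multiplier for the difference of two independent N(0, sigma^2) effects
multiplier = z75 * math.sqrt(2.0)

beta_diff = 0.83  # estimated item 1 vs item 3 effect (from the slides)
sigma = 8.6       # estimated random-effect standard deviation (from the slides)

within_subject = math.exp(beta_diff)          # same-subject odds ratio
lower = math.exp(beta_diff - 0.95 * sigma)    # lower quartile, between subjects
upper = math.exp(beta_diff + 0.95 * sigma)    # upper quartile, between subjects

print(round(multiplier, 3), round(within_subject, 1), lower, upper)
```

The within-subject odds ratio stays at exp(0.83), while the between-subject range spans several orders of magnitude.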

An extended model allows interaction between gender and item. It does not fit better.

GEE estimates for the exchangeable working correlation structure

The GEE model describes six marginal probabilities (three for each gender) using four parameters.

These population-averaged $\hat\beta$ estimates are much smaller than the subject-specific $\hat\beta$ estimates from the GLMM.

This reflects the very large GLMM heterogeneity ($\hat\sigma = 8.6$) and the corresponding strong correlations among the three responses.

For instance, the GEE analysis estimates a common correlation of 0.82 between pairs of responses.

Although the GLMM $\hat\beta$ estimates are about five to six times the marginal model estimates, so are the standard errors. The two approaches provide similar substantive interpretations and conclusions.
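The "five to six times" ratio is in line with a widely used approximation for logistic-normal models, which is not stated in these slides and is used here only as a rough check: a subject-specific effect is about $\sqrt{1 + c^2\sigma^2}$ times the corresponding population-averaged effect, with $c = 16\sqrt{3}/(15\pi)$. A minimal Python sketch; the helper name `attenuation_factor` is introduced for this illustration.

```python
import math

# constant from the logistic-normal approximation, about 0.588
C = 16.0 * math.sqrt(3.0) / (15.0 * math.pi)

def attenuation_factor(sigma):
    """Approximate ratio of subject-specific to population-averaged effects."""
    return math.sqrt(1.0 + (C * sigma) ** 2)

factor_large = attenuation_factor(8.6)  # heterogeneity estimated for these data

print(round(factor_large, 2))
```

When sigma is close to zero the factor is close to 1, so the two kinds of estimates nearly coincide.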

12.3.3 Longitudinal Mental Depression Study Revisited

The response $y_t$ for measurement $t$ on mental depression equals 1 for normal and 0 for abnormal.

Predictors:

severity of initial diagnosis $s$ (1=severe, 0=mild)
drug treatment $d$ (1=new, 0=standard)
time of measurement $t$

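Marginal model (Chapter 11)

$$\operatorname{logit}[P(Y_t = 1)] = \alpha + \beta_1 s + \beta_2 d + \beta_3 t + \beta_4 (d \times t)$$

Random effects model

$$\operatorname{logit}[P(Y_{it} = 1 \mid u_i)] = \alpha + \beta_1 s + \beta_2 d + \beta_3 t + \beta_4 (d \times t) + u_i, \qquad u_i \sim N(0, \sigma^2)$$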

SAS

proc nlmixed qpoints=100;

parms alpha=-.03 beta1=-1.3 beta2=-.06 beta3=.48 beta4=1.02 sigma=.066;

eta = alpha + beta1*diagnose + beta2*treat + beta3*time + beta4*treat*time + u;

p = exp(eta)/(1 + exp(eta));

model outcome ~ binary(p);

random u ~ normal(0, sigma*sigma) subject = case;

run;

The GEE and GLMM estimates are similar because $\hat\sigma = 0.07$ is very small.

With little heterogeneity among subjects, the population-averaged effects are roughly equal to the subject-specific effects.