Partially missing at random and ignorable inferences for parameter
subsets with missing data
Roderick Little
Outline
• Survey Bayesics in three slides
• Inference with missing data: Rubin's (1976) paper on conditions for ignoring the missing-data mechanism
• Rubin's standard conditions are sufficient but not necessary: example
• Propose definitions of MAR, ignorability for likelihood (and Bayes) inference for subsets of parameters
• Examples
• Joint work with Sahar Zangeneh
Graybill Conference: Partially Missing at Random 2
Calibrated Bayes
– Frequentists should be Bayesian
  • Bayes is optimal under assumed model
– Bayesians should be frequentist
  • We never know the model (and all models are wrong)
  • Inferences should have good repeated sampling characteristics
– Calibrated Bayes (e.g. Box 1980, Rubin 1984, Little 2012)
  • Inference based on a Bayesian model
  • Model chosen to yield inferences that are well-calibrated in a frequentist sense
  • Aim for posterior probability intervals that have (approximately) nominal frequentist coverage
Calibrated Bayes models for surveys should incorporate sample design features
– All models are wrong, some models are useful
  • Design-assisted: make the estimator more robust
  • Calibrated Bayes: make the model more robust; many models yield design-consistent estimates
– Models that ignore features like survey weights are vulnerable to misspecification
– But models can be successfully applied in survey setting, with attention to design features
  • Weighting, stratification, clustering
– Capture design weights as covariates in the prediction model (e.g. Gelman 2007)
Benefits of Bayes
• Unified approach to all problems
  – Avoids current approach -- “inferential schizophrenia”
• Not asymptotic
  – Propagates errors in estimating parameters
• Avoids frequentist pitfalls:
  – Conditions on ancillaries
  – Obeys likelihood principle
There are those who predict…
… and those who weight
Rubin (1976 Biometrika)
• Landmark paper (3700+ citations, after being rejected by many journals!)
  – RL wrote his first (11 page) referee report, and an obscure discussion
• Modeled the missing data mechanism by treating missingness indicators as random variables, assigning them a distribution
• Sufficient conditions under which missing data mechanism can be ignored for likelihood and frequentist inference about parameters
  – Focus here on likelihood, Bayes
Ignoring the mechanism
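• $D = (D_{\text{obs}}, D_{\text{mis}})$: $D$ = data with no missing values, $D_{\text{obs}}$ observed, $D_{\text{mis}}$ missing
• $R$ = response indicator matrix
• Joint model: $f_{D,R}(D, R \mid \theta, \psi) = f_D(D \mid \theta)\, f_{R|D}(R \mid D, \psi)$
• Full likelihood: $L(\theta, \psi \mid D_{\text{obs}}, R) = \text{const.} \times \int f_D(D \mid \theta)\, f_{R|D}(R \mid D, \psi)\, dD_{\text{mis}}$
• Likelihood ignoring mechanism: $L_{\text{ign}}(\theta \mid D_{\text{obs}}, R) = \text{const.} \times \int f_D(D \mid \theta)\, dD_{\text{mis}}$
• Missing data mechanism can be ignored for likelihood inference when
  $L(\theta, \psi \mid D_{\text{obs}}, R) = L_{\text{ign}}(\theta \mid D_{\text{obs}}, R) \times L_{\text{rest}}(\psi \mid D_{\text{obs}}, R)$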
Rubin’s sufficient conditions for ignoring the mechanism
• Missing data mechanism can be ignored for likelihood inference when
  – (a) the missing data are missing at random (MAR):
    $f_{R|D}(R \mid D_{\text{obs}}, D_{\text{mis}}, \psi) = f_{R|D}(R \mid D_{\text{obs}}, \psi)$ for all $D_{\text{mis}}, \psi$
  – (b) distinctness of the parameters of the data model and the missing-data mechanism:
    $(\theta, \psi) \in \Omega_\theta \times \Omega_\psi$; for Bayes, $\theta$ and $\psi$ a-priori independent
• MAR is the key condition: without (b), inferences are valid but not fully efficient
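A short derivation of why condition (a) does the work, writing $\theta$ for the data-model parameters and $\psi$ for the mechanism parameters. This is a sketch in the notation of the "Ignoring the mechanism" slide, not a quotation from Rubin (1976): under MAR the mechanism factor no longer involves $D_{\text{mis}}$, so it comes outside the integral.

```latex
\begin{aligned}
L(\theta,\psi \mid D_{\text{obs}}, R)
 &= \text{const.}\times\int f_D(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\,
     f_{R|D}(R \mid D_{\text{obs}}, D_{\text{mis}}, \psi)\, dD_{\text{mis}} \\
 &= f_{R|D}(R \mid D_{\text{obs}}, \psi)\times \text{const.}\times
     \int f_D(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\, dD_{\text{mis}}
     \qquad \text{(by MAR)} \\
 &= L_{\text{rest}}(\psi \mid D_{\text{obs}}, R)\times
     L_{\text{ign}}(\theta \mid D_{\text{obs}}, R).
\end{aligned}
```

Distinctness (b) then says that the factor $L_{\text{rest}}$ carries no information about $\theta$, so nothing is lost by dropping it.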
“Sufficient for ignorable” is not the same as “ignorable”
• These definitions have come to define ignorability (e.g. Little and Rubin 2002)
• However, Rubin (1976) described (a) and (b) as the "weakest simple and general conditions under which it is always appropriate to ignore the process that causes missing data".
• These conditions are not necessary for ignoring the mechanism in all situations.
MAR + distinctness $\Rightarrow$ ignorable
ignorable $\nRightarrow$ MAR + distinctness
Example 1: Nonresponse with auxiliary data
$D_{\text{obs}} = (D_{\text{resp}}, D_{\text{aux}})$
$D_{\text{resp}} = \{(y_{i1}, y_{i2}),\ i = 1,\dots,m\}$, $D_{\text{aux}} = \{y^*_{j1},\ j = 1,\dots,n\}$ (or whole population $N$)
[Data pattern: columns Y1 (auxiliary, not linked), R, Y1, Y2; Y1 and Y2 are missing (?) for nonrespondents]
$(y_{i1}, y_{i2}) \sim_{\text{ind}} f(y_{i1}, y_{i2} \mid \theta)$
$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$
$D_{\text{aux}}$ includes the respondent values of $Y_1$, but we do not know which they are.
Not MAR -- $y_{i1}$ missing for nonrespondents $i$
But... mechanism is ignorable, does not need to be modeled:
• Marginal distribution of $Y_1$ estimated from $D_{\text{aux}}$
• Conditional of $Y_2$ given $Y_1$ estimated from $D_{\text{resp}}$
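A minimal simulation sketch of this example, under assumptions that are not in the slides: a bivariate normal model with $Y_2 = 1 + 2Y_1 + \text{noise}$, a logistic response probability in $Y_1$, and the function name `simulate_example1`. It compares the complete-case mean of $Y_2$ with an estimate that combines the regression of $Y_2$ on $Y_1$ among respondents with the $Y_1$ mean from the unlinked auxiliary data.

```python
import math
import random


def simulate_example1(n=20000, seed=0):
    """Return (complete-case mean of Y2, combined estimate of E[Y2]).

    Hypothetical setup: Y1 ~ N(0, 1), Y2 = 1 + 2*Y1 + N(0, 1), so E[Y2] = 1.
    Response depends on Y1 only: Pr(r = 1 | y1) = 1 / (1 + exp(-y1)).
    """
    rng = random.Random(seed)
    aux_y1 = []   # D_aux: Y1 for everyone, not linked to the respondent records
    resp = []     # D_resp: (y1, y2) pairs for respondents only
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 1.0 + 2.0 * y1 + rng.gauss(0.0, 1.0)
        aux_y1.append(y1)
        if rng.random() < 1.0 / (1.0 + math.exp(-y1)):
            resp.append((y1, y2))

    m = len(resp)
    xbar = sum(x for x, _ in resp) / m
    ybar = sum(y for _, y in resp) / m          # complete-case mean of Y2
    sxx = sum((x - xbar) ** 2 for x, _ in resp)
    sxy = sum((x - xbar) * (y - ybar) for x, y in resp)
    slope = sxy / sxx                           # conditional of Y2 given Y1, from D_resp
    intercept = ybar - slope * xbar
    aux_mean = sum(aux_y1) / len(aux_y1)        # marginal of Y1, from D_aux
    combined = intercept + slope * aux_mean
    return ybar, combined


cc_mean, combined = simulate_example1()
print(f"complete-case mean of Y2: {cc_mean:.3f}")
print(f"D_resp + D_aux estimate:  {combined:.3f}  (true value 1.0)")
```

In this setup the complete-case mean is pulled upward because respondents have larger $Y_1$, while the combined estimate stays close to the true mean without any model for the response mechanism.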
MAR, ignorability for parameter subsets
• MAR and ignorability are defined in terms of the complete set of parameters in the data model for D
• It would be useful to have a definition of MAR that applies to subsets of parameters, including parameters of substantive interest.
• A trivial example: It seems plausible that a nonignorable mechanism would be MAR for the parameters of distributions of variables that are not missing.
MAR, ignorability for parameter subsets
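$\theta = (\theta_1, \theta_2)$
Mechanism is partially MAR for likelihood inference about $\theta_1$, denoted P-MAR($\theta_1$), if:
$L(\theta_1, \theta_2, \psi \mid D_{\text{obs}}, R) = L_{\text{ign}}(\theta_1 \mid D_{\text{obs}}, R) \times L_{\text{rest}}(\theta_2, \psi \mid D_{\text{obs}}, R)$ for all $\theta_1, \theta_2, \psi$
Mechanism is IGN($\theta_1$) if P-MAR($\theta_1$) and $\theta_1$ and $(\theta_2, \psi)$ distinct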
MAR, ignorability for parameter subsets
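Special case where $\theta_1 = \theta$
Mechanism is P-MAR($\theta$) if:
$L(\theta, \psi \mid D_{\text{obs}}, R) = L_{\text{ign}}(\theta \mid D_{\text{obs}}, R) \times L_{\text{rest}}(\psi \mid D_{\text{obs}}, R)$ for all $\theta, \psi$
A consequence of (but does not imply) Rubin's MAR condition
IGN($\theta$) if P-MAR($\theta$) and $\theta$ and $\psi$ distinct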
Partial MAR given a function of mechanism
Harel and Schafer (2009) define a different kind of Partial MAR:
Mechanism is partially MAR given $g(R)$ if:
$P(R \mid Y_{\text{obs}}, Y_{\text{mis}}, g(R), \theta, \psi) = P(R \mid Y_{\text{obs}}, g(R), \theta, \psi)$ for all $R, Y_{\text{obs}}, \theta, \psi$
Here "partial" relates to the mechanism.
In my definition "partial" relates to the parameters.
These ideas seem quite distinct.
Example 1: Auxiliary Survey Data
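$D = \{(y_{i1}, y_{i2}),\ i = 1,\dots,n\}$
$D_{\text{obs}} = (D_{\text{resp}}, D_{\text{aux}})$
$D_{\text{resp}} = \{(y_{i1}, y_{i2}),\ i = 1,\dots,m\}$, $D_{\text{aux}} = \{y^*_{j1},\ j = 1,\dots,n\}$
[Data pattern: columns Y1 (auxiliary, not linked), R, Y1, Y2; Y1 and Y2 are missing (?) for nonrespondents]
$(y_{i1}, y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta)$
$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$
$D_{\text{aux}}$ includes the respondent values of $Y_1$, but we do not know which they are.
Easy to show that mechanism is P-MAR($\theta$), and IGN($\theta$) if $\theta$, $\psi$ are distinct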
Ex. 2: MNAR Monotone Bivariate Data
• Paper presents more interesting case with Y1, Y2 blocks of variables and missing data in each block
$D = \{(y_{i1}, y_{i2}),\ i = 1,\dots,n\}$
$D_{\text{obs}} = \{(y_{i1}, y_{i2}),\ i = 1,\dots,m,\ \text{and}\ y_{i1},\ i = m+1,\dots,n\}$
$(y_{i1}, y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta) = f(y_{i1} \mid \theta_1)\, f(y_{i2} \mid y_{i1}, \theta_2)$
$\Pr(r_{i2} = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, y_{i2}, \psi)$ (MNAR)
[Data pattern: columns M, Y1, Y2; Y1 fully observed, Y2 missing (?) for cases $i = m+1,\dots,n$]
COMMENT: Clearly, inference about parameters $\theta_1$ of the marginal distribution of $Y_1$ can ignore mechanism, since $Y_1$ has no missing values.
In proposed definition, this mechanism is P-MAR($\theta_1$), and IGN($\theta_1$) if $\theta_1$ and $(\theta_2, \psi)$ distinct
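The factorization behind this comment can be written out directly. The following is a sketch, assuming $r_{i2} = 1$ marks an observed $y_{i2}$, cases $1,\dots,m$ are complete, and $\theta_1$, $\theta_2$, $\psi$ denote the parameters of the marginal of $Y_1$, the conditional of $Y_2$ given $Y_1$, and the mechanism:

```latex
\begin{aligned}
L(\theta_1,\theta_2,\psi \mid D_{\text{obs}}, R)
 &= \underbrace{\prod_{i=1}^{n} f(y_{i1} \mid \theta_1)}_{L_{\text{ign}}(\theta_1 \mid D_{\text{obs}},\, R)} \\
 &\quad\times
 \underbrace{\prod_{i=1}^{m} f(y_{i2} \mid y_{i1}, \theta_2)\, g(y_{i1}, y_{i2}, \psi)
 \prod_{i=m+1}^{n} \int f(y_{i2} \mid y_{i1}, \theta_2)\,
 \bigl[1 - g(y_{i1}, y_{i2}, \psi)\bigr]\, dy_{i2}}_{L_{\text{rest}}(\theta_2,\, \psi \mid D_{\text{obs}},\, R)} .
\end{aligned}
```

The first factor involves only $\theta_1$ and uses all $n$ values of $Y_1$; the nonignorable part of the problem is confined to the second factor.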
More generally…
$(Y_1, R^{(1)})$, $(Y_2, R^{(2)})$ blocks of incomplete variables, and
$f(y_{i1}, y_{i2}, r_i^{(1)}, r_i^{(2)}) = f(y_{i1} \mid \theta_1)\, \Pr(r_i^{(1)} \mid y_{i1}, \psi_1) \times f(y_{i2} \mid y_{i1}, \theta_2)\, \Pr(r_i^{(2)} \mid r_i^{(1)}, y_{i1}, y_{i2}, \psi_2)$
Assume: $\Pr(r_i^{(1)} \mid y_{i1}; \psi_1) = g_1(y_{i1,\text{obs}}, \psi_1)$ for all $y_{i1,\text{mis}}$,
$\Pr(r_i^{(2)} \mid r_i^{(1)}, y_{i1}, y_{i2}; \psi_2) = g_2(r_i^{(1)}, y_{i1}, y_{i2}, \psi_2)$
Mechanism is P-MAR($\theta_1$), IGN($\theta_1$) if $\theta_1$ and $(\theta_2, \psi_1, \psi_2)$ are distinct
Ex. 3: Complete Case Analysis in Regression
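$D = \{(y_{i1}, y_{i2}),\ i = 1,\dots,n\}$
$D_{\text{obs}} = \{(y_{i1}, y_{i2}),\ i = 1,\dots,m\}$; $(y_{i1}, y_{i2}) \sim f(y_{i1}, y_{i2} \mid \theta)$
$\Pr(r_i = 1 \mid y_{i1}, y_{i2}, \psi) = g(y_{i1}, \psi)$
[Data pattern: columns R, Y1, Y2; both Y1 and Y2 are missing (?) for incomplete cases]
Let $f(y_{i1}, y_{i2} \mid \theta) = f(y_{i1} \mid \theta_1)\, f(y_{i2} \mid y_{i1}, \theta_2)$
$L(\theta_1, \theta_2, \psi \mid D_{\text{obs}}, R) = \text{const.} \times L_1(\theta_2 \mid D_{\text{obs}}) \times L_2(\theta_1, \psi \mid D_{\text{obs}}, R)$,
where $L_1(\theta_2 \mid D_{\text{obs}}) = \prod_{i=1}^{n} f(y_{i2} \mid y_{i1}, \theta_2)^{r_i}$
MNAR, but P-MAR($\theta_2$), and IGN($\theta_2$) if $\theta_2$, $(\theta_1, \psi)$ distinct
MNAR, but inference about parameters of conditional distribution of $Y_2$ given $Y_1$ based on complete cases is valid, ignoring the mechanism.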
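A small simulation sketch of this claim, with assumptions that are not in the slides: a normal linear model $Y_2 = 1 + 2Y_1 + \text{noise}$, a logistic selection probability, and the helper names `ols` and `complete_case_fit`. Complete cases are kept with probability depending on $Y_1$ (the setting of this example) or, for contrast, on $Y_2$ (a setting the example does not cover).

```python
import math
import random


def ols(pairs):
    """Least-squares intercept and slope for a list of (x, y) pairs."""
    n = len(pairs)
    xbar = sum(x for x, _ in pairs) / n
    ybar = sum(y for _, y in pairs) / n
    sxx = sum((x - xbar) ** 2 for x, _ in pairs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def complete_case_fit(depends_on, n=20000, seed=1):
    """Fit Y2 on Y1 using complete cases only.

    Hypothetical model: Y1 ~ N(0, 1), Y2 = 1 + 2*Y1 + N(0, 1).
    A case is complete with probability logistic(driver), where the driver
    is Y1 (depends_on="y1") or Y2 (depends_on="y2").
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 1.0 + 2.0 * y1 + rng.gauss(0.0, 1.0)
        driver = y1 if depends_on == "y1" else y2
        if rng.random() < 1.0 / (1.0 + math.exp(-driver)):
            kept.append((y1, y2))
    return ols(kept)


for mechanism in ("y1", "y2"):
    intercept, slope = complete_case_fit(mechanism)
    print(f"missingness depends on {mechanism}: "
          f"intercept={intercept:.2f}, slope={slope:.2f}")
```

With selection on $Y_1$ the complete-case fit recovers the generating intercept and slope (1 and 2); with selection on the outcome $Y_2$ the slope is attenuated, which is why the result is stated for mechanisms of the form $g(y_{i1}, \psi)$.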
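Ex. 4: A normal pattern-mixture model
$D_{\text{obs}} = \{(y_{i1}, y_{i2}),\ i = 1,\dots,m,\ \text{and}\ y_{i1},\ i = m+1,\dots,n\}$
$f_{D,R}(D, R_2 \mid \phi, \pi) = f_{D|R}(D \mid R_2, \phi)\, f_R(R_2 \mid \pi)$
$(y_{i1}, y_{i2} \mid r_{i2} = j, \phi) \sim_{\text{ind}} G(\mu^{(j)}, \Sigma^{(j)})$, $j = 0, 1$; $r_{i2} \sim_{\text{ind}} \text{Bern}(\pi)$
Assume $\Pr(r_{i2} = 1 \mid y_{i1}, y_{i2}) = g(y_{i2})$, $g$ unknown (MNAR)
[Data pattern: columns R2, Y1, Y2; Y1 fully observed, Y2 missing (?) for cases $i = m+1,\dots,n$]
COMMENT: Distribution of $Y_1$ given $Y_2$ and $R_2$ is independent of $R_2$, so it can be estimated from complete cases, ignoring the mechanism
$\theta_{1\cdot 2}$ = parameters (intercept, slope, residual variance) of the normal regression of $Y_1$ on $Y_2$
$L(\phi, \pi \mid D_{\text{obs}}, R_2) = \text{const.} \times L_1(\theta_{1\cdot 2} \mid D_{\text{obs}}, R_2) \times L_2(\phi, \pi \mid D_{\text{obs}}, R_2)$, where
$L_1(\theta_{1\cdot 2} \mid D_{\text{obs}}) = \prod_{i=1}^{m} f(y_{i1} \mid y_{i2}, \theta_{1\cdot 2})$
MNAR, but P-MAR($\theta_{1\cdot 2}$), not IGN($\theta_{1\cdot 2}$) since $\theta_{1\cdot 2}$ and the parameters of $L_2$ are not distinct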
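The independence claim in the COMMENT follows from Bayes' rule. This is a sketch in generic notation, with $f$ the marginal (mixture) density of $(Y_1, Y_2)$ and $\tilde g(y_2)$ standing for $g(y_2)$ when $r_2 = 1$ and $1 - g(y_2)$ when $r_2 = 0$:

```latex
f(y_1 \mid y_2, r_2)
 = \frac{f(y_1, y_2)\,\tilde g(y_2)}
        {\int f(y_1', y_2)\,\tilde g(y_2)\, dy_1'}
 = \frac{f(y_1, y_2)}{f(y_2)}
 = f(y_1 \mid y_2).
```

Because the mechanism depends on $y_2$ alone, $\tilde g(y_2)$ cancels, so the regression of $Y_1$ on $Y_2$ is the same in both patterns and can be estimated from the complete cases.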
Ex. 5: Subsample ignorable likelihood
• Interest concerns parameters of regression of Y on (Z,X,W)
• Z complete, W and (X,Y) incomplete. W complete in P1.
• Division of covariates into W, X is based on following MNAR assumptions about the missing data mechanism:
  – Pr(W complete) = fn(W,X,Z) (not Y)
  – (X,Y) MAR in subsample with W fully observed (that is, P1)

Pattern   Z   W   X   Y
P1        √   √   ?   ?
P2        √   ?   ?   ?

Columns could be vectors; √ = fully observed, ? = observed or missing

This mechanism is P-MAR($\theta_1$), where $\theta_1$ denotes the parameters of the regression of Y on (Z,X,W); corresponding analysis is to apply an ignorable likelihood method, discarding data in P2
Little and Zhang (2011)
Ex. 6: Auxiliary data, survey nonresponse
$D = \{(y_{i1}, y_{i2}, y_{i3}),\ i = 1,\dots,n\}$
$D_{\text{obs}} = (D_{\text{resp}}, D_{\text{aux}})$
$D_{\text{resp}} = \{(y_{i1}, y_{i2}, y_{i3}),\ i = 1,\dots,r,\ (y_{i1}),\ i = r+1,\dots,n\}$
$D_{\text{aux}} = \{y^*_{j2},\ j = 1,\dots,N\}$, $N$ = population size
[Data pattern: columns Y2 (auxiliary, not linked), Y1, Y2, Y3; rows 1..r..n..N; ? marks missing values]
$(y_{i1}, y_{i2}, y_{i3}) \sim f(y_{i1}, y_{i2}, y_{i3} \mid \theta)$
$\Pr(m_i = 1 \mid y_{i1}, y_{i2}, y_{i3}, \psi) = g(y_{i1}, y_{i2}, \psi)$
NOT MAR -- $y_{i2}$ missing for nonrespondents
But mechanism is P-MAR($\theta$) if $g(y_{i1}, y_{i2}, \psi)$ additive function of $(y_{i1}, y_{i2})$
• Marginal of $Y_2$ from $D_{\text{aux}}$, marginal of $Y_1$ from $D_{\text{resp}}$
• Conditional of $Y_3$ given $Y_1$, $Y_2$ from complete cases in $D_{\text{resp}}$
Simulation Study
$[Y_1, Y_2, Y_3, M] = [Y_1, Y_2]\,[Y_3 \mid Y_1, Y_2]\,[M \mid Y_1, Y_2, Y_3]$
• $[Y_1, Y_2] \sim$ multinomial
• $[Y_3 \mid Y_1, Y_2]$ generated as $\text{logit}\,\Pr(Y_3 = 1 \mid Y_1, Y_2) = 0.5 + \alpha_1 Y_1 + \alpha_2 Y_2 + \alpha_{12} Y_1 * Y_2$
• $[M \mid Y_1, Y_2]$ generated as $\text{logit}\,\Pr(M = 1 \mid Y_1, Y_2) = 0.5 + \beta_1 Y_1 + \beta_2 Y_2 + \beta_{12} Y_1 * Y_2$
• Each $\alpha_j$, $\beta_j$ set to zero or two (various combinations)
• $N = 100{,}000$; $n = 200$, 1000 and 10,000
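A sketch of how such a population could be generated. Several details are assumptions rather than facts from the slides: $Y_1$ and $Y_2$ are taken to be binary with equal cell probabilities, the coefficient triples are passed as `alpha` and `beta` (main effects, then interaction), and the function name `generate_population` is illustrative.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def generate_population(N, alpha, beta, seed=0,
                        cell_probs=(0.25, 0.25, 0.25, 0.25)):
    """Generate N records (y1, y2, y3, m).

    alpha = (a1, a2, a12) drives logit Pr(Y3 = 1 | Y1, Y2);
    beta  = (b1, b2, b12) drives logit Pr(M = 1 | Y1, Y2).
    cell_probs are assumed multinomial probabilities for the four
    (Y1, Y2) cells in the order (0,0), (0,1), (1,0), (1,1).
    """
    rng = random.Random(seed)
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    population = []
    for _ in range(N):
        y1, y2 = rng.choices(cells, weights=cell_probs)[0]
        eta_y3 = 0.5 + alpha[0] * y1 + alpha[1] * y2 + alpha[2] * y1 * y2
        eta_m = 0.5 + beta[0] * y1 + beta[1] * y2 + beta[2] * y1 * y2
        y3 = int(rng.random() < expit(eta_y3))
        m = int(rng.random() < expit(eta_m))
        population.append((y1, y2, y3, m))
    return population


# One design point: additive effects on Y3, indicator M unrelated to (Y1, Y2).
pop = generate_population(20000, alpha=(2, 2, 0), beta=(0, 0, 0))
print(sum(rec[3] for rec in pop) / len(pop))  # close to expit(0.5)
```

Setting the interaction coefficient in `beta` to 2 gives the design points where the indicator depends on the $Y_1 * Y_2$ interaction.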
Simulation Study: methods
CC: Complete Case estimates based on the responding units
M1: ML based on a logistic regression with interaction for Y3
M2: ML based on an additive logistic regression for Y3
NR: Weighting class estimates where nonresponse weights are obtained based on Y1
PS: Post-stratification weighted estimates based on Y2
NRPS: Adjust weights using both Y1 and Y2. For categorical variables, this method is equivalent to linear calibration regression, or generalized raking estimates
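To make the NR method concrete, here is a minimal sketch of a weighting-class estimate of a mean. It assumes the usual construction in which each respondent is weighted by the inverse of the response rate in its class; the function name `weighting_class_mean` and the toy data are illustrative, not taken from the study.

```python
from collections import defaultdict


def weighting_class_mean(units):
    """Weighting-class estimate of the mean of y.

    units: iterable of (cls, responded, y), where y is ignored for
    nonrespondents. Each respondent in class c gets weight
    (sampled units in c) / (respondents in c).
    """
    sampled = defaultdict(int)
    responded_count = defaultdict(int)
    for cls, responded, _ in units:
        sampled[cls] += 1
        if responded:
            responded_count[cls] += 1

    num = den = 0.0
    for cls, responded, y in units:
        if responded:
            w = sampled[cls] / responded_count[cls]
            num += w * y
            den += w
    return num / den


# Class "a": 4 sampled, 2 respond (y = 1). Class "b": 4 sampled, all respond (y = 0).
toy = [("a", True, 1), ("a", True, 1), ("a", False, None), ("a", False, None),
       ("b", True, 0), ("b", True, 0), ("b", True, 0), ("b", True, 0)]
print(weighting_class_mean(toy))  # 0.5, versus 1/3 for the unweighted respondent mean
```

In the study's NR method the classes would be the levels of Y1; PS replaces the sample counts with known population counts by level of Y2.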
Simulation: summary findings
• When response depends on Y1*Y2 interaction, all methods do poorly
• When data are MCAR, all methods do similarly well
• Model-based methods remove almost all the bias and perform better when response doesn’t depend on Y1*Y2 interaction
• Qualitative patterns hold for different sample sizes
Frequentist inference
• Rubin’s (1976) sufficient conditions for ignorability for frequentist inference were even stronger (essentially MCAR)
• These can be weakened too: for example, asymptotic frequentist inference based on ML and observed information matrix works under conditions given here
• Small sample inference seems more problematic
Summary
• Proposed definitions of partial MAR, ignorability for subsets of parameters
• Expands range of situations where missing data mechanism can be ignored
• Though, in some cases, MAR analysis entails a loss of information
  – How much is lost is an interesting question, varies by context
References
Harel, O. and Schafer, J. L. (2009). Partial and latent ignorability in missing data problems. Biometrika, 2009, 1-14.
Little, R. J. A. (1993). Pattern-mixture models for multivariate incomplete data. JASA, 88, 125-134.
Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data (2nd ed.). Wiley.
Little, R. J. and Zangeneh, S. Z. (2013). Missing at random and ignorability for inferences about subsets of parameters with missing data. University of Michigan Biostatistics Working Paper Series.
Little, R. J. and Zhang, N. (2011). Subsample ignorable likelihood for regression analysis with missing data. JRSSC, 60, 4, 591-605.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581-592.