Laplace approximation in measurement error models
Michela Battauz*
Department of Economics and Statistics, University of Udine, Udine, Italy
Received 28 April 2010, revised 8 November 2010, accepted 7 February 2011
Likelihood analysis for regression models with measurement errors in explanatory variables typically involves integrals that do not have a closed-form solution. In this case, numerical methods such as Gaussian quadrature are generally employed. However, when the dimension of the integral is large, these methods become computationally demanding or even unfeasible. This paper proposes the use of the Laplace approximation to deal with measurement error problems when the likelihood function involves high-dimensional integrals. The cases considered are generalized linear models with multiple covariates measured with error and generalized linear mixed models with measurement error in the covariates. The asymptotic order of the approximation and the asymptotic properties of the Laplace-based estimator for these models are derived. The method is illustrated using simulations and real-data analysis.
Keywords: Generalized linear mixed model; Generalized linear model; Laplace approximation; Measurement error; Random effect.
Supporting Information for this article is available from the author or on the WWW under http://dx.doi.org/10.1002/bimj.201000095
1 Introduction
Measurement error is a common problem in many areas of statistical analysis, and a large part of the literature focuses on medical and biological applications (see Carroll et al., 2006). Accounting for measurement error is particularly important because ignoring it leads to biased parameter estimates and a loss of power for detecting relationships among variables (Carroll et al., 2006, Section 1.1). The literature provides a variety of statistical methods to correct for the presence of measurement error and distinguishes between structural and functional modeling depending on whether the true variables are modeled parametrically or not (Carroll et al., 2006, Section 2.1). Likelihood methods generally require strong distributional assumptions, but they can be applied to more general models and provide a gain in efficiency. However, the evaluation of the log-likelihood function often involves integrals with respect to the latent variables that have no closed-form solution. In these cases, numerical methods such as Gaussian quadrature are employed. Nevertheless, when high-dimensional integrals are involved, such methods become computationally demanding or even unfeasible. Two such cases are considered in this paper.
The first one concerns generalized linear models with multiple covariates measured with error, where the dimension of the integral depends on the number of error-prone variables. Schafer (1987) proposes the use of Gauss–Hermite quadrature in maximum likelihood analysis to approximate integrals. As pointed out by Higdon and Schafer (2001), the inclusion of more than one explanatory variable measured with error is considerably more difficult because multidimensional quadrature is quite cumbersome. Monte Carlo integration may be more reasonable, but it may require a large Monte Carlo sample size and thus become time consuming. Liu and Pierce (1994) consider the Laplace approximation in a logistic regression model with one explanatory variable measured with error. They found that the Laplace approximation is very accurate; however, the asymptotic properties of the estimator are not established.
*Corresponding author: e-mail: [email protected], Phone: +39-0432-24-95-81, Fax: +39-0432-24-95-95
© 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim
Biometrical Journal 53 (2011) 3, 411–425 DOI: 10.1002/bimj.201000095
The second case regards generalized linear mixed models with measurement error in the covariates. Mixed models are widely used in statistical applications since they are well suited for the analysis of longitudinal panel data and clustered cross-sectional data (for a broad review see for example McCulloch and Searle, 2001). Maximum likelihood estimation requires integration over the random effects distribution. Various numerical techniques are proposed in the literature for this task, including Laplace approximation (Liu and Pierce, 1993), Gaussian quadrature (Anderson and Aitkin, 1985) and Monte Carlo methods (McCulloch, 1994, 1997). The computational complexity involved in generalized linear mixed models makes the treatment of measurement error in the covariates rather challenging. Wang et al. (1998) analyze the bias in parameter estimation when the measurement error is not properly taken into account and propose the SIMEX method (Cook and Stefanski, 1994) for measurement error correction. Wang et al. (1999) showed that a naive application of regression calibration is not suitable for generalized linear mixed models with measurement error in the covariates. When a likelihood-based approach is used in order to adjust for measurement error in mixed models, integration over both the random effects and the true variables is required. Let $q$ be the number of random effects, $p_1$ the number of variables measured with error and $n_i$ the number of units within cluster $i$. Since the observations from the same cluster are correlated, the dimension of the integral is equal to $q + p_1 \times n_i$.
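To give a feel for why this dimension matters, the following minimal sketch counts the evaluation points a tensor-product quadrature rule would need for one cluster. The function names and the choice of 10 nodes per dimension are illustrative assumptions, not taken from the paper.

```python
def integral_dimension(q, p1, n_i):
    """Dimension of the cluster-level integral: q random effects plus
    p1 error-prone covariates for each of the n_i units in the cluster."""
    return q + p1 * n_i


def product_rule_nodes(nodes_per_dim, dimension):
    """Number of evaluation points of a tensor-product quadrature rule."""
    return nodes_per_dim ** dimension


# One random intercept and one error-prone covariate, clusters of size 5 and 10.
for n_i in (5, 10):
    d = integral_dimension(q=1, p1=1, n_i=n_i)
    print(n_i, d, product_rule_nodes(10, d))
```

The count grows exponentially in the cluster size, which is the practical obstacle that motivates replacing quadrature with a single mode-based approximation per cluster.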
This paper proposes the use of the Laplace approximation for dealing with high-dimensional integrals in measurement error models when a likelihood-based approach is followed. The Laplace approximation is as follows (Barndorff-Nielsen and Cox, 1989, p. 170)

$$\int_{\mathbb{R}^q} e^{-r(x)}\,dx \approx (2\pi)^{q/2}\,|r''(\hat{x})|^{-1/2}\,e^{-r(\hat{x})}, \tag{1}$$

where $x$ is a $q$-dimensional parameter vector and $r(x)$ is a smooth function on $\mathbb{R}^q$ with a unique minimum at $\hat{x}$. Generally, $r(x)$ is of order $O(n)$ and the typical case is $r(x) = n r_1(x)$. For fixed $q < \infty$, the order of accuracy of the Laplace approximation is then $O(1/n)$. In the case considered here, an unusual type of Laplace approximation results as the order of the function $r(x)$ does not only depend on a single parameter $n$, but also on the measurement error variances. The accuracy of the approximation then requires a specific investigation.
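The $O(1/n)$ behaviour in the typical case can be checked numerically in one dimension. The sketch below is only an illustration under assumed choices: the function $r_1(x) = x^2/2 + x^4/4$ is a hypothetical example (not from the paper) with its minimum at $\hat{x} = 0$ and $r_1''(0) = 1$, and the reference value comes from a plain composite Simpson rule.

```python
import math


def r1(x):
    """Illustrative smooth function with a unique minimum at x = 0 (assumed example)."""
    return 0.5 * x**2 + 0.25 * x**4


def laplace_integral(n):
    """Eq. (1) with q = 1 for r(x) = n * r1(x): x_hat = 0, r(x_hat) = 0, r''(x_hat) = n."""
    return math.sqrt(2.0 * math.pi / n)


def simpson_integral(n, half_width_sd=10.0, intervals=4000):
    """Composite Simpson rule for the integral of exp(-n * r1(x)) on [-L, L],
    with L = half_width_sd / sqrt(n); the tails beyond L are negligible."""
    L = half_width_sd / math.sqrt(n)
    h = 2.0 * L / intervals
    total = 0.0
    for k in range(intervals + 1):
        x = -L + k * h
        weight = 1 if k in (0, intervals) else (4 if k % 2 else 2)
        total += weight * math.exp(-n * r1(x))
    return total * h / 3.0


def relative_error(n):
    """Relative error of the Laplace approximation against the Simpson reference."""
    return laplace_integral(n) / simpson_integral(n) - 1.0


for n in (10, 100, 1000):
    print(n, relative_error(n), n * relative_error(n))
```

If the error is of order $1/n$, the last printed column should stabilise as $n$ grows, which is what this example is meant to show.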
Ko and Davidian (2000) have found successful use of the Laplace approximation in nonlinear mixed effects models with measurement error in the covariates, thus encouraging the study of the method for generalized linear mixed models. Although they conjecture a non-negligible bias in this case, an analytical study of the accuracy of the approximation or simulation results are not provided. Furthermore, a complete investigation of the asymptotic order of the Laplace approximation for measurement error correction and the asymptotic properties of the resulting estimator is lacking in the literature.
The paper is organized as follows. The models and some notation are introduced in Section 2. The estimation procedure, the asymptotic order of the approximation and the asymptotic properties of the estimator are presented in Section 3. Section 4 regards some simulation studies, Section 5 illustrates the method using real data and Section 6 contains some concluding remarks. Finally, mathematical details are given in the Appendices.
2 Model specifications
2.1 Generalized linear measurement error model
Let $Y_i$ be the response variable on unit $i = 1, \ldots, m$. Let $X_i$ denote a vector of true covariates with dimension $p_1$ and $Z_i$ denote a vector of covariates measured without error with dimension $p_2$. It is assumed that observations from different units are independent. Suppose that the responses come from a generalized linear model (McCullagh and Nelder, 1999) with density

$$f_{Y_i|X_i}(y_i|x_i) = \exp\{[y_i\gamma_i - b(\gamma_i)]/\tau^2 - c(y_i, \tau)\}, \tag{2}$$

mean $E(Y_i) = \mu_i$ and variance $\mathrm{Var}(Y_i) = \tau^2 v(\mu_i)$. The mean is related to the linear predictor through the link function $g$

$$g(\mu_i) = \beta_0 + X_i^\top\beta_x + Z_i^\top\beta_z,$$

where $\beta_0$ is the intercept, $\beta_x$ is a $p_1$-vector of coefficients of $X_i$ and $\beta_z$ is a $p_2$-vector of coefficients of $Z_i$. In this paper, the additive measurement error structure is assumed for the measurement error model, hence

$$W_i = X_i + U_i,$$

where $W_i$ is a $p_1$-vector of measurements, $U_i$ is a $p_1$-vector of measurement errors independent of $X_i$ that follow a normal distribution with zero mean and variance $\Sigma_u = \Sigma_u(\psi_u)$, and $\psi_u$ is a vector of parameters. In general, measurement error problems require the availability of extra information about the measurement error distribution (Carroll et al., 2006, Section 2.3). In this paper, the measurement error variance $\Sigma_u$ is assumed to be known. Conditional on $X_i$ and $Z_i$, the distribution of $Y_i$ is assumed independent of $W_i$, which means that measurement error is non-differential (Carroll et al., 2006, Section 2.5). Given the true variables $X_i$, the measurements $W_i$ are assumed independent of any other variable. The structural model for the true variables is

$$X_i = \alpha Z_i + \varepsilon_i,$$

where $\alpha$ is a $p_1 \times p_2$ matrix of coefficients and $\varepsilon_i$ is a $p_1$-vector of residuals that follow a normal distribution with zero mean and variance matrix $\Sigma_x = \Sigma_x(\psi_x)$, and $\psi_x$ is a vector of parameters.
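As a concrete instance of this specification, the following sketch draws data from a logistic model with a single error-prone covariate. It is a simplified illustration under stated assumptions: $p_1 = 1$, no error-free covariates (so the structural model reduces to $X_i = \varepsilon_i$), and a logit link; the function name and arguments are hypothetical.

```python
import math
import random


def simulate_logistic_me(m, beta0, beta_x, sigma_x, sigma_u, seed=0):
    """Draw (Y_i, W_i, X_i) for i = 1..m with a scalar error-prone covariate.

    Assumptions of this sketch: no Z_i, logit link, known sigma_u."""
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        x = rng.gauss(0.0, sigma_x)        # structural model: X_i = eps_i
        w = x + rng.gauss(0.0, sigma_u)    # additive error: W_i = X_i + U_i
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta_x * x)))
        y = 1 if rng.random() < p else 0   # response depends on X_i only (non-differential error)
        data.append((y, w, x))
    return data


sample = simulate_logistic_me(m=300, beta0=0.0, beta_x=1.0, sigma_x=1.0, sigma_u=0.5)
print(sample[:3])
```

In an analysis only the pairs $(Y_i, W_i)$ would be observed; $X_i$ is returned here so that a benchmark fit on the true covariate remains possible.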
2.2 Generalized linear mixed measurement error model
Let $Y_{ij}$ denote the response variable of unit $j$ in cluster $i$, where $j = 1, \ldots, n_i$ and $i = 1, \ldots, m$. The vector of $p_1$ variables measured with error is denoted $X_{ij}$ and the vector of $p_2$ variables measured without error is denoted $Z_{ij}$. Conditional on the random effects, the responses come from a generalized linear model with predictors

$$\eta_{ij} = \beta_0 + X_{ij}^\top\beta_x + Z_{ij}^\top\beta_z + S_{ij}^\top b_i,$$

where $\beta_x$ and $\beta_z$ are vectors of coefficients with dimensions $p_1$ and $p_2$, $S_{ij}$ is a subset of $Z_{ij}$ with dimension $q$, $b_i$ is a vector of random effects with dimension $q$ that follows a normal distribution with zero mean and variance $D = D(\psi_b)$, and $\psi_b$ is a vector of parameters. The random effects are independent of any other variable. The conditional means of the response variable are denoted by $\mu_{ij}$, satisfying $g(\mu_{ij}) = \eta_{ij}$ for some link function $g$, and the conditional variance is $\tau^2 v(\mu_{ij})$. The density of $Y_{ij}$ is

$$f_{Y_{ij}|b_i,X_{ij}}(y_{ij}|b_i, x_{ij}) = \exp\{[y_{ij}\gamma_{ij} - b(\gamma_{ij})]/\tau^2 - c(y_{ij}, \tau)\}. \tag{3}$$

Note that the situation considered here concerns the variables measured with error associated with fixed coefficients. The non-differential measurement error assumption is made. The specification of the model is completed by defining the measurement error model, that is

$$W_{ij} = X_{ij} + U_{ij},$$

where $W_{ij}$ is a $p_1$-vector of measurements and $U_{ij}$ is a $p_1$-vector of normally distributed measurement errors with zero mean and variance $\Sigma_u = \Sigma_u(\psi_u)$. The measurement errors are assumed independent of the true variables and independent across units. Measurement error variances are assumed known. Furthermore, conditional on $X_i$, the measurements are assumed independent of other variables. Two cases are considered for the model for $X_{ij}$: the homogeneous and the heterogeneous cases (Wang et al., 1998). In the homogeneous case, the true variables are marginally independent, that is

$$X_{ij} = \alpha Z_{ij} + \varepsilon_{ij},$$

where $\alpha$ is a $p_1 \times p_2$ matrix of coefficients and $\varepsilon_{ij}$ is a $p_1$-vector of residuals that follow a normal distribution with zero mean and variance matrix $\Sigma_x$, independent across units. In the heterogeneous case the model for the true variables presents random coefficients. Here, we consider the case of a random intercept model

$$X_{ij} = \alpha Z_{ij} + a_i + \varepsilon_{ij},$$

where the $a_i$ are independent $N(0, \sigma_a^2)$ and independent of other variables, and $\varepsilon_{ij}$ is a $p_1$-vector distributed as $N(0, \Sigma_x)$, independent across units. In both cases, the vector of parameters entering $\Sigma_x$ is $\psi_x$.
3 Maximum likelihood estimation using the Laplace approximation
3.1 Likelihood functions
The likelihood function for generalized linear measurement error models can be expressed as

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{p_1}} f(Y_i|X_i, Z_i)\, f(W_i|X_i)\, f(X_i|Z_i)\, dX_i = \prod_{i=1}^{m} \int_{\mathbb{R}^{p_1}} \exp\{-r(X_i)\}\, dX_i, \tag{4}$$

where $r(X_i) = -\log f(Y_i|X_i, Z_i) - \log f(W_i|X_i) - \log f(X_i|Z_i)$ and $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \mathrm{vec}(\alpha)^\top, \psi_x^\top)^\top$ is the vector of all the parameters. Let $\hat{X}_i$ denote the minimizer of $r(X_i)$ satisfying the equation $r'(\hat{X}_i) = 0$. The Laplace approximation gives the following expression for the log-likelihood function:

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} r(\hat{X}_i) - \frac{1}{2} \sum_{i} \log|r''(\hat{X}_i)|, \tag{5}$$

where $\tilde{\ell}(\theta)$ denotes the approximated log-likelihood function and

$$r''(\hat{X}_i) = \frac{1}{\tau^2}\,\omega_i \beta_x \beta_x^\top + \Sigma_u^{-1} + \Sigma_x^{-1}, \tag{6}$$

where $\omega_i = [v(\mu_i)\, g_\mu^2(\mu_i)]^{-1}$ and $g_\mu(\mu_i)$ is the first derivative of $g(\mu_i)$ with respect to $\mu_i$. For details see Appendix A. The constant term is omitted from Eq. (5) and from all the following likelihood functions.
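A minimal sketch of Eqs. (5) and (6) for one unit is given below, under simplifying assumptions that are not the paper's general setting: a logistic model ($\tau^2 = 1$, $\omega_i = \mu_i(1-\mu_i)$), a scalar error-prone covariate, and a structural mean `mu_x` standing in for $\alpha Z_i$. Unlike Eq. (5), the normalizing constants are kept so that the result can be compared with a brute-force Simpson integration of Eq. (4). All function names are hypothetical and this is not the author's R/C implementation.

```python
import math


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def r_value(x, y, w, beta0, beta_x, mu_x, sigma_x, sigma_u):
    """r(X_i) = -log f(Y|X) - log f(W|X) - log f(X), normal constants included."""
    eta = beta0 + beta_x * x
    log_f_y = y * eta - math.log1p(math.exp(eta))
    log_f_w = -0.5 * math.log(2 * math.pi * sigma_u**2) - (w - x) ** 2 / (2 * sigma_u**2)
    log_f_x = -0.5 * math.log(2 * math.pi * sigma_x**2) - (x - mu_x) ** 2 / (2 * sigma_x**2)
    return -(log_f_y + log_f_w + log_f_x)


def r_second(x, beta0, beta_x, sigma_x, sigma_u):
    """Scalar version of Eq. (6) for the logit link with tau^2 = 1."""
    p = sigmoid(beta0 + beta_x * x)
    return p * (1 - p) * beta_x**2 + 1 / sigma_u**2 + 1 / sigma_x**2


def laplace_loglik_unit(y, w, beta0, beta_x, mu_x, sigma_x, sigma_u):
    """-r(X_hat) - 0.5 * log r''(X_hat) + 0.5 * log(2 * pi) for one unit."""
    x = w  # starting value for Newton's method
    for _ in range(100):
        p = sigmoid(beta0 + beta_x * x)
        grad = -(y - p) * beta_x - (w - x) / sigma_u**2 + (x - mu_x) / sigma_x**2
        step = grad / r_second(x, beta0, beta_x, sigma_x, sigma_u)
        x -= step
        if abs(step) < 1e-12:
            break
    hess = r_second(x, beta0, beta_x, sigma_x, sigma_u)
    r_hat = r_value(x, y, w, beta0, beta_x, mu_x, sigma_x, sigma_u)
    return -r_hat - 0.5 * math.log(hess) + 0.5 * math.log(2 * math.pi)


def simpson_loglik_unit(y, w, beta0, beta_x, mu_x, sigma_x, sigma_u, intervals=4000):
    """Reference value: composite Simpson rule for the log of the integral of exp(-r)."""
    precision = 1 / sigma_u**2 + 1 / sigma_x**2
    center = (w / sigma_u**2 + mu_x / sigma_x**2) / precision
    half_width = 12.0 / math.sqrt(precision)
    h = 2 * half_width / intervals
    total = 0.0
    for k in range(intervals + 1):
        x = center - half_width + k * h
        weight = 1 if k in (0, intervals) else (4 if k % 2 else 2)
        total += weight * math.exp(-r_value(x, y, w, beta0, beta_x, mu_x, sigma_x, sigma_u))
    return math.log(total * h / 3)


args = dict(y=1, w=0.3, beta0=0.0, beta_x=1.0, mu_x=0.0, sigma_x=1.0, sigma_u=0.5)
print(laplace_loglik_unit(**args), simpson_loglik_unit(**args))
```

Summing `laplace_loglik_unit` over units and maximizing over the parameters would give a Laplace-based estimate in this toy setting; with $p_1 > 1$ the Newton step and the determinant in Eq. (5) become matrix operations.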
For simplicity, in mixed models we consider a single variable measured with error, that is $p_1 = 1$. Define $Y_i = (Y_{i1}, \ldots, Y_{in_i})^\top$, $X_i = (X_{i1}, \ldots, X_{in_i})^\top$, $Z_i = (Z_{i1}, \ldots, Z_{in_i})^\top$, and $S_i$ and $W_i$ similarly. The likelihood function for generalized linear mixed measurement error models can be expressed as

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} f(Y_i|X_i, Z_i, b_i)\, f(W_i|X_i)\, f(X_i|Z_i)\, f(b_i)\, dX_i\, db_i.$$

Note that the factorization of the density $f(Y_i, W_i, X_i, b_i|Z_i)$ used here is different from that of Ko and Davidian (2000).
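In the homogeneous case, the vector of all the parameters is $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \alpha^\top, \psi_x^\top, \psi_b^\top)^\top$ and the likelihood function is

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} \left( \prod_{j=1}^{n_i} f(Y_{ij}|X_{ij}, Z_{ij}, b_i)\, f(W_{ij}|X_{ij})\, f(X_{ij}|Z_{ij}) \right) f(b_i)\, dX_i\, db_i = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} \exp\{-r(b_i, X_i)\}\, dX_i\, db_i, \tag{7}$$

where $r(b_i, X_i) = -\sum_j \left( \log f(Y_{ij}|X_{ij}, Z_{ij}, b_i) + \log f(W_{ij}|X_{ij}) + \log f(X_{ij}|Z_{ij}) \right) - \log f(b_i)$. Let $(\hat{b}_i, \hat{X}_i)^\top$ denote the minimizer of $r(b_i, X_i)$ satisfying the equation $r'(\hat{b}_i, \hat{X}_i) = 0$. The log-likelihood function is then approximated as

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} \sum_{j=1}^{n_i} r(\hat{b}_i, \hat{X}_i) - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \log|r''(\hat{b}_i, \hat{X}_i)|, \tag{8}$$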
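where $r''(b_i, X_i)$ has various components, namely

$$\begin{aligned}
\frac{\partial^2 r(b_i, X_i)}{\partial b_i\, \partial b_i^\top} &= \frac{1}{\tau^2} \sum_{j=1}^{n_i} \omega_{ij} S_{ij} S_{ij}^\top + D^{-1} = O(n_i), \\
\frac{\partial^2 r(b_i, X_i)}{\partial X_{ij}^2} &= \frac{1}{\tau^2}\, \omega_{ij} \beta_x^2 + \frac{1}{\sigma_u^2} + \frac{1}{\sigma_x^2} = O(\sigma_u^{-2}), \\
\frac{\partial^2 r(b_i, X_i)}{\partial X_{ij}\, \partial X_{ij'}} &= 0 \quad \text{for } j \neq j', \\
\frac{\partial^2 r(b_i, X_i)}{\partial b_i\, \partial X_{ij}} &= \frac{1}{\tau^2}\, S_{ij}\, \omega_{ij} \beta_x = O(1),
\end{aligned} \tag{9}$$

where $\omega_{ij} = [v(\mu_{ij})\, g_\mu^2(\mu_{ij})]^{-1}$, $g_\mu(\mu_{ij})$ is the first derivative of $g(\mu_{ij})$ with respect to $\mu_{ij}$, $\sigma_u^2$ represents the measurement error variance and $\sigma_x^2$ denotes the variance of $X_{ij}$. The asymptotic orders of the second derivatives have been specified because they will be used in Section 3.2.

In the heterogeneous case, $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \alpha^\top, \psi_x^\top, \psi_b^\top, \sigma_a^2)^\top$ and the likelihood function is

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}} \int_{\mathbb{R}^{n_i}} \left( \prod_{j=1}^{n_i} f(Y_{ij}|X_{ij}, Z_{ij}, b_i)\, f(W_{ij}|X_{ij})\, f(X_{ij}|Z_{ij}, a_i) \right) f(a_i)\, f(b_i)\, dX_i\, da_i\, db_i = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}} \int_{\mathbb{R}^{n_i}} \exp\{-r(a_i, b_i, X_i)\}\, dX_i\, da_i\, db_i, \tag{10}$$

where $r(a_i, b_i, X_i) = -\sum_j \left( \log f(Y_{ij}|X_{ij}, Z_{ij}, b_i) + \log f(W_{ij}|X_{ij}) + \log f(X_{ij}|Z_{ij}, a_i) \right) - \log f(a_i) - \log f(b_i)$. The Laplace approximation then gives

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} \sum_{j=1}^{n_i} r(\hat{a}_i, \hat{b}_i, \hat{X}_i) - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \log|r''(\hat{a}_i, \hat{b}_i, \hat{X}_i)|, \tag{11}$$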
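where $(\hat{a}_i, \hat{b}_i, \hat{X}_i)^\top$ denotes the minimizer of $r(a_i, b_i, X_i)$ and the components of $r''(a_i, b_i, X_i)$ are

$$\begin{aligned}
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i^2} &= \frac{n_i}{\sigma_x^2} + \frac{1}{\sigma_a^2} = O(n_i), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial b_i\, \partial b_i^\top} &= \frac{1}{\tau^2} \sum_{j} \omega_{ij} S_{ij} S_{ij}^\top + D^{-1} = O(n_i), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial X_{ij}^2} &= \frac{1}{\tau^2}\, \beta_x^2 \omega_{ij} + \frac{1}{\sigma_u^2} + \frac{1}{\sigma_x^2} = O(\sigma_u^{-2}), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial X_{ij}\, \partial X_{ij'}} &= 0 \quad \text{for } j \neq j', \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i\, \partial b_i} &= 0, \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i\, \partial X_{ij}} &= \frac{1}{\sigma_x^2} = O(1), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial b_i\, \partial X_{ij}} &= \frac{1}{\tau^2}\, S_{ij}\, \omega_{ij} \beta_x = O(1),
\end{aligned} \tag{12}$$

where $\omega_{ij}$ is defined as in the homogeneous case. Details are given in Appendix A.

The matrices of second derivatives in (6), (9) and (12) are exact for some models (e.g. the binomial or Poisson). In general, an expectation should be taken to obtain those expressions (see McCulloch and Searle, 2001, p. 282).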
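Because the $X_{ij}$ block of the second derivatives in (9) is diagonal, the log-determinant needed by the Laplace approximation can be obtained without forming the full matrix. The sketch below illustrates one way to do this for the homogeneous case with a single random effect ($q = 1$), using the Schur complement identity $|H| = \prod_j d_j \,(a - \sum_j c_j^2/d_j)$. This shortcut is an assumption of this example rather than a procedure described in the paper, and the function and argument names are hypothetical.

```python
import math


def hessian_blocks(omega, s, beta_x, tau2, sigma_u, sigma_x, d_var):
    """Blocks of r'' in Eq. (9) for one cluster with q = 1.

    omega, s: lists of omega_ij and S_ij over the units j of the cluster;
    d_var: variance D of the single random effect."""
    a = sum(o * sj * sj for o, sj in zip(omega, s)) / tau2 + 1.0 / d_var
    d = [o * beta_x**2 / tau2 + 1.0 / sigma_u**2 + 1.0 / sigma_x**2 for o in omega]
    c = [sj * o * beta_x / tau2 for o, sj in zip(omega, s)]
    return a, c, d


def logdet_arrow(a, c, d):
    """log-determinant of the symmetric matrix [[a, c^T], [c, diag(d)]]
    via the Schur complement of the diagonal block."""
    schur = a - sum(cj * cj / dj for cj, dj in zip(c, d))
    return sum(math.log(dj) for dj in d) + math.log(schur)


# Cluster of two units, random intercept (S_ij = 1), logistic weights at mu = 0.5.
a, c, d = hessian_blocks([0.25, 0.25], [1.0, 1.0], beta_x=2.0, tau2=1.0,
                         sigma_u=0.5, sigma_x=1.0, d_var=0.5)
print(a, c, d, logdet_arrow(a, c, d))
```

The cost is linear in $n_i$, whereas a generic determinant of the $(q + n_i)$-dimensional matrix would scale cubically.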
3.2 Asymptotic properties
Since the method proposed in this paper provides an approximation of the log-likelihood function, it is important to establish the order of this approximation and the conditions necessary to obtain consistency of the estimator based on the maximization of the approximated log-likelihood. This section therefore concerns the asymptotic order of the approximation and the asymptotic properties of the Laplace-based estimator. The vector of parameters $\theta$ is assumed to be fixed, while the measurement error variances tend to zero and the sample sizes tend to infinity. All the proofs are given in Appendix B and they proceed along the lines of Vonesh (1996), who derived the rate of convergence of the estimator based on the Laplace approximation for nonlinear mixed effects models.
In generalized linear measurement error models, the order of accuracy of the Laplace approximation depends on the magnitude of the largest measurement error variance. In particular the log-likelihood function (4) is approximated as

$$\ell(\theta) = \tilde{\ell}(\theta) + m\, O(\max_s \sigma_{us}^4),$$

where $\sigma_{us}^2$ is the measurement error variance of the $s$th variable. Let $\hat{\theta}$ be the Laplace-based maximum likelihood estimate. Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat{\theta} - \theta) = o_p(1)$, it is possible to show that

$$(\hat{\theta} - \theta) = O_p[\max(m^{-1/2}, \max_s \sigma_{us}^4)].$$

Thus the approximate maximum likelihood estimator $\hat{\theta}$ will be consistent only as both $m \to \infty$ and $\max_s \sigma_{us} \to 0$. Consistent with the findings of Vonesh (1996), the rate of convergence depends on two terms, one coming from the standard asymptotic theory and one coming from the Laplace approximation. In the case of generalized linear measurement error models, the former is $m^{-1/2}$ and the latter is $\max_s \sigma_{us}^4$.

Consider now the generalized linear mixed measurement error models. The order of accuracy of the Laplace approximation depends on both the measurement error variance and the number of units within clusters. Specifically, the log-likelihood functions (7) and (10) are approximated as

$$\ell(\theta) = \tilde{\ell}(\theta) + m\, O(\max(\sigma_u^4, \min(n_i)^{-1})).$$

Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat{\theta} - \theta) = o_p(1)$, the rate of convergence of the Laplace-based maximum likelihood estimator is

$$(\hat{\theta} - \theta) = O_p(\max(m^{-1/2}, \sigma_u^4, \min(n_i)^{-1})).$$

In order to have consistency it is then necessary that $m \to \infty$, $\sigma_u \to 0$ and $\min n_i \to \infty$. The $m^{-1/2}$ term comes from the standard asymptotic theory, while both $\sigma_u^4$ and $\min(n_i)^{-1}$ come from the Laplace approximation. This can be seen as a generalization to the measurement error case of the result obtained by Vonesh (1996).
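To see which of the terms in these rates is largest for a given design, the small sketch below evaluates them numerically. It is only a magnitude comparison: the $O_p$ statements hide unknown constants, so the labels it returns are indicative at best. The function names and dictionary keys are hypothetical.

```python
def laplace_rate_terms(m, sigma_u, n_min=None):
    """Terms entering the convergence rate: m^(-1/2), sigma_u^4 and,
    for mixed models, 1 / min(n_i). Constants are ignored."""
    terms = {"sampling": m ** -0.5, "measurement_error": sigma_u ** 4}
    if n_min is not None:
        terms["cluster_size"] = 1.0 / n_min
    return terms


def dominant_term(terms):
    """Label of the largest term."""
    return max(terms, key=terms.get)


# Generalized linear model, m = 300 units, sigma_u = 0.333.
print(laplace_rate_terms(300, 0.333))
# Mixed model, m = 50 clusters of size 5, sigma_u = 0.5.
print(laplace_rate_terms(50, 0.5, n_min=5))
```

With small clusters the $\min(n_i)^{-1}$ term tends to be the largest of the three, which fits the statement that consistency requires $\min n_i \to \infty$ in the mixed model case.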
4 Simulation studies
In order to evaluate the behavior of the estimator for generalized linear measurement error models when $m < \infty$ and $\sigma_{us} > 0$ for all $s$, and the behavior of the estimator for generalized linear mixed measurement error models when $m < \infty$, $n_i < \infty$ and $\sigma_u > 0$, some simulation studies have been conducted. The standard deviation of the true variables is set to $\sigma_x = 1$ and the measurement error standard deviations are taken constant across different variables and equal to 0.655, 0.5 and 0.333, so that the reliabilities are around 0.7, 0.8 and 0.9. Various estimators have been considered for each simulated data set. The true estimator represents a benchmark as it uses the true covariate values, while the naive estimator treats the observed covariates as measured without error, thus leading to biased estimates. In other words, the true estimator assumes that the true covariates $X_i$ and $X_{ij}$ are known in (2) and (3). The naive estimator fits models (2) and (3) using $W_i$ and $W_{ij}$ in place of the true covariates. Finally, the Laplace-based correction proposed in this paper is referred to as the adjusted method.
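The link between the measurement error standard deviations and the quoted reliabilities can be verified with the usual definition of reliability under the additive model, $\sigma_x^2/(\sigma_x^2 + \sigma_u^2)$. The helper names below are hypothetical.

```python
import math


def reliability(sigma_x, sigma_u):
    """Share of the variance of W = X + U that is due to the true variable X."""
    return sigma_x**2 / (sigma_x**2 + sigma_u**2)


def sigma_u_from_reliability(rel, sigma_x=1.0):
    """Measurement error standard deviation implied by a given reliability."""
    return sigma_x * math.sqrt((1.0 - rel) / rel)


for sigma_u in (0.655, 0.5, 0.333):
    print(sigma_u, reliability(1.0, sigma_u))
```

Inverting the relation, reliabilities of 0.7 and 0.9 with $\sigma_x = 1$ give back standard deviations that agree with 0.655 and 0.333 to three decimals.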
To implement the proposed method, a program was written in the statistical language R (R Development Core Team, 2010), with computationally intensive parts written in C code. All the code used is provided in the online Supporting Information. The true and the naive mixed models have been fitted using the R package lme4.
For the generalized linear measurement error model, a logistic regression model has been considered. Two variables measured with error have been generated. The true variables have zero mean and correlation equal to 0.5, while measurement errors are taken uncorrelated. The elements of $\beta_x$ are both set equal to 1, while the intercept $\beta_0$ is set equal to 0. No covariates measured without error have been included. The number of observations $m$ is taken equal to 100 and 300. There are 1000 simulations for each parameter setting. Table 1 reports bias and root mean-square error (RMSE) of the estimates of the coefficient of the first variable measured with error. Since the two variables are perfectly symmetric, results for the second one are omitted. Simulation results show that the bias of the adjusted estimator is very close to that of the true method. Notice that here a small-sample bias yields upward estimates, while the presence of measurement error causes a negative bias in the naive estimator. These two sources of bias work in opposite directions and, for the naive estimator, they tend to compensate each other. Increasing the measurement error variance yields an increment of the standard error of the adjusted estimates, due to the greater uncertainty of the estimation process. However, it has no effect on the bias of the adjusted estimates. This may be due to the fact that the density functions $f(W_i|X_i)$ and $f(X_i|Z_i)$ in (4) are normal, so that the Laplace approximation gives a nearly exact solution of the integral in (4) (see Liu and Pierce, 1994).
A logistic regression model with random intercepts in the homogeneous case has been considered for the generalized linear mixed measurement error model. The parameter settings have been taken similar to Wang et al. (1998). In particular, there is one variable measured with error that has zero mean. The number of clusters $m$ is taken equal to 50 and 100, and the number of units within clusters is constant across groups and it is $n_i = n = 5$ and 10. One variable measured without error has been generated from a standard normal distribution. The coefficients are $\beta_0 = 0$, $\beta_x = 2$ and $\beta_z = 1$. The variance of the random intercepts, $\psi_b$, has been taken equal to 0.5. The results are based on 1000 data sets for each set of parameter values and are displayed in Table 2.
The true and the naive methods use the Laplace approximation for the integration of the random effects. All the methods present convergence problems for some data sets. In case of convergence problems for one method, the data set has been excluded for all the methods. The tables report the number of cases presenting no convergence (n.c.) for each method, and the number of excluded data sets (excl.). Since, in some cases, different methods have problems with the same data set, the number of excluded cases is not given by the sum of cases presenting no convergence.
It is important to note that the true method provides biased parameter estimates due to the small sample size and the Laplace approximation. When the naive method is used, the measurement error is a further source of bias in parameter estimation.
Naive estimates of both $\beta_x$ and $\beta_z$ are biased downwards, with the former presenting the larger bias. Measurement error also causes an underestimation of the variance of the random intercept. The adjusted estimator presents a slight bias and, consistently with our findings, it performs better when the measurement error variance is smaller.
Further simulation studies for generalized linear mixed measurement error models have been conducted. These studies regard a logistic regression model with random intercepts in the heterogeneous case and a Poisson model with random intercepts in both the homogeneous and the heterogeneous cases. For the heterogeneous case, the random intercept variance of the structural model is $\sigma_a^2 = 1.5$. The other parameters are the same as in the first simulation. The results are reported in the online Supporting Information. The simulation of the logistic model in the heterogeneous case gives results very similar to those of the homogeneous case (Supporting Information Table S1). The Laplace approximation works very well for the Poisson model in both the homogeneous and the heterogeneous cases (Supporting Information Tables S2 and S3). Using the true method, the estimates of $\psi_b$ are slightly biased downwards, but this is a well-known result for the variance components in mixed models. The naive method presents a negative bias for $\beta_x$, while $\beta_z$ seems not to be affected by measurement error. The parameter $\psi_b$ has instead a positive bias. The adjusted method gives nearly unbiased estimates for $\beta_x$ and $\beta_z$. For the variance of the random intercepts $\psi_b$, it presents a negative bias. In the heterogeneous case when $\sigma_u = 0.655$, it is not negligible, but most of the large bias of the naive estimator is corrected.

Table 1 Simulation of logistic measurement error model.

| m | Method | σ_u | Bias | RMSE |
|---|---|---|---|---|
| 100 | True | 0.655 | 0.067 | 0.117 |
| 100 | Naive | 0.655 | -0.265 | 0.124 |
| 100 | Adjusted | 0.655 | 0.057 | 0.180 |
| 100 | True | 0.500 | 0.053 | 0.139 |
| 100 | Naive | 0.500 | -0.169 | 0.109 |
| 100 | Adjusted | 0.500 | 0.064 | 0.201 |
| 100 | True | 0.333 | 0.058 | 0.126 |
| 100 | Naive | 0.333 | -0.059 | 0.100 |
| 100 | Adjusted | 0.333 | 0.062 | 0.159 |
| 300 | True | 0.655 | 0.025 | 0.036 |
| 300 | Naive | 0.655 | -0.287 | 0.101 |
| 300 | Adjusted | 0.655 | 0.023 | 0.071 |
| 300 | True | 0.500 | 0.022 | 0.038 |
| 300 | Naive | 0.500 | -0.191 | 0.061 |
| 300 | Adjusted | 0.500 | 0.023 | 0.061 |
| 300 | True | 0.333 | 0.025 | 0.038 |
| 300 | Naive | 0.333 | -0.084 | 0.036 |
| 300 | Adjusted | 0.333 | 0.025 | 0.044 |

Table 2 Simulation of logistic model in the homogeneous case.

| m | Method | σ_u | n | β_x Bias | β_x RMSE | β_z Bias | β_z RMSE | ψ_b Bias | ψ_b RMSE | n.c. | excl. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | True | 0.655 | 5 | 0.033 | 0.310 | 0.022 | 0.230 | 0.018 | 0.494 | 0 | 1 |
| 50 | Naive | 0.655 | 5 | -0.819 | 0.840 | -0.154 | 0.247 | -0.142 | 0.397 | 1 | 1 |
| 50 | Adjusted | 0.655 | 5 | 0.020 | 0.701 | 0.024 | 0.319 | 0.059 | 1.045 | 0 | 1 |
| 50 | True | 0.655 | 10 | 0.020 | 0.205 | 0.010 | 0.155 | -0.019 | 0.281 | 1 | 1 |
| 50 | Naive | 0.655 | 10 | -0.826 | 0.836 | -0.164 | 0.210 | -0.172 | 0.270 | 0 | 1 |
| 50 | Adjusted | 0.655 | 10 | -0.056 | 0.401 | -0.003 | 0.207 | -0.074 | 0.401 | 0 | 1 |
| 50 | True | 0.500 | 5 | 0.049 | 0.321 | 0.035 | 0.237 | 0.017 | 0.487 | 0 | 2 |
| 50 | Naive | 0.500 | 5 | -0.564 | 0.608 | -0.097 | 0.226 | -0.102 | 0.394 | 2 | 2 |
| 50 | Adjusted | 0.500 | 5 | 0.038 | 0.520 | 0.030 | 0.279 | 0.009 | 0.819 | 0 | 2 |
| 50 | True | 0.500 | 10 | 0.032 | 0.210 | 0.013 | 0.152 | -0.016 | 0.285 | 0 | 1 |
| 50 | Naive | 0.500 | 10 | -0.578 | 0.597 | -0.113 | 0.176 | -0.130 | 0.259 | 1 | 1 |
| 50 | Adjusted | 0.500 | 10 | -0.010 | 0.274 | 0.006 | 0.170 | -0.059 | 0.303 | 0 | 1 |
| 50 | True | 0.333 | 5 | 0.049 | 0.316 | 0.026 | 0.221 | 0.030 | 0.519 | 2 | 2 |
| 50 | Naive | 0.333 | 5 | -0.286 | 0.386 | -0.042 | 0.210 | -0.044 | 0.447 | 0 | 2 |
| 50 | Adjusted | 0.333 | 5 | 0.037 | 0.394 | 0.025 | 0.241 | 0.009 | 0.681 | 0 | 2 |
| 50 | True | 0.333 | 10 | 0.015 | 0.207 | 0.013 | 0.152 | -0.028 | 0.286 | 0 | 0 |
| 50 | Naive | 0.333 | 10 | -0.303 | 0.349 | -0.054 | 0.148 | -0.085 | 0.266 | 0 | 0 |
| 50 | Adjusted | 0.333 | 10 | 0.008 | 0.232 | 0.011 | 0.155 | -0.044 | 0.298 | 0 | 0 |
| 100 | True | 0.655 | 5 | 0.017 | 0.224 | 0.011 | 0.164 | -0.017 | 0.329 | 0 | 7 |
| 100 | Naive | 0.655 | 5 | -0.829 | 0.840 | -0.162 | 0.212 | -0.181 | 0.297 | 7 | 7 |
| 100 | Adjusted | 0.655 | 5 | -0.080 | 0.409 | -0.014 | 0.192 | -0.138 | 0.454 | 0 | 7 |
| 100 | True | 0.655 | 10 | 0.012 | 0.145 | 0.001 | 0.105 | -0.006 | 0.202 | 0 | 0 |
| 100 | Naive | 0.655 | 10 | -0.831 | 0.836 | -0.169 | 0.190 | -0.161 | 0.221 | 0 | 0 |
| 100 | Adjusted | 0.655 | 10 | -0.090 | 0.251 | -0.018 | 0.122 | -0.081 | 0.253 | 0 | 0 |
| 100 | True | 0.500 | 5 | 0.008 | 0.212 | 0.012 | 0.152 | -0.030 | 0.322 | 3 | 3 |
| 100 | Naive | 0.500 | 5 | -0.593 | 0.612 | -0.112 | 0.176 | -0.143 | 0.293 | 3 | 3 |
| 100 | Adjusted | 0.500 | 5 | -0.054 | 0.274 | -0.004 | 0.167 | -0.118 | 0.358 | 0 | 3 |
| 100 | True | 0.500 | 10 | 0.016 | 0.145 | 0.006 | 0.105 | -0.012 | 0.205 | 0 | 1 |
| 100 | Naive | 0.500 | 10 | -0.587 | 0.596 | -0.119 | 0.152 | -0.128 | 0.210 | 1 | 1 |
| 100 | Adjusted | 0.500 | 10 | -0.032 | 0.190 | -0.005 | 0.118 | -0.066 | 0.226 | 0 | 1 |
| 100 | True | 0.333 | 5 | 0.017 | 0.217 | 0.015 | 0.161 | -0.025 | 0.338 | 0 | 7 |
| 100 | Naive | 0.333 | 5 | -0.305 | 0.355 | -0.050 | 0.158 | -0.088 | 0.308 | 7 | 7 |
| 100 | Adjusted | 0.333 | 5 | 0.000 | 0.243 | 0.012 | 0.167 | -0.066 | 0.351 | 0 | 7 |
| 100 | True | 0.333 | 10 | 0.008 | 0.145 | 0.000 | 0.105 | -0.013 | 0.200 | 0 | 1 |
| 100 | Naive | 0.333 | 10 | -0.311 | 0.335 | -0.065 | 0.118 | -0.075 | 0.195 | 1 | 1 |
| 100 | Adjusted | 0.333 | 10 | -0.004 | 0.164 | -0.001 | 0.110 | -0.035 | 0.210 | 0 | 1 |
5 Example
The use of the Laplace approximation for the estimation of a logistic random intercept model with measurement error in the covariates will be illustrated using data from the First National Health and Nutrition Examination Survey (NHANES I) and its Epidemiologic Follow-up Study (NHEFS). Briefly, between 1971 and 1975, a random sample of 23 808 persons representing the non-institutionalized US population aged 1–74 years was selected for participation in the NHANES I. Of these, 14 407 participants aged 25–74 underwent a medical examination. Follow-up data were collected in 1982–1984, 1986–1987 and 1992. This example considers the risk factors for stroke. The response indicates whether the subject had a stroke since the previous interview. The predictor variables at baseline include the following: age, sex, history of diabetes (yes, no), history of heart disease (yes, no), educational attainment (<12 years, ≥12 years), physical activity (yes, no) and long-term systolic blood pressure (SBP). The SBP reading was taken during the baseline survey and is considered an error-prone measure of the long-term SBP (Carroll et al., 2006, p. 13). As suggested by Carroll et al. (2006, p. 113), this variable was transformed into $\log(\mathrm{SBP} - 50)$ to achieve approximate normality, and the additive measurement error model is generally used in this context. Subjects who had a positive history of stroke at baseline were excluded. The sample used in this study is composed of 2634 White and Black respondents aged 45–74 years at baseline with complete data at baseline and follow-up. Other possible confounders, such as cigarette smoking, serum cholesterol, hemoglobin concentration and body mass index were not included in the model because they were not significant. Analogous to Carroll et al. (2006, Section 4.3) for the study of breast cancer and to Wang et al. (1998) for the study of coronary heart disease, a logistic model has been fitted.
To account for the longitudinal structure of the data, a random intercept at the subject level has been included. The measurement error variance is not known in this study and other sources of information are not available. Using data from the Framingham study (Carroll et al., 2006, Section 1.6.6), the reliability of $\log(\mathrm{SBP} - 50)$ has been estimated at around 0.74. For this application, the reliability ($r$) was then set equal to 0.8 and 0.7 as they represent two sensible values. As initial values for the maximization of the log-likelihood function, the SIMEX estimates have been taken. Table 3 shows the results for the Laplace-based estimator and the SIMEX method. The structural parameters $\alpha$ have been omitted because they are not of interest. Both the naive and the adjusted models show that SBP is positively associated with the risk of stroke. After measurement error adjustment, the coefficient of this variable is considerably increased. The SIMEX method gives very similar estimates to the Laplace-based estimator.
6 Discussion
The use of the Laplace approximation in measurement error models gives encouraging results. The method is applicable to problems involving high-dimensional integrals, where other solutions become impractical or computationally intensive. Furthermore, it could be a convenient choice in terms of computational time even when other methods are available.
The simulation studies show that the Laplace approximation gives very good results for generalized linear measurement error models. Obviously, the employment of the Laplace approximation for generalized linear mixed measurement error models retains all the limitations of this method for generalized linear mixed models. In particular, some care is required when the Laplace approximation is used for binary data with very small sample sizes.
This paper shows that the convergence rate of the Laplace-based maximum likelihood estimatordepends on the measurement error variances. This is not an unusual feature (see for example Cookand Stefanski, 1994) and the simulation studies show the good performance of the method evenwhen considerable measurement error variances are present.
In this paper, the measurement error variances are assumed known. However, furthertypes of additional information could be available, such as internal validation, internal replication,external validation or external replication (Carroll et al., 2006, Section 2.3) At any rate, themethodology proposed here is straightforward to extend to these types of additional information byconsidering suitable modifications of the likelihood function and applying the Laplace approx-imation to evaluate integrals. Assuming the measurement error variances as known has theadvantage of a simpler expression of the accuracy of the Laplace approximation. In fact, the orderof the accuracy of the likelihood approximation for generalized linear measurement error modelsdepends only on the measurement error variances, while, in the case of internal replication,the order of the accuracy depends on both the measurement error variances and the number ofreplications. An analogous result applies to generalized linear mixed measurement error models.
In generalized linear mixed measurement error models, only one variable measured with error has been considered. However, the Laplace approximation is well suited to multiple variables measured with error in generalized linear mixed models, and the procedure presented in this paper can be easily generalized to take this into account.
In this paper, the true predictor is assumed to be normally distributed, and this assumption may be difficult to verify. For this reason, an extension of the methodology to more robust techniques could be interesting. Flexible parametric specifications as in Carroll et al. (1999) could be promising, but they involve substantial computational complexity. The performance of the method is expected to worsen when the latent predictors are not normally distributed, and this requires a specific evaluation. At any rate, the asymptotic results presented in Section 3.2 hold for any distributional assumption.
Acknowledgements The author is grateful to Professors R. Bellio, L. Pace and P. Vidoni for their helpful suggestions. This work was supported by grants of the Italian Ministry for Education, University and Research (MIUR).
Conflict of Interest
The authors have declared no conflict of interest.
Table 3. Estimates (standard errors) for the NHANES data.

| | Naive | Laplace (r = 0.8) | Laplace (r = 0.7) | SIMEX (r = 0.8) | SIMEX (r = 0.7) |
|---|---|---|---|---|---|
| Intercept | −15.634 (1.407) | −17.333 (2.367) | −18.341 (2.737) | −17.407 (2.523) | −18.401 (2.587) |
| log(SBP − 50) | 0.893 (0.226) | 1.308 (0.528) | 1.533 (0.625) | 1.319 (0.494) | 1.534 (0.522) |
| Age | 0.083 (0.011) | 0.081 (0.013) | 0.079 (0.013) | 0.081 (0.012) | 0.080 (0.012) |
| Sex (male) | 0.267 (0.143) | 0.288 (0.199) | 0.284 (0.200) | 0.285 (0.199) | 0.294 (0.199) |
| Heart disease (yes) | 0.884 (0.381) | 0.899 (0.415) | 0.868 (0.420) | 0.870 (0.381) | 0.873 (0.408) |
| Diabetes (yes) | 1.112 (0.477) | 1.125 (0.510) | 1.107 (0.515) | 0.119 (0.469) | 1.121 (0.464) |
| Education (≥ 12 yr) | −0.422 (0.170) | −0.403 (0.197) | −0.398 (0.199) | −0.411 (0.198) | −0.404 (0.211) |
| Physical activity (none) | 0.683 (0.365) | 0.687 (0.388) | 0.702 (0.393) | 0.678 (0.364) | 0.685 (0.362) |
| Year (87) | 1.064 (0.151) | 1.056 (0.158) | 1.064 (0.159) | 1.063 (0.164) | 1.075 (0.177) |
| Year (92) | 1.701 (0.160) | 1.695 (0.169) | 1.706 (0.169) | 1.699 (0.184) | 1.718 (0.198) |
| cb | 15.130 (3.105) | 14.952 (3.442) | 15.555 (3.598) | 14.976 (3.976) | 15.522 (4.161) |
Appendix A: Technical details
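Consider the generalized linear measurement error model. The log-density functions entering in (4) are

$$\log f(Y_i \mid X_i, Z_i) = \{Y_i\gamma_i - b(\gamma_i)\}/\tau^2 - c(Y_i; \tau),$$

$$\log f(W_i \mid X_i) = -\frac{1}{2}\log|\Sigma_u| - \frac{1}{2}(W_i - X_i)^\top \Sigma_u^{-1}(W_i - X_i)$$

and

$$\log f(X_i \mid Z_i) = -\frac{1}{2}\log|\Sigma_x| - \frac{1}{2}(X_i - \alpha Z_i)^\top \Sigma_x^{-1}(X_i - \alpha Z_i).$$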
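The following identity is used to obtain the first and second derivatives of $r(X_i)$:

$$\frac{\partial \mu_i}{\partial X_i} = \frac{\partial \mu_i}{\partial g(\mu_i)}\,\frac{\partial g(\mu_i)}{\partial X_i} = \frac{1}{g_\mu(\mu_i)}\,\beta_x.$$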
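The first derivative of $r(X_i)$ is then

$$\frac{\partial r(X_i)}{\partial X_i} = -\frac{1}{\tau^2}(Y_i - \mu_i)\,\omega_i\, g_\mu(\mu_i)\,\beta_x - \Sigma_u^{-1}(W_i - X_i) + \Sigma_x^{-1}(X_i - \alpha Z_i).$$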
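The derivative above is what a Newton search for the mode of the integrand needs. The following is a minimal sketch, not the author's implementation, for a single unit with a scalar covariate, a Bernoulli response with logit link and $\tau^2 = 1$. Under these assumptions the factor $\omega_i g_\mu(\mu_i)$ is taken to be 1 (an assumption about the definition of $\omega_i$ given earlier in the paper), `m_x` stands for the mean of $X_i$ given $Z_i$, and additive constants of the log-densities are dropped. All function names and parameter values are hypothetical. The Laplace value is compared with a brute-force trapezoidal integral.

```python
import numpy as np

def r_fun(x, y, w, m_x, beta0, beta_x, sigma_u, sigma_x):
    """r(X) = -[log f(Y|X) + log f(W|X) + log f(X|Z)], additive constants dropped."""
    eta = beta0 + beta_x * x
    log_f_y = y * eta - np.logaddexp(0.0, eta)        # Bernoulli response, logit link
    log_f_w = -0.5 * (w - x) ** 2 / sigma_u ** 2       # measurement error model for W
    log_f_x = -0.5 * (x - m_x) ** 2 / sigma_x ** 2     # normal model for the true X
    return -(log_f_y + log_f_w + log_f_x)

def r_grad_hess(x, y, w, m_x, beta0, beta_x, sigma_u, sigma_x):
    """First and second derivative of r(X) (canonical link, so omega * g_mu = 1)."""
    mu = 1.0 / (1.0 + np.exp(-(beta0 + beta_x * x)))
    grad = -(y - mu) * beta_x - (w - x) / sigma_u ** 2 + (x - m_x) / sigma_x ** 2
    hess = mu * (1.0 - mu) * beta_x ** 2 + 1.0 / sigma_u ** 2 + 1.0 / sigma_x ** 2
    return grad, hess

def laplace_log_integral(n_iter=50, **p):
    """Laplace approximation to log of the integral of exp{-r(X)} over X."""
    x = p["w"]                                         # start Newton at the observed W
    for _ in range(n_iter):
        grad, hess = r_grad_hess(x, **p)
        x = x - grad / hess
    grad, hess = r_grad_hess(x, **p)
    value = -r_fun(x, **p) + 0.5 * np.log(2.0 * np.pi) - 0.5 * np.log(hess)
    return value, x, grad

def grid_log_integral(x_hat, half_width=4.0, n_points=20001, **p):
    """Reference value by the trapezoidal rule on a fine grid around the mode."""
    grid = np.linspace(x_hat - half_width, x_hat + half_width, n_points)
    f = np.exp(-r_fun(grid, **p))
    dx = grid[1] - grid[0]
    return np.log(dx * (f.sum() - 0.5 * (f[0] + f[-1])))

# Hypothetical values for one unit.
params = dict(y=1.0, w=0.3, m_x=0.0, beta0=-0.5, beta_x=1.0, sigma_u=0.3, sigma_x=1.0)
approx, x_hat, grad_at_mode = laplace_log_integral(**params)
reference = grid_log_integral(x_hat, **params)
print("Laplace:", approx, "grid:", reference, "difference:", approx - reference)
```

Because $r''(X_i)$ is positive everywhere in this setting, the Newton iteration is well behaved; for non-canonical links the weight $\omega_i g_\mu(\mu_i)$ would have to be computed explicitly.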
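Consider the homogeneous case of the generalized linear mixed measurement error model. The log-density functions entering in (7) are

$$\log f(Y_{ij} \mid b_i, X_{ij}) = \{Y_{ij}\gamma_{ij} - b(\gamma_{ij})\}/\tau^2 - c(Y_{ij}; \tau),$$

$$\log f(W_{ij} \mid X_{ij}) = -\frac{1}{2}\log(\sigma_u^2) - \frac{1}{2}\,\frac{(W_{ij} - X_{ij})^2}{\sigma_u^2},$$

$$\log f(X_{ij}) = -\frac{1}{2}\log(\sigma_x^2) - \frac{1}{2}\,\frac{(X_{ij} - \alpha^\top Z_{ij})^2}{\sigma_x^2}$$

and

$$\log f(b_i) = -\frac{1}{2}\log|D| - \frac{1}{2}\, b_i^\top D^{-1} b_i.$$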
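The first derivatives of $r(b_i, X_i)$ are

$$\frac{\partial r(b_i, X_i)}{\partial b_i} = -\frac{1}{\tau^2}\sum_j S_{ij}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij}) + D^{-1} b_i$$

and

$$\frac{\partial r(b_i, X_i)}{\partial X_{ij}} = -\frac{1}{\tau^2}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij})\,\beta_x - \frac{W_{ij} - X_{ij}}{\sigma_u^2} + \frac{X_{ij} - \alpha^\top Z_{ij}}{\sigma_x^2}.$$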
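The two gradient expressions for the homogeneous case can be checked against central finite differences. The sketch below is illustrative only: it assumes a Bernoulli response with logit link, $\tau^2 = 1$, a scalar random intercept ($S_{ij} = 1$, $D = d$), and $\omega_{ij} g_\mu(\mu_{ij}) = 1$ for the canonical link. The array `m_x` stands for $\alpha^\top Z_{ij}$, and the data and parameter values are simulated, hypothetical inputs.

```python
import numpy as np

def r_value(b, x, y, w, m_x, beta0, beta_x, sigma_u, sigma_x, d):
    """r(b_i, X_i) = minus the sum of the four log-densities for one cluster."""
    eta = beta0 + beta_x * x + b                       # random intercept: S_ij = 1
    log_f_y = np.sum(y * eta - np.logaddexp(0.0, eta))
    log_f_w = np.sum(-0.5 * np.log(sigma_u ** 2) - 0.5 * (w - x) ** 2 / sigma_u ** 2)
    log_f_x = np.sum(-0.5 * np.log(sigma_x ** 2) - 0.5 * (x - m_x) ** 2 / sigma_x ** 2)
    log_f_b = -0.5 * np.log(d) - 0.5 * b ** 2 / d
    return -(log_f_y + log_f_w + log_f_x + log_f_b)

def r_gradient(b, x, y, w, m_x, beta0, beta_x, sigma_u, sigma_x, d):
    """Analytic first derivatives with respect to b_i and each X_ij."""
    mu = 1.0 / (1.0 + np.exp(-(beta0 + beta_x * x + b)))
    grad_b = -np.sum(y - mu) + b / d
    grad_x = -(y - mu) * beta_x - (w - x) / sigma_u ** 2 + (x - m_x) / sigma_x ** 2
    return grad_b, grad_x

def numeric_gradient(b, x, step=1e-6, **p):
    """Central finite differences of r_value in b and in each component of x."""
    grad_b = (r_value(b + step, x, **p) - r_value(b - step, x, **p)) / (2 * step)
    grad_x = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = step
        grad_x[j] = (r_value(b, x + e, **p) - r_value(b, x - e, **p)) / (2 * step)
    return grad_b, grad_x

# Simulated cluster with n_i = 5 observations (hypothetical values).
rng = np.random.default_rng(0)
n_i = 5
m_x = rng.normal(size=n_i)
x = m_x + rng.normal(size=n_i)
w = x + 0.3 * rng.normal(size=n_i)
y = rng.integers(0, 2, size=n_i).astype(float)
p = dict(y=y, w=w, m_x=m_x, beta0=-0.2, beta_x=0.8, sigma_u=0.3, sigma_x=1.0, d=0.5)

analytic_b, analytic_x = r_gradient(0.4, x, **p)
numeric_b, numeric_x = numeric_gradient(0.4, x, **p)
print("max abs difference:",
      max(abs(analytic_b - numeric_b), np.max(np.abs(analytic_x - numeric_x))))
```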
In the heterogeneous case of the generalized linear mixed measurement error model the log-density functions entering in (10) are

$$\log f(Y_{ij} \mid X_{ij}, b_i) = [Y_{ij}\gamma_{ij} - b(\gamma_{ij})]/\tau^2 - c(Y_{ij}; \tau),$$

$$\log f(W_{ij} \mid X_{ij}) = -\frac{1}{2}\log(\sigma_u^2) - \frac{1}{2}\,\frac{(W_{ij} - X_{ij})^2}{\sigma_u^2},$$

$$\log f(X_{ij} \mid a_i) = -\frac{1}{2}\log(\sigma_x^2) - \frac{1}{2}\,\frac{(X_{ij} - \alpha^\top Z_{ij} - a_i)^2}{\sigma_x^2},$$

$$\log f(a_i) = -\frac{1}{2}\log(\sigma_a^2) - \frac{1}{2}\,\frac{a_i^2}{\sigma_a^2}$$

and

$$\log f(b_i) = -\frac{1}{2}\log|D| - \frac{1}{2}\, b_i^\top D^{-1} b_i.$$
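The first derivatives of $r(a_i, b_i, X_i)$ are

$$\frac{\partial r(a_i, b_i, X_i)}{\partial a_i} = -\frac{\sum_j (X_{ij} - \alpha^\top Z_{ij} - a_i)}{\sigma_x^2} + \frac{a_i}{\sigma_a^2},$$

$$\frac{\partial r(a_i, b_i, X_i)}{\partial b_i} = -\frac{1}{\tau^2}\sum_j S_{ij}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij}) + D^{-1} b_i$$

and

$$\frac{\partial r(a_i, b_i, X_i)}{\partial X_{ij}} = -\frac{1}{\tau^2}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij})\,\beta_x - \frac{W_{ij} - X_{ij}}{\sigma_u^2} + \frac{X_{ij} - \alpha^\top Z_{ij} - a_i}{\sigma_x^2}.$$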
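First and second derivatives in both the homogeneous and the heterogeneous case are obtained using the following identities:

$$\frac{\partial \mu_{ij}}{\partial X_{ij}} = \frac{\partial \mu_{ij}}{\partial g(\mu_{ij})}\,\frac{\partial g(\mu_{ij})}{\partial X_{ij}} = \frac{1}{g_\mu(\mu_{ij})}\,\beta_x \quad\text{and}\quad \frac{\partial \mu_{ij}}{\partial b_i} = \frac{\partial \mu_{ij}}{\partial g(\mu_{ij})}\,\frac{\partial g(\mu_{ij})}{\partial b_i} = \frac{1}{g_\mu(\mu_{ij})}\,S_{ij}.$$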
Appendix B: Consistency of the Laplace-based maximum likelihood estimates
In order to evaluate the asymptotic order of the Laplace approximations (5), (8) and (11), let us consider the first multiplicative correction term of (1), that is

$$1 + \frac{1}{24}\left(3 r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} + 2 r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} - 3 r^{ij} r^{hk} r_{ijhk}\right), \tag{B1}$$

where $r_{R_m}$, $R_m = r_1, \ldots, r_m$, indicate the $m$th partial derivatives of $r(x)$ with respect to the corresponding components of $x$, $r_{R_m} = r_{R_m}(x)$, and $r^{ij}$ are the components of the inverse matrix of second derivatives. Here the index notation and the summation convention have been employed. The vector of parameters $\theta$ is assumed to be fixed, while the measurement error variances tend to zero and the sample sizes tend to infinity.
Consider the generalized linear measurement error model. In all scalars of the type (B1), $r''(X_i)$ may be replaced by $\Sigma_u^{-1}$ without affecting the asymptotic order. In this case $(r''(X_i))^{-1} = \Sigma_u$. Let the elements of $\Sigma_u$ corresponding to variables $i$ and $j$ be denoted by $\sigma_{uij}$, with $\sigma_{uii} = \sigma_{ui}^2$. Then, $O(r^{ij} r^{kl}) = O(\sigma_{uij}\sigma_{ukl}) \le O(\sigma_{ui}\sigma_{uj}\sigma_{uk}\sigma_{ul})$, and the equality holds only for $i = j$ and $k = l$. The highest error is then $O(\max_s \sigma_{us}^4)$. Since $r_{ijh}$ and $r_{ijhk}$ are $O(1)$, the asymptotic orders of the scalars in Eq. (B1) are

$$r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} \le O(\max_s \sigma_{us}^6),$$

$$r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} \le O(\max_s \sigma_{us}^6),$$

$$r^{ij} r^{hk} r_{ijhk} \le O(\max_s \sigma_{us}^4).$$
For unit $i$ the log-likelihood function is then approximated as

$$\ell_i(\theta) = \tilde\ell_i(\theta) + O(\max_s \sigma_{us}^4)$$

and the overall log-likelihood function is

$$\ell(\theta) = \tilde\ell(\theta) + m\,O(\max_s \sigma_{us}^4). \tag{B2}$$
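The $O(\sigma_u^4)$ error of the per-unit log-likelihood approximation can be illustrated numerically in a simple special case. The sketch below is an illustration under stated assumptions, not part of the paper's simulations: one unit, a scalar covariate with $X_i \sim N(0, \sigma_x^2)$, a Bernoulli response with logit link and $\tau^2 = 1$. The reference value of the integral is obtained by Gauss-Hermite quadrature centered and scaled at the mode, a technique swapped in here only to measure the error. All names and numerical values are hypothetical.

```python
import numpy as np

def r_fun(x, y, w, beta0, beta_x, sigma_u, sigma_x):
    """r(X) for one unit, with additive constants dropped."""
    eta = beta0 + beta_x * x
    return (-(y * eta - np.logaddexp(0.0, eta))
            + 0.5 * (w - x) ** 2 / sigma_u ** 2
            + 0.5 * x ** 2 / sigma_x ** 2)

def mode_and_hessian(y, w, beta0, beta_x, sigma_u, sigma_x, n_iter=50):
    """Newton search for the minimizer of r(X); returns the mode and r'' there."""
    def grad_hess(x):
        mu = 1.0 / (1.0 + np.exp(-(beta0 + beta_x * x)))
        grad = -(y - mu) * beta_x - (w - x) / sigma_u ** 2 + x / sigma_x ** 2
        hess = mu * (1.0 - mu) * beta_x ** 2 + 1.0 / sigma_u ** 2 + 1.0 / sigma_x ** 2
        return grad, hess
    x = w
    for _ in range(n_iter):
        grad, hess = grad_hess(x)
        x = x - grad / hess
    return x, grad_hess(x)[1]

def laplace_log_error(sigma_u, y=1.0, w=0.3, beta0=0.0, beta_x=1.0,
                      sigma_x=1.0, n_nodes=40):
    """log(exact integral) - log(Laplace approximation) for one unit."""
    p = dict(y=y, w=w, beta0=beta0, beta_x=beta_x, sigma_u=sigma_u, sigma_x=sigma_x)
    x_hat, hess = mode_and_hessian(**p)
    scale = 1.0 / np.sqrt(hess)
    t, weights = np.polynomial.hermite.hermgauss(n_nodes)   # weight function exp(-t^2)
    x = x_hat + np.sqrt(2.0) * scale * t
    # Deviation of r from its quadratic approximation at the mode.
    delta = r_fun(x, **p) - r_fun(x_hat, **p) - t ** 2
    # The ratio exact / Laplace equals sum_k w_k exp(-delta_k) / sqrt(pi).
    return np.log(np.sum(weights * np.exp(-delta)) / np.sqrt(np.pi))

sigmas = np.array([0.4, 0.2, 0.1, 0.05])
errors = np.array([abs(laplace_log_error(s)) for s in sigmas])
slope = np.polyfit(np.log(sigmas), np.log(errors), 1)[0]
for s, e in zip(sigmas, errors):
    print(f"sigma_u = {s:5.2f}   |log error| = {e:.3e}")
print("estimated order in sigma_u:", slope)
```

On a log-log scale the estimated slope should be close to 4 as $\sigma_u$ decreases, in line with (B2) for a single unit; it is not exactly 4 at moderate $\sigma_u$ because $r''$ also contains terms that do not depend on $\sigma_u$.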
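Let $\ell_{R_m}(\theta) = \ell_{r_1 \ldots r_m}(\theta)$ denote the generic derivative of order $m$ of $\ell(\theta)$ and let $\hat\theta$ be the Laplace-based maximum likelihood estimate satisfying $\tilde\ell_r(\hat\theta) = 0$. Differentiating (B2) with respect to the $r$th component of $\theta$, evaluating it at $\hat\theta$ and dividing by $m$ gives

$$m^{-1}\ell_r(\hat\theta) = m^{-1}\tilde\ell_r(\hat\theta) + O(\max_s \sigma_{us}^4) = O(\max_s \sigma_{us}^4).$$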
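Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat\theta - \theta) = o_p(1)$, a Taylor series expansion about the true parameter value $\theta$ yields

$$m^{-1}\ell_r(\hat\theta) \approx m^{-1}\ell_r(\theta) + m^{-1}\ell_{rs}(\theta)(\hat\theta - \theta)^s, \tag{B3}$$

where the remainder is $O_p(1)(\hat\theta - \theta)^{st}$. Combining these results, it follows that

$$(\hat\theta - \theta)^s = (m^{-1}\ell_{rs}(\theta))^{-1}\{m^{-1}\ell_r(\theta) + O(\max_s \sigma_{us}^4)\} = O_p(\max(m^{-1/2}, \max_s \sigma_{us}^4)),$$

where the standard asymptotic results $\ell_r(\theta) = O_p(m^{1/2})$ and $\ell_{rs}(\theta) = O_p(m)$ have been used.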
Consider now the generalized linear mixed measurement error model. Let $\bar q$ be equal to $q$ in the homogeneous case and equal to $q + 1$ in the heterogeneous case. Let us divide the argument of $r''$ into two components: the first one has dimension $\bar q$ and concerns the random effects $a_i$ and $b_i$, while the second one has dimension $n_i$ and concerns the true covariates $X_i$. The matrix $r''$ can then be partitioned into $2 \times 2$ blocks. By applying the formulas for the inversion of block matrices it is possible to determine the largest elements of $(r'')^{-1}$, which are $O(n_i^{-1})$ and $O(\sigma_u^2)$ in both the homogeneous and the heterogeneous cases. Furthermore,

$$r_{ijh} = \begin{cases} O(n_i) & \text{if } i, j, h = 1, \ldots, \bar q, \\ O(1) & \text{if } i = \bar q + 1, \ldots, \bar q + n_i,\; j, h = 1, \ldots, \bar q + n_i \end{cases}$$

and

$$r_{ijhk} = \begin{cases} O(n_i) & \text{if } i, j, h, k = 1, \ldots, \bar q, \\ O(1) & \text{if } i = \bar q + 1, \ldots, \bar q + n_i,\; j, h, k = 1, \ldots, \bar q + n_i. \end{cases}$$
The asymptotic order of the scalars in Eq. (B1) is then

$$r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} \le O(\max(\sigma_u^6, n_i^{-1})),$$

$$r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} \le O(\max(\sigma_u^6, n_i^{-1})),$$

$$r^{ij} r^{hk} r_{ijhk} \le O(\max(\sigma_u^4, n_i^{-1})).$$
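The statement that the largest elements of $(r'')^{-1}$ are $O(n_i^{-1})$ and $O(\sigma_u^2)$ can be examined numerically by building $r''$ for one cluster and inverting it. The following sketch assumes the homogeneous case with a scalar random intercept ($\bar q = 1$, $S_{ij} = 1$, $D = d$), a Bernoulli response with logit link and $\tau^2 = 1$. The covariate values and parameters are hypothetical, and `np.linalg.inv` is used in place of the explicit block-inversion formulas.

```python
import numpy as np

def hessian_random_intercept(n_i, sigma_u, sigma_x=1.0, d=1.0, beta_x=1.0):
    """r'' for one cluster, ordered as (b_i, X_i1, ..., X_in_i)."""
    x = np.linspace(-1.0, 1.0, n_i)                    # illustrative values of X_ij
    mu = 1.0 / (1.0 + np.exp(-beta_x * x))             # beta0 = 0 and b_i = 0
    v = mu * (1.0 - mu)                                # Bernoulli variance function
    h = np.zeros((n_i + 1, n_i + 1))
    h[0, 0] = v.sum() + 1.0 / d                        # random-effect block, O(n_i)
    h[0, 1:] = h[1:, 0] = v * beta_x                   # cross block
    h[1:, 1:] = np.diag(v * beta_x ** 2 + 1.0 / sigma_u ** 2 + 1.0 / sigma_x ** 2)
    return h

def largest_inverse_elements(n_i, sigma_u):
    """Random-effect element and largest covariate-block element of (r'')^{-1}."""
    inv = np.linalg.inv(hessian_random_intercept(n_i, sigma_u))
    return inv[0, 0], np.max(np.abs(inv[1:, 1:]))

sigma_u = 0.2
for n_i in (10, 100, 400):
    re_elem, x_elem = largest_inverse_elements(n_i, sigma_u)
    print(f"n_i = {n_i:4d}   n_i * inv[0,0] = {n_i * re_elem:.3f}   "
          f"max X-block / sigma_u^2 = {x_elem / sigma_u ** 2:.3f}")
```

If the stated orders hold, `n_i * inv[0,0]` stays bounded as $n_i$ grows and the covariate-block elements stay close to, and not above, $\sigma_u^2$.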
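For cluster $i$ the log-likelihood function is then approximated as

$$\ell_i(\theta) = \tilde\ell_i(\theta) + O(\max(\sigma_u^4, n_i^{-1}))$$

and the overall log-likelihood function is

$$\ell(\theta) = \tilde\ell(\theta) + O(m \cdot \max(\sigma_u^4, \min(n_i)^{-1})).$$

Then,

$$m^{-1}\ell_r(\hat\theta) = m^{-1}\tilde\ell_r(\hat\theta) + O(\max(\sigma_u^4, \min(n_i)^{-1})) = O(\max(\sigma_u^4, \min(n_i)^{-1})). \tag{B4}$$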
Combining Eq. (B3) with Eq. (B4) gives

$$(\hat\theta - \theta)^s = (m^{-1}\ell_{rs}(\theta))^{-1}\{O(\max(\sigma_u^4, \min(n_i)^{-1})) - m^{-1}\ell_r(\theta)\} = O_p(\max(m^{-1/2}, \sigma_u^4, \min(n_i)^{-1})).$$
References
Anderson, D. A. and Aitkin, M. (1985). Variance component models with binary response: Interviewer variability. Journal of the Royal Statistical Society, Series B: Methodological 47, 203–210.
Barndorff-Nielsen, O. E. and Cox, D. R. (1989). Asymptotic Techniques for Use in Statistics. Chapman & Hall, London.
Carroll, R. J., Roeder, K. and Wasserman, L. (1999). Flexible parametric measurement error models. Biometrics 55, 44–54.
Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models. Chapman & Hall, London.
Cook, J. R. and Stefanski, L. A. (1994). Simulation–extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association 89, 1314–1328.
Higdon, R. and Schafer, D. W. (2001). Maximum likelihood computations for regression with measurement error. Computational Statistics and Data Analysis 35, 283–299.
Ko, H. and Davidian, M. (2000). Correcting for measurement error in individual-level covariates in nonlinear mixed effects models. Biometrics 56, 368–375.
Liu, Q. and Pierce, D. A. (1993). Heterogeneity in Mantel–Haenszel-type models. Biometrika 80, 543–556.
Liu, Q. and Pierce, D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika 81, 624–629.
McCullagh, P. and Nelder, J. A. (1999). Generalized Linear Models. Chapman & Hall, London.
McCulloch, C. E. (1994). Maximum likelihood variance components estimation for binary data. Journal of the American Statistical Association 89, 330–335.
McCulloch, C. E. (1997). Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association 92, 162–170.
McCulloch, C. E. and Searle, S. R. (2001). Generalized, Linear, and Mixed Models. Wiley, New York.
R Development Core Team. (2010). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Schafer, D. W. (1987). Covariate measurement error in generalized linear models. Biometrika 74, 385–391.
Vonesh, E. F. (1996). A note on the use of Laplace's approximation for nonlinear mixed-effects models. Biometrika 83, 447–452.
Wang, N., Lin, X., Gutierrez, R. G. and Carroll, R. (1998). Bias analysis and SIMEX approach in generalized linear mixed measurement error models. Journal of the American Statistical Association 93, 249–261.
Wang, N., Lin, X. and Gutierrez, R. G. (1999). A bias correction regression calibration approach in generalized linear mixed measurement error models. Communications in Statistics: Theory and Methods 28, 217–232.