Laplace approximation in measurement error models

Michela Battauz*

Department of Economics and Statistics, University of Udine, Udine, Italy

Received 28 April 2010, revised 8 November 2010, accepted 7 February 2011

Likelihood analysis for regression models with measurement errors in explanatory variables typically involves integrals that do not have a closed-form solution. In this case, numerical methods such as Gaussian quadrature are generally employed. However, when the dimension of the integral is large, these methods become computationally demanding or even unfeasible. This paper proposes the use of the Laplace approximation to deal with measurement error problems when the likelihood function involves high-dimensional integrals. The cases considered are generalized linear models with multiple covariates measured with error and generalized linear mixed models with measurement error in the covariates. The asymptotic order of the approximation and the asymptotic properties of the Laplace-based estimator for these models are derived. The method is illustrated using simulations and real-data analysis.

Keywords: Generalized linear mixed model; Generalized linear model; Laplace approximation; Measurement error; Random effect.

Supporting Information for this article is available from the author or on the WWW under http://dx.doi.org/10.1002/bimj.201000095

1 Introduction

Measurement error is a common problem in many areas of statistical analysis, and a large part of the literature focuses on medical and biological applications (see Carroll et al., 2006). Accounting for measurement error is particularly important because ignoring it leads to biased parameter estimates and a loss of power for detecting relationships among variables (Carroll et al., 2006, Section 1.1). The literature provides a variety of statistical methods to correct for the presence of measurement error and distinguishes between structural and functional modeling depending on whether the true variables are modeled parametrically or not (Carroll et al., 2006, Section 2.1). Likelihood methods generally require strong distributional assumptions, but they can be applied to more general models and provide a gain in efficiency. However, the evaluation of the log-likelihood function often involves integrals with respect to the latent variables that have no closed-form solution. In these cases, numerical methods such as Gaussian quadrature are employed. Nevertheless, when high-dimensional integrals are involved, such methods become computationally demanding or even unfeasible. Two such cases are considered in this paper.

The first one concerns generalized linear models with multiple covariates measured with error, where the dimension of the integral depends on the number of error-prone variables. Schafer (1987) proposes the use of Gauss–Hermite quadrature in maximum likelihood analysis to approximate integrals. As pointed out by Higdon and Schafer (2001), the inclusion of more than one explanatory variable measured with error is considerably more difficult because multidimensional quadrature is quite cumbersome. Monte Carlo integration may be more reasonable, but it may require a large Monte Carlo sample size and thus become time consuming. Liu and Pierce (1994) consider the Laplace approximation in a logistic regression model with one explanatory variable measured with error. They found that the Laplace approximation is very accurate; however, the asymptotic properties of the estimator are not established.

*Corresponding author: e-mail: [email protected], Phone: +39-0432-24-95-81, Fax: +39-0432-24-95-95

© 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

Biometrical Journal 53 (2011) 3, 411–425 DOI: 10.1002/bimj.201000095

The second case regards generalized linear mixed models with measurement error in the covariates. Mixed models are widely used in statistical applications since they are well suited for the analysis of longitudinal panel data and clustered cross-sectional data (for a broad review see for example McCulloch and Searle, 2001). Maximum likelihood estimation requires integration over the random effects distribution. Various numerical techniques are proposed in the literature for this task, including Laplace approximation (Liu and Pierce, 1993), Gaussian quadrature (Anderson and Aitkin, 1985) and Monte Carlo methods (McCulloch, 1994, 1997). The computational complexity involved in generalized linear mixed models makes the treatment of measurement error in the covariates rather challenging. Wang et al. (1998) analyze the bias in parameter estimation when the measurement error is not properly taken into account and propose the SIMEX method (Cook and Stefanski, 1994) for measurement error correction. Wang et al. (1999) showed that a naive application of regression calibration is not suitable for generalized linear mixed models with measurement error in the covariates. When a likelihood-based approach is used in order to adjust for measurement error in mixed models, integration over both the random effects and the true variables is required. Let $q$ be the number of random effects, $p_1$ the number of variables measured with error and $n_i$ the number of units within cluster $i$. Since the observations from the same cluster are correlated, the dimension of the integral is equal to $q + p_1 \times n_i$.
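To make the growth in dimension concrete, the following sketch counts the evaluation points that a full tensor-product quadrature rule would need for one cluster. The function name, the tensor-product construction and the choice of 10 nodes per dimension are illustrative assumptions, not details taken from the paper.

```python
def quadrature_nodes(q, p1, n_i, k):
    """Return (dimension, node count) for one cluster.

    Assumes a full tensor-product rule with k nodes per dimension,
    so the node count is k ** (q + p1 * n_i).
    """
    dim = q + p1 * n_i
    return dim, k ** dim


# One random intercept, one error-prone covariate, two cluster sizes.
for n_i in (5, 10):
    dim, nodes = quadrature_nodes(q=1, p1=1, n_i=n_i, k=10)
    print(f"n_i = {n_i}: dimension {dim}, {nodes:.1e} nodes per cluster")
```

Under these assumptions the node count grows exponentially in the cluster size, which is the practical obstacle the text refers to.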

This paper proposes the use of the Laplace approximation for dealing with high-dimensional integrals in measurement error models when a likelihood-based approach is followed. The Laplace approximation is as follows (Barndorff-Nielsen and Cox, 1989, p. 170):

$$\int_{\mathbb{R}^q} e^{-r(x)}\,dx \approx (2\pi)^{q/2}\,|r''(\hat{x})|^{-1/2}\,e^{-r(\hat{x})}, \tag{1}$$

where $x$ is a $q$-dimensional parameter vector and $r(x)$ is a smooth function on $\mathbb{R}^q$ with a unique minimum at $\hat{x}$. Generally, $r(x)$ is of order $O(n)$ and the typical case is $r(x) = n r_1(x)$. For fixed $q < \infty$, the order of accuracy of the Laplace approximation is then $O(1/n)$. In the case considered here, an unusual type of Laplace approximation results, as the order of the function $r(x)$ does not only depend on a single parameter $n$, but also on the measurement error variances. The accuracy of the approximation then requires a specific investigation.
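The $O(1/n)$ behaviour of (1) can be checked numerically in one dimension. The sketch below is an illustration under assumed choices that do not come from the paper: $r_1(x) = \cosh(x) - 1$, which has its minimum at $\hat{x} = 0$ with $r_1''(0) = 1$, and a plain trapezoidal rule as the reference value.

```python
import math


def r(x, n):
    # Assumed test function: r(x) = n * r1(x), r1(x) = cosh(x) - 1, minimum at 0.
    return n * (math.cosh(x) - 1.0)


def laplace(n):
    # Formula (1) with q = 1, r(x_hat) = 0 and r''(x_hat) = n.
    return math.sqrt(2.0 * math.pi / n)


def trapezoid(n, half_width=6.0, steps=60000):
    # Reference value of the integral of exp(-r(x)) over [-6, 6];
    # the integrand is negligible outside this interval.
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-r(x, n))
    return total * h


def relative_error(n):
    exact = trapezoid(n)
    return abs(laplace(n) - exact) / exact


for n in (10, 100, 1000):
    err = relative_error(n)
    print(f"n = {n}: relative error {err:.2e}, n * error = {n * err:.3f}")
```

If the error is of order $1/n$, the product `n * error` should stay roughly constant as `n` grows, which is what this example is meant to show.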

Ko and Davidian (2000) have successfully used the Laplace approximation in nonlinear mixed effects models with measurement error in the covariates, thus encouraging the study of the method for generalized linear mixed models. Although they conjecture a non-negligible bias in this case, neither an analytical study of the accuracy of the approximation nor simulation results are provided. Furthermore, a complete investigation of the asymptotic order of the Laplace approximation for measurement error correction and of the asymptotic properties of the resulting estimator is lacking in the literature.

The paper is organized as follows. The models and some notation are introduced in Section 2. The estimation procedure, the asymptotic order of the approximation and the asymptotic properties of the estimator are presented in Section 3. Section 4 reports some simulation studies, Section 5 illustrates the method using real data and Section 6 contains some concluding remarks. Finally, mathematical details are given in the Appendices.


2 Model specifications

2.1 Generalized linear measurement error model

Let $Y_i$ be the response variable on unit $i = 1, \ldots, m$. Let $X_i$ denote a vector of true covariates with dimension $p_1$ and $Z_i$ denote a vector of covariates measured without error with dimension $p_2$. It is assumed that observations from different units are independent. Suppose that the responses come from a generalized linear model (McCullagh and Nelder, 1999) with density

$$f_{Y_i|X_i}(y_i|x_i) = \exp\left\{[y_i\gamma_i - b(\gamma_i)]/\tau^2 - c(y_i, \tau)\right\}, \tag{2}$$

mean $E(Y_i) = \mu_i$ and variance $\mathrm{Var}(Y_i) = \tau^2 v(\mu_i)$. The mean is related to the linear predictor through the link function $g$:

$$g(\mu_i) = \beta_0 + X_i^\top \beta_x + Z_i^\top \beta_z,$$

where $\beta_0$ is the intercept, $\beta_x$ is a $p_1$-vector of coefficients of $X_i$ and $\beta_z$ is a $p_2$-vector of coefficients of $Z_i$. In this paper, an additive structure is assumed for the measurement error model, hence

$$W_i = X_i + U_i,$$

where $W_i$ is a $p_1$-vector of measurements and $U_i$ is a $p_1$-vector of measurement errors, independent of $X_i$, that follow a normal distribution with zero mean and variance $\Sigma_u = \Sigma_u(\psi_u)$, where $\psi_u$ is a vector of parameters. In general, measurement error problems require the availability of extra information about the measurement error distribution (Carroll et al., 2006, Section 2.3). In this paper, the measurement error variance $\Sigma_u$ is assumed to be known. Conditional on $X_i$ and $Z_i$, the distribution of $Y_i$ is assumed independent of $W_i$, which means that measurement error is non-differential (Carroll et al., 2006, Section 2.5). Given the true variables $X_i$, the measurements $W_i$ are assumed independent of any other variable. The structural model for the true variables is

$$X_i = \alpha Z_i + \varepsilon_i,$$

where $\alpha$ is a $p_1 \times p_2$ matrix of coefficients and $\varepsilon_i$ is a $p_1$-vector of residuals that follow a normal distribution with zero mean and variance matrix $\Sigma_x = \Sigma_x(\psi_x)$, where $\psi_x$ is a vector of parameters.
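As a worked consequence of the additive error model and the normal structural model, the observed measurements are themselves normal given the error-free covariates. The identities below are standard results for sums and conditionals of independent Gaussian vectors; they are included here as an aid to the reader and are not formulas stated in the paper.

```latex
% W_i = X_i + U_i with X_i | Z_i ~ N(\alpha Z_i, \Sigma_x) and U_i ~ N(0, \Sigma_u) independent
W_i \mid Z_i \sim N\!\left(\alpha Z_i,\; \Sigma_x + \Sigma_u\right),
\qquad
X_i \mid W_i, Z_i \sim N\!\left(\alpha Z_i + \Sigma_x(\Sigma_x + \Sigma_u)^{-1}(W_i - \alpha Z_i),\;
\Sigma_x - \Sigma_x(\Sigma_x + \Sigma_u)^{-1}\Sigma_x\right).
```

The first identity shows that the parameters of the structural model are informed by the marginal distribution of the measurements once the measurement error variance is known.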

2.2 Generalized linear mixed measurement error model

Let $Y_{ij}$ denote the response variable of unit $j$ in cluster $i$, where $j = 1, \ldots, n_i$ and $i = 1, \ldots, m$. The vector of $p_1$ variables measured with error is denoted $X_{ij}$ and the vector of $p_2$ variables measured without error is denoted $Z_{ij}$. Conditional on the random effects, the responses come from a generalized linear model with predictors

$$\eta_{ij} = \beta_0 + X_{ij}^\top \beta_x + Z_{ij}^\top \beta_z + S_{ij}^\top b_i,$$

where $\beta_x$ and $\beta_z$ are vectors of coefficients with dimensions $p_1$ and $p_2$, $S_{ij}$ is a subset of $Z_{ij}$ with dimension $q$, and $b_i$ is a vector of random effects with dimension $q$ that follows a normal distribution with zero mean and variance $D = D(\psi_b)$, where $\psi_b$ is a vector of parameters. The random effects are independent of any other variable. The conditional means of the response variable are denoted by $\mu_{ij}$, satisfying $g(\mu_{ij}) = \eta_{ij}$ for some link function $g$, and the conditional variance is $\tau^2 v(\mu_{ij})$. The density of $Y_{ij}$ is

$$f_{Y_{ij}|b_i,X_{ij}}(y_{ij}|b_i, x_{ij}) = \exp\left\{[y_{ij}\gamma_{ij} - b(\gamma_{ij})]/\tau^2 - c(y_{ij}, \tau)\right\}. \tag{3}$$

Note that the situation considered here concerns variables measured with error that are associated with fixed coefficients. The non-differential measurement error assumption is made. The specification of the model is completed by defining the measurement error model, that is

$$W_{ij} = X_{ij} + U_{ij},$$

where $W_{ij}$ is a $p_1$-vector of measurements and $U_{ij}$ is a $p_1$-vector of normally distributed measurement errors with zero mean and variance $\Sigma_u = \Sigma_u(\psi_u)$. The measurement errors are assumed independent of the true variables and independent across units. Measurement error variances are assumed known. Furthermore, conditional on $X_i$, the measurements are assumed independent of other variables. Two cases are considered for the model for $X_{ij}$: the homogeneous and the heterogeneous cases (Wang et al., 1998). In the homogeneous case, the true variables are marginally independent, that is

$$X_{ij} = \alpha Z_{ij} + \varepsilon_{ij},$$

where $\alpha$ is a $p_1 \times p_2$ matrix of coefficients and $\varepsilon_{ij}$ is a $p_1$-vector of residuals that follow a normal distribution with zero mean and variance matrix $\Sigma_x$, independent across units. In the heterogeneous case the model for the true variables presents random coefficients. Here, we consider the case of a random intercept model

$$X_{ij} = \alpha Z_{ij} + a_i + \varepsilon_{ij},$$

where the $a_i$ are independent $N(0, \sigma_a^2)$ and independent of other variables, and $\varepsilon_{ij}$ is a $p_1$-vector of $N(0, \Sigma_x)$ residuals, independent across units. In both cases, the vector of parameters entering $\Sigma_x$ is $\psi_x$.

3 Maximum likelihood estimation using the Laplace approximation

3.1 Likelihood functions

The likelihood function for generalized linear measurement error models can be expressed as

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{p_1}} f(Y_i|X_i, Z_i)\, f(W_i|X_i)\, f(X_i|Z_i)\, dX_i = \prod_{i=1}^{m} \int_{\mathbb{R}^{p_1}} \exp\{-r(X_i)\}\, dX_i, \tag{4}$$

where $r(X_i) = -\log f(Y_i|X_i, Z_i) - \log f(W_i|X_i) - \log f(X_i|Z_i)$ and $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \mathrm{vec}(\alpha)^\top, \psi_x^\top)^\top$ is the vector of all the parameters. Let $\hat{X}_i$ denote the minimizer of $r(X_i)$, satisfying the equation $r'(\hat{X}_i) = 0$. The Laplace approximation gives the following expression for the log-likelihood function:

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} r(\hat{X}_i) - \frac{1}{2} \sum_{i} \log|r''(\hat{X}_i)|, \tag{5}$$

where $\tilde{\ell}(\theta)$ denotes the approximated log-likelihood function and

$$r''(X_i) = \frac{1}{\tau^2}\,\omega_i\,\beta_x \beta_x^\top + \Sigma_u^{-1} + \Sigma_x^{-1}, \tag{6}$$

where $\omega_i = [v(\mu_i)\, g_\mu^2(\mu_i)]^{-1}$ and $g_\mu(\mu_i)$ is the first derivative of $g(\mu_i)$ with respect to $\mu_i$. For details see Appendix A. The constant term is omitted from Eq. (5) and from all the following likelihood functions.
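The steps behind (4)–(6) can be sketched for a single unit of a logistic model with one error-prone covariate ($p_1 = 1$, $\tau^2 = 1$), where $\omega_i = \mu_i(1 - \mu_i)$. This is a minimal illustration and not the author's R/C implementation: the function names, the Newton iteration, the argument `mean_x` (standing for $\alpha Z_i$) and the numerical values are assumptions. The constant $\tfrac{1}{2}\log(2\pi)$, which the paper omits, is kept here so that the result can be compared with brute-force integration.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def r(x, y, w, mean_x, beta0, beta_x, s2u, s2x):
    # r(X_i) = -log f(Y_i|X_i) - log f(W_i|X_i) - log f(X_i|Z_i), normal constants kept.
    eta = beta0 + beta_x * x
    log_f_y = y * eta - math.log1p(math.exp(eta))
    log_f_w = -0.5 * math.log(2 * math.pi * s2u) - (w - x) ** 2 / (2 * s2u)
    log_f_x = -0.5 * math.log(2 * math.pi * s2x) - (x - mean_x) ** 2 / (2 * s2x)
    return -(log_f_y + log_f_w + log_f_x)


def r2(x, beta0, beta_x, s2u, s2x):
    # Scalar version of (6) with tau^2 = 1 and omega_i = mu_i * (1 - mu_i).
    mu = expit(beta0 + beta_x * x)
    return mu * (1.0 - mu) * beta_x ** 2 + 1.0 / s2u + 1.0 / s2x


def laplace_loglik(y, w, mean_x, beta0, beta_x, s2u, s2x):
    x = w  # assumed starting value for Newton's method
    for _ in range(100):
        mu = expit(beta0 + beta_x * x)
        grad = -beta_x * (y - mu) + (x - w) / s2u + (x - mean_x) / s2x
        step = grad / r2(x, beta0, beta_x, s2u, s2x)
        x -= step
        if abs(step) < 1e-12:
            break
    curvature = r2(x, beta0, beta_x, s2u, s2x)
    value = r(x, y, w, mean_x, beta0, beta_x, s2u, s2x)
    return 0.5 * math.log(2 * math.pi) - value - 0.5 * math.log(curvature)


def numeric_loglik(y, w, mean_x, beta0, beta_x, s2u, s2x,
                   half_width=10.0, steps=40000):
    # Trapezoidal reference value of log integral of exp(-r(x)).
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-r(x, y, w, mean_x, beta0, beta_x, s2u, s2x))
    return math.log(total * h)


# Hypothetical observation: y = 1, w = 0.8, structural mean 0, sigma_x^2 = 1.
for s2u in (1.0, 0.25, 0.05):
    args = dict(y=1, w=0.8, mean_x=0.0, beta0=0.0, beta_x=1.0, s2u=s2u, s2x=1.0)
    gap = abs(laplace_loglik(**args) - numeric_loglik(**args))
    print(f"sigma_u^2 = {s2u}: |Laplace - numeric| = {gap:.2e}")
```

Summing `laplace_loglik` over units and maximizing over the parameters would give a Laplace-based estimate in this simplified setting; the gap to the numerical integral should shrink as the measurement error variance decreases.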

For simplicity, in mixed models we consider a single variable measured with error, that is $p_1 = 1$. Define $Y_i = (Y_{i1}, \ldots, Y_{in_i})^\top$, $X_i = (X_{i1}, \ldots, X_{in_i})^\top$, $Z_i = (Z_{i1}, \ldots, Z_{in_i})^\top$, and $S_i$ and $W_i$ similarly. The likelihood function for generalized linear mixed measurement error models can be expressed as

$$L(\theta) = \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} f(Y_i|X_i, Z_i, b_i)\, f(W_i|X_i)\, f(X_i|Z_i)\, f(b_i)\, dX_i\, db_i.$$

Note that the factorization of the density $f(Y_i, W_i, X_i, b_i|Z_i)$ used here is different from that of Ko and Davidian (2000).

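In the homogeneous case, the vector of all the parameters is $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \alpha^\top, \psi_x^\top, \psi_b^\top)^\top$ and the likelihood function becomes

$$\begin{aligned}
L(\theta) &= \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} \left( \prod_{j=1}^{n_i} f(Y_{ij}|X_{ij}, Z_{ij}, b_i)\, f(W_{ij}|X_{ij})\, f(X_{ij}|Z_{ij}) \right) f(b_i)\, dX_i\, db_i \\
&= \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}^{n_i}} \exp\{-r(b_i, X_i)\}\, dX_i\, db_i,
\end{aligned} \tag{7}$$

where $r(b_i, X_i) = -\sum_j \left( \log f(Y_{ij}|X_{ij}, Z_{ij}, b_i) + \log f(W_{ij}|X_{ij}) + \log f(X_{ij}|Z_{ij}) \right) - \log f(b_i)$. Let $(\hat{b}_i, \hat{X}_i)^\top$ denote the minimizer of $r(b_i, X_i)$, satisfying the equation $r'(\hat{b}_i, \hat{X}_i) = 0$. The log-likelihood function is then approximated as

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} \sum_{j=1}^{n_i} r(\hat{b}_i, \hat{X}_i) - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \log|r''(\hat{b}_i, \hat{X}_i)|, \tag{8}$$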

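where $r''(b_i, X_i)$ has the following components:

$$\begin{aligned}
\frac{\partial^2 r(b_i, X_i)}{\partial b_i\, \partial b_i^\top} &= \frac{1}{\tau^2} \sum_{j=1}^{n_i} \omega_{ij} S_{ij} S_{ij}^\top + D^{-1} = O(n_i), \\
\frac{\partial^2 r(b_i, X_i)}{\partial X_{ij}^2} &= \frac{1}{\tau^2}\, \omega_{ij} \beta_x^2 + \frac{1}{\sigma_u^2} + \frac{1}{\sigma_x^2} = O(\sigma_u^{-2}), \\
\frac{\partial^2 r(b_i, X_i)}{\partial X_{ij}\, \partial X_{ij'}} &= 0 \quad \text{for } j \neq j', \\
\frac{\partial^2 r(b_i, X_i)}{\partial b_i\, \partial X_{ij}} &= \frac{1}{\tau^2}\, S_{ij}\, \omega_{ij} \beta_x = O(1),
\end{aligned} \tag{9}$$

where $\omega_{ij} = [v(\mu_{ij})\, g_\mu^2(\mu_{ij})]^{-1}$, $g_\mu(\mu_{ij})$ is the first derivative of $g(\mu_{ij})$ with respect to $\mu_{ij}$, $\sigma_u^2$ represents the measurement error variance and $\sigma_x^2$ denotes the variance of $X_{ij}$. The asymptotic orders of the second derivatives have been specified because they will be used in Section 3.2.

In the heterogeneous case, $\theta = (\beta_x^\top, \beta_z^\top, \tau^2, \alpha^\top, \psi_x^\top, \psi_b^\top, \sigma_a^2)^\top$ and the likelihood function is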

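$$\begin{aligned}
L(\theta) &= \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}} \int_{\mathbb{R}^{n_i}} \left( \prod_{j=1}^{n_i} f(Y_{ij}|X_{ij}, Z_{ij}, b_i)\, f(W_{ij}|X_{ij})\, f(X_{ij}|Z_{ij}, a_i) \right) f(a_i)\, f(b_i)\, dX_i\, da_i\, db_i \\
&= \prod_{i=1}^{m} \int_{\mathbb{R}^{q}} \int_{\mathbb{R}} \int_{\mathbb{R}^{n_i}} \exp\{-r(a_i, b_i, X_i)\}\, dX_i\, da_i\, db_i,
\end{aligned} \tag{10}$$

where $r(a_i, b_i, X_i) = -\sum_j \left( \log f(Y_{ij}|X_{ij}, Z_{ij}, b_i) + \log f(W_{ij}|X_{ij}) + \log f(X_{ij}|Z_{ij}, a_i) \right) - \log f(a_i) - \log f(b_i)$.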

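The Laplace approximation then gives

$$\tilde{\ell}(\theta) = -\sum_{i=1}^{m} \sum_{j=1}^{n_i} r(\hat{a}_i, \hat{b}_i, \hat{X}_i) - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{n_i} \log|r''(\hat{a}_i, \hat{b}_i, \hat{X}_i)|, \tag{11}$$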

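where $(\hat{a}_i, \hat{b}_i, \hat{X}_i)^\top$ denotes the minimizer of $r(a_i, b_i, X_i)$ and the components of $r''(a_i, b_i, X_i)$ are

$$\begin{aligned}
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i^2} &= \frac{n_i}{\sigma_x^2} + \frac{1}{\sigma_a^2} = O(n_i), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial b_i\, \partial b_i^\top} &= \frac{1}{\tau^2} \sum_{j} \omega_{ij} S_{ij} S_{ij}^\top + D^{-1} = O(n_i), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial X_{ij}^2} &= \frac{1}{\tau^2}\, \beta_x^2\, \omega_{ij} + \frac{1}{\sigma_u^2} + \frac{1}{\sigma_x^2} = O(\sigma_u^{-2}), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial X_{ij}\, \partial X_{ij'}} &= 0 \quad \text{for } j \neq j', \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i\, \partial b_i} &= 0, \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial a_i\, \partial X_{ij}} &= \frac{1}{\sigma_x^2} = O(1), \\
\frac{\partial^2 r(a_i, b_i, X_i)}{\partial b_i\, \partial X_{ij}} &= \frac{1}{\tau^2}\, S_{ij}\, \omega_{ij} \beta_x = O(1),
\end{aligned} \tag{12}$$

where $\omega_{ij}$ is defined as in the homogeneous case. Details are given in Appendix A.

The matrices of second derivatives in (6), (9) and (12) are exact for some models (e.g. the binomial or Poisson). In general, an expectation should be taken to obtain those expressions (see McCulloch and Searle, 2001, p. 282).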
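The components in (9) imply that, for $q = 1$ and $p_1 = 1$, the matrix of second derivatives is an arrowhead matrix: a scalar block for $b_i$, a diagonal block for $X_i$ and a single border of cross-derivatives. Its log-determinant can then be obtained in $O(n_i)$ operations through the Schur complement of the diagonal block. The sketch below illustrates this standard linear-algebra identity; it is not taken from the paper, and the function names and numerical values are hypothetical.

```python
import math


def arrowhead_logdet(a, c, d):
    """log|H| for H = [[a, c^T], [c, diag(d)]] with scalar a (case q = 1).

    Uses |H| = prod(d_j) * (a - sum(c_j^2 / d_j)), the Schur complement of diag(d).
    """
    schur = a - sum(cj * cj / dj for cj, dj in zip(c, d))
    return sum(math.log(dj) for dj in d) + math.log(schur)


def build_arrowhead(a, c, d):
    n = len(d)
    h = [[0.0] * (n + 1) for _ in range(n + 1)]
    h[0][0] = a
    for j in range(n):
        h[0][j + 1] = h[j + 1][0] = c[j]
        h[j + 1][j + 1] = d[j]
    return h


def dense_logdet(h):
    """Reference log-determinant of a positive definite matrix via Cholesky."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = h[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                logdet += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return logdet


# Hypothetical values: a plays the role of the b_i block, d the X_ij diagonal,
# c the cross-derivatives between b_i and X_ij.
a, c, d = 3.0, [0.2, 0.3, 0.1], [5.0, 6.0, 5.5]
print(arrowhead_logdet(a, c, d), dense_logdet(build_arrowhead(a, c, d)))
```

The same idea extends to $q > 1$ by replacing the scalar Schur complement with a $q \times q$ matrix, so that the cost of the determinant does not grow cubically with the cluster size.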

3.2 Asymptotic properties

Since the method proposed in this paper provides an approximation of the log-likelihood function, it is important to establish the order of this approximation and the conditions necessary to obtain consistency of the estimator based on the maximization of the approximated log-likelihood. This section therefore concerns the asymptotic order of the approximation and the asymptotic properties of the Laplace-based estimator. The vector of parameters $\theta$ is assumed to be fixed, while the measurement error variances tend to zero and the sample sizes tend to infinity. All the proofs are given in Appendix B and they proceed along the lines of Vonesh (1996), who derived the rate of convergence of the estimator based on the Laplace approximation for nonlinear mixed effects models.

In generalized linear measurement error models, the order of accuracy of the Laplace approximation depends on the magnitude of the largest measurement error variance. In particular, the log-likelihood function (4) is approximated as

$$\ell(\theta) = \tilde{\ell}(\theta) + m\, O\!\left(\max_s \sigma_{us}^4\right),$$

where $\sigma_{us}^2$ is the measurement error variance of the $s$th variable. Let $\hat{\theta}$ be the Laplace-based maximum likelihood estimate. Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat{\theta} - \theta) = o_p(1)$, it is possible to show that

$$(\hat{\theta} - \theta) = O_p\!\left[\max\!\left(m^{-1/2}, \max_s \sigma_{us}^4\right)\right].$$

Thus the approximate maximum likelihood estimator $\hat{\theta}$ will be consistent only as both $m \to \infty$ and $\max_s \sigma_{us} \to 0$. Consistent with the findings of Vonesh (1996), the rate of convergence depends on two terms, one coming from the standard asymptotic theory and one coming from the Laplace approximation. In the case of generalized linear measurement error models, the former is $m^{-1/2}$ and the latter is $\max_s \sigma_{us}^4$.

Consider now the generalized linear mixed measurement error models. The order of accuracy of the Laplace approximation depends on both the measurement error variance and the number of units within clusters. Specifically, the log-likelihood functions (7) and (10) are approximated as

$$\ell(\theta) = \tilde{\ell}(\theta) + m\, O\!\left(\max\!\left(\sigma_u^4, \min(n_i)^{-1}\right)\right).$$

Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat{\theta} - \theta) = o_p(1)$, the rate of convergence of the Laplace-based maximum likelihood estimator is

$$(\hat{\theta} - \theta) = O_p\!\left(\max\!\left(m^{-1/2}, \sigma_u^4, \min(n_i)^{-1}\right)\right).$$

In order to have consistency it is then necessary that $m \to \infty$, $\sigma_u \to 0$ and $\min n_i \to \infty$. The $m^{-1/2}$ term comes from the standard asymptotic theory, while both $\sigma_u^4$ and $\min(n_i)^{-1}$ come from the Laplace approximation. This can be seen as a generalization to the measurement error case of the result obtained by Vonesh (1996).
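To see which term dominates the stated rates for a given design, the order expressions can be evaluated directly. The sketch below only computes the quantities inside the $O_p(\cdot)$ statements; they are orders of magnitude up to unknown constants, not error bounds, and the function names are illustrative.

```python
def rate_glm(m, sigma_u):
    """max(m^{-1/2}, max_s sigma_us^4); sigma_u is a list of error standard deviations."""
    return max(m ** -0.5, max(s ** 4 for s in sigma_u))


def rate_glmm(m, sigma_u, n):
    """max(m^{-1/2}, sigma_u^4, 1 / min(n_i)); n is a list of cluster sizes."""
    return max(m ** -0.5, sigma_u ** 4, 1.0 / min(n))


# Designs resembling those of Section 4.
print(rate_glm(m=300, sigma_u=[0.655, 0.655]))   # sigma_u^4 term is the largest
print(rate_glm(m=100, sigma_u=[0.333, 0.333]))   # m^{-1/2} term is the largest
print(rate_glmm(m=50, sigma_u=0.333, n=[5] * 50))  # 1 / min(n_i) term is the largest
```

In the mixed-model designs with five units per cluster, the cluster-size term is the largest of the three under this reading, regardless of the measurement error variance.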

4 Simulation studies

In order to evaluate the behavior of the estimator for generalized linear measurement error models when $m < \infty$ and $\sigma_{us} > 0$ for all $s$, and the behavior of the estimator for generalized linear mixed measurement error models when $m < \infty$, $n_i < \infty$ and $\sigma_u > 0$, some simulation studies have been conducted. The standard deviation of the true variables is set to $\sigma_x = 1$ and measurement error standard deviations are taken constant across different variables and equal to 0.655, 0.5 and 0.333, so that the reliabilities are approximately 0.7, 0.8 and 0.9. Various estimators have been considered for each simulated data set. The true estimator represents a benchmark as it uses the true covariate values, while the naive estimator treats the observed covariates as measured without error, thus leading to biased estimates. In other words, the true estimator assumes that the true covariates $X_i$ and $X_{ij}$ are known in (2) and (3). The naive estimator fits models (2) and (3) using $W_i$ and $W_{ij}$ in place of the true covariates. Finally, the Laplace-based correction proposed in this paper is referred to as the adjusted method.
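The link between the error standard deviations and the quoted reliabilities can be verified with a short calculation. The sketch assumes the usual definition of reliability under the additive model, namely the ratio of the variance of the true variable to the variance of the measurement; the paper does not write this formula out, so it is stated here as an assumption.

```python
def reliability(sigma_u, sigma_x=1.0):
    """Assumed definition: var(X) / var(W) = sigma_x^2 / (sigma_x^2 + sigma_u^2)."""
    return sigma_x ** 2 / (sigma_x ** 2 + sigma_u ** 2)


for sigma_u in (0.655, 0.5, 0.333):
    print(f"sigma_u = {sigma_u}: reliability = {reliability(sigma_u):.3f}")
```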

To implement the proposed method, a program was written in the statistical language R (R Development Core Team, 2010), with computationally intensive parts written in C code. All the code used is provided in the online Supporting Information. The true and the naive mixed models have been fitted using the R package lme4.

For the generalized linear measurement error model, a logistic regression model has been considered. Two variables measured with error have been generated. The true variables have zero mean and correlation equal to 0.5, while measurement errors are taken uncorrelated. The elements of $\beta_x$ are both set equal to 1, while the intercept $\beta_0$ is set equal to 0. No covariates measured without error have been included. The number of observations $m$ is taken equal to 100 and 300. There are 1000 simulations for each parameter setting. Table 1 reports bias and root mean-square error (RMSE) of the estimates of the coefficient of the first variable measured with error. Since the two variables are perfectly symmetric, results for the second one are omitted. Simulation results show that the bias of the adjusted estimator is very close to that of the true method. Notice that here the small-sample bias is positive, while the presence of measurement error causes a negative bias in the naive estimator. These two sources of bias work in opposite directions and, for the naive estimator, they tend to compensate each other. Increasing the measurement error variance increases the standard error of the adjusted estimates, due to the greater uncertainty of the estimation process. However, it has no effect on the bias of the adjusted estimates. This may be due to the fact that the density functions $f(W_i|X_i)$ and $f(X_i|Z_i)$ in (4) are normal, so that the Laplace approximation gives a nearly exact solution of the integral in (4) (see Liu and Pierce, 1994).
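The data-generating design described above can be sketched as follows. This is an illustrative reconstruction in Python rather than the R/C code of the Supporting Information; the function name, the seed handling and the way the correlated covariates are built from independent normals are assumptions.

```python
import math
import random


def simulate_logistic_me(m, sigma_u, rho=0.5, beta0=0.0, beta_x=(1.0, 1.0), seed=0):
    """Generate (y, w1, w2, x1, x2) tuples for a logistic measurement error model.

    True covariates: zero mean, unit variance, correlation rho.
    Measurements: W = X + U with independent N(0, sigma_u^2) errors.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(m):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1 = e1
        x2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * e2
        w1 = x1 + rng.gauss(0.0, sigma_u)
        w2 = x2 + rng.gauss(0.0, sigma_u)
        eta = beta0 + beta_x[0] * x1 + beta_x[1] * x2
        prob = 1.0 / (1.0 + math.exp(-eta))
        y = 1 if rng.random() < prob else 0
        data.append((y, w1, w2, x1, x2))
    return data


sample = simulate_logistic_me(m=100, sigma_u=0.5)
print(sample[0])
```

The true estimator would use the columns `x1` and `x2`, the naive estimator `w1` and `w2`, and the adjusted method `w1` and `w2` together with the known error variance.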

A logistic regression model with random intercepts in the homogeneous case has been considered for the generalized linear mixed measurement error model. The parameter settings have been taken similar to those of Wang et al. (1998). In particular, there is one variable measured with error that has zero mean. The number of clusters $m$ is taken equal to 50 and 100, and the number of units within clusters is constant across groups, with $n_i = n = 5$ and 10. One variable measured without error has been generated from a standard normal distribution. The coefficients are $\beta_0 = 0$, $\beta_x = 2$ and $\beta_z = 1$. The variance of the random intercepts, $\psi_b$, has been taken equal to 0.5. The results are based on 1000 data sets for each set of parameter values and are displayed in Table 2.

The true and the naive methods use the Laplace approximation for the integration of the random effects. All the methods present convergence problems for some data sets. In case of convergence problems for one method, the data set has been excluded for all the methods. The tables report the number of cases presenting no convergence (n.c.) for each method, and the number of excluded data sets (excl.). Since, in some cases, different methods have problems with the same data set, the number of excluded cases is not given by the sum of cases presenting no convergence.
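The exclusion rule just described can be sketched as a small bookkeeping routine. The data structure (a dictionary of per-data-set convergence flags) and the function name are assumptions made for illustration.

```python
def convergence_summary(converged):
    """Summarize convergence across methods.

    converged maps a method name to a list of booleans, one per simulated data set.
    Returns (n.c. per method, number of excluded data sets, keep mask).
    """
    methods = list(converged)
    n_sets = len(converged[methods[0]])
    no_conv = {name: sum(not ok for ok in converged[name]) for name in methods}
    # A data set is kept only if every method converged on it.
    keep = [all(converged[name][k] for name in methods) for k in range(n_sets)]
    excluded = n_sets - sum(keep)
    return no_conv, excluded, keep


# Hypothetical flags for four data sets: two methods fail on the same data set.
flags = {
    "true": [True, False, True, True],
    "naive": [True, False, True, False],
    "adjusted": [True, True, True, True],
}
no_conv, excluded, keep = convergence_summary(flags)
print(no_conv, excluded)
```

In this hypothetical example there are three non-convergence cases in total but only two excluded data sets, which is why the two columns of the tables need not add up.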

It is important to note that the true method provides biased parameter estimates due to the small sample size and the Laplace approximation. When the naive method is used, the measurement error is a further source of bias in parameter estimation.

Naive estimates of both $\beta_x$ and $\beta_z$ are biased downwards, with the former presenting the larger bias. Measurement error also causes an underestimation of the variance of the random intercept. The adjusted estimator presents a slight bias and, consistently with our findings, it performs better when the measurement error variance is smaller.

Further simulation studies for generalized linear mixed measurement error models have been conducted. These studies regard a logistic regression model with random intercepts in the heterogeneous case and a Poisson model with random intercepts in both the homogeneous and the heterogeneous cases. For the heterogeneous case, the random intercept variance of the structural model is $\sigma_a^2 = 1.5$. The other parameters are the same as in the first simulation. The results are reported in the online Supporting Information. The simulation of the logistic model in the heterogeneous case gives results very similar to those of the homogeneous case (Supporting Information Table S1). The Laplace approximation works very well for the Poisson model in both the homogeneous and the heterogeneous cases (Supporting Information Tables S2 and S3). Using the true method, the estimates of $\psi_b$ are slightly biased downwards, but this is a well-known result for the variance components in mixed models. The naive method presents a negative bias for $\beta_x$, while $\beta_z$ seems not to be affected by measurement error. The parameter $\psi_b$ has instead a positive bias. The adjusted method gives nearly unbiased estimates for $\beta_x$ and $\beta_z$. For the variance of the random intercepts $\psi_b$, it presents a negative bias. In the heterogeneous case when $\sigma_u = 0.655$, it is not negligible, but most of the large bias of the naive estimator is corrected.

Table 1 Simulation of logistic measurement error model.

| m | Method | σu | Bias | RMSE |
|---|---|---|---|---|
| 100 | True | 0.655 | 0.067 | 0.117 |
| 100 | Naive | 0.655 | −0.265 | 0.124 |
| 100 | Adjusted | 0.655 | 0.057 | 0.180 |
| 100 | True | 0.500 | 0.053 | 0.139 |
| 100 | Naive | 0.500 | −0.169 | 0.109 |
| 100 | Adjusted | 0.500 | 0.064 | 0.201 |
| 100 | True | 0.333 | 0.058 | 0.126 |
| 100 | Naive | 0.333 | −0.059 | 0.100 |
| 100 | Adjusted | 0.333 | 0.062 | 0.159 |
| 300 | True | 0.655 | 0.025 | 0.036 |
| 300 | Naive | 0.655 | −0.287 | 0.101 |
| 300 | Adjusted | 0.655 | 0.023 | 0.071 |
| 300 | True | 0.500 | 0.022 | 0.038 |
| 300 | Naive | 0.500 | −0.191 | 0.061 |
| 300 | Adjusted | 0.500 | 0.023 | 0.061 |
| 300 | True | 0.333 | 0.025 | 0.038 |
| 300 | Naive | 0.333 | −0.084 | 0.036 |
| 300 | Adjusted | 0.333 | 0.025 | 0.044 |

Table 2 Simulation of logistic model in the homogeneous case.

| m | Method | σu | n | βx Bias | βx RMSE | βz Bias | βz RMSE | ψb Bias | ψb RMSE | n.c. | excl. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 50 | True | 0.655 | 5 | 0.033 | 0.310 | 0.022 | 0.230 | 0.018 | 0.494 | 0 | 1 |
| 50 | Naive | 0.655 | 5 | −0.819 | 0.840 | −0.154 | 0.247 | −0.142 | 0.397 | 1 | 1 |
| 50 | Adjusted | 0.655 | 5 | 0.020 | 0.701 | 0.024 | 0.319 | 0.059 | 1.045 | 0 | 1 |
| 50 | True | 0.655 | 10 | 0.020 | 0.205 | 0.010 | 0.155 | −0.019 | 0.281 | 1 | 1 |
| 50 | Naive | 0.655 | 10 | −0.826 | 0.836 | −0.164 | 0.210 | −0.172 | 0.270 | 0 | 1 |
| 50 | Adjusted | 0.655 | 10 | −0.056 | 0.401 | −0.003 | 0.207 | −0.074 | 0.401 | 0 | 1 |
| 50 | True | 0.500 | 5 | 0.049 | 0.321 | 0.035 | 0.237 | 0.017 | 0.487 | 0 | 2 |
| 50 | Naive | 0.500 | 5 | −0.564 | 0.608 | −0.097 | 0.226 | −0.102 | 0.394 | 2 | 2 |
| 50 | Adjusted | 0.500 | 5 | 0.038 | 0.520 | 0.030 | 0.279 | 0.009 | 0.819 | 0 | 2 |
| 50 | True | 0.500 | 10 | 0.032 | 0.210 | 0.013 | 0.152 | −0.016 | 0.285 | 0 | 1 |
| 50 | Naive | 0.500 | 10 | −0.578 | 0.597 | −0.113 | 0.176 | −0.130 | 0.259 | 1 | 1 |
| 50 | Adjusted | 0.500 | 10 | −0.010 | 0.274 | 0.006 | 0.170 | −0.059 | 0.303 | 0 | 1 |
| 50 | True | 0.333 | 5 | 0.049 | 0.316 | 0.026 | 0.221 | 0.030 | 0.519 | 2 | 2 |
| 50 | Naive | 0.333 | 5 | −0.286 | 0.386 | −0.042 | 0.210 | −0.044 | 0.447 | 0 | 2 |
| 50 | Adjusted | 0.333 | 5 | 0.037 | 0.394 | 0.025 | 0.241 | 0.009 | 0.681 | 0 | 2 |
| 50 | True | 0.333 | 10 | 0.015 | 0.207 | 0.013 | 0.152 | −0.028 | 0.286 | 0 | 0 |
| 50 | Naive | 0.333 | 10 | −0.303 | 0.349 | −0.054 | 0.148 | −0.085 | 0.266 | 0 | 0 |
| 50 | Adjusted | 0.333 | 10 | 0.008 | 0.232 | 0.011 | 0.155 | −0.044 | 0.298 | 0 | 0 |
| 100 | True | 0.655 | 5 | 0.017 | 0.224 | 0.011 | 0.164 | −0.017 | 0.329 | 0 | 7 |
| 100 | Naive | 0.655 | 5 | −0.829 | 0.840 | −0.162 | 0.212 | −0.181 | 0.297 | 7 | 7 |
| 100 | Adjusted | 0.655 | 5 | −0.080 | 0.409 | −0.014 | 0.192 | −0.138 | 0.454 | 0 | 7 |
| 100 | True | 0.655 | 10 | 0.012 | 0.145 | 0.001 | 0.105 | −0.006 | 0.202 | 0 | 0 |
| 100 | Naive | 0.655 | 10 | −0.831 | 0.836 | −0.169 | 0.190 | −0.161 | 0.221 | 0 | 0 |
| 100 | Adjusted | 0.655 | 10 | −0.090 | 0.251 | −0.018 | 0.122 | −0.081 | 0.253 | 0 | 0 |
| 100 | True | 0.500 | 5 | 0.008 | 0.212 | 0.012 | 0.152 | −0.030 | 0.322 | 3 | 3 |
| 100 | Naive | 0.500 | 5 | −0.593 | 0.612 | −0.112 | 0.176 | −0.143 | 0.293 | 3 | 3 |
| 100 | Adjusted | 0.500 | 5 | −0.054 | 0.274 | −0.004 | 0.167 | −0.118 | 0.358 | 0 | 3 |
| 100 | True | 0.500 | 10 | 0.016 | 0.145 | 0.006 | 0.105 | −0.012 | 0.205 | 0 | 1 |
| 100 | Naive | 0.500 | 10 | −0.587 | 0.596 | −0.119 | 0.152 | −0.128 | 0.210 | 1 | 1 |
| 100 | Adjusted | 0.500 | 10 | −0.032 | 0.190 | −0.005 | 0.118 | −0.066 | 0.226 | 0 | 1 |
| 100 | True | 0.333 | 5 | 0.017 | 0.217 | 0.015 | 0.161 | −0.025 | 0.338 | 0 | 7 |
| 100 | Naive | 0.333 | 5 | −0.305 | 0.355 | −0.050 | 0.158 | −0.088 | 0.308 | 7 | 7 |
| 100 | Adjusted | 0.333 | 5 | 0.000 | 0.243 | 0.012 | 0.167 | −0.066 | 0.351 | 0 | 7 |
| 100 | True | 0.333 | 10 | 0.008 | 0.145 | 0.000 | 0.105 | −0.013 | 0.200 | 0 | 1 |
| 100 | Naive | 0.333 | 10 | −0.311 | 0.335 | −0.065 | 0.118 | −0.075 | 0.195 | 1 | 1 |
| 100 | Adjusted | 0.333 | 10 | −0.004 | 0.164 | −0.001 | 0.110 | −0.035 | 0.210 | 0 | 1 |

5 Example

The use of the Laplace approximation for the estimation of a logistic random intercept model with measurement error in the covariates will be illustrated using data from the First National Health and Nutrition Examination Survey (NHANES I) and its Epidemiologic Follow-up Study (NHEFS). Briefly, between 1971 and 1975, a random sample of 23 808 persons representing the non-institutionalized US population aged 1–74 years was selected for participation in the NHANES I. Of these, 14 407 participants aged 25–74 underwent a medical examination. Follow-up data were collected in 1982–1984, 1986–1987 and 1992. This example considers the risk factors for stroke. The response indicates whether the subject had a stroke since the previous interview. The predictor variables at baseline include the following: age, sex, history of diabetes (yes, no), history of heart disease (yes, no), educational attainment (<12 years, ≥12 years), physical activity (yes, no) and long-term systolic blood pressure (SBP). The SBP reading was taken during the baseline survey and is considered an error-prone measure of the long-term SBP (Carroll et al., 2006, p. 13). As suggested by Carroll et al. (2006, p. 113), this variable was transformed into log(SBP − 50) to achieve approximate normality, and the additive measurement error model is generally used in this context. Subjects who had a positive history of stroke at baseline were excluded. The sample used in this study is composed of 2634 White and Black respondents aged 45–74 years at baseline with complete data at baseline and follow-up. Other possible confounders, such as cigarette smoking, serum cholesterol, hemoglobin concentration and body mass index, were not included in the model because they were not significant. Analogously to Carroll et al. (2006, Section 4.3) for the study of breast cancer and to Wang et al. (1998) for the study of coronary heart disease, a logistic model has been fitted. To account for the longitudinal structure of the data, a random intercept at the subject level has been included. The measurement error variance is not known in this study and other sources of information are not available. Using data from the Framingham study (Carroll et al., 2006, Section 1.6.6), the reliability of log(SBP − 50) has been calculated to be around 0.74. For this application, the reliability ($\rho$) was then set equal to 0.8 and 0.7, as they represent two sensible values. As initial values for the maximization of the log-likelihood function, the SIMEX estimates have been taken. Table 3 shows the results for the Laplace-based estimator and the SIMEX method. The structural parameters $\alpha$ have been omitted because they are not of interest. Both the naive and the adjusted models show that SBP is positively associated with the risk of stroke. After measurement error adjustment, the coefficient of this variable is considerably increased. The SIMEX method gives estimates very similar to those of the Laplace-based estimator.
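Fixing the reliability is equivalent to fixing the measurement error variance once the variance of the observed measurements is available. The sketch below shows this conversion under the assumption that reliability is defined as the ratio of true-score variance to observed variance in the additive model; the function names and the numerical inputs are hypothetical and are not taken from the NHANES data.

```python
import math


def transform_sbp(sbp):
    """Transformation used in the example: log(SBP - 50)."""
    return math.log(sbp - 50.0)


def error_variance(rho, var_w):
    """sigma_u^2 implied by reliability rho, assuming rho = var(X) / var(W)
    and var(W) = var(X) + sigma_u^2."""
    return (1.0 - rho) * var_w


# Hypothetical observed variance of the transformed measurements.
var_w = 0.05
for rho in (0.8, 0.7):
    print(f"rho = {rho}: sigma_u^2 = {error_variance(rho, var_w):.4f}")
print(transform_sbp(150.0))
```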

6 Discussion

The use of the Laplace approximation in measurement error models gives encouraging results. The method is applicable to problems involving high-dimensional integrals, where other solutions become impractical or computationally intensive. Furthermore, in terms of computational time, it could be a convenient choice even when other methods are available.

The simulation studies show that the Laplace approximation gives very good results for generalized linear measurement error models. Naturally, the use of the Laplace approximation for generalized linear mixed measurement error models inherits all the limitations of this method for generalized linear mixed models. In particular, some care is required when the Laplace approximation is used for binary data with very small sample sizes.



This paper shows that the convergence rate of the Laplace-based maximum likelihood estimator depends on the measurement error variances. This is not an unusual feature (see for example Cook and Stefanski, 1994) and the simulation studies show the good performance of the method even when considerable measurement error variances are present.

In this paper, the measurement error variances are assumed known. However, further types of additional information could be available, such as internal validation, internal replication, external validation or external replication (Carroll et al., 2006, Section 2.3). At any rate, the methodology proposed here is straightforward to extend to these types of additional information by considering suitable modifications of the likelihood function and applying the Laplace approximation to evaluate integrals. Assuming the measurement error variances as known has the advantage of a simpler expression of the accuracy of the Laplace approximation. In fact, the order of the accuracy of the likelihood approximation for generalized linear measurement error models depends only on the measurement error variances, while, in the case of internal replication, the order of the accuracy depends on both the measurement error variances and the number of replications. An analogous result applies to generalized linear mixed measurement error models.

In generalized linear mixed measurement error models, only one variable measured with error has been considered. However, the Laplace approximation is well suited for multiple variables measured with error in generalized linear mixed models, and the procedure presented in this paper can be easily generalized to take this into account.

In this paper, the true predictor is assumed to be normally distributed, and this assumption may be difficult to verify. For this reason, an extension of the methodology to more robust techniques could be interesting. Flexible parametric specifications as in Carroll et al. (1999) could be promising, but they involve substantial computational complexity. The performance of the method is expected to worsen with non-normally distributed latent predictors, and this requires a specific evaluation. At any rate, the asymptotic results presented in Section 3.2 hold for any distributional assumption.

Acknowledgements The author is grateful to Professors R. Bellio, L. Pace and P. Vidoni for their helpful suggestions. This work was supported by grants of the Italian Ministry for Education, University and Research (MIUR).

Conflict of Interest

The authors have declared no conflict of interest.

Table 3 Estimates (standard errors) for the NHANES data.

| | Naive | Laplace, r = 0.8 | Laplace, r = 0.7 | SIMEX, r = 0.8 | SIMEX, r = 0.7 |
|---|---|---|---|---|---|
| Intercept | −15.634 (1.407) | −17.333 (2.367) | −18.341 (2.737) | −17.407 (2.523) | −18.401 (2.587) |
| log(SBP − 50) | 0.893 (0.226) | 1.308 (0.528) | 1.533 (0.625) | 1.319 (0.494) | 1.534 (0.522) |
| Age | 0.083 (0.011) | 0.081 (0.013) | 0.079 (0.013) | 0.081 (0.012) | 0.080 (0.012) |
| Sex (male) | 0.267 (0.143) | 0.288 (0.199) | 0.284 (0.200) | 0.285 (0.199) | 0.294 (0.199) |
| Heart disease (yes) | 0.884 (0.381) | 0.899 (0.415) | 0.868 (0.420) | 0.870 (0.381) | 0.873 (0.408) |
| Diabetes (yes) | 1.112 (0.477) | 1.125 (0.510) | 1.107 (0.515) | 0.119 (0.469) | 1.121 (0.464) |
| Education (≥ 12 yr) | −0.422 (0.170) | −0.403 (0.197) | −0.398 (0.199) | −0.411 (0.198) | −0.404 (0.211) |
| Physical activity (none) | 0.683 (0.365) | 0.687 (0.388) | 0.702 (0.393) | 0.678 (0.364) | 0.685 (0.362) |
| Year (87) | 1.064 (0.151) | 1.056 (0.158) | 1.064 (0.159) | 1.063 (0.164) | 1.075 (0.177) |
| Year (92) | 1.701 (0.160) | 1.695 (0.169) | 1.706 (0.169) | 1.699 (0.184) | 1.718 (0.198) |
| cb | 15.130 (3.105) | 14.952 (3.442) | 15.555 (3.598) | 14.976 (3.976) | 15.522 (4.161) |



Appendix A: Technical details

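Consider the generalized linear measurement error model. The log-density functions entering in (4) are

$$\log f(Y_i \mid X_i, Z_i) = \{Y_i \gamma_i - b(\gamma_i)\}/\tau^2 - c(Y_i; \tau),$$

$$\log f(W_i \mid X_i) = -\frac{1}{2}\log|\Sigma_u| - \frac{1}{2}(W_i - X_i)^\top \Sigma_u^{-1} (W_i - X_i)$$

and

$$\log f(X_i \mid Z_i) = -\frac{1}{2}\log|\Sigma_x| - \frac{1}{2}(X_i - \alpha Z_i)^\top \Sigma_x^{-1} (X_i - \alpha Z_i).$$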

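The following identity is used to obtain first and second derivatives of $r(X_i)$:

$$\frac{\partial \mu_i}{\partial X_i} = \frac{\partial \mu_i}{\partial g(\mu_i)}\,\frac{\partial g(\mu_i)}{\partial X_i} = \frac{1}{g_\mu(\mu_i)}\,\beta_x.$$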

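The first derivative of $r(X_i)$ is then

$$\frac{\partial r(X_i)}{\partial X_i} = -\frac{1}{\tau^2}(Y_i - \mu_i)\,\omega_i\, g_\mu(\mu_i)\,\beta_x - \Sigma_u^{-1}(W_i - X_i) + \Sigma_x^{-1}(X_i - \alpha Z_i).$$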
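As a worked special case, consider a Bernoulli response with logit link. The sketch below assumes the weight convention of McCulloch and Searle (2001), $\omega_i = \{v(\mu_i)\, g_\mu(\mu_i)^2\}^{-1}$ with $v(\cdot)$ the variance function; this definition is not restated in this appendix and is taken here as an assumption. With $\tau^2 = 1$, $v(\mu_i) = \mu_i(1 - \mu_i)$ and $g_\mu(\mu_i) = 1/\{\mu_i(1 - \mu_i)\}$, the product $\omega_i\, g_\mu(\mu_i)$ equals 1, so the first derivative simplifies and the second derivative follows from the identity for $\partial \mu_i / \partial X_i$:

```latex
\begin{align*}
\frac{\partial r(X_i)}{\partial X_i}
  &= -(Y_i - \mu_i)\,\beta_x - \Sigma_u^{-1}(W_i - X_i) + \Sigma_x^{-1}(X_i - \alpha Z_i), \\
\frac{\partial^2 r(X_i)}{\partial X_i\, \partial X_i^\top}
  &= \mu_i (1 - \mu_i)\, \beta_x \beta_x^\top + \Sigma_u^{-1} + \Sigma_x^{-1}.
\end{align*}
```

The second derivative is positive definite in this case, so $r(X_i)$ has a unique minimum and Newton-type iterations are a natural choice for locating it.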

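Consider the homogeneous case of the generalized linear mixed measurement error model. The log-density functions entering in (7) are

$$\log f(Y_{ij} \mid b_i, X_{ij}) = \{Y_{ij} \gamma_{ij} - b(\gamma_{ij})\}/\tau^2 - c(Y_{ij}; \tau),$$

$$\log f(W_{ij} \mid X_{ij}) = -\frac{1}{2}\log(\sigma_u^2) - \frac{1}{2}\,\frac{(W_{ij} - X_{ij})^2}{\sigma_u^2},$$

$$\log f(X_{ij}) = -\frac{1}{2}\log(\sigma_x^2) - \frac{1}{2}\,\frac{(X_{ij} - \alpha^\top Z_{ij})^2}{\sigma_x^2}$$

and

$$\log f(b_i) = -\frac{1}{2}\log|D| - \frac{1}{2}\, b_i^\top D^{-1} b_i.$$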

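The first derivatives of $r(b_i, X_i)$ are

$$\frac{\partial r(b_i, X_i)}{\partial b_i} = -\frac{1}{\tau^2}\sum_j S_{ij}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij}) + D^{-1} b_i$$

and

$$\frac{\partial r(b_i, X_i)}{\partial X_{ij}} = -\frac{1}{\tau^2}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij})\,\beta_x - \frac{W_{ij} - X_{ij}}{\sigma_u^2} + \frac{X_{ij} - \alpha^\top Z_{ij}}{\sigma_x^2}.$$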

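In the heterogeneous case of the generalized linear mixed measurement error model, the log-density functions entering in (10) are

$$\log f(Y_{ij} \mid X_{ij}, b_i) = [Y_{ij} \gamma_{ij} - b(\gamma_{ij})]/\tau^2 - c(Y_{ij}; \tau),$$

$$\log f(W_{ij} \mid X_{ij}) = -\frac{1}{2}\log(\sigma_u^2) - \frac{1}{2}\,\frac{(W_{ij} - X_{ij})^2}{\sigma_u^2},$$

$$\log f(X_{ij} \mid a_i) = -\frac{1}{2}\log(\sigma_x^2) - \frac{1}{2}\,\frac{(X_{ij} - \alpha^\top Z_{ij} - a_i)^2}{\sigma_x^2},$$

$$\log f(a_i) = -\frac{1}{2}\log(\sigma_a^2) - \frac{1}{2}\,\frac{a_i^2}{\sigma_a^2}$$

and

$$\log f(b_i) = -\frac{1}{2}\log|D| - \frac{1}{2}\, b_i^\top D^{-1} b_i.$$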

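The first derivatives of $r(a_i, b_i, X_i)$ are

$$\frac{\partial r(a_i, b_i, X_i)}{\partial a_i} = -\frac{\sum_j (X_{ij} - \alpha^\top Z_{ij} - a_i)}{\sigma_x^2} + \frac{a_i}{\sigma_a^2},$$

$$\frac{\partial r(a_i, b_i, X_i)}{\partial b_i} = -\frac{1}{\tau^2}\sum_j S_{ij}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij}) + D^{-1} b_i$$

and

$$\frac{\partial r(a_i, b_i, X_i)}{\partial X_{ij}} = -\frac{1}{\tau^2}(Y_{ij} - \mu_{ij})\,\omega_{ij}\, g_\mu(\mu_{ij})\,\beta_x - \frac{W_{ij} - X_{ij}}{\sigma_u^2} + \frac{X_{ij} - \alpha^\top Z_{ij} - a_i}{\sigma_x^2}.$$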

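First and second derivatives in both the homogeneous and the heterogeneous case are obtained using the following identities:

$$\frac{\partial \mu_{ij}}{\partial X_{ij}} = \frac{\partial \mu_{ij}}{\partial g(\mu_{ij})}\,\frac{\partial g(\mu_{ij})}{\partial X_{ij}} = \frac{1}{g_\mu(\mu_{ij})}\,\beta_x \quad \text{and} \quad \frac{\partial \mu_{ij}}{\partial b_i} = \frac{\partial \mu_{ij}}{\partial g(\mu_{ij})}\,\frac{\partial g(\mu_{ij})}{\partial b_i} = \frac{1}{g_\mu(\mu_{ij})}\,S_{ij}.$$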
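The first derivatives for the homogeneous case can be checked numerically against finite differences. The following sketch assumes a Poisson response with log link and a scalar random intercept (so that $S_{ij} = 1$, $\tau^2 = 1$ and, under the usual weight convention, $\omega_{ij}\, g_\mu(\mu_{ij}) = 1$). The list `mean_x` stands in for $\alpha^\top Z_{ij}$, and all function names and data values are hypothetical.

```python
import math


def r_value(b, x, y, w, mean_x, beta0, beta_x, d, s2_u, s2_x):
    """r(b_i, X_i) for one cluster, omitting additive constants that do not
    depend on (b_i, X_i)."""
    total = 0.5 * b ** 2 / d
    for x_j, y_j, w_j, m_j in zip(x, y, w, mean_x):
        eta = beta0 + beta_x * x_j + b
        total -= y_j * eta - math.exp(eta)       # Poisson log-density term
        total += 0.5 * (w_j - x_j) ** 2 / s2_u   # measurement error term
        total += 0.5 * (x_j - m_j) ** 2 / s2_x   # true covariate term
    return total


def r_gradient(b, x, y, w, mean_x, beta0, beta_x, d, s2_u, s2_x):
    """Analytic first derivatives with respect to b_i and each X_ij."""
    mu = [math.exp(beta0 + beta_x * x_j + b) for x_j in x]
    d_b = -sum(y_j - mu_j for y_j, mu_j in zip(y, mu)) + b / d
    d_x = [-(y_j - mu_j) * beta_x - (w_j - x_j) / s2_u + (x_j - m_j) / s2_x
           for x_j, y_j, w_j, m_j, mu_j in zip(x, y, w, mean_x, mu)]
    return d_b, d_x


def numeric_gradient(b, x, *args, h=1e-5):
    """Central finite differences of r_value, used only as a check."""
    d_b = (r_value(b + h, x, *args) - r_value(b - h, x, *args)) / (2 * h)
    d_x = []
    for j in range(len(x)):
        up, down = list(x), list(x)
        up[j] += h
        down[j] -= h
        d_x.append((r_value(b, up, *args) - r_value(b, down, *args)) / (2 * h))
    return d_b, d_x


if __name__ == "__main__":
    data = ([2, 0, 1], [0.3, -0.4, 1.1], [0.1, 0.0, 0.5], -0.2, 0.7, 0.5, 0.3, 1.0)
    print(r_gradient(0.2, [0.25, -0.1, 0.9], *data))
    print(numeric_gradient(0.2, [0.25, -0.1, 0.9], *data))
```

Agreement between the two gradients is a cheap safeguard before the derivatives are passed to an optimizer for the inner minimization of $r$.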

Appendix B: Consistency of the Laplace-based maximum likelihood estimates

In order to evaluate the asymptotic order of the Laplace approximations (5), (8) and (11), let us consider the first multiplicative correction term of (1), that is

$$1 + \frac{1}{24}\left(3 r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} + 2 r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} - 3 r^{ij} r^{hk} r_{ijhk}\right), \tag{B1}$$

where $r_{R_m}$, $R_m = r_1, \ldots, r_m$, indicate $m$th partial derivatives of $r(x)$ with respect to the corresponding components of $x$, $r_{R_m} = r_{R_m}(\hat{x})$, and $r^{ij}$ are the components of the inverse matrix of second derivatives. Here the index notation and the summation convention have been employed. The vector of parameters $\theta$ is assumed to be fixed, while the measurement error variances tend to zero and the sample sizes tend to infinity.
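For a scalar argument the correction term (B1) takes a familiar form. Writing $r_2$, $r_3$ and $r_4$ for the second, third and fourth derivatives of $r$ at the expansion point (shorthand introduced only for this illustration), the inverse of the second derivative is $1/r_2$ and the three index sums collapse as follows:

```latex
\[
1 + \frac{1}{24}\left( \frac{3 r_3^2}{r_2^3} + \frac{2 r_3^2}{r_2^3} - \frac{3 r_4}{r_2^2} \right)
  = 1 + \frac{5 r_3^2}{24\, r_2^3} - \frac{r_4}{8\, r_2^2}.
\]
```

When $r_2$ grows while $r_3$ and $r_4$ stay bounded, the term in $r_4$ dominates and the correction differs from 1 by a quantity of order $r_2^{-2}$.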

Consider the generalized linear measurement error model. In all scalars of the type (B1), $r''(X_i)$ may be replaced by $\Sigma_u^{-1}$ without affecting the asymptotic order. In this case $(r''(X_i))^{-1} = \Sigma_u$. Let the elements of $\Sigma_u$ corresponding to variables $i$ and $j$ be denoted by $\sigma_{uij}$, with $\sigma_{uii} = \sigma_{ui}^2$. Then, $O(r^{ij} r^{kl}) = O(\sigma_{uij}\sigma_{ukl}) \le O(\sigma_{ui}\sigma_{uj}\sigma_{uk}\sigma_{ul})$, and the equality holds only for $i = j$ and $k = l$. The highest error is then $O(\max_s \sigma_{us}^4)$. Since $r_{ijh}$ and $r_{ijhk}$ are $O(1)$, the asymptotic orders of the scalars in Eq. (B1) are

$$r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} \le O(\max_s \sigma_{us}^6),$$

$$r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} \le O(\max_s \sigma_{us}^6),$$

$$r^{ij} r^{hk} r_{ijhk} \le O(\max_s \sigma_{us}^4).$$



For unit $i$ the log-likelihood function is then approximated as

$$\ell_i(\theta) = \tilde{\ell}_i(\theta) + O(\max_s \sigma_{us}^4)$$

and the overall log-likelihood function is

$$\ell(\theta) = \tilde{\ell}(\theta) + m\, O(\max_s \sigma_{us}^4). \tag{B2}$$

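Let $\ell_{R_m}(\theta) = \ell_{r_1 \ldots r_m}(\theta)$ denote the generic derivative of order $m$ of $\ell(\theta)$ and let $\hat{\theta}$ be the Laplace-based maximum likelihood estimate satisfying $\tilde{\ell}_r(\hat{\theta}) = 0$. Differentiating (B2) with respect to the $r$th component of $\theta$, evaluating it at $\hat{\theta}$ and dividing by $m$ gives

$$m^{-1}\ell_r(\hat{\theta}) = m^{-1}\tilde{\ell}_r(\hat{\theta}) + O(\max_s \sigma_{us}^4) = O(\max_s \sigma_{us}^4).$$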

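Under suitable regularity conditions on $\ell(\theta)$ and provided that $(\hat{\theta} - \theta) = o_p(1)$, a Taylor series expansion about the true parameter value $\theta$ yields

$$m^{-1}\ell_r(\hat{\theta}) \approx m^{-1}\ell_r(\theta) + m^{-1}\ell_{rs}(\theta)(\hat{\theta} - \theta)^s, \tag{B3}$$

where the remainder is $O_p(1)(\hat{\theta} - \theta)^{st}$. Combining these results, it follows that

$$(\hat{\theta} - \theta)^s = (m^{-1}\ell_{rs}(\theta))^{-1}\{O(\max_s \sigma_{us}^4) - m^{-1}\ell_r(\theta)\} = O_p(\max(m^{-1/2}, \max_s \sigma_{us}^4)),$$

where the standard asymptotic results $\ell_r(\theta) = O_p(m^{1/2})$ and $\ell_{rs}(\theta) = O_p(m)$ have been used.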

Consider now the generalized linear mixed measurement error model. Let $\bar{q}$ be equal to $q$ in the homogeneous case and equal to $q + 1$ in the heterogeneous case. Let us divide the argument of $r''$ into two components: the first one has dimension $\bar{q}$ and concerns the random effects $a_i$ and $b_i$, while the second one has dimension $n_i$ and regards the true covariates $X_i$. The matrix $r''$ can then be partitioned in $2 \times 2$ blocks. By applying the formulas for the inversion of block matrices it is possible to determine the largest elements of $(r'')^{-1}$, which are $O(n_i^{-1})$ and $O(\sigma_u^2)$ in both the homogeneous and the heterogeneous cases. Furthermore,

$$r_{ijh} = \begin{cases} O(n_i) & \text{if } i, j, h = 1, \ldots, \bar{q}, \\ O(1) & \text{if } i = \bar{q}+1, \ldots, \bar{q}+n_i;\ j, h = 1, \ldots, \bar{q}+n_i \end{cases}$$

and

$$r_{ijhk} = \begin{cases} O(n_i) & \text{if } i, j, h, k = 1, \ldots, \bar{q}, \\ O(1) & \text{if } i = \bar{q}+1, \ldots, \bar{q}+n_i;\ j, h, k = 1, \ldots, \bar{q}+n_i. \end{cases}$$

The asymptotic order of the scalars in Eq. (B1) is then

$$r^{ij} r^{hk} r^{lm} r_{ijh} r_{klm} \le O(\max(\sigma_u^6, n_i^{-1})),$$

$$r^{ik} r^{jl} r^{hm} r_{ijh} r_{klm} \le O(\max(\sigma_u^6, n_i^{-1})),$$

$$r^{ij} r^{hk} r_{ijhk} \le O(\max(\sigma_u^4, n_i^{-1})).$$
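The orders $O(n_i^{-1})$ and $O(\sigma_u^2)$ stated above for the largest elements of $(r'')^{-1}$ can be checked numerically in a simple special case. The sketch below assumes a Gaussian response with identity link and a single random intercept, so that $\bar{q} = 1$ and $r''$ does not depend on the evaluation point. The parameter values and the helper names `hessian` and `invert` are hypothetical.

```python
def hessian(n_i, beta_x=1.0, tau2=1.0, d=1.0, s2_u=0.1, s2_x=1.0):
    """r'' for one cluster: index 0 is the random intercept b_i,
    indices 1..n_i are the true covariates X_i1, ..., X_in_i."""
    size = n_i + 1
    h = [[0.0] * size for _ in range(size)]
    h[0][0] = n_i / tau2 + 1.0 / d
    for j in range(1, size):
        h[0][j] = h[j][0] = beta_x / tau2
        h[j][j] = beta_x ** 2 / tau2 + 1.0 / s2_u + 1.0 / s2_x
    return h


def invert(a):
    """Gauss-Jordan inverse without pivoting; adequate for the small
    symmetric positive definite matrices used here."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [rv - factor * cv
                            for rv, cv in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


if __name__ == "__main__":
    s2_u = 0.1
    for n_i in (5, 20, 80):
        inv = invert(hessian(n_i, s2_u=s2_u))
        # n_i * (random-effect element) and (covariate element) / s2_u
        print(n_i, n_i * inv[0][0], inv[1][1] / s2_u)
```

In this special case the rescaled quantities stay bounded as $n_i$ grows, which is what the stated orders require. For non-Gaussian responses the second derivatives depend on the evaluation point, so the check would have to be repeated at the minimizer of $r$.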

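For cluster $i$ the log-likelihood function is then approximated as

$$\ell_i(\theta) = \tilde{\ell}_i(\theta) + O(\max(\sigma_u^4, n_i^{-1}))$$

and the overall log-likelihood function is

$$\ell(\theta) = \tilde{\ell}(\theta) + O(m \cdot \max(\sigma_u^4, \min(n_i)^{-1})).$$

Then,

$$m^{-1}\ell_r(\hat{\theta}) = m^{-1}\tilde{\ell}_r(\hat{\theta}) + O(\max(\sigma_u^4, \min(n_i)^{-1})) = O(\max(\sigma_u^4, \min(n_i)^{-1})). \tag{B4}$$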



Combining Eq. (B3) with Eq. (B4) gives

$$(\hat{\theta} - \theta)^s = (m^{-1}\ell_{rs}(\theta))^{-1}\{O(\max(\sigma_u^4, \min(n_i)^{-1})) - m^{-1}\ell_r(\theta)\} = O_p(\max(m^{-1/2}, \sigma_u^4, \min(n_i)^{-1})).$$
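To see which term drives the rates derived in this appendix, the candidate orders can simply be compared for given design values. The helper below is a hypothetical illustration, not part of the paper. Orders hide multiplicative constants, so the comparison is only indicative.

```python
def convergence_order(m, s2_u, n_i=None):
    """Return the largest of m^{-1/2}, max_s sigma_us^4 and, when cluster
    sizes are supplied, min(n_i)^{-1}, together with a label for that term.

    m    : number of units or clusters
    s2_u : iterable of measurement error variances
    n_i  : optional iterable of cluster sizes (mixed model case)
    """
    terms = {
        "sampling": m ** -0.5,
        "measurement error": max(v ** 2 for v in s2_u),
    }
    if n_i is not None:
        terms["cluster size"] = 1.0 / min(n_i)
    dominant = max(terms, key=terms.get)
    return terms[dominant], dominant


if __name__ == "__main__":
    print(convergence_order(400, [0.1]))
    print(convergence_order(400, [0.5]))
    print(convergence_order(400, [0.1], n_i=[5, 10]))
```

When the sampling term dominates, the error introduced by the Laplace approximation is of smaller order than the usual sampling variability. Otherwise the measurement error variance or the smallest cluster size limits the rate.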

References

Anderson, D. A. and Aitkin, M. (1985). Variance component models with binary response: Interviewer variability. Journal of the Royal Statistical Society, Series B: Methodological 47, 203–210.

Barndorff-Nielsen, O. E. and Cox, D. R. (1989). Asymptotic Techniques for Use in Statistics. Chapman & Hall, London.

Carroll, R. J., Roeder, K. and Wasserman, L. (1999). Flexible parametric measurement error models. Biometrics 55, 44–54.

Carroll, R. J., Ruppert, D., Stefanski, L. A. and Crainiceanu, C. M. (2006). Measurement Error in Nonlinear Models. Chapman & Hall, London.

Cook, J. R. and Stefanski, L. A. (1994). Simulation–extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association 89, 1314–1328.

Higdon, R. and Schafer, D. W. (2001). Maximum likelihood computations for regression with measurement error. Computational Statistics and Data Analysis 35, 283–299.

Ko, H. and Davidian, M. (2000). Correcting for measurement error in individual-level covariates in nonlinear mixed effects models. Biometrics 56, 368–375.

Liu, Q. and Pierce, D. A. (1993). Heterogeneity in Mantel–Haenszel-type models. Biometrika 80, 543–556.

Liu, Q. and Pierce, D. A. (1994). A note on Gauss–Hermite quadrature. Biometrika 81, 624–629.

McCullagh, P. and Nelder, J. A. (1999). Generalized Linear Models. Chapman & Hall, London.

McCulloch, C. E. (1994). Maximum likelihood variance components estimation for binary data. Journal of the American Statistical Association 89, 330–335.

McCulloch, C. E. (1997). Maximum likelihood algorithms for generalized linear mixed models. Journal of the American Statistical Association 92, 162–170.

McCulloch, C. E. and Searle, S. R. (2001). Generalized, Linear, and Mixed Models. Wiley, New York.

R Development Core Team. (2010). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.

Schafer, D. W. (1987). Covariate measurement error in generalized linear models. Biometrika 74, 385–391.

Vonesh, E. F. (1996). A note on the use of Laplace’s approximation for nonlinear mixed-effects models. Biometrika 83, 447–452.

Wang, N., Lin, X., Gutierrez, R. G. and Carroll, R. (1998). Bias analysis and SIMEX approach in generalized linear mixed measurement error models. Journal of the American Statistical Association 93, 249–261.

Wang, N., Lin, X. and Gutierrez, R. G. (1999). A bias correction regression calibration approach in generalized linear mixed measurement error models. Communications in Statistics: Theory and Methods 28, 217–232.


