A Mixed Poisson-Inverse-Gaussian Regression Model
Author(s): C. Dean, J. F. Lawless and G. E. Willmot
Source: The Canadian Journal of Statistics / La Revue Canadienne de Statistique, Vol. 17, No. 2 (Jun., 1989), pp. 171-181
Published by: Statistical Society of Canada
Stable URL: http://www.jstor.org/stable/3314846


The Canadian Journal of Statistics Vol. 17, No. 2, 1989, Pages 171-181 La Revue Canadienne de Statistique


A mixed Poisson-inverse-Gaussian regression model*

C. DEAN, J.F. LAWLESS, AND G.E. WILLMOT

Key words and phrases: regression analysis of counts, extra-Poisson variation, maximum likelihood, quasilikelihood, quadratic estimating equations.

AMS 1985 subject classifications: 62J02.

ABSTRACT

The mixed Poisson-inverse-Gaussian distribution has been used by Holla, Sankaran, Sichel, and others in univariate problems involving counts. We propose a Poisson-inverse-Gaussian regression model which can be used for regression analysis of counts. The model provides an attractive framework for incorporating random effects in Poisson regression models and in handling extra-Poisson variation. Maximum-likelihood and quasilikelihood-moment estimation is investigated and illustrated with an example involving motor-insurance claims.

RÉSUMÉ

A weighted mixture of Poisson laws, with weights following an inverse Gaussian law, has been used by Holla, Sankaran, Sichel, and others as a univariate model in counting problems. We propose a regression model based on such a mixture, which can be used in regression analyses of counts. It permits the incorporation of random effects in a Poisson regression model, as well as the treatment of variation not accounted for by the Poisson law. Estimation by maximum likelihood and by quasilikelihood/moment methods is studied and illustrated with data concerning automobile-insurance claims.

1. INTRODUCTION

Mixed Poisson distributions have been found useful in situations where counts display extra-Poisson variation. Applications of univariate models abound in areas such as insurance (e.g., Willmot 1987) and biology (e.g., Anscombe 1950), where specific models such as the negative-binomial and Poisson-inverse-Gaussian are widely used. Mixed Poisson regression models have been employed in areas such as demography (e.g., Brillinger 1986), medicine (e.g., Breslow 1984), and engineering (e.g., Engel 1984). With regression data, statistical analysis is often based on weighted least-squares or quasilikelihood methods, but fully parametric models such as the negative-binomial (e.g., Engel 1984, Lawless 1987) or Poisson-Normal mixture (e.g., Hinde 1982) are also used, particularly when more than the first two moments are of interest.

In some applications, for example in insurance, it is useful to fit a specific probability distribution to the data. A factor that somewhat inhibits the use of fully parametric methods is that inference for many mixed Poisson regression models requires the use of numerical integration (e.g., Brillinger 1986, Section 6), negative-binomial models being a notable exception. For univariate data, the Poisson-inverse-Gaussian mixture has been shown to

*Research was supported in part by grants to J.F. Lawless and G.E. Willmot from the Natural Sciences and Engineering Research Council of Canada.


be an attractive and easily used model (e.g., Holla 1966, Sankaran 1968, Sichel 1971, Ord and Whitmore 1986, Willmot 1987). In this paper we consider a regression form of the model, and show that statistical methods for it are straightforward computationally. This provides an attractive alternative to negative-binomial models when a longer-tailed distribution is required.

Section 2 introduces the model, and Section 3 develops maximum-likelihood estimation. Section 4 examines the efficiency of quasilikelihood-moment methods relative to maximum likelihood. Section 5 contains an example involving motor-insurance claims.

2. A POISSON-INVERSE-GAUSSIAN REGRESSION MODEL

Let Y be a response variable, and let x be an associated k × 1 vector of covariates. A Poisson regression model for Y would stipulate that, given x, Y had a Poisson distribution with mean μ(x). There are various ways to introduce random effects or extra-Poisson variation into such a model. The approach we discuss below is a very natural and flexible one which has been used by numerous other authors such as Brillinger (1986), Engel (1984), Hinde (1982), and Lawless (1987). We note, however, that other models also have appeal, and we discuss this momentarily.

We consider mixed Poisson regression models with

Pr(Y = y | x) = ∫₀^∞ e^{-vμ(x)} [vμ(x)]^y / y! · g(v) dv,  y = 0, 1, ...,   (2.1)

where g(v) is a probability density function and μ(x) is a positive-valued function. Such models can be viewed as multiplicative Poisson random-effects models (e.g., Brillinger 1986) where, given the fixed covariates x and a random effect v with density g(v), v > 0, the response Y has a Poisson distribution with mean vμ(x). We assume that μ(x) depends on a vector β of unknown regression coefficients and, without loss of generality, that E(v) = 1. This parametrization has the appealing property that when μ(x) takes the common log-linear form μ(x) = exp(x'β), random and fixed effects are added on the same (exponential) scale.

It is well known that the assumption of a gamma distribution for v leads to a negative binomial model. In this paper, we assume an inverse Gaussian distribution (e.g., Folks and Chhikara 1978, Tweedie 1957) for v, with density

g(v) = (2πτv³)^{-1/2} e^{-(v-1)²/(2τv)},  v > 0.   (2.2)

The parameter τ is unknown, and equals Var(v). The distribution of Y given x resulting from (2.1) is then a Poisson-inverse-Gaussian (P-IG) regression model, with mean and variance functions μ(x) and μ(x) + τμ(x)², respectively. For convenience we will write Y ~ P-IG(μ(x), τ) to denote this model. It provides a heavier-tailed alternative to the negative-binomial model and is more tractable than other Poisson mixtures (2.1).

A simple extension of the model is also often useful. Suppose that given the random effect v_i, Y_i is Poisson with mean v_i μ(x_i; β) T_i, where T_i is a known measure of exposure. In some situations v_i and, in particular, its variance might depend on T_i. For example, if Y_i is obtained as an aggregate count by adding across counts with a common x_i but different exposures, then a plausible model might be to take Var(v_i) = τ/T_i. The model (2.1) arising from this can be fitted with only slight alterations to the procedures in Section 3 and is discussed in the example of Section 5.
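The mixture defined by (2.1) and (2.2) is straightforward to simulate, which is useful for checking estimation procedures. A minimal sketch (the function name and the use of NumPy are ours, not the authors'): draw v from an inverse Gaussian with mean 1 and shape 1/τ, so that Var(v) = τ, then draw Y from a Poisson with mean vμ.

```python
import numpy as np

def simulate_pig(mu, tau, size, rng):
    """Simulate P-IG(mu, tau) counts by compounding:
    v ~ inverse Gaussian with E(v) = 1, Var(v) = tau; Y | v ~ Poisson(v * mu)."""
    # Generator.wald(mean, scale) has variance mean^3/scale,
    # so scale = 1/tau gives Var(v) = tau when mean = 1.
    v = rng.wald(1.0, 1.0 / tau, size=size)
    return rng.poisson(v * mu)

rng = np.random.default_rng(2024)
y = simulate_pig(mu=5.0, tau=0.4, size=200_000, rng=rng)
# Sample moments should be near E(Y) = mu = 5 and Var(Y) = mu + tau*mu^2 = 15.
```

Simulated counts of this kind also make it easy to verify the heavier tail of the P-IG relative to a negative binomial with the same first two moments.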

We note that Jorgensen (1987) and Stein and Juritz (1988) consider other P-IG regres- sion models. Each has attractive features: Jorgensen's is a discrete exponential dispersion


model and satisfies an appealing convolution property, and Stein and Juritz's is structured so that the regression parameters β are orthogonal to the parameter (analogous to our τ) specifying the degree of extra-Poisson variation. Neither model has, however, the simple structure of those we consider in terms of the multiplicative random effects.

We note for use in the next section that for P-IG(μ, τ) the probability generating function is (e.g., Holla 1966, Willmot 1987)

P(z) = Σ_{y=0}^∞ p(y) z^y = exp(τ^{-1}[1 - {1 - 2τμ(z - 1)}^{1/2}]),   (2.3)

where we have written p(y) for Pr(Y = y). Probabilities may be calculated recursively using the easily established results

p(0) = exp[τ^{-1}{1 - (1 + 2τμ)^{1/2}}],
p(1) = μ(1 + 2τμ)^{-1/2} p(0),
p(y) = (2τμ/(1 + 2τμ))(1 - 3/(2y)) p(y - 1) + (μ²/{(1 + 2τμ) y(y - 1)}) p(y - 2),  y = 2, 3, ....   (2.4)
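The recursion (2.4) is easy to program; the following sketch (the function name is ours) computes the probabilities and can be checked against the generating function (2.3):

```python
import math

def pig_pmf(mu, tau, ymax):
    """P-IG(mu, tau) probabilities p(0), ..., p(ymax) via the recursion (2.4)."""
    a = 1.0 + 2.0 * tau * mu
    p = [0.0] * (ymax + 1)
    p[0] = math.exp((1.0 - math.sqrt(a)) / tau)     # p(0)
    if ymax >= 1:
        p[1] = mu / math.sqrt(a) * p[0]             # p(1)
    for y in range(2, ymax + 1):                    # two-term recursion for y >= 2
        p[y] = (2.0 * tau * mu / a) * (1.0 - 1.5 / y) * p[y - 1] \
             + mu * mu / (a * y * (y - 1)) * p[y - 2]
    return p

p = pig_pmf(mu=3.0, tau=0.5, ymax=400)
# The probabilities sum to 1 and reproduce the mean mu = 3
# and the variance mu + tau*mu^2 = 7.5.
```

Evaluating Σ p(y) z^y for a few values of z and comparing with (2.3) is a convenient end-to-end check of the recursion.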

3. MAXIMUM-LIKELIHOOD METHODS

The log-likelihood function from a random sample of observations (y_i, x_i), i = 1, ..., n, is l(β, τ) = Σ_{i=1}^n log p_i(y_i), where p_i(y_i) stands for Pr(Y = y_i | x_i; β, τ). For convenience we write μ_i = μ(x_i; β) and define, for i = 1, ..., n,

t_i(y) = (y + 1) p_i(y + 1)/p_i(y),  y = 0, 1, 2, ....   (3.1)

Manipulation of (2.3), (2.4), and (3.1) shows that l(β, τ) and its first and second derivatives can be expressed as

l(β, τ) = Σ_{i=1}^n [ -log y_i! + log p_i(0) + I(y_i > 0) Σ_{j=0}^{y_i-1} log t_i(j) ],   (3.2)

u_r(β, τ) = ∂l/∂β_r = Σ_{i=1}^n {y_i - t_i(y_i)} μ_i^{-1} ∂μ_i/∂β_r,  r = 1, ..., k,

u_{k+1}(β, τ) = ∂l/∂τ = τ^{-2} Σ_{i=1}^n [ (1 + τμ_i) t_i(y_i)/μ_i - (1 + τy_i) ],   (3.3)

I_rs = -∂²l/∂β_r ∂β_s = Σ_{i=1}^n [ {y_i - t_i(y_i) t_i(y_i + 1) + t_i(y_i)²} μ_i^{-2} (∂μ_i/∂β_r)(∂μ_i/∂β_s)
    + {t_i(y_i) - y_i} μ_i^{-1} ∂²μ_i/∂β_r ∂β_s ],  r, s = 1, ..., k,   (3.4)

I_{r,k+1} = -∂²l/∂β_r ∂τ = Σ_{i=1}^n (t_i(y_i)/(τ²μ_i²)) [ (1 + τμ_i){t_i(y_i + 1) - t_i(y_i)} - τμ_i ] ∂μ_i/∂β_r,
    r = 1, ..., k,   (3.5)

I_{k+1,k+1} = -∂²l/∂τ² = Σ_{i=1}^n [ (3 + 2τμ_i) t_i(y_i)/(τ³μ_i)
    - (1 + τμ_i)² t_i(y_i){t_i(y_i + 1) - t_i(y_i)}/(τ⁴μ_i²) - y_i/τ² - 2/τ³ ].   (3.6)


We remark that (3.2) and (3.4) hold quite generally for mixed Poisson models of the form (2.1); Sprott (1965) notes this in the univariate case.

It is possible to have the maximum-likelihood estimate τ̂ = 0, implying that a Poisson regression model is best supported by the data. To avoid problems when τ̂ is zero or close to zero, we find it convenient to maximize l(β, τ) for selected τ-values by solving the likelihood equations u_r(β, τ) = 0 (r = 1, ..., k) to obtain β̂(τ). The profile log-likelihood l(β̂(τ), τ) is then easily obtained and maximized with respect to τ, to yield τ̂ and β̂ = β̂(τ̂). Estimates β̂(τ) are readily found via Newton-Raphson iteration or, alternatively, the scoring algorithm. With regard to the latter and to efficiency calculations in Section 4, we note that the Fisher-information matrix entries are found after some algebra to be

A_rs(β, τ) = E(-∂²l/∂β_r ∂β_s) = Σ_{i=1}^n (V_i/μ_i²) (∂μ_i/∂β_r)(∂μ_i/∂β_s),  r, s = 1, ..., k,   (3.7)

A_{r,k+1}(β, τ) = E(-∂²l/∂β_r ∂τ) = Σ_{i=1}^n ((1 + τμ_i)/(τ²μ_i²)) { μ_i/(1 + τμ_i) - V_i } ∂μ_i/∂β_r,  r = 1, ..., k,   (3.8)

A_{k+1,k+1}(β, τ) = E(-∂²l/∂τ²) = Σ_{i=1}^n ((1 + τμ_i)/(τ⁴μ_i²)) { (1 + τμ_i) V_i - μ_i },   (3.9)

where V_i = μ_i - μ_i²(1 + τ) + E[t_i(Y_i)²].

For the widely used log-linear specification μ_i = exp(x_i'β), the formulae (3.2)-(3.9) simplify to some extent; note that ∂μ_i/∂β_r = x_{ir}μ_i in this case. We remark also that the values t_i(y_i) can be computed from the following recursions, which are a direct consequence of (2.4):

t_i(0) = μ_i(1 + 2τμ_i)^{-1/2},
t_i(y) = (2y - 1)(τ/μ_i) t_i(0)² + t_i(0)²/t_i(y - 1),  y = 1, 2, ....

Working with the t_i(y)'s rather than the p_i(y)'s helps avoid roundoff problems.

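As a concrete illustration of (3.2) and these recursions, the log-likelihood can be evaluated without ever forming the p_i(y) directly. The sketch below (our names; a log-linear mean is assumed) is one way to do it:

```python
import math

def pig_loglik(beta, tau, y, X):
    """Evaluate l(beta, tau) in (3.2) for the P-IG regression model with
    log-linear mean mu_i = exp(x_i' beta), using the recursions for t_i(y)."""
    ll = 0.0
    for yi, xi in zip(y, X):
        mu = math.exp(sum(b * x for b, x in zip(beta, xi)))
        s = math.sqrt(1.0 + 2.0 * tau * mu)
        ll += (1.0 - s) / tau - math.lgamma(yi + 1)   # log p_i(0) - log y_i!
        t0sq = (mu / s) ** 2                          # t_i(0)^2
        t = mu / s                                    # t_i(0)
        for j in range(yi):
            ll += math.log(t)                         # accumulate log t_i(j)
            t = (2 * (j + 1) - 1) * tau * t0sq / mu + t0sq / t   # t_i(j+1)
    return ll

# Intercept-only check: with mu = e and tau = 0.3 the probabilities
# exp(l) over y = 0, 1, 2, ... should sum to 1.
total = sum(math.exp(pig_loglik([1.0], 0.3, [y], [[1.0]])) for y in range(300))
```

In practice the same t_i values feed directly into the score and information expressions (3.3)-(3.6), so one pass over each observation suffices.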
Confidence intervals and tests about parameters can be obtained by using familiar asymptotic χ² approximations for likelihood-ratio statistics or by treating (β̂ - β, τ̂ - τ) as approximately normally distributed with mean 0 and covariance matrix either I(β̂, τ̂)^{-1} or A(β̂, τ̂)^{-1}. We have not studied whether one covariance-matrix estimate might be preferable to the other, but the observed information is more easily computed and we have used it in Section 5. When τ > 0, limiting distributions which yield these approximations arise under mild conditions as n → ∞ and also for fixed n and τ as the μ_i's → ∞. Likelihood-ratio and normal-approximation confidence intervals for β generally appear to be in good agreement and satisfactory for practical purposes, provided that the μ_i's and n are not both small; the likelihood-ratio method is preferable when the two disagree. The same is true for inferences about τ, with the additional proviso that when τ is close to zero, intervals by either approach are inaccurate. Based on limited empirical results, we suggest as a rough practical guideline that when a 0.95 or 0.99 confidence interval for τ includes zero, one should expect that the right limit of the interval is somewhat too small.

We remark that results of Stein, Zucchini, and Juritz (1987) or Willmot (1988), showing for P-IG(μ, τ) that τ and μ^{-2} + 2τμ^{-1} are orthogonal parameters, imply that β̂ and τ̂ will have low asymptotic correlation when the τμ_i values are small. This is often the case and


has been reflected in the estimated covariance matrices I(β̂, τ̂)^{-1} or A(β̂, τ̂)^{-1} in data sets we have examined.

Finally, tests of the hypothesis τ = 0 are often of interest, since this corresponds to a Poisson model. A test may be based on the likelihood-ratio statistic

Λ = 2l(β̂, τ̂) - 2l(β̂(0), 0).

Under H₀: τ = 0 this has an asymptotic distribution with a probability mass of 0.5 at Λ = 0, and a half-χ²₍₁₎ distribution for Λ > 0 (Chernoff 1954). When n and the μ_i's are both small, this limiting distribution is a poor approximation to the actual distribution of Λ. Dean and Lawless (1989a) discuss other approaches which can be used then.
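The 50:50 mixture null distribution translates into a one-line p-value computation; a sketch (the function name is ours), using the fact that the χ²₍₁₎ survival function at x is erfc(√(x/2)):

```python
import math

def tau_zero_pvalue(lam):
    """p-value for the likelihood-ratio statistic of H0: tau = 0, using the
    asymptotic 50:50 mixture of a point mass at 0 and a chi-squared(1) law."""
    if lam <= 0.0:
        return 1.0
    # Pr(chi2_1 > lam) = erfc(sqrt(lam / 2)); halve it for the mixture.
    return 0.5 * math.erfc(math.sqrt(lam / 2.0))
```

One practical consequence: the 5% critical value is the χ²₍₁₎ 0.90 quantile, about 2.71, rather than the usual 3.84.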

4. QUASILIKELIHOOD-MOMENT ESTIMATION

Weighted least squares, or quasilikelihood, is often used for the regression analysis of count data. Such methods are popular because they involve familiar iterative least-squares calculations and can be carried out with readily available software. They also possess a degree of robustness to misspecification of the distribution of v in (2.1). We examine these methods briefly here to see whether they are efficient when the model (2.1) is indeed P-IG(μ(x_i), τ).

The quasilikelihood equations for β (e.g., McCullagh and Nelder 1983) are

Σ_{i=1}^n ((y_i - μ_i)/σ_i²) ∂μ_i/∂β_r = 0,  r = 1, ..., k,   (4.1)

where σ_i² = Var(Y_i) = μ_i + τμ_i². An additional equation is needed to allow estimation of τ; one that is often used is

Σ_{i=1}^n (y_i - μ_i)²/σ_i² - (n - k) = 0   (4.2)

(e.g., Breslow 1984, McCullagh and Nelder 1983). Dean (1988) shows that it is preferable to use (4.1) combined with the equation

Σ_{i=1}^n { (y_i - μ_i)² - y_i - τμ_i² } / (1 + τμ_i)² = 0   (4.3)

to estimate β and τ. This is motivated by a study of quadratic estimating equations (Crowder 1987, Godambe and Thompson 1989) for this problem. The equations may be solved conveniently by first fitting the Poisson model (τ = 0) to obtain initial estimates β̃ and μ̃_i (i = 1, ..., n), and then inserting these in (4.3) and solving for τ. If the solution τ̃ is positive, the process is repeated using τ̃ in (4.1) to obtain a new β̃, and so on, iterating until convergence. In some cases it may be that τ̃ = 0, representing a Poisson model.
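The alternating scheme just described can be sketched as follows (a simplified implementation under our own naming, assuming a log-linear mean with an intercept in the first column of X; (4.1) is solved by Fisher scoring and (4.3) by bisection in τ):

```python
import numpy as np

def ql_moment_fit(y, X, max_outer=100):
    """Alternate (4.1) for beta with (4.3) for tau, starting from the Poisson fit."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 0.5)   # rough intercept start
    tau = 0.0
    for _ in range(max_outer):
        # (4.1) by Fisher scoring: score_r = sum (y - mu) x_r mu / sigma^2.
        for _ in range(50):
            mu = np.exp(X @ beta)
            sig2 = mu + tau * mu ** 2
            score = X.T @ ((y - mu) * mu / sig2)
            info = (X * (mu ** 2 / sig2)[:, None]).T @ X
            step = np.linalg.solve(info, score)
            beta += step
            if np.max(np.abs(step)) < 1e-10:
                break
        mu = np.exp(X @ beta)
        # (4.3): solve sum {(y - mu)^2 - y - tau*mu^2} / (1 + tau*mu)^2 = 0 for tau >= 0.
        def g(t):
            return np.sum(((y - mu) ** 2 - y - t * mu ** 2) / (1.0 + t * mu) ** 2)
        if g(0.0) <= 0.0:
            tau_new = 0.0                # no evidence of overdispersion
        else:
            lo, hi = 0.0, 1.0
            while g(hi) > 0.0:           # bracket the root
                hi *= 2.0
            for _ in range(60):          # bisection
                mid = 0.5 * (lo + hi)
                lo, hi = (mid, hi) if g(mid) > 0.0 else (lo, mid)
            tau_new = 0.5 * (lo + hi)
        if abs(tau_new - tau) < 1e-8:
            tau = tau_new
            break
        tau = tau_new
    return beta, tau
```

On simulated P-IG data this routine recovers β and τ to within sampling error; as in the text, a returned τ̃ = 0 corresponds to an ordinary Poisson fit.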

The asymptotic covariance matrix of the estimator (β̃, τ̃) given by the solution to (4.1) with (4.3) can be obtained using general results on estimating equations (e.g., Inagaki 1973) or on quadratic estimating equations (Crowder 1987). The limiting distribution of n^{1/2}(β̃ - β, τ̃ - τ) is normal with covariance matrix of the form

( F^{-1}                     F^{-1}(c - b)/b_{k+1}                              )
( (c - b)'F^{-1}/b_{k+1}     (c_{k+1} - 2c'F^{-1}b + b'F^{-1}b)/b_{k+1}²        )   (4.4)


TABLE 1: Asymptotic relative efficiencies of quasilikelihood-moment estimation in a Poisson-inverse-Gaussian regression model.

            Efficiency
τ        β̃₀ vs. β̂₀    β̃₁ vs. β̂₁    τ̃ vs. τ̂
0         1.000         1.000         1.000
0.005     1.000         1.000         1.000
0.01      1.000         1.000         1.000
0.05      1.000         0.999         1.000
0.10      1.000         0.995         0.961
0.20      1.000         0.978         0.860
0.50      1.000         0.914         0.641

where F_{k×k} has (r, s) entry

F_rs = lim_{n→∞} n^{-1} Σ_{i=1}^n (1/σ_i²) (∂μ_i/∂β_r)(∂μ_i/∂β_s),

and b = (b_1, ..., b_k)', c = (c_1, ..., c_k)', b_{k+1}, and c_{k+1} are given by

b_r = lim_{n→∞} n^{-1} Σ_{i=1}^n (2τμ_i³/σ_i⁴) ∂μ_i/∂β_r,  r = 1, ..., k,

b_{k+1} = lim_{n→∞} n^{-1} Σ_{i=1}^n μ_i⁴/σ_i⁴,

c_r = lim_{n→∞} n^{-1} Σ_{i=1}^n (μ_i²(γ_{1i}σ_i - 1)/σ_i⁴) ∂μ_i/∂β_r,  r = 1, ..., k,

c_{k+1} = lim_{n→∞} n^{-1} Σ_{i=1}^n (μ_i⁴/σ_i⁸) { (γ_{2i} + 2)σ_i⁴ - 2γ_{1i}σ_i³ + σ_i² },

where γ_{1i} = {(1 + 2τμ_i)σ_i² + τ²μ_i³}/σ_i³ and γ_{2i} = 3τ + {1 + 4τμ_i + 12τ²μ_i² + 12τ³μ_i³}/{μ_i(1 + τμ_i)²} are skewness and kurtosis coefficients for P-IG(μ_i, τ). Variance estimates for finite n are obtained by inserting parameter estimates into (4.4) with "lim_{n→∞}" omitted in all expressions. It can be shown, as in Lawless (1987), that the estimator β̃ and its estimated variance are consistent provided the model is mixed Poisson, and consequently are robust to departure of the mixing distribution from inverse-Gaussian form.

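The skewness and kurtosis expressions above can be verified numerically by computing central moments from the probability recursion (2.4); a sketch (our function names):

```python
import math

def pig_probs(mu, tau, ymax):
    """P-IG(mu, tau) probabilities via the recursion (2.4)."""
    a = 1.0 + 2.0 * tau * mu
    p = [math.exp((1.0 - math.sqrt(a)) / tau)]        # p(0)
    p.append(mu / math.sqrt(a) * p[0])                # p(1)
    for y in range(2, ymax + 1):
        p.append((2.0 * tau * mu / a) * (1.0 - 1.5 / y) * p[y - 1]
                 + mu * mu / (a * y * (y - 1)) * p[y - 2])
    return p

mu, tau = 4.0, 0.25
sig2 = mu + tau * mu ** 2
p = pig_probs(mu, tau, 600)
m1 = sum(y * py for y, py in enumerate(p))
m2 = sum((y - m1) ** 2 * py for y, py in enumerate(p))
m3 = sum((y - m1) ** 3 * py for y, py in enumerate(p))
# Closed-form skewness and (excess) kurtosis, as in the text:
gamma1 = ((1 + 2 * tau * mu) * sig2 + tau ** 2 * mu ** 3) / sig2 ** 1.5
gamma2 = 3 * tau + (1 + 4 * tau * mu + 12 * (tau * mu) ** 2
                    + 12 * (tau * mu) ** 3) / (mu * (1 + tau * mu) ** 2)
```

The numerically computed third and fourth standardized central moments agree with gamma1 and gamma2 to within truncation error of the series.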
An example in Section 5 compares the estimates β̃, τ̃ based on (4.1) and (4.3) with maximum-likelihood estimates β̂, τ̂ for a set of data. We have also examined the asymptotic efficiency of (β̃, τ̃) relative to (β̂, τ̂) for a variety of regression situations as follows: (a) k = 1, μ = 10; (b) k = 1, μ = 40; (c) k = 2, μ_i = exp(β₀ + β₁x_i), β₁ = 1, exp(β₀) = 10, with 1/5 of the x_i's at each of -1, -0.5, 0, 0.5, 1; (d) k = 2, μ_i = exp(β₀ + β₁x_i), β₁ = 1, exp(β₀) = 10, with 1/3 of the x_i's at each of -1, 0, 1; (e) same as (d) except that β₁ = 0.5, exp(β₀) = 10; (f) same as (d) except that β₁ = 0.5, exp(β₀) = 50. As an illustration, Table 1 shows results for case (e) (the μ_i-values corresponding to


TABLE 2: Classificatory factors for Swedish motor-insurance data.

DISTANCE  Kilometers travelled (5 classes):
  (1) less than 1000 km per year
  (2) 1000-15,000 km per year
  (3) 15,000-20,000 km per year
  (4) 20,000-25,000 km per year
  (5) more than 25,000 km per year

BONUS  No-claims bonus (7 classes): Insured starts in the class BONUS = 1 and is moved up one class (to a maximum of 7) each year there is no claim.

MAKE  9 specified car makes.

x = -1, 0, 1 are then approximately 6.1, 10, and 16.5). Several values of τ are considered, and the table shows asymptotic relative efficiencies for β̃₀, β̃₁, and τ̃, defined as the ratio of the asymptotic variance for the maximum-likelihood estimator of each parameter to the asymptotic variance for the quasilikelihood-moment estimator. The results shown in the table illustrate features found in all of the situations.

For the regression situations the efficiencies of β̃ were all larger than 0.9 for τ ≤ 0.5. With moderate amounts of extra-Poisson variation (i.e. τ not too large) the efficiency of τ̃ is very high. As τ increases, the efficiency of τ̃ drops off faster than that of β̃. In practice, however, relevant values of τ tend to be fairly small; unless the μ_i's are very small, estimates of τ are usually under 0.50. Efficiencies tend to be slightly higher when the regression effect is smaller or when there is less variation in the x_i's (when μ_1 = ... = μ_n = μ, β̃ is asymptotically fully efficient).

If one is certain about the appropriateness of the P-IG model, then the maximum-likelihood estimation of Section 3 is of course preferred. However, the estimation procedure embodied in (4.1) and (4.3) is a simple practical alternative, and can also be used to get starting values for the maximum-likelihood iteration. In addition, it is more robust than maximum likelihood in providing consistent estimates of β and τ when μ_i and σ_i² are correctly specified, even though the distribution of Y may not be Poisson-inverse-Gaussian. Finally, it is comforting to know that this general estimation method, which is used by many statisticians, has quite high efficiency for the P-IG model. Dean and Lawless (1989b) show that estimates from (4.1) and (4.3) also have high efficiency under a negative-binomial model.

5. AN EXAMPLE

Andrews and Herzberg (1985, Table 68.1) present data on Swedish third-party motor insurance in 1977 for Stockholm, Göteborg, and Malmö, obtained from a committee study of risk premiums in motor insurance. Three factors are thought to be important in modeling the occurrence of claims; these are listed in Table 2. The data give the total number of claims Y for automobiles insured in each of the 315 risk groups defined by a combination of DISTANCE, BONUS, and MAKE factor levels. For each group there is also an "exposure" T, which is the number of insured automobile-years for that group, in units of 10⁵.

We investigated the fit of Poisson and mixed Poisson log-linear models to these claims


TABLE 3: Fits of several models to the Swedish motor-insurance data.

Model                                                            Pearson Statistic^a   d.f.
1. Poisson (main effects)                                        485.1                 296
2. Poisson (main effects; observations 174, 180, 183 deleted)    355.0                 293
3. Poisson (main effects, MAKE = 9 deleted)                      361.6                 262
4. Poisson (main effects, MAKE = 9 and
   observations 8, 174, 183, 184 deleted)                        274.7                 258
5. P-IG(μ_i, τ) (main effects)                                   319.7                 295
6. P-IG(μ_i, τ/T_i) (main effects)                               331.5                 295

^a For models 1, 2, 3, and 4 the Pearson statistic is Σ(y_i - μ̂_i)²/μ̂_i; for model 5 it is Σ(y_i - μ̂_i)²/{μ̂_i(1 + τ̂μ̂_i)}, and for model 6 it is Σ(y_i - μ̂_i)²/{μ̂_i(1 + τ̂μ̂_i/T_i)}.

data. For the Poisson models Y_i, the number of claims in the ith group (i = 1, ..., 315), was assumed to be Poisson with mean μ_i = T_i exp(x_i'β), where T_i was the exposure for the ith group and the covariates x_i were chosen to represent levels of the factors in Table 2, and possibly interactions. A Poisson model with covariates only for factor main effects did not fit well, and there was evidence of extra-Poisson variation or lack of fit (see Table 3). We also examined factor interactions, but were unable to find a parsimonious Poisson model that gave a good fit. Figure 1 shows a normal probability plot of the Pearson residuals (y_i - μ̂_i)/μ̂_i^{1/2} for the main-effects Poisson model. The shape of the plot, with residuals tending to be uniformly larger in magnitude than expected, suggests overdispersion. Examination of various other residual plots did not reveal systematic evidence for lack of fit due to an incorrect specification of the Poisson mean. Three cases (groups 174, 180, and 183 in the data set) had particularly large residuals of 6.22, 6.24, and 5.61 respectively. After deleting these observations and refitting the Poisson model, we still find evidence for overdispersion (see Table 3). We observe also that the category MAKE = 9 is actually not one particular car make, but rather includes any make not in the categories MAKE = 1 to MAKE = 8. If we drop the 35 groups with MAKE = 9 and in addition observations 8, 174, 183, 184, then (see Table 3) the resulting Poisson model fits well except that extreme residuals tend to be smaller than expected.

It is difficult to assign sources for the lack of fit of the Poisson regression models, and a sensible alternative approach is to fit mixed Poisson models (2.1), which can be thought of as incorporating random effects v_i representing additional variability. Either a model (2.1) with Var(v_i) = τ or one with Var(v_i) = τ/T_i seems plausible, the former because there may be heterogeneity associated with automobiles with certain characteristics, and the latter because there may be heterogeneity arising from the individual automobiles which make up a group. Both Poisson-inverse-Gaussian models of this kind fit quite well: see Table 3 and Figures 2 and 3, which portray normal probability plots of the Pearson residuals (y_i - μ̂_i)/{μ̂_i(1 + τ̂μ̂_i)}^{1/2} and (y_i - μ̂_i)/{μ̂_i(1 + τ̂μ̂_i/T_i)}^{1/2} respectively for the two models. We remark that observations 174 and 183 still have large residuals (5.30 and -3.89) under the first model, and 174, 180, and 183 still have large residuals (5.55, 5.11, and -4.92) under the second model. Although they give reasonable fits, we note that the Pearson-statistic values (see Table 3) and Figures 2 and 3 suggest a slightly better fit for the first model, with Var(v_i) = τ, than for the second.
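The residuals used in Figures 1-3 are simple to compute once the fitted means are available; a sketch (the function name is ours; T = None gives the Var(v_i) = τ model, and supplying exposures T_i gives the Var(v_i) = τ/T_i version):

```python
import numpy as np

def pig_pearson_residuals(y, mu_hat, tau_hat, T=None):
    """Pearson residuals (y_i - mu_i)/sqrt(mu_i(1 + tau*mu_i)) for P-IG(mu_i, tau),
    or (y_i - mu_i)/sqrt(mu_i(1 + tau*mu_i/T_i)) for P-IG(mu_i, tau/T_i)."""
    y = np.asarray(y, dtype=float)
    mu_hat = np.asarray(mu_hat, dtype=float)
    infl = tau_hat * mu_hat if T is None else tau_hat * mu_hat / np.asarray(T, dtype=float)
    return (y - mu_hat) / np.sqrt(mu_hat * (1.0 + infl))

r = pig_pearson_residuals([6.0, 2.0], [4.0, 4.0], 0.25)
# Here Var = mu(1 + tau*mu) = 4 * 2 = 8, so the residuals are +-2/sqrt(8).
```

Setting tau_hat = 0 recovers the ordinary Poisson Pearson residuals used for Figure 1.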

Table 4 shows maximum-likelihood estimates of main effects and their standard errors under the P-IG(μ_i, τ) model, computed as outlined in Section 3. Estimates and standard


FIGURE 1: Normal probability plot of the Pearson residuals for the main-effects Poisson model, Model 1. There are eight points placed on the boundaries of the x-axis whose values are actually outside of its limits.

FIGURE 2: Normal probability plot of the Pearson residuals for the main-effects Poisson-inverse-Gaussian model, Model 5. There are two points placed on the boundaries of the x-axis whose values are actually outside of its limits.

FIGURE 3: Normal probability plot of the Pearson residuals for the main-effects Poisson-inverse-Gaussian model, Model 6. There are three points placed on the boundaries of the x-axis whose values are actually outside of its limits.



TABLE 4: Parameter estimates and standard errors for the (a) quasilikelihood and (b) maximum-likelihood fits to the insurance data.

(a) (b)

Parameter Estimate Std. Error Estimate Std. Error

τ 0.0113 0.0029 0.0115 0.0035

Intercept -1.719 0.050 -1.716 0.050 DISTANCE:

(1) - - - -

(2) 0.171 0.037 0.170 0.037

(3) 0.230 0.040 0.226 0.040

(4) 0.282 0.047 0.280 0.048

(5) 0.528 0.048 0.528 0.049 BONUS:

(1) - - - - (2) -0.551 0.051 -0.551 0.051 (3) -0.698 0.052 -0.698 0.052 (4) -0.910 0.054 -0.910 0.055

(5) -0.999 0.053 -1.000 0.053

(6) -1.046 0.048 -1.049 0.048

(7) -1.481 0.042 -1.483 0.043 MAKE:

(1) - - - -

(2) 0.159 0.057 0.159 0.057

(3) -0.142 0.062 -0.149 0.063 (4) -0.512 0.065 -0.513 0.065

(5) 0.110 0.060 0.110 0.060

(6) -0.415 0.057 -0.416 0.057

(7) -0.154 0.070 -0.155 0.070

(8) 0.102 0.093 0.100 0.093 (9) -0.029 0.038 -0.029 0.039

errors obtained from the quasilikelihood method-of-moments procedure embodied in

Equations (4.1) and (4.3) are also shown. These are remarkably close to the maximum- likelihood estimates. We have found that generally iterations for the quasilikelihood moment estimates converge rapidly, and that it is useful to use them as starting values for the maximum-likelihood iterative procedure.

ACKNOWLEDGEMENT

The authors thank J.D. Kalbfleisch, two anonymous referees, and the editor for constructive comments.

REFERENCES

Andrews, D.F., and Herzberg, A.M. (1985). Data. Springer-Verlag, New York.
Anscombe, F. (1950). Sampling theory of the negative binomial and logarithmic series distributions. Biometrika, 37, 358-382.
Breslow, N. (1984). Extra-Poisson variation in log-linear models. Appl. Statist., 33, 38-44.
Brillinger, D.R. (1986). The natural variability of vital rates and associated statistics (with Discussion). Biometrics, 42, 693-711.
Chernoff, H. (1954). On the distribution of the likelihood ratio. Ann. Math. Statist., 25, 573-578.
Crowder, M. (1987). On linear and quadratic estimating functions. Biometrika, 74, 591-597.
Dean, C. (1988). Mixed Poisson models and regression methods for count data. Ph.D. Thesis, University of Waterloo.
Dean, C., and Lawless, J.F. (1989a). Testing for overdispersion in Poisson regression models. J. Amer. Statist. Assoc., 84, 467-472.
Dean, C., and Lawless, J.F. (1989b). Comments on "An extension of quasilikelihood estimation", by Godambe and Thompson. J. Statist. Plann. Inference, 22, 155-158.
Engel, J. (1984). Models for response data showing extra-Poisson variation. Statist. Neerlandica, 38, 159-167.
Folks, J.L., and Chhikara, R.S. (1978). The inverse Gaussian distribution and its statistical application-a review. J. Roy. Statist. Soc. Ser. B, 40, 263-289.
Godambe, V.P., and Thompson, M.E. (1989). An extension of quasi-likelihood estimation. J. Statist. Plann. Inference, 22, 137-152.
Hinde, J. (1982). Compound Poisson regression models. GLIM 82: Proc. Internat. Conf. Generalized Linear Models (R. Gilchrist, ed.), Springer-Verlag, Berlin, 109-121.
Holla, M.S. (1966). On a Poisson-inverse Gaussian distribution. Metrika, 11, 115-121.
Inagaki, N. (1973). Asymptotic relations between the likelihood estimating function and the maximum likelihood estimator. Ann. Inst. Statist. Math., 25, 1-26.
Jorgensen, B. (1987). Exponential dispersion models (with Discussion). J. Roy. Statist. Soc. Ser. B, 49, 127-162.
Lawless, J.F. (1987). Negative binomial and mixed Poisson regression. Canad. J. Statist., 15, 209-226.
McCullagh, P., and Nelder, J.A. (1983). Generalized Linear Models. Chapman and Hall, London.
Ord, J.K., and Whitmore, G.A. (1986). The Poisson-inverse Gaussian distribution as a model for species abundance. Comm. Statist. A, 15, 853-871.
Sankaran, M. (1968). Mixtures by the inverse Gaussian distribution. Sankhyā Ser. B, 30, 455-458.
Sichel, H.S. (1971). On a family of discrete distributions particularly suited to represent long tailed frequency data. Proc. 3rd Symp. Math. Statist. (N.F. Loubscher, ed.), CSIR, Pretoria.
Sprott, D.A. (1965). Some comments on the question of identifiability of parameters raised by Rao. Classical and Contagious Discrete Distributions (G.P. Patil, ed.), Statistical Publishing Society, Calcutta, 333-336.
Stein, G.Z., and Juritz, J.M. (1988). Linear models with an inverse-Gaussian error distribution. Comm. Statist. Theory Methods, 17, 557-571.
Stein, G., Zucchini, W., and Juritz, J. (1987). Parameter estimation for the Sichel distribution and its multivariate extension. J. Amer. Statist. Assoc., 82, 938-944.
Tweedie, M.C.K. (1957). Statistical properties of inverse Gaussian distributions, I. Ann. Math. Statist., 28, 362-372.
Willmot, G.E. (1987). The Poisson-inverse Gaussian distribution as an alternative to the negative binomial. Scand. Actuar. J., 113-127.
Willmot, G.E. (1988). Parameter orthogonality for a family of discrete distributions. J. Amer. Statist. Assoc., 83, 517-521.

Received 21 January 1988 Revised 13 January 1989

Accepted 6 March 1989

Department of Statistics and Actuarial Science University of Waterloo,

Waterloo, Ontario N2L 3G1
