notice changes introduced as a result of publishing...

31
This may be the author’s version of a work that was submitted/accepted for publication in the following source: Kang, Su, McGree, James,& Mengersen, Kerrie (2014) The choice of spatial scales and spatial smoothness priors for various spa- tial patterns. Spatial and Spatio-temporal Epidemiology, 10, pp. 11-26. This file was downloaded from: https://eprints.qut.edu.au/75778/ c Consult author(s) regarding copyright matters This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the docu- ment is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recog- nise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to [email protected] License: Creative Commons: Attribution-Noncommercial-No Derivative Works 2.5 Notice: Please note that this document may not be the Version of Record (i.e. published version) of the work. Author manuscript versions (as Sub- mitted for peer review or as Accepted for publication after peer review) can be identified by an absence of publisher branding and/or typeset appear- ance. If there is any doubt, please refer to the published source. https://doi.org/10.1016/j.sste.2014.05.003

Upload: others

Post on 10-Aug-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

This may be the author’s version of a work that was submitted/acceptedfor publication in the following source:

Kang, Su, McGree, James, & Mengersen, Kerrie(2014)The choice of spatial scales and spatial smoothness priors for various spa-tial patterns.Spatial and Spatio-temporal Epidemiology, 10, pp. 11-26.

This file was downloaded from: https://eprints.qut.edu.au/75778/

c© Consult author(s) regarding copyright matters

This work is covered by copyright. Unless the document is being made available under aCreative Commons Licence, you must assume that re-use is limited to personal use andthat permission from the copyright owner must be obtained for all other uses. If the docu-ment is available under a Creative Commons License (or other specified license) then referto the Licence for details of permitted re-use. It is a condition of access that users recog-nise and abide by the legal requirements associated with these rights. If you believe thatthis work infringes copyright please provide details by email to [email protected]

License: Creative Commons: Attribution-Noncommercial-No DerivativeWorks 2.5

Notice: Please note that this document may not be the Version of Record(i.e. published version) of the work. Author manuscript versions (as Sub-mitted for peer review or as Accepted for publication after peer review) canbe identified by an absence of publisher branding and/or typeset appear-ance. If there is any doubt, please refer to the published source.

https://doi.org/10.1016/j.sste.2014.05.003

Page 2: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

The choice of spatial scales and spatial smoothness priors for

various spatial patterns

Abstract

Given the drawbacks for using geo-political areas in mapping outcomes unrelated to geo-politics, a compromise is to aggregate and analyse data at the grid level. This has theadvantage of allowing spatial smoothing and modelling at a biologically or physically relevantscale. This article addresses two consequent issues: the choice of the spatial smoothness priorand the scale of the grid. Firstly, we describe several spatial smoothness priors applicable forgrid data and discuss the contexts in which these priors can be employed based on differentaims. Two such aims are considered, i.e., to identify regions with clustering and to modelspatial dependence in the data. Secondly, the choice of the grid size is shown to dependlargely on the spatial patterns. We present a guide on the selection of spatial scales andsmoothness priors for various point patterns based on the two aims for spatial smoothing.

Keywords: Data aggregation, Moran’s I statistics, spatial clustering, spatial scales, spatialsmoothing

1. Introduction

Spatial data are prevalent in various disciplines including epidemiology (Diggle, 1990;Benes et al., 2005; Goovaerts, 2009a,b), geology (Baddeley et al., 2010), ecology (Wolpertand Ickstadt, 1998; Best et al., 2000; Diggle et al., 2007), weather phenomenon (van Lieshoutand Stein, 2012; Elsner et al., 2013), and transportation planning (Ickstadt et al., 1998;Ickstadt and Wolpert, 1999). A detailed description of the various types of spatial data canbe found in Banerjee et al. (2004). Here we are interested in spatial point patterns whichare realizations of a spatial point process; see Møller and Waagepetersen (2004) and Møllerand Waagepetersen (2007) for theory and review of spatial point processes.

Point pattern data have become increasingly common in many fields of application dueto recent advances in geographical information systems (GIS) and global positioning systemswhich enable accurate geocoding of locations of data collected. Methods for analysing spatialpoint patterns can be found in Diggle (2003) and Illian et al. (2008). In practice, care must betaken to ensure computationally tractable methods are available when dealing with complexand high-dimensional point patterns that contain many points and marks (marked pointpattern). As a result, to reduce the computational burden, point patterns can be aggregated

∗Corresponding author. Tel.: +61 7 3138 2308; fax: +61 7 3138 2310Email address: [email protected] ()

Preprint submitted to Spatial and Spatio-temporal Epidemiology April 8, 2014

Page 3: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

to regular lattices. Regular lattice data may also be collected directly with the aid of GISand related software, in the form of pixel or raster data (Chang, 2010).

Without loss of generality, we restrict our attention to spatial analysis of regular latticedata in the context of epidemiology. Presuming the availability of point pattern diseasedata with continuous coordinate information, to obtain regular lattice data, a discretizationstrategy is to overlay the study region with a discrete grid and to assign occurrences to thegrid cells. This results in regularly-shaped regions (grid cells). Despite being less commonthan modelling at the geo-political scales, grid level modelling approaches have increasedrapidly in recent years (Biggeri et al., 2006; Kleinschmidt et al., 2007; Vanhatalo and Vehtari,2007; Hossain and Lawson, 2009). Modelling disease data at the grid level is a desirableapproach as it is geographically more relevant than using geo-political area data and yetprotects patient confidentiality. It is feasible to construct generalized linear models andapproximate the covariance structure by a Markov random field which eases computation;see, for instance, Baddeley et al. (2010) or Li et al. (2012). Grid level modelling avoids theneed to deal with the problem of changing geo-political boundaries over time and has theflexibility to manipulate the data aggregation to a practically, biologically, geographicallyor computationally sensible scale. The comparison of three types of commonly encountereddisease data are provided in Appendix A for further details, if required. As discussed, theadvantages of using disease data aggregated to grid cells often outweigh its drawbacks.

In the context of disease mapping, spatial smoothing may facilitate more stable diseaserates by incorporating information from neighbouring areas and more robust risk estimatesfor each area may be obtained. Disease data arising from both geo-political areas andregular grid cells typically exhibit spatial correlation due to the presence of spatial structurein the unknown risk factors. While it is of interest to study the relationship between theresponse and predictors, ignoring the spatial correlation among neighbouring regions maylead to unstable estimates of underlying risk of disease and false conclusion of covariateeffects (Fahrmeir and Kneib, 2011). In contrast, accounting for spatial dependence resultsin improved model inference, prediction and estimation (Haran, 2011). Spatial smoothingalso reduces the effect of the arbitrary geographical boundaries.

Given the importance for epidemiologists to take into account the spatial correlation ina disease dataset using spatial smoothing techniques, this article aims to describe severalspatial smoothness priors and discuss the contexts in which these priors can be employedbased on the aims for spatial smoothing outlined in Section 2. In order to capture thespatial variation of a disease, one of these priors is imposed on the spatial component in aBayesian hierarchical model. These priors are capable of modelling the spatial dependence inthe data using different mechanisms and understanding how they differ may help in makingthe right choice for spatial smoothing. In this article, three Bayesian spatial smoothnesspriors for the spatially structured effect are considered. They are all suitable for modellingthe spatial dependence of regular lattice data, and one of the priors can also be appliedto irregular lattice data. These three priors are chosen due to their popularity in spatialmodelling and their fast, straightforward implementation using the integrated nested Laplaceapproximation (Rue et al., 2009). The implementation of these priors on regular grid cells

2

Page 4: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

is described in Kang et al. (2013b).One of the main challenges of modelling disease data at the grid level is to specify an

appropriate grid size. The choice of grid size depends largely on the spatial patterns as it isknown that spatial data exhibit different spatial structures at different scales of aggregation(Illian et al., 2012a; Kang et al., 2013b). In light of this, the secondary aim of the article isto provide some recommendations on the choice of spatial scale for various point patternsbased on the two aims for spatial smoothing.

The following discussion is not limited to epidemiological data but is applicable to spatialdata arising from various areas of research. The article is organized as follows. Two commonaims of spatial smoothing are outlined in Section 2. Several canonical examples are presentedin Section 3. In Section 4, we describe three spatial smoothness priors and summarizetheir practicability in various contexts. Section 5 discusses the choice of spatial scale andillustrates the changes in spatial behaviour at various spatial scales using tests of spatialautocorrelation. A case study is used to demonstrate the selection of an optimum spatialscale based on model predictive performance in Section 6. Finally, Section 7 provides someguidance on the selection of spatial scales and spatial smoothness priors for two differentaims of analysis. Relevant mathematical equations and technical details are appended.

2. Aims of spatial smoothing

Spatial smoothing involves estimating an effect of interest at a location, using the effectvalues at nearby locations (Wang, 2006). In the epidemiological context, we assume thatareas in the same neighbourhood are likely to share similar environmental exposures andthus similar disease rates are expected. This results in the reduction of spatial variabilitywhich is particularly important for small-area estimation problem, especially in areas withsmall populations which have very small counts of disease. For instance, an unusually highrate may be caused by only one occurrence of disease in a small region due to the smallpopulation size. Considering the notion of neighbourhood where regions with a commonboundary are treated as neighbours, spatial smoothing approaches based on Markov randomfields are widely employed in disease mapping (Paciorek, 2013).

Consider the following aims for spatial smoothing,

(a) To identify regions with unusually high risks or clustering (Goovaerts, 2009b; Haran,2011).This is important for the study of possible socio-economic factors. Moreover, theidentification of clusters of high-risk areas is vital for the appropriate remedial actionto be taken by public health authorities, for example, allocating resources to areas ofgreatest need.

(b) To model spatial dependence in the data in order to adjust for an unknown spatiallyvarying mean (Richardson et al., 2004; Haran, 2011) or to estimate the surface of re-gression effect (Fahrmeir and Kneib, 2011).The spatial surface provides information on unmeasured risk factors. Borrowing infor-mation from neighbouring areas leads to improved and more stable risk estimates for

3

Page 5: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

each area. This is particularly important for health policy makers to better understandthe unobserved geographical variation of disease risk.

3. Canonical examples

We present four canonical examples of spatial point patterns in this article to complementthe following sections: inhomogeneous point patterns, patterns with local repulsion, patternswith local clustering, and patterns with local clustering in the presence of larger-scale in-homogeneity (Illian et al., 2012a). We note that these point patterns are not restrictedto the epidemiological context, but are also commonly seen in disciplines such as forestry,geography, biology, and ecology (Baddeley et al., 2000; Hahn et al., 2003; Smith, 2004). In-homogeneous point patterns are point patterns with non-homogeneous intensity. Examplesof inhomogeneous point patterns include the number of trees per unit area in a forest (Hahnet al., 2003), the cell size in plant and animal tissue (Hahn et al., 2003), and the position ofplants induced by the spatial variation in soil fertility (Baddeley et al., 2000). An example ofpatterns with local repulsion is the locations of plant species having incompatible root sys-tems but requiring similar soil conditions (Smith, 2004). The species tend to grow togetherbut compete for light and nutrients. Spatial clustering in point patterns may occur due tothe interaction of points and is common in geographical epidemiology. Examples includeclusters of disease incidence caused by environmental contamination such as nuclear instal-lations (Gatrell et al., 1996) or workplace exposure, and communicable diseases (Sartoriuset al., 2013) that spread from one person to another including colds, flu, whooping cough,chlamydia, tuberculosis and human immunodeficiency virus. The size and distribution ofthe clusters may depend on population size and contagion of the disease. Table 1 describesthe four canonical examples of spatial point patterns presented in this article.

Table 1: Description of the four canonical examples

Dataset Description ExamplesX1 Point patterns with non-homogeneous

intensityNumber of trees per unit area in aforest, cell size in plant and animaltissue, positions of plants

X2 Point patterns with local repulsion Locations of plant species requiringsimilar soil conditions but competingfor light and nutrients

X3 Point patterns with large andhomogeneously distributed clusters

Locations of disease incidences causedby environmental contamination orcommunicable diseases

X4 Point patterns with small andinhomogeneously distributed clusters

Locations of disease incidences inducedby population inhomogeneity

The inhomogeneous point patterns (dataset X1) were generated from an inhomoge-neous Poisson process (Møller and Waagepetersen, 2004, Chapter 3.1) with trend functionλ = 1000 exp(−2x) on the unit square. For the patterns with local repulsion (datasetX2), we

4

Page 6: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

generated point-based data from a homogeneous Strauss process (Møller and Waagepetersen,2004, Chapter 6.1), with medium repulsion β = 700 (intensity parameter), interaction pa-rameter γ = 0.8 and interaction radius r = 0.05, on the unit square. To generate theclustered patterns (dataset X3), we simulated a homogeneous Thomas process (Møller andWaagepetersen, 2004, Chapter 5.3) with parameters κ = 10 (the intensity of the Poisson pro-cess of cluster centres), σ = 0.05 (the standard deviation of the distance of a point from thecluster centre) and µ = 50 (the expected number of points per cluster), on the unit square.For the last spatial pattern (dataset X4), we generated the data from an inhomogeneousThomas process with parameters σ = 0.01 and µ = 5 and a simple trend function for theintensity of parent points given by κ(x1, x2) = 100x1, on the unit square, which was thensuperimposed with a pattern generated from an inhomogeneous Poisson process with trendfunction λ = 500 exp(−2x). See Figure 1 for illustrations of the simulated point patterns andAppendixB for descriptions of the point processes. Here we refer to the clustering in datasetX3 as “large” as the clusters are clearly visible when plotted. In contrast, the clustering indataset X4 is labelled “small” as the data appear in small groups when plotted. We shall seethat using Moran’s I statistics, that dataset X3 does exhibit clustering at nearly all spatialscales while dataset X4 does not; see Table 2, to follow.

Dataset X1 Dataset X2 Dataset X3 Dataset X4

Figure 1: Four simulated spatial point patterns

4. Spatial smoothness priors

This section addresses the first aim of this article, which is to outline and discuss the con-texts in which the spatial smoothness priors can be employed based on the aims for spatialsmoothing listed in Section 2. Three Bayesian spatial smoothness priors are described andtheir practicalities in different contexts are discussed. In a Bayesian spatial model, let vi de-notes a spatially structured term that describes the effect of the i-th region by assuming thatgeographically close areas are more similar than distant areas. The spatial smoothness priorsconsidered are a first-order intrinsic Gaussian Markov random field (IGMRF) (Rue and Held,2005), a second-order IGMRF on a lattice (Rue and Held, 2005), and a Gaussian field withMatern correlation function (Stein, 1999) which includes a range parameter. The first-orderIGMRF prior considers first-order neighbours while the second-order IGMRF on a latticeprior accounts for both first-order and second-order neighbours for spatial smoothing. The

5

Page 7: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Matern model uses GMRFs with a local neighbourhood to approximate isotropic Gaussianfields on regular lattices. These priors can be readily implemented using integrated nestedLaplace approximation (INLA) computation. The formulations of the priors are detailed inAppendixC.

The INLA package performs approximate Bayesian inference for latent Gaussian modelsand is able to return approximate posterior marginal distributions in short computationaltime. The posterior marginals can be used to compute summary statistics of interest, suchas posterior means, variances or quantiles. In addition, a model choice criterion termedthe deviance information criterion, and predictive measures including logarithmic score andprobability integral transform are provided. We refer the reader to Rue et al. (2009) andBlangiardo et al. (2013) for more details on INLA computation and applications.

The priors are based on the GMRF and lead to sparse covariance matrices which make thecomputation tractable and efficient. The GMRF models and Gaussian processes are quiteflexible and have applications in a wide range of disciplines. Modelling spatial dependenceusing the GMRF-based priors helps to absorb spatial variation and produce improved esti-mates for each region by borrowing strength from neighbouring regions. The GMRF-basedpriors are prevalent in small-area estimation problems (Ghosh and Rao, 1994) and diseasemapping (Paciorek, 2013), where estimation of disease rates is based on sparsely populatedregions. Incorporating information from regions with larger populations can reduce the vari-ability of estimates. The intuitive conditional structure of the GMRF-based priors producesan estimate based on neighbouring values and thus works particularly well to smooth outvariability not relevant to the underlying risk (Assuncao and Krainski, 2009). The simpleand sparse precision matrix of the GMRF models is another desirable property; see Paciorek(2013) for the use of MRFs on a fine grid.

4.1. First-order intrinsic Gaussian Markov random field

The first-order IGMRF on irregular lattices has been widely employed in disease mappingto study the spatial variation of disease risk (Clayton and Bernardinelli, 1992; Mollie, 1996;Wakefield et al., 2000). The neighbourhood structure in these papers was defined in terms ofadministrative districts which are irregular lattices. Here, we consider a finer neighbourhoodstructure in terms of grid cells which are regular lattices. On a grid, the four nearest gridcells are considered as neighbours. This is known as Rook neighbourhood where two gridcells are termed neighbours if they share a common boundary.

A first-order IGMRF prior is also known as an intrinsic conditional autoregressive (ICAR)prior. The spatial component v is specified such that spatial structure is induced, viaconditional autoregression (Besag and Kooperberg, 1995), whereby the conditional of vi,i = 1, . . . , I depends solely on the random effects v−i,i∼k of the k neighbouring areas. LetWij be an indicator function which takes on the value one if grid cells i and j are neighboursthat share a common boundary, and zero otherwise; Wii is set equal to zero. We note thatother specifications for W are also possible, for example W could be a function of distanceto the centroid of the area, of the size of the area, or the length of common boundary. SeeAppendixC.1 for the specification of the first-order IGMRF or ICAR.

6

Page 8: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

This prior can be conveniently implemented in INLA using the latent model besag.Using the INLA computation, Illian et al. (2012b) model the small-scale spatial interaction(attraction or repulsion) using the first-order IGMRF prior. At a low degree of smoothing,the spatial effect can reflect very local behaviour, whereas a high degree of smoothing leadsto a very smooth spatial surface.

4.2. Second-order intrinsic Gaussian Markov random field on a lattice

It is possible to extend the first-order IGMRF to higher orders. Here we consider asecond-order IGMRF on a two-dimensional regular lattice ((Rue and Held, 2005, chapter3.4.2), (Fahrmeir and Kneib, 2011, chapter 5.3.1)), alternatively known as a second-orderrandom walk on a lattice. This choice is motivated by its application to a discretized logGaussian Cox process (LGCP) where the model approximates a LGCP when the grid cellsare fine enough (Rue et al., 2009). See AppendixC.2 for the specification of the second-orderIGMRF.

Using the INLA computation with the rw2d prior, Illian et al. (2012b) use a second-order IGMRF prior to model the large-scale spatial variation and stress the importance ofchoosing the prior parameters carefully to avoid over-fitting, especially when the griddeddata are relatively sparse. Quoting Illian et al. (2012b), it is recommended to choose theprior parameters at which the spatial effect operates at a similar spatial scale as the covariateand use a grid that is not finer than the data.

4.3. Gaussian field with Matern correlation function

A Gaussian field is a simple and flexible approach to model spatial dependence in thedata. A random field {v(s), s ∈ D} is a Gaussian random field if (v(s1), . . . , v(sk))

T ismultivariate normal for any k ≥ 1 and any locations s1, . . . , sk ∈ D, where D ⊂ R

d. Forpoint-level data, a function of the distance between two points can be used to model spatialcorrelation whilst for lattice data, GMRFs using adjacency and neighbourhood matrices canbe employed (Haran, 2011). The Markov property of GMRFs results in sparse matrices andgreatly reduces computational burden as compared to the Gaussian process which is knownto have a dense covariance matrix.

Rue and Held (2005) (chapter 5) demonstrate the approximation of an isotropic Gaussianfield on regular lattices In with commonly used covariance functions using GMRFs with alocal neighbourhood for computational reasons. Each element in the covariance matrix ofthe GMRF is close to the corresponding element of the covariance matrix of the Gaussianfield. See AppendixC.3 for the specification of the Matern model.

This prior can be implemented via the INLA approach using the latent model matern2d.The Matern model is recommended by Stein (1999) due to its flexibility in estimating thesmoothness of the process. In practice the smoothness parameter ν may be difficult toestimate from data. To use this model, stationarity and isotropy assumptions must befulfilled for the spatial process, so the model might not be suitable if the spatial dependencevaries in different directions or regions (Haran, 2011).

7

Page 9: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

4.4. Summary

Based on the results reported by Kang et al. (2013b) and the discussion of the priorsabove, here we make some suggestions regarding the use of the priors for various spatialpatterns. When simple spatial structures such as inhomogeneous point patterns are involved,the first-order IGMRF prior, which is the simplest of the three priors, is generally preferred.This is because there is no substantive reason to believe that any one part of the lattice ismore important than any other parts. One of the advantages for using the first-order IGMRFprior is that there is no need to decide the class of covariance functions. Moreover, thereis a lack of concern about the boundary assumptions of the lattice because it is a simpleadjustment and does not have to be wrapped on a torus to compute the precision matrix.

When complicated spatial structures exhibiting interaction of points are concerned, suchas spatial patterns with repulsion or clustering, more complex priors are generally required.However, the choice of priors depends largely on the aim of spatial smoothing. Consider thefirst aim for spatial smoothing listed in Section 2, where the aim of the analysis is to identifyregions with unusually high risks such as disease clusters, a low degree of spatial smoothingis required and so the first-order IGMRF prior might be a reasonable choice. Otherwise,the second-order IGMRF on a lattice prior and the Matern model with a low degree ofsmoothing and careful selection of prior parameters are also applicable. We note that theconstruction of the precision matrix for the second-order IGMRF prior is rather complicatedsince the lattice has to be wrapped on a torus and corrections to the boundaries have to beundertaken. It is computationally challenging to implement this prior using Markov chainMonte Carlo (MCMC) algorithms due to the difficulty in constructing the precision matrix.However, the implementation of this prior using the INLA computation is straightforwardand convenient.

Now consider the second aim for spatial smoothing, where in some contexts the estimationof the spatial surface is of interest. Since the aim is to reduce the spatial variation caused byrelatively high (or low) estimates, a higher degree of spatial smoothing is required and so thesecond-order IGMRF prior might result in satisfactory estimation of the spatial surface. Ifthe degree of smoothing using the second-order IGMRF on a lattice prior is not satisfactory,the Matern model should be considered due to its flexibility. However, this prior is moretechnical to use as the range parameter and smoothness parameter require careful selection.For example, a larger value of the smoothness parameter implies a smoother process. Kanget al. (2013b) have shown that the Matern model is sensitive to changes in spatial scales.

Despite the varying complexity in the formulation of the three priors, Bayesian compu-tation via the INLA approach eliminates many of the computational concerns with fittingany of these three priors. Nevertheless, careful selection of prior parameters is required andthis is specific to the problem at hand as different prior parameters impose different degreeof spatial smoothing. Given the fast computation using INLA, it is feasible to consider arange of prior parameters to obtain a desirable level of spatial smoothing.

Throughout the section, we have outlined the advantages of the INLA approach in theimplementation of the discussed spatial smoothness priors. However, we note that MCMCalgorithms can also be employed to perform the computation (Banerjee et al., 2004; Best

8

Page 10: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

et al., 2005; Cramb et al., 2011; Lee, 2013). MCMC methods have some key advantages,such as simplicity in programming and flexibility, but a major concern is the computationalcost increases at least proportionally with the number of grid cells (regions) included in themodel (Vanhatalo et al., 2010). This issue has been highlighted by Rue et al. (2009) andLindgren et al. (2011) for spatial modelling. In light of this, we adopt the INLA approachin this article. The focus of this article is not to provide an empirical comparison betweenMCMC and INLA methods. A number of comparisons between these two approaches havebeen performed since the increasing popularity of INLA computation in recent years. Thesestudies have found comparable performance between the two approaches, see, for instance,Schrodle et al. (2011); Paul et al. (2010); Held et al. (2010) and Martino et al. (2011). Thesefindings provide further support for the choice of INLA methods in this article.

5. Spatial scale

This section addresses the second aim of the article, where the choice of spatial scale foreach canonical example is discussed. The spatial scales that are relevant for a particular spa-tial dataset is important in the analysis of a spatial pattern (Illian et al., 2012b). It is knownthat spatial structures vary at different spatial scales due to the underlying mechanisms thatdrive the spatial pattern (Wiegand et al., 2007; Latimer et al., 2009; Illian et al., 2012a).Some mechanisms operate at a local spatial scale while others may require a larger spatialscale. Different datasets may require different spatial scales due to the different spatial be-haviours and aims of analyses. It is important to identify the aim of the spatial analysisand consider a range of relevant spatial scales for the problem of interest before conductingconcrete spatial data analyses. Some background knowledge on the problem is necessary inorder to understand the spatial scales at which the underlying mechanisms operate. In ad-dition, the increasing computational costs and potential loss of stability of estimation usingfine grid cells should also be taken into consideration.

Using Moran’s I test for spatial autocorrelation (Moran, 1948) (described in AppendixD),we demonstrate how the spatial behaviours vary from one spatial scale to another for thefour canonical examples illustrated in Figure 1. We discretized the study region using grids5× 5, 10× 10, 15× 15, 20× 20, 25× 25, 30× 30, 35× 35, 40× 40, 45× 45, and 50× 50. Grid5 × 5 resulted in 25 regular grid cells over the region, grid 10 × 10 resulted in 100 regulargrid cells, and so on. As such, grid 5× 5 had the largest grid cell size whereas grid 50× 50had the smallest grid cell size.

Here we do not attempt to draw conclusions about the distribution of the spatial patternsbut to illustrate how the changes in spatial scale result in different Moran’s I statistics thatsuggest the changing spatial behaviours across the various grid sizes (see Table 2). As seen,for dataset X1, across the increasingly small grid cells, the gradually decreasing Moran’sI statistics suggest that the spatial patterns shift from being clustered at big cell sizes tobeing randomly distributed at small grid cell sizes. On the other hand, dataset X2 exhibitsa slightly clustered pattern at grid 5 × 5 and then a very small degree of dispersion at allother scales. Dataset X3 is shown to be very close to randomly distributed at grid 5× 5 butdisplays different degree of clustering at subsequent scales. Similar to dataset X2, dataset

9

Page 11: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

X4 displays opposite spatial patterns at the largest and the smallest grid cell sizes, in whichthe data show a slight dispersion at grid 5× 5 but a slight clustering at grid 50× 50. Severalspatial scales are chosen for each dataset and various ways of discretizing the study regionare illustrated in Figures 2, 3, 4, and 5, with some descriptions included.

Table 2: Moran’s I statistics for each dataset at various spatial scales

Spatialscale

X1 X2 X3 X4

5× 5 0.516 0.199 0.082 -0.21510× 10 0.553 -0.032 0.474 -0.09115× 15 0.344 -0.031 0.620 0.00820× 20 0.330 -0.021 0.612 0.05925× 25 0.238 -0.082 0.553 0.10430× 30 0.150 -0.082 0.542 0.10735× 35 0.157 -0.046 0.501 0.17540× 40 0.103 -0.052 0.385 0.15945× 45 0.101 -0.023 0.355 0.17750× 50 0.084 -0.032 0.323 0.214

Grid 10x10 Grid 25x25

Grid 35x35 Grid 50x50

Figure 2: Inhomogeneous point patterns in dataset X1: At grid 10 × 10, the cell counts are ratherinhomogeneous across the space. As the grid cell size becomes increasingly small, the cell counts becomemore homogeneously distributed and eventually form a random distribution which indicates a lack of spatialpattern.

10

Page 12: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Grid 5x5 Grid 10x10

Grid 25x25 Grid 45x45

Figure 3: Point patterns with local repulsion in dataset X2: At grid 5× 5, the cell counts are slightlyinhomogeneous across the space. At all other grid sizes, a lack of spatial pattern is observed due to therandomness in the distribution of the cell counts despite a very small degree of dispersion as suggested byMoran’s I statistics.

Grid 5x5 Grid 10x10

Grid 15x15 Grid 50x50

Figure 4: Point patterns with large and homogeneously distributed clusters in dataset X3: Atgrid 5× 5, Moran’s I statistics suggest a lack of spatial pattern. Different degrees of clustering are observedat all other scales in which grid 15× 15 displays the highest degree of clustering.

11

Page 13: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Grid 5x5 Grid 10x10

Grid 15x15 Grid 50x50

Figure 5: Point patterns with small and inhomogeneously distributed clusters in dataset X4:Across the various grid sizes, there is no obvious spatial pattern in this dataset as shown by Moran’s I

statistics, possibly due to the rather homogeneous distribution. However, the data show a slight dispersionat grid 5× 5 but a slight clustering at grid 50× 50.

To fulfil the second aim of this article, which is to provide suggestions on the choice ofspatial scale for different spatial patterns, we note that Moran’s I statistics and inspectionof the plots may be used as a guide to investigate the distribution of the spatial patterns. Asshown by Moran’s I statistics, the spatial scales have an impact on the spatial behaviours foreach spatial point pattern illustrated here. Given a point pattern dataset, it is not alwaysclear about the most appropriate scale to model the data. Some knowledge about theproblem of interest is required such as the aim of the spatial analysis, what conclusion is theanalysis trying to draw, the scales of the available covariates for analysis, and computationalconstraint. It is important to consider a range of spatial scales before conducting the spatialanalyses in order to choose the most suitable scale. Analysis of spatial data under aim (a)or aim (b) for spatial smoothing, as listed in Section 2, will require different spatial scales.

That is, under aim (a) where we identify regions of high risks, the chosen spatial scaleshould demonstrate clustering in the data rather than randomness. In contrast, under aim(b) where we smooth the spatially varying surface, we may consider a spatial scale that showsspatial randomness in the data. We note that a spatial scale that results in excessive zero cellcounts should be avoided as this leads to inefficient spatial smoothing. The zero or small cellcounts at fine grid cells result in less information sharing across the grid cells as the spatialvariability does not extend very far. A coarse spatial scale should also be avoided as it leadsto a larger extent of loss of information. With respect to the computational consideration,it is noted that the increase in the number of grid cells increases the computational timeproportionally. In summary, the question of the choice of spatial scale is not only related tothe spatial structure of the data provided, but also to statistical and computational feasibility.

12

Page 14: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

6. Case study: selection of spatial scales

In the previous section, we describe how Moran’s I statistics can be used as a guide toinvestigate the distribution of the spatial patterns before determining the right spatial scale.Here we intend to use the four canonical examples described in Section 3 to demonstrate theselection of spatial scales through the evaluation of model predictive performance.

The model used to fit datasets X1, X2, X3 and X4 is briefly described here. Let X be aspatial point pattern embedded in an observation window S which is discretized into n1×n2

grid cells {sij} with area |sij| for i = 1, ..., n1 and j = 1, ..., n2. Let Nij denote the observednumber of points in each grid cell sij. Assume that Nij are conditionally independent Poissoncounts, Nij ∼ Po(|sij|λij), we consider the following models,

Model 1 : log(λij) = µ+ uij .

Model 2 : log(λij) = µ+ uij + vij.

Here λij denotes the intensity in each grid cell, µ refers to the common intercept term,uij is an unstructured term that accounts for unexplained variability in the process, andvij is a spatially structured term that describes the effect of the location by assuming thatgeographically close areas are more similar than distant areas. We are interested in modellingthe log-intensity (ηij = log(λij)) of the Poisson process. Clearly Model 1 is a model withoutspatial component and Model 2 is a spatial model. The spatially unstructured component,uij , is assumed to be independent and identically distributed (i.i.d.) and normally distributedwith zero mean and unknown precision τu. The spatially structured component in Model2, vij, is assigned an IGMRF prior with unknown precision (inverse variance) τv. Gammapriors are assigned to the precision parameters τu and τv.

To evaluate the predictive performance at various spatial scales, we calculate meansquared error (MSE) between the observed counts in a grid cell, Nij, and the estimated

counts in the respective grid cell, Nij,

MSE =

n1×n2∑

i,j=1

(

Nij −Nij

)2

n1 × n2

.

A smaller MSE indicates a better predictive performance. The resulting MSE is used toassist the selection of an appropriate spatial scale.

We first consider a range of spatial scales for a specific aim for spatial smoothing basedon the guidance from Moran’s I statistics (refer to Table 2). In this study we choose athreshold for clustering of 0.3 for Moran’s I statistic (Longley and Batty, 1996) althoughother thresholds could be chosen in particular contexts. Firstly, for dataset X1 which isan inhomogeneous spatial point pattern, under aim (a) where the identification of high riskregions is of interest, we consider spatial scales at which Moran’s I statistic ≥ 0.3 (spatialclustering), which include grids 5 × 5, 10 × 10, 15 × 15, and 20 × 20. In contrast, underaim (b) where we smooth the spatially varying surface, we consider spatial scales at whichMoran’s I statistic < 0.3 (spatial randomness), namely grids 25×25, 30×30, 35×35, 40×40,

13

Page 15: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

45 × 45, and 50 × 50. Figure 6 shows the MSE obtained for the spatial scales consideredunder aims (a) and (b) for dataset X1. Given these results, under aim (a), grid 20× 20 is arecommended spatial scale as it produces the smallest MSE amongst the considered scales.With respect to aim (b), the MSE appears to decrease very slightly beyond grid 40 × 40,which may suggest that further discretization does not improve prediction much. As such,grid 40× 40 would be an appropriate scale under aim (b).

1.0

1.5

2.0

2.5

3.0

Aim (a)

Spatial scale

MS

E

5x5 10x10 15x15 20x20

Model with spatial componentModel without spatial component

0.2

0.4

0.6

0.8

1.0

Aim (b)

Spatial scale

MS

E

25x25 30x30 35x35 40x40 45x45 50x50

Model with spatial componentModel without spatial component

Figure 6: The mean squared error between the observed counts and the estimated counts of both modelsfor dataset X1 under two different aims. Grid 20 × 20 is a recommended spatial scale under aim (a) — toidentify clusters; while grid 40 × 40 would be an appropriate scale under aim (b) — to smooth the spatialvarying surface. It is interesting to note that the model without spatial component produces a smaller MSEat the first three scales whilst the spatial model has a smaller MSE at all other scales.

In regard to dataset X2 which is a point pattern with local repulsion, Moran’s I statisticsgiven in Table 2 are within (−0.3, 0.3) which means that the spatial pattern shows random-ness at all spatial scales. Therefore, aim (b) is a suitable aim of analysis for this datasetwhereas aim (a) cannot be fulfilled. Figure 7 suggests that the predictive performance atscales beyond grid 25×25 does not improve significantly. Grid 25×25 is therefore a favouredspatial scale under aim (b).

We shift our attention to the point pattern with large and homogeneously distributedclusters in dataset X3. Moran’s I statistics (Table 2) suggest spatial randomness at grid5×5 while different degrees of clustering are observed at all subsequent scales (with Moran’sI statistic ≥ 0.3). In view of this, the identification of high risk regions in aim (a) is suitablefor this dataset. As displayed in Figure 8, the predictive performance based on the MSEremains relatively consistent beyond grid 35× 35. This signifies that grid 35× 35 should beconsidered as the appropriate scale.

With respect to datasetX4 that contains small and inhomogeneously distributed clusters,similarly to dataset X2, Moran’s I statistics (Table 2) are within (−0.3, 0.3), suggesting thatspatial randomness occurs at all spatial scales. Again, aim (a) cannot be fulfilled here. Incontrast, aim (b) which is to smooth the spatial varying surface, is a suitable aim of analysisfor this dataset. Figure 9 suggests that the predictive performance at scales beyond grid

14

Page 16: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

02

46

810

Aim (b)

Spatial scale

MS

E

5x5 10x10 15x15 20x20 25x25 30x30 35x35 40x40 45x45 50x50

Model with spatial componentModel without spatial component

Figure 7: The mean squared error between the observed counts and the estimated counts of both models fordataset X2 under aim (b) which is to smooth the spatial surface. Grid 25 × 25 is a favoured spatial scale.Both models produce similar MSE at all spatial scales except at grid 5 × 5 where the spatial model has arelatively smaller MSE.

0.1

0.3

0.5

0.7

Aim (a)

Spatial scale

MS

E

10x10 15x15 20x20 25x25 30x30 35x35 40x40 45x45 50x50

Model with spatial componentModel without spatial component

Figure 8: The mean squared error between the observed counts and the estimated counts for dataset X3

under aim (a) which is to identify high risk regions. Grid 35 × 35 is the recommended scale. We note thatthe model without spatial component produces a smaller MSE at the first two scales whereas the spatialmodel has a smaller MSE at all other scales.

20×20 does not improve significantly. Grid 20×20 is therefore a recommended spatial scaleunder aim (b).

In summary, we have demonstrated the selection of an appropriate spatial scale for dif-ferent structures of point patterns via the evaluation of the MSE. Throughout the analyses,the MSE decreases as the grid cell size reduces. As observed in datasets X3 and X4, thesize of the clusters appears to play a role in determining the best grid size for the dataset, inaddition to the influence of the spatial structure at each grid size. We note that for certainspatial structures, a particular aim of analysis may not be fulfilled due to the nature ofthe point pattern itself. For instance, point patterns in datasets X2 and X4 show spatial

15

Page 17: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

02

46

8

Aim (b)

Spatial scale

MS

E

5x5 10x10 15x15 20x20 25x25 30x30 35x35 40x40 45x45 50x50

Model with spatial componentModel without spatial component

Figure 9: The mean squared error between the observed counts and the estimated counts for dataset X4

under aim (b) which is to smooth the spatial varying surface. Grid 20× 20 is the recommended scale. Bothmodels produce comparable MSE at all spatial scales.

randomness at all scales and therefore aim (a) is not a suitable aim of analysis for these twopoint patterns. On the other hand, dataset X3 shows spatial clustering at almost all spatialscales, which renders aim (b) an inappropriate aim of analysis.

The comparison between the spatial model and the model without spatial component,in terms of MSE, has revealed some interesting results. Specifically, datasets X2 and X4appear to produce comparable MSE at nearly all spatial scales. This could be explained bythe spatial randomness observed in both datasets which renders the spatial component animportant part of the model. On the other hand, for datasetsX1 andX3 which portray morecomplicated spatial structures, the model without spatial component has a smaller MSE atcoarse grid cells whereas the spatial model produces smaller MSE as the grid cell size becomesincreasingly fine. These results suggest that complicated spatial structures require a spatialmodel while point patterns that show spatial randomness do not necessarily need a spatialcomponent in the model. Such results are similar to those found in Kang et al. (2013b).

7. Concluding discussion

The development of this article is motivated by the importance of employing the rightspatial scale and spatial smoothness prior for spatial datasets which are commonly encoun-tered in the disciplines of epidemiology, ecology, forestry, geology, and geography. Kang et al.(2013b), for example, have demonstrated how different spatial scales and spatial smoothnesspriors impact on model outcomes. Here, we provide guidance on the choice of spatial scalesand spatial smoothness priors based on the aims of spatial smoothing for several canonicalexamples, as summarized in Table 3. The recommendations are specific to the four pointpatterns presented, based on the investigation of Moran’s I statistics and MSE, as conductedin Section 5 and Section 6, respectively. The results have suggested that the homogeneity ofthe point patterns and the size of the clusters will determine the spatial scales for different

16

Page 18: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

aims of analyses. We acknowledge that the choices are specific to datasets at hand. Nonethe-less, this article offers some practical steps for the process of selecting the most appropriatespatial scale and smoothness prior.

Table 3: The choice of spatial scales and spatial smoothness priors based on two aims of spatial smoothingfor the four canonical examples

DatasetAims of spatial smoothing

(a) Identify clusters (b) Smooth the spatial surfaceSpatial scale Spatial

smoothnessprior

Spatial scale Spatial smoothnessprior

X1:Inhomogeneouslydistributed

Grid 20× 20 First-order IGMRFwith low degree ofsmoothing

Grid 40× 40 Second-order IGMRFon a lattice or theMatern modeldepending on thedesired degree ofsmoothing

X2: Localrepulsion

Not applicable Not applicable Grid 25× 25 First-order IGMRF issufficient as the datado not showclustering accordingto Moran’s I statistics

X3: Large andhomogeneouslydistributed clusters

Grid 35× 35 First-order IGMRFwith low degree ofsmoothing

Not applicable Not applicable

X4: Small andinhomogeneouslydistributed clusters

Not applicable Not applicable Grid 20× 20 First-order IGMRF issufficient as the datado not showclustering accordingto Moran’s I statistics

In general, the aims of spatial smoothing include the identification of clusters and smooth-ing of spatial surface, as listed in Section 2. We propose that in order to identify clusters,the first-order IGMRF prior is a reasonable choice as it allows for less spatial smoothingcompared to two other priors, whereas the preferred spatial scales are those that show somedegree of clustering in the data. When smoothing of spatial surface is of interest, either thesecond-order IGMRF on a lattice or the Matern model is recommended, depending on thedesired degree of smoothing. These two priors generally impose a higher level of smoothingthan the first-order IGMRF prior and therefore are ideal for the estimation of the surfaceof regression effect. The spatial scales that show randomness or less clustering in the dataare preferable to fulfil this aim. If the data do not show clustering according to Moran’s Istatistics, the first-order IGMRF prior would be sufficient to smooth the spatial surface. Forcases where the data exhibit spatial randomness, the spatial component may not necessarilybe included in the model.

Based on the findings discussed above, we present a guide on the selection of spatial scalesand spatial smoothness priors for two different aims of analysis, as presented in Figure 10.

17

Page 19: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

We conclude that the aim of spatial smoothing and the spatial structures of a point patterndetermine the choice of spatial scales and spatial smoothness priors. We encourage the readerto identify the aim of spatial analysis for the problem at hand and consider a range of possiblespatial scales (using Moran’s I statistics) before conducting analyses. A preferred spatialscale is the spatial scale that has the best predictive performance (with the smallest MSE)or the scale at which subsequent scales do not improve predictive performance significantly.Spatial modelling can then be conducted using the smoothness prior that suits a specificaim for spatial smoothing. We emphasize that the recommendation of the priors are basedon the performance of the three smoothness priors discussed in this article, excluding otherspatial priors that may be possible. When we suggest that a prior is suitable for a particulartask, we are implying that it is the best choice amongst the three choices.

Aims of the analysis

Aim (a): Identify clusters Aim (b): Smooth the spatial surface

Consider a range of spatial scales with

Moran's I statistic ≥ k

Select the spatial scale with the smallest

MSE or the scale at which subsequent

scales do not improve predictive

performance significantly

Select the spatial scale with the smallest

MSE or the scale at which subsequent

scales do not improve predictive

performance significantly

Proceed with spatial modelling using the

first-order IGMRF prior for spatial

smoothing

Proceed with spatial modelling using the

second-order IGMRF on a lattice prior

or the Matérn model for spatial

smoothing

Consider a range of spatial scales

with Moran's I statistic < k

Figure 10: A guide on the selection of spatial scales and spatial smoothness priors for two different aims ofanalysis. Here, k is a constant threshold for identification of clustering; see Section 6 where k was chosen tobe equal to 0.3.

In this study, we have not included covariates in the model as they may impact on theobserved spatial effects and confound the findings of this investigation. In practice, covariatescould easily be included in the standard way to the log-linear model listed in Section 6, which

18

Page 20: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

leads to log(λij) = µ + βX + uij + vij. The covariate X may include individual level oraggregated level characteristics and β represents the effect of each covariate. See for instance,Kang et al. (2013a) and the recommendation therein for spatial studies including covariates.

In short, the recommendations presented in this article are based on the assumption thatpoint pattern data are available for analysis. We note that it may be tempting to use aspatial point pattern approach but this can prove to be infeasible in practice. A practicalexample would be the availability of disease data with exact residential locations, such asthat considered in Kang et al. (2013a), where spatial point process analyses may not beauthorised due to privacy concerns. In such scenarios, disease data may be aggregated tovarious geographical scales in order to protect patients’ confidentiality while also reducingthe computational time. For large datasets, the computational burden increases substan-tially when dealing with point pattern data as opposed to aggregated data, especially inapplications where a large number of covariates are being collected. Inclusion of covariatesin the point process model greatly increases model complexity and many current point pro-cess approaches are not adequate for this modelling purpose (Illian et al., 2008). In someapplications difficulties arise in point process analyses due to measurement error that oc-curs in the observed locations, which may be caused by measuring instrument or factorsinfluencing detection of event occurrences (Chakraborty and Gelfand, 2010).

Acknowledgements

The work was supported by the Cooperative Research Centre for Spatial Information,whose activities are funded by the Australian Commonwealth’s Cooperative Research Cen-tres Programme.

AppendixA. Comparison of different types of disease data

Historically, the analysis of disease data involves aggregation of individual data to geo-political areas which are irregular in shape in order to protect patients’ confidentiality. Indi-vidual level disease data are rarely available due to privacy issues. Furthermore, modelling ofpoint level data is computationally demanding when large datasets are involved due to densecovariance matrices. Geo-political areas are commonly used as the geographical boundariesin disease mapping due to practical reasons such as the availability of population data orsocio-economic attributes at these geographical scales. However, several concerns have beenraised regarding the aggregation of data, including increase of spatial correlation (Song et al.,2011), biases in estimates due to ecological fallacy (Robinson, 1950), loss of information, andissues of overlapping and artificiality of geo-political boundaries (Kirby, 1996; Louie andKolaczyk, 2006). Given the identified drawbacks of using geo-political areas in disease map-ping, it is of interest to model the disease data at a smaller geographical scale that maybe more relevant to the disease of interest, while still allowing for some aggregation. Thismotivates the adoption of grid level analysis. Table A.4 summarizes three types of commonlyencountered disease data.

19

Page 21: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Table A.4: Comparison of three types of disease data

Disease data aggregatedto geo-political areas

Disease data withprecise geographiccoordinates

Disease data aggregatedto grid cells

Description Disease incidences areaggregated overgeo-political defined areas

Exact locations ofincidences are given andform a spatial point pattern

Disease incidences areaggregated over regulargrid cells

Advantages • Accessibility • Avoid ecological fallacy • Geographically morerelevant than geo-politicalarea data to disease ofinterest and thus lessconcern about ecologicalfallacy

• Protect patients’confidentiality

• Avoid loss of informationcaused by data aggregation

• Protect patients’confidentiality

• Availability of populationdata and socio-economicattributes at geo-politicalarea level

• Adjustment forindividual level covariates

• Computationallytractable via Markovrandom field approximation

• Avoid the issue ofchanging boundaries

Drawbacks • Spatial correlationsamong neighbouring areas

• Rarely accessible due toconfidentiality issues

• Spatial correlationsamong neighbouring gridcells

• Loss of informationcaused by data aggregationand thus concern aboutecological fallacy

• Computationallydemanding

• Loss of information butat a smaller degree

• Issues of overlapping andartificiality of geo-politicalboundaries

AppendixB. Descriptions of several point processes

AppendixB.1. Inhomogeneous Poisson process

Let λ(.) be an intensity function, an inhomogeneous Poisson process (Møller andWaagepetersen,2004, Chapter 3.1) can be defined as below,

1. the counts N(S) is Poisson distributed with mean∫

Sλ(u)du, for all S.

2. the n points are independent and identically distributed in S with density proportionalto λ(.), conditional on N(S) = n.

AppendixB.2. Strauss process

A Strauss process (Strauss, 1975) is a pairwise interaction point process. Let φ be aninteraction function, if φ(ξ) is constant and the second order interaction term is invariant

20

Page 22: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

under motions, i.e. φ({ξ, η}) = φ2(||ξ−η||), the process is homogeneous. In a Strauss process(Møller and Waagepetersen, 2004, Chapter 6.2),

φ2(r) = γ1[r≤R],

setting 00 = 1. Here γ is an interaction parameter in the range [0, 1] and R > 0 is the rangeof interaction. A Strauss process is repulsive and therefore locally stable.

AppendixB.3. Thomas process

A Thomas process (Møller and Waagepetersen, 2004, Chapter 5.3) is a Poisson clusterprocess. In a Thomas process, each cluster consists of a Poisson(µ) number of random points.Each of the points has an isotropic Gaussian Normal(0, σ2I) displacement from its parent.With parameters θ = (κ, µ, σ), the K-function of a Thomas process can be expressed as,

Kθ(r) = πr2 +1

κ(1− exp(−

r2

4σ2)).

AppendixC. Formulations of the spatial smoothness priors

AppendixC.1. First-order intrinsic Gaussian Markov random field

A first-order IGMRF for vi is defined as

vi|v−i, τv ∼ N

(

1

ni

i∼k

vk,1

niτv

)

,

where ni is the number of neighbours of region i, v−i denotes all elements in v except for vi,and i ∼ k indicates that the two regions are neighbours that share a common boundary; seeBesag et al. (1991), Besag et al. (1995), and Rue and Held (2005) for further details.

The ICAR specification (Haneuse and Wakefield, 2008) is given by

vi|v−i ∼ Normal

(

1

ni

i∼j

Wijvj,1

niτv

)

,

where ni =∑

i∼j Wij is the number of neighbours for grid cell i. The corresponding precisionmatrix Q has elements

Qij = τv

i∼k Wik i = j,−Wij i ∼ j,0 otherwise.

As the precision matrix is not positive definite, the conditional specification above doesnot yield a proper joint distribution for v. The density can, however, be expressed via apairwise difference distribution as

π(v|W , τv) ∝ τ (N−1)/2v exp

{

−τv2

i<j

Wij(vi − vj)2

}

∝ τ (N−1)/2v exp

{

−1

2vTQv

}

,

21

Page 23: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

where N denotes the total number of grid cells. The prior for vi|v−i is Normal with mean1ni

i∼j Wijvj and precision parameter τv that determines the strength of dependence be-tween the parameters vi and vj, and is assigned a Gamma distribution τv ∼ Gamma(av, bv)(Bernardinelli et al., 1995). The choice of the Gamma prior influences the smoothness of thespatial effect.

AppendixC.2. Second-order intrinsic Gaussian Markov random field on a lattice

Consider a two-dimensional regular lattice In with n = n1n2 nodes, by restricting atten-tion to the distribution of the increment

vi+1,j + vi−1,j + vi,j+1 + vi,j−1 − 4vi,j,

we can be sure that the covariance structure is sparse and the interior contains non-zeroelements only in the neighbourhood structure of a second-order IGMRF

◦ ◦ × ◦ ◦◦ • • • ◦× • ⊗ • ×◦ • • • ◦◦ ◦ × ◦ ◦

,

which means that for the grid cell ⊗, the first-order neighbours are shown as • and × denotesthe second-order neighbours.

The density is similar to that of the first-order IGMRF prior,

π(v|W , τv) ∝ τ (N−1)/2v exp

{

−1

2vTQv

}

,

except that the precision matrix Q has a different representation. We refer the reader toRue and Held (2005) (chapter 3.4.2) for the technical details to compute Q.

Using graphical notation, the full conditionals of the nodes in the interior of the regulargrid are as follows

E(vi|v−i, τv) =1

20

8

◦ ◦ ◦ ◦ ◦◦ ◦ • ◦ ◦◦ • ◦ • ◦◦ ◦ • ◦ ◦◦ ◦ ◦ ◦ ◦

− 2

◦ ◦ ◦ ◦ ◦◦ • ◦ • ◦◦ ◦ ◦ ◦ ◦◦ • ◦ • ◦◦ ◦ ◦ ◦ ◦

− 1

◦ ◦ • ◦ ◦◦ ◦ ◦ ◦ ◦• ◦ ◦ ◦ •◦ ◦ ◦ ◦ ◦◦ ◦ • ◦ ◦

,

Prec(vi|v−i, τv) = 20τv.

The precision τv is unknown. As stated by Rue et al. (2009), the full-conditionals areconstructed to mimic the thin plate spline. Corrections to the boundary can be foundby using the stencils in Terzopoulos (1988). A Gamma prior is assigned to the precisionparameter τv and it determines the smoothness of the estimated surface.

22

Page 24: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

AppendixC.3. Gaussian field with Matern correlation function

The density of a discrete domain Gaussian field with expectation µ and covariance matrixC can be defined as

p(v) =1

(2π)n/2|C|1/2exp

(

−1

2(v − µ)TC−1(v − µ)

)

.

Consider a discrete representation of a continuous-domain Gaussian field {v(s), s ∈ D},

µ(s) = E(v(s)),

C(s, t) = Cov(v(s),v(t)).

The Matern family of covariance functions (Matern, 1960) is popular in environmentalstatistics. The Matern isotropic correlation function C(s, t) = C(d) on an infinite lat-tice with d =

||s− t|| is given as (Handcock and Stein, 1993; Stein, 1999; Minasny andMcBratney, 2005),

C(d) =1

2ν−1Γ(ν)

(

d

r

(

d

r

)

,

where d is the separation distance, Kν is the modified Bessel function of the second kind oforder ν, Γ(·) is the Gamma-function, r is the range or distance parameter (r > 0) whichmeasures how quickly the correlations decay with distance, and ν is the smoothness pa-rameter (ν > 0). The latent field has marginal variance 1/τv and range r. Gamma priorsare assigned to both parameters. The Matern model has great flexibility in modelling thespatial correlation due to the smoothness parameter ν. A larger value of ν (ν → ∞) impliesa smoother spatial process.

AppendixD. Moran’s I test

Moran’s I test is one of the most popular tests for spatial correlation. It was firstadopted by Moran (1948, 1950) and generalized by Cliff (1969). It is a global measure ofspatial autocorrelation that quantifies the degree to which data are clustered, dispersed orrandomly distributed. We describe a model-based Moran’s I (Zhang and Lin, 2008) that canbe used to test the spatial correlation of aggregated cell counts in the context of grid levelmodelling. Given a study region that has n regions indexed by i. Let Xi be the variable ofinterest in region i. The Moran’s I statistic for spatial autocorrelation can be written as,

I =

n∑

i=1

n∑

j=1

wij(Xi − X)(Xj − X)

[

n∑

i=1

n∑

j=1

wij

]

[

n∑

i=1

(Xi − X)2/n

]

,

where X =∑n

i=1 Xi/n, and wij with wii = 0 is the spatial weight between regions i and jof a spatial weight matrix W . W is commonly defined using the adjacency of spatial units:

23

Page 25: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

wij = 1 if regions i and j are adjacent (neighbours) and wij = 0 otherwise. Values of Moran’sI vary in the range [−1, 1]. The former denotes clustering where similar values located closeto one another, whereas the latter denotes dispersion where high values located closest tolow values and vice versa. A value close to 0 indicates a lack of spatial pattern or a randomdistribution.

A test for spatial correlation that is similar to Moran’s I test is Geary’s C test (Geary,1954), which is based on the deviations in responses of each observation with one another,

I =

(n− 1)n∑

i=1

n∑

j=1

wij(Xi −Xj)2

2n∑

i=1

n∑

j=1

wij(Xi − X)2.

Values of Geary’s C typically range from 0 to 2. Values lower than 1 indicate increasingpositive spatial autocorrelation, whereas values higher than 1 indicate increasing negativespatial autocorrelation. Moran’s I is a global measurement and sensitive to extreme valuesof X, whereas Geary’s C is more sensitive to differences in small neighbourhoods. Bothstatistics generally result in similar conclusions but Moran’s I is preferred as Cliff and Ord(1981, 1975) have shown that it is consistently more powerful than Geary’s C in detectingspatial autocorrelation.

References

Assuncao, R., Krainski, E., 2009. Neighborhood dependence in Bayesian spatial models.Biometrical Journal 51 (5), 851–869.

Baddeley, A., Berman, M., Fisher, N. I., Hardegen, A., Milne, R. K., Schuhmacher, D.,Shah, R., Turner, R., 2010. Spatial logistic regression and change-of-support in Poissonpoint processes. Electronic Journal of Statistics 4, 1151–1201.

Baddeley, A. J., Møller, J., Waagepetersen, R., 2000. Non-and semi-parametric estimationof interaction in inhomogeneous point patterns. Statistica Neerlandica 54 (3), 329–350.

Banerjee, S., Gelfand, A. E., Carlin, B. P., 2004. Hierarchical Modeling and Analysis forSpatial data. Chapman and Hall/CRC, Boca Raton, Florida.

Benes, V., Bodlak, K., Møller, J., Waagepetersen, R. P., 2005. A case study on point processmodelling in disease mapping. Image Analysis and Stereology 24, 159–168.

Bernardinelli, L., Clayton, D., Montomoli, C., 1995. Bayesian estimates of disease maps:How important are priors? Statistics in Medicine 14 (21-22), 2411–2431.

Besag, J., Green, P., Higdon, D., Mengersen, K., 1995. Bayesian computation and stochasticsystems. Statistical Science 10 (1), 3–41.

24

Page 26: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Besag, J., Kooperberg, C., 1995. On conditional and intrinsic autoregressions. Biometrika82 (4), 733–46.

Besag, J., York, J., Mollie, A., 1991. Bayesian image restoration, with two applicationsin spatial statistics (with discussion). Annals of the Institute of Statistical Mathematics43 (1), 1–59.

Best, N., Richardson, S., Thomson, A., 2005. A comparison of Bayesian spatial models fordisease mapping. Statistical Methods in Medical Research 14 (1), 35–59.

Best, N. G., Ickstadt, K., Wolpert, R. L., 2000. Spatial Poisson regression for health andexposure data measured at disparate resolutions. Journal of the American Statistical As-sociation 95 (452), 1076–1088.

Biggeri, A., Dreassi, E., Catelan, D., Rinaldi, L., Lagazio, C., Cringoli, G., 2006. Diseasemapping in veterinary epidemiology: a Bayesian geostatistical approach. Statistical Meth-ods in Medical Research 15 (4), 337–352.

Blangiardo, M., Cameletti, M., Baio, G., Rue, H., 2013. Spatial and spatio-temporal modelswith R-INLA. Spatial and Spatio-temporal Epidemiology 4 (0), 33–49.

Chakraborty, A., Gelfand, A. E., 2010. Analyzing spatial point patterns subject to measure-ment error. Bayesian Analysis 5 (1), 97–122.

Chang, K., 2010. Introduction to Geographic Information Systems. McGraw-Hill, New York.

Clayton, D., Bernardinelli, L., 1992. Bayesian methods for mapping disease risk. In: Elliott,P., Cuzick, J., English, D., Stern, R. (Eds.), Geographical and Environmental Epidemiol-ogy: Methods for Small Area Studies. Oxford University Press, pp. 205–220.

Cliff, A. D., 1969. Some measures of spatial association in areal data, unpublished Ph.D.dissertation, University of Bristol.

Cliff, A. D., Ord, J. K., 1975. The choice of a test for spatial autocorrelation. In: Davies,J. C., McCullagh, M. J. (Eds.), Display and Analysis of Spatial Data. John Wiley andSons, London, pp. 54–57.

Cliff, A. D., Ord, J. K., 1981. Spatial Processes: Models & Applications. Vol. 44. London:Pion.

Cramb, S. M., Mengersen, K. L., Baade, P. D., 2011. Developing the atlas of cancer inQueensland: methodological issues. International Journal of Health Geographics 10 (9).

Diggle, P. J., 1990. A point process modelling approach to raised incidence of a rare phe-nomenon in the vicinity of a prespecified point. Journal of the Royal Statistical Society:Series A (Statistics in Society) 153, 349–362.

25

Page 27: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Diggle, P. J., 2003. Statistical Analysis of Spatial Point Patterns, 2nd Edition. London:Arnold.

Diggle, P. J., Gomez-Rubio, V., Brown, P. E., Chetwynd, A. G., Gooding, S., 2007. Second-order analysis of inhomogeneous spatial point processes using case–control data. Biomet-rics 63 (2), 550–557.

Elsner, J. B., Murnane, R. J., Jagger, T. H., Widen, H. M., 2013. A spatial point processmodel for violent tornado occurrence in the US Great Plains. Mathematical Geosciences,1–13.

Fahrmeir, L., Kneib, T., 2011. Bayesian Smoothing and Regression for Longitudinal, Spatialand Event History Data. OUP Catalogue. Oxford University Press.

Gatrell, A. C., Bailey, T. C., Diggle, P. J., Rowlingson, B. S., 1996. Spatial point patternanalysis and its application in geographical epidemiology. Transactions of the Institute ofBritish Geographers, 256–274.

Geary, R. C., 1954. The contiguity ratio and statistical mapping. The Incorporated Statisti-cian 5 (3), 115–146.

Ghosh, M., Rao, J. N. K., 1994. Small area estimation: an appraisal. Statistical Science,55–76.

Goovaerts, P., 2009a. Combining area-based and individual-level data in the geostatisticalmapping of late-stage cancer incidence. Spatial and Spatio-temporal Epidemiology 1 (1),61–71.

Goovaerts, P., 2009b. Medical geography: a promising field of application for geostatistics.Mathematical Geosciences 41 (3), 243–264.

Hahn, U., Jensen, E. B. V., Lieshout, M. C. V., Nielsen, L. S., 2003. Inhomogeneous spatialpoint processes by location-dependent scaling. Advances in Applied Probability, 319–336.

Handcock, M. S., Stein, M. L., 1993. A Bayesian analysis of kriging. Technometrics 35 (4),403–410.

Haneuse, S., Wakefield, J., 2008. Geographic-based ecological correlation studies using sup-plemental case–control data. Statistics in Medicine 27 (6), 864–87.

Haran, M., 2011. Gaussian random field models for spatial data. In: Brooks, S. P., Gelman,A., Jones, G. L., Meng, X. L. (Eds.), Handbook of Markov Chain Monte Carlo. BocaRaton: CRC Press, pp. 449–478.

Held, L., Schrodle, B., Rue, H., 2010. Posterior and cross-validatory predictive checks: acomparison of MCMC and INLA. In: Kneib, T., Tutz, G. (Eds.), Statistical Modellingand Regression Structures–Festschrift in Honour of Ludwig Fahrmeir. Physica-Verlag HD,Dordrecht, pp. 91–110.

26

Page 28: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Hossain, M. M., Lawson, A. B., 2009. Approximate methods in Bayesian point process spatialmodels. Computational Statistics & Data Analysis 53 (8), 2831–2842.

Ickstadt, K., Wolpert, R., Lu, X., 1998. Modeling travel demand in Portland, Oregon.In: Dey, D., Muller, P., Sinha, D. (Eds.), Practical Nonparametric and Semiparamet-ric Bayesian Statistics. New York: Springer-Verlag, pp. 305–322.

Ickstadt, K., Wolpert, R. L., 1999. Spatial regression for marked point processes. BayesianStatistics 6, 323–341.

Illian, J., Penttinen, A., Stoyan, H., Stoyan, D., 2008. Statistical Analysis and Modelling ofSpatial Point Patterns. Vol. 70. John Wiley & Sons.

Illian, J. B., Sørbye, S. H., Rue, H., 2012a. A toolbox for fitting complex spatial pointprocess models using integrated nested Laplace approximation (INLA). The Annals ofApplied Statistics 6 (4), 1499–1530.

Illian, J. B., Sørbye, S. H., Rue, H., Hendrichsen, D. K., 2012b. Using INLA to fit a complexpoint process model with temporally varying effects - a case study. Journal of Environ-mental Statistics 3 (7), 1–29.

Kang, S. Y., McGree, J., Baade, P., Mengersen, K., 2013a. An investigation of the impactof various geographical scales for the specification of spatial dependence. Submitted forpublication.

Kang, S. Y., McGree, J., Mengersen, K., 2013b. The impact of spatial scales and spatialsmoothing on the outcome of Bayesian spatial model. PLoS ONE 8 (10), e75957.

Kirby, R. S., 1996. Toward congruence between theory and practice in small area analysisand local public health data. Statistics in Medicine 15 (17), 1859–1866.

Kleinschmidt, I., Pettifor, A., Morris, N., MacPhail, C., Rees, H., 2007. Geographic distribu-tion of human immunodeficiency virus in South Africa. The American Journal of TropicalMedicine and Hygiene 77 (6), 1163–1169.

Latimer, A. M., Banerjee, S., Jr, H. S., Mosher, E. S., Jr, J. A. S., 2009. Hierarchical modelsfacilitate spatial analysis of large data sets: a case study on invasive plant species in thenortheastern United States. Ecology Letters 12 (2), 144–154.

Lee, D., 2013. CARBayes: An R package for Bayesian spatial modelling with conditionalautoregressive priors. Journal of Statistical Software 55 (13).

Li, Y., Brown, P., Rue, H., al Maini, M., Fortin, P., 2012. Spatial modelling of lupus incidenceover 40 years with changes in census areas. Journal of the Royal Statistical Society: SeriesC (Applied Statistics) 61 (1), 99–115.

27

Page 29: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Lindgren, F., Rue, H., Lindstrom, J., 2011. An explicit link between Gaussian fields andGaussian Markov random fields: the stochastic partial differential equation approach.Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73 (4), 423–498.

Longley, P. A., Batty, M., 1996. Spatial Analysis: Modelling in a GIS Environment. JohnWiley & Sons.

Louie, M. M., Kolaczyk, E. D., 2006. A multiscale method for disease mapping in spatialepidemiology. Statistics in Medicine 25 (8), 1287–1306.

Martino, S., Akerkar, R., Rue, H., 2011. Approximate Bayesian inference for survival models.Scandinavian Journal of Statistics 38 (3), 514–528.

Matern, B., 1960. Spatial variation–Stochastic models and their application to some problemsin forest surveys and other sampling investigations. Medd. Statens Skogsforskningsinstitut49 (5).

Minasny, B., McBratney, A. B., 2005. The Matern function as a general model for soilvariograms. Geoderma 128 (34), 192–207.

Møller, J., Waagepetersen, R. P., 2004. Statistical Inference and Simulation for Spatial PointProcesses. Vol. 100. Champman and Hall/CRC Press, Boca Raton, FL.

Møller, J., Waagepetersen, R. P., 2007. Modern statistics for spatial point processes*. Scan-dinavian Journal of Statistics 34 (4), 643–684.

Mollie, A., 1996. Bayesian mapping of disease. In: Gilks, W. R., Richardson, S., Spiegelhalter,D. J. (Eds.), Markov Chain Monte Carlo in Practice. Chapman & Hall, London, pp. 359–379.

Moran, P. A. P., 1948. The interpretation of statistical maps. Journal of the Royal StatisticalSociety. Series B (Methodological) 10 (2), 243–251.

Moran, P. A. P., 1950. Notes on continuous stochastic phenomena. Biometrika 37 (1/2),17–23.

Paciorek, C. J., 2013. Spatial models for point and areal data using Markov random fieldson a fine grid. Electronic Journal of Statistics 7, 946–972.

Paul, M., Riebler, A., Bachmann, L. M., Rue, H., Held, L., 2010. Bayesian bivariate meta-analysis of diagnostic test studies using integrated nested Laplace approximations. Statis-tics in Medicine 29 (12), 1325–1339.

Richardson, S., Thomson, A., Best, N., Elliott, P., 2004. Interpreting posterior relative riskestimates in disease-mapping studies. Environmental Health Perspectives 112 (9), 1016.

28

Page 30: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Robinson, W. S., 1950. Ecological correlations and the behavior of individuals. AmericanSociological Review 15 (3), 351–357.

Rue, H., Held, L., 2005. Gaussian Markov Random Fields: Theory and Applications. Vol.104 of Monographs on Statistics and Applied Probability. Chapman & Hall, London.

Rue, H., Martino, S., Chopin, N., 2009. Approximate Bayesian inference for latent Gaussianmodels by using integrated nested Laplace approximations. Journal of the Royal StatisticalSociety: Series B (Statistical Methodology) 71 (2), 319–392.

Sartorius, B., Kahn, K., Collinson, M. A., Sartorius, K., Tollman, S. M., 2013. Dying intheir prime: determinants and space-time risk of adult mortality in rural South Africa.Geospatial Health 7 (2), 237–249.

Schrodle, B., Held, L., Riebler, A., Danuser, J., 2011. Using integrated nested Laplaceapproximations for the evaluation of veterinary surveillance data from Switzerland: acase-study. Journal of the Royal Statistical Society: Series C (Applied Statistics) 60 (2),261–279.

Smith, T. E., 2004. A scale-sensitive test of attraction and repulsion between spatial pointpatterns. Geographical Analysis 36 (4), 315–331.

Song, H. R., Lawson, A., et al., 2011. Modeling type 1 and type 2 diabetes mellitus incidencein youth: An application of Bayesian hierarchical regression for sparse small area data.Spatial and Spatio-temporal Epidemiology 2 (1), 23–33.

Stein, M. L., 1999. Interpolation of Spatial Data: Some Theory for Kriging. Chapter 2,Springer Verlag, New York.

Strauss, D. J., 1975. A model for clustering. Biometrika 62 (2), 467–475.

Terzopoulos, D., 1988. The computation of visible-surface representations. IEEE Transac-tions on Pattern Analysis and Machine Intelligence 10 (4), 417–438.

van Lieshout, M. N. M., Stein, A., 2012. Earthquake modelling at the country level usingaggregated spatio-temporal point processes. Mathematical Geosciences 44 (3), 309–326.

Vanhatalo, J., Pietilainen, V., Vehtari, A., 2010. Approximate inference for disease mappingwith sparse Gaussian processes. Statistics in Medicine 29 (15), 1580–1607.

Vanhatalo, J., Vehtari, A., 2007. Sparse log Gaussian processes via MCMC for spatial epi-demiology. In: JMLR Workshop and Conference Proceedings. Vol. 1. pp. 73–89.

Wakefield, J. C., Best, N. G., Waller, L. A., 2000. Bayesian approaches to disease mapping.In: Elliot, P., Wakefield, J. C., Best, N. G., Briggs, D. J. (Eds.), Spatial Epidemiology:Methods and Applications. Oxford: Oxford University Press, pp. 104–127.

29

Page 31: Notice Changes introduced as a result of publishing ...eprints.qut.edu.au/75778/1/Revised_manuscript.pdf · copy-editing and formatting may not be reflected in this document. For

Wang, F., 2006. Quantitative Methods and Applications in GIS. CRC Press, Boca Raton,FL.

Wiegand, T., Gunatilleke, S., Gunatilleke, N., Okuda, T., 2007. Analyzing the spatial struc-ture of a Sri Lankan tree species with multiple scales of clustering. Ecology 88 (12), 3088–3102.

Wolpert, R. L., Ickstadt, K., 1998. Poisson/gamma random field models for spatial statistics.Biometrika 85 (2), 251–267.

Zhang, T. L., Lin, G., 2008. Identification of local clusters for count data: a model-basedMoran’s I test. Journal of Applied Statistics 35 (3), 293–306.

30