This article was downloaded by: [Universiti Pendidikan Sultan Idris], [Nor Azah Samat]
On: 27 June 2012, At: 19:09
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Journal of Applied Statistics
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/cjas20
Vector-borne infectious disease
mapping with stochastic difference
equations: an analysis of dengue
disease in Malaysia
N. A. Samat a & D. F. Percy b
a Department of Mathematics, Faculty of Science and
Mathematics, Universiti Pendidikan Sultan Idris, 35900 Tanjong
Malim, Perak, Malaysia
b Salford Business School, University of Salford, Greater
Manchester, M5 4WT, UK
Version of record first published: 27 Jun 2012
To cite this article: N. A. Samat & D. F. Percy (2012): Vector-borne infectious disease mapping
with stochastic difference equations: an analysis of dengue disease in Malaysia, Journal of Applied
Statistics, DOI:10.1080/02664763.2012.700450
To link to this article: http://dx.doi.org/10.1080/02664763.2012.700450
Journal of Applied Statistics
2012, iFirst article
Vector-borne infectious disease mapping with stochastic difference equations: an
analysis of dengue disease in Malaysia
N.A. Samat a∗ and D.F. Percy b
a Department of Mathematics, Faculty of Science and Mathematics, Universiti Pendidikan Sultan Idris,
35900 Tanjong Malim, Perak, Malaysia; b Salford Business School, University of Salford, Greater
Manchester, M5 4WT, UK
(Received 3 May 2011; final version received 2 June 2012)
Few publications consider the estimation of relative risk for vector-borne infectious diseases. Most of
these articles involve exploratory analysis that includes the study of covariates and their effects on disease
distribution and the study of geographic information systems to integrate patient-related information. The
aim of this paper is to introduce an alternative method of relative risk estimation based on discrete time–space stochastic SIR-SI models (susceptible–infective–recovered for human populations; susceptible–
infective for vector populations) for the transmission of vector-borne infectious diseases, particularly
dengue disease. First, we describe deterministic compartmental SIR-SI models that are suitable for dengue
disease transmission. We then adapt these to develop corresponding discrete time–space stochastic SIR-
SI models. Finally, we develop an alternative method of estimating the relative risk for dengue disease
mapping based on these models and apply them to analyse dengue data from Malaysia. This new approach
offers a better model for estimating the relative risk for dengue disease mapping compared with the other
common approaches, because it takes into account the transmission process of the disease while allowing
for covariates and spatial correlation between risks in adjacent regions.
Keywords: relative risk; disease mapping; dengue disease; tract-count data; SIR-SI models
1. Introduction
Dengue is a common, serious, infectious, mosquito-borne, viral disease in tropical and subtropical
regions of the world. Dengue viruses are transmitted to humans through the bites of infective
female Aedes mosquitoes, which live in clear and stagnated water that is mostly generated by
human activity and rainfall. There is currently no vaccine available for the prevention or treatment
of dengue disease. However, dengue can be prevented and controlled if detected early. Therefore,
the use of statistical models for studying the transmission of dengue disease and the estimation
of relative risk for disease mapping are important contributions to the prevention and control
strategies for dengue.
This paper investigates geographical distribution and disease mapping particularly for dengue
disease. Relative risk estimation is one of the most important issues when studying geographical
distributions of disease occurrence. Many studies of disease mapping use regression-type models
in which observable (fixed effects) and unobservable (random effects) variables are included to give a clean map and so depict the true excess risk [2–4,13,16,19,32]. In spite of this, published
studies that use structural disease transmission models for disease mapping are scarce [9].
Specifically for the case of dengue disease, few researchers use stochastic processes to estimate
the relative risk for disease mapping. Rather, most dengue studies are based on exploratory
data analysis accompanied by pictorial maps, which includes the study of covariates and their
effects on dengue disease distribution. See, for example, [10,27]. Furthermore, some authors use
a geographic information system to integrate the patient-related information [31].
In attempting to develop an improved model and a complementary analysis, our research
introduces an alternative method to estimate the relative risk of dengue disease transmission
based initially on discrete-time, discrete-space, stochastic SIR-SI models (susceptible–infective–recovered for human populations; susceptible–infective for vector populations). This method is
designed to overcome the drawbacks of relative risk estimation in disease mapping using the
classical approach based on standardized morbidity ratios (SMRs). It involves extending the
fundamental Poisson-gamma model and developing a Bayesian analytic approach.
In the remainder of this paper, we first describe existing deterministic compartmental SIR-SI
models for dengue disease transmission. Then, we derive a discrete time–space stochastic SIR-
SI model for dengue disease transmission, which adapts and extends the stochastic SIR models
described by Lawson [14]. We then continue with explanations about an alternative method of
relative risk estimation for dengue disease mapping, which we develop based on this new stochastic
SIR-SI model. This method is then applied to dengue data of Malaysia to demonstrate the models
in practice.
2. Compartmental SIR-SI models for dengue disease transmission
The compartmental model displayed in Figure 1 is the most common model used in the study of
dengue disease transmission and is adapted from [7,22]. In this study, for $i = 1, 2, \ldots, M$ study
regions and $j = 1, 2, \ldots, T$ time periods, $S^{(h)}_{i,j}$ represents the total number of susceptible humans
at time $j$, $I^{(h)}_{i,j}$ represents the total number of infective humans at time $j$, and $R^{(h)}_{i,j}$ represents the total
number of recovered humans at time $j$. We use the superscript $(h)$ to distinguish the variables and
parameters as representing the human population rather than the vector population, for which we
use the superscript $(v)$. Furthermore, in Figure 1, $S^{(v)}_{i,j}$ represents the total number of susceptible
mosquitoes at time $j$, $I^{(v)}_{i,j}$ represents the total number of infective mosquitoes at time $j$, $\mu^{(h)}$ and
$\mu^{(v)}$ represent the (assumed equal) birth and death rates of humans per week and the (assumed
equal) birth and death rates of mosquitoes per week, respectively, $\gamma^{(h)}$ represents the rate at which
humans recover per week, $b$ represents the biting rate per week, $m$ represents the number of
alternative hosts available as the blood source, $A$ represents the constant recruitment rate for the
mosquito vector, $\beta^{(h)}$ represents the transmission probability from mosquitoes to humans, $\beta^{(v)}$
represents the transmission probability from humans to mosquitoes, $N^{(h)}_i$ represents the human
population size for the study region $i$ and $N^{(v)}_i$ represents the mosquito population size for the
study region $i$. These definitions and notations hold throughout this paper.
Figure 1. Compartmental SIR-SI model for dengue disease transmission.
For the case of dengue, susceptible people can become infective and then recover or die due
to the infection. However, susceptible Aedes mosquitoes can become infective but they will not
recover or die due to the infection because infective mosquitoes stay infective for the remainder
of their lifetimes.
For discrete-time intervals, the compartmental model in Figure 1 can also be written mathemat-
ically as a system of difference equations. Therefore, the deterministic SIR-SI model for dengue
disease transmission in human populations is given by
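$$S^{(h)}_{i,j} = \mu^{(h)} N^{(h)}_i + \left(1 - \mu^{(h)} - \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1}\right) S^{(h)}_{i,j-1}, \tag{1}$$
$$I^{(h)}_{i,j} = \left(1 - \mu^{(h)} - \gamma^{(h)}\right) I^{(h)}_{i,j-1} + \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1} S^{(h)}_{i,j-1}, \tag{2}$$
$$R^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) R^{(h)}_{i,j-1} + \gamma^{(h)} I^{(h)}_{i,j-1}. \tag{3}$$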
Similarly, the deterministic SIR-SI model for dengue disease transmission in vector populations
is given by
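$$S^{(v)}_{i,j} = \mu^{(v)} N^{(v)}_i + \left(1 - \mu^{(v)} - \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1}\right) S^{(v)}_{i,j-1}, \tag{4}$$
$$I^{(v)}_{i,j} = \left(1 - \mu^{(v)}\right) I^{(v)}_{i,j-1} + \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1} S^{(v)}_{i,j-1}. \tag{5}$$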
The combined model derived above has the same form as the deterministic SIR-SI model used
by Esteva and Vargas [7]. Here, $N^{(h)}_i$ and $N^{(v)}_i$ are assumed to be constant, such that
$N^{(h)}_i = S^{(h)}_{i,j} + I^{(h)}_{i,j} + R^{(h)}_{i,j}$ and $N^{(v)}_i = S^{(v)}_{i,j} + I^{(v)}_{i,j}$. This formulation can then be used to provide a link to
stochastic means, which will be explained in the next section.
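As an illustration of how Equations (1)–(5) can be iterated, the following minimal Python sketch advances one region by one week at a time. The parameter values are the weekly rates quoted later in Section 5.1; the function name, the dictionary layout and the initial state are hypothetical choices made for this example only. Summing the right-hand sides of (1)–(3), and of (4)–(5), shows that the update preserves both population totals, which the sketch relies on. Nothing in the update itself prevents a compartment from becoming negative if the infection term exceeds the susceptible pool, so the example is run for only a few weeks from a small outbreak.

```python
# Weekly rates as quoted in Section 5.1 of the paper.
PARAMS = {
    "beta_h": 0.50, "beta_v": 0.75,
    "mu_h": 0.0002736, "mu_v": 0.4028,
    "gamma_h": 0.7903, "b": 2.33, "m": 0.0,
}


def sir_si_step(state, params):
    """Advance (S_h, I_h, R_h, S_v, I_v) by one week using Equations (1)-(5)."""
    s_h, i_h, r_h, s_v, i_v = state
    n_h = s_h + i_h + r_h                      # constant human population
    n_v = s_v + i_v                            # constant mosquito population
    contact = params["b"] / (n_h + params["m"])
    new_h = params["beta_h"] * contact * i_v * s_h   # mosquito-to-human infections
    new_v = params["beta_v"] * contact * i_h * s_v   # human-to-mosquito infections
    mu_h, mu_v, gamma_h = params["mu_h"], params["mu_v"], params["gamma_h"]
    return (
        mu_h * n_h + (1 - mu_h) * s_h - new_h,   # Equation (1)
        (1 - mu_h - gamma_h) * i_h + new_h,      # Equation (2)
        (1 - mu_h) * r_h + gamma_h * i_h,        # Equation (3)
        mu_v * n_v + (1 - mu_v) * s_v - new_v,   # Equation (4)
        (1 - mu_v) * i_v + new_v,                # Equation (5)
    )


# Hypothetical initial state: 10,000 humans and 20,000 mosquitoes in one region.
state = (9990.0, 10.0, 0.0, 19990.0, 10.0)
for week in range(1, 6):
    state = sir_si_step(state, PARAMS)
    print(week, [round(x, 1) for x in state])
```

Keeping the two infection terms as named intermediate quantities makes the link to the stochastic model of the next section direct, since those are the quantities that become random there.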
3. Stochastic SIR-SI model for dengue disease transmission
A deterministic analysis provides a good approximation to the stochastic means for a major out-
break when the sample size is large [12]. Therefore, in the following analysis we use a formulation
of the deterministic model to provide an approximation to the stochastic means.
Lawson [14] developed a stochastic SIR model for direct transmission of infectious diseases.
Although it only considered discrete time and discrete space, this model proved very effective for
analysing the spread of influenza. We now extend this model to enable the analysis of indirectly
transmitted infectious diseases, similarly taking account of correlations among neighbouring
regions, using a spatial prior as described later in this section. However, in this study we include
the terms $\mathfrak{I}^{(h)}_{i,j}$, $\mathfrak{R}^{(h)}_{i,j}$ and $\mathfrak{I}^{(v)}_{i,j}$ to represent the numbers of newly infective humans, newly recovered
humans and newly infective mosquitoes, respectively, all in the interval or time period $(j - 1, j]$
and study region $i$. This is because the dengue data that we observe are weekly new infective cases
in human populations, and we are interested in finding the posterior mean of the new infective
dengue cases each week.
For i = 1,2, . . . , M study regions and j = 1,2, . . . , T time periods, our discrete time–space
stochastic SIR-SI model for dengue disease transmission in human populations follows by adapt-
ing Equations (1)–(5) and including a probability distribution to reflect the randomness inherent
in the data as shown:
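$$S^{(h)}_{i,j} = \mu^{(h)} N^{(h)}_i + \left(1 - \mu^{(h)}\right) S^{(h)}_{i,j-1} - \mathfrak{I}^{(h)}_{i,j}, \tag{6}$$
$$\mathfrak{I}^{(h)}_{i,j} \sim \text{Poisson}\left(\lambda^{(h)}_{i,j}\right), \tag{7}$$
$$\lambda^{(h)}_{i,j} = \exp\left(\beta^{(h)}_0 + c^{(h)}_i\right) \frac{\beta^{(h)} b}{N^{(h)}_i + m}\, I^{(v)}_{i,j-1} S^{(h)}_{i,j-1}, \tag{8}$$
$$I^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) I^{(h)}_{i,j-1} + \mathfrak{I}^{(h)}_{i,j} - \mathfrak{R}^{(h)}_{i,j}, \tag{9}$$
$$R^{(h)}_{i,j} = \left(1 - \mu^{(h)}\right) R^{(h)}_{i,j-1} + \mathfrak{R}^{(h)}_{i,j}, \tag{10}$$
$$\mathfrak{R}^{(h)}_{i,j} = \gamma^{(h)} I^{(h)}_{i,j-1}. \tag{11}$$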
Furthermore, in this study and due to the general unavailability of sufficient data for vectors, the
discrete-time discrete-space SIR-SI models for dengue disease transmission in vector populations
are assumed non-stochastic and are as follows:
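$$S^{(v)}_{i,j} = \mu^{(v)} N^{(v)}_i + \left(1 - \mu^{(v)}\right) S^{(v)}_{i,j-1} - \mathfrak{I}^{(v)}_{i,j}, \tag{12}$$
$$\mathfrak{I}^{(v)}_{i,j} = \frac{\beta^{(v)} b}{N^{(h)}_i + m}\, I^{(h)}_{i,j-1} S^{(v)}_{i,j-1}, \tag{13}$$
$$I^{(v)}_{i,j} = \left(1 - \mu^{(v)}\right) I^{(v)}_{i,j-1} + \mathfrak{I}^{(v)}_{i,j}. \tag{14}$$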
We use the Poisson distribution to model the number of new infectives, as this is the fundamental
model for count data. Its mean $\lambda^{(h)}_{i,j}$ is chosen to match the deterministic form in Equation (2) with
a positive multiplicative factor to represent spatial correlation as explained below.
The formulations above show that the counts of new infective humans are assumed to follow
independent Poisson distributions, where the expected numbers of new infectives include elements
of the transmission, which are the simple direct dependence of current infective counts on previous
counts in the same spatial unit and a linear predictor term that can include covariates or random
effects.
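To make the structure of the human-population update concrete, the following Python sketch carries out one weekly step of Equations (6)–(11) for a single region. It is a forward simulation under stated assumptions, not the Bayesian fitting procedure used in this paper: the values of the intercept and the spatial random effect are fixed hypothetical numbers rather than posterior draws, the state and the infective mosquito count are hypothetical, and the Poisson sampler is Knuth's multiplication method, which is adequate only for the small means used here.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate by Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def human_step(s_h, i_h, r_h, i_v, n_h, p, beta0, c_i, rng):
    """One weekly update of the human compartments, Equations (6)-(11)."""
    lam = math.exp(beta0 + c_i) * p["beta_h"] * p["b"] / (n_h + p["m"]) * i_v * s_h  # (8)
    new_inf = sample_poisson(lam, rng)                               # (7)
    new_rec = p["gamma_h"] * i_h                                     # (11)
    s_next = p["mu_h"] * n_h + (1 - p["mu_h"]) * s_h - new_inf       # (6)
    i_next = (1 - p["mu_h"]) * i_h + new_inf - new_rec               # (9)
    r_next = (1 - p["mu_h"]) * r_h + new_rec                         # (10)
    return s_next, i_next, r_next, new_inf, lam


# Weekly rates from Section 5.1; everything else below is hypothetical.
P = {"beta_h": 0.50, "b": 2.33, "m": 0.0, "mu_h": 0.0002736, "gamma_h": 0.7903}
rng = random.Random(2012)
s1, i1, r1, new_cases, lam = human_step(
    s_h=9990.0, i_h=10.0, r_h=0.0, i_v=10.0, n_h=10000.0,
    p=P, beta0=0.0, c_i=0.0, rng=rng,
)
print(f"Poisson mean {lam:.3f}, simulated new cases {new_cases}")
```

With the intercept and random effect both set to zero, the Poisson mean reduces to the deterministic infection term of Equation (2), which is the matching described above.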
As these counts are conditional upon other variables, the Poisson assumption cannot be tested in
isolation, but rather by trying other candidate distributions and comparing overall goodness-of-fit
as described in Section 5.3. However, the Poisson assumption is the default for log linear models
such as this, and we leave the testing of other distributions to future investigations.
In Equation (8), $\beta^{(h)}_0$ is a constant term to describe the overall rates of the process for human
populations, and $c^{(h)}_i$ is a random effect that is designed to absorb residual spatial variation for
human populations. In this study, a conditional autoregressive (CAR) prior is used as a family of
prior distributions for the random effect. This CAR model was proposed by Besag et al. [3], where
the probability densities of values at any given location are conditional on the neighbouring areas.
The advantage of this intrinsic CAR model is that the conditional moments are defined as simple
functions of the neighbouring values and the number of neighbours $m_i$ by means of a conditional
distribution defined by
$$c^{(h)}_i \mid c^{(h)}_j \; (j \neq i) \sim \text{Normal}\left(\bar{c}^{(h)}_i, \frac{r}{m_i}\right).$$
In other words, under the CAR prior, the random effect $c^{(h)}_i$ at site $i$, conditional upon the random
effects at all other sites, is normally distributed with mean equal to the average of the neighbouring
$c^{(h)}_j$ and variance equal to $r/m_i$, where $r$ is an unknown variance parameter. This intrinsic Gaussian
CAR model allows for over-dispersion and spatial correlation among neighbouring areas. However,
Lawson [15] points out that this intrinsic CAR model is not the only available specification
of a Gaussian Markov random field model. In fact, a proper CAR model formulation can also be
used. The application, comparison and discussion of a proper CAR prior in the analysis of our
stochastic SIR-SI dengue disease transmission model will be included in future investigations to
improve this methodology.
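The conditional moments of the intrinsic CAR prior are simple enough to compute by hand, as the following Python sketch illustrates. The four-region adjacency structure, the random-effect values and the variance parameter are all hypothetical and serve only to show how the conditional mean and variance depend on the neighbours of a site.

```python
def car_conditional(i, c, neighbours, r):
    """Conditional mean and variance of c_i given the other random effects."""
    nbrs = neighbours[i]
    m_i = len(nbrs)                            # number of neighbours of site i
    mean = sum(c[j] for j in nbrs) / m_i       # average of neighbouring effects
    variance = r / m_i
    return mean, variance


# Hypothetical map of four regions, described by neighbour lists.
neighbours = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
c = [0.2, -0.1, 0.4, 0.0]   # hypothetical current random effects
r = 0.5                     # hypothetical variance parameter

for site in neighbours:
    mean, variance = car_conditional(site, c, neighbours, r)
    print(site, round(mean, 3), round(variance, 3))
```

Sites with more neighbours receive a smaller conditional variance, so their random effects are drawn more tightly towards the local average.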
The discrete time–space stochastic SIR-SI model for dengue disease transmission that we
propose here will be used in the estimation of relative risk for dengue disease mapping. However,
the methods extend readily to apply more generally to other vector-borne infectious diseases. A
discussion about this is presented and explained in the next section.
4. Relative risk estimation for disease mapping
Many studies on disease mapping use regression-type models to estimate the risk. Here, we
introduce an alternative method of relative risk estimation of disease mapping based on the disease
transmission model adapted specially for dengue disease. Our computational analysis is performed
using WinBUGS software, which is a package designed to carry out Markov chain Monte Carlo
computations for a wide variety of Bayesian models [29]. A discussion and application of Bayesian
analysis of disease mapping using this software can be found in Lawson and Clark [17].
In general, for $i = 1, 2, \ldots, M$ study regions and $j = 1, 2, \ldots, T$ time periods, a pseudo-random
sample of observations $\lambda^{(h)}_{ijk}$ for $k = 1, 2, \ldots, n$ is generated from the posterior distribution for
the mean number of infectives $\lambda^{(h)}_{ij}$. From this sample, the posterior expected mean number of
infectives can be approximated using the unbiased sample mean
$$\tilde{\lambda}^{(h)}_{ij} = \frac{1}{n} \sum_{k=1}^{n} \lambda^{(h)}_{ijk}. \tag{15}$$
Next, the relative risk parameter $\theta^{(h)}_{ij}$ is defined by
$$\theta^{(h)}_{ij} = \frac{\lambda^{(h)}_{ij}}{e^{(h)}_{ij}}. \tag{16}$$
Therefore, the posterior expected relative risk can also be approximated using an unbiased sample
mean
$$\tilde{\theta}^{(h)}_{ij} = \frac{1}{n} \sum_{k=1}^{n} \theta^{(h)}_{ijk} = \frac{1}{n} \sum_{k=1}^{n} \frac{\lambda^{(h)}_{ijk}}{e^{(h)}_{ij}} = \frac{\tilde{\lambda}^{(h)}_{ij}}{e^{(h)}_{ij}}. \tag{17}$$
In other words, the posterior expected relative risk is equal to the posterior expected mean number
of infectives, $\tilde{\lambda}^{(h)}_{ij}$, divided by the corresponding naïve mean number of infectives based on the
human population across all study regions, $e^{(h)}_{ij}$.
We then use this formulation in the estimation of relative risk for disease mapping, based on
the discrete time–space stochastic SIR-SI model for disease transmission using data in the form
of counts of cases for all tracts under consideration.
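The calculation in Equations (15)–(17) can be sketched in a few lines of Python. The posterior draws below are made-up numbers standing in for sampler output. The helper for the expected count uses one common convention in disease mapping, namely the region's population share of the total cases in that week; this convention is an assumption of the sketch rather than a formula stated in the text.

```python
def posterior_relative_risk(lambda_samples, expected):
    """Posterior expected relative risk from sampled Poisson means."""
    lam_tilde = sum(lambda_samples) / len(lambda_samples)   # Equation (15)
    return lam_tilde / expected                             # Equation (17)


def expected_count(pop_i, total_cases_j, total_pop):
    """Assumed convention: population-proportional share of the week's cases."""
    return pop_i * total_cases_j / total_pop


# Hypothetical posterior draws for one region and one week.
draws = [10.0, 12.0, 14.0]
e_ij = expected_count(pop_i=1200, total_cases_j=50, total_pop=10000)
print(posterior_relative_risk(draws, e_ij))
```

Because the expected count does not vary across draws, averaging the sampled ratios and dividing the averaged mean by the expected count give the same value, which is the equality stated in Equation (17).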
5. Application of relative risk estimation for dengue disease in Malaysia
This section demonstrates and displays the results of relative risk estimation based on an
application of the preceding discrete time–space stochastic SIR-SI models for dengue disease
transmission with five alternative assumptions about the mosquito population. The results are
compared and presented in tables and a map, and a powerful model for relative risk estimation
and dengue disease mapping is revealed.
5.1 Data set
Data used in this study were provided by the Ministry of Health, the Institute for Medical Research
and the Department of Statistics, all in Malaysia. All methods presented here are applied to dengue
data in the form of counts of cases within the states of Malaysia for epidemiology weeks 1–53
during a 1-year period spanning 2008–2009. Figure 2 displays the available data, which refer
to observed new infective dengue cases of humans in time periods or intervals ( j − 1, j] for
j = 1, 2, . . . , 53.
The values for $\beta^{(h)}$ and $\beta^{(v)}$ are chosen to be 0.50 and 0.75, respectively, and the number of
alternative hosts available as the blood source $m$ is assumed to be zero. Furthermore, the weekly
rate values for $\mu^{(h)}$, $\mu^{(v)}$ and $\gamma^{(h)}$ are 0.0002736, 0.4028 and 0.7903, respectively, and $b$ is 2.33. All
of these rates are converted from daily rates that we derived from the literature [22,25]. Moreover,
since there are no routine data available for dengue mosquitoes, we impute suitable values based
on studies conducted by other researchers. This process is explained in the next section.
Figure 2. Time series plot for numbers of new infective dengue cases from epidemiology weeks 1–53 during
1-year period spanning 2008–2009 for all 16 states in Malaysia. (Vertical axis: numbers of new infective
dengue cases; horizontal axis: epidemiology week; series: Perlis, Kedah, P.Pinang, Perak, Kelantan,
Terengganu, Pahang, Selangor, K.Lumpur, Putrajaya, N.Sembilan, Melaka, Johor, Sarawak, Labuan, Sabah.)
5.2 Estimation of vector mosquito populations
Implementation of the SIR-SI models requires dengue mosquito vector data. Since there are
no available routine data for vector mosquito populations, specifically data for newly infective
mosquitoes that are difficult to collect, we propose three simple methods to impute values in
order to generate better results for relative risk estimation than would otherwise be possible. First,
the estimation is based on seasonal averages reported in relevant journal publications, which
specifically study dengue in Malaysia, written by Rohani et al. [26] and Lee and Inder Singh [18].
Second, the estimation is based on the SIR-SI model for dengue disease transmission where the
starting values are set and the estimation propagates from the SIR-SI equations. Here, some of the
estimation is based on information taken from an article by Nishiura [22]. Third, the estimation
is based on an assumption that the infective mosquito data follow the pattern of weekly data for
new infective humans.
5.2.1 Estimation of vector mosquito populations based on seasonal averages
Rohani et al. [26] identified about 40 infective adult mosquitoes in a sample of 5508. In order to
progress, it is feasible to interpret this information as
$$\frac{S^{(v)}_{i,0}}{I^{(v)}_{i,0}} \approx \frac{5508 - 40}{40} = \frac{1367}{10}, \tag{18}$$
$$\Rightarrow I^{(v)}_{i,0} \approx 0.00732\, S^{(v)}_{i,0}. \tag{19}$$
The calculation above clearly assumes that the ratio of susceptibles to infectives for mosquitoes
is approximately constant, which is a reasonable first-order assumption.
Now consider a study by Lee and Inder Singh [18], who conducted monthly surveillance of
adult mosquitoes in Kuala Lumpur, Malaysia, continuously from January to December 1990 to
monitor their population. Results of the study give the distribution and numbers of adult Aedes
collected in sentinel traps and the total number of adult mosquitoes for each month. Sentinel
traps are typically huts or rooms or houses, which are used to collect mosquitoes. Normally, two
humans stay inside the hut as a bait to attract mosquitoes. Therefore, the numbers of mosquitoes
collected here refer to the numbers corresponding to the population of susceptible humans at risk.
In this investigation, Lee and Inder Singh [18] observed a total of 8518 mosquitoes among 556
susceptible humans in the year 1990. Since we plan to use the number of susceptible vectors as
the starting point at time j = 0 for each region in our analysis, we have
$$\frac{S^{(v)}_{i,0} + I^{(v)}_{i,0}}{S^{(h)}_{i,0}} \approx \frac{8518}{556} = \frac{4259}{278}. \tag{20}$$
Rearranging Approximation (20) gives
$$I^{(v)}_{i,0} \approx \frac{4259}{278}\, S^{(h)}_{i,0} - S^{(v)}_{i,0}. \tag{21}$$
Substituting Approximation (21) into Approximation (19) now gives
$$S^{(v)}_{i,0} \approx \frac{1367}{10} \left(\frac{4259}{278}\, S^{(h)}_{i,0} - S^{(v)}_{i,0}\right) \;\Rightarrow\; S^{(v)}_{i,0} \approx 15.21\, S^{(h)}_{i,0}. \tag{22}$$
However, the data for adult mosquitoes in Lee’s paper represent monthly periods, and the lifespan
of Aedes mosquitoes in nature typically ranges from 2 weeks to a month depending on environ-
mental conditions [21]. Consequently, we need to redefine Equation (22) by transforming to a
single generation of Aedes mosquito. Under this redefinition, Approximation (20) changes so that
the appropriate revised form of Equation (22) becomes
$$S^{(v)}_{i,0} \approx \frac{15.21\, S^{(h)}_{i,0}}{2} = 7.605\, S^{(h)}_{i,0}. \tag{23}$$
Hence, there are seven or eight susceptible mosquitoes for every susceptible human, on average.
Relations (19)–(23) give some idea of what the average values are for the infective mosquito
population and susceptible mosquito population, which we assume as initial values for our
investigation. In this analysis, the value for the infective mosquito count $I^{(v)}_{i,0}$ is used as the average value
over the first time period, which we then propagate using one of three alternative assumptions.
These values are then imputed in Equation (8), giving three similar sets of results arising from
our relative risk estimation.
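The arithmetic behind Relations (18)–(23) can be retraced with the short Python sketch below. The survey counts are those quoted above; the variable names and the helper function are hypothetical, and the intermediate rounding to 15.21 follows the rounding used in Approximation (22).

```python
# Rohani et al. [26]: about 40 infective mosquitoes in a sample of 5508.
susceptible_per_infective = (5508 - 40) / 40                 # Approximation (18): 136.7
infective_per_susceptible = 1 / susceptible_per_infective    # Approximation (19)

# Lee and Inder Singh [18]: 8518 mosquitoes among 556 susceptible humans.
mosquitoes_per_human = 8518 / 556                            # Approximation (20)

# Solving S_v = k * (q * S_h - S_v) for S_v / S_h gives k * q / (1 + k).
k, q = susceptible_per_infective, mosquitoes_per_human
monthly_ratio = round(k * q / (1 + k), 2)                    # Approximation (22)
generation_ratio = monthly_ratio / 2                         # Approximation (23)


def initial_mosquitoes(susceptible_humans):
    """Initial susceptible and infective mosquito counts for one region."""
    s_v0 = generation_ratio * susceptible_humans
    i_v0 = infective_per_susceptible * s_v0
    return s_v0, i_v0


print(monthly_ratio, generation_ratio)
print(initial_mosquitoes(10000))   # hypothetical region with 10,000 susceptible humans
```

Combining the two ratios in this way implies roughly 0.056 infective mosquitoes per susceptible human at the start of the study period.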
First, we assume that the data for infective mosquitoes are constant over time for all the states
in Malaysia (Assumption 1). Figure 3 shows a graph of the estimated infective mosquito data for
each state in Malaysia from epidemiology weeks 1–53 during the course of 2008–2009. That is,
from Equations (19) and (23) we estimate the number of infective mosquitoes for the start of the
time period for each state, which we then assume constant for all time periods.
Second, we assume that the data for infective mosquitoes follow a cyclical seasonal pattern
(Assumption 2). This is because many researchers have reported in their studies that the seasonal
patterns of outbreak of dengue coincide with the rainy season [8,23,28,30].
Figure 3. Imputed infective mosquitoes without seasonality.
Figure 4. Imputed infective mosquitoes with piecewise constant seasonality.
According to Okogun et al. [23], rainfall is an important factor which regulates the abundance
of outdoor breeding mosquito populations and consequently directly associates with the higher
prevalence levels of mosquito diseases. This view is supported by Foo et al. [8], who found that the
monthly incidence of dengue is associated with the monthly rainfall, which provides the breeding
sites for mosquito populations. In Malaysia, the northeast monsoon is the major rainy season in the country, which brings heavy rainfall from mid November to early March [20]. Therefore, it
is expected that the number of infective mosquitoes will increase during this monsoon season. In
this study, we assume that the number of infective mosquitoes is piecewise constant over time,
where the value is in a range between 10% above and 10% below the estimated average number of
infective mosquitoes in each state (Figure 4). Here, the number of infective mosquitoes is assumed
to be large during epidemiology weeks 1–11 and 46–53, corresponding to the rainy season in
Malaysia, and small during the other epidemiology weeks.
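Assumption 2 can be written as a simple step function, as in the Python sketch below. The text specifies only that the values lie between 10% below and 10% above the state average; taking exactly those two levels for the dry and rainy weeks is our reading for this sketch, and the state average used in the example is hypothetical.

```python
# Rainy-season weeks as stated in the text: 1-11 and 46-53.
RAINY_WEEKS = set(range(1, 12)) | set(range(46, 54))


def piecewise_infective(avg, week):
    """Imputed infective mosquitoes: 10% above average in rainy weeks, 10% below otherwise."""
    return avg * (1.10 if week in RAINY_WEEKS else 0.90)


average = 500.0   # hypothetical state average of infective mosquitoes
series = [piecewise_infective(average, week) for week in range(1, 54)]
print(series[0], series[20], series[52])
```

Setting both multipliers to one recovers the constant series of Assumption 1.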
Third, we again assume that the data for infective mosquitoes follow a cyclical seasonal pattern,
but that this seasonality is now represented by a sinusoidal function ranging from 20% below the
estimated average value to 20% above the estimated average value in each state (Assumption 3).
The idea of using a sinusoidal function is to model the seasonal variation continuously throughout
the year, as a better representation of the true cyclical behaviour than in Assumption 2. Figure 5
shows the imputed infective mosquito data based on this assumption.
In any particular state $i$, we fit the sinusoidal function for infective mosquitoes by considering
the continuous-time equivalent to $I^{(v)}_{i,j}$, which is
$$I^{(v)}_i(t) = a_i + b_i \sin(c + dt),$$
where $a_i$ is the mean response, $b_i$ is the amplitude, $c$ is the phase, $2\pi/d$ is the period and $t$
represents time. In this research, $a_i$ represents the estimated average value used in Assumptions
1 and 2, $b_i$ reflects the amplitude of ±20% about the average and $t$ interpolates epidemiology
weeks $j = 1, 2, \ldots, 53$. The parameters $c = 39/53$ and $d = 2\pi/53$ are assumed constant across
all states.
Figure 5. Imputed infective mosquitoes with sinusoidal seasonality.
Therefore, $dt$ measures annual cycles, taking the values $[0, 2\pi)$ for year 1, $[2\pi, 4\pi)$ for year
2, $[4\pi, 6\pi)$ for year 3 and so on. Here, we choose $c$ in the interval $[0, 2\pi)$, but any value equal
to this plus a multiple of $2\pi$ will give the same imputed values for $I^{(v)}_i(t)$. As for Assumption 2,
the rainy season falls during epidemiology weeks 1–11 and 46–53. Therefore, it is assumed that
the number of infective mosquitoes is high during this period compared with the other epidemiology
weeks.
These three alternative assumptions for mosquito data are then imputed in the discrete time–space
stochastic SIR-SI model for dengue disease transmission for all states in Malaysia, to obtain
comparable posterior expected relative risks.
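The sinusoidal imputation of Assumption 3 can be evaluated directly from the stated phase and frequency, as in the following Python sketch. The function name and the state average are hypothetical; the amplitude is taken as 20% of the average, in line with the description above.

```python
import math

C = 39 / 53              # phase, as stated in the text
D = 2 * math.pi / 53     # angular frequency giving a 53-week period


def sinusoidal_infective(avg, week, rel_amplitude=0.20):
    """Imputed infective mosquitoes a + b*sin(c + d*t) with b = rel_amplitude * a."""
    return avg + rel_amplitude * avg * math.sin(C + D * week)


average = 500.0   # hypothetical state average of infective mosquitoes
series = [sinusoidal_infective(average, week) for week in range(1, 54)]
peak_week = series.index(max(series)) + 1
print(peak_week, round(max(series), 1), round(min(series), 1))
```

With these parameter values the imputed series peaks in epidemiology week 7, inside the rainy-season weeks 1–11.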
5.2.2 Estimation of vector mosquito populations based on propagation
Several articles used the same information as Nishiura [22] in their studies of dengue disease
transmission [7,25]. Here, we use information from Nishiura [22] in order to estimate the total
mosquito population $N^{(v)}_i$ in state $i = 1, 2, \ldots, M$. In his analysis, Nishiura assumed the total
human population $N^{(h)}_i$ to be 10,000 and the recruitment rate of mosquitoes to be 5000 per day.
Converting this daily rate to a weekly rate gives a recruitment rate of 35,000 mosquitoes per
week. We know that the recruitment rate of the mosquito population is $\mu^{(v)} N^{(v)}_i$, and in this study
the birth and death rates for the mosquito population are both $\mu^{(v)} \approx 0.4028$ per week. Therefore,
$$N^{(v)}_i \approx \frac{35{,}000}{0.4028} \approx 86{,}892,$$
and this leads to
$$N^{(v)}_i \approx 8.6892\, N^{(h)}_i. \tag{24}$$
Based on Approximation (24), we can now estimate the total mosquito population for each state.
These data are then imputed in Equation (12), which is then substituted in Equations (13) and (14).
This subsequently gives estimated values for the numbers of infective mosquitoes $I^{(v)}_{i,j}$, which we
propagate from Equation (14). We refer to this approach as Assumption 4.
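A minimal Python sketch of this propagation is given below. The population scaling follows Approximation (24) and the update follows Equations (12)–(14) with the weekly rates of Section 5.1. The starting values (all mosquitoes susceptible) and the short series of infective human counts are hypothetical, since the text does not list the starting values that were used.

```python
def mosquito_population(n_h, recruitment_per_week=35_000, ref_n_h=10_000, mu_v=0.4028):
    """Total mosquito population scaled to the human population, Approximation (24)."""
    return recruitment_per_week / mu_v / ref_n_h * n_h


def propagate_vectors(s_v, i_v, infective_humans, n_h, n_v,
                      beta_v=0.75, b=2.33, m=0.0, mu_v=0.4028):
    """Propagate infective mosquito counts through Equations (12)-(14)."""
    series = []
    for i_h in infective_humans:                  # I_h at the previous time step
        new_v = beta_v * b / (n_h + m) * i_h * s_v              # Equation (13)
        s_v, i_v = (mu_v * n_v + (1 - mu_v) * s_v - new_v,      # Equation (12)
                    (1 - mu_v) * i_v + new_v)                   # Equation (14)
        series.append(i_v)
    return series


n_h = 10_000.0
n_v = mosquito_population(n_h)
# Hypothetical start: every mosquito susceptible, with a short run of infective humans.
print(round(n_v), propagate_vectors(n_v, 0.0, [10.0, 20.0, 15.0], n_h, n_v))
```

Because the vector equations are deterministic, the imputed mosquito series is fully determined by the starting values and the infective human counts supplied to it.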
Figure 6. Imputed infective mosquitoes based on propagation. (Vertical axis: imputed infective mosquitoes,
0 to 35,000; horizontal axis: epidemiology week, 1 to 53; series: Perlis, Kedah, P.Pinang, Perak, Kelantan,
Terengganu, Pahang, Selangor, K.Lumpur, Putrajaya, N.Sembilan, Melaka, Johor, Sarawak, Labuan, Sabah.)
Figure 6 shows the corresponding values of $I^{(v)}_{i,j}$ for each state in Malaysia for epidemiology
weeks 1–53 corresponding to the 12 months from 1 January 2008 to 3 January 2009. These values
are finally imputed in Equation (8) to give the posterior expected means of new infective humans,
which subsequently give the posterior expected relative risks of dengue disease.
5.2.3 Estimation of vector mosquito populations from human populations
Here, we assume that the infective mosquito population counts follow the cyclical pattern of infec-
tive human population counts, with a constant ratio between infective mosquitoes and infective
humans (Assumption 5).
This assumption is based on our belief that there is a positive correlation between the numbers
of infective mosquitoes and the numbers of new infective humans. We assume that when there
is an increase in the number of new infective humans, there will also be an increase in the
number of infective mosquitoes. Figure 7 shows the pattern of $I^{(v)}_{i,j}$ for each state in Malaysia
from epidemiology weeks 1–53 for the same year during 2008–2009, based on Assumption 5.
These data are then imputed in Equation (8) to give the posterior expected means of new infective
humans, which subsequently give the posterior expected relative risks of dengue disease.
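Assumption 5 amounts to scaling the weekly case series by a constant, as the short Python sketch below shows. The text does not state the value of the ratio, so the sketch calibrates it so that the imputed series has a chosen average; this calibration rule, the case counts and the target average are all hypothetical.

```python
def infective_from_cases(new_cases, target_average):
    """Scale weekly new human cases by a constant ratio (hypothetical calibration)."""
    mean_cases = sum(new_cases) / len(new_cases)
    ratio = target_average / mean_cases        # constant mosquito-to-human ratio
    return [ratio * cases for cases in new_cases]


weekly_cases = [10, 30, 20]                    # hypothetical new infective humans
print(infective_from_cases(weekly_cases, target_average=120.0))
```

Any other positive ratio gives a series with the same shape, so the choice affects only the overall level of the imputed counts.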
5.3 Analysis and results: comparison of posterior expected relative risks
The aim of this research is to improve the accuracy and reliability of the existing methods for
mapping vector-borne infectious diseases. In this paper, the estimation of relative risk is based
on our stochastic SIR-SI model for disease transmission. Many published studies of general
infectious diseases, including [1,5,6,24], use stochastic terms in their models as probabilistic
statements about the progression of the disease. These studies generally agree that stochastic
models are more realistic than deterministic models, the latter being a special case of the former.
To demonstrate the possible benefits of our approach, we focus on the spread of dengue disease
in Malaysia. We adopt Bayesian methods of analysis for improved robustness in estimation and
decision-making. However, this paper is primarily concerned with the models and methods, so we
choose reference (uniform) priors for illustration, except for the CAR prior for spatial variability.
Future work will investigate the impact of more informative priors by means of sensitivity analyses.
Figure 7. Imputed infective mosquitoes based on human populations.
We now present the results of relative risk estimation based on our discrete time–space stochastic
SIR-SI model for dengue disease transmission using the five alternative methods for imputing
vector mosquito populations described in the previous section. The posterior distribution for this
model is sampled, with the sampler run to convergence, using WinBUGS software. Figures 8–12 show time
series plots for posterior expected relative risks across all states, based on our discrete time–space
stochastic SIR-SI models for dengue disease transmission in epidemiology weeks 1–53 during
2008–2009 using Assumptions 1–5, respectively.
Figures 8–12 suggest that all states have similar patterns of posterior expected
relative risk across all epidemiology weeks, though different methods give different values of posterior
expected relative risk. Based on the posterior expected relative risks for epidemiology week 53
in Table 1, all methods lead to the same conclusion that the state with the highest risk is Putrajaya
and the state with the lowest risk is Sabah, except for Assumption 5 which concludes that the
state of Labuan has the lowest risk. The risks for the other 14 states seem to be quite similar
for all five assumptions. This consistency is most encouraging and suggests that the disease
maps are not overly sensitive to the accuracy of the assumption made for imputing mosquito
counts. Consequently, there appears to be little to gain from expensive efforts to collect actual
data on mosquito populations, so long as reference values are available, such as those used in
our analysis. Mathematical considerations lead towards Assumption 3 as best representing the
physical process, but we now evaluate model goodness-of-fit measures to help us determine
which mosquito population assumption is most appropriate.
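The ranking claims above can be checked directly against the week 53 values in Table 1. The following sketch simply transcribes the table and reports the highest-risk and lowest-risk state under each assumption; the helper name `extreme_states` is ours and not part of the original analysis.

```python
# Posterior expected relative risks for epidemiology week 53, copied from Table 1.
# Each tuple holds the values under Assumptions 1 to 5, in that order.
relative_risk = {
    "Perlis": (0.3471, 0.3955, 0.4171, 0.6454, 0.2960),
    "Kedah": (0.3881, 0.4422, 0.4663, 0.8397, 0.8963),
    "Pulau Pinang": (0.6753, 0.7695, 0.8115, 0.9692, 1.0710),
    "Perak": (0.7790, 0.8876, 0.9361, 0.9843, 0.8168),
    "Kelantan": (0.6972, 0.7944, 0.8377, 0.5154, 0.5351),
    "Terengganu": (0.7176, 0.8177, 0.8623, 0.6239, 0.5518),
    "Pahang": (0.3891, 0.4433, 0.4675, 0.5375, 0.7151),
    "Selangor": (1.9350, 2.2040, 2.3250, 3.0420, 3.1830),
    "Kuala Lumpur": (1.4450, 1.6470, 1.7370, 1.6210, 1.5570),
    "Putrajaya": (2.2420, 2.5550, 2.6950, 5.7400, 5.2370),
    "Negeri Sembilan": (0.6184, 0.7046, 0.7430, 0.6123, 0.8455),
    "Melaka": (0.4911, 0.5595, 0.5900, 0.3274, 0.2817),
    "Johor": (0.5444, 0.6204, 0.6542, 0.6684, 0.5983),
    "Sarawak": (0.2766, 0.3151, 0.3323, 0.4213, 0.2829),
    "Labuan": (0.5723, 0.6522, 0.6878, 0.1551, 0.000000112),
    "Sabah": (0.1532, 0.1745, 0.1841, 0.1251, 0.0926),
}


def extreme_states(table, assumption_index):
    """Return (highest-risk state, lowest-risk state) for one assumption (0-based index)."""
    highest = max(table, key=lambda state: table[state][assumption_index])
    lowest = min(table, key=lambda state: table[state][assumption_index])
    return highest, lowest


extremes = [extreme_states(relative_risk, k) for k in range(5)]
for k, (highest, lowest) in enumerate(extremes, start=1):
    print(f"Assumption {k}: highest = {highest}, lowest = {lowest}")
```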
Figure 8. Posterior expected relative risks under Assumption 1.
Figure 9. Posterior expected relative risks under Assumption 2.
Figure 10. Posterior expected relative risks under Assumption 3.
Figure 11. Posterior expected relative risks under Assumption 4.
The use of goodness-of-fit measures is common in statistics for comparing fitted models. Lawson [15]
discusses several methods that can be used to assess goodness-of-fit, including chi-square
statistics, Akaike information criterion, Bayesian information criterion, deviance information
criterion (DIC) and posterior predictive loss. In this study, we use the DIC because it is readily
available in WinBUGS software and because Lawson [15] identifies weaknesses with the other
measures, particularly for models that involve several random effects. The DIC is defined by
Spiegelhalter et al. [29] as
$$\mathrm{DIC} = 2E_{\theta \mid x}\{D\} - D\{E_{\theta \mid x}(\theta)\},$$
where D(·) is the deviance of the model and x represents the observed data. It uses the average
of the posterior samples of θ to produce an expected value of θ. This value can also be computed
from a sample output from a chain. According to Spiegelhalter et al. [29], the model with the
smallest DIC is the model that would best predict a replicate data set of the same structure as that
currently observed. While Lawson and Clark [17] point out that the other overall goodness-of-fit
measures are useful for helping model selection, they give little help in assessing how well the
model fits the data.
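As a rough illustration of this definition, the DIC can be approximated from MCMC output by averaging the deviance over the posterior samples and subtracting the deviance evaluated at the posterior mean. The sketch below assumes a toy model of independent Poisson counts with a single mean parameter; the counts and posterior samples are invented, and this is neither the SIR-SI model nor the WinBUGS implementation.

```python
import math


def poisson_deviance(counts, mean):
    """Deviance D = -2 * log-likelihood for independent Poisson counts with a common mean."""
    log_likelihood = sum(y * math.log(mean) - mean - math.lgamma(y + 1) for y in counts)
    return -2.0 * log_likelihood


def dic(counts, posterior_samples):
    """Return (DIC, effective number of parameters) from posterior samples of the mean."""
    n = len(posterior_samples)
    # Posterior mean of the deviance, estimated by averaging over the samples.
    mean_deviance = sum(poisson_deviance(counts, theta) for theta in posterior_samples) / n
    # Deviance evaluated at the posterior mean of the parameter.
    theta_bar = sum(posterior_samples) / n
    deviance_at_mean = poisson_deviance(counts, theta_bar)
    # Difference between the two, often called the effective number of parameters.
    effective_parameters = mean_deviance - deviance_at_mean
    return 2.0 * mean_deviance - deviance_at_mean, effective_parameters


# Invented data and posterior samples, for illustration only.
counts = [2, 4, 6, 3]
posterior_samples = [3.0, 3.5, 4.0, 4.5, 5.0]
dic_value, effective_parameters = dic(counts, posterior_samples)
print(dic_value, effective_parameters)
```

In this form the DIC equals the deviance at the posterior mean plus twice the effective number of parameters, so a model is penalised for both poor fit and additional complexity.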
Table 2 shows the DIC values for the new infective humans for epidemiology weeks 1–53 for
all states in Malaysia based on our five different assumptions for the mosquito populations. From
the DIC values in Table 2, we can say that the model with Assumption 5 fits best because it gives
the smallest DIC, compared with the other models. We conclude that the discrete time–space
stochastic SIR-SI model that assumes that infective mosquito counts are proportional to infective
human counts is the best model to be used in the analysis specifically for estimating relative risk.
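The selection rule used here, choosing the assumption with the smallest DIC, can be expressed in a few lines. The values are copied from Table 2; the variable names are ours.

```python
# DIC values for new infective humans under each assumption, copied from Table 2.
dic_by_assumption = {
    "Assumption 1": 8993.57,
    "Assumption 2": 9515.41,
    "Assumption 3": 10087.4,
    "Assumption 4": 10137.5,
    "Assumption 5": 7982.23,
}

# The preferred model is the one with the smallest DIC.
best = min(dic_by_assumption, key=dic_by_assumption.get)

# Differences in DIC relative to the preferred model.
differences = {
    name: round(value - dic_by_assumption[best], 2)
    for name, value in dic_by_assumption.items()
}

print(best)
print(differences)
```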
Figure 12. Posterior expected relative risks under Assumption 5.
Table 1. Posterior expected relative risks for epidemiology week 53.
Column key: Assumption 1, $I^{(v)}$ constant; Assumption 2, $I^{(v)}$ piecewise constant seasonality; Assumption 3, $I^{(v)}$ sinusoidal seasonality; Assumption 4, $I^{(v)}$ propagated from SIR-SI equations; Assumption 5, $I^{(v)}$ estimated from human infectives $I^{(h)}$.
State                 Assumption 1  Assumption 2  Assumption 3  Assumption 4  Assumption 5
1. Perlis             0.3471        0.3955        0.4171        0.6454        0.2960
2. Kedah              0.3881        0.4422        0.4663        0.8397        0.8963
3. Pulau Pinang       0.6753        0.7695        0.8115        0.9692        1.0710
4. Perak              0.7790        0.8876        0.9361        0.9843        0.8168
5. Kelantan           0.6972        0.7944        0.8377        0.5154        0.5351
6. Terengganu         0.7176        0.8177        0.8623        0.6239        0.5518
7. Pahang             0.3891        0.4433        0.4675        0.5375        0.7151
8. Selangor           1.9350        2.2040        2.3250        3.0420        3.1830
9. Kuala Lumpur       1.4450        1.6470        1.7370        1.6210        1.5570
10. Putrajaya         2.2420        2.5550        2.6950        5.7400        5.2370
11. Negeri Sembilan   0.6184        0.7046        0.7430        0.6123        0.8455
12. Melaka            0.4911        0.5595        0.5900        0.3274        0.2817
13. Johor             0.5444        0.6204        0.6542        0.6684        0.5983
14. Sarawak           0.2766        0.3151        0.3323        0.4213        0.2829
15. Labuan            0.5723        0.6522        0.6878        0.1551        0.000000112
16. Sabah             0.1532        0.1745        0.1841        0.1251        0.0926
Table 2. DIC evaluated for Assumptions 1–5.
                            Assumption 1  Assumption 2  Assumption 3  Assumption 4  Assumption 5
New infective humans, (h)   8993.57       9515.41       10087.4       10137.5       7982.23
can overcome the problems of SMR, especially when there are no observed count data in cer-
tain regions, and the problems of the Poisson-gamma model, where covariate adjustments are
impossible and it is not possible to allow for spatial correlation between risks in adjacent areas.
Possible extensions to this work include the development of a model for dengue disease mapping
with continuous time and discrete space, in order to improve the accuracy of disease mapping
further and for particular applicability to vector-borne infectious diseases that are rare or in their
early stages. We anticipate that the results of this analysis will further strengthen our conclusions
about tract-count data using the above analysis. The techniques presented in this paper offer an
alternative method for estimating the relative risk in the study of disease mapping particularly for
diseases with indirect transmission.
Acknowledgements
The authors acknowledge Universiti Pendidikan Sultan Idris and the Ministry of Higher Education in Malaysia for their
financial support in respect of this study.
References
[1] C.L. Addy, I.M. Longini, Jr., and M. Haber, A generalized stochastic model for the analysis of infectious disease
final size data, Biometrics 47 (1991), pp. 961–974.
[2] L. Bernardinelli, D.G. Clayton, C. Pascutto, C. Montomoli, M. Ghislandi, and M. Songini, Bayesian analysis of
space–time variation in disease risk , Stat. Med. 14 (1995), pp. 2433–2443.
[3] J. Besag, J. York, and A. Mollie, Bayesian image restoration with two applications in spatial statistics, Ann. Inst.
Stat. Math. 43 (1991), pp. 1–59.
[4] D. Boehning, E. Dietz, and P. Schlattmann, Space–time mixture modelling of public health data, Stat. Med. 19
(2000), pp. 2333–2344.
[5] D. Clancy, A stochastic SIS infection model incorporating indirect transmission, J. Appl. Probab. 42 (2005), pp. 726–737.
[6] D. Clancy and P.D. O’Neill, Bayesian estimation of the basic reproduction number in stochastic epidemic models,
Bayesian Anal. 3 (2008), pp. 737–758.
[7] L. Esteva and C. Vargas, Analysis of a dengue disease transmission model, Math. Biosci. 150 (1998), pp. 131–151.
[8] L.C. Foo, T.W. Tim, H.L. Lee, and R. Fang, Rainfall, abundance of Aedes aegypti and dengue infection in Selangor,
Malaysia, Southeast Asian J. Trop. Med. Public Health 16 (1985), pp. 560–568.
[9] A. Gemperli, P. Vounatsou, N. Sogoba, and T. Smith, Malaria mapping using transmission models: An application
to survey data from Mali, Am. J. Epidemiol. 163 (2006), pp. 289–297.
[10] D.J. Gubler, Dengue and dengue haemorrhagic fever , Clin. Microbiol. Rev. 11 (1998), pp. 480–496.
[11] D.J. Gubler, Epidemic dengue/dengue haemorrhagic fever as a public health, social and economic problem in the
21st century, Trends Microbiol. 10 (2002), pp. 100–103.
[12] V. Isham, Stochastic models for epidemics: Current issues and development , in Celebrating Statistics, A.C. Davison,
Y. Dodge and N. Wermuth, eds., Oxford University Press, Oxford, 2005, pp. 27–54.
[13] L. Knorr-Held and J. Besag, Modelling risk from a disease in time and space, Stat. Med. 17 (1998), pp. 2045–2060.
[14] A.B. Lawson, Statistical Methods in Spatial Epidemiology, 2nd ed., John Wiley & Sons, Chichester, UK, 2006.
[15] A.B. Lawson, Bayesian Disease Mapping, CRC Press, Boca Raton, FL, 2009.
[16] A.B. Lawson, W.J. Browne, and C.L Vidal Rodeiro, Disease Mapping with WinBUGS and MLwiN , John Wiley &
Sons, Chichester, UK, 2003.
[17] A.B. Lawson and A. Clark, Spatial mixture relative risk models applied to disease mapping, Stat. Med. 21 (2002),
pp. 359–370.
[18] H.L. Lee and K. Inder Singh, Sequential sampling for Aedes aegypti and Aedes albopictus (Skuse) adults: Its use in
estimation of vector density threshold in dengue transmission and control, J. Biosci. 2 (1991), pp. 9–14.
[19] Y.C. MacNab and C.B Dean, Spatio-temporal modelling of rates for the construction of disease maps , Stat. Med. 21
(2002), pp. 347–358.
[20] Malaysian Meteorological Department, Monsoon season in Malaysia. Available at http://www.met.gov.my (2 April
2010).
[21] Maricopa County Environmental Services, Lifecycle and information on Aedes aegypti mosquitoes, Maricopa County.
Available at http://www.maricopa.gov/EnvSvc/VectorControl/Mosquitos/MosqInfo.aspx (20 July 2009).
[22] H. Nishiura, Mathematical and statistical analysis of the spread of dengue, Dengue Bull. 30 (2006), pp. 51–67.
[23] R.A.G. Okogun, E.B.N. Bethran, N.O. Anthony, C.A. Jude, and C.E. Anegbe, Epidemiological implication of
preferences of breeding sites of mosquito species in Midwestern Nigeria, Ann. Agric. Environ. Med. 10 (2003),
pp. 217–222.
[24] P.D. O’Neill, A tutorial introduction to Bayesian inference for stochastic epidemic models using Markov chain Monte
Carlo methods, Math. Biosci. 180 (2002), pp. 103–114.
[25] P. Pongsumpun, K. Patanarapelert, M. Sripom, S. Varamit, and I.M. Tang, Infection risk to travellers going to dengue
fever endemic regions, Southeast Asian J. Trop. Med. Public Health 35 (2004), pp. 155–159.
[26] A. Rohani, I. Asmaliza, S. Zainah, and H.L. Lee, Detection of dengue from field Aedes aegypti and Aedes albopictus
adults and larvae, Southeast Asian J. Trop. Med. Public Health 28 (1997), pp. 138–142.
[27] M.G. Rosa-Freitas, P. Tsouris, A. Sibajev, E.T. Weimann, A.U. Marques, R.L Ferreire, and F.C.L. Gards-Moura,
Exploratory temporal and spatial distribution analysis of dengue notifications in Boa Vista, Roraima, Brazilian
Amazon, 1999–2001, Dengue Bull. 27 (2003), pp. 63–80.
[28] H. Rozilawati, J. Zairi, and C.R. Adanan, Seasonal abundance of Aedes albopictus in selected urban and suburban
areas in Penang, Malaysia, Trop. Biomed. 24 (2007), pp. 83–94.
[29] D. Spiegelhalter, A. Thomas, N. Best, and D. Lunn, WinBUGS User Manual Version 1.4, MRC Biostatistics Unit,
Cambridge, UK, 2003.
[30] S. Sulaiman, Z.A. Pawanchee, J. Jeffery, I. Ghauth, and V. Buspavani, Studies on the distribution and abundance
of Aedes aegypti (L.) and Aedes albopictus (Skuse) (Diptera: Culicidae) in an endemic area of dengue/dengue
haemorrhagic fever in Kuala Lumpur , Mosq.-Borne Dis. Bull. 8 (1991), pp. 35–39.
[31] A. Tran, X. Deparis, P. Dussart, J. Morran, P. Rabarison, F. Remy, L. Polidori, and J. Gardon, Dengue spatial and
temporal patterns, French Guiana, 2001, Emerg. Infect. Dis. 10 (2004), pp. 615–621.
[32] L.A. Waller, B.P. Carlin, H. Xia, and A.E. Gelfand, Hierarchical spatio temporal mapping of disease rates, J. Am.
Stat. Assoc. 92 (1997), pp. 607–617.