Mathematical Biosciences xxx (2011) xxx–xxx
Contents lists available at ScienceDirect
journal homepage: www.elsevier.com/locate/mbs

Numerical algorithms for estimation and calculation of parameters in modeling pest population dynamics and evolution of resistance

Mingren Shi a,b,*, Michael Renton a,b,c

a School of Plant Biology, University of Western Australia, 35, Stirling Highway, Crawley, WA 6009, Australia
b Cooperative Research Centre for National Plant Biosecurity, Australia
c CSIRO Ecosystem Sciences, Underwood Avenue, Floreat, WA 6014, Australia

* Corresponding author at: School of Plant Biology, University of Western Australia, 35, Stirling Highway, Crawley, WA 6009, Australia. Tel.: +61 8 6488 1992; fax: +61 8 6488 1108.
E-mail addresses: [email protected] (M. Shi), [email protected]. edu.au (M. Renton).

Article info

Article history: Received 8 March 2011; received in revised form 19 May 2011; accepted 20 June 2011; available online xxxx.

Keywords: Parameter estimation; Offspring genotype table; Probit models; Mortality estimation; Population dynamics; Resistance evolution

0025-5564/$ - see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.mbs.2011.06.005

Please cite this article in press as: M. Shi, M. Renton, Numerical algorithms for estimation and calculation of parameters in modeling pest population dynamics and evolution of resistance, Math. Biosci. (2011), doi:10.1016/j.mbs.2011.06.005

Abstract

Computational simulation models can provide a way of understanding and predicting insect population dynamics and evolution of resistance, but the usefulness of such models depends on generating or estimating the values of key parameters. In this paper, we describe four numerical algorithms generating or estimating key parameters for simulating four different processes within such models. First, we describe a novel method to generate an offspring genotype table for one- or two-locus genetic models for simulating evolution of resistance, and how this method can be extended to create offspring genotype tables for models with more than two loci. Second, we describe how we use a generalized inverse matrix to find a least-squares solution to an over-determined linear system for estimation of parameters in probit models of kill rates. This algorithm can also be used for the estimation of parameters of Freundlich adsorption isotherms. Third, we describe a simple algorithm to randomly select initial frequencies of genotypes either without any special constraints or with some pre-selected frequencies. We also give a simple method to calculate the "stable" Hardy–Weinberg equilibrium proportions that would result from these initial frequencies. Fourth, we describe how the problem of estimating the intrinsic rate of natural increase of a population can be converted to a root-finding problem and how the bisection algorithm can then be used to find the rate. We implemented all these algorithms using MATLAB and Python code; the key statements in both codes consist of only a few commands and are given in the appendices. The results of numerical experiments are also provided to demonstrate that our algorithms are valid and efficient.

© 2011 Elsevier Inc. All rights reserved.

1. Introduction

Estimating parameters based on measured empirical data is a critical issue in biosecurity models, such as simulation models of population dynamics and evolution of resistance in stored-grain insect pests [12]. These simulation models are based on integrating sub-models representing different key biological processes, such as genetic recombination and mortality due to pesticides. Various parameters for different sub-models must be calculated or estimated before these models are used to predict the effects of different possible management strategies. These parameters include: the chance of certain genotypes being produced as the result of the mating of certain parent genotypes (which we call offspring genotype tables), initial frequencies of genotypes, mortalities of insect pests under various pesticide doses, and the intrinsic rate of natural increase of an insect population. These are important parameters within the sub-models for simulating genetic recombination and thus determining the genotype of offspring, initialisation of the population, simulating the effects of pesticide applications and calculating the number of eggs produced by each insect, respectively.

By an offspring genotype table we mean a table that lists all possible combinations of parental genotypes and, for each possible parental combination, gives the expected proportions of offspring genotypes (see Hedrick's book [19, p. 76] for an example of this kind of table, although no formal name is provided in this or other literature). Such a table is indispensable for a genetic model simulating evolution of resistance, or other traits. We develop a novel method to generate the offspring genotype table for a one-locus genetic model: quantifying all possible genotypes of parents and offspring and then using a block-matrix multiplication approach to generate the full table describing the chance of certain genotypes being produced as the result of the mating of each and every possible combination of parent genotypes. The offspring genotype tables for more than one locus are then produced recursively, with the table for a model with a higher number of loci produced from the tables for lower numbers of loci.




The algorithm for the one- and two-locus cases is given in Section 2.1. We also explain how this algorithm can be extended to models with more than two loci.

Many problems of quantitative inference in biological and technological research concern the relation between a stimulus (e.g. phosphine fumigation dose) and a binomial response (e.g. mortality of insect pests). A binomial generalized linear model, with a link function such as the probit function (the inverse of the standard normal cumulative distribution function), is usually used to analyse the empirical data. Normally, maximum likelihood estimation or chi-square approximation is applied to fit the parameters of such probit models. In such models, however, the probit is a linear function of the parameters, or of a metameter (e.g. log) of the parameters, so the corresponding equations with respect to the parameters form an over-determined linear system. We used a generalized inverse matrix method to find the least-squares solution of this system, i.e. the solution of its normal (regularized) equations. We describe the method in Section 2.2. This method has advantages over other methods [4] if we only need to estimate parameters without other statistical information such as significance or confidence intervals for the estimates: it is simple, with only one key command; it provides a more accurate estimate of the parameters; and even if the coefficient matrix of the over-determined linear system does not numerically have full column rank, it still works and yields a solution with minimum error in the L2-norm sense [4].

In some situations, we may wish to randomly select some or all of the initial frequencies of genotypes for a biological or genetic model. These frequencies must satisfy two simple constraints: each frequency is in the range [0,1] and the sum is equal to 1. In Section 2.3, we describe how we select the initial frequencies either without any extra conditions, or with some pre-selected frequencies, or with linear equality and inequality constraints. We also give a simple block-matrix multiplication method to calculate the equilibrium proportions that should result from these initial frequencies according to the Hardy–Weinberg Principle [29].

The intrinsic rate of natural increase (or development rate) is an important parameter in modeling the dynamics of an insect population. In Section 2.4, we describe how we converted the problem of estimating this parameter into a root-finding problem and used a bisection method to find the rate to any desired accuracy.

All the above algorithms are implemented using MATLAB (www.mathworks.com) and Python (www.python.org) code, using the Scientific Python library (www.scipy.org). The key statements and the results of numerical experiments are given in Section 3, demonstrating that our algorithms are valid and efficient.

2. Methods

2.1. Quantification of genotypes through block-matrix multiplication algorithm for creation of offspring genotype table

We developed a novel "quantification of genotypes through block-matrix multiplication" algorithm to generate the offspring genotype tables for a one-locus genetic model. In this section we describe how this algorithm can be used to derive the two-locus table from the one-locus table by block-matrix multiplication, and then how it can be extended recursively to generate the offspring genotype tables for models with more than two loci. Based on assumptions of random mating and no dependence of inheritance on gender, this algorithm makes it relatively straightforward to express genotype frequencies of an insect population as the proportion of offspring from all possible parental unions that belong to each genotype. Note that we developed the method in this paper only for diploid species with two alleles at each locus, but the same idea is also suitable for constructing algorithms for cases where a locus has more than two alleles.

2.1.1. One-locus case

To use computational methods for generating the one-locus offspring genotype table, we first need to quantify the parental and offspring genotypes. In the one-locus case, the two alleles, dominant "A" and recessive "a", are distributed among offspring in the usual binomial ratios. Each mating "female parent × male parent" produces four possible combinations: [each of the 2 alleles of the female parent (F1, F2)] × [each of the 2 alleles of the male parent (M1, M2)]. For example, the mating Aa × Aa produces F1 × M1: AA, F2 × M1: aA (= Aa), F1 × M2: Aa and F2 × M2: aa. These combinations can be obtained diagrammatically, using the method known as the Punnett square, or by constructing a tree diagram [31]. The Punnett square, named after the geneticist Reginald C. Punnett, for the above case is shown in Table 1.

Hence the proportions of offspring are equal to 2/4 = 0.5 for genotype Aa, 1/4 = 0.25 for aa and also 1/4 = 0.25 for AA. It is important to note that Punnett squares give probabilities only for genotypes, not phenotypes. The way in which the A and a alleles interact with each other to affect the phenotype of the offspring depends on how the gene products (proteins) interact. For classical dominant/recessive genes, like that which determines whether a rat has black hair (A) or white hair (a), the dominant allele will mask the recessive one. Thus in the example above 75% of the offspring will be black (AA or Aa) while only 25% will be white (aa). The ratio of the phenotypes is 3:1.

The proportion of each genotype in the offspring can be calculated by hand, either by counting the occurrences of that genotype in the Punnett square or by applying the multiplication rule in the tree diagram [31]. Our more efficient computer-based method works as follows. First we use numbers to denote the alleles of the parents: "1" for the allele A and "2" for the allele a. Then the Aa genotype of the female and male parents can be expressed by the following matrices respectively:

$$F_{Aa} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad M_{Aa} = [1, 2]. \tag{2.1}$$

The genotypes and numbers of the four possible combinations of their offspring can be generated by matrix multiplication:

$$F_{Aa} M_{Aa} = \begin{bmatrix} 1 \\ 2 \end{bmatrix} [1, 2] = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}. \tag{2.2}$$

In the product, which can be regarded as a digitized or quantified Punnett square, "1" stands for the genotype AA (as 1 × 1 = 1), "2" for Aa (1 × 2 = 2 × 1 = 2) and "4" for aa (2 × 2 = 4). We do not need to produce the "products" of the different genotypes one by one; instead, the whole offspring genotype table can be obtained at once by the following process:

(i) Let $M$ be a 1 × 6 matrix (or a 1 × 3 block-matrix) representing the three possible genotypes of the male parent:

$$M = \big[\, 1 \;\; 1 \;\big|\; 1 \;\; 2 \;\big|\; 2 \;\; 2 \,\big] \quad (\text{blocks: } AA \mid Aa \mid aa) \tag{2.3}$$

and let $F = M^T$ (the transpose of $M$) be a 3 × 1 block-matrix representing the three genotypes of the female parent.

(ii) Then the block-matrix product $FM$ is a 3 × 3 block-matrix with each block being a 2 × 2 sub-matrix, where

$$FM = \left[\begin{array}{cc|cc|cc} 1 & 1 & 1 & 2 & 2 & 2 \\ 1 & 1 & 1 & 2 & 2 & 2 \\ \hline 1 & 1 & 1 & 2 & 2 & 2 \\ 2 & 2 & 2 & 4 & 4 & 4 \\ \hline 2 & 2 & 2 & 4 & 4 & 4 \\ 2 & 2 & 2 & 4 & 4 & 4 \end{array}\right]. \tag{2.4}$$

Each 2 × 2 sub-matrix then corresponds to the result of a possible mating; for example, the sub-matrix in the middle is the one shown in Eq. (2.2).

(iii) Calculate the proportions of the three different genotypes of offspring for each possible mating (sub-matrix) by

$$(\text{the number of "1"s or "2"s or "4"s})/4. \tag{2.5}$$

Table 1
The Punnett square for the mating Aa × Aa.

|            | Maternal A | Maternal a |
|------------|------------|------------|
| Paternal A | AA         | Aa         |
| Paternal a | Aa         | aa         |

Table 2
The 4 × 4 Punnett square for the mating Aa|Bb × Aa|Bb.

|    | AB        | Ab        | aB        | ab        |
|----|-----------|-----------|-----------|-----------|
| AB | AABB (ss) | AABb (sh) | AaBB (hs) | AaBb (hh) |
| Ab | AABb (sh) | AAbb (sr) | AaBb (hh) | Aabb (hr) |
| aB | AaBB (hs) | AaBb (hh) | aaBB (rs) | aaBb (rh) |
| ab | AaBb (hh) | Aabb (hr) | aaBb (rh) | aabb (rr) |

We use "s" to denote "AA", "h" for "Aa" and "r" for "aa", and $P_{y \times z}$ to denote the "proportion list", i.e. the list (or row vector) of proportions of the offspring genotypes (s, h, r) produced by the cross of a female parent of genotype y with a male parent of genotype z. Then, for the example represented by the sub-matrix in Eq. (2.2),

$$P_{h \times h} = (1/4,\ 2/4,\ 1/4) = (0.25,\ 0.5,\ 0.25). \tag{2.6}$$

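Note that each $P_{y \times z}$ is a 1 × 3 matrix, i.e. a row vector. If each sub-matrix in $FM$ is replaced by its corresponding proportion list, as in Eq. (2.6), we have a block-matrix $P$ where

$$P = \begin{bmatrix} P_{s \times s} & P_{s \times h} & P_{s \times r} \\ P_{h \times s} & P_{h \times h} & P_{h \times r} \\ P_{r \times s} & P_{r \times h} & P_{r \times r} \end{bmatrix} = \begin{bmatrix} (1, 0, 0) & (0.5, 0.5, 0) & (0, 1, 0) \\ (0.5, 0.5, 0) & (0.25, 0.5, 0.25) & (0, 0.5, 0.5) \\ (0, 1, 0) & (0, 0.5, 0.5) & (0, 0, 1) \end{bmatrix}. \tag{2.7}$$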
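The one-locus procedure in steps (i) to (iii) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code from the appendices: the function name `one_locus_table` and the 3 × 3 × 3 array layout are assumptions made here for the example.

```python
import numpy as np

# Allele codes from the text: 1 = A, 2 = a.
# Genotype order (s, h, r) = (AA, Aa, aa); each genotype is a pair of allele codes.
M = np.array([[1, 1, 1, 2, 2, 2]])  # 1 x 6 row vector: AA | Aa | aa (male parent)
F = M.T                             # 6 x 1 column vector (female parent)
FM = F @ M                          # 6 x 6 matrix = 3 x 3 blocks of 2 x 2 Punnett squares


def one_locus_table(FM):
    """Hypothetical helper: P[y, z] is the proportion list for female y x male z."""
    P = np.zeros((3, 3, 3))
    for y in range(3):
        for z in range(3):
            block = FM[2 * y:2 * y + 2, 2 * z:2 * z + 2]
            # The products 1, 2 and 4 encode the offspring genotypes AA, Aa and aa.
            P[y, z] = [np.sum(block == code) / 4 for code in (1, 2, 4)]
    return P


P1 = one_locus_table(FM)
print(P1[1, 1])  # Aa x Aa: proportions 0.25, 0.5, 0.25, as in Eq. (2.7)
```

Storing the proportion lists in a three-dimensional array keeps the block structure of $P$ explicit: the first two indices select the cross and the last index runs over the offspring genotypes s, h, r.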

2.1.2. Two-locus case

The two-locus offspring genotype table can be obtained directly from the one-locus offspring genotype table; fortunately, we do not need to construct the two-locus Punnett squares, as this would be very time-consuming. For example, the 4 × 4 Punnett square shown in Table 2 (from [31] but in our notation) is for the mating Aa|Bb × Aa|Bb (or h|h × h|h; note that "x|y" means the genotype x at the 1st locus and y at the 2nd locus).

Therefore the proportions of the offspring's genotypes from this mating, for the genotypes in the order (ss, hs, rs, sh, hh, rh, sr, hr, rr), are

$$P_{h|h \times h|h} = \tfrac{1}{16}\,(1,\ 2,\ 1,\ 2,\ 4,\ 2,\ 1,\ 2,\ 1), \tag{2.8}$$

where $P_{h|h \times h|h}$ is a 1 × 9 matrix, i.e. a row vector.

Note that if we assume classical dominant/recessive genes, the corresponding phenotype ratios are

$$A{*}B{*} : aaB{*} : A{*}bb : aabb = (ss + sh + hs + hh) : (rs + rh) : (sr + hr) : rr = 9 : 3 : 3 : 1,$$

where the "*" indicates that the corresponding allele could be either allele at that locus (A or a at the 1st locus, B or b at the 2nd). For example, if "* = b" then "aaB*" becomes the "aaBb" or "rh" genotype.

To produce the offspring genotype table for the two-locus case by hand, we would need to make 81 such squares (one for each of the 9 × 9 combinations of parental genotypes) and calculate the proportions from each, which would be very time-consuming.

Now we describe our efficient method to calculate the probabilities of the offspring's genotypes. For the above mating Aa|Bb × Aa|Bb, the 1st-locus cross Aa × Aa (or h × h) gives 1/4 = 25% each of genotypes AA (s) and aa (r) and 50% Aa (h) in the offspring (see the "Aa × Aa" row in Table 3). For each of these 1st-locus outcomes, the 2nd-locus cross Bb × Bb (or h × h) gives 1/4 = 0.25 each of genotypes BB (s) and bb (r) and 2/4 = 0.5 of Bb (h). Hence the proportions of genotypes (in the order shown in Eq. (2.8)) of offspring resulting from the mating Aa|Bb × Aa|Bb (h|h × h|h) can be obtained from the product

$$P_{h \times h}^{T} P_{h \times h} = \begin{bmatrix} 0.25 \\ 0.5 \\ 0.25 \end{bmatrix} (0.25,\ 0.5,\ 0.25) = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}. \tag{2.9}$$

Define $X^{C}$ as the column vector obtained by stacking the columns of a matrix $X$ below one another in their original order, and denote the row vector $(X^{C})^{T}$ simply by $X^{CT}$. Now we have $\left(P_{h \times h}^{T} P_{h \times h}\right)^{CT} = P_{h|h \times h|h}$ (see Eq. (2.8)), which is then placed in the "Aa|Bb × Aa|Bb (or h|h × h|h)" row of the two-locus offspring genotype table (see Appendix B.2). Note that the 1 × 9 row vector $P_{h|h \times h|h}$ is equivalent to the "proportion lists" discussed in the one-locus case. This definition comes from the fact that the order of genotypes in Eq. (2.8) can be obtained as $(Y^{T}Y)^{C}$, where $Y = (s, h, r)$ and the multiplication of letters is defined by combining the two letters in their original order.

The whole offspring genotype table can thus be obtained by the following steps:

(i) Find each of the following products of two sub-matrices:

$$X_{ijkl} = P_{i \times j}^{T} P_{k \times l}, \quad \text{for } i, j, k, l \in \{s, h, r\}. \tag{2.10}$$

In computer code, the two loops for i and j are associated with the mating at the 1st locus and the other two loops (k and l) with the mating at the 2nd locus (see Appendix A.1).

(ii) The corresponding 1 × 9 vector $X_{ijkl}^{CT}$, or "proportion list", forms a row of the offspring genotype table.
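As a hedged illustration of steps (i) and (ii), the sketch below builds all 81 rows from the one-locus table of Eq. (2.7). It is not the code of Appendix A.1; the function name `two_locus_table` and the dictionary keys of the form `"h|h x h|h"` (female genotype first) are conventions assumed here. The column stacking $X^{CT}$ is done with NumPy's column-major (`order="F"`) flattening.

```python
import numpy as np

# One-locus proportion lists from Eq. (2.7); index order s, h, r (AA, Aa, aa).
P1 = np.array([
    [[1, 0, 0],     [0.5, 0.5, 0],     [0, 1, 0]],
    [[0.5, 0.5, 0], [0.25, 0.5, 0.25], [0, 0.5, 0.5]],
    [[0, 1, 0],     [0, 0.5, 0.5],     [0, 0, 1]],
])


def two_locus_table(P1):
    """Hypothetical helper: map 'female x male' labels to 1 x 9 proportion lists."""
    labels = "shr"
    rows = {}
    for i in range(3):              # female genotype at the 1st locus
        for j in range(3):          # male genotype at the 1st locus
            for k in range(3):      # female genotype at the 2nd locus
                for l in range(3):  # male genotype at the 2nd locus
                    X = np.outer(P1[i, j], P1[k, l])  # P_{i x j}^T P_{k x l}
                    key = f"{labels[i]}|{labels[k]} x {labels[j]}|{labels[l]}"
                    # Column stacking gives the order ss, hs, rs, sh, hh, rh, sr, hr, rr.
                    rows[key] = X.flatten(order="F")
    return rows


table = two_locus_table(P1)
print(table["h|h x h|h"] * 16)  # counts out of 16, matching Table 2
```

A dictionary keyed by the cross label is used only to make the rows easy to look up; a 81 × 9 array with an agreed row order would serve equally well.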

2.1.3. Cases with more than two loci

Furthermore, this method can be extended recursively to form offspring genotype tables for the N-locus case where $N \ge 3$. As each locus has 3 possibilities (s, h and r), there are $3^N$ different genotypes for the parents and $3^N \times 3^N = 3^{2N}$ mating combinations. That is, the offspring table has $3^{2N}$ rows and $3^N$ columns. Let

$$\begin{cases} M_1 = M_2 = k, & \text{if } N = 2k, \\ M_1 = k - 1,\ M_2 = k, & \text{if } N = 2k - 1, \end{cases} \quad \text{for } k = 2, 3, \ldots \tag{2.11}$$

If

• I and J are any of the possible $M_1$-locus genotypes, representing the first $M_1$ loci of the female and male genotype respectively,
• K and L are any of the possible $M_2$-locus genotypes, representing the last $M_2$ loci of the female and male genotype respectively,

then the row of the offspring genotype table for the cross between female genotype I|K and male genotype J|L can be obtained from the matrix product $\left( \left(P^{(M_1)}_{I \times J}\right)^{T} P^{(M_2)}_{K \times L} \right)^{CT}$, where $P^{(M_1)}_{I \times J}$ for the cross I × J and $P^{(M_2)}_{K \times L}$ for the cross K × L are row vectors in the $M_1$-locus and $M_2$-locus offspring genotype tables respectively.

For example, the 11-locus offspring table would be derived by multiplying elements of the 5- and 6-locus genotype tables.
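The recursion can be sketched as follows. This sketch makes two simplifying assumptions that are not in the text: it computes a single row on demand rather than storing whole tables, and it encodes a multi-locus genotype as a tuple of indices (0 = s, 1 = h, 2 = r), one per locus. The function name `offspring_row` is hypothetical. The split `m1 = n // 2` reproduces the choice of $M_1$ and $M_2$ in Eq. (2.11).

```python
import numpy as np

# One-locus proportion lists from Eq. (2.7); index order s, h, r.
P1 = np.array([
    [[1, 0, 0],     [0.5, 0.5, 0],     [0, 1, 0]],
    [[0.5, 0.5, 0], [0.25, 0.5, 0.25], [0, 0.5, 0.5]],
    [[0, 1, 0],     [0, 0.5, 0.5],     [0, 0, 1]],
])


def offspring_row(female, male):
    """Hypothetical helper: proportion list for one N-locus cross.

    female, male: tuples of genotype indices (0 = s, 1 = h, 2 = r), one per locus.
    Returns a vector of length 3**N.
    """
    n = len(female)
    if n == 1:
        return P1[female[0], male[0]]
    m1 = n // 2  # M1 = k - 1 if n = 2k - 1, M1 = k if n = 2k; M2 = n - M1
    left = offspring_row(female[:m1], male[:m1])    # row of the M1-locus table
    right = offspring_row(female[m1:], male[m1:])   # row of the M2-locus table
    # (left^T right)^{CT}: outer product followed by column stacking.
    return np.outer(left, right).flatten(order="F")


row = offspring_row((1, 1, 1), (1, 1, 1))  # Aa|Bb|Cc x Aa|Bb|Cc
print(len(row), row.sum())
```

Computing rows on demand avoids holding a table with $3^{2N}$ rows in memory, which becomes impractical quickly as N grows; whether that trade-off suits a given simulation depends on how often each cross is needed.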

2.2. Generalized inverse matrix for fitting the parameters of probit models

2.2.1. Probit models

In statistics, the generalized linear model (GLM) in the form of

$$Y = a + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$

is a flexible generalization of ordinary least squares regression that allows the linear model to be related to the response variable via a link function for Y and the magnitude of the variance of each measurement to be a function of its predicted value [14,18]. GLMs include ordinary linear regression, Poisson regression, logistic regression (with the canonical logit link) and probit regression.

The probit (= "probability unit") link function is the inverse cumulative distribution function (CDF) associated with the standard normal distribution [7,16]. Many problems of quantitative inference in biological and technological research concern the relation between a stimulus (e.g. phosphine fumigation) and a response (e.g. mortality of insects). Bliss [7] used all observations of mortality response to each of a range of exposure times for each of a range of fumigation concentrations, i.e. "all the information in such a family of curves and not just that from a single point on each component". Using this approach, a probit plane

$$Y = a + b_1 \log(t) + b_2 \log(C) \tag{2.12}$$

may be fitted to the data, where t and C are respectively exposure time and concentration, and Y is the probit mortality, which means the probability of mortality P is related to Y by the following CDF expression:

$$P = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{Y-5} \exp\left(-\frac{1}{2}u^2\right) du. \tag{2.13}$$

In the case that the available independent data consist only of the products Ct, rather than C and t separately, the parameters $b_1$ and $b_2$ can be merged into a single parameter, b:

$$Y = a + b \log(Ct). \tag{2.14}$$

Whether common logarithms (base 10) or natural logarithms (ln or $\log_e$, base e) are used in model (2.12) or (2.14) is immaterial, since results obtained using either base are easily converted to the other base: $\log_{10} x = (\log_{10} e) \ln x = 0.43429 \ln x$.
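To make the relation between the probit plane (2.12) and the mortality probability (2.13) concrete, here is a small sketch using only the Python standard library. The function names and the coefficient values are hypothetical and chosen purely for illustration; they are not fitted values from any data set. Base-10 logarithms are assumed.

```python
import math


def probit_plane(a, b1, b2, t, C):
    """Probit mortality Y from Eq. (2.12), using common (base-10) logarithms."""
    return a + b1 * math.log10(t) + b2 * math.log10(C)


def mortality_from_probit(Y):
    """Probability of mortality P from Eq. (2.13): standard normal CDF at Y - 5."""
    return 0.5 * (1.0 + math.erf((Y - 5.0) / math.sqrt(2.0)))


# Hypothetical coefficients and dose, for illustration only (not fitted values).
a, b1, b2 = 1.0, 2.0, 1.5
Y = probit_plane(a, b1, b2, t=48.0, C=0.5)
P = mortality_from_probit(Y)
print(Y, P)
```

A probit of exactly 5 corresponds to 50% mortality, which is a quick sanity check on any implementation of Eq. (2.13).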

It is an implicit assumption in Eq. (2.12) that concentration and time act independently. Alternatively, an extra term $b_3 \log(t) \log(C)$ can be added to describe the interaction of the dosage variables t and C, which may be seen, for example, as a systematic change in the slope of individual regressions of probit mortality on dosage with change in exposure time:

$$Y = a + b_1 \log(t) + b_2 \log(C) + b_3 \log(t) \log(C). \tag{2.15}$$

Bell [3] applied a more conventional model to mortality data:

$$\log t = \log k - n \log C \quad \text{or} \quad \log C = \log a + b \log t, \qquad (a = k^{1/n},\ b = -1/n; \ \text{or}\ k = a^{n},\ n = -1/b). \tag{2.16}$$

This equation yields the familiar Haber-type model:

$$C^{n} t = k \quad \text{or} \quad C = a t^{b}, \tag{2.17}$$

where C is the dosage that, when applied for a time t, achieves a particular specified response level (e.g. 50% or 99% mortality) and n, k, a, and b are parameters that define the specific characteristics of the response relationship.

Note that the integrand function in formula (2.13) is for the standard normal distribution N(0,1). Probits may sometimes be transformed by subtracting 5 from them, i.e. Z = Y − 5, where Z is the normal equivalent deviate or N.E.D. [16].

It should be pointed out that in model (2.12) or (2.14) probit mortality (Y), but not mortality percentage (P), is a linear function of log time (log t) and log concentration (log C) or log(Ct), but not of time and concentration themselves. Similarly, log t is a linear function of log k and n (or log C is a linear function of log a and b) in model (2.16).

2.2.2. The generalized inverse matrix method

A number of approaches can be used to estimate the parameters of mortality models such as those above. Maximum-likelihood estimation remains popular and is the default method in many statistical computing packages. Other approaches, including Bayesian approaches and least squares, have been developed [14,18]. Algebraically, when any one of the above models (2.12), (2.14), (2.15) and (2.16) is fitted to a data set, we have an over-determined system of linear equations with respect to the parameters to be estimated. For example, for the model (2.12), the N equations in 3 unknowns $(a, b_1, b_2)$ corresponding to the data set $\{Y_i, t_i, C_i\}_{i=1}^{N}$ are as follows:

$$Y_i = 1 \cdot a + (\log t_i) \cdot b_1 + (\log C_i) \cdot b_2, \quad (i = 1, 2, \ldots, N). \tag{2.18}$$

The matrix form of the above equations is $Ax = b$ where $x = (a, b_1, b_2)^{T}$,

$$A = \begin{bmatrix} 1 & \log t_1 & \log C_1 \\ 1 & \log t_2 & \log C_2 \\ \vdots & \vdots & \vdots \\ 1 & \log t_N & \log C_N \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_N \end{bmatrix}. \tag{2.19}$$

Then the maximum-likelihood method maximizes their joint log-likelihood function provided that the expected value $E[A^{T}A]$ exists and is not singular [14,18].

The method of least squares is often used to generate estimators and other statistics in regression analysis [33]. If a solution minimizes

$$\sum_{i=1}^{N} (\tilde{Y}_i - Y_i)^2 = \sum_{i=1}^{N} \left([a + b_1 \log(t_i) + b_2 \log(C_i)] - Y_i\right)^2, \tag{2.20}$$

where $\tilde{Y}_i = a + b_1 \log(t_i) + b_2 \log(C_i)$ is the ith predicted value, then the solution is called a least squares solution [17]. Normally, the least squares method can be applied by solving the normal (regularized) equations of $Ax = b$, namely $A^{T}Ax = A^{T}b$, provided that $A^{T}A$ is non-singular. In fact, if $A^{+}$ is the generalized inverse (or Moore–Penrose pseudo-inverse) of matrix A, then $A^{+}b$ is such a solution [4,23,26]. Note that if A is a non-singular square matrix then $A^{+} = A^{-1}$. If A has full column rank, then $A^{T}A$ is non-singular and $A^{+} = (A^{T}A)^{-1}A^{T}$. Although this equation could theoretically be used to calculate $A^{+}$, it is of limited practical use numerically, because using QR decomposition or singular value decomposition (SVD) to obtain $A^{+}$ gives much smaller numerical errors than direct calculation of $(A^{T}A)^{-1}A^{T}$ [20,27].
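A minimal sketch of the generalized inverse approach for model (2.12), using NumPy's `numpy.linalg.pinv` (which computes the Moore–Penrose pseudo-inverse via SVD). The exposure times, concentrations, "true" parameters and noise level below are synthetic values invented for this example; they are not data from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (hypothetical) exposure times and concentrations.
t = np.array([24.0, 48.0, 72.0, 96.0, 120.0, 144.0])
C = np.array([0.1, 0.2, 0.1, 0.4, 0.2, 0.4])

# Coefficient matrix A of Eq. (2.19): columns 1, log t, log C.
A = np.column_stack([np.ones_like(t), np.log10(t), np.log10(C)])

# Synthetic probit mortalities generated from assumed parameters plus small noise.
true_x = np.array([2.0, 1.5, 1.0])  # assumed (a, b1, b2), for illustration only
b = A @ true_x + rng.normal(0.0, 0.05, size=t.size)

# Least-squares estimate x = A^+ b: the single key command.
x = np.linalg.pinv(A) @ b
print(x)
```

The estimate can be cross-checked against `numpy.linalg.lstsq(A, b, rcond=None)`, which solves the same least-squares problem; for a rank-deficient A both return the minimum-norm least-squares solution.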

2.3. Selection of initial frequencies and calculation of the equilibrium frequencies of genotypes

2.3.1. Selection of initial frequencies of genotypes without special constraints

Generating random initial frequencies of genotypes without special conditions is simple. In general, if $p_i$ denotes the frequency of genotype i, then the following constraints apply:

$$\text{(i)}\ 0 \le p_i \le 1 \quad \text{and} \quad \text{(ii)}\ \sum_{i=1}^{k} p_i = 1 \quad (k = \text{total number of genotypes};\ i = 1, 2, \ldots, k). \tag{2.21}$$

We can randomly generate k uniformly distributed numbers between zero and one to satisfy the 1st constraint, calculate the sum of these k values, and then divide each value by this sum. This ensures that the 2nd constraint is satisfied while the 1st constraint is maintained.
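The normalization just described can be sketched with the standard library alone. The function name `random_frequencies` is hypothetical, and k = 9 is used only because it matches the number of two-locus genotypes.

```python
import random


def random_frequencies(k, rng=random):
    """Hypothetical helper: k random genotype frequencies satisfying (2.21)."""
    x = [rng.random() for _ in range(k)]  # uniform draws in [0, 1)
    total = sum(x)                        # positive unless every draw is exactly 0
    return [xi / total for xi in x]


freqs = random_frequencies(9)  # e.g. the nine two-locus genotypes
print(freqs, sum(freqs))
```

Dividing by the sum can only shrink or rescale non-negative values so that none exceeds the total, which is why constraint (i) still holds after normalization.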

2.3.2. Selection of initial frequencies of genotypes with somepreselected values

In some cases, we may want some initial frequencies to havespecial values. For example, when we want to simulate the impactof the initial proportion of resistant rr beetles (prr) on the evolutionof phosphine resistance, we may want to double or triple the valueof prr that was previously used, while maintaining the 2nd con-straint. Now suppose m(<k) proportions: p1,p2, . . . ,pm have been

preselected and their sum Sm ¼Pm

i¼1pi < 1. Firstly, we randomly

generate m � k uniformly distributed numbers x1,x2, . . .,xk�m, and

calculate the sum of those values Sk�m ¼Pk�m

i¼1 xi. Secondly, each

of xi, for i = 1, . . . ,k �m is divided by S where S = Sk�m/(1 � Sm).Thus those k �m values together with the preselected m valueswill satisfy the second condition in constraints (2.21) since

$$\sum_{i=1}^{k-m} \frac{x_i}{S} + \sum_{i=1}^{m} p_i = \frac{S_{k-m}}{S} + S_m = (1 - S_m) + S_m = 1. \tag{2.22}$$
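The rescaling in Eq. (2.22) can be illustrated with the following minimal sketch. It is not the Appendix A.2 function: the name `frequencies_with_preselected` and the dictionary interface are assumptions made here for compactness, while the preselected values and their positions are those used in Appendix C.1.

```python
import random


def frequencies_with_preselected(k, preselected, rng=None):
    """Return k frequencies summing to 1, keeping some values fixed.

    preselected maps a genotype index to its fixed frequency
    (hypothetical interface used only for this sketch).
    """
    rng = rng or random.Random()
    s_m = sum(preselected.values())            # S_m
    if not 0.0 <= s_m < 1.0:
        raise ValueError("preselected frequencies must sum to less than 1")
    free = [i for i in range(k) if i not in preselected]
    x = [rng.random() for _ in free]           # the k - m random draws
    scale = sum(x) / (1.0 - s_m)               # S = S_{k-m} / (1 - S_m)
    freqs = [0.0] * k
    for i, value in preselected.items():
        freqs[i] = value
    for i, xi in zip(free, x):
        freqs[i] = xi / scale                  # free values sum to 1 - S_m
    return freqs


# Fix genotypes 0, 2 and 7 of a nine-genotype model, as in Appendix C.1.
freqs = frequencies_with_preselected(
    9, {0: 0.1248, 2: 0.0, 7: 0.4352}, random.Random(2)
)
```

Because each free value is at most $1 - S_m$ after rescaling, constraint (i) of (2.21) is also respected.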

More generally, if constraints for the set of frequencies form a linear system of equations and/or inequalities (note that the 1st condition in formula (2.21) is a set of inequalities and the 2nd one is an equation), we can use any technique (e.g. [5,25,28]) for finding a feasible solution of a linear programming problem to find a possible set of frequencies.

2.3.3. Calculation of the Hardy–Weinberg equilibrium genotype proportions for two-locus case using allelic proportion matrix

Now we describe how to calculate the Hardy–Weinberg equilibrium genotype proportions that should result from any particular initial genotype proportions. For the one-locus case, let alleles A and a be in proportions $p_1$ and $q_1$ (= 1 − $p_1$) respectively. Then, because half the alleles in genotype Aa (or h) are "A", $p_1 = p_s + 0.5p_h$, where $p_x$ is the initial frequency of genotype x ∈ {s, h, r}. According to the Hardy–Weinberg principle, with neutral selection pressure over time the frequency of the three genotypes s, h and r within the population will tend towards the equilibrium proportions $p_1^2 : 2p_1q_1 : q_1^2$ [29]. Similarly, for a two-locus model, suppose alleles A and a on the 1st locus are in proportions p and q respectively and alleles B and b on the 2nd locus are in proportions u and v respectively. Let $p_{xy}$ be the initial frequency of genotype xy, with x, y ∈ {s, h, r}. Then

$$\begin{aligned} p &= p_{ss} + p_{sh} + p_{sr} + 0.5(p_{hs} + p_{hh} + p_{hr}), & q &= 1 - p,\\ u &= p_{ss} + p_{hs} + p_{rs} + 0.5(p_{sh} + p_{hh} + p_{rh}), & v &= 1 - u. \end{aligned} \tag{2.23}$$

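The equilibrium proportions for this two-locus case, $P_{E2}$, can be obtained using the matrix product rather than element-wise calculation, by letting

$$A = \begin{bmatrix} p^2 \\ 2pq \\ q^2 \end{bmatrix} \begin{matrix} s \\ h \\ r \end{matrix}, \qquad B = \underset{\begin{matrix} s & \ \ h & \ \ r \end{matrix}}{\begin{bmatrix} u^2 & 2uv & v^2 \end{bmatrix}}. \tag{2.24}$$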

Then the nine Hardy–Weinberg equilibrium genotype proportions can be obtained from the product AB:

$$P = AB = \begin{bmatrix} p^2u^2 & 2p^2uv & p^2v^2 \\ 2pqu^2 & 4pquv & 2pqv^2 \\ q^2u^2 & 2q^2uv & q^2v^2 \end{bmatrix}, \tag{2.25}$$

where the rows correspond to s, h, r at the 1st locus and the columns to s, h, r at the 2nd locus.
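Equations (2.23) to (2.25) can be traced with the short standard-library sketch below. It is an illustration rather than the Appendix A.3 function; the name `hardy_weinberg_two_locus` is an assumption, and the input vector is the one used in Example 3.3, in the genotype order ss, hs, rs, sh, hh, rh, sr, hr, rr.

```python
def hardy_weinberg_two_locus(initial):
    """Allelic proportions and equilibrium genotype proportions (two loci).

    initial: nine frequencies ordered ss, hs, rs, sh, hh, rh, sr, hr, rr,
    where the first letter refers to the 1st locus.
    """
    ss, hs, rs, sh, hh, rh, sr, hr, rr = initial
    p = ss + sh + sr + 0.5 * (hs + hh + hr)    # allele A, Eq. (2.23)
    u = ss + hs + rs + 0.5 * (sh + hh + rh)    # allele B, Eq. (2.23)
    q, v = 1.0 - p, 1.0 - u
    A = [p * p, 2.0 * p * q, q * q]            # column vector of Eq. (2.24)
    B = [u * u, 2.0 * u * v, v * v]            # row vector of Eq. (2.24)
    P = [[a * b for b in B] for a in A]        # outer product, Eq. (2.25)
    # Read P column by column to recover the order ss, hs, rs, sh, ...
    equilibrium = [P[i][j] for j in range(3) for i in range(3)]
    return p, q, u, v, equilibrium


initial = [0.2040, 0.1203, 0.0875, 0.1064, 0.0690, 0.0894, 0.0467, 0.1197, 0.1570]
p, q, u, v, equilibrium = hardy_weinberg_two_locus(initial)
```

Forming A and B once and then taking their outer product is what avoids recomputing $p^2$, $2pq$, $u^2$ and so on for each of the nine entries.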

2.4. Bisection method to estimate the intrinsic rate of natural increase of an insect population

In any study of the biology of insect pests, one of the first questions is: How fast can the insect population multiply? The intrinsic rate of natural increase, r, is defined as the rate of increase per head under specified physical conditions, in an unlimited environment [1,6]. This rate plays a key role in fields as diverse as ecology, genetics, demography and evolution.

Given the age-specific survival rates ($l_x$) and the age-specific fecundity rates ($m_x$) at age x, an approximation of the value of r may be calculated from the Lotka equation [15]:

$$\sum_{x} e^{-rx} l_x m_x = 1. \tag{2.26}$$

In his approximation of the value of r, Birch [6] neglected the contribution of the older age groups (similarly, in our Example 3.4 in Section 3.4, the summation in expression (2.26) is not carried beyond the age group centered at x = 13.5). He then substituted a number of trial values of r into expression (2.26), using 4-figure tables for calculating $e^{-rx}$, to find the value of r that would make the summation approximately equal to 1. Carey [10] determined the r value by using a procedure based on Newton's method, $r_{n+1} = r_n - f(r_n)/f'(r_n)$. Maia et al. [21] used a jackknife technique and Meyer et al. [22] compared jackknife and bootstrap techniques for estimating r. The simplex method has also been used to obtain a numerical solution for r [30]. Here we describe the use of an alternative and very simple iterative approach, the bisection method, to find the value of r.

2.4.1. Bisection method

The bisection method can be found in most textbooks of numerical analysis (e.g. [9]). Some authors [11,24,32,34] mentioned the name "bisection" in the context of estimating the development rate, but their papers did not describe the details, which biologists may be interested in.

The bisection method is a root-finding algorithm, where 'root' means the value of x at which a function f(x) is equal to zero. The method repeatedly bisects an interval and then selects the subinterval in which a root must lie for further iteration. If f(x) is a continuous function on the interval [a, b], then the bisection method is very simple and guaranteed to converge to a root of f, x*, provided f(a) and f(b) have different signs, i.e. f(a)f(b) < 0. The iteration is terminated when the length of the current interval is less than 2ε, and the midpoint of that interval is then chosen as the estimate of x*. Hence the error of the estimate of x* is less than the desired accuracy ε.

To find r using our bisection method, we first define a continuous function with respect to r:

$$g(r) = \sum_{x} e^{-rx} l_x m_x - 1. \tag{2.27}$$

Then the problem is reduced to finding the root of g(r), i.e. finding the value of r that makes g(r) = 0, and the bisection method can be applied in the normal way.
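A minimal sketch of this procedure is given below, using the data of Table 6 truncated at x = 13.5 as described above. The function names `g` and `bisect_root`, and the choice to return the iteration count alongside the estimate, are assumptions made for this illustration rather than the authors' code.

```python
import math

# Pivotal ages (weeks), survival rates l_x and fecundity rates m_x from
# Table 6, truncated at x = 13.5 as in the text.
AGES = [4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5]
LX = [0.87, 0.83, 0.81, 0.80, 0.79, 0.77, 0.74, 0.66, 0.59, 0.52]
MX = [20.0, 23.0, 15.0, 12.5, 12.5, 14.0, 12.5, 14.5, 11.0, 9.5]


def g(r):
    """Left-hand side of the Lotka equation minus one, Eq. (2.27)."""
    return sum(math.exp(-r * x) * l * m for x, l, m in zip(AGES, LX, MX)) - 1.0


def bisect_root(f, a, b, eps):
    """Bisect [a, b] until its length is below 2*eps; return (midpoint, iterations)."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0.0:
        raise ValueError("f(a) and f(b) must have different signs")
    iterations = 0
    while (b - a) >= 2.0 * eps:
        mid = 0.5 * (a + b)
        fmid = f(mid)
        if fa * fmid < 0.0:
            b = mid                 # sign change lies in the left half
        else:
            a, fa = mid, fmid       # sign change lies in the right half
        iterations += 1
    return 0.5 * (a + b), iterations


r_coarse, n_coarse = bisect_root(g, 0.0, 1.0, 0.01)      # two-decimal accuracy
r_fine, n_fine = bisect_root(g, 0.0, 1.0, 0.0001)        # four-decimal accuracy
r_narrow, n_narrow = bisect_root(g, 0.6, 1.0, 0.01)      # narrower start
```

Each pass halves the interval, so the number of iterations depends only on the initial length and the requested accuracy, not on the shape of g(r).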

3. Results

3.1. The offspring genotype table

3.1.1. For the one- and two-locus cases

The Python function and main program for creating the offspring genotype tables for the one- and two-locus models are given in Appendices A.1 and B.1 respectively. The table for the two-locus model is given in Appendix B.2. Creating this large table with 729 entries takes less than 0.01 s using the functions in the appendices. The resulting offspring genotype table for the one-locus model is shown in Table 3.

3.1.2. An example for the three-locus case

For N = 3, in which case k = 2 (see formula (2.11)), the 3 × 9 matrix form of the offspring's genotype proportions resulting from the mating Aa|BB|Cc × AA|bb|Cc (or h|s|h × s|r|h) can be obtained by calculating the product $P_{h \times s}^{T} P_{s|h \times r|h}$, where $P_{h \times s} = (0.5, 0.5, 0)$ from Table 3 and $P_{s|h \times r|h} = (0, 0.25, 0, 0, 0.5, 0, 0, 0.25, 0)$ from the "s|h × r|h or AABb × aaBb" row of Table B.2 (two-locus). This matrix is then converted into a 1 × 27 vector. The order of the 27 genotypes for the three-locus case can be obtained by $(Y^{T}Z)^{CT}$, where Y = (s, h, r) and Z is the 1 × 9 "vector" shown in Eq. (2.8), which is

$$(Y^{T}Z)^{CT} = (sss, hss, rss, shs, hhs, rhs, srs, hrs, rrs, ssh, hsh, rsh, shh, hhh, rhh, srh, hrh, rrh, ssr, hsr, rsr, shr, hhr, rhr, srr, hrr, rrr). \tag{3.1}$$

Hence the six non-zero proportions of $(P_{h \times s}^{T} P_{s|h \times r|h})^{CT}$ are:

$$\begin{array}{cccccc} shs & hhs & shh & hhh & shr & hhr \\ 0.125 & 0.125 & 0.25 & 0.25 & 0.125 & 0.125 \end{array} \tag{3.2}$$
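The three-locus calculation above can be reproduced with a small standard-library sketch. The helper name `outer_columnwise` and the way the genotype labels are generated are assumptions made for this illustration; the two input vectors are the ones quoted from Table 3 and Table B.2.

```python
STATES = ("s", "h", "r")


def outer_columnwise(first, rest):
    """Outer product first^T * rest, read column by column into one list."""
    return [a * b for b in rest for a in first]


# Genotype orders: the first letter varies fastest, as in Eq. (3.1).
two_locus_order = [x + y for y in STATES for x in STATES]
three_locus_order = [x + z for z in two_locus_order for x in STATES]

p_first = [0.5, 0.5, 0.0]                                   # h x s, Table 3
p_rest = [0.0, 0.25, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0]    # s|h x r|h, Table B.2

proportions = outer_columnwise(p_first, p_rest)             # 27 values
nonzero = {
    genotype: value
    for genotype, value in zip(three_locus_order, proportions)
    if value > 0.0
}
```

Only 27 multiplications are needed here, and the same pattern extends to more loci by reusing the table already built for fewer loci.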

This result can be confirmed by using the multiplication rule in a tree diagram. For example, for hhs (and similarly for other genotypes):

1st locus cross (Aa × AA): AA (s) [0.5], Aa (h) [0.5]
2nd locus cross (BB × bb): Bb (h) [1.0]
3rd locus cross (Cc × Cc): CC (s) [0.25], Cc (h) [0.5], cc (r) [0.25]
Branch Aa (h) → Bb (h) → CC (s): 0.5(1.0)(0.25) = 0.125 (hhs)

Table 4
Results of phosphine dose–response trials for the QRD14 strain of R. dominica showing the phosphine dose used for 48 h exposure, the number of insects used, the number of insects dying and the aggregate response mortality rate for each trial.

3.2. Examples of using the generalized inverse matrix method to fit parameters of probit models

3.2.1. MATLAB and Python codes

In MATLAB, the commands Y = norminv(P) + 5 and P = normcdf(Y − 5) can be used to convert between the probit value Y (see Section 2.2.1) and the mortality P (expressed as a proportion). In Python, we can use the similar commands "Y = norm.ppf(P) + 5" and "P = norm.cdf(Y − 5)" with "from scipy.stats import *". The probit value Y can then be converted to and from the N.E.D. value Z by subtracting or adding five.
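For readers without SciPy, the same conversion can be sketched with the standard library's `statistics.NormalDist`, whose `inv_cdf` and `cdf` methods play the role of `norm.ppf` and `norm.cdf`. The function names below are assumptions made for this illustration.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def mortality_to_probit(p):
    """Probit value Y for a mortality proportion p with 0 < p < 1."""
    return _STANDARD_NORMAL.inv_cdf(p) + 5.0


def probit_to_mortality(y):
    """Mortality proportion corresponding to probit value y."""
    return _STANDARD_NORMAL.cdf(y - 5.0)


y_half = mortality_to_probit(0.5)                          # probit of 50% mortality
round_trip = probit_to_mortality(mortality_to_probit(0.32))
```

`inv_cdf` rejects p = 0 and p = 1, which is the same limitation that makes it necessary to replace observed rates of exactly 0 or 1 by values slightly inside the interval before fitting.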

Table 3
The offspring genotype table for the one-locus model showing the expected proportions of each possible offspring genotype resulting from each possible parental genotype mating combination.

| Female parent | Male parent | Offspring AA (s) | Offspring Aa (h) | Offspring aa (r) |
|---|---|---|---|---|
| AA (s) × | AA (s) | 1.0 | 0 | 0 |
| | Aa (h) | 0.5 | 0.5 | 0 |
| | aa (r) | 0 | 1.0 | 0 |
| Aa (h) × | AA (s) | 0.5 | 0.5 | 0 |
| | Aa (h) | 0.25 | 0.5 | 0.25 |
| | aa (r) | 0 | 0.5 | 0.5 |
| aa (r) × | AA (s) | 0 | 1.0 | 0 |
| | Aa (h) | 0 | 0.5 | 0.5 |
| | aa (r) | 0 | 0 | 1.0 |


In either MATLAB or Python, the command for finding the generalized inverse of matrix A is pinv(A).

3.2.2. Examples

Example 3.1 (Collins, 2010, unpublished data). The dose of phosphine (PH3) and response of strain QRD14 of the insect R. dominica to phosphine fumigation are listed in the first three columns of Table 4. Here t is a constant (48 h) and "dose" (mg/l) means "concentration".

For the purpose of analysis, the 15 observations should be divided into 5 groups (so N = 5), each having 3 observations with the same dose. The response rate (or mortality), listed in the last column of Table 4, is the aggregated rate of the 3 observations; for example, for the dose 0.0010 the aggregated rate is (2 + 1 + 0)/(49 + 50 + 50) = 0.0201. Note that if the response rate is p = 100% (e.g. for the dose 0.0040 in Table 4) then we change it from 1 to 0.9999; otherwise the corresponding probit value is infinite and cannot be used to fit the parameters. Similarly, we should change p = 0 (with examples in the other two data sets) to something very close to zero, such as p = 0.0001; otherwise the corresponding probit value will be negative infinity.

The probit model for this example is shown in formula (2.24) and the coefficient matrix of the corresponding over-determined linear system, A, has 5 rows and only 2 columns, with the elements in the 1st column being all "1" (see Eq. (2.19)). The fitted parameters (obtained using $A^{+}$) are a = 15.0324 and b = 9.2291 respectively. Note that the fitted parameters using Maximum Likelihood (ML) are a* = 14.2743 and b* = 8.5963 respectively.

Fig. 1(a) and (b) show the probit lines (probit values against log(dose)) and the mortality (%) curves (against dose) obtained using the two methods. It can be seen that the two probit lines and mortality curves for QRD14 are close to each other (and similarly for the other two strains). However, comparing the least squares (LS) errors shows that our method has a smaller numerical error in the sense of formula (2.20): 0.2214 compared with 0.3850. For the data sets of strain QRD569 and Comb F1, the numerical errors for our method are 3.1589 and 0.3034 respectively, also smaller than the values of 4.7249 and 0.8035 (obtained using ML) respectively.
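The fit in Example 3.1 can be sketched as follows with NumPy's `numpy.linalg.pinv`. Two points are assumptions of this sketch rather than statements from the text: the regressor is taken as $\log_{10}(tC)$ with t = 48 h, and the aggregated counts are recomputed from the raw observations of Table 4 instead of using the rounded rates. Under those assumptions the result should be close to the a and b reported above.

```python
from statistics import NormalDist

import numpy as np

EXPOSURE_HOURS = 48.0                                  # constant t in Example 3.1
doses = np.array([0.0010, 0.0015, 0.0020, 0.0030, 0.0040])   # mg/l
responded = np.array([2 + 1 + 0, 20 + 13 + 15, 38 + 34 + 33, 47 + 49 + 50, 50 + 48 + 50])
used = np.array([49 + 50 + 50, 50 + 49 + 51, 50 + 50 + 49, 50 + 50 + 50, 50 + 48 + 50])

# Keep rates strictly inside (0, 1) so that the probit stays finite.
rates = np.clip(responded / used, 0.0001, 0.9999)
probits = np.array([NormalDist().inv_cdf(float(p)) + 5.0 for p in rates])

# Over-determined system A x = b with a column of ones and one regressor.
A = np.column_stack([np.ones(len(doses)), np.log10(EXPOSURE_HOURS * doses)])
intercept, slope = np.linalg.pinv(A) @ probits         # least squares via A+

# Cross-check against NumPy's dedicated least squares solver.
check, *_ = np.linalg.lstsq(A, probits, rcond=None)
```

`pinv` is computed from a singular value decomposition, so the explicit product $A^{T}A$ is never formed.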

Table 4 (data)

| Dose (C) | No. used | No. response | Response rate |
|---|---|---|---|
| 0.0010 | 49 | 2 | 0.0201 |
| 0.0010 | 50 | 1 | |
| 0.0010 | 50 | 0 | |
| 0.0015 | 50 | 20 | 0.3200 |
| 0.0015 | 49 | 13 | |
| 0.0015 | 51 | 15 | |
| 0.0020 | 50 | 38 | 0.7047 |
| 0.0020 | 50 | 34 | |
| 0.0020 | 49 | 33 | |
| 0.0030 | 50 | 47 | 0.9733 |
| 0.0030 | 50 | 49 | |
| 0.0030 | 50 | 50 | |
| 0.0040 | 50 | 50 | 0.9999 |
| 0.0040 | 48 | 48 | |
| 0.0040 | 50 | 50 | |

Fig. 1. The probit lines (a) and percentage mortality (b) curves obtained using the least squares (LS) and maximum likelihood (ML) methods with the observed values for the QRD14 strain.

Example 3.2. Daglish [13] observed a range of concentrations in combination with exposure times of 20, 48, 72 and 144 h, to determine the time–concentration combinations required to achieve mortality rates of 50% and 99% for strains QRD14 (Susceptible-S), QRD369 (Resistant-R) and QRD369 × QRD14 (Hybrid-H). These concentrations are denoted LC50 and LC99 (lethal concentration values for 50% and 99% mortality). The observed data and prediction equations (Haber-type model (2.17)) associated with LC50 are listed in Table 5 (see Table 1 in [13]).

Table 5
Concentration (LC50 & LC99) values (mg/l) required to achieve 50% and 99% mortality for different exposure times (t) for three strains of R. dominica, together with the model fitted by Daglish [13] for the LC50 data.

| Strain | Mortality (%) | 20 h | 48 h | 72 h | 144 h | Model (2.17) |
|---|---|---|---|---|---|---|
| QRD14 | 50 | 0.0052 | 0.0017 | 0.0011 | 0.00064 | $C^{0.8673}t = 0.2088$ |
| QRD14 | 99 | 0.0091 | 0.0037 | 0.0021 | 0.0014 | |
| QRD369 | 50 | 0.20 | 0.052 | 0.032 | 0.017 | $C^{0.8673}t = 4.0908$ |
| QRD369 | 99 | 0.40 | 0.091 | 0.060 | 0.028 | |
| QRD369 × QRD14 | 50 | 0.010 | 0.0042 | 0.0023 | 0.0011 | $C^{0.8673}t = 0.3863$ |
| QRD369 × QRD14 | 99 | 0.026 | 0.013 | 0.0066 | 0.0025 | |

Note that the fitted value for the index n in model (2.17) obtained by Daglish [13] was different for the LC50 data and the LC99 data, which means it is not possible to develop a Haber-type rule with which to successfully extrapolate predicted mortalities between exposure scenarios [8].

For the purposes of predicting the mortalities at different concentrations and different exposure times, we employ the probit model (2.15) to refit all the data of Daglish [13], including both the LC50 and LC99 data sets together, to fit the four parameters for each of the three strains (C: mg/l, t: h). Note that the four t values should be repeated twice in the 2nd column of coefficient matrix A (see Eq. (2.19)). The fitted equations are as follows (the logarithmic base is 10):

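$$\begin{aligned} \text{QRD369}: \quad Y &= -10.8398 + 16.1356\log(t) + 1.9145\log(C_R) + 4.0846\log(t)\log(C_R),\\ \text{QRD14}: \quad Y &= 3.9749 + 12.3267\log(t) + 3.8700\log(C_S) + 1.9247\log(t)\log(C_S),\\ 369 \times 14: \quad Y &= 11.2847 + 3.7764\log(t) + 6.9650\log(C_H) - 1.0105\log(t)\log(C_H). \end{aligned} \tag{3.3}$$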
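As a sketch of how the fitted equations (3.3) might be used for prediction, the snippet below evaluates the probit value for a given strain, exposure time and concentration and converts it to a mortality proportion. The dictionary keys and the function name `predicted_mortality` are assumptions made for this illustration; the coefficients are copied from Eq. (3.3).

```python
import math
from statistics import NormalDist

# (a, b1, b2, b3) for Y = a + b1*log(t) + b2*log(C) + b3*log(t)*log(C), base 10
COEFFICIENTS = {
    "QRD369": (-10.8398, 16.1356, 1.9145, 4.0846),
    "QRD14": (3.9749, 12.3267, 3.8700, 1.9247),
    "QRD369xQRD14": (11.2847, 3.7764, 6.9650, -1.0105),
}


def predicted_mortality(strain, t_hours, conc_mg_per_l):
    """Mortality proportion predicted by the fitted probit surface."""
    a, b1, b2, b3 = COEFFICIENTS[strain]
    log_t = math.log10(t_hours)
    log_c = math.log10(conc_mg_per_l)
    probit = a + b1 * log_t + b2 * log_c + b3 * log_t * log_c
    return NormalDist().cdf(probit - 5.0)


# Table 5 lists 0.0017 mg/l (LC50) and 0.0037 mg/l (LC99) for QRD14 at 48 h.
m50 = predicted_mortality("QRD14", 48.0, 0.0017)
m99 = predicted_mortality("QRD14", 48.0, 0.0037)
```

Evaluating the surface at the tabulated LC50 and LC99 concentrations gives a quick sanity check that the refitted model is consistent with the data it was fitted to.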

3.3. Calculation of equilibrium proportions

The Python function and main program for selecting uniformly distributed random numbers that sum to 1, with or without some preselected numbers, are given in Appendices A.2 and C.1 respectively. The Python function and main program for calculation of equilibrium frequencies of genotypes are given in Appendices A.3 and C.2 respectively.

Example 3.3. If the initial frequencies of genotypes $P_{I2}$ for the two-locus case are

$$P_{I2} = \begin{array}{ccccccccc} ss & hs & rs & sh & hh & rh & sr & hr & rr \\ (0.2040, & 0.1203, & 0.0875, & 0.1064, & 0.0690, & 0.0894, & 0.0467, & 0.1197, & 0.1570) \end{array} \tag{3.4}$$

then p = 0.5116, q = 0.4884, u = 0.5442, v = 0.4558, according to formula (2.23), and

$$P_{E2} = \begin{array}{ccccccccc} ss & hs & rs & sh & hh & rh & sr & hr & rr \\ (0.0775, & 0.1480, & 0.0706, & 0.1298, & 0.2479, & 0.1183, & 0.0544, & 0.1038, & 0.0496) \end{array} \tag{3.5}$$

according to formula (2.25), after converting the matrix form to a row vector form (see Eqs. (2.9) and (2.8)).

Note that if we choose the above proportions $P_{E2}$ in Eq. (3.5) as the initial ones, then the equilibrium proportions are $P_{E2}$ themselves. However, many other sets of initial frequencies could also result in the same set of equilibrium proportions; $P_{E2}$ is a special solution.

3.4. Bisection method for finding the development rate r

Example 3.4. The pivotal age in weeks (x), age-specific survival rates ($l_x$) and age-specific fecundity rates ($m_x$) are shown in Table 6 (Table 2 in [6]).

The results for this example obtained using our bisection algorithm for three tests (different initial intervals and desired accuracies) are listed in Table 7. The iteration process for Test 3 is shown in Fig. 2. Given the initial interval [0.6, 1.0] the iteration steps are as follows: the interval resulting from the previous iteration is bisected, the function value at the two ends of the interval is calculated, and then the half subinterval for which the function values at the two ends have different signs is kept as the new interval. This process continues until the desired accuracy is achieved.

Table 6
Raw data: age x (weeks), age-specific survival rate $l_x$ per week and age-specific fecundity rate $m_x$ per week.

| x | 4.5 | 5.5 | 6.5 | 7.5 | 8.5 | 9.5 | 10.5 | 11.5 | 12.5 | 13.5 | 14.5 | 15.5 | 16.5 | 17.5 | 18.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $l_x$ | 0.87 | 0.83 | 0.81 | 0.8 | 0.79 | 0.77 | 0.74 | 0.66 | 0.59 | 0.52 | 0.45 | 0.36 | 0.29 | 0.25 | 0.19 |
| $m_x$ | 20.0 | 23.0 | 15.0 | 12.5 | 12.5 | 14.0 | 12.5 | 14.5 | 11.0 | 9.5 | 2.5 | 2.5 | 2.5 | 4.0 | 1.0 |

Table 7
The estimate of rate r and accuracy for three initial intervals.

| Test | Initial interval [a,b] | Iteration number | Estimate of r | g(r) | Accuracy |
|---|---|---|---|---|---|
| 1 | [0,1] | 6 | 0.76 | 0.02 | <0.01 |
| 2 | [0,1] | 13 | 0.7620 | −0.0002 | <0.0001 |
| 3 | [0.6,1.0] | 5 | 0.76 | 0.03 | <0.01 |

Fig. 2. Illustration of the iteration process for Test 3, showing the initial interval [0.6,1.0], the function g(r) curve, together with the function values at the left and right endpoints and the iterated intervals after each of the 5 iterations.

4. Discussion

Our quantification and block-matrix multiplication approach to generating the offspring genotype tables involves many fewer operations than classical methods such as that based on Punnett squares (PS). For each mating in any N-locus case, the PS method requires three processes: constructing the PS, counting the number of each genotype of progeny, and calculating their proportions. In the one-locus case, our method requires multiplying two matrices with the "quantified" elements and the same counting and calculating processes. It can be seen from Table 1 and (2.2) that both have the same number of operations (four) if we regard defining the genotype of one cell in Table 1 as one operation (although this operation is more complex than multiplication of two numbers). For each mating in the N-locus case (N ≥ 2), constructing such a PS requires $2^N \times 2^N = 2^{2N}$ operations, as there are N loci each having 2 alleles for both the female and the male parent. In addition, there are $2^{2N}$ counts and $2^{2N}$ divisions for calculating the proportions. If our method is used, we only need to find the product of two matrices, which requires $2^N$ multiplications (if N = 2k, then $2^k \times 2^k = 2^{2k} = 2^N$; if N = 2k − 1, then $2^k \times 2^{k-1} = 2^{2k-1} = 2^N$). This is because our method calculates the N-locus case recursively using the results obtained for k or k − 1 loci (N = 2k − 1 or 2k). It could be argued that in some matings, e.g. AA|BB × Aa|bb for the two-locus case, the progeny have only 2 different genotypes, AABb and AaBb, and so the PS has only 4 cells. However, the other cells appearing in Table 1 correspond to zero elements in this case, and our algorithm would not perform any multiplications by zeros in the computer codes.

Theoretically the two methods (maximum likelihood and least squares) to fit the probit models should obtain the same parameters. Numerically, however, they yield small differences in results. The generalized inverse matrix approach described here provides an efficient method for fitting probit models. The advantages of using this approach are:

(1) It provides a more numerically accurate estimate of parameters.

(2) Even if A does not have full column rank, so that the coefficient matrix of the normal equations, $A^{T}A$, is singular, there still exists a matrix $A^{+}$ for which the linear system has a solution $A^{+}b$ with minimum $L_2$ norm:

$$\|A^{+}b\|_2 = \min\{\|x\|_2 : A^{T}Ax = A^{T}b\}.$$

The generalized inverse matrix with the least squares technique can also be used to fit the parameters of a model when the model, or a modified form of it, is a linear function of the parameters or of a metameter of the parameters. For example, it can be used to fit the parameters a and b in Freundlich adsorption isotherm models [2] defined by the equation $Y = ae^{bt}$, where t is time of exposure (h) and Y is the ratio ($C/C_0$) of concentration C at time t to the applied concentration $C_0$ at time t = 0. This follows since the equivalent log-transformed model is ln(Y) = ln(a) + bt, where ln(Y) is a linear function of the parameters ln(a) and b.
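The linearization described above can be illustrated with the following sketch, which uses NumPy's `numpy.linalg.pinv`. The exposure times and the "true" parameter values are hypothetical numbers invented for this illustration (noise-free data generated from the model itself), not measurements from [2].

```python
import numpy as np

# Hypothetical exposure times (h) and parameters, used only to generate data.
t = np.array([0.0, 6.0, 12.0, 24.0, 48.0, 72.0])
a_true, b_true = 0.95, -0.02
Y = a_true * np.exp(b_true * t)            # concentration ratio C/C0

# ln(Y) = ln(a) + b*t is linear in the parameters ln(a) and b.
A = np.column_stack([np.ones_like(t), t])
ln_a_hat, b_hat = np.linalg.pinv(A) @ np.log(Y)
a_hat = float(np.exp(ln_a_hat))            # back-transform the metameter
```

With real, noisy measurements the recovered values would only approximate the underlying parameters, and the fit minimizes the squared error in ln(Y) rather than in Y.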

Our algorithm for randomly generating initial genotype frequencies based on matrix products is very simple and efficient. For finding the Hardy–Weinberg equilibrium genotype proportions, using matrix products requires 23 multiplications: forming the elements of matrix A or B in Eq. (2.24) requires 7 multiplications each, and calculating the product AB in Eq. (2.25) requires 9 multiplications. On the other hand, element-wise calculation of the nine elements of matrix P in Eq. (2.25) requires 41 multiplications: calculating the elements in the 4 corners requires 16 multiplications (4 for each element) and the other 5 positions require 25 multiplications (5 for each). The matrix product method we propose is thus efficient and avoids repeated calculations.

The advantage of the bisection method we propose for determining the intrinsic rate of increase is that it is also very simple, and normally only a few iterations are needed to find the development rate, as the desired accuracy is normally two decimal places.

As stated previously, accurately and efficiently determining parameter values of key sub-models within biological simulation models, such as models simulating population dynamics and evolution of resistance in stored-grain insect pests, is a critical issue [12]. We conclude that the methods presented in this paper provide a toolkit for estimating a number of important parameter values for such resistance simulation models, which will allow these models to be used to predict the effects of different possible management strategies.

Acknowledgements

The authors would like to acknowledge the support of the Australian Government's Cooperative Research Centres Program. We also thank P.J. Collins for his great help in the genetics and provision of raw data.

Appendix A. Three Python functions

A.1. Python function for creation of offspring genotype table

from numpy import *    # If the three functions are separated the left
from pylab import *    # five commands should be put in the beginning of
from random import *   # the file for each of other two functions
from math import *
from scipy import *

def GenTable(nL):
    """Create offspring genotypes table for one- or two-locus cases by set nL = 1 or nL = 2"""
    # nL: number of locus, nL = 1 or 2
    # 1 -> A, 2 -> a:
    F = matrix([1, 1, 1, 2, 2, 2])   # genotypes of Female parent
    M = F                            # genotypes of Male parent
    # 1 -> AA, 2 -> Aa, 4 -> aa in the product FM
    FM = F.T * M                     # F.T: Transpose of F
    Table1 = matrix(zeros((9, 3), float))   # The table for one-locus
    for i in range(3):               # Count the numbers and
        for j in range(3):           # calculate the proportions
            N1 = 0.                  # for calculation of numbers 1, 2, or 4
            N2 = 0.                  # in each sub-matrix
            N4 = 0.
            X = (FM[2*i, 2*j], FM[2*i+1, 2*j], FM[2*i, 2*j+1], FM[2*i+1, 2*j+1])
            YS = choose(greater(X, 1), (X, 0))      # keep the entries equal to 1 (AA)
            k = 3*i + j
            Table1[k, 0] = sum(YS)/4.0
            YHR = choose(equal(X, 1), (X, 0))       # keep the entries equal to 2 or 4
            YH = choose(greater(YHR, 2), (YHR, 0))  # keep the entries equal to 2 (Aa)
            Table1[k, 1] = sum(YH)/8.0
            YR = choose(equal(YHR, 2), (YHR, 0))    # keep the entries equal to 4 (aa)
            Table1[k, 2] = sum(YR)/16.0
    if nL == 1:
        return Table1
    if nL == 2:
        n = 0
        Table2 = zeros((9*9, 9), float)
        for f2 in range(3):                  # 2nd gene of Female
            for f1 in range(3):              # 1st gene of Female
                for m2 in range(3):          # 2nd gene of Male
                    for m1 in range(3):      # 1st gene of Male
                        k1 = 3*f1 + m1       # The index in one-locus table
                                             # for 1st cross
                        k2 = 3*f2 + m2       # The index in one-locus table
                                             # for 2nd cross
                        C1 = Table1[k1, :]
                        C2 = Table1[k2, :]
                        C1C2 = (C1).T * C2
                        FM2 = ((C1C2[:, 0]).T, (C1C2[:, 1]).T, (C1C2[:, 2]).T)
                        Table2[n, ] = reshape(FM2, (1, 9))
                        n = n + 1
        return Table2

A.2. Python function for selection of uniformly distributed random proportions

def UniformRandom(Str, Pm, IDXm):
    """Creating K random (uniformly distributed) numbers with sum = 1.0 with or without m preselected ones for
    creating initial proportions of genotypes or life stages or others"""
    # Str: A vector of strings indicating the random variable
    # Pm: A 1xm row vector of preselected uniformly distributed
    #     numbers with sum Sm < 1.0. If m = 0, input Pm as []
    # IDXm: Indices of Pm in the returned array Pk.
    #     If m = 0, input IDXm as []
    from random import random   # make sure random() is the standard-library function
    K = len(Str)   # Number of the random digits
    m = len(Pm)    # Number of preselected uniformly distributed digits
    print('\n m=', m, 'IDXm:', IDXm, '\n Pm:', Pm)
    Pk = zeros(K, float)
    x = zeros(K-m, float)
    Sm = sum(Pm)   # Sum of the preselected random numbers
    if Sm > 1.0:
        print(Sm, ' !!! The sum of preselected numbers > 1.0')
        Pk = []
        return Pk
    for i in range(K-m):
        x[i] = random()
    Sk_m = sum(x)
    print('\n x:', x, '\n sum=', Sk_m)
    S = Sk_m/(1.-Sm)
    Pk_m = x/S
    print('\n Pk_m:', Pk_m)
    if m == 0:
        Pk = Pk_m
    else:
        Pk[IDXm] = Pm
        IDXk_m = delete(range(K), IDXm)   # indices not preselected
        Pk[IDXk_m] = Pk_m
    return Pk

A.3. Python function for calculation of equilibrium frequencies of genotypes

def EqiFre_pquv(nL, IniF):
    """Given initial frequency of genotypes calculate allelic proportions p & q / u & v and the frequencies in
    equilibrium for 1- or 2-locus model"""
    # nL: number of locus
    # IniF: 1 x 3 or 1 x 9 list - initial frequency of genotypes
    if nL == 2:
        x1 = take(IniF, [0, 3, 6])
        x2 = take(IniF, [1, 4, 7])
        x3 = take(IniF, [0, 1, 2])
        x4 = take(IniF, [3, 4, 5])
        p = sum(x1) + sum(x2)/2.   # proportion of allele 'A'
        u = sum(x3) + sum(x4)/2.   # proportion of allele 'B'
        q = 1.-p
        v = 1.-u
        A = matrix([[p**2], [2.0*p*q], [q**2]])
        B = matrix([u**2, 2.0*u*v, v**2])
        P = (A*B).T
        EP2 = reshape(P.flat, (1, 9))[0]
        return (p, q, u, v, EP2)
    if nL == 1:
        p = IniF[0] + 0.5*IniF[1]
        q = IniF[2] + 0.5*IniF[1]
        EP1 = [p**2, 2.*p*q, q**2]
        return (p, q, EP1)


Appendix B. Offspring genotype table

B.1. Python code for creating offspring genotype table for 1- and 2-locus models

def PrintTable(GenTypes, Table):
    (row, col) = shape(Table)
    for i in range(row):
        k = i // col        # index of the female genotype
        j = i - k*col       # index of the male genotype
        if i == k*col:
            print(GenTypes[k], 'X', GenTypes[j], Table[i])
        else:
            print(' ', GenTypes[j], Table[i])
    return

print('\n\n### Offspring Genotypes Table for one-locus ###\n\n')
T1 = GenTable(1)
GenType1 = ['AA (s)', 'Aa (h)', 'aa (r)']
PrintTable(GenType1, T1)

T2 = GenTable(2)
GenotypesA = ['AABB', 'AaBB', 'aaBB', 'AABb', 'AaBb', 'aaBb', 'AAbb', 'Aabb', 'aabb']
Genotypes = ['ss[0]', 'hs[1]', 'rs[2]', 'sh[3]', 'hh[4]', 'rh[5]', 'sr[6]', 'hr[7]', 'rr[8]']
print('\n\n### Offspring Genotypes Table for two-locus ###\n\n')
print(GenotypesA, '\n', Genotypes, '\n')
PrintTable(GenotypesA, T2)

B.2. Offspring genotype table (two loci)

Female parent Male parent AABB AaBB aaBB AABb AaBb aaBb AAbb Aabb aabb

AABB� AABB 1. 0. 0. 0. 0. 0. 0. 0. 0.AaBB 0.5 0.5 0. 0. 0. 0. 0. 0. 0.aaBB 0. 1. 0. 0. 0. 0. 0. 0. 0.AABb 0.5 0. 0. 0.5 0. 0. 0. 0. 0.AaBb 0.25 0.25 0. 0.25 0.25 0. 0. 0. 0.aaBb 0. 0.5 0. 0. 0.5 0. 0. 0. 0.AAbb 0. 0. 0. 1. 0. 0. 0. 0. 0.Aabb 0. 0. 0. 0.5 0.5 0. 0. 0. 0.aabb 0. 0. 0. 0. 1. 0. 0. 0. 0.

AaBB� AABB 0.5 0.5 0. 0. 0. 0. 0. 0. 0.AaBB 0.25 0.5 0.25 0. 0. 0. 0. 0. 0.aaBB 0. 0.5 0.5 0. 0. 0. 0. 0. 0.AABb 0.25 0.25 0. 0.25 0.25 0. 0. 0. 0.AaBb 0.125 0.25 0.125 0.125 0.25 0.125 0. 0. 0.aaBb 0. 0.25 0.25 0. 0.25 0.25 0. 0. 0.AAbb 0. 0. 0. 0.5 0.5 0. 0. 0. 0.Aabb 0. 0. 0. 0.25 0.5 0.25 0. 0. 0.aabb 0. 0. 0. 0. 0.5 0.5 0. 0. 0.

aaBB� AABB 0. 1. 0. 0. 0. 0. 0. 0. 0.AaBB 0. 0.5 0.5 0. 0. 0. 0. 0. 0.aaBB 0. 0. 1. 0. 0. 0. 0. 0. 0.AABb 0. 0.5 0. 0. 0.5 0. 0. 0. 0.AaBb 0. 0.25 0.25 0. 0.25 0.25 0. 0. 0.aaBb 0. 0. 0.5 0. 0. 0.5 0. 0. 0.AAbb 0. 0. 0. 0. 1. 0. 0. 0. 0.Aabb 0. 0. 0. 0. 0.5 0.5 0. 0. 0.aabb 0. 0. 0. 0. 0. 1. 0. 0. 0.

AABb� AABB 0.5 0. 0. 0.5 0. 0. 0. 0. 0.AaBB 0.25 0.25 0. 0.25 0.25 0. 0. 0. 0.aaBB 0. 0.5 0. 0. 0.5 0. 0. 0. 0.AABb 0.25 0. 0. 0.5 0. 0. 0.25 0. 0.

(continued on next page)

M. Shi, M. Renton / Mathematical Biosciences xxx (2011) xxx–xxx 11

Please cite this article in press as: M. Shi, M. Renton, Numerical algorithms for estimation and calculation of parameters in modeling pest populationdynamics and evolution of resistance, Math. Biosci. (2011), doi:10.1016/j.mbs.2011.06.005

Page 12: Numerical algorithms for estimation and calculation of parameters in

AaBb 0.125 0.125 0. 0.25 0.25 0. 0.125 0.125 0.aaBb 0. 0.25 0. 0. 0.5 0. 0. 0.25 0.AAbb 0. 0. 0. 0.5 0. 0. 0.5 0. 0.Aabb 0. 0. 0. 0.25 0.25 0. 0.25 0.25 0.aabb 0. 0. 0. 0. 0.5 0. 0. 0.5 0.

AaBb� AABB 0.25 0.25 0. 0.25 0.25 0. 0. 0. 0.AaBB 0.125 0.25 0.125 0.125 0.25 0.125 0. 0. 0.aaBB 0. 0.25 0.25 0. 0.25 0.25 0. 0. 0.AABb 0.125 0.125 0. 0.25 0.25 0. 0.125 0.125 0.AaBb 0.0625 0.125 0.0625 0.125 0.25 0.125 0.0625 0.125 0.0625aaBb 0. 0.125 0.125 0. 0.25 0.25 0. 0.125 0.125AAbb 0. 0. 0. 0.25 0.25 0. 0.25 0.25 0.Aabb 0. 0. 0. 0.125 0.25 0.125 0.125 0.25 0.125aabb 0. 0. 0. 0. 0.25 0.25 0. 0.25 0.25

aaBb� AABB 0. 0.5 0. 0. 0.5 0. 0. 0. 0.AaBB 0. 0.25 0.25 0. 0.25 0.25 0. 0. 0.aaBB 0. 0. 0.5 0. 0. 0.5 0. 0. 0.AABb 0. 0.25 0. 0. 0.5 0. 0. 0.25 0.AaBb 0. 0.125 0.125 0. 0.25 0.25 0. 0.125 0.125aaBb 0. 0. 0.25 0. 0. 0.5 0. 0. 0.25AAbb 0. 0. 0. 0. 0.5 0. 0. 0.5 0.Aabb 0. 0. 0. 0. 0.25 0.25 0. 0.25 0.25aabb 0. 0. 0. 0. 0. 0.5 0. 0. 0.5

AAbb� AABB 0. 0. 0. 1. 0. 0. 0. 0. 0.AaBB 0. 0. 0. 0.5 0.5 0. 0. 0. 0.aaBB 0. 0. 0. 0. 1. 0. 0. 0. 0.AABb 0. 0. 0. 0.5 0. 0. 0.5 0. 0.AaBb 0. 0. 0. 0.25 0.25 0. 0.25 0.25 0.aaBb 0. 0. 0. 0. 0.5 0. 0. 0.5 0.AAbb 0. 0. 0. 0. 0. 0. 1. 0. 0.Aabb 0. 0. 0. 0. 0. 0. 0.5 0.5 0.aabb 0. 0. 0. 0. 0. 0. 0. 1. 0.

Aabb� AABB 0. 0. 0. 0.5 0.5 0. 0. 0. 0.AaBB 0. 0. 0. 0.25 0.5 0.25 0. 0. 0.aaBB 0. 0. 0. 0. 0.5 0.5 0. 0. 0.AABb 0. 0. 0. 0.25 0.25 0. 0.25 0.25 0.AaBb 0. 0. 0. 0.125 0.25 0.125 0.125 0.25 0.125aaBb 0. 0. 0. 0. 0.25 0.25 0. 0.25 0.25AAbb 0. 0. 0. 0. 0. 0. 0.5 0.5 0.Aabb 0. 0. 0. 0. 0. 0. 0.25 0.5 0.25aabb 0. 0. 0. 0. 0. 0. 0. 0.5 0.5

aabb� AABB 0. 0. 0. 0. 1. 0. 0. 0. 0.AaBB 0. 0. 0. 0. 0.5 0.5 0. 0. 0.aaBB 0. 0. 0. 0. 0. 1. 0. 0. 0.AABb 0. 0. 0. 0. 0.5 0. 0. 0.5 0.AaBb 0. 0. 0. 0. 0.25 0.25 0. 0.25 0.25aaBb 0. 0. 0. 0. 0. 0.5 0. 0. 0.5AAbb 0. 0. 0. 0. 0. 0. 0. 1. 0.Aabb 0. 0. 0. 0. 0. 0. 0. 0.5 0.5aabb 0. 0. 0. 0. 0. 0. 0. 0. 1.


Appendix C. Python code for selection of uniformly distributed random proportions and calculation of equilibrium frequencies of genotypes

C.1. Python code for selection of uniformly distributed random proportions

# IniFrq = UniformRandom(Genotypes, [], [])   # without preselected numbers
# Pm = [0.1248, 0.0, 0.9000]                  # Test if the sum > 1.0
Pm = [0.1248, 0.0, 0.4352]
IDXm = [0, 2, 7]
IniFrq = UniformRandom(Genotypes, Pm, IDXm)
print('\n### Initial frequencies for two-locus ###\n\n')
print(Genotypes, '\n', array(IniFrq))
print('Check sum = 1?', sum(IniFrq))

C.2. Python code for calculation of equilibrium frequencies of genotypes

IniF = [0.2040, 0.1203, 0.0875, 0.1064, 0.0690, 0.0894, 0.0467, 0.1197, 0.1570]
(p, q, u, v, EP2) = EqiFre_pquv(2, IniF)
print('\n\n### The allelic proportions p,q,u,v:\n')
print('p=', p, 'q=', q, 'u=', u, 'v=', v)
print('\n### The equilibrium frequencies of genotypes:\n', Genotypes)
print('\n', EP2, '\n Check sum = 1?', sum(EP2))


References

[1] H.G. Andrewartha, L.G. Birch, Selections from the Distribution and Abundance of Animals, The University of Chicago, 1982.

[2] H.J. Banks, Behaviour of gases in grain storages, in: Fumigation and Controlled Atmosphere Storage of Grain, Proceedings of an International Conference Held at Singapore, 1989, pp. 96–107.

[3] C.H. Bell, The efficiency of phosphine against diapausing larvae of Ephestia elutella (Lepidoptera) over a wide range of concentrations and exposure times, J. Stored Prod. Res. 15 (1979) 53.

[4] A. Ben-Israel, T.N.E. Greville, Generalized Inverses: Theory and Applications, second ed., Springer, New York, NY, 2003.

[5] M.J. Best, K. Ritter, Linear Programming: Active Set Analysis and Computer Programs, Prentice-Hall, Englewood Cliffs, New Jersey, 1985.

[6] C. Birch, The intrinsic rate of increase of an insect population, J. Animal Ecol. 17 (1948) 15.

[7] C.I. Bliss, The relation between exposure time, concentration and toxicity in experiments on insecticides, Ann. Entomol. Soc. Am. 33 (1940) 721.

[8] N.J. Bunce, R.B. Remillard, Haber's rule: the search for quantitative relationships in toxicology, Human Ecol. Risk Assess. 9 (4) (2003) 973.

[9] R.L. Burden, J.D. Faires, Numerical Analysis, eighth ed., Thomson, Belmont, CA, 2005.

[10] J.R. Carey, Applied Demography for Biologists, Oxford University Press, 1993.

[11] H. Chi, H. Su, Age-stage, two-sex life tables of Aphidius gifuensis (Ashmead) (Hymenoptera: Braconidae) and its host Myzus persicae (Sulzer) (Homoptera: Aphididae) with mathematical proof of the relationship between female fecundity and the net reproductive rate, Environ. Entomol. 35 (1) (2006) 10.

[12] P.J. Collins, G.J. Daglish, M. Bengston, T.M. Lambkin, H. Pavic, Genetics of resistance to phosphine in Rhyzopertha dominica (Coleoptera: Bostrichidae), J. Econ. Entomol. 95 (2002) 862.

[13] G.J. Daglish, Effect of exposure period on degree of dominance of phosphine resistance in adults of Rhyzopertha dominica (Coleoptera: Bostrychidae) and Sitophilus oryzae (Coleoptera: Curculionidae), Pest Manage. Sci. 60 (8) (2004) 822.

[14] A.J. Dobson, A.G. Barnett, Introduction to Generalized Linear Models, third ed., Chapman and Hall/CRC, Boca Raton, FL, 2008.

[15] L.I. Dublin, A.J. Lotka, On the true rate of natural increase, J. Am. Stat. Assoc. 20 (1925) 305.

[16] D.J. Finney, Probit Analysis, third ed., Cambridge University, 1971.

[17] S. Gilbert, Introduction to Linear Algebra, third ed., Wellesley-Cambridge Press,

Wellesley, Massachusetts, 2003.

Please cite this article in press as: M. Shi, M. Renton, Numerical algorithms fdynamics and evolution of resistance, Math. Biosci. (2011), doi:10.1016/j.mbs.

[18] J. Hardin, J. Hilbe, Generalized Linear Models and Extensions, second ed., StataPress, College Station, 2007.

[19] P.W. Hedrick, Genetics of Population, third ed., Jones and Bartlett Publishers,2005.

[20] S. Josef, B. Roland, Introduction to Numerical Analysis, third ed., Springer-Verlag, Berlin, New York, 2002.

[21] A De H.N. Maia, A.J.B. Luiz, C. Ampanhola, Statistical inference on associatedfertility life table parameters using jackknife technique: computationalaspects, J. Econ. Entomol. 93 (2) (2000) 511.

[22] J.S. Meyer, C.G. Ingersoll, L.L. McDonald, S. Marks Boyce, Estimatinguncertainty in population growth rates: jackknife vs. bootstrap techniques,Ecology 67 (5) (1986) 1156.

[23] E.H. Moore, On the reciprocal of the general algebraic matrix, Bull. Am. Math.Soc. 26 (1920) 394.

[24] L.D. Mueller, M.R. Rose, Evolution Evolutionary theory predicts late-lifemortality plateaus, Proc. Natl. Acad. Sci. USA 93 (1996) 15249.

[25] M.R. Osborne, Finite Algorithms in Optimization and Data Analysis, Wiley,Chichester, 1985.

[26] R. Penrose, A generalized inverse for matrices, Proc. Cambridge Philos. Soc. 51(1955) 406.

[27] M. Shi, L. Gao, Z. Chen, Z. Yang, Matrix Computation in Engineering: Theories,Algorithms and FORTRAN Programs, BUT Publishing House, 1990.

[28] M. Shi, M.A. Lukas, An L1 estimation algorithm with degeneracy and linearconstraints, Comp. Statist. Data Anal. 39 (2002) 35.

[29] W.D. Stansfield, Theory and Problems of Genetics, second ed., McGraw-HillBook Company, Sturtevant, 1983.

[30] A. Taberner, P. Castaera, E. Silvestre, J. Dopazo, Estimation of the intrinsic rateof natural increase and its error by both algebraic and resampling approaches,Bioinformatics 9 (5) (1993) 535.

[31] R.H. Tamarin, Principles of Genetics, fourth ed., Boston University, Wm. C.Brown Publisher, 1989.

[32] M.F.J. Taylor, Field measurement of the dependence of life history on plantnitrogen and temperature for a herbivorous moth, J. Animal Ecol. 57 (1988)873.

[33] J. Wolberg, Data Analysis Using the Method of Least Squares: Extracting theMost Information from Experiments, Springer, 2005.

[34] T. Yang, H. Chi, Life tables and development of Bemisia argentifolii(Homoptera: Aleyrodidae) at different temperatures, J. Econ. Entomol. 99 (3)(2006) 691.

or estimation and calculation of parameters in modeling pest population2011.06.005