Hindawi Publishing Corporation
ISRN Biomathematics
Volume 2013, Article ID 954912, 19 pages
http://dx.doi.org/10.1155/2013/954912
Research Article
New Cancer Stochastic Models Involving Both Hereditary and Nonhereditary Cancer Cases: A New Approach
Wai-Yuan Tan¹ and Hong Zhou²
¹ Department of Mathematical Sciences, The University of Memphis, Memphis, TN 38152, USA
² Department of Mathematics and Statistics, Arkansas State University, State University, AR 72467, USA
Correspondence should be addressed to Wai-Yuan Tan; waitan@memphis.edu
Received 24 August 2012; Accepted 10 October 2012
Academic Editors: T. LaFramboise, K. M. Page, I. Rogozin, and J. M. Starobin
Copyright © 2013 W.-Y. Tan and H. Zhou. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
To incorporate biologically observed epidemics into multistage models of carcinogenesis, in this paper we have developed new stochastic models for human cancers. We have further incorporated genetic segregation of cancer genes into these models to derive generalized mixture models for cancer incidence. Based on these models, we have developed a generalized Bayesian approach to estimate the parameters and to predict cancer incidence via Gibbs sampling procedures. We have applied these models to fit and analyze the SEER data of human eye cancers from NCI/NIH. Our results indicate that the models not only provide a logical avenue to incorporate biological information but also fit the data much better than other models. These models would not only provide more insights into human cancers but also provide useful guidance for their prevention and control and for the prediction of future cancer cases.
1. Introduction
It is universally recognized that each cancer tumor develops through stochastic proliferation and differentiation from a single stem cell which has sustained a series of irreversible genetic and/or epigenetic changes (Little [1], Tan [2, 3], Tan et al. [4, 5], Weinberg [6], Zheng [7]). That is, carcinogenesis is a stochastic multistage process with intermediate cells subject to stochastic proliferation and differentiation. Furthermore, the number of stages and the number of pathways of the carcinogenesis process are significantly influenced by environmental factors underlying the individuals (Tan et al. [4, 5], Weinberg [6]).
Another important observation in human carcinogenesis is that most human cancers cluster around family members. Further, many cancer incidence data (such as the SEER data of NCI/NIH/USA) have documented that some cancers develop during pregnancy before birth, so that newborn babies have cancer at birth. These are referred to as pediatric cancers. Well-known examples of pediatric cancers include retinoblastoma (a pediatric eye cancer), hepatoblastoma (a pediatric liver cancer), Wilms' tumor (a pediatric kidney cancer), and medulloblastoma (a pediatric brain tumor). Epidemiological and clinical studies in oncology have also revealed that inherited cancers are very common in many adult human cancers, including lung cancer, colon cancer [8], uveal melanomas (adult eye cancer [9]), and adult liver cancer (HCC [10]).
Given the above results from cancer biology and human cancer epidemiology, the objective of this paper is to illustrate how to develop stochastic models of carcinogenesis incorporating these biological and epidemiological observations. Based on these models and cancer incidence data, we will then proceed to develop efficient statistical procedures to estimate unknown parameters in the model, to validate the model, and to predict cancer incidence.
In Section 2, we illustrate how to incorporate segregation of cancer genes in multistage stochastic models of carcinogenesis to account for inherited cancer cases. In Section 3, we develop stochastic equations for the state variables of the model described in Section 2. By using these stochastic equations, we derive probability distributions of the state variables (i.e., the number of intermediate cancer cells) and the probability distribution of time to detectable cancer tumors. In Section 4, assuming that we have some cancer incidence data, such as the SEER data from NCI/NIH, we proceed to develop statistical models for these data from these multistage models of carcinogenesis. In Section 5, by combining the models in Sections 2-4, we proceed to develop a generalized Bayesian inference and Gibbs sampling procedures to estimate the unknown parameters, to validate the model, and to predict cancer incidence. As an example of application, in Section 6 we proceed to develop a multistage model of human eye cancer with inherited cancer cases, as described in Figure 2. We illustrate the model and methods by analyzing the SEER data of human eye cancer from NCI/NIH. Finally, in Section 7, we discuss the usefulness of the model and the methods developed in this paper and point out some future research directions.
Figure 1: Histopathology lesions and genetic pathway of squamous cell carcinoma of non-small cell lung cancer (NSCLC). (Diagram not reproduced. Stages shown: normal epithelium, hyperplasia/metaplasia, dysplasia, carcinoma in situ, invasive cancer, squamous cell carcinoma (NSCLC). Event labels shown: 3p LOH (RASSF1, BLU, Robo1/Dutt1, FHIT, etc.); 9p21 LOH (p16, p14); 17p13 (p53) LOH or p53 mutation; 8p LOH; telomerase expression (prevention of telomere erosion); disruption of cell cycle checkpoint and resistance to apoptosis; K-Ras mutation (activation of growth signals); VEGF expression; COX-2 expression.)
Figure 2: A multistage model of uveal melanoma (adult human eye cancer). (Diagram not reproduced. Labels shown: GNAQ, GNA11, CDK4; BRCA2 or P16; BCL2, HDM2, or others; PTEN, BAP1 or Rb1 or DDEF1; 3p loss; Chr 3 loss; cell cycle progression; cell survival; cancer progression; metastasis gain; tumor.)
2. The Stochastic Multistage Model of Carcinogenesis with Inherited Cancer Cases
The $k$-stage multistage model of carcinogenesis views carcinogenesis as the end point of $k$ ($k \ge 2$) discrete, heritable, and irreversible events (mutations, genetic changes, or epigenetic changes), with intermediate cells subjected to stochastic proliferation and differentiation (Little [1], Tan [2, 3], Tan et al. [4, 5], Weinberg [6]). Let $N = J_0$ denote normal stem cells, $T$ the cancer tumors, and $J_i$ the $i$th-stage initiated cells arising from the $(i-1)$th-stage initiated cells ($i = 1, \ldots, k$) by some genetic and/or epigenetic changes. Then the model assumes $N \to J_1 \to J_2 \to \cdots \to J_k \to \text{Tumor}$, with the $J_i$ cells subject to stochastic proliferation (birth) and differentiation (death). Further, it assumes that each stem cell proceeds independently of other cells and that cancer tumors develop from primary $J_k$ cells by clonal expansion (stochastic birth and death), where primary $J_k$ cells are $J_k$ cells which arise directly from $J_{k-1}$ cells; see Yang and Chen [11].
For example, Figure 1 is a multistage pathway for squamous NSCLC (non-small cell lung cancer) as proposed by Osada and Takahashi [12] and Wistuba et al. [13]. Similarly, Figure 2 is the multistage model for uveal melanoma proposed by Landreville et al. [14] and Mensink et al. [15], while Figure 3 is the APC-$\beta$-catenin-Tcf pathway for human colon cancer (Tan et al. [8], Tan and Yan [16]).
Remark 1. To develop stochastic multistage models of carcinogenesis, in the literature (Little [1], Tan [2], Zheng [7]) it is conveniently assumed that the $J_k$ cells grow instantaneously into cancer tumors as soon as they are generated. In this case, the number of tumors is equal to the number of $J_k$ cells, and one may identify $J_k$ cells as tumors. It follows that the number of tumors is a Markov process and that the $J_k$ cells are transient cells. In these cases, one needs only to deal with $T(t)$ and $J_j$ cells with $j = 1, \ldots, k-1$. However, as shown by Yang and Chen [11], the number of tumors is much smaller than the total number of $J_k$ cells. Also, in many animal models and in cancer risk assessment of radiation, Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19] have shown that $T(t)$ is in general not Markov.
Figure 3: The APC-$\beta$-catenin-Tcf-myc pathway for human colon cancer. (a) Sporadic (about 70-75%). (b) FAP (familial adenomatous polyps) (about 1%). (Diagram not reproduced. Labels shown: N, APC and second copy of APC, dysplastic ACF, Ras, Smad4/DCC and second copy of Smad4/DCC, P53, carcinomas.)
To extend the above model to include hereditary cancers, observe that mutants of cancer genes exist in the population and that both germline cells (egg and sperm) and somatic cells may carry mutant alleles of cancer genes [2, 20]. Further, without exception, every human being develops from the embryo in his/her mother's womb (embryo stage; denote time by 0), where stem cells of different organs divide and differentiate to develop the different organs (see Weinberg [6], Chapter 10). If both the egg and the sperm generating the embryo carry mutant alleles of relevant cancer genes, then the individual is a $J_2$-stage person at the embryo stage; if only one of the germline cells (egg or sperm) generating the embryo carries mutant alleles of cancer genes, then at the embryo stage the individual is a $J_1$-stage person. Similarly, the individual is a normal person ($N = J_0$ person) at the embryo stage if neither the egg nor the sperm generating the embryo carries mutant alleles of cancer genes. Refer to a person in the population as a $J_i$ ($i = 0, 1, 2$) person if he/she is a $J_i$-stage person at the embryo stage. Then, with respect to the cancer development in question, people in the population can be classified into 3 types: normal people ($N = J_0$ people), $J_1$ people, and $J_2$ people. Based on this classification, for normal people in the population the stochastic model of carcinogenesis is a $k$-stage multievent model given by $J_0 \to J_1 \to \cdots \to J_k \to \text{Tumor}$; for $J_1$ people in the population the stochastic model of carcinogenesis is a $(k-1)$-stage multievent model given by $J_1 \to J_2 \to \cdots \to J_k \to \text{Tumor}$; and for $J_2$ people in the population the stochastic model of carcinogenesis is a $(k-2)$-stage multievent model given by $J_2 \to \cdots \to J_k \to \text{Tumor}$.
To account for inherited cancer cases, let $p_1$ be the proportion of $J_1$ people in the population and $p_2$ the proportion of $J_2$ people in the population. In large human populations under steady-state conditions, one may practically assume that the $p_i$ are constants independent of time (Crow and Kimura [21]). Then $p_0 = 1 - p_1 - p_2$ ($0 < p_1 + p_2 < 1$) is the proportion of normal people (i.e., $N = J_0$ people) in the population. Let $n$ be the population size and $n_i$ ($i = 0, 1, 2$) the number of $J_i$ people in the population, so that $\sum_{u=0}^{2} n_u = n$. Assume that $n$ is very large and that marriage between people in the population is random with respect to cancer genes; then, as shown in Crow and Kimura [21] (see also Tan [22], Chapter 2), the conditional probability distribution of $(n_1, n_2)$ given $n$ is 2-dimensional multinomial with parameters $(n; p_1, p_2)$. That is,
$$
(n_1, n_2) \mid n \sim \text{Multinomial}(n; p_1, p_2). \tag{1}
$$
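As an illustration of (1), the following minimal sketch draws the genotype counts $(n_0, n_1, n_2)$ by classifying each of $n$ people independently. The function name `sample_genotype_counts` and the numerical values of $n$, $p_1$, and $p_2$ are hypothetical choices made for illustration only; they are not estimates from the paper.

```python
import random

def sample_genotype_counts(n, p1, p2, seed=0):
    """Draw (n0, n1, n2) such that (n1, n2) | n ~ Multinomial(n; p1, p2).

    Each person is independently a J_1 person with probability p1,
    a J_2 person with probability p2, and a normal (J_0) person otherwise.
    """
    if not (0 < p1 + p2 < 1):
        raise ValueError("need 0 < p1 + p2 < 1")
    rng = random.Random(seed)
    counts = [0, 0, 0]  # index = genotype J_0, J_1, J_2
    for _ in range(n):
        x = rng.random()
        if x < p1:
            counts[1] += 1
        elif x < p1 + p2:
            counts[2] += 1
        else:
            counts[0] += 1
    return tuple(counts)

# Hypothetical proportions, chosen only to make the example concrete.
n0, n1, n2 = sample_genotype_counts(n=100_000, p1=0.01, p2=0.0001)
```

The three counts always sum to $n$, and for large $n$ the ratio $n_1/n$ should be close to $p_1$.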
To derive the probability distribution of time to cancer under the above model, observe that during pregnancy the proliferation rates of all stem cells are quite high. Thus, with positive probability, $J_2$ people in the population may acquire additional genetic and/or epigenetic changes during pregnancy to become $J_3$-stage people at birth. Similarly, $J_1$ people may acquire genetic and/or epigenetic changes during pregnancy to become $J_2$ people at birth; albeit with very small probability, normal people at the embryo stage may acquire some genetic and/or epigenetic changes during pregnancy to become $J_1$ people at birth. Because the probability of genetic and epigenetic changes is small, one may practically assume that a $J_i$ ($i = 0, 1, 2$) person at the embryo stage would only give rise to $J_i$ stem cells and possibly $J_{i+1}$ stem cells at birth. This is equivalent to assuming that $J_i$ people at the embryo stage would not generate $J_{i+j}$ ($j > 1$) stem cells at or before birth. This model is represented schematically in Figure 4. Notice that if $k = 2$, one may practically assume that, with probability one, a $J_2$ person at the embryo stage would develop cancer at or before birth ($t_0$). If $k = 3$, then with probability $\alpha$ ($\alpha > 0$) a $J_2$ person at the embryo stage would develop cancer at or before birth.
Figure 4: Embryo genotypes and their frequencies at embryo stage and at birth. (Diagram not reproduced. Panels: two-stage model and $k$-stage model ($k \ge 3$), each showing the embryo stage, the state at birth, and tumor, with branch probabilities $\alpha$ and $1 - \alpha$ and the cases $k > 3$ and $k = 3$.)
3. The Stochastic Process of Carcinogenesis with Hereditary Cancer Cases and Mathematical Analysis
Because tumors are developed from primary $J_k$ cells, for the above stochastic model the identifiable response variables are $T(t)$ and $\{J_u(t; i),\ i = 0, 1, 2,\ u = i, i+1, \ldots, k-1\}$, where $T(t)$ is the number of cancer tumors at time $t$ and $J_u(t; i)$ is the number of $J_u$ ($u = i, i+1, \ldots, k-1$) cells at time $t$ in people who are $J_i$ people at the embryo stage (see [3, 5, 8, 23] and Remarks 1 and 2). For people who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage, the stochastic model of carcinogenesis is then given by the stochastic process $\{\underset{\sim}{X}_i(t),\ T(t),\ t > 0\}$, where $\underset{\sim}{X}_i(t) = \{J_u(t; i),\ u = i, i+1, \ldots, k-1\}'$. For these processes, in the next subsections we derive stochastic equations for the state variables ($J_u(t; i)$, $i = 0, 1, 2$, $u = i, \ldots, k-1$); we also derive the probability distributions of these state variables and the probabilities of developing cancer tumors. These are the basic approaches for modeling carcinogenesis used by the first author and his associates; see Tan [3], Tan et al. [4, 5, 8, 23], Tan and Zhou [9], Tan and Yan [16], Tan and Chen [24, 25], and Remark 3.
Remark 2. At any time (say $t$), the total number of $J_k$ cells is equal to the total number of $J_k$ cells generated from $J_{k-1}$ cells at time $t$ plus the total number of $J_k$ cells generated by cell division from other $J_k$ cells at time $t$; the former $J_k$ cells are referred to as primary $J_k$ cells, while the latter are not primary $J_k$ cells. Since each tumor is developed from a single primary $J_k$ cell through a stochastic birth and death process, each primary $J_k$ cell will generate at most one tumor. It follows that at any time the total number of $J_k$ cells is considerably greater than the number of cancer tumors (see also Yang and Chen [11]). Thus, for generating cancer tumors, the only identifiable state variables are the numbers of $J_j$ cells ($j = 0, 1, \ldots, k-1$) and the number of detectable cancer tumors.
Remark 3. To model stochastic multistage models of carcinogenesis, the standard traditional approach is to assume that the last-stage cells (i.e., the $J_k$ cells in the model $N \to J_1 \to \cdots \to J_k \to \text{Tumor}$) grow instantaneously into a cancer tumor as soon as they are generated and then to apply the standard Markov theory to $T(t)$ and to the state variables $\underset{\sim}{X}(t) = \{J_i(t),\ i = 0, 1, \ldots, k-1\}$. This approach has been described in detail in Tan [2], Little [1], and Zheng [7]; see also Luebeck and Moolgavkar [26] and Durrett et al. [27]. However, in some cases the assumption of instantaneous growth of $J_k$ cells into cancer tumors may not be realistic (Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19]); in these cases, $T(t)$ is not Markov, so that the Markov theory method is not applicable to $T(t)$. To develop analytical results and to resolve many difficult issues, Tan and his associates [4, 5, 24] have proposed an alternative approach through stochastic equations and have followed Yang and Chen [11] in assuming that cancer tumors develop by clonal expansion from primary last-stage cells. Through the probability generating function method, Tan and Chen [24] have shown that if the Markov theory is applicable to $T(t)$, then the stochastic equation method is equivalent to the classical Markov theory method but is more powerful. Also, through the stochastic equation method, we have shown in the Appendix that the classical approach provides a close approximation to the discrete-time model under the assumption that the primary last-stage cells develop into a detectable tumor in one time unit. This provides a reasonable explanation of why the traditional approach (see [2, 22]) can still work well even though the Markov assumption for $T(t)$ may not hold. In this paper, we thus basically use the stochastic equation method and assume that cancer tumors develop from primary last-stage cells through clonal expansion.
3.1. Stochastic Equations for the State Variables. Assume now that an individual is a $J_i$ ($i = 0, 1, 2$) person at the embryo stage. Then in this individual cancer is developed by a $(k-i)$-stage multievent model given by $J_i \to \cdots \to J_k \to \text{Tumor}$, and the identifiable response variables are given by $\{\underset{\sim}{X}_i(t)',\ T(t)\}$. To derive stochastic equations for the staging variables in $\underset{\sim}{X}_i(t)$ in this individual, observe that for each $i = 0, 1, 2$, $\underset{\sim}{X}_i(t)'$ is in general a Markov process although $T(t)$ may not be Markov; see Remark 1, Tan [3], Tan et al. [4, 5], Tan and Zhou [9], and Tan and Yan [16]. It follows that $\underset{\sim}{X}_i(t + \Delta t)'$ derives from $\underset{\sim}{X}_i(t)'$ through stochastic birth-death processes of $J_u$ ($u = i, i+1, \ldots, k-1$) cells and through stochastic transitions $J_u \to J_{u+1}$, $u = i, i+1, \ldots, k-1$, during $(t, t + \Delta t]$. Let $B_u(t; i)$, $D_u(t; i)$, and $M_u(t; i)$ be the number of births of $J_u$ cells, the number of deaths of $J_u$ cells, and the number of transitions $J_u \to J_{u+1}$ during $(t, t + \Delta t]$, respectively, in people who are $J_i$ people at the embryo stage. Let $M_0(t)$ denote the number of transitions $N \to J_1$ during $(t, t + \Delta t]$. Because the transition $J_u \to J_{u+1}$ would not affect the number of $J_u$ cells but only increase the number of $J_{u+1}$ cells (see Remark 4), by the conservation law we have the following stochastic equations for $J_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$ (see Tan [3], Tan et al. [4, 5, 8], Tan and Zhou [9], and Tan and Yan [16]):
$$
\begin{aligned}
J_i(t + \Delta t; i) &= J_i(t; i) + B_i(t; i) - D_i(t; i), \quad i = 0, 1, 2, \\
J_u(t + \Delta t; i) &= J_u(t; i) + B_u(t; i) - D_u(t; i) + M_{u-1}(t; i), \quad i < u \le k-1.
\end{aligned}
\tag{2}
$$
Because $\{B_v(t; i),\ D_v(t; i),\ M_v(t; i),\ i = 0, 1, 2,\ v = i, \ldots, k-1\}$ are random variables, the above equations are basically stochastic equations. To derive probability distributions of these variables, let $b_u(t)$ and $d_u(t)$ denote the birth rate and the death rate at time $t$ of the $J_u$ ($u = 0, 1, \ldots, k-1$) cells, respectively. Let $\beta_u(t)$ ($u = 0, 1, \ldots, k-1$) be the transition rate at time $t$ from $J_u \to J_{u+1}$. Then, as shown in Tan [3], for ($i = 0, 1, 2$, $u = i, \ldots, k-1$) we have, to the order of $o(\Delta t)$,
$$
\{B_u(t; i),\ D_u(t; i),\ M_u(t; i)\} \mid J_u(t; i) \sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t,\ \beta_u(t)\Delta t\}. \tag{3}
$$
It follows that, to the order of $o(\Delta t)$,
$$
\begin{aligned}
\{B_u(t; i),\ D_u(t; i)\} \mid J_u(t; i) &\sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t\}, \\
M_u(t; i) \mid J_u(t; i) &\sim \text{Binomial}\{J_u(t; i);\ \beta_u(t)\Delta t\} \\
&\sim \text{Poisson}\{J_u(t; i)\,\beta_u(t)\Delta t\} + o(\beta_u(t)\Delta t), \\
&\quad \text{independently of } \{B_u(t; i),\ D_u(t; i)\}, \quad u = 0, 1, \ldots, k-1.
\end{aligned}
\tag{4}
$$
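The update in (2), with transition variables drawn as in (3), can be sketched as a one-step simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `step`, the list layout (position 0 holds the first tracked stage $J_i$), and all rate values are hypothetical, and each cell is classified individually, which is only practical for small cell counts.

```python
import random

def step(J, b, d, beta, dt, rng):
    """Advance the cell counts J by one interval (t, t + dt].

    J[0] is the first tracked stage (J_i); J[-1] is stage J_{k-1}.
    Each cell gives a birth, a death, a transition to the next stage,
    or nothing, with probabilities b*dt, d*dt, beta*dt, and the remainder.
    Returns the new counts together with the draws (B, D, M).
    """
    m = len(J)
    B, D, M = [0] * m, [0] * m, [0] * m
    for u in range(m):
        if (b[u] + d[u] + beta[u]) * dt > 1:
            raise ValueError("dt too large for a valid multinomial")
        for _ in range(J[u]):
            x = rng.random()
            if x < b[u] * dt:
                B[u] += 1
            elif x < (b[u] + d[u]) * dt:
                D[u] += 1
            elif x < (b[u] + d[u] + beta[u]) * dt:
                M[u] += 1
    new_J = []
    for u in range(m):
        # A transition J_{u-1} -> J_u adds a J_u cell without removing
        # the parent J_{u-1} cell, so M[u] never appears with a minus sign.
        inflow = M[u - 1] if u > 0 else 0
        new_J.append(J[u] + B[u] - D[u] + inflow)
    return new_J, B, D, M

# Hypothetical rates, for illustration only.
rng = random.Random(1)
J_now = [1000, 0, 0]
J_next, B, D, M = step(J_now, b=[0.02] * 3, d=[0.01] * 3,
                       beta=[0.001] * 3, dt=1.0, rng=rng)
```

The draws `M[-1]` from the last tracked stage correspond to newly generated primary $J_k$ cells, which are not part of the state vector.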
From these distribution results, by subtracting from the random transition variables their respective conditional means, we obtain the following stochastic equations for the state variables $\underset{\sim}{X}_i(t)$ ($i = 0, 1, \ldots, \text{Min}(2, k-1)$):
$$
\begin{aligned}
dJ_i(t; i) &= J_i(t + \Delta t; i) - J_i(t; i) = B_i(t; i) - D_i(t; i) \\
&= J_i(t; i)\,\gamma_i(t)\,\Delta t + e_i(t; i)\,\Delta t, \quad i = 0, 1, 2, \\
dJ_u(t; i) &= J_u(t + \Delta t; i) - J_u(t; i) = M_{u-1}(t; i) + B_u(t; i) - D_u(t; i) \\
&= \{J_{u-1}(t; i)\,\beta_{u-1}(t) + J_u(t; i)\,\gamma_u(t)\}\,\Delta t + e_u(t; i)\,\Delta t, \\
&\qquad i = 0, 1, \ldots, \text{Min}(2, k-1), \quad i < u \le k-1,
\end{aligned}
\tag{5}
$$
where $\gamma_u(t) = b_u(t) - d_u(t)$ for $u = 0, 1, \ldots, k-1$, and where
$$
e_i(t; i)\,\Delta t = [B_i(t; i) - J_i(t; i)\,b_i(t)\,\Delta t] - [D_i(t; i) - J_i(t; i)\,d_i(t)\,\Delta t]
$$
for $i = 0, 1, 2$, and
$$
e_u(t; i)\,\Delta t = [B_u(t; i) - J_u(t; i)\,b_u(t)\,\Delta t] - [D_u(t; i) - J_u(t; i)\,d_u(t)\,\Delta t] + [M_{u-1}(t; i) - J_{u-1}(t; i)\,\beta_{u-1}(t)\,\Delta t]
$$
for $i = 0, 1, \ldots, \text{Min}(2, k-1)$, $i < u \le k-1$.
From the above equations, by dividing both sides by $\Delta t$ and letting $\Delta t \to 0$, we obtain
$$
\begin{aligned}
\frac{d}{dt} J_i(t; i) &= J_i(t; i)\,\gamma_i(t) + e_i(t; i), \quad i = 0, 1, \ldots, \text{Min}(2, k-1), \\
\frac{d}{dt} J_u(t; i) &= J_u(t; i)\,\gamma_u(t) + J_{u-1}(t; i)\,\beta_{u-1}(t) + e_u(t; i), \\
&\qquad \text{for } i = 0, 1, \ldots, \text{Min}(2, k-1), \quad i < u \le k-1.
\end{aligned}
\tag{6}
$$
In the above equations, using the distribution results in (4), it can easily be shown that the random noises $e_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$, have expected value zero and are uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $\{J_u(t_0; i) > 0,\ u = i, i+1;\ J_u(t_0; i) = 0,\ u > i+1\}$. Given the initial conditions ($J_u(t_0; i) > 0$, $u = i, i+1$) and ($J_u(t_0; i) = 0$, $u > i+1$) at birth ($t_0$), the solution of the equations in (6) is given, respectively, by
$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t} \gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \text{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\,\beta_{u-1}(x)\, e^{\int_{x}^{t} \gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i} \eta_u^{(v)}(t; i), \\
&\qquad u = i+1, \ldots, k-1, \\
&\qquad \text{where } i = 0 \text{ if } k = 2; \quad i = 0, 1 \text{ if } k = 3; \quad i = 0, 1, 2 \text{ if } k > 3,
\end{aligned}
\tag{7}
$$
where
$$
\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy + \int_{t_0}^{x} \gamma_{u-1}(y)\,dy}\,\beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) &= \eta_u^{(1)}(t; i) = \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned}
\tag{8}
$$
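Because the noise terms in (6) have expected value zero, taking expectations of (6) gives a linear system of ordinary differential equations for the mean cell counts. The following minimal sketch integrates that system with a forward Euler scheme for constant rates; the function name `mean_counts`, the step size, and all parameter values are hypothetical and serve only as an illustration.

```python
import math

def mean_counts(J0, gamma, beta, t_end, dt=1e-3):
    """Forward Euler integration of the expected-value form of (6).

    J0[u]    : mean count of the u-th tracked stage at birth (t_0 = 0 here)
    gamma[u] : net growth rate b_u - d_u of that stage (assumed constant)
    beta[u]  : transition rate from tracked stage u to stage u + 1
    """
    J = list(J0)
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dJ = []
        for u in range(len(J)):
            inflow = beta[u - 1] * J[u - 1] if u > 0 else 0.0
            dJ.append((gamma[u] * J[u] + inflow) * dt)
        J = [x + dx for x, dx in zip(J, dJ)]
    return J

# Hypothetical initial means: positive for the first two stages, zero
# beyond, matching the initial conditions stated for (6).
means = mean_counts(J0=[1000.0, 5.0, 0.0],
                    gamma=[0.01, 0.02, 0.03],
                    beta=[1e-3, 1e-3],
                    t_end=10.0)
exact_first_stage = 1000.0 * math.exp(0.01 * 10.0)
```

For the first tracked stage the numerical mean can be compared with the closed form $J_i(t_0; i)\,e^{\gamma_i (t - t_0)}$ implied by (7) when $\gamma_i$ is constant.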
If the model is time homogeneous, so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ whenever $i \ne u$, the above solutions under the initial conditions $\{J_{i+u}(t_0; i) > 0,\ u = 0, 1;\ J_{i+u}(t_0; i) = 0,\ u > 1\}$ then reduce, respectively, to
$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i (t - t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \text{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \beta_{u-1} \int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u (t - x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned}
\tag{9}
$$
where, for $i \le v \le u$,
$$
A_{iu}(v) =
\begin{cases}
1, & \text{if } i = u, \\[4pt]
\displaystyle \prod_{l=i,\ l \ne v}^{u} (\gamma_l - \gamma_v)^{-1}, & \text{if } i < u.
\end{cases}
\tag{10}
$$
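The coefficients in (10) are straightforward to evaluate directly. The sketch below computes $A_{iu}(v)$ exactly with rational arithmetic; the function name `A_coef` and the rate values are hypothetical. It also checks one property that the representation (9) relies on: at $t = t_0$ every exponential equals 1, so for $r < u$ the inner sum $\sum_{l=r}^{u} A_{ru}(l)$ has to vanish in order for the right-hand side to reduce to $J_u(t_0; i)$.

```python
from fractions import Fraction

def A_coef(i, u, v, gamma):
    """Evaluate A_{iu}(v) as defined in (10), for i <= v <= u.

    gamma is a sequence of pairwise distinct net growth rates, indexed
    by stage; Fractions keep the arithmetic exact.
    """
    if not (i <= v <= u):
        raise ValueError("need i <= v <= u")
    if i == u:
        return Fraction(1)
    prod = Fraction(1)
    for l in range(i, u + 1):
        if l != v:
            prod /= Fraction(gamma[l]) - Fraction(gamma[v])
    return prod

# Hypothetical, pairwise distinct rates gamma_0, gamma_1, gamma_2.
gamma = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]
coefs = [A_coef(0, 2, v, gamma) for v in range(0, 3)]
total = sum(coefs)
```

With these rates the three coefficients are 5000, -10000, and 5000, so `total` is exactly zero.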
Obviously, $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$, $u = i, \ldots, k-1$, $r = 1, \ldots, u-i+1$). It follows that for ($i = 0, 1, \ldots, \text{Min}(k-1, 2)$), the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ whenever $i \ne u$ are given by
$$
\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)] \left( \prod_{u=i+1}^{k-2} \beta_u \right) \sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v (t - t_0)} \\
&\quad + E[J_i(t_0; i)] \left( \prod_{u=i}^{k-2} \beta_u \right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v (t - t_0)}, \\
&\qquad i = 0, 1, \ldots, \text{Min}(k-1, 2), \quad k \ge 2,
\end{aligned}
\tag{11}
$$
where, as a convention, $\sum_{j=i+1}^{i} c_j = 0$ and $\prod_{j=i+1}^{i} d_j = 1$ for all $(c_j, d_j)$.
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to one $J_u$ cell and one $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but only increase the size of the $J_{u+1}$ population.
3.2. Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n; p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and the probability distributions of the transition variables $\{B_r(t; i),\ D_r(t; i),\ M_r(t; i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t; i),\ r = i, \ldots, k-1\}$ given $\{J_r(t; i),\ r = i, \ldots, k-1\}$ for ($i = 0, 1, \ldots, \text{Min}(k-1, 2)$):
$$
\begin{aligned}
&P\{J_r(t + \Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\qquad \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned}
\tag{12}
$$
where

$$
\begin{aligned}
P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\}
&= \sum_{r=0}^{u_i} f(r;\ u_i,\ b_i(t)\Delta t) \\
&\quad \times f\!\left(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right), \quad i = 0, 1, \text{Min}(k-1, 2), \\
P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\}
&= \sum_{r_1=0}^{u_j} \sum_{r_2=0}^{u_j - r_1} g(r_1, r_2;\ u_j,\ b_j(t)\Delta t,\ d_j(t)\Delta t) \\
&\quad \times h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}\Delta t), \quad j > i.
\end{aligned}
\tag{13}
$$
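The first line of (13) can be checked numerically. The Python sketch below is an illustration only: the function names `binom_pmf` and `one_step_prob` and the numerical rates are assumptions introduced here, not part of the original derivation. It evaluates the convolution of the birth binomial and the conditional death binomial for a single cell type.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial(n, p) probability of x; zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def one_step_prob(v, u, b_dt, d_dt):
    """P{J_i(t+dt) = v | J_i(t) = u} following the first line of (13).

    b_dt and d_dt stand for b_i(t)*dt and d_i(t)*dt (hypothetical inputs).
    r births occur among u cells; the remaining u - r cells die with
    conditional probability d_dt / (1 - b_dt).
    """
    q = d_dt / (1.0 - b_dt)
    return sum(
        binom_pmf(r, u, b_dt) * binom_pmf(u - v + r, u - r, q)
        for r in range(u + 1)
    )


# Illustrative values only: 5 cells, birth prob 0.02, death prob 0.01 per step.
probs = [one_step_prob(v, 5, 0.02, 0.01) for v in range(0, 11)]
```

Under these assumptions the probabilities sum to one and the mean number of cells after one step is $u(1 + b\Delta t - d\Delta t)$, which is the expected-value recursion that the transition probabilities imply.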
Define the unobservable transition variables $\underset{\sim}{U}_i(t) = \{B_i(t; i), (B_j(t; i), D_j(t; i)), j = i+1, \ldots, k-1\}'$ $(i = 0, 1, \text{Min}(k-1, 2))$. Then we have for the joint probability density function of $\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t)\}$ given $\underset{\sim}{X}_i(t)$:

$$
P\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\}
= P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\}
\times P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\},
\tag{14}
$$
where

$$
\begin{aligned}
P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\}
&= f\!\left(J_i(t; i) - J_i(t + \Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\beta_{j-1}(t)\Delta t\},
\end{aligned}
\tag{15}
$$

$$
\begin{aligned}
P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\}
&= f(B_i(t; i);\ J_i(t; i),\ b_i(t)\Delta t) \\
&\quad \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i),\ b_j(t)\Delta t,\ d_j(t)\Delta t\}.
\end{aligned}
\tag{16}
$$
Let $\underset{\sim}{e}_i(j)$ be a $(k - i) \times 1$ column vector with 1 in the $j$th position $(1 \le j \le k - i)$ and with 0 in other positions. Let $\underset{\sim}{u} = (u_i, \ldots, u_{k-1})'$ and $\underset{\sim}{v} = (v_i, \ldots, v_{k-1})'$ be $(k - i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)–(16), it can readily be shown that

$$
P\{\underset{\sim}{X}_i(t + \Delta t) = \underset{\sim}{v} \mid \underset{\sim}{X}_i(t) = \underset{\sim}{u}\}
= \begin{cases}
[u_j b_j(t) + (1 - \delta_{ji}) u_{j-1}\beta_{j-1}(t)]\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} + \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} - \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t), & \text{if } |\underset{\sim}{1}'_{k-i}(\underset{\sim}{u} - \underset{\sim}{v})| \ge 2.
\end{cases}
\tag{17}
$$
The above results imply that $\underset{\sim}{X}_i(t)$ is a $(k - i)$-dimensional birth-death process with birth rates $\{i b_u(t), u = i, \ldots, k-1\}$, death rates $\{i d_u(t), u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t), u = i, \ldots, k-1, \beta_{uv}(j, t) = 0 \text{ if } v \ne u + 1\}$. (See Definition 4.1 in Tan [22, Chapter 4].) Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j, j = i, \ldots, k-1 \mid J_i(0) = m_i, J_j(0) = 0, j > i\} = P(u_j, j = i, \ldots, k-1; t)$ $(i = 0, 1, \text{Min}(k-1, 2))$ in the above model is given by

$$
\begin{aligned}
\frac{d}{dt} P(u_j, j = i, \ldots, k-1; t)
&= P(u_i - 1, u_j, j = i+1, \ldots, k-1; t)\,(u_i - 1)\, b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j - 1)\, b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1}; t)\, u_j \beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j + 1)\, d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1}; t)
\times \left\{ \sum_{j=i}^{k-1} u_j [b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j \beta_j(t) \right\},
\end{aligned}
\tag{18}
$$

for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.
By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t; i) = u_j, j = i, \ldots, k-1 \mid J_i(0) = m_i\} = P(u_j, j = i, \ldots, k-1; t)$ numerically.
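As a hedged illustration of such a numerical computation, the Python sketch below integrates the forward equation in the simplest special case of (18): a single cell type ($k - i = 1$) with constant rates, so that only the birth and death terms remain. The function name, the Euler time step, the truncation level `n_max`, and the rate values are all assumptions chosen for the example; a production fit would more likely use an adaptive ODE solver.

```python
import math


def forward_equation_1d(m, b, d, t_end, dt=1e-3, n_max=60):
    """Euler integration of dP(u; t)/dt for a linear birth-death process.

    m      : initial number of cells, P(m; 0) = 1
    b, d   : constant per-cell birth and death rates (hypothetical values)
    n_max  : truncation level; births out of n_max are blocked so that
             total probability is conserved on the truncated state space
    """
    p = [0.0] * (n_max + 1)
    p[m] = 1.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        new = p[:]
        for u in range(n_max + 1):
            inflow = 0.0
            if u >= 1:
                inflow += p[u - 1] * (u - 1) * b      # birth from u - 1
            if u < n_max:
                inflow += p[u + 1] * (u + 1) * d      # death from u + 1
            birth_out = u * b if u < n_max else 0.0   # blocked at boundary
            outflow = p[u] * (birth_out + u * d)
            new[u] = p[u] + dt * (inflow - outflow)
        p = new
    return p


p = forward_equation_1d(m=2, b=0.1, d=0.05, t_end=1.0)
mean = sum(u * pu for u, pu in enumerate(p))
```

With these illustrative inputs the computed mean should be close to $m e^{(b - d)t}$, the known expected value of a linear birth-death process, which gives a quick sanity check on the integration.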
3.3. The Probability Distributions of the Number of Detectable Tumors and Times to Tumors

As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22, Chapter 8] and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i), s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\beta_{k-1}(s) P_T(s, t)\, ds$. That is,

$$
T(t) \mid \{J_{k-1}(s; i),\ s \le t\} \sim \text{Poisson}(\omega(t; i)).
\tag{19}
$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by

$$
Q_i(j) = E\{e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)}\}
= e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}),
\tag{20}
$$

where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)] P_T(x, t)\, dx$.
To derive $Q_i(j)$, denote by

$$
\begin{aligned}
\theta_{i(k-1)} &= E[J_i(t_0; i-1)]\, \beta_i \prod_{u=i+1}^{k-1} \left(\frac{\beta_u}{\gamma_u}\right), \quad i = 1, \ldots, \text{Min}(3, k-1), \\
\lambda_{u(k-1)} &= E[J_u(t_0; u)]\, \beta_u \prod_{v=u+1}^{k-1} \left(\frac{\beta_v}{\gamma_v}\right), \quad u = 0, 1, \ldots, \text{Min}(2, k-1),
\end{aligned}
\tag{21}
$$

and define the functions

$$
\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)
\times \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx, \quad i = 0, 1, \ldots, \text{Min}(2, k-1).
\tag{22}
$$
Applying the results for $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.

(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1, j > 0)$, and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})}
- e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1),
\tag{23}
$$

$$
Q_1(j) = (1 - \alpha_1)\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\} + o(\beta_1).
\tag{24}
$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\alpha$, and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})}
- e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{25}
$$

$$
Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})}
- e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{26}
$$

$$
\begin{aligned}
Q_2(j) &= \delta_{2k}(1 - \alpha)\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\} \\
&\quad + (1 - \delta_{2k})\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})}
- e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\} \\
&\quad + o(\beta_{k-1}),
\end{aligned}
\tag{27}
$$

where $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$
\begin{aligned}
\psi_{0(k-1)}(t) &= \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=0}^{k-1} A_{0(k-1)}(v)
\times \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx \\
&= \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v}
\times \int_{t_0}^{t} [e^{\gamma_v (x - t_0)} - 1]\, P_T(x, t)\, dx.
\end{aligned}
\tag{28}
$$

Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$
\begin{aligned}
\psi_{ii}(t) &= \frac{1}{\gamma_i}\{e^{\gamma_i (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1, \\
\psi_{0(k-1)}(t) &= \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)
\times \frac{1}{\gamma_v^2}\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\gamma_v\}, \\
\psi_{i(k-1)}(t) &= \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)
\times \frac{1}{\gamma_v}\{e^{\gamma_v (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1.
\end{aligned}
\tag{29}
$$
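To make the closed forms concrete, the following Python sketch evaluates $\psi_{ii}(t)$ from (29) and the leading term of $Q_1(j)$ in (24) for the two-stage case. It is a sketch under stated assumptions: the function names and all numerical values ($\lambda_{11}$, $\gamma_1$, $\alpha_1$, the yearly grid) are hypothetical and chosen only for illustration, and the $o(\beta_1)$ remainder is ignored.

```python
import math


def psi_ii(t, gamma, t0=0.0):
    """psi_ii(t) from (29): (exp(gamma * (t - t0)) - 1) / gamma, assuming P_T ~ 1."""
    return math.expm1(gamma * (t - t0)) / gamma


def q1_two_stage(t_prev, t_curr, lam11, gamma1, alpha1, t0=0.0):
    """Leading term of Q_1(j) in (24) for the interval (t_prev, t_curr]."""
    return (1.0 - alpha1) * (
        math.exp(-lam11 * psi_ii(t_prev, gamma1, t0))
        - math.exp(-lam11 * psi_ii(t_curr, gamma1, t0))
    )


# Hypothetical parameter values on a yearly grid t_0 = 0, ..., t_20 = 20.
times = list(range(0, 21))
qs = [q1_two_stage(times[j - 1], times[j], 1e-3, 0.05, 0.1) for j in range(1, 21)]
```

Because each $Q_1(j)$ is a difference of survival-type terms, the interval probabilities telescope: their sum over $j$ equals $(1 - \alpha_1)\{1 - e^{-\lambda_{11}\psi_{11}(t_n)}\}$, which is a convenient check on any implementation.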
4. Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases

For estimating unknown parameters and validating the model, one would need real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0), (y_j, n_j), j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section we develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j; p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$

As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore

$$
Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3,
\tag{30}
$$

where

$$
\chi_k = \begin{cases}
n_0 (p_2 + p_1 \alpha), & \text{if } k = 2, \\
n_0 p_2 \alpha, & \text{if } k = 3.
\end{cases}
\tag{31}
$$
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is

$$
D_0(k) = -2\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\}
= 2\left\{\chi_k - y_0 - y_0 \log \frac{\chi_k}{y_0}\right\}, \quad k = 2, 3.
\tag{32}
$$
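The two forms of the Poisson deviance in (32) can be compared directly. The Python sketch below is illustrative: the function names are assumptions, and the inputs are the age-0 entries of Table 1 (34 observed cases and a Model-F prediction of 36), used here only as an example of $y_0$ and a fitted $\chi_k$.

```python
import math


def poisson_log_pmf(y, mu):
    """log h(y; mu) for a Poisson density with mean mu > 0."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)


def poisson_deviance(y, mu):
    """2{mu - y - y log(mu / y)}; the y log term is taken as 0 when y = 0."""
    term = y * math.log(mu / y) if y > 0 else 0.0
    return 2.0 * (mu - y - term)


y0, chi_k = 34, 36.0   # age-0 observed count and Model-F prediction from Table 1
d_closed = poisson_deviance(y0, chi_k)
d_loglik = -2.0 * (poisson_log_pmf(y0, chi_k) - poisson_log_pmf(y0, float(y0)))
```

The closed form and the log-likelihood difference agree up to rounding, and the deviance vanishes when the fitted mean equals the observed count.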
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

| Age group | Number of people at risk | Observed incidence | Model-F predicted | Model-1 predicted | Two-stage predicted |
|---|---|---|---|---|---|
| 0 | 12495777 | 34 | 36 | 36 | 38 |
| 1 | 12221582 | 20 | 11 | 11 | 16 |
| 2 | 12120990 | 14 | 7 | 8 | 17 |
| 3 | 12112995 | 9 | 5 | 6 | 18 |
| 4 | 12146174 | 4 | 4 | 5 | 20 |
| 5 | 12161336 | 3 | 3 | 4 | 22 |
| 6 | 12111854 | 2 | 2 | 4 | 25 |
| 7 | 12160452 | 2 | 2 | 4 | 28 |
| 8 | 11942586 | 2 | 2 | 4 | 30 |
| 9 | 12381299 | 1 | 2 | 5 | 34 |
| 10 | 12512703 | 2 | 3 | 6 | 38 |
| 11 | 12410338 | 5 | 3 | 6 | 41 |
| 12 | 12449244 | 2 | 3 | 6 | 44 |
| 13 | 12527781 | 5 | 4 | 7 | 48 |
| 14 | 12602883 | 3 | 5 | 7 | 51 |
| 15 | 12719598 | 5 | 5 | 8 | 55 |
| 16 | 12766107 | 7 | 6 | 9 | 59 |
| 17 | 12831400 | 9 | 7 | 9 | 62 |
| 18 | 12382047 | 8 | 7 | 9 | 63 |
| 19 | 12581638 | 10 | 8 | 10 | 68 |
| 20 | 12636509 | 7 | 9 | 11 | 71 |
| 21 | 12682601 | 6 | 10 | 10 | 75 |
| 22 | 12840510 | 12 | 11 | 11 | 79 |
| 23 | 13075528 | 17 | 13 | 13 | 84 |
| 24 | 13358635 | 16 | 15 | 14 | 89 |
| 25 | 13473849 | 12 | 16 | 16 | 94 |
| 26 | 13426340 | 17 | 18 | 18 | 97 |
| 27 | 13525264 | 28 | 20 | 20 | 101 |
| 28 | 13149674 | 17 | 22 | 21 | 102 |
| 29 | 13812811 | 23 | 25 | 25 | 110 |
| 30 | 13886874 | 24 | 28 | 27 | 114 |
| 31 | 13488332 | 37 | 30 | 29 | 115 |
| 32 | 13460286 | 32 | 32 | 32 | 118 |
| 33 | 13256067 | 38 | 35 | 35 | 119 |
| 34 | 13428827 | 37 | 39 | 38 | 124 |
| 35 | 13220037 | 40 | 41 | 41 | 126 |
| 36 | 12870265 | 30 | 44 | 44 | 126 |
| 37 | 12689592 | 43 | 47 | 47 | 127 |
| 38 | 12157014 | 42 | 49 | 49 | 125 |
| 39 | 12494081 | 46 | 55 | 54 | 131 |
| 40 | 12272125 | 49 | 58 | 58 | 132 |
| 41 | 11826573 | 56 | 61 | 60 | 130 |
| 42 | 11663153 | 54 | 65 | 64 | 131 |
| 43 | 11407082 | 53 | 68 | 68 | 131 |
| 44 | 11296848 | 70 | 73 | 72 | 133 |
| 45 | 11016369 | 57 | 76 | 76 | 132 |
| 46 | 10651593 | 71 | 79 | 79 | 130 |
| 47 | 10475708 | 87 | 84 | 83 | 131 |
| 48 | 9994684 | 82 | 86 | 85 | 127 |
| 49 | 10138908 | 78 | 93 | 93 | 131 |
| 50 | 9836359 | 87 | 97 | 96 | 130 |
| 51 | 9475641 | 95 | 100 | 99 | 127 |
| 52 | 9250985 | 113 | 104 | 104 | 126 |
| 53 | 9027382 | 106 | 108 | 108 | 125 |
| 54 | 8883737 | 117 | 113 | 113 | 125 |
| 55 | 8547883 | 129 | 116 | 116 | 123 |
| 56 | 8279648 | 107 | 119 | 119 | 121 |
| 57 | 8062368 | 119 | 123 | 123 | 119 |
| 58 | 7654610 | 132 | 124 | 124 | 115 |
| 59 | 7563706 | 118 | 130 | 129 | 115 |
| 60 | 7232719 | 131 | 131 | 130 | 112 |
| 61 | 6927332 | 116 | 132 | 132 | 109 |
| 62 | 6708273 | 133 | 134 | 134 | 107 |
| 63 | 6543931 | 143 | 138 | 137 | 106 |
| 64 | 6404652 | 130 | 141 | 141 | 105 |
| 65 | 6168486 | 145 | 142 | 142 | 102 |
| 66 | 5913479 | 138 | 142 | 142 | 99 |
| 67 | 5746766 | 151 | 144 | 144 | 98 |
| 68 | 5480517 | 147 | 142 | 142 | 94 |
| 69 | 5363912 | 149 | 144 | 144 | 94 |
| 70 | 5110728 | 136 | 142 | 142 | 90 |
| 71 | 4925076 | 144 | 141 | 141 | 88 |
| 72 | 4696825 | 140 | 138 | 138 | 85 |
| 73 | 4512136 | 146 | 135 | 135 | 82 |
| 74 | 4345300 | 126 | 133 | 133 | 80 |
| 75 | 4148801 | 122 | 128 | 129 | 78 |
| 76 | 3900900 | 124 | 122 | 122 | 74 |
| 77 | 3681587 | 114 | 116 | 116 | 70 |
| 78 | 3481918 | 93 | 110 | 110 | 67 |
| 79 | 3243631 | 102 | 102 | 102 | 63 |
| 80 | 2961234 | 79 | 92 | 93 | 58 |
| 81 | 2724984 | 93 | 83 | 84 | 54 |
| 82 | 2495219 | 82 | 75 | 75 | 50 |
| 83 | 2271595 | 92 | 66 | 67 | 46 |
| 84 | 2041351 | 64 | 57 | 58 | 42 |
| 85 | 10466605 | 304 | 282 | 285 | 216 |

Note 1: for age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ $(j \ge 1)$

To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$
Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2.
\tag{33}
$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2, j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$:

$$
P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j};\ n_j, p_0, p_1)\, h\{y_j;\ Q_T(j)\},
\tag{34}
$$

where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.

The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
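The mixture in (34) can be evaluated by brute force when $n_j$ is small. The Python sketch below is a toy illustration only: the function names, the tiny population size, and the genotype and tumor probabilities are assumptions chosen so the double sum is cheap, and they are far from the magnitudes in the SEER data (where $n_j$ is in the millions and direct enumeration is impractical).

```python
from math import exp, factorial


def poisson_pmf(y, mu):
    """h(y; mu): Poisson probability of y with mean mu."""
    return exp(-mu) * mu**y / factorial(y)


def multinomial_pmf(n0, n1, n, p0, p1):
    """g(n0, n1; n, p0, p1) with the third cell n2 = n - n0 - n1."""
    n2 = n - n0 - n1
    coef = factorial(n) // (factorial(n0) * factorial(n1) * factorial(n2))
    return coef * p0**n0 * p1**n1 * (1.0 - p0 - p1) ** n2


def mixture_pmf(y, n, p0, p1, q, k):
    """P(y_j | n_j) in (34); q = (Q_0(j), Q_1(j), Q_2(j)) are hypothetical inputs."""
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            # Q_T(j): the J_2 term is dropped when k = 2 (delta_{2k} = 1).
            q_t = n0 * q[0] + n1 * q[1] + (0.0 if k == 2 else n2 * q[2])
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_t)
    return total


# Toy inputs: 6 people, genotype probabilities (0.7, 0.2, 0.1), k = 3.
pmf = [mixture_pmf(y, 6, 0.7, 0.2, (0.1, 0.5, 1.0), k=3) for y in range(61)]
```

For these toy inputs the mixture sums to one over $y$, and its mean is $n_j \sum_i p_i Q_i(j)$, as expected from averaging the Poisson mean $Q_T(j)$ over the multinomial genotype counts.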
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is

$$
L\{\Theta \mid y_j,\ j = 0, 1, \ldots, n_T\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j).
\tag{35}
$$

Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $\{b_i(t) = b_i, d_i(t) = d_i, \gamma_i(t) = b_i - d_i = \gamma_i, i = 0, 1, \ldots, k-1\}$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence

To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $(j = 1, \ldots, t_N)$ and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1)$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is

$$
(y_{ij},\ i = 0, 1) \mid (y_j,\ n_{ij},\ i = 0, 1,\ n_j)
\sim \text{Multinomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\}.
\tag{36}
$$
Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$:

$$
\begin{aligned}
P\{y_{ij},\ i = 0, 1,\ y_j \mid n_{ij},\ i = 0, 1,\ n_j\}
&= P\{y_{ij},\ i = 0, 1 \mid y_j,\ n_{ij},\ i = 0, 1, 2\} \times P\{y_j \mid n_{ij},\ i = 0, 1, 2\} \\
&= \frac{1}{y_j!}\, e^{-Q_T(j)}\, Q_T(j)^{y_j}
\times g\left\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\} \\
&= \prod_{i=0}^{2} h\{y_{ij};\ n_{ij} Q_i(j)\} \quad (k > 2),
\end{aligned}
\tag{37}
$$

where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
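The last equality in (37) is the standard fact that a Poisson total split multinomially is equivalent to independent Poisson components. The Python sketch below checks this numerically; the function names and the component means (standing in for $n_{ij} Q_i(j)$) are hypothetical values used only for illustration.

```python
from math import exp, factorial


def poisson_pmf(y, mu):
    """h(y; mu): Poisson probability of y with mean mu."""
    return exp(-mu) * mu**y / factorial(y)


def joint_via_conditioning(y_parts, means):
    """Poisson pmf of the total times the multinomial pmf of the split."""
    y, total = sum(y_parts), sum(means)
    coef = factorial(y)
    for part in y_parts:
        coef //= factorial(part)          # multinomial coefficient, exact integers
    multinom = float(coef)
    for part, mu in zip(y_parts, means):
        multinom *= (mu / total) ** part  # cell probabilities n_ij Q_i(j) / Q_T(j)
    return poisson_pmf(y, total) * multinom


def joint_via_product(y_parts, means):
    """Product of independent Poisson pmfs, the right-hand side of (37)."""
    out = 1.0
    for part, mu in zip(y_parts, means):
        out *= poisson_pmf(part, mu)
    return out


# Hypothetical counts (y_0j, y_1j, y_2j) and means n_ij * Q_i(j).
lhs = joint_via_conditioning((3, 1, 2), (0.8, 0.3, 1.5))
rhs = joint_via_product((3, 1, 2), (0.8, 0.3, 1.5))
```

The two quantities agree to floating-point precision, which is what allows the augmented counts $y_{ij}$ to be treated as independent Poisson variables in the Gibbs sampling steps.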
If $k = 2$, then $Y_{2j} = 0$, so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have for $k = 2$:

$$
Y_j \mid (n_{ij},\ i = 0, 1,\ n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\},
\tag{38}
$$

$$
Y_{ij} \mid (Y_j = y_j,\ n_{ij},\ i = 0, 1,\ n_j)
\sim \text{Binomial}\left\{y_j,\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1.
\tag{39}
$$

It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is

$$
\begin{aligned}
P\{y_{0j},\ y_j \mid n_{uj},\ u = 0, 1,\ n_j\}
&= P\{y_j \mid n_{ij},\ i = 0, 1,\ n_j\} \times P\{y_{0j} \mid y_j,\ n_{uj},\ u = 0, 1,\ n_j\} \\
&= \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\} \quad \text{if } k = 2,
\end{aligned}
\tag{40}
$$

where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\underset{\sim}{n} = (n_j, j = 0, 1, \ldots, t_N)$, and $\underset{\sim}{y} = (y_j, j = 0, 1, \ldots, t_N)$. From (37) and (40) we have for the conditional joint probability density function of $(\mathbf{Y}, \underset{\sim}{y})$ given $(\mathbf{N}, \underset{\sim}{n})$:

$$
P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}
= h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}}
\times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}.
\tag{41}
$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given $(\underset{\sim}{n}, \Theta)$ is

$$
\begin{aligned}
P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}
&= h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j};\ n_j, p_0, p_1) \\
&\quad \times \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}}
\times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}.
\end{aligned}
\tag{42}
$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is

$$
\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j,
\tag{43}
$$

where

$$
D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\right\},
\tag{44}
$$

$$
\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{ n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i} \right\},
\tag{45}
$$

$$
\begin{aligned}
D_j &= 2 \sum_{i=0}^{1} \left\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}} \right\} \\
&\quad + 2 (1 - \delta_{2k}) \times \left\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}} \right\},
\end{aligned}
\tag{46}
$$

where $\hat{p}_i = (\sum_{j=0}^{t_N} n_{ij}) / (\sum_{j=0}^{t_N} n_j)$ $(i = 0, 1, 2)$.

The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of $(\mathbf{Y}, \underset{\sim}{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data

To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies. (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit.) Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by

$$
Q_i(j) \approx E\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\}
= e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}),
\tag{47}
$$

where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.

Under the discrete time approximation, the $E[J_{k-1}(t; i)]$'s have been derived in the appendix. Using these results for the expected numbers and the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain

$$
\begin{aligned}
\beta_{k-1} E[G_i(t)]
&= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left(\prod_{u=r}^{k-1} \beta_u\right)
\times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\
&= \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t),
\end{aligned}
\tag{48}
$$
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$'s $(i = 0, 1, 2, 3)$ are given by

$$
\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)
\times \frac{1}{\gamma_v}\{(1 + \gamma_v)^{t - t_0} - 1\}, \quad i = 0, 1, 2, 3.
\tag{49}
$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$
\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)
\times \frac{1}{\gamma_v^2}\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\}.
\tag{50}
$$

Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1, j > 0)$, and for $j > 0$,

$$
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})}
- e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1), \\
Q_1(j) &\approx (1 - \alpha_1)\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\} + o(\beta_1).
\end{aligned}
\tag{51}
$$

(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,

$$
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})}
- e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})}
- e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)}(1 - \alpha)\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\} \\
&\quad + (1 - \delta_{2(k-1)})\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})}
- e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\} \\
&\quad + o(\beta_{k-1}).
\end{aligned}
\tag{52}
$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
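The quality of the approximation $(1 + \gamma_v)^{t - t_0} \approx e^{(t - t_0)\gamma_v}$ depends on how small $\gamma_v$ is per time unit. The Python sketch below compares the single-stage discrete term $\{(1 + \gamma)^{t - t_0} - 1\}/\gamma$ with its continuous counterpart $\{e^{\gamma(t - t_0)} - 1\}/\gamma$. The function names and the two $\gamma$ values are hypothetical, and the single-stage form assumes the coefficient $A_{ii}(i)$ equals 1.

```python
import math


def phi_ii(t, gamma, t0=0):
    """Discrete-time term ((1 + gamma)**(t - t0) - 1) / gamma, cf. (49)."""
    return ((1.0 + gamma) ** (t - t0) - 1.0) / gamma


def psi_ii(t, gamma, t0=0):
    """Continuous-time term (exp(gamma * (t - t0)) - 1) / gamma, cf. (29)."""
    return math.expm1(gamma * (t - t0)) / gamma


def relative_gap(t, gamma):
    """Relative difference between the continuous and discrete terms."""
    return abs(psi_ii(t, gamma) - phi_ii(t, gamma)) / psi_ii(t, gamma)


# phi_ii is also the geometric sum of (1 + gamma)**s over s = 0, ..., t - 1.
geometric_sum = sum((1.0 + 0.05) ** s for s in range(40))

gap_small = relative_gap(40, 0.001)   # small net proliferation rate per step
gap_large = relative_gap(40, 0.05)    # larger rate: approximation degrades
```

For the small rate the two terms differ by well under one percent over 40 time units, while for the larger rate the gap is a few percent, which suggests choosing the time unit so that $\gamma_v$ per step stays small.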
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2); and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of the genotypes of individuals at the embryo stage.
5.1 The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$
P(\Theta) \propto c \quad (c > 0),
\tag{53}
$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows:

(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k-1)$, we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.

We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
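A flat prior on a constrained region amounts to an indicator function on that region. The sketch below encodes constraints (i)–(iv) as they are read here from Section 5.1; the dictionary keys, the function names, and the choice $c = 1$ are assumptions made for illustration, and the numeric bounds should be checked against the original constraints before any real use.

```python
import math

def in_prior_support(theta: dict) -> bool:
    """True if `theta` satisfies constraints (i)-(iv) as read from Section 5.1.

    Expected (hypothetical) keys: p1, p2, gamma (gamma_1, gamma_2), omega0,
    beta (beta_1..beta_{k-1}), lam (lambda_0, lambda_1, lambda_2),
    theta1, theta2.
    """
    checks = [
        0.0 < theta["p1"] < 1e-2,
        0.0 < theta["p2"] < 1e-6,
        all(-0.01 < g < 1.0 for g in theta["gamma"]),
        1.0 < theta["omega0"] < 1000.0,
        all(1e-8 < b < 1e-3 for b in theta["beta"]),
        all(0.0 < lam < 10.0 for lam in theta["lam"][:2]),
        0.0 < theta["lam"][2] < 1e3,
        0.0 < theta["theta1"] < 1e-2,
        0.0 < theta["theta2"] < 1e-1,
    ]
    return all(checks)

def log_prior(theta: dict) -> float:
    """Log of the partially informative prior, taking c = 1 (an assumption)."""
    return 0.0 if in_prior_support(theta) else -math.inf
```

Because the prior is constant inside the region, it only affects the posterior through the constraint check, which is why the later maximization steps are stated as constrained deviance minimizations.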
5.2 The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain

$$
\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \\
&\quad \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \qquad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)}\, Q_i(j)^{y_{ij}}, \qquad \Theta_1 \in \Omega,
\end{aligned}
\tag{54}
$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3 The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that, numerically, the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and Steps 3 and 4 as the M-step, respectively [29]. The multilevel Gibbs sampling procedure is given by the following steps.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.

Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \hat{\mathbf{N}}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.

Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.

Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.

Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
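The weighted bootstrap used in Step 1 can be sketched generically: draw candidates from a distribution that is easy to sample, weight each candidate by the factor that is missing from that distribution, and resample in proportion to the weights. The Python sketch below illustrates the resampling idea on a toy problem (uniform candidates, target density proportional to $x$ on $(0,1)$). The function names and the toy target are assumptions for illustration; in the paper's setting the candidates would be multinomial draws of $\mathbf{N}$ and the weights would come from $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$.

```python
import math
import random

def weighted_bootstrap(candidates, log_weight, rng, size=1):
    """Resample `candidates` with probability proportional to exp(log_weight(c))."""
    logs = [log_weight(c) for c in candidates]
    top = max(logs)
    if top == -math.inf:
        raise ValueError("every candidate has zero weight")
    # Subtracting the maximum keeps the exponentials in a safe numeric range.
    weights = [math.exp(v - top) for v in logs]
    return rng.choices(candidates, weights=weights, k=size)

def toy_log_weight(x):
    """Toy target density proportional to x on (0, 1); the proposal is uniform."""
    return math.log(x) if x > 0.0 else -math.inf

def demo(seed=0, n_candidates=20000, n_keep=5000):
    """Mean of the resampled values; near 2/3 for this toy target."""
    rng = random.Random(seed)
    candidates = [rng.random() for _ in range(n_candidates)]
    kept = weighted_bootstrap(candidates, toy_log_weight, rng, size=n_keep)
    return sum(kept) / len(kept)
```

The resampled values only approximate draws from the target, and the approximation improves as the number of candidates grows, which is why Step 1 asks for a large initial sample.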
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$, independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedure, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample means as the estimates of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
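The final summarization step (sample means as point estimates, sample variances and covariances as their uncertainty) can be sketched as follows. The function name and the list-of-lists layout, with one row per generated value of $\Theta$, are assumptions for illustration.

```python
def posterior_summary(draws):
    """Return (mean vector, covariance matrix) of a list of parameter draws.

    `draws` is a list of equal-length lists, one per generated value of Theta.
    The covariance uses the unbiased (n - 1) divisor.
    """
    n = len(draws)
    if n < 2:
        raise ValueError("need at least two draws")
    d = len(draws[0])
    mean = [sum(row[j] for row in draws) / n for j in range(d)]
    cov = [
        [
            sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in draws) / (n - 1)
            for b in range(d)
        ]
        for a in range(d)
    ]
    return mean, cov
```

The square roots of the diagonal entries of the covariance matrix give standard deviations of the kind reported alongside the estimates.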
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells) which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.

Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Therefore, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).

To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; and (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

| Models | Log-likelihood | AIC | BIC |
|---|---|---|---|
| Three-stage, Model-F | -1312.02 | 2648.04 | 2677.35 |
| Three-stage, Model-1 | -1295.76 | 2609.53 | 2631.51 |
| Two-stage | -4392.421 | 8796.843 | 8811.569 |
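The criteria in Table 2 follow from the log-likelihood and the number of free parameters through the standard definitions $\mathrm{AIC} = 2m - 2\log L$ and $\mathrm{BIC} = m \log n - 2\log L$. The sketch below applies them to Model-F, reading its log-likelihood as -1312.02 and taking $m = 12$ from the degrees of freedom quoted for that model (df = 86 - 12). The number of observations $n$ entering the BIC is not stated in this section, so the value 85 used in the usage note is an assumption.

```python
import math

def aic(log_lik: float, n_params: int) -> float:
    """Akaike Information Criterion: 2m - 2 log L."""
    return 2.0 * n_params - 2.0 * log_lik

def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: m log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

if __name__ == "__main__":
    print(round(aic(-1312.02, 12), 2))      # Model-F AIC
    print(round(bic(-1312.02, 12, 85), 2))  # Model-F BIC with an assumed n = 85
```

With these inputs the AIC reproduces the tabulated 2648.04, and the assumed $n = 85$ gives a BIC close to the tabulated 2677.35. Smaller values of either criterion indicate a better trade-off between fit and model size.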
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model (incidences versus age in years; series shown: Observed, Model-F, Model-1, Two-stage).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.

(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better judging from the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 - 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 - 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly: the chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.

(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.

(c) From Table 3, the estimates $\hat{\lambda}_j$ $(j = 0, 1, 2)$ of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u/\gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from biological observations, one can have some rough idea about the magnitude of $\beta_j$ $(j = 0, 1, 2)$. For example, if we follow Potten et al. [32] to assume $E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.

(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
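The goodness-of-fit check in observation (a) can be sketched in a few lines. The first function computes a Pearson-type statistic, with the predicted counts in the denominator (an assumption about the garbled formula). The second approximates the upper-tail probability of a chi-square distribution with the Wilson-Hilferty normal approximation, which is swapped in here to keep the sketch dependency-free; it is not necessarily the method the authors used.

```python
import math

def pearson_chi_square(observed, predicted):
    """Sum over groups of (observed - predicted)^2 / predicted."""
    return sum((o - p) ** 2 / p for o, p in zip(observed, predicted))

def chi_square_upper_tail(stat: float, df: int) -> float:
    """Approximate P(X >= stat) for X ~ chi-square(df) (Wilson-Hilferty)."""
    c = 2.0 / (9.0 * df)
    z = ((stat / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

if __name__ == "__main__":
    print(chi_square_upper_tail(88.43, 74))  # statistic and df quoted for Model-F
    print(chi_square_upper_tail(94.48, 79))  # statistic and df quoted for Model-1
```

Reading the quoted statistics as 88.43 and 94.48, the approximation gives tail probabilities of roughly 0.12 and 0.11, in line with the $P$-values reported above.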
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To fit the cancer incidence data with the proposed models, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results clearly have confirmed results from molecular biology that the human eye cancer is derived by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring completely cancer progression; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $\hat{y}_0 = n_{01}\,p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.

Table 3: Estimates of parameters for the 3-stage stochastic models.

| Model | Quantity | $\lambda_0$ | $\lambda_1$ | $\lambda_2$ | $\gamma_1$ | $\gamma_2$ | $p_1$ | $p_2$ | $\alpha$ | $\theta_1$ | $\theta_2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Model-F | Estimates | 2.88E-01 | 4.81E-01 | 6.55E+02 | 2.71E-05 | 3.37E-02 | 9.95E-04 | 9.68E-07 | 8.41E-01 | 3.29E-04 | 3.41E-01 |
| Model-F | StD | 1.21E-01 | 4.65E-02 | 1.12E+02 | 8.86E-06 | 1.92E-04 | 1.75E-06 | 6.48E-09 | 4.19E-02 | 3.54E-05 | 1.07E-02 |
| Model-F | 95% CL, Lower | 5.03E-02 | 3.89E-01 | 4.34E+02 | 5.34E-06 | 3.33E-02 | 9.91E-04 | 9.55E-07 | 7.59E-01 | 2.60E-04 | 3.20E-01 |
| Model-F | 95% CL, Upper | 5.25E-01 | 5.72E-01 | 8.75E+02 | 4.01E-05 | 3.41E-02 | 9.98E-04 | 9.81E-07 | 9.23E-01 | 3.98E-04 | 3.62E-01 |
| Model-1 | Estimates | 6.55E-02 | 3.72E-01 | 1.98E+02 | NA | 3.49E-02 | 9.98E-04 | 1.00E-06 | 8.07E-01 | NA | NA |
| Model-1 | StD | 2.54E-02 | 4.63E-02 | 3.38E+01 | NA | 3.58E-03 | 6.27E-05 | 4.53E-08 | 7.06E-02 | NA | NA |
| Model-1 | 95% CL, Lower | 1.57E-02 | 2.82E-01 | 1.32E+02 | NA | 2.79E-02 | 8.75E-04 | 9.11E-07 | 6.68E-01 | NA | NA |
| Model-1 | 95% CL, Upper | 1.15E-01 | 4.63E-01 | 2.65E+02 | NA | 4.19E-02 | 1.12E-03 | 1.09E-06 | 9.45E-01 | NA | NA |

Note: NA, assumed nonexistence.
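The 95% confidence limits reported in Table 3 appear consistent with a normal approximation of the form estimate $\pm\, 1.96 \times$ StD; this is an observation about the reported numbers, since the text does not state how the limits were computed. The sketch below checks this for the $\alpha$ column of Model-F, reading the estimate as 8.41E-01 and the standard deviation as 4.19E-02. The function name is an assumption for illustration.

```python
def normal_limits(estimate: float, std: float, z: float = 1.96):
    """Two-sided limits estimate -/+ z * std under a normal approximation."""
    half_width = z * std
    return estimate - half_width, estimate + half_width

if __name__ == "__main__":
    lower, upper = normal_limits(0.841, 0.0419)
    print(round(lower, 3), round(upper, 3))  # compare with 7.59E-01 and 9.23E-01
```

The same check can be repeated for the other columns; small discrepancies in the last digit are expected because the tabulated inputs are themselves rounded.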
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$
where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.

The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
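The agreement between the difference equations (A1) and their solution (A2) can be checked numerically for expected values. The Python sketch below does this in the time-homogeneous case, dropping the noise terms $e_j$ on the assumption that they have expectation zero, and writing time as the number of steps since $t_0$. The function names, the list layout (index 0 is the lowest stage present at birth), and the numerical rates are illustrative assumptions.

```python
def expected_by_recursion(j0, gamma, beta, steps):
    """Iterate E[J_j(t+1)] = E[J_j(t)](1 + gamma_j) + E[J_{j-1}(t)] beta_{j-1}."""
    path = [list(j0)]
    for _ in range(steps):
        prev = path[-1]
        nxt = []
        for j, value in enumerate(prev):
            inflow = prev[j - 1] * beta[j - 1] if j > 0 else 0.0
            nxt.append(value * (1.0 + gamma[j]) + inflow)
        path.append(nxt)
    return path

def expected_by_closed_form(j0, gamma, beta, steps):
    """Evaluate the time-homogeneous form of (A2) with the noise terms dropped."""
    path = [[0.0] * len(j0) for _ in range(steps + 1)]
    for j in range(len(j0)):
        for t in range(steps + 1):
            value = j0[j] * (1.0 + gamma[j]) ** t
            if j > 0:
                # Cells mutating at time s enter stage j at s + 1 and then grow.
                value += sum(
                    path[s][j - 1] * beta[j - 1] * (1.0 + gamma[j]) ** (t - 1 - s)
                    for s in range(t)
                )
            path[t][j] = value
    return path

def max_rel_gap(a, b):
    """Largest relative difference between two paths of equal shape."""
    return max(
        abs(x - y) / max(1.0, abs(x))
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )
```

For example, with `j0 = [1e8, 10.0, 0.0]`, `gamma = [0.0, 1e-5, 3e-2]`, and `beta = [1e-6, 1e-5]`, the two paths agree up to floating-point rounding, which is a quick sanity check on the index ranges in (A2).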
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ $(j = i, \ldots, k-1)$, and if $\gamma_i \neq \gamma_j$ if $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left\{\prod_{v=u}^{j-1} \beta_v\right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{\prod_{r=j-u}^{j-1} \beta_r\right\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left\{\prod_{v=u}^{j-1} \beta_v\right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{\prod_{r=j-u}^{j-1} \beta_r\right\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ $(u = 1, \ldots, j-i)$.

Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ if $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left(\prod_{v=u}^{k-2} \beta_v\right) \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2).
\tag{A4}
$$
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Figure 1: Histopathology lesions and genetic pathway of squamous cell carcinoma of Non-Small Cell Lung Cancer (NSCLC). Stages shown: normal epithelium, hyperplasia/metaplasia, dysplasia, carcinoma in situ, and invasive cancer.
Figure 2: A multistage model of uveal melanoma (adult human eye cancer).
tumors. In Section 4, assuming that we have some cancer incidence data such as the SEER data from NCI/NIH, we proceed to develop statistical models for these data from these multistage models of carcinogenesis. In Section 5, by combining the models in Sections 2–4, we proceed to develop a generalized Bayesian inference and Gibbs sampling procedures to estimate the unknown parameters, to validate the model, and to predict cancer incidence. As an example of application, in Section 6 we proceed to develop a multistage model of human eye cancer with inherited cancer cases as described in Figure 2. We will illustrate the model and methods by analyzing the SEER data of human eye cancer from NCI/NIH. Finally, in Section 7, we will discuss the usefulness of the model and the methods developed in this paper and point out some future research directions.
2 The Stochastic Multistage Model of Carcinogenesis with Inherited Cancer Cases
The $k$-stage multistage model of carcinogenesis views carcinogenesis as the end point of $k$ ($k \ge 2$) discrete, heritable, and irreversible events (mutations, genetic changes, or epigenetic changes), with intermediate cells subjected to stochastic proliferation and differentiation (Little [1], Tan [2, 3], Tan et al. [4, 5], Weinberg [6]). Let $N = J_0$ denote normal stem cells, $T$ the cancer tumors, and $J_i$ the $i$th-stage initiated cells arising from the $(i-1)$th-stage initiated cells ($i = 1, \ldots, k$) by some genetic and/or epigenetic changes. Then the model assumes $N \to J_1 \to J_2 \to \cdots \to J_k \to \text{Tumor}$, with the $J_i$ cells subject to stochastic proliferation (birth) and differentiation (death). Further, it assumes that each stem cell proceeds independently of other cells and that cancer tumors develop from primary $J_k$ cells by clonal expansion (stochastic birth and death), where primary $J_k$ cells are $J_k$ cells which arise directly from $J_{k-1}$ cells; see Yang and Chen [11].

For example, Figure 1 is a multistage pathway for squamous NSCLC (non-small cell lung cancer) as proposed by Osada and Takahashi [12] and Wistuba et al. [13]. Similarly, Figure 2 is the multistage model for uveal melanoma proposed by Landreville et al. [14] and Mensink et al. [15], while Figure 3 is the APC-$\beta$-catenin-Tcf pathway for human colon cancer (Tan et al. [8], Tan and Yan [16]).
Remark 1. To develop stochastic multistage models of carcinogenesis, in the literature (Little [1], Tan [2], Zheng [7]) it is conveniently assumed that the $J_k$ cells grow instantaneously into cancer tumors as soon as they are generated. In this case, the number of tumors is equal to the number of $J_k$ cells, and one may identify $J_k$ cells as tumors. It follows that the number of tumors is a Markov process and that the $J_k$ cells are transient cells. In these cases, one needs only to deal with $T(t)$ and $J_j$ cells with $j = 1, \ldots, k-1$. However, as shown by Yang and Chen [11], the number of tumors is much smaller than the total number of $J_k$ cells. Also, in many animal models and in cancer risk assessment of radiation, Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19] have shown that $T(t)$ is in general not Markov.

[Figure 3: The APC-$\beta$-catenin-Tcf-myc pathway for human colon cancer. (a) Sporadic (about 70-75%). (b) FAP (familial adenomatous polyps) (about 1%).]
To extend the above model to include hereditary cancers, observe that mutants of cancer genes exist in the population and that both germline cells (egg and sperm) and somatic cells may carry mutant alleles of cancer genes [2, 20]. Further, without exception, every human being develops from the embryo in his/her mother's womb (embryo stage; denote this time by 0), where stem cells of different organs divide and differentiate to develop the different organs, respectively (see Weinberg [6], Chapter 10). If both the egg and the sperm generating the embryo carry mutant alleles of relevant cancer genes, then the individual is a $J_2$-stage person at the embryo stage; if only one of the germline cells (egg or sperm) generating the embryo carries mutant alleles of cancer genes, then at the embryo stage the individual is a $J_1$-stage person. Similarly, the individual is a normal person ($N = J_0$ person) at the embryo stage if neither the egg nor the sperm generating the embryo carries mutant alleles of cancer genes. Refer to a person in the population as a $J_i$ ($i = 0, 1, 2$) person if he/she is a $J_i$-stage person at the embryo stage. Then, with respect to the cancer development in question, people in the population can be classified into 3 types: normal people ($N = J_0$ people), $J_1$ people, and $J_2$ people. Based on this classification, for normal people in the population the stochastic model of carcinogenesis is a $k$-stage multievent model given by $J_0 \to J_1 \to \cdots \to J_k \to \text{Tumor}$; for $J_1$ people it is a $(k-1)$-stage multievent model given by $J_1 \to J_2 \to \cdots \to J_k \to \text{Tumor}$; and for $J_2$ people it is a $(k-2)$-stage multievent model given by $J_2 \to \cdots \to J_k \to \text{Tumor}$.
To account for inherited cancer cases, let $p_1$ be the proportion of $J_1$ people in the population and $p_2$ the proportion of $J_2$ people in the population. In large human populations under steady-state conditions, one may practically assume that each $p_i$ is a constant independent of time (Crow and Kimura [21]). Then $p_0 = 1 - p_1 - p_2$ ($0 < p_1 + p_2 < 1$) is the proportion of normal people (i.e., $N = J_0$ people) in the population. Let $n$ be the population size and $n_i$ ($i = 0, 1, 2$) the number of $J_i$ people in the population, so that $\sum_{u=0}^{2} n_u = n$. Assume that $n$ is very large and that marriage between people in the population is random with respect to cancer genes; then, as shown in Crow and Kimura [21] (see also Tan [22], Chapter 2), the conditional probability distribution of $(n_1, n_2)$ given $n$ is 2-dimensional multinomial with parameters $(n; p_1, p_2)$. That is,

$$(n_1, n_2) \mid n \sim \text{Multinomial}(n; p_1, p_2). \tag{1}$$
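The classification into $J_0$, $J_1$, and $J_2$ people and the multinomial law in (1) can be illustrated with a minimal sketch. The function names and the numerical values of $n$, $p_1$, and $p_2$ below are hypothetical choices for illustration only; they are not estimates from the paper.

```python
import random


def sample_genotype_counts(n, p1, p2, rng):
    """Draw (n0, n1, n2) so that (n1, n2) | n ~ Multinomial(n; p1, p2).

    Each of the n people is independently a J1 person with probability p1,
    a J2 person with probability p2, and a normal (J0) person otherwise.
    """
    if p1 < 0.0 or p2 < 0.0 or not (0.0 < p1 + p2 < 1.0):
        raise ValueError("need p1, p2 >= 0 and 0 < p1 + p2 < 1")
    counts = [0, 0, 0]
    for _ in range(n):
        x = rng.random()
        if x < p1:
            counts[1] += 1
        elif x < p1 + p2:
            counts[2] += 1
        else:
            counts[0] += 1
    return tuple(counts)


def remaining_stages(k, i):
    """Number of stages a J_i person still has to pass in a k-stage model."""
    return k - i


# Hypothetical illustration values (not taken from the paper).
rng = random.Random(0)
n0, n1, n2 = sample_genotype_counts(100_000, 0.01, 0.0001, rng)
print("J0, J1, J2 people:", n0, n1, n2)
print("stages left for a J1 person when k = 4:", remaining_stages(4, 1))
```

The per-person loop keeps the sketch dependency-free; for very large $n$ a vectorized multinomial sampler would be the more practical choice.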
To derive the probability distribution of time to cancer under the above model, observe that during pregnancy the proliferation rates of all stem cells are quite high. Thus, with positive probability, $J_2$ people in the population may acquire additional genetic and/or epigenetic changes during pregnancy to become $J_3$-stage people at birth. Similarly, $J_1$ people may acquire genetic and/or epigenetic changes during pregnancy to become $J_2$ people at birth, and, although the probability is very small, normal people at the embryo stage may acquire some genetic and/or epigenetic changes during pregnancy to become $J_1$ people at birth. Because the probability of genetic and epigenetic changes is small, one may practically assume that a $J_i$ ($i = 0, 1, 2$) person at the embryo stage would only give rise to $J_i$ stem cells and possibly $J_{i+1}$ stem cells at birth. This is equivalent to assuming that $J_i$ people at the embryo stage would not generate $J_{i+j}$ ($j > 1$) stem cells at or before birth. This model is represented schematically in Figure 4. Notice that if $k = 2$, one may practically assume that with probability one a $J_2$ person at the embryo stage would develop cancer at or before birth ($t_0$). If $k = 3$, then with probability $\alpha$ ($\alpha > 0$) a $J_2$ person at the embryo stage would develop cancer at or before birth.

[Figure 4: Embryo genotypes and their frequencies at embryo stage and at birth. Panels: two-stage model; $k$-stage model ($k \ge 3$).]
3 The Stochastic Process of Carcinogenesis with Hereditary Cancer Cases and Mathematical Analysis
Because tumors are developed from primary $J_k$ cells, for the above stochastic model the identifiable response variables are $\{T(t), J_u(t; i), i = 0, 1, 2, u = i, i+1, \ldots, k-1\}$, where $T(t)$ is the number of cancer tumors at time $t$ and $J_u(t; i)$ is the number of $J_u$ ($u = i, i+1, \ldots, k-1$) cells at time $t$ in people who are $J_i$ people at the embryo stage (see [3, 5, 8, 23], Remarks 1 and 2). For people who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage, the stochastic model of carcinogenesis is then given by the stochastic process $\{\tilde{X}_i(t), T(t), t > 0\}$, where $\tilde{X}_i(t) = \{J_u(t; i), u = i, i+1, \ldots, k-1\}'$. For these processes, in the next subsections we will derive stochastic equations for the state variables ($J_u(t; i)$, $i = 0, 1, 2$, $u = i, \ldots, k-1$); we will also derive the probability distributions of these state variables and the probabilities of developing cancer tumors. These are the basic approaches for modeling carcinogenesis used by the first author and his associates; see Tan [3], Tan et al. [4, 5, 8, 23], Tan and Zhou [9], Tan and Yan [16], Tan and Chen [24, 25], and Remark 3.
Remark 2. At any time (say $t$), the total number of $J_k$ cells is equal to the total number of $J_k$ cells generated from $J_{k-1}$ cells at time $t$ plus the total number of $J_k$ cells generated by cell division from other $J_k$ cells at time $t$; the former $J_k$ cells are referred to as primary $J_k$ cells, while the latter are not primary $J_k$ cells. Since each tumor is developed from a single primary $J_k$ cell through a stochastic birth and death process, each primary $J_k$ cell will generate at most one tumor. It follows that at any time the total number of $J_k$ cells is considerably greater than the number of cancer tumors (see also Yang and Chen [11]). Thus, for generating cancer tumors, the only identifiable state variables are the number of $J_j$ cells ($j = 0, 1, \ldots, k-1$) and the number of detectable cancer tumors.
Remark 3. To model stochastic multistage models of carcinogenesis, the standard traditional approach is to assume that the last-stage cells (i.e., the $J_k$ cells in the model $N \to J_1 \to \cdots \to J_k \to \text{Tumor}$) grow instantaneously into a cancer tumor as soon as they are generated and then to apply the standard Markov theory to $T(t)$ and to the state variables $\tilde{X}(t) = \{J_i(t), i = 0, 1, \ldots, k-1\}$. This approach has been described in detail in Tan [2], Little [1], and Zheng [7]; see also Luebeck and Moolgavkar [26] and Durrett et al. [27]. However, in some cases the assumption of instantaneous growth of $J_k$ cells into cancer tumors may not be realistic (Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19]); in these cases $T(t)$ is not Markov, so the Markov theory method is not applicable to $T(t)$. To develop analytical results and to resolve many difficult issues, Tan and his associates [4, 5, 24] have proposed an alternative approach through stochastic equations and have followed Yang and Chen [11] in assuming that cancer tumors develop by clonal expansion from primary last-stage cells. Through the probability generating function method, Tan and Chen [24] have shown that if the Markov theory is applicable to $T(t)$, then the stochastic equation method is equivalent to the classical Markov theory method but is more powerful. Also, through the stochastic equation method, we have shown in the Appendix that the classical approach provides a close approximation to the discrete-time model under the assumption that the primary last-stage cells develop into a detectable tumor in one time unit. This provides a reasonable explanation of why the traditional approach (see [2, 22]) can still work well even though the Markov assumption for $T(t)$ may not hold. In this paper we will thus basically use the stochastic equation method and assume that cancer tumors develop from primary last-stage cells through clonal expansion.
3.1 Stochastic Equations for the State Variables

Assume now that an individual is a $J_i$ ($i = 0, 1, 2$) person at the embryo stage. Then in this individual cancer is developed by a $(k-i)$-stage multievent model given by $J_i \to \cdots \to J_k \to \text{Tumor}$, and the identifiable response variables are given by $\{\tilde{X}_i(t)', T(t)\}$. To derive stochastic equations for the staging variables in $\tilde{X}_i(t)$ in this individual, observe that for each $i = 0, 1, 2$, $\tilde{X}_i(t)$ is in general a Markov process although $T(t)$ may not be Markov; see Remark 1, Tan [3], Tan et al. [4, 5], Tan and Zhou [9], and Tan and Yan [16]. It follows that $\tilde{X}_i(t + \Delta t)$ derives from $\tilde{X}_i(t)$ through stochastic birth-death processes of $J_u$ ($u = i, i+1, \ldots, k-1$) cells and through stochastic transitions $J_u \to J_{u+1}$, $u = i, i+1, \ldots, k-1$, during $(t, t + \Delta t]$. Let $\{B_u(t; i), D_u(t; i), M_u(t; i)\}$ be the number of births of $J_u$ cells, the number of deaths of $J_u$ cells, and the number of transitions $J_u \to J_{u+1}$ during $(t, t + \Delta t]$, respectively, in people who are $J_i$ people at the embryo stage. Let $M_0(t)$ denote the number of transitions $N \to J_1$ during $(t, t + \Delta t]$. Because the transition $J_u \to J_{u+1}$ would not affect the number of $J_u$ cells but only increase the number of $J_{u+1}$ cells (see Remark 4), by the conservation law we have the following stochastic equations for $J_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$ (see Tan [3], Tan et al. [4, 5, 8], Tan and Zhou [9], and Tan and Yan [16]):

$$\begin{aligned}
J_i(t + \Delta t; i) &= J_i(t; i) + B_i(t; i) - D_i(t; i), \quad i = 0, 1, 2, \\
J_u(t + \Delta t; i) &= J_u(t; i) + B_u(t; i) - D_u(t; i) + M_{u-1}(t; i), \quad i < u \le k - 1.
\end{aligned} \tag{2}$$
Because $\{B_v(t; i), D_v(t; i), M_v(t; i), i = 0, 1, 2, v = i, \ldots, k-1\}$ are random variables, the above equations are basically stochastic equations. To derive probability distributions of these variables, let $b_u(t)$ and $d_u(t)$ denote the birth rate and the death rate at time $t$ of the $J_u$ ($u = 0, 1, \ldots, k-1$) cells, respectively. Let $\beta_u(t)$ ($u = 0, 1, \ldots, k-1$) be the transition rate at time $t$ from $J_u \to J_{u+1}$. Then, as shown in Tan [3], for ($i = 0, 1, 2$, $u = i, \ldots, k-1$) we have, to the order of $o(\Delta t)$,

$$\{B_u(t; i), D_u(t; i), M_u(t; i)\} \mid J_u(t; i) \sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t,\ \beta_u(t)\Delta t\}. \tag{3}$$
It follows that, to the order of $o(\Delta t)$,

$$\begin{aligned}
\{B_u(t; i), D_u(t; i)\} \mid J_u(t; i) &\sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t\}, \\
M_u(t; i) \mid J_u(t; i) &\sim \text{Binomial}\{J_u(t; i);\ \beta_u(t)\Delta t\} \\
&\sim \text{Poisson}\{J_u(t; i)\beta_u(t)\Delta t\} + o(\beta_u(t)\Delta t), \\
&\quad \text{independently of } \{B_u(t; i), D_u(t; i)\}, \quad u = 0, 1, \ldots, k-1.
\end{aligned} \tag{4}$$
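The update in (2) together with the multinomial draw in (3) can be sketched as a single simulation step. This is only an illustration under stated assumptions: the function name `step`, the list-based layout (index 0 stands for stage $J_i$), and all numerical rates are hypothetical, and the rates are taken as constant over the step.

```python
import random


def step(J, b, d, beta, dt, rng):
    """Advance the cell counts J by one interval of length dt.

    J[u], b[u], d[u], beta[u] refer to the u-th tracked stage, with index 0
    standing for stage J_i. Each cell independently divides (prob b*dt),
    dies (prob d*dt), produces one next-stage cell (prob beta*dt), or does
    nothing, which is the multinomial draw in (3).
    """
    m = len(J)
    births, deaths, moves = [0] * m, [0] * m, [0] * m
    for u in range(m):
        pb, pd, pm = b[u] * dt, d[u] * dt, beta[u] * dt
        if pb + pd + pm > 1.0:
            raise ValueError("dt is too large for these rates")
        for _ in range(J[u]):
            x = rng.random()
            if x < pb:
                births[u] += 1
            elif x < pb + pd:
                deaths[u] += 1
            elif x < pb + pd + pm:
                moves[u] += 1
    # Conservation law (2): a transition adds a cell to the next stage
    # without removing the parent cell from its own stage.
    new_J = []
    for u in range(m):
        inflow = moves[u - 1] if u > 0 else 0
        new_J.append(J[u] + births[u] - deaths[u] + inflow)
    return new_J, births, deaths, moves


# Hypothetical rates for a short demonstration (not from the paper).
rng = random.Random(42)
state = [1000, 0, 0]
for _ in range(100):
    state, B, D, M = step(state, [0.02, 0.05, 0.08], [0.02, 0.03, 0.04],
                          [0.001, 0.001, 0.001], 0.1, rng)
print(state)
```

The last entry of `moves` counts transitions out of the final tracked stage, that is, newly generated primary $J_k$ cells; these are returned but not stored in the state, in line with Remark 2.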
From these distribution results, by subtracting from the random transition variables their conditional means, respectively, we obtain the following stochastic equations for the state variables $\tilde{X}_i(t)$ ($i = 0, 1, \ldots, \mathrm{Min}(2, k-1)$):

$$\begin{aligned}
dJ_i(t; i) &= J_i(t + \Delta t; i) - J_i(t; i) = B_i(t; i) - D_i(t; i) \\
&= J_i(t; i)\gamma_i(t)\Delta t + e_i(t; i)\Delta t, \quad i = 0, 1, 2, \\
dJ_u(t; i) &= J_u(t + \Delta t; i) - J_u(t; i) = M_{u-1}(t; i) + B_u(t; i) - D_u(t; i) \\
&= \{J_{u-1}(t; i)\beta_{u-1}(t) + J_u(t; i)\gamma_u(t)\}\Delta t + e_u(t; i)\Delta t, \\
&\qquad i = 0, 1, \ldots, \mathrm{Min}(2, k-1),\ i < u \le k-1,
\end{aligned} \tag{5}$$
where $\gamma_u(t) = b_u(t) - d_u(t)$ for $u = 0, 1, \ldots, k-1$, and where $e_i(t; i)\Delta t = [B_i(t; i) - J_i(t; i)b_i(t)\Delta t] - [D_i(t; i) - J_i(t; i)d_i(t)\Delta t]$ for $i = 0, 1, 2$, and $e_u(t; i)\Delta t = [B_u(t; i) - J_u(t; i)b_u(t)\Delta t] - [D_u(t; i) - J_u(t; i)d_u(t)\Delta t] + [M_{u-1}(t; i) - J_{u-1}(t; i)\beta_{u-1}(t)\Delta t]$ for $i = 0, 1, \ldots, \mathrm{Min}(2, k-1)$, $i < u \le k-1$.

From the above equations, by dividing both sides by $\Delta t$ and letting $\Delta t \to 0$, we obtain

$$\begin{aligned}
\frac{d}{dt}J_i(t; i) &= J_i(t; i)\gamma_i(t) + e_i(t; i), \quad i = 0, 1, \ldots, \mathrm{Min}(2, k-1), \\
\frac{d}{dt}J_u(t; i) &= J_u(t; i)\gamma_u(t) + J_{u-1}(t; i)\beta_{u-1}(t) + e_u(t; i), \\
&\qquad \text{for } i = 0, 1, \ldots, \mathrm{Min}(2, k-1),\ i < u \le k-1.
\end{aligned} \tag{6}$$
In the above equations, using the distribution results in (4), it can easily be shown that the random noises $\{e_u(t; i), u = i, \ldots, k-1, i = 0, 1, 2\}$ have expected value zero and are uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $\{J_u(t_0; i) > 0, u = i, i+1;\ J_u(t_0; i) = 0, u > i+1\}$.

Given the initial conditions ($J_u(t_0; i) > 0$, $u = i, i+1$) and ($J_u(t_0; i) = 0$, $u > i+1$) at birth ($t_0$), the solution of the equations in (6) is given, respectively, by

$$\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t}\gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \mathrm{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\beta_{u-1}(x)\, e^{\int_{x}^{t}\gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i}\eta_u^{(v)}(t; i), \\
&\qquad u = i+1, \ldots, k-1, \\
&\qquad \text{where } i = 0 \text{ if } k = 2;\quad i = 0, 1 \text{ if } k = 3;\quad i = 0, 1, 2 \text{ if } k > 3,
\end{aligned} \tag{7}$$
where

$$\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy + \int_{t_0}^{x}\gamma_{u-1}(y)\,dy}\,\beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) &= \eta_u^{(1)}(t; i) = \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned} \tag{8}$$
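Because the noise terms have mean zero, taking expectations in (6) and in the first form of (7) gives two routes to the mean cell counts, and comparing them is a simple consistency check. The sketch below does this for two stages with constant rates and $t_0 = 0$; the function names, the Euler and trapezoid schemes, and the parameter values are illustrative assumptions, not the authors' procedure.

```python
import math


def mean_by_ode(j0, j1, g0, g1, beta0, t, steps=20000):
    """Euler integration of the mean equations obtained from (6).

    dm0/dt = g0*m0 and dm1/dt = g1*m1 + beta0*m0, starting at t0 = 0.
    """
    h = t / steps
    m0, m1 = float(j0), float(j1)
    for _ in range(steps):
        m0, m1 = m0 + h * g0 * m0, m1 + h * (g1 * m1 + beta0 * m0)
    return m0, m1


def mean_by_formula(j0, j1, g0, g1, beta0, t, steps=2000):
    """Mean of the first form of (7) for constant rates and t0 = 0.

    The convolution integral is evaluated with the trapezoid rule.
    """
    m0 = j0 * math.exp(g0 * t)
    h = t / steps

    def integrand(x):
        # E[J_0(x)] * beta0 * exp(g1 * (t - x))
        return j0 * math.exp(g0 * x) * beta0 * math.exp(g1 * (t - x))

    total = 0.5 * (integrand(0.0) + integrand(t))
    for n in range(1, steps):
        total += integrand(n * h)
    m1 = j1 * math.exp(g1 * t) + h * total
    return m0, m1


# Hypothetical parameter values, chosen only to exercise the two routes.
print(mean_by_ode(1e6, 10, 0.0, 0.05, 1e-6, 40.0))
print(mean_by_formula(1e6, 10, 0.0, 0.05, 1e-6, 40.0))
```

The two results should agree up to discretization error, which shrinks as `steps` grows.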
If the model is time homogeneous so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ if $i \ne u$, the above solutions under the initial conditions $\{J_{i+u}(t_0; i) > 0, u = 0, 1;\ J_{i+u}(t_0; i) = 0, u > 1\}$ then reduce, respectively, to

$$\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i(t - t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \mathrm{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u(t - t_0)} + \beta_{u-1}\int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u(t - x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u(t - t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i)\left(\prod_{v=r}^{u-1}\beta_v\right)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t - t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i)\left(\prod_{v=r}^{u-1}\beta_v\right)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t - t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned} \tag{9}$$
where, for $i \le v \le u$,

$$A_{iu}(v) = \begin{cases} 1, & \text{if } i = u, \\ \prod_{l=i,\ l \ne v}^{u}(\gamma_l - \gamma_v)^{-1}, & \text{if } i < u. \end{cases} \tag{10}$$
Obviously, $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$, $u = i, \ldots, k-1$, $r = 1, \ldots, u-i+1$). It follows that for ($i = 0, 1, \ldots, \mathrm{Min}(k-1, 2)$) the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ if $i \ne u$ are given by

$$\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)]\left(\prod_{u=i+1}^{k-2}\beta_u\right)\sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v(t - t_0)} \\
&\quad + E[J_i(t_0; i)]\left(\prod_{u=i}^{k-2}\beta_u\right)\sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v(t - t_0)}, \\
&\qquad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2),\ k \ge 2,
\end{aligned} \tag{11}$$

where, as a convention, $\sum_{j=i+1}^{i} c_j = 0$ and $\prod_{j=i+1}^{i} d_j = 1$ for all $(c_j, d_j)$.
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to 1 $J_u$ cell and 1 $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but only increase the size of the $J_{u+1}$ population.
3.2 Transition Probabilities and Probability Distributions of Staging Variables

Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n; p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and the probability distributions of the transition variables $\{B_r(t; i), D_r(t; i), M_r(t; i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t; i), r = i, \ldots, k-1\}$ given $\{J_r(t; i), r = i, \ldots, k-1\}$ for ($i = 0, 1, \ldots, \mathrm{Min}(k-1, 2)$):

$$\begin{aligned}
&P\{J_r(t + \Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\qquad \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned} \tag{12}$$
where

$$\begin{aligned}
&P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\quad = \sum_{r=0}^{u_i} f(r;\ u_i,\ b_i(t)\Delta t)\, f\!\left(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
&P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\} \\
&\quad = \sum_{r_1=0}^{u_j}\sum_{r_2=0}^{u_j - r_1} g(r_1, r_2;\ u_j,\ b_j(t)\Delta t,\ d_j(t)\Delta t)\, h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}\Delta t), \quad j > i.
\end{aligned} \tag{13}$$
Define the unobservable transition variables $\tilde{U}_i(t) = \{B_i(t; i), (B_j(t; i), D_j(t; i)), j = i+1, \ldots, k-1\}'$ ($i = 0, 1, \ldots, \mathrm{Min}(k-1, 2)$). Then we have, for the joint probability density function of $\{\tilde{X}_i(t + \Delta t), \tilde{U}_i(t)\}$ given $\tilde{X}_i(t)$,

$$P\{\tilde{X}_i(t + \Delta t), \tilde{U}_i(t) \mid \tilde{X}_i(t)\} = P\{\tilde{X}_i(t + \Delta t) \mid \tilde{U}_i(t), \tilde{X}_i(t)\} \times P\{\tilde{U}_i(t) \mid \tilde{X}_i(t)\}, \tag{14}$$
where

$$\begin{aligned}
P\{\tilde{X}_i(t + \Delta t) \mid \tilde{U}_i(t), \tilde{X}_i(t)\} &= f\!\left(J_i(t; i) - J_i(t + \Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\beta_{j-1}(t)\Delta t\},
\end{aligned} \tag{15}$$

$$\begin{aligned}
P\{\tilde{U}_i(t) \mid \tilde{X}_i(t)\} &= f(B_i(t; i);\ J_i(t; i),\ b_i(t)\Delta t) \\
&\quad \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i),\ b_j(t)\Delta t,\ d_j(t)\Delta t\}.
\end{aligned} \tag{16}$$
Let $\tilde{e}_i(j)$ be a $(k-i) \times 1$ column vector with 1 in the $j$th position ($1 \le j \le k-i$) and with 0 in other positions. Let $\tilde{u} = (u_i, \ldots, u_{k-1})'$ and $\tilde{v} = (v_i, \ldots, v_{k-1})'$ be $(k-i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14) to (16), it can readily be shown that

$$P\{\tilde{X}_i(t + \Delta t) = \tilde{v} \mid \tilde{X}_i(t) = \tilde{u}\} = \begin{cases}
[u_j b_j(t) + (1 - \delta_{ji})u_{j-1}\beta_{j-1}(t)]\Delta t + o(\Delta t), & \text{if } \tilde{v} = \tilde{u} + \tilde{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\Delta t + o(\Delta t), & \text{if } \tilde{v} = \tilde{u} - \tilde{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t), & \text{if } |\tilde{1}_{k-i}'(\tilde{u} - \tilde{v})| \ge 2.
\end{cases} \tag{17}$$
The above results imply that $\tilde{X}_i(t)$ is a $(k-i)$-dimensional birth-death process with birth rates $\{j b_u(t), u = i, \ldots, k-1\}$, death rates $\{j d_u(t), u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t), u = i, \ldots, k-1;\ \alpha_{u,v}(j, t) = 0 \text{ if } v \ne u+1\}$, where $j$ denotes the current number of $J_u$ cells (see Definition 4.1 in Tan [22], Chapter 4). Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j, j = i, \ldots, k-1 \mid J_i(0) = m_i, J_j(0) = 0, j > i\} = P(u_j, j = i, \ldots, k-1; t)$ ($i = 0, 1, \ldots, \mathrm{Min}(k-1, 2)$) in the above model is given by

$$\begin{aligned}
\frac{d}{dt}P(u_j, j = i, \ldots, k-1; t) &= P(u_i - 1, u_j, j = i+1, \ldots, k-1; t)\,(u_i - 1)\,b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j - 1)\,b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1}; t)\,u_j\beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j + 1)\,d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1}; t)\left\{\sum_{j=i}^{k-1} u_j[b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j\beta_j(t)\right\},
\end{aligned} \tag{18}$$
for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.

By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j, j = i, \ldots, k-1 \mid J_i(0) = m_i\} = P(u_j, j = i, \ldots, k-1; t)$ numerically.
3.3 The Probability Distributions of the Number of Detectable Tumors and Times to Tumors

As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22], Chapter 8, and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i), s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\beta_{k-1}(s)P_T(s, t)\,ds$. That is,

$$T(t) \mid \{J_{k-1}(s; i), s \le t\} \sim \text{Poisson}(\omega(t; i)). \tag{19}$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by
$$Q_i(j) = E\left\{e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)}\right\} = e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}), \tag{20}$$
where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)]\,P_T(x, t)\,dx$.
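As a rough illustration of (20), the sketch below evaluates the first-order approximation of $Q_i(j)$ from a table of values $H_i(t_j)$. The function names and the numerical inputs are hypothetical and chosen only for illustration; the $o(\beta_{k-1})$ remainder is simply dropped.

```python
import math

def interval_tumor_probability(beta_last, h_prev, h_curr):
    """First-order term of (20): exp(-beta*H(t_{j-1})) - exp(-beta*H(t_j))."""
    return math.exp(-beta_last * h_prev) - math.exp(-beta_last * h_curr)

def interval_probabilities(beta_last, h_values):
    """h_values[j] holds H_i(t_j) for j = 0..n; returns [Q_i(1), ..., Q_i(n)]."""
    return [
        interval_tumor_probability(beta_last, h_values[j - 1], h_values[j])
        for j in range(1, len(h_values))
    ]

# Hypothetical inputs (not taken from the paper): a small last-stage
# mutation rate and an increasing sequence H_i(t_0) = 0 < H_i(t_1) < ...
beta_last = 1e-6
H_values = [0.0, 1.0e3, 4.0e3, 9.0e3]
q = interval_probabilities(beta_last, H_values)
```

Because the terms telescope, the interval probabilities add up to $1 - e^{-\beta_{k-1} H_i(t_n)}$ when $H_i(t_0) = 0$, which gives a quick sanity check on any implementation.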
To derive $Q_i(j)$, denote by
$$\theta_{i(k-1)} = E[J_i(t_0; i-1)]\,\beta_i \prod_{u=i+1}^{k-1} \left(\frac{\beta_u}{\gamma_u}\right), \quad i = 1, \ldots, \text{Min}(3, k-1),$$
$$\lambda_{u(k-1)} = E[J_u(t_0; u)]\,\beta_u \prod_{v=u+1}^{k-1} \left(\frac{\beta_v}{\gamma_v}\right), \quad u = 0, 1, \ldots, \text{Min}(2, k-1), \tag{21}$$
and define the functions
$$\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \text{Min}(2, k-1). \tag{22}$$
Applying results of $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1,\ j > 0)$, and for $j > 0$,
$$Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1), \tag{23}$$
$$Q_1(j) = (1 - \alpha_1)\left\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\right\} + o(\beta_1). \tag{24}$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\,\alpha$, and for $j > 0$,
$$Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \tag{25}$$
$$Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \tag{26}$$
$$Q_2(j) = \delta_{2k}(1 - \alpha)\left\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\right\} + (1 - \delta_{2k})\left\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}), \tag{27}$$
where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to
$$\psi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \int_{t_0}^{t} \left[e^{\gamma_v (x - t_0)} - 1\right] P_T(x, t)\,dx. \tag{28}$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to
$$\psi_{ii}(t) = \frac{1}{\gamma_i}\left\{e^{\gamma_i (t - t_0)} - 1\right\}, \quad i = 1, \ldots, k-1,$$
$$\psi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\,\frac{1}{\gamma_v^2}\left\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\gamma_v\right\},$$
$$\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\,\frac{1}{\gamma_v}\left\{e^{\gamma_v (t - t_0)} - 1\right\}, \quad i = 1, \ldots, k-1. \tag{29}$$
4. Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases
For estimating unknown parameters and for validating the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0),\ (y_j, n_j),\ j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section, we will develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j;\ p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore
$$Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3, \tag{30}$$
where
$$\chi_k = \begin{cases} n_0 (p_2 + p_1 \alpha), & \text{if } k = 2, \\ n_0\, p_2\, \alpha, & \text{if } k = 3. \end{cases} \tag{31}$$
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is
$$D_0(k) = -2\left\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\right\} = 2\left\{\chi_k - y_0 - y_0 \log\frac{\chi_k}{y_0}\right\}, \quad k = 2, 3. \tag{32}$$
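The deviance in (32) is the usual Poisson deviance for a single count. A minimal sketch follows; the function name is hypothetical, and the convention $y \log(\mu / y) = 0$ at $y = 0$ is an assumption added here to keep the function defined for zero counts.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance 2*(mu - y - y*log(mu/y)) for one observed count y.

    Assumes mu > 0; the y = 0 case uses the limit y*log(mu/y) -> 0.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if y == 0:
        return 2.0 * mu
    return 2.0 * (mu - y - y * math.log(mu / y))

# Age 0 row of Table 1: 34 observed cases against 36 predicted by Model-F.
d0 = poisson_deviance(34, 36)
```

For the age 0 row of Table 1 this gives a deviance of roughly 0.11, and the deviance is exactly zero when the fitted mean equals the observed count.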
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).
Age group | Number of people at risk | Observed incidence | Model-F predicted | Model-1 predicted | Two-stage predicted
0 | 12495777 | 34 | 36 | 36 | 38
1 | 12221582 | 20 | 11 | 11 | 16
2 | 12120990 | 14 | 7 | 8 | 17
3 | 12112995 | 9 | 5 | 6 | 18
4 | 12146174 | 4 | 4 | 5 | 20
5 | 12161336 | 3 | 3 | 4 | 22
6 | 12111854 | 2 | 2 | 4 | 25
7 | 12160452 | 2 | 2 | 4 | 28
8 | 11942586 | 2 | 2 | 4 | 30
9 | 12381299 | 1 | 2 | 5 | 34
10 | 12512703 | 2 | 3 | 6 | 38
11 | 12410338 | 5 | 3 | 6 | 41
12 | 12449244 | 2 | 3 | 6 | 44
13 | 12527781 | 5 | 4 | 7 | 48
14 | 12602883 | 3 | 5 | 7 | 51
15 | 12719598 | 5 | 5 | 8 | 55
16 | 12766107 | 7 | 6 | 9 | 59
17 | 12831400 | 9 | 7 | 9 | 62
18 | 12382047 | 8 | 7 | 9 | 63
19 | 12581638 | 10 | 8 | 10 | 68
20 | 12636509 | 7 | 9 | 11 | 71
21 | 12682601 | 6 | 10 | 10 | 75
22 | 12840510 | 12 | 11 | 11 | 79
23 | 13075528 | 17 | 13 | 13 | 84
24 | 13358635 | 16 | 15 | 14 | 89
25 | 13473849 | 12 | 16 | 16 | 94
26 | 13426340 | 17 | 18 | 18 | 97
27 | 13525264 | 28 | 20 | 20 | 101
28 | 13149674 | 17 | 22 | 21 | 102
29 | 13812811 | 23 | 25 | 25 | 110
30 | 13886874 | 24 | 28 | 27 | 114
31 | 13488332 | 37 | 30 | 29 | 115
32 | 13460286 | 32 | 32 | 32 | 118
33 | 13256067 | 38 | 35 | 35 | 119
34 | 13428827 | 37 | 39 | 38 | 124
35 | 13220037 | 40 | 41 | 41 | 126
36 | 12870265 | 30 | 44 | 44 | 126
37 | 12689592 | 43 | 47 | 47 | 127
38 | 12157014 | 42 | 49 | 49 | 125
39 | 12494081 | 46 | 55 | 54 | 131
40 | 12272125 | 49 | 58 | 58 | 132
41 | 11826573 | 56 | 61 | 60 | 130
42 | 11663153 | 54 | 65 | 64 | 131
43 | 11407082 | 53 | 68 | 68 | 131
44 | 11296848 | 70 | 73 | 72 | 133
45 | 11016369 | 57 | 76 | 76 | 132
46 | 10651593 | 71 | 79 | 79 | 130
47 | 10475708 | 87 | 84 | 83 | 131
48 | 9994684 | 82 | 86 | 85 | 127
49 | 10138908 | 78 | 93 | 93 | 131
50 | 9836359 | 87 | 97 | 96 | 130
51 | 9475641 | 95 | 100 | 99 | 127
52 | 9250985 | 113 | 104 | 104 | 126
53 | 9027382 | 106 | 108 | 108 | 125
54 | 8883737 | 117 | 113 | 113 | 125
55 | 8547883 | 129 | 116 | 116 | 123
56 | 8279648 | 107 | 119 | 119 | 121
57 | 8062368 | 119 | 123 | 123 | 119
58 | 7654610 | 132 | 124 | 124 | 115
59 | 7563706 | 118 | 130 | 129 | 115
60 | 7232719 | 131 | 131 | 130 | 112
61 | 6927332 | 116 | 132 | 132 | 109
62 | 6708273 | 133 | 134 | 134 | 107
63 | 6543931 | 143 | 138 | 137 | 106
64 | 6404652 | 130 | 141 | 141 | 105
65 | 6168486 | 145 | 142 | 142 | 102
66 | 5913479 | 138 | 142 | 142 | 99
67 | 5746766 | 151 | 144 | 144 | 98
68 | 5480517 | 147 | 142 | 142 | 94
69 | 5363912 | 149 | 144 | 144 | 94
70 | 5110728 | 136 | 142 | 142 | 90
71 | 4925076 | 144 | 141 | 141 | 88
72 | 4696825 | 140 | 138 | 138 | 85
73 | 4512136 | 146 | 135 | 135 | 82
74 | 4345300 | 126 | 133 | 133 | 80
75 | 4148801 | 122 | 128 | 129 | 78
76 | 3900900 | 124 | 122 | 122 | 74
77 | 3681587 | 114 | 116 | 116 | 70
78 | 3481918 | 93 | 110 | 110 | 67
79 | 3243631 | 102 | 102 | 102 | 63
80 | 2961234 | 79 | 92 | 93 | 58
81 | 2724984 | 93 | 83 | 84 | 54
82 | 2495219 | 82 | 75 | 75 | 50
83 | 2271595 | 92 | 66 | 67 | 46
84 | 2041351 | 64 | 57 | 58 | 42
85 | 10466605 | 304 | 282 | 285 | 216
Note 1: for age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ $(j \ge 1)$. To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is
$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2,\ j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij},\ i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k})\, n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j;\ p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$
$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j}\ \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j};\ n_j, p_0, p_1)\, h\{y_j;\ Q_T(j)\}, \tag{34}$$
where $g(n_{0j}, n_{1j};\ n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j;\ p_0, p_1)$ and $h\{y_j;\ Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij},\ i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.
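To make the double sum in (34) concrete, the following sketch evaluates the mixture by brute force for a deliberately tiny population. All names and numerical values are hypothetical; with $n_j$ of the order of $10^7$ as in Table 1, direct enumeration of this kind would not be practical, so the code is only meant to show the structure of the mixture.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability h{y; mu}, with the convention for mu = 0."""
    if mu == 0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def multinomial_pmf(n0, n1, n, p0, p1):
    """Trinomial probability g(n0, n1; n, p0, p1), with p2 = 1 - p0 - p1."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    coef = math.comb(n, n0) * math.comb(n - n0, n1)
    return coef * p0**n0 * p1**n1 * p2**n2

def mixture_pmf(y, n, p0, p1, Q, k):
    """P(y_j | n_j) in (34); Q = (Q_0(j), Q_1(j), Q_2(j)), k = number of stages."""
    delta_2k = 1 if k == 2 else 0
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            q_total = n0 * Q[0] + n1 * Q[1] + (1 - delta_2k) * n2 * Q[2]
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_total)
    return total

# Hypothetical toy values: 20 people at risk, a 3-stage model.
prob_two_cases = mixture_pmf(2, 20, 0.9, 0.08, (0.01, 0.2, 0.5), 3)
```

Summing `mixture_pmf` over all counts `y` returns 1 up to rounding, since each multinomial weight multiplies a proper Poisson distribution.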
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j,\ j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is
$$L\{\Theta \mid y_j,\ j = 0, 1, \ldots, n_T\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k-1$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j},\ j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $(j = 1, \ldots, t_N)$ and for $k > 2$, the conditional probability distribution $P\{y_{ij},\ i = 0, 1 \mid y_j,\ n_{ij},\ i = 0, 1,\ n_j\}$ of $(Y_{ij},\ i = 0, 1)$ given $(Y_j = y_j,\ n_{ij},\ i = 0, 1,\ n_j)$ is
$$(y_{ij},\ i = 0, 1) \mid (y_j,\ n_{ij},\ i = 0, 1,\ n_j) \sim \text{Multinomial}\left(y_j;\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right). \tag{36}$$
Since the conditional probability distribution of $Y_j$ given $(n_{ij},\ i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij},\ i = 0, 1,\ y_j \mid n_{ij},\ i = 0, 1,\ n_j\}$ of $(Y_{ij},\ i = 0, 1,\ Y_j)$ given $(n_{ij},\ i = 0, 1,\ n_j)$
$$\begin{aligned} P\{y_{ij},\ i = 0, 1,\ y_j \mid n_{ij},\ i = 0, 1,\ n_j\} &= P\{y_{ij},\ i = 0, 1 \mid y_j,\ n_{ij},\ i = 0, 1, 2\} \times P\{y_j \mid n_{ij},\ i = 0, 1, 2\} \\ &= \frac{1}{y_j!}\, e^{-Q_T(j)} \{Q_T(j)\}^{y_j} \times g\left\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\} \\ &= \prod_{i=0}^{2} h\{y_{ij};\ n_{ij} Q_i(j)\}, \quad (k > 2), \end{aligned} \tag{37}$$
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
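The last equality in (37) is the standard Poisson splitting identity: a Poisson total allocated multinomially with probabilities proportional to the component means is a product of independent Poissons. The short check below verifies it numerically for one hypothetical set of means $n_{ij} Q_i(j)$ and counts $y_{ij}$; the numbers carry no biological meaning.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu > 0."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` with cell probabilities `probs`."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)

# Hypothetical component means n_ij * Q_i(j) and augmented counts y_ij.
means = [0.7, 1.9, 3.4]
counts = [1, 2, 4]

total_mean = sum(means)  # plays the role of Q_T(j)
lhs = poisson_pmf(sum(counts), total_mean) * multinomial_pmf(
    counts, [m / total_mean for m in means]
)
rhs = math.prod(poisson_pmf(c, m) for c, m in zip(counts, means))
```

The two sides agree to rounding error, which is what allows the augmented likelihood to factor over genotypes.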
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus, we have for $k = 2$
$$Y_j \mid (n_{ij},\ i = 0, 1,\ n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$
$$Y_{ij} \mid (Y_j = y_j,\ n_{ij},\ i = 0, 1,\ n_j) \sim \text{Binomial}\left(y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right), \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$, and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj},\ u = 0, 1, 2)$ is
$$\begin{aligned} P\{y_{0j},\ y_j \mid n_{uj},\ u = 0, 1,\ n_j\} &= P\{y_j \mid n_{ij},\ i = 0, 1,\ n_j\} \times P\{y_{0j} \mid y_j,\ n_{uj},\ u = 0, 1,\ n_j\} \\ &= \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}, \quad \text{if } k = 2, \end{aligned} \tag{40}$$
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N\}$, $\tilde{n} = \{n_j,\ j = 0, 1, \ldots, t_N\}$, and $\tilde{y} = \{y_j,\ j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \tilde{y})$ given $(\mathbf{N}, \tilde{n})$
$$P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \tilde{y}\}$ given $(\tilde{n}, \Theta)$ is
$$P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j};\ n_j, p_0, p_1) \times \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta] - \log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \hat{\Theta}]\}$ is
$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where
$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log\frac{\chi(k)}{y_0}\right\}, \tag{44}$$
$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{n_{0j} \log\frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log\frac{\hat{p}_i}{p_i}\right\}, \tag{45}$$
$$D_j = 2 \sum_{i=0}^{1} \left\{n_{ij} Q_i(j) - y_{ij} - y_{ij} \log\frac{n_{ij} Q_i(j)}{y_{ij}}\right\} + 2(1 - \delta_{2k}) \left\{n_{2j} Q_2(j) - y_{2j} - y_{2j} \log\frac{n_{2j} Q_2(j)}{y_{2j}}\right\}, \tag{46}$$
where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ $(i = 0, 1, 2)$.
The joint probability density function $P\{\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta\}$ of $(\mathbf{Y}, \tilde{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.
Under the discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results of expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
$$\beta_{k-1} E[G_i(t)] = \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} = \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t), \tag{48}$$
where $(\theta_{i(k-1)},\ i = 1, 2, 3)$ and $(\lambda_{i(k-1)},\ i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$'s are given by
$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, \frac{1}{\gamma_v} \left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v^2} \left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\right\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1,\ j > 0)$, and for $j > 0$,
$$Q_0(j) \approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1),$$
$$Q_1(j) \approx (1 - \alpha_1)\left\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\right\} + o(\beta_1). \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\,\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,
$$Q_0(j) \approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}),$$
$$Q_1(j) \approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}),$$
$$Q_2(j) \approx \delta_{2(k-1)}(1 - \alpha)\left\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\right\} + (1 - \delta_{2(k-1)})\left\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}). \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, respectively, as given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
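The closeness of the discrete and continuous forms can be checked numerically for the simplest case, the single-term function with coefficient $A_{ii}(i)$ taken to be 1 (an assumption made here only for illustration). The sketch also checks the geometric-sum identity used to obtain (48). Function names and numerical values are hypothetical.

```python
import math

def geometric_sum(a, t):
    """Direct evaluation of sum_{s=0}^{t-1} a**s."""
    return sum(a**s for s in range(t))

def phi_ii_discrete(gamma, t, t0=0):
    """Discrete-time form ((1 + gamma)**(t - t0) - 1) / gamma, gamma != 0."""
    return ((1.0 + gamma) ** (t - t0) - 1.0) / gamma

def psi_ii_continuous(gamma, t, t0=0):
    """Continuous-time form (exp(gamma*(t - t0)) - 1) / gamma, gamma != 0."""
    return (math.exp(gamma * (t - t0)) - 1.0) / gamma

# Hypothetical small net proliferation rate per time unit and 80 time units.
gamma, t = 0.001, 80
discrete = phi_ii_discrete(gamma, t)
continuous = psi_ii_continuous(gamma, t)
relative_gap = (continuous - discrete) / continuous
```

Since $\log(1 + \gamma) < \gamma$ for $\gamma > 0$, the discrete form is slightly smaller than the continuous one, and the relative gap shrinks as $\gamma$ decreases.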
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \tilde{n},\ p_i,\ i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$
where $c$ is a positive constant if these parameters satisfy some biologically specified constraints and is equal to zero otherwise. These biological constraints are as follows:
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k-1)$, we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
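The constraints (i)–(iv) amount to an indicator function on the parameter space, so the prior (53) can be coded as a constant on the admissible region and zero elsewhere. The sketch below is one possible encoding; the dictionary keys and the example values are hypothetical and not part of the original model specification.

```python
def in_prior_support(params):
    """Return True when the parameters satisfy constraints (i)-(iv).

    Hypothetical layout of `params`:
      p1, p2  : genotype proportions
      gamma   : [gamma_1, gamma_2]
      omega0  : N(t0) * beta_0
      beta    : [beta_1, ..., beta_{k-1}]
      lam     : [lambda_0, lambda_1, lambda_2]
      theta   : [theta_1, theta_2]
    """
    checks = [
        0 < params["p1"] < 1e-2,
        0 < params["p2"] < 1e-6,
        all(-0.01 < g < 1 for g in params["gamma"]),
        1 < params["omega0"] < 1000,
        all(1e-8 < b < 1e-3 for b in params["beta"]),
        all(0 < lam < 10 for lam in params["lam"][:2]),
        0 < params["lam"][2] < 1e3,
        0 < params["theta"][0] < 1e-2,
        0 < params["theta"][1] < 1e-1,
    ]
    return all(checks)

def prior_density(params, c=1.0):
    """Unnormalized partially informative prior: c inside the region, 0 outside."""
    return c if in_prior_support(params) else 0.0

# Hypothetical parameter values lying inside the admissible region.
example = {
    "p1": 5e-3, "p2": 5e-7, "gamma": [0.05, 0.1], "omega0": 50.0,
    "beta": [1e-6, 1e-5], "lam": [1.0, 2.0, 100.0], "theta": [1e-3, 1e-2],
}
inside = prior_density(example)
```

In an optimization or sampling routine, a zero prior value can be used to reject a proposed parameter vector before any likelihood evaluation.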
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$, we obtain
$$P\{p_i,\ i = 1, 2,\ \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto (\chi(k))^{y_0}\, e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,$$
$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} \{Q_i(j)\}^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{t_N} D_j$ given by (46).
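The first conditional density in (54) is easy to evaluate on the log scale. The sketch below codes its logarithm, with $\chi(k)$ taken from (31); the function names, the argument layout, and the numerical values are hypothetical, and `s0`, `s1`, `s2` stand for the sums $\sum_j n_{0j}$, $\sum_j n_{1j}$, $\sum_j n_{2j}$.

```python
import math

def chi(k, n0, p1, p2, alpha):
    """chi(k) from (31): n0*(p2 + p1*alpha) for k = 2, n0*p2*alpha for k = 3."""
    if k == 2:
        return n0 * (p2 + p1 * alpha)
    if k == 3:
        return n0 * p2 * alpha
    raise ValueError("cases at birth require k = 2 or k = 3")

def log_kernel_p(p1, p2, alpha, k, n0, y0, s0, s1, s2):
    """Log of the first conditional density in (54), up to an additive constant."""
    p0 = 1.0 - p1 - p2
    if not (0 < p1 < 1 and 0 < p2 < 1 and 0 < alpha < 1 and p0 > 0):
        return -math.inf
    x = chi(k, n0, p1, p2, alpha)
    return (y0 * math.log(x) - x
            + s1 * math.log(p1) + s2 * math.log(p2) + s0 * math.log(p0))

# Hypothetical augmented counts: 9900 normal, 90 J_1, and 10 J_2 individuals.
near = log_kernel_p(0.009, 0.001, 0.5, 3, 1000, 2, 9900, 90, 10)
far = log_kernel_p(0.05, 0.001, 0.5, 3, 1000, 2, 9900, 90, 10)
```

As expected, the log kernel is larger near the empirical genotype frequencies (`near`) than at a value of $p_1$ far from them (`far`).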
53 The Multilevel Gibbs Sampling Procedure For EstimatingUnknown Parameters Given the posterior probability distri-butions we will use the following multilevel Gibbs samplingprocedure to derive estimates of the parameters We noticethat numerically the Gibbs sampling procedure given belowis equivalent to the EM-algorithm from the sampling theoryviewpoint with Steps 1 and 2 as the 119864-Step and with Steps 3and 4 as the119872-Step respectively [29]Thesemultilevel Gibbssampling procedures are given by the following
Step 1 (Generating N Given (Y˜119910
˜119899 Θ) (The Data-Augmen-
tation Step 1)) Given Θ and given˜119899 use the multinomial
distribution of 1198991119895 119899
2119895 given 119899
119895in Section 3 to generate
a large sample of N Then by combining this large samplewith 119875Y
˜119910 | N
˜119899 Θ in (37) and (40) to select N through
the weighted bootstrap method due to Smith and Gelfand[30] This selected N is then a sample from 119875N | Y
˜119910
˜119899 Θ
even though the latter is unknown (For proof see Tan [22]Chapter 3) Call the generated sample N
Step 2 (Generating Y Given (N˜119910
˜119899 Θ) (The Data-Augmen-
tation Step 2)) Given
˜119910
˜119899 Θ and given N = N generated
from Step 1 generate Y from the probability distribution119875Y | N
˜119910
˜119899 Θ given by (36) and (38) Call the generated
sample Y
Step 3 (Estimation of 119901119894 119894 = 1 2 120572 Given Θ
1NY
˜119910
˜119899)
Given
˜119910
˜119899 Θ
1 and given (NY) = (N Y) from Steps 1
and 2 derive the posterior mode of 119901119894 119894 = 1 2 120572 by
maximizing the conditional posterior distribution 119875119901119894 119894 =
1 2 120572 | Θ1 N Y
˜119910
˜119899 Under the partially informative prior
this is equivalent to maximize the negative of the deviance1198630(119896) + Dev(119901
1 119901
2) given by (44)-(45) in Section 43 under
the constraints given in Section 51 Denote this generatedmode by 119901
119894 119894 = 1 2
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, N, Y, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(N, Y, p_i, i = 1, 2, \alpha) = (\hat{N}, \hat{Y}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{N}, \hat{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode as $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(N, Y, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{N}, \hat{Y}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are generated values from the posterior distribution of $\Theta$ given $(\underset{\sim}{y}, \underset{\sim}{n})$, independently of $(N, Y)$ (for proof, see Tan [22], Chapter 3). Repeating the above procedure, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $(\underset{\sim}{y}, \underset{\sim}{n})$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
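The weighted bootstrap selection used in Step 1 can be sketched as follows. This is only a minimal illustration of the sampling-resampling idea of Smith and Gelfand [30], not the authors' implementation: the function name `weighted_bootstrap`, the toy binomial proposal, the Poisson weight, and all numbers are assumptions made for the example and are unrelated to the SEER data.

```python
import math
import random

def weighted_bootstrap(proposals, log_weight, n_draws, rng):
    """Resample `proposals` with probability proportional to exp(log_weight(x))."""
    log_w = [log_weight(x) for x in proposals]
    shift = max(log_w)                      # subtract the max for numerical stability
    weights = [math.exp(lw - shift) for lw in log_w]
    return rng.choices(proposals, weights=weights, k=n_draws)

def log_poisson(y, mean):
    """Log of the Poisson probability of y given its mean."""
    return y * math.log(mean) - mean - math.lgamma(y + 1)

# Toy setting (hypothetical): an unobserved count N is proposed from
# Binomial(n, p) and reweighted by the likelihood of an observed count y_obs,
# taken here to be Poisson with mean rate * N.
rng = random.Random(0)
n, p, rate, y_obs = 50, 0.3, 0.5, 12
proposals = [sum(rng.random() < p for _ in range(n)) for _ in range(5000)]
proposals = [x for x in proposals if x > 0]   # the Poisson mean must be positive

selected = weighted_bootstrap(
    proposals, lambda N: log_poisson(y_obs, rate * N), 2000, rng
)

prior_mean = sum(proposals) / len(proposals)
post_mean = sum(selected) / len(selected)
print(f"proposal mean of N: {prior_mean:.2f}, resampled mean of N: {post_mean:.2f}")
```

In this toy setting the resampled values concentrate on larger $N$ than the proposals, because the observed count favors a larger mean; the same mechanism lets the selected sample approximate a conditional distribution whose closed form is unknown.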
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is the retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells) which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. The same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Hence, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood    AIC         BIC
Three-stage, Model-F    -1312.02          2648.04     2677.35
Three-stage, Model-1    -1295.76          2609.53     2631.51
Two-stage               -4392.421         8796.843    8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model (observed and predicted incidences plotted against age in years).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly: the chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F, whereas the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, by assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from biological observations one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume $E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.

(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
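The information criteria quoted in observation (a) can be checked against their standard definitions, AIC $= 2m - 2\ln L$ and BIC $= m \ln n - 2\ln L$ for $m$ free parameters and $n$ observations. The sketch below does this for Model-F only. It rests on assumptions that should be read as such: the decimal placement of the log-likelihood ($-1312.02$) is reconstructed from the extracted digits, $m = 12$ is taken from the degrees of freedom quoted in (a), and $n = 85$ is an assumed sample size chosen because it reproduces the tabulated BIC.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: 2m - 2 ln L."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: m ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_lik

# Assumed inputs for Model-F (see the caveats in the lead-in).
log_lik_F = -1312.02   # reconstructed decimal placement
m_F = 12               # number of free parameters implied by df = 86 - 12
n_obs = 85             # assumed number of observations

aic_F = aic(log_lik_F, m_F)
bic_F = bic(log_lik_F, m_F, n_obs)
print(round(aic_F, 2), round(bic_F, 2))
```

Under these assumptions both values agree, to two decimals, with the Model-F entries of Table 2 when those entries are read as 2648.04 and 2677.35.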
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution that accounts for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter      Estimate    StD         95% CL-Lower    95% CL-Upper
$\lambda_0$    2.88E-01    1.21E-01    5.03E-02        5.25E-01
$\lambda_1$    4.81E-01    4.65E-02    3.89E-01        5.72E-01
$\lambda_2$    6.55E+02    1.12E+02    4.34E+02        8.75E+02
$\gamma_1$     2.71E-05    8.86E-06    5.34E-06        4.01E-05
$\gamma_2$     3.37E-02    1.92E-04    3.33E-02        3.41E-02
$p_1$          9.95E-04    1.75E-06    9.91E-04        9.98E-04
$p_2$          9.68E-07    6.48E-09    9.55E-07        9.81E-07
$\alpha$       8.41E-01    4.19E-02    7.59E-01        9.23E-01
$\theta_1$     3.29E-04    3.54E-05    2.60E-04        3.98E-04
$\theta_2$     3.41E-01    1.07E-02    3.20E-01        3.62E-01

Model-1
Parameter      Estimate    StD         95% CL-Lower    95% CL-Upper
$\lambda_0$    6.55E-02    2.54E-02    1.57E-02        1.15E-01
$\lambda_1$    3.72E-01    4.63E-02    2.82E-01        4.63E-01
$\lambda_2$    1.98E+02    3.38E+01    1.32E+02        2.65E+02
$\gamma_1$     NA          NA          NA              NA
$\gamma_2$     3.49E-02    3.58E-03    2.79E-02        4.19E-02
$p_1$          9.98E-04    6.27E-05    8.75E-04        1.12E-03
$p_2$          1.00E-06    4.53E-08    9.11E-07        1.09E-06
$\alpha$       8.07E-01    7.06E-02    6.68E-01        9.45E-01
$\theta_1$     NA          NA          NA              NA
$\theta_2$     NA          NA          NA              NA

Note: NA: assumed nonexistence.

To use the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($Y$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{Y, \underset{\sim}{y} \mid N, \Theta\}$ given by (37) and (40)).

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that the human eye cancer is derived by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $J_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring completely cancer progression; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.

Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be our future research topic; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations of the state variables, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$
where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ if $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left(\prod_{v=u}^{j-1} \beta_v\right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left(\prod_{r=j-u}^{j-1} \beta_r\right) \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left(\prod_{v=u}^{j-1} \beta_v\right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left(\prod_{r=j-u}^{j-1} \beta_r\right) \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$

where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_i)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ if $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left(\prod_{v=u}^{k-2} \beta_v\right) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
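The expected numbers can also be obtained numerically by iterating the expectation of (A1) directly. The sketch below is illustrative only and assumes that the random noises $e_j$ have expectation zero, as their definition as deviations from conditional means suggests, so that $E[J_j(t+1)] = E[J_j(t)](1 + \gamma_j) + E[J_{j-1}(t)]\beta_{j-1}$ in the time-homogeneous case. The function name `expected_stage_counts` and all rate values are hypothetical and chosen only to exercise the recursion.

```python
def expected_stage_counts(init, gamma, beta, steps):
    """Iterate E[J_j(t+1)] = E[J_j(t)](1 + gamma_j) + E[J_{j-1}(t)] beta_{j-1}.

    init[j]  : expected number of stage-j cells at birth (t0)
    gamma[j] : net proliferation rate of stage-j cells per time unit
    beta[j]  : transition rate from stage j to stage j+1 per time unit
    """
    state = list(init)
    for _ in range(steps):
        nxt = []
        for j, value in enumerate(state):
            # inflow from the previous stage uses the state at time t, not t+1
            inflow = state[j - 1] * beta[j - 1] if j > 0 else 0.0
            nxt.append(value * (1.0 + gamma[j]) + inflow)
        state = nxt
    return state

# Hypothetical rates (stages 0, 1, 2; person normal at the embryo stage).
init = [1e8, 0.0, 0.0]
gamma = [0.0, 1e-5, 3e-2]
beta = [1e-6, 1e-6]
steps = 20

result = expected_stage_counts(init, gamma, beta, steps)

# Closed forms for the first two stages when gamma_0 = 0:
closed_first = init[0] * (1.0 + gamma[0]) ** steps
closed_second = init[0] * beta[0] * ((1.0 + gamma[1]) ** steps - 1.0) / gamma[1]
print(result, closed_first, closed_second)
```

For the first two stages the iteration agrees with the geometric-sum closed forms, which is the same structure that (A4) expresses through the coefficients $A_{u(k-1)}(r)$.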
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multistage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Figure 3: The APC-$\beta$-catenin-Tcf-myc pathway for human colon cancer. Panels: (a) sporadic (about 70–75%); (b) FAP (familial adenomatous polyps) (about 1%). The diagram traces normal cells ($N$) through loss of the second copy of APC, dysplastic ACF, Ras, loss of the second copy of Smad4/DCC, and P53 to carcinomas.
transient cells. In these cases, one needs only to deal with $T(t)$ and $J_j$ cells with $j = 1, \ldots, k-1$. However, as shown by Yang and Chen [11], the number of tumors is much smaller than the total number of $J_k$ cells. Also, in many animal models and in cancer risk assessment of radiation, Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19] have shown that $T(t)$ is in general not Markov.
To extend the above model to include hereditary cancers, observe that mutants of cancer genes exist in the population and that both germline cells (egg and sperm) and somatic cells may carry mutant alleles of cancer genes [2, 20]. Further, without exception, every human being develops from the embryo in his/her mother's womb (embryo stage; denote time by 0), where stem cells of different organs divide and differentiate to develop different organs, respectively (see Weinberg [6], Chapter 10). If both the egg and the sperm generating the embryo carry mutant alleles of relevant cancer genes, then the individual is a $J_2$-stage person at the embryo stage; if only one of the germline cells (egg or sperm) generating the embryo carries mutant alleles of cancer genes, then at the embryo stage the individual is a $J_1$-stage person. Similarly, the individual is a normal person ($N = J_0$ person) at the embryo stage if neither the egg nor the sperm generating the embryo carries mutant alleles of cancer genes. Refer to a person in the population as a $J_i$ ($i = 0, 1, 2$) person if he/she is a $J_i$-stage person at the embryo stage. Then, with respect to the cancer development in question, people in the population can be classified into 3 types: normal people ($N = J_0$ people), $J_1$ people, and $J_2$ people. Based on this classification, for normal people in the population the stochastic model of carcinogenesis is a $k$-stage multievent model given by $J_0 \to J_1 \to \cdots \to J_k \to \mathit{Tumor}$; for $J_1$ people in the population, the stochastic model of carcinogenesis is a $(k-1)$-stage multievent model given by $J_1 \to J_2 \to \cdots \to J_k \to \mathit{Tumor}$; and for $J_2$ people in the population, the stochastic model of carcinogenesis is a $(k-2)$-stage multievent model given by $J_2 \to \cdots \to J_k \to \mathit{Tumor}$.
To account for inherited cancer cases, let $p_1$ be the proportion of $J_1$ people in the population and $p_2$ the proportion of $J_2$ people in the population. In large human populations under steady-state conditions, one may practically assume that the $p_i$ are constants independent of time (Crow and Kimura [21]). Then $p_0 = 1 - p_1 - p_2$ ($0 < p_1 + p_2 < 1$) is the proportion of normal people (i.e., $N = J_0$ people) in the population. Let $n$ be the population size and $n_i$ ($i = 0, 1, 2$) the number of $J_i$ people in the population, so that $\sum_{u=0}^{2} n_u = n$. Assume that $n$ is very large and that marriage between people in the population is random with respect to cancer genes; then, as shown in Crow and Kimura [21] (see also Tan [22], Chapter 2), the conditional probability distribution of $(n_1, n_2)$ given $n$ is 2-dimensional multinomial with parameters $(n; p_1, p_2)$. That is,

$$
(n_1, n_2) \mid n \sim \text{Multinomial}(n; p_1, p_2). \tag{1}
$$
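The multinomial segregation in (1) can be illustrated with a short simulation. This is a minimal sketch using only the Python standard library; the function name `sample_genotype_counts`, the population size, and the values of $p_1$ and $p_2$ are hypothetical and serve only to show how the counts $(n_0, n_1, n_2)$ arise.

```python
import random
from collections import Counter

def sample_genotype_counts(n, p1, p2, rng):
    """Draw (n0, n1, n2) from Multinomial(n; p0, p1, p2) with p0 = 1 - p1 - p2."""
    if not (0 < p1 + p2 < 1):
        raise ValueError("need 0 < p1 + p2 < 1")
    p0 = 1.0 - p1 - p2
    # Each person is independently N = J0, J1, or J2 at the embryo stage.
    draws = rng.choices((0, 1, 2), weights=(p0, p1, p2), k=n)
    counts = Counter(draws)
    return counts[0], counts[1], counts[2]

rng = random.Random(1)
n = 200_000                 # hypothetical population size
p1, p2 = 1e-3, 1e-6         # illustrative orders of magnitude only

n0, n1, n2 = sample_genotype_counts(n, p1, p2, rng)
print(n0, n1, n2, "expected number of J1 people:", n * p1)
```

With proportions this small, nearly everyone is a normal person at the embryo stage, the $J_1$ count fluctuates around $n p_1$, and $J_2$ people are rare even in a large population.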
To derive the probability distribution of time to cancer under the above model, observe that during pregnancy the proliferation rates of all stem cells are quite high. Thus, with positive probability, $J_2$ people in the population may acquire additional genetic and/or epigenetic changes during pregnancy to become $J_3$-stage people at birth. Similarly, $J_1$ people may acquire genetic and/or epigenetic changes during pregnancy to become $J_2$ people at birth, albeit with very small probability; normal people at the embryo stage may acquire some genetic and/or epigenetic changes during pregnancy to become $J_1$ people at birth. Because the probability of genetic and epigenetic changes is small, one may practically assume that a $J_i$ ($i = 0, 1, 2$) person at the embryo stage would only give rise to $J_i$ stem cells and possibly $J_{i+1}$ stem cells at birth. This is equivalent to assuming that $J_i$ people at the embryo stage would not generate $J_{i+j}$ ($j > 1$) stem cells at or before birth. This model is represented schematically in Figure 4. Notice that if $k = 2$, one may practically assume that, with probability one, a $J_2$ person at the embryo stage would develop cancer at or before birth ($t_0$). If $k = 3$, then with probability $\alpha$ ($\alpha > 0$) a $J_2$ person at the embryo stage would develop cancer at or before birth.

Figure 4: Embryo genotypes and their frequencies at embryo stage and at birth. Panels: two-stage model; $k$-stage model ($k \geq 3$).
3. The Stochastic Process of Carcinogenesis with Hereditary Cancer Cases and Mathematical Analysis
Because tumors are developed from primary $J_k$ cells, for the above stochastic model the identifiable response variables are $T(t)$ and $\{J_u(t; i), i = 0, 1, 2, u = i, i+1, \ldots, k-1\}$, where $T(t)$ is the number of cancer tumors at time $t$ and $J_u(t; i)$ is the number of $J_u$ ($u = i, i+1, \ldots, k-1$) cells at time $t$ in people who are $J_i$ people at the embryo stage (see [3, 5, 8, 23] and Remarks 1 and 2). For people who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage, the stochastic model of carcinogenesis is then given by the stochastic process $\{\underset{\sim}{X}_i(t), T(t), t > 0\}$, where $\underset{\sim}{X}_i(t) = \{J_u(t; i), u = i, i+1, \ldots, k-1\}'$. For these processes, in the next subsections we derive stochastic equations for the state variables ($J_u(t; i)$, $i = 0, 1, 2$, $u = i, \ldots, k-1$); we also derive the probability distributions of these state variables and the probabilities of developing cancer tumors. These are the basic approaches for modeling carcinogenesis used by the first author and his associates; see Tan [3], Tan et al. [4, 5, 8, 23], Tan and Zhou [9], Tan and Yan [16], Tan and Chen [24, 25], and Remark 3.
Remark 2. At any time (say $t$), the total number of $J_k$ cells is equal to the total number of $J_k$ cells generated from $J_{k-1}$ cells at time $t$ plus the total number of $J_k$ cells generated by cell division from other $J_k$ cells at time $t$; the former $J_k$ cells are referred to as primary $J_k$ cells, while the latter are not primary $J_k$ cells. Since each tumor is developed from a single primary $J_k$ cell through a stochastic birth and death process, each primary $J_k$ cell will generate at most one tumor. It follows that at any time the total number of $J_k$ cells is considerably greater than the number of cancer tumors (see also Yang and Chen [11]). Thus, for generating cancer tumors, the only identifiable state variables are the numbers of $J_j$ cells ($j = 0, 1, \ldots, k-1$) and the number of detectable cancer tumors.
Remark 3. To model stochastic multistage models of carcinogenesis, the standard traditional approach is to assume that the last stage cells (i.e., the $J_k$ cells in the model $N \to J_1 \to \cdots \to J_k \to \text{Tumor}$) grow instantaneously into a cancer tumor as soon as they are generated, and then to apply the standard Markov theory to $T(t)$ and to the state variables $\tilde{X}(t) = \{J_i(t),\ i = 0, 1, \ldots, k-1\}$. This approach has been described in detail in Tan [2], Little [1], and Zheng [7]; see also Luebeck and Moolgavkar [26] and Durrett et al. [27]. However, in some cases the assumption of instantaneous growth of $J_k$ cells into cancer tumors may not be realistic (Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19]); in these cases $T(t)$ is not Markov, so the Markov theory method is not applicable to $T(t)$. To develop analytical results and to resolve many difficult issues, Tan and his associates [4, 5, 24] have proposed an alternative approach through stochastic equations and have followed Yang and Chen [11] in assuming that cancer tumors develop by clonal expansion from primary last stage cells. Through the probability generating function method, Tan and Chen [24] have shown that if the Markov theory is applicable to $T(t)$, then the stochastic equation method is equivalent to the classical Markov theory method but is more powerful. Also, through the stochastic equation method, we have shown in the Appendix that the classical approach provides a close approximation to the discrete time model under the assumption that the primary last stage cells develop into a detectable tumor in one time unit. This provides a reasonable explanation of why the traditional approach (see [2, 22]) can still work well even though the Markov assumption for $T(t)$ may not hold. In this paper we will thus basically use the stochastic equation method and assume that cancer tumors develop from primary last stage cells through clonal expansion.
3.1 Stochastic Equations for the State Variables. Assume now that an individual is a $J_i$ ($i = 0, 1, 2$) person at the embryo stage. Then in this individual cancer is developed by a $(k-i)$-stage multievent model given by $J_i \to \cdots \to J_k \to \text{Tumor}$, and the identifiable response variables are given by $\{\tilde{X}_i(t)', T(t)\}$. To derive stochastic equations for the staging variables in $\tilde{X}_i(t)$ in this individual, observe that for each $i = 0, 1, 2$, $\tilde{X}_i(t)'$ is in general a Markov process, although $T(t)$ may not be Markov; see Remark 1, Tan [3], Tan et al. [4, 5], Tan and Zhou [9], and Tan and Yan [16]. It follows that $\tilde{X}_i(t + \Delta t)'$ derives from $\tilde{X}_i(t)'$ through stochastic birth-death processes of $J_u$ ($u = i, i+1, \ldots, k-1$) cells and through stochastic transitions $J_u \to J_{u+1}$, $u = i, i+1, \ldots, k-1$, during $(t, t + \Delta t]$. Let $\{B_u(t; i), D_u(t; i), M_u(t; i)\}$ be the number of births of $J_u$ cells, the number of deaths of $J_u$ cells, and the number of transitions $J_u \to J_{u+1}$ during $(t, t + \Delta t]$, respectively, in people who are $J_i$ people at the embryo stage. Let $M_0(t)$ denote the number of transitions $N \to J_1$ during $(t, t + \Delta t]$. Because the transition $J_u \to J_{u+1}$ would not affect the number of $J_u$ cells but only increase the number of $J_{u+1}$ cells (see Remark 4), by the conservation law we have the following stochastic equations for $J_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$ (see Tan [3], Tan et al. [4, 5, 8], Tan and Zhou [9], and Tan and Yan [16]):

$$
\begin{aligned}
J_i(t + \Delta t; i) &= J_i(t; i) + B_i(t; i) - D_i(t; i), \quad i = 0, 1, 2, \\
J_u(t + \Delta t; i) &= J_u(t; i) + B_u(t; i) - D_u(t; i) + M_{u-1}(t; i), \quad i < u \le k-1.
\end{aligned}
\tag{2}
$$
Because $\{B_v(t; i), D_v(t; i), M_v(t; i)\}$, $i = 0, 1, 2$, $v = i, \ldots, k-1$, are random variables, the above equations are basically stochastic equations. To derive probability distributions of these variables, let $b_u(t)$ and $d_u(t)$ denote the birth rate and the death rate at time $t$ of the $J_u$ ($u = 0, 1, \ldots, k-1$) cells, respectively. Let $\beta_u(t)$ ($u = 0, 1, \ldots, k-1$) be the transition rate at time $t$ from $J_u \to J_{u+1}$. Then, as shown in Tan [3], for ($i = 0, 1, 2$, $u = i, \ldots, k-1$) we have, to the order of $o(\Delta t)$,

$$
\{B_u(t; i), D_u(t; i), M_u(t; i)\} \mid J_u(t; i) \sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t,\ \beta_u(t)\Delta t\}.
\tag{3}
$$
It follows that, to the order of $o(\Delta t)$,

$$
\begin{aligned}
\{B_u(t; i), D_u(t; i)\} \mid J_u(t; i) &\sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t\}, \\
M_u(t; i) \mid J_u(t; i) &\sim \text{Binomial}\{J_u(t; i);\ \beta_u(t)\Delta t\} \\
&\sim \text{Poisson}\{J_u(t; i)\beta_u(t)\Delta t\} + o(\beta_u(t)\Delta t), \\
&\quad \text{independently of } \{B_u(t; i), D_u(t; i)\}, \quad u = 0, 1, \ldots, k-1.
\end{aligned}
\tag{4}
$$
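The conservation equations (2) together with the multinomial law (3) can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions: the function names `step_stage` and `step_chain` are hypothetical, the rates are taken as constant over one step, and list index 0 stands for the first stage $J_i$ present in the individual.

```python
import random

def step_stage(j_u, b, d, beta, dt, rng):
    """Draw (births, deaths, transitions) for j_u cells over one step dt.

    Each cell independently divides, dies, mutates, or does nothing, with
    probabilities b*dt, d*dt, beta*dt, and the remainder, as in (3).
    """
    p_birth, p_death, p_mut = b * dt, d * dt, beta * dt
    assert p_birth + p_death + p_mut <= 1.0, "dt too large for these rates"
    births = deaths = transitions = 0
    for _ in range(j_u):
        x = rng.random()
        if x < p_birth:
            births += 1
        elif x < p_birth + p_death:
            deaths += 1
        elif x < p_birth + p_death + p_mut:
            transitions += 1
    return births, deaths, transitions

def step_chain(cells, b, d, beta, dt, rng):
    """Advance every stage by one step using the conservation equations (2)."""
    draws = [step_stage(n, b[u], d[u], beta[u], dt, rng)
             for u, n in enumerate(cells)]
    new_cells = []
    for u, n in enumerate(cells):
        births, deaths, _ = draws[u]
        # A transition J_{u-1} -> J_u adds a J_u cell without removing a J_{u-1} cell.
        inflow = draws[u - 1][2] if u > 0 else 0
        new_cells.append(n + births - deaths + inflow)
    return new_cells, draws
```

Note that the transition count of a stage never appears with a minus sign in that stage's own update, which is the point made in Remark 4. The per-cell loop is chosen for transparency and is only practical for small cell counts.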
From these distribution results, by subtracting from the random transition variables their conditional means, respectively, we obtain the following stochastic equations for the state variables $\tilde{X}_i(t)$ ($i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$):

$$
\begin{aligned}
dJ_i(t; i) &= J_i(t + \Delta t; i) - J_i(t; i) = B_i(t; i) - D_i(t; i) \\
&= J_i(t; i)\gamma_i(t)\Delta t + e_i(t; i)\Delta t, \quad i = 0, 1, 2, \\
dJ_u(t; i) &= J_u(t + \Delta t; i) - J_u(t; i) = M_{u-1}(t; i) + B_u(t; i) - D_u(t; i) \\
&= \{J_{u-1}(t; i)\beta_{u-1}(t) + J_u(t; i)\gamma_u(t)\}\Delta t + e_u(t; i)\Delta t, \\
&\quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \quad i < u \le k-1,
\end{aligned}
\tag{5}
$$
where $\gamma_u(t) = b_u(t) - d_u(t)$ for $u = 0, 1, \ldots, k-1$, and where $e_i(t; i)\Delta t = [B_i(t; i) - J_i(t; i)b_i(t)\Delta t] - [D_i(t; i) - J_i(t; i)d_i(t)\Delta t]$ for $i = 0, 1, 2$, and $e_u(t; i)\Delta t = [B_u(t; i) - J_u(t; i)b_u(t)\Delta t] - [D_u(t; i) - J_u(t; i)d_u(t)\Delta t] + [M_{u-1}(t; i) - J_{u-1}(t; i)\beta_{u-1}(t)\Delta t]$ for $i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$, $i < u \le k-1$.

From the above equations, by dividing both sides by $\Delta t$ and letting $\Delta t \to 0$, we obtain

$$
\begin{aligned}
\frac{d}{dt}J_i(t; i) &= J_i(t; i)\gamma_i(t) + e_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
\frac{d}{dt}J_u(t; i) &= J_u(t; i)\gamma_u(t) + J_{u-1}(t; i)\beta_{u-1}(t) + e_u(t; i), \\
&\quad \text{for } i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \quad i < u \le k-1.
\end{aligned}
\tag{6}
$$
In the above equations, using the distribution results in (4), it can easily be shown that the random noises $e_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$, have expected value zero and are uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $\{J_u(t_0; i) > 0,\ u = i, i+1;\ J_u(t_0; i) = 0,\ u > i+1\}$. Given the initial conditions ($J_u(t_0; i) > 0$, $u = i, i+1$) and ($J_u(t_0; i) = 0$, $u > i+1$) at birth ($t_0$), the solution of the equations in (6) is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t}\gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\beta_{u-1}(x)\, e^{\int_{x}^{t}\gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i}\eta_u^{(v)}(t; i), \\
&\quad u = i+1, \ldots, k-1, \\
&\quad \text{where } i = 0 \text{ if } k = 2; \quad i = 0, 1 \text{ if } k = 3; \quad i = 0, 1, 2 \text{ if } k > 3,
\end{aligned}
\tag{7}
$$
where

$$
\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy + \int_{t_0}^{x}\gamma_{u-1}(y)\,dy}\,\beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) &= \eta_u^{(1)}(t; i) = \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned}
\tag{8}
$$
If the model is time homogeneous, so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ if $i \ne u$, the above solutions under the initial conditions $\{J_{i+u}(t_0; i) > 0,\ u = 0, 1;\ J_{i+u}(t_0; i) = 0,\ u > 1\}$ then reduce, respectively, to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i(t - t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u(t - t_0)} + \beta_{u-1}\int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u(t - x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u(t - t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i)\Big(\prod_{v=r}^{u-1}\beta_v\Big)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t - t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i)\Big(\prod_{v=r}^{u-1}\beta_v\Big)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t - t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned}
\tag{9}
$$
where, for $i \le v \le u$,

$$
A_{iu}(v) =
\begin{cases}
1 & \text{if } i = u, \\
\displaystyle\prod_{l=i,\ l \ne v}^{u} (\gamma_l - \gamma_v)^{-1} & \text{if } i < u.
\end{cases}
\tag{10}
$$
Obviously, $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$, $u = i, \ldots, k-1$, $r = 1, \ldots, u-i+1$). It follows that, for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$), the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ if $i \ne u$ are given by

$$
\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)]\Big(\prod_{u=i+1}^{k-2}\beta_u\Big)\sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v(t - t_0)} \\
&\quad + E[J_i(t_0; i)]\Big(\prod_{u=i}^{k-2}\beta_u\Big)\sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v(t - t_0)}, \\
&\quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \quad k \ge 2,
\end{aligned}
\tag{11}
$$

where, as a convention, $\sum_{j=i+1}^{i} c_j = 0$ and $\prod_{j=i+1}^{i} d_j = 1$ for all $(c_j, d_j)$.
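Taking expectations in (6) for a time homogeneous model removes the noise terms and leaves a linear system for the means, $dm_u/dt = \gamma_u m_u + \beta_{u-1} m_{u-1}$. The sketch below integrates that system numerically and compares the second stage with the closed form obtained by carrying out the integral in the first line of (9). It is an illustration only: the function names and the parameter values used with it are hypothetical, index 0 stands for stage $J_i$, and $t_0 = 0$ is assumed.

```python
import math

def mean_rhs(m, gamma, beta):
    """Right-hand side of the mean equations obtained from (6) with E[e_u] = 0."""
    out = []
    for u in range(len(m)):
        inflow = beta[u - 1] * m[u - 1] if u > 0 else 0.0
        out.append(gamma[u] * m[u] + inflow)
    return out

def integrate_means(m0, gamma, beta, t, steps=2000):
    """Classical fourth-order Runge-Kutta integration of the means from 0 to t."""
    h = t / steps
    m = list(m0)
    for _ in range(steps):
        k1 = mean_rhs(m, gamma, beta)
        k2 = mean_rhs([x + 0.5 * h * k for x, k in zip(m, k1)], gamma, beta)
        k3 = mean_rhs([x + 0.5 * h * k for x, k in zip(m, k2)], gamma, beta)
        k4 = mean_rhs([x + h * k for x, k in zip(m, k3)], gamma, beta)
        m = [x + h * (a + 2 * b + 2 * c + d) / 6
             for x, a, b, c, d in zip(m, k1, k2, k3, k4)]
    return m

def second_stage_mean(m0, gamma, beta, t):
    """Mean of the second stage: own growth plus the convolution of the
    first-stage mean with exp(gamma_1 * (t - x)), evaluated in closed form.
    Requires gamma[0] != gamma[1]."""
    g0, g1 = gamma[0], gamma[1]
    own = m0[1] * math.exp(g1 * t)
    fed = beta[0] * m0[0] * (math.exp(g0 * t) - math.exp(g1 * t)) / (g0 - g1)
    return own + fed
```

For longer chains the same numerical integration applies unchanged, which gives an independent check on any closed-form expression for $E[J_{k-1}(t; i)]$.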
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to one $J_u$ cell and one $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but only increase the size of the $J_{u+1}$ population.
3.2 Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n, p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and the probability distributions of the transition variables $\{B_r(t; i), D_r(t; i), M_r(t; i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t; i),\ r = i, \ldots, k-1\}$ given $\{J_r(t; i),\ r = i, \ldots, k-1\}$ for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$):

$$
\begin{aligned}
&P\{J_r(t + \Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\qquad \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned}
\tag{12}
$$
where

$$
\begin{aligned}
&P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\quad = \sum_{r=0}^{u_i} f(r;\ u_i,\ b_i(t)\Delta t)\, f\Big(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
&P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\} \\
&\quad = \sum_{r_1=0}^{u_j}\sum_{r_2=0}^{u_j - r_1} g(r_1, r_2;\ u_j,\ b_j(t)\Delta t,\ d_j(t)\Delta t)\, h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}(t)\Delta t), \quad j > i.
\end{aligned}
\tag{13}
$$
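The first transition probability in (13) is a finite sum over the number of births $r$, so it can be evaluated directly. The sketch below does this for constant rates; it is an illustration under stated assumptions, with hypothetical function names, and it treats the binomial probability as zero outside its support so that impossible combinations of births and deaths drop out of the sum.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x; n, p): binomial probability, zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p)**(n - x)

def first_stage_transition(v, u, b, d, dt):
    """P{J_i(t+dt) = v | J_i(t) = u}, summing over the number of births r.

    Given r births, the number of deaths u - v + r is binomial among the
    remaining u - r cells with probability d*dt / (1 - b*dt).
    """
    p_birth = b * dt
    p_death = d * dt / (1.0 - p_birth)
    total = 0.0
    for r in range(u + 1):
        total += binom_pmf(r, u, p_birth) * binom_pmf(u - v + r, u - r, p_death)
    return total
```

Two simple checks follow from the construction: the probabilities over $v = 0, \ldots, 2u$ sum to one, and the conditional mean equals $u\{1 + (b - d)\Delta t\}$, in agreement with the drift term in (5).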
Define the unobservable transition variables $\tilde{U}_i(t) = \{B_i(t; i),\ (B_j(t; i), D_j(t; i)),\ j = i+1, \ldots, k-1\}'$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$). Then we have, for the joint probability density function of $\{\tilde{X}_i(t + \Delta t), \tilde{U}_i(t)\}$ given $\tilde{X}_i(t)$,

$$
P\{\tilde{X}_i(t + \Delta t), \tilde{U}_i(t) \mid \tilde{X}_i(t)\} = P\{\tilde{X}_i(t + \Delta t) \mid \tilde{U}_i(t), \tilde{X}_i(t)\} \times P\{\tilde{U}_i(t) \mid \tilde{X}_i(t)\},
\tag{14}
$$
where

$$
\begin{aligned}
P\{\tilde{X}_i(t + \Delta t) \mid \tilde{U}_i(t), \tilde{X}_i(t)\} &= f\Big(J_i(t; i) - J_i(t + \Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\beta_{j-1}(t)\Delta t\},
\end{aligned}
\tag{15}
$$

$$
\begin{aligned}
P\{\tilde{U}_i(t) \mid \tilde{X}_i(t)\} &= f(B_i(t; i);\ J_i(t; i),\ b_i(t)\Delta t) \\
&\quad \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i),\ b_j(t)\Delta t,\ d_j(t)\Delta t\}.
\end{aligned}
\tag{16}
$$
Let $\tilde{e}_i(j)$ be a $(k-i) \times 1$ column vector with 1 in the $j$th position ($1 \le j \le k-i$) and with 0 in other positions. Let $\tilde{u} = (u_i, \ldots, u_{k-1})'$ and $\tilde{v} = (v_i, \ldots, v_{k-1})'$ be $(k-i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)–(16), it can readily be shown that

$$
P\{\tilde{X}_i(t + \Delta t) = \tilde{v} \mid \tilde{X}_i(t) = \tilde{u}\} =
\begin{cases}
[u_j b_j(t) + (1 - \delta_{ji})u_{j-1}\beta_{j-1}(t)]\Delta t + o(\Delta t) & \text{if } \tilde{v} = \tilde{u} + \tilde{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\Delta t + o(\Delta t) & \text{if } \tilde{v} = \tilde{u} - \tilde{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t) & \text{if } |\tilde{1}'_{n-k}(\tilde{u} - \tilde{v})| \ge 2.
\end{cases}
\tag{17}
$$
The above results imply that $\tilde{X}_i(t)$ is a $(k-i)$-dimensional birth-death process with birth rates $\{i b_u(t),\ u = i, \ldots, k-1\}$, death rates $\{i d_u(t),\ u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t),\ u = i, \ldots, k-1;\ \beta_{u,v}(j, t) = 0 \text{ if } v \ne u+1\}$ (see Definition 4.1 in Tan [22, Chapter 4]). Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i,\ J_j(0) = 0,\ j > i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$) in the above model is given by

$$
\begin{aligned}
\frac{d}{dt}P(u_j,\ j = i, \ldots, k-1;\ t) &= P(u_i - 1,\ u_j,\ j = i+1, \ldots, k-1;\ t)\,(u_i - 1)\,b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j - 1)\,b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1};\ t)\,u_j\,\beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j + 1)\,d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1};\ t)\Big\{\sum_{j=i}^{k-1} u_j[b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j\beta_j(t)\Big\},
\end{aligned}
\tag{18}
$$

for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.

By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ numerically.
3.3 The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22, Chapter 8] and in Tan and Chen [24].) Then, as shown in Tan ([3]; [22, Chapter 8]), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i),\ s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\beta_{k-1}(s)P_T(s, t)\,ds$. That is,

$$
T(t) \mid \{J_{k-1}(s; i),\ s \le t\} \sim \text{Poisson}(\omega(t; i)).
\tag{19}
$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by

$$
\begin{aligned}
Q_i(j) &= E\{e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)}\} \\
&= e^{-\beta_{k-1}H_i(t_{j-1})} - e^{-\beta_{k-1}H_i(t_j)} + o(\beta_{k-1}),
\end{aligned}
\tag{20}
$$

where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)]P_T(x, t)\,dx$.
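The leading term of (20) is straightforward to evaluate numerically once $E[J_{k-1}(x; i)]$ is available. The sketch below is a simplified illustration, not the authors' procedure: it assumes $P_T(x, t) = 1$ (so every primary last stage cell is counted as a detectable tumor immediately), takes the mean cell count as a user-supplied function, and uses hypothetical function names.

```python
import math

def cumulative_h(mean_cells, t0, t, steps=1000):
    """Trapezoidal approximation of H(t), the integral of E[J_{k-1}(x)] over
    [t0, t], with P_T(x, t) set to 1 (an assumption of this sketch)."""
    if t <= t0:
        return 0.0
    h = (t - t0) / steps
    total = 0.5 * (mean_cells(t0) + mean_cells(t))
    for n in range(1, steps):
        total += mean_cells(t0 + n * h)
    return total * h

def interval_probabilities(mean_cells, beta_last, ages, t0=0.0):
    """Leading term of (20): exp(-beta*H(t_{j-1})) - exp(-beta*H(t_j)) for
    each pair of consecutive ages."""
    surv = [math.exp(-beta_last * cumulative_h(mean_cells, t0, t)) for t in ages]
    return [surv[j - 1] - surv[j] for j in range(1, len(ages))]
```

Because the terms telescope, the probabilities over consecutive age intervals add up to one minus the tumor-free probability at the last age, which provides a quick sanity check on any implementation.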
To derive $Q_i(j)$, denote by

$$
\begin{aligned}
\theta_{i(k-1)} &= E[J_i(t_0; i-1)]\,\beta_i\prod_{u=i+1}^{k-1}\Big(\frac{\beta_u}{\gamma_u}\Big), \quad i = 1, \ldots, \operatorname{Min}(3, k-1), \\
\lambda_{u(k-1)} &= E[J_u(t_0; u)]\,\beta_u\prod_{v=u+1}^{k-1}\Big(\frac{\beta_v}{\gamma_v}\Big), \quad u = 0, 1, \ldots, \operatorname{Min}(2, k-1),
\end{aligned}
\tag{21}
$$
and define the functions

$$
\psi_{i(k-1)}(t) = \Big(\prod_{u=i+1}^{k-1}\gamma_u\Big)\sum_{v=i}^{k-1} A_{i(k-1)}(v)\int_{t_0}^{t} e^{\gamma_v(x - t_0)}P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1).
\tag{22}
$$
Applying the results for $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.

(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$, $j > 0$), and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1),
\tag{23}
$$

$$
Q_1(j) = (1 - \alpha_1)\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\} + o(\beta_1).
\tag{24}
$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\alpha$, and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{25}
$$

$$
Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{26}
$$

$$
\begin{aligned}
Q_2(j) &= \delta_{2k}(1 - \alpha)\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\} \\
&\quad + (1 - \delta_{2k})\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\} \\
&\quad + o(\beta_{k-1}),
\end{aligned}
\tag{27}
$$

where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$
\begin{aligned}
\psi_{0(k-1)}(t) &= \Big(\prod_{u=1}^{k-1}\gamma_u\Big)\sum_{v=0}^{k-1} A_{0(k-1)}(v)\int_{t_0}^{t} e^{\gamma_v(x - t_0)}P_T(x, t)\,dx \\
&= \Big(\prod_{u=1}^{k-1}\gamma_u\Big)\sum_{v=1}^{k-1} A_{1(k-1)}(v)\int_{t_0}^{t}[e^{\gamma_v(x - t_0)} - 1]P_T(x, t)\,dx.
\end{aligned}
\tag{28}
$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$
\begin{aligned}
\psi_{ii}(t) &= \frac{1}{\gamma_i}\{e^{\gamma_i(t - t_0)} - 1\}, \quad i = 1, \ldots, k-1, \\
\psi_{0(k-1)}(t) &= \Big(\prod_{u=1}^{k-1}\gamma_u\Big)\sum_{v=1}^{k-1} A_{1(k-1)}(v)\,\frac{1}{\gamma_v^2}\{e^{\gamma_v(t - t_0)} - 1 - (t - t_0)\gamma_v\}, \\
\psi_{i(k-1)}(t) &= \Big(\prod_{u=i+1}^{k-1}\gamma_u\Big)\sum_{v=i}^{k-1} A_{i(k-1)}(v)\,\frac{1}{\gamma_v}\{e^{\gamma_v(t - t_0)} - 1\}, \quad i = 1, \ldots, k-1.
\end{aligned}
\tag{29}
$$
4 Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases
For estimating unknown parameters and validating the model, one would need real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0),\ (y_j, n_j),\ j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section we will develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j;\ p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j;\ p_i\}$, $i = 0, 1, 2$. In what follows we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1 The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ ($i = 0, 1, 2$) people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore

$$
Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3,
\tag{30}
$$

where

$$
\chi_k =
\begin{cases}
n_0(p_2 + p_1\alpha) & \text{if } k = 2, \\
n_0 p_2\alpha & \text{if } k = 3.
\end{cases}
\tag{31}
$$
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2\alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is

$$
\begin{aligned}
D_0(k) &= -2\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\} \\
&= 2\Big\{(\chi_k - y_0) - y_0\log\frac{\chi_k}{y_0}\Big\}, \quad k = 2, 3.
\end{aligned}
\tag{32}
$$
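The Poisson mean in (31) and the deviance in (32) are simple enough to compute directly. The following sketch is an illustration only; the function names are hypothetical, and the case $y_0 = 0$ is handled by the usual limit $y_0\log(\chi_k / y_0) \to 0$, which is an addition not spelled out in the text.

```python
import math

def poisson_log_pmf(y, lam):
    """log h(y; lam) for a Poisson distribution with mean lam > 0."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def chi_k(n0, p1, p2, alpha, k):
    """Poisson mean of Y_0 from (31) for the 2-stage or 3-stage model."""
    if k == 2:
        return n0 * (p2 + p1 * alpha)
    if k == 3:
        return n0 * p2 * alpha
    raise ValueError("cancer cases at birth require k = 2 or k = 3")

def birth_deviance(y0, chi):
    """Deviance (32) of Y_0 ~ Poisson(chi) against the fit chi_hat = y0."""
    if y0 == 0:
        return 2.0 * chi  # limiting value, since y0*log(chi/y0) -> 0
    return 2.0 * ((chi - y0) - y0 * math.log(chi / y0))
```

As a usage example, the age 0 row of Table 1 gives $y_0 = 34$, so `birth_deviance(34, chi)` evaluates the deviance for any candidate mean `chi`; it is zero when `chi` equals 34 and agrees with twice the log-likelihood difference computed from `poisson_log_pmf`.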
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

| Age group | Number of people at risk | Observed incidence | Model-F predicted | Model-1 predicted | Two-stage predicted |
|---|---|---|---|---|---|
| 0 | 12495777 | 34 | 36 | 36 | 38 |
| 1 | 12221582 | 20 | 11 | 11 | 16 |
| 2 | 12120990 | 14 | 7 | 8 | 17 |
| 3 | 12112995 | 9 | 5 | 6 | 18 |
| 4 | 12146174 | 4 | 4 | 5 | 20 |
| 5 | 12161336 | 3 | 3 | 4 | 22 |
| 6 | 12111854 | 2 | 2 | 4 | 25 |
| 7 | 12160452 | 2 | 2 | 4 | 28 |
| 8 | 11942586 | 2 | 2 | 4 | 30 |
| 9 | 12381299 | 1 | 2 | 5 | 34 |
| 10 | 12512703 | 2 | 3 | 6 | 38 |
| 11 | 12410338 | 5 | 3 | 6 | 41 |
| 12 | 12449244 | 2 | 3 | 6 | 44 |
| 13 | 12527781 | 5 | 4 | 7 | 48 |
| 14 | 12602883 | 3 | 5 | 7 | 51 |
| 15 | 12719598 | 5 | 5 | 8 | 55 |
| 16 | 12766107 | 7 | 6 | 9 | 59 |
| 17 | 12831400 | 9 | 7 | 9 | 62 |
| 18 | 12382047 | 8 | 7 | 9 | 63 |
| 19 | 12581638 | 10 | 8 | 10 | 68 |
| 20 | 12636509 | 7 | 9 | 11 | 71 |
| 21 | 12682601 | 6 | 10 | 10 | 75 |
| 22 | 12840510 | 12 | 11 | 11 | 79 |
| 23 | 13075528 | 17 | 13 | 13 | 84 |
| 24 | 13358635 | 16 | 15 | 14 | 89 |
| 25 | 13473849 | 12 | 16 | 16 | 94 |
| 26 | 13426340 | 17 | 18 | 18 | 97 |
| 27 | 13525264 | 28 | 20 | 20 | 101 |
| 28 | 13149674 | 17 | 22 | 21 | 102 |
| 29 | 13812811 | 23 | 25 | 25 | 110 |
| 30 | 13886874 | 24 | 28 | 27 | 114 |
| 31 | 13488332 | 37 | 30 | 29 | 115 |
| 32 | 13460286 | 32 | 32 | 32 | 118 |
| 33 | 13256067 | 38 | 35 | 35 | 119 |
| 34 | 13428827 | 37 | 39 | 38 | 124 |
| 35 | 13220037 | 40 | 41 | 41 | 126 |
| 36 | 12870265 | 30 | 44 | 44 | 126 |
| 37 | 12689592 | 43 | 47 | 47 | 127 |
| 38 | 12157014 | 42 | 49 | 49 | 125 |
| 39 | 12494081 | 46 | 55 | 54 | 131 |
| 40 | 12272125 | 49 | 58 | 58 | 132 |
| 41 | 11826573 | 56 | 61 | 60 | 130 |
| 42 | 11663153 | 54 | 65 | 64 | 131 |
| 43 | 11407082 | 53 | 68 | 68 | 131 |
| 44 | 11296848 | 70 | 73 | 72 | 133 |
| 45 | 11016369 | 57 | 76 | 76 | 132 |
| 46 | 10651593 | 71 | 79 | 79 | 130 |
| 47 | 10475708 | 87 | 84 | 83 | 131 |
| 48 | 9994684 | 82 | 86 | 85 | 127 |
| 49 | 10138908 | 78 | 93 | 93 | 131 |
| 50 | 9836359 | 87 | 97 | 96 | 130 |
| 51 | 9475641 | 95 | 100 | 99 | 127 |
| 52 | 9250985 | 113 | 104 | 104 | 126 |
| 53 | 9027382 | 106 | 108 | 108 | 125 |
| 54 | 8883737 | 117 | 113 | 113 | 125 |
| 55 | 8547883 | 129 | 116 | 116 | 123 |
| 56 | 8279648 | 107 | 119 | 119 | 121 |
| 57 | 8062368 | 119 | 123 | 123 | 119 |
| 58 | 7654610 | 132 | 124 | 124 | 115 |
| 59 | 7563706 | 118 | 130 | 129 | 115 |
| 60 | 7232719 | 131 | 131 | 130 | 112 |
| 61 | 6927332 | 116 | 132 | 132 | 109 |
| 62 | 6708273 | 133 | 134 | 134 | 107 |
| 63 | 6543931 | 143 | 138 | 137 | 106 |
| 64 | 6404652 | 130 | 141 | 141 | 105 |
| 65 | 6168486 | 145 | 142 | 142 | 102 |
| 66 | 5913479 | 138 | 142 | 142 | 99 |
| 67 | 5746766 | 151 | 144 | 144 | 98 |
| 68 | 5480517 | 147 | 142 | 142 | 94 |
| 69 | 5363912 | 149 | 144 | 144 | 94 |
| 70 | 5110728 | 136 | 142 | 142 | 90 |
| 71 | 4925076 | 144 | 141 | 141 | 88 |
| 72 | 4696825 | 140 | 138 | 138 | 85 |
| 73 | 4512136 | 146 | 135 | 135 | 82 |
| 74 | 4345300 | 126 | 133 | 133 | 80 |
| 75 | 4148801 | 122 | 128 | 129 | 78 |
| 76 | 3900900 | 124 | 122 | 122 | 74 |
| 77 | 3681587 | 114 | 116 | 116 | 70 |
| 78 | 3481918 | 93 | 110 | 110 | 67 |
| 79 | 3243631 | 102 | 102 | 102 | 63 |
| 80 | 2961234 | 79 | 92 | 93 | 58 |
| 81 | 2724984 | 93 | 83 | 84 | 54 |
| 82 | 2495219 | 82 | 75 | 75 | 50 |
| 83 | 2271595 | 92 | 66 | 67 | 46 |
| 84 | 2041351 | 64 | 57 | 58 | 42 |
| 85 | 10466605 | 304 | 282 | 285 | 216 |

Note 1: for age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ ($j \ge 1$). To derive the probability distribution of $Y_j$ ($j \ge 1$) in the $j$th age group, let $Y_{ij}$ ($i = 0, 1, 2$) be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumor at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all ($i = 0, 1, 2$, $j = 0, 1, \ldots, n_T$), where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \neq 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$

$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\}, \tag{34}$$
where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ is the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.

The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
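To make the mixture in (34) concrete, the sketch below evaluates $P(y_j \mid n_j)$ by brute-force enumeration over the genotype counts. It is only an illustration under stated assumptions: the function names are hypothetical, the incidence probabilities $Q_i(j)$ are passed in as plain numbers rather than computed from the model, and the double sum is only practical for very small $n_j$ (the SEER populations in Table 1 are of order $10^7$, which is why the paper works with augmented variables instead).

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability h{y; mean}; a zero mean puts all mass on y = 0."""
    if mean == 0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def trinomial_pmf(n0, n1, n, p0, p1):
    """Multinomial probability g(n0, n1; n, p0, p1) with p2 = 1 - p0 - p1."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    coef = math.factorial(n) // (
        math.factorial(n0) * math.factorial(n1) * math.factorial(n2)
    )
    return coef * p0 ** n0 * p1 ** n1 * p2 ** n2

def mixture_pmf(y, n, p0, p1, q, k=3):
    """P(y_j | n_j) of (34); q holds illustrative values (Q_0(j), Q_1(j), Q_2(j))."""
    delta_2k = 1 if k == 2 else 0
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            # Q_T(j): the J_2 term drops out when k = 2.
            q_total = n0 * q[0] + n1 * q[1] + (1 - delta_2k) * n2 * q[2]
            total += trinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_total)
    return total
```

When $p_0 = 1$ the mixture collapses to a single Poisson distribution with mean $n_j Q_0(j)$, which is a quick sanity check on the enumeration.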
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data ($y_j$, $j = 0, 1, \ldots, n_T$), the likelihood function of $\Theta$ is

$$L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k - 1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $\{b_i(t) = b_i, d_i(t) = d_i, \gamma_i(t) = b_i - d_i = \gamma_i, i = 0, 1, \ldots, k - 1\}$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for ($j = 1, \ldots, t_N$) and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of ($Y_{ij}$, $i = 0, 1$) given ($Y_j = y_j$, $n_{ij}$, $i = 0, 1$, $n_j$) is

$$(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\left\{y_j; \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\right\}. \tag{36}$$
Since the conditional probability distribution of $Y_j$ given ($n_{ij}$, $i = 0, 1, 2$) for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of ($Y_{ij}$, $i = 0, 1$, $Y_j$) given ($n_{ij}$, $i = 0, 1$, $n_j$)

$$\begin{aligned} P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\} &= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\ &= \frac{1}{y_j!} e^{-Q_T(j)} [Q_T(j)]^{y_j} \times g\left\{y_{0j}, y_{1j}; y_j, \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\right\} \\ &= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2), \end{aligned} \tag{37}$$

where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
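The last equality in (37) is the standard fact that a Poisson total split multinomially, with cell probabilities proportional to the component means, is the same as independent Poisson components. A minimal numerical check is sketched below; the function names and the example means are hypothetical, and all means are assumed strictly positive.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability of count y for a strictly positive mean."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` with cell probabilities `probs`."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)

def joint_via_total(counts, means):
    """Poisson(total) times multinomial split: the middle line of (37)."""
    total = sum(means)
    probs = [m / total for m in means]
    return poisson_pmf(sum(counts), total) * multinomial_pmf(counts, probs)

def joint_via_product(counts, means):
    """Product of independent Poisson terms: the last line of (37)."""
    return math.prod(poisson_pmf(c, m) for c, m in zip(counts, means))
```

Here `means` plays the role of $(n_{0j} Q_0(j), n_{1j} Q_1(j), n_{2j} Q_2(j))$ and `counts` the role of $(y_{0j}, y_{1j}, y_{2j})$; the two functions should agree up to floating-point error for any such inputs.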
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have for $k = 2$

$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$

$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left\{y_j; \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of ($Y_{0j}$, $Y_j$) given ($n_{uj}$, $u = 0, 1, 2$) is

$$\begin{aligned} P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\} &= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\ &= \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \quad \text{if } k = 2, \end{aligned} \tag{40}$$

where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\tilde{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\tilde{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of ($\mathbf{Y}$, $\tilde{y}$) given ($\mathbf{N}$, $\tilde{n}$)

$$P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{ h[y_{2j}; n_{2j} Q_2(j)]^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \right\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \tilde{y}\}$ given ($\tilde{n}$, $\Theta$) is

$$P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{ g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times h[y_{2j}; n_{2j} Q_2(j)]^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \right\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta] - \log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \hat{\Theta}]\}$ is

$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$

where

$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log\frac{\chi(k)}{y_0}\right\}, \tag{44}$$

$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{ n_{0j} \log\frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log\frac{\hat{p}_i}{p_i} \right\}, \tag{45}$$

$$D_j = 2 \sum_{i=0}^{1} \left\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log\frac{n_{ij} Q_i(j)}{y_{ij}} \right\} + 2 (1 - \delta_{2k}) \left\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log\frac{n_{2j} Q_2(j)}{y_{2j}} \right\}, \tag{46}$$

where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ ($i = 0, 1, 2$).
The joint probability density function $P\{\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta\}$ of ($\mathbf{Y}$, $\tilde{y}$, $\mathbf{N}$) given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by

$$Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.

Under discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results of expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \neq 1$, we obtain

$$\begin{aligned} \beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\ &= \theta_{(i+1)(k-1)} \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)} \phi_{i(k-1)}(t), \end{aligned} \tag{48}$$
where ($\theta_{i(k-1)}$, $i = 1, 2, 3$) and ($\lambda_{i(k-1)}$, $i = 0, 1, 2$) are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ ($i = 0, 1, 2, 3$)'s are given by

$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2} \left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0) \gamma_v\right\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \neq \gamma_j$ if $i \neq j$, the $Q_i(j)$'s under discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$, $j > 0$), and for $j > 0$

$$\begin{aligned} Q_0(j) &\approx e^{-\theta_{11} \phi_{11}(t_{j-1}) - \lambda_{01} \phi_{01}(t_{j-1})} - e^{-\theta_{11} \phi_{11}(t_j) - \lambda_{01} \phi_{01}(t_j)} + o(\beta_1), \\ Q_1(j) &\approx (1 - \alpha_1) \left\{e^{-\lambda_{11} \phi_{11}(t_{j-1})} - e^{-\lambda_{11} \phi_{11}(t_j)}\right\} + o(\beta_1). \end{aligned} \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)} \alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$

$$\begin{aligned} Q_0(j) &\approx e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_1(j) &\approx e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_2(j) &\approx \delta_{2(k-1)} (1 - \alpha) \left\{e^{-\lambda_{22} \phi_{22}(t_{j-1})} - e^{-\lambda_{22} \phi_{22}(t_j)}\right\} \\ &\quad + (1 - \delta_{2(k-1)}) \left\{e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}). \end{aligned} \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0) \log(1 + \gamma_v)} \approx e^{(t - t_0) \gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
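How close the discrete factor $(1 + \gamma_v)^{t - t_0}$ is to its continuous counterpart $e^{(t - t_0) \gamma_v}$ depends on the size of $\gamma_v$ and on the number of time steps. The short sketch below computes the relative gap between the two; the function names are hypothetical, and any values fed to it (for example a $\gamma$ of order $10^{-5}$ or $10^{-2}$, or about 170 half-year steps for 85 years) are illustrative choices rather than quantities prescribed by the model.

```python
import math

def discrete_growth(gamma, steps):
    """Discrete-time factor (1 + gamma)**steps."""
    return (1.0 + gamma) ** steps

def continuous_growth(gamma, steps):
    """Continuous-time factor exp(steps * gamma)."""
    return math.exp(steps * gamma)

def relative_gap(gamma, steps):
    """Relative difference (continuous - discrete) / continuous."""
    c = continuous_growth(gamma, steps)
    return (c - discrete_growth(gamma, steps)) / c
```

Since $\log(1 + \gamma) \approx \gamma - \gamma^2 / 2$, the gap is roughly $(t - t_0) \gamma^2 / 2$ when that quantity is small, so the approximation is very tight for a near-zero $\gamma$ and loosens as $\gamma$ or the time horizon grows.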
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence, one may use results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \tilde{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k - 1$), we let $1 < \omega_0 = N(t_0) \beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k - 1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
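A flat prior restricted to a constraint box, as in (53) with constraints (i) to (iv), can be coded as a log-prior that is constant inside the box and $-\infty$ outside. The sketch below is one possible encoding under assumptions: the dictionary keys (`p1`, `p2`, `gamma`, `omega0`, `beta`, `lam`, `theta`) and the function name are hypothetical, and the constant $\log c$ is set to zero because it does not affect the posterior mode.

```python
import math

def log_prior(params):
    """Log of the partially informative prior: 0.0 inside the box, -inf outside.

    Hypothetical layout of `params`:
      p1, p2   : genotype frequencies
      gamma    : (gamma_1, gamma_2)
      omega0   : N(t0) * beta_0
      beta     : (beta_1, ..., beta_{k-1})
      lam      : (lambda_0, lambda_1, lambda_2)
      theta    : (theta_1, theta_2)
    """
    checks = [
        0 < params["p1"] < 1e-2,                          # constraint (i)
        0 < params["p2"] < 1e-6,                          # constraint (i)
        all(-0.01 < g < 1 for g in params["gamma"]),      # constraint (i)
        1 < params["omega0"] < 1000,                      # constraint (ii)
        all(1e-8 < b < 1e-3 for b in params["beta"]),     # constraint (ii)
        0 < params["lam"][0] < 10,                        # constraint (iii)
        0 < params["lam"][1] < 10,                        # constraint (iii)
        0 < params["lam"][2] < 1e3,                       # constraint (iii)
        0 < params["theta"][0] < 1e-2,                    # constraint (iv)
        0 < params["theta"][1] < 1e-1,                    # constraint (iv)
    ]
    return 0.0 if all(checks) else -math.inf
```

Adding this value to a log-likelihood gives an unnormalized log-posterior whose maximization automatically respects the constraints, since any point outside the box scores $-\infty$.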
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$, we obtain

$$P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,$$

$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-n_{ij} Q_i(j)} [Q_i(j)]^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{t_N} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-Step and with Steps 3 and 4 as the M-Step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given ($\mathbf{Y}$, $\tilde{y}$, $\tilde{n}$, $\Theta$) (The Data-Augmentation Step 1)). Given $\Theta$ and given $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \tilde{y}, \tilde{n}, \Theta\}$ even though the latter is unknown. (For proof, see Tan [22], Chapter 3.) Call the generated sample $\hat{\mathbf{N}}$.
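Step 2 (Generating $\mathbf{Y}$ Given ($\mathbf{N}$, $\tilde{y}$, $\tilde{n}$, $\Theta$) (The Data-Augmentation Step 2)). Given $\tilde{y}$, $\tilde{n}$, $\Theta$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.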
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given ($\tilde{y}$, $\tilde{n}$) and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{t_N} D_j$ in (46) under the constraints. Denote the generated mode as $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$ independently of ($\mathbf{N}$, $\mathbf{Y}$) (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
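The weighted bootstrap selection used in Step 1 can be sketched generically as follows: draw candidates from a proposal distribution, weight each candidate by the density one wants to condition on, and resample in proportion to the weights. This is a minimal illustration under assumptions, not the authors' implementation: `weighted_bootstrap` and `weight_fn` are hypothetical names, and in the setting of Step 1 the candidates would be multinomial draws of $\mathbf{N}$ and the weight would be $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$.

```python
import random

def weighted_bootstrap(candidates, weight_fn, size, rng=random):
    """Resample `size` items from `candidates` with probability proportional
    to weight_fn(candidate), in the spirit of the weighted bootstrap."""
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0:
        raise ValueError("at least one candidate needs a positive weight")
    # random.choices samples with replacement using the relative weights.
    return rng.choices(candidates, weights=weights, k=size)
```

For example, `weighted_bootstrap(draws, likelihood, size=1, rng=random.Random(0))` would return one selected draw; passing a seeded `random.Random` instance keeps the selection reproducible.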
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanomas, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models | Log-likelihood | AIC | BIC
Three-stage, Model-F | -1312.02 | 2648.04 | 2677.35
Three-stage, Model-1 | -1295.76 | 2609.53 | 2631.51
Two-stage | -4392.421 | 8796.843 | 8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. [Plot of incidences (0 to 350) against age in years (0 to 84); curves: Observed, Model-F, Model-1, Two-stage.]
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by results in Table 1 and Figure 5, it appeared that both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are given by 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic value for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 are given by (AIC = 2609.53, BIC = 2631.51), which are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$, further confirming that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^3$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^2 \sim 10^3$, respectively. Because $\lambda_i = E[J_i(t_0; i)] \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from some biological observations, one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume ($E[N(t_0)] = E[J_i(t_0; i)] \sim 10^8$, $i = 0, 1, 2$), then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population, the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
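The goodness-of-fit quantities quoted in observation (a) follow standard formulas, which the sketch below spells out. The function names are hypothetical, and the number of parameters passed to `aic` and `bic` is an input the user must supply; taking 12 parameters for Model-F follows the degrees of freedom quoted in (a) (df = 86 − 12), while the use of 86 as the number of observations in `bic` is an assumption.

```python
import math

def chi_square(observed, predicted):
    """Pearson statistic sum((y - yhat)**2 / yhat) over matched age groups."""
    return sum((y - f) ** 2 / f for y, f in zip(observed, predicted))

def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)
```

With the Model-F log-likelihood of −1312.02 from Table 2 and 12 parameters, `aic(-1312.02, 12)` gives approximately 2648.04, matching the tabulated AIC; `chi_square` would be applied to the observed and predicted columns of Table 1.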
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines
Table 3: Estimates of parameters for the 3-stage stochastic models.

Model | Statistic | λ0 | λ1 | λ2 | γ1 | γ2 | p1 | p2 | α | θ1 | θ2
Model-F | Estimates | 2.88E-01 | 4.81E-01 | 6.55E+02 | 2.71E-05 | 3.37E-02 | 9.95E-04 | 9.68E-07 | 8.41E-01 | 3.29E-04 | 3.41E-01
Model-F | StD | 1.21E-01 | 4.65E-02 | 1.12E+02 | 8.86E-06 | 1.92E-04 | 1.75E-06 | 6.48E-09 | 4.19E-02 | 3.54E-05 | 1.07E-02
Model-F | 95% CL, Lower | 5.03E-02 | 3.89E-01 | 4.34E+02 | 5.34E-06 | 3.33E-02 | 9.91E-04 | 9.55E-07 | 7.59E-01 | 2.60E-04 | 3.20E-01
Model-F | 95% CL, Upper | 5.25E-01 | 5.72E-01 | 8.75E+02 | 4.01E-05 | 3.41E-02 | 9.98E-04 | 9.81E-07 | 9.23E-01 | 3.98E-04 | 3.62E-01
Model-1 | Estimates | 6.55E-02 | 3.72E-01 | 1.98E+02 | NA | 3.49E-02 | 9.98E-04 | 1.00E-06 | 8.07E-01 | NA | NA
Model-1 | StD | 2.54E-02 | 4.63E-02 | 3.38E+01 | NA | 3.58E-03 | 6.27E-05 | 4.53E-08 | 7.06E-02 | NA | NA
Model-1 | 95% CL, Lower | 1.57E-02 | 2.82E-01 | 1.32E+02 | NA | 2.79E-02 | 8.75E-04 | 9.11E-07 | 6.68E-01 | NA | NA
Model-1 | 95% CL, Upper | 1.15E-01 | 4.63E-01 | 2.65E+02 | NA | 4.19E-02 | 1.12E-03 | 1.09E-06 | 9.45E-01 | NA | NA

Note: NA, assumed nonexistence.
information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($Y$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{Y, \underset{\sim}{y} \mid N, \Theta\}$) given by (37) and (40).

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results have confirmed results from molecular biology that human eye cancer is derived by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring completely cancer progression; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data
of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\gamma_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploid insufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce, respectively, to the following stochastic difference equations:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
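As an informal check of (A2), the following Python sketch iterates the difference equations (A1) for one stage $i$ and the next stage $j = i + 1$ and compares the result with the right-hand side of the second line of (A2). It is only a sketch under stated assumptions: $t_0 = 0$, the function names are hypothetical, and the rates and noise terms are arbitrary random numbers rather than values from the model fit.

```python
import random
from math import prod

def iterate_a1(ji0, jj0, gam_i, gam_j, beta_i, e_i, e_j, t):
    """Run recursion (A1) for stage i and stage j = i + 1 from t0 = 0 to t."""
    ji, jj = [ji0], [jj0]
    for s in range(t):
        ji.append(ji[s] * (1 + gam_i[s]) + e_i[s + 1])
        jj.append(jj[s] * (1 + gam_j[s]) + ji[s] * beta_i[s] + e_j[s + 1])
    return ji, jj

def closed_form_a2(ji, jj0, gam_j, beta_i, e_j, t):
    """Right-hand side of the second line of (A2) at time t (t0 = 0)."""
    def growth(a, b):
        # prod_{u=a}^{b-1} (1 + gamma_j(u)); an empty product equals 1
        return prod(1 + gam_j[u] for u in range(a, b))
    initial = jj0 * growth(0, t)
    inflow = sum(ji[s] * beta_i[s] * growth(s + 1, t) for s in range(t))
    noise = sum(e_j[s] * growth(s, t) for s in range(1, t + 1))
    return initial + inflow + noise

# Hypothetical inputs: arbitrary time-varying rates and noise (index 0 of e is unused).
rng = random.Random(0)
T = 12
gam_i = [rng.uniform(-0.05, 0.05) for _ in range(T)]
gam_j = [rng.uniform(-0.05, 0.05) for _ in range(T)]
beta_i = [rng.uniform(0.0, 0.01) for _ in range(T)]
e_i = [0.0] + [rng.gauss(0, 1) for _ in range(T)]
e_j = [0.0] + [rng.gauss(0, 1) for _ in range(T)]

ji, jj = iterate_a1(1000.0, 10.0, gam_i, gam_j, beta_i, e_i, e_j, T)
direct = jj[T]
closed = closed_form_a2(ji, 10.0, gam_j, beta_i, e_j, T)
print(direct, closed)
```

The two printed numbers should agree up to floating-point rounding, since (A2) is an algebraic unrolling of (A1) and does not depend on the distribution of the noise terms.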
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \ne \gamma_j$ for $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ for $i \ne j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
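Because the noise terms in (A1) have expected value zero, the expected numbers can also be obtained by iterating the mean recursion $E[J_j(t+1; i)] = E[J_j(t; i)](1 + \gamma_j) + E[J_{j-1}(t; i)]\beta_{j-1}$ directly. The Python sketch below does this for a time homogeneous model and can serve as a numerical reference when evaluating a closed form such as (A4). The function name and all parameter values are hypothetical. The final comparison uses the elementary two-stage case with no $J_{i+1}$ cells at $t_0$, for which summing the geometric series gives $\beta_i J_i(t_0)[(1+\gamma_i)^{t} - (1+\gamma_{i+1})^{t}]/(\gamma_i - \gamma_{i+1})$ with $t_0 = 0$.

```python
def expected_counts(j_init, gammas, betas, steps):
    """Iterate the mean recursion implied by (A1) for a time homogeneous model.

    j_init: expected cell counts at t0 for stages i, ..., k-1
    gammas: proliferation rates gamma_i, ..., gamma_{k-1}
    betas:  transition rates beta_i, ..., beta_{k-2}
    Returns the expected counts of all stages after `steps` time units.
    """
    means = list(j_init)
    for _ in range(steps):
        nxt = [means[0] * (1 + gammas[0])]
        for j in range(1, len(means)):
            nxt.append(means[j] * (1 + gammas[j]) + means[j - 1] * betas[j - 1])
        means = nxt
    return means

# Hypothetical two-stage illustration (values are not estimates from the paper).
j0, g0, g1, b0, steps = 1.0e6, 0.0, 0.03, 1.0e-6, 40
means = expected_counts([j0, 0.0], [g0, g1], [b0], steps)
geometric = b0 * j0 * ((1 + g0) ** steps - (1 + g1) ** steps) / (g0 - g1)
print(means[1], geometric)
```

For more stages the same function applies unchanged; only the lists of rates and initial expected counts grow.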
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Figure 4: Embryo genotypes and their frequencies at embryo stage and at birth. (Diagram not recoverable from the extracted text; panel labels: two-stage model and multistage model with three or more stages, each showing the embryo state, the state at birth, and tumor, with branch probabilities $\alpha$ and $1 - \alpha$.)
with probability $\alpha$ ($\alpha > 0$), a $J_2$ person at the embryo stage would develop cancer at or before birth.
3 The Stochastic Process of Carcinogenesis with Hereditary Cancer Cases and Mathematical Analysis
Because tumors are developed from primary $J_k$ cells, for the above stochastic model the identifiable response variables are $T(t)$ and $\{J_u(t; i),\ i = 0, 1, 2,\ u = i, i+1, \ldots, k-1\}$, where $T(t)$ is the number of cancer tumors at time $t$ and $J_u(t; i)$ is the number of $J_u$ ($u = i, i+1, \ldots, k-1$) cells at time $t$ in people who are $J_i$ people at the embryo stage (see [3, 5, 8, 23] and Remarks 1 and 2). For people who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage, the stochastic model of carcinogenesis is then given by the stochastic process $\{\underset{\sim}{X}_i(t),\ T(t),\ t > 0\}$, where $\underset{\sim}{X}_i(t) = \{J_u(t; i),\ u = i, i+1, \ldots, k-1\}'$. For these processes, in the next subsections we will derive stochastic equations for the state variables ($J_u(t; i)$, $i = 0, 1, 2$, $u = i, \ldots, k-1$); we will also derive the probability distributions of these state variables and the probabilities of developing cancer tumors. These are the basic approaches for modeling carcinogenesis used by the first author and his associates; see Tan [3], Tan et al. [4, 5, 8, 23], Tan and Zhou [9], Tan and Yan [16], Tan and Chen [24, 25], and Remark 3.
Remark 2. At any time (say $t$), the total number of $J_k$ cells is equal to the total number of $J_k$ cells generated from $J_{k-1}$ cells at time $t$ plus the total number of $J_k$ cells generated by cell division from other $J_k$ cells at time $t$; the former $J_k$ cells are referred to as primary $J_k$ cells, while the latter are not primary $J_k$ cells. Since each tumor is developed from a single primary $J_k$ cell through a stochastic birth and death process, each primary $J_k$ cell will generate at most one tumor. It follows that at any time the total number of $J_k$ cells is considerably greater than the number of cancer tumors (see also Yang and Chen [11]). Thus, for generating cancer tumors, the only identifiable state variables are the numbers of $J_j$ cells ($j = 0, 1, \ldots, k-1$) and the number of detectable cancer tumors.
Remark 3. To model stochastic multistage models of carcinogenesis, the standard traditional approach is to assume that the last stage cells (i.e., the $J_k$ cells in the model $N \to J_1 \to \cdots \to J_k \to \textit{Tumor}$) grow instantaneously into a cancer tumor as soon as they are generated and then apply the standard Markov theory to $T(t)$ and to the state variables $\underset{\sim}{X}(t) = \{J_i(t),\ i = 0, 1, \ldots, k-1\}$. This approach has been described in detail in Tan [2], Little [1], and Zheng [7]; see also Luebeck and Moolgavkar [26] and Durrett et al. [27]. However, in some cases the assumption of instantaneous growth of $J_k$ cells into cancer tumors may not be realistic (Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19]); in these cases, $T(t)$ is not Markov, so that the Markov theory method is not applicable to $T(t)$. To develop analytical results and to resolve many difficult issues, Tan and his associates [4, 5, 24] have proposed an alternative approach through stochastic equations and have followed Yang and Chen [11] to assume that cancer tumors develop by clonal expansion from primary last stage cells. Through the probability generating function method, Tan and Chen [24] have shown that if the Markov theory is applicable to $T(t)$, then the stochastic equation method is equivalent to the classical Markov theory method but is more powerful. Also, through the stochastic equation method we have shown in the Appendix that the classical approach provides a close approximation to the discrete time model under the assumption that the primary last stage cells develop into a detectable tumor in one time unit. This provides a reasonable explanation of why the traditional approach (see [2, 22]) can still work well even though the Markov assumption for $T(t)$ may not hold. In this paper, we will thus basically use the stochastic equation method and assume that cancer tumors develop from primary last stage cells through clonal expansion.
3.1 Stochastic Equations for the State Variables. Assume now that an individual is a $J_i$ ($i = 0, 1, 2$) person at the embryo stage. Then in this individual cancer is developed by a $(k - i)$-stage multievent model given by $J_i \to \cdots \to J_k \to \textit{Tumor}$, and the identifiable response variables are given by $\{\underset{\sim}{X}_i(t)',\ T(t)\}$. To derive stochastic equations for the staging variables in $\underset{\sim}{X}_i(t)$ in this individual, observe that for each $i = 0, 1, 2$, $\underset{\sim}{X}_i(t)'$ is in general a Markov process although $T(t)$ may not be Markov; see Remark 1, Tan [3], Tan et al. [4, 5], Tan and Zhou [9], and Tan and Yan [16]. It follows that $\underset{\sim}{X}_i(t + \Delta t)'$ derives from $\underset{\sim}{X}_i(t)'$ through stochastic birth-death processes of $J_u$ ($u = i, i+1, \ldots, k-1$) cells and through stochastic transitions $J_u \to J_{u+1}$, $u = i, i+1, \ldots, k-1$, during $(t, t + \Delta t]$. Let $\{B_u(t; i), D_u(t; i), M_u(t; i)\}$ be the number of births of $J_u$ cells, the number of deaths of $J_u$ cells, and the number of transitions $J_u \to J_{u+1}$ during $(t, t + \Delta t]$, respectively, in people who are $J_i$ people at the embryo stage. Let $M_0(t)$ denote the number of transitions from $N \to J_1$ during $(t, t + \Delta t]$. Because the transition $J_u \to J_{u+1}$ would not affect the number of $J_u$ cells but only increase the number of $J_{u+1}$ cells (see Remark 4), by the conservation law we have the following stochastic equations for $J_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$ (see Tan [3], Tan et al. [4, 5, 8], Tan and Zhou [9], and Tan and Yan [16]):

$$
\begin{aligned}
J_i(t + \Delta t; i) &= J_i(t; i) + B_i(t; i) - D_i(t; i), \quad i = 0, 1, 2, \\
J_u(t + \Delta t; i) &= J_u(t; i) + B_u(t; i) - D_u(t; i) + M_{u-1}(t; i), \quad i < u \le k-1.
\end{aligned}
\tag{2}
$$
Because $\{B_v(t; i), D_v(t; i), M_v(t; i),\ i = 0, 1, 2,\ v = i, \ldots, k-1\}$ are random variables, the above equations are basically stochastic equations. To derive probability distributions of these variables, let $b_u(t)$ and $d_u(t)$ denote the birth rate and the death rate at time $t$ of the $J_u$ ($u = 0, 1, \ldots, k-1$) cells, respectively. Let $\beta_u(t)$ ($u = 0, 1, \ldots, k-1$) be the transition rate at time $t$ from $J_u \to J_{u+1}$. Then, as shown in Tan [3], for ($i = 0, 1, 2$, $u = i, \ldots, k-1$), we have to the order of $o(\Delta t)$:

$$
\{B_u(t; i), D_u(t; i), M_u(t; i)\} \mid J_u(t; i) \sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t,\ \beta_u(t)\Delta t\}.
\tag{3}
$$
It follows that to the order of $o(\Delta t)$:

$$
\begin{aligned}
\{B_u(t; i), D_u(t; i)\} \mid J_u(t; i) &\sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t\}, \\
M_u(t; i) \mid J_u(t; i) &\sim \text{Binomial}\{J_u(t; i);\ \beta_u(t)\Delta t\} \\
&\sim \text{Poisson}\{J_u(t; i)\,\beta_u(t)\Delta t\} + o(\beta_u(t)\Delta t), \\
&\quad \text{independently of } \{B_u(t; i), D_u(t; i)\}, \quad u = 0, 1, \ldots, k-1.
\end{aligned}
\tag{4}
$$
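To make the sampling scheme in (4) and the conservation equations (2) concrete, the following Python sketch draws $\{B_u, D_u, M_u\}$ for each stage over one interval and then updates the cell counts. It is a minimal illustration under stated assumptions, not the estimation code of this paper: the function names, the rates, and the initial counts are hypothetical, the rates are taken as constant over the interval, and $(b_u + d_u)\Delta t \le 1$ is assumed. Births and deaths are sampled cell by cell, which yields the multinomial law in (4), and transitions are sampled as an independent binomial variable that leaves the $J_u$ count unchanged.

```python
import random

def sample_step(n_cells, b, d, beta, dt, rng):
    """Draw (B, D, M) for one stage over (t, t + dt], following (4)."""
    births = deaths = 0
    for _ in range(n_cells):
        u = rng.random()
        if u < b * dt:                 # cell divides
            births += 1
        elif u < (b + d) * dt:         # cell dies
            deaths += 1
    # Transitions J_u -> J_{u+1}: Binomial(n_cells, beta*dt), independent of (B, D)
    transitions = sum(rng.random() < beta * dt for _ in range(n_cells))
    return births, deaths, transitions

def advance(counts, b, d, beta, dt, rng):
    """Apply the conservation equations (2) to the counts of stages i, ..., k-1."""
    draws = [sample_step(n, b[u], d[u], beta[u], dt, rng) for u, n in enumerate(counts)]
    new_counts = []
    for u, n in enumerate(counts):
        births, deaths, _ = draws[u]
        inflow = draws[u - 1][2] if u > 0 else 0   # M_{u-1}: arrivals from the previous stage
        new_counts.append(n + births - deaths + inflow)
    return new_counts, draws

# Hypothetical rates and initial counts for three stages (not fitted values).
rng = random.Random(1)
counts = [10000, 50, 0]
new_counts, draws = advance(counts, b=[0.02, 0.03, 0.05], d=[0.02, 0.025, 0.04],
                            beta=[1e-3, 1e-3, 1e-4], dt=1.0, rng=rng)
print(new_counts)
```

The transitions drawn for the last tracked stage are returned in `draws` but are not added to any tracked count, since they correspond to newly generated cells of the next stage.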
From these distribution results, by subtracting from the random transition variables their respective conditional means, we obtain the following stochastic equations for the state variables $\underset{\sim}{X}_i(t)$ ($i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$):

$$
\begin{aligned}
dJ_i(t; i) &= J_i(t + \Delta t; i) - J_i(t; i) = B_i(t; i) - D_i(t; i) \\
&= J_i(t; i)\,\gamma_i(t)\Delta t + e_i(t; i)\Delta t, \quad i = 0, 1, 2, \\
dJ_u(t; i) &= J_u(t + \Delta t; i) - J_u(t; i) = M_{u-1}(t; i) + B_u(t; i) - D_u(t; i) \\
&= \{J_{u-1}(t; i)\,\beta_{u-1}(t) + J_u(t; i)\,\gamma_u(t)\}\Delta t + e_u(t; i)\Delta t, \\
&\quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \quad i < u \le k-1,
\end{aligned}
\tag{5}
$$
where $\gamma_u(t) = b_u(t) - d_u(t)$ for $u = 0, 1, \ldots, k-1$, and where $e_i(t; i)\Delta t = [B_i(t; i) - J_i(t; i)\,b_i(t)\Delta t] - [D_i(t; i) - J_i(t; i)\,d_i(t)\Delta t]$ for $i = 0, 1, 2$, and $e_u(t; i)\Delta t = [B_u(t; i) - J_u(t; i)\,b_u(t)\Delta t] - [D_u(t; i) - J_u(t; i)\,d_u(t)\Delta t] + [M_{u-1}(t; i) - J_{u-1}(t; i)\,\beta_{u-1}(t)\Delta t]$ for $i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$, $i < u \le k-1$.

From the above equations, by dividing both sides by $\Delta t$ and letting $\Delta t \to 0$, we obtain

$$
\begin{aligned}
\frac{d}{dt} J_i(t; i) &= J_i(t; i)\,\gamma_i(t) + e_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
\frac{d}{dt} J_u(t; i) &= J_u(t; i)\,\gamma_u(t) + J_{u-1}(t; i)\,\beta_{u-1}(t) + e_u(t; i), \\
&\quad \text{for } i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \quad i < u \le k-1.
\end{aligned}
\tag{6}
$$
In the above equations, using the distribution results in (4), it can easily be shown that the random noises $e_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$, have expected value zero and are uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $J_u(t_0; i) > 0$ for $u = i, i+1$ and $J_u(t_0; i) = 0$ for $u > i+1$. Given these initial conditions at birth ($t_0$), the solution of the equations in (6) is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t} \gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\,\beta_{u-1}(x)\, e^{\int_{x}^{t} \gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i} \eta_u^{(v)}(t; i), \\
&\quad u = i+1, \ldots, k-1, \\
&\quad \text{where } i = 0 \text{ if } k = 2; \quad i = 0, 1 \text{ if } k = 3; \quad i = 0, 1, 2 \text{ if } k > 3,
\end{aligned}
\tag{7}
$$
where

$$
\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy + \int_{t_0}^{x} \gamma_{u-1}(y)\,dy}\,\beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) &= \eta_u^{(1)}(t; i) = \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned}
\tag{8}
$$
If the model is time homogeneous so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ for $i \ne u$, the above solutions under the initial conditions ($J_{i+u}(t_0; i) > 0$, $u = 0, 1$; $J_{i+u}(t_0; i) = 0$, $u > 1$) then reduce, respectively, to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i (t - t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \beta_{u-1} \int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u (t - x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned}
\tag{9}
$$
where, for $i \le v \le u$,

$$
A_{iu}(v) =
\begin{cases}
1, & \text{if } i = u, \\
\displaystyle\prod_{l=i,\ l \ne v}^{u} (\gamma_l - \gamma_v)^{-1}, & \text{if } i < u.
\end{cases}
\tag{10}
$$
Obviously, $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$, $u = i, \ldots, k-1$, $r = 1, \ldots, u-i+1$). It follows that for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$), the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ for $i \ne u$ are given by

$$
\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)] \times \left( \prod_{u=i+1}^{k-2} \beta_u \right) \sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v (t - t_0)} \\
&\quad + E[J_i(t_0; i)] \left( \prod_{u=i}^{k-2} \beta_u \right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v (t - t_0)}, \\
&\quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \quad k \ge 2,
\end{aligned}
\tag{11}
$$

where, as a convention, $\sum_{j=i+1}^{i} c_j = 0$ and $\prod_{j=i+1}^{i} d_j = 1$ for all $(c_j, d_j)$.
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to one $J_u$ cell and one $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but only increase the size of the $J_{u+1}$ population.
3.2. Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \mathrm{Binomial}(n, p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \mathrm{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \mathrm{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and using the probability distributions of the transition variables $\{B_r(t, i), D_r(t, i), M_r(t, i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t, i),\ r = i, \ldots, k-1\}$ given $\{J_r(t, i),\ r = i, \ldots, k-1\}$ for $(i = 0, 1, \ldots, \mathrm{Min}(k-1, 2))$:
$$P\{J_r(t + \Delta t, i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t, i) = u_r,\ r = i, i+1, \ldots, k-1\}$$
$$= P\{J_i(t + \Delta t, i) = v_i \mid J_i(t, i) = u_i\} \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t, i) = v_j \mid J_j(t, i) = u_j,\ J_{j-1}(t, i) = u_{j-1}\}, \tag{12}$$
where
$$P\{J_i(t + \Delta t, i) = v_i \mid J_i(t, i) = u_i\} = \sum_{r=0}^{u_i} f(r; u_i, b_i(t)\Delta t) \times f\Big(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2),$$
$$P\{J_j(t + \Delta t, i) = v_j \mid J_j(t, i) = u_j,\ J_{j-1}(t, i) = u_{j-1}\} = \sum_{r_1=0}^{u_j} \sum_{r_2=0}^{u_j - r_1} g(r_1, r_2; u_j, b_j(t)\Delta t, d_j(t)\Delta t) \times h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}\Delta t), \quad j > i. \tag{13}$$
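As an illustration only, the first transition probability in (13) can be evaluated directly as a convolution of two binomial laws: the number of births in the step, followed by the number of deaths among the cells that did not divide. The following Python sketch assumes constant rates over the step; the function names and the numerical inputs are hypothetical and are not part of the original model specification.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial(n, p) probability of x; zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def stage_i_transition_prob(v, u, b_dt, d_dt):
    """P{J_i(t + dt) = v | J_i(t) = u}, first line of (13).

    b_dt and d_dt stand for b_i(t)*dt and d_i(t)*dt (assumed b_dt + d_dt <= 1).
    """
    total = 0.0
    for births in range(u + 1):
        deaths = u - v + births  # deaths needed to end the step with v cells
        total += binom_pmf(births, u, b_dt) * binom_pmf(
            deaths, u - births, d_dt / (1.0 - b_dt)
        )
    return total
```

For a single cell ($u = 1$) the sketch returns $b_i\Delta t$ for one birth and $d_i\Delta t$ for one death, which is the leading-order behavior one expects from a birth-death step.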
Define the unobservable transition variables $\underset{\sim}{U}_i(t) = \{B_i(t, i),\ (B_j(t, i), D_j(t, i)),\ j = i+1, \ldots, k-1\}'$ $(i = 0, 1, \mathrm{Min}(k-1, 2))$. Then we have, for the joint probability density function of $\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t)\}$ given $\underset{\sim}{X}_i(t)$,
$$P\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} \times P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\}, \tag{14}$$
where
$$P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} = f\Big(J_i(t, i) - J_i(t + \Delta t, i) + B_i(t, i);\ J_i(t, i) - B_i(t, i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big) \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t, i) - J_j(t, i) - B_j(t, i) + D_j(t, i);\ J_{j-1}(t, i)\beta_{j-1}(t)\Delta t\}, \tag{15}$$
$$P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = f(B_i(t, i);\ J_i(t, i),\ b_i(t, i)\Delta t) \times \prod_{j=i+1}^{k-1} g\{B_j(t, i), D_j(t, i);\ J_j(t, i),\ b_j(t, i)\Delta t,\ d_j(t, i)\Delta t\}. \tag{16}$$
Let $\underset{\sim}{e}_i(j)$ be a $(k - i) \times 1$ column vector with 1 in the $j$th position $(1 \le j \le k - i)$ and with 0 in other positions. Let $\underset{\sim}{u} = (u_i, \ldots, u_{k-1})'$ and $\underset{\sim}{v} = (v_i, \ldots, v_{k-1})'$ be $(k - i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)–(16), it can readily be shown that
$$P\{\underset{\sim}{X}_i(t + \Delta t) = \underset{\sim}{v} \mid \underset{\sim}{X}_i(t) = \underset{\sim}{u}\} = \begin{cases} [u_j b_j(t) + (1 - \delta_{ji}) u_{j-1}\beta_{j-1}(t)]\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} + \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\ u_j d_j(t)\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} - \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\ o(\Delta t), & \text{if } |\underset{\sim}{1}'_{n-k}(\underset{\sim}{u} - \underset{\sim}{v})| \ge 2. \end{cases} \tag{17}$$
The above results imply that $\underset{\sim}{X}_i(t)$ is a $(k - i)$-dimensional birth-death process with birth rates $\{i b_u(t),\ u = i, \ldots, k-1\}$, death rates $\{i d_u(t),\ u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t),\ u = i, \ldots, k-1;\ \beta_{u,v}(j, t) = 0 \text{ if } v \ne u + 1\}$ (see Definition 4.1 in Tan [22, Chapter 4]). Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t, i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i,\ J_j(0) = 0,\ j > i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ $(i = 0, 1, \mathrm{Min}(k-1, 2))$ in the above model is given by
$$\frac{d}{dt} P(u_j,\ j = i, \ldots, k-1;\ t) = P(u_i - 1, u_j,\ j = i+1, \ldots, k-1;\ t)(u_i - 1) b_i(t)$$
$$+ \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1};\ t)(u_j - 1) b_j(t)$$
$$+ \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1};\ t)\, u_j \beta_j(t)$$
$$+ \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1};\ t)(u_j + 1) d_j(t)$$
$$- P(u_i, u_{i+1}, \ldots, u_{k-1};\ t) \times \Big\{ \sum_{j=i}^{k-1} u_j [b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j \beta_j(t) \Big\}, \tag{18}$$
for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.
By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t, i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ numerically.
3.3. The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.
To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22, Chapter 8] and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s, i),\ s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t, i)$, where $\omega(t, i) = \int_{t_0}^{t} J_{k-1}(s, i)\beta_{k-1}(s) P_T(s, t)\, ds$. That is,
$$T(t) \mid \{J_{k-1}(s, i),\ s \le t\} \sim \mathrm{Poisson}(\omega(t, i)). \tag{19}$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by
$$Q_i(j) = E\{e^{-\omega(t_{j-1}, i)} - e^{-\omega(t_j, i)}\} = e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}), \tag{20}$$
where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x, i)] P_T(x, t)\, dx$.
To derive $Q_i(j)$, denote by
$$\theta_{i(k-1)} = E[J_i(t_0, i-1)]\, \beta_i \prod_{u=i+1}^{k-1} \Big( \frac{\beta_u}{\gamma_u} \Big), \quad i = 1, \ldots, \mathrm{Min}(3, k-1),$$
$$\lambda_{u(k-1)} = E[J_u(t_0, u)]\, \beta_u \prod_{v=u+1}^{k-1} \Big( \frac{\beta_v}{\gamma_v} \Big), \quad u = 0, 1, \ldots, \mathrm{Min}(2, k-1), \tag{21}$$
and define the functions
$$\psi_{i(k-1)}(t) = \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx, \quad i = 0, 1, \ldots, \mathrm{Min}(2, k-1). \tag{22}$$
Applying the results for $E[J_{k-1}(t, i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1;\ j > 0)$, and for $j > 0$,
$$Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1), \tag{23}$$
$$Q_1(j) = (1 - \alpha_1)\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\} + o(\beta_1). \tag{24}$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\alpha$, and for $j > 0$,
$$Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \tag{25}$$
$$Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \tag{26}$$
$$Q_2(j) = \delta_{2k}(1 - \alpha)\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\} + (1 - \delta_{2k})\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\} + o(\beta_{k-1}), \tag{27}$$
where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to
$$\psi_{0(k-1)}(t) = \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx = \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \int_{t_0}^{t} [e^{\gamma_v (x - t_0)} - 1] P_T(x, t)\, dx. \tag{28}$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to
$$\psi_{ii}(t) = \frac{1}{\gamma_i}\{e^{\gamma_i (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1,$$
$$\psi_{0(k-1)}(t) = \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \frac{1}{\gamma_v^2}\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\gamma_v\},$$
$$\psi_{i(k-1)}(t) = \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \frac{1}{\gamma_v}\{e^{\gamma_v (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1. \tag{29}$$
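As a small numerical check of the first line of (29), the closed form for $\psi_{ii}(t)$ is the integral $\int_{t_0}^{t} e^{\gamma_i (x - t_0)}\, dx$ obtained from (22) when $P_T \approx 1$. The Python sketch below compares the closed form with a trapezoid-rule quadrature; the function names, the step count, and the parameter values used in the check are illustrative assumptions only.

```python
import math


def psi_ii(t, t0, gamma_i):
    """Closed form (1/gamma_i) * (exp(gamma_i * (t - t0)) - 1), assuming P_T = 1."""
    return math.expm1(gamma_i * (t - t0)) / gamma_i


def psi_ii_trapezoid(t, t0, gamma_i, steps=20000):
    """Trapezoid-rule value of the integral of exp(gamma_i * (x - t0)) over [t0, t]."""
    width = (t - t0) / steps
    total = 0.5 * (1.0 + math.exp(gamma_i * (t - t0)))  # the two end points
    for n in range(1, steps):
        total += math.exp(gamma_i * n * width)
    return total * width
```

Using `math.expm1` keeps the closed form accurate when $\gamma_i (t - t_0)$ is very small, where $\psi_{ii}(t)$ approaches $t - t_0$.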
4. Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases
To estimate unknown parameters and to validate the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0),\ (y_j, n_j),\ j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or a 5-year period) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section, we will develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \mathrm{Multinomial}\{n_j; p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \mathrm{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \mathrm{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \mathrm{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore
$$Y_0 \sim \mathrm{Poisson}(\chi_k), \quad k = 2, 3, \tag{30}$$
where
$$\chi_k = \begin{cases} n_0 (p_2 + p_1 \alpha), & \text{if } k = 2, \\ n_0 p_2 \alpha, & \text{if } k = 3. \end{cases} \tag{31}$$
The expected value of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is
$$D_0(k) = -2\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\} = 2\Big\{\chi_k - y_0 - y_0 \log \frac{\chi_k}{y_0}\Big\}, \quad k = 2, 3. \tag{32}$$
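The closed form in (32) is the usual Poisson deviance, and it can be checked numerically against its definition as twice the log-likelihood gap between the saturated fit $\hat{\chi}_k = y_0$ and a candidate mean $\chi_k$. The Python sketch below is illustrative only; the function names are hypothetical, and the convention used for $y_0 = 0$ (where $y_0 \log y_0$ is taken as 0) is an assumption added here.

```python
import math


def poisson_log_pmf(y, mean):
    """log h(y; mean) for a Poisson random variable."""
    return -mean + y * math.log(mean) - math.lgamma(y + 1)


def poisson_deviance(y, mean):
    """2 * {mean - y - y * log(mean / y)}, with y*log(y) taken as 0 when y = 0."""
    if y == 0:
        return 2.0 * mean
    return 2.0 * (mean - y - y * math.log(mean / y))
```

For instance, `poisson_deviance(34, 36.0)` evaluates the deviance for an observed count of 34 against a fitted mean of 36; the value is zero only when the fitted mean equals the observed count.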
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age     Number of people   Observed    Model-F     Model-1     Two-stage
group   at risk            incidence   predicted   predicted   predicted
0       12495777           34          36          36          38
1       12221582           20          11          11          16
2       12120990           14          7           8           17
3       12112995           9           5           6           18
4       12146174           4           4           5           20
5       12161336           3           3           4           22
6       12111854           2           2           4           25
7       12160452           2           2           4           28
8       11942586           2           2           4           30
9       12381299           1           2           5           34
10      12512703           2           3           6           38
11      12410338           5           3           6           41
12      12449244           2           3           6           44
13      12527781           5           4           7           48
14      12602883           3           5           7           51
15      12719598           5           5           8           55
16      12766107           7           6           9           59
17      12831400           9           7           9           62
18      12382047           8           7           9           63
19      12581638           10          8           10          68
20      12636509           7           9           11          71
21      12682601           6           10          10          75
22      12840510           12          11          11          79
23      13075528           17          13          13          84
24      13358635           16          15          14          89
25      13473849           12          16          16          94
26      13426340           17          18          18          97
27      13525264           28          20          20          101
28      13149674           17          22          21          102
29      13812811           23          25          25          110
30      13886874           24          28          27          114
31      13488332           37          30          29          115
32      13460286           32          32          32          118
33      13256067           38          35          35          119
34      13428827           37          39          38          124
35      13220037           40          41          41          126
36      12870265           30          44          44          126
37      12689592           43          47          47          127
38      12157014           42          49          49          125
39      12494081           46          55          54          131
40      12272125           49          58          58          132
41      11826573           56          61          60          130
42      11663153           54          65          64          131
43      11407082           53          68          68          131
44      11296848           70          73          72          133
45      11016369           57          76          76          132
46      10651593           71          79          79          130
47      10475708           87          84          83          131
48      9994684            82          86          85          127
49      10138908           78          93          93          131
50      9836359            87          97          96          130
51      9475641            95          100         99          127
52      9250985            113         104         104         126
53      9027382            106         108         108         125
54      8883737            117         113         113         125
55      8547883            129         116         116         123
56      8279648            107         119         119         121
57      8062368            119         123         123         119
58      7654610            132         124         124         115
59      7563706            118         130         129         115
60      7232719            131         131         130         112
61      6927332            116         132         132         109
62      6708273            133         134         134         107
63      6543931            143         138         137         106
64      6404652            130         141         141         105
65      6168486            145         142         142         102
66      5913479            138         142         142         99
67      5746766            151         144         144         98
68      5480517            147         142         142         94
69      5363912            149         144         144         94
70      5110728            136         142         142         90
71      4925076            144         141         141         88
72      4696825            140         138         138         85
73      4512136            146         135         135         82
74      4345300            126         133         133         80
75      4148801            122         128         129         78
76      3900900            124         122         122         74
77      3681587            114         116         116         70
78      3481918            93          110         110         67
79      3243631            102         102         102         63
80      2961234            79          92          93          58
81      2724984            93          83          84          54
82      2495219            82          75          75          50
83      2271595            92          66          67          46
84      2041351            64          57          58          42
85      10466605           304         282         285         216

Note 1: for age 0 and ages 1–10 years, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ $(j \ge 1)$. To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is
$$Y_{ij} \mid n_{ij} \sim \mathrm{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2;\ j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij},\ i = 0, 1, 2) \sim \mathrm{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \mathrm{Multinomial}(n_j; p_0, p_1)$, we have, for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$,
$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\}, \tag{34}$$
where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \mathrm{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij},\ i = 0, 1, 2) \sim \mathrm{Poisson}\{Q_T(j)\}$.
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions, with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
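To illustrate the mixture structure of (34), the double sum can be evaluated by brute force when $n_j$ is small. The following Python sketch is a toy evaluation under that assumption; the function names, the tuple `Q` holding $(Q_0(j), Q_1(j), Q_2(j))$, and all numerical inputs are hypothetical. For realistic SEER-sized $n_j$ the enumeration is infeasible, which is one motivation for working with augmented variables instead.

```python
from math import exp, factorial


def poisson_pmf(y, mean):
    """h(y; mean) for a Poisson random variable."""
    return exp(-mean) * mean**y / factorial(y)


def trinomial_pmf(n0, n1, n, p0, p1):
    """g(n0, n1; n, p0, p1); the third cell has probability 1 - p0 - p1."""
    n2 = n - n0 - n1
    coef = factorial(n) // (factorial(n0) * factorial(n1) * factorial(n2))
    return coef * p0**n0 * p1**n1 * (1.0 - p0 - p1) ** n2


def mixture_pmf(y, n, p0, p1, Q, k):
    """P(y_j | n_j) in (34), summing over all genotype splits of n people."""
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            mean = n0 * Q[0] + n1 * Q[1]
            if k != 2:  # (1 - delta_{2k}) switches the J_2 term on
                mean += n2 * Q[2]
            total += trinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, mean)
    return total
```

Summing `mixture_pmf` over $y$ returns 1, and its mean equals $n_j \sum_i p_i Q_i(j)$ when $k > 2$, which follows from taking expectations inside the mixture.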
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j,\ j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is
$$L\{\Theta \mid y_j,\ j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $\{b_i(t) = b_i,\ d_i(t) = d_i,\ \gamma_i(t) = b_i - d_i = \gamma_i,\ i = 0, 1, \ldots, k-1\}$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j},\ j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $(j = 1, \ldots, t_N)$ and for $k > 2$, the conditional probability distribution $P\{y_{ij},\ i = 0, 1 \mid y_j,\ n_{ij},\ i = 0, 1,\ n_j\}$ of $(Y_{ij},\ i = 0, 1)$ given $(Y_j = y_j,\ n_{ij},\ i = 0, 1,\ n_j)$ is
$$(y_{ij},\ i = 0, 1) \mid (y_j,\ n_{ij},\ i = 0, 1,\ n_j) \sim \mathrm{Multinomial}\Big(y_j;\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\Big). \tag{36}$$
Since the conditional probability distribution of $Y_j$ given $(n_{ij},\ i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have, for the joint conditional probability density function $P\{y_{ij},\ i = 0, 1,\ y_j \mid n_{ij},\ i = 0, 1,\ n_j\}$ of $(Y_{ij},\ i = 0, 1,\ Y_j)$ given $(n_{ij},\ i = 0, 1,\ n_j)$,
$$P\{y_{ij},\ i = 0, 1,\ y_j \mid n_{ij},\ i = 0, 1,\ n_j\} = P\{y_{ij},\ i = 0, 1 \mid y_j,\ n_{ij},\ i = 0, 1, 2\} \times P\{y_j \mid n_{ij},\ i = 0, 1, 2\}$$
$$= \frac{1}{y_j!} e^{-Q_T(j)} Q_T(j)^{y_j} \times g\Big\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\Big\} = \prod_{i=0}^{2} h\{y_{ij};\ n_{ij} Q_i(j)\} \quad (k > 2), \tag{37}$$
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
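The last equality in (37) is the standard Poisson splitting identity: a Poisson total allocated multinomially across genotypes, with cell probabilities proportional to the component means, is equivalent to independent Poisson counts. The Python sketch below checks this numerically for one hypothetical set of counts and means; the function names are illustrative, and the list `means` plays the role of $n_{ij} Q_i(j)$, $i = 0, 1, 2$.

```python
from math import exp, factorial


def poisson_pmf(y, mean):
    """h(y; mean) for a Poisson random variable."""
    return exp(-mean) * mean**y / factorial(y)


def multinomial_pmf(counts, probs):
    """Multinomial probability of the given cell counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    value = float(coef)
    for c, p in zip(counts, probs):
        value *= p**c
    return value


def joint_via_total(y_parts, means):
    """Poisson law for the total times the multinomial allocation, as in (37)."""
    total_mean = sum(means)
    probs = [m / total_mean for m in means]
    return poisson_pmf(sum(y_parts), total_mean) * multinomial_pmf(y_parts, probs)


def joint_via_product(y_parts, means):
    """Product of independent Poisson probabilities, right-hand side of (37)."""
    value = 1.0
    for y, m in zip(y_parts, means):
        value *= poisson_pmf(y, m)
    return value
```

The two functions agree up to floating-point rounding for any nonnegative counts and positive means, which is what allows the augmented-data density to factor into a product over genotypes.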
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have, for $k = 2$,
$$Y_j \mid (n_{ij},\ i = 0, 1,\ n_j) \sim \mathrm{Poisson}\Big\{\sum_{i=0}^{1} n_{ij} Q_i(j)\Big\}, \tag{38}$$
$$Y_{ij} \mid (Y_j = y_j,\ n_{ij},\ i = 0, 1,\ n_j) \sim \mathrm{Binomial}\Big(y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\Big), \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj},\ u = 0, 1, 2)$ is
$$P\{y_{0j}, y_j \mid n_{uj},\ u = 0, 1,\ n_j\} = P\{y_j \mid n_{ij},\ i = 0, 1,\ n_j\} \times P\{y_{0j} \mid y_j,\ n_{uj},\ u = 0, 1,\ n_j\} = \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\} \quad \text{if } k = 2, \tag{40}$$
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = (y_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N)$, $\mathbf{N} = (n_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N)$, $\underset{\sim}{n} = (n_j,\ j = 0, 1, \ldots, t_N)$, and $\underset{\sim}{y} = (y_j,\ j = 0, 1, \ldots, t_N)$. From (37) and (40), we have, for the conditional joint probability density function of $(\mathbf{Y}, \underset{\sim}{y})$ given $(\mathbf{N}, \underset{\sim}{n})$,
$$P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \{h[y_{2j};\ n_{2j} Q_2(j)]\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given $(\underset{\sim}{n}, \Theta)$ is
$$P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \{h[y_{2j};\ n_{2j} Q_2(j)]\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\mathrm{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is
$$\mathrm{Dev} = D_0(k) + \mathrm{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where
$$D_0(k) = 2\Big\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\Big\}, \tag{44}$$
$$\mathrm{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \Big\{ n_{0j} \log \frac{\hat{p}_0}{(1 - p_1 - p_2)} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i} \Big\}, \tag{45}$$
$$D_j = 2 \sum_{i=0}^{1} \Big\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}} \Big\} + 2(1 - \delta_{2k}) \times \Big\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}} \Big\}, \tag{46}$$
where $\hat{p}_i = \big(\sum_{j=0}^{t_N} n_{ij}\big) / \big(\sum_{j=0}^{t_N} n_j\big)$ $(i = 0, 1, 2)$.
The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of $(\mathbf{Y}, \underset{\sim}{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] have assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] have assumed one year as one time unit). Then, because the proliferation rate of the last-stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s, i)$.
Under the discrete time approximation, the $E[J_{k-1}(t, i)]$'s have been derived in the appendix. Using these results for the expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
$$\beta_{k-1} E[G_i(t)] = \sum_{r=i}^{i+1} E[J_r(t_0, i)] \Big( \prod_{u=r}^{k-1} \beta_u \Big) \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} = \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t), \tag{48}$$
where $(\theta_{i(k-1)},\ i = 1, 2, 3)$ and $(\lambda_{i(k-1)},\ i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$'s are given by
$$\phi_{i(k-1)}(t) = \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \frac{1}{\gamma_v} \{(1 + \gamma_v)^{t - t_0} - 1\}, \quad i = 0, 1, 2, 3. \tag{49}$$
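For the simplest case $i = k - 1$, the product in (49) is empty and $A_{ii}(i) = 1$, so $\phi_{ii}(t) = \{(1 + \gamma_i)^{t - t_0} - 1\}/\gamma_i$, which is just the geometric sum $\sum_{s=t_0}^{t-1} (1 + \gamma_i)^{s - t_0}$. The Python sketch below checks this identity and compares the discrete form with its continuous-time counterpart $\{e^{\gamma_i (t - t_0)} - 1\}/\gamma_i$; the function names and the parameter values used in the check are illustrative assumptions.

```python
import math


def phi_ii(t, t0, gamma_i):
    """Closed form {(1 + gamma_i)^(t - t0) - 1} / gamma_i from (49) with i = k - 1."""
    return ((1.0 + gamma_i) ** (t - t0) - 1.0) / gamma_i


def phi_ii_by_summation(t, t0, gamma_i):
    """Direct geometric sum over integer times s = t0, ..., t - 1."""
    return sum((1.0 + gamma_i) ** (s - t0) for s in range(t0, t))


def psi_ii(t, t0, gamma_i):
    """Continuous-time counterpart {exp(gamma_i * (t - t0)) - 1} / gamma_i."""
    return math.expm1(gamma_i * (t - t0)) / gamma_i
```

When $\gamma_i$ is small, $\log(1 + \gamma_i) \approx \gamma_i$ and the discrete and continuous forms nearly coincide, which is the sense in which the discrete-time approximation reproduces the continuous-time results.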
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
$$\phi_{0(k-1)}(t) = \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \frac{1}{\gamma_v^2} \{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1;\ j > 0)$, and for $j > 0$,
$$Q_0(j) \approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1),$$
$$Q_1(j) \approx (1 - \alpha_1)\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\} + o(\beta_1). \tag{51}$$
(2) If $k \geq 3$, then we have $Q_2(0) = \delta_{2(k-1)} \alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,
$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)} (1 - \alpha) \left\{ e^{-\lambda_{22} \phi_{22}(t_{j-1})} - e^{-\lambda_{22} \phi_{22}(t_j)} \right\} \\
&\quad + (1 - \delta_{2(k-1)}) \left\{ e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_j)} \right\} + o(\beta_{k-1}).
\end{aligned} \tag{52}$$
Notice that if one approximates $(1 + \gamma_v)^{t - t_0} = e^{(t - t_0) \log(1 + \gamma_v)}$ by $e^{(t - t_0) \gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
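The two elementary facts used in this passage, the closed form of a geometric sum and the closeness of the discrete growth factor $(1 + \gamma)^n$ to its continuous counterpart $e^{n\gamma}$ for small $\gamma$, can be checked numerically. The following is a minimal sketch; the function names and the values of `gamma` and `n` are illustrative assumptions, not quantities from the paper.

```python
import math

def geometric_sum(a, n):
    """Direct evaluation of a^0 + a^1 + ... + a^(n-1)."""
    return sum(a ** s for s in range(n))

def geometric_closed_form(a, n):
    """Closed form (a^n - 1) / (a - 1); the case a == 1 is handled separately."""
    if a == 1.0:
        return float(n)
    return (a ** n - 1.0) / (a - 1.0)

def discrete_factor(gamma, n):
    """Discrete-time growth factor (1 + gamma)^n."""
    return (1.0 + gamma) ** n

def continuous_factor(gamma, n):
    """Continuous-time counterpart exp(n * gamma)."""
    return math.exp(n * gamma)

# Illustrative values only (assumed, not estimates from the paper).
gamma, n = 0.001, 100
rel_gap = abs(discrete_factor(gamma, n) - continuous_factor(gamma, n)) / continuous_factor(gamma, n)
print(f"relative gap between the two growth factors: {rel_gap:.2e}")
```

The gap grows with both $\gamma$ and $n$, so the agreement between the discrete and continuous formulas is best for small proliferation rates.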
5 The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence data, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of the genotypes of individuals at the embryo stage.
5.1 The Prior Distribution of the Parameters. For the prior distribution of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$
where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k - 1$), we let $1 < \omega_0 = N(t_0) \beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k - 1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
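A prior that is constant inside a box of biological bounds and zero outside can be coded as a log-density that returns 0 inside the box and minus infinity outside. The sketch below illustrates this with bounds taken from constraints (i), (iii), and (iv); the dictionary keys and the function name are hypothetical, and constraint (ii) on the $\beta_j$ is omitted for brevity.

```python
import math

# Bounds from constraints (i), (iii), and (iv) in the text.
# The parameter names used as keys are hypothetical labels for this sketch.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}

def log_prior(params):
    """Log of a flat prior on the box: 0 inside (up to a constant), -inf outside."""
    for name, (low, high) in BOUNDS.items():
        if not (low < params[name] < high):
            return -math.inf
    return 0.0

inside = {"p1": 1e-3, "p2": 5e-7, "gamma1": 0.0, "gamma2": 0.03,
          "lambda0": 0.3, "lambda1": 0.5, "lambda2": 600.0,
          "theta1": 3e-4, "theta2": 0.05}
outside = dict(inside, p1=0.5)  # violates 0 < p1 < 1e-2
print(log_prior(inside), log_prior(outside))
```

Adding this log-prior to a log-likelihood leaves the likelihood unchanged inside the box, which is why the posterior modes in the steps below reduce to constrained maximization.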
5.2 The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain
$$\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} &\propto (\chi(k))^{y_0} e^{-\chi(k)} \, p_1^{\sum_{j=1}^{t_N} n_{1j}} \, p_2^{\sum_{j=1}^{t_N} n_{2j}} (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} &\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} \, Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned} \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
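The conditional posterior of $\Theta_1$ in (54) is a product of Poisson-type kernels, so its logarithm is a double sum that is easy to evaluate once the $Q_i(j)$ are available. The following is a minimal sketch under that reading of (54); the function name and the nested-list layout (`Q[j][i]` and `y[j][i]`, with rows indexed by age group and columns by genotype) are assumptions made for illustration, and the computation of the $Q_i(j)$ themselves from (51)-(52) is not shown.

```python
import math

def log_kernel_theta1(Q, y):
    """Log of the kernel in (54): sum over j and i of -Q_i(j) + y_ij * log Q_i(j)."""
    total = 0.0
    for Q_j, y_j in zip(Q, y):
        for q, count in zip(Q_j, y_j):
            if q <= 0.0:
                if count > 0:
                    return -math.inf  # a positive count is impossible when Q_i(j) = 0
                continue              # q = 0 and count = 0 contributes nothing
            total += -q + count * math.log(q)
    return total

# Toy inputs (assumed values, one age group and three genotypes).
Q_toy = [[1.0, 2.0, 0.5]]
y_toy = [[0, 1, 2]]
print(log_kernel_theta1(Q_toy, y_toy))
```

Maximizing this quantity over $\Theta_1 \in \Omega$ is the computation required in Step 4 of Section 5.3.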
5.3 The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-Step and with Steps 3 and 4 as the M-Step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.
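The weighted bootstrap idea in Step 1 can be sketched as follows: draw many candidates from a distribution that is easy to sample, weight each candidate by a likelihood-type factor, and resample one candidate with probability proportional to its weight. In the sketch below, the helper names, the group size, the genotype frequencies, and the weight function `toy_weight` are all hypothetical; in the paper the weights would come from $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40).

```python
import random

def draw_genotype_counts(n_j, p1, p2, rng):
    """One multinomial draw (n_0j, n_1j, n_2j) for a group of n_j people."""
    labels = rng.choices([0, 1, 2], weights=[1.0 - p1 - p2, p1, p2], k=n_j)
    return tuple(labels.count(g) for g in (0, 1, 2))

def weighted_bootstrap(candidates, weight_fn, rng):
    """Resample one candidate with probability proportional to weight_fn(candidate)."""
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0.0:
        raise ValueError("all candidate weights are zero")
    return rng.choices(candidates, weights=weights, k=1)[0]

def toy_weight(counts):
    # Hypothetical stand-in for the likelihood factor: favours about 12 carriers.
    return 1.0 / (1.0 + (counts[1] - 12) ** 2)

rng = random.Random(0)
# Exaggerated genotype frequencies so that the toy example has visible variation.
candidates = [draw_genotype_counts(200, 0.05, 0.01, rng) for _ in range(500)]
selected = weighted_bootstrap(candidates, toy_weight, rng)
print(selected)
```

The larger the candidate sample, the closer the resampled value is to a draw from the target conditional distribution.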
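Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \hat{\mathbf{N}}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.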
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$ independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample means as the estimates of $\Theta$ and uses the sample variances and covariances as estimates of the variances and covariances of these estimates.
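The final summarization step, turning a random sample of $\Theta$ into point estimates and their estimated variances and covariances, can be sketched as below. The function name and the toy draws are assumptions for illustration; each row of `draws` stands for one generated value of the parameter vector.

```python
def posterior_summary(draws):
    """Return (sample means, sample covariance matrix) of a list of parameter vectors."""
    m = len(draws)
    d = len(draws[0])
    means = [sum(row[k] for row in draws) / m for k in range(d)]
    cov = [
        [
            sum((row[a] - means[a]) * (row[b] - means[b]) for row in draws) / (m - 1)
            for b in range(d)
        ]
        for a in range(d)
    ]
    return means, cov

# Toy draws of a two-dimensional parameter vector (assumed values).
toy_draws = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
means, cov = posterior_summary(toy_draws)
print(means)
print(cov)
```

The diagonal of the covariance matrix gives the variance estimates; their square roots play the role of standard deviations such as the StD rows reported for the fitted parameters.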
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
Human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). To account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is therefore derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters, and (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                 Log-likelihood   AIC        BIC
Three-stage, Model-F   -1312.02         2648.04    2677.35
Three-stage, Model-1   -1295.76         2609.53    2631.51
Two-stage              -4392.421        8796.843   8811.569
Figure 5: Curve fitting of SEER data by the Model-F, Model-1, and the two-stage model. (Plot of incidences against age in years, 0 to 84; curves: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of the parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0, i)] \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from biological observations, one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume $E[N(t_0)] = E[J_i(t_0, i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
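The goodness-of-fit quantities used in observation (a) follow standard formulas, sketched below. The parameter count of 12 for Model-F is taken from the stated degrees of freedom (86 − 12 = 74); the log-likelihood is read as −1312.02 and the sample size of 85 used for the BIC is an assumption, chosen because together they reproduce the tabulated Model-F values. The function names are illustrative.

```python
import math

def chi_square(observed, predicted):
    """Pearson statistic: sum of (y_i - yhat_i)^2 / yhat_i."""
    return sum((y - yhat) ** 2 / yhat for y, yhat in zip(observed, predicted))

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2p - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: p log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Model-F: log-likelihood read from Table 2; 12 parameters; n_obs = 85 is assumed.
aic_f = aic(-1312.02, 12)
bic_f = bic(-1312.02, 12, 85)
print(round(aic_f, 2), round(bic_f, 2))

# Toy data (assumed) to illustrate the Pearson statistic.
print(chi_square([10, 20], [8, 25]))
```

Smaller AIC and BIC values indicate a better trade-off between fit and number of parameters, which is the basis for preferring the 3-stage models over the 2-stage model.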
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Parameters       λ_0        λ_1        λ_2        γ_1        γ_2        p_1        p_2        α          θ_1        θ_2
Model-F
  Estimates      2.88E-01   4.81E-01   6.55E+02   2.71E-05   3.37E-02   9.95E-04   9.68E-07   8.41E-01   3.29E-04   3.41E-01
  StD            1.21E-01   4.65E-02   1.12E+02   8.86E-06   1.92E-04   1.75E-06   6.48E-09   4.19E-02   3.54E-05   1.07E-02
  95% CL-Lower   5.03E-02   3.89E-01   4.34E+02   5.34E-06   3.33E-02   9.91E-04   9.55E-07   7.59E-01   2.60E-04   3.20E-01
  95% CL-Upper   5.25E-01   5.72E-01   8.75E+02   4.01E-05   3.41E-02   9.98E-04   9.81E-07   9.23E-01   3.98E-04   3.62E-01
Model-1
  Estimates      6.55E-02   3.72E-01   1.98E+02   NA         3.49E-02   9.98E-04   1.00E-06   8.07E-01   NA         NA
  StD            2.54E-02   4.63E-02   3.38E+01   NA         3.58E-03   6.27E-05   4.53E-08   7.06E-02   NA         NA
  95% CL-Lower   1.57E-02   2.82E-01   1.32E+02   NA         2.79E-02   8.75E-04   9.11E-07   6.68E-01   NA         NA
  95% CL-Upper   1.15E-01   4.63E-01   2.65E+02   NA         4.19E-02   1.12E-03   1.09E-06   9.45E-01   NA         NA
Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models
and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results have confirmed results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.
Applying our models and methods to the SEER data
of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($\hat{p}_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $\hat{y}_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. These will be our future research topics; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations of the state variables, respectively:
$$\begin{aligned}
J_i(t + 1, i) &= J_i(t, i) [1 + \gamma_i(t)] + e_i(t + 1, i), \quad i = 0, 1, \ldots, \mathrm{Min}(k - 1, 2), \\
J_j(t + 1, u) &= J_j(t, u) [1 + \gamma_j(t)] + J_{j-1}(t, u) \beta_{j-1}(t) + e_j(t + 1, u), \quad 0 \leq u < j \leq k - 1,
\end{aligned} \tag{A1}$$
where $e_i(t + 1, i) = [B_i(t, i) - J_i(t, i) b_i(t)] - [D_i(t, i) - J_i(t, i) d_i(t)]$ and $e_j(t + 1, i) = [B_j(t, i) - J_j(t, i) b_j(t)] - [D_j(t, i) - J_j(t, i) d_j(t)] + [M_{j-1}(t, i) - J_{j-1}(t, i) \beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0, i) > 0$ if $j = i, i + 1$ and $J_j(t_0, i) = 0$ if $j > i + 1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$\begin{aligned}
J_i(t, i) &= J_i(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s, i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \mathrm{Min}(k - 1, 2), \\
J_j(t, i) &= J_j(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s, i) \beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s, i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \leq i < j \leq k - 1.
\end{aligned} \tag{A2}$$
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k - 1$) and if $\gamma_i \neq \gamma_j$ if $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0, i) = 0$, $j > i + 1$) reduce to
$$\begin{aligned}
J_i(t, i) &= J_i(t_0, i) (1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t, i), \quad i = 0, 1, \ldots, \mathrm{Min}(k - 1, 2), \\
J_j(t, i) &= J_j(t_0, i) (1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0, i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r) (1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t, i) \\
&= \sum_{u=i}^{i+1} J_u(t_0, i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r) (1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t, i), \quad 0 \leq i < j \leq k - 1,
\end{aligned} \tag{A3}$$
where $\eta_j^{(0)}(t, i) = \sum_{s=t_0+1}^{t} e_j(s, i) (1 + \gamma_j)^{t - s}$ and $\eta_j^{(u)}(t, i) = \sum_{s=t_0+1}^{t} e_{j-u}(s, i) \sum_{v=u}^{j} A_{uj}(v) (1 + \gamma_v)^{t - s}$ ($u = 1, \ldots, j - i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ if $i \neq j$, the $E[J_{k-1}(t, i)]$'s in discrete time models under the initial conditions ($J_j(t_0, i) = 0$, $j > i + 1$) are given, respectively, by
$$E[J_{k-1}(t, i)] = \sum_{u=i}^{i+1} E[J_u(t_0, i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \sum_{r=u}^{k-1} A_{u(k-1)}(r) (1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \mathrm{Min}(k - 1, 2). \tag{A4}$$
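The expected-value recursion implied by (A1) can be iterated directly and compared with a closed form in a small case. The sketch below assumes that the noise terms $e_j$ have zero mean, so that they drop out of the expectations, and treats only two consecutive stages with the second stage empty at $t_0$; in that case summing the geometric terms gives $E[J_{1}(t)] = J_0(t_0) \beta_0 \{(1 + \gamma_0)^{n} - (1 + \gamma_1)^{n}\} / (\gamma_0 - \gamma_1)$ with $n = t - t_0$. Function names and numerical values are illustrative assumptions.

```python
def iterate_expected(j0, beta, gamma, steps):
    """Iterate E[J_m(t+1)] = E[J_m(t)](1 + gamma_m) + E[J_{m-1}(t)] beta_{m-1}.

    j0[m] is the expected number of stage-m cells at t0, beta[m] the transition
    rate from stage m to m + 1, and gamma[m] the net proliferation rate of stage m.
    """
    state = list(j0)
    for _ in range(steps):
        nxt = []
        for m, value in enumerate(state):
            inflow = state[m - 1] * beta[m - 1] if m > 0 else 0.0
            nxt.append(value * (1.0 + gamma[m]) + inflow)
        state = nxt
    return state

def closed_form_two_stages(j0_first, beta0, gamma0, gamma1, steps):
    """Closed forms for two stages when the second stage is empty at t0 (gamma0 != gamma1)."""
    a, b = 1.0 + gamma0, 1.0 + gamma1
    first = j0_first * a ** steps
    second = j0_first * beta0 * (a ** steps - b ** steps) / (gamma0 - gamma1)
    return first, second

# Illustrative values only (assumed orders of magnitude).
numeric = iterate_expected([1e8, 0.0], [1e-6], [0.0, 0.03], 10)
exact = closed_form_two_stages(1e8, 1e-6, 0.0, 0.03, 10)
print(numeric, exact)
```

The requirement $\gamma_0 \neq \gamma_1$ in the closed form mirrors the condition $\gamma_i \neq \gamma_j$ for $i \neq j$ under which (A3) and (A4) are stated.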
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
(Klebanov et al. [17], Yakovlev and Tsodikov [18], and Fakir et al. [19]); in these cases $T(t)$ is not Markov, so the Markov theory method is not applicable to $T(t)$. To develop analytical results and to resolve many difficult issues, Tan and his associates [4, 5, 24] have proposed an alternative approach through stochastic equations and have followed Yang and Chen [11] in assuming that cancer tumors develop by clonal expansion from primary last stage cells. Using the probability generating function method, Tan and Chen [24] have shown that if the Markov theory is applicable to $T(t)$, then the stochastic equation method is equivalent to the classical Markov theory method but is more powerful. Also, through the stochastic equation method, we have shown in the Appendix that the classical approach provides a close approximation to the discrete time model under the assumption that the primary last stage cells develop into a detectable tumor in one time unit. This provides a reasonable explanation of why the traditional approach (see [2, 22]) can still work well even though the Markov assumption for $T(t)$ may not hold. In this paper we will thus basically use the stochastic equation method and assume that cancer tumors develop from primary last stage cells through clonal expansion.
3.1. Stochastic Equations for the State Variables. Assume now that an individual is a $J_i$ ($i = 0, 1, 2$) person at the embryo stage. Then in this individual cancer is developed by a $(k-i)$-stage multievent model given by $J_i \to \cdots \to J_k \to \text{Tumor}$, and the identifiable response variables are given by $\{\underset{\sim}{X}_i(t)', T(t)\}$. To derive stochastic equations for the staging variables in $\underset{\sim}{X}_i(t)$ in this individual, observe that for each $i = 0, 1, 2$, $\underset{\sim}{X}_i(t)'$ is in general a Markov process although $T(t)$ may not be Markov; see Remark 1, Tan [3], Tan et al. [4, 5], Tan and Zhou [9], and Tan and Yan [16]. It follows that $\underset{\sim}{X}_i(t+\Delta t)'$ derives from $\underset{\sim}{X}_i(t)'$ through stochastic birth-death processes of $J_u$ ($u = i, i+1, \ldots, k-1$) cells and through stochastic transitions $J_u \to J_{u+1}$, $u = i, i+1, \ldots, k-1$, during $(t, t+\Delta t]$. Let $\{B_u(t; i), D_u(t; i), M_u(t; i)\}$ be the number of births, the number of deaths of $J_u$ cells, and the number of transitions from $J_u \to J_{u+1}$ cells during $(t, t+\Delta t]$, respectively, in people who are $J_i$ people at the embryo stage. Let $M_0(t)$ denote the number of transitions from $N \to J_1$ during $(t, t+\Delta t]$. Because the transition $J_u \to J_{u+1}$ would not affect the number of $J_u$ cells but only increase the number of $J_{u+1}$ cells (see Remark 4), by the conservation law we have the following stochastic equations for $J_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$ (see Tan [3], Tan et al. [4, 5, 8], Tan and Zhou [9], and Tan and Yan [16]):

$$
\begin{aligned}
J_i(t+\Delta t; i) &= J_i(t; i) + B_i(t; i) - D_i(t; i), \quad i = 0, 1, 2, \\
J_u(t+\Delta t; i) &= J_u(t; i) + B_u(t; i) - D_u(t; i) + M_{u-1}(t; i), \quad i < u \le k-1.
\end{aligned}
\tag{2}
$$
Because $\{B_v(t; i), D_v(t; i), M_v(t; i),\ i = 0, 1, 2,\ v = i, \ldots, k-1\}$ are random variables, the above equations are basically stochastic equations. To derive probability distributions of these variables, let $b_u(t)$ and $d_u(t)$ denote the birth rate and the death rate at time $t$ of the $J_u$ ($u = 0, 1, \ldots, k-1$) cells, respectively. Let $\beta_u(t)$ ($u = 0, 1, \ldots, k-1$) be the transition rate at time $t$ from $J_u \to J_{u+1}$. Then, as shown in Tan [3], for ($i = 0, 1, 2$, $u = i, \ldots, k-1$) we have to the order of $o(\Delta t)$:

$$
\{B_u(t; i), D_u(t; i), M_u(t; i)\} \mid J_u(t; i) \sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t,\ \beta_u(t)\Delta t\}.
\tag{3}
$$
It follows that to the order of $o(\Delta t)$,

$$
\begin{aligned}
\{B_u(t; i), D_u(t; i)\} \mid J_u(t; i) &\sim \text{Multinomial}\{J_u(t; i);\ b_u(t)\Delta t,\ d_u(t)\Delta t\}, \\
M_u(t; i) \mid J_u(t; i) &\sim \text{Binomial}\{J_u(t; i);\ \beta_u(t)\Delta t\} \\
&\sim \text{Poisson}\{J_u(t; i)\beta_u(t)\Delta t\} + o(\beta_u(t)\Delta t),
\end{aligned}
\tag{4}
$$

independently of $\{B_u(t; i), D_u(t; i)\}$, $u = 0, 1, \ldots, k-1$.
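The conservation equations (2) together with the multinomial law (3) can be read as a recipe for advancing the cell counts by one small time step. The following is a minimal sketch of that reading, not the authors' implementation: the function names, the list-based indexing (position 0 stands for stage $J_i$), and all rate values are assumptions made for illustration, and the step size must be small enough that the three per-cell probabilities sum to at most one.

```python
import random


def multinomial_step(n_cells, b, d, beta, dt, rng):
    """Draw (births, deaths, transitions) for n_cells cells over one step dt.

    Each cell independently divides (prob b*dt), dies (prob d*dt),
    spawns one next-stage cell (prob beta*dt), or does nothing,
    which is the multinomial law of equation (3).
    """
    p_birth, p_death, p_trans = b * dt, d * dt, beta * dt
    if p_birth + p_death + p_trans > 1.0:
        raise ValueError("dt is too large for these rates")
    births = deaths = transitions = 0
    for _ in range(n_cells):
        x = rng.random()
        if x < p_birth:
            births += 1
        elif x < p_birth + p_death:
            deaths += 1
        elif x < p_birth + p_death + p_trans:
            transitions += 1
    return births, deaths, transitions


def step_stages(J, b, d, beta, dt, rng):
    """Advance the stage counts J by one step using equation (2).

    A transition J_u -> J_{u+1} adds a cell to stage u+1 but does not
    remove one from stage u (see Remark 4).
    """
    draws = [multinomial_step(J[u], b[u], d[u], beta[u], dt, rng)
             for u in range(len(J))]
    new_J = []
    for u, (n_birth, n_death, _) in enumerate(draws):
        inflow = draws[u - 1][2] if u > 0 else 0  # M_{u-1}
        new_J.append(J[u] + n_birth - n_death + inflow)
    return new_J
```

Transitions out of the last tracked stage are drawn but not stored here; in the model they generate primary tumor cells.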
From these distribution results, by subtracting from the random transition variables their conditional means, respectively, we obtain the following stochastic equations for the state variables $\underset{\sim}{X}_i(t)$ ($i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$):

$$
\begin{aligned}
dJ_i(t; i) &= J_i(t+\Delta t; i) - J_i(t; i) = B_i(t; i) - D_i(t; i) \\
&= J_i(t; i)\gamma_i(t)\Delta t + e_i(t; i)\Delta t, \quad i = 0, 1, 2, \\
dJ_u(t; i) &= J_u(t+\Delta t; i) - J_u(t; i) = M_{u-1}(t; i) + B_u(t; i) - D_u(t; i) \\
&= \{J_{u-1}(t; i)\beta_{u-1}(t) + J_u(t; i)\gamma_u(t)\}\Delta t + e_u(t; i)\Delta t, \\
&\qquad i = 0, 1, \ldots, \operatorname{Min}(2, k-1),\ i < u \le k-1,
\end{aligned}
\tag{5}
$$
where $\gamma_u(t) = b_u(t) - d_u(t)$ for $u = 0, 1, \ldots, k-1$, and where

$$
e_i(t; i)\Delta t = [B_i(t; i) - J_i(t; i)b_i(t)\Delta t] - [D_i(t; i) - J_i(t; i)d_i(t)\Delta t]
$$

for $i = 0, 1, 2$, and

$$
e_u(t; i)\Delta t = [B_u(t; i) - J_u(t; i)b_u(t)\Delta t] - [D_u(t; i) - J_u(t; i)d_u(t)\Delta t] + [M_{u-1}(t; i) - J_{u-1}(t; i)\beta_{u-1}(t)\Delta t]
$$

for $i = 0, 1, \ldots, \operatorname{Min}(2, k-1)$, $i < u \le k-1$.

From the above equations, by dividing both sides by $\Delta t$ and letting $\Delta t \to 0$, we obtain

$$
\begin{aligned}
\frac{d}{dt}J_i(t; i) &= J_i(t; i)\gamma_i(t) + e_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
\frac{d}{dt}J_u(t; i) &= J_u(t; i)\gamma_u(t) + J_{u-1}(t; i)\beta_{u-1}(t) + e_u(t; i), \\
&\qquad \text{for } i = 0, 1, \ldots, \operatorname{Min}(2, k-1),\ i < u \le k-1.
\end{aligned}
\tag{6}
$$
In the above equations, using the distribution results in (4), it can easily be shown that the random noises $e_u(t; i)$, $u = i, \ldots, k-1$, $i = 0, 1, 2$, have expected value zero and are uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $\{J_u(t_0; i) > 0,\ u = i, i+1;\ J_u(t_0; i) = 0,\ u > i+1\}$.

Given the initial conditions $(J_u(t_0; i) > 0,\ u = i, i+1)$ and $(J_u(t_0; i) = 0,\ u > i+1)$ at birth ($t_0$), the solution of the equations in (6) is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t}\gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\beta_{u-1}(x)\, e^{\int_{x}^{t}\gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t}\gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i}\eta_u^{(v)}(t; i), \\
&\qquad u = i+1, \ldots, k-1,
\end{aligned}
\tag{7}
$$

where $i = 0$ if $k = 2$; $i = 0, 1$ if $k = 3$; and $i = 0, 1, 2$ if $k > 3$,
and where

$$
\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy + \int_{t_0}^{x}\gamma_{u-1}(y)\,dy}\,\beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) = \eta_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t}\gamma_u(y)\,dy}\,\beta_{u-1}(x)\,\eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned}
\tag{8}
$$
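Because the noise terms have expected value zero, taking expectations in (6) leaves a linear system of ordinary differential equations for the mean cell counts. The sketch below integrates that mean system numerically with a classical Runge-Kutta step for constant rates, and compares it with the two-stage closed form obtained by evaluating the convolution integral in the first line of (7) with constant rates and a zero initial second-stage count. The function names, the local indexing (position 0 stands for stage $J_i$), and the parameter values in the usage lines are assumptions for illustration only.

```python
import math


def mean_rhs(m, gamma, beta):
    """Right-hand side of the mean equations obtained from (6):
    dm_0/dt = gamma_0 m_0 and dm_u/dt = gamma_u m_u + beta_{u-1} m_{u-1}."""
    out = [gamma[0] * m[0]]
    for u in range(1, len(m)):
        out.append(gamma[u] * m[u] + beta[u - 1] * m[u - 1])
    return out


def integrate_means(m0, gamma, beta, elapsed, steps=2000):
    """Fourth-order Runge-Kutta integration of the mean equations."""
    h = elapsed / steps
    m = list(m0)
    for _ in range(steps):
        k1 = mean_rhs(m, gamma, beta)
        k2 = mean_rhs([x + 0.5 * h * k for x, k in zip(m, k1)], gamma, beta)
        k3 = mean_rhs([x + 0.5 * h * k for x, k in zip(m, k2)], gamma, beta)
        k4 = mean_rhs([x + h * k for x, k in zip(m, k3)], gamma, beta)
        m = [x + h * (a + 2 * b + 2 * c + d) / 6.0
             for x, a, b, c, d in zip(m, k1, k2, k3, k4)]
    return m


def two_stage_closed_form(J0, gamma0, gamma1, beta0, elapsed):
    """Means of the first two stages for constant rates, with the second
    stage empty at the start and gamma0 != gamma1."""
    first = J0 * math.exp(gamma0 * elapsed)
    second = beta0 * J0 * (math.exp(gamma1 * elapsed)
                           - math.exp(gamma0 * elapsed)) / (gamma1 - gamma0)
    return first, second


# Hypothetical rates: 1000 initial cells, 10 time units.
numeric = integrate_means([1000.0, 0.0], [0.05, 0.12], [1e-3], 10.0)
exact = two_stage_closed_form(1000.0, 0.05, 0.12, 1e-3, 10.0)
```

The numerical route needs no distinct-growth-rate assumption, so it can serve as a cross-check on any closed form used for the expected counts.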
If the model is time homogeneous so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ whenever $i \ne u$, the above solutions under the initial conditions $(J_{i+u}(t_0; i) > 0,\ u = 0, 1;\ J_{i+u}(t_0; i) = 0,\ u > 1)$ then reduce, respectively, to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i(t-t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u(t-t_0)} + \beta_{u-1}\int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u(t-x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u(t-t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i)\Bigl(\prod_{v=r}^{u-1}\beta_v\Bigr)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t-t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i)\Bigl(\prod_{v=r}^{u-1}\beta_v\Bigr)\sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l(t-t_0)} + \sum_{r=i}^{u}\eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned}
\tag{9}
$$
where for $i \le v \le u$,

$$
A_{iu}(v) =
\begin{cases}
1 & \text{if } i = u, \\
\displaystyle\prod_{l=i,\ l \ne v}^{u} (\gamma_l - \gamma_v)^{-1} & \text{if } i < u.
\end{cases}
\tag{10}
$$
Obviously $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$; $u = i, \ldots, k-1$; $r = 1, \ldots, u-i+1$). It follows that for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$), the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ whenever $i \ne u$ are given by

$$
\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)]\Bigl(\prod_{u=i+1}^{k-2}\beta_u\Bigr)\sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v(t-t_0)} \\
&\quad + E[J_i(t_0; i)]\Bigl(\prod_{u=i}^{k-2}\beta_u\Bigr)\sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v(t-t_0)}, \\
&\qquad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2),\ k \ge 2,
\end{aligned}
\tag{11}
$$
where, as a convention, $\sum_{j=i+1}^{i} c_j = 0$ and $\prod_{j=i+1}^{i} d_j = 1$ for all $(c_j, d_j)$.
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to one $J_u$ cell and one $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but would only increase the size of the $J_{u+1}$ population.
3.2. Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n, p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and using the probability distributions of the transition variables $\{B_r(t; i), D_r(t; i), M_r(t; i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t+\Delta t; i),\ r = i, \ldots, k-1\}$ given $\{J_r(t; i),\ r = i, \ldots, k-1\}$ for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$):

$$
\begin{aligned}
&P\{J_r(t+\Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t+\Delta t; i) = v_i \mid J_i(t; i) = u_i\} \\
&\qquad \times \prod_{j=i+1}^{k-1} P\{J_j(t+\Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned}
\tag{12}
$$
where

$$
\begin{aligned}
P\{J_i(t+\Delta t; i) = v_i \mid J_i(t; i) = u_i\}
&= \sum_{r=0}^{u_i} f(r;\ u_i,\ b_i(t)\Delta t) \times f\Bigl(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Bigr), \\
&\qquad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
P\{J_j(t+\Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\}
&= \sum_{r_1=0}^{u_j}\sum_{r_2=0}^{u_j - r_1} g(r_1, r_2;\ u_j,\ b_j(t)\Delta t,\ d_j(t)\Delta t) \\
&\qquad \times h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}(t)\Delta t), \quad j > i.
\end{aligned}
\tag{13}
$$
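The first formula in (13) is a finite convolution of two binomial densities, so it can be evaluated exactly for small cell counts. The following is a minimal sketch under assumed, purely illustrative rate values; the function names are not from the paper. A quick sanity check on any such implementation is that the probabilities over all reachable next states sum to one and that the mean equals $u_i\{1 + (b_i - d_i)\Delta t\}$.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial density f(x; n, p), zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def first_stage_transition(v, u, b, d, dt):
    """P{J_i(t+dt) = v | J_i(t) = u} from the first formula in (13).

    r counts births among the u cells; the remaining u - r cells each
    die with conditional probability d*dt / (1 - b*dt), and the number
    of deaths needed to land on v is u - v + r.
    """
    p_birth = b * dt
    p_death = d * dt / (1.0 - p_birth)
    return sum(binom_pmf(r, u, p_birth) * binom_pmf(u - v + r, u - r, p_death)
               for r in range(u + 1))


# Hypothetical values: 20 cells, birth rate 0.3, death rate 0.1, dt = 0.1.
row = [first_stage_transition(v, 20, 0.3, 0.1, 0.1) for v in range(41)]
```

The next state can range from 0 (all cells die) to $2u_i$ (all cells divide), which is why the usage line evaluates 41 states for 20 cells.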
Define the unobservable transition variables $\underset{\sim}{U}_i(t) = \{B_i(t; i),\ (B_j(t; i), D_j(t; i)),\ j = i+1, \ldots, k-1\}'$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$). Then we have for the joint probability density function of $\{\underset{\sim}{X}_i(t+\Delta t), \underset{\sim}{U}_i(t)\}$ given $\underset{\sim}{X}_i(t)$:

$$
P\{\underset{\sim}{X}_i(t+\Delta t), \underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = P\{\underset{\sim}{X}_i(t+\Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} \times P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\},
\tag{14}
$$
where

$$
\begin{aligned}
P\{\underset{\sim}{X}_i(t+\Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\}
&= f\Bigl(J_i(t; i) - J_i(t+\Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Bigr) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t+\Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\beta_{j-1}(t)\Delta t\},
\end{aligned}
\tag{15}
$$

$$
\begin{aligned}
P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\}
&= f(B_i(t; i);\ J_i(t; i),\ b_i(t)\Delta t) \\
&\quad \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i),\ b_j(t)\Delta t,\ d_j(t)\Delta t\}.
\end{aligned}
\tag{16}
$$
Let $\underset{\sim}{e}_i(j)$ be a $(k-i) \times 1$ column vector with 1 in the $j$th position ($1 \le j \le k-i$) and with 0 in other positions. Let $\underset{\sim}{u} = (u_i, \ldots, u_{k-1})'$ and $\underset{\sim}{v} = (v_i, \ldots, v_{k-1})'$ be $(k-i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)–(16), it can readily be shown that

$$
P\{\underset{\sim}{X}_i(t+\Delta t) = \underset{\sim}{v} \mid \underset{\sim}{X}_i(t) = \underset{\sim}{u}\} =
\begin{cases}
[u_j b_j(t) + (1 - \delta_{ji})u_{j-1}\beta_{j-1}(t)]\Delta t + o(\Delta t) & \text{if } \underset{\sim}{v} = \underset{\sim}{u} + \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\Delta t + o(\Delta t) & \text{if } \underset{\sim}{v} = \underset{\sim}{u} - \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t) & \text{if } \bigl|\underset{\sim}{1}'_{n-k}(\underset{\sim}{u} - \underset{\sim}{v})\bigr| \ge 2.
\end{cases}
\tag{17}
$$
The above results imply that $\underset{\sim}{X}_i(t)$ is a $(k-i)$-dimensional birth-death process with birth rates $\{i\,b_u(t),\ u = i, \ldots, k-1\}$, death rates $\{i\,d_u(t),\ u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t),\ u = i, \ldots, k-1;\ \beta_{u,v}(j, t) = 0 \text{ if } v \ne u+1\}$ (see Definition 4.1 in Tan ([22], Chapter 4)). Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i,\ J_j(0) = 0,\ j > i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$) in the above model is given by

$$
\begin{aligned}
\frac{d}{dt}P(u_j,\ j = i, \ldots, k-1;\ t)
&= P(u_i - 1,\ u_j,\ j = i+1, \ldots, k-1;\ t)\,(u_i - 1)\,b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j - 1)\,b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1};\ t)\,u_j\beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j + 1)\,d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1};\ t) \times \Bigl\{\sum_{j=i}^{k-1} u_j[b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j\beta_j(t)\Bigr\},
\end{aligned}
\tag{18}
$$

for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.
By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ numerically.
3.3. The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22], Chapter 8, and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i),\ s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\beta_{k-1}(s)P_T(s, t)\,ds$. That is,

$$
T(t) \mid \{J_{k-1}(s; i),\ s \le t\} \sim \text{Poisson}(\omega(t; i)).
\tag{19}
$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by

$$
\begin{aligned}
Q_i(j) &= E\{e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)}\} \\
&= e^{-\beta_{k-1}H_i(t_{j-1})} - e^{-\beta_{k-1}H_i(t_j)} + o(\beta_{k-1}),
\end{aligned}
\tag{20}
$$
where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)]\,P_T(x, t)\,dx$.
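As a rough numerical illustration of (20), the sketch below evaluates $H_i(t)$ with a composite trapezoid rule and then forms the interval probabilities, dropping the $o(\beta_{k-1})$ remainder. Everything named here is an assumption for illustration: the function names, the exponential mean curve, the choice $P_T(s, t) = 1$, and the rate value are hypothetical and are not estimates from the paper.

```python
import math


def trapezoid(f, a, b, n=2000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


def H(t, t0, mean_last_stage, p_tumor):
    """H_i(t): integral over [t0, t] of E[J_{k-1}(x; i)] * P_T(x, t)."""
    return trapezoid(lambda x: mean_last_stage(x) * p_tumor(x, t), t0, t)


def interval_probabilities(ages, beta_last, mean_last_stage, p_tumor):
    """Approximate Q_i(j) for j = 1, ..., len(ages) - 1; ages[0] is t0."""
    t0 = ages[0]
    no_tumor = [math.exp(-beta_last * H(t, t0, mean_last_stage, p_tumor))
                for t in ages]
    return [no_tumor[j - 1] - no_tumor[j] for j in range(1, len(ages))]


# Hypothetical inputs: exponentially growing mean and P_T(s, t) = 1.
mean_curve = lambda x: 1000.0 * math.exp(0.1 * x)
always_detected = lambda s, t: 1.0
Q = interval_probabilities([0.0, 1.0, 2.0, 3.0], 1e-6,
                           mean_curve, always_detected)
```

Because the terms telescope, the interval probabilities sum to one minus the probability of remaining tumor free up to the last age, which gives a simple consistency check.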
To derive $Q_i(j)$, denote by

$$
\begin{aligned}
\theta_{i(k-1)} &= E[J_i(t_0; i-1)]\,\beta_i \prod_{u=i+1}^{k-1}\Bigl(\frac{\beta_u}{\gamma_u}\Bigr), \quad i = 1, \ldots, \operatorname{Min}(3, k-1), \\
\lambda_{u(k-1)} &= E[J_u(t_0; u)]\,\beta_u \prod_{v=u+1}^{k-1}\Bigl(\frac{\beta_v}{\gamma_v}\Bigr), \quad u = 0, 1, \ldots, \operatorname{Min}(2, k-1),
\end{aligned}
\tag{21}
$$
and define the functions

$$
\psi_{i(k-1)}(t) = \Bigl(\prod_{u=i+1}^{k-1}\gamma_u\Bigr)\sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \int_{t_0}^{t} e^{\gamma_v(x-t_0)}\,P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1).
\tag{22}
$$
Applying the results for $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ whenever $i \ne j$, we obtain the $Q_i(j)$'s as follows.

(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$; $j > 0$), and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1),
\tag{23}
$$

$$
Q_1(j) = (1 - \alpha_1)\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\} + o(\beta_1).
\tag{24}
$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\alpha$, and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{25}
$$

$$
Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{26}
$$

$$
\begin{aligned}
Q_2(j) &= \delta_{2k}(1 - \alpha)\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\} \\
&\quad + (1 - \delta_{2k})\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\} \\
&\quad + o(\beta_{k-1}),
\end{aligned}
\tag{27}
$$

where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$
\begin{aligned}
\psi_{0(k-1)}(t) &= \Bigl(\prod_{u=1}^{k-1}\gamma_u\Bigr)\sum_{v=0}^{k-1} A_{0(k-1)}(v) \times \int_{t_0}^{t} e^{\gamma_v(x-t_0)}\,P_T(x, t)\,dx \\
&= \Bigl(\prod_{u=1}^{k-1}\gamma_u\Bigr)\sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \int_{t_0}^{t} [e^{\gamma_v(x-t_0)} - 1]\,P_T(x, t)\,dx.
\end{aligned}
\tag{28}
$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$
\begin{aligned}
\psi_{ii}(t) &= \frac{1}{\gamma_i}\{e^{\gamma_i(t-t_0)} - 1\}, \quad i = 1, \ldots, k-1, \\
\psi_{0(k-1)}(t) &= \Bigl(\prod_{u=1}^{k-1}\gamma_u\Bigr)\sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^{2}}\{e^{\gamma_v(t-t_0)} - 1 - (t - t_0)\gamma_v\}, \\
\psi_{i(k-1)}(t) &= \Bigl(\prod_{u=i+1}^{k-1}\gamma_u\Bigr)\sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v}\{e^{\gamma_v(t-t_0)} - 1\}, \quad i = 1, \ldots, k-1.
\end{aligned}
\tag{29}
$$
4. Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases

To estimate unknown parameters and to validate the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0),\ (y_j, n_j),\ j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ is the total number of births, and where for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section we develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j;\ p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j;\ p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ ($i = 0, 1, 2$) people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore

$$
Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3,
\tag{30}
$$

where

$$
\chi_k =
\begin{cases}
n_0(p_2 + p_1\alpha) & \text{if } k = 2, \\
n_0 p_2 \alpha & \text{if } k = 3.
\end{cases}
\tag{31}
$$
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is

$$
\begin{aligned}
D_0(k) &= -2\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\} \\
&= 2\Bigl\{\chi_k - y_0 - y_0 \log\frac{\chi_k}{y_0}\Bigr\}, \quad k = 2, 3.
\end{aligned}
\tag{32}
$$
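The deviance at birth follows directly from the Poisson log density, since the $\log y_0!$ terms cancel between the fitted and saturated models. The sketch below computes $\chi_k$ from (31) and $D_0(k)$ from (32) and checks the closed form against the defining difference of log densities. The function names are illustrative assumptions; the usage line plugs in the observed count at age 0 (34) and the Model-F predicted count (36) from Table 1 as $y_0$ and $\chi_k$, and requires $y_0 > 0$.

```python
import math


def poisson_logpmf(y, lam):
    """Log of the Poisson density h(y; lam) for integer y >= 0, lam > 0."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)


def chi_k(k, n0, p1, p2, alpha):
    """Poisson mean of Y_0 from equation (31) for a k-stage model."""
    if k == 2:
        return n0 * (p2 + p1 * alpha)
    if k == 3:
        return n0 * p2 * alpha
    raise ValueError("cases at birth require k = 2 or k = 3")


def deviance_at_birth(y0, chi):
    """D_0(k) from equation (32); assumes y0 > 0 and chi > 0."""
    return 2.0 * (chi - y0 - y0 * math.log(chi / y0))


# Observed and Model-F predicted counts at age 0 in Table 1.
d0 = deviance_at_birth(34, 36.0)
```

The deviance is zero when the fitted mean equals the observed count and grows as the two move apart, so it can be added to the deviances of the later age groups to compare candidate models.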
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group   People at risk   Observed incidence   Model-F predicted   Model-1 predicted   Two-stage predicted
0           12495777         34                   36                  36                  38
1           12221582         20                   11                  11                  16
2           12120990         14                   7                   8                   17
3           12112995         9                    5                   6                   18
4           12146174         4                    4                   5                   20
5           12161336         3                    3                   4                   22
6           12111854         2                    2                   4                   25
7           12160452         2                    2                   4                   28
8           11942586         2                    2                   4                   30
9           12381299         1                    2                   5                   34
10          12512703         2                    3                   6                   38
11          12410338         5                    3                   6                   41
12          12449244         2                    3                   6                   44
13          12527781         5                    4                   7                   48
14          12602883         3                    5                   7                   51
15          12719598         5                    5                   8                   55
16          12766107         7                    6                   9                   59
17          12831400         9                    7                   9                   62
18          12382047         8                    7                   9                   63
19          12581638         10                   8                   10                  68
20          12636509         7                    9                   11                  71
21          12682601         6                    10                  10                  75
22          12840510         12                   11                  11                  79
23          13075528         17                   13                  13                  84
24          13358635         16                   15                  14                  89
25          13473849         12                   16                  16                  94
26          13426340         17                   18                  18                  97
27          13525264         28                   20                  20                  101
28          13149674         17                   22                  21                  102
29          13812811         23                   25                  25                  110
30          13886874         24                   28                  27                  114
31          13488332         37                   30                  29                  115
32          13460286         32                   32                  32                  118
33          13256067         38                   35                  35                  119
34          13428827         37                   39                  38                  124
35          13220037         40                   41                  41                  126
36          12870265         30                   44                  44                  126
37          12689592         43                   47                  47                  127
38          12157014         42                   49                  49                  125
39          12494081         46                   55                  54                  131
40          12272125         49                   58                  58                  132
41          11826573         56                   61                  60                  130
42          11663153         54                   65                  64                  131
43          11407082         53                   68                  68                  131
44          11296848         70                   73                  72                  133
45          11016369         57                   76                  76                  132
46          10651593         71                   79                  79                  130
47          10475708         87                   84                  83                  131
48          9994684          82                   86                  85                  127
49          10138908         78                   93                  93                  131
50          9836359          87                   97                  96                  130
51          9475641          95                   100                 99                  127
52          9250985          113                  104                 104                 126
53          9027382          106                  108                 108                 125
54          8883737          117                  113                 113                 125
55          8547883          129                  116                 116                 123
56          8279648          107                  119                 119                 121
57          8062368          119                  123                 123                 119
58          7654610          132                  124                 124                 115
59          7563706          118                  130                 129                 115
60          7232719          131                  131                 130                 112
61          6927332          116                  132                 132                 109
62          6708273          133                  134                 134                 107
63          6543931          143                  138                 137                 106
64          6404652          130                  141                 141                 105
65          6168486          145                  142                 142                 102
66          5913479          138                  142                 142                 99
67          5746766          151                  144                 144                 98
68          5480517          147                  142                 142                 94
69          5363912          149                  144                 144                 94
70          5110728          136                  142                 142                 90
71          4925076          144                  141                 141                 88
72          4696825          140                  138                 138                 85
73          4512136          146                  135                 135                 82
74          4345300          126                  133                 133                 80
75          4148801          122                  128                 129                 78
76          3900900          124                  122                 122                 74
77          3681587          114                  116                 116                 70
78          3481918          93                   110                 110                 67
79          3243631          102                  102                 102                 63
80          2961234          79                   92                  93                  58
81          2724984          93                   83                  84                  54
82          2495219          82                   75                  75                  50
83          2271595          92                   66                  67                  46
84          2041351          64                   57                  58                  42
85          10466605         304                  282                 285                 216

Note 1: for age 0 and ages 1–10 years, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting the retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ ($j \ge 1$). To derive the probability distribution of $Y_j$ ($j \ge 1$) in the $j$th age group, let $Y_{ij}$ ($i = 0, 1, 2$) be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is
$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumor at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all ($i = 0, 1, 2$; $j = 0, 1, \ldots, n_T$), where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$
$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\}, \tag{34}$$
where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ is the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions, with the mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
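The mixture in (34) can be sketched numerically for a very small age group. The following is a minimal illustration only, not the authors' implementation: the function names, the toy values of $n_j$, $p_0$, $p_1$, and the $Q_i(j)$ are all assumptions made here, and direct enumeration is feasible only for tiny $n_j$ (the SEER groups in Table 1 have millions of people at risk).

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability h(y; mean), computed on the log scale for stability."""
    if mean == 0.0:
        return 1.0 if y == 0 else 0.0
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))

def multinomial_pmf(n0, n1, n, p0, p1):
    """Trinomial probability g(n0, n1; n, p0, p1); the third cell is implied."""
    n2, p2 = n - n0 - n1, 1.0 - p0 - p1
    coef = math.comb(n, n0) * math.comb(n - n0, n1)
    return coef * p0**n0 * p1**n1 * p2**n2

def mixture_pmf(y, n, p0, p1, q, k=3):
    """P(y_j | n_j) as in (34): a multinomial mixture of Poisson densities.

    q = (Q_0(j), Q_1(j), Q_2(j)) are hypothetical per-person probabilities;
    the J_2 term is dropped when k == 2, mirroring the factor (1 - delta_2k).
    """
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            mean = n0 * q[0] + n1 * q[1]
            if k != 2:
                mean += n2 * q[2]
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, mean)
    return total
```

Under these assumptions the mean of the mixture equals $n_j \sum_i p_i Q_i(j)$, which gives a quick sanity check on any chosen toy values.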
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is
$$L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k - 1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k - 1$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. For applying the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for ($j = 1, \ldots, t_N$) and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1)$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is
$$(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\left\{y_j; \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\right\}. \tag{36}$$
Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$
$$\begin{aligned} P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\} &= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\ &= \frac{1}{y_j!} e^{-Q_T(j)} [Q_T(j)]^{y_j} \times g\left\{y_{0j}, y_{1j}; y_j, \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\right\} \\ &= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2), \end{aligned} \tag{37}$$
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
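The factorization in (37) is the standard identity that a Poisson total split multinomially is equivalent to independent Poisson counts. A minimal numerical check is sketched below; the function names and the toy counts and means are assumptions for illustration and do not come from the paper.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability h(y; mean) for mean > 0."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))

def joint_via_total(ys, means):
    """Poisson density of the total times the multinomial split (middle line of (37))."""
    y, q_total = sum(ys), sum(means)
    log_split = math.lgamma(y + 1)
    for yi, mi in zip(ys, means):
        log_split += yi * math.log(mi / q_total) - math.lgamma(yi + 1)
    return poisson_pmf(y, q_total) * math.exp(log_split)

def joint_via_product(ys, means):
    """Product of independent Poisson densities (last line of (37))."""
    return math.prod(poisson_pmf(yi, mi) for yi, mi in zip(ys, means))
```

Here each entry of `means` plays the role of $n_{ij} Q_i(j)$; the two functions should agree up to floating-point rounding for any positive means.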
If $k = 2$, then $Y_{2j} = 0$, so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus, we have for $k = 2$
$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$
$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left\{y_j; \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$, and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is
$$\begin{aligned} P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\} &= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\ &= \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \quad \text{if } k = 2, \end{aligned} \tag{40}$$
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\tilde{n} = (n_j, j = 0, 1, \ldots, t_N)$, and $\tilde{y} = (y_j, j = 0, 1, \ldots, t_N)$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \tilde{y})$ given $(\mathbf{N}, \tilde{n})$
$$P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{ \left[h(y_{2j}; n_{2j} Q_2(j))\right]^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \right\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \tilde{y}\}$ given $(\tilde{n}, \Theta)$ is
$$P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{ g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \left[h(y_{2j}; n_{2j} Q_2(j))\right]^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \right\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta] - \log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \hat{\Theta}]\}$ is
$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where
$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log\frac{\chi(k)}{y_0}\right\}, \tag{44}$$
$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{ n_{0j} \log\frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log\frac{\hat{p}_i}{p_i} \right\}, \tag{45}$$
$$D_j = 2 \sum_{i=0}^{1} \left\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log\frac{n_{ij} Q_i(j)}{y_{ij}} \right\} + 2(1 - \delta_{2k}) \left\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log\frac{n_{2j} Q_2(j)}{y_{2j}} \right\}, \tag{46}$$
where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ ($i = 0, 1, 2$).
The joint probability density function $P\{\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta\}$ of $(\mathbf{Y}, \tilde{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
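The building blocks of the deviance can be sketched as two small functions: a Poisson term of the form used in $D_0(k)$ and $D_j$, and a multinomial term of the form used in $\text{Dev}(p_1, p_2)$ for a single age group. This is an illustrative sketch under assumptions made here (function names, the convention that a zero count contributes its limiting value), not the authors' code.

```python
import math

def poisson_deviance(y, mean):
    """2 * {mean - y - y * log(mean / y)}; for y = 0 the limit 2 * mean is used."""
    if y == 0:
        return 2.0 * mean
    return 2.0 * (mean - y - y * math.log(mean / y))

def multinomial_deviance(counts, probs):
    """2 * sum_i n_i * log(p_hat_i / p_i) with p_hat_i = n_i / n.

    Cells with a zero count contribute nothing (their limiting value).
    """
    n = sum(counts)
    return 2.0 * sum(
        c * math.log((c / n) / p) for c, p in zip(counts, probs) if c > 0
    )
```

Both terms are zero when the model value equals the observed value and positive otherwise, so minimizing the deviance is the same as maximizing the corresponding log-likelihood.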
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), the $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\left\{ e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)} \right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.
Under the discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results for the expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
$$\begin{aligned} \beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left( \prod_{u=r}^{k-1} \beta_u \right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\ &= \theta_{(i+1)(k-1)} \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)} \phi_{i(k-1)}(t), \end{aligned} \tag{48}$$
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ ($i = 0, 1, 2, 3$)'s are given by
$$\phi_{i(k-1)}(t) = \left( \prod_{u=i+1}^{k-1} \gamma_u \right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \left\{ (1 + \gamma_v)^{t - t_0} - 1 \right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
$$\phi_{0(k-1)}(t) = \left( \prod_{u=1}^{k-1} \gamma_u \right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2} \left\{ (1 + \gamma_v)^{t - t_0} - 1 - (t - t_0) \gamma_v \right\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$; $j > 0$), and for $j > 0$
$$\begin{aligned} Q_0(j) &\approx e^{-\theta_{11} \phi_{11}(t_{j-1}) - \lambda_{01} \phi_{01}(t_{j-1})} - e^{-\theta_{11} \phi_{11}(t_j) - \lambda_{01} \phi_{01}(t_j)} + o(\beta_1), \\ Q_1(j) &\approx (1 - \alpha_1) \left\{ e^{-\lambda_{11} \phi_{11}(t_{j-1})} - e^{-\lambda_{11} \phi_{11}(t_j)} \right\} + o(\beta_1). \end{aligned} \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)} \alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$
$$\begin{aligned} Q_0(j) &\approx e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_1(j) &\approx e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_2(j) &\approx \delta_{2(k-1)} (1 - \alpha) \left\{ e^{-\lambda_{22} \phi_{22}(t_{j-1})} - e^{-\lambda_{22} \phi_{22}(t_j)} \right\} \\ &\quad + (1 - \delta_{2(k-1)}) \left\{ e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_j)} \right\} + o(\beta_{k-1}). \end{aligned} \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0) \log(1 + \gamma_v)} \approx e^{(t - t_0) \gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
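How close the discrete factor $[1 + \gamma_v]^{t - t_0}$ is to its continuous counterpart $e^{(t - t_0)\gamma_v}$ depends on both $\gamma_v$ and the number of time steps. The sketch below compares the two; the function names are introduced here for illustration, and any values passed in (for example, growth rates of the orders of magnitude reported in Table 3, or 170 half-year steps for 85 years) are assumptions rather than quantities computed in the paper.

```python
import math

def growth_discrete(gamma, steps):
    """[1 + gamma]^(t - t0), the factor in the discrete-time model."""
    return (1.0 + gamma) ** steps

def growth_continuous(gamma, steps):
    """exp((t - t0) * gamma), the corresponding continuous-time factor."""
    return math.exp(steps * gamma)

def relative_gap(gamma, steps):
    """Relative difference between the continuous and discrete factors."""
    d, c = growth_discrete(gamma, steps), growth_continuous(gamma, steps)
    return abs(c - d) / c
```

Because $\log(1 + \gamma) = \gamma - \gamma^2/2 + \cdots$, the gap is governed by roughly $(t - t_0)\gamma^2/2$, so it is negligible for very small $\gamma$ and grows for larger proliferation rates over long time spans.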
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence, one may use results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \tilde{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient procedure to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$
where $c$ is a positive constant if these parameters satisfy some biologically specified constraints and is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k - 1$), we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k - 1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
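A flat prior restricted to a box, as in (53), can be sketched as a log-prior that is constant inside the constraints and $-\infty$ outside. The code below is illustrative only: the dictionary keys are hypothetical names introduced here, the bounds are copied from constraints (i), (iii), and (iv), and constraint (ii) on the mutation rates is omitted for brevity.

```python
import math

# Bounds copied from constraints (i), (iii), and (iv); key names are hypothetical.
BOUNDS = {
    "p1": (0.0, 1e-2), "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0), "gamma2": (-0.01, 1.0),
    "lambda0": (0.0, 10.0), "lambda1": (0.0, 10.0), "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2), "theta2": (0.0, 1e-1),
}

def log_prior(params):
    """Log of the flat prior (53) up to an additive constant.

    Returns 0.0 when every parameter lies strictly inside its bounds
    and -inf as soon as one constraint is violated.
    """
    for name, (low, high) in BOUNDS.items():
        if not (low < params[name] < high):
            return -math.inf
    return 0.0
```

With such a prior, maximizing the posterior inside the box is the same as maximizing the likelihood subject to the constraints, which is how Steps 3 and 4 of the Gibbs procedure are phrased.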
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$, we obtain
$$P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,$$
$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} [Q_i(j)]^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the $E$-Step and with Steps 3 and 4 as the $M$-Step, respectively [29]. The steps of this multilevel Gibbs sampling procedure are as follows.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \tilde{y}, \tilde{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.
Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\tilde{y}$, $\tilde{n}$, $\Theta$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $(\tilde{y}, \tilde{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $\{\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1\} = \{\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$ independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
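The weighted bootstrap used in Step 1 can be illustrated with a deliberately simplified toy problem. The sketch below is not the authors' procedure: it uses only two genotypes, a binomial proposal in place of the multinomial, and hypothetical names (`draw_carriers`, `q0`, `q1`, `n_candidates`) and values chosen for illustration. The idea it shows is the one attributed to Smith and Gelfand [30]: draw candidates from a distribution that is easy to sample, weight each by the likelihood of the observed data, and resample in proportion to the weights.

```python
import math
import random

def poisson_pmf(y, mean):
    """Poisson probability h(y; mean) for mean > 0."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))

def weighted_bootstrap(candidates, weight_fn, size, rng=random):
    """Resample `candidates` with probability proportional to weight_fn(candidate)."""
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0.0:
        raise ValueError("all candidate weights are zero")
    return rng.choices(candidates, weights=weights, k=size)

def draw_carriers(y, n, p1, q0, q1, n_candidates=2000, rng=random):
    """Toy analogue of Step 1 with two genotypes.

    Proposes n1 ~ Binomial(n, p1), weights each proposal by the Poisson
    likelihood of the observed count y, and returns one resampled n1.
    """
    candidates = [sum(rng.random() < p1 for _ in range(n)) for _ in range(n_candidates)]
    weight = lambda n1: poisson_pmf(y, (n - n1) * q0 + n1 * q1)
    return weighted_bootstrap(candidates, weight, 1, rng)[0]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible; the quality of the resampled draw depends on how many candidates are proposed.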
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is the retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uveal). These cancers develop from melanocytes (pigment cells) which reside within the uveal, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanomas, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

| Models | Log-likelihood | AIC | BIC |
|---|---|---|---|
| Three-stage, Model-F | −1312.02 | 2648.04 | 2677.35 |
| Three-stage, Model-1 | −1295.76 | 2609.53 | 2631.51 |
| Two-stage | −4392.421 | 8796.843 | 8811.569 |
Figure 5: Curve fitting of SEER data by the Model-F, Model-1, and the two-stage model (horizontal axis: age in years; vertical axis: incidences; series: Observed, Model-F, Model-1, Two-stage).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by results in Table 1 and Figure 5, it appeared that both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are given by 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the chi-square statistic value for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 are given by (AIC = 2609.53, BIC = 2631.51), which are slightly smaller than those of Model-F, respectively; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models, respectively. These results suggest that uveal melanoma may best be described by a 3-stage model with inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$, further confirming that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^3$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^2 \sim 10^3$, respectively. Because $\lambda_i = E[J_i(t_0; i)] \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from some biological observations, one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume ($E[N(t_0)] = E[J_i(t_0; i)] \sim 10^8$, $i = 0, 1, 2$), then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
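The AIC and BIC values quoted in observation (a) follow from the log-likelihoods through the standard definitions $\text{AIC} = 2p - 2\log L$ and $\text{BIC} = p \log n - 2\log L$. The sketch below applies them; the function names are introduced here, and the inputs used in the usage note (a log-likelihood of −1312.02, $p = 12$ parameters as suggested by df = 86 − 12, and $n = 85$ age groups) are our reading of the text rather than values stated by the authors as inputs to these formulas.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2p - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: p log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood
```

Under the assumed inputs, `aic(-1312.02, 12)` and `bic(-1312.02, 12, 85)` round to the Model-F entries 2648.04 and 2677.35; smaller values of either criterion indicate a better trade-off between fit and number of parameters.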
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
For using the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., maximum likelihood method) because the procedure combines
Table 3: Estimates of parameters for the 3-stage stochastic models.

| Parameters | $\lambda_0$ | $\lambda_1$ | $\lambda_2$ | $\gamma_1$ | $\gamma_2$ | $p_1$ | $p_2$ | $\alpha$ | $\theta_1$ | $\theta_2$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Model-F: Estimates | 2.88E−01 | 4.81E−01 | 6.55E+02 | 2.71E−05 | 3.37E−02 | 9.95E−04 | 9.68E−07 | 8.41E−01 | 3.29E−04 | 3.41E−01 |
| Model-F: StD | 1.21E−01 | 4.65E−02 | 1.12E+02 | 8.86E−06 | 1.92E−04 | 1.75E−06 | 6.48E−09 | 4.19E−02 | 3.54E−05 | 1.07E−02 |
| Model-F: 95% CL-Lower | 5.03E−02 | 3.89E−01 | 4.34E+02 | 5.34E−06 | 3.33E−02 | 9.91E−04 | 9.55E−07 | 7.59E−01 | 2.60E−04 | 3.20E−01 |
| Model-F: 95% CL-Upper | 5.25E−01 | 5.72E−01 | 8.75E+02 | 4.01E−05 | 3.41E−02 | 9.98E−04 | 9.81E−07 | 9.23E−01 | 3.98E−04 | 3.62E−01 |
| Model-1: Estimates | 6.55E−02 | 3.72E−01 | 1.98E+02 | NA | 3.49E−02 | 9.98E−04 | 1.00E−06 | 8.07E−01 | NA | NA |
| Model-1: StD | 2.54E−02 | 4.63E−02 | 3.38E+01 | NA | 3.58E−03 | 6.27E−05 | 4.53E−08 | 7.06E−02 | NA | NA |
| Model-1: 95% CL-Lower | 1.57E−02 | 2.82E−01 | 1.32E+02 | NA | 2.79E−02 | 8.75E−04 | 9.11E−07 | 6.68E−01 | NA | NA |
| Model-1: 95% CL-Upper | 1.15E−01 | 4.63E−01 | 2.65E+02 | NA | 4.19E−02 | 1.12E−03 | 1.09E−06 | 9.45E−01 | NA | NA |

Note: NA, assumed nonexistence.
information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40).
To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.
Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 9.948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma at birth is $y_0 = n_{01} p_2 = 36$ by 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\gamma_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploid insufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we do not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of state variables reduce to the following stochastic difference equations of state variables, respectively:
\[
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
\]
where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if ($j = i, i+1$) and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by
\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
\]
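As a numerical check on (A2), the following Python sketch takes expectations in (A1), so that the zero-mean terms $e_j$ drop out, and compares direct iteration of the recursion with the summation form in (A2) for time-homogeneous rates. The rate values, initial sizes, and function names are illustrative assumptions and are not estimates from this paper; list index 0 stands for stage $i$ and $t_0$ is taken as 0.

```python
def iterate_expected(j0, gamma, beta, steps):
    """Iterate the expectation of (A1) for time-homogeneous rates.

    Index 0 stands for stage i, index 1 for stage i+1, and so on.
    """
    path = [list(j0)]
    for _ in range(steps):
        prev = path[-1]
        nxt = []
        for j, x in enumerate(prev):
            # Inflow from the previous stage; the first stage has none.
            inflow = prev[j - 1] * beta[j - 1] if j > 0 else 0.0
            nxt.append(x * (1.0 + gamma[j]) + inflow)
        path.append(nxt)
    return path


def summation_form(path, j, gamma, beta, t):
    """Expected size of stage j at step t from the summation form in (A2)."""
    growth = 1.0 + gamma[j]
    value = path[0][j] * growth ** t
    if j > 0:
        for s in range(t):
            value += path[s][j - 1] * beta[j - 1] * growth ** (t - 1 - s)
    return value


# Hypothetical rates and initial sizes, chosen only for illustration.
gamma = [0.0, 0.02, 0.05]
beta = [1e-3, 1e-2]
j0 = [1e6, 10.0, 0.0]

path = iterate_expected(j0, gamma, beta, steps=40)
direct = path[40][2]
from_sum = summation_form(path, 2, gamma, beta, 40)
print(direct, from_sum)
```

The two printed numbers should agree up to floating-point rounding, since the summation form is the unrolled recursion.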
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \ne \gamma_j$ if $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to
\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left( \prod_{v=u}^{j-1} \beta_v \right) \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left( \prod_{r=j-u}^{j-1} \beta_r \right) \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
\]
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_i)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ if $i \ne j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by
\[
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2).
\tag{A4}
\]
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313-337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297-322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247-258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421-7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298-7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629-636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607-616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51-75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383-393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820-823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49-71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095-15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1-38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84-88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69-74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115-129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463-3470, 2006.
uncorrelated with the staging variables and $T(t)$. The initial conditions at birth ($t_0$) for the above stochastic differential equations are $J_u(t_0; i) > 0$, $u = i, i+1$; $J_u(t_0; i) = 0$, $u > i+1$.
Given the initial conditions ($J_u(t_0; i) > 0$, $u = i, i+1$) and ($J_u(t_0; i) = 0$, $u > i+1$) at birth ($t_0$), the solution of the equations in (6) is given, respectively, by
\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\int_{t_0}^{t} \gamma_i(x)\,dx} + \eta_i(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \int_{t_0}^{t} J_{u-1}(x; i)\,\beta_{u-1}(x)\, e^{\int_{x}^{t} \gamma_u(y)\,dy}\,dx + \eta_u(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\int_{t_0}^{t} \gamma_u(x)\,dx} + \sum_{v=1}^{u-i} J_{u-v}(t_0; i)\,\phi_u^{(v)}(t; i) + \sum_{v=1}^{u+1-i} \eta_u^{(v)}(t; i), \quad u = i+1, \ldots, k-1,
\end{aligned}
\tag{7}
\]
where $i = 0$ if $k = 2$; $i = 0, 1$ if $k = 3$; $i = 0, 1, 2$ if $k > 3$,
where
\[
\begin{aligned}
\phi_u^{(1)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy + \int_{t_0}^{x} \gamma_{u-1}(y)\,dy}\, \beta_{u-1}(x)\,dx, \quad u = i, \ldots, k-1, \\
\phi_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\, \beta_{u-1}(x)\, \phi_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i, \\
\eta_u(t; i) &= \eta_u^{(1)}(t; i) = \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\, e_u(x; i)\,dx, \quad u = i, \ldots, k-1, \\
\eta_u^{(v)}(t; i) &= \int_{t_0}^{t} e^{\int_{x}^{t} \gamma_u(y)\,dy}\, \beta_{u-1}(x)\, \eta_{u-1}^{(v-1)}(x; i)\,dx, \quad v = 2, \ldots, u-i.
\end{aligned}
\tag{8}
\]
If the model is time homogeneous so that $\beta_u(t) = \beta_u$, $b_u(t) = b_u$, $d_u(t) = d_u$, $\gamma_u(t) = b_u - d_u = \gamma_u$, $u = 0, 1, \ldots, k-1$, and if $\gamma_i \ne \gamma_u$ if $i \ne u$, the above solutions under the initial conditions ($J_{i+u}(t_0; i) > 0$, $u = 0, 1$; $J_{i+u}(t_0; i) = 0$, $u > 1$) then reduce, respectively, to
\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\, e^{\gamma_i (t - t_0)} + \eta_i^{(1)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1), \\
J_u(t; i) &= J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \beta_{u-1} \int_{t_0}^{t} J_{u-1}(x; i)\, e^{\gamma_u (t - x)}\,dx + \eta_u^{(1)}(t; i) \\
&= \cdots = J_u(t_0; i)\, e^{\gamma_u (t - t_0)} + \sum_{r=i}^{u-1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i) \\
&= \sum_{r=i}^{i+1} J_r(t_0; i) \left( \prod_{v=r}^{u-1} \beta_v \right) \sum_{l=r}^{u} A_{ru}(l)\, e^{\gamma_l (t - t_0)} + \sum_{r=i}^{u} \eta_u^{(u+1-r)}(t; i), \quad i < u \le k-1,
\end{aligned}
\tag{9}
\]
where, for $i \le v \le u$,
\[
A_{iu}(v) =
\begin{cases}
1, & \text{if } i = u, \\
\displaystyle \prod_{l=i,\, l \ne v}^{u} (\gamma_l - \gamma_v)^{-1}, & \text{if } i < u.
\end{cases}
\tag{10}
\]
Obviously, $E[\eta_u^{(r)}(t; i)] = 0$ for all ($i = 0, 1, 2$; $u = i, \ldots, k-1$; $r = 1, \ldots, u-i+1$). It follows that for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$), the expected values $E[J_{k-1}(t; i)]$ of $J_{k-1}(t; i)$ for homogeneous models with $\gamma_i \ne \gamma_u$ if $i \ne u$ are given by
\[
\begin{aligned}
E[J_{k-1}(t; i)] &= E[J_{i+1}(t_0; i)] \left( \prod_{u=i+1}^{k-2} \beta_u \right) \sum_{v=i+1}^{k-1} A_{(i+1)(k-1)}(v)\, e^{\gamma_v (t - t_0)} \\
&\quad + E[J_i(t_0; i)] \left( \prod_{u=i}^{k-2} \beta_u \right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, e^{\gamma_v (t - t_0)}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2),\ k \ge 2,
\end{aligned}
\tag{11}
\]
where, as a convention, ($\sum_{j=i+1}^{i} c_j = 0$, $\prod_{j=i+1}^{i} d_j = 1$) for all ($c_j$, $d_j$).
Remark 4. Because genetic changes and epigenetic changes occur during cell division, to the order of $o(\Delta t)$ the probability is $\beta_u(t)\Delta t$ that one $J_u$ cell at time $t$ would give rise to 1 $J_u$ cell and 1 $J_{u+1}$ cell at time $t + \Delta t$ by genetic changes or epigenetic changes. It follows that the transition $J_u \to J_{u+1}$ would not affect the population size of $J_u$ cells but only increase the size of the $J_{u+1}$ population.
3.2. Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n, p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and using the probability distributions of the transition variables $\{B_r(t; i), D_r(t; i), M_r(t; i)\}$ in (4) as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t; i), r = i, \ldots, k-1\}$ given $\{J_r(t; i), r = i, \ldots, k-1\}$ for ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$):
\[
\begin{aligned}
& P\{J_r(t + \Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned}
\tag{12}
\]
where
\[
\begin{aligned}
P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} &= \sum_{r=0}^{u_i} f(r; u_i, b_i(t)\Delta t) \times f\!\left(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right), \\
&\qquad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\} &= \sum_{r_1=0}^{u_j} \sum_{r_2=0}^{u_j - r_1} g(r_1, r_2; u_j, b_j(t)\Delta t, d_j(t)\Delta t) \\
&\quad \times h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}\Delta t), \quad j > i.
\end{aligned}
\tag{13}
\]
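To make the first formula in (13) concrete, the following Python sketch evaluates the one-step transition probability of $J_i$ as a sum over the number of births $r$, with the number of deaths fixed by $u_i - v_i + r$. The function names and the numerical values of $b$, $d$, and $\Delta t$ are hypothetical and serve only as an illustration; the rates are taken as constants over the step.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability f(x; n, p); zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


def transition_prob_first_stage(v, u, b, d, dt):
    """P{J_i(t + dt) = v | J_i(t) = u} following the first line of (13)."""
    p_birth = b * dt
    p_death = d * dt / (1.0 - p_birth)
    return sum(
        binom_pmf(r, u, p_birth) * binom_pmf(u - v + r, u - r, p_death)
        for r in range(u + 1)
    )


# Hypothetical values: 20 cells, birth rate 0.3, death rate 0.1, step 0.1.
u, b, d, dt = 20, 0.3, 0.1, 0.1
probs = [transition_prob_first_stage(v, u, b, d, dt) for v in range(2 * u + 1)]
mean = sum(v * p for v, p in enumerate(probs))
print(sum(probs), mean)
```

Under these assumptions the probabilities sum to one over $v = 0, \ldots, 2u$, and the mean equals $u[1 + (b - d)\Delta t]$, which matches the expectation of the first equation in (A1).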
Define the unobservable transition variables $\underset{\sim}{U}_i(t) = \{B_i(t; i), (B_j(t; i), D_j(t; i)),\ j = i+1, \ldots, k-1\}'$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$). Then we have for the joint probability density function of $\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t)\}$ given $\underset{\sim}{X}_i(t)$:
\[
P\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} \times P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\},
\tag{14}
\]
where
\[
\begin{aligned}
P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} &= f\!\left(J_i(t; i) - J_i(t + \Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\right) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\,\beta_{j-1}(t)\Delta t\},
\end{aligned}
\tag{15}
\]
\[
\begin{aligned}
P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} &= f(B_i(t; i);\ J_i(t; i),\ b_i(t; i)\Delta t) \\
&\quad \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i),\ b_j(t; i)\Delta t,\ d_j(t; i)\Delta t\}.
\end{aligned}
\tag{16}
\]
Let $\underset{\sim}{e}_i(j)$ be a $(k - i) \times 1$ column vector with 1 in the $j$th position ($1 \le j \le k - i$) and with 0 in other positions. Let $\underset{\sim}{u} = (u_i, \ldots, u_{k-1})'$ and $\underset{\sim}{v} = (v_i, \ldots, v_{k-1})'$ be $(k - i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)-(16), it can readily be shown that
\[
P\{\underset{\sim}{X}_i(t + \Delta t) = \underset{\sim}{v} \mid \underset{\sim}{X}_i(t) = \underset{\sim}{u}\} =
\begin{cases}
[u_j b_j(t) + (1 - \delta_{ji})\, u_{j-1}\beta_{j-1}(t)]\,\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} + \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\,\Delta t + o(\Delta t), & \text{if } \underset{\sim}{v} = \underset{\sim}{u} - \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t), & \text{if } \left| \underset{\sim}{1}'_{n-k} (\underset{\sim}{u} - \underset{\sim}{v}) \right| \ge 2.
\end{cases}
\tag{17}
\]
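The infinitesimal rates in (17) can be read as the event rates of a continuous-time Markov chain. The following Python sketch simulates one path with a standard event-by-event (Gillespie-type) scheme under time-homogeneous rates; this simulation scheme is our illustrative choice and is not the inference procedure of this paper, and the function name and all numerical values are hypothetical. List index 0 stands for stage $i$.

```python
import random


def simulate_stages(u0, b, d, beta, t_end, rng):
    """Simulate stage sizes up to t_end with the event rates read from (17).

    Births and deaths change the size of a stage by one; a mutation event of
    stage j adds one cell to stage j+1 and leaves stage j unchanged.
    """
    state = list(u0)
    n = len(state)
    t = 0.0
    while True:
        events = []  # entries are (rate, stage to change, change)
        for j in range(n):
            events.append((state[j] * b[j], j, 1))
            events.append((state[j] * d[j], j, -1))
            if j + 1 < n:
                events.append((state[j] * beta[j], j + 1, 1))
        total = sum(rate for rate, _, _ in events)
        if total <= 0.0:
            return state
        t += rng.expovariate(total)
        if t > t_end:
            return state
        pick = rng.random() * total
        acc = 0.0
        for rate, stage, change in events:
            acc += rate
            if pick < acc:
                state[stage] += change
                break


# Hypothetical two-stage example with mutation only (no births or deaths).
rng = random.Random(0)
final = simulate_stages([50, 0], [0.0, 0.0], [0.0, 0.0], [0.1], 10.0, rng)
print(final)
```

With births and deaths switched off, the first stage stays at its initial size while the second stage only gains cells, in line with the remark that the transition $J_u \to J_{u+1}$ does not reduce the $J_u$ population.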
The above results imply that $\underset{\sim}{X}_i(t)$ is a $(k - i)$-dimensional birth-death process with birth rates $\{i b_u(t), u = i, \ldots, k-1\}$, death rates $\{i d_u(t), u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t), u = i, \ldots, k-1;\ \beta_{u,v}(j, t) = 0 \text{ if } v \ne u+1\}$. (See Definition 4.1 in Tan ([22], Chapter 4).) Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i,\ J_j(0) = 0,\ j > i\} = P(u_j, j = i, \ldots, k-1; t)$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$) in the above model is given by
\[
\begin{aligned}
\frac{d}{dt} P(u_j, j = i, \ldots, k-1; t) &= P(u_i - 1, u_j, j = i+1, \ldots, k-1; t)\,(u_i - 1)\, b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j - 1)\, b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1}; t)\, u_j \beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1}; t)\,(u_j + 1)\, d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1}; t) \times \left\{ \sum_{j=i}^{k-1} u_j [b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j \beta_j(t) \right\},
\end{aligned}
\tag{18}
\]
for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.
By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j,\ j = i, \ldots, k-1 \mid I_i(0) = m_i\} = P(u_j, j = i, \ldots, k-1; t)$ numerically.
3.3. The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.
To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22], Chapter 8, and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i), s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\,\beta_{k-1}(s)\,P_T(s, t)\,ds$. That is,
\[
T(t) \mid \{J_{k-1}(s; i),\ s \le t\} \sim \text{Poisson}(\omega(t; i)).
\tag{19}
\]
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by
\[
Q_i(j) = E\left\{ e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)} \right\} = e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}),
\tag{20}
\]
where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)]\,P_T(x, t)\,dx$.
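The leading term of (20) is straightforward to evaluate once $E[J_{k-1}(x; i)]$ and $P_T(x, t)$ are available. The Python sketch below computes $H_i(t)$ by the trapezoid rule and returns the interval probabilities, dropping the $o(\beta_{k-1})$ remainder. The exponential growth curve used for $E[J_{k-1}(x; i)]$, the choice $P_T \equiv 1$, the value of $\beta_{k-1}$, and the function names are all assumptions made only for illustration.

```python
from math import exp


def trapezoid(f, a, b, n=2000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for m in range(1, n):
        total += f(a + m * h)
    return total * h


def interval_probs(times, beta_last, expected_j, p_tumor, t0=0.0):
    """Leading term of (20) for each interval (times[j-1], times[j]]."""
    def h_fun(t):
        return trapezoid(lambda x: expected_j(x) * p_tumor(x, t), t0, t)

    surv = [exp(-beta_last * h_fun(t)) for t in times]
    return [surv[j - 1] - surv[j] for j in range(1, len(times))]


# Hypothetical inputs: exponential growth of E[J_{k-1}] and P_T = 1.
def growth(x):
    return 1000.0 * exp(0.05 * x)


def always_detected(x, t):
    return 1.0


times = [0.0, 10.0, 20.0, 30.0, 40.0]
q = interval_probs(times, 1e-6, growth, always_detected)
print(q)
```

In this simplified setting $H_i(t)$ has the closed form $1000\,(e^{0.05 t} - 1)/0.05$, which gives a direct check of the numerical integration.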
To derive $Q_i(j)$, denote by
\[
\begin{aligned}
\theta_{i(k-1)} &= E[J_i(t_0; i-1)]\, \beta_i \prod_{u=i+1}^{k-1} \left( \frac{\beta_u}{\gamma_u} \right), \quad i = 1, \ldots, \operatorname{Min}(3, k-1), \\
\lambda_{u(k-1)} &= E[J_u(t_0; u)]\, \beta_u \prod_{v=u+1}^{k-1} \left( \frac{\beta_v}{\gamma_v} \right), \quad u = 0, 1, \ldots, \operatorname{Min}(2, k-1),
\end{aligned}
\tag{21}
\]
and define the functions
\[
\psi_{i(k-1)}(t) = \left( \prod_{u=i+1}^{k-1} \gamma_u \right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\, P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \operatorname{Min}(2, k-1).
\tag{22}
\]
Applying the results for $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.
(1) If 119896 = 2 then 119896 minus 1 = 1 Hence 1198762(0) = 1 119876
119894(0) =
1198762(119895) = 0 for (119894 = 0 1 119895 gt 0) and for 119895 gt 0
1198760(119895) = 119890
minus1205791112059511(119905119895minus1
)minus1205820112059501(119905119895minus1
)
minus119890minus1205791112059511(119905119895)minus1205820112059501(119905119895) + 119900 (120573
1)
(23)
1198761(119895) = (1 minus 120572
1) 119890
minus1205821112059511(119905119895minus1
)
minus119890minus1205821112059511(119905119895) + 119900 (120573
1)
(24)
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2(k-1)}\alpha$; and for $j > 0$,
$$Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \tag{25}$$
$$Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \tag{26}$$
$$\begin{aligned} Q_2(j) ={}& \delta_{2(k-1)}(1 - \alpha)\left\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\right\} \\ &+ (1 - \delta_{2(k-1)})\left\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\right\} \\ &+ o(\beta_{k-1}), \end{aligned} \tag{27}$$
where $\delta_{2(k-1)} = 1$ if $k - 1 = 2$ and $= 0$ if $k - 1 \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to
$$\begin{aligned} \psi_{0(k-1)}(t) &= \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx \\ &= \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v} \int_{t_0}^{t} \left[e^{\gamma_v (x - t_0)} - 1\right] P_T(x, t)\, dx. \end{aligned} \tag{28}$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to
$$\psi_{ii}(t) = \frac{1}{\gamma_i}\left\{e^{\gamma_i (t - t_0)} - 1\right\}, \quad i = 1, \ldots, k-1,$$
$$\psi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v^2}\left\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\gamma_v\right\},$$
$$\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, \frac{1}{\gamma_v}\left\{e^{\gamma_v (t - t_0)} - 1\right\}, \quad i = 1, \ldots, k-1. \tag{29}$$
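As a small numerical illustration of the closed forms above, the following sketch evaluates $\psi_{ii}(t)$ from (29) and the leading term of $Q_1(j)$ in (24) for the 2-stage case. The function names and all parameter values are hypothetical choices made for illustration, the $o(\beta_1)$ remainder is ignored, and the $\gamma_i \to 0$ limit is an added convenience rather than part of the text.

```python
import math


def psi_ii(t, gamma_i, t0=0.0):
    """psi_ii(t) = {exp(gamma_i (t - t0)) - 1} / gamma_i, the P_T ~ 1 form in (29)."""
    if gamma_i == 0.0:
        return t - t0  # limiting value as gamma_i -> 0 (added here for convenience)
    return math.expm1(gamma_i * (t - t0)) / gamma_i


def q1_two_stage(t_prev, t_curr, lam11, gamma1, alpha1, t0=0.0):
    """Leading term of Q_1(j) in (24) for k = 2 over the interval (t_prev, t_curr]."""
    surv_prev = math.exp(-lam11 * psi_ii(t_prev, gamma1, t0))
    surv_curr = math.exp(-lam11 * psi_ii(t_curr, gamma1, t0))
    return (1.0 - alpha1) * (surv_prev - surv_curr)


# Hypothetical parameter values, chosen only to exercise the formulas.
qs = [q1_two_stage(j - 1, j, lam11=1e-4, gamma1=0.05, alpha1=0.1) for j in range(1, 11)]
print(qs[:3])
```

Because each $Q_1(j)$ is a difference of two survival-type terms, the values telescope when summed over consecutive intervals, which gives a quick sanity check on any implementation.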
4 Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases
For estimating unknown parameters and validating the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0), (y_j, n_j), j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section, we develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j; p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j; p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1 The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore
$$Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3, \tag{30}$$
where
$$\chi_k = \begin{cases} n_0 (p_2 + p_1 \alpha) & \text{if } k = 2, \\ n_0 p_2 \alpha & \text{if } k = 3. \end{cases} \tag{31}$$
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is
$$D_0(k) = -2\left\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\right\} = 2\left\{\chi_k - y_0 - y_0 \log \frac{\chi_k}{y_0}\right\}, \quad k = 2, 3. \tag{32}$$
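The Poisson mean in (31) and the deviance in (32) are simple enough to check by direct computation. The sketch below is illustrative only: the function names are invented here, the deviance carries the factor 2 implied by the $-2$ log-likelihood-ratio definition, the case $y_0 = 0$ uses the usual convention $y \log(\cdot) = 0$, and the values of $p_1$, $p_2$, and $\alpha$ are hypothetical.

```python
import math


def chi_k(k, n0, p1, p2, alpha):
    """Poisson mean of Y_0 in (31) for a 2-stage or 3-stage model."""
    if k == 2:
        return n0 * (p2 + p1 * alpha)
    if k == 3:
        return n0 * p2 * alpha
    raise ValueError("cancer cases at birth require k = 2 or k = 3")


def poisson_deviance(y, mean):
    """2 * {mean - y - y * log(mean / y)}; the log term is taken as 0 when y == 0."""
    if mean <= 0:
        raise ValueError("the Poisson mean must be positive")
    if y == 0:
        return 2.0 * mean
    return 2.0 * (mean - y - y * math.log(mean / y))


# n0 and y0 are the age-0 row of Table 1; p1, p2, alpha are hypothetical values.
chi3 = chi_k(3, n0=12495777, p1=1e-3, p2=5e-6, alpha=0.5)
print(chi3, poisson_deviance(34, chi3))
```

The deviance is zero exactly when the mean equals the observed count, which matches the statement that the maximum likelihood estimate of $\chi_k$ is $y_0$.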
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group | Number of people at risk | Observed incidence | Model-F predicted | Model-1 predicted | Two-stage predicted
0 | 12495777 | 34 | 36 | 36 | 38
1 | 12221582 | 20 | 11 | 11 | 16
2 | 12120990 | 14 | 7 | 8 | 17
3 | 12112995 | 9 | 5 | 6 | 18
4 | 12146174 | 4 | 4 | 5 | 20
5 | 12161336 | 3 | 3 | 4 | 22
6 | 12111854 | 2 | 2 | 4 | 25
7 | 12160452 | 2 | 2 | 4 | 28
8 | 11942586 | 2 | 2 | 4 | 30
9 | 12381299 | 1 | 2 | 5 | 34
10 | 12512703 | 2 | 3 | 6 | 38
11 | 12410338 | 5 | 3 | 6 | 41
12 | 12449244 | 2 | 3 | 6 | 44
13 | 12527781 | 5 | 4 | 7 | 48
14 | 12602883 | 3 | 5 | 7 | 51
15 | 12719598 | 5 | 5 | 8 | 55
16 | 12766107 | 7 | 6 | 9 | 59
17 | 12831400 | 9 | 7 | 9 | 62
18 | 12382047 | 8 | 7 | 9 | 63
19 | 12581638 | 10 | 8 | 10 | 68
20 | 12636509 | 7 | 9 | 11 | 71
21 | 12682601 | 6 | 10 | 10 | 75
22 | 12840510 | 12 | 11 | 11 | 79
23 | 13075528 | 17 | 13 | 13 | 84
24 | 13358635 | 16 | 15 | 14 | 89
25 | 13473849 | 12 | 16 | 16 | 94
26 | 13426340 | 17 | 18 | 18 | 97
27 | 13525264 | 28 | 20 | 20 | 101
28 | 13149674 | 17 | 22 | 21 | 102
29 | 13812811 | 23 | 25 | 25 | 110
30 | 13886874 | 24 | 28 | 27 | 114
31 | 13488332 | 37 | 30 | 29 | 115
32 | 13460286 | 32 | 32 | 32 | 118
33 | 13256067 | 38 | 35 | 35 | 119
34 | 13428827 | 37 | 39 | 38 | 124
35 | 13220037 | 40 | 41 | 41 | 126
36 | 12870265 | 30 | 44 | 44 | 126
37 | 12689592 | 43 | 47 | 47 | 127
38 | 12157014 | 42 | 49 | 49 | 125
39 | 12494081 | 46 | 55 | 54 | 131
40 | 12272125 | 49 | 58 | 58 | 132
41 | 11826573 | 56 | 61 | 60 | 130
42 | 11663153 | 54 | 65 | 64 | 131
43 | 11407082 | 53 | 68 | 68 | 131
44 | 11296848 | 70 | 73 | 72 | 133
45 | 11016369 | 57 | 76 | 76 | 132
46 | 10651593 | 71 | 79 | 79 | 130
47 | 10475708 | 87 | 84 | 83 | 131
48 | 9994684 | 82 | 86 | 85 | 127
49 | 10138908 | 78 | 93 | 93 | 131
50 | 9836359 | 87 | 97 | 96 | 130
51 | 9475641 | 95 | 100 | 99 | 127
52 | 9250985 | 113 | 104 | 104 | 126
53 | 9027382 | 106 | 108 | 108 | 125
54 | 8883737 | 117 | 113 | 113 | 125
55 | 8547883 | 129 | 116 | 116 | 123
56 | 8279648 | 107 | 119 | 119 | 121
57 | 8062368 | 119 | 123 | 123 | 119
58 | 7654610 | 132 | 124 | 124 | 115
59 | 7563706 | 118 | 130 | 129 | 115
60 | 7232719 | 131 | 131 | 130 | 112
61 | 6927332 | 116 | 132 | 132 | 109
62 | 6708273 | 133 | 134 | 134 | 107
63 | 6543931 | 143 | 138 | 137 | 106
64 | 6404652 | 130 | 141 | 141 | 105
65 | 6168486 | 145 | 142 | 142 | 102
66 | 5913479 | 138 | 142 | 142 | 99
67 | 5746766 | 151 | 144 | 144 | 98
68 | 5480517 | 147 | 142 | 142 | 94
69 | 5363912 | 149 | 144 | 144 | 94
70 | 5110728 | 136 | 142 | 142 | 90
71 | 4925076 | 144 | 141 | 141 | 88
72 | 4696825 | 140 | 138 | 138 | 85
73 | 4512136 | 146 | 135 | 135 | 82
74 | 4345300 | 126 | 133 | 133 | 80
75 | 4148801 | 122 | 128 | 129 | 78
76 | 3900900 | 124 | 122 | 122 | 74
77 | 3681587 | 114 | 116 | 116 | 70
78 | 3481918 | 93 | 110 | 110 | 67
79 | 3243631 | 102 | 102 | 102 | 63
80 | 2961234 | 79 | 92 | 93 | 58
81 | 2724984 | 93 | 83 | 84 | 54
82 | 2495219 | 82 | 75 | 75 | 50
83 | 2271595 | 92 | 66 | 67 | 46
84 | 2041351 | 64 | 57 | 58 | 42
85 | 10466605 | 304 | 282 | 285 | 216

Note 1: for age 0 and ages 1–10 years, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting the retinoblastoma incidence as given in Tan and Zhou [9].
4.2 The Probability Distribution of $Y_j$ $(j \ge 1)$. To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$ and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is
$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumor at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2;\ j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$
$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\}, \tag{34}$$
where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.
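To make the structure of (34) concrete, the mixture can be evaluated exactly by brute force when $n_j$ is very small. The sketch below does this; the function names and the toy values of $n_j$, $p_0$, $p_1$, and $Q_i(j)$ are hypothetical and far from the magnitudes in the SEER data.

```python
import math


def poisson_pmf(y, mean):
    """h{y; mean}; a zero mean puts all mass on y = 0."""
    if mean == 0.0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))


def multinomial_pmf(n0, n1, n, p0, p1):
    """g(n0, n1; n, p0, p1) with the third cell n2 = n - n0 - n1, p2 = 1 - p0 - p1."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    coef = math.factorial(n) // (
        math.factorial(n0) * math.factorial(n1) * math.factorial(n2)
    )
    return coef * p0**n0 * p1**n1 * p2**n2


def mixture_pmf(y, n, p0, p1, q, k):
    """P(y_j | n_j) in (34); q = (Q_0(j), Q_1(j), Q_2(j)), k = number of stages."""
    delta_2k = 1 if k == 2 else 0
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            q_t = n0 * q[0] + n1 * q[1] + (1 - delta_2k) * n2 * q[2]
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_t)
    return total


# Toy values only: six people at risk and exaggerated per-person probabilities.
print([round(mixture_pmf(y, 6, 0.9, 0.08, (0.1, 0.5, 1.0), k=3), 4) for y in range(4)])
```

The double loop costs on the order of $n_j^2$ Poisson evaluations, so this direct evaluation is only practical for small $n_j$.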
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is
$$L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $\{b_i(t) = b_i,\ d_i(t) = d_i,\ \gamma_i(t) = b_i - d_i = \gamma_i,\ i = 0, 1, \ldots, k-1\}$ (see Tan et al. [4, 5, 8]).
4.3 The Joint Probability Distribution of Augmented Variables and Cancer Incidence. For applying the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $j = 1, \ldots, t_N$ and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1)$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is
$$(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\}. \tag{36}$$
Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$
$$\begin{aligned} P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\} &= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\ &= \frac{1}{y_j!}\, e^{-Q_T(j)} [Q_T(j)]^{y_j} \times g\left\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\} \\ &= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2), \end{aligned} \tag{37}$$
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
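The last equality in (37) is the standard fact that a Poisson total split multinomially is the same as independent Poisson counts. A quick numerical check, under hypothetical values of the means $n_{ij} Q_i(j)$ and with function names invented here, can be sketched as follows.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability h{y; mean} for mean > 0."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))


def multinomial_pmf(counts, probs):
    """Multinomial probability of counts with total sum(counts) and cell probabilities probs."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)


def joint_via_total(counts, means):
    """Poisson total with mean Q_T = sum(means), times the multinomial split as in (36)."""
    q_total = sum(means)
    probs = [m / q_total for m in means]
    return poisson_pmf(sum(counts), q_total) * multinomial_pmf(counts, probs)


def joint_via_product(counts, means):
    """Product of independent Poisson terms, the right-hand side of (37)."""
    return math.prod(poisson_pmf(c, m) for c, m in zip(counts, means))


counts = (2, 1, 3)       # hypothetical (y_0j, y_1j, y_2j)
means = (0.7, 1.9, 2.4)  # hypothetical n_ij * Q_i(j), i = 0, 1, 2
print(joint_via_total(counts, means), joint_via_product(counts, means))
```

The two printed values agree up to floating-point rounding, which is what (37) asserts for $k > 2$.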
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have for $k = 2$
$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$
$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is
$$\begin{aligned} P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\} &= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\ &= \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \quad \text{if } k = 2, \end{aligned} \tag{40}$$
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\underset{\sim}{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\underset{\sim}{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \underset{\sim}{y})$ given $(\mathbf{N}, \underset{\sim}{n})$
$$P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{h[y_{2j}; n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given $(\underset{\sim}{n}, \Theta)$ is
$$P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \left\{h[y_{2j}; n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is
$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where
$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\right\}, \tag{44}$$
$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i}\right\}, \tag{45}$$
$$D_j = 2 \sum_{i=0}^{1} \left\{n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}}\right\} + 2(1 - \delta_{2k}) \left\{n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}}\right\}, \tag{46}$$
where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ $(i = 0, 1, 2)$.
The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of $(\mathbf{Y}, \underset{\sim}{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
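The deviance pieces in (45) and (46) translate directly into code. The following sketch is an illustration under stated assumptions: the function names are invented here, terms with a zero count use the convention $y \log(\cdot) = 0$, and, for simplicity, the same set of age groups is used both for $\hat{p}_i$ and for the sum in (45).

```python
import math


def poisson_dev_term(y, mean):
    """mean - y - y * log(mean / y), with the log term taken as 0 when y == 0."""
    if y == 0:
        return mean
    return mean - y - y * math.log(mean / y)


def d_j(y, n, q, k):
    """D_j in (46); y, n, q are length-3 sequences indexed by genotype i = 0, 1, 2."""
    delta_2k = 1 if k == 2 else 0
    dev = 2.0 * sum(poisson_dev_term(y[i], n[i] * q[i]) for i in (0, 1))
    if not delta_2k:
        dev += 2.0 * poisson_dev_term(y[2], n[2] * q[2])
    return dev


def p_hat(n_rows):
    """Pooled genotype proportions; n_rows is a list of (n_0j, n_1j, n_2j)."""
    totals = [sum(row[i] for row in n_rows) for i in range(3)]
    grand = sum(totals)
    return [t / grand for t in totals]


def dev_p(n_rows, p1, p2):
    """Dev(p1, p2) in (45), skipping cells with zero counts."""
    p = (1.0 - p1 - p2, p1, p2)
    ph = p_hat(n_rows)
    return 2.0 * sum(
        row[i] * math.log(ph[i] / p[i]) for row in n_rows for i in range(3) if row[i] > 0
    )


rows = [(90, 8, 2), (85, 12, 3)]  # hypothetical genotype counts for two age groups
print(p_hat(rows), dev_p(rows, 0.2, 0.05))
```

Both components vanish at their respective maximum likelihood values, $p_i = \hat{p}_i$ and $n_{ij} Q_i(j) = y_{ij}$, and are positive elsewhere.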
4.4 Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$. Under discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results of expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
$$\begin{aligned} \beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\ &= \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t), \end{aligned} \tag{48}$$
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$'s are given by
$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, \frac{1}{\gamma_v}\left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v^2}\left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\right\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$ and $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1;\ j > 0)$; and for $j > 0$,
$$\begin{aligned} Q_0(j) &\approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1), \\ Q_1(j) &\approx (1 - \alpha_1)\left\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\right\} + o(\beta_1). \end{aligned} \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\alpha$, $Q_i(0) = 0$, $i = 0, 1$; and for $j > 0$,
$$\begin{aligned} Q_0(j) \approx{}& e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_1(j) \approx{}& e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\ Q_2(j) \approx{}& \delta_{2(k-1)}(1 - \alpha)\left\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\right\} \\ &+ (1 - \delta_{2(k-1)})\left\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}). \end{aligned} \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, respectively, as given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
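The relation between the discrete-time and continuous-time forms can be checked numerically in the simplest single-term case. The sketch below compares $\{(1+\gamma)^{t-t_0} - 1\}/\gamma$ with $\{e^{\gamma(t-t_0)} - 1\}/\gamma$ and verifies the geometric-sum identity used to obtain (48) and (49). Treating the single-term case this way is an assumption made for illustration, and the function names and values of $\gamma$ are hypothetical.

```python
import math


def psi_single(t, gamma, t0):
    """Continuous-time single-term form {exp(gamma (t - t0)) - 1} / gamma."""
    return math.expm1(gamma * (t - t0)) / gamma


def phi_single(t, gamma, t0):
    """Discrete-time single-term form {(1 + gamma)^(t - t0) - 1} / gamma."""
    return ((1.0 + gamma) ** (t - t0) - 1.0) / gamma


def geometric_sum(t, gamma, t0):
    """Direct sum of (1 + gamma)^(s - t0) over s = t0, ..., t - 1."""
    return sum((1.0 + gamma) ** (s - t0) for s in range(t0, t))


for gamma in (0.001, 0.01, 0.1):
    phi = phi_single(21, gamma, 1)
    psi = psi_single(21, gamma, 1)
    print(gamma, phi, psi, abs(psi - phi) / psi)
```

For small $\gamma$ the two forms differ only slightly, while for larger $\gamma$ the gap becomes noticeable, so the approximation $\log(1 + \gamma_v) \approx \gamma_v$ relies on proliferation rates per time unit being small.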
5 The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence, one may use results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1 The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$
where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows:
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k-1)$, we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informativeprior which may be considered as an extension of thetraditional noninformative prior given in Box and Tiao [28]
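In computational terms, a prior of this kind acts as an indicator of the admissible region. The sketch below encodes constraints (i)–(iv) as a log-prior that is constant inside the region and $-\infty$ outside. The bounds are read as $-0.01 < \gamma_i < 1$, $1 < \omega_0 < 1000$, and so on; the function name and argument layout are hypothetical, not part of the original procedure.

```python
import math


def log_prior(p1, p2, gammas, omega0, betas, lambdas, thetas):
    """Log of the flat prior up to a constant: 0.0 inside the constraints, -inf outside.

    gammas  : (gamma_1, gamma_2)
    betas   : (beta_1, ..., beta_{k-1})
    lambdas : (lambda_0, lambda_1, lambda_2)
    thetas  : (theta_1, theta_2)
    """
    inside = (
        0.0 < p1 < 1e-2
        and 0.0 < p2 < 1e-6
        and all(-0.01 < g < 1.0 for g in gammas)
        and 1.0 < omega0 < 1000.0
        and all(1e-8 < b < 1e-3 for b in betas)
        and 0.0 < lambdas[0] < 10.0
        and 0.0 < lambdas[1] < 10.0
        and 0.0 < lambdas[2] < 1e3
        and 0.0 < thetas[0] < 1e-2
        and 0.0 < thetas[1] < 1e-1
    )
    return 0.0 if inside else -math.inf


# Hypothetical parameter values lying inside the admissible region.
print(log_prior(1e-3, 1e-7, (0.05, 0.1), 50.0, (1e-6, 1e-5), (1.0, 2.0, 100.0), (1e-3, 1e-2)))
```

Adding this value to a log-likelihood leaves the likelihood unchanged inside the region and rules out any parameter value outside it.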
5.2 The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain
$$P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} \propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}}\, (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,$$
$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-n_{ij} Q_i(j)} [Q_i(j)]^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{t_N} D_j$ given by (46).
5.3 The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that, numerically, the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the $E$-Step and with Steps 3 and 4 as the $M$-Step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.
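Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\underset{\sim}{y}$, $\underset{\sim}{n}$, $\Theta$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.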
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{t_N} D_j$ in (46) under the constraints. Denote the generated mode as $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
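The two data-augmentation steps can be sketched for a single age group $j$ with $k > 2$ as follows. This is a simplified illustration rather than the authors' implementation: candidates for $(n_{0j}, n_{1j}, n_{2j})$ are simulated person by person, the weighted bootstrap resamples one candidate with weight proportional to $\prod_i h\{y_{ij}; n_{ij} Q_i(j)\}$, and the cases $y_j$ are then allocated to genotypes as in (36). All function names and numerical values are hypothetical.

```python
import math
import random


def poisson_pmf(y, mean):
    """h{y; mean}; a zero mean puts all mass on y = 0."""
    if mean == 0.0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))


def draw_genotype_counts(n, p1, p2, rng):
    """One draw of (n_0j, n_1j, n_2j) from Multinomial(n; 1 - p1 - p2, p1, p2)."""
    counts = [0, 0, 0]
    for g in rng.choices((0, 1, 2), weights=(1.0 - p1 - p2, p1, p2), k=n):
        counts[g] += 1
    return tuple(counts)


def weighted_bootstrap_n(y_alloc, n, p1, p2, q, rng, n_candidates=500):
    """Step 1 sketch: propose candidates, then resample one by its likelihood weight."""
    candidates = [draw_genotype_counts(n, p1, p2, rng) for _ in range(n_candidates)]
    weights = [
        math.prod(poisson_pmf(y, c * qi) for y, c, qi in zip(y_alloc, cand, q))
        for cand in candidates
    ]
    if sum(weights) <= 0.0:
        raise RuntimeError("all candidates have zero weight; draw more candidates")
    return rng.choices(candidates, weights=weights, k=1)[0]


def split_cases(y, n_counts, q, rng):
    """Step 2 sketch: allocate y cases with probabilities n_ij Q_i(j) / Q_T(j), as in (36)."""
    means = [c * qi for c, qi in zip(n_counts, q)]
    alloc = [0, 0, 0]
    for g in rng.choices((0, 1, 2), weights=means, k=y):
        alloc[g] += 1
    return tuple(alloc)


rng = random.Random(0)
q = (0.01, 0.05, 0.1)  # hypothetical Q_0(j), Q_1(j), Q_2(j)
n_hat = weighted_bootstrap_n((2, 1, 1), 200, 0.2, 0.1, q, rng)
y_hat = split_cases(4, n_hat, q, rng)
print(n_hat, y_hat)
```

Person-by-person simulation is only workable for toy population sizes; with the numbers at risk in Table 1 one would need a dedicated multinomial sampler, and the genotype frequencies used here are deliberately exaggerated so that every cell is populated.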
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$ independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
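The final summarization step can be sketched in a few lines. Here each draw is taken to be a tuple of parameter values produced by one converged run of the procedure; the function names and the example draws are hypothetical.

```python
def sample_mean(draws):
    """Component-wise mean of a list of equal-length parameter tuples."""
    m = len(draws)
    return [sum(d[i] for d in draws) / m for i in range(len(draws[0]))]


def sample_covariance(draws):
    """Unbiased sample covariance matrix (divisor m - 1) of the draws."""
    m = len(draws)
    mean = sample_mean(draws)
    dim = len(mean)
    return [
        [
            sum((d[a] - mean[a]) * (d[b] - mean[b]) for d in draws) / (m - 1)
            for b in range(dim)
        ]
        for a in range(dim)
    ]


# Hypothetical draws of a two-component parameter vector.
draws = [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0)]
print(sample_mean(draws))
print(sample_covariance(draws))
```

The diagonal of the covariance matrix gives the variance estimates, and the off-diagonal entries give the covariances between parameter estimates.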
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uveal). These cancers develop from melanocytes (pigment cells) which reside within the uveal, giving color to the eye. In Tan and Zhou [9], we developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanomas, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 6 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood    AIC         BIC
Three-stage, Model-F    -1312.02          2648.04     2677.35
Three-stage, Model-1    -1295.76          2609.53     2631.51
Two-stage               -4392.421         8796.843    8811.569
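As a quick consistency check on Table 2, the two criteria can be recomputed from the log-likelihood with their standard definitions, AIC = -2 log L + 2p and BIC = -2 log L + p ln n. The sketch below assumes the Model-F log-likelihood reads as -1312.02, that Model-F has p = 12 free parameters (the count implied by df = 86 - 12 in the text), and that n = 85 observations enter the BIC; the last value is an assumption inferred from the table, not stated in the paper.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p ln(n)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Model-F inputs (p = 12 and n = 85 are assumptions, see the lead-in).
model_f_aic = aic(-1312.02, 12)
model_f_bic = bic(-1312.02, 12, 85)
```

Under these assumptions the recomputed values agree with the Model-F row of Table 2 to two decimals.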
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Horizontal axis: age in years, 0 to 84; vertical axis: incidences, 0 to 350; series: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of the parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better judging from the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 - 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 - 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly: the Chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from biological observations, one can get a rough idea of the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] and assume $E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.

(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help explain why cancer incidences at birth are observed for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
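The order-of-magnitude argument in observation (c) can be worked through numerically. The sketch below solves $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u/\gamma_u)$ for the $\beta_i$ from the last stage backwards. The function name `back_out_betas` is hypothetical; the inputs are the Model-F estimates of Table 3 read as $\hat{\lambda} = (0.288, 0.481, 655)$, $\hat{\gamma}_1 = 2.71 \times 10^{-5}$, $\hat{\gamma}_2 = 3.37 \times 10^{-2}$, and the cell count $10^{8}$ is the assumed value attributed to Potten et al. [32].

```python
def back_out_betas(lams, gammas, expected_cells):
    """Solve lambda_i = E * beta_i * prod_{u>i} (beta_u / gamma_u) for beta_i.

    lams: [lambda_0, lambda_1, lambda_2]; gammas: {1: gamma_1, 2: gamma_2};
    expected_cells: the common value assumed for E[J_i(t0; i)].
    """
    last = len(lams) - 1
    betas = [0.0] * len(lams)
    # Work backwards: beta_2 first, since lambda_2 involves no product term.
    for i in range(last, -1, -1):
        ratio = 1.0
        for u in range(i + 1, last + 1):
            ratio *= betas[u] / gammas[u]
        betas[i] = lams[i] / (expected_cells * ratio)
    return betas


# Model-F estimates as read from Table 3; 1e8 cells is an assumed magnitude.
betas = back_out_betas([0.288, 0.481, 655.0], {1: 2.71e-5, 2: 3.37e-2}, 1e8)
```

With these inputs all three rates fall between $10^{-6}$ and $10^{-4}$, in line with the rough range quoted in (c).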
7 Discussion and Conclusions
To account for inherited cancer cases in this paper wehave developed some general multistage models involvinghereditary cancer cases For human cancer incidence thesemodels are basically generalized mixture models In thesemixture models the mixing probability is a multinomialdistribution to account for genetic segregation of the staging-limiting tumor suppressor genes This mixture model allowsus to estimate for the first time the frequency of the staging-limiting tumor suppressor gene in human populations As anexample of applications in this paper we have developed ageneral 3-stage stochastic multistage model of carcinogenesisfor adult human eye cancer To account for inherited cancercases in the stochastic model of human eye cancer wehave also developed a generalized mixture model for uvealmelanoma in human beings
To use the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter      Estimate    StD         95% CL lower    95% CL upper
$\lambda_0$    2.88E-01    1.21E-01    5.03E-02        5.25E-01
$\lambda_1$    4.81E-01    4.65E-02    3.89E-01        5.72E-01
$\lambda_2$    6.55E+02    1.12E+02    4.34E+02        8.75E+02
$\gamma_1$     2.71E-05    8.86E-06    5.34E-06        4.01E-05
$\gamma_2$     3.37E-02    1.92E-04    3.33E-02        3.41E-02
$p_1$          9.95E-04    1.75E-06    9.91E-04        9.98E-04
$p_2$          9.68E-07    6.48E-09    9.55E-07        9.81E-07
$\alpha$       8.41E-01    4.19E-02    7.59E-01        9.23E-01
$\theta_1$     3.29E-04    3.54E-05    2.60E-04        3.98E-04
$\theta_2$     3.41E-01    1.07E-02    3.20E-01        3.62E-01

Model-1
Parameter      Estimate    StD         95% CL lower    95% CL upper
$\lambda_0$    6.55E-02    2.54E-02    1.57E-02        1.15E-01
$\lambda_1$    3.72E-01    4.63E-02    2.82E-01        4.63E-01
$\lambda_2$    1.98E+02    3.38E+01    1.32E+02        2.65E+02
$\gamma_1$     NA          NA          NA              NA
$\gamma_2$     3.49E-02    3.58E-03    2.79E-02        4.19E-02
$p_1$          9.98E-04    6.27E-05    8.75E-04        1.12E-03
$p_2$          1.00E-06    4.53E-08    9.11E-07        1.09E-06
$\alpha$       8.07E-01    7.06E-02    6.68E-01        9.45E-01
$\theta_1$     NA          NA          NA              NA
$\theta_2$     NA          NA          NA              NA

Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models
and methods, we have applied our models and methods to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage model is more general than the classical 3-stage model described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data
of human eye cancer, we have derived for the first time some useful pieces of information. Specifically: (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$ (the estimate is $8.603 \times 10^{-5}$ using Model-1). This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.

The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$, and $J_j(t_0; i) = 0$ if $j > i + 1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
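The relation between the recursion (A1) and its solution (A2) can be checked numerically for the expected values, where the noise terms $e_j$ have mean zero and drop out. The sketch below is only an illustration under the assumption of time-homogeneous rates; the function names and the rate values are hypothetical, not taken from the paper.

```python
import math


def iterate_means(initial, gammas, betas, steps):
    """Iterate the noise-free form of (A1): E[J_j] for j = 0..len(initial)-1."""
    means = list(initial)
    history = [list(means)]
    for _ in range(steps):
        nxt = []
        for j, m in enumerate(means):
            # Inflow from the previous stage by mutation at rate beta_{j-1}.
            inflow = means[j - 1] * betas[j - 1] if j > 0 else 0.0
            nxt.append(m * (1.0 + gammas[j]) + inflow)
        means = nxt
        history.append(list(means))
    return history


def closed_form_mean(history, j, gammas, betas, t):
    """Noise-free form of (A2) for stage j at time t (with t0 = 0)."""
    total = history[0][j] * (1.0 + gammas[j]) ** t
    if j > 0:
        for s in range(t):
            total += history[s][j - 1] * betas[j - 1] * (1.0 + gammas[j]) ** (t - 1 - s)
    return total


# Hypothetical rates for a three-compartment chain.
gammas = [0.001, 0.01, 0.03]
betas = [1e-4, 1e-3]
history = iterate_means([1e6, 10.0, 0.0], gammas, betas, steps=20)
```

Calling `closed_form_mean(history, j, gammas, betas, 20)` reproduces `history[20][j]` for each stage, which is what (A2) asserts once the noise terms are averaged out.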
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$ and $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \ne \gamma_j$ whenever $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i + 1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_i)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j - i$).

Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ whenever $i \ne j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i + 1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \Big( \prod_{v=u}^{k-2} \beta_v \Big) \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
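For the smallest nontrivial case, two compartments $J_0 \to J_1$, the closed form can be written out and compared with direct iteration. The sketch below is an illustration only: the coefficients $1/(\gamma_1 - \gamma_0)$ are derived directly from the geometric sum for this special case and are not taken from the paper's definition of $A_{uj}(r)$, which is given elsewhere; the function names and rate values are hypothetical. It also shows why the condition $\gamma_i \ne \gamma_j$ is needed: the formula divides by $\gamma_1 - \gamma_0$.

```python
import math


def mean_j1_closed_form(n0, n1, gamma0, gamma1, beta0, t):
    """E[J_1(t)] for a two-compartment chain with distinct growth rates (t0 = 0)."""
    if gamma0 == gamma1:
        raise ValueError("closed form requires gamma0 != gamma1")
    growth0 = (1.0 + gamma0) ** t
    growth1 = (1.0 + gamma1) ** t
    # Sum_{s<t} n0 (1+g0)^s beta0 (1+g1)^(t-1-s) collapses to a difference of powers.
    return n1 * growth1 + n0 * beta0 * (growth1 - growth0) / (gamma1 - gamma0)


def mean_j1_by_iteration(n0, n1, gamma0, gamma1, beta0, t):
    """Same quantity obtained by iterating the noise-free difference equations."""
    j0, j1 = float(n0), float(n1)
    for _ in range(t):
        # The right-hand side uses the values from the previous time step.
        j0, j1 = j0 * (1.0 + gamma0), j1 * (1.0 + gamma1) + j0 * beta0
    return j1


closed = mean_j1_closed_form(1e6, 10.0, 0.001, 0.03, 1e-5, 50)
iterated = mean_j1_by_iteration(1e6, 10.0, 0.001, 0.03, 1e-5, 50)
```

The two values agree to floating-point accuracy, which is the two-stage analogue of the sum over $(1 + \gamma_r)^{t - t_0}$ terms in (A4).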
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
3.2 Transition Probabilities and Probability Distributions of Staging Variables. Let $f(x; n, p)$ denote the probability density function of a binomial random variable $X \sim \text{Binomial}(n, p)$, $h(x; \lambda)$ the probability density function of a Poisson random variable $X \sim \text{Poisson}(\lambda)$, and $g(x, y; n, p_1, p_2)$ the probability density function of a bivariate multinomial random vector $(X, Y) \sim \text{Multinomial}(n; p_1, p_2)$. Using the stochastic equations of the staging variables given by (2) and using the probability distributions of the transition variables $\{B_r(t; i), D_r(t; i), M_r(t; i)\}$ in (4), as in Tan et al. [4, 5], we obtain the following transition probabilities of $\{J_r(t + \Delta t; i), r = i, \ldots, k-1\}$ given $\{J_r(t; i), r = i, \ldots, k-1\}$ for $i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$:

$$
\begin{aligned}
&P\{J_r(t + \Delta t; i) = v_r,\ r = i, i+1, \ldots, k-1 \mid J_r(t; i) = u_r,\ r = i, i+1, \ldots, k-1\} \\
&\quad = P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} \times \prod_{j=i+1}^{k-1} P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\},
\end{aligned}
\tag{12}
$$
where

$$
\begin{aligned}
P\{J_i(t + \Delta t; i) = v_i \mid J_i(t; i) = u_i\} &= \sum_{r=0}^{u_i} f(r; u_i, b_i(t)\Delta t) \times f\Big(u_i - v_i + r;\ u_i - r,\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big), \\
&\qquad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
P\{J_j(t + \Delta t; i) = v_j \mid J_j(t; i) = u_j,\ J_{j-1}(t; i) = u_{j-1}\} &= \sum_{r_1=0}^{u_j} \sum_{r_2=0}^{u_j - r_1} g(r_1, r_2; u_j, b_j(t)\Delta t, d_j(t)\Delta t) \\
&\quad \times h(v_j - u_j - r_1 + r_2;\ u_{j-1}\beta_{j-1}\Delta t), \quad j > i.
\end{aligned}
\tag{13}
$$
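The first line of (13) can be evaluated directly: births are binomial out of $u_i$ cells, and deaths are binomial out of the remaining $u_i - r$ cells with the conditional probability $d_i\Delta t / (1 - b_i\Delta t)$. The sketch below is an illustration with hypothetical function names and rate values; `b_dt` and `d_dt` stand for $b_i(t)\Delta t$ and $d_i(t)\Delta t$.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial(n, p) probability of x successes; zero outside 0..n."""
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


def transition_prob(v, u, b_dt, d_dt):
    """P{J_i(t + dt) = v | J_i(t) = u}, summing over the number of births r."""
    total = 0.0
    for r in range(u + 1):
        births = binom_pmf(r, u, b_dt)
        # Deaths needed to end at v: u - v + r, drawn from the u - r cells that did not divide.
        deaths = binom_pmf(u - v + r, u - r, d_dt / (1.0 - b_dt))
        total += births * deaths
    return total


# Hypothetical per-step probabilities for a compartment with 5 cells.
row = [transition_prob(v, 5, 0.1, 0.05) for v in range(0, 11)]
```

The entries of `row` cover every reachable state (0 to 2u cells) and sum to one; for very small `b_dt` the probability of gaining exactly one cell is close to `u * b_dt`, the first-order rate that appears in the birth-death description of the process.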
Define the unobservable transition variables $\underset{\sim}{U}_i(t) = \{B_i(t; i), (B_j(t; i), D_j(t; i)),\ j = i+1, \ldots, k-1\}'$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$). Then we have, for the joint probability density function of $\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t)\}$ given $\underset{\sim}{X}_i(t)$,

$$
P\{\underset{\sim}{X}_i(t + \Delta t), \underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} \times P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\},
\tag{14}
$$

where

$$
\begin{aligned}
P\{\underset{\sim}{X}_i(t + \Delta t) \mid \underset{\sim}{U}_i(t), \underset{\sim}{X}_i(t)\} &= f\Big(J_i(t; i) - J_i(t + \Delta t; i) + B_i(t; i);\ J_i(t; i) - B_i(t; i),\ \frac{d_i(t)\Delta t}{1 - b_i(t)\Delta t}\Big) \\
&\quad \times \prod_{j=i+1}^{k-1} h\{J_j(t + \Delta t; i) - J_j(t; i) - B_j(t; i) + D_j(t; i);\ J_{j-1}(t; i)\,\beta_{j-1}(t)\Delta t\},
\end{aligned}
\tag{15}
$$

$$
P\{\underset{\sim}{U}_i(t) \mid \underset{\sim}{X}_i(t)\} = f(B_i(t; i); J_i(t; i), b_i(t; i)\Delta t) \times \prod_{j=i+1}^{k-1} g\{B_j(t; i), D_j(t; i);\ J_j(t; i), b_j(t; i)\Delta t, d_j(t; i)\Delta t\}.
\tag{16}
$$
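The factorization in (14)-(16) suggests a two-part simulation of one time step: first draw the transition variables (births and deaths in each compartment), then draw the Poisson mutation inflow and update the counts. The sketch below is a minimal illustration under stated assumptions: the function names and rates are hypothetical, the multinomial draw of (births, deaths) is implemented as a binomial birth draw followed by a conditional binomial death draw, and a mutation adds a cell to the next compartment without removing the parent cell.

```python
import math
import random


def sample_binomial(rng, n, p):
    """Binomial(n, p) draw by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sample_poisson(rng, mean):
    """Poisson draw by Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def step(rng, cells, b, d, beta, dt):
    """Advance all compartments by one step; returns (new_cells, births, deaths, mutations)."""
    births, deaths = [], []
    for j, n in enumerate(cells):
        nb = sample_binomial(rng, n, b[j] * dt)
        # Deaths are drawn from the cells that did not divide, with the conditional probability.
        nd = sample_binomial(rng, n - nb, d[j] * dt / (1.0 - b[j] * dt))
        births.append(nb)
        deaths.append(nd)
    # Mutation inflow into compartment j + 1 is Poisson with mean J_j * beta_j * dt.
    mutations = [sample_poisson(rng, cells[j] * beta[j] * dt) for j in range(len(cells) - 1)]
    new_cells = [
        cells[j] + births[j] - deaths[j] + (mutations[j - 1] if j > 0 else 0)
        for j in range(len(cells))
    ]
    return new_cells, births, deaths, mutations


# Hypothetical three-compartment example with a fixed seed.
rng = random.Random(0)
cells = [1000, 50, 5]
new_cells, births, deaths, mutations = step(
    rng, cells, b=[0.02, 0.03, 0.05], d=[0.01, 0.01, 0.02], beta=[0.001, 0.002], dt=1.0
)
```

Each updated count equals the old count plus births, minus deaths, plus the mutation inflow from the preceding compartment, which mirrors the arguments of $f$ and $h$ in (15).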
Let $\underset{\sim}{e}_i(j)$ be a $(k - i) \times 1$ column vector with 1 in the $j$th position ($1 \le j \le k - i$) and 0 in the other positions. Let $\underset{\sim}{u} = (u_i, \ldots, u_{k-1})'$ and $\underset{\sim}{v} = (v_i, \ldots, v_{k-1})'$ be $(k - i) \times 1$ column vectors of nonnegative integers (i.e., $u_j$ and $v_j$ are nonnegative integers). Then, by using the probability distribution results in (14)–(16), it can readily be shown that

$$
P\{\underset{\sim}{X}_i(t + \Delta t) = \underset{\sim}{v} \mid \underset{\sim}{X}_i(t) = \underset{\sim}{u}\} =
\begin{cases}
[u_j b_j(t) + (1 - \delta_{ji})\,u_{j-1}\beta_{j-1}(t)]\,\Delta t + o(\Delta t) & \text{if } \underset{\sim}{v} = \underset{\sim}{u} + \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
u_j d_j(t)\,\Delta t + o(\Delta t) & \text{if } \underset{\sim}{v} = \underset{\sim}{u} - \underset{\sim}{e}_i(j),\ j = i, \ldots, k-1, \\
o(\Delta t) & \text{if } |\underset{\sim}{1}'_{n-k}(\underset{\sim}{u} - \underset{\sim}{v})| \ge 2.
\end{cases}
\tag{17}
$$
The above results imply that $\underset{\sim}{X}_i(t)$ is a $(k - i)$-dimensional birth-death process with birth rates $\{i\,b_u(t),\ u = i, \ldots, k-1\}$, death rates $\{i\,d_u(t),\ u = i, \ldots, k-1\}$, and cross-transition rates $\{\alpha_{u,u+1}(j, t) = j\beta_u(t),\ u = i, \ldots, k-1;\ \beta_{u,v}(j, t) = 0 \text{ if } v \ne u + 1\}$. (See Definition 4.1 in Tan ([22], Chapter 4).) Using these results, it can be shown that the Kolmogorov forward equation for the probabilities $P\{J_j(t; i) = u_j,\ j = i, \ldots, k-1 \mid J_i(0) = m_i,\ J_j(0) = 0,\ j > i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ ($i = 0, 1, \ldots, \operatorname{Min}(k-1, 2)$) in the above model is given by

$$
\begin{aligned}
\frac{d}{dt} P(u_j,\ j = i, \ldots, k-1;\ t) &= P(u_i - 1, u_j,\ j = i+1, \ldots, k-1;\ t)\,(u_i - 1)\,b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j - 1)\,b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1};\ t)\,u_j \beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j + 1)\,d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1};\ t) \times \Big\{ \sum_{j=i}^{k-1} u_j [b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j \beta_j(t) \Big\},
\end{aligned}
\tag{18}
$$

for $u_j = 0, 1, \ldots, \infty$; $j = i, \ldots, k-1$.

By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j,\ j = i, \ldots, k-1 \mid I_i(0) = m_i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ numerically.
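As an illustration of the numerical computation mentioned above, the sketch below integrates the forward equation for the simplest case of a single compartment (so the mutation terms of (18) are absent), using an explicit Euler scheme on a truncated state space. Everything here is an assumption made for the example: the function names, the constant rates, the step size, and the truncation at `max_cells`, where births out of the top state are switched off so that total probability is conserved.

```python
import math


def forward_equation_step(probs, b, d, dt):
    """One Euler step of dP(u)/dt = P(u-1)(u-1)b + P(u+1)(u+1)d - P(u)u(b+d)."""
    top = len(probs) - 1
    updated = []
    for u, p_u in enumerate(probs):
        gain_birth = probs[u - 1] * (u - 1) * b if u >= 1 else 0.0
        gain_death = probs[u + 1] * (u + 1) * d if u < top else 0.0
        # No births out of the top state, so probability cannot leak past the truncation.
        birth_out = u * b if u < top else 0.0
        loss = p_u * (birth_out + u * d)
        updated.append(p_u + dt * (gain_birth + gain_death - loss))
    return updated


def solve_forward_equation(m, b, d, t_end, dt=1e-3, max_cells=60):
    """Probabilities P(u; t_end) for u = 0..max_cells, starting from m cells at time 0."""
    probs = [0.0] * (max_cells + 1)
    probs[m] = 1.0
    for _ in range(round(t_end / dt)):
        probs = forward_equation_step(probs, b, d, dt)
    return probs


# Hypothetical rates: 3 initial cells, birth rate 0.2, death rate 0.1, one time unit.
probs = solve_forward_equation(m=3, b=0.2, d=0.1, t_end=1.0)
mean_cells = sum(u * p for u, p in enumerate(probs))
```

For a linear birth-death process the mean grows as $m\,e^{(b - d)t}$, so `mean_cells` should be close to $3e^{0.1}$; a stiffer or multi-compartment system would call for a proper ODE solver rather than this Euler loop.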
3.3 The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22], Chapter 8, and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s; i),\ s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t; i)$, where $\omega(t; i) = \int_{t_0}^{t} J_{k-1}(s; i)\,\beta_{k-1}(s)\,P_T(s, t)\,ds$. That is,

$$
T(t) \mid \{J_{k-1}(s; i),\ s \le t\} \sim \text{Poisson}(\omega(t; i)).
\tag{19}
$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by

$$
Q_i(j) = E\{e^{-\omega(t_{j-1}; i)} - e^{-\omega(t_j; i)}\} = e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}),
\tag{20}
$$

where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x; i)]\,P_T(x, t)\,dx$.
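The leading term of (20) can be illustrated with a discrete-time sketch. To keep it short, the example makes the simplifying assumption $P_T(x, t) = 1$, that is, a primary cancer cell is treated as a detectable tumor immediately; this is the classical simplification and not the setting of this paper, where $P_T(x, t) < 1$. The function name, the growth curve for $E[J_{k-1}]$, and the rate `beta` are hypothetical.

```python
import math


def interval_probabilities(mean_cells, beta, boundaries):
    """Leading term of (20) on integer time boundaries, assuming P_T = 1.

    mean_cells[s] is E[J_{k-1}] at time s; boundaries is an increasing list of
    integer times starting at 0.
    """
    def cumulative_hazard(t):
        # Discrete stand-in for beta * H(t): sum of beta * E[J_{k-1}(s)] over s < t.
        return beta * sum(mean_cells[:t])

    return [
        math.exp(-cumulative_hazard(start)) - math.exp(-cumulative_hazard(end))
        for start, end in zip(boundaries, boundaries[1:])
    ]


# Hypothetical exponential growth of the last-stage compartment.
mean_cells = [100.0 * 1.05**s for s in range(20)]
q = interval_probabilities(mean_cells, beta=1e-4, boundaries=[0, 5, 10, 20])
```

Because the terms telescope, the interval probabilities add up to one minus the probability of remaining tumor free up to the last boundary.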
To derive $Q_i(j)$, denote by

$$\theta_{i(k-1)} = E[J_i(t_0; i-1)]\,\beta_i \prod_{u=i+1}^{k-1} \frac{\beta_u}{\gamma_u}, \quad i = 1, \ldots, \text{Min}(3, k-1),$$

$$\lambda_{u(k-1)} = E[J_u(t_0; u)]\,\beta_u \prod_{v=u+1}^{k-1} \frac{\beta_v}{\gamma_v}, \quad u = 0, 1, \ldots, \text{Min}(2, k-1), \tag{21}$$

and define the functions

$$\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \text{Min}(2, k-1). \tag{22}$$
Applying results of $E[J_{k-1}(t; i)]$ given in (11) for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, we obtain the $Q_i(j)$'s as follows.

(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1;\ j > 0)$, and for $j > 0$,

$$Q_0(j) = e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} + o(\beta_1), \tag{23}$$

$$Q_1(j) = (1 - \alpha_1)\{e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)}\} + o(\beta_1). \tag{24}$$
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\,\alpha$, and for $j > 0$,

$$Q_0(j) = e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \tag{25}$$

$$Q_1(j) = e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \tag{26}$$

$$Q_2(j) = \delta_{2k}(1 - \alpha)\{e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)}\} + (1 - \delta_{2k})\{e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)}\} + o(\beta_{k-1}), \tag{27}$$

where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$\psi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \int_{t_0}^{t} [e^{\gamma_v (x - t_0)} - 1]\,P_T(x, t)\,dx. \tag{28}$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$\psi_{ii}(t) = \frac{1}{\gamma_i}\{e^{\gamma_i (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1,$$

$$\psi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2}\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\gamma_v\},$$

$$\psi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v}\{e^{\gamma_v (t - t_0)} - 1\}, \quad i = 1, \ldots, k-1. \tag{29}$$
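The braced factors in (29) come from two elementary integrals once $P_T(x, t)$ is replaced by 1 inside the integrand. As a short worked check (how the constants are absorbed into the coefficients $A_{i(k-1)}(v)$ is defined earlier in the paper and is not rederived here):

```latex
\int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,dx
  = \frac{1}{\gamma_v}\left\{e^{\gamma_v (t - t_0)} - 1\right\},
\qquad
\int_{t_0}^{t} \left[e^{\gamma_v (x - t_0)} - 1\right] dx
  = \frac{1}{\gamma_v}\left\{e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\,\gamma_v\right\}.
```

Both expressions tend to finite limits as $\gamma_v \to 0$ (namely $t - t_0$ and $0$), so the formulas remain well behaved for very small net proliferation rates.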
4. Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases

To estimate the unknown parameters and to validate the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0), (y_j, n_j), j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section, we will develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j; p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore

$$Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3, \tag{30}$$

where

$$\chi_k = \begin{cases} n_0 (p_2 + p_1 \alpha), & \text{if } k = 2, \\ n_0\,p_2\,\alpha, & \text{if } k = 3. \end{cases} \tag{31}$$

The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0 (p_2 + p_1 \alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0\,p_2\,\alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is

$$D_0(k) = -2\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\} = 2\left\{\chi_k - y_0 - y_0 \log \frac{\chi_k}{y_0}\right\}, \quad k = 2, 3. \tag{32}$$
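The Poisson deviance in (32) is simple to compute directly. The sketch below is illustrative only; the function name is an assumption, and the convention that $y \log(\mu / y) = 0$ when $y = 0$ is the usual one for Poisson deviances rather than something stated in the text.

```python
import math

def poisson_deviance(y, mu):
    """Deviance 2 * {mu - y - y * log(mu / y)} of a Poisson mean mu
    against an observed count y; the log term is taken as 0 when y == 0."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    log_term = y * math.log(mu / y) if y > 0 else 0.0
    return 2.0 * (mu - y - log_term)

y0 = 34  # observed cases at birth (age group 0 in Table 1)
print(poisson_deviance(y0, 34.0))  # 0.0: the deviance vanishes at the MLE chi_hat = y0
print(poisson_deviance(y0, 36.0))  # a small positive value for a nearby mean
```

The deviance is zero exactly at $\hat{\chi}_k = y_0$ and grows as the fitted mean moves away from the observed count, which is why minimizing it is equivalent to maximizing the Poisson likelihood.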
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group   Number of people at risk   Observed incidence   Model-F predicted   Model-1 predicted   Two-stage predicted
0           12495777                   34                   36                  36                  38
1           12221582                   20                   11                  11                  16
2           12120990                   14                   7                   8                   17
3           12112995                   9                    5                   6                   18
4           12146174                   4                    4                   5                   20
5           12161336                   3                    3                   4                   22
6           12111854                   2                    2                   4                   25
7           12160452                   2                    2                   4                   28
8           11942586                   2                    2                   4                   30
9           12381299                   1                    2                   5                   34
10          12512703                   2                    3                   6                   38
11          12410338                   5                    3                   6                   41
12          12449244                   2                    3                   6                   44
13          12527781                   5                    4                   7                   48
14          12602883                   3                    5                   7                   51
15          12719598                   5                    5                   8                   55
16          12766107                   7                    6                   9                   59
17          12831400                   9                    7                   9                   62
18          12382047                   8                    7                   9                   63
19          12581638                   10                   8                   10                  68
20          12636509                   7                    9                   11                  71
21          12682601                   6                    10                  10                  75
22          12840510                   12                   11                  11                  79
23          13075528                   17                   13                  13                  84
24          13358635                   16                   15                  14                  89
25          13473849                   12                   16                  16                  94
26          13426340                   17                   18                  18                  97
27          13525264                   28                   20                  20                  101
28          13149674                   17                   22                  21                  102
29          13812811                   23                   25                  25                  110
30          13886874                   24                   28                  27                  114
31          13488332                   37                   30                  29                  115
32          13460286                   32                   32                  32                  118
33          13256067                   38                   35                  35                  119
34          13428827                   37                   39                  38                  124
35          13220037                   40                   41                  41                  126
36          12870265                   30                   44                  44                  126
37          12689592                   43                   47                  47                  127
38          12157014                   42                   49                  49                  125
39          12494081                   46                   55                  54                  131
40          12272125                   49                   58                  58                  132
41          11826573                   56                   61                  60                  130
42          11663153                   54                   65                  64                  131
43          11407082                   53                   68                  68                  131
44          11296848                   70                   73                  72                  133
45          11016369                   57                   76                  76                  132
46          10651593                   71                   79                  79                  130
47          10475708                   87                   84                  83                  131
48          9994684                    82                   86                  85                  127
49          10138908                   78                   93                  93                  131
50          9836359                    87                   97                  96                  130
51          9475641                    95                   100                 99                  127
52          9250985                    113                  104                 104                 126
53          9027382                    106                  108                 108                 125
54          8883737                    117                  113                 113                 125
55          8547883                    129                  116                 116                 123
56          8279648                    107                  119                 119                 121
57          8062368                    119                  123                 123                 119
58          7654610                    132                  124                 124                 115
59          7563706                    118                  130                 129                 115
60          7232719                    131                  131                 130                 112
61          6927332                    116                  132                 132                 109
62          6708273                    133                  134                 134                 107
63          6543931                    143                  138                 137                 106
64          6404652                    130                  141                 141                 105
65          6168486                    145                  142                 142                 102
66          5913479                    138                  142                 142                 99
67          5746766                    151                  144                 144                 98
68          5480517                    147                  142                 142                 94
69          5363912                    149                  144                 144                 94
70          5110728                    136                  142                 142                 90
71          4925076                    144                  141                 141                 88
72          4696825                    140                  138                 138                 85
73          4512136                    146                  135                 135                 82
74          4345300                    126                  133                 133                 80
75          4148801                    122                  128                 129                 78
76          3900900                    124                  122                 122                 74
77          3681587                    114                  116                 116                 70
78          3481918                    93                   110                 110                 67
79          3243631                    102                  102                 102                 63
80          2961234                    79                   92                  93                  58
81          2724984                    93                   83                  84                  54
82          2495219                    82                   75                  75                  50
83          2271595                    92                   66                  67                  46
84          2041351                    64                   57                  58                  42
85          10466605                   304                  282                 285                 216

Note 1: For age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: The observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ $(j \ge 1)$. To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij}\,Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$

Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2;\ j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij}\,Q_i(j) + (1 - \delta_{2k})\,n_{2j}\,Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$

$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\,h\{y_j; Q_T(j)\}, \tag{34}$$

where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.

The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
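For a small population size the mixture in (34) can be evaluated by brute force, which makes its structure concrete. The following is a sketch under stated assumptions: the function names are hypothetical, all three genotype probabilities are assumed strictly positive, and the exact double sum is only practical for small $n_j$ (for the SEER populations one would not enumerate all genotype counts).

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability h(y; mu), computed on the log scale."""
    if mu == 0.0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def trinomial_pmf(n0, n1, n, p0, p1):
    """Multinomial probability g(n0, n1; n, p0, p1) with p2 = 1 - p0 - p1."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    log_coef = (math.lgamma(n + 1) - math.lgamma(n0 + 1)
                - math.lgamma(n1 + 1) - math.lgamma(n2 + 1))
    return math.exp(log_coef + n0 * math.log(p0)
                    + n1 * math.log(p1) + n2 * math.log(p2))

def mixture_pmf(y, n, p0, p1, Q, k):
    """Equation (34): multinomial-weighted sum of Poisson(y; Q_T) terms.
    Q = (Q_0(j), Q_1(j), Q_2(j)); the J_2 term is dropped when k == 2."""
    delta = 1 if k == 2 else 0
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            q_t = n0 * Q[0] + n1 * Q[1] + (1 - delta) * n2 * Q[2]
            total += trinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_t)
    return total
```

Summing `mixture_pmf(y, ...)` over `y` gives 1 up to rounding, and for $k > 2$ the mean of the mixture equals $n_j \sum_i p_i Q_i(j)$, as expected from averaging the Poisson mean $Q_T(j)$ over the multinomial genotype counts.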
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is

$$L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$

Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k-1$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $(j = 1, \ldots, t_N)$ and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1)$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is

$$(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\left(y_j; \frac{n_{ij}\,Q_i(j)}{Q_T(j)}, i = 0, 1\right). \tag{36}$$

Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$

$$P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\} = P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\}$$
$$= \frac{1}{y_j!}\,e^{-Q_T(j)}\,Q_T(j)^{y_j} \times g\left\{y_{0j}, y_{1j}; y_j, \frac{n_{ij}\,Q_i(j)}{Q_T(j)}, i = 0, 1\right\} = \prod_{i=0}^{2} h\{y_{ij}; n_{ij}\,Q_i(j)\} \quad (k > 2), \tag{37}$$

where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.

If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus, we have for $k = 2$

$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij}\,Q_i(j)\right\}, \tag{38}$$

$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left(y_j; \frac{n_{ij}\,Q_i(j)}{\sum_{u=0}^{1} n_{uj}\,Q_u(j)}\right), \quad i = 0, 1. \tag{39}$$

It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is

$$P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\} = P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} = \prod_{i=0}^{1} h\{y_{ij}; n_{ij}\,Q_i(j)\} \quad \text{if } k = 2, \tag{40}$$

where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
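The last equalities in (37) and (40) rest on the standard fact that a Poisson total split multinomially gives independent Poisson components. As a short worked check, writing $\mu_i = n_{ij}\,Q_i(j)$, $Q_T = \sum_i \mu_i$, and $y = \sum_i y_i$ (shorthand introduced here only for this check):

```latex
\frac{e^{-Q_T}\,Q_T^{\,y}}{y!}
\times
\frac{y!}{\prod_i y_i!}\prod_i \left(\frac{\mu_i}{Q_T}\right)^{y_i}
=
e^{-\sum_i \mu_i}\,\frac{Q_T^{\,y}}{Q_T^{\,\sum_i y_i}}\,
\prod_i \frac{\mu_i^{\,y_i}}{y_i!}
=
\prod_i \frac{e^{-\mu_i}\,\mu_i^{\,y_i}}{y_i!}.
```

The factor $y!$ cancels and the powers of $Q_T$ cancel because $\sum_i y_i = y$, leaving a product of Poisson probability functions with means $n_{ij}\,Q_i(j)$.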
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\tilde{n} = (n_j, j = 0, 1, \ldots, t_N)$, and $\tilde{y} = (y_j, j = 0, 1, \ldots, t_N)$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \tilde{y})$ given $(\mathbf{N}, \tilde{n})$

$$P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \{h[y_{2j}; n_{2j}\,Q_2(j)]\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij}\,Q_i(j)\}. \tag{41}$$

It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \tilde{y}\}$ given $(\tilde{n}, \Theta)$ is

$$P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \{h[y_{2j}; n_{2j}\,Q_2(j)]\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij}\,Q_i(j)\}. \tag{42}$$

Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta] - \log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \hat{\Theta}]\}$ is

$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$

where

$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\right\}, \tag{44}$$

$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i}\right\}, \tag{45}$$

$$D_j = 2 \sum_{i=0}^{1} \left\{n_{ij}\,Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij}\,Q_i(j)}{y_{ij}}\right\} + 2(1 - \delta_{2k}) \left\{n_{2j}\,Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j}\,Q_2(j)}{y_{2j}}\right\}, \tag{46}$$

where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ $(i = 0, 1, 2)$.

The joint probability density function $P\{\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta\}$ of $(\mathbf{Y}, \tilde{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
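The multinomial part of the deviance, equation (45), can be illustrated with aggregated genotype counts. This is a minimal sketch, assuming the counts have already been summed over age groups and that the same totals are used for both the deviance and the estimates $\hat{p}_i$; the function names are hypothetical.

```python
import math

def genotype_proportions(counts):
    """Estimates p_hat_i = (sum_j n_ij) / (sum_j n_j) from aggregated counts
    counts = [n_0., n_1., n_2.], each already summed over age groups j."""
    total = sum(counts)
    return [c / total for c in counts]

def multinomial_deviance(counts, p1, p2):
    """Dev(p1, p2) = 2 * sum_i n_i. * log(p_hat_i / p_i), with p_0 = 1 - p1 - p2.
    Terms with a zero count contribute 0."""
    p = [1.0 - p1 - p2, p1, p2]
    p_hat = genotype_proportions(counts)
    dev = 0.0
    for n_i, ph_i, p_i in zip(counts, p_hat, p):
        if n_i > 0:
            dev += n_i * math.log(ph_i / p_i)
    return 2.0 * dev

counts = [9900, 95, 5]  # hypothetical genotype totals, not SEER values
p_hat = genotype_proportions(counts)
print(multinomial_deviance(counts, p_hat[1], p_hat[2]))  # essentially zero at the MLE
```

Away from $\hat{p}_i$ the deviance is strictly positive, so within the constraints on $(p_1, p_2)$ the mode is the feasible point closest, in this likelihood sense, to the sample proportions.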
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by

$$Q_i(j) \approx E\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$

where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.

Under the discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results on expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain

$$\beta_{k-1} E[G_i(t)] = \sum_{r=i}^{i+1} E[J_r(t_0; i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} = \theta_{(i+1)(k-1)}\,\phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\,\phi_{i(k-1)}(t), \tag{48}$$

where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$'s are given by

$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v}\{(1 + \gamma_v)^{t - t_0} - 1\}, \quad i = 0, 1, 2, 3. \tag{49}$$

Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2}\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\}. \tag{50}$$

Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1;\ j > 0)$, and for $j > 0$,

$$Q_0(j) \approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1),$$

$$Q_1(j) \approx (1 - \alpha_1)\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\} + o(\beta_1). \tag{51}$$

(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\,\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,

$$Q_0(j) \approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}),$$

$$Q_1(j) \approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}),$$

$$Q_2(j) \approx \delta_{2(k-1)}(1 - \alpha)\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\} + (1 - \delta_{2(k-1)})\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\} + o(\beta_{k-1}). \tag{52}$$

Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
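The quality of the approximation $(1 + \gamma_v)^{t - t_0} \approx e^{(t - t_0)\gamma_v}$ depends on how small $\gamma_v$ is. The sketch below compares the single-stage continuous and discrete building blocks, $\{e^{\gamma (t - t_0)} - 1\}/\gamma$ and $\{(1 + \gamma)^{t - t_0} - 1\}/\gamma$; the function names and the numerical values of $\gamma$ are illustrative assumptions, not estimates from the paper.

```python
import math

def psi_single(t, t0, gamma):
    """Continuous-time factor {exp(gamma * (t - t0)) - 1} / gamma."""
    return (math.exp(gamma * (t - t0)) - 1.0) / gamma

def phi_single(t, t0, gamma):
    """Discrete-time factor {(1 + gamma)**(t - t0) - 1} / gamma, which equals
    the geometric sum of (1 + gamma)**(s - t0) over s = t0, ..., t - 1."""
    return ((1.0 + gamma) ** (t - t0) - 1.0) / gamma

def relative_gap(t, t0, gamma):
    """Relative difference between the continuous and discrete factors."""
    cont = psi_single(t, t0, gamma)
    return abs(cont - phi_single(t, t0, gamma)) / cont

for gamma in (0.001, 0.01, 0.2):  # hypothetical net proliferation rates
    print(gamma, relative_gap(21, 1, gamma))
```

With 20 time units elapsed, the gap is well under one percent for $\gamma = 0.01$ but becomes substantial for $\gamma = 0.2$, so the equivalence of the two sets of formulas should be read as holding for small proliferation rates per time unit.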
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence, one may use results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].

The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \tilde{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints and equals zero otherwise. These biological constraints are as follows.

(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k-1)$, we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.

We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
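In practice a prior of the form (53) acts as an indicator of the constraint region: the log prior is a constant inside the region and minus infinity outside. The following is a minimal sketch of constraints (i) to (iv); the function name, the argument layout, and the choice of 0 for the constant are assumptions made for illustration.

```python
import math

def log_partially_informative_prior(p1, p2, gammas, omega0, betas, lambdas, thetas):
    """Return 0.0 (log of a constant, up to normalization) if the parameters
    satisfy constraints (i)-(iv), and -inf otherwise.
    gammas  = (gamma_1, gamma_2)
    betas   = (beta_1, ..., beta_{k-1})
    lambdas = (lambda_0, lambda_1, lambda_2)
    thetas  = (theta_1, theta_2)"""
    ok = (
        0.0 < p1 < 1e-2 and 0.0 < p2 < 1e-6                      # (i)
        and all(-0.01 < g < 1.0 for g in gammas)                 # (i)
        and 1.0 < omega0 < 1000.0                                # (ii)
        and all(1e-8 < b < 1e-3 for b in betas)                  # (ii)
        and 0.0 < lambdas[0] < 10.0 and 0.0 < lambdas[1] < 10.0  # (iii)
        and 0.0 < lambdas[2] < 1e3                               # (iii)
        and 0.0 < thetas[0] < 1e-2 and 0.0 < thetas[1] < 1e-1    # (iv)
    )
    return 0.0 if ok else -math.inf
```

Adding this value to a log likelihood rejects any proposal outside the biologically specified region while leaving the relative posterior weights of feasible parameter values unchanged.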
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$, we obtain

$$P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto (\chi(k))^{y_0}\,e^{-\chi(k)}\,p_1^{\sum_{j=1}^{t_N} n_{1j}}\,p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,$$

$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)}\,Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that, numerically, the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and with Steps 3 and 4 as the M-step, respectively [29]. The multilevel Gibbs sampling procedure is as follows.

Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \tilde{y}, \tilde{n}, \Theta\}$ even though the latter is unknown. (For proof, see Tan [22], Chapter 3.) Call the generated sample $\hat{\mathbf{N}}$.

Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\tilde{y}$, $\tilde{n}$, $\Theta$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \hat{\mathbf{N}}, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.

Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.

Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $(\tilde{y}, \tilde{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.

Step 5 (Recycling Step). With $\{\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1\} = \{\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ given above, go back to Step 1 and continue until convergence.

The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$ independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and uses the sample variances and covariances as estimates of the variances and covariances of these estimates.
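The selection in Step 1 relies on the weighted bootstrap idea: draw many candidates from a distribution that is easy to sample, then resample them with probabilities proportional to a weight function. The sketch below illustrates only that resampling mechanism in generic form; the function name, the use of Python's `random.Random.choices`, and the toy weight function in the usage note are assumptions, not the authors' code.

```python
import random

def weighted_bootstrap(candidates, weight_fn, rng, size=1):
    """Resample `size` items from `candidates` with probability proportional
    to weight_fn(candidate). In Step 1 the candidates would be draws of N
    from the multinomial distribution and weight_fn the density
    P{Y, y | N, n, Theta}; both are supplied by the caller here."""
    weights = [weight_fn(c) for c in candidates]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be nonnegative with a positive sum")
    return rng.choices(candidates, weights=weights, k=size)

# Toy usage: candidates drawn uniformly, reweighted toward values near 3.
rng = random.Random(2013)
candidates = [rng.randint(0, 6) for _ in range(5000)]
selected = weighted_bootstrap(candidates, lambda c: 1.0 / (1 + (c - 3) ** 2), rng, size=1000)
```

The larger the initial candidate sample, the closer the resampled values are to draws from the target distribution, which is why Step 1 asks for a large sample of $\mathbf{N}$ before selection.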
6 A NewMultistage Stochastic Model for AdultEye Cancer (Uveal Melanoma)mdashAn Example
The human eye cancers consist of pediatric eye cancers andadult eye cancers The most common pediatric eye cancer isthe retinoblastoma which develops from the retinal pigmentepithelium cells underlying the retina that do not formmelanoma The most common adult eye cancers are theuveal melanomas involving the iris the ciliary body andthe choroid (collectively referred to as the uveal) Thesecancers develop from melanocytes (pigment cells) whichreside within the uveal giving color to the eye In Tan andZhou [9] we have developed a modified two-stage model forretinoblastoma Based on results frommolecular biology (seeLandreville et al [14] Mensink et al [15] and Loercher andHarbour [31]) Landreville et al [14] have proposed a threestage model for uveal melanoma as given in Figure 2 As anexample of applications of this paper in this section we willapply this model of uveal melanoma to the NCINIH eyecancer data from the SEER project We notice that the samemethods can be applied to model other human cancers aswell but this will be our future research
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Therefore, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    −1312.02         2648.04    2677.35
Three-stage, Model-1    −1295.76         2609.53    2631.51
Two-stage               −4392.421        8796.843   8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Plot not reproduced; horizontal axis: age in years, 0 to 84; vertical axis: incidences, 0 to 350; series: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better judging from the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly: the chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F, whereas the AIC and BIC values of the two-stage model (8796.84, 8811.57) are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, the estimate of $\gamma_1$ is close to zero (of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0, i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from biological observations gives a rough idea of the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] and assume $E[N(t_0)] = E[J_i(t_0, i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.

(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why cancer incidence at birth is observed for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
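The model-comparison quantities used in observation (a) can be recomputed with a few lines of code. The sketch below uses the standard definitions AIC = −2 log L + 2p and BIC = −2 log L + p log n together with the Pearson chi-square statistic from (a). It reads the Model-F log-likelihood as −1312.02 with p = 12 parameters (the count implied by df = 86 − 12); the choice n = 85 in the BIC call is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2 p."""
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log n."""
    return -2.0 * log_lik + n_params * math.log(n_obs)

def pearson_chi_square(observed, predicted):
    """Sum over age groups of (y_i - yhat_i)^2 / yhat_i."""
    return sum((y - yhat) ** 2 / yhat for y, yhat in zip(observed, predicted))

# Model-F, using the log-likelihood reported in Table 2.
model_f_aic = aic(-1312.02, 12)
model_f_bic = bic(-1312.02, 12, 85)  # n = 85 is an assumed sample size
```

Under these assumptions the two calls give values that agree with the Model-F row of Table 2 to two decimal places.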
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution that accounts for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because it combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($Y$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{Y, \tilde{y} \mid N, \Theta\}$) given by (37) and (40).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter     Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$   2.88E−01    1.21E−01    5.03E−02       5.25E−01
$\lambda_1$   4.81E−01    4.65E−02    3.89E−01       5.72E−01
$\lambda_2$   6.55E+02    1.12E+02    4.34E+02       8.75E+02
$\gamma_1$    2.71E−05    8.86E−06    5.34E−06       4.01E−05
$\gamma_2$    3.37E−02    1.92E−04    3.33E−02       3.41E−02
$p_1$         9.95E−04    1.75E−06    9.91E−04       9.98E−04
$p_2$         9.68E−07    6.48E−09    9.55E−07       9.81E−07
$\alpha$      8.41E−01    4.19E−02    7.59E−01       9.23E−01
$\theta_1$    3.29E−04    3.54E−05    2.60E−04       3.98E−04
$\theta_2$    3.41E−01    1.07E−02    3.20E−01       3.62E−01

Model-1
Parameter     Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$   6.55E−02    2.54E−02    1.57E−02       1.15E−01
$\lambda_1$   3.72E−01    4.63E−02    2.82E−01       4.63E−01
$\lambda_2$   1.98E+02    3.38E+01    1.32E+02       2.65E+02
$\gamma_1$    NA          NA          NA             NA
$\gamma_2$    3.49E−02    3.58E−03    2.79E−02       4.19E−02
$p_1$         9.98E−04    6.27E−05    8.75E−04       1.12E−03
$p_2$         1.00E−06    4.53E−08    9.11E−07       1.09E−06
$\alpha$      8.07E−01    7.06E−02    6.68E−01       9.45E−01
$\theta_1$    NA          NA          NA             NA
$\theta_2$    NA          NA          NA             NA

Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5), whereas the classical 2-stage model fitted the data very poorly (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage model is more general than the classical 3-stage model described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, indicating that $P_T(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically: (1) We have estimated, for the first time, the frequency of the staging-limiting tumor suppressor gene in the US population ($\hat{p}_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $\hat{y}_0 = n_0 \hat{\alpha} \hat{p}_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can readily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations of the state variables, respectively:

$$
\begin{aligned}
J_i(t+1, i) &= J_i(t, i)\,[1 + \gamma_i(t)] + e_i(t+1, i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t+1, u) &= J_j(t, u)\,[1 + \gamma_j(t)] + J_{j-1}(t, u)\,\beta_{j-1}(t) + e_j(t+1, u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$
where $e_i(t+1, i) = [B_i(t, i) - J_i(t, i)\,b_i(t)] - [D_i(t, i) - J_i(t, i)\,d_i(t)]$ and $e_j(t+1, i) = [B_j(t, i) - J_j(t, i)\,b_j(t)] - [D_j(t, i) - J_j(t, i)\,d_j(t)] + [M_{j-1}(t, i) - J_{j-1}(t, i)\,\beta_{j-1}(t)]$ for $j > i$.

The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0, i) > 0$ if $j = i, i+1$ and $J_j(t_0, i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$
\begin{aligned}
J_i(t, i) &= J_i(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s, i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t, i) &= J_j(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s, i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s, i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
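The closed-form solution (A2) can be checked numerically against direct iteration of the difference equations (A1). The sketch below does this for one pair of compartments ($J_i$ and $J_j$ with $j = i + 1$), with arbitrary time-varying rates and arbitrary values standing in for the random noise terms $e_i$ and $e_j$. All rates, initial values, and function names are made up for the check and carry no biological meaning.

```python
import random

def growth(gamma, lo, hi):
    """Product of (1 + gamma(u)) for u = lo, ..., hi; equals 1 when lo > hi."""
    out = 1.0
    for u in range(lo, hi + 1):
        out *= 1.0 + gamma(u)
    return out

def recursion(t0, t, ji0, jj0, gam_i, gam_j, beta_i, e_i, e_j):
    """Iterate the difference equations (A1) from time t0 up to time t."""
    ji, jj = ji0, jj0
    for s in range(t0, t):
        # The right-hand side uses the values at time s for both updates.
        ji, jj = (ji * (1.0 + gam_i(s)) + e_i[s + 1],
                  jj * (1.0 + gam_j(s)) + ji * beta_i(s) + e_j[s + 1])
    return ji, jj

def closed_ji(t0, t, ji0, gam_i, e_i):
    """First line of (A2)."""
    return (ji0 * growth(gam_i, t0, t - 1)
            + sum(e_i[s] * growth(gam_i, s, t - 1) for s in range(t0 + 1, t + 1)))

def closed_jj(t0, t, ji0, jj0, gam_i, gam_j, beta_i, e_i, e_j):
    """Second line of (A2), specialised to j = i + 1."""
    inflow = sum(closed_ji(t0, s, ji0, gam_i, e_i) * beta_i(s)
                 * growth(gam_j, s + 1, t - 1) for s in range(t0, t))
    noise = sum(e_j[s] * growth(gam_j, s, t - 1) for s in range(t0 + 1, t + 1))
    return jj0 * growth(gam_j, t0, t - 1) + inflow + noise

rng = random.Random(0)
t0, t = 1, 20
e_i = {s: rng.uniform(-1.0, 1.0) for s in range(t0 + 1, t + 1)}
e_j = {s: rng.uniform(-1.0, 1.0) for s in range(t0 + 1, t + 1)}
gam_i = lambda s: 0.010 + 0.001 * s      # illustrative rates only
gam_j = lambda s: 0.030 - 0.001 * s
beta_i = lambda s: 1e-3 * (1.0 + 0.1 * s)

ji_rec, jj_rec = recursion(t0, t, 100.0, 5.0, gam_i, gam_j, beta_i, e_i, e_j)
ji_cf = closed_ji(t0, t, 100.0, gam_i, e_i)
jj_cf = closed_jj(t0, t, 100.0, 5.0, gam_i, gam_j, beta_i, e_i, e_j)
```

Both routes should agree up to floating-point rounding, which confirms the index ranges of the sums and products in (A2).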
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$ and $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ for $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0, i) = 0$, $j > i+1$) reduce to
$$
\begin{aligned}
J_i(t, i) &= J_i(t_0, i)\,(1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t, i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t, i) &= J_j(t_0, i)\,(1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0, i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\}\, \eta_j^{(u)}(t, i) \\
&= \sum_{u=i}^{i+1} J_u(t_0, i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\}\, \eta_j^{(u)}(t, i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t, i) = \sum_{s=t_0+1}^{t} e_j(s, i)\,(1 + \gamma_i)^{t-s}$ and $\eta_j^{(u)}(t, i) = \sum_{s=t_0+1}^{t} e_{j-u}(s, i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ for $i \neq j$, the $E[J_{k-1}(t, i)]$'s in discrete time models under the initial conditions ($J_j(t_0, i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t, i)] = \sum_{u=i}^{i+1} E[J_u(t_0, i)] \Big( \prod_{v=u}^{k-2} \beta_v \Big) \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2).
\tag{A4}
$$
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
$j > i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ ($i = 0, 1, \ldots, \mathrm{Min}(k-1, 2)$) in the above model is given by

$$
\begin{aligned}
\frac{d}{dt} P(u_j,\ j = i, \ldots, k-1;\ t)
&= P(u_i - 1,\ u_j,\ j = i+1, \ldots, k-1;\ t)\,(u_i - 1)\,b_i(t) \\
&\quad + \sum_{j=i+1}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j - 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j - 1)\,b_j(t) \\
&\quad + \sum_{j=i}^{k-2} P(u_i, u_{i+1}, \ldots, u_j, u_{j+1} - 1, u_{j+2}, \ldots, u_{k-1};\ t)\,u_j\,\beta_j(t) \\
&\quad + \sum_{j=i}^{k-1} P(u_i, u_{i+1}, \ldots, u_{j-1}, u_j + 1, u_{j+1}, \ldots, u_{k-1};\ t)\,(u_j + 1)\,d_j(t) \\
&\quad - P(u_i, u_{i+1}, \ldots, u_{k-1};\ t) \Big\{ \sum_{j=i}^{k-1} u_j\,[b_j(t) + d_j(t)] + \sum_{j=i}^{k-2} u_j\,\beta_j(t) \Big\},
\end{aligned}
\tag{18}
$$

for $u_j = 0, 1, \ldots, \infty$, $j = i, \ldots, k-1$.
By using the above set of differential equations, one can readily compute the probabilities $P\{J_j(t) = u_j,\ j = i, \ldots, k-1 \mid I_i(0) = m_i\} = P(u_j,\ j = i, \ldots, k-1;\ t)$ numerically.
3.3. The Probability Distributions of the Number of Detectable Tumors and Times to Tumors. As shown by Yang and Chen [11], malignant cancer tumors arise from primary $J_k$ cells by clonal expansion, where primary $J_k$ cells are $J_k$ cells generated directly by $J_{k-1}$ cells ($J_k$ cells derived by stochastic birth of other $J_k$ cells are not primary $J_k$ cells). That is, cancer tumors develop from primary $J_k$ cells through stochastic birth-death processes.

To derive the probability distribution for $T(t)$ in $J_i$ people in the population, let $P_T(s, t)$ denote the probability that a primary cancer cell at time $s$ develops into a detectable cancer tumor by time $t$. (An explicit formula for $P_T(s, t)$ has been given in Tan [22], Chapter 8, and in Tan and Chen [24].) Then, as shown in Tan ([3, 22], Chapter 8), the conditional probability distribution of $T(t)$ given $\{J_{k-1}(s, i),\ s \le t\}$ in $J_i$ people is Poisson with mean $\omega(t, i)$, where $\omega(t, i) = \int_{t_0}^{t} J_{k-1}(s, i)\,\beta_{k-1}(s)\,P_T(s, t)\,ds$. That is,

$$
T(t) \mid \{J_{k-1}(s, i),\ s \le t\} \sim \mathrm{Poisson}(\omega(t, i)).
\tag{19}
$$
Let $Q_i(j)$ be the probability that cancer tumors develop during $(t_{j-1}, t_j]$ in $J_i$ people in the population. For time homogeneous models with small $\beta_{k-1}$, $Q_i(j)$ is then given by

$$
Q_i(j) = E\big\{ e^{-\omega(t_{j-1}, i)} - e^{-\omega(t_j, i)} \big\}
= e^{-\beta_{k-1} H_i(t_{j-1})} - e^{-\beta_{k-1} H_i(t_j)} + o(\beta_{k-1}),
\tag{20}
$$

where $H_i(t) = \int_{t_0}^{t} E[J_{k-1}(x, i)]\,P_T(x, t)\,dx$.
To derive $Q_i(j)$, denote by

$$
\begin{aligned}
\theta_{i(k-1)} &= E[J_i(t_0, i-1)]\,\beta_i \prod_{u=i+1}^{k-1} \Big( \frac{\beta_u}{\gamma_u} \Big), \quad i = 1, \ldots, \mathrm{Min}(3, k-1), \\
\lambda_{u(k-1)} &= E[J_u(t_0, u)]\,\beta_u \prod_{v=u+1}^{k-1} \Big( \frac{\beta_v}{\gamma_v} \Big), \quad u = 0, 1, \ldots, \mathrm{Min}(2, k-1),
\end{aligned}
\tag{21}
$$

and define the functions

$$
\psi_{i(k-1)}(t) = \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx, \quad i = 0, 1, \ldots, \mathrm{Min}(2, k-1).
\tag{22}
$$
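The definition of $\lambda_{u(k-1)}$ in (21) can be inverted to obtain rough magnitudes of the mutation rates $\beta_u$ once the $\lambda$'s, the $\gamma$'s, and the initial cell numbers are given. The sketch below does this for a 3-stage model ($k - 1 = 2$), working backward from $u = 2$. The numerical inputs are the Model-F estimates reported in Table 3, and the common value $10^{8}$ for $E[J_u(t_0, u)]$ is the assumption attributed to Potten et al. [32] in Section 6; the function name is hypothetical.

```python
def back_out_beta(lam, gamma, expected_cells):
    """Solve lambda_u = E[J_u] * beta_u * prod_{v>u} (beta_v / gamma_v) for beta_u.

    lam: [lambda_0, lambda_1, lambda_2]; gamma: [gamma_0, gamma_1, gamma_2]
    (gamma_0 is never used); expected_cells: assumed common value of E[J_u].
    """
    last = len(lam) - 1
    beta = [0.0] * (last + 1)
    for u in range(last, -1, -1):          # beta_2 first, then beta_1, beta_0
        ratio = 1.0
        for v in range(u + 1, last + 1):
            ratio *= beta[v] / gamma[v]
        beta[u] = lam[u] / (expected_cells * ratio)
    return beta

# Model-F estimates from Table 3; 1e8 cells is an assumed order of magnitude.
beta = back_out_beta(lam=[0.288, 0.481, 655.0],
                     gamma=[None, 2.71e-5, 3.37e-2],
                     expected_cells=1e8)
```

Under these assumptions all three rates land between $10^{-6}$ and $10^{-4}$, in line with the order of magnitude quoted in Section 6.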
Applying the results for $E[J_{k-1}(t, i)]$ given in (11) for time homogeneous models with $\gamma_i \neq \gamma_j$ for $i \neq j$, we obtain the $Q_i(j)$'s as follows.

(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$; $j > 0$), and for $j > 0$,

$$
Q_0(j) = \big\{ e^{-\theta_{11}\psi_{11}(t_{j-1}) - \lambda_{01}\psi_{01}(t_{j-1})} - e^{-\theta_{11}\psi_{11}(t_j) - \lambda_{01}\psi_{01}(t_j)} \big\} + o(\beta_1),
\tag{23}
$$

$$
Q_1(j) = (1 - \alpha_1) \big\{ e^{-\lambda_{11}\psi_{11}(t_{j-1})} - e^{-\lambda_{11}\psi_{11}(t_j)} \big\} + o(\beta_1).
\tag{24}
$$

(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\,\alpha$, and for $j > 0$,

$$
Q_0(j) = \big\{ e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\psi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\psi_{0(k-1)}(t_j)} \big\} + o(\beta_{k-1}),
\tag{25}
$$

$$
Q_1(j) = \big\{ e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\psi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\psi_{1(k-1)}(t_j)} \big\} + o(\beta_{k-1}),
\tag{26}
$$

$$
\begin{aligned}
Q_2(j) &= \delta_{2k}\,(1 - \alpha) \big\{ e^{-\lambda_{22}\psi_{22}(t_{j-1})} - e^{-\lambda_{22}\psi_{22}(t_j)} \big\} \\
&\quad + (1 - \delta_{2k}) \big\{ e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\psi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\psi_{2(k-1)}(t_j)} \big\} \\
&\quad + o(\beta_{k-1}),
\end{aligned}
\tag{27}
$$

where $\delta_{2k} = 1$ if $k - 1 = 2$ (that is, $k = 3$) and $\delta_{2k} = 0$ otherwise.
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$
\begin{aligned}
\psi_{0(k-1)}(t) &= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \int_{t_0}^{t} e^{\gamma_v (x - t_0)}\,P_T(x, t)\,dx \\
&= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\,\frac{1}{\gamma_v} \int_{t_0}^{t} \big[ e^{\gamma_v (x - t_0)} - 1 \big]\,P_T(x, t)\,dx.
\end{aligned}
\tag{28}
$$
Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$
\begin{aligned}
\psi_{ii}(t) &= \frac{1}{\gamma_i} \big\{ e^{\gamma_i (t - t_0)} - 1 \big\}, \quad i = 1, \ldots, k-1, \\
\psi_{0(k-1)}(t) &= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\,\frac{1}{\gamma_v^{2}} \big\{ e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\,\gamma_v \big\}, \\
\psi_{i(k-1)}(t) &= \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\,\frac{1}{\gamma_v} \big\{ e^{\gamma_v (t - t_0)} - 1 \big\}, \quad i = 1, \ldots, k-1.
\end{aligned}
\tag{29}
$$
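The first reduction in (29) can be verified numerically. When $k - 1 = i$, the product in (22) is empty and the sum has a single term, so $\psi_{ii}(t)$ is just the integral of $e^{\gamma_i (x - t_0)} P_T(x, t)$ over $[t_0, t]$ (taking the single coefficient $A_{ii}(i)$ to be 1, which is an assumption here since $A$ is defined outside this section). The sketch below compares a midpoint-rule evaluation of that integral with $P_T \equiv 1$ against the closed form; the numerical values and function names are illustrative only.

```python
import math

def psi_ii_closed(gamma_i, t, t0):
    """Closed form from (29): (exp(gamma_i (t - t0)) - 1) / gamma_i."""
    return (math.exp(gamma_i * (t - t0)) - 1.0) / gamma_i

def psi_ii_numeric(gamma_i, t, t0, p_t=lambda x, t: 1.0, steps=10000):
    """Midpoint-rule integral of exp(gamma_i (x - t0)) * P_T(x, t) on [t0, t]."""
    h = (t - t0) / steps
    total = 0.0
    for n in range(steps):
        x = t0 + (n + 0.5) * h
        total += math.exp(gamma_i * (x - t0)) * p_t(x, t)
    return total * h

# Illustrative values: a rate of order 1e-2 per time unit over 99 time units.
closed = psi_ii_closed(0.0337, 100.0, 1.0)
numeric = psi_ii_numeric(0.0337, 100.0, 1.0)
```

Passing a different `p_t` function lets the same routine evaluate the integral when $P_T(x, t) < 1$, which is the situation the Discussion reports for short time steps.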
4 Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases
For estimating unknown parameters and validating the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0),\ (y_j, n_j),\ j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. Notice in Table 1 that there are some cancer cases at birth, implying some inherited cancer cases. In this section, we develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ ($i = 0, 1, 2$) at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \mathrm{Multinomial}\{n_j;\ p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \mathrm{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows, we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1. The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ ($i = 0, 1, 2$) people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore
\[
Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3, \tag{30}
\]
where
\[
\chi_k =
\begin{cases}
n_0 (p_2 + p_1 \alpha) & \text{if } k = 2,\\
n_0\, p_2\, \alpha & \text{if } k = 3.
\end{cases}
\tag{31}
\]
The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0(p_2 + p_1\alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is
\[
D_0(k) = -2\left\{\log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k)\right\}
= 2\left\{\chi_k - y_0 - y_0 \log\frac{\chi_k}{y_0}\right\}, \quad k = 2, 3,
\tag{32}
\]
where $h(y; \chi)$ denotes the Poisson probability density function with mean $\chi$.
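As an illustration of (30)-(32), the following minimal Python sketch computes the Poisson mean $\chi_k$ and the deviance $D_0(k)$. The function names and the parameter values are hypothetical and chosen only for illustration; only $n_0$ and $y_0$ are taken from the birth row of Table 1, and the values of $p_1$, $p_2$, and $\alpha$ are not the fitted estimates.

```python
import math

def chi_k(n0, p1, p2, alpha, k):
    """Poisson mean of Y_0 as in (31), for the 2-stage or 3-stage model."""
    if k == 2:
        return n0 * (p2 + p1 * alpha)
    if k == 3:
        return n0 * p2 * alpha
    raise ValueError("k must be 2 or 3")

def deviance_d0(y0, chi):
    """Poisson deviance 2{chi - y0 - y0*log(chi/y0)} as in (32).

    When y0 == 0 the term y0*log(chi/y0) is taken as its limit, 0.
    """
    if y0 == 0:
        return 2.0 * chi
    return 2.0 * (chi - y0 - y0 * math.log(chi / y0))

# n0 and y0 from the age-0 row of Table 1; p1, p2, alpha are illustrative only.
n0, y0 = 12495777, 34
chi3 = chi_k(n0, p1=1e-3, p2=3e-6, alpha=0.85, k=3)
d0 = deviance_d0(y0, chi3)
```

The deviance is zero exactly when $\chi_k = y_0$, which is the maximum likelihood estimate noted above, and positive otherwise.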
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group | Number of people at risk | Observed incidence | Model-F predicted | Model-1 predicted | Two-stage predicted
0 | 12495777 | 34 | 36 | 36 | 38
1 | 12221582 | 20 | 11 | 11 | 16
2 | 12120990 | 14 | 7 | 8 | 17
3 | 12112995 | 9 | 5 | 6 | 18
4 | 12146174 | 4 | 4 | 5 | 20
5 | 12161336 | 3 | 3 | 4 | 22
6 | 12111854 | 2 | 2 | 4 | 25
7 | 12160452 | 2 | 2 | 4 | 28
8 | 11942586 | 2 | 2 | 4 | 30
9 | 12381299 | 1 | 2 | 5 | 34
10 | 12512703 | 2 | 3 | 6 | 38
11 | 12410338 | 5 | 3 | 6 | 41
12 | 12449244 | 2 | 3 | 6 | 44
13 | 12527781 | 5 | 4 | 7 | 48
14 | 12602883 | 3 | 5 | 7 | 51
15 | 12719598 | 5 | 5 | 8 | 55
16 | 12766107 | 7 | 6 | 9 | 59
17 | 12831400 | 9 | 7 | 9 | 62
18 | 12382047 | 8 | 7 | 9 | 63
19 | 12581638 | 10 | 8 | 10 | 68
20 | 12636509 | 7 | 9 | 11 | 71
21 | 12682601 | 6 | 10 | 10 | 75
22 | 12840510 | 12 | 11 | 11 | 79
23 | 13075528 | 17 | 13 | 13 | 84
24 | 13358635 | 16 | 15 | 14 | 89
25 | 13473849 | 12 | 16 | 16 | 94
26 | 13426340 | 17 | 18 | 18 | 97
27 | 13525264 | 28 | 20 | 20 | 101
28 | 13149674 | 17 | 22 | 21 | 102
29 | 13812811 | 23 | 25 | 25 | 110
30 | 13886874 | 24 | 28 | 27 | 114
31 | 13488332 | 37 | 30 | 29 | 115
32 | 13460286 | 32 | 32 | 32 | 118
33 | 13256067 | 38 | 35 | 35 | 119
34 | 13428827 | 37 | 39 | 38 | 124
35 | 13220037 | 40 | 41 | 41 | 126
36 | 12870265 | 30 | 44 | 44 | 126
37 | 12689592 | 43 | 47 | 47 | 127
38 | 12157014 | 42 | 49 | 49 | 125
39 | 12494081 | 46 | 55 | 54 | 131
40 | 12272125 | 49 | 58 | 58 | 132
41 | 11826573 | 56 | 61 | 60 | 130
42 | 11663153 | 54 | 65 | 64 | 131
43 | 11407082 | 53 | 68 | 68 | 131
44 | 11296848 | 70 | 73 | 72 | 133
45 | 11016369 | 57 | 76 | 76 | 132
46 | 10651593 | 71 | 79 | 79 | 130
47 | 10475708 | 87 | 84 | 83 | 131
48 | 9994684 | 82 | 86 | 85 | 127
49 | 10138908 | 78 | 93 | 93 | 131
50 | 9836359 | 87 | 97 | 96 | 130
51 | 9475641 | 95 | 100 | 99 | 127
52 | 9250985 | 113 | 104 | 104 | 126
53 | 9027382 | 106 | 108 | 108 | 125
54 | 8883737 | 117 | 113 | 113 | 125
55 | 8547883 | 129 | 116 | 116 | 123
56 | 8279648 | 107 | 119 | 119 | 121
57 | 8062368 | 119 | 123 | 123 | 119
58 | 7654610 | 132 | 124 | 124 | 115
59 | 7563706 | 118 | 130 | 129 | 115
60 | 7232719 | 131 | 131 | 130 | 112
61 | 6927332 | 116 | 132 | 132 | 109
62 | 6708273 | 133 | 134 | 134 | 107
63 | 6543931 | 143 | 138 | 137 | 106
64 | 6404652 | 130 | 141 | 141 | 105
65 | 6168486 | 145 | 142 | 142 | 102
66 | 5913479 | 138 | 142 | 142 | 99
67 | 5746766 | 151 | 144 | 144 | 98
68 | 5480517 | 147 | 142 | 142 | 94
69 | 5363912 | 149 | 144 | 144 | 94
70 | 5110728 | 136 | 142 | 142 | 90
71 | 4925076 | 144 | 141 | 141 | 88
72 | 4696825 | 140 | 138 | 138 | 85
73 | 4512136 | 146 | 135 | 135 | 82
74 | 4345300 | 126 | 133 | 133 | 80
75 | 4148801 | 122 | 128 | 129 | 78
76 | 3900900 | 124 | 122 | 122 | 74
77 | 3681587 | 114 | 116 | 116 | 70
78 | 3481918 | 93 | 110 | 110 | 67
79 | 3243631 | 102 | 102 | 102 | 63
80 | 2961234 | 79 | 92 | 93 | 58
81 | 2724984 | 93 | 83 | 84 | 54
82 | 2495219 | 82 | 75 | 75 | 50
83 | 2271595 | 92 | 66 | 67 | 46
84 | 2041351 | 64 | 57 | 58 | 42
85 | 10466605 | 304 | 282 | 285 | 216

Note 1: for age 0 and ages 1–10 years, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ ($j \ge 1$). To derive the probability distribution of $Y_j$ ($j \ge 1$) in the $j$th age group, let $Y_{ij}$ ($i = 0, 1, 2$) be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is
\[
Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}
\]

Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that for $j > 0$ cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all ($i = 0, 1, 2$; $j = 0, 1, \ldots, n_T$), where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k})\, n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$
\[
P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j}\ \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\},
\tag{34}
\]
where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions, with the mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
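The mixture in (34) can be evaluated directly when $n_j$ is small. The sketch below is an illustration only: the function names, the toy population size, and the values of $p_0$, $p_1$, and $Q_i(j)$ are assumptions, not quantities from the paper. For SEER-sized $n_j$ the double sum is impractical, which is one motivation for the data augmentation described in Section 4.3.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability of y; a zero mean puts all mass at y = 0."""
    if mean == 0:
        return 1.0 if y == 0 else 0.0
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def multinomial_pmf(n0, n1, n, p0, p1):
    """Probability of (n0, n1, n - n0 - n1) under Multinomial(n; p0, p1, 1 - p0 - p1)."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    log_coef = (math.lgamma(n + 1) - math.lgamma(n0 + 1)
                - math.lgamma(n1 + 1) - math.lgamma(n2 + 1))
    return math.exp(log_coef) * p0**n0 * p1**n1 * p2**n2

def mixture_pmf(y, n, p0, p1, q, k):
    """P(y_j | n_j) as in (34); q = (Q_0(j), Q_1(j), Q_2(j)), k = number of stages."""
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            mean = n0 * q[0] + n1 * q[1]
            if k != 2:  # J_2 people contribute after birth only when k > 2
                mean += n2 * q[2]
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, mean)
    return total

# Toy values (hypothetical): 12 people at risk, exaggerated genotype frequencies.
q = (0.01, 0.1, 0.5)
probs = [mixture_pmf(y, 12, 0.7, 0.2, q, 3) for y in range(60)]
mean_y = sum(y * p for y, p in enumerate(probs))
```

In this toy setting the mixture mean equals $n_j \sum_i p_i Q_i(j)$, as expected from taking expectations of the Poisson mean over the multinomial.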
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on the data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is
\[
L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j).
\tag{35}
\]

Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k-1$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j},\ j = 1, \ldots, t_N\}$ and to derive the joint probability distribution of these variables. For these purposes, observe that for $j = 1, \ldots, t_N$ and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $\{Y_{ij}, i = 0, 1\}$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is
\[
\{y_{ij}, i = 0, 1\} \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\}.
\tag{36}
\]

Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $\{Y_{ij}, i = 0, 1, Y_j\}$ given $(n_{ij}, i = 0, 1, n_j)$
\[
\begin{aligned}
P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}
&= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\}\\
&= \frac{1}{y_j!}\, e^{-Q_T(j)}\, Q_T(j)^{y_j} \times g\left\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\}\\
&= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2),
\end{aligned}
\tag{37}
\]
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
If $k = 2$, then $Y_{2j} = 0$, so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have, for $k = 2$,
\[
Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\},
\tag{38}
\]
\[
Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1.
\tag{39}
\]

It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$, and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is
\[
\begin{aligned}
P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\}
&= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\}\\
&= \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \quad \text{if } k = 2,
\end{aligned}
\tag{40}
\]
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
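The last equalities in (37) and (40) rest on a standard identity: a Poisson total split multinomially in proportion to the component means has the same joint distribution as independent Poisson components. A minimal numerical check is sketched below; the function names and the numerical means are hypothetical stand-ins for $n_{ij} Q_i(j)$.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability of y for a strictly positive mean."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def multinomial_pmf(counts, probs):
    """Multinomial probability of counts with cell probabilities probs (all > 0)."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)

def joint_via_total(counts, means):
    """Poisson density of the total times the multinomial split, as in the middle line of (37)."""
    total_mean = sum(means)
    probs = [m / total_mean for m in means]
    return poisson_pmf(sum(counts), total_mean) * multinomial_pmf(counts, probs)

def joint_via_product(counts, means):
    """Product of independent Poisson densities, as in the last line of (37)."""
    out = 1.0
    for c, m in zip(counts, means):
        out *= poisson_pmf(c, m)
    return out

# Hypothetical component means n_ij * Q_i(j) for i = 0, 1, 2 and observed splits y_ij.
means = (2.5, 0.4, 1.1)
counts = (3, 1, 2)
lhs = joint_via_total(counts, means)
rhs = joint_via_product(counts, means)
```

The two values agree up to floating-point error for any nonnegative integer counts and positive means.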
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\tilde{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\tilde{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \tilde{y})$ given $(\mathbf{N}, \tilde{n})$
\[
P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left[h\{y_{2j}; n_{2j} Q_2(j)\}\right]^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}.
\tag{41}
\]

It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \tilde{y}\}$ given $(\tilde{n}, \Theta)$ is
\[
P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \left[h\{y_{2j}; n_{2j} Q_2(j)\}\right]^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}.
\tag{42}
\]
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta] - \log P[\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \hat{\Theta}]\}$ is
\[
\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j,
\tag{43}
\]
where
\[
D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log\frac{\chi(k)}{y_0}\right\},
\tag{44}
\]
\[
\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{ n_{0j} \log\frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log\frac{\hat{p}_i}{p_i} \right\},
\tag{45}
\]
\[
D_j = 2 \sum_{i=0}^{1} \left\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log\frac{n_{ij} Q_i(j)}{y_{ij}} \right\} + 2(1 - \delta_{2k}) \left\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log\frac{n_{2j} Q_2(j)}{y_{2j}} \right\},
\tag{46}
\]
where $\hat{p}_i = \bigl(\sum_{j=0}^{t_N} n_{ij}\bigr) / \bigl(\sum_{j=0}^{t_N} n_j\bigr)$ ($i = 0, 1, 2$).

The joint probability density function $P\{\mathbf{Y}, \tilde{y}, \mathbf{N} \mid \tilde{n}, \Theta\}$ of $(\mathbf{Y}, \tilde{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
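The multinomial part of the deviance, (45), can be sketched as follows. The function names and the toy counts are hypothetical. As a simplifying assumption, the pooled proportions $\hat{p}_i$ are computed here from the same age groups that enter the sum in (45), so that the deviance vanishes at $p_i = \hat{p}_i$.

```python
import math

def p_hat(n_counts):
    """Pooled genotype proportions (p0_hat, p1_hat, p2_hat).

    n_counts is a list of (n_0j, n_1j, n_2j) tuples, one per age group.
    """
    totals = [sum(row[i] for row in n_counts) for i in range(3)]
    grand_total = sum(totals)
    return tuple(t / grand_total for t in totals)

def dev_p(n_counts, p1, p2):
    """Dev(p1, p2) as in (45), with p0 = 1 - p1 - p2."""
    p = (1.0 - p1 - p2, p1, p2)
    ph = p_hat(n_counts)
    dev = 0.0
    for row in n_counts:
        for i in range(3):
            if row[i] > 0:  # a zero count contributes nothing
                dev += 2.0 * row[i] * math.log(ph[i] / p[i])
    return dev

# Toy augmented counts (hypothetical), two age groups.
rows = [(90, 8, 2), (180, 15, 5)]
ph = p_hat(rows)
dev_at_mle = dev_p(rows, ph[1], ph[2])
dev_elsewhere = dev_p(rows, 0.05, 0.01)
```

Under this assumption the deviance is nonnegative, since it is twice the total count times a Kullback-Leibler divergence between $\hat{p}$ and $p$.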
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval, such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
\[
Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\}
= e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}),
\tag{47}
\]
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.

Under discrete time approximation, the $E[I_{k-1}(t; i)]$'s have been derived in the appendix. Using these results on expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
\[
\begin{aligned}
\beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \Bigl(\prod_{u=r}^{k-1} \beta_u\Bigr) \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0}\\
&= \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t),
\end{aligned}
\tag{48}
\]
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ ($i = 0, 1, 2, 3$)'s are given by
\[
\phi_{i(k-1)}(t) = \Bigl(\prod_{u=i+1}^{k-1} \gamma_u\Bigr) \sum_{v=i}^{k-1} A_{i(k-1)}(v)\, \frac{1}{\gamma_v} \left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3.
\tag{49}
\]

Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
\[
\phi_{0(k-1)}(t) = \Bigl(\prod_{u=1}^{k-1} \gamma_u\Bigr) \sum_{v=1}^{k-1} A_{1(k-1)}(v)\, \frac{1}{\gamma_v^{2}} \left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\right\}.
\tag{50}
\]

Applying these results to time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$; $j > 0$), and for $j > 0$,
\[
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1),\\
Q_1(j) &\approx (1 - \alpha_1)\left\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\right\} + o(\beta_1).
\end{aligned}
\tag{51}
\]

(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,
\[
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}),\\
Q_1(j) &\approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}),\\
Q_2(j) &\approx \delta_{2(k-1)}(1 - \alpha)\left\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\right\}\\
&\quad + (1 - \delta_{2(k-1)})\left\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}).
\end{aligned}
\tag{52}
\]
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
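The geometric sum used in (48)-(49) and the closeness of the discrete and continuous growth factors can be checked numerically. The following sketch uses hypothetical function names and an illustrative net growth rate; it is not tied to the fitted estimates.

```python
import math

def geometric_sum(gamma, steps):
    """Direct sum of (1 + gamma)^(s - t0) for s = t0, ..., t - 1, with steps = t - t0."""
    return sum((1.0 + gamma) ** s for s in range(steps))

def discrete_closed_form(gamma, steps):
    """Closed form {(1 + gamma)^(t - t0) - 1} / gamma appearing in (49)."""
    return ((1.0 + gamma) ** steps - 1.0) / gamma

def continuous_form(gamma, steps):
    """Continuous-time analogue {exp(gamma (t - t0)) - 1} / gamma appearing in (29)."""
    return (math.exp(gamma * steps) - 1.0) / gamma

gamma, steps = 1e-3, 100  # illustrative values only
direct = geometric_sum(gamma, steps)
closed = discrete_closed_form(gamma, steps)
cont = continuous_form(gamma, steps)
rel_gap = abs(closed - cont) / cont
```

For small $\gamma_v$ the relative gap between the discrete and continuous forms is small, because $\log(1 + \gamma_v) = \gamma_v + O(\gamma_v^2)$.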
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail see Tan [3, 22] and Tan et al. [4, 5].

The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experience about the parameters, in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \tilde{n}, p_i, i = 1, 2\}$; see Section 2); and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of the genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distribution of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
\[
P(\Theta) \propto c \quad (c > 0),
\tag{53}
\]
where $c$ is a positive constant if the parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows.

(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k-1$), we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.

We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
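A partially informative prior of this kind is flat inside a box and zero outside, so its logarithm is a constant inside and $-\infty$ outside. A minimal sketch is given below for a 3-stage model. The parameter names, the dictionary layout, and the reading of the lower bound on $\gamma_i$ as $-0.01$ are assumptions made for illustration; the bounds follow constraints (i)-(iv) as read from the text.

```python
import math

# Box constraints as read from (i)-(iv); names are hypothetical labels.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "omega0": (1.0, 1000.0),
    "beta1": (1e-8, 1e-3),
    "beta2": (1e-8, 1e-3),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}

def log_prior(params):
    """Log of the flat prior: 0.0 (log of a constant) inside the box, -inf outside."""
    for name, (low, high) in BOUNDS.items():
        if not (low < params[name] < high):
            return -math.inf
    return 0.0

# An arbitrary point inside the box (not the fitted estimates).
inside = {
    "p1": 1e-3, "p2": 5e-7, "gamma1": 1e-5, "gamma2": 1e-2,
    "omega0": 10.0, "beta1": 1e-6, "beta2": 1e-5,
    "lambda0": 0.1, "lambda1": 0.1, "lambda2": 500.0,
    "theta1": 1e-3, "theta2": 1e-2,
}
outside = dict(inside, p2=1e-5)  # violates the upper bound on p2
```

With such a prior, maximizing the posterior is the same as maximizing the likelihood subject to the box constraints, which is how Steps 3 and 4 of Section 5.3 use it.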
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$, we obtain
\[
\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}
&\propto (\chi(k))^{y_0}\, e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1,\\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)}\, Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned}
\tag{54}
\]
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that, numerically, the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-Step and Steps 3 and 4 as the M-Step, respectively [29]. The multilevel Gibbs sampling procedure is as follows.

Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then, by combining this large sample with $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$ in (37) and (40), select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \tilde{y}, \tilde{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.

Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\tilde{y}, \tilde{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.

Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.

Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $(\tilde{y}, \tilde{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.

Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.

The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$, independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedure, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
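The weighted bootstrap selection in Step 1 can be illustrated on a single age group. The sketch below is a toy version under stated assumptions, not the authors' implementation: it uses only two genotype classes, a small population, hypothetical values for the carrier frequency and the per-person cancer probabilities, and hypothetical function names. Candidates for $n_{1j}$ are proposed from their binomial distribution and one is resampled with weight proportional to the Poisson likelihood of the observed count.

```python
import math
import random

def poisson_pmf(y, mean):
    """Poisson probability of y for a strictly positive mean."""
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

def draw_binomial(n, p, rng):
    """Binomial draw as a sum of Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def weighted_bootstrap(candidates, weights, rng):
    """Resample one candidate with probability proportional to its weight."""
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy setting (all numbers hypothetical): n_j people, carrier frequency p1,
# cancer probabilities q0 (non-carriers) and q1 (carriers), y_j observed cases.
rng = random.Random(0)
n_j, p1, q0, q1, y_j = 50, 0.2, 0.01, 0.3, 6

# Proposal: a large sample of n_1j from its distribution given n_j.
candidates = [draw_binomial(n_j, p1, rng) for _ in range(2000)]
# Weights: likelihood of the observed count given each proposed n_1j.
weights = [poisson_pmf(y_j, (n_j - n1) * q0 + n1 * q1) for n1 in candidates]
# Selection: the resampled value is an approximate draw from P(n_1j | y_j, n_j).
n1_selected = weighted_bootstrap(candidates, weights, rng)
```

The approximation improves as the number of proposed candidates grows; in the full procedure the same idea is applied jointly over all age groups, with the weights given by (37) and (40).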
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
Human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells) which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.

Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for ages 1 to 10 years in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).

To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following 2- and 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Model | Log-likelihood | AIC | BIC
Three-stage, Model-F | -1312.02 | 2648.04 | 2677.35
Three-stage, Model-1 | -1295.76 | 2609.53 | 2631.51
Two-stage | -4392.421 | 8796.843 | 8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Plot of incidences against age in years; series: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihoodfunctions the AIC (Akaike Information Criterion) and theBIC (Bayesian Information Criterion) for these modelsGiven in Table 3 are the estimates of parameters in the 3-stagemodels Given in Figure 5 are the plots of predicted cancercases from the 3-stage mixture models (Model-F and Model-1) and the 2-stagemodel For comparison purposes in Table 1we also provide numbers of predicted cancer cases from the3-stage mixture models and the 2-stage model together withthe observed cancer cases over time from SEER From theseresults we have made the following observations
(a) As shown by results in Table 1 and Figure 5 it appear-ed that both Model-F and Model-1 fitted the SEERdata well although Model-1 fitted the data slightlybetter from values of AIC and BIC The Chi-squaretest statistics 1205942 = sum
85
119894=0((119910
119894minus 119910
119894)2119910
119894) for Model-F
and Model-1 are given by 8843 and 9448 respec-tively giving a 119875-value of 012 (df = 86 minus 12 = 74)for Model-F and a 119875 value of 011 (df = 86 minus 7 = 79)for Model-1 On the other hand the 2-stage modelfitted the date very poorly the Chi-statistic value forthe 2-stage model is 274769 giving a 119875-value less than10
minus3The AIC (Akaike Information Criteria) and BIC(Bayesian Information Criteria) values ofModel-1 aregiven by (AIC = 260953 BIC = 263151) which areslightly smaller than those of Model-F respectivelyhowever the AIC and BIC values (879684 881157)of the two-stage model are considerably greater thanthose of the 3-stagemodels respectivelyThese resultssuggest that uvealmelanomamaybest be described bya 3-stage model with inherited component and that
one may practically assume 1205741= 0 and that normal
people and 1198691people at the embryo stage will remain
normal people and 1198691people respectively at birth
(b) From Table 3 it is observed that the estimate of 1205741is
close to zero (the estimate is of order 10minus5) indicatingthat the phenotype of 119869
1is almost identical to that
of 119873 = 1198690further confirming that the staging-
limiting genes are basically tumor suppressor genesand that there is no haploinsufficiency for these tumorsuppressor genes On the other hand the estimate of1205742is of order 10minus2 which is about 103 times greater
than those of cells with genotype 1198691
(c) From Table 3, the estimates $\hat{\lambda}_j$ $(j = 0, 1, 2)$ of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0, i)]\, \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from biological observations, one can get a rough idea of the magnitude of $\beta_j$ $(j = 0, 1, 2)$. For example, if we follow Potten et al. [32] and assume $E[N(t_0)] = E[J_i(t_0, i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
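The Chi-square statistic quoted in (a) can be illustrated with a short Python sketch. This is only a minimal illustration, not the authors' code: the function name `chi_square` is an assumption introduced here, and the inputs are just the first five age groups (ages 0 to 4) of Table 1 with the Model-F predictions, so the printed value is a partial sum rather than the full statistic over all 86 age groups.

```python
def chi_square(observed, predicted):
    """Sum over age groups of (y_i - yhat_i)^2 / yhat_i."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return sum((y - y_hat) ** 2 / y_hat for y, y_hat in zip(observed, predicted))


# Ages 0-4 of Table 1: observed incidence and Model-F predicted incidence.
observed = [34, 20, 14, 9, 4]
model_f = [36, 11, 7, 5, 4]

# Partial statistic over five age groups only (illustrative subset).
partial_stat = chi_square(observed, model_f)
print(f"partial chi-square over 5 age groups: {partial_stat:.4f}")
```

Applied to all 86 rows of Table 1, the same function would give the full statistic, which the text compares with a Chi-square distribution whose degrees of freedom equal the number of age groups minus the number of fitted parameters.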
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution that accounts for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To fit the cancer incidence data with the proposed models, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure has an advantage over classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines
Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter      Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$    2.88E-01    1.21E-01    5.03E-02       5.25E-01
$\lambda_1$    4.81E-01    4.65E-02    3.89E-01       5.72E-01
$\lambda_2$    6.55E+02    1.12E+02    4.34E+02       8.75E+02
$\gamma_1$     2.71E-05    8.86E-06    5.34E-06       4.01E-05
$\gamma_2$     3.37E-02    1.92E-04    3.33E-02       3.41E-02
$p_1$          9.95E-04    1.75E-06    9.91E-04       9.98E-04
$p_2$          9.68E-07    6.48E-09    9.55E-07       9.81E-07
$\alpha$       8.41E-01    4.19E-02    7.59E-01       9.23E-01
$\theta_1$     3.29E-04    3.54E-05    2.60E-04       3.98E-04
$\theta_2$     3.41E-01    1.07E-02    3.20E-01       3.62E-01

Model-1
Parameter      Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$    6.55E-02    2.54E-02    1.57E-02       1.15E-01
$\lambda_1$    3.72E-01    4.63E-02    2.82E-01       4.63E-01
$\lambda_2$    1.98E+02    3.38E+01    1.32E+02       2.65E+02
$\gamma_1$     NA          NA          NA             NA
$\gamma_2$     3.49E-02    3.58E-03    2.79E-02       4.19E-02
$p_1$          9.98E-04    6.27E-05    8.75E-04       1.12E-03
$p_2$          1.00E-06    4.53E-08    9.11E-07       1.09E-06
$\alpha$       8.07E-01    7.06E-02    6.68E-01       9.45E-01
$\theta_1$     NA          NA          NA             NA
$\theta_2$     NA          NA          NA             NA

Note: NA: assumed nonexistence.
information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\boldsymbol{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\boldsymbol{Y}, \underset{\sim}{y} \mid \boldsymbol{N}, \Theta\}$) given by (37) and (40).
To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.
Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $\hat{y}_0 = n_0 \hat{\alpha} \hat{p}_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploid insufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations of the state variables, respectively:

$$
\begin{aligned}
J_i(t+1, i) &= J_i(t, i)\,[1 + \gamma_i(t)] + e_i(t+1, i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t+1, u) &= J_j(t, u)\,[1 + \gamma_j(t)] + J_{j-1}(t, u)\,\beta_{j-1}(t) + e_j(t+1, u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$
where $e_i(t+1, i) = [B_i(t, i) - J_i(t, i) b_i(t)] - [D_i(t, i) - J_i(t, i) d_i(t)]$ and $e_j(t+1, i) = [B_j(t, i) - J_j(t, i) b_j(t)] - [D_j(t, i) - J_j(t, i) d_j(t)] + [M_{j-1}(t, i) - J_{j-1}(t, i) \beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0, i) > 0$ if $j = i, i+1$ and $J_j(t_0, i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$
\begin{aligned}
J_i(t, i) &= J_i(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s, i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t, i) &= J_j(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s, i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s, i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ $(j = i, \ldots, k-1)$, and if $\gamma_i \ne \gamma_j$ if $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0, i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t, i) &= J_i(t_0, i)\,(1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t, i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t, i) &= J_j(t_0, i)\,(1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0, i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\} \eta_j^{(u)}(t, i) \\
&= \sum_{u=i}^{i+1} J_u(t_0, i) \Big\{ \prod_{v=u}^{j-1} \beta_v \Big\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \Big\{ \prod_{r=j-u}^{j-1} \beta_r \Big\} \eta_j^{(u)}(t, i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t, i) = \sum_{s=t_0+1}^{t} e_j(s, i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t, i) = \sum_{s=t_0+1}^{t} e_{j-u}(s, i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ $(u = 1, \ldots, j-i)$.
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ if $i \ne j$, the $E[J_{k-1}(t, i)]$'s in discrete time models under the initial conditions ($J_j(t_0, i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t, i)] = \sum_{u=i}^{i+1} E[J_u(t_0, i)] \Big( \prod_{v=u}^{k-2} \beta_v \Big) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
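The step from the difference equations to the closed-form expected numbers can be checked numerically in the smallest case. The Python sketch below is only an illustration under stated assumptions: it takes a two-compartment chain ($J_0 \to J_1$) with time-homogeneous rates, assumes the noise terms $e_j$ have mean zero (as (A4) implies), and uses hypothetical parameter values. The coefficients $1/(\gamma_0 - \gamma_1)$ and $1/(\gamma_1 - \gamma_0)$ are derived here directly from the geometric sum; they play the role of the $A_{uj}(r)$, whose definition lies outside this appendix.

```python
def iterate_expected(j0, j1, gamma0, gamma1, beta0, steps):
    """Iterate the expected-value form of the difference equations, one time unit per step."""
    for _ in range(steps):
        # Simultaneous update: the right-hand side uses the values at time t.
        j0, j1 = j0 * (1.0 + gamma0), j1 * (1.0 + gamma1) + j0 * beta0
    return j0, j1


def closed_form(j0, j1, gamma0, gamma1, beta0, steps):
    """Closed form for time-homogeneous rates, valid when gamma0 != gamma1."""
    g0 = (1.0 + gamma0) ** steps
    g1 = (1.0 + gamma1) ** steps
    return j0 * g0, j1 * g1 + j0 * beta0 * (g0 - g1) / (gamma0 - gamma1)


# Hypothetical values chosen only for illustration (not estimates from Table 3).
params = dict(j0=1.0e8, j1=0.0, gamma0=0.0, gamma1=0.03, beta0=1.0e-6, steps=50)

iterated = iterate_expected(**params)
closed = closed_form(**params)
print(iterated, closed)
```

The two results agree up to floating-point rounding, which is the two-compartment analogue of passing from the difference equations to (A4).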
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
(2) If $k \ge 3$, then we have $Q_i(0) = 0$ for $i = 0, 1$ and $Q_2(0) = \delta_{2k}\,\alpha$, and for $j > 0$,

$$
Q_0(j) = e^{-\theta_{1(k-1)} \psi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \psi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \psi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \psi_{0(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{25}
$$

$$
Q_1(j) = e^{-\theta_{2(k-1)} \psi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \psi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \psi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \psi_{1(k-1)}(t_j)} + o(\beta_{k-1}),
\tag{26}
$$

$$
\begin{aligned}
Q_2(j) &= \delta_{2k}\,(1 - \alpha) \left\{ e^{-\lambda_{22} \psi_{22}(t_{j-1})} - e^{-\lambda_{22} \psi_{22}(t_j)} \right\} \\
&\quad + (1 - \delta_{2k}) \left\{ e^{-\theta_{3(k-1)} \psi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \psi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \psi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \psi_{2(k-1)}(t_j)} \right\} \\
&\quad + o(\beta_{k-1}),
\end{aligned}
\tag{27}
$$

where $\delta_{2k} = 1$ if $k = 2$ and $= 0$ if $k \ne 2$.
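Each $Q_i(j)$ above has the form "survival to $t_{j-1}$ minus survival to $t_j$". The Python sketch below illustrates only that structure and is not the authors' implementation: it keeps a single hazard term $\lambda \psi(t)$ (dropping the $\theta$ terms and the $o(\beta_{k-1})$ remainder), uses the simplified $\psi_{ii}(t)$ of (29) with $t_0 = 0$, and takes hypothetical values for $\lambda$ and $\gamma$. The function names are assumptions introduced for this example.

```python
import math


def psi(t, gamma, t0=0.0):
    """Simplified psi_ii(t) = (exp(gamma * (t - t0)) - 1) / gamma (the P_T ~ 1 case)."""
    return math.expm1(gamma * (t - t0)) / gamma


def interval_probabilities(lam, gamma, times):
    """Return S(t_{j-1}) - S(t_j) for consecutive times, with S(t) = exp(-lam * psi(t))."""
    survival = [math.exp(-lam * psi(t, gamma)) for t in times]
    return [survival[j - 1] - survival[j] for j in range(1, len(times))]


# Hypothetical values, not estimates from the paper.
lam, gamma = 1.0e-6, 0.1
times = list(range(0, 86))  # t_0 = 0 followed by yearly age groups up to 85

q = interval_probabilities(lam, gamma, times)
total = sum(q)  # telescopes to 1 - S(t_85)
print(total)
```

Because the terms telescope, the interval probabilities sum to one minus the survival probability at the last time point, which gives a quick sanity check on any implementation of (25) to (27).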
Notice that if $\gamma_0 = 0$, then $\psi_{0(k-1)}(t)$ reduces to

$$
\begin{aligned}
\psi_{0(k-1)}(t) &= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=0}^{k-1} A_{0(k-1)}(v) \times \int_{t_0}^{t} e^{\gamma_v (x - t_0)} P_T(x, t)\, dx \\
&= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \int_{t_0}^{t} \left[ e^{\gamma_v (x - t_0)} - 1 \right] P_T(x, t)\, dx.
\end{aligned}
\tag{28}
$$

Notice also that if $P_T(s, t) \approx 1$ for $t > s$ and if $\gamma_0 = 0$, then the above $\psi_{ij}(t)$'s reduce, respectively, to

$$
\begin{aligned}
\psi_{ii}(t) &= \frac{1}{\gamma_i} \left\{ e^{\gamma_i (t - t_0)} - 1 \right\}, \quad i = 1, \ldots, k-1, \\
\psi_{0(k-1)}(t) &= \Big( \prod_{u=1}^{k-1} \gamma_u \Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^{2}} \left\{ e^{\gamma_v (t - t_0)} - 1 - (t - t_0)\,\gamma_v \right\}, \\
\psi_{i(k-1)}(t) &= \Big( \prod_{u=i+1}^{k-1} \gamma_u \Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \left\{ e^{\gamma_v (t - t_0)} - 1 \right\}, \quad i = 1, \ldots, k-1.
\end{aligned}
\tag{29}
$$
4 Probability Distribution of Observed Cancer Incidence Incorporating Hereditary Cancer Cases

For estimating unknown parameters and validating the model, one needs real data generated from the model. For studies of carcinogenesis, such data are usually given by cancer incidence. For example, in the SEER data of NCI/NIH of the USA, the data are given by $\{(y_0, n_0), (y_j, n_j), j = 1, \ldots, t_N\}$, where $y_0$ is the number of cancer cases at birth and $n_0$ the total number of births, and where, for $j \ge 1$, $y_j$ is the number of cancer cases developed during the $j$th age group of a one-year period (or 5-year periods) and $n_j$ is the number of noncancer people who are at risk for cancer and from whom $y_j$ have developed cancer during the $j$th age group. Given in Table 1 are the SEER data of uveal melanoma (adult eye cancer) during the period 1973–2007. In Table 1, notice that there are some cancer cases at birth, implying some inherited cancer cases. In this section we will develop a statistical model for these types of data sets from the stochastic multistage model with hereditary cancers as given in Section 2. As in previous sections, let $n_{ij}$ be the number of individuals who have genotype $J_i$ $(i = 0, 1, 2)$ at the embryo stage among the $n_j$ people at risk for the cancer in question. Then, as shown above, $(n_{1j}, n_{2j}) \mid n_j \sim \text{Multinomial}\{n_j; p_1, p_2\}$. It follows that $n_{ij} \mid n_j \sim \text{Binomial}\{n_j, p_i\}$, $i = 0, 1, 2$. In what follows we let $Y_j$ denote the random variable for $y_j$ unless otherwise stated.
4.1 The Probability Distribution of $Y_0$. As shown in Figure 4, $J_i$ $(i = 0, 1, 2)$ people would only generate $J_i$ stage cells and $J_{i+1}$ stage cells at birth. Thus, for cancers to develop at or before birth, the number of stages for the stochastic model of carcinogenesis must be 3 or less. It follows that if $y_0 > 0$, the appropriate model of carcinogenesis must be either a 2-stage model or a 3-stage model. Since $n_{20} \mid n_0 \sim \text{Binomial}(n_0, p_2)$ and $n_{10} \mid (n_0, n_{20}) \sim \text{Binomial}(n_0, p_1) + o(p_2)$, the probability distribution of $Y_0$ is therefore

$$
Y_0 \sim \text{Poisson}(\chi_k), \quad k = 2, 3,
\tag{30}
$$

where

$$
\chi_k =
\begin{cases}
n_0\,(p_2 + p_1 \alpha), & \text{if } k = 2, \\
n_0\, p_2\, \alpha, & \text{if } k = 3.
\end{cases}
\tag{31}
$$

The expected number of $Y_0$ given $n_0$ is $E(Y_0 \mid n_0) = n_0 (p_2 + p_1 \alpha) = \chi_2$ if $k = 2$ and $E(Y_0 \mid n_0) = n_0 p_2 \alpha = \chi_3$ if $k = 3$. Hence, for the 2-stage model (i.e., $k = 2$) or the 3-stage model (i.e., $k = 3$), the maximum likelihood estimate of $\chi_k$ is $\hat{\chi}_k = y_0$, and the deviance $D_0(k)$ from the conditional probability distribution of $Y_0$ given $n_0$ is

$$
\begin{aligned}
D_0(k) &= -2 \left\{ \log h(y_0; \chi_k) - \log h(y_0; \hat{\chi}_k) \right\} \\
&= 2 \left\{ \chi_k - y_0 - y_0 \log \frac{\chi_k}{y_0} \right\}, \quad k = 2, 3.
\end{aligned}
\tag{32}
$$
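The two lines of (32) can be compared numerically. The following Python sketch is a minimal check, not part of the original analysis: it evaluates the deviance once from the Poisson log-densities and once from the closed form $2\{\chi_k - y_0 - y_0 \log(\chi_k / y_0)\}$. The helper names are assumptions introduced here; $y_0 = 34$ is the observed count at birth in Table 1, and $\chi_k = 36$ is the predicted count at birth listed there for the 3-stage models, used only as an illustrative mean.

```python
import math


def poisson_log_pmf(y, mean):
    """log h(y; mean) for a Poisson distribution with the given mean."""
    return -mean + y * math.log(mean) - math.lgamma(y + 1)


def deviance(y0, chi_k):
    """Closed form 2 * {chi_k - y0 - y0 * log(chi_k / y0)}, valid for y0 > 0."""
    return 2.0 * (chi_k - y0 - y0 * math.log(chi_k / y0))


y0 = 34        # observed cases at birth (Table 1, age group 0)
chi_k = 36.0   # illustrative mean (predicted cases at birth in Table 1)

# Deviance from the definition: -2 * {log h(y0; chi_k) - log h(y0; chi_hat)}, chi_hat = y0.
from_definition = -2.0 * (poisson_log_pmf(y0, chi_k) - poisson_log_pmf(y0, float(y0)))
print(from_definition, deviance(y0, chi_k))
```

The $\log y_0!$ terms cancel in the difference, which is why the closed form involves only $\chi_k$ and $y_0$; the deviance is zero when $\chi_k = y_0$ and positive otherwise.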
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group   Number of people at risk   Observed incidence   Model-F predicted   Model-1 predicted   Two-stage predicted
0           12495777                   34                   36                  36                  38
1           12221582                   20                   11                  11                  16
2           12120990                   14                   7                   8                   17
3           12112995                   9                    5                   6                   18
4           12146174                   4                    4                   5                   20
5           12161336                   3                    3                   4                   22
6           12111854                   2                    2                   4                   25
7           12160452                   2                    2                   4                   28
8           11942586                   2                    2                   4                   30
9           12381299                   1                    2                   5                   34
10          12512703                   2                    3                   6                   38
11          12410338                   5                    3                   6                   41
12          12449244                   2                    3                   6                   44
13          12527781                   5                    4                   7                   48
14          12602883                   3                    5                   7                   51
15          12719598                   5                    5                   8                   55
16          12766107                   7                    6                   9                   59
17          12831400                   9                    7                   9                   62
18          12382047                   8                    7                   9                   63
19          12581638                   10                   8                   10                  68
20          12636509                   7                    9                   11                  71
21          12682601                   6                    10                  10                  75
22          12840510                   12                   11                  11                  79
23          13075528                   17                   13                  13                  84
24          13358635                   16                   15                  14                  89
25          13473849                   12                   16                  16                  94
26          13426340                   17                   18                  18                  97
27          13525264                   28                   20                  20                  101
28          13149674                   17                   22                  21                  102
29          13812811                   23                   25                  25                  110
30          13886874                   24                   28                  27                  114
31          13488332                   37                   30                  29                  115
32          13460286                   32                   32                  32                  118
33          13256067                   38                   35                  35                  119
34          13428827                   37                   39                  38                  124
35          13220037                   40                   41                  41                  126
36          12870265                   30                   44                  44                  126
37          12689592                   43                   47                  47                  127
38          12157014                   42                   49                  49                  125
39          12494081                   46                   55                  54                  131
40          12272125                   49                   58                  58                  132
41          11826573                   56                   61                  60                  130
42          11663153                   54                   65                  64                  131
43          11407082                   53                   68                  68                  131
44          11296848                   70                   73                  72                  133
45          11016369                   57                   76                  76                  132
46          10651593                   71                   79                  79                  130
47          10475708                   87                   84                  83                  131
48          9994684                    82                   86                  85                  127
49          10138908                   78                   93                  93                  131
50          9836359                    87                   97                  96                  130
51          9475641                    95                   100                 99                  127
52          9250985                    113                  104                 104                 126
53          9027382                    106                  108                 108                 125
54          8883737                    117                  113                 113                 125
55          8547883                    129                  116                 116                 123
56          8279648                    107                  119                 119                 121
57          8062368                    119                  123                 123                 119
58          7654610                    132                  124                 124                 115
59          7563706                    118                  130                 129                 115
60          7232719                    131                  131                 130                 112
61          6927332                    116                  132                 132                 109
62          6708273                    133                  134                 134                 107
63          6543931                    143                  138                 137                 106
64          6404652                    130                  141                 141                 105
65          6168486                    145                  142                 142                 102
66          5913479                    138                  142                 142                 99
67          5746766                    151                  144                 144                 98
68          5480517                    147                  142                 142                 94
69          5363912                    149                  144                 144                 94
70          5110728                    136                  142                 142                 90
71          4925076                    144                  141                 141                 88
72          4696825                    140                  138                 138                 85
73          4512136                    146                  135                 135                 82
74          4345300                    126                  133                 133                 80
75          4148801                    122                  128                 129                 78
76          3900900                    124                  122                 122                 74
77          3681587                    114                  116                 116                 70
78          3481918                    93                   110                 110                 67
79          3243631                    102                  102                 102                 63
80          2961234                    79                   92                  93                  58
81          2724984                    93                   83                  84                  54
82          2495219                    82                   75                  75                  50
83          2271595                    92                   66                  67                  46
84          2041351                    64                   57                  58                  42
85          10466605                   304                  282                 285                 216

Note 1: for age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting retinoblastoma incidence as given in Tan and Zhou [9].
4.2 The Probability Distribution of $Y_j$ $(j \ge 1)$. To derive the probability distribution of $Y_j$ $(j \ge 1)$ in the $j$th age group, let $Y_{ij}$ $(i = 0, 1, 2)$ be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$
Y_{ij} \mid n_{ij} \sim \text{Poisson}\{ n_{ij} Q_i(j) \}, \quad i = 0, 1, 2.
\tag{33}
$$

Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumor at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all $(i = 0, 1, 2,\ j = 0, 1, \ldots, n_T)$, where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}(Q_T(j))$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k})\, n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$

$$
P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{ y_j; Q_T(j) \},
\tag{34}
$$

where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{ y_j; Q_T(j) \}$ the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{ Q_T(j) \}$.
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters $(p_1, p_2, \alpha)$ and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data $(y_j, j = 0, 1, \ldots, n_T)$, the likelihood function of $\Theta$ is

$$
L\{ \Theta \mid y_j, j = 0, 1, \ldots, n_T \} = h(y_0; \chi_k) \prod_{j=1}^{t_N} P(y_j \mid n_j).
\tag{35}
$$

Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k-1$ (see Tan et al. [4, 5, 8]).
43 The Joint Probability Distribution of Augmented Variablesand Cancer Incidence For applying the mixture distribution
of 119884119895in (34) to make inference about the unknown parame-
ters one needs to expand the model to include the unobserv-able augmented variables (119899
0119895 119899
1119895 119910
0119895 119910
1119895 119895 = 1 119905
119873) and
derives the joint probability distribution of these variablesFor these purposes observe that for (119895 = 1 119905
119873) and for
119896 gt 2 the conditional probability distribution119875119910119894119895 119894 = 0 1 |
119910119895 119899
119894119895 119894 = 0 1 119899
119895 of (119884
119894119895 119894 = 0 1) given (119884
119895= 119910
119895 119899
119894119895 119894 =
0 1 119899119895) is
(119910119894119895 119894 = 0 1) | (119910
119895 119899
119894119895 119894 = 0 1 119899
119895)
sim Multinomial(119910119895119899119894119895119876119894(119895)
119876119879(119895)
119894 = 0 1)
(36)
Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$:
$$\begin{aligned}
P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}
&= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\
&= \frac{1}{y_j!}\, e^{-Q_T(j)} [Q_T(j)]^{y_j} \times g\left\{y_{0j}, y_{1j};\ y_j,\ \frac{n_{ij} Q_i(j)}{Q_T(j)},\ i = 0, 1\right\} \\
&= \prod_{i=0}^{2} h\{y_{ij};\ n_{ij} Q_i(j)\} \quad (k > 2),
\end{aligned} \tag{37}$$
where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
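The factorization in (37) can be checked numerically. The sketch below is only an illustration under stated assumptions: the three cell means stand in for $n_{ij} Q_i(j)$ and are hypothetical numbers, $Q_T(j)$ is taken to be their sum, and $h$ and $g$ are taken to be the Poisson and multinomial probability functions.

```python
import math

def poisson_pmf(y, mean):
    """Poisson probability h(y; mean)."""
    return math.exp(-mean) * mean ** y / math.factorial(y)

def multinomial_pmf(counts, probs):
    """Multinomial probability of `counts` with cell probabilities `probs`."""
    coef = math.factorial(sum(counts))
    value = 1.0
    for c, p in zip(counts, probs):
        coef //= math.factorial(c)   # stays an exact integer at every step
        value *= p ** c
    return coef * value

def joint_via_total(y_cells, means):
    """Poisson total times multinomial split (middle line of (37))."""
    total_mean = sum(means)          # assumed to play the role of Q_T(j)
    probs = [m / total_mean for m in means]
    return poisson_pmf(sum(y_cells), total_mean) * multinomial_pmf(y_cells, probs)

def joint_via_product(y_cells, means):
    """Product of independent Poisson terms (last line of (37))."""
    return math.prod(poisson_pmf(y, m) for y, m in zip(y_cells, means))

# Hypothetical values of n_ij * Q_i(j), i = 0, 1, 2, for one age group j.
means = [3.2, 0.7, 0.1]
y_cells = [4, 1, 0]
lhs = joint_via_total(y_cells, means)
rhs = joint_via_product(y_cells, means)
print(lhs, rhs)
```

The two printed numbers agree up to floating-point rounding, which is the content of the last equality in (37).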
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have for $k = 2$:
$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$
$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left(y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right), \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is
$$\begin{aligned}
P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\}
&= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\
&= \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}, \quad \text{if } k = 2,
\end{aligned} \tag{40}$$
where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $Y = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $N = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\tilde{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\tilde{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(Y, \tilde{y})$ given $(N, \tilde{n})$:
$$P\{Y, \tilde{y} \mid N, \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{N, Y, \tilde{y}\}$ given $(\tilde{n}, \Theta)$ is
$$P\{N, Y, \tilde{y} \mid \tilde{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j};\ n_j, p_0, p_1) \times \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[Y, \tilde{y}, N \mid \tilde{n}, \Theta] - \log P[Y, \tilde{y}, N \mid \tilde{n}, \hat{\Theta}]\}$ is
$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where
$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log\frac{\chi(k)}{y_0}\right\}, \tag{44}$$
$$\text{Dev}(p_1, p_2) = 2\sum_{j=1}^{t_N}\left\{n_{0j} \log\frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log\frac{\hat{p}_i}{p_i}\right\}, \tag{45}$$
$$D_j = 2\sum_{i=0}^{1}\left\{n_{ij} Q_i(j) - y_{ij} - y_{ij} \log\frac{n_{ij} Q_i(j)}{y_{ij}}\right\} + 2(1 - \delta_{2k}) \times \left\{n_{2j} Q_2(j) - y_{2j} - y_{2j} \log\frac{n_{2j} Q_2(j)}{y_{2j}}\right\}, \tag{46}$$
where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ $(i = 0, 1, 2)$.
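Each summand in (44) and (46) has the form of a Poisson deviance term $2\{\mu - y - y\log(\mu/y)\}$. A minimal sketch of how such a term might be evaluated is given below; the function names and the input numbers are hypothetical, and the value at $y = 0$ is taken as the limit $2\mu$.

```python
import math

def poisson_deviance_term(y, mu):
    """Return 2 * {mu - y - y * log(mu / y)}; the y = 0 limit is 2 * mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    if y == 0:
        return 2.0 * mu
    return 2.0 * (mu - y - y * math.log(mu / y))

def d_j(y_cells, n_cells, q_cells):
    """Sum of deviance terms over the genotype cells used for age group j.

    Pass two cells for k = 2 and three cells for k > 2, mirroring the
    factor (1 - delta_{2k}) in (46).
    """
    return sum(
        poisson_deviance_term(y, n * q)
        for y, n, q in zip(y_cells, n_cells, q_cells)
    )

# Hypothetical inputs: counts y_ij, numbers at risk n_ij, probabilities Q_i(j).
example = d_j([4, 1, 0], [100000, 100, 1], [3.0e-5, 6.0e-3, 0.2])
print(example)
```

Each term is nonnegative and vanishes exactly when the observed count equals its mean, so a smaller deviance corresponds to a closer fit.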
The joint probability density function $P\{Y, \tilde{y}, N \mid \tilde{n}, \Theta\}$ of $(Y, \tilde{y}, N)$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies. (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit.) Then, because the proliferation rate of the last-stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \geq 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s, i)$.
Under the discrete time approximation, the $E[J_{k-1}(t, i)]$'s have been derived in the appendix. Using these results for the expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \neq 1$, we obtain
$$\begin{aligned}
\beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0, i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\
&= \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t),
\end{aligned} \tag{48}$$
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$ are given by
$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v}\left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to
$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2}\left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0)\gamma_v\right\}. \tag{50}$$
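The time-dependent factors in (49) and (50) come from geometric sums of $(1 + \gamma_v)^{s - t_0}$. The sketch below checks both closed forms against brute-force summation; it deliberately leaves out the coefficients $A_{i(k-1)}(v)$, which are defined elsewhere in the paper, and the function names and test values are hypothetical.

```python
def growth_sum(gamma, n):
    """Sum_{s=0}^{n-1} (1 + gamma)**s = {(1 + gamma)**n - 1} / gamma, as in (49)."""
    if gamma == 0:
        return float(n)              # limiting value as gamma -> 0
    return ((1.0 + gamma) ** n - 1.0) / gamma

def nested_growth_sum(gamma, n):
    """{(1 + gamma)**n - 1 - n * gamma} / gamma**2, the bracket of (50) over gamma**2."""
    return ((1.0 + gamma) ** n - 1.0 - n * gamma) / gamma ** 2

def brute_growth_sum(gamma, n):
    """Direct evaluation of the single geometric sum."""
    return sum((1.0 + gamma) ** s for s in range(n))

def brute_nested_sum(gamma, n):
    """Direct evaluation of sum_{s=0}^{n-1} growth_sum(gamma, s)."""
    return sum(growth_sum(gamma, s) for s in range(n))

gamma, n = 0.03, 40                  # hypothetical rate and number of time steps
print(growth_sum(gamma, n), brute_growth_sum(gamma, n))
print(nested_growth_sum(gamma, n), brute_nested_sum(gamma, n))
```

The second pair of numbers shows that the factor in (50) is a sum of the factors in (49), which is what one expects when the preceding compartment has zero net growth.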
Applying these results for time homogeneous models with $\gamma_i \neq \gamma_j$ if $i \neq j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1,\ j > 0)$, and for $j > 0$:
$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1), \\
Q_1(j) &\approx (1 - \alpha_1)\left\{e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)}\right\} + o(\beta_1).
\end{aligned} \tag{51}$$
(2) If $k \geq 3$, then we have $Q_2(0) = \delta_{2(k-1)}\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$:
$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)}(1 - \alpha)\left\{e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)}\right\} \\
&\quad + (1 - \delta_{2(k-1)})\left\{e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}).
\end{aligned} \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0)\log(1 + \gamma_v)} \approx e^{(t - t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, respectively, as given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last-stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
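The replacement of $(1 + \gamma_v)^{t - t_0}$ by $e^{(t - t_0)\gamma_v}$ is a first-order approximation, and its accuracy depends on both the rate and the number of time steps. The sketch below is only a rough illustration; the two rates and the choice of 170 half-year steps (about 85 years) are assumed values, not quantities stated in this form in the text.

```python
import math

def discrete_growth(gamma, n_steps):
    """Discrete-time growth factor (1 + gamma)**n."""
    return (1.0 + gamma) ** n_steps

def continuous_growth(gamma, n_steps):
    """Continuous-time growth factor exp(n * gamma)."""
    return math.exp(n_steps * gamma)

def relative_gap(gamma, n_steps):
    """Relative difference between the continuous and discrete factors."""
    d = discrete_growth(gamma, n_steps)
    c = continuous_growth(gamma, n_steps)
    return (c - d) / d

n_steps = 170                        # assumed: 6-month units over roughly 85 years
for gamma in (1.0e-5, 3.0e-2):       # assumed small and moderate proliferation rates
    print(gamma, relative_gap(gamma, n_steps))
```

Because $\log(1 + \gamma) \leq \gamma$, the continuous factor is never smaller than the discrete one, and the gap grows roughly like $(t - t_0)\gamma^2/2$ for small $\gamma$.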
5 The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence data, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid N, Y, \tilde{y}, \tilde{n}\}$ of $\Theta$ given $\{N, Y, \tilde{y}, \tilde{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{N, Y, \tilde{y} \mid \tilde{n}, \Theta\}$ given $\{\tilde{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experience about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{N \mid \tilde{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($Y$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{Y, \tilde{y} \mid N, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of the genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distribution of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume
$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$
where $c$ is a positive constant if these parameters satisfy some biologically specified constraints and is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k-1)$, we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
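A partially informative prior of this kind is flat inside a box and zero outside it. The sketch below shows one way such a prior might be encoded as a log-density; the dictionary keys are hypothetical names, only a subset of the constraints (i)–(iv) is included, and the lower bound $-0.01$ for the $\gamma_i$ reflects an assumed decimal placement.

```python
import math

# Open intervals (lower, upper) taken from constraints (i), (iii), and (iv);
# the parameter names are hypothetical labels, not notation from the paper.
BOUNDS = {
    "p1": (0.0, 1.0e-2),
    "p2": (0.0, 1.0e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1.0e3),
    "theta1": (0.0, 1.0e-2),
    "theta2": (0.0, 1.0e-1),
}

def log_prior(params):
    """Return 0.0 (log of a constant) inside the box and -inf outside it."""
    for name, (lower, upper) in BOUNDS.items():
        if not lower < params[name] < upper:
            return -math.inf
    return 0.0

# Hypothetical parameter values used only to exercise the function.
inside = {"p1": 1.0e-3, "p2": 5.0e-7, "gamma1": 0.0001, "gamma2": 0.03,
          "lambda0": 0.3, "lambda1": 0.5, "lambda2": 600.0,
          "theta1": 3.0e-4, "theta2": 0.05}
outside = dict(inside, p2=1.0e-3)
print(log_prior(inside), log_prior(outside))
```

Adding this log-prior to a log-likelihood restricts any posterior mode search to the biologically admissible region without otherwise changing its shape.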
5.2. The Posterior Distribution of the Parameters Given $\{Y, N, \tilde{y}, \tilde{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid N, Y, \tilde{y}, \tilde{n}\}$, we obtain
$$\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, N, Y, \tilde{y}, \tilde{n}\}
&\propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), N, Y, \tilde{y}, \tilde{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} [Q_i(j)]^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned} \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, N, Y, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), N, Y, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{t_N} D_j$ given by (46).
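The multinomial part of the first kernel in (54) is maximized at the sample proportions when the box constraints of Section 5.1 are not binding. The sketch below illustrates this under that assumption, ignoring the factor in $\chi(k)$; the genotype counts are hypothetical totals $\sum_j n_{ij}$.

```python
import math

def multinomial_mode(n0, n1, n2):
    """Unconstrained maximizer of p1**n1 * p2**n2 * (1 - p1 - p2)**n0."""
    total = n0 + n1 + n2
    return n1 / total, n2 / total

def log_kernel(p1, p2, n0, n1, n2):
    """Log of the multinomial part of the conditional posterior kernel."""
    return n1 * math.log(p1) + n2 * math.log(p2) + n0 * math.log(1.0 - p1 - p2)

# Hypothetical totals of n_0j, n_1j, n_2j over the age groups.
n0, n1, n2 = 99890, 100, 10
p1_hat, p2_hat = multinomial_mode(n0, n1, n2)
best = log_kernel(p1_hat, p2_hat, n0, n1, n2)
nearby = log_kernel(1.1 * p1_hat, 0.9 * p2_hat, n0, n1, n2)
print(p1_hat, p2_hat, best > nearby)
```

If a bound such as $p_2 < 10^{-6}$ were active, the mode would lie on the boundary and a constrained optimizer would be needed instead of this closed form.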
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and with Steps 3 and 4 as the M-step, respectively [29]. The multilevel Gibbs sampling procedure is given by the following steps.
Step 1 (Generating $N$ Given $(Y, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $N$. Then combine this large sample with $P\{Y, \tilde{y} \mid N, \tilde{n}, \Theta\}$ in (37) and (40) to select $N$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $N$ is then a sample from $P\{N \mid Y, \tilde{y}, \tilde{n}, \Theta\}$ even though the latter is unknown. (For proof, see Tan [22], Chapter 3.) Call the generated sample $\hat{N}$.
Step 2 (Generating $Y$ Given $(N, \tilde{y}, \tilde{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\tilde{y}, \tilde{n}, \Theta\}$ and given $N = \hat{N}$ generated from Step 1, generate $Y$ from the probability distribution $P\{Y \mid N, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{Y}$.
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, N, Y, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and given $(N, Y) = (\hat{N}, \hat{Y})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{N}, \hat{Y}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, N, Y, \tilde{y}, \tilde{n}\}$). Given $(\tilde{y}, \tilde{n})$ and given $(N, Y, p_i, i = 1, 2, \alpha) = (\hat{N}, \hat{Y}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{N}, \hat{Y}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{t_N} D_j$ in (46) under the constraints. Denote the generated mode as $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(N, Y, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{N}, \hat{Y}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$ independently of $(N, Y)$ (for proof, see Tan [22], Chapter 3). Repeating the above procedure, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one then uses the sample means as the estimates of $\Theta$ and uses the sample variances and covariances as estimates of the variances and covariances of these estimates.
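The selection in Step 1 relies on the weighted bootstrap: candidates are drawn from a proposal distribution and one is resampled with probability proportional to a likelihood weight. The sketch below is a toy version under loud assumptions: a single genotype count with a binomial proposal replaces the multinomial one, a single Poisson term replaces the full product in (37) and (40), and all names and numbers are hypothetical.

```python
import math
import random

def poisson_logpmf(y, mean):
    """Log of the Poisson probability h(y; mean)."""
    if mean <= 0:
        return 0.0 if y == 0 else -math.inf
    return -mean + y * math.log(mean) - math.lgamma(y + 1)

def weighted_bootstrap(candidates, log_weights, rng):
    """Resample one candidate with probability proportional to exp(log_weight)."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]   # rescaled for stability
    return rng.choices(candidates, weights=weights, k=1)[0]

def draw_n1(n_total, p1, y1, q1, rng, n_candidates=500):
    """Toy data-augmentation draw of n1 given an observed count y1."""
    # Proposal: n1 ~ Binomial(n_total, p1), simulated as a sum of Bernoulli draws.
    candidates = [
        sum(rng.random() < p1 for _ in range(n_total))
        for _ in range(n_candidates)
    ]
    # Weight: Poisson likelihood of y1 with mean n1 * q1.
    log_weights = [poisson_logpmf(y1, n1 * q1) for n1 in candidates]
    return weighted_bootstrap(candidates, log_weights, rng)

rng = random.Random(0)               # fixed seed so the sketch is repeatable
n1_draw = draw_n1(n_total=500, p1=0.05, y1=3, q1=0.1, rng=rng)
print(n1_draw)
```

With a large candidate pool, the resampled value is approximately a draw from the conditional distribution of the augmented count given the data, which is the property Step 1 uses.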
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is the retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells) which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    −1312.02         2648.04    2677.35
Three-stage, Model-1    −1295.76         2609.53    2631.51
Two-stage               −4392.421        8796.843   8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model (incidences plotted against age in years; series shown: observed, Model-F, Model-1, and two-stage).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df $= 86 - 12 = 74$) for Model-F and a $P$-value of 0.11 (df $= 86 - 7 = 79$) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F, respectively; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models, respectively. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$, further confirming that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ $(j = 0, 1, 2)$ of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0, i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from some biological observations, one can have some rough ideas about the magnitude of $\beta_j$ $(j = 0, 1, 2)$. For example, if we follow Potten et al. [32] to assume $E[N(t_0)] = E[J_i(t_0, i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population, the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
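The model comparison in observation (a) rests on the AIC, the BIC, and a Pearson chi-square statistic. The sketch below shows how these quantities might be computed; the inputs for Model-F are assumptions (log-likelihood read as $-1312.02$, 12 parameters as implied by df $= 86 - 12$, and 85 observations chosen because it reproduces the reported BIC), and the observed and expected counts are made-up numbers.

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: 2m - 2 log L."""
    return 2.0 * n_params - 2.0 * log_lik

def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: m log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

def pearson_chi_square(observed, expected):
    """Sum of (y_i - yhat_i)**2 / yhat_i over the age groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumed inputs for Model-F (see the lead-in for where each number comes from).
model_f_aic = aic(-1312.02, 12)
model_f_bic = bic(-1312.02, 12, 85)
print(round(model_f_aic, 2), round(model_f_bic, 2))

# Made-up observed and predicted counts, only to exercise the statistic.
chi2_example = pearson_chi_square([34, 5, 8], [36.0, 4.0, 9.0])
print(chi2_example)
```

Under these assumptions the computed criteria agree with the Model-F entries of Table 2 to two decimal places; a $P$-value would additionally require the chi-square distribution function, which the standard library does not provide.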
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
For using the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experience about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information on inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($Y$) and the observed data ($\tilde{y}$) via the statistical model from the system ($P\{Y, \tilde{y} \mid N, \Theta\}$ given by (37) and (40)).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter     Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$   2.88E−01    1.21E−01    5.03E−02       5.25E−01
$\lambda_1$   4.81E−01    4.65E−02    3.89E−01       5.72E−01
$\lambda_2$   6.55E+02    1.12E+02    4.34E+02       8.75E+02
$\gamma_1$    2.71E−05    8.86E−06    5.34E−06       4.01E−05
$\gamma_2$    3.37E−02    1.92E−04    3.33E−02       3.41E−02
$p_1$         9.95E−04    1.75E−06    9.91E−04       9.98E−04
$p_2$         9.68E−07    6.48E−09    9.55E−07       9.81E−07
$\alpha$      8.41E−01    4.19E−02    7.59E−01       9.23E−01
$\theta_1$    3.29E−04    3.54E−05    2.60E−04       3.98E−04
$\theta_2$    3.41E−01    1.07E−02    3.20E−01       3.62E−01

Model-1
Parameter     Estimate    StD         95% CL lower   95% CL upper
$\lambda_0$   6.55E−02    2.54E−02    1.57E−02       1.15E−01
$\lambda_1$   3.72E−01    4.63E−02    2.82E−01       4.63E−01
$\lambda_2$   1.98E+02    3.38E+01    1.32E+02       2.65E+02
$\gamma_1$    NA          NA          NA             NA
$\gamma_2$    3.49E−02    3.58E−03    2.79E−02       4.19E−02
$p_1$         9.98E−04    6.27E−05    8.75E−04       1.12E−03
$p_2$         1.00E−06    4.53E−08    9.11E−07       1.09E−06
$\alpha$      8.07E−01    7.06E−02    6.68E−01       9.45E−01
$\theta_1$    NA          NA          NA             NA
$\theta_2$    NA          NA          NA             NA

Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models
and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that the human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.
Applying our models and methods to the SEER data
of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be our future research topic; we will not go any further here.
Appendix

The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of state variables reduce to the following stochastic difference equations of state variables, respectively:
$$\begin{aligned}
J_i(t+1, i) &= J_i(t, i)[1 + \gamma_i(t)] + e_i(t+1, i), \quad i = 0, 1, \ldots, \text{Min}(k-1, 2), \\
J_j(t+1, u) &= J_j(t, u)[1 + \gamma_j(t)] + J_{j-1}(t, u)\beta_{j-1}(t) + e_j(t+1, u), \quad 0 \leq u < j \leq k-1,
\end{aligned} \tag{A1}$$
where $e_i(t+1, i) = [B_i(t, i) - J_i(t, i) b_i(t)] - [D_i(t, i) - J_i(t, i) d_i(t)]$ and $e_j(t+1, i) = [B_j(t, i) - J_j(t, i) b_j(t)] - [D_j(t, i) - J_j(t, i) d_j(t)] + [M_{j-1}(t, i) - J_{j-1}(t, i)\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0, i) > 0$ if $j = i, i+1$ and $J_j(t_0, i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$\begin{aligned}
J_i(t, i) &= J_i(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s, i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \text{Min}(k-1, 2), \\
J_j(t, i) &= J_j(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s, i)\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s, i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \leq i < j \leq k-1.
\end{aligned} \tag{A2}$$
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \ne \gamma_j$ for $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$J_i(t; i) = J_i(t_0; i)(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \min(k-1, 2),$$

$$\begin{aligned}
J_j(t; i) &= J_j(t_0; i)(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \Bigl\{\prod_{v=u}^{j-1} \beta_v\Bigr\} \sum_{r=u}^{j} A_{uj}(r)(1 + \gamma_r)^{t-t_0} + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{\prod_{r=j-u}^{j-1} \beta_r\Bigr\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \Bigl\{\prod_{v=u}^{j-1} \beta_v\Bigr\} \sum_{r=u}^{j} A_{uj}(r)(1 + \gamma_r)^{t-t_0} + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{\prod_{r=j-u}^{j-1} \beta_r\Bigr\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned} \tag{A3}$$
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ for $i \ne j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \Bigl(\prod_{v=u}^{k-2} \beta_v\Bigr) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \min(k-1, 2). \tag{A4}$$
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Table 1: The SEER incidence data (1973–2007) of uveal melanoma from NCI/NIH (over all races and genders).

Age group   Number of people at risk   Observed incidence   Model-F predicted   Model-1 predicted   Two-stage predicted
0           12495777                   34                   36                  36                  38
1           12221582                   20                   11                  11                  16
2           12120990                   14                   7                   8                   17
3           12112995                   9                    5                   6                   18
4           12146174                   4                    4                   5                   20
5           12161336                   3                    3                   4                   22
6           12111854                   2                    2                   4                   25
7           12160452                   2                    2                   4                   28
8           11942586                   2                    2                   4                   30
9           12381299                   1                    2                   5                   34
10          12512703                   2                    3                   6                   38
11          12410338                   5                    3                   6                   41
12          12449244                   2                    3                   6                   44
13          12527781                   5                    4                   7                   48
14          12602883                   3                    5                   7                   51
15          12719598                   5                    5                   8                   55
16          12766107                   7                    6                   9                   59
17          12831400                   9                    7                   9                   62
18          12382047                   8                    7                   9                   63
19          12581638                   10                   8                   10                  68
20          12636509                   7                    9                   11                  71
21          12682601                   6                    10                  10                  75
22          12840510                   12                   11                  11                  79
23          13075528                   17                   13                  13                  84
24          13358635                   16                   15                  14                  89
25          13473849                   12                   16                  16                  94
26          13426340                   17                   18                  18                  97
27          13525264                   28                   20                  20                  101
28          13149674                   17                   22                  21                  102
29          13812811                   23                   25                  25                  110
30          13886874                   24                   28                  27                  114
31          13488332                   37                   30                  29                  115
32          13460286                   32                   32                  32                  118
33          13256067                   38                   35                  35                  119
34          13428827                   37                   39                  38                  124
35          13220037                   40                   41                  41                  126
36          12870265                   30                   44                  44                  126
37          12689592                   43                   47                  47                  127
38          12157014                   42                   49                  49                  125
39          12494081                   46                   55                  54                  131
40          12272125                   49                   58                  58                  132
41          11826573                   56                   61                  60                  130
42          11663153                   54                   65                  64                  131
43          11407082                   53                   68                  68                  131
44          11296848                   70                   73                  72                  133
45          11016369                   57                   76                  76                  132
46          10651593                   71                   79                  79                  130
47          10475708                   87                   84                  83                  131
48          9994684                    82                   86                  85                  127
49          10138908                   78                   93                  93                  131
50          9836359                    87                   97                  96                  130
51          9475641                    95                   100                 99                  127
52          9250985                    113                  104                 104                 126
53          9027382                    106                  108                 108                 125
54          8883737                    117                  113                 113                 125
55          8547883                    129                  116                 116                 123
56          8279648                    107                  119                 119                 121
57          8062368                    119                  123                 123                 119
58          7654610                    132                  124                 124                 115
59          7563706                    118                  130                 129                 115
60          7232719                    131                  131                 130                 112
61          6927332                    116                  132                 132                 109
62          6708273                    133                  134                 134                 107
63          6543931                    143                  138                 137                 106
64          6404652                    130                  141                 141                 105
65          6168486                    145                  142                 142                 102
66          5913479                    138                  142                 142                 99
67          5746766                    151                  144                 144                 98
68          5480517                    147                  142                 142                 94
69          5363912                    149                  144                 144                 94
70          5110728                    136                  142                 142                 90
71          4925076                    144                  141                 141                 88
72          4696825                    140                  138                 138                 85
73          4512136                    146                  135                 135                 82
74          4345300                    126                  133                 133                 80
75          4148801                    122                  128                 129                 78
76          3900900                    124                  122                 122                 74
77          3681587                    114                  116                 116                 70
78          3481918                    93                   110                 110                 67
79          3243631                    102                  102                 102                 63
80          2961234                    79                   92                  93                  58
81          2724984                    93                   83                  84                  54
82          2495219                    82                   75                  75                  50
83          2271595                    92                   66                  67                  46
84          2041351                    64                   57                  58                  42
85          10466605                   304                  282                 285                 216

Note 1: for age 0 and 1–10 years old, the cancer incidences are derived by subtracting the incidence of retinoblastoma from the original SEER data (see Tan and Zhou [9]).
Note 2: the observed uveal melanoma incidence rates per $10^6$ individuals are derived from the SEER eye cancer incidence by subtracting the retinoblastoma incidence as given in Tan and Zhou [9].
4.2. The Probability Distribution of $Y_j$ ($j \ge 1$). To derive the probability distribution of $Y_j$ ($j \ge 1$) in the $j$th age group, let $Y_{ij}$ ($i = 0, 1, 2$) be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$Y_{ij} \mid n_{ij} \sim \text{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2. \tag{33}$$
Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all ($i = 0, 1, 2$; $j = 0, 1, \ldots, n_T$), where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \ne 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$

$$P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\}, \tag{34}$$

where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \text{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ is the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \text{Poisson}\{Q_T(j)\}$.
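To make the structure of (34) concrete, the following minimal sketch evaluates the multinomial mixture of Poisson densities by direct enumeration. It is only a toy: the function names are hypothetical, and enumeration is feasible only for a very small $n_j$, whereas the SEER age groups contain millions of people.

```python
from math import comb, exp, lgamma, log


def poisson_pmf(y, mean):
    """h{y; mean}: Poisson density, with the degenerate case mean = 0 handled."""
    if mean == 0.0:
        return 1.0 if y == 0 else 0.0
    return exp(y * log(mean) - mean - lgamma(y + 1))


def multinomial_pmf(n0, n1, n, p0, p1):
    """g(n0, n1; n, p0, p1): density of (n0, n1) with n2 = n - n0 - n1."""
    n2 = n - n0 - n1
    p2 = 1.0 - p0 - p1
    return comb(n, n0) * comb(n - n0, n1) * p0 ** n0 * p1 ** n1 * p2 ** n2


def mixture_pmf(y, n, p0, p1, q, k=3):
    """P(y_j | n_j) as in (34); q = (Q_0(j), Q_1(j), Q_2(j))."""
    delta_2k = 1 if k == 2 else 0
    total = 0.0
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            q_total = n0 * q[0] + n1 * q[1] + (1 - delta_2k) * n2 * q[2]
            total += multinomial_pmf(n0, n1, n, p0, p1) * poisson_pmf(y, q_total)
    return total
```

When $p_0 = 1$ the mixing distribution is degenerate and the mixture collapses to a single Poisson density with mean $n_j Q_0(j)$, which gives a quick sanity check.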
The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters ($p_1$, $p_2$, $\alpha$) and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data ($y_j$, $j = 0, 1, \ldots, n_T$), the likelihood function of $\Theta$ is

$$L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j). \tag{35}$$
Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $b_i(t) = b_i$, $d_i(t) = d_i$, $\gamma_i(t) = b_i - d_i = \gamma_i$, $i = 0, 1, \ldots, k-1$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for ($j = 1, \ldots, t_N$) and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of ($Y_{ij}$, $i = 0, 1$) given ($Y_j = y_j$, $n_{ij}$, $i = 0, 1$, $n_j$) is

$$(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Multinomial}\Bigl\{y_j; \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\Bigr\}. \tag{36}$$
Since the conditional probability distribution of $Y_j$ given ($n_{ij}$, $i = 0, 1, 2$) for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of ($Y_{ij}$, $i = 0, 1$, $Y_j$) given ($n_{ij}$, $i = 0, 1$, $n_j$)

$$\begin{aligned}
P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}
&= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\
&= \frac{1}{y_j!} e^{-Q_T(j)} Q_T(j)^{y_j} \times g\Bigl\{y_{0j}, y_{1j}; y_j, \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1\Bigr\} \\
&= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2),
\end{aligned} \tag{37}$$

where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
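The last equality in (37) is the familiar Poisson splitting identity: a Poisson total split multinomially with probabilities proportional to the component means is equivalent to independent Poisson components. The short sketch below checks this numerically; the function names and the test values are illustrative assumptions, not quantities from the paper.

```python
from math import exp, factorial


def poisson_pmf(y, mean):
    """Poisson density h{y; mean} for a strictly positive mean."""
    return exp(-mean) * mean ** y / factorial(y)


def joint_via_total_and_split(ys, means):
    """Poisson density of the total times the multinomial density of the split."""
    total_y = sum(ys)
    total_mean = sum(means)
    coefficient = factorial(total_y)
    for y in ys:
        coefficient //= factorial(y)  # exact multinomial coefficient
    split = float(coefficient)
    for y, mean in zip(ys, means):
        split *= (mean / total_mean) ** y
    return poisson_pmf(total_y, total_mean) * split


def joint_via_product(ys, means):
    """Product of independent Poisson densities, one per genotype."""
    product = 1.0
    for y, mean in zip(ys, means):
        product *= poisson_pmf(y, mean)
    return product
```

Here `means` plays the role of $(n_{0j}Q_0(j), n_{1j}Q_1(j), n_{2j}Q_2(j))$ and `ys` the role of $(y_{0j}, y_{1j}, y_{2j})$.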
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have for $k = 2$

$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\Bigl\{\sum_{i=0}^{1} n_{ij} Q_i(j)\Bigr\}, \tag{38}$$

$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\Bigl\{y_j; \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\Bigr\}, \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of ($Y_{0j}$, $Y_j$) given ($n_{uj}$, $u = 0, 1, 2$) is

$$\begin{aligned}
P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\}
&= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\
&= \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \quad \text{if } k = 2,
\end{aligned} \tag{40}$$

where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\underset{\sim}{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\underset{\sim}{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of ($\mathbf{Y}$, $\underset{\sim}{y}$) given ($\mathbf{N}$, $\underset{\sim}{n}$)

$$P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \bigl\{h[y_{2j}; n_{2j} Q_2(j)]\bigr\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given ($\underset{\sim}{n}$, $\Theta$) is

$$P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \bigl\{h[y_{2j}; n_{2j} Q_2(j)]\bigr\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is

$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$

where

$$D_0(k) = 2\Bigl\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\Bigr\}, \tag{44}$$

$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \Bigl\{n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i}\Bigr\}, \tag{45}$$

$$D_j = 2 \sum_{i=0}^{1} \Bigl\{n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}}\Bigr\} + 2(1 - \delta_{2k}) \times \Bigl\{n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}}\Bigr\}, \tag{46}$$

where $\hat{p}_i = \bigl(\sum_{j=0}^{t_N} n_{ij}\bigr) / \bigl(\sum_{j=0}^{t_N} n_j\bigr)$ ($i = 0, 1, 2$).
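The Poisson part of the deviance, $D_j$ in (46), is straightforward to compute. The sketch below is one possible implementation; the function names are hypothetical, and the convention $y \log(\mu / y) = 0$ when $y = 0$ is an assumption added here to handle empty cells.

```python
from math import log


def poisson_deviance_term(y, mu):
    """2{mu - y - y log(mu / y)}, with y log(mu / y) taken as 0 when y = 0."""
    if y == 0:
        return 2.0 * mu
    return 2.0 * (mu - y - y * log(mu / y))


def age_group_deviance(y, n, q, k=3):
    """D_j in (46): y = (y_0j, y_1j, y_2j), n = (n_0j, n_1j, n_2j), q = (Q_0, Q_1, Q_2)."""
    delta_2k = 1 if k == 2 else 0
    deviance = sum(poisson_deviance_term(y[i], n[i] * q[i]) for i in (0, 1))
    if not delta_2k:
        # The J_2 contribution is present only when k > 2.
        deviance += poisson_deviance_term(y[2], n[2] * q[2])
    return deviance
```

Each term is nonnegative and vanishes exactly when the observed count equals its Poisson mean, so minimizing $\sum_j D_j$ pulls the fitted means $n_{ij}Q_i(j)$ toward the augmented counts $y_{ij}$.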
The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of ($\mathbf{Y}$, $\underset{\sim}{y}$, $\mathbf{N}$) given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by

$$Q_i(j) \approx E\bigl\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\bigr\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$

where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s; i)$.

Under the discrete time approximation, the $E[J_{k-1}(t; i)]$'s have been derived in the appendix. Using these results on expected numbers and the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain

$$\begin{aligned}
\beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0; i)] \Bigl(\prod_{u=r}^{k-1} \beta_u\Bigr) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s-t_0} \\
&= \theta_{(i+1)(k-1)} \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)} \phi_{i(k-1)}(t),
\end{aligned} \tag{48}$$
where ($\theta_{i(k-1)}$, $i = 1, 2, 3$) and ($\lambda_{i(k-1)}$, $i = 0, 1, 2$) are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ ($i = 0, 1, 2, 3$)'s are given by

$$\phi_{i(k-1)}(t) = \Bigl(\prod_{u=i+1}^{k-1} \gamma_u\Bigr) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \bigl\{(1 + \gamma_v)^{t-t_0} - 1\bigr\}, \quad i = 0, 1, 2, 3. \tag{49}$$
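The factor $\gamma_v^{-1}\{(1+\gamma_v)^{t-t_0} - 1\}$ in (49) is the geometric sum $\sum_{s=t_0}^{t-1}(1+\gamma_v)^{s-t_0}$ appearing in (48). A minimal check is sketched below; the function names are illustrative, and the explicit handling of $\gamma = 0$ (where the sum equals the number of terms) is an added convention.

```python
def geometric_sum_direct(gamma, n):
    """Sum of (1 + gamma)^s for s = 0, ..., n - 1, term by term."""
    return sum((1.0 + gamma) ** s for s in range(n))


def geometric_sum_closed(gamma, n):
    """Closed form ((1 + gamma)^n - 1) / gamma, with the gamma = 0 limit n."""
    if gamma == 0.0:
        return float(n)
    return ((1.0 + gamma) ** n - 1.0) / gamma
```

Here `n` stands for $t - t_0$ measured in the chosen time unit.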
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$\phi_{0(k-1)}(t) = \Bigl(\prod_{u=1}^{k-1} \gamma_u\Bigr) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2} \bigl\{(1 + \gamma_v)^{t-t_0} - 1 - (t - t_0) \gamma_v\bigr\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under the discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$, $j > 0$), and for $j > 0$,

$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11} \phi_{11}(t_{j-1}) - \lambda_{01} \phi_{01}(t_{j-1})} - e^{-\theta_{11} \phi_{11}(t_j) - \lambda_{01} \phi_{01}(t_j)} + o(\beta_1), \\
Q_1(j) &\approx (1 - \alpha_1) \bigl\{e^{-\lambda_{11} \phi_{11}(t_{j-1})} - e^{-\lambda_{11} \phi_{11}(t_j)}\bigr\} + o(\beta_1).
\end{aligned} \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)} \alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$,

$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)} (1 - \alpha) \bigl\{e^{-\lambda_{22} \phi_{22}(t_{j-1})} - e^{-\lambda_{22} \phi_{22}(t_j)}\bigr\} \\
&\quad + (1 - \delta_{2(k-1)}) \bigl\{e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_j)}\bigr\} + o(\beta_{k-1}).
\end{aligned} \tag{52}$$
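Apart from the $o(\cdot)$ terms and the factors involving $\alpha$, every $Q_i(j)$ in (51) and (52) has the form $e^{-H(t_{j-1})} - e^{-H(t_j)}$ for a nondecreasing cumulative quantity $H(t)$, such as $\theta_{1(k-1)}\phi_{1(k-1)}(t) + \lambda_{0(k-1)}\phi_{0(k-1)}(t)$. The sketch below computes these differences from a precomputed list of $H$ values; the function name and the input layout are assumptions made for illustration.

```python
from math import exp


def incidence_probabilities(cumulative):
    """Return [Q(1), ..., Q(T)] with Q(j) = exp(-H(t_{j-1})) - exp(-H(t_j)).

    cumulative : list [H(t_0), H(t_1), ..., H(t_T)], assumed nondecreasing.
    """
    return [
        exp(-cumulative[j - 1]) - exp(-cumulative[j])
        for j in range(1, len(cumulative))
    ]
```

Because the sum telescopes, the probabilities add up to $e^{-H(t_0)} - e^{-H(t_T)}$, the probability of developing a tumor somewhere in $(t_0, t_T]$ under this approximation.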
Notice that if one replaces $[1 + \gamma_v]^{t-t_0}$ by $[1 + \gamma_v]^{t-t_0} = e^{(t-t_0) \log(1 + \gamma_v)} \approx e^{(t-t_0) \gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
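How good the replacement of $(1+\gamma_v)^{t-t_0}$ by $e^{(t-t_0)\gamma_v}$ is depends on the size of $\gamma_v$ and on the number of time steps. The following small sketch, with hypothetical function names, measures the relative gap between the two growth factors.

```python
from math import exp


def discrete_growth(gamma, n):
    """Growth factor (1 + gamma)^n of the discrete time model."""
    return (1.0 + gamma) ** n


def continuous_growth(gamma, n):
    """Growth factor exp(n * gamma) of the continuous time model."""
    return exp(n * gamma)


def relative_gap(gamma, n):
    """Relative difference between the two factors, measured against the continuous one."""
    continuous = continuous_growth(gamma, n)
    return abs(continuous - discrete_growth(gamma, n)) / continuous
```

Since $\log(1+\gamma) = \gamma - \gamma^2/2 + \cdots$, the gap is roughly $n\gamma^2/2$ when that quantity is small, so the two models agree closely only for small proliferation rates per time unit.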
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence data, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2); and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows:

(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k-1$), we let $1 < \omega_0 = N(t_0) \beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.

We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
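The partially informative prior in (53) is flat inside a box and zero outside it, so on the log scale it is either a constant or $-\infty$. The sketch below encodes constraints (i)–(iv) for an illustrative three-stage case; the parameter names (`p1`, `gamma1`, `omega0`, and so on) and the dictionary layout are assumptions made here, not notation fixed by the paper.

```python
from math import inf

# Open intervals (lower, upper) taken from constraints (i)-(iv), written for k = 3.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "omega0": (1.0, 1000.0),
    "beta1": (1e-8, 1e-3),
    "beta2": (1e-8, 1e-3),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}


def log_prior(params):
    """Log of the flat prior: 0 inside the constraint box, -inf outside."""
    for name, (lower, upper) in BOUNDS.items():
        if not (lower < params[name] < upper):
            return -inf
    return 0.0
```

With such a prior, maximizing the posterior is the same as maximizing the likelihood subject to the box constraints, which is how the constraints enter Steps 3 and 4 of the Gibbs sampling procedure.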
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain

$$\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto \{\chi(k)\}^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned} \tag{54}$$

where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{t_N} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and with Steps 3 and 4 as the M-step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given ($\mathbf{Y}$, $\underset{\sim}{y}$, $\underset{\sim}{n}$, $\Theta$) (the Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.

Step 2 (Generating $\mathbf{Y}$ Given ($\mathbf{N}$, $\underset{\sim}{y}$, $\underset{\sim}{n}$, $\Theta$) (the Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.

Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given ($\mathbf{N}$, $\mathbf{Y}$) = ($\hat{\mathbf{N}}$, $\hat{\mathbf{Y}}$) from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.

Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given ($\underset{\sim}{y}$, $\underset{\sim}{n}$) and given ($\mathbf{N}$, $\mathbf{Y}$, $p_i$, $i = 1, 2$, $\alpha$) = ($\hat{\mathbf{N}}$, $\hat{\mathbf{Y}}$, $\hat{p}_i$, $i = 1, 2$, $\hat{\alpha}$) from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{t_N} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.

Step 5 (Recycling Step). With $\{\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1\} = \{\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ given above, go back to Step 1 and continue until convergence.
The proof of convergence of the above steps can bederived by using procedure given in Tan ([22] Chapter 3) Atconvergence the Θ = 119901
119894 119894 = 1 2 Θ
1 are the generated
values from the posterior distribution of Θ given
˜119910
˜119899
independently of (NY) (for proof see Tan [22] Chapter 3)Repeat the above procedures once then generate a randomsample ofΘ from the posterior distribution ofΘ given
˜119910
˜119899
then one uses the sample mean as the estimates of (Θ) anduse the sample variances and covariances as estimates of thevariances and covariances of these estimates
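The weighted bootstrap selection in Step 1 can be illustrated with a minimal Python sketch. It is an illustration under simplifying assumptions, not the authors' implementation: it treats a single age group with $k > 2$, takes the augmented counts $(y_{0j}, y_{1j}, y_{2j})$ as already given, and uses made-up values for $n_j$, the genotype probabilities, and the per-person rates $Q_i(j)$. All function and variable names are hypothetical.

```python
import math
import random


def poisson_logpmf(y, mean):
    """Log of the Poisson probability h{y; mean}."""
    if mean == 0.0:
        return 0.0 if y == 0 else -math.inf
    return y * math.log(mean) - mean - math.lgamma(y + 1)


def draw_genotype_counts(n_j, probs, rng):
    """One multinomial draw (n_0j, n_1j, n_2j) given n_j; this is the proposal for N."""
    counts = [0, 0, 0]
    for genotype in rng.choices([0, 1, 2], weights=probs, k=n_j):
        counts[genotype] += 1
    return tuple(counts)


def weighted_bootstrap(n_j, probs, q, y, n_candidates=2000, seed=0):
    """Select one N approximately from P{N | Y, n_j, Theta} by sampling-resampling."""
    rng = random.Random(seed)
    candidates = [draw_genotype_counts(n_j, probs, rng) for _ in range(n_candidates)]
    # Weight of each candidate: product of Poisson densities h{y_ij; n_ij Q_i(j)}.
    log_w = [sum(poisson_logpmf(y[i], c[i] * q[i]) for i in range(3)) for c in candidates]
    top = max(log_w)
    if top == -math.inf:
        raise ValueError("no candidate is compatible with the observed counts")
    weights = [math.exp(lw - top) for lw in log_w]  # rescaled to avoid underflow
    # Resample one candidate with probability proportional to its weight.
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Assumed toy values: 200 people at risk, genotype probabilities, rates, counts.
    print(weighted_bootstrap(200, [0.90, 0.08, 0.02], [0.001, 0.02, 0.5], (1, 2, 3)))
```

In practice the number of people at risk per age group is far larger than in this toy example, so a production version would draw the multinomial counts with a dedicated sampler rather than person by person.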
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

Human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea and give color to the eye. In Tan and Zhou [9], we developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. The same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\geq 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). To account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is therefore derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).

To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    -1312.02         2648.04    2677.35
Three-stage, Model-1    -1295.76         2609.53    2631.51
Two-stage               -4392.421        8796.843   8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model (incidences plotted against age in years; series: Observed, Model-F, Model-1, Two-stage).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 - 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 - 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the stage-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)] \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from biological observations, one can obtain a rough idea of the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] and assume $E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.

(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the stage-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
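The model comparison in observation (a) rests on standard formulas, which the following Python sketch spells out. It assumes the textbook definitions AIC $= -2 \log L + 2p$, BIC $= -2 \log L + p \log n$, and the Pearson chi-square statistic; the parameter count $p = 12$ for Model-F is taken from the degrees of freedom quoted in (a), the Model-F log-likelihood is read from Table 2 as $-1312.02$, and the function names are illustrative only.

```python
import math


def aic(log_lik, n_params):
    """Akaike Information Criterion: -2 log L + 2p."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log n."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


def pearson_chi_square(observed, predicted):
    """Sum of (y_i - yhat_i)^2 / yhat_i over all age groups."""
    return sum((y - yhat) ** 2 / yhat for y, yhat in zip(observed, predicted))


if __name__ == "__main__":
    # Model-F: log-likelihood -1312.02 with 12 parameters.
    print(round(aic(-1312.02, 12), 2))  # 2648.04, the AIC reported for Model-F
```

Under these definitions a smaller AIC or BIC indicates the preferred model, which is the sense in which Model-1 is said to fit slightly better than Model-F.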
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution that accounts for genetic segregation of the stage-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the stage-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters, in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases, via the genetic segregation of stage-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$), via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter      Estimate    StD         95% CL-Lower   95% CL-Upper
$\lambda_0$    2.88E-01    1.21E-01    5.03E-02       5.25E-01
$\lambda_1$    4.81E-01    4.65E-02    3.89E-01       5.72E-01
$\lambda_2$    6.55E+02    1.12E+02    4.34E+02       8.75E+02
$\gamma_1$     2.71E-05    8.86E-06    5.34E-06       4.01E-05
$\gamma_2$     3.37E-02    1.92E-04    3.33E-02       3.41E-02
$p_1$          9.95E-04    1.75E-06    9.91E-04       9.98E-04
$p_2$          9.68E-07    6.48E-09    9.55E-07       9.81E-07
$\alpha$       8.41E-01    4.19E-02    7.59E-01       9.23E-01
$\theta_1$     3.29E-04    3.54E-05    2.60E-04       3.98E-04
$\theta_2$     3.41E-01    1.07E-02    3.20E-01       3.62E-01

Model-1
Parameter      Estimate    StD         95% CL-Lower   95% CL-Upper
$\lambda_0$    6.55E-02    2.54E-02    1.57E-02       1.15E-01
$\lambda_1$    3.72E-01    4.63E-02    2.82E-01       4.63E-01
$\lambda_2$    1.98E+02    3.38E+01    1.32E+02       2.65E+02
$\gamma_1$     NA          NA          NA             NA
$\gamma_2$     3.49E-02    3.58E-03    2.79E-02       4.19E-02
$p_1$          9.98E-04    6.27E-05    8.75E-04       1.12E-03
$p_2$          1.00E-06    4.53E-08    9.11E-07       1.09E-06
$\alpha$       8.07E-01    7.06E-02    6.68E-01       9.45E-01
$\theta_1$     NA          NA          NA             NA
$\theta_2$     NA          NA          NA             NA

Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5); on the other hand, the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the stage-limiting tumor suppressor gene in the US population ($p_1 \sim 9.948 \times 10^{-4}$; see Table 3). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $\hat{y}_0 = n_0 \hat{\alpha} \hat{p}_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix

The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i) b_i(t)] - [D_i(t; i) - J_i(t; i) d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i) b_j(t)] - [D_j(t; i) - J_j(t; i) d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i) \beta_{j-1}(t)]$ for $j > i$.

The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by
$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ if $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)(1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)(1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)(1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$

where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j - i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ if $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \sum_{r=u}^{k-1} A_{u(k-1)}(r)(1 + \gamma_r)^{t - t_0}, \quad i = 0, 1, \ldots, \mathrm{Min}(k-1, 2).
\tag{A4}
$$
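The expected values can also be obtained numerically by iterating the difference equations directly. The Python sketch below is a hedged illustration, not part of the original derivation: it assumes a time-homogeneous model and assumes that the noise terms $e_j$ have expectation zero (as the absence of $\eta$ terms in (A4) suggests), so that $E[J_j(t+1)] = E[J_j(t)](1 + \gamma_j) + E[J_{j-1}(t)]\beta_{j-1}$. For two adjacent stages with $\gamma_0 \neq \gamma_1$ the recursion has the elementary closed form used in the check; the function names and numerical values are made up for illustration.

```python
def expected_counts(initial, gammas, betas, steps):
    """Iterate the expected-value recursion for successive stages.

    initial[j]: expected number of stage-j cells at t0
    gammas[j]:  net proliferation rate of stage-j cells
    betas[j]:   mutation rate from stage j to stage j+1
    """
    current = list(initial)
    for _ in range(steps):
        updated = []
        for j, value in enumerate(current):
            # Cells arriving by mutation from the previous stage (none for the first stage).
            inflow = current[j - 1] * betas[j - 1] if j > 0 else 0.0
            updated.append(value * (1.0 + gammas[j]) + inflow)
        current = updated
    return current


def second_stage_closed_form(first0, gamma0, gamma1, beta0, steps):
    """Closed form for the second stage when it starts at 0 and gamma0 != gamma1."""
    growth = (1.0 + gamma0) ** steps - (1.0 + gamma1) ** steps
    return first0 * beta0 * growth / (gamma0 - gamma1)


if __name__ == "__main__":
    result = expected_counts([1e8, 0.0], [0.001, 0.03], [1e-6], steps=10)
    print(result[0], 1e8 * 1.001 ** 10)  # first stage: pure geometric growth
    print(result[1], second_stage_closed_form(1e8, 0.001, 0.03, 1e-6, 10))
```

The two printed pairs should agree up to floating-point rounding, which gives a quick consistency check on a hand-derived closed form before it is used in a likelihood.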
References

[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313-337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297-322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247-258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421-7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298-7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629-636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607-616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51-75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383-393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820-823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49-71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095-15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1-38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84-88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69-74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115-129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463-3470, 2006.
4.2. The Probability Distribution of $Y_j$ ($j \geq 1$). To derive the probability distribution of $Y_j$ ($j \geq 1$) in the $j$th age group, let $Y_{ij}$ ($i = 0, 1, 2$) be the number of cancer cases generated by people who have genotype $J_i$ at the embryo stage among these $Y_j$ cancer cases. Then $Y_j = \sum_{i=0}^{2} Y_{ij}$, and $Y_{0j}$ is the number of cancer cases generated by the $n_{0j} = n_j - n_{1j} - n_{2j}$ normal people in the population. The conditional probability distribution of $Y_{ij}$ given $n_{ij}$ is

$$
Y_{ij} \mid n_{ij} \sim \mathrm{Poisson}\{n_{ij} Q_i(j)\}, \quad i = 0, 1, 2.
\tag{33}
$$

Notice that if $k = 2$ (a 2-stage model), then all $J_2$ individuals would develop tumors at or before birth. Thus, if $k = 2$, then $Y_{2j} = 0$ for all $j > 0$, so that if $j > 0$, cancer cases develop only from normal people ($N = J_0$ people) and $J_1$ people. On the other hand, if $k > 2$, then with positive probability $Y_{ij} > 0$ for all ($i = 0, 1, 2$; $j = 0, 1, \ldots, n_T$), where $n_T$ is the last time point in the data. Let $\delta_{2k} = 1$ if $k = 2$ and $\delta_{2k} = 0$ if $k \neq 2$. Then $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \mathrm{Poisson}\{Q_T(j)\}$, where $Q_T(j) = \sum_{i=0}^{1} n_{ij} Q_i(j) + (1 - \delta_{2k}) n_{2j} Q_2(j)$. Since $(n_{0j}, n_{1j}) \mid n_j \sim \mathrm{Multinomial}(n_j; p_0, p_1)$, we have for the conditional probability density function $P(y_j \mid n_j)$ of $Y_j$ given $n_j$

$$
P(y_j \mid n_j) = \sum_{n_{0j}=0}^{n_j} \sum_{n_{1j}=0}^{n_j - n_{0j}} g(n_{0j}, n_{1j}; n_j, p_0, p_1)\, h\{y_j; Q_T(j)\},
\tag{34}
$$

where $g(n_{0j}, n_{1j}; n_j, p_0, p_1)$ is the probability density function of $(n_{0j}, n_{1j}) \mid n_j \sim \mathrm{Multinomial}(n_j; p_0, p_1)$ and $h\{y_j; Q_T(j)\}$ is the probability density function of $Y_j \mid (n_{ij}, i = 0, 1, 2) \sim \mathrm{Poisson}\{Q_T(j)\}$.

The probability density function $P(y_j \mid n_j)$ given by (34) is a mixture of Poisson probability density functions, with mixing probability density function given by the multinomial probability distribution of $\{n_{0j}, n_{1j}\}$ given $n_j$. This mixing probability density function represents individuals with different genotypes at the embryo stage in the population.
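The mixture in (34) can be evaluated by brute force when $n_j$ is small, which may help to make its structure concrete. The Python sketch below is illustrative only: the values of $n_j$, the genotype probabilities $(p_0, p_1, p_2)$ with $p_2 = 1 - p_0 - p_1$, and the rates $Q_i(j)$ are made up, all probabilities are assumed strictly positive, and the function names are hypothetical. The double sum is quadratic in $n_j$, so it is not meant for population-sized $n_j$.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability h{y; mean}."""
    if mean == 0.0:
        return 1.0 if y == 0 else 0.0
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def multinomial_pmf(counts, probs):
    """Multinomial probability g of the genotype counts (probs must be positive)."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)


def mixture_pmf(y_j, n_j, probs, q, k=3):
    """P(y_j | n_j): Poisson densities mixed over multinomial genotype counts."""
    total = 0.0
    for n0 in range(n_j + 1):
        for n1 in range(n_j - n0 + 1):
            n2 = n_j - n0 - n1
            # Q_T(j); the J_2 term is dropped when k = 2, as in the text.
            mean = n0 * q[0] + n1 * q[1] + (n2 * q[2] if k > 2 else 0.0)
            total += multinomial_pmf((n0, n1, n2), probs) * poisson_pmf(y_j, mean)
    return total


if __name__ == "__main__":
    probs, q = [0.90, 0.08, 0.02], [0.01, 0.1, 0.5]
    print([mixture_pmf(y, 12, probs, q) for y in range(4)])
```

A simple sanity check on such a routine is that the probabilities sum to one over $y_j$ and that the mean equals $n_j \sum_i p_i Q_i(j)$ when $k > 2$.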
Let $\Theta$ be the set of all unknown parameters (i.e., the parameters ($p_1, p_2, \alpha$) and the birth rates, the death rates, and the mutation rates of $J_j$ cells). Based on data ($y_j$, $j = 0, 1, \ldots, n_T$), the likelihood function of $\Theta$ is

$$
L\{\Theta \mid y_j, j = 0, 1, \ldots, n_T\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} P(y_j \mid n_j).
\tag{35}
$$

Notice that because the mutation rates are very small, one may practically assume $\beta_i(t) = \beta_i$ for $i = 0, 1, \ldots, k-1$. Also, because the stage-limiting genes are basically tumor suppressor genes which act recessively (see Tan [3], Weinberg [6], and Tan et al. [5, 8, 23]), one may practically assume $\{b_i(t) = b_i, d_i(t) = d_i, \gamma_i(t) = b_i - d_i = \gamma_i, i = 0, 1, \ldots, k-1\}$ (see Tan et al. [4, 5, 8]).
4.3. The Joint Probability Distribution of Augmented Variables and Cancer Incidence. To apply the mixture distribution of $Y_j$ in (34) to make inference about the unknown parameters, one needs to expand the model to include the unobservable augmented variables $\{n_{0j}, n_{1j}, y_{0j}, y_{1j}, j = 1, \ldots, t_N\}$ and derive the joint probability distribution of these variables. For these purposes, observe that for $j = 1, \ldots, t_N$ and for $k > 2$, the conditional probability distribution $P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1)$ given $(Y_j = y_j, n_{ij}, i = 0, 1, n_j)$ is

$$
(y_{ij}, i = 0, 1) \mid (y_j, n_{ij}, i = 0, 1, n_j) \sim \mathrm{Multinomial}\left\{ y_j; \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1 \right\}.
\tag{36}
$$

Since the conditional probability distribution of $Y_j$ given $(n_{ij}, i = 0, 1, 2)$ for $k > 2$ is Poisson with mean $Q_T(j)$, we have for the joint conditional probability density function $P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}$ of $(Y_{ij}, i = 0, 1, Y_j)$ given $(n_{ij}, i = 0, 1, n_j)$

$$
\begin{aligned}
P\{y_{ij}, i = 0, 1, y_j \mid n_{ij}, i = 0, 1, n_j\}
&= P\{y_{ij}, i = 0, 1 \mid y_j, n_{ij}, i = 0, 1, 2\} \times P\{y_j \mid n_{ij}, i = 0, 1, 2\} \\
&= \frac{1}{y_j!} e^{-Q_T(j)} [Q_T(j)]^{y_j} \times g\left\{ y_{0j}, y_{1j}; y_j, \frac{n_{ij} Q_i(j)}{Q_T(j)}, i = 0, 1 \right\} \\
&= \prod_{i=0}^{2} h\{y_{ij}; n_{ij} Q_i(j)\} \quad (k > 2),
\end{aligned}
\tag{37}
$$

where $y_{2j} = y_j - \sum_{u=0}^{1} y_{uj}$ and $n_{2j} = n_j - \sum_{u=0}^{1} n_{uj}$.
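The last equality in (37) is the standard fact that a Poisson total split multinomially is equivalent to independent Poisson components. A small numerical check in Python is sketched below; the means stand in for $n_{ij} Q_i(j)$ and are arbitrary made-up values, and the function names are hypothetical.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability h{y; mean} for mean > 0."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def multinomial_pmf(counts, probs):
    """Multinomial probability of the counts (probs must be positive)."""
    log_p = math.lgamma(sum(counts) + 1)
    for c, p in zip(counts, probs):
        log_p += c * math.log(p) - math.lgamma(c + 1)
    return math.exp(log_p)


def joint_via_conditioning(y, means):
    """Poisson density of the total times multinomial density of the split."""
    total_mean = sum(means)
    fractions = [m / total_mean for m in means]
    return poisson_pmf(sum(y), total_mean) * multinomial_pmf(y, fractions)


def joint_via_product(y, means):
    """Product of independent Poisson densities, one per genotype."""
    return math.prod(poisson_pmf(yi, mi) for yi, mi in zip(y, means))


if __name__ == "__main__":
    y, means = (3, 1, 2), (2.0, 0.5, 1.5)
    print(joint_via_conditioning(y, means), joint_via_product(y, means))
```

The two printed values agree up to rounding, which is why the augmented likelihood factorizes into a product over genotypes.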
If $k = 2$, then $Y_{2j} = 0$ so that $Y_j = \sum_{i=0}^{1} Y_{ij}$. Thus we have, for $k = 2$,

$$Y_j \mid (n_{ij}, i = 0, 1, n_j) \sim \text{Poisson}\left\{\sum_{i=0}^{1} n_{ij} Q_i(j)\right\}, \tag{38}$$

$$Y_{ij} \mid (Y_j = y_j, n_{ij}, i = 0, 1, n_j) \sim \text{Binomial}\left\{y_j;\ \frac{n_{ij} Q_i(j)}{\sum_{u=0}^{1} n_{uj} Q_u(j)}\right\}, \quad i = 0, 1. \tag{39}$$
It follows that if $k = 2$, then $\sum_{i=0}^{1} Y_{ij} = Y_j$ and the joint probability density function of $(Y_{0j}, Y_j)$ given $(n_{uj}, u = 0, 1, 2)$ is

$$\begin{aligned}
P\{y_{0j}, y_j \mid n_{uj}, u = 0, 1, n_j\}
&= P\{y_j \mid n_{ij}, i = 0, 1, n_j\} \times P\{y_{0j} \mid y_j, n_{uj}, u = 0, 1, n_j\} \\
&= \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}, \quad \text{if } k = 2,
\end{aligned} \tag{40}$$

where $y_j = \sum_{i=0}^{1} y_{ij}$ and $n_j = \sum_{i=0}^{2} n_{ij}$.
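As a rough illustration of how the conditional distribution (39) can be used to generate the unobserved split of $y_j$ into genotype classes, the sketch below allocates each observed case to a class with probability proportional to its Poisson mean $n_{ij} Q_i(j)$. The function name `split_cases` and all numerical values are hypothetical; this is only one simple way to draw such a sample, not necessarily the one used by the authors.

```python
import random

def split_cases(y_total, means, rng):
    """Allocate y_total cases to classes with probabilities proportional to `means`."""
    total = sum(means)
    counts = [0] * len(means)
    for _ in range(y_total):
        u = rng.random() * total
        acc = 0.0
        for i, m in enumerate(means):
            acc += m
            if u < acc:
                counts[i] += 1
                break
        else:
            counts[-1] += 1   # guard against floating-point round-off
    return counts

rng = random.Random(0)
# Hypothetical values of n_0j * Q_0(j) and n_1j * Q_1(j) for one age group (k = 2).
means = [3.0, 1.0]
y0, y1 = split_cases(40, means, rng)
```

By construction `y0 + y1` equals the observed total, so the generated pair respects the constraint $y_j = \sum_{i=0}^{1} y_{ij}$ stated after (40).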
Put $\mathbf{Y} = \{y_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij}, i = 0, 1, j = 1, \ldots, t_N\}$, $\underset{\sim}{n} = \{n_j, j = 0, 1, \ldots, t_N\}$, and $\underset{\sim}{y} = \{y_j, j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \underset{\sim}{y})$ given $(\mathbf{N}, \underset{\sim}{n})$:

$$P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given $(\underset{\sim}{n}, \Theta)$ is

$$P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\} = h(y_0;\ \chi(k)) \prod_{j=1}^{t_N} g(n_{0j}, n_{1j};\ n_j, p_0, p_1) \times \left\{h[y_{2j};\ n_{2j} Q_2(j)]\right\}^{1 - \delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij};\ n_{ij} Q_i(j)\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is

$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where

$$D_0(k) = 2\left\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\right\}, \tag{44}$$

$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \left\{n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i}\right\}, \tag{45}$$

$$D_j = 2 \sum_{i=0}^{1} \left\{n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}}\right\} + 2 (1 - \delta_{2k}) \times \left\{n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}}\right\}, \tag{46}$$

where $\hat{p}_i = \left(\sum_{j=0}^{t_N} n_{ij}\right) / \left(\sum_{j=0}^{t_N} n_j\right)$ $(i = 0, 1, 2)$.
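To make the structure of (46) concrete, the following sketch evaluates the Poisson deviance contribution of one age group. The function names and the argument `include_third_class` (standing in for the factor $1 - \delta_{2k}$) are hypothetical, and the convention that a term with $y = 0$ contributes $2 \times$ mean is the usual limiting value, stated here as an assumption.

```python
import math

def poisson_deviance_term(y, mean):
    """2 * {mean - y - y * log(mean / y)}; the y = 0 limit is taken as 2 * mean."""
    if y == 0:
        return 2.0 * mean
    return 2.0 * (mean - y - y * math.log(mean / y))

def d_j(y, n, q, include_third_class=True):
    """D_j of (46) for one age group; y, n, q hold the entries for i = 0, 1, 2.

    include_third_class=False corresponds to k = 2, where the third term drops out.
    """
    classes = 3 if include_third_class else 2
    return sum(poisson_deviance_term(y[i], n[i] * q[i]) for i in range(classes))

# Illustrative numbers only: the fitted means equal the counts, so D_j vanishes.
perfect_fit = d_j([2, 1, 0], [100, 10, 1], [0.02, 0.1, 0.0])
# A fitted mean of 4 against an observed count of 3 gives a positive contribution.
imperfect = poisson_deviance_term(3, 4.0)
```

Each term is zero when the fitted mean equals the observed count and positive otherwise, so minimizing the sum of the $D_j$ pulls the fitted means $n_{ij} Q_i(j)$ toward the generated counts $y_{ij}$.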
The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of $(\mathbf{Y}, \underset{\sim}{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3-5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last-stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3-5]), $Q_i(j)$ is approximated by

$$Q_i(j) \approx E\left\{e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)}\right\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s, i)$.

Under discrete time approximation, the $E[J_{k-1}(t, i)]$'s have been derived in the appendix. Using these results for the expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain

$$\begin{aligned}
\beta_{k-1} E[G_i(t)] &= \sum_{r=i}^{i+1} E[J_r(t_0, i)] \left(\prod_{u=r}^{k-1} \beta_u\right) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s - t_0} \\
&= \theta_{(i+1)(k-1)} \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)} \phi_{i(k-1)}(t),
\end{aligned} \tag{48}$$
where $(\theta_{i(k-1)}, i = 1, 2, 3)$ and $(\lambda_{i(k-1)}, i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ $(i = 0, 1, 2, 3)$'s are given by

$$\phi_{i(k-1)}(t) = \left(\prod_{u=i+1}^{k-1} \gamma_u\right) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \left\{(1 + \gamma_v)^{t - t_0} - 1\right\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$\phi_{0(k-1)}(t) = \left(\prod_{u=1}^{k-1} \gamma_u\right) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2} \left\{(1 + \gamma_v)^{t - t_0} - 1 - (t - t_0) \gamma_v\right\}. \tag{50}$$
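The factor $\{(1 + \gamma_v)^{t - t_0} - 1\}/\gamma_v$ appearing in (49) is the closed form of the inner sum over $s$ in (48). The short sketch below compares the closed form with the direct sum and shows the limiting value when the proliferation rate is zero, which is why the case $\gamma_0 = 0$ needs the separate treatment in (50). Here `n` stands for $t - t_0$; the function names and the test values are illustrative assumptions.

```python
def geometric_sum_direct(gamma, n):
    """Sum of (1 + gamma)**m for m = 0, ..., n - 1."""
    return sum((1.0 + gamma) ** m for m in range(n))

def geometric_sum_closed(gamma, n):
    """Closed form {(1 + gamma)**n - 1} / gamma, with the gamma = 0 limit n."""
    if gamma == 0.0:
        return float(n)
    return ((1.0 + gamma) ** n - 1.0) / gamma

# Illustrative proliferation rate and number of time units.
direct = geometric_sum_direct(0.03, 50)
closed = geometric_sum_closed(0.03, 50)
```

For small but nonzero `gamma` the closed form approaches `n`, in line with the degenerate case handled separately in the text.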
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for $(i = 0, 1,\ j > 0)$, and for $j > 0$:

$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11} \phi_{11}(t_{j-1}) - \lambda_{01} \phi_{01}(t_{j-1})} - e^{-\theta_{11} \phi_{11}(t_j) - \lambda_{01} \phi_{01}(t_j)} + o(\beta_1), \\
Q_1(j) &\approx (1 - \alpha_1) \left\{e^{-\lambda_{11} \phi_{11}(t_{j-1})} - e^{-\lambda_{11} \phi_{11}(t_j)}\right\} + o(\beta_1).
\end{aligned} \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)} \alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$:

$$\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)} \phi_{1(k-1)}(t_j) - \lambda_{0(k-1)} \phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)} \phi_{2(k-1)}(t_j) - \lambda_{1(k-1)} \phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)} (1 - \alpha) \left\{e^{-\lambda_{22} \phi_{22}(t_{j-1})} - e^{-\lambda_{22} \phi_{22}(t_j)}\right\} \\
&\quad + (1 - \delta_{2(k-1)}) \left\{e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)} \phi_{3(k-1)}(t_j) - \lambda_{2(k-1)} \phi_{2(k-1)}(t_j)}\right\} + o(\beta_{k-1}).
\end{aligned} \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t - t_0}$ by $[1 + \gamma_v]^{t - t_0} = e^{(t - t_0) \log(1 + \gamma_v)} \approx e^{(t - t_0) \gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model, as given in equations (23)-(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last-stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
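Two features of the approximations above can be checked numerically: each $Q_i(j)$ is a difference of survival-type terms at consecutive times, so the interval probabilities telescope, and the discrete growth factor $(1 + \gamma)^{t - t_0}$ is close to its continuous counterpart $e^{(t - t_0)\gamma}$ when $\gamma$ is small. The sketch below illustrates both points; the list `H` of cumulative exponents (standing for quantities such as $\theta \phi(t_j) + \lambda \phi(t_j)$) and all numbers are hypothetical.

```python
import math

def interval_probabilities(cum_exponent):
    """Q(j) = exp(-H(t_{j-1})) - exp(-H(t_j)) for a list H(t_0), H(t_1), ..."""
    return [math.exp(-cum_exponent[j - 1]) - math.exp(-cum_exponent[j])
            for j in range(1, len(cum_exponent))]

def growth_discrete(gamma, n):
    """Discrete-time growth factor (1 + gamma)**n with n = t - t0."""
    return (1.0 + gamma) ** n

def growth_continuous(gamma, n):
    """Continuous-time counterpart exp(n * gamma)."""
    return math.exp(n * gamma)

# Hypothetical cumulative exponents at t_0, t_1, t_2, t_3 (illustrative values).
H = [0.0, 0.001, 0.004, 0.010]
Q = interval_probabilities(H)

# The interval probabilities telescope to 1 - exp(-H at the last time).
total = sum(Q)
# Gap between the discrete and continuous growth factors for a small rate.
gap_small_rate = abs(growth_discrete(1e-5, 100) - growth_continuous(1e-5, 100))
```

For a rate of order $10^{-5}$ the two growth factors are practically indistinguishable, whereas for larger rates the gap grows with the number of time units, so the choice of time unit matters more for fast-proliferating stages.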
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence data, one may use the results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distribution of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ $(i = 1, 2)$.
(ii) For $\beta_j$ $(j = 0, 1, \ldots, k - 1)$, we let $1 < \omega_0 = N(t_0) \beta_0 < 1000$ $(N \to I_1)$ and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k - 1$.
(iii) For the $\lambda_{j(k-1)}$ $(j = 0, 1, 2)$, we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^3$.
(iv) For the $\theta_i$ $(i = 1, 2, 3)$, we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
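A flat prior restricted to a box, as in (53), can be coded as a log-prior that is zero inside the constraints and minus infinity outside. The sketch below is only an illustration of this idea: the dictionary keys are hypothetical names, the bounds are transcribed from constraints (i), (iii), and (iv), and the bounds on the $\beta_j$ and $\omega_0$ in (ii) are left out for brevity.

```python
import math

# Bounds transcribed from constraints (i), (iii), (iv); key names are hypothetical.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}

def log_prior(params):
    """Log of the flat prior up to a constant: 0 inside the box, -inf outside."""
    for name, (low, high) in BOUNDS.items():
        if not (low < params[name] < high):
            return -math.inf
    return 0.0

# An illustrative parameter point inside the box (values are made up).
inside = {"p1": 1e-3, "p2": 5e-7, "gamma1": 0.0, "gamma2": 0.03,
          "lambda0": 0.3, "lambda1": 0.5, "lambda2": 600.0,
          "theta1": 3e-4, "theta2": 0.05}
# The same point with p2 pushed above its upper bound.
outside = dict(inside, p2=2e-6)
```

Adding this log-prior to a log-likelihood leaves the likelihood unchanged inside the admissible region and rules out any candidate outside it, which is how the constraints enter the maximization steps of the fitting procedure.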
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain

$$\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)} [Q_i(j)]^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned} \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM-algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and with Steps 3 and 4 as the M-step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.
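Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.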
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1-3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
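The selection in Step 1 relies on weighted resampling: candidates are drawn from an easy proposal distribution and then resampled with probabilities proportional to the likelihood of the data given each candidate. The sketch below illustrates this idea on a deliberately simplified toy problem with a single carrier class and a binomial proposal (the paper uses a multinomial proposal over genotype classes). All names and numbers are hypothetical, and this is a sketch of the resampling idea, not the authors' implementation.

```python
import math
import random

def weighted_bootstrap(candidates, weight_fn, size, rng):
    """Resample candidates with probability proportional to weight_fn(candidate)."""
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0.0:
        raise ValueError("all candidate weights are zero")
    return rng.choices(candidates, weights=weights, k=size)

# Toy setting (all numbers hypothetical): carriers among n_total people,
# each carrier contributing q_carrier expected cases, y_observed cases seen.
rng = random.Random(1)
n_total, p_carrier = 200, 0.1
q_carrier, y_observed = 0.5, 15

def draw_carriers():
    """One draw from the Binomial(n_total, p_carrier) proposal."""
    return sum(rng.random() < p_carrier for _ in range(n_total))

def poisson_weight(n_carriers):
    """Poisson likelihood of y_observed given n_carriers carriers."""
    mean = n_carriers * q_carrier
    return math.exp(-mean) * mean ** y_observed / math.factorial(y_observed)

proposals = [draw_carriers() for _ in range(2000)]
selected = weighted_bootstrap(proposals, poisson_weight, 500, rng)
```

In this toy example the observed count is larger than what the proposal mean would predict, so the resampled carrier numbers shift upward relative to the proposals, which is the qualitative behavior one expects from conditioning on the data.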
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$, independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
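The final summary step, turning a collection of generated parameter vectors into point estimates and their variances and covariances, can be sketched as follows. The function names and the three illustrative draws are hypothetical; the unbiased divisor (number of draws minus one) is an assumption, since the text does not specify the divisor.

```python
def sample_mean(draws):
    """Component-wise mean of a list of equal-length parameter vectors."""
    dim = len(draws[0])
    return [sum(d[i] for d in draws) / len(draws) for i in range(dim)]

def sample_covariance(draws):
    """Sample covariance matrix (divisor m - 1) of the parameter vectors."""
    m = len(draws)
    mean = sample_mean(draws)
    dim = len(mean)
    return [[sum((d[a] - mean[a]) * (d[b] - mean[b]) for d in draws) / (m - 1)
             for b in range(dim)]
            for a in range(dim)]

# Three made-up draws of a two-component parameter vector.
draws = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
estimate = sample_mean(draws)          # [3.0, 6.0]
covariance = sample_covariance(draws)  # [[4.0, 8.0], [8.0, 16.0]]
```

The square roots of the diagonal entries of the covariance matrix give standard deviations of the kind reported alongside parameter estimates.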
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Therefore, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters, and (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    -1312.02         2648.04    2677.35
Three-stage, Model-1    -1295.76         2609.53    2631.51
Two-stage               -4392.421        8796.843   8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Horizontal axis: age in years, 0 to 84; vertical axis: incidences, 0 to 350; series: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df $= 86 - 12 = 74$) for Model-F and a $P$-value of 0.11 (df $= 86 - 7 = 79$) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^3$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ $(j = 0, 1, 2)$ of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^2 \sim 10^3$, respectively. Because $\lambda_i = E[J_i(t_0, i)] \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from biological observations, one can obtain a rough idea about the magnitude of $\beta_j$ $(j = 0, 1, 2)$. For example, if we follow Potten et al. [32] and assume $E[N(t_0)] = E[J_i(t_0, i)] \sim 10^8$, $i = 0, 1, 2$, then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population, the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
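The model comparison in observation (a) uses the AIC and BIC, which penalize the maximized log-likelihood by the number of fitted parameters. The sketch below shows the standard definitions; the log-likelihood, parameter count, and number of observations used in the example call are made-up illustrative values, not those of the fitted models.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: -2 log L + 2 p."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: -2 log L + p log(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical example: log-likelihood -100, 3 parameters, 50 observations.
example_aic = aic(-100.0, 3)        # 206.0
example_bic = bic(-100.0, 3, 50)
```

Smaller values indicate a better trade-off between fit and complexity, which is the sense in which the 3-stage models are preferred over the two-stage model above.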
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.

For using the proposed models to fit the cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines
Table 3: Estimates of parameters for the 3-stage stochastic models.

Model-F
Parameter   Estimate    StD         95% CL, lower   95% CL, upper
λ0          2.88E-01    1.21E-01    5.03E-02        5.25E-01
λ1          4.81E-01    4.65E-02    3.89E-01        5.72E-01
λ2          6.55E+02    1.12E+02    4.34E+02        8.75E+02
γ1          2.71E-05    8.86E-06    5.34E-06        4.01E-05
γ2          3.37E-02    1.92E-04    3.33E-02        3.41E-02
p1          9.95E-04    1.75E-06    9.91E-04        9.98E-04
p2          9.68E-07    6.48E-09    9.55E-07        9.81E-07
α           8.41E-01    4.19E-02    7.59E-01        9.23E-01
θ1          3.29E-04    3.54E-05    2.60E-04        3.98E-04
θ2          3.41E-01    1.07E-02    3.20E-01        3.62E-01

Model-1
Parameter   Estimate    StD         95% CL, lower   95% CL, upper
λ0          6.55E-02    2.54E-02    1.57E-02        1.15E-01
λ1          3.72E-01    4.63E-02    2.82E-01        4.63E-01
λ2          1.98E+02    3.38E+01    1.32E+02        2.65E+02
γ1          NA          NA          NA              NA
γ2          3.49E-02    3.58E-03    2.79E-02        4.19E-02
p1          9.98E-04    6.27E-05    8.75E-04        1.12E-03
p2          1.00E-06    4.53E-08    9.11E-07        1.09E-06
α           8.07E-01    7.06E-02    6.68E-01        9.45E-01
θ1          NA          NA          NA              NA
θ2          NA          NA          NA              NA

Note: NA: assumed nonexistence.
information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5); on the other hand, the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 9.948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as $\hat{\alpha} = 0.8411$, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploid insufficiency for this gene in cells with genotype $J_1$.

Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix

The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of state variables reduce to the following stochastic difference equations of state variables, respectively:

$$\begin{aligned}
J_i(t + 1, i) &= J_i(t, i) [1 + \gamma_i(t)] + e_i(t + 1, i), \quad i = 0, 1, \ldots, \text{Min}(k - 1, 2), \\
J_j(t + 1, u) &= J_j(t, u) [1 + \gamma_j(t)] + J_{j-1}(t, u) \beta_{j-1}(t) + e_j(t + 1, u), \quad 0 \le u < j \le k - 1,
\end{aligned} \tag{A.1}$$

where $e_i(t + 1, i) = [B_i(t, i) - J_i(t, i) b_i(t)] - [D_i(t, i) - J_i(t, i) d_i(t)]$ and $e_j(t + 1, i) = [B_j(t, i) - J_j(t, i) b_j(t)] - [D_j(t, i) - J_j(t, i) d_j(t)] + [M_{j-1}(t, i) - J_{j-1}(t, i) \beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0, i) > 0$ if $(j = i, i + 1)$ and $J_j(t_0, i) = 0$ if $j > i + 1$. The solution of the above difference equations under these initial conditions is given, respectively, by

$$\begin{aligned}
J_i(t, i) &= J_i(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s, i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \text{Min}(k - 1, 2), \\
J_j(t, i) &= J_j(t_0, i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s, i) \beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s, i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k - 1.
\end{aligned} \tag{A.2}$$
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ $(j = i, \ldots, k - 1)$, and if $\gamma_i \ne \gamma_j$ if $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0, i) = 0$, $j > i + 1$) reduce to

$$\begin{aligned}
J_i(t, i) &= J_i(t_0, i) (1 + \gamma_i)^{t - t_0} + \eta_i^{(0)}(t, i), \quad i = 0, 1, \ldots, \text{Min}(k - 1, 2), \\
J_j(t, i) &= J_j(t_0, i) (1 + \gamma_j)^{t - t_0} + \sum_{u=i}^{j-1} J_u(t_0, i) \left(\prod_{v=u}^{j-1} \beta_v\right) \sum_{r=u}^{j} A_{uj}(r) (1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \left(\prod_{r=j-u}^{j-1} \beta_r\right) \eta_j^{(u)}(t, i) \\
&= \sum_{u=i}^{i+1} J_u(t_0, i) \left(\prod_{v=u}^{j-1} \beta_v\right) \sum_{r=u}^{j} A_{uj}(r) (1 + \gamma_r)^{t - t_0} \\
&\quad + \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \left(\prod_{r=j-u}^{j-1} \beta_r\right) \eta_j^{(u)}(t, i), \quad 0 \le i < j \le k - 1,
\end{aligned} \tag{A.3}$$
where $\eta_j^{(0)}(t, i) = \sum_{s=t_0+1}^{t} e_j(s, i) (1 + \gamma_i)^{t - s}$ and $\eta_j^{(u)}(t, i) = \sum_{s=t_0+1}^{t} e_{j-u}(s, i) \sum_{v=u}^{j} A_{uj}(v) (1 + \gamma_v)^{t - s}$ $(u = 1, \ldots, j - i)$.
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ if $i \ne j$, the $E[J_{k-1}(t, i)]$'s in discrete time models under the initial conditions ($J_j(t_0, i) = 0$, $j > i+1$) are given, respectively, by
$$E[J_{k-1}(t, i)] = \sum_{u=i}^{i+1} E[J_u(t_0, i)] \Big(\prod_{v=u}^{k-2} \beta_v\Big) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2). \tag{A4}$$
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Put $\mathbf{Y} = \{y_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N\}$, $\mathbf{N} = \{n_{ij},\ i = 0, 1,\ j = 1, \ldots, t_N\}$, $\underset{\sim}{n} = \{n_j,\ j = 0, 1, \ldots, t_N\}$, and $\underset{\sim}{y} = \{y_j,\ j = 0, 1, \ldots, t_N\}$. From (37) and (40), we have for the conditional joint probability density function of $(\mathbf{Y}, \underset{\sim}{y})$ given $(\mathbf{N}, \underset{\sim}{n})$
$$P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \Big\{ \{h[y_{2j}; n_{2j} Q_2(j)]\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \Big\}. \tag{41}$$
It follows that the joint conditional probability density function of $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}\}$ given $(\underset{\sim}{n}, \Theta)$ is
$$P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\} = h(y_0; \chi(k)) \prod_{j=1}^{t_N} \Big\{ g(n_{0j}, n_{1j}; n_j, p_0, p_1) \times \{h[y_{2j}; n_{2j} Q_2(j)]\}^{1-\delta_{2k}} \times \prod_{i=0}^{1} h\{y_{ij}; n_{ij} Q_i(j)\} \Big\}. \tag{42}$$
Notice that the above probability density function is a product of multinomial probability density functions and Poisson probability density functions. For this joint probability density function, the deviance $\text{Dev} = -2\{\log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta] - \log P[\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \hat{\Theta}]\}$ is

$$\text{Dev} = D_0(k) + \text{Dev}(p_1, p_2) + \sum_{j=1}^{t_N} D_j, \tag{43}$$
where

$$D_0(k) = 2\Big\{\chi(k) - y_0 - y_0 \log \frac{\chi(k)}{y_0}\Big\}, \tag{44}$$

$$\text{Dev}(p_1, p_2) = 2 \sum_{j=1}^{t_N} \Big\{ n_{0j} \log \frac{\hat{p}_0}{1 - p_1 - p_2} + \sum_{i=1}^{2} n_{ij} \log \frac{\hat{p}_i}{p_i} \Big\}, \tag{45}$$

$$D_j = 2 \sum_{i=0}^{1} \Big\{ n_{ij} Q_i(j) - y_{ij} - y_{ij} \log \frac{n_{ij} Q_i(j)}{y_{ij}} \Big\} + 2(1 - \delta_{2k}) \times \Big\{ n_{2j} Q_2(j) - y_{2j} - y_{2j} \log \frac{n_{2j} Q_2(j)}{y_{2j}} \Big\}, \tag{46}$$

where $\hat{p}_i = \big(\sum_{j=0}^{t_N} n_{ij}\big) / \big(\sum_{j=0}^{t_N} n_j\big)$ ($i = 0, 1, 2$).
The joint probability density function $P\{\mathbf{Y}, \underset{\sim}{y}, \mathbf{N} \mid \underset{\sim}{n}, \Theta\}$ of $(\mathbf{Y}, \underset{\sim}{y}, \mathbf{N})$ given by (42) will be used as the kernel for the Bayesian method to estimate the unknown parameters and to predict the state variables.
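The deviance pieces in (44)-(46) have the familiar Poisson and multinomial forms, and a minimal sketch of how they might be evaluated is given below. The function names are illustrative, and the convention that $y \log(\mu / y)$ is taken as zero when $y = 0$ is an assumption (the usual limiting value), not something stated in the text.

```python
import math

def poisson_deviance(y, mu):
    """One Poisson term of the form 2 * {mu - y - y * log(mu / y)}.

    y is an observed count and mu its fitted mean, e.g. mu = n_ij * Q_i(j).
    Assumption: the y * log(mu / y) term is taken as 0 when y == 0.
    """
    if mu <= 0:
        raise ValueError("the fitted mean must be positive")
    log_term = y * math.log(mu / y) if y > 0 else 0.0
    return 2.0 * (mu - y - log_term)

def multinomial_deviance(counts, probs):
    """Multinomial term 2 * sum_i n_i * log(p_hat_i / p_i).

    counts are category totals (already summed over the time index j),
    probs are the model probabilities, and p_hat_i = n_i / sum(counts).
    """
    total = sum(counts)
    deviance = 0.0
    for n, p in zip(counts, probs):
        if n > 0:
            deviance += n * math.log((n / total) / p)
    return 2.0 * deviance

# A perfect fit gives zero deviance in both cases.
assert poisson_deviance(5, 5.0) == 0.0
assert abs(multinomial_deviance([50, 30, 20], [0.5, 0.3, 0.2])) < 1e-12
```

Because the category proportions do not depend on $j$, summing the counts over $j$ before calling `multinomial_deviance` gives the same value as the double sum in (45).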
4.4. Fitting of the Model to Cancer Incidence Data. To fit the model to real data, as in Tan [3–5], we let $\Delta t \sim 1$ correspond to a fixed time interval such as 6 months in human cancer studies (Tan et al. [4] assumed 3 months as one time unit, while Luebeck and Moolgavkar [26] assumed one year as one time unit). Then, because the proliferation rate of the last-stage cells is quite large, one may practically assume $P_T(s, t) = 1$ for $t - s \ge 1$. Hence, noting that $\beta_{k-1}(t) = \beta_{k-1}$ is usually very small (see [3–5]), $Q_i(j)$ is approximated by
$$Q_i(j) \approx E\big\{ e^{-\beta_{k-1} G_i(t_{j-1})} - e^{-\beta_{k-1} G_i(t_j)} \big\} = e^{-\beta_{k-1} E[G_i(t_{j-1})]} - e^{-\beta_{k-1} E[G_i(t_j)]} + o(\beta_{k-1}), \tag{47}$$
where $G_i(t) = \sum_{s=t_0}^{t-1} J_{k-1}(s, i)$.

Under discrete time approximation, the $E[J_{k-1}(t, i)]$'s have been derived in the Appendix. Using these results of expected numbers and using the result $\sum_{i=0}^{t-1} a^i = (a^t - 1)/(a - 1)$ for $a \ne 1$, we obtain
$$\beta_{k-1} E[G_i(t)] = \sum_{r=i}^{i+1} E[J_r(t_0, i)] \Big(\prod_{u=r}^{k-1} \beta_u\Big) \times \sum_{v=r}^{k-1} A_{r(k-1)}(v) \sum_{s=t_0}^{t-1} (1 + \gamma_v)^{s-t_0} = \theta_{(i+1)(k-1)}\, \phi_{(i+1)(k-1)}(t) + \lambda_{i(k-1)}\, \phi_{i(k-1)}(t), \tag{48}$$
where $(\theta_{i(k-1)},\ i = 1, 2, 3)$ and $(\lambda_{i(k-1)},\ i = 0, 1, 2)$ are defined in Section 3.3 and where the $\phi_{i(k-1)}(t)$ ($i = 0, 1, 2, 3$)'s are given by
$$\phi_{i(k-1)}(t) = \Big(\prod_{u=i+1}^{k-1} \gamma_u\Big) \sum_{v=i}^{k-1} A_{i(k-1)}(v) \times \frac{1}{\gamma_v} \big\{(1 + \gamma_v)^{t-t_0} - 1\big\}, \quad i = 0, 1, 2, 3. \tag{49}$$
Notice that if $\gamma_0 = 0$, then $\phi_{0(k-1)}(t)$ reduces to

$$\phi_{0(k-1)}(t) = \Big(\prod_{u=1}^{k-1} \gamma_u\Big) \sum_{v=1}^{k-1} A_{1(k-1)}(v) \times \frac{1}{\gamma_v^2} \big\{(1 + \gamma_v)^{t-t_0} - 1 - (t - t_0)\gamma_v\big\}. \tag{50}$$
Applying these results for time homogeneous models with $\gamma_i \ne \gamma_j$ if $i \ne j$, the $Q_i(j)$'s under discrete approximation are given as follows.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$, $j > 0$), and for $j > 0$:

$$Q_0(j) \approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1),$$

$$Q_1(j) \approx (1 - \alpha_1)\big\{ e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)} \big\} + o(\beta_1). \tag{51}$$
(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\alpha$, $Q_i(0) = 0$, $i = 0, 1$, and for $j > 0$:

$$Q_0(j) \approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}),$$

$$Q_1(j) \approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}),$$

$$Q_2(j) \approx \delta_{2(k-1)}(1 - \alpha)\big\{ e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)} \big\} + (1 - \delta_{2(k-1)})\big\{ e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)} \big\} + o(\beta_{k-1}). \tag{52}$$
Notice that if one replaces $[1 + \gamma_v]^{t-t_0}$ by $[1 + \gamma_v]^{t-t_0} = e^{(t-t_0)\log(1+\gamma_v)} \approx e^{(t-t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as from the continuous model, respectively, as given in equations (23)–(27) under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last-stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
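The quality of the approximation $(1 + \gamma_v)^{t-t_0} \approx e^{(t-t_0)\gamma_v}$ depends on the size of $\gamma_v$ and on the number of time steps, which the following sketch makes concrete. The two growth rates are the orders of magnitude reported for $\gamma_1$ and $\gamma_2$ in Table 3; the choice of 170 steps (85 years at two steps per year) and the function name are assumptions made for illustration only.

```python
import math

def relative_gap(gamma, steps):
    """Relative difference between exp(gamma * steps) and (1 + gamma) ** steps."""
    discrete = (1.0 + gamma) ** steps
    continuous = math.exp(gamma * steps)
    return (continuous - discrete) / discrete

# Assumed horizon: 170 half-year steps.
small_rate_gap = relative_gap(2.71e-5, 170)   # rate of the order of the gamma_1 estimate
larger_rate_gap = relative_gap(3.37e-2, 170)  # rate of the order of the gamma_2 estimate

# For the small rate the two expressions are numerically indistinguishable;
# for the larger rate the gap over this horizon is on the order of ten percent.
assert small_rate_gap < 1e-6
assert 0.05 < larger_rate_gap < 0.15
```

Under these assumed inputs, the discrete and continuous forms agree closely when the growth rate is very small, while for larger rates over long horizons the agreement is only approximate.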
5. The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure

Given the model in Sections 2 and 3 and cancer incidence, one may use results in Section 4 to fit the model. By using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$ in (42). It follows that this inference procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2), and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distributions of $\Theta$, because biological information has suggested some lower bounds and upper bounds for the mutation rates and for the proliferation rates, we assume

$$P(\Theta) \propto c \quad (c > 0), \tag{53}$$

where $c$ is a positive constant if these parameters satisfy some biologically specified constraints, and $P(\Theta)$ is equal to zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k-1$), we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.
We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
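A prior of this kind is flat inside a box and zero outside it, so on the log scale it contributes a constant inside the constraints and minus infinity elsewhere. The sketch below encodes constraints (i), (iii), and (iv) in that way; the dictionary keys, the function name, and the omission of constraint (ii) are illustrative choices, not part of the original procedure.

```python
import math

# Open-interval bounds taken from constraints (i), (iii), and (iv).
# Constraint (ii) on the beta_j is left out of this sketch for brevity.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}

def log_prior(params):
    """Log of the flat prior up to an additive constant.

    Returns 0.0 when every parameter lies strictly inside its bounds
    and -inf when any constraint is violated.
    """
    for name, (low, high) in BOUNDS.items():
        if not (low < params[name] < high):
            return -math.inf
    return 0.0

# Hypothetical parameter values, chosen only to exercise the function.
inside = {
    "p1": 5e-4, "p2": 5e-7, "gamma1": 0.0, "gamma2": 0.03,
    "lambda0": 0.3, "lambda1": 0.5, "lambda2": 600.0,
    "theta1": 3e-4, "theta2": 0.05,
}
assert log_prior(inside) == 0.0
assert log_prior(dict(inside, p1=0.5)) == -math.inf
```

With such a prior, maximizing the posterior inside the box is the same as maximizing the likelihood there, which is why the later steps can be phrased in terms of deviances.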
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain

$$P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} \propto (\chi(k))^{y_0} e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1;$$

$$P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\} \propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)}\, Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega, \tag{54}$$
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we will use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that numerically the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and with Steps 3 and 4 as the M-step, respectively [29]. These multilevel Gibbs sampling procedures are given by the following.
Step 1 (Generating $\mathbf{N}$ Given $(\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ even though the latter is unknown. (For proof, see Tan [22], Chapter 3.) Call the generated sample $\hat{\mathbf{N}}$.
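Step 2 (Generating $\mathbf{Y}$ Given $(\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta)$ (The Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.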
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \text{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode as $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
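The weighted bootstrap used in Step 1 can be sketched generically as follows: candidates are drawn from a distribution that is easy to sample, each candidate is weighted by the likelihood of the data given that candidate, and the final sample is drawn from the candidates with probabilities proportional to those weights. The function names, the toy Poisson weight, and the numerical inputs are hypothetical; in the procedure above the candidates would be multinomial draws of $\mathbf{N}$ and the weights would come from (37) and (40).

```python
import math
import random

def weighted_bootstrap(candidates, weight_fn, size, rng=None):
    """Resample candidates with probability proportional to weight_fn(candidate)."""
    rng = rng or random.Random()
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0:
        raise ValueError("at least one candidate needs a positive weight")
    return rng.choices(candidates, weights=weights, k=size)

def poisson_pmf(y, mu):
    """Poisson probability of observing count y when the mean is mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

# Toy illustration with made-up numbers: candidate group sizes n, one observed
# count y = 3, and an assumed per-person probability q = 0.01.
rng = random.Random(0)
candidate_sizes = [rng.randint(100, 600) for _ in range(2000)]
selected = weighted_bootstrap(
    candidate_sizes, lambda n: poisson_pmf(3, 0.01 * n), size=200, rng=rng
)
assert len(selected) == 200
assert all(100 <= n <= 600 for n in selected)
```

The resampled values are only approximately distributed according to the target conditional distribution, and the approximation improves as the number of candidates grows.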
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are the generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$ independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedures, one then generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
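The final summary step, taking the sample mean and the sample variances and covariances of the generated parameter vectors, can be sketched as below. The function name and the toy draws are illustrative; the divisor $m - 1$ is the usual unbiased choice and is an assumption, since the text does not specify it.

```python
def sample_mean_and_cov(draws):
    """Return the sample mean vector and sample covariance matrix of the draws.

    draws is a list of equally long parameter vectors, one per generated
    value of Theta. The covariance uses the divisor (m - 1).
    """
    m = len(draws)
    d = len(draws[0])
    mean = [sum(row[i] for row in draws) / m for i in range(d)]
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in draws) / (m - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov

# Toy draws of a two-dimensional parameter vector.
mean, cov = sample_mean_and_cov([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
assert mean == [3.0, 6.0]
assert cov == [[4.0, 8.0], [8.0, 16.0]]
```

The diagonal of the returned matrix gives the variance estimates, and its square roots correspond to the standard deviations of the kind reported alongside parameter estimates.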
6. A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example

The human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is the retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells) which reside within the uvea, giving color to the eye. In Tan and Zhou [9], we have developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma as given in Figure 2. As an example of applications of this paper, in this section we will apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. We notice that the same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, because the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]), to account for inherited cancer cases of uveal melanomas, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we will consider the following 2- and 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We will apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    −1312.02         2648.04    2677.35
Three-stage, Model-1    −1295.76         2609.53    2631.51
Two-stage               −4392.421        8796.843   8811.569
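The AIC and BIC columns follow from the log-likelihood through the standard definitions $\text{AIC} = 2p - 2\log L$ and $\text{BIC} = p \log n - 2\log L$. The sketch below applies these definitions to the Model-F row; the parameter count $p = 12$ and the number of observations $n = 85$ are assumptions inferred from the degrees of freedom and the number of age groups quoted in the text, not values stated with the table.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2p - 2 log L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: p log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Model-F row, with assumed p = 12 parameters and n = 85 observations.
model_f_aic = aic(-1312.02, 12)
model_f_bic = bic(-1312.02, 12, 85)
assert abs(model_f_aic - 2648.04) < 0.01
assert abs(model_f_bic - 2677.35) < 0.01
```

Under these assumed inputs the standard formulas reproduce the tabulated Model-F values to two decimals; for both criteria, smaller values indicate the preferred model.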
Figure 5: Curve fitting of SEER data by the Model-F, Model-1, and the two-stage model (horizontal axis: age in years, 0 to 84; vertical axis: incidences, 0 to 350; series: Observed, Model-F, Model-1, Two-stage).
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better as judged by the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic value for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F, respectively; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models, respectively. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$, further confirming that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0, i)]\, \beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0, i)]$, $i = 0, 1, 2$, from some biological observations, one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume ($E[N(t_0)] = E[J_i(t_0, i)] \sim 10^{8}$, $i = 0, 1, 2$), then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population, the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
7. Discussion and Conclusions

To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure has advantages over classical sampling theory inference (i.e., the maximum likelihood method) because it combines information from three sources: (1) previous information and experience about the parameters, in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model of the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40).

Table 3: Estimates of parameters for the 3-stage stochastic models.

| Model | Statistic | $\lambda_0$ | $\lambda_1$ | $\lambda_2$ | $\gamma_1$ | $\gamma_2$ | $p_1$ | $p_2$ | $\alpha$ | $\theta_1$ | $\theta_2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Model-F | Estimates | 2.88E-01 | 4.81E-01 | 6.55E+02 | 2.71E-05 | 3.37E-02 | 9.95E-04 | 9.68E-07 | 8.41E-01 | 3.29E-04 | 3.41E-01 |
| Model-F | StD | 1.21E-01 | 4.65E-02 | 1.12E+02 | 8.86E-06 | 1.92E-04 | 1.75E-06 | 6.48E-09 | 4.19E-02 | 3.54E-05 | 1.07E-02 |
| Model-F | 95% CL, lower | 5.03E-02 | 3.89E-01 | 4.34E+02 | 5.34E-06 | 3.33E-02 | 9.91E-04 | 9.55E-07 | 7.59E-01 | 2.60E-04 | 3.20E-01 |
| Model-F | 95% CL, upper | 5.25E-01 | 5.72E-01 | 8.75E+02 | 4.01E-05 | 3.41E-02 | 9.98E-04 | 9.81E-07 | 9.23E-01 | 3.98E-04 | 3.62E-01 |
| Model-1 | Estimates | 6.55E-02 | 3.72E-01 | 1.98E+02 | NA | 3.49E-02 | 9.98E-04 | 1.00E-06 | 8.07E-01 | NA | NA |
| Model-1 | StD | 2.54E-02 | 4.63E-02 | 3.38E+01 | NA | 3.58E-03 | 6.27E-05 | 4.53E-08 | 7.06E-02 | NA | NA |
| Model-1 | 95% CL, lower | 1.57E-02 | 2.82E-01 | 1.32E+02 | NA | 2.79E-02 | 8.75E-04 | 9.11E-07 | 6.68E-01 | NA | NA |
| Model-1 | 95% CL, upper | 1.15E-01 | 4.63E-01 | 2.65E+02 | NA | 4.19E-02 | 1.12E-03 | 1.09E-06 | 9.45E-01 | NA | NA |

Note: NA, assumed nonexistence.

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5); in contrast, the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_T(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically: (1) We have estimated, for the first time, the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 9.948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\hat{\gamma}_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene on chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can readily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.

The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solutions of the above difference equations under these initial conditions are given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
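The relation between the recursion (A1) and the solution (A2) can be checked numerically. The following is a minimal sketch, assuming that the noise terms $e_j$ have expectation zero (so that they drop out of the expected values) and that time is indexed from $t_0 = 0$; the function names and the two-stage (parent and child) restriction are illustrative choices, not part of the paper.

```python
from math import prod


def iterate_expected(parent0, child0, gamma_child, beta_parent, gamma_parent, steps):
    """Iterate the mean version of (A1) for a parent stage j-1 and its child stage j.

    Assumption: the noise terms e have mean zero and are therefore omitted.
    The rate lists are indexed by time step s = 0, ..., steps - 1.
    """
    parent, child = parent0, child0
    for t in range(steps):
        # Both right-hand sides use the values at time t (simultaneous update).
        parent, child = (
            parent * (1.0 + gamma_parent[t]),
            child * (1.0 + gamma_child[t]) + parent * beta_parent[t],
        )
    return parent, child


def closed_form_child(parent0, child0, gamma_child, beta_parent, gamma_parent, steps):
    """Evaluate the mean version of the second line of (A2) directly."""
    total = child0 * prod(1.0 + gamma_child[s] for s in range(steps))
    for s in range(steps):
        # Expected parent count at time s (first line of (A2) without noise).
        parent_s = parent0 * prod(1.0 + gamma_parent[r] for r in range(s))
        # Growth of the newly mutated cells from time s + 1 up to time steps.
        carry = prod(1.0 + gamma_child[u] for u in range(s + 1, steps))
        total += parent_s * beta_parent[s] * carry
    return total
```

Both routes should agree up to floating-point error, which is a quick way to confirm that the product and summation limits in (A2) match the recursion.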
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$ and $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ for $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$

where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ for $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \ldots, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
(1) If $k = 2$, then $k - 1 = 1$. Hence $Q_2(0) = 1$, $Q_i(0) = Q_2(j) = 0$ for ($i = 0, 1$; $j > 0$), and for $j > 0$,

$$
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{11}\phi_{11}(t_{j-1}) - \lambda_{01}\phi_{01}(t_{j-1})} - e^{-\theta_{11}\phi_{11}(t_j) - \lambda_{01}\phi_{01}(t_j)} + o(\beta_1), \\
Q_1(j) &\approx (1 - \alpha_1) \left\{ e^{-\lambda_{11}\phi_{11}(t_{j-1})} - e^{-\lambda_{11}\phi_{11}(t_j)} \right\} + o(\beta_1).
\end{aligned}
\tag{51}
$$

(2) If $k \ge 3$, then we have $Q_2(0) = \delta_{2(k-1)}\,\alpha$, $Q_i(0) = 0$ for $i = 0, 1$, and for $j > 0$,

$$
\begin{aligned}
Q_0(j) &\approx e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_{j-1}) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_{j-1})} - e^{-\theta_{1(k-1)}\phi_{1(k-1)}(t_j) - \lambda_{0(k-1)}\phi_{0(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_1(j) &\approx e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_{j-1}) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_{j-1})} - e^{-\theta_{2(k-1)}\phi_{2(k-1)}(t_j) - \lambda_{1(k-1)}\phi_{1(k-1)}(t_j)} + o(\beta_{k-1}), \\
Q_2(j) &\approx \delta_{2(k-1)}\,(1 - \alpha) \left\{ e^{-\lambda_{22}\phi_{22}(t_{j-1})} - e^{-\lambda_{22}\phi_{22}(t_j)} \right\} \\
&\quad + (1 - \delta_{2(k-1)}) \left\{ e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_{j-1}) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_{j-1})} - e^{-\theta_{3(k-1)}\phi_{3(k-1)}(t_j) - \lambda_{2(k-1)}\phi_{2(k-1)}(t_j)} \right\} + o(\beta_{k-1}).
\end{aligned}
\tag{52}
$$
Notice that if one replaces $[1 + \gamma_v]^{t-t_0}$ by $[1 + \gamma_v]^{t-t_0} = e^{(t-t_0)\log(1+\gamma_v)} \approx e^{(t-t_0)\gamma_v}$, the above $Q_i(j)$'s from the discrete time model are exactly the same as those from the continuous model given in equations (23)–(27), under the assumption that $P_T(s, t) = 1$ for $t - s > 0$. Notice that the assumption $P_T(s, t) = 1$ for $t - s > 0$ is equivalent to assuming that the last-stage cancer cells grow instantaneously into cancer tumors as soon as they are generated; see Remark 1.
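How close the discrete growth factor $(1+\gamma)^{t-t_0}$ is to its continuous counterpart $e^{(t-t_0)\gamma}$ depends on the size of $\gamma$ and on the number of time steps. The sketch below compares the two; the function names are illustrative, and the choice of 170 steps (85 years at two steps per year) is an assumption made only for this example.

```python
import math


def discrete_growth(gamma, n_steps):
    """Growth factor (1 + gamma)^n used in the discrete time model."""
    return (1.0 + gamma) ** n_steps


def continuous_growth(gamma, n_steps):
    """Growth factor exp(n * gamma) used in the continuous time model."""
    return math.exp(n_steps * gamma)


def relative_gap(gamma, n_steps):
    """Relative difference between the continuous and the discrete growth factors."""
    d = discrete_growth(gamma, n_steps)
    return abs(continuous_growth(gamma, n_steps) - d) / d


if __name__ == "__main__":
    # Illustrative rates of order 1e-5 and 1e-2 over an assumed 170 steps.
    for gamma in (2.71e-5, 3.37e-2):
        print(gamma, relative_gap(gamma, 170))
```

For proliferation rates of order $10^{-5}$ the two factors are practically indistinguishable, while for rates of order $10^{-2}$ the gap over many steps is no longer negligible, so the approximation is best read as a small-$\gamma$ statement.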
5 The Fitting of the Model to Cancer Incidence and the Generalized Bayesian Inference Procedure
Given the model in Sections 2 and 3 and cancer incidence data, one may use the results in Section 4 to fit the model. Using this model and the distribution results in Section 4, one can readily estimate the unknown genetic parameters, predict cancer incidence, and check the validity of the model by using the generalized Bayesian inference and Gibbs sampling procedures; for more detail, see Tan [3, 22] and Tan et al. [4, 5].
The generalized Bayesian inference is based on the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ of $\Theta$ given $\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$. This posterior distribution is derived by combining the prior distribution $P\{\Theta\}$ of $\Theta$ with the joint probability distribution $P\{\mathbf{N}, \mathbf{Y}, \underset{\sim}{y} \mid \underset{\sim}{n}, \Theta\}$ given $\{\underset{\sim}{n}, \Theta\}$, as given by (42). It follows that this inference procedure combines information from three sources: (1) previous information and experience about the parameters, in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases via genetic segregation of cancer genes in the population ($P\{\mathbf{N} \mid \underset{\sim}{n}, p_i, i = 1, 2\}$; see Section 2); and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model of the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40). Because of the additional information from the genetic segregation of the cancer genes, this inference procedure provides an efficient way to extract information on the effects of the genotypes of individuals at the embryo stage.
5.1. The Prior Distribution of the Parameters. For the prior distribution of $\Theta$, because biological information suggests lower and upper bounds for the mutation rates and for the proliferation rates, we assume

$$
P(\Theta) \propto c \quad (c > 0),
\tag{53}
$$

where $c$ is a positive constant if the parameters satisfy the biologically specified constraints below, and $P(\Theta)$ equals zero otherwise. These biological constraints are as follows.
(i) $0 < p_1 < 10^{-2}$, $0 < p_2 < 10^{-6}$, and $-0.01 < \gamma_i < 1$ ($i = 1, 2$).
(ii) For $\beta_j$ ($j = 0, 1, \ldots, k-1$), we let $1 < \omega_0 = N(t_0)\beta_0 < 1000$ ($N \to I_1$) and $10^{-8} < \beta_i < 10^{-3}$, $i = 1, \ldots, k-1$.
(iii) For the $\lambda_{j(k-1)}$ ($j = 0, 1, 2$), we let $0 < \lambda_i < 10$, $i = 0, 1$, and $0 < \lambda_2 < 10^{3}$.
(iv) For the $\theta_i$ ($i = 1, 2, 3$), we let $0 < \theta_1 < 10^{-2}$ and $0 < \theta_2 < 10^{-1}$.

We will refer to the above prior as a partially informative prior, which may be considered as an extension of the traditional noninformative prior given in Box and Tiao [28].
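A partially informative prior of this kind is simply a flat density on a box: constant inside the constraints and zero outside. The following is a minimal sketch of that idea for a 3-stage model ($k = 3$ is assumed here). The dictionary keys and the function name are hypothetical labels introduced for illustration, and the bounds follow constraints (i) to (iv) as read above.

```python
import math

# Open intervals (lower, upper) for each parameter; key names are hypothetical.
BOUNDS = {
    "p1": (0.0, 1e-2),
    "p2": (0.0, 1e-6),
    "gamma1": (-0.01, 1.0),
    "gamma2": (-0.01, 1.0),
    "omega0": (1.0, 1000.0),   # omega0 = N(t0) * beta0
    "beta1": (1e-8, 1e-3),
    "beta2": (1e-8, 1e-3),
    "lambda0": (0.0, 10.0),
    "lambda1": (0.0, 10.0),
    "lambda2": (0.0, 1e3),
    "theta1": (0.0, 1e-2),
    "theta2": (0.0, 1e-1),
}


def log_prior(theta):
    """Log of the flat prior: 0.0 inside the box (log of a constant), -inf outside."""
    for name, (lower, upper) in BOUNDS.items():
        if not (lower < theta[name] < upper):
            return -math.inf
    return 0.0
```

Adding `log_prior(theta)` to a log-likelihood leaves the posterior mode unchanged inside the box and rules out any parameter value that violates a constraint.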
5.2. The Posterior Distribution of the Parameters Given $\{\mathbf{Y}, \mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Denote by $\Theta_1 = \Theta - \{p_1, p_2, \alpha\}$. From the posterior distribution $P\{\Theta \mid \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$, we obtain

$$
\begin{aligned}
P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto (\chi(k))^{y_0}\, e^{-\chi(k)}\, p_1^{\sum_{j=1}^{t_N} n_{1j}}\, p_2^{\sum_{j=1}^{t_N} n_{2j}} \\
&\quad \times (1 - p_1 - p_2)^{\sum_{j=1}^{t_N} n_{0j}}, \quad 0 < p_1, p_2, \alpha < 1, \\
P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}
&\propto \prod_{j=1}^{t_N} \prod_{i=0}^{2} e^{-Q_i(j)}\, Q_i(j)^{y_{ij}}, \quad \Theta_1 \in \Omega,
\end{aligned}
\tag{54}
$$

where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.

For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3. The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters. Given the posterior probability distributions, we use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. Numerically, the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and Steps 3 and 4 as the M-step [29]. The multilevel Gibbs sampling procedure is as follows.
Step 1 (Generating $\mathbf{N}$ Given ($\mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta$) (The Data-Augmentation Step 1)). Given $\Theta$ and given $\underset{\sim}{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method of Smith and Gelfand [30]. The selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$, even though the latter is unknown (for a proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.
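Step 2 (Generating $\mathbf{Y}$ Given ($\mathbf{N}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta$) (The Data-Augmentation Step 2)). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ and given $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \hat{\mathbf{N}}, \underset{\sim}{y}, \underset{\sim}{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.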
Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ Given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $\{\underset{\sim}{y}, \underset{\sim}{n}, \Theta_1\}$ and given $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3, under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.
Step 4 (Estimation of $\Theta_1$ Given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \underset{\sim}{y}, \underset{\sim}{n}\}$). Given $(\underset{\sim}{y}, \underset{\sim}{n})$ and given $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1–3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \underset{\sim}{y}, \underset{\sim}{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.
Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.
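The selection in Step 1 relies on the weighted bootstrap idea of Smith and Gelfand [30]: draw many candidates from a distribution that is easy to sample, weight each candidate by the factor that is missing from the target, and resample in proportion to the weights. The sketch below shows that mechanism only. The function names are illustrative, and the weight function passed in is a stand-in for $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \underset{\sim}{n}, \Theta\}$, which is not reproduced here.

```python
import random
from collections import Counter


def draw_multinomial(n_trials, probs, rng):
    """One multinomial draw: counts of n_trials outcomes over len(probs) categories."""
    picks = rng.choices(range(len(probs)), weights=probs, k=n_trials)
    counts = Counter(picks)
    return tuple(counts.get(i, 0) for i in range(len(probs)))


def weighted_bootstrap(candidates, weight_fn, rng):
    """Select one candidate with probability proportional to weight_fn(candidate)."""
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0.0:
        raise ValueError("all candidate weights are zero")
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate_n(n_j, probs, weight_fn, rng, n_candidates=1000):
    """Sketch of Step 1: propose genotype counts, then reselect by likelihood weight."""
    candidates = [draw_multinomial(n_j, probs, rng) for _ in range(n_candidates)]
    return weighted_bootstrap(candidates, weight_fn, rng)
```

A call such as `generate_n(50, [0.9, 0.09, 0.01], my_weight, random.Random(1))`, with `my_weight` a user-supplied likelihood, returns one tuple of genotype counts. The larger the candidate pool, the closer the selected draw is to a draw from the target conditional distribution.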
The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are generated values from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$, independently of $(\mathbf{N}, \mathbf{Y})$ (for a proof, see Tan [22], Chapter 3). Repeating the above procedure, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\underset{\sim}{y}, \underset{\sim}{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and the sample variances and covariances as estimates of the variances and covariances of these estimates.
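The final summary step can be sketched as follows: given a list of generated parameter vectors, compute the sample mean and the sample covariance matrix. The function names are illustrative. The normal-approximation interval (estimate plus or minus 1.96 standard deviations) is an assumption added for this example; it is not stated in the text as the way the confidence limits were obtained.

```python
def posterior_summary(draws):
    """Sample mean vector and sample covariance matrix of a list of parameter vectors."""
    m = len(draws)
    if m < 2:
        raise ValueError("need at least two draws")
    d = len(draws[0])
    mean = [sum(row[i] for row in draws) / m for i in range(d)]
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in draws) / (m - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]
    return mean, cov


def normal_interval(estimate, std_dev, z=1.96):
    """Approximate 95% limits under a normal approximation (an assumption)."""
    return estimate - z * std_dev, estimate + z * std_dev
```

The square roots of the diagonal entries of the covariance matrix give the standard deviations that would be reported alongside the estimates.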
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
Human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea and give color to the eye. In Tan and Zhou [9], we developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. The same methods can be applied to model other human cancers as well, but this will be our future research.
Table 1 gives the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, the incidence at birth and for the age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Therefore, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for the age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; and (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

| Model | Log-likelihood | AIC | BIC |
|---|---|---|---|
| Three-stage, Model-F | -1312.02 | 2648.04 | 2677.35 |
| Three-stage, Model-1 | -1295.76 | 2609.53 | 2631.51 |
| Two-stage | -4392.421 | 8796.843 | 8811.569 |

Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Horizontal axis: age in years, 0 to 84; vertical axis: incidences, 0 to 350; curves: observed, Model-F, Model-1, two-stage.)
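The AIC and BIC columns of Table 2 follow from the log-likelihood by the standard definitions $\mathrm{AIC} = 2p - 2\log L$ and $\mathrm{BIC} = p \log n - 2\log L$. The sketch below applies these definitions, reading the Table 2 log-likelihood for Model-F as $-1312.02$. The parameter count $p = 12$ for Model-F matches the degrees of freedom quoted in the text ($86 - 12$); the count $p = 6$ for the two-stage model and the sample size $n = 85$ are assumptions that appear consistent with the reported values but are not stated in the text.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2p - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: p log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


if __name__ == "__main__":
    # Model-F with an assumed 12 parameters and an assumed n = 85.
    print(aic(-1312.02, 12), bic(-1312.02, 12, 85))
    # Two-stage model with an assumed 6 parameters.
    print(aic(-4392.421, 6))
```

Smaller values indicate a better trade-off between fit and model size, which is the sense in which the comparison of the models in Table 2 is made.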
Table 2 gives the natural logarithms of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Table 3 gives the estimates of the parameters in the 3-stage models. Figure 5 plots the predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, Table 1 also provides the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we make the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} ((y_i - \hat{y}_i)^2 / \hat{y}_i)$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df $= 86 - 12 = 74$) for Model-F and a $P$-value of 0.11 (df $= 86 - 7 = 79$) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from biological observations, one can get a rough idea of the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] and assume ($E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$), then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population, the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also shows that the estimate of $\alpha$ is 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why cancer incidence at birth is observed for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
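The goodness-of-fit comparison in observation (a) uses the Pearson Chi-square statistic, with degrees of freedom equal to the number of incidence cells minus the number of fitted parameters. The sketch below shows the computation on made-up numbers; the function names and the toy data are illustrative and are not taken from Table 1.

```python
def pearson_chi_square(observed, predicted):
    """Pearson statistic: sum over cells of (observed - predicted)^2 / predicted."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if any(e <= 0 for e in predicted):
        raise ValueError("predicted counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, predicted))


def degrees_of_freedom(n_cells, n_params):
    """Degrees of freedom used in the text: number of cells minus fitted parameters."""
    return n_cells - n_params


if __name__ == "__main__":
    # Toy data only; the SEER counts are not reproduced here.
    print(pearson_chi_square([10, 20, 30], [12.0, 18.0, 30.0]))
    print(degrees_of_freedom(86, 12))
```

With 86 cells (birth plus 85 age groups) and 12 fitted parameters, this gives the 74 degrees of freedom quoted for Model-F.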
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution that accounts for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To use the proposed models to fit cancer incidence data, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure has advantages over classical sampling theory inference (i.e., the maximum likelihood method) because it combines
Table 3: Estimates of parameters for the 3-stage stochastic models.

Model    Statistic      $\lambda_0$  $\lambda_1$  $\lambda_2$  $\gamma_1$  $\gamma_2$  $p_1$     $p_2$     $\alpha$  $\theta_1$  $\theta_2$
Model-F  Estimates      2.88E-01     4.81E-01     6.55E+02     2.71E-05    3.37E-02    9.95E-04  9.68E-07  8.41E-01  3.29E-04    3.41E-01
Model-F  StD            1.21E-01     4.65E-02     1.12E+02     8.86E-06    1.92E-04    1.75E-06  6.48E-09  4.19E-02  3.54E-05    1.07E-02
Model-F  95% CL, lower  5.03E-02     3.89E-01     4.34E+02     5.34E-06    3.33E-02    9.91E-04  9.55E-07  7.59E-01  2.60E-04    3.20E-01
Model-F  95% CL, upper  5.25E-01     5.72E-01     8.75E+02     4.01E-05    3.41E-02    9.98E-04  9.81E-07  9.23E-01  3.98E-04    3.62E-01
Model-1  Estimates      6.55E-02     3.72E-01     1.98E+02     NA          3.49E-02    9.98E-04  1.00E-06  8.07E-01  NA          NA
Model-1  StD            2.54E-02     4.63E-02     3.38E+01     NA          3.58E-03    6.27E-05  4.53E-08  7.06E-02  NA          NA
Model-1  95% CL, lower  1.57E-02     2.82E-01     1.32E+02     NA          2.79E-02    8.75E-04  9.11E-07  6.68E-01  NA          NA
Model-1  95% CL, upper  1.15E-01     4.63E-01     2.65E+02     NA          4.19E-02    1.12E-03  1.09E-06  9.45E-01  NA          NA

Note: NA: assumed nonexistence.
information from three sources: (1) previous information and experience about the parameters, in terms of the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases, via the genetic segregation of staging-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\tilde{y}$), via the statistical model from the system ($P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \Theta\}$ given by (37) and (40)).
To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5); on the other hand, the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.
Applying our models and methods to the SEER data of human eye cancer, we have derived, for the first time, some useful pieces of information. Specifically, we mention the following. (1) We have estimated, for the first time, the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\gamma_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haplo-insufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cases of human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be the topic of our future research; we do not pursue it further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation
Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

\[
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \ldots, \min(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
\]

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$ and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by

\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \ldots, \min(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
\]
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$ and $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ for $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

\[
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \ldots, \min(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \Bigl\{ \prod_{v=u}^{j-1} \beta_v \Bigr\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{ \prod_{r=j-u}^{j-1} \beta_r \Bigr\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \Bigl\{ \prod_{v=u}^{j-1} \beta_v \Bigr\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{ \prod_{r=j-u}^{j-1} \beta_r \Bigr\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
\]

where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ for $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

\[
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \Bigl( \prod_{v=u}^{k-2} \beta_v \Bigr) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \ldots, \min(k-1, 2).
\tag{A4}
\]
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
where $\Omega$ is the parameter space of $\Theta_1$ provided by the biological constraints in Section 5.1.
For computational convenience, we notice that the log of $P\{p_1, p_2, \alpha \mid \Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45); similarly, the log of $P\{\Theta_1 \mid (p_1, p_2, \alpha), \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$ is proportional to the negative of $\sum_{j=1}^{k} D_j$ given by (46).
5.3 The Multilevel Gibbs Sampling Procedure for Estimating Unknown Parameters
Given the posterior probability distributions, we use the following multilevel Gibbs sampling procedure to derive estimates of the parameters. We notice that, numerically, the Gibbs sampling procedure given below is equivalent to the EM algorithm from the sampling theory viewpoint, with Steps 1 and 2 as the E-step and Steps 3 and 4 as the M-step, respectively [29]. The multilevel Gibbs sampling procedure is given by the following steps.

Step 1 (Generating $\mathbf{N}$ given $(\mathbf{Y}, \tilde{y}, \tilde{n}, \Theta)$; Data-Augmentation Step 1). Given $\Theta$ and $\tilde{n}$, use the multinomial distribution of $\{n_{1j}, n_{2j}\}$ given $n_j$ in Section 3 to generate a large sample of $\mathbf{N}$. Then combine this large sample with $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$ in (37) and (40) to select $\mathbf{N}$ through the weighted bootstrap method due to Smith and Gelfand [30]. This selected $\mathbf{N}$ is then a sample from $P\{\mathbf{N} \mid \mathbf{Y}, \tilde{y}, \tilde{n}, \Theta\}$, even though the latter is unknown (for proof, see Tan [22], Chapter 3). Call the generated sample $\hat{\mathbf{N}}$.

Step 2 (Generating $\mathbf{Y}$ given $(\mathbf{N}, \tilde{y}, \tilde{n}, \Theta)$; Data-Augmentation Step 2). Given $\tilde{y}$, $\tilde{n}$, $\Theta$, and $\mathbf{N} = \hat{\mathbf{N}}$ generated from Step 1, generate $\mathbf{Y}$ from the probability distribution $P\{\mathbf{Y} \mid \mathbf{N}, \tilde{y}, \tilde{n}, \Theta\}$ given by (36) and (38). Call the generated sample $\hat{\mathbf{Y}}$.

Step 3 (Estimation of $\{p_i, i = 1, 2, \alpha\}$ given $\{\Theta_1, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $\{\tilde{y}, \tilde{n}, \Theta_1\}$ and $(\mathbf{N}, \mathbf{Y}) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}})$ from Steps 1 and 2, derive the posterior mode of $\{p_i, i = 1, 2, \alpha\}$ by maximizing the conditional posterior distribution $P\{p_i, i = 1, 2, \alpha \mid \Theta_1, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $D_0(k) + \mathrm{Dev}(p_1, p_2)$ given by (44)-(45) in Section 4.3 under the constraints given in Section 5.1. Denote this generated mode by $\{\hat{p}_i, i = 1, 2, \hat{\alpha}\}$.

Step 4 (Estimation of $\Theta_1$ given $\{p_i, i = 1, 2, \alpha, \mathbf{N}, \mathbf{Y}, \tilde{y}, \tilde{n}\}$). Given $(\tilde{y}, \tilde{n})$ and $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha})$ from Steps 1 to 3, derive the posterior mode of $\Theta_1$ by maximizing the conditional posterior distribution $P\{\Theta_1 \mid \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\mathbf{N}}, \hat{\mathbf{Y}}, \tilde{y}, \tilde{n}\}$. Under the partially informative prior, this is equivalent to maximizing the negative of the deviance $\sum_{j=1}^{k} D_j$ in (46) under the constraints. Denote the generated mode by $\hat{\Theta}_1$.

Step 5 (Recycling Step). With $(\mathbf{N}, \mathbf{Y}, p_i, i = 1, 2, \alpha, \Theta_1) = (\hat{\mathbf{N}}, \hat{\mathbf{Y}}, \hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1)$ given above, go back to Step 1 and continue until convergence.

The proof of convergence of the above steps can be derived by using the procedure given in Tan ([22], Chapter 3). At convergence, the $\hat{\Theta} = \{\hat{p}_i, i = 1, 2, \hat{\alpha}, \hat{\Theta}_1\}$ are generated values from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$, independently of $(\mathbf{N}, \mathbf{Y})$ (for proof, see Tan [22], Chapter 3). Repeating the above procedure, one generates a random sample of $\Theta$ from the posterior distribution of $\Theta$ given $\{\tilde{y}, \tilde{n}\}$; one then uses the sample mean as the estimate of $\Theta$ and uses the sample variances and covariances as estimates of the variances and covariances of these estimates.
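The selection in Step 1 relies on the weighted bootstrap of Smith and Gelfand [30]: candidates are drawn from a distribution that is easy to sample, and one of them is then resampled with probability proportional to a weight. The following is a minimal sketch of that resampling idea only, not the authors' implementation. The function name, the binomial candidate generator, and the toy weight function are all hypothetical stand-ins; in the procedure above the candidates would come from the multinomial distribution of Section 3 and the weight would be $P\{\mathbf{Y}, \tilde{y} \mid \mathbf{N}, \tilde{n}, \Theta\}$.

```python
import random


def weighted_bootstrap(candidates, weight_fn, size=1, rng=None):
    """Resample `size` items from `candidates` with probability
    proportional to weight_fn(candidate)."""
    rng = rng or random.Random()
    weights = [weight_fn(c) for c in candidates]
    if sum(weights) <= 0:
        raise ValueError("all candidate weights are zero")
    return rng.choices(candidates, weights=weights, k=size)


# Toy illustration (hypothetical numbers): candidates are binomial counts,
# standing in for draws of N from its multinomial distribution.
rng = random.Random(0)
n_j, p = 50, 0.1
candidates = [sum(rng.random() < p for _ in range(n_j)) for _ in range(2000)]


def toy_weight(n1):
    # Stand-in for the likelihood term; it favours counts near 7.
    return 1.0 / (1.0 + (n1 - 7) ** 2)


selected = weighted_bootstrap(candidates, toy_weight, size=1, rng=rng)[0]
```

The larger the candidate sample, the closer the distribution of the selected value is to the target conditional distribution, which is why Step 1 asks for a large sample of $\mathbf{N}$ before resampling.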
6 A New Multistage Stochastic Model for Adult Eye Cancer (Uveal Melanoma): An Example
Human eye cancers consist of pediatric eye cancers and adult eye cancers. The most common pediatric eye cancer is retinoblastoma, which develops from the retinal pigment epithelium cells underlying the retina that do not form melanoma. The most common adult eye cancers are the uveal melanomas, involving the iris, the ciliary body, and the choroid (collectively referred to as the uvea). These cancers develop from melanocytes (pigment cells), which reside within the uvea and give color to the eye. In Tan and Zhou [9], we developed a modified two-stage model for retinoblastoma. Based on results from molecular biology (see Landreville et al. [14], Mensink et al. [15], and Loercher and Harbour [31]), Landreville et al. [14] have proposed a three-stage model for uveal melanoma, as given in Figure 2. As an example of applications of this paper, in this section we apply this model of uveal melanoma to the NCI/NIH eye cancer data from the SEER project. The same methods can be applied to model other human cancers as well, but this will be our future research.
Given in Table 1 are the numbers of people at risk and the eye cancer cases in the age groups, together with the predicted cases from the models. These data give cancer incidence at birth and incidence for 85 age groups ($k = 85$), with each group spanning a 1-year period except the last age group ($\ge 85$ years old). For human eye cancer, the incidence at birth and for age groups from 1 to 10 years old is basically generated by the pediatric eye cancer retinoblastoma (see [9]). Therefore, to account for inherited cancer cases of uveal melanoma, the incidence for age 0 (birth) and for age periods from 1 to 10 years old in Table 1 for uveal melanoma is derived by subtracting the incidence of retinoblastoma from the SEER data (see Tan and Zhou [9]).
To fit the data, we let one time unit be 6 months after birth and let $t_0 = 1$. To compare different models and to assess different assumptions, we consider the following two 3-stage mixture models: (1) the complete 3-stage mixture model (Model-F), in which no assumptions are made on the parameters; and (2) the Type-1 3-stage mixture model (Model-1), in which we assume that $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth. For comparison purposes, we also fit a 2-stage model as defined in Tan and Zhou [9]. We apply the methods in Section 5 to fit these models to the SEER data given in Table 1.
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood   AIC        BIC
Three-stage, Model-F    -1312.02         2648.04    2677.35
Three-stage, Model-1    -1295.76         2609.53    2631.51
Two-stage               -4392.421        8796.843   8811.569
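As a quick consistency check on Table 2, the sketch below recomputes the AIC of Model-F from its log-likelihood. It assumes the standard definition AIC $= 2k - 2\ln L$ and takes the number of parameters $k = 12$ from the degrees of freedom ($86 - 12$) quoted for Model-F; the function name `aic` is illustrative and does not come from the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion, standard form: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Model-F: log-likelihood -1312.02 with an assumed k = 12 parameters.
model_f_aic = aic(-1312.02, 12)
print(f"{model_f_aic:.2f}")
```

Under these assumptions the result agrees with the Model-F AIC of 2648.04 listed in Table 2.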
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. (Plot of incidences against age in years, from 0 to 84; series: Observed, Model-F, Model-1, Two-stage.)
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of the parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, Table 1 also provides the numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model, together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better according to the values of AIC and BIC. The chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a P-value of 0.12 (df = 86 − 12 = 74) for Model-F and a P-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly: the chi-square statistic for the 2-stage model is 2747.69, giving a P-value less than $10^{-3}$. The AIC and BIC values of Model-1 (AIC = 2609.53, BIC = 2631.51) are slightly smaller than those of Model-F, whereas the AIC and BIC values of the two-stage model (8796.84, 8811.57) are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with an inherited component, and that one may practically assume $\gamma_1 = 0$ and that normal people and $J_1$ people at the embryo stage remain normal people and $J_1$ people, respectively, at birth.

(b) From Table 3, the estimate of $\gamma_1$ is close to zero (of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$. This further confirms that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^{3}$ times greater than that of cells with genotype $J_1$.

(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively.
119895minus1(119905)] for 119895 gt 119894
The initial conditions at birth (1199050) for the above stochastic
difference equations are 119869119895(1199050 119894) gt 0 if (119895 = 119894 119894 + 1) and
119869119895(1199050 119894) = 0 if 119895 gt 119894 + 1 The solution of the above difference
equations under these initial conditions is given respectivelyby
119869119894(119905 119894) = 119869
119894(1199050 119894)
119905minus1
prod
119904=1199050
(1 + 120574119894(119904))
+
119905
sum
119904=1199050+1
119890119894(119904 119894)
119905minus1
prod
119906=119904
(1 + 120574119894(119906))
119894 = 0 1 Min (119896 minus 1 2)
119869119895(119905 119894) = 119869
119895(1199050 119894)
119905minus1
prod
119904=1199050
(1 + 120574119895(119904))
+
119905minus1
sum
119904=1199050
119869119895minus1 (119904 119894) 120573119895minus1 (119904)
119905minus1
prod
119906=119904+1
(1 + 120574119895 (119906))
+
119905
sum
119904=1199050+1
119890119895(119904 119894)
119905minus1
prod
119906=119904
(1 + 120574119895(119906))
0 le 119894 lt 119895 le 119896 minus 1
(A2)
If the model is time homogeneous so that 120573119895(119905) =
120573119895 120574
119895(119905) = 120574
119895(119895 = 119894 119896 minus 1 and if 120574
119894= 120574119895if 119894 = 119895 then the
above solutions under the initial conditions (119869119895(1199050 119894) = 0 119895 gt
119894 + 1) reduce to
119869119894 (119905 119894) = 119869
119894(1199050 119894) (1 + 120574
119894)119905minus1199050
+ 120578(0)
119894(119905 119894) 119894 = 0 1 Min (119896 minus 1 2)
119869119895(119905 119894) = 119869
119895(1199050 119894) (1 + 120574
119895)119905minus1199050
+
119895minus1
sum
119906=119894
119869119906(1199050 119894)
119895minus1
prod
119907=119906
120573119907
119895
sum
119903=119906
119860119906119895(119903) (1 + 120574
119903)119905minus1199050
+ 120578(0)
119895(119905 119894) +
119895minus119894
sum
119906=1
119895minus1
prod
119903=119895minus119906
120573119903
120578(119906)
119895(119905 119894)
18 ISRN Biomathematics
=
119894+1
sum
119906=119894
119869119906(1199050 119894)
119895minus1
prod
119907=119906
120573119907
119895
sum
119903=119906
119860119906119895 (119903) (1 + 120574119903)
119905minus1199050
+ 120578(0)
119895(119905 119894) +
119895minus119894
sum
119906=1
119895minus1
prod
119903=119895minus119906
120573119903
120578(119906)
119895(119905 119894)
0 le 119894 lt 119895 le 119896 minus 1
(A3)
where 120578(0)
119895(119905 119894) = sum
119905
119904=1199050+1119890119895(119904 119894)(1 + 120574
119894)119905minus119904 120578
(119906)
119895(119905 119894) =
sum119905
119904=1199050+1119890119895minus119906
(119904 119894)sum119895
119907=119906119860119906119895(119907)(1 + 120574
119907)119905minus119904 (119906 = 1 119895 minus 119894)
Thus if the model is time homogeneous and if 120574119894=
120574119895if 119894 = 119895 the 119864[119869
119896minus1(119905 119894)]rsquos in discrete time models under
the initial conditions (119869119895(1199050 119894) = 0 119895 gt 119894 + 1) are given
respectively by
119864 [119869119896minus1
(119905 119894)] =
119894+1
sum
119906=119894
119864 [119869119906(1199050 119894)] (
119896minus2
prod
119907=119906
120573119907)
times
119896minus1
sum
119903=119906
119860119906(119896minus1)
(119903) (1 + 120574119903)119905minus1199050
119894 = 0 1 Min (119896 minus 1 2)
(A4)
References
[1] M P Little ldquoCancermodels ionization and genomic instabilitya reviewrdquo inHandbook of CancerModels with ApplicationsW YTan and L Hanin Eds chapter 5 World Scientific River EdgeNJ USA 2008
[2] W Y Tan Stochastic Models of Carcinogenesis Marcel DekkerNew York NY USA 1991
[3] W Y Tan ldquoStochastic multiti-stage models of carcinogenesis ashidden Markov models a new approachrdquo International Journalof Systems and Synthetic Biology vol 1 pp 313ndash337 2010
[4] W Y Tan L J Zhang and C W Chen ldquoStochastic modelingof carcinogenesis state space models and estimation of param-etersrdquoDiscrete and Continuous Dynamical Systems B vol 4 no1 pp 297ndash322 2004
[5] W Y Tan C W Chen and L J Zhang ldquoCancer biology cancermodels and stochasticmathematical analysis of carcinogenesisrdquoin Handbook of Cancer Models and Applications W Y Tan andL Hanin Eds chapter 3World Scientific River Edge NJ USA2008
[6] R A Weinberg The Biology of Human Cancer Garland Sci-ences Taylor and Frances New York NY USA 2007
[7] Q Zheng ldquoStochastic multistage cancer models a fresh lookat an old approachrdquo in Handbook of Cancer Models andApplications W Y Tan and L Hanin Eds chapter 2 WorldScientific River Edge NJ USA 2008
[8] W Y Tan L J Zhang W Chen and J M Zhu ldquoA stochasticmodel of human colon cancer involving multiple pathwaysrdquo inHandbook of Cancer Models with Applications W Y Tan and LHanin Eds chapter 11 World Scientific River Edge NJ USA2008
[9] W Y Tan and H Zhou ldquoA new stochastic model of retinoblas-toma involving both hereditary and non- hereditary cancercasesrdquo Journal of Carcinogenesis and Mutagenesis vol 2 no 2article 117 2011
[10] X W Yan Stochastic and State Space Models of carcinogenesisUnder Complex Situation [PhD thesis] Department of Math-ematical Sciences University of Memphsis Memphis TennUSA
[11] G L Yang and C W Chen ldquoA stochastic two-stage carcino-genesis model a new approach to computing the probabilityof observing tumor in animal bioassaysrdquo Mathematical Bio-sciences vol 104 no 2 pp 247ndash258 1991
[12] H Osada and T Takahashi ldquoGenetic alterations of multipletumor suppressors and oncogenes in the carcinogenesis andprogression of lung cancerrdquo Oncogene vol 21 no 48 pp 7421ndash7434 2002
[13] I I Wistuba L Mao and A F Gazdar ldquoSmoking moleculardamage in bronchial epitheliumrdquo Oncogene vol 21 no 48 pp7298ndash7306 2002
[14] S Landreville O A Agapova and J W Hartbour ldquoEmerginginsights into the molecular pathogenesis of uveal melanomardquoFuture Oncology vol 4 no 5 pp 629ndash636 2008
[15] H W Mensink D Paridaens and A De Klein ldquoGenetics ofuveal melanomardquo Expert Review of Ophthalmology vol 4 no6 pp 607ndash616 2009
[16] WY Tan andXWYan ldquoAnew stochastic and state spacemodelof human colon cancer incorporating multiple pathwaysrdquoBiology Direct vol 5 article 26 2010
[17] L B Klebanov S T Rachev and A Y Yakovlev ldquoA stochasticmodel of radiation carcinogenesis latent time distributions andtheir propertiesrdquo Mathematical Biosciences vol 113 no 1 pp51ndash75 1993
[18] A Y Yakovlev and A D Tsodikov Stochastic Models of TumorLatency and Their Biostatistical Applications World ScientificRiver Edge NJ USA 1996
[19] H Fakir W Y Tan L Hlatky P Hahnfeldt and R K SachsldquoStochastic population dynamic effects for lung cancer progres-sionrdquo Radiation Research vol 172 no 3 pp 383ndash393 2009
[20] A G Knudson ldquoMutation and cancer statistical study ofretinoblastomardquoProceedings of theNational Academy of Sciencesof the United States of America vol 68 no 4 pp 820ndash823 1971
[21] J F Crow and M Kimura An Introduction to PopulationGenetics Theory Harper and Row New York NY USA 1970
[22] W Y Tan Stochastic Models With Applications to GeneticsCancers AIDS and Other Biomedical Systems World ScientificRiver Edge NJ USA 2002
[23] W Y Tan CW Chen and L J Zhang ldquoCancer risk assessmentby state space modelsrdquo in Handbook of Cancer Models andApplications W Y Tan and L Hanin Eds Chapter 13 WorldScientific River Edge NJ USA 2008
[24] W Y Tan andCWChen ldquoStochasticmodeling of carcinogene-sis some new insightsrdquoMathematical and Computer Modellingvol 28 no 11 pp 49ndash71 1998
[25] W Y Tan and C W Chen ldquoCancer stochastic modelsrdquo inEncyclopedia of Statistical Sciences Revised edition John Wileyand Sons New York NY USA 2005
[26] E G Luebeck and S HMoolgavkar ldquoMultistage carcinogenesisand the incidence of colorectal cancerrdquo Proceedings of theNational Academy of Sciences of the United States of Americavol 99 no 23 pp 15095ndash15100 2002
[27] R Durrett J Mayberry S Moseley and D Schmidt ldquoProba-bility models for cancer development and progressionrdquo GoogleSearch Google 2012
[28] G E P Box and G C Tiao Bayesian Inference in StatisticalAnalysis Addison-Wesley Reading Mass USA 1973
ISRN Biomathematics 19
[29] A P Dempster N M Laird and D B Rubin ldquoMaximumlikelihood from incomplete data via the EM algorithm (withdiscussion)rdquo Journal of the Royal Statistical Society B vol 39pp 1ndash38 1977
[30] A F M Smith and A E Gelfand ldquoBayesian statistics withouttears a samplingresampling perspectiverdquoAmerican Statisticianvol 46 pp 84ndash88 1992
[31] A E Loercher and J W Harbour ldquoMolecular genetics of uvealmelanomardquoCurrent Eye Research vol 27 no 2 pp 69ndash74 2003
[32] C S Potten C Booth and D Hargreaves ldquoThe small intestineas amodel for evaluating adult tissue stem cell drug targetsrdquoCellProliferation vol 36 no 3 pp 115ndash129 2003
[33] C J Lynch and J Milner ldquoLoss of one p53 allele results in four-fold reduction of p53 mRNA and protein a basis for p53 haplo-insufficiencyrdquo Oncogene vol 25 no 24 pp 3463ndash3470 2006
Table 2: The log-likelihood, AIC, and BIC of the fitted models.

Models                  Log-likelihood    AIC         BIC
Three-stage, Model-F    −1312.02          2648.04     2677.35
Three-stage, Model-1    −1295.76          2609.53     2631.51
Two-stage               −4392.421         8796.843    8811.569
Figure 5: Curve fitting of SEER data by Model-F, Model-1, and the two-stage model. [Plot of incidences (0 to 350) against age in years (0 to 84); series: Observed, Model-F, Model-1, Two-stage.]
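The information criteria in Table 2 follow the standard definitions AIC = 2k − 2 log L and BIC = k log n − 2 log L. The sketch below is only an illustration under stated assumptions: the Model-F log-likelihood is read as −1312.02 (the decimal placement is inferred, since the extracted table lost its decimal points), the parameter count k = 12 is taken from the degrees of freedom quoted for Model-F (df = 86 − 12), and the function names are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k log n - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Assumed reading of the Model-F row of Table 2 (decimal placement inferred).
model_f_loglik = -1312.02
model_f_params = 12  # from df = 86 - 12 quoted for Model-F

model_f_aic = aic(model_f_loglik, model_f_params)
print(f"Model-F AIC: {model_f_aic:.2f}")
```

Under these assumptions the computed AIC agrees with the Model-F entry of Table 2, which supports the decimal placement used here. The BIC depends on the number of observations n, which is left as an input rather than guessed.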
Given in Table 2 are the natural logs of the likelihood functions, the AIC (Akaike Information Criterion), and the BIC (Bayesian Information Criterion) for these models. Given in Table 3 are the estimates of parameters in the 3-stage models. Given in Figure 5 are the plots of predicted cancer cases from the 3-stage mixture models (Model-F and Model-1) and the 2-stage model. For comparison purposes, in Table 1 we also provide numbers of predicted cancer cases from the 3-stage mixture models and the 2-stage model together with the observed cancer cases over time from SEER. From these results, we have made the following observations.
(a) As shown by the results in Table 1 and Figure 5, both Model-F and Model-1 fitted the SEER data well, although Model-1 fitted the data slightly better judging from the values of AIC and BIC. The Chi-square test statistics $\chi^2 = \sum_{i=0}^{85} (y_i - \hat{y}_i)^2 / \hat{y}_i$ for Model-F and Model-1 are 88.43 and 94.48, respectively, giving a $P$-value of 0.12 (df = 86 − 12 = 74) for Model-F and a $P$-value of 0.11 (df = 86 − 7 = 79) for Model-1. On the other hand, the 2-stage model fitted the data very poorly; the Chi-square statistic for the 2-stage model is 2747.69, giving a $P$-value less than $10^{-3}$. The AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) values of Model-1 are (AIC = 2609.53, BIC = 2631.51), which are slightly smaller than those of Model-F; however, the AIC and BIC values (8796.84, 8811.57) of the two-stage model are considerably greater than those of the 3-stage models. These results suggest that uveal melanoma may best be described by a 3-stage model with inherited component, that one may practically assume $\gamma_1 = 0$, and that normal people and $J_1$ people at the embryo stage will remain normal people and $J_1$ people, respectively, at birth.
(b) From Table 3, it is observed that the estimate of $\gamma_1$ is close to zero (the estimate is of order $10^{-5}$), indicating that the phenotype of $J_1$ is almost identical to that of $N = J_0$, further confirming that the staging-limiting genes are basically tumor suppressor genes and that there is no haploinsufficiency for these tumor suppressor genes. On the other hand, the estimate of $\gamma_2$ is of order $10^{-2}$, which is about $10^3$ times greater than that of cells with genotype $J_1$.
(c) From Table 3, the estimates $\hat{\lambda}_j$ ($j = 0, 1, 2$) of $\lambda_j$ are of order $10^{-1}$, $10^{-1}$, and $10^{2} \sim 10^{3}$, respectively. Because $\lambda_i = E[J_i(t_0; i)]\,\beta_i \prod_{u=i+1}^{2} (\beta_u / \gamma_u)$, $i = 0, 1, 2$, assuming some values of $E[J_i(t_0; i)]$, $i = 0, 1, 2$, from some biological observations, one can have some rough ideas about the magnitude of $\beta_j$ ($j = 0, 1, 2$). For example, if we follow Potten et al. [32] to assume ($E[N(t_0)] = E[J_i(t_0; i)] \sim 10^{8}$, $i = 0, 1, 2$), then $\beta_j \approx 10^{-6} \sim 10^{-5}$.
(d) From Table 3, the estimates of $p_1$ and $p_2$ from the SEER data are of orders $10^{-4} \sim 10^{-3}$ and $10^{-7} \sim 10^{-6}$, respectively. This indicates that in the US population the frequency of the staging-limiting cancer gene for uveal melanoma is approximately $10^{-3}$. Table 3 also showed that the estimate of $\alpha$ was 0.8411, indicating that most individuals with genotype $J_2$ would develop cancer at birth. This may help to explain why there are observed cancer incidences at birth for uveal melanoma in the SEER data even though the estimate of the frequency $p_2$ is of order $10^{-7} \sim 10^{-6}$.
7 Discussion and Conclusions
To account for inherited cancer cases, in this paper we have developed some general multistage models involving hereditary cancer cases. For human cancer incidence, these models are basically generalized mixture models. In these mixture models, the mixing probability is a multinomial distribution to account for genetic segregation of the staging-limiting tumor suppressor genes. This mixture model allows us to estimate, for the first time, the frequency of the staging-limiting tumor suppressor gene in human populations. As an example of applications, in this paper we have developed a general 3-stage stochastic multistage model of carcinogenesis for adult human eye cancer. To account for inherited cancer cases in the stochastic model of human eye cancer, we have also developed a generalized mixture model for uveal melanoma in human beings.
To fit the cancer incidence data with the proposed models, in this paper we have developed a generalized Bayesian inference procedure to estimate the unknown parameters and to predict cancer cases. This inference procedure is advantageous over the classical sampling theory inference (i.e., the maximum likelihood method) because the procedure combines information from three sources: (1) previous information and experiences about the parameters in terms of the prior distribution $P\{\Theta\}$ of the parameters, (2) biological information of inherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population, and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$) via the statistical model from the system ($P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$) given by (37) and (40).

Table 3: Estimates of parameters for the 3-stage stochastic models.

Parameters       λ0         λ1         λ2         γ1         γ2         p1         p2         α          θ1         θ2
Model-F
  Estimates      2.88E−01   4.81E−01   6.55E+02   2.71E−05   3.37E−02   9.95E−04   9.68E−07   8.41E−01   3.29E−04   3.41E−01
  StD            1.21E−01   4.65E−02   1.12E+02   8.86E−06   1.92E−04   1.75E−06   6.48E−09   4.19E−02   3.54E−05   1.07E−02
  95% CL-Lower   5.03E−02   3.89E−01   4.34E+02   5.34E−06   3.33E−02   9.91E−04   9.55E−07   7.59E−01   2.60E−04   3.20E−01
  95% CL-Upper   5.25E−01   5.72E−01   8.75E+02   4.01E−05   3.41E−02   9.98E−04   9.81E−07   9.23E−01   3.98E−04   3.62E−01
Model-1
  Estimates      6.55E−02   3.72E−01   1.98E+02   NA         3.49E−02   9.98E−04   1.00E−06   8.07E−01   NA         NA
  StD            2.54E−02   4.63E−02   3.38E+01   NA         3.58E−03   6.27E−05   4.53E−08   7.06E−02   NA         NA
  95% CL-Lower   1.57E−02   2.82E−01   1.32E+02   NA         2.79E−02   8.75E−04   9.11E−07   6.68E−01   NA         NA
  95% CL-Upper   1.15E−01   4.63E−01   2.65E+02   NA         4.19E−02   1.12E−03   1.09E−06   9.45E−01   NA         NA
Note: NA: assumed nonexistence.

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis clearly showed that the proposed 3-stage model with inherited cancer cases fitted the data nicely (see Table 2 and Figure 5); on the other hand, the classical 2-stage model cannot fit the data at all (see Table 2 and Figure 5). These results confirm results from molecular biology that human eye cancer is described by a 3-stage model with inherited cancer component. Notice, however, that our 3-stage multistage model is more general than the classical 3-stage model as described in Little [1], Tan [2], and Zheng [7] in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells immediately as soon as they are generated, ignoring completely cancer progression; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived for the first time some useful pieces of information. Specifically, we mention the following. (1) For the first time, we have estimated the frequency of the staging-limiting tumor suppressor gene in the US population ($p_1 \sim 0.9948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\gamma_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haploinsufficiency for this gene in cells with genotype $J_1$.
Using the models and methods of this paper, one can easily predict future cancer cases for human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix
The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations of the state variables, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \mathrm{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if ($j = i, i+1$) and $J_j(t_0; i) = 0$ if $j > i+1$. The solutions of the above difference equations under these initial conditions are given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
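As an informal check of (A2), one can iterate (A1) in expectation and compare the result with the closed form. The sketch below rests on the assumption that the noise terms $e_j$ have mean zero, so that they drop out of the expected values; the two adjacent stages, the time-dependent rates, and all function names are hypothetical choices made only for illustration.

```python
import math


def growth(gamma, start, stop):
    """Product of (1 + gamma(u)) for u = start, ..., stop - 1 (empty product is 1)."""
    out = 1.0
    for u in range(start, stop):
        out *= 1.0 + gamma(u)
    return out


def iterate_a1(prev0, curr0, gamma_prev, gamma_curr, beta_prev, t0, t):
    """Iterate the expected-value form of (A1) for stages j-1 and j from t0 to t."""
    prev, curr = prev0, curr0
    for s in range(t0, t):
        # Both updates use the values at time s (tuple assignment).
        prev, curr = (prev * (1.0 + gamma_prev(s)),
                      curr * (1.0 + gamma_curr(s)) + prev * beta_prev(s))
    return prev, curr


def closed_form_a2(prev0, curr0, gamma_prev, gamma_curr, beta_prev, t0, t):
    """Expected value of stage j at time t from (A2) with the noise terms omitted."""
    own = curr0 * growth(gamma_curr, t0, t)
    inflow = sum(
        prev0 * growth(gamma_prev, t0, s) * beta_prev(s) * growth(gamma_curr, s + 1, t)
        for s in range(t0, t)
    )
    return own + inflow


# Hypothetical time-dependent rates, chosen only for illustration.
g_prev = lambda s: 0.001 + 0.0001 * s
g_curr = lambda s: 0.02 + 0.0005 * s
b_prev = lambda s: 1e-5 * (1 + 0.01 * s)

_, iterated = iterate_a1(1e6, 50.0, g_prev, g_curr, b_prev, 0, 30)
closed = closed_form_a2(1e6, 50.0, g_prev, g_curr, b_prev, 0, 30)
print(math.isclose(iterated, closed, rel_tol=1e-9))
```

The two routes agree because the inflow $J_{j-1}(s; i)\,\beta_{j-1}(s)$ enters stage $j$ at time $s+1$ and is then multiplied by the growth factors for times $s+1$ through $t-1$, which is the index range of the inner product in (A2).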
If the model is time homogeneous so that $\beta_j(t) = \beta_j$, $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \neq \gamma_j$ for $i \neq j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \mathrm{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \Bigl\{\prod_{v=u}^{j-1} \beta_v\Bigr\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{\prod_{r=j-u}^{j-1} \beta_r\Bigr\}\, \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \Bigl\{\prod_{v=u}^{j-1} \beta_v\Bigr\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \Bigl\{\prod_{r=j-u}^{j-1} \beta_r\Bigr\}\, \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$
where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_i)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ for $i \neq j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \Bigl(\prod_{v=u}^{k-2} \beta_v\Bigr) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \mathrm{Min}(k-1, 2).
\tag{A4}
$$
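The closed form (A4) can be compared numerically with a direct iteration of the expected-value recursion. The coefficient $A_{uj}(r)$ is not defined in this appendix, so the sketch below assumes the usual partial-fraction form $A_{uj}(r) = \prod_{v=u,\, v \neq r}^{j} (\gamma_r - \gamma_v)^{-1}$; this definition, the parameter values, and the function names are all assumptions made for illustration only.

```python
import math


def a_coeff(gammas, u, j, r):
    """Assumed coefficient A_uj(r): product over v = u..j, v != r, of 1/(gamma_r - gamma_v)."""
    out = 1.0
    for v in range(u, j + 1):
        if v != r:
            out /= gammas[r] - gammas[v]
    return out


def expected_last_stage(j_init, gammas, betas, i, k, n):
    """(A4) for E[J_{k-1}(t; i)], with n = t - t0 and j_init[u] = E[J_u(t0; i)]."""
    total = 0.0
    for u in (i, i + 1):
        if u > k - 1:
            continue
        beta_prod = 1.0
        for v in range(u, k - 1):  # v = u, ..., k - 2
            beta_prod *= betas[v]
        series = sum(a_coeff(gammas, u, k - 1, r) * (1.0 + gammas[r]) ** n
                     for r in range(u, k))
        total += j_init[u] * beta_prod * series
    return total


def iterate_chain(j_init, gammas, betas, n):
    """Iterate E[J_j(t+1)] = E[J_j(t)](1 + gamma_j) + E[J_{j-1}(t)] beta_{j-1} for n steps."""
    state = list(j_init)
    for _ in range(n):
        new = [state[0] * (1.0 + gammas[0])]
        for j in range(1, len(state)):
            new.append(state[j] * (1.0 + gammas[j]) + state[j - 1] * betas[j - 1])
        state = new
    return state


# Hypothetical 3-stage example (k = 3, i = 0) with distinct proliferation rates.
gammas = [0.0, 0.01, 0.03]
betas = [1e-6, 1e-5]
j_init = [1e8, 1e5, 0.0]  # J_2(t0) = 0, matching the stated initial conditions

direct = iterate_chain(j_init, gammas, betas, 40)[-1]
closed = expected_last_stage(j_init, gammas, betas, 0, 3, 40)
print(math.isclose(direct, closed, rel_tol=1e-9))
```

The requirement $\gamma_i \neq \gamma_j$ for $i \neq j$ is visible in `a_coeff`: equal rates would produce a division by zero. When two rates are nearly equal, the closed form also suffers from cancellation between large terms, so the direct iteration is the more robust route numerically.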
References
[1] M. P. Little, “Cancer models, ionization, and genomic instability: a review,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, “Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach,” International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, “Stochastic modeling of carcinogenesis: state space models and estimation of parameters,” Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, “Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis,” in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, “Stochastic multistage cancer models: a fresh look at an old approach,” in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, “A stochastic model of human colon cancer involving multiple pathways,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, “A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases,” Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, “A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays,” Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, “Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer,” Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, “Smoking molecular damage in bronchial epithelium,” Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, “Emerging insights into the molecular pathogenesis of uveal melanoma,” Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, “Genetics of uveal melanoma,” Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, “A new stochastic and state space model of human colon cancer incorporating multiple pathways,” Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, “A stochastic model of radiation carcinogenesis: latent time distributions and their properties,” Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, “Stochastic population dynamic effects for lung cancer progression,” Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, “Mutation and cancer: statistical study of retinoblastoma,” Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, “Cancer risk assessment by state space models,” in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, “Stochastic modeling of carcinogenesis: some new insights,” Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, “Cancer stochastic models,” in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, “Multistage carcinogenesis and the incidence of colorectal cancer,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, “Probability models for cancer development and progression,” Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm (with discussion),” Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, “Bayesian statistics without tears: a sampling-resampling perspective,” American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, “Molecular genetics of uveal melanoma,” Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, “The small intestine as a model for evaluating adult tissue stem cell drug targets,” Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, “Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency,” Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Table 3: Estimates of parameters for the 3-stage stochastic models.

Parameters       λ0         λ1         λ2         γ1         γ2         p1         p2         α          θ1         θ2
Model-F
  Estimates      2.88E-01   4.81E-01   6.55E+02   2.71E-05   3.37E-02   9.95E-04   9.68E-07   8.41E-01   3.29E-04   3.41E-01
  StD            1.21E-01   4.65E-02   1.12E+02   8.86E-06   1.92E-04   1.75E-06   6.48E-09   4.19E-02   3.54E-05   1.07E-02
  95% CL-Lower   5.03E-02   3.89E-01   4.34E+02   5.34E-06   3.33E-02   9.91E-04   9.55E-07   7.59E-01   2.60E-04   3.20E-01
  95% CL-Upper   5.25E-01   5.72E-01   8.75E+02   4.01E-05   3.41E-02   9.98E-04   9.81E-07   9.23E-01   3.98E-04   3.62E-01
Model-1
  Estimates      6.55E-02   3.72E-01   1.98E+02   NA         3.49E-02   9.98E-04   1.00E-06   8.07E-01   NA         NA
  StD            2.54E-02   4.63E-02   3.38E+01   NA         3.58E-03   6.27E-05   4.53E-08   7.06E-02   NA         NA
  95% CL-Lower   1.57E-02   2.82E-01   1.32E+02   NA         2.79E-02   8.75E-04   9.11E-07   6.68E-01   NA         NA
  95% CL-Upper   1.15E-01   4.63E-01   2.65E+02   NA         4.19E-02   1.12E-03   1.09E-06   9.45E-01   NA         NA

Note: NA: assumed nonexistence.
information from three sources: (1) previous information and experience about the parameters, expressed through the prior distribution $P\{\Theta\}$ of the parameters; (2) biological information on inherited cancer cases, via the genetic segregation of stage-limiting tumor suppressor genes in the population; and (3) information from the expanded data ($\mathbf{Y}$) and the observed data ($\underset{\sim}{y}$), via the statistical model of the system, $P\{\mathbf{Y}, \underset{\sim}{y} \mid \mathbf{N}, \Theta\}$, given by (37) and (40).

To illustrate the usefulness and applications of our models and methods, we have applied them to the eye cancer SEER data of NCI/NIH. Our analysis showed that the proposed 3-stage model with inherited cancer cases fitted the data well (see Table 2 and Figure 5), whereas the classical 2-stage model could not fit the data at all (see Table 2 and Figure 5). These results confirm findings from molecular biology that human eye cancer is described by a 3-stage model with an inherited cancer component. Notice, however, that our 3-stage model is more general than the classical 3-stage model described in Little [1], Tan [2], and Zheng [7], in that we postulate that cancer tumors develop from primary $I_3$ cells by clonal expansion (see Yang and Chen [11]). (Note that the stochastic multistage models in the literature assume that cancer tumors develop from last-stage cells as soon as they are generated, ignoring cancer progression completely; see Remark 1.) As a matter of fact, we had assumed $\Delta t = 1$ for a period of three months and found that the 3-stage models then did not fit the SEER data, clearly indicating that $P_t(s, t) < 1$ over a period of three months.

Applying our models and methods to the SEER data of human eye cancer, we have derived some useful pieces of information for the first time. Specifically, we mention the following. (1) We have estimated, for the first time, the frequency of the stage-limiting tumor suppressor gene in the US population ($p_1 \sim 9.948 \times 10^{-3}$). (2) With the estimate of $\alpha$ as 0.8411, the predicted number of uveal melanoma cases at birth is $y_0 = n_{01} p_2 = 36$ by the 3-stage models with inherited cancer component (Model-F and Model-1). (The observed number of eye cancer cases at birth is 34.) (3) The estimate of the proliferation rate ($\gamma_1$) of $J_1$ cells using Model-F is $\gamma_1 = 2.271 \times 10^{-5} \sim 0$. (The estimate is $8.603 \times 10^{-5}$ using Model-1.) This confirms that the stage-1 limiting gene is a tumor suppressor gene and that, unlike the p53 gene in chromosome 17p (see [33]), there is little or no haplo-insufficiency for this gene in cells with genotype $J_1$.

Using the models and methods of this paper, one can easily predict future cases of human eye cancer. Thus, by comparing results from different populations, our models and methods can be used to assess cancer prevention and control procedures. This will be a topic of our future research; we will not go any further here.
Appendix

The Expected Numbers of State Variables under Discrete Time Approximation

Under discrete time, the stochastic differential equations of the state variables reduce to the following stochastic difference equations, respectively:

$$
\begin{aligned}
J_i(t+1; i) &= J_i(t; i)\,[1 + \gamma_i(t)] + e_i(t+1; i), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t+1; u) &= J_j(t; u)\,[1 + \gamma_j(t)] + J_{j-1}(t; u)\,\beta_{j-1}(t) + e_j(t+1; u), \quad 0 \le u < j \le k-1,
\end{aligned}
\tag{A1}
$$

where $e_i(t+1; i) = [B_i(t; i) - J_i(t; i)\,b_i(t)] - [D_i(t; i) - J_i(t; i)\,d_i(t)]$ and $e_j(t+1; i) = [B_j(t; i) - J_j(t; i)\,b_j(t)] - [D_j(t; i) - J_j(t; i)\,d_j(t)] + [M_{j-1}(t; i) - J_{j-1}(t; i)\,\beta_{j-1}(t)]$ for $j > i$.
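A single time step of (A1) can be sketched in Python as follows. The sketch rests on assumptions that are not stated in this appendix: within one time unit each cell independently either divides (probability $b_j$), dies (probability $d_j$), yields one mutant of the next stage while itself remaining (probability $\beta_j$), or does nothing, and $\gamma_j = b_j - d_j$. The function names `step` and `a1_right_hand_side` and all numerical rates are hypothetical. The second function rebuilds the right-hand side of (A1) from the simulated counts, so the two results should agree up to rounding.

```python
import random


def step(counts, b, d, beta, rng):
    """Advance every stage by one time unit; returns (new_counts, B, D, M).

    Assumed sampling scheme (not from the paper's appendix): each cell does
    at most one of birth, death, mutation in a time unit.
    """
    k = len(counts)
    births, deaths, muts = [0] * k, [0] * k, [0] * k
    for j in range(k):
        for _ in range(counts[j]):
            u = rng.random()
            if u < b[j]:
                births[j] += 1
            elif u < b[j] + d[j]:
                deaths[j] += 1
            elif u < b[j] + d[j] + beta[j]:
                muts[j] += 1
    new_counts = []
    for j in range(k):
        inflow = muts[j - 1] if j > 0 else 0  # mutants arriving from stage j-1
        new_counts.append(counts[j] + births[j] - deaths[j] + inflow)
    return new_counts, births, deaths, muts


def a1_right_hand_side(counts, b, d, beta, births, deaths, muts):
    """Evaluate J_j (1 + gamma_j) + J_{j-1} beta_{j-1} + e_j for every stage j."""
    values = []
    for j in range(len(counts)):
        gamma = b[j] - d[j]  # assumed definition of the proliferation rate
        drift = counts[j] * (1.0 + gamma)
        e = (births[j] - counts[j] * b[j]) - (deaths[j] - counts[j] * d[j])
        if j > 0:
            drift += counts[j - 1] * beta[j - 1]
            e += muts[j - 1] - counts[j - 1] * beta[j - 1]
        values.append(drift + e)
    return values


rng = random.Random(2013)
counts = [200, 20, 0]                      # hypothetical initial cell numbers
b, d, beta = [0.05, 0.06, 0.07], [0.04, 0.03, 0.02], [0.01, 0.02, 0.0]
new_counts, births, deaths, muts = step(counts, b, d, beta, rng)
print(new_counts)
print(a1_right_hand_side(counts, b, d, beta, births, deaths, muts))
```

The decomposition separates a deterministic drift from the random noise terms $e_j$, which is what makes the expected values in the rest of the appendix tractable.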
The initial conditions at birth ($t_0$) for the above stochastic difference equations are $J_j(t_0; i) > 0$ if $j = i, i+1$, and $J_j(t_0; i) = 0$ if $j > i+1$. The solution of the above difference equations under these initial conditions is given, respectively, by

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_i(s)) + \sum_{s=t_0+1}^{t} e_i(s; i) \prod_{u=s}^{t-1} (1 + \gamma_i(u)), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i) \prod_{s=t_0}^{t-1} (1 + \gamma_j(s)) + \sum_{s=t_0}^{t-1} J_{j-1}(s; i)\,\beta_{j-1}(s) \prod_{u=s+1}^{t-1} (1 + \gamma_j(u)) \\
&\quad + \sum_{s=t_0+1}^{t} e_j(s; i) \prod_{u=s}^{t-1} (1 + \gamma_j(u)), \quad 0 \le i < j \le k-1.
\end{aligned}
\tag{A2}
$$
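The first line of (A2) can be checked numerically against direct iteration of the first line of (A1). The following is a minimal sketch: the noise values are arbitrary fixed numbers rather than draws from the model, and the function names and inputs are hypothetical. Here `gammas[s]` stands for $\gamma_i(t_0 + s)$ and `noise[s]` for $e_i(t_0 + s + 1; i)$.

```python
import math


def iterate_a1(x0, gammas, noise):
    """Apply J(t+1) = J(t) (1 + gamma(t)) + e(t+1) step by step."""
    x = x0
    for g, e in zip(gammas, noise):
        x = x * (1.0 + g) + e
    return x


def solution_a2(x0, gammas, noise):
    """Evaluate the first line of (A2) after len(gammas) steps."""
    n = len(gammas)
    total = x0 * math.prod(1.0 + g for g in gammas)
    for s in range(1, n + 1):
        # e(t0 + s) is carried forward by the growth factors from time
        # t0 + s up to t - 1; the product is empty (equal to 1) when s = n.
        total += noise[s - 1] * math.prod(1.0 + g for g in gammas[s:])
    return total


gammas = [0.02, -0.01, 0.05, 0.0]   # hypothetical time-varying rates
noise = [1.5, -2.0, 0.25, 3.0]      # hypothetical noise realizations
print(iterate_a1(100.0, gammas, noise), solution_a2(100.0, gammas, noise))
```

The two printed values should coincide up to floating-point rounding, since (A2) is simply the unrolled form of the recursion.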
If the model is time homogeneous, so that $\beta_j(t) = \beta_j$ and $\gamma_j(t) = \gamma_j$ ($j = i, \ldots, k-1$), and if $\gamma_i \ne \gamma_j$ for $i \ne j$, then the above solutions under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) reduce to

$$
\begin{aligned}
J_i(t; i) &= J_i(t_0; i)\,(1 + \gamma_i)^{t-t_0} + \eta_i^{(0)}(t; i), \quad i = 0, 1, \operatorname{Min}(k-1, 2), \\
J_j(t; i) &= J_j(t_0; i)\,(1 + \gamma_j)^{t-t_0} + \sum_{u=i}^{j-1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i) \\
&= \sum_{u=i}^{i+1} J_u(t_0; i) \left\{ \prod_{v=u}^{j-1} \beta_v \right\} \sum_{r=u}^{j} A_{uj}(r)\,(1 + \gamma_r)^{t-t_0} \\
&\quad + \eta_j^{(0)}(t; i) + \sum_{u=1}^{j-i} \left\{ \prod_{r=j-u}^{j-1} \beta_r \right\} \eta_j^{(u)}(t; i), \quad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$

where $\eta_j^{(0)}(t; i) = \sum_{s=t_0+1}^{t} e_j(s; i)\,(1 + \gamma_j)^{t-s}$ and $\eta_j^{(u)}(t; i) = \sum_{s=t_0+1}^{t} e_{j-u}(s; i) \sum_{v=u}^{j} A_{uj}(v)\,(1 + \gamma_v)^{t-s}$ ($u = 1, \ldots, j-i$).
Thus, if the model is time homogeneous and if $\gamma_i \ne \gamma_j$ for $i \ne j$, the $E[J_{k-1}(t; i)]$'s in discrete time models under the initial conditions ($J_j(t_0; i) = 0$, $j > i+1$) are given, respectively, by

$$
E[J_{k-1}(t; i)] = \sum_{u=i}^{i+1} E[J_u(t_0; i)] \left( \prod_{v=u}^{k-2} \beta_v \right) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1 + \gamma_r)^{t-t_0}, \quad i = 0, 1, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
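Expected numbers such as those in (A4) can also be obtained by iterating the expectation of (A1) directly. The sketch below assumes that the noise terms $e_j$ have expectation zero, which is what the absence of the $\eta$ terms in (A4) suggests, and that the rates are time homogeneous. For two adjacent stages with $\gamma_0 \ne \gamma_1$, the recursion has an elementary closed form obtained by summing a geometric series, which serves as a check. The function names and all numerical values are hypothetical.

```python
def expected_counts(j0, beta, gamma, steps):
    """Iterate E[J_j(t+1)] = E[J_j(t)] (1 + gamma_j) + E[J_{j-1}(t)] beta_{j-1}."""
    state = list(j0)
    for _ in range(steps):
        nxt = [state[0] * (1.0 + gamma[0])]
        for j in range(1, len(state)):
            nxt.append(state[j] * (1.0 + gamma[j]) + state[j - 1] * beta[j - 1])
        state = nxt
    return state


def two_stage_closed_form(j0, beta0, gamma, steps):
    """Closed form for the second of two stages, valid when gamma[0] != gamma[1]."""
    g0, g1 = gamma
    growth0 = (1.0 + g0) ** steps
    growth1 = (1.0 + g1) ** steps
    # Own growth of the second stage plus the accumulated inflow from the first.
    return j0[1] * growth1 + j0[0] * beta0 * (growth1 - growth0) / (g1 - g0)


j0 = [1000.0, 5.0]      # hypothetical expected cell numbers at t0
beta = [1e-3]           # hypothetical mutation rate from stage 0 to stage 1
gamma = [0.01, 0.03]    # hypothetical, distinct proliferation rates
print(expected_counts(j0, beta, gamma, 40)[1])
print(two_stage_closed_form(j0, beta[0], gamma, 40))
```

The requirement that the proliferation rates be distinct shows up here as the division by `g1 - g0`; with equal rates the closed form would need a separate limiting expression, while the recursion itself remains valid.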
References

[1] M. P. Little, “Cancer models, ionization and genomic instability: a review,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, “Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach,” International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, “Stochastic modeling of carcinogenesis: state space models and estimation of parameters,” Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, “Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Science, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, “Stochastic multistage cancer models: a fresh look at an old approach,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, “A stochastic model of human colon cancer involving multiple pathways,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, “A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases,” Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, “A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays,” Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, “Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer,” Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, “Smoking molecular damage in bronchial epithelium,” Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, “Emerging insights into the molecular pathogenesis of uveal melanoma,” Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, “Genetics of uveal melanoma,” Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, “A new stochastic and state space model of human colon cancer incorporating multiple pathways,” Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, “A stochastic model of radiation carcinogenesis: latent time distributions and their properties,” Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, “Stochastic population dynamic effects for lung cancer progression,” Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, “Mutation and cancer: statistical study of retinoblastoma,” Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, “Cancer risk assessment by state space models,” in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, “Stochastic modeling of carcinogenesis: some new insights,” Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, “Cancer stochastic models,” in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, “Multistage carcinogenesis and the incidence of colorectal cancer,” Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, “Probability models for cancer development and progression,” Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm (with discussion),” Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, “Bayesian statistics without tears: a sampling-resampling perspective,” American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, “Molecular genetics of uveal melanoma,” Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, “The small intestine as a model for evaluating adult tissue stem cell drug targets,” Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, “Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency,” Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.
Submit your manuscripts athttpwwwhindawicom
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical Problems in Engineering
Hindawi Publishing Corporationhttpwwwhindawicom
Differential EquationsInternational Journal of
Volume 2014
Applied MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Mathematical PhysicsAdvances in
Complex AnalysisJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
OptimizationJournal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Operations ResearchAdvances in
Journal of
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Function Spaces
Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014
International Journal of Mathematics and Mathematical Sciences
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Algebra
Discrete Dynamics in Nature and Society
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Decision SciencesAdvances in
Discrete MathematicsJournal of
Hindawi Publishing Corporationhttpwwwhindawicom
Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014
Stochastic AnalysisInternational Journal of
ISRN Biomathematics 17
information from three sources (1)previous information andexperiences about the parameters in terms of the prior distri-bution 119875Θ of the parameters (2) biological information ofinherited cancer cases via the genetic segregation of staging-limiting tumor suppressor genes in the population and (3)
information from the expanded data (Y) and the observeddata (
˜119910) via the statistical model from the system (119875Y
˜119910 |
N Θ) given by (37) and (40)To illustrate the usefulness and applications of ourmodels
and methods we have applied our models and methods tothe eye cancer SEER data of NCINIH Our analysis clearlyshowed that the proposed 3-stage model with inheritedcancer cases fitted the data nicely (see Table 2 and Figure 5)on the other hand the classical 2-stage model cannot fit thedata at all (see Table 2 and Figure 5)These results clearly haveconfirmed results frommolecular biology that the human eyecancer is derived by a 3-stage model with inherited cancercomponent Notice however our 3-stage multistage modelis more general than the classical 3-stage model as describedin Little [1] Tan [2] and Zheng [7] in that we postulatethat cancer tumors develop from primary 119868
3cells by clonal
expansion (see Yang and Chen [11]) (Note that the stochasticmultistagemodels in the literature assume that cancer tumorsdevelop from last stage cells immediately as soon as theyare generated ignoring completely cancer progression seeRemark 1) As a matter of fact we had assumed Δ119905 = 1 fora period of three months and found that the 3-stage modelsthen did not fit the SEER data clearly indicating that119875
119905(119904 119905) lt
1 over a period of three monthsApplying our models and methods to the SEER data
of human eye cancer we have derived for the first timesome useful pieces of information Specifically we mention(1) for the first time that we have estimated the frequencyof the staging-limiting tumor suppressor gene in the USpopulation (119901
1sim 9948 times 10
minus3) (2) With the estimate of 120572as = 08411 the predicted number of uveal melanoma atbirth is 119910
0= 119899
011199012= 36 by 3-stage models with inherited
cancer component (Model-F and Model-1) (The observednumber of eye cancer at birth is 34) (3) The estimateof the proliferation rate (120574
1) of 119869
1cells using Model-F is
1205741
= 2271 times 10minus5
sim 0 (The estimate is 8603 times 10minus5
using Model-1) This confirms that the stage-1 limitinggene is a tumor suppressor gene and unlike the p53 genein chromosome 17p (see [33]) there is little or no haploidinsufficiency for this gene in cells with genotype 119869
1
Using models and methods of this paper one can easilypredict future cancer cases for human eye cancer Thus bycomparing results from different populations our modelsand methods can be used to assess cancer prevention andcontrol procedures This will be our future research topicswe will not go any further here
Appendix
The Expected Numbers of State Variablesunder Discrete Time Approximation
Under discrete time the stochastic differential equations ofstate variables reduce to the following stochastic difference
equations of state variables respectively
119869119894 (119905 + 1 119894) = 119869
119894 (119905 119894) [1 + 120574119894 (119905)]
+ 119890119894(119905 + 1 119894) 119894 = 0 1 Min (119896 minus 1 2)
119869119895(119905 + 1 119906) = 119869
119895(119905 119906) [1 + 120574
119895(119905)] + 119869
119895minus1(119905 119906) 120573
119895minus1(119905)
+ 119890119895(119905 + 1 119906) 0 le 119906 lt 119895 le 119896 minus 1
(A1)
where 119890119894(119905 + 1 119894) = [119861
119894(119905 119894) minus 119869
119894(119905 119894)119887
119894(119905)] minus [119863
119894(119905 119894) minus
119869119894(119905 119894)119889
119894(119905)] and 119890
119895(119905+1 119894) = [119861
119895(119905 119894)minus119869
119895(119905 119894)119887
119895(119905)] minus [119863
119895(119905 119894)minus
119869119895(119905 119894)119889
119895(119905)] + [119872
119895minus1(119905 119894) minus 119869
119895minus1(119905 119894)120573
119895minus1(119905)] for 119895 gt 119894
The initial conditions at birth (1199050) for the above stochastic
difference equations are 119869119895(1199050 119894) gt 0 if (119895 = 119894 119894 + 1) and
119869119895(1199050 119894) = 0 if 119895 gt 119894 + 1 The solution of the above difference
equations under these initial conditions is given respectivelyby
119869119894(119905 119894) = 119869
119894(1199050 119894)
119905minus1
prod
119904=1199050
(1 + 120574119894(119904))
+
119905
sum
119904=1199050+1
119890119894(119904 119894)
119905minus1
prod
119906=119904
(1 + 120574119894(119906))
119894 = 0 1 Min (119896 minus 1 2)
119869119895(119905 119894) = 119869
119895(1199050 119894)
119905minus1
prod
119904=1199050
(1 + 120574119895(119904))
+
119905minus1
sum
119904=1199050
119869119895minus1 (119904 119894) 120573119895minus1 (119904)
119905minus1
prod
119906=119904+1
(1 + 120574119895 (119906))
+
119905
sum
119904=1199050+1
119890119895(119904 119894)
119905minus1
prod
119906=119904
(1 + 120574119895(119906))
0 le 119894 lt 119895 le 119896 minus 1
(A2)
If the model is time homogeneous so that 120573119895(119905) =
120573119895 120574
119895(119905) = 120574
119895(119895 = 119894 119896 minus 1 and if 120574
119894= 120574119895if 119894 = 119895 then the
above solutions under the initial conditions (119869119895(1199050 119894) = 0 119895 gt
119894 + 1) reduce to
119869119894 (119905 119894) = 119869
119894(1199050 119894) (1 + 120574
119894)119905minus1199050
+ 120578(0)
119894(119905 119894) 119894 = 0 1 Min (119896 minus 1 2)
119869119895(119905 119894) = 119869
119895(1199050 119894) (1 + 120574
119895)119905minus1199050
+
119895minus1
sum
119906=119894
119869119906(1199050 119894)
119895minus1
prod
119907=119906
120573119907
119895
sum
119903=119906
119860119906119895(119903) (1 + 120574
119903)119905minus1199050
+ 120578(0)
119895(119905 119894) +
119895minus119894
sum
119906=1
119895minus1
prod
119903=119895minus119906
120573119903
120578(119906)
119895(119905 119894)
18 ISRN Biomathematics
=
119894+1
sum
119906=119894
119869119906(1199050 119894)
119895minus1
prod
119907=119906
120573119907
119895
sum
119903=119906
119860119906119895 (119903) (1 + 120574119903)
119905minus1199050
+ 120578(0)
119895(119905 119894) +
119895minus119894
sum
119906=1
119895minus1
prod
119903=119895minus119906
120573119903
120578(119906)
119895(119905 119894)
0 le 119894 lt 119895 le 119896 minus 1
(A3)
where 120578(0)
119895(119905 119894) = sum
119905
119904=1199050+1119890119895(119904 119894)(1 + 120574
119894)119905minus119904 120578
(119906)
119895(119905 119894) =
sum119905
119904=1199050+1119890119895minus119906
(119904 119894)sum119895
119907=119906119860119906119895(119907)(1 + 120574
119907)119905minus119904 (119906 = 1 119895 minus 119894)
Thus if the model is time homogeneous and if 120574119894=
120574119895if 119894 = 119895 the 119864[119869
119896minus1(119905 119894)]rsquos in discrete time models under
the initial conditions (119869119895(1199050 119894) = 0 119895 gt 119894 + 1) are given
respectively by
119864 [119869119896minus1
(119905 119894)] =
119894+1
sum
119906=119894
119864 [119869119906(1199050 119894)] (
119896minus2
prod
119907=119906
120573119907)
times
119896minus1
sum
119903=119906
119860119906(119896minus1)
(119903) (1 + 120574119903)119905minus1199050
119894 = 0 1 Min (119896 minus 1 2)
(A4)
References
[1] M P Little ldquoCancermodels ionization and genomic instabilitya reviewrdquo inHandbook of CancerModels with ApplicationsW YTan and L Hanin Eds chapter 5 World Scientific River EdgeNJ USA 2008
[2] W Y Tan Stochastic Models of Carcinogenesis Marcel DekkerNew York NY USA 1991
[3] W Y Tan ldquoStochastic multiti-stage models of carcinogenesis ashidden Markov models a new approachrdquo International Journalof Systems and Synthetic Biology vol 1 pp 313ndash337 2010
[4] W Y Tan L J Zhang and C W Chen ldquoStochastic modelingof carcinogenesis state space models and estimation of param-etersrdquoDiscrete and Continuous Dynamical Systems B vol 4 no1 pp 297ndash322 2004
[5] W Y Tan C W Chen and L J Zhang ldquoCancer biology cancermodels and stochasticmathematical analysis of carcinogenesisrdquoin Handbook of Cancer Models and Applications W Y Tan andL Hanin Eds chapter 3World Scientific River Edge NJ USA2008
[6] R A Weinberg The Biology of Human Cancer Garland Sci-ences Taylor and Frances New York NY USA 2007
[7] Q Zheng ldquoStochastic multistage cancer models a fresh lookat an old approachrdquo in Handbook of Cancer Models andApplications W Y Tan and L Hanin Eds chapter 2 WorldScientific River Edge NJ USA 2008
[8] W Y Tan L J Zhang W Chen and J M Zhu ldquoA stochasticmodel of human colon cancer involving multiple pathwaysrdquo inHandbook of Cancer Models with Applications W Y Tan and LHanin Eds chapter 11 World Scientific River Edge NJ USA2008
[9] W Y Tan and H Zhou ldquoA new stochastic model of retinoblas-toma involving both hereditary and non- hereditary cancercasesrdquo Journal of Carcinogenesis and Mutagenesis vol 2 no 2article 117 2011
[10] X W Yan Stochastic and State Space Models of carcinogenesisUnder Complex Situation [PhD thesis] Department of Math-ematical Sciences University of Memphsis Memphis TennUSA
[11] G L Yang and C W Chen ldquoA stochastic two-stage carcino-genesis model a new approach to computing the probabilityof observing tumor in animal bioassaysrdquo Mathematical Bio-sciences vol 104 no 2 pp 247ndash258 1991
[12] H Osada and T Takahashi ldquoGenetic alterations of multipletumor suppressors and oncogenes in the carcinogenesis andprogression of lung cancerrdquo Oncogene vol 21 no 48 pp 7421ndash7434 2002
[13] I I Wistuba L Mao and A F Gazdar ldquoSmoking moleculardamage in bronchial epitheliumrdquo Oncogene vol 21 no 48 pp7298ndash7306 2002
[14] S Landreville O A Agapova and J W Hartbour ldquoEmerginginsights into the molecular pathogenesis of uveal melanomardquoFuture Oncology vol 4 no 5 pp 629ndash636 2008
[15] H W Mensink D Paridaens and A De Klein ldquoGenetics ofuveal melanomardquo Expert Review of Ophthalmology vol 4 no6 pp 607ndash616 2009
[16] WY Tan andXWYan ldquoAnew stochastic and state spacemodelof human colon cancer incorporating multiple pathwaysrdquoBiology Direct vol 5 article 26 2010
[17] L B Klebanov S T Rachev and A Y Yakovlev ldquoA stochasticmodel of radiation carcinogenesis latent time distributions andtheir propertiesrdquo Mathematical Biosciences vol 113 no 1 pp51ndash75 1993
[18] A Y Yakovlev and A D Tsodikov Stochastic Models of TumorLatency and Their Biostatistical Applications World ScientificRiver Edge NJ USA 1996
$$
\begin{aligned}
{}={}& \sum_{u=i}^{i+1} J_u(t_0, i) \Bigl(\prod_{v=u}^{j-1} \beta_v\Bigr) \sum_{r=u}^{j} A_{uj}(r)\,(1+\gamma_r)^{t-t_0} \\
&+ \eta_j^{(0)}(t, i) + \sum_{u=1}^{j-i} \Bigl(\prod_{r=j-u}^{j-1} \beta_r\Bigr)\, \eta_j^{(u)}(t, i), \qquad 0 \le i < j \le k-1,
\end{aligned}
\tag{A3}
$$

where $\eta_j^{(0)}(t, i) = \sum_{s=t_0+1}^{t} e_j(s, i)\,(1+\gamma_i)^{t-s}$ and $\eta_j^{(u)}(t, i) = \sum_{s=t_0+1}^{t} e_{j-u}(s, i) \sum_{v=u}^{j} A_{uj}(v)\,(1+\gamma_v)^{t-s}$ $(u = 1, \ldots, j-i)$.
Thus, if the model is time homogeneous and if $\gamma_i \neq \gamma_j$ for $i \neq j$, the $E[J_{k-1}(t, i)]$'s in discrete time models under the initial conditions ($J_j(t_0, i) = 0$ for $j > i+1$) are given, respectively, by
$$
E[J_{k-1}(t, i)] = \sum_{u=i}^{i+1} E[J_u(t_0, i)] \Bigl(\prod_{v=u}^{k-2} \beta_v\Bigr) \times \sum_{r=u}^{k-1} A_{u(k-1)}(r)\,(1+\gamma_r)^{t-t_0}, \qquad i = 0, 1, \operatorname{Min}(k-1, 2).
\tag{A4}
$$
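As a numerical illustration of (A4), the following minimal Python sketch evaluates the closed form and compares it with direct iteration of a mean recursion. Two ingredients are assumptions, not taken from the text above: the coefficients are taken to be $A_{uj}(r) = \prod_{v=u,\, v \neq r}^{j} (\gamma_r - \gamma_v)^{-1}$, and the expected numbers are assumed to satisfy $E[J_j(t+1, i)] = (1+\gamma_j)E[J_j(t, i)] + \beta_{j-1}E[J_{j-1}(t, i)]$, with the random terms $e_j$ having mean zero. All function names and parameter values are hypothetical.

```python
from math import prod, isclose

def coeff_A(gamma, u, j, r):
    """ASSUMED form of A_{uj}(r): product over v = u..j, v != r, of 1/(gamma_r - gamma_v)."""
    return prod(1.0 / (gamma[r] - gamma[v]) for v in range(u, j + 1) if v != r)

def expected_last_stage(init, beta, gamma, i, k, steps):
    """Closed form (A4) for E[J_{k-1}(t, i)], with steps = t - t0.

    init maps u -> E[J_u(t0, i)]; only u = i and u = i + 1 contribute.
    """
    j = k - 1
    total = 0.0
    for u in (i, i + 1):
        if u > j:
            continue
        flow = prod(beta[v] for v in range(u, j))  # product over v = u..k-2
        growth = sum(coeff_A(gamma, u, j, r) * (1.0 + gamma[r]) ** steps
                     for r in range(u, j + 1))
        total += init.get(u, 0.0) * flow * growth
    return total

def iterate_means(init, beta, gamma, k, steps):
    """Direct iteration of the ASSUMED mean recursion (noise terms have mean zero)."""
    x = [init.get(j, 0.0) for j in range(k)]
    for _ in range(steps):
        x = [(1.0 + gamma[j]) * x[j] + (beta[j - 1] * x[j - 1] if j > 0 else 0.0)
             for j in range(k)]
    return x[k - 1]

# Hypothetical parameter values, chosen only for illustration.
gamma = [0.01, 0.02, 0.035, 0.05]   # distinct growth rates, as (A4) requires
beta = [1e-3, 2e-3, 5e-3]           # stage-to-stage transition rates
init = {0: 1000.0, 1: 10.0}         # E[J_u(t0, i)] for u = i, i + 1 with i = 0

closed = expected_last_stage(init, beta, gamma, i=0, k=4, steps=25)
direct = iterate_means(init, beta, gamma, k=4, steps=25)
print(closed, direct)
```

Under these assumptions the two values agree up to floating-point error. The requirement that the $\gamma_i$ be distinct shows up in `coeff_A`, which would otherwise divide by zero.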
References
[1] M. P. Little, "Cancer models, ionization, and genomic instability: a review," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 5, World Scientific, River Edge, NJ, USA, 2008.
[2] W. Y. Tan, Stochastic Models of Carcinogenesis, Marcel Dekker, New York, NY, USA, 1991.
[3] W. Y. Tan, "Stochastic multi-stage models of carcinogenesis as hidden Markov models: a new approach," International Journal of Systems and Synthetic Biology, vol. 1, pp. 313–337, 2010.
[4] W. Y. Tan, L. J. Zhang, and C. W. Chen, "Stochastic modeling of carcinogenesis: state space models and estimation of parameters," Discrete and Continuous Dynamical Systems B, vol. 4, no. 1, pp. 297–322, 2004.
[5] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer biology, cancer models and stochastic mathematical analysis of carcinogenesis," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 3, World Scientific, River Edge, NJ, USA, 2008.
[6] R. A. Weinberg, The Biology of Human Cancer, Garland Sciences, Taylor and Francis, New York, NY, USA, 2007.
[7] Q. Zheng, "Stochastic multistage cancer models: a fresh look at an old approach," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 2, World Scientific, River Edge, NJ, USA, 2008.
[8] W. Y. Tan, L. J. Zhang, W. Chen, and J. M. Zhu, "A stochastic model of human colon cancer involving multiple pathways," in Handbook of Cancer Models with Applications, W. Y. Tan and L. Hanin, Eds., chapter 11, World Scientific, River Edge, NJ, USA, 2008.
[9] W. Y. Tan and H. Zhou, "A new stochastic model of retinoblastoma involving both hereditary and non-hereditary cancer cases," Journal of Carcinogenesis and Mutagenesis, vol. 2, no. 2, article 117, 2011.
[10] X. W. Yan, Stochastic and State Space Models of Carcinogenesis Under Complex Situation [Ph.D. thesis], Department of Mathematical Sciences, University of Memphis, Memphis, Tenn, USA.
[11] G. L. Yang and C. W. Chen, "A stochastic two-stage carcinogenesis model: a new approach to computing the probability of observing tumor in animal bioassays," Mathematical Biosciences, vol. 104, no. 2, pp. 247–258, 1991.
[12] H. Osada and T. Takahashi, "Genetic alterations of multiple tumor suppressors and oncogenes in the carcinogenesis and progression of lung cancer," Oncogene, vol. 21, no. 48, pp. 7421–7434, 2002.
[13] I. I. Wistuba, L. Mao, and A. F. Gazdar, "Smoking molecular damage in bronchial epithelium," Oncogene, vol. 21, no. 48, pp. 7298–7306, 2002.
[14] S. Landreville, O. A. Agapova, and J. W. Harbour, "Emerging insights into the molecular pathogenesis of uveal melanoma," Future Oncology, vol. 4, no. 5, pp. 629–636, 2008.
[15] H. W. Mensink, D. Paridaens, and A. De Klein, "Genetics of uveal melanoma," Expert Review of Ophthalmology, vol. 4, no. 6, pp. 607–616, 2009.
[16] W. Y. Tan and X. W. Yan, "A new stochastic and state space model of human colon cancer incorporating multiple pathways," Biology Direct, vol. 5, article 26, 2010.
[17] L. B. Klebanov, S. T. Rachev, and A. Y. Yakovlev, "A stochastic model of radiation carcinogenesis: latent time distributions and their properties," Mathematical Biosciences, vol. 113, no. 1, pp. 51–75, 1993.
[18] A. Y. Yakovlev and A. D. Tsodikov, Stochastic Models of Tumor Latency and Their Biostatistical Applications, World Scientific, River Edge, NJ, USA, 1996.
[19] H. Fakir, W. Y. Tan, L. Hlatky, P. Hahnfeldt, and R. K. Sachs, "Stochastic population dynamic effects for lung cancer progression," Radiation Research, vol. 172, no. 3, pp. 383–393, 2009.
[20] A. G. Knudson, "Mutation and cancer: statistical study of retinoblastoma," Proceedings of the National Academy of Sciences of the United States of America, vol. 68, no. 4, pp. 820–823, 1971.
[21] J. F. Crow and M. Kimura, An Introduction to Population Genetics Theory, Harper and Row, New York, NY, USA, 1970.
[22] W. Y. Tan, Stochastic Models With Applications to Genetics, Cancers, AIDS and Other Biomedical Systems, World Scientific, River Edge, NJ, USA, 2002.
[23] W. Y. Tan, C. W. Chen, and L. J. Zhang, "Cancer risk assessment by state space models," in Handbook of Cancer Models and Applications, W. Y. Tan and L. Hanin, Eds., chapter 13, World Scientific, River Edge, NJ, USA, 2008.
[24] W. Y. Tan and C. W. Chen, "Stochastic modeling of carcinogenesis: some new insights," Mathematical and Computer Modelling, vol. 28, no. 11, pp. 49–71, 1998.
[25] W. Y. Tan and C. W. Chen, "Cancer stochastic models," in Encyclopedia of Statistical Sciences, Revised edition, John Wiley and Sons, New York, NY, USA, 2005.
[26] E. G. Luebeck and S. H. Moolgavkar, "Multistage carcinogenesis and the incidence of colorectal cancer," Proceedings of the National Academy of Sciences of the United States of America, vol. 99, no. 23, pp. 15095–15100, 2002.
[27] R. Durrett, J. Mayberry, S. Moseley, and D. Schmidt, "Probability models for cancer development and progression," Google Search, Google, 2012.
[28] G. E. P. Box and G. C. Tiao, Bayesian Inference in Statistical Analysis, Addison-Wesley, Reading, Mass, USA, 1973.
[29] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm (with discussion)," Journal of the Royal Statistical Society B, vol. 39, pp. 1–38, 1977.
[30] A. F. M. Smith and A. E. Gelfand, "Bayesian statistics without tears: a sampling-resampling perspective," American Statistician, vol. 46, pp. 84–88, 1992.
[31] A. E. Loercher and J. W. Harbour, "Molecular genetics of uveal melanoma," Current Eye Research, vol. 27, no. 2, pp. 69–74, 2003.
[32] C. S. Potten, C. Booth, and D. Hargreaves, "The small intestine as a model for evaluating adult tissue stem cell drug targets," Cell Proliferation, vol. 36, no. 3, pp. 115–129, 2003.
[33] C. J. Lynch and J. Milner, "Loss of one p53 allele results in four-fold reduction of p53 mRNA and protein: a basis for p53 haplo-insufficiency," Oncogene, vol. 25, no. 24, pp. 3463–3470, 2006.