Mathematical Biology
Lecture notes for MATH 4333
(formerly MATH 365)
Jeffrey R. Chasnov
The Hong Kong University of Science and Technology
Department of Mathematics
Clear Water Bay, Kowloon, Hong Kong
Copyright © 2009–2012 by Jeffrey Robert Chasnov
This work is licensed under the Creative Commons Attribution 3.0 Hong Kong License.
To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/hk/
or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco,
California, 94105, USA.
Preface
What follows are my lecture notes for Math 4333: Mathematical Biology, taught at the Hong Kong University of Science and Technology. This applied mathematics course is primarily for final-year mathematics major and minor students. Other students are also welcome to enroll, but must have the necessary mathematical skills.
My main emphasis is on mathematical modeling, with biology the sole application area. I assume that students have no knowledge of biology, but I hope that they will learn a substantial amount during the course. Students are required to know differential equations and linear algebra, and this usually means having taken two courses in these subjects. I also touch on topics in stochastic modeling, which requires some knowledge of probability. A full course on probability, however, is not a prerequisite, though it might be helpful.
Biology, as it is usually taught, requires memorizing a wide selection of facts and remembering them for exams, sometimes forgetting them soon after. For students exposed to biology in secondary school, my course may seem like a different subject. The ability to model problems using mathematics requires almost no rote memorization, but it does require a deep understanding of basic principles and a wide range of mathematical techniques. Biology offers a rich variety of topics that are amenable to mathematical modeling, and I have chosen specific topics that I have found to be the most interesting.
If, as a UST student, you have not yet decided if you will take my course, please browse these lecture notes to see if you are interested in these topics. Other web surfers are welcome to download these notes from

http://www.math.ust.hk/~machas/mathematical-biology.pdf

and to use them freely for teaching and learning. I welcome any comments, suggestions, or corrections sent to me by email ([email protected]). Although most of the material in my notes can be found elsewhere, I hope that some of it will be considered to be original.
Jeffrey R. Chasnov
Hong Kong
May 2009
Contents
1 Population Dynamics
  1.1 The Malthusian growth model
  1.2 The Logistic equation
  1.3 A model of species competition
  1.4 The Lotka-Volterra predator-prey model

2 Age-structured Populations
  2.1 Fibonacci's rabbits
  2.2 The golden ratio Φ
  2.3 The Fibonacci numbers in a sunflower
  2.4 Rabbits are an age-structured population
  2.5 Discrete age-structured populations
  2.6 Continuous age-structured populations
  2.7 The brood size of a hermaphroditic worm

3 Stochastic Population Growth
  3.1 A stochastic model of population growth
  3.2 Asymptotics of large initial populations
    3.2.1 Derivation of the deterministic model
    3.2.2 Derivation of the normal probability distribution
  3.3 Simulation of population growth

4 Infectious Disease Modeling
  4.1 The SI model
  4.2 The SIS model
  4.3 The SIR epidemic disease model
  4.4 Vaccination
  4.5 The SIR endemic disease model
  4.6 Evolution of virulence

5 Population Genetics
  5.1 Haploid genetics
    5.1.1 Spread of a favored allele
    5.1.2 Mutation-selection balance
  5.2 Diploid genetics
    5.2.1 Sexual reproduction
    5.2.2 Spread of a favored allele
    5.2.3 Mutation-selection balance
    5.2.4 Heterosis
  5.3 Frequency-dependent selection
  5.4 Linkage equilibrium
  5.5 Random genetic drift

6 Biochemical Reactions
  6.1 The law of mass action
  6.2 Enzyme kinetics
  6.3 Competitive inhibition
  6.4 Allosteric inhibition
  6.5 Cooperativity

7 Sequence Alignment
  7.1 DNA
  7.2 Brute force alignment
  7.3 Dynamic programming
  7.4 Gaps
  7.5 Local alignments
  7.6 Software
Chapter 1
Population Dynamics
Populations grow in size when the birth rate exceeds the death rate. Thomas Malthus, in An Essay on the Principle of Population (1798), used unchecked population growth to famously predict a global famine unless governments regulated family size, an idea later echoed by Mainland China's one-child policy. Charles Darwin wrote in his autobiography that reading Malthus inspired his discovery of what is now the cornerstone of modern biology: the principle of evolution by natural selection.
The Malthusian growth model is the granddaddy of all population models, and we begin this chapter with a simple derivation of the famous exponential growth law. Unchecked exponential growth obviously does not occur in nature, and population growth rates may be regulated by limited food or other environmental resources, and by competition among individuals within a species or across species. We will develop models for three types of regulation. The first model is the well-known logistic equation, a model that will also make an appearance in subsequent chapters. The second model is an extension of the logistic model to species competition. And the third model is the famous Lotka-Volterra predator-prey equations. Because all these mathematical models are nonlinear differential equations, mathematical methods to analyze such equations will be developed.
1.1 The Malthusian growth model
Let N(t) be the number of individuals in a population at time t, and let b and d be the average per capita birth rate and death rate, respectively. In a short time Δt, the number of births in the population is bΔtN, and the number of deaths is dΔtN. An equation for N at time t + Δt is then determined to be

N(t + Δt) = N(t) + bΔtN(t) − dΔtN(t),

which can be rearranged to

(N(t + Δt) − N(t))/Δt = (b − d)N(t);

and as Δt → 0,

dN/dt = (b − d)N.
With an initial population size of N0, and with r = b − d positive, the solution for N = N(t) grows exponentially:

N(t) = N0 e^(rt).
With population size replaced by the amount of money in a bank, the exponential growth law also describes the growth of an account under continuous compounding with interest rate r.
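The correspondence with compound interest can be checked numerically. The following Python sketch (my addition; the function names are illustrative) compares the exponential law with discrete compounding over many small steps:

```python
import math

def malthus(N0, r, t):
    """Closed-form solution N(t) = N0*e^(r*t) of dN/dt = r*N."""
    return N0 * math.exp(r * t)

def compound(N0, r, t, steps):
    """Discrete growth N -> N + r*dt*N applied over many small steps."""
    dt = t / steps
    N = N0
    for _ in range(steps):
        N += r * dt * N
    return N

# As the time step shrinks, discrete compounding approaches e^(rt).
exact = malthus(100.0, 0.05, 10.0)
approx = compound(100.0, 0.05, 10.0, 100000)
print(exact, approx)  # the two values agree to several digits
```

The discrete update is exactly the difference equation used in the derivation above, so its convergence to the exponential as Δt → 0 mirrors the limit taken there.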
1.2 The Logistic equation
The exponential growth law for population size is unrealistic over long times. Eventually, growth will be checked by the over-consumption of resources. We assume that the environment has an intrinsic carrying capacity K, and populations larger than this size experience heightened death rates.
To model population growth with an environmental carrying capacity K, we look for a nonlinear equation of the form

dN/dt = rNF(N),

where F(N) provides a model for environmental regulation. This function should satisfy F(0) = 1 (the population grows exponentially with growth rate r when N is small), F(K) = 0 (the population stops growing at the carrying capacity), and F(N) < 0 when N > K (the population decays when it is larger than the carrying capacity). The simplest function F(N) satisfying these conditions is linear and given by F(N) = 1 − N/K. The resulting model is the well-known logistic equation,

dN/dt = rN(1 − N/K),   (1.1)
an important model for many processes besides bounded population growth. Although (1.1) is a nonlinear equation, an analytical solution can be found by separating the variables. Before we embark on this algebra, we first illustrate some basic concepts used in analyzing nonlinear differential equations.
Fixed points, also called equilibria, of a differential equation such as (1.1) are defined as the values of N where dN/dt = 0. Here, we see that the fixed points of (1.1) are N = 0 and N = K. If the initial value of N is at one of these fixed points, then N will remain fixed there for all time. Fixed points, however, can be stable or unstable. A fixed point is stable if a small perturbation from the fixed point decays to zero so that the solution returns to the fixed point. Likewise, a fixed point is unstable if a small perturbation grows exponentially so that the solution moves away from the fixed point. Calculation of stability by means of small perturbations is called linear stability analysis. For example, consider the general one-dimensional differential equation (using the notation ẋ = dx/dt)

ẋ = f(x),   (1.2)

with x* a fixed point of the equation, that is, f(x*) = 0. To determine analytically if x* is a stable or unstable fixed point, we perturb the solution. Let us write our solution x = x(t) in the form

x(t) = x* + ε(t),   (1.3)
Figure 1.1: Determining one-dimensional stability using a graphical approach. [The figure plots f(x) versus x; the fixed points at the x-intercepts are labeled, from left to right, unstable, stable, unstable.]
where initially ε(0) is small but different from zero. Substituting (1.3) into (1.2), we obtain

ε̇ = f(x* + ε)
  = f(x*) + εf′(x*) + . . .
  = εf′(x*) + . . . ,

where the second equality uses a Taylor series expansion of f(x) about x*, and the third equality uses f(x*) = 0. If f′(x*) ≠ 0, we can neglect higher-order terms in ε for small times, and integrating we have

ε(t) = ε(0) e^(f′(x*)t).

The perturbation ε(t) to the fixed point x* goes to zero as t → ∞ provided f′(x*) < 0. Therefore, the stability condition on x* is:

x* is a stable fixed point if f′(x*) < 0,
x* is an unstable fixed point if f′(x*) > 0.
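This criterion is easy to apply numerically when f′ is awkward to compute by hand. A short Python sketch (my addition; the function name is illustrative) estimates f′(x*) with a centered difference and applies the sign test, here to the logistic equation with r = K = 1:

```python
def classify_fixed_point(f, x_star, h=1e-6):
    """Classify a fixed point x* of x' = f(x) by the sign of f'(x*),
    estimated with a centered finite difference."""
    fprime = (f(x_star + h) - f(x_star - h)) / (2 * h)
    if fprime < 0:
        return "stable"
    elif fprime > 0:
        return "unstable"
    return "inconclusive"  # linear analysis fails when f'(x*) = 0

# Logistic equation f(N) = r N (1 - N/K) with r = K = 1:
logistic = lambda N: N * (1 - N)
print(classify_fixed_point(logistic, 0.0))  # unstable
print(classify_fixed_point(logistic, 1.0))  # stable
```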
Another equivalent but sometimes simpler approach to analyzing the stability of the fixed points of a one-dimensional nonlinear equation such as (1.2) is to plot f(x) versus x. We show a generic example in Fig. 1.1. The fixed points are the x-intercepts of the graph. Directional arrows on the x-axis can be drawn based on the sign of f(x). If f(x) < 0, then the arrow points to the left; if f(x) > 0, then the arrow points to the right. The arrows show the direction of motion for a particle at position x satisfying ẋ = f(x). As illustrated in Fig. 1.1, fixed points with arrows on both sides pointing in are stable, and fixed points with arrows on both sides pointing out are unstable.
In the logistic equation (1.1), the fixed points are N* = 0, K. A sketch of f(N) = rN(1 − N/K) versus N, with r, K > 0, in Fig. 1.2 immediately shows that N* = 0 is an unstable fixed point and N* = K is a stable fixed point.

Figure 1.2: Determining stability of the fixed points of the logistic equation. [The figure plots f(N) versus N; N* = 0 is labeled unstable and N* = K stable.]

The analytical approach computes f′(N) = r(1 − 2N/K), so that f′(0) = r > 0 and f′(K) = −r < 0. Again we conclude that N* = 0 is unstable and N* = K is stable.
We now solve the logistic equation analytically. Although this relatively simple equation can be solved as is, we first nondimensionalize to illustrate this very important technique that will later prove to be most useful. Perhaps here one can guess the appropriate unit of time to be 1/r and the appropriate unit of population size to be K. However, we prefer to demonstrate a more general technique that may be usefully applied to equations for which the appropriate dimensionless variables are difficult to guess. We begin by nondimensionalizing time and population size:

τ = t/t*,   η = N/N*,

where t* and N* are unknown dimensional units. The derivative dN/dt is computed as

dN/dt = (d(N*η)/dτ)(dτ/dt) = (N*/t*) dη/dτ.

Therefore, the logistic equation (1.1) becomes

dη/dτ = rt* η(1 − N*η/K),

which assumes the simplest form with the choices t* = 1/r and N* = K. Therefore, our dimensionless variables are

τ = rt,   η = N/K,

and the logistic equation, in dimensionless form, becomes

dη/dτ = η(1 − η),   (1.4)
with the dimensionless initial condition η(0) = η0 = N0/K, where N0 is the initial population size. Note that the dimensionless logistic equation (1.4) has no free parameters, while the dimensional form of the equation (1.1) contains r and K. Reduction in the number of free parameters (here, two: r and K) by the number of independent units (here, also two: time and population size) is a general feature of nondimensionalization. The theoretical result is known as the Buckingham Pi Theorem. Reducing the number of free parameters in a problem to the absolute minimum is especially important before proceeding to a numerical solution. The parameter space that must be explored may be substantially reduced.
Solving the dimensionless logistic equation (1.4) can proceed by separating the variables. Separating and integrating from τ = 0 to τ and from η0 to η yields

∫_{η0}^{η} dη/(η(1 − η)) = ∫_0^τ dτ.

The integral on the left-hand side can be performed using the method of partial fractions:

1/(η(1 − η)) = A/η + B/(1 − η) = (A + (B − A)η)/(η(1 − η));

and by equating the coefficients of the numerators proportional to η^0 and η^1, we find that A = 1 and B = 1. Therefore,

∫_{η0}^{η} dη/(η(1 − η)) = ∫_{η0}^{η} dη/η + ∫_{η0}^{η} dη/(1 − η)
  = ln(η/η0) − ln((1 − η)/(1 − η0))
  = ln(η(1 − η0)/(η0(1 − η)))
  = τ.

Solving for η, we first exponentiate both sides and then isolate η:

η(1 − η0)/(η0(1 − η)) = e^τ,
η(1 − η0) = η0 e^τ − ηη0 e^τ,
η(1 − η0 + η0 e^τ) = η0 e^τ,
η = η0/(η0 + (1 − η0) e^(−τ)).
Returning to the dimensional variables, we finally have

N(t) = N0/(N0/K + (1 − N0/K) e^(−rt)).   (1.5)
There are several ways to write the final result given by (1.5). The presentation of a mathematical result requires a good aesthetic sense and is an important element of mathematical technique. When deciding how to write (1.5), I considered if it was easy to observe the following limiting results: (1) N(0) = N0; (2) lim_(t→∞) N(t) = K; and (3) lim_(K→∞) N(t) = N0 e^(rt).
Figure 1.3: Solutions of the dimensionless logistic equation. [The figure plots η versus τ.]
In Fig. 1.3, we plot the solution to the dimensionless logistic equation for initial conditions η0 = 0.02, 0.2, 0.5, 0.8, 1.0, and 1.2. The lowest curve is the characteristic "S-shape" usually associated with the solution of the logistic equation. This sigmoidal curve appears in many other types of models. The MATLAB script to produce Fig. 1.3 is shown below.
eta0=[0.02 .2 .5 .8 1 1.2];
tau=linspace(0,8);
for i=1:length(eta0)
eta=eta0(i)./(eta0(i)+(1-eta0(i)).*exp(-tau));
plot(tau,eta);hold on
end
axis([0 8 0 1.25]);
xlabel('\tau'); ylabel('\eta'); title('Logistic Equation');
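For readers without MATLAB, the same closed-form solution η(τ) = η0/(η0 + (1 − η0)e^(−τ)) can be tabulated in plain Python (my addition; plotting is omitted so the sketch stays dependency-free):

```python
import math

def eta(tau, eta0):
    """Solution of the dimensionless logistic equation (1.4):
    eta(tau) = eta0 / (eta0 + (1 - eta0)*exp(-tau))."""
    return eta0 / (eta0 + (1.0 - eta0) * math.exp(-tau))

for eta0 in [0.02, 0.2, 0.5, 0.8, 1.0, 1.2]:
    # eta(0) recovers the initial condition, and every solution
    # approaches the carrying capacity eta = 1 as tau grows.
    print(eta0, eta(0.0, eta0), eta(8.0, eta0))
```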
1.3 A model of species competition
Suppose that two species compete for the same resources. To build a model, we can start with logistic equations for both species. Different species would have different growth rates and different carrying capacities. If we let N1 and N2 be the number of individuals of species one and species two, then

dN1/dt = r1N1(1 − N1/K1),
dN2/dt = r2N2(1 − N2/K2).

These are uncoupled equations, so that asymptotically, N1 → K1 and N2 → K2. How do we model the competition between species? If N1 is much smaller than K1, and N2 much smaller than K2, then resources are plentiful and populations
grow exponentially with growth rates r1 and r2. If species one and two compete, then the growth of species one reduces resources available to species two, and vice versa. Since we do not know the impact species one and two have on each other, we introduce two additional parameters to model the competition. A reasonable modification that couples the two logistic equations is

dN1/dt = r1N1(1 − (N1 + α12N2)/K1),   (1.6a)
dN2/dt = r2N2(1 − (α21N1 + N2)/K2),   (1.6b)

where α12 and α21 are dimensionless parameters that model the consumption of species one's resources by species two, and vice versa. For example, suppose that both species eat exactly the same food, but species two consumes twice as much as species one. Since one individual of species two consumes the equivalent of two individuals of species one, the correct model is α12 = 2 and α21 = 1/2.
Another example supposes that species one and two occupy the same niche, consume resources at the same rate, but may have different growth rates and carrying capacities. Can the species coexist, or does one species eventually drive the other to extinction? It is possible to answer this question without actually solving the differential equations. With α12 = α21 = 1 as appropriate for this example, the coupled logistic equations (1.6) become

dN1/dt = r1N1(1 − (N1 + N2)/K1),
dN2/dt = r2N2(1 − (N1 + N2)/K2).   (1.7)
For the sake of argument, we assume that K1 > K2. The only fixed points other than the trivial one (N1, N2) = (0, 0) are (N1, N2) = (K1, 0) and (N1, N2) = (0, K2). Stability can be computed analytically by a two-dimensional Taylor series expansion, but here a simpler argument can suffice. We first consider (N1, N2) = (K1, ε), with ε small. Since K1 > K2, observe from (1.7) that Ṅ2 < 0, so that species two goes extinct. Therefore (N1, N2) = (K1, 0) is a stable fixed point. Now consider (N1, N2) = (ε, K2), with ε small. Again, since K1 > K2, observe from (1.7) that Ṅ1 > 0 and species one increases in number. Therefore, (N1, N2) = (0, K2) is an unstable fixed point. We have thus found that, within our coupled logistic model, species that occupy the same niche and consume resources at the same rate cannot coexist, and that the species with the larger carrying capacity will survive and drive the other species to extinction. This is the so-called principle of competitive exclusion, also called K-selection since the species with the larger carrying capacity wins. In fact, ecologists also talk about r-selection; that is, the species with the larger growth rate wins. Our coupled logistic model does not model r-selection, demonstrating the potential limitations of a too-simple mathematical model.
For some values of α12 and α21, our model admits a stable equilibrium solution where two species coexist. The calculation of the fixed points and their stability is more complicated than the calculation just done, and I present only the results. The stable coexistence of two species within our model is possible only if α12K2 < K1 and α21K1 < K2.
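The exclusion argument can be illustrated numerically. The Python sketch below (my addition; the parameter values are arbitrary) integrates (1.7) with a simple Euler scheme for K1 > K2, and species two indeed dies out:

```python
def simulate(r1, r2, K1, K2, N1, N2, dt=0.001, steps=200000):
    """Euler integration of the coupled logistic equations (1.7)
    with alpha12 = alpha21 = 1 (same niche, same consumption rate)."""
    for _ in range(steps):
        dN1 = r1 * N1 * (1 - (N1 + N2) / K1)
        dN2 = r2 * N2 * (1 - (N1 + N2) / K2)
        N1 += dt * dN1
        N2 += dt * dN2
    return N1, N2

# K1 > K2: species one should approach K1 and species two extinction.
N1, N2 = simulate(r1=1.0, r2=1.0, K1=2.0, K2=1.0, N1=0.1, N2=0.1)
print(N1, N2)  # N1 near 2, N2 near 0
```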
Figure 1.4: Pelt-trading records of the Hudson Bay Company for the snowshoe hare and its predator the lynx. [From E.P. Odum, Fundamentals of Ecology, 1953.]
1.4 The Lotka-Volterra predator-prey model
Pelt-trading records (Fig. 1.4) of the Hudson Bay Company from over almost a century display a near-periodic oscillation in the number of trapped snowshoe hares and lynxes. With the reasonable assumption that the recorded number of trapped animals is proportional to the animal population, these records suggest that predator-prey populations, as typified by the hare and the lynx, can oscillate over time. Lotka and Volterra independently proposed in the 1920s a mathematical model for the population dynamics of a predator and prey, and these Lotka-Volterra predator-prey equations have since become an iconic model of mathematical biology.
To develop these equations, suppose that a predator population feeds on a prey population. We assume that the number of prey grows exponentially in the absence of predators (there is unlimited food available to the prey), and that the number of predators decays exponentially in the absence of prey (predators must eat prey or starve). Contact between predators and prey increases the number of predators and decreases the number of prey.
Let U(t) and V(t) be the number of prey and predators at time t. To develop a coupled differential equation model, we consider population sizes at time t + Δt. Exponential growth of prey in the absence of predators and exponential decay of predators in the absence of prey can be modeled by the usual linear terms. The coupling between prey and predator must be modeled with two additional parameters. We write the population sizes at time t + Δt as

U(t + Δt) = U(t) + αΔtU(t) − γΔtU(t)V(t),
V(t + Δt) = V(t) + eγΔtU(t)V(t) − βΔtV(t).

The parameters α and β are the average per capita birth rate of the prey and the death rate of the predators, in the absence of the other species. The coupling
terms model contact between predators and prey. The parameter γ is the fraction of prey caught per predator per unit time; the total number of prey caught by predators during time Δt is γΔtUV. The prey eaten is then converted into newborn predators (view this as a conversion of biomass), with conversion factor e, so that the number of predators during time Δt increases by eγΔtUV.
Converting these equations into differential equations by letting Δt → 0, we obtain the well-known Lotka-Volterra predator-prey equations

dU/dt = αU − γUV,
dV/dt = eγUV − βV.   (1.8)
Before analyzing the Lotka-Volterra equations, we first review fixed point and linear stability analysis applied to what is called an autonomous system of differential equations. For simplicity, we consider a system of only two differential equations of the form

ẋ = f(x, y),   ẏ = g(x, y),   (1.9)
though our results can be generalized to larger systems. The system given by (1.9) is said to be autonomous since f and g do not depend explicitly on the independent variable t. Fixed points of this system are determined by setting ẋ = ẏ = 0 and solving for x and y. Suppose that one fixed point is (x*, y*). To determine its linear stability, we consider initial conditions for (x, y) near the fixed point, with small independent perturbations in both directions, i.e., x(0) = x* + ε(0), y(0) = y* + δ(0). If the initial perturbation grows in time, we say that the fixed point is unstable; if it decays, we say that the fixed point is stable. Accordingly, we let

x(t) = x* + ε(t),   y(t) = y* + δ(t),   (1.10)

and substitute (1.10) into (1.9) to determine the time-dependence of ε and δ. Since x* and y* are constants, we have

ε̇ = f(x* + ε, y* + δ),   δ̇ = g(x* + ε, y* + δ).
The linear stability analysis proceeds by assuming that the initial perturbations ε(0) and δ(0) are small enough to truncate the two-dimensional Taylor series expansion of f and g about ε = δ = 0 to first order in ε and δ. Note that in general, the two-dimensional Taylor series of a function F(x, y) about the origin is given by

F(x, y) = F(0, 0) + xF_x(0, 0) + yF_y(0, 0)
  + (1/2)[x²F_xx(0, 0) + 2xyF_xy(0, 0) + y²F_yy(0, 0)] + . . . ,

where the terms in the expansion can be remembered by requiring that all of the partial derivatives of the series agree with those of F(x, y) at the origin. We now Taylor-series expand f(x* + ε, y* + δ) and g(x* + ε, y* + δ) about (ε, δ) = (0, 0). The constant terms vanish since (x*, y*) is a fixed point, and we neglect all terms with higher orders than ε and δ. Therefore,

ε̇ = εf_x(x*, y*) + δf_y(x*, y*),   δ̇ = εg_x(x*, y*) + δg_y(x*, y*),
which may be written in matrix form as

d/dt (ε, δ)ᵀ = [ f_x*  f_y* ; g_x*  g_y* ] (ε, δ)ᵀ,   (1.11)

where f_x* = f_x(x*, y*), etc. Equation (1.11) is a system of linear ODEs, and its solution proceeds by assuming the form

(ε, δ)ᵀ = e^(λt) v.   (1.12)

Upon substitution of (1.12) into (1.11), and canceling e^(λt), we obtain the linear algebra eigenvalue problem

J* v = λv,   with J* = [ f_x*  f_y* ; g_x*  g_y* ],

where λ is the eigenvalue, v the corresponding eigenvector, and J* the Jacobian matrix evaluated at the fixed point. The eigenvalue is determined from the characteristic equation

det(J* − λI) = 0,

which for a two-by-two Jacobian matrix results in a quadratic equation for λ. From the form of the solution (1.12), the fixed point is stable if for all eigenvalues λ, Re{λ} < 0, and unstable if for at least one λ, Re{λ} > 0. Here Re{λ} means the real part of the (possibly) complex eigenvalue λ.
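For a two-by-two Jacobian, the characteristic equation is λ² − (tr J*)λ + det J* = 0, so the eigenvalues follow from the quadratic formula. A small Python sketch (my addition; the helper name is illustrative) classifying a fixed point from its Jacobian:

```python
import cmath

def classify(J):
    """Classify a fixed point from its 2x2 Jacobian J = [[a, b], [c, d]]
    via the characteristic equation lambda^2 - tr*lambda + det = 0."""
    (a, b), (c, d) = J
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)  # complex sqrt handles spirals
    lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2
    if lam1.real < 0 and lam2.real < 0:
        return "stable"
    if lam1.real > 0 or lam2.real > 0:
        return "unstable"
    return "center"  # pure imaginary eigenvalues: oscillation

# The Lotka-Volterra Jacobian [[0, -beta/e], [e*alpha, 0]] with
# alpha = beta = e = 1, and a node with both eigenvalues negative:
print(classify([[0.0, -1.0], [1.0, 0.0]]))   # center
print(classify([[-1.0, 0.0], [0.0, -2.0]]))  # stable
```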
We now reconsider the Lotka-Volterra equations (1.8). Fixed point solutions are found by solving U̇ = V̇ = 0, and the two solutions are

(U*, V*) = (0, 0)   or   (U*, V*) = (β/(eγ), α/γ).
The trivial fixed point (0, 0) is unstable since the prey population grows exponentially if it is initially small. To determine the stability of the second fixed point, we write the Lotka-Volterra equations in the form

dU/dt = F(U, V),   dV/dt = G(U, V),

with

F(U, V) = αU − γUV,   G(U, V) = eγUV − βV.

The partial derivatives are then computed to be

F_U = α − γV,   F_V = −γU,
G_U = eγV,   G_V = eγU − β.
The Jacobian at the fixed point (U*, V*) = (β/(eγ), α/γ) is

J* = [ 0  −β/e ; eα  0 ];

and

det(J* − λI) = λ² + αβ = 0

has the solutions λ± = ±i√(αβ), which are pure imaginary. When the eigenvalues of the two-by-two Jacobian are pure imaginary, the fixed point is called a center, and the perturbation neither grows nor decays, but oscillates. Here, the angular frequency of oscillation is ω = √(αβ), and the period of the oscillation is 2π/ω.
We plot U and V versus t (time series plot), and V versus U (phase space diagram) to see how the solutions behave. For a nonlinear system of equations such as (1.8), a numerical solution is required. The Lotka-Volterra system has four free parameters: α, β, γ and e. The relevant units here are time, the number of prey, and the number of predators. The Buckingham Pi Theorem predicts that nondimensionalizing the equations can reduce the number of free parameters by three, to a single manageable dimensionless grouping of parameters. We choose to nondimensionalize time using the angular frequency of oscillation, and the number of prey and predators using their fixed point values. With carets denoting the dimensionless variables, we let
t̂ = √(αβ) t,   Û = U/U* = (eγ/β)U,   V̂ = V/V* = (γ/α)V.   (1.13)

Substitution of (1.13) into the Lotka-Volterra equations (1.8) results in the dimensionless equations

dÛ/dt̂ = r(Û − ÛV̂),   dV̂/dt̂ = (1/r)(ÛV̂ − V̂),

with the single dimensionless grouping r = √(α/β). A numerical solution uses MATLAB's ode45.m built-in function to integrate the differential equations. The code below produces Fig. 1.5. Notice how the predator population lags the prey population: an increase in prey numbers results in a delayed increase in predator numbers as the predators eat more prey. The phase space diagrams clearly show the periodicity of the oscillation. Note that the curves move counterclockwise: prey numbers increase when predator numbers are minimal, and prey numbers decrease when predator numbers are maximal.
function lotka_volterra
% plots time series and phase space diagrams
clear all; close all;
t0=0; tf=6*pi; eps=0.1; delta=0;
r=[1/2, 1, 2];
options = odeset('RelTol',1e-6,'AbsTol',[1e-6 1e-6]);
% time series plots
for i=1:length(r)
    [t,UV]=ode45(@lv_eq,[t0,tf],[1+eps 1+delta],options,r(i));
    U=UV(:,1); V=UV(:,2);
    subplot(3,1,i); plot(t,U,t,V,'--');
    axis([0 6*pi,0.8 1.25]); ylabel('predator,prey');
    text(3,1.15,['r=',num2str(r(i))]);
end
xlabel('t');
subplot(3,1,1); legend('prey','predator');
% phase space plots
xpos=[2.5 2.5 2.5]; ypos=[3.5 3.5 3.5]; % for annotating graph
for i=1:length(r)
    for eps=0.1:0.1:1.0
        [t,UV]=ode45(@lv_eq,[t0,tf],[1+eps 1+eps],options,r(i));
        U=UV(:,1); V=UV(:,2);
        figure(2); subplot(1,3,i); plot(U,V); hold on;
    end
    axis equal; axis([0 4 0 4]);
    text(xpos(i),ypos(i),['r=',num2str(r(i))]);
    if i==1; ylabel('predator'); end
    xlabel('prey');
end

function dUV=lv_eq(t,UV,r)
% dimensionless Lotka-Volterra equations
dUV=zeros(2,1);
dUV(1) = r*(UV(1)-UV(1)*UV(2));
dUV(2) = (1/r)*(UV(1)*UV(2)-UV(2));
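Any such integration can be validated with a conserved quantity: along solutions of the dimensionless equations, H = (1/r)(Û − ln Û) + r(V̂ − ln V̂) is constant (differentiating H along trajectories gives zero). The Python sketch below (my addition; it uses a classical fourth-order Runge-Kutta step instead of ode45) monitors this invariant:

```python
import math

def rhs(u, v, r):
    """Dimensionless Lotka-Volterra right-hand sides."""
    return r * (u - u * v), (1.0 / r) * (u * v - v)

def rk4_step(u, v, r, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1u, k1v = rhs(u, v, r)
    k2u, k2v = rhs(u + dt * k1u / 2, v + dt * k1v / 2, r)
    k3u, k3v = rhs(u + dt * k2u / 2, v + dt * k2v / 2, r)
    k4u, k4v = rhs(u + dt * k3u, v + dt * k3v, r)
    u += dt * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return u, v

def H(u, v, r):
    """Quantity conserved by the dimensionless equations."""
    return (u - math.log(u)) / r + r * (v - math.log(v))

r, dt = 2.0, 0.001
u, v = 1.1, 1.0  # start near the center (1, 1)
h0 = H(u, v, r)
for _ in range(int(6 * math.pi / dt)):
    u, v = rk4_step(u, v, r, dt)
print(abs(H(u, v, r) - h0))  # small: H is nearly constant on the orbit
```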
Figure 1.5: Solutions of the dimensionless Lotka-Volterra equations. Upper plots: time-series solutions; lower plots: phase space diagrams. [Panels correspond to r = 0.5, 1, and 2; the time-series axes are predator, prey versus t, and the phase space axes are predator versus prey.]
Chapter 2
Age-structured Populations
Determining the age structure of a population helps governments plan economic development. Age-structure theory can also help evolutionary biologists better understand a species' life history. An age-structured population occurs because offspring are born to mothers at different ages. If average per capita birth and death rates at different ages are constant, then a stable age structure arises. However, a rapid change in birth or death rates can cause the age structure to shift distributions. In this chapter, we develop the theory of age-structured populations using both discrete- and continuous-time models. We also present two interesting applications: (1) modeling age-structure changes in China and other countries as these populations age; and (2) modeling the life cycle of a hermaphroditic worm. We begin, however, with one of the oldest problems in mathematical biology: Fibonacci's rabbits. This will lead us to a brief digression about the golden mean, rational approximations, and flower development, before returning to our main topic.
2.1 Fibonacci's rabbits
In 1202, Fibonacci proposed the following puzzle, which we paraphrase here:
A man put a male-female pair of newly born rabbits in a field. Rabbits take a month to mature before mating. One month after mating, females give birth to one male-female pair and then mate again. Assume no rabbits die. How many rabbit pairs are there after one year?
Let ππ be the number of rabbit pairs at the start of the πth month, counted afterthe birth of any new rabbits. There are twelve months in a year, so the numberof rabbits at the start of the 13th month will be the solution to Fibonacciβspuzzle. Now π1 = 1 because the population consists of the initial newbornrabbit pair at the start of the first month; π2 = 1 because the population stillconsists of the initial rabbit pair, now mature; and π3 = π2 + π1 because thepopulation consists of π2 rabbit pairs present in the previous month (here equalto one), and π1 newborn rabbit pairs born to the π1 female rabbits that are nowold enough to give birth (here, also equal to one). In general
$$F_{n+1} = F_n + F_{n-1}. \qquad (2.1)$$
16 CHAPTER 2. AGE-STRUCTURED POPULATIONS
For only 12 months we can simply count using (2.1) and $F_1 = F_2 = 1$, so the Fibonacci numbers are
$$1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, \ldots,$$
where $F_{13} = 233$ is the solution to Fibonacci's puzzle.
Let us solve (2.1) for all the $F_n$'s. This equation is a second-order linear difference equation, and to solve it, we look for a solution of the form $F_n = \lambda^n$. Substitution into (2.1) yields
$$\lambda^{n+1} = \lambda^n + \lambda^{n-1},$$
or after division by $\lambda^{n-1}$:
$$\lambda^2 - \lambda - 1 = 0,$$
with solutions
$$\lambda_\pm = \frac{1 \pm \sqrt{5}}{2}.$$
Define
$$\Phi = \frac{1 + \sqrt{5}}{2} = 1.61803\ldots,$$
and
$$\phi = \frac{\sqrt{5} - 1}{2} = \Phi - 1 = 0.61803\ldots.$$
Then $\lambda_+ = \Phi$ and $\lambda_- = -\phi$. Also, notice that since $\Phi^2 - \Phi - 1 = 0$, division by $\Phi$ yields $1/\Phi = \Phi - 1$, so that
$$\phi = \frac{1}{\Phi}.$$
As in the solution of linear homogeneous differential equations, the two values of $\lambda$ can be used to construct a general solution to the linear difference equation using the principle of linear superposition:
$$F_n = c_1 \Phi^n + c_2 (-1)^n \phi^n.$$
Extending the Fibonacci sequence to $F_0 = 0$ (since $F_0 = F_2 - F_1$), we satisfy the conditions $F_0 = 0$ and $F_1 = 1$:
$$c_1 + c_2 = 0, \qquad c_1 \Phi - c_2 \phi = 1.$$
Therefore, $c_2 = -c_1$ and $c_1(\Phi + \phi) = 1$, or $c_1 = 1/\sqrt{5}$, $c_2 = -1/\sqrt{5}$. We can rewrite the solution as
$$F_n = \frac{\Phi^n}{\sqrt{5}}\left[1 + (-1)^{n+1}\Phi^{-2n}\right]. \qquad (2.2)$$
In this form, we see that, as $n \to \infty$, $F_n \to \Phi^n/\sqrt{5}$, and $F_{n+1}/F_n \to \Phi$.
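Both the recurrence (2.1) and the closed form (2.2) are easy to check numerically. Here is a minimal sketch in Python (the simulations later in this chapter use MATLAB; Python is chosen here purely for illustration):

```python
import math

# iterate the recurrence F_{n+1} = F_n + F_{n-1}, starting from F_0 = 0, F_1 = 1
F = [0, 1]
for n in range(2, 21):
    F.append(F[-1] + F[-2])

Phi = (1 + math.sqrt(5)) / 2

def binet(n):
    # closed form (2.2): F_n = Phi^n / sqrt(5) * [1 + (-1)^(n+1) Phi^(-2n)]
    return Phi**n / math.sqrt(5) * (1 + (-1)**(n + 1) * Phi**(-2 * n))

print(F[13])          # solution to Fibonacci's puzzle: 233
print(F[20] / F[19])  # ratio of consecutive Fibonacci numbers, close to Phi
```

The ratio of consecutive terms already agrees with $\Phi$ to about seven decimal places by $n = 20$.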
2.2 The golden ratio $\Phi$
The number $\Phi$ is known as the golden ratio. Two positive numbers $x$ and $y$, with $x > y$, are said to be in the golden ratio if the ratio of the sum of the numbers to the larger one equals the ratio of the larger one to the smaller; that is,
$$\frac{x+y}{x} = \frac{x}{y}. \qquad (2.3)$$
Solution of (2.3) yields $x/y = \Phi$. In some well-defined way, $\Phi$ can also be called the most irrational of the irrational numbers.
To understand why $\Phi$ has this distinction as the most irrational number, we first need to understand an algorithm, the continued fraction expansion, for approximating irrational numbers by rational numbers. Let $a_n$ be positive integers. To construct a continued fraction expansion of the positive irrational number $x$, we choose the largest possible $a_n$'s that satisfy the following inequalities:
$$x > x_1 = a_1,$$
$$x < x_2 = a_1 + \cfrac{1}{a_2},$$
$$x > x_3 = a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3}},$$
$$x < x_4 = a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{a_4}}}, \quad \text{etc.}$$
There is a simple algorithm for calculating the $a_n$'s. Firstly, $a_1$ is the integer part of $x$. One computes the remainder, $x - a_1$, and takes its reciprocal $1/(x - a_1)$. The integer part of this reciprocal is $a_2$. One then takes the remainder $1/(x - a_1) - a_2$, and its reciprocal. The integer part of this reciprocal is $a_3$. We follow this algorithm to find successively better rational approximations of $\pi = 3.141592654\ldots$:
x_n's                                                     remainder      1/remainder
$x_1 = 3$                                                 0.14159265     7.06251329
$x_2 = 3 + \cfrac{1}{7}$                                  0.06251329     15.99659848
$x_3 = 3 + \cfrac{1}{7 + \cfrac{1}{15}}$                  0.99659848     1.00341313
$x_4 = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1}}}$
Follow how this works. The first rational approximation of $\pi$ is $x_1 = 3$, which is less than $\pi$. The remainder is 0.141592654, and its reciprocal is 7.06251329. The integer part of this reciprocal is 7, so our next rational approximation of $\pi$ is $x_2 = 3 + 1/7$, which is larger than $\pi$ since $1/7$ is greater than 0.141592654. The next remainder is $7.06251329 - 7 = 0.06251329$, and its reciprocal is 15.99659848. Therefore, our next rational approximation is $x_3 = 3 + 1/(7 + 1/15)$, and since $1/15$ is greater than 0.06251329, $x_3$ is less than $\pi$. Rationalizing the continued fractions gives us the following successive rational approximations of $\pi$: $\{3, 22/7, 333/106, 355/113, \ldots\}$, or in decimal form: $\{3, 3.1428\ldots, 3.141509\ldots, 3.14159292\ldots\}$.
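The remainder-and-reciprocal algorithm, together with the standard recurrence for turning the $a_n$'s back into rational convergents, can be sketched as follows (in Python, for illustration):

```python
import math

def contfrac(x, nterms):
    """Integer parts a_1, a_2, ... from the remainder/reciprocal algorithm."""
    a = []
    for _ in range(nterms):
        n = math.floor(x)        # a_n is the integer part
        a.append(n)
        x = 1.0 / (x - n)        # reciprocal of the remainder
    return a

def convergents(a):
    """Rational approximations p/q built from the a_n's (standard recurrence)."""
    p0, q0, p1, q1 = 1, 0, a[0], 1
    out = [(p1, q1)]
    for an in a[1:]:
        p0, q0, p1, q1 = p1, q1, an * p1 + p0, an * q1 + q0
        out.append((p1, q1))
    return out

print(contfrac(math.pi, 4))                  # [3, 7, 15, 1]
print(convergents(contfrac(math.pi, 4)))     # includes 22/7, 333/106, 355/113
print(contfrac((1 + math.sqrt(5)) / 2, 10))  # for Phi, every a_n is one
```

Running the same algorithm on $\Phi$ produces an unbroken string of ones, illustrating the slow convergence discussed below.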
Now consider a rational approximation for $\Phi = 1.61803399\ldots$:
x_n's                          remainder   1/remainder
$x_1 = 1$                      $\phi$      $\Phi$
$x_2 = 1 + \cfrac{1}{1}$       $\phi$      $\Phi$
and so on. All the $a_n$'s are one, and the sequence results in the slowest possible convergence of the $x_n$'s. Formally, the limiting expression for $\Phi$ is
$$\Phi = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{\ddots}}}.$$
Another derivation of the continued fraction expansion for $\Phi$ can be obtained from $\Phi^2 - \Phi - 1 = 0$, written as $\Phi = 1 + 1/\Phi$. Then inserting this expression for $\Phi$ back into the right-hand side gives $\Phi = 1 + 1/(1 + 1/\Phi)$, and iterating ad infinitum shows that all the $a_n$'s are one. (As a side note, another interesting expression for $\Phi$ comes from iterating $\Phi = \sqrt{1 + \Phi}$.)
The sequence of rational approximations just constructed for $\Phi$ is 1, 2, 3/2, 5/3, 8/5, 13/8, etc., which are just the ratios of consecutive Fibonacci numbers. We have already shown that this ratio converges to $\Phi$.
Because the golden ratio is the most irrational number, it has a way of appearing unexpectedly in nature. One well-known example is the florets in a sunflower head, which we discuss in the next section.
2.3 The Fibonacci numbers in a sunflower
Consider the photo of a sunflower shown in Fig. 2.1, and notice the apparent spirals in the florets radiating out from the center to the edge. These spirals appear to rotate both clockwise and counterclockwise. By counting them, one finds 34 clockwise spirals and 21 counterclockwise spirals. The numbers 21 and 34 are notable as they are consecutive numbers in the Fibonacci sequence.
Why do Fibonacci numbers appear in the sunflower head? To answer this question, we construct a very simple model for the way that the florets develop. Suppose that during development, florets first appear close to the center of the head and subsequently move radially out with constant speed as the sunflower head grows. To fill the circular sunflower head, we suppose that as each new floret is created at the center, it is rotated through a constant angle before
Figure 2.1: The flowering head of a sunflower.
moving radially. We will further assume that the angle of rotation is optimum in the sense that the resulting sunflower head consists of florets that are well-spaced.
We denote the rotation angle by $2\pi\alpha$, and we can assume that $0 < \alpha < 1$. We first consider the possibility that $\alpha$ is a rational number, say $p/q$, where $p$ and $q$ are positive integers with no common factors, and $p < q$. Since after $q$ rotations the florets return to the line on which they started, the resulting sunflower head would consist of florets lying along $q$ straight lines. Such a sunflower head for $\alpha = 3/5$ is shown in Fig. 2.2a, where five straight lines are apparent. This figure and subsequent ones were produced using MATLAB, and the simulation code is shown at the end of this subsection.
We next consider the possibility that $\alpha$ is an irrational number. Now, no matter how many rotations, the florets will never return to their starting line. Nevertheless, the resulting sunflower head may not have florets that are well-spaced. For example, if $\alpha = \pi - 3$, then the resulting sunflower head would look like Fig. 2.2b. Intriguingly, seven counterclockwise spirals are evident. We have already shown that a good rational approximation of $\pi$, obtained from a continued fraction expansion, is $3 + 1/7$, which is slightly larger than $\pi$. So on every seventh rotation, the new floret falls just short of the radial line traced by the floret from seven rotations ago. After every seven rotations, as these florets move out along the growing sunflower head, they take on the appearance of a counterclockwise spiral; and the entire flower head will seem to consist of seven of these spirals, as evident in Fig. 2.2b.
The irrational number that is most likely to result in a sunflower head with well-spaced florets is the most irrational of the irrational numbers, that is, $\phi$, which is the fractional part of the golden ratio. The rational approximations
Figure 2.2: Simulation of a sunflower head for (a) $\alpha = 3/5$; (b) $\alpha = \pi - 3$.
Figure 2.3: Simulation of a sunflower head for $\alpha = \phi$. (a) $v_0 = 1/2$; (b) $v_0 = 1/4$.
to $\phi$ obtained from a continued fraction expansion are just the ratios of two consecutive Fibonacci numbers, e.g., 13/21 or 21/34. Depending on the density and size of the sunflower head, different consecutive Fibonacci numbers may be observed. For the sunflower shown in Fig. 2.1, the relevant Fibonacci numbers are 13, 21 and 34. Since $13/21 > \phi$, the 21 spirals are expected to rotate counterclockwise; and since $21/34 < \phi$, the 34 spirals are expected to rotate clockwise. Furthermore, the 34 clockwise spirals should be easier to observe than the 21 counterclockwise spirals (since 21/34 is a better approximation to $\phi$ than is 13/21), and indeed, this is exactly what we find.
Two simulations of the sunflower head with $\alpha = \phi$ are shown in Fig. 2.3. These simulations differ only by the choice of radial velocity, with $v_0 = 1/2$ in Fig. 2.3a and $v_0 = 1/4$ in Fig. 2.3b, where the precise definition of $v_0$ can be obtained from the MATLAB code. In Fig. 2.3a, one can count 13 counterclockwise spirals and 21 clockwise spirals; in Fig. 2.3b, there are 21 clockwise spirals and 34 counterclockwise spirals, similar to the real sunflower in Fig. 2.1. As expected, these numbers of spirals and their orientations derive from the ratios of consecutive Fibonacci numbers satisfying $8/13 < \phi$, $13/21 > \phi$ and $21/34 < \phi$.
function flower(alpha)
%draws points moving radially but rotated by an angle 2*pi*alpha
figure
angle=2*pi*alpha; r0=0;
v0=1.0; %for higher Fibonacci numbers try v0=0.5, 0.25, 0.1
npts=100/v0;
theta=(0:npts-1)*angle; %all the angles of the npts
r=r0*ones(1,npts); %starting radius of the npts
for i=1:npts % i is total number of points to be plotted (also time)
    for j=1:i %find coordinates of all i points; j=i is newest point
        r(j)=r(j)+v0;
        x(j)=r(j)*cos(theta(j)); y(j)=r(j)*sin(theta(j));
    end
    plot(x,y,'.','MarkerSize',5.0);
    axis equal;
    axis([-npts*v0-r0 npts*v0+r0 -npts*v0-r0 npts*v0+r0]);
    pause(0.1);
end
%draw a circle at the center
hold on; phi=linspace(0,2*pi); x=r0*cos(phi); y=r0*sin(phi);
fill(x,y,'r')
2.4 Rabbits are an age-structured population
Fibonacci's rabbits form an age-structured population, and we can use this simple case to illustrate the more general approach. Fibonacci's rabbits can be categorized into two meaningful age classes: juveniles and adults. Here, juveniles are the newborn rabbits that cannot yet mate; adults are those rabbits at least one month old. Beginning with a newborn pair at the beginning of the first month, we census the population at the beginning of each subsequent month
after mated females have given birth. At the start of the $n$th month, let $u_{1,n}$ be the number of newborn rabbit pairs, and let $u_{2,n}$ be the number of rabbit pairs at least one month old. Since each adult pair gives birth to a juvenile pair, the number of juvenile pairs at the start of the $(n+1)$st month is equal to the number of adult pairs at the start of the $n$th month. And since the number of adult pairs at the start of the $(n+1)$st month is equal to the sum of adult and juvenile pairs at the start of the $n$th month, we have
$$u_{1,n+1} = u_{2,n}, \qquad u_{2,n+1} = u_{1,n} + u_{2,n};$$
or written in matrix form,
$$\begin{pmatrix} u_{1,n+1} \\ u_{2,n+1} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} u_{1,n} \\ u_{2,n} \end{pmatrix}. \qquad (2.4)$$
Rewritten in vector form, we have
$$\mathbf{u}_{n+1} = L\mathbf{u}_n, \qquad (2.5)$$
where the definitions of the vector $\mathbf{u}_n$ and the matrix $L$ are obvious. The initial conditions, with one juvenile pair and no adults, are given by
$$\begin{pmatrix} u_{1,1} \\ u_{2,1} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$
The solution of this system of coupled, first-order, linear difference equations, (2.5), proceeds similarly to that of coupled, first-order, linear differential equations. With the ansatz $\mathbf{u}_n = \lambda^n \mathbf{v}$, we obtain upon substitution into (2.5) the eigenvalue problem
$$L\mathbf{v} = \lambda\mathbf{v},$$
whose solution yields two eigenvalues $\lambda_1$ and $\lambda_2$, with corresponding eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$. The general solution to (2.5) is then
$$\mathbf{u}_n = c_1 \lambda_1^n \mathbf{v}_1 + c_2 \lambda_2^n \mathbf{v}_2, \qquad (2.6)$$
with $c_1$ and $c_2$ determined from the initial conditions. Now suppose that $|\lambda_1| > |\lambda_2|$. If we rewrite (2.6) in the form
$$\mathbf{u}_n = \lambda_1^n \left( c_1 \mathbf{v}_1 + c_2 \left( \frac{\lambda_2}{\lambda_1} \right)^{\!n} \mathbf{v}_2 \right),$$
then because $|\lambda_2/\lambda_1| < 1$, $\mathbf{u}_n \to c_1 \lambda_1^n \mathbf{v}_1$ as $n \to \infty$. The long-time asymptotics of the population, therefore, depends only on $\lambda_1$ and the corresponding eigenvector $\mathbf{v}_1$. For Fibonacci's rabbits, the eigenvalues are obtained by solving $\det(L - \lambda I) = 0$, and we find
$$\det\begin{pmatrix} -\lambda & 1 \\ 1 & 1-\lambda \end{pmatrix} = -\lambda(1-\lambda) - 1 = 0,$$
$u_{i,n}$: number of females in age class $i$ at census $n$
$s_i$: fraction of females surviving from age class $i-1$ to $i$
$m_i$: expected number of female offspring from a female in age class $i$
$l_i = s_1 \cdots s_i$: fraction of females surviving from birth to age class $i$
$f_i = m_i l_i$
$R_0 = \sum_i f_i$: basic reproductive ratio

Table 2.1: Definitions needed in an age-structured, discrete-time population model
or $\lambda^2 - \lambda - 1 = 0$, with solutions $\Phi$ and $-\phi$. Since $\Phi > \phi$, the eigenvalue $\Phi$ and its corresponding eigenvector determine the long-time asymptotic population age-structure. The eigenvector may be found by solving
$$(L - \Phi I)\mathbf{v}_1 = 0,$$
or
$$\begin{pmatrix} -\Phi & 1 \\ 1 & 1-\Phi \end{pmatrix} \begin{pmatrix} v_{11} \\ v_{12} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
The first equation is just $-\Phi$ times the second equation (use $\Phi^2 - \Phi - 1 = 0$), so that $v_{12} = \Phi v_{11}$. Taking $v_{11} = 1$, we have
$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ \Phi \end{pmatrix}.$$
The asymptotic age structure obtained from $\mathbf{v}_1$ shows that the ratio of adults to juveniles approaches the golden mean; that is,
$$\lim_{n\to\infty} \frac{u_{2,n}}{u_{1,n}} = v_{12}/v_{11} = \Phi.$$
2.5 Discrete age-structured populations
In a discrete model, population censuses occur at discrete times and individuals are assigned to age classes spanning a range of ages. For model simplicity, we assume that the time between censuses is equal to the age span of all age classes. An example is a country that censuses its population every five years and assigns individuals to age classes spanning five years (e.g., 0-4 years old, 5-9 years old, etc.). Although country censuses commonly count both females and males, we will only count females and ignore males.
There are several new definitions in this section, and I place these in Table 2.1 for easy reference. We define $u_{i,n}$ to be the number of females in age class $i$ at census $n$. We assume that $i = 1$ represents the first age class and $i = \omega$ the last. No female survives past the last age class. We also assume that the first census takes place when $n = 1$. We define $s_i$ as the fraction of females that survive from age class $i-1$ to age class $i$ (with $s_1$ the fraction of newborns that survive to their first census), and define $m_i$ as the expected number of female births per female in age class $i$.
We construct difference equations for $\{u_{i,n+1}\}$ in terms of $\{u_{i,n}\}$. First, newborns at census $n+1$ were born between censuses $n$ and $n+1$ to females of different ages, with differing fertilities. Also, only a fraction of these newborns survive to their first census. Second, only a fraction of the females in age class $i$ that were counted in census $n$ survive to be counted in age class $i+1$ in census $n+1$. Putting these two ideas together with the appropriately defined parameters, the difference equations for $\{u_{i,n+1}\}$ are determined to be
$$u_{1,n+1} = s_1\left(m_1 u_{1,n} + m_2 u_{2,n} + \cdots + m_\omega u_{\omega,n}\right),$$
$$u_{2,n+1} = s_2 u_{1,n},$$
$$u_{3,n+1} = s_3 u_{2,n},$$
$$\vdots$$
$$u_{\omega,n+1} = s_\omega u_{\omega-1,n},$$
which can be rewritten as the matrix equation
$$\begin{pmatrix} u_{1,n+1} \\ u_{2,n+1} \\ u_{3,n+1} \\ \vdots \\ u_{\omega,n+1} \end{pmatrix} = \begin{pmatrix} s_1 m_1 & s_1 m_2 & \cdots & s_1 m_{\omega-1} & s_1 m_\omega \\ s_2 & 0 & \cdots & 0 & 0 \\ 0 & s_3 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & s_\omega & 0 \end{pmatrix} \begin{pmatrix} u_{1,n} \\ u_{2,n} \\ u_{3,n} \\ \vdots \\ u_{\omega,n} \end{pmatrix};$$
or in compact vector form as
$$\mathbf{u}_{n+1} = L\mathbf{u}_n, \qquad (2.7)$$
where $L$ is called the Leslie matrix.

This system of linear equations can be solved by determining the eigenvalues
and associated eigenvectors of the Leslie matrix. One can solve directly the characteristic equation, $\det(L - \lambda I) = 0$, or reduce the system of first-order difference equations (2.7) to a single high-order equation for the number of females in the first age class. Following the latter approach, and beginning with the second row of (2.7), we have
$$u_{2,n+1} = s_2 u_{1,n},$$
$$u_{3,n+1} = s_3 u_{2,n} = s_3 s_2 u_{1,n-1},$$
$$\vdots$$
$$u_{\omega,n+1} = s_\omega u_{\omega-1,n} = s_\omega s_{\omega-1} u_{\omega-2,n-1} = \cdots = s_\omega s_{\omega-1} \cdots s_2 u_{1,n-\omega+2}.$$
If we define $l_i = s_1 s_2 \cdots s_i$ to be the fraction of females that survive from birth to age class $i$, and $f_i = m_i l_i$, then the first row of (2.7) becomes
$$u_{1,n+1} = f_1 u_{1,n} + f_2 u_{1,n-1} + f_3 u_{1,n-2} + \cdots + f_\omega u_{1,n-\omega+1}. \qquad (2.8)$$
Here, we have made the simplifying assumption that $n \geq \omega$, so that all the females counted in the $(n+1)$st census were born after the first census.
The high-order linear difference equation (2.8) may be solved using the ansatz $u_{1,n} = \lambda^n$. Direct substitution and division by $\lambda^{n+1}$ results in the discrete Euler-Lotka equation
$$\sum_{i=1}^{\omega} f_i \lambda^{-i} = 1, \qquad (2.9)$$
which may have both real and complex-conjugate roots.

Once an eigenvalue $\lambda$ is determined from (2.9), the corresponding eigenvector
$\mathbf{v}$ can be computed using the Leslie matrix. We have
$$\begin{pmatrix} s_1 m_1 - \lambda & s_1 m_2 & \cdots & s_1 m_{\omega-1} & s_1 m_\omega \\ s_2 & -\lambda & \cdots & 0 & 0 \\ 0 & s_3 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & s_\omega & -\lambda \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_\omega \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$
Taking $v_\omega = l_\omega/\lambda^\omega$, and beginning with the last row and working backwards, we have:
$$v_{\omega-1} = l_{\omega-1}/\lambda^{\omega-1},$$
$$v_{\omega-2} = l_{\omega-2}/\lambda^{\omega-2},$$
$$\vdots$$
$$v_1 = l_1/\lambda,$$
so that
$$v_i = l_i/\lambda^i, \quad \text{for } i = 1, 2, \ldots, \omega.$$
We can obtain an interesting implication of this result by forming the ratio of two consecutive age classes. If $\lambda$ is the dominant eigenvalue (and is real and positive, as is the case for human populations), then asymptotically,
$$u_{i+1,n}/u_{i,n} \sim v_{i+1}/v_i = s_{i+1}/\lambda.$$
With the survival fractions $\{s_i\}$ fixed, a smaller positive $\lambda$ implies a larger ratio: a slower growing (or decreasing) population has relatively more older people than a faster growing population. In fact, we are now living through a time when developed countries, particularly Japan and those in Western Europe, as well as Hong Kong and Singapore, have substantially lowered their population growth rates and are increasing the average age of their citizens.
If we simply want to determine whether a population grows or decays, we can calculate the basic reproduction ratio $R_0$, defined as the net expectation of female offspring to a newborn female. Stasis is obtained if the female only replaces herself before dying. If $R_0 > 1$, then the population grows, and if $R_0 < 1$, then the population decays. $R_0$ is equal to the number of female offspring expected from a newborn when she is in age class $i$, summed over all age classes, or
$$R_0 = \sum_{i=1}^{\omega} f_i.$$
For a population with approximately equal numbers of males and females, $R_0 = 1$ means a newborn female must produce on average two children over her lifetime. News stories in the western press frequently state that for zero population growth, women need to have 2.1 children. The term women used in these stories presumably means women of child-bearing age. Since girls who die young have no children, the statistic of 2.1 children implies that $0.1/2.1$, or about 5%, of children die before reaching adulthood.
A useful application of the mathematical model developed in this section is to predict the future age structure within various countries. This can be important for economic planning, for instance, determining the tax revenues that can pay for the rising costs of health care as a population ages. For accurate predictions of the future age structure of a given country, immigration and emigration must also be modeled. An interesting website to browse is at
http://www.census.gov/ipc/www/idb.
This website, created by the US Census Bureau, provides access to the International Data Base (IDB), a computerized source of demographic and socioeconomic statistics for 227 countries and areas of the world. In class, we will look at and discuss the dynamic output of some of the population pyramids, including those for Hong Kong and China.
2.6 Continuous age-structured populations
We can derive a continuous-time model by considering the discrete model in the limit as the age span $\Delta a$ of an age class (also equal to the time between censuses) goes to zero. For $n > \omega$, (2.8) can be rewritten as
$$u_{1,n} = \sum_{i=1}^{\omega} f_i u_{1,n-i}. \qquad (2.10)$$
The first age class in the discrete model consists of females born between two consecutive censuses. The corresponding function in the continuous model is the female birth rate of the population as a whole, $B(t)$, satisfying
$$u_{1,n} = B(t_n)\Delta a.$$
If we assume that the $n$th census takes place at time $t_n = n\Delta a$, we also have
$$u_{1,n-i} = B(t_{n-i})\Delta a = B(t_n - t_i)\Delta a.$$
To determine the continuous analogue of the parameter $f_i = m_i l_i$, we define the age-specific survival function $l(a)$ to be the fraction of newborn females that survive to age $a$, and define the age-specific maternity function $m(a)$, multiplied by $\Delta a$, to be the average number of females born to a female between the ages of $a$ and $a + \Delta a$. With the definition of the age-specific net maternity function, $f(a) = m(a)l(a)$, and $a_i = i\Delta a$, we have
$$f_i = f(a_i)\Delta a.$$
With these new definitions, (2.10) becomes
$$B(t_n)\Delta a = \sum_{i=1}^{\omega} f(a_i) B(t_n - t_i)(\Delta a)^2.$$
Canceling one factor of $\Delta a$, and using $t_i = a_i$, the right-hand side becomes a Riemann sum. Taking $t_n = t$ and assigning $f(a) = 0$ when $a$ is greater than the maximum age of female fertility, the limit $\Delta a \to 0$ transforms (2.10) into
$$B(t) = \int_0^\infty B(t-a) f(a)\, da. \qquad (2.11)$$
Equation (2.11) states that the population-wide female birth rate at time $t$ has contributions from females of all ages, and that the contribution to this birth rate from females between the ages of $a$ and $a + da$ is determined from the population-wide female birth rate at the earlier time $t-a$, times the fraction of females that survive to age $a$, times the number of female births to females between the ages of $a$ and $a + da$. Equation (2.11) is a linear homogeneous integral equation, valid for $t$ greater than the maximum age of female fertility. A more complete but inhomogeneous equation valid for smaller $t$ can also be derived.
Equation (2.11) can be solved with the ansatz $B(t) = e^{rt}$. Direct substitution yields
$$e^{rt} = \int_0^\infty f(a) e^{r(t-a)}\, da,$$
which upon canceling $e^{rt}$ results in the continuous Euler-Lotka equation
$$\int_0^\infty f(a) e^{-ra}\, da = 1. \qquad (2.12)$$
Equation (2.12) is an integral equation for $r$, given the age-specific net maternity function $f(a)$. It is possible to prove that for $f(a)$ a continuous non-negative function, (2.12) has exactly one real root $r^*$, and that the population grows ($r^* > 0$) or decays ($r^* < 0$) asymptotically as $e^{r^* t}$. The population growth rate $r^*$ has been called the intrinsic rate of increase, the intrinsic growth rate, or the Malthusian parameter. Typically, (2.12) must be solved for $r^*$ using Newton's method.
We briefly digress here to explain Newton's method, an efficient root-finding algorithm that solves $F(x) = 0$ for $x$. Newton's method can be derived graphically by plotting $F(x)$ versus $x$, approximating $F(x)$ by the line tangent at $x = x_n$ with slope $F'(x_n)$, and letting $x_{n+1}$ be the intercept of the tangent line with the $x$-axis. An alternative derivation starts with the Taylor series $F(x_{n+1}) = F(x_n) + (x_{n+1} - x_n)F'(x_n) + \ldots$. We set $F(x_{n+1}) = 0$, drop higher-order terms in the Taylor series, and solve for $x_{n+1}$:
$$x_{n+1} = x_n - \frac{F(x_n)}{F'(x_n)},$$
which is solved iteratively, starting with a wise choice of initial guess $x_0$, until convergence.
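The iteration can be sketched in a few lines of Python; the example equation $x^2 - 2 = 0$ is chosen only to illustrate the method:

```python
def newton(F, Fp, x0, tol=1e-12, maxit=50):
    """Iterate x_{n+1} = x_n - F(x_n)/F'(x_n) until the step is below tol."""
    x = x0
    for _ in range(maxit):
        step = F(x) / Fp(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# example: solve x^2 - 2 = 0 with F'(x) = 2x, starting from x0 = 1
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0)
print(root)
```

Starting from a reasonable guess, the method converges quadratically, roughly doubling the number of correct digits at each step.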
Solution of (2.12) by Newton's method requires defining
$$F(r) = \int_0^\infty f(a) e^{-ra}\, da - 1, \qquad (2.13)$$
from which one derives
$$F'(r) = -\int_0^\infty a f(a) e^{-ra}\, da.$$
Newton's method then begins with an initial guess $r_0$ and computes the iteration
$$r_{n+1} = r_n + \frac{\int_0^\infty f(a) e^{-r_n a}\, da - 1}{\int_0^\infty a f(a) e^{-r_n a}\, da}$$
until $r_n$ converges sufficiently close to $r^*$.

After asymptotically attaining a stable age structure, the population grows
like $e^{r^* t}$, and our previous discussion of the Malthusian growth model suggests that $r^*$ may be found from the constant per capita birth rate $b$ and death rate $d$. By determining expressions for $b$ and $d$, we will indeed show that $r^* = b - d$.
Because females that have survived to age $a$ at time $t$ were born earlier at time $t-a$, $B(t-a)l(a)\,da$ represents the number of females at time $t$ that are between the ages of $a$ and $a + da$. The total number of females $N(t)$ at time $t$ is therefore given by
$$N(t) = \int_0^\infty B(t-a) l(a)\, da. \qquad (2.14)$$
The per capita birth rate $b(t)$ equals the population-wide birth rate $B(t)$ divided by the population size $N(t)$, and using (2.11) and (2.14),
$$b(t) = B(t)/N(t) = \frac{\int_0^\infty B(t-a) f(a)\, da}{\int_0^\infty B(t-a) l(a)\, da}.$$
Similarly, the per capita death rate $d(t)$ equals the population-wide death rate $D(t)$ divided by $N(t)$. To derive $D(t)$, we first define the age-specific mortality function $\mu(a)$, multiplied by $\Delta a$, to be the fraction of females of age $a$ that die before attaining the age $a + \Delta a$. The relationship between the age-specific mortality function $\mu(a)$ and the age-specific survival function $l(a)$ may be obtained by computing the fraction of females that survive to age $a + \Delta a$. This fraction is equal to the fraction of females that survive to age $a$ times the fraction of females $1 - \mu(a)\Delta a$ that do not die in the next small interval of time $\Delta a$; that is,
$$l(a + \Delta a) = l(a)\left(1 - \mu(a)\Delta a\right),$$
or
$$\frac{l(a+\Delta a) - l(a)}{\Delta a} = -\mu(a)l(a);$$
and as $\Delta a \to 0$,
$$l'(a) = -\mu(a)l(a). \qquad (2.15)$$
The age-specific mortality function $\mu(a)$ is analogous to the age-specific maternity function $m(a)$, and we define the age-specific net mortality function $g(a) = \mu(a)l(a)$ in analogy to the age-specific net maternity function $f(a) = m(a)l(a)$. The population-wide birth rate $B(t)$ is determined from $f(a)$ using (2.11), and in analogy, the population-wide death rate $D(t)$ is determined from $g(a)$ using
$$D(t) = \int_0^\infty B(t-a) g(a)\, da, \qquad (2.16)$$
where the integrand represents the contribution to the death rate from females that die between the ages of $a$ and $a + da$. The per capita death rate is therefore
$$d(t) = D(t)/N(t) = \frac{\int_0^\infty B(t-a) g(a)\, da}{\int_0^\infty B(t-a) l(a)\, da};$$
and the difference between the per capita birth and death rates is calculated from
$$b(t) - d(t) = \frac{\int_0^\infty B(t-a)\left[f(a) - g(a)\right] da}{\int_0^\infty B(t-a) l(a)\, da}. \qquad (2.17)$$
Asymptotically, a stable age structure is established and the population-wide birth rate grows as $B(t) \sim e^{r^* t}$. Substitution of this expression for $B(t)$ into (2.17) and cancellation of $e^{r^* t}$ results in
$$b - d = \frac{\int_0^\infty \left[f(a) - g(a)\right] e^{-r^* a}\, da}{\int_0^\infty l(a) e^{-r^* a}\, da} = \frac{1 + \int_0^\infty l'(a) e^{-r^* a}\, da}{\int_0^\infty l(a) e^{-r^* a}\, da},$$
where use has been made of (2.12) and (2.15). Simplifying the numerator using integration by parts,
$$\int_0^\infty l'(a) e^{-r^* a}\, da = l(a) e^{-r^* a}\Big|_0^\infty + r^* \int_0^\infty l(a) e^{-r^* a}\, da = -1 + r^* \int_0^\infty l(a) e^{-r^* a}\, da,$$
produces the desired result,
$$r^* = b - d.$$
It is usually supposed that evolution by natural selection will result in populations with the largest value of the Malthusian parameter $r^*$, and that natural selection would favor those females that constitute such a population. We will exploit this idea in the next section to compute the brood size of the self-fertilizing hermaphroditic worm of the species Caenorhabditis elegans.
2.7 The brood size of a hermaphroditic worm
Caenorhabditis elegans, a soil-dwelling nematode worm about 1 mm in length, is a widely studied model organism in biology. With a body made up of approximately 1000 cells, it is one of the simplest multicellular organisms under study. Advances in understanding the development of this multicellular organism led
Figure 2.4: Caenorhabditis elegans, a nematode worm used by biologists as asimple animal model of a multicellular organism. (Photo by Amy Pasquinelli.)
[Timeline: ages 0 to $g$, juvenile; $g$ onward, adult; sperm production from $g$ to $g+s$; egg production from $g+s$ to $g+s+e$.]

Figure 2.5: A simplified timeline of a hermaphrodite's life.
to the awarding of the 2002 Nobel Prize in Physiology or Medicine to the three C. elegans biologists Sydney Brenner, H. Robert Horvitz and John E. Sulston.
The worm C. elegans has two sexes: hermaphrodites, which are essentially females that can produce internal sperm and self-fertilize their own eggs, and males, which must mate with hermaphrodites to produce offspring. In laboratory cultures, males are rare and worms generally propagate by self-fertilization. Typically, a hermaphrodite lays about 250-350 self-fertilized eggs before becoming infertile. It is reasonable to assume that the forces of natural selection have shaped the life history of C. elegans, and that the number of offspring produced by a selfing hermaphrodite must be in some sense optimal. Here, we show how an age-structured model applied to C. elegans yields theoretical insights into the brood size of a selfing hermaphrodite.
To develop a mathematical model for C. elegans, we need to know some details of its life history. As a first approximation (Barker, 1992), a simplified timeline of a hermaphrodite's life is shown in Fig. 2.5. The fertilized egg is laid at time $t = 0$. During a juvenile growth period, the immature worm develops through four larval stages (L1-L4). Towards the end of L4 and for a short while after its final molt to adulthood, the hermaphrodite produces sperm, which is then stored for later use. Then the hermaphrodite produces eggs, self-fertilizes them using her internally stored sperm, and lays them. In the absence of males, egg production ceases after all the sperm are utilized. We assume that the juvenile growth period occurs during $0 < t < g$, spermatogenesis occurs during $g < t < g + s$, and egg production, self-fertilization, and egg laying occur during
$g$: growth period, 72 h
$s$: sperm production period, 11.9 h
$e$: egg production period, 65 h
$p$: sperm production rate, 24 h$^{-1}$
$m$: egg production rate, 4.4 h$^{-1}$
$B$: brood size, 286

Table 2.2: Parameters in a life-history model of C. elegans, with experimental estimates.
$g + s < t < g + s + e$.

Here, we want to understand why hermaphrodites limit their sperm production. Biologists define males and females by the size and metabolic cost of their gametes: sperm are cheap and eggs are expensive. So at first look, it is puzzling why the total number of offspring produced by a hermaphrodite is limited by the number of sperm produced, rather than by the number of eggs. There must be a hidden cost to the hermaphrodite of producing additional sperm other than metabolic. To understand the basic biology, it is instructive to consider two limiting cases: (1) no sperm production; (2) infinite sperm production. In both cases, the hermaphrodite produces no offspring: in the first case because there are no sperm, and in the second case because there are no eggs. The number of sperm produced by a hermaphrodite before laying eggs is therefore a compromise; although more sperm means more offspring, more sperm also means delayed egg production.
Our main theoretical assumption is that natural selection will favor worms with the ability to establish populations with the largest Malthusian parameter $r$. Worms containing a genetic mutation resulting in a larger value of $r$ will eventually outnumber all other worms.
The parameters we need for our mathematical model are listed in Table 2.2, together with estimated experimental values (Cutter, 2004). In addition to the growth period $g$, sperm production period $s$, and egg production period $e$ (all in units of hours), we need the sperm production rate $p$ and the egg production rate $m$ (both in units of inverse hours). We also define the brood size $B$ as the total number of fertilized eggs laid by a selfing hermaphrodite. The brood size is equal to the number of sperm produced, and also equal to the number of eggs laid, so that
$$B = ps = me. \qquad (2.18)$$
We may use (2.18) to eliminate $s$ and $e$ in favor of $B$:
$$s = B/p, \qquad e = B/m. \qquad (2.19)$$
The continuous Euler-Lotka equation (2.12) for $r$ requires a model for $f(a) = m(a)l(a)$, where $m(a)$ is the age-specific maternity function and $l(a)$ is the age-specific survival function. The function $l(a)$ satisfies the differential equation (2.15), and here we make the simplifying assumption that the age-specific mortality function $\mu(a) = d$, where $d$ is the age-independent per capita death rate. Implicitly, we are assuming that worms do not die of old age during egg laying, but rather die of predation, starvation, disease, or other age-independent causes. Such an assumption is reasonable since worms can live in the laboratory for
32 CHAPTER 2. AGE-STRUCTURED POPULATIONS
several weeks after sperm depletion. Solving (2.15) with the initial condition l(0) = 1 results in
l(a) = exp(−d·a). (2.20)
The age-specific maternity function m(a) is defined such that m(a)Δa is the expected number of offspring produced over the age interval Δa. We assume that a hermaphrodite lays eggs at a constant rate m over the ages g + s < a < g + s + e; therefore,
m(a) = m for g + s < a < g + s + e, and m(a) = 0 otherwise. (2.21)
Using (2.19), (2.20) and (2.21), the continuous Euler-Lotka equation (2.12) for the Malthusian parameter r becomes

∫_{g+B/p}^{g+B/p+B/m} m exp[−(d + r)a] da = 1. (2.22)
Integrating,

1 = ∫_{g+B/p}^{g+B/p+B/m} m exp[−(d + r)a] da
  = [m/(d + r)] {exp[−(g + B/p)(d + r)] − exp[−(g + B/p + B/m)(d + r)]}
  = [m/(d + r)] exp[−(g + B/p)(d + r)] {1 − exp[−(B/m)(d + r)]},

which may be rewritten as

(d + r) exp[(g + B/p)(d + r)] = m {1 − exp[−(B/m)(d + r)]}. (2.23)
With the parameters g, d, p, and m fixed, the integrated Euler-Lotka equation (2.23) is an implicit equation for r = r(B).
To demonstrate that r = r(B) has a maximum at some value of B, we numerically solve (2.23) for r with the parameter values of g, p and m obtained from Table 2.2. Since d + r is maximum at the same value of B at which r is maximum, and d only enters (2.23) in the form d + r, without loss of generality we can take d = 0. To solve (2.23), it is best to make use of Newton's method. We let
F(r) = (d + r) exp[(g + B/p)(d + r)] − m {1 − exp[−(B/m)(d + r)]},
and differentiate with respect to π to obtain
F′(r) = [1 + (g + B/p)(d + r)] exp[(g + B/p)(d + r)] − B exp[−(B/m)(d + r)].
For a given π΅, we then solve πΉ (π) = 0 by iterating
r_{n+1} = r_n − F(r_n)/F′(r_n).
Using appropriate starting values for r, the function r = r(B) can be computed and is presented in Fig. 2.6. Evidently, r is maximum near the value B = 152, which is 53% of the experimental value for B shown in Table 2.2.
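The Newton iteration above is easy to implement. The following Python sketch (the names F, Fprime and malthusian_r are our own) uses illustrative stand-in rates rather than the actual Table 2.2 entries: m, p and g below are assumptions, chosen only so that the dimensionless combinations mB/p = 52.5 and mg = 317 at B = 286, the values quoted later in this section.

```python
import math

# Stand-in parameter values (NOT the actual Table 2.2 entries): m, p and g
# are assumptions, chosen only so that mB/p = 52.5 and mg = 317 at B = 286.
m = 5.0                  # egg production rate (1/hr), assumed
p = 286.0 / 52.5 * m     # sperm production rate (1/hr), so mB/p = 52.5 at B = 286
g = 317.0 / m            # growth period (hr), so mg = 317
d = 0.0                  # death rate; taken as zero, as in the text

def F(r, B):
    """Residual of the integrated Euler-Lotka equation (2.23)."""
    return (d + r) * math.exp((g + B / p) * (d + r)) \
           - m * (1.0 - math.exp(-(B / m) * (d + r)))

def Fprime(r, B):
    """dF/dr, used in the Newton iteration."""
    return (1.0 + (g + B / p) * (d + r)) * math.exp((g + B / p) * (d + r)) \
           - B * math.exp(-(B / m) * (d + r))

def malthusian_r(B, r0=0.05, tol=1e-12):
    """Solve F(r) = 0 by Newton's method, starting from r0."""
    r = r0
    for _ in range(100):
        step = F(r, B) / Fprime(r, B)
        r -= step
        if abs(step) < tol:
            break
    return r

# coarse scan for the brood size that maximizes r
Bopt = max(range(50, 500), key=malthusian_r)
print(Bopt)
```

With these stand-in rates the scan locates the maximum of r(B) near B = 152, in line with the result quoted above.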
2.7. THE BROOD SIZE OF A HERMAPHRODITIC WORM 33
Figure 2.6: A plot of r = r(B), which demonstrates that the Malthusian growth rate r is maximum near a brood size of B = 152.
We can also directly determine a single equation for the value of B at which r is maximum. We implicitly differentiate (2.23) with respect to B, with r the only parameter that depends on B, and apply the condition dr/dB = 0. We find
(d + r) exp[(g + B/p)(d + r)] = p exp[−(B/m)(d + r)]. (2.24)
Taking the ratio of (2.23) to (2.24) results in
1 = (m/p) {exp[(B/m)(d + r)] − 1}, (2.25)
from which we can find
d + r = (m/B) ln(1 + p/m). (2.26)
Substituting (2.26) back into either (2.23) or (2.24) results in
(m/B) (1 + p/m)^{(mg + mB/p)/B} ln(1 + p/m) = pm/(p + m). (2.27)
Equation (2.27) contains the four parameters g, p, m and B, which can be further reduced to three parameters by dimensional analysis. The brood size B is already dimensionless. The rate m at which eggs are produced and laid may be multiplied by the sperm production period s = B/p to form the dimensionless parameter x = mB/p. The parameter x represents the number of eggs forgone because of the adult sperm production period and is a measure of the cost of producing sperm. Similarly, m may be multiplied by the larval growth period g to form the dimensionless parameter y = mg. The parameter y represents the number of eggs forgone because of the juvenile growth period and is a measure of the cost of development. With B, x and y as our three dimensionless parameters, (2.27) becomes
(1/B) (1 + B/x)^{(x+y)/B} ln(1 + B/x) = B/(B + x). (2.28)
Figure 2.7: Solution curve of y versus x with brood size B = 286, obtained by solving (2.28). Values for B, g, p and m are taken from Table 2.2. The cross, crossed open circle and open circle correspond to y = mg and x = fmB/p with f = 1, 1/3 and 1/8, respectively.
[Timeline: ages 0, g − s_J, g, g + s_A, and g + s_A + e; sperm are produced over (g − s_J, g + s_A) and eggs over (g + s_A, g + s_A + e); the worm is a juvenile for 0 < a < g and an adult for a > g.]
Figure 2.8: A more refined timeline of a hermaphrodite's life.
Given values for two of the three dimensionless parameters x, y and B, (2.28) may be solved for the remaining parameter, either explicitly in the case of y = y(x, B), or by Newton's method.
The values of x and y obtained from Table 2.2 are x = 52.5 and y = 317. With B = 286, the solution y = y(x) is shown in Fig. 2.7, with the experimental value of (x, y) plotted as a cross.
The seemingly large disagreement between the theoretical result and the experimental data leads us to question the underlying assumptions of the model. Indeed, Cutter (2004) first suggested that the sperm produced precociously as a juvenile does not delay egg production and should be considered cost free. One possibility is to fix the absolute number of sperm produced precociously and to optimize the number of sperm produced as an adult. Another possibility is to fix the fraction of sperm produced precociously and to optimize the total number of sperm produced. This latter assumption was made by Cutter (2004) and seems to best improve the agreement of the model with the experimental data.
We therefore split the total sperm production period s into juvenile and adult sperm production periods s_J and s_A, with s = s_J + s_A. The revised
timeline of the hermaphrodite's life is now shown in Fig. 2.8. With the fraction of sperm produced as an adult denoted by f, and the fraction produced as a juvenile by 1 − f, we have
s_J = (1 − f)s, s_A = fs. (2.29)
Equation (2.21) for the age-specific maternity function becomes
m(a) = m for g + s_A < a < g + s_A + e, and m(a) = 0 otherwise, (2.30)
with s_A = fB/p. (2.31)
The Euler-Lotka equation (2.22) is then changed by the substitution p → p/f. Following this substitution through to the final result given by (2.28) shows that this equation still holds (it is in fact equivalent to (12) of Chasnov (2011)), but with the now changed definition
x = fmB/p. (2.32)
A close examination of the results shown in Fig. 2.7 demonstrates that near-perfect agreement can be obtained between the theoretical model and the experimental data if f = 1/8 (shown as the open circle in Fig. 2.7). Cutter (2004) suggested the value f = 1/3, and this result is shown as a crossed open circle in Fig. 2.7, still in much better agreement with the experimental data than the cross corresponding to f = 1. The added modeling of precocious sperm production through the parameter f thus seems to improve the verisimilitude of the model to the underlying biology.
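Equation (2.28) can in fact be solved explicitly for y: isolating the exponential and taking logarithms gives y = B ln[B²/((B + x) ln(1 + B/x))]/ln(1 + B/x) − x. A short Python sketch (the helper name y_of is our own) that quantifies the agreement for the three values of f discussed above:

```python
import math

def y_of(x, B):
    """Explicit solution of (2.28) for y, given x and B."""
    L = math.log(1.0 + B / x)
    return B * math.log(B**2 / ((B + x) * L)) / L - x

B, x_exp, y_exp = 286.0, 52.5, 317.0   # Table 2.2 values quoted in the text

# model-predicted y at x = f*mB/p for f = 1, 1/3, 1/8
for f in (1.0, 1.0 / 3.0, 1.0 / 8.0):
    print(f, y_of(f * x_exp, B))
```

For f = 1 the predicted y lies far above the experimental value 317; for f = 1/3 the gap narrows; and for f = 1/8 the prediction lands within a fraction of a percent of 317, mirroring the cross, crossed open circle and open circle of Fig. 2.7.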
References
Barker, D.M. Evolution of sperm shortage in a selfing hermaphrodite. Evolution (1992) 46, 1951-1955.
Cutter, A.D. Sperm-limited fecundity in nematodes: How many sperm are enough? Evolution (2004) 58, 651-655.
Chasnov, J.R. Evolution of increased self-sperm production in postdauer hermaphroditic nematodes. Evolution (2011) 65, 2117-2122.
Hodgkin, J. & Barnes, T.M. More is not better: brood size and population growth in a self-fertilizing nematode. Proc. R. Soc. Lond. B (1991) 246, 19-24.
Charlesworth, B. Evolution in age-structured populations. (1980) Cambridge University Press.
Chapter 3
Stochastic Modeling of Population Growth
Our derivation of the Malthusian growth model implicitly assumed a large population size. Smaller populations exhibit stochastic effects, and these can considerably complicate the modeling. Since, in general, modeling stochastic processes in biology is an important yet difficult topic, we will spend some time here analyzing the simplest model of births in finite populations.
3.1 A stochastic model of population growth
The size of the population N is now considered to be a discrete random variable. We define the time-dependent probability mass function p_N(t) of N to be the probability that the population is of size N at time t. Since N must take on one of the values from zero to infinity, we have
∑_{N=0}^{∞} p_N(t) = 1,
for all t ≥ 0. Again, let b be the average per capita birth rate. We make the simplifying approximations that all births are singlets, and that the probability of an individual giving birth is independent of past birthing history. We can then interpret b probabilistically by supposing that as Δt → 0, the probability that an individual gives birth during the time Δt is given by bΔt. For example, if the average per capita birthrate is one offspring every 365-day year, then the probability that a given individual gives birth on a given day is 1/365. As we will be considering the limit as Δt → 0, we neglect probabilities of more than one birth in the population in the time interval Δt since they are of order (Δt)²
or higher. Furthermore, we will suppose that at t = 0, the population size is known to be N₀, so that p_{N₀}(0) = 1, with all other p_N's at t = 0 equal to zero. We can determine a system of differential equations for the probability mass
function p_N(t) as follows. For a population to be of size N > 0 at a time t + Δt, either it was of size N − 1 at time t and one birth occurred, or it was of size N at time t and there were no births; that is,
p_N(t + Δt) = p_{N−1}(t) b(N − 1)Δt + p_N(t)(1 − bNΔt).
Subtracting p_N(t) from both sides, dividing by Δt, and taking the limit Δt → 0 results in the forward Kolmogorov differential equations,
dp_N/dt = b[(N − 1)p_{N−1} − N p_N], N = 1, 2, . . . , (3.1)
where p₀(t) = p₀(0), since a population of zero size remains zero. This system of coupled, first-order, linear differential equations can be solved iteratively.
We first review how to solve a first-order linear differential equation of the form

dy/dt + ay = g(t), y(0) = y₀, (3.2)
where y = y(t) and a is constant. First, we look for an integrating factor μ such that
d(μy)/dt = μ (dy/dt + ay).
Differentiating the left-hand side and multiplying out the right-hand side results in
(dμ/dt) y + μ dy/dt = μ dy/dt + aμy;
and canceling terms yields

dμ/dt = aμ.
We may integrate this equation with an arbitrary initial condition, so for simplicity we take μ(0) = 1. Therefore, μ(t) = e^{at}. Hence,
d(e^{at} y)/dt = e^{at} g(t).
Integrating this equation from 0 to t yields
e^{at} y(t) − y(0) = ∫₀ᵗ e^{as} g(s) ds.
Therefore, the solution is
y(t) = e^{−at} ( y(0) + ∫₀ᵗ e^{as} g(s) ds ). (3.3)
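As a quick numerical check of (3.3), the sketch below uses the illustrative (assumed) choices a = 2, g(t) = sin t, y(0) = 1, and compares the integrating-factor solution, with the integral done by trapezoidal quadrature, against a direct forward-Euler integration of (3.2):

```python
import math

a, y0, T = 2.0, 1.0, 3.0     # illustrative values: dy/dt + 2y = sin t, y(0) = 1
g = math.sin

def y_formula(t, steps=20000):
    """Evaluate (3.3), computing the integral by the trapezoidal rule."""
    if t == 0.0:
        return y0
    h = t / steps
    f = [math.exp(a * i * h) * g(i * h) for i in range(steps + 1)]
    integral = h * (sum(f) - 0.5 * (f[0] + f[-1]))
    return math.exp(-a * t) * (y0 + integral)

# forward-Euler integration of dy/dt = g(t) - a*y for comparison
n = 200000
y, h = y0, T / n
for i in range(n):
    y += h * (g(i * h) - a * y)

print(abs(y - y_formula(T)))   # the two answers agree closely
```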
The forward Kolmogorov differential equation (3.1) is of the form (3.2) with a = bN and g(t) = b(N − 1)p_{N−1}. With the population size known to be N₀ at t = 0, the initial conditions can be written succinctly as p_N(0) = δ_{N,N₀}, where δ_{ij} is the Kronecker delta, defined as
δ_{ij} = 0 if i ≠ j, and δ_{ij} = 1 if i = j.
Therefore, formal integration of (3.1) using (3.3) results in
p_N(t) = e^{−bNt} [ δ_{N,N₀} + b(N − 1) ∫₀ᵗ e^{bNs} p_{N−1}(s) ds ]. (3.4)
The first few solutions of (3.4) can now be obtained by successive integrations:
p_N(t) = 0, if N < N₀;
p_N(t) = e^{−bN₀t}, if N = N₀;
p_N(t) = N₀ e^{−bN₀t}[1 − e^{−bt}], if N = N₀ + 1;
p_N(t) = (1/2)N₀(N₀ + 1) e^{−bN₀t}[1 − e^{−bt}]², if N = N₀ + 2;
and so on.
Although we will not need this, for completeness I give the complete solution. By defining the binomial coefficient as the number of ways one can select k objects from a set of n distinct objects, where the order of selection is immaterial, we have

(n choose k) = n!/(k!(n − k)!),
(read as "n choose k"). The general solution for p_N(t), N ≥ N₀, is known to be
p_N(t) = (N − 1 choose N₀ − 1) e^{−bN₀t} [1 − e^{−bt}]^{N−N₀},
which statisticians call a shifted negative binomial distribution. The determination of the time evolution of the probability mass function of N completely solves this stochastic problem.
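It is straightforward to check this solution numerically. The Python sketch below (illustrative, assumed values b = 1, t = 1.5, N₀ = 5) verifies that the shifted negative binomial probabilities sum to one, and that their mean and variance match the expressions N₀e^{bt} and N₀e^{2bt}(1 − e^{−bt}) derived shortly in (3.9):

```python
import math

b, t, N0 = 1.0, 1.5, 5   # illustrative values

def p(N):
    """Shifted negative binomial p_N(t) for N >= N0."""
    return math.comb(N - 1, N0 - 1) * math.exp(-b * N0 * t) \
           * (1.0 - math.exp(-b * t)) ** (N - N0)

Ns = range(N0, 2000)     # truncation of the infinite sum; the tail is negligible
total = sum(p(N) for N in Ns)
mean = sum(N * p(N) for N in Ns)
var = sum(N * N * p(N) for N in Ns) - mean * mean
print(total, mean, var)
```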
Of main interest are usually the mean and variance of the population size, and although both could in principle be computed from the probability mass function, we will compute them directly from the differential equation for p_N. The definitions of the mean population size ⟨N⟩ and its variance σ² are
⟨N⟩ = ∑_{N=0}^{∞} N p_N,  σ² = ∑_{N=0}^{∞} (N − ⟨N⟩)² p_N, (3.5)
and we will make use of the equality
σ² = ⟨N²⟩ − ⟨N⟩². (3.6)
Multiplying the differential equation (3.1) by N, summing over N, and using p_N = 0 for N < N₀, we obtain
d⟨N⟩/dt = b [ ∑_{N=N₀+1}^{∞} N(N − 1)p_{N−1} − ∑_{N=N₀}^{∞} N²p_N ].
Now N(N − 1) = (N − 1)² + (N − 1), so that the first term on the right-hand side is
∑_{N=N₀+1}^{∞} N(N − 1)p_{N−1} = ∑_{N=N₀+1}^{∞} (N − 1)²p_{N−1} + ∑_{N=N₀+1}^{∞} (N − 1)p_{N−1}
                             = ∑_{N=N₀}^{∞} N²p_N + ∑_{N=N₀}^{∞} N p_N,
where the second equality was obtained by shifting the summation index downward by one. Therefore, we find the familiar Malthusian growth equation
d⟨N⟩/dt = b⟨N⟩.
Together with the initial condition ⟨N⟩(0) = N₀, we can find the solution
⟨N⟩(t) = N₀e^{bt}. (3.7)
We proceed similarly to find σ² by first determining the differential equation for ⟨N²⟩. Multiplying the differential equation for p_N, (3.1), by N² and summing over N results in
d⟨N²⟩/dt = b [ ∑_{N=N₀+1}^{∞} N²(N − 1)p_{N−1} − ∑_{N=N₀}^{∞} N³p_N ].
Here, we use the equality N²(N − 1) = (N − 1)³ + 2(N − 1)² + (N − 1). Proceeding in the same way as above by shifting the index downward, we obtain
d⟨N²⟩/dt − 2b⟨N²⟩ = b⟨N⟩. (3.8)
Since ⟨N⟩ is known, (3.8) is a first-order, linear, inhomogeneous equation for ⟨N²⟩, which can be solved using an integrating factor. The solution obtained using (3.3) is
⟨N²⟩ = e^{2bt} ( N₀² + b ∫₀ᵗ e^{−2bs} ⟨N⟩(s) ds ),
with ⟨N⟩ given by (3.7). Performing the integration, we obtain
⟨N²⟩ = e^{2bt} [ N₀² + N₀(1 − e^{−bt}) ].
Finally, using σ² = ⟨N²⟩ − ⟨N⟩², we obtain the variance. Thus we arrive at our final result for the population mean and variance:
⟨N⟩ = N₀e^{bt},  σ² = N₀e^{2bt} (1 − e^{−bt}). (3.9)
The coefficient of variation c_v measures the standard deviation relative to the mean, and is here given by
c_v = σ/⟨N⟩ = √[(1 − e^{−bt})/N₀].
For large t, the coefficient of variation therefore goes like 1/√N₀, and is small when N₀ is large. In the next section, we will determine the limiting form of the probability distribution for large N₀, recovering both the deterministic model and a Gaussian model approximation.
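The 1/√N₀ behavior is easy to see concretely; a two-line Python sketch (the helper name cv is our own):

```python
import math

def cv(N0, bt):
    """Coefficient of variation from the result above: sqrt((1 - e^{-bt})/N0)."""
    return math.sqrt((1.0 - math.exp(-bt)) / N0)

# increasing N0 by a factor of 100 shrinks c_v by exactly a factor of 10,
# since the bt-dependence cancels in the ratio
ratio = cv(10, 5.0) / cv(1000, 5.0)
print(ratio)   # prints 10 up to rounding
```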
3.2 Asymptotics of large initial populations
Our goal here is to solve for an expansion of the distribution in powers of 1/N₀ to leading orders; notice that 1/N₀ is small if N₀ is large. To zeroth order, that is, in the limit N₀ → ∞, we will show that the deterministic model of population growth is recovered. To first order in 1/N₀, we will show that the probability distribution is normal. The latter result will be seen to be a consequence of the well-known Central Limit Theorem in probability theory.
We develop our expansion by working directly with the differential equation for p_N(t). Now, when the population size N is a discrete random variable (taking only the nonnegative integer values 0, 1, 2, . . .), p_N(t) is the probability mass function for N. If N₀ is large, then the discrete nature of N is inconsequential, and it is preferable to work with a continuous random variable and its probability density function. Accordingly, we define the random variable x = N/N₀, and treat x as a continuous random variable, with 0 ≤ x < ∞. Now, p_N(t) is the probability that the population is of size N at time t, and the probability density function of x, P(x, t), is defined such that ∫_{x₁}^{x₂} P(x, t)dx is the probability that x₁ ≤ x ≤ x₂. The relationship between p and P can be determined by considering how to approximate a discrete probability distribution by a continuous distribution, that is, by defining P such that
p_N(t) = ∫_{(N−1/2)/N₀}^{(N+1/2)/N₀} P(x, t) dx = P(N/N₀, t)/N₀,
where the last equality becomes exact as N₀ → ∞. Therefore, the appropriate definition for P(x, t) is given by
P(x, t) = N₀ p_N(t), x = N/N₀, (3.10)
which satisfies

∫₀^∞ P(x, t) dx = ∑_{N=0}^{∞} P(N/N₀, t)(1/N₀) = ∑_{N=0}^{∞} p_N(t) = 1,

the first equality (exact only when N₀ → ∞) being a Riemann sum approximation of the integral.
We now transform the infinite set of ordinary differential equations (3.1) for p_N(t) into a single partial differential equation for P(x, t). We multiply (3.1) by N₀ and substitute N = N₀x, p_N(t) = P(x, t)/N₀, and p_{N−1}(t) = P(x − 1/N₀, t)/N₀ to obtain
∂P(x, t)/∂t = b [ (N₀x − 1)P(x − 1/N₀, t) − N₀x P(x, t) ]. (3.11)
We next Taylor series expand P(x − 1/N₀, t) around x, treating 1/N₀ as a small
parameter. That is, we make use of

P(x − 1/N₀, t) = P(x, t) − (1/N₀)P_x(x, t) + (1/2N₀²)P_xx(x, t) − · · ·
             = ∑_{n=0}^{∞} [(−1)ⁿ/(n! N₀ⁿ)] ∂ⁿP/∂xⁿ.
The two leading terms proportional to N₀ on the right-hand side of (3.11) cancel exactly, and if we group the remaining terms in powers of 1/N₀, we obtain for the first three leading terms in the expansion
P_t = −b [ (xP_x + P) − (1/N₀)(xP_xx/2! + P_x/1!) + (1/N₀²)(xP_xxx/3! + P_xx/2!) − · · · ]
   = −b [ (xP)_x − (1/(N₀ 2!))(xP)_xx + (1/(N₀² 3!))(xP)_xxx − · · · ]; (3.12)
and higher-order terms can be obtained by following the evident pattern. Equation (3.12) may be further analyzed by a perturbation expansion of the probability density function in powers of 1/N₀:
P(x, t) = P^(0)(x, t) + (1/N₀)P^(1)(x, t) + (1/N₀²)P^(2)(x, t) + · · · . (3.13)
Here, the unknown functions P^(0)(x, t), P^(1)(x, t), P^(2)(x, t), etc. are to be determined by substituting the expansion (3.13) into (3.12) and equating coefficients of powers of 1/N₀. We thus obtain for the coefficients of (1/N₀)⁰ and (1/N₀)¹,
P^(0)_t = −b (xP^(0))_x, (3.14)

P^(1)_t = −b [ (xP^(1))_x − (1/2)(xP^(0))_xx ]. (3.15)
3.2.1 Derivation of the deterministic model
The zeroth-order term in the perturbation expansion (3.13),

P^(0)(x, t) = lim_{N₀→∞} P(x, t),
satisfies (3.14). Equation (3.14) is a first-order linear partial differential equation with a variable coefficient. One way to solve this equation is to try the ansatz
P^(0)(x, t) = h(t)f(q), q = q(x, t), (3.16)
together with the initial condition

P^(0)(x, 0) = f(x),

since at t = 0, the probability distribution is assumed to be a known function. This initial condition imposes further useful constraints on the functions h(t) and q(x, t):

h(0) = 1, q(x, 0) = x.
The partial derivatives of (3.16) are

P^(0)_t = h′f + h q_t f′, P^(0)_x = h q_x f′,
which after substitution into (3.14) results in
(h′ + bh) f + (q_t + bx q_x) h f′ = 0.
This equation can be satisfied for any f provided that h(t) and q(x, t) satisfy
h′ + bh = 0, q_t + bx q_x = 0. (3.17)
The first equation for h(t), together with the initial condition h(0) = 1, is easily solved to yield
h(t) = e^{−bt}. (3.18)
To determine a solution for q(x, t), we try the technique of separation of variables. We write q(x, t) = X(x)T(t), and upon substitution into the differential equation for q(x, t), we obtain
XT′ + bx X′T = 0;
and division by XT and separation yields
T′/T = −bx X′/X. (3.19)
Since the left-hand side is independent of x, and the right-hand side is independent of t, both sides must be constant, independent of both x and t. Now, our initial condition is q(x, 0) = x, so that X(x)T(0) = x. Without loss of generality, we can take T(0) = 1, so that X(x) = x. The right-hand side of (3.19) is therefore equal to the constant −b, and we obtain the differential equation and initial condition
T′ + bT = 0, T(0) = 1,
which we can solve to yield T(t) = e^{−bt}. Therefore, our solution for q(x, t) is
q(x, t) = xe^{−bt}. (3.20)
By putting our solutions (3.18) and (3.20) together into our ansatz (3.16), we have obtained the general solution to the pde:
P^(0)(x, t) = e^{−bt} f(xe^{−bt}).
To determine f, we apply the initial condition of the probability mass function, p_N(0) = δ_{N,N₀}. From (3.10), the corresponding initial condition on the probability density function is
P(x, 0) = N₀ if 1 − 1/(2N₀) ≤ x ≤ 1 + 1/(2N₀), and P(x, 0) = 0 otherwise.
In the limit N₀ → ∞, P(x, 0) → P^(0)(x, 0) = δ(x − 1), where δ(x − 1) is the Dirac delta function, centered around 1. The delta function is widely used in
quantum physics and was introduced by Dirac for that purpose. It now finds many uses in applied mathematics. It can be defined by requiring that, for any function f(x),

∫_{−∞}^{+∞} f(x) δ(x) dx = f(0).
The usual view of the delta function δ(x − a) is that it is zero everywhere except at x = a, at which it is infinite, and its integral is one. It is not really a function, but it is what mathematicians call a distribution.
Now, since P^(0)(x, 0) = f(x) = δ(x − 1), our solution becomes
P^(0)(x, t) = e^{−bt} δ(xe^{−bt} − 1). (3.21)
This can be rewritten by noting that (letting y = cx − a),

∫_{−∞}^{+∞} f(x) δ(cx − a) dx = (1/c) ∫_{−∞}^{+∞} f((y + a)/c) δ(y) dy = (1/c) f(a/c),
yielding the identity

δ(cx − a) = (1/c) δ(x − a/c).
From this, we can rewrite our solution (3.21) in the more intuitive form
P^(0)(x, t) = δ(x − e^{bt}). (3.22)
Using (3.22), the zeroth-order expected value of x is

⟨x₀⟩ = ∫₀^∞ x P^(0)(x, t) dx = ∫₀^∞ x δ(x − e^{bt}) dx = e^{bt};
while the zeroth-order variance is

σ²_{x₀} = ⟨x₀²⟩ − ⟨x₀⟩²
       = ∫₀^∞ x² P^(0)(x, t) dx − e^{2bt}
       = ∫₀^∞ x² δ(x − e^{bt}) dx − e^{2bt}
       = e^{2bt} − e^{2bt}
       = 0.
Thus, in the infinite population limit, the random variable x has zero variance and is therefore no longer random, but follows x = e^{bt} deterministically. We say that the probability distribution of x becomes sharp in the limit of large population sizes. The general principle of modeling large populations deterministically can simplify mathematical models when stochastic effects are unimportant.
3.2.2 Derivation of the normal probability distribution
We now consider the first-order term in the perturbation expansion (3.13), which satisfies (3.15). We do not know how to solve (3.15) directly, so we will attempt to find a solution following a more circuitous route. First, we proceed by computing the moments of the probability distribution. We have
⟨xⁿ⟩ = ∫₀^∞ xⁿ P(x, t) dx
     = ∫₀^∞ xⁿ P^(0)(x, t) dx + (1/N₀) ∫₀^∞ xⁿ P^(1)(x, t) dx + · · ·
     = ⟨x₀ⁿ⟩ + (1/N₀)⟨x₁ⁿ⟩ + · · · ,
where the last equality defines ⟨x₀ⁿ⟩, etc. Now, using (3.22),
⟨x₀ⁿ⟩ = ∫₀^∞ xⁿ P^(0)(x, t) dx = ∫₀^∞ xⁿ δ(x − e^{bt}) dx = e^{nbt}. (3.23)
To determine ⟨x₁ⁿ⟩, we use (3.15). Multiplying by xⁿ and integrating, we have
d⟨x₁ⁿ⟩/dt = −b [ ∫₀^∞ xⁿ(xP^(1))_x dx − (1/2) ∫₀^∞ xⁿ(xP^(0))_xx dx ]. (3.24)
We integrate by parts to remove the derivatives of xⁿ, assuming that xP and all its derivatives vanish on the boundaries of integration, where x is equal to zero or infinity. We have

∫₀^∞ xⁿ(xP^(1))_x dx = −∫₀^∞ n xⁿ P^(1) dx = −n⟨x₁ⁿ⟩,
and

(1/2) ∫₀^∞ xⁿ(xP^(0))_xx dx = −(n/2) ∫₀^∞ x^{n−1}(xP^(0))_x dx
                           = [n(n − 1)/2] ∫₀^∞ x^{n−1} P^(0) dx
                           = [n(n − 1)/2] ⟨x₀^{n−1}⟩.
Therefore, after integration by parts, (3.24) becomes
d⟨x₁ⁿ⟩/dt = b [ n⟨x₁ⁿ⟩ + (n(n − 1)/2)⟨x₀^{n−1}⟩ ]. (3.25)
Equation (3.25) is a first-order linear inhomogeneous differential equation and can be solved using an integrating factor (see (3.3) and the preceding discussion).
Solving this differential equation using (3.23) and the initial condition ⟨x₁ⁿ⟩(0) = 0, we obtain
⟨x₁ⁿ⟩ = [n(n − 1)/2] e^{nbt} (1 − e^{−bt}). (3.26)
The probability distribution function, accurate to order 1/N₀, may be obtained by making use of the so-called moment generating function Ψ(s), defined as
Ψ(s) = ⟨e^{sx}⟩
     = 1 + s⟨x⟩ + (s²/2!)⟨x²⟩ + (s³/3!)⟨x³⟩ + · · ·
     = ∑_{n=0}^{∞} sⁿ⟨xⁿ⟩/n!.
To order 1/N₀, we have
Ψ(s) = ∑_{n=0}^{∞} sⁿ⟨x₀ⁿ⟩/n! + (1/N₀) ∑_{n=0}^{∞} sⁿ⟨x₁ⁿ⟩/n! + O(1/N₀²). (3.27)
Now, using (3.23),
∑_{n=0}^{∞} sⁿ⟨x₀ⁿ⟩/n! = ∑_{n=0}^{∞} (se^{bt})ⁿ/n! = e^{se^{bt}},
and using (3.26),
∑_{n=0}^{∞} sⁿ⟨x₁ⁿ⟩/n!
 = (1/2)(1 − e^{−bt}) ∑_{n=0}^{∞} (1/n!) n(n − 1)(se^{bt})ⁿ
 = (1/2)(1 − e^{−bt}) s²e^{2bt} ∑_{n=0}^{∞} (1/n!) n(n − 1)(se^{bt})^{n−2}
 = (1/2)(1 − e^{−bt}) s² ∑_{n=0}^{∞} (1/n!) (d²/ds²)(se^{bt})ⁿ
 = (1/2)(1 − e^{−bt}) s² (d²/ds²) ∑_{n=0}^{∞} (se^{bt})ⁿ/n!
 = (1/2)(1 − e^{−bt}) s² (d²/ds²) e^{se^{bt}}
 = (1/2)(1 − e^{−bt}) s²e^{2bt} e^{se^{bt}}.
Therefore,
Ψ(s) = e^{se^{bt}} ( 1 + (1/2N₀)(1 − e^{−bt}) s²e^{2bt} + · · · ). (3.28)
We can recognize the parenthetical term of (3.28) as a Taylor series expansion of an exponential function truncated at first order, i.e.,
exp[ (1/2N₀)(1 − e^{−bt}) s²e^{2bt} ] = 1 + (1/2N₀)(1 − e^{−bt}) s²e^{2bt} + O(1/N₀²).
Therefore, to first order in 1/N₀, we have
Ψ(s) = exp[ se^{bt} + (1/2N₀)(1 − e^{−bt}) s²e^{2bt} ] + O(1/N₀²). (3.29)
Standard books on probability theory (e.g., A First Course in Probability by Sheldon Ross, pg. 365) detail the derivation of the moment generating function of a normal random variable:
Ψ(s) = exp( ⟨x⟩s + (1/2)σ²s² ), for a normal random variable; (3.30)
and comparing (3.30) with (3.29) shows us that the probability distribution P(x, t), to first order in 1/N₀, is normal, with mean and variance given by
⟨x⟩ = e^{bt},  σₓ² = (1/N₀) e^{2bt} (1 − e^{−bt}). (3.31)
The mean and variance of x = N/N₀ are equivalent to those derived for N in (3.9), but now we learn that N is approximately normal for large populations.
The appearance of a normal probability distribution (also called a Gaussian probability distribution) in a first-order expansion is in fact a particular case of the Central Limit Theorem, one of the most important and useful theorems in probability and statistics. We state here a simple version of this theorem without proof:
Central Limit Theorem: Suppose that X₁, X₂, . . . , Xₙ are independent and identically distributed (iid) random variables with mean ⟨X⟩ and variance σ²_X. Then for sufficiently large n, the probability distribution of the average of the Xᵢ's, denoted as the random variable Y = (1/n) ∑_{i=1}^{n} Xᵢ, is well approximated by a Gaussian with mean ⟨X⟩ and variance σ²_Y = σ²_X/n.
The central limit theorem can be applied directly to our problem. Consider that our population consists of N₀ founders. If νᵢ(t) denotes the number of individuals descended from founder i at time t (including the still-living founder), then the total number of individuals at time t is N(t) = ∑_{i=1}^{N₀} νᵢ(t); and the average number of descendants of a single founder is x(t) = N(t)/N₀. If the mean number of descendants of a single founder is ⟨ν⟩, with variance σ²_ν, then by applying the central limit theorem for large N₀, the probability distribution function of x is well approximated by a Gaussian with mean ⟨x⟩ = ⟨ν⟩ and variance σₓ² = σ²_ν/N₀. Comparing with our results (3.31), we find ⟨ν⟩ = e^{bt} and σ²_ν = e^{2bt}(1 − e^{−bt}).
3.3 Simulation of population growth
As we have seen, stochastic modeling is significantly more complicated than deterministic modeling. As the modeling becomes more detailed, it may become necessary to solve a stochastic model numerically. Here, for illustration, we show how to simulate individual realizations of population growth.
A straightforward approach would make use of the birth rate b directly. During the time Δt, each individual of a population has probability bΔt of
giving birth, accurate to order Δt. If we have N individuals at time t, we can compute the number of individuals at time t + Δt by computing N random deviates (random numbers between zero and unity) and counting the number of new births as the number of random deviates less than bΔt. In principle, this approach will be accurate provided that Δt is sufficiently small.
There is, however, a more efficient way to simulate population growth. We first determine the probability density function of the interevent time τ, defined as the time required for the population to grow from size N to size N + 1 because of a single birth. A simulation from population size N₀ to size N_f would then simply require computing N_f − N₀ different random values of τ, a relatively easy and quick computation.
Let P(τ)dτ be the probability that the interevent time for a population of size N lies in the interval (τ, τ + dτ), F(τ) = ∫₀^τ P(τ′)dτ′ the probability that the interevent time is less than τ, and G(τ) = 1 − F(τ) the probability that the interevent time is greater than τ. P(τ) is called the probability density function of τ, and F(τ) the cumulative distribution function of τ. They satisfy the relation P(τ) = F′(τ). Since the probability that the interevent time is greater than τ + Δτ is given by the probability that it is greater than τ times the probability that no new births occur in the time interval (τ, τ + Δτ), the distribution G(τ + Δτ) is given for small Δτ by
G(τ + Δτ) = G(τ)(1 − bNΔτ).
Differencing G and taking the limit Δτ → 0 yields the differential equation
dG/dτ = −bNG,
which can be integrated using the initial condition G(0) = 1 to further yield
G(τ) = e^{−bNτ}.
Therefore,
F(τ) = 1 − e^{−bNτ}, P(τ) = bNe^{−bNτ}; (3.32)

P(τ) has the form of an exponential distribution. We now compute random values for τ using its known cumulative distribution function F(τ). We do this in reference to Figure 3.1, where we plot y = F(τ) versus τ. The range of F(τ) is [0, 1), and for each y, we can compute τ by inverting y(τ) = 1 − exp(−bNτ); that is,
τ(y) = −ln(1 − y)/(bN).
Our claim is that with y a random number sampled uniformly on (0, 1), the random numbers τ(y) have probability density function P(τ).
Now, the fraction of random numbers y in the interval (y₁, y₂) is equal to the corresponding fraction of random numbers τ in the interval (τ₁, τ₂), where τ₁ = τ(y₁) and τ₂ = τ(y₂); and for y sampled uniformly on (0, 1), this fraction is equal to y₂ − y₁. We thus have the following string of equalities: (the fraction of random numbers in (y₁, y₂)) = (the fraction of random numbers in (τ₁, τ₂)) = y₂ − y₁ = F(τ₂) − F(τ₁) = ∫_{τ₁}^{τ₂} P(τ)dτ. Therefore, (the fraction of random
[Graph: y = F(τ) = 1 − e^{−bNτ} plotted versus τ, rising from 0 toward 1, with the points y₁ = F(τ₁) and y₂ = F(τ₂) marked.]
Figure 3.1: How to generate a random number with a given probability densityfunction.
numbers in (τ₁, τ₂)) = ∫_{τ₁}^{τ₂} P(τ)dτ. This is exactly the definition of τ having probability density function P(τ). Below, we illustrate a simple MATLAB function that simulates one realization of population growth from initial size N₀ to final size N_f, with birth rate b.
function [t, N] = population_growth_simulation(b, N0, Nf)
% simulates one realization of population growth from N0 to Nf with birth rate b
N = N0:Nf;                         % population sizes attained
y = rand(1, Nf-N0);                % uniform random deviates
tau = -log(1-y)./(b*N(1:Nf-N0));   % interevent times
t = [0 cumsum(tau)];               % cumulative sum of interevent times
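For comparison, an equivalent sketch in Python (the function name simulate_growth is our own), using the same inverse transform τ = −ln(1 − y)/(bN):

```python
import math
import random

def simulate_growth(b, N0, Nf):
    """One realization of growth from N0 to Nf with birth rate b."""
    N = list(range(N0, Nf + 1))
    t = [0.0]
    for n in N[:-1]:                 # population size before each birth
        y = random.random()          # uniform deviate on [0, 1)
        t.append(t[-1] - math.log(1.0 - y) / (b * n))
    return t, N

random.seed(1)
t, N = simulate_growth(b=1.0, N0=10, Nf=100)
print(t[-1])   # time to grow from 10 to 100; averages roughly ln(10) over realizations
```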
The function population_growth_simulation.m can be driven by a MATLAB script to compute realizations of population growth. For instance, the following script computes 100 realizations of population growth from 10 to 100 with b = 1 and plots all the realizations:
% calculate nreal realizations and plot
b=1; N0=10; Nf=100;
nreal=100;
for i=1:nreal
[t,N]=population_growth_simulation(b,N0,Nf);
plot(t,N); hold on;
end
xlabel('t'); ylabel('N');
Figure 3.2 presents three graphs, showing 25 realizations of population growth starting with population sizes of 10, 100, and 1000, and ending with population
Figure 3.2: Twenty-five realizations of population growth with initial population sizes of 10, 100, and 1000, in (a), (b), and (c), respectively.
sizes a factor of 10 larger. Observe that the variance, relative to the initial population size, decreases as the initial population size increases, following our analytical result (3.31).
Chapter 4
Infectious Disease Modeling
In the late 1320s, an outbreak of the bubonic plague occurred in China. This disease is caused by the bacterium Yersinia pestis and is transmitted from rats to humans by fleas. The outbreak in China spread west, and the first major outbreak in Europe occurred in 1347. During a five-year period, 25 million people in Europe, approximately 1/3 of the population, died of the black death. Other more recent epidemics include the influenza pandemic known as the Spanish flu, which killed 50-100 million people worldwide during the years 1918-1919, and the present AIDS epidemic, originating in Africa and first recognized in the USA in 1981, which has killed more than 25 million people. For comparison, the SARS epidemic, for which Hong Kong was the global epicenter, resulted in 8096 known SARS cases and 774 deaths. Yet, we know well that this relatively small epidemic caused local social and economic turmoil.
Here, we introduce the most basic mathematical models of infectious disease epidemics and endemics. These models form the basis of the necessarily more detailed models currently used by world health organizations, both to predict the future spread of a disease and to develop strategies for containment and eradication.
4.1 The SI model
The simplest model of an infectious disease categorizes people as either susceptible or infective (SI). One can imagine that susceptible people are healthy and infective people are sick. A susceptible person can become infective by contact with an infective. Here, and in all subsequent models, we assume that the population under study is well mixed, so that every person has equal probability of coming into contact with every other person. This is a major approximation. For example, while the population of Amoy Gardens could be considered well mixed during the SARS epidemic because of shared water pipes and elevators, the population of Hong Kong as a whole could not, because of the larger geographical distances and the limited travel of many people outside the neighborhoods where they live.
We derive the governing differential equation for the SI model by considering the number of people that become infective during time $\Delta t$. Let $\beta \Delta t$ be the probability that a random infective person infects a random susceptible person during time $\Delta t$. Then with $S$ susceptible and $I$ infective people, the expected number of newly infected people in the total population during time $\Delta t$ is $\beta \Delta t\, S I$. Thus,
\[ I(t + \Delta t) = I(t) + \beta \Delta t\, S(t) I(t), \]
and in the limit $\Delta t \to 0$,
\[ \frac{dI}{dt} = \beta S I. \tag{4.1} \]
We diagram (4.1) as
\[ S \xrightarrow{\;\beta S I\;} I. \]
Later, diagrams will make it easier to construct more complicated systems of equations. We now assume a constant population size $N$, neglecting births and deaths, so that $S + I = N$. We can eliminate $S$ from (4.1) and rewrite the equation as
\[ \frac{dI}{dt} = \beta N I \left( 1 - \frac{I}{N} \right), \]
which can be recognized as a logistic equation, with growth rate $\beta N$ and carrying capacity $N$. Therefore $I \to N$ as $t \to \infty$ and the entire population becomes infective.
4.2 The SIS model
The SI model may be extended to the SIS model, where an infective can recover and become susceptible again. We assume that the probability that an infective recovers during time $\Delta t$ is given by $\gamma \Delta t$. Then the total number of infective people that recover during time $\Delta t$ is given by $I \times \gamma \Delta t$, and
\[ I(t + \Delta t) = I(t) + \beta \Delta t\, S(t) I(t) - \gamma \Delta t\, I(t), \]
or as $\Delta t \to 0$,
\[ \frac{dI}{dt} = \beta S I - \gamma I, \tag{4.2} \]
which we diagram as
\[ S \; \underset{\gamma I}{\overset{\beta S I}{\rightleftharpoons}} \; I. \]
Using $S + I = N$, we eliminate $S$ from (4.2) and define the basic reproductive ratio as
\[ \mathcal{R}_0 = \frac{\beta N}{\gamma}. \tag{4.3} \]
Equation (4.2) may then be rewritten as
\[ \frac{dI}{dt} = \gamma (\mathcal{R}_0 - 1) I \left( 1 - \frac{I}{N(1 - 1/\mathcal{R}_0)} \right), \]
which is again a logistic equation, but now with growth rate $\gamma(\mathcal{R}_0 - 1)$ and carrying capacity $N(1 - 1/\mathcal{R}_0)$. The disease will disappear if the growth rate is negative, that is, $\mathcal{R}_0 < 1$, and it will become endemic if the growth rate is positive, that is, $\mathcal{R}_0 > 1$. For an endemic disease with $\mathcal{R}_0 > 1$, the number of infected people approaches the carrying capacity: $I \to N(1 - 1/\mathcal{R}_0)$ as $t \to \infty$.
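As a concrete illustration (the value of $\mathcal{R}_0$ here is chosen for the example, not taken from the notes): if $\mathcal{R}_0 = 2$, then
\[ \text{growth rate} = \gamma(\mathcal{R}_0 - 1) = \gamma, \qquad \text{carrying capacity} = N\left(1 - \tfrac{1}{2}\right) = \frac{N}{2}, \]
so in the endemic steady state half of the population is infective at any given time.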
We can give a biological interpretation to the basic reproductive ratio $\mathcal{R}_0$. Let $l(t)$ be the probability that an individual initially infected at $t = 0$ is still infective at time $t$. Since the probability of being infective at time $t + \Delta t$ is equal to the probability of being infective at time $t$ multiplied by the probability of not recovering during time $\Delta t$, we have
\[ l(t + \Delta t) = l(t)(1 - \gamma \Delta t), \]
or as $\Delta t \to 0$,
\[ \frac{dl}{dt} = -\gamma l. \]
With initial condition $l(0) = 1$,
\[ l(t) = e^{-\gamma t}. \tag{4.4} \]
Now, the expected number of secondary infections produced by a single primary infective over the time period $(t, t + dt)$ is given by the probability that the primary infective is still infectious at time $t$ multiplied by the expected number of secondary infections produced by a single infective during time $dt$; that is, $l(t) \times S(t) \beta\, dt$. We assume that the total number of secondary infections from a single infective individual is small relative to the population size $N$. Therefore, the expected number of secondary infectives produced by a single primary infective introduced into a completely susceptible population is
\[ \int_0^\infty \beta S(t) l(t)\, dt \approx \beta N \int_0^\infty l(t)\, dt = \beta N \int_0^\infty e^{-\gamma t}\, dt = \frac{\beta N}{\gamma} = \mathcal{R}_0, \]
where we have approximated $S(t) \approx N$ during the time period in which the infective remains infectious. If a single infected individual introduced into a completely susceptible population produces more than one secondary infection before recovering, then $\mathcal{R}_0 > 1$ and the disease becomes endemic.
We have seen previously two other analogous calculations with results similar to (4.4): the first when computing the probability distribution of a worm's lifetime (§2.7), and the second when computing the probability distribution of the interevent time between births (§3.3). We have also seen an analogous definition of the basic reproductive ratio in our previous discussion of age-structured populations (§2.5). There, the basic reproductive ratio was the number of female offspring expected from a newborn female over her lifetime; the population size would grow if this value was greater than unity. Both $l(t)$ and $\mathcal{R}_0$ are good examples of how the same mathematical concept can be applied to seemingly unrelated biological problems.
4.3 The SIR epidemic disease model
The SIR model, first published by Kermack and McKendrick in 1927, is undoubtedly the most famous mathematical model for the spread of an infectious disease. Here, people are characterized into three classes: susceptible $S$, infective $I$ and removed $R$. Removed individuals are no longer susceptible nor infective for whatever reason; for example, they have recovered from the disease and are now immune, or they have been vaccinated, or they have been isolated from the rest of the population, or perhaps they have died from the disease. As in the SIS model, we assume that infectives leave the $I$ class with constant rate $\gamma$, but in the SIR model they move directly into the $R$ class. The model may be diagrammed as
\[ S \xrightarrow{\;\beta S I\;} I \xrightarrow{\;\gamma I\;} R, \]
and the corresponding coupled differential equations are
\[ \frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \gamma I, \qquad \frac{dR}{dt} = \gamma I, \tag{4.5} \]
with the constant population constraint $S + I + R = N$. For convenience, we nondimensionalize (4.5) using $N$ for population size and $\gamma^{-1}$ for time; that is, let
\[ \hat{S} = S/N, \quad \hat{I} = I/N, \quad \hat{R} = R/N, \quad \hat{t} = \gamma t, \]
and define the dimensionless basic reproductive ratio as
\[ \mathcal{R}_0 = \frac{\beta N}{\gamma}. \tag{4.6} \]
The dimensionless SIR equations are then given by
\[ \frac{d\hat{S}}{d\hat{t}} = -\mathcal{R}_0 \hat{S} \hat{I}, \qquad \frac{d\hat{I}}{d\hat{t}} = \mathcal{R}_0 \hat{S} \hat{I} - \hat{I}, \qquad \frac{d\hat{R}}{d\hat{t}} = \hat{I}, \tag{4.7} \]
with dimensionless constraint $\hat{S} + \hat{I} + \hat{R} = 1$.

We will use the SIR model to address two fundamental questions: (1) Under what condition does an epidemic occur? (2) If an epidemic occurs, what fraction of a well-mixed population gets sick?
Let $(\hat{S}_*, \hat{I}_*, \hat{R}_*)$ be the fixed points of (4.7). Setting $d\hat{S}/d\hat{t} = d\hat{I}/d\hat{t} = d\hat{R}/d\hat{t} = 0$, we immediately observe from the equation for $d\hat{R}/d\hat{t}$ that $\hat{I}_* = 0$, and this value forces all the time-derivatives to vanish for any $\hat{S}$ and $\hat{R}$. Since with $\hat{I}_* = 0$ we have $\hat{R}_* = 1 - \hat{S}_*$, evidently all the fixed points of (4.7) are given by the one-parameter family $(\hat{S}_*, \hat{I}_*, \hat{R}_*) = (\hat{S}_*, 0, 1 - \hat{S}_*)$.
An epidemic occurs when a small number of infectives introduced into a susceptible population results in an increasing number of infectives. We can assume an initial population at a fixed point of (4.7), perturb this fixed point by introducing a small number of infectives, and determine the fixed point's stability. An epidemic occurs when the fixed point is unstable. The linear stability problem may be solved by considering only the equation for $d\hat{I}/d\hat{t}$ in (4.7). With $\hat{I} \ll 1$ and $\hat{S} \approx \hat{S}_0$, we have
\[ \frac{d\hat{I}}{d\hat{t}} = \left( \mathcal{R}_0 \hat{S}_0 - 1 \right) \hat{I}, \]
Figure 4.1: The fraction of the population that gets sick in the SIR model as a function of the basic reproductive ratio $\mathcal{R}_0$.
so that an epidemic occurs if $\mathcal{R}_0 \hat{S}_0 - 1 > 0$. With the basic reproductive ratio given by (4.6), and $\hat{S}_0 = S_0/N$, where $S_0$ is the number of initial susceptible individuals, an epidemic occurs if
\[ \mathcal{R}_0 \hat{S}_0 = \frac{\beta S_0}{\gamma} > 1, \tag{4.8} \]
which could have been guessed. An epidemic occurs if an infective individual introduced into a population of $S_0$ susceptible individuals infects on average more than one other person.
We now address the second question: If an epidemic occurs, what fraction of the population gets sick? For simplicity, we assume that the entire initial population is susceptible to the disease, so that $\hat{S}_0 = 1$. We expect the solution of the governing equations (4.7) to approach a fixed point asymptotically in time (so that the final number of infectives will be zero), and we define this fixed point to be $(\hat{S}, \hat{I}, \hat{R}) = (1 - \hat{R}_\infty, 0, \hat{R}_\infty)$, with $\hat{R}_\infty$ equal to the fraction of the population that gets sick. To compute $\hat{R}_\infty$, it is simpler to work with a transformed version of (4.7). By the chain rule, $d\hat{S}/d\hat{t} = (d\hat{S}/d\hat{R})(d\hat{R}/d\hat{t})$, so that
\[ \frac{d\hat{S}}{d\hat{R}} = \frac{d\hat{S}/d\hat{t}}{d\hat{R}/d\hat{t}} = -\mathcal{R}_0 \hat{S}, \]
which is separable. Separating and integrating from the initial to final conditions,
\[ \int_1^{1 - \hat{R}_\infty} \frac{d\hat{S}}{\hat{S}} = -\mathcal{R}_0 \int_0^{\hat{R}_\infty} d\hat{R}, \]
which upon integration and simplification results in the following transcendental equation for $\hat{R}_\infty$:
\[ 1 - \hat{R}_\infty - e^{-\mathcal{R}_0 \hat{R}_\infty} = 0, \tag{4.9} \]
an equation that can be solved numerically using Newton's method. We have
\[ F(\hat{R}_\infty) = 1 - \hat{R}_\infty - e^{-\mathcal{R}_0 \hat{R}_\infty}, \qquad F'(\hat{R}_\infty) = -1 + \mathcal{R}_0 e^{-\mathcal{R}_0 \hat{R}_\infty}; \]
and Newton's method for solving $F(\hat{R}_\infty) = 0$ iterates
\[ \hat{R}_\infty^{(n+1)} = \hat{R}_\infty^{(n)} - \frac{F(\hat{R}_\infty^{(n)})}{F'(\hat{R}_\infty^{(n)})} \]
for fixed $\mathcal{R}_0$ and a suitable initial condition for $\hat{R}_\infty^{(0)}$, which we take to be unity.
My code for computing $\hat{R}_\infty$ as a function of $\mathcal{R}_0$ is given below, and the result is shown in Fig. 4.1. There is an explosion in the number of infections as $\mathcal{R}_0$ increases from unity, and this rapid increase is a classic example of what is known more generally as a threshold phenomenon.
function [R0, R_inf] = sir_rinf
% computes solution of R_inf using Newton's method from SIR model
nmax=10; numpts=1000;
R0 = linspace(0,2,numpts); R_inf = ones(1,numpts);
for i=1:nmax
R_inf = R_inf - F(R_inf,R0)./Fp(R_inf,R0);
end
plot(R0,R_inf); axis([0 2 -0.02 0.8])
xlabel('$\mathcal{R}_0$', 'Interpreter', 'latex', 'FontSize',16)
ylabel('$\hat R_\infty$', 'Interpreter', 'latex', 'FontSize',16);
title('fraction of population that get sick')
%subfunctions
function y = F(R_inf,R0)
y = 1 - R_inf - exp(-R0.*R_inf);
function y = Fp(R_inf,R0)
y = -1 + R0.*exp(-R0.*R_inf);
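The same Newton iteration is easy to check for a single value of $\mathcal{R}_0$. The sketch below does this in Python rather than MATLAB (the function and variable names mirror the code above); for $\mathcal{R}_0 = 2$, roughly 80% of the population eventually gets sick.

```python
import math

def r_infinity(R0, n_iter=50):
    """Newton's method for 1 - R - exp(-R0*R) = 0, starting from R = 1."""
    R = 1.0
    for _ in range(n_iter):
        F = 1.0 - R - math.exp(-R0 * R)        # F(R_inf)
        Fp = -1.0 + R0 * math.exp(-R0 * R)     # F'(R_inf)
        R = R - F / Fp                          # Newton step
    return R

print(r_infinity(2.0))  # roughly 0.797, i.e. about 80% get sick
```

Starting from $\hat R_\infty^{(0)} = 1$, as in the notes, keeps the iteration away from the trivial root $\hat R_\infty = 0$ when $\mathcal{R}_0 > 1$.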
4.4 Vaccination
Table 4.1 lists the diseases for which vaccines exist and are widely administered to children. Health care authorities must determine the fraction of a population that must be vaccinated to prevent epidemics.
We address this problem within the SIR epidemic disease model. Let $p$ be the fraction of the population that is vaccinated and $p_*$ the minimum fraction required to prevent an epidemic. When $p > p_*$, an epidemic cannot occur. Since even non-vaccinated people are protected by the absence of epidemics, we say that the population has acquired herd immunity.
We assume that individuals are susceptible unless vaccinated, and vaccinated individuals are in the removed class. The initial population is then modeled as
Disease | Description | Symptoms | Complications
Diphtheria | A bacterial respiratory disease | Sore throat and low-grade fever | Airway obstruction, coma, and death
Haemophilus influenzae type b (Hib) | A bacterial infection occurring primarily in infants | Skin and throat infections, meningitis, pneumonia, sepsis, and arthritis | Death in one out of 20 children, and permanent brain damage in 10%-30% of the survivors
Hepatitis A | A viral liver disease | Potentially none; yellow skin or eyes, tiredness, stomach ache, loss of appetite, or nausea | Usually none
Hepatitis B | Same as Hepatitis A | Same as Hepatitis A | Life-long liver problems, such as scarring of the liver and liver cancer
Measles | A viral respiratory disease | Rash, high fever, cough, runny nose, and red, watery eyes | Diarrhea, ear infections, pneumonia, encephalitis, seizures, and death
Mumps | A viral lymph node disease | Fever, headache, muscle ache, and swelling of the lymph nodes close to the jaw | Meningitis, inflammation of the testicles or ovaries, inflammation of the pancreas, and deafness
Pertussis (whooping cough) | A bacterial respiratory disease | Severe spasms of coughing | Pneumonia, encephalitis, and death, especially in infants
Pneumococcal disease | A bacterial disease | High fever, cough, stabbing chest pains, bacteremia, and meningitis | Death
Polio | A viral lymphatic and nervous system disease | Fever, sore throat, nausea, headaches, stomach aches, stiffness in the neck, back, and legs | Paralysis that can lead to permanent disability and death
Rubella (German measles) | A viral respiratory disease | Rash and fever for two to three days | Birth defects if acquired by a pregnant woman
Tetanus (lockjaw) | A bacterial nervous system disease | Lockjaw, stiffness in the neck and abdomen, and difficulty swallowing | Death in one third of the cases, especially people over age 50
Varicella (chickenpox) | A viral disease in the Herpes family | A skin rash of blister-like lesions | Bacterial infection of the skin, swelling of the brain, and pneumonia
Human papillomavirus | A viral skin and mucous membrane disease | Warts, cervical cancer | The 5-year survival rate from all diagnoses of cervical cancer is 72%

Table 4.1: Previously common diseases for which vaccines have been developed.
$(\hat{S}, \hat{I}, \hat{R}) = (1 - p, 0, p)$. We have already determined the stability of this fixed point to perturbation by a small number of infectives. The condition for an epidemic to occur is given by (4.8), and with $\hat{S}_0 = 1 - p$, an epidemic occurs if
\[ \mathcal{R}_0 (1 - p) > 1. \]
Therefore, the minimum fraction of the population that must be vaccinated to prevent an epidemic is
\[ p_* = 1 - \frac{1}{\mathcal{R}_0}. \]
Diseases with smaller values of $\mathcal{R}_0$ are easier to eradicate than diseases with larger values of $\mathcal{R}_0$, since a population can acquire herd immunity with a smaller fraction of the population vaccinated. For example, smallpox with $\mathcal{R}_0 \approx 4$ has been eradicated throughout the world, whereas measles with $\mathcal{R}_0 \approx 17$ still has occasional outbreaks.
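Plugging in the two $\mathcal{R}_0$ values quoted above makes the difference concrete:
\[ \text{smallpox:}\quad p_* = 1 - \tfrac{1}{4} = 0.75, \qquad \text{measles:}\quad p_* = 1 - \tfrac{1}{17} \approx 0.94, \]
so roughly three-quarters vaccination coverage sufficed for smallpox, while measles requires about 94% coverage to achieve herd immunity.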
4.5 The SIR endemic disease model
A disease that is constantly present in a population is said to be endemic. For example, malaria is endemic to Sub-Saharan Africa, where about 90% of malaria-related deaths occur. Endemic diseases prevail over long time scales: babies are born, old people die. Let $b$ be the birth rate and $d$ the disease-unrelated death rate. We separately define $c$ to be the disease-related death rate; $R$ is now the immune class. We may diagram an SIR model of an endemic disease as
\[ \xrightarrow{\;bN\;} S \xrightarrow{\;\beta S I\;} I \xrightarrow{\;\gamma I\;} R, \]
where, in addition, individuals leave the $S$, $I$ and $R$ classes through death at rates $dS$, $(d + c)I$ and $dR$, respectively,
and the governing differential equations are
\[ \frac{dS}{dt} = bN - \beta S I - dS, \qquad \frac{dI}{dt} = \beta S I - (d + c + \gamma) I, \qquad \frac{dR}{dt} = \gamma I - dR, \tag{4.10} \]
with $N = S + I + R$. In our endemic disease model, $N$ separately satisfies the differential equation
\[ \frac{dN}{dt} = (b - d)N - cI, \tag{4.11} \]
and is not necessarily constant.

A disease can become endemic in a population if $dI/dt$ stays nonnegative; that is,
\[ \frac{\beta S(t)}{d + c + \gamma} \ge 1. \]
For a disease to become endemic, newborns must introduce an endless supply of new susceptibles into a population.
4.6 Evolution of virulence
Microorganisms continuously evolve due to selection pressures in their environments. Antibiotics are a common source of selection pressure on pathogenic bacteria, and the development of antibiotic-resistant strains presents a major health challenge to medical science. Bacteria and viruses also compete directly with each other for reproductive success, resulting in the evolution of virulence. Here, using the SIR endemic disease model, we study how virulence may evolve.
For the sake of argument, we will assume that a population is initially in equilibrium with an endemic disease caused by a wildtype virus; that is, $S$, $I$ and $R$ are assumed to be nonzero and at equilibrium values. Now suppose that some virus particles mutate by a random, undirected process that occurs naturally. We want to determine the conditions under which the mutant virus will replace the wildtype virus in the population. In mathematical terms, we want to determine the linear stability of the endemic disease equilibrium to the introduction of a mutant viral strain.
We assume that the original wildtype virus has infection rate $\beta$, removal rate $\gamma$, and disease-related death rate $c$, and that the mutant virus has corresponding rates $\beta'$, $\gamma'$ and $c'$. We further assume that an individual infected with either a wildtype or mutant virus gains immunity to subsequent infection from both wildtype and mutant viral forms. Our model thus has a single susceptible class $S$, two distinct infective classes $I$ and $I'$ depending on which virus causes the infection, and a single recovered class $R$. The appropriate diagram is
\[ \xrightarrow{\;bN\;} S \xrightarrow{\;\beta S I\;} I \xrightarrow{\;\gamma I\;} R, \qquad S \xrightarrow{\;\beta' S I'\;} I' \xrightarrow{\;\gamma' I'\;} R, \]
where, in addition, individuals leave the $S$, $I$, $I'$ and $R$ classes through death at rates $dS$, $(d + c)I$, $(d + c')I'$ and $dR$, respectively,
with corresponding differential equations
\[ \frac{dS}{dt} = bN - dS - S(\beta I + \beta' I'), \tag{4.12} \]
\[ \frac{dI}{dt} = \beta S I - (\gamma + d + c) I, \tag{4.13} \]
\[ \frac{dI'}{dt} = \beta' S I' - (\gamma' + d + c') I', \tag{4.14} \]
\[ \frac{dR}{dt} = \gamma I + \gamma' I' - dR. \tag{4.15} \]
If the population is initially in equilibrium with the wildtype virus, then we have $I \neq 0$ with $I' = 0$, and the equilibrium value for $S$ is determined from (4.13) to be
\[ S_* = \frac{\gamma + d + c}{\beta}, \tag{4.16} \]
which corresponds to a basic reproductive ratio $\beta S_*/(\gamma + d + c)$ of unity.

We perturb this endemic disease equilibrium by introducing a small number of infectives carrying the mutated virus, that is, by letting $I'$ be small. Rather than solve the stability problem by means of a Jacobian analysis, we can directly examine the equation for $dI'/dt$ given by (4.14). Here, with $S = S_*$ given by (4.16), we have
\[ \frac{dI'}{dt} = \left[ \frac{\beta'(\gamma + d + c)}{\beta} - (\gamma' + d + c') \right] I'; \]
and $I'$ increases exponentially if
\[ \frac{\beta'(\gamma + d + c)}{\beta} - (\gamma' + d + c') > 0, \]
or after some elementary algebra,
\[ \frac{\beta'}{\gamma' + d + c'} > \frac{\beta}{\gamma + d + c}. \tag{4.17} \]
Our result (4.17) suggests that endemic viruses (or other microorganisms) will tend to evolve (i) to be more easily transmitted between people ($\beta' > \beta$); (ii) to make people sick longer ($\gamma' < \gamma$); and (iii) to be less deadly ($c' < c$). In other words, viruses evolve to increase their basic reproductive ratios. For instance, our model suggests that viruses evolve to be less deadly because the dead do not spread disease. Our result would not be applicable, however, if the dead in fact did spread disease, a possibility if disposal of the dead was not done with sufficient care, perhaps because of certain cultural traditions such as family washing of the dead body.
Chapter 5
Population Genetics
Deoxyribonucleic acid, or DNA, is a large double-stranded, helical molecule, with rungs made from the four base pairs adenine (A), cytosine (C), thymine (T) and guanine (G), and it carries inherited genetic information. The ordering of the base pairs A, C, T and G determines the DNA sequence. A gene is a particular DNA sequence that is the fundamental unit of heredity for a particular trait. Some species develop as diploids, carrying two copies of every gene, one from each parent, and some species develop as haploids with only one copy. There are even species that develop as both diploids and haploids. When we say there is a gene for pea color, say, we mean there is a particular DNA sequence that may vary in a pea plant population, and that there are at least two subtypes, called alleles, where plants with two copies of the yellow-color allele have yellow peas, and those with two copies of the green-color allele, green peas. A plant with two copies of the same allele is homozygous for that particular gene (or a homozygote), while a plant carrying two different alleles is heterozygous (or a heterozygote). For the pea color gene, a plant carrying both a yellow- and green-color allele has yellow peas. We say that the green color is a recessive trait (or the green-color allele is recessive), and the yellow color is a dominant trait (or the yellow-color allele is dominant). The combination of alleles carried by the plant is called its genotype, while the actual trait (green or yellow peas) is called its phenotype. A gene that has more than one allele in a population is called polymorphic, and we say the population has a polymorphism for that particular gene.
Population genetics can be defined as the mathematical modeling of the evolution and maintenance of polymorphism in a population. Population genetics, together with Charles Darwin's theory of evolution by natural selection and Gregor Mendel's theory of biological inheritance, forms the modern evolutionary synthesis (sometimes called the modern synthesis, the evolutionary synthesis, the neo-Darwinian synthesis, or neo-Darwinism). The primary founders in the early twentieth century of population genetics were Sewall Wright, J. B. S. Haldane and Ronald Fisher.
Allele frequencies in a population can change due to the influence of four primary evolutionary forces: natural selection, genetic drift, mutation, and migration. Here, we mainly focus on natural selection and mutation. Genetic drift is the study of stochastic effects, and it is important in small populations. Migration requires consideration of the spatial distribution of a population, and it is usually modeled mathematically by partial differential equations.
genotype | $A$ | $a$
number | $n_A$ | $n_a$
viability fitness | $g_A$ | $g_a$
fertility fitness | $f_A$ | $f_a$

Table 5.1: Haploid genetics using population size, absolute viability, and fertility fitnesses.
The simplified models we will consider assume infinite population sizes (neglecting stochastic effects except in §5.5), well-mixed populations (neglecting any spatial distribution), and discrete generations (neglecting any age-structure). Our main purpose is to illustrate the fundamental ways that a genetic polymorphism can be maintained in a population.
5.1 Haploid genetics
We first consider the modeling of selection in a population of haploid organisms. Selection is modeled by fitness coefficients, with different genotypes having different fitnesses. We begin with a simple model that counts the number of individuals in the next generation, and then show how this model can be reformulated in terms of allele frequencies and relative fitness coefficients.
Table 5.1 formulates the basic model. We assume that there are two alleles $A$ and $a$ for a particular haploid gene. These alleles are carried in the population by $n_A$ and $n_a$ individuals, respectively. A fraction $g_A$ ($g_a$) of individuals carrying allele $A$ ($a$) is assumed to survive to reproduction age, and those that survive contribute $f_A$ ($f_a$) offspring to the next generation. These are of course average values, but under the assumption of an (almost) infinite population, our model is deterministic. Accordingly, with $n_A^{(i)}$ ($n_a^{(i)}$) representing the number of individuals carrying allele $A$ ($a$) in the $i$th generation, and formulating a discrete generation model, we have
\[ n_A^{(i+1)} = g_A f_A n_A^{(i)}, \qquad n_a^{(i+1)} = g_a f_a n_a^{(i)}. \tag{5.1} \]
It is mathematically easier and more transparent to work with allele frequencies rather than individual numbers. We denote the frequency (or more accurately, proportion) of allele $A$ ($a$) in the $i$th generation by $p_i$ ($q_i$); that is,
\[ p_i = \frac{n_A^{(i)}}{n_A^{(i)} + n_a^{(i)}}, \qquad q_i = \frac{n_a^{(i)}}{n_A^{(i)} + n_a^{(i)}}, \]
where evidently $p_i + q_i = 1$. Now, from (5.1),
\[ n_A^{(i+1)} + n_a^{(i+1)} = g_A f_A n_A^{(i)} + g_a f_a n_a^{(i)}, \tag{5.2} \]
genotype | $A$ | $a$
frequency of gamete | $p$ | $q$
relative fitness | $1+s$ | $1$
frequency after selection | $(1+s)p/w$ | $q/w$
normalization | $w = (1+s)p + q$

Table 5.2: Haploid genetic model of the spread of a favored allele.
so that dividing the first equation in (5.1) by (5.2) yields
\[ p_{i+1} = \frac{g_A f_A n_A^{(i)}}{g_A f_A n_A^{(i)} + g_a f_a n_a^{(i)}} = \frac{g_A f_A p_i}{g_A f_A p_i + g_a f_a q_i} = \frac{\left( \frac{g_A f_A}{g_a f_a} \right) p_i}{\left( \frac{g_A f_A}{g_a f_a} \right) p_i + q_i}, \tag{5.3} \]
where the second equality comes from dividing the numerator and denominator by $n_A^{(i)} + n_a^{(i)}$, and the third equality from dividing the numerator and denominator by $g_a f_a$. Similarly,
\[ q_{i+1} = \frac{q_i}{\left( \frac{g_A f_A}{g_a f_a} \right) p_i + q_i}, \tag{5.4} \]
which could also be derived using $q_{i+1} = 1 - p_{i+1}$. We observe from the evolution equations for the allele frequencies, (5.3) and (5.4), that only the relative fitness $g_A f_A / g_a f_a$ of the alleles matters. Accordingly, in our models, we will consider only relative fitnesses, and we will arbitrarily set one fitness to unity to simplify the algebra and make the final result more transparent.
5.1.1 Spread of a favored allele
We consider a simple model for the spread of a favored allele in Table 5.2, with $s > 0$. Denoting by $p'$ the frequency of $A$ in the next generation (not (!) the derivative of $p$), the evolution equation is given by
\[ p' = \frac{(1+s)p}{w} = \frac{(1+s)p}{1 + sp}, \tag{5.5} \]
where we have used $(1+s)p + q = 1 + sp$, since $p + q = 1$. Note that (5.5) is the same as (5.3) with $p' = p_{i+1}$, $p = p_i$, and $g_A f_A / g_a f_a = 1 + s$. Fixed points of (5.5) are determined from $p' = p$. We find two fixed points: $p_* = 0$, corresponding to a population in which allele $A$ is absent; and $p_* = 1$, corresponding to a population in which allele $A$ is fixed. Intuitively, $p_* = 0$ is unstable while $p_* = 1$ is stable.
To illustrate how a stability analysis is performed analytically for a difference equation (instead of a differential equation), consider the general difference equation
\[ p' = f(p). \tag{5.6} \]
With $p = p_*$ a fixed point such that $p_* = f(p_*)$, we write $p = p_* + \epsilon$ so that (5.6) becomes
\[ p_* + \epsilon' = f(p_* + \epsilon) = f(p_*) + \epsilon f'(p_*) + \dots = p_* + \epsilon f'(p_*) + \dots, \]
where $f'(p_*)$ denotes the derivative of $f$ evaluated at $p_*$. Therefore, to leading order in $\epsilon$,
\[ |\epsilon'/\epsilon| = |f'(p_*)|, \]
and the fixed point is stable provided that $|f'(p_*)| < 1$. For our haploid model,
\[ f(p) = \frac{(1+s)p}{1+sp}, \qquad f'(p) = \frac{1+s}{(1+sp)^2}, \]
so that $f'(p_* = 0) = 1 + s > 1$ and $f'(p_* = 1) = 1/(1+s) < 1$, confirming that $p_* = 0$ is unstable and $p_* = 1$ is stable.
If the selection coefficient $s$ is small, the model equation (5.5) simplifies further. We have
\[ p' = \frac{(1+s)p}{1+sp} = (1+s)p\left( 1 - sp + O(s^2) \right) = p + (p - p^2)s + O(s^2), \]
so that to leading order in $s$,
\[ p' - p = sp(1-p). \]
If $p' - p \ll 1$, which is valid for $s \ll 1$, we can approximate this difference equation by the differential equation
\[ \frac{dp}{dn} = sp(1-p), \]
which shows that the frequency of allele $A$ satisfies the now very familiar logistic equation.
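The spread to fixation is easy to see by iterating the difference equation (5.5) directly. The sketch below does this in Python (the notes' own code is MATLAB; $s$ and $p$ are as in the text, and the starting frequency is an illustrative choice): a rare allele with a 10% selective advantage first grows nearly geometrically, then saturates at $p = 1$.

```python
def next_p(p, s):
    """One generation of selection, eq. (5.5): p' = (1+s)p / (1+sp)."""
    return (1.0 + s) * p / (1.0 + s * p)

p, s = 0.01, 0.1          # rare favored allele, 10% selective advantage
history = [p]
for _ in range(200):      # iterate 200 generations
    p = next_p(p, s)
    history.append(p)

# While p is small, p grows by nearly a factor (1+s) per generation;
# after many generations p approaches the stable fixed point p = 1.
print(history[1] / history[0])   # close to 1.1
print(history[-1])               # close to 1: allele A nearly fixed
```

In fact the odds ratio $p/(1-p)$ is multiplied by exactly $1+s$ each generation, which is the cleanest way to see the logistic character of the spread.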
Although a polymorphism for this gene exists in the population as the new allele spreads, eventually $A$ becomes fixed in the population and the polymorphism is lost. In the next section, we consider how a polymorphism can be maintained in a haploid population by a balance between mutation and selection.
5.1.2 Mutation-selection balance
We consider a gene with two alleles: a wildtype allele $A$ and a mutant allele $a$. We view the mutant allele as a defective genotype, which confers on the carrier a lowered fitness $1 - s$ relative to the wildtype. Although all mutant alleles may not have identical DNA sequences, we assume that they share in common the same phenotype of reduced fitness. We model the opposing effects of two evolutionary forces: natural selection, which favors the wildtype allele $A$ over the mutant allele $a$, and mutation, which confers a small probability $u$ that allele $A$ mutates to allele $a$ in each newborn individual. Schematically,
\[ A \; \underset{s}{\overset{u}{\rightleftharpoons}} \; a, \]
where $u$ represents mutation and $s$ represents selection. The model is shown in Table 5.3.

genotype | $A$ | $a$
frequency of gamete | $p$ | $q$
relative fitness | $1$ | $1-s$
frequency after selection | $p/w$ | $(1-s)q/w$
frequency after mutation | $(1-u)p/w$ | $[(1-s)q + up]/w$
normalization | $w = p + (1-s)q$

Table 5.3: A haploid genetic model of mutation-selection balance.

The equations for $p$ and $q$ in the next generation are
\[ p' = \frac{(1-u)p}{w} = \frac{(1-u)p}{1 - s(1-p)}, \tag{5.7} \]
and
\[ q' = \frac{(1-s)q + up}{w} = \frac{(1-s-u)q + u}{1 - sq}, \tag{5.8} \]
where we have used $p + q = 1$ to eliminate $q$ from the equation for $p'$ and $p$ from the equation for $q'$. The equations for $p'$ and $q'$ are linearly dependent since $p' + q' = 1$, and we need solve only one of them.
Considering (5.7), the fixed points determined from $p' = p$ are $p_* = 0$, for which the mutant allele $a$ is fixed in the population and there is no polymorphism, and the solution to
\[ 1 - s(1 - p_*) = 1 - u, \]
which is $p_* = 1 - u/s$, for which there is a polymorphism. The stability of these two fixed points is determined by considering $p' = f(p)$, with $f(p)$ given by the right-hand side of (5.7). Taking the derivative of $f$,
\[ f'(p) = \frac{(1-u)(1-s)}{[1 - s(1-p)]^2}, \]
so that
\[ f'(p_* = 0) = \frac{1-u}{1-s}, \qquad f'(p_* = 1 - u/s) = \frac{1-s}{1-u}. \]
Applying the criterion $|f'(p_*)| < 1$ for stability, $p_* = 0$ is stable for $s < u$ and $p_* = 1 - u/s$ is stable for $s > u$. A polymorphism is therefore possible under mutation-selection balance when $s > u > 0$.
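To get a feel for the magnitudes involved (the numbers here are illustrative, not taken from the notes): with mutation rate $u = 10^{-5}$ and selection coefficient $s = 10^{-2}$, the equilibrium frequency of the mutant allele is
\[ q_* = 1 - p_* = \frac{u}{s} = \frac{10^{-5}}{10^{-2}} = 10^{-3}, \]
so even a mildly deleterious allele is maintained at a low but nonzero frequency, because mutation continually resupplies what selection removes.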
genotype | $AA$ | $Aa$ | $aa$
referred to as | wildtype homozygote | heterozygote | mutant homozygote
frequency | $P$ | $Q$ | $R$

Table 5.4: The terminology of diploidy.
5.2 Diploid genetics
Most sexually reproducing species are diploid. In particular, our species Homo sapiens is diploid with two exceptions: we are haploid at the gamete stage (sperm and unfertilized egg), and males are haploid for most genes on the unmatched X and Y sex chromosomes (females are XX and diploid). This latter seemingly innocent fact is of great significance to males suffering from genetic diseases due to an X-linked recessive mutation inherited from their mother. Females inheriting this mutation are most probably disease-free because of the functional gene inherited from their father.
A polymorphic gene with alleles $A$ and $a$ can appear in a diploid gene as three distinct genotypes: $AA$, $Aa$ and $aa$. Conventionally, we denote $A$ to be the wildtype allele and $a$ the mutant allele. Table 5.4 presents the terminology of diploidy.
As for haploid genetics, we will determine evolution equations for allele and/or genotype frequencies. To develop the appropriate definitions and relations, we initially assume a population of size $N$ (which we will later take to be infinite), and assume that the numbers of individuals with genotypes $AA$, $Aa$ and $aa$ are $N_{AA}$, $N_{Aa}$ and $N_{aa}$. Now, $N = N_{AA} + N_{Aa} + N_{aa}$. Define genotype frequencies $P$, $Q$ and $R$ as
\[ P = \frac{N_{AA}}{N}, \qquad Q = \frac{N_{Aa}}{N}, \qquad R = \frac{N_{aa}}{N}, \]
so that $P + Q + R = 1$. It will also be useful to define allele frequencies. Let $n_A$ and $n_a$ be the number of alleles $A$ and $a$ in the population, with $n = n_A + n_a$ the total number of alleles. Since the population is of size $N$ and diploid, $n = 2N$; and since each homozygote contains two identical alleles, and each heterozygote contains one of each allele, $n_A = 2N_{AA} + N_{Aa}$ and $n_a = 2N_{aa} + N_{Aa}$. Defining the allele frequencies $p$ and $q$ as previously,
\[ p = n_A/n = \frac{2N_{AA} + N_{Aa}}{2N} = P + \frac{1}{2} Q; \]
and similarly,
\[ q = n_a/n = \frac{2N_{aa} + N_{Aa}}{2N} = R + \frac{1}{2} Q. \]
With five frequencies, $P$, $Q$, $R$, $p$, $q$, and four constraints $P + Q + R = 1$, $p + q = 1$, $p = P + Q/2$, $q = R + Q/2$, how many independent frequencies are there? In fact, there are two, because one of the four constraints is linearly dependent. We may choose any two frequencies other than the choice $\{p, q\}$ as our linearly independent set. For instance, one choice is $\{P, p\}$; then
\[ q = 1 - p, \qquad Q = 2(p - P), \qquad R = 1 + P - 2p. \]
Similarly, another choice is $\{Q, R\}$; then
\[ P = 1 - Q - R, \qquad q = R + \frac{1}{2} Q, \qquad p = 1 - R - \frac{1}{2} Q. \]
5.2.1 Sexual reproduction
Diploid reproduction may be sexual or asexual, and sexual reproduction may be of varying types (e.g., random mating, selfing, brother-sister mating, and various other types of assortative mating). The two simplest types to model exactly are random mating and selfing. These mating systems are useful for contrasting the biology of both outbreeding and inbreeding.
Random mating
The simplest type of sexual reproduction to model mathematically is random mating. Here, we assume a well-mixed population of individuals that have equal probability of mating with every other individual. We will determine the genotype frequencies of the zygotes (fertilized eggs) in terms of the allele frequencies using two approaches: (1) the gene pool approach, and (2) the mating table approach.
The gene pool approach models sexual reproduction by assuming that males and females release their gametes into pools. Offspring genotypes are determined by randomly combining one gamete from the male pool and one gamete from the female pool. As the probability of a random gamete containing allele A or a is equal to the allele's population frequency p or q, respectively, the probability of an offspring being AA is p², of being Aa is 2pq (male A and female a, or female A and male a), and of being aa is q². Therefore, after a single generation of random mating, the genotype frequencies can be given in terms of the allele frequencies by
P = p²,   Q = 2pq,   R = q².
This is the celebrated Hardy-Weinberg law. Notice that under the assumption of random mating, there is now only a single independent frequency, greatly simplifying the mathematical modeling. For example, if p is taken as the independent frequency, then
q = 1 − p,   P = p²,   Q = 2p(1 − p),   R = (1 − p)².
Most modeling is done assuming random mating, unless the biology under study is influenced by inbreeding.
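As a quick numerical sketch of the Hardy-Weinberg law (the starting allele frequency below is invented for illustration):

```python
# Hardy-Weinberg genotype frequencies after one generation of random
# mating; the starting allele frequency is invented for illustration.
p = 0.3
q = 1 - p
P, Q, R = p**2, 2 * p * q, q**2

assert abs(P + Q + R - 1) < 1e-12    # genotype frequencies sum to one
assert abs((P + Q / 2) - p) < 1e-12  # the allele frequency is unchanged
```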
The second approach uses a mating table (see Table 5.5). This approach to modeling sexual reproduction is more general and can be applied to other mating systems. We explain this approach by considering the mating AA × Aa. The genotypes AA and Aa have frequencies P and Q, respectively. The frequency of AA males mating with Aa females is PQ, and is the same as that of AA females mating with Aa males, so the sum is 2PQ. Half of the offspring will be AA and half Aa, and the frequencies PQ are entered under the progeny-frequency columns. The sums of all the progeny frequencies are given in the Totals row, and the random mating results are recovered upon use of the relationship between the genotype and allele frequencies.

68 CHAPTER 5. POPULATION GENETICS

                             progeny frequency
mating     frequency      AA              Aa                      aa
AA × AA    P²             P²              0                       0
AA × Aa    2PQ            PQ              PQ                      0
AA × aa    2PR            0               2PR                     0
Aa × Aa    Q²             Q²/4            Q²/2                    Q²/4
Aa × aa    2QR            0               QR                      QR
aa × aa    R²             0               0                       R²
Totals     (P+Q+R)² = 1   (P+Q/2)² = p²   2(P+Q/2)(R+Q/2) = 2pq   (R+Q/2)² = q²

Table 5.5: Random mating table.

                     progeny frequency
mating    frequency    AA         Aa     aa
AA →      P            P          0      0
Aa →      Q            Q/4        Q/2    Q/4
aa →      R            0          0      R
Totals    1            P + Q/4    Q/2    R + Q/4

Table 5.6: Selfing mating table.
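The Totals row of Table 5.5 can also be verified numerically. The following sketch sums the progeny columns for invented starting genotype frequencies (deliberately not in Hardy-Weinberg proportions) and recovers the Hardy-Weinberg frequencies after one generation:

```python
# Progeny columns of the random-mating table (Table 5.5).
# Starting genotype frequencies are invented for illustration.
P, Q, R = 0.2, 0.5, 0.3
rows = [  # (mating frequency, fraction AA, fraction Aa, fraction aa)
    (P * P,     1.0,  0.0, 0.0 ),  # AA x AA
    (2 * P * Q, 0.5,  0.5, 0.0 ),  # AA x Aa
    (2 * P * R, 0.0,  1.0, 0.0 ),  # AA x aa
    (Q * Q,     0.25, 0.5, 0.25),  # Aa x Aa
    (2 * Q * R, 0.0,  0.5, 0.5 ),  # Aa x aa
    (R * R,     0.0,  0.0, 1.0 ),  # aa x aa
]
P1 = sum(f * a for f, a, h, b in rows)   # next-generation freq of AA
Q1 = sum(f * h for f, a, h, b in rows)   # next-generation freq of Aa
R1 = sum(f * b for f, a, h, b in rows)   # next-generation freq of aa

p = P + Q / 2
assert abs(P1 - p**2) < 1e-12
assert abs(Q1 - 2 * p * (1 - p)) < 1e-12
assert abs(R1 - (1 - p)**2) < 1e-12
```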
Selfing
Perhaps the next simplest type of mating system is self-fertilization, or selfing. Here, an individual reproduces sexually (passing through a haploid gamete stage in its life-cycle), but provides both of the gametes. For example, the nematode worm C. elegans can reproduce by selfing. The mating table for selfing is given in Table 5.6. The selfing frequency of a particular genotype is just the frequency of the genotype itself. For a selfing population, disregarding selection or any other evolutionary forces, the genotype frequencies evolve as
P′ = P + Q/4,   Q′ = Q/2,   R′ = R + Q/4. (5.9)
Assuming an initially heterozygous population, we solve (5.9) with the initial conditions Q_0 = 1 and P_0 = R_0 = 0. In the worm lab, this type of initial population is commonly created by crossing wildtype homozygous C. elegans males with mutant homozygous C. elegans hermaphrodites, where the mutant allele is recessive. Wildtype hermaphrodite offspring, which are necessarily heterozygous, are then picked to separate worm plates and allowed to self-fertilize. (Do you see why the experiment is not done with wildtype hermaphrodites and
mutant males?) From the equation for Q′ in (5.9), we have Q_n = (1/2)^n, and from symmetry, P_n = R_n. Then, since P_n + Q_n + R_n = 1, we obtain the complete solution
P_n = (1/2)(1 − (1/2)^n),   Q_n = (1/2)^n,   R_n = (1/2)(1 − (1/2)^n).
The main result to be emphasized here is that the heterozygosity of the population decreases by a factor of two in each generation. Selfing populations rapidly become homozygous. For homework, I will ask you to determine the genotype frequencies for an initially random-mating population that transitions to selfing.
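A short sketch iterating the selfing recursion (5.9) from an initially heterozygous population confirms that the heterozygosity halves in each generation:

```python
# Iterate the selfing recursion (5.9) from an initially heterozygous
# population: P0 = R0 = 0, Q0 = 1.
P, Q, R = 0.0, 1.0, 0.0
for n in range(1, 11):
    P, Q, R = P + Q / 4, Q / 2, R + Q / 4
    assert abs(Q - 0.5**n) < 1e-12              # Qn = (1/2)^n
    assert abs(P - 0.5 * (1 - 0.5**n)) < 1e-12  # Pn = Rn = (1 - (1/2)^n)/2
```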
Constancy of allele frequencies
Mating by itself does not change allele frequencies in a population, but only reshuffles alleles into different genotypes. We can show this directly for random mating and selfing. For random mating,
p′ = P′ + Q′/2
   = p² + (1/2)(2pq)
   = p(p + q)
   = p;
and for selfing,
p′ = P′ + Q′/2
   = (P + Q/4) + (1/2)(Q/2)
   = P + Q/2
   = p.
The conservation of allele frequency by mating is an important element of neo-Darwinism. In Darwin's time, most biologists believed in blending inheritance, where the genetic material from parents with different traits actually blended in their offspring, rather like the mixing of paints of different colors. If blending inheritance occurred, then genetic variation, or polymorphism, would eventually be lost over several generations as the "genetic paints" became well-mixed. Mendel's work on peas, published in 1866, suggested a particulate theory of inheritance, where the genetic material, later called genes, maintains its integrity across generations. Sadly, Mendel's paper was not read by Darwin (who published The Origin of Species in 1859 and died in 1882) or other influential biologists during Mendel's lifetime (Mendel died in 1884). After being rediscovered in 1900, Mendel and his work eventually became widely celebrated.
5.2.2 Spread of a favored allele
We consider the spread of a favored allele in a diploid population. The classic example, widely repeated in biology textbooks as a modern example of natural selection, is the change in the frequencies of the dark and light phenotypes of the peppered moth during England's industrial revolution. The evolutionary story begins with the observation that pollution killed the light-colored lichen on trees during industrialization of the cities. Light-colored peppered moths camouflage well on light-colored lichens, but are exposed to birds on plain tree bark. On the other hand, dark-colored peppered moths camouflage well on plain tree bark, but are exposed on light-colored lichens (see Fig. 5.1). Natural selection therefore favored the light-colored allele in preindustrialized England and the dark-colored allele during industrialization. The dark-colored allele, presumably kept at low frequency by mutation-selection balance in pre-industrialized England, increased rapidly under natural selection in industrializing England.

Figure 5.1: Evolution of peppered moths in industrializing England.
We present our model in Table 5.7. Here, we consider aa as the wildtype genotype and normalize its fitness to unity. The allele A is the mutant whose frequency increases in the population. In our example of the peppered moth, the aa phenotype is light colored and the AA phenotype is dark colored. The color of the Aa phenotype depends on the relative dominance of A and a. Usually, no pigment results in light color and is a consequence of nonfunctioning pigment-producing genes. One functioning pigment-producing allele is usually sufficient to result in a dark-colored moth. With A a functioning pigment-producing allele and a the mutated nonfunctioning allele, a is most likely recessive, A is most likely dominant, and the phenotype of Aa is most likely dark, so h ≈ 1. For the moment, though, we leave h as a free parameter.
We assume random mating, and this simplification is used to write the genotype frequencies as P = p², Q = 2pq, and R = q². Since q = 1 − p, we reduce our problem to determining an equation for p′ in terms of p. Using p′ = P_s + Q_s/2, where p′ is the A allele frequency in the next generation's zygotes, and P_s and Q_s are the AA and Aa genotype frequencies, respectively, in the present generation after selection,
p′ = ((1 + s)p² + (1 + sh)pq)/w̄,
genotype               AA             Aa              aa
freq. of zygote        p²             2pq             q²
relative fitness       1 + s          1 + sh          1
freq. after selection  (1 + s)p²/w̄   2(1 + sh)pq/w̄  q²/w̄
normalization          w̄ = (1 + s)p² + 2(1 + sh)pq + q²

Table 5.7: A diploid genetic model of the spread of a favored allele assuming random mating.
where q = 1 − p, and

w̄ = (1 + s)p² + 2(1 + sh)pq + q²
   = 1 + s(p² + 2hpq).
After some algebra, the final evolution equation written solely in terms of p is
p′ = ((1 + sh)p + s(1 − h)p²) / (1 + 2shp + s(1 − 2h)p²). (5.10)
The expected fixed points of this equation are p* = 0 (unstable) and p* = 1 (stable), where our assignment of stability assumes positive selection coefficients.
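Both claims are easy to check by iterating (5.10) directly; in the following sketch the values of s and h are invented for illustration:

```python
# Iterate the exact evolution equation (5.10) for the frequency p of the
# favored allele A. Parameter values are invented for illustration.
s, h = 0.1, 0.5

def step(p):
    num = (1 + s * h) * p + s * (1 - h) * p**2
    den = 1 + 2 * s * h * p + s * (1 - 2 * h) * p**2
    return num / den

assert step(0.0) == 0.0               # p* = 0 is a fixed point
assert abs(step(1.0) - 1.0) < 1e-12   # p* = 1 is a fixed point

p = 0.01
for _ in range(2000):
    p = step(p)
assert p > 0.999                      # p escapes p* = 0 and approaches p* = 1
```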
The evolution equation (5.10) in this form is not particularly illuminating. In general, a numerical solution would require specifying numerical values for s and h, as well as an initial value for p. Here, to determine how the spread of A depends on the dominance coefficient h, we investigate analytically the increase of A assuming s ≪ 1. We Taylor-series expand the right-hand side of (5.10) in powers of s, keeping terms to order s:
p′ = ((1 + sh)p + s(1 − h)p²) / (1 + 2shp + s(1 − 2h)p²)
   = (p + s(hp + (1 − h)p²)) / (1 + s(2hp + (1 − 2h)p²))
   = (p + s(hp + (1 − h)p²)) (1 − s(2hp + (1 − 2h)p²) + O(s²))
   = p + sp(h + (1 − 3h)p − (1 − 2h)p²) + O(s²). (5.11)
If s ≪ 1, we expect a small change in allele frequency in each generation, so we can approximate p′ − p ≈ dp/dn, where n denotes the generation number, and p = p(n). The approximate differential equation obtained from (5.11) is
dp/dn = sp(h + (1 − 3h)p − (1 − 2h)p²). (5.12)
If A is partially dominant so that h > 0 (e.g., the heterozygous moth is darker than the homozygous mutant moth), then the solution to (5.12) behaves similarly to the solution of a logistic equation: p initially grows exponentially as p(n) = p_0 exp(shn), and asymptotes to one for large n. If A is recessive so that h = 0 (e.g., the heterozygous moth is as light-colored as the homozygous mutant moth), then (5.12) reduces to
dp/dn = sp²(1 − p), for h = 0. (5.13)
disease                 mutation                   symptoms
Thalassemia             haemoglobin                anemia
Sickle-cell anemia      haemoglobin                anemia
Haemophilia             blood clotting factor      uncontrolled bleeding
Cystic fibrosis         chloride ion channel       thick lung mucus
Tay-Sachs disease       Hexosaminidase A enzyme    nerve cell damage
Fragile X syndrome      FMR1 gene                  mental retardation
Huntington's disease    HD gene                    brain degeneration

Table 5.8: Seven common monogenic diseases.
Of main interest is the initial growth of p when p(0) = p_0 ≪ 1, so that dp/dn ≈ sp². This differential equation may be integrated by separating variables to yield
p(n) = p_0/(1 − sp_0 n)
     ≈ p_0(1 + sp_0 n).
The frequency of a recessive favored allele increases only linearly across generations, a consequence of the heterozygote being hidden from natural selection. Most likely, the peppered-moth heterozygote is significantly darker than the light-colored homozygote, since the dark-colored moth rapidly increased in frequency over a short period of time.
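The slow, initially linear growth can be seen in a numerical sketch, here a simple Euler integration of (5.13) with invented parameter values:

```python
# Euler integration of dp/dn = s p^2 (1 - p) for a recessive favored allele
# (h = 0), compared with the early-time approximation p(n) ~ p0 (1 + s p0 n).
# Parameter values are invented for illustration.
s, p0 = 0.05, 0.01
p, n_gen = p0, 100
for _ in range(n_gen):
    p += s * p**2 * (1 - p)            # one generation per step

approx = p0 * (1 + s * p0 * n_gen)     # linear growth approximation
assert abs(p - approx) / approx < 0.01 # accurate while s*p0*n is small
```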
As a final comment, linear growth in the frequency of A when h = 0 is sensitive to our assumption of random mating. If selfing occurred, or another type of close family mating, then a recessive favored allele may still increase exponentially. In this circumstance, the production of homozygous offspring from more frequent heterozygote pairings allows selection to act more effectively.
5.2.3 Mutation-selection balance
By virtue of self-knowledge, the species with the most known mutant phenotypes is Homo sapiens. There are thousands of known genetic diseases in humans, many of them caused by mutation of a single gene (called a monogenic disease). For an easy-to-read overview of genetic disease in humans, see the website
http://www.who.int/genomics/public/geneticdiseases.
Table 5.8 lists seven common monogenic diseases. The first two diseases are maintained at significant frequencies in some human populations by heterosis. We will discuss in §5.2.4 the maintenance of a polymorphism by heterosis, for which the heterozygote has higher fitness than either homozygote. It is postulated that Tay-Sachs disease, prevalent among ancestors of Eastern European Jews, and cystic fibrosis may also have been maintained by heterosis acting in the past. (Note that the cystic fibrosis gene was identified in 1989 by a Toronto group led by Lap-Chee Tsui, who later became President of the University of Hong Kong.) The other disease genes listed may be maintained by mutation-selection balance.
Our model for diploid mutation-selection balance is given in Table 5.9. We further assume that mutations of type A → a occur in gamete production with
genotype               AA        Aa               aa
freq. of zygote        p²        2pq              q²
relative fitness       1         1 − sh           1 − s
freq. after selection  p²/w̄     2(1 − sh)pq/w̄   (1 − s)q²/w̄
normalization          w̄ = p² + 2(1 − sh)pq + (1 − s)q²

Table 5.9: A diploid genetic model of mutation-selection balance assuming random mating.
frequency u. Back-mutation is neglected. The gametic frequencies of A and a after selection but before mutation are given by p_s = P_s + Q_s/2 and q_s = R_s + Q_s/2 (with P_s, Q_s and R_s the post-selection genotype frequencies of Table 5.9), and the gametic frequency of a after mutation is given by q′ = up_s + q_s. Therefore,
q′ = (u(p² + (1 − sh)pq) + ((1 − s)q² + (1 − sh)pq))/w̄,
where
w̄ = p² + 2(1 − sh)pq + (1 − s)q²
   = 1 − sq(2hp + q).
Using p = 1 − q, we write the evolution equation for q′ in terms of q alone. After some algebra that could be facilitated using computer algebra software such as Mathematica, we obtain
q′ = (u + (1 − u − sh(1 + u))q − s(1 − h(1 + u))q²) / (1 − 2shq − s(1 − 2h)q²). (5.14)
To determine the equilibrium solutions of (5.14), we set q* ≡ q′ = q to obtain a cubic equation for q*. Because of the neglect of back-mutation in our model, one solution readily found is q* = 1, in which all the A alleles have mutated to a. The q* = 1 solution may be factored out of the cubic equation, resulting in a quadratic equation with two solutions. Rather than show the exact result here, we determine equilibrium solutions under two approximations: (i) 0 < u ≪ h, s; and (ii) 0 = h < u < s.
First, when 0 < u ≪ h, s, we look for a solution of the form q* = cu + O(u²), with c constant, and Taylor-series expand in u (assuming s, h = O(1)). If such a solution exists, then (5.14) will determine the unknown coefficient c. We have
cu + O(u²) = (u + (1 − sh)cu + O(u²)) / (1 − 2shcu + O(u²))
           = (1 + c − shc)u + O(u²);
and equating powers of u, we find c = 1 + c − shc, or c = 1/(sh). Therefore,
q* = u/(sh) + O(u²), for 0 < u ≪ h, s.
Second, when 0 = h < u < s, we substitute h = 0 directly into (5.14),
q* = (u + (1 − u)q* − sq*²) / (1 − sq*²),
genotype                AA            Aa                 aa
freq: 0 < u ≪ s, h      1 + O(u)      2u/(sh) + O(u²)    u²/(sh)² + O(u³)
freq: 0 = h < u < s     1 + O(√u)     2√(u/s) + O(u)     u/s

Table 5.10: Equilibrium frequencies of the genotypes at the diploid mutation-selection balance.
which we then write as a cubic equation for q*,

q*³ − q*² − (u/s)q* + u/s = 0.
By factoring this cubic equation, we find
(q* − 1)(q*² − u/s) = 0;
and the polymorphic equilibrium solution is
q* = √(u/s), for 0 = h < u < s.
Because q* < 1 only if s > u, this solution does not exist if s < u.

Table 5.10 summarizes our results for the equilibrium frequencies of the genotypes at mutation-selection balance. The first row of frequencies, 0 < u ≪ s, h, corresponds to a dominant (h = 1) or partially dominant (u ≪ h < 1) mutation, where the heterozygote is of reduced fitness and shows symptoms of the genetic disease. The second row of frequencies, 0 = h < u < s, corresponds to a recessive mutation, where the heterozygote is symptom-free. Notice that individuals carrying a dominant mutation are twice as prevalent in the population as individuals homozygous for a recessive mutation (with the same u and s).
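The equilibrium approximations of Table 5.10 can be checked by iterating (5.14) to convergence; in the sketch below the values of u and s are invented for illustration:

```python
# Iterate the evolution equation (5.14) to equilibrium and compare with the
# approximate solutions q* ~ u/(sh) and q* = sqrt(u/s).
# Parameter values are invented for illustration.
import math

def step(q, u, s, h):
    num = u + (1 - u - s * h * (1 + u)) * q - s * (1 - h * (1 + u)) * q**2
    den = 1 - 2 * s * h * q - s * (1 - 2 * h) * q**2
    return num / den

def equilibrium(u, s, h, q=0.0, nsteps=50000):
    for _ in range(nsteps):
        q = step(q, u, s, h)
    return q

u, s = 1e-5, 0.1
assert abs(equilibrium(u, s, 0.5) - u / (s * 0.5)) < 1e-5     # q* ~ u/(sh)
assert abs(equilibrium(u, s, 0.0) - math.sqrt(u / s)) < 1e-4  # q* = sqrt(u/s)
```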
A heterozygote carrying a dominant mutation most commonly arises either de novo (by direct mutation of allele A) or by the mating of a heterozygote with a wildtype. The latter is more common for s ≪ 1, while the former must occur for s = 1 (a heterozygote with an s = h = 1 mutation by definition does not reproduce). One of the most common autosomal dominant genetic diseases is Huntington's disease, resulting in brain deterioration during middle age. Because individuals with Huntington's disease have children before disease symptoms appear, s is small and the disease is usually passed to offspring by the mating of a heterozygote with a wildtype homozygote. For a recessive mutation, a mutant homozygote usually occurs by the mating of two heterozygotes. If both parents carry a single recessive disease allele, then their child has a 1/4 chance of getting the disease.
5.2.4 Heterosis
genotype               AA            Aa      aa
freq. of zygote        p²            2pq     q²
relative fitness       1 − s         1       1 − t
freq. after selection  (1 − s)p²/w̄  2pq/w̄  (1 − t)q²/w̄
normalization          w̄ = (1 − s)p² + 2pq + (1 − t)q²

Table 5.11: A diploid genetic model of heterosis assuming random mating.

Heterosis, also called overdominance or heterozygote advantage, occurs when the heterozygote has higher fitness than either homozygote. The best-known examples are sickle-cell anemia and thalassemia, diseases that both affect hemoglobin, the oxygen-carrying protein of red blood cells. The sickle-cell mutations are most common in people of West African descent, while the thalassemia mutations are most common in people from the Mediterranean and Asia. In Hong Kong, the television stations occasionally play public service announcements concerning thalassemia. The heterozygote carrier of the sickle-cell or thalassemia gene is healthy and resistant to malaria; the wildtype homozygote is healthy, but susceptible to malaria; the mutant homozygote is sick with anemia. In class, we will watch the short video, A Mutation Story, about the sickle-cell gene, which you can also view at your convenience:
http://www.pbs.org/wgbh/evolution/library/01/2/l_012_02.html.
Table 5.11 presents our model of heterosis. Both homozygotes are of lower fitness than the heterozygote, whose relative fitness we arbitrarily set to unity. Writing the equation for p′, we have
p′ = ((1 − s)p² + pq) / (1 − sp² − tq²)
   = (p − sp²) / (1 − t + 2tp − (s + t)p²).
At equilibrium, p* ≡ p′ = p, and we obtain a cubic equation for p*:
(s + t)p*³ − (s + 2t)p*² + tp* = 0. (5.15)
Evidently, p* = 0 and p* = 1 are fixed points, and (5.15) can be factored as

p(1 − p)(t − (s + t)p) = 0.
The polymorphic solution is therefore

p* = t/(s + t),   q* = s/(s + t),
valid when s, t > 0. Since the value of q* can be large, recessive mutations that cause disease, yet are highly prevalent in a population, are suspected to provide some benefit to the heterozygote. However, only a few genes are unequivocally known to exhibit heterosis.
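Iterating the evolution equation confirms that the polymorphic equilibrium attracts the population from either side (selection coefficients invented for illustration):

```python
# Iterate the heterosis map p' = (p - s p^2)/(1 - t + 2tp - (s + t)p^2)
# toward the polymorphic equilibrium p* = t/(s + t).
# The selection coefficients are invented for illustration.
s, t = 0.2, 0.1

def step(p):
    return (p - s * p**2) / (1 - t + 2 * t * p - (s + t) * p**2)

for p0 in (0.05, 0.95):       # start near each unstable fixed point
    p = p0
    for _ in range(5000):
        p = step(p)
    assert abs(p - t / (s + t)) < 1e-9
```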
5.3 Frequency-dependent selection
A polymorphism may also result from frequency-dependent selection. A well-known model of frequency-dependent selection is the Hawk-Dove game. Most commonly, frequency-dependent selection is studied using game theory, and following John Maynard Smith, one looks for an evolutionarily stable strategy (ESS). Here, we study frequency-dependent selection using a population-genetics model.
player \ opponent    H            D
H                    E_HH = −2    E_HD = 2
D                    E_DH = 0     E_DD = 1

Table 5.12: General payoff matrix for the Hawk-Dove game, and the usually assumed values. The payoffs are paid to the player (first column) when playing against the opponent (first row).
genotype               H                         D
freq. of zygote        p                         q
relative fitness       K + pE_HH + qE_HD         K + pE_DH + qE_DD
freq. after selection  (K + pE_HH + qE_HD)p/w̄   (K + pE_DH + qE_DD)q/w̄
normalization          w̄ = (K + pE_HH + qE_HD)p + (K + pE_DH + qE_DD)q

Table 5.13: Haploid genetic model for frequency-dependent selection in the Hawk-Dove game.
We consider two phenotypes, Hawk and Dove, with no mating between different phenotypes (for example, the different phenotypes may correspond to different species, such as hawks and doves). We describe the Hawk-Dove game as follows: (i) when Hawk meets Dove, Hawk gets the resource and Dove retreats before injury; (ii) when two Hawks meet, they engage in an escalating fight, seriously risking injury; and (iii) when two Doves meet, they share the resource. The Hawk-Dove game is modeled by a payoff matrix, as shown in Table 5.12. The player in the first column receives the payoff when playing the opponent in the first row. For instance, Hawk playing Dove gets the payoff E_HD. The numerical values are commonly chosen such that E_HH < E_DH < E_DD < E_HD; that is, Hawk playing Dove does better than Dove playing Dove, which does better than Dove playing Hawk, which does better than Hawk playing Hawk.
Frequency-dependent selection occurs because the fitness of Hawk or Dove depends on the frequencies of Hawks and Doves in the population. For example, a Hawk in a population of Doves does well, but a Hawk in a population of Hawks does poorly. We model the Hawk-Dove game using a haploid genetic model, with p and q the population frequencies of Hawks and Doves, respectively. We assume that the frequency with which a player phenotype plays an opponent phenotype is proportional to the population frequency of the opponent phenotype. For example, if the population is composed of 1/4 Hawks and 3/4 Doves, then a Hawk or Dove plays Hawk 1/4 of the time and Dove 3/4 of the time. Our haploid genetic model assuming frequency-dependent selection is formulated in Table 5.13. The fitness parameter K is arbitrary, but assumed to be large and positive so that the fitness of Hawk or Dove is always positive. A negative fitness in our haploid model has no meaning (see §5.1).
From Table 5.13, the equation for p′ is

p′ = (K + pE_HH + qE_HD)p/w̄, (5.16)

with

w̄ = (K + pE_HH + qE_HD)p + (K + pE_DH + qE_DD)q. (5.17)
Both p* = 0 and p* = 1 are fixed points; a stable polymorphism exists when both of these fixed points are unstable.
First, consider the fixed point p* = 0. With perturbation p ≪ 1 and q = 1 − p, to leading order in p,
p′ = ((K + E_HD)/(K + E_DD))p.
Therefore, p* = 0 is an unstable fixed point when E_HD > E_DD: a population of Doves is unstable if Hawk playing Dove does better than Dove playing Dove. By symmetry, p* = 1 is an unstable fixed point when E_DH > E_HH: a population of Hawks is unstable if Dove playing Hawk does better than Hawk playing Hawk. Both homogeneous populations, of either all Hawks or all Doves, are unstable for the numerical values shown in Table 5.12.
The polymorphic solution may be determined from (5.16) and (5.17). Assuming p* ≡ p′ = p, q* = 1 − p*, and p*, q* ≠ 0, we have
(K + p*E_HH + q*E_HD)p* + (K + p*E_DH + q*E_DD)q* = K + p*E_HH + q*E_HD;
and solving for p*, we have

p* = (E_HD − E_DD) / ((E_HD − E_DD) + (E_DH − E_HH)).
Since we have assumed E_HD > E_DD and E_DH > E_HH, the polymorphic solution satisfies 0 < p* < 1. With the numerical values in Table 5.12,
p* = (2 − 1) / ((2 − 1) + (0 + 2)) = 1/3;
the stable polymorphic population, maintained by frequency-dependent selection, consists of 1/3 Hawks and 2/3 Doves.
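This result is easy to confirm by iterating (5.16) and (5.17) with the payoffs of Table 5.12 (the background fitness K below is invented for illustration):

```python
# Iterate the Hawk-Dove model (5.16)-(5.17) with the payoffs of Table 5.12.
# The background fitness K is invented for illustration (large and positive).
E_HH, E_HD, E_DH, E_DD = -2.0, 2.0, 0.0, 1.0
K = 10.0

def step(p):
    q = 1 - p
    wH = K + p * E_HH + q * E_HD    # fitness of Hawk
    wD = K + p * E_DH + q * E_DD    # fitness of Dove
    return wH * p / (wH * p + wD * q)

for p0 in (0.01, 0.99):
    p = p0
    for _ in range(2000):
        p = step(p)
    assert abs(p - 1 / 3) < 1e-9    # stable polymorphism: 1/3 Hawks
```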
5.4 Recombination and the approach to linkage equilibrium
When considering a polymorphism at a single genetic locus, we assumed two distinct alleles, A and a. The diploid then occurs as one of three types: AA, Aa and aa. We now consider a polymorphism at two genetic loci, each with two distinct alleles. If the alleles at the first genetic locus are A and a, and those at the second are B and b, then four distinct haploid gametes are possible, namely AB, Ab, aB and ab. Ten distinct diplotypes are possible, obtained by forming pairs of all possible haplotypes. We can write these ten diplotypes as AB/AB, AB/Ab, AB/aB, AB/ab, Ab/Ab, Ab/aB, Ab/ab, aB/aB, aB/ab, and ab/ab, where the numerator represents the haplotype from one parent, and the denominator represents the haplotype from the other parent. We do not distinguish here which haplotype came from which parent.
To proceed further, we define the allelic and gametic frequencies for our two-loci problem in Table 5.14. If the probability that a gamete contains allele A or a does not depend on whether the gamete contains allele B or b, then the two loci are said to be independent. Under the assumption of independence, the
gametic frequencies are the products of the allelic frequencies, i.e., p_AB = p_A p_B, p_Ab = p_A p_b, etc.
Often, the two loci are not independent. This can be due to epistatic selection, or epistasis. As an example, suppose that two loci in humans influence height, and that the most fit genotype is the one resulting in an average height. Selection that favors the average population value of a trait is called normalizing, or stabilizing. Suppose that A and B are hypothetical tall alleles, a and b are short alleles, and a person with two tall and two short alleles obtains average height. Then selection may favor the specific genotypes AB/ab, Ab/Ab, Ab/aB, and aB/aB. Selection may act against both the genotypes yielding above-average heights, AB/AB, AB/Ab, and AB/aB, and those yielding below-average heights, Ab/ab, aB/ab and ab/ab. Epistatic selection occurs because the fitness of the A, a locus depends on which alleles are present at the B, b locus. Here, A has higher fitness when paired with b than when paired with B.
The two loci may also not be independent because of a finite population size (i.e., stochastic effects). For instance, suppose a mutation a → A occurs only once in a finite population (in an infinite population, any possible mutation occurs an infinite number of times), and that A is strongly favored by natural selection. The frequency of A may then increase. If a nearby polymorphic locus on the same chromosome as A happens to be B (say, with a polymorphism b in the population), then AB gametes may substantially increase in frequency, with Ab absent. We say that the allele B hitchhikes with the favored allele A.
When the two loci are not independent, we say that the loci are in gametic phase disequilibrium, or more commonly linkage disequilibrium, sometimes abbreviated as LD. When the loci are independent, we say they are in linkage equilibrium. Here, we will model how two loci, initially in linkage disequilibrium, approach linkage equilibrium through the process of recombination.
To begin, we need a rudimentary understanding of meiosis. During meiosis, a diploid cell's DNA, arranged in very long molecules called chromosomes, is replicated once and separated twice, producing four haploid cells, each containing half of the original cell's chromosomes. Sexual reproduction results in syngamy, the fusing of a haploid egg and sperm cell to form a diploid zygote cell.
Fig. 5.2 presents a schematic of meiosis and the process of crossing-over resulting in recombination. In a diploid, each chromosome has a corresponding sister chromosome, one chromosome originating from the egg, one from the sperm. These sibling chromosomes have the same genes, but possibly different alleles. In Fig. 5.2, we schematically show the alleles a, b, c on the light chromosome, and the alleles A, B, C on its sister's dark chromosome. In the first step of meiosis, each chromosome replicates itself exactly. In the second step, sister chromosomes exchange genetic material by the process of crossing-over. All four chromosomes then separate into haploid cells. Notice from the schematic that the process of crossing-over can result in genetic recombination. Suppose
allele or gamete genotype   A     a     B     b     AB     Ab     aB     ab
frequency                   p_A   p_a   p_B   p_b   p_AB   p_Ab   p_aB   p_ab

Table 5.14: Definitions of allelic and gametic frequencies for two genetic loci, each with two alleles.
Figure 5.2: A schematic of crossing-over and recombination during meiosis (figure from Access Excellence @ the National Health Museum)
that the schematic of Fig. 5.2 represents the production of sperm by a male. If the chromosome from the male's father contains the alleles ABC and that from the male's mother abc, recombination can result in the sperm containing a chromosome with alleles ABc (the third gamete in Fig. 5.2). We say this chromosome is a recombinant; it contains alleles from both its paternal grandfather and paternal grandmother. It is likely that the precise combination of alleles on this recombinant chromosome has never existed before in a single person. Recombination is the reason why everybody, with the exception of identical twins, is genetically unique.
Genes that occur on the same chromosome are said to be linked. The closer the genes are to each other on the chromosome, the tighter the linkage, and the less likely recombination will separate them. Tightly linked genes are likely to be inherited from the same grandparent. Genes on different chromosomes are by definition unlinked; independent assortment of chromosomes results in a 50% chance of a gamete receiving either grandparent's genes.
To define and model the evolution of linkage disequilibrium, we first obtain the allele frequencies from the gametic frequencies by
p_A = p_AB + p_Ab,   p_a = p_aB + p_ab,
p_B = p_AB + p_aB,   p_b = p_Ab + p_ab. (5.18)
Since the frequencies sum to unity,
p_A + p_a = 1,   p_B + p_b = 1,   p_AB + p_Ab + p_aB + p_ab = 1. (5.19)
There are three independent gametic frequencies and only two independent allelic frequencies, so in general it is not possible to obtain the gametic frequencies from the allelic frequencies without assuming an additional constraint such as
linkage equilibrium. We can, however, introduce an additional variable D, called the coefficient of linkage disequilibrium, and define D to be the difference between the gametic frequency p_AB and what this gametic frequency would be if the loci were in linkage equilibrium:
p_AB = p_A p_B + D. (5.20a)

Using p_AB + p_Ab = p_A,

p_Ab = p_A p_b − D; (5.20b)

using p_AB + p_aB = p_B,

p_aB = p_a p_B − D; (5.20c)

and using p_aB + p_ab = p_a,

p_ab = p_a p_b + D. (5.20d)
With our definition, positive linkage disequilibrium (D > 0) implies an excess of AB and ab gametes and a deficit of Ab and aB gametes; negative linkage disequilibrium (D < 0) implies the opposite. An equality obtainable from (5.20) that we will later find useful is
p_AB p_ab − p_Ab p_aB = (p_A p_B + D)(p_a p_b + D) − (p_A p_b − D)(p_a p_B − D)
                      = D(p_A p_B + p_a p_b + p_A p_b + p_a p_B)
                      = D. (5.21)
Without selection and mutation, D evolves only because of recombination. With primes representing the values in the next generation, and using p′_A = p_A and p′_B = p_B because sexual reproduction by itself does not change allele frequencies,
D′ = p′_AB − p′_A p′_B
   = p′_AB − p_A p_B
   = p′_AB − (p_AB − D)
   = D + (p′_AB − p_AB),
where we have used (5.20) to obtain the third equality. The change in D is therefore equal to the change in frequency of the AB gametes,
D′ − D = p′_AB − p_AB. (5.22)
To understand why gametic frequencies change across generations, we should first recognize when they do not change. Without genetic recombination, chromosomes maintain their exact identity across generations. Chromosome frequencies without recombination are therefore constant, and for genetic loci with, say, alleles A, a and B, b on the same chromosome, p′_AB = p_AB. In an infinite population without selection or mutation, gametic frequencies change only for genetic loci in linkage disequilibrium on different chromosomes, or for genetic loci in linkage disequilibrium on the same chromosome subjected to genetic recombination.
We will compute the frequency p′_AB of AB gametes in the next generation, given the frequency p_AB of AB gametes in the present generation, using two different methods. The first method uses a "mating" table (see Table 5.15).
                           gamete freq / diploid freq
diploid    dip. freq       AB         Ab         aB         ab
AB/AB      p_AB²           1          0          0          0
AB/Ab      2 p_AB p_Ab     1/2        1/2        0          0
AB/aB      2 p_AB p_aB     1/2        0          1/2        0
AB/ab      2 p_AB p_ab     (1−r)/2    r/2        r/2        (1−r)/2
Ab/Ab      p_Ab²           0          1          0          0
Ab/aB      2 p_Ab p_aB     r/2        (1−r)/2    (1−r)/2    r/2
Ab/ab      2 p_Ab p_ab     0          1/2        0          1/2
aB/aB      p_aB²           0          0          1          0
aB/ab      2 p_aB p_ab     0          0          1/2        1/2
ab/ab      p_ab²           0          0          0          1

Table 5.15: Computation of gamete frequencies.
The first column is the parent diploid genotype before meiosis. The second column is the diploid genotype frequency assuming random mating. The next four columns are the haploid genotype frequencies (normalized by the corresponding diploid frequencies to simplify the table presentation). Here, we define r to be the frequency at which a gamete arises from a combination of grandmother and grandfather genes. If the A, a and B, b loci occur on the same chromosome, then r is the recombination frequency due to crossing-over. If the A, a and B, b loci occur on different chromosomes, then because of the independent assortment of chromosomes there is an equal probability that a gamete contains all grandfather or all grandmother genes, or contains a combination of grandmother and grandfather genes, so that r = 1/2. Notice that crossing-over or independent assortment is of importance only if the grandfather's and grandmother's contributions to the diploid genotype share no common alleles (i.e., the AB/ab and Ab/aB genotypes). The frequency p′_AB in the next generation is given by the sum of the AB column (after multiplication by the diploid frequencies). Therefore,
    p'_AB = p_AB^2 + p_AB p_Ab + p_AB p_aB + (1 - r) p_AB p_ab + r p_Ab p_aB
          = p_AB (p_AB + p_Ab + p_aB + p_ab) + r (p_Ab p_aB - p_AB p_ab)
          = p_AB - r D,                                                  (5.23)
where the final equality makes use of (5.19) and (5.21).

The second method for computing p'_AB is more direct. An AB haplotype can arise from a diploid of general type AB/-- without recombination, or from a diploid of type A-/-B with recombination. Therefore,
    p'_AB = (1 - r) p_AB + r p_A p_B,
where the first term is from non-recombinants and the second term from recombinants. With p_A p_B = p_AB - D, we have
    p'_AB = (1 - r) p_AB + r (p_AB - D)
          = p_AB - r D,
the same result as (5.23).
Using (5.22) and (5.23), we derive
    D' = (1 - r) D,

with the solution

    D_n = D_0 (1 - r)^n.
Recombination decreases linkage disequilibrium in each generation by a factor of (1 - r). Tightly linked genes on the same chromosome have small values of r; unlinked genes on different chromosomes have r = 1/2. For unlinked genes, linkage disequilibrium decreases by a factor of two in each generation. We conclude that very strong selection is required to maintain linkage disequilibrium for genes on different chromosomes, while weak selection can maintain linkage disequilibrium for tightly linked genes.
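The geometric decay D_n = D_0 (1 - r)^n can be checked with a short numerical sketch; the initial disequilibrium D_0 and the recombination frequencies below are illustrative values, not data:

```python
# Decay of linkage disequilibrium: D_n = D_0 (1 - r)^n.
# The values of D0 and r below are illustrative assumptions.

def disequilibrium(D0, r, n):
    """Linkage disequilibrium after n generations with recombination frequency r."""
    return D0 * (1 - r) ** n

D0 = 0.25  # assumed initial disequilibrium

# Unlinked loci on different chromosomes: r = 1/2, so D halves each generation.
unlinked = [disequilibrium(D0, 0.5, n) for n in range(4)]

# Tightly linked loci on the same chromosome (assumed r = 0.01).
linked = disequilibrium(D0, 0.01, 50)

print(unlinked)  # [0.25, 0.125, 0.0625, 0.03125]
print(linked)    # roughly 60% of D0 remains after 50 generations
```

With r = 1/2 half of the disequilibrium is lost every generation, while the tightly linked pair retains most of it even after fifty generations, consistent with the conclusion above.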
5.5 Random genetic drift
Up to now, our simplified genetic models have all assumed an infinite population, neglecting stochastic effects. Here, we consider a finite population: the resulting stochastic effects on the allelic frequencies are called random genetic drift. The simplest genetic model incorporating random genetic drift assumes a fixed-size population of N individuals, and models the evolution of a diallelic haploid genetic locus.
There are two widely used genetic models for finite populations: the Wright-Fisher model and the Moran model. The Wright-Fisher model is most similar to our infinite-population discrete-generation model. In the Wright-Fisher model, N adult individuals release a very large number of gametes into a gene pool, and the next generation is formed from N random gametes independently chosen from the gene pool. The Moran model takes a different approach. In the Moran model, a single evolution step consists of one random individual in the population reproducing, and another random individual dying, with the population always maintained at the constant size N. Because two random events occur at each evolution step in the Moran model, and N random events occur every generation in the Wright-Fisher model, a total of N/2 evolution steps in the Moran model is comparable, but not exactly identical, to a single discrete generation in the Wright-Fisher model. It has been shown, however, that the two models become identical in the limit of large N. For our purposes, the Moran model is mathematically more tractable and we adopt it here.
We develop our model analogously to the stochastic population growth model derived in §3.1. We let n denote the number of A-alleles in the population, and N - n the number of a-alleles. With n = 0, 1, 2, ..., N a discrete random variable, the probability mass function p_n(i) denotes the probability of having n A-alleles at evolution step i. With n A-alleles in a population of size N, the probability of an individual with A either reproducing or dying is given by s_n = n/N; the corresponding probability for a is 1 - s_n. There are three ways to obtain a population of n A-alleles at evolution step i + 1. First, there were n A-alleles at evolution step i, and the individual reproducing carried the same allele as the individual dying. Second, there were n - 1 A-alleles at evolution step i, and the individual reproducing has A and the individual dying has a. And third, there were n + 1 A-alleles at evolution step i, and the individual
reproducing has a and the individual dying has A. Multiplying the probabilities and summing the three cases results in

    p_n(i+1) = (s_n^2 + (1 - s_n)^2) p_n(i) + s_{n-1} (1 - s_{n-1}) p_{n-1}(i)
                 + s_{n+1} (1 - s_{n+1}) p_{n+1}(i).                     (5.24)
Note that this equation is valid for 0 < n < N, and that the equations at the boundaries, representing the probabilities that one of the alleles is fixed, are
    p_0(i+1) = p_0(i) + s_1 (1 - s_1) p_1(i),                            (5.25a)
    p_N(i+1) = p_N(i) + s_{N-1} (1 - s_{N-1}) p_{N-1}(i).                (5.25b)
The boundaries are called absorbing, and the probability of fixation of an allele monotonically increases with each birth and death. Once the probability of fixation of an allele is unity, there are no further changes in allele frequencies.
We illustrate the solution of (5.24)-(5.25) in Fig. 5.3 for a small population of size N = 20, where the number of A-alleles is precisely known in the founding generation, with either (a) p_10(0) = 1, or (b) p_13(0) = 1. We plot the probability mass function of the number of A-alleles every N evolution steps, up to 7N steps, corresponding to approximately fourteen discrete generations of evolution in the Wright-Fisher model. Notice how the probability distribution diffuses away from its initial value, and how the probabilities eventually concentrate on the boundaries, with both p_0 and p_20 monotonically increasing. In fact, after a large number of generations, p_0 approaches the initial frequency of the a-allele and p_20 approaches the initial frequency of the A-allele (not shown in the figures).
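The numerical solution plotted in Fig. 5.3 can be reproduced by iterating (5.24) and (5.25) directly; a minimal sketch for case (a), with N = 20 and p_10(0) = 1:

```python
# Iterate the Moran-model equations (5.24)-(5.25) for the probability
# mass function p_n(i), with N = 20 and p_10(0) = 1 as in Fig. 5.3(a).

N = 20
p = [0.0] * (N + 1)
p[10] = 1.0  # the number of A-alleles is known exactly at step i = 0

def step(p):
    """Advance the pmf by one evolution step (one birth and one death)."""
    s = [n / N for n in range(N + 1)]
    q = [0.0] * (N + 1)
    for n in range(1, N):  # interior equation (5.24)
        q[n] = ((s[n]**2 + (1 - s[n])**2) * p[n]
                + s[n-1] * (1 - s[n-1]) * p[n-1]
                + s[n+1] * (1 - s[n+1]) * p[n+1])
    # absorbing boundaries, equations (5.25a)-(5.25b)
    q[0] = p[0] + s[1] * (1 - s[1]) * p[1]
    q[N] = p[N] + s[N-1] * (1 - s[N-1]) * p[N-1]
    return q

for i in range(7 * N):  # 7N evolution steps, as plotted in Fig. 5.3
    p = step(p)

total = sum(p)                                    # conserved: equals 1
mean = sum(n * pn for n, pn in enumerate(p)) / N  # conserved mean frequency
print(total, mean, p[0], p[N])
```

The total probability stays at unity, the mean frequency of the A-allele stays at 1/2, and probability accumulates at the absorbing boundaries n = 0 and n = 20, exactly as in the figure.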
To better understand this numerical solution, we consider the limit of large (but not infinite) populations by expanding (5.24) in powers of 1/N. We first rewrite (5.24) as

    p_n(i+1) - p_n(i) = s_{n+1} (1 - s_{n+1}) p_{n+1}(i) - 2 s_n (1 - s_n) p_n(i)
                          + s_{n-1} (1 - s_{n-1}) p_{n-1}(i).            (5.26)
We then introduce the continuous random variable x = n/N, with 0 ≤ x ≤ 1, and the continuous time t = i/(N/2). The variable x corresponds to the frequency of the A-allele in the population, and a unit of time corresponds to approximately a single discrete generation in the Wright-Fisher model. The probability density function is defined by

    P(x, t) = N p_n(i),    with x = n/N, t = 2i/N.
Furthermore, we define

    b(x) = s_n,

and note that b(x) = x. Similarly, we have s_{n+1} = b(x + Δx) and s_{n-1} = b(x - Δx), where Δx = 1/N. Then, with Δt = 2/N, (5.26) transforms into
    P(x, t+Δt) - P(x, t) = b(x+Δx) (1 - b(x+Δx)) P(x+Δx, t)
                             - 2 b(x) (1 - b(x)) P(x, t)
                             + b(x-Δx) (1 - b(x-Δx)) P(x-Δx, t).         (5.27)
Figure 5.3: p_n versus n with N = 20. Evolution steps plotted correspond to i = 0, N, 2N, ..., 7N. (a) Initial number of A individuals is n = 10. (b) Initial number of A individuals is n = 13.
To simplify further, we use the well-known central-difference approximation to the second derivative of a function f(x),

    f''(x) = [f(x+Δx) - 2 f(x) + f(x-Δx)] / Δx^2 + O(Δx^2),
and recognize the right-hand side of (5.27) to be the numerator of a second derivative. With Δx = Δt/2 = 1/N → 0, and b(x) = x, we derive to leading order in 1/N the partial differential equation

    ∂P(x,t)/∂t = (1/2) ∂^2/∂x^2 [F(x) P(x,t)],                           (5.28)

with

    F(x) = x(1 - x)/N.                                                   (5.29)
To interpret the meaning of the function F(x), we will use the following result from probability theory. For N independent trials, each with probability of success p and probability of failure 1 - p, the number of successes, denoted by n, is a binomial random variable with parameters (N, p). Well-known results are E[n] = Np and Var[n] = Np(1 - p), where E[...] is the expected value and Var[...] is the variance.
Now, the number of A-alleles chosen when forming the next Wright-Fisher generation is a binomial random variable n' with parameters (N, n/N). Therefore, E[n'] = n and Var[n'] = n(1 - n/N). With x' = n'/N and x = n/N, we have E[x'] = x and Var[x'] = x(1 - x)/N. The function F(x) can therefore be interpreted as the variance of x over a single Wright-Fisher generation.
Although (5.28) and (5.29) depend explicitly on the population size N, the population size can be eliminated by a simple change of variables. If we let

    τ = t/N,                                                             (5.30)
then the differential equation (5.28) transforms to

    ∂P(x,τ)/∂τ = (1/2) ∂^2/∂x^2 [x(1 - x) P(x,τ)],                       (5.31)
independent of N. The change of variables given by (5.30) simply states that a doubling of the population size will lengthen the time scale of evolution by a corresponding factor of two. Remember that here we are already working under the assumption that population sizes are large.
Equation (5.31) is a diffusion-like equation for the probability distribution function P(x,τ). A diffusion approximation for studying genetic drift was first introduced into population genetics by its founders Fisher and Wright, and was later extensively developed in the 1950s by the Japanese biologist Motoo Kimura. Here, our analysis of this equation relies on a more recent paper by McKane and Waxman (2007), who showed how to construct the analytical solution of (5.31) at the boundaries of x.
For 0 < x < 1, it is suggestive from the numerical solutions shown in Fig. 5.3 that the probability distribution might asymptotically become independent of x. Accordingly, we look for an asymptotic solution to (5.31) at the interior
values of x satisfying P(x,τ) = P(τ). Equation (5.31) becomes

    dP(τ)/dτ = (1/2) [x(1 - x)]'' P(τ)
             = -P(τ),
with asymptotic solution

    P(τ) = c e^{-τ},                                                     (5.32)

and we observe that the total probability over the interior values of x decays exponentially.
To understand how the solution behaves at the boundaries of x, we require boundary conditions for P(x,τ). In fact, because P(x,τ) is singular at the boundaries of x, appropriate boundary conditions cannot be obtained directly from the difference equations given by (5.25). Rather, boundary conditions are most easily obtained by first recasting (5.31) into the form of a continuity equation.
We let J(x,τ) denote the so-called probability current. In a small region of size Δx lying in the interval (x, x+Δx), the time rate of change of the probability P(x,τ)Δx is due to the flow of probability into and out of this region. With an appropriate definition of the probability current, we have in general

    ∂/∂τ [P(x,τ) Δx] = J(x,τ) - J(x+Δx, τ),

or as Δx → 0,

    ∂P(x,τ)/∂τ + ∂J(x,τ)/∂x = 0,                                         (5.33)
which is the usual form of a continuity equation. Identification of (5.33) with (5.31) shows that the probability current of our problem is given by

    J(x,τ) = -(1/2) ∂/∂x [x(1 - x) P(x,τ)].                              (5.34)
Now, since the total probability is unity; that is,

    ∫_0^1 P(x,τ) dx = 1,                                                 (5.35)

probability cannot flow in or out of the boundaries of x, and we must therefore have

    J(0,τ) = 0,    J(1,τ) = 0,                                           (5.36)
which are the appropriate boundary conditions for (5.31).

We can look for stationary solutions of (5.31). Use of the continuity equation (5.33) together with the boundary conditions (5.36) shows that the stationary solution has zero probability current; that is, J(x) = 0. Integration of J(x) using (5.34) results in

    x(1 - x) P(x) = c_1,
where c_1 is an integration constant. The readily apparent solution is P(x) = c_1/[x(1 - x)], but there are also two less obvious solutions. Probability distribution functions are allowed to contain singular solutions corresponding to Dirac delta functions, and here we consider the possibility of there being Dirac
delta functions at the boundaries. By noticing that both x δ(x) = 0 and (1 - x) δ(1 - x) = 0, we can see that the general solution for P(x) can be written as
    P(x) = c_1/[x(1 - x)] + c_2 δ(x) + c_3 δ(1 - x).
Requiring P(x) to satisfy (5.35) results in c_1 = 0 and c_2 + c_3 = 1. To determine the remaining free constant, we can compute the mean frequency of the A-allele in the population. From the continuity equation (5.33), we have
    d/dτ ∫_0^1 x P(x,τ) dx = -∫_0^1 x [∂J(x,τ)/∂x] dx
                            = ∫_0^1 J(x,τ) dx
                            = 0,
where the first integral on the right-hand side was done by parts using the boundary conditions given by (5.36), and the second integral was done using (5.34) and the vanishing of x(1 - x)P(x) on the boundaries. The mean frequency of the A-allele is therefore a constant, as one would expect for nondirectional random genetic drift, and we can assume that its initial value is p. We therefore obtain for our stationary solution

    P(x) = (1 - p) δ(x) + p δ(1 - x).
The eventual probability of fixing the A-allele is therefore simply equal to its initial frequency. For example, suppose that within a population of N individuals homogeneous for a, a single neutral mutation occurs so that one individual now carries the A-allele. What is the probability that the A-allele eventually becomes fixed? Our result would yield the probability 1/N, which is the initial frequency of the A-allele. Intuitively, after a sufficient number of generations has passed, all living individuals should be descended from a single ancestral individual living at the time the mutation occurred. The probability that that single individual carried the A-allele is just 1/N.
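The fixation probability 1/N can also be checked by direct Monte Carlo simulation of the Moran model; the population size, trial count, and seed below are arbitrary choices:

```python
# Monte Carlo estimate of the fixation probability of a single neutral
# A-mutant in a population of N individuals; theory predicts 1/N.
import random

def a_allele_fixes(N, n0, rng):
    """Run the Moran model from n0 A-alleles until fixation or loss."""
    n = n0
    while 0 < n < N:
        birth_is_A = rng.random() < n / N  # reproducing individual carries A
        death_is_A = rng.random() < n / N  # dying individual carries A
        n += int(birth_is_A) - int(death_is_A)
    return n == N

N, trials = 10, 10000
rng = random.Random(42)
fixed = sum(a_allele_fixes(N, 1, rng) for _ in range(trials))
estimate = fixed / trials
print(estimate)  # should be close to 1/N = 0.1
```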
We further note that Kimura was the first to find an analytical solution of the diffusion equation (5.31). A solution method using Fourier transforms can be found in the appendix of McKane and Waxman (2007). In addition to making use of Dirac delta functions, these authors require Heaviside step functions, Bessel functions, spherical Bessel functions, hypergeometric functions, Legendre polynomials, and Gegenbauer polynomials. The resulting solution is of the form

    P(x,τ) = Π_0(τ) δ(x) + Π_1(τ) δ(1 - x) + f(x,τ).                     (5.37)
If the number of A-alleles is known at the initial instant, and the frequency is p, then

    P(x, 0) = δ(x - p),

and Π_0(0) = Π_1(0) = 0, f(x,0) = δ(x - p). As τ → ∞, we have f(x,τ) → 0, Π_0(τ) → 1 - p, and Π_1(τ) → p.
We can easily demonstrate at least one exact solution of the form (5.37). For an initial uniform probability distribution given by

    P(x, 0) = 1,
the probability distribution at the interior points remains uniform and is given by (5.32) with c = 1. If we assume the form of solution given by (5.37) together with the requirement of unit probability given by (5.35), we can obtain the exact result

    P(x,τ) = (1/2)(1 - e^{-τ}) [δ(x) + δ(1 - x)] + e^{-τ}.
References
Kimura, M. Solution of a process of random genetic drift with a continuous model. Proceedings of the National Academy of Sciences, USA (1955) 41, 144-150.
McKane, A.J. & Waxman, D. Singular solutions of the diffusion equation of population genetics. Journal of Theoretical Biology (2007) 247, 849-858.
Chapter 6
Biochemical Reactions
Biochemistry is the study of the chemistry of life. It can be considered a branch of molecular biology, perhaps more focused on specific molecules and their reactions, or a branch of chemistry focused on the complex chemical reactions occurring in living organisms. Perhaps the first application of biochemistry happened about 5000 years ago when bread was made using yeast.
Modern biochemistry, however, had a relatively slow start among the sciences, as did modern biology. Isaac Newton's publication of the Principia Mathematica in 1687 preceded Darwin's Origin of Species in 1859 by almost 200 years. I find this amazing because the ideas of Darwin are in many ways simpler and easier to understand than the mathematical theory of Newton. Most of the delay must be attributed to a fundamental conflict between science and religion. The physical sciences experienced this conflict early (witness the famous prosecution of Galileo by the Catholic Church in 1633, during which Galileo was forced to recant his heliocentric view), but the conflict of religion with evolutionary biology continues even to this day. Advances in biochemistry were initially delayed because it was long believed that life was not subject to the laws of science the way nonlife was, and that only living things could produce the molecules of life. Certainly, this was more a religious conviction than a scientific one. Then Friedrich Wöhler in 1828 published his landmark paper on the synthesis of urea (a waste product neutralizing toxic ammonia before excretion in the urine), demonstrating for the first time that organic compounds can be created artificially.
Here, we present mathematical models for some important biochemical reactions. We begin by introducing a useful model for a chemical reaction: the law of mass action. We then model what may be the most important biochemical reactions, namely those catalyzed by enzymes. Using the mathematical model of enzyme kinetics, we consider three fundamental enzymatic properties: competitive inhibition, allosteric inhibition, and cooperativity.
6.1 The law of mass action
The law of mass action describes the rate at which chemicals interact in reactions. It is assumed that different chemical molecules come into contact by collision before reacting, and that the collision rate is directly proportional to
the number of molecules of each reacting species. Suppose that two chemicals A and B react to form a product chemical C, written as

    A + B  --k-->  C,
with k the rate constant of the reaction. For simplicity, we will use the same symbol C, say, to refer to both the chemical C and its concentration. The law of mass action says that dC/dt is proportional to the product of the concentrations A and B, with proportionality constant k. That is,

    dC/dt = k A B.                                                       (6.1)
Similarly, the law of mass action enables us to write equations for the time derivatives of the reactant concentrations A and B:

    dA/dt = -k A B,    dB/dt = -k A B.                                   (6.2)
Notice that when using the law of mass action to find the rate of change of a concentration, the chemical that the arrow points towards is increasing in concentration (positive sign), and the chemical that the arrow points away from is decreasing in concentration (negative sign). The product of concentrations on the right-hand side is always that of the reactants from which the arrow points away, multiplied by the rate constant that is on top of the arrow.
Equation (6.1) can be solved analytically using conservation laws. Each reactant, original and converted to product, is conserved since one molecule of each reactant gets converted into one molecule of product. Therefore,

    d(A + C)/dt = 0  ⇒  A + C = A_0,
    d(B + C)/dt = 0  ⇒  B + C = B_0,
where A_0 and B_0 are the initial concentrations of the reactants, and no product is present initially. Using the conservation laws, (6.1) becomes

    dC/dt = k (A_0 - C)(B_0 - C),    with C(0) = 0,
which may be integrated by separating variables. After some algebra, the solution is determined to be

    C(t) = A_0 B_0 [e^{(B_0 - A_0)kt} - 1] / [B_0 e^{(B_0 - A_0)kt} - A_0],
which is a complicated expression with the simple limits

    lim_{t→∞} C(t) = A_0  if A_0 < B_0,
                     B_0  if B_0 < A_0.                                  (6.3)
The reaction stops after one of the reactants is depleted, and the final concentration of the product is equal to the initial concentration of the depleted reactant.
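The closed-form solution and its limit (6.3) can be checked against a direct numerical integration of dC/dt = k(A_0 - C)(B_0 - C); the rate constant and initial concentrations below are illustrative:

```python
# Compare a forward-Euler integration of dC/dt = k (A0 - C)(B0 - C)
# against the closed-form solution; k, A0, B0 are illustrative values.
import math

k, A0, B0 = 1.0, 1.0, 2.0

def C_exact(t):
    """Closed-form solution with C(0) = 0."""
    e = math.exp((B0 - A0) * k * t)
    return A0 * B0 * (e - 1) / (B0 * e - A0)

dt, C = 1e-4, 0.0
for _ in range(int(5.0 / dt)):  # integrate to t = 5
    C += dt * k * (A0 - C) * (B0 - C)

print(C, C_exact(5.0))  # both near A0 = 1, the depleted reactant
```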
If we also include the reverse reaction,

    A + B  <==>  C,    (forward rate constant k_+, backward rate constant k_-)

then the time derivative of the product is given by

    dC/dt = k_+ A B - k_- C.
Notice that k_+ and k_- have different units. At equilibrium, dC/dt = 0, and using the conservation laws A + C = A_0 and B + C = B_0, we obtain

    (A_0 - C)(B_0 - C) - (k_-/k_+) C = 0,

from which we define the equilibrium constant K_eq by

    K_eq = k_-/k_+,
which has units of concentration. Therefore, at equilibrium, the concentration of the product is given by the solution of the quadratic equation

    C^2 - (A_0 + B_0 + K_eq) C + A_0 B_0 = 0,

with the extra condition that 0 < C < min(A_0, B_0). For instance, if A_0 = B_0 ≡ S_0, then at equilibrium,
    C = S_0 - (1/2) K_eq [√(1 + 4 S_0/K_eq) - 1].
If K_eq ≪ S_0, then A and B have a high affinity, and the reaction proceeds mainly to C, with C ≈ S_0.
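As a quick check of the equilibrium formula for A_0 = B_0 ≡ S_0, with illustrative values S_0 = 1 and K_eq = 0.05:

```python
# Equilibrium product concentration for A0 = B0 = S0:
#   C = S0 - (1/2) Keq (sqrt(1 + 4 S0/Keq) - 1),
# the root of C^2 - (2 S0 + Keq) C + S0^2 = 0 lying in (0, S0).
# The values of S0 and Keq are illustrative.
import math

S0, Keq = 1.0, 0.05  # Keq << S0: high affinity

C = S0 - 0.5 * Keq * (math.sqrt(1 + 4 * S0 / Keq) - 1)

residual = C**2 - (2 * S0 + Keq) * C + S0**2  # should vanish
print(C, residual)  # C = 0.8 for these values
```

Even with K_eq at five percent of S_0, the equilibrium conversion is only 80%, so the "C ≈ S_0" limit requires K_eq to be quite small indeed.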
Below are two interesting reactions. In reaction (ii), A is assumed to be held at a constant concentration.

    (i)   A + X  <==>  2X,    (forward rate constant k_+, backward rate constant k_-)

    (ii)  A + X  --k_1-->  2X,    X + Y  --k_2-->  2Y,    Y  --k_3-->  B.
Can you write down the equations for dX/dt in reaction (i), and for dX/dt and dY/dt in reaction (ii)? When normalized properly, the equations from reaction (ii) reduce to the Lotka-Volterra predator-prey equations introduced in §1.4. The chemical concentrations X and Y, therefore, oscillate in time like predators and their prey.
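For reaction (ii), applying the mass-action rules above gives dX/dt = k_1 A X - k_2 X Y and dY/dt = k_2 X Y - k_3 Y (this gives away part of the exercise); a minimal forward-Euler sketch with illustrative constants shows the oscillation:

```python
# Mass-action equations for reaction (ii), with A held constant:
#   dX/dt = k1*A*X - k2*X*Y,   dY/dt = k2*X*Y - k3*Y.
# These are Lotka-Volterra equations: X plays the prey, Y the predator.
# All constants and initial conditions are illustrative.

k1A, k2, k3 = 1.0, 1.0, 1.0  # k1*A lumped into a single constant
X, Y = 0.5, 0.5              # start away from the fixed point (1, 1)
dt = 1e-4

xs = []
for _ in range(int(20.0 / dt)):  # integrate to t = 20, a few cycles
    dX = k1A * X - k2 * X * Y
    dY = k2 * X * Y - k3 * Y
    X += dt * dX
    Y += dt * dY
    xs.append(X)

# X oscillates about the fixed point X* = k3/k2 = 1.
print(min(xs), max(xs))
```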
Figure 6.1: A Michaelis-Menten reaction of two substrates converting to one product. (Drawn by User:IMeowbot, released under the GNU Free Documentation License.)
6.2 Enzyme kinetics
Enzymes are catalysts, usually proteins, that help convert other molecules called substrates into products, but are themselves unchanged by the reaction. Each enzyme has high specificity for at least one reaction, and it can accelerate this reaction by millions of times. Without enzymes, most biochemical reactions are too slow for life to be possible. Enzymes are so important to our lives that a single amino acid mutation in one enzyme, out of the more than 2000 enzymes in our bodies, can result in a severe or lethal genetic disease. Enzymes do not follow the law of mass action directly: with S substrate, P product, and E enzyme, the reaction

    S + E  --k-->  P + E
is a poor model since the reaction velocity dP/dt is known to attain a finite limit with increasing substrate concentration. Rather, Michaelis and Menten (1913) proposed the following reaction scheme with an intermediate molecule:

    S + E  <==>  C  --k_2-->  P + E,    (forward rate constant k_1, backward k_{-1})
where C is a complex formed by the enzyme and the substrate. A cartoon of the Michaelis-Menten reaction with an enzyme catalyzing a reaction between two substrates is shown in Fig. 6.1. Commonly, substrate is continuously provided to the reaction and product is continuously removed. The removal of product has been modeled by neglecting the reverse reaction P + E → C. A continuous provision of substrate allows us to assume that S is held at an approximately constant concentration.
The differential equations for C and P can be obtained from the law of mass action:

    dC/dt = k_1 S E - (k_{-1} + k_2) C,
    dP/dt = k_2 C.
Biochemists usually want to determine the reaction velocity dP/dt in terms of the substrate concentration S and the total enzyme concentration E_0. We can
eliminate E in favor of E_0 from the conservation law that the enzyme, free and bound, is conserved; that is,

    d(E + C)/dt = 0  ⇒  E + C = E_0  ⇒  E = E_0 - C;

and we can rewrite the equation for dC/dt eliminating E:

    dC/dt = k_1 S (E_0 - C) - (k_{-1} + k_2) C
          = k_1 E_0 S - (k_{-1} + k_2 + k_1 S) C.                        (6.4)
Because S is assumed to be held constant, the complex C is expected to be in equilibrium, with the rate of formation equal to the rate of dissociation. With this so-called quasi-steady-state approximation, we may assume that dC/dt = 0 in (6.4), and we have

    C = k_1 E_0 S / (k_{-1} + k_2 + k_1 S).
The reaction velocity is then given by

    dP/dt = k_2 C
          = k_1 k_2 E_0 S / (k_{-1} + k_2 + k_1 S)
          = V_m S / (K_m + S),                                           (6.5)

where two fundamental constants are defined:

    K_m = (k_{-1} + k_2)/k_1,    V_m = k_2 E_0.                          (6.6)
The Michaelis-Menten constant, or Michaelis constant, K_m has units of concentration, and the maximum reaction velocity V_m has units of concentration divided by time. The interpretation of these constants is obtained by considering the following limits:

    as S → ∞:    C → E_0  and  dP/dt → V_m;
    if S = K_m:  C = (1/2) E_0  and  dP/dt = (1/2) V_m.
Therefore, V_m is the limiting reaction velocity obtained by saturating the reaction with substrate so that every enzyme is bound; and K_m is the concentration of S at which only one-half of the enzymes are bound and the reaction proceeds at one-half maximum velocity.
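The two limits above can be verified directly from (6.5); the values of V_m and K_m here are illustrative, not measured:

```python
# Michaelis-Menten velocity (6.5): dP/dt = Vm*S/(Km + S).
# Vm and Km below are illustrative values.

Vm, Km = 1.0, 2.0

def velocity(S):
    """Reaction velocity as a function of substrate concentration S."""
    return Vm * S / (Km + S)

half = velocity(Km)        # S = Km gives exactly half-maximum velocity
saturated = velocity(1e6)  # S >> Km: velocity approaches Vm
print(half, saturated)
```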
6.3 Competitive inhibition
Competitive inhibition occurs when inhibitor molecules compete with substrate molecules for binding to the same enzyme's active site. When an inhibitor is bound to the enzyme, no product is produced, so competitive inhibition will
reduce the velocity of the reaction. A cartoon of this process is shown in Fig. 6.2.

Figure 6.2: Competitive inhibition. (Drawn by G. Andruk, released under the GNU Free Documentation License.)
To model competitive inhibition, we introduce an additional reaction associated with the inhibitor-enzyme binding:

    S + E  <==>  C_1  --k_2-->  P + E,    (forward rate constant k_1, backward k_{-1})
    I + E  <==>  C_2.                     (forward rate constant k_3, backward k_{-3})
With more complicated enzymatic reactions, the reaction schematic becomes difficult to interpret. Perhaps an easier way to visualize the reaction is from the following redrawn schematic:

    E  <==>  C_1  --k_2-->  P + E,    (forward rate k_1 S, backward rate k_{-1})
    E  <==>  C_2.                     (forward rate k_3 I, backward rate k_{-3})

Here, the substrate S and inhibitor I are combined with the relevant rate constants, rather than treated separately. It is immediately obvious from this redrawn schematic that inhibition is accomplished by sequestering enzyme in the form of C_2 and preventing its participation in the catalysis of S to P.
Our goal is to determine the reaction velocity dP/dt in terms of the substrate and inhibitor concentrations, and the total concentration of the enzyme (free and bound). The law of mass action applied to the complex concentrations and the product results in

    dC_1/dt = k_1 S E - (k_{-1} + k_2) C_1,
    dC_2/dt = k_3 I E - k_{-3} C_2,
    dP/dt = k_2 C_1.
The enzyme, free and bound, is conserved so that

    d(E + C_1 + C_2)/dt = 0  ⇒  E + C_1 + C_2 = E_0  ⇒  E = E_0 - C_1 - C_2.
Under the quasi-equilibrium approximation, dC_1/dt = dC_2/dt = 0, so that

    k_1 S (E_0 - C_1 - C_2) - (k_{-1} + k_2) C_1 = 0,
    k_3 I (E_0 - C_1 - C_2) - k_{-3} C_2 = 0,

which results in the following system of two linear equations and two unknowns (C_1 and C_2):

    (k_{-1} + k_2 + k_1 S) C_1 + k_1 S C_2 = k_1 E_0 S,                  (6.7)
    k_3 I C_1 + (k_{-3} + k_3 I) C_2 = k_3 E_0 I.                        (6.8)
We define the Michaelis-Menten constant K_m as before, and an additional constant K_i associated with the inhibitor reaction:

    K_m = (k_{-1} + k_2)/k_1,    K_i = k_{-3}/k_3.
Dividing (6.7) by k_1 and (6.8) by k_3 yields

    (K_m + S) C_1 + S C_2 = E_0 S,                                       (6.9)
    I C_1 + (K_i + I) C_2 = E_0 I.                                       (6.10)
Since our goal is to obtain the velocity of the reaction, which requires determining C_1, we multiply (6.9) by (K_i + I) and (6.10) by S, and subtract:

      (K_m + S)(K_i + I) C_1 + S (K_i + I) C_2 = E_0 (K_i + I) S
    -        S I C_1         + S (K_i + I) C_2 = E_0 S I
    -------------------------------------------------------------
      [(K_m + S)(K_i + I) - S I] C_1 = K_i E_0 S;

or after cancelation and rearrangement,

    C_1 = K_i E_0 S / (K_m K_i + K_i S + K_m I)
        = E_0 S / [K_m (1 + I/K_i) + S].
Figure 6.3: Allosteric inhibition. (Unknown artist, released under the GNU Free Documentation License.)
Therefore, the reaction velocity is given by

    dP/dt = (k_2 E_0) S / [K_m (1 + I/K_i) + S]
          = V_m S / (K'_m + S),                                          (6.11)
where

    V_m = k_2 E_0,    K'_m = K_m (1 + I/K_i).                            (6.12)
By comparing the inhibited reaction velocity (6.11) and (6.12) with the uninhibited reaction velocity (6.5) and (6.6), we observe that inhibition increases the Michaelis-Menten constant of the reaction, but leaves unchanged the maximum reaction velocity. Since the Michaelis-Menten constant is defined as the substrate concentration required to attain one-half of the maximum reaction velocity, addition of an inhibitor with a fixed substrate concentration acts to decrease the reaction velocity. However, a reaction saturated with substrate still attains the uninhibited maximum reaction velocity.
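A numerical sketch of (6.11)-(6.12), with assumed constants, confirms that the inhibitor slows the reaction at a fixed substrate concentration but not at saturation:

```python
# Competitive inhibition, equations (6.11)-(6.12): the inhibitor raises
# the Michaelis constant to Km' = Km (1 + I/Ki) but leaves Vm unchanged.
# All constants are illustrative.

Vm, Km, Ki = 1.0, 2.0, 0.5

def velocity(S, I):
    """Inhibited reaction velocity (6.11)."""
    return Vm * S / (Km * (1 + I / Ki) + S)

v_uninhibited = velocity(4.0, 0.0)  # I = 0 recovers (6.5)
v_inhibited = velocity(4.0, 1.0)    # same S, lower velocity
v_saturated = velocity(1e7, 1.0)    # S >> Km': Vm is still attained
print(v_uninhibited, v_inhibited, v_saturated)
```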
6.4 Allosteric inhibition
The term allostery comes from the Greek words allos, meaning different, and stereos, meaning solid, and refers to an enzyme with a regulatory binding site separate from its active binding site. In our model of allosteric inhibition, an inhibitor molecule is assumed to bind to its own regulatory site on the enzyme, resulting in either a lowered binding affinity of the substrate to the enzyme, or a lowered conversion rate of substrate to product. A cartoon of allosteric inhibition due to a lowered binding affinity is shown in Fig. 6.3.
In general, we need to define three complexes: C_1 is the complex formed from substrate and enzyme; C_2 from inhibitor and enzyme; and C_3 from substrate, inhibitor, and enzyme. We write the chemical reactions as follows:
    E    <==>  C_1  --k_2-->   P + E,      (forward rate k_1 S, backward rate k_{-1})
    E    <==>  C_2,                        (forward rate k_3 I, backward rate k_{-3})
    C_2  <==>  C_3  --k'_2-->  P + C_2,    (forward rate k'_1 S, backward rate k'_{-1})
    C_1  <==>  C_3.                        (forward rate k'_3 I, backward rate k'_{-3})
The general model for allosteric inhibition with ten independent rate constants appears too complicated to analyze. We will simplify this general model to one with fewer rate constants that still exhibits the unique features of allosteric inhibition. One possible but uninteresting simplification assumes that if I binds to E, then S does not; however, this reduces allosteric inhibition to competitive inhibition and loses the essence of allostery. Instead, we simplify by allowing both I and S to simultaneously bind to E, but we assume that the binding of I prevents substrate conversion to product. With this simplification, k'_2 = 0. To further reduce the number of independent rate constants, we assume that the binding of S to E is unaffected by the bound presence of I, and the binding of I to E is unaffected by the bound presence of S. These approximations imply that all the primed rate constants equal the corresponding unprimed rate constants, e.g., k'_1 = k_1, etc. With these simplifications, the schematic of the chemical reaction simplifies to
    E    <==>  C_1  --k_2-->  P + E,    (forward rate k_1 S, backward rate k_{-1})
    E    <==>  C_2,                     (forward rate k_3 I, backward rate k_{-3})
    C_2  <==>  C_3,                     (forward rate k_1 S, backward rate k_{-1})
    C_1  <==>  C_3,                     (forward rate k_3 I, backward rate k_{-3})
and now there are only five independent rate constants. We write the equations for the complexes using the law of mass action:

    dC_1/dt = k_1 S E + k_{-3} C_3 - (k_{-1} + k_2 + k_3 I) C_1,         (6.13)
    dC_2/dt = k_3 I E + k_{-1} C_3 - (k_{-3} + k_1 S) C_2,               (6.14)
    dC_3/dt = k_3 I C_1 + k_1 S C_2 - (k_{-1} + k_{-3}) C_3,             (6.15)
while the reaction velocity is given by

    dP/dt = k_2 C_1.                                                     (6.16)
Again, both free and bound enzyme is conserved, so that E = E_0 - C_1 - C_2 - C_3. With the quasi-equilibrium approximation dC_1/dt = dC_2/dt = dC_3/dt = 0, we obtain a system of three equations and three unknowns: C_1, C_2 and C_3. Despite our simplifications, the analytical solution for the reaction velocity remains messy (see Keener & Sneyd, referenced at the chapter's end) and not especially illuminating. We omit the complete analytical result here and determine only the maximum reaction velocity.
The maximum reaction velocity V'_m for the allosteric-inhibited reaction is defined as the time derivative of the product concentration when the reaction is saturated with substrate; that is,

    V'_m = lim_{S→∞} dP/dt = k_2 lim_{S→∞} C_1.
With substrate saturation, every enzyme will have its substrate binding site occupied. Enzymes are either bound with only substrate in the complex C_1, or bound together with substrate and inhibitor in the complex C_3. Accordingly, the schematic of the chemical reaction with substrate saturation simplifies to
    C_1  --k_2-->  P + C_1,
    C_1  <==>  C_3.                     (forward rate k_3 I, backward rate k_{-3})
The equations for C_1 and C_3 with substrate saturation are thus given by

    dC_1/dt = k_{-3} C_3 - k_3 I C_1,                                    (6.17)
    dC_3/dt = k_3 I C_1 - k_{-3} C_3,                                    (6.18)
and the quasi-equilibrium approximation yields the single independent equation

    C_3 = (k_3/k_{-3}) I C_1 = (I/K_i) C_1,                              (6.19)
with K_i = k_{-3}/k_3 as before. The equation expressing the conservation of enzyme is given by E_0 = C_1 + C_3. This conservation law, together with (6.19), permits us to solve for C_1:

    C_1 = E_0 / (1 + I/K_i).
Figure 6.4: Cooperativity.
Therefore, the maximum reaction velocity for the allosteric-inhibited reaction is given by

    V'_m = k_2 E_0 / (1 + I/K_i) = V_m / (1 + I/K_i),
where V_m is the maximum reaction velocity of both the uninhibited and the competitive-inhibited reaction. The allosteric inhibitor is thus seen to reduce the maximum velocity of the uninhibited reaction by the factor (1 + I/K_i), which may be large if the concentration of allosteric inhibitor is substantial.
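The contrast with competitive inhibition can be made concrete; with illustrative values of V_m and K_i, the allosteric inhibitor scales down the maximum velocity itself:

```python
# Allosteric inhibition reduces the maximum reaction velocity to
#   Vm' = Vm / (1 + I/Ki),
# whereas competitive inhibition leaves Vm unchanged.
# Vm and Ki below are illustrative values.

Vm, Ki = 1.0, 0.5

def Vm_allosteric(I):
    """Maximum velocity of the allosteric-inhibited reaction."""
    return Vm / (1 + I / Ki)

print(Vm_allosteric(0.0))  # no inhibitor: the uninhibited Vm = 1.0
print(Vm_allosteric(0.5))  # I = Ki halves the maximum velocity
print(Vm_allosteric(4.5))  # I = 9*Ki: ten-fold reduction to 0.1
```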
6.5 Cooperativity
Enzymes and other protein complexes may have multiple binding sites, and when a substrate binds to one of these sites, the other sites may become more active. A well-studied example is the binding of the oxygen molecule to the hemoglobin protein. Hemoglobin can bind four molecules of O2, and when three molecules are bound, the fourth molecule has an increased affinity for binding. We call this cooperativity.
We will model cooperativity by assuming that an enzyme has two separated but indistinguishable binding sites for a substrate S. For example, the enzyme may be a protein dimer, composed of two identical subproteins with identical binding sites for S. A cartoon of this enzyme is shown in Fig. 6.4. Because the two binding sites are indistinguishable, we need consider only two complexes: C_1 and C_2, with enzyme bound to one or two substrate molecules, respectively. When the enzyme exhibits cooperativity, the binding of the second substrate molecule has a greater rate constant than the binding of the first. We therefore consider the following reaction:
    S + E  ⇌  C_1  →  P + E        (binding rate constant k_1, dissociation k_{-1}, conversion k_2),
    S + C_1  ⇌  C_2  →  P + C_1    (binding rate constant k_3, dissociation k_{-3}, conversion k_4),
where cooperativity supposes that k_1 ≪ k_3. Application of the law of mass action results in
    dC_1/dt = k_1 S E + (k_{-3} + k_4) C_2 − (k_{-1} + k_2 + k_3 S) C_1,
    dC_2/dt = k_3 S C_1 − (k_{-3} + k_4) C_2.
Applying the quasi-equilibrium approximation dC_1/dt = dC_2/dt = 0 and the conservation law E_0 = E + C_1 + C_2 results in the following system of two equations in two unknowns:

    (k_{-1} + k_2 + (k_1 + k_3) S) C_1 − (k_{-3} + k_4 − k_1 S) C_2 = k_1 E_0 S,    (6.20)
    k_3 S C_1 − (k_{-3} + k_4) C_2 = 0.    (6.21)
We divide both (6.20) and (6.21) by k_3 and define

    K_1 = (k_{-1} + k_2)/k_1,    K_2 = (k_{-3} + k_4)/k_3,    ε = k_1/k_3,

to obtain

    (ε K_1 + (1 + ε) S) C_1 − (K_2 − ε S) C_2 = ε E_0 S,    (6.22)
    S C_1 − K_2 C_2 = 0.    (6.23)
We can subtract (6.23) from (6.22) and cancel ε to obtain

    (K_1 + S) C_1 + S C_2 = E_0 S.    (6.24)
Equations (6.23) and (6.24) can be solved for C_1 and C_2:

    C_1 = K_2 E_0 S / (K_1 K_2 + K_2 S + S^2),    (6.25)
    C_2 = E_0 S^2 / (K_1 K_2 + K_2 S + S^2),    (6.26)
so that the reaction velocity is given by
    dP/dt = k_2 C_1 + k_4 C_2
          = (k_2 K_2 + k_4 S) E_0 S / (K_1 K_2 + K_2 S + S^2).    (6.27)
To illuminate this result, we consider two limiting cases: (i) no cooperativity, where the active sites act independently so that each protein dimer, say, can be considered as two independent protein monomers; and (ii) strong cooperativity, where the binding of the second substrate has a much greater rate constant than the binding of the first.
Independent active sites
The free enzyme E has two independent binding sites, while C_1 has only a single binding site. Consulting the reaction schematic: k_1 is the rate constant for the binding of S to two independent binding sites; k_{-1} and k_2 are the rate constants for the dissociation and conversion of a single S from the enzyme; k_3 is the rate constant for the binding of S to a single free binding site; and k_{-3} and k_4 are the rate constants for the dissociation and conversion of one of two independent S's from the enzyme. Accounting for these factors of two and assuming independence of active sites, we have
    k_1 = 2 k_3,    k_{-3} = 2 k_{-1},    k_4 = 2 k_2.
We define the Michaelis-Menten constant K_m that is representative of the protein monomer with one binding site; that is,

    K_m = (k_{-1} + k_2)/(k_1/2) = 2 K_1 = K_2/2.
Therefore, for independent active sites, the reaction velocity becomes
    dP/dt = (2 k_2 K_m + 2 k_2 S) E_0 S / (K_m^2 + 2 K_m S + S^2)
          = 2 k_2 E_0 S / (K_m + S).
The reaction velocity for a dimer protein enzyme composed of independent identical monomers is simply double that of a monomer protein enzyme, an intuitively obvious result.
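This reduction can be verified numerically: with rate constants obeying the independence relations, the general velocity (6.27) coincides with 2 k_2 E_0 S/(K_m + S). The rate constants below are hypothetical values chosen only for the check.

```python
# Sanity check (hypothetical rate constants): with independent active
# sites, k1 = 2*k3, k_{-3} = 2*k_{-1}, k4 = 2*k2, the general cooperative
# velocity (6.27) should collapse to the doubled Michaelis-Menten form.
k1, km1, k2, E0 = 4.0, 1.0, 0.5, 1.0      # km1 denotes k_{-1}
k3, km3, k4 = k1 / 2, 2 * km1, 2 * k2     # independence relations
K1 = (km1 + k2) / k1
K2 = (km3 + k4) / k3
Km = (km1 + k2) / (k1 / 2)                # monomer Michaelis-Menten constant
for S in [0.1, 1.0, 10.0]:
    v_general = (k2 * K2 + k4 * S) * E0 * S / (K1 * K2 + K2 * S + S * S)
    v_monomer = 2 * k2 * E0 * S / (Km + S)
    assert abs(v_general - v_monomer) < 1e-12
print("velocities agree")
```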
Strong cooperativity
We now assume that after the first substrate binds to the enzyme, the second substrate binds much more easily, so that k_1 ≪ k_3. The number of enzymes bound to a single substrate molecule should consequently be much less than the number bound to two substrate molecules, resulting in C_1 ≪ C_2. Dividing (6.25) by (6.26), this inequality becomes

    C_1/C_2 = K_2/S ≪ 1.
Dividing the numerator and denominator of (6.27) by S^2, we have

    dP/dt = (k_2 K_2/S + k_4) E_0 / ((K_1/S)(K_2/S) + (K_2/S) + 1).
Figure 6.5: The reaction velocity dP/dt as a function of the substrate S. Shown are the solutions to the Hill equation with V_m = 1, K_m = 1, and for n = 1, 2.
To take the limit of this expression as K_2/S → 0, we set K_2/S = 0 everywhere except in the first term of the denominator, since K_1/S is inversely proportional to k_1 and may go to infinity in this limit. Taking the limit and multiplying the numerator and denominator by S^2,

    dP/dt = k_4 E_0 S^2 / (K_1 K_2 + S^2).
Here, the maximum reaction velocity is V_m = k_4 E_0, and the modified Michaelis-Menten constant is K_m = √(K_1 K_2), so that

    dP/dt = V_m S^2 / (K_m^2 + S^2).
In biochemistry, this reaction velocity is generalized to

    dP/dt = V_m S^n / (K_m^n + S^n),
known as the Hill equation; by varying n, it is used to fit experimental data. In Fig. 6.5, we have plotted the reaction velocity dP/dt versus S as obtained from the Hill equation with n = 1 or 2. In drawing the figure, we have taken both V_m and K_m equal to unity. It is evident that with increasing n the reaction velocity more rapidly saturates to its maximum value.
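The behavior shown in Fig. 6.5 can be reproduced in a few lines; a minimal sketch with V_m = K_m = 1 as in the figure (the function name `hill` is illustrative):

```python
# A short numerical look at the Hill equation dP/dt = Vm*S**n/(Km**n + S**n),
# with Vm = Km = 1 as in Fig. 6.5.
def hill(S, n, Vm=1.0, Km=1.0):
    return Vm * S**n / (Km**n + S**n)

for n in (1, 2):
    print(n, [round(hill(S, n), 3) for S in (0.5, 1.0, 3.0, 10.0)])
# Both curves pass through 1/2 at S = Km, but for S > Km the n = 2 curve
# lies closer to saturation, and for S < Km it lies lower (sigmoidal shape).
```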
References
Keener, J. & Sneyd, J. Mathematical Physiology. Springer-Verlag, New York (1998). Pg. 30, Exercise 2.
Chapter 7
Sequence Alignment
The software program BLAST (Basic Local Alignment Search Tool) uses sequence alignment algorithms to compare a query sequence against a database to identify other known sequences similar to the query sequence. Often, the annotations attached to the already known sequences yield important biological information about the query sequence. Almost all biologists use BLAST, making sequence alignment one of the most important algorithms of bioinformatics.
The sequence under study can be composed of nucleotides (from the nucleic acids DNA or RNA) or amino acids (from proteins). Nucleic acids chain together four different nucleotides: A, C, T, G for DNA and A, C, U, G for RNA; proteins chain together twenty different amino acids. The sequence of a DNA molecule or of a protein is the linear order of nucleotides or amino acids in a specified direction, defined by the chemistry of the molecule. There is no need for us to know the exact details of the chemistry; it is sufficient to know that a protein has distinguishable ends called the N-terminus and the C-terminus, and that the usual convention is to read the amino acid sequence from the N-terminus to the C-terminus. Specification of the direction is more complicated for a DNA molecule than for a protein molecule because of the double helix structure of DNA, and this will be explained in Section 7.1.
The basic sequence alignment algorithm aligns two or more sequences to highlight their similarity, inserting a small number of gaps into each sequence (usually denoted by dashes) to align wherever possible identical or similar characters. For instance, Fig. 7.1 presents an alignment, using the software tool ClustalW, of the hemoglobin beta-chain from a human, a chimpanzee, a rat, and a zebrafish. The human and chimpanzee sequences are identical, a consequence of our very close evolutionary relationship. The rat sequence differs from human/chimpanzee at only 27 out of 146 amino acids; we are all mammals. The zebrafish sequence, though clearly related, diverges significantly. Notice the insertion of a gap in each of the mammal sequences at the zebrafish amino acid position 122. This permits the subsequent zebrafish sequence to better align with the mammal sequences, and implies either an insertion of a new amino acid in fish, or a deletion of an amino acid in mammals. The insertion or deletion of a character in a sequence is called an indel. Mismatches in sequence, such as those occurring between zebrafish and mammals at amino acid positions 2 and 3, are called mutations. ClustalW places a '*' on the last line to denote exact amino acid matches across all sequences, and a ':' and '.' to
CLUSTAL W (1.83) multiple sequence alignment
Human VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKV 60
Chimpanzee VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPKV 60
Rat VHLTDAEKAAVNGLWGKVNPDDVGGEALGRLLVVYPWTQRYFDSFGDLSSASAIMGNPKV 60
Zebrafish VEWTDAERTAILGLWGKLNIDEIGPQALSRCLIVYPWTQRYFATFGNLSSPAAIMGNPKV 60
*. * *::*: .****:* *::* :**.* *:*******:* :**:**:. *:******
Human KAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGK 120
Chimpanzee KAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFGK 120
Rat KAHGKKVINAFNDGLKHLDNLKGTFAHLSELHCDKLHVDPENFRLLGNMIVIVLGHHLGK 120
Zebrafish AAHGRTVMGGLERAIKNMDNVKNTYAALSVMHSEKLHVDPDNFRLLADCITVCAAMKFGQ 120
***:.*:..:. .: ::**:*.*:* ** :*.:******:*****.: :. . ::*:
Human E-FTPPVQAAYQKVVAGVANALAHKYH 146
Chimpanzee E-FTPPVQAAYQKVVAGVANALAHKYH 146
Rat E-FTPCAQAAFQKVVAGVASALAHKYH 146
Zebrafish AGFNADVQEAWQKFLAVVVSALCRQYH 147
*.. .* *:**.:* *..**.::**
Figure 7.1: Multiple alignment of the hemoglobin beta-chain for Human, Chimpanzee, Rat and Zebrafish, obtained using ClustalW.
denote chemically similar amino acids across all sequences (each amino acid has characteristic chemical properties, and amino acids can be grouped according to similar properties). In this chapter, we detail the algorithms used to align sequences.
7.1 The minimum you need to know about DNA chemistry and the genetic code
In one of the most important scientific papers ever published, James Watson and Francis Crick, pictured in Fig. 7.2, determined the structure of DNA using a three-dimensional molecular model that makes plain the chemical basis of heredity. The DNA molecule consists of two strands wound around each other to form the now famous double helix. Arbitrarily, one strand is labeled by the sequencing group to be the positive strand, and the other the negative strand. The two strands of the DNA molecule bind to each other by base pairing: the bases of one strand pair with the bases of the other strand. Adenine (A) always pairs with thymine (T), and guanine (G) always pairs with cytosine (C): A with T, G with C. For RNA, T is replaced by uracil (U). When reading the sequence of nucleotides from a single strand, the direction of reading must be specified, and this is possible by referring to the chemical bonds of the DNA backbone. There are of course only two possible directions to read a linear sequence of bases, and these are denoted as 5'-to-3' and 3'-to-5'. Importantly, the two separate strands of the DNA molecule are oriented in opposite directions. Below is the beginning of the DNA coding sequence for the human hemoglobin beta chain protein discussed earlier:
5'-GTGCACCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGGCAAGGTGAACGTG-3'
   ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3'-CACGTGGACTGAGGACTCCTCTTCAGACGGCAATGACGGGACACCCCGTTCCACTTGCAC-5'
Figure 7.2: James Watson and Francis Crick posing in front of their DNA model. The original photograph was taken in 1953, the year of discovery, and was recreated in 2003, fifty years later. Francis Crick, the man on the right, died in 2004.
It is important to realize that there are two unique DNA sequences here, and either one, or even both, can be coding. Reading from 5'-to-3', the upper sequence begins 'GTGCACCTG...', while the lower sequence ends '...CAGGTGCAC'. Here, only the upper sequence codes for the human hemoglobin beta chain, and the lower sequence is non-coding.
How is the DNA code read? Enzymes separate the two strands of DNA, and transcription occurs as the DNA sequence is copied into messenger RNA (mRNA). If the upper strand is the coding sequence, then the complementary lower strand serves as the template for constructing the mRNA sequence. The A, C, U, G nucleotides of the mRNA bind to the lower sequence and construct a single-stranded mRNA molecule containing the sequence 'GUGCACCUG...', which exactly matches the sequence of the upper, coding strand, but with T replaced by U. This mRNA is subsequently translated in the ribosome of the cell, where each nucleotide triplet codes for a single amino acid. The triplet coding of nucleotides for amino acids is the famous genetic code, shown in Fig. 7.3. Here, the translation to amino acid sequence is 'VHL...', where we have used the genetic code GUG = V, CAC = H, CUG = L. The three of the twenty amino acids used here are V = Valine, H = Histidine, and L = Leucine.
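The flow from coding DNA to protein described above can be sketched in a few lines. The codon table below is only the three-entry fragment quoted in the text; the full genetic code has 64 codons.

```python
# Sketch of the DNA -> mRNA -> protein flow for the start of the
# hemoglobin beta chain. The genetic_code dict holds only the three
# codons given in the text (GUG=V, CAC=H, CUG=L), not the full code.
complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
genetic_code = {"GUG": "V", "CAC": "H", "CUG": "L"}   # partial table

coding = "GTGCACCTG"                                  # upper, coding strand
template = "".join(complement[b] for b in coding)     # lower strand, read 3'-to-5'
mrna = coding.replace("T", "U")                       # mRNA matches coding strand, T -> U
protein = "".join(genetic_code[mrna[i:i+3]] for i in range(0, len(mrna), 3))
print(template, mrna, protein)                        # CACGTGGAC GUGCACCUG VHL
```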
7.2 Sequence alignment by brute force
One (bad) approach to sequence alignment is to align the two sequences in all possible ways, score the alignments with an assumed scoring system, and determine the highest scoring alignment. The problem with this brute-force approach is that the number of possible alignments grows exponentially with sequence length, and for sequences of even reasonable length the computation is impossible. For example, the number of ways to align two sequences of 50 characters each (a rather small alignment problem) is about 1.5 × 10^37, already an astonishingly large number. It is informative to count the number
Figure 7.3: The genetic code.
of possible alignments between two sequences, since a similar algorithm is used for sequence alignment.
Suppose we want to align two sequences. Gaps in either sequence are allowed, but a gap cannot be aligned with a gap. By way of illustration, we demonstrate the three ways that the first character of the upper-case alphabet and the lower-case alphabet may align:
A -A A-
| , || , ||
a a- -a
and the five ways in which the first two characters of the upper-case alphabet can align with the first character of the lower-case alphabet:
AB AB AB- A-B -AB
|| , || , ||| , ||| , ||| .
-a a- --a -a- a--
A recursion relation for the total number of possible alignments of a sequence of n characters with a sequence of m characters may be derived by considering the alignment of the last character. There are three possibilities, which we illustrate by assuming the nth character is 'F' and the mth character is 'd':
(1) n − 1 characters of the first sequence are already aligned with m − 1 characters of the second sequence, and the nth character of the first sequence aligns exactly with the mth character of the second sequence:
...F
||||
...d
(2) n − 1 characters of the first sequence are aligned with m characters of the second sequence, and the nth character of the first sequence aligns with a gap in the second sequence:
...F
||||
...-
(3) n characters of the first sequence are aligned with m − 1 characters of the second sequence, and a gap in the first sequence aligns with the mth character of the second sequence:
...-
||||
...d
If C(n, m) is the number of ways to align an n character sequence with an m character sequence, then, from our counting,

    C(n, m) = C(n − 1, m − 1) + C(n − 1, m) + C(n, m − 1).    (7.1)
This recursion relation requires boundary conditions. Because there is only one way to align an n > 0 character sequence against a zero character sequence (i.e., n characters against n gaps), the boundary conditions are C(n, 0) = C(0, m) = 1 for all n, m > 0. We may also add the additional boundary condition C(0, 0) = 1, obtained from the known result C(1, 1) = 3.
Using the recursion relation (7.1), we can construct the following dynamic matrix to count the number of ways to align the two five-character sequences a_1 a_2 a_3 a_4 a_5 and b_1 b_2 b_3 b_4 b_5:
          -    b_1   b_2   b_3   b_4   b_5
    -     1     1     1     1     1     1
    a_1   1     3     5     7     9    11
    a_2   1     5    13    25    41    61
    a_3   1     7    25    63   129   231
    a_4   1     9    41   129   321   681
    a_5   1    11    61   231   681  1683
The size of this dynamic matrix is 6 × 6, and for convenience we label the rows and columns starting from zero (i.e., row 0, row 1, . . . , row 5). This matrix was constructed by first writing - a_1 a_2 a_3 a_4 a_5 to the left of the matrix and - b_1 b_2 b_3 b_4 b_5 above the matrix, then filling in ones across the zeroth row and down the zeroth column to satisfy the boundary conditions, and finally applying the recursion relation directly by going across the first row from left-to-right, the second row from left-to-right, etc. To demonstrate the filling in of the matrix, we have across the first row: 1 + 1 + 1 = 3, 1 + 1 + 3 = 5, 1 + 1 + 5 = 7, etc., and across the second row: 1 + 3 + 1 = 5, 3 + 5 + 5 = 13, 5 + 7 + 13 = 25, etc. Finally, the last element entered gives the number of ways to align two five character sequences: 1683, already a remarkably large number.
It is possible to solve analytically the recursion relation (7.1) for C(n, m) using generating functions. Although the solution method is interesting (and in fact was shown to me by a student), the final analytical result is messy and we omit it here. In general, computation of C(n, m) is best done numerically by constructing the dynamic matrix.
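The dynamic-matrix computation of C(n, m) takes only a few lines of code; a minimal sketch (the function name `count_alignments` is illustrative):

```python
# Dynamic-matrix computation of C(n, m), the number of possible alignments,
# from the recursion (7.1) with boundary conditions C(i, 0) = C(0, j) = 1.
def count_alignments(n, m):
    C = [[1] * (m + 1) for _ in range(n + 1)]   # zeroth row and column of ones
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            C[i][j] = C[i-1][j-1] + C[i-1][j] + C[i][j-1]
    return C[n][m]

print(count_alignments(1, 1))   # 3, the three ways shown above
print(count_alignments(2, 1))   # 5
print(count_alignments(5, 5))   # 1683, as in the dynamic matrix
```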
7.3 Sequence alignment by dynamic programming
Two reasonably sized sequences cannot be aligned by brute force. Luckily, there is another algorithm, borrowed from computer science, called dynamic programming, that makes use of a dynamic matrix.
What is needed is a scoring system to judge the quality of an alignment. The goal is to find the alignment that has the maximum score. We assume that the alignment of character a_i with character b_j has the score σ(a_i, b_j). For example, when aligning two DNA sequences, a match (A-A, C-C, T-T, G-G) may be scored as +2, and a mismatch (A-C, A-T, A-G, etc.) scored as −1. We also assume that an indel (a nucleotide aligned with a gap) is scored as g, with a typical value for DNA alignment being g = −2. In the next section, we develop a better and more widely used model for indel scoring that distinguishes gap openings from gap extensions.
Now, let S(i, j) denote the maximum score for aligning a sequence of length i with a sequence of length j. We can compute S(i, j) provided we know S(i−1, j−1), S(i−1, j) and S(i, j−1). Indeed, our logic is similar to that used when counting the total number of alignments. There are again three ways to compute S(i, j): (1) i−1 characters of the first sequence are aligned with j−1 characters of the second sequence with maximum score S(i−1, j−1), and the ith character of the first sequence aligns with the jth character of the second sequence with updated maximum score S(i−1, j−1) + σ(a_i, b_j); (2) i−1 characters of the first sequence are aligned with j characters of the second sequence with maximum score S(i−1, j), and the ith character of the first sequence aligns with a gap in the second sequence with updated maximum score S(i−1, j) + g; or (3) i characters of the first sequence are aligned with j−1 characters of the second sequence with maximum score S(i, j−1), and a gap in the first sequence aligns with the jth character of the second sequence with updated maximum score S(i, j−1) + g. We then compare these three scores and assign S(i, j) to be the maximum; that is,
    S(i, j) = max { S(i−1, j−1) + σ(a_i, b_j),
                    S(i−1, j) + g,
                    S(i, j−1) + g }.    (7.2)
Boundary conditions give the score of aligning a sequence with a null sequence of gaps, so that

    S(i, 0) = g i,    S(0, j) = g j,    for i, j > 0,    (7.3)

with S(0, 0) = 0.
The recursion (7.2), together with the boundary conditions (7.3), can be used to construct a dynamic matrix. The score of the best alignment is then given by the last filled-in element of the matrix, which for aligning a sequence of length n with a sequence of length m is S(n, m). Besides this score, however, we also want to determine the alignment itself. The alignment can be obtained by tracing back the path in the dynamic matrix that was followed to compute each matrix element S(i, j). There could be more than one path, so that the best alignment may be degenerate.
Sequence alignment is always done computationally, and there are excellent software tools freely available on the web (see Section 7.6). Just to illustrate the dynamic programming algorithm, we compute by hand the dynamic matrix for aligning two short DNA sequences, GGAT and GAATT, scoring a match as +2, a mismatch as −1 and an indel as −2:
         -     G     A     A     T     T
    -    0    -2    -4    -6    -8   -10
    G   -2     2     0    -2    -4    -6
    G   -4     0     1    -1    -3    -5
    A   -6    -2     2     3     1    -1
    T   -8    -4     0     1     5     3
In our hand calculation, the two sequences to be aligned go to the left of and above the dynamic matrix, leading with a gap character '-'. Row 0 and column 0 are then filled in with the boundary conditions, starting with 0 in position (0, 0) and incrementing by the gap penalty −2 across row 0 and down column 0. The recursion relation (7.2) is then used to fill in the dynamic matrix one row at a time, moving from left-to-right and top-to-bottom. To determine the (i, j) matrix element, three numbers must be compared and the maximum taken: (1) inspect the nucleotides to the left of row i and above column j and add +2 for a match or −1 for a mismatch to the (i−1, j−1) matrix element; (2) add −2 to the (i−1, j) matrix element; (3) add −2 to the (i, j−1) matrix element. For example, the first computed matrix element 2 at position (1, 1) was determined by taking the maximum of (1) 0 + 2 = 2, since G-G is a match; (2) −2 − 2 = −4; (3) −2 − 2 = −4. You can test your understanding of dynamic programming by computing the other matrix elements.
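The matrix-fill step just described can be sketched in code (the function name `nw_score` is illustrative; the scores are those of the hand calculation):

```python
# Needleman-Wunsch matrix fill for global alignment, with the scores used
# in the text: match +2, mismatch -1, indel -2.
def nw_score(a, b, match=2, mismatch=-1, gap=-2):
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = i * gap                  # boundary: i characters against i gaps
    for j in range(1, m + 1):
        S[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i-1] == b[j-1] else mismatch
            S[i][j] = max(S[i-1][j-1] + s, S[i-1][j] + gap, S[i][j-1] + gap)
    return S

S = nw_score("GGAT", "GAATT")
print(S[4][5])   # 3, the best global alignment score, as in the hand calculation
```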
After the matrix is constructed, the traceback algorithm that finds the best alignment starts at the bottom-right element of the matrix, here the (4, 5) matrix element with entry 3. The matrix element used to compute 3 was either at (4, 4) (horizontal move) or at (3, 4) (diagonal move). Having two possibilities implies that the best alignment is degenerate. For now, we arbitrarily choose the diagonal move. We build the alignment from end to beginning, with GGAT on top and GAATT on bottom:
T
|
T
We illustrate our current position in the dynamic matrix by eliminating all the elements that are not on the traceback path and are no longer accessible:
         -     G     A     A     T     T
    -    0    -2    -4    -6    -8
    G   -2     2     0    -2    -4
    G   -4     0     1    -1    -3
    A   -6    -2     2     3     1
    T                                  3
We start again from the 1 entry at (3, 4). This value came from the 3 entry at (3, 3) by a horizontal move. Therefore, the alignment is extended to
-T
||
TT
where a gap is inserted in the top sequence for a horizontal move. (A gap is inserted in the bottom sequence for a vertical move.) The dynamic matrix now looks like
         -     G     A     A     T     T
    -    0    -2    -4    -6
    G   -2     2     0    -2
    G   -4     0     1    -1
    A   -6    -2     2     3     1
    T                                  3
Starting again from the 3 entry at (3, 3), this value came from the 1 entry at (2, 2) in a diagonal move, extending the alignment to
A-T
|||
ATT
The dynamic matrix now looks like
         -     G     A     A     T     T
    -    0    -2    -4
    G   -2     2     0
    G   -4     0     1
    A                      3     1
    T                                  3
Continuing in this fashion (try to do this), the final alignment is
GGA-T
: : : ,
GAATT
where it is customary to represent a matching character with a colon ':'. The traceback path in the dynamic matrix is
         -     G     A     A     T     T
    -    0
    G          2
    G                1
    A                      3     1
    T                                  3
If the other degenerate path was initially taken, the final alignment would be
GGAT-
: ::
GAATT
and the traceback path would be
         -     G     A     A     T     T
    -    0
    G          2
    G                1
    A                      3
    T                            5     3
The score of both alignments is easily recalculated to be the same, with 2 − 1 + 2 − 2 + 2 = 3 and 2 − 1 + 2 + 2 − 2 = 3.
The algorithm for aligning two proteins is similar, except that the match and mismatch scores depend on the pair of aligning amino acids. With twenty different amino acids found in proteins, the score is represented by a 20 × 20 substitution matrix. The most commonly used matrices are the PAM series and BLOSUM series of matrices, with BLOSUM62 the commonly used default matrix.
7.4 Gap opening and gap extension penalties
Empirical evidence suggests that gaps cluster, in both nucleotide and protein sequences. Clustering is usually modeled by different penalties for gap opening (g_o) and gap extension (g_e), with g_o < g_e < 0. For example, the default scoring scheme for the widely used BLASTN software is +1 for a nucleotide match, −3 for a nucleotide mismatch, −5 for a gap opening, and −2 for a gap extension.
Having two types of gaps (opening and extension) complicates the dynamic programming algorithm. When an indel is added to an existing alignment, the scoring increment depends on whether the indel is a gap opening or a gap extension. For example, the extended alignment
AB AB-
|| to |||
ab abc
adds a gap opening penalty g_o to the score, whereas
A- A--
|| to |||
ab abc
adds a gap extension penalty g_e to the score. The score increment depends not only on the current aligning pair, but also on the previously aligned pair.
The final aligning pair of a sequence of length n with a sequence of length m can be one of three possibilities (top:bottom): (1) a_n : b_m; (2) a_n : −; (3) − : b_m. Only for (1) is the score increment σ(a_n, b_m) unambiguous. For (2) or (3), the score increment depends on the presence or absence of indels in the previously aligned characters. For instance, for alignments ending with a_n : −, the previously aligned character pair could be one of (i) a_{n−1} : b_m, (ii) − : b_m, (iii) a_{n−1} : −. If the previously aligned character pair was (i) or (ii), the score increment would be the gap opening penalty g_o; if it was (iii), the score increment would be the gap extension penalty g_e.
To remove the ambiguity that occurs with a single dynamic matrix, we need to compute three dynamic matrices simultaneously, with matrix elements
denoted by S(i, j), G_1(i, j) and G_2(i, j), corresponding to the three types of aligning pairs: S for alignments ending in a_i : b_j, G_1 for those ending in a_i : −, and G_2 for those ending in − : b_j. The recursion relations are

(1) a_i : b_j:

    S(i, j) = max { S(i−1, j−1) + σ(a_i, b_j),
                    G_1(i−1, j−1) + σ(a_i, b_j),
                    G_2(i−1, j−1) + σ(a_i, b_j) };    (7.4)

(2) a_i : −:

    G_1(i, j) = max { S(i−1, j) + g_o,
                      G_1(i−1, j) + g_e,
                      G_2(i−1, j) + g_o };    (7.5)

(3) − : b_j:

    G_2(i, j) = max { S(i, j−1) + g_o,
                      G_2(i, j−1) + g_e,
                      G_1(i, j−1) + g_o }.    (7.6)

To align a sequence of length n with a sequence of length m, the best alignment score is the maximum of the scores obtained from the three dynamic matrices:

    S_opt(n, m) = max { S(n, m), G_1(n, m), G_2(n, m) }.    (7.7)
The traceback algorithm to find the best alignment proceeds as before by starting with the matrix element corresponding to the best alignment score, S_opt(n, m), and tracing back to the matrix element that determined this score. The optimum alignment is then built up from last to first as before, but now switching may occur between the three dynamic matrices.
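The three-matrix recursion can be sketched as follows, using the BLASTN-style scores quoted above (match +1, mismatch −3, gap opening −5, gap extension −2). The boundary handling here (a leading gap charged one opening plus extensions) is one common convention; the recursions themselves leave it unspecified.

```python
# Sketch of the three-matrix (affine gap) recursion (7.4)-(7.7).
# S tracks alignments ending a_i : b_j, G1 those ending a_i : -, and
# G2 those ending - : b_j. Boundary convention is an assumption.
NEG = float("-inf")

def affine_score(a, b, match=1, mismatch=-3, go=-5, ge=-2):
    n, m = len(a), len(b)
    S  = [[NEG] * (m + 1) for _ in range(n + 1)]
    G1 = [[NEG] * (m + 1) for _ in range(n + 1)]
    G2 = [[NEG] * (m + 1) for _ in range(n + 1)]
    S[0][0] = 0
    for i in range(1, n + 1):
        G1[i][0] = go + (i - 1) * ge       # leading gap in the second sequence
    for j in range(1, m + 1):
        G2[0][j] = go + (j - 1) * ge       # leading gap in the first sequence
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i-1] == b[j-1] else mismatch
            S[i][j]  = max(S[i-1][j-1], G1[i-1][j-1], G2[i-1][j-1]) + s
            G1[i][j] = max(S[i-1][j] + go, G1[i-1][j] + ge, G2[i-1][j] + go)
            G2[i][j] = max(S[i][j-1] + go, G2[i][j-1] + ge, G1[i][j-1] + go)
    return max(S[n][m], G1[n][m], G2[n][m])

print(affine_score("ACG", "ACG"))    # 3: three matches, no gaps
print(affine_score("AAAA", "AA"))    # -5: two matches, one gap opening, one extension
```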
7.5 Local alignments
We have so far discussed how to align two sequences over their entire length, called a global alignment. Often, however, it is more useful to align two sequences over only part of their lengths, called a local alignment. In bioinformatics, the algorithm for global alignment is called "Needleman-Wunsch," and that for local alignment "Smith-Waterman." Local alignments are useful, for instance, when searching a long genome sequence for alignments to a short DNA segment. They are also useful when aligning two protein sequences, since proteins can consist of multiple domains, and only a single domain may align.
If for simplicity we consider a constant gap penalty g, then a local alignment can be obtained using the rule
    S(i, j) = max { 0,
                    S(i−1, j−1) + σ(a_i, b_j),
                    S(i−1, j) + g,
                    S(i, j−1) + g }.    (7.8)
After the dynamic matrix is computed using (7.8), the traceback algorithm starts at the matrix element with the highest score, and stops at the first encountered zero score.
If we apply the Smith-Waterman algorithm to locally align the two sequences GGAT and GAATT considered previously, with a match scored as +2, a mismatch as −1 and an indel as −2, the dynamic matrix is
         -     G     A     A     T     T
    -    0     0     0     0     0     0
    G    0     2     0     0     0     0
    G    0     2     1     0     0     0
    A    0     0     4     3     1     0
    T    0     0     2     3     5     3
The traceback algorithm starts at the highest score, here the 5 in matrix element (4, 4), and ends at the 0 in matrix element (0, 0). The resulting local alignment is
GGAT
: ::
GAAT
which has a score of five, larger than the previous global alignment score of three.
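The local-alignment rule (7.8) differs from the global fill only by the extra zero in the maximum and the zero boundary conditions; a minimal sketch (the function name `sw_matrix` is illustrative):

```python
# Smith-Waterman matrix fill from rule (7.8) for GGAT vs GAATT,
# with match +2, mismatch -1, indel -2 as in the text.
def sw_matrix(a, b, match=2, mismatch=-1, gap=-2):
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]   # boundary row and column stay zero
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i-1] == b[j-1] else mismatch
            H[i][j] = max(0, H[i-1][j-1] + s, H[i-1][j] + gap, H[i][j-1] + gap)
    return H

H = sw_matrix("GGAT", "GAATT")
best = max(max(row) for row in H)
print(best)   # 5, the best local alignment score, as in the hand calculation
```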
7.6 Software
If you have in hand two or more sequences that you would like to align, there is a choice of software tools available. For relatively short sequences, you can use the LALIGN program for global or local alignments:
http://embnet.vital-it.ch/software/LALIGN_form.html
For longer sequences, the BLAST software has a flavor that permits local alignment of two sequences:
http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi
Another useful software tool for global alignment of two or more long DNA sequences is PipMaker:
http://pipmaker.bx.psu.edu/pipmaker/
Multiple global alignments of protein sequences can be performed using ClustalW or T-Coffee:
http://www.clustal.org/
http://tcoffee.crg.cat/
Most users of sequence alignment software want to compare a given sequence against a database of sequences. The BLAST software is most widely used, and comes in several versions depending on the type of sequence and database search one is performing:
http://www.ncbi.nlm.nih.gov/BLAST/