the coalescent model - hits ggmbhrescaling time - exponential growth 2500 5000 7500 10000 12500...

59
The Coalescent Model Florian Weber 23. 7. 2016

Upload: others

Post on 14-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

Florian Weber

23. 7. 2016

Page 2: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

coalescent = „zusammenwachsend“

Page 3: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Outline

I Population Genetics and the Wright-Fisher-modelI The CoalescentI Non-constant population-sizesI Further extensionsI Summary

Page 4: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Population Genetics

(Shamelessly stealing Alexis’ slides)I Study of polymorphisms in a population

I What are the processes that introduce polymorphisms in thepopulation?

I If a polymorphism exists in a population, will it be there forever?

I Is there some process that removes polymorphisms from thepopulation?

I Do the polymorphisms exhibit patterns?I . . .

Page 5: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Motivation

I The coalescent is basically the Wright-Fisher-model with a lotof analysis.

I It can easily do calculations about the past

I It is very fast to compute

I Is can easily be extended to represent a more complex reality

Page 6: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Motivation

I The coalescent is basically the Wright-Fisher-model with a lotof analysis.

I It can easily do calculations about the past

I It is very fast to compute

I Is can easily be extended to represent a more complex reality

Page 7: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Motivation

I The coalescent is basically the Wright-Fisher-model with a lotof analysis.

I It can easily do calculations about the past

I It is very fast to compute

I Is can easily be extended to represent a more complex reality

Page 8: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Motivation

I The coalescent is basically the Wright-Fisher-model with a lotof analysis.

I It can easily do calculations about the past

I It is very fast to compute

I Is can easily be extended to represent a more complex reality

Page 9: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Hardy-Weinberg

I Assuming an infinite population size, random mating, diploidpopulation, no selection. . .the allele-frequencies are constant

I Infinity is weird. . . 0.3×∞ =∞I . . . and unrealistic

Page 10: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Hardy-Weinberg

I Assuming an infinite population size, random mating, diploidpopulation, no selection. . .the allele-frequencies are constant

I Infinity is weird. . . 0.3×∞ =∞I . . . and unrealistic

Page 11: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher

I Assuming a finite but constant population size, randommating, non-overlapping generations, no selection. . .all alleles except for one will disappear over time.

I The likelihood for an allele to prevail is equal to it’s initialfrequency

Page 12: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher

I Assuming a finite but constant population size, randommating, non-overlapping generations, no selection. . .all alleles except for one will disappear over time.

I The likelihood for an allele to prevail is equal to it’s initialfrequency

Page 13: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher

Figure 1: A simulation of three alleles under the model

Page 14: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher (Individuals)past

present

time

Figure 2: An evolutionary history in the model

Page 15: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher (Individuals)past

present

time

Figure 3: Extinct alleles removed

Page 16: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher (Individuals)past

present

time

Figure 4: Surviving Tree

Page 17: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher (MRCA)

MRCA

past

present

time

Figure 5: Most Recent Common Ancestor marked

Page 18: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Wright-Fisher (Coalescence-Events)

MRCA

Coalescence-Events

past

present

time

Figure 6: Coalescence-Events of the green individuals

Page 19: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

N

A B

P(B)P(A)

Figure 7: Two individuals and their parents

Page 20: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood for two nodes to coalesce in the previous generation:p(P(A) = P(B)) = 1

N

I In the previous two generations:1− (N−1

N · N−1N ) = 1−

(N−1

N

)2

I In the previous three generations:1−

((N−1

N

)2· N−1

N

)= 1−

(N−1

N

)3

I In the previous t generations1−

(N−1

N

)t

Page 21: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood for two nodes to coalesce in the previous generation:p(P(A) = P(B)) = 1

N

I In the previous two generations:1− (N−1

N · N−1N ) = 1−

(N−1

N

)2

I In the previous three generations:1−

((N−1

N

)2· N−1

N

)= 1−

(N−1

N

)3

I In the previous t generations1−

(N−1

N

)t

Page 22: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood for two nodes to coalesce in the previous generation:p(P(A) = P(B)) = 1

N

I In the previous two generations:1− (N−1

N · N−1N ) = 1−

(N−1

N

)2

I In the previous three generations:1−

((N−1

N

)2· N−1

N

)= 1−

(N−1

N

)3

I In the previous t generations1−

(N−1

N

)t

Page 23: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood for two nodes to coalesce in the previous generation:p(P(A) = P(B)) = 1

N

I In the previous two generations:1− (N−1

N · N−1N ) = 1−

(N−1

N

)2

I In the previous three generations:1−

((N−1

N

)2· N−1

N

)= 1−

(N−1

N

)3

I In the previous t generations1−

(N−1

N

)t

Page 24: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

Gen 0

Gen 1

Gen 2

Gen 3

1*

1*

1*

* without loss of generality

1/N

1/N

1/N

(N-1) / N

(N-1) / N

(N-1) / N

Figure 8: Likelihood of coalescence

Page 25: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood of coalescence in the previous t generations:

1−(N − 1

N

)t

I Likelihood for lineages to remain distinct for t generations:(N − 1N

)t

I Expected time for coalescence: E (t) = N

I Rescale: τ = tN : (N − 1

N

)dNτe−−−−→N→∞

e−τ

Page 26: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood of coalescence in the previous t generations:

1−(N − 1

N

)t

I Likelihood for lineages to remain distinct for t generations:(N − 1N

)t

I Expected time for coalescence: E (t) = N

I Rescale: τ = tN : (N − 1

N

)dNτe−−−−→N→∞

e−τ

Page 27: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood of coalescence in the previous t generations:

1−(N − 1

N

)t

I Likelihood for lineages to remain distinct for t generations:(N − 1N

)t

I Expected time for coalescence: E (t) = N

I Rescale: τ = tN : (N − 1

N

)dNτe−−−−→N→∞

e−τ

Page 28: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

I Likelihood of coalescence in the previous t generations:

1−(N − 1

N

)t

I Likelihood for lineages to remain distinct for t generations:(N − 1N

)t

I Expected time for coalescence: E (t) = N

I Rescale: τ = tN : (N − 1

N

)dNτe−−−−→N→∞

e−τ

Page 29: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

The Coalescent Model

⇒ The likelihood for two lineages to stay distinct over timeis exponentially small!

1.0

0.0

0.50.25

1 2 time t

lineage 1

lineage 2

Page 30: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Moar Lineages!!

lineages

Figure 9: http://what-if.xkcd.com/13/

Page 31: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I Likelihood of no coalescence in one generation and threelineages:

N − 1N × N − 2

N

I One generation, k lineages:

N − 1N × N − 2

N × · · · × N − k + 1N =

k−1∏i=1

N − iN

Page 32: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I Likelihood of no coalescence in one generation and threelineages:

N − 1N × N − 2

N

I One generation, k lineages:

N − 1N × N − 2

N × · · · × N − k + 1N =

k−1∏i=1

N − iN

Page 33: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I For some reason this is equal to:

k−1∏i=1

N − iN = 1−

(k2)

N + O( 1N2

)≈ 1−

(k2)

N

I(k

2)is the binomial coefficient and equates to k·(k−1)

2

I There are(k

2)ways to pick two lineages from a set of k lineages.

I Therefore a coalescence-event is(k

2)-times as likely with k

lineages than with 2

I The number of coalescence-events grows quadraticallywith the number of lineages!

Page 34: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I For some reason this is equal to:

k−1∏i=1

N − iN = 1−

(k2)

N + O( 1N2

)≈ 1−

(k2)

N

I(k

2)is the binomial coefficient and equates to k·(k−1)

2

I There are(k

2)ways to pick two lineages from a set of k lineages.

I Therefore a coalescence-event is(k

2)-times as likely with k

lineages than with 2

I The number of coalescence-events grows quadraticallywith the number of lineages!

Page 35: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I For some reason this is equal to:

k−1∏i=1

N − iN = 1−

(k2)

N + O( 1N2

)≈ 1−

(k2)

N

I(k

2)is the binomial coefficient and equates to k·(k−1)

2

I There are(k

2)ways to pick two lineages from a set of k lineages.

I Therefore a coalescence-event is(k

2)-times as likely with k

lineages than with 2

I The number of coalescence-events grows quadraticallywith the number of lineages!

Page 36: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

I For some reason this is equal to:

k−1∏i=1

N − iN = 1−

(k2)

N + O( 1N2

)≈ 1−

(k2)

N

I(k

2)is the binomial coefficient and equates to k·(k−1)

2

I There are(k

2)ways to pick two lineages from a set of k lineages.

I Therefore a coalescence-event is(k

2)-times as likely with k

lineages than with 2

I The number of coalescence-events grows quadraticallywith the number of lineages!

Page 37: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

More Lineages

time

coalescence-rateEvents gettingexponentially rare

Figure 10: More lineages = faster coalscence

Page 38: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Properties

I Few deep furcations

I Likelihood: Everything is possible but maybe unlikely

I Calculation is backward in times (Wright-Fisher: forward)

I Efficient: no calculation per individual or for extinct lineages

Page 39: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Properties

I Few deep furcations

I Likelihood: Everything is possible but maybe unlikely

I Calculation is backward in times (Wright-Fisher: forward)

I Efficient: no calculation per individual or for extinct lineages

Page 40: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Properties

I Few deep furcations

I Likelihood: Everything is possible but maybe unlikely

I Calculation is backward in times (Wright-Fisher: forward)

I Efficient: no calculation per individual or for extinct lineages

Page 41: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Properties

I Few deep furcations

I Likelihood: Everything is possible but maybe unlikely

I Calculation is backward in times (Wright-Fisher: forward)

I Efficient: no calculation per individual or for extinct lineages

Page 42: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Non-constant population-sizesWorldpopulation,billions

0

1

2

3

4

5

6

7

10,000 BC 8000 6000 4000 2000 AD 1 1000 2000

Figure 11: Wordpopulation - not very constant [Wikimedia]

Page 43: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Non-constant population-sizes

I Non-constant, but known population-sizeI Coalescence is more likely in small populationsI ⇒ Coalescence-rate changes over time

I Simply rescale time.

Page 44: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Non-constant population-sizes

I Non-constant, but known population-sizeI Coalescence is more likely in small populationsI ⇒ Coalescence-rate changes over time

I Simply rescale time.

Page 45: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time

I Before: t Generations corresponded to t/N units ofcoalescence-time

I Now: t Generations correspond to

t∑i=1

1Ni

units of coalescence-timeI Note: for a constant population both formulas are equal

Page 46: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Example

I 5 Generations, with on average 5 individuals:

I For constant 5 individuals: τ = tN = 5

5 = 1 unit of coalescencetime

I For non-constant {4, 4, 5, 6, 6} individuals:

τ =t∑

i=1

1Ni

= 14 + 1

4 + 15 + 1

6 + 16 = 31

30

note the lesser influence of the larger generations

I A generation with twice the size, will get halve thecoalescence-time

Page 47: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Example

I 5 Generations, with on average 5 individuals:

I For constant 5 individuals: τ = tN = 5

5 = 1 unit of coalescencetime

I For non-constant {4, 4, 5, 6, 6} individuals:

τ =t∑

i=1

1Ni

= 14 + 1

4 + 15 + 1

6 + 16 = 31

30

note the lesser influence of the larger generations

I A generation with twice the size, will get halve thecoalescence-time

Page 48: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Example

I 5 Generations, with on average 5 individuals:

I For constant 5 individuals: τ = tN = 5

5 = 1 unit of coalescencetime

I For non-constant {4, 4, 5, 6, 6} individuals:

τ =t∑

i=1

1Ni

= 14 + 1

4 + 15 + 1

6 + 16 = 31

30

note the lesser influence of the larger generations

I A generation with twice the size, will get halve thecoalescence-time

Page 49: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Example

I 5 Generations, with on average 5 individuals:

I For constant 5 individuals: τ = tN = 5

5 = 1 unit of coalescencetime

I For non-constant {4, 4, 5, 6, 6} individuals:

τ =t∑

i=1

1Ni

= 14 + 1

4 + 15 + 1

6 + 16 = 31

30

note the lesser influence of the larger generations

I A generation with twice the size, will get halve thecoalescence-time

Page 50: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Exponential Growth

time

population-size

Generation

1 100

2 120

3 144

4 173

5 207

6 249

7 299

8 358

9 430

10 516

Figure 12: Exponentially growing population versus coalescence-time

Page 51: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Exponential Growth

2500 5000 7500 10000 12500 15000

5

10

15

20

constant size

exponential growth

scal

ed ti

me

2500 5000 7500 10000 12500 15000

0.5

1

1.5

2

2.5

3

3.5

real time (generations)

log

N(t)

Figure 13: Exponentially growing and constant opulations. Note thereverse time-scale! [Nordborg]

Page 52: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Rescaling Time - Applicability

I Approximation converges against theory for growing NI Close enough for most purposes

Page 53: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Further Extensions

I Separated PopulationsI Diploid PopulationsI Males and FemalesI SelectionI Multiple SpeciesI . . .

Wright-Fisher:Assuming a finite but constant population size, randommating, non-overlapping generations, no selection. . .

Coalescent:Assuming non-overlapping generations. . .

Page 54: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Further Extensions

I Separated PopulationsI Diploid PopulationsI Males and FemalesI SelectionI Multiple SpeciesI . . .

Wright-Fisher:Assuming a finite but constant population size, randommating, non-overlapping generations, no selection. . .

Coalescent:Assuming non-overlapping generations. . .

Page 55: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Further Extensions

I Separated PopulationsI Diploid PopulationsI Males and FemalesI SelectionI Multiple SpeciesI . . .

Wright-Fisher:Assuming a finite but constant population size, randommating, non-overlapping generations, no selection. . .

Coalescent:Assuming non-overlapping generations. . .

Page 56: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

An actual example

Figure 14: Coalescent vs. Anthropological Estimates [Atkinson et al.]

Page 58: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

Summary

I The coalescent is the Wright-Fisher-model plus mathI Coalescent-events are, with exponential likelihood, relatively

recentI The more lineages there are, the more coalescence-events occurI Non-Constant populations can be simulated by rescaling timeI The simulated time for a generation is anti-proportional to it’s

size

Page 59: The Coalescent Model - HITS gGmbHRescaling Time - Exponential Growth 2500 5000 7500 10000 12500 15000 5 10 15 20 constant size exponential growth scaled time 2500 5000 7500 10000 12500

References

ContentI Magnus Nordborg, “Coalescent Theory”, March 2000

Software-listI en.wikipedia.org/wiki/Coalescent_theory

ImagesI Fig. 09: Randal Munroe: what-if.xkcd.com/13/I Fig. 11: El T:

commons.wikimedia.org/wiki/File:Population_curve.svgI Fig. 13: Magnus Nordborg: “Coalescent Theory”, 2000I Fig. 14: Atkinson et al.: “mtDNA variation predicts population

size in humans and reveals a major Southern Asian chapter inhuman prehistory”, 2008