Evolutionary Genetics: Part 5 Inbreeding – Structured populations S. peruvianum S. chilense Winter Semester 2012-2013 Prof Aurélien Tellier FG Populationsgenetik


Evolutionary Genetics: Part 5

Inbreeding – Structured populations

S. peruvianum

S. chilense

Winter Semester 2012-2013

Prof Aurélien Tellier, FG Populationsgenetik

Color code

• Red = important result or definition
• Purple = exercise to do
• Green = some bits of maths

Population genetics: 4 evolutionary forces

Four forces shape molecular diversity:
• random genomic processes (mutation, duplication, recombination, gene conversion)
• natural selection
• random demographic process (drift)
• random spatial process (migration)

Deviations from panmixia

• So far, the standard neutral model has assumed a Wright-Fisher population
• All individuals have the same probability of picking any given parent, and all parents have the same probability of being picked
• In reality this is not true
• Inbreeding (including selfing) = mating between closely related individuals
• Population structure = a species consists of several populations, and panmixia occurs only within populations

Hardy-Weinberg equilibrium

• The first model of panmixia = Hardy-Weinberg equilibrium
• Formulated independently by G. H. Hardy and W. Weinberg in 1908

Hardy-Weinberg equilibrium

• The assumptions of the model are:
  • organisms are diploid
  • only sexual reproduction occurs
  • generations are non-overlapping
  • mating is random (panmixia)
  • population size is large
  • allele frequencies are equal in the sexes
  • there is no migration, mutation or selection

Hardy-Weinberg equilibrium

• With two alleles at a locus, A and a, the allele frequencies are:
  • A has frequency p
  • a has frequency q = 1 - p

                      Homozygote   Heterozygote   Homozygote
Parent genotypes      AA           Aa             aa
Gametes               A and a
Offspring genotypes   AA           Aa             aa
Frequency             $p^2$        $2pq$          $q^2$

Hardy-Weinberg equilibrium

• Demonstration of the equilibrium:
• A has frequency p, a has frequency q = 1 - p
• In the parents:
  • AA has frequency P
  • Aa has frequency H
  • aa has frequency Q

$p = P + \tfrac{1}{2}H$ and $q = Q + \tfrac{1}{2}H$

Hardy-Weinberg equilibrium

1) Frequency of the matings (rows: male genotypes, columns: female genotypes)

           AA (P)    Aa (H)    aa (Q)
AA (P)     $P^2$     $PH$      $PQ$
Aa (H)     $PH$      $H^2$     $HQ$
aa (Q)     $PQ$      $HQ$      $Q^2$

Hardy-Weinberg equilibrium

2) Then, for each mating, what are the progeny?

Mating      Freq. of mating   Progeny AA          Progeny Aa          Progeny aa
AA × AA     $P^2$             $P^2$
AA × Aa     $2PH$             $PH$                $PH$
AA × aa     $2PQ$                                 $2PQ$
Aa × Aa     $H^2$             $\tfrac{1}{4}H^2$   $\tfrac{1}{2}H^2$   $\tfrac{1}{4}H^2$
Aa × aa     $2HQ$                                 $HQ$                $HQ$
aa × aa     $Q^2$                                                     $Q^2$

Column totals (progeny genotypes):
  AA: $(P + \tfrac{1}{2}H)^2 = p^2$
  Aa: $2(P + \tfrac{1}{2}H)(Q + \tfrac{1}{2}H) = 2pq$
  aa: $(Q + \tfrac{1}{2}H)^2 = q^2$

Hardy-Weinberg equilibrium

• Numerical example, starting from genotype frequencies (AA, Aa, aa) that are not at HW proportions:
  • P = 0.2
  • H = 0.4
  • Q = 0.4
  so that p = P + H/2 = 0.4 and q = Q + H/2 = 0.6
• In the next generation:
  • $p^2 = (0.4)^2 = 0.16$
  • $2pq = 2 \times 0.4 \times 0.6 = 0.48$
  • $q^2 = (0.6)^2 = 0.36$
• These are the equilibrium frequencies
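The mating-table argument and the numerical example can be checked with a short Python sketch. The function names `allele_freqs` and `next_generation` are illustrative assumptions, not part of the lecture. Each progeny frequency is the sum over matings of (frequency of the mating) × (Mendelian fraction of that offspring genotype).

```python
def allele_freqs(P, H, Q):
    """Allele frequencies (p, q) from genotype frequencies of AA, Aa, aa."""
    return P + H / 2, Q + H / 2


def next_generation(P, H, Q):
    """Progeny genotype frequencies after one round of random mating.

    Terms follow the mating table: e.g. AA x Aa matings occur with
    frequency 2PH and give half AA and half Aa offspring.
    """
    AA = P * P + P * H + H * H / 4
    Aa = P * H + 2 * P * Q + H * H / 2 + H * Q
    aa = H * H / 4 + H * Q + Q * Q
    return AA, Aa, aa


# Start away from Hardy-Weinberg proportions (values from the slide).
P, H, Q = 0.2, 0.4, 0.4
p, q = allele_freqs(P, H, Q)
gen1 = next_generation(P, H, Q)
gen2 = next_generation(*gen1)

print("allele frequencies:", p, q)
print("generation 1:", gen1)
print("generation 2:", gen2)
```

Generation 1 already sits at the proportions p², 2pq, q², and generation 2 leaves them unchanged, which is the "one generation" property of the equilibrium.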

Hardy-Weinberg equilibrium

• Numerical example: when frequencies are unknown…

Example: 50 individuals (n = 50, 2n = 100)

p = (2 × 15 + 25) / 100 = 0.55
q = (2 × 10 + 25) / 100 = 0.45

                                  AA                Aa                      aa
Observed number of individuals    15                25                      10
Expected formula                  $np^2$            $2pqn$                  $nq^2$
Calculation                       50 × (0.55)^2     2(0.55)(0.45) × 50      50 × (0.45)^2
Expected number of individuals    15.125            24.75                   10.125

Hardy-Weinberg equilibrium

• HW equilibrium means that if genotype frequencies are perturbed, they go back to the equilibrium values within ONE generation of random mating
• Drift, for example, affects the allele frequencies if the population is small
• It is a useful neutral model: it predicts the genotype frequencies expected if a population follows HW equilibrium
• If the data do not fit, WHY???
• One can use a χ2 test to test statistically for deviation from equilibrium
  Data = frequencies or numbers (please look at a textbook)

$\chi^2 = \sum \frac{(\text{Observed data} - \text{Expected data})^2}{\text{Expected data}}$
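As an illustration, the χ2 statistic can be computed for the 50-individual example (15 AA, 25 Aa, 10 aa). This is a minimal sketch; the function name `hw_chi_square` is an assumption made here. With two alleles the test is usually taken to have 1 degree of freedom (3 genotype classes, minus 1, minus 1 estimated allele frequency), for which the 5% critical value is about 3.84.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Allele frequencies are estimated from the observed genotype counts.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = [n_AA, n_Aa, n_aa]
    expected = [n * p * p, 2 * p * q * n, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


p, expected, chi2 = hw_chi_square(15, 25, 10)
print("p =", p)
print("expected counts:", expected)
print("chi-square:", chi2)
print("reject HW at 5%?", chi2 > 3.84)
```

For these counts the statistic is far below the critical value, so the sample gives no evidence against HW equilibrium.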

Hardy-Weinberg equilibrium

• HW equilibrium means that if frequencies change, they go back to the equilibrium value within ONE generation of random mating
• Inbreeding and population structure create deviations from the HW equilibrium values
• BUT only factors acting at present can create observable deviations from equilibrium values (= NO memory), as frequencies go back to equilibrium in one generation
• The software packages STRUCTURE (and INSTRUCT) use HW equilibrium to detect population structure


Heterozygosity

• We are interested in heterozygosity in populations
• How do we detect it in data?
• When sequencing one individual, SNPs are found at the heterozygous positions
• Example: chr1 AAA
           chr2 AAT
• Problem: the phase is not known when sequencing
• For microsatellites: alleles of different sizes in one individual

Heterozygosity

• Heterozygosity H = 1 - proportion of homozygotes (Nei):

$H = 1 - \sum_{i=1}^{k} p_i^2$

• k is the number of alleles, and i indexes the alleles
• The maximum of H is reached when all alleles have the same frequency = 1/k
• For k = 2, Hmax = 0.5
• Note that this definition is close to that from the WF model or the coalescent
• H = 1 - probability that two individuals are identical by state
• Identical by descent and no mutation => identical by state
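A minimal Python sketch of Nei's heterozygosity, with hypothetical function names, shows the formula and its maximum at equal allele frequencies:

```python
def heterozygosity(freqs):
    """H = 1 - sum of squared allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def max_heterozygosity(k):
    """Maximum of H for k alleles, reached when every frequency is 1/k."""
    return 1.0 - 1.0 / k


print(heterozygosity([0.5, 0.5]))   # two equally frequent alleles
print(heterozygosity([0.9, 0.1]))   # one common and one rare allele
print(max_heterozygosity(4))        # upper bound with four alleles
```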


Inbreeding

Inbreeding

• Inbreeding = a form of non-random mating; it occurs in populations where mating individuals are more closely related than individuals drawn by chance from the population
• Extreme form = self-fertilization (= selfing), which occurs in plants, fungi, crustaceans, aphids, …
• Inbreeding decreases heterozygosity

Inbreeding

• A simple model, in a diploid population:
  • Random mating
  • Selfing with probability $p_s$
• Calculate the probability that the two alleles in a single offspring individual derive from the same allele in the parent generation
• This is the inbreeding coefficient f
  • Both alleles come from the same diploid parent (with probability $p_s$)
  • AND both copies come from the same parental allele (with probability ½)

$f = \frac{p_s}{2}$

Inbreeding

• What is the probability of identity by descent of two homologous alleles picked in the offspring generation?
• = average single-generation probability of coalescence ($p_{c,1}$)

$p_{c,1} = \frac{1}{2N-1} f + \frac{2N-2}{2N-1} \cdot \frac{1}{2N}$

• First term: probability $1/(2N-1)$ that the two alleles are in one offspring individual, in which case they have probability f of being the same (from above)
• Second term: probability $1 - 1/(2N-1)$ that the two alleles are in different offspring individuals, in which case they have probability $1/(2N)$ of coalescing

Inbreeding

• For large N, the single-generation probability of coalescence simplifies to:

$p_{c,1} \simeq \frac{1}{2N} f + \frac{1}{2N}$

• Remember that the identity-by-descent effective population size Ne is:

$N_e = \frac{1}{2 P[T_2 = 1]} = \frac{1}{2 p_{c,1}}$

Inbreeding

• Substituting the probability of identity by descent $p_{c,1}$ into the definition of Ne:

$\frac{1}{2N_e} = \frac{1}{2N} f + \frac{1}{2N} \Rightarrow N_e = \frac{N}{1+f}$

• As 0 < f < 1 => inbreeding reduces the effective population size!
• If f = 0, Ne = N, following the Wright-Fisher model
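The chain from selfing rate to effective size can be sketched in Python. The function names are illustrative assumptions; the formulas are f = p_s/2, the exact single-generation coalescence probability, its large-N approximation (1 + f)/(2N), and Ne = N/(1 + f).

```python
def inbreeding_from_selfing(p_s):
    """Inbreeding coefficient f for a selfing probability p_s."""
    return p_s / 2


def coalescence_prob_exact(N, f):
    """Exact single-generation probability of coalescence p_c,1."""
    same_individual = 1 / (2 * N - 1)
    return same_individual * f + (1 - same_individual) * (1 / (2 * N))


def coalescence_prob_approx(N, f):
    """Large-N approximation: f/(2N) + 1/(2N)."""
    return (1 + f) / (2 * N)


def effective_size(N, f):
    """Identity-by-descent effective size Ne = N / (1 + f)."""
    return N / (1 + f)


N = 1000
for p_s in (0.0, 0.5, 1.0):
    f = inbreeding_from_selfing(p_s)
    print(
        f"p_s={p_s}  f={f}  Ne={effective_size(N, f):.1f}  "
        f"exact={coalescence_prob_exact(N, f):.6e}  "
        f"approx={coalescence_prob_approx(N, f):.6e}"
    )
```

With complete selfing (p_s = 1, f = 0.5) the effective size drops to two thirds of N, and the exact and approximate coalescence probabilities are already very close for N = 1000.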

Inbreeding

$N_e = \frac{N}{1+f}$

• This means that inbreeding enhances the effect of genetic drift
• Under inbreeding (selfing), alleles get fixed or lost faster

Inbreeding

• The inbreeding coefficient is defined as the probability of identity by descent, but from data we can only infer identity by state (alleles are identical or not)
• Identity by state = homozygosity
• We can then define heterozygosity at two levels (S. Wright):
  • Total heterozygosity HT = probability that two randomly chosen alleles in the population are different
  • Individual heterozygosity HI = probability that two alleles from a single randomly chosen individual are different

Inbreeding

• With the total heterozygosity HT and the individual heterozygosity HI defined as above, one obtains:

$f = \frac{H_T - H_I}{H_T}$

• f measures the proportion of variation between alleles that is due to differences among individuals

Inbreeding coefficient - Proof

• In a population with two alleles A and a, at frequencies p and q respectively
• Expectation under HW equilibrium: HT = 2pq and HI = 2pq
• If not, the inbreeding coefficient f measures the difference
• Pick one individual and check its genotype (G):
  • With probability f, the 2 alleles are identical by descent (and then also by state, if we assume no recent mutation)
  • With probability 1 - f, the 2 alleles are not identical by descent, and they are different by state with probability 2pq

$P[G = AA] = pf + p^2(1-f) = p^2 + fpq$

$P[G = aa] = q^2 + fpq$

$P[G = Aa] = 1 - p^2 - q^2 - 2fpq = 2pq(1-f)$

$\Rightarrow f = \frac{2pq - P[G = Aa]}{2pq} = \frac{H_T - H_I}{H_T}$
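The genotype probabilities under inbreeding, and the recovery of f from the heterozygote deficit, can be checked numerically. This is a sketch with hypothetical function names and example values p = 0.6, f = 0.25 chosen here for illustration.

```python
def genotype_freqs(p, f):
    """Genotype frequencies for allele frequency p and inbreeding coefficient f."""
    q = 1 - p
    return {
        "AA": p * p + f * p * q,
        "Aa": 2 * p * q * (1 - f),
        "aa": q * q + f * p * q,
    }


def estimate_f(p, observed_het):
    """f = (H_T - H_I) / H_T with H_T = 2pq and H_I = observed heterozygosity."""
    expected_het = 2 * p * (1 - p)
    return (expected_het - observed_het) / expected_het


g = genotype_freqs(0.6, 0.25)
print(g)
print("sum of genotype frequencies:", sum(g.values()))
print("f recovered from heterozygotes:", estimate_f(0.6, g["Aa"]))
```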

Inbreeding - exercise

• In the absence of new genetic variation contributed by mutation or migration, f increases over time in finite populations.
  Drift causes some alleles to be lost while others increase in frequency. As a result, individuals in later generations have a greater and greater probability of carrying the same ancestral alleles.
• Using Populus (Mendelian genetics – Inbreeding model), look at the evolution of the coefficient f in finite populations of various sizes
• After how many generations does f = 1? How does it depend on N?
  • Ft = theoretical expectation for f
  • From simulations of drift:
    • Fa = actual homozygosity of individuals = 1 - HI
    • Ff = homozygosity for the entire population = 1 - HT
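For comparison with the Populus output, here is a minimal sketch of a theoretical expectation for f. It assumes the standard recursion for an ideal Wright-Fisher population of N diploids, F_t = 1/(2N) + (1 - 1/(2N)) F_{t-1} with F_0 = 0; whether Populus computes Ft in exactly this way is an assumption. Under this recursion the expectation only approaches 1 asymptotically, so the sketch reports the time to exceed a threshold (0.99 by default, an arbitrary choice).

```python
def expected_f(N, generations, f0=0.0):
    """Trajectory of the expected inbreeding coefficient under pure drift."""
    trajectory = [f0]
    for _ in range(generations):
        previous = trajectory[-1]
        trajectory.append(1 / (2 * N) + (1 - 1 / (2 * N)) * previous)
    return trajectory


def generations_to_reach(N, threshold=0.99):
    """Number of generations until the expected f first reaches the threshold."""
    f, t = 0.0, 0
    while f < threshold:
        f = 1 / (2 * N) + (1 - 1 / (2 * N)) * f
        t += 1
    return t


for N in (10, 50, 100):
    print(N, generations_to_reach(N))
```

The waiting time grows roughly in proportion to N, which is the dependence the exercise asks about.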


Structured populations

Population structure - definitions

• Most species are found in several populations, also called demes or subpopulations
• A population is defined as a panmictic unit
• Gene flow = migration = movement of individuals or gametes between populations that results in genetic exchange between populations
• Counts of observed migrants (birds, insects, mammals, …) overestimate gene flow, because most migrants do not reproduce!
• A migrant must reproduce in the population to generate gene flow
• We are interested in so-called effective migration

Population structure – look at data

• Exercise 4.2
• Using DnaSP and the data set TNFSF5-humans.nex
  • 1) Define the populations: Data => define sequence sets
  • 2) Analysis => DNA divergence between populations
• This gives you a basic idea of the differences between populations
• Compare Africa to the three others. Which one is the most different from Africa?
• How do we learn more?

Population structure – fixation indices

• Wright, and later Nei, derived three fixation indices, based on two alleles per locus (biallelic)
  • HS = probability for two random alleles from the same subpopulation to be different by state
  • HT = probability for two random alleles from the total population to be different by state
  • HI = probability for two random alleles from one random individual to be different by state
• One can partition the variance in allele frequencies at these different levels:

$F_{IS} = \frac{H_S - H_I}{H_S}$

$F_{IT} = \frac{H_T - H_I}{H_T}$

$F_{ST} = \frac{H_T - H_S}{H_T}$

Population structure – fixation indices

$F_{IS} = \frac{H_S - H_I}{H_S}$

Measures departure from HW expectations within a deme. Values between -1 and 1.

$F_{IT} = \frac{H_T - H_I}{H_T}$

Measures departure from HW expectations between individuals and the total population.

Population structure – fixation indices

$F_{ST} = \frac{H_T - H_S}{H_T}$

Measures departure from HW expectations when comparing subpopulations and the total population. Values between 0 and 1.

• Drift generates fixation of alleles in populations
• FST measures the difference between subpopulations in the process of allele fixation under the influence of drift

[Figure: schematic of a population subdivision and the effect of drift in fixing alleles in populations]
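The three fixation indices can be computed together from HI, HS and HT. The sketch below uses a hypothetical function name and made-up heterozygosity values; it also checks the identity (1 - FIT) = (1 - FIS)(1 - FST), which follows directly from the three definitions because HI/HT = (HI/HS)(HS/HT).

```python
def f_statistics(H_I, H_S, H_T):
    """Return (F_IS, F_IT, F_ST) from the three heterozygosity levels."""
    F_IS = (H_S - H_I) / H_S
    F_IT = (H_T - H_I) / H_T
    F_ST = (H_T - H_S) / H_T
    return F_IS, F_IT, F_ST


# Hypothetical values, chosen only for illustration.
H_I, H_S, H_T = 0.30, 0.40, 0.48
F_IS, F_IT, F_ST = f_statistics(H_I, H_S, H_T)

print("F_IS =", F_IS)
print("F_IT =", F_IT)
print("F_ST =", F_ST)
print("1 - F_IT            =", 1 - F_IT)
print("(1 - F_IS)(1 - F_ST) =", (1 - F_IS) * (1 - F_ST))
```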

Population structure – data analysis statistics

• These can be extended to different types of data
• For allelic variants (microsatellites): GST
• Based on the gene (or allelic) diversity measures HT and HS

$G_{ST} = \frac{H_T - H_S}{H_T}$

• This measure does not check whether the alleles are the same in the subpopulations and the total population
• It compares the amount of diversity at the subpopulation level to that of the total population (see later)

Population structure – data analysis statistics

• These can be extended to different types of data
• For sequence data: KST, an analog of FST
• It can be calculated per locus
• But you can also calculate FST per SNP, as a SNP is a biallelic locus

$F_{ST\_Hudson} = K_{ST} = \frac{\pi_T - \pi_S}{\pi_T}$

• $\pi_S$ = average number of differences between sequences within a population (pairwise comparison, intrapop)
• $\pi_T$ = average number of differences between all sequences (pairwise comparison, interpop)

For comparison, the heterozygosity-based version:

$H = 1 - \sum_{i=1}^{k} p_i^2$

$F_{ST} = \frac{H_T - H_S}{H_T}$

Population structure – look at data

• Exercise 4.3
• Using DnaSP and the data set TNFSF5-humans.nex
  • Analysis => Gene flow and Genetic differentiation
• This gives you a basic idea of the differences between populations
• Compare Africa to all three others.
• Look at the values of FST, GST
• Does this fit with your previous observations?

Population structure – look at data

• Exercise 4.3
• Hand calculations for one locus = 1 SNP, using different formulae for FST

Pop 1: A A T T T
Pop 2: A A T A A

$F_{ST} = \frac{H_T - H_S}{H_T}$

$F_{ST\_Hudson} = 1 - \frac{\pi_S}{\pi_T}$

Population structure – look at data

• Exercise 4.3
• Hand calculations of different formulae of GST, FST: results for the two populations of five sequences each

$F_{ST} = \frac{H_T - H_S}{H_T} = \frac{8}{48}$

$F_{ST\_Hudson} = 1 - \frac{\pi_S}{\pi_T} = \frac{3}{48}$

• The two values differ because π assumes that both sequences are from different individuals, but heterozygosity does not assume that
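The hand calculation can be reproduced in a few lines of Python. The allele layout used here (Pop 1 = A A T T T, Pop 2 = A A T A A) is a reconstruction of the slide figure and should be treated as an assumption; it was chosen because it yields the stated results 8/48 and 3/48. The function names are illustrative. Heterozygosity samples two alleles with replacement, whereas π compares only distinct sequences, which is the source of the difference.

```python
from itertools import combinations


def heterozygosity(seqs):
    """H = 1 - sum of squared allele frequencies (sampling with replacement)."""
    n = len(seqs)
    return 1 - sum((seqs.count(a) / n) ** 2 for a in set(seqs))


def pairwise_diff(seqs):
    """Average number of differences over all pairs of distinct sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


pop1 = list("AATTT")  # assumed layout of Pop 1
pop2 = list("AATAA")  # assumed layout of Pop 2
total = pop1 + pop2

H_S = (heterozygosity(pop1) + heterozygosity(pop2)) / 2
H_T = heterozygosity(total)
F_ST = (H_T - H_S) / H_T

pi_S = (pairwise_diff(pop1) + pairwise_diff(pop2)) / 2
pi_T = pairwise_diff(total)
F_ST_hudson = 1 - pi_S / pi_T

print("H_S, H_T, F_ST:", H_S, H_T, F_ST)
print("pi_S, pi_T, F_ST_Hudson:", pi_S, pi_T, F_ST_hudson)
```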

Population structure – look at data

• Exercise: hand calculations of different formulae of GST, FST
• Which species have populations that are genetically differentiated?
• Can you calculate GST? What do you see?

(Jost, Molecular Ecology 2008)

Population structure – look at data

• Be careful: GST does not measure differentiation when the diversity within populations (HS) is too high
• It can take low values despite the presence of private alleles

Population structure – look at data

• Be careful: GST does not measure differentiation when diversity is too high
• Jost has proposed another measure, D, to capture differentiation
• This is also true for SNP data when calculating GST based on haplotypes
• One should be cautious when interpreting values of FST, as it depends on the effective population size
  => FST = 0.2 can be low in one species and high in another one
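The problem with GST at high diversity can be illustrated with two demes that share no alleles at all. The sketch below is an illustration under stated assumptions: the lecture does not give the formula for D, so the expression used here, D = [(HT - HS)/(1 - HS)] · [n/(n - 1)] for n demes, is quoted from memory of Jost (2008) and should be checked against the paper. Demes are assumed to be of equal size, and the function name and allele labels are hypothetical.

```python
def gst_and_jost_d(pops):
    """Return (G_ST, D) for a list of demes.

    Each deme is a dict mapping allele name -> frequency within that deme.
    Demes are weighted equally (assumption).
    """
    n = len(pops)
    alleles = set().union(*pops)
    H_S = sum(1 - sum(f * f for f in pop.values()) for pop in pops) / n
    mean_freq = {a: sum(pop.get(a, 0.0) for pop in pops) / n for a in alleles}
    H_T = 1 - sum(f * f for f in mean_freq.values())
    G_ST = (H_T - H_S) / H_T
    D = (H_T - H_S) / (1 - H_S) * n / (n - 1)
    return G_ST, D


# Two demes, each with 10 equally frequent private alleles (no allele shared).
pop1 = {f"a{i}": 0.1 for i in range(10)}
pop2 = {f"b{i}": 0.1 for i in range(10)}

G_ST, D = gst_and_jost_d([pop1, pop2])
print("G_ST =", G_ST)
print("Jost D =", D)
```

Although the two demes are completely differentiated, GST stays close to zero because HS is already near 1, while D reaches its maximum.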

Population structure

• Even if each subpopulation is in HW equilibrium, the pooled sample always shows an excess of homozygotes relative to the HW expectation for the total population (FST > 0)
• UNLESS allele frequencies are exactly the same in all subpopulations
• This is the Wahlund effect
  • HS = probability for two random alleles from the same subpopulation to be different by state
  • HT = probability for two random alleles from the total population to be different by state
  • HI = probability for two random alleles from one random individual to be different by state

$F_{ST} = \frac{H_T - H_S}{H_T}$
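A small numerical sketch of the Wahlund effect for a biallelic locus, assuming equal-sized subpopulations that are each in HW equilibrium (the function name and the example frequencies are illustrative choices):

```python
def wahlund(p_subpops):
    """Return (H_S, H_T, F_ST) for subpopulation frequencies of allele A.

    H_S is the mean within-subpopulation heterozygosity 2p(1-p);
    H_T is the HW heterozygosity at the mean allele frequency.
    """
    d = len(p_subpops)
    H_S = sum(2 * p * (1 - p) for p in p_subpops) / d
    p_bar = sum(p_subpops) / d
    H_T = 2 * p_bar * (1 - p_bar)
    return H_S, H_T, (H_T - H_S) / H_T


print(wahlund([0.2, 0.8]))  # divergent allele frequencies
print(wahlund([0.5, 0.5]))  # identical allele frequencies
```

With different allele frequencies HS falls below HT, so heterozygotes are in deficit in the pooled sample; with identical frequencies the deficit vanishes and FST = 0.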


Coalescence in structured populations

If coalescence and migration both have small probabilities, they cannot occur in the same generation (= low migration limit)

Coalescence in structured populations

Using msms to look at the frequency spectrum for different migration rates

Population structure models

• Island model (S. Wright): symmetric migration between all subpopulations
  • d demes, all demes exchangeable and equivalent, each of size N
  • Symmetric migration rate m (proportion of migrants in each deme, 0 < m < 1)
• For low migration, we can calculate:
  • TS = mean time of coalescence of a pair of alleles sampled from the same deme
  • TB = mean time of coalescence of alleles from different demes
• A migrating allele comes from any given other deme with probability 1/(d-1)

Population structure models

• We find (Charlesworth):

$T_S \approx 2dN$

$T_B \approx 2dN + \frac{d-1}{2m}$

• Which one is bigger? Why?
• We can also calculate:
  • TT = mean time of coalescence of two alleles drawn randomly from the whole population

$T_T \approx 2dN + \frac{(d-1)^2}{2dm}$

• Why does the time of coalescence in the whole population depend on m?

Population structure models

• From TT we can define the effective size of the total population (the approximation holds for large d):

$N_e = \frac{T_T}{2} \approx dN + \frac{d}{4m}$

• Let's compare this to a single panmictic population with dN individuals
• WHY is it bigger?
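The island-model expressions can be explored numerically. This sketch uses the low-migration approximations T_S ≈ 2dN, T_B ≈ 2dN + (d-1)/(2m) and T_T ≈ 2dN + (d-1)²/(2dm), with Ne = T_T/2; the function names and the parameter values d = 10, N = 100 are assumptions made for illustration.

```python
def island_coalescence_times(d, N, m):
    """Mean pairwise coalescence times (T_S, T_B, T_T) in the island model."""
    T_S = 2 * d * N
    T_B = 2 * d * N + (d - 1) / (2 * m)
    T_T = 2 * d * N + (d - 1) ** 2 / (2 * d * m)
    return T_S, T_B, T_T


def effective_size_total(d, N, m):
    """Effective size of the whole population, Ne = T_T / 2."""
    return island_coalescence_times(d, N, m)[2] / 2


d, N = 10, 100
for m in (0.001, 0.01, 0.1):
    T_S, T_B, T_T = island_coalescence_times(d, N, m)
    print(
        f"m={m}: T_S={T_S}, T_B={T_B:.1f}, T_T={T_T:.1f}, "
        f"Ne={effective_size_total(d, N, m):.1f} vs dN={d * N}"
    )
```

T_T is the average of T_S and T_B weighted by the chance 1/d that two random alleles come from the same deme, so it lies between them, and Ne exceeds dN by an amount that grows as m gets smaller.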

Population structure

• Connect FST with migration (Wright, Slatkin, Takahata)
• We can define a coalescent version of FST:

$F_{ST} = \frac{T_T - T_S}{T_T}$

• In an island model (with all its restrictive assumptions):

$F_{ST} \approx \frac{1}{1 + \frac{4Nmd^2}{(d-1)^2}} \approx \frac{1}{1 + 4Nm}$

• FST depends only on Nm, the number of effective migrants, and decreases as Nm increases!!!
• If Nm < 1 then FST > 0.2 and there is some difference between populations
• If Nm > 2 then FST < 0.11 (roughly 0.1) and there is little difference between populations
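The relation between FST and the number of effective migrants can be tabulated with a short sketch, assuming the island-model approximation F_ST ≈ 1/(1 + 4Nm d²/(d-1)²), which reduces to 1/(1 + 4Nm) for many demes. The function name `fst_island` and the chosen Nm values are illustrative.

```python
def fst_island(Nm, d=None):
    """Approximate F_ST in the island model.

    With d=None the many-deme limit 1 / (1 + 4Nm) is used;
    otherwise the finite-d correction d^2 / (d - 1)^2 is applied.
    """
    if d is None:
        return 1 / (1 + 4 * Nm)
    return 1 / (1 + 4 * Nm * d ** 2 / (d - 1) ** 2)


for Nm in (0.1, 0.5, 1, 2, 5, 10):
    print(Nm, round(fst_island(Nm), 3), round(fst_island(Nm, d=5), 3))
```

FST falls steeply as Nm rises: it equals 0.2 at Nm = 1 and 1/9 at Nm = 2, and reaches 0.1 at Nm = 2.25 in the many-deme limit.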

Coalescence in structured populations

If migration is high, the structured population behaves almost like a single big panmictic population of size dN (= strong migration limit)


Other population structure models

• Continental island
• One main deme, no drift in the big island

Other population structure models

• Stepping-stone model
• Demes are connected only to their neighbours


Coalescence in metapopulation

Most (all?) plant species exhibit metapopulation structure

• Metapopulation = “a population that is subdivided into a large number of demes, each of which is subject to extinction and recolonization and receives migrants from other demes” (Wakeley and Aliacar 2001 Genetics)

Image credits: Hanski, Laine; © T. Städler

Population structure models

• Metapopulation model (Charlesworth, Wakeley)
• Many demes d, with extinction-recolonization and migration
• The sample size n << d
• Extinction-recolonization affects the genetic diversity and the time of coalescence
• Complex formulae (see the books by Charlesworth and by Wakeley)

Coalescence in metapopulation: more complex

• Two phases: collecting (long) and scattering (short) (Wakeley and Aliacar 2001 Genetics)
• The genealogy depends on the number of demes (n) and the migration rate (M)

[Figure: genealogy across Deme 1 to Deme 4, with a time axis from past to present and the collecting phase and scattering phase marked]

Species wide sampling

• 1 individual per deme, over the species range = reflects the species-wide evolution
  (Wakeley and Aliacar 2001 Genetics, Pannell 2003 Evolution, Städler et al. 2009 Genetics)

[Figure: genealogy across Deme 1 to Deme 4, with a time axis from past to present and the collecting phase and scattering phase marked]

Population sample

• Several individuals per deme, few populations = reflects the local evolution

[Figure: genealogy across Deme 1 to Deme 4, with a time axis from past to present and the collecting phase and scattering phase marked]

Study of structured populations

[Figure: four demes, Deme 1 to Deme 4]

• When studying structured populations, be careful with the sampling and the markers used
• How was the sampling done? Why? Which information will you get from these samples?
• Calculate FST, but also Jost's D, to compare. Can you interpret migration from FST values?
• Remember: each locus has its own coalescent tree!!!