lecture 4 - biostatistics › sites › default › files › ... · 2020-01-03 · 1 lecture 4...
TRANSCRIPT
1
Lecture 4Short-Term Selection
Response: Breeder’s equation
Bruce Walsh lecture notesIntroduction to Quantitative Genetics
SISG, Seattle16 – 18 July 2018
2
Response to Selection
• Selection can change the distribution of phenotypes, and we typically measure this by changes in mean– This is a within-generation change
• Selection can also change the distribution of breeding values– This is the response to selection, the change in
the trait in the next generation (the between-generation change)
3
The Selection Differential and the Response to Selection
• The selection differential S measures the within-generation change in the mean– S = µ* - µ
• The response R is the between-generation change in the mean– R(t) = µ(t+1) - µ(t)
4
Parental Generation
Offspring Generation
Truncation selectionUppermost fraction
p saved
µp µ*S
µo
R
5
The Breeders’ Equation: Translating S into R
Recall the regression of offspring value on midparent value
Averaging over the selected midparents,E[ (Pf + Pm)/2 ] = µ*,
E[ yo - µ ] = h2 ( µ* - µ ) = h2 S
Likewise, averaging over the regression gives
Since E[ yo - µ ] is the change in the offspring mean, it represents the response to selection, giving:
R = h2 S The Breeders’ Equation (Jay Lush)
6
• Note that no matter how strong S, if h2 is small, the response is small
• S is a measure of selection, R the actual response. One can get lots of selection but no response
• If offspring are asexual clones of their parents, the breeders’ equation becomes – R = H2 S
• If males and females subjected to differing amounts of selection,– S = (Sf + Sm)/2– Example: Selection on seed number in plants -- pollination
(males) is random, so that S = Sf/2
7
Pollen control• Recall that S = (Sf + Sm)/2• An issue that arises in plant breeding is pollen
control --- is the pollen from plants that have also been selected?
• Not the case for traits (i.e., yield) scored after pollination. In this case, Sm = 0, so response only half that with pollen control
• Tradeoff: with an additional generation, a number of schemes can give pollen control, and hence twice the response– However, takes twice as many generations, so
response per generation the same
8
Selection on clones• Although we have framed response in an outcrossed
population, we can also consider selecting the best individual clones from a large population of different clones (e.g., inbred lines)
• R = H2S, now a function of the board sense heritability. Since H2 > h2, the single-generation response using clones exceeds that using outcrossed individuals
• However, the genetic variation in the next generation is significantly reduced, reducing response in subsequent generations– In contrast, expect an almost continual response for several
generations in an outcrossed population.
9
Price-Robertson identity• S = cov(w,z)• The covariance between trait value z and
relative fitness (w = W/Wbar, scaled to have mean fitness = 1)
• VERY! Useful result• R = cov(w,Az), as response = within
generation change in BV– This is called Robertson’s secondary theorem of
natural selection
10
11
Suppose pre-selection mean = 30, and we select top5. In the table zi = trait value, ni = number of offspring
Unweighted S = 7, predicted response = 0.3*7 = 2.1offspring-weighted S = 4.69, pred resp = 1.4
12
Response over multiple generations
• Strictly speaking, the breeders’ equation only holds for predicting a single generation of response from an unselected base population
• Practically speaking, the breeders’ equation is usually pretty good for 5-10 generations
• The validity for an initial h2 predicting response over several generations depends on:– The reliability of the initial h2 estimate– Absence of environmental change between
generations– The absence of genetic change between the
generation in which h2 was estimated and the generation in which selection is applied
13
50% selectedVp = 4, S = 1.6
20% selectedVp = 4, S = 2.8
20% selectedVp = 1, S = 1.4
The selection differential is a function of boththe phenotypic variance and the fraction selected
14
The Selection Intensity, iAs the previous example shows, populations with thesame selection differential (S) may experience verydifferent amounts of selection
The selection intensity i provides a suitable measurefor comparisons between populations,
15
Truncation selection• A common method of artificial selection is truncation
selection --- all individuals whose trait value is above some threshold (T) are chosen.
• Equivalent to only choosing the uppermost fraction p of the population
16
Selection Differential UnderTruncation Selection
R code for i: dnorm(qnorm(1-p))/p
Likewise,
S =µ*- µ
17
Truncation selection• The fraction p saved can be translated into an
expected selection intensity (assuming the trait is normally distributed), – allows a breeder (by setting p in advance) to
chose an expected value of i before selection, and hence set the expected response
p 0.5 0.2 0.1 0.05 0.01 0.005
i 0.798 1.400 1.755 2.063 2.665 2.892
Height of a unit normal at the threshold value corresponding to p
R code for i: dnorm(qnorm(1-p))/p
18
Selection Intensity Version of the Breeders’ Equation
Since h = correlation between phenotypic and breedingvalues, h = rPA
R = i rPAsA
Response = Intensity * Accuracy * spread in Va
When we select an individual solely on their phenotype,the accuracy (correlation) between BV and phenotype is h
19
Accuracy of selectionMore generally, we can express the breedersequation as
R = i ruA sA
Where we select individuals based on the index u (for example, the mean of n of their sibs).
ruA = the accuracy of using the measure u topredict an individual's breeding value = correlation between u and an individual's BV, A
20
21
Improving accuracy• Predicting either the breeding or genotypic
value from a single individual often has low accuracy --- h2 and/or H2 (based on a single individuals) is small – Especially true for many plant traits with
high G x E– Need to replicate either clones or relatives
(such as sibs) over regions and years to reduce the impact of G x E
– Likewise, information from a set of relatives can give much higher accuracy than the measurement of a single individual
22
Stratified mass selection• In order to accommodate the high
environmental variance with individual plant values, Gardner (1961) proposed the method of stratified mass selection– Population stratified into a number of different
blocks (i.e., sections within a field)– The best fraction p within each block are chosen– Idea is that environmental values are more similar
among individuals within each block, increasing trait heritability.
23
Overlapping Generations
Ry =im + if
Lm + Lf
h2sp
Lx = Generation interval for sex x = Average age of parents when progeny are born
The yearly rate of response is
Trade-offs: Generation interval vs. selection intensity:If younger animals are used (decreasing L), i is also lower,as more of the newborn animals are needed as replacements
24
Computing generation intervals
OFFSPRING Year 2 Year 3 Year 4 Year 5 total
Number (sires)
60 30 0 0 90
Number (dams)
400 600 100 40 1140
25
Generalized Breeder’s Equation
Ry =im + if
Lm + Lf
ruAsA
Tradeoff between generation length L and accuracy r
The longer we wait to replace an individual, the moreaccurate the selection (i.e., we have time for progenytesting and using the values of its relatives)
26
27
Permanent Versus Transient Response
Considering epistasis and shared environmental values,the single-generation response follows from the midparent-offspring regression
Permanent component of response
Transient component of response --- contributesto short-term response. Decays away to zero
over the long-term
28
Permanent Versus Transient Response
The reason for the focus on h2S is that thiscomponent is permanent in a random-mating population, while the other components aretransient, initially contributing to response, butthis contribution decays away under random mating
Why? Under HW, changes in allele frequenciesare permanent (don’t decay under random-mating),while LD (epistasis) does, and environmentalvalues also become randomized
29
Response with EpistasisThe response after one generation of selection froman unselected base population with A x A epistasis is
The contribution to response from this single generationafter t generations of no selection is
c is the average (pairwise) recombination between lociinvolved in A x A
30
Response with Epistasis
Contribution to response from epistasis decays to zero aslinkage disequilibrium decays to zero
Response from additive effects (h2 S) is due to changes in allele frequencies and hence is permanent. Contribution from A x A due to linkage disequilibrium
31
Why breeder’s equation assumption of an unselected base population? If history of previous selection, linkage disequilibrium may be present and the mean can change as the disequilibrium decays
For t generation of selection followed byt generations of no selection (but recombination)
RAA has a limitingvalue given by
Time to equilibrium afunction of c
Decay half-life
32
What about response with higher-order epistasis?
Fixed incremental differencethat decays when selection
stops
33
Response in autotetraploids
• Autotetraploids pass along two alleles at each locus to their offspring
• Hence, dominance variance is passed along• However, as with A x A, this depends upon
favorable combinations of alleles, and these are randomized over time by transmission, so D component of response is transient.
34
P-O covariance Single-generationresponse
Response to t generations ofselection with constant selection differential S
Response remaining after t generations of selection followed by t generations of random mating
Contribution from dominancequickly decays to zero
Autotetraploids
35
General responses• For both individual and family selection, the
response can be thought of as a regression of some phenotypic measurement (such as the individual itself or its corresponding selection unit value x) on either the offspring value (y) or the breeding value RA of an individual who will be a parent of the next generation (the recombination group).
• The regression slope for predicting – y from x is s (x,y)/s2(x) – BV RA from x s (x,RA)/s2(x)
• With transient components of response, these covariances now also become functions of time ---e.g. the covariance between x in one generation and y several generations later
36
Maternal Effects:Falconer’s dilution model
z = G + m zdam + e
G = Direct genetic effect on characterG = A + D + I. E[A] = (Asire + Adam)/2
maternal effect passed from dam to offspring m zdam is just a fraction m of the dam’s phenotypic value
m can be negative --- results in the potential fora reversed response
The presence of the maternal effects means that responseis not necessarily linear and time lags can occur in response
37
Parent-offspring regression under the dilution model
In terms of parental breeding values,
With no maternal effects, baz = h2
The resulting slope becomes bAz = h2 2/(2-m)
-
38
Parent-offspring regression under the dilution model
39
109876543210-0.15
-0.10
-0.05
0.00
0.05
0.10
Generation
Cum
ulat
ive
Res
pons
e to
Sel
ectio
n
(in te
rms o
f S)
Response to a single generation of selection
Reversed response in 1st generation largely due tonegative maternal correlationmasking genetic gain
Recovery of genetic response afterinitial maternal correlation decays
h2 = 0.11, m = -0.13 (litter size in mice)
40
20151050
-1.0
-0.5
0.0
0.5
1.0
1.5
Generation
Cum
ulat
ive
Res
pons
e (in
uni
ts o
f S) m = -0.25
m = -0.5
m = -0.75
h2 = 0.35
Selection occurs for 10 generations and then stops
Additional material
Unlikely to be covered in class
41
42
Selection on Threshold Traits
Assume some underlying continuous value z, the liability, maps to a discrete trait.
z < T character state zero (i.e. no disease)
z > T character state one (i.e. disease)
Alternative (but essentially equivalent model) is aprobit (or logistic) model, when p(z) = Prob(state one | z). Details in LW Chapter 14.
Response on a binary trait is a special case of response on a continuous trait
43Frequency of character state onin next generation
Frequency of trait
Observe: trait values are either 0,1. Popmean = q (frequencyof the 1 trait)
Want to map fromq onto the underlyingliability scale z, wherebreeder’s equationRz = h2Sz holds
44
Liability scale Mean liability before selection
Selection differentialon liability scale
Mean liability in next generation
45
qt* - qt is the selection differential on the phenotypic scale
Mean liability in next generation
46
Steps in Predicting Response to Threshold Selection
i) Compute initial mean µ0
We can choose a scale where the liabilityz has variance of one and a threshold T = 0
Hence, z - µ0 is a unit normal random variable
P(trait) = P(z > 0) = P(z - µ > -µ) = P(U > -µ)
U is a unit normal
Define z[q] = P(U < z[q] ) = q. P(U > z[1-q] ) = q
For example, suppose 5% of the pop shows the trait. P(U > 1.645) = 0.05, hence µ = -1.645. Note: in R, z[1-q] = qnorm(1-q), with qnorm(0.95) returning 1.644854
General result: µ = - z[1-q]
47
Steps in Predicting Response to Threshold Selection
ii) The frequency qt+1 of the trait in the next generation is just
qt+1 = P(U > - µt+1 ) = P(U > - [h2S + µt ] )= P(U > - h2S - z[1-q] )
iii) Hence, we need to compute S, the selection differential for the liability z
Let pt = fraction of individuals chosen ingeneration t that display the trait
48
-- ttq
St = π π t =f(π t ) pt - qt
1 q*
This fraction does not displaythe trait, hence z < 0
When z is normally distributed, this reduces to
Height of the unit normal density functionat the point µt
Hence, we start at some initial value given h2 andµ0, and iterative to obtain selection response
This fraction displaysthe trait, hence z > 0
49
25201510500.00
0.25
0.50
0.75
1.00
1.25
1.50
1.75
2.00
2.25
0
10
20
30
40
50
60
70
80
90
100
Generation
Sele
ctio
n di
ffere
ntia
l S
q, F
requ
ency
of c
hara
cter
S q
Initial frequency of q = 0.05. Select only on adultsshowing the trait (pt = 1)
50
Ancestral RegressionsWhen regressions on relatives are linear, we can think of the response as the sum over all previous contributions
For example, consider the response after 3 gens:
8 great-grand parentsS0 is there selectiondifferentialb3,0 is the regressioncoefficient for an offspring at time 3on a great-grandparentFrom time 0
4 grandparentsSelection diff S1
b3,1 is the regressionof relative in generation3 on their gen 1 relatives
2 parents
51
Ancestral Regressions
bT,t = cov(zT,zt)
More generally,
The general expression cov(zT,zt), where we keep track of the actual generation, as oppose to cov(z, zT-t ) -- how many generationsseparate the relatives, allows us to handle inbreeding, where theregression slope changes over generations of inbreeding.
52
Changes in the Variance under Selection
The infinitesimal model --- each locus has a very smalleffect on the trait.
Under the infinitesimal, require many generations for significant change in allele frequencies
However, can have significant change in geneticvariances due to selection creating linkage disequilibrium
Under linkage equilibrium, freq(AB gamete) = freq(A)freq(B)
With positive linkage disequilibrium, f(AB) > f(A)f(B), so that AB gametes are more frequent
With negative linkage disequilibrium, f(AB) < f(A)f(B), so that AB gametes are less frequent
53
Selection that reduces the variance generates negative d, selection that increases the variancegenerates positive d
54
Additive variance with LD:Additive variance is the variance of the sum of allelic effects,
Additive variance
Genic variance: value of Var(A)in the absence of disequilibriumfunction of allele frequencies
Disequilibrium contribution. Requires covariances between allelic effects at different loci
55
Key: Under the infinitesimal model, no (selection-induced) changes in genicvariance s2
a
Selection-induced changes in d change s2A, s2
z , h2
Dynamics of d: With unlinked loci, d loses half its value each generation (i.e, d in offspring is 1/2 d of their parents,
56
Dynamics of d: Computing the effect of selection in generating d
Consider the parent-offspring regression
Taking the variance of the offspring given the selected parents gives
Change in variance from selection
57
Change in d = change from recombination pluschange from selection
Recombination Selection
+ =
In terms of change in d,
This is the Bulmer Equation (Michael Bulmer), and it isakin to a breeder’s equation for the change in variance
At the selection-recombination equilibrium,
58
Application: Egg Weight in DucksRendel (1943) observed that while the change mean weight weight (in all vs. hatched) asnegligible, but their was a significance decreasein the variance, suggesting stabilizing selection
Before selection, variance = 52.7, reducing to43.9 after selection. Heritability was h2 = 0.6
= 0.62 (43.9 - 52.7) = -3.2
Var(A) = 0.6*52.7= 31.6. If selection stops, Var(A)is expected to increase to 31.6+3.2= 34.8
Var(z) should increase to 55.9, giving h2 = 0.62
59
Specific models of selection-inducedchanges in variances
Proportional reduction model:constant fraction k of
variance removed
Bulmer equation simplifiesto
Closed-form solutionto equilibrium h2
60
61
Equilibrium h2 under directiontruncation selection
62
Directional truncation selection
63
Changes in the variance = changes in h2
and even S (under truncation selection)
R(t) = h2(t) S(t)