lecture 4 - biostatistics › sites › default › files › ... · 2020-01-03 · 1 lecture 4...

1

Lecture 4Short-Term Selection

Response: Breeder’s equation

Bruce Walsh lecture notesIntroduction to Quantitative Genetics

SISG, Seattle16 – 18 July 2018

2

Response to Selection

• Selection can change the distribution of phenotypes, and we typically measure this by changes in mean– This is a within-generation change

• Selection can also change the distribution of breeding values– This is the response to selection, the change in

the trait in the next generation (the between-generation change)

3

The Selection Differential and the Response to Selection

• The selection differential S measures the within-generation change in the mean– S = µ* - µ

• The response R is the between-generation change in the mean– R(t) = µ(t+1) - µ(t)

4

Parental Generation

Offspring Generation

Truncation selectionUppermost fraction

p saved

µp µ*S

µo

R

5

The Breeders’ Equation: Translating S into R

Recall the regression of offspring value on midparent value

Averaging over the selected midparents,E[ (Pf + Pm)/2 ] = µ*,

E[ yo - µ ] = h2 ( µ* - µ ) = h2 S

Likewise, averaging over the regression gives

Since E[ yo - µ ] is the change in the offspring mean, it represents the response to selection, giving:

R = h2 S The Breeders’ Equation (Jay Lush)

6

• Note that no matter how strong S, if h2 is small, the response is small

• S is a measure of selection, R the actual response. One can get lots of selection but no response

• If offspring are asexual clones of their parents, the breeders’ equation becomes – R = H2 S

• If males and females subjected to differing amounts of selection,– S = (Sf + Sm)/2– Example: Selection on seed number in plants -- pollination

(males) is random, so that S = Sf/2

7

Pollen control• Recall that S = (Sf + Sm)/2• An issue that arises in plant breeding is pollen

control --- is the pollen from plants that have also been selected?

• Not the case for traits (i.e., yield) scored after pollination. In this case, Sm = 0, so response only half that with pollen control

• Tradeoff: with an additional generation, a number of schemes can give pollen control, and hence twice the response– However, takes twice as many generations, so

response per generation the same

8

Selection on clones• Although we have framed response in an outcrossed

population, we can also consider selecting the best individual clones from a large population of different clones (e.g., inbred lines)

• R = H2S, now a function of the board sense heritability. Since H2 > h2, the single-generation response using clones exceeds that using outcrossed individuals

• However, the genetic variation in the next generation is significantly reduced, reducing response in subsequent generations– In contrast, expect an almost continual response for several

generations in an outcrossed population.

9

Price-Robertson identity• S = cov(w,z)• The covariance between trait value z and

relative fitness (w = W/Wbar, scaled to have mean fitness = 1)

• VERY! Useful result• R = cov(w,Az), as response = within

generation change in BV– This is called Robertson’s secondary theorem of

natural selection

10

11

Suppose pre-selection mean = 30, and we select top5. In the table zi = trait value, ni = number of offspring

Unweighted S = 7, predicted response = 0.3*7 = 2.1offspring-weighted S = 4.69, pred resp = 1.4

12

Response over multiple generations

• Strictly speaking, the breeders’ equation only holds for predicting a single generation of response from an unselected base population

• Practically speaking, the breeders’ equation is usually pretty good for 5-10 generations

• The validity for an initial h2 predicting response over several generations depends on:– The reliability of the initial h2 estimate– Absence of environmental change between

generations– The absence of genetic change between the

generation in which h2 was estimated and the generation in which selection is applied

13

50% selectedVp = 4, S = 1.6



The selection differential is a function of boththe phenotypic variance and the fraction selected

14

The Selection Intensity, iAs the previous example shows, populations with thesame selection differential (S) may experience verydifferent amounts of selection

The selection intensity i provides a suitable measurefor comparisons between populations,

15

Truncation selection• A common method of artificial selection is truncation

selection --- all individuals whose trait value is above some threshold (T) are chosen.

• Equivalent to only choosing the uppermost fraction p of the population

16

Selection Differential UnderTruncation Selection

R code for i: dnorm(qnorm(1-p))/p

Likewise,

S =µ*- µ

17

Truncation selection• The fraction p saved can be translated into an

expected selection intensity (assuming the trait is normally distributed), – allows a breeder (by setting p in advance) to

chose an expected value of i before selection, and hence set the expected response

p 0.5 0.2 0.1 0.05 0.01 0.005

i 0.798 1.400 1.755 2.063 2.665 2.892

Height of a unit normal at the threshold value corresponding to p

R code for i: dnorm(qnorm(1-p))/p

18

Selection Intensity Version of the Breeders’ Equation

Since h = correlation between phenotypic and breedingvalues, h = rPA

R = i rPAsA

Response = Intensity * Accuracy * spread in Va

When we select an individual solely on their phenotype,the accuracy (correlation) between BV and phenotype is h

19

Accuracy of selectionMore generally, we can express the breedersequation as

R = i ruA sA

Where we select individuals based on the index u (for example, the mean of n of their sibs).

ruA = the accuracy of using the measure u topredict an individual's breeding value = correlation between u and an individual's BV, A

20

21

Improving accuracy• Predicting either the breeding or genotypic

value from a single individual often has low accuracy --- h2 and/or H2 (based on a single individuals) is small – Especially true for many plant traits with

high G x E– Need to replicate either clones or relatives

(such as sibs) over regions and years to reduce the impact of G x E

– Likewise, information from a set of relatives can give much higher accuracy than the measurement of a single individual

22

Stratified mass selection• In order to accommodate the high

environmental variance with individual plant values, Gardner (1961) proposed the method of stratified mass selection– Population stratified into a number of different

blocks (i.e., sections within a field)– The best fraction p within each block are chosen– Idea is that environmental values are more similar

among individuals within each block, increasing trait heritability.

23

Overlapping Generations

Ry =im + if

Lm + Lf

h2sp

Lx = Generation interval for sex x = Average age of parents when progeny are born

The yearly rate of response is

Trade-offs: Generation interval vs. selection intensity:If younger animals are used (decreasing L), i is also lower,as more of the newborn animals are needed as replacements

24

Computing generation intervals

OFFSPRING Year 2 Year 3 Year 4 Year 5 total

Number (sires)

60 30 0 0 90

Number (dams)

400 600 100 40 1140

25

Generalized Breeder’s Equation

Ry =im + if

Lm + Lf

ruAsA

Tradeoff between generation length L and accuracy r

The longer we wait to replace an individual, the moreaccurate the selection (i.e., we have time for progenytesting and using the values of its relatives)

26

27

Permanent Versus Transient Response

Considering epistasis and shared environmental values,the single-generation response follows from the midparent-offspring regression

Permanent component of response

Transient component of response --- contributesto short-term response. Decays away to zero

over the long-term

28

Permanent Versus Transient Response

The reason for the focus on h2S is that thiscomponent is permanent in a random-mating population, while the other components aretransient, initially contributing to response, butthis contribution decays away under random mating

Why? Under HW, changes in allele frequenciesare permanent (don’t decay under random-mating),while LD (epistasis) does, and environmentalvalues also become randomized

29

Response with EpistasisThe response after one generation of selection froman unselected base population with A x A epistasis is

The contribution to response from this single generationafter t generations of no selection is

c is the average (pairwise) recombination between lociinvolved in A x A

30

Response with Epistasis

Contribution to response from epistasis decays to zero aslinkage disequilibrium decays to zero

Response from additive effects (h2 S) is due to changes in allele frequencies and hence is permanent. Contribution from A x A due to linkage disequilibrium

31

Why breeder’s equation assumption of an unselected base population? If history of previous selection, linkage disequilibrium may be present and the mean can change as the disequilibrium decays

For t generation of selection followed byt generations of no selection (but recombination)

RAA has a limitingvalue given by

Time to equilibrium afunction of c

Decay half-life

32

What about response with higher-order epistasis?

Fixed incremental differencethat decays when selection

stops

33

Response in autotetraploids

• Autotetraploids pass along two alleles at each locus to their offspring

• Hence, dominance variance is passed along• However, as with A x A, this depends upon

favorable combinations of alleles, and these are randomized over time by transmission, so D component of response is transient.

34

P-O covariance Single-generationresponse

Response to t generations ofselection with constant selection differential S

Response remaining after t generations of selection followed by t generations of random mating

Contribution from dominancequickly decays to zero

Autotetraploids

35

General responses• For both individual and family selection, the

response can be thought of as a regression of some phenotypic measurement (such as the individual itself or its corresponding selection unit value x) on either the offspring value (y) or the breeding value RA of an individual who will be a parent of the next generation (the recombination group).

• The regression slope for predicting – y from x is s (x,y)/s2(x) – BV RA from x s (x,RA)/s2(x)

• With transient components of response, these covariances now also become functions of time ---e.g. the covariance between x in one generation and y several generations later

36

Maternal Effects:Falconer’s dilution model

z = G + m zdam + e

G = Direct genetic effect on characterG = A + D + I. E[A] = (Asire + Adam)/2

maternal effect passed from dam to offspring m zdam is just a fraction m of the dam’s phenotypic value

m can be negative --- results in the potential fora reversed response

The presence of the maternal effects means that responseis not necessarily linear and time lags can occur in response

37

Parent-offspring regression under the dilution model

In terms of parental breeding values,

With no maternal effects, baz = h2

The resulting slope becomes bAz = h2 2/(2-m)

-

38

Parent-offspring regression under the dilution model

39

109876543210-0.15

-0.10

-0.05

0.00

0.05

0.10

Generation

Cum

ulat

ive

Res

pons

e to

Sel

ectio

n

(in te

rms o

f S)

Response to a single generation of selection

Reversed response in 1st generation largely due tonegative maternal correlationmasking genetic gain

Recovery of genetic response afterinitial maternal correlation decays

h2 = 0.11, m = -0.13 (litter size in mice)

40

20151050

-1.0

-0.5

0.0

0.5

1.0

1.5

Generation

Cum

ulat

ive

Res

pons

e (in

uni

ts o

f S) m = -0.25

m = -0.5

m = -0.75

h2 = 0.35

Selection occurs for 10 generations and then stops

Additional material

Unlikely to be covered in class

41

42

Selection on Threshold Traits

Assume some underlying continuous value z, the liability, maps to a discrete trait.

z < T character state zero (i.e. no disease)

z > T character state one (i.e. disease)

Alternative (but essentially equivalent model) is aprobit (or logistic) model, when p(z) = Prob(state one | z). Details in LW Chapter 14.

Response on a binary trait is a special case of response on a continuous trait

43Frequency of character state onin next generation

Frequency of trait

Observe: trait values are either 0,1. Popmean = q (frequencyof the 1 trait)

Want to map fromq onto the underlyingliability scale z, wherebreeder’s equationRz = h2Sz holds

44

Liability scale Mean liability before selection

Selection differentialon liability scale

Mean liability in next generation

45

qt* - qt is the selection differential on the phenotypic scale

Mean liability in next generation

46

Steps in Predicting Response to Threshold Selection

i) Compute initial mean µ0

We can choose a scale where the liabilityz has variance of one and a threshold T = 0

Hence, z - µ0 is a unit normal random variable

P(trait) = P(z > 0) = P(z - µ > -µ) = P(U > -µ)

U is a unit normal

Define z[q] = P(U < z[q] ) = q. P(U > z[1-q] ) = q

For example, suppose 5% of the pop shows the trait. P(U > 1.645) = 0.05, hence µ = -1.645. Note: in R, z[1-q] = qnorm(1-q), with qnorm(0.95) returning 1.644854

General result: µ = - z[1-q]

47

Steps in Predicting Response to Threshold Selection

ii) The frequency qt+1 of the trait in the next generation is just

qt+1 = P(U > - µt+1 ) = P(U > - [h2S + µt ] )= P(U > - h2S - z[1-q] )

iii) Hence, we need to compute S, the selection differential for the liability z

Let pt = fraction of individuals chosen ingeneration t that display the trait

48

-- ttq

St = π π t =f(π t ) pt - qt

1 q*

This fraction does not displaythe trait, hence z < 0

When z is normally distributed, this reduces to

Height of the unit normal density functionat the point µt

Hence, we start at some initial value given h2 andµ0, and iterative to obtain selection response

This fraction displaysthe trait, hence z > 0

49

25201510500.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

2.25

0

10

20

30

40

50

60

70

80

90

100

Generation

Sele

ctio

n di

ffere

ntia

l S

q, F

requ

ency

of c

hara

cter

S q

Initial frequency of q = 0.05. Select only on adultsshowing the trait (pt = 1)

50

Ancestral RegressionsWhen regressions on relatives are linear, we can think of the response as the sum over all previous contributions

For example, consider the response after 3 gens:

8 great-grand parentsS0 is there selectiondifferentialb3,0 is the regressioncoefficient for an offspring at time 3on a great-grandparentFrom time 0

4 grandparentsSelection diff S1

b3,1 is the regressionof relative in generation3 on their gen 1 relatives

2 parents

51

Ancestral Regressions

bT,t = cov(zT,zt)

More generally,

The general expression cov(zT,zt), where we keep track of the actual generation, as oppose to cov(z, zT-t ) -- how many generationsseparate the relatives, allows us to handle inbreeding, where theregression slope changes over generations of inbreeding.

52

Changes in the Variance under Selection

The infinitesimal model --- each locus has a very smalleffect on the trait.

Under the infinitesimal, require many generations for significant change in allele frequencies

However, can have significant change in geneticvariances due to selection creating linkage disequilibrium

Under linkage equilibrium, freq(AB gamete) = freq(A)freq(B)

With positive linkage disequilibrium, f(AB) > f(A)f(B), so that AB gametes are more frequent

With negative linkage disequilibrium, f(AB) < f(A)f(B), so that AB gametes are less frequent

53

Selection that reduces the variance generates negative d, selection that increases the variancegenerates positive d

54

Additive variance with LD:Additive variance is the variance of the sum of allelic effects,

Additive variance

Genic variance: value of Var(A)in the absence of disequilibriumfunction of allele frequencies

Disequilibrium contribution. Requires covariances between allelic effects at different loci

55

Key: Under the infinitesimal model, no (selection-induced) changes in genicvariance s2

a

Selection-induced changes in d change s2A, s2

z , h2

Dynamics of d: With unlinked loci, d loses half its value each generation (i.e, d in offspring is 1/2 d of their parents,

56

Dynamics of d: Computing the effect of selection in generating d

Consider the parent-offspring regression

Taking the variance of the offspring given the selected parents gives

Change in variance from selection

57

Change in d = change from recombination pluschange from selection

Recombination Selection

+ =

In terms of change in d,

This is the Bulmer Equation (Michael Bulmer), and it isakin to a breeder’s equation for the change in variance

At the selection-recombination equilibrium,

58

Application: Egg Weight in DucksRendel (1943) observed that while the change mean weight weight (in all vs. hatched) asnegligible, but their was a significance decreasein the variance, suggesting stabilizing selection

Before selection, variance = 52.7, reducing to43.9 after selection. Heritability was h2 = 0.6

= 0.62 (43.9 - 52.7) = -3.2

Var(A) = 0.6*52.7= 31.6. If selection stops, Var(A)is expected to increase to 31.6+3.2= 34.8

Var(z) should increase to 55.9, giving h2 = 0.62

59

Specific models of selection-inducedchanges in variances

Proportional reduction model:constant fraction k of

variance removed

Bulmer equation simplifiesto

Closed-form solutionto equilibrium h2

60

61

Equilibrium h2 under directiontruncation selection

62

Directional truncation selection

63

Changes in the variance = changes in h2

and even S (under truncation selection)

R(t) = h2(t) S(t)

lecture 4 - biostatistics › sites › default › files › ... · 2020-01-03 · 1 lecture 4...

Documents