pcb5065 advanced genetics population genetics and

Post on 11-May-2015

1. PCB5065 Advanced Genetics: Population Genetics and Quantitative Genetics

  • Instructor: Rongling Wu, 409 McCarty Hall, Department of Statistics. Tel: 2-3806, Email: [email protected]
  • Mon Nov 14: Population genetics - population structure
  • Tues Nov 15: Population genetics - Hardy-Weinberg equilibrium
  • Wed Nov 16: Population genetics - effective population size
  • Thurs Nov 17: Population genetics - linkage disequilibrium
  • Mon Nov 21: Population genetics - evolutionary forces
  • Tues Nov 22: Population genetics - evolutionary forces
  • Wed Nov 23: Genetic Parameters: Means
  • Mon Nov 28: Genetic Parameters: (Co)Variances
  • Tues Nov 29: Mating Designs for Parameter Estimation
  • Wed Nov 30: Discussion paper - Epigenetics / developmental genetics
  • Thurs Dec 1: No Class (UFGI Genetics Symposium, Reitz Union)
  • Mon Dec 5: Experimental Designs for Parameter Estimation
  • Tues Dec 6: Heritability, Genetic Correlation and Gain from Selection
  • Wed Dec 7: Toward Molecular Dissection of Quantitative Variation
  • Wed Dec 7: Take-home exam on pop. and quant. genetics given; due in electronic format ( [email_address] ) by 5 PM Mon. Dec. 12

2. Teosinte and Maize

  • Teosinte branched 1 (tb1) is found to affect the differentiation in branch architecture from teosinte to maize (John Doebley 2001)

3.

  • Approaches used to support the view that modern maize cultivars are domesticated from the wild type teosinte
  • Population genetics
  • Study the evolutionary or phylogenetic relationships between maize and its wild relative
  • Study evolutionary forces that have shaped the structure of and diversity in the maize genome

4.

  • Quantitative genetics
  • Identify the genetic architecture of the differences in morphology between maize and teosinte
  • Estimate the number of genes required for the evolution of a new morphological trait from teosinte to maize: few genes of large effect or many genes of small effect?
  • Doebley pioneered the use of quantitative trait locus (QTL) mapping approaches to successfully identify genomic regions that are responsible for the separation of maize from its undomesticated relatives.

5.

  • Doebley has cloned a gene identified through QTL mapping, teosinte branched1 (tb1), which governs kernel structure and plant architecture.
  • Several thousand years ago, ancient Mexicans transformed the wild grass teosinte into modern maize through rounds of selective breeding for large ears of corn.
  • "With genetic information, I think in as few as 25 years I can move teosinte fairly far along the road to becoming maize," Doebley predicts (Brownlee, 2004 PNAS vol. 101: 697-699)

6. Toward biomedical breakthroughs?

  • Single Nucleotide Polymorphisms (SNPs)
  • [Figure: SNP comparison between individuals with no cancer and individuals with cancer]

7.

  • According to The International HapMap Consortium (2003), the statistical analysis and modeling of the links between DNA sequence variants and phenotypes will play a pivotal role in the characterization of specific genes for various diseases and, ultimately, the design of personalized medications that are optimal for individual patients.
  • What knowledge is needed to perform such statistical analyses?
  • Population genetics and quantitative genetics, and others
  • The International HapMap Consortium, 2003 The International HapMap Project. Nature 426: 789-794.
  • Liu, T., J. A. Johnson, G. Casella and R. L. Wu, 2004 Sequencing complex diseases with HapMap. Genetics 168: 503-511.

8.

  • Basic Genetics
  • (1) Mendelian genetics
  • How does a gene transmit from a parent to its progeny (individual)?
  • (2) Population genetics
  • How is a gene segregating in a population (a group of individuals)?
  • (3) Quantitative genetics
  • How is gene segregation related with the phenotype of a character?
  • (4) Molecular genetics
  • What is the molecular basis of gene segregation and transmission?
  • (5) Developmental genetics
  • (6) Epigenetics

9.

  • Mendelian Genetics: Probability
  • Population Genetics: Statistics
  • Quantitative Genetics, Molecular Genetics
  • Statistical Genetics: mathematics with biology (our view)
  • Cutting-edge research at the interface among genetics, evolution and development (Evo-Devo)
  • Wu, R. L. Functional mapping: how to map and study the genetic architecture of dynamic complex traits. Nature Reviews Genetics (accepted)

10.

  • Mendel's Laws
  • Mendel's first law
  • There is a gene with two alleles at a chromosomal location (locus)
  • These alleles segregate during the formation of the reproductive cells, thus passing into different gametes
  • Mendel's second law
  • There are two or more pairs of genes on different chromosomes
  • They segregate independently (partially correct)
  • Linkage (exception to Mendel's second law)
  • There are two or more pairs of genes located on the same chromosome
  • They can be linked or associated (the degree of association is described by the recombination fraction)

11.

  • Population Genetics
  • Different copies of a gene are called alleles; for example, A and a at gene A;
  • These alleles form three genotypes, AA, Aa and aa;
  • The allele (or gene) frequency of an allele is defined as the proportion of this allele among a group of individuals;
  • Accordingly, the genotype frequency is the proportion of a genotype among a group of individuals

12.

  • Calculations of allele frequencies and genotype frequencies
  • Genotype: count; estimated genotype frequency
  • AA: 224; $P_{AA} = 224/294 = 0.762$
  • Aa: 64; $P_{Aa} = 64/294 = 0.218$
  • aa: 6; $P_{aa} = 6/294 = 0.020$
  • Total: 294; $P_{AA} + P_{Aa} + P_{aa} = 1$
  • Allele frequencies
  • $p_A = (2 \times 224 + 64)/(2 \times 294) = 0.871$, $p_a = (2 \times 6 + 64)/(2 \times 294) = 0.129$,
  • $p_A + p_a = 0.871 + 0.129 = 1$
  • Expected genotype frequencies
  • AA: $p_A^2 = 0.871^2 = 0.759$
  • Aa: $2p_Ap_a = 2 \times 0.871 \times 0.129 = 0.225$
  • aa: $p_a^2 = 0.129^2 = 0.017$

13.

  • Genotype: count; estimate of genotype frequency
  • AA: $n_{AA}$; $P_{AA} = n_{AA}/n$
  • Aa: $n_{Aa}$; $P_{Aa} = n_{Aa}/n$
  • aa: $n_{aa}$; $P_{aa} = n_{aa}/n$
  • Total: $n$; $P_{AA} + P_{Aa} + P_{aa} = 1$
  • Allele frequencies
  • $p_A = (2n_{AA} + n_{Aa})/(2n)$
  • $p_a = (2n_{aa} + n_{Aa})/(2n)$
  • Sampling variance of the estimate of the allele frequency (the standard error is its square root)
  • $\mathrm{Var}(p_A) = p_A(1 - p_A)/(2n)$
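
The counting rules above can be sketched in a few lines of Python. This is only an illustration: the function name `allele_frequencies` is an assumption made here, and the binomial variance is the one quoted on the slide. The counts are those of the worked example (224 AA, 64 Aa, 6 aa).

```python
import math

def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p_A, p_a, standard error of p_A) from genotype counts."""
    n = n_AA + n_Aa + n_aa                 # number of individuals
    p_A = (2 * n_AA + n_Aa) / (2 * n)      # each AA carries two A alleles
    p_a = (2 * n_aa + n_Aa) / (2 * n)
    var_p_A = p_A * (1 - p_A) / (2 * n)    # binomial sampling variance
    return p_A, p_a, math.sqrt(var_p_A)

p_A, p_a, se_p_A = allele_frequencies(224, 64, 6)
print(f"p_A = {p_A:.3f}, p_a = {p_a:.3f}, SE(p_A) = {se_p_A:.4f}")
```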

14.

  • The Hardy-Weinberg Law
  • In the Hardy-Weinberg equilibrium (HWE), the relative frequencies of the genotypes will remain unchanged from generation to generation;
  • A randomly mating population reaches HWE after a single generation of random mating, whatever its initial genotype frequencies;
  • The deviation from HWE, called Hardy-Weinberg disequilibrium (HWD), results from many factors, such as selection, mutation, admixture and population structure

15.

  • Mendelian inheritance at the individual level
  • (1) Make a cross between two individual parents
  • (2) Consider one gene (A) with two alleles, A and a, giving three genotypes: AA, Aa, aa
  • Thus, we have a total of nine possible cross combinations:
  • Cross: Mendelian segregation ratio
  • 1. AA × AA: AA
  • 2. AA × Aa: ½ AA + ½ Aa
  • 3. AA × aa: Aa
  • 4. Aa × AA: ½ AA + ½ Aa
  • 5. Aa × Aa: ¼ AA + ½ Aa + ¼ aa
  • 6. Aa × aa: ½ Aa + ½ aa
  • 7. aa × AA: Aa
  • 8. aa × Aa: ½ Aa + ½ aa
  • 9. aa × aa: aa

16.

  • Mendelian inheritance at the population level
  • A population, a group of individuals, may contain all these nine combinations, weighted by the mating frequencies.
  • Genotype frequencies: AA, $P_{AA}(t)$; Aa, $P_{Aa}(t)$; aa, $P_{aa}(t)$
  • Cross: mating freq. (t); Mendelian segregation ratio (t+1) for AA, Aa, aa
  • 1. AA × AA: $P_{AA}(t)P_{AA}(t)$; 1, 0, 0
  • 2. AA × Aa: $P_{AA}(t)P_{Aa}(t)$; ½, ½, 0
  • 3. AA × aa: $P_{AA}(t)P_{aa}(t)$; 0, 1, 0
  • 4. Aa × AA: $P_{Aa}(t)P_{AA}(t)$; ½, ½, 0
  • 5. Aa × Aa: $P_{Aa}(t)P_{Aa}(t)$; ¼, ½, ¼
  • 6. Aa × aa: $P_{Aa}(t)P_{aa}(t)$; 0, ½, ½
  • 7. aa × AA: $P_{aa}(t)P_{AA}(t)$; 0, 1, 0
  • 8. aa × Aa: $P_{aa}(t)P_{Aa}(t)$; 0, ½, ½
  • 9. aa × aa: $P_{aa}(t)P_{aa}(t)$; 0, 0, 1

17.

  • $P_{AA}(t+1) = 1 \cdot [P_{AA}(t)]^2 + 2 \cdot \tfrac{1}{2}[P_{AA}(t)P_{Aa}(t)] + \tfrac{1}{4}[P_{Aa}(t)]^2$
  • $= [P_{AA}(t) + \tfrac{1}{2}P_{Aa}(t)]^2$
  • Similarly, we have
  • $P_{aa}(t+1) = [P_{aa}(t) + \tfrac{1}{2}P_{Aa}(t)]^2$
  • $P_{Aa}(t+1) = 2[P_{AA}(t) + \tfrac{1}{2}P_{Aa}(t)][P_{aa}(t) + \tfrac{1}{2}P_{Aa}(t)]$
  • Therefore, we have
  • $[P_{Aa}(t+1)]^2 = 4P_{AA}(t+1)P_{aa}(t+1)$
  • Furthermore, if random mating continues, we have
  • $P_{AA}(t+2) = [P_{AA}(t+1) + \tfrac{1}{2}P_{Aa}(t+1)]^2 = P_{AA}(t+1)$
  • $P_{Aa}(t+2) = 2[P_{AA}(t+1) + \tfrac{1}{2}P_{Aa}(t+1)][P_{aa}(t+1) + \tfrac{1}{2}P_{Aa}(t+1)] = P_{Aa}(t+1)$
  • $P_{aa}(t+2) = [P_{aa}(t+1) + \tfrac{1}{2}P_{Aa}(t+1)]^2 = P_{aa}(t+1)$
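
A quick numerical check of this recursion, as a minimal sketch: the function name `random_mating` and the starting frequencies (0.5, 0.1, 0.4) are arbitrary choices made for illustration, not values from the course.

```python
def random_mating(P_AA, P_Aa, P_aa):
    """Genotype frequencies after one generation of random mating."""
    p_A = P_AA + 0.5 * P_Aa      # allele frequency of A among the parents
    p_a = P_aa + 0.5 * P_Aa      # allele frequency of a among the parents
    return p_A ** 2, 2 * p_A * p_a, p_a ** 2

gen0 = (0.5, 0.1, 0.4)           # deliberately far from HWE
gen1 = random_mating(*gen0)
gen2 = random_mating(*gen1)
print(gen1)
print(gen2)
```

The second and third generations coincide, and the first offspring generation already satisfies $[P_{Aa}]^2 = 4P_{AA}P_{aa}$.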

18. Concluding remarks

  • A population with $[P_{Aa}(t+1)]^2 = 4P_{AA}(t+1)P_{aa}(t+1)$ is said to be in Hardy-Weinberg equilibrium (HWE). The HWE population has the following properties:
  • (1) Genotype (and allele) frequencies are constant from generation to generation,
  • (2) Genotype frequencies equal the products of the allele frequencies, i.e., $P_{AA} = p_A^2$, $P_{Aa} = 2p_Ap_a$, $P_{aa} = p_a^2$
  • For a population at Hardy-Weinberg disequilibrium (HWD), we have
  • $P_{AA} = p_A^2 + D$
  • $P_{Aa} = 2p_Ap_a - 2D$
  • $P_{aa} = p_a^2 + D$
  • The magnitude of D determines the degree of HWD.
  • D = 0 means that there is no HWD.
  • D has a range of $\max(-p_A^2, -p_a^2) \le D \le p_Ap_a$

19.

  • Chi-square test for HWE
  • Whether or not the population deviates from HWE at a particular locus can be tested using a chi-square test.
  • If the population deviates from HWE (i.e., Hardy-Weinberg disequilibrium, HWD), this suggests that the population is not mating randomly or that evolutionary forces, such as mutation, genetic drift and population structure, are operating.

20.

  • Example 1
  • Genotype: AA, Aa, aa; total
  • Obs: 224, 64, 6; 294
  • Exp: $n p_A^2 = 222.9$, $n(2p_Ap_a) = 66.2$, $n p_a^2 = 4.9$; 294
  • Test statistic
  • $\chi^2 = \sum (\text{obs} - \text{exp})^2/\text{exp} = (224-222.9)^2/222.9 + (64-66.2)^2/66.2 + (6-4.9)^2/4.9 = 0.32$
  • is less than
  • $\chi^2_{df=1}(\alpha = 0.05) = 3.841$
  • Therefore, the population does not deviate from HWE at this locus.
  • Why is the degree of freedom 1? Degree of freedom = the number of parameters contained in the alternative hypothesis - the number of parameters contained in the null hypothesis. In this case, df = 2 ($p_A$ or $p_a$, and D) - 1 ($p_A$ or $p_a$) = 1

21.

  • Example 2
  • Genotype: AA, Aa, aa; total
  • Obs: 234, 36, 6; 276
  • Exp: $n p_A^2 = 230.1$, $n(2p_Ap_a) = 43.8$, $n p_a^2 = 2.1$; 276
  • Test statistic
  • $\chi^2 = \sum (\text{obs} - \text{exp})^2/\text{exp} = (234-230.1)^2/230.1 + (36-43.8)^2/43.8 + (6-2.1)^2/2.1 = 8.8$
  • is greater than $\chi^2_{df=1}(\alpha = 0.05) = 3.841$
  • Therefore, the population deviates from HWE at this locus.
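
Both examples can be reproduced with a short function. This is a sketch under the assumptions of the slides (one locus, two alleles, df = 1); the name `hwe_chi_square` and the hard-coded 5% critical value 3.841 are conveniences introduced here.

```python
def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic for deviation from HWE at a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    p_A = (2 * n_AA + n_Aa) / (2 * n)
    p_a = 1 - p_A
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p_A ** 2, n * 2 * p_A * p_a, n * p_a ** 2)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_DF1_ALPHA_005 = 3.841   # chi-square threshold, df = 1, alpha = 0.05

chi2_example1 = hwe_chi_square(224, 64, 6)
chi2_example2 = hwe_chi_square(234, 36, 6)
for label, chi2 in (("Example 1", chi2_example1), ("Example 2", chi2_example2)):
    verdict = "deviates from" if chi2 > CRITICAL_DF1_ALPHA_005 else "is consistent with"
    print(f"{label}: chi-square = {chi2:.2f}, population {verdict} HWE")
```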

22.

  • Linkage disequilibrium
  • Consider two loci, A and B, with alleles A, a and B, b, respectively, in a population
  • Assume that the population is at Hardy-Weinberg equilibrium at each locus. We then have
  • Gene A; Gene B
  • AA: $P_{AA} = p_A^2$; BB: $P_{BB} = p_B^2$
  • Aa: $P_{Aa} = 2p_Ap_a$; Bb: $P_{Bb} = 2p_Bp_b$
  • aa: $P_{aa} = p_a^2$; bb: $P_{bb} = p_b^2$
  • $P_{AA} + P_{Aa} + P_{aa} = 1$; $P_{BB} + P_{Bb} + P_{bb} = 1$
  • $p_A + p_a = 1$; $p_B + p_b = 1$

23.

  • Suppose, however, that the population is in linkage disequilibrium (for a pair of loci). Then we have
  • Two-gene haplotype AB: $p_{AB} = p_Ap_B + D_{AB}$
  • Two-gene haplotype Ab: $p_{Ab} = p_Ap_b + D_{Ab}$
  • Two-gene haplotype aB: $p_{aB} = p_ap_B + D_{aB}$
  • Two-gene haplotype ab: $p_{ab} = p_ap_b + D_{ab}$
  • $p_{AB} + p_{Ab} + p_{aB} + p_{ab} = 1$
  • $D_{ij}$ is the coefficient of linkage disequilibrium (LD) between the two genes in the population. The magnitude of D reflects the degree of LD: the larger $|D|$, the stronger the LD.

24.

  • $p_A = p_{AB} + p_{Ab}$
  • $= p_Ap_B + D_{AB} + p_Ap_b + D_{Ab}$
  • $= p_A + D_{AB} + D_{Ab} \Rightarrow D_{AB} = -D_{Ab}$
  • $p_B = p_{AB} + p_{aB}$
  • $= p_B + D_{AB} + D_{aB} \Rightarrow D_{AB} = -D_{aB}$
  • $p_b = p_{Ab} + p_{ab}$
  • $= p_b + D_{Ab} + D_{ab} \Rightarrow D_{ab} = -D_{Ab}$
  • Finally, we have $D_{AB} = -D_{Ab} = -D_{aB} = D_{ab} = D$.
  • Re-write the four two-gene haplotype frequencies
  • AB: $p_{AB} = p_Ap_B + D$
  • Ab: $p_{Ab} = p_Ap_b - D$
  • aB: $p_{aB} = p_ap_B - D$
  • ab: $p_{ab} = p_ap_b + D$
  • $D = p_{AB}p_{ab} - p_{Ab}p_{aB}$
  • $D = 0 \Rightarrow$ the population is in linkage equilibrium

25.

  • How does D transmit from one generation (1) to the next (2)?
  • $D(2) = (1-r)^1D(1)$
  • $D(t+1) = (1-r)^tD(1)$
  • As $t \to \infty$, $D(t+1) \to 0$ at a rate that depends on r

26. Conclusions:

  • D tends to zero at a rate that depends on the recombination fraction.
  • Linkage equilibrium, $p_{AB} = p_Ap_B$, is approached gradually and without oscillation.
  • The larger r, the faster the rate of convergence, the most rapid being $(1/2)^t$ for unlinked loci (r = 0.5).

27.

  • $D(t) = (1-r)^tD(0)$
  • $D(t)/D(0) = (1-r)^t$
  • The ratio D(t)/D(0) describes the degree to which LD decays over generations.
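
The decay law is easy to explore numerically. The sketch below assumes only $D(t) = (1-r)^tD(0)$; the helper names `ld_after` and `generations_to_fraction` and the recombination fractions used are illustrative choices.

```python
import math

def ld_after(D0, r, t):
    """LD coefficient after t generations of random mating."""
    return (1 - r) ** t * D0

def generations_to_fraction(r, fraction):
    """Generations needed for D(t)/D(0) to fall to the given fraction."""
    return math.log(fraction) / math.log(1 - r)

# Unlinked loci (r = 0.5) halve D every generation.
print(ld_after(0.25, 0.5, 3))
# Tightly linked loci (r = 0.01) need far longer to halve D.
half_life_tight = generations_to_fraction(0.01, 0.5)
print(round(half_life_tight, 1))
```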

28. The plot of the ratio D(t)/D(0) against r tells us the evolutionary history of a population: implications for population and evolutionary genetics.

29. The plot of the ratio D(t)/D(0) against t tells us the degree of linkage: implications for high-resolution mapping of human diseases and other complex traits.

30.

  • Proof of $D(t+1) = (1-r)D(t)$
  • The four gametes unite at random to form zygotes. Of the gametes produced by a zygote, a proportion 1-r are parental (nonrecombinant) gametes and a fraction r are nonparental (recombinant) gametes. A particular gamete, say AB, is therefore produced without recombination in generation t+1 with frequency $(1-r)p_{AB}(t)$.
  • This gamete is also generated as a recombinant from the genotypes formed by a gamete containing allele A and a gamete containing allele B. The frequencies of the gametes containing alleles A or B are $p_A(t)$ and $p_B(t)$, respectively. So the frequency with which AB arises in this way is $rp_A(t)p_B(t)$.
  • Therefore the frequency of AB in generation t+1 is
  • $p_{AB}(t+1) = (1-r)p_{AB}(t) + rp_A(t)p_B(t)$
  • Subtracting $p_A(t)p_B(t)$ from both sides of the above equation, and noting that the allele frequencies do not change under random mating, we have
  • $D(t+1) = (1-r)D(t)$
  • Whence
  • $D(t+1) = (1-r)^tD(1)$

31.

  • Estimate and test for LD
  • Assuming random mating in the population, the joint genotype probabilities of the two genes (with observed counts $n_{ij}$ in parentheses) are
  • AA ($P_{AA}$): BB, $p_{AB}^2$ ($n_{22}$); Bb, $2p_{AB}p_{Ab}$ ($n_{21}$); bb, $p_{Ab}^2$ ($n_{20}$)
  • Aa ($P_{Aa}$): BB, $2p_{AB}p_{aB}$ ($n_{12}$); Bb, $2(p_{AB}p_{ab} + p_{Ab}p_{aB})$ ($n_{11}$); bb, $2p_{Ab}p_{ab}$ ($n_{10}$)
  • aa ($P_{aa}$): BB, $p_{aB}^2$ ($n_{02}$); Bb, $2p_{aB}p_{ab}$ ($n_{01}$); bb, $p_{ab}^2$ ($n_{00}$)
  • Multinomial pdf
  • $H_1: D \neq 0$
  • $\log f(p_{ij} \mid n) = \log \frac{n!}{n_{22}! \cdots n_{00}!} + n_{22}\log p_{AB}^2 + n_{21}\log(2p_{AB}p_{Ab}) + n_{20}\log p_{Ab}^2 + \cdots$
  • Estimate $p_{AB}$, $p_{Ab}$, $p_{aB}$ (with $p_{ab} = 1 - p_{AB} - p_{Ab} - p_{aB}$), and from them $p_A$, $p_B$ and D
  • $H_0: D = 0$
  • $\log f(p_i, p_j \mid n) = \log \frac{n!}{n_{22}! \cdots n_{00}!} + n_{22}\log(p_Ap_B)^2 + n_{21}\log(2p_A^2p_Bp_b) + n_{20}\log(p_Ap_b)^2 + \cdots$
  • Estimate $p_A$ and $p_B$.

32.

  • Chi-square Test of Linkage Disequilibrium (D)
  • Test statistic
  • $\chi^2 = 2nD^2/(p_Ap_ap_Bp_b)$
  • is compared with the critical threshold value obtained from the chi-square table, $\chi^2_{df=1}(0.05)$. n is the number of individuals in the population.
  • If $\chi^2 < \chi^2_{df=1}(0.05)$, D is not significantly different from zero and the population under study is in linkage equilibrium.
  • If $\chi^2 > \chi^2_{df=1}(0.05)$, D is significantly different from zero and the population under study is in linkage disequilibrium.

33.

  • Example
  • (1) Two genes: A with alleles A and a, and B with alleles B and b, whose population frequencies are denoted by $p_A$, $p_a$ ($= 1 - p_A$) and $p_B$, $p_b$ ($= 1 - p_B$), respectively
  • (2) These two genes are associated with each other, having the coefficient of linkage disequilibrium D
  • Four gametes are observed as follows:
  • Gamete: AB, Ab, aB, ab; total
  • Obs: 474, 611, 142, 773; 2n = 2000
  • Gamete frequency: $p_{AB} = 474/2000 = 0.237$, $p_{Ab} = 611/2000 = 0.3055$, $p_{aB} = 142/2000 = 0.071$, $p_{ab} = 773/2000 = 0.3865$; total 1

34.

  • Estimates of allele frequencies
  • $p_A = p_{AB} + p_{Ab} = 0.237 + 0.3055 = 0.5425$
  • $p_a = p_{aB} + p_{ab} = 0.071 + 0.3865 = 0.4575$
  • $p_B = p_{AB} + p_{aB} = 0.237 + 0.071 = 0.308$
  • $p_b = p_{Ab} + p_{ab} = 0.3055 + 0.3865 = 0.692$
  • The estimate of D
  • $D = p_{AB}p_{ab} - p_{Ab}p_{aB} = 0.237 \times 0.3865 - 0.3055 \times 0.071 = 0.06991 \approx 0.0699$
  • Test statistic
  • $\chi^2 = 2nD^2/(p_Ap_ap_Bp_b) = 2 \times 1000 \times 0.06991^2/(0.5425 \times 0.4575 \times 0.308 \times 0.692) = 184.78$ is greater than $\chi^2_{df=1}(0.05) = 3.841$.
  • Therefore, the population is in linkage disequilibrium at the two genes under consideration.

35.

  • A second approach for calculating $\chi^2$:
  • Gamete: AB, Ab, aB, ab; total
  • Obs: 474, 611, 142, 773; 2n = 2000
  • Exp: $2n(p_Ap_B) = 334.2$, $2n(p_Ap_b) = 750.8$, $2n(p_ap_B) = 281.8$, $2n(p_ap_b) = 633.2$; 2000
  • $\chi^2 = \sum (\text{obs} - \text{exp})^2/\text{exp}$
  • $= (474-334.2)^2/334.2 + (611-750.8)^2/750.8 + (142-281.8)^2/281.8 + (773-633.2)^2/633.2$
  • $= 184.78$
  • $= 2nD^2/(p_Ap_ap_Bp_b)$
36.

  • Measures of linkage disequilibrium
  • (1) D, which has the limitation that its value depends on the allele frequencies
  • D = 0.02 is considered to be
  • large for two genes each with very unequal allele frequencies, e.g., $p_A = p_B = 0.9$ vs. $p_a = p_b = 0.1$
  • small for two genes each with similar allele frequencies, e.g., $p_A = p_B = 0.5$ vs. $p_a = p_b = 0.5$

37.

  • (2) To make a comparison between gene pairs with different allele frequencies, we need a new normalized measure.
  • The range of LD is
  • $\max(-p_Ap_B, -p_ap_b) \le D \le \min(p_Ap_b, p_ap_B)$
  • The normalized LD (Lewontin 1964) is defined as
  • $D' = D/D_{max}$,
  • where $D_{max}$ is the maximum that D can have, which is
  • $D_{max} = \max(-p_Ap_B, -p_ap_b)$ if D < 0,
  • or $\min(p_Ap_b, p_ap_B)$ if D > 0.
  • For the above example, we have $D' = 0.0699/\min(p_Ap_b, p_ap_B) = 0.0699/\min(0.375, 0.141) = 0.496$

38.

  • (3) Linkage disequilibrium measured as the correlation between the A and B alleles
  • $R = D/\sqrt{p_Ap_ap_Bp_b}$, $R \in [-1, 1]$
  • Note: $\chi^2 = 2nR^2$ follows the chi-square distribution with df = 1 under the null hypothesis of D = 0.
  • For the above example, we have
  • $R = 0.0699/\sqrt{p_Ap_ap_Bp_b} = 0.3040$.
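
The two normalized measures can be computed together. The sketch below follows the definitions on the slides (Lewontin's D' and the allelic correlation R); the function name `normalized_ld` is hypothetical, and the inputs are the unrounded estimates from the gamete example.

```python
import math

def normalized_ld(D, p_A, p_B):
    """Return (D', R) for a pair of biallelic loci."""
    p_a, p_b = 1 - p_A, 1 - p_B
    if D < 0:
        d_max = max(-p_A * p_B, -p_a * p_b)   # most negative value D may take
    else:
        d_max = min(p_A * p_b, p_a * p_B)     # largest value D may take
    d_prime = D / d_max
    R = D / math.sqrt(p_A * p_a * p_B * p_b)
    return d_prime, R

d_prime, R = normalized_ld(0.06991, 0.5425, 0.308)
print(f"D' = {d_prime:.3f}, R = {R:.4f}, 2n*R^2 = {2000 * R ** 2:.2f}")
```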

39.

  • Application of LD analysis
  • $D(t+1) = (1-r)^tD(1)$,
  • This means that when the population undergoes random mating, the LD decays exponentially at a rate determined by the recombination fraction.
  • (1) Population structure and evolution
  • Estimating D, D' and R sheds light on the mating history of the population
  • The larger the D and R estimates, the more likely it is that the population mates nonrandomly, has a small size, or is affected by evolutionary forces.

40.

  • Human origin studies based on LD analysis
  • Reich, D. E., M. Cargill, S. Bolk, J. Ireland, P. C. Sabeti, D. J. Richter, T. Lavery, R. Kouyoumjian, S. F. Farhadian, R. Ward and E. S. Lander, 2001 Linkage disequilibrium in the human genome. Nature 411: 199-204.
  • Dawson, E., G. R. Abecasis, S. Bumpstead, Y. Chen et al. 2002 A first-generation linkage disequilibrium map of human chromosome 22. Nature 418: 544-548.

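41. LD curve for Swedish and Yoruban samples. To minimize ascertainment bias, data are only shown for marker comparisons involving the core SNP. Alleles are paired such that D' > 0 in the Utah population. D' > 0 in the other populations indicates the same direction of allelic association and D' [...]

  • [Figure: pedigree path diagram. The common ancestor A has offspring B and C; B is a parent of D and C is a parent of E, with transmission probabilities $p_{B \to D}$ and $p_{C \to E}$; D and E are the parents of I, with transmission probabilities $p_{D \to I}$ and $p_{E \to I}$.]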

51.

  • The common ancestor A generates two gametes, G1 and G2, during meiosis, but transmits only one gamete to its first offspring B and one gamete to its second offspring C.
  • The pair of gametes contributed to offspring B and C by A may be G1G1, G1G2, G2G1 or G2G2, each with a probability of 1/4 because of Mendelian segregation.
  • For G1G1 and G2G2, the alleles are clearly IBD.
  • For G1G2 and G2G1, the alleles are IBD only if G1 and G2 are IBD, and G1 and G2 are IBD only if individual A is autozygous, which has probability $F_A$ (the inbreeding coefficient of A).
  • The probability for A to transmit IBD alleles to B and C is therefore $1/4 + 1/4 + (1/4)F_A + (1/4)F_A = (1/2)(1 + F_A)$.

52.

  • The transmission probability of an allele from each of the other parents, B, C, D and E, to its specified offspring is, based on Mendelian segregation,
  • $p_{B \to D} = p_{C \to E} = p_{D \to I} = p_{E \to I} = 1/2$
  • Finally, the probability that the two alleles at any locus in individual I are identical by descent is
  • $F_I = (1/2)(1 + F_A)\, p_{B \to D}\, p_{C \to E}\, p_{D \to I}\, p_{E \to I}$
  • $= (1/2)^5(1 + F_A)$
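
The path calculation can be written as a one-line product. A minimal sketch, assuming the single-common-ancestor pedigree described above; the function name and its default of four Mendelian transmissions are illustrative choices.

```python
from math import prod

def inbreeding_coefficient(F_A, transmissions=(0.5, 0.5, 0.5, 0.5)):
    """F_I for a pedigree with one common ancestor A.

    `transmissions` holds the probabilities p_{B->D}, p_{C->E},
    p_{D->I} and p_{E->I}, each 1/2 under Mendelian segregation.
    """
    return 0.5 * (1 + F_A) * prod(transmissions)

print(inbreeding_coefficient(0.0))   # A is not inbred: (1/2)^5
print(inbreeding_coefficient(1.0))   # A is fully autozygous: twice as large
```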

53. Evolutionary Forces: The Causes of Evolution

54.

  • For a Hardy-Weinberg equilibrium (HWE) population, the genotype frequencies will remain unchanged from generation to generation. Two questions may arise that concern HWE.
  • (1) Do such HWE populations exist in nature?
  • (2) More importantly, if a population had unchanged genotype frequencies over time, it would be in a stationary state. Thus, wild-type teosinte would always be teosinte and never change. What, then, has turned teosinte into cultivated maize (see the figure above)?

55.

  • First of all, strictly speaking no HWE population exists in nature, because many evolutionary forces may operate in a population and cause its genotype frequencies to change.
  • Secondly, even if a population is at HWE, this equilibrium may be quickly violated because of some particular evolutionary forces.
  • These so-called evolutionary forces, which cause the structure and organization of a population to change, include mutation, selection, admixture, division, migration and genetic drift. Next, we will talk about the roles of some of these evolutionary forces in shaping a population.

56.

  • Mutation
  • Mutation is a change in genetic material, including nucleotide substitutions, insertions and deletions, and chromosome rearrangements
  • Mutation is of two types: forward mutation and reversible mutation
  • Forward mutation
  • Consider a gene A with two alleles A and a, with allele frequencies $p_A(t)$ and $p_a(t)$ in generation t
  • Allele A mutates to allele a, with the mutation rate per generation denoted by u
  • Forward mutation is a process in which the mutating allele is the prevalent wild-type allele

57.

  • With the definition of the mutation rate u (a fraction u of A alleles undergo mutation and become a alleles, whereas a fraction 1-u of A alleles escape mutation and remain A), the allele frequency in the next generation t+1 is
  • $p_A(t+1) = p_A(t) - p_A(t)u = (1-u)p_A(t)$.
  • In general, we have
  • $p_A(t+1) = (1-u)p_A(t) = (1-u)^2p_A(t-1) = \cdots$
  • $= (1-u)^{t+1}p_A(0)$.

58.

  • Assuming that the initial population is nearly fixed for A, i.e., $p_A(0) \approx 1$, and that t+1 is not too large relative to 1/u, we can approximate the allele frequencies by
  • $p_A(t+1) \approx p_A(0) - (t+1)u$,
  • $p_a(t+1) \approx p_a(0) + (t+1)u$.
  • The frequency of the mutant a allele increases linearly with time and the slope of the line equals u.
  • Because u is small, the linear increase in p ais difficult to detect unless a very large population size is used.
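
To see how good the linear approximation is, the exact and approximate frequencies can be compared numerically. This sketch counts elapsed generations with a single argument `generations` (the slides' t+1); the mutation rate $10^{-5}$ is an arbitrary illustrative value.

```python
def forward_mutation_exact(p0, u, generations):
    """Frequency of A after repeated forward mutation A -> a."""
    return (1 - u) ** generations * p0

def forward_mutation_linear(p0, u, generations):
    """Linear approximation, valid when generations is small relative to 1/u."""
    return p0 - generations * u

u = 1e-5
exact = forward_mutation_exact(1.0, u, 1000)
approx = forward_mutation_linear(1.0, u, 1000)
print(f"exact = {exact:.6f}, linear = {approx:.6f}")
```

After 1000 generations the two values still agree to about four decimal places, which illustrates why the increase of the mutant allele is hard to detect.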

59.

  • Reversible mutation
  • Reversible mutation allows the mutation from A to a (at the rate u per generation) and from a to A (at the rate v per generation).
  • Thus, allele A can have two origins in any generation:
  • One being allele A in the previous generation that escaped mutation to allele a
  • The second being reversibly mutated from allele a in the previous generation

60.

  • The allele frequency in the next generation is therefore expressed as
  • $p_A(t+1) = (1-u)p_A(t) + vp_a(t) = (1-u-v)p_A(t) + v$
  • Subtracting $v/(u+v)$ from both sides gives
  • $p_A(t+1) - \frac{v}{u+v} = (1-u-v)p_A(t) + v - \frac{v}{u+v}$
  • $= (1-u-v)p_A(t) + \frac{uv + v^2 - v}{u+v}$
  • $= \left[p_A(t) - \frac{v}{u+v}\right](1-u-v)$
  • $= \left[p_A(t-1) - \frac{v}{u+v}\right](1-u-v)^2$
  • $= \left[p_A(0) - \frac{v}{u+v}\right](1-u-v)^{t+1}$

61.

  • If $p_A(0) = v/(u+v)$, we have
  • $p_A(1) = p_A(2) = \cdots = p_A(t+1) = v/(u+v)$
  • We define
  • $\hat{p}_A = v/(u+v)$
  • as an equilibrium frequency, which is approached irrespective of the starting frequencies because $(1-u-v)^{t+1} \to 0$.
  • For realistic values of the mutation rates, it takes a very long time to reach this equilibrium.
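
The closed-form solution can be checked against direct iteration of the recursion. A minimal sketch: the function names and the mutation rates ($u = 10^{-3}$, $v = 5 \times 10^{-4}$) are illustrative assumptions, chosen unrealistically large so that convergence is visible.

```python
def reversible_closed_form(p0, u, v, generations):
    """p_A after the given number of generations, from the closed-form solution."""
    p_hat = v / (u + v)                          # equilibrium frequency
    return p_hat + (p0 - p_hat) * (1 - u - v) ** generations

def reversible_iterated(p0, u, v, generations):
    """Same quantity by iterating p_A(t+1) = (1-u) p_A(t) + v p_a(t)."""
    p = p0
    for _ in range(generations):
        p = (1 - u) * p + v * (1 - p)
    return p

u, v = 1e-3, 5e-4
print(reversible_closed_form(1.0, u, v, 500), reversible_iterated(1.0, u, v, 500))
print(reversible_closed_form(1.0, u, v, 100_000), v / (u + v))
```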

62.

  • Admixture
  • Admixture is an evolutionary process in which two or more HWE populations with differing allele frequencies are mixed to produce a new population.
  • The consequence of admixture is the deficiency of heterozygous genotypes relative to the frequency expected with HWE for the average allele frequencies

63.

  • Consider gene A with two alternative alleles A and a
  • Subpopulation 1 (HWE): AA, $p_A^2$; Aa, $2p_Ap_a$; aa, $p_a^2$
  • Subpopulation 2 (HWE): AA, $(p_A')^2$; Aa, $2p_A'p_a'$; aa, $(p_a')^2$
  • Admixture
  • Admixed population, also called mixed population, metapopulation or aggregate population (HWD): AA, $[p_A^2 + (p_A')^2]/2$; Aa, $(2p_Ap_a + 2p_A'p_a')/2$; aa, $[p_a^2 + (p_a')^2]/2$
  • Random mating
  • Fused population, or total population (HWE): AA, $\bar{p}_A^2$; Aa, $2\bar{p}_A\bar{p}_a$; aa, $\bar{p}_a^2$

64.

  • After admixture of two equal-sized subpopulations, the allele frequencies become $\bar{p}_A = (p_A + p_A')/2$ and $\bar{p}_a = (p_a + p_a')/2$
  • We find
  • $[p_A^2 + (p_A')^2]/2$ (metapopulation)
  • $\ge [p_A^2 + (p_A')^2]/2 - (p_A - p_A')^2/4$
  • $= [p_A^2 + (p_A')^2]/2 + 2p_Ap_A'/4 - [p_A^2 + (p_A')^2]/4$
  • $= [p_A^2 + (p_A')^2]/4 + 2p_Ap_A'/4$
  • $= (p_A + p_A')^2/4$
  • $= \bar{p}_A^2$ (HWE)

65.

  • $[p_a^2 + (p_a')^2]/2$ (metapopulation)
  • $\ge [p_a^2 + (p_a')^2]/2 - (p_a - p_a')^2/4$
  • $= [p_a^2 + (p_a')^2]/2 + 2p_ap_a'/4 - [p_a^2 + (p_a')^2]/4$
  • $= [p_a^2 + (p_a')^2]/4 + 2p_ap_a'/4$
  • $= (p_a + p_a')^2/4$
  • $= \bar{p}_a^2$ (HWE)
  • $p_Ap_a + p_A'p_a'$ (metapopulation)
  • $\le p_Ap_a + p_A'p_a' + (p_A - p_A')(p_a' - p_a)/2$
  • $= p_Ap_a + p_A'p_a' + (p_Ap_a' + p_A'p_a - p_Ap_a - p_A'p_a')/2$
  • $= (p_Ap_a + p_A'p_a' + p_Ap_a' + p_A'p_a)/2$
  • $= (p_A + p_A')(p_a + p_a')/2$
  • $= 2\bar{p}_A\bar{p}_a$ (HWE)

66.

  • Discovery 1
  • It can be seen that genotype frequencies are not equal to the products of the allele frequencies for the admixed population so that the mixed population is not in HWE.
  • Discovery 2
  • Relative to an HWE population, the aggregate population contains too few heterozygous genotypes and too many homozygous genotypes.

67.

  • Define the variance in allele frequency (in terms of recessive alleles) among the subpopulations by $\sigma^2$.
  • Value; frequency
  • Subpopulation 1: $p_a$; n
  • Subpopulation 2: $p_a'$; n' = n
  • Mean: $\bar{p}_a$
  • Based on the definition of variance, we have
  • $\sigma^2 = [(p_a - \bar{p}_a)^2 + (p_a' - \bar{p}_a)^2]/2$
  • $= [p_a^2 + (p_a')^2]/2 + \bar{p}_a^2 - p_a\bar{p}_a - p_a'\bar{p}_a$
  • $= [p_a^2 + (p_a')^2]/2 + \bar{p}_a^2 - 2\bar{p}_a[(p_a + p_a')/2]$
  • $= [p_a^2 + (p_a')^2]/2 - \bar{p}_a^2$

68.

  • $\sigma^2$ is actually the difference between the frequency of the homozygous recessive genotype ($R_S$) in the metapopulation (equal to the average genotype frequency among the subpopulations) and the frequency ($R_T$) that would be expected in a total population in HWE, i.e.,
  • $\sigma^2 = R_S - R_T \ge 0$, so $R_S = R_T + \sigma^2 \ge R_T$

69.

  • Discovery 3
  • The average frequency of homozygous recessive genotypes among a group of subpopulations is always greater than the frequency of homozygous recessive genotypes that would be expected with random mating, and the excess is numerically equal to the variance in the recessive allele frequency.
  • The relationship $R_S = R_T + \sigma^2 \ge R_T$ is called Wahlund's principle

70.

  • Example: Two subpopulations of gray squirrels
  • For the homozygous recessive genotype, we have $p_a^2 = 0.16$ and $(p_a')^2 = 0$
  • The genotype frequency in the metapopulation is
  • (0.16 + 0)/2 = 0.08
  • The allele frequency in the metapopulation is
  • $(\sqrt{0.16} + 0)/2 = 0.2$
  • The frequency of the homozygous recessive genotype in the HWE total population is
  • $0.2^2 = 0.04 < 0.08$
  • The variance in allele frequency is
  • $[(\sqrt{0.16} - 0.2)^2 + (0 - 0.2)^2]/2 = 0.04$, which equals the reduction in the frequency of the homozygous recessive (0.08 - 0.04).
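
Wahlund's principle can be verified on the squirrel numbers. This is a sketch assuming equal-sized subpopulations, each in HWE; the function name `wahlund` is hypothetical, and the inputs are the recessive allele frequencies $\sqrt{0.16} = 0.4$ and 0.

```python
def wahlund(p_a_subpops):
    """Return (R_S, R_T, variance) for equal-sized HWE subpopulations."""
    k = len(p_a_subpops)
    R_S = sum(p ** 2 for p in p_a_subpops) / k       # mean aa frequency
    p_bar = sum(p_a_subpops) / k                     # mean allele frequency
    R_T = p_bar ** 2                                 # aa frequency after fusion
    variance = sum((p - p_bar) ** 2 for p in p_a_subpops) / k
    return R_S, R_T, variance

R_S, R_T, variance = wahlund([0.4, 0.0])
print(f"R_S = {R_S:.2f}, R_T = {R_T:.2f}, variance = {variance:.2f}")
```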

71.

  • Population structure
  • Similar to $\sigma^2 = R_S - R_T = [p_a^2 + (p_a')^2]/2 - \bar{p}_a^2$ for homozygous recessive genotypes, we have
  • $\sigma^2 = D_S - D_T = [p_A^2 + (p_A')^2]/2 - \bar{p}_A^2$
  • for homozygous dominant genotypes.
  • For heterozygous genotypes, we have
  • $H_S - H_T = -2\sigma^2$

72.

  • Recall the definition of the inbreeding coefficient
  • $F = (P_{0Aa} - P_{Aa})/P_{0Aa}$, which describes the deficiency of heterozygous genotypes in an inbred population ($P_{Aa}$) relative to a population in HWE ($P_{0Aa}$).
  • We define
  • $F_{ST} = (H_T - H_S)/H_T$,
  • as the fixation index in the metapopulation.
  • The metapopulation is thus analogous to an inbred population

73.

  • Redefine
  • $F_{ST} = \sigma^2/(\bar{p}_A\bar{p}_a)$.
  • This is a fundamental relation in population genetics that connects the fixation index in a metapopulation with the variance in allele frequencies among the subpopulations. The fixation index can be interpreted in terms of the inbreeding coefficient. Thus, the genotype frequencies in a metapopulation are expressed as
  • AA: $\bar{p}_A^2 + \bar{p}_A\bar{p}_aF_{ST} = \bar{p}_A^2(1 - F_{ST}) + \bar{p}_AF_{ST}$
  • Aa: $2\bar{p}_A\bar{p}_a - 2\bar{p}_A\bar{p}_aF_{ST} = 2\bar{p}_A\bar{p}_a(1 - F_{ST})$
  • aa: $\bar{p}_a^2 + \bar{p}_A\bar{p}_aF_{ST} = \bar{p}_a^2(1 - F_{ST}) + \bar{p}_aF_{ST}$
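
As a consistency check, the fixation index and the resulting metapopulation genotype frequencies can be computed for the two gray-squirrel subpopulations (recessive allele frequencies 0.4 and 0). The function names are assumptions made for this sketch, and equal subpopulation sizes are assumed as in the slides.

```python
def fixation_index(p_a_subpops):
    """Return (F_ST, mean recessive allele frequency) for equal-sized subpopulations."""
    k = len(p_a_subpops)
    p_bar_a = sum(p_a_subpops) / k
    variance = sum((p - p_bar_a) ** 2 for p in p_a_subpops) / k
    return variance / (p_bar_a * (1 - p_bar_a)), p_bar_a

def metapopulation_genotypes(p_bar_a, F_ST):
    """Genotype frequencies (AA, Aa, aa) in the metapopulation."""
    p_bar_A = 1 - p_bar_a
    AA = p_bar_A ** 2 + p_bar_A * p_bar_a * F_ST
    Aa = 2 * p_bar_A * p_bar_a * (1 - F_ST)
    aa = p_bar_a ** 2 + p_bar_A * p_bar_a * F_ST
    return AA, Aa, aa

F_ST, p_bar_a = fixation_index([0.4, 0.0])
AA, Aa, aa = metapopulation_genotypes(p_bar_a, F_ST)
print(f"F_ST = {F_ST:.2f}; AA = {AA:.2f}, Aa = {Aa:.2f}, aa = {aa:.2f}")
```

The aa frequency returned here matches the average of the subpopulation frequencies, (0.16 + 0)/2 = 0.08.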

74.

  • Remarks
  • Even though each subpopulation itself is undergoing random mating and is in HWE, there is inbreeding in the metapopulation composed of the aggregate of subpopulations.
  • A metapopulation may be composed of many smaller subpopulations each of which may be in HWE (theory for population structure).

75.

  • Natural Selection
  • Selection is the principal process that results in greater adaptation of organisms to their environment
  • Through selection the genotypes that are superior in survival and reproduction increase in frequency in the population

76.

  • Haploid selection: selection at the gamete level
  • Two alleles A and a, with initial frequencies $p_A$ and $p_a$
  • Haploid progeny (reproduction): 10 A ($p_A = 1/2$), 10 a ($p_a = 1/2$)
  • Maturation
  • Survival (adults): 9 A, 6 a
  • Viability (or absolute fitness): A, 9/10 = 0.90; a, 6/10 = 0.60
  • Relative fitness: $w_A = 0.90/0.90 = 1$, $w_a = 0.60/0.90 = 0.67$
  • Selection coefficient: 0 for A; $s = 1 - 0.67 = 0.33$ for a
  • New frequencies: $p_A' = 9/15$, $p_a' = 6/15$
  • Haploid progeny (reproduction): 12 A, 8 a

77.

  • Viability or survivorship: the probability of survival, which is also called fitness.
  • Fitness has two types: absolute fitness, measured separately for each genotype, and relative fitness (the ability of one genotype to survive relative to another genotype taken as a standard)
  • In practice it is difficult to measure absolute fitness because it requires knowing the absolute number of each genotype, whereas relative fitness can be measured by the sampling approach
  • Selection coefficient: 1 - relative fitness

78.

  • In general, the new frequency for allele A is expressed as
  • $p_A' = \frac{p_Aw_A}{p_Aw_A + p_aw_a} = \frac{p_A}{1 - p_as}$
  • In the above example, $p_A = p_a = 1/2$, $w_A = 1$, $w_a = 2/3$ and $s = 1/3$, so we have $p_A' = (1/2)/(1 - 1/2 \times 1/3) = 3/5 = 9/15$.
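
The haploid example can be reproduced with the standard one-generation recursion $p_A' = p_Aw_A/(p_Aw_A + p_aw_a)$, which is consistent with the worked numbers above. The function names are illustrative; the last line checks that the log ratio $\ln(p_A/p_a)$ grows by $-\ln(1-s)$ per generation, which is close to s when s is small.

```python
import math

def haploid_next(p_A, w_A, w_a):
    """Frequency of A after one generation of haploid selection."""
    p_a = 1 - p_A
    return p_A * w_A / (p_A * w_A + p_a * w_a)

def log_ratio(p_A):
    """ln(p_A / p_a)."""
    return math.log(p_A / (1 - p_A))

s = 1 / 3
p_next = haploid_next(0.5, 1.0, 1 - s)
step = log_ratio(p_next) - log_ratio(0.5)    # per-generation change in ln(p_A/p_a)
print(f"p_A' = {p_next:.3f}, change in log ratio = {step:.4f}, -ln(1-s) = {-math.log(1 - s):.4f}")
```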

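79.

80.

  • Because selection multiplies the ratio $p_A/p_a$ by $w_A/w_a = 1/(1-s)$ in each generation, the method of successive substitutions gives
  • $\frac{p_A(t)}{p_a(t)} = \frac{p_A(0)}{p_a(0)}\left(\frac{1}{1-s}\right)^t$

81. Taking the natural logarithm of both sides of the above equation, we have

  • $\ln\frac{p_A(t)}{p_a(t)} = \ln\frac{p_A(0)}{p_a(0)} - t\ln(1-s) \approx \ln\frac{p_A(0)}{p_a(0)} + st$ (for a not-too-large s)
  • If s is not too large, $\ln(p_A/p_a)$ should be linear with time, with a slope equal to the value of s.
  • This is one approach by which the selection coefficient can be estimated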

82. Example: E. coli

  • Generation ln(p A /p a )
  • 0 0.34
  • 5 0.53
  • 10 1.01
  • 20 1.47
  • 25 1.47
  • 30 1.10
  • 1.50
  • Using the linear regression model $\ln[p_A(t)/p_a(t)] = \ln[p_A(0)/p_a(0)] + st$, we estimate
  • $\ln(p_A/p_a) = 0.52 + 0.0323t$, i.e., $s \approx 0.0323$ per generation (Hartl and Dykhuizen 1981).

83. Diploid selection: selection at the zygote level

  • Two alleles A and a, with initial frequencies $p_A = 1/2$ and $p_a = 1/2$
  • Zygotes: 5 AA, 10 Aa, 5 aa
  • Maturation
  • Survival (adults): 5 AA, 8 Aa, 3 aa
  • Absolute fitness: AA, 5/5 = 1; Aa, 8/10 = 0.8; aa, 3/5 = 0.6
  • Relative fitness: $w_{AA} = 1$, $w_{Aa} = 0.8/1 = 0.80$, $w_{aa} = 0.6/1 = 0.60$
  • Selection coefficient: 0 for AA; $hs = 1 - 0.80 = 0.20$ for Aa; $s = 1 - 0.60 = 0.40$ for aa
  • New frequencies: $p_A' = (2 \times 5 + 8)/[2(5+8+3)] = 18/32$, $p_a' = (2 \times 3 + 8)/[2(5+8+3)] = 14/32$
  • Random mating with HWE leads to (for 20 zygotes)
  • AA: $P_{AA} = (18/32)^2 \times 20 \approx 6$
  • Aa: $P_{Aa} = 2(18/32)(14/32) \times 20 \approx 10$
  • aa: $P_{aa} = (14/32)^2 \times 20 \approx 4$

84. Define h = hs/s as the degree of dominance of allele a. We have

  • h = 0 means that a is recessive to A,
  • h = 1/2 means that the heterozygous fitness is the arithmetic average of the homozygous fitnesses; in this case, the effects of the alleles are said to be additive
  • h = 1 means that allele a is dominant to allele A.
  • It is possible that h < 0 or h > 1.

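85. In general, the allele frequencies in the next generation after diploid selection are expressed as

  • $p_A' = \frac{p_A^2w_{AA} + p_Ap_aw_{Aa}}{\bar{w}}$, $p_a' = \frac{p_Ap_aw_{Aa} + p_a^2w_{aa}}{\bar{w}}$
  • where the denominator is the average fitness in the population, symbolized by $\bar{w} = p_A^2w_{AA} + 2p_Ap_aw_{Aa} + p_a^2w_{aa}$

86. This equation has no analytical solution, and for this reason it is more useful to calculate the difference

  • $\Delta p_A = p_A' - p_A = \frac{p_Ap_a[p_A(w_{AA} - w_{Aa}) + p_a(w_{Aa} - w_{aa})]}{\bar{w}}$

87. Example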

  • In the initial population, $P_{AA} = 0$, $P_{Aa} = 2/3$, $P_{aa} = 1/3$, so we have $p_A = 1/3$ and $p_a = 2/3$. The measured fitnesses are $w_{AA} = 0$, $w_{Aa} = 0.50$ and $w_{aa} = 1$.
  • In the second generation, we expect
  • $p_A = \dfrac{(1/3)^2 \times 0 + (1/3)(2/3) \times 0.50}{(1/3)^2 \times 0 + 2(1/3)(2/3) \times 0.50 + (2/3)^2 \times 1} = \dfrac{1/9}{6/9} = 1/6.$
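The one-generation recursion used in this example can be written as a small function. This is a sketch under the standard diploid selection model with random mating; the function name `next_p_A` is illustrative.

```python
from fractions import Fraction


def next_p_A(p_A, w_AA, w_Aa, w_aa):
    """Frequency of allele A after one generation of viability selection."""
    p_a = 1 - p_A
    # Mean fitness of the population (the denominator of the recursion).
    w_bar = p_A ** 2 * w_AA + 2 * p_A * p_a * w_Aa + p_a ** 2 * w_aa
    return (p_A ** 2 * w_AA + p_A * p_a * w_Aa) / w_bar


# Values from the example: p_A = 1/3, w_AA = 0, w_Aa = 0.50, w_aa = 1.
p_next = next_p_A(Fraction(1, 3), 0, Fraction(1, 2), 1)
print(p_next)  # 1/6
```

Calling the function repeatedly on its own output traces the allele frequency over successive generations.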

88. Time required for changes in gene frequency

  • With the selection coefficient (s), the degree of dominance (h) and $\bar{w} \approx 1$ (if selection is weak), the difference in allele frequency can be expressed as
  • $\Delta p_A = p_A p_a s [p_A h + p_a (1-h)]$.

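89. The time t required for the allele frequency of A to change from $p_A(0)$ to $p_A(t)$ can be determined in each of the three following special cases:

  • 1. Allele A is a favored dominant, in which case h = 0 and $\Delta p_A = p_A p_a^2 s$, i.e.,
  • $dp_A/dt = p_A p_a^2 s$,
  • whose integral is
  • $st = \ln\dfrac{p_A(t)\,p_a(0)}{p_a(t)\,p_A(0)} + \dfrac{1}{p_a(t)} - \dfrac{1}{p_a(0)}$.
  • In the special case $p_a(0) \approx p_a(t) \approx 1$ (A stays rare), we have
  • $t \approx (1/s)\ln[p_A(t)/p_A(0)]$.

90.

  • 2. Allele A is favored and the alleles are additive, in which case h = 1/2 and $\Delta p_A = p_A p_a s/2$, i.e.,
  • $dp_A/dt = p_A p_a s/2$,
  • whose integral is
  • $st/2 = \ln\dfrac{p_A(t)\,p_a(0)}{p_a(t)\,p_A(0)}$.
  • In the special case $p_a(0) \approx p_a(t) \approx 1$, we have
  • $t \approx (2/s)\ln[p_A(t)/p_A(0)]$.

91.

  • 3. Allele A is a favored recessive, in which case h = 1 and $\Delta p_A = p_A^2 p_a s$, i.e.,
  • $dp_A/dt = p_A^2 p_a s$,
  • whose integral is
  • $st = \ln\dfrac{p_A(t)\,p_a(0)}{p_a(t)\,p_A(0)} - \dfrac{1}{p_A(t)} + \dfrac{1}{p_A(0)}$.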
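The three cases can be compared numerically. The sketch below iterates the weak-selection recursion $\Delta p_A = p_A p_a s[p_A h + p_a(1-h)]$ generation by generation and sets the result against the time obtained by integrating $dp_A/dt$ for h = 0, 1/2 and 1. The values s = 0.01, $p_A(0)$ = 0.01 and target frequency 0.5 are hypothetical, and the closed forms are the standard integrals of the three differential equations, stated here as an assumption about what the slides intended.

```python
import math


def generations_by_iteration(p0, p_target, s, h):
    """Count generations until p_A reaches p_target under weak selection."""
    p, t = p0, 0
    while p < p_target:
        q = 1.0 - p
        p += p * q * s * (p * h + q * (1.0 - h))
        t += 1
    return t


def generations_closed_form(p0, pt, s, case):
    """Continuous-time approximation from integrating dp_A/dt."""
    q0, qt = 1.0 - p0, 1.0 - pt
    log_term = math.log((pt * q0) / (qt * p0))
    if case == "dominant":      # h = 0, dp/dt = p q^2 s
        return (log_term + 1.0 / qt - 1.0 / q0) / s
    if case == "additive":      # h = 1/2, dp/dt = p q s / 2
        return 2.0 * log_term / s
    if case == "recessive":     # h = 1, dp/dt = p^2 q s
        return (log_term - 1.0 / pt + 1.0 / p0) / s
    raise ValueError(f"unknown case: {case}")


s, p0, p_target = 0.01, 0.01, 0.5   # hypothetical values
results = {}
for case, h in (("dominant", 0.0), ("additive", 0.5), ("recessive", 1.0)):
    t_iter = generations_by_iteration(p0, p_target, s, h)
    t_formula = generations_closed_form(p0, p_target, s, case)
    results[case] = (t_iter, t_formula)
    print(f"{case:10s} iteration: {t_iter:6d}  closed form: {t_formula:9.1f}")
```

The favored recessive takes far longer than the other two cases to leave low frequency, because a rare recessive allele is almost always hidden in heterozygotes.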

92. Implication: If selection is operating on a rare harmful recessive allele (say a), what is the consequence?

  • This is the case when allele A is a favored dominant: $\Delta p_A = p_A p_a^2 s$ with $p_a \approx 0$ and $p_a^2 \approx 0$.
  • Even if the selection coefficient s is very large, $\Delta p_A$ is still small.
  • In other words, the change in allele frequency of a rare harmful recessive is slow whatever the value of the selection coefficient.
  • In humans, the forced sterilization of rare homozygous recessive individuals is therefore genetically ineffective, in addition to being morally unacceptable.

93. Other evolutionary forces

  • Migration : The movement of individuals among subpopulations
  • Random genetic drift : Fluctuations in allele frequency that happen by chance, particularly in small populations, as a result of random sampling among gametes
  • Mutation-selection balance: Selection and mutation affect a population at the same time

94. Overviews

  • HWE (estimate and test)
  • LD (test)
  • Inbreeding coefficient (evolutionary significance)
  • IBD
  • Evolutionary forces
  • Mutation
  • Admixture
  • Population structure
  • Selection

95. Discussion paper

  • Thornsberry, J. M., M. M. Goodman, J. Doebley, S. Kresovich, D. Nielsen, and E. S. Buckler, IV. 2001. Dwarf8 polymorphisms associate with variation in flowering time. Nature Genetics 28: 286-289.
  • Pritchard, J. K. 2001. Deconstructing maize population structure. Nature Genetics 28: 203-204.

96. Quantitative genetics

  • Many traits that are important in agriculture, biology and biomedicine are continuous in their phenotypes. For example,
  • Crop Yield
  • Stemwood Volume
  • Plant Disease Resistances
  • Body Weight in Animals
  • Fat Content of Meat
  • Time to First Flower
  • IQ
  • Blood Pressure

97. The following image demonstrates the variation for flower diameter, number of flower parts and the color of the flower in Gaillardia pulchella (McClean 1997). Each trait is controlled by a number of genes, each interacting with the others and with an array of environmental factors.

98.

  • Number of genes | Number of genotypes (two alleles per gene, $3^n$ for n genes)
  • 1 | 3
  • 2 | 9
  • 5 | 243
  • 10 | 59,049

99. Consider two genes, A with two alleles A and a, and B with two alleles B and b.

  • Each of the alleles will be assigned metric values
  • We give the A allele 4 units and the a allele 2 units
  • At the other locus, the B allele will be given 2 units and the b allele 1 unit
  • Genotype | Ratio | Metric value
  • AABB | 1 | 12
  • AABb | 2 | 11
  • AAbb | 1 | 10
  • AaBB | 2 | 10
  • AaBb | 4 | 9
  • Aabb | 2 | 8
  • aaBB | 1 | 8
  • aaBb | 2 | 7
  • aabb | 1 | 6
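The table above can be generated by enumeration. The sketch assumes that the 1:2:1:2:4:2:1:2:1 ratios come from a cross of two AaBb parents with independent assortment, and that allelic values simply add up; both are assumptions read off the table rather than stated on the slide.

```python
from collections import Counter
from itertools import product

# Metric value assigned to each allele on the slide.
allele_value = {"A": 4, "a": 2, "B": 2, "b": 1}


def f2_metric_distribution():
    """Count the 16 equally likely gamete combinations by total metric value."""
    counts = Counter()
    for a1, a2, b1, b2 in product("Aa", "Aa", "Bb", "Bb"):
        total = sum(allele_value[x] for x in (a1, a2, b1, b2))
        counts[total] += 1
    return dict(sorted(counts.items()))


distribution = f2_metric_distribution()
for value, count in distribution.items():
    print(f"{value:2d} {'#' * count}")
```

Collapsing the nine genotypes onto their metric values gives a symmetric, bell-like 1:2:3:4:3:2:1 histogram, which is the point of the graphical presentation that follows.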

100. A graphical format is used to present the above results:

101. Normal distribution of a quantitative trait may be due to

  • Many genes
  • Environmental effects
  • The traditional view: polygenes, each with a small effect and sensitive to the environment
  • The new view: a few major genes and many polygenes (oligogenic control), interacting with environments

102. Traditional quantitative genetics research: Variance component partitioning

  • The phenotypic variance of a quantitative trait can be partitioned into genetic and environmental variance components.
  • To understand the inheritance of the trait, we need to estimate the relative contribution of these two components.
  • We define the proportion of the genetic variance to the total phenotypic variance as the heritability ($H^2$).
  • - If $H^2 = 1.0$, then the trait is 100% controlled by genetics
  • - If $H^2 = 0$, then the trait is purely affected by environmental factors.

103.

  • Fisher (1918) proposed a theory for partitioning genetic variance into additive, dominant and epistatic components;
  • Cockerham (1954) explained these genetic variance components in terms of experimental variances (from ANOVA), which makes it possible to estimate additive and dominant components (but not the epistatic component);
  • I proposed a clonal design to estimate additive, dominant and part-of-epistatic variance components
  • Wu, R., 1996. Detecting epistatic genetic variance with a clonally replicated design: Models for low- vs. high-order nonallelic interaction. Theoretical and Applied Genetics 93: 102-109.

104. Genetic Parameters: Means and (Co)variances

  • One-gene model
  • Quantity | aa | Aa | AA
  • Genotypic value | $G_0$ | $G_1$ | $G_2$
  • Net genotypic value | $-a$ | $d$ | $a$
  • origin (0) = $(G_0 + G_2)/2$, the midpoint between the two homozygotes
  • a = additive genotypic value
  • d = dominant genotypic value
  • Environmental deviation | $E_0$ | $E_1$ | $E_2$
  • Phenotype or phenotypic value | $Y_0 = G_0 + E_0$ | $Y_1 = G_1 + E_1$ | $Y_2 = G_2 + E_2$
  • Genotype frequency | $P_0$ | $P_1$ | $P_2$
  • at HWE | $q^2$ | $2pq$ | $p^2$
  • Deviation from population mean $\mu$ | $-a - \mu$ | $d - \mu$ | $a - \mu$
  • | $= -2p[a+(q-p)d] - 2p^2 d$ | $= (q-p)[a+(q-p)d] + 2pqd$ | $= 2q[a+(q-p)d] - 2q^2 d$
  • Letting $\alpha = a+(q-p)d$ | $= -2p\alpha - 2p^2 d$ | $= (q-p)\alpha + 2pqd$ | $= 2q\alpha - 2q^2 d$
  • Breeding value | $-2p\alpha$ | $(q-p)\alpha$ | $2q\alpha$
  • Dominant deviation | $-2p^2 d$ | $2pqd$ | $-2q^2 d$

105.

  • Population mean $\mu = q^2(-a) + 2pqd + p^2 a = (p-q)a + 2pqd$
  • Genetic variance $\sigma_g^2 = q^2(-2p\alpha - 2p^2 d)^2 + 2pq[(q-p)\alpha + 2pqd]^2 + p^2(2q\alpha - 2q^2 d)^2$
  • $= 2pq\alpha^2 + (2pqd)^2$
  • $= \sigma_a^2$ (or $V_A$) $+ \sigma_d^2$ (or $V_D$)
  • Additive genetic variance $\sigma_a^2 = 2pq\alpha^2$, depending on both a and d
  • Dominant genetic variance $\sigma_d^2 = (2pqd)^2$, depending only on d
  • Phenotypic variance $\sigma_P^2 = q^2 Y_0^2 + 2pq Y_1^2 + p^2 Y_2^2 - (q^2 Y_0 + 2pq Y_1 + p^2 Y_2)^2$
  • Define
  • $H^2 = \sigma_g^2/\sigma_P^2$ as the broad-sense heritability
  • $h^2 = \sigma_a^2/\sigma_P^2$ as the narrow-sense heritability
  • These two heritabilities are important in understanding the relative contribution of genetic and environmental factors to the overall phenotypic variance.
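The partition of the genetic variance can be checked numerically. The sketch below computes the genetic variance directly from the three genotypic values at HWE and compares it with $2pq\alpha^2 + (2pqd)^2$. The parameter values and the environmental variance `v_e` are hypothetical, and the phenotypic variance is taken as the sum of genetic and environmental variance, which assumes no genotype-environment covariance.

```python
def one_locus_components(p, a, d, v_e=0.0):
    """Mean, average effect and variance components for one biallelic locus at HWE."""
    q = 1.0 - p
    alpha = a + (q - p) * d                 # average effect of allele substitution
    mean = (p - q) * a + 2 * p * q * d      # population mean
    v_a = 2 * p * q * alpha ** 2            # additive genetic variance
    v_d = (2 * p * q * d) ** 2              # dominance genetic variance

    # Direct calculation from genotype frequencies and values (aa, Aa, AA).
    freqs = (q * q, 2 * p * q, p * p)
    values = (-a, d, a)
    v_g = sum(f * (g - mean) ** 2 for f, g in zip(freqs, values))

    v_p = v_g + v_e                         # assumes no G x E covariance
    return {
        "alpha": alpha, "mean": mean, "V_A": v_a, "V_D": v_d, "V_G": v_g,
        "H2": v_g / v_p, "h2": v_a / v_p,
    }


# Hypothetical values: p = 0.3, a = 2, d = 1, environmental variance 1.5.
comp = one_locus_components(p=0.3, a=2.0, d=1.0, v_e=1.5)
for name, value in comp.items():
    print(f"{name:5s} {value:8.4f}")
```

Changing p while holding a and d fixed shows that the additive variance, and hence $h^2$, depends on allele frequency and not only on gene action.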

106. What is $\alpha = a + (q-p)d$?

  • It is the average effect of substituting one allele for the other (here, replacing a with A).
  • The substitution a → A contains two possibilities
  • Event | from aa to Aa | from Aa to AA
  • Frequency | q | p
  • Value change | $d - (-a)$ | $a - d$
  • $\alpha = q[d - (-a)] + p(a - d)$
  • $= a + (q-p)d$

107. Midparent-offspring correlation

  • Genotype of parents | Freq. of matings | Midparent value | Progeny AA (value a) | Progeny Aa (value d) | Progeny aa (value -a) | Mean value of progeny
  • AA × AA | $p^4$ | $a$ | 1 | - | - | $a$
  • AA × Aa | $4p^3 q$ | $\tfrac{1}{2}(a+d)$ | 1/2 | 1/2 | - | $\tfrac{1}{2}(a+d)$
  • AA × aa | $2p^2 q^2$ | 0 | - | 1 | - | $d$
  • Aa × Aa | $4p^2 q^2$ | $d$ | 1/4 | 1/2 | 1/4 | $\tfrac{1}{2}d$
  • Aa × aa | $4pq^3$ | $\tfrac{1}{2}(-a+d)$ | - | 1/2 | 1/2 | $\tfrac{1}{2}(-a+d)$
  • aa × aa | $q^4$ | $-a$ | - | - | 1 | $-a$

108.

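  • Covariance between midparent and offspring:
  • Cov(OP)
  • $= E(OP) - E(O)E(P)$
  • $= p^4 \cdot a \cdot a + 4p^3 q \cdot \tfrac{1}{2}(a+d) \cdot \tfrac{1}{2}(a+d) + \dots + q^4(-a)(-a) - [(p-q)a + 2pqd]^2$
  • $= pq\alpha^2$
  • $= \tfrac{1}{2}\sigma_a^2$
  • The regression of offspring on midparent values is
  • $b = \mathrm{Cov}(OP)/\sigma^2(\bar{P})$
  • $= \tfrac{1}{2}\sigma_a^2 / \tfrac{1}{2}\sigma_P^2$
  • $= \sigma_a^2/\sigma_P^2$
  • $= h^2$
  • where $\sigma^2(\bar{P}) = \tfrac{1}{2}\sigma_P^2$ is the variance of midparent value.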
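The covariance can be verified by summing over the six mating types. The sketch below uses the mating frequencies, midparent values and progeny means of the standard random-mating table, and ignores environmental deviations, so the phenotypic variance equals the genetic variance and the slope comes out as $\sigma_a^2/\sigma_g^2$. The parameter values are hypothetical.

```python
def midparent_offspring(p, a, d):
    """Exact covariance and regression slope of progeny mean on midparent value."""
    q = 1.0 - p
    # (mating frequency, midparent value, mean value of progeny)
    rows = [
        (p ** 4,              a,            a),            # AA x AA
        (4 * p ** 3 * q,      (a + d) / 2,  (a + d) / 2),  # AA x Aa
        (2 * p ** 2 * q ** 2, 0.0,          d),            # AA x aa
        (4 * p ** 2 * q ** 2, d,            d / 2),        # Aa x Aa
        (4 * p * q ** 3,      (d - a) / 2,  (d - a) / 2),  # Aa x aa
        (q ** 4,              -a,           -a),           # aa x aa
    ]
    e_p = sum(f * mp for f, mp, _ in rows)
    e_o = sum(f * off for f, _, off in rows)
    e_op = sum(f * mp * off for f, mp, off in rows)
    var_mp = sum(f * mp ** 2 for f, mp, _ in rows) - e_p ** 2
    cov = e_op - e_o * e_p
    return cov, var_mp, cov / var_mp


p, a, d = 0.3, 2.0, 1.0                      # hypothetical values
q = 1.0 - p
alpha = a + (q - p) * d
v_a = 2 * p * q * alpha ** 2
v_g = v_a + (2 * p * q * d) ** 2

cov, var_mp, slope = midparent_offspring(p, a, d)
print(f"Cov(OP) = {cov:.4f}   pq*alpha^2 = {p * q * alpha ** 2:.4f}")
print(f"slope b = {slope:.4f}   V_A / V_G  = {v_a / v_g:.4f}")
```

Adding an environmental variance would inflate the midparent variance and shrink the slope toward $\sigma_a^2/\sigma_P^2$, the narrow-sense heritability.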

109.

  • IMPORTANT
  • The regression of offspring on midparent values can be used to measure the heritability!
  • This is a fundamental contribution by R. A. Fisher.

110. You can derive other relationships

  • Degree of relationship | Covariance
  • Offspring and one parent | Cov(OP) = $\sigma_a^2/2$
  • Half siblings | Cov(HS) = $\sigma_a^2/4$
  • Full siblings | Cov(FS) = $\sigma_a^2/2 + \sigma_d^2/4$
  • Monozygotic twins | Cov(MT) = $\sigma_a^2 + \sigma_d^2$
  • Nephew and uncle | Cov(NU) = $\sigma_a^2/4$
  • First cousins | Cov(FC) = $\sigma_a^2/8$
  • Double first cousins | Cov(DFC) = $\sigma_a^2/4 + \sigma_d^2/16$
  • Offspring and midparent | Cov(O$\bar{P}$) = $\sigma_a^2/2$

111. Cockerham's experimental and mating designs

  • By estimating the covariances between relatives, we can estimate the additive (or mixed additive and dominant) variance and, therefore, the heritability.
  • Next, I will introduce mating and experimental designs used to estimate the covariances between relatives.

112. Mating design

  • Mating design is used to generate genetic pedigrees, genetic information and materials that can be used in a breeding program
  • Mating design provides genetic materials, whereas experimental design is utilized to obtain and analyze the data from these materials

113. Objectives of mating designs

  • 1) Provide information for evaluating parents
  • 2) Provide estimates of genetic parameters
  • 3) Provide estimates of genetic gains
  • 4) Provide a base population for selection

114. Commonly used mating designs

  • 1) Open-pollinated
  • 2) Polycross
  • 3) Single-pair mating
  • 4) Nested mating
  • 5) Factorial mating & tester design
  • 6) Diallel mating (full, half, partial & disconnected)

115. Nested mating (NC Design I)

  • Each male parent is mated to a different subset of female parents

116.

  • $\mathrm{Cov}(HS_M) = \tfrac{1}{4}V_A$
  • $V(\text{female/male}) = \mathrm{Cov}(FS) - \mathrm{Cov}(HS_M)$
  • $= \tfrac{1}{2}V_A + \tfrac{1}{4}V_D - \tfrac{1}{4}V_A$
  • $= \tfrac{1}{4}V_A + \tfrac{1}{4}V_D$
  • - Provide information for parents and full-sib families
  • - Provide estimates of both additive and dominance effects
  • - Provide estimates of genetic gains from both $V_A$ and $V_D$
  • - Not efficient for selection
  • - Low cost for controlled mating

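117. Example: Data structure for NC Design I

  • Sample | Male | Female | Full-sib family | Individual | Phenotype
  • 1 | 1 | A | 1 | 1 | $y_{1A1}$
  • 2 | 1 | A | 1 | 2 | $y_{1A2}$
  • 3 | 1 | B | 2 | 1 | $y_{1B1}$
  • 4 | 1 | B | 2 | 2 | $y_{1B2}$
  • 5 | 1 | C | 3 | 1 | $y_{1C1}$
  • 6 | 1 | C | 3 | 2 | $y_{1C2}$
  • 7 | 2 | D | 4 | 1 | $y_{2D1}$
  • 8 | 2 | D | 4 | 2 | $y_{2D2}$
  • 9 | 2 | E | 5 | 1 | $y_{2E1}$
  • 10 | 2 | E | 5 | 2 | $y_{2E2}$
  • 11 | 2 | F | 6 | 1 | $y_{2F1}$
  • 12 | 2 | F | 6 | 2 | $y_{2F2}$
  • 13 | 3 | G | 7 | 1 | $y_{3G1}$
  • 14 | 3 | G | 7 | 2 | $y_{3G2}$
  • 15 | 3 | H | 8 | 1 | $y_{3H1}$
  • 16 | 3 | H | 8 | 2 | $y_{3H2}$
  • 17 | 3 | I | 9 | 1 | $y_{3I1}$
  • 18 | 3 | I | 9 | 2 | $y_{3I2}$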

118. Estimates by statistical software

  • $V_{Total} = 40$
  • $V_{FS} = \mathrm{Cov}(FS) = 10$
  • $V_M = \mathrm{Cov}(HS_M) = 4$
  • $V_E = V_{Total} - V_{FS} = 40 - 10 = 30$
  • $V(\text{female/male}) = \mathrm{Cov}(FS) - \mathrm{Cov}(HS_M)$
  • $= 10 - 4 = 6$
  • $V_A = 4\,\mathrm{Cov}(HS_M) = 4 \times 4 = 16$, so $h^2 = 16/40 = 0.40$
  • $V(\text{female/male}) = \tfrac{1}{4}V_A + \tfrac{1}{4}V_D = 4 + \tfrac{1}{4}V_D = 6$
  • $V_D = 8$, $V_G = V_A + V_D = 16 + 8 = 24$
  • $H^2 = 24/40 = 0.60$
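The chain of calculations for NC Design I can be collected in one function. This is a sketch that takes the three variance estimates as given (in practice they come from the ANOVA or mixed-model output of statistical software); the function and key names are illustrative.

```python
def nc_design_one(v_total, cov_fs, cov_hs_male):
    """Genetic parameters from a nested (NC Design I) analysis."""
    v_a = 4 * cov_hs_male                       # Cov(HS_M) = V_A / 4
    v_female_in_male = cov_fs - cov_hs_male     # = V_A / 4 + V_D / 4
    v_d = 4 * v_female_in_male - v_a            # solve for V_D
    v_g = v_a + v_d
    return {
        "V_A": v_a,
        "V(female/male)": v_female_in_male,
        "V_D": v_d,
        "V_G": v_g,
        "V_E": v_total - cov_fs,                # residual variance as defined on the slide
        "h2": v_a / v_total,                    # narrow-sense heritability
        "H2": v_g / v_total,                    # broad-sense heritability
    }


# Inputs from the slide: V_Total = 40, Cov(FS) = 10, Cov(HS_M) = 4.
estimates = nc_design_one(v_total=40, cov_fs=10, cov_hs_male=4)
for name, value in estimates.items():
    print(f"{name:15s} {value}")
```

Because $V_D$ is obtained as a difference of estimated components, it can come out negative in real data, in which case it is usually set to zero.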

119. Factorial mating (NC Design II)

  • Each member of a group of males is mated to each member of a group of females

120.

  • $\mathrm{Cov}(HS_M) = \tfrac{1}{4}V_A$
  • $\mathrm{Cov}(HS_F) = \tfrac{1}{4}V_A$
  • $V(\text{female} \times \text{male}) = \mathrm{Cov}(FS) - \mathrm{Cov}(HS_M) - \mathrm{Cov}(HS_F)$
  • $= \tfrac{1}{4}V_D$
  • - Provide good information for parents and full-sib families
  • - Provide estimates of both additive and dominance effects
  • - Provide estimates of genetic gains from both $V_A$ and $V_D$
  • - Limited selection intensity
  • - High cost

121. Tester mating design (Factorial)

  • Each parent in a population is mated to each member of a set of testers that are chosen for a particular reason

122.

  • $\mathrm{Cov}(HS_M) = \tfrac{1}{4}V_A$
  • $\mathrm{Cov}(HS_F) = \tfrac{1}{4}V_A$
  • $V(\text{female} \times \text{male}) = \mathrm{Cov}(FS) - \mathrm{Cov}(HS_M) - \mathrm{Cov}(HS_F)$
  • $= \tfrac{1}{4}V_D$
  • - Provide good information for parents and full-sib families
  • - Provide estimates of both additive and dominance effects
  • - Provide estimates of genetic gains from both $V_A$ and $V_D$
  • - Limited selection intensity
  • - High cost

123. Diallel mating design

  • Full diallel: each parent is mated with every other parent in the population, including selfs and reciprocals:

124.

  • Half diallel: each parent is mated with every other parent in the population, excluding selfs and reciprocals:

125.

  • Partial diallel: selected subsets of full diallels:

126.

  • Disconnected half diallel: selected subsets of full diallels:

127.

  • Diallel analysis
  • $\mathrm{Cov}(HS) = \tfrac{1}{4}V_A$
  • $\mathrm{Cov}(FS) = \tfrac{1}{2}V_A + \tfrac{1}{4}V_D$
  • $\mathrm{Cov}(FS) - 2\,\mathrm{Cov}(HS) = \tfrac{1}{4}V_D$
  • - Provide good evaluation of parents and full-sib families
  • - Provide estimates of both additive and dominance effects
  • - Provide estimates of genetic gains from both $V_A$ and $V_D$
  • - High cost

128. Genomic Imprinting or parent-of-origin effect: The same allele is expressed differently, depending on its parental origin

  • Consider a gene A with two alleles A (in a frequency p) and a (in a frequency q)
  • Genotype | Frequency | Value
  • A A | $p^2$ | $a$
  • A a | $pq$ | $d + i$
  • a A | $qp$ | $d - i$
  • a a | $q^2$ | $-a$
  • Average effect
  • No imprinting: $\alpha = a + d(q-p)$
  • Imprinting: $\alpha_M = a - i + d(q-p)$ and $\alpha_P = a + i + d(q-p)$
  • Mean: $a(p-q) + 2pqd$
  • No imprinting: $\sigma_g^2 = 2pq\alpha^2 + (2pqd)^2$
  • Imprinting: $\sigma_{gi}^2 = 2pq\alpha^2 + (2pqd)^2 + 2pqi^2$
  • Imprinting leads to increased genetic variance for a quantitative trait and, therefore, is evolutionarily favorable.

129. Genomic Imprinting: The callipygous animals 1 and 3 compared to normal animals 2 and 4 (Cockett et al. Science 273: 236-238, 1996)

130. We have presented a statistical framework for genome-wide scans for imprinted loci

  • Cui, Y. H., W. Zhao, J. M. Cheverud and R. L. Wu, Genetics

131. 132. 133.

134. Predicting Response to Selection

135.

136.

  • Population Mean, $X_p$: phenotypic mean of the animals or plants of interest, expressed in measurable units.
  • Selection Mean, $X_s$: phenotypic mean of those animals or plants chosen to be parents for the next generation, expressed in measurable units.
  • Selection Differential, SD: difference between the mean of the selected parents and the mean of the entire population, $SD = X_s - X_p$.

137. Genetic Gain: the amount by which the phenotypic mean in the next generation changes due to selection. That change can be + or -.

138. Selection Differential: $G = h^2\,SD$

139. How to Calculate Genetic Gain

  • $M_2 = M + h^2(M_1 - M)$
  • $M_2$ = resulting mean phenotype
  • $M$ = mean of parental population
  • $M_1$ = mean of selected population
  • $h^2$ = heritability of the trait
  • $M_2 - M = h^2(M_1 - M)$
  • $G = h^2\,SD = (SD/\sigma_p)\,h^2 \sigma_p = i\,h^2 \sigma_p$
  • $i$ = selection intensity
  • $h^2$ = narrow-sense heritability
  • $\sigma_p$ = phenotypic standard deviation

140.

  • Factors that influence the Genetic Gain
  • Magnitude of selection differential
  • Selection intensity
  • Heritability of the trait
  • Phenotypic variation

141. Knowing the Selection Differential and the response to selection, an estimate of the trait's heritability can be calculated: G / SD = Realized Heritability

142. Realized heritability can also be calculated from $M_2 = M + h^2(M_1 - M)$, rearranged as $h^2 = (M_2 - M)/(M_1 - M)$

143.

  • Maximizing Genetic Gain
  • Examples
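The gain and realized-heritability formulas can be sketched as follows, using the figures of the selection example in these slides (population mean 109.7, selected means 119.5 and 100.4, $h^2$ = 0.7). Applying $h^2$ = 0.7 to the downward selection as well, and the observed next-generation mean used in the last step, are assumptions made only for illustration.

```python
def selection_differential(population_mean, selected_mean):
    """SD = mean of selected parents minus mean of the whole population."""
    return selected_mean - population_mean


def genetic_gain(h2, sd):
    """Expected response to selection, G = h^2 * SD."""
    return h2 * sd


def realized_heritability(m, m1, m2):
    """h^2 = (M2 - M) / (M1 - M), from parental, selected and progeny means."""
    return (m2 - m) / (m1 - m)


population_mean = 109.7
h2 = 0.7

sd_up = selection_differential(population_mean, 119.5)     # selecting to improve the mean
gain_up = genetic_gain(h2, sd_up)

sd_down = selection_differential(population_mean, 100.4)   # selecting to reduce the mean
gain_down = genetic_gain(h2, sd_down)                      # assumes the same h^2

# Hypothetical check: if the next generation's mean equals the predicted mean,
# the realized heritability returns the h^2 that was used.
predicted_mean = population_mean + gain_up
h2_realized = realized_heritability(population_mean, 119.5, predicted_mean)

print(f"SD = {sd_up:.1f}, G = {gain_up:.2f}")
print(f"SD = {sd_down:.1f}, G = {gain_down:.2f}")
print(f"realized h^2 = {h2_realized:.2f}")
```

A negative selection differential simply yields a negative gain, which is the sense in which the change can be + or -.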

144. N = 48, Population Mean = 109.7

145. Goal: Improve the Mean. Select those in red: N = 6, Mean of Selected = 119.5, SD = 119.5 - 109.7 = 9.8, G = $h^2$ SD = 0.7 x 9.8 = 6.86

146. Goal: Reduce the Mean. Select those in blue: N = 8, Mean of Selected = 100.4

147. Nature 432, 630-635 (02 December 2004). The role of barren stalk1 in the architecture of maize

  • Andrea Gallavotti 1,2, Qiong Zhao 3, Junko Kyozuka 4, Robert B. Meeley 5, Matthew K. Ritter 1,*, John F. Doebley 3, M. Enrico Pè 2 & Robert J. Schmidt 1
  • 1 Section of Cell and Developmental Biology, University of California, San Diego, La Jolla, California 92093-0116, USA
  • 2 Dipartimento di Scienze Biomolecolari e Biotecnologie, Università degli Studi di Milano, 20133 Milan, Italy
  • 3 Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706, USA
  • 4 Graduate School of Agriculture and Life Science, The University of Tokyo, Tokyo 113-8657, Japan
  • 5 Crop Genetics Research, Pioneer-A DuPont Company, Johnston, Iowa 50131, USA
  • * Present address: Biological Sciences Department, California Polytechnic State University, San Luis Obispo, California 93407, USA

148. Mapping Quantitative Trait Loci (QTL) in the F2 hybrids between maize and teosinte

149. [Figure: maize, teosinte, and tb-1/tb-1 mutant maize]

150. Effects of ba1 mutations on maize development [Figure: mutant (no tassel) and wild type (tassel)]

151. Data format for a backcross

  • Sample | Height (cm, y) | Marker 1 | Marker 2 | QTL
  • 1 | 184 | Mm (1) | Nn (1) | ?
  • 2 | 185 | Mm (1) | Nn (1) | ?
  • 3 | 180 | Mm (1) | Nn (1) | ?
  • 4 | 182 | Mm (1) | nn (0) | ?
  • 5 | 167 | mm (0) | nn (0) | ?
  • 6 | 169 | mm (0) | nn (0) | ?
  • 7 | 165 | mm (0) | nn (0) | ?
  • 8 | 166 | mm (0) | Nn (1) | ?

152.

  • Heights classified by markers (say marker 1)
  • Marker group | Sample size | Sample mean | Sample variance
  • Mm | $n_1 = 4$ | $m_1 = 182.75$ | $s_1^2 \approx 4.92$
  • mm | $n_0 = 4$ | $m_0 = 166.75$ | $s_0^2 \approx 2.92$

153. The hypothesis for the association between the marker and QTL

  • $H_0: m_1 = m_0$
  • $H_1: m_1 \ne m_0$
  • Calculate the test statistic:
  • $t = (m_1 - m_0)/\sqrt{s^2(1/n_1 + 1/n_0)}$,
  • where $s^2 = [(n_1 - 1)s_1^2 + (n_0 - 1)s_0^2]/(n_1 + n_0 - 2)$
  • Compare t with the critical value $t_{df}(0.05)$ from the t-table, where $df = n_1 + n_0 - 2$.
  • If $|t| > t_{df}(0.05)$, we reject $H_0$ at the significance level 0.05: there is a QTL
  • If $|t| < t_{df}(0.05)$, we do not reject $H_0$ at the significance level 0.05: there is no evidence for a QTL
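The test can be carried out on the eight backcross heights listed for marker 1. The sketch below computes the pooled-variance t statistic by hand. The critical value 2.447 is the two-sided 5% point of Student's t with 6 degrees of freedom taken from a standard t-table, and the variable names are illustrative.

```python
import math


def pooled_t(sample1, sample0):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n0 = len(sample1), len(sample0)
    m1 = sum(sample1) / n1
    m0 = sum(sample0) / n0
    s2_1 = sum((y - m1) ** 2 for y in sample1) / (n1 - 1)
    s2_0 = sum((y - m0) ** 2 for y in sample0) / (n0 - 1)
    df = n1 + n0 - 2
    s2 = ((n1 - 1) * s2_1 + (n0 - 1) * s2_0) / df
    t = (m1 - m0) / math.sqrt(s2 * (1 / n1 + 1 / n0))
    return t, df


# Heights (cm) grouped by the genotype at marker 1.
heights_Mm = [184, 185, 180, 182]
heights_mm = [167, 169, 165, 166]

t_stat, df = pooled_t(heights_Mm, heights_mm)

T_CRIT_DF6 = 2.447   # two-sided 5% critical value for df = 6, from a t-table
reject_h0 = abs(t_stat) > T_CRIT_DF6

print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

With only four individuals per class this is a toy calculation, but it shows why a large difference between marker classes relative to the within-class variance points to a linked QTL.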

154. Why can the t-test probe a QTL?

  • Assume a backcross with two genes, one marker (alleles M and m) and one QTL (alleles Q and q).
  • These two genes are linked with a recombination fraction of r.
  • Genotype | MmQq | Mmqq | mmQq | mmqq
  • Frequency | (1-r)/2 | r/2 | r/2 | (1-r)/2
  • Mean effect | m+a | m | m+a | m
  • Mean of marker genotype Mm (each marker class has a total frequency of 1/2):
  • $m_1 = [\tfrac{1-r}{2}(m+a) + \tfrac{r}{2}m]/\tfrac{1}{2} = m + (1-r)a$
  • Mean of marker genotype mm:
  • $m_0 = [\tfrac{r}{2}(m+a) + \tfrac{1-r}{2}m]/\tfrac{1}{2} = m + ra$
  • The difference
  • $m_1 - m_0 = m + (1-r)a - m - ra = (1-2r)a$

155.

  • The difference of marker genotypes can reflect the size of the QTL,
  • This reflection is confounded by the recombination fraction
  • Based on the t-test, we cannot distinguish between the two cases,
  • - Large QTL genetic effect but loose linkage with the marker
  • - Small QTL effect but tight linkage with the marker
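The confounding of QTL effect and recombination fraction can be shown with two hypothetical parameter sets that give the same marker contrast. The function names and the numerical values are illustrative only.

```python
def marker_class_means(m, a, r):
    """Expected trait means of marker genotypes Mm and mm in a backcross."""
    mean_Mm = (1 - r) * (m + a) + r * m      # QTL genotype Qq with probability 1 - r
    mean_mm = r * (m + a) + (1 - r) * m      # QTL genotype Qq with probability r
    return mean_Mm, mean_mm


def marker_contrast(a, r):
    """Expected difference between marker classes, (1 - 2r) * a."""
    return (1 - 2 * r) * a


# Large QTL effect, loose linkage versus small QTL effect, complete linkage.
loose = marker_contrast(a=10.0, r=0.25)
tight = marker_contrast(a=5.0, r=0.0)

mean_Mm, mean_mm = marker_class_means(m=170.0, a=10.0, r=0.25)

print(f"a = 10, r = 0.25 -> contrast {loose}")
print(f"a = 5,  r = 0    -> contrast {tight}")
print(f"class means for a = 10, r = 0.25: {mean_Mm}, {mean_mm}")
```

Both settings produce the same expected difference between marker classes, so a single-marker t-test cannot tell them apart; at r = 0.5 the contrast vanishes regardless of a.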

156. Example: marker analysis for body weight in a backcross of mice

  • _____________________________________________________________________
  • Marker class 1 Marker class 0
  • ______________________ _____________________
  • Marker n1 m1 s²1 n0 m0 s²0 t P value
  • _____________________________________________________________________________
  • 1 Hmg1-rs13 41 54.20 111.81 62 47.32 63.67 3.754