Mapping populations
• Controlled crosses between two parents
– two alleles/locus, gene frequencies = 0.5
– gametic phase disequilibrium is due to linkage, not other causes
• Examples
– Backcross (BC1 or BC2)
– F2 or F2:3
– Recombinant inbred lines (RIL)
– Doubled haploid (DH)
Recombinant Inbred Lines (RILs)
Generation   AA          Aa        aa
F1           0           100%      0
F2           25%         50%       25%
F3           37.5%       25%       37.5%
F4           43.75%      12.5%     43.75%
F5           46.875%     6.25%     46.875%
F6           48.4375%    3.125%    48.4375%
F10          49.9%       0.2%      49.9%
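The table follows from one rule: each generation of selfing halves the heterozygous class and splits the loss equally between the two homozygotes. A minimal Python sketch of that rule, assuming selfing from an Aa F1 with no selection (the function name `selfing_frequencies` is illustrative and not from the slides):

```python
def selfing_frequencies(generation):
    """Expected (AA, Aa, aa) frequencies in generation F_n under selfing.

    Assumes the F1 is 100% Aa and that no selection acts on the locus.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or later")
    het = 0.5 ** (generation - 1)   # F1 = 1.0, F2 = 0.5, F3 = 0.25, ...
    hom = (1.0 - het) / 2.0         # the remainder is split between AA and aa
    return hom, het, hom


for n in (1, 2, 3, 4, 5, 6, 10):
    big_aa, het, small_aa = selfing_frequencies(n)
    print(f"F{n}: AA={big_aa:.4%}  Aa={het:.4%}  aa={small_aa:.4%}")
```

By F10 the heterozygous class is (1/2)^9, or about 0.2%, which is why RILs are treated as essentially homozygous.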
Selfing a heterozygote (Aa): gametes and progeny
♀ \ ♂        A (1/2)     a (1/2)
A (1/2)      AA (1/4)    Aa (1/4)
a (1/2)      aA (1/4)    aa (1/4)
Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25
Recombinant Inbred Lines (RILs)
[Figure: RILs scored at two loci; the 4 recombinant lines are marked R]
r = k/N = (number of recombinants)/(total) = 4/20 = 0.2
Doubled Haploids
Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25
Doubled Haploids (DHs)
[Figure: doubled haploid lines scored at two loci; the 10 recombinant lines are marked R]
r = k/N = (number of recombinants)/(total) = 10/20 = 0.5
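The estimate r = k/N used in the RIL and DH examples is a simple proportion. A small sketch, assuming recombinant lines have already been counted by eye or by software (the function name `recombination_frequency` is hypothetical):

```python
def recombination_frequency(recombinants, total):
    """Estimate r = k / N from a count of recombinant lines."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must lie between 0 and total")
    return recombinants / total


print(recombination_frequency(4, 20))    # RIL example: 0.2
print(recombination_frequency(10, 20))   # DH example: 0.5
```

An estimate of 0.5 is what independent assortment predicts, so the DH example shows no evidence of linkage between the two loci.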
F2 Population
Expected Genotypic Frequencies for F2 Progeny when r = 0 or r = 0.5 Between Two Loci in Coupling (AB/ab) Configuration
Genotype   Expected frequency       r = 0         r = 0.5
AB/AB      p1 = 0.25(1 - r)^2       1/4 = 0.25    1/16 = 0.0625
AB/aB      p2 = 0.50r(1 - r)        0.0           2/16 = 0.125
AB/Ab      p3 = 0.50r(1 - r)        0.0           2/16 = 0.125
AB/ab      p4 = 0.50(1 - r)^2       1/2 = 0.5     2/16 = 0.125
Ab/aB      p5 = 0.50r^2             0.0           2/16 = 0.125
Ab/Ab      p6 = 0.25r^2             0.0           1/16 = 0.0625
Ab/ab      p7 = 0.50r(1 - r)        0.0           2/16 = 0.125
aB/aB      p8 = 0.25r^2             0.0           1/16 = 0.0625
aB/ab      p9 = 0.50r(1 - r)        0.0           2/16 = 0.125
ab/ab      p10 = 0.25(1 - r)^2      1/4 = 0.25    1/16 = 0.0625
Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny
Genotype   Observed frequency   Expected (coupling)      Expected (repulsion)
AB/AB      p1                   p1 = 0.25(1 - r)^2       p1 = 0.25r^2
AB/aB      p2                   p2 = 0.50r(1 - r)        p2 = 0.50r(1 - r)
AB/Ab      p3                   p3 = 0.50r(1 - r)        p3 = 0.50r(1 - r)
AB/ab      p4                   p4 = 0.50(1 - r)^2       p4 = 0.50r^2
Ab/aB      p5                   p5 = 0.50r^2             p5 = 0.50(1 - r)^2
Ab/Ab      p6                   p6 = 0.25r^2             p6 = 0.25(1 - r)^2
Ab/ab      p7                   p7 = 0.50r(1 - r)        p7 = 0.50r(1 - r)
aB/aB      p8                   p8 = 0.25r^2             p8 = 0.25(1 - r)^2
aB/ab      p9                   p9 = 0.50r(1 - r)        p9 = 0.50r(1 - r)
ab/ab      p10                  p10 = 0.25(1 - r)^2      p10 = 0.25r^2
• Co-dominant
• Fully classified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
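Every entry in the coupling and repulsion columns is a product of two gamete frequencies: parental gametes occur at (1 - r)/2 each and recombinant gametes at r/2 each, and a genotype made of two different haplotypes can arise in two ways. The sketch below rebuilds the ten F2 frequencies from that reasoning; the function names are illustrative and not from the slides.

```python
from itertools import combinations_with_replacement

HAPLOTYPES = ("AB", "Ab", "aB", "ab")


def gamete_frequencies(r, phase="coupling"):
    """Gamete frequencies produced by the F1 for recombination frequency r."""
    parental = (1.0 - r) / 2.0
    recombinant = r / 2.0
    if phase == "coupling":      # F1 is AB/ab
        return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    if phase == "repulsion":     # F1 is Ab/aB
        return {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    raise ValueError("phase must be 'coupling' or 'repulsion'")


def f2_genotype_frequencies(r, phase="coupling"):
    """Expected frequencies of the ten fully classified F2 genotypes."""
    g = gamete_frequencies(r, phase)
    freqs = {}
    for h1, h2 in combinations_with_replacement(HAPLOTYPES, 2):
        ways = 1.0 if h1 == h2 else 2.0   # two gamete orders for unlike haplotypes
        freqs[f"{h1}/{h2}"] = ways * g[h1] * g[h2]
    return freqs


for phase in ("coupling", "repulsion"):
    freqs = f2_genotype_frequencies(0.2, phase)
    print(phase, {k: round(v, 4) for k, v in freqs.items()})
```

The ten frequencies sum to 1 for any r between 0 and 0.5, which is a quick check that no genotype class has been missed.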
Expected and Observed Genotypic Frequencies
Coupling (AB/ab) F2 Progeny
Genotype        Observed frequency   Expected (coupling)
AB/AB           q1                   q1 = 0.25(1 - r)^2
AB/aB           q2                   q2 = 0.50r(1 - r)
AB/Ab           q3                   q3 = 0.50r(1 - r)
AB/ab + Ab/aB   q4                   q4 = p4 + p5 = 0.50[(1 - r)^2 + r^2]
Ab/Ab           q5                   q5 = 0.25r^2
Ab/ab           q6                   q6 = 0.50r(1 - r)
aB/aB           q7                   q7 = 0.25r^2
aB/ab           q8                   q8 = 0.50r(1 - r)
ab/ab           q9                   q9 = 0.25(1 - r)^2
• Co-dominant
• Unclassified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny
Phenotype   Observed frequency   Expected (coupling)           Expected (repulsion)
A_B_        f1                   f1 = 0.25(3 - 2r + r^2)       f1 = 0.25(2 + r^2)
A_bb        f2                   f2 = 0.25(2r - r^2)           f2 = 0.25(1 - r^2)
aaB_        f3                   f3 = 0.25(2r - r^2)           f3 = 0.25(1 - r^2)
aabb        f4                   f4 = 0.25(1 - r)^2            f4 = 0.25r^2
• Dominant
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
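With dominant markers only four phenotypic classes can be scored. The sketch below evaluates the four expected frequencies from the table for either linkage phase; it is an illustration only, and the function name `dominant_f2_frequencies` is an assumption, not part of the slides.

```python
def dominant_f2_frequencies(r, phase="coupling"):
    """Expected F2 phenotype frequencies for two dominant markers."""
    if phase == "coupling":      # F1 is AB/ab
        return {
            "A_B_": 0.25 * (3 - 2 * r + r ** 2),
            "A_bb": 0.25 * (2 * r - r ** 2),
            "aaB_": 0.25 * (2 * r - r ** 2),
            "aabb": 0.25 * (1 - r) ** 2,
        }
    if phase == "repulsion":     # F1 is Ab/aB
        return {
            "A_B_": 0.25 * (2 + r ** 2),
            "A_bb": 0.25 * (1 - r ** 2),
            "aaB_": 0.25 * (1 - r ** 2),
            "aabb": 0.25 * r ** 2,
        }
    raise ValueError("phase must be 'coupling' or 'repulsion'")


for phase in ("coupling", "repulsion"):
    print(phase, dominant_f2_frequencies(0.5, phase))
```

At r = 0.5 both phases give 9/16, 3/16, 3/16 and 1/16, the familiar 9:3:3:1 ratio for two independent dominant loci.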
Analysis
1. Single-locus analysis
2. Two-locus analysis
3. Detecting linkage and grouping
4. Ordering loci
5. Multi-point analysis
Mendelian Genetic Analysis
Phenotypic and Genotypic Distributions
• The expected segregation ratio of a gene is a function of the transmission probabilities
• If a gene produces a discrete phenotypic distribution, then an intrinsic hypothesis can be formulated to test whether the gene produces a phenotypic distribution consistent with an expected segregation ratio of the gene
• The heritability of a phenotypic trait that produces a Mendelian phenotypic distribution is ~1.0. Such traits are said to be fully penetrant
• The heritability of a DNA marker is theoretically ~1.0; however, it is affected by genotyping errors
Mendelian Genetic Analysis
Hypothesis Tests
• The expected segregation ratio (null hypothesis) is specified on the basis of the observed phenotypic or genotypic distribution
• One-way tests are performed to test for normal segregation of individual phenotypic or DNA markers
– If the observed segregation ratio does not fit the expected segregation ratio, then the null hypothesis is rejected. Possible explanations:
  • The expected segregation ratio is incorrect
  • Selection may have operated on the locus
  • The locus may not be fully penetrant
  • A Type I error has been committed
Mendelian Genetic Analysis
Hypothesis Tests
• Two-way tests are performed to test for independent assortment (null hypothesis: no linkage) between two phenotypic or DNA markers
– If two genes do not assort independently, then the null hypothesis is rejected. Possible explanations:
  • The two genes are linked (r < 0.50)
  • The expected segregation ratio is incorrect
  • A Type I error has been committed
Mendelian Genetics Analysis
Null Hypothesis
Null hypothesis   Accept                               Reject
True              No error (1 - α)                     Type I error (α), false positive
False             Type II error (β), false negative    No error (1 - β)
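One-way or single-locus tests
Goodness of fit statistics
• χ² statistics
$X^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$
• Log likelihood ratio statistics (G-statistics)
$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$
i = ith genotype (or allele, or phenotype)
o_i = observed count, e_i = expected count
Each statistic is compared with the χ² distribution with df degrees of freedom; the p-value is the upper-tail probability:
p = Pr[χ²_df > X²]
p = Pr[χ²_df > G]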
One-way or single-locus tests
Individual G-statistics for samples A and B
Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1:1 ratio of aa to Aa

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$
i = ith genotype
k = 2 genotypic classes

$G_{SA} = 2\left[40 \ln\frac{40}{61} + 82 \ln\frac{82}{61}\right] = 2[-16.880 + 24.259] = 14.76$
$G_{SB} = 2\left[51 \ln\frac{51}{66} + 81 \ln\frac{81}{66}\right] = 2[-13.149 + 16.588] = 6.88$

Pr[χ²_{k-1} > G_SA] = Pr[χ²_1 > 14.76] = 0.0001
Pr[χ²_{k-1} > G_SB] = Pr[χ²_1 > 6.88] = 0.0086
Null hypothesis is rejected for both samples
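The individual tests can be reproduced with a few lines of standard-library Python. This is a sketch under stated assumptions: the function names are illustrative, the Pearson statistic is included only for comparison, and the p-value helper is valid only for 1 degree of freedom, where the χ² upper tail reduces to erfc(sqrt(x/2)).

```python
import math


def g_statistic(observed, expected):
    """G = 2 * sum(o * ln(o / e)); classes with o = 0 contribute nothing."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def pearson_chi_square(observed, expected):
    """X^2 = sum((o - e)^2 / e)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(statistic):
    """Upper-tail chi-square probability for exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


samples = {"A": (40, 82), "B": (51, 81)}   # (aa, Aa) counts from the table
results = {}
for name, counts in samples.items():
    n = sum(counts)
    expected = (n / 2, n / 2)              # 1:1 null hypothesis
    g = g_statistic(counts, expected)
    x2 = pearson_chi_square(counts, expected)
    results[name] = (g, x2, p_value_df1(g))
    print(f"Sample {name}: G = {g:.2f}, X2 = {x2:.2f}, p = {results[name][2]:.4f}")
```

G and X² are close but not identical for these data; both lead to rejection of the 1:1 ratio in each sample.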
One-way or single-locus tests
Pooled G-statistic across samples
Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1 aa to 1 Aa ratio for pooled samples

$G_P = 2 \sum_{i=1}^{k} \left(\sum_{j=1}^{p} o_{ij}\right) \ln\left(\frac{\sum_{j=1}^{p} o_{ij}}{\sum_{j=1}^{p} e_{ij}}\right)$
i = ith genotype
j = jth sample
k = number of genotypic classes
p = number of samples (populations)

$G_P = 2\left[91 \ln\frac{91}{127} + 163 \ln\frac{163}{127}\right] = 2[-30.333 + 40.679] = 20.7$

Pr[χ²_{k-1} > G_P] = Pr[χ²_1 > 20.7] = 0.0000054
Null hypothesis is rejected
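The pooled statistic sums observed and expected counts over samples before applying the G formula. A minimal sketch of that step, assuming the same 1:1 null ratio in both samples (the names `pooled_g` and `null_ratio` are illustrative):

```python
import math

# Observed counts per sample, keyed by genotype (values from the table).
counts = {"A": {"aa": 40, "Aa": 82}, "B": {"aa": 51, "Aa": 81}}
null_ratio = {"aa": 0.5, "Aa": 0.5}   # 1:1 null hypothesis


def pooled_g(counts, null_ratio):
    """G computed on genotype totals summed across samples."""
    genotypes = list(null_ratio)
    pooled_obs = {g: sum(sample[g] for sample in counts.values()) for g in genotypes}
    total = sum(pooled_obs.values())
    pooled_exp = {g: null_ratio[g] * total for g in genotypes}
    return 2.0 * sum(
        pooled_obs[g] * math.log(pooled_obs[g] / pooled_exp[g])
        for g in genotypes
        if pooled_obs[g] > 0
    )


g_p = pooled_g(counts, null_ratio)
p_value = math.erfc(math.sqrt(g_p / 2.0))   # chi-square upper tail, df = 1 only
print(f"G_P = {g_p:.2f}, p = {p_value:.2e}")
```

Pooling is only meaningful if the samples are homogeneous, which is what the heterogeneity statistic tests.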
One-way or single-locus tests
Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: samples A and B are homogeneous

The heterogeneity G-statistic is
$G_H = 2\left[\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} - \sum_{i=1}^{k} o_{i.} \ln o_{i.} - \sum_{j=1}^{p} o_{.j} \ln o_{.j} + o_{..} \ln o_{..}\right]$

$\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} = 82 \ln 82 + 40 \ln 40 + 51 \ln 51 + 81 \ln 81 = 1065.378$
$\sum_{i=1}^{k} o_{i.} \ln o_{i.} = 91 \ln 91 + 163 \ln 163 = 1240.769$
$\sum_{j=1}^{p} o_{.j} \ln o_{.j} = 122 \ln 122 + 132 \ln 132 = 1230.621$
$o_{..} \ln o_{..} = 254 \ln 254 = 1406.483$

$G_H = 2[1065.378 - 1240.769 - 1230.621 + 1406.483] = 0.94$

Pr[χ²_{(k-1)(p-1)} > G_H] = Pr[χ²_1 > 0.94] = 0.33 (N.S.)

i = ith genotype
j = jth sample (population)
k = number of genotypic classes
p = number of samples (populations)
n = total number of observations (o_..)
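The heterogeneity formula needs only the cell counts, the row totals, the column totals and the grand total. The sketch below evaluates it directly for the genotype-by-sample table; the helper names are illustrative, and the p-value line assumes 1 degree of freedom, as in this 2 x 2 case.

```python
import math


def xlogx(x):
    """x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def heterogeneity_g(table):
    """G_H for table[i][j] = count of genotype i in sample j."""
    row_totals = [sum(row) for row in table]          # o_i.
    col_totals = [sum(col) for col in zip(*table)]    # o_.j
    grand_total = sum(row_totals)                     # o_..
    cells = sum(xlogx(o) for row in table for o in row)
    return 2.0 * (
        cells
        - sum(xlogx(t) for t in row_totals)
        - sum(xlogx(t) for t in col_totals)
        + xlogx(grand_total)
    )


table = [[40, 51],   # aa in samples A, B
         [82, 81]]   # Aa in samples A, B
g_h = heterogeneity_g(table)
df = (len(table) - 1) * (len(table[0]) - 1)
p_value = math.erfc(math.sqrt(g_h / 2.0))   # chi-square upper tail, valid for df = 1
print(f"G_H = {g_h:.2f}, df = {df}, p = {p_value:.2f}")
```

A non-significant G_H supports treating the two backcross samples as draws from the same segregation ratio.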
One-way or single-locus tests
Relationship between G statistics

G_T = G_SA + G_SB = 14.76 + 6.88 = 21.64 ≈ 21.6
G_T = G_P + G_H = 20.69 + 0.94 = 21.63 ≈ 21.6
Pr[χ²_{p(k-1)} > G_T] = Pr[χ²_2 > 21.6] = 0.00002

Source          G       df                               Pr > G
Sample A        14.8    k-1 = 2-1 = 1                    0.0001
Sample B        6.9     k-1 = 2-1 = 1                    0.0086
Total           21.6    p(k-1) = 2(2-1) = 2              0.00002
Pooled          20.7    k-1 = 2-1 = 1                    0.000005
Heterogeneity   0.9     (k-1)(p-1) = (2-1)(2-1) = 1      0.33
Total           21.6    p(k-1) = 2(2-1) = 2              0.00002

k = number of genotypic classes
p = number of samples (populations)
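The two partitions of the total G can be checked numerically. In this sketch the heterogeneity term is obtained by subtraction (total minus pooled), which assumes the same null ratio in every sample; the helper `chi2_upper_tail` is a hypothetical name and handles only the 1 and 2 degrees of freedom needed here.

```python
import math


def g_statistic(observed, expected):
    """G = 2 * sum(o * ln(o / e)); classes with o = 0 contribute nothing."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_upper_tail(x, df):
    """Closed-form chi-square upper tail for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")


samples = [(40, 82), (51, 81)]   # (aa, Aa) counts for samples A and B
k, p = 2, len(samples)

# Individual tests against a 1:1 ratio, then their sum.
g_samples = [g_statistic(s, (sum(s) / 2, sum(s) / 2)) for s in samples]
g_total = sum(g_samples)

# Pooled test on the column sums, heterogeneity by difference.
pooled = [sum(col) for col in zip(*samples)]
g_pooled = g_statistic(pooled, (sum(pooled) / 2, sum(pooled) / 2))
g_het = g_total - g_pooled

rows = [
    ("Sample A", g_samples[0], k - 1),
    ("Sample B", g_samples[1], k - 1),
    ("Pooled", g_pooled, k - 1),
    ("Heterogeneity", g_het, (k - 1) * (p - 1)),
    ("Total", g_total, p * (k - 1)),
]
for source, g, df in rows:
    print(f"{source:<14}{g:>7.2f}{df:>4}   {chi2_upper_tail(g, df):.2g}")
```

The degrees of freedom partition the same way as the statistics: 1 + 1 for the two samples, or 1 + 1 for pooled plus heterogeneity, giving 2 for the total.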
One-way or single-locus tests
F2 progeny of Ae. cylindrica genotyped for the SSR marker barc98.

Allelic constitution   Genotype   Observed   Expected
120bp/120bp            aa         21         23.5
120bp/124bp            Aa         44         47
124bp/124bp            AA         29         23.5
Total                             94         94

Null hypothesis: 1:2:1 ratio of aa:Aa:AA

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$
i = ith genotype
k = 3 genotypic classes

$G = 2\left[21 \ln\frac{21}{23.5} + 44 \ln\frac{44}{47} + 29 \ln\frac{29}{23.5}\right] = 2[-2.362 - 2.902 + 6.098] = 1.668$

Pr[χ²_{k-1} > G] = Pr[χ²_2 > 1.67] = 0.434
Null hypothesis is not rejected
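The barc98 test can be reproduced by deriving the expected counts from the 1:2:1 ratio and applying the G formula. A short sketch with illustrative variable names; the p-value line uses the closed form exp(-x/2), which holds only for 2 degrees of freedom.

```python
import math

observed = {"aa": 21, "Aa": 44, "AA": 29}   # counts from the table
ratio = {"aa": 1, "Aa": 2, "AA": 1}         # 1:2:1 null hypothesis

n = sum(observed.values())
parts = sum(ratio.values())
expected = {g: n * ratio[g] / parts for g in observed}

g_stat = 2.0 * sum(
    observed[g] * math.log(observed[g] / expected[g])
    for g in observed
    if observed[g] > 0
)
df = len(observed) - 1
p_value = math.exp(-g_stat / 2.0)   # chi-square upper tail, valid for df = 2 only

print(expected)
print(f"G = {g_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

With a p-value well above 0.05, the marker shows no evidence of segregation distortion in this F2 sample.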
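Calculating probability values for Chi-square distributions
SAS program
/* Read the test statistic (x) and its degrees of freedom (df) */
data pv;
  input x df;
  datalines;
3.75 2
;
run;

/* p-value = upper-tail probability of the chi-square distribution */
data pvalue;
  set pv;
  pvalue = 1 - probchi(x, df);
  output;
run;

proc print data=pvalue;
run;

Output
Obs    x       df    pvalue
1      3.75    2     0.15335

Excel formula
=CHIDIST(x, degrees_freedom)
=CHIDIST(3.75, 2)

Output
0.15335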
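The same upper-tail probability can be sketched in Python using only the standard library. The function below is an illustration, not a replacement for a statistics package: the name `chi2_sf` is an assumption, and it relies on the closed-form series for integer degrees of freedom (a finite exponential series for even df, and erfc plus a finite series for odd df).

```python
import math


def chi2_sf(x, df):
    """Pr[chi-square with df degrees of freedom > x], for integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


print(round(chi2_sf(3.75, 2), 5))   # same inputs as the SAS and Excel examples
```

Where SciPy is available, `scipy.stats.chi2.sf(x, df)` computes the same quantity for any positive df.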