mapping populations controlled crosses between two parents –two alleles/locus, gene frequencies =...

Mapping populations

• Controlled crosses between two parents
– two alleles/locus, gene frequencies = 0.5
– gametic phase disequilibrium is due to linkage, not other causes

Examples
– Backcross (BC1 or BC2)
– F2 or F2:3
– Recombinant inbred lines (RIL)
– Doubled haploid (DH)

Recombinant Inbred Lines (RILs)

Generation   AA          Aa        aa
F1           0           100%      0
F2           25%         50%       25%
F3           37.5%       25%       37.5%
F4           43.75%      12.5%     43.75%
F5           46.875%     6.25%     46.875%
F6           48.4375%    3.125%    48.4375%
F10          49.9%       0.2%      49.9%
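The table follows from one rule: the F1 is 100% Aa and each generation of selfing halves the heterozygous fraction. A minimal Python sketch that reproduces the rows is given here; the function name is hypothetical, and the sketch assumes strict selfing with no selection.

```python
def selfing_genotype_frequencies(generation):
    """Return (AA, Aa, aa) frequencies in generation F_n under strict selfing."""
    if generation < 1:
        raise ValueError("generation must be >= 1 (F1 is the first generation)")
    # The F1 is entirely Aa; every round of selfing halves the heterozygotes.
    het = 0.5 ** (generation - 1)
    # The remainder is split equally between the two homozygotes.
    hom = (1.0 - het) / 2.0
    return hom, het, hom


for n in (1, 2, 3, 4, 5, 6, 10):
    freq_AA, freq_Aa, freq_aa = selfing_genotype_frequencies(n)
    print(f"F{n}: AA={freq_AA:.4%}  Aa={freq_Aa:.4%}  aa={freq_aa:.4%}")
```

By F10 the heterozygous fraction is about 0.2%, which is why RILs are treated as essentially homozygous lines.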

♀ \ ♂        A (1/2)     a (1/2)
A (1/2)      AA (1/4)    Aa (1/4)
a (1/2)      aA (1/4)    aa (1/4)

Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25

Recombinant Inbred Lines (RILs)

[Figure: RILs, with the recombinant lines marked "R"]

$r = \frac{k}{N} = \frac{\text{Number of recombinants}}{\text{Total}} = \frac{4}{20} = 0.2$

Doubled Haploids

Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25

Doubled Haploids (DHs)

[Figure: doubled haploid lines, with the recombinant lines marked "R"]

$r = \frac{k}{N} = \frac{\text{Number of recombinants}}{\text{Total}} = \frac{10}{20} = 0.5$
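The expected class frequencies and the estimator r = k/N can be sketched in a few lines of Python. Both function names are hypothetical. The sketch assumes that f1 and f4 are the two parental classes and f2 and f3 the two recombinant classes, which is what the r = 0 column implies.

```python
def expected_class_frequencies(r):
    """Expected frequencies (f1, f2, f3, f4) for recombination frequency r."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    parental = 0.5 * (1.0 - r)     # f1 and f4
    recombinant = 0.5 * r          # f2 and f3
    return parental, recombinant, recombinant, parental


def estimate_r(recombinants, total):
    """Estimate r as k/N: recombinant lines divided by all lines scored."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


print(expected_class_frequencies(0.0))   # no recombinant classes
print(expected_class_frequencies(0.5))   # all four classes equally frequent
print(estimate_r(4, 20))                 # counts from the RIL example
print(estimate_r(10, 20))                # counts from the doubled haploid example
```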

F2 Population

Expected Genotypic Frequencies for F2 Progeny when r = 0 or r = 0.5 Between Two Loci in Coupling (AB/ab) Configuration

Genotype   Expected frequency        r = 0         r = 0.5
AB/AB      p1 = 0.25(1 - r)^2        1/4 = 0.25    1/16 = 0.0625
AB/aB      p2 = 0.50r(1 - r)         0.0           2/16 = 0.125
AB/Ab      p3 = 0.50r(1 - r)         0.0           2/16 = 0.125
AB/ab      p4 = 0.50(1 - r)^2        1/2 = 0.5     2/16 = 0.125
Ab/aB      p5 = 0.50r^2              0.0           2/16 = 0.125
Ab/Ab      p6 = 0.25r^2              0.0           1/16 = 0.0625
Ab/ab      p7 = 0.50r(1 - r)         0.0           2/16 = 0.125
aB/aB      p8 = 0.25r^2              0.0           1/16 = 0.0625
aB/ab      p9 = 0.50r(1 - r)         0.0           2/16 = 0.125
ab/ab      p10 = 0.25(1 - r)^2       1/4 = 0.25    1/16 = 0.0625
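The ten expected frequencies can be evaluated for any r with a short Python sketch. The function name is hypothetical; the formulas are the coupling-phase expressions p1 to p10 from the table, and the printed total is a check that they sum to 1.

```python
def f2_coupling_frequencies(r):
    """Expected F2 genotype frequencies for an AB/ab (coupling) F1."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    nr = 1.0 - r  # probability of no recombination
    return {
        "AB/AB": 0.25 * nr ** 2,   # p1
        "AB/aB": 0.50 * r * nr,    # p2
        "AB/Ab": 0.50 * r * nr,    # p3
        "AB/ab": 0.50 * nr ** 2,   # p4
        "Ab/aB": 0.50 * r ** 2,    # p5
        "Ab/Ab": 0.25 * r ** 2,    # p6
        "Ab/ab": 0.50 * r * nr,    # p7
        "aB/aB": 0.25 * r ** 2,    # p8
        "aB/ab": 0.50 * r * nr,    # p9
        "ab/ab": 0.25 * nr ** 2,   # p10
    }


for r in (0.0, 0.2, 0.5):
    freqs = f2_coupling_frequencies(r)
    print(f"r = {r}: total = {sum(freqs.values()):.4f}")
    for genotype, p in freqs.items():
        print(f"  {genotype}: {p:.4f}")
```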

Expected and Observed Genotypic Frequencies: Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Genotype   Observed frequency   Expected (coupling)        Expected (repulsion)
AB/AB      p1                   p1 = 0.25(1 - r)^2         p1 = 0.25r^2
AB/aB      p2                   p2 = 0.50r(1 - r)          p2 = 0.50r(1 - r)
AB/Ab      p3                   p3 = 0.50r(1 - r)          p3 = 0.50r(1 - r)
AB/ab      p4                   p4 = 0.50(1 - r)^2         p4 = 0.50r^2
Ab/aB      p5                   p5 = 0.50r^2               p5 = 0.50(1 - r)^2
Ab/Ab      p6                   p6 = 0.25r^2               p6 = 0.25(1 - r)^2
Ab/ab      p7                   p7 = 0.50r(1 - r)          p7 = 0.50r(1 - r)
aB/aB      p8                   p8 = 0.25r^2               p8 = 0.25(1 - r)^2
aB/ab      p9                   p9 = 0.50r(1 - r)          p9 = 0.50r(1 - r)
ab/ab      p10                  p10 = 0.25(1 - r)^2        p10 = 0.25r^2

• Co-dominant
• Fully classified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B

Expected and Observed Genotypic Frequencies: Coupling (AB/ab) F2 Progeny

Genotype         Observed frequency   Expected (coupling)
AB/AB            q1                   q1 = 0.25(1 - r)^2
AB/aB            q2                   q2 = 0.50r(1 - r)
AB/Ab            q3                   q3 = 0.50r(1 - r)
AB/ab + Ab/aB    q4                   q4 = p4 + p5 = 0.50[(1 - r)^2 + r^2]
Ab/Ab            q5                   q5 = 0.25r^2
Ab/ab            q6                   q6 = 0.50r(1 - r)
aB/aB            q7                   q7 = 0.25r^2
aB/ab            q8                   q8 = 0.50r(1 - r)
ab/ab            q9                   q9 = 0.25(1 - r)^2

• Co-dominant
• Unclassified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B

Expected and Observed Genotypic Frequencies: Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Phenotype   Observed frequency   Expected (coupling)           Expected (repulsion)
A_B_        f1                   f1 = 0.25(3 - 2r + r^2)       f1 = 0.25(2 + r^2)
A_bb        f2                   f2 = 0.25(2r - r^2)           f2 = 0.25(1 - r^2)
aaB_        f3                   f3 = 0.25(2r - r^2)           f3 = 0.25(1 - r^2)
aabb        f4                   f4 = 0.25(1 - r)^2            f4 = 0.25r^2

• Dominant
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
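The dominant-marker formulas can be checked by building the F2 from gamete frequencies and collapsing genotypes into the four phenotypic classes. The Python sketch below does this for both phases. The function names are hypothetical, and it assumes complete dominance of A over a and of B over b, with equal segregation in both parents.

```python
from itertools import product


def gamete_frequencies(r, phase="coupling"):
    """Gamete frequencies from a double-heterozygous F1 in the given phase."""
    parental, recombinant = 0.5 * (1.0 - r), 0.5 * r
    if phase == "coupling":     # F1 is AB/ab
        return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    if phase == "repulsion":    # F1 is Ab/aB
        return {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    raise ValueError("phase must be 'coupling' or 'repulsion'")


def dominant_class_frequencies(r, phase="coupling"):
    """Expected F2 frequencies of the classes A_B_, A_bb, aaB_ and aabb."""
    gametes = gamete_frequencies(r, phase)
    classes = {"A_B_": 0.0, "A_bb": 0.0, "aaB_": 0.0, "aabb": 0.0}
    # Every ordered pair of gametes is one cell of the 4 x 4 Punnett square.
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        alleles = g1 + g2
        key = ("A_" if "A" in alleles else "aa") + ("B_" if "B" in alleles else "bb")
        classes[key] += p1 * p2
    return classes


r = 0.2
print(dominant_class_frequencies(r, "coupling"))
print(dominant_class_frequencies(r, "repulsion"))
# Closed-form f1 values from the table, for comparison:
print(0.25 * (3 - 2 * r + r ** 2), 0.25 * (2 + r ** 2))
```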

Analysis

1. Single-locus analysis

2. Two-locus analysis

3. Detecting linkage and grouping

4. Ordering loci

5. Multi-point analysis

Mendelian Genetic Analysis

Phenotypic and Genotypic Distributions

• The expected segregation ratio of a gene is a function of the transmission probabilities

• If a gene produces a discrete phenotypic distribution, then an intrinsic hypothesis can be formulated to test whether the gene produces a phenotypic distribution consistent with the expected segregation ratio of the gene

• The heritability of a phenotypic trait that produces a Mendelian phenotypic distribution is ~1.0. Such traits are said to be fully penetrant

• The heritability of a DNA marker is theoretically ~1.0; however, it is affected by genotyping errors

Mendelian Genetic Analysis

Hypothesis Tests

• The expected segregation ratio (null hypothesis) is specified on the basis of the observed phenotypic or genotypic distribution

• One-way tests are performed to test for normal segregation of individual phenotypic or DNA markers

– If the observed segregation ratio does not fit the expected segregation ratio, then the null hypothesis is rejected. Possible explanations:

  • The expected segregation ratio is incorrect
  • Selection may have operated on the locus
  • The locus may not be fully penetrant
  • A Type I error has been committed

Mendelian Genetic Analysis

Hypothesis Tests

• Two-way tests are performed to test for independent assortment (null hypothesis: no linkage) between two phenotypic or DNA markers

– If two genes do not sort independently, then the null hypothesis is rejected. Possible explanations:

  • The two genes are linked (r < 0.50)
  • The expected segregation ratio is incorrect
  • A Type I error has been committed

Mendelian Genetic Analysis

Null Hypothesis

Null hypothesis   Accept                               Reject
True              No error (1 - α)                     Type I error (α), false positive
False             Type II error (β), false negative    No error (1 - β)

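One-way or single-locus tests

Goodness of fit statistics

• Chi-square ($X^2$) statistics

$X^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$

• Log likelihood ratio statistics (G-statistics)

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$

i = ith genotype (or allele, or phenotype)
o_i, e_i = observed and expected counts in class i
k = number of classes

$\Pr[X^2 > \chi^2_{df}] = \alpha$

$\Pr[G > \chi^2_{df}] = \alpha$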
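Both goodness of fit statistics are simple sums over the classes. The Python sketch below computes them for the backcross counts 40 aa : 82 Aa that the following slides analyse, tested against a 1:1 ratio. The function names are hypothetical; treating classes with zero observations as contributing zero to G is an assumption of this sketch.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum over classes of (o - e)^2 / e."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def g_statistic(observed, expected):
    """Log likelihood ratio statistic: 2 * sum over classes of o * ln(o / e)."""
    # Classes with o == 0 are skipped because o * ln(o) tends to 0 as o -> 0.
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


observed = [40, 82]                  # aa, Aa
n = sum(observed)
expected = [n * 0.5, n * 0.5]        # 1:1 ratio gives 61 : 61

print(f"X^2 = {chi_square_statistic(observed, expected):.3f}")
print(f"G   = {g_statistic(observed, expected):.3f}")
```

The two statistics are close but not identical; both are compared with a chi-square distribution with k - 1 degrees of freedom.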


Individual G-statistics for samples A and B

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1:1 ratio of aa to Aa

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$

i = ith genotype
k = 2 genotypic classes

$G_{SA} = 2\left[40 \ln\frac{40}{61} + 82 \ln\frac{82}{61}\right] = 2(-16.880 + 24.259) = 14.76$

$G_{SB} = 2\left[51 \ln\frac{51}{66} + 81 \ln\frac{81}{66}\right] = 2(-13.149 + 16.588) = 6.88$

$\Pr[G_{SA} > \chi^2_{k-1}] = \Pr[14.8 > \chi^2_1] = 0.0001$

$\Pr[G_{SB} > \chi^2_{k-1}] = \Pr[6.88 > \chi^2_1] = 0.0086$

Null hypothesis is rejected for both samples



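Pooled G-statistic across samples

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1 aa to 1 Aa ratio for pooled samples

$G_P = 2 \sum_{i=1}^{k} \left(\sum_{j=1}^{p} o_{ij}\right) \ln\left(\frac{\sum_{j=1}^{p} o_{ij}}{\sum_{j=1}^{p} e_{ij}}\right)$

i = ith genotype
j = jth sample
k = genotypic classes
p = No. of samples (populations)

$G_P = 2\left[91 \ln\frac{91}{127} + 163 \ln\frac{163}{127}\right] = 2(-30.333 + 40.679) = 20.7$

$\Pr[G_P > \chi^2_{k-1}] = \Pr[20.7 > \chi^2_1] = 0.0000054$

Null hypothesis is rejected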



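Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: Samples A and B are homogeneous

The heterogeneity G-statistic is

$G_H = 2\left[\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} - \sum_{i=1}^{k} o_{i.} \ln o_{i.} - \sum_{j=1}^{p} o_{.j} \ln o_{.j} + o_{..} \ln o_{..}\right]$

where

$\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} = 82 \ln 82 + 40 \ln 40 + 51 \ln 51 + 81 \ln 81 = 1065.378$

$\sum_{i=1}^{k} o_{i.} \ln o_{i.} = 91 \ln 91 + 163 \ln 163 = 1240.769$

$\sum_{j=1}^{p} o_{.j} \ln o_{.j} = 122 \ln 122 + 132 \ln 132 = 1230.621$

$o_{..} \ln o_{..} = 254 \ln 254 = 1406.483$

$G_H = 2[1065.378 - 1240.769 - 1230.621 + 1406.483] = 0.94$

$\Pr[G_H > \chi^2_{(k-1)(p-1)}] = \Pr[0.94 > \chi^2_1] = 0.33$ (N.S.)

i = ith genotype
j = jth sample (population)
k = genotypic classes
p = No. of samples (populations)
n = Total No. of observations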


Relationship between G statistics

$G_T = G_{SA} + G_{SB} = 14.7 + 6.9 = 21.6$

$G_T = G_P + G_H = 20.7 + 0.9 = 21.6$

$\Pr[G_T > \chi^2_{p(k-1)}] = \Pr[21.6 > \chi^2_2] = 0.00002$

Source          G      df                               Pr > G
Sample A        14.7   k-1 = 2-1 = 1                    0.0001
Sample B        6.9    k-1 = 2-1 = 1                    0.0086
Total           21.6   p(k-1) = 2(2-1) = 2              0.00002

Pooled          20.7   k-1 = 2-1 = 1                    0.000005
Heterogeneity   0.9    (k-1)(p-1) = (2-1)(2-1) = 1      0.33
Total           21.6   p(k-1) = 2(2-1) = 2              0.00002

k = genotypic classes
p = No. of samples (populations)
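The partition of the total G into pooled and heterogeneity components can be reproduced in a few lines of Python. This is a sketch with hypothetical names. It obtains G_H by subtraction (G_T minus G_P) rather than from the o ln o formula, which is equivalent here because both samples are tested against the same 1:1 ratio. The one-degree-of-freedom p-value uses the identity Pr[chi-square(1) > x] = erfc(sqrt(x/2)).

```python
import math


def g_statistic(observed, expected):
    """Log likelihood ratio statistic: 2 * sum of o * ln(o / e)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_upper_tail_df1(x):
    """Pr[chi-square with 1 df > x], via the complementary error function."""
    return math.erfc(math.sqrt(x / 2.0))


samples = {"A": [40, 82], "B": [51, 81]}   # counts of [aa, Aa]
ratio = [0.5, 0.5]                          # null hypothesis 1:1

# Individual G for each sample, then the total across samples.
individual = {}
for name, counts in samples.items():
    n = sum(counts)
    individual[name] = g_statistic(counts, [n * q for q in ratio])
g_total = sum(individual.values())

# Pooled G: add the samples class by class and test the pooled counts.
pooled = [sum(class_counts) for class_counts in zip(*samples.values())]
g_pooled = g_statistic(pooled, [sum(pooled) * q for q in ratio])

# Heterogeneity G is what remains of the total after removing the pooled part.
g_het = g_total - g_pooled

for name, g in individual.items():
    print(f"Sample {name}: G = {g:.2f}, Pr = {chi2_upper_tail_df1(g):.4f}")
print(f"Pooled: G = {g_pooled:.2f}, Pr = {chi2_upper_tail_df1(g_pooled):.7f}")
print(f"Heterogeneity: G = {g_het:.2f}, Pr = {chi2_upper_tail_df1(g_het):.2f}")
print(f"Total: G = {g_total:.2f}")
```

A non-significant heterogeneity G indicates that the two samples deviate from 1:1 in the same way, so pooling them is reasonable.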


G-statistic for an F2 population

F2 progeny of Ae. cylindrica genotyped for the SSR marker barc98.

Allelic constitution   Genotype   Observed   Expected
120bp/120bp            aa         21         23.5
120bp/124bp            Aa         44         47
124bp/124bp            AA         29         23.5
Total                             94         94

Null hypothesis: 1:2:1 ratio of aa:Aa:AA

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\frac{o_i}{e_i}\right)$

i = ith genotype
k = 3 genotypic classes

$G = 2\left[21 \ln\frac{21}{23.5} + 44 \ln\frac{44}{47} + 29 \ln\frac{29}{23.5}\right] = 2(-2.362 - 2.902 + 6.098) = 1.668$

$\Pr[G > \chi^2_{k-1}] = \Pr[1.67 > \chi^2_2] = 0.434$

Null hypothesis is not rejected
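The F2 test can be sketched in Python as follows. The helper names are hypothetical. Expected counts are built from the 1:2:1 ratio, and the p-value uses the fact that for 2 degrees of freedom the chi-square upper tail is exactly exp(-x/2).

```python
import math


def expected_counts(total, ratio):
    """Split a total count according to a segregation ratio such as (1, 2, 1)."""
    parts = sum(ratio)
    return [total * part / parts for part in ratio]


def g_statistic(observed, expected):
    """Log likelihood ratio statistic: 2 * sum of o * ln(o / e)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_upper_tail_df2(x):
    """Pr[chi-square with 2 df > x]; this closed form holds only for df = 2."""
    return math.exp(-x / 2.0)


observed = [21, 44, 29]                                 # aa, Aa, AA
expected = expected_counts(sum(observed), (1, 2, 1))    # 23.5, 47, 23.5
g = g_statistic(observed, expected)
p_value = chi2_upper_tail_df2(g)

print(f"expected = {expected}")
print(f"G = {g:.3f}, df = {len(observed) - 1}, Pr = {p_value:.3f}")
```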

Calculating probability values for Chi-square distributions

SAS program

data pv;
  input x df;
  datalines;
3.75 2
;
run;

data pvalue;
  set pv;
  /* upper-tail probability of the chi-square distribution */
  pvalue = 1 - probchi(x, df);
run;

proc print data=pvalue;
run;

Output

Obs    x       df    pvalue
1      3.75    2     0.15335

Excel formula

=CHIDIST(x, degrees_freedom)

=CHIDIST(3.75, 2)

Output

0.15335
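The same probability can be obtained without SAS or Excel. The Python sketch below uses only the standard library and relies on the closed-form series for the chi-square upper tail, which is valid for integer degrees of freedom only; that restriction and the function name are assumptions of this sketch, not part of the SAS or Excel functions.

```python
import math


def chi2_upper_tail(x, df):
    """Pr[chi-square with integer df > x], from closed-form series."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    df = int(df)
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of correction terms.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)   # first correction term
    for i in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * i + 1)                               # next term in the series
    return tail


print(f"{chi2_upper_tail(3.75, 2):.5f}")    # same inputs as the SAS and Excel examples
```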
