mapping populations controlled crosses between two parents –two alleles/locus, gene frequencies =...


Upload: elizabeth-fox

Post on 12-Jan-2016


Page 1: Mapping populations Controlled crosses between two parents –two alleles/locus, gene frequencies = 0.5 –gametic phase disequilibrium is due to linkage,

Mapping populations

• Controlled crosses between two parents
  – two alleles/locus, gene frequencies = 0.5
  – gametic phase disequilibrium is due to linkage, not other causes

Examples
  – Backcross (BC1 or BC2)
  – F2 or F2:3
  – Recombinant inbred lines (RIL)
  – Doubled haploid (DH)

Page 2:

Recombinant Inbred Lines (RILs)

Page 3:

Page 4:

Generation   AA          Aa        aa
F1           0           100%      0
F2           25%         50%       25%
F3           37.5%       25%       37.5%
F4           43.75%      12.5%     43.75%
F5           46.875%     6.25%     46.875%
F6           48.4375%    3.125%    48.4375%
F10          49.9%       0.2%      49.9%

♀ \ ♂        A (1/2)     a (1/2)
A (1/2)      AA (1/4)    Aa (1/4)
a (1/2)      aA (1/4)    aa (1/4)
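The generation table above follows from one rule: each round of selfing halves the heterozygous fraction and splits the remainder equally between the two homozygotes. The sketch below reproduces the table under that assumption (strict selfing from an Aa F1, no selection); the function name `selfing_genotype_frequencies` is illustrative and does not come from the slides.

```python
def selfing_genotype_frequencies(generation: int) -> tuple[float, float, float]:
    """Return (AA, Aa, aa) frequencies in generation F_n, assuming strict
    selfing that starts from a fully heterozygous (Aa) F1."""
    if generation < 1:
        raise ValueError("generation must be >= 1")
    het = 0.5 ** (generation - 1)   # heterozygosity halves with every selfing generation
    hom = (1.0 - het) / 2.0         # the rest splits equally between AA and aa
    return hom, het, hom


for n in (1, 2, 3, 4, 5, 6, 10):
    freq_AA, freq_Aa, freq_aa = selfing_genotype_frequencies(n)
    print(f"F{n}: AA={freq_AA:.4%}  Aa={freq_Aa:.4%}  aa={freq_aa:.4%}")
```

At F10 the heterozygous fraction is (1/2)^9, roughly 0.2%, which is why RILs are treated as essentially homozygous lines.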

Page 5:

Recombinant Inbred Lines (RILs)

Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25

Page 6:

RILs

[Figure: 20 lines, 4 of them marked R (recombinant)]

\[ r = \frac{k}{N} = \frac{\text{number of recombinants}}{\text{total}} = \frac{4}{20} = 0.2 \]

Page 7:

Doubled Haploids

Page 8:

Doubled Haploids (DHs)

Expected frequency       r = 0    r = 0.5
f1 = (1/2)(1 - r)        0.5      0.25
f2 = (1/2)r              0.0      0.25
f3 = (1/2)r              0.0      0.25
f4 = (1/2)(1 - r)        0.5      0.25

Page 9:

DOUBLED HAPLOIDS

[Figure: 20 lines, 10 of them marked R (recombinant)]

\[ r = \frac{k}{N} = \frac{\text{number of recombinants}}{\text{total}} = \frac{10}{20} = 0.5 \]
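The counting estimate r = k/N and the four expected class frequencies used for RILs and DHs can be written out as a small sketch. It assumes every line can be scored unambiguously as parental or recombinant; the function names are illustrative and do not come from the slides.

```python
def recombination_fraction(n_recombinant: int, n_total: int) -> float:
    """Estimate r as the proportion of recombinant lines, r = k / N."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("need 0 <= n_recombinant <= n_total and n_total > 0")
    return n_recombinant / n_total


def expected_class_frequencies(r: float) -> list[float]:
    """Expected frequencies f1..f4: two parental classes at (1 - r)/2 each
    and two recombinant classes at r/2 each."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    return [0.5 * (1 - r), 0.5 * r, 0.5 * r, 0.5 * (1 - r)]


print(recombination_fraction(4, 20))    # RIL example from the slides
print(recombination_fraction(10, 20))   # DH example from the slides
print(expected_class_frequencies(0.5))  # independent assortment
```

With r = 0.5 all four classes are equally frequent, so the DH example (10 recombinants out of 20) gives no evidence of linkage.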

Page 10:

F2 Population

Page 11:

Page 12:

Expected Genotypic Frequencies for F2 Progeny when r = 0 or r = 0.5 Between Two Loci in Coupling (AB/ab) Configuration

Genotype   Expected frequency         r = 0         r = 0.5
AB/AB      p1 = 0.25(1 - r)^2         1/4 = 0.25    1/16 = 0.0625
AB/aB      p2 = 0.50r(1 - r)          0.0           2/16 = 0.125
AB/Ab      p3 = 0.50r(1 - r)          0.0           2/16 = 0.125
AB/ab      p4 = 0.50(1 - r)^2         1/2 = 0.5     2/16 = 0.125
Ab/aB      p5 = 0.50r^2               0.0           2/16 = 0.125
Ab/Ab      p6 = 0.25r^2               0.0           1/16 = 0.0625
Ab/ab      p7 = 0.50r(1 - r)          0.0           2/16 = 0.125
aB/aB      p8 = 0.25r^2               0.0           1/16 = 0.0625
aB/ab      p9 = 0.50r(1 - r)          0.0           2/16 = 0.125
ab/ab      p10 = 0.25(1 - r)^2        1/4 = 0.25    1/16 = 0.0625
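The ten expected frequencies in this table come from pairing the four F1 gamete types at random. A minimal sketch of that derivation, assuming a coupling-phase F1 (AB/ab) whose parental gametes each occur at (1 - r)/2 and recombinant gametes at r/2; the function name and dictionary layout are illustrative choices.

```python
from itertools import combinations_with_replacement


def f2_genotype_frequencies(r: float) -> dict[str, float]:
    """Expected frequencies of the 10 two-locus F2 genotypes for a
    coupling-phase (AB/ab) F1 with recombination frequency r."""
    parental, recombinant = (1 - r) / 2, r / 2
    gametes = {"AB": parental, "Ab": recombinant, "aB": recombinant, "ab": parental}
    freqs = {}
    for g1, g2 in combinations_with_replacement(gametes, 2):
        # Unlike gametes can unite in two orders (egg/pollen), identical ones in one.
        weight = 1 if g1 == g2 else 2
        freqs[f"{g1}/{g2}"] = weight * gametes[g1] * gametes[g2]
    return freqs


for r in (0.0, 0.5):
    print(r, f2_genotype_frequencies(r))
```

The frequencies always sum to 1, and at r = 0 only AB/AB, AB/ab, and ab/ab remain, in a 1:2:1 ratio.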

Page 13:

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Genotype   Observed frequency   Expected (coupling)      Expected (repulsion)
AB/AB      p1                   p1 = 0.25(1 - r)^2       p1 = 0.25r^2
AB/aB      p2                   p2 = 0.50r(1 - r)        p2 = 0.50r(1 - r)
AB/Ab      p3                   p3 = 0.50r(1 - r)        p3 = 0.50r(1 - r)
AB/ab      p4                   p4 = 0.50(1 - r)^2       p4 = 0.50r^2
Ab/aB      p5                   p5 = 0.50r^2             p5 = 0.50(1 - r)^2
Ab/Ab      p6                   p6 = 0.25r^2             p6 = 0.25(1 - r)^2
Ab/ab      p7                   p7 = 0.50r(1 - r)        p7 = 0.50r(1 - r)
aB/aB      p8                   p8 = 0.25r^2             p8 = 0.25(1 - r)^2
aB/ab      p9                   p9 = 0.50r(1 - r)        p9 = 0.50r(1 - r)
ab/ab      p10                  p10 = 0.25(1 - r)^2      p10 = 0.25r^2

• Co-dominant
• Fully classified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B

Page 14:

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) F2 Progeny

Genotype         Observed frequency   Expected (coupling)
AB/AB            q1                   q1 = 0.25(1 - r)^2
AB/aB            q2                   q2 = 0.50r(1 - r)
AB/Ab            q3                   q3 = 0.50r(1 - r)
AB/ab + Ab/aB    q4                   q4 = p4 + p5 = 0.50[(1 - r)^2 + r^2]
Ab/Ab            q5                   q5 = 0.25r^2
Ab/ab            q6                   q6 = 0.50r(1 - r)
aB/aB            q7                   q7 = 0.25r^2
aB/ab            q8                   q8 = 0.50r(1 - r)
ab/ab            q9                   q9 = 0.25(1 - r)^2

• Co-dominant
• Unclassified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B

Page 15:

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Genotype   Observed frequency   Expected (coupling)          Expected (repulsion)
A_B_       f1                   f1 = 0.25(3 - 2r + r^2)      f1 = 0.25(2 + r^2)
A_bb       f2                   f2 = 0.25(2r - r^2)          f2 = 0.25(1 - r^2)
aaB_       f3                   f3 = 0.25(2r - r^2)          f3 = 0.25(1 - r^2)
aabb       f4                   f4 = 0.25(1 - r)^2           f4 = 0.25r^2

• Dominant
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
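With dominant markers the ten genotypes collapse into four phenotypic classes. The sketch below derives those class frequencies by summing over all gamete pairings and compares them with the closed forms in the table. It assumes complete dominance of A over a and B over b; the function names are illustrative.

```python
def dominant_f2_class_frequencies(r: float, phase: str = "coupling") -> dict[str, float]:
    """Sum gamete-pair probabilities into the four dominant phenotype classes."""
    parental, recombinant = (1 - r) / 2, r / 2
    if phase == "coupling":      # F1 is AB/ab
        gametes = {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}
    elif phase == "repulsion":   # F1 is Ab/aB
        gametes = {"Ab": parental, "aB": parental, "AB": recombinant, "ab": recombinant}
    else:
        raise ValueError("phase must be 'coupling' or 'repulsion'")
    classes = {"A_B_": 0.0, "A_bb": 0.0, "aaB_": 0.0, "aabb": 0.0}
    for g1, p1 in gametes.items():
        for g2, p2 in gametes.items():
            # One dominant allele at a locus is enough to show the dominant phenotype.
            locus_a = "A_" if "A" in (g1[0], g2[0]) else "aa"
            locus_b = "B_" if "B" in (g1[1], g2[1]) else "bb"
            classes[locus_a + locus_b] += p1 * p2
    return classes


def closed_form(r: float, phase: str) -> dict[str, float]:
    """Closed-form expectations as listed in the table."""
    if phase == "coupling":
        f1, f2, f4 = 0.25 * (3 - 2 * r + r * r), 0.25 * (2 * r - r * r), 0.25 * (1 - r) ** 2
    else:
        f1, f2, f4 = 0.25 * (2 + r * r), 0.25 * (1 - r * r), 0.25 * r * r
    return {"A_B_": f1, "A_bb": f2, "aaB_": f2, "aabb": f4}


print(dominant_f2_class_frequencies(0.5, "coupling"))  # 9:3:3:1 proportions at r = 0.5
```

At r = 0.5 both phases reduce to the familiar 9:3:3:1 ratio, which is a quick sanity check on the formulas.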

Page 16:

Analysis

1. Single-locus analysis

2. Two-locus analysis

3. Detecting linkage and grouping

4. Ordering loci

5. Multi-point analysis

Page 17:

Mendelian Genetic Analysis
Phenotypic and Genotypic Distributions

• The expected segregation ratio of a gene is a function of the transmission probabilities

• If a gene produces a discrete phenotypic distribution, then an intrinsic hypothesis can be formulated to test whether that distribution is consistent with the expected segregation ratio of the gene

• The heritability of a phenotypic trait that produces a Mendelian phenotypic distribution is ~1.0. Such traits are said to be fully penetrant

• The heritability of a DNA marker is theoretically ~1.0; however, it is affected by genotyping errors

Page 18:

Mendelian Genetic Analysis
Hypothesis Tests

• The expected segregation ratio (null hypothesis) is specified on the basis of the observed phenotypic or genotypic distribution

• One-way tests are performed to test for normal segregation of individual phenotypic or DNA markers
  – If the observed segregation ratio does not fit the expected segregation ratio, then the null hypothesis is rejected. Possible explanations:
    • The expected segregation ratio is incorrect
    • Selection may have operated on the locus
    • The locus may not be fully penetrant
    • A Type I error has been committed

Page 19:

Mendelian Genetic Analysis
Hypothesis Tests

• Two-way tests are performed to test for independent assortment (null hypothesis: no linkage) between two phenotypic or DNA markers
  – If two genes do not assort independently, then the null hypothesis is rejected. Possible explanations:
    • The two genes are linked (r < 0.50)
    • The expected segregation ratio is incorrect
    • A Type I error has been committed

Page 20:

Mendelian Genetics Analysis

                      Decision on the null hypothesis
Null hypothesis is    Accept                                Reject
True                  No error (1 - α)                      Type I error (α), false positive
False                 Type II error (β), false negative     No error (1 - β)

Page 21:

One-way or single-locus tests

Goodness of fit statistics

• χ² statistics

\[ \chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} \]

• Log likelihood ratio statistics (G-statistics)

\[ G = 2 \sum_{i=1}^{k} o_i \ln \frac{o_i}{e_i} \]

i = ith genotype (or allele, or phenotype)
k = number of classes
o_i, e_i = observed and expected counts in class i

\[ \Pr[\chi^2 > \chi^2_{df}] = \alpha \]

\[ \Pr[G > \chi^2_{df}] = \alpha \]
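Both goodness-of-fit statistics are short sums over the genotype classes. The sketch below computes them for the F2 counts of the SSR marker barc98 used in this deck (21 aa, 44 Aa, 29 AA against a 1:2:1 expectation). The function names are illustrative, and the p-value shortcut exp(-x/2) is valid only for 2 degrees of freedom.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (o - e)^2 / e over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def g_statistic(observed, expected):
    """Log likelihood ratio statistic: 2 * sum of o * ln(o / e).
    Empty classes contribute nothing because o * ln(o) tends to 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


observed = [21, 44, 29]                    # aa, Aa, AA counts
ratio = [0.25, 0.50, 0.25]                 # 1:2:1 null hypothesis
total = sum(observed)
expected = [total * p for p in ratio]      # 23.5, 47, 23.5

chi2 = chi_square_statistic(observed, expected)
g = g_statistic(observed, expected)
df = len(observed) - 1                     # k - 1 = 2

# Upper-tail probability of a chi-square variable with exactly 2 df.
p_chi2 = math.exp(-chi2 / 2)
p_g = math.exp(-g / 2)
print(f"chi-square = {chi2:.3f} (p = {p_chi2:.3f}), G = {g:.3f} (p = {p_g:.3f}), df = {df}")
```

For these counts the two statistics are close (about 1.74 and 1.67), and neither rejects the 1:2:1 ratio.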

Page 22:

One-way or single-locus tests

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1:1 ratio of aa to Aa

Individual G-statistics for samples A and B

\[ G = 2 \sum_{i=1}^{k} o_i \ln \frac{o_i}{e_i} \]

i = ith genotype
k = 2 genotypic classes

\[ G_{SA} = 2\left[40 \ln\frac{40}{61} + 82 \ln\frac{82}{61}\right] = 2(-16.880 + 24.259) = 14.76 \]

\[ G_{SB} = 2\left[51 \ln\frac{51}{66} + 81 \ln\frac{81}{66}\right] = 2(-13.149 + 16.588) = 6.88 \]

\[ \Pr[G_A > \chi^2_{k-1}] = \Pr[14.76 > \chi^2_1] = 0.0001 \]

\[ \Pr[G_B > \chi^2_{k-1}] = \Pr[6.88 > \chi^2_1] = 0.0086 \]

Null hypothesis is rejected for both samples

Page 23:

One-way or single-locus tests

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1 aa : 1 Aa ratio for pooled samples

Pooled G-statistic across samples

\[ G_P = 2 \sum_{i=1}^{k} \left[ \left( \sum_{j=1}^{p} o_{ij} \right) \ln \frac{\sum_{j=1}^{p} o_{ij}}{\sum_{j=1}^{p} e_{ij}} \right] \]

i = ith genotype
j = jth sample
k = genotypic classes
p = No. of samples (populations)

\[ G_P = 2\left[91 \ln\frac{91}{127} + 163 \ln\frac{163}{127}\right] = 2(-30.333 + 40.679) = 20.7 \]

\[ \Pr[G_P > \chi^2_{k-1}] = \Pr[20.7 > \chi^2_1] = 0.0000054 \]

Null hypothesis is rejected

Page 24:

One-way or single-locus tests

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: Samples A and B are homogeneous

The heterogeneity G-statistic is

\[ G_H = 2\left[ \sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} - \sum_{i=1}^{k} o_{i.} \ln o_{i.} - \sum_{j=1}^{p} o_{.j} \ln o_{.j} + o_{..} \ln o_{..} \right] \]

where

\[ \sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} = 82 \ln 82 + 40 \ln 40 + 51 \ln 51 + 81 \ln 81 = 1065.378 \]

\[ \sum_{i=1}^{k} o_{i.} \ln o_{i.} = 91 \ln 91 + 163 \ln 163 = 1240.769 \]

\[ \sum_{j=1}^{p} o_{.j} \ln o_{.j} = 122 \ln 122 + 132 \ln 132 = 1230.621 \]

\[ o_{..} \ln o_{..} = 254 \ln 254 = 1406.483 \]

so that

\[ G_H = 2(1065.378 - 1240.769 - 1230.621 + 1406.483) = 0.94 \]

\[ \Pr[G_H > \chi^2_{(k-1)(p-1)}] = \Pr[0.94 > \chi^2_1] = 0.33 \text{ (N.S.)} \]

i = ith genotype
j = jth sample (population)
k = genotypic classes
p = No. of samples (populations)
n = Total No. of observations

Page 25:

One-way or single-locus tests

Relationship between G statistics

\[ G_T = G_{SA} + G_{SB} = 14.7 + 6.9 = 21.6 \]

\[ G_T = G_P + G_H = 20.7 + 0.9 = 21.6 \]

\[ \Pr[G_T > \chi^2_{p(k-1)}] = \Pr[21.6 > \chi^2_2] = 0.00002 \]

Source           G       df                                  Pr > G
Sample A         14.7    k - 1 = 2 - 1 = 1                   0.0001
Sample B         6.9     k - 1 = 2 - 1 = 1                   0.0086
Total            21.6    p(k - 1) = 2(2 - 1) = 2             0.00002

Pooled           20.7    k - 1 = 2 - 1 = 1                   0.000005
Heterogeneity    0.9     (k - 1)(p - 1) = (2 - 1)(2 - 1) = 1 0.33
Total            21.6    p(k - 1) = 2(2 - 1) = 2             0.00002

k = genotypic classes
p = No. of samples (populations)
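The partition of the total G into pooled and heterogeneity components can be checked numerically for the two backcross samples. This is a sketch under the 1:1 null hypothesis; variable and function names are illustrative, and the small `chi2_upper_tail` helper deliberately covers only 1 and 2 degrees of freedom, which is all this example needs.

```python
import math


def g_statistic(observed, expected):
    """G = 2 * sum of o * ln(o / e) over classes with o > 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability for df = 1 or df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this helper only handles df = 1 or 2")


def xlogx_sum(values):
    return sum(v * math.log(v) for v in values if v > 0)


samples = {"A": [40, 82], "B": [51, 81]}   # counts of aa, Aa per sample
ratio = [0.5, 0.5]                         # 1:1 null hypothesis

# Individual statistics, each with k - 1 = 1 df; their sum is the total (2 df).
g_individual = {
    name: g_statistic(obs, [sum(obs) * p for p in ratio]) for name, obs in samples.items()
}
g_total = sum(g_individual.values())

# Pooled statistic on the genotype totals (91 aa, 163 Aa), 1 df.
pooled = [sum(col) for col in zip(*samples.values())]
g_pooled = g_statistic(pooled, [sum(pooled) * p for p in ratio])

# Heterogeneity, computed directly from the o * ln(o) terms, 1 df.
cells = [o for obs in samples.values() for o in obs]
sample_totals = [sum(obs) for obs in samples.values()]
g_heterogeneity = 2.0 * (
    xlogx_sum(cells) - xlogx_sum(pooled) - xlogx_sum(sample_totals) + xlogx_sum([sum(cells)])
)

print(f"total = {g_total:.2f}, pooled + heterogeneity = {g_pooled + g_heterogeneity:.2f}")
print(f"heterogeneity p-value = {chi2_upper_tail(g_heterogeneity, 1):.2f}")
```

The two routes to the total agree, and the non-significant heterogeneity component indicates that both samples deviate from 1:1 in the same way.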

Page 26:

One-way or single-locus tests

F2 progeny of Ae. cylindrica genotyped for the SSR marker barc98.

Allelic constitution   Genotype   Observed   Expected
120bp / 120bp          aa         21         23.5
120bp / 124bp          Aa         44         47
124bp / 124bp          AA         29         23.5
Total                             94         94

Null hypothesis: 1:2:1 ratio of aa:Aa:AA

G-statistic

\[ G = 2 \sum_{i=1}^{k} o_i \ln \frac{o_i}{e_i} \]

i = ith genotype
k = 3 genotypic classes

\[ G = 2\left[21 \ln\frac{21}{23.5} + 44 \ln\frac{44}{47} + 29 \ln\frac{29}{23.5}\right] = 2(-2.362 - 2.902 + 6.098) = 1.668 \]

\[ \Pr[G > \chi^2_{k-1}] = \Pr[1.67 > \chi^2_2] = 0.434 \]

Null hypothesis is not rejected

Page 27:

Calculating probability values for Chi-square distributions

SAS program

/* Read the test statistic (x) and its degrees of freedom (df) */
data pv;
  input x df;
  datalines;
3.75 2
;
/* p-value = upper-tail area = 1 - chi-square CDF */
data pvalue;
  set pv;
  pvalue = 1 - probchi(x, df);
  output;
proc print;
run;

Output

Obs    x       df    pvalue
1      3.75    2     0.15335

Excel formula

=CHIDIST(x, degrees_freedom)

=CHIDIST(3.75, 2)

Output

0.15335
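The same upper-tail probability can be obtained without SAS or Excel. The sketch below uses only the Python standard library and relies on the closed-form series for the chi-square survival function at integer degrees of freedom; the function name `chi2_sf` is illustrative. Where SciPy is installed, `scipy.stats.chi2.sf(x, df)` should give the same value.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability Pr[chi-square(df) > x] for integer df >= 1."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and integer df >= 1")
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{j=1}^{(df-1)/2} y^(j - 1/2) / Gamma(j + 1/2)
    total = sum(
        y ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


print(round(chi2_sf(3.75, 2), 5))   # same inputs as the SAS and Excel examples
```

For 2 degrees of freedom the series reduces to exp(-x/2), so the value for x = 3.75 can be confirmed by hand.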