mapping populations


Upload: aisha

Post on 24-Feb-2016


TRANSCRIPT

Page 1: Mapping populations

Mapping populations

• Controlled crosses between two parents
  – two alleles/locus, gene frequencies = 0.5
  – gametic phase disequilibrium is due to linkage, not other causes
• Examples
  – Backcross (BC1 or BC2)
  – F2 or F2:3
  – Recombinant inbred lines (RIL)
  – Doubled haploid (DH)

Page 2: Mapping populations

Recombinant Inbred Lines (RILs)

Page 3: Mapping populations
Page 4: Mapping populations

Generation   AA          Aa        aa
F1           0           100%      0
F2           25%         50%       25%
F3           37.5%       25%       37.5%
F4           43.75%      12.5%     43.75%
F5           46.875%     6.25%     46.875%
F6           48.4375%    3.125%    48.4375%
F10          49.9%       0.2%      49.9%

Punnett square for selfing the F1 (Aa x Aa), gamete frequencies in parentheses

♀ \ ♂      A (1/2)     a (1/2)
A (1/2)    AA (1/4)    Aa (1/4)
a (1/2)    aA (1/4)    aa (1/4)
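The generation table follows from one rule: each round of selfing halves the heterozygous fraction, and the lost heterozygotes split equally between AA and aa. A minimal Python sketch of that rule is given here; the function name `selfing_frequencies` is an illustrative choice, not something from the slides, and strict selfing with no selection is assumed.

```python
def selfing_frequencies(generation):
    """Expected (AA, Aa, aa) frequencies in generation F_n, starting from an Aa F1.

    Assumes strict self-pollination and no selection (an assumption of this sketch).
    """
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or larger")
    # Each round of selfing halves the heterozygous fraction.
    het = 0.5 ** (generation - 1)
    # The remainder is split equally between the two homozygotes.
    hom = (1.0 - het) / 2.0
    return hom, het, hom


for n in (1, 2, 3, 4, 5, 6, 10):
    hom_dom, het, hom_rec = selfing_frequencies(n)
    print(f"F{n}: AA={hom_dom:.4%}  Aa={het:.4%}  aa={hom_rec:.4%}")
```

For F10 the heterozygous fraction is (1/2)^9, which the table rounds to 0.2%.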

Page 5: Mapping populations

Recombinant Inbred Lines (RILs)

Expected frequency              r = 0    r = 0.5
$f_1 = \tfrac{1}{2}(1 - r)$     0.5      0.25
$f_2 = \tfrac{1}{2} r$          0.0      0.25
$f_3 = \tfrac{1}{2} r$          0.0      0.25
$f_4 = \tfrac{1}{2}(1 - r)$     0.5      0.25

Page 6: Mapping populations

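RILs

[Figure: RIL genotypes, with the four recombinant lines marked R]

$r = \dfrac{k}{N} = \dfrac{\text{Number of recombinants}}{\text{Total}} = \dfrac{4}{20} = 0.2$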

Page 7: Mapping populations

Doubled Haploids

Page 8: Mapping populations

Doubled Haploids (DHs)

Expected frequency              r = 0    r = 0.5
$f_1 = \tfrac{1}{2}(1 - r)$     0.5      0.25
$f_2 = \tfrac{1}{2} r$          0.0      0.25
$f_3 = \tfrac{1}{2} r$          0.0      0.25
$f_4 = \tfrac{1}{2}(1 - r)$     0.5      0.25

Page 9: Mapping populations

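DOUBLED HAPLOIDS

[Figure: doubled haploid genotypes, with the ten recombinant lines marked R]

$r = \dfrac{k}{N} = \dfrac{\text{Number of recombinants}}{\text{Total}} = \dfrac{10}{20} = 0.5$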
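The two quantities used on the RIL and DH slides, the expected class frequencies f1 to f4 and the counting estimate r = k/N, can be sketched in a few lines of Python. The function names are illustrative, and the sketch simply mirrors the slide formulas rather than adding any correction for the type of population.

```python
def expected_class_frequencies(r):
    """Return (f1, f2, f3, f4): parental, recombinant, recombinant, parental."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    parental = 0.5 * (1.0 - r)
    recombinant = 0.5 * r
    return parental, recombinant, recombinant, parental


def recombination_fraction(recombinants, total):
    """Counting estimate r = k / N used on the slides."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


print(expected_class_frequencies(0.0))   # complete linkage
print(expected_class_frequencies(0.5))   # independent assortment
print(recombination_fraction(4, 20))     # RIL example from the slides
print(recombination_fraction(10, 20))    # DH example from the slides
```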

Page 10: Mapping populations

F2 Population

Page 11: Mapping populations
Page 12: Mapping populations

Expected Genotypic Frequencies for F2 Progeny when r = 0 or r = 0.5 Between Two Loci in Coupling (AB/ab) Configuration

Genotype   Expected frequency             r = 0         r = 0.5
AB/AB      $p_1 = 0.25(1 - r)^2$          1/4 = 0.25    1/16 = 0.0625
AB/aB      $p_2 = 0.50r(1 - r)$           0.0           2/16 = 0.125
AB/Ab      $p_3 = 0.50r(1 - r)$           0.0           2/16 = 0.125
AB/ab      $p_4 = 0.50(1 - r)^2$          1/2 = 0.5     2/16 = 0.125
Ab/aB      $p_5 = 0.50r^2$                0.0           2/16 = 0.125
Ab/Ab      $p_6 = 0.25r^2$                0.0           1/16 = 0.0625
Ab/ab      $p_7 = 0.50r(1 - r)$           0.0           2/16 = 0.125
aB/aB      $p_8 = 0.25r^2$                0.0           1/16 = 0.0625
aB/ab      $p_9 = 0.50r(1 - r)$           0.0           2/16 = 0.125
ab/ab      $p_{10} = 0.25(1 - r)^2$       1/4 = 0.25    1/16 = 0.0625

Page 13: Mapping populations

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Genotype   Observed frequency   Expected, coupling           Expected, repulsion
AB/AB      p1                   $p_1 = 0.25(1 - r)^2$        $p_1 = 0.25r^2$
AB/aB      p2                   $p_2 = 0.50r(1 - r)$         $p_2 = 0.50r(1 - r)$
AB/Ab      p3                   $p_3 = 0.50r(1 - r)$         $p_3 = 0.50r(1 - r)$
AB/ab      p4                   $p_4 = 0.50(1 - r)^2$        $p_4 = 0.50r^2$
Ab/aB      p5                   $p_5 = 0.50r^2$              $p_5 = 0.50(1 - r)^2$
Ab/Ab      p6                   $p_6 = 0.25r^2$              $p_6 = 0.25(1 - r)^2$
Ab/ab      p7                   $p_7 = 0.50r(1 - r)$         $p_7 = 0.50r(1 - r)$
aB/aB      p8                   $p_8 = 0.25r^2$              $p_8 = 0.25(1 - r)^2$
aB/ab      p9                   $p_9 = 0.50r(1 - r)$         $p_9 = 0.50r(1 - r)$
ab/ab      p10                  $p_{10} = 0.25(1 - r)^2$     $p_{10} = 0.25r^2$

• Co-dominant
• Fully classified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
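The ten expected frequencies need not be memorised: they follow from the four gamete frequencies of the F1, multiplied pairwise (and doubled when the two gametes differ). The Python sketch below rebuilds the coupling and repulsion columns that way. The function names and the `phase` argument are illustrative choices, and equal recombination in male and female gametes is assumed.

```python
from itertools import combinations_with_replacement

HAPLOTYPES = ("AB", "Ab", "aB", "ab")


def gamete_frequencies(r, phase="coupling"):
    """Gamete frequencies produced by an AB/ab (coupling) or Ab/aB (repulsion) F1."""
    parental, recombinant = (1.0 - r) / 2.0, r / 2.0
    if phase == "coupling":
        return {"AB": parental, "Ab": recombinant, "aB": recombinant, "ab": parental}
    if phase == "repulsion":
        return {"AB": recombinant, "Ab": parental, "aB": parental, "ab": recombinant}
    raise ValueError("phase must be 'coupling' or 'repulsion'")


def f2_genotype_frequencies(r, phase="coupling"):
    """Expected frequencies of the ten phase-known F2 genotypes."""
    g = gamete_frequencies(r, phase)
    freqs = {}
    for h1, h2 in combinations_with_replacement(HAPLOTYPES, 2):
        # Two different gametes can unite in either order, hence the factor 2.
        weight = 1.0 if h1 == h2 else 2.0
        freqs[f"{h1}/{h2}"] = weight * g[h1] * g[h2]
    return freqs


for phase in ("coupling", "repulsion"):
    print(phase, f2_genotype_frequencies(0.2, phase))
```

Summing the returned values gives 1 for any r, which is a quick check that no genotype class has been left out.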

Page 14: Mapping populations

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) F2 Progeny

Genotype         Observed frequency   Expected, coupling
AB/AB            q1                   $q_1 = 0.25(1 - r)^2$
AB/aB            q2                   $q_2 = 0.50r(1 - r)$
AB/Ab            q3                   $q_3 = 0.50r(1 - r)$
AB/ab + Ab/aB    q4                   $q_4 = p_4 + p_5 = 0.50[(1 - r)^2 + r^2]$
Ab/Ab            q5                   $q_5 = 0.25r^2$
Ab/ab            q6                   $q_6 = 0.50r(1 - r)$
aB/aB            q7                   $q_7 = 0.25r^2$
aB/ab            q8                   $q_8 = 0.50r(1 - r)$
ab/ab            q9                   $q_9 = 0.25(1 - r)^2$

• Co-dominant
• Unclassified double heterozygotes
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B

Page 15: Mapping populations

Expected and Observed Genotypic Frequencies
Coupling (AB/ab) and Repulsion (Ab/aB) F2 Progeny

Phenotype   Observed frequency   Expected, coupling             Expected, repulsion
A_B_        f1                   $f_1 = 0.25(3 - 2r + r^2)$     $f_1 = 0.25(2 + r^2)$
A_bb        f2                   $f_2 = 0.25(2r - r^2)$         $f_2 = 0.25(1 - r^2)$
aaB_        f3                   $f_3 = 0.25(2r - r^2)$         $f_3 = 0.25(1 - r^2)$
aabb        f4                   $f_4 = 0.25(1 - r)^2$          $f_4 = 0.25r^2$

• Dominant
• Locus A = A and a
• Locus B = B and b
• r = recombination frequency between locus A and B
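With dominant markers only four phenotypic classes can be scored, and each expected frequency is a sum over the genotypes hidden inside that class. The sketch below derives the four classes from the F1 gamete frequencies, so the closed forms in the table can be checked numerically. The function name and the `phase` argument are illustrative assumptions of this sketch.

```python
def dominant_class_frequencies(r, phase="coupling"):
    """Expected frequencies of A_B_, A_bb, aaB_ and aabb among F2 progeny."""
    parental, recombinant = (1.0 - r) / 2.0, r / 2.0
    if phase == "coupling":      # F1 is AB/ab
        g_ab, g_Ab = parental, recombinant
    elif phase == "repulsion":   # F1 is Ab/aB
        g_ab, g_Ab = recombinant, parental
    else:
        raise ValueError("phase must be 'coupling' or 'repulsion'")
    aabb = g_ab * g_ab                       # ab/ab only
    a_bb = g_Ab * g_Ab + 2.0 * g_Ab * g_ab   # Ab/Ab plus Ab/ab
    aab_ = a_bb                              # aB/aB plus aB/ab, by symmetry
    a_b_ = 1.0 - a_bb - aab_ - aabb          # everything else
    return {"A_B_": a_b_, "A_bb": a_bb, "aaB_": aab_, "aabb": aabb}


r = 0.2
print(dominant_class_frequencies(r, "coupling"))
print(dominant_class_frequencies(r, "repulsion"))
# Closed form for comparison, coupling A_B_: 0.25 * (3 - 2r + r^2)
print(0.25 * (3 - 2 * r + r ** 2))
```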

Page 16: Mapping populations

Analysis

1. Single-locus analysis
2. Two-locus analysis
3. Detecting linkage and grouping
4. Ordering loci
5. Multi-point analysis

Page 17: Mapping populations

Mendelian Genetic Analysis
Phenotypic and Genotypic Distributions

• The expected segregation ratio of a gene is a function of the transmission probabilities

• If a gene produces a discrete phenotypic distribution, then an intrinsic hypothesis can be formulated to test whether the observed phenotypic distribution is consistent with the expected segregation ratio of the gene

• The heritability of a phenotypic trait that produces a Mendelian phenotypic distribution is ~1.0. Such traits are said to be fully penetrant

• The heritability of a DNA marker is theoretically ~1.0; however, it is affected by genotyping errors

Page 18: Mapping populations

Mendelian Genetic Analysis
Hypothesis Tests

• The expected segregation ratio (null hypothesis) is specified on the basis of the observed phenotypic or genotypic distribution

• One-way tests are performed to test for normal segregation of individual phenotypic or DNA markers
  – If the observed segregation ratio does not fit the expected segregation ratio, then the null hypothesis is rejected.
    • The expected segregation ratio is incorrect
    • Selection may have operated on the locus
    • The locus may not be fully penetrant
    • A Type I error has been committed

Page 19: Mapping populations

Mendelian Genetic Analysis
Hypothesis Tests

• Two-way tests are performed to test for independent assortment (null hypothesis: no linkage) between two phenotypic or DNA markers.
  – If two genes do not sort independently, then the null hypothesis is rejected
    • The two genes are linked (r < 0.50)
    • The expected segregation ratio is incorrect
    • A Type I error has been committed.

Page 20: Mapping populations

Mendelian Genetic Analysis

Null hypothesis   Accept                               Reject
True              No error (1 - α)                     Type I error (α), false positive
False             Type II error (β), false negative    No error (1 - β)

Page 21: Mapping populations

One-way or single-locus tests

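Goodness of fit statistics

• $\chi^2$ statistics

$\chi^2 = \sum_{i=1}^{k} \dfrac{(o_i - e_i)^2}{e_i}$

• Log likelihood ratio statistics (G-statistics)

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\dfrac{o_i}{e_i}\right)$

i = ith genotype (or allele, or phenotype)
k = number of classes
$o_i$, $e_i$ = observed and expected counts in class i

$\Pr[\chi^2 > \chi^2_{df}] = \alpha$

$\Pr[G > \chi^2_{df}] = \alpha$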
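Both goodness-of-fit statistics are short sums over the classes, so they are easy to compute directly. The Python sketch below is one way to do it; the function names are illustrative, and the example counts (40 aa and 82 Aa against an expected 61:61) are borrowed from the backcross example used later in this deck.

```python
import math


def chi_square_statistic(observed, expected):
    """Pearson chi-square: sum of (o - e)^2 / e over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def g_statistic(observed, expected):
    """Log likelihood ratio statistic: 2 * sum of o * ln(o / e).

    Classes with o = 0 contribute nothing (the limit of o * ln o as o -> 0).
    """
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


observed = [40, 82]
expected = [61.0, 61.0]   # 1:1 ratio applied to 122 progeny
print(round(chi_square_statistic(observed, expected), 3))
print(round(g_statistic(observed, expected), 3))
```

The two statistics are usually close to each other, and both are compared with the same chi-square distribution with k - 1 degrees of freedom.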

Page 22: Mapping populations

One-way or single-locus tests

Individual G-statistics for samples A and B

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1:1 ratio of aa to Aa

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\dfrac{o_i}{e_i}\right)$

i = ith genotype
k = 2 genotypic classes

$G_{SA} = 2\left[40 \ln\dfrac{40}{61} + 82 \ln\dfrac{82}{61}\right] = 2(-16.880 + 24.259) = 14.76$

$G_{SB} = 2\left[51 \ln\dfrac{51}{66} + 81 \ln\dfrac{81}{66}\right] = 2(-13.149 + 16.588) = 6.88$

$\Pr[\chi^2_{k-1} > G_{SA}] = \Pr[\chi^2_1 > 14.76] = 0.0001$

$\Pr[\chi^2_{k-1} > G_{SB}] = \Pr[\chi^2_1 > 6.88] = 0.0087$

Null hypothesis is rejected for both samples
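The individual tests can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper names are illustrative, the expected counts are obtained by applying the 1:1 ratio to each sample total, and the p-value uses the identity that the upper tail of a chi-square with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def g_statistic(observed, ratio):
    """G-statistic for observed counts against an expected segregation ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * w / ratio_sum for w in ratio]
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


samples = {"A": (40, 82), "B": (51, 81)}   # (aa, Aa) counts from the slide
for name, counts in samples.items():
    g = g_statistic(counts, (1, 1))
    print(f"Sample {name}: G = {g:.2f}, p = {chi2_sf_df1(g):.4f}")
```

Both p-values fall well below 0.05, in line with the conclusion on the slide.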

Page 23: Mapping populations

One-way or single-locus tests

Pooled G-statistic across samples

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: 1 aa to 1 Aa ratio for pooled samples

$G_P = 2 \sum_{i=1}^{k} \left(\sum_{j=1}^{p} o_{ij}\right) \ln\left(\dfrac{\sum_{j=1}^{p} o_{ij}}{\sum_{j=1}^{p} e_{ij}}\right)$

i = ith genotype
j = jth sample
k = genotypic classes
p = No. of samples (populations)

$G_P = 2\left[91 \ln\dfrac{91}{127} + 163 \ln\dfrac{163}{127}\right] = 2(-30.333 + 40.679) = 20.69$

$\Pr[\chi^2_{k-1} > G_P] = \Pr[\chi^2_1 > 20.69] = 0.0000054$

Null hypothesis is rejected
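The pooled test first adds the counts of each genotype across samples and then applies the ordinary G formula to the totals. A Python sketch of that two-step calculation follows; `pooled_g` and `chi2_sf_df1` are illustrative names, and the same 1:1 expected ratio is assumed for every sample.

```python
import math


def pooled_g(samples, ratio):
    """Pooled G-statistic: sum counts over samples, then test the totals."""
    pooled = [sum(column) for column in zip(*samples)]   # totals per genotype
    n = sum(pooled)
    ratio_sum = sum(ratio)
    expected = [n * w / ratio_sum for w in ratio]
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(pooled, expected) if o > 0)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


samples = [(40, 82), (51, 81)]   # (aa, Aa) counts for samples A and B
g_p = pooled_g(samples, (1, 1))
print(f"G_P = {g_p:.2f}, p = {chi2_sf_df1(g_p):.7f}")
```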

Page 24: Mapping populations

One-way or single-locus tests

The heterogeneity G-statistic

Two backcross populations (A and B) genotyped for a co-dominant marker (Brandt and Knapp 1993)

Genotype   Sample A   Sample B   Total
aa         40         51         91
Aa         82         81         163
Total      122        132        254

Null hypothesis: Samples A and B are homogeneous

$G_H = 2\left[\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} - \sum_{i=1}^{k} o_{i.} \ln o_{i.} - \sum_{j=1}^{p} o_{.j} \ln o_{.j} + o_{..} \ln o_{..}\right]$

i = ith genotype
j = jth sample (population)
k = genotypic classes
p = No. of samples (populations)
n = $o_{..}$ = Total No. of observations

$\sum_{i=1}^{k}\sum_{j=1}^{p} o_{ij} \ln o_{ij} = 82 \ln 82 + 40 \ln 40 + 51 \ln 51 + 81 \ln 81 = 1065.380$

$\sum_{i=1}^{k} o_{i.} \ln o_{i.} = 91 \ln 91 + 163 \ln 163 = 1240.769$

$\sum_{j=1}^{p} o_{.j} \ln o_{.j} = 122 \ln 122 + 132 \ln 132 = 1230.621$

$o_{..} \ln o_{..} = 254 \ln 254 = 1406.483$

$G_H = 2(1065.380 - 1240.769 - 1230.621 + 1406.483) = 0.95$

$\Pr[\chi^2_{(k-1)(p-1)} > G_H] = \Pr[\chi^2_1 > 0.95] = 0.33$ (N.S.)
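The heterogeneity statistic is built from four "x ln x" sums: over cells, over genotype totals, over sample totals, and for the grand total. The Python sketch below computes it that way for the 2 x 2 backcross table; the names `xlogx` and `heterogeneity_g` are illustrative, and the p-value helper is only valid for 1 degree of freedom.

```python
import math


def xlogx(x):
    """x * ln(x), taken as 0 when x is 0."""
    return x * math.log(x) if x > 0 else 0.0


def heterogeneity_g(table):
    """Heterogeneity G for a table laid out as table[sample][genotype]."""
    cells = sum(xlogx(o) for row in table for o in row)
    sample_totals = sum(xlogx(sum(row)) for row in table)
    genotype_totals = sum(xlogx(sum(col)) for col in zip(*table))
    grand_total = xlogx(sum(sum(row) for row in table))
    return 2.0 * (cells - genotype_totals - sample_totals + grand_total)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


table = [(40, 82), (51, 81)]   # rows: samples A and B; columns: aa, Aa
g_h = heterogeneity_g(table)
print(f"G_H = {g_h:.3f}, p = {chi2_sf_df1(g_h):.2f}")
```

Note that no expected ratio appears anywhere in this calculation: the test only asks whether the samples agree with each other, not whether they fit 1:1.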

Page 25: Mapping populations

One-way or single-locus tests

Relationship between G statistics

$G_T = G_{SA} + G_{SB} = 14.76 + 6.88 = 21.64$

$G_T = G_P + G_H = 20.69 + 0.95 = 21.64$

$\Pr[\chi^2_{p(k-1)} > G_T] = \Pr[\chi^2_2 > 21.64] = 0.00002$

Source          G       df                             Pr > G
Sample A        14.76   k-1 = 2-1 = 1                  0.0001
Sample B        6.88    k-1 = 2-1 = 1                  0.0087
Total           21.64   p(k-1) = 2(2-1) = 2            0.00002

Pooled          20.69   k-1 = 2-1 = 1                  0.000005
Heterogeneity   0.95    (k-1)(p-1) = (2-1)(2-1) = 1    0.33
Total           21.64   p(k-1) = 2(2-1) = 2            0.00002

k = genotypic classes
p = No. of samples (populations)
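The partition of the total G can be checked numerically. The sketch below computes the individual and pooled statistics, obtains the heterogeneity component by subtraction, and attaches p-values using two closed forms: erfc(sqrt(x/2)) for 1 degree of freedom and exp(-x/2) for 2. All function and variable names are illustrative, and the same 1:1 expected ratio is assumed for both samples.

```python
import math


def g_statistic(observed, ratio):
    """G-statistic for observed counts against an expected segregation ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * w / ratio_sum for w in ratio]
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


samples = [(40, 82), (51, 81)]   # (aa, Aa) counts for samples A and B
ratio = (1, 1)
k, p = len(ratio), len(samples)

g_individual = [g_statistic(s, ratio) for s in samples]
g_total = sum(g_individual)
g_pooled = g_statistic([sum(col) for col in zip(*samples)], ratio)
g_heterogeneity = g_total - g_pooled

print(f"Total         G = {g_total:.2f}  df = {p * (k - 1)}  p = {chi2_sf(g_total, 2):.5f}")
print(f"Pooled        G = {g_pooled:.2f}  df = {k - 1}  p = {chi2_sf(g_pooled, 1):.7f}")
print(f"Heterogeneity G = {g_heterogeneity:.2f}  df = {(k - 1) * (p - 1)}  "
      f"p = {chi2_sf(g_heterogeneity, 1):.2f}")
```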

Page 26: Mapping populations

One-way or single-locus tests

G-statistic for an F2 population

F2 progeny of Ae. cylindrica genotyped for the SSR marker barc98.

Allelic constitution   Genotype   Observed   Expected
120bp / 120bp          aa         21         23.5
120bp / 124bp          Aa         44         47
124bp / 124bp          AA         29         23.5
Total                             94         94

Null hypothesis: 1:2:1 ratio of aa:Aa:AA

$G = 2 \sum_{i=1}^{k} o_i \ln\left(\dfrac{o_i}{e_i}\right)$

i = ith genotype
k = 3 genotypic classes

$G = 2\left[21 \ln\dfrac{21}{23.5} + 44 \ln\dfrac{44}{47} + 29 \ln\dfrac{29}{23.5}\right] = 2(-2.362 - 2.902 + 6.098) = 1.668$

$\Pr[\chi^2_{k-1} > G] = \Pr[\chi^2_2 > 1.67] = 0.434$

Null hypothesis is not rejected
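The F2 test differs from the backcross tests only in the expected ratio (1:2:1) and the degrees of freedom (k - 1 = 2). A Python sketch of the calculation follows; the helper names are illustrative, and the p-value relies on the fact that the upper tail of a chi-square with 2 degrees of freedom is exp(-x/2).

```python
import math


def g_statistic(observed, ratio):
    """G-statistic for observed counts against an expected segregation ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * w / ratio_sum for w in ratio]
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


observed = (21, 44, 29)   # aa, Aa, AA counts for barc98 from the slide
g = g_statistic(observed, (1, 2, 1))
print(f"G = {g:.3f}, p = {chi2_sf_df2(g):.3f}")
```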

Page 27: Mapping populations

Calculating probability values for Chi-square distributions

SAS program

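    /* Read the test statistic (x) and its degrees of freedom (df) */
    data pv;
      input x df;
      datalines;
    3.75 2
    ;
    /* p-value = upper tail = 1 - chi-square cumulative probability */
    data pvalue;
      set pv;
      pvalue = 1 - probchi(x, df);
      output;
    proc print;
    run;

Output

    Obs    x       df    pvalue
    1      3.75    2     0.15335

Excel formula

    =CHIDIST(x, degrees_freedom)

    =CHIDIST(3.75, 2)

Output

    0.15335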
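For readers without SAS or Excel, the same upper-tail probability can be computed with the Python standard library alone. The sketch below uses the textbook finite-sum expressions for integer degrees of freedom (an exponential series for even df, an erfc term plus a series for odd df); the function name `chi2_sf` is an illustrative choice, and a statistics package would normally be preferred for production work.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability Pr[chi-square(df) > x] for a positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term = total = 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_j x^(j - 1/2) / (1*3*...*(2j-1))
    root = math.sqrt(x)
    term = root
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(root / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


print(round(chi2_sf(3.75, 2), 5))   # same inputs as the SAS and Excel examples
```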