Lab 12. Linkage Disequilibrium November 28, 2012

Upload: ashley-miles

Post on 16-Jan-2016


Page 1: Lab 12. Linkage Disequilibrium November 28, 2012

Lab 12. Linkage Disequilibrium

November 28, 2012

Page 2

Goals

1. Estimate LD in terms of D, D', and r².
2. Determine the effect of random and non-random mating on LD.
3. Estimate LD from diploid genotype data using the EM algorithm.

Page 3

LD estimation in a two-locus (A and B), two-allele (1 and 2) model

Gametes: A1B1, A1B2, A2B1, A2B2, with allele frequencies p1 and p2 at locus A and q1 and q2 at locus B.

Gamete   Observed gametic frequency   Expected gametic frequency under linkage equilibrium
A1B1     x11                          p1q1
A1B2     x12                          p1q2
A2B1     x21                          p2q1
A2B2     x22                          p2q2

Allele   Allele frequency
A1       p1 = x11 + x12
A2       p2 = x21 + x22
B1       q1 = x11 + x21
B2       q2 = x12 + x22

$$D = x_{11} - p_1 q_1 = p_1 q_2 - x_{12} = p_2 q_1 - x_{21} = x_{22} - p_2 q_2 = x_{11} x_{22} - x_{12} x_{21}$$
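The definition of D can be checked numerically. The sketch below is a minimal illustration: the function name `gametic_d` and the gamete counts are hypothetical, not taken from the lab. It computes D from four gamete counts and confirms that the first and last forms of the equation agree.

```python
def gametic_d(n11, n12, n21, n22):
    """Return (D, p1, q1) from counts of A1B1, A1B2, A2B1 and A2B2 gametes."""
    total = n11 + n12 + n21 + n22
    x11, x12, x21, x22 = (n / total for n in (n11, n12, n21, n22))
    p1 = x11 + x12  # frequency of allele A1
    q1 = x11 + x21  # frequency of allele B1
    d = x11 - p1 * q1
    # The cross-product form x11*x22 - x12*x21 must give the same value.
    assert abs(d - (x11 * x22 - x12 * x21)) < 1e-12
    return d, p1, q1


# Hypothetical counts, chosen only for illustration.
d, p1, q1 = gametic_d(40, 10, 10, 40)
print(f"D = {d:.4f}, p1 = {p1:.2f}, q1 = {q1:.2f}")
```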

Page 4

Different measures of LD

$$r = \frac{D}{\sqrt{p_1 p_2 q_1 q_2}}$$

$$D' = \frac{D}{D_{\max}}$$

If D > 0, Dmax = min(p1q2, p2q1).

If D < 0, Dmax = min(p1q1, p2q2).

$$\chi^2 = N r^2 = \frac{N D^2}{p_1 p_2 q_1 q_2}$$
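A short sketch of how these measures fit together follows. The function name `ld_measures` and the input frequencies are hypothetical, N is taken to be the number of gametes sampled, and both loci are assumed to be polymorphic so that the denominators are non-zero. The resulting chi-square value is usually compared with the 1-degree-of-freedom critical value of 3.84 at α = 0.05.

```python
def ld_measures(x11, x12, x21, x22, n_gametes):
    """Return (D, D', r^2, chi-square) from the four gametic frequencies."""
    p1, p2 = x11 + x12, x21 + x22  # allele frequencies at locus A
    q1, q2 = x11 + x21, x12 + x22  # allele frequencies at locus B
    d = x11 * x22 - x12 * x21
    # Dmax depends on the sign of D, as stated above.
    if d > 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p1 * p2 * q1 * q2)
    chi2 = n_gametes * r2
    return d, d_prime, r2, chi2


# Hypothetical gametic frequencies and sample size.
d, d_prime, r2, chi2 = ld_measures(0.4, 0.1, 0.1, 0.4, n_gametes=100)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}, chi-square = {chi2:.2f}")
```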

Page 5

Allele history

[Figure: allele history through time, annotated "High drift or selective sweep"]

Page 6

LD Broken by recombination

[Figure: two-locus gametes (A1B1, A2B2, A1B2, A2B1) shown before and after recombination]

Page 7

LD Broken by recombination

Closer proximity -> less recombination -> stronger LD

Page 8

Decay of LD

$$D_t = (1 - c)^t D_0 \approx e^{-ct} D_0$$

$$t \approx -\frac{1}{c} \ln\left(\frac{D_t}{D_0}\right)$$
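The decay equation can be sketched in a few lines. The function names and the numbers below are hypothetical, and the code assumes random mating with 0 < c < 1. It uses the exact form (1 - c)^t rather than the exponential approximation, so the generation count can differ slightly from the value given by the approximate formula for t.

```python
import math


def d_after(d0, c, t):
    """Expected D after t generations of random mating."""
    return (1 - c) ** t * d0


def generations_to_reach(d0, c, threshold):
    """Smallest whole number of generations after which |D| < threshold."""
    if abs(d0) < threshold:
        return 0
    # Solve (1 - c)^t * |D0| = threshold for t, then step past it.
    t = math.log(threshold / abs(d0)) / math.log(1 - c)
    return math.floor(t) + 1


# Hypothetical starting value, recombination rate and threshold.
d0, c, threshold = 0.1, 0.1, 0.01
t = generations_to_reach(d0, c, threshold)
print(t, d_after(d0, c, t))
print("approximate t:", -math.log(threshold / d0) / c)
```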

Recombination rate for self-fertilizing organisms:

Page 9

Problem 1. In most conifers, gamete frequencies and the linkage phase of diploid genotypes can be determined directly because seeds contain relatively large amounts of haploid nutritional tissue (called endosperm or megagametophyte), which originates from the maternal gamete. As part of a study of the linkage relationship among allozyme loci in loblolly pine (Pinus taeda), Adams and Joly (1980) sampled 456 gametes at loci phosphoglucose isomerase 2 (PGI2; for simplicity, let this be locus A) and glutamate-oxaloacetate transaminase 1 (GOT1; let this be locus B) and observed the following numbers of gametes. (15 minutes)

Gamete   Count
A1B1     138
A1B2     88
A2B1     78
A2B2     152
Total    456

a.) Calculate D, D', and r², and test the statistical significance of the gametic disequilibrium between the two loci.

b.) Because the linkage phase of each mother tree was known, Adams and Joly were able to estimate that the recombination rate between the two loci is c = 0.044.

    i. What is the expected value of D in the next generation (i.e., in the offspring of the seeds that were included in the study)?
    ii. How many generations of random mating will it take for D to decay below 0.005?
    iii. What is the expected value of D in the next generation if S = 0.1? S = 0.5? S = 0.9?

c.) Repeat the calculations from b) assuming c = 0.5 (i.e., assuming that the two loci are physically unlinked).

d.) Discuss the relative importance of rates of recombination and self-fertilization in determining the rate of LD decay.
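Part b.iii needs a recombination rate adjusted for self-fertilization. The slide's own formula for this is not available here, so the sketch below rests on an assumption: it uses the commonly cited approximation in which the effective recombination rate is c(1 - F), with F = S / (2 - S) the equilibrium inbreeding coefficient for a selfing rate S. The function names and the starting value of D are hypothetical. If the course uses a different expression, only `effective_recombination` would need to change.

```python
def effective_recombination(c, selfing_rate):
    """Effective recombination rate under partial selfing.

    ASSUMPTION: c_eff = c * (1 - F) with F = S / (2 - S); this may not be
    the exact expression used in the lab.
    """
    f = selfing_rate / (2 - selfing_rate)  # equilibrium inbreeding coefficient
    return c * (1 - f)


def next_generation_d(d0, c, selfing_rate=0.0):
    """Expected D after one generation, given D0, c and selfing rate S."""
    return (1 - effective_recombination(c, selfing_rate)) * d0


# c and the S values come from Problem 1; d0 is a hypothetical placeholder.
d0 = 0.1
for s in (0.0, 0.1, 0.5, 0.9):
    print(s, effective_recombination(0.044, s), next_generation_d(d0, 0.044, s))
```

Under this assumption, higher selfing rates shrink the effective recombination rate, so D decays more slowly even for loosely linked loci.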

Page 10

Problem 2. Compare rates of decay of r² with physical distance in sequences from the phytochrome B2 (PHYB2) gene in European aspen (Populus tremula) and the phytochrome C (PHYC) gene in Arabidopsis thaliana.

a) Show scatter plots with trend lines illustrating the decay of r² with physical distance for each gene.

b) How do the patterns of LD differ between these two species, and why?

c) GRADUATE STUDENTS: Provide facts and citations supporting your biological explanation.
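One simple way to obtain a trend line for part a) is an ordinary least-squares fit of r² against log10 of the distance between sites. The sketch below is only an illustration of that choice: the function name and the data points are hypothetical, and a logarithmic trend is an assumption rather than the method the lab prescribes. The fitted slope and intercept can then be drawn over the scatter plot in any plotting tool.

```python
import math


def fit_log_trend(distances_bp, r2_values):
    """Least-squares fit of r^2 = intercept + slope * log10(distance)."""
    xs = [math.log10(d) for d in distances_bp]  # distances must be > 0
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(r2_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, r2_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical pairwise distances (bp) and r^2 values.
distances = [10, 100, 1000, 10000]
r2_values = [0.8, 0.6, 0.4, 0.2]
slope, intercept = fit_log_trend(distances, r2_values)
print(f"r^2 = {intercept:.3f} + ({slope:.3f}) * log10(distance)")
```

A steeper negative slope indicates faster decay of LD with distance, which gives a single number to compare between the two genes.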

Page 11

Haplotypes through EM

• When we genotype, we often don't know the actual haplotypes
  – Unphased haplotypes
• Can use a maximum likelihood method to obtain haplotype frequencies
  – Expectation Maximization (EM)

Page 12

Haplotypes through EM

1. Initialize – Guess the gamete frequencies.
2. Expectation Step – Find expected frequencies of known phase genotypes given gamete frequencies.
3. Maximization Step – Find expected frequencies of all unphased genotypes given gamete frequencies.
   a. Use these to make new gamete frequency estimates.

$$L = P(\text{Data} \mid \text{haplotype frequencies}) = \frac{n!}{n_1!\, n_2!\, n_3!\, n_4!\, n_5!}\, P_1^{n_1} P_2^{n_2} P_3^{n_3} P_4^{n_4} P_5^{n_5}$$

where n is the number of unphased genotypes in the sample, n1, n2, ..., n5 are the numbers of times each unphased genotype was observed in the sample, and P1, P2, ..., P5 are the expected frequencies of the unphased genotypes in the sample.
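The iteration can be sketched for the simplest case of two biallelic loci, where only the double heterozygote (A1A2 B1B2) has ambiguous phase. This is a minimal textbook-style EM written for illustration; it is not necessarily the procedure Arlequin implements, and its expectation and maximization steps follow the standard formulation, which is worded differently from the slide. The function name, the genotype-table layout and the counts are all hypothetical.

```python
def em_haplotype_freqs(geno, tol=1e-10, max_iter=1000):
    """Estimate haplotype frequencies from unphased two-locus genotypes.

    geno[i][j] = number of individuals carrying i copies of A1 and
    j copies of B1 (i, j in 0..2).
    """
    n_hap = 2 * sum(sum(row) for row in geno)
    # Haplotype counts from genotypes whose phase is unambiguous.
    base = {
        "A1B1": 2 * geno[2][2] + geno[2][1] + geno[1][2],
        "A1B2": 2 * geno[2][0] + geno[2][1] + geno[1][0],
        "A2B1": 2 * geno[0][2] + geno[1][2] + geno[0][1],
        "A2B2": 2 * geno[0][0] + geno[1][0] + geno[0][1],
    }
    n_dh = geno[1][1]  # double heterozygotes: phase unknown
    freqs = {h: 0.25 for h in base}  # initial guess
    for _ in range(max_iter):
        # E-step: expected share of double heterozygotes that are A1B1/A2B2.
        cis = freqs["A1B1"] * freqs["A2B2"]
        trans = freqs["A1B2"] * freqs["A2B1"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: re-estimate haplotype frequencies from expected counts.
        new = {
            "A1B1": (base["A1B1"] + w * n_dh) / n_hap,
            "A2B2": (base["A2B2"] + w * n_dh) / n_hap,
            "A1B2": (base["A1B2"] + (1 - w) * n_dh) / n_hap,
            "A2B1": (base["A2B1"] + (1 - w) * n_dh) / n_hap,
        }
        change = max(abs(new[h] - freqs[h]) for h in new)
        freqs = new
        if change < tol:
            break
    return freqs


# Hypothetical sample: 4 A2A2 B2B2, 2 double heterozygotes, 4 A1A1 B1B1.
genotypes = [[4, 0, 0], [0, 2, 0], [0, 0, 4]]
freqs = em_haplotype_freqs(genotypes)
d = freqs["A1B1"] * freqs["A2B2"] - freqs["A1B2"] * freqs["A2B1"]
print(freqs, "D =", d)
```

The estimated frequencies can be passed straight into the formulas for D, D' and r². Because EM can converge to a local maximum, it is common to rerun it from several starting guesses.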

Page 13

Problem 3. File human_LD.arp contains data for humans from two populations (Han and Melanesian) genotyped for the same loci you have analyzed for departures from Hardy-Weinberg Equilibrium and population structure. The Han sample includes individuals from a broad geographic area in China, whereas the Melanesian sample only includes individuals from Bougainville Island. Use Arlequin to test for significant linkage disequilibrium among the 10 loci in each of these populations.

a.) How do you interpret the difference in the number of linked loci in the two populations?

b.) GRAD STUDENTS: How many pairs of loci are expected to show significant LD at α = 0.05 by chance?

c.) GRAD STUDENTS: Provide facts and citations supporting your biological claim.
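The chance expectation in part b) comes down to counting locus pairs. The sketch below assumes that every pair of the 10 loci is tested once at α = 0.05 and that no pair is truly in LD; the variable names are illustrative.

```python
from math import comb

n_loci = 10
alpha = 0.05

# Number of distinct pairs of loci, i.e. the number of pairwise LD tests.
n_pairs = comb(n_loci, 2)

# Expected number of tests significant by chance if no pair is truly in LD.
expected_false_positives = n_pairs * alpha

print(n_pairs, expected_false_positives)
```

Comparing the observed number of significant pairs in each population against this baseline helps separate real LD from multiple-testing noise.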

Page 14

[Figures: Melanesia (source: http://en.wikipedia.org/wiki/Melanesia) and Han]