Lecture 23: Causes and Consequences of Linkage Disequilibrium
November 16, 2012
Last Time
Signatures of selection based on synonymous and nonsynonymous substitutions
Multiple loci and independent segregation
Estimating linkage disequilibrium
Today
Recombination and LD
Drift and LD
Mutation and LD
Selection and LD
Hitchhiking and selective sweeps
Effects of recombination rate on LD
Decline in LD over time with different theoretical recombination rates (c)
Even with independent segregation (c=0.5), multiple generations required to break up allelic associations
$D_t = D_0 e^{-ct}$
Where t is time (in generations) and e is the base of the natural logarithm (2.718)
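A minimal Python sketch of this decay, for illustration only. It compares the exponential form above with the exact discrete-generation form D_t = D_0 (1 - c)^t, which the exponential approximates well when c is small. The function names and the starting value D_0 = 0.25 are assumptions made for this example, not part of the lecture.

```python
import math


def ld_after(d0, c, t):
    """Exact discrete-generation decay: D_t = D_0 * (1 - c)**t."""
    return d0 * (1.0 - c) ** t


def ld_after_approx(d0, c, t):
    """Exponential approximation: D_t = D_0 * exp(-c * t)."""
    return d0 * math.exp(-c * t)


def generations_to_fraction(c, fraction):
    """Generations needed for D to fall to `fraction` of D_0 (exact form)."""
    return math.log(fraction) / math.log(1.0 - c)


if __name__ == "__main__":
    d0 = 0.25  # hypothetical starting LD (the maximum possible value of D)
    for c in (0.5, 0.1, 0.01, 0.001):
        exact = ld_after(d0, c, 10)
        approx = ld_after_approx(d0, c, 10)
        print(f"c={c}: exact D_10={exact:.5f}, approx D_10={approx:.5f}")
```

With c = 0.5 (independent segregation) D only halves each generation, which is why several generations are still needed to break up an association.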
LD varies substantially across human genome
Nature, Vol. 437, 27 October 2005
Average r² for pairs of SNPs separated by 30 kb, in 1 Mb windows
LD affected by location relative to telomeres and centromeres, chromosome length, GC content, sequence polymorphism, and repeat composition
Highest and lowest levels of LD found in gene-rich regions
Human HapMap Project and Whole Genome Scans
LD structure of human Chromosome 19 (www.hapmap.org)
1 common SNP genotyped every 700 bp for 270 individuals (3.4 million SNPs)
9.2 million SNPs in total
Nature, Vol. 437, 27 October 2005
LD in the Poplar Genome
LD declines rapidly with distance
LD higher in genes than in genome as a whole
Loci separated by kilobases still in LD!
[Figure: r² (0.0 to 0.5) versus distance (0 to 20 kb); series: genomewide (core of range) and genes (core of range)]
Recombination Across Poplar Chromosomes
Substantial variation in recombination rate
Related to repeat composition, methylation, and distance from centromere
Recombination rate varies among individuals
Rate is often higher in females than males
Rate varies among individuals within males and females
Variation in recombination rate in the MHC region (3.3 Mb) in human sperm donors
Genetic Drift and LD
Begin with highly diverse haplotype pool
Drift leads to chance increase of certain haplotypes
Generates nonrandom association between alleles at different loci (LD)
Genetic Drift and LD
Why doesn’t recombination reduce LD in this situation?
Expected Gamete Frequencies: Double Homozygote
Parent genotype: A1B1 / A1B1
Meiosis
Gametes: A1B1 (nonrecombinant), A1B1 (recombinant), A1B1 (recombinant), A1B1 (nonrecombinant)
Expected Gamete Frequencies: Double Heterozygote
Parent genotype: A1B1 / A2B2
Meiosis
Gametes: A1B1 (nonrecombinant), A1B2 (recombinant), A2B1 (recombinant), A2B2 (nonrecombinant)
LD is partially a function of recombination rate
Expected proportions of gametes produced by various genotypes over two generations
Where c is the recombination rate and D0 is the initial amount of LD
Double heterozygote is only case where recombination matters
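A short Python sketch of why only the double heterozygote matters, assuming standard two-locus results: a double heterozygote produces each parental gamete with probability (1 - c)/2 and each recombinant gamete with probability c/2, and under random mating each haplotype frequency shifts by cD per generation. Function names and the `phase` argument are hypothetical, chosen for this illustration.

```python
def gametes_from_double_heterozygote(c, phase="coupling"):
    """Gamete proportions from an A1B1/A2B2 (coupling) or A1B2/A2B1 (repulsion) parent."""
    parental = (1.0 - c) / 2.0     # each nonrecombinant gamete
    recombinant = c / 2.0          # each recombinant gamete
    if phase == "coupling":
        return {"A1B1": parental, "A1B2": recombinant,
                "A2B1": recombinant, "A2B2": parental}
    return {"A1B1": recombinant, "A1B2": parental,
            "A2B1": parental, "A2B2": recombinant}


def next_haplotype_freqs(x11, x12, x21, x22, c):
    """Haplotype frequencies after one generation of random mating.

    D = x11*x22 - x12*x21; recombination in double heterozygotes
    removes a fraction c of D each generation.
    """
    d = x11 * x22 - x12 * x21
    return (x11 - c * d, x12 + c * d, x21 + c * d, x22 - c * d)


if __name__ == "__main__":
    print(gametes_from_double_heterozygote(0.1))
    print(next_haplotype_freqs(0.5, 0.0, 0.0, 0.5, 0.1))
```

A double homozygote such as A1B1/A1B1 yields only A1B1 gametes whatever the value of c, so it contributes nothing to the change in D.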
Effect of Drift on LD
Drift and recombination will have opposing effects on LD
$E(r^2) = \frac{1}{1 + 4N_e c}$
Where r² is the squared correlation coefficient for alleles at two loci, Ne is effective population size, and c is recombination rate
4Nec is the “population recombination rate”
Expression approaches 0 for large populations or high recombination rates
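A small Python sketch of this behavior, assuming the widely used approximation E(r²) = 1 / (1 + 4Nec) for the balance between drift and recombination. The function name and the parameter values are illustrative assumptions only.

```python
def expected_r2(ne, c):
    """Approximate expected r^2 under drift and recombination: 1 / (1 + 4*Ne*c)."""
    return 1.0 / (1.0 + 4.0 * ne * c)


if __name__ == "__main__":
    # Hypothetical parameter combinations, chosen only to show the trend.
    for ne in (100, 1000, 10000):
        for c in (0.0001, 0.001, 0.01):
            print(f"Ne={ne:>6}, c={c}: 4Nec={4 * ne * c:g}, E(r^2)={expected_r2(ne, c):.4f}")
```

Increasing either Ne or c raises 4Nec and pushes the expected r² toward 0; with c = 0 the expression equals 1 regardless of population size.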
Combined effects of Drift and Recombination
LD declines as a function of population recombination rate (Ner in this figure, same as Nec)
Effects of chance fluctuation of gamete frequencies
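The chance fluctuation can be illustrated with a toy Wright-Fisher simulation in Python. This is a sketch under stated assumptions, not a method from the lecture: haplotypes start at linkage equilibrium (D = 0), recombination is applied deterministically each generation, and drift is modeled by drawing 2N gametes at random. All names, the seed, and the parameter values are hypothetical.

```python
import random


def simulate_r2(n_diploid, c, generations, rng):
    """One replicate; returns r^2, or None if either locus has fixed."""
    # Haplotype frequencies in the order A1B1, A1B2, A2B1, A2B2.
    x = [0.25, 0.25, 0.25, 0.25]  # linkage equilibrium, D = 0
    n_gametes = 2 * n_diploid
    for _ in range(generations):
        # Recombination moves haplotype frequencies toward equilibrium.
        d = x[0] * x[3] - x[1] * x[2]
        weights = [max(0.0, x[0] - c * d), max(0.0, x[1] + c * d),
                   max(0.0, x[2] + c * d), max(0.0, x[3] - c * d)]
        # Drift: sample 2N gametes from the gamete pool.
        draws = rng.choices(range(4), weights=weights, k=n_gametes)
        x = [draws.count(i) / n_gametes for i in range(4)]
    p = x[0] + x[1]  # frequency of A1
    q = x[0] + x[2]  # frequency of B1
    if min(p, 1.0 - p, q, 1.0 - q) < 0.5 / n_gametes:
        return None  # r^2 is undefined once a locus is monomorphic
    d = x[0] * x[3] - x[1] * x[2]
    return d * d / (p * (1.0 - p) * q * (1.0 - q))


def mean_r2(n_diploid, c=0.01, generations=20, replicates=100, seed=1):
    """Mean r^2 over replicates in which both loci are still polymorphic."""
    rng = random.Random(seed)
    values = [simulate_r2(n_diploid, c, generations, rng) for _ in range(replicates)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values)


if __name__ == "__main__":
    print("N = 10: ", mean_r2(10))
    print("N = 500:", mean_r2(500))
```

Even though every replicate starts with no LD, sampling alone generates nonrandom associations, and far more so in the small population.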
How should inbreeding affect linkage disequilibrium?
Mutation and LD: High mutation rates
Allelic associations are masked by high mutation rates, so LD is decreased
Gamete Pool with Low Mutation
Gamete Pool with High Mutation
LD and neutral markers
Low LD is the EXPECTED condition unless other factors are acting
If LD is low, neutral markers represent a very small segment of the genome in most cases
In most parts of the genome, LD declines to background levels within 1 kb (though this varies by organism and population)
Care must be taken in drawing conclusions about selection based on population structure derived from neutral markers
Selection and Linkage Disequilibrium (LD)
Selection can create LD between unlinked loci
Epistasis: two or more loci interact with each other nonadditively
Phenotype depends on alleles at multiple loci
Change in D over time due to epistatic interactions between loci with directional selection
Why does D decline after generation 15 in this scenario?
$D_{max} = \min(p_1 q_2,\; p_2 q_1)$ for D > 0
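A brief Python sketch of how D, D_max, D' and r² can be computed from the four haplotype frequencies. It uses the D_max expression above for D > 0 and assumes the standard companion bound min(p1 q1, p2 q2) for D < 0. The function name and example frequencies are hypothetical.

```python
def ld_stats(x11, x12, x21, x22):
    """Return (D, D_max, D', r^2) from frequencies of A1B1, A1B2, A2B1, A2B2."""
    p1 = x11 + x12          # frequency of A1
    p2 = 1.0 - p1           # frequency of A2
    q1 = x11 + x21          # frequency of B1
    q2 = 1.0 - q1           # frequency of B2
    d = x11 * x22 - x12 * x21
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)  # assumed standard bound for D < 0
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p1 * p2 * q1 * q2
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_max, d_prime, r2


if __name__ == "__main__":
    # Intermediate allele frequencies: D can be as large as 0.25.
    print(ld_stats(0.5, 0.0, 0.0, 0.5))
    # Alleles near fixation: the association is still complete, but D_max is small.
    print(ld_stats(0.98, 0.0, 0.0, 0.02))
```

Because D_max depends on allele frequencies, the largest attainable D shrinks as alleles approach fixation, even when the association between loci remains complete.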
Epistasis and LD
Begin with highly diverse haplotype pool
Directional selection leads to increase of certain haplotype combinations
Generates nonrandom association between alleles at different loci (LD)
Recombination vs Polymorphism in Poplar
Nucleotide diversity (π) is positively correlated with population recombination rate (4Nec) (R² = 0.38)
[Figure: LG VII; x-axis: position (0 to 14 Mb); y-axis: rate (0.000 to 0.006); series: 4Nec]
Recombination vs Polymorphism
Recombination rate varies substantially across Drosophila genome
Nucleotide diversity is positively correlated with recombination rate (Hartl and Clark 2007)
Why is polymorphism reduced in areas of low recombination?
(or why is polymorphism enhanced in areas of high recombination?)
Selection and LD
Selection affects target loci as well as loci in LD
Hitchhiking: neutral alleles increase in frequency because of selective advantage of allele at another locus in LD
Selective Sweep: selectively advantageous allele increases in frequency and changes frequency of variants in LD
Background Selection: selection against detrimental mutants also removes alleles at neutral loci in LD
Hill-Robertson Effect: directional selection at one locus affects outcome of selection at another locus in LD
http://medinfo.ufl.edu/
Selective Sweep in Plasmodium
Pyrimethamine used to treat malaria parasite (Plasmodium falciparum)
Parasite developed resistance at the dhfr locus, and the resistance allele rapidly became fixed in the population (6 years on Thai border)
Microsatellite variation wiped out in vicinity of dhfr
Selective Sweep
Positive selection leads to an increase in frequency of a particular allele, and of the alleles linked to it at nearby loci
Results in enhanced LD in region of selected polymorphism
Accentuated in rapidly expanding population
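Hitchhiking during a sweep can be sketched with a deterministic two-locus model in Python. The assumptions here are made for illustration and do not come from the lecture: a haploid model, a beneficial allele B1 with fitness 1 + s that arises on an A1 background, a neutral locus A, and selection followed by recombination each generation. The function name, starting frequencies, and parameter values are hypothetical.

```python
def neutral_freq_after_sweep(c, s=0.1, generations=500,
                             start=(0.01, 0.29, 0.0, 0.70)):
    """Final frequency of neutral allele A1 after selection on locus B.

    Haplotype order: A1B1, A1B2, A2B1, A2B2. B1 haplotypes have fitness 1 + s.
    """
    x11, x12, x21, x22 = start
    for _ in range(generations):
        # Selection acts only through the allele carried at locus B.
        w_bar = (1.0 + s) * (x11 + x21) + (x12 + x22)
        x11, x21 = x11 * (1.0 + s) / w_bar, x21 * (1.0 + s) / w_bar
        x12, x22 = x12 / w_bar, x22 / w_bar
        # Recombination erodes the association between the two loci.
        d = x11 * x22 - x12 * x21
        x11, x12, x21, x22 = x11 - c * d, x12 + c * d, x21 + c * d, x22 - c * d
    return x11 + x12


if __name__ == "__main__":
    print("starting frequency of A1: 0.30")
    for c in (0.0, 0.001, 0.01, 0.5):
        print(f"c={c}: final frequency of A1 = {neutral_freq_after_sweep(c):.4f}")
```

With no recombination the neutral allele is carried to fixation along with B1; with free recombination its frequency barely changes, which is why the footprint of a sweep is confined to the linked region.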
Derived Alleles and Selective Sweeps
Recent, incomplete selective sweeps are expected to leave a molecular signature of
•High frequency of derived alleles
•Strong geographic differentiation
•Elevated LD
[Figure: alleles A and C compared across chimp, Africans, and Europeans]
LD Provides evidence of recent selection
Regions under recent selection experience selective sweep, show high LD locally
Patterns of LD in human genome provide signature of selection
A statistic based on length of haplotypes and frequency of “derived alleles” reveals regions under selection (“iHS” statistic)
Selective sweep for lactase enzyme in Europeans after domestication of dairy cows
Voight et al. 2006 PLoS Biology 4: 446-458
Some factors that affect LD
Factor                Effect
Recombination rate    Higher recombination lowers LD
Genetic Drift         Increases LD
Inbreeding            Increases LD
Mutation rate         High mutation rate decreases overall LD
Epistasis             Increases LD
Selection             Locally increased LD