linkage disequilibrium


Post on 07-Feb-2016






Linkage Disequilibrium. Granovsky Ilana and Berliner Yaniv. Computational Genetics 19.06.03. What is Linkage Disequilibrium?. - PowerPoint PPT Presentation


  • Linkage Disequilibrium. Granovsky Ilana and Berliner Yaniv. Computational Genetics, 19.06.03

  • What is Linkage Disequilibrium?

    When the occurrence of pairs of specific alleles at different loci on the same haplotype is not independent, the deviation from independence is termed linkage disequilibrium.

    Linkage disequilibrium is usually seen as an association between one specific allele at one locus and another specific allele at a second locus.

  • Linkage Disequilibrium Coefficient Definitions

    Xi: number of observations in cell i, with X1 + X2 + X3 + X4 = n.
    D11: coefficient of gametic linkage disequilibrium between allele 1 at locus 1 and allele 1 at locus 2.
    D11 = E[X1*X4 - X2*X3 | n = 1]

    Cell counts and cell probabilities:

                                               Marker 2
    Marker 1                                   Allele 1 (probability = p2)    Allele 2 (probability = 1-p2)
    Allele 1 (probability = p1)                X1: p1*p2 + D11                X2: p1*(1-p2) - D11
    Allele 2 (probability = 1-p1)              X3: (1-p1)*p2 - D11            X4: (1-p1)*(1-p2) + D11
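    As a quick check of the table, the four cell probabilities can be generated from p1, p2 and D11 and confirmed to sum to 1. The sketch below is illustrative only: the function names `ld_cells` and `d_from_cells` are assumptions (they do not come from the slides), and the numeric inputs are borrowed from the worked example later in the presentation.

```python
def ld_cells(p1, p2, d11):
    """Return the probabilities of cells 1..4 for two biallelic markers.

    p1, p2: frequency of allele 1 at marker 1 and marker 2.
    d11:    linkage disequilibrium coefficient for the 1-1 haplotype.
    """
    cells = (
        p1 * p2 + d11,              # cell 1: allele 1 / allele 1
        p1 * (1 - p2) - d11,        # cell 2: allele 1 / allele 2
        (1 - p1) * p2 - d11,        # cell 3: allele 2 / allele 1
        (1 - p1) * (1 - p2) + d11,  # cell 4: allele 2 / allele 2
    )
    if min(cells) < 0:
        raise ValueError("d11 is outside the range allowed by p1 and p2")
    return cells


def d_from_cells(c1, c2, c3, c4):
    """Recover D11 from cell probabilities (assumes they sum to 1)."""
    return c1 * c4 - c2 * c3


# Illustrative inputs taken from the censored example in the slides.
cells = ld_cells(0.525, 0.468, 0.0387)
d_check = d_from_cells(*cells)
```

    `d_from_cells` relies on the identity D11 = P1*P4 - P2*P3 for cell probabilities, which follows algebraically from the four expressions in the table.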

  • Population-based sampling and the EH program

    We wish to test for the absence of disequilibrium between allele A at locus 1 and allele B at locus 2 (D_AB = 0).

    The sample of individuals consists of genotype data, with no possibility of fully distinguishing all of the haplotypes in each individual.

  • Table of all possible two-locus genotypes

    In cell 5 there can be either of two phases, AB/ab or Ab/aB.

                 Locus 1
    Locus 2      AA     Aa     aa
    BB           k1     k2     k3
    Bb           k4     k5     k6
    bb           k7     k8     k9

  • Analysis of likelihood

    We maximize the log likelihood of the observed data, which is built from the probability of each genotype cell. For example:

    For cell 1: $p_1 = [P(AB)]^2$
    For cell 4: $p_4 = 2P(AB)P(Ab)$
    For cell 5: $p_5 = P(AB/ab) + P(Ab/aB) = 2P(AB)P(ab) + 2P(Ab)P(aB)$
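    The log-likelihood expression itself is not shown in this copy of the slide. A sketch under the usual multinomial assumption, with $k_i$ the observed count in genotype cell $i$ and $p_i$ the corresponding cell probability (the constant multinomial term is omitted, which is an assumption about the convention used):

```latex
\ln L = \sum_{i=1}^{9} k_i \ln p_i , \qquad \sum_{i=1}^{9} p_i = 1
```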

  • Table of probabilities in each cell

                 Locus 1
    Locus 2      AA                 Aa                               aa
    BB           $P(AB)^2$          $2P(AB)P(aB)$                    $P(aB)^2$
    Bb           $2P(AB)P(Ab)$      $2P(AB)P(ab) + 2P(Ab)P(aB)$      $2P(aB)P(ab)$
    bb           $P(Ab)^2$          $2P(Ab)P(ab)$                    $P(ab)^2$

  • Analysis of likelihood

    We maximize the likelihood above over the possible haplotype frequencies (parameterized by p(A), p(B) and D_AB).

    This likelihood is then compared with the maximum likelihood when D_AB is set equal to 0 (absence of linkage disequilibrium).
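    The maximization and comparison described above can be sketched with a small EM (expectation-maximization) loop. This is a minimal illustration, not the EH program's own code: the function names, the starting values, the convergence tolerance and the omission of the multinomial constant in the log likelihood are all assumptions. The counts are the k1..k9 values from the example table in this presentation.

```python
import math


def known_haplotypes(k):
    """Haplotype counts from the eight cells whose phase is unambiguous."""
    k1, k2, k3, k4, _k5, k6, k7, k8, k9 = k
    return [2 * k1 + k2 + k4,   # A B
            2 * k7 + k4 + k8,   # A b
            2 * k3 + k2 + k6,   # a B
            2 * k9 + k6 + k8]   # a b


def em_haplotype_freqs(k, tol=1e-12, max_iter=10000):
    """Estimate [P(AB), P(Ab), P(aB), P(ab)] allowing association (H1)."""
    known = known_haplotypes(k)
    k5 = k[4]
    n_hap = 2 * sum(k)
    p = [0.25] * 4  # assumed starting point
    for _ in range(max_iter):
        cis, trans = p[0] * p[3], p[1] * p[2]
        # E-step: expected share of double heterozygotes with phase AB/ab.
        x = cis / (cis + trans) if cis + trans > 0 else 0.5
        split = [k5 * x, k5 * (1 - x), k5 * (1 - x), k5 * x]
        # M-step: re-estimate frequencies from completed haplotype counts.
        new = [(c + s) / n_hap for c, s in zip(known, split)]
        if max(abs(a - b) for a, b in zip(new, p)) < tol:
            return new
        p = new
    return p


def independent_freqs(k):
    """Haplotype frequencies under H0 (D_AB = 0): products of allele frequencies."""
    k1, k2, k3, k4, k5, k6, k7, k8, k9 = k
    n_hap = 2 * sum(k)
    p_a = (2 * (k1 + k4 + k7) + k2 + k5 + k8) / n_hap  # frequency of allele A
    p_b = (2 * (k1 + k2 + k3) + k4 + k5 + k6) / n_hap  # frequency of allele B
    return [p_a * p_b, p_a * (1 - p_b), (1 - p_a) * p_b, (1 - p_a) * (1 - p_b)]


def log_likelihood(k, p):
    """Sum of k_i * ln(p_i) over the nine genotype cells (constant omitted)."""
    p_ab_, p_ab2, p_ab3, p_ab4 = p  # P(AB), P(Ab), P(aB), P(ab)
    cell = [p_ab_ ** 2, 2 * p_ab_ * p_ab3, p_ab3 ** 2,
            2 * p_ab_ * p_ab2, 2 * (p_ab_ * p_ab4 + p_ab2 * p_ab3), 2 * p_ab3 * p_ab4,
            p_ab2 ** 2, 2 * p_ab2 * p_ab4, p_ab4 ** 2]
    return sum(ki * math.log(pi) for ki, pi in zip(k, cell) if ki > 0)


k = (10, 10, 3, 15, 50, 13, 5, 13, 10)  # k1..k9 from the example table
p_h1 = em_haplotype_freqs(k)
p_h0 = independent_freqs(k)
lnl_h1 = log_likelihood(k, p_h1)
lnl_h0 = log_likelihood(k, p_h0)
chi2 = 2 * (lnl_h1 - lnl_h0)
```

    With these counts the sketch should land close to the haplotype frequencies, Ln(L) values and chi-square reported in the EH output slides of this presentation, up to rounding.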

  • Example*

    When censoring k5, all the haplotypes can be uniquely determined.

                 Locus 1
    Locus 2      AA         Aa         aa
    BB           k1 = 10    k2 = 10    k3 = 3
    Bb           k4 = 15    k5 = 50    k6 = 13
    bb           k7 = 5     k8 = 13    k9 = 10



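  • Example cont.

    P(A) = 0.28 + 0.24 = 0.525
    P(B) = 0.28 + 0.18 = 0.468
    D_AB = P(A B) - P(A)P(B) = 0.28 - 0.525 * 0.468 = 0.0387

    The two-decimal haplotype frequencies shown are rounded; the sums and D_AB are computed from the unrounded values.

    * Biased example due to the elimination of the 50 observations in k5.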
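    The censored estimate can be reproduced by direct haplotype counting once cell 5 is dropped. The sketch below is a hedged illustration: the function name and dictionary keys are assumptions, and only the k1..k9 counts come from the example table.

```python
def censored_haplotype_counts(k):
    """Count haplotypes from all cells except k5 (the double heterozygotes)."""
    k1, k2, k3, k4, _k5, k6, k7, k8, k9 = k
    return {
        "AB": 2 * k1 + k2 + k4,
        "Ab": 2 * k7 + k4 + k8,
        "aB": 2 * k3 + k2 + k6,
        "ab": 2 * k9 + k6 + k8,
    }


k = (10, 10, 3, 15, 50, 13, 5, 13, 10)  # k1..k9 from the example table
counts = censored_haplotype_counts(k)
n_hap = sum(counts.values())             # two haplotypes per retained individual
freq = {h: c / n_hap for h, c in counts.items()}

p_a = freq["AB"] + freq["Ab"]            # P(A)
p_b = freq["AB"] + freq["aB"]            # P(B)
d_ab = freq["AB"] - p_a * p_b            # D_AB from the censored data
```

    Working from unrounded frequencies explains why the slide's sums read 0.525 and 0.468 even though the displayed addends are rounded to two decimals.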

  • EH program input file format

    EH = estimated haplotype. Input file: EH.dat

    Line 1: Number of alleles at each of the two loci
    Line 2: k1 k4 k7
    Line 3: k2 k5 k8
    Line 4: k3 k6 k9
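    A small helper can lay out the nine counts in the line order listed above. This is only a sketch: `format_eh_input` is a hypothetical name, and the single-space separators and trailing newline are assumptions about what EH accepts rather than something stated on the slide.

```python
def format_eh_input(k, alleles=(2, 2)):
    """Build the text of an EH.dat file for two loci from counts k1..k9."""
    k1, k2, k3, k4, k5, k6, k7, k8, k9 = k
    lines = [
        f"{alleles[0]} {alleles[1]}",  # line 1: number of alleles at each locus
        f"{k1} {k4} {k7}",             # line 2
        f"{k2} {k5} {k8}",             # line 3
        f"{k3} {k6} {k9}",             # line 4
    ]
    return "\n".join(lines) + "\n"


# Counts from the example table in this presentation.
eh_text = format_eh_input((10, 10, 3, 15, 50, 13, 5, 13, 10))
```

    Writing `eh_text` to a file named EH.dat would then give the input described on the slide, under the stated formatting assumptions.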

  • EH program output file

    Output: Estimates of Gene Frequencies (including k5)

    # of typed Individuals: 129


  • EH program output file

                                              Haplotype frequency
    Allele at locus 1    Allele at locus 2    Independent    w/association
    1                    1                    0.248          0.328
    1                    2                    0.268          0.188
    2                    1                    0.232          0.153
    2                    2                    0.252          0.332

  • Chi square test

    The difference between the two chi-square values is 8.89.
    The P-value associated with this chi-square (with 1 df) is 0.002873.
    It is clear that k5 contributes significant information.

                                       df    Ln(L)      Chi-square
    H0: No association                 2     -252.68    0.00
    H1: Allelic association allowed    3     -248.23    8.89
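    The test statistic and P-value in the table can be recomputed from the two Ln(L) values. The sketch below assumes the statistic is twice the log-likelihood difference and uses the closed form for the upper tail of a chi-square distribution with 1 degree of freedom; the function name is hypothetical.

```python
import math


def likelihood_ratio_test(lnl_h0, lnl_h1):
    """Return (chi-square, P-value) for a 1-df likelihood ratio test.

    Assumes lnl_h1 >= lnl_h0, so the statistic is non-negative.
    """
    chi2 = 2.0 * (lnl_h1 - lnl_h0)
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Ln(L) values as printed in the table (rounded to two decimals).
chi2, p_value = likelihood_ratio_test(-252.68, -248.23)
```

    Because the inputs are the rounded Ln(L) values, the statistic comes out near 8.90 rather than exactly 8.89, and the P-value differs from the slide's figure only in the later decimals.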

  • Summary

  • Multiallelic genotype information in EH program

    Line 1: Number of alleles at each locus
    Subsequent lines:

                 Locus 2
    Locus 1      1/1    1/2    2/2    1/3    2/3    3/3
    1/1          a1     b1     c1     d1     e1     f1
    1/2          a2     b2     c2     d2     e2     f2
    2/2          a3     b3     c3     d3     e3     f3
    1/3          a4     b4     c4     d4     e4     f4
    2/3          a5     b5     c5     d5     e5     f5
    3/3          a6     b6     c6     d6     e6     f6

  • Multilocus genotype data

                            Locus 3
    Locus 1    Locus 2      1/1    1/2    2/2
    1/1        1/1          a1     b1     c1
               1/2          a2     b2     c2
               2/2          a3     b3     c3
    1/2        1/1          a4     b4     c4
               1/2          a5     b5     c5
               2/2          a6     b6     c6
    2/2        1/1          a7     b7     c7
               1/2          a8     b8     c8
               2/2          a9     b9     c9

  • Ex. 23

    Full data. Solution file: Censored data solution file. Censored data. 1/1 haplotype data.

                 Locus 2
    Locus 1      1/1   1/2   1/3   1/4   2/2   2/3   2/4   3/3   3/4   4/4
    1/1          10    5     6     4     1     2     3     1     2     0
    1/2          6     3     3     3     1     2     1     1     2     1
    2/2          12    9     8     11    3     2     5     1     0     3
    1/3          1     2     2     1     1     1     1     0     4     2
    2/3          0     2     2     8     2     2     9     3     6     8
    3/3          8     6     4     10    3     3     8     5     9     13

  • Haplotypes from censored genotype data

    Haplotype counts:

                            Allele at locus 2
    Allele at locus 1       1       2       3       4
    1                       42      14      13      12
    2                       58      25      16      31
    3                       37      26      29      63

    Haplotype frequencies:

                            Allele at locus 2
    Allele at locus 1       1       2       3       4
    1                       0.11    0.038   0.035   0.032
    2                       0.158   0.068   0.044   0.085
    3                       0.10    0.07    0.079   0.172
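    The counts above can be recovered from the Ex. 23 genotype table by skipping every individual who is heterozygous at both loci, since only those have ambiguous phase. The sketch below is an illustration under that assumption; the names `LOCUS1`, `LOCUS2`, `COUNTS` and `censored_haplotypes` are hypothetical, and the counts are the Ex. 23 values with rows for locus 1 and columns for locus 2.

```python
# Genotype labels in the order used by the Ex. 23 table.
LOCUS1 = [(1, 1), (1, 2), (2, 2), (1, 3), (2, 3), (3, 3)]   # rows
LOCUS2 = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 2),
          (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]           # columns
COUNTS = [
    [10, 5, 6, 4, 1, 2, 3, 1, 2, 0],
    [6, 3, 3, 3, 1, 2, 1, 1, 2, 1],
    [12, 9, 8, 11, 3, 2, 5, 1, 0, 3],
    [1, 2, 2, 1, 1, 1, 1, 0, 4, 2],
    [0, 2, 2, 8, 2, 2, 9, 3, 6, 8],
    [8, 6, 4, 10, 3, 3, 8, 5, 9, 13],
]


def censored_haplotypes(rows, cols, counts, n_alleles1, n_alleles2):
    """Count haplotypes, ignoring individuals heterozygous at both loci."""
    hap = [[0] * n_alleles2 for _ in range(n_alleles1)]
    for g1, row in zip(rows, counts):
        for g2, k in zip(cols, row):
            if g1[0] != g1[1] and g2[0] != g2[1]:
                continue  # double heterozygote: phase cannot be determined
            # With at least one homozygous locus, pairing alleles by position
            # gives the two haplotypes carried by each individual.
            for a1, a2 in zip(g1, g2):
                hap[a1 - 1][a2 - 1] += k
    return hap


hap = censored_haplotypes(LOCUS1, LOCUS2, COUNTS, 3, 4)
total = sum(sum(row) for row in hap)
freq = [[c / total for c in row] for row in hap]
```

    Dividing each count by the total number of counted haplotypes gives frequencies that agree with the second table to within rounding.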

  • !!!
