lecture 3 l dand_haplotypes_full

Upload: lekki-frazier-wood

Post on 04-Nov-2014

DESCRIPTION

Linkage disequilibrium and haplotypes

TRANSCRIPT

Page 1: Lecture 3 l dand_haplotypes_full

Introduction

Page 2: Lecture 3 l dand_haplotypes_full

All about my classes

[email protected]

• Lectures are stand-alone: no preparation needed except for previous course content.

• Nearly always provide additional resources:
  - Take-home exercise
  - Papers referenced
  - Resources such as other lecture slides

Page 3: Lecture 3 l dand_haplotypes_full

All about me….

Page 4: Lecture 3 l dand_haplotypes_full

All about you….

Page 5: Lecture 3 l dand_haplotypes_full

Try to always orient you to the session

• Go over the theory of linkage disequilibrium and haplotypes
• Calculate linkage disequilibrium by hand
• Relaxing session: story of HapMap
• Lab: today I will walk you through a hand-holding look at HapMap
• Each is ~30 minutes, so please spend extra time familiarizing yourself with HapMap.

Page 6: Lecture 3 l dand_haplotypes_full

Try to give you your learning objectives

• Primary objectives
  • Describe linkage disequilibrium and a haplotype
  • Explain the meaning of r² = 1.0, r² = .8 and r² = .5
  • Find a region of interest (ROI) on HapMap
  • Locate tagSNPs for an ROI on HapMap
• Secondary objectives
  • Describe how mutations and recombination give rise to linkage disequilibrium and haplotypes
  • Calculate D, D’ and r² by hand
  • List key differences between D, D’ and r²
  • Evaluate the contribution of HapMap to public health genetics

Page 7: Lecture 3 l dand_haplotypes_full

Part 1

Haplotype and Linkage Disequilibrium theory

Page 8: Lecture 3 l dand_haplotypes_full

One source of variation in our DNA occurs through mutation events….

[Figure: ancestral population and a mutation event; alleles A and C are shown in the resulting population.]

Page 9: Lecture 3 l dand_haplotypes_full

Mutations that proliferate are ‘SNPs’

• Single Nucleotide Polymorphisms
• The most common type of variation in DNA
• Substitution of 1 nucleotide for another
• 2/3 of SNPs involve C -> T
• Definition is evolving:
  • Old definition: SNPs must be seen in at least 1% of the population
• SNPs occur ~every 300 bp
• Therefore ~10 million SNPs in the ~3 billion bp human genome

Page 10: Lecture 3 l dand_haplotypes_full

The number of mutations increases over time

[Figure: chromosomes after the 1st mutation event (alleles A and C at one site) and the 2nd mutation event (alleles G and C at a second site).]

Page 11: Lecture 3 l dand_haplotypes_full

Proliferating SNPs give rise to haplotypes

• A haplotype is “A specific set of DNA variants observed on a single chromosome, or part of a chromosome”

• In practice, usually referring to a set of SNPs within a single gene

Page 12: Lecture 3 l dand_haplotypes_full

Haplotypes:

[Figure: chromosomes in the population carrying the four haplotypes listed below.]

Haplotype 1: AG

Haplotype 2: CG

Haplotype 3: CC

Haplotype 4: AC

Page 13: Lecture 3 l dand_haplotypes_full

Resolve the population haplotypes!

Two SNP sites (G/C and A/T) in the sequence C G A C T A G T:

GA, CA, GT, CT

Three SNP sites (G/C, A/T and G/T) in the sequence C G A C T A G T A C C A:

GAG, CAG, GTG, CTG, GAT, CAT, GTT, CTT

Page 14: Lecture 3 l dand_haplotypes_full

How many possible haplotypes?

Two SNP sites (G/C and A/T): GA, CA, GT, CT

2² = 4

Three SNP sites (G/C, A/T and G/T): GAG, CAG, GTG, CTG, GAT, CAT, GTT, CTT

2³ = 8

Page 15: Lecture 3 l dand_haplotypes_full

How many possible haplotypes?

2 (alleles) to the power of n loci: 2ⁿ
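
The 2ⁿ count can be checked by listing the combinations. This is a minimal sketch, assuming the alleles G/C, A/T and G/T at three loci (read off the haplotype lists on the previous slides); the function name `possible_haplotypes` is made up for illustration.

```python
from itertools import product

def possible_haplotypes(alleles_per_locus):
    """Return every haplotype that takes exactly one allele from each locus."""
    return ["".join(combo) for combo in product(*alleles_per_locus)]

# Assumed alleles: G/C at locus 1, A/T at locus 2, G/T at locus 3.
two_loci = possible_haplotypes(["GC", "AT"])
three_loci = possible_haplotypes(["GC", "AT", "GT"])

print(two_loci)         # the 2-locus haplotypes
print(len(three_loci))  # number of 3-locus haplotypes
```

Each extra biallelic locus doubles the list, which is all that 2ⁿ says.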

Page 16: Lecture 3 l dand_haplotypes_full

How many haplotypes does a person have for a given chromosomal region?

[Figure: the region C G A C T A G T, with SNP sites G/C and A/T, shown for a person's chromosomes.]

Page 17: Lecture 3 l dand_haplotypes_full

But what if the person is homozygous at both loci?

[Figure: the region C G A C T A G T with SNP sites G/C and A/T.]

Possible haplotypes: GA, CA, GT, CT

A person who is C/C at the first locus and T/T at the second: CT, CT, CT, CT

Page 18: Lecture 3 l dand_haplotypes_full

Haplotype overview

• Method of characterizing variation at more than one locus on a chromosome
• Only 1 allele from each locus
• But as many alleles as there are loci on the chromosome… IF… those loci contain variation (SNPs)
• Like SNPs, each person has 2 haplotypes… which (like SNPs) may be the same
• The number of possible haplotypes in the population is 2 to the power of n loci.

Page 19: Lecture 3 l dand_haplotypes_full

Variation in our DNA also occurs through recombination

Before recombination: AG, CG, CC

After recombination: AG, CG, CC, AC

Page 20: Lecture 3 l dand_haplotypes_full

The number of recombination events increases over time

Page 21: Lecture 3 l dand_haplotypes_full

Our chromosomes are mosaics….

• The extent and conservation of pieces depends on:
  • Recombination rate
  • Mutation rate
  • Population size
  • Natural selection

Page 22: Lecture 3 l dand_haplotypes_full

What do these mosaics mean….

…. For our haplotypes?

Page 23: Lecture 3 l dand_haplotypes_full

Key concept….

…. alleles often co-occur at greater than chance levels


Page 24: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium (LD)

• The nonrandom association of alleles at different loci

• Equilibrium – when things are ‘in balance’ or as we would expect

• When a particular allele at one locus is found together on the same chromosome with a specific allele at a second locus, more often than expected if the loci were segregating independently in a population. The loci are in disequilibrium – it is out of balance, or not what we would expect

Page 25: Lecture 3 l dand_haplotypes_full

Linkage disequilibrium is a measurable trait

Determined by space and time

Page 26: Lecture 3 l dand_haplotypes_full

Time decreases linkage disequilibrium


Page 27: Lecture 3 l dand_haplotypes_full

Space decreases linkage disequilibrium


Page 28: Lecture 3 l dand_haplotypes_full

Summary of part 1

• Mutations give rise to SNPs
• SNPs give rise to haplotypes
• A haplotype is a specific set of DNA variants
• Recombination patterns lead to linkage disequilibrium
• Linkage disequilibrium is when we see haplotypes more often than by chance

Questions before we proceed to calculating LD?

Page 29: Lecture 3 l dand_haplotypes_full

Part 2

Calculating Linkage Disequilibrium

Page 30: Lecture 3 l dand_haplotypes_full

All about Punnett squares….

2 loci; A: A/a, B: B/b

                    Locus B
                    B      b      Totals
Locus A   A         PAB    PAb    PA
          a         PaB    Pab    Pa
          Totals    PB     Pb     1.0

What are our haplotypes?

Page 31: Lecture 3 l dand_haplotypes_full

All about Punnett squares (in LD calculation)….

• Each cell contains the frequency of a haplotype
• Row & column ends contain the frequency of an allele
• The row totals should sum to 1.0, and so should the column totals
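
A small sketch of how the row and column ends are filled in from the four cells. The function name `punnett_totals` is hypothetical, and the frequencies are the ones used in the worked example later in this lecture.

```python
def punnett_totals(p_AB, p_Ab, p_aB, p_ab, tol=1e-9):
    """Return allele frequencies (row and column totals) of a 2x2 haplotype table."""
    if abs(p_AB + p_Ab + p_aB + p_ab - 1.0) > tol:
        raise ValueError("haplotype frequencies must sum to 1.0")
    return {
        "A": p_AB + p_Ab,  # row total for allele A
        "a": p_aB + p_ab,  # row total for allele a
        "B": p_AB + p_aB,  # column total for allele B
        "b": p_Ab + p_ab,  # column total for allele b
    }

totals = punnett_totals(0.2, 0.5, 0.3, 0.0)
print(totals)
```

If the four cells do not add up to 1.0 the table is not a valid set of haplotype frequencies, so the sketch refuses to continue.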

Page 32: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium

• (A little history lesson)
• Three measures of LD:
  • D
  • D’
  • r²

Page 33: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium - D

• 1960 Lewontin & Kojima
• D - unstandardized measure of how far the association between two alleles differs from that expected by chance

Page 34: Lecture 3 l dand_haplotypes_full

Linkage Equilibrium

PAB = PA * PB

Page 35: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium

PAB ≠ PA * PB

Page 36: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium

PAB ≠ PA * PB

D = PAB - (PA * PB)

Page 37: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium – an example

Given the following haplotype frequencies, are the alleles in linkage disequilibrium?

PAB = .2
PAb = .5
PaB = .3
Pab = .0

i.e. what is D?

D = PAB - (PA * PB)

Page 38: Lecture 3 l dand_haplotypes_full

Step 1: Complete the Punnett square

PAB = .2, PAb = .5, PaB = .3, Pab = .0

                    Locus B
                    B     b     Totals
Locus A   A         .2    .5    .7
          a         .3    .0    .3
          Totals    .5    .5    1.0

D = PAB - (PA * PB)

Page 39: Lecture 3 l dand_haplotypes_full

Step 2: Calculate allele frequencies

PAB = .2, PAb = .5, PaB = .3, Pab = .0

PA = .2 + .5 = .7
Pa = .3 + .0 = .3
PB = .2 + .3 = .5
Pb = .5 + .0 = .5

D = PAB - (PA * PB)

Page 40: Lecture 3 l dand_haplotypes_full

Step 3: Calculate D

PAB = .2, PAb = .5, PaB = .3, Pab = .0

PA = .7, Pa = .3, PB = .5, Pb = .5

D = PAB - (PA * PB)

D = .2 - (.7 * .5)
D = -.15

Are the alleles in linkage disequilibrium?
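
To check the Step 3 arithmetic, here is a hedged sketch that computes observed minus expected frequency for all four cells, not just AB. The function name `d_from_each_cell` is made up for illustration. D is not zero, so the alleles are in linkage disequilibrium.

```python
def d_from_each_cell(p_AB, p_Ab, p_aB, p_ab):
    """Observed minus expected frequency for each of the four haplotypes."""
    p_A, p_a = p_AB + p_Ab, p_aB + p_ab
    p_B, p_b = p_AB + p_aB, p_Ab + p_ab
    return {
        "AB": p_AB - p_A * p_B,
        "Ab": p_Ab - p_A * p_b,
        "aB": p_aB - p_a * p_B,
        "ab": p_ab - p_a * p_b,
    }

deviations = d_from_each_cell(0.2, 0.5, 0.3, 0.0)
D = deviations["AB"]  # the lecture's definition: D = PAB - (PA * PB)
for haplotype, deviation in deviations.items():
    print(haplotype, round(deviation, 4))
```

Every cell deviates by the same amount; only the sign differs. Which allele gets the capital letter therefore decides whether D comes out positive or negative.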

Page 41: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium - D

Problems:
• Sign is arbitrary
• Range depends on allele frequencies

Page 42: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium – D’

• 1964 Lewontin
• D’ - standardize D to the maximum possible value it can take

• D’ = D / Dmax/min

Page 43: Lecture 3 l dand_haplotypes_full

Step 4: Calculate Dmax/min

PAB = .2, PAb = .5, PaB = .3, Pab = .0

PA = .7, Pa = .3, PB = .5, Pb = .5

D = -.15

• Where D is positive: Dmax = the lesser of PA * Pb or Pa * PB
• Where D is negative: Dmin = the larger of -PA * PB or -Pa * Pb

What is our Dmax/min?

Dmin = max{-.7 * .5, -.3 * .5} = max{-.35, -.15} = -.15
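
Where the two rules come from, as a short derivation rather than anything taken from the slides: each haplotype frequency equals its equilibrium expectation plus or minus D, and no frequency can be negative. That bounds D on both sides.

```latex
\begin{aligned}
P_{AB} &= P_A P_B + D \ge 0 &&\Rightarrow\; D \ge -P_A P_B \\
P_{ab} &= P_a P_b + D \ge 0 &&\Rightarrow\; D \ge -P_a P_b \\
P_{Ab} &= P_A P_b - D \ge 0 &&\Rightarrow\; D \le P_A P_b \\
P_{aB} &= P_a P_B - D \ge 0 &&\Rightarrow\; D \le P_a P_B \\[4pt]
D_{\min} &= \max(-P_A P_B,\, -P_a P_b), &&\quad D_{\max} = \min(P_A P_b,\, P_a P_B)
\end{aligned}
```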

Page 44: Lecture 3 l dand_haplotypes_full

Step 5: Calculate D’

PAB = .2, PAb = .5, PaB = .3, Pab = .0

PA = .7, Pa = .3, PB = .5, Pb = .5

D = -.15

Dmin = -.15

D’ = D / Dmax/min

D’ = -.15 / -.15 = 1
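
Steps 4 and 5 can be combined into one function. This is a sketch under the lecture's definitions; the name `d_prime` and the tolerance argument are assumptions for illustration.

```python
def d_prime(p_AB, p_Ab, p_aB, p_ab, tol=1e-12):
    """D' = D divided by the most extreme value D could take in the same direction."""
    p_A, p_a = p_AB + p_Ab, p_aB + p_ab
    p_B, p_b = p_AB + p_aB, p_Ab + p_ab
    d = p_AB - p_A * p_B
    if abs(d) < tol:
        return 0.0  # no disequilibrium, nothing to standardize
    if d > 0:
        bound = min(p_A * p_b, p_a * p_B)    # Dmax
    else:
        bound = max(-p_A * p_B, -p_a * p_b)  # Dmin
    return d / bound

print(round(d_prime(0.2, 0.5, 0.3, 0.0), 4))
```

Because a negative D is divided by a negative Dmin, this version of D’ always lands between 0 and 1.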

Page 45: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium – D’

• D’ = +/- 1 = complete LD
  • No evidence for recombination
  • Ancestral haplotype not disrupted

Problems
• D’ is inflated in small N
• D’ is inflated with rare alleles
• No information on allele frequency

Page 46: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium – r2

• 1968 Hill & Robertson
• r² = squared correlation coefficient between the alleles at 2 loci

Page 47: Lecture 3 l dand_haplotypes_full

Step 6: Calculate r²

PAB = .2, PAb = .5, PaB = .3, Pab = .0

PA = .7, Pa = .3, PB = .5, Pb = .5

D = -.15

Dmin = -.15

r² = D² / (PA * Pa * PB * Pb)

r² = (-.15)² / [.7 * .3 * .5 * .5] = .0225 / .0525 = .43
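
The same calculation as a small function, to make clear that D is squared before dividing. The function name `r_squared` is an assumption for illustration.

```python
def r_squared(p_AB, p_Ab, p_aB, p_ab):
    """Squared correlation between the alleles at two biallelic loci."""
    p_A, p_a = p_AB + p_Ab, p_aB + p_ab
    p_B, p_b = p_AB + p_aB, p_Ab + p_ab
    denominator = p_A * p_a * p_B * p_b
    if denominator == 0:
        raise ValueError("r2 is undefined when a locus has only one allele")
    d = p_AB - p_A * p_B
    return d ** 2 / denominator

r2 = r_squared(0.2, 0.5, 0.3, 0.0)
print(round(r2, 2))
```

Squaring removes the arbitrary sign of D, so r² does not depend on which allele is labelled A or a.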

Page 48: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium – r2

• r² ranges from 0 to 1
• 1 = two markers give identical information

Problems

Page 49: Lecture 3 l dand_haplotypes_full

What can we learn from our 3 measures of LD?

• D = -.15
• D’ = 1.0
• r² = .43

Page 50: Lecture 3 l dand_haplotypes_full

D’ vs r²

• Both are measures of association, with 1 being the maximum and indicating the most LD

• BUT r² can only reach 1 when the allele frequencies at the two loci are equal.

Page 51: Lecture 3 l dand_haplotypes_full

Perfect LD

• Equal allele frequency
• Allelic association is as strong as possible
  - 2 haplotypes observed
  - No detected recombination between SNPs

D’ = 1, r² = 1

Page 52: Lecture 3 l dand_haplotypes_full

Complete LD

• Unequal allele frequency
  - 3 haplotypes observed
  - No detected recombination between SNPs

D’ = 1, r² < 1
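
The difference between perfect and complete LD can be seen by asking how large r² could possibly get for a given pair of allele frequencies. This is a sketch, not a standard library routine; `max_r_squared` is a made-up name and it assumes both frequencies lie strictly between 0 and 1.

```python
def max_r_squared(p_A, p_B):
    """Largest r2 attainable for given allele frequencies (reached when |D'| = 1)."""
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    d_max = min(p_A * p_b, p_a * p_B)    # most positive D allowed
    d_min = max(-p_A * p_B, -p_a * p_b)  # most negative D allowed
    denominator = p_A * p_a * p_B * p_b
    return max(d_max ** 2, d_min ** 2) / denominator

print(round(max_r_squared(0.5, 0.5), 2))  # equal allele frequencies
print(round(max_r_squared(0.7, 0.5), 2))  # frequencies from the worked example
```

With PA = .7 and PB = .5 the ceiling is the .43 found in the worked example, so that example is already as correlated as its allele frequencies allow even though D’ = 1.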

Page 53: Lecture 3 l dand_haplotypes_full

Calculate your own Linkage Disequilibrium measures of D, D’ and r²

PAB = .6
PAb = .1
PaB = .2
Pab = .1

Page 54: Lecture 3 l dand_haplotypes_full

At the end of the day…..

Linkage disequilibrium is the non random association of markers [SNPs] at two or more loci

….. But what does this mean for applying genetics to public health? (finally we get there….)

Page 55: Lecture 3 l dand_haplotypes_full

Part 3

Using LD in genetic studies: The HapMap consortium

Page 56: Lecture 3 l dand_haplotypes_full

The Human Genome Project

Page 57: Lecture 3 l dand_haplotypes_full

dbSNP

Page 58: Lecture 3 l dand_haplotypes_full

Cystic Fibrosis

Page 59: Lecture 3 l dand_haplotypes_full

Inflammatory bowel disease

• Likely had many causal variants
• Heritable (MZ > DZ)
• 10% of those with IBD had 1 relative with IBD
• Reasonable linkage signal on Chr 5
• What could explain this structure?

Page 60: Lecture 3 l dand_haplotypes_full

Inflammatory bowel disease

Page 61: Lecture 3 l dand_haplotypes_full

5q31

Page 62: Lecture 3 l dand_haplotypes_full

5q31

8 SNPs
GGACAACC
AATTCGGG

Page 63: Lecture 3 l dand_haplotypes_full

Haplotype Map

• Add to Human Genome Project with information on diversity

• How did HapMap and Human genome project differ?

• ‘Chunks’ of data

8 SNPs
GGACAACC
AATTCGGG

Page 64: Lecture 3 l dand_haplotypes_full

“Short cuts”

[Figure: the haplotypes observed across a set of SNPs.]

SNPs 1, 3 and 4 are TagSNPs
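
One way to picture the short cut is a greedy pairwise tagging pass over phased haplotypes. The sketch below is a simplified illustration, not the algorithm used by Tagger or Haploview. The eight haplotypes are made up (the slide's own figure is not recoverable from the transcript), and the function names and the 0.8 threshold are assumptions.

```python
def pairwise_r2(haplotypes, i, j):
    """r2 between SNP columns i and j of equally long, biallelic haplotype strings."""
    n = len(haplotypes)
    # Code each allele as True when it matches the first haplotype's allele.
    x = [h[i] == haplotypes[0][i] for h in haplotypes]
    y = [h[j] == haplotypes[0][j] for h in haplotypes]
    p_x, p_y = sum(x) / n, sum(y) / n
    p_xy = sum(a and b for a, b in zip(x, y)) / n
    denominator = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denominator == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (p_xy - p_x * p_y) ** 2 / denominator

def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tags until every SNP is a tag or has r2 >= threshold with one."""
    untagged = set(range(len(haplotypes[0])))
    tags = []
    while untagged:
        def captured(snp):
            return {t for t in untagged
                    if t == snp or pairwise_r2(haplotypes, snp, t) >= threshold}
        # Take the SNP that captures the most still-untagged SNPs.
        best = max(sorted(untagged), key=lambda snp: len(captured(snp)))
        tags.append(best)
        untagged -= captured(best)
    return sorted(tags)

# Hypothetical haplotypes over 5 SNPs: SNP 2 mirrors SNP 1, SNP 5 mirrors SNP 3.
HAPLOTYPES = ["ACGAC", "ACGTC", "ACTAA", "ACTTA",
              "GTGAC", "GTGTC", "GTTAA", "GTTTA"]
tags = greedy_tag_snps(HAPLOTYPES)
print([t + 1 for t in tags])  # 1-based SNP numbers
```

With these made-up haplotypes the pass keeps SNPs 1, 3 and 4: genotyping those three is enough to infer SNPs 2 and 5.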

Page 65: Lecture 3 l dand_haplotypes_full

HapMap

• Launched in 2001
• Open access resource for all researchers
• In real time
• Spin-off from the Human Genome Project
• Qu: What was the key difference between the HGP and HapMap?
• Characterizes LD across the genome
• Also develops analytic tools
  • Haploview

Page 66: Lecture 3 l dand_haplotypes_full

HapMap

“The success of the HapMap will be measured in terms of the genetic discoveries enabled, and improved knowledge of disease aetiology.”

Page 67: Lecture 3 l dand_haplotypes_full

HapMap

Mark Daly: “The community’s response after a number of years of struggling and to not finding genetic factors for complex disease”.

Page 68: Lecture 3 l dand_haplotypes_full

HapMap – Phase 1

• Launched in 2001; Production 2002-3
• Phase I
• Not comprehensive
• 90 Yoruba individuals
• 90 individuals of European descent
• 45 Han Chinese
• 45 Japanese
• 1,000,000 SNPs

Page 69: Lecture 3 l dand_haplotypes_full

HapMap – Phase 1

Minor allele frequency

Page 70: Lecture 3 l dand_haplotypes_full

HapMap – Phase I

• Released in 2005
• 1 million SNPs
• August 2006: “dbSNP included more than ten million SNPs, and more than 40% of them were known to be polymorphic. By comparison, at the start of the project, fewer than 3 million SNPs were identified, and no more than 10% of them were known to be polymorphic.”

Page 71: Lecture 3 l dand_haplotypes_full

HapMap – an LD plot

Page 72: Lecture 3 l dand_haplotypes_full

HapMap – Phase I

Recombination hotspots are widespread and account for LD structure

Page 73: Lecture 3 l dand_haplotypes_full

HapMap – Phase I

Page 74: Lecture 3 l dand_haplotypes_full

Tagger

Table 7. Number of selected tag SNPs to capture all observed common SNPs in the Phase I HapMap for the three analysis panels using pairwise tagging at different r² thresholds

Pairwise threshold    YRI        CEU        CHB+JPT
r² ≥ 0.5              324,865    178,501    159,029
r² ≥ 0.8              474,409    293,835    259,779
r² = 1                604,886    447,579    434,476

Page 75: Lecture 3 l dand_haplotypes_full

Will tag SNPs picked from HapMap apply to other population samples?

Population differences add very little inefficiency (stolen slide from ASHG... I can’t source this)

[Figure: tag SNPs from CEU (Utah residents with European ancestry, CEPH) compared across CEU, Whites from Los Angeles, CA, and Botnia, Finland.]

Page 76: Lecture 3 l dand_haplotypes_full

HapMap – Phases II and III

• Phase II
  • >3.1 million genetic variants
  • Captured 90 to 96 percent of common genetic variation
• Phase III
  • 1,301 samples from 11 populations

Page 77: Lecture 3 l dand_haplotypes_full

HapMap and Public Health

• How has HapMap helped us in the quest to find genes for disorders?

Page 78: Lecture 3 l dand_haplotypes_full

What is next for HapMap?

• 1,000 Genomes Project

Page 79: Lecture 3 l dand_haplotypes_full

Part 4

HapMap Practical

Page 80: Lecture 3 l dand_haplotypes_full

Goals of this lab

Part 1
1. Find HapMap SNPs near a gene.
2. View patterns of LD amongst the SNPs.
3. Select tag SNPs.
4. Download information on the SNPs for use in Haploview.
5. Evaluate genotype data in a paper against HapMap data.

Part 2
6. Make a file from data for use in Haploview

Page 81: Lecture 3 l dand_haplotypes_full

Data origin

Page 83: Lecture 3 l dand_haplotypes_full

Goals of this lab

Part 1
1. Find HapMap SNPs near a gene.
> Navigate to HapMap
> Using release #27 (Phase 3) locate the LRP1 gene (hint: it is a landmark).
> Answer questions 1-3

Page 84: Lecture 3 l dand_haplotypes_full

1. Go to hapmap.ncbi.nlm.nih.gov

2. Select release 2, Phase #3

Page 85: Lecture 3 l dand_haplotypes_full

3. Put LRP1 in the search box

Page 86: Lecture 3 l dand_haplotypes_full

5. Look at the information

Page 87: Lecture 3 l dand_haplotypes_full

6. Turn different tracks on and off

(Don’t forget ‘update image’)

Page 88: Lecture 3 l dand_haplotypes_full

7. Count the genotyped SNPs

Page 89: Lecture 3 l dand_haplotypes_full

8. Create an LD plot

Page 90: Lecture 3 l dand_haplotypes_full
Page 91: Lecture 3 l dand_haplotypes_full

9. Choose tag SNPs

Page 93: Lecture 3 l dand_haplotypes_full

10. Download LRP1 data & open in Haploview

Page 94: Lecture 3 l dand_haplotypes_full

11. Open in Haploview, Answer questions 4-7

Page 95: Lecture 3 l dand_haplotypes_full

Slide graveyard

Page 98: Lecture 3 l dand_haplotypes_full

4. Look at the different PPARy

Page 100: Lecture 3 l dand_haplotypes_full

[Figure: reference sequence with three SNP sites: T/C, A/G and G/T.]

8 Possible SNP combinations:

Haplotype 1: C T G A C T A A G T A C C G A
Haplotype 2: C T G A C T A A G T A C C T A
Haplotype 3: C T G A C T A G G T A C C G A
Haplotype 4: C T G A C T A G G T A C C T A
Haplotype 5: C C G A C T A A G T A C C G A
Haplotype 6: C C G A C T A A G T A C C T A
Haplotype 7: C C G A C T A G G T A C C G A
Haplotype 8: C C G A C T A G G T A C C T A

Page 103: Lecture 3 l dand_haplotypes_full

8 Possible Haplotypes, but 3 observed haplotypes:

C T G A C T A A G T A C C G A
C T G A C T A A G T A C C T A
C T G A C T A G G T A C C G A
C T G A C T A G G T A C C T A
C C G A C T A A G T A C C G A
C C G A C T A A G T A C C T A
C C G A C T A G G T A C C G A
C C G A C T A G G T A C C T A

TAG, TAT, TGG, TGT, CAG, CAT, CGG, CGT

Page 104: Lecture 3 l dand_haplotypes_full

1. Information about our population

• Factors that influence linkage disequilibrium:
  • Genetic drift
  • Mutation
  • Founder effects
  • Selection
  • Stratification

• Factors that maintain linkage disequilibrium:
  • Selection
  • Non-random mating
  • Linkage

• Mainstay of ‘population genetics’

Page 105: Lecture 3 l dand_haplotypes_full

2. Interpretation of our findings

• Genetic association is correlational; therefore, we cannot make causal inferences
  • SNP1 -> Trait
  • SNP1 and SNP2 are in LD
  • We don’t know which is the true causal variant

Page 106: Lecture 3 l dand_haplotypes_full
Page 107: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium coefficient D

PAB ≠ PA * PB

DAB = PAB - PA * PB

PAB = PA * PB + DAB

Problems:
• Sign is arbitrary
• Range depends on allele frequencies

Q: Why are these problems for applied genetics in public health?

Page 108: Lecture 3 l dand_haplotypes_full

Calculating Linkage Equilibrium

                    Locus B
                    B      b      Totals
Locus A   A         PAB    PAb    PA
          a         PaB    Pab    Pa
          Totals    PB     Pb     1.0

Page 113: Lecture 3 l dand_haplotypes_full

Linkage Equilibrium

PAB = PA * PB

PAb = PA * Pb = PA (1 - PB)
PaB = Pa * PB = (1 - PA) PB

Pab = Pa * Pb = (1 - PA) (1 - PB)
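
These products can be turned into a small function. As a hedged sketch, it also accepts a D value, using the standard identity that each haplotype frequency is its equilibrium expectation plus or minus D; the name `haplotype_frequencies` is made up for illustration.

```python
def haplotype_frequencies(p_A, p_B, d=0.0):
    """Haplotype frequencies implied by allele frequencies and D (d=0 is equilibrium)."""
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    return {
        "AB": p_A * p_B + d,
        "Ab": p_A * p_b - d,
        "aB": p_a * p_B - d,
        "ab": p_a * p_b + d,
    }

equilibrium = haplotype_frequencies(0.7, 0.5)
with_ld = haplotype_frequencies(0.7, 0.5, d=-0.15)
print(equilibrium)  # expected under linkage equilibrium
print(with_ld)      # the same allele frequencies with D = -.15
```

With PA = .7, PB = .5 and D = -.15 this gives back the haplotype frequencies of the worked example (.2, .5, .3, .0).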

Page 114: Lecture 3 l dand_haplotypes_full

Linkage Disequilibrium coefficient D

PAB ≠ PA * PB

DAB = PAB - PA * PB

Problems:
• Sign is arbitrary
• Range depends on allele frequencies

Q: Why are these problems for applied genetics in public health?

Page 115: Lecture 3 l dand_haplotypes_full

S.M. Bray, J.G. Mulle, A.F. Dodd, A.E. Pulver, S. Wooding and S.T. Warren. Signatures of founder effects, admixture and selection in the Ashkenazi Jewish population. PNAS Early Edition (2010).

Page 116: Lecture 3 l dand_haplotypes_full

8 Possible haplotypes:

Haplotype 1: C T G A C T A A G T A C C G A
Haplotype 2: C T G A C T A A G T A C C T A
Haplotype 3: C T G A C T A G G T A C C G A
Haplotype 4: C T G A C T A G G T A C C T A
Haplotype 5: C C G A C T A A G T A C C G A
Haplotype 6: C C G A C T A A G T A C C T A
Haplotype 7: C C G A C T A G G T A C C G A
Haplotype 8: C C G A C T A G G T A C C T A

Page 119: Lecture 3 l dand_haplotypes_full

Then we get recombination

Before recombination: AG, CG, CC

After recombination: AG, CG, CC, AC

Page 121: Lecture 3 l dand_haplotypes_full

Ancestor

Present Day

Page 122: Lecture 3 l dand_haplotypes_full

Recombination on an individual level

Page 123: Lecture 3 l dand_haplotypes_full

Measures of Linkage Disequilibrium - D

• At a single locus with alleles A and a: PA = (1 - Pa)

Page 125: Lecture 3 l dand_haplotypes_full

Refresher

• Recombination

Page 126: Lecture 3 l dand_haplotypes_full

Sources of variation in our DNA

Page 127: Lecture 3 l dand_haplotypes_full

New Concept – Linkage Disequilibrium

• Linkage Disequilibrium is the tendency for 2 (or more) SNPs to be inherited together

• AATAAGCCTGATC
• ATTAAGCCTGATC
• AATTAGCCTGATC
• ATTAAGGCTGATC
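
To see which positions in these four sequences actually vary, and which haplotype each sequence carries at those positions, a short sketch follows. The function name `variable_sites` is made up for illustration.

```python
SEQUENCES = [
    "AATAAGCCTGATC",
    "ATTAAGCCTGATC",
    "AATTAGCCTGATC",
    "ATTAAGGCTGATC",
]

def variable_sites(sequences):
    """Return 0-based positions where the aligned sequences are not all identical."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]

sites = variable_sites(SEQUENCES)
haplotypes = ["".join(seq[i] for i in sites) for seq in SEQUENCES]

print([i + 1 for i in sites])  # 1-based positions of the SNPs
print(haplotypes)              # alleles carried at those positions
```

Only three of the thirteen positions differ between the sequences, so each sequence can be summarised by a three-letter haplotype.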

Page 128: Lecture 3 l dand_haplotypes_full

Why is this important?

• Allows us to genotype only certain SNPs of the genome…

• ….. We can infer more than we type

Page 129: Lecture 3 l dand_haplotypes_full

Haplotype

• Inheritance of a cluster of SNPs
• “Haploid” “Genotype”