applications of linkage analysis in the modern era intro to... · 2017-12-07 · linkage...

Applications of linkage analysis in the modern era

[email protected]

Outline

• What is linkage analysis?
  • Parametric
  • Non-parametric

• Why is linkage analysis complicated for complex traits such as cognition or psychiatric illness?

• How can it be used in the modern era?
  • Used to filter the large amount of data generated through next-generation sequencing
  • Used to understand the effects of combinations of variants on phenotype

Linkage Analysis

• One of the two main approaches in gene mapping.

• Uses pedigree data.

Differences between linkage and association

Linkage | Association
Linkage is a property of loci | Association is a property of alleles
Role: to identify a biological mechanism for transmission of a trait; to locate the gene involved | Role: to identify association between an allelic variant and a disease; to identify linkage disequilibrium between a disease allele and a marker
Coarse mapping (>1 cM) | Fine mapping (<1 cM)
No information about which allelic variant is associated with higher risk of disease |
Requires family pedigrees | Case-control or family-based approach
Uses very polymorphic markers or bi-allelic markers | Usually bi-allelic markers

Calculation of LOD Scores

The LOD score is the log10 of the ratio between two likelihoods (the odds of linkage).

You calculate the probability of the observed inheritance pattern if the loci are unlinked (recombination fraction θ = 0.5) and the probability of the same pattern if the loci are close together, i.e. linked (θ < 0.5).

Calculation of LOD Scores

5 non-recombinant individuals + 1 recombinant individual

Recombination fraction = (no. of recombinants) / (no. of meioses) = 1/6 = 0.167

Likelihood given linkage (i.e. the recombination fraction is < 0.5, here 0.2)

= (1 − θ)^5 × θ^1

= (0.8)^5 × (0.2)^1

= 0.32768 × 0.2 = 0.065536

Likelihood given no linkage (i.e. the recombination fraction is 0.5)

= θ^6

= (0.5)^6

= 0.015625

Ratio between the two likelihoods = 0.065536 / 0.015625 = 4.194304

The log10 of this ratio is the Z score, or LOD score = 0.62266
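The worked example on this slide can be reproduced in a few lines. The sketch below is only an illustration: it assumes phase-known, fully informative meioses, so that the likelihood is (1 − θ) raised to the number of non-recombinants times θ raised to the number of recombinants. The name `lod_score` is a hypothetical helper, not a function from any linkage package.

```python
import math

def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score for phase-known meioses at recombination fraction theta (0 < theta <= 0.5)."""
    non_recombinants = meioses - recombinants
    # Likelihood of the data if the loci are linked with recombination fraction theta.
    linked = (1 - theta) ** non_recombinants * theta ** recombinants
    # Likelihood under no linkage: each meiosis is recombinant with probability 0.5.
    unlinked = 0.5 ** meioses
    return math.log10(linked / unlinked)

# The slide's example: 1 recombinant in 6 meioses, evaluated at theta = 0.2.
slide_lod = lod_score(recombinants=1, meioses=6, theta=0.2)
# The same data evaluated at the estimated recombination fraction 1/6.
max_lod = lod_score(recombinants=1, meioses=6, theta=1 / 6)
print(round(slide_lod, 5), round(max_lod, 3))
```

Evaluating the LOD at the estimated recombination fraction (1/6) gives a slightly higher value than at θ = 0.2, which is why linkage programs usually report the maximum LOD over a range of θ values.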

Building blocks of linkage analysis

• Information about disease model (in parametric analysis), given as penetrances:
  f(aa) = 0.99, probability of a homozygote being affected
  f(Aa) = 0.8, probability of a heterozygote being affected
  f(AA) = 0.001, probability of a non-carrier being affected (phenocopy rate)

• Information about allele frequencies

• Information about environmental variables

Calculation of LOD Scores – liability classes

DISEASE          ALLELE_FREQ  PENETRANCES        LABEL
PROSTATE_CANCER  0.001        *                  HYPOTHETICAL_ADDITIVE_MODEL
SEX = FEMALE                  0.000,0.000,0.000
AGE < 50                      0.001,0.050,0.100
AGE < 70                      0.002,0.200,0.400
OTHERWISE                     0.004,0.500,0.800

The model describes a hypothetical susceptibility allele for prostate cancer.

- The first liability class is all females, and specifies that they never develop prostate cancer.

- The next row specifies that males under the age of 50 have about a 5% chance of developing cancer if they are heterozygotes for this allele and a 10% chance if they are homozygotes. These probabilities increase to 20% and 40% for males aged between 50 and 70.

- The final row specifies the penetrances for all other individuals (i.e. males aged 70 or over).

An appropriate and careful choice of disease model is essential for parametric linkage analyses.
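To show how liability classes feed into a likelihood, the sketch below looks up the penetrances from the model above and combines them with genotype frequencies to get the probability of each genotype in an affected person. It is an illustration under stated assumptions: Hardy-Weinberg genotype frequencies, the slide's allele frequency of 0.001, and hypothetical helper names (`penetrances`, `genotype_posterior`) that do not come from any linkage program.

```python
from typing import Dict, Tuple

def penetrances(sex: str, age: float) -> Tuple[float, float, float]:
    """Return (non-carrier, heterozygote, homozygote) penetrances; the first matching class wins."""
    if sex == "F":
        return (0.000, 0.000, 0.000)
    if age < 50:
        return (0.001, 0.050, 0.100)
    if age < 70:
        return (0.002, 0.200, 0.400)
    return (0.004, 0.500, 0.800)

def genotype_posterior(sex: str, age: float, q: float = 0.001) -> Dict[str, float]:
    """P(genotype | affected), assuming Hardy-Weinberg proportions for allele frequency q."""
    priors = {
        "non-carrier": (1 - q) ** 2,
        "heterozygote": 2 * q * (1 - q),
        "homozygote": q ** 2,
    }
    f = dict(zip(priors, penetrances(sex, age)))
    joint = {g: priors[g] * f[g] for g in priors}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("being affected has probability zero in this liability class")
    return {g: p / total for g, p in joint.items()}

# An affected 60-year-old male falls in the "AGE < 70" class.
posterior = genotype_posterior("M", 60)
print({g: round(p, 3) for g, p in posterior.items()})
```

Even with a heterozygote penetrance 100 times the phenocopy rate, most affected men in this class are expected to be non-carriers because the allele is rare, which is one reason the choice of model matters so much.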

Seven liability classes were defined on the basis of age (in years). For each age group j, the age-dependent population prevalence P_j was obtained from the Rotterdam Study.1

The disease-gene penetrance, f_j, of the jth age group can be estimated as

where PAF is the population-attributable fraction (that is, the proportion of the population prevalence that can be explained by the studied gene; 10% assumed) and q is the disease-allele frequency (1% assumed).

Liability class | Age (years) | Population prevalence | Penetrance | No. of patients | No. of unaffected relatives
1 | <65 | <.02 | .00 | 0 | 129
2 | 65–69 | .02 | .09 | 4 | 6
3 | 70–74 | .05 | .23 | 22 | 11
4 | 75–79 | .09 | .46 | 32 | 14
5 | 80–84 | .23 | .99 | 30 | 8
6 | 85–89 | .35 | .99 | 24 | 1
7 | ≥90 | >.35 | .99 | 0 | 1

Calculation of LOD Scores – age-dependent penetrance

A genome-wide screen for late-onset Alzheimer disease in a genetically isolated Dutch population. Am J Hum Genet. 2007 July; 81(1): 17–31.
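The equation itself is not reproduced in this transcript. The sketch below is therefore an assumption, not the published formula: it supposes a dominant model in which the gene-attributable prevalence, P_j × PAF, arises entirely in carriers, who make up 1 − (1 − q)^2 of the population, and it caps the result at 0.99. The function name and the cap parameter are hypothetical.

```python
def penetrance_from_prevalence(prevalence: float, paf: float = 0.10,
                               q: float = 0.01, cap: float = 0.99) -> float:
    """Assumed form: f_j = P_j * PAF / (1 - (1 - q)^2), capped at `cap`."""
    carrier_freq = 1 - (1 - q) ** 2  # carriers of at least one disease allele
    return min(prevalence * paf / carrier_freq, cap)

# Prevalences as printed (rounded) in the table for liability classes 2 to 6.
for p in (0.02, 0.05, 0.09, 0.23, 0.35):
    print(p, round(penetrance_from_prevalence(p), 2))
```

Under these assumptions the values come out close to, but not identical with, the tabulated penetrances, possibly because the prevalences in the table are rounded.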

Multipoint and Heterogeneity LOD Scores: better resolution, more robust, exclusion mapping

Figure: Multipoint LOD (blue) and HLOD (pink) scores for chromosomes 1 and 3 in the genome screen of late-onset AD after fine typing.

A genome-wide screen for late-onset Alzheimer disease in a genetically isolated Dutch population. Am J Hum Genet. 2007 July; 81(1): 17–31.
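A heterogeneity LOD (HLOD) is commonly defined through an admixture model in which only a proportion α of families is linked: at a given position, HLOD(α) is the sum over families of log10(α × 10^LOD_i + 1 − α), maximised over α. The sketch below illustrates that definition with a simple grid search. The per-family LOD scores are invented for the example, and `hlod` is a hypothetical helper rather than the routine used in the cited study.

```python
import math
from typing import Sequence, Tuple

def hlod(family_lods: Sequence[float], steps: int = 100) -> Tuple[float, float]:
    """Return (HLOD, alpha), maximised over a grid of alpha values in [0, 1]."""
    best = (0.0, 0.0)  # alpha = 0 (no linked families) always gives HLOD = 0
    for i in range(1, steps + 1):
        alpha = i / steps
        value = sum(math.log10(alpha * 10 ** z + 1 - alpha) for z in family_lods)
        if value > best[0]:
            best = (value, alpha)
    return best

# Invented per-family LOD scores at one map position: two families support linkage, two do not.
family_lods = [1.5, 1.2, -0.8, -1.0]
best_hlod, best_alpha = hlod(family_lods)
print(round(sum(family_lods), 2), round(best_hlod, 2), best_alpha)
```

Because α = 1 reproduces the ordinary summed LOD and α = 0 gives zero, the HLOD can never be lower than either, which is what makes it more robust when only some families carry a linked gene.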

The Significance of LOD Scores

• Significant linkage corresponds to a LOD score of +3 or more,

i.e. log10(1000),

so linkage is 1000 times more likely than non-linkage.

Because the prior probability that two given loci are linked is low, LOD +3 corresponds to roughly p = 0.05.

In genome scans this threshold is raised to +3.3 because multiple markers are tested.

• LOD < -2 is significant evidence for non-linkage.

• A LOD between -2 and +3 is inconclusive and more data are needed, perhaps by adding additional families.

Strachan & Read, 1999
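The thresholds on this slide can be written as a small decision rule. The sketch below also converts a LOD score to a pointwise p-value, assuming the usual large-sample approximation that 2 × ln(10) × LOD behaves like a one-sided chi-square statistic with one degree of freedom. Both function names are hypothetical.

```python
import math

def interpret_lod(lod: float, genome_scan: bool = False) -> str:
    """Apply the slide's thresholds: +3 (or +3.3 in a genome scan) and -2."""
    threshold = 3.3 if genome_scan else 3.0
    if lod >= threshold:
        return "significant linkage"
    if lod < -2.0:
        return "linkage excluded"
    return "inconclusive"

def pointwise_p(lod: float) -> float:
    """One-sided pointwise p-value for a positive LOD (chi-square, 1 df approximation)."""
    if lod <= 0:
        raise ValueError("the approximation is only used here for positive LOD scores")
    chi2 = 2 * math.log(10) * lod
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)); halve it for the one-sided test.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2))

print(interpret_lod(3.1), interpret_lod(3.1, genome_scan=True), interpret_lod(-2.4))
print(f"{pointwise_p(3.0):.1e}")
```

Under this approximation the pointwise p-value for LOD 3 is about 0.0001. The p = 0.05 quoted on the slide is the overall significance once the low prior probability of linkage is taken into account.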

Linkage Analysis

• Unfortunately, the standard (parametric) LOD score method doesn’t work well for complex traits, because it requires a definite model of how the trait is inherited: the first step in LOD score mapping is to determine the expected frequency of offspring phenotypes as a function of the recombination fraction.

• Non-parametric methods: look for chromosome segments shared by affected individuals. They do not rely on a genetic model.
  • affected sib pair analysis
  • linkage disequilibrium

Affected Sib Pair Analysis

• If two siblings both are affected by a genetic disease, they will (in most cases) share a region of chromosome surrounding the disease gene. This segment is “identical by descent” (IBD): it was derived from a common ancestor, their parent.

• Use many markers to find IBD regions among many affected sib pairs.

• Usually results in a large region: too big for positional cloning.

• Also: if more than one gene causes the trait, the necessary large amount of data will never converge to a single chromosome region.

Figure: example marker genotypes for parents and affected sib pairs (AB, CD; AC, AD; AB, AC; AC, AB).
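Affected sib pair methods compare observed IBD sharing with what is expected by chance. The sketch below enumerates every possible pair of children from an AB × CD mating to show the chance expectation: a sib pair shares 0, 1 or 2 alleles IBD with probability 1/4, 1/2 and 1/4. It assumes all four parental alleles are distinguishable, so that sharing an allele implies sharing it by descent. `ibd_distribution` is a hypothetical helper.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from typing import Dict, Tuple

def ibd_distribution(father: Tuple[str, str] = ("A", "B"),
                     mother: Tuple[str, str] = ("C", "D")) -> Dict[int, Fraction]:
    """Chance distribution of the number of alleles two sibs share IBD."""
    # Each child receives one paternal and one maternal allele, all four combinations equally likely.
    children = list(product(father, mother))
    counts: Counter = Counter()
    for sib1, sib2 in product(children, repeat=2):
        shared = (sib1[0] == sib2[0]) + (sib1[1] == sib2[1])
        counts[shared] += 1
    total = sum(counts.values())
    return {k: Fraction(v, total) for k, v in sorted(counts.items())}

dist = ibd_distribution()
print(dist)
```

Near a disease gene, affected sib pairs are expected to share more than this, so an excess of pairs sharing 1 or 2 alleles IBD at a marker is the signal the method looks for.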

Linkage Disequilibrium (LD)

• Gene mapping using recombination methods (such as affected sib pair analysis) suffers from not having enough crossovers in one generation to localize a gene very well.

• Linkage disequilibrium uses crossovers that have occurred over several generations.
  • Regions of chromosome distant from the disease mutation will become randomized.
  • However, right near the mutation random crossovers will not have separated the disease locus from its surrounding haplotype: a particular DNA haplotype will be in disequilibrium with the disease trait.

• The trick is to find that haplotype.
  • The further back in time the mutation occurred, the smaller the region of disequilibrium.
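The last point can be made quantitative with the standard decay relationship D_t = D_0 × (1 − θ)^t, where θ is the recombination fraction between marker and mutation and t is the number of generations. The sketch below is an illustration that assumes random mating in a large population with no selection. The function names are hypothetical.

```python
def ld_remaining(theta: float, generations: int) -> float:
    """Fraction of the initial disequilibrium left after the given number of generations."""
    return (1 - theta) ** generations

def theta_at_half_decay(generations: int) -> float:
    """Recombination fraction at which half of the initial disequilibrium remains."""
    return 1 - 0.5 ** (1 / generations)

# A marker at theta = 0.01 (about 1 cM) versus one at theta = 0.001, after 100 generations.
print(round(ld_remaining(0.01, 100), 3), round(ld_remaining(0.001, 100), 3))
# The distance at which half the disequilibrium survives shrinks as the mutation gets older.
print(round(theta_at_half_decay(100), 5), round(theta_at_half_decay(1000), 5))
```

After 100 generations most of the disequilibrium survives at θ = 0.001 but only about a third at θ = 0.01, and for an older mutation the region that still shows disequilibrium is correspondingly narrower.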

More Linkage Disequilibrium

• A major complication: turns out that whole blocks of chromosomes get inherited together over many generations. Crossing over isn’t completely random. Means that genes occur in LD blocks separated by recombination hotspots.

• Another problem: LD methods depend on there being only a single original disease mutation that occurred in a particular haplotype. Multiple mutations will each have their own LD haplotype.

Outline

• What is linkage analysis?
  • Parametric
  • Non-parametric

• Why is linkage analysis complicated for complex traits such as cognition or psychiatric illness?

• How can it be used in the modern era?
  • Used to filter the large amount of data generated through next-generation sequencing
  • Used to understand the effects of combinations of variants on phenotype

High throughput sequencing platforms

Mol Cell. 2015 May 21; 58(4): 586–597. doi: 10.1016/j.molcel.2015.05.004

Variation in the Genome

Matt Hurles – UK10K

Causes and Consequences of new mutations per individual:

• 3-4,000,000 variants (90% SNV, 9% Indel, 1% SV)
• 10-11,000 amino acid changing
• 200-250 truncating Indels
• 70-100 truncating base
• 30-50 splice site
• 50-200 new mutations (only 0-2 in genes)

Metachondromatosis

Linkage analysis

Followed by whole genome sequencing of a single affected individual from the family.

Evaluation of sequence data

Sobreira et al. suggest 3 methods to prioritize variants for further analysis:

- Linkage information
  Using the results of linkage analysis to prioritise regions

- Likelihood of being functional
  Looking at exonic variants that affect the protein sequence
  - Stops gained
  - Frameshifting indels

- Frequency in the healthy population
  Comparison to known variants in dbSNP
  Comparison to sequence from 8 unrelated controls
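The three filters can be chained as a simple pipeline. The sketch below is a toy illustration only: the `Variant` fields, the consequence labels, the set of protein-altering consequences, the region coordinates and the example variants are all invented for the example and are not taken from Sobreira et al. or from any annotation tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Variant:
    chrom: str
    pos: int
    consequence: str        # hypothetical labels, e.g. "stop_gained", "frameshift", "intronic"
    in_dbsnp: bool          # already catalogued as a known variant
    seen_in_controls: bool  # present in the unrelated control sequences

Region = Tuple[str, int, int]  # (chromosome, start, end)

# Assumed set of consequences that alter the protein sequence.
PROTEIN_ALTERING = {"stop_gained", "frameshift", "missense"}

def in_linked_region(variant: Variant, regions: List[Region]) -> bool:
    return any(c == variant.chrom and start <= variant.pos <= end for c, start, end in regions)

def prioritise(variants: List[Variant], linked_regions: List[Region]) -> List[Variant]:
    """Keep variants that pass the linkage, function and frequency filters."""
    return [
        v for v in variants
        if in_linked_region(v, linked_regions)
        and v.consequence in PROTEIN_ALTERING
        and not v.in_dbsnp
        and not v.seen_in_controls
    ]

# Invented coordinates and variants, for illustration only.
regions = [("12", 1_000_000, 9_000_000), ("7", 500_000, 2_000_000)]
variants = [
    Variant("12", 4_200_000, "frameshift", False, False),  # passes all three filters
    Variant("12", 4_300_000, "missense", True, False),     # known variant, dropped
    Variant("3", 1_500_000, "stop_gained", False, False),  # outside the linked regions
    Variant("7", 900_000, "intronic", False, False),       # not protein-altering
]
candidates = prioritise(variants, regions)
print(candidates)
```

Applying the linkage filter first is the cheapest step, since it discards most of the genome before any functional annotation is needed.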

Linkage Analysis

They were able to exclude linkage to 96% of the genome (LOD < -2), and 98.4% of the genome showed negative LOD scores.

This reduced the search to 42 Mb of sequence within 6 regions.

This included 767 kb of exonic sequence.

PTPN11

Linkage Analysis

Linkage region | LOD score | No. RefSeq genes | No. variants unique to patient | Unique variants
2p25 | 1.0-1.5 | 20 | 0 |
5q12.1 | 1.0-1.5 | 7 | 0 |
7p14.1 | 2.5* | 14 | 0 |
8q24.1 | 1.8 | 27 | 0 |
9q31.1-q33.1 | 1.0-1.5 | 71 | 0 |
12q33 | 1.8# | 105 | 1 | 11 bp del PTPN11

* Maximum achievable LOD score in this family
# Subsequently revised.

Genotyping in Family

11 bp deletion

Linkage Analysis

Target Enrichment

Agilent Technologies 1M SureSelect

DNA capture array containing 973,952 probes targeting 844,339 bp within the 8.6 Mb candidate interval

Including 88.4% and 98.6% of UCSC exons and CCDS coding sequence, respectively.

61% of sequence reads mapped to the targeted region.

23% of targeted bases were not captured.

16 individuals from 11 families

PTPN11 mutations identified in MC participants.

Locus Identification-problems

• Uncertainty in diagnostic boundaries

• Non-Mendelian inheritance

• Variable age of onset

• Genetic heterogeneity

– Many different genes can cause the illness

> 1% risk world wide

> phenotypic variation

• Oligogenic/polygenic causation

– More than one mutant gene required to produce phenotype

Locus identification- reducing the problems

• Single large families

• Avoid bilineal descent

• rigorous interviews

• family history

Reduce genetic heterogeneity

Significant LOD score = gene of major effect

• Reduce uncertainty of diagnosis

– classify minor diagnoses as unaffected

– >1 category of affected phenotype

A rare missense variant in RCL1 segregates with depression in extended families.

Molecular Psychiatry advance online publication 21 March 2017. doi:10.1038/mp.2017.49

Figure: Immunohistochemical labeling of RCL1 in human cerebral cortex, showing co-localization with GFAP-positive primate-specific interlaminar astrocytes.

A rare genetic variant, rs115482041, on chr 9p24 in the RCL1 gene segregated with depression across multiple generations in an extended family.

The variant was estimated to explain more than half of the variation in depressive symptoms in the extended family, and 2.9% of the heritability in the overall genetically isolated population.

Interlaminar astrocytes may form a network for long-range coordination of intra-cortical communication.

Pedigree structure and genotypic information

John R. Giudicessi and Michael J. Ackerman. Circ Cardiovasc Genet. 2013;6:193-200

The phenotype of Bardet–Biedl syndrome (BBS) is defined by the association of retinitis pigmentosa, obesity, polydactyly, hypogenitalism, renal disease and cognitive impairment. The significant genetic heterogeneity of this condition is supported by the identification, to date, of eight genes (BBS1–8) implicated in cilia assembly or function.

Phenotypic heterogeneity and reduced penetrance?

recurrent major depression

minor diagnosis

unaffected

schizophrenia

bipolar affective disorder

(1;11)(q42;q14) translocation

Blackwood et al, 2001

Risk of major psychiatric illness increases 50 fold

Carriers have reduced brain attention measure ERP P300

Chromosomes with multipoint LOD >2

Chr1q, Chr11q, Chr5q, Chr2p, Chr1p, Chr4q, Chr16, Chr3q

Model Code | Phenotypes
MODEL B | SCZ, BP1, BP2, SCZAFF, rMDD, cyclothymia
MODEL F | BP, rMDD, MDD
Psychosis | hal/del

Fine Mapping

Columns: Id, Any Diagnosis, SCZ BP, SCZ BP rMDD Cyclothymia, BP rMDD, t(1;11), chr1, chr11_1, chr11_2, F, chr5, chr1p, chr2p, chr3q, chr4q, chr16

13 1 1 1 1 2 1* 1 1 1 1

24 1 1 1 1 2 1 1 1 1 1

27 1 1 2 1 1 1 1 1

49 1 1 1 1 2 1 1 1 1 1 1 Schizophrenia

41 1 1 1 1 1 2 1 1 1 1 1 1 1 1 Bipolar

61 1 1 1 1 2 1 1 1 1 1 1 1

50 1 1 2 1 1 1 1 1 1 1 1

26 1 1 2 1 1 1 1

104 1 1 2 1 1 1

67 1 1 1 1 1 2 1 1 1 1 1 Bipolar

55 1 1 1 2 1 1 1 1 1 1 Cyclothymic

19 1 1 1 1 2 1 1 1 1

53 1 1 1 2 1 1 1 1 Cyclothymic

15 1 1 1 1 2 1 1 1 1 Schizophrenia

32 1 1 1 2 1 1 1 Cyclothymic

9 1 1 2 1 1 1

18 1 1 1 1 2 1 1 1 Schizophrenia

70 1 1 1 1 2 1 1 2 Schizophrenia

44 1 1 1 1 1 1 1 1 1 1 1 1 MDD recurrent

87 1 1 1 1 1 1 1 1 MDD single episode

47 1 1 1 1 1 1 1 MDD + Generalised anxiety

54 1 1 1 1 1 1 1 1 1 MDD recurrent

62 1 1 1 1 Generalised anxiety

85 1 1 1 MDD single episode

Do the haplotypes in each individual predict diagnoses?

Phenotype prediction in the family

Linkage information useful for:

• Exclusion mapping

• Variant filtering

• Assessing combined effects of variants:
  • Compound heterozygosity

• Identification of causal pathways

• Investigation of reduced penetrance & phenotypic heterogeneity

Abstract

Creative activities in music represent a complex cognitive function of the human brain, whose biological basis is largely unknown. In order to elucidate the biological background of creative activities in music we performed genome-wide linkage and linkage disequilibrium (LD) scans in musically experienced individuals characterised for self-reported composing, arranging and non-music related creativity. The participants consisted of 474 individuals from 79 families, and 103 sporadic individuals. We found promising evidence for linkage at 16p12.1-q12.1 for arranging (LOD 2.75, 120 cases), 4q22.1 for composing (LOD 2.15, 103 cases) and Xp11.23 for non-music related creativity (LOD 2.50, 259 cases). ... The locus at 4q22.1 overlaps the previously identified region of musical aptitude, music perception and performance giving further support for this region as a candidate region for broad range of music-related traits. ... Pathway analysis of the genes suggestively associated with composing suggested an overrepresentation of the cerebellar long-term depression pathway (LTD), which is a cellular model for synaptic plasticity. ... These results suggest that molecular pathways linked to memory and learning via LTD affect music-related creative behaviour.