Individual Variant Interpretation in a Diagnostic Setting

Nienke van der Stoep
Dept. Clinical Genetics, LUMC

Neutral variant / Unknown clinical significance / Pathogenic mutation


Post on 23-Feb-2020


Indication for requesting molecular analysis

• Genetic confirmation of clinical symptoms: better treatment

• Carrier detection: better risk calculation and/or treatment options

• Prenatal analysis: in the Netherlands, the pregnancy is terminated in >99% of cases with an affected fetus

Genome diagnostic setting

Analysis of genes proven to be the underlying cause of the clinical symptoms

Scanning
* Approach: variant detection by DNA (or RNA) sequencing
  - Methods: Sanger / WES / targeted NGS / WGS
* Approach: CNV detection, structural changes
  - Methods: MLPA / array analysis / karyotyping / FISH / WGS

Mutation/region specific
* repeat length analysis
* methylation
* Sanger sequencing of specific amplicons
* deletion/duplication-specific PCR (or MLPA)
* FISH
* other

Sequence analysis; unsolicited findings

Whole Exome Sequencing (WES) analysis

What about findings that do not correlate with the clinical phenotype present in the patient / family?

Precounseling is essential!

Patient / family wants to know:
1. Only sequence changes correlated to the disease in the family
2. Only sequence changes correlated to the disease in the family and other treatable disorders
3. All sequence changes and the impact of the changes

Be aware of differential or missed diagnoses, where identified variants do correlate with the phenotype of the patient but this phenotype has not (yet) been observed by the medical doctor

Gene panel sequence analysis

Analysis of genes proven to cause the clinical symptoms:
* Using a targeted sequence analysis approach, either by specific enrichment or by selected gene panel analysis.

Advantage: no unexpected results with respect to other diseases

Examples of gene panel NGS in the Netherlands:
* Growth disorders and skeletal abnormalities (targeted)
* Sporadic mental retardation (whole exome; sporadic cases)
* Cardiomyopathy (targeted)
* Muscular dystrophies (targeted)
* Deafness and blindness (whole exome and targeted analysis)

Position and effect of nucleotide variants

[Figure: gene diagram showing where nucleotide variants can have an effect: enhancer, promoter, exon (ESE, ESS), intron (ISE, ISS, branch point sequence). Source: http://www.hgvs.org]

Nomenclature of identified variants

Use HGVS (http://www.hgvs.org)

Exon (ESE?):
Nonsense      c.3826C>T           p.(Arg220*)
Frameshift    c.3525_3526delAA    p.(Arg1157fs)
In-frame      c.4312_4314delGAA   p.(Glu1438del)
Missense      c.4418A>G           p.(Gln1494Arg)
Silent        c.3468C>T           p.(=)
Splice site   c.3112G>T           p.(=)

Intron:
Splice site   c.3113+1G>A         p.?

NB. All c. nomenclature is Reference Sequence specific

Start position 1: the A of the ATG translation start codon is nucleotide number 1.
Report information on the genome build (hg19) and the Reference Sequence.

NGS data:
- Recommended to include the genomic coordinates (e.g. chr11: g.19207841)
- Indicate whether the variant was confirmed by another, independent method
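The coding-DNA descriptions on this slide can be read mechanically. The sketch below is a deliberately simplified, hypothetical reader (the pattern `HGVS_C` and the function `describe` are illustrative names, not part of any HGVS tool) that handles only substitutions and deletions, with an optional intronic offset on a single position. It shows why `c.3525_3526delAA` is a frameshift and `c.4312_4314delGAA` is in-frame: the deleted length is or is not a multiple of three.

```python
import re

# Hypothetical, simplified pattern: only "c." substitutions and deletions,
# with an optional intronic offset (+1, -2, ...) on a single position.
HGVS_C = re.compile(
    r"^c\.(?P<start>\d+)(?P<offset>[+-]\d+)?"
    r"(?:_(?P<end>\d+))?"
    r"(?:(?P<ref>[ACGT])>(?P<alt>[ACGT])|del(?P<deleted>[ACGT]*))$"
)


def describe(hgvs_c):
    """Return kind, position, intronic flag and frameshift flag for a c. description."""
    m = HGVS_C.match(hgvs_c)
    if m is None:
        raise ValueError(f"unsupported description: {hgvs_c}")
    start = int(m["start"])
    intronic = m["offset"] is not None
    if m["ref"]:
        kind, length = "substitution", 1
    else:
        kind = "deletion"
        end = int(m["end"]) if m["end"] else start
        length = end - start + 1
    # An exonic deletion shifts the reading frame unless its length is a multiple of 3.
    frameshift = kind == "deletion" and not intronic and length % 3 != 0
    return {
        "kind": kind,
        "position": start,
        "intronic": intronic,
        "length": length,
        "frameshift": frameshift,
    }


if __name__ == "__main__":
    for example in ("c.3826C>T", "c.3525_3526delAA", "c.4312_4314delGAA", "c.3113+1G>A"):
        print(example, describe(example))
```

A real pipeline would also need insertions, duplications, ranges with intronic offsets and the reference sequence identifier, all of which this sketch leaves out.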

RNA splicing

[Figure: RNA splicing. Genomic DNA: exon 1, exon 2, intron 2, exon 3, with the donor (5' splice site), the acceptor (3' splice site) and the branch point (between -50 and -10). mRNA: exons 1, 2 and 3 joined, from ATG to TGA. Protein.]

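RNA splicing consensus sequences

[Figure: consensus sequences at the exon 1 / intron boundary (donor, intron positions +1 and +2) and at the intron / exon 2 boundary (acceptor, intron positions -2 and -1). Nucleotide frequencies in % are given in parentheses.
Donor: exon ... A(62) G(77) | G(100) T(100) A(71) ... intron; other frequencies shown: A(38), C(31), G(24).
Acceptor: intron ... PY(84) PY(85) PY(58) X A(100) G(100) | exon; other frequencies shown: C(55), T(37), T(41), G(50), A(24).]

Zhang, Hum Mol Genet (1998) 7:919-932; Roca et al., Genome Research (2008) 18:77-87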

Interpretation of detected variants

Nonsense and frameshift:
- Almost always pathogenic

But be aware of:
- Nonsense mutation at the N-terminal end of the protein: is use of an alternative ATG translation start codon possible?
- Nonsense mutation at the C-terminal end of the protein: e.g. p.Lys3326* (BRCA2; 3418 aa) is a known neutral variant. What about the pathogenicity of all nonsense and frameshift mutations after this position?

Splice site changes:
- Changes at positions -2, -1 and +1, +2 are almost always pathogenic UNLESS:
  • the skipped exon(s) are in frame and do not affect protein function
  • another, non-affected wt splice variant from the same allele can replace the function
  • the wt RNA transcript of the affected allele is still sufficiently expressed

Testing RNA expression of the mutated allele is highly recommended
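As a first-pass aid for the splice site rule above, the following minimal sketch flags intronic substitutions at positions +1, +2, -1 and -2. The pattern `INTRONIC` and the function name are assumptions made for illustration; the check only recognises simple descriptions such as `c.3113+1G>A`, and a positive flag still requires the exceptions listed above to be ruled out, preferably by RNA analysis.

```python
import re

# Assumed, simplified pattern for an intronic substitution such as c.3113+1G>A.
INTRONIC = re.compile(r"^c\.\d+(?P<offset>[+-]\d+)[ACGT]>[ACGT]$")


def is_canonical_splice_position(hgvs_c):
    """True when the substitution lies at intron position +1, +2, -1 or -2."""
    m = INTRONIC.match(hgvs_c)
    return m is not None and int(m["offset"]) in (1, 2, -1, -2)


if __name__ == "__main__":
    for example in ("c.3113+1G>A", "c.3113+5G>A", "c.3112G>T"):
        print(example, is_canonical_splice_position(example))
```

Exonic changes such as `c.3112G>T` return False here even though they can still affect splicing, so a negative result is not evidence of neutrality.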

Interpretation of detected variants

The rest: in-frame, missense and silent nucleotide changes, and the other intron changes.

- How to decide whether the variant is pathogenic or neutral? Until decided: variant of unknown significance (VUS)

VUS interpretation and classification tools:
- In silico evaluation of the impact on RNA and protein function
- Gene-specific databases
- Frequency in the population (GoNL, ExAC, ESP, etc.)
- Literature search
- Co-occurrence in trans with deleterious mutations
- Segregation with disease in families
- Biochemical functional tests

Expert knowledge of the gene is highly recommended for correct interpretation of a variant.

Classification of sequence changes

5-class system (Plon et al. Hum Mutat (2008) 29: 1282-1291)

Class   Description                                                  Probability of being pathogenic
5       Definitely pathogenic*                                       > 0.99
4       Likely pathogenic#                                           0.95 - 0.99
3       Uncertain                                                    0.05 - 0.949
2       Likely not pathogenic or of little clinical significance     0.001 - 0.049
1       Not pathogenic or of no clinical significance                < 0.001

*: prenatal and carrier detection is offered
#: presymptomatic carrier testing is offered
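The probability bands of the 5-class system translate directly into a lookup. The sketch below is one possible reading; the function name `plon_class` is hypothetical, and the handling of values that fall between the printed bands (for example 0.9495 or exactly 0.99) is an assumption, since the table does not specify it.

```python
def plon_class(probability):
    """Map a posterior probability of pathogenicity to a class from 1 to 5.

    Assumption: gaps between the printed bands are assigned to the band
    whose lower bound they exceed, and exactly 0.99 counts as class 4.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability > 0.99:
        return 5  # definitely pathogenic
    if probability >= 0.95:
        return 4  # likely pathogenic
    if probability >= 0.05:
        return 3  # uncertain
    if probability >= 0.001:
        return 2  # likely not pathogenic
    return 1      # not pathogenic


if __name__ == "__main__":
    for p in (0.995, 0.97, 0.5, 0.01, 0.0005):
        print(p, plon_class(p))
```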

3-class UV system (Bell):

Description:
[Definitely pathogenic]
III. Possibly/likely to be pathogenic, but cannot be formally proven
II. Unlikely to be pathogenic, but cannot be formally proven
I. Not pathogenic or of no clinical significance

Tools and criteria for classification of variants

Frequency: MAF (e.g. rs data, GoNL, ExAC, ESP) > 1% (AD disorder) or > 5% (AR disorder), with more than 200 chromosomes analysed: Class 1

Known deleterious: nonsense, frameshift and 'consensus sequences of intron': Class 5

In silico prediction programs (protein and RNA splicing). Frequently used software, Alamut:
- 5 in silico protein prediction programs
- 5 in silico RNA splicing programs
- links to various databases and literature

Functional studies (e.g. in vitro assay, RNA analysis, LOH studies in tumours)

Locus-specific databases: LOVD, BIC, HGMD Professional, etc.

Conservation
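The frequency criterion on this slide (MAF above 1% for an AD disorder or above 5% for an AR disorder, in more than 200 chromosomes) can be written as a small check. This is a sketch only: the function and parameter names are hypothetical, and the thresholds are taken from the slide as they stand.

```python
# Thresholds as stated on the slide; the inheritance labels are assumed keys.
MAF_THRESHOLD = {"AD": 0.01, "AR": 0.05}
MIN_CHROMOSOMES = 200


def is_class1_by_frequency(maf, inheritance, chromosomes_analysed):
    """True when population frequency alone supports Class 1 (not pathogenic)."""
    if inheritance not in MAF_THRESHOLD:
        raise ValueError("inheritance must be 'AD' or 'AR'")
    enough_data = chromosomes_analysed > MIN_CHROMOSOMES
    return enough_data and maf > MAF_THRESHOLD[inheritance]


if __name__ == "__main__":
    print(is_class1_by_frequency(0.02, "AD", 1000))  # frequent for a dominant disorder
    print(is_class1_by_frequency(0.02, "AR", 1000))  # not frequent enough for a recessive one
```

The same 2% variant passes the dominant threshold but not the recessive one, which is why the inheritance model has to be known before a frequency filter is applied.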

RNA in silico splice site prediction programs

via Alamut

• SpliceSiteFinder (SSF)

• MaxEntScan (MES)

• NNSPLICE

• GeneSplicer

• Human Splicing Finder

BRCA study: combining SSF and MES gave 96% sensitivity and 83% specificity for VUSs occurring in the vicinity of consensus splice sites.

[Figure legend: correct, false positive, uncertain, not recognized]
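For readers less familiar with the two performance measures quoted for the combined SSF and MES prediction, the sketch below shows how they are computed from a comparison of predictions with RNA results. The function names are illustrative, and the counts in the usage lines are made up purely to reproduce the quoted percentages; they are not the counts from the BRCA study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real splice defects that the prediction flagged."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of variants without splice defect that the prediction cleared."""
    return true_negatives / (true_negatives + false_positives)


if __name__ == "__main__":
    # Hypothetical counts, chosen only so that the results read 0.96 and 0.83.
    print(round(sensitivity(true_positives=48, false_negatives=2), 2))
    print(round(specificity(true_negatives=83, false_positives=17), 2))
```

A specificity of 83% means roughly one in six variants without a splice effect is still flagged, which is one reason RNA analysis is recommended before reclassification.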

Protein in silico prediction programs

via Alamut

Align GVGD (http://agvgd.iarc.fr/agvgd_input.php)
Combines the biophysical characteristics of amino acids and protein multiple sequence alignments (output: C0 to C65).

PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2)
Predicts the impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations (output: benign to damaging).

SIFT (http://sift.jcvi.org/)
Is based on the degree of conservation of amino acid residues in sequence alignments derived from closely related sequences (output: tolerated to damaging).

Mutation Taster (http://www.mutationtaster.org)
The frequencies of all single features for known disease mutations/polymorphisms were studied in a large training set composed of >390,000 known disease mutations from HGMD Professional and >6,800,000 harmless SNPs and indel polymorphisms from the 1000 Genomes Project (TGP) (output: benign to disease causing).

KD4v (http://decrypthon.igbmc.fr/kd4v/cgi-bin/home)
The server provides a set of rules learned by Induction Logic Programming (ILP) on a set of missense variants described by conservation, physico-chemical, functional and 3D structure predicates. The rules are interpretable by non-expert humans and can be used to accurately predict the deleterious/neutral status of an unknown mutation.

NB. The programs can give different results for the same variant

In silico classification of variants

[Figure from Lindor et al. (2012) Hum Mutat 33:8-21: determine / estimate the prior likelihood of causality (e.g. in silico analysis / Alamut tool); additional facts]

Align-GVGD in silico tool scores

[Table from Lindor et al. (2012) Hum Mutat 33:8-21: Align-GVGD scores]

Classification of VUS

Run prediction programs in Alamut (or other software) for selected variants (excluding class 1 and 5 variants):

Silent changes and intron changes outside the consensus:
• no effect on RNA splicing: Class 2
• effect on RNA splicing*: Class 3 or higher

Missense change and no further data (e.g. functional):
• effect on RNA splicing: Class 3 or higher
• 4 out of 5 protein in silico programs neutral: Class 2
• 3 out of 4 protein in silico programs neutral: Class 2
• remaining: Class 3

Further analysis can result in a reclassification (e.g. RNA studies, functional protein studies, LOH)

* Be aware of naturally occurring isoforms/splice variants
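The decision rules on this slide can be sketched as a single function. The names below are hypothetical, "Class 3 or higher" is returned as the starting class 3, and counts other than the two cases named on the slide (4 of 5, 3 of 4, or better) are assumed to fall under "remaining". It is an illustration of the rule set, not a validated classifier.

```python
def classify_vus(variant_type, affects_splicing, neutral_calls=0, total_calls=0):
    """Return the starting class (2 or 3) for a variant that is not class 1 or 5.

    variant_type: "silent", "intronic" (outside the consensus) or "missense".
    neutral_calls / total_calls: protein in silico programs calling it neutral.
    """
    if variant_type in ("silent", "intronic"):
        return 3 if affects_splicing else 2
    if variant_type == "missense":
        if affects_splicing:
            return 3  # class 3 or higher, pending further analysis
        # Assumption: only the panel sizes named on the slide qualify for class 2.
        mostly_neutral = (total_calls == 5 and neutral_calls >= 4) or (
            total_calls == 4 and neutral_calls >= 3
        )
        return 2 if mostly_neutral else 3
    raise ValueError(f"unsupported variant type: {variant_type}")


if __name__ == "__main__":
    print(classify_vus("silent", affects_splicing=False))
    print(classify_vus("missense", affects_splicing=False, neutral_calls=4, total_calls=5))
    print(classify_vus("missense", affects_splicing=False, neutral_calls=3, total_calls=5))
```

Because the splicing check comes first for missense variants, a change that all protein programs call neutral still starts at class 3 when a splice effect is predicted.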

Reporting pathogenic mutations

If the identified variant is considered a pathogenic mutation:

- Confirmation of the clinical diagnosis

- Genetic cause of the disease is identified

Consequence:

• Prenatal analysis is offered

• Choice of treatment (e.g. breast cancer families)

• Presymptomatic testing is offered (e.g. mastectomy / oophorectomy)

Stringent selection of variants in e.g. WES

Sanger sequencing (low number of genes): all sequence changes in the analyzed fragments are viewed and interpreted.

NGS / gene panel sequencing: a custom or commercially designed pipeline is used to create the variant list. Often with exclusion of:

• SNPs: frequency > 1-2% (AD criteria, not AR)
• Silent changes
• PolyPhen: benign
• Etc.

A causal variant could be lost/removed in the final VCF, and a false possibly pathogenic variant could be identified.
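The risk described here can be made concrete with a toy filter. In the sketch below the record layout (`Variant`), the field names and the example records are all hypothetical; only the three exclusion criteria come from the slide. The point is that a silent change that affects splicing is dropped by the same rule that removes harmless silent changes.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    population_frequency: float
    consequence: str        # e.g. "missense" or "silent"
    polyphen: str           # "benign", "damaging" or "" when not scored
    affects_splicing: bool  # hypothetical annotation from a splice predictor


def stringent_filter(variants, max_frequency=0.01):
    """Split variants into kept and dropped using the slide's exclusion criteria."""
    kept, dropped = [], []
    for v in variants:
        exclude = (
            v.population_frequency > max_frequency
            or v.consequence == "silent"
            or v.polyphen == "benign"
        )
        (dropped if exclude else kept).append(v)
    return kept, dropped


def dropped_needing_review(dropped):
    """Dropped variants that may still be causal because they affect splicing."""
    return [v for v in dropped if v.affects_splicing]


# Hypothetical records for illustration only.
EXAMPLE_VARIANTS = [
    Variant("common SNP", 0.15, "missense", "benign", False),
    Variant("c.3112G>T", 0.0, "silent", "", True),
    Variant("rare missense", 0.0001, "missense", "damaging", False),
]

if __name__ == "__main__":
    kept, dropped = stringent_filter(EXAMPLE_VARIANTS)
    print([v.name for v in kept])
    print([v.name for v in dropped_needing_review(dropped)])
```

Keeping the dropped list and re-checking it against splice predictions is one way to limit the loss; the default frequency cut-off of 1% follows the AD criterion and would be too strict for a recessive disorder.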

Classification of sequence changes using Alamut

Alamut: missense, c.1235A>T (p.Glu412Val; TSC2), at protein level

TSC2: Tuberous Sclerosis Complex 2 (protein: tuberin)

[Figure: Alamut protein-level output; predictions shown: probably damaging, polymorphism]

Classification of sequence changes using Alamut

Alamut: c.1235A>T (p.Glu412Val; TSC2), at RNA level, splice site prediction

Consensus sequence splice donor: A(62) G(77) / g(100) t(100) a(71)

[Figure: Alamut splice site prediction for the normal and the mutant sequence; the mutant shows a NEW donor site]

Functional verification of sequence changes

c.1235A>T p.(Glu412Val) in TSC2

NB. Software tool KD4v predicted a polymorphism

Functional analysis:
- RNA analysis performed (RNA isolated from skin fibroblasts)

Results:
- RNA showed an abnormal pattern, in agreement with the predictions.

NB.1 If possible, use an intragenic heterozygous SNP to rule out the possibility that the abnormally spliced RNA is a product of the normal allele.

NB.2 Use enough controls (about 5) to rule out leaky transcription artefacts.

Conclusion: sequence change c.1235A>T is a pathogenic mutation; it influences RNA splicing of TSC2 and is therefore not a missense mutation but a splice site mutation.

Nomenclature: c.1235A>T, p.Glu412fs

Classification of sequence changes using Alamut

TSC1: missense changes in the same codon. Pathogenicity?

Variant        In silico predictions             Functional analysis
p.Arg190Cys    probably damaging; neutral        same as wildtype
p.Arg190Pro    probably damaging; deleterious    pathogenic

Hereditary breast/ovarian cancer

Disease/gene-specific criteria

- 1 in 8 women develops breast cancer
- ~5%: a genetic factor is involved
- 10-15%: a pathogenic mutation in either BRCA1 or BRCA2
- Many VUS identified

Classification tools/criteria
• Co-occurrence of 2 deleterious mutations: not possible in BRCA1; in BRCA2 it gives another phenotype (Fanconi anemia D1)
• Co-segregation
• Pathology: e.g. array CGH profiles; loss of heterozygosity of the UV
• Functional data

Problem: most UVs are very rare, and therefore the likelihood ratios will not give the ultimate result.

Hereditary breast/ovarian cancer, using functional analysis

Variant: c.5309G>T p.Gly1770Val (BRCA1)

- Several small families; not enough for linkage analysis
- All families of Northern African origin
- Array CGH of tumours: BRCA1 profile
- Functional analysis: pathogenic

[Figure: in silico prediction output: possibly damaging, deleterious]

Strong suspect for pathogenic variant

Bloopers

p.Met1628Val variant in BRCA1: at first classified as a neutral/low-risk variant

Phelan et al. (2005) J. Med. Genet. 42: 138-146: functional test: pathogenic mutation

Carvalho and Monteiro (2007) J. Med. Genet. 44: 78: mistake in the construct; it carried not only the 1628V variant but also a deletion of 7 nucleotides.

p.Met1628Val is a neutral variant

Cryptic and challenging findings with in silico variant prediction tools

Result of prenatal array

'Patients': mother and her previous, deceased male fetus
Detected: array deletion Xp21 (DMD gene) in both mother and fetus

Coordinates of array data: 32687712 and 33058441; exact deletion unclear

Mother is pregnant again

Check exact deletion by MLPA testing:

- deletion of exons 2-9 of the DMD gene

Cryptic and challenging findings

Previous data: the deletion was also observed in another family, but also as an array result. No index patient known. Info: an adult brother of the mother in that family has the same deletion (no phenotype? no data).

Action: test the father and brothers of this pregnant woman

[Figure: exon 1 and exon 10 flanking the deletion]

Results in an out-of-frame deletion (Alamut): pathogenic? Class 5

Result: the father of the pregnant woman has the same deletion and so far no clear dystrophy symptoms. Class 3

NB. Having an index patient is very relevant for variant interpretation

Cryptic and challenging findings

Gene: SDHA, splice variant in intron 9: c.1260+1G>A, p.?
Disrupts the canonical splice donor site: Class 5

Functional RNA analysis: a new donor site is used; the variant results in an 'in frame' insertion, Class 3 (in silico prediction not that clear)

Be aware of newly generated splice sites

Final considerations

Guidelines for finding genetic variants underlying human disease
Posted in 'Genomes Unzipped': 24 Apr 2014 06:00 AM PDT
Authors: Daniel MacArthur and Chris Gunter.

New DNA sequencing technologies are rapidly transforming the diagnosis of rare genetic diseases, but they also carry a risk: by allowing us to see all of the hundreds of “interesting-looking” variants in a patient’s genome, they make it potentially easy for researchers to spin a causal narrative around genetic changes that have nothing to do with disease status.

Such false positive reports can have serious consequences: incorrect diagnoses, unnecessary or ineffective treatment, and reproductive decisions (such as embryo termination) based on spurious test results.

In order to minimize such outcomes, the field needs to decide on clear statistical* guidelines for deciding whether or not a variant is truly causally linked with disease.

* NB: additional functional and biology-related guidelines will be more reliable

Questions