Sharing and analyzing clinical NGS data using LOVD 3.0
DTL Focus meeting: ‘NGS Data Sharing & Repository’
Leiden University Medical Center Ivo F.A.C. Fokkema February 3rd , 2014 Slide 01/11
LOVD - Leiden Open Variation Database
• Locus-Specific (mutation) Database (LSDB)
  • Gene- or disease-centered database of sequence variants
  • LOVD 3.0 also allows inter-genic variants (NGS)
• Open Source, free, web-based, can be used locally
• In use since 2003, first released in January 2004
• Within a few years became the most popular LSDB software
  • 50% of genes in LSDBs worldwide are in LOVD (2009)
  • 98% of genes in LSDBs worldwide are in LOVD (Jan 2014)
• 620K fully annotated variants in registered LOVDs (70%)
Why are LSDBs needed?
• Need for finding curated data about a found variant
• Sources where to get variant information
  • HGMD
    • (Partially) paid resource
    • First report only (frequency unknown)
    • Only variants reported as pathogenic
  • dbSNP
    • Mixed bag of everything
    • Not all variants are actually confirmed
    • Not clear if functional consequences have been reported
  • LSDBs
    • All variants, all reports
    • Curated by expert
Some LOVD key advantages (1)
• Direct online submission
• Advanced searching
• Curator chooses which data to store (custom columns)
• Link with Mutalyzer to check variant correctness
• Application Programming Interface (API)
  • Software can discover information in LOVDs
  • Create views in genome browsers
  • Global search interface
  • Accessed 2.2M times in 2012
  • Accessed 5.2M times in 2013
Some LOVD key advantages (2)
New in LOVD 3.0
• Describe variants on more than one transcript
• VCF file import
  • Find affected transcripts (Mutalyzer)
  • Create gene entries automatically (HGNC)
  • Reference sequences (Mutalyzer/NCBI/EBI)
  • Map position on transcript(s) (Mutalyzer)
  • Predict RNA & protein change (Mutalyzer)
• SeattleSeq annotated file import
  • Variants with effect on transcripts and many annotations
• Search variant on other LOVDs
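As a rough illustration of the very first step of a VCF import, the sketch below splits VCF data lines into their fixed columns. It assumes the standard tab-separated VCF layout (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO); the `VcfRecord` and `read_vcf` names and the example lines are made up for illustration and are not taken from LOVD's importer. The transcript mapping and RNA/protein prediction that the slide attributes to Mutalyzer are not shown.

```python
from typing import Iterable, Iterator, List, NamedTuple


class VcfRecord(NamedTuple):
    """Fixed columns of one VCF data line (hypothetical helper, not LOVD code)."""
    chrom: str
    pos: int
    ref: str
    alts: List[str]


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse one tab-separated VCF data line into a VcfRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 columns")
    chrom, pos, _id, ref, alt = fields[:5]
    # Multiple alternate alleles are comma-separated in the ALT column.
    return VcfRecord(chrom, int(pos), ref, alt.split(","))


def read_vcf(lines: Iterable[str]) -> Iterator[VcfRecord]:
    """Yield records from an iterable of lines, skipping '#' header lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)


# Made-up example input: two header lines and one data line.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrX\t1000\t.\tA\tG,T\t50\tPASS\t.",
]
records = list(read_vcf(example))
```

Each record could then be handed to whatever service performs the mapping onto transcripts.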
Still under construction
• Add data licenses
  • Let data owner decide if data is redistributable, and under which terms
  • Use Creative Commons license schema
  • Modify API, make accepting license mandatory
• Allow LOVDs to fetch variant frequencies from data sets
  • Exome Variant Server, 1000 Genomes, GoNL, ...
  • Or use DVD?
Analyzing clinical NGS data with LOVD (1)
Analyzing clinical NGS data with LOVD (2)
• New analysis platform based on LOVD 3.0
• Developed for clinical genetics department in Leiden
• Exome sequencing on patients with rare diseases
• VCF run through internally developed pipeline for annotations
  • Tab delimited file with variants of affected child
  • SeattleSeq annotations (conservation, effect prediction)
  • Fields for coverage information, score for variant in parent
• Filters implemented to find the causative variant
• Variants can be scored by user and flagged for confirmation
Analyzing clinical NGS data with LOVD (3)
• All analyses first filter on gene panel
• De novo variants, not previously reported
  • Not found in dbSNP, 1000 Genomes, GoNL, EVS; Not present in mother or father
• X-linked recessive variants (male patients)
  • Chromosome X; Not present in father; Not homozygous in mother
• Recessive variants
  • Homozygous or compound heterozygous in patient; Not homozygous in parents
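The filter rules above can be sketched as simple predicates over annotated variants. This is a minimal sketch, not the platform's actual implementation: the `Variant` fields, the zygosity labels and the gene names are all hypothetical, and compound heterozygosity is approximated crudely as "two or more heterozygous variants in the same gene" without checking which parent each allele came from.

```python
from collections import Counter
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Variant:
    """One annotated patient variant (all field names are hypothetical)."""
    gene: str
    chrom: str
    patient_zygosity: str                   # "het" or "hom"
    father_zygosity: str = "absent"         # "absent", "het" or "hom"
    mother_zygosity: str = "absent"
    known_in: FrozenSet[str] = frozenset()  # e.g. {"dbSNP", "GoNL"}


def is_de_novo(v: Variant) -> bool:
    """Not reported in any reference data set and absent in both parents."""
    return (not v.known_in
            and v.father_zygosity == "absent"
            and v.mother_zygosity == "absent")


def is_x_linked_recessive(v: Variant) -> bool:
    """For male patients: on X, absent in father, not homozygous in mother."""
    return (v.chrom == "X"
            and v.father_zygosity == "absent"
            and v.mother_zygosity != "hom")


def recessive_candidates(variants: List[Variant]) -> List[Variant]:
    """Homozygous or (crudely) compound heterozygous; not homozygous in parents."""
    ok = [v for v in variants
          if v.father_zygosity != "hom" and v.mother_zygosity != "hom"]
    het_per_gene = Counter(v.gene for v in ok if v.patient_zygosity == "het")
    return [v for v in ok
            if v.patient_zygosity == "hom" or het_per_gene[v.gene] >= 2]


# Made-up example data.
variants = [
    Variant("GENE_A", "1", "het"),
    Variant("GENE_A", "1", "het", father_zygosity="het",
            known_in=frozenset({"dbSNP"})),
    Variant("GENE_B", "X", "hom", mother_zygosity="het"),
    Variant("GENE_C", "2", "hom", father_zygosity="het", mother_zygosity="het"),
    Variant("GENE_D", "3", "het"),
]
panel = {"GENE_A", "GENE_B", "GENE_C"}

# All analyses first filter on the gene panel.
in_panel = [v for v in variants if v.gene in panel]
de_novo = [v for v in in_panel if is_de_novo(v)]
x_linked = [v for v in in_panel if is_x_linked_recessive(v)]
recessive = recessive_candidates(in_panel)
```

Keeping each rule as a separate predicate mirrors the slide: the gene panel is applied once, and each inheritance model is then a separate pass over the same list.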
Analyzing NGS data with LOVD (4)
Acknowledgements
• LOVD 3.0 development
  • Ivar Lugtenburg
  • Jerry Hoogenboom
• LOVD team
  • Johan den Dunnen
  • Peter Taschner
  • Julia Lopez Hernandez
www.LOVD.nl
• Mutalyzer
  • Jeroen Laros
  • Martijn Vermaat
www.Mutalyzer.nl
Varda: A platform for sharing NGS variants in medical research and diagnostics
Martijn Vermaat
Department of Human Genetics
Leiden University Medical Center
Varda
Platform for sharing variant frequencies
Central human genomic variant database
• Focussed on frequencies, not individuals
• Only accessible to collaborators
• Using is sharing
Goal
• Share variants found in sequencing experiments
• Find functionally relevant variants
DTL Focus Meeting 1/7 Monday, 3 February 2014
Varda
Technical details
• Annotate by sharing
• Store coverage information to determine reference calls
• Pooling without loss of information
• Encrypted connection with authentication
• REST API
  • Command line client for pipelines
  • Web client for people
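To illustrate why stored coverage matters for frequencies, the sketch below counts a sample towards the denominator only if it covers the variant position, so that a sample without the variant is treated as a reference call only where it was actually sequenced. The `Sample` class, its fields and the example data are hypothetical and do not reflect Varda's actual data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# (chromosome, position, reference, observed): a hypothetical variant key.
VariantKey = Tuple[str, int, str, str]


@dataclass
class Sample:
    variants: Set[VariantKey]
    covered: List[Tuple[str, int, int]]  # (chromosome, first, last), inclusive

    def covers(self, chrom: str, pos: int) -> bool:
        """True if any covered region of this sample contains the position."""
        return any(c == chrom and first <= pos <= last
                   for c, first, last in self.covered)


def frequency(variant: VariantKey, samples: List[Sample]) -> Optional[float]:
    """Fraction of covering samples that carry the variant, or None if no sample covers it."""
    chrom, pos, _, _ = variant
    covering = [s for s in samples if s.covers(chrom, pos)]
    if not covering:
        return None
    observed = sum(variant in s.variants for s in covering)
    return observed / len(covering)


# Made-up example: the third sample does not cover the position at all.
v = ("1", 150, "A", "G")
samples = [
    Sample({v}, [("1", 100, 200)]),
    Sample(set(), [("1", 100, 200)]),
    Sample(set(), [("1", 500, 600)]),
]
freq = frequency(v, samples)
```

Without the coverage check, the third sample would wrongly count as a reference call and the frequency would drop from one in two to one in three.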
Platform overview
Data model
Sample anonymity
Guaranteed by
• No browsing
• Only disclose database-wide variant frequencies
• No individual genotypes
Additionally, samples can be pooled
• Merge variants from multiple samples
• Before uploading to the server
• Without loss of functionality
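Pooling can be pictured as replacing individual samples by per-variant counts plus the pool size before anything is uploaded. The sketch below is an assumption about how such a merge could look, not Varda's implementation; the `pool` function and the variant keys are hypothetical, and coverage is left out for brevity.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# (chromosome, position, reference, observed): a hypothetical variant key.
VariantKey = Tuple[str, int, str, str]


def pool(samples: Iterable[Set[VariantKey]]) -> Tuple[Counter, int]:
    """Merge per-sample variant sets into variant counts and a pool size."""
    counts: Counter = Counter()
    size = 0
    for variants in samples:
        counts.update(variants)  # each sample contributes at most 1 per variant
        size += 1
    return counts, size


# Made-up example with three samples.
a = ("1", 150, "A", "G")
b = ("2", 300, "C", "T")
c = ("X", 42, "G", "A")
counts, size = pool([{a, b}, {a}, {c}])
```

The pooled record no longer says which sample carried which variant, yet a database-wide frequency such as `counts[a] / size` is the same as it would be with individual uploads.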
Data model: coverage profiles
Current and future work
Conclusions
We propose a shared variant database to facilitate filtering of observed variants that have a high frequency in the population
Application: Shared variant database for Dutch medical research centers, coordinated by NBIC BioAssist. (trac.nbic.nl/dvd)
Application: Variant frequency database for LUMC clinical diagnostics NGS pipeline.
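A minimal sketch of the filtering use case: given frequencies obtained from a shared database, variants that are common in the population are dropped from a patient's list. The function name, the 5% threshold and the example frequencies are assumptions chosen for illustration; variants unknown to the database are treated here as having frequency 0 and are kept.

```python
from typing import Dict, List, Tuple

# (chromosome, position, reference, observed): a hypothetical variant key.
VariantKey = Tuple[str, int, str, str]


def filter_common(observed: List[VariantKey],
                  frequencies: Dict[VariantKey, float],
                  max_frequency: float = 0.05) -> List[VariantKey]:
    """Keep variants whose database frequency is unknown or at most max_frequency."""
    return [v for v in observed if frequencies.get(v, 0.0) <= max_frequency]


# Made-up example frequencies.
common = ("1", 150, "A", "G")
rare = ("2", 300, "C", "T")
unseen = ("X", 42, "G", "A")
db_frequencies = {common: 0.40, rare: 0.01}

remaining = filter_common([common, rare, unseen], db_frequencies)
```

The threshold would in practice depend on the disease model and on how many samples the database contains.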
github.com/martijnvermaat/varda
Questions?
Acknowledgements:
Jeroen Laros
Michiel van Galen
Ivo Fokkema
Peter Taschner
Johan den Dunnen
Leon Mei (LUMC, NBIC)
David van Enckevort (NBIC)
Pieter Neerincx (UMCG)
Morris Swertz (UMCG)