Supporting Curation Through ClinGenDB
TRANSCRIPT
Using ClinGenDB for Curation
Sharon E. Plon, MD, PhD, FACMG
ClinGenDB Disease Area Curation Tool
ClinVar/ClinGen System Interactions
[Diagram. Labels: OMIM; Patient Registries; EHR Interface; dbGaP; LSDBs; Labs (Genotypes & Phenotypes); PharmGKB; Population Datasets; Medical Lit; Machine Learning Algorithms; Crowd-sourced Curation; Expert Curation of Genes and Variants by Clinical Domain and Disease Area Workgroups (Clinical Domain WGs, Disease WGs); Gene Resource (Medical Exome, Actionability); CNV Curation Tool (JIRA); Case-level Data; Variant-level Data; ClinVar; Expert Curated Variants; Application Interface; External Informatics Activities Enabled; Portal for the Public; Data access tiers: Private, Controlled Access, Public Access.]
ClinGenDB
Design and Implementation Team
Aleksandar Milosavljevic, PhD, Bioinformatics Research Laboratory (BRL)
BCM Contributors: Hailin Chen, Xin Feng, Andrew R. Jackson, Sameer Paithankar, Sharon Plon
Stanford Contributors: Tam Sneddon, Mike Cherry, Carlos Bustamante
ClinGenDB Design and Implementation
1. Curation Tools and Interfaces
2. Application Programming Interfaces (APIs)
3. Database Features and Document Models
4. Data Import From and Export Into ClinVar
5. Data Linking / Import from Other Sources
ClinGenDB – Data Entry – MYO6
Which documents a user can edit is determined by the user’s role.
• That is, the user’s role in the Genboree group containing the knowledgebase.
Search (initially) by document/variant name.
Aleks Milosavljevic and colleagues
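The role-based editing and name search described on this slide could be sketched roughly as follows. This is a minimal illustration only, not the Genboree implementation: the role names in `EDIT_ROLES`, the `Group` class, and both helper functions are hypothetical, since the slide states only that editing depends on the user's role in the group and that search is (initially) by name.

```python
from dataclasses import dataclass, field

# Hypothetical role names (assumption): the slide does not list the roles.
EDIT_ROLES = {"author", "administrator"}


@dataclass
class Group:
    """A group containing a knowledgebase, with a role per member."""
    name: str
    roles: dict = field(default_factory=dict)  # user name -> role name


def can_edit(user: str, group: Group) -> bool:
    """Return True if the user's role in this group permits editing."""
    return group.roles.get(user) in EDIT_ROLES


def search_by_name(documents: dict, query: str) -> list:
    """Case-insensitive substring search over document/variant names."""
    q = query.lower()
    return sorted(name for name in documents if q in name.lower())
```

For example, a member with a read-only role would fail `can_edit`, while `search_by_name(docs, "myo")` would return every document whose name contains "myo".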
ClinGenDB – Literature Extraction – PTEN
Comprehensive records, with facts extracted into sub-properties.
“Literature Curation Evidence” is an open-ended list of zero or more “publicationTitle” sub-properties.
• More entries can be added as needed.
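One way to picture this document model is as a nested record whose evidence list can grow without limit. The sketch below is an assumption-laden illustration: only the names "Literature Curation Evidence" and "publicationTitle" come from the slide, while the `gene` and `properties` keys, the helper functions, and the placeholder titles are hypothetical.

```python
def new_gene_document(gene):
    """Create a document with an empty (zero-item) evidence list."""
    return {"gene": gene, "Literature Curation Evidence": []}


def add_publication(doc, title, **facts):
    """Append one publication entry; extracted facts become sub-properties.

    The "properties" key is a hypothetical container for those facts.
    """
    entry = {"publicationTitle": title, "properties": dict(facts)}
    doc["Literature Curation Evidence"].append(entry)
    return entry


# Placeholder titles and facts, not real references.
doc = new_gene_document("PTEN")
add_publication(doc, "Example publication A", studyType="case report")
add_publication(doc, "Example publication B")
```

Because the list is open-ended, a curator can keep appending entries without any change to the document's shape.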
Data Linking / Import from Open Sources
Accomplished
• Identified computer/API-accessible LSDBs for 56 ACMG genes
• Identified unique variants not covered by ClinVar (Xin and Steven)
• Annotation of imported variants using population data (dbSNP, 1000 Genomes, ESP6500)
In Progress
• Implement links to LSDB records from variant entries in ClinGenDB to facilitate variant curation
• Assessing the quality of LSDBs (Steven)
  • Amount of curation, submission type, phenotype information, etc.
Future
• Design and implement methods for data linking or import from LSDBs and other sources
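The two accomplished steps above (finding LSDB variants that ClinVar lacks, then annotating them with population data) can be sketched as a set difference followed by a lookup. This is only an illustration under stated assumptions: the `(chrom, pos, ref, alt)` key, the field names, and the frequency table layout are hypothetical, and the slides do not describe how variants were actually matched.

```python
def variant_key(v):
    """Hypothetical identity for a variant: position plus alleles."""
    return (v["chrom"], v["pos"], v["ref"], v["alt"])


def not_in_clinvar(lsdb_variants, clinvar_variants):
    """Return LSDB variants whose key does not appear in ClinVar."""
    known = {variant_key(v) for v in clinvar_variants}
    return [v for v in lsdb_variants if variant_key(v) not in known]


def annotate(variants, population_freqs):
    """Attach per-source allele frequencies (empty dict if none known).

    population_freqs maps a variant key to {source_name: frequency}.
    """
    annotated = []
    for v in variants:
        record = dict(v)  # copy so the input is left unchanged
        record["population_af"] = population_freqs.get(variant_key(v), {})
        annotated.append(record)
    return annotated
```

Matching on exact coordinates and alleles is the simplest choice; a real pipeline would likely need to normalize variant representations first.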
ClinGenDB linking to external databases
Data Import From and Export Into ClinVar
Accomplished
• Import of data from the ClinVar VCF file
In Progress
• Import of data from the ClinVar XML file
  • Issues being populated in JIRA
• Export of data into ClinVar
  • Now working on the ClinVar “minimal-submission” process
• Document the details of data-flow between ClinGen and ClinVar
Future
• Make modifications to support the new data model under development by the Data Model WG
• Implement an ID system for variants in ClinGen
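As a rough idea of what importing from a VCF file involves, the sketch below reads the eight fixed VCF columns and splits the INFO column into key/value pairs. It is a generic, standard-library-only illustration, not the ClinGenDB importer; the function names and the output record layout are assumptions, and it does not interpret any ClinVar-specific INFO keys.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; bare flags map to True."""
    fields = {}
    if info == ".":
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key] = value.split(",") if sep else True
    return fields


def parse_vcf_line(line):
    """Parse the eight fixed, tab-separated columns of one data line."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, _qual, _filter, info = cols[:8]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": None if vid == "." else vid,
        "ref": ref,
        "alt": alt.split(","),
        "info": parse_info(info),
    }


def read_vcf(lines):
    """Yield one record per data line, skipping headers and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_vcf_line(line)
```

Passing an open file handle to `read_vcf` would stream records one at a time, which keeps memory use flat for large files.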
Curation Tools and Interfaces
Accomplished
• Working prototype of interface implemented and tested with a single user (Tam Sneddon)
• First cycle of feedback collected using ticketing system and used to improve the interface
In Progress
• Tool that automatically generates and justifies preliminary pathogenicity assertion based on ACMG guidelines
Future
• Implement new ACMG variant classification guidelines upon their official approval
• Update document to be consistent with new data model
• Support a curation jamboree
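The in-progress tool that generates and justifies a preliminary assertion could, in principle, work as a small rule engine over weighted evidence. The sketch below is purely illustrative: the rule table, the evidence codes (`E1`, `E2`), and the strength labels are hypothetical placeholders, not the ACMG criteria and not the tool's actual logic, which the slides do not describe.

```python
from collections import Counter

# ILLUSTRATIVE rules only (assumption): each rule names the minimum count
# of evidence items required at each strength. Rules are tried in order.
RULES = [
    ("pathogenic", {"strong": 2}),
    ("likely pathogenic", {"strong": 1, "moderate": 1}),
]


def preliminary_assertion(evidence):
    """evidence: list of (code, strength) tuples.

    Returns the first matching assertion together with a justification
    that names the rule and the evidence items that satisfied it.
    """
    counts = Counter(strength for _code, strength in evidence)
    for label, required in RULES:
        if all(counts[s] >= n for s, n in required.items()):
            used = [code for code, s in evidence if s in required]
            need = ", ".join(
                ">={} {}".format(n, s) for s, n in required.items()
            )
            reason = "rule [{}] met by {}".format(need, ", ".join(used))
            return {"assertion": label, "justification": reason}
    return {
        "assertion": "uncertain significance",
        "justification": "no rule met",
    }
```

Keeping the rules in a data table rather than in code would make it easier to swap in a new rule set once updated guidelines are approved.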
Questions?