cafe variome: connecting diagnostic networks, disease consortia and diverse third parties - raymond...

Openly share the ‘existence’ rather than the ‘substance’ of the data …thereafter variably manage data access

Upload: human-variome-project

Post on 10-May-2015


DESCRIPTION

The Cafe Variome approach changes the nature of the problem by converting it into the challenge of enabling fully open and comprehensive “data discovery” (i.e., making the existence rather than the substance of the data openly accessible), for example between networks of diagnostic laboratories or disease consortia that know and trust each other and share an interest in certain causative genes or diseases. This provides a mechanism for the discovery of rare sequence variants or patients with rare disease phenotypes.

Cafe Variome is not a database, but instead aims to be a “shop window” for openly searching and discovering what data exist. The system allows users to openly search the full content of the data, in sophisticated ways, and thereby determine whether or not a record of interest exists in an information resource. Users of the system can subsequently access the hit data (according to pre-set permissions) under one of the following conditions:

• Open Access: the user may access variant and patient records directly and freely
• Linked Access: the user can view summary data and is provided with a link to the source database to access the full record
• Restricted Access: the user may access variant and patient records if they belong to a pre-approved group; otherwise they must request access to the full record from the data owner

Cafe Variome offers a complete data sharing software solution (either hosted or “in-a-box”), all controlled by an intuitive administrator dashboard, which gives owners complete control over their data and installation. Dashboard configuration options include a content management system for adding/editing custom pages and menus; full control over site appearance (logo, colours, backgrounds, themes); easy import of source data via templates; control over how searches are performed and results are displayed (ordering and specifying of fields, and which fields can be searched); and a comprehensive access-control system for users and groups.
A sophisticated “Google-like” query interface allows users to form complex queries to interrogate and discover data across an installation. Additionally, multiple installations can be connected to form federated networks, allowing controlled queries across nodes within the network. Each variant in the system can be annotated with any number of terms from any of the NCBO BioPortal phenotype ontologies. This flexibility allows a variant to be associated with a single disease term or with a complex combination of phenotype descriptions. An admin tool generates an up-to-date searchable term tree for all ontologies used in the annotations. This functionality makes use of the BioPortal API to ensure that the latest versions of all ontologies, and their associated terms, are available to the user.

TRANSCRIPT

Page 1: Cafe Variome: connecting diagnostic networks, disease consortia and diverse third parties - Raymond Dalgleish

Openly share the ‘existence’ rather than the ‘substance’ of the data…thereafter variably manage data access

Page 2:

Connecting Diagnostic Networks

• Need to enable disease consortia to identify patients with similar phenotypes or to identify patients harbouring the same variant(s)

• Currently not possible due to difficulties of data sharing between labs, or with central repositories

• Cafe Variome can solve this... Simple to install and can be deployed either
– on a server at one or more of a network of labs
– or, hosted by the Cafe Variome team

Page 3:

The Cafe Variome Solution

• Allows 'open discovery' of the existence (rather than actual substance) of relevant data

• Thereby, enables networks of labs to easily query for the existence of patients or variants, without necessarily revealing additional underlying data, thus overcoming issues of patient confidentiality & data ‘ownership’

• Currently being extended to support more sophisticated omics/NGS data handling and deep phenotype data

Page 4:

Cafe Variome Features

• Cafe Variome is not a database but is a searchable 'menu'

• The platform enables data owners/submitters to specify and update lists of who can search for records of interest (using various search parameters)

• Results can be returned to users:
– as open data
– as links to data at source
– by computationally facilitating data-access requests

Page 5:

- Allows users to check whether the same variant(s)/patients (with related phenotypes) have previously been seen by other laboratories

- Supports multiple installs & federated searches (data remains at source)

[Diagram labels: Networks of labs exchanging data; Optional wider discovery; Clinical Community; Research Community; CENTRAL (optional)]
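
The federated search idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `federated_count` function, the lab names and the variant strings are all hypothetical placeholders, not part of Cafe Variome or its API. The point it shows is that each node answers with a count only, so the records themselves stay at source.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One installation in a hypothetical federated network."""
    name: str
    variants: set = field(default_factory=set)

    def count_matches(self, variant: str) -> int:
        # Only an existence count leaves the node; records stay at source.
        return 1 if variant in self.variants else 0


def federated_count(nodes, variant):
    """Ask every node for a count and collect the answers by node name."""
    return {node.name: node.count_matches(variant) for node in nodes}


# Placeholder data, invented for illustration only.
lab_a = Node("lab_a", {"c.100A>G"})
lab_b = Node("lab_b", {"c.100A>G", "c.200C>T"})
lab_c = Node("lab_c")

hits = federated_count([lab_a, lab_b, lab_c], "c.100A>G")
print(hits)
```

A lab that sees a non-zero count for another node would then follow that node's sharing policy to request or view the underlying record.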

Page 6:

Data Sharing Models (facilitated & controlled access)

Open Discovery – Reporting Existence of Patients/Variants in Sources

Open Access

Core info for each record is shown & made available for download

Linked Access

No data, only link to the data source is reported

Source DB resource

Access control then managed by source DB

Restricted Access

Core or full record details are provided per record, if:
• User is pre-approved by group-access permissions set by data owner
• Access is approved after facilitated email request to the data owner
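
The three sharing models can be read as a small decision rule applied to each discovered record. The sketch below is one possible reading, not the Cafe Variome implementation: `Policy`, `resolve_access`, the group names and the returned strings are all assumptions made for illustration, and the URL is a placeholder.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical labels for the three sharing models on the slide."""
    OPEN = "open"
    LINKED = "linked"
    RESTRICTED = "restricted"


def resolve_access(policy, user_groups=(), approved_groups=(), source_url=""):
    """Return what a user is offered for one discovered record."""
    if policy is Policy.OPEN:
        return "show record"
    if policy is Policy.LINKED:
        # No data is shown; access control is left to the source database.
        return "link: " + source_url
    # Restricted: pre-approved groups see the record, everyone else must ask.
    if set(user_groups) & set(approved_groups):
        return "show record"
    return "request access from data owner"


print(resolve_access(Policy.RESTRICTED, ["consortium_x"], ["consortium_x"]))
print(resolve_access(Policy.RESTRICTED, ["other_lab"], ["consortium_x"]))
print(resolve_access(Policy.LINKED, source_url="https://example.org/record/1"))
```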

Page 7:

Record Discovery “Menu”

Google-like search queries: AND/OR, fuzzy, boosting, etc.

A count of hits in each data source is returned and grouped by the sharing policy
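
As a rough illustration of that result format, the following Python sketch counts matching records per data source and sharing policy. The record layout, the `count_hits` function, and the source and gene names are hypothetical placeholders; the real system's search engine and result structure are not described in the slides.

```python
from collections import Counter

# Placeholder records, invented for illustration only.
records = [
    {"source": "lab_a", "policy": "open", "gene": "GENE1"},
    {"source": "lab_a", "policy": "open", "gene": "GENE1"},
    {"source": "lab_a", "policy": "restricted", "gene": "GENE1"},
    {"source": "lab_b", "policy": "linked", "gene": "GENE1"},
    {"source": "lab_b", "policy": "open", "gene": "GENE2"},
]


def count_hits(records, gene):
    """Count matching records per (source, sharing policy) pair."""
    counts = Counter(
        (r["source"], r["policy"]) for r in records if r["gene"] == gene
    )
    return dict(counts)


for (source, policy), n in sorted(count_hits(records, "GENE1").items()):
    print(source, policy, n)
```

Only the counts are reported at this stage; which of the hits a user can open is decided afterwards by each record's sharing policy.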

Page 8:

Cafe Variome Variant Report

Page 9:

Data Sharing Granularity

Data owners can control access to variants from individual record level to entire data sets
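
One simple way to picture this granularity is a per-source default policy that individual records may override. The Python sketch below assumes exactly that model; `source_defaults`, `effective_policy` and the record fields are hypothetical names chosen for illustration, and the slides do not say how Cafe Variome stores these settings.

```python
# Hypothetical data-set level defaults, one per source.
source_defaults = {"lab_a": "restricted", "lab_b": "open"}

# Placeholder records; a "policy" key is a record-level override.
records = [
    {"id": 1, "source": "lab_a"},
    {"id": 2, "source": "lab_a", "policy": "open"},
    {"id": 3, "source": "lab_b", "policy": "linked"},
]


def effective_policy(record, source_defaults):
    """A record-level policy wins; otherwise fall back to the source default."""
    return record.get("policy", source_defaults[record["source"]])


policies = {r["id"]: effective_policy(r, source_defaults) for r in records}
print(policies)
```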

Page 10:

Administrator’s Interface

Page 11:

Create Custom Groups of Labs

Page 12:

Assign Groups to Variant Sources

Users belonging to groups have pre-approved access to particular variant and patient data

Page 13:

Bulk Data Import Templates

• Make data import as flexible as possible

• Allows users to generate import templates
– Excel or tab-delimited
– Specify which data fields
– Populate with their data
– Import into CV
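
The template workflow (choose fields, generate a template, fill it in, import it) can be sketched for the tab-delimited case with Python's standard `csv` module. The function names, field names and sample row are hypothetical placeholders; the actual template layout used by Cafe Variome is not given in the slides.

```python
import csv
import io


def make_template(fields):
    """Build a tab-delimited template containing only a header row."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerow(fields)
    return buffer.getvalue()


def read_filled_template(text, fields):
    """Parse a filled-in template, rejecting files whose header was changed."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    if reader.fieldnames != list(fields):
        raise ValueError("header does not match the template fields")
    return list(reader)


# Placeholder field names and data, invented for illustration only.
fields = ["gene", "hgvs", "phenotype"]
template = make_template(fields)
filled = template + "GENE1\tc.100A>G\texample phenotype\n"
rows = read_filled_template(filled, fields)
print(rows)
```

Checking the header before import is a design choice in this sketch: it catches files where columns were renamed or reordered after the template was generated.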

Page 14:

Phenotype Developments

• Allow the phenotypic consequences of genetic variants to be described using public ontologies
– Many terms from many ontologies can be associated with one variant or patient

• Also, allow the phenotypic consequences of genetic variants to be described using a local vocabulary or list

Enable hierarchical viewing and querying of the phenotype ontology data
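
Hierarchical querying usually means that a search for a broad term also finds records annotated with any of its more specific child terms. The Python sketch below shows that idea on a toy term tree. The terms, the `children` mapping and the variant identifiers are made up for illustration and are not taken from any real ontology or from the BioPortal API.

```python
# Toy parent -> children term tree, invented for illustration only.
children = {
    "skeletal abnormality": ["bone fragility", "short stature"],
    "bone fragility": ["recurrent fractures"],
}

# Placeholder annotations: each variant carries a set of phenotype terms.
annotations = {
    "variant_1": {"recurrent fractures"},
    "variant_2": {"short stature"},
    "variant_3": {"hearing loss"},
}


def descendants(term, children):
    """Return the term itself plus every term below it in the tree."""
    found = {term}
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def query(term, children, annotations):
    """Find variants annotated with the term or any of its descendants."""
    wanted = descendants(term, children)
    return sorted(v for v, terms in annotations.items() if terms & wanted)


print(query("skeletal abnormality", children, annotations))
print(query("bone fragility", children, annotations))
```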

Page 15:

Built on standards

• Cafe Variome is based on open-source software

• HVP Recommended System Status (RSS):
– HGVS nomenclature (RSS001)
– Mutalyzer (RSS002)
– LOVD (RSS003)
– VarioML (RSS004: under review)
– Locus Reference Genomic (RSS005)
– VariO (RSS006: under review)

• Submitted to HVP for RSS review: May 2014

Page 16:

Summary

• CV is very flexible in terms of the content that it can hold
– gross disease/phenotype name or single variant
– or, detailed phenotype and thousands of variants
– (whole exome/genome scan, in next release)

• Each data source decides what data fields are included
– which of these are made discoverable & by whom
– which fields are shared if discovery searches hit a record
– deeper data sharing may be permitted to particular users

• The API (computer-computer interface) is straightforward, and so other data systems can easily be modified to 'talk to' Cafe Variome installations

Page 17:

• We can host a Cafe Variome for you, or you can run it locally:
– one Cafe Variome for the whole project
– one per site and federate these to act as a private network
– in all cases any number of different users can be given tailored access rights for discovery and data sharing

• It is simple to populate the system
– from various starting formats (we can help you with this)
– this can be done automatically and at your preferred interval, if you have data in other databases

• Key point — it is flexible, and designed to let the data find the data, without compromising patient privacy or researcher/clinician control and ownership of the data

Page 18:

Acknowledgements

• Anthony J Brookes
• Owen Lancaster ([email protected])
• Tim Beck
• Raymond Dalgleish
• The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement number 200754 (the GEN2PHEN project)