bioshare: making data useful without direct sharing: cafe variome and omics browsing - anthony...

Post on 22-Jan-2018

Making data useful without direct sharing: Café Variome and Omics browsing

• CANNOT: Data owners may have neither the time nor the funding to submit data manually, and/or the submission process and requirements are too complicated

• WILL NOT: Data owners receive little or no recognition or reward for releasing data, hence little incentive to try

• MUST NOT: Data owners may have good reasons for not sharing data (ethical, legal, competitive edge)

Issues that restrict sharing data

DATA SHARING IS IMPORTANT BUT DIFFICULT!

SO DO SOMETHING ELSE (AS WELL)

Share the ‘existence’ rather than the ‘substance’ of data

This technology (or similar) sits atop/alongside existing local DBs to provide discoverability and connectivity, without replacing or altering the local solutions

Use Cases/Collaborating Networks

• Designed to be flexible for a number of use cases

• Various groups using the tool in different ways

– Rare disease (variant is the “entity”)

– Patient centric (patient is the “entity”)

– Aggregate frequency (i.e. mutation seen with a frequency of X in a population)

Café Variome Features

• Café Variome is not a database but a searchable ‘menu’

• The platform enables data custodians to specify which users can search for, display counts of, or display details of, which subsets of records and record fields, using various search parameters

• Results can be returned to users:

– as core data

– as links to data at source

– by computationally facilitating data access requests
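The custodian controls described above (who may search, who may see counts, who may see details) can be sketched as an ordered set of permission levels. This is a minimal illustration only, not Café Variome's actual data model: the `Access` levels, the `SourcePolicy` class, the group names and the `respond` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical permission levels, ordered from least to most revealing."""
    NONE = 0
    SEARCH = 1    # user may query and learns only whether matches exist
    COUNTS = 2    # user may also see how many records match
    DETAILS = 3   # user may also see the matching records


@dataclass
class SourcePolicy:
    """Per-source rules set by the data custodian (illustrative only)."""
    default: Access = Access.NONE
    per_group: dict = field(default_factory=dict)  # group name -> Access

    def level_for(self, groups):
        # A user gets the most permissive level granted to any of their groups.
        granted = [self.per_group.get(g, self.default) for g in groups]
        return max(granted, default=self.default)


def respond(policy, groups, matches):
    """Shape a query response according to the caller's permission level."""
    level = policy.level_for(groups)
    if level == Access.NONE:
        return {"allowed": False}
    response = {"allowed": True, "exists": len(matches) > 0}
    if level >= Access.COUNTS:
        response["count"] = len(matches)
    if level >= Access.DETAILS:
        response["records"] = list(matches)
    return response
```

Ordering the levels lets each higher level include everything the lower ones reveal, so a single comparison decides what goes into the response.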

• Federated Café Variome network

• Nodes populated with local data

• Data discovery/sharing options under control of each source

• Data remain at each source

• Search interface enables real-time data/subject discovery

• Each discovered record is reported in one of 3 ways, dependent on the user's permissions and the data source's settings

– Open Access: data provided

– Restricted Access: interface facilitates an email to request data, followed by data supply if/when approved

– Linked Access: no data provided, only a link to the data source

Source DB

• Each source can control which data fields are searchable and which fields are (potentially) then returned
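As a rough sketch of how the three reporting modes and the per-source field controls could fit together, the example below filters queries by searchable fields and shapes each hit by access mode. Everything here is assumed for illustration: the `Source` class, the mode strings, the field names, and the `example.org` addresses are hypothetical and not taken from Café Variome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A data source and its discovery settings (hypothetical structure)."""
    name: str
    mode: str               # "open", "restricted" or "linked"
    searchable: frozenset   # fields a query is allowed to filter on
    returnable: frozenset   # fields that may be shown for a match
    url: str = ""           # where the data lives (linked access)
    contact: str = ""       # who to ask for the data (restricted access)


def report(source, record):
    """Report one discovered record in the way its source allows."""
    if source.mode == "open":
        visible = {k: v for k, v in record.items() if k in source.returnable}
        return {"source": source.name, "data": visible}
    if source.mode == "restricted":
        return {"source": source.name, "request_access_from": source.contact}
    if source.mode == "linked":
        return {"source": source.name, "link": source.url}
    raise ValueError(f"unknown access mode: {source.mode!r}")


def search(source, records, query):
    """Match records on exact field values, using searchable fields only."""
    blocked = set(query) - source.searchable
    if blocked:
        raise PermissionError(f"fields not searchable: {sorted(blocked)}")
    hits = [r for r in records if all(r.get(k) == v for k, v in query.items())]
    return [report(source, r) for r in hits]
```

In this sketch a record can be discoverable through a field that is never returned, which mirrors the split between searchable and displayed fields.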

Simple Query Interface

Complex Query Interface

Query Builder in action

Controlled Display of Matched Record Counts & Data (if permitted)

Phenotype Semantics

• Allow the phenotypic consequences of genetic entities to be described using public ontologies

– Many terms from many ontologies can be associated with one entity

• Allow the phenotypic consequences of genetic entities to be described using a local vocabulary or list

• Enable hierarchical viewing and querying of the phenotype ontology data
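Hierarchical querying can be illustrated with a small sketch: a query for a term also matches entities annotated with any of its descendant terms. The term identifiers (`TOY:...`, `local:...`), the `parent_of` mapping and the function names below are invented for the example; a real deployment would load terms from a public ontology or a local vocabulary.

```python
from collections import defaultdict


def build_children(parent_of):
    """Invert a child -> parents mapping into parent -> children."""
    children = defaultdict(set)
    for child, parents in parent_of.items():
        for parent in parents:
            children[parent].add(child)
    return children


def descendants(term, children):
    """Collect every term below `term` in the hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def find_entities(term, annotations, children):
    """Return entities annotated with `term` or any of its descendants."""
    wanted = {term} | descendants(term, children)
    return sorted(e for e, terms in annotations.items() if wanted & set(terms))


# Toy hierarchy: TOY:2 and TOY:3 are kinds of TOY:1; TOY:4 is a kind of TOY:3.
parent_of = {"TOY:2": ["TOY:1"], "TOY:3": ["TOY:1"], "TOY:4": ["TOY:3"]}
children = build_children(parent_of)

# One entity may carry many terms, including terms from a local list.
annotations = {
    "variant_a": ["TOY:4", "local:short-stature"],
    "variant_b": ["TOY:2"],
    "variant_c": ["local:short-stature"],
}
```

Querying `TOY:1` here would return both `variant_a` and `variant_b`, although neither is annotated with `TOY:1` directly.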

Node Search Options

• Searches are performed through one nominated head node

• Searches can be performed from any node in the network
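The two search topologies can be sketched in a few lines. The example below is an in-process toy, assuming a hypothetical `Node` class and `connect` helper; a real network would exchange queries over the wire, and only counts (never records) cross node boundaries in this sketch.

```python
class Node:
    """A network node holding local records (hypothetical, in-process only)."""

    def __init__(self, name, records):
        self.name = name
        self._records = records   # local data stays inside this node
        self.peers = []           # nodes this node is allowed to query

    def local_count(self, predicate):
        """Answer a query with a count only, so records remain at source."""
        return sum(1 for record in self._records if predicate(record))

    def federated_search(self, predicate):
        """Query this node and its peers; return {node name: match count}."""
        results = {self.name: self.local_count(predicate)}
        for peer in self.peers:
            results[peer.name] = peer.local_count(predicate)
        return results


def connect(nodes, head=None):
    """Wire up the network.

    With a head node, only the head can fan a search out to the others.
    Without one, every node can search every other node.
    """
    for node in nodes:
        if head is None or node is head:
            node.peers = [other for other in nodes if other is not node]
        else:
            node.peers = []
```

Calling `connect(nodes, head=nodes[0])` gives the head-node arrangement, while `connect(nodes)` gives the arrangement where any node can start a search.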

External searches

Head node

Internal searches

Installation wizard

Appearance preferences, content management system and statistics reporting

Core system settings, defining displayed and searchable fields, bulk import template configuration

Record and source management

User, group and record access control management

Multiple Admin Options

OmicsConnect:

• Enables collaborators, and (optionally) additional researchers, to view/explore 'omics' datasets

• Provides a mechanism for visual data discovery

• Provides a unified browser view of ‘omics’ data

• Separates data sharing into open, controlled and discoverable

• Copes with different data sources and formats

• Easy to set up and use

GWAS Central (www.gwascentral.org)

- comprehensive genetic association database

- aggregate data & extensive metadata

- links to data sources for primary data

E.g. visual meta-analysis: compares and contrasts 8 different studies

The Browser

OmicsConnect browser

Local Data

Remote Data

Data Sources (DAS, GFF3, BED, wiggle, BigWig, BigBed)

Files: FASTA, GFF3, GTF

Access Local Databases: MySQL, SQLite

Access Online Resources: GWAS Central, Ensembl, UCSC
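One practical detail when a browser combines several of these formats is that they count positions differently: BED uses 0-based, half-open coordinates, while GFF3 uses 1-based, inclusive coordinates. The sketch below normalises both to one convention. It is an illustration only, not OmicsConnect code; the `Interval` type and the parser function names are hypothetical, and only the minimal columns are handled.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A feature in one shared convention: 1-based, inclusive coordinates."""
    seqid: str
    start: int
    end: int
    name: str


def parse_bed_line(line):
    """Parse a tab-separated BED line (chrom, chromStart, chromEnd[, name])."""
    fields = line.rstrip("\n").split("\t")
    chrom, chrom_start, chrom_end = fields[0], int(fields[1]), int(fields[2])
    name = fields[3] if len(fields) > 3 else ""
    # BED starts are 0-based and ends are exclusive, so only the start shifts.
    return Interval(chrom, chrom_start + 1, chrom_end, name)


def parse_gff3_line(line):
    """Parse a nine-column GFF3 feature line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 columns")
    # Column 9 holds semicolon-separated key=value attributes.
    attributes = dict(
        pair.split("=", 1) for pair in fields[8].split(";") if "=" in pair
    )
    # GFF3 coordinates are already 1-based and inclusive.
    return Interval(fields[0], int(fields[3]), int(fields[4]), attributes.get("ID", ""))
```

With this normalisation, the BED line `chr1 99 200 featA` and a GFF3 line with start 100 and end 200 describe the same interval.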

Display Data

Simple Interface

Use new technologies

Low Demands on Resources

Platform Independent

No dependencies

OmicsConnect browser (Dalliance)

Track authentication

DAS track enabled when passphrase entered

DAS track not enabled as no passphrase entered
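The passphrase behaviour above can be sketched as follows. This is an assumption-laden illustration, not the OmicsConnect implementation: the `ProtectedTrack` class is hypothetical, and storing a salted PBKDF2 digest instead of the passphrase itself is a design choice made for this example.

```python
import hashlib
import hmac
import secrets


class ProtectedTrack:
    """A track that is only enabled when the right passphrase is supplied."""

    def __init__(self, name, features, passphrase=None):
        self.name = name
        self._features = list(features)
        self._salt = secrets.token_bytes(16)
        # A track created without a passphrase is treated as public.
        self._digest = self._hash(passphrase) if passphrase is not None else None

    def _hash(self, passphrase):
        # Keep a salted, slow hash so the passphrase itself is never stored.
        return hashlib.pbkdf2_hmac(
            "sha256", passphrase.encode("utf-8"), self._salt, 100_000
        )

    def is_enabled(self, passphrase=None):
        if self._digest is None:
            return True
        if passphrase is None:
            return False
        # Constant-time comparison avoids leaking information through timing.
        return hmac.compare_digest(self._digest, self._hash(passphrase))

    def features(self, passphrase=None):
        if not self.is_enabled(passphrase):
            raise PermissionError(f"track {self.name!r} is not enabled")
        return list(self._features)
```

A private track stays disabled until `is_enabled` is called with the matching passphrase, while a public track is always enabled.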

• Allows researchers to controllably serve their own omics data

– Authentication (public/private)

– User accounts

• Can return the available features for a specific file/genome segment

• Intuitive interface for upload and management of data, including validation

• Stylesheets: Instructions on how to format the data for viewing.

• Additional feature implementations to the DAS protocol:

– ‘Types’: Returns what data types exist in the DAS track

– ‘Summary’: Returns a summary of the data features per segment

– ‘Search’: Returns features based on a keyword given by the user

• Can be installed independently from OmicsConnect
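To make the additional queries concrete, the sketch below implements ‘Types’, ‘Summary’ and ‘Search’ over an in-memory list of features, plus a segment lookup. It does not reproduce the DAS wire format or the eDAS code; the `Feature` and `Track` classes and their method names are hypothetical, and "summary" is interpreted here as a per-segment count of each feature type.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    segment: str   # e.g. a chromosome name
    start: int
    end: int
    kind: str      # the feature's data type
    label: str


class Track:
    """In-memory stand-in for one served track (illustrative only)."""

    def __init__(self, features):
        self.features = list(features)

    def types(self):
        """'Types': which data types exist in the track."""
        return sorted({f.kind for f in self.features})

    def summary(self):
        """'Summary': number of features of each type, per segment."""
        counts = {}
        for f in self.features:
            counts.setdefault(f.segment, Counter())[f.kind] += 1
        return {segment: dict(c) for segment, c in counts.items()}

    def search(self, keyword):
        """'Search': features whose label contains the keyword (any case)."""
        needle = keyword.lower()
        return [f for f in self.features if needle in f.label.lower()]

    def features_in(self, segment, start, end):
        """Features overlapping a region of one segment (inclusive ends)."""
        return [
            f for f in self.features
            if f.segment == segment and f.start <= end and f.end >= start
        ]
```

Keeping each query as its own small method means a client can ask for a cheap summary first and request full features only for the segments it needs.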

eDAS ‘gate keeper’

Other Genome Browsers

Enhanced Distributed Annotation System (eDAS)

Raw Files

Local Databases

Online Resources

Remote Access

Local Access

OmicsConnect

Browser

Online Resources

OmicsConnect & eDAS

Acknowledgements

The research leading to these results has received funding from the EC under the 7th Framework Programme (FP7/2007-2013), grant agreements 261433 (BioSHaRE-EU) and 200754 (GEN2PHEN), and from the IMI project grants 115372 (EMIF) and 115736 (EPAD).

Tim Beck, Robert Hastings, Charalambos Chrysostomou, Robert Free, Adam Webb, Owen Lancaster, Dhiwagaran Thangavelu, Colin Veal

Morris Swertz et al.

Alliance

Consortium
