GenomeTrakr: Perspectives on Linking Internationally - Canada and IRIDA.ca
TRANSCRIPT
Perspectives on Linking Internationally – Canada and IRIDA.ca
Fiona Brinkman Professor, Department of Biochemistry and Molecular Biology
Adjunct, School of Computing Science, and Faculty of Health Sciences
Simon Fraser University Greater Vancouver, BC, Canada
[email protected] GenomeTrakr - Sept 23 2015 @fionabrinkman
Canada’s Integrated Rapid Infectious Disease Analysis Platform for Genomic Epidemiology
Interacting with,
complementing others International resources
Integrated Rapid Infectious Disease Analysis informatics platform supporting real-time infectious disease outbreak investigations
Goal
Rich genomic epi analysis: Public health agencies
Rapid, open genomic data release: Academia/Public
Genomics, Epidemiology, Clinical, Lab Data
User-friendly, web-accessible
User access control (e.g. public health workers vs public)
Automated assembly pipelines, Data analysis and visualization
Standards compliant, rich ontologies
Open source
National Public Health Agency
Provincial Public Health Agency Academic/Public
Will Hsiao Fiona Brinkman
Gary Van Domselaar
www.IRIDA.ca
Project Leaders: Fiona Brinkman – SFU, Will Hsiao – PHMRL, Gary Van Domselaar – NML
Simon Fraser University (SFU): Emma Griffiths, Geoff Winsor, Julie Shay, Matthew Laird, Bhav Dhillon
McMaster University: Andrew McArthur, Daim Sardar
European Food Safety Agency: Leibana Criado Ernesto, Vernazza Francesco, Rizzi Valentina
National Microbiology Laboratory (NML): Franklin Bristow, Aaron Petkau, Thomas Matthews, Josh Adam, Adam Olsen, Tara Lynch, Shaun Tyler, Philip Mabon, Philip Au, Celine Nadon, Matthew Stuart-Edwards, Morag Graham, Chrystal Berry, Lorelee Tschetter, Eduardo Toboada, Peter Kruczkiewicz, Chad Laing, Vic Gannon, Matthew Whiteside, Ross Duncan, Steven Mutschall
University of Lisbon: João Carriço
European Bioinformatics Institute: Melanie Courtot, Helen Parkinson
BC Public Health Microbiology & Reference Laboratory (PHMRL) and BC Centre for Disease Control (BCCDC): Judy Isaac-Renton, Patrick Tang, Natalie Prystajecky, Jennifer Gardy, Linda Hoang, Kim MacDonald, Yin Chang, Eleni Galanis, Marsha Taylor, Damion Dooley, Jennifer Law
University of Maryland: Lynn Schriml
Canadian Food Inspection Agency (CFIA): Adam Koziol, Burton Blais, Catherine Carrillo
Dalhousie University: Rob Beiko, Alex Keddy
IRIDA Design: Carefully designed and engineered software platform is just the starting point…
User Interface
Security
File system
Metadata Storage
Application logic
REST API
Workflow Execution Manager
Continuous Integration
Documentation
Federated database model
Addressing ontology gaps
Build On, Work With:
OBI
TypON
NGSOnto
NIAID-GSC-BRC core metadata
MIxS Ontology
NCBI Biosample etc
TRANS – Pathogen Transmission
EPO Exposure Ontology
Infectious Disease Ontology
CARD, ARO for AMR
USDA Nutrient DB
EFSA Comp. Food Consump. DB
Example gaps to fill: Improve Food ontologies, AMR data
Ontology: Describes types of entities and relations between them
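The definition above can be made concrete with a minimal Python sketch. It treats an ontology as a set of entity types plus the relations allowed between them. The entity types and predicates here (Isolate, Host, FoodSource, ResistanceGene, isolated_from, carries) are hypothetical and chosen only for illustration; they are not terms taken from IRIDA or from any of the ontologies listed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One allowed relation: subject type, predicate, object type."""
    subject_type: str
    predicate: str
    object_type: str


# Hypothetical entity types and relations, for illustration only.
ENTITY_TYPES = {"Isolate", "Host", "FoodSource", "ResistanceGene"}
RELATIONS = {
    Relation("Isolate", "isolated_from", "Host"),
    Relation("Isolate", "isolated_from", "FoodSource"),
    Relation("Isolate", "carries", "ResistanceGene"),
}


def is_valid(subject_type: str, predicate: str, object_type: str) -> bool:
    """Return True if the ontology declares this relation between known types."""
    if subject_type not in ENTITY_TYPES or object_type not in ENTITY_TYPES:
        return False
    return Relation(subject_type, predicate, object_type) in RELATIONS
```

With shared definitions like these, two systems can check that a statement such as "an isolate carries a resistance gene" means the same thing on both sides before exchanging data.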
Analytical Tool
Quality Control Module
Quality Metrics
Quality Control
IRIDA’S QA/QC Model
IRIDA Workflows: Portable and Transparent Pipelines
Use Galaxy as workflow engine
Version Controlled Pipeline Templates
1. Input files, parameters sent to Galaxy
2. Tools executed on Galaxy workers
3. Results downloaded from Galaxy
IRIDA UI/DB
Galaxy
Assembly Tools
Variant Calling Tools
…
REST API
Shared File System
Worker Worker
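The three numbered steps on this slide (send inputs and parameters to Galaxy, execute tools on Galaxy workers, download results) can be sketched in Python as follows. This is only an illustration of the control flow. `FakeGalaxy`, `run_pipeline`, and the workflow and file names are hypothetical stand-ins, not the real Galaxy API or IRIDA code.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGalaxy:
    """In-memory stand-in for a Galaxy server (hypothetical, not the real API)."""
    jobs: dict = field(default_factory=dict)

    def submit(self, workflow_version, input_files, parameters):
        # Step 1: input files and parameters are sent to Galaxy.
        job_id = len(self.jobs) + 1
        self.jobs[job_id] = {
            "workflow": workflow_version,
            "inputs": list(input_files),
            "parameters": dict(parameters),
            "state": "queued",
            "results": [],
        }
        return job_id

    def run_workers(self):
        # Step 2: tools are executed on Galaxy workers (simulated here).
        for job in self.jobs.values():
            if job["state"] == "queued":
                job["results"] = [f"{name}.out" for name in job["inputs"]]
                job["state"] = "ok"

    def download(self, job_id):
        # Step 3: results are downloaded from Galaxy.
        job = self.jobs[job_id]
        if job["state"] != "ok":
            raise RuntimeError(f"job {job_id} has not finished")
        return job["results"]


def run_pipeline(galaxy, workflow_version, input_files, parameters):
    """Run one version-pinned pipeline template from submission to results."""
    job_id = galaxy.submit(workflow_version, input_files, parameters)
    galaxy.run_workers()
    return galaxy.download(job_id)
```

Passing the workflow version explicitly mirrors the slide's point about version-controlled templates: the same inputs and the same template version should be re-runnable elsewhere.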
Example data analysis, visualization tools
IslandViewer/GenomeD3Plot – more flexible GI/VF/AMR visualization
Dhillon BK et al 2015 Nucl Acids Res PMID: 25916842
Laird MR et al 2015 Bioinformatics PMID: 26093150
www.pathogenomics.sfu.ca/islandviewer/
github.com/brinkmanlab/GenomeD3Plot/
“SNVPhyl” SNV analysis Integrating genomics, geographic data (led by NML, Rob Beiko, Dalhousie U)
http://kiwi.cs.dal.ca/GenGIS
SNVPhyl Software Demo by Aaron Petkau and IRIDA poster by Emma Griffiths at #ASMNGS meeting
Example data analysis, visualization tools
Challenges
… for IRIDA … for international linkages
Biggest challenge is NOT bioinformatics/software development
It’s sharing
Canada’s Public Health System Challenges
Provincial public health dept.
National laboratory
Local public health dept.
Provincial laboratory
Cases
Physicians Frontline lab
Information
Bioinformatics and Analytical Capacities
Info lost as aggregate data from Frontline lab to national PH labs
$ Disease Reporting
Information Sharing is Highly Complex
• Variety of agreements, legislation
• Lack of standards (metadata, legal requirements)
• Fears of data release during ongoing investigations, and IP concerns make provinces “risk averse”
Impact
• Impacts data sharing nationally, internationally
• Also: Lack of rich example data impacts ontology, data standards development (ability for computers to share)
Must communicate to countries: Get data sharing arranged early – both nationally and internationally
International data sharing
- Get data sharing arranged early
- Ensure allocation of adequate resources to set it up
- Share bioinformatics resources, code, parameters
- Share data examples
- Start simple: Agree on minimal genome metadata for rapid release
International data sharing Pt. 2
- Develop harmonized metadata for further data release
- Interoperable systems
- Current and well-maintained repositories
- Validation datasets for pipeline calibration
- User access control
Open access → Opens opportunities, discoveries
A new hope… MLISA (Multilateral Information Sharing Agreement)
• Canadian multi-jurisdictional legal agreement
• Establishes standards re sharing, usage, disclosure and protection of PH info for infectious diseases and PH events
• Technical annexes (for example for WGS) can be developed to clarify specifically which data are to be exchanged
PulseNet as a model for sharing (in part)
Can USA–Canada sharing be developed as a model?
Flight paths across North America. Outbreaks follow flight paths more closely than simple geographic distance.
IRIDA’s Role in International Data Sharing
1. Application ontology for genomic epidemiology
2. Metadata standardization
3. Interoperability
4. Sensitive field sharing secured via authorization
5. Privacy protection and data security
6. Compatible with International Health Regulations (2005)
7. Aims to support federated design, plus open data sharing
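Item 4 in the list above (sensitive field sharing secured via authorization) could look roughly like the following Python sketch, in which the same record is filtered differently depending on who asks. The field names and the `public_health` role are assumptions made for illustration; they do not come from IRIDA's actual data model or access rules.

```python
# Hypothetical field names and roles, for illustration only.
PUBLIC_FIELDS = {"organism", "collection_year", "country"}
SENSITIVE_FIELDS = {"patient_age", "postal_code"}


def visible_record(record: dict, role: str) -> dict:
    """Return only the fields of a record that the given role may see."""
    allowed = set(PUBLIC_FIELDS)
    if role == "public_health":
        # Authorized public health users also see the sensitive fields.
        allowed |= SENSITIVE_FIELDS
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "organism": "Salmonella enterica",
    "collection_year": 2015,
    "country": "Canada",
    "patient_age": 42,
    "postal_code": "V5A",
}
public_view = visible_record(record, "public")
agency_view = visible_record(record, "public_health")
```

Filtering by an allow-list means a field added later stays hidden from every role until someone explicitly classifies it.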
Sharing – via computers, people
It’s all about communication
Computers – ontologies, data standards are key
Humans – getting them together is key…
Brinkman Lab Kayaking Trip
Additional key trainee funding and computing infrastructure:
We’re hiring!