CSHALS 2013
![Page 1: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/1.jpg)
Alejandra Gonzalez-Beltran, University of Oxford e-Research Centre, UK
The ISA Infrastructure for the biosciences: from data curation at source to the linked data cloud
Conference on Semantics in Healthcare and Life Sciences (CSHALS)
Boston, USA, Feb 27 - Mar 1, 2013
![Page 2: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/2.jpg)
Outline
• The infrastructure: a metadata tracking framework in the biosciences: the format, a set of open source software tools and the user community
• The syntax and its implicit semantics
• The component of the infrastructure for mapping the syntax to ontologies
• A couple of mappings, architecture, conversion
![Page 3: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/3.jpg)
![Page 4: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/4.jpg)
Contextual information (metadata):
• Sample characteristics
• Technology and measurement types
• Instrument parameters
• …
![Page 5: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/5.jpg)
Need for a generic representation, applied to:
• microarray based experiments (MAGE)
• sequencing based experiments (SRA)
• flow cytometry based experiments (FuGE-Flow Cyt)
• mass spectrometry and NMR spectroscopy experiments (Metabolights and PRIDE)
![Page 6: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/6.jpg)
• Assist in the annotation and management of experimental metadata at source, supporting data provenance tracking
• Deal with high-throughput studies using one or a combination of omics and other technologies
• Empower users to take up community-defined checklists and ontologies
• Facilitate data sharing, re-use, comparison and reproducibility of experiments, submission to international public repositories
infrastructure
ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level. Rocca-Serra et al., 2010, Bioinformatics
![Page 7: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/7.jpg)
A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains.
Towards interoperable bioscience data. Sansone et al., 2012, Nature Genetics
![Page 8: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/8.jpg)
syntax (and its implicit semantics)
![Page 9: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/9.jpg)
![Page 10: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/10.jpg)
[Figure: structure of an ISA-TAB table. Material nodes (annotated with Characteristics[…], Factor Value[…] (independent variables), Material Type, Comment[…]) are linked through protocol processes (annotated with Date (day effect), Performer (operator effect), Parameter Value[…]) to data file nodes (Raw Data File, Derived Data File). Material transformations lead from Material to DATA.]

| Sample Name | Material Type | Hybridization Assay Name | Array Design REF | Array Data File | Protocol REF | Derived Array Data File |
|---|---|---|---|---|---|---|
| sample1 | genomic DNA | assay1 | A-AFFY-107 | assay1.cel | data normalization | assay1.txt |
| sample2 | genomic DNA | assay2 | A-AFFY-107 | assay2.cel | data normalization | assay2.txt |
| sample3 | genomic DNA | assay3 | A-AFFY-107 | assay3.cel | data normalization | assay3.txt |
![Page 11: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/11.jpg)
Tagging: from free text to ontology-based

• single intervention representation, free text annotation

| Source Name | Characteristics[organism] | Factor Value[perturbation agent] | Factor Value[dose] | Factor Value[duration] |
|---|---|---|---|---|
| individual1 | human | aspirin | high dose | 12 weeks |

• single intervention, ontology-based annotation

| Source Name | Characteristics[organism (obi:0100026)] | Term Source REF | Term Accession Number | Factor Value[chemical compound (CHEBI_37577)] | Term Source REF | Term Accession Number |
|---|---|---|---|---|---|---|
| individual1 | Homo sapiens | NCBITax | 9606 | aspirin | CHEBI | 1231354 |

| Factor Value[dose (OBI_0000984)] | Term Source REF | Term Accession Number | Factor Value[time (PATO_0000165)] | Unit | Term Source REF | Term Accession Number |
|---|---|---|---|---|---|---|
| low dose | LNC | LP30872-3 | 12 | week | UO | "0000034" |
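The ontology-based layout above relies on position: a Term Source REF and a Term Accession Number column qualify the annotated column immediately to their left. The following is a minimal sketch of reading such a row in Python, assuming a tab-delimited excerpt modeled on the slide's example; the function name `read_annotated_rows` and the simplified headers are illustrative assumptions, not part of the ISA tools, and Unit columns are not handled.

```python
import csv
import io

# Hypothetical tab-delimited excerpt modeled on the slide's example row.
ISA_TAB_EXCERPT = (
    "Source Name\tCharacteristics[organism]\tTerm Source REF\tTerm Accession Number\t"
    "Factor Value[chemical compound]\tTerm Source REF\tTerm Accession Number\n"
    "individual1\tHomo sapiens\tNCBITax\t9606\taspirin\tCHEBI\t1231354\n"
)

TERM_COLUMNS = ["Term Source REF", "Term Accession Number"]


def read_annotated_rows(text):
    """Pair each annotated column with the term source and accession that follow it."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = []
    for cells in reader:
        record = {}
        i = 0
        while i < len(header):
            name, value = header[i], cells[i]
            if header[i + 1:i + 3] == TERM_COLUMNS:
                # The two columns to the right qualify this column.
                record[name] = {"value": value, "source": cells[i + 1],
                                "accession": cells[i + 2]}
                i += 3
            else:
                # Free-text column: no ontology annotation attached.
                record[name] = {"value": value, "source": None, "accession": None}
                i += 1
        rows.append(record)
    return rows


rows = read_annotated_rows(ISA_TAB_EXCERPT)
print(rows[0]["Characteristics[organism]"])
```

A plain `csv.reader` is used rather than `csv.DictReader` because the qualifier headers repeat within one row, so a header-keyed dictionary would silently keep only the last pair.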
![Page 12: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/12.jpg)
Kohonen et al. The ToxBank Data Warehouse: a research cluster of 7
EU FP7 Health systems toxicology and toxicogenomics projects.
Health Care & Life Sciences Interest Group
ToxBank effort developed by Nina Jeliazkova
![Page 13: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/13.jpg)
• Make the semantics of ISA-TAB explicit, including materials & data entities & processes & their relationships
• Provide incentives for provision of ontology-based annotations in ISA-TAB datasets; exploit those annotations
• Augment ISA syntax with new elements (e.g. groups), facilitating the understanding & querying of experimental design
• Facilitate data integration & knowledge discovery/reasoning
![Page 14: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/14.jpg)
architecture
[Figure: architecture diagram. Components shown: ISA-TAB parser, isa2owl mapping parser, graph analysis, configuration file.]
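The components named on this slide suggest a mapping-driven conversion: parsed rows go in, a mapping says which columns become typed nodes, and linked statements come out. The sketch below illustrates that general idea only. It is not the isa2owl implementation or its configuration format; the `MAPPING` structure, the `http://example.org/` namespace and the `hasOutput` property are all hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # placeholder namespace, not a real vocabulary

# Hypothetical mapping: which columns become nodes of which class,
# and which property links consecutive nodes in a row.
MAPPING = {
    "node_types": {
        "Sample Name": EX + "Material",
        "Array Data File": EX + "DataFile",
    },
    "edge_property": EX + "hasOutput",
}


def convert(rows, mapping):
    """Turn parsed table rows into a set of (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        previous = None
        for column, class_iri in mapping["node_types"].items():
            value = row.get(column)
            if not value:
                continue
            node = EX + value
            triples.add((node, RDF_TYPE, class_iri))
            if previous is not None:
                # Link each node to the next one along the row, left to right.
                triples.add((previous, mapping["edge_property"], node))
            previous = node
    return triples


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines, sorted for stable output."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


rows = [
    {"Sample Name": "sample1", "Array Data File": "assay1.cel"},
    {"Sample Name": "sample2", "Array Data File": "assay2.cel"},
]
print(to_ntriples(convert(rows, MAPPING)))
```

Keeping the mapping as data rather than code is what would let the same conversion target different ontologies by swapping the configuration.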
![Page 15: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/15.jpg)
• Ontology search and automated tagging (relying on NCBO Bioportal services) on Google Spreadsheets
• Collaborative annotation; support for distributed users
• Version control & history
OntoMaton: a Bioportal powered ontology widget for Google Spreadsheets. Maguire et al., 2013, Bioinformatics
![Page 16: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/16.jpg)
![Page 17: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/17.jpg)
vocabularies
Experimental domain
Biomolecular domain
Chemical domain
Information domain
![Page 18: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/18.jpg)
Open Biological and Biomedical Ontologies (OBO) Foundry: OBI, GO, ChEBI, IAO, BFO
![Page 19: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/19.jpg)
ISA-OBI mapping
![Page 20: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/20.jpg)
ISA-SIO mapping
![Page 21: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/21.jpg)
Global metabolite profiling
faahKO dataset, available in Bioconductor (with ISA-TAB metadata)
Data subset: LC/MS peaks from the spinal cords of 6 wild-type and 6 FAAH (fatty acid amide hydrolase) knockout mice
![Page 22: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/22.jpg)
![Page 23: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/23.jpg)
• support different conversion modes (different levels of granularity)
• querying for ISA-TAB datasets, across multiple experiment types
• reasoning exploiting ontology annotations
• semantic validation of ISA-TAB datasets
• augmented annotation over native ISA syntax
• identification of gaps in ontological representations
• feedback of findings to community ontologies
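Once metadata from different experiment types share ontology identifiers, one query can span all of them. The sketch below shows that with an in-memory list of triples and a wildcard pattern matcher. The study names, the `usesTechnology` and `hasOrganism` properties and the `http://example.org/` namespace are invented for illustration; only the two NCBITaxon identifiers (9606 for Homo sapiens, 10090 for Mus musculus) are real.

```python
EX = "http://example.org/"  # placeholder namespace for hypothetical terms
HUMAN = "http://purl.obolibrary.org/obo/NCBITaxon_9606"
MOUSE = "http://purl.obolibrary.org/obo/NCBITaxon_10090"

# Hypothetical statements about three studies using two technologies.
TRIPLES = [
    (EX + "study1", EX + "usesTechnology", "DNA microarray"),
    (EX + "study1", EX + "hasOrganism", HUMAN),
    (EX + "study2", EX + "usesTechnology", "mass spectrometry"),
    (EX + "study2", EX + "hasOrganism", HUMAN),
    (EX + "study3", EX + "usesTechnology", "mass spectrometry"),
    (EX + "study3", EX + "hasOrganism", MOUSE),
]


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def studies_with_organism(triples, organism_iri):
    """Find studies annotated with an organism, whatever technology they used."""
    hits = match(triples, p=EX + "hasOrganism", o=organism_iri)
    return sorted({s for s, _, _ in hits})


print(studies_with_organism(TRIPLES, HUMAN))
```

The query matches on the organism identifier rather than a free-text label, which is why the microarray study and the mass spectrometry study are returned together.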
![Page 24: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/24.jpg)
Increasing level of structure for experimental metadata
Notes in Lab books
Spreadsheets & Tables (ISAtab metadata)
Facts as RDF statements
![Page 25: CSHALS 2013](https://reader033.vdocuments.net/reader033/viewer/2022051819/54c6444e4a7959f2328b45a8/html5/thumbnails/25.jpg)
@isatools @biosharing
isa-tools.org isacommons.org biosharing.org