neurolog: sharing neuroimaging data using an ontology
NeuroLOG ANR-06-TLOG-024
Software technologies for integration of process and data in medical imaging
http://neurolog.polytech.unice.fr
Bernard Gibaud¹, Gilles Kassel², Michel Dojat³, Bénédicte Batrancourt⁴, Franck Michel⁵, Alban Gaignard⁵, Johan Montagnat⁵
¹ INSERM / INRIA / CNRS / Univ. Rennes 1, IRISA Unit VISAGES U746, Rennes, France
² Univ. de Picardie Jules Verne, MIS, EA 4290, Amiens, France
³ INSERM U836 / Univ. J. Fourier, Institut des Neurosciences, Grenoble, France
⁴ INSERM / CNRS / Univ. Pierre et Marie Curie, CRICM, UMR_S975, Paris, France
⁵ CNRS / UNS, I3S lab, MODALIS team, Sophia Antipolis, France
AMIA 2011, Washington DC
Supported by
NeuroLOG: sharing neuroimaging data using an ontology-based federated approach
Software technologies for integration of process, data and knowledge in medical imaging
Collaborative neurosciences
• Major challenge for this century
  − Degenerative diseases
  − Population aging
  − Pathophysiology understanding
  − Development / validation of new drugs
• Need to foster collaborative research
  − Validation / comparison of methods
  − Multi-centric disease-targeting studies
  − Multi-centric cross-disease studies
⇒ Require the sharing of large amounts of resources
Collaborative research in neuroimaging
• What resources?
  – Image data and associated data (technical, clinical, etc.)
  – Image processing tools
  – Computing resources
• Strategies: 2 possible approaches
  − Centralized (e.g. ADNI)
  − Federated (e.g. BIRN, caBIG, neuGRID, @NeurIST)
Goals of the NeuroLOG project
• To set up a federated system, allowing the sharing and re-use of:
  − Neuroimaging data (images and related technical, demographic and medical metadata)
  − Processing tools published by cooperating partners
  − Computer processing resources (local, GRIDs)
• Three-year project (mid-2007 to end-2010)
  − Supported by ANR
  − Associating several French partners with complementary skills
⇒ This presentation focuses on the data sharing part of the project
Overall system design
Major design choices
• Federated system
  – federating independent legacy systems
  – flexibility for data organization
• Mediation
  – use of an a priori-defined ontology
  – consistent with a “Local-as-View” (LaV) approach rather than “Global-as-View” (GaV)
• Access control
  – security layer (X509 certificates + SSL channels)
  – security policy
⇒ Come up with a global federated view that hides data distribution and heterogeneity from the end-user
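The mediation choice listed above can be illustrated with a minimal Python sketch. It is not the NeuroLOG implementation: the site names, the attribute names and the `sources_for` helper are all hypothetical. The idea shown is only that, in a local-as-view setting, each site is described as a view over a global schema fixed in advance, so adding a site never changes that schema.

```python
# Hypothetical global schema, fixed a priori (in NeuroLOG it derives from the ontology).
GLOBAL_SCHEMA = {
    "Subject": {"subject_id", "sex", "birth_year"},
    "Dataset": {"dataset_id", "subject_id", "modality"},
}

# Local-as-view: every site declares which part of the global schema it can populate.
# Site names and coverage are made up for illustration.
SITE_VIEWS = {
    "site_a": {
        "Subject": {"subject_id", "sex"},
        "Dataset": {"dataset_id", "subject_id", "modality"},
    },
    "site_b": {"Subject": {"subject_id", "sex", "birth_year"}},
}


def sources_for(entity, attributes):
    """Return the sites whose declared view covers the requested attributes."""
    wanted = set(attributes)
    unknown = wanted - GLOBAL_SCHEMA[entity]
    if unknown:
        raise KeyError(f"not in global schema: {sorted(unknown)}")
    return sorted(
        site
        for site, view in SITE_VIEWS.items()
        if wanted <= view.get(entity, set())
    )
```

A call such as `sources_for("Subject", ["subject_id", "sex"])` lists the sites a mediator could query for that request; the end-user only ever names global entities and attributes.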
Ontology design
Ontology: scope
To assemble a common application ontology to provide a uniform and consistent modelling of shared information, e.g.:
  − Images (Datasets)
  − Image acquisition and image processing (Dataset processings)
  − Context of acquisition and exploitation of the images (Studies, Subjects, Examinations, Centers, etc.)
  − Results of other kinds of explorations (Subject data acquisition instruments, Instrument variables, Assessments, Scores, etc.)
Use of this ontology to integrate heterogeneous data ⇒ Common relational schema
Ontology: general approach
• Application ontology (called OntoNeuroLOG)
  − based on a common modelling framework
  − 3-level structure
    • One Foundational ontology, i.e. DOLCE
    • Several Core ontologies
    • Several Domain ontologies
• Major concerns
  − Re-use of existing ontologies (when applicable)
  − Documentation
DOLCE: an ontology of particulars
Particular
  Endurant
    Physical object
    Non-Physical object
      Mental object
      Social object
  Perdurant
    Event
      Achievement
      Accomplishment
    Stative
      State
      Process
  Quality
    Physical quality
    Temporal quality
  Abstract
    Region
      Time interval
(Masolo et al., 2003)
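The subsumption links among the DOLCE categories named on this slide can be sketched as a parent map in Python. The nesting follows the published DOLCE taxonomy, with intermediate categories omitted as on the slide; the `ancestors` and `is_a` helpers are illustrative names, not part of OntoNeuroLOG or of any tool mentioned in the talk.

```python
# Parent links for the DOLCE categories named on the slide (Masolo et al., 2003).
# Intermediate DOLCE categories are left out, so each link is a simplification.
PARENT = {
    "Endurant": "Particular",
    "Perdurant": "Particular",
    "Quality": "Particular",
    "Abstract": "Particular",
    "Physical object": "Endurant",
    "Non-physical object": "Endurant",
    "Mental object": "Non-physical object",
    "Social object": "Non-physical object",
    "Event": "Perdurant",
    "Stative": "Perdurant",
    "Achievement": "Event",
    "Accomplishment": "Event",
    "State": "Stative",
    "Process": "Stative",
    "Physical quality": "Quality",
    "Temporal quality": "Quality",
    "Region": "Abstract",
    "Time interval": "Region",
}


def ancestors(category):
    """Return the chain of parents from `category` up to the root."""
    chain = []
    while category in PARENT:
        category = PARENT[category]
        chain.append(category)
    return chain


def is_a(category, ancestor):
    """True if `category` equals `ancestor` or is subsumed by it."""
    return category == ancestor or ancestor in ancestors(category)
```

For example, `is_a("Process", "Perdurant")` holds while `is_a("Process", "Endurant")` does not, which is the kind of distinction a foundational ontology is meant to make explicit for the core and domain levels built on top of it.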
Ontology: 3-level structure
Application ontology (called OntoNeuroLOG)
• One Foundational ontology (DOLCE)
• Several Core ontologies
• Several Domain ontologies
Ontology: 3 representations
1. OntoSpec representation (Kassel, 2005)
   − Semi-formal notation (rich semantics)
   − Numerous axioms
2. OWL-Lite
   − Edited with PROTÉGÉ
   − Tailored to perform inferences with CORESE (search engine)
3. Federated relational schema
   − Entities and relations are closely linked to concepts and relations of the ontology
Example of OntoSpec representation
OntoNeuroLOG: OWL-Lite representation
Instruments’ descriptions
Federated Relational Schema
[Figure: federated relational schema, labelled "Ontology V2.2". Entities shown: Subject, Study, Dataset, Examination, Experimental Groups of Subjects, Instrument, Variables, Scores, Assessment, Dataset processing, MR protocol.]
Data integration
Software architecture
Integration mechanism
Data & Metadata management design
[Figure: data and metadata management design. Labels shown: NeuroLOG Client, Web services, NeuroLOG Server (Site 1 and Site 2), Data manager, Metadata manager, Data Federator, Site-specific DB, NeuroLOG DB, Local storage resource, Interface with other Services. Arrow labels: write results; read (jdbc interface); read; get/write files & datasets.]
Metadata mapping
• Data Federator: relational data mapping and federation tool (Business Objects / SAP)
[Figure: local databases (Shanoir, GIN-DMS, NeuroLOG) at the IRISA, GIN and I3S sites, each shown with a "Data Federator /local" and a "Data Federator /global" layer, under a global federated view.]
Global federated schema derived from project ontology (OntoNeuroLOG).
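The local-to-global mapping described on this slide can be sketched with Python's standard `sqlite3` module. This is only an analogy and makes several assumptions: Data Federator is a commercial product whose interface is not shown here, the two "site" tables live in one in-memory database rather than on separate servers, and every table name, column name and code value is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Two legacy tables with different layouts (hypothetical stand-ins for site databases).
CREATE TABLE site1_patients (pid INTEGER, gender TEXT);
CREATE TABLE site2_subjects (code TEXT, sex_code INTEGER);
INSERT INTO site1_patients VALUES (1, 'F'), (2, 'M');
INSERT INTO site2_subjects VALUES ('S-10', 1), ('S-11', 2);

-- Local mapping: each site exposes its rows in the shared 'subject' layout.
CREATE VIEW site1_subject AS
    SELECT 'site1' AS site, CAST(pid AS TEXT) AS subject_id, gender AS sex
    FROM site1_patients;
CREATE VIEW site2_subject AS
    SELECT 'site2' AS site, code AS subject_id,
           CASE sex_code WHEN 1 THEN 'M' ELSE 'F' END AS sex
    FROM site2_subjects;

-- Global federated view: the union of the per-site views.
CREATE VIEW subject AS
    SELECT * FROM site1_subject
    UNION ALL
    SELECT * FROM site2_subject;
""")

# A client queries the global view without knowing which site holds which row.
rows = conn.execute(
    "SELECT site, subject_id, sex FROM subject ORDER BY site, subject_id"
).fetchall()
```

The query at the end only names the global `subject` view; the per-site views absorb the differences in identifiers and sex coding, which is the role the slide assigns to the local mapping layer.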
Semantic data
Architecture: semantic module
[Figure: semantic module. Labels shown: Query interface, Semantic queries engine (CORESE), Semantic repository, Images meta-data (RDF), Ontology (OWL-Lite), METAMorphoses (SQL ↔ RDF), Mapping file (XML), Template file (XML), Data Federator. Arrows labelled: Querying.]
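The SQL ↔ RDF step named in this architecture (METAMorphoses) can be pictured with a small Python sketch that turns one relational row into N-Triples lines. It does not reproduce METAMorphoses, which is driven by XML mapping and template files; the namespace, the `MAPPING` dictionary, the property names and the `row_to_triples` helper are all assumptions made for illustration.

```python
# Hypothetical namespace and mapping; the real OntoNeuroLOG IRIs are not in the transcript.
NS = "http://example.org/neurolog#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

MAPPING = {
    "class": "Subject",
    "id_column": "subject_id",
    "properties": {"sex": "hasSex", "study": "participatesIn"},
}


def row_to_triples(row, mapping=MAPPING, ns=NS):
    """Convert one relational row (a dict) into a list of N-Triples lines."""
    subject = f"<{ns}{mapping['class']}/{row[mapping['id_column']]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{ns}{mapping['class']}> ."]
    for column, prop in mapping["properties"].items():
        value = row.get(column)
        if value is None:
            continue  # NULL columns produce no triple
        # Escape backslashes and double quotes for an N-Triples string literal.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        triples.append(f'{subject} <{ns}{prop}> "{escaped}" .')
    return triples


triples = row_to_triples({"subject_id": "S-10", "sex": "F", "study": None})
```

Each row yields one typing triple plus one triple per non-null mapped column, which is the shape of metadata a semantic engine such as CORESE could then query alongside the ontology.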
Software architecture: client software
Client – Querying metadata and accessing data
• Search datasets related to selected subjects and produced in selected studies
Client - Viewer
Semantic query example (in SPARQL): EDSS scores with ambulation score <= 300m
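The query shown on the original slide did not survive in this transcript. As a rough stand-in, the Python sketch below states a comparable SPARQL query as a string and evaluates the same pattern over a tiny in-memory triple list. Everything in it is assumed: the prefix, the property names, the assessment identifiers and the score values are invented for illustration, and the SPARQL text is not executed.

```python
# Hypothetical vocabulary; the real OntoNeuroLOG terms are not shown in the transcript.
SPARQL_SKETCH = """
PREFIX ex: <http://example.org/neurolog#>
SELECT ?assessment ?edss WHERE {
  ?assessment ex:hasEdssScore ?edss ;
              ex:hasAmbulationMeters ?meters .
  FILTER (?meters <= 300)
}
"""

# Tiny in-memory triple list standing in for the semantic repository (made-up values).
TRIPLES = [
    ("a1", "hasEdssScore", 6.0), ("a1", "hasAmbulationMeters", 100),
    ("a2", "hasEdssScore", 4.0), ("a2", "hasAmbulationMeters", 500),
    ("a3", "hasEdssScore", 5.0), ("a3", "hasAmbulationMeters", 300),
]


def edss_with_limited_ambulation(triples, max_meters=300):
    """Return (assessment, EDSS score) pairs whose ambulation is <= max_meters."""
    edss = {s: o for s, p, o in triples if p == "hasEdssScore"}
    meters = {s: o for s, p, o in triples if p == "hasAmbulationMeters"}
    return sorted(
        (s, edss[s])
        for s in edss
        if s in meters and meters[s] <= max_meters
    )


result = edss_with_limited_ambulation(TRIPLES)
```

The Python function mirrors the two triple patterns and the `FILTER` clause of the sketched query; in the system described here that work would be done by the semantic query engine rather than by hand-written code.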
Results: system deployment
NeuroLOG platform
• 5 sites federated
  – 4 legacy databases
  – 12 studies
  – > 70 subjects
    – MS
    – Brain tumors
    – AD
  – > 500 datasets
[Figure: map of the federated sites. Paris: IFR49 (La Pitié Salpêtrière). Rennes: IRISA (CHU Rennes). Grenoble: GIN (Michallon Hospital). Sophia Antipolis: I3S site (technical) and INRIA (Centre A Lacassagne). Databases shown: Gin-DMS, Neuro-DMS, CAC-DB.]
Discussion
Discussion: ontology
• Domains that could not be included
  − Ontology of Regions of Interest (image ROIs)
  − Ontology of anatomical structures
• Need for convergence with other ontologies, e.g. NIF
  − Need to map BFO-based ontologies to our DOLCE-based ontologies
Discussion: mediator
• Data Federator (SAP): the pros
  − Standard interface (jdbc)
  − Fully dynamic queries
  − Performance
  − User interface to define mappings
• Data Federator (SAP): the cons
  − No automatic re-configuring in case of error
  − NOT open source
• Alternative solutions
  − OGSA-DAI / OGSA-DQP (e.g. used in BIRN, @NeurIST)
  − Fully semantic integration framework
Conclusion / Perspectives
• We have presented a system for sharing image data and associated metadata according to an ontology-based federated approach
  − Most of our objectives (not all!) could be met
  − We are currently applying for a continuation of this work in focused neuroimaging research applications
• We remain convinced that our ontology-based approach is a promising one
  – Intelligent data querying and mediation
  – Complementary to regular relational querying
• However, it raises the issue of the convergence with other ontologies developed elsewhere, e.g.
  – NIF* / BIRN** in the USA
  – @NeurIST in Europe
* Neuroscience Information Framework
** Biomedical Informatics Research Network
Acknowledgements
• ANR, for its financial support
• All the people having contributed to NeuroLOG, especially
  − Farooq Ahmad
  − Christian Barillot
  − Pascal Girard
  − David Godard
  − Diane Lingrand
  − Grégoire Malandain
  − Mélanie Pélégrini-Issac
  − Xavier Pennec
  − Javier Rojas Balderrama
  − Bacem Wali
• And all our clinical partners, especially
  − Gilles Edan, Jean-Christophe Ferré and Jean François Lebas