
Health-e-Child: A Grid Platform for European Paediatrics

On behalf of the Health-e-Child Consortium (with thanks to David Manset, Maat-G)

CHEP07 3rd September 2007, Victoria, CA

Richard McClatchey, UWE-Bristol UK

Contents

• Project Objectives & Challenges
• Health-e-Child Gateway
• The HeC Grid architecture
• Data Integration & Ontologies
• Futures
• Conclusions

Motivation for the Project

• Clinical demand for integration and exploitation of heterogeneous biomedical information
  • vertical dimension – multiple data sources
  • horizontal dimension – multiple sites
• Need for generic and scalable platforms (Grid?)
  • integrate traditional and emerging sources
    – in vivo and in vitro
  • provide decision support
  • ubiquitous access to knowledge repositories in clinical routine
  • connect stakeholders in clinical research
• Need for complex integrated disease models
  • build holistic views of the human body
  • early disease detection exploiting in vitro information
  • personalized diagnosis, therapy and follow-up

Objectives of Health-e-Child

• Build enabling tools & services that improve the quality of care and reduce cost with
  • Integrated disease models
  • Database-guided decision support systems
  • Cross modality information fusion and data mining for knowledge discovery
• Establish multi-site, vertical, and longitudinal integration of data, information and knowledge
• Develop a GRID based platform, supported by robust search, optimisation and matching

[Figure: Health-e-Child concept diagram. Labels: Healthy Child; Decision Support Systems; Integrated Disease Modeling; Knowledge Discovery; Augment; Guidance; Enrich; Real-time alert; On-line learning; Observation Process; Sensors; Time; Integrated Medical Database. Data sources: Imaging, Genomics, Lab Data, Proteomics, Demographics, Physician Notes, Life Style. Vertical Data Integration levels: Population, Individual, Organ, Tissue, Cell, Molecule.]

Introduction

A Geographically Distributed Environment

[Figure: consortium partners, each marked as a Clinical Site or an R&D Site. Partners shown: GOSH, NECKER, UWE, CERN, IGG, SIEMENS, ASPER, UOA, INRIA, LYNKEUS, UCL, EGF, FGG, MAAT.]

Introduction

Applications Integration Challenge

[Figure: sites IGG, NECKER and GOSH.]

Highlights
- Different Networks: LANs, WANs, Internet
- Security Constraints: Local & National Regulations
- Bandwidth Limitations: LAN/WAN & Internet uplinks …


HeC System Overview

• Grid Infrastructure: databases, resource and user management, data security
• HeC Gateway: HeC specific models and Grid services like query processing, security
• Heart Disease Applications, Inflammatory Diseases Applications, Brain Tumour Applications
• Common Client Applications: user interface for authentication, viewing, editing, similarity search

Health-e-Child Technical Review Meeting, Paris, 17 January 2007

Data Management & Integration

The Health-e-Child Gateway

• Our building blocks are Services
• Our Gateway is a SOA
• Existing blocks are available in the community
  • WSRF Containers: Globus Toolkit 4, Tomcat, WSRF::Lite…
  • Data Access Layers: OGSA-DAI, AMGA…
  • File Transfer facilities: GridFTP…
• Applies to other distributed systems
• Our approach is to efficiently reuse and combine blocks to satisfy our requirements
• Building blocks might be enhanced, e.g. HeC Authentication Service


Gateway

• The HeC Gateway
  • An intermediary access layer to decouple client applications from the complexity of the grid
  • Towards a platform independent implementation
  • To add domain specific functionality not available in Grid middleware

Status
√ SOA architecture and design
√ implementation of privacy and security modules
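The idea of an intermediary access layer can be sketched in a few lines. This is only an illustration of the decoupling principle, not the HeC Gateway itself: the names GridBackend, InMemoryBackend and Gateway, the token check and the toy query semantics are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Protocol


class GridBackend(Protocol):
    """What the gateway needs from any middleware (hypothetical interface)."""

    def run_query(self, query: str) -> list[dict]: ...


@dataclass
class InMemoryBackend:
    """Stand-in for a real grid backend, used only to make the sketch runnable."""

    records: list[dict] = field(default_factory=list)

    def run_query(self, query: str) -> list[dict]:
        # Toy semantics: the "query" is a diagnosis label matched exactly.
        return [r for r in self.records if r.get("diagnosis") == query]


class Gateway:
    """Single access point: checks the credential, then delegates to the backend."""

    def __init__(self, backend: GridBackend, valid_tokens: set[str]) -> None:
        self._backend = backend
        self._valid_tokens = valid_tokens

    def query(self, token: str, diagnosis: str) -> list[dict]:
        if token not in self._valid_tokens:
            raise PermissionError("unknown credential")
        # Clients never see which middleware answers the query.
        return self._backend.run_query(diagnosis)
```

Because client code depends only on Gateway.query, the backend could in principle be swapped for another middleware without touching the clients, which is the platform independence the slide aims for.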


Our Approach

[Figure: User, Workstation and Gateway inside Hospital X.]

Highlights
- Simplicity, “all-in-one box”
- Modularity & Scalability, “off-the-shelf” components
- State-of-the-art Approaches

One Access Point per Institution
One Key to enter the system


Platform & Grid Middleware

The Health-e-Child Access Point

[Figure: One Access Point. Labels: Hosting Domain 1 (Stable & Secure Environment); Hosting Domain 2 (Stable & Secure Environment); Security & Registration; Job Management; HeC Gateway; Storage; HeC Scheduling; Monitoring; Info System; Computation Unit; Data Unit; HeC DBMS; 200GB, 50GB, 1TB; Health-e-Child EGEE gLite; HeC Gateway; Virtualization…]


The Platform

The Health-e-Child Gateway (1)

[Figure: Inside the box… Labels: Client Applications; One Access Point; HeC Gateway; Computing Resources; Functionality Access; Infrastructure Abstraction.]

The Platform

The Health-e-Child Gateway (2)

[Figure: Inside the box… Labels: Security; GT4; One Access Point; HeC Gateway.]

Platform & Grid Middleware

The Health-e-Child Gateway (3)

[Figure: Inside the box… Labels: Grid; GT4; One Access Point; HeC Gateway.]

The Platform

The Health-e-Child Gateway (4)

[Figure: Inside the box… Labels: Client Connectivity; GT4; One Access Point; HeC Gateway.]

The Platform

The Health-e-Child Gateway (5)

[Figure: Inside the box… Labels: Client Applications; One Access Point; HeC Gateway; Data Integration.]


Grid

• Grid technology (gLite 3.0) as the enabling infrastructure
  • A distributed platform for sharing storage and computing resources
• HeC Specific Requirements
  • Need support for medical (DICOM) images
  • Need high responsiveness for use in clinical routine
  • Need to guarantee patient data privacy: access rights management, storage of anonymized patient data only

Status
√ Testbed installation since May 2006
√ HeC Certificate Authority
√ HeC Virtual Organisation
√ Security Prototype (clients & services)
√ Logging Portal & Appender
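The requirement to store only anonymized patient data could be approached along the following lines. The slides do not say how HeC removes identifiers, so this is a hedged sketch of one common technique, keyed-hash pseudonymisation, which is weaker than full anonymisation. The field names, the site_key parameter and the pseudonymise function are hypothetical.

```python
import hashlib
import hmac

# Assumed field names for direct identifiers; a real schema would differ.
DIRECT_IDENTIFIERS = {"name", "address", "hospital_number"}


def pseudonymise(record: dict, site_key: bytes) -> dict:
    """Return a copy of the record without direct identifiers.

    The hospital number is replaced by a keyed hash, so the same patient
    maps to the same pseudonym at one site without exposing the number.
    """
    digest = hmac.new(
        site_key, record["hospital_number"].encode("utf-8"), hashlib.sha256
    )
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["pseudonym"] = digest.hexdigest()[:16]
    return cleaned
```

Only the output of pseudonymise would leave the hospital in such a scheme; the key stays on site. Quasi-identifiers such as dates or rare diagnoses would still need separate treatment.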


Grid Platform

Grid Architecture @ Month 18

[Figure: grid architecture diagram. Labels: Virtualization Applied; 2nd Test Node Deployed; Being deployed.]


Grid Platform

Solution Facts & Conclusion

- Test-bed Installed & Tested @ CERN
- Validated the Middleware Architecture (V1.0) to be installed at the different Sites
- However: Architecture still too centralized (VOMS, LFC…)
  - Being addressed in the second planning period


Vertical Integration

[Figure: vertical levels: Population, Individual, Organ, Tissue, Cell, Molecule.]

• Vertical Levels
  • Split the domain into vertical levels
  • Data/Information from multiple levels
  • Implicit semantics connecting these levels
• Vertical Integration in applications
  • Data models should provide seamless access to all the information stored in HeC
• Vertically Integrated Modelling
  • Coherently integrate information from different levels (instances)
  • Provide integrated view of the data to the users (concept-oriented views, etc.)
  • Queries against the integrated knowledge
• Topics
  • Vertical Integration along the longitudinal axis in paediatrics
  • Semantics of relevance for presenting clinical data
  • Fragment extraction, alignment and integration from biomedical knowledge sources
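A minimal sketch of what an integrated view across vertical levels might look like in code. The level names come from the slide; the Observation record, its fields and the integrated_view function are assumptions for illustration and do not reflect the actual HeC data model.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Vertical levels named on the slide, ordered from smallest to largest."""

    MOLECULE = 1
    CELL = 2
    TISSUE = 3
    ORGAN = 4
    INDIVIDUAL = 5
    POPULATION = 6


@dataclass(frozen=True)
class Observation:
    """One piece of data about a patient at one level (hypothetical record)."""

    patient_id: str
    level: Level
    kind: str
    value: object


def integrated_view(observations: list[Observation], patient_id: str) -> dict:
    """Group one patient's observations by level, from molecule upward."""
    view: dict[Level, list[Observation]] = {}
    for obs in sorted(observations, key=lambda o: o.level):
        if obs.patient_id == patient_id:
            view.setdefault(obs.level, []).append(obs)
    return view
```

A query such as "everything known about patient p1 at organ level and below" then reduces to filtering the keys of the returned dictionary.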

Data and knowledge modelling

[Figure: modelling workflow. Labels: Requirements (Data Acquisition Protocols (WP9); Users Requirements Specifications (WP2); Modelling with Domain Experts (WP6)); Integrated Data Modelling; Knowledge Representation - ontologies; Applications (DSS, Similarity, Query); HeC-universal; Application specific.]

Ontologies in Health-e-Child

• Reuse existing medical knowledge to structure our feature space
  • Working with established knowledge
  • Linkage to “live” knowledge base(s)
  • Saves effort on validating the models with domain experts
  • Non-domain specific conceptualization (upper level ontologies): reusing these, we establish shareable/shared knowledge
• Ontological modelling of paediatric medicine
  • Integration of information – resolving semantic heterogeneity
    – Different countries, hospitals, protocols
• Research potential
  • OWL DL representation of paediatric medical knowledge, identification of knowledge patterns
  • Decision-support systems
  • Similarity metrics and query enhancement
  • Alignment of ontologies that overlap over the Health-e-Child domain
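Query enhancement with an ontology can be illustrated by expanding a search term to all of its sub-concepts. The sketch below assumes a tiny is-a hierarchy written as a Python dictionary; the concept names are illustrative only and are not taken from the HeC ontology, and expand is a hypothetical helper rather than an OWL reasoner.

```python
# child -> parent; an illustrative fragment, not the HeC ontology.
IS_A = {
    "Medulloblastoma": "BrainTumour",
    "Astrocytoma": "BrainTumour",
    "BrainTumour": "Tumour",
}


def expand(concept: str, is_a: dict[str, str]) -> set[str]:
    """Return the concept together with all of its descendants."""
    expanded = {concept}
    changed = True
    while changed:
        changed = False
        for child, parent in is_a.items():
            if parent in expanded and child not in expanded:
                expanded.add(child)
                changed = True
    return expanded
```

A search for "BrainTumour" would then also match records annotated with the more specific concepts, which a plain string comparison would miss.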


Clinical and Application Roadmap

[Figure: roadmap timeline.]
Phases: Phase I (- 06/06); Phase II (07/06 - 06/07); Phase III (07/07 - 12/08); Phase IV (2009)
Activities shown: User Requirements; State of the Art Reports; Study Design and Approval; Data acquisition, genetic tests, ground truth annotations; Knowledge Discovery Methods; Segmentation/Registration; Feature Extraction from Imaging; Classifiers Based on Genetics; Disease Model Development (generic, subtype specific, patient and treatment specific); Integrated decision support; Refinement of Models and Algorithms; Clinical Validation; Dissemination

Grid Platform

Future Work – Decentralized Architecture

Working out possible alternatives to decentralize key components of the grid middleware

[Figure: Autonomous Sites. Each hospital is shown as 1 Box with 4 Domains, hosted as Virtual Computers on SMP machines (SMP 2 - n CPUs; SMP 4 - n CPUs; SMP 4 - n CPUs).]

Hospital 1
- WN, site BDII, CE/LRMS or gsissh, UI, FTS
- top BDII, WMSLB, LFC (*M), DPM, MON ?
- VOMS (S), PX (S), Hydra (D), AMGA (*M)

Hospital 2
- WN, site BDII, CE/LRMS or gsissh, LRMS, UI, FTS
- top BDII, WMSLB, LFC (*S), DPM, MON ?
- VOMS (S), PX (S), Hydra (D), R-GMA ?, AMGA (*S)

Hospital 3
- WN
- site BDII, CE/LRMS or gsissh, UI, FTS
- top BDII, WMSLB, LFC (*S), DPM, MON ?
- VOMS (M), PX (M), Hydra (D), AMGA (*S)

Ontology layer in the data management architecture

[Figure: layered architecture. Labels: GUI; Applications (e.g. DSS); Query Processing Engine; Ontological Layer; HeC Ontology; Semantics Rules; Ontology-Data Model Mappings; HeC-External Ontologies Mappings; External Biomedical ontologies; External public DBs; Hospitals’ DBs.]
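The role of the ontology-data model mappings can be sketched as a lookup that translates one ontology concept into each hospital's local column name before values are collected. All names here (CONCEPT_TO_COLUMN, the site identifiers, the column names, query_concept) are hypothetical; the slides do not describe the mapping format used in HeC.

```python
from typing import Any

# Hypothetical mappings: ontology concept -> local column name per site.
CONCEPT_TO_COLUMN: dict[str, dict[str, str]] = {
    "BodyWeight": {"site_a": "weight_kg", "site_b": "poids_kg"},
    "Diagnosis": {"site_a": "dx", "site_b": "diagnostic"},
}


def query_concept(
    concept: str, site_rows: dict[str, list[dict[str, Any]]]
) -> list[tuple[str, Any]]:
    """Collect the values of one concept from every site that maps it."""
    columns = CONCEPT_TO_COLUMN[concept]
    results: list[tuple[str, Any]] = []
    for site, rows in site_rows.items():
        column = columns.get(site)
        if column is None:
            continue  # this site has no mapping for the concept
        results.extend((site, row[column]) for row in rows if column in row)
    return results
```

The caller asks for a concept and never needs to know that two hospitals store it under different names, which is one simple way of resolving the semantic heterogeneity between sites.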


HEP vs. BioMed - Comparison

• Number & location of sites
• Job vs. Interactive processing
• Data distribution & governance
• Nature of applications, skill sets of users
• Data Types & lifetimes
• Data Replication
• Middleware-centric vs. Operating system-centric

Obstacles for Biomed Users

• Grid Middleware is difficult to set up, configure, maintain and use. New skills needed.
• Support for non-scientific applications is limited.
• Hierarchical environments favoured, largely client-server, inflexible in nature - no support for P2P.
• Grid software has concentrated on job-oriented (batch-like) computing rather than interactive computing.
• Cannot share the CPU (CE) or storage capabilities (SE) with the Grid: all or nothing for the resourcing.
• We need out-of-the-box Grid functionality for biomedical end-users.

Conclusions

• HeC Gateway designed; implementation progressing through 2007-2008
• Grid platform decided – EGEE gLite
• BUT Grid is not yet mature in non-HEP settings
• Biomedical communities are very different to HEP communities, so there is no one-size-fits-all
• Future must be towards user-participation in Grids – what about a Grid OS?
