cagrid, fog and clouds

Download caGrid, Fog and Clouds

Post on 23-Feb-2016




0 download

Embed Size (px)


caGrid, Fog and Clouds. Joel Saltz MD, PhD Director Center for Comprehensive Informatics. Overview. Introduction Tools Use Case Fog and Clouds. Fog Computing Grid interoperable enterprise virtualization. caGrid, Enterprise and Clouds. caGrid Components Security (GAARDS) - PowerPoint PPT Presentation


Design Templates: Translational Biomedical Research and the Biomedical Informatics Middleware

caGrid, Fog and CloudsJoel Saltz MD, PhDDirector Center for Comprehensive Informatics

OverviewIntroductionToolsUse CaseFog and CloudsFog ComputingGrid interoperable enterprise virtualization

caGrid, Enterprise and Clouds

Biomedical Middleware: caGrid, TRIAD, i2b2caGrid ComponentsSecurity (GAARDS)Language (metadata, ontologies)Semantic/Federated queryWorkflow Grid Service Graphical Development Toolkit (Introduce)DICOM, IHE compatibilityAdvertisement and Discovery


Preferred NameSynonymsDefinitionRelationshipsConcept CodeVocabulary/Ontology InteroperabilityRegistered metadataOntology concept codes used to annotate modelsXML schemas that define data structures also registeredThus both data semantics AND data structures are registered. That is how we achieve (relative) interoperability.

7Use Case: Will Treatment work and if not, why not?

Avastin and Glioblastoma in RTOG-0825Treatment: Radiation therapy and Avastin (anti angiogenesis)Predict and Explain: Genetic, gene expression, microRNA, Pathology, ImagingRT, imaging, Pathology markup/annotations/queryActive Data workflows

Avastin and GBMs in RTOG-0825Analysis on pre-treatment tissue to extract imaging and molecular biomarkers that are indicative of Outcome/Avastin response. whole genome mRNA and microRNA expression profiling of GBM tumor specimens to identify outcome/Avastin response biomarkersAnalyzing the Pathology imaging and diagnostic imaging registered with the therapy plan to extract any biomarkers that can indicate Avastin response. Does advanced imaging (eg: diffusion weighted imaging) provide markers that can predict patient response?RT, Diagnostic Imaging and Pathology: Support Human/Algorithm analyses, annotation, markupData management and display framework that integrates the pathology with the radiology, therapy treatment information and the clinical data. This involves integrating platforms that manage imaging data at ACRIN, pathology at UCSF and molecular data from Emory.

For the sake of quality control, reproducibility and data sharing, results of RT, imaging, Pathology observations and analyses need to be described in a well defined manner

Finding: massMass ID: 1Margins: spiculatedLength: 2.3cmWidth: 1.2cmCavitary: YCalcified: NSpatial relationships: Abuts pleural surface; invades aortaDistinguish (and maybe redefine) astrocytic, oligodendroglial and oligoastrocytic tumors using TCGA and Rembrandt Important since treatment and Outcome differ

Link nuclear shape, texture to biological and clinical behaviorHow is nuclear shape, texture related to gene expression category defined by clustering analysis of Rembrandt data sets?Relate nuclear morphometry and gene expression to neuroimaging features (Vasari feature set)Genetic and gene expression correlates of high resolution nuclear morphometry and relation to MR features using Rembrandt and TCGA datasets.

Annotation and Markup of Pathology Data needs Human/Algorithm Cooperation

Astrocytoma vs OligodendroglimaTCGA finds genetic, gene expression overlapPathologists have also long seen overlapRelationship between Pathology, Molecular, RadiologyRelationship to Outcome, treatment response

Example: Compute Intensive Workflow

Fog Computing

Attributes of Fog ComputingFor legal and organizational reasons, data location is constrainedVirtual machines used to create software stacks that use caGrid + other middleware to expose data services (we regularly ship VMs with caGrid based software stacks)Enterprises such as Emory increasingly rely on virtualization architecturesMid-range active storage platforms (eg Emory 1PB archival, 100TB fast disk, 2K cores, infiniband interconnect) will rely heavily on virtualization

Relationship between Fog and Cloud ComputingWorkflows have enterprise, caGrid and compute intensive componentsInteroperability between enterprise and caGrid software stacks major current NCI/ONR issueGiven virtualization, HPC and large scale data requirements can be tackled with cloud approach

IssuesOrganizational, legal requirements create constraints on placement of datasets, where VMs can be runMapping of VMs, datasets:Huge variation in communication, I/O bandwidth, compute capabilities in Fog/Cloud continuaMultiple software stacks, heterogeneous hardware architecturesSecurity: Fog to Cloud: multiple distinct but overlapping security related requirements and constraints

Thank you