Case Study: OSS at the National Cancer Institute The Cancer Bioinformatics Grid (caBIG) Program

Workshop Open Source Software and the Military Health System September 22-23, 2011 Virginia Tech Arlington Research Building

Fred Prior, PhD

cancer Biomedical Informatics Grid®

An open source, open science program begun in 2004

GOAL: design and develop a collaborative IT infrastructure to link the NCI designated cancer centers and other NCI programs to accelerate the pace of biomedical research focused on the detection, diagnosis, treatment, and prevention of cancer.

Between 2004 and 2010, caBIG grew into one of the largest NCI-funded programs, with a total cost of at least $350 million over those fiscal years.

The program is ongoing.

Source: AN ASSESSMENT OF THE IMPACT OF THE NCI CANCER BIOMEDICAL INFORMATICS GRID (caBIG®), 2011 Available from: http://arc.georgetown.edu/BSAcaBIGAssessment.pdf

Source: https://cabig.nci.nih.gov/concepts/essentials

Phase 1: Pilot (2004-2007)
  Establish the community (workspaces)
  Develop core tools
    Over 70 open source software packages were developed
  Link the NCI cancer centers
    Over 149 nodes on the production Grid

Phase 2: Enterprise Deployment (2007+)
  caBIG®-supported Deployment Program at 56 NCI cancer centers
  6 caBIG®-supported Knowledge Centers to provide demonstrations, training material, and answers to frequently asked questions
  6 caBIG®-supported In Silico Research Centers of Excellence for research using data-mining and other in silico
  19 Support Service Providers: licensed commercial entities to assist users in installing, modifying, and using caBIG® tools

Experience with caBIG®

Primary involvement with the In Vivo Imaging workspace
Liaison to the Tissue Bank and Pathology Tools workspace
Multiple interactions with the Vocabularies & Common Data Elements cross-cutting workspace

Roles:
  Subject Matter Expert (funded) to facilitate community input and provide guidance on imaging priorities
  Contract development and maintenance of 2 software packages
  Adoption and deployment of caBIG tools
  Open source development of extensions

Ref: Prior F, Erickson B, Tarbox L. Open Source Software Projects of the caBIG™ In Vivo Imaging Workspace Software Special Interest Group. Journal of Digital Imaging, 2007; 20(Suppl 1):94-100.

caBIG® IVI Workspace

The In Vivo Imaging Workspace was added to the caBIG program in April of 2005:

1. To advance imaging informatics for treatment of patients with cancer

2. To leverage caBIG technology such as caGrid, Internet tools such as XML, and existing DICOM standards by creating “middleware” to facilitate sharing of images in a variety of settings

3. To strive towards a standardized way to evaluate and annotate images, especially for evaluation of tumor burden and response

4. To facilitate secure and easy sharing of images and image analysis & visualization algorithms with an emphasis on the cancer community

Adoption of Open Source XIP Platform

  caBIG Open Source imaging libraries and XIP Builder Tool
  caBIG AVT Project (Algorithm Validation Toolkit)
  DoD TATRC/ACR’s Interoperability in Medical Imaging
  DARPA deep-bleeder acoustic coagulation
  Beth Israel Intraoperative Fluorescent Imaging
  NTR Optical Imaging for Drug Therapy Monitoring, multi-modality imaging
  caBIG AIM Project (Annotation and Image Markup), Northwestern University
  CenSSIS collaboration with RPI on Cellular Imaging for XIP
  UPENN collaboration on multi-resolution histopathology

Pre-Clinical In Vivo In Vitro

Lessons Learned

A very large open source community is difficult to manage if the goal is to achieve consistency in implementation and interoperability of components.
  Development teams largely worked in isolation.
  Despite global architecture, interface and vocabulary standards, stove-pipe development was all too frequent.
  Interoperability “certification” processes were complex, not well communicated and not flexible enough to deal with widely disparate domains.
  Architecture and interoperability were NOT community driven.

The Five Steps to Compatibility

There are five steps in developing a caBIG® compatible application:
1. Creating an Information Model
2. Performing Semantic Integration (Vocabularies)
3. Transforming the Information Model into Metadata (Common Data Elements)
4. Generating Code and Messaging Interfaces (APIs)
5. Generating a caGrid Interface

[Diagram: tool used at each step]
1. Create an Information Model in a Modeling Tool
2. Perform Semantic Integration using the Semantic Integration Workbench (SIW)
3. Transform the Information Model into Metadata using the UML Loader
4. Generate Code and Messaging Interfaces using the caCORE SDK Code Generator
5. Generate a caGrid Interface using “Introduce”

[Diagram labels: Information Models, Vocabularies, CDEs, APIs]
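
The five steps and the tool named for each can be sketched as an ordered checklist. This is a minimal illustration in Python, not caBIG code: the `Step` class, the `STEPS` table and the `next_step` helper are hypothetical names introduced here, and the rule that each step must be finished before the next one starts is an assumption inferred from the numbering on the slide.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Step:
    """One step on the path to a caBIG-compatible application (hypothetical type)."""
    number: int
    action: str
    tool: str


# Step names and tools are taken from the slide; the table itself is illustrative.
STEPS = (
    Step(1, "Create an Information Model", "Modeling Tool"),
    Step(2, "Perform Semantic Integration", "Semantic Integration Workbench (SIW)"),
    Step(3, "Transform the Information Model into Metadata", "UML Loader"),
    Step(4, "Generate Code and Messaging Interfaces", "caCORE SDK Code Generator"),
    Step(5, "Generate a caGrid Interface", "Introduce"),
)


def next_step(completed: Iterable[int]) -> Optional[Step]:
    """Return the first unfinished step, or None when all five are done.

    Assumption: steps are strictly sequential, so a finished step that
    follows an unfinished one is reported as an error.
    """
    done = set(completed)
    for step in STEPS:
        if step.number not in done:
            out_of_order = sorted(n for n in done if n > step.number)
            if out_of_order:
                raise ValueError(
                    f"step {step.number} is unfinished but later steps "
                    f"{out_of_order} are marked complete"
                )
            return step
    return None


if __name__ == "__main__":
    pending = next_step([1, 2])
    print(f"Next: step {pending.number}, {pending.action} using {pending.tool}")
```

A team tracking its own progress would pass in the step numbers it has finished; a list that skips a step raises an error rather than silently continuing.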

Lesson 5: Making a Tool caBIG™ Compatible

Compatibility Review Process

Lessons Learned

Contract development and maintenance without an open source community, and without mechanisms for accepting and integrating community-developed software, leads to missed opportunities and multiple independent development paths.
  Open source enhancements and extensions cannot be inserted into the code base.
  Change proposals can be submitted to the contract development team, IFF one exists.
  End users and Support Service Providers are denied important features because they are not in the official release package.

The National Biomedical Image Archive becomes The Cancer Imaging Archive

NBIA as hosted by CBIIT
TCIA as hosted by Washington Univ.

Open Source Enhancements without a Home

Deployment enhancements
  Virtual Machine hosting
  Database clustering
  High-availability and load-balancing clusters

User interface extensions
  New look and feel
  Integrated wiki
  LDAP and password management

Performance improvements
  Improved download applet
  Multi-pipeline de-identification
  Enhanced upload performance

Findings of NCI Board of Scientific Advisors

Source: AN ASSESSMENT OF THE IMPACT OF THE NCI CANCER BIOMEDICAL INFORMATICS GRID (caBIG®), 2011 Available from: http://arc.georgetown.edu/BSAcaBIGAssessment.pdf

A "cart-before-the-horse", overly broad vision for the program
  A technology-driven focus rather than end user requirements and needs
  In some instances, confusion between clinical practice needs and the mandated research focus of the program

A "build it and they will come" mentality
  The assumption that free software will always be accepted, even if it is complicated, costly to integrate into a user’s environment and poorly supported

A business model that is unsustainable and not cost-effective
  Software has a life cycle and requires ongoing evolution. An active, engaged open source community can facilitate this.
  Without a self-sustaining business model, when the government funding stops, the products die.

In Summary

caBIG® is a large and productive open source, open access, open science program funded and managed by the National Cancer Institute.

caBIG® is a contract development program with open source licensing.

Lots of community engagement but little support for an actual open source development process.

Stovepipe development and an inefficient certification process led to poorly integrated components and poor interoperability.

Open source is largely a bottom-up process; caBIG® is largely a top-down program.