Building Global Scientific Computing Infrastructure through Lab, Academic and Industrial Collaborations Bill Hoffman - CTO/Founder Berk Geveci - Director of Scientific Computing Kitware Inc.

Post on 22-Dec-2015

Thank You

• Thank you to the unsung heroes of Open Source

– Scientific computing community and Gov. labs

• Google and Facebook would not be around without the open source infrastructure built in part by you DOE folks

Talk Overview

• About Kitware

• Why Open Source for Collaboration

• Successful Collaboration Platforms Supported by Kitware

• Suggestions for Future Directions

Kitware: the Company

• Founded in 1998

• Founders: 5 previous employees of GE Corporate Research

• Privately held, profitable from creation, no debt

– Revenues projected at $12 million in 2010 (~$15 million if subcontractors included)

– Principally consulting/grants, with support product revenue

• Approximately 90 employees; growing rapidly (30% in 2010)

– > 25 PhD

– Looking to hire 20 to 30 in 2011

Kitware Is

• A software company

• creating open-source collaboration platforms

• which are used globally for

– research

– teaching

– commercial application.

• This software is created by

– internationally recognized experts

– in extended communities

– using a rigorous, quality-inducing software development process.

Technical Portfolio

Computer Vision

• Extract information from images or video streams

Why Open Source?

• World-wide visibility

– Marketing (7.5 million web hits/month)

– Hiring (candidates have trained with the software)

– Collaboration platform: academic, research, commercial

– Distributed maintenance

• High quality base for products

– Commercial

– Proprietary

– Specialized

Why Open Source?

• Software licensing fees are minimal

– Support costs

– Consulting costs

• Software survives independent of any single company

– Community support

Open Source Is Ideal for Scientific Computing

• Open Science

• Authenticity (see what you get)

• Quality-inducing, agile, collaborative software process

• Scalability

• Business model

Open Science

• Reproducible

– Data (Open Data)

– Software / algorithms (Open Source)

– Publications (Open Access)

• Impediment-free Dissemination

– Results

– Research ideas

Example: Alzheimer’s Research

• From a NY Times article: "The key to the Alzheimer’s project was an agreement as ambitious as its goal: not just to raise money, not just to do research on a vast scale, but also to share all the data, making every single finding public immediately, available to anyone with a computer anywhere in the world.

No one would own the data. No one could submit patent applications, though private companies would ultimately profit from any drugs or imaging tests developed as a result of the effort.

“It was unbelievable,” said Dr. John Q. Trojanowski, an Alzheimer’s researcher at the University of Pennsylvania. “It’s not science the way most of us have practiced it in our careers. But we all realized that we would never get biomarkers unless all of us parked our egos and intellectual-property noses outside the door and agreed that all of our data would be public immediately.”

Authenticity

• See what you get

• Try before you commit

• Access to outside, independent experts

– Avoid vendor lock-in

– Hire from the community

Agile Software Process

• Open source communities require extensive collaboration

– Distributed development and user communities

• Necessarily require agile processes

– Responsive to customer

– Responsive to technology changes

Scalability

• Scalable Software Development

– Eric Raymond, The Cathedral & The Bazaar: “open-source peer review is the only scalable method for achieving high reliability and quality.”

– (assuming the community is big enough!)

Business Model

• Open source software

– Services and support

– Consulting

– Collaborative R&D

• Commercial products

– Value-added products

– Applications built on open source base

• Red Hat for scientific computing*

Successful Collaborations

• VTK

• ParaView

• ITK

• CMake

• Client Specific Work built on those tools

– ISP

– ERDC

In The Beginning There Was VTK

• From Ohloh: Very large, active development team: Over the past twelve months, 66 developers contributed new code to VTK. This is one of the largest open-source teams in the world, and is in the top 2% of all project teams on Ohloh.

VTK Development Team

and many others...

Then Came ParaView

National Library of Medicine Segmentation and Registration Toolkit

• Insight Toolkit (ITK)

• $15 million over 7 years

• Leading edge algorithms

• Open source software

• www.itk.org

Software Process

Software Process Tools

CMake – huge impact, started with NLM

• 3000+ downloads per day from www.cmake.org

• Major Linux distributions and Cygwin provide CMake packages

• KDE, Second Life, Trilinos, Boost (Experimentally), many others

KDE 2006 – Tipping Point!

CMake Who Is Involved?

Users

• KDE

• Second Life

• ITK

• VTK

• ParaView

• Trilinos

• Scribus

• Boost (Experimentally)

• MySQL

• LLVM

• many more

Supporters

• Kitware

• ARL

• National Library of Medicine

• Sandia National Labs

• Los Alamos National Labs

• NAMIC

• Commercial Customers

CDash - Trilinos (Multi-Package Dashboard)

http://trilinos-dev.sandia.gov/cdash/index.php

Main Project

Sub Projects

ParaView work with Army Corps of Engineers ERDC

Genesis of ISP

ISP was developed beginning in 2008. It is a synthesis of three different tools:

• Midas for data archival and transmission

• VolView (modified) for the visualization and display core

• Lesion Sizing Toolkit for additional functionality

The resulting data archive and viewing application has been running 24/7 since 2008 and provides a means for readers to interactively explore the data of participating authors.

Lesion Sizing Toolkit

ISP (VolView Based)

MIDAS

Kitware SBIR History

• No stranger to SBIR funding

– First contract was an SBIR

– 16 Phase II's

• Funded many advances across our tools

– ParaView Web

– In-Situ analysis

– AMR Volume Rendering

– Higher Order Finite Element Visualization

• Tibbetts award for Image-Guided Surgery (IGSTK) Phase I and II STTRs

– recognizes companies who represent excellence in achieving the mission and goals of SBIR and STTR programs

The Need for Indoor Plumbing

• http://www.kitware.com/blog/home/post/78

• Rather than creating usable tools for scientists and engineers, often what is created are shiny toys with little practical use. Instead, as one of our collaborators Russ Taylor at UNC so aptly put it, we could use a lot more basic "indoor plumbing" to complement our bleeding-edge zero-G toilets with the latest bells and whistles.

2008 Rejected Proposal DOE Office of Science

• "CMake - The Next Generation Petascale Build Tool"

• Reviewers that wanted zero-G toilets

- “The proposed method is an evolutionary development beyond current state of the art. There seems to be very little novelty in the proposed approach.”

- “Of the four components of the petascale functionality only two are new, e.g., something beyond what is available with GNU make.”

- “If the DOE HPC centers feel the need for such an extended version of Cmake and Ctest it should probably be obtained in a acquisition process from this and/or competing vendors.”

Reviewers that wanted indoor plumbing

• “This is a good proposal to enhance CMake for petascale systems and to perform community outreach on it.”

• “The approach is very appropriate”

• “This proposal addresses the often-overlooked but important area of Application Build Tools. .... In large HPC software projects, which have to be portable to many complex systems, the build infrastructure and process are labor-intensive and often frustrating. The proposed changes seem to have reasonably low risk and significant benefit.”

Investment in Infrastructure has a huge payoff

• Allows your people to focus on the science

• Allows the “outsourcing” of cross platform maintenance

• Software Engineering is a “Solved Problem”?

Kitware’s view on Multi-Core

• Huge opportunity

• Different Platforms and Cross Compiling

• We have the expertise to help others achieve goals

• People will want our packages (ITK, VTK, ParaView and CMake) to take advantage of multi-core, and will collaborate with us and fund us to develop them

• Investments in open-source packages have a huge global impact

Thanks, and Questions
