TRANSCRIPT
Commercial and Near-commercial Use of Grid in the US:
The Good, the Bad and the Ugly
Carl Kesselman
Director, Center for Grid Technologies
University of Southern California
Infrastructure for enabling resource sharing and collaboration across
distributed virtual organizations
Enterprise Virtualization
Grid is Enterprise Virtualization
• Server Virtualization
• Application Virtualization
• Storage Virtualization
• File Virtualization
• Desktop Virtualization
Grid Adoption Lifecycle
Level 1 : Single Cluster / Single Application
Level 2 : Isolated Clusters / Grid Silos
Level 3 : Remote Use of Isolated Clusters
Level 4 : Linked Clusters / Peer-to-Peer Grid
Level 5 : Enterprise Grid
Level 6 : Inter-Enterprise Grid
Commercial Grid Adoption Statistics
• Gartner defines grid computing as Early Mainstream
• 5% – 20% of target audience has adopted
• Currently 14% of CIOs rate interest level as high to very high
Grid focus areas:
• Non-dedicated distributed computing
• Highly scalable
• Application centric, meta data later, simplicity focus
• Cost savings / ROI value proposition
• Innovation, differentiation value proposition
Grid Infrastructure Value Progression
[Figure: progression from Tactical Solution to Strategic Asset, with the current market marked as in transition. Stage labels recovered from the slide: Accelerate application performance; Coordinate data and computation; Improve coordination, enable sharing; Improve control, increase isolation/stability; Optimize enterprise-wide utilization; Service delivery. Building blocks shown: Compute, Data, Virtual machine, Single application, Multiple applications, Integrated IT islands, Integrated IT.]
Copyright © Univa 2006
Grid Value Proposition
Commercial Grid Computing
For End Users
• Gain access to increased compute capacity
• Reduce costs
• Get to market faster
For IT Integrators
• Gain a competitive edge with a differentiated offering
• Maximize customer satisfaction and retention
For Software Vendors
• Gain access to new markets / customers
• Ensure your applications perform as designed
• Maximize customer satisfaction and retention
For Service Providers
• Guarantee a high-quality end user experience
• Deliver fast, consistent application response on time, every time
• Reduce operating costs and improve infrastructure utilization
Univa UD Proprietary & Confidential
HPC Lifecycle Management
[Figure: HPC lifecycle wheel. Stage labels recovered from the slide: Application Selection, Test, Selection & Purchase, Configure, Deploy, Use, Support, Upgrade, Grow, Add, Migrate, Retire.]
Maximize use of single cluster, e.g. the UniCluster 3.2 Stack
• Robust, scalable, open source, high-value
• Multiple uses aligned with business goals
• Application access for remote users
• Integrate best-of-breed open source technologies into a full-featured, mature cluster software stack
Stack components:
• Globus: GridFTP, WS-GRAM, RFT, MyProxy/Auto-CA, GSI-OpenSSH
• SGE: sge_qmaster, sge_schedd, sge_execd
• Ganglia: gmond, gmetad, RRD
• UniCluster Monitor Console, ARCO, Postgres
• Bootstrap Service, Management Service
• UniCluster Security Component
Supported platforms:
• RHEL 4 and 5, SUSE 9 and 10, CentOS 4 and 5 (x86 and x86_64)
• Windows 2000 or XP (Monitor Console only)
• x86, x86_64 hardware
Application Domains
• Health Sciences: medical imaging, bioinformatics
• Manufacturing: EDA, fluid dynamics, crash test simulations
• Data Analytics and Data Mining: potential growth applications
• Financial Services: risk/portfolio analysis, Monte Carlo simulations
• Media: digital content creation, animation
• Energy: reservoir simulations, seismic processing
Cluster Workloads
• Parallel applications
  • Faster time to completion for a single calculation
  • Generally requires application modification
  • Examples: solving differential equations for mechanical design, aeronautics, etc.
• Coarse-grain “High-throughput”
  • Complete as many independent calculations as possible
  • Orders of magnitude increase in exploration of problem space
  • Often accomplished via scripting
  • Examples: Monte Carlo algorithms, parameter space exploration, optimization
• Interactive
  • Single node, start “immediately”
  • Offload work from desktop
• Any of these may be driven from the command line, portals, or existing application tools (e.g. Matlab)
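The coarse-grain, high-throughput pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything shown on the slides: the function names (`estimate_pi`, `run_sweep`) are hypothetical, the Monte Carlo task is a toy estimate of pi, and a local thread pool stands in for a cluster or grid scheduler. The point is that each task depends only on its own seed, so the tasks can be dispatched in any order to any node.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(seed, samples=20_000):
    """One independent Monte Carlo task: estimate pi from random points.

    Each task owns its random generator, so results depend only on the seed
    and not on which worker runs the task or in what order.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def run_sweep(seeds, workers=4):
    """Run one task per seed and collect the results in seed order.

    Assumption: the thread pool is a local stand-in for a job scheduler.
    On a real cluster each call would be submitted as a separate job.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(estimate_pi, seeds))


seeds = list(range(8))
estimates = run_sweep(seeds)
combined = sum(estimates) / len(estimates)
print(f"{len(estimates)} independent tasks, combined estimate {combined:.4f}")
```

Because the tasks share no state, adding more seeds (or more workers) widens the explored problem space without changing the task code, which is why this kind of workload is often driven by simple scripting.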
Enterprise Grids
• Desktop
• Desktop and Cluster
• Multiple Clusters
Johnson & Johnson
• Goal
  • Provide complete infrastructure suite for central IT managed Grid and HPC services
• Challenges
  • Leverage heterogeneous environment including Linux clusters, blades, and high performance workstations
  • Provide solution for meta-scheduling to remove system and application management overhead for researchers so they can focus on analysis
• Solution Highlights
  • Grid Ready Cluster environment consisting of Linux clusters and Windows workstations
  • Fully implemented ‘Grid as Service’
  • Successful solution for accelerating a multitude of critical production applications
• Today
  • Successful deployments across Europe and US (WAN)
  • Multi-site integration work complete
  • Grid and HPC now a managed, central service
“It’s just a better way to invest your money... We can have a single tool that can both do a virtual cluster and also take advantage of CPU harvesting off of the existing equipment that we already have. That was one of the big reasons we picked [Univa UD] over other ways of providing HPC capabilities.”
– Jeff Mathers, Pharma R&D, Johnson & Johnson
Corus
• Goal
– Optimize its scientific computing infrastructure
– Enable better / faster engineering
• Challenges
– Time crunch driven by project to simulate replacing crash barriers along UK highways
– Required a significant increase in processing capabilities
– Fully integrated solution had to be in production in 30 days
• Solution Highlights
– Installed new 48 CPU cluster to increase throughput
– Integrated LS-Dyna, Nastran, PamStamp, Radioss, Abaqus and ST-ORM
– Installation began on Jan. 15th; 20 engineers were trained and running production jobs by Feb. 15th
• Today
– Univa UD used for core production jobs across compute center
“The implementation of GX Synergy was, in hindsight, the only solution that gave us both an optimization of the resources of the center as well as an immediate mastery of the tool by our engineers. The simplicity and the lightness of the solution enabled us to install it in one week, without having to stop our normal work. We were able to be ready on time and to meet the challenge of the customer work we needed to undertake.”
– Mike Twelves, Manager Knowledge Systems, Corus Automotive Engineering
Novartis
• Goal
– Accelerate in silico research, value creation
– Cost containment
• Challenges
– CPU-constrained in silico research capability
– Integration with existing interfaces
– Security of intellectual property
• Solution Highlights
– Seamless integration with existing portal interface
– Passed all end-to-end security tests
• Today
– To be extended to tens of thousands of nodes
– Multiple production applications including drug discovery, clinical analysis, and sales & marketing
“The reported work clearly shows that large database docking in conjunction with appropriate scoring and filtering processes can be useful in medicinal chemistry. This approach has reached a maturation stage where it can start contributing to the lead finding process. At the time of this study, nearly one month was necessary to complete such a docking experiment in our laboratory settings. The Grid computing architecture recently developed by [Univa UD] allows us to now perform the same task in less than five working days using the power of hundreds of desktop PC’s. High-throughput docking has therefore acquired the status of a routine screening technique.”
– Journal of Medicinal Chemistry
2007 Univa UD Confidential
Procter & Gamble
• Goal
  • Enhance in-house high performance compute capabilities by taking advantage of underutilized workstations
  • Run a Finite Element Analysis application (Abaqus) on the Grid
  • Bottle and package design
• Challenges
  • Competitive pilot vs. Axcellion and Platform (incumbent supplier)
  • Grid MP grid needed to interface with LSF
  • User community is used to the Platform LSF interface
  • Company did not want to disrupt user community by introducing a new interface
  • Company concerned about: security, unobtrusiveness, scalability
• Solution Highlights
  • Univa UD successfully won competitive pilot
  • Passed all major end user tests and concerns
  • Integration with LSF for Abaqus jobs completed in under a week
  • A grid of 200 high-end workstations / desktops is running Abaqus jobs during off-peak hours
  • Design of Experiments, Monte Carlo based approach
Children’s Memorial Hospital
• Goal
– Analyze patient data and medical research literature to differentiate pediatric brain tumor types while gaining unique insights into tumor classification and treatment
• Challenges
– Too costly and logistically difficult to add compute power to perform necessary analytic work
• Solution Highlights
– Uses data mining technology, Clementine®, to analyze and classify pediatric brain tumor types
– Employs LexiQuest Mine™ to discover previously overlooked relationships contained in literature
• Today
– Routinely able to process 124,000 medical abstracts in less than 1.5 hours
– Previously the analysis required between 20 and 24 hours, making ad-hoc and what-if queries unfeasible
“Leveraging SPSS predictive analytics and [Univa UD]’s expertise in grid computing, we’ve developed an integrated technology system that can efficiently extract and organize gene relationships from full text articles. We can also correlate this insight with both past and ongoing research on effective pediatric cancer treatments.”
– Dr. Eric Bremer, Director of the Brain Tumor Research Program at Children’s Memorial Research Center
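The turnaround figures on the Children’s Memorial Hospital slide (20 to 24 hours before, less than 1.5 hours after) imply a speedup that is easy to check. The short sketch below does only that arithmetic; the helper name `speedup` is hypothetical, and because the slide says "less than 1.5 hours", the results should be read as lower bounds.

```python
def speedup(before_hours, after_hours):
    """Ratio of the old turnaround time to the new one."""
    if after_hours <= 0:
        raise ValueError("after_hours must be positive")
    return before_hours / after_hours


# Figures taken from the slide: 20 to 24 hours before, under 1.5 hours after.
low = speedup(20, 1.5)
high = speedup(24, 1.5)
print(f"At least {low:.1f}x to {high:.1f}x faster")
```

A turnaround of roughly an hour and a half, rather than most of a day, is what makes the ad-hoc and what-if queries mentioned on the slide practical.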
GlaxoSmithKline
• Goal
– Replace internally developed Grid technology with commercially available solution
• Challenges
– Very knowledgeable about Grid computing, having built VCS in-house
– Existing Platform Computing customer
– Concerned with integration in their current IT infrastructure, application migration and standards
• Solution Highlights
– Rapid initial migration of existing applications
– No performance issues from non-dedicated nodes
– Passed all end-to-end security tests
• Today
– Grid solution is now a production IT service
– Multiple applications running in production across many functions and departments
“The Grid MP platform keeps track of all the data related to our job runs – where the job was executed, what type of machine, how long it took. So not only does the grid save us time, but in automating this function it allows us to define a validated process for job execution. That goes a long way toward achieving FDA compliance.”
– Mark Sale, Global Director of Research Modeling and Simulation, GlaxoSmithKline
American Diabetes Association & Kaiser Permanente
• Goal
– Accelerate Archimedes processing dramatically over standalone or server based processing
• Challenges
– Enabling a Smalltalk application (Archimedes) to run on Grid MP
– Getting single run times to under 10 minutes
• Solution Highlights
– On-demand processing was the initial best fit for the ADA’s peak-driven workloads
– Fully functioning simulation portal
• Today
– ADA utilized an internal grid of more than 1000 nodes
– Diabetes PHD in full production and publicly accessible
“…[Grid MP] will revolutionize not only how we do our work, but the accuracy of the decisions people make about the management of diseases. Normally, answering a single question on a PC requires 24 to 48 hours. [Univa UD] is helping us reduce the computing time to minutes.”
– Dr. David Eddy, Kaiser Permanente, Sr. Advisor for Health Policy and Management
Sanofi-Aventis
• Goal
  • Screen entire 1M compound library
  • Invest in latest technology – skip over Clusters to Grids
• Challenges
  • Too costly to expand existing HPC systems (SGI)
  • Plans in place to phase out HPC systems across company
• Solution Highlights
  • Joint effort in partnership with Accelrys to deliver integrated LigandFit solution
  • Within 2 months began routinely screening 300k compounds across multiple locations in France
• Today
  • Plan to expand to thousands of nodes
  • 1M+ library routinely screened
“Ease of deployment and the availability of applications were strong selling points for us. We also needed proven scalability and security and knew from [Univa UD]’s other enterprise deployments and their work with their public Grid that massive scaling and security capabilities were already proven.”
– Olivier Gien, Head of Discovery IT, Sanofi-Aventis
Toyota
• Goal
– Build a model of all production parts required to build a series of automobiles based on current orders on hand
• Challenges
– Have to determine the optimal production schedule to ensure that factories are loaded optimally and that all parts are on hand (JIT)
• Solution Highlights
– Integrated with the existing z/OS mainframe to minimize current user and workflow changes
• Today
– Improved utilization of hardware, manufacturing scheduling, inventory and personnel
– Results returned in minutes, not hours
[Figure: Grid MP Services, Device Group Y, Device Group Z; Orders, each broken into Part 1, Part 2, Part 3.]
Grid is gaining traction across Life Insurance and P&C
Life (Enterprise PC Grids)
• Mortality
• Population Analysis
• Stochastic-based risk modeling
• Principle-based reserving (PBR)
• Enterprise Risk Management (ERM)
P&C (Enterprise PC Grids)
• Pricing
• Underwriter’s Scorecard
• Catastrophe modeling
• Claims analytics (specifically fraud)
• Analysis of remote data (telematics)
Deep Computing
For underwriting and process, leading carriers are investing in technology to improve risk selection and competitive pricing, and to apply automated and consistent underwriting rules.
Particularly in the Life and Annuities space, actuaries are working with IT to change the way they model risks and maintain reserving in parallel with equity market fluctuations.
As technology helps reduce costs, freed-up budget and resources from maintenance and outsourcing are being used to make strategic technology investments across all areas of insurance.
Maximizing Value from Multiple Clusters
• Increased collaboration
• Greater aggregate capacity, via peer to peer
• Simplified access to distributed, file-based data
[Figure: cluster stacks (Operating System, Job Scheduler, File System, Security, Monitoring, Remote Access) linked by Job Scheduler Virtualization, File Virtualization, Federated Security, Federated Monitoring and Remote Access.]
Semiconductor Design
[Figure: Design File, Test Vectors, RTL env. scripts, Log Files, Log File Analysis, Defect Map.]
Scale: 10,000’s of test vectors and result log files
Maximizing Value From Shared Utility
• “Overflow” compute capacity for multiple applications
• Management of all resource types (compute, data, network)
• Higher overall service quality with lower administration
[Figure: cluster stack (Operating System, Job Scheduler, File System, Security, Monitoring, Remote Access) linked by Job Scheduler Virtualization, File Virtualization, Federated Security and Federated Monitoring to a SHARED UTILITY.]
The Commercial Grid Ecosystem
[Figure: labels recovered from the slide: Reporting & Analytics (Utilization, Forecasting, Capacity); Event Management (Event Detection, Monitors / Actions); License Control; Chargeback; Utility; Overflow; Cluster Configuration and Mgmt. interfaces; App Mgmt. & Job Submission/Mgmt. interfaces (Submission, Configuration, Management, Monitoring); Common Application Integration and Bundles; Firewall; Cluster Management; Enterprise PC Grids.]
• Integrated Insight
• Integrated Response
• System Management
• License Management
• DataCatalyst - Pro
• P2P job forwarding
• Virtual Machine Plug-ins
(Open Source)
• Easy install & configure
• Job scheduling
• Monitoring
• Cluster deployment
• Remote access / staging
• DataCatalyst
• Insight as a service (fee)
• Create Enterprise PC Grids
• Create Internet Grids
• Highly secure
• 100+ enterprise deployments
• P2P job forwarding for Grid MP & Cluster
Inter-enterprise Grids
• Resource Outsourcing
• Cloud computing
• Service Value Networks
• Outside in (J.S. Brown)
• True Federated environments
• E.g. Healthcare
Globus MEDICUS
• Medical Imaging and Computing for Unified Information Sharing (MEDICUS)
• Uses the standard Open Grid Services Architecture (OGSA) for Healthcare and Clinical Research
• Vertical integration of existing robust Grid technology
• Addresses Medical Imaging
• DICOM image sharing within Grids*
• DICOM image processing (WS)
• DICOM image archiving/management (Grid PACS)**
Globus MEDICUS Proto-Project @ http://dev.globus.org/wiki/Incubator/MEDICUS
*PACS and Imaging Informatics, SPIE Medical Imaging, 6145-32, 2006
**Int Journal of Computer Assisted Radiology and Surgery, 2006, 1:87-105; p100-104, Springer, Heidelberg
Global Patient Record
MEDICUS Use Cases: Children’s Oncology Group and Neuroblastoma Cancer Foundation Grids
Open Source Communities
Summary
• Increasing traction of Grid in commercial sector
• Need to give time for maturity lifecycle
• Need to look at entire cap-ex/op-ex lifecycle
• Many interesting new areas of opportunity and greenfield for Grid technology
• Grid’s place in infrastructure ecosystem
• Clouds don’t replace Grids, VM management doesn’t replace grids