Shantenu Jha for the SAGA Team: SAGA: An Overview
![Page 1: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/1.jpg)
Shantenu Jha for the SAGA Team
http://saga.cct.lsu.edu
SAGA: An Overview
![Page 2: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/2.jpg)
Overview
The SAGA Philosophy
• A Fresh Perspective on Distributed Applications and CI
SAGA in a Nutshell
• SAGA Landscape
• Individual APIs
• OGF standard
SAGA in Action
• Applications
• Tools, Frameworks, Gateways, Access Layers
Uptake and Roadmap
![Page 3: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/3.jpg)
Critical Perspectives
The ability to develop simple, novel or effective distributed applications lags behind other aspects of CI
• Distributed CI: Is the whole greater than the sum of the parts?
Infrastructure capabilities (tools, programming systems) and policy determine the type, development & execution of applications:
• The proportion of applications that utilize multiple distributed sites sequentially, concurrently or asynchronously is low
• Not referring to tightly-coupled applications across multiple sites
• Focus on extending legacy, static execution models
• Scale-Out of Simulations? Compute where the data is?
What novel applications & science has Distributed CI fostered?
• Distinguish the challenges of provisioning Distributed CI from support for application development
![Page 4: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/4.jpg)
Critical Perspectives: Quick Analysis
Several factors are responsible for the perceived & actual lack of distributed applications (DA)
• Developing Distributed Applications is fundamentally hard!
• Coordination across multiple distinct resources
• The range of tools, programming systems and environments is large
• Interoperability and extensibility become difficult
• Commonly accepted abstractions are not available
• E.g., Pilot-Job is powerful, but there is no “unifying” tool on TG
• Deployment and execution challenges are disjoint from the development process
• Generally a good idea, but application development often influences where and how an application can be deployed/executed
![Page 5: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/5.jpg)
Distributed Applications: Development Challenges
Developing Distributed Applications is fundamentally hard
• Intrinsic:
• Design Points: Dynamic and Heterogeneous resources and Variable Control (or lack thereof)
• Coordination over Multiple & Distributed sites
• Scale-up and Scale-out
• Models of Distributed Applications:
• More than (peak) performance
• Primary role of Usage Modes
• Extrinsic:
• (Complex) Underlying infrastructure & its provisioning
• Programming systems, tools and environments
![Page 6: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/6.jpg)
Distributed Applications: Development Challenges
Distributed Application Programming Systems and Tools
• Incompleteness and/or out-of-phase:
• Need X and Y, but only X or Y available
• e.g., Master-Worker paradigm supported, but no fault tolerance (FT)
• Customization:
• Works well with tool A but not B
• e.g., Pegasus-DAGMAN-Condor
• Robustness and Scalability:
• Works well in a small or controlled environment, but not at scale
• e.g., SAGA-based Montage
![Page 7: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/7.jpg)
Distributed Applications and PGI: Lessons from Applications
There exist distributed applications that aim to utilize multiple resources:
• Complex Coordination, Data Mgmt and FT issues
• Complex structures at different phases
Emerging infrastructures present operational challenges
• They don’t always provide what the application requires
Missing abstractions
• Development, Deployment and Execution
• e.g., coding against low-level middleware
Issues of policy and infrastructure design decisions
• e.g., co-allocation supported or not
• e.g., narrow versus broad Grids
![Page 8: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/8.jpg)
Distributed Application and PGI (2)
Lack of explicit support for addressing these challenges becomes self-fulfilling and self-perpetuating
• Most PGI are not designed to support distributed applications, increasing the effort to develop/deploy/execute them
• The small number of distributed applications on TG leads to a focus on single-site applications
(Ironically) Most applications have been developed to hide from heterogeneity and dynamism, not embrace them
• Good heterogeneity vs. bad heterogeneity
• Dynamism: Performance Advantages
Applications have been brittle and not extensible:
• Tied to a specific tool or programming system (& thus PGI)
![Page 9: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/9.jpg)
Distributed Application and PGI (3)
Successful Evolution of Supported Capabilities and PGI
• OSG support of HTC
• Condor from scavenging system to building block
• Condor Flocking
• TeraGrid and Gateways (User-level Abstraction)
• Several new capabilities for new communities
Not-so-Successful Evolution of Capabilities on PGI
• Co-scheduling on PGI
• Both technical and policy issues
• Scale-out of DAGs/Loosely-coupled ensembles
• Execution is logically or physically distributed
![Page 10: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/10.jpg)
Distributed Application and PGI (4)
Supporting applications that have “come of age” is useful
• Coupling real-time simulations to live sensor data (LEAD)
• Currently neither OSG nor TG can support DDDAS
Often the underlying infrastructure and capabilities change more quickly than the applications
• Infrastructure: Grids and Clouds
• Many challenges remain the same, e.g., the need to coordinate the distribution of data and computation
• Role for abstractions
• Applications still around: SFExpress, Netsolve->GridSolve
![Page 11: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/11.jpg)
Distributed Application and PGI (5): Miscellaneous Factors
Role of Funding Agencies:
• Changed vision, funding landscape
• TG: Run anywhere, with tools/programming systems to support primarily static HPC applications
• Focus on static execution as opposed to distributed, scale-out, robust workflows
• http://www.cct.lsu.edu/~sjha/presentations/panel_discussion/punch_counterpunch.pdf
Role of Standards:
• Lack of standards impedes interoperation
• No standards;
• Chicken-and-egg situation
• Simple standards have been effective, but with limited impact, e.g., the troika of JSDL, HPC-BP, OGSA-BES
![Page 12: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/12.jpg)
Distributed Applications: How do they differ from traditional HPC applications?
PGI design needs to reflect some or all of these:
Performance Models:
• Not just “peak utilization”; e.g., HPC & HTC (# of jobs)
Usage Modes:
• The same application has multiple usage modes
• How applications are developed, deployed and executed is often determined by the infrastructure
Static vs Dynamic Execution:
• Static applications are not enough; resource conditions and application requirements vary
Skillful Decomposition vs Aggregation
• Primacy of Coordination across distributed resources
![Page 13: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/13.jpg)
Understanding Distributed Applications
IDEAS: First Principles Development Objectives
Interoperability: The ability to work across multiple distributed resources
Distributed Scale-Out: The ability to utilize multiple distributed resources concurrently
Extensibility: Support for new patterns/abstractions, different programming systems, functionality & infrastructure
Adaptivity: Response to fluctuations in dynamic resources and the availability of dynamic data
Simplicity: Accommodate the above distributed concerns at different levels easily…
Challenge: How to develop DA effectively and efficiently with the above as first-class objectives?
![Page 14: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/14.jpg)
![Page 15: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/15.jpg)
SAGA: In a Nutshell
There is a lack of programmatic approaches that:
• Provide general-purpose, basic & common grid functionality for applications and thus hide underlying complexity, varying semantics, ...
• Provide the building blocks upon which to construct “consistent” higher levels of functionality and abstractions
• Hide “bad” heterogeneity and provide the means to address “good” heterogeneity
• Meet the need for a broad spectrum of applications:
• Simple scripts, Gateways, Smart Applications and Production Grade Tooling, Workflow…
Simple, integrated, stable, uniform and high-level interface
• Simple and Stable: 80:20 restricted scope and Standard
• Integrated: Similar semantics & style across
• Uniform: Same interface for different distributed systems
SAGA: Provides Application* developers with the units required to compose high-level functionality across (distinct) distributed systems
(*) One person’s Application is another person’s Tool
![Page 16: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/16.jpg)
SAGA: The Standard Landscape
![Page 17: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/17.jpg)
SAGA: Specification Landscape
Blue lines show which packages have input in the Experience document
![Page 18: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/18.jpg)
SAGA: In a thousand words..
![Page 19: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/19.jpg)
SAGA: Job Submission
Role of Adaptors (middleware binding)
![Page 20: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/20.jpg)
SAGA Job API: Example
![Page 21: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/21.jpg)
SAGA Job Package
![Page 22: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/22.jpg)
SAGA File Package
![Page 23: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/23.jpg)
File API: Example
![Page 24: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/24.jpg)
SAGA Advert
![Page 25: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/25.jpg)
SAGA Advert API: Example
![Page 26: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/26.jpg)
SAGA: Other Packages
![Page 27: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/27.jpg)
SAGA Task Model
All SAGA objects implement the task model
Every method has three “flavors”
• Synchronous version - the implementation
• Asynchronous version - synchronous version wrapped in a task (thread) and started
• Task version - synchronous version wrapped in a task but not started (task handle returned)
An adaptor can implement its own asynchronous version
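The three flavors can be illustrated with a small, self-contained Python sketch. It is not the SAGA API: the `Task` class, the `FileSizer` object and the `_async` / `_task` method suffixes are hypothetical names chosen for illustration, and a plain thread stands in for whatever an adaptor would really do.

```python
import threading


class Task:
    """Hypothetical task handle: a synchronous call wrapped in a thread."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args
        self._result = None
        self._error = None
        self._started = False
        self._thread = threading.Thread(target=self._body)

    def _body(self):
        # Run the wrapped synchronous implementation and keep its outcome.
        try:
            self._result = self._func(*self._args)
        except Exception as exc:
            self._error = exc

    def run(self):
        # Start the task once; return the handle so calls can be chained.
        if not self._started:
            self._started = True
            self._thread.start()
        return self

    def wait(self):
        # Waiting on a task that was never started is treated as an error here.
        if not self._started:
            raise RuntimeError("task was never started")
        self._thread.join()
        if self._error is not None:
            raise self._error
        return self._result


class FileSizer:
    """Hypothetical API object exposing one method in three flavors."""

    def get_size(self, data):
        # Synchronous version: the implementation itself.
        return len(data)

    def get_size_async(self, data):
        # Asynchronous version: wrapped in a task and already started.
        return Task(self.get_size, data).run()

    def get_size_task(self, data):
        # Task version: wrapped in a task but not started.
        return Task(self.get_size, data)
```

In this sketch the caller of the task version decides when to call `run()`, which is what makes it possible to collect many unstarted tasks and start them together.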
![Page 28: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/28.jpg)
SAGA Implementation: Extensibility
Horizontal Extensibility – API Packages
• Current packages:
• file management, job management, remote procedure calls, replica management, data streaming
• Steering, information services, checkpoint…
Vertical Extensibility – Middleware Bindings
• Different adaptors for different middleware
• Set of ‘local’ adaptors
Extensibility for Optimization and Features
• Bulk optimization, modular design
![Page 29: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/29.jpg)
SAGA C++ quick tour
Open Source - released under the Boost Software License 1.0
Implemented as a set of libraries
• SAGA Core - A light-weight engine / runtime that dispatches calls from the API to the appropriate middleware adaptors
• SAGA functional packages - Groups of API calls for: jobs, files, service discovery, advert services, RPC, replicas, CPR, ... (extensible)
• SAGA language wrappers - Thin Python and C layers on top of the native C++ API
• SAGA middleware adaptors - Take care of the API call execution on the middleware
Can be configured / packaged to suit your individual needs!
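The engine/adaptor split can be sketched in a few lines of Python. This is a toy model under stated assumptions, not SAGA code: `Engine`, `ForkJobAdaptor`, `register` and `run_job` are hypothetical names, and selecting the adaptor by URL scheme is an assumption made for the example.

```python
import subprocess
import sys
from urllib.parse import urlparse


class ForkJobAdaptor:
    """Hypothetical 'local' adaptor: runs the job as a child process."""

    scheme = "fork"

    def run_job(self, host, argv):
        # The host is ignored here because a fork adaptor only runs locally.
        done = subprocess.run(argv, capture_output=True, text=True)
        return done.returncode, done.stdout


class Engine:
    """Hypothetical core engine: routes an API call to a matching adaptor."""

    def __init__(self):
        self._adaptors = {}

    def register(self, adaptor):
        self._adaptors[adaptor.scheme] = adaptor

    def run_job(self, url, argv):
        parsed = urlparse(url)
        adaptor = self._adaptors.get(parsed.scheme)
        if adaptor is None:
            raise LookupError(f"no adaptor for scheme {parsed.scheme!r}")
        return adaptor.run_job(parsed.hostname, argv)


if __name__ == "__main__":
    engine = Engine()
    engine.register(ForkJobAdaptor())
    print(engine.run_job("fork://localhost", [sys.executable, "-c", "print('hello')"]))
```

The application-facing call stays the same when another adaptor is registered for a different scheme; only the set of registered adaptors changes.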
![Page 30: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/30.jpg)
SAGA: Available Adaptors
Job Adaptors
• Fork (localhost), SSH, Condor, Globus GRAM2, OMII GridSAM, Amazon EC2, Platform LSF
File Adaptors
• Local FS, Globus GridFTP, Hadoop Distributed Filesystem (HDFS), CloudStore KFS, OpenCloud Sector-Sphere
Replica Adaptors
• PostgreSQL/SQLite3, Globus RLS
Advert Adaptors
• PostgreSQL/SQLite3, Hadoop HBase, Hypertable
![Page 31: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/31.jpg)
SAGA: Available Adaptors (2)
Other Adaptors
• Default RPC / Stream / SD
Planned Adaptors
• CURL file adaptor, gLite job adaptor (Ole), ...
Open issues
• We’re in the process of consolidating the adaptor code base and adding rigorous tests in order to improve adaptor quality
• The Capability Provider Interface (CPI - the ‘Adaptor API’) is not documented or standardized (yet), but looking at existing adaptor code should get you started if you want to develop your own adaptor.
![Page 32: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/32.jpg)
SAGA API: Towards a Standard
Standards promote Interoperability
The need for a standard programming interface
• Trade-off: “Go it alone” versus “Community” model
• Reinventing the wheel again, yet again, & then again
• MPI is a useful analogy of a community standard
• Vendors (Resource Providers), Software developers, users, ...
• Social/historic parallels also important
• Time to adoption, after specification ...
OGF the natural choice (SAGA-RG, SAGA-WG)
• Spin-off of the Applications Research Group
• Driven by UK, EU (German/Dutch), US
• Design derived from 23 Use Cases
• Different projects, applications and functionality
• Biological, coastal modelling, visualization
• Will discuss the advantage of SAGA as a standard specification
![Page 33: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/33.jpg)
![Page 34: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/34.jpg)
SAGA and Distributed Applications
![Page 35: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/35.jpg)
Understanding Distributed Applications: Implicitly vs. Explicitly Distributed?
Which approach (implicit vs. explicit) is used depends on:
• How is the application used?
• Is there a need to control/marshal more than one resource?
• Why are distributed resources being used?
• How much can be kept out of the application?
• Can’t predict in advance?
• Not obvious what to do; application-specific metric
If possible, applications should not be explicitly distributed
• GATEWAYS approach:
• Implicit for the end-users
• Supporting Applications? Or Application Usage Modes?
![Page 36: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/36.jpg)
Taxonomy of Distributed Application Development
Example of Distributed Execution Mode:
• Implicitly Distributed
• HTC of HPC: 1000 job submissions of NAMD on the TG/LONI
• SAGA shell example (cf. DESHL)
Example of Explicit Coordination and Distribution
• Explicitly Distributed
• DAG-based Workflows (example of Higher-level API)
• EnKF-HM application
Example of SAGA-based Frameworks
• Pilot-Jobs, Fault-tolerant Autonomic Framework
• MapReduce, All-Pairs
• Note: An application can belong to more than one type
![Page 37: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/37.jpg)
DNA Energy Levels: HTC of HPC (Bishop; Tulane)
![Page 38: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/38.jpg)
Montage: DAG-based Workflow Application Exemplar
![Page 39: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/39.jpg)
DAG-based Workflow Applications: Extensibility and Higher-level API
Application Development Phase
Generation & Exec. Planning Phase
Execution Phase
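The execution phase of a DAG-based workflow comes down to running each node only after its predecessors have finished. A minimal sketch using Python's standard `graphlib` module follows; the stage names in the usage note are placeholders, and the actions run in-process here, whereas a real workflow would submit them as jobs.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies, actions):
    """Run every node's action in dependency order and return that order.

    dependencies maps a node to the set of nodes it depends on;
    actions maps a node to a zero-argument callable.
    """
    order = []
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    while sorter.is_active():
        # Nodes returned together have no unmet dependencies, so a real
        # executor could dispatch them concurrently to different resources.
        for node in sorted(sorter.get_ready()):
            actions[node]()
            order.append(node)
            sorter.done(node)
    return order
```

For example, with `dependencies = {"diff": {"project"}, "add": {"diff"}}` and a no-op action for each of the three stages, `run_dag` visits `project`, then `diff`, then `add`.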
![Page 40: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/40.jpg)
SAGA-based DAG Execution: Preserving Performance
![Page 41: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/41.jpg)
Ensemble Kalman Filters: Heterogeneous Sub-Tasks
Ensemble Kalman filters (EnKF) are recursive filters for handling large, noisy data sets; here the EnKF is used for history matching and reservoir characterization
EnKF is a particularly interesting case of irregular, hard-to-predict run-time characteristics:
![Page 42: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/42.jpg)
Results: Scale-Out Performance
Using more machines decreases the time to completion (TTC) and the variation between experiments
Using BQP decreases the TTC and the variation between experiments further
Lowest time to completion achieved when using BQP and all available resources
Khamra & Jha, GMAC, ICAC’09
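How predicted queue waits can steer job placement is sketched below. The sketch assumes that BQP supplies a predicted queue wait per resource; the resource names, the numbers in the usage note and the greedy policy are illustrative assumptions, not the method or data of the cited paper.

```python
def assign_jobs(n_jobs, predicted_wait, runtime):
    """Greedily place jobs on the resource with the earliest estimated finish.

    predicted_wait: resource -> predicted queue wait (e.g. seconds)
    runtime:        resource -> estimated runtime of one job on that resource
    Returns (plan, estimated_ttc); plan lists the chosen resource per job.
    """
    # Time at which each resource could start its next job.
    available_at = dict(predicted_wait)
    plan = []
    for _ in range(n_jobs):
        # Ties are broken by resource name so the result is deterministic.
        best = min(available_at, key=lambda r: (available_at[r] + runtime[r], r))
        plan.append(best)
        available_at[best] += runtime[best]
    if not plan:
        return plan, 0.0
    # Estimated time to completion: latest finish among the resources used.
    return plan, max(available_at[r] for r in set(plan))
```

With `predicted_wait = {"site_a": 10, "site_b": 0}` and a runtime of 5 on both sites, three jobs are planned as `site_b`, `site_b`, `site_a`, with an estimated time to completion of 15.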
![Page 43: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/43.jpg)
Extreme Distribution: Frameworks?
• History match on a 1 million grid cell problem, with a thousand ensemble members
• The entire system will have a few billion degrees of freedom
• This will increase the need for scale-out, autonomy, fault tolerance, self-healing, etc.
![Page 46: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/46.jpg)
SAGA-based Frameworks: Types
Frameworks: Logical structure for Capturing Application Requirements, Characteristics & Patterns
• Runtime and/or Application Framework
Application Frameworks designed to support either:
• Pattern: Commonly recurring modes of computation
• Programming, Deployment, Execution, Data-access, ...
• MapReduce, Master-Worker, H-J Submission
• Abstraction: Mechanism to support patterns and application characteristics
Runtime Frameworks:
• Load-Balancing – Compute and Data Distribution
SAGA-based Framework: Infrastructure-independent
![Page 47: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/47.jpg)
SAGA-based Frameworks: Examples
SAGA-based Pilot-Job Framework (FAUST)
• Extend to support Load-balancing for multi-components
SAGA MapReduce Framework:
• Control the distribution of Tasks (workers)
• Master-Worker: File-Based &/or Stream-Based
• Data-locality optimization using SAGA’s replica API
SAGA NxM Framework:
• Compute Matrix Elements, each is a Task
• All-to-All Sequence comparison
• Control the distribution of Tasks and Data
• Data-locality optimization via external (runtime) module
![Page 48: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/48.jpg)
Abstractions for Dynamic Execution: Container Task
Adaptive:
• Type A: Fix the number of replicas; vary the cores assigned to each replica.
• Type B: Fix the size of each replica; vary the number of replicas (Cool Walking)
• Same temperature range (adaptive sampling)
• Greater temperature range (enhanced dynamics)
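The difference between the two adaptive types comes down to which quantity is held fixed when the number of available cores changes. A small arithmetic sketch follows; the function names and the even-split policy are assumptions made for illustration.

```python
def type_a_allocation(total_cores, n_replicas):
    """Type A: replica count is fixed; cores per replica follow the resource."""
    base, extra = divmod(total_cores, n_replicas)
    # Spread any leftover cores over the first replicas, one each.
    return [base + (1 if i < extra else 0) for i in range(n_replicas)]


def type_b_replica_count(total_cores, cores_per_replica):
    """Type B: replica size is fixed; the number of replicas follows the resource."""
    return total_cores // cores_per_replica
```

With 10 cores, Type A with 4 replicas gives allocations of 3, 3, 2 and 2 cores, while Type B with 4 cores per replica runs 2 replicas.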
![Page 49: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/49.jpg)
Abstractions for Dynamic Execution: SAGA Pilot-Job (BigJob)
![Page 50: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/50.jpg)
Coordinate Deployment & Scheduling of Multiple Pilot-Jobs
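The Pilot-Job idea, a container job that acquires resources once and then pulls sub-jobs from a shared queue, can be sketched with threads standing in for cores. This is not the BigJob API: `PilotJob` and `run_on_pilots` are hypothetical names, and a real pilot would be submitted through a batch system rather than started in-process.

```python
import queue
import threading


class PilotJob:
    """Hypothetical pilot: holds `cores` slots and runs sub-jobs inside them."""

    def __init__(self, name, cores):
        self.name = name
        self.cores = cores

    def _worker(self, tasks, results, lock):
        # Each slot keeps pulling sub-jobs until the shared queue is empty.
        while True:
            try:
                task_id, func, args = tasks.get_nowait()
            except queue.Empty:
                return
            value = func(*args)
            with lock:
                results[task_id] = (self.name, value)

    def start(self, tasks, results, lock):
        threads = [
            threading.Thread(target=self._worker, args=(tasks, results, lock))
            for _ in range(self.cores)
        ]
        for thread in threads:
            thread.start()
        return threads


def run_on_pilots(pilots, work):
    """Feed (func, args) items to several pilots through one shared queue."""
    tasks = queue.Queue()
    for task_id, (func, args) in enumerate(work):
        tasks.put((task_id, func, args))
    results, lock = {}, threading.Lock()
    threads = [t for pilot in pilots for t in pilot.start(tasks, results, lock)]
    for thread in threads:
        thread.join()
    return results
```

Because every pilot drains the same queue, a pilot that becomes active early or has more slots simply ends up running more sub-jobs, which gives a simple form of load balancing across pilots.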
![Page 51: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/51.jpg)
Distributed Adaptive Replica Exchange (DARE): Application Usage Mode (GridChem)
![Page 52: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/52.jpg)
GridChem -- Extensions (Joohyun Kim, LSU)
![Page 53: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/53.jpg)
Multi-Physics Runtime Frameworks: Extensibility
Coupled Multi-Physics requires two distinct, but concurrent simulations
Can co-scheduling be avoided?
• Adaptive execution model: Yes
Load-balancing required.
• Pilot-Job facilitates LB!
• Across sites? (open Q)
Multi-platform Pilot-Job:
• MPI-based TG – Condor
![Page 54: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/54.jpg)
Dynamic Execution: Reduced Time to Solution
![Page 55: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/55.jpg)
SAGA-MapReduce (Miceli, Jha et al., CCGrid’09; Merzky, Jha et al., GPC’09)
• Interoperability: Use multiple infrastructures concurrently
• Control the NW placement
• Simple staging of data
• SAGA-Sphere-Sector:
• Open Cloud Consortium
• Stream processing model
• Ongoing work
• Apply to all elements (files) in a data-set (stream)
Ts: Time-to-solution, including data-staging for SAGA-MapReduce (simple file-based mechanism)
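A simple file-based master-worker MapReduce can be sketched as follows. It only mirrors the pattern: a temporary local directory stands in for data staging, a thread pool stands in for remote workers, word counting is a placeholder workload, and none of the function names come from SAGA-MapReduce.

```python
import tempfile
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def map_chunk(path):
    """Worker: count the words in one staged input file."""
    return Counter(Path(path).read_text().split())


def reduce_counts(partials):
    """Master: merge the partial counts returned by the workers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(chunks, workers=2):
    """Stage each chunk as a file, map the files in parallel, then reduce."""
    with tempfile.TemporaryDirectory() as staging:
        paths = []
        for index, text in enumerate(chunks):
            path = Path(staging) / f"chunk-{index}.txt"
            path.write_text(text)  # simple file-based staging
            paths.append(path)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            partials = list(pool.map(map_chunk, paths))
    return reduce_counts(partials)
```

The `workers` parameter is the knob that corresponds to controlling the number and distribution of workers; in a distributed setting the staging step is where data placement would be decided.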
![Page 56: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/56.jpg)
Controlling Relative Compute-Data Placement
![Page 57: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/57.jpg)
SAGA All-Pairs: Runtime Data Placement
Classical: Place task on 4 LONI machines (512px Dell Clusters)
• Simple data staging
“Intelligent”: Map a task to a resource based upon Cost
• Cost = Data Dependency + transfer times (latency)
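The cost-based mapping can be sketched as a small function that charges a task for every input file it would have to transfer. The data structures, names and the tie-breaking rule are assumptions for illustration; the slide does not specify how the cost is actually computed.

```python
def placement_cost(task_files, resource, file_location, transfer_time):
    """Sum the transfer time of every input file not already on `resource`."""
    cost = 0.0
    for name in task_files:
        source = file_location[name]
        if source != resource:
            cost += transfer_time[(source, resource)]
    return cost


def map_tasks(tasks, resources, file_location, transfer_time):
    """Assign each task to the resource with the lowest placement cost.

    tasks:         task name -> list of input file names
    file_location: file name -> resource currently holding the file
    transfer_time: (source, destination) -> estimated transfer time
    """
    mapping = {}
    for task, files in tasks.items():
        # Ties are broken by resource name so the mapping is deterministic.
        mapping[task] = min(
            resources,
            key=lambda r: (placement_cost(files, r, file_location, transfer_time), r),
        )
    return mapping
```

A task whose inputs all live on one resource is mapped there at zero cost, while a task with inputs spread across resources goes wherever the total transfer time is smallest.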
“Ignoring Intelligent mapping is no longer an option”
• Quote: Miceli (undergraduate)
[Figure: bar chart comparing Classical and Intelligent placement, with bars for Processing Time and "Intelligence" Time; vertical axis from 0 to 600]
![Page 58: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/58.jpg)
Distributed Data Intensive Applications: Research Challenges
Goal: Develop DDI scientific applications that utilize a broad range of distributed systems, without vendor lock-in or disruption, yet with the flexibility and performance that scientific applications demand.
• Frameworks as possible solutions
Frameworks address some primary challenges in developing Distributed DI Applications
• Coordination of distributed data & computing
• Runtime (Dynamic) scheduling, placement
• Fault-tolerance
Many challenges in developing such Frameworks:
• What are the components? How are they coupled? How is functionality expressed/exposed? Coordination?
• Layering, Ordering, Encapsulations of Components
“Just because you can’t use MPI (on distributed systems), doesn’t mean you can’t use other approaches”
![Page 59: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/59.jpg)
Frameworks: Logical ordering
![Page 60: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/60.jpg)
Understanding Distributed Applications: Development Objectives Redux
Interoperability: The ability to work across multiple distributed resources
• SAGA: Middleware Agnostic
Distributed Scale-Out: The ability to utilize multiple distributed resources concurrently
• Support Multiple Pilot-Jobs: Ranger, Abe, QB
Extensibility: Support for new patterns/abstractions, different programming systems, functionality & infrastructure
• Pilot-Job also Coupled CFD-MD, Integrated BQP
Adaptivity: Response to fluctuations in dynamic resources and the availability of dynamic data
Simplicity: Accommodate the above distributed concerns at different levels easily…
![Page 61: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/61.jpg)
Does SAGA Provide A Fresh Perspective?
![Page 62: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/62.jpg)
SAGA: Building the Abstractions to Bridge the Infrastructure-Applications Gap
Focus on Application Development and Characteristics, not infrastructure details
![Page 63: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/63.jpg)
SAGA-based Tools and Projects: Advantage of Standards
JSAGA from IN2P3 (Lyon)
• http://grid.in2p3.fr/jsaga/index.html
• Slides Ack: Sylvain Renaud
GANGA-DIANE-gLite (EGEE)
• http://faust.cct.lsu.edu/trac/saga/wiki/Applications/GangaSAGA
• Slides Ack: Jakub Moscicki, Massimo L, O. Weidner
NAREGI/KEK (Active)
• http://www.ogf.org/OGF27/materials/1767/OGF27_SAGA_KEK.pdf
DEISA
• DEISA-based Shell and Workflow library (http://www.fz-juelich.de/nic-series/volume38/pringle.pdf)
• http://deisa-jra7.forge.nesc.ac.uk/ and http://www.ogf.org/OGF19/materials/501/SAGA-DEISA.ppt
XtreemOS
• http://saga.cct.lsu.edu/index.php?option=com_content&task=view&id=95&Itemid=174
Service Discovery (SD) Specification
• (with gLite bindings; extended to NAREGI)
![Page 64: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/64.jpg)
JSAGA: Implementer and user of SAGA
JSAGA uses SAGA in a module that hides the heterogeneity of grid infrastructures
JSAGA implements SAGA to hide the heterogeneity of middleware
[Diagram, top to bottom: Applications; jobs collection (JSAGA); SAGA; core engine + plug-ins (JSAGA); legacy APIs]
![Page 65: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/65.jpg)
JSAGA
Projects using JSAGA
Elis@
• a web portal for submitting jobs to industrial and research grid infrastructures
SimExplorer
• a set of tools for managing simulation experiments
• includes a workflow engine that submits jobs to heterogeneous distributed computing resources
JJS
• a tool for efficiently running short-lived jobs on EGEE
JUX
• a multi-protocol file browser
![Page 66: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/66.jpg)
Ganga Integration
![Page 67: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/67.jpg)
DIANE Integration
[Figure: DIANE without SAGA vs. DIANE with SAGA]
DIANE is an execution manager with support for pilot-jobs + worker agents (IDEAS Redux)
![Page 68: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/68.jpg)
Lattice-QCD Applications on heterogeneous resources
[Diagram: a Master handles agent scheduling and payload distribution; heterogeneous resource allocation (Ganga + Ganga/SAGA) through Ganga/gLite, Ganga/SAGA (to TeraGrid) and Ganga/SAGA (to *)]
Application-aware (and resource-aware) scheduling
Federating resources! EGEE Conference (Sep’09) (Not in this demo: cloud resources, additional Grid infrastructures…)
![Page 69: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/69.jpg)
RENKEI Project Aims
Middleware-independent service & application
[Diagram: services and applications (Svc, Apps) use the SAGA framework through the C++ interface and the Python binding; the SAGA engine loads SAGA adaptors (Adpt) for gLite, NAREGI, SRB, iRODS, Cloud and LRMS (LSF/PBS/SGE/…); RNS is labelled “Yet Another FC service based on OGF standard”; a HEP library is also shown; sites: KEK, Osaka Univ., Tsukuba Univ.]
This activity is funded by MEXT as a part of the RENKEI project, which develops seamless linkage of resources in the Grids and local resources for e-Science.
![Page 70: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/70.jpg)
![Page 71: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/71.jpg)
![Page 72: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/72.jpg)
![Page 73: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/73.jpg)
SAGA: Europe/EGEE – EGI
TAMAS: Team to Assist Porting Applications to e-Science infrastructures
• http://indico.cern.ch/getFile.py/access?contribId=4&sessionId=1&resId=1&materialId=slides&confId=72253
SAGA – gLite-UNICORE-ARC and SD (gLite)
• Abstract at EGEE Users Forum (Uppsala, Apr’10)
University of Western England
• Visualisation and workflows
Other projects..
![Page 74: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/74.jpg)
![Page 75: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/75.jpg)
Summary
SAGA provides the basic abstractions:
• The building blocks upon which to construct “consistent” higher levels of functionality and abstractions
• Core feature set
• Extensible and growing; many externally driven (e.g., SD)
• Meets the need for a broad spectrum of applications
Used for Applications, Tools and Service Layers
• The SAGA standard is interesting for Tools and Service-layer developers
• Provides a fresh perspective on developing distributed applications and CI
• IDEAS (Interoperability, Distributed Scale-Out, Extensibility, Adaptivity, Simplicity)
![Page 76: Shantenu Jha for the SAGA Team SAGA: An Overview](https://reader035.vdocuments.net/reader035/viewer/2022062309/56649e0d5503460f94af6e60/html5/thumbnails/76.jpg)
Acknowledgements
SAGA Team and DPA Team, and the UK EPSRC (DPA, OMII-UK, OMII-UK PAL), NSF (HPCOPS, Cybertools) and LA-BOR
People:
SAGA D&D: Hartmut Kaiser, Ole Weidner, Andre Merzky, Joohyun Kim, Lukasz Lacinski, João Abecasis, Chris Miceli, Bety Rodriguez-Milla
SAGA Users: Andre Luckow, Yaakoub el-Khamra, Kate Stamou, Cybertools (Abhinav Thota, Jeff, N. Kim), Owain Kenway
Google SoC: Michael Miceli, Saurabh Sehgal, Miklos Erdelyi
Collaborators and Contributors: Steve Fisher & Group, Sylvain Renaud (JSAGA), Go Iwai & Yoshiyuki Watase (KEK)
DPA: Dan Katz, Murray Cole, Manish Parashar, Omer Rana, Jon Weissman