Building Problem Solving Environments with Application Web Service Toolkits
Choonhan Youn and Marlon Pierce
Computer Science, Syracuse University
and Community Grid Labs, Indiana University
Presentation Outline
• Introduction
  – What is a computational web portal?
  – Gateway: a computing web portal
  – Limitations of the traditional approach
• Web Service-Based Computing Portal Architecture
• Core Web services for Computing Portals
  – Job submission
  – File manipulation
  – Context management
  – Script generation
  – Job monitoring
• Application Web services
• Web service negotiation
• Conclusion
Computational Web Portals
• Computational Web portals provide seamless access to HPC resources.
  – You can log in from anywhere through any general-purpose web browser.
• Portals simplify the use of HPC systems for novice users.
  – Basics: batch script generation, job submission and monitoring, file service and ……
  – Computational grid services: Globus, Condor
• Portals can simplify the use of unfamiliar codes.
  – GEM codes: disloc, simplex
• Portals provide a work management environment for all users.
  – You can see what you did last week.
• Other PSE Web portals
  – NASA Information Power Grid LaunchPad
  – NPACI Hotpage
  – Pacific Northwest National Laboratory’s Ecce system
  – UNICORE
  – Our own Gateway/ServoGrid projects
Gateway project
• Gateway is a computational web portal project funded through:
  – DoD HPC MO PET Portal: Kerberos security in a computational web portal
  – GEM science: support for codes developed by the earthquake modeling consortium
  – Alliance: contributions to the NCSA portal
  – SciDAC (Scientific Discovery through Advanced Computing): DOE project to build portal services for plasma physics
• Our goal is to provide building-block components that can be used to build specific portals.
• We also develop browser-based interfaces for basic services and specific science codes.
• Gateway was developed to support typical, if simple, high performance computing services.
  – Batch script generation, job submission and monitoring, file management and transfer
  – Do it all securely.
Problems with Traditional Portal Architecture
• Portals access heterogeneous back ends and grids through a particular middle tier.
• Most portal projects are not interoperable.
  – Middle-tier software is incompatible.
  – A wide range of protocols is in use.
• Why do we need portal interoperability?
  – Portal developers don’t have to reinvent every single important service (lesson from GGF GCE).
  – Users will have access to more services than any one project can provide.
  – Users will be able to pick the best available implementation of a service.
[Figure: two portal stacks side by side, each made up of a Web browser, services, and back end resources, with a “?” between the two stacks.]
Web Service-Based Computing Portal Architecture
[Figure: architecture diagram. Labeled components: Web Browser; User Interface Server (SOAP Client, Repository Client); Portal Server (CM, SG, AWS), marked HIS; Service Repository; Web Services Provider; two Middle Tier (Web Server) boxes marked HSS, one holding a Simulation Component (JS, JM, FT) in front of HPC back end resources and one holding a Data Component (FT, JS, JM) in front of a Data Base. Connections are labeled HTTP, SOAP, and Publish.]
Legend:
JS: Job Submission
JM: Job Monitoring
FT: File Transfer
CM: Context Manager
SG: Script Generation
AWS: Application Web Service
HIS: Host-Independent Service
HSS: Host-Specific Service
Core Web services - 1
• Given WSDL and SOAP, what can you build?
• Host-Specific Services (HSS)
  – Instances of these services are bound to particular hosts.
  – Job Submission
  – File Transfer
  – Job & Host Monitoring
• Host-Independent Services (HIS)
  – Informational services that are not tied to specific service points
  – The service provided does not depend on the location.
  – Context Management
  – Script Generation
• These core services are simple and stateless.
Core Web services - 2
• Job Submission
  – Allows users to execute scientific applications.
  – May execute operating system calls directly or may interact with Grid services through, for example, the CoG client API to Globus.
  – We use Java Runtime processes to run external (non-Java) commands, for example, PBS qsub.
• File Manipulation
  – Lets users upload and download files between their desktops and various backend destinations.
  – Allows users to transparently move, rename, and copy files on remote back ends and crossload between different backend sites.
  – The file uploading and downloading services illustrate the use of SOAP messages with attachments in the RPC messaging style.
  – SOAP attachments are non-XML files that are appended to the SOAP message and are useful for sending binary data and files with known MIME formats.
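The job submission slide says external commands such as PBS qsub are launched from Java Runtime processes. The sketch below shows the same launch-and-capture pattern in Python for brevity; it is not the authors' Java code. The helper names `run_external` and `qsub_command` are hypothetical, and `qsub` is assumed to be on the PATH of the submission host.

```python
import subprocess
import sys


def run_external(argv, timeout=60):
    """Run an external command and return (exit_code, stdout, stderr)."""
    completed = subprocess.run(
        argv,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


def qsub_command(script_path):
    """Build the argument list for submitting a script with PBS qsub."""
    return ["qsub", script_path]


if __name__ == "__main__":
    # qsub may not be installed here, so the demo runs the Python interpreter instead.
    code, out, err = run_external([sys.executable, "-c", "print('submitted')"])
    print(code, out.strip())
```

Passing the command as an argument list rather than a shell string avoids shell quoting problems when file names come from user input.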
Core Web services - 3
• Context Management (CM)
  – Archives interactions with the computational portal and stores all of the metadata associated with user sessions.
  – Provides the simplest possible data model.
    • CM provides an easy interface to an arbitrarily deep and complex tree-shaped data structure.
    • Context data nodes are defined by a recursive schema that holds optional, unbounded name/value pairs and child nodes.
  – We use CM to store locations of job scripts, miscellaneous file URIs, users’ application instance XML files, etc.
  – CM metadata is stored on file systems, XML-native databases, ….
    • Actual data may be anywhere.
  – Actual service interface for manipulating contexts and the context data
    • Add one or more contexts.
    • Search and store the context data with XPath queries.
    • Remove the specified context.
    • List the child contexts.
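The context data model above (a tree whose nodes hold name/value pairs and child nodes, searched with XPath) can be sketched in a few lines. This is an in-memory illustration only, not the Gateway service: the class name `ContextStore`, the element names `context` and `property`, and the method names are all hypothetical. Queries use the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


class ContextStore:
    """In-memory context tree; each context holds name/value pairs and child contexts."""

    def __init__(self):
        self.root = ET.Element("context", name="root")

    def _find(self, path):
        # path is a list of context names from the root, e.g. ["cyoun", "simplex"].
        # Names containing quote characters are not handled in this sketch.
        node = self.root
        for name in path:
            node = node.find(f"./context[@name='{name}']")
            if node is None:
                raise KeyError("/".join(path))
        return node

    def add_context(self, path):
        """Create every missing context along the path and return the last one."""
        node = self.root
        for name in path:
            child = node.find(f"./context[@name='{name}']")
            if child is None:
                child = ET.SubElement(node, "context", name=name)
            node = child
        return node

    def set_property(self, path, key, value):
        """Store one name/value pair on the context at path."""
        node = self._find(path)
        prop = node.find(f"./property[@name='{key}']")
        if prop is None:
            prop = ET.SubElement(node, "property", name=key)
        prop.set("value", value)

    def query(self, xpath):
        """Return the values of all property elements matched by xpath."""
        return [p.get("value") for p in self.root.findall(xpath)]

    def list_children(self, path):
        """Return the names of the direct child contexts."""
        return [c.get("name") for c in self._find(path).findall("./context")]

    def remove_context(self, path):
        """Remove the context at path together with its subtree."""
        self._find(path[:-1]).remove(self._find(path))
```

A short usage example, with made-up user and run names:

```python
store = ContextStore()
store.add_context(["cyoun", "simplex", "run1"])
store.set_property(["cyoun", "simplex", "run1"], "script", "file:///tmp/run1.pbs")
print(store.query(".//context[@name='run1']/property[@name='script']"))
```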
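Context Manager Architecture
[Figure: a Client reaches the Axis Servlet over SOAP/HTTP. The servlet exposes the ContextManager through a shared WSDL interface. The ContextManager uses internal communication to reach the context data, stored in a file system (FS) or an XML database (XMLDB).]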
Core Web services - 4
• Script Generation
  – For users who are unfamiliar with HPC systems.
  – The choices a user makes while interacting with the portal are stored as the user’s application instance XML document.
  – Generates the job script, which can be broken down into two parts: a queue script for a particular queuing system such as PBS, LSF, or LoadLeveler, and a user script for running the application code.
• Job monitoring
  – Built using a polling method.
  – Monitors the execution of a job running in a queuing system.
  – Given the user name and the type of queuing system as input parameters to the job monitoring method, returns an array of a generated WSDL complex type, effectively an XML data object that contains the job status reported by the scheduler.
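The two-part job script described under Script Generation can be illustrated with a small generator. This is a sketch under assumptions: a plain dictionary stands in for the application instance XML document, the field names are hypothetical, and only PBS is covered. The `#PBS -N` and `#PBS -l` directives are standard PBS options, though the resource keywords accepted vary between installations.

```python
def queue_script_pbs(job_name, nodes, cpus_per_node, memory, walltime):
    """Return the queue part of the script as PBS directive lines."""
    return [
        "#!/bin/sh",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={cpus_per_node}",
        f"#PBS -l mem={memory}",
        f"#PBS -l walltime={walltime}",
    ]


def user_script(workdir, executable, arguments):
    """Return the user part: the shell lines that run the application code."""
    return [f"cd {workdir}", " ".join([executable, *arguments])]


def generate_job_script(instance):
    """Combine the queue part and the user part into one script."""
    queue = instance["queue"]
    lines = queue_script_pbs(
        instance["name"],
        queue["nodes"],
        queue["cpus_per_node"],
        queue["memory"],
        queue["walltime"],
    )
    lines += user_script(
        instance["workdir"], instance["executable"], instance["arguments"]
    )
    return "\n".join(lines) + "\n"


# Hypothetical user choices; paths and sizes are placeholders.
EXAMPLE_INSTANCE = {
    "name": "simplex_run1",
    "workdir": "/scratch/demo",
    "executable": "./simplex",
    "arguments": ["input.dat"],
    "queue": {
        "nodes": 1,
        "cpus_per_node": 4,
        "memory": "512mb",
        "walltime": "00:30:00",
    },
}

if __name__ == "__main__":
    print(generate_job_script(EXAMPLE_INSTANCE))
```

Keeping the queue part in its own function means support for LSF or LoadLeveler could be added by swapping that one function while the user part stays unchanged.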
File manipulation service
[Screenshot: lists the user’s files on the selected host, Solar. File operations include upload, download, copy, rename, and crossload.]
Job monitoring service
[Screenshot: lists the user’s job status on the selected host, Solar, which is running the PBS queuing system.]
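The polling approach used by the job monitoring service can be sketched as follows. Everything here is illustrative: the `JobStatus` record stands in for the generated WSDL complex type, the sample text imitates the common default column layout of PBS `qstat` (real output differs between scheduler versions), and the `fetch` callable is a hypothetical hook that a real service would implement by running the scheduler's status command.

```python
import time
from dataclasses import dataclass


@dataclass
class JobStatus:
    job_id: str
    name: str
    user: str
    state: str  # single-letter scheduler state, e.g. "Q" queued, "R" running


# Illustrative text only; not captured from a real scheduler.
SAMPLE_QSTAT = """\
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
1234.solar        simplex          cyoun             00:01:02 R workq
1235.solar        disloc           other             00:00:00 Q workq
"""


def parse_qstat(text, user):
    """Return the JobStatus records that belong to the given user."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have six columns; the header has more, and the
        # separator row consists only of dashes.
        if len(fields) != 6 or set(fields[0]) == {"-"}:
            continue
        job_id, name, owner, _time_used, state, _queue = fields
        if owner == user:
            jobs.append(JobStatus(job_id, name, owner, state))
    return jobs


def poll(fetch, user, interval=0.0, max_polls=10):
    """Call fetch() repeatedly until the user has no jobs left or max_polls is hit."""
    history = []
    for _ in range(max_polls):
        jobs = parse_qstat(fetch(), user)
        history.append(jobs)
        if not jobs:
            break
        time.sleep(interval)
    return history
```

Injecting `fetch` keeps the polling loop independent of the queuing system, so the same loop could serve PBS, LSF, or LoadLeveler with different parsers.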
Application Web Services (AWS)
• Application: specifically, some code developed by the scientific community.
  – Examples: finite element codes, grid generation codes, and so on.
• AWS are designed to turn scientific applications (e.g., earthquake modeling codes) into Grid resources.
• An actual application is wrapped by a Java program.
• We need a meaningful metadata model for applications.
  – Describe application-specific requirements.
  – Describe bindings of applications to host environments and to Web services in a general way that is independent of the particular portal.
• Running a scientific application involves several core Web services.
  – Get files to the right place, generate script submission instructions, submit the job, get notified at various states.
AWS Lifecycle
• Applications can exist in four states:
  – Abstract: describes the optional choices and configurations that are available.
  – Ready: specific choices have been made.
  – Submitted: the application is running.
  – Completed: the application is finished, but we need to archive information about it.
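The four lifecycle states can be modeled as a small state machine. This is a minimal sketch: the class and attribute names are hypothetical, and the strictly linear order of transitions is an assumption of the sketch rather than something the slides state.

```python
from enum import Enum


class AppState(Enum):
    ABSTRACT = "abstract"
    READY = "ready"
    SUBMITTED = "submitted"
    COMPLETED = "completed"


# Allowed forward transitions (assumed to be strictly linear).
TRANSITIONS = {
    AppState.ABSTRACT: {AppState.READY},
    AppState.READY: {AppState.SUBMITTED},
    AppState.SUBMITTED: {AppState.COMPLETED},
    AppState.COMPLETED: set(),
}


class ApplicationInstance:
    """Tracks which lifecycle state one application run is in."""

    def __init__(self, name):
        self.name = name
        self.state = AppState.ABSTRACT
        self.history = [AppState.ABSTRACT]  # kept so the run can be archived later

    def advance(self, new_state):
        """Move to new_state, rejecting transitions that skip or reverse states."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.state = new_state
        self.history.append(new_state)
```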
AWS Schema Structure
• Two sets of XML schema:
  – Application Descriptors:
    • describe the abstract state.
    • describe application options. Used by the application developer to deploy his/her service into the portal.
  – Application Instance Descriptors:
    • describe particular instance states (ready, running, archived).
    • describe particular user choices and archive them for later browsing and resubmission.
• Schema sets are arranged hierarchically.
  – Applications contain hosts.
  – Schemas are designed to be pluggable.
    • Don’t like my queue description schema? Plug in your own.
AWS XML Descriptors
• Application description schema
  – A “basic information” element that contains information such as application name, version, and option flags.
  – An “internal communication” element that contains child elements describing input, output, and error fields for the code.
  – An “execution environment” element that contains a list of core services needed to execute the application.
  – An optional, generic parameter to hold arbitrary information about the application.
• Host description schema
  – Contains information about the resource, such as DNS name and IP address.
  – Contains all of the information needed to invoke the parent application on that resource, such as the location of the executable, the location of the workspace or scratch directory, and so on.
• Queue description schema
  – Contains information needed to perform queue submissions, such as memory size, number of CPUs, and so on (in the case of PBS).
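To make the nesting of the three descriptor schemas concrete, the sketch below builds one descriptor in which the application contains a host and the host contains a queue description. The element names (`basicInformation`, `internalCommunication`, `executionEnvironment`, and so on) are invented for illustration and are not the actual Gateway schema; the host name and paths are placeholders.

```python
import xml.etree.ElementTree as ET


def build_descriptor(app, host, queue):
    """Build an application descriptor with nested host and queue descriptions."""
    root = ET.Element("application")

    basic = ET.SubElement(root, "basicInformation")
    ET.SubElement(basic, "name").text = app["name"]
    ET.SubElement(basic, "version").text = app["version"]

    comm = ET.SubElement(root, "internalCommunication")
    for field in ("input", "output", "error"):
        ET.SubElement(comm, field).text = app[field]

    env = ET.SubElement(root, "executionEnvironment")
    for service in app["services"]:
        ET.SubElement(env, "coreService").text = service

    # Applications contain hosts; each host carries its own queue description.
    host_el = ET.SubElement(root, "host")
    ET.SubElement(host_el, "dnsName").text = host["dns_name"]
    ET.SubElement(host_el, "executable").text = host["executable"]
    ET.SubElement(host_el, "workDir").text = host["work_dir"]

    queue_el = ET.SubElement(host_el, "queue", type=queue["type"])
    ET.SubElement(queue_el, "memory").text = queue["memory"]
    ET.SubElement(queue_el, "cpus").text = str(queue["cpus"])
    return root


if __name__ == "__main__":
    descriptor = build_descriptor(
        app={
            "name": "simplex",
            "version": "1.0",
            "input": "input.dat",
            "output": "output.dat",
            "error": "error.log",
            "services": ["FileTransfer", "ScriptGeneration", "JobSubmission"],
        },
        host={
            "dns_name": "solar.example.org",
            "executable": "/usr/local/bin/simplex",
            "work_dir": "/scratch",
        },
        queue={"type": "PBS", "memory": "512mb", "cpus": 4},
    )
    print(ET.tostring(descriptor, encoding="unicode"))
```

Because the queue description is a self-contained subtree, a different queue schema could be substituted without touching the application or host parts, which is the pluggability the slides describe.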
[Screenshot: example of deploying an application code, Simplex, on a particular host as a service. This form is used to edit the Application XML descriptor file.]
[Screenshot: sample generated user view of the application code Simplex. This form is generated from the Application XML descriptor for a particular application run: the input files used, the location of the output, the resources used for the computation, etc.]
Portal Stack
• Core services provide the basic connection to back end “Grid” services.
• Application services combine core services and application metadata.
• User interface portlets are built for each service.
• Portals aggregate portlet components into portals.
[Figure: portal stack with layers labeled Core Web Services, Application Web Services and Workflow, User Interfaces, and Aggregate Portals; a side label reads “Message Security, Information”.]
Portlets for User Interface Components
• Web services define XML interfaces for accessing services.
• User interface components (such as JSPs) combine service stubs into useful objects for human interaction.
• So we actually have two points of interoperability:
  – At the WSDL interface
  – At the user interface
• Portlets combine HTML (and other) user interfaces into aggregate portal interfaces.
  – Example: Jetspeed from Jakarta
Reliability of Distributed Services
• Distributed service systems have some important reliability problems.
  – Information must be up to date.
    • The system must adjust when servers become available or unavailable.
    • Service metadata should match the actual capabilities of the system.
  – Messages should reach the services.
• We are automating application service metadata through publish/subscribe mechanisms.
  – Servers contain embedded publisher/subscriber clients.
  – Information aggregators publish requests for information to JMS-style brokers.
  – All available servers subscribed to the request topic publish their information back to the aggregator.
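The request and reply pattern described above can be sketched with an in-process broker. This is a toy model under stated assumptions, not JMS or the authors' implementation: delivery is synchronous, the topic names `info.request` and `info.reply` are invented, and the class names are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Minimal topic-based broker with synchronous delivery."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self._subscribers[topic]):
            callback(message)


class Server:
    """A service host with an embedded publisher/subscriber client."""

    def __init__(self, name, services, broker):
        self.name = name
        self.services = services
        self.broker = broker
        self.available = True
        broker.subscribe("info.request", self.on_request)

    def on_request(self, message):
        # Only servers that are currently up answer the request.
        if self.available:
            self.broker.publish(
                message["reply_to"],
                {"server": self.name, "services": self.services},
            )


class Aggregator:
    """Collects service metadata from whichever servers answer."""

    def __init__(self, broker):
        self.broker = broker
        self.known = {}
        broker.subscribe("info.reply", self.on_reply)

    def on_reply(self, message):
        self.known[message["server"]] = message["services"]

    def refresh(self):
        """Discard stale metadata and ask all subscribed servers to report again."""
        self.known = {}
        self.broker.publish("info.request", {"reply_to": "info.reply"})
        return self.known
```

Since the aggregator rebuilds its view on every refresh, a server that goes down simply stops appearing, which keeps the published metadata in line with what the system can actually do.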
Bridging Between Client-Server and Messaging Services
[Figure: labeled components are a Browser, a Dynamic User Interface Component, an Aggregator, a Broker, and five Tomcat Servers. Connections are labeled HTTP and SOAP. Annotations: “Web service request for information”, “Servers run Narada Notifiers”, “Peers register themselves to Aggregator”.]
Conclusions
• Traditional portals have “stovepipes” with interoperability problems.
• By designing and implementing several core portal services and Application Web Services around Web services, we gain interoperability and reusability.
• The emphasis is on the development of reusable services that can form the basis for multiple PSEs.
• The portal developer can construct specific implementations and composites of primitive service components and can also provide services that may be shared among different portals.
• Application-specific services and data models can be used to encapsulate entire applications independently of the portal implementation.
• User interfaces to application services become distributed portlets.
• Everything is distributed.
  – Core Web Services -> Application Web Services -> User Interface Portlets -> Portals
  – Uses HTTP, SOAP, WSDL, ….
• It all has to be secured.
  – A flexible, message-based security system that can be bound to multiple mechanisms and multiple message formats.
  – The general approach: use assertions.
  – SAML, WS-Security
  – Kerberos, PKI