The Open Grid Computing Environments Project Marlon Pierce Community Grids Laboratory Indiana University

Upload: audrey-harrison

Post on 20-Jan-2016


The Open Grid Computing Environments Project

Marlon Pierce

Community Grids Laboratory

Indiana University


Acknowledgements

Funding from NSF NMI (2003-2007) and OCI SDCI (2007-2010).

Current participants: Indiana University (Pierce, Gannon), RENCI (Kandaswamy), RIT (von Laszewski), SDSC (Wilkins-Diehr), SDSU (Thomas, Edwards), TACC (Dahan).


Outline

Web Portals and Science Gateways

OGCE efforts

OGCE Portal Software: portal tools, Java CoG, GTLAB

OGCE Gateway Services: GFAC, GPIR

Software Engineering Issues

What is next?


OGCE Goals

To provide easily installable, well-tested software for building the Web client and service components that constitute a Grid Computing Environment. Science Web Portal --> GCE --> Science Gateway

To support developing groups through training, outreach, and divine intervention. Gateways have many needs that can't be solved by downloadable software alone.


What Is a Web Portal?

Aggregate content from multiple sources into a single display.

Typically consume RSS/Atom news feeds.

More powerful versions these days support Flickr, calendars, games, etc. (gadgets, widgets).

Examples: iGoogle, Netvibes, My Yahoo!


Science Portals and Gateways

Science portals resemble standard portals, but must also:

Support access to computing and storage resources.

Allow users remote, Unix-like access to these resources.

Provide access to science applications and data sets.

So security is crucial.

And we must provide value-added services as well as user interfaces.


A Comprehensive Gateway Architecture

[Figure: gateway architecture diagram. Labels: User's Browser; Workflow Composer (User's Desktop); Grid Portal Server; Gateway Services (Security Services, Workflow/Application Execution Engine, Application Resource Catalogs, User Data & Metadata Catalogs); Globus-TeraGrid "OGSA-Like" Services (Data Services, Information Services, Job Management, Resource Broker and Scheduling Services, Security Services).]


Components for Science Portals

OGCE is founded on the principle that portals should be built out of reusable parts.

Key standard in our first phase: the JSR 168 portlet specification.

Portlets can run in multiple containers: uPortal, Sakai, GridSphere, Liferay, etc.

This allows us to build Grid-specific components and deploy them alongside other goodies: Sakai collaboration tools, contributed portlets, etc.


OGCE Portal Software


The OGCE GPIR portlet can interoperate with TeraGrid and your own GPIR services.


Manage TeraGrid MyProxy credentials with the OGCE ProxyManager portlets.


OGCE file management client portlets interact with TeraGrid GridFTP servers.


General-purpose batch and interactive job submission to GRAM and WS-GRAM is supported.


Dashboard Portlet


The dashboard portlet allows users to track jobs on the selected resource. The user can view either his own set of jobs or get information on all submitted jobs.


Queue forecasting portlets work with the NWS QBETS to predict wait times and deadlines.


PURSe portlets manage user requests for portal accounts and Grid credentials.


Condor and Condor-G


OGCE IFrame Portlet can be used to integrate external sites.


Building Your Own Grid Portlets


Coding Portlets

Portlets are just servlet-like Java classes.

Basic API key methods: doView(), processAction().

These are coupled to JSP pages (typically) through tag libraries and request dispatchers. OGCE also supports Velocity portlets.

So we must provide the coding logic for processAction().

CoG abstraction layers provide this.
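The split between processAction() and doView() can be illustrated with a small stand-in. This is only a sketch of the two-phase idea: MiniJobPortlet and its Map-based arguments are hypothetical simplifications, whereas a real JSR 168 portlet would extend javax.portlet.GenericPortlet and receive ActionRequest/ActionResponse and RenderRequest/RenderResponse objects from the container.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a JSR 168 portlet (not the javax.portlet API).
class MiniJobPortlet {
    // State handed from the action phase to the next render phase.
    private final Map<String, String> renderParams = new HashMap<>();

    // Action phase: runs once per form submission; this is where Grid logic would go.
    void processAction(Map<String, String> form) {
        String executable = form.get("executable");
        if (executable == null || executable.trim().isEmpty()) {
            renderParams.put("message", "No executable given");
        } else {
            // A real portlet would call the CoG abstraction layer here.
            renderParams.put("message", "Submitted " + executable.trim());
        }
    }

    // Render phase: may run many times; returns only a markup fragment.
    // HTML escaping is omitted to keep the sketch short.
    String doView() {
        return "<p>" + renderParams.getOrDefault("message", "No job submitted yet") + "</p>";
    }

    public static void main(String[] args) {
        MiniJobPortlet portlet = new MiniJobPortlet();
        System.out.println(portlet.doView());
        Map<String, String> form = new HashMap<>();
        form.put("executable", "/bin/ls");
        portlet.processAction(form);
        System.out.println(portlet.doView());
    }
}
```

The point of the separation is that rendering has no side effects, so the container can redraw the page freely without resubmitting a job.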


CoG Abstraction Layer

[Figure: Java CoG Kit layer diagram. Labels: Applications (Nanomaterials, Bio-Informatics, Disaster Management, Portals); Development Support; CoG Gridfaces Layer; GridIDE; CoG Data and Task Management Layer; CoG Abstraction Layer; providers GT2, GT3(X), GT4 WS-RF, Condor, Unicore, SSH, Others.]


[Figure: CoG class diagram showing Task, TaskHandler, Service, TaskSpecification, SecurityContext, and ServiceContact.]

The class diagram is the same for all grid tasks (running jobs, modifying files, moving data).

Classes also abstract toolkit provider differences. You set these as parameters: GT2, GT4, etc.


Task and Specification

Task task = new TaskImpl("mytask", Task.JOB_SUBMISSION);
task.setProvider("GT2");

JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("rm");
spec.setBatchJob(true);
spec.setArguments("-r");
…
task.setSpecification(spec);


Service and Security Context

Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");

SecurityContext securityContext = CoreFactory.newSecurityContext("GT2");
// Use cred object from ProxyManager
securityContext.setCredentials(cred);
service.setSecurityContext((SecurityContext) securityContext);


Service Contact and Submit

ServiceContact serviceContact = new ServiceContact("myhost.myorg.org");
service.setServiceContact(serviceContact);
task.setService(Service.JOB_SUBMISSION_SERVICE, service);

TaskHandler handler = new GenericTaskHandler();
handler.submit(task);


Coupling CoG Tasks

The CoG abstractions also simplify creating coupled tasks.

Tasks can be assembled into task graphs with dependencies: "Do Task B after successful Task A."

Graphs can be nested.
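The "do Task B after successful Task A" rule, including nesting, can be sketched in plain Java. The class below is a hypothetical stand-in written for illustration only; it is not the Java CoG Kit task graph API, and the names SimpleTaskGraph, addTask, and addDependency are assumptions.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a task graph; this is NOT the Java CoG Kit API.
class SimpleTaskGraph {
    // Task name -> body; the body returns true when the task succeeds.
    private final Map<String, BooleanSupplier> tasks = new LinkedHashMap<>();
    // Task name -> names of the tasks that must succeed first.
    private final Map<String, Set<String>> prerequisites = new LinkedHashMap<>();
    private final List<String> executionOrder = new ArrayList<>();

    void addTask(String name, BooleanSupplier body) {
        tasks.put(name, body);
        prerequisites.putIfAbsent(name, new LinkedHashSet<>());
    }

    // "Do task after successful prerequisite."
    void addDependency(String task, String prerequisite) {
        if (!tasks.containsKey(task) || !tasks.containsKey(prerequisite)) {
            throw new IllegalArgumentException("both tasks must be added first");
        }
        prerequisites.get(task).add(prerequisite);
    }

    // Runs each task once all of its prerequisites have succeeded.
    // Tasks behind a failed (or cyclic) prerequisite are never run.
    // Returns true only if every task ran and succeeded.
    boolean run() {
        executionOrder.clear();
        Set<String> succeeded = new HashSet<>();
        Set<String> pending = new LinkedHashSet<>(tasks.keySet());
        boolean progress = true;
        while (progress) {
            progress = false;
            for (String name : new ArrayList<>(pending)) {
                if (succeeded.containsAll(prerequisites.get(name))) {
                    pending.remove(name);
                    executionOrder.add(name);
                    if (tasks.get(name).getAsBoolean()) {
                        succeeded.add(name);
                    }
                    progress = true;
                }
            }
        }
        return succeeded.size() == tasks.size();
    }

    List<String> executionOrder() {
        return new ArrayList<>(executionOrder);
    }

    public static void main(String[] args) {
        SimpleTaskGraph inner = new SimpleTaskGraph();
        inner.addTask("stage-in", () -> true);

        SimpleTaskGraph outer = new SimpleTaskGraph();
        outer.addTask("B", () -> true);
        outer.addTask("A", inner::run); // a whole graph used as one task (nesting)
        outer.addDependency("B", "A");
        System.out.println(outer.run() + " " + outer.executionOrder());
    }
}
```

Because a graph's run() has the same shape as a task body, a graph can be dropped into another graph as a single task, which is all that nesting requires in this sketch.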


Problems with Portlet Development

Grid portlets typically wrap each single Grid capability in a separate portlet.

The problem is that Grid portlets need to combine these operations.

Portlets are entire web applications, so we need a component model for portlets: reusable portlet parts.

Even with the CoG Abstraction Layer, we must still do a lot of coding to build new applications.

To address these problems we have adopted Java Server Faces (JSF).

It provides several nice Model-View-Controller features.

JSF provides an extensible framework (tag libraries) for making reusable components.

The Apache JSF portlet bridge allows you to convert standalone JSF applications (development phase) into portlets (deployment phase).


Grid Tag Libraries and Beans (GTLAB)

GTLAB provides common components for building portlets using tags and reusable parts.

The goal of GTLAB is to simplify Grid portlet development and enable rapid development.

GTLAB capabilities include Grid operations exposed as XML-based tags within the Java Server Faces (JSF) framework.

Grid tag libraries are built using JSF custom component development techniques.

Grid tags are interfaces to backing Grid beans. End users pass values to Grid beans by using tag attributes.

We build on Java CoG 4's abstraction layer.

Each backing Grid bean has the same capability as an entire portlet application in the one-portlet-per-capability approach.


GTLAB Example

Grid tags are associated with Grid services via Grid beans.

Grid beans wrap the Java CoG Kit (version 4).

This allows you to develop new Grid portlets with no additional Java code.

We show an example JSF page section below.

<html>
<body>
<f:form>
<o:submit id="test" action="next_page">
  <o:myproxy id="pr" hostname="gf1.ucs.indiana.edu" port="7512" lifetime="2" username="mnacar" password="***" />
  <o:jobsubmit id="task" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/ls" stdout="tmp/result" stderr="tmp/error" />
</o:submit>
</f:form>
</body>
</html>


Grid Tag            Associated Grid Bean    Features
<submit/>           ComponentBuilderBean    Creating components, job handlers, submitting jobs
<handler/>          MonitorBean             Handling monitoring page actions
<multitask/>        MultitaskBean           Constructing simple workflow
<dependency/>       MultitaskBean           Defining dependencies among sub jobs
<myproxy/>          MyproxyBean             Retrieving MyProxy credential
<fileoperation/>    FileOprationBean        Providing GridFTP operations
<jobsubmission/>    JobSubmitBean           Providing GRAM job submissions
<filetransfer/>     FileTransferBean        Providing GridFTP file transfer
(none)              ResourceBean            Describes common properties among all tags and beans; passes values given by standard visual JSF components


How to prepare application pages

Developers embed Grid tag snippets into a JSF page.

These components are non-visual and are not displayed in HTML.

The Resource bean bridges form inputs and the GTLAB framework.

<h:outputText value="Taskname: "/>
<h:inputText value="#{resource.taskname}" />
<o:multitask id="multi" persistent="true" taskname="#{resource.taskname}" />

Dynamic values for Grid tag attributes are provided by the Resource bean.

The only visual component is the <o:submit/> tag, which is associated with the GTLAB action method.


GTLAB Dashboard Portlet Example

<o:submit id="track" action="list_page">
  <o:multitask id="dashboard" taskname="track" persistent="true">
    <o:myproxy id="proxy" hostname="gf1.ucs.indiana.edu" lifetime="2" username="#{resource.username}" password="#{resource.password}" />
    <o:jobsubmit id="jobA" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/whoami" stdout="tmp/result" stderr="tmp/error" />
    <o:jobsubmit id="jobB" hostname="cobalt.ncsa.teragrid.org" provider="GT4" executable="/bin/showq" stdin="tmp/result" stdout="tmp/list" stderr="tmp/error" />
    <o:dependency id="depend" task="jobB" dependsOn="jobA" />
  </o:multitask>
</o:submit>


Tracking and Managing Jobs

GTLAB manages the lifecycles of jobs and monitors their status.

Grid operations are usually batch processes.

We provide a callback mechanism to follow up on jobs.

GTLAB creates handlers for jobs and stores them persistently.

GTLAB handlers manage job events such as stopping, canceling, or resuming running jobs.

GTLAB provides an archive for job metadata and allows managing the archive.

The handler tag helps to organize the user's job repository.

<o:handler id="delete" action="#{monitor.delete}">
  <f:param id="task" name="taskname" value="#{task}"/>
</o:handler>
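The callback-plus-archive idea can be sketched as a small tracker that records each job's latest state and notifies listeners on every change. Everything here is an assumption made for illustration: JobTracker, JobState, and the state names are hypothetical and do not reflect GTLAB's actual classes, and the "archive" is an in-memory map rather than persistent storage.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical job states; GTLAB's real state names may differ.
enum JobState { SUBMITTED, ACTIVE, COMPLETED, FAILED, CANCELED }

// Hypothetical tracker: keeps the latest state per task and calls back listeners.
class JobTracker {
    private final Map<String, JobState> archive = new LinkedHashMap<>();
    private final List<BiConsumer<String, JobState>> listeners = new ArrayList<>();

    // Register a callback invoked on every state change.
    void addListener(BiConsumer<String, JobState> listener) {
        listeners.add(listener);
    }

    void submit(String taskName) {
        update(taskName, JobState.SUBMITTED);
    }

    // Record a new state and notify all listeners.
    void update(String taskName, JobState state) {
        archive.put(taskName, state);
        for (BiConsumer<String, JobState> listener : listeners) {
            listener.accept(taskName, state);
        }
    }

    // Only jobs that have not finished can be canceled.
    boolean cancel(String taskName) {
        JobState state = archive.get(taskName);
        if (state == JobState.SUBMITTED || state == JobState.ACTIVE) {
            update(taskName, JobState.CANCELED);
            return true;
        }
        return false;
    }

    // Remove a job's record from the archive (compare the "delete" handler above).
    boolean delete(String taskName) {
        return archive.remove(taskName) != null;
    }

    JobState stateOf(String taskName) {
        return archive.get(taskName);
    }

    public static void main(String[] args) {
        JobTracker tracker = new JobTracker();
        tracker.addListener((name, state) -> System.out.println(name + " -> " + state));
        tracker.submit("track");
        tracker.update("track", JobState.ACTIVE);
        tracker.cancel("track");
    }
}
```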


OGCE Gateway Services


Web Services in Scientific Communities (G. Kandaswamy)

Web services are used to "wrap" scientific applications in order to:

Describe, publish, discover, and consume scientific applications in a standard way

Compose complex workflows from scientific applications

Run and monitor complex workflows on distributed resources

Such web services that “wrap” scientific applications are called “application services”


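A Simple Application Service

[Figure: a Web service client sends a SOAP request to the application service; the service passes command-line arguments to the command-line application, collects its output results, and returns them in a SOAP response. The diagram spans Host1 and Host2.]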
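The core of such a wrapper is turning request parameters into command-line arguments and returning the program's output. A minimal sketch follows, leaving out the SOAP layer entirely. The CommandLineWrapper class and the "--name value" argument convention are assumptions made for illustration, not how any particular OGCE service formats its arguments.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around one command-line application (no SOAP layer shown).
class CommandLineWrapper {
    private final String executable;

    CommandLineWrapper(String executable) {
        this.executable = executable;
    }

    // Map named request parameters to "--name value" arguments (an assumed convention).
    List<String> buildCommand(Map<String, String> params) {
        List<String> command = new ArrayList<>();
        command.add(executable);
        for (Map.Entry<String, String> entry : params.entrySet()) {
            command.add("--" + entry.getKey());
            command.add(entry.getValue());
        }
        return command;
    }

    // Run the application and return its combined stdout/stderr as the response body.
    String invoke(Map<String, String> params) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(params))
                .redirectErrorStream(true)
                .start();
        String output = new String(process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("application exited with code " + exitCode + ": " + output);
        }
        return output;
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids shell quoting problems when a parameter value contains spaces.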


Things Are Usually More Complicated

[Figure: example workflow linking the application services ARPS-TRN, ARPS-SFC, EXT2ARPS, MCI2ARPS, NIDS2ARPS, 88D2ARPS, ADAS, ARPS2WRF, WRF, and ARPS-PLOT. Annotations: initial boundary conditions; lateral boundary conditions; satellite data; Level II data; Level III data; decoded data from other programs (sfc, rwh, etc.); run once per forecast region; run once per day; run for each forecast and/or ADAS analysis.]


The Problem

Application services may not be available during a workflow execution:

Unreliable resources (software, computers, networks)

Heavy load on the service

The service does not meet the QoS or security requirements of the client

Workflows cannot complete unless all services are available.


GFAC Solution

A Generic Application Factory: a persistent web service that knows how to create instances of any application service.

Use a Generic Application Factory to create instances of application services on demand from workflows.
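The factory idea can be sketched as "look up a description, create the service the first time it is needed, reuse it afterwards." The class below is a hypothetical illustration, not GFac itself: the registry is a plain map from service name to an executable path, and a created "service" is just a function that formats the command it would run.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical on-demand factory; not the GFac implementation.
class ApplicationFactory {
    // Stand-in for the registry of service descriptions: name -> executable path.
    private final Map<String, String> registry = new HashMap<>();
    // Services that have already been created.
    private final Map<String, Function<String, String>> live = new HashMap<>();
    private int createdCount = 0;

    void register(String serviceName, String executable) {
        registry.put(serviceName, executable);
    }

    // Return the running service, creating it first if no instance exists yet.
    Function<String, String> getOrCreate(String serviceName) {
        Function<String, String> service = live.get(serviceName);
        if (service == null) {
            String executable = registry.get(serviceName);
            if (executable == null) {
                throw new IllegalArgumentException("unknown service: " + serviceName);
            }
            // Stand-in for launching a real application service.
            service = input -> executable + " " + input;
            live.put(serviceName, service);
            createdCount++;
        }
        return service;
    }

    int createdCount() {
        return createdCount;
    }

    public static void main(String[] args) {
        ApplicationFactory factory = new ApplicationFactory();
        factory.register("WRF", "/opt/wrf/run"); // hypothetical path
        System.out.println(factory.getOrCreate("WRF").apply("forecast.nml"));
    }
}
```

With this shape, a workflow engine never has to assume a service is already up: it asks the factory, which only needs the registry to be available.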


Implementation

The Generic Application Factory (GFac)

The Generic Service Toolkit: a toolkit that "wraps" any command-line application as an application service

Without writing any web service code

Without modifying the application in any significant way


Creating an Application Service (1/2)

Write a "ServiceMap" document to describe your service.

Write an "Application Deployment Description" document to describe a deployment of your application.

Upload the above two documents to a Registry service.


Creating an Application Service (2/2)

[Figure: creating an application service. Labels: Portal (Generic Service Portlet), MyProxy Service, Capability Manager Service, Certificate & Capabilities Vault, Registry Service, GFac, Generic Web Service, Application Service, Service Provider, Host1, Host2. Steps: 1. Create service request; 2. Get ServiceMap & Host Description; 3. Create service; 4. Configure service; 5. Register WSDL; 5. Register capabilities.]


Invoking an Application Service

[Figure: invoking an application service. Labels: User, Portal (Generic Service Portlet), MyProxy Service, Capability Manager Service, Certificate & Capabilities Vault, Registry Service, Application Service, Application, Host2, Host3. Steps as labeled: 2. Access service; 3. Return user interface; 4. Invoke service; 4. Run application; 5. Get Application Deployment Description and Host Description; 6. Send notifications; 7. Return results.]


Software Engineering Issues


OGCE Code Repository

We use SourceForge and SVN: http://sourceforge.net/projects/ogce

Other SourceForge tools are useful. We recently replaced the old OGCE Bugzilla with SF Bugzilla after we were attacked by robots.


Portal Build System

The portal download gives you everything you need to get started except Java.

It includes Tomcat, GridSphere, Ant, and Maven.

We assume you have a Grid somewhere.

The build system (recently revised) is designed to build everything in one command: "mvn clean install"

It is also designed to support extensibility (i.e., replacing GridSphere with Sakai) and simple updates of portlets.

We use Maven 2 exclusively.

It is nice for managing third-party jar dependencies.

It can call Ant as necessary.

Testing portals is another matter.

Normal unit test systems like JUnit are not really appropriate.


JMeter Test Suite

File Transfer portlet unit tested in the JMeter UI: check for a valid HTML response.

Create lots of unit tests, run them, and see the results in a dashboard.


Nightly Builds and Tests on NMI Testbed


What’s Next?


Some Future Issues

Better support for science tools, not just bare grids: experiment builder, Xbaya workflow manager, metadata repository services and clients.

Better support for TeraGrid Science Gateways: logging, auditing, integration with GridShib.

JavaScript Grid abstraction layers and agent services to support non-portlet clients.

More projects: obviously we are interested in working with the OSG.


What About Web 2.0?

This is another talk entirely.

http://grids.ucs.indiana.edu/ptliupages/presentations/Web20Tutorial_CTS.ppt

http://grids.ucs.indiana.edu/ptliupages/publications/Web20ChapterFinal.pdf

See also the recent OGF 19 and 21 Workshops.

Join us at SC07 for the GCE07 Science Gateway Workshop: ~20 peer-reviewed or invited talks, with a focus on Web 2.0.


More Information

OGCE Web Site: www.collab-ogce.org

Announcements Atom Feed http://collab-ogce.blogspot.com/atom.xml

Contact me: [email protected]


Some Example Portals


LEAD Gateway Portal

NSF Large ITR and TeraGrid Gateway

Adaptive response to mesoscale weather events

Supports data exploration and Grid workflow


TeraGrid User Portal


User Portal Sharable Portlets

Account Management: view projects and allocation usage; view system account usernames; view DNs registered for account; add users to projects; supports >3500 users

Resource: view comprehensive list of TG resources and their attributes; view job queues, load, status of resources

Documentation: current User Info documentation; contextual help for all interfaces

Consulting: TG help desk information; portal feedback channel

Allocation: info about how to apply for/renew allocations


North Carolina Bioportal

Principal collaborators: John McGee and Lavanya Ramakrishnan

Features: access to common bioinformatics tools; extensible toolkit and infrastructure

OGCE and National Middleware Initiative (NMI): leverages emerging international standards; remotely accessible or locally deployable; packaged and distributed with documentation

National reach and community: TeraGrid deployment; portals hosted at RENCI and NCSA; education and training


UNC-Charlotte Visual Grid Portal

Project Lead: Prof. Barry Wilkinson

Portal Developer: Jeremy Villalobos
