

TeraGrid’s Integrated Information Service “IIS”

Grid Computing Environments 2009

Lee Liming, JP Navarro, Eric Blau, Jason Brechin, Charlie Catlett, Maytal Dahan, Diana Diehl, Rion Dooley, Michael Dwyer, Kate Ericson, Ian Foster, Ed Hanna, David L. Hart, Chris Jordan, Rob Light, Stuart Martin, John McGee, Laura Pearlman, Jason Reilly, Tom Scavo, Michael Shapiro, Shava Smallen, Warren Smith, Nancy Wilkins-Diehr

TeraGrid Grid Infrastructure Group (GIG)
University of Chicago, Argonne National Laboratory

November 2009


Outline

• Introduction
  – Conceived in 2006; Production in 2007; Presented at GCE’07
  – IIS Vision
• 1st: IIS System Architecture
  – Distributed CI provider operated local information services
  – Centralized federation wide information services
  – Registries -> XML document entries
• 2nd: IIS Information Architecture
  – Registry architecture and data format
  – The Capability Kit meta-registry
  – Current information registries
• Leveraging IIS Examples – Providers and Consumers
• Conclusion and Future Work

November 20, 2009 GCE09


Vision

Provide an Authoritative Integrated Information Service enabling:

• Human discovery of cyber-infrastructure
  – Science Gateways, Portals, Documentation, CLIs
• Software discovery of cyber-infrastructure
  – For automated resource, service, and software selection and access
  – For auto-configuration (applications, gateways, workflow engines)
• Providers to advertise their cyber-infrastructure offerings
  – Advertise any information about any CI capability
  – Providers own data, and independently control publishing
• Streamlined operations
  – Change integration and management
  – Automated testing and monitoring


Distributed Architecture Components

[Architecture diagram. Recoverable labels: Service Provider Local Information Service (WS MDS4); Federation Wide Integrated Information Service (XML Repository, WS MDS4, Tomcat WebMDS, Apache 2.0, HTTPD); TeraGrid Wide Databases; WS/REST; WS/SOAP; Clients.]


High-Availability Architecture

[Architecture diagram. Recoverable labels: Clients; info.teragrid.org; Dynamic DNS; High-Level Aggregation; Service Provider Publishing.]


Information Architecture

• Registry Architecture
  – Named Registries, with schema compliant Registry Entries, which are each an XML Document
• The Capability Deployment Meta-Registry
• Universal Identifiers
  – Site and Resource Identifiers
  – Capability Identifier
  – Registry entry cross-references
• Extensibility
  – Meta-Registry Extensions
  – New Registries
  – XML

[Diagram: Registry A, Registry B, Registry C, each holding Entry 1, Entry 2, Entry 3, …]

Example registry entry:

    <Reg1.Entry>
      <id>entry1</id>
      <foo>bar</foo>
    </Reg1.Entry>
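
As a minimal sketch of how a consumer might read such an entry, the Python below parses the toy entry from this slide with the standard library. The element names come from the slide's example; the helper name `entry_to_dict` is hypothetical and not part of IIS, and it assumes a flat entry whose children are simple text elements.

```python
import xml.etree.ElementTree as ET

# Sample registry entry, copied from the slide's toy example.
SAMPLE_ENTRY = "<Reg1.Entry><id>entry1</id><foo>bar</foo></Reg1.Entry>"


def entry_to_dict(xml_text):
    """Flatten one registry entry into (entry type, {child tag: text}).

    Hypothetical helper: assumes a flat entry whose children are
    simple text elements, as in the slide's example.
    """
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "") for child in root}
    return root.tag, fields


entry_type, fields = entry_to_dict(SAMPLE_ENTRY)
print(entry_type, fields)
```

Real registry entries are schema compliant and likely nested, so a production consumer would probably validate against the registry's schema rather than flatten blindly.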


TeraGrid Capability Meta-Registry

Each Capability Deployment:
• Where (site and resource)
• What (name, class, and description)
• Support information
• Status information
• Software and services component information
• Extensions
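
The items above can be pictured as one record per capability deployment. The sketch below is only an illustration of that shape: every field name is an assumption (the slides do not show the actual schema), and the sample values reuse the kit and resource names that appear in the Inca example URLs, with the site ID guessed from the resource name.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CapabilityDeployment:
    """Illustrative record for one capability deployment.

    All field names are assumptions; the real meta-registry schema
    is not shown in the slides.
    """
    site_id: str                      # Where: site
    resource_id: str                  # Where: resource
    name: str                         # What: capability name
    capability_class: str             # What: class
    description: str = ""             # What: description
    support: str = ""                 # Support information
    status: str = ""                  # Status information
    components: List[str] = field(default_factory=list)       # software and services
    extensions: Dict[str, str] = field(default_factory=dict)  # registry extensions

    def key(self):
        """Cross-reference key built from the where/what identifiers."""
        return (self.site_id, self.resource_id, self.name)


# Illustrative values only; not a real registry record.
example = CapabilityDeployment(
    site_id="iu.teragrid.org",  # assumed from the resource name
    resource_id="bigred.iu.teragrid.org",
    name="remote-compute.teragrid.org",
    capability_class="CTSS",
)
print(example.key())
```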


Capabilities Kit Registry by Class

CTSS
• Application Development & Runtime
• TeraGrid Core Integration (local info service)
• Co-scheduling, meta-scheduling
• Common Client
• Computation & Scheduling Clients
• Data Collections
• Data Management
• Data Movement servers, Clients
• Distributed Parallel Application Support
• Distributed Programming Systems
• Local Compute
• Login
• Nimbus/Cloud Computing
• Parallel Application Support
• Remote Computation
• Science Gateway Support
• Visualization Software (VTSS)
• WAN GPFS, WAN Lustre file-systems
• Workflow Support

Gateways
• Renci Portal
• …

Local
• Local HPC Software

Central
• Credential Server (MyProxy)
• Integrated Information Services
• User Portal


Other Registries

TeraGrid Central Database Registries
• Site/Organization and Resource identifiers (IDs) and descriptions
• Project/Allocation to Resource authorization list
• TeraGrid Science Gateway Catalog
• TeraGrid System Outages

Gateways Registries
• Science Gateway Web Services Application Registry

Local RP Registries
• CTSS Extension Registries
  – Batch System Load (%)
  – Batch Queue Contents (requires authorization)
  – OGF GLUE2
• Local HPC Software Catalog

Leveraging IIS Examples

• Resource Description Repository Publishing
• TeraGrid User Portal Batch Load & Queue Data
• TeraGrid User Documentation
• Software Discovery
  – CTSS Software
  – Local HPC Software
  – Science Gateway Software
  – Science Gateways Web Services “WS” Application Registry
• Advanced Scheduling Information
• Inca Verification & Validation
• User Profile Service
• Discovery CLI Interface


Resource Description Repository “RDR”

TeraGrid Core Services uses RDR to collect and store validated, current and historical resource description information:

• Common Resource Information
• Compute Resource Information
• Data Collections Information
• Storage Information

[Diagram labels: TeraGrid Core Integration; Local Compute; Data Collections; (Storage)]


TGUP Batch Load & Queue Data

IIS provides queue and batch load information from all RP sites for the TeraGrid User Portal (TGUP) to use in its system monitor.

Excerpt (truncated on the slide):

    <LoadRP xmlns="">
      <ComputeResourceLoad xmlns="">
        <ResourceID>pople.psc.teragrid.org</ResourceID>
        <SiteID>psc.teragrid.org</SiteID>
        <LoadInfo hostname="tg-login1.pople.psc.teragrid.org" timestamp="2009-11-11T13:46:19Z">
          <Load><Type>queue</Type><Value>98</Value></Load>
    …

Remote Computation -> Local Compute

http://portal.teragrid.org/
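
A portal-side consumer could turn the batch load XML from this slide into rows for a system monitor roughly as follows. This is a sketch under stated assumptions: the slide's excerpt is cut off, so the closing tags in `LOAD_XML` were added only to make the sample parse, and the function name `read_loads` is hypothetical.

```python
import xml.etree.ElementTree as ET

# The slide's excerpt is truncated; the closing tags below were added
# only so this sketch parses. They are an assumption about the format.
LOAD_XML = """\
<LoadRP xmlns="">
  <ComputeResourceLoad xmlns="">
    <ResourceID>pople.psc.teragrid.org</ResourceID>
    <SiteID>psc.teragrid.org</SiteID>
    <LoadInfo hostname="tg-login1.pople.psc.teragrid.org"
              timestamp="2009-11-11T13:46:19Z">
      <Load><Type>queue</Type><Value>98</Value></Load>
    </LoadInfo>
  </ComputeResourceLoad>
</LoadRP>
"""


def read_loads(xml_text):
    """Return one dict per <Load> element, with its resource context."""
    rows = []
    root = ET.fromstring(xml_text)
    for res in root.iter("ComputeResourceLoad"):
        resource_id = res.findtext("ResourceID")
        site_id = res.findtext("SiteID")
        for info in res.iter("LoadInfo"):
            for load in info.iter("Load"):
                rows.append({
                    "resource": resource_id,
                    "site": site_id,
                    "timestamp": info.get("timestamp"),
                    "type": load.findtext("Type"),
                    "value": float(load.findtext("Value")),
                })
    return rows


for row in read_loads(LOAD_XML):
    print(row)
```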


TeraGrid User Documentation

http://www.teragrid.org/
http://www.teragrid.org/userinfo/software/ctss.php


Software Discovery

TeraGrid context:
• >650 CTSS software package deployments
• >1600 Local HPC software package deployments
• >40 Science Gateways offering software packages

Problems:
• How can users discover what software is available, and how to access it?
• How can Science Gateways or Web Applications discover what software is available through web service interfaces and invoke it?


Software Discovery

Solutions:
• Single IIS interface to multiple software repositories, including 3rd party HPC software and Science Gateway software.
• A custom Gateway web services registry.

Which enables, for example:
• Scientists to discover that Gaussian is available both from the command line and through a full service gateway such as GridChem (www.gridchem.org).
• Science Gateways and Applications to discover and invoke Gaussian web services automatically.


Software Discovery Design

[Design diagram. Recoverable labels: Kit Registry (CTSS Kit Software, Gateways Kit Software, Local HPC Kit Software); Gateway Web Services Registry; Local HPC Software Registry; WS Enabled Software Discovery; Comprehensive Software Discovery.]


Gateway WS Application Registry

• Each Gateway hosts a service (RESTful or otherwise) that publishes local web service metadata.

• Information Services aggregates the GAWSR metadata hosted by all configured Gateways, creating a central registry.

• The GAWSR metadata is rich enough for a user or client to launch jobs dynamically via web services.

• The following slides demonstrate two clients using the GAWSR: the first shows dynamic execution of web services written in Java, and the second is a RIA Flex application showing the available metadata.
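
The aggregation step described above can be sketched as a loop over configured gateway endpoints that merges each gateway's metadata into one central index. This is not the IIS implementation: the endpoint URLs, the metadata fields (`name`, `endpoint`), and the function names are all placeholders, and the fetcher is injected so the sketch runs without network access.

```python
def aggregate_gawsr(gateway_urls, fetch):
    """Merge per-gateway service metadata into one central registry.

    gateway_urls: configured gateway metadata endpoints (hypothetical).
    fetch: callable returning a list of service-metadata dicts for a URL;
           injected so the sketch runs without network access.
    """
    registry = {}
    for url in gateway_urls:
        try:
            services = fetch(url)
        except Exception as exc:  # one failing gateway must not block the rest
            print(f"skipping {url}: {exc!r}")
            continue
        for svc in services:
            # Key by (gateway, service name) so names may repeat across gateways.
            registry[(url, svc["name"])] = svc
    return registry


# Stand-in data; the real GAWSR metadata fields are not shown in the slides.
FAKE_GATEWAYS = {
    "https://gateway-a.example.org/gawsr": [
        {"name": "gaussian", "endpoint": "https://gateway-a.example.org/ws/gaussian"},
    ],
    "https://gateway-b.example.org/gawsr": [
        {"name": "blast", "endpoint": "https://gateway-b.example.org/ws/blast"},
    ],
}


def fake_fetch(url):
    """Pretend HTTP GET: raises KeyError for an unknown (unreachable) gateway."""
    return FAKE_GATEWAYS[url]


central = aggregate_gawsr(
    list(FAKE_GATEWAYS) + ["https://down.example.org/gawsr"], fake_fetch
)
print(sorted(name for _, name in central))
```

Skipping an unreachable gateway instead of failing the whole run keeps the central registry available even when one provider is down, which seems consistent with providers controlling their own publishing.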


Dynamic execution of web services written in Java


RIA Flex application showing the available metadata


Local HPC Software


Advanced Scheduling Information

CTSS
• Co-scheduling
• Meta-scheduling
• Computation & Scheduling Clients
• Local Compute
• Remote Computation
• Science Gateway Support
• Workflow Support

GLUE2 Registry


Inca Verification & Validation

• Running on TeraGrid since 2003

• Verifies IIS published information through automated, user-level testing

• Total of ~2200 tests running on 18 login nodes, 2 grid nodes, and 3 servers

• Email notifications for critical services

• Status views from detailed test information to summary and historical reports

• Data published as XML, HTML, or graphed

• IIS compatible REST interface:
  – http://info.teragrid.org/web-apps/HTML/kit-reg-v1/remote-compute.teragrid.org-4.0.2/bigred.iu.teragrid.org/
  – http://inca.teragrid.org/inca/HTML/kit-status-v1/remote-compute.teragrid.org-4.0.2/bigred.iu.teragrid.org/

[Diagram labels: XML; CTSS kit registrations; XSL; info.teragrid.org]

http://inca.teragrid.org/
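
The two example URLs on this slide share a path layout, which suggests a client can derive both the registration view and the Inca status view from the same identifiers. The sketch below assumes the layout `<base>/<kit name>-<version>/<resource id>/`, inferred only from those two examples; the function name `kit_urls` is hypothetical.

```python
KIT_REG_BASE = "http://info.teragrid.org/web-apps/HTML/kit-reg-v1"
KIT_STATUS_BASE = "http://inca.teragrid.org/inca/HTML/kit-status-v1"


def kit_urls(kit_name, kit_version, resource_id):
    """Build the registration and status URLs for one kit deployment.

    Assumes the path layout <base>/<kit name>-<version>/<resource id>/,
    inferred from the two example URLs on the slide.
    """
    kit = f"{kit_name}-{kit_version}"
    return (
        f"{KIT_REG_BASE}/{kit}/{resource_id}/",
        f"{KIT_STATUS_BASE}/{kit}/{resource_id}/",
    )


reg_url, status_url = kit_urls(
    "remote-compute.teragrid.org", "4.0.2", "bigred.iu.teragrid.org"
)
print(reg_url)
print(status_url)
```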


User Profile Service

• Provide authenticated users with user-centric information
• HTTPS with Basic Authentication
• In HTML, CSV, JSON, Perl, and XML formats
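
A client of such a service would attach an HTTP Basic `Authorization` header to an HTTPS request. The sketch below only prepares the request and does not send it. The endpoint URL is a placeholder (the slides do not give one), and the `format` query parameter is an assumed way of choosing the output format.

```python
import base64
import urllib.request

# Hypothetical endpoint: the slides do not give the service's URL.
PROFILE_URL = "https://info.example.org/user-profile"


def build_profile_request(username, password, fmt="json"):
    """Prepare (but do not send) an HTTPS request with a Basic auth header.

    The `format` query parameter is an assumption; the slide lists the
    available formats but not how a client selects one.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    req = urllib.request.Request(f"{PROFILE_URL}?format={fmt}")
    req.add_header("Authorization", f"Basic {token}")
    return req


req = build_profile_request("alice", "s3cret", fmt="xml")
print(req.full_url)
print(req.get_header("Authorization"))
```

Passing the prepared request to `urllib.request.urlopen(req)` would send it. Basic Authentication transmits credentials in an easily reversible encoding, which is why the slide pairs it with HTTPS.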


Discovery CLI Interface

The tginfo CLI: http://info.teragrid.org/tginfo/


Conclusion

• Federation Wide Standards
  – Information Integration Identifiers
  – Information Discovery REST APIs
  – Standard Capability Naming and Description Schemas
• Federation Wide Information Discovery
  – Using a Central Federation Wide Index
  – Using a DNS/WWW model
  – Central Discovery -> Distributed Information Access
• Enable User Interfaces
  – Web 2.0, Science Gateways, and traditional Web servers

** IIS does not develop those interfaces


Conclusion & Future Work

• Information Architecture
  – Capability Definition Meta-Registry (BioMedical Informatics -- BIRN)
  – Capability Implementation Registry
  – More Capabilities and Capability Classes: Clouds/IaaS, SaaS, Distributed Programming Environments (SAGA), Data Collections
  – Science Gateway Security Configuration Information (SAML)
• System Architecture
  – Fully REST based registration services (Apache CXF, Globus CRUX)
  – Fully REST based aggregation services
  – More REST based discovery interfaces (with XPath, XSLT support)
  – More custom REST services, some providing custom user services
• Separate IIS project
  – Packaged, documented, and distributed for other projects


More Information

Web Sites
• http://info.teragrid.org/
• http://www.teragrid.org/gateways/
• http://info.teragrid.org/web-apps/html/index/ (REST APIs)

People
• JP Navarro, Lee Liming (IIS Architecture and Coordination)
• Nancy Wilkins-Diehr (Gateway Information)
• Warren Smith (Execution and Scheduling Information)
• Ed Hanna (Resource Description Information)
• Kate Ericson (Monitoring and Validation Information)
• Rion Dooley (Authenticated User Custom Information)