introduce: an open source toolkit for rapid development · pdf fileintroduce: an open source...

21
Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott Oster & Stephen Langella & David Ervin & Tahsin Kurc & Joel Saltz Received: 7 June 2006 / Accepted: 15 February 2007 / Published online: 9 March 2007 # Springer Science + Business Media B.V. 2007 Abstract Service-oriented architectures and applica- tions have gained wide acceptance in the Grid computing community. A number of tools and mid- dleware systems have been developed to support application development using Grid Services architec- tures. Most of these efforts, however, have focused on low-level support for management and execution of Grid services, management of Grid-enabled resources, and deployment and execution of applications that make use of Grid services. Simple-to-use service development tools, which would allow a Grid service developer to leverage Grid technologies without needing to know low-level details, are becoming increasingly important for wider application of the Grid. In this paper, we describe an open-source, extensible toolkit, called Introduce, that supports easy development and deployment of Web Services Re- source Framework (WSRF) compliant services. Intro- duce is designed to reduce the service development and deployment effort by hiding low level details of the Globus Toolkit and to enable the implementation of strongly typed services. In strongly typed services, a service produces and consumes data types that are well-defined and published in the Grid. This enables data-level syntactic interoperability so that clients and services can access and consume data elements programmatically and correctly. We expect that en- abling strongly typed Grid services while lowering the difficulty of entry to the Grid via toolkits like Introduce will have a major impact to the success of the Grid and its wider adoption as a viable technology of choice in the commercial sector as well as in academic, medical, and government research. Keywords Grid . Grid computing . Web services . Grid service . Globus . WSDL 1 Introduction Grid computing is quickly becoming the new means for creating distributed infrastructures and virtual organizations for multi-institutional research and enterprise applications. The Grid has evolved from being a platform targeted at large-scale computing J Grid Computing (2007) 5:407427 DOI 10.1007/s10723-007-9074-8 S. Hastings (*) : S. Oster : S. Langella : D. Ervin : T. Kurc : J. Saltz Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA e-mail: [email protected] S. Oster e-mail: [email protected] S. Langella e-mail: [email protected] D. Ervin e-mail: [email protected] T. Kurc e-mail: [email protected] J. Saltz e-mail: [email protected]

Upload: vonguyet

Post on 22-Feb-2018

238 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

Introduce: An Open Source Toolkit for Rapid Developmentof Strongly Typed Grid Services

Shannon Hastings & Scott Oster &

Stephen Langella & David Ervin & Tahsin Kurc &

Joel Saltz

Received: 7 June 2006 /Accepted: 15 February 2007 / Published online: 9 March 2007# Springer Science + Business Media B.V. 2007

Abstract Service-oriented architectures and applica-tions have gained wide acceptance in the Gridcomputing community. A number of tools and mid-dleware systems have been developed to supportapplication development using Grid Services architec-tures. Most of these efforts, however, have focused onlow-level support for management and execution ofGrid services, management of Grid-enabled resources,and deployment and execution of applications thatmake use of Grid services. Simple-to-use servicedevelopment tools, which would allow a Grid servicedeveloper to leverage Grid technologies withoutneeding to know low-level details, are becomingincreasingly important for wider application of the

Grid. In this paper, we describe an open-source,extensible toolkit, called Introduce, that supports easydevelopment and deployment of Web Services Re-source Framework (WSRF) compliant services. Intro-duce is designed to reduce the service development anddeployment effort by hiding low level details of theGlobus Toolkit and to enable the implementation ofstrongly typed services. In strongly typed services, aservice produces and consumes data types that arewell-defined and published in the Grid. This enablesdata-level syntactic interoperability so that clients andservices can access and consume data elementsprogrammatically and correctly. We expect that en-abling strongly typed Grid services while lowering thedifficulty of entry to the Grid via toolkits like Introducewill have a major impact to the success of the Grid andits wider adoption as a viable technology of choice inthe commercial sector as well as in academic, medical,and government research.

Keywords Grid . Grid computing .Web services . Gridservice . Globus .WSDL

1 Introduction

Grid computing is quickly becoming the new meansfor creating distributed infrastructures and virtualorganizations for multi-institutional research andenterprise applications. The Grid has evolved frombeing a platform targeted at large-scale computing

J Grid Computing (2007) 5:407–427DOI 10.1007/s10723-007-9074-8

S. Hastings (*) : S. Oster : S. Langella :D. Ervin :T. Kurc : J. SaltzDepartment of Biomedical Informatics,The Ohio State University,Columbus, OH, USAe-mail: [email protected]

S. Ostere-mail: [email protected]

S. Langellae-mail: [email protected]

D. Ervine-mail: [email protected]

T. Kurce-mail: [email protected]

J. Saltze-mail: [email protected]

Page 2: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

applications to a new architecture paradigm forsharing information, data, and software as well ascomputational and storage resources. A vast array ofmiddleware systems and toolkits has been developedto support applications in the Grid. These includemiddleware and tools for service deployment andremote service invocation [1], security [2–4], resourcemonitoring and scheduling [5–8], high-speed datatransfer [9–11], metadata and replica management[12–16], component based application composition[17–19], and workflow management [20–22]. Earlierefforts focused on providing the core functionalityneeded by applications to use the Grid environment.As the number of Grid-enabled resources and appli-cations has increased, interoperability has become amajor concern, since applications and resources areheterogeneous, oftentimes managed autonomously,and programmatic access to resources is desired. Inrecent years, a services oriented view of the Grid hasemerged to address interoperability concerns. TheOpen Grid Services Architecture (OGSA) [23, 24] hasbeen developed through standardization efforts coor-dinated mainly by the Global Grid Forum (http://www.ggf.org), which includes both academicresearchers and corporate IT leaders, and OASIS(http://www.oasis-open.org). OGSA specifies a GridServices framework and Grid Services standards. TheGrid Services framework builds on and extends WebServices [25, 26] for scientific applications. It definesmechanisms for such additional features as statefulservices, service notification, and management ofservice/resource lifetime. A more recent effort, theWeb Services Resource Framework (WSRF) stan-dardization [27], has paved the way for closerinteroperability and unification between stateful Gridservices and Web services.

The most widely used reference implementationsof OGSA, as realized by OGSI (Open Grid ServicesInfrastructure), and WSRF are the Globus Toolkit(GT; http://www.globus.org) version 3.2 (GT 3.2) andversion 4.0 (GT 4.0), respectively. The GT enablesGrid architects to create Grids using standard buildingblocks. It also implements support for deploymentand invocation of Grid services, and provides runtimesupport for common services such as an Index servicefor service registration and discovery and the GlobusSecurity Infrastructure (GSI) for security. Thesecomponents enable service providers to stand up Gridservices, advertise them, discover them, and secure

them. The OGSA, WSRF, and GT provide a foundationupon which domain specific secure services, tools, andapplications can be developed. However, additional toolsthat will make it easier to use Grid technologies in serviceand application development and deployment are neededfor wider adoption of the Grid in application domains.

In this paper we present the design and implementa-tion of an open-source toolkit, called Introduce, whichprovides support for development and deployment ofstrongly typed, secure Grid services. The goals of ourwork are twofold. The first is to reduce the developmenttime and knowledge required to implement and stand upGrid Services using the GT. Developing services withthe Globus Toolkit requires a good knowledge of GridServices technologies and of the details of the toolkit.Introduce hides from service developers the complexi-ties of low-level tools and processes for servicedevelopment and deployment. The second goal is toenable greater levels of interoperability in the Gridenvironment. To this end, we implement support inIntroduce for development of strongly typed services. Astrongly typed service is one that consumes andproduces data types that are well-defined and publishedin the environment. The use of published types enablesa developer to create compatible and interoperable Gridservices without needing to communicate with otherGrid service developers. We believe that the availabilityof tools like Introduce will greatly impact the efficacyof the Grid in fields where Grid style architectures area great change to the everyday development andmanagement of data and analytical services.

The current implementation of Introduce uses theGT as the underlying core Grid infrastructure and theGlobal Model Exchange (GME) service of the Mobiusframework [16, 28, 29] to support strongly typed Gridservice development. The salient features of Introducecan be summarized as follows:

& It provides a graphical user interface (GUI) andhigh-level functions that encapsulate and hide thecommon complexities and low level command-line tools of the GT for generating a suitableservice layout. These include client and serverwrappers to encapsulate the “boxed” documentliteral Grid service calls, functions to createconfigurable service properties, functions to spec-ify resource properties and register metadata,functions to specify the security configurations

408 S. Hastings, et al.

Page 3: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

of the service operations of the service, andsupport to deploy a service to commonly usedGrid service containers. The service developer cancreate and modify Grid service interfaces using aGUI. Developers can also write their own servicedescription documents, which the Introduce tool-kit can use to create and modify a serviceprogrammatically.

& It enables development of strongly typed services.Using plug-ins, Introduce enables discovery of datatypes from virtually any style of data type providers.

& It is customizable and extensible via the use ofextension plug-ins. Plug-ins allow for Introduce tobe customized and its base functionality to beextended (1) for custom and common servicetypes in an application domain and (2) to employcustomized discovery mechanisms for commondata types for creating strongly typed services.

& It allows for implementation of secure services. Itleverages all aspects of the GSI in order to providecustomizable service- and method-level securityconfiguration and code generation. It providessupport for service developers to optionally turn onauthentication and authorization support for individ-ual service methods as well as the entire service itself.

& It manages all the service-specific files anddirectories required by the GT for correct compi-lation and deployment. It also generates appropri-ate, object-oriented client APIs which can be usedby client applications to interact with the service.

Introduce is available with the caGrid version 0.5 and1.0 distributions (https://cabig.nci.nih.gov/workspaces/Architecture/caGrid); caGrid is the Grid architectureand middleware infrastructure of the NCI-fundedcancer Biomedical Informatics Grid (caBIG™;https://cabig.nci.nih.gov) program [30]. Originallyimplemented as a tool to assist analytical servicedevelopers in caBIG, Introduce has evolved into amore generic Grid service development and deploy-ment toolkit. The current version of Introduce has beenused as the core caBIG service development toolkit incaGrid version 1.0, which was released in December2006, to build various types of services ranging fromdata services to core security infrastructure services tobiomedical image analysis services. Introduce was alsoaccepted as an Incubator project for the GT inNovember 2006; it is planned to be a subproject ofthe GT after the incubation period.

The rest of this paper is organized as follows. Wepresent the motivation behind Introduce and thenotion of strongly typed service in greater detail inSection 2. Section 3 provides a review of related workin toolkits for Grid application development. Themain components of the Introduce toolkit are de-scribed in Section 4. Additional features of Introduceand its extension framework are presented in Sections 5and 6. We conclude in Section 7.

2 Motivation

The need for the Introduce toolkit primarily arose inthe cancer Biomedical Informatics Grid program(caBIG™, https://cabig.nci.nih.gov), funded by theNational Cancer Institute (NCI). The goal of thisprogram is to implement a nationwide cancer researchnetwork in order to significantly enhance basic andclinical research on all types of cancer disease. Thislarge scale effort uses the Grid Services architectureas the underlying Grid framework. In this section,using caBIG as an example, we discuss the motiva-tions for the two main goals of the Introduce effort;easy development of secure Grid services and supportfor strongly typed service development.

2.1 Easy Development and Deployment of SecureGrid Services

The principle idea in caBIG is to leverage combinedstrengths of individual cancer centers and researchlabs by creating common applications, standards,common data and analytical resources, and softwareinfrastructure to link these applications and resources.In this effort, a core Grid infrastructure, called caGrid,is being developed to facilitate sharing of data andanalytical resources, and to support development ofGrid-enabled applications that make use of theseresources. The infrastructure is designed as a ser-vice-oriented system, building on the WSRF stan-dards.1 Being the most complete and widely deployedreference implementation of the WSRF standard, theGlobus Toolkit (GT) is employed as the Grid

1 The initial release of the caBIG architecture, called caGridversion 0.5, is built using the OGSA standards and GlobusToolkit 3.2. The new release, caGrid version 1.0, is developedusing the WSRF standards and Globus Toolkit 4.

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 409

Page 4: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

middleware backbone and runtime environment incaGrid.

caBIG is envisioned to span hundreds of institu-tions and possibly thousands of investigator andresearch laboratories. These institutions may hostsoftware systems of different complexity and maturi-ty. They may consist of users and applicationdevelopers with varying levels of understanding ofdistributed systems and the Grid. Thus, one of thegoals of caGrid is to provide tools to make it easierfor institutions, laboratories, and individual research-ers to leverage the Grid without requiring a great dealof knowledge of the Grid technologies. The GTprovides support for a suite of core runtime servicesand tools for developing and deploying Grid services.However, these are low level tools, requiring servicedevelopers to understand the details of the GT andhow and in what order the tools should be invoked,and to keep track of several files and directories thatare required for successful compilation and deploy-ment of services. The Introduce toolkit helps a servicedeveloper by coordinating the various steps of theservice development and deployment via the toolsprovided by the GT and by managing the necessarydirectories and files. It abstracts away the details ofinvoking the various tools so that the developer isfreed up to concentrate on the details of implementinghis/her domain-specific code.

Security is a required component in caBIG both forprotecting intellectual property and to ensure protec-tion and privacy of patient related information. IncaBIG, a service provider should be able to enforcesecure and controlled access to resources based onpolicies set by the owners of the resources. ThecaGrid infrastructure provides higher level servicessuch as Dorian [4] and the Grid Trust Service (GTS)for managing and enforcing security. Introduce allowsa service developer to configure the service as asecure service. When configured, the service caninteract with Grid-wide and local authentication andauthorization services to enforce controlled access toits methods.

2.2 Strongly Typed Service Development

Interoperability is critical in environments where it isdesirable to access heterogeneous resources program-matically and the resources are developed andmanaged by different groups independently. In

caBIG, for example, there can be multiple datasources that maintain databases of Gene expressiondata at different institutions; similarly, there can bemultiple analytical resources providing microarrayanalysis methods. To make the sharing of and remoteaccess to such resources easy and efficient in caBIG,applications should be able to access the resourcesthrough common interface syntax and protocols, andinteract with their data structures programmatically.

The Introduce toolkit aims to support and enhanceinteroperability in two main categories; architecture-level and information-level interoperability. Thearchitecture-level interoperability requires use ofcommon data and control information exchangeprotocols and common interface syntax for program-matic access to the functionality of a Grid resource.The OGSA and WSRF enable architecture-levelinteroperability by defining standards for data andinformation exchange protocols, service creation andmanagement, and service invocation. A Web or Gridservice exposes its structure and interface definitionsin WSDL so that clients can invoke the service’smethods and interact with the service remotely.Introduce leverages WSRF to support architecture-level interoperability. The information-level interop-erability, on the other hand, is concerned withprogrammatic access to information produced by aGrid resource and the correct interpretation andconsumption of the information. It is especiallyimportant when information is exchanged betweendisjoint groups of data providers and consumers. Theinformation-level interoperability requires the use ofcontrolled vocabularies, common data types, andcommon information models to represent the infor-mation. In the context of information-level interop-erability, we define the notion of strongly typedservice as the class of services in which the input andoutput data types of service methods are well-defined and registered in the Grid environment.

We developed the concept of a strongly typedservice first in the Mobius project [16, 29] in thecontext of XML virtualization of data sources. InMobius, the structure of a XML document or a dataelement provided by a service is required to conformto a XML schema registered in the environment. IncaBIG, information-level interoperability is addressedthrough coordinated management of metadata andcurated objects, bound to underlying semantic con-cepts. Objects are exchanged between two Grid end

410 S. Hastings, et al.

Page 5: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

points (i.e., between services or between clients andservices) in the form of XML documents. Thestructure of an object is represented by a registeredand published XML schema. That is, a strongly typedcaBIG service has service methods, whose input andoutput data types conform to registered XMLschemas.2 The Introduce toolkit makes use of XMLschemas to enable strongly typed service developmentand define data types.

The support for strongly typed services requires theavailability of data type repository services, whichmaintain the definition and structure of Grid supplieddata types. Data type repositories are used by servicedevelopers to locate data types which suite their needsand use them as method input and output parameters.Having Grid-wide accessible data type repositoryservices can promote the usage of common data typesacross the Grid. Introduce leverages the Global ModelExchange (GME) service of the Mobius framework asa data type repository. The GME provides tools andservices for coordinated management of XMLschemas. A XML schema describes the structure ofa simple or complex data element (data object) that isconsumed or produced by a service and is exchangedbetween two endpoints in the environment. In GME,all schemas are registered under namespaces and canbe versioned. A schema can be a new schema, or cancontain pointers to other registered schemas and theattributes of schemas, or can be generated by ver-sioning an existing schema (e.g., by adding or deletingattributes). The concept of versioning schemas isformalized in GME; any change to a schema isreflected as either a new version of the schema or atotally new schema under a different name or name-space. In this way, data types can evolve whileallowing clients and services to make use of oldversions of the data type, if necessary. Namespacesenable distributed and hierarchical management ofschemas; two groups can create and manage schemasunder different namespaces without worrying aboutaffecting each other’s services, clients, and programs.

In domains which publish semantic ontologies,semantic interoperability can also be facilitatedthrough the use of ontology aware data type discoveryextensions of Introduce.

3 Related Work

The Globus Service Build Tools Project (http://gsbt.sourceforge.net/) consists of GT4IDE and globus-build-services components. The GT4IDE componentis an Eclipse (http://www.eclipse.org) plug-indesigned to aid in the development of GT 4.0compatible services. The plug-in enables a servicedeveloper to create a service and add methods to theservice. It then generates the stub service to beimplemented by the service developer. Unlike theIntroduce toolkit, which leverages tools provided bythe Eclipse project such as the Java Emitter Templates(JET), but is a stand-alone application, the GT4IDEplug-in can only be used in the Eclipse developmentenvironment. Introduce has several additional fea-tures, which are not part of the GT4IDE implemen-tation. These features include support for stronglytyped services, hiding of document literal bindingsbehind user defined service interfaces, extensibility,and ability to leverage advanced features of GT 4.0such as resource properties framework, compositionalinheritance, and service and method level securityconfiguration. Moreover, GT4IDE mainly facilitatesdevelopment of server side capabilities, whereasIntroduce can be used to generate both server sideand client side methods and APIs. The Sun StudioCreator, JAXR, and JAX-WS 2.0 are Java basedtoolkits for development of Web applications andWeb Services (http://java.sun.com/webservices/index.jsp) and provide an integrated environment to supportdevelopment life cycle. Introduce is a toolkit targetedat WSRF compliant services. The IBM Grid Toolbox(http://www-1.ibm.com/grid/solutions/grid_toolbox.shtml) is a suite of integrated tools to aid withcreation and hosting of Grid services using the GT3.0 platform. Unlike Introduce, which provides accessto service development and deployment functionsfrom a graphical user interface, the IBM Toolboxconsists of command-line tools (e.g., Ant task andXML batch file based tools) and APIs that have to beinvoked individually by the service developer. Inaddition, the toolbox does not provide support forstrongly typed service development.

Smith et. al. and Friese et. al. [31, 32] lay out theimportance of a model driven architecture (MDA)approach to generating Grid service oriented archi-tectures. Their work primarily deals with creating aservice generation system for GT4 which uses an

2 A schema may represent a simple data element such as integeror a more complex object.

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 411

Page 6: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

MDA along with a code annotation approach. Theypropose an architecture for generating Grid servicesusing Java annotations. The user can create the Javaservice implementation and simply annotate themethods and resources and their components withuse Java Annotations to process service implementa-tion and create the Grid service. This work is a niceconcept for easily separating the business logic of aservice from the Grid service implementation; how-ever, it does not enable the user to clearly stay awayfrom the complexities of the underlying security,registration, invocation, or composition processes.The MDA concepts, however, do hold true in theIntroduce toolkit. The separation in layers betweenthe business logic (Code Layer), platform binding(PSM), and overall conceptual design (PIM) are veryrelevant to the design philosophies exposed by theIntroduce Toolkit.

gEclipse is a recent project in its definition stages(http://www.geclipse.eu/). It will develop a frameworkas plug-in extensions to the Eclipse infrastructure toboth integrate various existing tools in a unifiedenvironment and build new tools for easy develop-ment of Grid applications using Java, C/C++, andother programming languages. Our work differs fromgEclipse in that our effort is focused on developmentof WSRF compliant services, strongly typed servicedevelopment, and secure service configuration anddeployment.

Morohoshi and Huang [33] propose a toolkitdesigned to support service development on top ofthe GT3 platform, which is based on the OGSAstandards. An application developer can use thetoolkit GUI to create a basic service by inputting aset of parameters or a service from a template,generate service stub from GWSDL (Grid WebService Description Language) specification, generateGWSDL from service method interfaces (imple-mented in Java), and deploy or undeploy the service.Mizuta and Huang [34] develop a prototype cartridgefor AndroMDA (http://www.andromda.org), an open-source model driven architecture (MDA) based codegeneration framework, to facilitate GT3-compliantservice development using UML. In their framework,Grid services are expressed as class diagrams andmodels in UML. Their approach allows a servicedeveloper to use existing UML-based modelingsystems to design Grid services. The cartridgedeveloped for the AndroMDA implements the neces-

sary functions and extensions so that all the files,directories, source codes, and service data elements aregenerated for the service under development. Onelimitation of their work is their tool can be used only inthe AndroMDA framework. Introduce has severalfeatures not available in these toolkits. These featuresinclude (1) a more comprehensive extension frame-work (see Section 6), which goes beyond justproviding service templates and allows complexplug-ins (providing both logic and custom graphicalcomponents) to be integrated to the toolkit, (2)support for strongly typed services, and (3) ability toleverage resource framework of GT4 and composi-tional inheritance. On the other hand, Introduce coulduse a component like AndroMDA and the cartridge asan extension plug-in in the Introduce extensionframework so that a service can be designed usingUML models.

The Java CoG kits [35, 36] offer a rich set of clientside capabilities to develop clients and access Gridfunctions and services. Limited support is providedfor server side capabilities and service development.The goal of the WSRF .NET [37] project is to developa reference implementation of the WSRF standards onthe .NET platform. It supports creation of servicesusing the ASP.NET and Visual Studio .NET environ-ments and a set of WSRF .NET tools (e.g., thePortTypeAggregator tool for service deployment).The tools are mainly targeted at service side devel-opment and provide limited functionality for clientside implementation. The Introduce toolkit, on theother hand, is a stand-alone program and consists ofself-contained suite of components targeted at servicedevelopment on the GT4 platform.

P-GRADE [38] and ASKALON [39] are examplesof toolkits that provide support for development ofhigh-performance applications on cluster platformsand in the Grid. ASKALON provides a suite of toolsin a service-oriented architecture for performancemeasurement, experiment management, performancebottleneck detection, and performance prediction.P-GRADE implements a graphical interface andruntime support to develop and execute PVM andMPI parallel programs in the Grid. It supports bothinteractive and job mode execution of program anduses scheduling systems like Condor-G [7] for jobscheduling and execution. The graphical interfaceallows a developer to define components, componentprocesses, and communication patterns between com-

412 S. Hastings, et al.

Page 7: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

ponent processes. GEMLCA [40] is a suite of toolsand services that facilitate deployment of a legacyprogram as a Grid service. It employs a non-intrusiveapproach in that deployment can be carried outwithout making code modifications to the legacyapplication; the owner of the application provides adescription of the input and output parameters of theapplication, where the application is located, and theexecution environment of the application in a LegacyCode Interface Description file. The GEMLCAcomponents and services allow a user to register anddeploy a legacy application in the framework, invokeit remotely through Grid service interfaces, and getthe status and output of the execution. FORTE HPC(http://developers.sun.com/sunstudio/products/previ-ous/fortran/index.html) provides an integrated envi-ronment for development, debugging, and executionof high-performance applications written in Fortanand C. Introduce is complementary to these toolkits inthat it provides tools for development of WSRFcompliant services. P-GRADE, ASKALON, FORTEHPC, and GEMLCA as well as tools for high-performance bulk data transfer [10, 11], job schedul-ing and monitoring [5, 7], could be used in theIntroduce framework to support services that involveparallel application execution, legacy programs, jobscheduling, and large data transfer.

The Grid Interoperation Now (GIN) communitygroup of the Open Grid Forum (OGF) (https://forge.gridforum.org/sf/projects/gin) is leading efforts tofacilitate operational and infrastructure level interop-erability between different production Grids. Themain areas focused on by the group include interop-erability between authorization and authenticationservices, data management and movement, job de-

scription and submission, and information services.Introduce promotes the development and use ofstrongly typed services through the use of well-defined and published data types in WSRF compliantservices. This aspect of Introduce can be viewed asone of the enabling technologies for interoperability atthe level of data structures produced and consumedby services in different Grids. The User ProgramDevelopment Tools research group (http://www.ggf.org/7_APM/UPDT.htm) has investigated tools tosupport development cycle of Grid-enabled applica-tions with a focus on tools for debugging and tuningapplications. Introduce is a program developmenttool, but focuses on support for exposing data andanalytical resources as WSRF compliant services. Anapplication developer could use performance tuningand debugging tools along with Introduce to optimizethe performance of their service implementation.

4 Introduce Toolkit

The Introduce toolkit is designed to support the threemain steps of service development (see Fig. 1): (a)Creation: The service developer describes at thehighest level some basic attributes about the servicesuch as service name and service namespace. Oncethe user has set these service configuration properties,Introduce will create the basic service implementa-tion, to which the developer can add application-specific methods and security options through theservice modification steps. (b) Modification: Themodification step allows the developer to add,remove, and modify service methods, properties,

Fig. 1 Overall service development process

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 413

Page 8: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

resources, service contexts, and service/method levelsecurity configuration. In this step, the developercreates strongly typed service interfaces using well-defined schemas, which are registered in a system likethe Mobius GME as type definitions of the inputand output parameters of the service methods. (c)Deployment: The developer can deploy the servicewhich has been created with Introduce to a Gridservice container (e.g., a Globus or Tomcat servicecontainer).

A service developer can access the functionsrequired to execute these three steps through theGraphical Development Environment (GDE) of In-troduce. The runtime support behind the GDEfunctionality is provided by the Introduce engine,which consists of the Service Creator, ServiceSynchronizer, and Service Deployer components. Inthis section, we describe the GDE and the corecomponents of the Introduce engine. Section 5presents the additional features of the engine, includ-ing resource framework, support for compositionalinheritance, and support for multi-service/multi-resource implementations. Introduce is an extensibletoolkit. The extension framework, which allows forintegration of custom service types and discovery ofcustom data types, is described in Section 6.

4.1 External Middleware Systems and Tools

Introduce assumes the availability of two externalcomponents to implement its functions. (1) A Gridruntime environment that provides the support forcompiling, advertising, and deploying Grid services;(2) A repository of data types, accessible locally orremotely, that the toolkit can pull common data typesfor strongly typed services. The toolkit is designed as amodular system, in which the specific implementationsof external components can be replaced.

The current implementation of Introduce leveragesthe Globus Toolkit (GT), Apache Axis (http://ws.apache.org/axis/), and Java Naming and DirectoryInterface (JNDI; http://java.sun.com/products/jndi/).Apache Axis is an open source middleware providingthe actual Web Services layer, i.e., SOAP based WebService protocol and support for Java bean genera-tion, resource support via JNDI, and other WebService middleware components. Introduce uses theGT, Axis, and JNDI as the underlying middleware

systems to support development of stateful services,advertisement of metadata in the form of resourceproperties, and deployment of services for execution.Introduce uses the Mobius GME service [16] as arepository of XML schemas to support the develop-ment of strongly typed services. It also has the abilityfor custom data type discovery plug-ins, detailed later.

4.2 Introduce Graphical Development Environment

The Introduce Graphical Development Environment(GDE) is the graphical service authoring tool whichcan be used to create, modify, and deploy a Gridservice (see Fig. 2). It is designed to be very simple touse, enable using community excepted data types, andprovide easy configuration of service metadata,operations, and security. It also allows customizedplug-ins to be added. These plug-ins enable customextensions to Introduce that can be used by servicedevelopers.

The interface contains several screens and optionsfor the service developer. The service creation screensenable the developer to create a new service. Usingthe interface, the service developer can provide somebasic information such as the name of the service,package name to use for creating the source code, anddomain name to use in service description (specifiedin WSDL), when defining the methods of the service.The service modification screens allow the servicedeveloper to customize the service by adding,removing, and modifying service methods and theinput and output parameters of the methods. Thedeveloper can also configure the service to enforceauthentication and authorization via Grid securityinfrastructure. Using the data types registration anddiscovery screens, the developer can obtain the datatypes that he/she wants to use for the serviceparameters and return types from any data typediscovery plug-in. The Introduce toolkit comes witha set of pre-installed discovery plug-ins, such as theMobius GME and a basic file system browser whichcan be used to locate local schemas. The deploymentcomponent of the GDE allows the service developerto deploy the implemented Grid service to a Gridservice container. The toolkit currently supportsdeploying a service to either a Globus or TomcatGrid service container; however, support for otherdeployment options can easily be added to the GDE.

414 S. Hastings, et al.

Page 9: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

4.3 Introduce Engine

The runtime support to enable service creation,modification, and deployment of Grid services isprovided by the Introduce engine. The engine is thecore piece that tracks changes in a service beingimplemented, modifies the service to reflect thesechanges, and rebuilds the service to verify that it isready to be edited or deployed. In this section, wedescribe the main components of this engine.

4.3.1 Service Creator

The service creator is composed of a series oftemplates using the Java Emitter Templates (JET)component, which is part of the Eclipse ModelingFramework (http://www.eclipse.org/emf/), for gener-ating source code and configuration files, and askeleton set of directories which is used to generatea Grid service that can be built, registered, anddeployed in the Grid environment.

Fig. 3 Service creationtools and use of JET tem-plates for service creation

Fig. 2 The IntroduceGraphical DevelopmentEnvironment (GDE)

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 415

Page 10: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

Templates for source code and configuration filesare used to create all the custom Java source code forthe service and the client APIs, and to generate thefiles required by the GT in order to build and deploy aGrid service (see Fig. 3). Deployment configurationfiles are used for resource and resource propertyconfiguration in the form of Java Naming andDirectory Interface (JNDI), resource property regis-tration configuration, Web Service Deployment De-scriptor (WSDD), and security configuration. Thebasic service created by Introduce is the skeleton ofthe service application developer wants to develop

and it does not include such additions as multi-service/multi-source properties and compositionalinheritance properties. Such additional configurationand modifications can be performed in the servicemodification step. The basic service consists of the setof method interfaces defined by the service developerand has the core set of files and processed needed tocompile the service and deploy by the lower levelGrid middleware:

& ANT processes (http://ant.apache.org) for build,deploy, and test operations,

Fig. 4 Service resynchronization process

416 S. Hastings, et al.

Page 11: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

& Custom configuration files for IDE integration, e.g.,Eclipse project files for editing the service using theEclipse platform (http://www.eclipse.org),

& Standard interfaces for both client and service toimplement,

& Fully implemented client APIs,& Stub implemented service,& Configuration to support service metadata and

resource properties and the registration of meta-data and properties, and

& Configuration for secure service deployment andauthorization.

Examples of the various files and directoriesmanaged by Introduce for a service are illustrated inthe Appendix.

4.3.2 Service Synchronizer

Service resynchronization is the process by which thesource code and configuration generation tools of theIntroduce toolkit analyzes the service’s current imple-mentation with that of the desired service description.The overall high level process of service resynchro-nization is illustrated in Fig. 4. This process adds,removes, and modifies any service methods, resourceproperties, and service settings based on the servicedescription. The descriptions and configurations formethods, metadata, and security are those that aregenerated from the GDE, programmatically or byhand, and that can be validated by the Introduceservice schema (Figs. 5 and 6). The service descrip-tion is the basis by which the code generation tools

Fig. 5 Base Introduce ser-vice description schema

Fig. 6 Schema for Introduce services

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 417

Page 12: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

add, remove, and modify operations and metadata,and change the security configuration of the service byediting the source code, configuration files, and meta-data files. This is similar to the way that Axis uses theWSDL to generate the Grid stubs of the service.

The service synchronizer component manages theservice WSDL description. If a service or method hasbeen added or removed, the respective WSDL filesmust be updated to reflect the changes. Updating thefiles requires many auto generated code segments.External XML schemas, which describe publisheddata types, must be imported into the respective WSDLfiles so that the data types can be located and used togenerate the required Java beans and SOAP bindings.Both themessage types, which represent the data typesof the input parameters and the return type of a servicemethod, and the complete operation, which reflects theJava based signature of the method being exposed,have to be described in the WSDL file. The synchro-nization operation automatically keeps this file in syncso that the service implementation and the Grid servicedescription represented by the WSDL match up. Thisensures that the methods implemented in the Gridservice can actually be invoked.

The service can have an inheritance model byadding methods from another service, possibly alongwith the implementations of those methods (seeSection 5.3). If a method were imported from anotherGrid service, the service synchronizer componentwould also pull in the WSDL description of themethod and copy it into the portType of the newservice. This enables the service to have a completelyprotocol compatible implementation of the method.

4.3.3 Service Deployer

The service deployment component currently sup-ports deploying Grid services to a Globus or Tomcatcontainer. The deployment framework, which is anANT based deployer, can easily be extended toprovide deployment capability to other Web Servicecontainers as well as remote containers. It utilizes thelayout of the Grid Archive (GAR) structure toorganize and package a deployable service. Thedeployment process first populates template files (e.g.,the WSDD and JNDI of the service) which specifyservice deployment path or security configuration filelocations. It then gathers the library files required for theservice as well as those library files which contain the

actual runtime code of the service. All configurationfiles and service resources are also collected. Finally, aGAR file is generated for this service. Once the GARfile has been generated, it can be handed off to theparticular deployment handler for the desired orappropriate container.

The deployment framework also allows for deploy-ment-time service properties to be acquired from theapplication developer or the user deploying the service.These custom service properties can be used to defineconfigurable options for a particular deployment of theservice. For example, if a service implementationaccesses a local database, the username and passwordof the database can be specified as service properties.This allows a very convenient way for those deployingthe service to configure it appropriately for theirenvironment, without requiring knowledge of how theGrid service implements these configuration points or acode change prior to deployment.

5 Additional Features of Introduce Engine

In addition to basic service support, the Introduceengine has several additional components that imple-ment features such as compositional inheritance,resource properties framework, multi-service/multi-resource contexts, security configuration, and auto-boxing/unboxing. In this section, we provide anoverview of these components.

5.1 Auto-boxing/Unboxing of Service Operations

Code generation for Grid service interfaces has toassociate the service interface as defined by theservice developer to the actual port type generatedby the GT via Axis. The overall process is compli-cated due to fact that the GT and Axis use documentliteral bindings to create the services portType andbindings to SOAP. For example, if the servicedeveloper describes a method as shown below:

int foo int bar1; int bar2ð Þ;

The port type method that will be created for thecorresponding Java interface call will look like theone below:

FooReturn foo FooParameters paramsð Þ;

418 S. Hastings, et al.

Page 13: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

This style is known as document literal binding. Thisboxing or wrapping of the parameters and the returntype of the service method can be confusing to theservice user and service developer. Since this docu-ment literal style is exposed directly through the clientAPI or the service implementation API, every clientusing the service will have to box up the parametersto call the operations of the service and un-box theresults. Not only will this task be cumbersome forservice users, but the document literal interface is notthe interface that the service designer intended to beprovided to its users. The Introduce toolkit will hidethe boxing and un-boxing of methods by providing aninterface to the service, which looks exactly asdescribed by the service developer and not asinterpreted by Globus via Axis. In order to do this,the toolkit creates a wrapping layer in the client andservice which both implement the clean interface (nondocument literal). These wrapper layers auto-box andun-box and map the calls from the clean client to thedocument literal port type client generated by Globus/Axis and vice versa for the service (see Fig. 7). It isworth noting that Introduce created services can still beaccessed via the standard document literal interfaces.

5.2 Resource Properties Framework

Introduce provides support for exposing service stateand metadata provided by the WSRF-ResourcePro-perties (WSRF-RP) specification by abstracting awaythe details of common patterns of use. The WSRFspecifications and the GT implementation of thesespecifications allow the creation of stateful Gridservices, whilst remaining compatible with existingWeb Service standards. The state (and metadata) of aGrid service is maintained in Resources in the WSRFspecification, and is exposed to clients by way ofResource Properties. Introduce allows its users toexpose state or metadata of their services by manag-ing the definition, creation, and population of theResources and Resource Properties of a service.

Upon creating a service, a service developer is ableto simply select from a list of common resource usagepatterns, and Introduce manages the complete gener-ation of the necessary backend code and configurationto implement that pattern. For example, if the userwants to use a resource pattern which has the abilityto control the lifetime of the resources which arecreated, the user will choose the resource lifetime

style of resource pattern, and Introduce will generate astubbed resource framework which supports the WS-ResourceLifetime specification. Once the resourcepattern is defined, a service developer can then usethe existing data type discovery tools (as used tospecify operation inputs and outputs) in order toexpose the state or metadata of its resources by wayof Resource Properties. Developers can either pro-grammatically control the values of the property atruntime (as is common for exposed dynamic resourcestate), or supply the value from a file at service startup(as is common for exposed static metadata). Introducealso allows service developers to maintain soft-stateregistration of resource properties. The combinationof these simple-to-use, yet powerful features, allows aservice provider to, for instance, automatically pro-vide and register a description of its service to acentral repository when the service is instantiated bythe container.

Fig. 7 Mapping from and to document literal port types andclient/service level method syntax

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 419

Page 14: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

5.3 Compositional Inheritance

Introduce enables the service developer to importmethods from other services (i.e., from other port-Types). This notion is called compositional inheri-tance . Services under the new WSDL 2.0specification cannot extend portType definitions aspreviously accomplished by the pre-WSDL based GT3.2. In order to simulate portType extension, Intro-duce implements the ability to copy operationdescriptions from other portTypes and put them inthe service’s own description. When using thisfeature, there are various configuration options theengine must know such as the namespaces of theoperations, and whether or not an implementation ofthe operations are already provided or should astubbed method be generated just as any otherIntroduce defined operation. In order to be sure thatthe operation can be correctly imported and copied intothe new portType, the WSDL of this operation and anyreferenced schemas must be brought into this newservice and imported. If the operation implementationis not being provided, the synchronization engine mustadd this operation to the services base interface, andthe un-boxing wrapper for this operation must begenerated. If the implementation of this operation isbeing provided, the extra code in the interface, itsimplementation, and the wrapper do not need to beadded. However, the implementation code, in the formof a JAR file, must be brought into the new service,and the operation class must be added to theoperationProviders variable of the WSDD configura-tion file of this service.

5.4 Multi Service/Multi Resource

In Grid Services architecture it is sometimes requiredthat a service not only maintain some extra informa-tion about the service, but also maintain information(state) which is of particular interest to one or severalparticular clients. These styles of Grid services usecases have driven the requirements for specifying amechanism for stateful Grid services via the WSRFbased specifications. Each WSRF service manages itsstate by creating and manipulating Resources. Themain restriction is that a given service can onlymanage a single given resource type.

Consider a Grid data source, such as the oneshown in Fig. 8, which provides access to a database.Assume this data source is designed to provide twocapabilities: a “Query” capability, which allows aclient to query into the backend database, and a“Results Delivery” capability, which provides thesupport to iteratively access query results (similar toa remote cursor). In a Grid service implementation ofthe data source, such capabilities are supportedthrough the Service Context concept in Introduce.The “Query” capability is implemented as “QueryContext” and the “Results Delivery” capability as“Results Delivery Context.” Introduce implementsthese Service Contexts by creating a WSRF servicefor each context (i.e., a Query Service for the Querycontext and a Results Delivery Service for the ResultsDelivery context), and a corresponding Resourcetype. In this example, the Query Service would havea Resource type that represents the backend database(s), and the Results Delivery Service would have a

Fig. 8 Stateful Grid servicepattern

420 S. Hastings, et al.

Page 15: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

Resource type that represents query results. When aclient submits a query by invoking the queryoperation on the Query Service, the service can querythe backend database and then create an instance ofthe Results Delivery Service Resource. The queryoperation then returns a pointer, or EndPointRefer-ence (EPR), based on the WS-Addressing specifica-tion, to this Resource. The client is then able tointeract with the results of the query via the ResultsDelivery Service’s client API, passing in the returnedEPR. Through its support for multiple ServiceContexts, Introduce enables this and other Resourcepatterns for stateful Grid services.

Each Introduce service has at least one ServiceContext (the main service), and can create an arbitrarynumber of additional Service Contexts to support morecomplex resource usage patterns. Each created contexthas a corresponding source directory containing itsown server, client, common, resource, etc. In this way,resource properties, operations, and security config-urations can be added, removed, and modified for eachadditional context. The corresponding service andresource of each additional context are modified,compiled, and deployed with the main service.

5.5 Security

Introduce facilitates the creation and configuration ofsecure Grid services using the Grid Security Infra-structure (GSI) and allows security to be configured atboth the service level and the service method level.Moreover, security can be enforced on both the clientand service sides. Service developers can specify thesecure communication mechanism(s), in which clientsare allowed to communicate with the service. AnIntroduce generated client can automatically beconfigured to communicate over the secure commu-nication mechanism specified by the service. In thecase where multiple secure communication mecha-nisms are supported by the service, Introduce will allowthe service developer to choose which mechanism theclient will use.

Grid Services often need credentials such that theymay authenticate with one another or so they mayauthenticate with clients. Depending on the commu-nication mechanisms supported and the deploymentscenario, a service may inherit its credentials fromthe container hosting the service, or it may beconfigured to have its own credentials. In Introduce,

service credentials can be configured in the form ofcertificate/private key or in the form of a Gridproxy. Introduce also facilitates of additional clientsecurity aspects, these include anonymous securecommunication and delegation.

The toolkit allows for the configuration of bothclient-side and service-side authorization. Clients canbe configured to perform Self Authorization, HostAuthorization, or Identity Authorization, on a Gridservice before allowing the Grid service to beinvoked. Client side authorization is done based onthe service credentials presented to the client by theservice. On the service side, Introduce allows theconfiguration of the Globus Authorization Framework,which enforces authorization policies on services.

At synchronization time, the Introduce engineinterprets the security configuration and uses it toconfigure the service accordingly. Configuration ofservice security requires the engine to make modifi-cations to the client source code, service source code,security descriptor, and server deployment WSDD.The client source code is modified to use theappropriate secure communication mechanism, en-force the specified client authorization mechanism,use the delegation mode specified, and whether or notto communicate anonymously. The service sourcecode is modified to support the delegation codespecified. A security descriptor is created to specifythe secure communication mechanism supported bythe service, configure the authorization framework,and to configure the service credentials. Finally theserver deployment WSDD is edited to add anyconfiguration parameter that may be needed.

6 Introduce Extension Framework

Introduce extension points enable custom plug-ins/extensions to be added to the Introduce engine tofacilitate customization of services it can create. TheIntroduce Extension Framework currently consists oftwo styles of extensions; service and data typediscovery. Extensions are added to the toolkit in theform of extension plug-ins, which the toolkit can thenmake it available to the service developer. To providean extension to Introduce, the extension providermust implement or extend the appropriate classes forthe style of extension they wish to provide, and mustfill out the extension XML configuration document.

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 421

Page 16: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

6.1 Service Extensions

A service extension is one which enables custom-ization of the service creation and modificationprocesses. These extensions can add required oper-ations, service resources or resource properties, orsecurity settings, for example. The service extensionallows the user to provide custom code that will beexecuted at different times throughout the creationand modification processes of service development. Aservice extension consists of 5 main extensioncomponents that can be implemented and providedby the developer: CreationUI, ModificationUI, Crea-tionPostProcess, ModificationPreProcess, and Modifi-cationPostProcess. Each of these extension componentshave a predefined class and interface that must beextended or implemented. Two of the service extensioncomponents, CreationUI and ModificationUI, aregraphical components provided to the Introduce GDE,and the other three are Introduce engine plug-ins.

Each service extension component is invoked at aspecific point in the creation or modification steps(Fig. 9). These different time points for eachcomponent execution are critical for making certainchanges. For example, when the CreationUI compo-nent for a particular extension is executed, the servicehas been created as a blank service and no modifica-

tion or synchronization has been done on the service.At this point the CreationUI component might promptthe user for particular information about the creationprocesses. This component would only be executed/displayed once for any given service and its nongraphical component, the CreationPostProcess com-ponent, will also only be ran one time after the servicehas been created and before it will ever be modified.

The modification components are run every time aservice is saved and synchronized. The graphicalmodification component is always available in theIntroduce GDE during modification time. The twoservice modification engine components, Modification-PreProcess and ModificationPostProcess are executedrespectively, before and after the synchronizationprocess is executed. This enables ModificationPrePro-cess to do such things as modify the service description,represented by the Introduce service description file, orthe WSDL files of the services. The ModificationPost-Process, on the other hand, might move in requiredfiles, or populated stubbed methods, etc.

6.2 Data Type Discovery Extensions

These extensions are Introduce GDE components thatare accessed at the service modification step. They are

Fig. 9 Execution of Introduce extension components

422 S. Hastings, et al.

Page 17: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

intended to provide support for custom data typediscovery for the service developer. They must allowthe developer to browse types and chose to use thosetypes in the developed service. This means that a datatype discovery extension will have to be able to copythe schemas which represent the data types down tothe service’s schema directory and produce a Name-spaceType object for the namespace of each separatedata type. This enables the Grid service to utilize theschemas for describing the data types which are usedin the WSDL messages traveling in and out of thecreated service.

7 Conclusion

Grid computing has become the technology of choicethat enables more effective use of distributed data,analytical, information, computational, and storageresources in coordinated, multi-institutional effortsand for large scale applications. The need for highlyinteroperable infrastructure is becoming more impor-tant in order to address the highly heterogeneous anddynamic nature of organizations and groups withinthem. The Grid Services technology offers a viable

platform to meet this need and facilitate the applica-tion of Grid computing in a wider range of applicationdomains. In this paper, we have proposed andpresented a toolkit, called Introduce, that is designedto facilitate rapid development of strongly typed,WSRF-compliant secure Grid Services and thatsupports the main steps of service development;service creation, service modification, and servicedeployment. In addition to ease-of-use and hiding thecomplexity of low-level Grid middleware infrastruc-ture, Introduce encapsulates several novel features:(1) It supports development of strongly typed ser-vices, in which input and output parameters of servicemethods are drawn from published, community-accepted data types. Strongly typed services enablesincreased syntactic interoperability among servicesand are critical to programmatic access to and correcthandling of data returned from services. (2) It allowsfor service developers to easily leverage advancedfeatures of WSRF and the GT 4.0 such as resourceframework, multi-resource/multi-service, and compo-sitional inheritance. (3) It provides a powerfulextension framework for Introduce users to customizeand extend the base Introduce functionality. Throughuse of plug-ins, this framework goes beyond simple

Fig. 10 Basic service layout of an example service created by Introduce

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 423

Page 18: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

extensibility via service templates and enables integra-tion of both custom code/template generation logic andcustom graphical components as complex plug-ins.

The design of Introduce is motivated mainly by thecaBIG effort and its Grid infrastructure, called caGrid,

designed to support implementation, execution, andmanagement of services and applications in the caBIGenvironment. Introduce has been successfully employedin the development of services within the caBIG caGridinfrastructure as well as data and analytical services that

Fig. 11 Source files used in an example service created by Introduce

Fig. 12 Example common/configuration files on an example service created by Introduce

424 S. Hastings, et al.

Page 19: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

make use of the caGrid functionality. We believe thatGrid service development tools such as Introduce andother related technologies will help bring the technologyof Grid Computing to the forefront of distributedinfrastructure and greatly increase the adoptability ofGrid service technologies in areas from scientificresearch to business computing.

In the future we plan to provide even higher levelof service interoperability at the semantic level. Weplan to investigate the notion of not only designingservices from common public data types but also frompublicly available modeled operation signatures oreven service interfaces. We anticipate that increasingthe availability of these types of information andproviding tools which can aid in the creation of Gridservices from them will enable a higher level ofsemantic interoperability, which will lead to APIharmonization across domains.

Appendix

Examples of the Various Files and DirectoriesManaged by Introduce for a Service

Figure 10 shows a basic service layout as created bythe Introduce Service Creator component. It showswhich pieces of this example service are generated bythe code generation tools, which pieces are built bythe underlying Globus/Axis tools, and which piecesare to be implemented by the service developer.

Figure 11 describes what some particularly criticalsource files are generated for in an example servicecreated by the Introduce engine. When a WSDL file isparsed by the Axis engine, a PortType interface iscreated which is the Java representation of the API ofthe Grid service. The Axis generated PortTypeinterface must then be implemented on the serviceto provide the services implementation. In order toenable the service developer to implement a cleaner,non-documument literal interface, Introduce willautomatically create the implementation of thisPortType interface (HelloWorldProviderImpl). TheIntroduced generated implementation of this interfacewill unbox the document literal calls ot the serviceand pass them on to the unboxed/clean interface(HelloWorldI) which the user defined. Introduce willgenerate a stubbed implementation of this interface

(HelloWorldImpl) which the user will be responsiblefor implementing. This class will maintain theservices implementation of the methods. This enablesthe service designer to be shielded from the details ofthe Axis document literal Grid service interface andenable them to implement an interface which is asthen originally described. The figure also illustratesthe example service’s resource and resource home,which are generated to manage the service resourceand the service resource properties, respectively.

Figure 12 shows the use of the different commonfiles of an example Introduce created service. It showsthe files used for configuring registration and securityfor the Grid service, as well as those used by Introducefor synchronization, and those used for build anddeployment. This example service created by Introducealso contains Eclipse project files so that the servicecan easily be edited using the Eclipse platform (http://www.eclipse.org).

References

1. Foster, I., Kesselman, C.: Globus: a metacomputinginfrastructure toolkit. Int. J. High Perform. Comput. Appl.11, 115–128 (1997)

2. Foster, I., Kesselman, C., Tsudik, G., Tuecke, S.: A securityarchitecture for computational Grids. In: Proceedings of the5th ACM Conference on Computer and CommunicationsSecurity Conference, pp. 83–92. ACM, New York (1998)

3. Welch, V., Siebenlist, F., Foster, I., Bresnahan, J.,Czajkowski, K., Gawor, J., Kesselman, C., Meder, S.,Pearlman, L., Tuecke, S.: Security for Grid services. In:12th International Symposium on High PerformanceDistributed Computing (HPDC-12), 2003

4. Langella, S., Oster, S., Hastings, S., Siebenlist, F., Kurc, T.,Saltz, J.: Dorian: Grid service infrastructure for identitymanagement and federation. In: The 19th IEEE Sympo-sium on Computer-Based Medical Systems, Special Track:Grids for Biomedical Informatics, Salt Lake City, Utah,2006

5. Wolski, R., Spring, N., Hayes, J.: Network weather service: adistributed resource performance forecasting service for meta-computing. Future Gener. Comput. Syst. 15, 757–768 (1999)

6. Czajkowski, K., Foster, I., Kesselman, C., Sander, V.,Tuecke, S.: SNAP: a protocol for negotiating service levelagreements and coordinating resource management indistributed systems. In: Job Scheduling Strategies forParallel Processing, vol. 2537, pp. 153–183. Springer,Berlin Heidelberg New York (2002)

7. Frey, J., Tannenbaum, T., Livny, M., Foster, I., Tuecke, S.:Condor-G: a computational management agent for multi-institutional Grids. In: Proceedings of the Tenth Interna-

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 425

Page 20: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

tional Symposium on High Performance Distributed Com-puting (HPDC-10). IEEE Press, Piscataway, NJ (2001)

8. Casanova, H., Graziano, O., Berman, F., Wolski, R.: TheappLeS parameter sweep template: user-level middlewarefor the Grid. In: Proceedings of the ACM/IEEE Super-computing Conference (SC2000). IEEE Computer SocietyPress, Los Alamitos, CA (2000)

9. Allcock, B., Bester, J., Bresnahan, J., Chervenak, A.,Foster, I., Kesselman, C., Meder, S., Nefedova, V.,Quesnal, D., Tuecke, S.: Data management and transfer inhigh performance computational Grid environments. Paral-lel Comput. 28, 749–771 (2002)

10. Allcock, W.E., Foster, I., Madduri, R.: Reliable datatransport: a critical service for the Grid. In: Proceedingsof Building Service Based Grids Workshop, Global GridForum 11, Honolulu, HI, 2004

11. Chervenak, A., Deelman, E., Kesselman, C., Allcock, B.,Foster, I., Nefedova, V., Lee, J., Sim, A., Shoshahi, A.,Drach, B., Williams, D., Middleton, D.: High-performanceremote access to climate simulation data: a challengeproblem for data Grid technologies. Parallel Comput. 29,1335–1356 (2003)

12. Chervenak, A., Deelman, E., Foster, I., Guy, L., Hoschek, W.,Iamnitchi, A., Kesselman, C., Kunst, P., Ripeanu, M.,Schwartzkopf, B., Stockinger, H., Tierney, B.: Giggle: aframework for constructing scalable replica location services.In: Proceedings of the ACM/IEEE Supercomputing Confer-ence (SC2002), pp. 1–17. IEEE Computer Society Press, LosAlamitos, CA (2002)

13. Ranganathan, K., Foster, I.: Identifying dynamic replicationstrategies for high performance data Grids. In: Proceedingsof International Workshop on Grid Computing (GRID2002) Denver, CO, 2002

14. Deelman, E., Singh, G., Atkinson, M.P., Chervenak, A.,Chue Hong, N.P., Kesselman, C., Patil, S., Pearlman, L.,Su, M.: Grid-based metadata services. In: Proceedings ofthe 16th International Conference on Scientific andStatistical Database Management (SSDBM ‘04), 2004

15. Singh, G., Bharathi, S., Chervenak, A., Deelman, E.,Kesselman, C., Mahohar, M., Pail, S., Pearlman, L.: Ametadata catalog service for data intensive applications. In:Proceedings of the ACM/IEEE Supercomputing Confer-ence (SC2003), 2003

16. Hastings, S., Langella, S., Oster, S., Saltz, J.: Distributed datamanagement and integration: theMobius project. Proceedingsof the Global Grid Forum 11 (GGF11) Semantic GridApplications Workshop, Honolulu, HI, pp. 20–38, 2004

17. Allen, G., Dramlitsch, T., Foster, I., Goodale, T., Karonis, N.,Ripeanu, M., Seidel, E., Toonen, B.: Cactus-G toolkit:supporting efficient execution in heterogeneous distributedcomputing environments. In: Proceedings of the 4th GlobusRetreat Pittsburg, PA, 2000

18. Furmento, N., Lee, W., Mayer, A., Newhouse, S., Darlington,J.: ICENI: an open Grid service architecture implementedwith JINI. In: Proceedings of the ACM/IEEE SupercomputingConference (SC2002) Baltimore, MD. IEEE ComputerSociety Press, Los Alamitos, CA (2002)

19. Beynon, M., Kurc, T., Sussman, A., Saltz, J.: Design of aframework for data-intensive wide-area applications. In:

Proceedings of the 9th Heterogeneous Computing Work-shop (HCW2000), pp. 116–130. IEEE Computer SocietyPress, Los Alamitos, CA (2000)

20. Altintas, I., Bhagwanani, S., Buttler, D., Chandra, S.,Cheng, Z., Coleman, M., Critchlow, T., Gupta, A., Han, W.,Liu, L., Ludascher, B., Pu, C., Moore, R., Shoshani, A., Vouk,M.: A modeling and execution environment for distributedscientific workflows. In: Proceedings of the 15th InternationalConference on Scientific and Statistical Database Manage-ment (SSDBM ‘03) Boston, MA, 2003

21. Deelman, E., Blythe, J., Gil, Y., Kesselman, C., Mehta, G.,Vahi, K., Blackburn, K., Lazzarini, A., Arbree, A.,Cavanaugh, R., Koranda, S.: Mapping abstract complexworkflows onto Grid environments. J. Grid Computing 1,25–39 (2003)

22. Foster, I., Voeckler, J., Wilde, M., Zhao, Y.: Chimera: avirtual data system for representing, querying, and auto-mating data derivation. In: Proceedings of the 14thConference on Scientific and Statistical Database Manage-ment (SSDBM ‘02), 2002

23. Foster, I., Kesselman, C., Nick, J.M., Tuecke, S.: ThePhysiology of the Grid: An Open Grid Services Architec-ture for Distributed Systems Integration. Open Grid ServiceInfrastructure Working Group Technical Report, GlobalGrid Forum. http://www.globus.org/alliance/publications/papers/ogsa.pdf (2002)

24. Foster, I., Kesselman, C., Tuecke, S.: The anatomy of theGrid: enabling scalable virtual organizations. Int. J. Super-comput. Appl. 15, 200–222 (2001)

25. Cerami, E.: Web Services Essentials. O’Reilly (2002)26. Graham, S., Simeonov, S., Boubez, T., Davis, D., Daniels, G.,

Nakamura, Y., Neyama, R.: Building Web Services with Java:Making Sense of XML, SOAP, WSDL, and UDDI. SAMSPublishing (2002)

27. Czajkowski, K., Ferguson, D.F., Foster, I., Frey, J.,Graham, S., Sedukhin, I., Snelling, D., Tuecke, S.,Vambenepe, W.: The WS-Resource Framework Version1.0. http://www.globus.org/wsrf/specs/ws-wsrf.pdf (2004)

28. Hastings, S., Langella, S., Oster, S., Kurc, T., Pan, T.,Catalyurek, U., Janies, D., Saltz, J.: Grid-based managementof biomedical data using an xML-based distributed datamanagement system. In: Proceedings of the 20th ACMSymposium on Applied Computing (SAC 2005), Bioinfor-matics Track, Santa Fe, NewMexico. ACM, NewYork (2005)

29. Langella, S., Hastings, S., Oster, S., Kurc, T., Catalyurek, U.,Saltz, J.: A distributed data management middleware fordata-driven application systems. In: Proceedings of the 2004IEEE International Conference on Cluster Computing(Cluster 2004), 2004

30. Saltz, J., Oster, S., Hastings, S., Kurc, T., Sanchez, W.,Kher, M., Manisundaram, A., Shanbhag, K., Covitz, P.:caGrid: design and implementation of the core architectureof the cancer biomedical informatics Grid. Bioinformatics22(15), 1910–1916 (2006)

31. Smith, M., Friese, T., Freisleben, B.: Model drivendevelopment of service oriented Grid applications. In:Advanced International Conference on Telecommunica-tions and International Conference on Internet and WebApplications and Services (AICT-ICIW ‘06), 2006

426 S. Hastings, et al.

Page 21: Introduce: An Open Source Toolkit for Rapid Development · PDF fileIntroduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services Shannon Hastings & Scott

32. Friese, T., Smith, M., Freisleben, B.: Grid developmenttools for eclipse. In: Eclipse Technology eXchange work-shop (eTX) at ECOOP 2006, Nantes, France, 2006

33. Morohoshi, H., Huang, R.: A user-friendly platform fordeveloping grid services over globus toolkit 3. In: The2005 11th International Conference on Parallel and Dis-tributed Systems (ICPADS’05), 2005

34. Mizuta, S., Huang, R.: Automation of Grid service codegeneration with AndroMDA for GT3. In: The 19thInternational Conference on Advanced Information Net-working and Applications (AINA’05), 2005

35. von Laszewski, G., Foster, I., Gawor, J., Lane, P.: A Javacommodity grid kit. Concurr. Comp.-Pract. E. 13, 643–662(2001)

36. von Laszewski, G., Foster, I., Gawor, J., Smith, W.,

Tuecke, S.: CoG kits: a bridge between commoditydistributed computing and high performance Grids. In:ACM Java Grande 2000 Conference, 2000

37. Humphrey, M., Wasson, G.: Architectural foundations ofWSRF .NET. JWSR 2, 83–97 (2005)

38. Kacsuk, P., Doza, G., Kovacs, J., Lovas, R., Podhorszki, N.,Balaton, Z., Gombas, G.: P-GRADE: a Grid programmingenvironment. J. Grid Computing 1, 171–197 (2003)

39. Fahringer, T., Jugravu, A., Pllana, S., Prodan, R., Seragiotto,C. Jr., Truong, H.-L.: ASKALON: a tool set for cluster andGrid computing. Concurr. Comp.-Pract. E. 17, 143–169(2005)

40. Delaitre, T., Kiss, T., Goyeneche, A., Terstyanszky, G., Winter,S., Kacsuk, P.: GEMLCA: running legacy code applications asGrid services. J. Grid Computing 3, 75–90 (2005)

Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services 427