[ieee 2008 ieee congress on services part ii (services-2) - beijing, china (2008.09.23-2008.09.26)]...

2008 IEEE Congress on Services Part II, 978-0-7695-3313-1/08 $25.00 © 2008 IEEE, DOI 10.1109/SERVICES-2.2008.13

Web Services Operation and Parameter Matchmaking based on Free-form User Queries

Chaitali Gupta, Rajdeep Bhowmik, Madhusudhan Govindaraju Department of Computer Science, State University of New York (SUNY) at Binghamton, NY

{cgupta1, rbhowmi1, mgovinda}@binghamton.edu

Abstract

Service-oriented architectures (SOA), based on Web services as the underlying architecture, have the potential to facilitate dynamic evolution of business processes. However, to fully realize the benefit of the SOA vision, it is critical to shield application developers from the complexity of Web services and related XML-based formats. In the currently available toolkits, users often play an informed role in the mapping of XML-based specifications, operation names, message structures, and parameter types. The focus of our work is to design and develop an elegant, intuitive, simple, and powerful free-form query based system that allows interaction with Web services, without requiring the end-user to understand the operation names, parameter types, and other XML-based information in a WSDL document. Our system uses Semantic Web concepts, ontologies, and WordNet in the process of automating Web services matchmaking. In this paper we focus on presenting techniques for matching free-form queries with appropriate Web service operations and extracting parameter values from user queries. We quantify the accuracy of our methodologies in terms of precision and recall.

1. Introduction

Web services and XML-based standards have emerged as important building blocks for distributed application development. The adoption of these standards is primarily due to the active participation of the community in the design of flexible and extensible XML-based specifications. Among these specifications, the Web Service Description Language (WSDL) is widely used to describe Web services. A WSDL document exposes service functionality to end-users in enough detail to allow building of client applications to interact with the service, usually via the SOAP protocol. The advantage of using the XML Web services model is that programs can be written in different languages and on different platforms to communicate with Web services, or for Web services to exchange information with each other. One of the distinguishing factors of Web services, compared to other approaches in distributed computing such as CORBA, is the vision of a significantly lower barrier to entry. WSDL, SOAP, and other XML-based standards were initially thought of as being simple and less complex compared to specifications in other distributed computing technologies. As a result, these XML-based standards are designed to be compliant with standard Web protocols such as HTTP.

However, the vision of simplifying application development with XML Web services has not been fully realized. XML-based specifications provide only syntactical descriptions of the functionality provided by Web services. Even though a wide variety of tools are available to invoke Web services, the lack of semantics associated with Web service descriptions requires user intervention in understanding the verbose XML documents. Users are often required to scan through a WSDL document to determine the list of available operations, the input and output parameter types, and the set of ports and service endpoints. The next step for a user is to learn the usage details of her particular Web services toolkit (WSIF [1], gSOAP [2], Axis [3], for example). As a result, the current programming paradigm with Web services does not meet the ease-of-use criterion, which is a critical component of the SOA vision.

Our work addresses this problem by focusing on providing the “ease-of-use” experience to end-users. We have developed several algorithms and optimization techniques that map free-form user queries to relevant operations and match input-output messages in domain-specific Web services. Our system presents a simple interface, similar to HTML-based search engines, which accepts user queries and presents the end user with results after matching Web services operations and parameter values. The query matching techniques that we leverage include Semantic Web [4] and ontology technologies such as OWL [5], as well as tools such as WordNet [6]. This enables our system to retrieve contextual information from queries and determine the set of Web service operations that need to be executed for a given user query. In this approach, the details of Web services specification and implementation are hidden from the user. For example, suppose a user wants to check the weather for a trip from Philadelphia to Miami. In our system, the user only needs to enter the free-form query "weather for travel from Philadelphia to Miami". An important distinction from other toolkits, which also aim for ease-of-use, is that in our system the user does not have to fill in detailed forms for each service in the matchmaking process.

We conducted experiments to evaluate the accuracy of our system using four domains that currently have a large number of functional Web services publicly available: travel, location, currency, and weather. Our implementation uses WordNet 2.0 Dictionary [6] and the JWNL 1.3 API [9]. The JWNL API is used to access the WordNet dictionary.

In this paper we significantly extend our previous work that was focused on just simple matching of query keywords with operation names in WSDL files [23, 24]. We have enhanced the matchmaking algorithms with new algorithms, incorporated processing of unmatched query words in the matchmaking model, designed and implemented extraction of parameter values from free-form queries, defined a model for determining a ranking score, and quantified the accuracy of our methodologies using precision and recall.

The remainder of the paper is organized as follows. In Section 2, we discuss our previous work on how the user queries are mapped to appropriate Web service operations. Section 3 describes how the parameter matchmaking takes place. In Section 4 we discuss the basis of our ranking methodology. In Section 5 we present details of the performance study of the matchmaking methodologies. Section 6 provides details of related work. Sections 7 and 8 discuss conclusions and future work respectively.

2. Previous Work

In previous work, we focused on the design and development of a matchmaking module to match user queries with corresponding operation names in WSDL documents. We provide a brief overview of our previous work in this section. Performance evaluation and detailed description can be found in our other publications [23, 24]. There are 3 steps in the matchmaking process: (i) Ontology Matchmaking, (ii) Dictionary Matchmaking and (iii) Fallback Mechanism.

2.1. Ontology Matchmaking

The vocabularies for the four domains are modeled using Jena [12], which is a widely used framework for building Semantic Web applications. It provides an easy to use API for processing vocabularies. For our initial implementation, we chose to use OWL [5], instead of RDF/RDFS [10, 11] as it allows an accurate and complete representation of a domain. RDFS, on the other hand, neither allows specification of properties as disjoint nor is it possible to restrict the cardinality of any given property.

The query words provided by the user are matched against the statements in the ontology. Each statement is made up of three entities - Subject, Predicate and Object. Prepositions in the query words are marked with a special flag as they can provide important information to determine the context of the user query. For example, for a query string "Best price for flight from Detroit to Houston on Sunday", we can infer from the prepositions that Detroit and Houston are geographic locations. Furthermore, the use of "from" and "to" in the client query can be used to infer that Detroit is the originating location and Houston is the destination. Once an ontology model is lit up by the query, the corresponding sentence is returned and the query words are stored against the ontology domain.
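The role of flagged prepositions can be illustrated with a minimal Python sketch. It is not the implementation described in this paper: the preposition table, the role names, and the function `infer_roles` are hypothetical, and the sketch simply assigns a role to the word that directly follows each flagged preposition.

```python
# Hypothetical preposition-to-role table; the actual flags used by the
# system are not listed in the paper.
PREPOSITION_ROLES = {"from": "origin", "to": "destination", "on": "date"}


def infer_roles(query: str) -> dict:
    """Assign a role to the word that directly follows a flagged preposition."""
    words = query.replace("?", "").split()
    roles = {}
    for previous, word in zip(words, words[1:]):
        role = PREPOSITION_ROLES.get(previous.lower())
        if role is not None:
            roles[role] = word
    return roles


roles = infer_roles("Best price for flight from Detroit to Houston on Sunday")
print(roles)
```

A real system would also have to handle multi-word locations such as "Los Angeles", which this one-word lookahead does not attempt.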

The Lexicon module is used to obtain better contextual information relevant to the client query. It employs synonym, hypernym, and hyponym matching techniques to take into consideration different senses of a particular word. A hypernym is a word whose meaning denotes a superclass, and a hyponym is a word whose meaning denotes a subordinate or subclass. This is required because query words often do not directly match the operation names specified in the WSDL document. Metanym Matching [13] is not currently used in our system. To avoid redundancy, query words are also stemmed. For example, for a query "Is it snowing in Buffalo right now?", the word "snowing" in the client query string gets stemmed to the root word "snow", which is then retrieved from the Lexicon block.

Once synonym matching is applied, we consider four possible results:

• Neither the query word nor the synonym words are present in any of the ontology models.

• Some of the synonyms are present, but not the query word.

• The query word is present, but not its synonyms.

• Both the query word and its synonyms are present in the ontology model.

2.2. Dictionary Matchmaking

The dictionary matcher is employed to further improve the chances of determining the context of a user query. It applies (1) direct matching; (2) stripped matching; and (3) dictionary level matching.

• Direct Matching – The WSDL processor stores operation names both in the original format and in stripped form. The stripped form of an operation name is the name with its stop words removed. The operation names in a WSDL file, along with the stripped operation names, are matched directly with the query words provided by the user. On a positive match, the operation names and corresponding WSDL file names are stored.

• Stripped Matching – This is similar to direct matching, except that the client query words are also stripped before being matched against operation names.

• Dictionary Level Matching – In this module, the synonyms of the query words are first matched with the operation names and the stripped operation names. Next, hypernyms and hyponyms of the query words are matched with the operation names. On a positive match, operation names and corresponding WSDL document names are stored.
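The three matching levels can be sketched in Python as follows. This is only an illustration under stated assumptions: the stop-word list is invented, the `SYNONYMS` table is a hand-written stand-in for WordNet lookups, and a "match" is taken to mean that the query word sequence equals the operation name's word sequence, which is one possible reading of the description above.

```python
import re

STOP_WORDS = {"get", "by", "for", "the", "of", "at", "in"}  # assumed stop-word list
SYNONYMS = {"temperature": {"weather"}}  # hypothetical stand-in for WordNet


def split_name(operation_name):
    """Split a camelCase operation name into lowercase words."""
    pattern = r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])"
    return [w.lower() for w in re.findall(pattern, operation_name)]


def strip_stop_words(words):
    return [w for w in words if w not in STOP_WORDS]


def match_level(query, operation_name):
    """Return the first matching level that succeeds, or None."""
    op_words = split_name(operation_name)
    op_forms = (op_words, strip_stop_words(op_words))  # original and stripped
    query_words = query.lower().split()
    if query_words in op_forms:
        return "direct"
    if strip_stop_words(query_words) in op_forms:
        return "stripped"
    related = set()
    for word in query_words:
        related |= SYNONYMS.get(word, set())
    if related & set(op_words):
        return "dictionary"
    return None


print(match_level("weather for the city", "getWeatherByCity"))
```

Hypernym and hyponym lookups would extend the `related` set in the same way as synonyms; they are left out to keep the sketch short.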

2.3. Fallback Mechanism

The lexicon module provides access to the glossary for each word. The lexicon block also allows matching against the input and output parameters of the methods, the part names, and the comments and annotations in the WSDL files. Consider the query string "forecast for Boston today". The query word "forecast" carries the main contextual information in the user query. Suppose the word "forecast" is neither in the ontology model nor can it be found by dictionary level matching. This happens in cases where the Lexicon block does not provide any word related to weather. The information obtained from the glossary on "forecast" is the following: "a prediction about how something (as the weather) will develop". By processing this information from the glossary it is possible to infer that "forecast" is related to weather.

3. Extraction of Parameter Values from User Queries

A user query Q of length n consists of keywords k1, k2…kn after the removal of stop words; both matched and unmatched words can be present in the set of keywords. Matched words are ones that occur at least once in the domain-specific ontology files. A frequency value is also associated with each matched word, stored as an attribute, based on its number of occurrences in the ontology file. Unmatched words are those that are not present in the ontology files when the query is being processed. If a keyword is matched, we denote it in our transformation function with a subscript ‘m’; otherwise, we denote an unmatched keyword with a subscript ‘u’. The index of the keyword determines its position in the user query. For example, given a query length of 4 with matched words at positions 2 and 3, we denote the set of keywords as {k1u, k2m, k3m, k4u}.
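As an illustration of this notation, the following Python sketch tags each keyword as matched or unmatched. It assumes, purely for the example, that the ontology vocabulary is available as a flat list of terms and that a small stop-word list is given; the function `tag_keywords` and both inputs are hypothetical.

```python
from collections import Counter

STOP_WORDS = {"at", "in", "for", "the", "to", "is"}  # assumed stop-word list


def tag_keywords(query, ontology_terms):
    """Label each keyword k<i>m (matched) or k<i>u (unmatched) with its frequency."""
    frequency = Counter(term.lower() for term in ontology_terms)
    keywords = [w for w in query.lower().split() if w not in STOP_WORDS]
    tagged = []
    for position, word in enumerate(keywords, start=1):
        tag = "m" if frequency[word] > 0 else "u"
        tagged.append((f"k{position}{tag}", word, frequency[word]))
    return tagged


# Hypothetical flattened vocabulary of a weather ontology.
ontology_terms = ["weather", "temperature", "weather", "forecast"]
tagged = tag_keywords("weather at Chicago", ontology_terms)
print(tagged)
```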

It is to be noted that in our system, the matched words determine the domain the user query belongs to, based on a walk-through of the domain-specific ontology files. We extend our previous work [23] by taking into account unmatched words, which we believe have the potential to provide useful information to determine the exact context of a query more accurately as well as fill in the parameter values needed to invoke an operation. We propose a Transformation Function TF for each unmatched word in the user query to achieve this objective.

$$T_F\left(\sum_{i=1}^{n} k_{iu}\right) = \forall\, c$$

The transformation of unmatched keywords to matched concepts pertaining to the particular domain ontology imparts more contextual knowledge to the query by examining the synsets, hypernyms, and hyponyms of the unmatched keywords. A synset (synonym set) represents a concept and contains a set of words, each of which is synonymous with the other words in the synset. The transformation is accomplished in the following two steps:

• In the first step, the synsets are generated for each unmatched keyword using the WordNet dictionary. In addition to synonyms, hypernyms, and hyponyms of the keywords, each synset contains the definition of the keywords and examples of the senses. Each synset is normalized with the techniques outlined by Hai He et al. [7]. Stop words are removed and the root or stem word of each synset is stored and represented as a vector of terms. The term frequency of each term in the vector is then calculated from the number of occurrences of the term in the ontology files.

• In the second step, similar synsets are merged and grouped into different matched concepts, and each unmatched keyword is assigned to a matched concept that contains the synset of the word. There is a clear motivation behind merging and grouping similar synsets. For example, we can find “city”, “metropolis”, and “urban center” as synsets of the unmatched word Boston. If we group similar synsets and assign Boston to the group, then any operation and input messages with names such as getWeatherByCity or getWeatherByMetropolis can consider Boston as a potential parameter. The rules for merging and grouping similar synsets follow the approach of Hemayati et al. [22]. The rules are summarized below:

(i) If two synsets syn1 and syn2 have the same direct hypernym and/or hyponym synset, or one is the direct hypernym and/or hyponym of the other, then syn1 and syn2 are merged.

(ii) If syn1 and syn2 contain the same synonym, then syn1 and syn2 are merged.

(iii) If there exists a synset syn3 for which syn1 and syn3 share a direct hypernym, and syn2 and syn3 share a direct hypernym, the synsets syn1 and syn2 are merged because of the fact that they have the same coordinate terms.
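The merging rules can be sketched with a small union-find grouping in Python. The `Synset` class and the sample data are hypothetical and are not actual WordNet output. To keep the sketch short, only hypernym links are modeled, so the shared-hyponym part of rule (i) is omitted. Rule (iii) follows from the transitivity of the grouping, provided the intermediate synset is among the inputs.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    name: str
    synonyms: set
    hypernyms: set = field(default_factory=set)  # names of direct hypernyms


def should_merge(a, b):
    share_parent = bool(a.hypernyms & b.hypernyms)  # rule (i): common hypernym
    parent_child = a.name in b.hypernyms or b.name in a.hypernyms  # rule (i)
    share_synonym = bool(a.synonyms & b.synonyms)  # rule (ii)
    return share_parent or parent_child or share_synonym


def group_synsets(synsets):
    """Group synsets into matched concepts with a union-find structure."""
    parent = list(range(len(synsets)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(synsets)):
        for j in range(i + 1, len(synsets)):
            if should_merge(synsets[i], synsets[j]):
                parent[find(i)] = find(j)
    groups = {}
    for i, synset in enumerate(synsets):
        groups.setdefault(find(i), set()).add(synset.name)
    return sorted(sorted(group) for group in groups.values())


# Illustrative data only.
sample = [
    Synset("city", {"city", "metropolis", "urban center"}, {"municipality"}),
    Synset("metropolis", {"metropolis"}, {"urban area"}),
    Synset("town", {"town"}, {"municipality"}),
    Synset("currency", {"currency", "money"}, {"medium of exchange"}),
]
concepts = group_synsets(sample)
print(concepts)
```

Here "metropolis" joins "city" through a shared synonym and "town" joins through a shared direct hypernym, while "currency" stays in its own group.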

Each generated matched concept is then looked up in the domain-specific ontology file and assigned a frequency value based on the number of occurrences. Both matched and unmatched keywords and their synsets and hypernyms are used to identify the operations that are to be invoked. However, unmatched words have the added importance of providing clues on the parameter values.

For example, consider a query “convert 100 dollars to pounds”. In the currency ontology file, the words “dollar” and “pound” are present as they represent currencies of two different nations. The unmatched keywords are “convert” and “100”. In the context of the query, an operation is required that can convert “100” dollars to pounds in order to satisfy the user requirements. So, an operation whose name contains the keyword “convert”, or any of its synsets or hypernyms that belong to the matched concept, can be invoked, while “100” will be the parameter value. From a simple query like this, it can be seen that the keyword “100” does not generate any synset that enhances the query context but only provides the parameter value. In general, any keyword that does not produce a relevant synset can be considered only as a potential parameter value. To illustrate our observation, we consider the query “weather at Chicago”. The matched word “weather” assists in determining that the query is relevant to the weather domain, but the unmatched word “Chicago” is a possible candidate for a parameter value. If we look up the synset or hypernym of Chicago, WordNet will refer to it as a city or metropolis. So, for a weather service with an operation named “weatherbyCity” that accepts a string as input parameter, or an operation named “getWeather” that accepts an input parameter “city”, both operations are possible candidates for invocation. As a result, we can infer that while matched words only signify the domain of a query, unmatched words can provide potential parameter values as well as narrow down the context of the query to suit the user requirements.

To better extract and map parameters from user queries, we manually examined WSDL operations, their part names, message types, and input-output parameters and created an input parameter file (in XML format) specific to each ontological domain. This implementation is optimized for most common instances of user queries in relevance to a particular domain. For a particular domain, the input parameter file will initially contain input parameters and their types that are most prevalent in that particular domain. For example, in the weather domain, the input parameters and types will be city (string), time (string) and zip code (integer). This input parameter file will suffice for queries like “weather at Boston”, “weather at 2 PM”, or “weather at 13905”. However, if a WSDL operation name is present in the weather domain with an input parameter as state (string) and a user query is received as “weather at Florida”, the input parameter file can be extended with state (string) as a parameter for subsequent runs of the system.
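The paper does not give the layout of the input parameter file, so the following Python sketch assumes a hypothetical XML structure with one `parameter` element per entry. It shows how such a file could be loaded and then extended with state (string) for subsequent runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout; element and attribute names are assumptions.
PARAMETER_FILE = """
<domain name="weather">
  <parameter name="city" type="string"/>
  <parameter name="time" type="string"/>
  <parameter name="zipcode" type="integer"/>
</domain>
"""


def load_parameters(xml_text):
    """Return a mapping from parameter name to parameter type."""
    root = ET.fromstring(xml_text.strip())
    return {p.get("name"): p.get("type") for p in root.findall("parameter")}


def extend_parameters(xml_text, name, type_name):
    """Return the XML text with one more parameter, unless it already exists."""
    root = ET.fromstring(xml_text.strip())
    if all(p.get("name") != name for p in root.findall("parameter")):
        ET.SubElement(root, "parameter", {"name": name, "type": type_name})
    return ET.tostring(root, encoding="unicode")


parameters = load_parameters(PARAMETER_FILE)
extended = load_parameters(extend_parameters(PARAMETER_FILE, "state", "string"))
print(parameters)
print(extended)
```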

4. Ranking Methodology

The ranking value is determined by taking into account both matched and unmatched keywords. All unmatched keywords are transformed into matched concepts using the transformation function discussed in Section 3. Both matched and unmatched keywords are assigned frequency values based on their number of occurrences in the domain-specific ontology files. Unmatched words and their matched concepts (determined from the transformation function) along with matched words are grouped together as a set. We denote this set as S*.

The operation names, along with their input messages, part names, and part types, are extracted from the WSDL files using JWNL API, and stored in a vector. In order to obtain more accurate matchmaking results, each operation name and its corresponding input messages and part names are normalized [7]. We denote this operation vector as a set R* and each operation object as R.

In order to calculate the similarity between an operation R and a matched concept S, we have modified the Okapi function [21] as discussed by Hemayati et al. in [22]. We calculate the score as:

$$\mathrm{score}(R, S) = \sum_{T \in Q} \frac{W_1 + W_2}{2} \cdot W(T, R) \cdot W(T, S)$$

where for $i = 1, 2$

$$W_i = \log \frac{N_i - n_i + 0.5}{n_i + 0.5},$$

$$W(T, R) = \frac{(K + 1) \cdot tf(R)}{K(R) + tf(R)}$$

and

$$K(R) = K \cdot \left( (1 - b) + b \cdot \frac{opl(R)}{avg\_opl(R^*)} \right)$$

where $N_1$ and $N_2$ are the number of operations in R* and the number of matched concepts in S* respectively. $n_1$ and $n_2$ are the numbers of operations and matched concepts containing the term T respectively. $W_1$ and $W_2$ are the importance of term T with respect to operations in R* and matched concepts in S*, which is equivalent to the idf (inverse document frequency) weight in information retrieval systems. $W(T, R)$ computes the importance of term T in R, and $W(T, S)$ is the same as $W(T, R)$ with R replaced by S and R* replaced by S*. $avg\_opl(R^*)$ is the average number of words in R*. $tf(R)$ is the term frequency of T in R. $opl(R)$ is the number of words in R. Usually, the value of K is 2 and b is 0.75.
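The scoring formula can be sketched in Python as follows, with K = 2 and b = 0.75. The sketch assumes that operations and matched concepts are already normalized into lists of words, and that Q is the list of query terms. The sample data and all function names are hypothetical.

```python
import math

K, B = 2.0, 0.75


def idf(total, containing):
    """W_i: idf-style importance of a term within one collection."""
    return math.log((total - containing + 0.5) / (containing + 0.5))


def term_weight(term, item, collection):
    """W(T, R): length-normalized term frequency of `term` in `item`."""
    tf = item.count(term)
    average_length = sum(len(entry) for entry in collection) / len(collection)
    k_item = K * ((1 - B) + B * len(item) / average_length)
    return (K + 1) * tf / (k_item + tf)


def score(query_terms, operation, concept, operations, concepts):
    total = 0.0
    for term in query_terms:
        n1 = sum(term in entry for entry in operations)
        n2 = sum(term in entry for entry in concepts)
        importance = (idf(len(operations), n1) + idf(len(concepts), n2)) / 2
        total += (importance
                  * term_weight(term, operation, operations)
                  * term_weight(term, concept, concepts))
    return total


# Hypothetical normalized operations (R*) and matched concepts (S*).
operations = [["weather", "city"], ["convert", "currency"],
              ["flight", "price"], ["hotel", "rate"]]
concepts = [["weather"], ["city", "metropolis"], ["currency"], ["flight"]]
query = ["weather", "city"]

s_match = score(query, operations[0], concepts[1], operations, concepts)
s_other = score(query, operations[1], concepts[1], operations, concepts)
print(s_match, s_other)
```

Note that the idf-style weight becomes negative for a term that occurs in more than half of a collection, so very common terms lower the score in this formulation.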

If score(R, S) is above a threshold value Vt, we then consider the operation contributing to the score as the potential operation that can be invoked. It may also happen that multiple operations are found in the vector R* for which the threshold value is reached, or that no operation can be found that reaches the threshold value. In the first case, the operation with the highest score is considered for invocation. If there is a tie between two operation names having the same score, any one of them can be chosen. In the case where no operation is found, we continue with the ranking scheme discussed below.

The ranking scheme is based on the following rules for matching operation names along with parameter values in our system:

i) Matched keywords are assigned higher preference than unmatched keywords.

ii) A matched keyword with a higher frequency is assigned higher priority than a matched keyword with a lower frequency value: this rule is also applicable for unmatched keywords.

iii) Highest preference is given to operation names that encompass all matched keywords and all matched concepts (i.e., transformed unmatched keywords).

iv) The next priority is given to operation names that encompass all matched keywords and all but one matched concept; the matched concept with the lowest frequency is left out in this case. Subsequently, in lower levels of ranking, the matched concepts are discarded one by one based on their frequency values until only the matched keywords remain for consideration.

v) Similar to unmatched words, matched words are then left out at each level of ranking based on their frequency values.

vi) The last matched word that is considered for ranking is the one with the highest frequency value.
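Rules iii) to vi) describe a sequence of progressively smaller keyword sets. The following Python sketch generates that sequence. It assumes that matched keywords and matched concepts are given as dictionaries mapping each word to its frequency. The function name and the sample frequencies are hypothetical.

```python
def ranking_levels(matched, concepts):
    """Return the keyword sets to try, from most to least preferred."""
    m = sorted(matched, key=matched.get, reverse=True)  # highest frequency first
    c = sorted(concepts, key=concepts.get, reverse=True)
    levels = []
    # Rules iii and iv: drop the lowest-frequency matched concept at each level.
    for keep in range(len(c), -1, -1):
        levels.append(m + c[:keep])
    # Rules v and vi: then drop the lowest-frequency matched keyword at each
    # level, ending with the single highest-frequency matched keyword.
    for keep in range(len(m) - 1, 0, -1):
        levels.append(m[:keep])
    return levels


levels = ranking_levels({"weather": 5, "temperature": 2}, {"city": 3, "today": 1})
for level in levels:
    print(level)
```

An operation name would be tested against each level in order, and the first level that it fully covers determines its rank.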

5. Experimental Results

We conducted experiments on a Dell D620 with an Intel T2300 processor @ 1.66 GHz and 1 GB of RAM running Microsoft Windows XP.

Due to space limitations we present just a representative set of free-form queries we used for the experiments.

• Binghamton’s weather
• Temperature at Chicago
• Weather at Philadelphia
• How hot is it in Houston today?
• Will it be raining at Dallas tomorrow?
• Flight Details from Los Angeles to Las Vegas
• Today’s conversion rate from Euro to Dollar
• Convert USD to INR
• Cheapest hotels in Miami
• Hyatt hotel rate at Miami on 14th April
• Stopovers from NYC to Philadelphia for traveling by bus
• Cost of hotel booking for one-night stay at NYC
• Best price for flight from Pittsburgh to Buffalo on Sunday

Table 1. Performance of Domain Dependent Ontologies for both Operation and Parameter Matchmaking.

Domain      Precision   Recall
Travel      97.9%       93.8%
Currency    99.7%       94.6%
Weather     99.9%       98.2%
Location    98.9%       95.1%
Average     99.1%       95.4%

Table 2. Performance after Input Parameter File Extension for both Operation and Parameter Matchmaking.

Domain      Precision   Recall
Travel      98.2%       95.2%
Currency    99.8%       96.1%
Weather     100%        98.8%
Location    99.2%       96.9%
Average     99.3%       96.8%

We use precision and recall measurements to study the accuracy of our system. We take into consideration a sample of 50 queries spanning all the domains and evaluate the outputs. We evaluated our system with 50 WSDL documents. These documents were obtained by searching UDDI and other sources for each domain. For measuring the overall accuracy of our system, we define precision as the ratio of the number of relevant WSDL operations retrieved for a user query to the total number of WSDL operations returned by our system. We define recall as the ratio of the number of relevant WSDL operations retrieved to the total number of relevant WSDL operations for a user query present in the WSDL repository.
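These two definitions can be written as a short Python sketch for a single query. The operation names in the example are hypothetical and are not taken from the evaluated WSDL documents.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of the operations returned for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant operations that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical outcome for one weather query.
retrieved = ["getWeatherByCity", "getWeatherByZip", "getForecast"]
relevant = ["getWeatherByCity", "getWeatherByZip",
            "getTemperature", "getCityWeather"]
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)
```

In this example two of the three returned operations are relevant, and two of the four relevant operations are returned.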

In earlier work [24], we showed that identifying operation names across the different domains by using lexical syntax analysis and ontological matchmaking provides 98.6% success in precision and 93.9% in recall, on average. Table 1 shows the precision and recall values if we consider both Operation and Parameter Matchmaking processes. The results in Table 1 show that Parameter matchmaking improves the precision to 99.1% and recall to 95.4%. This is due to the fact that the parameter extraction technique assists in identifying the specific operations, which if invoked, will produce better results compared to just matching operation names using lexical syntax and ontological matchmaking. In Table 2, precision-recall results are better than in Table 1. This is because extending the input parameter file with input values and input types assists in more accurately determining operations to be invoked.

6. Related Work

Eberhart at al. describe WSDF [20], a representative mechanism and a runtime system architecture, which allows a client to invoke a service based solely on ontology without prior knowledge of the API. This work overcomes the drawback of the approaches presented in OWL-S and BPEL4WS. WSDF provides semantic annotations to Web services allowing ad-hoc invocation of a service. Patil et al. have developed MWSAF, a Web service annotation framework [14] that performs both element and structural level matching for Web services. The element level matching is bound on a combination of Porter Stemmer [8] algorithm for root word selection, WordNet dictionary for synonyms, abbreviation directory to handle acronyms, and NGram algorithm for linguistic similarity of the names of two concepts. Sycara et al. have developed one of the earliest ontology-based semantic matchmaking engines, MatchMaker [15], which uses capability-based semantic match and various IR-based filters. Another related effort is Racer [16], which focuses solely on service capability-based semantic matches for application in e-commerce systems. Syeda-Mahmood et al. [17] explore the use of domain-independent and domain-specific ontologies for finding matching service descriptions. Domain-independent relationships are derived using an English thesaurus after tokenization and part-of-speech tagging, while domain-specific ontological similarities are derived by inferring semantic annotations associated with Web service descriptions. A combination of the matches due to the two cues is done to determine an overall semantic similarity score. Our work extends the work by Syeda-Mahmood et al. [17], by dynamically learning from previous match making results, extending the ontological vocabulary, and applying the knowledge to subsequent queries. Agarwal et al. propose a solution Synthy [18] for the composition of Web services using domain-dependent ontologies. 
The system provides semantic reasoning and planning, but it does not include domain-independent cues such as a thesaurus or text analysis techniques such as stop word filtering. Syeda-Mahmood et al. describe Minelink [19], which uses a bipartite graph for modeling Web service compositions and solves a maximum matching problem using domain-independent cues and text analysis techniques. However, their system does not take into account the cases where contextual information and the semantic meaning of the user input can be useful in determining the Web services to be combined.

7. Conclusions

We presented a system that matches user queries with operations and parameter values in Web services. The focus of our design is to provide “ease-of-use” to the end-user of Web services. To achieve accurate matchmaking results, we employ lexical analysis, domain-independent matching techniques, and domain-specific ontologies. We presented a detailed ranking system that can be effectively used to determine the best match for a given user query. The performance results for matching operation names and parameters for domain-dependent ontologies indicate that our system obtains very high precision and recall values. The precision and recall values also remain high for both operation and parameter matchmaking after the input parameter file is extended.
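For reference, the precision and recall values discussed above follow the standard information retrieval definitions: precision is the fraction of returned matches that are correct, and recall is the fraction of correct matches that are returned. The following is a minimal sketch of that computation, not the evaluation code used in this work; the function name and the operation names are hypothetical and chosen only for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for retrieved items against the relevant set.

    Both arguments are iterables of hashable items, e.g. operation names.
    An empty retrieved or relevant set yields 0.0 for the affected measure.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # correctly matched items
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: operations returned by a matchmaker for one query,
# and the operations a human judge considers correct for that query.
matched = ["getWeatherByZip", "getWeatherByCity", "getStockQuote"]
correct = ["getWeatherByZip", "getWeatherByCity", "getForecast", "getTemperature"]

p, r = precision_recall(matched, correct)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.67 recall=0.50"
```

In this example two of the three returned operations are correct (precision 2/3), while two of the four correct operations are found (recall 1/2).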

8. Future Work

An extension of this work is to examine user queries spanning across multiple domains. We plan to implement a technique for automatically generating input parameter files for each ontological domain. We also plan to explore the efficiency of our matchmaking methodologies for composition of Web services. In future work, we also plan to increase the number of functional WSDL documents in our test base.

9. References

[1] M. J. Duftler et al., “Web Services Invocation Framework (WSIF)”, in OOPSLA Workshop on Object Oriented Web Services, October 2001.
[2] R. van Engelen, “A Framework for service-oriented computing with C and C++ Web service components”, in ACM Transactions on Internet Technologies, 2007.

[3] Axis: “WebServices – Axis” Web Page. Available: http://ws.apache.org/axis/.
[4] “Semantic Web” Web Page. Available: http://www.w3.org/2001/sw.
[5] “OWL Web Ontology Language Overview” Web Page. Available: http://www.w3.org/TR/owl-features/.
[6] G. A. Miller, “WordNet: A Lexical Database for the English Language”, in Comm. ACM, 1983.
[7] Hai He et al., “An Automated Integrator of Web Search Interfaces for E-commerce”, in VLDB Journal, Vol. 13, No. 3, pp. 256-273, September 2004.
[8] M. F. Porter, “An algorithm for suffix stripping”, in Program, 14(3), pp. 130-137, 1980.
[9] “JWNL 1.3” Web Page. Available: http://jwordnet.sourceforge.net/.
[10] “Resource Description Framework (RDF)” Web Page. Available: http://www.w3.org/RDF/.
[11] “RDF Vocabulary Description Language 1.0: RDF Schema” Web Page. Available: http://www.w3.org/TR/rdf-schema/.
[12] “Jena - A Semantic Web Framework for Java” Web Page. Available: http://jena.sourceforge.net.
[13] David Heckerman, Eric Horvitz, “Inferring Informational Goals from Free-Text Queries: A Bayesian Approach”, in Proceedings of the Conference on Uncertainty in Artificial Intelligence, July 1998.
[14] A. Patil et al., “METEOR-S Web Service Annotation Framework”, in Proc. WWW Conference, pp. 553-562, 2004.
[15] K. Sycara et al., “Dynamic service match making among agents in open information environments”, in Journal of the ACM SIGMOD Record, 1999.
[16] L. Li, I. Horrocks, “A Software Framework For Matchmaking Based on Semantic Web Technology”, in Proc. WWW Conference, 2003.
[17] Syeda-Mahmood et al., “Searching Service Repositories by Combining Semantic and Ontological Matching”, in Proc. of the IEEE International Conference on Web Services, 2005.
[18] V. Agarwal et al., “Synthy: A System for End to End Composition of Web Services”, in Journal of Web Semantics, Vol. 3, Issue 4, 2005.
[19] Syeda-Mahmood et al., “Minelink: Automatic Composition of Web Services through Schema Matching”, Poster paper, WWW Conference, 2004.
[20] A. Eberhart, “Ad-hoc Invocation of Semantic Web Services”, in Proc. of the IEEE International Conference on Web Services, 2004.
[21] S. Robertson, S. Walker, M. Beaulieu, “Okapi at TREC-7: Automatic Ad Hoc, Filtering, VLC, and Interactive Track”, in 7th Text REtrieval Conference, pp. 253-264, 1999.
[22] Reza T. Hemayati, Weiyi Meng, Clement Yu, “Semantic-based Grouping of Search Engine Results Using WordNet”, in Joint Conference of the 9th Asia-Pacific Web Conference and the 8th International Conference on Web-Age Information Management (APWeb/WAIM'07), pp. 678-686, HuangShan, China, June 2007.
[23] Chaitali Gupta, Rajdeep Bhowmik, Michael Head, Madhusudhan Govindaraju, Weiyi Meng, “A Query-based System for Automatic Invocation of Web Services”, in Proceedings of the 2007 IEEE International Conference on Web Services (ICWS 2007), Application Services and Industry Track, Salt Lake City, Utah, July 2007.
[24] Chaitali Gupta, Rajdeep Bhowmik, Michael Head, Madhusudhan Govindaraju, Weiyi Meng, “Improving Performance of Web Services Query Matchmaking with Automated Knowledge Acquisition”, in Proceedings of the 2007 IEEE/WIC/ACM International Conference on Web Intelligence (WI '07), Silicon Valley, California, November 2007.
