Spring 2005 Specification and Analysis of Information Systems

Towards Service Retrieval on the Web
Eran Toch*, Iris Reinhartz-Berger, Dov Dori, and Avigdor Gal

IBM Research Seminar, Haifa, Feb 2007

The Technion
Haifa University

* Supported by the Levi Eshkol grant from Israel’s Ministry of Science



2

Agenda

1. Background

2. Approximated Service Retrieval

3. Results

3

http://www.healthmap.org/

4

5

What is a Service?

6

Our Definition

Service
• Interface: inputs, outputs (WSDL, Web pages)
• Other services: composed with
• Quality: price, speed, freshness

7

Service Compositions

=

8

Current Approaches

• Text-based
• Semantics-based

9

The Semantics-based Approach

• Based on Web ontologies
  – [Berners-Lee 2001]

• Service modeling languages (OWL-S)

• Logic-based inference to compose services
  – [Paolucci 2002]
  – [Patil 2004]

Google Maps

preconditions input output

10

Drawbacks

• Approximation
  – Logic inference yields low levels of approximation, so recall is low

• Tagging
  – Tagging is difficult
  – Agreement on mutual concepts is difficult

• Query Language
  – Requires formal concepts

• Performance
  – Inference takes time

11

Current Approaches

[Figure: text-based and semantics-based approaches placed on two axes, precise vs. approximated and human-oriented vs. machine-oriented, with the markers “We are here” and “We want to be here”]

12

Agenda

1. Background

2. Approximated Service Retrieval

3. Results

1. OPOSSUM
2. Approximation
3. Indexing

[Table: example index entries with columns Operation, Role, Concept, Certainty. Operations shown: Inform Hospital, Find Nearest Medical Center. Concepts shown: Hospital, Medical Center, Diagnosed Symptom, Mount Sinai, GPS Position, City. Roles are in or out; certainty values are 1, 0.5, and 0.4.]

13

Usability in Service Retrieval

• Simple query language
  – No formal concepts

• Approximation
  – Flexible results

• Compositions
  – Retrieval of compositions, components

• Ranking
  – Best matches on top

• Fast
  – Sub-linear processing time

14

OPOSSUM (Object-PrOcess-SemanticS Unified Matching)

[Toch 2006]

15

Architecture

[Figure: OPOSSUM architecture. Components: Crawler, Concepts Index, Composition Index, Query Evaluator. Inputs: service descriptions (WSDL, OWL-S, Web sources) and domain knowledge (ontologies). Actor: user.]

16

Queries

“drug”
“treatment drug”
“medical treatment map or address”
“drug chickenpox price:0-5$”
“input:treatment output:drug”
“treatment provider:mount sinai hospital”
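The example queries combine free keywords, an `or` between alternatives, and `field:value` filters (`input:`, `output:`, `price:`, `provider:`). The following is a minimal Python sketch of how such strings might be split. It is not OPOSSUM's parser: the function name `parse_query` is hypothetical, and two rules are assumptions made for illustration, namely that `or` joins only the two keywords next to it and that a field value runs until the next `field:` token. Recognizing multi-word concepts such as "medical treatment" would need the ontology and is not attempted here.

```python
import re

# A token of the form "name:value" starts a field filter (assumed syntax).
FIELD = re.compile(r"^(\w+):(.*)$")

def parse_query(text):
    """Split a query into keyword groups and field filters.

    Returns (keywords, fields): keywords is a list of sets, where each set
    holds alternatives joined by 'or'; fields maps a field name to its value.
    """
    keywords = []
    fields = {}
    current_field = None
    join_next = False
    for token in text.lower().split():
        match = FIELD.match(token)
        if match:
            current_field = match.group(1)
            fields[current_field] = match.group(2)
            join_next = False
        elif current_field is not None:
            # Plain tokens after a field extend its value ("mount sinai hospital").
            fields[current_field] += " " + token
        elif token == "or":
            join_next = bool(keywords)
        elif join_next:
            keywords[-1].add(token)
            join_next = False
        else:
            keywords.append({token})
    return keywords, fields
```

For instance, `parse_query("treatment provider:mount sinai hospital")` yields one keyword group and a single `provider` filter whose value keeps its spaces.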

17

Services Queries

Service Network: an ontology-labeled, directed graph

Query: an ontology-labeled, rooted, directed tree

[Figure: the query “medical treatment map or address”, drawn as a tree over the concepts medical treatment, Address, and Map, next to a service network containing Insurance Treatment Locator, Government Treatment Locator, Yellow pages, Hospital Info Service, Google Maps, and Yahoo Maps]

18

Results

• V – a virtual service
  – a fully ordered sequence of operations

Ranked results (certainty):
Insurance Treatment Locator, Yellow pages: 0.9
Insurance Treatment Locator, Yellow pages, Google Maps: 0.8
Insurance Treatment Locator, Yellow pages, Hospital Info Service, Yahoo Maps: 0.4

19

20

Service Network

21

Service Graph Construction

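[Figure: service graph over Insurance Treatment Locator, Government Treatment Locator, Yellow pages, Hospital Info Service, Google Maps, and Yahoo Maps. Edge labels include the concepts Hospital, Location, and medical treatment, the property Price = free, and the certainty values c = 0.7 and c = 1.]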
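One way to picture the construction is to connect two operations whenever a concept produced by one is similar enough to a concept required by the other, and to label the edge with that concept and a certainty. The Python sketch below follows this reading. The input and output concepts assigned to each operation, the `SIMILARITY` table, and the names `Operation` and `build_service_network` are all assumptions for illustration; the slides do not specify them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Operation:
    name: str
    inputs: tuple   # concepts the operation consumes
    outputs: tuple  # concepts the operation produces

# Hypothetical similarity table: (produced concept, required concept) -> certainty.
SIMILARITY = {
    ("Hospital", "Hospital"): 1.0,
    ("Location", "Location"): 1.0,
    ("Hospital", "Location"): 0.7,
}

def build_service_network(operations, similarity):
    """Return directed edges as (source, target, concept, certainty) tuples."""
    edges = []
    for src in operations:
        for dst in operations:
            if src is dst:
                continue
            for produced in src.outputs:
                for required in dst.inputs:
                    certainty = similarity.get((produced, required), 0.0)
                    if certainty > 0:
                        edges.append((src.name, dst.name, required, certainty))
    return edges

# Illustrative operations; their parameters are made up for this example.
OPERATIONS = [
    Operation("Insurance Treatment Locator", ("medical treatment",), ("Hospital",)),
    Operation("Yellow pages", ("Hospital",), ("Location",)),
    Operation("Google Maps", ("Location",), ("Map",)),
]
NETWORK = build_service_network(OPERATIONS, SIMILARITY)
```

With this data, an exact concept match gives an edge with c = 1, while the Hospital to Location link gets the lower certainty from the similarity table.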

22

Probability-based Ontologies

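[Figure: ontology fragment with the concepts Hospital, Carmel, Rothschild, Business, and Address, and the properties Name, Departments, and Treatments. Relations are labeled “is a” and “near by”, with certainty values c = 0.7, c = 0.8, and c = 0.9. Callouts: Similarity Relations, Properties, Certainty values.]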

23

μ-satisfiability

• V satisfies the requirements of Q with a certainty in [0, 1]

• μ-satisfiability quantifies approximation using a single numerical value

Easy to calculate, but not always accurate

Concept level

Operation level

Composition level

24

Concept Level Approximation

Each query concept, q_i, is matched against the concepts index.

[Figure: do the concepts Business and Hospital match?]

25

Concept Approximation

[Figure: the concepts Hospital and Business, shown with the operations Get Hospital Address and Yellow Pages (Get business address)]

26

Semantic Distance

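[Figure: Concept a and Concept b connected through Concept x, within a similarity radius]

Distance function (distance coefficient): dis(a, b), computed from $\log_2(path)$

Semantic edge number / certainty:

$path = \sum_{i \in a \to x} \frac{1}{c(i)} + \sum_{j \in b \to x} \frac{1}{c(j)}$

c(i), c(j) – certainty value on edges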
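The path term, "semantic edge number / certainty", can be read as a sum of 1/c over the edges that connect each concept to a shared concept x. The Python sketch below computes that sum on a small tree-shaped ontology. Several things here are assumptions rather than the authors' method: the edge certainties in `PARENT` are illustrative values, the two sums are added together, x is chosen as the shared ancestor with the smallest total, and the ontology is assumed to have no cycles. The final step from path to dis(a, b) is not implemented.

```python
import math

# Hypothetical ontology fragment: child -> (parent, certainty of the edge).
PARENT = {
    "Carmel": ("Hospital", 0.9),
    "Rothschild": ("Hospital", 0.9),
    "Hospital": ("Business", 0.8),
}

def costs_to_ancestors(concept, parent):
    """Map the concept and each of its ancestors to the summed 1/c on the way up."""
    costs = {concept: 0.0}
    total = 0.0
    while concept in parent:
        concept, certainty = parent[concept]
        total += 1.0 / certainty
        costs[concept] = total
    return costs

def semantic_path(a, b, parent):
    """Smallest sum of 1/c over the edges a->x and b->x for any shared concept x."""
    costs_a = costs_to_ancestors(a, parent)
    costs_b = costs_to_ancestors(b, parent)
    shared = costs_a.keys() & costs_b.keys()
    if not shared:
        return math.inf  # the concepts are not connected in this fragment
    return min(costs_a[x] + costs_b[x] for x in shared)
```

Low-certainty edges contribute more than 1 to the path, so two concepts joined by uncertain relations end up further apart than two joined by the same number of certain ones.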

27

Operation Level Matching

A query, Q, is matched against the operation.

[Figure: the query concepts (medical treatment, Address, Map) set against the operation Get Hospital List; other labels: City, Treatment, Address, Name]

28

Operation Quality

1. Minimal satisfiability

2. Maximal satisfiability

3. A priori Ranking

29

Composition Level Matching

A query is matched against a composition of operations

[Figure: the query concepts (medical treatment, Address, Map) set against a composition of operations]

30

Composition Certainty

Shorter compositions have higher certainty

[Figure: compositions with edge certainty values c = 1, c = 0.8, c = 1, c = 1, and c = 0.2]
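The slide gives certainty values per edge but not the rule for combining them. A simple rule that is consistent with "shorter compositions have higher certainty" is to multiply the edge certainties, since adding an edge with c ≤ 1 can never raise the product. The Python sketch below uses that rule purely as an assumption; the function name `composition_certainty` is hypothetical.

```python
from math import prod

def composition_certainty(edge_certainties):
    """Combine edge certainties by multiplication (an assumed rule, not the slide's)."""
    return prod(edge_certainties)
```

Under this rule, a chain with edges 1, 0.8, 1 scores 0.8, and a chain containing a 0.2 edge scores at most 0.2.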

31

Approximation By Composition

Map Zip Yellow Pages

Map by Zip

Map by Address

Map by Address ≈ Yellow Pages + Map by Zip

32

Partial Matching

[Figure: operations Address to Zip Converter, Map by Zip, Hospital Locator, and Hospital Address Finder; label: Partial Match]

33

Excessive Matching

[Figure: operations Get side effect, Get treatment info, Get drug price, and Order drug; label: Excessive Matching]

34

Scrambled Matching

[Figure: operations Get treatment info, Get drug info, Get side effects, and Order drug; label: Reordered Match]

35

Quantifying Structural Approximation

• An approximated service is defined as:

• We define the structural satisfiability as:

“Full” Virtual Service

Approximated Service

Graph edit distance

Satisfiability of query components
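To give a feel for how an edit distance separates partial, excessive, and scrambled matches, the Python sketch below compares two ordered operation sequences with the Levenshtein distance. This is a simplified stand-in: the slide names graph edit distance, and since a virtual service is a fully ordered sequence of operations, a sequence distance is used here instead. The function name `edit_distance` is hypothetical, and the sketch does not compute the structural satisfiability itself.

```python
def edit_distance(full, approx):
    """Levenshtein distance between two sequences of operation names."""
    prev = list(range(len(approx) + 1))
    for i, op_full in enumerate(full, start=1):
        curr = [i]
        for j, op_approx in enumerate(approx, start=1):
            cost = 0 if op_full == op_approx else 1
            curr.append(min(prev[j] + 1,          # drop an operation from full
                            curr[j - 1] + 1,      # add an operation from approx
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]

FULL = ["Get treatment info", "Get drug info", "Get side effects", "Order drug"]
PARTIAL = FULL[:-1]                                  # one operation missing
EXCESSIVE = FULL + ["Get drug price"]                # one operation too many
SCRAMBLED = [FULL[0], FULL[2], FULL[1], FULL[3]]     # two operations swapped
```

A missing or extra operation costs one edit, while swapping two neighbouring operations costs two, so a reordered match is penalized more than a match that is off by a single operation.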

36

Ranking

Q = “medical treatment map or address”

Ranked results (certainty):
Insurance Treatment Locator, Yellow pages: 0.9
Insurance Treatment Locator, Yellow pages, Google Maps: 0.8
Insurance Treatment Locator, Yellow pages, Hospital Info Service, Yahoo Maps: 0.4

37

Indexing Architecture

Concepts Index

Composition Index

Query

Results: Virtual Service

38

Concept Index

[Table: index entries with columns Concept, Certainty, Role, Operation. Operations shown: Insurance Treatment Locator, Google Maps. Concepts shown: Treatment, Drug, Symptom, Hospital, Address, Map, Zip code. Roles are in or out; certainty values are 1, 0.5, and 0.4.]

The size of the index is determined by the semantic radius
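A concept index of this shape can be sketched as a map from a concept to the (operation, role, certainty) entries that mention it. In the Python sketch below, each operation parameter is indexed under its own concept with certainty 1 and under every similar concept within the semantic radius with a lower certainty, which is why a larger radius produces a larger index. The parameter list, the `NEIGHBOURS` table, and the function names are assumptions for illustration; the pairings of concepts and certainties do not reproduce the slide's table.

```python
from collections import defaultdict

def build_concept_index(parameters, neighbours):
    """Index operation parameters by concept, including similar concepts.

    parameters: iterable of (operation, role, concept) triples.
    neighbours: concept -> list of (similar concept, certainty) pairs that
                fall inside the semantic radius.
    """
    index = defaultdict(list)
    for operation, role, concept in parameters:
        index[concept].append((operation, role, 1.0))
        for similar_concept, certainty in neighbours.get(concept, []):
            index[similar_concept].append((operation, role, certainty))
    return dict(index)

def lookup(index, query_concept):
    """Return the entries for a query concept, highest certainty first."""
    return sorted(index.get(query_concept, []), key=lambda entry: entry[2], reverse=True)

# Illustrative data only.
PARAMETERS = [
    ("Google Maps", "in", "Address"),
    ("Google Maps", "out", "Map"),
    ("Insurance Treatment Locator", "in", "Treatment"),
    ("Insurance Treatment Locator", "out", "Hospital"),
]
NEIGHBOURS = {
    "Address": [("Zip code", 0.4)],
    "Treatment": [("Drug", 0.5), ("Symptom", 0.5)],
}
INDEX = build_concept_index(PARAMETERS, NEIGHBOURS)
```

Matching a query concept is then a single dictionary lookup, for example `lookup(INDEX, "Zip code")`, rather than an ontology traversal at query time.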

39

Composition Complexity

• We know that
  – The Service Network is a graph
  – Q is a graph
  – Evaluating a query Q on the Service Network is subgraph isomorphism

• Which is NP-Complete
  – In the general case

• Semantic Peer-to-peer Routing
  – [Schlosser 2002]
  – [Ben-Asher 2006]

40

Hierarchical Concept Clustering

[Figure: clustered operations: Hospital availability, Hospital Info Service, Hospital Locator, Yahoo Maps, Google Maps, Zip Finder, Update medical records, Get Physician Address, Get patient address]

41

Multi-dimensional clustering

• Hypercubes

• Shortest path between the two most distant nodes = log_b N
  – N – number of nodes
  – b – hypercube base

[Figure: hypercube with numbered nodes; labels: Yahoo Maps, Google Maps, Zip Finder, Geography, Yellow Pages, Business]
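The log_b N bound can be checked on a small example. The Python sketch below assumes a complete base-b hypercube in which every node is identified by log_b N digits and one hop changes exactly one digit, so the distance between two nodes is the number of digits in which they differ. The function names are hypothetical and the sketch covers only this distance property, not the routing scheme of [Schlosser 2002].

```python
def to_digits(node, base, dims):
    """Write a node number as `dims` digits in the given base (least significant first)."""
    digits = []
    for _ in range(dims):
        node, digit = divmod(node, base)
        digits.append(digit)
    return digits

def hypercube_distance(u, v, base, dims):
    """Number of hops between two nodes: the count of differing digit positions."""
    return sum(a != b for a, b in zip(to_digits(u, base, dims), to_digits(v, base, dims)))

def diameter(base, dims):
    """Largest distance from node 0 to any node; equals dims = log_b(N) for N = base**dims."""
    return max(hypercube_distance(0, v, base, dims) for v in range(base ** dims))
```

For b = 2 and N = 8 the diameter is 3, and for b = 3 and N = 9 it is 2, matching log_b N in both cases.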

42

Query Evaluation Complexity

• D - query disjunctions
• OP - retrieved operations
• - number of results
• N - number of operations in the service network
• b - hypercube base

43

Agenda

1. Background

2. Approximated Service Retrieval

3. Results

44

Data Set

• 828 services
• 5432 concepts

in 23 domains

45

Performance

Query                                  OWLS-MX   OPOSSUM   Ratio
hospital investigating                 1710      33        52
book price                             1647      35        48
country skilled occupation             1742      20        88
car price service                      1682      15        113
geopolitical entity weather process    1364      27        51
government degree scholarship          1782      32        56
novel author                           1662      40        42

46

Scalability

[Figure: processing time (in ms, axis 0 to 60) against number of services (axis 0 to 3000), with a trendline]

Service Generator
• P – number of parameters
• nc – create new concept
• c – concept mapping

Simulated service

47

Indexing

[Figure: number of index entries (axis 0 to 20000) against number of service entries (axis 0 to 600)]

48

Domains and Approximation

[Figure: approximation factor (axis 0 to 2000) against domain size & connectivity (axis 0 to 2500)]

49

http://www.technion.ac.il/erant

Thank You.

50

References

[Schlosser 2002] Mario T. Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl: HyperCuP - Hypercubes, Ontologies, and Efficient Search on Peer-to-Peer Networks. AP2PC 2002: 112-124.

[Ben-Asher 2006] Yosi Ben-Asher, Shlomo Berkovsky: Semantic Data Management in Peer-to-Peer E-Commerce Applications. J. Data Semantics VI: 115-142 (2006).

[Berners-Lee 2001] T. Berners-Lee, J. Hendler, O. Lassila: The Semantic Web. Scientific American, 284(5), 2001, pp. 34-43.

[Patil 2004] A. Patil, S. Oundhakar, A. Sheth, K. Verma: METEOR-S Web Service Annotation Framework. In Proceedings of WWW 2004, pages 553-562, New York, NY, May 2004.

[Paolucci 2002] Massimo Paolucci, Takahiro Kawamura, Terry R. Payne, Katia P. Sycara: Semantic Matching of Web Services Capabilities. In International Semantic Web Conference, pages 333-347, 2002.

[Toch 2006] Eran Toch, Iris Reinhartz-Berger, Avigdor Gal, Dov Dori: OPOSSUM: Bridging the Gap between Web Services and the Semantic Web. Proceedings of NGITS 2006, pp. 357-358.