Managing Metadata in Service Architectures Mehmet S. Aktas Advisor: Prof. Geoffrey C. Fox

Post on 24-Feb-2016

Page 1: Managing Metadata in Service Architectures

Managing Metadata in Service Architectures

Mehmet S. Aktas

Advisor: Prof. Geoffrey C. Fox

Page 2: Managing Metadata in Service Architectures

Outline
- Introduction
- Motivation
- Requirements
- Research Issues
- Architecture
- Performance Evaluation
- Conclusions
- Contribution

Page 3: Managing Metadata in Service Architectures

Context as Service Metadata
- Context
  - interaction-independent: slowly varying, quasi-static service metadata
  - interaction-dependent: dynamically generated metadata resulting from the interaction of services
  - information associated with a single service, a session (service activity), or both
- Dynamic Grid/Web Service Collections
  - loosely assembled collections of services
  - assembled to support a specific task
  - generate metadata and have a limited lifetime

Page 4: Managing Metadata in Service Architectures

Motivating Cases
- Multimedia Collaboration Grids
  - Global Multimedia Collaboration System (Global MMCS)
  - widely distributed services
  - session service metadata, session metadata, stream-specific metadata
  - mostly read-only
- Workflow-style applications in Geographical Information System/Sensor Grids
  - Pattern Informatics (PI), UC Davis
  - Interdependent Energy Infrastructure Simulation System (IEISS), LANL
  - widely distributed services
  - conversation metadata
  - transient, multiple writers

Page 5: Managing Metadata in Service Architectures

Problems with Grid Information Services
- Standardization and Unification Issues
  - Customized Grid Information Services
  - Differences in application requirements
  - Thick clients
- Performance and Centralization Issues
  - Low performance
  - Low fault tolerance
- Dynamic Metadata Management Issues
  - Point-to-point service communication approaches

Page 6: Managing Metadata in Service Architectures

Requirements for Grid Information Services
- Greater Interoperability
  - Unified platform for communication
  - Shared communication protocol
  - Thin clients
- Greater Capabilities
  - High performance
  - Fault tolerance
- Dynamic Grid/Web Service Collections
  - Distributed state management
  - Collaboration session management

Page 7: Managing Metadata in Service Architectures

Research Issues I
- Unification of Grid Information Services
  - How to combine different information services?
    - Federation of Grid Information Services
  - What is a common data model and communication protocol?
- Flexibility and extensibility
  - Accommodating a broad range of application domains (read-dominated, read/write-dominated)
  - Ability to add/support more information services
- Interoperability
  - Being compatible with a wide range of applications

Page 8: Managing Metadata in Service Architectures

Research Issues II
- Performance
  - Efficient centralized metadata management strategies
    - high performance and persistency
  - Efficient decentralized metadata management strategies
    - Efficient request distribution strategies
    - Adaptation to instantaneous changes in client demand
- Fault tolerance
  - Efficient replica-content creation strategies
- Consistency
  - How to provide consistency across copies of the same data?

Page 9: Managing Metadata in Service Architectures

Hybrid Grid Information Service

[Figure: architecture of the Hybrid Grid Information Service, drawn twice (a generic and a concrete version). Labeled components: Client; Uniform Access Interface (in the concrete version: Extended UDDI API, WS-Context API, Unified Schema API, ...); Request processor, Access Control, Notification; Tuple Space API; Tuple Pool (JavaSpaces), providing in-memory storage; Information Resource Manager; Information Service I, II, ... (in the concrete version: Extended UDDI, WS-Context, ...).]

- Unification
- Uniform Access
- Extensibility
- Interoperability: Extended UDDI, WS-Context
- Federation: Unified Schema, Query/Publish XML API

Page 10: Managing Metadata in Service Architectures

[Figure: detailed architecture of a Hybrid Grid Information Service node. Labeled components: Client; Uniform Access Interface (Extended UDDI API, WS-Context API, Unified Schema API, ...); Request processor, Access Control, Notification; Tuple Space Access API (JavaSpaces); Tuple Pool holding UDDI, WS-Context, and Unified Schema instances (in-memory storage); Tuple processor with Lifetime Management, Persistency Management, Dynamic Caching Management, and Fault Tolerance Management; Mapping Files (XML), Mapping Rule Files (XSLT), and Filter; Information Resource Manager with Resource Handlers reaching databases (DB1, DB2, ...) over JDBC; Pub-Sub Network Manager (Publisher, Subscriber). Caption: Hybrid GIS network connected with pub-sub system.]

Page 11: Managing Metadata in Service Architectures

Distributed HYBRID Grid Information Services

[Figure: clients reach the HYBRID Grid Information Service over HTTP(S) through WSDL interfaces. Replica Server-1, Replica Server-2, ..., Replica Server-N each run a HYBRID Service with Extended UDDI and WS-Context databases, and act as publishers and subscribers on a topic-based publish-subscribe messaging system.]

- Decentralized
- Fault-tolerant
- Efficient distribution
- Look-ahead caching
- Consistency enforced

Page 12: Managing Metadata in Service Architectures

Support for interaction-independent metadata: Extended UDDI Service

[Figure: Hybrid Grid Information Service architecture diagram, repeated from page 9.]

- It supports different types of metadata
  - Geographical Information System Metadata Catalog (functional metadata)
  - User-defined metadata ((name, value) pairs)
- It enables advanced query capabilities
  - Geo-spatial queries
  - Metadata-oriented queries
  - Domain-independent queries
- It provides additional capabilities
  - Up-to-date service registry information (leasing)
  - Dynamic aggregation of capabilities of services
    - Ex: geospatial capabilities
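The leasing idea mentioned above (registry entries stay valid only while their publisher keeps renewing them) can be illustrated with a small sketch. This is not the Extended UDDI API: the class `LeasedRegistry`, its method names, and the injectable `clock` parameter are hypothetical, chosen only to show how expiring leases keep a registry up to date.

```python
import time
from dataclasses import dataclass


@dataclass
class Lease:
    entry: dict          # the published service metadata
    expires_at: float    # clock value after which the entry is stale


class LeasedRegistry:
    """Toy registry: an entry disappears unless its lease is renewed in time."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the behaviour can be tested
        self._leases = {}

    def publish(self, key, entry, lease_seconds):
        self._leases[key] = Lease(entry, self._clock() + lease_seconds)

    def renew(self, key, lease_seconds):
        """Extend a live lease; return False if it is missing or has expired."""
        lease = self._leases.get(key)
        if lease is None or lease.expires_at <= self._clock():
            return False
        lease.expires_at = self._clock() + lease_seconds
        return True

    def find(self, key):
        """Return the entry, or None (and drop it) once the lease has expired."""
        lease = self._leases.get(key)
        if lease is None:
            return None
        if lease.expires_at <= self._clock():
            del self._leases[key]
            return None
        return lease.entry
```

A service that crashes simply stops renewing, so its entry ages out without any explicit unregister call.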

Page 13: Managing Metadata in Service Architectures

Support for interaction-dependent metadata: WS-Context Service

[Figure: Hybrid Grid Information Service architecture diagram, repeated from page 9.]

- Context Manager Service
  - Data model and communication protocol
  - Session-related metadata
- It supports Dynamic Web Service Collections
  - Support for distributed state-based systems
    - collaboration grids
    - workflow-style grids
- It provides various capabilities
  - Asynchronous communication capability
  - Up-to-date service registry information (leasing)

Page 14: Managing Metadata in Service Architectures

Support for federated service metadata: Unified Information Service

[Figure: Hybrid Grid Information Service architecture diagram, repeated from page 9.]

- Federating Grid Information Services
  - Unified data model and communication protocol
  - Extended UDDI, WS-Context and Glue Schemas
- Approach taken
  - Union of schemas vs. separate schemas
  - Reuse common concepts
    - Ex: business, session, site => category
  - Combine disjoint concepts
    - Ex: UDDI's tModel
- It enables hybrid query capabilities
  - "Give me the list of services satisfying C:{a,b,c,...} QoS requirements and participating in S:{x,y,z,...} sessions"
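The hybrid query quoted above combines interaction-independent metadata (QoS properties) with interaction-dependent metadata (session membership). The following is a minimal sketch of that idea over an in-memory list. `ServiceRecord` and `hybrid_query` are hypothetical names, not part of the Unified Schema API, and the sketch assumes "satisfying" and "participating" both mean "all of the listed items".

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    name: str
    qos: set = field(default_factory=set)        # interaction-independent metadata
    sessions: set = field(default_factory=set)   # interaction-dependent metadata


def hybrid_query(records, required_qos, session_ids):
    """Names of services that meet every QoS requirement in C and
    participate in every session in S (an assumed reading of the query)."""
    required_qos = set(required_qos)
    session_ids = set(session_ids)
    return [
        record.name
        for record in records
        if required_qos <= record.qos and session_ids <= record.sessions
    ]


# Usage: one service matches both constraints, the others fail one of them.
records = [
    ServiceRecord("video-mixer", {"a", "b", "c"}, {"x", "y"}),
    ServiceRecord("audio-mixer", {"a"}, {"x", "y"}),
    ServiceRecord("map-server", {"a", "b", "c"}, {"z"}),
]
matches = hybrid_query(records, {"a", "b"}, {"x"})
```

Keeping both kinds of metadata on one record is what lets a single query span them; with separate schemas the client would have to issue two queries and intersect the results itself.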

Page 15: Managing Metadata in Service Architectures

Federating Grid Information Services

[Figure: a Collaboration Grid and a Sensor Grid federated through a topic-based publish-subscribe messaging system. Two HYBRID Services (WSDL interfaces), one backed by a WS-Context database and one by an Ext-UDDI database, act as publisher and subscriber.]

Page 16: Managing Metadata in Service Architectures

Features of the Distributed System
- Cache Strategy
  - In-memory storage
- Access Distribution
  - Redirecting a client request to an appropriate replica server
- Look-ahead caching
  - Moving/replicating metadata to where it is wanted
- Replica Content Placement
  - Replicating data on an appropriate replica server
- Consistency enforcement
  - Ensuring that all replicas of a data item are the same

Page 17: Managing Metadata in Service Architectures

Tuple Spaces & Publish-Subscribe Paradigms
- Publish-Subscribe paradigm
  - Message-based asynchronous communication
  - Participants are decoupled both in space and in time
  - Open source NaradaBrokering software
    - topic-based publish/subscribe messaging system
- Tuple Spaces paradigm [Gelernter-99]
  - a data-centric asynchronous communication paradigm
  - communication units are tuples (data structures)
  - JavaSpaces [Sun Microsystems]: object-oriented implementation specification
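To make the tuple space paradigm concrete, here is a minimal in-process sketch of its three classic operations: write a tuple, read a matching tuple, and take (read and remove) a matching tuple. It is written in Python for brevity and is not the JavaSpaces API; the class `TupleSpace` and the convention that `None` acts as a wildcard in a template are assumptions made for illustration.

```python
import threading

ANY = None  # wildcard: a template field set to ANY matches any value


class TupleSpace:
    """Toy tuple space with associative (template-based) lookup."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def write(self, tup):
        """Add a tuple and wake up any readers waiting for a match."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    @staticmethod
    def _matches(template, tup):
        return len(template) == len(tup) and all(
            field is ANY or field == value for field, value in zip(template, tup)
        )

    def _find(self, template):
        for tup in self._tuples:
            if self._matches(template, tup):
                return tup
        return None

    def read(self, template, timeout=None):
        """Return a matching tuple without removing it (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None, timeout)
            return self._find(template)

    def take(self, template, timeout=None):
        """Return a matching tuple and remove it (None on timeout)."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(template) is not None, timeout)
            tup = self._find(template)
            if tup is not None:
                self._tuples.remove(tup)
            return tup
```

Writers and readers never address each other directly: they only agree on the shape of the tuples, which is the decoupling in space and time that the slide refers to.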

Page 18: Managing Metadata in Service Architectures

Caching Strategy
- Light-weight implementation of JavaSpaces
  - Data sharing, associative lookup, and persistency
- Integrated caching capability for all types of service metadata
  - Ex: UDDI-type, WS-Context-type, Unified Schema-type metadata
  - We assume that today's servers are capable of holding such small-size metadata in cache.
  - All metadata accesses happen in memory
- Persistency
  - All metadata is periodically backed up to the appropriate Information Service back-end for persistency
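The caching strategy above (serve every access from memory, back up to the back-end periodically) resembles a write-behind cache. The sketch below illustrates it under stated assumptions: `WriteBehindCache` is a hypothetical name, the back-end is modelled as any dict-like store, and the caller passes the current time explicitly instead of running a background timer.

```python
class WriteBehindCache:
    """Toy in-memory store that flushes changed entries at a fixed interval."""

    def __init__(self, backend, backup_interval):
        self._memory = {}                 # all reads and writes hit this dict
        self._dirty = set()               # keys changed since the last backup
        self._backend = backend           # assumption: a dict-like persistent store
        self._interval = backup_interval  # seconds between backups
        self._last_backup = 0.0

    def publish(self, key, metadata):
        self._memory[key] = metadata
        self._dirty.add(key)

    def inquire(self, key):
        return self._memory.get(key)

    def backup_if_due(self, now):
        """Flush dirty entries if the interval has elapsed; return how many."""
        if now - self._last_backup < self._interval:
            return 0
        flushed = len(self._dirty)
        for key in self._dirty:
            self._backend[key] = self._memory[key]
        self._dirty.clear()
        self._last_backup = now
        return flushed
```

The trade-off is the usual one for write-behind designs: publish and inquiry avoid database latency, but updates made since the last backup are lost if the node fails before the next one.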

Page 19: Managing Metadata in Service Architectures

Persistency investigation

[Figure: Round Trip Chart for Publish/Inquire operations for varying backup-interval times. X axis: backup-time interval (msec), logarithmic scale, 10 to 100000. Y axis: time (msec). Series: Average for Publication, Average for Inquiry.]

Performance investigation

[Figure: three test setups, each with a single-threaded client issuing 1 user/200 transactions through a WSDL interface.]
- Test-1. Echo Service
- Test-2. Publish with memory access for WS-Context, extended UDDI and Unified Schema standard operations
- Test-3. Publish with database access for WS-Context, extended UDDI and Unified Schema standard operations

Simulation parameters
- Backup frequency: every 10 seconds
- Metadata size: 1.7 Kbytes
- Registry size: 5000 metadata
- Observation: 200

Page 20: Managing Metadata in Service Architectures

Performance investigation

[Figures: four round trip time charts. X axis: repeated test cases (1 to 5). Y axis: time (msec).]
- Round Trip Time Chart for WS-Context Service Metadata Publish Request. Series: Average WS-Context (database), Average WS-Context (memory), Average echo service.
- Round Trip Time Chart for Hybrid Service Metadata Publish Request (WS-Context metadata). Series: Average Hybrid (database), Average Hybrid (memory), Average echo service.
- Round Trip Time Chart for Ext UDDI Service Metadata Publish Request. Series: Average Ext UDDI (database), Average Ext UDDI (memory), Average echo service.
- Round Trip Time Chart for Hybrid Service Metadata Publish Request (UDDI metadata). Series: Average Hybrid (database), Average Hybrid (memory), Average echo service.

Page 21: Managing Metadata in Service Architectures

Message rate scalability investigation

[Figure: Hybrid Information Service, WS-Context inquiry/publish operations with increasing message rates (number of messages per second). X axis: message processing rate (messages per second), 100 to 1000. Y axis: average time (ms) per message, 0 to 70. Series: inquiry message rate, publication message rate.]

[Figure: test setup. 5 clients distributed to cluster nodes 1 to 5, each running 1 to 15 threads from a thread pool, call the HYBRID Service (Ext-UDDI, WS-Context) through its WSDL interface over HTTP(S).]

Page 22: Managing Metadata in Service Architectures

Message size scalability investigation

[Figures: Hybrid Information Service, WS-Context inquiry/publish operations with increasing message sizes.]
- Chart 1. X axis: context payload size (KB), 10 to 100. Y axis: time (milliseconds). Series: Average database access, Average memory access, Average Echo Service.
- Chart 2. X axis: context payload size (KB), logarithmic scale, 0.1 to 100. Y axis: time (milliseconds). Series: Average publish, Average inquiry.

[Figure: test setup. A single-threaded client issuing 1 user/200 transactions calls the HYBRID Service (Ext-UDDI, WS-Context) on the server through its WSDL interface.]

Simulation parameters
- Backup frequency: every 10 seconds
- Registry size: 5000 metadata
- Observation: 200

Page 23: Managing Metadata in Service Architectures

Access Distribution, Look-ahead Caching
- Broadcast-based request dissemination
  - Pub-sub system for message broadcast
  - Broadcast requests only to those servers that can answer
  - No need to keep track of metadata locations
- Dynamic migration/replication [Rabinovich et al, 1999]
  - Popular copies are moved/replicated to where they are wanted
  - Autonomous decisions, self-awareness
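The autonomous migration/replication decision can be pictured as a threshold rule that each server evaluates on its own request counts. The sketch below is a simplification written for illustration, not the algorithm of [Rabinovich et al, 1999] nor the system's actual code: the function name and the rule itself are assumptions, while the default thresholds (0.03 and 0.18 requests/second) and the 100-second decision period are the values listed in the simulation parameters of the dynamic replication experiment.

```python
def replication_decision(request_count, window_seconds, holds_copy,
                         deletion_threshold=0.03, replication_threshold=0.18):
    """Decide what a server should do with one metadata item.

    request_count  -- requests seen for the item during the last window
    window_seconds -- length of the decision window in seconds
    holds_copy     -- True if this server currently stores a replica
    """
    rate = request_count / window_seconds  # observed demand in requests/second
    if holds_copy and rate < deletion_threshold:
        return "delete"       # local demand too low to justify the replica
    if not holds_copy and rate > replication_threshold:
        return "replicate"    # demand is high: pull a copy closer to the clients
    return "keep"


# Usage: evaluate the rule once per decision period (here, 100 seconds).
decision = replication_decision(request_count=20, window_seconds=100, holds_copy=False)
```

Keeping the deletion threshold well below the replication threshold leaves a band in which nothing changes, which avoids copies being created and dropped repeatedly when demand hovers around a single cut-off.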

Page 24: Managing Metadata in Service Architectures

Access Distribution Experiment: Test Methodology

[Figure: two test setups. A HybridService instance in Bloomington, IN reaches a HybridService instance at a remote site (1: Indianapolis, IN; 2: Tallahassee, FL; 3: San Diego, CA) through one NB node in the first setup and two NB nodes in the second. Measured time = T1 + T2 + T3.]

Simulation parameters
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes

Page 25: Managing Metadata in Service Architectures

Distribution experiment result
- The overhead of access distribution is only a few milliseconds.
- Continuous access distribution operation does not degrade performance.
- The overhead of distribution remains the same regardless of the network distances between nodes.

[Figure: bar chart for Bloomington-Indianapolis, Bloomington-Tallahassee, and Bloomington-San Diego. Y axis: time (ms), 0 to 70. Series: overhead of distribution when using one intermediary broker, overhead of distribution when using two intermediary brokers, latency.]

[Figure: Bloomington - Indianapolis Access Distribution Chart. X axis: every 1000 observations (0 to 25). Y axis: time (ms), 0 to 8. Series: Average Two Brokers, Average One Broker, Average Latency.]

Page 26: Managing Metadata in Service Architectures

Dynamic Replication Performance: Test Methodology

[Figure: two test setups between HybridService instances in Bloomington, IN and Indianapolis, IN, connected through an NB node. Test-1: distribution with dynamic replication enabled. Test-2: distribution with dynamic replication disabled. Measured time = T1 + T2 + T3.]

Simulation parameters
- Message size / message rate: 2.7 Kbytes / 10 msg/sec
- Replication decision frequency: every 100 seconds
- Deletion / replication threshold: 0.03 requests/second and 0.18 requests/second
- Registry size: 1000 metadata in Indianapolis

Page 27: Managing Metadata in Service Architectures

The decrease in average latency shows that the algorithm manages to move replica copies to where they are wanted.

[Figures: two Dynamic Replication Performance Charts, distribution between Bloomington, IN and Indianapolis, IN. X axis: every 100 sec (0 to 25). Y axis: latency (ms), 0 to 7. Series in the first chart: Average and STDev for Dynamic Replication. Series in the second chart: Average and STDev for Distribution.]

Page 28: Managing Metadata in Service Architectures

Replica Content Placement, Consistency Enforcement
- Replica-content placement
  - Each node keeps information about other servers
  - Selection of replica server(s)
    - Selection policy based on a) geographical (proximity) and b) topical (number of topics) information
- Consistency enforcement: primary-copy approach
  - Update distribution: updates labeled with synchronized timestamps are reflected (unicast) to the primary copy
  - Update propagation: the primary copy pushes (broadcasts) updates only to those replica servers holding the context

[Figure: four replica servers, HybridService 1 to HybridService 4.]
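The primary-copy approach described above can be sketched in a few lines: every update carries a timestamp and is first applied at the primary copy, which then pushes it only to the replicas that hold the same context. The classes `Replica` and `PrimaryCopy` and their methods are hypothetical, direct method calls stand in for the unicast and broadcast messages, and "newest timestamp wins" is an assumed conflict rule.

```python
class Replica:
    """Holds context values tagged with the timestamp of their last update."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # context key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Accept the update only if it is newer than what is stored."""
        current = self.store.get(key)
        if current is None or timestamp > current[0]:
            self.store[key] = (timestamp, value)
            return True
        return False


class PrimaryCopy(Replica):
    """Primary copy: orders updates and pushes them to the holders of a key."""

    def __init__(self, name):
        super().__init__(name)
        self.holders = {}  # context key -> replicas known to hold that key

    def register_holder(self, key, replica):
        self.holders.setdefault(key, []).append(replica)

    def update(self, key, timestamp, value):
        """Apply an update and propagate it; return the names of notified replicas."""
        if not self.apply(key, timestamp, value):
            return []  # stale update: nothing to propagate
        notified = []
        for replica in self.holders.get(key, []):
            replica.apply(key, timestamp, value)
            notified.append(replica.name)
        return notified
```

Routing every write through one node gives a single place where update order is decided, at the cost of an extra hop for writers that are not co-located with the primary copy.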

Page 29: Managing Metadata in Service Architectures

Fault-tolerance experiment: Testing Setup

[Figure: two test setups (Test-1 and Test-2) with Hybrid Service instances in Bloomington, IN; Indianapolis, IN; Tallahassee, FL; and San Diego, CA, connected through NB nodes. Measured time = T1 + T2 + T3.]

Simulation parameters
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes

Page 30: Managing Metadata in Service Architectures

Fault-tolerance experiment result
- The overhead of replica creation is only a few milliseconds.
- Continuous replica creation operation does not degrade performance.
- The overhead of replica creation increases on the order of milliseconds as the fault-tolerance level increases.

[Figure: bar chart for 1 replica creation (Indianapolis), 2 replica creations (Indianapolis, IN and Tallahassee, FL), and 3 replica creations (Indianapolis, IN; Tallahassee, FL; San Diego, CA). Y axis: time (ms), 0 to 70. Series: overhead of replica creation when using one intermediary broker, overhead of replica creation when using two intermediary brokers, end-to-end latency.]

[Figure: 1 replica creation at remote location: Indianapolis, IN. X axis: every 1000 observations (0 to 25). Y axis: time (ms), 0 to 9. Series: Average Two Brokers, Average One Broker, Latency.]

Page 31: Managing Metadata in Service Architectures

Consistency Enforcement Experiment: Test Methodology

[Figure: two test setups. A HybridService instance in Bloomington, IN reaches a HybridService instance at a remote site (1: Indianapolis, IN; 2: Tallahassee, FL; 3: San Diego, CA) through one NB node in the first setup and two NB nodes in the second. Measured time = T1 + T2 + T3.]

Simulation parameters
- Backup frequency: every 10 seconds
- Message size: 2.7 Kbytes

Page 32: Managing Metadata in Service Architectures

Consistency Enforcement Test Result
- The overhead of consistency enforcement is a few milliseconds.
- Continuous operation does not degrade performance.
- The cost of consistency enforcement remains the same regardless of the distribution of the network nodes.

[Figure: bar chart for Bloomington-Indianapolis, Bloomington-Tallahassee, and Bloomington-San Diego. Y axis: time (ms), 0 to 70. Series: overhead of distribution when using one intermediary broker, overhead of distribution when using two intermediary brokers, latency.]

[Figure: Bloomington - Indianapolis Consistency Enforcement Chart. X axis: every 1000 observations (0 to 25). Y axis: time (ms), 0 to 9. Series: Average Latency, Average One Broker, Average Two Brokers.]

Page 33: Managing Metadata in Service Architectures

Conclusions
- Efficient decentralized metadata strategies
  - TupleSpaces & Pub-Sub communication paradigms
  - Distribution
  - Replication for fault-tolerance
  - Replication for performance
  - Consistency Enforcement
- Efficient centralized metadata management strategies
  - TupleSpaces paradigm based in-memory storage

Page 34: Managing Metadata in Service Architectures

Contributions
- Federated Grid Information Service Architecture
  - Unified data model and communication protocol
  - Support for both interaction-independent and conversation-based service metadata
  - Support for greater interoperability
- Unified Grid Information Service Architecture
  - Flexible and extensible architecture
  - Support for high performance and fault tolerance
  - Uniform access to all kinds of service metadata
- Efficient decentralized metadata systems can be built by integrating TupleSpaces and Publish-Subscribe paradigms
  - Fault tolerance, distribution and consistency can be achieved with a few milliseconds of system processing overhead.
  - Self-awareness can be achieved in decentralized metadata management.
  - Communication among services can be achieved with efficient mediator metadata strategies
- A metadata management approach for Dynamic Web/Grid Service Collections
  - Collective operations such as queries on subsets of all available metadata in a service conversation.

Page 35: Managing Metadata in Service Architectures

Information Service Usage Cases
- WS-Context: Fast SOAP transfer in Mobile Computing (Sangyoon Oh's Thesis)
- WS-Context, Extended UDDI: Geographical Information Service & Sensor Grids (Galip Aydin's Thesis)
- WS-Context: Session Metadata Management (Hasan Bulut's Thesis)
- WS-Context: Fault-Tolerant Registry (Harshawardhan Gadgil's Thesis)
- WS-Context: VLab Project, Univ. of Minnesota, Florida State University
- Extended UDDI: Chemical Informatics and Cyberinfrastructure Collaboratory Project
- WS-Context, Extended UDDI: Pattern Informatics (UC Davis), IEISS (LANL)

Page 36: Managing Metadata in Service Architectures

Selected Publication List focusing on a) Metadata, b) Information Services, and c) Metadata Discovery

Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Information Services for Dynamically Assembled Semantic Grids [SKG-05, 2005]

Mehmet S. Aktas, Geoffrey Fox, Marlon Pierce, Managing Dynamic Metadata as Context [ICCSE, 2005]

Mehmet S. Aktas et al., Web Service Information Systems and Applications [GGF-16, 2006]

Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Fault Tolerant High Performance Information Services for Dynamic Collections of Grid and Web Services [FGCS Journal, 2006]

Mehmet S. Aktas, Sangyoon Oh, Geoffrey C. Fox, Marlon Pierce, XML Metadata Services [SKG-2006, Concurrency and Computation: Practice and Experience Journal-2007]

Mehmet S. Aktas, Marlon Pierce, and Geoffrey C. Fox, Designing Ontologies and Distributed Resource Discovery Services for an Earthquake Simulation Grid [GGF11, 2004]

Mehmet S. Aktas, M. Pierce, G. Fox, and D. Leake, A Web based Conversational Case-Based Recommender System for Ontology aided Metadata Discovery [GRID Workshop-2004]

Sangyoon Oh, Mehmet S. Aktas, Geoffrey C. Fox, Marlon Pierce, Architecture for High-Performance Web Service Communications Using an Information Service [WSEAS Journal-2006]