data dissemination motivated by examples · large distributed systems must detect and correct...

16
1 1 Data Dissemination Some Apps Nature of information Modes of interaction Data delivery options MOM Addressing options Contents: Mariano Cilia / [email protected] 2 Motivated by Examples Traffic management Air-traffic control Emergency vehicles Environmental Concentration of chemicals Dangerous emissions Forest fires Flood control 3 Example – Flood Control A combination of happenings (coming from sensor data) determines a situation Polling sensors Too frequent Consumption of resources Too infrequent Malfunction 4 pull(Airport Op.) Application scenario: ATIS active mediator datalink publish(ATIS++) subscribe(ATIS+) push(runway) publisher and event detection wrappers push(weather) pull(Airport Op.) complex situation detection action execution repository configuration management pull(Tower) pull(DWD Of) pull(xxx) ATC: HMI request ATIS subscribe to ATIS+ notification notify push(xxx) 5 Examples (cont.) Plant and reactor control Equipment control Defense Missile detection Battlefield monitoring Workflow management 6 Examples (cont.) Commerce Inventory control Supply Chain Management Marketplaces e-Auctions Online shops

Upload: others

Post on 06-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

1

1

Data DisseminationSome AppsNature of informationModes of interactionData delivery optionsMOM

Addressing options

Contents:

Mariano Cilia / [email protected]

Motivated by Examples

Traffic managementAir-traffic controlEmergency vehicles

EnvironmentalConcentration of chemicalsDangerous emissionsForest firesFlood control

3

Example – Flood ControlA combination of happenings (coming from sensor data) determines a situationPolling sensors

Too frequentConsumption of resources

Too infrequentMalfunction

4

pull(Airport Op.)

Application scenario: ATIS

activemediator

datal

ink

publish(ATIS++)

subscribe(ATIS+)

push(runway)

publisher andevent detection wrappers

push(weath

er)

pull(Airport Op.)

complex situation detectionaction execution

repositoryconfiguration management

pull(Tower)

pull(DWD Of)

pull(xxx)

ATC: HMI

request ATISsubscribe to ATIS+

notification

notify

push(xxx)

5

Examples (cont.)Plant and reactor control

Equipment controlDefense

Missile detectionBattlefield monitoring

Workflow management

6

Examples (cont.)Commerce

Inventory controlSupply Chain ManagementMarketplacese-AuctionsOnline shops

Page 2: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

2

7

Example – e-Auctions

Simple ascending auction modelTime-related eventsConditions

Send msgto seller

Notify process status to

participants

Send congrat. msg

to winner8

Example – Online shops

buy

credit card info credit card authorization

confirmation(web page)

paymentproblems?

problem w/payment(email)

Yes

No

send goods

packageend

user

9

RFID – Supply Chain Mgmt (cont)

10

Examples (cont.)

PersonalizationUser InterfacesServices

11

Example - PersonalizationCar PortalDriver

Portal(*.*)Portal

• My Car Info• Biography• Occupants• Current State

• My Info• Medical Info• Travel Preferences• Units of Measures Prefs• Preferred Payment Methods

• Info• Status• Location• Software

Box

on DrvGetIntothen LoadPrefsR1

DrvGetIntoevent

SetInstruments

1

23

40Addition of rules to portalsAddition of rules to portals

Dissemination of eventsDissemination of events(GPS, status, ...)(GPS, status, ...)

12

Examples (cont.)

PersonalizationUser InterfacesServices

Financial applicationsCommodity tradingCurrency tradingStock trading

Page 3: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

3

13

Example – Stock trading

SampleON stock.Name=IBM IF stock.Price<20 THEN call myBuy()

High volume

14

Convergence of TechnologiesAmbient Intelligence and smart devices require continuous monitoring of eventsMiniaturization of sensors, ubiquitous deploymentContext information for proper interpretation(Almost) complete reachability of individuals causes unbounded appetite for informationNeed to filter and interpret large amounts of heterogeneous and short-lived dataLarge distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise)

15

The Nature of InformationInformation flows from producer to consumer

info-pipes, broadcast disks, event streams, pub/sub

Static view of information is a simplification

data flows into/out of high latency pool (database)

Mechanisms for access to static information (queries) different from those for accessing flow of information (subscription/filters)

16

Working HypothesisDesintermediation/ReintermediationB2C, B2B, B2B2C, C2C, Portals, Markets (DotComs vs. NewCos), m-commerce …First generation e-commerce systems mapped existing applications 1:1 to new mediumNext generation(s) will be based on flexible integration of services and componentsFlow of tasks and informationHow should (middleware) platforms look?

17

Modes of Interaction

Initiator of Interaction

Event-based dissemination

Anonymous Request/Reply

No

One-to-One Message

Request/ReplyFullKnowledge about Counterpart

ProducerConsumer1st generation

18

Modes of Interaction – R/R

C P

Page 4: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

4

19

Modes of Interaction – R/R

C Prequest

• Known server

20

Modes of Interaction – R/R

C Prequest

reply

21

Request/ReplyDirect and synchronous communications

Enforces tightly coupling of comm. PartiesImpairs scalability

Clients pull remote data sourcesTrade off

Usage of data vs. data accuracyShort polling interval waste resourceLong polling interval increase update latency

Need for asynchronous and decoupled operations

22

Request/Reply (cont.)Simple

+ imperative nature of C/S paradigm+ programming language abstraction

DrawbacksPoint-to-point communication limits scalabilityPolling limits accuracy of data

Unnecessary bandwidth consumption

23

Modes of Interaction

Initiator of Interaction

Event-based dissemination

Anonymous Request/Reply

No

One-to-One Message

Request/ReplyFullKnowledge about Counterpart

ProducerConsumer

24

Anonymous R/RConsumer does not specify the providerRequest is delivered to an arbitrary set of providersIdentity of provider is unknownLoad balancing

Page 5: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

5

25

Modes of Interaction

Initiator of Interaction

Event-based dissemination

Anonymous Request/Reply

No

One-to-One Message

Request/ReplyFullKnowledge about Counterpart

ProducerConsumer

26

1-to-1 message / CallbackConsumer registers interest with a known providerProvider repeatedly evaluates interests

when true callback registered consumerResponsible for managing list of interests and registered consumers

One to one messageObserver pattern

27

Modes of Interaction

Initiator of Interaction

Event-based dissemination

Anonymous Request/Reply

No

One-to-One Message

Request/ReplyFullKnowledge about Counterpart

ProducerConsumer

Needs a mediator

28

Modes of Interaction – Events

C mediator

C

C

P

29

Modes of Interaction – Events

C mediator

interests

C

C

P

30

Modes of Interaction – Events

C mediator

interests

C

P

C

Page 6: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

6

31

Modes of Interaction – Events

C mediator

C

C

P

data/event

happening of interest

32

Modes of Interaction – Events

C

C

C

P

data/event

mediator

33

Modes of Interaction – Events

C

C

C

mediator P

? - Bottleneck?- Single point of failure?

34

Modes of Interaction – Events

C

C

C

P

data/event

B

B B

B

35

Events and Notifications

Event happening of interest at observed object(s)

Notification i) communication of event occurrence to interested recipientsii) reification of observed event

Notification Service (NS)provides infrastructure to register for and deliver notificationsi.e. publish(), subscribe(), notify()

36

Event Notification - PatternsObserver: observable events

Event producers have knowledge about event consumers

Mediator: centralized mediationEncapsulates and coordinates communication

Notification ServiceCombines the Observer and the Mediator patternsSubscribers only know about events not about publishersMediation between event producers and consumers

Page 7: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

7

37

Event-based ParadigmThe (data) producer is the initiator of communicationNotifications are not addressed to any specific consumer

Producers are not aware of consumers

Consumers issue subscriptions (interests)Consumers are not aware of producers

Notifications are delivered to consumers if they match with subscriptionsFlexible!

38

Modes of InteractionInfluences

The architecture of the system The design of the individual processes involved

Link distributed parts of the systemdifficult to change afterwards

MoI determine system’s ability to adapt, evolve and scaleMoI is confused with the implementation techniques

39

Interaction PatternsInitiation

(client) pull vs. (server) pushperiodic vs. aperiodic

Topology1:1 (unicast) vs. 1:n (multicast)

Lifecycletime-dependent vs. time-independent

Concurrencyblocking vs. non-blocking

Reliabilityatomic, at-least-once, at-most-once, exactly-once

40

Data Delivery OptionsData Delivery Options

Pull

PeriodicAperiodic

1:1 1:n 1:n1:1

Request/reply

Request/reply

w/snooping

Polling Pollingw/snooping

Push

PeriodicAperiodic

1:1 1:n 1:n1:1

E-mail

Triggers

Publish/subscribe

Reminders

News headlines

Broadcastdisks

41

Interaction vs. Invocation

Must separate mode of interaction and implementation (invocation) techniqueSeparation must occur at various levels of abstraction

RPC implemented using messagesImplementation using other interaction patterns

Pointcast: implemented an event-driven notification service through a polling mechanism

42

Invocation Mechanisms of C/S Sys

The communication mechanisms used in client/server systems fall into one of the following categories:

remote procedure call (RPC)transactional RPCpeer-to-peer messagingqueuestransactional queuesevents/Publish-Subscribe

Page 8: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

8

43

Middlewareused to glue together applications (components):

IPC by sockets, shared memoryTCP/IP, X.25common databaseRPC, CORBA RMI, J2EEMOM

44

Message Oriented Middlewareapplications communicate through explicitly sending/receiving messagesmost common flavors:

queuespoint-to-point (mostly)location-based addressingenqueue, dequeuestore and forward

publish/subscribedifferent addressing approachesregister & callback (Observer pattern)optimize network use

45

MOM (cont.)flexible interaction

C/S request/reply, one-way pushasynchronous and time-independent 1:1, n:1, 1:n, m:npriorities

flexible reliabilityvolatile/persistent/transactional queuesreliable/certified/transactional pub/sub

additional servicesload balancing, naming, security, content transformation

Communication Mechanisms

47

Already seen Request/ReplySimple

+ imperative nature of C/S paradigm+ programming language abstraction

DrawbacksPoint-to-point communication limits scalabilityPolling limits accuracy of data

Unnecessary bandwidth consumption

48

QueuesWhy use queues?

Asynchronous communicationNo blocking while waiting for replyClients can submit requests even if server is not availableEasy to handle results of disconnected clientsLoad balancingPossibility of prioritizing the requests in the queue

Page 9: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

9

49

Operation with QueuesPersistent queue between client and serverClient: enqueues requests, dequeuesrepliesServer: dequeues a request, processes request, enqueues reply, commitsIf transaction aborts due to system reasons it is enqueued again

50

Queue ManagersQueue manager needed

operations on queue elements: enqueue, dequeue, scan queue, keyed accesscreate and destroy queuesmodify a queue’s attributes, such as owner, size, privilegesstart and stop queuerouting of requests (forwarding to another queue manager in case of overload)

51

Server’s View of QueuingAssume each request is for execution of a single transactionServer dequeues a request, executes the request, enques the result, and commitsIf the transaction aborts

the dequeue operation is undonethe enqueue operation is undone if already started

If client checks queues, request is either in request queue, in process, or result in result queue

52

Client’s View of QueuingClient perceives three transactions for each request:

one transaction to enqueue requestreceive input from user, construct request, enqueue request, commit

one server transaction (described above)one transaction to dequeue results

dequeue reply from result queue, convert to proper output format, deliver output, commit (wiping out result in result queue)

54

Request/Reply with QueuesClient A

Server B

Request Q

Reply Q

TX 1:Start

get inputconstruct requestenqueue request

CommitTX 2:Start

dequeue requestprocess requestenqueue reply

Commit

TX 3:Start

dequeue replydecode replyprocess output

Commit

Message queues are good for Message queues are good for asynchronous point to point asynchronous point to point (1:1) messaging(1:1) messaging

55

Cost/Benefit of Operating with Queues

Using queues is expensive3 transactions instead of onetransactional queues must be managed by a (specialized) DBMS to guarantee persistence and transaction semantics

Using queues buys flexibilitycommunication with unavailable clients or serversload balancing across serverseasy implementation of prioritieseasier integration of legacy systems

Page 10: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

10

56

Need for Persistent SessionsSystem must be able to identify sending and receiving transactions and match themWithout request/reply semantics, queue manager may not accept requests with output parameters (since results would be simply dumped on device)Recovery of queuing systems later (with TPM)

57

Summary communication mechanisms

RPC: synchronous, simple call-return semantics, hard-wired termination and orderingMulticast: 1:N messaging for group communicationPoint-to-point messaging: flexible sequencing of messages, asynchronous (synchronous possible), optimizable, sequencing and timing reflected in application logic therefore more difficult to useQueues: fully asynchronous, maximum flexibility for handling client/server/communication failures

Publish/Subscribe

59

Pub/Sub Notification ServiceMain characteristics

decouples producers and consumersanonymous to each otherdynamic number of consumers and producersno directory service is needed

Addressing modelsChannel-basedSubject-basedContent-basedConcept-based

60

Channel-based

producer1

producer2

consumer1

consumer2

consumer3

sports

• Less powerful• Simple

61

Subject-based AddressingSubject-based addressing avoids use of physical network addressesSenders label a data message with a subject name

Subject = characterize/synthesize message content

Consumers listen to names and pick up the messages with the proper subject nameAnonymous rendezvous:

producers need not know how data is consumedconsumers need not know how or where data is produced

Page 11: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

11

62

Subject-based Addressing (2)

subscribes:news.sports.*

{sport=basketball;team1=Lakers;team2=Sixers;result=108-96 }

producer1 producer2

consumer1

subscribes:news.finance.stocks.SUNW

subscribes:news.>

consumer2 consumer3

{symbol=SUNW;price=10}

news.sports.basketball

news.finance.stocks.SUNW

63

Subject-based Addressing (3)Agreement on subject namesNew subjects can be created dynamicallySubject names consist of elements (subject name hierarchy)

element.subelement.subsubelement

Wildcards can be usedRUN.* matches RUN.AWAY

RUN.homeRUN.> matches RUN.AWAY.far

Difficult to change subject hierarchy

64

Content-Based Pub/SubA content-based filter F

is a predicate on the content of notificationsinduces the set of matching notifications

Content-Based filtering is flexible but complexcannot be easily mapped to “IP-Multicast”

Centralized implementations not scalable to wide-area scenario

powerful distributed infrastructure required

65

Content-Based Pub/Sub

subscribes:type==sport-news

{type=sport-news;sport=basketball;team1=Lakers,team2=Sixers,result=108-96 }

publisher1 publisher2

subscriber1

subscribes:type==finanace &symbol==SUNW

subscribes: *

subscriber2 subscriber3

{type=finance;symbol=SUNW;

price=10}

subscribes:price < 15

66

Problems derived from scaleFlooding of notifications is not an applicable solution

need strategies for filter placement to optimize bandwidth and size of routing tables

67

Content-Based RoutingCooperating brokers

Local clientsNotification forwardingFilter-Based Routing Tables

Tradeoff:Network resource waste vs. filtering overhead(processing and delay)

B1

LocalClients

B2 B3

B4F

G

Producer

subscriptions: F and G

Page 12: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

12

68

Content-Based RoutingCooperating brokers

Local clientsNotification forwardingFilter-Based Routing Tables

Tradeoff:Network resource waste vs. filtering overhead(processing and delay)

B1

LocalClients

B2 B3

B4F

G

(F,B2)(G,B3)

(F,B1)(G,B4)

RoutingTables

Producer

routing tables are built

69

Content-Based RoutingCooperating brokers

Local clientsNotification forwardingFilter-Based Routing Tables

Tradeoff:Network resource waste vs. filtering overhead(processing and delay)

B1

LocalClients

n

B2 B3

B4

n

F

G

(F,B2)(G,B3)

n

Producer

notification n is published it matches with F

RoutingTables

(F,B1)(G,B4)

70

Content-Based RoutingCooperating brokers

Local clientsNotification forwardingFilter-Based Routing Tables

Tradeoff:Network resource waste vs. filtering overhead(processing and delay)

B1

LocalClients

m

B2 B3

B4

m

m

mF

G

(F,B2)(G,B3)

(G,B4)

RoutingTables

Producer

notification m is published it matches with G

71

Content-Based Routing (cont)Size of routing tables crucial for scalability

global knowledge about all active subscriptions not feasible

Solution: reduce size of routing tables and overhead to update them by

exploiting similarities among filtersidentity testscovering tests

merging of filterstrading accuracy vs. efficiency

72

CoveringFilters can cover each otherCovering can decrease

size of routing tablesfilter forwarding overhead

FG

73

Merging of FiltersFilters can be merged

perfect

Imperfect

Merging generates new covers

similar benefits as covering

F GH

GHF

Page 13: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

13

74

The REBECA ApproachPrototype of notification infrastructure (REBECA Event-Based Electronic Commerce Architecture)(http://www.gkec.informatik.tu-darmstadt.de/rebeca)

Content-based routing with optimizationsFlexible filter frameworkSupport for complex data types

Structuring publish/subscribe systemsScopingSessions

75

Concept-based Pub/SubMain advantages of Publish/subscribe

decouples producers and consumersanonymous to each other

BUT even though consumer and producer use a common vocabulary (let’s suppose this) assumptions of participants are implicit

(date) 7/11/2003 Which is the month?(price) 200 Currency? €?, U$S?…

Subscriptions expressed on flat messages

76

Concept-based Pub/SubProvide a higher level of abstraction to describe the interests of publishers and subscribersEvents represented using MIXSubscribers can specify their assumptions

Price < 100 [€]DeliveryDate <= 7/11/2003 [dd/mm/yyyy]

The notification service delivers ready-to-process events to subscribers

No further processing is needed78

Concept-based Overview

79

Concept-based Overview[“MM/DD/YY”, miles, F]

[“DD.MM.YY”, Km]

[“DD/MM/YYYY”, km, C]

[“DD.MM.YYYY”, km, USD] [Km, EUR, C] 80

Concept-based OverviewCan be built as a layer on top of

different addressing models:• Channel-based• Subject-based

• Topic-based (JMS)• Content-based

Page 14: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

14

81

Wrap upDifferent routing strategies according to application needsFilters/subscriptions on a single messageEvent correlation no supported

Need to cache/store semi-composed events

Software EngineeringNeed to scope events and subdivide event space

Data Dissemination Products

91

Queue Managers: IBM’s MQSeriesmost TP Monitors offer queue managers (TUXEDO, Encina, TOP END)standalone products(IBM’s MQSeries, BEA messageQ, SonicMQ, SUN JMQ, ...)MQSeries provides interoperable queue management across many Operating Systems (~15)works with all IBM TP monitors and any system supporting the X/Open XA interface (including CORBA OTS), Java connectivity includedwhen working with a TPM, MQSeries uses the TPM transactions, otherwise it provides its own

92

MQSeries (cont.)multiple named queues supportedqueue forwarding among queues (e.g. for load balancing)queue forwarding occurs within channel agent’s own transactionpub/sub brokering possiblequeue manager consists of

connection managerdata managerlock managerbuffer managerrecovery managerlog manager

93

MQSeries: Qs, API ...types of Qs

local (app., transmission, dead-letter, initiation, ...)remote, alias, model, dynamic

interaction through MQI verbsMQBEGIN, MQCMITMQPUT, MQGET (browsing, consuming, blocking/non-blocking)control operations

connect/disconnect Qmanager (MQCONN, MQDISC)set configurations, manage Q processing (MQOPEN, MQSET, MQCLOSE)

interaction through C++/ Java APIsinteraction through JMS API 94

MQSeries Messagesmessages can be

persistentmore secure, more expensive, logged, exactly once semantic

non-persistentless secure, faster since in main memory, at most once semantic

both types of messages can be enqueued in same queuemessage data

user defined formatdefault format and encodings

Page 15: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

15

95

MQSeries: Messages (cont.)message consists of descriptor and datadescriptor includes context

identityoriginsystem message IDapplication message IDmessage type

datagram, request, reply,report

persistence flagname of destination queueID of reply queuecorrelation ID expiryapplication-defined formatreport options

confirm on arrival, on delivery, on positive/negative action, on expiration, or on exception

priority

96

MQSeries (cont.)management of message processing applications

process definition associated with QQmanager sends trigger to initiation Qtrigger monitor may start applicationusing process definition in trigger message

97

TIBCO’s TIB/Rendezvous

Event-driven, publish/subscribeSubject-based addressingSelf-describing messagesLeverage on broadcast & IP-multicast

98

TIB/Rendezvous Architecture

99

TIB/Rendezvous MessagesData Messages are self-describing

data + descriptive informationdatalength of datadatatype indicatorsubject name

listener callback functions receive same bundleautomatic conversion between local data format and TIB/Rendezvous wire format

104

Java Message Service - JMSTransactional, asynchronous messagingDe-facto standard for Java messaging APIsSupported by almost every QueueManagervendor (IBM, Oracle, BEA,...)Many 100% Java, lightweight JMS products (Fiorano, Progress, Softwired, SpiritSoft, etc.)Designed for portability

Interfaces only => many different realizationsAPIs to create, send, receive, read messages

Page 16: Data Dissemination Motivated by Examples · Large distributed systems must detect and correct failures/exceptions (autonomic computing, ESCM, zero latency enterprise) 15 The Nature

16

105

JMS ModelSupports both models

Queues for point-to-point

Publish/subscribe

JMS ProviderClient

Queue

ServerServer

Server

JMS ProviderPublisher

Topic

Subscriber

Subscriber

Subscriber

106

JMS ModelJMS point-to-point

Queue object: encapsulates provider specific Q nameQueueConnection: handle to underlying transportQueueSession: produces and consumes messagesTemporaryQueue: temporary storage for the QueueConnectionQueueConnectionFactory: creates QueueConnectionQueueReceiver: gets messagesQueueSender: puts messages

107

JMS pub/subJMS publish/subscribe:

Combination of channels (now topics) and expressions on envelope’s attributes

Factories, destinations, etc. identified via JNDI

108

JMS MessagesMessage types

textmap: (name,value) pairsobject: serializable objectstream: primitivesbyte

Message header used for addressingMessage properties

list of (name,value) pairsselectors: SQL-like conditional expressions, MyType=‘car’ AND MyName like ‘Mu%’

110

JMS Open Issuesload balancingscalability/availabilityfault toleranceerror notificationend-to-end securitysegregation of domainssimple and flexible deployment configurationMany of these issues being addressed by vendors

112