Content-Based Publish-Subscribe Over Structured P2P Networks
Peter Triantafillou and Ioannis Aekaterinidis
Research Academic Computer Technology Institute and Department of Computer Engineering and Informatics, University of Patras, Greece
Agenda
• Introduction/Goal
• Publish-Subscribe Systems
• Publish-Subscribe over Chord
  – Processing Subscriptions
  – Processing Events
• Improving Performance
• Conclusion
Introduction
• Publish-subscribe systems are becoming very popular for building large-scale distributed systems and applications
  – Anonymity between publisher and subscriber
• Centralized:
  – Advantage: a global image of the system makes the matching algorithm easy to implement
  – Disadvantage: limited scalability
• Decentralized:
  – Advantage: scalability
  – Challenge: development of an efficient distributed matching algorithm
Goal
• Chose to use Chord because of its:
  – Simplicity
  – Popularity
  – Scalability
  – Self-organization
  – Good performance
• Challenge: to develop a strategy for using DHTs to provide good support for range predicates
  – Range predicates are popular when specifying subscription attributes
Publish-Subscribe Systems
• Asynchronous messaging paradigm
• Senders (publishers) of messages are not programmed to send their messages to specific receivers (subscribers)
• Published messages are characterized into classes (without knowledge of what subscribers there may be)
• Subscribers express interest in one or more classes and only receive messages that are of interest (without knowledge of what publishers there are)
• This decoupling of publishers and subscribers allows for greater scalability and a more dynamic network topology
Pub-Sub Message Filtering
• Subscribers receive only a subset of the total messages published
• Two main types:
  – Topic based
  – Content based
• Hybrid
  – Coupling of topic-based and content-based systems
Topic Based Pub/Sub Systems
• Much like newsgroups
• Users join a group (topic)
• All messages related to that topic are broadcast to all users participating in the specific group
Content Based Pub/Sub Systems
• Preferable to topic-based systems
• Give users the ability to express their interest by specifying predicates over the values of a number of well-defined attributes
• Matching of publications (events) to subscriptions (interests) is done based on the content (values of attributes)
Hybrid System
• Publishers post messages to a topic while subscribers register content-based subscriptions to one or more topics
• Publications and subscriptions are automatically classified in topics (using an application-specific schema)
• Drawbacks:
  – Design of the domain schema plays a fundamental role in the system's performance
  – Many false positives are likely to occur
Event Schema
• Set of typed attributes
• Each attribute ai consists of:
  – Type: belongs to a predefined set of primitive data types
  – Name: a string
  – Value v(ai)
    • Lies in a range defined by the minimum and maximum values (vmin(ai), vmax(ai)), along with the attribute's precision vpr(ai)
Subscription Schema
• Contains all interesting subscription-attribute data types (integers, strings, etc.) and all common operators (=, ≠, <, >, etc.)
• Event matches subscription iff all the subscription’s attribute predicates/constraints are satisfied
• Can have two or more constraints for the same attribute
Subscription Identifier
• Concatenation of 3 parts:
  – C1: id of the node receiving the subscription
    • Size: m bits in a Chord ring with an m-bit address space
  – C2: id of the subscription itself
    • Size: bits equal to the rounded-up base-2 logarithm of the maximum number of outstanding subscriptions a node can have
  – C3: number of attributes on which constraints are declared
    • Size: max value = total number of attributes supported by the system
Subscription ID Example
• Assume a Chord ring with a 3-bit identifier address space
• Each node can support 8 outstanding subscriptions, with an attribute schema including 7 attributes
• The example subID depicts subscription 3 (C2=3), belonging to node 4 (C1=4), comprising constraints on 5 attributes (C3=5)
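The three-part identifier can be sketched as simple bit packing. This is a minimal illustration, not the authors' implementation: the field order (C1 in the most significant bits), the function names, and the use of `bit_length()` to size the fields are all assumptions made for the example.

```python
def field_widths(m, max_subscriptions, total_attributes):
    """Bit widths for (C1, C2, C3); the sizing rules follow the slide, the code is an assumed sketch."""
    c1_bits = m                                      # node id: m bits
    c2_bits = (max_subscriptions - 1).bit_length()   # rounded-up log2 of max outstanding subscriptions
    c3_bits = total_attributes.bit_length()          # must hold values up to total_attributes
    return c1_bits, c2_bits, c3_bits


def pack_sub_id(c1, c2, c3, widths):
    """Concatenate C1 | C2 | C3 into one integer (C1 most significant: an assumption)."""
    _, c2_bits, c3_bits = widths
    return (c1 << (c2_bits + c3_bits)) | (c2 << c3_bits) | c3


def unpack_sub_id(sub_id, widths):
    """Recover (C1, C2, C3) from a packed identifier."""
    _, c2_bits, c3_bits = widths
    c3 = sub_id & ((1 << c3_bits) - 1)
    c2 = (sub_id >> c3_bits) & ((1 << c2_bits) - 1)
    c1 = sub_id >> (c2_bits + c3_bits)
    return c1, c2, c3


# Slide example: 3-bit ring, 8 outstanding subscriptions, 7 attributes.
widths = field_widths(m=3, max_subscriptions=8, total_attributes=7)
sub_id = pack_sub_id(c1=4, c2=3, c3=5, widths=widths)
```

With these assumptions each field is 3 bits wide, so the example subID is the 9-bit string 100 011 101.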
Storing Subscriptions
• Done using the hash function provided by Chord (SHA-1)
• Returns an identifier uniformly distributed in the address space
• k = h(v(ai))
• Following the Chord API, the subID is placed at node successor(k)
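A minimal sketch of the key computation and successor lookup follows. It is a local simulation, not a Chord client: reducing the SHA-1 digest modulo 2^m and representing the ring as a plain list of node ids are assumptions made for illustration, and `chord_key` and `successor` are hypothetical helper names.

```python
import bisect
import hashlib


def chord_key(value, m):
    """k = h(v(ai)): SHA-1 of the value, reduced to an m-bit identifier (assumed reduction)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(key, node_ids):
    """First node id >= key on the ring, wrapping around to the smallest id."""
    ring = sorted(node_ids)
    index = bisect.bisect_left(ring, key)
    return ring[index % len(ring)]


# Hypothetical ring with four live nodes in a 3-bit identifier space.
nodes = [0, 2, 4, 6]
target = successor(chord_key("NYSE", 3), nodes)  # node that would hold the subID list
```

In a real deployment the lookup is routed through Chord's finger tables rather than computed from a global node list.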
Storing Subscriptions
• Procedure:
Storing Example
• Attributes are processed one at a time
• The subscription ID is stored at:
  – successor(h("NYSE")), in the list dedicated to attribute Exchange
  – successor(h("OTE")), in the list dedicated to attribute Symbol
• Since the Price attribute covers the range 8.30 < Price < 8.70 with a precision of 0.01, the subscription ID is stored at:
  – successor(h(Price)) for each of the values 8.31, 8.32, …, 8.69
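The range expansion in this example can be sketched as below. The helper name `open_range_values` is hypothetical, and using `Decimal` to avoid floating-point drift is a choice made for this illustration rather than something the slides specify.

```python
from decimal import Decimal


def open_range_values(low, high, precision):
    """Enumerate every value strictly between low and high at the given precision."""
    low, high, step = Decimal(low), Decimal(high), Decimal(precision)
    values = []
    current = low + step
    while current < high:
        values.append(current)
        current += step
    return values


# Price constraint from the slide: 8.30 < Price < 8.70 with precision 0.01.
price_values = open_range_values("8.30", "8.70", "0.01")
```

Each enumerated value is hashed separately, so the subID ends up in one list per value, which is what makes range constraints more expensive to store than equality constraints.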
Updating Subscriptions
• When updating an attribute of a subscription that has an equality constraint, only 2 nodes are affected:
  – Delete the subscription ID from the appropriate list at node:
    • nodeID = successor(h(vstale_value(ai)))
  – Add the subscription ID to the appropriate list at node:
    • nodeID = successor(h(vupdated_value(ai)))
• The procedure for updating a range value depends on the new values of the range bounds (vlow_NEW(ai) and vhigh_NEW(ai)) compared to the old values
• If vlow_NEW(ai) < vlow(ai) store the subID to the nodes that cover [vlow_NEW(ai), vlow(ai)) range
• If vhigh_NEW(ai) > vhigh(ai) store the subID to the nodes that cover (vhigh(ai), vhigh_NEW(ai)] range
• If vlow_NEW(ai) > vlow(ai) delete the subID from the nodes that cover [vlow(ai), vlow_NEW(ai)) range.
• If vhigh_NEW(ai) < vhigh(ai) delete the subID from the nodes that cover (vhigh_NEW(ai), vhigh(ai)] range
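The four range-update rules amount to a set difference between the old and new value sets. The sketch below assumes inclusive integer bounds measured in units of the attribute's precision, and that the old and new ranges are each contiguous; `range_update` is a hypothetical name.

```python
def range_update(old_low, old_high, new_low, new_high):
    """Return (values_to_store, values_to_delete) when a range constraint changes.

    Bounds are inclusive integers in units of the attribute's precision (an assumption).
    """
    old_values = set(range(old_low, old_high + 1))
    new_values = set(range(new_low, new_high + 1))
    to_store = sorted(new_values - old_values)   # newly covered values: add the subID
    to_delete = sorted(old_values - new_values)  # values no longer covered: remove the subID
    return to_store, to_delete


# Old range [10, 20] becomes [8, 18]: the low bound moved down, the high bound moved down.
to_store, to_delete = range_update(10, 20, 8, 18)
```

Here the subID is stored for [8, 10) and deleted for (18, 20], matching the rules for a lowered low bound and a lowered high bound.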
Processing Events: Matching Algorithm
Matching Events with Subscriptions Example
• Suppose we have Subscriptions 1 and 2 generated by two clients connected to a Chord node and Event 1
• First, the algorithm will collect all the subID lists for which the values of the event attributes satisfy the corresponding constraints of the subscriptions
• The algorithm starts with attribute Exchange = “NYSE” and retrieves the subID list (LExchange) from node successor(h(“NYSE”))
• This list contains only the subID1
– LExchange -> subID1
• Next, for attribute Symbol = “OTE”, the subID list (LSymbol) is retrieved from node successor(h(“OTE”))
  – LSymbol -> subID1, subID2
  – Both subscriptions are satisfied by the event for this attribute
• Next, for attribute Price = 8.40, the subID list (LPrice) is retrieved from node successor(h(8.40))
  – LPrice -> subID1
  – Only subscription 1 has a Price range that contains this value
• Lastly, for attribute Low = 8.22, the subID list (LLow) is retrieved from node successor(h(8.22))
  – LLow -> subID2
  – Only subscription 2 has a constraint on attribute Low
• After this phase of the matching process, the collected subscription ID lists are:
  – LExchange -> subID1
  – LSymbol -> subID1, subID2
  – LPrice -> subID1
  – LLow -> subID2
• Subscription 1 was found in 3 lists, while subscription 2 was found in 2
• By processing the subIDs of the subscriptions (C3 part), we can find out that both subscriptions have constraints over 3 attributes
• Since subscription 1 was found in 3 lists, a match is implied, and its subID is kept in order to inform the node that generated the subscription about the matched event
• Subscription 2 was found in only 2 lists, fewer than its 3 constrained attributes, so it does not match
• Metadata held for subID1 is used to locate the IP address of the client that generated the subscription
• The node storing the subscription is contacted (using the nodeID equal to the C1 field of subID1) and the event is delivered to the interested client
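The counting step of the worked example can be sketched as follows. This is an assumed, local illustration: the subID lists are given as a dictionary rather than fetched from Chord nodes, and the C3 values are supplied in a lookup table instead of being decoded from the subID bits.

```python
from collections import Counter


def match_subscriptions(collected_lists, constraint_counts):
    """Return the subIDs that appear in as many lists as they have constrained attributes (C3)."""
    hits = Counter()
    for sub_ids in collected_lists.values():
        hits.update(set(sub_ids))  # count each subID at most once per attribute list
    return sorted(s for s, found in hits.items() if found == constraint_counts[s])


# Lists collected for Event 1 in the slides' example.
collected = {
    "Exchange": ["subID1"],
    "Symbol": ["subID1", "subID2"],
    "Price": ["subID1"],
    "Low": ["subID2"],
}
c3 = {"subID1": 3, "subID2": 3}  # both subscriptions constrain 3 attributes
matched = match_subscriptions(collected, c3)
```

Only subID1 is returned, since subID2 appears in 2 lists but declares constraints on 3 attributes.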
Expected Performance
• Subscription Storage Procedure:
  – The average number of hops needed to store a subID depends on the type of constraints over the attributes
  – Equality: ½ log(N)
    • subID is stored in a single node
  – Range constraint: $r = \frac{v_{high}(a_i) - v_{low}(a_i)}{v_{pr}(a_i)}$ nodes are affected, which leads to r * ½ log(N) hops on average to store the subID
• Update/Deletion of Subscription
  – Again, depends on the type of constraints over the attributes
    • Equality: update performed by contacting log(N) nodes
    • Ranges: number of nodes is k * log(N) on average
      – k depends on whether the new range is smaller or wider than the old range
• Event Processing (matching)
  – Involves contacting $N_{a\text{-}event} \cdot \frac{1}{2} \log(N)$ nodes to collect the subscription ID lists
• Reminder: in a Chord network with N nodes and an m-bit identifier space ($2^m$ identifiers), the average number of nodes that must be contacted to find a successor is:
  – ½ log(N) hops
• By design, this proposal leads to fast and scalable event matching.
Improving Performance
• Currently, storing a subscription over the Chord ring takes r * ½ * log(N) hops on average for every attribute
  – r depends on the precision, high, and low values
• Using an order-preserving hash function, we can optimize this to r + ½ * log(N) hops
![Page 35: Content-Based Publish-Subscribe Over Structured P2P Networks](https://reader036.vdocuments.net/reader036/viewer/2022062520/56816452550346895dd61e6d/html5/thumbnails/35.jpg)
Order Preserving Chord
• Using a 2m- order preserving hash function• Expected performance:– ½ log(N) hops to locate node storing minimum
value of the range (vlow(ai))– Then, perform r hops to store remaining values in
the range to lead to r+ ½ log(N) total hops
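To make the difference between the two cost expressions concrete, the sketch below evaluates both for the same range size. The base-2 logarithm and the sample values (r = 39, N = 1024) are assumptions chosen for illustration; the function names are hypothetical.

```python
import math


def hops_plain_chord(r, n_nodes):
    """r independent lookups, each costing about 0.5 * log2(N) hops."""
    return r * 0.5 * math.log2(n_nodes)


def hops_order_preserving(r, n_nodes):
    """One lookup to reach the first node, then r single hops along the ring."""
    return r + 0.5 * math.log2(n_nodes)


# Assumed example: a range spanning 39 values on a 1024-node ring.
plain = hops_plain_chord(39, 1024)          # 195.0
ordered = hops_order_preserving(39, 1024)   # 44.0
```

The saving grows with both r and N, because the logarithmic lookup cost is paid once instead of once per value.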
Order Preserving Hash Function
• Suppose every attribute is characterized by
  – vmin(ai): minimum value ai can take
  – vmax(ai): maximum value ai can take
  – vpr(ai): precision of ai
• vj(ai) is any value in [vlow(ai), vhigh(ai)]
• OPHF is:

$$h(v_j(a_i)) = \left( s_o(a_i) + \frac{v_j(a_i) - v_{min}(a_i)}{v_{max}(a_i) - v_{min}(a_i)} \cdot 2^m \right) \bmod 2^m$$

$$s_o(a_i) = \mathrm{hash}(\mathrm{attribute\_name}(a_i))$$
Subscription and Event Processing with OPHF
• Example: Storing Subscription
  – Consider a Chord ring with 3-bit ids and 8 nodes
  – Subscription on a single integer attribute a arriving at node 3 with constraint 0 < v(a) < 4
  – Using Chord requires O(r*log(N)) hops to store the subID at three nodes
• Using the OPHF with Chord:
  – Perform O(log(N)) hops only once to reach the first node (node 6)
  – Storing the subID at nodes 7 and 0 requires 2 more hops
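The order-preserving mapping can be sketched in a few lines. The slides do not state the attribute's domain or its offset, so the values used here (vmin = 0, vmax = 8, offset 5) are assumptions picked only because they reproduce the node sequence 6, 7, 0 from the example; `attribute_offset` and `ophf` are hypothetical names.

```python
import hashlib


def attribute_offset(name, m):
    """Per-attribute starting point on the ring: a hash of the attribute name (assumed SHA-1 mod 2^m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def ophf(value, v_min, v_max, m, offset):
    """Map a value to the ring so that consecutive values land on consecutive identifiers."""
    space = 2 ** m
    scaled = int((value - v_min) / (v_max - v_min) * space)
    return (offset + scaled) % space


# Constraint 0 < v(a) < 4 on an integer attribute, so the stored values are 1, 2, 3.
# vmin=0, vmax=8 and offset=5 are assumptions, not taken from the slides.
node_ids = [ophf(v, v_min=0, v_max=8, m=3, offset=5) for v in (1, 2, 3)]
```

Because adjacent values map to adjacent identifiers (wrapping at 2^m), the subID can be forwarded from one node to its ring successor instead of issuing a fresh lookup per value.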
Conclusion
• Not Addressed
  – Load Balancing
  – Small Domain Problem
• Able to support equality and range attributes while leveraging Chord to build a scalable, self-organizing, well-performing content-based publish-subscribe system.