Publisher Relocation Algorithms for Minimizing Delivery Delay and Message Load
Alex Cheung and Hans-Arno Jacobsen, August 14th, 2009
Middleware Systems Research Group

TRANSCRIPT

Slide 1
Publisher Relocation Algorithms for Minimizing Delivery Delay and Message Load
Alex Cheung and Hans-Arno Jacobsen, August 14th, 2009, Middleware Systems Research Group

Slide 2: Agenda
• Problem statement and goal
• Related approaches
• How GRAPE and POP work
• Experiment results
• Conclusions and future work

Slide 3: Problem
• Publishers can join anywhere in the network, at the closest broker.
• Impact: high delivery delay and high system utilization (matching, bandwidth, subscription storage).
• Brokers between the publisher and its matching subscribers act as pure forwarders.

Slide 4: Goal
• Adaptively move the publisher to the area with the highest-rated subscribers or the highest number of publication deliveries.
• Key properties of the solution: dynamic, transparent, scalable, robust.

Slide 5: Existing Approaches
Filter-based pub/sub:
• R. Baldoni et al. Efficient publish/subscribe through a self-organizing broker overlay and its application to SIENA. The Computer Journal, 2007.
• Migliavacca et al. Adapting Publish-Subscribe Routing to Traffic Demands. DEBS 2007.
Multicast-based pub/sub, such as Riabov's subscription clustering algorithms (ICDCS 2002 and 2003), SUB-2-SUB (one subscription per peer), and TERA (topic-based):
• Assign similar subscriptions to one or more clusters of servers.
• One-time match at the dispatcher.
• Suitable for static workloads.
• May yield false-positive publication deliveries.
• Architecture is fundamentally different from filter-based approaches.

Slide 6: Terminology
Figure: a publisher P and a chain of brokers B1 through B5, illustrating which brokers are upstream and which are downstream of a reference broker along the publication flow.

Slide 7: GRAPE - Intro
• GRAPE: Greedy Relocation Algorithm for Publishers of Events.
• Goal: move publishers to the area with the highest-rated subscribers or the highest number of publication deliveries, depending on GRAPE's configuration.

Slide 8: GRAPE's Configuration
The configuration tells GRAPE which aspect of system performance to improve:
1. Prioritize minimizing either the average end-to-end delivery delay or the total system message rate (a.k.a. system load).
2. The weight on the prioritization falls on a scale between 0% (weakest) and 100% (full).
Example: prioritize minimizing load at 100% ("load 100").

Slide 9: Minimize Delivery Delay or Load?
Figure: a publisher issuing publications such as [class,'STOCK'], [symbol,'GOOG'], [volume,9900000], with one group of subscribers holding [class,=,'STOCK'], [symbol,=,'GOOG'], [volume,>,1000000] and another holding [class,=,'STOCK'], [symbol,=,'GOOG'], [volume,>,0]. The two branches carry 4 msg/s and 1 msg/s, illustrating how the load-optimal and delay-optimal placements can differ; the slide annotates the extremes as 100% load / 0% delay and 0% load / 100% delay.

Slide 10: GRAPE's 3 Phases
Operation of GRAPE is divided into three phases:
• Phase 1: Discover the location of publication deliveries by tracing live publication messages in trace sessions, then retrieve the trace and broker performance information.
• Phase 2: In a centralized manner, pinpoint the broker that minimizes the average delivery delay or the system load.
• Phase 3: Migrate the publisher to the broker decided on in Phase 2, transparently and with minimal routing table updates and message overhead.

Slide 11: Phase 1 - Logging Publication History
• Each broker records, per publisher, the publications delivered to local subscribers.
• G_threshold publications are traced per trace session.
• Each trace session is identified by the message ID of the first traced publication message of that session.
• Example: publications B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, and B34-M226 received since the start of trace session B34-M212 are summarized by the trace session ID B34-M212 plus the bit vector 01011111110000000, GRAPE's data structure representing the local delivery pattern.
• Requires each publication to store the trace session ID.
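The delivery log sketched on Slide 11 pairs a trace session ID with a bit vector of local deliveries. The following is a minimal illustrative sketch in Java; the class name, fields, and the way bits are indexed are our assumptions, not taken from the slides:

```java
import java.util.BitSet;

// Per-broker, per-publisher log of which traced publications were delivered
// to local subscribers during one trace session (Slide 11).
public class TraceLog {
    private final int gThreshold;                  // publications traced per session
    private String sessionId;                      // message ID of first traced publication
    private int seen;                              // traced publications seen this session
    private BitSet localDeliveries = new BitSet(); // bit i set if the i-th traced publication
                                                   // was delivered to a local subscriber

    public TraceLog(int gThreshold) { this.gThreshold = gThreshold; }

    /** Record one traced publication; returns true once the session is full. */
    public boolean record(String msgId, boolean deliveredLocally) {
        if (sessionId == null) sessionId = msgId;  // first traced message names the session
        if (deliveredLocally) localDeliveries.set(seen);
        seen++;
        return seen >= gThreshold;                 // time to send the trace reply upstream
    }

    public String sessionId() { return sessionId; }
    public BitSet bitVector() { return (BitSet) localDeliveries.clone(); }

    public void reset() { sessionId = null; seen = 0; localDeliveries = new BitSet(); }
}
```

Under this reading, a broker keeps one such record per publisher and, once the session fills, ships the session ID and bit vector upstream in its trace reply (Slides 12 and 13).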
Slide 12: Phase 1 - Trace Data and Broker Performance Retrieval
Figure: at the end of a trace session, trace replies flow upstream toward the publisher's broker and are aggregated hop by hop (Reply B8; Reply B7; Reply B8, B7, B6; Reply B8, B7, B6, B5).

Slide 13: Phase 1 - Contents of Trace Information
• Broker ID
• Neighbor ID(s)
• Bit vector (for estimating the total system message rate)
• Total number of local deliveries (for estimating the end-to-end delivery delay)
• Input queuing delay
• Average matching delay
• Output queuing delays to neighbor(s) and binding(s)
GRAPE adds 1 reply message per trace session.

Slide 14: Phase 2 - Broker Selection
• Estimate the average end-to-end delivery delay from the local delivery counts, the queuing and matching delays, and the publisher's ping times to the downstream brokers.
• Estimate the total broker message rate from the bit vectors.

Slide 15: Phase 2 - Estimating Average End-to-End Delivery Delay
Worked example from the figure (publisher ping time 10 ms; each broker annotated with its input queuing, matching, and output queuing delays; a worked formula appears after Slide 19):
• Subscriber at B1 (1 delivery): 10 + (30 + 20 + 100) × 1 = 160 ms
• Subscribers at B2 (2 deliveries): 10 + [(30 + 20 + 50) + (20 + 5 + 45)] × 2 = 350 ms
• Subscribers at B7 (9 deliveries): 10 + [(30 + 20 + 50) + (20 + 5 + 40) + (30 + 10 + 70)] × 9 = 2,485 ms
• Subscribers at B8 (5 deliveries): 10 + [(30 + 20 + 50) + (20 + 5 + 35) + (35 + 15 + 75)] × 5 = 1,435 ms
• Average end-to-end delivery delay: 10 + (150 + 340 + 2,475 + 1,425) ÷ 17 ≈ 268 ms

Slide 16: Phase 2 - Estimating Total Broker Message Rate
• Each broker's bit vector captures the publication deliveries to its local subscribers.
• The message rate through a broker is calculated by using the bitwise OR operator to aggregate the bit vectors of all downstream brokers (see the sketch after Slide 19).

Slide 17: Phase 2 - Minimizing Delivery Delay with Weight P%
1. Get ping times from the publisher.
2. Calculate the average delivery delay if the publisher were positioned at each of the downstream brokers.
3. Normalize, sort, and drop candidates whose average delivery delay is greater than 1 - P (0 ≤ P ≤ 1).
4. Calculate the total broker message rate if the publisher were positioned at each of the remaining candidate brokers.
5. Select the candidate that yields the lowest total system message rate.
(The selection sketch after Slide 19 walks through these steps.)

Slide 18: Phase 3 - Publisher Migration Protocol
Requirements:
• Transparent to the end-user publisher.
• Minimize network and computational overhead.
• No additional storage overhead.

Slide 19: Phase 3 - Example
Figure: the publisher migrates to B6. The brokers involved (1) update the last hop of P to B6, (2) remove all subscriptions S with B6 (respectively B5) as last hop, and (3) forward all matching subscriptions to B5. Question raised on the slide: how to tell when all subscriptions have been processed by B6 before P can publish again? The animation closes with a DONE marker once migration completes.
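One consistent reading of Slide 15's arithmetic (it reproduces the 268 ms figure) is the formula below, where n_b is the number of local deliveries at broker b, N is the total number of deliveries (17), and every hop on the path from the publisher P to b contributes its input queuing, matching, and output queuing delays; the notation is ours, not the slides':

```latex
\[
\bar{d} \;=\; t_{\text{ping}}
        \;+\; \frac{1}{N}\sum_{b} n_b
              \sum_{h \in \text{path}(P,\,b)} \bigl(q^{\text{in}}_h + m_h + q^{\text{out}}_h\bigr)
\;=\; 10 + \frac{1\cdot 150 + 2\cdot 170 + 9\cdot 275 + 5\cdot 285}{17}
\;\approx\; 268\ \text{ms}
\]
```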
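Slides 16 and 17 together describe how Phase 2 scores candidate brokers: OR the downstream bit vectors to estimate how much traced traffic a placement pushes through a broker, keep only the candidates whose normalized delay is within the configured weight, and pick the lowest-rate survivor. A rough sketch under those rules; the names (Candidate, aggregatedDeliveries, select) and the normalization detail are invented for illustration and do not claim to match the actual GRAPE code:

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

public class GrapeSelection {

    /** One candidate placement: a broker plus the trace-derived estimates used to score it. */
    public static class Candidate {
        final String brokerId;
        final double avgDelayMs;    // estimated average delivery delay (Slide 15)
        final double totalMsgRate;  // estimated total broker message rate (Slide 16)
        Candidate(String id, double delay, double rate) {
            brokerId = id; avgDelayMs = delay; totalMsgRate = rate;
        }
    }

    /** Slide 16: traffic through a broker is driven by the OR of all downstream bit vectors. */
    public static int aggregatedDeliveries(List<BitSet> downstreamBitVectors) {
        BitSet union = new BitSet();
        for (BitSet bv : downstreamBitVectors) union.or(bv);
        return union.cardinality();  // traced publications that must flow through this broker
    }

    /** Slide 17: prioritize delay with weight p (0 <= p <= 1), then choose by load. */
    public static Candidate select(List<Candidate> candidates, double p) {
        double min = candidates.stream().mapToDouble(c -> c.avgDelayMs).min().orElse(0);
        double max = candidates.stream().mapToDouble(c -> c.avgDelayMs).max().orElse(1);
        double span = Math.max(max - min, 1e-9);

        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            double normalized = (c.avgDelayMs - min) / span;  // 0 = best delay, 1 = worst
            if (normalized <= 1.0 - p) kept.add(c);           // drop candidates above 1 - P
        }
        // Among the surviving candidates, pick the lowest total system message rate.
        return kept.stream()
                   .min(Comparator.comparingDouble(c -> c.totalMsgRate))
                   .orElse(null);
    }
}
```

With this reading, p = 1 keeps only the delay-optimal candidates and p = 0 keeps every candidate and decides purely on load, which is one way to realize the 0% to 100% weighting scale from Slide 8.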
Slide 20: POP - Intro
• POP: Publisher Optimistic Placement.
• Goal: move publishers to the area with the highest publication delivery or concentration of matching subscribers.

Slide 21: POP's Methodology Overview
A 3-phase algorithm:
• Phase 1: Discover the location of publication deliveries by probabilistically tracing live publication messages; this runs continuously and efficiently, with minimal network, computational, and storage overhead.
• Phase 2: In a decentralized fashion, pinpoint the broker closest to the set of matching subscribers, using the trace data from Phase 1.
• Phase 3: Migrate the publisher to the broker decided on in Phase 2 (same as GRAPE's Phase 3).

Slide 22: Phase 1 - Aggregated Replies
• Figure: trace replies (e.g., Reply 9, Reply 5, Reply 15) are aggregated hop by hop on their way upstream and recorded in a Publisher Profile Table that maps each neighbor broker to its downstream delivery count.
• Multiple publication traces are aggregated by S_i = α · S_new + (1 - α) · S_(i-1) (a sketch of this aggregation follows Slide 30).
• In terms of message overhead, POP introduces 1 reply message per traced publication.

Slide 23: Phase 2 - Decentralized Broker Selection Algorithm
• Phase 2 starts when P_threshold publications have been traced.
• Goal: pinpoint the broker that is closest to the highest concentration of matching subscribers, using trace information from only a subset of brokers.
• The Next Best Broker condition: the next best neighboring broker is the one whose number of downstream subscribers is greater than the sum of all other neighbors' downstream subscribers plus the local broker's subscribers (see the sketch following Slide 30).

Slide 24: Phase 2 - Example
Figure: the decentralized selection walks across the broker overlay; the relocation request records AdvId: P, DestId: null, and the broker list B1, B5, B6, and the selection ends at B6.

Slide 25: Experiment Setup
Experiments run on both PlanetLab and a cluster testbed.
PlanetLab:
• 63 brokers, 1 broker per box
• 20 publishers with publication rates of 10 to 40 msg/min
• 80 subscribers per publisher, 1,600 subscribers in total
• P_threshold of 50, G_threshold of 50
Cluster testbed:
• 127 brokers, up to 7 brokers per box
• 30 publishers with publication rates of 30 to 300 msg/min
• 200 subscribers per publisher, 6,000 subscribers in total
• P_threshold of 100, G_threshold of 100

Slide 26: Average Input Utilization Ratio vs. Subscriber Distribution (graph)

Slide 27: Average Delivery Delay vs. Subscriber Distribution (graph)

Slide 28: Results Summary
Under a random workload:
• No significant performance differences between POP and GRAPE.
• The prioritization metric and weight have almost no impact on GRAPE's performance.
Increasing the number of publication samples in POP:
• Increases the response time.
• Increases the amount of message overhead.
• Increases the average broker message rate.
GRAPE reduces the input utilization ratio by up to 68%, the average message rate by 84%, the average delivery delay by 68%, and the message overhead relative to POP by 91%.

Slide 29: Conclusions and Future Work
POP and GRAPE move publishers toward the highest-rated or the highest number of matching subscribers to:
• Reduce load in the system (scalability), and/or
• Reduce the average delivery delay of publication messages (performance).
Future work: a subscriber relocation algorithm that works in concert with GRAPE.

Slide 30: Questions and Notes
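Referring back to Slide 22, POP folds each new trace reply into its Publisher Profile Table with an exponentially weighted moving average. A small sketch, where alpha is the smoothing weight (its value is not given on the slide) and the map keys stand for neighbor broker IDs; all names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Slide 22: per-publisher profile of downstream delivery counts, aggregated
// across traced publications with S_i = alpha * S_new + (1 - alpha) * S_(i-1).
public class PublisherProfileTable {
    private final double alpha;                         // smoothing weight in [0, 1]
    private final Map<String, Double> downstreamScore = new HashMap<>();

    public PublisherProfileTable(double alpha) { this.alpha = alpha; }

    /** Fold one trace reply (neighbor -> deliveries observed downstream) into the profile. */
    public void aggregate(Map<String, Integer> traceReply) {
        for (Map.Entry<String, Integer> e : traceReply.entrySet()) {
            double previous = downstreamScore.getOrDefault(e.getKey(), 0.0);
            downstreamScore.put(e.getKey(), alpha * e.getValue() + (1 - alpha) * previous);
        }
    }

    public double score(String neighborId) {
        return downstreamScore.getOrDefault(neighborId, 0.0);
    }
}
```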
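The Next Best Broker condition from Slide 23 then becomes a simple comparison over those aggregated counts. A sketch under the same assumptions (illustrative names, not POP's actual code):

```java
import java.util.Map;

public class NextBestBroker {
    /**
     * Slide 23: the next best neighboring broker is the one whose number of downstream
     * subscribers exceeds the sum of all other neighbors' downstream subscribers plus
     * the local broker's own subscribers. Returns that neighbor's ID, or null if no
     * neighbor qualifies (i.e., the local broker is the best placement found so far).
     */
    public static String nextBestNeighbor(Map<String, Double> neighborScores,
                                          double localSubscribers) {
        double total = localSubscribers;
        for (double s : neighborScores.values()) total += s;

        for (Map.Entry<String, Double> e : neighborScores.entrySet()) {
            double others = total - e.getValue();  // all other neighbors plus local subscribers
            if (e.getValue() > others) return e.getKey();
        }
        return null;  // no single direction dominates, so stop the walk here
    }
}
```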