
MIDDLEWARE SYSTEMS RESEARCH GROUP
MSRG.ORG
Hans-Arno Jacobsen
June 23, 2011
Resource Allocation Algorithms for Publish/Subscribe Systems
http://padres.msrg.org
Joint work with Alex King Yeung Cheung

Green Resource Allocation Algorithms for Publish/Subscribe Systems

ICDCS 2011
Publish/Subscribe in Practice
• GooPS
  ▫ Google’s internal pub/sub messaging middleware to integrate applications across data centers
  ▫ Hundreds of brokers with tens of thousands of pub/sub clients
• Yahoo Message Broker
  ▫ Yahoo’s pub/sub middleware
  ▫ Used, for example, in the PNUTS key/value store (cf. VLDB’08)
• SuperMontage
  ▫ Tibco’s pub/sub distribution network for NASDAQ’s quote and order processing
• GDSN (Global Data Synchronization Network)
  ▫ A global pub/sub network that allows retailers and suppliers (e.g., Walmart, Target, Metro) to exchange timely and accurate supply chain data
(Distributed and brokered publish/subscribe)

Problem
[Figure: Input and Output views of the brokers, publishers (P), and subscribers (S); a broker in the figure is marked "Overload!"]
Deployment strategy that uses the least number of brokers?

Challenges
• Brokers have limited and heterogeneous resource capacities
  – Computational
  – I/O or bandwidth
  – Memory and storage
• Publishers publish at different message rates
• Subscribers have unique interests that sink zero or more publications from zero or more publishers

Challenges When Scaling Up
[Figure: publishers (P) and subscribers (S) attached to a network of brokers, annotated with the questions below]
• How to connect the brokers to minimize traffic while avoiding overload?
• How to allocate subscribers to brokers?
• How to connect the publishers if subscribers sink traffic from >2 publishers?
This is an NP-complete problem!

Additional Requirements
• Minimize
  – Amount of processing
  – Amount of messages forwarded
• Work effectively under any workload distribution (defined or undefined)
• Readily adaptable to any pub/sub system by being language independent
  – Content-based (XPath, regex, ranged, SQL, composite subscriptions, etc.)
  – Topic-based pub/sub

Summary of Our Approach
• Phase 1: Subscription profiling (& publisher)
  – Record publications delivered to each subscription
• Phase 2: Subscription to broker allocation
  – Allocate subscriptions to brokers depending on the load induced by each subscription
• Phase 3: Broker overlay construction
  – Construct and configure broker overlay
• Apply publisher re-allocation (GRAPE, cf. ICDCS’2010)
(A customizable framework)

Phase 1: Subscription Profiling
[Figure: profile of one subscription. Labels: "Message ID of first index" (B34-M213), "Start of bit vector", "Message ID", and "Publications delivered to subscription" (B34-M213, B34-M215, B34-M216, B34-M217, B34-M220, B34-M222, B34-M225, B34-M226); a bit is set to 1 for each delivered publication]
• Profile of each subscription per advertisement is maintained at the subscriber’s first broker
• Fixed vector size; shift left if the next publication is out of the bit vector range
• Cardinality of the bit vector approximates the bandwidth requirement of the subscription
• Used to compute “closeness” between any two subscriptions in the allocation phase, based on a clustering algorithm, e.g., closeness = |si ∩ sj|
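A minimal Python sketch of such a profile may help make the shifting behaviour concrete. It assumes message IDs can be reduced to integer sequence numbers (e.g., 213 for B34-M213) and that publications older than the current window are simply ignored; the class and function names are hypothetical and not taken from PADRES.

```python
class SubscriptionProfile:
    """Fixed-size bit vector of publications delivered to one subscription."""

    def __init__(self, size=16):
        self.size = size
        self.first_id = None      # message ID (sequence number) of index 0
        self.bits = [0] * size

    def record(self, msg_id):
        """Mark the publication with integer ID msg_id as delivered."""
        if self.first_id is None:
            self.first_id = msg_id
        offset = msg_id - self.first_id
        if offset < 0:
            return                # older than the window: ignored (assumption)
        if offset >= self.size:
            shift = offset - self.size + 1
            # Shift left: drop the oldest bits and append zeros on the right.
            self.bits = (self.bits[shift:] + [0] * shift)[-self.size:]
            self.first_id += shift
            offset = self.size - 1
        self.bits[offset] = 1

    def cardinality(self):
        """Number of set bits; approximates the bandwidth requirement."""
        return sum(self.bits)

    def delivered_ids(self):
        """Set of message IDs currently marked as delivered."""
        return {self.first_id + i for i, b in enumerate(self.bits) if b}


def intersect_closeness(p, q):
    """Closeness |si ∩ sj| between two profiles of the same advertisement."""
    return len(p.delivered_ids() & q.delivered_ids())


profile = SubscriptionProfile(size=16)
for msg in (213, 215, 216, 217, 220, 222, 225, 226):
    profile.record(msg)
```

Comparing delivered ID sets rather than raw bits keeps the closeness computation correct even when two profiles have shifted by different amounts.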

Phase 2: Subscription Allocation Algorithms
• MANUAL & AUTOMATIC as baseline
  – Tree with fanout of 2; random placement of clients (manual)
  – Random allocation (automatic)
• Fastest Broker First (FBF)
  – Assign subscriptions randomly to the next most powerful broker
• Bin Packing
  – Like FBF, but assigns the next highest traffic subscription
• PAIRWISE-N, PAIRWISE-K (Riabov et al. ICDCS’02)
  – Pairwise subscription clustering where the number of clusters is specified beforehand
• CRAM (Clustering with Resource Awareness and Minimization)
  – Dynamically determines the number of clusters
  – Utilizes a novel one-to-many clustering scheme
  – Evaluated with 4 different subscription closeness metrics, with one derived from Banavar et al. ICDCS '99
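The difference between FBF and Bin Packing can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the implementation evaluated in the talk: each broker has a single scalar capacity, each subscription a single scalar load, and a subscription goes to the most powerful broker that still has room (first fit). The function name and parameters are hypothetical.

```python
def allocate(loads, capacities, by_traffic=True):
    """Place subscriptions on brokers, most powerful broker first.

    loads: {subscription: load}; capacities: list of broker capacities.
    by_traffic=True sorts subscriptions by decreasing load (Bin Packing);
    by_traffic=False keeps the caller's order, e.g. a shuffled one (FBF).
    Returns {broker_index: [subscription, ...]}, or None if one does not fit.
    """
    order = sorted(range(len(capacities)),
                   key=lambda i: capacities[i], reverse=True)
    free = list(capacities)
    subs = list(loads)
    if by_traffic:
        subs.sort(key=lambda s: loads[s], reverse=True)
    placement = {}
    for s in subs:
        for i in order:
            if loads[s] <= free[i]:
                free[i] -= loads[s]
                placement.setdefault(i, []).append(s)
                break
        else:
            return None   # no broker has enough remaining capacity
    return placement


result = allocate({"s1": 5, "s2": 4, "s3": 3, "s4": 2}, [6, 10, 6])
```

In this example only two of the three brokers end up hosting subscriptions; the third stays unallocated.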

Allocation with Bin Packing
[Figure: subscriptions (S) to be allocated to brokers]

Allocation Result (Bin Packing)
[Figure: subscriptions (S) after allocation to brokers]

Allocation with CRAM
1. Find and cluster a pair of subscriptions having next highest non-zero “closeness”
2. Run BIN PACKING algorithm with new pairing
3. Allocation fails, if:
  – More brokers are allocated than without this pairing
  – Not all subscriptions can be allocated to brokers
4. On failure, undo and remember incompatible pairing
5. Repeat loop until no more pairings can be found
• Initially BIN PACKING is run to determine initial allocation
• Pairings found are combined and re-inserted in sub pool
• Final subscription clustering is last successful allocation
(Basic version)
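The basic loop can be sketched in Python as follows. This is a simplified reading of the steps above, not the authors' implementation. It assumes each subscription profile is a set of delivered message IDs, closeness is |si ∩ sj|, the load of a cluster is the size of the union of its members' profiles, and brokers have one scalar capacity. The names bin_pack and cram_basic are hypothetical.

```python
from itertools import combinations


def bin_pack(loads, capacities):
    """First-fit decreasing; returns brokers used, or None if a load does not fit."""
    free = sorted(capacities, reverse=True)
    used = set()
    for load in sorted(loads, reverse=True):
        for i, room in enumerate(free):
            if load <= room:
                free[i] -= load
                used.add(i)
                break
        else:
            return None
    return len(used)


def cram_basic(profiles, capacities):
    """profiles: list of sets of message IDs, one per subscription."""
    # A cluster is (member indices, union of member profiles).
    clusters = [((i,), frozenset(p)) for i, p in enumerate(profiles)]
    best = bin_pack([len(c[1]) for c in clusters], capacities)
    if best is None:
        raise ValueError("initial allocation failed")
    incompatible = set()
    while True:
        # Step 1: candidate pairs with non-zero closeness, not yet rejected.
        pairs = [(len(a[1] & b[1]), a, b)
                 for a, b in combinations(clusters, 2)
                 if a[1] & b[1] and frozenset((a[0], b[0])) not in incompatible]
        if not pairs:
            break                       # step 5: no more pairings
        _, a, b = max(pairs, key=lambda t: t[0])
        merged = (a[0] + b[0], a[1] | b[1])
        trial = [c for c in clusters if c is not a and c is not b] + [merged]
        # Step 2: re-run bin packing with the new pairing.
        used = bin_pack([len(c[1]) for c in trial], capacities)
        if used is None or used > best:
            # Steps 3 and 4: failure, so undo and remember the pairing.
            incompatible.add(frozenset((a[0], b[0])))
        else:
            clusters, best = trial, used  # last successful allocation
    return clusters, best


clusters, brokers_used = cram_basic(
    [{1, 2, 3, 4}, {1, 2, 3, 4}, {5, 6}, {7, 8}], [4, 4, 4, 4])
```

Here the two subscriptions with identical traffic are clustered, which lets the example fit on two brokers instead of three.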

Summary of Optimizations
• Grouping of subscriptions with equal profiles
  – Apply CRAM on groups
  – In our experiments, reductions of up to 61%
• Limit closeness computations among groups
  – Exploit covering relationships among subscriptions
  – Disregard groups with small closeness
  – In our experiments, a 20x improvement, roughly
• One-to-many clustering
  – Cluster groups of subscriptions & covered subs
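The first optimization, grouping subscriptions with equal profiles, can be illustrated with a short Python sketch. It assumes a profile is available as a collection of delivered message IDs; the function name is hypothetical.

```python
from collections import defaultdict


def group_equal_profiles(profiles):
    """profiles: {subscription: iterable of delivered message IDs}.

    Returns {profile (frozenset): [subscriptions sharing that profile]}.
    """
    groups = defaultdict(list)
    for sub, ids in profiles.items():
        groups[frozenset(ids)].append(sub)
    return dict(groups)


groups = group_equal_profiles({"s1": [1, 2], "s2": [2, 1], "s3": [3]})
```

Clustering then operates on the groups rather than on individual subscriptions, so fewer closeness values need to be computed.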

Closeness Metrics
Ideally, find subscriptions sharing highest overlap in traffic, while introducing least amount of non-overlapping traffic.
• Intersect: |si ∩ sj|
  – Good for highest overlap
• XOR: 1 / |si XOR sj| (if |si XOR sj| is 0, defined as MAXVAL)
  – Good for least non-overlapping traffic
  – Derived from Banavar et al. ICDCS '99
• IOS (intersection over sum): |si ∩ sj|² / (|si| + |sj|)
• IOU (intersection over union): |si ∩ sj|² / |si ∪ sj|
  – IOS and IOU are good for both conditions, yield 0 for empty relationships, and favour clustering higher traffic subs
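As an illustration, the four closeness metrics can be written over Python sets of message IDs. Representing MAXVAL as floating-point infinity and returning 0 for two empty profiles are assumptions of this sketch.

```python
MAXVAL = float("inf")   # assumption: stand-in for the slide's MAXVAL


def intersect(si, sj):
    return len(si & sj)


def xor(si, sj):
    diff = len(si ^ sj)             # non-overlapping traffic
    return MAXVAL if diff == 0 else 1 / diff


def iou(si, sj):
    union = len(si | sj)
    return len(si & sj) ** 2 / union if union else 0.0


def ios(si, sj):
    total = len(si) + len(sj)
    return len(si & sj) ** 2 / total if total else 0.0


a, b = {1, 2, 3, 4}, {3, 4, 5}
disjoint = xor({1}, {2}), ios({1}, {2})
```

For the two disjoint subscriptions, XOR still yields a non-zero closeness while IOS yields 0, which is why XOR cannot distinguish an empty relationship from a weak one.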

Traditional One-to-One Clustering
[Figure: bit vector of S1 and bit vector of S2, with regions labelled S1a, S1b, S1c and S2a to S2h]
Closeness, C = |si ∩ sj|² / (|si| + |sj|)
C = 8²/(36+24) = 1.07
C = 4²/(36+4) = 0.4
C = 1²/(24+1) = 0.04

New One-to-Many Clustering
[Figure: bit vector of S1 and bit vector of S2, with regions labelled S1a, S1b, S1c and S2a to S2h]
C = |si ∩ sj|² / (|si| + |sj|)
C = 8²/(36+24) = 1.07
C = 4²/(36+4) = 0.4
C = 1²/(24+1) = 0.04
C = 12²/(36+12) = 3
C = 8²/(24+8) = 2
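The closeness values on this slide can be reproduced from the overlap and cardinality counts alone. The short Python check below only redoes that arithmetic; which subscriptions each count belongs to is shown in the figure and is not assumed here. The function name is hypothetical.

```python
def ios_from_counts(overlap, size_i, size_j):
    """Intersection-over-sum closeness: overlap² / (|si| + |sj|)."""
    return overlap ** 2 / (size_i + size_j)


one_to_one = [ios_from_counts(8, 36, 24),
              ios_from_counts(4, 36, 4),
              ios_from_counts(1, 24, 1)]
one_to_many = [ios_from_counts(12, 36, 12),
               ios_from_counts(8, 24, 8)]
```

The two additional candidates score higher (3 and 2) than the best one-to-one candidate (about 1.07), so they are the ones a highest-closeness-first clustering would pick.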

Phase 3: Broker Overlay Construction
[Figure: brokers with their allocated subscriptions (S), to be connected into an overlay]

Bin Packing’s Final Overlay
[Figure: final broker overlay with subscriptions (S) and publishers (P); the publishers are labelled "GRAPE"]

Greedy Relocation Algorithm for Publishers of Events (GRAPE)
• Distributed algorithm that dynamically relocates publishers to minimize
  – Broker message rates, and/or
  – Delivery delay
• Similar three-phased design:
  1. Profile load of subscriptions matching each publisher
  2. Determine the placement strategy that minimizes the specified metric
  3. Transparently migrate the publisher
• Cf. GRAPE paper from ICDCS 2010

Evaluation
• Implemented on the PADRES open source content-based publish/subscribe system
• Evaluated on a cluster testbed using 80 brokers
• Evaluated on SciNet using 1000 brokers
• Comparison against two related approaches (Riabov et al. ICDCS’02, Banavar et al. ICDCS’99)
• Homogeneous and heterogeneous scenarios
• Workload saturates the initial deployment (MANUAL)
http://padres.msrg.org

Output Utilization Ratio
Resource aware algorithms make full use of allocated resources

Broker Message Rate
• CRAM reduced message rate by up to 92%
• Clustering significantly reduces message rate
• Allocating fewer brokers does not help

Number of Allocated Brokers
• Reduces number of allocated brokers by up to 91%
• Uses all resources

Computation Time
91% improvement at only 30% higher computation time

Impact of Publisher Relocation & Subscription Clustering
50% reduction in broker message rate

Broker Message Rates Using Various Closeness Metrics
XOR closeness metric cannot identify empty relations

Conclusions
• CRAM combines the benefits of
  ▫ Subscription clustering from PAIRWISE-N/K
  ▫ Resource awareness from Bin Packing
  by simultaneously reducing both
  ▫ Broker message rate (up to 92%)
  ▫ Number of allocated brokers (up to 91%)
  to meet green IT objectives!
• By using bit vectors, CRAM is
  ▫ Language independent (XPath, regex, topics)
  ▫ Effective for any workload distribution

Q & A


Future Work
• React dynamically by growing and shrinking the network in incremental steps
• Improve runtime of the CRAM algorithm by parallelization or reducing its computational complexity
• Model workload with more sophisticated methods, such as stochastic processes, to improve accuracy of load estimation
• Address fault resiliency

Related Works - Clustering
• Riabov et al. (ICDCS’02)
  ▫ The number of clusters K is pre-specified
  ▫ Each cluster is a multicast address, thus there is no upper limit on its size
  ▫ Event space is divided into grids
  ▫ Supports only ranged subscriptions
  ▫ Their pairwise clustering considers each subscription individually
• Gryphon (ICDCS'99)
  ▫ Supports only equal and * subscriptions
  ▫ Each cluster is stored in memory, so the upper bound on its size is not a major concern
• SUB-2-SUB (IPTPS'06)
  ▫ Supports only ranged subscriptions
  ▫ Each cluster is a p2p network, thus there is no upper limit on the cluster size

Related Works – Broker Overlay Construction, Publisher and Subscriber Placement Algorithms
• Baldoni et al. (The Computer Journal)
• Jaeger et al. (SAC'07)
• Migliavacca et al. (DEBS’07)
  – Reconfigure broker overlay to reduce delivery delay and broker processing load
• Cheung et al. (Middleware’06, ICDCS’10)
  – Load balancing by relocating subscriber clients
  – Reduce delivery delay and broker processing load by relocating publisher clients

Hop Count Using Various Closeness Metrics

Computation Time vs. Bit Vector Size

Allocated Brokers vs. Bit Vector Size

Average Hop Count

Computation Time Using Various Closeness Metrics
108% higher computation time using Gryphon-derived closeness metric (XOR).

Delivery Delay
Overload with Pairwise-K