Mercury: Scalable Routing for Range Queries
Ashwin R. Bharambe, Carnegie Mellon University
With Mukesh Agrawal and Srinivasan Seshan
SIGCOMM 2004 Ashwin R. Bharambe 2
Motivation
Lookup data in a distributed data store
Goals: scalable, efficient routing, load balance, etc.
State of the art: DHTs
Problem: exact-match queries only
More expressive queries?
These often rely on flooding or centralization!
Trade-off between expressivity and scalability
What can we achieve in a scalable manner?
Outline
Single-attribute range queries
Performance evaluation
Multi-attribute range queries
Discussion and summary
Distributed Hash Tables (DHT)
[Figure: ring of nodes with identifiers 0x00 through 0xf0. The key x = 1 is hashed to 0xb2 and located by following finger pointers in O(log n) hops.]
Using DHTs for Range Queries
No cryptographic hashing for key identifier
Query: 6 ≤ x ≤ 13
key = 6 → 0xab, key = 7 → 0xd3, …, key = 13 → 0x12
[Figure: ring of nodes with identifiers 0x00 through 0xf0, showing the query 6 ≤ x ≤ 13.]
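The effect of dropping the cryptographic hash can be illustrated with a minimal sketch that compares a hashed key-to-identifier mapping with an order-preserving one. The 8-bit identifier space and the helper names `hashed_id` and `ordered_id` are assumptions made for illustration; the identifiers printed are not the ones shown on the slide.

```python
import hashlib


def hashed_id(key: int) -> int:
    """Map a key to an 8-bit identifier using the first byte of its SHA-1 digest."""
    digest = hashlib.sha1(str(key).encode()).digest()
    return digest[0]


def ordered_id(key: int) -> int:
    """Order-preserving mapping: the key itself serves as the identifier."""
    return key


query = range(6, 14)  # the range query 6 <= x <= 13
hashed = [hashed_id(k) for k in query]
ordered = [ordered_id(k) for k in query]

# Hashed identifiers carry no ordering relationship to the keys,
# while ordered identifiers form one contiguous arc of the ring.
print("hashed :", [hex(i) for i in hashed])
print("ordered:", [hex(i) for i in ordered])
```

With the order-preserving mapping, the whole query range falls on one contiguous stretch of identifiers, so a range query only has to visit neighboring nodes.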
Using DHTs for Range Queries
Nodes in popular regions can be overloaded
Load imbalance!
DHTs with Load Balancing
Mercury load balancing strategy
Re-adjust responsibilities
Range ownerships are skewed!
DHTs with Load Balancing
[Figure: ring with nodes 0x00, 0x30, 0x80, 0x90, 0xa0, 0xb0, 0xd0, 0xe0, 0xf0, with a popular region marked.]
Finger pointers get skewed!
Each routing hop may not reduce the node-space by half, so there is no log(n) hop guarantee
Ideal Link Structure
[Figure: ring with nodes 0x00, 0x30, 0x80, 0x90, 0xa0, 0xb0, 0xd0, 0xe0, 0xf0, with a popular region marked.]
Mercury
Need to establish links based on node-distance
[Figure: values plotted against nodes; nodes 4 and 8 correspond to values v4 and v8.]
If we had the above information, then for finger i:
Estimate the value v for which the 2^i-th node is responsible
Mercury
Need to establish links based on node-distance
[Figure: values plotted against nodes, with nodes 4 and 8 corresponding to values v4 and v8; alongside, a plot of node-density against values.]
Piece-wise linear approximation (histogram)
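One way to turn a node-density histogram into finger targets is sketched below. It is a minimal illustration, not the authors' implementation: it assumes the histogram is a list of `(lo, hi, node_count)` buckets covering a circular value space, and that nodes are spread evenly inside each bucket (the piece-wise linear approximation). The function names and the sample histogram are hypothetical.

```python
def value_at_node_distance(buckets, start, distance):
    """Return the value reached after walking `distance` nodes clockwise
    from `start`, assuming nodes are spread evenly inside each bucket.

    buckets: sorted, contiguous (lo, hi, node_count) tuples covering the
    circular value space [buckets[0][0], buckets[-1][1]).
    """
    total = sum(count for _, _, count in buckets)
    remaining = distance % total  # whole laps bring us back to `start`
    first = next(i for i, (lo, hi, _) in enumerate(buckets) if lo <= start < hi)
    pos = start
    # The extra iteration revisits the first bucket after wrapping around.
    for step in range(len(buckets) + 1):
        lo, hi, count = buckets[(first + step) % len(buckets)]
        if step > 0:
            pos = lo
        nodes_ahead = count * (hi - pos) / (hi - lo)
        if count > 0 and remaining <= nodes_ahead:
            return pos + remaining * (hi - lo) / count
        remaining -= nodes_ahead
    return start  # only reached through floating-point round-off


def finger_targets(buckets, start, fingers):
    """Value targeted by finger i is the one owned by the 2^i-th node ahead."""
    return [value_at_node_distance(buckets, start, 2 ** i) for i in range(fingers)]


# Hypothetical histogram: the region [0, 100) is dense, the rest is sparse.
histogram = [(0, 100, 8), (100, 200, 2), (200, 300, 2), (300, 400, 4)]
print(finger_targets(histogram, 0, 4))
```

In the dense region the targets stay close together in value space, while the same node-distance in a sparse region spans a much wider range of values.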
Histogram Maintenance
Measure node-density locally
Gossip about it!
[Figure: ring of nodes 0x00 through 0xf0. A node sends "Request sample" messages and receives (Range, density) reports, which are assembled into a histogram of node-density against values.]
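As a rough sketch of how (Range, density) reports could be used, the code below estimates the total node count from a handful of samples. It assumes the samples come from nodes chosen uniformly at random and that density means nodes per unit of value space; under those assumptions the mean of 1/density approximates the average range owned per node. The function names and the four-node example are hypothetical.

```python
from statistics import mean


def local_sample(range_lo, range_hi, nodes_in_range):
    """Build a (lo, hi, density) report from a locally measured stretch of the ring."""
    return (range_lo, range_hi, nodes_in_range / (range_hi - range_lo))


def estimate_node_count(samples, domain_width):
    """Estimate the total number of nodes from sampled (lo, hi, density) reports.

    1/density is the value-space width owned per node near the sampled node;
    averaged over uniformly sampled nodes it approximates domain_width / n.
    """
    width_per_node = mean(1.0 / density for _, _, density in samples)
    return domain_width / width_per_node


# Hypothetical ring of four nodes owning ranges of width 10, 20, 30 and 40.
samples = [
    local_sample(0, 10, 1),
    local_sample(10, 30, 1),
    local_sample(30, 60, 1),
    local_sample(60, 100, 1),
]
print(estimate_node_count(samples, domain_width=100))
```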
Load Balancing
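Basic idea: leave-rejoin
Steps: (1) find the average and check whether the node is heavy or light; (2) light nodes perform a leave and rejoin
[Figure: load histogram with the average marked and heavy and light nodes labeled; axis labels 0, 10, 20, 25, 35, 45, 60, 65, 70, 75, 85, with 15 and 72.5 also marked.]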
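The leave-rejoin step can be sketched as follows. This is a simplified illustration under several assumptions: the threshold factor `alpha` is a hypothetical parameter not given on the slide, loads are kept in a plain list indexed by node, a leaving node hands its load to the next node in that list, and the rejoining node takes exactly half of the heavy node's load.

```python
def classify(load, average, alpha=2.0):
    """Label a node relative to the average load (alpha is an assumed threshold)."""
    if load > alpha * average:
        return "heavy"
    if load < average / alpha:
        return "light"
    return "ok"


def leave_rejoin(loads, light, heavy):
    """Return new loads after node `light` leaves and rejoins next to node `heavy`."""
    loads = list(loads)
    successor = (light + 1) % len(loads)
    loads[successor] += loads[light]  # leave: the successor absorbs the light node's load
    loads[light] = loads[heavy] / 2   # rejoin: take half of the heavy node's load
    loads[heavy] -= loads[light]
    return loads


loads = [1, 10, 40, 9]
average = sum(loads) / len(loads)
labels = [classify(load, average) for load in loads]
print(labels)
print(leave_rejoin(loads, light=0, heavy=2))
```

The total load is unchanged by the step; only its distribution across nodes moves toward the average.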
Outline
Single-attribute range queries
Performance evaluation
Multi-attribute range queries
Discussion and summary
Evaluation
Workload: several item insertions
Data chosen according to a Zipfian distribution
Values near 0x00 are most popular
Key questions:
Are the histograms accurate?
Are the routes efficient?
[Figure: value space running from 0x00 (popular) to 0xf0 (unpopular).]
Sampling Accuracy
Estimate of total node count by each participant
10000 nodes, Zipf-skewed distribution with load-balancing
[Figure: node-count estimate (L0 error) vs. node ID, with the correct value and the +1% and -1% bounds marked.]
Overlay Structure
Finger pointers created by different schemes
Nodes should pick a greater number of neighbors near them and few long links
[Figure: neighbor ID vs. node ID for three schemes: Chord/Symphony, Mercury, and Ideal.]
Routing Performance
[Figure: average #hops (0 to 200) vs. number of nodes (0 to 35000) for Naive DHT, Mercury, and Ideal.]
Outline
Single-attribute range queries
Performance evaluation
Multi-attribute range queries
Discussion and summary
Multi-attribute Range Queries
[Figure: rings Rx and Ry, with node ranges [0, 80), [80, 160), [160, 240), [240, 320) and [0, 105), [105, 210), [210, 320).]
Data item: x = 100, y = 200
Query: 50 ≤ x ≤ 150, 150 ≤ y ≤ 250
Send data to all rings
Send query to only one ring
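The routing rule on this slide can be sketched with the ranges from the figure. The assignment of the first four ranges to Rx and the last three to Ry is an assumption read off the extracted figure labels, and the function names are hypothetical.

```python
# Assumed ring layout: each ring lists the half-open ranges [lo, hi) owned by its nodes.
RINGS = {
    "x": [(0, 80), (80, 160), (160, 240), (240, 320)],
    "y": [(0, 105), (105, 210), (210, 320)],
}


def nodes_for_item(item):
    """A data item is stored in every ring, at the node owning its value there."""
    return {
        attr: next((lo, hi) for lo, hi in ranges if lo <= item[attr] < hi)
        for attr, ranges in RINGS.items()
    }


def nodes_for_query(query, attr):
    """A query is sent to one ring and visits the nodes overlapping [q_lo, q_hi]."""
    q_lo, q_hi = query[attr]
    return [(lo, hi) for lo, hi in RINGS[attr] if lo <= q_hi and q_lo < hi]


item = {"x": 100, "y": 200}
query = {"x": (50, 150), "y": (150, 250)}

print(nodes_for_item(item))
print(nodes_for_query(query, "x"))
print(nodes_for_query(query, "y"))
```

Because the item is present in both rings, routing the query through either ring alone reaches a node that stores it; the other attribute's predicate can then be checked at that node.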
Design Rationale
Queries span multiple nodes; one ring restricts propagation
Example: 0 < x < 1000 && 0 < y < 1000
Use histograms for selectivity estimation
Example: 0 < x < 100 && y = *
Send data-items to all rings vs. send queries to all rings?
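Selectivity estimation from histograms might look like the sketch below: for each ring, estimate how many nodes the query's range would touch, then route the query through the ring with the smallest estimate. The histogram format `(lo, hi, node_count)`, the treatment of a missing attribute as a wildcard, and the sample numbers are all assumptions for illustration.

```python
def estimated_nodes(histogram, lo, hi):
    """Estimate how many nodes own values in [lo, hi], assuming nodes are
    spread evenly inside each (b_lo, b_hi, node_count) bucket."""
    touched = 0.0
    for b_lo, b_hi, count in histogram:
        overlap = min(hi, b_hi) - max(lo, b_lo)
        if overlap > 0:
            touched += count * overlap / (b_hi - b_lo)
    return touched


def pick_hub(query, histograms):
    """Choose the attribute ring in which the query is most selective.

    query maps attribute -> (lo, hi); a missing attribute is a wildcard
    and is treated as covering that ring's whole value range.
    """
    best_attr, best_cost = None, float("inf")
    for attr, histogram in histograms.items():
        domain = (histogram[0][0], histogram[-1][1])
        lo, hi = query.get(attr, domain)
        cost = estimated_nodes(histogram, lo, hi)
        if cost < best_cost:
            best_attr, best_cost = attr, cost
    return best_attr


# Hypothetical histograms: ten nodes per ring over the value range [0, 1000).
histograms = {
    "x": [(0, 500, 5), (500, 1000, 5)],
    "y": [(0, 1000, 10)],
}
print(pick_hub({"x": (0, 100)}, histograms))  # y is a wildcard here
```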
Outline
Single-attribute range queries
Performance evaluation
Multi-attribute range queries
Discussion and summary
Alternate Designs
Virtual servers [Stoica02]
The number of virtual servers depends on the skew
Data-item distribution can have large skews
Many virtual servers mean high overhead
SkipNet [Harvey03]
Load balancing OR range queries
Load balanced skip graphs [Karger04, Aspnes04]
More complex to maintain
Need random sampling
Conclusions
Lesson: a little knowledge about a distributed system helps a lot!
Sampling and histogram maintenance
Useful for efficient routing
Load balancing
Selectivity estimation
Routing for range queries in P2P networks
Efficient in the face of skewed node ranges
Explicit load balancing
Multiple attributes
Dynamics
Node join:
Join one or more hubs by joining through some representative in a hub
Initialize the routing table from the representative
Start sampling to obtain a new histogram
Make new long-distance links
Obtain new cross-hub neighbors
Node leave:
Maintain successor lists
Repair successor-predecessor pointers
Repair long-distance links only when the number of nodes changes by a factor of 2
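The factor-of-2 rule for repairing long-distance links can be expressed as a small check. This is a sketch under assumptions: the node counts would come from the sampled estimate, and the function name and `factor` parameter are hypothetical.

```python
def needs_link_repair(nodes_at_last_repair, nodes_now, factor=2.0):
    """Return True when the estimated node count has grown or shrunk by
    at least `factor` since long-distance links were last rebuilt."""
    ratio = nodes_now / nodes_at_last_repair
    return ratio >= factor or ratio <= 1.0 / factor


print(needs_link_repair(1000, 1500))  # growth below the factor
print(needs_link_repair(1000, 2000))  # doubled
print(needs_link_repair(1000, 500))   # halved
```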
Histogram accuracy
[Figure: histogram error (log scale, 0.0001 to 1) vs. number of nodes queried per round (0 to 80), for #Reports = 1, 6, and 14.]
Routing Performance
[Figure: average #hops (0 to 200) vs. number of nodes (0 to 35000) for Naive DHT, Naive DHT + Cache, Mercury, and Ideal.]
Multiplayer Games
[Figure: Player 1 and Player 2 in a shared game world.]
Large shared world
Composed of map information, textures, etc.
Populated by active entities: user avatars, AI bots, etc.
Only parts of the world are relevant to a particular user/player
Gaming with Mercury
Key challenge: provide every player with relevant updates without central server
Use Mercury for performing distributed object discovery
Each player "registers" a range predicate
Bounding box region surrounding itself
Periodically updated
Player movements are “matched” against the queries
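The matching of movements against registered range predicates can be sketched locally as below. This only illustrates the predicate test, not the distributed routing; the `Box` type, the `radius` parameter, and the player positions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A player's registered range predicate: a bounding box around the player."""
    player: str
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float

    def contains(self, x, y):
        return self.x_lo <= x <= self.x_hi and self.y_lo <= y <= self.y_hi


def bounding_box(player, x, y, radius):
    """Build the range predicate for a player standing at (x, y)."""
    return Box(player, x - radius, x + radius, y - radius, y + radius)


def match(x, y, subscriptions):
    """Return the players whose registered box contains a movement at (x, y)."""
    return [box.player for box in subscriptions if box.contains(x, y)]


subscriptions = [
    bounding_box("player1", 100, 200, radius=50),  # 50 <= x <= 150, 150 <= y <= 250
    bounding_box("player2", 300, 300, radius=50),
]
print(match(120, 180, subscriptions))
```

Each player would re-register its box periodically as it moves, so the set of matching subscribers follows the players around the world.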
Attribute Rings
One hub for each attribute
Linearization to support multiple attributes within a ring
Single node may participate in multiple rings
[Figure: rings in the system. Attributes name, Age, x, y map to rings name, Age+weight, x, y, connected by intra-ring links and cross-ring links.]
Hub = routing ring