Mercury: Scalable Routing for Range Queries

Ashwin R. Bharambe
Carnegie Mellon University
With Mukesh Agrawal, Srinivasan Seshan

SIGCOMM 2004 Ashwin R. Bharambe 2

Motivation

Lookup data in a distributed data store
  Scalable, efficient routing, load balance, etc.

State-of-the-art: DHTs
  Problem: exact match queries only

More expressive queries?
  Often rely on flooding or centralization!
  Trade-off between expressivity and scalability

What can we achieve in a scalable manner?

Outline

Single-attribute range queries

Performance evaluation

Multi-attribute range queries

Discussion and summary

Distributed Hash Tables (DHT)

[Figure: ring of 16 nodes with IDs 0x00, 0x10, ..., 0xf0. The key x = 1 is hashed to identifier 0xb2, and finger pointers route the lookup around the ring.]

O(log n) hops
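The O(log n) hop count can be illustrated with a minimal sketch of greedy finger routing. It assumes Chord-style conventions that the slide does not spell out: an 8-bit identifier ring, finger i pointing at the successor of n + 2^i, and a key being owned by the first node at or after it. All function names are hypothetical.

```python
from bisect import bisect_left

RING_BITS = 8                     # 8-bit identifiers, matching the 0x00..0xf0 node IDs
RING_SIZE = 1 << RING_BITS


def successor(nodes, ident):
    """First node ID at or after ident on the ring (nodes must be sorted)."""
    i = bisect_left(nodes, ident % RING_SIZE)
    return nodes[i % len(nodes)]


def fingers(nodes, n):
    """Chord-style finger table (an assumption): successor(n + 2^i) for each bit i."""
    return [successor(nodes, n + (1 << i)) for i in range(RING_BITS)]


def clockwise(a, b):
    """Clockwise distance from a to b on the ring."""
    return (b - a) % RING_SIZE


def lookup(nodes, start, key):
    """Greedy routing: take the farthest finger that does not pass the key."""
    owner = successor(nodes, key)
    current, hops = start, 0
    while current != owner:
        remaining = clockwise(current, key)
        candidates = [f for f in fingers(nodes, current)
                      if 0 < clockwise(current, f) <= remaining]
        if candidates:
            current = max(candidates, key=lambda f: clockwise(current, f))
        else:
            current = owner       # no finger fits: the last hop goes to the successor
        hops += 1
    return owner, hops


nodes = list(range(0x00, 0x100, 0x10))    # 0x00, 0x10, ..., 0xf0
print(lookup(nodes, 0x00, 0xb2))          # owner and hop count for the slide's key
```

With 16 evenly spaced nodes, the lookup for 0xb2 starting at 0x00 takes 4 hops, which equals log2(16).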

Using DHTs for Range Queries

No cryptographic hashing for key identifier

Query: 6 ≤ x ≤ 13

Hashed identifiers of consecutive keys:
  key = 6 → 0xab
  key = 7 → 0xd3
  …
  key = 13 → 0x12

[Figure: ring of 16 nodes with IDs 0x00, 0x10, ..., 0xf0, annotated with the query 6 ≤ x ≤ 13.]
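The contrast between hashed and order-preserving placement can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: SHA-1 stands in for the cryptographic hash, the key domain [0, 64) is hypothetical, and a key is assumed to be owned by the first node at or after its identifier.

```python
import hashlib
from bisect import bisect_left

NODES = list(range(0x00, 0x100, 0x10))    # node IDs 0x00, 0x10, ..., 0xf0


def owner(ident):
    """Node responsible for an identifier: first node at or after it (wrapping)."""
    return NODES[bisect_left(NODES, ident) % len(NODES)]


def hashed_id(key):
    """Cryptographic placement: first byte of the SHA-1 digest of the key."""
    return hashlib.sha1(str(key).encode()).digest()[0]


def ordered_id(key, lo=0, hi=64):
    """Order-preserving placement: scale the key's position in [lo, hi) onto the ring."""
    return (key - lo) * 256 // (hi - lo)


def nodes_for_range(lo, hi, place):
    """Sorted set of nodes holding any key in the inclusive range lo..hi."""
    return sorted({owner(place(k)) for k in range(lo, hi + 1)})


print([hex(n) for n in nodes_for_range(6, 13, ordered_id)])   # contiguous nodes
print([hex(n) for n in nodes_for_range(6, 13, hashed_id)])    # nodes chosen by the hash
```

Under the order-preserving mapping the query 6 ≤ x ≤ 13 touches three adjacent nodes, so it can be answered by routing to the first one and walking successors. Under hashing each key has to be looked up separately.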

Using DHTs for Range Queries

Nodes in popular regions can be overloaded

Load imbalance!

DHTs with Load Balancing

Mercury load balancing strategy

Re-adjust responsibilities

Range ownerships are skewed!

DHTs with Load Balancing

[Figure: ring with nodes 0x00, 0x30, 0x80, 0x90, 0xa0, 0xb0, 0xd0, 0xe0, 0xf0 and a marked popular region.]

Finger pointers get skewed!

Each routing hop may not reduce node-space by half!
  No log(n) hop guarantee

Ideal Link Structure

[Figure: ring with nodes 0x00, 0x30, 0x80, 0x90, 0xa0, 0xb0, 0xd0, 0xe0, 0xf0 and a marked popular region, showing the ideal link structure.]

Mercury

Need to establish links based on node-distance

[Figure: plot of Values against Nodes; nodes 4 and 8 map to values v4 and v8.]

If we had the above information…
  For finger i: estimate value v for which the 2^i-th node is responsible

Mercury

Need to establish links based on node-distance

[Figure: top plot, Values against Nodes, with nodes 4 and 8 mapping to values v4 and v8; bottom plot, Node-density against Values.]

Piece-wise linear approximation
Histogram
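One way to read the finger estimation is as inverting a piece-wise linear cumulative node count built from the histogram. The sketch below assumes a histogram of (lo, hi, nodes) buckets that tile the value space in order, with nodes spread uniformly inside each bucket. The bucket values and function names are hypothetical, and this is not necessarily how Mercury itself picks link targets.

```python
def nodes_below(histogram, value):
    """Estimated number of nodes owning values below `value` (piece-wise linear)."""
    count = 0.0
    for lo, hi, nodes in histogram:
        if value >= hi:
            count += nodes
        elif value > lo:
            count += nodes * (value - lo) / (hi - lo)
    return count


def value_at_rank(histogram, rank):
    """Inverse of nodes_below: the value by which `rank` nodes have been passed."""
    for lo, hi, nodes in histogram:
        if nodes > 0 and rank <= nodes:
            return lo + (hi - lo) * rank / nodes
        rank -= nodes
    return histogram[-1][1]


def finger_targets(histogram, my_value):
    """Values v_1, v_2, v_4, ... owned by the nodes 2^i positions clockwise."""
    total = sum(nodes for _, _, nodes in histogram)
    my_rank = nodes_below(histogram, my_value)
    targets, i = [], 0
    while (1 << i) < total:
        rank = (my_rank + (1 << i)) % total    # wrap around the ring
        targets.append(value_at_rank(histogram, rank))
        i += 1
    return targets


# Hypothetical skewed histogram: half of the 16 nodes sit in [128, 192).
hist = [(0, 64, 2), (64, 128, 2), (128, 192, 8), (192, 256, 4)]
print(finger_targets(hist, 0))    # [32.0, 64.0, 128.0, 160.0]
```

Here v4 = 128 and v8 = 160, whereas value-based fingers over a uniform assumption would have aimed at 64 and 128 and skipped far fewer nodes than intended.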

Histogram Maintenance

[Figure: ring with nodes 0x00, 0x30, 0x70, 0x80, 0x90, 0xa0, 0xb0, 0xd0, 0xe0, 0xf0. One node sends "Request sample" messages and receives (Range, density) replies. Inset plot: Node-density against Values.]

Measure node-density locally
Gossip about it!
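A minimal sketch of turning collected (Range, density) samples into a histogram and a node-count estimate follows. It assumes each sample is a (lo, hi, density) tuple with density in nodes per unit of value, and it leaves out the sampling and gossip exchange itself. The bucket count, value space, and sample values are hypothetical.

```python
def build_histogram(samples, space=(0, 256), buckets=4):
    """Average sample densities per bucket, weighted by how much each sample overlaps it."""
    lo0, hi0 = space
    width = (hi0 - lo0) / buckets
    hist = []
    for b in range(buckets):
        b_lo, b_hi = lo0 + b * width, lo0 + (b + 1) * width
        covered = weighted = 0.0
        for s_lo, s_hi, density in samples:
            overlap = max(0.0, min(b_hi, s_hi) - max(b_lo, s_lo))
            covered += overlap
            weighted += overlap * density
        est_density = weighted / covered if covered else 0.0   # no sample: assume empty
        hist.append((b_lo, b_hi, est_density * width))         # (lo, hi, estimated nodes)
    return hist


def estimated_node_count(hist):
    """Total node count implied by the histogram."""
    return sum(nodes for _, _, nodes in hist)


# Hypothetical samples: sparse in [0, 128), dense in [128, 192).
samples = [(0, 128, 4 / 128), (128, 192, 8 / 64), (192, 256, 4 / 64)]
hist = build_histogram(samples)
print([nodes for _, _, nodes in hist])    # [2.0, 2.0, 8.0, 4.0]
print(estimated_node_count(hist))         # 16.0
```

Buckets that no sample covers are treated as empty here, which underestimates the count. More samples per round reduce that error.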

Load Balancing

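Basic idea: leave-rejoin

Steps
  Find average, check if heavy or light
  Light nodes perform a leave and rejoin

[Figure: "Load histogram": Load plotted over a value axis (ticks 0, 10, 20, 25, 35, 45, 60, 65, 70, 75, 85, with 15 and 72.5 also marked), with the Average level and Heavy and Light nodes indicated.]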
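The leave-rejoin step can be sketched as below. Several details are assumptions rather than statements from the slide: the threshold `alpha` that separates heavy and light nodes from the average, the choice to rejoin next to the heaviest node, and the idea that splitting a range in half also halves its load. The node records are hypothetical.

```python
def leave_and_rejoin(nodes, alpha=2.0):
    """One balancing step over nodes sorted by range; node dicts are updated in place.

    A node is light below avg / alpha and heavy above alpha * avg (alpha is assumed).
    """
    avg = sum(n["load"] for n in nodes) / len(nodes)
    light = min(nodes, key=lambda n: n["load"])
    heavy = max(nodes, key=lambda n: n["load"])
    if light["load"] >= avg / alpha or heavy["load"] <= alpha * avg:
        return nodes                                  # nothing to balance

    # Leave: a neighbor absorbs the light node's range and load.
    i = nodes.index(light)
    rest = nodes[:i] + nodes[i + 1:]
    if i < len(rest):
        neighbor = rest[i]
        neighbor["lo"] = light["lo"]
    else:
        neighbor = rest[i - 1]
        neighbor["hi"] = light["hi"]
    neighbor["load"] += light["load"]

    # Rejoin: take the lower half of the heavy node's range (and half its load).
    mid = (heavy["lo"] + heavy["hi"]) / 2
    moved = heavy["load"] / 2
    newcomer = {"name": light["name"], "lo": heavy["lo"], "hi": mid, "load": moved}
    heavy["lo"], heavy["load"] = mid, heavy["load"] - moved
    j = rest.index(heavy)
    return rest[:j] + [newcomer] + rest[j:]


ring = [
    {"name": "A", "lo": 0, "hi": 25, "load": 1},
    {"name": "B", "lo": 25, "hi": 50, "load": 4},
    {"name": "C", "lo": 50, "hi": 75, "load": 20},
    {"name": "D", "lo": 75, "hi": 100, "load": 3},
]
for n in leave_and_rejoin(ring):
    print(n["name"], n["lo"], n["hi"], n["load"])
```

In this example the average load is 7, so A (load 1) is light and C (load 20) is heavy. After the step A sits next to C and the maximum load drops from 20 to 10.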

Outline

Single-attribute range queries

Performance evaluation

Multi-attribute range queries

Discussion and summary

Evaluation

Workload
  Several item insertions
  Data chosen according to Zipfian distribution
  Values near 0x00 most popular

Key questions:
  Are the histograms accurate?
  Are the routes efficient?

[Figure: value axis running from 0x00 (popular) to 0xf0 (unpopular).]

Sampling Accuracy

Estimate of total node count by each participant
  10000 nodes, Zipf-skewed distribution with load-balancing

[Figure: node-count estimate (L0 error) against node ID, with reference lines for the correct value, +1%, and -1%.]

Overlay Structure

Finger pointers created by different schemes
  Nodes should pick greater number of neighbors near them and few long links

[Figure: three panels, "Chord/Symphony", "Mercury", and "Ideal", each plotting neighbor ID against node ID.]

Routing Performance

[Figure: average #hops (axis 0 to 200) against number of nodes (axis 0 to 35000) for Naive DHT, Mercury, and Ideal.]

Outline

Single-attribute range queries

Performance evaluation

Multi-attribute range queries

Discussion and summary

Multi-attribute Range Queries

[Figure: two rings, Rx and Ry, each partitioned into node ranges. Range labels shown: [0, 80), [80, 160), [160, 240), [240, 320) and [0, 105), [105, 210), [210, 320).]

Query
  50 ≤ x ≤ 150
  150 ≤ y ≤ 250

Data item
  x = 100
  y = 200

Send data to all rings
Send query to only one ring
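The rule of sending data to all rings and each query to one ring can be sketched as follows. The assignment of the four-range partition to Rx and the three-range partition to Ry is an assumption (the extracted figure does not say which is which), and the in-memory `store` is a hypothetical stand-in for the nodes of each ring.

```python
# Assumed partitions: which ring owns which set of ranges is a guess.
RINGS = {
    "x": [(0, 80), (80, 160), (160, 240), (240, 320)],
    "y": [(0, 105), (105, 210), (210, 320)],
}


def build_store():
    """One bucket of items per node range, per ring."""
    return {attr: [[] for _ in ranges] for attr, ranges in RINGS.items()}


def insert(store, item):
    """Send the data item to every ring, at the node owning its value there."""
    for attr, ranges in RINGS.items():
        for idx, (lo, hi) in enumerate(ranges):
            if lo <= item[attr] < hi:
                store[attr][idx].append(item)


def query(store, predicate, ring):
    """Route a query through one ring only; each visited node filters on all attributes.

    predicate maps attribute -> (lo, hi) with inclusive bounds.
    """
    q_lo, q_hi = predicate[ring]
    visited, results = [], []
    for idx, (lo, hi) in enumerate(RINGS[ring]):
        if lo <= q_hi and q_lo < hi:          # node range [lo, hi) overlaps the query
            visited.append((lo, hi))
            results += [it for it in store[ring][idx]
                        if all(a <= it[k] <= b for k, (a, b) in predicate.items())]
    return visited, results


store = build_store()
insert(store, {"x": 100, "y": 200})
pred = {"x": (50, 150), "y": (150, 250)}
print(query(store, pred, "x"))
print(query(store, pred, "y"))
```

Either ring returns the item because every ring holds a full copy of it. The rings differ only in which nodes the query has to visit.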

Design Rationale

Queries span multiple nodes; one ring restricts propagation

0 < x < 1000 && 0 < y < 1000

Use histograms for selectivity estimation
  0 < x < 100 && y = *

Send data-items to all rings?? vs. Send queries to all rings??
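Selectivity estimation can be sketched as picking the ring in which the query is expected to touch the fewest nodes. The sketch assumes a per-ring histogram of (lo, hi, nodes) buckets with nodes spread uniformly inside each bucket, and represents a wildcard such as y = * as `None`. The histograms and function names are hypothetical.

```python
def estimated_nodes_touched(histogram, lo, hi):
    """Estimated number of nodes whose ranges intersect [lo, hi] in one ring."""
    touched = 0.0
    for b_lo, b_hi, nodes in histogram:
        overlap = max(0.0, min(hi, b_hi) - max(lo, b_lo))
        touched += nodes * overlap / (b_hi - b_lo)
    return touched


def choose_ring(histograms, predicate):
    """Pick the attribute ring with the smallest estimated cost for this query.

    predicate maps attribute -> (lo, hi), or None for a wildcard (attr = *).
    """
    def cost(attr):
        hist = histograms[attr]
        bounds = predicate.get(attr)
        if bounds is None:                     # wildcard: every node in the ring
            return sum(nodes for _, _, nodes in hist)
        return estimated_nodes_touched(hist, *bounds)

    return min(histograms, key=cost)


uniform = [(0, 500, 10), (500, 1000, 10)]
print(choose_ring({"x": uniform, "y": uniform}, {"x": (0, 100), "y": None}))   # x

# With nodes crowded into low x values, the same x range becomes more expensive.
skewed_x = [(0, 500, 18), (500, 1000, 2)]
print(choose_ring({"x": skewed_x, "y": uniform}, {"x": (0, 100), "y": (0, 100)}))   # y
```

For the slide's query 0 < x < 100 && y = *, the x ring is chosen because the wildcard would force a visit to every node of the y ring.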

Outline

Single-attribute range queries

Performance evaluation

Multi-attribute range queries

Discussion and summary

Alternate Designs

Virtual servers [Stoica02]
  #virtual servers ∝ skew
  Data-item distribution can have large skews
  Many virtual servers → high overhead

SkipNet [Harvey03]
  Load balancing OR range queries

Load balanced skip graphs [Karger04, Aspnes04]
  More complex to maintain
  Need random sampling

Conclusions

Lesson: a little knowledge about a distributed system helps a lot!

Sampling and histogram maintenance
  Useful for efficient routing
  Load balancing
  Selectivity estimation

Routing for range queries in P2P networks
  Efficient in the face of skewed node ranges
  Explicit load balancing
  Multiple attributes

Thank You!

Backup slides

Dynamics

Node join
  Join one or more hubs: join via some representative (rep) in a hub
  Init routing table from the representative
  Start sampling for obtaining new histogram
  Make new long-distance links
  Obtain new cross-hub neighbors

Node leave
  Maintain successor lists
  Repair succ-pred pointers
  Repair long-distance links only when number of nodes changes by a factor of 2
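The factor-of-2 rule for long-distance links can be expressed as a small check against the node count recorded when the links were last built. The function name and the use of an estimated count (for example, from the sampled histogram) are assumptions.

```python
def should_rebuild_links(count_at_last_build, current_estimate, factor=2.0):
    """True once the node count has grown or shrunk by `factor` since the last rebuild."""
    if count_at_last_build <= 0:
        return True                       # nothing built yet
    ratio = current_estimate / count_at_last_build
    return ratio >= factor or ratio <= 1 / factor


print(should_rebuild_links(1000, 1500))   # False: grew by only 1.5x
print(should_rebuild_links(1000, 2000))   # True: doubled
print(should_rebuild_links(1000, 500))    # True: halved
```

A plausible reason for the rule is that the number of useful long links scales with log n, which changes by one only when n doubles or halves, so smaller fluctuations can be absorbed by successor repair alone.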

Histogram accuracy

[Figure: histogram error (log scale, 0.0001 to 1) against number of nodes queried per round (0 to 80), for #Reports = 1, #Reports = 6, and #Reports = 14.]


Routing Performance

[Figure: average #hops (axis 0 to 200) against number of nodes (axis 0 to 35000) for Naive DHT, Naive DHT + Cache, Mercury, and Ideal.]

Multiplayer Games

[Figure: Player 1 and Player 2 inside the Game World.]

Large shared world
  Composed of map information, textures, etc.
  Populated by active entities: user avatars, AI bots, etc.

Only parts of world relevant to particular user/player

Gaming with Mercury

Key challenge: provide every player with relevant updates without central server

Use Mercury for performing distributed object discovery

Each player "registers" a range predicate
  Bounding box region surrounding itself
  Periodically updated

Player movements are “matched” against the queries
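The matching of movements against registered predicates can be sketched as below. The sketch keeps all subscriptions in one local list purely for illustration, whereas the slide's point is that Mercury does this matching without a central server. The `Subscription` type, the square bounding box, and the `radius` parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    """A player's registered range predicate over the x and y attributes."""
    player: str
    x_range: tuple
    y_range: tuple


def bounding_box(player, x, y, radius):
    """Range predicate covering a square region around the player's position."""
    return Subscription(player, (x - radius, x + radius), (y - radius, y + radius))


def match(subscriptions, mover, x, y):
    """Players (other than the mover) whose predicate contains the new position."""
    return [s.player for s in subscriptions
            if s.player != mover
            and s.x_range[0] <= x <= s.x_range[1]
            and s.y_range[0] <= y <= s.y_range[1]]


subs = [bounding_box("p1", 100, 200, 50), bounding_box("p2", 400, 400, 50)]
print(match(subs, "p2", 120, 180))   # ['p1']: p2 moved into p1's box
print(match(subs, "p1", 300, 300))   # []: nobody is watching that spot
```

Because each box is a conjunction of ranges on x and y, a registration has the same shape as a multi-attribute range query, and a movement update has the same shape as a data item.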

Attribute Rings

One hub for each attribute
  Linearization to support multiple attributes within a ring

Single node may participate in multiple rings

Hub = routing ring

[Figure: "Rings in the system", showing hubs labeled name, x, Age, and y (and, in a second view, name, Age+weight, x, and y), connected by intra-ring links and cross-ring links.]
