Beehive: Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer Cornell University



introduction

caching is widely used to improve latency and to decrease overhead
passive caching: caches distributed throughout the network store objects that are encountered
not well-suited for a large class of applications

problems with passive caching

no performance guaranteesheavy-tail effect large percentage of queries to unpopular

objects ad-hoc heuristics for cache management

introduces coherency problems difficult to locate all copies weak consistency model

overview of beehive

general replication framework for structured DHTs
retains decentralization, self-organization, and resilience
properties:
high performance: O(1) average lookup time
scalable: minimizes the number of replicas to reduce storage, bandwidth, and network load
adaptive: promptly responds to changes in popularity (flash crowds)

prefix-matching DHTs

[figure: a lookup for object 0121 routed through nodes 2012, 0021, 0112, and 0122, matching one more prefix digit per hop]
lookups take log_b N hops: several RTTs on the Internet
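The hop bound can be illustrated with a toy greedy prefix-routing sketch (my own illustration, not the talk's code; a fully populated base-4 overlay with hypothetical node IDs is assumed):

```python
import itertools

B, K = 4, 4  # digit base and ID length; N = B**K nodes (toy, fully populated)
nodes = [''.join(d) for d in itertools.product('0123', repeat=K)]

def shared_prefix(a, b):
    """Number of leading digits a and b have in common."""
    n = 0
    while n < len(a) and a[n] == b[n]:
        n += 1
    return n

def route(start, key):
    """Greedy prefix routing: every hop fixes at least one more digit,
    so a lookup takes at most K = log_B(N) hops."""
    cur, hops = start, 0
    while cur != key:
        l = shared_prefix(cur, key)
        # forward to any node that matches one more digit of the key
        cur = next(n for n in nodes if shared_prefix(n, key) > l)
        hops += 1
    return hops
```

For example, `route('2012', '0121')` resolves in at most K = 4 hops, matching the log_b N bound above.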

key intuition

tunable latency: adjust the number of objects replicated at each level
fundamental space-time tradeoff
[figure: replicating object 0121 at successive levels places copies on the lookup paths from nodes 2012, 0021, 0112, and 0122, cutting lookup hops]

analytical model

optimization problem: minimize the total number of replicas, such that the average lookup performance is at most C hops
configurable target lookup performance: a continuous range, including sub one-hop
minimizing the number of replicas decreases storage and bandwidth overhead

analytical model

zipf-like query distributions with parameter α: the number of queries to the r-th most popular object is proportional to 1/r^α
fraction of queries for the m most popular of M objects:
(m^(1-α) - 1) / (M^(1-α) - 1)
level of replication: a level-i replica is stored on the N/b^i nodes that share i prefix digits with the object, giving an i-hop lookup latency
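As a quick numeric check of the fraction formula above (a sketch assuming the continuous approximation and α ≠ 1):

```python
def zipf_query_fraction(m, M, alpha):
    """Fraction of queries that target the m most popular of M objects,
    when the query rate for the r-th object is proportional to 1/r**alpha
    (continuous approximation, alpha != 1)."""
    return (m ** (1 - alpha) - 1) / (M ** (1 - alpha) - 1)

# with the MIT trace's alpha = 0.91 and the evaluation's 40960 objects,
# a small popular head captures a large share of all queries
share = zipf_query_fraction(1024, 40960, 0.91)
```

With these numbers, roughly half of all queries go to the top 2.5% of objects, which is why replicating a small fraction of objects widely pays off.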

optimization problem

minimize (storage/bandwidth):
x_0 + x_1/b + x_2/b^2 + … + x_{K-1}/b^(K-1)
such that (average lookup time is at most C hops):
K - (x_0^(1-α) + x_1^(1-α) + x_2^(1-α) + … + x_{K-1}^(1-α)) ≤ C
and x_0 ≤ x_1 ≤ x_2 ≤ … ≤ x_{K-1} ≤ 1

b: base, K: log_b(N)
x_i: fraction of objects replicated at level i

optimal closed-form solution

x*_i = [ d^i (K' - C) / (1 + d + … + d^(K'-1)) ]^(1/(1-α)),  for 0 ≤ i ≤ K'-1
x*_i = 1,  for K' ≤ i ≤ K
where d = b^((1-α)/α)

K' (typically 2 or 3) is determined by the condition x*_{K'-1} ≤ 1, i.e.
d^(K'-1) (K' - C) / (1 + d + … + d^(K'-1)) ≤ 1
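The closed form can be evaluated directly. The sketch below (my own code, not the authors') picks K' by checking the x*_{K'-1} ≤ 1 condition from the deepest level downward; the resulting solution meets the average-lookup target C exactly:

```python
def beehive_replication(alpha, b, K, C):
    """Return [x*_0, ..., x*_{K-1}]: the fraction of objects to replicate
    at each level, from the closed-form solution (assumes alpha < 1, C < K)."""
    d = b ** ((1 - alpha) / alpha)
    # K': deepest level with a fractional solution; require x*_{K'-1} <= 1
    for Kp in range(K, 0, -1):
        denom = sum(d ** j for j in range(Kp))
        if d ** (Kp - 1) * (Kp - C) / denom <= 1:
            break
    x = [(d ** i * (Kp - C) / denom) ** (1 / (1 - alpha)) for i in range(Kp)]
    x += [1.0] * (K - Kp)  # levels K' and above replicate everything
    return x

# e.g. alpha = 0.91 (MIT trace), base b = 16, K = 3 levels, target C = 1 hop
x = beehive_replication(0.91, 16, 3, 1)
avg_hops = 3 - sum(xi ** (1 - 0.91) for xi in x)  # matches the target C
```

Plugging the solution back into the constraint gives sum x*_i^(1-α) = K - C over the K levels, so the average lookup time lands exactly on C.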

latency-overhead tradeoff

beehive: system overview

estimation: popularity of objects and the zipf parameter, via local measurement and limited aggregation
replication: apply the analytical model independently at each node; push new replicas to nodes at most one hop away
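The talk estimates object popularity and the zipf parameter from local measurements. One simple way to estimate α (an illustrative sketch, not Beehive's actual aggregation scheme) is a least-squares fit of log(count) against log(rank):

```python
import math

def estimate_zipf_alpha(counts):
    """Estimate alpha from per-object query counts by fitting
    log(count) = const - alpha * log(rank) with least squares."""
    ranked = sorted(counts, reverse=True)
    xs = [math.log(r) for r in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return -slope  # the fitted alpha
```

Each node can run such a fit over its local query counts and refine the estimate through the limited aggregation mentioned above.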

beehive replication protocol

0 1 2 *

home node

EL 3

0 1 *0 1 * 0 1 *EB IL 2

0 *0 *0 * 0 * 0 * 0 * 0 * 0 * 0 *

A B C D E F G H I

L 1

object 0121
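The level structure in the figure can be sketched in a few lines (a toy base-4 overlay with hypothetical IDs): the level-i replicas are exactly the nodes sharing the object's first i digits, about N/b^i of them:

```python
import itertools

B, K = 4, 2  # base-4 digits, N = 16 nodes (toy overlay for illustration)
nodes = [''.join(d) for d in itertools.product('0123', repeat=K)]

def replicas(obj_id, level):
    """Nodes holding a level-`level` replica: those whose first
    `level` digits match the object's ID prefix."""
    return [n for n in nodes if n[:level] == obj_id[:level]]

# level 0: every node; level 1: N/B nodes; level 2: the home region
counts = [len(replicas('0121', i)) for i in range(K + 1)]  # [16, 4, 1]
```

Lowering an object's replication level by one thus multiplies its replica count by b, which is the knob the analytical model tunes per object.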

mutable objects

leverage the underlying structure of the DHT: the replication level indicates the locations of all the replicas
proactive propagation to all nodes from the home node: the home node sends to its one-hop neighbors with i matching prefix digits, and level i nodes send to level i+1 nodes

implementation and evaluation

implemented using Pastry as the underlying DHT evaluation using a real DNS workload

MIT DNS trace (zipf parameter 0.91) 1024 nodes, 40960 objects compared with passive caching on pastry

main properties evaluated lookup performance storage and bandwidth overhead adaptation to changes in query distribution

evaluation: lookup performance

passive caching is not very effective because of the heavy-tail query distribution and mutable objects
beehive converges to the target of 1 hop

evaluation: overhead (bandwidth and storage)

average number of replicas per node:
Pastry      40
PC-Pastry   420
Beehive     380

evaluation: flash crowds (lookup performance)

evaluation: zipf parameter change

Cooperative Domain Name System (CoDoNS)

replacement for legacy DNS
secure authentication through DNSSEC
incremental deployment path: completely transparent to clients; uses legacy DNS to populate resource records on demand
deployed on PlanetLab

advantages of CoDoNS

higher performance than legacy DNS: median latency of 7 ms for CoDoNS (PlanetLab) vs. 39 ms for legacy DNS
resilience against denial of service attacks
self-configuration after host and network failures
fast update propagation

conclusions

model-driven proactive caching: O(1) lookup performance with an optimal number of replicas
beehive: a general replication framework for structured overlays with uniform fan-out
high performance, resilience, improved availability
well-suited for latency-sensitive applications

www.cs.cornell.edu/people/egs/beehive

evaluation: zipf parameter change

evaluation: instantaneous bandwidth overhead

lookup performance: target 0.5 hops

lookup performance: planet-lab

typical values of zipf parameter

MIT DNS trace: α = 0.91
Web traces:
trace   Dec    UPisa   FuNet   UCB    Quest   NLANR
α       0.83   0.84    0.84    0.83   0.88    0.90

comparative overview of structured DHTs

DHT                                          lookup performance
CAN                                          O(d N^(1/d))
Chord, Kademlia, Pastry, Tapestry, Viceroy   O(log N)
de Bruijn graphs (Koorde)                    O(log N / log log N)
Kelips, Salad, [Gupta, Liskov, Rodriguez],
[Mizrak, Cheng, Kumar, Savage]               O(1)

O(1) structured DHTs

DHT                              lookup performance   routing state
Salad                            d                    O(d N^(1/d))
[Mizrak, Cheng, Kumar, Savage]   2                    √N
Kelips                           1                    √N (√N replication)
[Gupta, Liskov, Rodriguez]       1                    N

security issues in beehive

underlying DHT: corruption in routing tables [Castro, Druschel, Ganesh, Rowstron, Wallach]
beehive: misrepresentation of popularity (mitigated by removing outliers)
application: corruption of data (mitigated by certificates, e.g. DNSSEC)

Beehive DNS: Lookup Performance

                  CoDoNS    Legacy DNS
median            6.56 ms   38.8 ms
90th percentile   281 ms    337 ms