Peer-to-Peer Networks
Distributed Algorithms for P2P
Distributed Hash Tables
P. Felber, [email protected]
http://www.eurecom.fr/~felber/
Agenda
What are DHTs? Why are they useful?
What makes a “good” DHT design?
Case studies: Chord, Pastry (locality), TOPLUS (topology-awareness)
What are the open problems?
What is P2P?
A distributed system architecture
No centralized control
Typically many nodes, but unreliable and heterogeneous
Nodes are symmetric in function
Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes
Fault-tolerant, self-organizing
Operates in a dynamic environment; frequent joins and leaves are the norm
P2P Challenge: Locating Content
Simple strategy: expanding ring search until the content is found
If r of N nodes have a copy, the expected search cost is at least N/r, i.e., O(N)
Need many copies to keep the overhead small
Directed Searches
Idea:
Assign particular nodes to hold particular content (or know where it is)
When a node wants this content, it goes to the node that is supposed to hold it (or know where it is)
Challenges:
Avoid bottlenecks: distribute the responsibilities “evenly” among the existing nodes
Adapt to nodes joining or leaving (or failing):
Give responsibilities to joining nodes
Redistribute responsibilities from leaving nodes
Idea: Hash Tables
A hash table associates data with keys
The key is hashed to find its bucket in the hash table
Each bucket is expected to hold #items/#buckets items
In a Distributed Hash Table (DHT), nodes are the hash buckets
The key is hashed to find the responsible peer node
Data and load are balanced across nodes
[Figure: a classic hash table and a DHT side by side. In both, the interface is insert(key, data) and lookup(key) → data, and a key (e.g., “Beattles”) is hashed with h(key) % N to position 2. In the hash table, position 2 is a bucket holding items; in the DHT, position 2 is a peer node.]
DHTs: Problems
Problem 1 (dynamicity): adding or removing nodes
With hash mod N, virtually every key will change its location:
h(k) mod m ≠ h(k) mod (m+1) ≠ h(k) mod (m-1)
Solution: use consistent hashing
Define a fixed hash space; all hash values fall within that space and do not depend on the number of peers (hash buckets)
Each key goes to the peer closest to its ID in the hash space (according to some proximity metric)
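To make the contrast with hash-mod-N concrete, here is a minimal sketch of consistent hashing in Python. It is illustrative only: the SHA-1 hash, the 32-bit ID space, and the node names are assumptions, not something specified on the slides.

```python
import hashlib
from bisect import bisect_left

M = 2**32  # fixed hash space, independent of the number of peers

def h(value: str) -> int:
    # Hash a string into the fixed ID space [0, 2^32).
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % M

class ConsistentHashRing:
    def __init__(self, node_names):
        # Each node gets a fixed position in the ID space.
        self.ring = sorted((h(name), name) for name in node_names)

    def lookup(self, key: str) -> str:
        # The key goes to the first node at or after h(key), wrapping around.
        k = h(key)
        ids = [node_id for node_id, _ in self.ring]
        i = bisect_left(ids, k) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(ring.lookup("Beattles"))
```

With this scheme, adding or removing a node only moves the keys between that node and its predecessor, instead of relocating virtually every key as hash mod N does.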
DHTs: Problems (cont’d)
Problem 2 (size): all nodes must be known to insert or look up data
This only works with small and static server populations
Solution: each peer knows of only a few “neighbors”
Messages are routed through neighbors via multiple hops (overlay routing)
What Makes a Good DHT Design
For each object, the node(s) responsible for that object should be reachable via a “short” path (small diameter); the different DHTs differ fundamentally only in the routing approach
The number of neighbors of each node should remain “reasonable” (small degree)
DHT routing mechanisms should be decentralized (no single point of failure or bottleneck)
Should gracefully handle nodes joining and leaving:
Repartition the affected keys over the existing nodes
Reorganize the neighbor sets
Bootstrap mechanisms to connect new nodes into the DHT
To achieve good performance, the DHT must provide low stretch:
Minimize the ratio of DHT-routing latency to unicast (IP) latency
DHT Interface
Minimal interface (data-centric):
Lookup(key) → IP address
Supports a wide range of applications, because it imposes few restrictions:
Keys have no semantic meaning
Values are application-dependent
DHTs do not store the data; data storage can be built on top of DHTs:
Lookup(key) → data
Insert(key, data)
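The split between the two interfaces can be sketched as follows. This is only an illustration of the layering: the dht and transport objects and their lookup/send methods are assumptions, not an API defined on the slides.

```python
class DHTStore:
    """Storage layered on top of a DHT that only resolves keys to nodes."""

    def __init__(self, dht, transport):
        self.dht = dht              # assumed to expose lookup(key) -> IP address
        self.transport = transport  # assumed to expose send(ip, message) -> reply

    def insert(self, key, data):
        ip = self.dht.lookup(key)                        # DHT: key -> responsible node
        self.transport.send(ip, ("insert", key, data))   # the data lives on that node

    def lookup(self, key):
        ip = self.dht.lookup(key)
        return self.transport.send(ip, ("lookup", key))  # -> data
```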
DHTs in Context
A typical layering (the CFS storage system), from top to bottom:
User Application
File System (CFS): store_file / load_file; retrieves and stores files, maps files to blocks
Reliable Block Storage (DHash): store_block / load_block; storage, replication, caching
DHT (Chord): lookup; key-based routing
Transport (TCP/IP): send / receive; communication
DHTs Support Many Applications
File sharing [CFS, OceanStore, PAST, …]
Web cache [Squirrel, …]
Censor-resistant stores [Eternity, FreeNet, …]
Application-layer multicast [Narada, …]
Event notification [Scribe]
Naming systems [ChordDNS, INS, …]
Query and indexing [Kademlia, …]
Communication primitives [I3, …]
Backup store [HiveNet]
Web archive [Herodotus]
DHT Case Studies
Case studies: Chord, Pastry, TOPLUS
Questions:
How is the hash space divided evenly among nodes?
How do we locate a node?
How do we maintain routing tables?
How do we cope with (rapid) changes in membership?
Chord (MIT)
Circular m-bit ID space for both keys and nodes
Node ID = SHA-1(IP address)
Key ID = SHA-1(key)
A key is mapped to the first node whose ID is equal to or follows the key ID
Each node is responsible for O(K/N) keys
O(K/N) keys move when a node joins or leaves
[Figure: Chord ring with m = 6 (IDs from 0 to 2^m - 1), nodes N1, N8, N14, N21, N32, N38, N42, N48, N51, N56, and keys K10, K24, K30, K38, K54, each stored at the first node that follows the key ID.]
Chord State and Lookup (1)
Basic Chord: each node knows only 2 other nodes on the ring:
Its successor
Its predecessor (for ring management)
Lookup is achieved by forwarding requests around the ring through successor pointers
Requires O(N) hops
[Figure: Chord ring with m = 6; lookup(K54) issued at N8 is forwarded successor by successor around the ring until it reaches N56, the node responsible for K54.]
Chord State and Lookup (2)
Each node knows m other nodes on the ring:
Successors: finger i of n points to the node at n + 2^i (or its successor)
Predecessor (for ring management)
O(log N) state per node
Lookup is achieved by following the closest preceding fingers, then the successor
O(log N) hops
[Figure: Chord ring with m = 6 and N8’s finger table: N8+1 → N14, N8+2 → N14, N8+4 → N14, N8+8 → N21, N8+16 → N32, N8+32 → N42. lookup(K54) from N8 jumps by the largest useful finger first (+32, then +16, +8, +4, +2, +1 as needed) until it reaches the node responsible for K54.]
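The finger-based lookup can be sketched in a few lines of Python. This is an illustrative in-memory simulation under assumed helper names (Node, between), not the Chord RPC protocol:

```python
M = 6                       # bits in the ID space, as in the figure
RING = 2 ** M

def between(x, a, b):
    # True if x lies in the half-open ring interval (a, b].
    a, b, x = a % RING, b % RING, x % RING
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps around 0

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.successor = None   # set when the ring is built
        self.fingers = []       # fingers[i] ~ successor(id + 2^i)

    def closest_preceding_finger(self, key):
        for f in reversed(self.fingers):            # try the largest jump first
            if between(f.id, self.id, key - 1):
                return f
        return self.successor

    def lookup(self, key):
        # Follow closest preceding fingers until the key falls between a node and its successor.
        n = self
        while not between(key, n.id, n.successor.id):
            n = n.closest_preceding_finger(key)
        return n.successor      # the node responsible for key, reached in O(log N) hops
```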
Chord Ring Management
For correctness, Chord needs to maintain the following invariants:
For every key k, succ(k) is responsible for k
Successor pointers are correctly maintained
Finger tables are not necessary for correctness:
One can always fall back to successor-based lookup
Finger tables can be updated lazily
Joining the Ring
Three-step process:
Initialize all fingers of the new node
Update the fingers of existing nodes
Transfer keys from the successor to the new node
Joining the Ring: Step 1
Initialize the new node’s finger table:
Locate any node n already in the ring
Ask n to look up the peers at j + 2^0, j + 2^1, j + 2^2, …
Use the results to populate the finger table of the new node j
Joining the Ring: Step 2
Updating the fingers of existing nodes:
The new node j calls an update function on the existing nodes that must point to j
For finger i, these are the nodes in the range [pred(j) - 2^i + 1, j - 2^i]
O(log N) nodes need to be updated
[Figure: Chord ring with m = 6 and N8’s finger table; a new node N28 joins between N21 and N32. For i = 4 (+16), the nodes in [pred(28) - 16 + 1, 28 - 16] = [6, 12] must update that finger, so N8’s entry N8+16 changes from N32 to N28.]
Joining the Ring: Step 3
Transfer key responsibility:
Connect to the successor
Copy keys from the successor to the new node
Update the successor pointer and remove the transferred keys
Only the keys in the new node’s range are transferred
[Figure: N28 joins between N21 and N32, which holds K24 and K30. K24, which now falls in N28’s range, is copied to N28, N21’s successor pointer is updated to N28, and K24 is finally removed from N32; K30 stays at N32.]
Stabilization
Case 1: finger tables are reasonably fresh
Case 2: successor pointers are correct, but fingers are not
Case 3: successor pointers are inaccurate or key migration is incomplete (must be avoided!)
A stabilization algorithm periodically verifies and refreshes node pointers (including fingers).
Basic principle (at node n):
x = n.succ.pred
if x ∈ (n, n.succ): n.succ = x
notify n.succ
Eventually stabilizes the system when no node joins or fails
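A slightly fuller sketch of the stabilize/notify pair, written in Python purely for illustration (the field names follow the pseudocode above; the ring-interval helper is assumed):

```python
def in_interval(x, a, b):
    # x lies in the open ring interval (a, b), with wrap-around.
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, node_id):
        self.id = node_id
        self.succ = self        # a one-node ring points to itself
        self.pred = None

    def stabilize(self):
        # Run periodically: adopt a closer successor if one has appeared, then notify it.
        x = self.succ.pred
        if x is not None and in_interval(x.id, self.id, self.succ.id):
            self.succ = x
        self.succ.notify(self)

    def notify(self, n):
        # n believes it might be our predecessor.
        if self.pred is None or in_interval(n.id, self.pred.id, self.id):
            self.pred = n
```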
Dealing With Failures
The failure of nodes may cause incorrect lookups:
N8 does not know its correct successor, so the lookup of K19 fails
Solution: successor lists
Each node n knows its r immediate successors
After a failure, n knows the first live successor and updates its successor list
Correct successors guarantee correct lookups
[Figure: Chord ring with m = 6 showing N8, N18, and key K19; when intermediate nodes such as N14 fail, lookup(K19) issued at N8 fails because N8 does not know a correct live successor.]
Dealing With Failures (cont’d)
Successor lists guarantee correct lookups with some probability
We can choose r to make the probability of lookup failure arbitrarily small
Assume that half of the nodes fail and that failures are independent:
P(n’s successor list is entirely dead) = 0.5^r
P(n does not break the Chord ring) = 1 - 0.5^r
P(no node breaks the ring) = (1 - 0.5^r)^N
r = 2 log2(N) makes this probability about 1 - 1/N
With high probability (1 - 1/N), the ring is not broken
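As a quick sanity check of these formulas (the network size below is an illustrative assumption):

```python
import math

N = 1000                               # illustrative number of nodes
r = round(2 * math.log2(N))            # successor-list length suggested above: r = 20

p_node_breaks = 0.5 ** r               # all r successors of one node are dead
p_ring_intact = (1 - p_node_breaks) ** N
print(r, p_ring_intact)                # ~0.999, i.e., about 1 - 1/N
```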
Evolution of P2P Systems
Nodes leave frequently, so surviving nodes must be notified of arrivals to stay connected after their original neighbors fail
Take time t with N nodes:
Doubling time: time from t until N new nodes join
Halving time: time from t until N nodes leave
Half-life: the minimum of the halving and doubling times
Theorem: there exists a sequence of joins and leaves such that any node that has received fewer than k notifications per half-life will be disconnected with probability at least (1 - 1/(e - 1))^k ≈ 0.418^k
Chord and Network Topology
Nodes that are numerically close are not topologically close (1M nodes = 10+ hops).
Pastry (MSR)
Circular m-bit ID space for both keys and nodes
Addresses are written in base 2^b, with m/b digits
Node ID = SHA-1(IP address)
Key ID = SHA-1(key)
A key is mapped to the node whose ID is numerically closest to the key ID
[Figure: Pastry ring with m = 8 and b = 2 (IDs are 4 base-4 digits), nodes N0002, N0201, N0322, N1113, N2001, N2120, N2222, N3001, N3033, N3200, and keys K0220, K1201, K1320, K2120, K3122, each stored at the numerically closest node.]
Pastry Lookup
Prefix routing from A to B:
At the h-th hop, arrive at a node that shares a prefix of length at least h digits with B
Example: 5324 routes to 0629 via 5324 → 0748 → 0605 → 0620 → 0629
If there is no such node, forward the message to a neighbor numerically closer to the destination (successor), e.g., 5324 → 0748 → 0605 → 0609 → 0620 → 0629
O(log_{2^b} N) hops
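A hedged sketch of the per-hop decision is shown below. The digit base, the table layout, and the helper names are assumptions made for illustration; real Pastry consults the leaf set as well.

```python
def shared_prefix_len(a: str, b: str) -> int:
    # Number of leading digits the two IDs have in common.
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(node_id: str, key: str, routing_table, leaf_set):
    """Pick the next hop for `key` at `node_id`.

    routing_table[row][d] is a known node sharing `row` digits with node_id
    and having digit d next, or None if no such node is known."""
    row = shared_prefix_len(node_id, key)
    if row == len(key):
        return node_id                        # this node is responsible for the key
    entry = routing_table[row][int(key[row])]
    if entry is not None:
        return entry                          # one more matching digit than we have
    # Fallback: any known node numerically closer to the key than we are.
    known = [n for r in routing_table for n in r if n] + list(leaf_set)
    return min(known + [node_id], key=lambda n: abs(int(n) - int(key)))
```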
Pastry State and Lookup
For each prefix length, a node knows some other node (if any) with the same prefix and a different next digit
For instance, N0201 knows:
empty prefix: N1???, N2???, N3???
N0: N00??, N01??, N03??
N02: N021?, N022?, N023?
N020: N0200, N0202, N0203
When there are multiple candidates, choose the topologically closest one
This maintains good locality properties (more on that later)
[Figure: Pastry ring with m = 8 and b = 2, showing entries of N0201’s routing table (e.g., N0122, N0212, N0221, N0233) and the path taken by lookup(K2120), which gains at least one matching prefix digit per hop.]
A Pastry Routing Table
[Figure: the state of the node with ID 10233102, for b = 2 (node IDs are written in base 4) and m = 16. It consists of:
A leaf set, split into numerically smaller and larger halves: the nodes numerically closest to the local node; it must be kept up to date.
A routing table with m/b rows and 2^b - 1 populated entries per row: entries in row n share the first n digits with the local node and have the form [common-prefix | next-digit | rest], the column giving the next digit; entries for which no suitable node ID is known are left empty.
A neighborhood set: the nodes closest to the local node according to the proximity metric.]
Pastry and Network Topology
The expected distance to the nodes referenced in the routing table increases with the row number: numerical jumps get smaller and smaller, while topological jumps get bigger and bigger.
Joining
[Figure: node X (ID 0629) joins. X knows a topologically close node A (5324) and sends a join message, which is routed to the node numerically closest to X’s ID: A (5324) → B (0748) → C (0605) → D (0620). X then builds its own state from the nodes along this path: A’s neighborhood set, row 0 of its routing table from A, row 1 from B, row 2 from C, row 3 from D, and D’s leaf set.]
Locality
The joining phase preserves the locality property:
First, A must be near X
The entries in row zero of A’s routing table are close to A, and A is close to X, so X’s row zero (X0) can be A’s row zero (A0)
The distance from B to the nodes in its row one (B1) is much larger than the distance from A to B (B appears in A0), so B1 is a reasonable choice for X1, C2 for X2, etc.
To avoid cascading errors, X then requests the state from each node in its routing table and updates its own entries with any closer node it finds
This scheme works “pretty well” in practice:
It minimizes the distance of the next routing step, but with no sense of global direction
Stretch is around 2-3
Node Departure
A node is considered failed when its immediate neighbors in the node ID space can no longer communicate with it
To replace a failed node in the leaf set, a node contacts the live node with the largest index on the side of the failed node and asks for its leaf set
To repair a failed routing table entry R_d^l (row l, column d), a node first contacts the node referred to by another entry R_i^l of the same row and asks for that node’s entry for R_d^l
If a member of the neighborhood set M is not responding, the node asks the other members for their M sets, checks the distance to each newly discovered node, and updates its own M set
CAN (Berkeley)
Cartesian space (d-dimensional)
The space wraps around: a d-torus
The space is split incrementally between nodes as they join
The node (cell) responsible for key k is determined by hashing k once per dimension
[Figure: a 2-dimensional CAN (d = 2); insert(k, data) and retrieve(k) are routed to the cell containing the point (hx(k), hy(k)).]
CAN State and Lookup
A node only maintains state for its immediate neighbors (N, S, E, W): 2d neighbors per node
Messages are routed to the neighbor that minimizes the Cartesian distance to the key’s point
More dimensions mean faster routing but also more state
(d/4) N^(1/d) hops on average
Multiple next-hop choices: we can route around failures
[Figure: a 2-d CAN (d = 2); node A’s neighbors lie to its north, south, east, and west, and a message from A toward B is greedily forwarded to whichever neighbor is closest to B.]
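A minimal sketch of the greedy forwarding rule on a d-torus, assuming coordinates normalized to [0, 1) and a simple neighbor bookkeeping (all names and values below are illustrative):

```python
import math

def torus_distance(p, q, size=1.0):
    # Euclidean distance on a d-torus where every dimension wraps at `size`.
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b) % size
        delta = min(delta, size - delta)   # wrap-around: take the shorter way
        total += delta * delta
    return math.sqrt(total)

def next_hop(my_point, neighbors, key_point):
    # Forward to the neighbor whose zone center is closest to the key's point.
    best = min(neighbors, key=lambda n: torus_distance(n["center"], key_point))
    if torus_distance(best["center"], key_point) < torus_distance(my_point, key_point):
        return best        # this neighbor makes progress toward the key
    return None            # no neighbor is closer: the key falls in our own zone

# Example for d = 2, with assumed zone centers around a node at (0.5, 0.5):
neighbors = [{"name": "N", "center": (0.5, 0.75)}, {"name": "S", "center": (0.5, 0.25)},
             {"name": "E", "center": (0.75, 0.5)}, {"name": "W", "center": (0.25, 0.5)}]
print(next_hop((0.5, 0.5), neighbors, (0.9, 0.55))["name"])   # -> "E"
```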
CAN Landmark Routing
CAN nodes do not have a pre-defined ID, so nodes can be placed according to locality
Use a well-known set of m landmark machines (e.g., the root DNS servers)
Each CAN node measures its RTT to each landmark and orders the landmarks by increasing RTT: m! possible orderings
CAN construction:
Place nodes with the same ordering close together in the CAN
To do so, partition the space into m! zones: m zones on x, m-1 on y, etc.
A node interprets its ordering as the coordinates of its zone (see the sketch below)
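One plausible way to turn a landmark ordering into zone coordinates is sketched here; the RTT values and the exact mapping from the ordering to per-dimension indices are assumptions for illustration.

```python
def landmark_ordering(rtts):
    # Order landmark indices by increasing measured RTT.
    return tuple(sorted(range(len(rtts)), key=lambda i: rtts[i]))

def zone_coordinates(ordering):
    # Interpret the ordering as one index per dimension:
    # m choices on the first axis, m-1 on the second, and so on.
    remaining = sorted(ordering)
    coords = []
    for landmark in ordering:
        coords.append(remaining.index(landmark))
        remaining.remove(landmark)
    return tuple(coords[:-1])   # the last choice is forced, so it carries no information

# Example with m = 3 landmarks (made-up RTTs in milliseconds):
rtts = [42.0, 17.5, 88.3]
order = landmark_ordering(rtts)        # (1, 0, 2): landmark 1 is the closest
print(order, zone_coordinates(order))  # -> (1, 0, 2) (1, 0)
```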
CAN and Network Topology
[Figure: the m = 3 landmarks A, B, C split the space into m! = 6 zones (A;B;C, A;C;B, B;A;C, B;C;A, C;A;B, C;B;A). Nodes get a random zone within the zone matching their landmark ordering, so topologically close nodes tend to end up in the same zone.]
Topology-Awareness
Problem:
P2P lookup services generally do not take topology into account
In Chord/CAN/Pastry, neighbors are often not nearby in the underlying network
Goals:
Provide small stretch: route packets to their destination along a path that mimics the router-level shortest-path distance
Stretch: DHT-routing distance / IP-routing distance
Our solution: TOPLUS (TOPology-centric Look-Up Service), an “extremist design” for topology-aware DHTs
TOPLUS Architecture
[Figure: a tree of nested groups, from Tier 0 (the whole IPv4 address space) through Tiers 1, 2, and 3 down to individual IP addresses; a node n sits in one of the leaf groups.]
Use the IPv4 address range (32 bits) for node IDs and key IDs
Group nodes into nested groups using IP prefixes: AS, ISP, LAN (an IP prefix is a contiguous address range of the form w.x.y.z/n)
Assumption: nodes with the same IP prefix are topologically close
Node State
[Figure: the nested groups containing node n form a series of telescoping sets H0 = I (n’s inner group) ⊂ H1 ⊂ H2 ⊂ H3, with sibling sets S1, S2, S3 at each tier.]
Each node n is part of a series of telescoping sets H_i with sibling sets S_i
Node n must know all up nodes in its inner group
Node n must know one delegate node in each tier-i set S ∈ S_i
Routing with the XOR Metric
To look up key k, node n forwards the request to the node in its routing table whose ID j is closest to k according to the XOR metric:
d(j, k) = Σ_{i=0}^{31} |j_i - k_i| · 2^i, where j = j31 j30 … j0 and k = k31 k30 … k0
This is a refinement of longest-prefix matching
Note that the closest ID is unique: d(j, k) = d(j′, k) ⇔ j = j′
Example (8 bits):
k = 10010110
j = 10110110: d(j, k) = 2^5 = 32
j′ = 10001001: d(j′, k) = 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 31
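Since |j_i - k_i| is 1 exactly where the bits differ, the metric is the integer value of j XOR k, which makes the closest-node choice easy to express. A small illustrative sketch (the candidate list is made up):

```python
def xor_distance(j: int, k: int) -> int:
    # Sum over bit positions i of |j_i - k_i| * 2^i, i.e., the integer XOR.
    return j ^ k

def closest_node(node_ids, k):
    # Routing choice: the known ID with the smallest XOR distance to the key (unique).
    return min(node_ids, key=lambda j: xor_distance(j, k))

k  = 0b10010110
j  = 0b10110110
j2 = 0b10001001
print(xor_distance(j, k), xor_distance(j2, k))   # 32 31
print(bin(closest_node([j, j2], k)))             # 0b10001001: j2 wins, as in the example
```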
Prefix Routing Lookup
[Figure: the tier hierarchy; the lookup for key k is forwarded toward the group whose prefix best matches k.]
Compute the 32-bit key k (using a hash function)
Perform a longest-prefix match against the entries in the routing table, using the XOR metric
Route the message to the node in the inner group with the closest ID (according to the XOR metric)
TOPLUS and Network Topology
Smaller and smaller numerical and topological jumps: the lookup always moves closer to the destination.
Group Maintenance
To join the system, a node n finds its closest node n′
n copies the routing and inner-group tables of n′
n modifies its routing table to satisfy a “diversity” property:
It requires that the delegate nodes of n and n′ are distinct with high probability
This allows a replacement delegate to be found in case of failure
Upon failure, update the inner-group tables
Routing tables are updated lazily
Membership tracking is done within groups (local, small)
On-Demand Caching
Cache data in a group (ISP, campus) with prefix w.x.y.z/r
To look up k, create k′ = k with its first r bits replaced by w.x.y.z/r (the node responsible for k′ acts as the cache for k within the group)
This extends naturally to multiple levels (a cache hierarchy)
[Figure: the tier hierarchy; the lookup for k is first redirected to k′, whose responsible node lies inside the local group.]
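A minimal sketch of the key rewriting for 32-bit keys (the group prefix and the key value below are made up for illustration):

```python
def rewrite_key(k: int, prefix: int, r: int) -> int:
    # Replace the first r bits of the 32-bit key k with the group prefix w.x.y.z/r.
    mask = (1 << (32 - r)) - 1            # keeps the low 32 - r bits of k
    return (prefix & ~mask) | (k & mask)

group_prefix = 0x80FE0000                 # an illustrative /16 campus prefix (128.254.0.0/16)
k = 0x12345678                            # 32-bit key obtained by hashing the object name
k_prime = rewrite_key(k, group_prefix, 16)
print(hex(k_prime))                       # 0x80fe5678: same low bits, the group's high bits
```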
Measuring TOPLUS Stretch
Obtained prefix information from:
BGP tables from Oregon and Michigan Universities
Routing registries from Castify and RIPE
Sample of 1,000 different IP addresses
Point-to-point IP measurements using King
TOPLUS distance: weighted average over all possible paths between source and destination
Weights: the probability of a delegate being in each group
TOPLUS stretch = TOPLUS distance / IP distance
Results
Original tree:
250,562 distinct IP prefixes
Up to 11 levels of nesting
Mean stretch: 1.17
16-bit regrouping (>16 → 16): aggregate small tier-1 groups
Mean stretch: 1.19
8-bit regrouping (>16 → 8)
Mean stretch: 1.28
Original+1: add one level with 256 8-bit prefixes
Mean stretch: 1.9
Artificial 3-tier tree
Mean stretch: 2.32
TOPLUS Summary
Problems:
Non-uniform ID space (requires a bias in the hash to balance the load)
Correlated node failures
Advantages:
Small stretch
IP longest-prefix matching allows fast forwarding
On-demand P2P caching is straightforward to implement
Can be easily deployed in a “static” environment (e.g., a multi-site corporate network)
Can be used as a benchmark to measure the speed of other P2P services
Other Issues: Hierarchical DHTs
The Internet is organized as a hierarchy; should DHT designs be flat?
Hierarchical DHTs: multiple overlays managed by possibly different DHTs (Chord, CAN, etc.)
First, locate the group responsible for the key in the top-level DHT
Then, find the peer in the next-level overlay, and so on
By designating the most reliable peers as super-nodes (members of multiple overlays), the number of hops can be significantly decreased
How can we deploy and maintain such architectures?
Hierarchical DHTs: Example
[Figure: a top-level Chord overlay connecting super-nodes s1, s2, s3, s4, each of which also belongs to a lower-level group, e.g., a CAN group or a Chord group.]
Other Issues: DHT Querying
DHTs allow us to locate data very quickly…
Lookup(“Beatles/Help”) → IP address
…but this only works for perfect matches
Users tend to submit broad queries:
Lookup(“Beatles/*”) → IP address
Queries may be inaccurate:
Lookup(“Beattles/Help”) → IP address
Idea: index data using partial queries as keys
Another approach: fuzzy matching (UCSB)
Some Other Issues
Better handling of failures:
In particular, Byzantine failures: a single corrupted node may compromise the system
Reasoning about the dynamics of the system:
A large system may never achieve a quiescent ideal state
Dealing with untrusted participants:
Data authentication, integrity of routing tables, anonymity and censorship resistance, reputation
Traffic-awareness and load balancing
Conclusion
The DHT is a simple yet powerful abstraction:
It is the building block of many distributed services (file systems, application-layer multicast, distributed caches, etc.)
There are many DHT designs, with various pros and cons:
A balance between state (degree), speed of lookup (diameter), and ease of management
The system must support rapid changes in membership:
Dealing with joins/leaves/failures is not trivial
The dynamics of a P2P network are difficult to analyze
Many open issues are worth exploring