Paper Survey of DHT Distributed Hash Table

Upload: brooke-york

Post on 14-Jan-2016


TRANSCRIPT

Page 1: Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such

Paper Survey of DHT

Distributed Hash Table

Usages

- Directory service
  - Very small amounts of information, such as URIs, metadata, …
- Storage
  - Data, such as files, …
  - Immutable, download only
- Database
  - Each entry is small, but there are many entries
  - Mutable
  - Special operations for queries

Challenges

- Immutable
  - Latency
  - Availability
  - Query consistency
- Mutable
  - Object consistency

Latency

- Query
  - Different routing architectures: Chord, Tapestry, Pastry, Kademlia, CAN, …
  - Recursive vs. iterative
  - Proximity neighbor route
  - Parallel queries
  - Routing table size
- Fetch
  - Transport protocol
  - Proximity neighbor selection
  - Cache
  - Distributed object

Query: Routing Architectures

- Routing complexity: O(log n), O(d), O(1), …
- Principle
  - Each peer has a unique digest
  - Each object has a digest
  - Put the object on the peer with the closest digest
- The well-known ones are O(log n)
- O(1)
  - Cache
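
The placement principle above can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-1 as the digest function, XOR as the distance metric (the choice Kademlia makes; Chord-style rings use a successor rule instead), and a full peer list known locally, which a real DHT avoids by routing. The names `digest` and `closest_peer` are hypothetical.

```python
import hashlib
from typing import Sequence


def digest(name: str) -> int:
    """Map a name to a 160-bit integer digest (SHA-1 is an assumption here)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def closest_peer(peers: Sequence[str], object_name: str) -> str:
    """Return the peer whose digest is closest to the object's digest.

    Distance is XOR of the two digests, an assumed metric for this sketch.
    """
    key = digest(object_name)
    return min(peers, key=lambda peer: digest(peer) ^ key)


# Hypothetical peers and object name, for illustration only.
peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
owner = closest_peer(peers, "report.pdf")
print("report.pdf is stored on", owner)
```

Because every peer computes the same digests, any peer that knows the membership arrives at the same owner; the routing architectures differ in how they reach that owner without global knowledge.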

Query: Recursive or Iterative

- The query is forwarded recursively
- Theoretically 2 times faster than iterative
- Primary parameters
  - Base
  - Number of successors
- Persistent problem
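
The "2 times faster" figure can be checked with a simple message count. The sketch below assumes every one-way message takes the same delay, that a recursive lookup forwards the query hop by hop and the last node replies directly to the origin, and that in iterative forwarding the origin contacts each hop itself and waits for its answer. Both function names are hypothetical.

```python
def recursive_latency(hops: int, one_way_delay: float) -> float:
    """hops forwards plus one direct reply to the origin (assumed model)."""
    return (hops + 1) * one_way_delay


def iterative_latency(hops: int, one_way_delay: float) -> float:
    """One request and one response per hop, all driven by the origin."""
    return 2 * hops * one_way_delay


for hops in (1, 4, 16, 64):
    ratio = iterative_latency(hops, 1.0) / recursive_latency(hops, 1.0)
    print(f"hops={hops:3d}  iterative/recursive = {ratio:.2f}")
```

Under this model the ratio is 2h / (h + 1), which approaches 2 as the hop count grows. It ignores timeouts, failures, and unequal link delays, which is one reason the gap can be smaller in practice.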

Query: Recursive or Iterative

- The query is forwarded iteratively
- Not very slow in practice
- Primary parameters
  - Number of parallel queries
  - Routing table tree
- Learns new neighbors easily
- Exchanges information with other peers
- Flexible

Query: Proximity Neighbor Route

- Route through a node with smaller delay
- Small delay -> small timeout
  - TCP-style > Vivaldi > fixed
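
One reading of the TCP entry in the ranking above is a TCP-style timeout computed from measured round-trip times. The sketch below uses the standard TCP retransmission-timer smoothing (alpha = 1/8, beta = 1/4, timeout = SRTT + 4 * RTTVAR). The class name, the per-neighbor usage, and the default timeout are assumptions for illustration, and TCP's minimum-timeout clamp is left out.

```python
from typing import Optional


class RttTimeout:
    """Per-neighbor timeout estimate from round-trip samples (TCP-style)."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation

    def __init__(self) -> None:
        self.srtt: Optional[float] = None  # smoothed round-trip time
        self.rttvar = 0.0                  # smoothed RTT variation

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time (seconds) into the estimate."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    def timeout(self, default: float = 1.0) -> float:
        """Return the current timeout, or an assumed default before any sample."""
        if self.srtt is None:
            return default
        return self.srtt + 4 * self.rttvar


# Hypothetical samples: a nearby neighbor and a distant one.
near, far = RttTimeout(), RttTimeout()
for sample in (0.020, 0.022, 0.019):
    near.observe(sample)
for sample in (0.200, 0.240, 0.210):
    far.observe(sample)
print(near.timeout() < far.timeout())
```

A neighbor with a small measured delay ends up with a small timeout, so a failed hop through it is detected sooner than with one fixed timeout for all nodes.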

Query: Proximity Neighbor Route

- Measurement methods
  - Global sampling
  - Neighbors' neighbors
  - Neighbors' inverse neighbors
  - Recursive sampling

Query: Others

- Parallel query
  - Faster
  - Has a partial PNS (proximity neighbor selection) property
  - Persistent
  - More traffic
- Large routing table
  - Easy to find a closer node locally

Fetch: Cache

- Cache objects on nodes closer to the primary one
- The number of caching nodes depends on the popularity of the object
- Average query hops can be reduced to a constant number (O(1))
- Hard to apply to mutable objects
- Under churn, bandwidth consumption increases

Fetch: Distributed Object

- Split the object into small pieces and put them on different nodes
- Faster recovery
- Faster download
- Hard to maintain
- Only for immutable data
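
A minimal sketch of splitting an object into pieces, assuming fixed-size pieces keyed by the SHA-1 digest of their content and a plain dictionary standing in for the DHT's put/get. The function names and the piece size are hypothetical. Keying pieces by content is one reason such a scheme suits immutable data: changing a piece changes its key.

```python
import hashlib
from typing import Dict, List


def split_and_store(data: bytes, piece_size: int, store: Dict[str, bytes]) -> List[str]:
    """Store fixed-size pieces under their content digest; return the key list."""
    keys = []
    for offset in range(0, len(data), piece_size):
        piece = data[offset:offset + piece_size]
        key = hashlib.sha1(piece).hexdigest()
        store[key] = piece  # in a real DHT each key would land on a different node
        keys.append(key)
    return keys


def fetch(keys: List[str], store: Dict[str, bytes]) -> bytes:
    """Fetch every piece, verify it against its key, and reassemble the object."""
    pieces = []
    for key in keys:
        piece = store[key]
        if hashlib.sha1(piece).hexdigest() != key:
            raise ValueError("piece does not match its digest: " + key)
        pieces.append(piece)
    return b"".join(pieces)


store: Dict[str, bytes] = {}
original = b"distributed hash table " * 10
keys = split_and_store(original, piece_size=64, store=store)
restored = fetch(keys, store)
print(len(keys), restored == original)
```

The pieces are independent, so they can be downloaded from several nodes at once, and a lost piece can be re-created without moving the whole object.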

Fetch: Transport Protocol

- Striped Transport Protocol
  - UDP
  - Window control
  - Retransmission

Availability

- Replication
  - Reactive / proactive
  - Eager / lazy repair
- Erasure coding
- Load balance is broken
  - High correlation between uptime and storage
- Maintenance traffic problem

Availability: Replicate

- Reactive
  - Duplicate when a copy is lost
  - Consumes a lot of bandwidth in a short time
  - Better when churn is low
- Proactive
  - Duplicate continually
  - Consumes a small, constant amount of bandwidth continually
  - Needs availability prediction and redundancy management
  - Bandwidth usage is predictable

Availability: Replicate

- Temporary / permanent churn
  - Availability <-> durability
  - Achieve 100% availability and/or durability?
- Eager repair
  - Duplicate immediately
- Lazy repair
  - Duplicate after a timeout
  - Needs a good choice of timeout
  - Reintegrate returning replicas
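
Lazy repair can be sketched as a tracker that counts a replica as lost only after its holder has been silent for longer than the timeout, and that counts the replica again as soon as the holder returns. The class, its method names, and the heartbeat mechanism are assumptions made for this illustration, not a description of any particular system.

```python
from typing import Dict, List


class LazyRepairTracker:
    """Decide how many new copies to create for one object (assumed design)."""

    def __init__(self, target_replicas: int, timeout: float) -> None:
        self.target_replicas = target_replicas
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}  # replica holder -> last contact

    def heartbeat(self, node: str, now: float) -> None:
        """Record contact with a holder; a returning holder is reintegrated."""
        self.last_seen[node] = now

    def live_replicas(self, now: float) -> List[str]:
        """Holders heard from within the timeout window."""
        return [n for n, seen in self.last_seen.items() if now - seen <= self.timeout]

    def copies_to_create(self, now: float) -> int:
        """Number of new copies needed to get back to the target."""
        return max(0, self.target_replicas - len(self.live_replicas(now)))


tracker = LazyRepairTracker(target_replicas=3, timeout=60.0)
for node in ("a", "b", "c"):
    tracker.heartbeat(node, now=0.0)
for node in ("a", "b"):
    tracker.heartbeat(node, now=40.0)
print(tracker.copies_to_create(now=50.0))   # c silent for 50 s: still within the timeout
for node in ("a", "b"):
    tracker.heartbeat(node, now=80.0)
print(tracker.copies_to_create(now=90.0))   # c silent for 90 s: treated as lost
tracker.heartbeat("c", now=95.0)             # c returns and is counted again
print(tracker.copies_to_create(now=95.0))
```

A short timeout behaves like eager repair and spends bandwidth on temporary failures; a long timeout saves bandwidth but leaves the object under-replicated for longer, which is why the choice of timeout matters.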

Availability: Erasure Coding

- Matters more for larger objects
- Saves storage and bandwidth
- Under high churn, the bandwidth consumption is still not acceptable
- Complex maintenance
- Download latency is heterogeneous
- Only for immutable data
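
The storage saving can be made concrete with the usual availability formulas, assuming each node is up independently with probability p. With r full replicas the object is available when at least one replica is up; with an (n, m) erasure code it is available when at least m of the n fragments are up. The value p = 0.9 and the code parameters below are assumptions chosen for illustration.

```python
from math import comb


def replication_availability(p: float, replicas: int) -> float:
    """Probability that at least one of the replicas is reachable."""
    return 1 - (1 - p) ** replicas


def erasure_availability(p: float, n: int, m: int) -> float:
    """Probability that at least m of n fragments are reachable."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(m, n + 1))


p = 0.9  # assumed per-node availability
# Both schemes below store twice the object size in total.
two_replicas = replication_availability(p, replicas=2)
coded_8_of_4 = erasure_availability(p, n=8, m=4)
print(f"2 replicas:           {two_replicas:.5f}")
print(f"8 fragments, any 4:   {coded_8_of_4:.5f}")
```

At the same storage overhead the coded object is available more often, or equivalently the same availability can be reached with less storage. The formula does not capture the costs listed above, such as fragment maintenance and the bandwidth needed to rebuild fragments under churn.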

Query Consistency

- If a digest-object mapping exists, the result of a query must be that object
- Weakly consistent KBR (key-based routing)
  - Eventual consistency
  - Most existing DHTs
- Strongly consistent KBR
  - Causal consistency
  - Strong consistency
- Solution
  - Route by W-KBR to a group
  - S-KBR within the group

Mutable DHT

- Objects stored in the DHT are mutable
  - Insert, update, delete
- Churn -> replicas
- New challenge …

Object Consistency

- For immutable data
  - May still be an issue for security reasons
  - Merkle tree
- For mutable data
  - Consensus algorithm
    - Distributed algorithm for data consistency
  - Quorum algorithm
    - Read / write locks
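
For the immutable case, a Merkle tree lets a reader verify fetched blocks against a single trusted root digest. The sketch below assumes SHA-256 and the convention of duplicating the last digest when a level has an odd number of entries; real systems differ in both choices, and `merkle_root` is a hypothetical name.

```python
import hashlib
from typing import List


def merkle_root(blocks: List[bytes]) -> bytes:
    """Compute a Merkle root over the blocks (SHA-256, assumed conventions)."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


blocks = [b"piece-0", b"piece-1", b"piece-2"]
trusted_root = merkle_root(blocks)

# A node that returns a modified block produces a different root.
tampered = [b"piece-0", b"piece-X", b"piece-2"]
print(merkle_root(tampered) == trusted_root)
```

Only the root has to come from a trusted source; the blocks themselves can be fetched from untrusted peers and checked afterwards.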

Pitfalls

- Different kinds of P2P systems have different properties
- Lack of
  - New real traces
  - A standard simulation platform

References

- Efficient Replica Maintenance for Distributed Storage Systems
- Proactive replication for data durability
- On object Maintenance in Peer-to-Peer systems
- Enforcing Routing Consistency in Structured Peer-to-peer Overlays: Should We and Could We?
- High Availability in DHTs: Erasure Coding vs. Replication
- Toward Fault-tolerant Atomic Data Access in Mutable Distributed Hash Tables
- Kademlia: A Peer-to-peer Information System Based on the XOR Metric
- Total Recall: System Support for Automated Availability Management
- Designing a DHT for low latency and high throughput

References

- Fallacies in evaluating decentralized systems
- Anatomy of a P2P Content Distribution system with Network Coding
- Comparing the performance of distributed hash tables under churn
- EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State management
- Bandwidth-efficient management of DHT routing tables
- Improving Lookup Performance over a Widely-Deployed DHT
- Failure Recovery for Structured P2P Networks: Protocol Design and Performance Evaluation
- Handling Churn in a DHT