![Page 1: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/1.jpg)
A Survey of Peer-to-Peer Content Distribution Technologies
Stephanos Androutsellis-Theotokis and Diomidis SpinellisACM Computing Surveys, December 2004
Presenter: Seung-hwan Baek
Ja-eun Choi
![Page 2: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/2.jpg)
Outline• Overview of P2P
– P2P Motivation– P2P Characteristics & Benefits– P2P Application Types
• P2P Classification– Unstructured: Gnutella, Kazaa, Napster– Structured: Freenet, Chord, CAN, Tapestry
• Other Aspects• Conclusions
2/50
![Page 3: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/3.jpg)
P2P Motivation
Client/Server Architecture:• Well known, powerful, reliable server is a data source• Clients request data from server• Very successful model WWW (HTTP), FTP, Web services, etc.
3/50
![Page 4: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/4.jpg)
P2P Motivation (Cont’d)
Client/Server Limitation:• Scalability is hard to achieve• Presents a single point of failure• Requires administration• Unused resources at the network edge
P2P systems try to address these limitations
4/50
![Page 5: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/5.jpg)
P2P Characteristics
P2P Computing:• P2P computing is the sharing of computer resources and ser-
vices by direct exchange between systems.• These resources and services include the exchange of infor-
mation, processing cycles, cache storage, and disk storage for files.
• P2P computing takes advantage of existing computing power, computer storage and networking connectivity, allowing users to leverage their collective power to the ‘benefit’ of all.
5/50
![Page 6: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/6.jpg)
P2P Characteristics (Cont’d)
P2P Characteristics:• All nodes are both clients and servers
– Provide and consume data– Any node can initiate a connection
• No centralized data source Nodes collaborate directly with each other (not through well-known servers)
• Network is dynamic Nodes enter and leave the network “frequently”
6/50
![Page 7: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/7.jpg)
P2P Benefits• Ease of administration
– Nodes self-organize adaptively– No need to deploy servers to satisfy demand (c.f. scalability)– Built-in fault tolerance, replication, and load balancing
• Scalability– Consumers of resources also donate resources– Aggregate resources grow naturally with utilization
• Reliability– Geographic distribution– No single point of failure
7/50
![Page 8: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/8.jpg)
P2P Application Types• Direct real-time communication: instant messaging• Combine processing power of multiple distributed machines
to perform complex computations: analysis of SETI data, prime computation
• Distributed database systems• Store and distribute digital content: mp3 file sharing
(Content Distribution)
8/50
![Page 9: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/9.jpg)
P2P Classification
Architecture Types:• Unstructured• Structured• Loosely structured
Here,By structure, we refer to whether overlay network is created non-deterministically or whether it’s created based on a specific rules
9/50
![Page 10: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/10.jpg)
P2P Classification (Cont’d)
10/50
Unstructured Loosely Structured
Highly Structured
Hybrid Napster, IM
Partial Kazaa, Gia
None Gnutella Freenet Chord, CANCen
traliz
atio
n
Data organization
![Page 11: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/11.jpg)
Unstructured Architectures• Placement of content is unrelated to overlay topology• Search mechanism is required.• Appropriate for case of highly-transient node population
Degrees of centralization:• Purely Decentralized• Partially Centralized• Hybrid Decentralized
11/50
![Page 12: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/12.jpg)
Purely Decentralized
12/50
• Purely Decentralized– No central coordination– Users (servents) connect to
each other directly.• Gnutella architecture
– Query: Flooding• Send messages to all neighbors
– Response: Route back• Scalability Issues
– With TTL, virtual horizon– Without TTL, unlimited flooding
• E.g., Gnutella, FreeHaven
registrationregistration
registrationregistration
registration
query
query
query
query query
replyreply
request download
query
![Page 13: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/13.jpg)
Partially Centralized
13/50
• Partially Centralized– Supernodes
• Indexing & caching files of small subpart of the peer network
• Peers are automatically elected to become supernodes.
• Advantages– Reduced discovery time– Normal nodes will be lightly
loaded.• E.g., Kazaa, Edutella, Gnutella
(later version)
registration
queryreply
query reply
request
download
![Page 14: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/14.jpg)
Hybrid Decentralized
14/50
• Hybrid Decentralized– Central directory server
• User connection info.• File & metadata info.
• Advantages– Simple to implement– Locate files quickly and effi-
ciently• Disadvantages
– Vulnerable to technical failure– Inherently unscalable
• E.g., Napster, Publius
resigtration
query
reply
request
download
![Page 15: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/15.jpg)
Outline• Overview of P2P
– P2P Motivation– P2P Characteristics & Benefits– P2P Application Types
• P2P Classification– Unstructured: Gnutella, Kazaa, Napster– Structured: Freenet, Chord, CAN, Tapestry
• Other Aspects• Conclusions
15/50
![Page 16: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/16.jpg)
Structured Architectures• Features
– Mapping of content and location– Scalable solution for exact-match queries
• Examples– Freenet– Chord– CAN– Tapestry
![Page 17: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/17.jpg)
Freenet• Loosely Structured System
– Chain mode propagation• Each node
– Local data store– Dynamic routing table
• ( node address, file key )
• Each file– Unique binary key
![Page 18: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/18.jpg)
Freenet (Cont’d)• Messages
– Node ID, Timeout, Src ID, Dst ID• Message types
– Data insert : key, data– Data request : key– Data reply : file– Data filed : failure location, reason
![Page 19: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/19.jpg)
Freenet (Cont’d)• Data Insert
– Calculates a binary key– Sends a data insert message to itself
• Receiving a Data Insert message– If not taken
• Store the data• Forwards to the closest key’s owner
– If taken• Returns the preexisting file
![Page 20: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/20.jpg)
Freenet (Cont’d)• Data Request
– Chain mode propagation• Receiving a Data Request
– If locally stored• The search stops and the data is forwarded back
– If not• Forwards to the closest key’s owner
![Page 21: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/21.jpg)
Freenet (Cont’d)• Data Fail
– Timeout (hops-to-live)• Receiving a Data Failed Message
– Forwards the request to the next best node– After failed through all neighbors,
Sends back data filed message to the request sender
![Page 22: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/22.jpg)
Freenet (Cont’d)• Data Reply
– Includes the actual data – Passed back through the chain– The data is cached in all intermediate nodes
• A subsequent request w/ the same key → served immediately
• A request for a similar key → forwarded to the node that previously provided the data
![Page 23: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/23.jpg)
Freenet (Cont’d)• Indirect Files
– A special class of lightweight files– Named according to search keywords– Contain pointers to the real file– Multiple files w/ the same key
![Page 24: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/24.jpg)
Freenet (Cont’d)• Indirect Files
![Page 25: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/25.jpg)
Freenet (Cont’d)• Properties
– Nodes specialize in searching for similar keys– Nodes store similar keys– Similarity of keys does not reflect similarity of files– Routing does not reflect the underlying network
topology
![Page 26: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/26.jpg)
Chord• Nodes and Files are identified by keys
– m-bit identifiers– a deterministic hash function
• Mapping File ID onto Node ID– Nodes store (key, data item) pairs
![Page 27: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/27.jpg)
Chord (Cont’d)• A Chord Identifier Circle
![Page 28: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/28.jpg)
Chord (Cont’d)• Simple Key Location
![Page 29: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/29.jpg)
Chord (Cont’d)• Scalable Key Location
![Page 30: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/30.jpg)
Chord (Cont’d)• Simple Key Location
– Routing Information: Successor pointer– O( n )
• Scalable Key Location– Routing Information: Finger Table– O( logn )
![Page 31: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/31.jpg)
Chord (Cont’d)• Node Joining
– Certain keys assigned to its successor are reas-signed to it
• Node Departing
– Keys are reassigned to its successor
![Page 32: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/32.jpg)
Chord (Cont’d)• Node Joining
– N26 joins the network
![Page 33: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/33.jpg)
CAN Content Addressable Network
• Hash Table– Maps file names to their location– ( key K, value V ) pairs stored– Each node storing a part of the hash table
• A “zone”
![Page 34: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/34.jpg)
CAN (Cont’d)• Virtual coordinate space
– A zone corresponds to a segment of space– Key K is mapped onto a point P
• A deterministic function– ( K, V ) is stored at the node responsible for P
![Page 35: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/35.jpg)
CAN (Cont’d)• Virtual coordinate space
![Page 36: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/36.jpg)
CAN (Cont’d)• Retrieve
– Map K to P– Retrieve the value from the node covering P
• Routing– Request is routed to the node covering P– Nodes maintain a routing table
• Addresses of Nodes holding adjoining zones– Following the straight line path in the space
![Page 37: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/37.jpg)
CAN (Cont’d)• Routing
![Page 38: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/38.jpg)
CAN (Cont’d)• Node Joining
– Allocatedits own portion of the space• By splitting the zone of an existing node
• Node Departing– Hand over hash table entries to one of its neigh-
bors
![Page 39: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/39.jpg)
Tapestry• Location and Routing Infrastructure
– Self Administeration– Fault Tolerance– Stability
• By bypassing failed routes and nodes
• Plaxton Mesh– Routing mechanism– Location mechanism
![Page 40: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/40.jpg)
Tapestry (Cont’d)• Routing Mechanism
– Neighbor Maps• Local routing maps• Incrementally route messages • Multiple levels
– Level l → node ID matched w/ l digits• Multiple entries
– The number equals to the base of the ID• Pointer to the closest node in the network
![Page 41: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/41.jpg)
Tapestry (Cont’d)• Neighbor Map of Node w/ ID 67493
![Page 42: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/42.jpg)
Tapestry (Cont’d)• Routing Path from 67493 to 34567
– xxxx7 → xxx67 → xx567 → x4567→ 34567
![Page 43: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/43.jpg)
Tapestry (Cont’d)• Location Mechanism
– Root node• Provide a guaranteed node from which the object can
be located• Assigned when an object is inserted
– A globally consistent deterministic algorithm
– When inserted• Server node Ns, object O, root node Nr• Message routed to Ns to Nr• (O, Ns) stored along the routing path
![Page 44: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/44.jpg)
Tapestry (Cont’d)• Location Mechanism
– Location query• Messages destined for O• Initially routed toward to Nr• Meet a node containing (O, Ns) mapping
![Page 45: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/45.jpg)
Tapestry (Cont’d)• Advantages of Plexton Mesh
– Simple fault-handling• Routing by choosing a node w/ a similar suffix
– Scalability• w/ the only bottleneck (root nodes)
• Limitations– The need for global knowledge
• Assigning and identifying root nodes– The vulnerability of the root nodes
![Page 46: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/46.jpg)
Tapestry (Cont’d)• Extending Plaxton mesh’s Design
– Plaxton mesh assumes a static node population– Tapestry adapts it to the transient population
• Adaptibility• Fault tolerance• Optimizations
![Page 47: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/47.jpg)
Tapestry (Cont’d)• Optimizations
– Back-pointers for dynamic node insertion– Flexible concept of distance between nodes– Maintain cached content for failures– Multiple roots to each object– Adapt to environment changes
![Page 48: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/48.jpg)
Other Aspects• Content Caching, Replication and Migration• Security• Provisions for Anonymity• Provisions for Deniability• Incentive Mechanisms and Accountability• Resource Management Capability• Semantic Grouping of Information
![Page 49: A Survey of Peer-to-Peer Content Distribution Technologies](https://reader035.vdocuments.net/reader035/viewer/2022081419/56816622550346895dd979cd/html5/thumbnails/49.jpg)
Conclusions• Study of P2P Content Distribution Systems
– Properties– Design features
• Location and routing algorithms– Two Categories
• Unstructured system• Structured system
– Remains Open Research Problems