1 school of computing science simon fraser university cmpt 771/471: overlay networks and p2p systems...
TRANSCRIPT
![Page 1: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/1.jpg)
1
School of Computing Science
Simon Fraser University
CMPT 771/471: Overlay Networks CMPT 771/471: Overlay Networks and P2P Systemsand P2P Systems
Instructor: Dr. Mohamed HefeedaInstructor: Dr. Mohamed Hefeeda
![Page 2: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/2.jpg)
2
P2P Computing: Definitions
Peers cooperate to achieve desired functions- Peers:
• End-systems (typically, user machines)
• Interconnected through an overlay network
• Peer ≡ Like the others (similar or behave in similar manner)
- Cooperate: • Share resources, e.g., data, CPU cycles, storage, bandwidth
• Participate in protocols, e.g., routing, replication, …
- Functions: • File-sharing, distributed computing, communications,
content distribution, streaming, …
Note: the P2P concept is much wider than file sharing
![Page 3: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/3.jpg)
3
When Did P2P Start?
Napster (Late 1990’s)- Court shut Napster down in 2001
Gnutella (2000) Then the killer FastTrack (Kazaa, ...) Now BitTorrent, and many others Accompanied by significant research interest Claim
- P2P is much older than Napster!
Proof- The original Internet!
- Remember UUCP (unix-to-unix copy)?
![Page 4: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/4.jpg)
4
What IS and IS NOT New in P2P?
What is not new- Concepts!
What is new- The term P2P (may be!)
- New characteristics of • Nodes which constitute the
• System that we build
![Page 5: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/5.jpg)
5
What IS NOT New in P2P?
Distributed architectures Distributed resource sharing Node management (join/leave/fail) Group communications Distributed state management ….
![Page 6: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/6.jpg)
6
What IS New in P2P?
Nodes (Peers)- Quite heterogeneous
• Several order of magnitudes difference in resources
• Compare bandwidth of dial-up peer vs high-speed LAN peer
- Unreliable• Failure is the norm!
- Offer limited capacity• Load sharing and balancing are critical
- Autonomous• Rational, i.e., maximize their own benefits!
• Motivations should be provided to peers to cooperate in a way that optimizes the system performance
![Page 7: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/7.jpg)
7
What IS New in P2P? (cont’d)
System - Scale
• Numerous number of peers (millions)
- Structure and topology• Ad-hoc: No control over peer joining/leaving
• Highly dynamic
- Membership/participation• Typically open
- More security concerns• Trust, privacy, data integrity, …
- Cost of building and running• Small fraction of same-scale centralized systems
• How much would it cost to build/run a super computer with processing power of that 3 Million SETI@Home PCs?
![Page 8: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/8.jpg)
8
What IS New in P2P? (cont’d)
So what? We need to design new lighter-weight
algorithms and protocols to scale to millions (or billions!) of nodes given the new characteristics
Question: why now, not two decades ago?- We did not have such abundant (and
underutilized) computing resources back then!
- And, network connectivity was very limited
![Page 9: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/9.jpg)
9
Why is it Important to Study P2P?
P2P traffic is a major portion of Internet traffic (50+%), current killer app
P2P traffic has exceeded web traffic (former killer app)!
Direct implications on the design, administration, and use of computer networks and network resources
- Think of ISP designers or campus network administrators
Many potential distributed applications
![Page 10: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/10.jpg)
10
Sample P2P Applications
File sharing- BitTorrent, Overnet, eDonkey, Gnutella,, …
Distributed cycle sharing- SETI@home, Gnome@home, …
File and storage systems- OceanStore, CFS, Freenet, Farsite, …
Media streaming and content distribution- SopCast, CoolStreaming, …
- SplitStream, CoopNet, PeerCast, Bullet, Zigzag, NICE, …
![Page 11: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/11.jpg)
11
P2P vs. its Cousin (Grid Computing)
Common Goal:- Aggregate resources (e.g., storage, CPU
cycles, and data) into common pool and provide efficient access to them
Differences along five axes- Target communities and applications
- Type of shared resources
- Scalability of the system
- Services provided
- Software required
![Page 12: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/12.jpg)
12
P2P vs Grid Computing (cont’d)
Issue Grid P2P
Communities and Applications
Established communities, e.g., scientific institutions Computationally-intensive problems
Grass-root communities (anonymous) Mostly, file-swapping
Resources Shared
Powerful and Reliable machines, clusters High-speed connectivity Specialized instruments
PCs with limited capacity and connectivity Unreliable Very diverse
![Page 13: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/13.jpg)
13
P2P vs Grid Computing (cont’d)
Issue Grid P2P
System Scalability
Hundreds to thousands of nodes
Hundreds of thousands to Millions of nodes
Services Provided
Sophisticated services: authentication, resource discovery, scheduling, access control, and membership control Members usually trust each others
Limited services: resource discovery limited trust among peers
Software required
Sophisticated suit: e.g., Globus, Condor
Simple:, e.g., BitTorrent, SETI@Home (screen saver)
![Page 14: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/14.jpg)
14
P2P vs Grid Computing: Discussion
The differences mentioned are based on the traditional view of each paradigm
- It is conceived that both paradigms will converge and will complement each other
Target communities and applications- Grid: is going open
Type of shared resources- P2P: is to include various and more powerful resources
Scalability of the system- Grid: is to increase number of nodes
Services provided- P2P: is to provide authentication, data integrity, trust
management, …
![Page 15: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/15.jpg)
15
P2P Systems: Simple Model
P2P Substrate
Operating System
Hardware
Middleware
P2P Application
Software architecture model on a peer
System architecture: Peers form an overlay according to the P2P
Substrate
![Page 16: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/16.jpg)
16
Overlay Network
An abstract layer built on top of physical network
Neighbors in overlay can be several hops away in physical network
![Page 17: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/17.jpg)
17
Overlay Network (cont’d)
![Page 18: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/18.jpg)
18
Overlay Network (cont’d)
Why do we need overlays? Flexibility in
- Choosing neighbors
- Forming and customizing topology to fit application’s needs (e.g., short delay, reliability, high BW, …)
- Designing communication protocols among nodes
Get around limitations in legacy networks Enable new (and old!) network services
![Page 19: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/19.jpg)
19
Overlay Network (cont’d)
Overlay design issues- Select neighbors
- Handle node arrivals, departures
- Detect and handle failures (nodes, links)
- Monitor and adapt to network dynamics
- Match with the underlying physical network
![Page 20: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/20.jpg)
20
Overlay Network (cont’d)
Some applications that use overlays- Application level multicast, e.g., ESM, Zigzag, NICE, …
• Build multicast tree(s) or mesh(es) in the application (not network) layer
- Reliable inter-domain routing, e.g., RON• Improves BGP by finding robust routes faster
- Content Distribution Networks (CDN)• To distribute bandwidth intensive content (software
updates,…)
- Peer-to-peer file sharing • File exchange among peers
- P2P streaming• Real time streaming
![Page 21: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/21.jpg)
21
Overlay Network (cont’d)
Example application … Application Level Multicast (ALM)
Let us first see IP Multicast
![Page 22: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/22.jpg)
22
Overlay Network (cont’d) Recall: IP Multicast
source
![Page 23: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/23.jpg)
23
Overlay Network (cont’d)
IP Multicast
- Most efficient (packets traverse each link only once)
What is wrong with IP Multicast?
- Not enabled in many routers- Not scalable (core routers need to maintain state for
multicast sessions)
Now let us see ALM …
![Page 24: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/24.jpg)
24
Overlay Network (cont’d)Application Level Multicast (ALM)
source
![Page 25: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/25.jpg)
25
Overlay Network (cont’d)
Several algorithms have been proposed to improve the efficiency of ALM
- Get it as close as possible to IP Multicast
- See ESM, NICE, Zigzag papers
![Page 26: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/26.jpg)
26
Peer Software Model
A software client installed on each peer
Three components:
- P2P Substrate
- Middleware
- P2P Application
P2P Substrate
Operating System
Hardware
Middleware
P2P Application
Software model on peer
![Page 27: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/27.jpg)
27
Peer Software Model (cont’d)
P2P Substrate (key component)- Overlay management
• Construction
• Maintenance (peer join/leave/fail and network dynamics)
- Resource management• Allocation (storage)
• Discovery (routing and lookup)
Ex: Pastry, CAN, Chord, …
![Page 28: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/28.jpg)
28
Peer Software Model (cont’d)
Middleware- Provides auxiliary services to P2P applications:
• Peer selection
• Trust management
• Data integrity validation
• Authentication and authorization
• Membership management
• Accounting (Economics and rationality)
• …
- Ex: CollectCast, EigenTrust, Micro payment
![Page 29: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/29.jpg)
29
Peer Software Model (cont’d)
P2P Application- Potentially, there could be multiple applications
running on top of a single P2P substrate
- Applications include• File sharing
• File and storage systems
• Distributed cycle sharing
• Content distribution
- This layer provides some functions and bookkeeping relevant to target application
• File assembly (file sharing)
• Buffering and rate smoothing (streaming)
Ex: Promise, Bullet, CFS
![Page 30: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/30.jpg)
30
P2P Substrate
Key component, which - Manages the Overlay
- Allocates and discovers objects
P2P Substrates can be - Structured
- Unstructured
- Based on the flexibility of placing objects at peers
![Page 31: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/31.jpg)
31
P2P Substrates: Classification
Structured (or tightly controlled, DHT) − Objects are rigidly assigned to specific peers
− Looks like as a Distributed Hash Table (DHT)
− Efficient search & guarantee of finding
− Lack of partial name and keyword queries
− Maintenance overhead
− Ex: Chord, CAN, Pastry, Tapestry, Kademila (Overnet)
![Page 32: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/32.jpg)
32
P2P Substrates: Classification
Unstructured (or loosely controlled)− Objects can be anywhere
− Support partial name and keyword queries
− Inefficient search & no guarantee of finding
− Some heuristics exist to enhance performance
− Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al. 03]
![Page 33: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/33.jpg)
33
Structured P2P Substrates
Objects are rigidly assigned to peers− Objects and peers have IDs (usually by
hashing some attributes)
− Objects are assigned to peers based on IDs
Peers in overlay form specific geometrical shape, e.g.,
- tree, ring, hypercube, butterfly network
Shape (to some extent) determines − How neighbors are chosen, and
− How messages are routed
![Page 34: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/34.jpg)
34
Structured P2P Substrates (cont’d)
Substrate provides a Distributed Hash Table (DHT)-like interface
− InsertObject (key, value), findObject (key), …
− In the literature, many authors refer to structured P2P substrates as DHTs
It also provides peer management (join, leave, fail) operations
Most of these operations are done in O(log n) steps, n is number of peers
![Page 35: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/35.jpg)
35
Structured P2P Substrates (cont’d)
DHTs: Efficient search & guarantee of finding
However, − Lack of partial name and keyword queries
− Maintenance overhead, even O(log n) may be too much in very dynamic environments
Ex: Chord, CAN, Pastry, Tapestry, Kademila (Overnet)
![Page 36: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/36.jpg)
36
Example: Content Addressable Network (CAN) [Ratnasamy 01]
− Nodes form an overlay in d-dimensional space− Node IDs are chosen randomly from the d-space
− Object IDs (keys) are chosen from the same d-space
− Space is dynamically partitioned into zones
− Each node owns a zone
− Zones are split and merged as nodes join and leave
− Each node stores− Portion of the hash table that belongs to its zone
− Information about its immediate neighbors in the d-space
![Page 37: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/37.jpg)
37
2-d CAN: Dynamic Space Division
n1
n2
n3
n4
0
0
7
7
n5
![Page 38: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/38.jpg)
38
2-d CAN: Key Assignment
n1
n2
n3
n4
0
0
7
7
K1K2
K3
K4
n5
![Page 39: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/39.jpg)
39
2-d CAN: Routing (Lookup)
n1
n2
n3
n4
0
0
7
7
K1K2
K3
K4
n5
K4?
K4?
![Page 40: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/40.jpg)
40
CAN: Routing
− Nodes keep 2d = O(d) state information (neighbor coordinates, IPs)
− Constant, does not depend on number of nodes n
− Greedy routing - Route to the node that is closest to the
destination
- On average, is done in O(n1/d) = O(log n) when d = log n /2
![Page 41: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/41.jpg)
41
CAN: Node Join
− New node finds a node already in the CAN− (bootstrap: one (or a few) dedicated nodes outside the
CAN maintain a partial list of active nodes)
− It finds a node whose zone will be split− Choose a random point P (will be its ID)
− Forward a JOIN request to P through the existing node
− The node that owns P splits its zone and sends half of its routing table to the new node
− Neighbors of the split zone are notified
![Page 42: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/42.jpg)
42
CAN: Node Leave, Fail
− Graceful departure− The leaving node hands over its zone to one of its
neighbors − Failure
− Detected by the absence of heart beat messages sent periodically in regular operation
− Neighbors initiate takeover timers, proportional to the volume of their zones
− Neighbor with smallest timer takes over zone of dead node
− notifies other neighbors so they cancel their timers (some negotiation between neighbors may occur)
− Note: the (key, value) entries stored at the failed node are lost
− Nodes that insert (key, value) pairs periodically refresh (or re-insert) them
![Page 43: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/43.jpg)
43
CAN: Discussion
− Scalable − O(log n) steps for operations − State information is O(d) at each node
− Locality − Nodes are neighbors in the overlay, not in the physical
network− Suggestion (for better routing)
− Each node measures RTT between itself and its neighbors− Forwards the request to the neighbor with maximum ratio
of progress to RTT
![Page 44: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/44.jpg)
44
CAN: Discussion
− What is wrong with CAN (and DHTs in general)?
− Maintenance cost− Although logarithmic in number of nodes, still too much
for very dynamic P2P systems
− Peers are joining and leaving all the time
![Page 45: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/45.jpg)
45
Unstructured P2P Substrates
− Objects can be anywhere Loosely-controlled overlays
− The loose control− Makes overlay tolerate transient behavior of nodes
− For example, when a peer leaves, nothing needs to be done because there is no structure to restore
− Enables system to support flexible search queries− Queries are sent in plain text and every node runs a mini-
database engine
− But, we loose on searching− Usually using flooding, inefficient− Some heuristics exist to enhance performance− No guarantee on locating a requested object (e.g., rarely
requested objects)− Ex: Gnutella, Kazaa (super node), GIA [Chawathe et al.
03]
![Page 46: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/46.jpg)
46
Example: Gnutella
− Peers are called servents− All peers form an unstructured overlay− Peer join
− Find an active peer already in Gnutella (e.g., contact known Gnutella hosts)
− Send a Ping message through the active peer− Peers willing to accept new neighbors reply with Pong
− Peer leave, fail− Just drop out of the network!
− To search for a file− Send a Query message to all neighbors with a TTL (=7)− Upon receiving a Query message
− Check local database and reply with a QueryHit to requester
− Decrement TTL and forward to all neighbors if nonzero
![Page 47: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/47.jpg)
Flooding in Gnutella
Scalability Problem
47
![Page 48: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/48.jpg)
48
Heuristics for Searching [Yang and Garcia-Molina 02]
− Iterative deepening− Multiple BFS with increasing TTLs
− Reduce traffic but increase response time
− Directed BFS− Send to “good” neighbors (subset of your neighbors
that returned many results in the past) need to keep history
− Local Indices− Keep a small index over files stored on neighbors
(within number of hops)
− May answer queries on behalf of them
− Save cost of sending queries over the network
− Index currency?
![Page 49: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/49.jpg)
49
Heuristics for Searching: Super Node
− Used in recent Gnutella-like networks
− Relatively powerful nodes play special role− maintain indexes over other peers
Super Node (SN)
Ordinary Node (ON)
![Page 50: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/50.jpg)
50
Super Node Systems
− File search− ON sends a query to its SN
− SN replies with a list of IPs of ONs that have the file
− SN may forward the query to other SNs
− Parallel downloads take place between ONs
![Page 51: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/51.jpg)
51
Super Node Systems
− Two types of traffic− Signaling
− Handshaking, connection establishment, uploading metadata, …
− Over TCP connections between SN—SN and SN—ON
− Content traffic − Files exchanged
− Mostly through HTTP between ON—ON
![Page 52: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/52.jpg)
52
Lessons from Deployed P2P Systems
− Distributed design− Exploit heterogeneity− Load balancing − Locality in neighbor selection− Connection Shuffling
− If a peer searches for a file and does not find it, it may try later and gets it!
− Efficient gossiping algorithms− To learn about other SNs and perform shuffling
− Consider peers behind NATs and Firewalls− They are everywhere!
![Page 53: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/53.jpg)
53
P2P Systems: Summary
P2P is an active research area with many potential applications in industry and academia
In P2P computing paradigm:- Peers cooperate to achieve desired functions
New characteristics- heterogeneity, unreliability, rationality, scale, ad hoc
new and lighter-weight algorithms are needed
Simple model for P2P systems:- Peers form an abstract layer called overlay
- A peer software client may have three components• P2P substrate, middleware, and P2P application
• Borders between components may be blurred
![Page 54: 1 School of Computing Science Simon Fraser University CMPT 771/471: Overlay Networks and P2P Systems Instructor: Dr. Mohamed Hefeeda](https://reader034.vdocuments.net/reader034/viewer/2022042616/56649e4c5503460f94b41621/html5/thumbnails/54.jpg)
54
Summary (cont’d)
P2P substrate: A key component, which - Manages the Overlay
- Allocates and discovers objects
P2P Substrates can be - Structured (DHT)
• Example: CAN
- Unstructured• Example: Gnutella