1
P2P Overlay Networks
By
Behzad Akbari Spring 2011
These slides are based in part on the slides of J. Kurose (UMASS) and Sanjay Rao
2
Outline
Overview of P2P overlay networks
Applications of overlay networks
Unstructured overlay networks
Structured overlay networks
Overlay multicast networks
P2P media streaming networks
3
Overview of P2P overlay networks
What is a P2P system?
P2P refers to applications that take advantage of resources (storage, cycles, content, bandwidth, human presence) available at the end systems of the Internet.
What is an overlay network?
Overlay networks refer to networks that are constructed on top of another network (e.g. IP).
What is a P2P overlay network?
Any overlay network that is constructed by Internet peers at the application layer on top of the IP network.
4
Overview of P2P overlay networks
P2P overlay network properties
Efficient use of resources
Self-organizing
All peers organize themselves into an application-layer network on top of IP.
Scalability
Consumers of resources also donate resources
Aggregate resources grow naturally with utilization
Reliability
No single point of failure
Redundant overlay links between the peers
Redundant data sources
Ease of deployment and administration
The nodes are self-organized
No need to deploy servers to satisfy demand.
Built-in fault tolerance, replication, and load balancing
No changes needed in the underlying IP network
5
Applications of P2P overlay networks
P2P file sharing
Napster, Gnutella, KaZaA, eMule, eDonkey, BitTorrent, etc.
Application-layer multicasting
P2P media streaming
Content distribution
Distributed caching
Distributed storage
Distributed backup systems
Grid computing
6
Classification of P2P overlay networks
Unstructured overlay networks
The overlay network organizes peers in a random graph, in a flat or hierarchical manner.
Structured overlay networks
Are based on Distributed Hash Tables (DHTs)
The overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer
Overlay multicast networks
The peers organize themselves into an overlay tree for multicasting.
P2P media streaming networks
Used for large-scale media streaming applications (e.g. IPTV) over the Internet.
7
Unstructured P2P File Sharing Networks
Centralized directory-based P2P systems
Pure P2P systems
Hybrid P2P systems
8
Unstructured P2P File Sharing Networks (…)
Centralized directory-based P2P systems
All peers are connected to a central entity
Peers establish connections between each other on demand to exchange user data (e.g. mp3 compressed data)
The central entity is necessary to provide the service
The central entity is some kind of index/group database
The central entity is a lookup/routing table
Examples: Napster, BitTorrent
9
Unstructured P2P File Sharing Networks (…)
Pure P2P systems
Any terminal entity can be removed without loss of functionality
No central entities employed in the overlay
Peers establish connections between each other randomly
To route request and response messages
To insert request messages into the overlay
Examples: Gnutella, FreeNet
10
Unstructured P2P File Sharing Networks (…)
Hybrid P2P systems
Main characteristic, compared to pure P2P: introduction of another dynamic hierarchical layer
Election process to select and assign super peers
Super peers: high degree (degree >> 20, depending on network size)
Leaf nodes: connected to one or more super peers (degree < 7)
Example: KaZaA
(Figure: super peers interconnected in the overlay, each with attached leaf nodes.)
11
Centralized Directory-based P2P systems: Napster
Napster history: the rise
January 1999: Napster version 1.0
May 1999: company founded
December 1999: first lawsuits
2000: 80 million users
Napster history: the fall
Mid 2001: out of business due to lawsuits
Mid 2001: dozens of P2P alternatives that were harder to touch, though these have gradually been constrained
12
Napster Technology: Directory Service
User installs the software
Downloads the client program
Registers name, password, local directory, etc.
Client contacts Napster (via TCP)
Provides a list of music files it will share
… and Napster's central server updates the directory
Client searches on a title or performer
Napster identifies online clients with the file
… and provides their IP addresses
Client requests the file from the chosen supplier
Supplier transmits the file to the client
Both client and supplier report status to Napster
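The directory service described above can be sketched as a simple in-memory index mapping titles to the online clients that hold them; the class and method names here are hypothetical, not Napster's actual protocol.

```python
class NapsterDirectory:
    """Toy central directory: clients register shared files; searches
    return the addresses of online clients holding a match."""
    def __init__(self):
        self.index = {}   # lower-cased title -> set of client addresses

    def register(self, addr, titles):
        for t in titles:
            self.index.setdefault(t.lower(), set()).add(addr)

    def unregister(self, addr):
        # Called when a client goes offline: remove it from every entry.
        for holders in self.index.values():
            holders.discard(addr)

    def search(self, title):
        return sorted(self.index.get(title.lower(), set()))

d = NapsterDirectory()
d.register("10.0.0.1:8888", ["Song A"])
d.register("10.0.0.2:8888", ["Song A", "Song B"])
print(d.search("song a"))  # ['10.0.0.1:8888', '10.0.0.2:8888']
```

The actual file transfer then happens peer-to-peer, outside the directory.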
(Figure: a centralized directory server with peers, including Alice and Bob; numbered arrows show directory updates, search, and the direct file transfer.)
13
Napster Technology: Properties
Server's directory continually updated
Always know what music is currently available
Point of vulnerability for legal action
Peer-to-peer file transfer
No load on the server
Plausible deniability for legal action (but not enough)
Proprietary protocol
Login, search, upload, download, and status operations
No security: clear-text passwords and other vulnerabilities
Bandwidth issues
Suppliers ranked by apparent bandwidth & response time
14
Napster: Limitations of Central Directory
Single point of failure
Performance bottleneck
Copyright infringement
File transfer is decentralized, but locating content is highly centralized
So, later P2P systems were more distributed
Gnutella went to the other extreme…
15
Pure P2P system: Gnutella
Gnutella history
2000: J. Frankel & T. Pepper released Gnutella
Soon after: many other clients (e.g., Morpheus, LimeWire, BearShare)
2001: protocol enhancements, e.g., “ultrapeers”
Query flooding
Join: contact a few nodes to become neighbors
Publish: no need!
Search: ask neighbors, who ask their neighbors
Fetch: get file directly from another node
16
Gnutella: Query Flooding
Fully distributed No central server
Public domain protocol
Many Gnutella clients implementing protocol
Overlay network: a graph
Edge between peers X and Y if there's a TCP connection between them
All active peers and edges form the overlay network
A given peer will typically be connected with < 10 overlay neighbors
17
Gnutella: Protocol
Query messages sent over existing TCP connections
Peers forward Query messages
QueryHit sent over the reverse path
(Figure: Query messages flood outward through the overlay; a QueryHit returns along the reverse path.)
File transfer: HTTP
Scalability: limited-scope flooding
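The limited-scope flooding above can be sketched as a breadth-first walk that decrements a TTL at each hop; the overlay graph, node names, and TTL values here are illustrative assumptions, not part of the Gnutella wire protocol.

```python
from collections import deque

def flood_query(graph, origin, ttl):
    """Return the set of peers a query reaches when `origin` floods it
    with the given TTL over `graph` (an adjacency dict)."""
    reached = set()
    seen = {origin}                 # suppress duplicate forwarding
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:          # TTL exhausted: stop forwarding
            continue
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached

# Small overlay: a chain A-B-C-D plus an A-E edge.
overlay = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
           "D": ["C"], "E": ["A"]}
print(flood_query(overlay, "A", 2))  # B and E at 1 hop, C at 2 hops
```

Raising the TTL widens the search scope but multiplies the message overhead, which is exactly the scalability trade-off the slide points at.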
18
Gnutella: Peer Joining
Joining peer X must find some other peers
Start with a list of candidate peers
X sequentially attempts TCP connections with peers on the list until a connection is set up with Y
X sends a Ping message to Y
Y forwards the Ping message
All peers receiving the Ping message respond with a Pong message
X receives many Pong messages
X can then set up additional TCP connections
19
Gnutella: Pros and Cons
Advantages
Fully decentralized
Search cost distributed
Processing per node permits powerful search semantics
Disadvantages
Search scope may be quite large
Search time may be quite long
High overhead, and nodes come and go often
20
Hybrid P2P system: KaZaA
KaZaA history: 2001: created by Dutch company Kazaa BV
Smart query flooding
Join: on start, the client contacts a super-node (and may later become one)
Publish: client sends list of files to its super-node
Search: send query to super-node, and the super-nodes flood queries among themselves
Fetch: get file directly from peer(s); can fetch from multiple peers at once
21
KaZaA: Exploiting Heterogeneity
Each peer is either a group leader or assigned to a group leader
TCP connection between a peer and its group leader
TCP connections between some pairs of group leaders
Group leader tracks the content in all its children
(Figure: ordinary peers connected to group-leader peers, with neighboring relationships in the overlay network.)
22
KaZaA: Motivation for Super-Nodes
Query consolidation
Many connected nodes may have only a few files
Propagating a query to a sub-node may take more time than for the super-node to answer itself
Stability
Super-node selection favors nodes with high up-time
How long you've been on is a good predictor of how long you'll be around in the future
23
P2P Case Study: Skype
Inherently P2P: pairs of users communicate
Proprietary application-layer protocol (inferred via reverse engineering)
Hierarchical overlay with SNs
Index maps usernames to IP addresses; distributed over SNs
(Figure: Skype clients (SC), supernodes (SN), and the Skype login server.)
24
Peers as relays
Problem when both Alice and Bob are behind “NATs”: a NAT prevents an outside peer from initiating a call to an inside peer
Solution: using Alice's and Bob's SNs, a relay is chosen
Each peer initiates a session with the relay
Peers can now communicate through NATs via the relay
25
BitTorrent
BitTorrent history and motivation
2002: B. Cohen debuted BitTorrent
Key motivation: popular content
Popularity exhibits temporal locality (flash crowds)
Focused on efficient fetching, not searching
Distribute the same file to many peers
Single publisher, many downloaders
Preventing free-loading
26
BitTorrent: Simultaneous Downloading
Divide large file into many pieces
Replicate different pieces on different peers
A peer with a complete piece can trade with other peers
Peer can (hopefully) assemble the entire file
Allows simultaneous downloading
Retrieving different parts of the file from different peers at the same time
And uploading parts of the file to peers
Important for very large files
27
BitTorrent: Tracker
Infrastructure node
Keeps track of peers participating in the torrent
Peers register with the tracker
Peer registers when it arrives
Peer periodically informs the tracker it is still there
Tracker selects peers for downloading
Returns a random set of peers
Including their IP addresses
So the new peer knows who to contact for data
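A toy tracker along these lines might look as follows; the class name, timeout value, and method names are assumptions for illustration, not the BitTorrent announce protocol.

```python
import random
import time

class Tracker:
    """Toy tracker: peers announce themselves periodically; a downloader
    gets back a random subset of the peers still considered alive."""
    def __init__(self, timeout=1800):
        self.peers = {}          # peer address -> time of last announce
        self.timeout = timeout   # seconds without announce before expiry

    def announce(self, addr, now=None):
        self.peers[addr] = now if now is not None else time.time()

    def get_peers(self, k=50, now=None):
        now = now if now is not None else time.time()
        alive = [p for p, t in self.peers.items() if now - t < self.timeout]
        return random.sample(alive, min(k, len(alive)))

tracker = Tracker()
tracker.announce("10.0.0.1:6881", now=0)
print(tracker.get_peers(k=50, now=10))  # ['10.0.0.1:6881']
```

Returning a random subset (rather than all peers) keeps responses small and spreads new arrivals across the swarm.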
28
BitTorrent: Chunks
Large file divided into smaller pieces
Fixed-size chunks
Typical chunk size of 256 KB
Allows simultaneous transfers
Downloading chunks from different neighbors
Uploading chunks to other neighbors
Learning what chunks your neighbors have
Periodically asking them for a list
File done when all chunks are downloaded
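The fixed-size split is straightforward to sketch; this uses the 256 KB chunk size mentioned above (the function name is illustrative).

```python
def split_into_chunks(data: bytes, chunk_size: int = 256 * 1024):
    """Split a file's bytes into fixed-size chunks; the last chunk
    may be shorter than chunk_size."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

blob = b"x" * (600 * 1024)        # a 600 KB "file"
chunks = split_into_chunks(blob)
print(len(chunks))                 # 3 chunks: 256 KB + 256 KB + 88 KB
```

Concatenating the chunks in index order reproduces the original file, which is why a peer can fetch them from different neighbors in any order.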
29
BitTorrent: Overall Architecture
(Figure sequence: a web server hosts a page linking to the .torrent file. The downloader (“US”) sends a get-announce to the tracker, receives a response with a peer list, shakes hands with peers A, B, and C, two leeches and a seed, and then pieces flow among the peers in both directions.)
36
BitTorrent: Chunk Request Order
Which chunks to request?
Could download in order
Like an HTTP client does
Problem: many peers have the early chunks
Peers have little to share with each other
Limiting the scalability of the system
Problem: eventually nobody has rare chunks
E.g., the chunks near the end of the file
Limiting the ability to complete a download
Solutions: random selection and rarest first
37
BitTorrent: Rarest Chunk First
Which chunks to request first?
The chunk with the fewest available copies
I.e., the rarest chunk first
Benefits to the peer
Avoid starvation when some peers depart
Benefits to the system
Avoid starvation across all peers wanting a file
Balance load by equalizing # of copies of chunks
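A minimal sketch of rarest-first selection, assuming each neighbor advertises the set of chunk indices it holds (the function name and data shapes are illustrative):

```python
from collections import Counter

def rarest_first(needed, neighbor_chunks):
    """Among chunks we still need, pick the one held by the fewest
    neighbors. `neighbor_chunks` maps neighbor -> set of chunk indices."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks & needed)          # only count chunks we need
    available = [c for c in needed if counts[c] > 0]
    if not available:
        return None                             # nobody has what we need
    return min(available, key=lambda c: counts[c])

neighbors = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0}}
print(rarest_first({1, 2}, neighbors))  # chunk 2: only one neighbor has it
```

Requesting the least-replicated chunk first raises its copy count, which is how the policy equalizes replication across the swarm.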
38
Free-Riding Problem in P2P Networks
Vast majority of users are free-riders
Most share no files and answer no queries
Others limit # of connections or upload speed
A few “peers” essentially act as servers
A few individuals contributing to the public good
Making them hubs that basically act as servers
BitTorrent prevents free-riding
Allow the fastest peers to download from you
Occasionally let some free-loaders download
39
BitTorrent: Preventing Free-Riding
Peer has limited upload bandwidth
And must share it among multiple peers
Prioritizing the upload bandwidth: tit for tat
Favor neighbors that are uploading at the highest rate
Rewarding the top four neighbors
Measure download bit rates from each neighbor
Reciprocate by sending to the top four peers
Recompute and reallocate every 10 seconds
Optimistic unchoking
Randomly try a new neighbor every 30 seconds
So a new neighbor has a chance to be a better partner
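The top-four-plus-one-random structure above might be sketched as follows; the function name and rate bookkeeping are assumptions, but the policy mirrors the slide (re-run every 10 s for the ranking, every 30 s for the optimistic slot):

```python
import random

def choose_unchoked(download_rates, optimistic_pool, top_n=4):
    """Unchoke the top-N neighbors by measured download rate (tit for tat),
    plus one randomly chosen extra neighbor (optimistic unchoke)."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    unchoked = set(ranked[:top_n])
    # Optimistic unchoke: give a random non-top neighbor a chance.
    candidates = [p for p in optimistic_pool if p not in unchoked]
    if candidates:
        unchoked.add(random.choice(candidates))
    return unchoked

rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
print(choose_unchoked(rates, optimistic_pool=list(rates)))
```

The optimistic slot is what lets a newcomer with no history prove itself and possibly displace a current top-four neighbor at the next recomputation.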
40
BitTorrent Today
Significant fraction of Internet traffic
Estimated at 30%, though this is hard to measure
Problem of incomplete downloads
Peers leave the system when done
Many file downloads never complete
Especially a problem for less popular content
Still lots of legal questions remain
Further need for incentives
41
Structured overlay networks
Overlay topology construction is based on node IDs generated using Distributed Hash Tables (DHTs).
In this category, the overlay network assigns keys to data items and organizes its peers into a graph that maps each data key to a peer.
This structured graph enables efficient discovery of data items using the given keys.
Guarantees object lookup in O(log n) hops.
Examples: Content Addressable Network (CAN), Chord, Pastry.
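The key-to-peer mapping can be illustrated with a Chord-style successor lookup on a small identifier ring; this is a simplified sketch (the identifier-space size and peer names are assumptions), not the full O(log n) finger-table routing.

```python
import hashlib
from bisect import bisect_right

M = 2 ** 16  # small identifier space, for illustration only

def node_id(name):
    """Hash a name onto the identifier ring [0, M)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % M

def responsible_node(key, node_ids):
    """Chord-style lookup: a key belongs to the first node whose ID
    follows the key's hash on the ring, wrapping around at M."""
    ids = sorted(node_ids)
    i = bisect_right(ids, node_id(key))
    return ids[i % len(ids)]       # i == len(ids) wraps to the smallest ID

nodes = [node_id(n) for n in ["peerA", "peerB", "peerC", "peerD"]]
print(responsible_node("song.mp3", nodes) in nodes)  # True
```

Because both keys and nodes hash onto the same ring, every key deterministically maps to exactly one live peer, which is the property the structured graph exploits for lookup.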
42
Structured-DHT-based P2P overlay networks
43
CAN: Content Addressable Network
Divide the d-dimensional space into zones.
Each key maps to one point in the d-dimensional space.
Each node is responsible for all the keys in its zone.
(Figure: a 2-dimensional coordinate space divided into zones A, B, C, D, E.)
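A minimal sketch of the CAN idea for d = 2: hash a key to a point in the unit square, then find the node whose zone contains that point. The zone layout, hashing, and function names are illustrative assumptions.

```python
import hashlib

def key_to_point(key, d=2):
    """Hash a key to a point in the unit d-dimensional coordinate space."""
    digest = hashlib.sha1(key.encode()).digest()
    return tuple(digest[i] / 255 for i in range(d))

def owner(point, zones):
    """Return the node whose zone (an axis-aligned box (lo, hi) per
    dimension) contains the point."""
    for node, (lo, hi) in zones.items():
        if all(l <= p <= h for p, l, h in zip(point, lo, hi)):
            return node

# Four zones covering the unit square (a simplified version of the
# slide's five-zone sketch).
zones = {"A": ((0.0, 0.0), (0.5, 0.5)), "B": ((0.5, 0.0), (1.0, 0.5)),
         "C": ((0.0, 0.5), (0.5, 1.0)), "D": ((0.5, 0.5), (1.0, 1.0))}
print(owner(key_to_point("file.mp3"), zones))  # one of the four zones
```

In a real CAN, routing proceeds greedily through neighboring zones toward the key's point rather than by scanning all zones.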
44
CAN(…)
45
Pastry
Prefix-based routing: route to a node whose ID shares a prefix with the key at least one digit longer than this node's.
Each node maintains a neighbor set, a leaf set, and a routing table.
(Figure: Route(d46a1c) starting at node 65a1fc, hopping via d13da3, d4213f, d462ba, d467c4, d471f1 toward d46a1c, matching one more digit of the key at each step.)
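One routing step of the prefix-matching rule can be sketched as follows; it is simplified (no leaf-set or numeric-closeness fallback), and the routing-table contents are illustrative.

```python
def shared_prefix_len(a, b):
    """Number of leading hex digits two IDs share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current, key, routing_table):
    """Forward to a known node whose ID shares at least one more leading
    digit with the key than the current node's ID does."""
    mine = shared_prefix_len(current, key)
    for node in routing_table:
        if shared_prefix_len(node, key) > mine:
            return node
    return None   # real Pastry falls back to the leaf set here

# One hop of the slide's example: 65a1fc forwards d46a1c toward d13da3.
print(next_hop("65a1fc", "d46a1c", ["d13da3", "d4213f"]))  # d13da3
```

Since every hop extends the matched prefix by at least one digit, a lookup over b-bit digits takes O(log n) hops, which is the guarantee stated earlier.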
46
Pastry (…)
Routing table, leaf set, and neighbor set example in a Pastry node with b=2 and l=8
47
Overlay Multicasting
48
The key decision behind IP Multicast: a “smart network”
(Figure: routers hold per-group state and replicate packets toward Berkeley, Gatech, and Stanford.)
49
Internet Design Philosophy
Dumb routers, smart end systems…
Minimal functionality in routers
Push functionality to end systems as much as possible
Key reason attributed to the success and scaling of the Internet
(Figure: the Internet architecture hourglass: application/transport above, physical/link layer below, with IP as the “thin waist”.)
50
Significance of IP Multicast
First major functionality added to routers since the original design
Per-group state implies routing tables with potentially millions of entries!
Today's routers have only tens of thousands of entries in their routing tables, though there are millions of end systems!
Potential scaling concerns
Per-group (per-flow) based solutions typically find slow acceptance in the Internet
51
Other concerns with IP Multicast
Slow deployment
Very difficult to change network infrastructure and routers
Difficult to support higher-layer functionality
IP Multicast: best-effort delivery
Model complicates support for reliability and congestion control
52
Congestion control with IP Multicast
(Figure: a source on a network with IP Multicast serves receivers with very different capacities: Purdue, Stanford LAN, Stanford modem, Berkeley, China, London.)
What rate must the source send at?
53
Back to the drawing board…
Which is the right layer to implement multicast functionality?
IP Multicast: IP layer, efficient
Naïve unicast: application layer, inefficient
Can we do it efficiently at the application level?
(Figure: the Internet architecture with question marks at both the application and IP layers.)
54
Overlay Multicast
(Figure: a “dumb network” connects Purdue, Stanford (Stan1, Stan2), Berkeley (Berk1, Berk2), and Gatech; the end hosts form an overlay tree among themselves instead of relying on router support.)
55
Overlay Multicast: Benefits
No per-group state in routers
Easy to deploy: no change to network infrastructure
Can simplify support for congestion control etc.
(Figure: unicast congestion control runs on each overlay link; end systems such as Stan-LAN can leverage their computation and storage, e.g. transcoding for Stan-Modem.)
56
Overlay Performance
Even a well-designed overlay cannot be as efficient as IP Multicast
But the performance penalty can be kept low
Trade off some performance for other benefits
(Figure: the overlay over the “dumb network” incurs increased delay and duplicate packets, i.e. bandwidth wastage, on links shared by the Stanford, Berkeley, and Gatech paths.)
57
Architectural Spectrum
Purely application end-point vs. infrastructure-centric
+ Instantaneous deployment
+ Low setup overhead, low cost
+ Uses bandwidth resources at end systems
+ Can scale to support millions of groups
+ Security
+ Robustness
58
Router-Based(IP Multicast)
No Router Support
Infrastructure-Centric(CDNs, e.g. Akamai)
Application End-points Only, End-System Only
Application end-points or end-systems with infrastructure support
End-System, Application-Level, Overlay or Peer-to-Peer Multicast
59
Applications
File Download vs. Video Conferencing/Broadcasting
Bandwidth demanding: hundreds or thousands of Kbps
Real-time constraints: continuous streaming delivery
Broadcasting vs. Conferencing
Conferencing: smaller scale, anyone can be a source
Broadcasting: single source, tens or hundreds of thousands of simultaneous participants
60
Design Issues
Construct efficient overlays
High bandwidth, low latency to receivers
Low overheads
Fault tolerant
Self-organizing, self-improving
Honor per-node access bandwidth constraints
System issues: NATs etc.
61
What overlay to construct?
An overlay with a structure, or no formal structure (each data packet sent using epidemic algorithms)
(Figure: three overlay styles, Tree, Multi-Tree, and Mesh, among Purdue, Stan-LAN, Stan-Modem, Berk1, and Berk2.)
62
Inefficient Overlay Trees
(Figure: three poorly formed trees over Purdue, Stan1/Stan2, Stan-Modem, Berk1, Berk2, and Gatech: one with high latency, one with poor network usage and potential congestion near Purdue, and one with poor bandwidth to members.)
63
Efficient overlay trees
(Figure: two well-formed trees over Purdue, Stan-LAN, Stan-Modem, Berk1, Berk2, and Gatech.)
64
Self-Organizing Protocols
Construct efficient overlays in a distributed fashion
Members may have limited knowledge of the Internet
Adapt to dynamic join/leave/death of members
Adapt to network dynamics and congestion
65
Example
(Figure sequence: Stan-LAN joins an overlay of Purdue, Berk1, Berk2, and Stan-Modem; seeing bad bandwidth performance, it switches to a different parent; there is more “clustering” if Stan-Modem moves under Stan-LAN, so it does; later, when Stan-Modem is disconnected, it goes back to Berk1.)
72
Key Components of Protocol
Overlay management: how many people does a member know? How is this knowledge maintained?
Overlay optimization: constructing an efficient overlay among members
Distributed heuristics
No entity with global knowledge of hosts and network
73
Group Management
Build a separate control structure decoupled from the tree
Each member knows a small random subset of group members
Knowledge maintained using a gossip-like algorithm
Members also maintain the path from the source
Other design alternatives possible: e.g., a more hierarchical structure; no clear winner between design alternatives
(Figure: source S and member A in the data tree and control structure.)
74
Bootstrap
(Figure: the source (US) knows members such as US1, US2, US4, …, Euro1, Euro2, Euro3, …, Asia, and Modem1.)
A node that joins:
– Gets a subset of group membership from the source
– Finds a parent using the parent selection algorithm
75
Other causes for parent switch
(Figure: three scenarios in the overlay rooted at the source (US): a member leave/death, congestion or poor bandwidth at the current parent, and better clustering, e.g. Euro4 attaching near Euro2/Euro3.)
Member leave/death; congestion/poor bandwidth; better clustering
76
Parent Selection
When a node X (e.g. Modem1) joins or needs a new parent:
– X sends PROBE_REQ to a subset of members it knows
– Waits for 1 second for the responses
– Evaluates the remote nodes and chooses a candidate parent
(Figure: X probes members known via the source (US), such as US1, US2, Euro1, and Asia.)
77
Factors in Parent Selection
Filter out P if it is a descendant of X
Performance of P:
Application throughput P received over the last few seconds
Delay of the path from S to P
Saturation level of P
Performance of the link P-X:
Delay of link P-X
TCP bandwidth of link P-X
(Figure: source S, candidate parent P, and joining node X.)
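The factors above could be combined into a score like the following; filtering out descendants (which would create a loop) is from the slide, while the specific weighting of throughput against delay, and all names, are illustrative assumptions.

```python
def pick_parent(candidates, descendants):
    """Pick a candidate parent: drop our own descendants, then prefer
    high recent throughput and low delay. The weighting is a made-up
    example, not the protocol's actual formula."""
    eligible = [c for c in candidates if c["id"] not in descendants]
    if not eligible:
        return None
    best = max(eligible,
               key=lambda c: c["throughput_kbps"] - c["delay_ms"] / 10)
    return best["id"]

candidates = [
    {"id": "P1", "throughput_kbps": 300, "delay_ms": 50},
    {"id": "P2", "throughput_kbps": 500, "delay_ms": 40},
    {"id": "P3", "throughput_kbps": 800, "delay_ms": 30},  # our descendant
]
print(pick_parent(candidates, descendants={"P3"}))  # P2
```

In the real protocol the inputs come from the PROBE_REQ responses gathered during the 1-second window.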
78
Multiple Overlay Trees
With support from MDC, split the S Kbps stream into T equally sized stripes
T trees, each distributing a single stripe of size S/T
Overall quality depends on the number of stripes received
Number of trees node i is entitled to = r_i / (S/T)
(Figure: a source streams S Kbps over three trees to peers A and C, each tree carrying an S/3 stripe.)
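The entitlement formula (node i's rate r_i divided by the stripe rate S/T) can be computed directly; flooring the quotient and capping it at T are my assumptions about how a fractional result would be handled.

```python
def stripes_entitled(r_i, S, T):
    """Number of stripes node i may receive, given its rate r_i (Kbps),
    the stream rate S (Kbps), and T equal stripes of S/T Kbps each.
    Floored and capped at T (both are illustrative assumptions)."""
    return min(T, int(r_i // (S / T)))

# A 300 Kbps stream in 3 stripes of 100 Kbps; a 200 Kbps node gets 2.
print(stripes_entitled(r_i=200, S=300, T=3))  # 2
```

A node that can only sustain part of the full rate thus still receives proportional quality, which is the point of the MDC split.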
79
Mesh-Based Approach
Tree-based:
Involves “push”
Well-known structure
Mesh-based:
Each node maintains a set of partners
Exchange data-availability info with partners
Pull data from a partner if it is not already possessed
80
81
Mesh-Based Streaming vs. BitTorrent
Clever scheduling algorithm regarding which segments to fetch
82
Deployments
Research
ESM broadcasting system (2002-)
CoolStreaming (2004-)
Report peak sizes > 25,000 users
Industrial (last 2-3 yrs)
PPLive
Reports even larger peak sizes
Zattoo
GridMedia
83
Research Challenges/Open Issues
Tree vs. Mesh approaches, hybrids
Access-bandwidth-scarce regimes
Incentives and fairness
Extreme peer dynamics, flash crowds
Support for heterogeneous receivers
System issues
NATs/Firewalls
Transport protocol
Start-up delay/buffer interaction
Security Challenges
84
Snapshot from Sigcomm Broadcast
(Figure: viewer locations in a broadcast snapshot: U.S. East Coast, U.S. Central, U.S. West Coast, Europe, Asia, and unknown, plus the source.)