lecture 14 p2p
TRANSCRIPT
-
8/12/2019 Lecture 14 P2P
1/44
ECE 595 Computer Network Systems
Peer-to-Peer Systems
-
Applications
File download
Location and Retrieval (Napster, Gnutella)
Efficient distribution (BitTorrent, EDonkey)
Audio Conference (Skype)
Live Internet Broadcasting
-
Scaling Problem
Millions of clients => server and network meltdown
-
P2P System
Leverage the resources of client machines (peers)
Computation, storage, bandwidth
-
P2P file-sharing
Quickly grown in popularity
Dozens or hundreds of file-sharing applications
35 million American adults use P2P networks: 29% of all Internet users in the US!
-
What's out there?
          Central    Flood       Super-node flood    Route
Search    Napster    Gnutella    KaZaA               DHTs
BitTorrent: Being able to transfer parts of a file at a time
-
Searching
[Figure: nodes N1 to N6 connected through the Internet. A publisher holds an item with Key=title, Value=MP3 data. A client issues Lookup(title): which node has the item?]
-
Framework
Common Primitives:
Join: how do I begin participating?
Publish: how do I advertise my file?
Search: how do I find a file?
Fetch: how do I retrieve a file?
-
Next Topic...
Centralized Database: Napster
Query Flooding: Gnutella
Intelligent Query Flooding: KaZaA
Swarming: BitTorrent
Structured Overlay Routing: Distributed Hash Tables
-
Napster: History
1999: Shawn Fanning launches Napster
Peaked at 1.5 million simultaneous users
Jul 2001: Napster shuts down
-
Napster: Overview
Centralized Database:
Join: on startup, client contacts central server
Publish: reports list of files to central server
Search: query the server => returns a peer that stores the requested file
Fetch: get the file directly from peer
-
Napster: Publish
[Figure: peer 123.2.21.23 tells the central server "I have X, Y, and Z!" (Publish); the server records insert(X,123.2.21.23), ...]
-
Napster: Search
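[Figure: a client asks the central server "Where is file A?" (Query); the server answers search(A)-->123.2.0.18 (Reply); the client then gets the file directly from peer 123.2.0.18 (Fetch).]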
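The publish and search steps of a centralized index can be sketched in a few lines of Python. This is only an illustration of the idea on the slides, not Napster's actual protocol: the class name CentralIndex and its method names are made up for this example, and the addresses are the ones shown in the figures.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central server: maps file names to the peers that hold them."""

    def __init__(self):
        self.files = defaultdict(set)  # file name -> set of peer addresses

    def publish(self, peer, names):
        # A peer reports its list of files when it joins.
        for name in names:
            self.files[name].add(peer)

    def search(self, name):
        # One lookup on the server; the fetch itself happens peer to peer.
        return sorted(self.files.get(name, ()))


index = CentralIndex()
index.publish("123.2.21.23", ["X", "Y", "Z"])
index.publish("123.2.0.18", ["A"])
print(index.search("A"))
```

The server state grows with the number of published files and peers, which is the scaling concern raised in the discussion of this design.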
-
Napster: Discussion
Pros:
Simple
Search scope is O(1)
Cons:
Server may not scale:
Server maintains O(N) state
Server does all processing
Single point of failure
-
Next Topic...
Centralized Database: Napster
Query Flooding: Gnutella
Intelligent Query Flooding: KaZaA
Swarming: BitTorrent
Structured Overlay Routing: Distributed Hash Tables
-
Gnutella: History
Released in 2000
Soon many other clients: Bearshare, Morpheus, LimeWire, etc.
In 2001, many protocol enhancements including "ultrapeers"
-
Gnutella: Overview
Query Flooding:
Join: on startup, client contacts a few other nodes; these become its neighbors
Publish: no need
Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender.
TTL limits propagation
Fetch: get the file directly from peer
-
Gnutella: Search
[Figure: a node sends the query "Where is file A?" to its neighbors (Query); two nodes answer "I have file A." (Reply).]
-
Gnutella
Searching by flooding:
If you don't have the file you want, query your neighbors
If they don't have it, they contact their neighbors
TTL ensures flooding is controlled.
No looping, but packets may be received twice.
The query response is sent to the neighbor that forwarded the query
That neighbor forwards it to the node it received the query from.
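A minimal sketch of TTL-limited flooding, assuming the overlay is given as an adjacency list. The function name flood_search, the example graph, and the choice to count every forwarded copy as one message are assumptions made for illustration; real Gnutella messages also carry identifiers used to detect duplicates, which the seen set stands in for here.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Flood a query for `wanted` from `start`; return (peers that hold it, messages sent)."""
    seen = {start}                      # nodes that already processed this query
    hits = []
    messages = 0
    queue = deque([(start, ttl, None)])  # (node, remaining hops, node it came from)
    while queue:
        node, remaining, parent = queue.popleft()
        if remaining == 0:
            continue                    # TTL exhausted: do not forward further
        for neighbor in graph[node]:
            if neighbor == parent:
                continue                # never send the query back where it came from
            messages += 1
            if neighbor in seen:
                continue                # duplicate copy: received twice, then dropped
            seen.add(neighbor)
            if wanted in files.get(neighbor, ()):
                hits.append(neighbor)
            queue.append((neighbor, remaining - 1, node))
    return hits, messages


graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
         "D": ["B", "C", "E"], "E": ["D"]}
files = {"D": {"song"}, "E": {"song"}}
print(flood_search(graph, files, "A", "song", ttl=2))
print(flood_search(graph, files, "A", "song", ttl=3))
```

With the smaller TTL the query never reaches node E, which illustrates why a TTL-limited search can miss files held by only a few distant peers.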
-
Constructing Topology
Node must know one initial bootstrap node.
Through query responses, learns about other nodes
Can attach itself to some of these.
Keeps track of whether neighbor is alive through ping.
-
Gnutella: Discussion
Pros:
Fully de-centralized
Search cost distributed
Processing @ each node permits powerful search semantics
Cons:
Search scope is O(N)
Nodes leave often, network unstable
TTL-limited search works well for popular files
May not work well for files that are not popular.
-
KaZaA: History
Created in 2001, by Dutch company
Single network called FastTrack used by other clients as well: Morpheus, giFT, etc.
Eventually protocol changed so other clients could no longer talk to it
Popular file sharing network today
-
KaZaA: Overview
Smart Query Flooding:
Join: on startup, client contacts a supernode ... may at some point become one itself
Publish: send list of files to supernode
Search: send query to supernode; supernodes flood the query among themselves
Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers
-
KaZaA: Network Design
[Figure: overlay topology with the "Super Nodes" labeled.]
-
KaZaA: File Insert
[Figure: peer 123.2.21.23 tells its supernode "I have X!" (Publish); the supernode records insert(X,123.2.21.23), ...]
-
KaZaA: File Search
[Figure: a peer asks its supernode "Where is file A?" (Query); two answers come back (Replies): search(A)-->123.2.0.18 and search(A)-->123.2.22.50.]
-
Why Superpeers?
Pros: Query consolidation
Many connected nodes may have only a few files
Propagating a query to a sub-node would take more b/w than answering it yourself
Selecting superpeers
How long you've been on is a good predictor of how long you'll be around.
-
KaZaA: Discussion
Pros:
Tries to take into account node heterogeneity:
Bandwidth
Host Computational Resources
Host Availability (?)
Cons:
Still no real guarantees on search scope or search time
Similar behavior to Gnutella, but better.
-
BitTorrent: History
In 2002, B. Cohen released BitTorrent
Key Motivation:
Popularity exhibits temporal locality (Flash Crowds)
E.g., Slashdot effect, CNN on 9/11, new movie/game release
Focused on Efficient Fetching, not Searching:
Distribute the same file to all peers
Single publisher, multiple downloaders
Has some real publishers
-
BitTorrent: Distribute chunks at a time
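[Figure: a file split into chunks 1 2 3 4; different peers hold different subsets of the chunks (for example 1 2 and 3 4) and pass them on to each other.]
Service capacity doubles!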
-
BitTorrent: Overview
An overlay per file
Obtain torrent file
File, length, hashing information, URL of tracker
Centralized Tracker Server
Join: contact centralized tracker server, get a list of peers.
Helps peers keep track of each other
Peer: Seed and leecher
Seed: already obtained full file.
Files broken into segments: segment hash
Get segments from peers
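The role of the per-segment hashes can be illustrated with a short Python sketch: the publisher hashes every segment once, and a downloader checks each segment it receives against that list before accepting it. SHA-1 is used here because the original BitTorrent protocol hashes pieces with SHA-1; the 4-byte segment size and the function names are assumptions chosen only to keep the example small.

```python
import hashlib

SEGMENT_SIZE = 4  # bytes; unrealistically small, chosen for illustration only


def split_segments(data, size=SEGMENT_SIZE):
    """Break the file contents into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def segment_hashes(data, size=SEGMENT_SIZE):
    """Hashing information a publisher would place in the torrent file."""
    return [hashlib.sha1(seg).hexdigest() for seg in split_segments(data, size)]


def verify_segment(index, segment, expected_hashes):
    """Accept a segment received from a peer only if its hash matches."""
    return hashlib.sha1(segment).hexdigest() == expected_hashes[index]


data = b"hello world!"
hashes = segment_hashes(data)
segments = split_segments(data)
print(len(segments), verify_segment(0, segments[0], hashes))
```

Because each segment is verified on its own, a downloader can accept segments from untrusted peers in any order and discard only the ones that fail the check.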
-
Key Ideas
Split the file
Form mesh-based overlay
Transfer by file blocks
Overlap downloading with uploading
(Help each other while downloading)
Overlap downloading with downloading
(Receive the blocks out-of-order)
Key differences from Napster:
Use of chunks instead of entire files
Anti-freeloading mechanisms
-
BitTorrent: Publish/Join
[Figure: peers and the Tracker.]
-
BitTorrent: Fetch
-
BitTorrent: Sharing Strategy
Employ "Tit-for-tat" sharing strategy
A is downloading from some other people
A will let the fastest N of those download from him
Be optimistic: occasionally let freeloaders download
Otherwise no one would ever start!
Also allows you to discover better peers to download from when they reciprocate
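The strategy above can be sketched as a peer-selection function: rank peers by how fast we are currently downloading from them, serve the fastest N, and add one randomly chosen other peer as the optimistic slot. The function name choose_unchoked, the rate table, and the single optimistic slot are assumptions for illustration; real clients re-run this choice periodically with their own timing rules.

```python
import random


def choose_unchoked(download_rates, n, rng=random):
    """Pick who may download from us.

    download_rates maps peer -> rate at which we currently download from that peer.
    Returns (fastest n peers, one optimistic pick from the rest or None).
    """
    # Fastest first; ties broken by peer name so the result is deterministic.
    ranked = sorted(download_rates, key=lambda p: (-download_rates[p], p))
    regular = ranked[:n]
    others = ranked[n:]
    # Optimistic slot: give one other peer a chance, even a freeloader or newcomer.
    optimistic = rng.choice(others) if others else None
    return regular, optimistic


rates = {"p1": 50, "p2": 10, "p3": 0, "p4": 80}
regular, optimistic = choose_unchoked(rates, 2, random.Random(0))
print(regular, optimistic)
```

A newcomer such as p3 has nothing to offer yet, so the optimistic slot is the only way it can receive its first blocks and start reciprocating.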
-
Operations
Bootstrap:
New peer P.
Contacts tracker.
Gets partial list of other peers in the system.
Connection establishment:
Each peer establishes connections with other peers
Peers exchange bitmaps reporting which pieces they have
Updated as and when new pieces are downloaded.
-
Maintaining block diversity
[Figure: two panels, "Diversity" and "No Diversity", showing which of blocks 1 2 3 4 each peer holds.]
Distributed coordination:
To select block to download: Rarest block first
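Rarest-first selection can be sketched from the bitmaps that peers exchange: count how many neighbors hold each block and request the least replicated block we are still missing. The function name rarest_first and the lowest-index tie-break are assumptions for this example; actual clients typically break ties randomly.

```python
from collections import Counter


def rarest_first(my_bitmap, peer_bitmaps):
    """Return the index of the rarest block we lack that some peer has, or None."""
    counts = Counter()
    for bitmap in peer_bitmaps.values():
        for i, has_block in enumerate(bitmap):
            if has_block:
                counts[i] += 1
    # Only blocks we are missing and that at least one neighbor can provide.
    candidates = [i for i, have in enumerate(my_bitmap)
                  if not have and counts[i] > 0]
    if not candidates:
        return None
    return min(candidates, key=lambda i: (counts[i], i))


mine = [1, 0, 0, 0]
peers = {"a": [1, 1, 1, 0], "b": [1, 1, 0, 0], "c": [0, 1, 0, 0]}
print(rarest_first(mine, peers))
```

Fetching scarce blocks first spreads them to more peers, which keeps block diversity high and gives neighbors something to trade.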
-
Guard against free-riding
Free-riding: (Selfish, harmful to system)
downloading without uploading
Solution: Tit-for-tat:
If you don't upload to me, I'm not uploading to you!
Free-rider is punished
How much you give is how much you get
-
BitTorrent: Summary
Pros:
Works reasonably well in practice
Gives peers incentive to share resources; avoids freeloaders
Cons:
Central tracker server needed to bootstrap swarm
Not clear how effective tit-for-tat is.
-
Next Topic...
Centralized Database: Napster
Query Flooding: Gnutella
Intelligent Query Flooding: KaZaA
Swarming: BitTorrent
Unstructured Overlay Routing: Freenet
Structured Overlay Routing: Distributed Hash Tables (DHT)
-
Distributed Hash Tables (DHTs)
Gnutella: flooding oriented search
Expensive for items not as popular.
DHTs
Developed in academic community
Ensures efficient search/lookup without flooding
Makes some things harder
Fuzzy queries / full-text search / etc.
Deployment
Adopted by recent versions of BitTorrent, and eMule (Kad)
-
Distributed Hash Tables
Nodes in system form a distributed data structure
Clever structure
Convenient to locate objects
Different from Gnutella, where the overlay structure can be arbitrary
Each node has a unique id.
Usually hash of node's IP address
Each object has a unique id.
Usually hash of filename.
Map object to node nearest it in id space
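A small sketch of the id mapping described above: node ids come from hashing IP addresses, object ids from hashing filenames, and an object is assigned to the node whose id is nearest. The 16-bit id space, the use of SHA-1, and the circular distance are assumptions made for illustration; concrete DHT designs define "nearest" differently (for example by successor on a ring or by XOR distance).

```python
import hashlib

ID_BITS = 16  # assumed small id space so the numbers stay readable


def make_id(text):
    """Hash a node's IP address or a filename into the shared id space."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


def ring_distance(a, b):
    """Distance between two ids when the id space wraps around."""
    d = abs(a - b)
    return min(d, (1 << ID_BITS) - d)


def responsible_node(object_id, node_ids):
    """Node whose id is nearest to the object's id (ties go to the smaller id)."""
    return min(node_ids, key=lambda n: (ring_distance(n, object_id), n))


node_ips = ["128.2.1.3", "123.2.0.18", "123.2.21.23", "123.2.22.50"]
nodes = {make_id(ip): ip for ip in node_ips}
owner = responsible_node(make_id("song.mp3"), nodes)
print(owner, nodes[owner])
```

Every participant that knows the node ids computes the same owner for a given filename, which is what lets lookups avoid flooding.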
-
DHT: Overview
Structured Overlay Routing:
Join: On startup, contact a bootstrap node and integrate yourself into the distributed data structure; get a node id
Publish: Route publication for file id toward a close node id along the data structure
Search: Route a query for file id toward a close node id.
Data structure guarantees that query will meet the publication.
Fetch: Two options:
Publication contains actual file => fetch from where query stops
Publication says "I have file X" => query tells you 128.2.1.3 has X, use IP routing to get X from 128.2.1.3
-
Distributed Hash Tables (DHTs)
Hash table interface: put(key,item), get(key)
O(log n) hops
Structured Networks
[Figure: nodes of a structured overlay, each storing (K, I) pairs; put(K1,I1) places (K1,I1) on one node, and get(K1) returns I1.]
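The put/get interface and the O(log n) hop bound can be illustrated with a toy model. It assumes an idealized ring with a node at every one of the 2**M ids and Chord-style shortcuts that let a node jump ahead by any power of two; neither assumption comes from the slides, and the class ToyDHT is hypothetical. Under these assumptions each hop at least halves the remaining distance, so a lookup takes at most M = log2(n) hops.

```python
import hashlib

M = 6  # id space has 2**M ids; assume one node sits at every id (idealized)


def key_id(key):
    """Hash a key into the id space."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << M)


class ToyDHT:
    def __init__(self):
        self.store = [dict() for _ in range(1 << M)]  # per-node local tables

    def route(self, src, dst):
        """Greedy routing: each hop jumps the largest power of two that fits."""
        hops, cur = 0, src
        while cur != dst:
            dist = (dst - cur) % (1 << M)
            cur = (cur + (1 << (dist.bit_length() - 1))) % (1 << M)
            hops += 1
        return hops

    def put(self, src, key, item):
        dst = key_id(key)
        self.store[dst][key] = item        # stored at the responsible node
        return self.route(src, dst)        # hops the request travelled

    def get(self, src, key):
        dst = key_id(key)
        return self.store[dst].get(key), self.route(src, dst)


dht = ToyDHT()
dht.put(0, "K1", "I1")
item, hops = dht.get(5, "K1")
print(item, hops)
```

The same key always hashes to the same node, so a get issued from any node meets the item stored by an earlier put.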