lecture 14 p2p

Upload: zahid52

Post on 03-Jun-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/12/2019 Lecture 14 P2P

    1/44

    ECE 595Computer Network Systems

    1

    Peer-to-Peer Systems

  • 8/12/2019 Lecture 14 P2P

    2/44

    Applications

    File download

    Location and Retrieval (Napster, Gnutella)

    Efficient distribution (BitTorrent, EDonkey)

    Audio Conference (Skype)

    Live Internet Broadcastin

    2

  • 8/12/2019 Lecture 14 P2P

    3/44

    Scaling Problem Millions of clients server and network meltdown

    3

  • 8/12/2019 Lecture 14 P2P

    4/44

    P2P System

    4

    Leverage the resources of client machines (peers) Computation, storage, bandwidth

  • 8/12/2019 Lecture 14 P2P

    5/44

    P2p file-sharing

    Quickly grown in popularity

    Dozens or hundreds of file sharing applications 35 million American adults use P2P networks --

    29% of all Internet users in US!

    5

    Internet

  • 8/12/2019 Lecture 14 P2P

    6/44

    Whats out there?

    Central Flood Super-

    nodeflood

    Route

    Search Napster Gnutella KaZaa DHTs

    6

    BitTorrent: Being able to transfer parts of a file at a time

  • 8/12/2019 Lecture 14 P2P

    7/44

    Searching

    N1N2 N3

    7

    Internet

    N6N5N4

    Publisher

    Key=titleValue=MP3 data

    Client

    Lookup(title)

    ?

  • 8/12/2019 Lecture 14 P2P

    8/44

    Framework

    Common Primitives:

    Join: how to I begin participating?

    Publish: how do I advertise my file?

    Search: how to I find a file?

    Fetch: how to I retrieve a file?

    8

  • 8/12/2019 Lecture 14 P2P

    9/44

    Next Topic...

    Centralized Database

    Napster

    Query Flooding Gnutella

    Intelli ent Quer Floodin

    9

    KaZaA

    Swarming BitTorrent

    Structured Overlay Routing

    Distributed Hash Tables

  • 8/12/2019 Lecture 14 P2P

    10/44

    Napster: History

    1999: Sean Fanning launches Napster

    Peaked at 1.5 million simultaneous users

    Jul 2001: Napster shuts down

    10

  • 8/12/2019 Lecture 14 P2P

    11/44

    Napster: Overiew

    Centralized Database: Join: on startup, client contacts central server Publish: reports list of files to central server Search: query the server => return someone that stores

    11

    Fetch: get the file directly from peer

  • 8/12/2019 Lecture 14 P2P

    12/44

    Napster: Publish

    insert(X,123.2.21.23)

    ...

    12

    I have X, Y, and Z!

    Publish

    123.2.21.23

  • 8/12/2019 Lecture 14 P2P

    13/44

    Napster: Search

    search(A)-->123.2.0.18

    Fetch

    123.2.0.18

    13

    Where is file A?

    Query Reply

  • 8/12/2019 Lecture 14 P2P

    14/44

    Napster: Discussion

    Pros: Simple

    Search scope is O(1)

    Cons: Server ma not scale: Server maintains O N State

    14

    Server does all processing Single point of failure

  • 8/12/2019 Lecture 14 P2P

    15/44

    Next Topic...

    Centralized Database

    Napster

    Query Flooding Gnutella

    Intelli ent Quer Floodin

    15

    KaZaA Swarming

    BitTorrent

    Structured Overlay Routing

    Distributed Hash Tables

  • 8/12/2019 Lecture 14 P2P

    16/44

    Gnutella: History

    Released in 2000

    Soon many other clients: Bearshare, Morpheus,LimeWire, etc.

    In 2001, many protocol enhancements includingultra eers

    16

  • 8/12/2019 Lecture 14 P2P

    17/44

    Gnutella: Overview Query Flooding:

    Join: on startup, client contacts a few other nodes; these

    become its neighbors Publish: no need Search: ask neighbors, who ask their neighbors, and so

    on... when/if found, reply to sender. TTL limits ro a ation

    17

    Fetch: get the file directly from peer

  • 8/12/2019 Lecture 14 P2P

    18/44

    I have file A.

    I have file A.

    Gnutella: Search

    Reply

    18

    Where is file A?

    Query

  • 8/12/2019 Lecture 14 P2P

    19/44

    GnutellaSearching by flooding:

    If you dont have the file you want, query neighbors

    If they dont have it, they contact their neighbors

    TTL ensures flooding is controlled

    19

    .

    No looping but packets may be received twice.

    Query response to neighbor requesting query

    Forwards it to node it received query from.

  • 8/12/2019 Lecture 14 P2P

    20/44

    Constructing Topology

    Node must know one initial bootstrap node.

    Through query responses, learns about other nodes

    Can attach itself to some of these.

    Keeps track of whether neighbor is alive through ping

    20

    .

  • 8/12/2019 Lecture 14 P2P

    21/44

    Gnutella: Discussion

    Pros: Fully de-centralized

    Search cost distributed

    Processing @ each node permits powerful searchsemantics

    21

    Cons: Search scope is O(N)

    Nodes leave often, network unstable TTL-limited search works well for popular files

    May not work well for files not popular.

  • 8/12/2019 Lecture 14 P2P

    22/44

    KaZaA: History

    Created in 2001, by Dutch company

    Single network called FastTrack used byother clients as well: Morpheus, giFT, etc.

    Eventually protocol changed so other clients

    22

    could no longer talk to it Popular file sharing network today

  • 8/12/2019 Lecture 14 P2P

    23/44

    KaZaA: Overview Smart Query Flooding:

    Join: on startup, client contacts a supernode ...

    may at some point become one itselfPublish: send list of files to supernodeSearch: send query to supernode, supernodes

    23

    Fetch: get the file directly from peer(s); can fetchsimultaneously from multiple peers

  • 8/12/2019 Lecture 14 P2P

    24/44

    KaZaA: Network DesignSuper Nodes

    24

  • 8/12/2019 Lecture 14 P2P

    25/44

    KaZaA: File Insert

    insert(X,123.2.21.23)

    ...

    25

    I have X!

    Publish

    123.2.21.23

  • 8/12/2019 Lecture 14 P2P

    26/44

    KaZaA: File Searchsearch(A)-->123.2.22.50

    26

    Where is file A?

    Query

    search(A)

    -->123.2.0.18

    Replies

    123.2.0.18

    123.2.22.50

  • 8/12/2019 Lecture 14 P2P

    27/44

    Why Superpeers? Pros:Query consolidation

    Many connected nodes may have only a few files

    Propagating a query to a sub-node would take moreb/w than answering it yourself

    27

    Selecting superpeers How long youve been on is a good predictor of how long

    youll be around.

  • 8/12/2019 Lecture 14 P2P

    28/44

    KaZaA: Discussion Pros:

    Tries to take into account node heterogeneity:

    Bandwidth

    Host Computational Resources

    Host Availability (?)

    28

    Still no real guarantees on search scope or search time

    Similar behavior to gnutella, but better.

  • 8/12/2019 Lecture 14 P2P

    29/44

    BitTorrent: History In 2002, B. Cohen released BitTorrent

    Key Motivation:

    Popularity exhibits temporal locality (Flash Crowds)

    E.g., Slashdot effect, CNN on 9/11, new movie/game release

    Focused on Efficient Fetchin , not Searchin :

    29

    Distribute the samefile to all peers Single publisher, multiple downloaders

    Has some real publishers

  • 8/12/2019 Lecture 14 P2P

    30/44

    BitTorrent: Distribute chunks at a time

    1 2 3 4

    1 23

    30

    3 4

    1 2

    Service capacity doubles!

  • 8/12/2019 Lecture 14 P2P

    31/44

    BitTorrent: Overview A overlay per file Obtain torrent file

    File, length, hashing information, url of tracker Centralized Tracker Server

    Join: contact centralized tracker server, get a list of peers. Hel s eers kee track of each other

    31

    Peer: Seed and leecher Seed: already obtained full file. Files broken into segments: segment hash Get segments from peers

  • 8/12/2019 Lecture 14 P2P

    32/44

    Key Ideas Split the file Form mesh-based overlay

    Transfer by file blocks Overlap downloading with uploading

    (Help each other while downloading) Overla downloadin with downloadin

    32

    Receive the all blocks out-of-order Key differences from Napster: Use of chunks instead of entire files Anti-freeloading mechanisms

  • 8/12/2019 Lecture 14 P2P

    33/44

    BitTorrent: Publish/Join

    Tracker

    33

  • 8/12/2019 Lecture 14 P2P

    34/44

    BitTorrent: Fetch

    34

  • 8/12/2019 Lecture 14 P2P

    35/44

    BitTorrent: Sharing Strategy Employ Tit-for-tat sharing strategy

    A is downloading from some other people A will let the fastest N of those download from him

    Be optimistic: occasionally let freeloadersdownload

    35

    Otherwise no one would ever start!

    Also allows you to discover better peers to downloadfrom when they reciprocate

  • 8/12/2019 Lecture 14 P2P

    36/44

    Operations Bootstrap:

    New peer P.

    Contacts tracker. Gets partial list of other peers in the system.

    36

    onnec on s a s men Each peer establishes connections with other peers Peers exchange bitmaps reporting which pieces they have Updated as and when new pieces downloaded.

  • 8/12/2019 Lecture 14 P2P

    37/44

    Maintaining block diversity

    1 2 3 4

    1 2

    1 2 3 4

    1 2

    Diversity No Diversity

    1 2

    37

    3 4

    1 2

    Distributed coordination:

    To select block to download: Rarest block first

    ??

    ??

  • 8/12/2019 Lecture 14 P2P

    38/44

    Guard against free-riding Free-riding: (Selfish, harmful to system)

    downloading without uploading

    Solution: Tit-for-tat: If you dont upload to me, Im not uploading to you! Free-rider is punished

    38

    ow muc you g ve s ow muc you get

  • 8/12/2019 Lecture 14 P2P

    39/44

    BitTorrent: Summary Pros:

    Works reasonably well in practice

    Gives peers incentive to share resources; avoidsfreeloaders

    Cons:

    39

    Central tracker server needed to bootstrap swarm Not clear how effective tit-for-tat is.

  • 8/12/2019 Lecture 14 P2P

    40/44

    Next Topic...

    Centralized Database Napster

    Query Flooding Gnutella

    Intelli ent Quer Floodin

    40

    KaZaA

    Swarming BitTorrent

    Unstructured Overlay Routing Freenet

    Structured Overlay Routing Distributed Hash Tables (DHT)

  • 8/12/2019 Lecture 14 P2P

    41/44

    Distributed Hash Tables (DHTs)

    Gnutella: flooding oriented search

    Expensive for items not as popular.

    DHTs

    Develo ed in academic communit

    41

    Ensures efficient search/lookup without flooding Makes some things harder

    Fuzzy queries / full-text search / etc.

    Deployment

    Adopted by recent versions of BitTorrent, and Emule (Kad)

  • 8/12/2019 Lecture 14 P2P

    42/44

    Distributed Hash Tables

    Nodes in system form a distributed data structure

    Clever structure

    Convenient to locate objects Different than Gnutella: structure can be arbitrary

    Each node has a unique id.

    42

    Usually hash of nodes IP address

    Each object has a unique id.

    Usually hash of filename.

    Map object to node nearest it in id space

  • 8/12/2019 Lecture 14 P2P

    43/44

    DHT: Overview

    Structured Overlay Routing:

    Join: On startup, contact a bootstrap node and integrate

    yourself into the distributed data structure; get a node id Publish: Route publication for file idtoward a close node id

    along the data structure

    43

    .

    Data structure guarantees that query will meet thepublication.

    Fetch: Two options:

    Publication contains actual file => fetch from where query stops

    Publication says I have file X => query tells you 128.2.1.3 hasX, use IP routing to get X from 128.2.1.3

  • 8/12/2019 Lecture 14 P2P

    44/44

    Distributed Hash Tables (DHTs) Hash table interface: put(key,item), get(key)

    O(log n) hops

    Structured Networks

    K IK I(K1,I1)

    K I

    K I

    K I

    K I

    K I

    K I

    K I

    put(K1,I1) get (K1)

    I1