PEER-TO-PEER SYSTEMS

Joon Yoo

Dept. of Software Design & Management, Gachon University

Distributed Systems - Fall 2013


Outline

o Peer-to-peer systems

o Distributed Hash Table (DHT)

o BitTorrent


Peer-to-Peer Systems

o Server-Client system

n Hot spots (Google, Naver, etc.) keep getting hotter while cold pipes remain unused

o Peer-to-Peer system

n share the information, bandwidth and computing resources of individual users

o Main Problem

n Searching for location of the content


Classification

o How to search for the content

o Server-based search

n Not actually peer-to-peer; single point of failure

n Soribada, Napster, eDonkey

o Flooding-based search

n Simple but doesn’t scale: worst case O(N) messages per search

n Gnutella

o Distributed Hash Table (DHT)

n Scalable but complicated

n CAN, Chord…


Centralized lookup

[Figure: nodes N1, N2, N3, N6, N7, N8, N9 and a central DB. The publisher at N4 holds Key=“title”, Value=file data… and registers SetLoc(“title”, N4); the client issues Lookup(“title”).]

Simple, but O(N) state and a single point of failure
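A minimal Python sketch of the centralized scheme, assuming a single in-memory directory. The class and method names (`CentralIndex`, `set_loc`, `lookup`) are illustrative only and do not come from any real system. The directory keeps one entry per published key, which is where both the O(N) state and the single point of failure come from.

```python
class CentralIndex:
    """Hypothetical central directory: maps each key to the node storing it."""

    def __init__(self):
        self._locations = {}  # one entry per published key

    def set_loc(self, key, node):
        # A publisher registers where the data for `key` lives.
        self._locations[key] = node

    def lookup(self, key):
        # A client asks the directory; None means nobody published the key.
        return self._locations.get(key)


index = CentralIndex()
index.set_loc("title", "N4")      # publisher at N4 registers its file
holder = index.lookup("title")    # client learns that N4 holds "title"
```

If the process holding `index` goes down, no lookup can succeed, even though every file is still stored on some peer.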


Flooded queries

[Figure: nodes N1, N2, N3, N6, N7, N8, N9 and the publisher at N4, which holds Key=“title”, Value=file data…; the client’s Lookup(“title”) is flooded from node to node.]

Robust, but worst case O(N) messages per lookup
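A rough Python sketch of a flooded lookup, assuming a breadth-first flood with a hop limit (`ttl`). The function name, the chain topology and the message count are illustrative assumptions; the topology of the slide's figure is not reproduced here.

```python
from collections import deque


def flood_lookup(graph, holders, start, key, ttl):
    """Flood a query from `start`; return (holder or None, messages sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    messages = 0  # counts first deliveries only; real floods also send duplicates
    while queue:
        node, hops_left = queue.popleft()
        if key in holders.get(node, ()):
            return node, messages
        if hops_left == 0:
            continue  # hop limit reached, do not forward further
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                messages += 1
                queue.append((neighbor, hops_left - 1))
    return None, messages


# Hypothetical overlay: a simple chain N1 - N2 - N3 - N4, with the file at N4.
graph = {"N1": ["N2"], "N2": ["N1", "N3"], "N3": ["N2", "N4"], "N4": ["N3"]}
holders = {"N4": {"title"}}

found = flood_lookup(graph, holders, "N1", "title", ttl=3)
missed = flood_lookup(graph, holders, "N1", "title", ttl=2)
```

With a large enough hop limit the query reaches every node, so the number of messages grows with N; with a small one the query can miss content that exists.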


Routed queries (Freenet, Chord, etc.)

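[Figure: nodes N1, N2, N3, N6, N7, N8, N9 and the publisher at N4, which holds Key=“title”, Value=file data…; the client’s Lookup(“title”) is routed hop by hop toward the publisher.]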


Outline

o Peer-to-peer systems

o Distributed Hash Table (DHT)

n Chapter 2.2.2 Decentralized Architectures

o BitTorrent


The lookup problem

• At the heart of all DHTs

[Figure: nodes N1 to N6 connected through the Internet. The publisher calls Put(Key=“title”, Value=file data…); the client calls Get(key=“title”). Which node holds the key?]


DHT algorithms

o Chord (2001)

n I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, H. Balakrishnan (MIT), “Chord: A scalable peer-to-peer lookup service for internet applications,” ACM SIGCOMM’01

n 11,032+ citations as of Dec. 2013 (Google Scholar)

o CAN (2001)

n S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker (UC Berkeley), “A scalable content-addressable network,” ACM SIGCOMM’01

n 8,323+ citations as of Dec. 2013 (Google Scholar)


DHT Basics

o Intuition

n In the previous P2P examples, data was located either through a centralized server or by a fully distributed, unstructured search without a server.

n In a DHT, the nodes are organized in a structure called a DHT, so we need a mechanism that maps data item keys onto the distributed nodes.

o Data item mapping

n Data item: assigned a random key identifier from a large identifier space

n DHT nodes: assigned a random node identifier from the same identifier space

n Each data item key is mapped to exactly one DHT node according to some distance metric.
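The mapping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: identifiers are taken from a SHA-1 hash reduced to a 16-bit space, and the distance metric is the plain absolute difference. Chord and CAN each define their own identifier space and metric, and the names `identifier` and `responsible_node` are made up for this sketch.

```python
import hashlib

ID_BITS = 16  # assumed size of the identifier space: 2**16 identifiers


def identifier(name):
    """Hash a key or a node name into the shared identifier space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** ID_BITS)


def responsible_node(key_id, node_ids):
    """Pick the node identifier closest to the key (assumed metric: |a - b|).

    Ties are broken toward the smaller identifier so the mapping is unique.
    """
    return min(node_ids, key=lambda node_id: (abs(node_id - key_id), node_id))


node_ids = [identifier(name) for name in ("N1", "N2", "N3", "N4")]
owner = responsible_node(identifier("title"), node_ids)
```

Because every participant computes the same hash and the same metric, any node can work out which identifier is responsible for a key without asking a central server.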


CAN (1/6)

o Node (n1) joins

n Function(n1) = (x1,y1)

n n1 takes all space

o Node (n2) joins

n Function(n2) = (x2,y2)

n Identification space is split

§ Into rectangular spaces

o Each node has neighbor information

n n1 knows n2


[Figure: two-dimensional identifier space with both axes running from 0 to 7; n1 at (x1,y1) owns the whole space.]
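The join procedure can be sketched in Python as follows. This is a simplified illustration, not the CAN authors' algorithm: zones are half-open rectangles in an 8 × 8 space, a zone is halved across its longer side, and the newcomer is assumed to take the half that contains its own point. The node coordinates are the ones listed in the CAN (6/6) example; `Zone` and `join` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, point):
        x, y = point
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def split(self):
        """Halve the zone across its longer side (a square is cut along x)."""
        if (self.x1 - self.x0) >= (self.y1 - self.y0):
            mid = (self.x0 + self.x1) / 2
            return (Zone(self.x0, self.y0, mid, self.y1),
                    Zone(mid, self.y0, self.x1, self.y1))
        mid = (self.y0 + self.y1) / 2
        return (Zone(self.x0, self.y0, self.x1, mid),
                Zone(self.x0, mid, self.x1, self.y1))


def join(zones, name, point):
    """Add node `name` at `point`; the current owner gives up half its zone."""
    if not zones:
        zones[name] = Zone(0, 0, 8, 8)  # the first node takes all space
        return
    owner = next(n for n, z in zones.items() if z.contains(point))
    first, second = zones[owner].split()
    # Assumption: the newcomer takes the half that contains its point.
    mine, theirs = (first, second) if first.contains(point) else (second, first)
    zones[name], zones[owner] = mine, theirs


zones = {}
for name, point in [("n1", (1, 2)), ("n2", (4, 2)), ("n3", (3, 5)),
                    ("n4", (5, 5)), ("n5", (6, 6))]:
    join(zones, name, point)
```

After the five joins every zone is either a square or a rectangle with sides in ratio 1:2 or 2:1, and together the zones still cover the whole space.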


CAN (2/6)


o n2 joins

[Figure: the 8 × 8 space split between n1(x1,y1) and n2(x2,y2).]

o n3 joins

[Figure: the space shared by n1, n2 and n3.]


CAN (3/6)


o n4 joins

[Figure: the space shared by n1, n2, n3 and n4.]

o n5 joins

[Figure: the space shared by n1, n2, n3, n4 and n5.]


CAN (4/6)

o Files f1~f4 added.

n Files are assigned to the node that is responsible for the region


[Figure: the space shared by n1 to n5, with files f1 to f4 placed in the regions of the nodes responsible for them.]


CAN (5/6)

o A client, Bob, wants to find file f4, but he only knows n1. The query must be routed.


[Figure: the space shared by n1 to n5 with files f1 to f4; Bob’s query for f4 enters at n1.]


CAN (6/6)

§ Each node knows its neighbors in the d-space

§ Forward query to the neighbor that is closest to the query id

§ Example: assume n1 queries f4

§ Space divided between nodes

§ Together, the nodes cover the entire space

§ Each node covers either a square or a rectangular area of ratios 1:2 or 2:1

§ Nodes: n1:(1,2); n2:(4,2); n3:(3,5); n4:(5,5); n5:(6,6)

§ Items: f1:(2,3); f2:(5,1); f3:(2,1); f4:(7,5)

[Figure: 8 × 8 coordinate space (axes 0 to 7) showing the zones of n1 to n5 and the items f1 to f4.]
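The greedy forwarding rule can be tried out on this example with a short Python sketch. The node coordinates and item positions come from the lists above; the zone rectangles and neighbor lists are assumptions (half-open rectangles, no wrap-around at the edges) that are consistent with the join order n1 to n5, since the figure itself is not available. "Closest" is taken here to mean Euclidean distance from a neighbor's coordinates to the query point.

```python
import math

# Coordinates are from the example; zones (x0, y0, x1, y1) and neighbor
# lists are assumed for illustration.
NODES = {
    "n1": {"point": (1, 2), "zone": (0, 0, 4, 4), "neighbors": ["n2", "n3"]},
    "n2": {"point": (4, 2), "zone": (4, 0, 8, 4), "neighbors": ["n1", "n4", "n5"]},
    "n3": {"point": (3, 5), "zone": (0, 4, 4, 8), "neighbors": ["n1", "n4"]},
    "n4": {"point": (5, 5), "zone": (4, 4, 6, 8), "neighbors": ["n2", "n3", "n5"]},
    "n5": {"point": (6, 6), "zone": (6, 4, 8, 8), "neighbors": ["n2", "n4"]},
}


def owns(zone, point):
    x0, y0, x1, y1 = zone
    return x0 <= point[0] < x1 and y0 <= point[1] < y1


def route(start, target):
    """Forward greedily until the node whose zone contains `target` is reached."""
    path = [start]
    current = start
    while not owns(NODES[current]["zone"], target):
        current = min(NODES[current]["neighbors"],
                      key=lambda name: math.dist(NODES[name]["point"], target))
        if current in path:
            raise RuntimeError("routing loop")  # guard for this toy model
        path.append(current)
    return path


path_to_f4 = route("n1", (7, 5))   # n1 queries f4 at (7,5)
path_to_f2 = route("n1", (5, 1))   # n1 queries f2 at (5,1)
```

Under these assumptions the query for f4 travels n1, n3, n4, n5, using only each node's local knowledge of its neighbors.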


Distributed Hash Table (DHT) usages

o Apache Cassandra: Facebook (2010), open source distributed database management system

o Mainline DHT: Most BitTorrent search engines

o Most distributed data stores employ some form of DHT for lookup.

o Memcached

o And many more…


Outline

o Peer-to-peer systems

o Distributed Hash Table (DHT)

o BitTorrent

n Chapter 2.2.3 Hybrid Architectures


What is BitTorrent?

o Efficient content distribution system using file swarming. Usually does not perform all the functions of a typical P2P system, such as searching.

o BitTorrent is the most widely used P2P program in the world

n Utilized by 150 million active users as of Jan. 2012

n BitTorrent has, on average, more active users than YouTube and Facebook combined.

n Since 2010, more than 20,000 BitTorrent users have been sued by copyright trolls.


File sharing

o To share a file or group of files, a peer first creates a .torrent file, a small file that contains

n metadata about the files to be shared, and

n information about the tracker, the computer that coordinates the file distribution.

o Peers first obtain a .torrent file, and then connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
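As a rough illustration of what a .torrent file carries, the Python sketch below builds the metadata as a plain dictionary. The key names (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) follow the BitTorrent metainfo format; the function name, the file name, the tracker URL and the tiny 4-byte piece length are made up for the example. A real .torrent file is additionally bencoded, which is omitted here.

```python
import hashlib


def make_metainfo(name, data, tracker_url, piece_length=4):
    """Build an unencoded metainfo dictionary for a single file."""
    pieces = [data[i:i + piece_length]
              for i in range(0, len(data), piece_length)]
    return {
        "announce": tracker_url,           # tells peers which tracker to contact
        "info": {                          # metadata about the shared file
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            # One 20-byte SHA-1 digest per piece, concatenated.
            "pieces": b"".join(hashlib.sha1(p).digest() for p in pieces),
        },
    }


meta = make_metainfo("demo.txt", b"hello peer-to-peer",
                     "http://tracker.example/announce")
```

The file contents themselves are not in the dictionary: a peer holding it knows only whom to ask (the tracker) and how to check what it receives (the piece hashes).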


Basic Idea

o The initial seeder chops the file into many pieces.

o A leecher first locates the .torrent file, which directs it to a tracker.

o The tracker tells the leecher which other peers are downloading that file.

o As a leecher downloads pieces of the file, replicas of the pieces are created. More downloads mean more replicas available.

o As soon as a leecher has a complete piece, it can potentially share it with other downloaders.

o Eventually each leecher obtains all the pieces, assembles the file, verifies the checksum, and becomes a seeder.
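The leecher's side of this process might look like the Python sketch below. It assumes that each piece is checked against a SHA-1 hash obtained in advance (for instance from the .torrent file) and that pieces may arrive in any order; `Download` and its methods are hypothetical names, not a real client API.

```python
import hashlib


def split_pieces(data, piece_length):
    """What the initial seeder does: chop the file into fixed-size pieces."""
    return [data[i:i + piece_length]
            for i in range(0, len(data), piece_length)]


class Download:
    """Hypothetical leecher state: verified pieces keyed by index."""

    def __init__(self, piece_hashes):
        self.piece_hashes = piece_hashes
        self.have = {}

    def add_piece(self, index, payload):
        if hashlib.sha1(payload).digest() != self.piece_hashes[index]:
            return False              # checksum mismatch: discard the piece
        self.have[index] = payload    # verified, so it can now be shared
        return True

    def is_seeder(self):
        return len(self.have) == len(self.piece_hashes)

    def assemble(self):
        if not self.is_seeder():
            raise ValueError("download incomplete")
        return b"".join(self.have[i] for i in range(len(self.piece_hashes)))


original = b"abcdefghij"
pieces = split_pieces(original, 4)                  # 3 pieces: 4 + 4 + 2 bytes
hashes = [hashlib.sha1(p).digest() for p in pieces]

download = Download(hashes)
rejected = download.add_piece(1, b"XXXX")           # corrupt data is refused
for index in (2, 0, 1):                             # pieces arrive out of order
    download.add_piece(index, pieces[index])
```

Once `is_seeder()` is true, `assemble()` returns the original bytes and the peer can keep serving every piece to others.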


Piece Selection

o The order in which pieces are selected by different peers is critical for good performance

o If an inefficient policy is used, then peers may end up in a situation where each has an identical set of easily available pieces, and none of the missing ones.

o If the original seed is then taken down prematurely, the file cannot be completely downloaded! What are “good policies”?


Piece Selection: Rarest Piece First

o General rule

o Determine the pieces that are rarest among your peers, and download those first.

o This ensures that the most commonly available pieces are left until the end of the download.
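A small Python sketch of the rarest-first rule, assuming each peer has advertised the set of piece indices it holds. The function name and the example peers are illustrative; ties are broken by piece index here to keep the result deterministic, whereas a real client may choose among equally rare pieces at random.

```python
from collections import Counter


def rarest_first(my_pieces, peer_pieces):
    """Return the pieces we still need, rarest among our peers first."""
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)   # count how many peers hold each piece
    needed = [p for p in availability if p not in my_pieces]
    return sorted(needed, key=lambda p: (availability[p], p))


# Hypothetical swarm: piece 0 is everywhere, pieces 2 and 3 exist only once.
peers = {"A": {0, 1, 2}, "B": {0, 1}, "C": {0, 3}}
order = rarest_first({0}, peers)
```

Here pieces 2 and 3 are requested before piece 1, so a piece held by a single peer gets replicated before that peer can leave.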


Internal mechanism

o Built-in incentive mechanism (where all the magic happens):

n Choking Algorithm

n Optimistic Unchoking


Choking


o Choking is a temporary refusal to upload. It is one of BitTorrent’s most powerful ideas for dealing with free riders (those who only download but never upload).

o The tit-for-tat strategy is based on game-theoretic concepts.
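One way to picture choking together with optimistic unchoking is the Python sketch below. It is a simplified model under stated assumptions: a peer periodically unchokes the few peers that currently upload to it fastest (tit-for-tat) plus one randomly chosen choked peer. The function name, the three regular slots and the example rates are illustrative, not values taken from the slides.

```python
import random


def choose_unchoked(download_rates, regular_slots=3, rng=random):
    """Pick the peers we will upload to in the next round.

    download_rates maps peer -> rate at which that peer uploads to us.
    """
    ranked = sorted(download_rates, key=lambda p: (-download_rates[p], p))
    unchoked = set(ranked[:regular_slots])   # reward the best uploaders
    choked = ranked[regular_slots:]
    if choked:
        # Optimistic unchoke: give one other peer a chance to prove itself.
        unchoked.add(rng.choice(choked))
    return unchoked


rates = {"a": 50, "b": 40, "c": 30, "d": 0, "e": 0}
selected = choose_unchoked(rates, rng=random.Random(1))
```

A free rider such as `d` or `e` is served only when it wins the optimistic slot, while a newcomer with nothing to offer yet still gets an occasional chance to obtain its first pieces.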


Upload-Only mode

o Once its download is complete, a peer has no download rates to use for comparison, nor any need for them. The question is: which peers should it upload to?

o Policy: upload to the peers with the best upload rate. This ensures that pieces get replicated faster and new seeders are created quickly.


Pipelining

o When transferring data over TCP, always have several requests pending at once, to avoid a delay between pieces being sent. At any point in time, some number, typically 5, are requested simultaneously.

o Every time a piece arrives, a new request is sent out.
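The pipelining rule can be simulated with a short Python sketch. It models no real networking: responses are assumed to arrive in the order the requests were sent, and `pipelined_download` is a made-up name. The point is only that the number of outstanding requests stays at the chosen depth (5 here) until the work runs out.

```python
from collections import deque


def pipelined_download(request_ids, depth=5):
    """Simulate pipelining; return (arrival order, peak requests in flight)."""
    todo = deque(request_ids)
    in_flight = deque()
    received = []
    peak = 0
    # Fill the pipeline before waiting for any response.
    while todo and len(in_flight) < depth:
        in_flight.append(todo.popleft())
    while in_flight:
        peak = max(peak, len(in_flight))
        received.append(in_flight.popleft())   # a response arrives
        if todo:
            in_flight.append(todo.popleft())   # immediately send a new request
    return received, peak


received, peak = pipelined_download(range(12))
```

Without the pipeline, each request would wait a full round trip before the next one is sent, leaving the TCP connection idle in between.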


Distributed tracking: Trackerless torrents

o BitTorrent also supports "trackerless" torrents

o Trackerless torrents feature a DHT implementation that allows the client to download torrents that were created without a BitTorrent tracker.
