
Unstructured P2P overlay

Centralized model
• e.g. Napster
• global index held by central authority
• direct contact between requestors and providers

Decentralized model
• e.g. Gnutella, Freenet, Chord
• no global index – local knowledge only (approximate answers)
• contact mediated by chain of intermediaries

P2P Application

Gnutella search mechanism

• Each peer keeps a list of other peers that it knows about (its neighbors).
• Increasing the degree of the peers reduces the longest path from one peer to another but requires more storage at each peer.
• Once a peer is connected to the overlay, it can exchange messages with the peers in its neighbor list.

Gnutella search mechanism (example)

[Figure: seven-node overlay (nodes 1 to 7); the query for file A spreads hop by hop, and the replies A:5 and A:7 are propagated back]

Steps:
• Node 2 initiates search for file A
• Sends message to all neighbors
• Neighbors forward message
• Nodes that have file A initiate a reply message
• Query reply message is back-propagated
• File download

Scalability

• Whenever a node receives a message (ping/query), it sends copies out to all of its other connections.
• Existing mechanisms to reduce traffic:
  – TTL counter
  – Nodes cache information about the messages they have received, so that they don't forward duplicate messages.


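FloodForward(Query q, Source p)
    // drop the query if it has been seen before
    if (q.id ∈ oldIdsQ)
        return
    oldIdsQ = oldIdsQ ∪ {q.id}
    // expire the query once its TTL is used up
    q.TTL = q.TTL - 1
    if (q.TTL <= 0)
        return
    // forward to every neighbor except the sender
    foreach (s ∈ Neighbors)
        if (s <> p)
            send(s, q)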
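The FloodForward rule above can be exercised on a small graph. The following is a minimal sketch, not Gnutella's actual implementation: the seven-node overlay, the set of file holders, and the function name `flood_search` are all assumptions made for illustration. It counts every message sent, including duplicates that the receiver later drops.

```python
from collections import deque

# Hypothetical 7-node overlay (adjacency lists); not the topology from the slides.
OVERLAY = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 5],
    4: [2, 6],
    5: [3, 7],
    6: [4],
    7: [5],
}

def flood_search(graph, source, holders, ttl):
    """Flood one query from `source`; return (holders reached, messages sent)."""
    seen = {source}            # peers that already processed this query id
    hits = set()
    messages = 0
    queue = deque()
    for s in graph[source]:    # the source sends the query to all neighbors
        queue.append((s, source, ttl))
        messages += 1
    while queue:
        peer, sender, remaining = queue.popleft()
        if peer in seen:       # duplicate query id: drop it
            continue
        seen.add(peer)
        if peer in holders:    # this peer would start a reply message
            hits.add(peer)
        remaining -= 1         # decrement TTL before forwarding
        if remaining <= 0:
            continue
        for s in graph[peer]:
            if s != sender:    # forward to everyone except the sender
                queue.append((s, peer, remaining))
                messages += 1
    return hits, messages

print(flood_search(OVERLAY, source=2, holders={5, 7}, ttl=7))
```

Lowering `ttl` in this sketch reduces the message count but can also miss holders that are farther away, which is the trade-off the TTL counter introduces.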

Total Generated Traffic

• Ripeanu has determined that Gnutella traffic totals 1 Gbps (or 330 TB/month)!
  – Compare to 15,000 TB/month in the US Internet backbone (Dec. 2000)
  – This estimate excludes actual file transfers
• Reasoning:
  – QUERY and PING messages are flooded. They form more than 90% of generated traffic
  – Predominant TTL=7

Mapping between Gnutella Network and Internet Infrastructure

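[Figure: overlay nodes A to H laid over the physical network] Perfect mapping

Mismatch between Gnutella Network and Internet Infrastructure

[Figure: the same nodes A to H with an overlay that does not follow the physical links]

• Inefficient mapping
• Link D-E needs to support six times higher traffic.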

Free Riding on Gnutella

• 70% of Gnutella users share no files
• 90% of users answer no queries
• Those who have files to share may limit the number of connections or upload speed, resulting in a high download failure rate.
• If only a few individuals contribute to the public good, these few peers effectively act as centralized servers.

Query Expressiveness

• Format of query not standardized
• No standard format or matching semantics for the QUERY string. Its interpretation is completely determined by each node that receives it.
  – String literal vs. regular expression
  – Directory name, filename, or file contents
• Malicious users may even return files unrelated to the query

Conclusions

• Gnutella is a self-organizing, large-scale P2P application that produces an overlay network on top of the Internet; it appears to work
• Freedom
• High network traffic cost
• Scalability
• File availability

Random Walk

• To avoid the message overhead of flooding, unstructured overlays can use some type of random walk.
• A single query message is sent to a randomly selected neighbor
• The message has a TTL that is decremented at each hop
• Termination
  – The query locates a node with the desired object
  – Search timeout

Random Walk

To improve the response time, several random walk queries can be issued in parallel.
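A random walk with several parallel walkers can be sketched as follows. This is an illustrative model only: the function name `random_walk_search`, the lockstep movement of the walkers, and the tiny example graphs are assumptions, not part of any protocol described here.

```python
import random

def random_walk_search(graph, source, holders, ttl, walkers=1, rng=random):
    """Return (holder found, hops taken), or None if every walker times out."""
    positions = [source] * walkers
    for hop in range(1, ttl + 1):             # each hop uses up one unit of TTL
        for w, peer in enumerate(positions):
            nxt = rng.choice(graph[peer])     # one randomly selected neighbor
            if nxt in holders:                # termination: object located
                return nxt, hop
            positions[w] = nxt
    return None                               # termination: search timeout

# Hypothetical graphs for illustration.
TWO_PEERS = {1: [2], 2: [1]}
RING = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}

print(random_walk_search(TWO_PEERS, 1, {2}, ttl=3))
print(random_walk_search(RING, 0, {4}, ttl=20, walkers=4, rng=random.Random(0)))
```

With `walkers=k`, each hop costs k messages instead of one, but the expected number of hops until some walker reaches a holder drops, which is the response-time improvement mentioned above.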


Some References

[1] Eytan Adar and Bernardo A. Huberman, Free Riding on Gnutella http://www.firstmonday.dk/issues/issue5_10/adar/

[2] Igor Ivkovic, Improving Gnutella Protocol: Protocol Analysis And Research Proposals http://www9.limewire.com/download/ivkovic_paper.pdf

[3] Jordan Ritter, Why Gnutella Can't Scale. No, Really. http://www.monkey.org/~dugsong/mirror/gnutella.html

[4] Matei Ripeanu, Peer-to-Peer Architecture Case Study: Gnutella network. http://www.cs.uchicago.edu/%7Ematei/PAPERS/gnutella-rc.pdf

[5] The Gnutella Protocol Specification v0.4 http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf


Improving on Flooding and Random Walk

Overviews

Peer-to-peer Networks
• Peers are connected by an overlay network.
• Users cooperate to share files (e.g., music, videos, etc.)

Disadvantages
• Flooding: does not scale
• Random walks: take a long time to find an object

Key ideas to improve performance
• Query forwarding criteria
• Overlay topology
• Object placement

Overviews

• Query forwarding
  – By using additional knowledge about where the object is likely to be.
• Overlay topology
  – Proximity of the peers in the network
  – Connecting with high-degree nodes
  – Shared properties of the peers
• Object placement
  – Object popularity

Overviews

Metrics
• Overlay hop (hop)
  – An overlay hop may correspond to many network hops!
• Request hit rate
• Latency

Topics

Search strategies
• Beverly Yang and Hector Garcia-Molina, “Improving Search in Peer-to-Peer Networks”, ICDCS 2002
• Arturo Crespo and Hector Garcia-Molina, “Routing Indices For Peer-to-Peer Systems”, ICDCS 2002

Short cuts
• Kunwadee Sripanidkulchai, Bruce Maggs and Hui Zhang, “Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems”, INFOCOM 2003

Replication
• Edith Cohen and Scott Shenker, “Replication Strategies in Unstructured Peer-to-Peer Networks”, SIGCOMM 2002

Improving Search in Peer-to-Peer Networks

ICDCS 2002

Beverly Yang, Hector Garcia-Molina

Motivation

• The purpose of a data-sharing P2P system is to accept queries from users, and locate and return data (or pointers to the data).
• Metrics
  – Cost
    • Average aggregate bandwidth
    • Average aggregate processing cost
  – Quality of results
    • Number of results
    • Satisfaction: a query is satisfied if Z (a value specified by the user) or more results are returned.
    • Time to satisfaction

Current Techniques

• Gnutella
  – BFS with depth limit D.
  – Wastes bandwidth and processing resources.
• Freenet
  – DFS with depth limit D.
  – Poor response time.

Broadcast policies

• Iterative deepening (expanding ring)
• Directed BFS

Iterative Deepening

• In systems where satisfaction is the metric of choice, iterative deepening is a good technique
• Under policy P = {a, b, c} with waiting time W:
  – A source node S first initiates a BFS of depth “a”
  – The query is processed and then becomes frozen at all nodes that are “a” hops from the source
  – S waits for a time period W

Iterative Deepening

• If the query is not satisfied, S will start the next iteration, initiating a BFS of depth b.
• S sends a “Resend” message with a TTL of “a”
• A node that receives a Resend message will
  – simply forward the message, or
  – if the node is at depth “a”, drop the Resend message and unfreeze the corresponding query by forwarding the query message with a TTL of b-a to all its neighbors
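The iterative-deepening policy can be sketched as a BFS that pauses at each depth in the policy and resumes from the frozen frontier. This is a simplified model under stated assumptions: the waiting time W and the Resend messages are not simulated, and the overlay, the holder set, and the name `iterative_deepening` are hypothetical.

```python
# Hypothetical 7-node overlay (adjacency lists) used only for illustration.
OVERLAY = {
    1: [2, 3],
    2: [1, 3, 4],
    3: [1, 2, 5],
    4: [2, 6],
    5: [3, 7],
    6: [4],
    7: [5],
}

def iterative_deepening(graph, source, holders, policy, z):
    """Return (results, depth at which the query was satisfied or None)."""
    visited = {source}
    frontier = [source]        # peers currently holding the frozen query
    results = set()
    depth = 0
    for limit in policy:       # e.g. policy = [a, b, c]
        # Unfreeze the frontier and continue the BFS up to depth `limit`.
        while depth < limit and frontier:
            next_frontier = []
            for peer in frontier:
                for s in graph[peer]:
                    if s not in visited:
                        visited.add(s)
                        next_frontier.append(s)
                        if s in holders:
                            results.add(s)
            frontier = next_frontier
            depth += 1
        if len(results) >= z:  # satisfied: Z or more results returned
            return results, limit
    return results, None       # policy exhausted without satisfaction

print(iterative_deepening(OVERLAY, 2, {5, 7}, policy=[1, 2, 3], z=1))
```

Because the search resumes from the frozen frontier, peers within depth “a” do not process the query a second time when the depth-b iteration starts.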

Directed BFS

• If minimizing response time is important to an application, iterative deepening may not be appropriate
• A source sends query messages to just a subset of its neighbors
• A node maintains simple statistics on its neighbors
  – Number of results received from each neighbor
  – Latency of connection

Directed BFS (cont)

Candidate nodes
• The neighbor that has returned the highest number of results
• The neighbor that returns response messages that have taken the lowest average number of hops
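Neighbor selection for Directed BFS can be sketched with a small statistics record per neighbor. The class `NeighborStats`, the function `pick_neighbors`, the heuristic names, and the sample numbers are assumptions for illustration; the two heuristics correspond to the candidate rules listed above.

```python
from dataclasses import dataclass

@dataclass
class NeighborStats:
    results: int = 0      # results received through this neighbor
    hops_total: int = 0   # summed hop counts of its response messages
    responses: int = 0    # number of response messages seen

    def avg_hops(self):
        # Neighbors with no responses yet rank last under "fewest_hops".
        return self.hops_total / self.responses if self.responses else float("inf")

def pick_neighbors(stats, k, heuristic="most_results"):
    """Return the k neighbors a source would send the query to."""
    if heuristic == "most_results":
        key = lambda n: -stats[n].results
    elif heuristic == "fewest_hops":
        key = lambda n: stats[n].avg_hops()
    else:
        raise ValueError(f"unknown heuristic: {heuristic}")
    return sorted(stats, key=key)[:k]

# Hypothetical statistics gathered from earlier queries.
stats = {
    "B": NeighborStats(results=5, hops_total=20, responses=5),
    "C": NeighborStats(results=9, hops_total=45, responses=9),
    "D": NeighborStats(results=2, hops_total=4, responses=2),
}
print(pick_neighbors(stats, 2))
print(pick_neighbors(stats, 1, "fewest_hops"))
```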

Routing Indices For Peer-to-Peer Systems

Arturo Crespo, Hector Garcia-Molina Stanford University {crespo,hector}@db.Stanford.edu


Motivation

• A distributed-index mechanism: Routing Indices (RIs)
  – Give a “direction” towards the document, rather than its actual location
• By using “routes”, the index size is proportional to the number of neighbors

Peer-to-peer Systems

• A P2P system is formed by a large number of nodes that can join or leave the system at any time
• Each node has a local document database that can be accessed through a local index
• The local index receives content queries and returns pointers to the documents with the requested content

Routing indices

• The objective of a Routing Index (RI) is to allow a node to select the “best” neighbors to send a query to
• An RI is a data structure that, given a query, returns a list of neighbors, ranked according to their goodness for the query
• Each node has a local index for quickly finding local documents when a query is received. Nodes also have a compound RI (CRI) containing
  – the number of documents along each path
  – the number of documents on each topic

Routing indices (cont.)

• Thus, the number of results in a path can be estimated as:
  $\text{NumberOfDocuments} \times \prod_i \frac{\text{CRI}(s_i)}{\text{NumberOfDocuments}}$, where $\text{CRI}(s_i)$ is the number of documents on query topic $s_i$ along that path
• Example: search for documents that contain (DB and L)
  – Goodness of B: (20/100) × (30/100) × 100 = 6
  – C: (0/1000) × (50/1000) × 1000 = 0
  – D: (100/200) × (150/200) × 200 = 75
• Note that these numbers are just estimates and they are subject to overcounts and/or undercounts
• A limitation of using CRIs is that they do not take into account the difference in cost due to the number of “hops” necessary to reach a document
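The goodness estimate from the example can be reproduced in a few lines. This sketch assumes the CRI rows implied by the worked numbers (B: 100 documents, C: 1000, D: 200, with their DB and L counts); the names `cri_goodness` and `CRI` are hypothetical.

```python
from math import prod

def cri_goodness(total_docs, topic_counts, query_topics):
    """Estimated number of matching documents along one path."""
    if total_docs == 0:
        return 0.0
    return total_docs * prod(
        topic_counts.get(t, 0) / total_docs for t in query_topics
    )

# neighbor -> (documents along the path, documents per topic)
CRI = {
    "B": (100, {"DB": 20, "L": 30}),
    "C": (1000, {"DB": 0, "L": 50}),
    "D": (200, {"DB": 100, "L": 150}),
}

query = ["DB", "L"]
# Rank neighbors from best to worst for this query.
ranked = sorted(CRI, key=lambda n: cri_goodness(*CRI[n], query), reverse=True)
for n in ranked:
    print(n, round(cri_goodness(*CRI[n], query), 2))
```

The product of per-topic fractions treats the topics as independent, which is one reason the estimates can overcount or undercount.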

Using Routing Indices

Using Routing Indices (cont.)

• t is the counter size in bytes, c is the number of categories, N the number of nodes, and b the branching factor
• A centralized index would require t × (c + 1) × N bytes
• The total for the entire distributed system is t × (c + 1) × b × N bytes
• Although the RIs require more storage space overall than a centralized index, the cost of the storage space is shared among the network nodes
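The two storage formulas can be compared with a quick calculation. The parameter values below (4-byte counters, 10 categories, 1000 nodes, branching factor 4) are assumed purely for illustration, as are the function names.

```python
def centralized_index_bytes(t, c, n):
    """Storage for one central index: t * (c + 1) * N."""
    return t * (c + 1) * n

def distributed_ri_bytes(t, c, n, b):
    """Total storage over all nodes' routing indices: t * (c + 1) * b * N."""
    return t * (c + 1) * b * n

t, c, n, b = 4, 10, 1000, 4   # assumed example values
total = distributed_ri_bytes(t, c, n, b)
print(centralized_index_bytes(t, c, n))  # bytes held by one central server
print(total)                             # bytes summed over the whole system
print(total // n)                        # bytes held by each node
```

The distributed total is b times larger, but each node stores only t × (c + 1) × b bytes.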

Creating Routing Indices


Maintaining Routing Indices

Maintaining RIs is identical to the process used for creating them

For efficiency, we may delay exporting an update for a short time so we can batch several updates, thus, trading RI freshness for a reduced update cost


Hop-count Routing Indices

Hop-count Routing Indices (cont.)

• The estimator of a hop-count RI needs a cost model to compute the goodness of a neighbor
• We assume that document results are uniformly distributed across the network and that the network is a regular tree with fanout F
• We define the goodness ($goodness_{hc}$) of Neighbor$_i$ with respect to query Q for a hop-count RI as:
  $goodness_{hc}(\text{Neighbor}_i, Q) = \sum_{j=1}^{h} \frac{goodness(N_i[j], Q)}{F^{\,j-1}}$, where h is the horizon and $N_i[j]$ is the index entry for documents j hops away through Neighbor$_i$
• If we assume F = 3, the goodness of X for a query about “DB” documents would be 13 + 10/3 = 16.33 and for Y would be 0 + 31/3 = 10.33
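The hop-count goodness values for X and Y can be checked with a short function. The per-hop document counts (13 and 10 for X, 0 and 31 for Y) are taken from the arithmetic in the example; the function name `hop_count_goodness` is an assumption.

```python
def hop_count_goodness(docs_per_hop, fanout):
    """Sum the per-hop estimates, discounting hop j by fanout**(j - 1)."""
    # enumerate starts at 0, so the first hop is divided by fanout**0 = 1
    return sum(d / fanout ** j for j, d in enumerate(docs_per_hop))

x = hop_count_goodness([13, 10], fanout=3)   # 13 + 10/3
y = hop_count_goodness([0, 31], fanout=3)    # 0 + 31/3
print(round(x, 2), round(y, 2))
```

Y reaches more documents in total (31 against 23), yet X ranks higher because its documents are closer.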

Exponentially aggregated RI

• Each entry of the ERI for node N contains a value computed as:
  $\sum_{j=1}^{th} \frac{goodness(N[j], T)}{F^{\,j-1}}$
• th is the height and F the fanout of the assumed regular tree, goodness() is the Compound RI estimator, N[j] is the summary of the local index of neighbor j of N, and T is the topic of interest of the entry

Exponentially aggregated RI (cont.)

Cycles in the P2P Network



There are three general approaches for dealing with cycles:

• No-op solution: no changes are made to the algorithms
  – Only works with the hop-count and the exponential RI schemes
  – Hop-count RI: cycles longer than the horizon will not affect the RI. However, shorter cycles will affect the hop-count RI
  – Exponential RI: updates are sent back to the originator. However, the effect of the cycle will be smaller and smaller every time the update is sent back (due to the exponential decay)
• Cycle avoidance solution: do not allow nodes to create an “update” connection to other nodes if such a connection would create a cycle
  – Limitation: absence of global information
• Cycle detection and recovery: this solution detects cycles some time after they are formed and then takes recovery actions to eliminate the effect of the cycles
  – Cycles can be detected by having the originating node of a query or an update, let us say A, include a unique message identifier in the message. If a message with the same identifier comes back to A, there is a cycle.

Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems

Background

• Each peer is connected randomly, and searching is done by flooding.
• Allows keyword search

[Figure: example of searching for an mp3 file in a Gnutella network. The query is flooded across the network.]

Background

• DHT (Chord):
  – Given a key, Chord will map the key to a node.
  – Each node needs to maintain O(log N) information
  – Each query uses O(log N) messages.
  – Key search means searching by exact name

[Figure: a Chord ring with about 50 nodes. The black lines point to adjacent nodes while the red lines are “finger” pointers that allow a node to find a key in O(log N) time.]

Interest-based Locality

• Peers that have similar interests will share similar content

Architecture
• Shortcuts are modular.
• Shortcuts are performance-enhancement hints.

Creation of shortcuts

• The peer uses the underlying topology (e.g. Gnutella) for the first few searches.
• One of the returned peers is selected at random and added to the shortcut list.
• Shortcuts are ordered by a metric, e.g. success rate or path latency.
• Subsequent queries go through the shortcut list first. If that fails, the lookup goes through the underlying topology.
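The shortcut mechanism can be sketched as a small class layered over any fallback search. Everything named here (`ShortcutList`, `find_by_flooding`, the `contents` table) is hypothetical; the sketch ranks shortcuts by success rate only and does not model path latency or a bound on the list size.

```python
import random

class ShortcutList:
    """Shortcut list for one peer, ranked by observed success rate."""

    def __init__(self):
        self.stats = {}  # shortcut peer -> [successes, attempts]

    def ranked(self):
        def rate(peer):
            successes, attempts = self.stats[peer]
            return successes / attempts if attempts else 0.0
        return sorted(self.stats, key=rate, reverse=True)

    def lookup(self, item, contents, fallback, rng=random):
        # 1. Try the shortcuts first, best success rate first.
        for peer in self.ranked():
            self.stats[peer][1] += 1
            if item in contents[peer]:
                self.stats[peer][0] += 1
                return peer, "shortcut"
        # 2. Fall back to the underlying topology (e.g. flooding).
        responders = fallback(item)
        if not responders:
            return None, "underlying"
        # 3. Add one randomly chosen responder as a new shortcut.
        chosen = rng.choice(responders)
        self.stats.setdefault(chosen, [0, 0])
        return chosen, "underlying"

# Hypothetical content table standing in for the rest of the network.
contents = {"P1": {"a", "b"}, "P2": {"c"}}

def find_by_flooding(item):
    return [peer for peer in contents if item in contents[peer]]

shortcuts = ShortcutList()
print(shortcuts.lookup("a", contents, find_by_flooding))
print(shortcuts.lookup("b", contents, find_by_flooding))
```

Because the fallback is passed in as a function, the same list works over flooding, a random walk, or a DHT lookup, which is the sense in which shortcuts are modular.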

Methodology – query workload

• Create traffic traces from real application traffic:
  – Boeing firewall proxies
  – Microsoft firewall proxies
  – Passively collected web traffic between CMU and the Internet
  – Passively collected typical P2P traffic (KaZaA, Gnutella)
• Use exact matching rather than keyword matching in the simulation.
  – “song.mp3” and “my artist – song.mp3” will be treated as different.

Methodology – Underlying peers topology

• Based on the Gnutella connectivity graph in 2001, with 95% of nodes about 7 hops away.
• Searching TTL is set to 7.
• For each kind of traffic (Boeing, Microsoft, etc.), run 8 simulations of 1 hour each.

Simulation Results – success rate

Enhancement of Interest-based Locality

Using Shortcuts’ Shortcuts
• Idea: add the shortcut’s shortcuts
• Performance gain of 7% on average

Interest-based Structures

When viewed as an undirected graph:
• In the first 10 minutes, there are many connected components, each with only a few peers.
• At the end of the simulation, there are few connected components, each with several hundred peers. Each component is well connected.
• The clustering coefficient is about 0.6 to 0.7.

Conclusion

• Interest-based shortcuts are modular, performance-enhancing hints layered over an existing P2P topology.
• Shortcuts can improve search efficiency.
• Shortcuts form clusters within a P2P topology, and the clusters are well connected.

Replication Strategies in Unstructured Peer-to-Peer Networks

Edith Cohen, AT&T Labs-Research
Scott Shenker, ICIR

(replication in) P2P architectures

• No proactive replication (Gnutella)
  – Hosts store and serve only what they requested

Question: how can replication be used to improve search efficiency in unstructured networks with a proactive replication mechanism?

Search and replication model

• Search: probe hosts, uniformly at random, until the query is satisfied (or the maximum search size is exceeded)
• Goal: minimize average search size (number of probes until the query is satisfied)
• Replication: each host can store up to $\rho$ copies of items.
• Unstructured networks with replication of keys or copies. Peers probed (in the search and replication process) are unrelated to the query/item

Search Example

[Figure: two example searches, one satisfied after 2 probes and one after 4 probes]

Search size

• What is the search size of a query?
  – Soluble queries: number of probes until an answer is found.
• We look at the Expected Search Size (ESS) of each item. The ESS is inversely proportional to the fraction of peers with a copy of the item.

Expected Search Size (ESS)

• $n$ nodes, each with capacity $\rho$; total capacity $R = n\rho$
• $r_i$ = number of copies of the $i$-th item
• Allocation: $p_1\,(= r_1/R),\ p_2,\ p_3, \ldots, p_m$ with $\sum_i p_i = 1$; the $i$-th item is allocated a $p_i$ fraction of storage.
• $m$ items with relative query rates $q_1 > q_2 > q_3 > \ldots > q_m$, $\sum_i q_i = 1$
• Search size for the $i$-th item is a geometric r.v. with mean $A_i = n/(p_i R) = 1/(\rho p_i)$.
• ESS is $\sum_i q_i A_i = \left(\sum_i q_i / p_i\right)/\rho$

Uniform and Proportional Replication

Two natural strategies:
• Uniform allocation: $p_i = 1/m$ ($m$ items)
  – Simple, resources are divided equally
• Proportional allocation: $p_i = q_i$
  – “Fair”, resources per item proportional to demand
  – Reflects current P2P practices

Example: 3 items, $q_1 = 1/2$, $q_2 = 1/3$, $q_3 = 1/6$

[Figure: Uniform vs. Proportional allocation for the example]

Basic Questions

How do Uniform and Proportional allocations perform/compare ?

Which strategy minimizes the Expected Search Size (ESS) ?

Is there a simple protocol that achieves optimal replication in decentralized unstructured networks ?


ESS under Uniform and Proportional Allocations (soluble queries)

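• Lemma: the ESS under either Uniform or Proportional allocation is $m/\rho$
  – Independent of query rates (!!!)
  – Same ESS for Proportional and Uniform (!!!)
• Proof:
  – Proportional: ESS is $\left(\sum_i q_i/p_i\right)/\rho = \left(\sum_i q_i/q_i\right)/\rho = m/\rho$
  – Uniform: $p_i = (R/m)/R = 1/m$, so ESS is $\left(\sum_i q_i/p_i\right)/\rho = \left(\sum_i m\,q_i\right)/\rho = (m/\rho)\sum_i q_i = m/\rho$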
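The lemma can be checked numerically on the three-item example. This is a small sketch using exact fractions; the function name `expected_search_size` and the choice of ρ = 1 are assumptions for illustration.

```python
from fractions import Fraction

def expected_search_size(q, p, rho):
    """ESS = (sum_i q_i / p_i) / rho for query rates q and allocation p."""
    return sum(qi / pi for qi, pi in zip(q, p)) / rho

q = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # example query rates
m = len(q)
uniform = [Fraction(1, m)] * m      # p_i = 1/m
proportional = list(q)              # p_i = q_i

print(expected_search_size(q, uniform, rho=1))
print(expected_search_size(q, proportional, rho=1))
```

Both allocations give an ESS of m/ρ = 3 for this example, even though they split the storage very differently.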

Space of Possible Allocations

• Definition: allocation $p_1, p_2, p_3, \ldots, p_m$ is “in-between” Uniform and Proportional if for $1 \le i < m$, $q_{i+1}/q_i < p_{i+1}/p_i < 1$
• Theorem 1: all (strictly) in-between strategies are (strictly) better than Uniform and Proportional
• Theorem 2: $p$ is worse than Uniform/Proportional if
  – for all $i$, $p_{i+1}/p_i > 1$ (more popular gets less), OR
  – for all $i$, $q_{i+1}/q_i > p_{i+1}/p_i$ (less popular gets less than “fair share”)
• Proportional and Uniform are the worst “reasonable” strategies (!!!)

So, what is the best strategy for soluble queries ?


Square-Root Allocation

• $p_i$ is proportional to $\sqrt{q_i}$:
  $p_i = \frac{\sqrt{q_i}}{\sum_{j=1}^{m} \sqrt{q_j}}$
• Lies “in-between” Uniform and Proportional
• Theorem: Square-Root allocation minimizes the ESS (on soluble queries)
  – Minimize $\sum_i q_i/p_i$ such that $\sum_i p_i = 1$
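One way to see why the square-root allocation solves this minimization is a Lagrange-multiplier argument. The derivation below is a sketch added for illustration (it ignores the per-item bounds on $p_i$ and is not necessarily the proof used in the paper); $\lambda$ is the multiplier for the constraint $\sum_i p_i = 1$.

```latex
\begin{aligned}
\mathcal{L}(p, \lambda) &= \sum_i \frac{q_i}{p_i} + \lambda \Big( \sum_i p_i - 1 \Big) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\frac{q_i}{p_i^2} + \lambda = 0
  \;\Longrightarrow\; p_i = \sqrt{q_i / \lambda} \\
\sum_i p_i = 1 &\;\Longrightarrow\; p_i = \frac{\sqrt{q_i}}{\sum_j \sqrt{q_j}},
  \qquad \min \sum_i \frac{q_i}{p_i} = \Big( \sum_j \sqrt{q_j} \Big)^2
\end{aligned}
```

For the earlier example ($q = 1/2, 1/3, 1/6$) the minimum is $(\sqrt{1/2} + \sqrt{1/3} + \sqrt{1/6})^2 \approx 2.87$, compared with $m = 3$ under Uniform or Proportional allocation.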

Replication Algorithms

Desired properties of algorithm:
• Fully distributed, where peers communicate through random probes; minimal bookkeeping; and no more communication than what is needed for search.
• Converge to/obtain SR allocation when query rates remain steady.

Uniform and Proportional are “easy”:
  – Uniform: when an item is created, replicate its key in a fixed number of hosts.
  – Proportional: for each query, replicate the key in a fixed number of hosts

Model for Copy Creation/Deletion

• Creation: after a successful search, C(s) new copies are created at random hosts.
• Deletion: is independent of the identity of the item
• $\langle C_i \rangle$: average value of C used to replicate the $i$-th item.

Property of the process:
• Claim: if $\langle C_i \rangle / \langle C_j \rangle$ remains fixed over time, then $p_i/p_j \to q_i \langle C_i \rangle / (q_j \langle C_j \rangle)$

SR Replication Algorithms

• Path replication: number of new copies C(s) is proportional to the size of the search
• Probe memory: each peer records the number and combined search size of the probes it sees for each item. C(s) is determined by collecting this information from a number of peers proportional to the search size. This needs extra communication (proportional to that needed for search).

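Path Replication

• Number of new copies produced per query, $\langle C_i \rangle$, is proportional to search size $1/p_i$
• Creation rate is proportional to $q_i \langle C_i \rangle$
• Steady state: creation rate proportional to allocation $p_i$, thus
  $q_i \langle C_i \rangle \propto q_i / p_i \propto p_i$, and therefore $p_i \propto \sqrt{q_i}$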
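The steady-state argument can be illustrated with a deterministic "fluid" iteration. This is an assumed mean-field model, not the path replication protocol itself: in each step a fraction `step` of all copies is deleted regardless of item, and the same amount is re-created in proportion to the creation rate $q_i / p_i$. The names `path_replication_step` and `steady_state` are hypothetical.

```python
from math import sqrt

def path_replication_step(p, q, step=0.5):
    """One round: item-independent deletion, creation proportional to q_i / p_i."""
    creation = [qi / pi for qi, pi in zip(q, p)]   # q_i * <C_i>, with <C_i> ~ 1/p_i
    total = sum(creation)
    return [(1 - step) * pi + step * ci / total for pi, ci in zip(p, creation)]

def steady_state(q, iterations=50):
    p = [1 / len(q)] * len(q)      # start from the uniform allocation
    for _ in range(iterations):
        p = path_replication_step(p, q)
    return p

q = [1 / 2, 1 / 3, 1 / 6]
norm = sum(sqrt(qi) for qi in q)
square_root = [sqrt(qi) / norm for qi in q]   # target: p_i proportional to sqrt(q_i)

print([round(x, 4) for x in steady_state(q)])
print([round(x, 4) for x in square_root])
```

In this model the allocation stays normalized at every step, and its fixed point satisfies $p_i \propto q_i / p_i$, which is the square-root allocation.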

Summary

• Random search/replication model: probes to “random” hosts
• Soluble queries:
  – Proportional and Uniform allocations are two extremes with the same average performance
  – Square-Root allocation minimizes Average Search Size
• OPT (all queries) lies between SR and Uniform
• SR/OPT allocation can be realized by simple algorithms.
