OpenDHT: A Public DHT Service

OpenDHT: A Public DHT Service Sean C. Rhea UC Berkeley June 2, 2005 Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu

Page 1: OpenDHT: A Public DHT Service

OpenDHT: A Public DHT Service

Sean C. Rhea, UC Berkeley

June 2, 2005

Joint work with: Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu

Page 2: OpenDHT: A Public DHT Service

Sean C. Rhea OpenDHT: A Public DHT Service June 2, 2005

Peer-to-Peer File Sharing

• Very simple insight
  – Most computers unused most of the time

• Idea: harness this spare capacity to
  – Quickly download music files [Napster, Gnutella]
  – Search for aliens [SETI@Home]
  – Make free long-distance phone calls [Skype]

• Question: how to find desired resource(s)?
  – Early approaches: scoped flooding
  – Downsides: scalability, accuracy

Page 3: OpenDHT: A Public DHT Service

A Better Search Facility: The Distributed Hash Table (DHT)

• Same interface as a programmatic hash table
  – put(key, value) stores value under key
  – get(key) returns the value(s) stored under key

• But shared across many machines

• Implemented via an overlay network
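The put/get interface described above can be sketched in a few lines of Python. This is a minimal in-process stand-in, not OpenDHT's implementation: the class name `ToyDHT`, the fixed node count, and the hash-modulo placement rule are all assumptions made for illustration, and no real overlay routing takes place.

```python
import hashlib
from collections import defaultdict


class ToyDHT:
    """In-process stand-in for a DHT: keys are hashed onto a fixed set of nodes."""

    def __init__(self, num_nodes=4):
        # Each "node" is just a local multimap from key to a list of values.
        self.nodes = [defaultdict(list) for _ in range(num_nodes)]

    def _node_for(self, key):
        # Hash the key to pick the responsible node (hypothetical placement rule).
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def put(self, key, value):
        """Store value under key on whichever node is responsible for key."""
        self._node_for(key)[key].append(value)

    def get(self, key):
        """Return the value(s) stored under key (an empty list if none)."""
        return list(self._node_for(key).get(key, []))


dht = ToyDHT()
dht.put("k1", "v1")
dht.put("k1", "v2")
print(dht.get("k1"))  # ['v1', 'v2']
```

The caller never learns which node holds the key; that is the property that lets the table be shared across many machines.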

Page 4: OpenDHT: A Public DHT Service

A Better Search Facility: The Distributed Hash Table (DHT)

[Figure: overlay of nodes, each holding a key/value (K V) table. put(k1, v1) is routed to one node, which stores (k1, v1); get(k1) is routed to that node, and v1 is returned.]

Page 5: OpenDHT: A Public DHT Service

DHTs and File Sharing: DHT Stores Pointers to Files

[Figure: overlay of nodes, each holding a K V table. A peer calls put(file, IP), and a pointer to the file is stored in the DHT.]

Page 6: OpenDHT: A Public DHT Service

DHTs and File Sharing: DHT Stores Pointers to Files

[Figure: overlay of nodes, each holding a K V table, one of which stores the pointer to the file. A peer calls get(file), receives the IP, and transfers the file over HTTP.]
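The two file-sharing slides amount to a publish step, put(file, IP), and a locate step, get(file), followed by a transfer over HTTP. A minimal sketch, assuming a local multimap in place of the DHT; the helper names `publish`, `locate`, and `download_url`, the URL layout, and the example addresses are hypothetical:

```python
from collections import defaultdict

# Stand-in for the DHT: a local multimap (assumption for illustration only).
table = defaultdict(list)


def put(key, value):
    table[key].append(value)


def get(key):
    return list(table.get(key, []))


def publish(filename, my_ip):
    """A peer that holds the file stores a pointer to itself: put(file, IP)."""
    put(filename, my_ip)


def locate(filename):
    """A downloader asks the DHT who holds the file: get(file) returns IP(s)."""
    return get(filename)


def download_url(filename, ip):
    """Build the URL for the transfer over HTTP (hypothetical URL layout)."""
    return f"http://{ip}/{filename}"


publish("song.mp3", "192.0.2.10")
publish("song.mp3", "192.0.2.11")
holders = locate("song.mp3")
urls = [download_url("song.mp3", ip) for ip in holders]
```

Only the pointer lives in the DHT; the file itself moves directly between peers.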

Page 7: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. A node that receives the message “I love you!” calls put(hash(msg), IP).]

Page 8: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. A second copy of “I love you!” arrives, and its receiver also calls put(hash(msg), IP).]

Page 9: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. With two copies of “I love you!” recorded, the annotation reads: “Something’s fishy!”]

Page 10: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. A third copy of “I love you!” arrives, and its receiver calls put(hash(msg), IP).]

Page 11: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. With three copies of “I love you!” recorded, the annotation reads: “Something’s fishy!”]
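The spam-detection walkthrough can be sketched as follows. Each receiver records put(hash(msg), IP), and a message looks suspicious once enough distinct receivers have reported the same hash. The threshold of 2, the helper names, and the use of SHA-1 are assumptions for illustration. An exact hash only matches identical messages, so detecting merely similar messages would need a different fingerprint.

```python
import hashlib
from collections import defaultdict

# Stand-in for the DHT: key -> set of reporting IPs (assumption for illustration).
table = defaultdict(set)


def put(key, value):
    table[key].add(value)


def get(key):
    return set(table.get(key, ()))


def msg_key(msg):
    """hash(msg): here a SHA-1 hex digest of the message text."""
    return hashlib.sha1(msg.encode("utf-8")).hexdigest()


def report(msg, receiver_ip):
    """Record that receiver_ip saw msg: put(hash(msg), IP)."""
    put(msg_key(msg), receiver_ip)


def is_fishy(msg, threshold=2):
    """True once at least `threshold` distinct receivers reported this message."""
    return len(get(msg_key(msg))) >= threshold


report("I love you!", "192.0.2.1")
first = is_fishy("I love you!")    # only one receiver so far
report("I love you!", "192.0.2.2")
second = is_fishy("I love you!")   # two receivers have now reported it
```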

Page 12: OpenDHT: A Public DHT Service

More DHT Applications

• Distributed Storage Systems
  – CFS, HiveCache, PAST, Pastiche
  – OceanStore / Pond

• Content Distribution Networks / Web Caches
  – Bslash, Coral, Squirrel

• Indexing / Naming Systems
  – Chord-DNS, CoDoNS, DOA, SFR

• Internet Query Processors
  – Catalogs, PIER

• Communication Systems
  – Bayeux, i3, MCAN, SplitStream

Page 13: OpenDHT: A Public DHT Service

Some Areas of DHT Research

• Better routing protocols
  – One-hop, degree-optimal

• Load balancing
  – Non-uniform key distributions

• Security
  – Byzantine fault-tolerant routing

• Data redundancy and fault tolerance
  – Replication, erasure-coding

• Stronger semantics
  – Supporting read-modify-write

Page 14: OpenDHT: A Public DHT Service

How Many DHTs Will There Be?

[Figure: nodes with K V tables taking part in a File Sharing DHT and a Spam Detection DHT. Annotations on individual machines: “Company Machine: Can’t Share Files” and “Owns Stock in Spam Company”.]

Page 15: OpenDHT: A Public DHT Service

How Many DHTs Will There Be?

[Figure: nodes with K V tables taking part in a File Sharing DHT and a Spam Detection DHT. Annotation: “Redundant Link”.]

Page 16: OpenDHT: A Public DHT Service

How Many DHTs Will There Be?

[Figure: nodes with K V tables taking part in a File Sharing DHT and a Spam Detection DHT. Annotation: “Unshared Links”.]

Page 17: OpenDHT: A Public DHT Service

Benefits of Sharing a DHT

• Amortizes costs across applications
  – Maintenance bandwidth, connection state, etc.

• Facilitates “bootstrapping” of new applications
  – Working infrastructure already in place

• Allows for statistical multiplexing of resources
  – Takes advantage of spare storage and bandwidth

• Facilitates upgrading existing applications
  – “Share” DHT between application versions

Page 18: OpenDHT: A Public DHT Service

Challenges in Sharing a DHT

• Robustness
  – Must be available 24/7

• Shared Interface Design
  – Should be general, yet easy to use

• Resource Allocation
  – Must protect against malicious/over-eager users

• Economics
  – What incentives are there to provide resources?


Page 20: OpenDHT: A Public DHT Service

The DHT as a Service

[Figure: overlay of nodes, each holding a key/value (K V) table.]

Page 21: OpenDHT: A Public DHT Service

The DHT as a Service

[Figure: overlay of nodes, each holding a key/value (K V) table, labeled OpenDHT.]

Page 22: OpenDHT: A Public DHT Service

The DHT as a Service

[Figure: the OpenDHT service and its clients.]

Page 23: OpenDHT: A Public DHT Service

The DHT as a Service

OpenDHT

Page 24: OpenDHT: A Public DHT Service

The DHT as a Service

OpenDHT

What is this interface?

Page 25: OpenDHT: A Public DHT Service

The Traditional Interface: lookup

[Figure: ring of nodes numbered 1 through 12.]

Page 26: OpenDHT: A Public DHT Service

The Traditional Interface: lookup

[Figure: lookup(k) is routed around the ring toward the key k.]

On reaching the successor of k, the message is passed to an “upcall”
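A minimal sketch of the lookup interface, assuming a toy ring held in one process: lookup(k) finds the successor of k (taken here to be the first node whose identifier is at least k, wrapping around the ring) and hands the message to that node's upcall. The `Ring` class, the small integer node ids, and the `record_upcall` handler are hypothetical; a real DHT routes hop by hop across machines.

```python
import bisect


class Ring:
    """Toy identifier ring; node ids are small integers chosen for the example."""

    def __init__(self):
        self.ids = []        # sorted node identifiers
        self.upcalls = {}    # node id -> function invoked when a lookup arrives

    def add_node(self, node_id, upcall):
        bisect.insort(self.ids, node_id)
        self.upcalls[node_id] = upcall

    def successor(self, k):
        """First node whose id is >= k, wrapping around past the largest id."""
        i = bisect.bisect_left(self.ids, k)
        return self.ids[i % len(self.ids)]

    def lookup(self, k, message):
        """Deliver message to the successor of k by invoking that node's upcall."""
        node = self.successor(k)
        return self.upcalls[node](node, k, message)


def record_upcall(node, k, message):
    # Hypothetical application handler: just report what arrived where.
    return f"node {node} got {message!r} for key {k}"


ring = Ring()
for node_id in (3, 6, 9, 12):
    ring.add_node(node_id, record_upcall)

result = ring.lookup(7, "hello")   # delivered to node 9
wrapped = ring.successor(13)       # wraps around to node 3
```

The upcall is application code running on the DHT node itself, which is what makes it powerful and, in a shared deployment, hard to distribute and secure.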

Page 27: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. Two copies of “I love you!” are reported with put(hash(msg), IP). At the responsible node: “Upcall: I’ve seen this message before!”]

Page 28: OpenDHT: A Public DHT Service

DHTs and Spam Detection: Detecting Similar Messages

[Figure: overlay of nodes, each holding a K V table. With two copies of “I love you!” seen, the annotation reads: “Something’s fishy!”]

Page 29: OpenDHT: A Public DHT Service

Upcall Challenges

• Distribution
  – How do we get new upcall code to all nodes?

Page 30: OpenDHT: A Public DHT Service

Upcall Challenges

[Figure: lookup(k) routed around the ring to the node responsible for k. Annotation: “How did the upcall code get here?”]

Page 31: OpenDHT: A Public DHT Service

Upcall Challenges

• Distribution
  – How do we get new upcall code to all nodes?
  – Active networking experience is a warning…

Page 32: OpenDHT: A Public DHT Service

Upcall Challenges

• Distribution
  – How do we get new upcall code to all nodes?
  – Active networking experience is a warning…

• Security
  – How do we safely run untrusted clients’ upcalls?

Page 33: OpenDHT: A Public DHT Service

What about Put/Get?

• Works great for some applications
  – File sharing, for example

Page 34: OpenDHT: A Public DHT Service

DHTs and File Sharing: DHT Stores Pointers to Files

[Figure: overlay of nodes, each holding a K V table. One peer calls put(file, IP), storing a pointer to the file; another calls get(file), receives the IP, and transfers the file over HTTP.]

Page 35: OpenDHT: A Public DHT Service

What about Put/Get?

• Works great for some applications
  – File sharing, for example

• What about applications with upcalls?
  – Our spam detection application, for example

Page 36: OpenDHT: A Public DHT Service

What about Put/Get?

• Works great for some applications
  – File sharing, for example

• What about applications with upcalls?
  – Our spam detection application, for example

• Idea: let application nodes run the upcalls
  – Each node only runs upcalls for the applications that it’s participating in

Page 37: OpenDHT: A Public DHT Service

Upcall Example

[Figure: File Sharing and Spam Detection clients use OpenDHT through put/get. Three Spam Detection nodes each announce: “I can handle spam detection messages”.]

Page 38: OpenDHT: A Public DHT Service

Upcall Example

[Figure: File Sharing and Spam Detection clients use OpenDHT through put/get. A Spam Detection node receives “I love you!” and asks: “Who’s handling hash(message)?”]

Page 39: OpenDHT: A Public DHT Service

Upcall Example

[Figure: File Sharing and Spam Detection clients use OpenDHT through put/get. A Spam Detection node receives “I love you!” and asks: “Who’s handling hash(message)?” The message “I love you!” is shown a second time.]

Page 40: OpenDHT: A Public DHT Service

Upcall Example

[Figure: File Sharing and Spam Detection clients use OpenDHT through put/get. Two copies of “I love you!” have been seen, and the annotation reads: “Something’s fishy!”]

DHT keeps track of which nodes support which upcalls via Recursive Distributed Rendezvous (ReDiR)

Page 41: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: ReDiR tree for H(namespace) with levels L0, L1, and L2. Hash positions shown: H(A), H(B). Tree entries shown: “A” at each of the three levels.]

Page 42: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C). Tree entries shown: “A, B”, “A”, “A”, “C”.]

Page 43: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C), H(D). Tree entries shown: “A, B”, “A, C”, “A”, “C”, “D”, “D”.]

Page 44: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C), H(D), H(E). Tree entries shown: “A, B”, “A, C”, “A, D”, “C”, “D”, “D”, “E”.]

Page 45: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C), H(D), H(E). Tree entries shown: “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”.]

Page 46: OpenDHT: A Public DHT Service

ReDiR

• Join cost:
  – Worst case: O(log n) puts and gets
  – Average case: O(1) puts and gets

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C), H(D), H(E). Tree entries shown: “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”.]

Page 47: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: lookup of H(k1) in the ReDiR tree (levels L0, L1, L2; entries “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”). Annotation: “successor”.]

Page 48: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: lookup of H(k2) in the ReDiR tree (levels L0, L1, L2; entries “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”). Annotations: “no successor”, “successor”.]

Page 49: OpenDHT: A Public DHT Service

ReDiR

• Goal: Implement two functions using put/get:
  – join(namespace, node)
  – node = lookup(namespace, identifier)

[Figure: lookup of H(k3) in the ReDiR tree (levels L0, L1, L2; entries “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”). Annotations: “no successor”, “successor”, “no successor”.]

Page 50: OpenDHT: A Public DHT Service

ReDiR

• Lookup cost:
  – Worst case: O(log n) gets
  – Average case: O(1) gets

[Figure: ReDiR tree with levels L0, L1, and L2. Hash positions shown: H(A), H(B), H(C), H(D), H(E). Tree entries shown: “A, B”, “A, C”, “A, D”, “C”, “D”, “D, E”, “E”.]
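The idea of building join and lookup out of nothing but put and get can be sketched as below. This is a deliberately simplified variant and not the authors' algorithm: here a node registers in its interval at every level of a binary tree, so a join always costs one put per level, whereas the slides state an O(1) average cost. The local `store`, the 32-bit identifier space, the three-level tree, and the tuple-shaped tree keys are all assumptions for illustration.

```python
import hashlib
from collections import defaultdict

SPACE = 1 << 32   # size of the identifier space (assumption)
LEVELS = 3        # L0, L1, L2, as drawn on the slides

store = defaultdict(set)  # stand-in for the DHT's put/get storage


def put(key, value):
    store[key].add(value)


def get(key):
    return set(store.get(key, ()))


def H(name):
    """Hash a name into the identifier space."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest()[:4], "big")


def tree_key(namespace, level, ident):
    """DHT key of the tree node whose interval contains ident at this level.

    Level l splits the identifier space into 2**l equal intervals.
    """
    width = SPACE >> level
    return (namespace, level, ident // width)


def join(namespace, node):
    """Simplified join: register the node in its interval at every level."""
    for level in range(LEVELS):
        put(tree_key(namespace, level, H(node)), node)


def lookup(namespace, ident):
    """Start at the deepest level and climb until the interval holds a successor."""
    for level in reversed(range(LEVELS)):
        members = get(tree_key(namespace, level, ident))
        successors = [n for n in members if H(n) >= ident]
        if successors:
            return min(successors, key=H)
    # No successor even at the root: wrap around to the smallest identifier.
    everyone = get(tree_key(namespace, 0, 0))
    return min(everyone, key=H) if everyone else None


for name in ["A", "B", "C", "D", "E"]:
    join("spam-detection", name)

owner = lookup("spam-detection", H("I love you!"))
```

Because every member of an interval is listed in it, the smallest listed identifier at or above the lookup point is the true successor, and an empty answer means the search has to widen by moving one level up.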

Page 51: OpenDHT: A Public DHT Service

ReDiR Performance (On PlanetLab)

Page 52: OpenDHT: A Public DHT Service

OpenDHT Design Summary

• OpenDHT is a common infrastructure for
  – Storage of values, pointers, etc.
  – Organizing clients that handle application upcalls

• Benefits:
  – Amortizes maintenance costs across applications
  – Facilitates “bootstrapping” of new applications
  – Allows for statistical multiplexing of resources

Page 53: OpenDHT: A Public DHT Service

Impact

Application              Uses OpenDHT for               Interface
Croquet Media Manager    replica location               put/get
DOA                      indexing                       put/get
HIP                      name resolution                put/get
Tetherless Computing     host mobility                  put/get
Place Lab                range queries                  ReDiR
QStream                  multicast tree construction    ReDiR
VPN Index                indexing                       put/get
FreeDB                   storage                        put/get
Instant Messaging        rendezvous                     put/get
CFS                      storage                        put/get
i3                       redirection                    ReDiR

Page 54: OpenDHT: A Public DHT Service

Future Work

• OpenDHT makes a great common substrate for:
  – Soft-state storage
  – Naming and rendezvous

• Many P2P applications also need to:
  – Traverse NATs
  – Redirect packets within the infrastructure (as in i3)
  – Refresh puts while intermittently connected

• All of these can be implemented with upcalls
  – Who provides the machines that run the upcalls?
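Soft-state storage means an entry disappears unless its owner keeps refreshing it, which is why a client that is only intermittently connected needs someone to re-issue its puts. A minimal sketch, assuming each put carries a time-to-live; the `SoftStateStore` class, the `ttl` and `now` parameters, and the example values are hypothetical, and time is passed in explicitly so the behavior is deterministic:

```python
class SoftStateStore:
    """Toy soft-state table: each (key, value) pair carries an expiry time."""

    def __init__(self):
        self.expiry = {}  # (key, value) -> time at which the entry expires

    def put(self, key, value, ttl, now):
        """Store or refresh an entry; it stays visible until now + ttl."""
        self.expiry[(key, value)] = now + ttl

    def get(self, key, now):
        """Return the values under key that have not yet expired."""
        return sorted(v for (k, v), t in self.expiry.items() if k == key and t > now)


store = SoftStateStore()
store.put("alice", "192.0.2.7", ttl=60, now=0)
before = store.get("alice", now=30)              # still within the first TTL
store.put("alice", "192.0.2.7", ttl=60, now=50)  # refresh extends the expiry to 110
after_refresh = store.get("alice", now=100)
expired = store.get("alice", now=111)            # nobody refreshed again
```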

Page 55: OpenDHT: A Public DHT Service

Future Work

• We don’t want to add upcalls to the core DHT
  – Keep the main service simple, fast, and robust

• Can we build a separate upcall service?
  – Some other set of machines organized with ReDiR
  – Security: can only accept incoming connections, can’t write to local storage, etc.

• This should be enough to implement
  – NAT traversal, reput service
  – Some (most?) packet redirection

• What about more expressive security policies?

Page 56: OpenDHT: A Public DHT Service

For more information, see http://opendht.org/