
TECS Week, Pune, 5-9 January 2009

The User is the Computer: From Decentralized Systems to Social Computing

Peter Druschel


Course overview

Today’s computer systems augment a wide range of human activity, including cooperation among individuals, organizations, businesses

This course deals with some of the technology underlying this trend, as well as the challenges and opportunities that come with it


Course overview

1. Decentralized systems (~2 hours)
   • Overlays, object lookup, routing
   • Shared state and coordination
   • Applications
   • Challenges

2. Accountability for distributed systems (~1.5 hours)
   • Why and what is accountability?
   • How can we implement it?
   • How well does it work?

3. Social computing and applications (~1.5 hours)
   • Exploiting social networks for distributed computing
   • Example: enhancing Web search
   • Example: thwarting unwanted communication


Credits

Group members: Andreas Haeberlen, Jeff Hoye, Petr Kuznetsov, Alan Mislove, Animesh Nandi, Ansley Post, Atul Singh, Jim Stewart

Colleagues: Krishna Gummadi, MPI-SWS; Rodrigo Rodrigues, MPI-SWS; Anne-Marie Kermarrec, INRIA; Ant Rowstron, MSRC; Miguel Castro, MSRC; Ion Stoica, UC Berkeley; John Kubiatowicz, UC Berkeley; Frank Dabek, Google; Y. Charlie Hu, Purdue

Funding: Max Planck Society, National Science Foundation, Intel Research, Microsoft Research, Texas ATP


Decentralized (p2p) systems

Distributed computer system with
• Symmetric components
• Decentralized control and state
• Self-organization

Promise
• “Organic” growth
• Low barrier to deployment
• Resilience to faults, attack
• Resource abundance, diversity


Partly vs. fully decentralized systems

Partly decentralized systems have a dedicated controller node
• Organic growth, abundant/diverse resources
• Limited scalability, resilience

Fully decentralized systems
• Some fully decentralized systems have powerful supernodes
• Increased efficiency, but reduced resilience


Decentralized systems: deployment

Self-organization enables deployment in dynamic networks
• Ad hoc wireless networks: mobile wireless devices
• Delay-tolerant networks: devices with intermittent connectivity
• Overlay networks (most common): Internet-connected devices


Outline

1. Decentralized systems: state-of-the-art
   • Overlays, object lookup, routing
   • Example: Pastry
   • Shared state and coordination: DHTs and Scribe/DOLR
   • Challenges
   • Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications


Overlay networks

• Overlay links rely on unicast service in the Internet
• Topology can be “structured” or “unstructured”

[Figure: overlay network drawn on top of the Internet]


Why overlays?

Overcome limitations of Internet architecture
• group communication, content-oriented networking
• enable innovation

Low barrier to deployment
• resource sharing enables “organic” growth
• self-organization simplifies operation

Robustness to faults, attacks, unexpected workloads
• decentralization
• resource diversity, wealth


Decentralized (p2p) systems: What do they enable?

Cooperative computing
• Content sharing/distribution (Kazaa, BitTorrent)
• Streaming media (SOPcast, PPLive, Joost, iPlayer)
• Telephony (Skype), popular scientific computing
• Low barrier to deployment, market entry: innovation

Digital preservation
• Diversity, abundance of resources provides durability

Autonomous distributed systems
• Self-managing networks of little or mobile devices
• Decentralization is necessary for autonomy


Popular decentralized systems

• File sharing, bulk content distribution: BitTorrent, eDonkey dominate Internet traffic
• Streaming media distribution: PPLive, CoolStreaming, Joost, iPlayer, LiveStation
• Skype
• Volunteer computing: BOINC apps perform 1 PFLOPS on average


Decentralized (p2p) systems: State-of-the-art

Decentralized state management
• Object location
• Replication
• Availability, durability
• Load balancing

Efficient, consistent lookup routing in Internet overlays

Efficient cooperative content distribution

Dependable storage from untrusted components

Security: secure routing, content integrity, incentives


Key problem: Object location

Objects partitioned among participating nodes
• Mapping from objects to nodes is dynamic

Unicast routing doesn’t help
• don’t know who to talk to
• don’t know where to store objects
• want to address (data) objects, not nodes!


Solution 1: Unstructured overlay

No assumptions about overlay graph structure
• New node is assumed to know one participant
• Performs random walk to find more nodes to attach to

Object placement
• Inserting node or random walk target
• May leave references along random path

Object lookup
• Scoped flooding or random walk

Examples: Gnutella, Kazaa, eDonkey
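
The lookup step above can be sketched as a scoped flood: a breadth-first search over overlay links that gives up after a fixed number of hops. This is a minimal illustration, not the Gnutella, Kazaa, or eDonkey protocol; the function name `scoped_flood`, the adjacency-dict layout, and the toy topology are assumptions made for the example.

```python
from collections import deque


def scoped_flood(graph, start, has_ref, ttl):
    """Breadth-first flood from `start`, limited to `ttl` overlay hops.

    graph:   dict mapping node -> list of neighbor nodes (hypothetical layout)
    has_ref: set of nodes that hold the object or a reference to it
    Returns (node_with_reference_or_None, number_of_nodes_contacted).
    """
    seen = {start}
    queue = deque([(start, 0)])
    contacted = 0
    while queue:
        node, hops = queue.popleft()
        contacted += 1
        if node in has_ref:
            return node, contacted
        if hops == ttl:
            continue  # scope exhausted on this branch
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, hops + 1))
    return None, contacted


# Toy overlay (hypothetical): S - A - B - R, plus S - C.
overlay = {
    "S": ["A", "C"],
    "A": ["S", "B"],
    "B": ["A", "R"],
    "C": ["S"],
    "R": ["B"],
}
```

With a hop limit of 2 the request from S never reaches R; raising the limit to 3 finds the reference, at the cost of contacting every node in this toy overlay. That is the scalability versus recall tradeoff in miniature.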


Unstructured object location


• I inserts an object, leaves a reference on R
• S floods a request, finds the reference at R
• Tradeoff between scalability and recall
• Popular objects are easy to find


Solution 2: structured overlay networks

Overlay graph conforms to a specific graph structure

Key-based routing primitive (KBR):

KBR(M, X): route message M to the live node that is currently responsible for the object associated with numerical id X

Basis for content-oriented networking

Examples: Chord, CAN, Pastry, Tapestry, Bamboo, Kademlia, SkipNet, Kelips, Accordion, etc.


Structured vs. unstructured overlays

Unstructured
• Simple overlay formation
• Tradeoff between recall and efficiency
• Robust to churn

Structured
• Pre-determined routes
• Efficient identity lookup, tree formation
• More susceptible to churn

Can be combined:
• Stable nodes form structure
• Others attach randomly



Pastry: Identifier space

Consistent hashing [Karger et al. ‘97]

• 160-bit circular id space, from 0 to 2^160 - 1
• nodeIds (uniform random)
• keys (uniform random)
• Each key is mapped to the live node with the “closest” nodeId

[Figure: circular id space with nodeIds and a key marked on the ring]
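
A minimal sketch of the key-to-node mapping described above, assuming ids are derived with SHA-1 (which happens to yield 160 bits; the slide only requires uniformly random ids) and that “closest” means smallest distance around the ring. The helper names `make_id`, `circular_distance`, and `responsible_node` are hypothetical.

```python
import hashlib

ID_BITS = 160
ID_SPACE = 1 << ID_BITS  # ids live in [0, 2^160 - 1]


def make_id(name: str) -> int:
    """Derive a 160-bit id from a name (assumption: SHA-1 as the hash)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def circular_distance(a: int, b: int) -> int:
    """Distance between two ids on the ring, going the shorter way round."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def responsible_node(key: int, live_node_ids):
    """Live node whose id is closest to the key (ties broken by smaller id)."""
    return min(live_node_ids, key=lambda n: (circular_distance(n, key), n))
```

Because only the nodes adjacent to a joining or departing node change responsibility, most keys keep their owner when membership changes, which is the property consistent hashing is used for here.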


Pastry: lookup

• KBR(M, X): message with key X is routed to the live node with nodeId closest to X
• Problem: a complete routing table is not scalable

[Figure: circular id space from 0 to 2^160 - 1 with key X marked]


Pastry: prefix-based routing

Properties
• log_16 N steps
• O(log N) state

[Figure: KBR(M, d46a1c) on the id ring; nodes shown: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]
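
The idea of prefix-based routing can be sketched as follows: at each hop the message moves to a node whose id shares at least one more leading digit with the key, and when no such entry exists it falls back to the numerically closest known neighbor. The per-node tables below are hypothetical and were chosen so that the node ids from the slide form a plausible route; the hop order is an assumption, and the sketch ignores wrap-around of the circular id space.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading hex digits that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(key, start, routing_tables, neighbor_sets):
    """Return the list of nodes a message for `key` visits, starting at `start`."""
    path, node = [start], start
    while True:
        row = shared_prefix_len(node, key)
        if row == len(key):
            return path  # node id equals the key
        # Routing-table step: entry in row `row` for the key's next digit.
        nxt = routing_tables.get(node, {}).get((row, key[row]))
        if nxt is None:
            # Fallback: numerically closest among this node and its neighbor set.
            candidates = [node] + neighbor_sets.get(node, [])
            nxt = min(candidates, key=lambda n: abs(int(n, 16) - int(key, 16)))
            if nxt == node:
                return path  # no known neighbor is closer: deliver here
        path.append(nxt)
        node = nxt


# Hypothetical per-node state; each routing-table hop fixes one more digit.
routing_tables = {
    "65a1fc": {(0, "d"): "d13da3"},
    "d13da3": {(1, "4"): "d4213f"},
    "d4213f": {(2, "6"): "d462ba"},
}
neighbor_sets = {
    "d462ba": ["d467c4", "d471f1"],
    "d467c4": ["d462ba", "d471f1"],
}
```

Calling `route("d46a1c", "65a1fc", routing_tables, neighbor_sets)` visits 65a1fc, d13da3, d4213f, d462ba, and finally d467c4, the node numerically closest to the key among those known. Each routing-table hop resolves one hex digit, which is where the log_16 N step count comes from.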


Pastry: routing table (node 65a1fcx)

Row 0:  0x    1x    2x    3x    4x    5x    7x    8x    9x    ax    bx    cx    dx    ex    fx
Row 1:  60x   61x   62x   63x   64x   66x   67x   68x   69x   6ax   6bx   6cx   6dx   6ex   6fx
Row 2:  650x  651x  652x  653x  654x  655x  656x  657x  658x  659x  65bx  65cx  65dx  65ex  65fx
Row 3:  65a0x 65a2x 65a3x 65a4x 65a5x 65a6x 65a7x 65a8x 65a9x 65aax 65abx 65acx 65adx 65aex 65afx

log_16 N rows
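
The layout of this table follows a simple rule: row r holds one entry for every hex digit other than the node's own digit at position r, each prefixed by the node's first r digits. A minimal sketch that regenerates the prefix patterns shown on the slide (the function name `routing_table_prefixes` is hypothetical):

```python
HEX_DIGITS = "0123456789abcdef"


def routing_table_prefixes(node_id: str, rows: int):
    """Prefix pattern of every routing-table slot for `node_id`.

    Row r lists node_id[:r] + d + "x" for each digit d that differs from
    the node's own digit at position r, so each row has 15 slots.
    """
    return [
        [node_id[:r] + d + "x" for d in HEX_DIGITS if d != node_id[r]]
        for r in range(rows)
    ]


table = routing_table_prefixes("65a1fc", 4)
for r, row in enumerate(table):
    print(f"Row {r}:", " ".join(row))
```

With 15 slots per row and about log_16 N populated rows, the per-node state grows logarithmically with the number of nodes.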


Pastry: prefix-based routing

Similar to Plaxton Trees [Plaxton et al. ‘97]

But added
• Neighbor sets for consistency, robustness, security
• Consistent routing
• Self-organization (dynamic joins, fault tolerance)
• Proximity neighbor selection for efficiency
• Secure routing to defend against malicious nodes


Neighbor sets

Stabilization protocol ensures eventual consistency
• aids routing consistency
• enables secure routing
• localizes fault detection within neighbor sets
• enables application-specific local coordination (e.g., object replica management)


Challenge: Inconsistent routing

Routing consistency: “At any time, at most one overlay node accepts messages with a given key”
• Necessary for consistency of mutable data
• Complicated by Internet routing anomalies

[Figure: key on the id ring with nodes N, X, and Y; new node N has informed X, but not yet Y, of its arrival]


Challenge: Self-organization

Initializing and maintaining node state (overlay construction and maintenance)
• Node addition
• Node departure (failure)


Pastry: Node join

New node: d46a1c

[Figure: KBR(Join, d46a1c) on the id ring; nodes shown: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]


Pastry: Node departure (failure)

Neighbor set members exchange keep-alive messages (failure detection, neighbor set stabilization)

Neighbor set repair (eager): request set from farthest live node in set

Routing table repair (lazy): get table from peers in the same row, then higher rows


Challenge: Overlay route efficiency

Nodes close in id space, but far away in Internet

Goal: choose routing table entries that yield few hops and low latency

[Figure: Internet hosts (CA-T1, CCI, Aros, Utah, CMU, MIT, MA-Cable, Cisco, Cornell, NYU, OR-DSL) annotated with nodeId prefixes 20x, 80x, 81x, 89x]


Proximity neighbor selection (PNS)

Assumptions:
• scalar proximity metric (e.g., RTT)
• a node can probe distance to any other node

Proximity invariant:

Each routing table entry refers to a node close to the local node (in the physical network), among all nodes with the appropriate nodeId prefix.
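
A minimal sketch of how a node might fill one routing-table slot under this invariant, assuming it already holds RTT measurements to a set of candidate nodes. The function name `pick_table_entry`, the candidate list, and the RTT values are hypothetical; the invariant only asks for a node that is close, and taking the minimum over the probed candidates is one simple way to approximate that.

```python
def pick_table_entry(prefix, candidates, rtt_ms):
    """Among candidates whose id starts with `prefix`, return the one with
    the lowest measured RTT, or None if no candidate qualifies."""
    eligible = [n for n in candidates if n.startswith(prefix)]
    if not eligible:
        return None
    return min(eligible, key=lambda n: rtt_ms[n])


# Hypothetical probe results (milliseconds) as seen from the local node.
rtt_ms = {"d13da3": 80.0, "d4213f": 12.5, "d462ba": 45.0, "65a1fc": 3.0}
candidates = list(rtt_ms)
```

Short prefixes match many nodes, so early hops can usually pick a nearby node; long prefixes match few, so later hops tend to be longer.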


PNS: Routes in delay space

[Figure: Route(d46a1c) drawn twice, once in nodeId space and once in delay space; nodes shown: 65a1fc, d13da3, d4213f, d462ba, d467c4, d471f1]


PNS Properties

1) Low-delay routes: Average delay stretch, relative to IP, is a small constant (1.3 - 2.2) and can be derived from the physical network’s delay distribution

2) Route convergence: Routes of messages sent by nearby nodes with the same key converge at a node near the source nodes

Details in [Castro et al. MSR-TR-2002-82]



Sharing state: Distributed hash tables (DHT)

Hashtable API: put(obj,key), obj <- get(key)

Layered on top of a structured overlay
• Scalability, robustness
• Persistent storage
• High availability

Examples: Chord/CFS, Pastry/PAST, Bamboo, Kelips, Kademlia


Distributed hash table (DHT)

Operations:
• insert(k,v)
• v = lookup(k)

• Structured overlay maps keys to nodes
• Decentralized and self-organizing
• Scalable, robust

[Figure: tuples (k1,v1) through (k6,v6) stored on nodes connected by an overlay network]


DHT: Insertion and replication

Storage invariant: tuple replicas are stored on the r nodes with nodeIds closest to the key

[Figure: Insert(key, value, r) on the id ring with r = 4]
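
The storage invariant can be illustrated with a toy, single-process model in which every node's store is a dictionary and the replica set is computed with global knowledge. A real DHT would instead route the insert through the overlay and let the root node push replicas to its neighbor set; the class name `ToyDHT` and the small 8-bit id space are assumptions for the example.

```python
class ToyDHT:
    """Single-process model of replica placement on a circular id space."""

    def __init__(self, node_ids, id_space):
        self.id_space = id_space
        self.store = {n: {} for n in node_ids}  # per-node key/value store

    def _dist(self, a, b):
        d = abs(a - b)
        return min(d, self.id_space - d)

    def replica_set(self, key, r):
        """The r nodes whose ids are closest to the key."""
        return sorted(self.store, key=lambda n: (self._dist(n, key), n))[:r]

    def insert(self, key, value, r):
        for n in self.replica_set(key, r):
            self.store[n][key] = value

    def lookup(self, key, r):
        """Return the value from the first replica holder that has it."""
        for n in self.replica_set(key, r):
            if key in self.store[n]:
                return self.store[n][key]
        return None


dht = ToyDHT([10, 50, 90, 130, 170, 210], id_space=256)
dht.insert(100, "value", r=4)
```

Here key 100 lands on nodes 90, 130, 50, and 170, the four ids nearest to it on the ring, so a lookup still succeeds if any three of them fail.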


DHT: Lookup

• Object located in log_16 N steps (expected)
• Usually locates the replica nearest to client C

[Figure: Lookup(key) issued by client C reaches one of the r replicas near the key]


DHT: Dynamic caching

Nodes cache tuples in the unused portion of their allocated disk space

Tuples cached on nodes along the route of lookup and insert messages

Goals:
• maximize query throughput for popular tuples
• balance query load
• improve client latency


DHT: Dynamic caching

[Figure: Lookup(key) route toward the key, shown in delay space]


Coordination: Decentralized group management

E.g., SCRIBE [Rowstron et al., JSAC ’02]
• Spanning trees embedded in structured overlay
• Multicast, anycast primitives
• Scalable: large numbers of groups, members, wide range of members/group, dynamic membership


Cooperative group communication

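Operations:
• create(g)
• join(g)
• leave(g)
• multicast(g,m)
• anycast(g,m)

• groupId g mapped to n0
• decentralized membership
• robust, scalable

[Figure: multicast tree for group g over nodes n0 to n4; n0 holds the entry g:n1,n2, another node holds g:n3,n4, and n3 and n4 are marked as members of g]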


Scribe

[Figure: Join(groupId) messages routed toward groupId, shown in delay space]


Structured overlay APIs

[Dabek et al., IPTPS ’05]

Layer            Operations
KBR              route(M, X)
DHT              insert(k,v), v = lookup(k)
SCRIBE / DOLR    create(g), join(g), leave(g), multicast(g,m), anycast(g,m)

DHT and SCRIBE / DOLR are layered on top of KBR.


Outline

1. Decentralized systems: state-of-the-art
   • Overlays, object lookup, routing
   • Example: Pastry
   • Shared state and coordination: DHTs and Scribe/DOLR
   • Challenges: malicious participants
   • Putting it all together: ePOST
2. Accountability for distributed systems
3. Social computing and applications


Malicious participants: threats

Prevent messages from reaching root
• drop or corrupt
• bias routing tables

Cause objects to be placed on faulty nodes
• choose nodeId values
• use many identities (Sybil attack)
• impersonate root

[Figure: id ring with a key and nodes A, B, C, F, I, J, L]

[Figure: further animation steps of the slide above with the same text, showing nodes A through L on the id ring and, in the last step, the callout “F is my neighbor”]

Securing routing

• Secure node identifier assignment: thwarts Sybil and id-choosing attacks
• Secure membership protocol: prevents routing table bias attacks
• Secure routing primitive: prevents root impersonation

Can tolerate up to 25% malicious nodes

[Figure: id ring with a key and nodes A, B, C, F, I, J]


Securing routing

Secure routing primitive: prevents root impersonation

[Castro et al., OSDI ’02]

[Figure: id ring with a key and nodes A, B, C, F, I, J, K, L, M, with the callout “F is my neighbor”]


Other threats

• Freeloading: incentive mechanisms
• Data corruption: crypto
• Denial-of-service
• Several defenses needed



Putting it all together: ePOST

• Decentralized, cooperative email service
• Based on users’ desktops/notebooks
• Messages transmitted and stored securely
• Standard mail clients (IMAP/POP)
• Interoperability via SMTP
• Nodes may fail arbitrarily
• Users only trust their local node

[Mislove et al., EuroSys 06]


HPDC-15, June 21, 2006

Why Email?

Demanding user expectations
• Privacy
• Integrity
• Durability
• Availability

Goal: Demonstrate that a decentralized, cooperative email service can be built that users can entrust with their production email


ePOST: Single-copy store

Emails split into MIME components, stored in the DHT
• Using its content-hash as the key
• Self-certifying (integrity)
• Identical items stored once
• Convergent encryption

Items replicated thrice for availability
• Additional erasure-coded replicas for durability (Glacier [Haeberlen et al., NSDI’05])

[Figure: email data split into header, body, and two attachments]
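
Why content-hash keys make stored items self-certifying and deduplicated can be shown with a few lines. This is a sketch under assumptions, not ePOST's code: a plain dictionary stands in for the DHT, SHA-256 is an arbitrary choice of hash, the names `put_item` and `get_item` are hypothetical, and convergent encryption (encrypting each item under a key derived from its own content) is left out.

```python
import hashlib


def put_item(store: dict, data: bytes) -> str:
    """Store `data` under the hash of its content and return that key."""
    key = hashlib.sha256(data).hexdigest()
    store.setdefault(key, data)  # identical content maps to the same key
    return key


def get_item(store: dict, key: str) -> bytes:
    """Fetch an item and check that it still matches its key."""
    data = store[key]
    if hashlib.sha256(data).hexdigest() != key:
        raise ValueError("integrity check failed: content does not match key")
    return data


store = {}
k1 = put_item(store, b"attachment bytes")
k2 = put_item(store, b"attachment bytes")  # same attachment sent again
```

Both inserts return the same key and occupy one slot, and any reader can detect a corrupted or substituted item by rehashing it, without trusting the node that served it.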


ePOST: Single-writer log

Per-user metadata (folders, inbox, etc.) stored as an update log
• All updates performed by owner
• Stored in the DHT

Entries form a hash chain
• Log head is signed with owner’s key
• Periodic snapshots stored in log

[Figure: log head followed by log entries (“Insert msg x”, “Mark msg y read”, “Insert msg y”), which refer to email data (header, body, attachments)]
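
A hash-chained log with an authenticated head can be sketched as below. The entry layout, the function names, and the use of an HMAC in place of the owner's public-key signature are all assumptions made to keep the example self-contained in the standard library; with a real signature, anyone holding the owner's public key could verify the head.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev: str, op: str) -> str:
    body = json.dumps({"prev": prev, "op": op}, sort_keys=True).encode()
    return hashlib.sha256(body).hexdigest()


def append_entry(log: list, op: str, owner_key: bytes) -> str:
    """Append `op`, chaining it to the previous entry; return the new head tag."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev, "op": op, "hash": _entry_hash(prev, op)}
    log.append(entry)
    # Stand-in for signing the log head with the owner's key.
    return hmac.new(owner_key, entry["hash"].encode(), hashlib.sha256).hexdigest()


def verify_log(log: list, head_tag: str, owner_key: bytes) -> bool:
    """Check every link of the chain and the tag on the final head."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _entry_hash(prev, entry["op"]):
            return False
        prev = entry["hash"]
    expected = hmac.new(owner_key, prev.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, head_tag)


owner_key = b"hypothetical owner secret"
log, head = [], ""
for op in ["insert msg y", "mark msg y read", "insert msg x"]:
    head = append_entry(log, op, owner_key)
```

Because each entry commits to its predecessor, authenticating only the head is enough to pin down the whole history; changing or dropping any earlier entry breaks the chain during verification.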


ePOST: Message Delivery

Message notifications are signed and contain encrypted headers and keys to the message’s components

Each user has a Scribe group
• Node joins user’s group if it has a message for the user
• User announces to the group when online
• Pending notifications delivered


ePOST: Security

Users have certificates (public key, node id)

Secure communication (SSL)

All content stored in the DHT is protected
• Authenticity
• Integrity
• Privacy

Incentives to prevent freeloading (Scrivener [Nandi, Middleware’05])

Secure KBR


Deployment and Experience

• Rice / MPI rings: reserved for internal members
• PlanetLab ring: open membership ring, backed by PlanetLab

Usage
• 26 internal users (16 used ePOST as primary email) over more than two years
• 40 DHT nodes (Rice / MPI ring), 350 nodes (PlanetLab ring)
• Several times, ePOST was available when Rice or MPI-SWS email had failed
• No system-wide outages after initial testing phase
• Shut down due to overhead of tracking spam filtering


Decentralized systems challenges

Maintaining mutable distributed state remains hard
• Fortunately, lots of useful applications don’t require it

Incentives are the basis for cooperation
• Strategy-proof protocols (e.g., tit-for-tat)
• Accountability

Need to control membership
• Certified identities (background check or fee)
• Proof-of-work, social networks?


Decentralized systems challenges

Need to protect data
• Durability requires non-decreasing membership
• Scalable storage, high availability, churn resilience: pick two [Blake&Rodrigues, HotOS-IX]

Manageability
• Self-organization reduces administrative effort
• Hardware management is decentralized
• BUT: Evidence that lack of centralized control may make it difficult to manage system-wide disruptions


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
   • Why accountability?
   • What is accountability?
   • How can we implement it?
   • How well does it work?
   • Accountable virtual machines
3. Social computing and applications


PODC, Toronto, 18 August 2008 66

Byzantine faults occur in practice

Not all faults cause a node to stop
• The faulty node continues to operate, but its behavior deviates from that of a correct node

Examples:
• Hardware malfunction
• Misconfiguration
• Software error
• External security attack
• Intentional software modification


Example: LAX airport outage

• Aug 2007: 17,000 passengers stranded at LAX
• Cause: intermittent fault of a network card


Example: Botnets in the Internet

• Compromised computer targets different domain
• Admin A must localize fault, then convince admin B that her machine is faulty

[Figure: two administrative domains, Domain A and Domain B]


Example: Insider attack

• Mar 2002: UBS PaineWebber admin disrupts trade for days to weeks
• Difficult to detect and defuse logic bombs

[Figure: a single administrative domain]


Why is detecting faults difficult?

• How to detect faults?
• How to identify the faulty node?
• How to convince others that a node is (not) faulty?

[Figure labels: incorrect message; responsible admin]


Learning from the 'offline' world

• Relies on accountability
• Example: Banks

Requirement              Solution
Commitment               Signed receipts
Tamper-evident record    Double-entry bookkeeping
Inspections              Audits

• Record can be used to (manually) detect, identify and convince
• Is accountability useful in distributed systems? Is it practical?


What does accountability mean?

Accountability := tamper-evident record + automated, reliable fault detection


Is accountability alone useful?

No, if faults are severe and irrecoverable
• need Byzantine fault tolerance (see Lorenzo’s course)

Yes, for
• systems that provide “best-effort” service
• systems that assume crash failures
• systems that mask severe/irrecoverable faults

Accountability
• reliably detects and localizes faults
• provides incentives to avoid faults
• builds trust, reputation


Which Systems can benefit?

• Internet services (BGP, DNS, NTP, NNTP, SMTP)
• Web services
• Content distribution networks (CDN)
• Grid computing
• Peer-to-peer systems
• Multi-player games
• Cloud computing


Butler Lampson on accountability

"Don’t forget that in the real world, security depends more on police than on locks, so detecting attacks, recovering from them, and punishing the bad guys are more important than prevention." -- Butler Lampson, "Computer Security in the Real World", ACSAC 2000


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
  Why accountability?
  What is accountability?
  How can we implement it?
  How well does it work?
  Accountable virtual machines
3. What’s next? Social computing and applications


Ideal accountability

Fault := Node deviates from expected behavior
Our goal is to automatically
  detect faults
  identify the faulty nodes
  convince others that a node is (or is not) faulty
Can we build a system that provides the following guarantee?
  Whenever a node is faulty in any way, the system generates a proof of misbehavior against that node


Can we detect all faults?

Problem: Faults that affect only a node's internal state
  Would require online trusted probes at each node
Focus on observable faults: Faults that affect a correct node
  Can detect observable faults without requiring trusted components


Can we always get a proof?

Problem: He-said-she-said
Three possible causes:
  A never sent X
  B refuses to acknowledge X
  X was lost by the network
Cannot get proof of misbehavior!
Generalize to verifiable evidence:
  a proof of misbehavior, or
  a challenge that a faulty node cannot answer
What if the challenged node does not respond?
  Does not prove a fault, but node is suspected until it responds
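The two kinds of verifiable evidence suggest a small bookkeeping structure for how one node views another. The following is a minimal sketch, not the PeerReview implementation: the class and method names are hypothetical, and validation of signatures and of challenge responses is assumed to happen elsewhere.

```python
from enum import Enum


class Status(Enum):
    TRUSTED = "trusted"      # no outstanding evidence
    SUSPECTED = "suspected"  # at least one unanswered challenge
    EXPOSED = "exposed"      # a proof of misbehavior exists


class EvidenceTracker:
    """Hypothetical per-node view of the evidence collected about other nodes."""

    def __init__(self):
        self.open_challenges = {}  # node id -> set of unanswered challenge ids
        self.exposed = set()       # nodes with a proof of misbehavior

    def add_challenge(self, node, challenge_id):
        self.open_challenges.setdefault(node, set()).add(challenge_id)

    def add_response(self, node, challenge_id):
        # Assumption: the response has already been checked and found valid.
        self.open_challenges.get(node, set()).discard(challenge_id)

    def add_proof(self, node):
        self.exposed.add(node)

    def status(self, node):
        if node in self.exposed:
            return Status.EXPOSED
        if self.open_challenges.get(node):
            return Status.SUSPECTED
        return Status.TRUSTED
```

In this sketch a silent node stays suspected only as long as a challenge is open, which mirrors the slide: silence proves nothing, while a proof of misbehavior is permanent.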

[Figure: A claims "I sent X!", B claims "I never received X!"; a third node C cannot tell which of them is at fault]


Practical accountability

We propose the following requirement for an accountable distributed system:
  Whenever a fault is observed by a correct node, the system eventually generates verifiable evidence against a faulty node
This is useful
  Any (!) fault that affects a correct node is eventually detected and linked to a faulty node
It can be implemented in practice


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
  Why accountability?
  What is accountability?
  How can we implement it?
  How well does it work?
  Accountable virtual machines
3. Social computing and applications


An implementation: PeerReview

Adds accountability to a given system
  Implemented as a library
  Provides tamper-evident record
  Detects faults via state-machine replay
Assumptions:
  1. Nodes can be modeled as deterministic state machines
  2. Nodes have reference implementations of the state machines
  3. Correct nodes can eventually communicate
  4. Nodes can sign messages


PeerReview from 10,000 feet

All nodes keep logs of their inputs & outputs
  Including all messages
Each node has a set of witnesses, which audit the node periodically
If the witnesses detect misbehavior, they
  generate evidence
  make the evidence available to other nodes
Other nodes check evidence, report fault

[Figure: nodes A and B exchange a message M and record it in A's log and B's log; A's witnesses (C, D, E) audit A's log and conclude "A is faulty"]


PeerReview detects tampering

[Figure: A sends a message to B and B returns an ACK; B's log holds the entries Send(X), Recv(Y), Send(Z), Recv(M), linked by the hash chain H0 to H4]

What if a node modifies its log entries?
Log entries form a hash chain
  Inspired by secure histories [Maniatis02]
Hash is included with every message authenticator
  Node commits to its current state
  Changes are evident
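To make the hash-chain idea concrete, here is a minimal sketch of a tamper-evident log. It is an illustration under assumptions, not PeerReview's actual record format: the entry layout, the use of SHA-256, and the names `chain_hash`, `HashChainLog`, and `verify` are all hypothetical, and the signature that would turn the top hash into an authenticator is omitted.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed value for H0


def chain_hash(prev_hash: bytes, seq: int, entry_type: str, content: bytes) -> bytes:
    """H_i = SHA-256(H_{i-1} || seq || SHA-256(type) || SHA-256(content))."""
    h = hashlib.sha256()
    h.update(prev_hash)
    h.update(seq.to_bytes(8, "big"))
    h.update(hashlib.sha256(entry_type.encode()).digest())
    h.update(hashlib.sha256(content).digest())
    return h.digest()


class HashChainLog:
    def __init__(self):
        self.entries = []        # list of (entry_type, content)
        self.hashes = [GENESIS]  # hashes[i] commits to the first i entries

    def append(self, entry_type: str, content: bytes) -> bytes:
        seq = len(self.entries) + 1
        self.entries.append((entry_type, content))
        top = chain_hash(self.hashes[-1], seq, entry_type, content)
        self.hashes.append(top)
        # In a real system the node would sign (seq, top) and attach it
        # to the outgoing message as its commitment to the log so far.
        return top


def verify(entries, committed_hash: bytes) -> bool:
    """Recompute the chain and compare it with a previously committed hash."""
    h = GENESIS
    for seq, (entry_type, content) in enumerate(entries, start=1):
        h = chain_hash(h, seq, entry_type, content)
    return h == committed_hash
```

Because each hash covers its predecessor, changing any earlier entry changes every later hash, so a log that no longer matches a hash the node already handed out is evidence of tampering.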


PeerReview detects omission

What if a node omits log entries?
While inspecting A’s log, A’s witnesses send msg authenticators signed by B to B’s witnesses
Thus, witnesses learn about all messages their node has ever sent or acknowledged
Omission of a message from the log is a fault


PeerReview detects inconsistencies

What if a node keeps multiple logs or forks its log?

Witnesses check whether all msg authenticators form a single hash chain

Two authenticators not connected by a log segment indicate a fault
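The consistency check can be sketched as follows. This is a simplified illustration, not the PeerReview protocol: authenticators are modeled as unsigned `(seq, hash)` pairs, the log segment is assumed to start at the beginning of the log, and the function names are hypothetical.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed value for H0


def extend(prev_hash: bytes, entry: bytes) -> bytes:
    """One step of the hash chain: H_i = SHA-256(H_{i-1} || SHA-256(entry))."""
    return hashlib.sha256(prev_hash + hashlib.sha256(entry).digest()).digest()


def chain(entries):
    """Return [H0, H1, ..., Hn] for a list of log entries."""
    hashes = [GENESIS]
    for entry in entries:
        hashes.append(extend(hashes[-1], entry))
    return hashes


def inconsistent_authenticators(log_entries, authenticators):
    """Return the authenticators that do not lie on the chain of log_entries.

    An authenticator beyond the end of the supplied segment is also reported,
    because the segment does not connect it to the rest of the chain.
    """
    hashes = chain(log_entries)
    bad = []
    for seq, h in authenticators:
        if seq >= len(hashes) or hashes[seq] != h:
            bad.append((seq, h))
    return bad
```

If a node shows one view of its log to some peers and a different view to others, authenticators collected from both groups cannot all sit on a single chain, and at least one of them is reported.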

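[Figure: a forked log. Both views share H0 to H2 (Create X: OK). "View #1" continues with H3, H4, while "View #2" continues with H3', H4', in which Read X returns "Not found" and Read Z returns OK]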


PeerReview detects faults

How to recognize faults?
Assumption: Nodes can be modeled as deterministic state machines
To audit a node, witness
  Fetches signed log
  Replays inputs to a trusted copy of the state machine
  Checks outputs against the log
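The audit steps can be sketched with a toy deterministic state machine. Everything here is illustrative: `RunningSum` stands in for the application's reference implementation, the log is modeled as a plain list of `(input, logged_output)` pairs, and signature checks are left out.

```python
class RunningSum:
    """Toy deterministic state machine: outputs the sum of all inputs so far."""

    def __init__(self):
        self.total = 0

    def step(self, value: int) -> int:
        self.total += value
        return self.total


def audit(log, make_reference):
    """Replay logged inputs on a fresh reference copy of the state machine.

    Returns the index of the first entry whose logged output differs from
    the reference output, or None if the whole log replays cleanly.
    """
    reference = make_reference()
    for index, (logged_input, logged_output) in enumerate(log):
        if reference.step(logged_input) != logged_output:
            return index
    return None


honest_log = [(1, 1), (2, 3), (3, 6)]
faulty_log = [(1, 1), (2, 4), (3, 7)]
print(audit(honest_log, RunningSum))  # None
print(audit(faulty_log, RunningSum))  # 1
```

The approach depends on determinism: if the same inputs could legitimately produce different outputs, a mismatch would no longer indicate a fault.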


PeerReview guarantees

1) Observable faults will be detected
  If node commits a fault + has a correct witness, then witness obtains
    a proof of misbehavior (PoM), or
    a challenge that the faulty node cannot answer
2) Good nodes cannot be accused
  If node is correct
    there can never be a PoM, and
    it can answer any challenge
Formal definitions and proof in the TR


PeerReview is widely applicable

App #1: NFS server in the Linux kernel
  Many small, latency-sensitive requests
  Faults: Tampering with files, Lost updates, Metadata corruption, Incorrect access control
App #2: Overlay multicast
  Transfers large volume of data
  Faults: Freeloading, Tampering with content
App #3: P2P email
  Complex, large, decentralized
  Faults: Denial of service, Attacks on DHT routing, Censorship
More information in [Haeberlen et al., SOSP’07]


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
  Why accountability?
  What is accountability?
  How can we implement it?
  How well does it work?
  Accountable virtual machines
3. Social computing and applications


How much does PeerReview cost?

Log storage
  10 – 100 GByte per month, depending on application
Message signatures
  Message latency (e.g. 1.5ms RTT with RSA-1024)
  CPU overhead (embarrassingly parallel)
Log/authenticator transfer, replay overhead
  Depends on # witnesses
  Can be deferred to exploit bursty/diurnal load patterns


P2p email, dedicated witnesses

Dominant cost depends on number of witnesses W
  O(W^2) component

[Figure: average traffic (Kbps/node, 0 to 100) vs. number of witnesses (baseline, 1 to 5) with W dedicated witnesses; bars broken down into baseline traffic, signatures and ACKs, and checking logs]


P2p email, mutual auditing

Small probability of error is inevitable
  Example: Replication
Can use this to optimize PeerReview
  Accept that an instance of a fault is found only with high probability
  Asymptotic complexity: O(N^2) → O(log N)

[Figure: a node with a small random sample of peers chosen as witnesses]


PeerReview is scalable

Assumption: up to 10% of nodes can be faulty
Probabilistic guarantees provide scalability
Example: email system scales to over 10,000 nodes with P=0.999999
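One way to see why small random witness sets can suffice is a back-of-the-envelope calculation. The sketch below rests on assumptions that are not taken from the slides: witnesses are drawn independently and uniformly at random, each is faulty with probability f, and the target is that every one of the N nodes has at least one correct witness, estimated with a union bound (N * f^w <= 1 - P). The function name is hypothetical, and the result is not claimed to reproduce the parameters used in the PeerReview evaluation.

```python
from fractions import Fraction


def min_witnesses(n_nodes: int, faulty_fraction: Fraction, max_failure: Fraction) -> int:
    """Smallest witness-set size w such that n_nodes * faulty_fraction**w <= max_failure.

    max_failure plays the role of 1 - P. Exact rational arithmetic avoids
    floating-point surprises at the boundary.
    """
    if not (0 < faulty_fraction < 1) or max_failure <= 0 or n_nodes < 1:
        raise ValueError("need 0 < faulty_fraction < 1, max_failure > 0, n_nodes >= 1")
    w = 1
    while n_nodes * faulty_fraction ** w > max_failure:
        w += 1
    return w


f = Fraction(1, 10)            # up to 10% of nodes faulty
eps = Fraction(1, 1_000_000)   # 1 - P for P = 0.999999
print(min_witnesses(100, f, eps))     # 8
print(min_witnesses(10_000, f, eps))  # 10
```

Under these assumptions the witness-set size grows with log N rather than with N: multiplying the system size by 100 adds only two witnesses per node.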

[Figure: average traffic (Kbps/node) vs. system size (nodes) for the email system without accountability, with PeerReview (P=0.999999), and with PeerReview (P=1.0), compared against DSL/cable upstream capacity; curves annotated O((log N)^2) and O(log N)]


PeerReview summary

Accountability is a new approach to handling faults in distributed systems
  detects faults
  identifies the faulty nodes
  produces evidence
PeerReview: A system that enforces accountability
  Offers provable guarantees and is widely applicable
Details in [Haeberlen et al., SOSP ’07]


Challenges

Tension between accountability and privacy
  PeerReview (PR) requires disclosure to witnesses
  Zero-knowledge proofs?
Fault detection
  PR uses state-machine replay for fault detection
  Can't detect deterministic software bugs
  Different implementations of underspecified protocols may diverge
  Protocol specification or abstract model?


Challenges (cont'd)

Message signatures
  PR assumes a public-key infrastructure
  Web-of-trust (physical network, social network)?
Partial deployment
  Accountability zones, gateways?
PR requires source code modifications
  To enable deterministic replay
  Accountable virtual machines?


NetReview

Accountability applied to inter-domain routing

Fault detection based on a spec of the routing policy

Web-of-trust-based certificates
Auditing limited to peering partners
Partial deployment: accountability zones

Details in [Haeberlen et al., NSDI’09]


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
  Why accountability?
  What is accountability?
  How can we implement it?
  How well does it work?
  Accountable virtual machines
3. What’s next? Social computing and applications


Accountable virtual machines (AVM)

Make unmodified binary VMs accountable

VMM provides deterministic logging/replay

[Figure: an unmodified binary runs in a VM on top of an accountable VMM (together the AVM); the VMM keeps a log and attaches authenticators to outgoing packets]


What are AVMs good for?

Accountability for proprietary/legacy software
Accountable cloud computing
  Customer can verify correct execution
Making an entire host computer accountable
  Check for compromised software
  Forensics


Trusted network probes

Making the Internet accountable, one host at a time

[Figure: an accountable workstation connects to the Internet through a cable/DSL modem or the ISP’s DSLAM, which keeps a secure log; each packet carries an authenticator, and the chain of authenticators validates the log]


Related Work

Accountability [Lampson ’00, Yumerefendi&Chase ’05, Yumerefendi et al. ’07, Argyraki et al. ’07, Michalakis et al. ’07]
Practical byzantine fault tolerance [Castro&Liskov ’00, Ramasamy ’07]
General fault detection [Kihlstrom et al. ’07, Doudou et al. ’99, Malkhi&Reiter ’97]
Intrusion detection, reputation systems [Denning ’87, Ko et al. ’94, Kamvar et al. ’03]
Trusted computing [Garfinkel et al. ’02]
Fault-specific defenses [Cox&Noble ’03, Waldman&Mazieres ’03]
Tamper-evident logs [Schneier&Kelsey ’98, Maniatis&Baker ’02]


Conclusion

Byzantine faults in distributed systems are real
Accountability is a new approach to handling faults
  detects observable faults
  identifies the faulty node
  produces verifiable evidence
Presented a practical definition of accountability
Practical implementations exist
Many challenges remain


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication


From service-centric to user-centric computing

Collaborative, social computing and communication

In peer-to-peer, users share technical resources
In social computing, users share knowledge, opinions, referrals, ratings


User-centric, social computing

Mass collaboration, enabled by technology

Human intelligence aggregated through technology

User contribution is the most important resource
  (Underutilized resource of enormous scale?)

BUT: Outcome depends on user behavior
  depends on cooperation, good will
  vulnerable to spoilers


Social networks: two concepts

Users contribute
  Content
  Opinions, recommendations, ratings (explicit or implicit)
Users form social networks
  Graph connecting users (explicit or implicit)
  Links imply shared interest or trust


What are social networks?

Graphs connecting people
  Edges connect “friends”
  Imply shared interest or trust
  Online friends may have never met in real life
  E.g., email, Skype, IM
Online social networking sites
  Network hosted by a Web site
  Often used to share opinions, advice, ratings, multimedia content

[Figure labels: Social Network, Online Social Network]


Huge opportunity…

…to leverage collective user input, e.g.
  to deal with unwanted communication
  to thwart security attacks
  to enable better organization, filtering, search, ranking, and distribution of content
may provide an answer to the ever-increasing flood of information


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication


What’s it got to do with Systems?

Social networks enhance distributed systems
  Sybil attacks
  Unwanted communication
  Personalization
Social computing may need distribution
  Privacy
  Avoid dependence on a single provider


Leveraging social networks to enhance systems

Trust can help thwart security problems
  Sybil attacks: SybilGuard [SIGCOMM’06]
  Clones unlikely to have diverse links
Trust can help block unwanted communication
  Friends unlikely to send SPAM: RE [NSDI’06]
  Using social networks to thwart SPAM (Ostra)
Shared interest can improve search
  Web search: PeerSpective [HotNets’06]
  Related users likely to visit relevant content


Leveraging social networks: More ideas

Sharing solutions and problem fixes
  Configurations that work
  Fixes that others have found
  “Copy what works for others”

Combine technology and social networks to truly “stand on the shoulders of giants”

Answer to the increasing complexity of the information age?


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication


Example: social network based Web search

PeerSpective experiment
  Idea: users can query their friends’ previously viewed pages
  Results from friends appear alongside Google results

[Figure labels: PeerSpective, Google]


PeerSpective implementation

Prototype is a lightweight HTTP proxy
  Runs on users’ desktop and indexes all browsed content
When Google search is performed,
  query other PeerSpective proxies in parallel with Google
  present PeerSpective results alongside Google results
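The query flow described above can be sketched as a fan-out over several search backends. This is only an illustration of the pattern, not the PeerSpective prototype: the function name `search_all` is hypothetical, and the web search engine and the peers' proxies are stood in for by plain Python callables that take a query string and return a list of URLs.

```python
from concurrent.futures import ThreadPoolExecutor


def search_all(query, web_search, peer_searches):
    """Query the web search engine and all peer proxies in parallel.

    Returns the two result lists separately so that a front end can
    present peer results alongside the web results.
    """
    with ThreadPoolExecutor(max_workers=1 + len(peer_searches)) as pool:
        web_future = pool.submit(web_search, query)
        peer_futures = [pool.submit(search, query) for search in peer_searches]

        web_results = web_future.result()
        peer_results = []
        for future in peer_futures:
            for url in future.result():
                if url not in peer_results:  # drop duplicates across peers
                    peer_results.append(url)
    return {"web": web_results, "peers": peer_results}


# Usage with stand-in backends (hypothetical data):
web = lambda q: ["http://example.org/a", "http://example.org/b"]
alice = lambda q: ["http://example.org/b", "http://example.org/c"]
bob = lambda q: ["http://example.org/c"]
print(search_all("pastry dht", web, [alice, bob]))
```

A real proxy would also need timeouts and some ranking of the peer results; both are omitted here.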


PeerSpective results summary

Explored potential of integrating Web and social network search

Evidence that PeerSpective added value
  Additional coverage for viewed sites
  Improved ranking of results
  Aided in finding content serendipitously
However, just an experiment
  Many challenges remain
  Opportunities as well

Details in [Mislove et al., HotNets ’06]


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. What’s next? Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication


Unwanted communication

Well-known problem
  Email spam
Increasingly affects other systems
  Search-engine spam
  Mislabeled videos plaguing YouTube
  Unwanted invitations in Skype
Existing solutions insufficient
  Content filtering for videos?


Known defenses

Content filtering
  Works very well for email, but
  False positives reduce communication reliability
  Doesn’t work for multimedia
Holding senders accountable
  Requires strong user identities
Imposing a per-communication cost
  Refunded if communication is wanted
  Requires micro-payments/quota market


Ostra: Using social relationships

Assumptions

Cost for acquiring and maintaining social links
  Cannot create links arbitrarily fast
  Cannot maintain arbitrary number of links
Receivers are willing to classify content
  Explicit (Junk button)
  Implicit (Deletion, response)


Ostra: Pair-wise credit exchange

• Credit balance/bound associated with each link
• Credit balances decay at constant rate (10%/day)
• Sum of all credit = 0 (invariant)


Ostra: Pair-wise credit exchange

[Figure: a Sender and a Receiver connected by a social link]

Message unwanted -> sender pays receiver one credit
Sending spam exhausts sender’s link balance
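The pair-wise rule can be sketched with a single link object. This is a simplified illustration, not the Ostra protocol: the class name, the lower bound of -2, and the choice to leave the balance untouched for wanted messages are assumptions, while the one-credit penalty and the 10%/day decay come from the slides.

```python
class Link:
    """Credit state of one social link, seen from the sender's side.

    The receiver's balance on the same link is always the negative of the
    sender's, so the sum of all credit stays at zero.
    """

    def __init__(self, lower_bound: float = -2.0):
        self.balance = 0.0              # sender's balance on this link
        self.lower_bound = lower_bound  # illustrative value, an assumption

    def can_send(self) -> bool:
        # The sender must be able to afford the penalty if the message
        # turns out to be unwanted.
        return self.balance - 1 >= self.lower_bound

    def send(self, unwanted: bool) -> bool:
        """Try to send one message; returns False if the link is exhausted."""
        if not self.can_send():
            return False
        if unwanted:
            self.balance -= 1  # sender pays the receiver one credit
        return True

    def decay(self, days: int = 1, rate: float = 0.10) -> None:
        """Balances decay toward zero at a constant rate per day."""
        self.balance *= (1 - rate) ** days


link = Link()
print(link.send(unwanted=True), link.balance)  # True -1.0
print(link.send(unwanted=True), link.balance)  # True -2.0
print(link.send(unwanted=True), link.balance)  # False -2.0
```

Once the balance reaches the bound, the sender is blocked on this link until decay (or credit earned in the other direction) restores some headroom, which is what limits the rate of spam per link.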


Ostra: End-to-end credit exchange

[Figure: credit is exchanged along a chain of social links from Sender to Receiver]

Rate of spam a user can send is proportional to number of links (s)he has


Sybil attacks are not effective

Total unwanted communication by Sybils is bounded by the number of links with other users


Ostra

Thwarts unwanted communication in existing systems
  Examples: Email, Skype, IM, YouTube
Uses existing relationships among users
  Online social networks
  Graph of email/IM/Skype users
Does not require strong user identities
Does not rely on automatic content classification
Respects recipient’s idea of wanted/unwanted communication

Details in [Mislove et al., NSDI ’08]


SN and applications research agenda

Measurement/Analysis
  Theory of complex networks
  Empirical study of social networks
  Understanding SN evolution
  Understanding SN information flow
Design
  Personalized search, filtering, content distribution
  Using social networks to thwart unwanted behavior
  Online social networks and privacy


Outline

1. Decentralized systems: state-of-the-art
2. Accountability for distributed systems
3. Social computing and applications
  Exploiting social networks for distributed computing
  Example: enhancing Web search
  Example: thwarting unwanted communication


Max Planck Institute for Software Systems (MPI-SWS)

Part of Max Planck Society
Academic research institute, publicly funded
Focus on basic research
Kick-off in Aug 2005
17 faculty positions (tenure-track)
~100 doctoral/post-doc positions
Administrative and technical support staff
Top international research institution


MPI-SWS Faculty

Peter Druschel            Distributed systems
Krishna Gummadi           Networked systems
Andrey Rybalchenko        Program analysis and verification
Derek Dreyer              Functional programming
Michael Backes (Fellow)   Security and cryptography
Rodrigo Rodrigues         Dependable systems
Paul Francis              Large scale Internet systems


Graduate program (MS/PhD)

Advised by MPI-SWS faculty
Stimulating, competitive environment
International, diverse student body (80%)
English language
Financial aid
Internships available

http://www.mpi-sws.org


Thanks for your attention!
