15-446 Distributed Systems Spring 2009 L-20 Multicast


Page 1: 15-446 Distributed Systems Spring 2009

15-446 Distributed Systems

Spring 2009

L-20 Multicast

Multicast Routing
Unicast: one source to one destination
Multicast: one source to many destinations
Two main functions:
  Efficient data distribution
  Logical naming of a group

Overview
What/Why Multicast
IP Multicast Service Basics
Multicast Routing Basics
DVMRP
Reliability
Congestion Control
Overlay Multicast
Publish-Subscribe

Multicast – Efficient Data Distribution
[Figure: two delivery diagrams, each with a source labeled Src]

Multicast Router Responsibilities
Learn of the existence of multicast groups (through advertisement)
Identify links with group members
Establish state to route packets
  Replicate packets on appropriate interfaces
  Routing entry: source and incoming interface, plus a list of outgoing interfaces
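
A minimal sketch of the routing entry described on this slide, assuming a table keyed by source address. The class name `RoutingEntry`, the function `replicate`, and the interface names are illustrative and do not come from the lecture.

```python
from dataclasses import dataclass, field


@dataclass
class RoutingEntry:
    """Per-source multicast state: where packets arrive, where copies go."""
    incoming: str                                  # interface toward the source
    outgoing: list = field(default_factory=list)   # interfaces with group members


def replicate(table, src, arrival_iface):
    """Return the interfaces on which a packet from src should be copied."""
    entry = table.get(src)
    if entry is None or entry.incoming != arrival_iface:
        return []                                  # no state or wrong interface: drop
    return list(entry.outgoing)


# Hypothetical table: packets from 10.0.0.1 arrive on eth0 and fan out to two links.
table = {"10.0.0.1": RoutingEntry("eth0", ["eth1", "eth2"])}
print(replicate(table, "10.0.0.1", "eth0"))
```

One copy is produced per outgoing interface, so a link carries each packet once no matter how many members sit behind it.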

Logical Naming
Single name/address maps to logically related set of destinations
  Destination set = multicast group
How to scale?
  Single name/address independent of group growth or changes

Multicast Groups
Members are the intended receivers
Senders may or may not be members
Hosts may belong to many groups
Hosts may send to many groups
Support dynamic creation of groups, dynamic membership, dynamic sources

Scope
Groups can have different scope
  LAN (local scope)
  Campus/admin scoping
  TTL scoping
Concept of scope important to multipoint protocols and applications

Example Applications
Broadcast audio/video
Push-based systems
Software distribution
Web-cache updates
Teleconferencing (audio, video, shared whiteboard, text editor)
Multi-player games
Server/service location
Other distributed applications


IP Multicast Architecture
Hosts
Routers
Service model
Host-to-router protocol (IGMP)
Multicast routing protocols (various)

IP Multicast Service Model (RFC 1112)
Each group identified by a single IP address
Groups may be of any size
Members of groups may be located anywhere in the Internet
Members of groups can join and leave at will
Senders need not be members
Group membership not known explicitly
Analogy:
  Each multicast address is like a radio frequency, on which anyone can transmit, and to which anyone can tune in.

IP Multicast Addresses
Class D IP addresses
  224.0.0.0 – 239.255.255.255
  Address format: the four high-order bits are 1110, followed by the group ID
How to allocate these addresses?
  Well-known multicast addresses, assigned by IANA
  Transient multicast addresses, assigned and reclaimed dynamically, e.g., by the “sdr” program
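
The class D test on this slide amounts to checking the four high-order bits of the address. A small sketch, using Python's standard `ipaddress` module; the helper names `is_class_d` and `group_id` are illustrative.

```python
import ipaddress


def is_class_d(addr: str) -> bool:
    """True if the four high-order bits of the IPv4 address are 1110."""
    value = int(ipaddress.IPv4Address(addr))
    return (value >> 28) == 0b1110


def group_id(addr: str) -> int:
    """Return the low 28 bits (the group ID) of an IPv4 address."""
    return int(ipaddress.IPv4Address(addr)) & 0x0FFFFFFF


for a in ("224.0.0.0", "239.255.255.255", "223.255.255.255", "240.0.0.0"):
    print(a, is_class_d(a))
```

The range 224.0.0.0 to 239.255.255.255 is exactly the set of addresses that pass this bit test.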

IP Multicast Service
Sending – same as before
Receiving – two new operations
  Join-IP-Multicast-Group(group-address, interface)
  Leave-IP-Multicast-Group(group-address, interface)
  Receive multicast packets for joined groups via normal IP-Receive operation
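
On sockets-style APIs the two operations above are usually exposed as the `IP_ADD_MEMBERSHIP` and `IP_DROP_MEMBERSHIP` socket options. The sketch below assumes IPv4 and a UDP socket; the wrapper names, the group address 239.1.2.3, and port 5007 are made up for illustration.

```python
import socket
import struct


def membership_request(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack the (group address, local interface address) pair that the
    IP_ADD_MEMBERSHIP / IP_DROP_MEMBERSHIP options expect."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Join-IP-Multicast-Group(group-address, interface)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface))


def leave_group(sock: socket.socket, group: str, interface: str = "0.0.0.0") -> None:
    """Leave-IP-Multicast-Group(group-address, interface)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    membership_request(group, interface))


# Usage (needs a multicast-capable interface, so it is left as a comment):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   sock.bind(("", 5007))
#   join_group(sock, "239.1.2.3")
#   data, sender = sock.recvfrom(1500)   # the normal receive call
#   leave_group(sock, "239.1.2.3")
```

Sending needs no membership: an ordinary `sendto` to the group address is enough, which matches "same as before" on the slide.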

Multicast Scope Control – Small TTLs
TTL expanding-ring search to reach or find a nearby subset of a group
[Figure: sender s surrounded by rings labeled 1, 2, 3]
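
A rough sketch of an expanding-ring search over a toy topology. It assumes, as a simplification, that a packet sent with TTL t reaches every node within t hops; the graph, node names, and the `max_ttl` cap are hypothetical.

```python
from collections import deque


def within_ttl(graph, source, ttl):
    """Nodes reachable from source in at most ttl hops (breadth-first)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == ttl:
            continue                      # TTL exhausted: do not go further
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)


def expanding_ring_search(graph, source, members, max_ttl=8):
    """Retry with TTL 1, 2, 3, ... until some group member is reached."""
    for ttl in range(1, max_ttl + 1):
        found = (within_ttl(graph, source, ttl) & members) - {source}
        if found:
            return ttl, found
    return None, set()


graph = {"s": ["a"], "a": ["s", "b"], "b": ["a", "c"], "c": ["b"]}
print(expanding_ring_search(graph, "s", {"b", "c"}))
```

The search stops at the smallest TTL that reaches any member, so only the nearby subset of the group sees the query.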

Multicast Scope Control – Large TTLs
Administrative TTL boundaries to keep multicast traffic within an administrative domain, e.g., for privacy or resource reasons
[Figure: an administrative domain connected to the rest of the Internet; a TTL threshold, greater than the diameter of the admin. domain, is set on the interfaces to the connecting links]


Multicast Routing
Basic objective – build distribution tree for multicast packets
Multicast service model makes it hard
  Anonymity
  Dynamic join/leave

Routing Techniques
Flood and prune
  Begin by flooding traffic to entire network
  Prune branches with no receivers
  Examples: DVMRP, PIM-DM
  Unwanted state where there are no receivers
Link-state multicast protocols
  Routers advertise groups for which they have receivers to entire network
  Compute trees on demand
  Example: MOSPF
  Unwanted state where there are no senders

Routing Techniques
Core-based protocols
  Specify “meeting place”, aka core
  Sources send initial packets to core
  Receivers join group at core
  Requires mapping between multicast group address and “meeting place”
  Examples: CBT, PIM-SM

Shared vs. Source-based Trees
Source-based trees
  Separate shortest path tree for each sender
  DVMRP, MOSPF, PIM-DM, PIM-SM
Shared trees
  Single tree shared by all members
  Data flows on same tree regardless of sender
  CBT, PIM-SM

Source-based Trees
[Figure: source-based trees; legend: router, source (S), receiver (R)]

A Shared Tree
[Figure: a shared tree through the RP; legend: router, source (S), receiver (R)]

Shared vs. Source-Based Trees
Source-based trees
  Shortest path trees – low delay, better load distribution
  More state at routers (per-source state)
  Efficient for dense-area multicast
Shared trees
  Higher delay (bounded by factor of 2), traffic concentration
  Choice of core affects efficiency
  Per-group state at routers
  Efficient for sparse-area multicast
Which is better? Extra state in routers is bad!


Distance-Vector Multicast Routing
DVMRP consists of two major components:
  A conventional distance-vector routing protocol (like RIP)
  A protocol for determining how to forward multicast packets, based on the routing table
DVMRP router forwards a packet if
  The packet arrived from the link used to reach the source of the packet (reverse path forwarding check – RPF)
  Downstream links have not pruned the tree
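
The two forwarding conditions above can be sketched as a single function. This is an illustration under simple assumptions, not DVMRP itself: the unicast table is a dict from source to interface, and prune state is a set of (source, group, interface) triples. All names are hypothetical.

```python
def rpf_forward(iface_to_source, pruned, src, group, arrival_iface, all_ifaces):
    """Return the interfaces a router would forward this packet onto."""
    # RPF check: accept only packets arriving on the interface this router
    # itself would use to reach the source.
    if iface_to_source.get(src) != arrival_iface:
        return []
    # Forward on every other interface that has not been pruned for (src, group).
    return [i for i in all_ifaces
            if i != arrival_iface and (src, group, i) not in pruned]


routes = {"S": "if0"}                 # unicast routing: S is reached via if0
pruned = {("S", "g", "if2")}          # downstream router on if2 sent a prune
print(rpf_forward(routes, pruned, "S", "g", "if0", ["if0", "if1", "if2"]))
```

A copy that arrives on any interface other than the one toward the source fails the check and is dropped, which is what keeps flooded packets from looping.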

Example Topology
[Figure: example topology with source S and group members G]

Broadcast with Truncation
[Figure: the same topology with source S and group members G]

Prune
[Figure: the same topology; routers send Prune (s,g) messages]

Graft
[Figure: a new member G sends Report (g); routers send Graft (s,g) messages]

Steady State
[Figure: the same topology with source S and four group members G]
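
The prune and graft steps in this sequence can be mimicked on a toy tree: a router stays on the (s,g) tree only if it has a member itself or some child does. The sketch below assumes the broadcast tree is given as a parent-to-children map; the topology and names are hypothetical.

```python
def prune_tree(children, root, has_member):
    """Return the set of nodes left on the (s,g) tree after pruning."""
    kept = set()

    def visit(node):
        keep = node in has_member
        for child in children.get(node, []):
            if visit(child):          # visit every child, even after a hit
                keep = True
        if keep:
            kept.add(node)
        return keep

    visit(root)
    kept.add(root)                    # the source is always on its own tree
    return kept


children = {"S": ["A", "B"], "A": ["C", "D"], "B": ["E"]}
print(sorted(prune_tree(children, "S", {"C"})))        # B, D, E pruned away
print(sorted(prune_tree(children, "S", {"C", "E"})))   # E joins: B grafted back
```

Recomputing with the new member stands in for the graft: the branch through B reappears once E reports membership.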


Implosion
Packet 1 is lost
All 4 receivers request a resend
[Figure: sender S and receivers R1 to R4; legend: resend request]

Retransmission
Re-transmitter
  Options: sender, other receivers
How to retransmit
  Unicast, multicast, scoped multicast, retransmission group, …
Problem: Exposure

Exposure
Packet 1 does not reach R1; Receiver 1 requests a resend
Packet 1 resent to all 4 receivers
[Figure: sender S and receivers R1 to R4; legend: resend request, resent packet]

Ideal Recovery Model
Packet 1 reaches R1 but is lost before reaching other receivers
Only one receiver sends NACK to the nearest S or R with packet
Repair sent only to those that need packet
[Figure: sender S and receivers R1 to R4; legend: resend request, resent packet]

SRM
Originally designed for wb
Receiver-reliable
  NACK-based
Every member may multicast NACK or retransmission

SRM Request Suppression
Packet 1 is lost; R1 requests resend to Source and Receivers
Packet 1 is resent; R2 and R3 no longer have to request a resend
Delay varies by distance
[Figure: sender S and receivers R1 to R3; legend: resend request, resent packet]

Deterministic Suppression
[Figure: timeline of data, nack, and repair messages between nodes at distances d, 2d, 3d, and 4d; legend: sender, repairer, requestor]
$\text{Delay} = C_1 \, d_{S,R}$

SRM Star Topology
Packet 1 is lost; all receivers request resends
Packet 1 is resent to all receivers
Delay is same length
[Figure: star topology with sender S and receivers R2 to R4; legend: resend request, resent packet]

SRM: Stochastic Suppression
[Figure: timeline of data, session msg, NACK, and repair messages, with nodes numbered 0 to 3 at distance d; legend: sender, repairer, requestor]
$\text{Delay} = U[0, D_2] \, d_{S,R}$

SRM (Summary)
NACK/retransmission suppression
  Delay before sending
  Delay based on RTT estimation
  Deterministic + stochastic components
Periodic session messages
  Full reliability
  Estimation of distance matrix among members
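
A sketch of how the deterministic and stochastic components might combine into one suppression timer. Adding the two terms, the parameter names `c1` and `c2`, and their default values are assumptions made for illustration; the slides give each component separately.

```python
import random


def request_delay(distance, c1=2.0, c2=2.0, rng=random):
    """Delay before multicasting a NACK, scaled by the estimated
    distance to the source (taken from session messages)."""
    deterministic = c1 * distance                  # farther receivers wait longer
    stochastic = rng.uniform(0.0, c2) * distance   # separates equidistant receivers
    return deterministic + stochastic


rng = random.Random(0)                             # seeded for repeatability
for d in (0.01, 0.05, 0.05):
    print(d, round(request_delay(d, rng=rng), 4))
```

A receiver that hears another member's NACK for the same packet before its own timer fires cancels the request. The deterministic term handles chains, and the random term handles the star case where all distances are equal.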

What’s Missing?
Losses at link (A,C) cause retransmission to the whole group
Only retransmit to those members who lost the packet
[Only request from the nearest responder]
[Figure: tree with sender S and receivers A to F; one link is labeled 0.99 and the other five are labeled 0]

Local Recovery
Different techniques in various systems
Application-level hierarchy
  Fixed vs. dynamic
TTL scoped multicast
Router supported


Multicast Congestion Control
What if receivers have very different bandwidths?
Send at max?
Send at min?
Send at avg?
[Figure: sender S, sending at ??? Mb/s, and receivers R; links labeled 100 Mb/s, 100 Mb/s, 1 Mb/s, 1 Mb/s, and 56 Kb/s]

Video Adaptation: RLM
Receiver-driven Layered Multicast
Layered video encoding
Each layer uses its own mcast group
On spare capacity, receivers add a layer
On congestion, receivers drop a layer
Join experiments used for shared learning

Layered Media Streams
R3 joins layer 1, fails at layer 2
R2 joins layer 1, joins layer 2, fails at layer 3
R1 joins layer 1, joins layer 2, joins layer 3
[Figure: source S, router R, and receivers R1, R2, R3; links labeled 10 Mbps (three links), 512 Kbps, and 128 Kbps]

Drop Policies for Layered Multicast
Priority
  Packets for low bandwidth layers are kept, drop queued packets for higher layers
  Requires router support
Uniform (e.g., drop tail, RED)
  Packets arriving at congested router are dropped regardless of their layer
Which is better?
  Intuition vs. reality!


RLM Intuition
Uniform
  Better incentives to well-behaved users
  If oversend, performance rapidly degrades
  Clearer congestion signal
  Allows shared learning
Priority
  Can waste upstream resources
  Hard to deploy
RLM approaches optimal operating point
  Uniform is already deployed
  Assume no special router support

RLM Join Experiment
Receivers periodically try subscribing to higher layer
If enough capacity, no congestion, no drops
  Keep layer (& try next layer)
If not enough capacity, congestion, drops
  Drop layer (& increase time to next retry)
What about impact on other receivers?
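
A toy simulation of the join-experiment loop for a single receiver. It assumes congestion happens exactly when the cumulative rate of the subscribed layers exceeds a fixed capacity, and that the retry timer doubles after a failed experiment. The layer rates, capacity, and timer values are invented for illustration.

```python
def run_join_experiments(layer_rates, capacity, rounds, base_wait=1, max_wait=16):
    """Simulate one receiver; returns the layer count after each round."""
    subscribed = 1                        # assume the base layer always fits
    wait = base_wait                      # rounds between experiments
    countdown = wait
    history = []
    for _ in range(rounds):
        countdown -= 1
        if countdown == 0:
            if len(layer_rates) > subscribed:
                trial = subscribed + 1
                if capacity >= sum(layer_rates[:trial]):
                    subscribed = trial    # no drops: keep the layer
                    wait = base_wait
                else:
                    wait = min(wait * 2, max_wait)   # drops: leave it, back off
            countdown = wait
        history.append(subscribed)
    return history


# Layers of 32, 64, 128, 256 Kbps behind a 250 Kbps bottleneck.
print(run_join_experiments([32, 64, 128, 256], 250, 6))
```

The receiver settles at three layers (224 Kbps) and probes the fourth less and less often. In a real deployment a failed experiment also congests other receivers behind the same bottleneck, which is the concern raised in the last bullet.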

Join Experiments
[Figure: subscribed layer (1 to 4) plotted against time]

RLM Scalability?
What happens with more receivers?
Increased frequency of experiments?
  More likely to conflict (false signals)
  Network spends more time congested
Reduce # of experiments per host?
  Takes longer to converge
Receivers coordinate to improve behavior


Supporting Multicast on the Internet
At which layer should multicast be implemented?
[Figure: Internet architecture stack; the application layer and the network (IP) layer are each marked with "?"]

IP Multicast
Highly efficient
Good delay
[Figure: CMU, Berkeley, MIT, and UCSD; legend: routers, end systems, multicast flow]

End System Multicast
[Figure: end systems MIT1, MIT2, CMU1, CMU2, Berkeley, and UCSD connected by an overlay tree, drawn over the network linking CMU, Berkeley, MIT, and UCSD]

Potential Benefits Over IP Multicast
Quick deployment
All multicast state in end systems
Computation at forwarding points simplifies support for higher level functionality
[Figure: end systems MIT1, MIT2, CMU1, CMU2 and sites CMU, Berkeley, MIT, UCSD]

Concerns with End System Multicast
Self-organize recipients into multicast delivery overlay tree
  Must be closely matched to real network topology to be efficient
Performance concerns compared to IP Multicast
  Increase in delay
  Bandwidth waste (packet duplication)
[Figure: the hosts MIT1, MIT2, CMU1, CMU2, Berkeley, and UCSD shown twice, once under IP Multicast and once under End System Multicast]

Coordination: Cooperative group communication
Scribe: Tree-based group management
Multicast, anycast primitives
Scalable: large numbers of groups, members, wide range of members/group, dynamic membership
[IEEE JSAC ’02]

Cooperative group communication
Operations:
  create(g)
  join(g)
  leave(g)
  multicast(g,m)
  anycast(g,m)
• groupId g mapped to n0
• decentralized membership
• robust, scalable
[Figure: nodes n0 to n4; one children table reads g: n1, n2 and another reads g: n3, n4; n3 and n4 are marked as members of g]
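
A toy, single-process sketch of how a Scribe-style tree forms from join requests. It assumes each join is routed hop by hop toward the node that groupId maps to, and that the request stops at the first node already on the tree. The class `GroupTree` and the explicit `route` lists are stand-ins for the overlay's routing; `leave` and `anycast` are left out.

```python
class GroupTree:
    """Toy multicast tree for one group, rooted at the groupId's node."""

    def __init__(self, root):
        self.root = root
        self.children = {root: set()}     # tree node -> its children

    def join(self, route):
        """route: hypothetical overlay path from the joiner to the root."""
        for child, parent in zip(route, route[1:]):
            already_on_tree = parent in self.children
            self.children.setdefault(parent, set()).add(child)
            self.children.setdefault(child, set())
            if already_on_tree:
                break                     # join stops at the first tree node

    def multicast(self, message):
        """Push message from the root down the tree; returns nodes reached."""
        reached, stack = [], [self.root]
        while stack:
            node = stack.pop()
            reached.append(node)
            stack.extend(sorted(self.children[node]))
        return reached


tree = GroupTree("n0")
tree.join(["n3", "n1", "n0"])
tree.join(["n4", "n1", "n0"])             # stops at n1, never reaches n0
tree.join(["n2", "n0"])
print(tree.children["n0"], tree.children["n1"])
```

Because a join stops at the first node already on the tree, the root only learns about its direct children, which is what makes membership decentralized.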

Scribe
[Figure: Join(groupId) requests routed toward groupId, drawn in the proximity space]

Respecting forwarding capacity

The tree structure described may not respect maximum capacities

Scribe's push-down fails to resolve the problem because a leaf node in one tree may have children in another tree

Parent location algorithm
Node adopts prospective child
If too many children, choose one to reject:
  First, look for one in stripe without shared prefix
  Otherwise, select node with shortest prefix match
Orphan locates new parent in up to two steps:
  Tries former siblings with stripe prefix match
    Adopts or rejects using same criteria; continue push-down
  Use the spare capacity group
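
One possible reading of the rejection rule, written as a sketch. It assumes node ids and stripe ids are digit strings compared by common prefix, and it breaks ties by taking the first candidate; both choices, along with the function names, are assumptions for illustration.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def choose_child_to_reject(node_id, children):
    """children: (child_id, stripe_id) pairs, prospective child included.
    Returns the pair that should be orphaned."""
    # First choice: a child in a stripe whose id shares no prefix with ours.
    foreign = [c for c in children if shared_prefix_len(node_id, c[1]) == 0]
    if foreign:
        return foreign[0]
    # Otherwise: the child whose id has the shortest prefix match with its stripe.
    return min(children, key=lambda c: shared_prefix_len(c[0], c[1]))


print(choose_child_to_reject("1a", [("3f", "1000"), ("22", "9000")]))
print(choose_child_to_reject("1a", [("10f", "1000"), ("1a2", "1000"), ("2bc", "1000")]))
```

The rejected child becomes the orphan, which then looks for a new parent in the two steps listed on the slide.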

The spare capacity group
If orphan hasn't found parent yet, anycasts to spare capacity group
Group contains all nodes with fewer children than their forwarding capacity
Anycast returns nearby node, which starts a DFS of the spare capacity group tree, sending first to a child...

Spare capacity group (cont.)
At each node in the search:
  If node has no children left to search, check whether it receives a stripe the orphan seeks
  If so, verifies that the orphan is not an ancestor (which would create a cycle)
If both tests succeed, the node adopts the orphan
  May leave spare capacity group
If either test fails, back up to parent (more DFS...)
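
The search described across these two slides can be sketched as a recursive depth-first traversal. The data layout is assumed for illustration: `children` is the spare capacity group tree, `stripes_of` lists the stripes each node receives, and `ancestors_of` gives a node's ancestors in a given stripe tree.

```python
def find_parent(node, orphan, stripe, children, stripes_of, ancestors_of):
    """Depth-first search of the spare capacity group tree from node.
    Returns an adopting node, or None if this subtree has none."""
    for child in children.get(node, []):
        found = find_parent(child, orphan, stripe, children, stripes_of, ancestors_of)
        if found is not None:
            return found
    # No children left to search: test this node itself.
    receives_stripe = stripe in stripes_of.get(node, set())
    creates_cycle = orphan in ancestors_of.get((node, stripe), set())
    if receives_stripe and not creates_cycle and node != orphan:
        return node
    return None                           # back up to the parent


children = {"a": ["b", "c"]}
stripes_of = {"a": {"s1"}, "b": {"s2"}, "c": {"s1"}}
ancestors_of = {("c", "s1"): {"x"}}       # x is above c in the s1 tree
print(find_parent("a", "x", "s1", children, stripes_of, ancestors_of))
print(find_parent("a", "y", "s1", children, stripes_of, ancestors_of))
```

For orphan x the search skips c, since adopting x there would create a cycle, and backs up to a. For orphan y it stops at c.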

A spare capacity example
[Figure: spare capacity example]

Problems
Imposing bandwidth constraints on Scribe can result in:
  High tree depth
  Non-DHT links
Observed cause: mismatch between id space and node bandwidth constraints


Publish-Subscribe
P/S service is also known as event service
Publisher's role: publishers generate event data and publish it
Subscriber's role: subscribers submit their subscriptions and process the events received
P/S service: the mediator/broker that routes events from publishers to interested subscribers

Publish/Subscribe
[Figure: publishers publish to the Publish/Subscribe Service and subscribers subscribe to it. Subscriptions: "Want champions league football news", "Want weather news for Lancaster", "Want Traffic update for junction A6". Publications: "REAL MADRID 4-2 MARSEILLE", "Weather Lancaster: sunny intervals, min 11°C max 20°C"]

Key attributes of P/S communication model
The publishing entities and subscribing entities are anonymous
The publishing entities and subscribing entities are highly de-coupled
Asynchronous communication model
The number of publishing and subscribing entities can dynamically change without affecting the entire system

Key functions implemented by P/S service
Event filtering (event selection) – The process which selects the set of subscribers that have shown interest in a given event
Event routing (event delivery) – The process of routing the published events from the publisher to all interested subscribers

Subject based vs. Content based
Subject based:
  Generally also known as topic based, group based or channel based event filtering.
  Here each event is published to one of these channels by its publisher
  A subscriber subscribes to a particular channel and will receive all events published to the subscribed channel.
  Simple process for matching an event to subscriptions
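
A minimal in-process sketch of subject-based filtering: matching is a single dictionary lookup on the channel name. The class `TopicBroker` and the channel names are illustrative.

```python
from collections import defaultdict


class TopicBroker:
    """Subject-based broker: subscribers are indexed by channel name."""

    def __init__(self):
        self.subscribers = defaultdict(list)   # channel -> callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, event):
        """Deliver event to every subscriber of channel; returns the count."""
        targets = self.subscribers.get(channel, [])
        for callback in targets:
            callback(event)
        return len(targets)


broker = TopicBroker()
inbox = []
broker.subscribe("weather/lancaster", inbox.append)
print(broker.publish("weather/lancaster", "sunny intervals"), inbox)
print(broker.publish("football", "REAL MADRID 4-2 MARSEILLE"))
```

The publisher never learns who the subscribers are, and the cost of matching does not depend on how many other channels exist.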

Subject based vs. Content based
Content based:
  More flexibility and power to subscribers, by allowing them to express a subscription as an arbitrary query over the contents of the event.
  E.g., notify me of all stock quotes of IBM from the New York Stock Exchange if the price is greater than 150
  Added complexity in matching an event to subscriptions
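
A sketch of content-based filtering in which each subscription is an arbitrary predicate over the event. The naive broker below evaluates every predicate for every event, which illustrates the added matching cost. The event field names (`symbol`, `exchange`, `price`) are made up for the stock-quote example.

```python
class ContentBroker:
    """Content-based broker: a subscription is a predicate over the event."""

    def __init__(self):
        self.subscriptions = []            # (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, event):
        """Test every subscription against the event; returns match count."""
        matched = 0
        for predicate, callback in self.subscriptions:
            if predicate(event):
                callback(event)
                matched += 1
        return matched


quotes = ContentBroker()
alerts = []
quotes.subscribe(
    lambda e: e["symbol"] == "IBM" and e["exchange"] == "NYSE" and e["price"] > 150,
    alerts.append,
)
print(quotes.publish({"symbol": "IBM", "exchange": "NYSE", "price": 151.2}))
print(quotes.publish({"symbol": "IBM", "exchange": "NYSE", "price": 149.0}))
```

Unlike the channel lookup of the subject-based model, matching here is linear in the number of subscriptions unless the broker indexes the predicates, which is one face of the expressiveness versus scalability trade-off.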

Event routing
The basic P/S system consists of many event publishers, an event broker (or mediator) and many subscribers.
An event publisher generates an event in response to some change it monitors
The events are published to an event broker, which matches events against all subscriptions forwarded by subscribers in the system.
Event broker system could have either a single event broker or multiple distributed event brokers coordinating among themselves

Event routing
[Figure: event publishers (P) send events to the event broker system, which is either a single broker or a cluster of cooperating brokers; event subscribers S1, S2, S3 register subscriptions C1, C2, C3 respectively]

Basic elements of P/S model
Event data model
  Structure
  Types
Subscription model
  Filter language
  Scope (subject, content, context)
General challenge
  Expressiveness vs. Scalability