
Page 1:

ECE544: Communication Networks-II Spring 2009

Hang Liu

Lecture 6

Includes teaching materials from D. Raychaudhuri, L. Peterson, J. Kurose, K. Almeroth

Page 2:

Today’s Lecture

• Recap
  – Sub-netting
  – Super-netting (CIDR)
  – BGP
  – IPv6
• IP Multicast
  – Introduction
  – Internet Group Management Protocol (IGMP)
  – Routing Protocols
    • Intra-domain (DVMRP, MOSPF, PIM)
    • Inter-domain (BGMP, MSDP, PIM-SSM)

Page 3:

Internet Structure Today

[Figure: backbone service providers interconnected at peering points, with large corporations, a small corporation, and “consumer” ISPs attached.]

Page 4:

Subnetting

• Add another level to the address/routing hierarchy: the subnet
• Subnet masks define a variable partition of the host part
• Subnets are visible only within the site

Class B address:               | Network number | Host number |
Subnet mask (255.255.255.0):   11111111 11111111 11111111 00000000
Subnetted address:             | Network number | Subnet ID | Host ID |

Page 5:

Subnet Example

Forwarding table at router R1

Subnet Number    Subnet Mask        Next Hop
128.96.34.0      255.255.255.128    interface 0
128.96.34.128    255.255.255.128    interface 1
128.96.33.0      255.255.255.0      R2

Topology (from the figure):
• Subnet 128.96.34.0, mask 255.255.255.128: H1 (128.96.34.15), R1 (128.96.34.1)
• Subnet 128.96.34.128, mask 255.255.255.128: R1 (128.96.34.130), R2 (128.96.34.129), H2 (128.96.34.139)
• Subnet 128.96.33.0, mask 255.255.255.0: R2 (128.96.33.1), H3 (128.96.33.14)
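The lookup that R1 performs can be sketched as follows. This is a minimal illustration, assuming the table is scanned in order and the first matching entry wins; the function name `next_hop` is hypothetical.

```python
import ipaddress

# Forwarding table at R1, copied from the slide:
# (subnet number, subnet mask, next hop).
FORWARDING_TABLE = [
    ("128.96.34.0", "255.255.255.128", "interface 0"),
    ("128.96.34.128", "255.255.255.128", "interface 1"),
    ("128.96.33.0", "255.255.255.0", "R2"),
]


def next_hop(destination, table=FORWARDING_TABLE):
    """AND the destination with each mask and compare to the subnet number."""
    dest = int(ipaddress.IPv4Address(destination))
    for subnet, mask, hop in table:
        if dest & int(ipaddress.IPv4Address(mask)) == int(ipaddress.IPv4Address(subnet)):
            return hop
    return None  # no entry matched (a real router would use a default route)
```

For example, `next_hop("128.96.34.139")` selects interface 1, because masking H2's address with 255.255.255.128 yields 128.96.34.128.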

Page 6:

Classless Inter-Domain Routing (CIDR) or Super-netting

Class B: | Net ID | Host ID |
Class C: | Net ID | Host ID |

• Problem: Class B addresses are running out
• Solution: allocate multiple Class C addresses
• Problem: random allocation of Class C addresses needs multiple routing table entries
• Solution: allocate “contiguous” Class C addresses
• Routing entry: [IP Address of Network and Net Mask]

IP Address:  195.201.3.5 = 11000011 11001001 00000011 00000101
Net Mask:    254.0.0.0   = 11111110 00000000 00000000 00000000
----------------------------------------------------------------
Network IP:  194.0.0.0   = 11000010 00000000 00000000 00000000
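The masking step above is a bitwise AND of the address with the net mask. A small sketch that reproduces the arithmetic (the helper names `network_address` and `to_binary` are illustrative, not from the lecture):

```python
import ipaddress


def network_address(ip, mask):
    """AND an IPv4 address with a net mask and return the network address."""
    ip_bits = int(ipaddress.IPv4Address(ip))
    mask_bits = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(ip_bits & mask_bits))


def to_binary(ip):
    """Render an IPv4 address as four space-separated 8-bit groups."""
    return " ".join(f"{octet:08b}" for octet in ipaddress.IPv4Address(ip).packed)


network = network_address("195.201.3.5", "254.0.0.0")
print(network, "=", to_binary(network))
```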

Page 7:

CIDR (continued)

How many Class C addresses?

Organization’s Requirements      Assignment
Fewer than 256 addresses         1 Class C network
Fewer than 512 addresses         2 contiguous Class C networks
Fewer than 1024 addresses        4 contiguous Class C networks
Fewer than 2048 addresses        8 contiguous Class C networks
Fewer than 4096 addresses        16 contiguous Class C networks
Fewer than 8192 addresses        32 contiguous Class C networks
Fewer than 16384 addresses       64 contiguous Class C networks

Contiguous Class C network addresses allow a “single” entry in the routing table for each of the above organizations

Page 8:

Route Aggregation Examples

Q: How does a network get the network part of its IP address?

A: It gets an allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
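The split of the ISP's /20 into eight /23 blocks can be checked with Python's standard `ipaddress` module. This is only a sketch of the arithmetic; the variable names are illustrative.

```python
import ipaddress

# The ISP's block from the slide.
isp_block = ipaddress.ip_network("200.23.16.0/20")

# Carve it into /23 blocks, one per organization.
organizations = list(isp_block.subnets(new_prefix=23))
for index, block in enumerate(organizations):
    print(f"Organization {index}: {block}")

# Aggregation is the reverse step: the eight /23 blocks collapse to one /20.
aggregate = list(ipaddress.collapse_addresses(organizations))
```

Three bits separate /20 from /23, which is why exactly 2^3 = 8 organizations fit in the block.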

Page 9:

Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

[Figure: Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), ..., Organization 7 (200.23.30.0/23) attach to Fly-By-Night-ISP; Organization 1 (200.23.18.0/23) attaches to ISPs-R-Us. Both ISPs connect to the Internet.]

• Fly-By-Night-ISP: “Send me anything with addresses beginning 200.23.16.0/20”
• ISPs-R-Us: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”

Page 10:

Coordinated Address Allocation

Multi-regional           192.0.0.0 -- 193.255.255.255
Europe                   194.0.0.0 -- 195.255.255.255
Others                   196.0.0.0 -- 197.255.255.255
North America            198.0.0.0 -- 199.255.255.255
Central/South America    200.0.0.0 -- 201.255.255.255
Pacific Rim              202.0.0.0 -- 203.255.255.255
Others                   204.0.0.0 -- 205.255.255.255
Others                   206.0.0.0 -- 207.255.255.255

Address aggregation using geographic scope

European networks will have a single entry in the routing tables of routers on other continents: [Network IP = 194.0.0.0; mask = 254.0.0.0]

194.0.0.0       = 11000010 00000000 00000000 00000000
195.255.255.255 = 11000011 11111111 11111111 11111111

The same 7 high-order bits imply Mask = 11111110 00000000 00000000 00000000 = 254.0.0.0
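As a quick check of the European example, the standard `ipaddress` module can summarize an address range into the smallest set of CIDR blocks. The variable names below are illustrative.

```python
import ipaddress

first = ipaddress.IPv4Address("194.0.0.0")
last = ipaddress.IPv4Address("195.255.255.255")

# A range that shares its 7 high-order bits collapses to a single /7 block.
blocks = list(ipaddress.summarize_address_range(first, last))
europe = blocks[0]
print(europe, europe.netmask)
```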

Page 11:

Address Matching in CIDR

• Routing table uses “longest prefix match”
  – 171.69 (16-bit prefix) = routing table entry #1
  – 171.69.10 (24-bit prefix) = routing table entry #2
  – then DA = 171.69.10.5 matches routing table entry #2
  – and DA = 171.69.20.3 matches routing table entry #1

Page 12:

CIDR (Summary)

• Contiguous block of 2^N addresses
• [Base address, Mask]
• Lookup algorithm:
  – Mask the destination address against the mask in each routing table entry
  – A match means a route is found
  – There may be multiple matches!
  – The longest mask breaks “ties” (longest prefix match)
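The lookup algorithm can be sketched with the two entries from the address-matching example (171.69/16 and 171.69.10/24). This is a linear scan for clarity, under the assumption that table size does not matter here; production routers use tries or hardware lookup instead. The names `ROUTES` and `lookup` are illustrative.

```python
import ipaddress

# Routing table entries from the address-matching example.
ROUTES = {
    ipaddress.ip_network("171.69.0.0/16"): "entry #1",
    ipaddress.ip_network("171.69.10.0/24"): "entry #2",
}


def lookup(destination, routes=ROUTES):
    """Return the entry whose prefix matches the destination with the longest mask."""
    dest = ipaddress.ip_address(destination)
    matches = [net for net in routes if dest in net]
    if not matches:
        return None
    return routes[max(matches, key=lambda net: net.prefixlen)]
```

`lookup("171.69.10.5")` matches both prefixes, and the /24 wins; `lookup("171.69.20.3")` matches only the /16.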

Page 13:

IP addressing (Summary)

• Classful addressing:
  – inefficient use of address space, address space exhaustion
  – e.g., a class B net is allocated enough addresses for 65K hosts, even if there are only 2K hosts in that network
• CIDR: Classless InterDomain Routing
  – network portion of address of arbitrary length
  – address format: a.b.c.d/x, where x is the number of bits in the network portion of the address

200.23.16.0/23 = 11001000 00010111 0001000 | 0 00000000
                 network part (23 bits)    | host part (9 bits)

Page 14:

Routing in the Internet

• The Global Internet consists of Autonomous Systems (AS) interconnected with each other:
  – Stub AS: small corporation; one connection to other AS’s
  – Multihomed AS: large corporation (no transit); multiple connections to other AS’s
  – Transit AS: provider, hooking many AS’s together
• Two-level routing:
  – Intra-AS: administrator responsible for choice of routing algorithm within network (RIP, OSPF)
  – Inter-AS: unique standard for inter-AS routing: BGP

Page 15:

Hierarchical OSPF

• Two-level hierarchy: local area, backbone.
  – Link-state advertisements only within an area
  – Each node has the detailed area topology; it knows only the direction (shortest path) to nets in other areas.
• Area border routers: “summarize” distances to nets in own area, advertise to other area border routers.
• Backbone routers: run OSPF routing limited to backbone.
• Boundary routers: connect to other AS’s.

Page 16:

Inter-AS routing in the Internet: BGP

[Figure 4.5.2-new2: BGP use for inter-domain routing. AS1 (RIP intra-AS routing), AS2 (OSPF intra-AS routing), and AS3 (OSPF intra-AS routing) are linked by BGP; routers R1 to R5 are shown.]

Page 17:

Internet inter-AS routing: BGP

• BGP (Border Gateway Protocol): the de facto standard
• Path Vector protocol:
  – similar to Distance Vector protocol
  – each Border Gateway broadcasts to neighbors (peers) the entire path (i.e., sequence of AS’s) to the destination
  – BGP routes to networks (ASs), not individual hosts
  – E.g., Gateway X may send its path to dest. Z:

    Path (X,Z) = X,Y1,Y2,Y3,…,Z

Page 18:

Internet inter-AS routing: BGP

Suppose: gateway X sends its path to peer gateway W

• W may or may not select the path offered by X
  – cost, policy (don’t route via a competitor’s AS), loop prevention reasons.
• If W selects the path advertised by X, then:

    Path (W,Z) = W, Path (X,Z)

• Note: X can control incoming traffic by controlling its route advertisements to peers:
  – e.g., don’t want to route traffic to Z -> don’t advertise any routes to Z
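W's decision can be sketched as a function over the advertised AS path. This is a simplified illustration, not BGP's actual decision process: the `blocked` set stands in for policy, and the function name `accept_path` is hypothetical.

```python
def accept_path(own_as, advertised_path, blocked=frozenset()):
    """Return the path W would use, or None if W rejects the advertisement.

    own_as          -- identifier of the receiving AS (W)
    advertised_path -- sequence of ASs advertised by the peer, e.g. Path(X, Z)
    blocked         -- ASs that policy forbids routing through (an assumption)
    """
    if own_as in advertised_path:
        return None  # loop prevention: W already appears on the path
    if blocked.intersection(advertised_path):
        return None  # policy: do not route via a blocked AS
    return [own_as] + list(advertised_path)  # Path(W, Z) = W, Path(X, Z)
```

For instance, `accept_path("W", ["X", "Y1", "Z"])` yields `["W", "X", "Y1", "Z"]`, while an advertised path that already contains W is rejected.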

Page 19:

BGP: controlling who routes to you

[Figure 4.5-BGPnew: a simple BGP scenario. Legend: customer network; provider network. Networks shown: A, B, C, W, X, Y.]

• A, B, C are provider networks
• X, W, Y are customers (of provider networks)
• X is dual-homed: attached to two networks
  – X does not want to route from B via X to C
  – ... so X will not advertise to B a route to C

Page 20:

BGP: controlling who routes to you

[Figure 4.5-BGPnew: a simple BGP scenario. Legend: customer network; provider network. Networks shown: A, B, C, W, X, Y.]

• A advertises to B the path AW
• B advertises to X the path BAW
• Should B advertise to C the path BAW?
  – No way! B gets no “revenue” for routing CBAW since neither W nor C is B’s customer
  – B wants to force C to route to W via A
  – B wants to route only to/from its customers!
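The export rule "route only to/from its customers" can be written down as a small predicate. The customer relationships below are assumptions read off the scenario (W attached to A, X attached to B and C, Y attached to C); the slide text states only that X is dual-homed and that W and C are not B's customers. The names `CUSTOMERS` and `should_advertise` are hypothetical.

```python
# Assumed provider -> customers map for the scenario (not stated in full on the slide).
CUSTOMERS = {"A": {"W"}, "B": {"X"}, "C": {"X", "Y"}}


def should_advertise(provider, neighbor, path, customers=CUSTOMERS):
    """A provider exports a path only if it earns revenue from carrying the traffic,
    i.e. the neighbor receiving the advertisement or the destination is its customer."""
    destination = path[-1]
    own_customers = customers.get(provider, set())
    return neighbor in own_customers or destination in own_customers
```

Under these assumptions B advertises BAW to its customer X but not to C, matching the reasoning on the slide.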

Page 21:

BGP operation

Q: What does a BGP router do?

• Receiving and filtering route advertisements from directly attached neighbor(s).
• Route selection.
  – To route to destination X, which path (of several advertised) will be taken?
  – Policy
• Sending route advertisements to neighbors.

Page 22:

BGP Operations (Simplified)

[Figure: a BGP session between AS1 and AS2.]

• Establish session on TCP port 179
• Exchange all active routes
• Exchange incremental updates

While the connection is ALIVE, exchange route UPDATE messages

Page 23:

Four Types of BGP Messages

• Open: Establish a peering session.
• Keep Alive: Handshake at regular intervals.
• Notification: Shuts down a peering session.
• Update: Announcing new routes or withdrawing previously announced routes.

announcement = prefix + attribute values

Page 24:

Attributes are Used to Select Best Routes

[Figure: four neighbors each announce 192.0.2.0/24 (“pick me!”).]

Given multiple routes to the same prefix, a BGP speaker must pick at most one best route (Note: it could reject them all!)

Page 25:

ASPATH Attribute

[Figure: prefix 135.207.0.0/16 is originated by AS 6341 (AT&T Research). Other ASs shown: AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone), AS 3549 (Global Crossing), AS 1129 (Global Access), AS 12654 (RIPE NCC RIS project).]

Announcements shown for 135.207.0.0/16:
• AS Path = 6341
• AS Path = 7018 6341
• AS Path = 3549 7018 6341
• AS Path = 1239 7018 6341
• AS Path = 1755 1239 7018 6341
• AS Path = 1129 1755 1239 7018 6341
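A deliberately simplified sketch of route selection on AS-path length follows. Real BGP speakers compare other attributes (such as local preference) before path length; this example assumes path length is the only criterion, and the `rejected` set is a hypothetical stand-in for import policy.

```python
def best_route(candidates, own_as, rejected=frozenset()):
    """Pick at most one best AS path for a prefix.

    candidates -- list of AS paths (lists of AS numbers) for the same prefix
    own_as     -- AS number of the speaker making the choice
    rejected   -- AS numbers the speaker refuses to route through (assumption)
    """
    usable = [
        path
        for path in candidates
        if own_as not in path and not rejected.intersection(path)
    ]
    # The speaker may reject every candidate, in which case there is no best route.
    return min(usable, key=len) if usable else None


# Three of the AS paths shown for 135.207.0.0/16.
paths = [[7018, 6341], [1239, 7018, 6341], [1755, 1239, 7018, 6341]]
```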

Page 26:

AS Graphs Do Not Show Topology!

[Figure: two diagrams. The AS graph may look like this. Reality may be closer to this…]

BGP was designed to throw away information!

Page 27:

Shorter Doesn’t Always Mean Shorter

[Figure: AS 1, AS 2, AS 3, AS 4.]

BGP says that path 4 1 is better than path 3 2 1 ??

In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and the amount of routing state

Page 28:

Why different Intra- and Inter-AS routing?

Policy:
• Inter-AS: admin wants control over how its traffic is routed and who routes through its net.
• Intra-AS: single admin, so no policy decisions needed

Scale:
• hierarchical routing saves table size and reduces update traffic

Performance:
• Intra-AS: can focus on performance
• Inter-AS: policy may dominate over performance

Page 29:

IP Version 6

• Features
  – 128-bit addresses (classless)
  – multicast
  – real-time service
  – authentication and security
  – autoconfiguration
  – end-to-end fragmentation
  – protocol extensions
• Header
  – 40-byte “base” header
  – extension headers (fixed order, mostly fixed length)
    • fragmentation
    • source routing
    • authentication and security
    • other options

Page 30:

IPv4 & IPv6 Header Comparison

IPv4 Header fields: Version, IHL, Type of Service, Total Length, Identification, Flags, Fragment Offset, Time to Live, Protocol, Header Checksum, Source Address, Destination Address, Options, Padding

IPv6 Header fields: Version, Traffic Class, Flow Label, Payload Length, Next Header, Hop Limit, Source Address, Destination Address

Legend (color-coded in the original figure):
• field’s name kept from IPv4 to IPv6
• fields not kept in IPv6
• name & position changed in IPv6
• new field in IPv6

Page 31:

IP Multicast

• Introduction
• Internet Group Management Protocol (IGMP)
• Routing Protocols
  – Intra-domain (DVMRP, MOSPF, PIM)
  – Inter-domain (MBGP, MSDP)

Page 32:

Multicast: one sender to many receivers

• Multicast: act of sending datagram to multiple receivers with single “transmit” operation
  – One-to-many, many-to-many
• Question: how to achieve multicast

Multicast via unicast
• source sends N unicast datagrams, one addressed to each of N receivers
  – Redundant traffic around sender
  – Keep track of all the IP addresses to send to

[Figure: routers forward unicast datagrams. Legend: multicast receiver (red); not a multicast receiver.]

Page 33:

Multicast: one sender to many receivers

• Multicast: act of sending datagram to multiple receivers with single “transmit” operation
  – One-to-many, many-to-many
• Question: how to achieve multicast

Network multicast (IP Multicast)
• Routers actively participate in multicast, making copies of packets as needed and forwarding towards multicast receivers

[Figure: multicast routers (red) duplicate and forward multicast datagrams.]

Page 34:

Multicast: one sender to many receivers

• Multicast: act of sending datagram to multiple receivers with single “transmit” operation
  – One-to-many, many-to-many
• Question: how to achieve multicast

Application-layer multicast (P2P)
• end systems (“hosts”) involved in multicast copy and forward unicast datagrams among themselves
• “host” becomes router

[Figure: P2P hosts duplicate and forward multicast datagrams.]

Page 35:

Internet Multicast Service Model

multicast group concept:
  – Each group has its own IP multicast address
  – A host can join or leave freely
  – Routers forward multicast datagrams (whose destination address is the group’s multicast address) to hosts that have “joined” that multicast group

[Figure: hosts 128.119.40.186, 128.59.16.12, 128.34.108.63, and 128.34.108.60; multicast group 226.17.30.197.]

Page 36:

Multicast groups

• class D Internet addresses reserved for multicast
• host group semantics:
  o anyone can “join” (receive) or leave a multicast group
  o anyone (even a non-member) can send to a multicast group
  o no network-layer identification of member hosts
• needed: infrastructure to deliver mcast-addressed datagrams to all hosts that have joined that multicast group

Page 37:

Mapping IP Multicast Address to Ethernet Address

• Ethernet MAC Addresses: 48 bits
  – broadcast: all 1s, ff:ff:ff:ff:ff:ff
  – multicast: multicast flag (the lowest bit of the 1st octet) = 1
• 01-00-5E-00-00-00 to 01-00-5E-7F-FF-FF for IP multicast
• IP multicast group address mapped to the low-order 23 bits of the MAC address
• not a one-to-one mapping: one Ethernet mcast addr corresponds to 32 IP mcast addrs
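The mapping can be sketched in a few lines: keep the low-order 23 bits of the group address and place them under the fixed 01-00-5E prefix. The function name `multicast_mac` is illustrative.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a class D (multicast) address")
    low23 = int(addr) & 0x7FFFFF      # low-order 23 bits of the group address
    mac = 0x01005E000000 | low23      # fixed prefix 01-00-5E, 24th bit is 0
    return "-".join(f"{byte:02X}" for byte in mac.to_bytes(6, "big"))
```

A class D address has 28 variable bits, of which only 23 survive the mapping, so 2^5 = 32 group addresses share each MAC address. For example, 226.17.30.197 and 226.145.30.197 differ only in a discarded bit and map to the same MAC.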

Page 38:

IPv6 Multicast Addresses (RFC 2375)

| 11111111 (8 bits) | flags (4 bits) | scope (4 bits) | group ID (112 bits) |

• low-order flag indicates permanent / transient group; three other flags reserved
• scope field:
  – 1 - node-local
  – 2 - link-local
  – 5 - site-local
  – 8 - organization-local
  – B - community-local
  – E - global
  – (all other values reserved)
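The field layout can be unpacked with shifts and masks. This sketch uses the scope values listed on the slide and assumes the low-order flag set to 1 means a transient group; the function and table names are illustrative.

```python
import ipaddress

# Scope values as listed on the slide.
SCOPES = {
    0x1: "node-local",
    0x2: "link-local",
    0x5: "site-local",
    0x8: "organization-local",
    0xB: "community-local",
    0xE: "global",
}


def parse_ipv6_multicast(address):
    """Split an IPv6 multicast address into its flags, scope, and group ID fields."""
    value = int(ipaddress.IPv6Address(address))
    if value >> 120 != 0xFF:
        raise ValueError(f"{address} does not start with 11111111")
    flags = (value >> 116) & 0xF
    scope = (value >> 112) & 0xF
    return {
        "transient": bool(flags & 0x1),  # low-order flag: 0 permanent, 1 transient
        "scope": SCOPES.get(scope, "reserved"),
        "group_id": value & ((1 << 112) - 1),
    }
```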

Page 39:

Joining a mcast group: two-step process

• local: host informs local mcast router of desire to join group: IGMP (Internet Group Management Protocol)
• wide area: local router interacts with other routers to receive mcast datagram flow
  – many protocols (e.g., DVMRP, MOSPF, PIM)

[Figure: IGMP runs between hosts and their local routers; wide-area multicast routing runs among the routers.]

Page 40:

IGMP: Internet Group Management Protocol

• host: sends IGMP report when application joins mcast group
  – IP_ADD_MEMBERSHIP socket option
  – host need not explicitly “unjoin” group when leaving
• router: sends IGMP query at regular intervals
  – host belonging to a mcast group must reply to query

[Figure: router sends query; host sends report.]
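From an application's point of view, joining a group is a single socket option; the operating system then sends the IGMP report. The sketch below assumes a UDP receiver and the default interface (`0.0.0.0`); the helper names are hypothetical.

```python
import socket
import struct


def membership_request(group, interface="0.0.0.0"):
    """Pack the 8-byte ip_mreq structure: group address followed by local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def join_group(group, port):
    """Open a UDP socket and join the multicast group; the kernel emits the IGMP report."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    return sock
```

Closing the socket drops the membership, which fits the slide's point that a host need not explicitly "unjoin".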

Page 41:

How IGMP Works

• on each link, one router is elected the “querier”
• querier periodically sends a Membership Query message to the all-systems group (224.0.0.1), with TTL = 1
• on receipt, hosts start random timers (between 0 and 10 seconds) for each multicast group to which they belong

[Figure: routers (querier marked Q) and hosts on a shared link.]

Page 42:

How IGMP Works (cont.)

• when a host’s timer for group G expires, it sends a Membership Report to group G, with TTL = 1
• other members of G hear the report and stop their timers
• routers hear all reports, and time out non-responding groups

[Figure: querier Q and hosts that are members of group G.]
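The timer-based report suppression can be illustrated with a small simulation. It assumes every host hears every report instantly (no propagation delay) and that timers are uniform on [0, 10] seconds; the function name `simulate_query` is hypothetical.

```python
import random


def simulate_query(members, max_delay=10.0, rng=random):
    """Simulate one Membership Query round.

    members -- dict mapping host name to the set of groups it has joined
    Returns a dict mapping each group to the single host that reports for it.
    """
    timers = []
    for host, groups in members.items():
        for group in groups:
            timers.append((rng.uniform(0, max_delay), host, group))

    reports = {}
    for _, host, group in sorted(timers):   # timers expire in increasing order
        if group not in reports:            # first expiry sends the report ...
            reports[group] = host           # ... and the other members cancel theirs
    return reports
```

Each group ends up with exactly one report per query, regardless of how many members it has on the link.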

Page 43:

Source Specific Multicast

• Source Specific Multicast (SSM): a receiving host specifies (source, mcast group) to join
  – it receives multicast packets addressed to the group only if they are from the specified sender (one-to-many)
• Any Source Multicast (ASM): many-to-many

Page 44:

IGMP

IGMP version 1
• router: Host Membership Query msg broadcast on LAN to all hosts
• host: Host Membership Report msg to indicate group membership
  – randomized delay before responding
  – implicit leave via no reply to Query
• RFC 1112

IGMP v2: additions include
• group-specific Query
• Leave Group msg
  – last host replying to Query can send explicit Leave Group msg
  – router performs group-specific query to see if any hosts left in group
• RFC 2236

IGMP v3:
• Join/Leave specific S in G
• RFC 3376

Page 45:

Multicast Routing: Problem Statement

• Goal: find a tree (or trees) connecting routers having local mcast group members
  – tree: not all paths between routers used
  – source-based: different tree from each sender to rcvrs
  – shared-tree: same tree used by all group members

[Figure: source-based trees; shared tree.]

Page 46:

Approaches for building mcast trees

Approaches:
• source-based tree: one tree per source
  – shortest path trees
  – reverse path forwarding
• group-shared tree: group uses one tree
  – minimal spanning (Steiner)
  – center-based trees

…we first look at basic approaches, then specific protocols adopting these approaches

Page 47:

Shortest Path Tree

• mcast forwarding tree: tree of shortest path routes from source to all receivers
  – Dijkstra’s algorithm

[Figure: source S and routers R1 to R7; links are numbered 1 to 6. Legend: router with attached group member; router with no attached group member; link used for forwarding, where i indicates the order in which the link was added by the algorithm.]
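A sketch of building the forwarding tree with Dijkstra's algorithm follows. The four-node graph and its link costs are hypothetical, since the figure's costs are not recoverable from the transcript; only the technique is taken from the slide.

```python
import heapq


def shortest_path_tree(graph, source):
    """Run Dijkstra's algorithm; return {node: parent} for the tree rooted at source."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if neighbor not in dist or candidate < dist[neighbor]:
                dist[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return parent


def multicast_links(parent, receivers):
    """Keep only the tree links that lie on a path from the source to some receiver."""
    links = set()
    for node in receivers:
        while parent[node] is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links


# Hypothetical topology: node -> {neighbor: link cost}.
GRAPH = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 1, "C": 5},
    "B": {"S": 4, "A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
```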

Page 48:

Reverse Path Forwarding

• rely on router’s knowledge of unicast shortest path from it to sender
• each router has simple forwarding behavior:

    if (mcast datagram received on incoming link on shortest path back to source)
      then flood datagram onto all outgoing links
      else ignore datagram
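The forwarding rule above translates almost directly into code. In this sketch the unicast routing state is a plain dictionary, and the router and link names are hypothetical.

```python
def rpf_forward(router, arrival_link, source, next_hop_to, links):
    """Apply the reverse path forwarding check at one router.

    next_hop_to -- next_hop_to[router][source] is the link on the router's
                   unicast shortest path back to the source
    links       -- links[router] lists all links attached to the router
    Returns the links to flood the datagram on (empty if the check fails).
    """
    if arrival_link != next_hop_to[router][source]:
        return []  # not on the shortest path back to the source: ignore
    return [link for link in links[router] if link != arrival_link]


# Hypothetical state for one router, R4, whose shortest path to S is via R2.
NEXT_HOP_TO = {"R4": {"S": "R2"}}
LINKS = {"R4": ["R2", "R5", "R6"]}
```

A copy arriving from R2 is flooded to R5 and R6; a duplicate arriving from R5 fails the check and is dropped.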

Page 49:

Building the Reverse Path

[Figure: network with the source marked.]

Page 50:

Building a Reverse Path Tree

[Figure: network with the source marked.]

Page 51:

Reverse Path Forwarding: example

• result is a source-specific reverse SPT
  – may be a bad choice with asymmetric links

[Figure: source S and routers R1 to R7. Legend: router with attached group member; router with no attached group member; datagram will be forwarded; datagram will not be forwarded.]

Page 52:

Reverse Path Forwarding: pruning

• forwarding tree contains subtrees with no mcast group members
  – no need to forward datagrams down subtree
  – “prune” msgs sent upstream by router with no downstream group members

[Figure: source S and routers R1 to R7, with three prune messages (P) sent upstream. Legend: router with attached group member; router with no attached group member; prune message; links with multicast forwarding.]
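The effect of pruning can be sketched as a bottom-up walk over the forwarding tree: a router stays on the tree only if it has an attached member or some child that stays. The tree shape and membership below are hypothetical, not read from the figure.

```python
def prune(children, members, root):
    """Return the set of routers that remain on the tree after pruning.

    children -- dict mapping a router to its downstream routers
    members  -- set of routers with attached group members
    """
    keep = set()

    def visit(node):
        needed = node in members
        for child in children.get(node, []):
            if visit(child):        # visit every child, even after one is needed
                needed = True
        if needed:
            keep.add(node)
        return needed               # False corresponds to sending a prune upstream

    visit(root)
    return keep


# Hypothetical flood tree rooted at the source S.
CHILDREN = {"S": ["R1", "R2"], "R1": ["R3"], "R2": ["R4", "R5"], "R5": ["R6", "R7"]}
MEMBERS = {"R3", "R4"}
```

Here R6 and R7 have no members, so they prune; R5 then has no downstream members either and prunes toward R2.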

Page 53:

Shared-Tree: Steiner Tree

• Steiner Tree: minimum cost tree connecting all routers with attached group members
• problem is NP-complete
• excellent heuristics exist
• not used in practice:
  – computational complexity
  – information about entire network needed
  – monolithic: rerun whenever a router needs to join/leave

Page 54:

Center-based trees

• single delivery tree shared by all
• one router identified as “center” of tree
• to join:
  – edge router sends unicast join-msg addressed to center router
  – join-msg “processed” by intermediate routers and forwarded towards center
  – join-msg either hits existing tree branch for this center, or arrives at center
  – path taken by join-msg becomes new branch of tree for this router

Page 55:

Center-based trees: an example

Suppose R6 is chosen as the center:

[Figure: routers R1 to R7 with join paths numbered 1, 2, 3. Legend: router with attached group member; router with no attached group member; path order in which join messages are generated.]
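The join procedure can be sketched as follows: the join message travels hop by hop toward the center and stops as soon as it reaches a router already on the tree. The unicast paths used in the usage lines are hypothetical, since the figure's links are not recoverable from the transcript.

```python
def join(tree, path_to_center):
    """Graft a router onto the shared tree.

    tree           -- set of routers already on the tree (initially just the center)
    path_to_center -- unicast path from the joining router to the center, inclusive
    Returns the list of new tree links created by this join message.
    """
    new_links = []
    for hop, next_hop in zip(path_to_center, path_to_center[1:]):
        if hop in tree:
            break                    # join message hit an existing branch
        tree.add(hop)
        new_links.append((hop, next_hop))
    return new_links


tree = {"R6"}                                  # R6 is the center
first = join(tree, ["R4", "R5", "R6"])         # travels all the way to the center
second = join(tree, ["R7", "R5", "R6"])        # stops at R5, already on the tree
```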

Page 56:

Current Intra-Domain Multicast Routing Protocols

• DVMRP: Distance-Vector Multicast Routing Protocol
  – flood-and-prune, unidirectional per-source trees, builds own routing table
• MOSPF: Multicast Extensions to Open Shortest-Path First Protocol
  – broadcast membership, unidirectional per-source trees, uses OSPF routing table

Page 57:

Current Intra-Domain Multicast Routing Protocols (cont.)

• PIM-DM: Protocol-Independent Multicast, Dense-Mode
  – broadcast-and-prune, unidirectional per-source trees, uses unicast routing table (Protocol Independent)
• PIM-SM: Protocol-Independent Multicast, Sparse-Mode
  – uses meeting places (“rendezvous points”), unidirectional per-group or shared trees, uses unicast routing table (Protocol Independent)
• CBT: Core-Based Trees
  – uses meeting places (“cores”), bidirectional shared trees, uses unicast routing table

Page 58:

The First Intra-Domain Routing Protocol: DVMRP

Page 59:

Distance-Vector Multicast Routing Protocol (DVMRP)

DVMRP consists of two major components:

(1) a conventional distance-vector routing protocol (like RIP) which builds, in each router, a routing table like this:

Subnet (Destination)    Shortest dist (cost)    Via interface (NextHop)
a                       1                       i1
b                       5                       i1
c                       3                       i2
…                       …                       …

(2) a protocol for determining how to forward multicast packets, based on the routing table and routing messages

Page 60:

Example Topology

[Figure: example topology with source s and group members g.]

Page 61:

Phase 1: Truncated Broadcast

[Figure: same topology with source s and group members g.]

Flood iff the packet arrives over the link that is on the shortest path to S

Page 62:

Phase 2: Pruning

[Figure: same topology; two prune (s,g) messages are sent upstream.]

Page 63:

Steady State

[Figure: same topology after pruning, with an additional group member g.]

Page 64:

Grafting on New Receivers

[Figure: a new member sends report (g); two graft (s,g) messages are sent upstream.]

Page 65:

Steady State after Grafting

[Figure: same topology with source s and group members g after grafting.]

Page 66:

Multicast Routing: MOSPF

Page 67:

Multicast OSPF (MOSPF)

• an extension to OSPF (Open Shortest-Path First), a link-state, intra-domain routing protocol; specified in RFCs 1584 & 1585
• multicast-capable routers indicate that capability with a flag in their link-state messages
• routers include in their link-state messages a list of all groups that have members on the router’s directly-attached links (as learned through IGMP)


[Figure: source S1, receivers R1 and R2, routers X, Y and Z]

Link state: each router floods link-state advertisements
Multicast: add membership information to “link state”

Each router then has a complete map of the topology, including which links have members of which multicast groups


[Figure: source S1, receivers R1 and R2, routers X, Y, Z, W, Q and R]

• Z has network map, including membership at X and Y
• Z computes shortest path tree from S1 to X and Y
• Z builds multicast entry with one outgoing interface
• W, Q, R each build multicast entries
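The computation each MOSPF router performs can be sketched as Dijkstra over the link-state database followed by keeping only the branches that lead to members. The link costs and adjacency in `LINKS` are hypothetical (the slide's actual topology is not recoverable from the text), and the source S1 is assumed to be attached to Z.

```python
import heapq

def shortest_path_tree(links, root):
    """Dijkstra over the link-state database; returns each node's parent
    in the shortest-path tree rooted at `root`."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, cost in links[node].items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                parent[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    return parent

def outgoing_interfaces(parent, member_routers):
    """Keep only branches that lead to routers with group members.
    Returns router -> set of downstream neighbors (its outgoing interface list)."""
    oifs = {}
    for router in member_routers:
        node = router
        while parent[node] is not None:
            oifs.setdefault(parent[node], set()).add(node)
            node = parent[node]
    return oifs

# Hypothetical link-state database: router -> {neighbor: cost}.
LINKS = {
    "Z": {"W": 1},
    "W": {"Z": 1, "Q": 1, "R": 1},
    "Q": {"W": 1, "X": 1},
    "R": {"W": 1, "Y": 1},
    "X": {"Q": 1},
    "Y": {"R": 1},
}
tree = shortest_path_tree(LINKS, "Z")         # S1 is assumed to sit behind Z
oifs = outgoing_interfaces(tree, ["X", "Y"])  # X and Y report group members
```

With this assumed topology Z ends up with a single outgoing interface (toward W), and W, Q and R each get their own entries, matching the slide's description. Because every router holds the same map, each one can run the same computation independently.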


[Figure: source S1, receivers R1 and R2, routers X, Y, Z, W, Q and R; link WZ has failed]

A link-state advertisement with new topology (possibly due to a link failure) may require recomputation of the tree and forwarding entry. Link WZ has failed in the diagram.


[Figure: source S1, receivers R1, R2 and R3, routers X, Y, Z, W, Q, R and T]

A link-state advertisement (from T) with new membership (R3) may require incremental computation and the addition of an interface to an outgoing interface list (at Z). Similarly, the disappearance of a membership may cause deletion of an interface from an outgoing interface list. Link WZ is back to normal.


Multicast Routing: PIM


Protocol Independent Multicast (PIM)

• “Protocol Independent”
  – does not perform its own routing information exchange
  – uses unicast routing table made by any of the existing unicast routing protocols

• PIM-DM (Dense Mode) - similar to DVMRP, but:
  – without the routing information exchange part
  – differs in some minor details

• PIM-SM (Sparse Mode), or just PIM - instead of directly building per-source, shortest-path trees:
  – initially builds a single (unidirectional) tree per group, shared by all senders to that group
  – once data is flowing, the shared tree can be converted to a per-source, shortest-path tree if needed


PIM Protocol Overview

• Basic protocol steps

– routers with local members send Join messages towards a Rendezvous Point (RP) to join shared tree

– routers with local sources encapsulate data to RP

– routers with local members may initiate data-driven switch to source-specific, shortest-path tree
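The first step, joining the shared tree, can be sketched as a hop-by-hop walk toward the RP that installs (*,G) state along the way. This is a simplified model: `NEXT_HOP_TO_RP` stands in for the unicast routing table, the router names and adjacencies are hypothetical, and periodic refresh, Register handling, and the switch to source trees are left out.

```python
def join_shared_tree(next_hop_to_rp, state, router, group):
    """Propagate a (*,G) Join from a router with local members toward the RP.
    Each hop records the interface the Join arrived on; the Join stops at the
    first router that is already on the tree for this group.
    Returns the routers that created new (*,G) state."""
    created = []
    downstream = "local"  # the last-hop router's member-facing interface
    node = router
    while node is not None:
        entry = state.setdefault(node, {})
        already_on_tree = group in entry
        entry.setdefault(group, set()).add(downstream)
        if already_on_tree:
            break  # upstream routers already forward this group here
        created.append(node)
        downstream = node
        node = next_hop_to_rp[node]
    return created

# Hypothetical unicast next hops toward the RP (None marks the RP itself).
NEXT_HOP_TO_RP = {"RP": None, "R4": "RP", "R1": "R4", "R2": "R4", "R3": "RP"}

state = {}  # router -> {group: set of outgoing interfaces}
first = join_shared_tree(NEXT_HOP_TO_RP, state, "R1", "G")
second = join_shared_tree(NEXT_HOP_TO_RP, state, "R2", "G")
third = join_shared_tree(NEXT_HOP_TO_RP, state, "R3", "G")
```

The first Join travels all the way to the RP; later Joins stop as soon as they meet the existing tree, which is why R2's Join only creates state at R2 itself.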


Phase 1: Build Shared Tree

[Figure: RP and routers R1, R2, R3, R4; Join G messages travel toward the RP; the shared tree is shown after R1, R2 and R3 join]


Phase 2: Sources Send to RP

[Figure: RP, routers R1, R2, R3, R4, sources S1 and S2; a unicast-encapsulated data packet (PIM Register) is sent to the RP; the RP decapsulates it and forwards it down the shared tree]


Phase 3: Stop Encapsulation

[Figure: RP, routers R1, R2, R3, R4, sources S1 and S2; “Join G for S1” and “Join G for S2” messages; resulting (S1,G), (S2,G) and (*,G) entries]


Phase 4: Switch to Shortest Path Tree

[Figure: RP, routers R1, R2, R3, R4, sources S1 and S2; Join messages travel toward S2; the shared tree is also shown]


Phase 5: Prune (S2 off) Shared Tree

[Figure: RP, routers R1, R2, R3, R4, sources S1 and S2; the S2 distribution tree and the shared tree]

Prune S2 off the shared tree where the incoming interfaces (iif) of the S2 and RP entries differ


Bidirectional Trees (BIDIR-PIM)

[Figure: Join messages travel toward the RP address’s link]


Bidirectional Trees (BIDIR-PIM)

• Bidirectional PIM (BIDIR-PIM): Enhancement to PIM for many-to-many
• A router in shared tree:
  – Receive a mcast packet from a down-stream branch
  – Forward it both up the tree and down other branches
• BIDIR-PIM vs. PIM-SM shared tree
  – more direct but more state information
• BIDIR-PIM vs. PIM-SM source tree
  – Longer than PIM-SM source tree, but less state (no source-specific tree)
• What to optimize: bandwidth usage, router state, path length, etc.
• What sort of application to support: one-to-many, many-to-many, etc.
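The bidirectional forwarding rule above amounts to "send on every tree interface except the one the packet arrived on". A minimal sketch, with hypothetical interface names and no modelling of designated-forwarder election:

```python
def bidir_forward(upstream, downstream_branches, arrived_on):
    """Return the tree interfaces a BIDIR router forwards a packet to:
    all tree interfaces (upstream and downstream) except the arrival one."""
    tree_interfaces = set(downstream_branches)
    if upstream is not None:  # the router on the RP link has no upstream
        tree_interfaces.add(upstream)
    return tree_interfaces - {arrived_on}

# Hypothetical router with an upstream interface and two downstream branches.
from_branch = bidir_forward("up", {"d1", "d2"}, arrived_on="d1")
from_upstream = bidir_forward("up", {"d1", "d2"}, arrived_on="up")
```

A packet arriving from branch `d1` goes both up the tree and down `d2`; a packet arriving from upstream goes down both branches. No per-source state is consulted in either case.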


Inter-domain Multicast Routing


What Exactly is Needed?

• inter-domain route exchange protocol

• mechanism for connecting domains
  – two models:
    • discover sources using “source announcing” protocol
    • know the source(s) a priori


Inter-Domain Route Exchange

• Exchange multicast reachability between Autonomous Systems (AS)
  – Just like unicast routes are exchanged with BGP
  – Protocol is “Multiprotocol extensions to BGP” (RFC 2283)
    • Also known as “Multicast” BGP (MBGP)
    • Also known as BGP4+

• MBGP is available and deployed today.
  – Multiple vendors: Juniper, Cisco, Nortel, etc.


What Exactly is Needed?

• inter-domain route exchange protocol

• mechanism for connecting domains
  – two models:
    • discover sources using “source announcing” protocol
    • know the source(s) a priori


The Internet Solution

• Re-use existing protocols/solutions
  – Use PIM-SM in the inter-domain

• The challenge is to avoid “root dependencies”
  – A root/RP/core is in one domain, but there are no active group participants (sources or receivers) in that domain
  – Root dependencies can lead to political problems and inefficiencies


The Internet Solution (cont)

• The key: Establish a root/RP/core per domain
  – No “root dependencies”

• Remember the problem:
  – Connecting sources and receivers
  – Solution: Multicast Source Discovery Protocol (MSDP)

• MSDP is the last piece of the puzzle; is simple to implement; and yields an interim solution to inter-domain multicast


MSDP -- Basic Idea

• MSDP advertises multicast sources to other domains

• Other domains decide if group members are active and find a way to get the data

• “MSDP connects shared-trees together”

• MSDP typically runs in the RP


MSDP - Elements of Operation

• Receivers in a domain join the shared-tree

• The RP is known only to routers in the domain

• When a source goes active in a domain, its packets get to the RP in that domain

• The RP sends a Source-Active (SA) message identifying the source and group it sends to


MSDP - Elements of Operation (cont)

• How to get SA messages to all MSDP peers?

– Need MSDP topology flooding protocol

– The RP’s address is also in the SA message to accommodate “peer-RPF” like flooding

– Each MSDP peer receives SA message and forwards away from the originating RP


MSDP - Elements of Operation (cont)

• Each MSDP speaking RP will examine SA message to see if any local members are joined to the group

• If so, the RP joins to source described in SA message

• Otherwise, the SA message is ignored (Flood-and-Join model)
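The flood-and-join behaviour of an MSDP-speaking RP can be sketched as a single handler for an incoming SA message. This is a deliberate simplification: the real peer-RPF rules are more involved, and `rpf_peer` (the peer this RP would use to reach a given originating RP), the peer names, and the addresses are all hypothetical.

```python
def handle_sa(sa, from_peer, peers, rpf_peer, local_groups):
    """Process a Source-Active message at one RP.
    sa is (source, group, originating_rp). Returns (peers to forward to,
    (source, group) to join or None)."""
    source, group, origin_rp = sa
    # Peer-RPF-like check: accept only from the peer on the path toward the
    # originating RP, so flooded copies move away from it and do not loop.
    if rpf_peer.get(origin_rp) != from_peer:
        return [], None
    forward_to = [p for p in peers if p != from_peer]
    # Join toward the source only if local members have joined the group.
    join = (source, group) if group in local_groups else None
    return forward_to, join

# Hypothetical RP in domain B with three MSDP peers.
PEERS = ["A", "C", "D"]
RPF_PEER = {"A": "A"}  # SAs originated by RP A must arrive from peer A
SA = ("10.0.0.5", "224.1.1.1", "A")

with_members = handle_sa(SA, "A", PEERS, RPF_PEER, {"224.1.1.1"})
without_members = handle_sa(SA, "A", PEERS, RPF_PEER, set())
wrong_peer = handle_sa(SA, "C", PEERS, RPF_PEER, {"224.1.1.1"})
```

An SA that passes the check is always forwarded on, whether or not the RP has local members; the membership test only decides whether the RP also joins toward the source.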


How MSDP works with PIM-SM

[Figure: four domains A, B, C and D, each with an RP, with one Source and one Receiver. Legend: MSDP peer, physical link, PIM message, MSDP message. Steps shown: 1: Register; 2a: Join; 2b: SA (sent to MSDP peers); 3: Join]


What Exactly is Needed?

• inter-domain route exchange protocol

• mechanism for connecting domains
  – two models:
    • discover sources using “source announcing” protocol
    • know the source(s) a priori: SSM model


Source Specific Multicast (SSM)

• Basic idea:
  – Assumes receiver knows the source(s)
  – Reverse SPT join to source

• No RPs or MSDP
  – About as straightforward as you can get!
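The difference from the any-source model can be sketched in a few lines: an SSM receiver subscribes to a (source, group) channel, so the join has a known destination and traffic from other sources is not delivered. The function names and addresses below are hypothetical and only illustrate the idea.

```python
def join_destination(source, rp):
    """Where a Join is routed: toward the source when it is known (SSM, an
    (S,G) join), otherwise toward the RP (any-source, a (*,G) join)."""
    return source if source is not None else rp

def asm_delivers(joined_groups, packet_source, packet_group):
    """Any-source model: every sender to a joined group is delivered."""
    return packet_group in joined_groups

def ssm_delivers(joined_channels, packet_source, packet_group):
    """Source-specific model: only the subscribed (source, group) channel is delivered."""
    return (packet_source, packet_group) in joined_channels

# Hypothetical subscription to one channel.
channels = {("10.0.0.5", "232.1.1.1")}
wanted = ssm_delivers(channels, "10.0.0.5", "232.1.1.1")
unwanted = ssm_delivers(channels, "10.9.9.9", "232.1.1.1")
```

Because the destination of the join is the source itself, no RP or source-discovery step is needed, which is the point the slide makes.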


How SSM Works

[Figure: domains A, B, C and D connected by physical links, with one Source and one Receiver; PIM Join messages travel hop by hop from the receiver toward the source]


Source Specific Multicast

• Advantages
  – Minor changes to existing infrastructure: still use PIM-SM
  – No PIM-SM RP, or MSDP

• Limitations
  – Requires modifications (last hop routers) and IGMPv3
  – May be difficult to support some applications

• Thoughts
  – Works for 9x% of killer-apps -- need mechanism (WWW) to let receivers know who sources are
  – Success will depend on seamless migration strategy


Today’s Homework

• Peterson & Davie, Chap 4: 4.46, 4.56, 4.61

Due next Friday 3/6
Reminder: Midterm 3/13

Office hour: 3:30PM-4:40PM, 3/6 (CoRE 510)