Lecture 11: Addressing, Framing, and Switching in the Link Layer
CS 3035/GZ01: Networked Systems
Kyle Jamieson
Department of Computer Science, University College London
2
The link layer: Functionality
• Enables the exchange of messages (frames) between end hosts
• Functionality:
  1. Framing: Determine start and end of bits and frames
  2. Error control: Detect and/or correct errors
  3. Reliable delivery: Deliver frames exactly once
  4. Medium access control: Control hosts’ access to a shared medium, if applicable
Networked Systems 3035/GZ01
[Figure: the sending host places an IP datagram in a frame; the link-layer protocol carries the frame to the receiving host]
3
Today
• We finish the functionality of the link layer, and tie it in to IP
1. Framing and addressing
2. Repeaters, hubs, and switches
3. Bootstrapping a host
4
Framing frames
• We have seen how to encode bits on a link
  – Ethernet’s Manchester encoding
  – Result: An infinite stream of bits on a link
• But two hosts connected on the same physical medium need to be able to exchange frames
  – Service provided by the link layer
  – Implemented by the network adaptor
• Problem: how does the link layer determine where each frame begins and ends? (…how hard can that be?)
5
Simple approach to framing: count bytes
• Sender includes number of bytes in header
• Receiver extracts this number of bytes of body
• But what if the Count field is corrupted?
  – L2 will frame the wrong bytes
  – This is called a framing error
  – With high probability, CRC will detect the framing error and discard that frame, but:
[Figure: sent frames with Count fields 53 and 80, labelled “53 bytes of data” and “21 bytes of data”; with a bogus Count field of 61, the receiver misdelivers 61 bytes of data and then ??? bytes of data]
• This state of persistent framing errors is called desynchronization
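A minimal Python sketch of count-based framing, showing how one corrupted Count field can desynchronize the receiver. The one-byte Count field and the names `frame` and `deframe` are assumptions made for illustration; the slides do not fix a header layout, and no CRC is modelled.

```python
def frame(bodies):
    """Sender: prefix each body with a one-byte Count field (assumed layout)."""
    return b"".join(bytes([len(body)]) + body for body in bodies)


def deframe(stream):
    """Receiver: read a Count, take that many body bytes, and repeat."""
    bodies, i = [], 0
    while i < len(stream):
        count = stream[i]
        bodies.append(stream[i + 1 : i + 1 + count])
        i += 1 + count
    return bodies


sent = [b"A" * 5, b"B" * 3, b"C" * 4]
wire = bytearray(frame(sent))

# An uncorrupted stream is framed correctly.
clean = deframe(bytes(wire))

# Corrupt the first Count field (5 becomes 6). The receiver now swallows the
# next Count byte as data and then interprets a data byte as a Count.
wire[0] = 6
received = deframe(bytes(wire))
```

After the corruption, none of the received bodies matches a frame that was sent, which is the persistent misframing the slide calls desynchronization.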
6
Desynchronization
• Once framing on a link is desynchronized, it can stay that way
• Need a method to resynchronize
• But once we have that method, why use counting?
7
Framing with sentinel bytes
• Delineate beginning of frame with special byte (SYN)
• Delineate end of frame with another special byte (ETX)
• What if sentinel occurs in data?
  – Byte stuffing: insert another special “escape” byte DLE before sentinel
• What if any of the above escape characters occur in data?
  – Byte stuffing again: Stuff DLE before DLE occurring in data
• Example:
[Figure: a frame is laid out as SYN | Frame contents | ETX; the example stuffed frame is SYN | DLE, SYN, DLE, DLE, DLE, ETX | ETX]
• Can we be more efficient?
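The byte-stuffing rules above can be sketched in Python as follows. The numeric values are the standard ASCII control codes for SYN, ETX, and DLE; the function names `stuff` and `unstuff` are hypothetical, and this is an illustration rather than any particular protocol's implementation.

```python
SYN, ETX, DLE = 0x16, 0x03, 0x10  # ASCII control codes


def stuff(payload: bytes) -> bytes:
    """Wrap payload in SYN ... ETX, escaping SYN, ETX, and DLE with DLE."""
    out = bytearray([SYN])
    for b in payload:
        if b in (SYN, ETX, DLE):
            out.append(DLE)  # stuffed escape byte
        out.append(b)
    out.append(ETX)
    return bytes(out)


def unstuff(framed: bytes) -> bytes:
    """Strip the sentinels and remove each stuffed DLE."""
    if len(framed) < 2 or framed[0] != SYN or framed[-1] != ETX:
        raise ValueError("missing SYN/ETX sentinels")
    out, escaped = bytearray(), False
    for b in framed[1:-1]:
        if b == DLE and not escaped:
            escaped = True  # drop the escape, keep the next byte verbatim
            continue
        out.append(b)
        escaped = False
    return bytes(out)


payload = bytes([SYN, DLE, ETX])
framed = stuff(payload)
```

With the payload SYN, DLE, ETX, the stuffed contents are DLE, SYN, DLE, DLE, DLE, ETX, matching the slide's example. In the worst case every payload byte needs an escape, which doubles the frame size.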
8
Framing with sentinel bits
• Delineate frame with special bit pattern
  – e.g., 01111110 start, 01111111 end
• Problem: what if sentinel occurs within frame?
• Solution: bit stuffing
  – Sender always inserts a 0 after five 1s in the frame contents
  – Receiver always removes a 0 appearing after five 1s
[Figure: frame layout 01111110 | Frame contents | 01111111]
9
When receiver sees five 1s…
• If next bit 0, remove it, and begin counting again
  – Because this must be a stuffed bit; we can’t be at beginning/end of frame (those had six or seven 1s)
• If next bit 1 (i.e., we’ve seen six 1s) then:
  – If following bit is 0, this is start of frame
    • Because the receiver has seen 01111110
  – If following bit is 1, this is end of frame
    • Because the receiver has seen 01111111
10
Example: sentinel bits
• Original data, including start/end of frame:
  01111110011111101111101111100101111111
• Sender rule: five 1s → insert a 0
  – After bit stuffing at the sender:
  01111110011111010111110011111000101111111
• Receiver rule: five 1s and next bit 0 → remove 0
  01111110011111101111101111100101111111
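A short Python sketch of the sender and receiver rules, checked against the bit strings in the example above. Bits are represented as characters in a string purely for readability, and the names `stuff_bits` and `unstuff_bits` are assumptions for illustration.

```python
START, END = "01111110", "01111111"


def stuff_bits(contents: str) -> str:
    """Sender rule: after five consecutive 1s in the contents, insert a 0."""
    out, run = [], 0
    for bit in contents:
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            out.append("0")  # stuffed bit
            run = 0
    return "".join(out)


def unstuff_bits(stuffed: str) -> str:
    """Receiver rule: after five consecutive 1s, drop the 0 that follows."""
    out, run, i = [], 0, 0
    while i < len(stuffed):
        bit = stuffed[i]
        out.append(bit)
        run = run + 1 if bit == "1" else 0
        if run == 5:
            i += 1  # skip the stuffed 0
            run = 0
        i += 1
    return "".join(out)


original = "01111110011111101111101111100101111111"
contents = original[len(START):-len(END)]  # strip the two sentinels
on_wire = START + stuff_bits(contents) + END
```

Only the frame contents are stuffed; the sentinels are added afterwards, which is why six or seven consecutive 1s can only ever mark a frame boundary.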
11
Comparing addressing schemes
• Network layer address (IP address)
  – Function: move datagram to destination network
  – 32-bit address, dotted quad notation a.b.c.d where each component is an eight-bit unsigned integer
  – Hierarchical address space
• Link layer address (MAC address, Ethernet address):
  – Function: move frame from one point to another point on the same network
  – Unique 48-bit address (in most LANs)
  – Burned in NIC ROM, also sometimes software settable
  – Usually a flat address space
12
Ethernet addresses
• 48-bit source and destination addresses
  – Receiver’s link layer passes frame up to network-level protocol:
    • If destination address matches the adaptor’s
    • Or the destination address is the broadcast address (ff:ff:ff:ff:ff:ff)
    • Or the card is in a mode of operation that receives all frames (promiscuous mode)
  – Addresses are globally unique
    • Assigned by NIC vendors (top three bytes specify vendor)
13
Today
• We finish the functionality of the link layer, and tie it in to IP
1. Framing and addressing
2. Repeaters, hubs, and switches
  – Comparison
  – Self-learning switches
  – The Spanning Tree Protocol
3. Bootstrapping a host
14
Message, segment, datagram, and frame
[Figure: two hosts, each with an HTTP / TCP / IP / Ethernet interface stack, communicate through two routers, each running IP over an Ethernet interface and a SONET interface. The HTTP message and the TCP segment travel host to host; the IP datagram is forwarded hop by hop; the three links carry an Ethernet frame, a SONET frame, and an Ethernet frame.]
15
Different devices switch on different information
• Routers: forward IP datagrams based on network-layer addresses in the IP header
• Switches (Bridges): forward link-layer frames based on link-layer addresses in the link-layer header
• Repeaters/Hubs: rebroadcast all bits in the physical-layer frame
[Figure: a hub (physical layer) handles the physical-layer frame (H H H H data); a switch (link and physical layers) handles the link-layer frame (H H H data); a router (network, link, and physical layers) handles the IP datagram (H H data)]
16
Physical Layer: Repeaters
• Distance limitation in local-area networks
  – Electrical signal becomes weaker as it travels
  – Imposes a limit on the length of a LAN
    • In addition to limit imposed by collision detection
• Repeaters join LANs together
  – Analog electronic device
  – Continuously monitors electrical signals on each LAN
  – Transmits an amplified copy
[Figure: a repeater joining two LANs]
17
Physical Layer: Hubs
• Join multiple input lines electrically
  – Do not necessarily amplify the signal
• Very similar to repeaters
  – Also operate at the physical layer
[Figure: several hubs connected to one another]
18
Limitations of repeaters and hubs
• One large place where packets collide (collision domain), since every bit is sent everywhere
  – So, aggregate throughput is limited
  – e.g., three departments each get 10 Mbps independently
  – … and then, if connected via a hub, must share 10 Mbps
• Cannot support multiple LAN technologies
  – Repeaters/hubs do not buffer or interpret frames
  – So, can’t interconnect between different rates or formats
    • e.g., no mixing 100 Mbit/s Ethernet and Gigabit Ethernet
• Limitations on maximum nodes and distances
  – Does not circumvent limitations of the shared medium
  – e.g., still cannot go beyond 2500 m in commercial Ethernet
19
Link Layer: Switches
• Switches also connect two or more LANs, but at the link layer
  – Extract destination address from the frame
  – Look up the destination in a table
  – Forward the frame to the appropriate LAN segment
    • Or point-to-point link, for higher-speed Ethernet
• Each port is its own collision domain (if not just a link)
[Figure: an extended LAN in which a switch joins a hub-based segment to another segment; each segment is its own collision domain]
20
Switches and concurrent communication
• Host A can talk to C, while B talks to D
• If host has (dedicated) point-to-point link to switch:
  – Full duplex: each connection can send in both directions
  – Completely avoids collisions
    • No need for carrier sense, collision detection, and so on
    • Change in medium access control, but same framing
[Figure: hosts A, B, C, and D each connected to one switch]
21
Switches: Advantages over hubs and repeaters
• Only forwards frames as needed
  – Filters frames to avoid unnecessary load on segments
  – Sends frames only to segments that need to see them
• Extends the geographic span of the network
  – Separate collision domains allow longer distances
• Improves privacy by limiting scope of frames
  – Hosts can “snoop” the traffic traversing their segment
  – … but not all the rest of the traffic
• Applies CSMA/CD in segment (not whole net)
  – Smaller collision domain
• Joins segments using different technologies
22
Disadvantages over hubs and repeaters
• Higher cost
  – More complicated devices that cost more money
• Delay in forwarding frames
  – Bridge/switch must receive and parse the frame
  – … and perform a look-up to decide where to forward
  – Introduces store-and-forward delay
    • Can ameliorate using cut-through switching
      – Start forwarding after only header received
• Need to learn where to forward frames
  – Bridge/switch needs to construct a forwarding table
  – Ideally, without intervention from network administrators
  – Solution: Self-learning algorithm
23
Motivation for self learning
• Benefit if switch forwards frame only on segment(s) that need it
  – Allows concurrent use of other links
• Switch forwarding table
  – Maps destination link-layer address to outgoing interface
  – Goal: construct the switch table automatically
[Figure: hosts A, B, C, and D each connected to one switch]
24
Self learning algorithm: Building the table
• When a frame (e.g., from A to B) arrives at the switch:
  – Inspect the source link-layer address
    • Associate that address with the incoming switch port
    • Store the mapping in the switch table
    • Use a time-to-live field to forget the mapping once that amount of time has elapsed
  – This is an example of soft state
Switch just learned how to reach A.
[Figure: hosts A, B, C, and D on a switch with ports numbered 1 to 4; the frame “A → B, data” arrives on port 1]
Switch forwarding table:
Address  Port  Time-to-live
A        1     2 minutes
25
Self learning algorithm: Handling misses
• When frame arrives with unfamiliar destination (e.g., B)
  – Forward the frame out all ports except for the one on which the frame arrived
    • This is called flooding
  – Hopefully, this case won’t happen very often
    • When e.g. B replies, switch will learn that node, too
[Figure: hosts A, B, C, and D on a switch with ports numbered 1 to 4; the frame “A → B, data” arrives on port 1]
Switch forwarding table:
Address  Port  Time-to-live
A        1     2 minutes
26
Self-learning algorithm
When switch receives a frame:
  index into the forwarding table using link-layer destination address
  if entry found for destination {
    if dest on segment from which frame arrived
      then drop frame
      else forward frame on interface indicated
  } else flood the frame
    (forward on all ports except the port on which the frame arrived)
Problems?
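The learning and forwarding steps from the last three slides can be combined into one small Python sketch. The class name `LearningSwitch`, the `receive` method, and the injectable `clock` parameter are assumptions made for illustration; the 120-second default mirrors the "2 minutes" time-to-live shown in the forwarding table, and real switches differ in many details.

```python
import time


class LearningSwitch:
    """Toy self-learning switch: learn from sources, forward or flood by destination."""

    def __init__(self, ports, ttl=120.0, clock=time.monotonic):
        self.ports = list(ports)
        self.ttl = ttl          # soft-state lifetime of a table entry, in seconds
        self.clock = clock      # injectable so the sketch can be tested
        self.table = {}         # link-layer address -> (port, expiry time)

    def receive(self, src, dst, in_port):
        """Return the list of ports on which the frame is sent out."""
        now = self.clock()
        # Learning step: the source is reachable through the arrival port.
        self.table[src] = (in_port, now + self.ttl)

        entry = self.table.get(dst)
        if entry is not None and entry[1] <= now:
            del self.table[dst]  # entry aged out: forget it
            entry = None

        if entry is None:
            # Unfamiliar destination: flood on all ports except the arrival port.
            return [p for p in self.ports if p != in_port]
        out_port = entry[0]
        if out_port == in_port:
            return []            # destination is on the arrival segment: drop
        return [out_port]


switch = LearningSwitch(ports=[1, 2, 3, 4])
first = switch.receive("A", "B", in_port=1)   # B unknown, so the frame is flooded
reply = switch.receive("B", "A", in_port=2)   # A was learned from the first frame
```

The broadcast address never appears as a source address, so it is never learned and frames sent to it are always flooded.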
27
Flooding can lead to loops
• Switches sometimes need to flood frames:
  – Upon receiving a frame with an unfamiliar destination
  – Upon receiving a frame sent to the broadcast address
• Flooding can lead to forwarding loops
  – e.g., if the network contains a cycle of switches
    • Either accidentally, or by design for higher reliability
• This is catastrophic, for two reasons:
  1. Unlike IP, layer 2 has no way of preventing frame looping (Ethernet frames carry no time-to-live field)
  2. Ethernet duplicates frames, leading to an exponential increase, quickly crashing the extended LAN (this is called a broadcast storm)
How can we revise the bridge learning algorithm to avoid broadcast storms?
28
The spanning tree protocol (STP)
• Early 1980s: Digital Equipment Corporation, a key Ethernet vendor, wanted to leverage the benefits of loops while avoiding broadcast storms
• Radia Perlman’s idea: Switches agree on a loop-free and connected spanning tree
  – Spanning tree: a sub-graph that touches all vertices but contains no cycles
• Once the spanning tree is formed:
  – Switches use the switch learning algorithm to forward data frames over the tree links only
[Figure: a graph with cycles, and a spanning tree of it, which has no cycles]
29
Spanning Tree Protocol (STP): Overview
• Users connect Ethernet switches and shared-medium Ethernet LANs together
  – Arbitrarily, possibly creating forwarding loops
• Need a distributed algorithm so that:
  1. Switches cooperate to build the spanning tree
  2. Switches adapt automatically when failures occur
[Figure: example topology of four switches, numbered 1 to 4]
30
STP: Key ingredients of the algorithm
• Switches elect one root switch from which to build the tree
  – Switch identifier = link-layer address on one port
• Switches block some ports from sending or receiving frames of Ethernet type IP (or other L3 data)
• To form tree, switches exchange configuration messages (R, d, X):
  – From switch X
  – Proposing switch R (which is d hops away) as the root
  – Configuration messages are never blocked
[Figure: the four-switch topology with switch 1 as the root switch and blocked ports (B) marked at switches 2 and 3]
Let’s begin with a simplified version of the full STP distributed algorithm
31
Simplified STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
[Figure: switch X with Root id: X]
32
Simplified STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure: switch states]
Switch 1: Root id: 1
Switch 2: Root id: 2
Switch 3: Root id: 3
Switch 4: Root id: 4
33
Simplified STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
• Switch 2 sends (2, 0, 2); switch 3 sets its root id to 2, switch 1 ignores
Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure: switch 2 sends (2, 0, 2) from each of its three ports]
Switch 1: Root id: 1
Switch 2: Root id: 2
Switch 3: Root id: 2
Switch 4: Root id: 4
34
Simplified STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
• Switch 1 sends (1, 0, 1); switches 2 and 3 set their root ids to 1
Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure: switch 1 sends (1, 0, 1) from each of its two ports]
Switch 1: Root id: 1
Switch 2: Root id: 1
Switch 3: Root id: 1
Switch 4: Root id: 4
35
Simplified STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
• Switch 3 sends (3, 0, 3); switch 4 sets its root id to 3, others ignore
Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure: switch 3 sends (3, 0, 3) from each of its three ports]
Switch 1: Root id: 1
Switch 2: Root id: 1
Switch 3: Root id: 1
Switch 4: Root id: 3
36
STP: Startup and calculating the root
• Note: Initially, each switch X periodically sends (X, 0, X) from all its ports
• Switch 4 sends (4, 0, 4); switch 3 ignores
Root ID rule: Root ID r at switch X is the minimum of X and root IDs received at all ports
[Figure: switch 4 sends (4, 0, 4) from its one port]
Switch 1: Root id: 1
Switch 2: Root id: 1
Switch 3: Root id: 1
Switch 4: Root id: 3
Not yet agreeing on the identity of the root: let’s now see how switches propagate information through the network
37
Simplified STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
  2. Its configuration message to send
    – Initially, announcing itself as root with zero distance to root: (X, 0, X)
[Figure: switch X with Root id: X and Msg: (X, 0, X)]
38
Simplified STP: Calculating the message
• Switch X finds its distance from the root (d):
  1. If X thinks it is the root, d ← 0
  2. Otherwise, d ← the minimum distance from messages received matching X’s root id (call it r), plus one
Configuration message rule: Switch X sets its configuration message to (r, d, X). If configuration message changes, sends updated message immediately
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1, 0, 1)
Switch 2: Root id: 2; Msg: (2, 0, 2)
Switch 3: Root id: 3; Msg: (3, 0, 3)
Switch 4: Root id: 4; Msg: (4, 0, 4)
39
Simplified STP: Calculating the message
• Switch X finds its distance from the root (d):
  1. If X thinks it is the root, d ← 0
  2. Otherwise, d ← the minimum distance from messages received matching X’s root id (call it r), plus one
• Switch 1 sends (1, 0, 1), switches 2 and 3 update their root ids and msgs
Configuration message rule: Switch X sets its configuration message to (r, d, X). If configuration message changes, sends updated message immediately
[Figure: switch 1 sends (1, 0, 1) from each of its two ports]
Switch 1: Root id: 1; Msg: (1, 0, 1)
Switch 2: Root id: 1; Msg: (1, 1, 2)
Switch 3: Root id: 1; Msg: (1, 1, 3)
Switch 4: Root id: 3; Msg: (4, 0, 4)
40
Simplified STP: Calculating the message
• Switch X finds its distance from the root (d):
  1. If X thinks it is the root, d ← 0
  2. Otherwise, d ← the minimum distance from messages received matching X’s root id (call it r), plus one
• Switch 3 sends (1, 1, 3), switch 4 updates its root id and message
Configuration message rule: Switch X sets its configuration message to (r, d, X). If configuration message changes, sends updated message immediately
[Figure: switch 3 sends (1, 1, 3) to switch 4]
Switch 1: Root id: 1; Msg: (1, 0, 1)
Switch 2: Root id: 1; Msg: (1, 1, 2)
Switch 3: Root id: 1; Msg: (1, 1, 3)
Switch 4: Root id: 1; Msg: (1, 2, 4)
41
Simplified STP: Calculating the message
• Switch X finds its distance from the root (d):
  1. If X thinks it is the root, d ← 0
  2. Otherwise, d ← the minimum distance from messages received matching X’s root id (call it r), plus one
Configuration message rule: Switch X sets its configuration message to (r, d, X)
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1, 0, 1)
Switch 2: Root id: 1; Msg: (1, 1, 2)
Switch 3: Root id: 1; Msg: (1, 1, 3)
Switch 4: Root id: 1; Msg: (1, 2, 4)
Now all switches agree on the root identifier. But how do they decide which ports to block to form the spanning tree?
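The root ID rule and the configuration message rule can be exercised with a small Python sketch that repeats both rules until no message changes. The link list `LINKS` is an assumption inferred from the messages on the slides (1-2, 1-3, 2-3, 3-4); the real figure also includes a shared LAN, which is modelled here as plain links. The function name `simplified_stp` is hypothetical, and message timing, ports, and failures are ignored.

```python
# Assumed topology: each pair is a direct connection between two switches.
LINKS = [(1, 2), (1, 3), (2, 3), (3, 4)]


def simplified_stp(links):
    """Iterate the two rules of simplified STP until every message is stable."""
    switches = sorted({s for link in links for s in link})
    neighbours = {s: set() for s in switches}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Each switch X starts by announcing itself as root: (X, 0, X).
    msg = {x: (x, 0, x) for x in switches}

    changed = True
    while changed:
        changed = False
        for x in switches:
            heard = [msg[n] for n in neighbours[x]]
            # Root ID rule: minimum of X and all root IDs received.
            r = min([x] + [m[0] for m in heard])
            if r == x:
                d = 0
            else:
                # Minimum distance among messages naming root r, plus one.
                d = min(m[1] for m in heard if m[0] == r) + 1
            # Configuration message rule.
            if msg[x] != (r, d, x):
                msg[x] = (r, d, x)
                changed = True
    return msg


final = simplified_stp(LINKS)
```

Under the assumed topology, the stable messages are the ones shown on the slide: every switch names switch 1 as root, at distances 0, 1, 1, and 2.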
42
STP: Port status
• All switches connected to an Ethernet LAN (or the two at the ends of a cable) agree on a single “designated” port
  – The designated port forwards frames from the LAN to the root
  – Only designated ports send configuration messages
Designated port: The port on the shortest path from the LAN or cable to the root is the designated port (D)
[Figure: switch states, with designated ports marked D]
Switch 1: Root id: 1; Msg: (1, 0, 1); ports: D, D
Switch 2: Root id: 1; Msg: (1, 1, 2); ports: D
Switch 3: Root id: 1; Msg: (1, 1, 3); ports: D
Switch 4: Root id: 1; Msg: (1, 2, 4)
43
STP: Port status
Root port: Each non-root switch notes which of its ports is on the shortest path to the root; this port is the root port (R)
[Figure: switch states, with designated (D) and root (R) ports marked]
Switch 1: Root id: 1; Msg: (1, 0, 1); ports: D, D
Switch 2: Root id: 1; Msg: (1, 1, 2); ports: D, R
Switch 3: Root id: 1; Msg: (1, 1, 3); ports: R, D
Switch 4: Root id: 1; Msg: (1, 2, 4); ports: R
44
STP: Port status
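Blocked port: If neither designated nor root, a port is a blocked port (B), not forwarding data traffic.
[Figure: switch states, with designated (D), root (R), and blocked (B) ports marked]
Switch 1: Root id: 1; Msg: (1, 0, 1); ports: D, D
Switch 2: Root id: 1; Msg: (1, 1, 2); ports: D, B, R
Switch 3: Root id: 1; Msg: (1, 1, 3); ports: B, R, D
Switch 4: Root id: 1; Msg: (1, 2, 4); ports: R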
45
STP: State at each switch
• Each switch X keeps the following state:
  1. Its view of who the root is
    – Initially, itself: X
  2. Its configuration message to send
    – Initially, announcing itself as root with zero distance to root: (X, 0, X)
  3. For each of X’s ports:
    – Whether designated (D), root (R), or blocking (B) data traffic
      • Initially, designated (D)
    – “Best” configuration message heard on that port
      • Initially, its own configuration message (X, 0, X)
[Figure: switch X with Root id: X, Msg: (X, 0, X), and port state D: (X, 0, X)]
46
STP: Designated port rule
• At a switch, for each port p:
  – Consider all configuration messages received on port p and the configuration message the switch would send
  – If switch receives a “better” configuration message on a port p, don’t send configuration messages on port p
  – Else, p is designated: send configuration message on p
• Rule for comparing configuration messages: (R1, d1, X1) better than (R2, d2, X2) if R1 < R2 or (R1 = R2 and d1 < d2) or (R1 = R2 and d1 = d2 and X1 < X2)
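The comparison rule translates directly into Python. The names `better` and `is_designated` are assumptions for illustration. Because the rule compares R first, then d, then X, it coincides with lexicographic ordering of the tuples, so `m1 < m2` on Python tuples gives the same answer.

```python
def better(m1, m2):
    """True if configuration message m1 = (R1, d1, X1) is better than m2."""
    (r1, d1, x1), (r2, d2, x2) = m1, m2
    return (
        r1 < r2
        or (r1 == r2 and d1 < d2)
        or (r1 == r2 and d1 == d2 and x1 < x2)
    )


def is_designated(own_msg, heard_on_port):
    """Designated port rule: port p is designated unless a better message was heard on it."""
    return not any(better(m, own_msg) for m in heard_on_port)


# Switch 2 offers (1,1,2) and hears (1,1,3) from switch 3 on the same link.
switch2_keeps_port = is_designated((1, 1, 2), [(1, 1, 3)])
switch3_keeps_port = is_designated((1, 1, 3), [(1, 1, 2)])
```

On the link between switches 2 and 3, only switch 2 stays designated, so exactly one switch keeps sending configuration messages there.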
47
STP: Complete example
• All switches begin thinking they are root with all ports in the designated state
[Figure: switch states; each port shows its status and the best message heard]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 2; Msg: (2,0,2); ports: D: (2,0,2), D: (2,0,2), D: (2,0,2)
Switch 3: Root id: 3; Msg: (3,0,3); ports: D: (3,0,3), D: (3,0,3), D: (3,0,3)
Switch 4: Root id: 4; Msg: (4,0,4); ports: D: (4,0,4)
48
STP: Complete example
• All switches begin thinking they are root with all ports in the designated state
• Switch 1 sends (1,0,1), switches 2 and 3 update their root ids, ports, and msgs
  – Switch 2 breaks “tie” between the two copies of (1,0,1) locally by numbering its ports
  – Each switch’s port remembers the best configuration message seen so far
[Figure: switch 1 sends (1,0,1) from both of its ports; switch 2’s ports are numbered 1 to 3]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (2,0,2), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: D: (3,0,3), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 4; Msg: (4,0,4); ports: D: (4,0,4)
49
STP: Complete example
• Switch 3 sends (1,1,3) from its designated ports, switch 4 updates its root id and message
  – Switch 2, port 3 remains designated because Switch 2’s message (1,1,2) is better than (1,1,3)
  – Switch 1, port 1 remains designated because Switch 1’s message (1,0,1) is better than (1,1,3)
[Figure: switch 3 sends (1,1,3) from its designated ports; ports are numbered on switches 1 and 2]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: D: (3,0,3), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
50
STP: Complete example
• Switch 2 sends (1,1,2) from port 3 only
  – Switch 3 blocks its port 3 since (1,1,2) is better than its message (1,1,3)
[Figure: switch 2 sends (1,1,2) to switch 3; ports are numbered on switches 2 and 3]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: B: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
51
STP: Dynamics
• When do switches send configuration messages?
  – If you think you’re the root, send periodically with parameter hello time (two seconds recommended in 802.1d)
  – Other switches send on all designated ports upon receiving root’s message
• How does the algorithm adapt to topology changes?
  – State table contains age field, which is updated continuously
  – Aging rule: If age reaches a threshold max age (20 sec in 802.1d), discard that table entry and recalculate using all rules
    • What happens if max age is too big? Too small?
  – Recalculate when receive better or newer configuration message on port p (resulting in a table entry being overwritten)
52
STP: Handling failures
• Suppose the Ethernet LAN fails
[Figure: switch states before the failure]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: B: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
53
STP: Handling failures
• Suppose the Ethernet LAN fails
• Switch 3:
  – Stops hearing the root’s messages through port 1, so it becomes designated
  – Port 3 becomes root
  – Updates its own message
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,2,3); ports: R: (1,1,2), D: (1,2,3), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
54
STP: Handling failures
• Suppose the Ethernet LAN fails
• Switch 4:
  – Updates message heard on root port
  – Updates its own message
• Switch 2:
  – Stops hearing the root’s messages through port 2, so it becomes designated
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), D: (1,1,2), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,2,3); ports: R: (1,1,2), D: (1,2,3), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,3,4); ports: R: (1,2,3)
55
STP: Handling topology change
• Suppose we fix the LAN. Now we have created (temporary) forwarding loops
  – This also happens when switches are powered-up
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), D: (1,1,2), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,2,3); ports: R: (1,1,2), D: (1,2,3), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,3,4); ports: R: (1,2,3)
56
STP: Pre-forwarding port state
• Suppose any of the following apply to a port:
  1. Transition from B → D
  2. Any newly-connected port (detect Ethernet carrier)
  3. Any port on a freshly-powered switch
• The port then enters the pre-forwarding (PF) state, where:
  – It sends configuration messages and transitions to blocked and root states as if designated
  – But it does not forward data frames, so can’t create loops
[Figure: switch states, with pre-forwarding ports marked PF]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), PF: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), PF: (1,1,2), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,2,3); ports: R: (1,1,2), PF: (1,2,3), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,3,4); ports: R: (1,2,3)
57
STP: Pre-forwarding port state
• Switch 3 returns to old state
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), PF: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), PF: (1,1,2), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: R: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,3,4); ports: R: (1,2,3)
58
STP: Pre-forwarding port state
• Switch 3 returns to old state
• Switch 2 returns to old state
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), PF: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: R: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,3,4); ports: R: (1,2,3)
59
STP: Pre-forwarding port state
• Switch 3 returns to old state
• Switch 2 returns to old state
• Switch 4 returns to old state
• Now switch 1, port 1 remains in the pre-forwarding state
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), PF: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: R: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
60
STP: Leaving the pre-forwarding state
• If still in PF state after some number of seconds (forwarding delay parameter) then the port becomes designated (D)
• How long should forwarding delay be?
  – Long enough for the entire spanning tree to re-form, i.e.:
  – Twice the maximum transit time across the extended LAN
    • 30 seconds in 802.1d
[Figure: switch states]
Switch 1: Root id: 1; Msg: (1,0,1); ports: D: (1,0,1), D: (1,0,1)
Switch 2: Root id: 1; Msg: (1,1,2); ports: D: (1,1,3), B: (1,0,1), R: (1,0,1)
Switch 3: Root id: 1; Msg: (1,1,3); ports: R: (1,1,2), R: (1,0,1), D: (3,0,3)
Switch 4: Root id: 1; Msg: (1,2,4); ports: R: (1,1,3)
61
The evolution of Ethernet
• From the coaxial cable shared medium to switches
  – Even more capacity, with simultaneous conversations
• From 3 Mbit/s experimental Ethernet to 100 Gbit/s recent standards
• From electrical signaling to optical
• Changed everything except the frame format
• Lesson: The right interface can accommodate many changes
  – Implementation is hidden behind interface
62
Today
• We finish the functionality of the link layer, and tie it in to IP
1. Framing and addressing
2. Repeaters, hubs, and switches
3. Bootstrapping a host
  – Protocols for bootstrapping: DHCP, ARP
  – Communicating over the same, different networks
63
What does a host need to know?
• What IP address should the host use?
• What local DNS server to use?
• How to tell which destinations are local?
  – How to address them using the local network?
• How to send packets to remote destinations?
[Figure: two networks, 1.2.3.0/23 and 5.6.7.0/24, each with hosts, a DNS server, and a router, joined by a third router; on the first network are hosts 1.2.3.7 and 1.2.3.156, a new host marked ???, and the router interface 1.2.3.19]
64
Avoiding manual configuration
• Dynamic Host Configuration Protocol (DHCP)
  – End host learns how to send packets
  – Learn IP address, DNS servers, “gateway,” what’s local
• Address Resolution Protocol (ARP)
  – For local destinations, learn the mapping between IP address and MAC address
[Figure: the same two networks; the first is 1.2.3.0/23 with mask 255.255.254.0 and hosts 1.2.3.7, 1.2.3.156, and 1.2.3.48, router interface 1.2.3.19, and the link-layer address 1A-2F-BB-76-09-AD shown; the second is 5.6.7.0/24]
65
Key ideas in both protocols
• Broadcasting: when in doubt, shout!
  – Broadcast query to all hosts in the local-area network
  – … when you don’t know how to identify the right one
• Caching: remember the past for a while
  – Store the information you learn to reduce overhead
  – Remember your own address and other hosts’ addresses
• Soft state: eventually forget the past
  – Associate a time-to-live field with the information
  – On expiry either refresh or discard the information
  – This is key for robustness in the face of unpredictable change
66
Bootstrapping problem
• Host doesn’t have an IP address yet
  – So, host doesn’t know what source address to use
• Host doesn’t know whom to ask for an IP address
  – So, host doesn’t know what destination address to use
[Figure: hosts attached to a network of routers]
67
DHCP discovery, from the client
• DHCP solution: “shout” to discover a server that can help
  – Client broadcasts a DHCP discover message (to the broadcast IP address, 255.255.255.255)
  – Two possibilities:
    1. Server on same subnet sends a reply offering an address
    2. Or: a DHCP relay agent (configured only with DHCP server’s IP address) unicasts to a DHCP server on another network
      • DHCP server replies unicast to relay agent; agent forwards replies to the new host’s network
[Figure: hosts, a DHCP server, and a DHCP relay; the relay reaches a DHCP server on another network through routers]
68
Response from the DHCP server
• The server responds with a DHCP offer message
  – Contains configuration parameters (including proposed IP address, mask, gateway router, DNS server)
  – Contains lease time (duration the information remains valid)
• Multiple servers may respond
  – Multiple servers on the same subnetwork
  – Each may respond with an offer
• Accepting one of the offers
  – Client sends a DHCP request echoing the parameters
  – The DHCP server responds with a DHCP ACK to confirm
  – The other servers see they were not chosen
    • They can then safely offer those same parameters to other clients
69
Dynamic Host Configuration Protocol
• Why all the broadcasts?
• Discover broadcast: client doesn’t know DHCP server’s identity
• Offer, ACK broadcast: client doesn’t have an IP address yet
• Request broadcast: so other servers can see
[Figure: message sequence between the arriving client and the DHCP server: DHCP discover (broadcast), DHCP offer (broadcast), DHCP request (broadcast), DHCP ACK (broadcast)]
70
Soft state: Refresh or forget
• Why is a lease time necessary?
  – Client can release the IP address (DHCP release)
    • e.g., clean shutdown of the computer
  – But, host might not release the address
    • e.g., the host crashes
    • e.g., buggy client software
  – And you don’t want the address to be allocated forever
• Performance trade-offs
  – Short lease time: returns inactive addresses quickly
  – Long lease time: avoids overhead of frequent renewals & lessens frequency of lease being denied
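The role of the lease time can be illustrated with a toy address pool in Python. This is not the DHCP message exchange; the class `LeasePool`, its methods, and the explicit `now` argument are all assumptions chosen so that the refresh-or-forget behaviour is easy to follow.

```python
class LeasePool:
    """Toy address pool whose allocations are soft state with a lease time."""

    def __init__(self, addresses, lease_time):
        self.free = list(addresses)
        self.lease_time = lease_time
        self.leases = {}  # address -> (client id, expiry time)

    def _expire(self, now):
        # Forget every lease that was not refreshed in time.
        for addr, (_, expiry) in list(self.leases.items()):
            if expiry <= now:
                del self.leases[addr]
                self.free.append(addr)

    def allocate(self, client, now):
        """Lease a free address to client, or return None if the pool is empty."""
        self._expire(now)
        if not self.free:
            return None
        addr = self.free.pop(0)
        self.leases[addr] = (client, now + self.lease_time)
        return addr

    def renew(self, client, addr, now):
        """Refresh a lease; fails if it already expired or belongs to someone else."""
        self._expire(now)
        if self.leases.get(addr, (None, 0))[0] != client:
            return False
        self.leases[addr] = (client, now + self.lease_time)
        return True

    def release(self, client, addr):
        """Clean shutdown: the client hands the address back early."""
        if self.leases.get(addr, (None, 0))[0] == client:
            del self.leases[addr]
            self.free.append(addr)


pool = LeasePool(["1.2.3.48"], lease_time=60)
first = pool.allocate("host-a", now=0)
denied = pool.allocate("host-b", now=30)     # address still leased to host-a
reclaimed = pool.allocate("host-b", now=61)  # host-a crashed and never renewed
```

A crashed client never calls `release`, yet its address returns to the pool once the lease runs out, which is the robustness argument made on the slide.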
71
So, now the host knows things…
• IP address, mask, gateway router, DNS server
• And can send packets to other IP addresses
• But: how to use the local network to do this?
72
Figuring out where to send locally
• Two cases:
  1. Destination is on the local network: need to address it directly
  2. Destination is not local (remote): need to figure out the first “hop” on the local network
• Determining if it’s local: use the netmask
  – e.g., bitwise-AND the destination IP address with 255.255.254.0
  – Is it the same value as when we do the same with own IP address?
    • Yes → destination IP is local; no → destination IP is remote
[Figure: the two networks from the earlier slides: 1.2.3.0/23 (mask 255.255.254.0) with hosts 1.2.3.7, 1.2.3.156, 1.2.3.48 and router interface 1.2.3.19, and 5.6.7.0/24]
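The netmask test is a two-line computation. The Python sketch below uses the standard `ipaddress` module only to turn dotted quads into integers; the function names `is_local` and `next_hop` are assumptions, and 5.6.7.8 is a hypothetical address on the remote network from the figure.

```python
import ipaddress


def is_local(own_ip, dest_ip, netmask):
    """True if dest_ip and own_ip agree on every bit selected by the netmask."""
    own = int(ipaddress.IPv4Address(own_ip))
    dest = int(ipaddress.IPv4Address(dest_ip))
    mask = int(ipaddress.IPv4Address(netmask))
    return (own & mask) == (dest & mask)


def next_hop(own_ip, dest_ip, netmask, gateway):
    """Local destinations are addressed directly; remote ones go via the gateway."""
    return dest_ip if is_local(own_ip, dest_ip, netmask) else gateway


MASK = "255.255.254.0"
local_case = next_hop("1.2.3.7", "1.2.3.156", MASK, gateway="1.2.3.19")
remote_case = next_hop("1.2.3.7", "5.6.7.8", MASK, gateway="1.2.3.19")
```

Either way, the result is an IP address on the local network, which still has to be translated into a link-layer address before a frame can be sent.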
73
Figuring out where to send locally (2)
• If it’s remote, look up the first hop in a (very small) local routing table
  – e.g., by default, route via 1.2.3.19
  – Now do the local case but for 1.2.3.19 rather than ultimate destination IP address
• For the local case, need to determine the destination’s link-layer address
• How does a host translate the next hop IP address to a link-layer address?
[Figure: the two networks from the earlier slides, with gateway router interface 1.2.3.19 on network 1.2.3.0/23]
74
Address Resolution Protocol (ARP)
• Every node maintains an ARP table
  – (IP address, link-layer address) pairs
• Consult the table when sending a packet
  – Map destination IP address to destination MAC address
  – Encapsulate and transmit the data packet
• But: what if IP address not in the table?
  – Sender broadcasts: “Who has IP address 1.2.3.156?”
  – Receiver responds (unicast, to the source of the broadcast): “link-layer address 58-23-D7-FA-20-B0”
  – Sender caches result in its ARP table
• Sender may include its own <IP, link-layer> address mapping in request, so that receiver can reply back to the sender
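The lookup-then-broadcast behaviour can be modelled in a few lines. This is a toy simulation, not a packet-level implementation: the classes `Lan` and `ArpTable` and their methods are hypothetical, and the LAN is just a dictionary standing in for the broadcast medium.

```python
class Lan:
    """Stand-in for the shared medium: every attached host hears a broadcast."""

    def __init__(self, hosts):
        self.hosts = dict(hosts)    # IP address -> link-layer address
        self.broadcasts = 0

    def who_has(self, ip):
        self.broadcasts += 1
        # The owner of the address answers by unicast; nobody answers otherwise.
        return self.hosts.get(ip)


class ArpTable:
    def __init__(self, lan):
        self.lan = lan
        self.cache = {}             # (IP address, link-layer address) pairs

    def resolve(self, ip):
        if ip not in self.cache:    # miss: ask the whole LAN
            mac = self.lan.who_has(ip)
            if mac is None:
                return None
            self.cache[ip] = mac    # cache the answer for later packets
        return self.cache[ip]


lan = Lan({"1.2.3.156": "58-23-D7-FA-20-B0"})
arp = ArpTable(lan)
first = arp.resolve("1.2.3.156")    # triggers one broadcast
second = arp.resolve("1.2.3.156")   # served from the table
```

The sketch never evicts entries; actual ARP implementations age them out so that stale mappings do not persist.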
75
Example: Putting it all together
• How does host A send a datagram to host B?
  1. A sends packet to R
  2. R sends packet to B
[Figure: host A (74:29:9c:e8:ff:55, 128.16.74.92, netmask 0xfffff000) on network 128.16.64.0/20; router R with left interface e6:e9:00:17:bb:4b, 128.16.64.1 and right interface 1a:23:f9:cd:06:9b, 128.17.0.1; host B (49:bd:d2:C7:56:2a, 128.17.0.2) on network 128.17.64.0/20]
76
Host A decides to send through R
• Host A constructs an IP packet to send to B
  – IP source 128.16.74.92, IP destination 128.17.0.2
• Host A has a gateway router R
  – Used to reach any destination outside of 128.16.64.0/20
  – Address 128.16.64.1 for R learned via DHCP
77
Host A sends packet through R
• Host A learns the MAC address of R’s interface
  – ARP request: broadcast request for 128.16.64.1
  – ARP response: R responds with e6:e9:00:17:bb:4b
• Host A encapsulates the packet in a link-layer header and sends to R
[Figure: link-layer frame addressed “To: R”, carrying the IP packet (source A, destination B, data)]
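Encapsulation at host A can be sketched by building the bytes of an Ethernet header by hand. The helper names are hypothetical, the "IP packet" is a placeholder (two addresses plus a payload, not a real IPv4 header), and the preamble and trailing CRC are left out.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800   # EtherType value identifying an IPv4 payload


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' or 'AA-BB-CC-DD-EE-FF' to 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", "").replace("-", ""))


def ethernet_frame(dst_mac: str, src_mac: str, ip_packet: bytes) -> bytes:
    """Prepend a 14-byte Ethernet header: destination, source, EtherType."""
    header = (mac_to_bytes(dst_mac) + mac_to_bytes(src_mac)
              + struct.pack("!H", ETHERTYPE_IPV4))
    return header + ip_packet


# Placeholder packet: source IP, destination IP, then the data.
ip_packet = (socket.inet_aton("128.16.74.92")
             + socket.inet_aton("128.17.0.2") + b"data")

# The frame goes to R's link-layer address even though the IP destination is B.
frame = ethernet_frame("e6:e9:00:17:bb:4b", "74:29:9c:e8:ff:55", ip_packet)
```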
78
R decides how to forward datagram
• Router R’s left interface receives the packet
  – R extracts the IP packet from the Ethernet frame
  – R sees the IP packet is destined to 128.17.0.2
• Router R consults its forwarding table
  – Packet matches 128.17.64.0/20 via right interface
[Figure: IP packet (source A, destination B, data), shown without a link-layer header]
79
R sends datagram to B
• Router R’s right interface learns the link-layer address of host B
  – ARP request: broadcast request for 128.17.0.2
  – ARP response: B responds with 49:bd:d2:C7:56:2a
• Router R encapsulates the packet and sends to B
[Figure: link-layer frame addressed “To: B”, carrying the IP packet (source A, destination B, data)]
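The whole two-hop walk can be summarised in a toy model where frames and packets are dictionaries. The function names are assumptions, and the sketch deliberately omits what a real router also does to the IP header (decrementing the TTL and updating the header checksum).

```python
def encapsulate(packet, src_mac, dst_mac):
    """Wrap an IP packet in a link-layer header for one hop."""
    return {"src_mac": src_mac, "dst_mac": dst_mac, "packet": packet}


def router_forward(frame, out_mac, next_hop_mac):
    """Strip the incoming link-layer header and build a fresh one."""
    packet = frame["packet"]
    return encapsulate(packet, out_mac, next_hop_mac)


packet = {"src_ip": "128.16.74.92", "dst_ip": "128.17.0.2", "data": b"data"}

# Hop 1: A -> R's left interface (link-layer address learned via ARP).
hop1 = encapsulate(packet, "74:29:9c:e8:ff:55", "e6:e9:00:17:bb:4b")

# Hop 2: R's right interface -> B (again learned via ARP).
hop2 = router_forward(hop1, "1a:23:f9:cd:06:9b", "49:bd:d2:C7:56:2a")
```

The IP source and destination are the same on both hops; only the link-layer addresses change.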
80
Security analysis of ARP
• Impersonation
  – Any node that hears an ARP request can answer…
  – …and can say whatever they want
  – Actual legit receiver never sees a problem
    • Because even though later packets carry its IP address, its NIC doesn’t capture them, since they are not sent to its link-layer address
• Man-in-the-middle attack
  – Imposter updates frames with correct link-layer address and forwards whatever it receives to the legit destination…
    • …but gets to inspect (and maybe alter) it first
• Does the attacker have to “win” a race?
  – Maybe not, if sender blindly believes ARP responses
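The race question can be made concrete by comparing two cache policies. Both classes are hypothetical illustrations: `NaiveArpCache` believes every reply, while `CautiousArpCache` is one possible stricter policy (accept only the first reply to an outstanding request), not a standard defence. The attacker's address is made up.

```python
class NaiveArpCache:
    """Blindly accepts every ARP reply, solicited or not; the latest one wins."""

    def __init__(self):
        self.table = {}

    def on_reply(self, ip, mac):
        self.table[ip] = mac


class CautiousArpCache(NaiveArpCache):
    """Accepts a reply only if a request for that IP is outstanding."""

    def __init__(self):
        super().__init__()
        self.pending = set()

    def request(self, ip):
        self.pending.add(ip)

    def on_reply(self, ip, mac):
        if ip in self.pending:          # first answer wins, later ones are dropped
            self.pending.discard(ip)
            self.table[ip] = mac


GATEWAY_IP = "128.16.64.1"
ROUTER_MAC = "e6:e9:00:17:bb:4b"
ATTACKER_MAC = "de:ad:be:ef:00:01"      # hypothetical attacker address

# Blind sender: the attacker can reply at any time, no race needed.
naive = NaiveArpCache()
naive.on_reply(GATEWAY_IP, ROUTER_MAC)
naive.on_reply(GATEWAY_IP, ATTACKER_MAC)

# Stricter sender: a late forged reply is ignored...
cautious = CautiousArpCache()
cautious.request(GATEWAY_IP)
cautious.on_reply(GATEWAY_IP, ROUTER_MAC)
cautious.on_reply(GATEWAY_IP, ATTACKER_MAC)

# ...but the attacker still succeeds by answering first.
raced = CautiousArpCache()
raced.request(GATEWAY_IP)
raced.on_reply(GATEWAY_IP, ATTACKER_MAC)
raced.on_reply(GATEWAY_IP, ROUTER_MAC)
```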
81
The problem with extended LANs
• Switched LANs afford greater scalability, but extended LANs do not isolate traffic
• Three resulting issues:
  1. Security: Allows eavesdropping across LANs, just by putting an interface in promiscuous mode
  2. Load: Some LANs are more heavily used than others; it may be desirable to separate them at times
  3. Broadcast scalability: Broadcast frames traverse the entire extended LAN; this reduces overall performance
82
Virtual LANs (VLANs)
• Switch assigns each port a color, an identifier designating the VLAN that port belongs to
• Traffic isolation: colors = broadcast domains
• Easily reconfigurable port assignments
• Routing between VLANs: layer 3 routing functionality
[Figure: one switch with numbered ports (1 to 16) divided between two VLANs, Computer Science and Electrical Engineering]
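Port colouring and its effect on broadcasts can be sketched with a dictionary from port number to VLAN. The specific port-to-VLAN assignment below is invented for the example, as is the function name `flood_ports`.

```python
def flood_ports(port_vlan, ingress_port):
    """Ports that receive a broadcast frame arriving on ingress_port."""
    vlan = port_vlan[ingress_port]
    return sorted(p for p, v in port_vlan.items()
                  if v == vlan and p != ingress_port)


# Hypothetical assignment of switch ports to the two VLANs.
port_vlan = {1: "CS", 2: "CS", 4: "CS", 8: "CS",
             9: "EE", 10: "EE", 15: "EE", 16: "EE"}

cs_flood = flood_ports(port_vlan, 1)   # a CS broadcast stays within CS ports

# Reconfiguration is a table update, not a rewiring job.
port_vlan[8] = "EE"
ee_flood = flood_ports(port_vlan, 9)
```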
83
VLAN example
• Configure ports on W, X, Y, and Z to be in appropriate VLANs
  – Trunk ports between B1 and B2 configured for both VLANs
• Bridge inserts VLAN header containing color between Ethernet header and payload
• If a packet contains a VLAN header, bridges only forward on matching-color or trunk ports
[Figure: bridges B1 and B2 joined by a trunk link]
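Tag insertion and the forwarding rule can be sketched as follows, assuming IEEE 802.1Q tagging (the slides do not name the standard). In 802.1Q the 4-byte tag sits after the source address and before the EtherType, and the VLAN identifier is a 12-bit field. The function names are hypothetical.

```python
import struct

TPID_8021Q = 0x8100   # Tag Protocol Identifier marking an 802.1Q-tagged frame


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert a 4-byte VLAN tag after the two 6-byte MAC addresses."""
    if not 0 <= vlan_id < 4096:
        raise ValueError("VLAN id must fit in 12 bits")
    tci = (priority << 13) | vlan_id            # 3-bit priority, 12-bit VLAN id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def vlan_id_of(frame: bytes):
    """Return the VLAN id of a tagged frame, or None if it is untagged."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    return tci & 0x0FFF if tpid == TPID_8021Q else None


def should_forward(frame_vlan: int, port_vlan: int, is_trunk: bool) -> bool:
    """Forward only on trunk ports or ports of the matching color."""
    return is_trunk or frame_vlan == port_vlan


# Untagged frame: zeroed MAC addresses, IPv4 EtherType, then the payload.
untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vlan_id=10)
```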
84
Comparing L2 switches and L3 routers
• Advantages of L2 switches over L3 routers
  – No human configuration is needed
  – Fast filtering and forwarding of frames
• Disadvantages of L2 switches compared with L3 routers
  – Topology restricted to a spanning tree
  – Large networks require large ARP tables
  – Broadcast storms can cause the network to collapse
  – Can’t accommodate non-Ethernet segments (why not?)
85
NEXT TIME
Midterm exam in regular lecture timeslot, Thursday 14th November
Coursework 2 due Friday 15th November, 4:05 PM
Acknowledgement: Selected parts adapted from lecture material by Scott Shenker (UC Berkeley) and Kurose and Ross, Computer Networking (4/e)