Washington University in St. Louis
GigE for the MSR
Fred Kuhns
Fred Kuhns - 1/9/01

Ethernet Forwarding Scenario 1
[Figure labels, in the order they appear on the slide]
Ethernet Switch
Host - IP: 192.163.204.2, MAC: 08:00:20:7C:E3:25
Host - IP: 192.163.204.3, MAC: 08:00:20:7C:F2:45
Router - Port 0: IP: 192.163.204.4, MAC: 00:01:03:7C:23:03; Port 1: IP: 192.163.150.1, MAC: 00:01:03:7C:56:34
Ethernet Switch
MSR P1 - Port 1: IP: 192.163.204.2, MAC: 00:00:5E:04:00:01
Host - IP: 192.163.150.2, MAC: 00:40:33:A3:4C:04
Host - IP: 192.163.150.3, MAC: 08:00:20:54:6C:4A
MSR ports shown: P0, P1, P3
Packet arrives with destination host on the local network. The output port must map the destination IP address to a MAC address.
Packet: Destination Addr: 192.168.204.2 | IP hdr | data
Use the Address Resolution Protocol (ARP) to map 192.168.204.2 to 08:00:20:7C:E3:25. Encapsulate the datagram in an Ethernet frame and send.

Ethernet Forwarding Scenario 2
[Figure: same topology and addresses as Scenario 1, with one added label: "Forwards to final destination host"]
Next hop router IP address must be used in the ARP
request: Map 192.168.204.4 to 00:01:03:7C:23:03.
Encapsulate datagram in Ethernet frame and send.
Destination Addr:192.168.150.2
IP hdr
data
Packet arrives with destination host NOT on locally attached network. Output port must send to
the next hop router.
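
The two scenarios differ only in which IP address is handed to ARP. A minimal sketch of that decision follows, using the addresses printed in the scenario text; the /24 prefix and the function name `arp_target` are assumptions made for illustration and are not part of the slides.

```python
import ipaddress

def arp_target(dst_ip, local_net, next_hop_ip):
    """Return the IP address whose MAC address must be resolved with ARP."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in ipaddress.ip_network(local_net):
        # Scenario 1: destination host is on the locally attached network.
        return str(dst)
    # Scenario 2: destination is elsewhere, so resolve the next hop router.
    return next_hop_ip

# The /24 prefix is an assumption; the slides do not state the prefix length here.
LOCAL_NET = "192.168.204.0/24"
ROUTER = "192.168.204.4"

print(arp_target("192.168.204.2", LOCAL_NET, ROUTER))  # ARP for the host itself
print(arp_target("192.168.150.2", LOCAL_NET, ROUTER))  # ARP for the router
```

In both cases the IP destination address in the datagram is left unchanged; only the Ethernet destination address differs.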

Ethernet Frame Format
Ethernet Header:
  Destination Address (6 B)
  Source Address (6 B)
  Ether Type (2 B)
IP Datagram:
  IP Header:
    Version | H-length | TOS | Total length
    Identification | Flags | Fragment offset
    TTL | Protocol | IP Header checksum
    IP Source Address
    IP Destination Address
  Transport Header
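
As an illustration of the 14-byte Ethernet header layout, the sketch below packs the three header fields in network byte order. The helper names `mac_bytes` and `ethernet_header` are hypothetical; the MAC addresses are taken from the forwarding scenario slides.

```python
import struct

def mac_bytes(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' into 6 raw bytes."""
    return bytes(int(octet, 16) for octet in mac.split(":"))

def ethernet_header(dst_mac, src_mac, ether_type):
    """Pack destination (6 B), source (6 B) and Ether Type (2 B), big-endian."""
    return struct.pack("!6s6sH", mac_bytes(dst_mac), mac_bytes(src_mac), ether_type)

# 0x0800 is the Ether Type for IPv4.
hdr = ethernet_header("08:00:20:7C:E3:25", "00:00:5E:04:00:01", 0x0800)
print(len(hdr), hdr.hex())
```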

IP Encapsulation in Ethernet Frames
Ethernet Encapsulation, RFC 894:
  dst address (6) | src address (6) | type 0800 (2) | Data (46-1500) | FCS (4)
  Pad (0-46)
IEEE 802.3/802.2 encapsulation, RFC 1042:
  dst address (6) | src address (6) | len (2) | 802.2 LLC/SNAP | Data (38-1492) | FCS (4)
  802.2 LLC: DSAP AA | SSAP AA | ctl 03
  802.2 SNAP: Org Code 00 | type 0800
  0 <= len <= 1500
  Pad (0-46)
• Ethernet frame size: 64 - 1518 Bytes
• If type <= 1500, then IEEE frame, otherwise Ethernet V2.
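
The frame-type rule and the minimum payload size can be sketched as follows. This applies the slide's rule as written (a value of at most 1500 in the 2-byte field is a length); the function names `frame_format` and `pad_bytes` are hypothetical.

```python
ETHERTYPE_IPV4 = 0x0800

def frame_format(type_or_len):
    """Classify a frame by the 2-byte field that follows the source address."""
    if type_or_len <= 1500:
        return "IEEE 802.3"   # field is a length
    return "Ethernet V2"      # field is an Ether Type

def pad_bytes(payload_len, minimum=46):
    """Zero bytes needed to bring the Ethernet data field up to its minimum size."""
    return max(0, minimum - payload_len)

print(frame_format(ETHERTYPE_IPV4))
print(frame_format(100))
print(pad_bytes(28))
```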

ARP Frame
Destination Address (6B)
Source Address (6B)
Ether Type (2B)
Hardware Address Space (2B)
Protocol Address Space (2B)
Byte length of Hardware address = 6 (1B)
Byte length of Protocol address = 4 (1B)
Operation Code 1/2 (2B)
Hardware Address of Sender (6 B)
Protocol Address of Sender (4 B)
Hardware Address of Destination (6 B)
Protocol Address of Destination (4 B)

ARP Message Formats
Host A: Eth <eth-A>, IP <ip-A>
Host B: Eth <eth-B>, IP <ip-B>

ARP Request (01), sent by Host A:
  dst address ff:ff:ff:ff:ff:ff | src address <eth-A> | type 0806
  has 0001 | pas 0800 | hl 6 | pl 4 | op 01
  sha <eth-A> | spa <ip-A> | tha <??> | tpa <ip-B>
  pad | FCS xx

ARP Reply (02), sent by Host B:
  dst address <eth-A> | src address <eth-B> | type 0806
  has 0001 | pas 0800 | hl 6 | pl 4 | op 02
  sha <eth-B> | spa <ip-B> | tha <eth-A> | tpa <ip-A>
  pad | FCS xx

Ethernet Frame with ARP Request/Reply - 64 Bytes:
  Ethernet Header (14 B) | ARP Message (28 Bytes for Request or Reply) | 18 Byte Pad | FCS (4 B)
  Ethernet Data - pad with zeros to 46 Bytes
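
The request layout can be checked by packing it byte for byte. This is a minimal sketch, not the interface's implementation; the function name `arp_request_frame` and the example addresses are assumptions, and the 4-byte FCS is left out because it is normally appended by the hardware.

```python
import socket
import struct

def arp_request_frame(src_mac, src_ip, target_ip):
    """Build an ARP request frame (without FCS) for target_ip."""
    # Ethernet header: broadcast destination, our source, Ether Type 0x0806 (ARP).
    eth = struct.pack("!6s6sH", b"\xff" * 6, src_mac, 0x0806)
    # ARP message: has, pas, hl, pl, op, sha, spa, tha, tpa.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        0x0001, 0x0800, 6, 4, 1,
        src_mac, socket.inet_aton(src_ip),
        b"\x00" * 6, socket.inet_aton(target_ip),  # target hardware address unknown
    )
    pad = b"\x00" * (46 - len(arp))  # pad Ethernet data to 46 bytes
    return eth + arp + pad

frame = arp_request_frame(bytes.fromhex("00005E040001"), "192.168.204.1", "192.168.204.2")
print(len(frame))  # 60 bytes; adding the 4-byte FCS gives the 64-byte frame
```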

IP over ATM (RFC 791 and RFC 2684)
IP Datagram:
  IP Header:
    Version | H-length | TOS | Total length
    Identification | Flags | Fragment offset
    TTL | Protocol | Header checksum
    Source Address
    Destination Address
    Options ??
  IP data (transport header and transport data)
AAL5 padding (0 - 40 bytes)
AAL5 Trailer:
  CPCS-UU (0) | CPI (0) | Length (IP packet + LLC/SNAP)
  CRC

IP Header Fields (RFC 791)
• Version - support IPv4 (4)
• Header Length - length in 32-bit words (>= 5)
• TOS - layout: Prec. (3 bits) | D | T | R | reserved (0)
  – Precedence field: 111 - Network Control, 110 - Internetwork Control, 101 - CRITIC/ECP, 100 - Flash Override, 011 - Flash, 010 - Immediate, 001 - Priority, 000 - Routine
  – Remaining TOS fields: D - 1 = Low Delay, T - 1 = High Throughput, R - 1 = High Reliability
• Total Length - length of datagram in octets
• Id - assists in reassembling fragments
• Flags - layout: 0 | DF | MF
  – DF - 1 = Don’t Fragment, MF - 1 = More Fragments
• Fragment Offset - where the fragment belongs; offset is in units of 8 octets

IP Header Fields
• TTL - router must decrement; if 0 then discard packet
• Protocol - UDP/TCP/ICMP/RSVP to name a few
• Header Checksum - 16-bit one’s complement of the one’s complement sum of all 16-bit words in the header
• Source Address - sending host’s IP address
• Destination Address - destination host’s IP address
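
The header checksum definition can be made concrete with a short sketch. The function name `ip_checksum` and the 20-byte sample header are illustrative only; the checksum field is set to zero before computing, as RFC 791 requires.

```python
import struct

def ip_checksum(header):
    """16-bit one's complement of the one's complement sum of all 16-bit words."""
    if len(header) % 2:
        header += b"\x00"  # defensive only; real IP headers are multiples of 4 bytes
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

# Sample header with the checksum field (bytes 10-11) set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
csum = ip_checksum(sample)
print(hex(csum))

# A receiver can verify a header by checksumming it with the field filled in:
filled = sample[:10] + struct.pack("!H", csum) + sample[12:]
print(ip_checksum(filled))  # 0 means the header is consistent
```

Because a router decrements TTL, it must also update this checksum before forwarding.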

Packet Routing Within MSR
[Figure: packet path from the ingress Link Interface through FPX and SPC, across the WUGS, to the egress FPX, SPC and Link Interface. Box labels: FPX (add shim, rem shim, FIPL, shim proc.), SPC (shim demux, shim update, FIPL, IP proc, plugins), WUGS, Link Interface.]
• Ingress, from previous hop router or endstation (InVC): Inbound VC = SPI + ExtBase, 0 <= SPI <= 15. Currently support at most 4 inbound VCs: one for Ethernet or four for ATM.
• Across the WUGS: in port + IntBase (64 ... 127), out port + IntBase (64 ... 127)
• Egress (OutVC): Outbound VC = SPI + ExtBase, 0 <= SPI <= 15. Currently support at most 4.
• ATM uses VCs as link layer address.
• Ethernet: Base VC used for directly attached hosts; sub-ports are for next hop routers.
• Current VCI support: 1) 64 ports (PN) 2) 16 sub-ports (SP)
• IP processing for FPX:
  1. Broadcast and multicast destination address
  2. IP options
  3. ICMP messages
  4. Packet not recognized
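
The VC arithmetic on this slide is simple enough to sketch. The constant values are assumptions: IntBase = 64 is inferred from the (64 ... 127) range for 64 ports, and ExtBase = 50 is borrowed from the Base VC quoted on the GigE slides. The function names are hypothetical.

```python
INT_BASE = 64  # assumption: 64 ports map to internal VCs 64 ... 127
EXT_BASE = 50  # assumption: matches the Base VC value used on the GigE slides

def internal_vc(port):
    """VC used across the switch for a given port number (0 ... 63)."""
    if not 0 <= port <= 63:
        raise ValueError("port number out of range")
    return port + INT_BASE

def external_vc(spi):
    """VC used on the external link for a given sub-port index (0 ... 15)."""
    if not 0 <= spi <= 15:
        raise ValueError("sub-port index out of range")
    return spi + EXT_BASE

print(internal_vc(0), internal_vc(63))
print(external_vc(0), external_vc(3))
```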

GigE Link Interface
[Figure: ATM to Ethernet direction. From FPX/SPC: IP Header | data | AAL5 trailer. To Next Hop or End station: Ethernet | IP Header | data.]
ARP Table (M Entries): IP -> MAC
  IP1 | MAC1
  ...
  IPM | MACM
• Pkt VC = 50: endsystem, broadcast or multicast address. Send to pkt->dst:
    if bcast or mcast
        map to eaddr
    else
        resolve w/ARP
• If VC != 50: look up VC in VIN table, which returns the IP used for ARP lookup (support N = 4).
• To a next hop router:
  NH #1 = Base + 1 = 51
  NH #2 = Base + 2 = 52
  NH #3 = Base + 3 = 53
• VIN Table - 4 entries:
  VC | MyIP  | NhIP
  50 | MyIP0 | 0 0
  51 | MyIP0 | NhIP0
  52 | MyIP1 | NhIP1
  53 | MyIP2 | NhIP2
• Software creates VIN table at boot time by writing to interface.
• Map multicast or broadcast to Ethernet address.
• Add Ethernet header using the derived destination address and our source address. Protocol is IP.
• If ARP table lookup fails, send ARP request to broadcast address, drop packet. No retries are made.
• No ARP entry aging!

Ethernet Assigned Numbers
• RFC 1700 obsoleted by online database at IANA:
  – http://www.iana.org/assignments/ethernet-numbers
• Ethernet Address - 6 octets:
  – 3 high-order octets = Organizationally Unique Identifier (OUI)
  – 3 low-order octets = the interface number
• Multicast bit = lsb of the most significant byte (xxxx xxx1)
  – first byte odd => multicast or broadcast
  – first byte even => unicast address
  – multicast address = ((OUI | 0x010000) << 24) | Group_ID
• Ethernet Broadcast: FF:FF:FF:FF:FF:FF
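
A short sketch of the multicast-bit test and of building a multicast address from an OUI follows. It treats the OUI as a 24-bit value, sets the multicast bit as 0x010000 within it, and combines the result with the group ID using a bitwise OR. The function names are hypothetical.

```python
def mac_to_int(mac):
    """Convert 'AA:BB:CC:DD:EE:FF' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)

def is_group_address(mac):
    """True for multicast or broadcast: lsb of the first (most significant) byte is 1."""
    return bool((mac_to_int(mac) >> 40) & 0x01)

def multicast_address(oui, group_id):
    """Set the multicast bit in the 24-bit OUI and append a 24-bit group ID."""
    return ((oui | 0x010000) << 24) | group_id

print(is_group_address("FF:FF:FF:FF:FF:FF"))      # broadcast
print(is_group_address("08:00:20:7C:E3:25"))      # unicast host address
print(hex(multicast_address(0x00005E, 0x000001)))
```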

IP and Ethernet Multicast
• IANA has allocated address block with OUI = 00:00:5E
  – Used for unicast addresses for “IETF standard track protocols”
  – Half of multicast addresses reserved for IP, remaining for “special use”. Leaves 23 bits for multicast addresses:
    • 01:00:5E:00:00:00 to 01:00:5E:7F:FF:FF
  – Could use this block for our interface, see ethernet numbers
• IP Multicast
  – Class D address, 0xE0000000 + 28 bit Group ID
  – 224.0.0.0 to 239.255.255.255 (0xE0000000 - 0xEFFFFFFF)
• IP to Ethernet Mapping
  – RFC 1112 - Host Extensions for IP Multicasting
  – Non-unique mapping: 28 bit IP group to 23 bit Ethernet group
    • 32 IP multicast groups per mapped Ethernet multicast address.

Multicast: IP to Ethernet Mappings
• Network Byte Ordering, Internet Standard Bit order (Big-Endian)
Ethernet multicast address (bit 47 = MSB, bit 0 = LSB):
  0000 0001 0000 0000 0101 1110 0xxx xxxx xxxx xxxx xxxx xxxx
  – Bits 47-24: block of Ethernet multicast addresses (01:00:5E), with the Multicast Bit set
  – Low-order 23 bits: taken from the IP group address
  – Figure labels: Multicast Bit, Internet Bit
IP address (msb to lsb):
  1110 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
  – 1110: Class D (Multicast)
  – Next 5 bits: not used in IP to Ethernet mapping
  – Low-order 23 bits: copied into the Ethernet address
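
The mapping can be sketched in a few lines: keep the low-order 23 bits of the Class D address and place them under the 01:00:5E prefix. The function name `ip_multicast_to_mac` is hypothetical.

```python
import ipaddress

def ip_multicast_to_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast address (RFC 1112)."""
    addr = int(ipaddress.IPv4Address(group))
    if addr >> 28 != 0xE:
        raise ValueError("not a Class D (multicast) address")
    mac = 0x01005E000000 | (addr & 0x7FFFFF)  # keep only the low 23 bits
    return ":".join(f"{(mac >> shift) & 0xFF:02X}" for shift in range(40, -8, -8))

print(ip_multicast_to_mac("224.0.0.1"))
print(ip_multicast_to_mac("224.128.0.1"))  # differs only in the 5 unused bits
```

The two groups above map to the same Ethernet address, which is why a receiver still has to filter on the IP destination address.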

IP Broadcast
• No direct impact on GigE interface
• IP Broadcast: by default, we will not forward directed broadcasts.
  – Limited broadcast:
    • {-1, -1}. Must not be forwarded; destination address only.
  – Directed broadcast:
    • {Network-Number, -1}, destination address only.
  – Subnet directed broadcast:
    • {Network-Number, Subnet-Number, -1}
  – Directed broadcast to all subnets:
    • {Network-Number, -1, -1}

Unicast - If we use the IANA Block
Ethernet address (bit 47 = MSB, bit 0 = LSB):
  0000 0000 0000 0000 0101 1110 0000 0100 xxxx xxxx xxxx xxxx
• Multicast bit set to 0
• High-order bits (00:00:5E:04): IANA block of Ethernet addresses
• Low-order 16 bits: ARL interface number

GigE Link Interface
[Figure: Ethernet to ATM direction. From Next Hop or Endstation: Ethernet | IP Header | data. To FPX/SPC on the Base VC: IP Header | data | AAL5 trailer.]
ARP Table (M Entries): IP -> MAC
  IP1 | MAC1
  ...
  IPM | MACM
*Unicast MAC address filtering

receive ethernet frame: eth
if (eth->type == ARP) {
    if (eth->arp->has != Ethernet/0001) Drop Frame
    if (eth->arp->pas != IP/0800) Drop Frame
    update {eth->arp->spa, eth->arp->sha} in ARP table
    if (eth->arp->tpa NOT in {MyIP0, MyIP1, MyIP2})
        Drop Frame // target IP not ours
    if (eth->arp->op == Request/01) {
        swap source and target ARP info
        set operation to Reply
        set ether header src and dst address
        send reply
    } // Already handled eth->arp->op == Reply/02
      // when updated cache above
} else if (eth->type == IPv4) {
    remove ethernet header, padding and CRC
    add AAL5 trailer and required padding
    break into cells and send on default Base VC
} else
    Error, drop packet
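
The ARP branch of the receive logic can be sketched as a small function. This follows the order of checks on the slide (the cache is updated before the target address is tested); the dictionary layout, the function name `handle_arp` and the example addresses are assumptions made for illustration.

```python
REQUEST, REPLY = 1, 2

def handle_arp(arp, my_mac, my_ips, arp_table):
    """Process one received ARP message; return a reply dict or None."""
    if arp["has"] != 0x0001 or arp["pas"] != 0x0800:
        return None                        # not Ethernet/IP: drop frame
    arp_table[arp["spa"]] = arp["sha"]     # learn (or refresh) the sender mapping
    if arp["tpa"] not in my_ips:
        return None                        # target IP not ours: drop frame
    if arp["op"] == REQUEST:
        # Swap source and target info; our MAC becomes the sender hardware address.
        return {"has": 0x0001, "pas": 0x0800, "op": REPLY,
                "sha": my_mac, "spa": arp["tpa"],
                "tha": arp["sha"], "tpa": arp["spa"]}
    return None                            # a Reply was already handled by the update

table = {}
request = {"has": 1, "pas": 0x0800, "op": REQUEST,
           "sha": "08:00:20:7C:E3:25", "spa": "192.168.204.2",
           "tha": "00:00:00:00:00:00", "tpa": "192.168.204.1"}
reply = handle_arp(request, "00:00:5E:04:00:01", {"192.168.204.1"}, table)
print(reply)
print(table)
```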

Notes
• Packet received on ATM interface:
  – If received on Base_VC (i.e. 50) then
    • map IP destination (ip->dst_addr) to Ethernet representation.
    • Unicast uses ARP table; multicast and broadcast use appropriate mapping.
  – Otherwise,
    • look up VC in VIN table: table entry index = RX_VC - Base_VC.
    • ARP the resulting next hop IP address.
  – This permits a simple mechanism for “tunneling” traffic to a gateway. This allows us to support directed broadcast and provides a convenient mechanism for testing.
• Packet received on Ethernet interface:
  – if IPv4 then send all (unicast, multicast and broadcast) to input port processor on the Base_VC (i.e. 50)

ARP Cache
• IP Address = Network_Prefix.Host or simply Net.Host
  – Assume a prefix length of at least 24 bits, which leaves 8 bits for the host
  – An interface can have at most 3 unique IP addresses
• Interface may communicate with at most 256 hosts per network
• Implement ARP cache as a table with 768 entries (3 * 256)
• See next slide

VIN Table:
  Entry Number | Prefix Mask | Local IP Address | Next Hop IP Address
  0            | Mask0       | MyIP0            | NH0
  1            | Mask1       | MyIP1            | NH1
  2            | Mask2       | MyIP2            | NH2
  Net 0 = Mask0 & MyIP0
  Net 1 = Mask1 & MyIP1
  Net 2 = Mask2 & MyIP2

ARP Table:
  Network | IP                | Ethernet
  Net 0   | IP0,0 ... IP0,255 | Ether0,0 ... Ether0,255
  Net 1   | IP1,0 ... IP1,255 | Ether1,0 ... Ether1,255
  Net 2   | IP2,0 ... IP2,255 | Ether2,0 ... Ether2,255

Implementing the ARP Table
‘get next packet’:
    // received frame from ATM interface
    if (RX_VC == Base_VC)
        ipdst = ip->dst_addr;
    else
        ipdst = VIN_Table[RX_VC - Base_VC].NextHop;
    // ipdst == IP address of host we must send packet to
    // determine network
    for (i = 0; i < 3; i++) {
        if ((ipdst & Mask[i]) == (MyIP[i] & Mask[i])) {
            index = (i << 8) | (ipdst & ~Mask[i]);
            break;
        }
    }
    if (i == 3)
        drop packet, goto ‘get next packet’
    // i corresponds to the Network Number (0 - 2)
    if (ArpTable[index].EtherAddress != 00:00:00:00:00:00) {
        construct ethernet frame
        send packet
        goto ‘get next packet’
    } else {
        send ARP Request for ipdst
        drop packet, goto ‘get next packet’
    }

VIN Table and ARP Table: laid out as on the ARP Cache slide. The computed index selects the ARP Table row (network number in the high bits, host byte in the low 8 bits), so the table doesn’t need to store the IP address.
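
The index computation can be tried out with a minimal sketch. All addresses and the split into `NETWORKS` and `NEXT_HOPS` are hypothetical; the split is one assumed way to combine the three (mask, local IP) entries with the per-VC next hops shown on the GigE Link Interface slide. The host byte is taken from the address actually being resolved, so traffic for a next hop router indexes the router's entry.

```python
import ipaddress

def ip(text):
    """Dotted-quad string to 32-bit integer."""
    return int(ipaddress.IPv4Address(text))

BASE_VC = 50
MASK24 = ip("255.255.255.0")
# Hypothetical configuration: (prefix mask, local IP) for networks 0 ... 2.
NETWORKS = [(MASK24, ip("10.0.0.1")), (MASK24, ip("10.0.1.1")), (MASK24, ip("10.0.2.1"))]
# Hypothetical next hop per sub-port VC, keyed by RX_VC - BASE_VC.
NEXT_HOPS = {1: ip("10.0.0.254"), 2: ip("10.0.1.254"), 3: ip("10.0.2.254")}

def arp_index(rx_vc, ip_dst):
    """Return the ARP table index (0 ... 767), or None if no local network matches."""
    ipdst = ip_dst if rx_vc == BASE_VC else NEXT_HOPS[rx_vc - BASE_VC]
    for i, (mask, my_ip) in enumerate(NETWORKS):
        if (ipdst & mask) == (my_ip & mask):
            host = ipdst & ~mask & 0xFFFFFFFF  # host bits, at most 8 with a /24
            return (i << 8) | host
    return None  # caller drops the packet

ARP_TABLE = [None] * 768  # None marks an unresolved entry in this sketch

print(arp_index(50, ip("10.0.1.7")))    # directly attached host on network 1
print(arp_index(52, ip("192.0.2.1")))   # tunneled to the next hop on VC 52
print(arp_index(50, ip("172.16.0.1")))  # no matching network
```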

Notes and Issues
• GigE Control Interface for software configuration:
  1. Reset interface to defaults
  2. Clear ARP cache
  3. Read ARP table
  4. Read VIN table
  5. Read Ethernet address
  6. Set VIN table entries and other registers
     • Set Base VC (currently 50)
     • Set entries in the VIN table
     • Add static ARP entries??

Notes and Issues
• Comprehensive testing scenarios need defining
• Verify multicast and broadcast
• VC to control line card

References
• RFC 1122 - Requirements for Internet Hosts
  – Must send and receive using RFC 894 - compliant
  – Should receive RFC 1042 mixed with RFC 894 - we do not
  – May send using RFC 1042 - we do not
  – Must use ARP
  – Must flush out-of-date ARP cache entries - not compliant
  – Must prevent ARP floods - we only try once
  – Should have configurable ARP cache timeout - no
  – Should save at least one (latest) unresolved (by ARP) packet - no
  – Must report broadcasts to IP layer - compliant
  – IP layer must pass TOS to link layer - via the header
  – Must not report no ARP entry as “destination unreachable” - compliant

References
• RFC 826: Address Resolution Protocol
  – Maps <protocol, address> to 48 bit Ethernet address
  – Our processing differs in minor ways
• RFC 1700: Assigned Numbers
  – Ethertype values defined by RFC 1700
  – IP to Ethernet multicast address mapping defined
• RFC 1812: Requirements for IPv4 Routers
  – Must not believe ARP reply if it contains multicast or broadcast address - not compliant
  – Must be compliant with RFC 1122 - partial
• Support Ethernet V2 only
  – RFC 894: IP encapsulation in Ethernet V2 - Supported
  – RFC 1042: IP encapsulation in 802.3 frames - Not Supported