COMS/CSEE 4140 Networking Laboratory
Lecture 06
Salman Abdul Baset
Spring 2008
Announcements
Lab 4 (5-7) due next week before your lab slot
Prelab 5 due next week. There will be Lab 5 next week.
Midterm (March 10th, duration ~1.5 hours)
Assignment 2 issues
  aslookup compilation?
  ISP name: nslookup or whois for IP address
Lab 4 (count-to-infinity issues)
Agenda
Autonomous Systems (AS)
Policy vs. distance based routing
Border gateway protocol (BGP)
Transmission control protocol (TCP)
Autonomous Systems Terminology
local traffic = traffic with source or destination in AS
transit traffic = traffic that passes through the AS
Stub AS = has connection to only one AS, only carries local traffic
Multihomed AS = has connection to >1 AS, but does not carry transit traffic
Transit AS = has connection to >1 AS and carries transit traffic
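The three AS categories above can be written as a small decision rule. This is only a sketch: the function name and its two inputs (number of neighboring ASes, whether transit traffic is carried) are illustrative assumptions, not part of the lecture.

```python
def classify_as(neighbor_count, carries_transit):
    """Classify an AS using the terminology from the slide.

    neighbor_count: number of other ASes this AS connects to (hypothetical input).
    carries_transit: True if the AS carries traffic that neither starts nor ends in it.
    """
    if neighbor_count <= 1:
        # Connected to only one AS: it can only carry local traffic.
        return "stub"
    # Connected to more than one AS: the policy decides the category.
    return "transit" if carries_transit else "multihomed"


print(classify_as(1, False))  # stub
print(classify_as(2, False))  # multihomed
print(classify_as(3, True))   # transit
```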
Stub and Transit Networks
AS 1, AS 2, and AS 5 are stub networks
AS 2 is a multi-homed stub network
AS 3 and AS 4 are transit networks
[Figure: topology of AS 1, AS 2, AS 3, AS 4, and AS 5]
Selective Transit
Example: Transit AS 3 carries traffic between AS 1 and AS 4 and between AS 2 and AS 4
But AS 3 does not carry traffic between AS 1 and AS 2
The example shows a routing policy.
[Figure: AS 1, AS 2, AS 3, and AS 4]
Customer/Provider
A stub network typically obtains access to the Internet through a transit network.
A transit network that is a provider may itself be a customer of another network
Customer pays provider for service
[Figure: customer/provider links among AS 2, AS 4, AS 5, and AS 6]
Customer/Provider and Peers
Transit networks can have a peer relationship
Peers provide transit between their respective customers
Peers do not provide transit between peers
Peers normally do not pay each other for service
[Figure: peer and customer/provider links among AS 1 through AS 6]
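The peering rules above can be read as an export filter: a route is passed on only if the relationship with the neighbor it was learned from and the neighbor it would be sent to allows it. The sketch below is an illustration, not BGP configuration; the function name and relationship labels are hypothetical, and treating provider-learned routes like peer-learned routes is an assumption beyond what the slide states.

```python
def may_export(learned_from, send_to):
    """Decide whether a route learned from one neighbor may be sent to another.

    Both arguments are one of "customer", "peer", "provider" (hypothetical labels).
    """
    if learned_from == "customer":
        # Customer routes are announced to everyone: the customer pays for reachability.
        return True
    # Routes learned from a peer (or, by assumption, a provider) go only to customers,
    # so no transit is provided between two peers.
    return send_to == "customer"


print(may_export("customer", "peer"))   # True: peers reach each other's customers
print(may_export("peer", "peer"))       # False: no transit between peers
print(may_export("peer", "customer"))   # True
```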
Shortcuts through peering
Note that peering reduces upstream traffic
Delays can be reduced through peering
But: Peering may not generate revenue
[Figure: peer and customer/provider links among AS 1 through AS 6, with an additional peer link as a shortcut]
ASNs already assigned
Source: http://www.potaroo.net/tools/asn32/
private ASNs: 64512 – 65534
ASNs in use
ASN projections
Autonomous Routing Domains Don’t Always Need BGP or an ASN
[Figure: Yale University (130.132.0.0/16) connected to Qwest]
Nail up default routes 0.0.0.0/0 pointing to Qwest
Nail up routes 130.132.0.0/16 pointing to Yale
Static routing is the most common way of connecting an autonomous routing domain to the Internet. This helps explain why BGP is a mystery to many …
ARDs versus ASes
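The two "nailed up" static routes can be mimicked with a tiny longest-prefix-match lookup. This is a sketch using Python's standard `ipaddress` module; the table layout, the `lookup` helper, and the `"upstream"` next hop on the Qwest side are assumptions made for illustration.

```python
import ipaddress


def lookup(routes, destination):
    """Return the next hop of the most specific static route matching destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if addr in net]
    if not matches:
        return None
    # Longest prefix wins, as in an ordinary IP forwarding table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Yale side: a single default route pointing to Qwest.
yale_routes = [(ipaddress.ip_network("0.0.0.0/0"), "Qwest")]
# Qwest side: a static route for Yale's prefix ("upstream" is a made-up default).
qwest_routes = [
    (ipaddress.ip_network("130.132.0.0/16"), "Yale"),
    (ipaddress.ip_network("0.0.0.0/0"), "upstream"),
]

print(lookup(yale_routes, "192.0.2.1"))       # Qwest
print(lookup(qwest_routes, "130.132.1.10"))   # Yale
```

Neither side needs BGP or an ASN for this to work, which is the point of the slide.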
ASNs Can Be “Shared” (RFC 2270)
ASN 7046 is assigned to UUNet (AS 701). It is used by customers single-homed to UUNet, but needing BGP for some reason (load balancing, etc.) [RFC 2270]
[Figure: AS 701 (UUNet) connected to three customers that all use AS 7046: Crestar Bank, NJIT, and Hood College; the figure also shows the prefix 128.235.0.0/16]
ARDs and ASes: Summary
Most ARDs have no ASN (statically routed at Internet edge)
Some unrelated ARDs share the same ASN (RFC 2270)
Some ARDs are implemented with multiple ASNs (example: Worldcom)
ASes are just an implementation detail of Inter-domain routing
Why not minimize “AS hop Count”?
[Figure: National ISP1 and National ISP2, Regional ISP1 to ISP3, and customers Cust1 to Cust3; one path is marked YES and another NO]
Shortest path routing is not compatible with commercial relations
Customer versus Provider
Customer pays provider for access to the Internet
[Figure: IP traffic flowing between provider and customer]
The “Peering” Relationship
Peers provide transit between their respective customers
Peers do not provide transit between peers
Peers (often) do not exchange $$$
[Figure: peer/peer and provider/customer links, marking which traffic is allowed and which traffic is NOT allowed]
Peering Provides Shortcuts
Peering also allows connectivity between the customers of “Tier 1” providers.
[Figure: peer/peer and provider/customer links]
Peering Wars
Peer:
  Reduces upstream transit costs
  Can increase end-to-end performance
  May be the only way to connect your customers to some part of the Internet (“Tier 1”)
Don’t Peer:
  You would rather have customers
  Peers are usually your competition
  Peering relationships may require periodic renegotiation
Peering struggles are by far the most contentious issues in the ISP world!
Peering agreements are often confidential.
The Gang of Four
       Link State     Vectoring
IGP    OSPF, IS-IS    RIP
EGP                   BGP
BGP Overview
BGP = Border Gateway Protocol v4. RFC 1771. (~60 pages)
Note: In the context of BGP, a gateway is nothing else but an IP router that connects autonomous systems.
Interdomain routing protocol for routing between autonomous systems.
Uses TCP to establish a BGP session and to send routing messages over the BGP session.
Updates are sent only for new routes.
BGP is a path vector protocol. Routing messages in BGP contain complete routes.
Network administrators can specify routing policies.
BGP Policy-based Routing
Each node is assigned an AS number (ASN)
BGP’s goal is to find any AS-path (not an optimal one). Since the internals of the AS are never revealed, finding an optimal path is not feasible.
Network administrator sets BGP’s policies to determine the best path to reach a destination network.
The Border Gateway Protocol (BGP)
BGP = RFC 1771
+ “optional” extensions: RFC 1997 (communities), RFC 2439 (damping), RFC 2796 (reflection), RFC 3065 (confederation) …
+ routing policy configuration languages (vendor-specific)
+ Current Best Practices in management of Interdomain Routing
BGP was not DESIGNED. It EVOLVED.
BGP Route Processing
1. Receive BGP Updates
2. Apply Import Policies (apply policy = filter routes & tweak attributes)
3. Best Route Selection (based on attribute values)
4. Best Route Table; install forwarding entries for best routes in the IP Forwarding Table
5. Apply Export Policies to the best routes (apply policy = filter routes & tweak attributes)
6. Transmit BGP Updates
Policy is open-ended programming, constrained only by vendor configuration language.
BGP Attributes
Value  Code                   Reference
-----  ---------------------  ---------
1      ORIGIN                 [RFC1771]
2      AS_PATH                [RFC1771]
3      NEXT_HOP               [RFC1771]
4      MULTI_EXIT_DISC        [RFC1771]
5      LOCAL_PREF             [RFC1771]
6      ATOMIC_AGGREGATE       [RFC1771]
7      AGGREGATOR             [RFC1771]
8      COMMUNITY              [RFC1997]
9      ORIGINATOR_ID          [RFC2796]
10     CLUSTER_LIST           [RFC2796]
11     DPA                    [Chen]
12     ADVERTISER             [RFC1863]
13     RCID_PATH / CLUSTER_ID [RFC1863]
14     MP_REACH_NLRI          [RFC2283]
15     MP_UNREACH_NLRI        [RFC2283]
16     EXTENDED COMMUNITIES   [Rosen]
...
255    reserved for development
From IANA: http://www.iana.org/assignments/bgp-parameters
Most important attributes
Not all attributes need to be present in every announcement
LOCAL_PREF Attribute
Forces outbound traffic to take primary link, unless link is down.
NEXT_HOP Attribute
EGP: IP address used to reach the advertising router
IGP: next-hop address is carried into local AS
AS_PATH Attribute
Used to detect routing loops and find shortest paths
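Loop detection with AS_PATH comes down to one membership test: a router rejects any route whose path already contains its own AS number, and prepends its AS number before passing a route on. The sketch below uses hypothetical helper names and plain Python lists in place of real BGP messages.

```python
def accept_route(local_asn, as_path):
    """Reject a route that has already passed through this AS (a routing loop)."""
    return local_asn not in as_path


def advertise(local_asn, as_path):
    """Prepend the local AS number before announcing the route to a neighbor AS."""
    return [local_asn] + as_path


path_from_as2 = advertise(2, [])             # AS 2 originates a prefix
path_via_as1 = advertise(1, path_from_as2)   # AS 1 passes it on

print(path_via_as1)                   # [1, 2]
print(accept_route(3, path_via_as1))  # True: AS 3 is not on the path
print(accept_route(2, path_via_as1))  # False: the route would loop back into AS 2
```

The length of the list is what the "find shortest paths" part of the slide compares.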
Shedding Inbound Traffic with ASPATH Prepending
[Figure: customer AS 2 (192.0.2.0/24) connected to provider AS 1 by a primary and a backup link]
Announced on primary link: 192.0.2.0/24 ASPATH = 2
Announced on backup link: 192.0.2.0/24 ASPATH = 2 2 2
Prepending will (usually) force inbound traffic from AS 1 to take primary link
Yes, this is a Glorious Hack …
… But Padding Does Not Always Work
[Figure: customer AS 2 (192.0.2.0/24) with a primary link to provider AS 1 and a backup link to provider AS 3]
Announced on primary link: 192.0.2.0/24 ASPATH = 2
Announced on backup link: 192.0.2.0/24 ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2
AS 3 will send traffic on “backup” link because it prefers customer routes and local preference is considered before ASPATH length!
Padding in this way is often used as a form of load balancing
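Why padding fails here can be seen in a two-step comparison: highest local preference first, shortest AS path only as a tie-breaker. The sketch below is a simplification of real BGP route selection (which has more steps). The local preference values 100 and 90, and the assumption that AS 3 also learns the prefix through AS 1, are illustrative.

```python
def best_route(candidates):
    """Pick the route with the highest local_pref; break ties by shorter AS path."""
    return max(candidates, key=lambda r: (r["local_pref"], -len(r["as_path"])))


# AS 3's view of 192.0.2.0/24 (values are assumptions for illustration).
via_backup = {"link": "backup", "as_path": [2] * 13, "local_pref": 100}  # customer route
via_as1 = {"link": "via AS 1", "as_path": [1, 2], "local_pref": 90}      # non-customer route

chosen = best_route([via_backup, via_as1])
print(chosen["link"])  # backup: local preference wins despite the padded path
```

With equal local preference the padded route would lose, which is the case where prepending does work.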
COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 (192.0.2.0/24) with a primary link to provider AS 1 and a backup link to provider AS 3]
Announced on primary link: 192.0.2.0/24 ASPATH = 2
Announced on backup link: 192.0.2.0/24 ASPATH = 2 COMMUNITY = 3:70
Customer import policy at AS 3:
  If 3:90 in COMMUNITY then set local preference to 90
  If 3:80 in COMMUNITY then set local preference to 80
  If 3:70 in COMMUNITY then set local preference to 70
AS 3: normal customer local pref is 100, peer local pref is 90
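The import policy at AS 3 is a lookup from community value to local preference. The following sketch uses the values from the slide; the dictionary-based route representation and the function name are assumptions, and real routers express this in a vendor configuration language instead.

```python
# Community values and local preferences as given on the slide.
COMMUNITY_TO_LOCAL_PREF = {"3:90": 90, "3:80": 80, "3:70": 70}
CUSTOMER_DEFAULT_LOCAL_PREF = 100
PEER_LOCAL_PREF = 90


def import_from_customer(route):
    """Return a copy of the route with local_pref set by AS 3's customer import policy."""
    local_pref = CUSTOMER_DEFAULT_LOCAL_PREF
    for community in route.get("communities", []):
        if community in COMMUNITY_TO_LOCAL_PREF:
            local_pref = COMMUNITY_TO_LOCAL_PREF[community]
    return {**route, "local_pref": local_pref}


backup = import_from_customer({"as_path": [2], "communities": ["3:70"]})
print(backup["local_pref"])                    # 70
print(backup["local_pref"] < PEER_LOCAL_PREF)  # True: a peer-learned route is now preferred
```

By tagging the backup announcement with 3:70, the customer asks AS 3 to rank that route below its peer routes, so the backup link is used only when nothing better exists.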
BGP Issues - What is a BGP Wedgie?
BGP policies make sense locally
Interaction of local policies allows multiple stable routings
Some routings are consistent with intended policies, and some are not
If an unintended routing is installed (BGP is “wedged”), then manual intervention is needed to change to an intended routing
When an unintended routing is installed, no single group of network operators has enough knowledge to debug the problem
[Figure: ¾ wedgie and full wedgie]
YouTube blocking
Pakistan blocks YouTube
How? (according to BBC)
  Advertise a shorter route to reach YouTube
  The incorrect short route gets propagated
  Seen by two thirds of the Internet
  Traffic to YouTube goes through Pakistan
  Since Pakistan blocked YouTube, all traffic reaches a dead end!
Dynamic Routing Protocols: Summary
Dynamic routing protocols: RIP, OSPF, BGP
RIP uses a distance vector algorithm and converges slowly (the count-to-infinity problem)
OSPF uses a link state algorithm and converges fast, but it is more complicated than RIP.
Both RIP and OSPF find the lowest-cost path.
BGP uses a path vector algorithm; its path selection algorithm is complicated and is influenced by policies.
BGP has its own problems; see BGP Wedgies by Tim Griffin
More Readings (Optional)
BGP Wedgies: Bad Routing Policy Interactions that Cannot be Debugged
JI’s Intro to interdomain routing.
"Interdomain Setting of PlanetLab Nodes." PlanetLab Meeting, May 14, 2004.
Understanding the Border Gateway Protocol (BGP) ICNP 2002 Tutorial Session
Transmission Control Protocol (RFC)
Reliable and in-order byte-stream service
TCP format
Connection establishment
Flow control
Reaction to congestion
Packet corruption
TCP Format
IP header (20 bytes) | TCP header (20 bytes) | TCP data

bits 0-15                          | bits 16-31
Source Port Number                 | Destination Port Number
Sequence number (32 bits)
Acknowledgement number (32 bits)
header length | 0 | Flags          | window size
TCP checksum                       | urgent pointer
Options (if any)
DATA

• TCP segments have a 20 byte header with >= 0 bytes of data.
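The fixed 20-byte layout above maps directly onto a `struct` format string. The sketch below decodes only the fixed header and the six flag bits named in this lecture; the function name, the returned dictionary, and the sample segment (built from values that appear in the later handshake trace) are illustrative assumptions.

```python
import struct

# Flag bit positions within the low 6 bits of the 16-bit length/flags field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment):
    """Decode the fixed 20-byte TCP header at the start of a raw segment."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # Network byte order: two 16-bit ports, two 32-bit numbers, four 16-bit fields.
    src, dst, seq, ack, len_flags, window, checksum, urgent = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    header_len = (len_flags >> 12) * 4  # header length is stored in 32-bit words
    flags = [name for name, bit in FLAG_BITS.items() if len_flags & bit]
    return {
        "src_port": src, "dst_port": dst, "seq": seq, "ack": ack,
        "header_len": header_len, "flags": flags, "window": window,
        "checksum": checksum, "urgent": urgent, "data": segment[header_len:],
    }


# A hand-built SYN segment: port 1121 to port 23, no options, no data.
sample = struct.pack("!HHIIHHHH", 1121, 23, 1031880193, 0, (5 << 12) | 0x02, 16384, 0, 0)
header = parse_tcp_header(sample)
print(header["src_port"], header["header_len"], header["flags"])  # 1121 20 ['SYN']
```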
TCP header fields
Sequence Number (SeqNo):
  Sequence number is 32 bits long. So the range of SeqNo is 0 <= SeqNo <= 2^32 - 1 ≈ 4.3 Gbyte
  Each sequence number identifies a byte in the byte stream
  Initial Sequence Number (ISN) of a connection is set during connection establishment
  Q: What are possible requirements for ISN?
TCP header fields
Acknowledgement Number (AckNo):
  Acknowledgements are piggybacked, i.e., a segment from A -> B can contain an acknowledgement for data sent in the B -> A direction
  Q: Why is piggybacking good?
  A host uses the AckNo field to send acknowledgements. (If a host sends an AckNo in a segment it sets the “ACK flag”)
  The AckNo contains the next SeqNo that a host wants to receive
  Example: The acknowledgement for a segment with sequence numbers 0-1500 is AckNo=1501
TCP header fields
Acknowledgement Number (cont’d)
  TCP uses the sliding window flow protocol (see CS 457) to regulate the flow of traffic from sender to receiver
  TCP uses the following variation of sliding window:
    no NACKs (Negative ACKnowledgement)
    only cumulative ACKs
  Example: Assume: Sender sends two segments with “1..1500” and “1501..3000”, but receiver only gets the second segment.
  In this case, the receiver cannot acknowledge the second packet. It can only send AckNo=1
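The cumulative ACK rule can be checked with a few lines: the AckNo is the first byte the receiver is still missing, no matter how much later data has arrived. The function below is a sketch with a hypothetical name; byte ranges are inclusive, as in the slide's example.

```python
def cumulative_ack(received, first_byte=1):
    """Return the AckNo for a set of received (first, last) inclusive byte ranges."""
    next_expected = first_byte
    for first, last in sorted(received):
        if first > next_expected:
            # A gap: nothing beyond it can be acknowledged cumulatively.
            break
        next_expected = max(next_expected, last + 1)
    return next_expected


print(cumulative_ack([(1501, 3000)]))             # 1: the first segment is missing
print(cumulative_ack([(1, 1500), (1501, 3000)]))  # 3001: everything arrived
```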
TCP header fields
Header Length (4 bits):
  Length of header in 32-bit words
  Note that TCP header has variable length (with minimum 20 bytes)
TCP header fields
Flag bits:
  URG: Urgent pointer is valid
    If the bit is set, the following bytes contain an urgent message in the range: SeqNo <= urgent message <= SeqNo + urgent pointer
  ACK: Acknowledgement Number is valid
  PSH: PUSH Flag
    Notification from sender to the receiver that the receiver should pass all data that it has to the application.
    Normally set by sender when the sender’s buffer is empty
TCP header fields
Flag bits:
  RST: Reset the connection
    The flag causes the receiver to reset the connection
    Receiver of a RST terminates the connection and notifies the higher-layer application of the reset
  SYN: Synchronize sequence numbers
    Sent in the first packet when initiating a connection
  FIN: Sender is finished with sending
    Used for closing a connection
    Both sides of a connection must send a FIN
TCP header fields
Window Size:
  Each side of the connection advertises the window size
  Window size is the maximum number of bytes that a receiver can accept.
  Maximum window size is 2^16 - 1 = 65535 bytes
TCP Checksum:
  TCP checksum covers both TCP header and TCP data (also covers some parts of the IP header)
  16-bit one’s complement
Urgent Pointer:
  Only valid if URG flag is set
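The "16-bit one's complement" arithmetic behind the checksum can be sketched as follows. This shows only the summation over a byte string; assembling the actual input (the parts of the IP header, the TCP header with a zeroed checksum field, and the data) is left out, and the function name is an assumption.

```python
def ones_complement_checksum(data):
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry out of the low 16 bits back in (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
checksum = ones_complement_checksum(payload)
print(hex(checksum))  # 0x220d

# A receiver sums the data together with the checksum and expects zero.
print(ones_complement_checksum(payload + checksum.to_bytes(2, "big")))  # 0
```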
TCP header fields
Options:
Option                 Layout
End of Options         kind=0 (1 byte)
NOP (no operation)     kind=1 (1 byte)
Maximum Segment Size   kind=2 (1 byte) | len=4 (1 byte) | maximum segment size (2 bytes)
Window Scale Factor    kind=3 (1 byte) | len=3 (1 byte) | shift count (1 byte)
Timestamp              kind=8 (1 byte) | len=10 (1 byte) | timestamp value (4 bytes) | timestamp echo reply (4 bytes)
TCP header fields
Options:
  NOP is used to pad TCP header to multiples of 4 bytes
  Maximum Segment Size
  Window Scale Options
    Increases the TCP window from 16 to 32 bits, i.e., the window size is interpreted differently
    Q: What is the different interpretation?
    This option can only be used in the SYN segment (first segment) during connection establishment time
  Timestamp Option
    Can be used for roundtrip measurements
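One way to read the window scale option: the 16-bit window field stays as it is, and the receiver of the segment shifts it left by the shift count negotiated in the SYN. The sketch below assumes the shift count limit of 14 from the window scale specification; the function name is hypothetical.

```python
MAX_SHIFT_COUNT = 14  # upper limit on the shift count (assumed from the option's specification)


def effective_window(advertised, shift_count):
    """Scale the 16-bit advertised window by the negotiated shift count."""
    if not 0 <= advertised <= 0xFFFF:
        raise ValueError("the window field is 16 bits wide")
    if not 0 <= shift_count <= MAX_SHIFT_COUNT:
        raise ValueError("shift count out of range")
    return advertised << shift_count


print(effective_window(8760, 0))    # 8760: no scaling
print(effective_window(8760, 2))    # 35040: window field counts units of 4 bytes
print(effective_window(65535, 14))  # 1073725440: largest possible window
```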
Three-Way Handshake
aida.poly.edu mng.poly.edu
aida -> mng: S 1031880193:1031880193(0) win 16384 <mss 1460, ...>
mng -> aida: S 172488586:172488586(0) ack 1031880194 win 8760 <mss 1460>
aida -> mng: ack 172488587 win 17520
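The sequence and acknowledgement numbers in the trace follow a fixed pattern: each SYN consumes one sequence number, so each side acknowledges the other's ISN plus one. The sketch below reproduces that arithmetic; the function name and the dictionary representation of segments are illustrative.

```python
SEQ_MODULUS = 2 ** 32  # sequence numbers wrap around at 32 bits


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments for the given initial sequence numbers."""
    syn = {"flags": "SYN", "seq": client_isn}
    syn_ack = {
        "flags": "SYN,ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MODULUS,
    }
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_MODULUS,
        "ack": (server_isn + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(1031880193, 172488586)
print(segments[1]["ack"])  # 1031880194, as in the second line of the trace
print(segments[2]["ack"])  # 172488587, as in the third line of the trace
```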
Why is a Two-Way Handshake not enough?
aida.poly.edu mng.poly.edu
S 15322112354:15322112354(0) win 16384 <mss 1460, ...>
S 172488586:172488586(0) win 8760 <mss 1460>
S 1031880193:1031880193(0) win 16384 <mss 1460, ...>
The red line is a delayed duplicate packet.
When aida initiates the data transfer (starting with SeqNo=15322112355), mng will reject all data.
Will be discarded as a duplicate SYN
TCP Connection Termination
aida.poly.edu mng.poly.edu
mng -> aida: F 172488734:172488734(0) ack 1031880221 win 8733
aida -> mng: . ack 172488735 win 17484
aida -> mng: F 1031880221:1031880221(0) ack 172488735 win 17520
mng -> aida: . ack 1031880222 win 8733
Connection termination with tcpdump
mng.poly.edu.telnet > aida.poly.edu.1121: F 172488734:172488734(0) ack 1031880221 win 8733
aida.poly.edu.1121 > mng.poly.edu.telnet: . ack 172488735 win 17484
aida.poly.edu.1121 > mng.poly.edu.telnet: F 1031880221:1031880221(0) ack 172488735 win 17520
mng.poly.edu.telnet > aida.poly.edu.1121: . ack 1031880222 win 8733
[Figure: aida.poly.edu and mng.poly.edu; aida issues a "telnet mng"]
TCP States in “Normal” Connection Lifetime
Segment                              Active side                  Passive side
                                                                  LISTEN (passive open)
-> SYN (SeqNo = x)                   SYN_SENT (active open)
<- SYN (SeqNo = y, AckNo = x + 1)                                 SYN_RCVD
-> (AckNo = y + 1)                   ESTABLISHED                  ESTABLISHED
-> FIN (SeqNo = m)                   FIN_WAIT_1 (active close)    CLOSE_WAIT (passive close)
<- (AckNo = m + 1)                   FIN_WAIT_2
<- FIN (SeqNo = n)                                                LAST_ACK
-> (AckNo = n + 1)                   TIME_WAIT                    CLOSED
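The "normal" lifetime can be expressed as a transition table keyed by (state, event). This sketch covers only the states on this slide, not the full TCP state machine; the event names are made up for illustration.

```python
# (current state, event) -> next state, for the normal open and close only.
TRANSITIONS = {
    ("CLOSED", "active_open"): "SYN_SENT",
    ("CLOSED", "passive_open"): "LISTEN",
    ("LISTEN", "recv_syn"): "SYN_RCVD",
    ("SYN_SENT", "recv_syn_ack"): "ESTABLISHED",
    ("SYN_RCVD", "recv_ack"): "ESTABLISHED",
    ("ESTABLISHED", "close"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",
    ("CLOSE_WAIT", "close"): "LAST_ACK",
    ("LAST_ACK", "recv_ack"): "CLOSED",
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",
}


def run(state, events):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an unexpected event
        visited.append(state)
    return visited


active = run("CLOSED", ["active_open", "recv_syn_ack", "close",
                        "recv_ack", "recv_fin", "timeout_2msl"])
passive = run("CLOSED", ["passive_open", "recv_syn", "recv_ack",
                         "recv_fin", "close", "recv_ack"])
print(active)
print(passive)
```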
TCP State Transition Diagram
Opening A Connection
[Figure: state transition diagram]
States: CLOSED, LISTEN, SYN RCVD, SYN SENT, ESTABLISHED
Transition labels:
  active open; send: SYN
  recv: SYN, ACK; send: ACK
  recv: SYN; send: SYN, ACK
  recvd: ACK; send: . / .
  recv: RST
  Application sends data; send: SYN
  simultaneous open; recv: SYN; send: SYN, ACK
  close or timeout
  passive open; send: . / .
  recvd: FIN; send: FIN
  send: FIN
TCP State Transition Diagram
Closing A Connection
[Figure: state transition diagram]
States: ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK, CLOSED
Transition labels:
  recv: FIN; send: ACK
  recv: ACK; send: . / .
  recvd: ACK; send: . / .
  recv: FIN, ACK; send: ACK
  active close; send: FIN
  recv: FIN; send: ACK
  Timeout (2 MSL)
  passive close; recv: FIN; send: ACK
  application closes; send: FIN
  recv: ACK; send: . / .
  Issue close()
2MSL Wait State
2MSL Wait State = TIME_WAIT
When TCP does an active close, and sends the final ACK, the connection must stay in the TIME_WAIT state for twice the maximum segment lifetime.
2MSL = 2 * Maximum Segment Lifetime
Why? TCP is given a chance to resend the final ACK. (Server will time out after sending the FIN segment and resend the FIN)
The MSL is set to 2 minutes or 1 minute or 30 seconds.
Rules for sending Acknowledgments
TCP has rules that influence the transmission of acknowledgments
Rule 1: Delayed Acknowledgments
  Goal: Avoid sending ACK segments that do not carry data
  Implementation: Delay the transmission of (some) ACKs
Rule 2: Nagle’s rule
  Goal: Reduce transmission of small segments
  Implementation: A sender cannot send multiple segments with a 1-byte payload (i.e., it must wait for an ACK)
Delayed Acknowledgement
TCP delays transmission of ACKs for up to 200ms
Goal: Avoid sending ACK packets that do not carry data.
  The hope is that, within the delay, the receiver will have data ready to be sent to the sender. Then, the ACK can be piggybacked with a data segment
In Example:
  Delayed ACK explains why the “ACK of character” and the “echo of character” are sent in the same segment
  The duration of delayed ACKs can be observed in the example when Argon sends ACKs
Exceptions:
  ACK should be sent for every second full sized segment
  Delayed ACK is not used when packets arrive out of order
Observing Delayed Acknowledgements
• Remote terminal applications (e.g., Telnet) send characters to a server. The server interprets the character and sends the output at the server to the client.
• For each character typed, you see three packets:
  1. Client -> Server: Send typed character
  2. Server -> Client: Echo of character (or user output) and acknowledgement for first packet
  3. Client -> Server: Acknowledgement for second packet
[Figure: host with Telnet client and host with Telnet server: 1. send character, 2. interpret character, 3. send echo of character and/or output]
Observing Delayed Acknowledgements
[Figure: Telnet session from Argon to Neon]
This is the output of typing 3 (three) characters:
Time 44.062449: Argon -> Neon: Push, SeqNo 0:1(1), AckNo 1
Time 44.063317: Neon -> Argon: Push, SeqNo 1:2(1), AckNo 1
Time 44.182705: Argon -> Neon: No Data, AckNo 2
Time 48.946471: Argon -> Neon: Push, SeqNo 1:2(1), AckNo 2
Time 48.947326: Neon -> Argon: Push, SeqNo 2:3(1), AckNo 2
Time 48.982786: Argon -> Neon: No Data, AckNo 3
Time 55.116581: Argon -> Neon: Push, SeqNo 2:3(1), AckNo 3
Time 55.117497: Neon -> Argon: Push, SeqNo 3:4(1), AckNo 3
Time 55.183694: Argon -> Neon: No Data, AckNo 4
Why 3 segments per character?
We would expect four segments per character:
  character
  ACK of character
  echo of character
  ACK of echoed character
But we only see three segments per character:
  character
  ACK and echo of character
  ACK of echoed character
This is due to delayed acknowledgements
Observing Nagle’s Rule
[Figure: Telnet session between argon.cs.virginia.edu and tenet.cs.berkeley.edu, 3000 miles apart]
This is the output of typing 7 characters:
Time 16.401963: Argon -> Tenet: Push, SeqNo 1:2(1), AckNo 2
Time 16.481929: Tenet -> Argon: Push, SeqNo 2:3(1), AckNo 2
Time 16.482154: Argon -> Tenet: Push, SeqNo 2:3(1), AckNo 3
Time 16.559447: Tenet -> Argon: Push, SeqNo 3:4(1), AckNo 3
Time 16.559684: Argon -> Tenet: Push, SeqNo 3:4(1), AckNo 4
Time 16.640508: Tenet -> Argon: Push, SeqNo 4:5(1), AckNo 4
Time 16.640761: Argon -> Tenet: Push, SeqNo 4:8(4), AckNo 5
Time 16.728402: Tenet -> Argon: Push, SeqNo 5:9(4), AckNo 8
Observing Nagle’s Rule
Observation: Transmission of segments follows a different pattern, i.e., there are only two segments per character typed
Delayed acknowledgment does not kick in at Argon
  The reason is that there is always data at Argon ready to be sent when the ACK arrives
Why is Argon not sending the data (typed character) as soon as it is available?
[Figure: char1, ACK + char2, ACK + char3, ACK + char4-7]
Resetting Connections
Resetting connections is done by setting the RST flag
When is the RST flag set?
  Connection request arrives and no server process is waiting on the destination port
  Abort (Terminate) a connection
    Causes the receiver to throw away buffered data. Receiver does not acknowledge the RST segment
TCP Congestion Control
TCP has a mechanism for congestion control. The mechanism is implemented at the sender
The window size at the sender is set as follows:
  Send Window = MIN (flow control window, congestion window)
where
  flow control window is advertised by the receiver
  congestion window is adjusted based on feedback from the network
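As a small illustration of the send-window rule, the sketch below computes the window in bytes. The function names and the sample values (a 65535-byte advertised window, 1460-byte segments) are assumptions made for the example, and the `usable_window` helper is an added convenience rather than something stated on the slide.

```python
def send_window(flow_control_window: int, congestion_window: int) -> int:
    """Send Window = MIN(flow control window, congestion window), in bytes."""
    return min(flow_control_window, congestion_window)


def usable_window(flow_control_window: int, congestion_window: int,
                  bytes_in_flight: int) -> int:
    """Bytes the sender may still transmit, given unacknowledged bytes in flight.

    This helper is an illustrative addition: it subtracts the data already
    sent but not yet acknowledged from the send window.
    """
    window = send_window(flow_control_window, congestion_window)
    return max(0, window - bytes_in_flight)


# Hypothetical values: the receiver advertises 65535 bytes and the
# congestion window currently holds 4 segments of 1460 bytes each.
window = send_window(65535, 4 * 1460)
remaining = usable_window(65535, 4 * 1460, bytes_in_flight=2920)
print(window, remaining)  # 5840 2920
```

In this example the congestion window is the smaller of the two, so the network, not the receiver, limits the sender.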
![Page 68: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/68.jpg)
68
TCP Congestion Control
TCP congestion control is governed by two parameters:
  Congestion Window (cwnd)
  Slow-start threshold value (ssthresh)
    Initial value is 2^16 - 1
Congestion control works in two modes:
  slow start (cwnd < ssthresh)
  congestion avoidance (cwnd ≥ ssthresh)
![Page 69: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/69.jpg)
69
Slow Start
Initial value: Set cwnd = 1
  Note: Unit is a segment size. TCP actually is based on bytes and increments by 1 MSS (maximum segment size)
The receiver sends an acknowledgement (ACK) for each segment
  Note: Generally, a TCP receiver sends an ACK for every other segment.
Each time an ACK is received by the sender, the congestion window is increased by 1 segment:
  cwnd = cwnd + 1
  If an ACK acknowledges two segments, cwnd is still increased by only 1 segment.
  Even if an ACK acknowledges a segment that is smaller than MSS bytes long, cwnd is increased by 1.
Does Slow Start increment slowly? Not really. In fact, the increase of cwnd is exponential
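To see why the growth is exponential, here is a rough round-by-round sketch. It assumes that a full window of cwnd segments is sent in each round trip and that the receiver returns one ACK per `segments_per_ack` segments (rounded up); both the function name and this round-based model are simplifications introduced for illustration.

```python
import math


def slow_start_rounds(rounds: int, segments_per_ack: int = 1) -> list[int]:
    """cwnd (in segments) at the start of each round trip during slow start."""
    cwnd = 1
    history = [cwnd]
    for _ in range(rounds):
        # Number of ACKs that come back for one window of segments.
        acks = math.ceil(cwnd / segments_per_ack)
        # Slow start: cwnd grows by 1 segment per ACK, however much it covers.
        cwnd += acks
        history.append(cwnd)
    return history


every_segment = slow_start_rounds(4, segments_per_ack=1)
every_other = slow_start_rounds(4, segments_per_ack=2)
print(every_segment)  # [1, 2, 4, 8, 16]
print(every_other)    # [1, 2, 3, 5, 8]
```

With one ACK per segment the window doubles every round trip. With an ACK for every other segment it grows by a factor of about 1.5 per round trip, which is slower but still exponential.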
![Page 70: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/70.jpg)
70
Slow Start Example
The congestion window size grows very rapidly
  For every ACK, we increase cwnd by 1 irrespective of the number of segments ACK’ed
TCP slows down the increase of cwnd when cwnd ≥ ssthresh
[Figure: segment/ACK exchange annotated with cwnd = 1, cwnd = 2, cwnd = 4, cwnd = 7]
![Page 71: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/71.jpg)
71
Congestion Avoidance
Congestion avoidance phase is started if cwnd has reached the slow-start threshold value
If cwnd ≥ ssthresh then each time an ACK is received, increment cwnd as follows:
  cwnd = cwnd + 1 / cwnd
So cwnd is increased by one only if all cwnd segments have been acknowledged.
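A quick numerical check of the per-ACK rule, as a sketch with an assumed starting window of 8 segments: applying cwnd = cwnd + 1/cwnd once per ACK for a full window of 8 ACKs raises cwnd by roughly one segment (slightly less, because cwnd grows a little with every ACK).

```python
def congestion_avoidance(cwnd: float, acks: int) -> float:
    """Apply the congestion-avoidance increment once per received ACK."""
    for _ in range(acks):
        cwnd += 1.0 / cwnd
    return cwnd


# Hypothetical starting point: cwnd = 8 segments, one ACK per segment.
after_one_window = congestion_avoidance(8.0, acks=8)
print(round(after_one_window, 2))  # 8.95
```

This is the linear growth of about one segment per round trip, in contrast to the doubling seen in slow start.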
![Page 72: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/72.jpg)
72
Example of Slow Start/Congestion Avoidance
Assume that ssthresh = 8
[Figure: plot of cwnd (in segments, axis 0 to 14) against round-trip times (t=0 to t=6), with the ssthresh level marked. cwnd takes the values 1, 2, 4, 8, 9, 10 in successive round trips.]
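The sequence in this example can be reproduced with a coarse per-round-trip model, sketched below. It assumes one ACK per segment, so that slow start doubles cwnd each round trip and congestion avoidance adds one segment per round trip. The function name and the cap at ssthresh during slow start are assumptions of the sketch.

```python
def cwnd_per_round(ssthresh: int, rounds: int) -> list[int]:
    """cwnd (in segments) at each round trip, starting from cwnd = 1."""
    cwnd = 1
    history = [cwnd]
    for _ in range(rounds):
        if cwnd < ssthresh:
            # Slow start: double per round trip (capped at ssthresh, an assumption).
            cwnd = min(2 * cwnd, ssthresh)
        else:
            # Congestion avoidance: about one extra segment per round trip.
            cwnd += 1
        history.append(cwnd)
    return history


trace = cwnd_per_round(ssthresh=8, rounds=5)
print(trace)  # [1, 2, 4, 8, 9, 10]
```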
![Page 73: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/73.jpg)
73
Responses to Congestion
So, TCP assumes there is congestion if it detects a packet loss
A TCP sender can detect lost packets via:
  Timeout of a retransmission timer
  Receipt of a duplicate ACK
TCP interprets a timeout as a binary congestion signal. When a timeout occurs, the sender performs:
  ssthresh is set to half the current size of the congestion window:
    ssthresh = cwnd / 2
  cwnd is reset to one:
    cwnd = 1
  and slow-start is entered
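The timeout reaction is short enough to write down directly. The sketch below follows the rule as stated on the slide; the function name is hypothetical, and ssthresh is computed from the window size before the reset, since halving after setting cwnd to 1 would not give "half the current size".

```python
def on_timeout(cwnd: float) -> tuple[int, float]:
    """Return (new_cwnd, new_ssthresh) after a retransmission timeout."""
    ssthresh = cwnd / 2   # half the window size in use when the timeout fired
    new_cwnd = 1          # restart from one segment; slow start is entered
    return new_cwnd, ssthresh


# Hypothetical example: the timeout fires while cwnd is 10 segments.
cwnd, ssthresh = on_timeout(10)
print(cwnd, ssthresh)  # 1 5.0
```

After this, cwnd < ssthresh holds again, so the sender grows the window exponentially up to the new threshold and linearly beyond it.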
![Page 74: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/74.jpg)
74
Fast Retransmit
If three or more duplicate ACKs are received in a row, the TCP sender believes that a segment has been lost.
Then TCP performs a retransmission of what seems to be the missing segment, without waiting for a timeout to happen.
Enter slow start:
  ssthresh = cwnd/2
  cwnd = 1
[Figure: three duplicate ACKs arriving at the sender, labeled 1. duplicate, 2. duplicate, 3. duplicate]
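Detecting the third duplicate ACK amounts to counting repeated acknowledgment numbers. The following is a minimal sketch; the function name and the sample ACK stream are made up for illustration, and it looks only at ACK numbers, ignoring the other conditions a real TCP stack checks.

```python
def fast_retransmit_triggers(ack_numbers, dup_threshold=3):
    """Return (index, ack_no) for every point where fast retransmit would fire.

    The first ACK carrying a given number is the original; each repeat that
    follows it directly counts as one duplicate.
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for i, ack in enumerate(ack_numbers):
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                # Third duplicate: retransmit the segment starting at ack.
                triggers.append((i, ack))
        else:
            last_ack = ack
            dup_count = 0
    return triggers


# Hypothetical ACK stream: one original ACK for 1024, then three duplicates.
acks = [1024, 1024, 1024, 1024, 6144]
triggers = fast_retransmit_triggers(acks)
print(triggers)  # [(3, 1024)]
```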
![Page 75: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/75.jpg)
75
Fast Recovery
Fast recovery avoids slow start after a fast retransmit
  Intuition: Duplicate ACKs indicate that data is getting through
After three duplicate ACKs:
  Retransmit packet that is presumed lost
  ssthresh = cwnd/2
  cwnd = cwnd+3 (note the order of operations)
  Increment cwnd by one for each additional duplicate ACK
When ACK arrives that acknowledges “new data” (here: AckNo=6144), set:
  cwnd = ssthresh
  enter congestion avoidance
[Figure: sender/receiver timeline with 1K segments (SeqNo=0, 1024, 2048, 3072, 4096, 5120). The segment with SeqNo=1024 is presumed lost; the receiver answers with AckNo=1024 and three duplicates of it, after which SeqNo=1024 is retransmitted. Sender state: cwnd=12, ssthresh=5 up to the third duplicate; cwnd=15, ssthresh=6 after the third duplicate; cwnd=6, ssthresh=6 once the ACK for new data (AckNo=6144) arrives.]
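The state changes in the figure (cwnd=12 to cwnd=15 on the third duplicate, then cwnd=6 when new data is acknowledged) can be traced with a small sketch that follows the rules as written on this slide. The class and function names are hypothetical and integer division is an assumption. Note that RFC 5681 states the inflation step as cwnd = ssthresh + 3 segments instead of cwnd + 3, so this sketch mirrors the slide and not the RFC.

```python
from dataclasses import dataclass


@dataclass
class RenoState:
    cwnd: int       # congestion window, in segments
    ssthresh: int   # slow-start threshold, in segments


def on_third_duplicate_ack(state: RenoState) -> None:
    """Fast retransmit happened; enter fast recovery (slide's rule)."""
    state.ssthresh = state.cwnd // 2   # computed first, from the old cwnd
    state.cwnd = state.cwnd + 3        # account for the three duplicate ACKs


def on_additional_duplicate_ack(state: RenoState) -> None:
    """Each further duplicate ACK means one more segment left the network."""
    state.cwnd += 1


def on_new_ack(state: RenoState) -> None:
    """An ACK for new data ends fast recovery; continue in congestion avoidance."""
    state.cwnd = state.ssthresh


state = RenoState(cwnd=12, ssthresh=5)
on_third_duplicate_ack(state)
after_third_dup = (state.cwnd, state.ssthresh)
on_new_ack(state)
after_new_ack = (state.cwnd, state.ssthresh)
print(after_third_dup, after_new_ack)  # (15, 6) (6, 6)
```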
![Page 76: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/76.jpg)
76
Flavors of TCP Congestion Control
TCP Tahoe (1988, 4.3BSD-Tahoe)
  Slow Start
  Congestion Avoidance
  Fast Retransmit
TCP Reno (1990, 4.3BSD-Reno)
  Fast Recovery
New Reno (1996)
SACK (1996)
RED (Floyd and Jacobson 1993)
![Page 77: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/77.jpg)
77
SACK
SACK = Selective acknowledgment
Issue: Reno and New Reno retransmit at most 1 lost packet per round trip time
Selective acknowledgments: The receiver can acknowledge non-contiguous blocks of data (SACK 0-1023, 1024-2047)
  Multiple blocks can be sent in a single segment.
TCP SACK:
  Enters fast recovery upon 3 duplicate ACKs
  Sender keeps track of SACKs and infers if segments are lost.
  Sender retransmits the next segment from the list of segments that are deemed lost.
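How a sender might infer lost data from SACK information can be sketched as a gap search between the cumulative ACK and the reported blocks. The function name and the sample numbers are hypothetical, and blocks are written here as half-open byte ranges (start, end), unlike the inclusive notation used on the slide.

```python
def missing_ranges(cumulative_ack, sack_blocks):
    """Return byte ranges (start, end) not yet reported as received.

    cumulative_ack: next byte the receiver expects in order.
    sack_blocks: half-open (start, end) ranges received out of order.
    """
    holes = []
    expected = cumulative_ack
    for start, end in sorted(sack_blocks):
        if start > expected:
            # Nothing reported between expected and start: presumed missing.
            holes.append((expected, start))
        expected = max(expected, end)
    return holes


# Hypothetical state: bytes up to 1023 are acknowledged in order, and the
# receiver reports two blocks received beyond that point.
holes = missing_ranges(1024, [(2048, 4096), (5120, 6144)])
print(holes)  # [(1024, 2048), (4096, 5120)]
```

With such a list, the sender can retransmit several missing segments within one round trip, which addresses the limitation of Reno and New Reno noted above.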
![Page 78: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/78.jpg)
78
TCP in Linux
Congestion control algorithm is pluggable
  /proc/sys/net/ipv4/tcp_congestion_control
TCP read and write buffer sizes
  /proc/sys/net/ipv4/tcp_rmem and /proc/sys/net/ipv4/tcp_wmem
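On a Linux machine these settings can be inspected by reading the files directly. The helper below is a small sketch with a hypothetical function name; it returns None when a file is not available (for example on a non-Linux system), and the actual values depend on the kernel and its configuration.

```python
from pathlib import Path


def read_tcp_setting(name, base="/proc/sys/net/ipv4"):
    """Return the contents of a TCP sysctl file as a string, or None if unreadable."""
    try:
        return (Path(base) / name).read_text().strip()
    except OSError:
        # File missing or not readable, e.g. when not running on Linux.
        return None


for setting in ("tcp_congestion_control", "tcp_rmem", "tcp_wmem"):
    print(setting, "=", read_tcp_setting(setting))
```

On Linux, tcp_rmem and tcp_wmem each hold three numbers: the minimum, default and maximum buffer size in bytes.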
![Page 79: COMS/CSEE 4140 Networking Laboratory Lecture 06](https://reader036.vdocuments.net/reader036/viewer/2022062309/56815907550346895dc63aea/html5/thumbnails/79.jpg)
79
Midterm questions
ARP, ICMP, UDP, TCP, RIP, OSPF, BGP
Compare and contrast design principles in protocols.
Fragmentation