tcp - part ii
DESCRIPTION
TCP - Part II. Relates to Lab 5. This is an extended module that covers TCP data transport, and flow control, congestion control, and error control in TCP. Interactive and bulk data. TCP applications can be put into the following categories bulk data transfer - ftp, mail, http - PowerPoint PPT PresentationTRANSCRIPT
1
TCP - Part II
Relates to Lab 5. This is an extended module that covers TCP data transport, and flow control, congestion control, and error control in TCP.
2
Interactive and bulk data
TCP applications can be put into the following categoriesbulk data transfer - ftp, mail, httpinteractive data transfer - telnet, rlogin
TCP has algorithms to deal which each type of applications efficiently.
3
tcpdump of an rlogin session
This is the output of typing 3 (three) characters :44.062449 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: P 0:1(1) ack 1 44.063317 neon.cs.virginia.edu.login > argon.cs.virginia.edu.1023: P 1:2(1) ack 1 win 876044.182705 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: . ack 2 win 17520
48.946471 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: P 1:2(1) ack 2 win 1752048.947326 neon.cs.virginia.edu.login > argon.cs.virginia.edu.1023: P 2:3(1) ack 2 win 876048.982786 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: . ack 3 win 17520
55:00.116581 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: P 2:3(1) ack 3 win 1752055:00.117497 neon.cs.virginia.edu.login > argon.cs.virginia.edu.1023: P 3:4(1) ack 3 win 876055:00.183694 argon.cs.virginia.edu.1023 > neon.cs.virginia.edu.login: . ack 4 win 17520
Argon.cs.virginia.edu Neon.cs.virginia.edu
rlogin sessionfrom Argonto Neon
4
Rlogin
• “Rlogin” is a remote terminal application• Originally built only for Unix systems.• Rlogin sends one segment per character (keystroke)• Receiver echoes the character back.
• So, we really expect to have four segments per keystroke
5
Rlogin
• We would expect that tcpdump shows this pattern:
• However, tcpdump shows this pattern:
• So, TCP has delayed the transmission of an ACK
character
ACK of character
ACK of echoed character
echo of character
character
ACK and echo of character
ACK of echoed character
6
Delayed Acknowledgement
• TCP delays transmission of ACKs for up to 200ms• The hope is to have data ready in that time frame. Then, the
ACK can be piggybacked with the data segment.• Delayed ACKs explain why the ACK and the “echo of
character” are sent in the same segment.
7
tcpdump of a wide-area rlogin session
This is the output of typing 9 characters :
54:16.401963 argon.cs.virginia.edu.1023 > tenet.CS.Berkeley.EDU.login: P 1:2(1) ack 2 win 1638454:16.481929 tenet.CS.Berkeley.EDU.login > argon.cs.virginia.edu.1023: P 2:3(1) ack 2 win 1638454:16.482154 argon.cs.virginia.edu.1023 > tenet.CS.Berkeley.EDU.login: P 2:3(1) ack 3 win 1638354:16.559447 tenet.CS.Berkeley.EDU.login > argon.cs.virginia.edu.1023: P 3:4(1) ack 3 win 1638454:16.559684 argon.cs.virginia.edu.1023 > tenet.CS.Berkeley.EDU.login: P 3:4(1) ack 4 win 1638354:16.640508 tenet.CS.Berkeley.EDU.login > argon.cs.virginia.edu.1023: P 4:5(1) ack 4 win 1638454:16.640761 argon.cs.virginia.edu.1023 > tenet.CS.Berkeley.EDU.login: P 4:8(4) ack 5 win 1638354:16.728402 tenet.CS.Berkeley.EDU.login > argon.cs.virginia.edu.1023: P 5:9(4) ack 8 win 16384
argon.cs.virginia.edu tenet.cs.berkeley.edu
rlogin sessionbetween argon.cs.virginia.eduandtenet.cs.berkeley.edu
8
Wide-area Rlogin: Observation 1
• Transmission of segments follows a different pattern.
• The delayed acknowled-gment does not kick in
• Reason is that there is always data at aida when the ACK arrives.
char1
ACK of char 1 + echo of char1
ACK + char2
ACK + echo of char2
9
Wide-area Rlogin: Observation 2
• There are fewer transmissions than there are characters.• Aida never has multiple segments outstanding.• This is due to Nagle’s Algorithm:
Each TCP connection can have only one small (1-byte) segment outstanding that has not been acknowledged.
• Implementation: Send one byte and buffer all subsequent bytes until acknowledgement is received.Then send all buffered bytes in a single segment. (Only enforced if data is arriving from application one byte at a time)
• Nagle’s rule reduces the amount of small segments.The algorithm can be disabled.
10
Flow Control
Congestion ControlError Control
TCP:
11
What is Flow/Congestion/Error Control ?
• Flow Control: Algorithms to prevent that the sender overruns the receiver with information?
• Congestion Control: Algorithms to prevent that the sender overloads the network
• Error Control: Algorithms to recover or conceal the effects from packet losses
The goal of each of the control mechanisms is different.
But the implementation is combined
12
TCP Flow Control
13
TCP Flow Control
• TCP implements sliding window flow control
• Sending acknowledgements is separated from setting the window size at sender.
• Acknowledgements do not automatically increase the window size
• Acknowledgements are cumulative
14
Sliding Window Flow Control
1 2 3 4 5 6 7 8 9 10 11
Advertised window
sent but notacknowledged can be sent
USABLEWINDOW
sent andacknowledged
can't sent
• Sliding Window Protocol is performed at the byte level:
•Here: Sender can transmit sequence numbers 6,7,8.
15
Sliding Window: “Window Closes”
1 2 3 4 5 6 7 8 9 10 11
1 2 3 4 5 6 7 8 9 10 11
Transmit Byte 6
1 2 3 4 5 6 7 8 9 10 11
AckNo = 5, Win = 4is received
• Transmission of a single byte (with SeqNo = 6) and acknowledgement is received (AckNo = 5, Win=4):
16
Sliding Window: “Window Opens”
1 2 3 4 5 6 7 8 9 10 11
1 2 3 4 5 6 7 8 9 10 11
AckNo = 5, Win = 6is received
• Acknowledgement is received that enlarges the window to the right (AckNo = 5, Win=6):
• A receiver opens a window when TCP buffer empties (meaning that data is delivered to the application).
17
Sliding Window: “Window Shrinks”
1 2 3 4 5 6 7 8 9 10 11
1 2 3 4 5 6 7 8 9 10 11
AckNo = 5, Win = 3is received
• Acknowledgement is received that reduces the window from the right (AckNo = 5, Win=3):
• Shrinking a window should not be used
18
Window Management in TCP
• The receiver is returning two parameters to the sender
• The interpretation is:• I am ready to receive new data with
SeqNo= AckNo, AckNo+1, …., AckNo+Win-1
• Receiver can acknowledge data without opening the window• Receiver can change the window size without acknowledging
data
AckNowindow size
(win)32 bits 16 bits
19
Sliding Window: Example
3K
ReceiverBuffer
0 4KSendersends 2Kof data
2K
Sendersends 2Kof data
4K
Sender blocked
20
TCP Congestion Control
21
TCP Congestion Control
• TCP has a mechanism for congestion control. The mechanism is implemented at the sender
• The window size at the sender is set as follows:
where • flow control window is advertised by the receiver• congestion window is adjusted based on feedback from the
network
•Send Window = MIN (flow control window, congestion window)•Send Window = MIN (flow control window, congestion window)
22
TCP Congestion Control
• The sender has two additional parameters:– Congestion Window (cwnd)
Initial value is 1 MSS (=maximum segment size) counted as bytes
– Slow-start threshold Value (ssthresh)Initial value is the advertised window size)
• Congestion control works in two modes:– slow start (cwnd < ssthresh)– congestion avoidance (cwnd >= ssthresh)
23
Slow Start
• Initial value:– cwnd = 1 segment
• Note: cwnd is actually measured in bytes: 1 segment = MSS bytes
• Each time an ACK is received, the congestion window is increased by MSS bytes.
– cwnd = cwnd + MSS – If an ACK acknowledges two segments, cwnd is still increased by only 1
segment.
– Even if ACK acknowledges a segment that is smaller than MSS bytes long, cwnd is increased by MSS.
• Does Slow Start increment slowly? Not really. In fact, the increase of cwnd can be exponential
24
Slow Start Example
• The congestion window size grows very rapidly– For every ACK, we
increase cwnd by 1 irrespective of the number of segments ACK’ed
• TCP slows down the increase of cwnd when cwnd > ssthresh
cwnd =1xMSS
cwnd =2xMSS
cwnd =4xMSS
cwnd =7xMSS
25
Congestion Avoidance
• Congestion avoidance phase is started if cwnd has reached the slow-start threshold value
• If cwnd >= ssthresh then each time an ACK is received, increment cwnd as follows:
• cwnd = cwnd + MSS(MSS/ cwnd)
• So cwnd is increased by one segment (=MSS bytes) only if all segments have been acknowledged.
26
Slow Start / Congestion Avoidance
• Here we give a more accurate version than in our earlier discussion of Slow Start:
If cwnd <= ssthresh then Each time an Ack is received:cwnd = cwnd + MSS
else /* cwnd > ssthresh */
Each time an Ack is received :cwnd = cwnd + MSS. MSS / cwnd
endif
27
Example of Slow Start/Congestion Avoidance
Assume that ssthresh = 8 cwnd = 1
cwnd = 2
cwnd = 4
cwnd = 8
cwnd = 9
cwnd = 10
0
2
4
6
8
10
12
14
Roundtrip times
Cw
nd
(in
seg
men
ts)
ssthresh
28
Responses to Congestion
• Most often, a packet loss in a network is due to an overflow at a congested router (rather than due to a transmission error)
• So, TCP assumes there is congestion if it detects a packet loss
• A TCP sender can detect lost packets via:• Timeout of a retransmission timer• Receipt of a duplicate ACK
• When TCP assumes that a packet loss is caused by congestion it reduces the size of the sending window
29
TCP Tahoe
• Congestion is assumed if sender has timeout or receipt of duplicate ACK
• Each time when congestion occurs, – cwnd is reset to one:
cwnd = MSS– ssthresh is set to half the current size of the congestion
window:ssthressh = cwnd / 2
– and slow-start is entered
30
Slow Start / Congestion Avoidance
• A typical plot of cwnd for a TCP connection (MSS = 1500 bytes) with TCP Tahoe:
31
TCP Error Control
Background on Error Control
TCP Error Control
32
Background: ARQ Error Control
• Two types of errors:– Lost packets– Damaged packets
• Most Error Control techniques are based on:
1. Error Detection Scheme (Parity checks, CRC).
2. Retransmission Scheme.
• Error control schemes that involve error detection and retransmission of lost or corrupted packets are referred to as Automatic Repeat Request (ARQ) error control.
33
Background: ARQ Error Control
All retransmission schemes use all or a subset of the following procedures:
Positive acknowledgments (ACK) Negative acknowledgment (NACK) All retransmission schemes (using ACK, NACK or both) rely on the use of timers
The most common ARQ retransmission schemes are:Stop-and-Wait ARQ Go-Back-N ARQSelective Repeat ARQ
34
Background: ARQ Error Control
• The most common ARQ retransmission schemes:
– Stop-and-Wait ARQ
– Go-Back-N ARQ
– Selective Repeat ARQ
• The protocol for sending ACKs in all ARQ protocols are based on the sliding window flow control scheme
35
Background: Stop-and-Wait ARQ
• Stop-and-Wait ARQ is an addition to the Stop-and-Wait flow control protocol:
• Packets have 1-bit sequence numbers (SN = 0 or 1)• Receiver sends an ACK (1-SN) if packet SN is correctly
received• Sender waits for an ACK (1-SN) before transmitting the next
packet with sequence number 1-SN • If sender does not receive anything before a timeout value
expires, it retransmits packet SN
36
Background: Stop-and-Wait ARQ
Packet 1
• Lost Packet
A
B
AC
K 0
Packet 0
Timeout
AC
K 1
Packet 1
Packet 1 A
CK
0
37
Background: Go-Back-N ARQ
Operations:
– A station may send multiple packets as allowed by the window size
– Receiver sends a NAK i if packet i is in error. After that, the receiver discards all incoming packets until the packet in error was correctly retransmitted
– If sender receives a NAK i it will retransmit packet i and all packets i+1, i+2,... which have been sent, but not been acknowledged
38
Example of Go-Back-N ARQ
BA2
1
3
2
packetsreceived
packets waitingfor ACK/NAK
3
ACK2
packet 1 is received, send ACK 2
BA3
2
4
3
4
1
1
2
4
3
BA24
Time out for Packet 2retransmit frame 2,3,4
13
• In Go-back-N, if packets are correctly delivered, they are delivered in the correct sequence
• Therefore, the receiver does not need to keep track of `holes’ in the sequence of delivered packets
39
Background: Go-Back-N ARQ
Packet 0
• Lost Packet
A
B
Packet 2
Packet 4
Packet 1
Packet 3
AC
K 3
Packet 5
Packet 6
Packet 4
Packet 5
Packet 6
Packets 4,5,6are
retransmitted
AC
K 6
Packets 5 and 6are discarded
Timeoutfor Packet 4
40
Background: Selective-Repeat ARQ
• Similar to Go-Back-N ARQ. However, the sender only retransmits packets for which a time-out occured is received
• Advantage over Go-Back-N: – Fewer Retransmissions.
• Disadvantages: – More complexity at sender and receiver
– Each packet must be acknowledged individually (no cumulative acknowledgements)
– Receiver may receive packets out of sequence
41
Example of Selective-Repeat ARQ
BA2
1
3
2
Framesreceived
Packets waitingfor ACK/NAK
3
ACK2Packet is correct, send ACK 2Packet 2 does not arrive
BA3
2
4
3
4
ACK4
Following packets are Acked
1
1
2
4
3
BA2
4Timeout for packet 2:retransmit only packet2
1
3
5
5
ACK5
Receiver must keep track of `holes’ in the sequence of delivered packets
Sender must maintain one timer per outstanding packet
42
Timeout for Packet 4:only Packet 4
is retransmitted
Background: Selective-Repeat ARQ
Packet 0
• Lost Packet
A
B
Packet 2
Packet 1
Packet 3
AC
K 2
Packet 5
Packet 6
AC
K 6
Packet 4
Packet 7
Packet 0
AC
K 7
AC
K 1
AC
K 3
AC
K 4
Packet 4
Packets 5 and 6are buffered
AC
K 5
AC
K 0
AC
K 1
43
Error Control in TCP
• TCP implements a variation of the Go-back-N retransmission scheme
• TCP maintains a Retransmission Timer for each connection:– The timer is started during a transmission. A timeout
causes a retransmission
• TCP couples error control and congestion control (i.e., it assumes that errors are caused by congestion)
• TCP allows accelerated retransmissions (Fast Retransmit)
44
TCP Retransmission Timer
• Retransmission Timer:– The setting of the retransmission timer is crucial for
efficiency– Timeout value too small results in unnecessary
retransmissions– Timeout value too large long waiting time before
a retransmission can be issued
– A problem is that the delays in the network are not fixed – Therefore, the retransmission timers must be adaptive
45
Round-Trip Time Measurements
• The retransmission mechanism of TCP is adaptive • The retransmission timers are set based on round-trip time
(RTT) measurements that TCP performs
RTT #1
RTT #2
RTT #3
The RTT is based on time difference between segment transmission and ACKBut:
TCP does not ACK each segmentEach connection has only one timer
46
Round-Trip Time Measurements
• Retransmission timer is set to a Retransmission Timeout (RTO) value.
• RTO is calculated based on the RTT measurements. • The RTT measurements are smoothed by the following
estimators srtt and rttvar:
srttn+1 = RTT + (1- ) srttn
rttvarn+1 = ( | RTT - srttn+1 | ) + (1- ) rttvarn
RTOn+1 = srttn+1 + 4 rttvarn+1
• The gains are set to =1/4 and =1/8
• srtt0 = 0 sec, rttvar0 = 3 sec, Also: RTO1 = srtt1 + 2 rttvar1
47
Karn’s Algorithm
• If an ACK for a retransmitted segment is received, the sender cannot tell if the ACK belongs to the original or the retransmission.
Timeout !
RT
T ? R
TT
?
Karn’s Algorithm:Don’t update srtt on any segments that have been retransmitted.Each time when TCP retransmits, it sets:RTOn+1 = min( 2 RTOn, 64) (exponential backoff)
48
Measuring TCP Retransmission Timers
•Transfer file from Argon to neon
• Unplug Ethernet of Argon cable in the middle of file transfer
argon.tcpip-lab.edu("Argon")
neon.tcpip-lab.edu("Neon")
Transfer file
Web client Web server
49
Interpreting the Measurements
• The interval between retransmission attempts in seconds is:
1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
• Time between retrans-missions is doubled each time (Exponential Backoff Algorithm)
• Timer is not increased beyond 64 seconds
• TCP gives up after 13th attempt and 9 minutes.
0
100
200
300
400
500
600
Se
con
ds
0 2 4 6 8 10 12Transmission Attempts