ECE/CS 372 – Introduction to Computer Networks, Lecture 7
Chapter 3, slide: 1
ECE/CS 372 – Introduction to Computer Networks
Lecture 7
Announcements:
HW1 is due today
LAB2 is due tomorrow
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 2
Chapter 3: Transport Layer
Our goals:
- understand principles behind transport layer services:
  - reliable data transfer
  - flow control
  - congestion control
- learn about transport layer protocols in the Internet:
  - UDP: connectionless transport
  - TCP: connection-oriented transport
  - TCP congestion control
Chapter 3, slide: 3
Transport services and protocols
- provide logical communication between app processes running on different hosts
- transport protocols run in end systems:
  - send side: breaks app messages into segments, passes them to the network layer
  - receive side: reassembles segments into messages, passes them to the app layer
- more than one transport protocol is available to apps; Internet: TCP and UDP
[Figure: protocol stacks (application, transport, network, data link, physical) on two end hosts, with logical end-end transport between the two transport layers]
Chapter 3, slide: 4
Transport vs. network layer
- network layer: logical communication between hosts
- transport layer: logical communication between processes

Household case: 12 kids in an East-coast house sending letters to 12 kids in a West-coast house
- Ann is responsible for the East-coast house
- Bill is responsible for the West-coast house
- the postal service carries letters between the houses

Household analogy:
- kids = processes
- letters = messages
- houses = hosts
- home address = IP address
- kid names = port numbers
- Ann and Bill = transport protocol
- postal service = network-layer protocol
Chapter 3, slide: 5
Internet transport-layer protocols
- reliable, in-order delivery (TCP): congestion control, flow control, connection setup
- unreliable, unordered delivery (UDP): no-frills extension of “best-effort” IP
- services not available: delay guarantees, bandwidth guarantees
[Figure: end-host protocol stacks with several routers (network, data link, physical layers only) in between; logical end-end transport runs between the two transport layers]
Chapter 3, slide: 6
UDP: User Datagram Protocol [RFC 768]
- “best effort” service: UDP segments may be lost or delivered out of order to the app
- connectionless:
  - no handshaking between UDP sender and receiver
  - each UDP segment handled independently of the others

Why is there a UDP?
- less delay: no connection establishment (which can add delay)
- simple: no connection state at sender or receiver
- less traffic: small segment header
- no congestion control: UDP can blast away as fast as desired
Chapter 3, slide: 7
UDP: more
- often used for streaming multimedia apps: loss tolerant, rate sensitive
- other UDP uses: DNS, SNMP
- reliable transfer over UDP: add reliability at the application layer (application-specific error recovery!)

UDP segment format (32-bit-wide rows):
- source port # | dest port #
- length | checksum   (length = length, in bytes, of the UDP segment, including header)
- application data (message)
Chapter 3, slide: 8
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in the transmitted segment.

Sender:
- treats segment contents as a sequence of 16-bit integers
- checksum: addition (1’s complement sum) of segment contents
- sender puts the checksum value into the UDP checksum field

Receiver:
- computes the checksum of the received segment
- checks if the computed checksum equals the checksum field value:
  - NO: error detected
  - YES: no error detected. But maybe errors nonetheless? …
Chapter 3, slide: 9
Internet Checksum Example
Note: when adding numbers, a carryout from the most significant bit needs to be added back into the result (wraparound).

Example: add two 16-bit integers:

        1110011001100110
      + 1101010101010101
      ------------------
     1  1011101110111011   (17-bit result: wraparound, add the carry back in)
      ------------------
        1011101110111100   sum
        0100010001000011   checksum (1's complement of the sum)
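The example above can be checked mechanically. A small sketch (the helper names are my own, not from the slides) of 1's-complement addition with wraparound and the resulting checksum:

```python
def ones_complement_add16(a, b):
    """Add two 16-bit words; fold any carry out of bit 15 back in (wraparound)."""
    s = a + b
    while s >> 16:
        s = (s & 0xFFFF) + (s >> 16)
    return s

def internet_checksum(words):
    """1's complement of the 1's-complement sum of 16-bit words."""
    total = 0
    for w in words:
        total = ones_complement_add16(total, w)
    return ~total & 0xFFFF

w1 = 0b1110011001100110   # the two integers from the example above
w2 = 0b1101010101010101
print(f"{ones_complement_add16(w1, w2):016b}")   # 1011101110111100
print(f"{internet_checksum([w1, w2]):016b}")     # 0100010001000011
```

At the receiver, adding all words plus the checksum with the same wraparound rule yields 0xFFFF when no error is detected.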
Chapter 3, slide: 10
Chapter 3 outline
Principles of reliable data transfer
Connection-oriented transport: TCP
Principles of congestion control
TCP congestion control
Chapter 3, slide: 11
Principles of reliable data transfer
- important in app., transport, and link layers; top-10 list of important networking topics!
- the characteristics of the unreliable channel will determine the complexity of the reliable data transfer protocol (rdt)
Chapter 3, slide: 14
Reliable data transfer: getting started
Interfaces (send side / receive side):
- rdt_send(): called from above (e.g., by the app); passes data to deliver to the receiver's upper layer
- udt_send(): called by rdt, to transfer a packet over the unreliable channel to the receiver
- rdt_rcv(): called when a packet arrives on the receive side of the channel
- deliver_data(): called by rdt to deliver data to the upper layer
Chapter 3, slide: 15
Reliable data transfer: getting started
We will:
- incrementally develop the sender and receiver sides of a reliable data transfer protocol (rdt)
- consider only unidirectional data transfer
- use finite state machines (FSMs) to specify sender and receiver

FSM notation: a transition from state1 to state2 is labeled with the event causing the transition and the actions taken on the transition (event / actions). When in a given "state", the next state is uniquely determined by the next event.
Chapter 3, slide: 16
rdt1.0: reliable transfer over a reliable channel
- underlying channel perfectly reliable: no bit errors, no loss of packets
- separate FSMs for sender and receiver:
  - sender sends data into the underlying channel
  - receiver reads data from the underlying channel

sender FSM (single state, "wait for call from above"):
  rdt_send(data): packet = make_pkt(data); udt_send(packet)

receiver FSM (single state, "wait for call from below"):
  rdt_rcv(packet): extract(packet, data); deliver_data(data)
Chapter 3, slide: 17
rdt2.0: channel with bit errors
- underlying channel may flip bits in a packet; the receiver can detect bit errors (e.g., using a checksum). But still no packet loss.
- questions: (1) how does the sender know, and (2) what does it do, when a packet is erroneous?
- acknowledgements:
  - positive ack (ACK): receiver tells sender that the pkt was received OK
  - negative ack (NAK): receiver tells sender that the pkt had errors
- retransmission: sender retransmits the pkt on receipt of a NAK
- new mechanisms in rdt2.0 (beyond rdt1.0): error detection; receiver feedback: control msgs (ACK, NAK) rcvr -> sender; assume ACK/NAK are error free
Chapter 3, slide: 19
rdt2.0: operation with no errors

sender FSM:
- state "wait for call from above", on rdt_send(data):
    sndpkt = make_pkt(data, checksum); udt_send(sndpkt)  -> go to "wait for ACK or NAK"
- state "wait for ACK or NAK":
  - on rdt_rcv(rcvpkt) && isNAK(rcvpkt): udt_send(sndpkt)  (stay)
  - on rdt_rcv(rcvpkt) && isACK(rcvpkt): no action; back to "wait for call from above"

receiver FSM (state "wait for call from below"):
- on rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NAK)
- on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt): extract(rcvpkt, data); deliver_data(data); udt_send(ACK)
Chapter 3, slide: 20
rdt2.0: error scenario
(same FSM as on the previous slide, now tracing the path where the receiver detects a corrupt packet and sends a NAK, and the sender retransmits sndpkt)
Chapter 3, slide: 21
rdt2.0 has a fatal flaw!

What happens if the ACK/NAK is corrupted? That is, the sender receives a garbled ACK/NAK:
- the sender doesn't know what happened at the receiver!
- can't the sender just retransmit? Sure: the sender retransmits the current pkt if the ACK/NAK is garbled. Any problem with this??
- Problem: duplicates. The receiver doesn't know whether a received pkt is a retransmit or a new pkt.

Handling duplicates:
- sender adds a sequence number to each pkt
- receiver discards (doesn't deliver up) a duplicate pkt

stop and wait: the sender sends one packet, then waits for the receiver's response
Chapter 3, slide: 22
rdt2.1: sender, handles garbled ACK/NAKs

- state "wait for call 0 from above", on rdt_send(data):
    sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt)  -> "wait for ACK or NAK 0"
- state "wait for ACK or NAK 0":
  - on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt)  (stay)
  - on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): -> "wait for call 1 from above"
- state "wait for call 1 from above", on rdt_send(data):
    sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt)  -> "wait for ACK or NAK 1"
- state "wait for ACK or NAK 1":
  - on rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || isNAK(rcvpkt)): udt_send(sndpkt)  (stay)
  - on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt): -> "wait for call 0 from above"
Chapter 3, slide: 23
rdt2.1: receiver, handles garbled ACK/NAKs

- state "wait for 0 from below":
  - on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt):
      extract(rcvpkt, data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)  -> "wait for 1 from below"
  - on rdt_rcv(rcvpkt) && corrupt(rcvpkt):
      sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt)  (stay)
  - on rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt)  (duplicate):
      sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt)  (stay)
- state "wait for 1 from below": symmetric, with the seq 0 and seq 1 roles swapped
Chapter 3, slide: 24
rdt2.1: discussion
Sender:
- seq # added to each pkt; two seq. #'s (0, 1) will suffice. Why?
- must check if the received ACK/NAK is corrupted
- twice as many states: the state must "remember" whether the "current" pkt has seq # 0 or 1

Receiver:
- must check if the received packet is a duplicate; the state indicates whether 0 or 1 is the expected pkt seq #
- note: the receiver cannot know if its last ACK/NAK was received OK at the sender
Chapter 3, slide: 25
rdt2.2: a NAK-free protocol
- do we really need NAKs?
- instead of a NAK, the receiver sends an ACK for the last pkt received OK; the receiver must explicitly include the seq # of the pkt being ACKed
- a duplicate ACK at the sender results in the same action as a NAK: retransmit the current pkt
- rdt2.2: same functionality as rdt2.1, using ACKs only
Chapter 3, slide: 26
rdt3.0: channels with errors and loss

New assumption: the underlying channel can also lose packets (data or ACKs)
- checksums, seq. #'s, ACKs, and retransmissions will help, but are not enough
- What else is needed?

Approach: a timeout policy
- the sender waits a "reasonable" amount of time for an ACK, and retransmits if no ACK is received in this time
- if the pkt (or ACK) was just delayed (not lost), the retransmission will be a duplicate, but the use of seq. #'s already handles this
- the receiver must specify the seq # of the pkt being ACKed
- requires a countdown timer
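The timeout-and-retransmit idea can be sketched as a toy simulation (my own construction, not the book's FSM: losses are random, and a lost packet or ACK simply counts as a timeout that triggers a retransmission):

```python
import random

def make_pkt(seq, data):
    return {"seq": seq, "data": data}

def rdt3_send(messages, loss_prob=0.3, max_tries=200, rng=None):
    """Alternating-bit (stop-and-wait) sender/receiver over a channel that
    can lose data packets or ACKs; each loss is treated as a timeout and
    the same packet is retransmitted."""
    rng = rng or random.Random(0)   # fixed seed: reproducible run
    delivered = []                  # what the receiver passes up to its app
    expected = 0                    # receiver: seq # it is waiting for
    seq = 0                         # sender: current seq # (0 or 1)
    for data in messages:
        for _ in range(max_tries):
            if rng.random() < loss_prob:       # data pkt lost -> timeout
                continue
            pkt = make_pkt(seq, data)
            if pkt["seq"] == expected:         # new pkt: deliver, flip expected
                delivered.append(pkt["data"])
                expected ^= 1
            # duplicates (retransmits) are re-ACKed but not re-delivered
            if rng.random() < loss_prob:       # ACK lost -> timeout, resend
                continue
            break                              # ACK received: next message
        seq ^= 1
    return delivered

print(rdt3_send(["a", "b", "c", "d"]))   # ['a', 'b', 'c', 'd'] despite losses
```

Sequence numbers are what make the retransmitted duplicates harmless: the receiver ACKs them but does not deliver them up twice.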
Chapter 3, slide: 27
ECE/CS 372 – Introduction to Computer Networks
Lecture 8
Announcements:
LAB2 is due today
HW2 is posted and is due Tuesday next week
LAB3 is posted and is due Tuesday next week
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 28
rdt3.0 in action (still stop-and-wait with (0,1) sequence numbers)
Chapter 3, slide: 30
Performance of rdt3.0: stop-and-wait

[Timing diagram: first packet bit transmitted at t = 0; last packet bit transmitted at t = L/R; first and last packet bits arrive at the receiver, which sends an ACK; the ACK arrives and the next packet is sent at t = RTT + L/R]

rdt3.0 works, but performance stinks. Example: R = 1 Gbps, RTT = 30 ms, L = 1000-byte packet:

    T_transmit = L (packet length in bits) / R (transmission rate, bps)
               = (8 x 10^3 bits/pkt) / (10^9 bits/sec) = 8 microseconds
Chapter 3, slide: 31
Performance of rdt3.0: stop-and-wait (cont'd)

Same example: R = 1 Gbps, RTT = 30 ms, L = 1000-byte packet.

U_sender: utilization, the fraction of time the sender is busy sending:

    U_sender = (L/R) / (RTT + L/R) = 0.008 / 30.008 = 0.00027

(times in milliseconds: L/R = 0.008 ms, RTT = 30 ms)
Chapter 3, slide: 32
Performance of rdt3.0: stop-and-wait (cont'd)

With the same example (R = 1 Gbps, RTT = 30 ms, L = 1000 bytes): one 1 KB pkt every ~30 msec -> about 33 kB/sec throughput over a 1 Gbps link. The network protocol limits the use of the physical resources!
Chapter 3, slide: 33
Pipelining: increased utilization

[Timing diagram: the sender transmits three packets back-to-back starting at t = 0; the ACKs for the 1st, 2nd, and 3rd packets arrive one after another; the next batch is sent at t = RTT + L/R]

    U_sender = (3 * L/R) / (RTT + L/R) = 0.024 / 30.008 = 0.0008

Pipelining three packets increases utilization by a factor of 3!

Question: what is the link utilization U_sender in general?
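The slide's utilization formula is easy to evaluate for any window size (a sketch; the cap at 1 is my addition, for windows large enough to keep the pipe full):

```python
def utilization(L_bits, R_bps, rtt_s, window=1):
    """U_sender = (window * L/R) / (RTT + L/R), capped at 1 when the
    window is large enough to fill the pipe."""
    t_tx = L_bits / R_bps                       # transmission time of one pkt
    return min(1.0, window * t_tx / (rtt_s + t_tx))

L, R, RTT = 8000, 1e9, 0.030        # 1000-byte packet, 1 Gbps link, 30 ms RTT
print(round(utilization(L, R, RTT), 5))            # 0.00027 (stop-and-wait)
print(round(utilization(L, R, RTT, window=3), 4))  # 0.0008  (3 pkts in flight)
```

Raising the window until `window * L/R` approaches `RTT + L/R` is exactly what pipelined protocols do.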
Chapter 3, slide: 34
So far: rdt3.0
- acknowledgment and retransmission: reliable delivery
- sequence numbers: duplicate detection
- timeout and retransmit: deals with packet losses
- stop-and-wait: one packet at a time
- Problem: efficiency
Chapter 3, slide: 35
Pipelined protocols
Pipelining: the sender allows multiple "in-flight", yet-to-be-ACKed pkts
- What about the range of sequence numbers then?
- What about buffering at the receiver?

Two generic forms of pipelined protocols: Go-Back-N and selective repeat
Chapter 3, slide: 36
Go-Back-N

Sender side:
- N pkts can be sent without being ACKed ("sliding window")
- single timer, for the oldest non-ACKed pkt
- at timeout(n): retransmit pkt n and all successive pkts

Receiver side:
- cumulative ACK
- discard out-of-order packets: no buffering
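The Go-Back-N rules above can be condensed into a small simulation (my own simplification: one batch of sends per window position, no explicit timer — a "timeout" is just the next round of sends):

```python
def go_back_n(num_pkts, window, lose_once=frozenset()):
    """Go-Back-N sender and receiver in one loop. `lose_once` holds seq
    numbers whose FIRST transmission is lost; on timeout the sender
    retransmits the base pkt and all successive in-window pkts."""
    base = 0                 # oldest unACKed packet
    expected = 0             # receiver: next in-order seq wanted
    delivered = []
    lost = set(lose_once)
    sent_count = 0
    while base < num_pkts:
        # send (or resend) every packet currently allowed by the window
        for seq in range(base, min(base + window, num_pkts)):
            sent_count += 1
            if seq in lost:
                lost.discard(seq)        # lose only the first transmission
                continue
            if seq == expected:          # in-order: deliver it
                delivered.append(seq)
                expected += 1
            # out-of-order packets are simply discarded (no buffering)
        base = expected                  # cumulative ACK advances the base
    return delivered, sent_count

pkts, sent = go_back_n(6, window=3, lose_once={1})
print(pkts, sent)   # [0, 1, 2, 3, 4, 5] 8
```

Losing pkt 1 forces pkts 2 and 3 to be resent as well (8 sends instead of 6), which is exactly the cost selective repeat avoids.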
Chapter 3, slide: 37
Selective repeat in action
[Figure: a selective-repeat exchange trace; at several points the slide asks "GBN?", i.e., how Go-Back-N would behave at the same point]
Chapter 3, slide: 38
Selective Repeat vs. Go-Back-N: comparison

Selective Repeat:
- sliding window, N pkts on the fly
- selective ACK
- one timer for each pkt
- buffers out-of-order pkts
- retransmits only the lost or delayed pkt at timeout

Go-Back-N:
- sliding window, N pkts on the fly
- cumulative ACK
- single timer
- discards out-of-order pkts
- retransmits all successive pkts at timeout
Chapter 3, slide: 39
Selective repeat: dilemma. Example: seq #'s: 0, 1, 2, 3; window size = 3
Chapter 3, slide: 40
Selective repeat: dilemma. Example: seq #'s: 0, 1, 2, 3; window size = 3
- the receiver sees no difference in the two scenarios! Even though (a) is a retransmitted pkt and (b) is a new pkt
- in (a), the receiver incorrectly passes old data up as new
- Q: what relationship between seq # space size and window size avoids this duplication problem?
Duplication problem: illustration
- let's assume window size W = 2 for illustration
- consider a scenario where only ACK 1 is lost
- now let's assume the seqNbr space n = W = 2; that is, the seqNbr pattern is 0,1,0,1,0,1,…
Chapter 3, slide: 41
Duplication problem: illustration
Scenario: n=W=2; only ACK 1 is lost
Chapter 3, slide: 42
The receiver treats the retransmission of the 2nd pkt (Pkt1, SN=1) as a 4th, new pkt (Pkt3, SN=1): duplication detection problem!
Duplication problem: illustration
Recap:
- when window size W = seqNbr space n, there is a duplication problem: the receiver can't tell a new transmission from a retransmission of a lost pkt
- let's now see what happens if we increase n by 1: now n = 3 and the seqNbr pattern is 0,1,2,0,1,2,0,…
- let's revisit the scenario again
Chapter 3, slide: 43
Duplication problem: illustration
Scenario: n=3;W=2; ACK 1 is lost
Chapter 3, slide: 44
The receiver still treats the retransmission of the 2nd pkt (Pkt1, SN=1) as a new pkt (a 4th, with SN=1). Increasing n by 1 didn't solve the duplication detection problem this time!
Duplication problem: illustration
Recap:
- increasing the seqNbr space n from 2 to 3 didn't solve the duplication problem yet!
- let's now see what happens if we again increase n by 1: now n = 4 and the seqNbr pattern is 0,1,2,3,0,1,2,3,0,…
- let's revisit the scenario with n = 4 and see
Chapter 3, slide: 45
Duplication problem: illustration
Scenario: n=4;W=2; ACK 1 is lost
Chapter 3, slide: 46
The receiver now drops this retransmission of the 2nd pkt (Pkt1, SN=1), since its SN falls outside the receiver's window. The duplication detection problem is solved when n = 4, W = 2.
Duplication problem: illustration
Recap: increasing the seqNbr space n from 2 to 4 did solve the duplication problem.
That is, when the seqNbr space is twice the window size (n = 2W), the duplication problem was solved. Hence, in this case, when W <= n/2 the problem is solved.
Can we generalize? Yes: W <= n/2 must hold in order to avoid the duplication problem.
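The worst case analyzed in the following slides (all ACKs lost) can be checked exhaustively for small n and W (a sketch; the function name is mine):

```python
def dup_ambiguity(n, W):
    """Worst case: the sender sends seq #s [0, W-1], the receiver delivers
    them all and slides its window forward (mod n), and every ACK is lost.
    Ambiguity = some retransmitted seq # falls inside the receiver's new
    window, so a retransmission looks like a new packet."""
    resent = set(range(W))                        # seq #s retransmitted
    expected = {(W + i) % n for i in range(W)}    # receiver's new window
    return bool(resent & expected)

for n in (2, 3, 4):
    print(f"n={n}, W=2: ambiguous? {dup_ambiguity(n, 2)}")  # True, True, False

# The general rule: ambiguity occurs exactly when W > n/2
assert all(dup_ambiguity(n, W) == (W > n // 2)
           for n in range(2, 20) for W in range(1, n))
```

The exhaustive check over n up to 19 agrees with the slides' conclusion: the problem disappears exactly when W <= n/2.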
Chapter 3, slide: 47
SeqNbr space n & window size W relationship: general case

Sender: suppose n/2 < W < n, and that W segments with seqNbr in [0, W-1] have been sent, but their ACKs have not been received yet.

[Figure: seqNbr space 0 … n-1, with the sender's sliding window over [0, W-1]: W pkts already sent, ACKs not yet received]

Receiver: now assume:
- the receiver received all W segments (seqNbr in [0, W-1])
- the receiver sent all W ACKs, one for each received segment
- the receiver slides its window (the seqNbrs of expected segments) forward to [W, n-1] U [0, x], where x = 2W-n-1
  (hint: the window size stays W, so (n-1-W+1) + (x-0+1) = W, giving x = 2W-n-1)

[Figure: seqNbr space 0 … n-1, with the receiver's expected window covering [W, n-1] and wrapping around to [0, x = 2W-n-1]]
Chapter 3, slide: 49
SeqNbr space n & window size W relationship: general case (cont'd)

Consider the worst-case scenario: all W ACKs are lost.
- the sender will retransmit all of the first W segments, i.e., those with seqNbr in [0, W-1]
- but the receiver is expecting segments with seqNbr in [W, n-1] U [0, 2W-n-1]
- retransmissions whose seqNbr falls within the overlap [0, 2W-n-1] will be interpreted by the receiver as new transmissions: duplication problem
- to avoid the duplication problem, we must therefore have 2W-n-1 <= -1, i.e., W <= n/2

[Figure: seqNbr space showing the seqNbrs retransmitted by the sender, the seqNbrs expected by the receiver, and their overlap [0, 2W-n-1]]
Chapter 3, slide: 51
ECE/CS 372 – Introduction to Computer Networks
Lecture 9
Announcements:
HW2 and LAB3 are due Tuesday 4th week.
A possible room change (an email will be sent if it happens)
Midterm July 19th, 2012 (Thursday 4th week)
Ch1, Ch2, and Ch3. Make sure you review the material, HW, labs
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 52
Chapter 3 outline
Principles of reliable data transfer
Connection-oriented transport: TCP
Principles of congestion control
TCP congestion control
Chapter 3, slide: 53
TCP Round Trip Time (RTT) and Timeout

Why estimate RTT? "Timeout" and "retransmit" are needed to address pkt loss, so we need to know when to time out and retransmit.
- ideal world: the exact RTT is needed
- real world: RTTs change over time, because pkts may take different paths and the network load changes over time; RTTs can only be estimated

Some intuition:
- what happens if the timeout is too short: premature timeouts, unnecessary retransmissions
- what happens if it is too long: slow reaction to segment loss
Chapter 3, slide: 54
TCP Round Trip Time (RTT) and Timeout

Technique: Exponential Weighted Moving Average (EWMA)

    EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent)

where 0 < α < 1; typical value: α = 0.125.

- SampleRTT: measured time from segment transmission until ACK receipt (the current value of RTT); ignore retransmissions
- EstimatedRTT: estimated based on past & present; smoother than SampleRTT; to be used to set the timeout period
Chapter 3, slide: 55
TCP Round Trip Time (RTT) and Timeout (cont'd)

EWMA again: EstimatedRTT(current) = (1-α)*EstimatedRTT(previous) + α*SampleRTT(recent), with 0 < α < 1 (typical α = 0.125).

Illustration: assume we have received n RTT samples so far, ordered 1, 2, 3, …, n (1 is the most recent, 2 the 2nd most recent, etc.). Then

    EstimatedRTT(n) = (1-α)*EstimatedRTT(n-1) + α*SampleRTT(1)

where EstimatedRTT(n) is the estimated RTT after receiving the ACK of the nth pkt.

Example: suppose 3 ACKs returned with SampleRTT(1), SampleRTT(2), SampleRTT(3).
Question: what is EstimatedRTT after receiving the 3 ACKs? Assume EstimatedRTT(1) = SampleRTT(3) (i.e., after receiving the 1st ACK).
Chapter 3, slide: 56
TCP Round Trip Time (RTT) and Timeout (cont'd)

What happens if α is too small (say, very close to 0)?
- a sudden, real change in network load does not get reflected in EstimatedRTT fast enough
- may lead to under- or overestimation of RTT for a long time

What happens if α is too large (say, very close to 1)?
- transient fluctuations in network load affect EstimatedRTT and make it unstable when it should not be
- also leads to under- or overestimation of RTT
Chapter 3, slide: 57
Example RTT estimation:
[Figure: SampleRTT and EstimatedRTT from gaia.cs.umass.edu to fantasia.eurecom.fr; RTT (milliseconds, 100-350) vs. time (seconds, 1-106); EstimatedRTT is visibly smoother than SampleRTT]
Chapter 3, slide: 58
TCP Round Trip Time (RTT) and Timeout: setting the timeout

timeout = EstimatedRTT? Any problem with this?
- add a "safety margin" to EstimatedRTT: the larger the variation in EstimatedRTT, the larger the safety margin should be
- first estimate how much SampleRTT deviates from EstimatedRTT:

    DevRTT = (1-β)*DevRTT + β*|SampleRTT - EstimatedRTT|   (typically, β = 0.25)

- then set the timeout interval:

    TimeoutInterval = EstimatedRTT + 4*DevRTT
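The two formulas combine into a few lines (a sketch; the slides don't specify the update order, so EstimatedRTT is updated before DevRTT here, and the sample stream is made up):

```python
ALPHA, BETA = 0.125, 0.25    # the typical gains from the slides

def update_rtt(estimated, dev, sample):
    """One EWMA update of EstimatedRTT and DevRTT; returns the new
    (EstimatedRTT, DevRTT, TimeoutInterval)."""
    estimated = (1 - ALPHA) * estimated + ALPHA * sample
    dev = (1 - BETA) * dev + BETA * abs(sample - estimated)
    return estimated, dev, estimated + 4 * dev

# Made-up SampleRTT stream (milliseconds): steady at 100 ms with one spike
est, dev = 100.0, 0.0
for sample in (100, 100, 180, 100, 100):
    est, dev, timeout = update_rtt(est, dev, sample)
    print(f"sample={sample:3d}  EstimatedRTT={est:7.2f}  TimeoutInterval={timeout:7.2f}")
```

Note how a single 180 ms spike inflates DevRTT, and hence the safety margin, for several subsequent updates.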
Chapter 3, slide: 59
TCP: Overview (RFCs: 793, 1122, 1323, 2018, 2581)

- point-to-point: one sender, one receiver
- reliable, in-order byte stream: no "message boundaries"
- pipelined: TCP congestion and flow control set the window size
- full duplex data: bi-directional data flow in the same connection; MSS: maximum segment size
- connection-oriented: handshaking (exchange of control msgs) initializes sender and receiver state before data exchange
- flow controlled: the sender will not overwhelm the receiver
- send & receive buffers

[Figure: the application writes data into a socket; a TCP send buffer on one side and a TCP receive buffer on the other; segments flow between them; the application reads data out]
Chapter 3, slide: 60
TCP: a reliable data transfer

- TCP creates an rdt service on top of IP's unreliable service
- pipelined segments, cumulative ACKs; TCP uses a single retransmission timer
- retransmissions are triggered by: timeout events, duplicate ACKs
- initially consider a simplified TCP sender: ignore duplicate ACKs, flow control, and congestion control
Chapter 3, slide: 61
TCP sender events:

data rcvd from app:
- create a segment with a seq #; the seq # is the byte-stream number of the first data byte in the segment
- start the timer if not already running (think of the timer as being for the oldest unACKed segment); expiration interval: TimeoutInterval

timeout:
- retransmit the segment that caused the timeout; restart the timer

ACK rcvd:
- if it acknowledges previously unACKed segments: update what is known to be ACKed; start the timer if there are outstanding segments
Chapter 3, slide: 62
TCP seq. #'s and ACKs

Seq. #'s: byte-stream "number" of the first byte in the segment's data
ACKs: seq # of the next byte expected from the other side; cumulative ACK

Simple telnet scenario (user at Host A types 'C'):
1. A -> B: Seq=42, ACK=79, data = 'C'     (user types 'C')
2. B -> A: Seq=79, ACK=43, data = 'C'     (B ACKs receipt of 'C', echoes back 'C')
3. A -> B: Seq=43, ACK=80                 (A ACKs receipt of the echoed 'C')
Chapter 3, slide: 63
TCP: retransmission scenarios

Lost ACK scenario:
- Host A sends Seq=92, 8 bytes data; Host B's ACK=100 is lost (X)
- A's timer expires; A retransmits Seq=92, 8 bytes data; B re-ACKs with ACK=100

Premature timeout scenario:
- A sends Seq=92 (8 bytes), then Seq=100 (20 bytes); B sends ACK=100 and ACK=120
- A's timer for Seq=92 expires before the ACKs arrive; A retransmits Seq=92
- B replies with cumulative ACK=120
Chapter 3, slide: 64
TCP retransmission scenarios (more)

Cumulative ACK scenario:
- Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
- Host B's ACK=100 is lost (X), but its ACK=120 arrives before A's timeout
- ACK=120 cumulatively acknowledges both segments, so no retransmission is needed
Chapter 3, slide: 65
TCP ACK generation [RFC 1122, RFC 2581]

Event at receiver -> TCP receiver action:
- arrival of in-order segment with expected seq #; all data up to the expected seq # already ACKed -> delayed ACK: wait up to 500 ms for the next segment; if no next segment, send the ACK
- arrival of in-order segment with expected seq #; one other segment has an ACK pending -> immediately send a single cumulative ACK, ACKing both in-order segments
- arrival of out-of-order segment with higher-than-expected seq. #; gap detected -> immediately send a duplicate ACK, indicating the seq. # of the next expected byte
- arrival of a segment that partially or completely fills a gap -> immediately send an ACK, provided the segment starts at the lower end of the gap
Chapter 3, slide: 66
ECE/CS 372 – Introduction to Computer Networks, Lecture 10
Announcements:
Midterm July 19th, 2012 (Thursday 4th week)
Ch1, Ch2, and Ch3. Make sure you review the material, HW, labs
Closed notes and book; only your pen/pencil and a calculator are allowed.
Acknowledgement: slides drawn heavily from Kurose & Ross
Chapter 3, slide: 67
Fast Retransmit

Suppose Packet 0 gets lost.
Q: when will the retransmission of Packet 0 happen? Why at that time?
A: typically at t1; we decide it is lost when the timer expires.

Can we do better? Think of what it means to receive many duplicate ACKs: it means Packet 0 is lost. Why wait until the timeout, since we already know Packet 0 is lost? => fast retransmit => better performance.

Why 3 dup ACKs, not just 1 or 2?
- think of what happens when pkt 0 arrives after pkt 1 (delayed, not lost)
- think of what happens when pkt 0 arrives after pkts 1 & 2, etc.

[Figure: the client sends Packet 0 (timer set at t0), then Packets 1, 2, 3; the server answers each with ACK0; without fast retransmit, Packet 0 is only resent when the timer expires at t1]
Chapter 3, slide: 68
Fast Retransmit: recap

- receipt of duplicate ACKs indicates loss of segments
- the sender often sends many segments back-to-back; if a segment is lost, there will likely be many duplicate ACKs
- this is how TCP works: if the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the ACKed data was lost
- fast retransmit: resend the segment before the timer expires => better performance
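The dup-ACK counting can be sketched in a few lines (my own simplification: ACKs are just byte numbers, and the retransmission is recorded rather than performed):

```python
def fast_retransmit(acks):
    """Count duplicate ACKs; after the 3rd duplicate for the same byte,
    retransmit immediately instead of waiting for the timer."""
    last_ack, dup_count = None, 0
    retransmitted = []
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == 3:              # 3 dup ACKs => fast retransmit
                retransmitted.append(ack)   # resend segment starting at `ack`
        else:
            last_ack, dup_count = ack, 0    # new ACK: reset the counter
    return retransmitted

# Segment starting at byte 100 is lost; each later segment triggers ACK=100
print(fast_retransmit([100, 100, 100, 100, 200]))   # [100]
```

Requiring three duplicates (four identical ACKs in total) is what keeps mild reordering, as in the pkt-0-after-pkt-1 case above, from triggering a spurious retransmission.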
Chapter 3, slide: 69
TCP Flow Control

- the receive side of a TCP connection has a receive buffer
- the app process may be slow at reading from the buffer
- flow control: the sender won't overflow the receiver's buffer by transmitting too much, too fast
- a speed-matching service: matching the send rate to the receiving app's drain rate
Chapter 3, slide: 70
TCP Flow control: how it works

- spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
- the receiver advertises the spare room by including the value of RcvWindow in segments
- the sender limits unACKed data to RcvWindow; this guarantees the receive buffer doesn't overflow
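The spare-room formula in one line (the buffer size and byte counts below are made-up illustrative numbers):

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """RcvWindow = RcvBuffer - (LastByteRcvd - LastByteRead):
    the spare room the receiver advertises to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

# 64 KB buffer; 50,000 bytes received, of which the app has read 20,000
print(rcv_window(65536, 50_000, 20_000))   # 35536 bytes of spare room
```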
Chapter 3, slide: 71
Review questions
Problem: TCP connection between A and B; B has received all bytes up to (and including) byte 248; A sends 2 back-to-back segments to B, with 40 and 60 bytes; B ACKs every pkt it receives.

Q1: Seq # in the 1st and 2nd segments from A to B? -> Seq=249 (40 bytes) and Seq=289 (60 bytes)
Q2: suppose the 1st segment gets to B first. What is the ACK # in the 1st ACK? -> ACK=289
Chapter 3, slide: 72
Review questions
Problem (continued): same setup as above.

Q3: suppose the 2nd segment gets to B first. What is the ACK # in the 1st ACK? And in the 2nd ACK?
Chapter 3, slide: 73
Review questions
Answer to Q3: the 2nd segment (Seq=289) arrives out of order, so B's 1st ACK is ACK=249; when the 1st segment (Seq=249) then arrives and fills the gap, B sends a cumulative ACK=349.
Chapter 3, slide: 74
Review questions
Problem: TCP connection between A and B; B has received all bytes up to (and including) byte 248; A sends 2 back-to-back segments to B, with 5 and 10 bytes; B ACKs every pkt it receives.

Now suppose: the 2 segments get to B in order; the 1st ACK is lost; the 2nd ACK arrives after the timeout.

Question: fill out all the packet seq numbers and ACK numbers in the timing diagram.
Chapter 3, slide: 75
Chapter 3 outline
Principles of reliable data transfer
Connection-oriented transport: TCP
Principles of congestion control
TCP congestion control
Chapter 3, slide: 76
Principles of Congestion Control

- Oct. '86: throughput from LBL (Lawrence Berkeley Lab) to UC Berkeley dropped from 32 kbps to 40 bps
- cause: end systems sending too much data too fast for the network/routers to handle
- manifestations: lost/dropped packets (buffer overflow at routers), long delays (queueing in router buffers)
- different from flow control!
- a top-10 problem!
Chapter 3, slide: 77
Approaches towards congestion control

Two broad approaches:

End-end congestion control:
- no explicit feedback from the network
- congestion inferred from end-system observed loss and delay
- approach taken by TCP: one more restriction on the sliding window, CongWin

Network-assisted congestion control:
- routers provide feedback to end systems: a single bit indicating congestion (DECbit, TCP/IP ECN), or an explicit rate the sender should send at
Chapter 3, slide: 78
TCP congestion control

Keep in mind:
- too slow: under-utilization => waste of network resources by not using them!
- too fast: over-utilization => waste of network resources by congesting them!
- the challenge is: not too slow, nor too fast!

Approach:
- slowly increase the sending rate to probe for usable bandwidth
- decrease the sending rate when congestion is observed
=> Additive-increase, multiplicative-decrease (AIMD)
Chapter 3, slide: 79
TCP congestion control: AIMD

Additive-increase, multiplicative-decrease (AIMD), also called "congestion avoidance":
- additive increase: increase CongWin by 1 MSS every RTT until loss is detected (MSS = Max Segment Size)
- multiplicative decrease: cut CongWin in half after a loss

[Figure: congestion window size (8, 16, 24 Kbytes) vs. time; saw-tooth behavior: probing for bandwidth]
Chapter 3, slide: 80
TCP Congestion Control: details

- the sender limits transmission: LastByteSent - LastByteAcked <= CongWin
- roughly: rate = CongWin / RTT bytes/sec
- CongWin is dynamic, a function of perceived network congestion
- how does the sender perceive congestion? A loss event = timeout or 3 duplicate ACKs; the TCP sender reduces its rate (CongWin) after a loss event
- AIMD increases the window by 1 MSS every RTT

Improvement: any problem with AIMD alone? Think of the start of connections.
Solution: start a little faster, and then slow down => "slow start"
Chapter 3, slide: 81
TCP Slow Start

- when the connection begins, CongWin = 1 MSS. Example: MSS = 500 bytes, RTT = 200 msec => initial rate = 20 kbps
- the available bandwidth may be >> MSS/RTT: desirable to quickly ramp up to a respectable rate
- TCP addresses this via the slow-start mechanism: when the connection begins, increase the rate exponentially fast
- when a packet loss occurs (indicating the connection has reached the limit), slow down
Chapter 3, slide: 82
TCP Slow Start (more)

How it is done: when the connection begins, increase the rate exponentially until the first loss event:
- double CongWin every RTT
- done by incrementing CongWin for every ACK received

Summary: the initial rate is slow but ramps up exponentially fast.

[Figure: Host A sends one segment, then two, then four; each batch takes one RTT]
Chapter 3, slide: 83
Refinement: TCP Tahoe

Question: when should the exponential increase (slow start) switch to linear (AIMD)?

Here is how it works:
- define a variable called Threshold
- start with slow start (CongWin = 1; double CongWin every RTT)
- when CongWin = Threshold: switch to AIMD (linear increase)
- at any loss event (timeout or 3 dup ACKs): set Threshold = 1/2 the current CongWin, and start slow start over (CongWin = 1)
Chapter 3, slide: 84
Refinement: TCP Tahoe (cont'd)
[Figure: CongWin vs. time, showing slow start up to Threshold, linear growth above it, and a restart from CongWin = 1 after a loss]
Chapter 3, slide: 85
More refinement: TCP Reno

Loss event: timeout vs. dup ACKs
- 3 dup ACKs: fast retransmit

[Figure: the client sends Packet 0 (timer set at t0) and Packets 1, 2, 3; the server returns three ACK0s; the client fast-retransmits Packet 0 before the timer expires at t1]
Chapter 3, slide: 86
More refinement: TCP Reno (cont'd)

- 3 dup ACKs: fast retransmit
- timeout: retransmit

Any difference (think congestion)?
- 3 dup ACKs indicate that the network is still capable of delivering some segments after a loss
- a timeout indicates a "more" alarming congestion scenario

[Figure: the client sends Packets 0-3; no ACKs return; Packet 0 is retransmitted only when the timer expires at t1]
Chapter 3, slide: 87
More refinement: TCP Reno (cont'd)

TCP Reno treats "3 dup ACKs" differently from "timeout". How does TCP Reno work?
- after 3 dup ACKs: CongWin is cut in half, then grows linearly (congestion avoidance)
- but after a timeout event: CongWin is instead set to 1 MSS and grows exponentially (slow start)
Chapter 3, slide: 88
Summary: TCP Congestion Control

- when CongWin is below Threshold, the sender is in the slow-start phase; the window grows exponentially
- when CongWin is above Threshold, the sender is in the congestion-avoidance phase; the window grows linearly
- when a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold
- when a timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS
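The four rules above fit in a short trace function (a sketch, in units of MSS; the initial Threshold of 64 is an arbitrary assumption, losses are modeled as 3 dup ACKs, and the doubling may overshoot Threshold, which real TCP avoids):

```python
def congwin_trace(rounds, loss_at=frozenset(), variant="reno"):
    """CongWin (in MSS) at the start of each RTT: slow start below
    Threshold, +1 MSS per RTT above it; on loss, Threshold = CongWin/2,
    then CongWin = Threshold (Reno) or CongWin = 1 (Tahoe)."""
    cwnd, thresh = 1, 64           # assumed initial Threshold
    trace = []
    for rtt in range(rounds):
        trace.append(cwnd)
        if rtt in loss_at:                     # loss event (3 dup ACKs)
            thresh = max(cwnd // 2, 1)
            cwnd = thresh if variant == "reno" else 1
        elif cwnd < thresh:
            cwnd *= 2                          # slow start: double per RTT
        else:
            cwnd += 1                          # congestion avoidance: linear
    return trace

print(congwin_trace(10, loss_at={5}, variant="reno"))   # [1, 2, 4, 8, 16, 32, 16, 17, 18, 19]
print(congwin_trace(10, loss_at={5}, variant="tahoe"))  # [1, 2, 4, 8, 16, 32, 1, 2, 4, 8]
```

The two traces make the Reno/Tahoe difference concrete: after the same loss, Reno resumes linearly at half the window, while Tahoe falls back to 1 and slow-starts again.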
Chapter 3, slide: 89
Average throughput of TCP

What is the average throughput as a function of window size W and RTT?
- ignore slow start; let W, the window size when loss occurs, be constant
- when the window is W, throughput is W/RTT (the high point)
- just after a loss, the window drops to W/2, so throughput is W/(2*RTT) (the low point)
- throughput then increases linearly from W/(2*RTT) back to W/RTT
- hence, average throughput = 0.75 * W/RTT
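A quick numeric check of the 0.75 factor (the example numbers are made up):

```python
W, RTT = 20, 0.125   # e.g. a 20-segment window and a 125 ms RTT
# Between losses the window climbs linearly from W/2 back to W, so the
# average window is (W/2 + W)/2 = 3W/4, giving avg throughput 0.75*W/RTT.
rates = [w / RTT for w in range(W // 2, W + 1)]
avg = sum(rates) / len(rates)
print(avg, 0.75 * W / RTT)   # 120.0 120.0
```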
Chapter 3, slide: 90
TCP Fairness

Fairness goal: if K TCP sessions share the same bottleneck link of bandwidth R, each should get an average rate of R/K.

[Figure: TCP connections 1 and 2 sharing a bottleneck router of capacity R]
Chapter 3, slide: 91
Fairness (more)

Fairness and parallel TCP connections: nothing prevents an app from opening parallel connections between 2 hosts; web browsers do this.

Example: a link of rate R supporting 9 connections, all belonging to the same host, each getting R/9. A new app asks for 1 TCP connection and gets rate R/10: one host now gets 9/10 of R and the other gets 1/10 of R!
Chapter 3, slide: 92
Question: is TCP fair?

Assume two competing sessions only, and consider AIMD:
- additive increase gives a slope of 1 as throughput increases
- multiplicative decrease halves throughput proportionally when congestion occurs

[Figure: TCP connections 1 and 2 through a bottleneck router of capacity R]
Chapter 3, slide: 93
Question: is TCP fair? (cont'd)

Again assume two competing sessions only, and consider AIMD: additive increase gives a slope of 1 as throughput increases; multiplicative decrease halves throughput proportionally when congestion occurs.

Question: how does the rate pair (R1, R2) vary when AIMD is used? Does it converge to the equal share?

[Figure: connection 2 throughput vs. connection 1 throughput, with the equal-bandwidth-share line; repeated congestion avoidance (additive increase) and loss (window halved) moves the rate pair toward the equal-share line]
Chapter 3, slide: 94
Question?

Again assume two competing sessions only:
- additive increase gives a slope of 1 as throughput increases
- but use a constant/equal decrease instead of multiplicative decrease! (call this scheme AIED)

Question: how would (R1, R2) vary when AIED is used instead of AIMD? Is it fair? Does it converge to the equal share?

[Figure: connection 2 throughput vs. connection 1 throughput, with the equal-bandwidth-share line and the point (R1, R2)]
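Both questions can be answered by simulating two flows (a sketch; the capacity, step sizes, and the AIED decrement of 10 are arbitrary assumptions of mine):

```python
def two_flow_sim(r1, r2, capacity=100.0, steps=200, multiplicative=True):
    """Two flows sharing one bottleneck: both add 1 per RTT; when their
    combined rate exceeds capacity, both back off -- halving (AIMD) or
    subtracting a constant (AIED)."""
    for _ in range(steps):
        r1 += 1.0                                    # additive increase
        r2 += 1.0
        if r1 + r2 > capacity:                       # congestion detected
            if multiplicative:
                r1, r2 = r1 / 2, r2 / 2              # AIMD: halve both
            else:
                r1, r2 = max(r1 - 10, 0), max(r2 - 10, 0)  # AIED: equal cut
    return r1, r2

print(two_flow_sim(10, 80))                        # rates end up nearly equal
print(two_flow_sim(10, 80, multiplicative=False))  # the 70-unit gap persists
```

The halving is what makes AIMD fair: additive increase preserves the gap between the flows, but each multiplicative decrease halves it, so the rate pair drifts to the equal-share line. An equal (additive) decrease never shrinks the gap, so AIED does not converge to a fair share.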