
Flow Control and Reliability Control in WebTP – Part 2

Ye Xia

10/31/00

Outline

• Motivating problems

• Recall known ideas and go through simple facts about flow control

• Flow control examples: TCP (BSD) and Credit-based flow control for ATM

• WebTP challenges and tentative solutions

Motivating Problem

• Suppose a packet is lost at F1’s receive buffer. Should the pipe’s congestion window be reduced?

Two Styles of Flow Control

• TCP congestion control
  – Lossy.
  – Traffic intensity varies slowly and oscillates.

• Credit-based flow control
  – No loss.
  – Handles bursty traffic well; handles a bursty receiver link well.

• TCP's receive-buffer flow control is equivalent to Kung and Morris's credit-based flow control.

What is TCP?

• (Network) congestion control:
  – Linear/sublinear increase and multiplicative decrease of the window.
  – Uses binary loss information.

• (End-to-end) flow control resembles a credit scheme, with a credit update protocol.

• Can the end-to-end flow control be treated the same as congestion control? Maybe, but …

Credit-Based Control (Kung and Morris 95)

• Overview of steps
  – Before forwarding packets, the sender needs to receive credits from the receiver.
  – At various times, the receiver sends credits to the sender, indicating available receive buffer size.
  – The sender decrements its credit balance after forwarding a packet.

• Typically ensures no buffer overflow.
• Works well over a wide range of network conditions, e.g. bursty traffic.
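To make the credit loop concrete, here is a minimal sketch of the steps above; the class and variable names are illustrative, not taken from the WebTP or Kung-Morris code.

# Minimal sketch of credit-based flow control: the receiver grants credits
# equal to its free buffer space, and the sender forwards only while it
# holds credits, decrementing one credit per forwarded packet.

from collections import deque

class CreditReceiver:
    def __init__(self, buffer_size):
        self.buffer = deque()
        self.buffer_size = buffer_size

    def grant_credits(self):
        # Credit = currently available buffer space.
        return self.buffer_size - len(self.buffer)

    def receive(self, packet):
        assert len(self.buffer) < self.buffer_size, "credit scheme should prevent overflow"
        self.buffer.append(packet)

    def drain(self, n):
        # Downstream consumes n packets, freeing buffer space.
        for _ in range(min(n, len(self.buffer))):
            self.buffer.popleft()

class CreditSender:
    def __init__(self):
        self.credits = 0

    def update_credits(self, credit):
        self.credits = credit

    def send(self, packets, receiver):
        sent = 0
        for p in packets:
            if self.credits == 0:
                break            # must wait for more credits
            receiver.receive(p)
            self.credits -= 1    # decrement credit balance per forwarded packet
            sent += 1
        return sent

rx, tx = CreditReceiver(buffer_size=4), CreditSender()
tx.update_credits(rx.grant_credits())      # receiver advertises 4 credits
print(tx.send(list(range(10)), rx))        # -> 4: sender stops when credits run out
rx.drain(2)                                # receiver frees space ...
tx.update_credits(rx.grant_credits())      # ... and grants 2 new credits
print(tx.send(list(range(4, 10)), rx))     # -> 2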

Credit Update Protocol (Kung and Morris 95)

Adaptive Credit-Based Control(Kung and Chang 95)

• Without adaptation: M = N * C_r * (RTT + N2 * N)

• Idea: make buffer size proportional to actual bandwidth, for each connection.

• For each connection and on each allocation interval,

    Buf_Alloc = (M/2 – TQ – N) * (VU / TU)

  where TQ is the connection's current buffer occupancy, VU is the amount of data forwarded for the connection, and TU is the amount forwarded for all N connections.

• M = 4 * RTT + 2 * N
• It is easy to show there are no losses. But can allocation be controlled precisely?
• Once bandwidth is introduced, the scheme can no longer handle bursts well.
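As a sanity check on the allocation formula above, here is a small, hypothetical helper that computes Buf_Alloc from the quantities just defined; the numbers in the example are made up, and nothing here comes from the actual Kung-Chang implementation.

# Sketch of the adaptive allocation rule Buf_Alloc = (M/2 - TQ - N) * (VU/TU).
# M, TQ, N, VU, TU follow the slide's definitions; values below are invented.

def buf_alloc(M, TQ, N, VU, TU):
    """Adaptive per-connection credit allocation for one interval."""
    if TU == 0:
        return 0                      # no data forwarded: nothing to base the share on
    share = VU / TU                   # connection's fraction of recently forwarded data
    available = M / 2 - TQ - N        # half the memory, minus occupancy and per-VC slack
    return max(0, int(available * share))

# Example: M sized as 4*RTT + 2*N (the slide's rule of thumb), RTT in cell times.
RTT_cells, N = 100, 8
M = 4 * RTT_cells + 2 * N
print(buf_alloc(M, TQ=50, N=N, VU=30, TU=120))   # one connection carried 1/4 of the traffic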

BSD - TCP Flow Control

• Receiver advertises free buffer space: win = Buf_Alloc – Que_siz.
• Sender can send [snd_una, snd_una + snd_win - 1], where snd_win = win and snd_una is the oldest unACKed sequence number.

(Figure: the sliding-window picture over sequence numbers 1-11: data up to snd_una - 1 is sent and ACKed; [snd_una, snd_nxt - 1] is sent but not ACKed; data up to snd_una + snd_win - 1 can be sent ASAP; beyond that, it can't be sent until the window moves. Here snd_win = 6, as advertised by the receiver, and snd_nxt is the next send number.)
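A minimal sketch of the sender-side bookkeeping implied by the figure: what may be sent given snd_una, snd_nxt, and the advertised window. This is illustrative pseudologic, not BSD source.

# Sketch of the TCP sender's window check: the sender may transmit sequence
# numbers in [snd_una, snd_una + snd_win - 1].

def sendable_range(snd_una, snd_nxt, snd_win):
    """Return the range of sequence numbers the sender may transmit now."""
    right_edge = snd_una + snd_win - 1          # highest number the window permits
    if snd_nxt > right_edge:
        return None                             # window exhausted; wait for ACK/window update
    return (snd_nxt, right_edge)

# Matches the figure above: snd_una = 4, snd_nxt = 7, snd_win = 6.
print(sendable_range(4, 7, 6))    # -> (7, 9): packets 7-9 can be sent ASAP
print(sendable_range(4, 10, 6))   # -> None: 10 and 11 must wait until the window moves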

TCP Example

(Figure: send window over sequence numbers 1-11 with snd_win = 3; snd_una is marked.)

• Receiver: ACKs 4, win = 3. (Total buffer size = 6.)
• Sender: sends 4 again.

(Figure: the send window after 4 is received at the receiver; snd_win = 6.)

• Sender: after 4 is received at the receiver.

TCP Receiver Buffer Management

• Time-varying physical buffer size B_r(t), shared by n TCP connections.

• BSD implementation: each connection has a bound, B_i, on its queue size.

• Buffers are not reserved. It is possible that B_i > B_r(t) for some time t.

Possible Deadlock

• Example: two connections, each with B_i = 4. Suppose B_r = 4. At this point the physical buffer runs out, and reassembly cannot continue.
• Deadlock can be avoided if we allow dropping already-received packets.
• Implications for reliability control (e.g. connection 1):
  – OK with TCP, because packets 4 and 5 have not been ACKed.
  – WebTP may have already ACKed 4 and 5.

Connection 1: … 2 3 4 5 6
Connection 2: … 4 5 6 7 8

TCP Receiver Flow Control Uses Credit Scheme

• For performance reasons: better throughput than TCPC (TCP-style congestion control) for the same (small) buffer size?
  – Losses are observed by the receiver. Why not inform the sender?
  – Queue size is also observed. Tell the sender.
  – For data re-assembly, the receiver has to tell which packets are lost/received anyway. Very little complexity is added to support window flow control.

Data Re-Assembly Forces a Credit Scheme

• There is reason to believe that TCP flow control brings some order to TCP.

• Receiver essentially advertises window by [min, max], rather than just the size.

Actual TCP: … 2 3 4 5 6 7 8 9

Otherwise: … 2 9 12 17 19 20 24 31

Why do we need a receiver buffer?

• Part of flow/congestion control when C_s(t) > C_r(t):
  – In TCPC, a certain amount of buffer is needed to get reasonable throughput. (For optimality issues, see [Mitra92] and [Fendick92].)
  – In CRDT (credit-based control), buffering is likewise needed for good throughput.

• Buffering is beneficial for data re-assembly.

Buffering for Flow Control: Example

• Suppose link capacities are constant and C_s >= C_r. To reach throughput C_r, B_r should be (worked numerically in the sketch below):
  – C_r * RTT, in a naïve but robust CRDT scheme;
  – (C_s - C_r) * C_r * RTT / C_s, if C_r is known to the sender;
  – 0, if C_r is known to the sender and the sender never sends a burst at a rate greater than C_r.
  – Note: the upstream node can estimate C_r.
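A small numerical check of the three cases above; the link rates and RTT are made-up values, while the formulas are taken directly from the list.

# Receiver buffer needed to sustain throughput C_r, per the three cases above.
# Units: C_s, C_r in packets/s, rtt in seconds, result in packets.

def b_r_naive(c_r, rtt):
    # Naive but robust credit (CRDT) scheme: one RTT's worth of data at C_r.
    return c_r * rtt

def b_r_known_rate(c_s, c_r, rtt):
    # Sender knows C_r but may still emit bursts at C_s: only the excess queues up.
    return (c_s - c_r) * c_r * rtt / c_s

def b_r_paced(c_s, c_r, rtt):
    # Sender knows C_r and never bursts above it: no receiver buffering needed.
    return 0

c_s, c_r, rtt = 1000.0, 400.0, 0.1          # made-up example values
print(b_r_naive(c_r, rtt))                  # 40.0 packets
print(b_r_known_rate(c_s, c_r, rtt))        # 24.0 packets
print(b_r_paced(c_s, c_r, rtt))             # 0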

Re-assembly Buffer Sizing

• Without it, throughput can suffer. (By how much?)
• Buffer size depends on network delay, loss, and packet reordering behavior. Can we quantify this?

Question: How do we put the two together? The re-assembly buffer size can simply be a threshold number, as in TCP.

Example: (actual) buffer size B = 6, but we allow packets 3 and 12 to coexist in the buffer.

One-Shot Model for Throughput

• Send n packets in a block, with i.i.d. delays.
• Example: B = 1 and n = 3.
• E[Throughput] = 1/6 * (3+2+2+1+1+1) / 3 = 5/9

Input   Accepted
123     123
132     12
213     1
231     1
312     12
321     1
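The example is small enough to check by brute force. Here is a hedged sketch that enumerates all arrival orders and applies the acceptance rule implied by the table, namely that a packet is accepted only if it falls inside the current re-assembly window of size B; it reproduces the 5/9 figure for B = 1, n = 3.

# Brute-force check of the one-shot model: packets 1..n arrive in a uniformly
# random order (i.i.d. delays); a packet is accepted iff it lies in the current
# re-assembly window [rcv_nxt, rcv_nxt + B - 1].

from itertools import permutations
from fractions import Fraction

def accepted(order, B):
    got, rcv_nxt = set(), 1
    for pkt in order:
        if rcv_nxt <= pkt <= rcv_nxt + B - 1:    # inside the window: keep it
            got.add(pkt)
            while rcv_nxt in got:                # slide the window forward
                rcv_nxt += 1
        # otherwise the packet is rejected (no room to hold it out of order)
    return len(got)

def expected_throughput(n, B):
    orders = list(permutations(range(1, n + 1)))
    total = sum(accepted(o, B) for o in orders)
    return Fraction(total, len(orders) * n)

print(expected_throughput(3, 1))    # 5/9, matching the table above
print(expected_throughput(3, 3))    # 1: with B = n, every packet is accepted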

Some Results

• If B = 1, roughly ½ + e packets will be received on average, for large n.

• If B = n, all n packets will be received.
• Conjecture: E[Throughput] has a closed-form expression in n and B, given by a sum of factorial terms over k = B, …, n.

Reliability and Flow Control Intertwined

• They share the same feedback stream.

• The receiver needs to tell the sender HOW MANY and WHICH packets have been forwarded.

• Regarding “WHICH”, TCP takes the simplistic approach of ACKing the first un-received data.
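One simple way to convey WHICH packets have arrived, beyond TCP's cumulative ACK, is a cumulative ACK plus a small bitmap. The sketch below is illustrative only; it is not necessarily the encoding used by WebTP's Acknowledged Vector field.

# Hedged sketch: report HOW MANY and WHICH packets the receiver holds via a
# cumulative ACK plus a bitmap of packets received beyond it.

def build_feedback(received, vector_bits=8):
    ack = 1
    while ack in received:          # cumulative part: first un-received packet
        ack += 1
    bitmap = 0
    for i in range(vector_bits):    # selective part: which packets beyond 'ack' arrived
        if (ack + 1 + i) in received:
            bitmap |= 1 << i
    return ack, bitmap

received = {1, 2, 3, 5, 6, 9}
ack, bitmap = build_feedback(received)
print(ack, format(bitmap, "08b"))   # -> 4 00010011: 5, 6, and 9 reported as received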

Summary of Issues

• Find a control scheme suitable for both the pipe level and the flow level.
  – Reconcile network control and last-hop control: we need flow control at the flow level.
  – Note that feedback for congestion control and reliability control is entangled.

• Buffer management at the receiver
  – Buffer sizing
    • for re-assembly
    • for congestion control
  – Deadlock prevention

WebTP Packet Header Format

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Packet Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      Acknowledgment Number                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Acknowledged Vector                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                            ADU Name                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                          Segment Number                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |        Destination Port       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |U|A|R|S|F|R|E|F|P|     |     |                         |
| Offset|R|C|S|Y|I|E|N|A|T| CCA | PCL |           RES           |
|       |G|K|T|N|N|L|D|S|Y|     |     |                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|             Window            |            Checksum           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Options            |            Padding            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                              data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
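To make the word layout concrete, here is a hedged parsing sketch for the fixed-size fields at the front of the header; the field order follows the diagram, while the flags word, Window/Checksum and Options are left opaque because their exact sub-field widths are not spelled out here. This is illustrative, not the actual WebTP implementation.

# Unpack the leading fields of a WebTP header as laid out above:
# five 32-bit words, then two 16-bit ports, in network byte order.

import struct

def parse_webtp_header(buf: bytes):
    pkt_num, ack_num, ack_vec, adu_name, seg_num, sport, dport = \
        struct.unpack_from("!5I2H", buf, 0)
    return {
        "packet_number": pkt_num,
        "ack_number": ack_num,
        "ack_vector": ack_vec,
        "adu_name": adu_name,
        "segment_number": seg_num,
        "src_port": sport,
        "dst_port": dport,
        "rest": buf[struct.calcsize("!5I2H"):],   # flags word, window/checksum, options, data
    }

# Example with a made-up header prefix.
hdr = struct.pack("!5I2H", 7, 3, 0b1011, 42, 1, 5000, 80) + b"\x00" * 8
print(parse_webtp_header(hdr)["packet_number"])   # -> 7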

WebTP: Receiver Flow Control

• A flow can be reliable (in TCP sense) or unreliable (in UDP sense).

• Shared feedback for reliability and for congestion control.
• A reliable flow uses TCP-style flow control and data re-assembly. A loss at the receiver due to flow-control buffer overflow is not distinguished from a loss in the pipe, but this should be rare.

• Unreliable flow: losses at the receiver due to overflowing B_i are not reported back to the sender. For simplicity, no window flow control is used. (Is the window information useful?)

WebTP: Buffer Management

• Each flow gets a fixed upper bound on queue size, say B_i. B_i >= B_r is possible.

• Later on, B_i will adapt to the speed of the application.
• The receiver of a flow maintains rcv_nxt and rcv_adv, with B_i = rcv_adv - rcv_nxt + 1.
• Packets outside [rcv_nxt, rcv_adv] are rejected.
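A minimal sketch of the acceptance rule just described, using the slide's rcv_nxt/rcv_adv names; the class is illustrative, not WebTP source, and it assumes the application consumes in-sequence data immediately.

# WebTP-style per-flow receive window: a packet is accepted only if its number
# lies in [rcv_nxt, rcv_adv]; rcv_nxt advances over delivered in-sequence data.

class FlowReceiver:
    def __init__(self, b_i):
        self.rcv_nxt = 1                      # next in-sequence packet expected
        self.rcv_adv = b_i                    # highest acceptable: B_i = rcv_adv - rcv_nxt + 1
        self.buffered = set()

    def accept(self, pkt_num):
        if not (self.rcv_nxt <= pkt_num <= self.rcv_adv):
            return False                      # outside the window: reject
        self.buffered.add(pkt_num)
        while self.rcv_nxt in self.buffered:  # deliver in-sequence data to the application
            self.buffered.remove(self.rcv_nxt)
            self.rcv_nxt += 1
            self.rcv_adv += 1                 # window slides as data is consumed
        return True

rx = FlowReceiver(b_i=6)
print([rx.accept(p) for p in (2, 3, 1, 10)])  # [True, True, True, False]
print(rx.rcv_nxt, rx.rcv_adv)                 # 4 9: window slid past packets 1-3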

WebTP Example

(Figure: send window over sequence numbers 1-11 with snd_win = 3; snd_una, snd_nxt, and the receiver's [rcv_nxt, rcv_adv] window are marked.)

• Receiver: (positively) ACKs 5, 6, and 7, win = 3. (B_i = 6.)
• Sender: can send 4, 8, and 9 (subject to congestion control).

(Figure: the send window after it slides forward, now starting at sequence number 5, with snd_win = 6.)

• Sender: after 4, 8, and 9 are received at the receiver.

WebTP: Deadlock Prevention (Reliable Flows)

• Deadlock prevention: pre-allocate bN buffer spaces, b >= 1, where N = max. number of flows allowed.

• When the dynamic buffer runs out, enter deadlock-prevention mode. In this mode,
  – each flow accepts only up to b in-sequence packets;
  – when a flow uses up its b buffers, it won't be allowed to use any buffers until b buffers are freed.

• We guard against the case where all but one flow is still responding. In practice, we only need N to be some reasonably large number.

• b = 1 is sufficient, but b can be greater than 1 for performance reasons.
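A hedged sketch of the per-flow accounting this mode implies; b, N, and the class structure are illustrative, since the slide does not give an implementation.

# Deadlock-prevention mode: once the shared dynamic buffer is exhausted, each
# flow may hold at most b pre-allocated buffers, for in-sequence packets only,
# and a flow that has used all b must wait until all b are freed.

class DeadlockGuard:
    def __init__(self, n_flows, b=1):
        self.b = b
        self.used = {f: 0 for f in range(n_flows)}       # pre-allocated buffers in use per flow
        self.blocked = {f: False for f in range(n_flows)}

    def admit(self, flow, in_sequence):
        """Decide whether a packet may use a pre-allocated buffer in prevention mode."""
        if not in_sequence or self.blocked[flow] or self.used[flow] >= self.b:
            return False
        self.used[flow] += 1
        if self.used[flow] == self.b:
            self.blocked[flow] = True          # used up b buffers: wait until all are freed
        return True

    def freed(self, flow, count=1):
        self.used[flow] = max(0, self.used[flow] - count)
        if self.used[flow] == 0:
            self.blocked[flow] = False         # all b buffers freed: flow may proceed again

guard = DeadlockGuard(n_flows=4, b=1)
print(guard.admit(0, in_sequence=True))    # True: first in-sequence packet gets a buffer
print(guard.admit(0, in_sequence=True))    # False: flow 0 must wait until its buffer is freed
guard.freed(0)
print(guard.admit(0, in_sequence=True))    # True again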

WebTP: Feedback Scheme

• The Window field in the packet header is per flow. Like TCP, it carries the current free buffer space for the flow.

• When a flow starts, use the FORCE bit (FCE) for an immediate ACK from the flow.

• Rules for acknowledgement:
  – To inform the sender about the window size, a flow generates an ACK for every 2 received packets (MTU).
  – The pipe generates an ACK for every k packets.

• ACKs can be piggybacked on reverse-direction data packets.
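A hedged sketch of the two ACK rules above, applied to the round-robin arrival pattern used on the next slide; the value k = 4 and the assumption that one feedback packet can serve both rules are illustrative.

# Count ACKs under the WebTP feedback rules: a flow ACKs every 2nd packet it
# receives, and the pipe ACKs every k-th packet overall.

from collections import defaultdict

def count_acks(packet_flows, k=4):
    per_flow = defaultdict(int)
    acks = 0
    for i, flow in enumerate(packet_flows, start=1):
        per_flow[flow] += 1
        flow_ack = per_flow[flow] % 2 == 0     # flow rule: ACK every 2 received packets
        pipe_ack = i % k == 0                  # pipe rule: ACK every k packets
        if flow_ack or pipe_ack:
            acks += 1                          # one feedback packet can carry both
    return acks

stream = [1, 2, 3, 4] * 4                      # round-robin arrivals from four flows
print(count_acks(stream, k=4))                 # -> 10 ACKs for 16 data packets (62.5%)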

Acknowledgement Example: Four Flows

(Figure: packets arrive at the receiver in the round-robin order 1 2 3 4, repeated four times; the ACKs generated by the pipe and by flows 1-4 are marked, and the resulting ACK stream is shown.)

Result:

With some randomness in the traffic, 50 - 62 ACKs are generated for every 100 data packets.

Computation

(Figure: a sequence of packet numbers … k-3, k-2, k-1, k, k+1, with the flow each packet belongs to, e.g. flows 1 4 2 3 4 1 3 2 2.)

Correctness of Protocol and Algorithm

• Performance typically deals with average cases, and can be studied by model-based analysis or simulation.

• What about correctness?
  – Very often in networking, failures are more of a concern than poor performance.

• The correctness of many distributed algorithms in the networking area has not been proven.

• What can be done?
  – Need a formal description.
  – Need methods of proof.

• Some references for protocol verification: I/O Automata ([Lynch88]), Verification of TCP ([Smith97])

References

[Mitra92] Debasis Mitra, “Asymptotically Optimal Design of Congestion Control for High Speed Data Networks”, IEEE Transactions on Communications, vol. 10, no. 2, Feb. 1992.

[Fendick92] Kerry W. Fendick, Manoel A. Rodrigues, and Alan Weiss, “Analysis of a Rate-Based Feedback Control Strategy for Long Haul Data Transport”, Performance Evaluation 16 (1992), pp. 67-84.

[Kung and Morris 95] H. T. Kung and Robert Morris, “Credit-Based Flow Control for ATM Networks”, IEEE Network Magazine, March 1995.

[Kung and Chang 95] H. T. Kung and Koling Chang, “Receiver-Oriented Adaptive Buffer Allocation in Credit-Based Flow Control for ATM Networks”, Proc. IEEE Infocom ’95.

[Smith97] Mark Smith, “Formal Verification of TCP and T/TCP”, PhD thesis, Department of EECS, MIT, 1997.

[Lynch88] Nancy Lynch and Mark Tuttle, “An Introduction to Input/Output Automata”, Technical Memo MIT/LCS/TM-373, Laboratory for Computer Science, MIT, 1988.