TRANSCRIPT
Congestion Avoidance
Richard T. B. Ma
School of Computing
National University of Singapore
CS 5229: Advanced Computer Networks
References
K. K. Ramakrishnan, Raj Jain, "A Binary Feedback Scheme for Congestion Avoidance in Computer Networks with a Connectionless Network Layer", ACM Computer Communication Review, Vol. 18, No. 4, August 1988, pp. 303-313.
Congestion Collapse
When: October 1986
Where: Lawrence Berkeley Laboratory (LBL) to UC Berkeley site, 400 yards and 3 hops away
What: Throughput dropped from 32 Kbps to 40 bps
Network Congestion
[Figure: two sources send through a router over a 1.5-Mbps T1 link to a destination. Caption: Congestion in a packet-switched network]
Flow Control
End-to-end flow control looks at a "selfish" control function: make sure there is sufficient buffer at the destination
Receiver sends $\mathit{RcvWindow}$ (or $\mathit{rwnd}$) $= \mathit{RcvBuffer} - (\mathit{LastByteRcvd} - \mathit{LastByteRead})$
Sender keeps $\mathit{LastByteSent} - \mathit{LastByteAcked} \le \mathit{rwnd}$
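A minimal sketch of the two flow-control rules above, in Python. The function names and the byte counts in the usage lines are illustrative assumptions, not part of the slides.

```python
def advertised_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Receiver side: rwnd = RcvBuffer - (LastByteRcvd - LastByteRead)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def may_send(last_byte_sent: int, last_byte_acked: int, rwnd: int, nbytes: int) -> bool:
    """Sender side: sending nbytes more must keep unacknowledged data within rwnd."""
    return (last_byte_sent + nbytes) - last_byte_acked <= rwnd


# Hypothetical numbers: a 4096-byte buffer holding 2000 unread bytes.
rwnd = advertised_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
print(rwnd)                               # 2096
print(may_send(5000, 4000, rwnd, 1000))   # True: 2000 bytes would be in flight
print(may_send(5000, 4000, rwnd, 1200))   # False: 2200 bytes would exceed rwnd
```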
Flow Control VS Congestion Control
End-to-end flow control looks at a "selfish" control function: make sure there is sufficient buffer at the destination
Congestion control solves a "social" problem: end-to-end flows of the network cooperate to avoid/recover from congestion of the intermediate nodes/routers they share
Connectionless Flows
[Figure: three sources send to two destinations (Dest1, Dest2) across three routers. Caption: Multiple flows passing through a set of routers]
Congestion Avoidance/Control
Congestion control: detect congestion and reduce load to back away from the "Cliff"
Congestion avoidance: operate the network at the "Knee"
DECbit Scheme: 1-bit Feedback
Minimum feedback information: one congestion avoidance bit
Set by the router if congested
Destination sends it back in ACK
When does the router set the avoidance bit?
How does an end host respond?
[Figure: packet header carrying the congestion avoidance bit (0 or 1), echoed back in the ACK]
Optimization Criteria (Metrics)
Efficiency: maximize "Power"
$\text{Power} = \dfrac{\text{Throughput}^{\alpha}}{\text{Response Time}}, \quad 0 < \alpha < 1$
Operate at the "Knee" when maximizing for $\alpha = 1$
$\text{Resource Efficiency} = \dfrac{\text{Resource Power}}{\text{Resource Power at Knee}}$
Less than 100% efficiency can happen when:
• Capacity is underutilized (low throughput)
• Capacity is overutilized (high response time)
Optimization Criteria (Metrics)
Fairness: maximize Jain's index
$J(x) = \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n \sum_{i=1}^{n} x_i^2}$
$x_i$ denotes user $i$'s resource share, absolute or %
$J(x) \in [0, 1]$
Independent of scale
Continuous
$J(x) = k/n$ if only $k$ users are allocated equally
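The properties listed for Jain's index can be checked numerically. The sketch below is one straightforward way to compute the index in Python; the function name `jain_index` and the sample allocations are assumptions chosen for illustration.

```python
def jain_index(shares):
    """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2)."""
    n = len(shares)
    sum_sq = sum(x * x for x in shares)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero share")
    total = sum(shares)
    return total * total / (n * sum_sq)


print(jain_index([1, 1, 1, 1]))   # 1.0: all four users get equal shares
print(jain_index([1, 1, 0, 0]))   # 0.5: only k = 2 of n = 4 users are served
# Scale independence: multiplying every share by 10 leaves the index unchanged.
print(jain_index([1, 2]) == jain_index([10, 20]))   # True
```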
Optimization Criteria (Metrics)
Distributedness: only depends on the congestion avoidance bit
End hosts control independently
Convergence: responsiveness
Smoothness
Congestion Detection at Router
Assumption: single-server FIFO queue
Metrics: 1) utilization $\rho$, 2) number of packets $L$
Packet size distribution determines the service time distribution
When packet size varies a lot, utilization $\rho$ is not a good measure of congestion
$L$ is used instead to measure congestion
Hysteresis algorithm
Two thresholds $0 < L_1 < L_2$: set the congestion signal when $L > L_2$
Unset the congestion signal when $L < L_1$
Equivalently, the algorithm maintains a center $C$ and width $K$ such that $L_1 = C - K$ and $L_2 = C + K$
Power is maximized (in experiment) with $C = 1$ and $K = 0$ (or $L_1 = L_2 = 1$)
Set the congestion avoidance bit when $L \ge 1$
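The two-threshold rule can be sketched as a small state machine. This is an illustrative Python sketch, not the router implementation from the paper; the class name and the example thresholds (center 4, width 2) are assumptions.

```python
class HysteresisDetector:
    """Two-threshold congestion signal with L1 = C - K and L2 = C + K."""

    def __init__(self, center: float, width: float):
        self.low = center - width    # L1: unset the signal below this
        self.high = center + width   # L2: set the signal above this
        self.congested = False

    def update(self, queue_len: float) -> bool:
        if queue_len > self.high:
            self.congested = True
        elif queue_len < self.low:
            self.congested = False
        # Between L1 and L2 the previous state is kept (the hysteresis band).
        return self.congested


det = HysteresisDetector(center=4, width=2)   # L1 = 2, L2 = 6
signals = [det.update(q) for q in [1, 5, 7, 5, 3, 1, 5]]
print(signals)   # [False, False, True, True, True, False, False]
```

Note that the queue length 5 produces a different signal depending on history, which is the point of the hysteresis band. The degenerate setting $C = 1$, $K = 0$ collapses the band, and the slides describe it as a single test ($L \ge 1$) rather than two strict inequalities.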
Hysteresis algorithm
[Figure: sources send through a router to a destination; the router queue is annotated with thresholds $L_1$ and $L_2$, center $C$ and width $K$]
Feedback Filter at Router
Do not use the instantaneous value of $L(t)$
Average over a time interval $T$
Use the last (busy + idle) cycle time plus the busy period of the current cycle
How do end hosts respond?
From a sender's perspective: $\mathit{LastByteSent} - \mathit{LastByteAcked} \le \min\{\mathit{cwnd}, \mathit{rwnd}\}$
How to control the congestion window $\mathit{cwnd}$?
Four aspects:
Decision Frequency
Use of the Received Information
Signal Filtering
Decision Function
User Policy: Decision Frequency
Update after each acknowledgement: the sliding window size oscillates frequently
Update after receiving $W_p + W_c$ ACKs
$W_p$ and $W_c$ are the sizes of the previous and current sliding windows
User Policy: Signal Filtering
Use of Received Information: only the most recent $W_c$ packets are examined
Drop old information ($W_p$ packets) after update
Signal Filtering: reduce $W_c$ if more than 50% of the bits are set
Increase $W_c$ otherwise
Why do we use 50% as the cutting point? It depends on the optimal system utilization $\rho^*$
It also depends on the threshold $C$ used
User Policy: Signal Filtering
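Under M/M/1 (arrival rate $\lambda$, service rate $\mu$, utilization $\rho = \lambda/\mu$):
$\text{Power} = \dfrac{\text{Throughput}}{E[T]} = \dfrac{\lambda}{E[T]} = \mu^2 (1 - \rho)\rho$, where $E[T] = \dfrac{1}{\mu(1 - \rho)}$ is the mean response time
Max power is attained at $\rho^* = 0.5$
At the optimum, $p_0^* = 1 - \rho^* = 0.5$, $p_n^* = (\rho^*)^n p_0^*$
Given a threshold $C$ to set the congestion bit, at the optimum operating point:
$P^*[\text{bit set}] = 1 - p_0^* - p_1^* - \cdots - p_{C-1}^*$
If more than $P^*[\text{bit set}] \times W_c$ packets have the bit set, the system is over-utilized.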
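The 50% cut-off can be reproduced numerically from the M/M/1 formulas above. The Python sketch below is only an illustration: the function names, the grid search, and the choice $\mu = 1$ are assumptions. It finds the utilization that maximizes power and then evaluates the probability that the bit is set for a given threshold $C$.

```python
def mm1_power(rho: float, mu: float = 1.0) -> float:
    """Power = throughput / E[T] = (rho * mu) * mu * (1 - rho)."""
    return mu ** 2 * rho * (1.0 - rho)


def prob_bit_set(rho: float, threshold: int) -> float:
    """1 - p_0 - ... - p_{C-1}, with M/M/1 state probabilities p_n = (1 - rho) * rho**n."""
    return 1.0 - sum((1.0 - rho) * rho ** n for n in range(threshold))


# Coarse grid search over utilizations in (0, 1); the maximizer is 0.5.
grid = [i / 1000 for i in range(1, 1000)]
rho_star = max(grid, key=mm1_power)
print(rho_star)                    # 0.5
print(prob_bit_set(rho_star, 1))   # 0.5: with C = 1, half of the bits are set at the optimum
print(prob_bit_set(rho_star, 2))   # 0.25: a larger threshold lowers the cut-off
```

With threshold $C = 1$ and $\rho^* = 0.5$ the expected fraction of set bits at the optimum is exactly one half, which is where the 50% rule comes from.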
User Policy: Signal Filtering
Use of Received Information: only the most recent $W_c$ packets are examined
Drop old information ($W_p$ packets) after update
Signal Filtering: reduce $W_c$ if more than 50% of the bits are set
Increase $W_c$ otherwise
Decision Function: how much to increase/decrease?
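Putting the three user-policy pieces together, a window update might look like the Python sketch below. It is a hedged illustration: the function name is hypothetical, and the default step sizes (add 1, multiply by 0.875) are assumed example values rather than something stated on these slides.

```python
def update_window(w_c: float, bits: list, increase: float = 1.0, decrease: float = 0.875) -> float:
    """Return the new window size from the congestion bits of recent ACKs.

    Only the bits of the most recent W_c packets are examined; older entries
    in `bits` (the W_p packets of the previous window) are ignored.
    """
    n = max(1, int(w_c))
    recent = bits[-n:]
    if not recent:
        return float(w_c)            # no feedback yet: keep the window
    fraction_set = sum(recent) / len(recent)
    if fraction_set > 0.5:           # signal filter: more than 50% of bits set
        return max(1.0, decrease * w_c)
    return w_c + increase


print(update_window(8, [1, 1, 1, 1, 1, 0, 0, 0]))   # 7.0: 5 of 8 bits set, decrease
print(update_window(8, [1, 1, 1, 1, 0, 0, 0, 0]))   # 9.0: exactly 50% is not "more than"
print(update_window(4, [1, 1, 1, 1, 0, 0, 0, 0]))   # 5.0: only the last 4 bits count
```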
Decision Function Requirements
Achieve efficiency (high resource power)
Achieve fairness (high Jain's index)
Minimize oscillations
Minimize convergence time
Linear Decision Choices
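1. Additive increase additive decrease (AIAD). Increase: $W_c^{new} = W_c^{old} + a$; Decrease: $W_c^{new} = W_c^{old} - b$
2. Additive increase multiplicative decrease (AIMD). Increase: $W_c^{new} = W_c^{old} + a$; Decrease: $W_c^{new} = d\,W_c^{old}$
3. Multiplicative increase additive decrease (MIAD). Increase: $W_c^{new} = c\,W_c^{old}$; Decrease: $W_c^{new} = W_c^{old} - b$
4. Multiplicative increase and decrease (MIMD). Increase: $W_c^{new} = c\,W_c^{old}$; Decrease: $W_c^{new} = d\,W_c^{old}$
(with $a, b > 0$, $c > 1$, $0 < d < 1$)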
References
Dah Ming Chiu and Raj Jain, "Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks", Computer Networks and ISDN Systems, 1989, Vol. 17, pp. 1-14.
Synchronous Feedback Model
Feedback control loop is synchronous
Congestion state is determined by the number of packets in the system
Single bottleneck and binary feedback
$y(t) = \begin{cases} 0 & \text{if } \sum_{i=1}^{n} x_i(t) \le C \\ 1 & \text{if } \sum_{i=1}^{n} x_i(t) > C \end{cases}$
Distributed Linear Control
Each user $i$ adjusts the window size $x_i$ as a linear function based on feedback
$x_i(t+1) = \begin{cases} a_I + b_I x_i(t) & \text{if } y(t) = 0 \\ a_D + b_D x_i(t) & \text{if } y(t) = 1 \end{cases}$
Decision on the values of $a_I$, $a_D$, $b_I$ and $b_D$:
AIAD: $a_I > 0 > a_D$; $b_I = b_D = 1$.
AIMD: $a_I > 0 = a_D$; $0 < b_D < b_I = 1$.
MIAD: $a_D < 0 = a_I$; $b_D = 1 < b_I$.
MIMD: $a_I = a_D = 0$; $0 < b_D < 1 < b_I$.
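A small simulation of the synchronous model makes the difference between the policies concrete. The Python sketch below is illustrative only: the two-user starting point, the capacity of 50, the parameter values, and the clamping of windows at zero are all assumptions.

```python
def jain_index(x):
    """Jain's fairness index of an allocation."""
    return sum(x) ** 2 / (len(x) * sum(v * v for v in x))


def step(x, capacity, a_i, b_i, a_d, b_d):
    """One synchronous round: every user sees the same binary feedback y(t)."""
    y = 1 if sum(x) > capacity else 0
    if y == 0:
        return [a_i + b_i * v for v in x]
    return [max(0.0, a_d + b_d * v) for v in x]   # clamp at zero (assumption)


def simulate(x0, capacity, params, steps):
    x = list(x0)
    for _ in range(steps):
        x = step(x, capacity, *params)
    return x


AIMD = (1.0, 1.0, 0.0, 0.5)    # a_I, b_I, a_D, b_D
AIAD = (1.0, 1.0, -1.0, 1.0)

start = [1.0, 40.0]            # a very unfair initial allocation
aimd = simulate(start, 50.0, AIMD, 200)
aiad = simulate(start, 50.0, AIAD, 200)
print(round(jain_index(aimd), 4), round(jain_index(aiad), 4))
```

Under AIAD the gap between the two users never changes, so fairness stays where it started; under AIMD each multiplicative decrease halves the gap while additive increases preserve it, so the allocation drifts toward equal shares.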
Pictorial Explanation/Intuition
Additive Movement
Multiplicative Movement
AIMD Works
AIAD Is Not Fair (Nor Is MIMD)
$x(t+1) = a + b\,x(t)$
$a \le 0$; $b \le 1$ needed
Efficiency Convergence
[Figure: regions of the $(a, b)$ plane labeled $a \le 0, b \le 1$; $a \ge 0, b \ge 1$; $a \le 0, b \ge 1$; $a \ge 0, b \le 1$]
Fairness Convergence
Conclusion: Decrease must be multiplicative in order to achieve fairness and efficiency.
Increase Fairness & Efficiency
Conclusion: Optimal increase is additive in order to achieve fairness and efficiency.
Decision Function Choices
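1. Additive increase additive decrease (AIAD). Increase: $W_c^{new} = W_c^{old} + a$; Decrease: $W_c^{new} = W_c^{old} - b$
2. Additive increase multiplicative decrease (AIMD). Increase: $W_c^{new} = W_c^{old} + a$; Decrease: $W_c^{new} = d\,W_c^{old}$
3. Multiplicative increase additive decrease (MIAD). Increase: $W_c^{new} = c\,W_c^{old}$; Decrease: $W_c^{new} = W_c^{old} - b$
4. Multiplicative increase and decrease (MIMD). Increase: $W_c^{new} = c\,W_c^{old}$; Decrease: $W_c^{new} = d\,W_c^{old}$
(with $a, b > 0$, $c > 1$, $0 < d < 1$)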
Optimal Convergence To
Efficiency
$t_e$: time to convergence
Responsiveness improves with large increase/decrease parameters
$s_e$: oscillation size
Smoothness improves with small increase/decrease parameters
Fairness: AIMD is the optimal mechanism that converges to fairness
Buffer Management
Buffer overflow under congestion:
When to drop packets?
Which packets to drop?
[Figure: two sources send through a router over a 1.5-Mbps T1 link to a destination]
Network Layer 4-36
Chapter 4 Network Layer
A note on the use of these ppt slides: We're making these slides freely available to all (faculty, students, readers).
They're in PowerPoint form so you can add, modify, and delete slides
(including this one) and slide content to suit your needs. They obviously
represent a lot of work on our part. In return for use, we only ask the
following:
If you use these slides (e.g., in a class) in substantially unaltered form, that
you mention their source (after all, we'd like people to use our book!)
If you post any slides in substantially unaltered form on a www site, that
you note that they are adapted from (or perhaps identical to) our slides, and
note our copyright of this material.
Thanks and enjoy! JFK/KWR
All material copyright 1996-2010
J.F Kurose and K.W. Ross, All Rights Reserved
Computer Networking: A Top Down Approach 5th edition. Jim Kurose, Keith Ross Addison-Wesley, April 2009.
Chapter 4: Network Layer
4.1 Introduction
4.2 Virtual circuit and datagram networks
4.3 What's inside a router?
4.4 IP: Internet Protocol
  Datagram format
  IPv4 addressing
  ICMP
  IPv6
4.5 Routing algorithms
  Link state
  Distance Vector
  Hierarchical routing
4.6 Routing in the Internet
  RIP
  OSPF
  BGP
4.7 Broadcast and multicast routing
Router Architecture Overview
two key router functions:
run routing algorithms/protocols (RIP, OSPF, BGP)
forward datagrams from incoming to outgoing link
[Figure: router input ports, switching fabric, routing processor, router output ports]
Input Port Functions
[Figure: input port stages: line termination, link layer protocol (receive), lookup/forwarding/queueing, then the switch fabric]
Physical layer: bit-level reception
Data link layer: e.g., Ethernet (see chapter 5)
Decentralized switching: given the datagram destination, look up the output port using the forwarding table in input port memory
goal: complete input port processing at "line speed"
queuing: if datagrams arrive faster than the forwarding rate into the switch fabric
Switching fabrics
transfer packet from input buffer to appropriate output buffer
switching rate: rate at which packets can be transferred from inputs to outputs
often measured as a multiple of the input/output line rate
N inputs: switching rate N times line rate desirable
three types of switching fabrics: memory, bus, crossbar
Switching Via Memory
First generation routers:
traditional computers with switching under direct control of CPU
packet copied to system's memory
speed limited by memory bandwidth (2 bus crossings per datagram)
[Figure: input port (e.g., Ethernet) to memory to output port (e.g., Ethernet) over the system bus]
Switching Via a Bus
datagram moves from input port memory to output port memory via a shared bus
bus contention: switching speed limited by bus bandwidth
32 Gbps bus, Cisco 5600: sufficient speed for access and enterprise routers
Switching Via An Interconnection Network
overcome bus bandwidth limitations
Banyan networks, crossbar, other interconnection nets: initially developed to connect processors in a multiprocessor
advanced design: fragment the datagram into fixed-length cells, switch the cells through the fabric
Cisco 12000: switches 60 Gbps through the interconnection network
Output Ports
buffering required when datagrams arrive from fabric faster than the transmission rate
scheduling discipline chooses among queued datagrams for transmission
[Figure: output port stages: switch fabric, datagram buffer (queueing), link layer protocol (send), line termination]
Output port queueing
buffering when arrival rate via switch exceeds output line speed
queueing (delay) and loss due to output port buffer overflow!
[Figure: two snapshots of the switch fabric: at time t, packets move from input to output; one packet time later]
How much buffering?
RFC 3439 rule of thumb: average buffering equal to "typical" RTT (say 250 msec) times link capacity C
e.g., C = 10 Gbps link: 2.5 Gbit buffer
recent recommendation: with N flows, buffering equal to $\dfrac{RTT \cdot C}{\sqrt{N}}$
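The two buffer-sizing rules are simple enough to check by hand; the Python sketch below just does the arithmetic. The function names and the flow count of 10,000 are illustrative assumptions.

```python
import math


def buffer_rule_of_thumb(rtt_s: float, capacity_bps: float) -> float:
    """Buffer (bits) = RTT * C."""
    return rtt_s * capacity_bps


def buffer_with_flows(rtt_s: float, capacity_bps: float, n_flows: int) -> float:
    """Buffer (bits) = RTT * C / sqrt(N) for N flows sharing the link."""
    return rtt_s * capacity_bps / math.sqrt(n_flows)


rtt = 0.25    # 250 msec
c = 10e9      # 10 Gbps
print(buffer_rule_of_thumb(rtt, c))         # 2500000000.0 bits = 2.5 Gbit
print(buffer_with_flows(rtt, c, 10_000))    # 25000000.0 bits = 25 Mbit
```

With many flows sharing the link, the second rule asks for far less memory than the first: two orders of magnitude less in this example.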
Input Port Queuing
fabric slower than input ports combined -> queueing may occur at input queues
queueing delay and loss due to input buffer overflow!
Head-of-the-Line (HOL) blocking: queued datagram at front of queue prevents others in queue from moving forward
[Figure: two snapshots of the switch fabric. Output port contention: only one red datagram can be transferred, the lower red packet is blocked. One packet time later: the green packet experiences HOL blocking]
References
Sally Floyd and Van Jacobson, "Random Early Detection Gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, Vol. 1, No. 4, August 1993.
Congestion Avoidance
"Default" mechanism: FIFO, drop-tail
Congestion can only be detected after a packet drop
Induces long queues and queueing delays
Main goal and desirable objectives:
Provide congestion avoidance by controlling the average queue length
High throughput and low delay
Routers can detect congestion better: they can distinguish propagation delay from queueing delay
Random Early Detection (RED)
Does not assume cooperative end hosts, and provides probabilistic fairness to flows
General buffer management scheme that can be used with other congestion control mechanisms, e.g. TCP, and scheduling mechanisms, e.g. FIFO, priority queueing.
Does not require all routers in the Internet to implement in order for RED to work (incremental deployment is possible)
Avoid global synchronization
RED Versus DECbit
Computing average queue size
DECbit: last (busy + idle) cycle + current busy cycle for averaging queue size
RED: time-based exponential decay
Notifying congestion
DECbit: no separation of detection and marking, biased against bursty traffic
RED: randomized marking, avoids global synchronization
RED Algorithm (high-level)
For each packet arrival
  calculate the average queue size $avg$
  if $t_{min} \le avg < t_{max}$
    calculate probability $p_a$
    mark the arriving packet with probability $p_a$
  else if $t_{max} \le avg$
    mark the arriving packet
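RED Active Queue Management
$P(x) = \begin{cases} 0, & 0 \le x < t_{min} \\ \dfrac{x - t_{min}}{t_{max} - t_{min}}\, p_{max}, & t_{min} \le x \le t_{max} \\ 1, & t_{max} < x \end{cases}$
[Figure: marking probability $P(x)$ versus average queue size $x$, with $t_{min}$ and $t_{max}$ on the horizontal axis and 0, $p_{max}$ and 1 on the vertical axis]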
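The piecewise marking function translates directly into code. The Python sketch below is a minimal illustration; the function name and the example value $p_{max} = 0.02$ are assumptions, while the thresholds 5 and 15 match the simulation parameters quoted later in the slides.

```python
def red_mark_probability(avg: float, t_min: float, t_max: float, p_max: float) -> float:
    """Piecewise-linear RED marking probability P(x) for average queue size avg."""
    if avg < t_min:
        return 0.0                                        # no marking below t_min
    if avg <= t_max:
        return p_max * (avg - t_min) / (t_max - t_min)    # linear ramp up to p_max
    return 1.0                                            # always mark above t_max


for avg in (2, 5, 10, 15, 16):
    print(avg, red_mark_probability(avg, t_min=5, t_max=15, p_max=0.02))
```

The function jumps from $p_{max}$ to 1 just above $t_{max}$, so the ramp only governs the region between the two thresholds.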
RED Algorithm (part 1)
Initialization: $avg = 0$; $count = -1$;
for each packet arrival
  calculate the average queue size $avg$:
  if queue is nonempty: $avg = (1 - w_q)\,avg + w_q\,q$
  else: $m = f(time - q\_time)$;
    $avg = (1 - w_q)^m\,avg$
RED Algorithm Variables
$w_q$: the weight of the current queue sample in the moving average (the historical $avg$ is discounted by $1 - w_q$)
If $w_q$ is too big, the router cannot filter out transient congestion
An upper bound on $w_q$ is derived in the paper
$q\_time$: starting time of the most recent idle period
$m = f(time - q\_time)$: number of packets that could have been transmitted during the idle period
RED Algorithm (part 2)
  if $t_{min} \le avg < t_{max}$
    $count = count + 1$;
    calculate probability $p_a$:
      $p_b = p_{max}(avg - t_{min})/(t_{max} - t_{min})$;
      $p_a = p_b/(1 - count \times p_b)$;
    with probability $p_a$:
      mark the arriving packet; $count = 0$;
  else if $t_{max} \le avg$
    mark the arriving packet; $count = 0$;
  else $count = -1$;
When queue becomes empty: $q\_time = time$
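Parts 1 and 2 of the pseudocode can be combined into one small class. The Python sketch below follows the slide's steps but is not the reference implementation: the class and method names are hypothetical, the idle-time function $f$ is assumed to be linear (`pkts_per_sec` times the idle duration), and $p_a$ is capped at 1 when the denominator reaches zero.

```python
import random


class RedQueue:
    """Sketch of RED marking decisions; packet storage itself is left to the caller."""

    def __init__(self, t_min, t_max, p_max, w_q, pkts_per_sec, rng=None):
        self.t_min, self.t_max = t_min, t_max
        self.p_max, self.w_q = p_max, w_q
        self.pkts_per_sec = pkts_per_sec      # used for the assumed linear f()
        self.avg = 0.0
        self.count = -1
        self.q_time = 0.0
        self.rng = rng or random.Random()

    def on_arrival(self, q_len: int, now: float) -> bool:
        """Update avg for the arriving packet and return True if it should be marked."""
        if q_len > 0:
            self.avg = (1 - self.w_q) * self.avg + self.w_q * q_len
        else:
            # Idle queue: decay avg as if m small packets had been transmitted.
            m = self.pkts_per_sec * (now - self.q_time)
            self.avg = (1 - self.w_q) ** m * self.avg

        if self.t_min <= self.avg < self.t_max:
            self.count += 1
            p_b = self.p_max * (self.avg - self.t_min) / (self.t_max - self.t_min)
            denom = 1 - self.count * p_b
            p_a = 1.0 if denom <= 0 else min(1.0, p_b / denom)
            if self.rng.random() < p_a:
                self.count = 0
                return True
            return False
        if self.avg >= self.t_max:
            self.count = 0
            return True
        self.count = -1
        return False

    def on_queue_empty(self, now: float) -> None:
        self.q_time = now


red = RedQueue(t_min=5, t_max=15, p_max=0.02, w_q=0.002, pkts_per_sec=1000,
               rng=random.Random(0))
print(red.on_arrival(q_len=10, now=0.0), red.avg)   # avg rises slowly, no mark yet
```

Because $w_q$ is small, a single arrival at queue length 10 moves the average only to 0.02, which shows how the filter ignores short bursts.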
RED Algorithm Variables
$count$: number of unmarked packets under congestion
No congestion: $count = -1$
Just marked a packet, reset: $count = 0$
The most recent $n$ packets are not marked under congestion: $count = n$
Relationship between $p_a$ and $p_b$
$p_a = p_b/(1 - count \times p_b)$
$p_a$ increases with $count$
Ensures that the router does not wait too long before marking a packet
Makes the inter-dropping time uniform
Relationship between $p_a$ and $p_b$
Use $p_b$ as the final dropping probability: the inter-dropping time $X$ is a geometric r.v.
$\text{Prob}[X = n] = (1 - p_b)^{n-1} p_b$; $E[X] = 1/p_b$
More desirable to have a uniform distribution
Use $p_a$ as the final dropping probability:
$\text{Prob}[X = n] = \dfrac{p_b}{1 - (n-1)p_b} \prod_{i=0}^{n-2} \left(1 - \dfrac{p_b}{1 - i\,p_b}\right) = \begin{cases} p_b & 1 \le n \le 1/p_b \\ 0 & n > 1/p_b \end{cases}$
$E[X] = \dfrac{1}{2p_b} + \dfrac{1}{2}$
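The claim that $p_a$ makes the inter-dropping time uniform can be verified with exact arithmetic. The Python sketch below evaluates both distributions for an assumed example value $p_b = 1/10$; the function names are illustrative.

```python
from fractions import Fraction


def geometric_pmf(p_b: Fraction, n: int) -> Fraction:
    """Prob[X = n] when every packet is marked independently with probability p_b."""
    return (1 - p_b) ** (n - 1) * p_b


def uniformized_pmf(p_b: Fraction, n: int) -> Fraction:
    """Prob[X = n] when the k-th packet since the last mark uses p_b / (1 - (k-1) p_b)."""
    survive = Fraction(1)
    for i in range(n - 1):                 # packets 1 .. n-1 must stay unmarked
        denom = 1 - i * p_b
        if denom <= 0:
            return Fraction(0)
        survive *= 1 - p_b / denom
    denom = 1 - (n - 1) * p_b
    if denom <= 0:
        return Fraction(0)
    return survive * p_b / denom


p_b = Fraction(1, 10)
pmf = [uniformized_pmf(p_b, n) for n in range(1, 13)]
print(pmf)   # 1/10 for n = 1..10, then 0
mean = sum(n * uniformized_pmf(p_b, n) for n in range(1, 11))
print(mean, 1 / (2 * p_b) + Fraction(1, 2))   # both 11/2
```

The product telescopes, which is why each of the first $1/p_b$ values comes out equal to $p_b$ and the mean matches $1/(2p_b) + 1/2$; the geometric alternative has mean $1/p_b$ and a long tail.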
Simulation Results
Topology 4 FTP flows and 1 RED router
Parameters: $min_{th} = 5$; $max_{th} = 15$; $w_q = 0.002$
Simulation Results
[Figures: simulation result plots; the plotted data is not recoverable from the transcript]
How About TCP?
How to control congestion window sizes?
How to infer congestion?
Why and how to estimate and smooth RTT?
What is Slow Start? Why do we need it?
Retransmit timer back-off policy
Acknowledgement sending policy
References
Van Jacobson, "Congestion Avoidance and Control", ACM Computer Communication Review, Vol. 18, No. 4, August 1988, pp. 314-329.