Congestion Avoidance
Richard T. B. Ma
School of Computing
National University of Singapore
CS 5229: Advanced Computer Networks
References
K. K. Ramakrishnan, Raj Jain, “A Binary Feedback Scheme for Congestion Avoidance in Computer Networks with a Connectionless Network Layer”, ACM Computer Communication Review, Vol. 18, No. 4, August 1988, pp. 303-313.
Congestion Collapse
When: October 1986
Where: Lawrence Berkeley Laboratory (LBL) to UC Berkeley site, 400 yards and 3 hops away
What: Throughput dropped from 32 Kbps to 40 bps
Network Congestion
[Figure: Congestion in a packet-switched network. Labels: two sources, a router, a destination, and a 1.5-Mbps T1 link.]
Flow Control VS Congestion Control
End-to-end flow control is a “selfish” control function
Make sure there is sufficient buffer space at the destination
$\mathit{RcvWindow} = \mathit{RcvBuffer} - (\mathit{LastByteRcvd} - \mathit{LastByteRead})$
Congestion control solves a “social” problem
The logical links of the network cooperate to avoid or recover from congestion at the intermediate nodes they share
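The receive-window formula on this slide is plain arithmetic, and a small sketch may make it concrete. The function name and the byte counts below are illustrative assumptions, not values from the lecture.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare receive-buffer space (bytes) that the destination can advertise."""
    # Bytes that have arrived but that the application has not read yet.
    buffered = last_byte_rcvd - last_byte_read
    return rcv_buffer - buffered


# Hypothetical numbers: a 65536-byte buffer holding 20000 unread bytes.
window = rcv_window(rcv_buffer=65536, last_byte_rcvd=50000, last_byte_read=30000)
print(window)  # 45536
```

The sender only needs this one number from the destination, which is why flow control is the "selfish" end-to-end function; it says nothing about the routers in between.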
Congestion Avoidance/Control
Congestion control: Detect and reduce load from the “Cliff”
Congestion avoidance: Operate network at the “Knee”
Connectionless Flows
[Figure: Multiple flows passing through a set of routers. Labels: three sources, three routers, and destinations Dest1 and Dest2.]
DECbit Scheme: 1-bit Feedback
Minimum feedback information: one congestion avoidance bit
Set by the router if congested
The destination sends it back in the ACK
When does the router set the avoidance bit?
How does an end host respond?
[Figure: Congestion Avoidance Bit, showing example bit values 0 and 1.]
Optimization Criteria (Metrics)
Efficiency: maximize “Power”
$\text{Power} \stackrel{\text{def}}{=} \dfrac{\text{Throughput}^{\alpha}}{\text{Response Time}}, \quad 0 < \alpha < 1$
Operate at the “Knee” when maximizing for $\alpha = 1$
$\text{Resource Efficiency} \stackrel{\text{def}}{=} \dfrac{\text{Resource Power}}{\text{Resource Power at Knee}}$
Less than 100% efficiency can happen when
• capacity is underutilized (low throughput)
• capacity is overutilized (high response time)
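To see how power and resource efficiency behave, here is a minimal sketch that assumes an M/M/1 queue (an assumption made for illustration; the slide does not fix a queueing model). The names `mm1_power`, `knee` and `efficiency` are hypothetical, and the knee is found by a simple grid search.

```python
def mm1_power(rho: float, alpha: float = 1.0, mu: float = 1.0) -> float:
    """Power = throughput**alpha / response time, assuming an M/M/1 queue."""
    throughput = rho * mu                      # carried load
    response_time = 1.0 / (mu * (1.0 - rho))   # M/M/1 mean response time
    return throughput ** alpha / response_time


def knee(alpha: float = 1.0, steps: int = 1000) -> float:
    """Utilization that maximizes power, located on a uniform grid."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda rho: mm1_power(rho, alpha))


def efficiency(rho: float, alpha: float = 1.0) -> float:
    """Resource power divided by resource power at the knee."""
    return mm1_power(rho, alpha) / mm1_power(knee(alpha), alpha)


print(knee(1.0))         # utilization at the knee for alpha = 1
print(efficiency(0.25))  # below the knee: underutilized
print(efficiency(0.75))  # above the knee: overutilized
```

Under this assumed model, operating on either side of the knee gives efficiency below 1, matching the two bullet cases above.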
Optimization Criteria (Metrics)
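Fairness: maximize Jain's index
$J(x) = \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n \sum_{i=1}^{n} x_i^2}$
$x_i$ denotes user $i$'s resource share, absolute or %
$J(x) \in [0, 1]$
Independent of scale
Continuous
$J(x) = k/n$ if only $k$ of the $n$ users share the resource equally and the others get nothing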
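The listed properties of Jain's index can be checked numerically. The sketch below is a direct transcription of the formula; the function name and the sample allocations are illustrative.

```python
def jain_index(shares):
    """Jain's fairness index: (sum x_i)^2 / (n * sum x_i^2)."""
    n = len(shares)
    sum_sq = sum(x * x for x in shares)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero share")
    total = sum(shares)
    return total * total / (n * sum_sq)


print(jain_index([1, 1, 1, 1]))   # all users equal
print(jain_index([5, 5, 0, 0]))   # only k = 2 of n = 4 users served
print(jain_index([1, 2, 3]), jain_index([10, 20, 30]))  # same value at any scale
```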
Optimization Criteria (Metrics)
Distributedness: depends only on the congestion avoidance bit
End hosts control independently
Convergence: responsiveness and smoothness
Congestion Detection at Router
Assumption: single server FIFO type
Metrics: 1) utilization 𝜌, 2) # of users 𝐿
Packet size distribution determines service time distribution
When packet size varies a lot, utilization 𝜌 is not a good measure of congestion
𝐿 is used instead to measure congestion
Hysteresis algorithm
Two thresholds 0 < 𝑇1 < 𝑇2
Set congestion signal when 𝐿 > 𝑇2
Unset congestion signal when 𝐿 < 𝑇1
Equivalently, the algorithm maintains a center 𝐶 and width 𝐾 such that 𝑇1 = 𝐶 − 𝐾 and 𝑇2 = 𝐶 + 𝐾
Power is maximized (in experiment) with 𝐶 = 1 and 𝐾 = 0 (or 𝑇1 = 𝑇2 = 1)
Set congestion avoidance bit when 𝐿 ≥ 1
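The hysteresis rule can be sketched as a small state machine. The class name and the sample queue lengths are hypothetical; the thresholds follow the center/width form given above.

```python
class HysteresisDetector:
    """Congestion signal with two thresholds T1 = C - K and T2 = C + K."""

    def __init__(self, center: float = 1.0, width: float = 0.0):
        self.t1 = center - width
        self.t2 = center + width
        self.congested = False

    def update(self, queue_len: float) -> bool:
        if queue_len > self.t2:
            self.congested = True    # set the signal above T2
        elif queue_len < self.t1:
            self.congested = False   # unset the signal below T1
        # Between T1 and T2 the previous state is kept (the hysteresis band).
        return self.congested


detector = HysteresisDetector(center=2.0, width=1.0)   # T1 = 1, T2 = 3
signals = [detector.update(q) for q in [0, 2, 4, 2, 0.5, 2]]
print(signals)  # [False, False, True, True, False, False]
```

With the width set to 0 the band disappears, and the rule reduces to the single-threshold test stated on the slide (set the bit when the queue length is at least 1).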
Feedback Filter at Router
Do not use instantaneous value of 𝐿(𝑡)
Average over time interval 𝑇
Using last (busy + idle) cycle time plus the busy period of the current cycle
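A minimal sketch of the time average described above, assuming the router records the queue length as a piecewise-constant trace of `(time, queue_length)` change points. The trace format, the function name and the numbers are assumptions for illustration, and the trace is assumed to start at or before `cycle_start`.

```python
def average_queue_length(events, cycle_start, now):
    """Time-average of a piecewise-constant queue length over [cycle_start, now].

    events: time-sorted list of (time, queue_length) change points.
    cycle_start: start of the previous (busy + idle) cycle.
    """
    area = 0.0
    # Pair each change point with the next one; the last segment ends at `now`.
    for (t, q), (t_next, _) in zip(events, events[1:] + [(now, None)]):
        lo, hi = max(t, cycle_start), min(t_next, now)
        if hi > lo:
            area += q * (hi - lo)
    return area / (now - cycle_start)


# Previous cycle: busy on [0, 4), idle on [4, 6); current busy period: [6, 8).
trace = [(0, 1), (2, 2), (4, 0), (6, 1), (7, 2)]
print(average_queue_length(trace, cycle_start=0, now=8))  # 1.125
```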
User Policy: Decision Frequency
Update after each acknowledgement
Update after receiving 𝑊𝑝 + 𝑊𝑐 packets
𝑊𝑝 and 𝑊𝑐 are the previous and current window sizes
User Policy: Signal Filtering
Use of received information
Only the most recent 𝑊𝑐 packets are examined
Drop old information (𝑊𝑝 packets) after update
Signal filtering
Reduce 𝑊𝑐 if more than 50% of the bits are set
Increase 𝑊𝑐 otherwise
Why do we use 50% as the cutoff point?
User Policy: Signal Filtering
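For M/M/1, $\text{Power} = \dfrac{\text{Throughput}}{\text{RTT}} \approx \rho \cdot \dfrac{1/\mu}{E[D]} = \rho(1 - \rho)$
Max power is attained at $\rho^* = 0.5$
At the optimum, $\pi_0^* = 1 - \rho^* = 0.5$ and $\pi_i^* = \pi_0^* (\rho^*)^i = 0.5^{i+1}$
If a threshold $C$ is used to set the congestion bit, then at the optimum operating point
$P^*(\text{bit set}) = 1 - \pi_0^* - \pi_1^* - \cdots - \pi_{C-1}^*$
With $C = 1$, $P^*(\text{bit set}) = 1 - \pi_0^* = 0.5$, which gives the 50% cutoff
If more than $P^*(\text{bit set}) \times W_c$ packets have the bit set, the system is overutilized; otherwise, it is underutilized.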
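The cutoff can be computed for any threshold with a few lines. This is a sketch under the M/M/1 assumption used on this slide; `prob_bit_set` and `should_decrease` are hypothetical helper names.

```python
def prob_bit_set(threshold: int, rho: float = 0.5) -> float:
    """P(bit set) = 1 - pi_0 - ... - pi_{C-1} for an M/M/1 queue at load rho."""
    pi = [(1 - rho) * rho ** i for i in range(threshold)]  # pi_i = (1 - rho) rho^i
    return 1.0 - sum(pi)


def should_decrease(bits, threshold: int = 1) -> bool:
    """True if more than P*(bit set) x W_c of the last W_c packets carry the bit."""
    return sum(bits) > prob_bit_set(threshold) * len(bits)


print(prob_bit_set(1))                # 0.5, the 50% cutoff for C = 1
print(prob_bit_set(2))                # 0.25 for C = 2
print(should_decrease([1, 1, 1, 0]))  # True: 3 of 4 bits set
print(should_decrease([1, 1, 0, 0]))  # False: exactly half is not "more than"
```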
Decision function: how much to increase or decrease?
Decision Function Requirements
Achieve efficiency (high resource power)
Achieve fairness (high Jain's index)
Minimize oscillations
Minimize convergence time
Decision Function Choices
1. Additive increase, additive decrease (AIAD) ↑: $W_c^i = W_p^i + b$; ↓: $W_c^i = W_p^i - d$
2. Additive increase, multiplicative decrease (AIMD) ↑: $W_c^i = W_p^i + b$; ↓: $W_c^i = d\,W_p^i$
3. Multiplicative increase, additive decrease (MIAD) ↑: $W_c^i = b\,W_p^i$; ↓: $W_c^i = W_p^i - d$
4. Multiplicative increase and decrease (MIMD) ↑: $W_c^i = b\,W_p^i$; ↓: $W_c^i = d\,W_p^i$
References
Dah Ming Chiu and Raj Jain, “Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks”, Computer Networks and ISDN Systems, 1989, Vol. 17, pp. 1-14.
Synchronous Feedback Model
Feedback control loop is synchronous
Congestion state is determined by the number of packets in the system
Single bottleneck
Binary feedback
Distributed Linear Control
Each user $i$ adjusts the window size $x_i$ as a function of the feedback $y(t)$
$x_i(t+1) = \begin{cases} a_I + b_I x_i(t), & \text{if } y(t) = 0 \\ a_D + b_D x_i(t), & \text{if } y(t) = 1 \end{cases}$
Decision on the values of $a_I$, $a_D$, $b_I$ and $b_D$:
AIAD: $a_I > 0 > a_D$; $b_I = b_D = 1$.
AIMD: $a_I > 0$; $0 < b_D < b_I = 1$.
MIAD: $a_D < 0$; $b_D = 1 < b_I$.
MIMD: $0 < b_D < 1 < b_I$.
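The four schemes differ only in their coefficients, so one update rule covers them all. The sketch below assumes illustrative parameter values (increase by 1 or by a factor of 2, decrease by 1 or by a factor of 0.5); these numbers are not from the lecture.

```python
from typing import NamedTuple


class LinearControl(NamedTuple):
    """Coefficients of x(t+1) = a + b * x(t), chosen by the binary feedback y(t)."""
    a_i: float
    b_i: float
    a_d: float
    b_d: float

    def update(self, x: float, y: int) -> float:
        if y == 0:                            # no congestion: increase
            return self.a_i + self.b_i * x
        return self.a_d + self.b_d * x        # congestion: decrease


# Illustrative parameter choices satisfying the constraints listed above.
AIAD = LinearControl(a_i=1.0, b_i=1.0, a_d=-1.0, b_d=1.0)
AIMD = LinearControl(a_i=1.0, b_i=1.0, a_d=0.0, b_d=0.5)
MIAD = LinearControl(a_i=0.0, b_i=2.0, a_d=-1.0, b_d=1.0)
MIMD = LinearControl(a_i=0.0, b_i=2.0, a_d=0.0, b_d=0.5)

for name, ctl in [("AIAD", AIAD), ("AIMD", AIMD), ("MIAD", MIAD), ("MIMD", MIMD)]:
    print(name, ctl.update(10.0, 0), ctl.update(10.0, 1))
```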
Pictorial Explanation/Intuition
Additive Movement
Multiplicative Movement
AIMD Works
AIAD Is Not Fair (Nor Is MIMD)
Efficiency Convergence
Fairness Convergence
Conclusion: Decrease must be multiplicative in order to achieve fairness and efficiency.
Increase Fairness & Efficiency
Conclusion: Optimal increase is additive in order to achieve fairness and efficiency.
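These conclusions can be checked with a small two-user simulation of the synchronous feedback model. This is a sketch under stated assumptions: a single bottleneck with a hypothetical capacity of 20, feedback $y(t) = 1$ when the total load exceeds that capacity, and illustrative increase/decrease parameters.

```python
def simulate(a_i, b_i, a_d, b_d, start, capacity=20.0, steps=200):
    """Run the linear control for all users under shared binary feedback."""
    x = list(start)
    for _ in range(steps):
        if sum(x) > capacity:                            # y(t) = 1: everyone decreases
            x = [max(a_d + b_d * xi, 0.0) for xi in x]
        else:                                            # y(t) = 0: everyone increases
            x = [a_i + b_i * xi for xi in x]
    return x


def jain(x):
    return sum(x) ** 2 / (len(x) * sum(v * v for v in x))


aimd = simulate(1.0, 1.0, 0.0, 0.5, start=[1.0, 15.0])
aiad = simulate(1.0, 1.0, -1.0, 1.0, start=[1.0, 15.0])
print("AIMD", aimd, jain(aimd))   # shares end up nearly equal
print("AIAD", aiad, jain(aiad))   # the initial gap of 14 never shrinks
```

Additive steps move both users by the same amount and so preserve their difference, while each multiplicative decrease halves it; that is why only the AIMD run approaches the fairness line.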
Optimal Convergence To
Efficiency
𝑡𝑒: time to convergence
Responsiveness improves with large increase/decrease parameters
𝑠𝑒: oscillation size
Smoothness improves with small increase/decrease parameters
Fairness
AIMD is the optimal mechanism that converges to fairness
Buffer Management
Buffer overflow under congestion
When to drop packets?
Which packets to drop?
[Figure: two sources, a router, a destination, and a 1.5-Mbps T1 link.]
References
Sally Floyd and Van Jacobson, “Random Early Detection Gateways for Congestion Avoidance”, IEEE/ACM Transactions on Networking, Vol. 1, No. 4, August 1993.
Congestion Avoidance
“Default” mechanism: FIFO, drop-tail
Congestion can be detected only after a packet drop
Induces long queues and queueing delays
Main goal and desirable objectives
Provide congestion avoidance by controlling the average queue length
High throughput and low delay
Routers can detect congestion better
Distinguish propagation delay from queueing delay
Random Early Detection (RED)
Does not assume cooperative end hosts, and provides probabilistic fairness to flows
General buffer management scheme that can be used with other congestion control mechanisms (e.g., TCP) and scheduling mechanisms (e.g., FIFO, priority queueing)
Does not require all routers in the Internet to implement it in order to work (incremental deployment is possible)
Avoids global synchronization
RED VS DECbit
Computing average queue size
DECbit: averages the queue size over the last (busy + idle) cycle plus the current busy cycle
RED: time-based exponential decay
Notifying congestion
DECbit: no separation of detection and marking, biased against bursty traffic
RED: randomized marking, avoids global synchronization
RED Algorithm (high-level)
For each packet arrival
    calculate the average queue size $\mathit{avg}$
    if $t_{min} \le \mathit{avg} < t_{max}$
        calculate probability $p_a$
        mark the arriving packet with probability $p_a$
    else if $t_{max} \le \mathit{avg}$
        mark the arriving packet
RED Active Queue Management
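$p(x) = \begin{cases} 0, & 0 \le x < t_{min} \\ \dfrac{x - t_{min}}{t_{max} - t_{min}}\, p_{max}, & t_{min} \le x \le t_{max} \\ 1, & t_{max} < x \end{cases}$
[Figure: marking probability $p(x)$ versus average queue size $x$; horizontal axis marked at $t_{min}$ and $t_{max}$, vertical axis marked at 0, $p_{max}$ and 1.]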
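The piecewise marking probability translates directly into code. The function name and the threshold values in the example (5, 15 and 0.1) are illustrative assumptions.

```python
def red_mark_probability(avg: float, t_min: float, t_max: float, p_max: float) -> float:
    """Piecewise-linear RED marking probability p(x) for average queue size avg."""
    if avg < t_min:
        return 0.0                                        # below t_min: never mark
    if avg <= t_max:
        return p_max * (avg - t_min) / (t_max - t_min)    # linear ramp up to p_max
    return 1.0                                            # above t_max: always mark


for avg in (4, 10, 15, 20):
    print(avg, red_mark_probability(avg, t_min=5, t_max=15, p_max=0.1))
```

Note the jump from $p_{max}$ to 1 once the average exceeds $t_{max}$.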
RED Algorithm (part 1)
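Initialization: $\mathit{avg} = 0$; $\mathit{count} = -1$;
for each packet arrival
    calculate the average queue size $\mathit{avg}$:
    if queue is nonempty: $\mathit{avg} = (1 - w_q)\,\mathit{avg} + w_q\, q$
    else: $m = f(\mathit{time} - \mathit{q\_time})$;
          $\mathit{avg} = (1 - w_q)^m\, \mathit{avg}$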
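A sketch of the averaging step. The slide leaves the function $f$ unspecified, so the code assumes $f$ divides the idle time by a typical packet transmission time; the `packet_time` parameter and the sample numbers are assumptions made for illustration.

```python
def update_avg(avg: float, q: int, w_q: float,
               idle_time: float = 0.0, packet_time: float = 1.0) -> float:
    """One RED averaging step at a packet arrival.

    q is the instantaneous queue size seen by the arriving packet.
    idle_time is (time - q_time), used only when the queue is empty.
    """
    if q > 0:
        # Queue nonempty: exponentially weighted moving average.
        return (1 - w_q) * avg + w_q * q
    # Queue empty: decay as if m packets had arrived to an empty queue.
    m = idle_time / packet_time   # assumed form of f(time - q_time)
    return (1 - w_q) ** m * avg


print(update_avg(avg=10.0, q=20, w_q=0.5))                # 15.0
print(update_avg(avg=8.0, q=0, w_q=0.5, idle_time=2.0))   # 2.0
```

The weight of 0.5 is chosen only to make the arithmetic easy to follow; a small $w_q$ makes the average react slowly to short bursts.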
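RED Algorithm (part 2)
    if $t_{min} \le \mathit{avg} < t_{max}$
        $\mathit{count} = \mathit{count} + 1$;
        calculate probability $p_a$:
            $p_b = p_{max}(\mathit{avg} - t_{min})/(t_{max} - t_{min})$;
            $p_a = p_b/(1 - \mathit{count} \times p_b)$;
        with probability $p_a$:
            mark the arriving packet; $\mathit{count} = 0$;
    else if $t_{max} \le \mathit{avg}$
        mark the arriving packet; $\mathit{count} = 0$;
    else $\mathit{count} = -1$;
When queue becomes empty: $\mathit{q\_time} = \mathit{time}$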
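The marking decision above can be sketched as a small class. The class name, the injected random generator and the guard against a non-positive denominator are assumptions added for this illustration; the averaging step is left out and `avg` is passed in by the caller.

```python
import random


class RedMarker:
    """Marking decision of RED part 2 for a given average queue size."""

    def __init__(self, t_min, t_max, p_max, rng=None):
        self.t_min, self.t_max, self.p_max = t_min, t_max, p_max
        self.count = -1                       # packets since the last mark
        self.rng = rng or random.Random()

    def on_arrival(self, avg: float) -> bool:
        """Return True if the arriving packet should be marked."""
        if self.t_min <= avg < self.t_max:
            self.count += 1
            p_b = self.p_max * (avg - self.t_min) / (self.t_max - self.t_min)
            denom = 1 - self.count * p_b
            # Guard (an addition here): clamp p_a to 1 when the denominator vanishes.
            p_a = 1.0 if denom <= 0 else min(1.0, p_b / denom)
            if self.rng.random() < p_a:
                self.count = 0
                return True
            return False
        if avg >= self.t_max:
            self.count = 0
            return True
        self.count = -1
        return False


marker = RedMarker(t_min=5, t_max=15, p_max=0.1, rng=random.Random(0))
marks = [marker.on_arrival(avg=10.0) for _ in range(2000)]
print(sum(marks) / len(marks))   # fraction of marked packets at avg = 10
```

Because $p_a$ grows with `count`, the gap between two marks cannot exceed roughly $1/p_b$ arrivals, which is the "does not wait too long" property.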
Relationship between 𝑝𝑎 and 𝑝𝑏
𝑝𝑎 = 𝑝𝑏/(1 − 𝑐𝑜𝑢𝑛𝑡 × 𝑝𝑏)
𝑝𝑎 increases slowly with 𝑐𝑜𝑢𝑛𝑡
Ensure that the router does not wait too long before marking a packet
Make inter-dropping time uniform
Relationship between 𝑝𝑎 and 𝑝𝑏
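Use $p_b$ as the final dropping probability
Inter-dropping time $T_b$ is geometric
$\text{Prob}[T_b = n] = (1 - p_b)^{n-1} p_b$; $E[T_b] = 1/p_b$
More desirable to have a uniform distribution
Use $p_a$ as the final dropping probability
$\text{Prob}[T_a = n] = \dfrac{p_b}{1 - (n-1)p_b} \prod_{i=0}^{n-2} \left(1 - \dfrac{p_b}{1 - i\,p_b}\right) = \begin{cases} p_b, & 1 \le n \le 1/p_b \\ 0, & n > 1/p_b \end{cases}$
$E[T_a] = \dfrac{1}{2p_b} + \dfrac{1}{2}$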
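The uniformity claim can be verified exactly with rational arithmetic. The sketch below evaluates the product formula for an illustrative value $p_b = 1/20$; the function name is hypothetical.

```python
from fractions import Fraction


def prob_gap(n: int, p_b: Fraction) -> Fraction:
    """Prob[T_a = n]: first mark on the n-th packet when p_a = p_b / (1 - count * p_b)."""
    if n > 1 / p_b:
        return Fraction(0)
    survive = Fraction(1)
    for i in range(n - 1):                        # packets 1 .. n-1 are not marked
        survive *= 1 - p_b / (1 - i * p_b)
    return survive * p_b / (1 - (n - 1) * p_b)    # packet n is marked


p_b = Fraction(1, 20)
probs = [prob_gap(n, p_b) for n in range(1, 21)]
mean = sum(n * p for n, p in zip(range(1, 21), probs))
print(set(probs))   # a single value: every gap from 1 to 20 is equally likely
print(mean)         # 21/2, matching 1/(2 p_b) + 1/2
```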
How About TCP?
How to control congestion window sizes?
How to infer congestion?
Why and how to estimate and smooth RTT?
What is Slow Start? Why do we need it?
Retransmit timer back-off policy
Acknowledgement sending policy
References
Van Jacobson, “Congestion Avoidance and Control”, ACM Computer Communication Review Vol. 18, No. 4, August 1988, pp. 314-329.