Nonlinear RED: A simple yet efficient active queue management scheme ☆

Kaiyu Zhou *, Kwan L. Yeung, Victor O.K. Li

Department of Electrical and Electronic Engineering, The University of Hong Kong, CYC805, HKU, Pokfulam Road, Hong Kong, China

Received 1 June 2005; received in revised form 26 January 2006; accepted 13 April 2006
Available online 17 May 2006

Responsible Editor: Chin-T. Lea

Computer Networks 50 (2006) 3784-3794
www.elsevier.com/locate/comnet
doi:10.1016/j.comnet.2006.04.007

Abstract

Among various active queue management (AQM) schemes, random early detection (RED) is probably the most extensively studied. Unlike the existing RED enhancement schemes, we replace the linear packet dropping function in RED by a judiciously designed nonlinear quadratic function. The rest of the original RED remains unchanged. We call this new scheme Nonlinear RED, or NLRED. The underlying idea is that, with the proposed nonlinear packet dropping function, packet dropping becomes gentler than RED at light traffic load but more aggressive at heavy load. As a result, at light traffic load, NLRED encourages the router to operate in a range of average queue sizes rather than a fixed one. When the load is heavy and the average queue size approaches the pre-determined maximum threshold (i.e. the queue size may soon get out of control), NLRED allows more aggressive packet dropping to back off from it. Simulations demonstrate that NLRED achieves a higher and more stable throughput than RED and REM, another efficient variant of RED. Since NLRED is fully compatible with RED, we can easily upgrade/replace the existing RED implementations by NLRED.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Active queue management; Nonlinear RED; Random early marking

☆ This research is supported in part by the Areas of Excellence Scheme established under the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/E-01/99), and the Research Grant Council Earmarked Grant 7048/02E.

* Corresponding author. Tel.: +852 28592692. E-mail addresses: [email protected] (K. Zhou), kyeung@eee.hku.hk (K.L. Yeung), [email protected] (V.O.K. Li).

1389-1286/$ - see front matter © 2006 Elsevier B.V. All rights reserved.

1. Introduction

Congestion control is one of the most important problems in the Internet. Most of the existing Internet routers play a passive role in congestion control, and are known as drop-tail routers. A drop-tail router discards packets when its FIFO queue is full. It was shown in [1] that under heavy load conditions, drop-tail routers cause global synchronization, a phenomenon in which all senders sharing the same bottleneck router/link shut down their transmission windows at almost the same time, thereby causing a sharp drop in the bottleneck link utilization. It was also found in [2] that drop-tail routers are biased against bursty sources. This is because, when a burst of packets from a sender arrives at a fully occupied queue, a sustained packet drop within the same window of data occurs. Research in [3,4] showed that the dominant transport layer protocol, TCP [5], lacks the ability to recover from such multiple packet losses within the same window of data. Therefore, the TCP sender has to rely on retransmission timeouts to recover the lost packets. Retransmission timeout significantly slows down the transmission rate of a TCP flow, because almost no data will be sent when the sender waits for the retransmission timer to expire. A good congestion control scheme should therefore avoid triggering unnecessary timeouts.

On the other hand, as more and more multimedia applications running on top of UDP are being deployed, the traditional approach of relying solely on TCP's end-to-end congestion control algorithms will no longer be viable [6]. The network, in particular the routers in the network, should play an active role in its resource allocation, so as to effectively control/prevent congestion. This is known as active queue management (AQM) [7]. The essence is that an AQM router may intelligently drop packets before the queue overflows.

Among various AQM schemes, random early detection (RED) [8] is probably the most extensively studied. RED is shown to effectively tackle both the global synchronization problem and the problem of bias against bursty sources. Due to its popularity, RED (or its variants) has been implemented by many router vendors in their products (e.g. Cisco implemented WRED [9]). On the other hand, there is still a lively debate on the performance of RED. Some researchers claimed that RED appears to provide no clear advantage over the drop-tail mechanism [10]. More researchers [2,11-24], however, acknowledged that RED shows some advantages over drop-tail routers but is not perfect, mainly due to one or more of the following problems.

• RED performance is highly sensitive to its parameter settings [11,12,17,18,21,23,25]. In RED (as detailed in Section 2), at least 4 parameters, namely, maximum threshold ($max_{th}$), minimum threshold ($min_{th}$), maximum packet dropping probability ($max_p$), and weighting factor ($\omega_q$), have to be properly set.

• RED performance is sensitive to the number of competing sources/flows [12,17-19,21,25].

• RED performance is sensitive to the packet size [15].

• With RED, wild queue oscillation is observed when the traffic load changes [12,22].

As a result, RED has been extended and enhanced in many different ways [2,11-21]. A common underlying technique adopted in most studies is to steer a router to operate around a fixed target queue size (which can be either an average queue size or an instantaneous queue size). There are some concerns about the suitability of this approach, since the schemes thus designed are usually more complicated than the original RED. This renders them unsuitable for backbone routers, where efficient implementation is of primary concern. In some schemes (e.g. [13]), additional parameters are also introduced, which adds extra complexity to the task of parameter setting.

Unlike the existing RED enhancement schemes, we propose to simply replace the linear packet dropping function in RED by a judiciously designed nonlinear quadratic function. The rest of the original RED remains unchanged. We call this new scheme Nonlinear RED, or NLRED. The underlying idea is that, with the proposed nonlinear packet dropping function, packet dropping is gentler than RED at light traffic load but more aggressive at heavy load. Therefore, at light traffic load NLRED encourages the router to operate in a range of average queue sizes rather than a fixed one. When the load is heavy and the average queue size approaches the maximum threshold $max_{th}$ (an indicator that the queue size may soon get out of control), NLRED allows more aggressive packet dropping to quickly back off from it. Simulations demonstrate that NLRED achieves a higher and more stable throughput than RED and REM [16], an efficient variant of RED. Since NLRED is fully compatible with RED, we can easily upgrade/replace the existing RED implementations by NLRED.

In the next section, some major RED enhancement schemes are reviewed, which provides the necessary background for the NLRED design in Section 3. In Section 4, NLRED performance is studied using simulations. Finally, we conclude the paper in Section 5.

2. Related work

RED [8] was mainly designed to overcome the two problems associated with drop-tail routers, namely, global synchronization and bias against bursty sources. Unlike the drop-tail mechanism, RED measures congestion by the average queue size and drops packets randomly before the router queue overflows. When a packet arrives at a router, the average queue size, denoted avg, is updated using the following exponentially weighted moving average (EWMA) function:

$$avg = (1 - \omega_q)\, avg' + \omega_q\, q,$$

where avg' is the calculated average queue size when the last packet arrived, q is the instantaneous queue size, and $\omega_q$ is the pre-determined weighting factor with a value between 0 and 1.

As avg varies from a minimum threshold $min_{th}$ to a maximum threshold $max_{th}$, the packet dropping probability $p_d$ increases linearly from 0 to a maximum packet dropping probability $max_p$, or

$$
p_d = \begin{cases}
0, & avg < min_{th},\\
\dfrac{avg - min_{th}}{max_{th} - min_{th}}\, max_p, & min_{th} \le avg < max_{th},\\
1, & max_{th} \le avg.
\end{cases} \tag{1}
$$

May et al. [13] modeled the throughput of RED and they showed that under heavy load the throughput is inversely proportional to the load. This problem can be solved by selecting appropriate parameters for RED [11]. Besides, queuing delay and jitter have also been studied. It is found that delay jitter in RED is very sensitive to the value of the weighting factor $\omega_q$. Small $\omega_q$ results in large delay variance, which is not suitable for real-time applications [22].

Zheng and Atiquzzaman [14] believed that the most important problem of RED is low throughput. They found that just modifying the parameters of RED for throughput enhancement is not sufficient; so they proposed a scheme that uses a packet dropping function consisting of two linear segments instead of a single one. This change is in fact very similar to that of gentle RED (GRED) by Floyd [26]. In the original RED, the packet dropping probability is set to 1 when $max_{th} \le avg$. This was later considered to be too aggressive and thus a gentle version of RED (or GRED) was proposed, which is implemented in ns-2 [27]. In GRED, the packet dropping probability increases linearly from $max_p$ to 1 when avg increases from $max_{th}$ to $2\,max_{th}$. The resulting packet dropping function is shown in Fig. 1.

Fig. 1. Dropping functions for RED and GRED.

In [18], Feng et al. showed that the effectiveness of dropping packets to control source rate is affected by the number of sources. So there is no single set of RED parameters that can work well under different congestion scenarios. Therefore, they proposed the adaptive RED (ARED) scheme, which adapts the $max_p$ value based on the traffic load.

In [17], Floyd acknowledged that the average queue size of the original RED is sensitive to the level of congestion as well as the RED parameter settings. She then introduced a new adaptive RED scheme, which is partially based on ARED. In this scheme, the variance of average queue size is used to adjust $max_p$. It also differs from ARED in that the average queue size is steered to a target around $(max_{th} + min_{th})/2$. Besides, instead of using a constant weighting factor $\omega_q$, $\omega_q$ is given by $\omega_q = 1 - e^{-1/L}$, where L is the capacity of the bottleneck link under consideration.

In [12], Teunis J. Ott et al. observed wild oscillations in instantaneous queue size when the traffic load changes. The traffic load is measured by the number of active flows carried by a link. Therefore, they designed a mechanism to estimate the number of active flows in a bottleneck link. Based on the estimated number of flows, an adaptive AQM scheme was proposed, aiming at operating the router at a fixed target queue size.

In [16], by arguing that the average queue size is a performance measure instead of a congestion measure, an approach known as random exponential marking (REM) was developed and analyzed. It was shown that REM is able to achieve high link utilization, negligible packet loss, and short queuing delay in a simple and scalable manner. In REM, a state variable r is maintained via

$$r(k+1) = r(k) + \gamma\left[\alpha\,(q(k) - b) + \lambda(k) - C\,T\right], \tag{2}$$

where b is the desired queue occupancy, $\lambda(k)$ is the arrival rate during the kth sample, T is the sample period, C is the link capacity, and $\gamma$ and $\alpha$ are constants. The mapping from r to the packet drop probability is

$$p_{rem}(k) = 1 - \phi^{-r(k)}, \tag{3}$$

where $\phi$ is a constant parameter.
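The exponential mapping in (3) can be illustrated with a short sketch. It covers only the mapping from the state variable to a probability, not the state update. The function name, the default value of phi, and the requirements that phi exceeds 1 and r is non-negative are assumptions made here so that the result stays within [0, 1); they are not taken from the paper.

```python
def rem_mark_probability(r: float, phi: float = 1.001) -> float:
    """Map a REM state variable r to a probability via p = 1 - phi**(-r).

    Assumptions (not from the paper): phi > 1 and r >= 0, so that
    the returned value lies in [0, 1).
    """
    if phi <= 1.0:
        raise ValueError("phi is assumed to be greater than 1")
    if r < 0.0:
        raise ValueError("r is assumed to be non-negative")
    return 1.0 - phi ** (-r)


# Illustrative values only: the probability grows with r and saturates below 1.
for r in (0.0, 100.0, 1000.0, 10000.0):
    print(r, rem_mark_probability(r))
```

Under these assumptions the probability is zero at r = 0 and approaches 1 only as r grows large, which is what makes the marking exponential rather than linear in the state variable.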

In [20], AVQ takes a rather different approach to designing an AQM scheme. AVQ seeks to keep the queue empty. To this end, a virtual link speed is introduced, which is kept no greater than the actual link speed. Accordingly, a virtual queue is maintained along with the virtual link speed. As packets arrive, they are placed in the real queue while a token is placed in the virtual queue. The real packets are served at the link speed, while the tokens are served at the virtual link speed. The service time of a token is the size of the corresponding packet divided by the virtual link speed. If a token finds the virtual queue full, then the real packet and token are dropped. The virtual capacity at each link is then adapted to ensure that the total flow entering each link achieves a desired utilization of the link.

Modeling the dynamics of TCP and AQM is another important research area. In particular, Misra et al. modeled the dynamics of TCP and AQM with stochastic differential equations in [15]. They showed that it is the combination of link bandwidth, average packet size, sampling interval, and traffic load that makes RED stable. Partly based on those results, in [19], Hollot et al. used control theory to investigate AQM as a classical feedback system and introduced the PI controller to improve RED's performance. Recently, Gao and Hou [23] argued that the model of TCP and AQM dynamics in the previous work does not consider the real implementation of TCP; so they proposed a state feedback controller (SFC).

To summarize, we note that the schemes in [12,15-17,20,21,23] tend to be more complicated than the original RED. They all adopted a common underlying technique to steer a router to operate at a fixed target queue size independent of traffic load. We have some concerns about the suitability of this approach. Suppose the target queue size is set too small and the offered load is high (so the number of flows is large). Each active flow's congestion window size will then be very small due to resource sharing. As a result, when a flow has fewer than three packets acknowledged in a single window, the flow has to wait for a retransmission timeout, and the throughput is low. On the other hand, if the target queue size is set too large and the offered load is light, packets will experience a long queuing delay without any gain in throughput. If the target queue size is allowed to vary with the traffic load, we observe the following advantages:

• The potential problems with the fixed target queue size can be removed.

• A fairer congestion control can be realized.

To illustrate the second point, we can express the throughput of a TCP connection as

$$R = \frac{w \times MSS}{RTT}, \tag{4}$$

where w is the sender window size, MSS is the segment size, and RTT is the round-trip time (which increases with the queue size). From (4), increasing RTT reduces the throughput, and thus the sending rate. Allowing the router queue size to build up is therefore another way, probably with a finer granularity of control, to slow down all senders passing through this router when congestion occurs. This tends to be fairer than just punishing a few unlucky senders by random packet dropping.
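As a rough numerical illustration of (4), the sketch below evaluates R for a fixed window while the RTT grows, as it would when the router queue builds up. The function name and the sample values (a window of 10 segments, a 1000-byte MSS, RTTs of 100 ms and 200 ms) are assumptions chosen for illustration and do not come from the paper.

```python
def tcp_throughput(window_segments: float, mss_bytes: float, rtt_seconds: float) -> float:
    """Throughput R = w * MSS / RTT from (4), in bytes per second."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return window_segments * mss_bytes / rtt_seconds


# Hypothetical values: the same window sent over a short and a long RTT.
short_rtt = tcp_throughput(10, 1000, 0.1)
long_rtt = tcp_throughput(10, 1000, 0.2)
print(f"RTT 100 ms: {short_rtt:.0f} B/s, RTT 200 ms: {long_rtt:.0f} B/s")
```

Because R is inversely proportional to RTT, doubling the RTT halves the rate of every flow crossing the router, without any single flow having to lose a packet.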

3. Nonlinear random early detection

3.1. NLRED Algorithm

The throughput performance of RED is not stable. For example, when the traffic load is very light and the RED parameters are aggressively set, or when the traffic load is very heavy and the parameters are gently set, the throughput is low. It has been shown that no single set of parameters for RED can achieve a stable performance under different traffic loads. We believe such instability is due, at least in part, to the linear packet dropping function adopted by RED, which tends to be too aggressive at light load, and not aggressive enough when the average queue size approaches the maximum threshold $max_{th}$. We also believe that the performance improvement of some previous work is at least partly due to the employment of a nonlinear dropping function (e.g. [14,16]), either intentionally or unintentionally. (More reasons are provided later.)

However, we notice that these improvements may not be suitable for core routers, as their nonlinear dropping functions greatly complicate the basic mechanism of RED. In this paper, we propose to replace the linear packet dropping function by a judiciously designed quadratic function. The resulting scheme is called nonlinear RED, or NLRED. The pseudocode of NLRED is summarized in Fig. 2.

When avg exceeds the minimum threshold, NLRED uses the nonlinear quadratic function shown in (5) to drop packets, where $max'_p$ represents the maximum packet dropping probability of NLRED. Fig. 3 compares the packet dropping functions for RED and NLRED. (The choice of a quadratic function is further explained in the next subsection.)

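$$
p'_d = \begin{cases}
0, & avg \le min_{th},\\
\left(\dfrac{avg - min_{th}}{max_{th} - min_{th}}\right)^{2} max'_p, & min_{th} < avg \le max_{th},\\
1, & max_{th} < avg.
\end{cases} \tag{5}
$$

Comparing (5) to the dropping function of the original RED in (1), if the same value of $max_p$ is used, NLRED will be gentler than RED for all traffic loads. This is because the packet dropping probability of NLRED will always be smaller than that of RED. In order to give the two schemes comparable total packet dropping probabilities, we set $max'_p = 1.5\,max_p$, such that the areas covered by both dropping functions from $min_{th}$ to $max_{th}$ are the same, or

$$\int_{min_{th}}^{max_{th}} p_d \; d\,avg = \int_{min_{th}}^{max_{th}} p'_d \; d\,avg.$$

Indeed, the linear function integrates to $max_p\,(max_{th} - min_{th})/2$ and the quadratic one to $max'_p\,(max_{th} - min_{th})/3$, which are equal when $max'_p = 1.5\,max_p$.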
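The equal-area choice can be checked numerically. The sketch below codes the RED dropping function in (1) and the NLRED dropping function in (5), then integrates both between the two thresholds with a simple midpoint rule. The function names and the sample thresholds (5 and 15 packets, a RED maximum probability of 0.1) are illustrative assumptions, not values taken from the paper.

```python
def red_drop_prob(avg, min_th, max_th, max_p):
    """Linear RED dropping function, following (1)."""
    if avg < min_th:
        return 0.0
    if avg < max_th:
        return (avg - min_th) / (max_th - min_th) * max_p
    return 1.0


def nlred_drop_prob(avg, min_th, max_th, max_p_prime):
    """Quadratic NLRED dropping function, following (5)."""
    if avg <= min_th:
        return 0.0
    if avg <= max_th:
        x = (avg - min_th) / (max_th - min_th)
        return x * x * max_p_prime
    return 1.0


def area(f, lo, hi, steps=10000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h


# Hypothetical parameter values, chosen only for illustration.
MIN_TH, MAX_TH, MAX_P = 5.0, 15.0, 0.1
red_area = area(lambda a: red_drop_prob(a, MIN_TH, MAX_TH, MAX_P), MIN_TH, MAX_TH)
nlred_area = area(lambda a: nlred_drop_prob(a, MIN_TH, MAX_TH, 1.5 * MAX_P), MIN_TH, MAX_TH)
print(red_area, nlred_area)
```

With $max'_p = 1.5\,max_p$, the quadratic curve lies below the linear one while the normalized average queue size $(avg - min_{th})/(max_{th} - min_{th})$ is under 2/3 and above it beyond that point, which matches the intended behavior: gentler dropping at light load and more aggressive dropping near $max_{th}$.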

3.2. Why use a quadratic function?

Suppose that N TCP flows equally share a link with bandwidth L and experience a random packet loss/drop probability p. It was shown in [18] that p and N have the following relationship:

N MSS a 2ue

le

NLRED for each packet arrival:
    calculate the average queue size avg
    if avg ≤ min_th
        no packet drop
    else if min_th < avg ≤ max_th
        calculate the packet drop probability using (5)
        drop the packet with the calculated probability
    else
        drop the packet

Fig. 2. Pseudocode of NLRED.
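The per-packet procedure above can be sketched as a small runnable class. It implements only the listed steps: the EWMA update of the average queue size from Section 2 followed by the three-way decision based on (5). The class name, method name, constructor parameters, and the injected random number generator are assumptions introduced here for illustration; other details of a complete RED implementation are left out.

```python
import random


class NlredQueue:
    """Minimal sketch of the NLRED drop decision (hypothetical names)."""

    def __init__(self, min_th, max_th, max_p_prime, w_q, rng=None):
        if not 0.0 < w_q <= 1.0:
            raise ValueError("w_q is assumed to lie in (0, 1]")
        if not min_th < max_th:
            raise ValueError("min_th must be smaller than max_th")
        self.min_th = min_th
        self.max_th = max_th
        self.max_p_prime = max_p_prime
        self.w_q = w_q
        self.avg = 0.0  # average queue size, updated on each arrival
        self.rng = rng or random.Random()

    def on_arrival(self, queue_len):
        """Return True if the arriving packet should be dropped."""
        # EWMA update: avg = (1 - w_q) * avg' + w_q * q
        self.avg = (1.0 - self.w_q) * self.avg + self.w_q * queue_len
        if self.avg <= self.min_th:
            return False  # no packet drop
        if self.avg <= self.max_th:
            x = (self.avg - self.min_th) / (self.max_th - self.min_th)
            drop_prob = x * x * self.max_p_prime  # quadratic function (5)
            return self.rng.random() < drop_prob
        return True  # avg above max_th: always drop


# Usage with hypothetical values: a queue that grows by one packet per arrival.
aqm = NlredQueue(min_th=5, max_th=15, max_p_prime=0.15, w_q=0.2, rng=random.Random(1))
drops = sum(aqm.on_arrival(q) for q in range(40))
print(f"dropped {drops} of 40 arrivals, final avg = {aqm.avg:.2f}")
```

Only the line computing the drop probability differs from a linear RED decision in this sketch, which reflects the point that the rest of the original RED remains unchanged.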
