On Evaluating Policy-Based Bandwidth Management Devices
Huan-Yun Wei1 Ying-Dar Lin
Department of Computer and Information Science
National Chiao Tung University, Hsinchu, Taiwan
Tel: +886-3-5712121-ext56667
FAX: +886-3-5721490
Email: {hywei,ydlin}@cis.nctu.edu.tw
Policy-based bandwidth management defines how to allocate bandwidth resources according to
organizational policy rules. Enterprises often employ such policy-based devices at their organizational
edges to manage the narrow but expensive Internet access links. This work designs a novel testbed and
uses it to evaluate the functionality and performance of many such devices, including six commercial
products and one open source solution. Their policy rules can be categorized into (1) class-based rule;
(2) connection rule within a class; (3) bandwidth borrowing rule among classes. The testbed mimics the
real-life Internet with heterogeneous Internet delays/delay jitters/packet loss rates, and evaluates the
effectiveness of policy enforcement of the above three policy types in terms of accuracy, fairness,
stability, robustness, bandwidth borrowing, and voice over IP (VoIP) quality. The test results2 reveal that
(1) explicitly sizing the TCP window could cause performance or fairness degradation even under slight
packet loss rates; (2) the open source solution can compete with commercial products in accurately
limiting flow aggregates; (3) voice quality over IP networks depends significantly on the packet
sizes of all other traffic when a narrowband (125kbps) access link is used.
Keywords: policy-based, bandwidth management, TCP, testbed, emulator
1 Corresponding author
2 All test results are verified by the vendors and are reproducible through our open tools. Nowadays most benchmark reports are financed by vendors and may be biased, without practical testbeds. Guided by this neutral test, readers can obtain in-depth insights when examining bandwidth management devices.
1. Introduction
Internet services provide an economic and convenient system to carry out business, such as
efficient information exchange among branch offices, or efficient customer/provider access to the
services. However, the importance of the services varies, and enterprises often fail to effectively utilize
the narrow but expensive WAN link bandwidth. For instance, the bandwidth required by ERP
(Enterprise Resource Planning), voice over IP (VoIP), and e-business may be occupied by less-important
applications such as FTP. Since end-to-end Internet QoS such as DiffServ [1] is still under experiment,
enterprises seek to at least manage their inbound and outbound links. Thus, policy-based bandwidth
management devices are employed at organizational edges to set and enforce organizational policies for
pursuing the utmost benefits.
Network administrators define policy rules to achieve resource management objectives for the
enterprise. Each policy rule contains “condition” and “action” fields to define specific actions for
specific conditions. The condition defines the packet-matching criteria, such as a certain subnet, application,
or protocol. The action defines the bandwidth parameters, such as “at least 100kbps” or “at most 200kbps”.
Each policy rule is therefore class-based: it groups a set of traffic flows into a per-class queue according to
the specified packet filter (condition), and then schedules that class of traffic out at its corresponding
specified bandwidth (action). Moreover, the class-based rules can be further configured with bandwidth
borrowing among the classes to dynamically utilize available bandwidth effectively. Additionally, each
connection within a class can be guaranteed to have at least a certain amount of bandwidth. Throughout
this work we evaluate the effectiveness of various policy enforcements for the above three policy types:
(1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule among classes.
The following subsections review traditional and prevalent technologies to enforce these policy rules.
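As an illustration of the condition/action structure of a policy rule, the following Python sketch models a class-based rule and a first-match classifier. The field names (src_subnet, dst_port, min_kbps, max_kbps), the rule names, and the example subnet are assumptions made for this sketch; they are not the configuration syntax of any device evaluated here.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class PolicyRule:
    """One class-based rule: a packet filter (condition) plus bandwidth bounds (action)."""
    name: str
    src_subnet: str                 # condition: source subnet in CIDR form
    protocol: Optional[str] = None  # condition: "tcp", "udp", or None for any
    dst_port: Optional[int] = None  # condition: destination port, or None for any
    min_kbps: int = 0               # action: "at least" guarantee
    max_kbps: Optional[int] = None  # action: "at most" limit, None for no limit

    def matches(self, src_ip: str, protocol: str, dst_port: int) -> bool:
        if ip_address(src_ip) not in ip_network(self.src_subnet):
            return False
        if self.protocol is not None and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


def classify(rules, src_ip, protocol, dst_port):
    """Return the first rule whose condition matches the packet, or None."""
    for rule in rules:
        if rule.matches(src_ip, protocol, dst_port):
            return rule
    return None


# Hypothetical rule set: guarantee voice at least 100kbps, cap FTP at 200kbps.
RULES = [
    PolicyRule("voip", "10.1.0.0/16", protocol="udp", min_kbps=100),
    PolicyRule("ftp", "10.1.0.0/16", protocol="tcp", dst_port=21, max_kbps=200),
]
```

A packet that matches no rule returns None here; a real device would typically place such traffic in a default class.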
Traditional Technology—Queuing
A straightforward method for bandwidth management is to queue less-important traffic and pass
important traffic as soon as possible. Queuing can be roughly categorized into (1) priority-based queuing
and (2) rate-based queuing. Priority-based queuing sets the priority among the classes and the highest
priority class is scheduled out first. This is suitable for short-lived, extremely important, or transaction-
oriented flows. However, priority-based queuing cannot quantitatively guarantee/limit the bandwidth for
a class. As an analogy, if everyone is a VIP, then no one is really a VIP. In contrast, rate-based queuing
employs various packet scheduling algorithms [2] that decide which class supplies the next
packet for transmission. This can effectively limit senders who are trying to overburden the resource.
Besides, the minimum bandwidth for important applications can be quantitatively guaranteed. Floyd and
Jacobson [3] further investigate the bandwidth borrowing among the classes. Queuing has different
impacts upon UDP and TCP data flows. Next we briefly review UDP and TCP protocols.
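To make rate-based queuing concrete, the sketch below implements deficit round robin, one well-known rate-based packet scheduler. It is chosen here purely for illustration; nothing in this paper states that any evaluated device uses it, and the class names and quantum values are hypothetical.

```python
from collections import deque


def drr_schedule(queues, quanta, rounds):
    """Deficit round robin.

    queues: dict class name -> deque of packet sizes in bytes
    quanta: dict class name -> bytes credited to that class per round
    Returns the list of (class name, packet size) in transmission order.
    """
    deficit = {name: 0 for name in queues}
    sent = []
    for _ in range(rounds):
        for name, queue in queues.items():
            if not queue:
                deficit[name] = 0      # idle classes do not bank credit
                continue
            deficit[name] += quanta[name]
            # Send packets while the head packet fits in the accumulated credit.
            while queue and queue[0] <= deficit[name]:
                size = queue.popleft()
                deficit[name] -= size
                sent.append((name, size))
    return sent
```

With a quantum of 1000 bytes for one class and 500 bytes for another, and both classes backlogged with 500-byte packets, the first class sends two packets per round and the second sends one, so the bandwidth ratio follows the quanta rather than a strict priority order.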
Queuing the Internet Traffic: TCP vs. UDP
The majority of software applications today use TCP (Transmission Control Protocol) for data
transmission because TCP can establish a reliable end-to-end connection. TCP receivers acknowledge
the successful reception of each data packet by returning an Ack to their TCP senders. Ack packets
thus trigger senders to send out new data packets. Unacknowledged data packets are retransmitted to
guarantee reliability of data transfers. TCP also incorporates flow control mechanisms that prevent a
sender from overburdening the network capacity or overflowing its receiver’s buffer. Each TCP
sender therefore keeps two window values, the congestion window (CWND) and the receiver advertised
window (RWND), which reflect the network capacity (congestion control) and the receiver's capability of
receiving the data, respectively. A TCP sender never has more unacknowledged data outstanding than
min(CWND, RWND). RWND is advertised by the receiver in TCP Ack packets and ranges widely
among operating systems. CWND increases exponentially during the slow-start phase and
linearly during the congestion avoidance phase, probing available bandwidth until packet losses occur.
Loss behavior differs among TCP versions, mainly in how the CWND is shrunk and raised and in how the
lost segments are retransmitted. Fall and Floyd [4] give a good overview of the
Tahoe, Reno, NewReno, and SACK TCP versions and their problems. Padhye and Floyd [5] further investigate the TCP
version distribution among 4550 Web servers. Unlike TCP, UDP (User Datagram Protocol) lacks
connection establishment, reliable data transfer, and flow control. UDP only provides port number
multiplexing and is commonly used by real-time applications such as video conferencing and Voice over
IP (VoIP).
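The interaction of the two windows can be sketched with a small loss-free simulation. This is a deliberately simplified model (one update per round-trip, windows counted in whole segments, slow start doubling until a threshold named ssthresh); the function name and parameter values are assumptions for illustration only.

```python
def simulate_window(rwnd, ssthresh, rounds, mss=1):
    """Return the usable send window (in segments) for each round-trip, assuming no loss."""
    cwnd = mss
    windows = []
    for _ in range(rounds):
        # The sender may have at most min(CWND, RWND) unacknowledged data outstanding.
        windows.append(min(cwnd, rwnd))
        if cwnd < ssthresh:
            cwnd *= 2        # slow start: exponential growth per round-trip
        else:
            cwnd += mss      # congestion avoidance: linear growth per round-trip
    return windows


print(simulate_window(rwnd=12, ssthresh=8, rounds=10))
# [1, 2, 4, 8, 9, 10, 11, 12, 12, 12]
```

Once CWND exceeds RWND, the advertised window becomes the binding limit, which is the property that window-sizing devices exploit.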
Queuing has different impacts upon UDP and TCP flows. As for real-time UDP traffic, the bit rate
is often fixed and the video/voice quality heavily depends on the loss rate, delay, and delay jitter. The
packet scheduler must precisely allocate enough bandwidth for real-time UDP traffic to minimize packet
losses and delay at the controlling device. Moreover, the packets of real-time traffic need to be
scheduled out smoothly at even intervals to minimize the delay jitter. As for TCP traffic, TCP
flows competing for the same queue can cause a large number of data packets to be queued in the device,
resulting in high buffer requirements and large packet latency at the device. Moreover, the TCP flows
may not fairly share the class bandwidth, especially when their round-trip times (RTT) are different.
Thus many vendors apply specific algorithms for regulating TCP traffic.
Specific Algorithms for TCP Traffic
To guarantee each TCP connection bandwidth within a class, and hence achieve fairness among
the flows within a class, the ideal solution is to actively control the sending rate of each sender within
the class instead of letting them compete with each other. Queuing, and with it the queuing delay and buffer
requirement, can thus be reduced. Other types of traffic such as UDP can only resort to the primitive solution,
queuing, to passively control their bandwidth. Two methods exist for controlling each TCP connection: (1)
window-sizing and (2) packet-dropping.
1. Window Sizing: Since a TCP connection can be actively controlled through the feedback
Acks, the window-sizing method directly influences the number of bytes sent by shrinking the
RWND in the TCP Acks. In this test, iPolicer, PacketShaper, WiseWAN, QoSWorks and Guardian
Pro belong to this type. Karandikar et al. [6], sponsored by Packeteer, investigate the window-sizing
technique. Though window-sizing can directly control per-connection bandwidth, it needs to
readjust its Ack regulations whenever another connection enters or leaves the class.
2. Packet Dropping: Because a TCP sender slows down its transmission rate in response to
network congestion by halving its congestion window size, the packet-dropping method drops
packets and expects the sender to slow down when it detects the packet loss events
[7]. In this test, FloodGate (uses per-flow queuing) and ALTQ_CBQ+RED belong to this type.
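The idea behind window sizing can be sketched with the standard relation that a window-limited TCP flow sends roughly one window per round-trip, so throughput is about window / RTT. The helper below inverts that relation to find the advertised window that would hold a flow near a target rate. It is an assumption-laden illustration, not the algorithm of any product tested here; the function name and the 1460-byte segment size are hypothetical.

```python
import math


def rwnd_for_rate(target_kbps, rtt_ms, mss_bytes=1460):
    """Return (window in bytes, window in whole segments) that caps a flow near target_kbps."""
    # bytes per second times the round-trip time in seconds
    window_bytes = target_kbps * 1000 / 8 * rtt_ms / 1000
    segments = max(1, math.ceil(window_bytes / mss_bytes))
    return window_bytes, segments


print(rwnd_for_rate(200, 50))    # (1250.0, 1)
print(rwnd_for_rate(1100, 150))  # (20625.0, 15)
```

Because the required window depends on the round-trip time, a device that rewrites RWND has to track each connection's RTT and recompute the window whenever the per-connection share changes.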
This work designs a novel testbed for evaluating the effectiveness of various policy enforcement
techniques used by existing products or solutions. The testbed mimics the real-life Internet
characteristics such as WAN delay, delay jitter, and packet loss. Section 2 compares the relevant
information of the devices under test (DUT). Section 3 then describes the design of our testbed and the
test methodology. Section 4 demonstrates the test results. Finally, a summary of the test results and
conclusions are given in Section 5.
2. Device under Test (DUT)
This test project invited nine vendors, and six of them joined the test. Table 1 compares the
relevant information of all the DUTs. Most DUTs are installed at the LAN-router link to prevent router
queues from overflowing and causing congestion. Because the grade of each DUT differs, only low
bandwidth configurations (below 1.544Mbps) are tested. This minimizes hardware differences so that test
results can reflect the true management capability of each DUT.
Vendor/Model | Grade (Announced) | S/W Ver. | OS, HW/SW | Install at | Boot from | CPU | RAM | Interface | Fail Over | Log to
ALTQ 2.2 [8] | 100Mbps | 2.2 | FreeBSD, Software | Between LAN and Router | (see below) | (see below) | (see below) | (see below) | N | Same FreeBSD
NetGuard’s Guardian Pro [9] | 10Mbps | 5.02 | NT 4.0, Software | Between LAN and Router | (see below) | (see below) | (see below) | (see below) | N | Same NT server
CheckPoint’s FloodGate [10] | 45Mbps | 4.1 | NT 4.0, Software | Between LAN and Router | (see below) | (see below) | (see below) | (see below) | HA* | Same NT server
BroadWeb/Acute’s iPolicer 100-CR2202 [11][12] | 100Mbps | 1.6.4 | Embedded NT, Hardware | Between LAN and Router | Flash 32M | PIII 600 | 128M | 10/100Mbps | N | Another NT server
Packeteer’s PacketShaper 4500 [13] | 45Mbps | 4.1.2 | Embedded Linux, Hardware | Between LAN and Router | Flash | PIII 600 | 128M | 10/100Mbps | Y | Embedded Hard Disk
Sitara’s QoSWorks QWX-10000 [14] | 100Mbps | 1.8 | Embedded FreeBSD, Hardware | Between LAN and Router | Hard Disk | PIII 600 | 192M | 10/100Mbps | Y | Embedded Hard Disk
NetReality’s WiseWan 200/500 [15] | 5Mbps | 4.0 | Proprietary, Hardware | WAN link | Flash 32M | P 133 | 32M | V.35 (10Mbps log) | Y | Another NT server
Hardware for the software DUTs: our PIII 700MHz PC with 256M SDRAM, 2 Intel 100M NICs installed, booting from a hard disk.
Note 1: Invited vendors also include Lucent’s Access Point and Allot’s NetEnforcer (these two decided not to join this test after examining our test plan) and
Cisco’s Cisco Assure (which declined to join from the beginning).
Note 2: Fail Over is defined as the capability of bypassing traffic when the power is off. HA means high availability module (optional).
Note 3: Sitara revealed to us that QoSWorks uses ALTQ_CBQ.
Table 1: Product information and software/hardware platforms
2.1 Functionality of Policy Console
Network administrators use the policy console to define organizational bandwidth policy rules. Table 2
lists the functionality of each policy console. All DUTs can limit the bandwidth of a class. Moreover,
most DUTs can guarantee the minimum bandwidth of each connection within the class, except for
Guardian Pro and ALTQ. These two settings can be further complemented by (a) inter-class bandwidth borrowing
and (b) intra-class bandwidth borrowing, respectively. In (a) the DUTs can redistribute any available
bandwidth unused by some classes to other active classes; in (b) if any flow in a class terminates, its
bandwidth will be fairly redistributed to other flows.
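Inter-class borrowing can be sketched as a redistribution of unused class bandwidth. The function below hands spare bandwidth to classes with unmet demand in proportion to their configured shares. Proportional sharing is an assumption made for this sketch; the devices tested here apply their own policies, and the function name and example values are hypothetical.

```python
def share_with_borrowing(configured, demand):
    """configured, demand: dict class name -> kbps. Returns dict class name -> granted kbps."""
    # Each class first receives its demand, capped at its configured bandwidth.
    granted = {c: min(configured[c], demand.get(c, 0)) for c in configured}
    spare = sum(configured.values()) - sum(granted.values())
    while spare > 1e-9:
        hungry = [c for c in configured if demand.get(c, 0) - granted[c] > 1e-9]
        weight = sum(configured[c] for c in hungry)
        if weight <= 0:
            break                      # nobody can use the spare bandwidth
        handed_out = 0.0
        for c in hungry:
            # Offer spare bandwidth in proportion to the configured share.
            take = min(spare * configured[c] / weight, demand[c] - granted[c])
            granted[c] += take
            handed_out += take
        spare -= handed_out
    return granted


# Two equal classes; class A is idle, so class B may borrow A's share.
print(share_with_borrowing({"A": 777, "B": 777}, {"A": 0, "B": 2000}))
```

Intra-class borrowing follows the same pattern with connections in place of classes.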
Vendor/Model | Classifier: Src/Dst IP/Port#, mask, Prot. ID | Classifier: Host list | Direction (In/Out) | UDP traffic control | WAN Link Speed Setup | Class limit | Guarantee BW for each connection in the class | Inter-class borrowing | Intra-class borrowing
ALTQ | Y | N | Both | Y | Y | Y | N | Auto | Compete (Note 2)
NetGuard’s Guardian Pro | Y | Y | Both | Y | Y | Y | N | Degree (Note 1) | Compete
CheckPoint’s FloodGate | Y | Y | Both | Y | Y | Y | Y | Degree | Degree
NetReality’s WiseWan | Y | Y | Both | Y | Y | Y | Y | Auto | Auto
Acute/Broadweb’s iPolicer | Y | Y | Both | N | N | Y | Y | N | N
Packeteer’s PacketShaper | Y | Y | Both | Y | Y | Y | Y | Degree | Degree
Sitara’s QoSWorks | Y | N | Both | Y | Y | Y | Y | Auto | Auto
Note 1: Degree means that administrators can manually specify the degree of bandwidth borrowing.
Note 2: DUTs without connection guarantee let the flows within the class compete with each other.
Table 2: Functionality Comparison of the Devices under Test
2.3 Protocol Support
Table 3 compares the protocol support of each DUT. Most Internet services/protocols can be
recognized by layer-4 TCP/UDP port numbers. However, layer-7 awareness can increase the simplicity
and capability of bandwidth management. For example, the FTP protocol includes a passive mode, in
which the FTP-data port (port 20, for sending data) can be dynamically changed to another port by negotiation over
the FTP-Cmd port (port 21, for sending FTP commands). If the DUT cannot interpret the negotiation
on the FTP-Cmd port, it cannot control the connection that actually carries the data.
PacketShaper and WiseWAN have the richest layer-7 awareness. In terms of quantity of port-service
mapping entries, WiseWAN and PacketShaper are the richest. The next richest are FloodGate and
Guardian Pro. iPolicer, QoSWorks, and ALTQ have few or no built-in port-service mapping entries and
require manual lookups in the port-service mapping table. Although iPolicer can identify UDP, it cannot
control its bandwidth.
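The FTP example shows what layer-7 awareness requires in practice. In passive mode the server announces the data endpoint on the command connection in a 227 reply of the form (h1,h2,h3,h4,p1,p2), where the port is p1*256+p2. A minimal sketch of extracting that endpoint follows; the function name is hypothetical and the parser handles only this one reply type.

```python
import re

# 227 reply carrying six comma-separated numbers: four address bytes, two port bytes.
PASV_REPLY = re.compile(r"^227 .*?(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def passive_data_endpoint(control_line):
    """Return (ip, port) announced in a 227 reply, or None for any other line."""
    match = PASV_REPLY.match(control_line)
    if match is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2


print(passive_data_endpoint("227 Entering Passive Mode (10,1,1,5,19,137)"))
# ('10.1.1.5', 5001)
```

A device that tracks this reply can bind the announced port to the FTP class; a device limited to layer-4 port numbers sees only an unrelated high-numbered connection.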
Vendor/Model | Layer | Layer-7 TYPE | Built-in port-service mappings: TCP | Built-in port-service mappings: UDP | ICMP | IPX | # of other protocols
ALTQ | 4 | N | 0 (Manually assign port #) | 0 (Manually assign port #) | N | N | Manually assign port #
NetGuard’s Guardian Pro | 4 | N | 60 | 35 | Y | N | 15
CheckPoint’s FloodGate | 7 | URL/MIME-TYPE | 60 | 35 | Y | N | Manually assign port #
NetReality’s WiseWan | 7 | URL/MIME-TYPE | 109 | 79 | Y | Y | Above 250
Acute/BroadWeb’s iPolicer | 4 | N | 12 | Cannot control | Y | N | Manually assign port #
Packeteer’s PacketShaper | 7 | URL | Total above 200 (layer 2~7) | Total above 200 (layer 2~7) | Y | Y | Above 200
Sitara’s QoSWorks | 4 | N | 0 (Manually assign port #) | 0 (Manually assign port #) | N | N | Manually assign port #
*Note: This table lists only the protocols that each DUT can control, not those it can merely recognize.
Table 3: Comparison of Protocols Support
Appendix A-1 and A-2 further compare the policy console user interface and special functions of
the DUTs. Most DUTs mix priority-based and rate-based queuing; however, this test focuses on “rate-based
policy” that controls “TCP connections flowing from enterprises (LAN) to WAN” since TCP
traffic accounts for most of the Internet traffic. As for UDP traffic, this test focuses on real-time applications
such as Voice over IP (VoIP). Differences between configured bandwidth and measured results will be
quantified.
3. Testbed and Test Methodology
Testbed and test methodology significantly influence test results and require careful examination to
avoid misinterpretation of the results.
3.1 Testbed: Mimics the Real-Life Internet
The Internet is very dynamic. Different connections take different paths and therefore have different
distances and path qualities. Our testbed mimics the above properties by setting the WAN delay, WAN delay
jitter, and WAN packet loss rate of each routing path. Figure 1 and Table 6 show complete information
about our testbed and testing tools. Testing data flows are from X to Y, passing through the DUT,
routers, monitoring point, and WAN emulator. The Cisco routers are installed specifically for WiseWAN
because of its V.35 interface. Each DUT is individually tested on this testbed. Appendix B displays our
testbed photo. IP-aliasing employed at A and I in Fig.1 emulates multiple competing sources and their
corresponding sinks, respectively. A self-written wan-emu virtual interface driver is used to emulate the
dynamics of the Internet. Both are detailed as follows:
Figure 1: The Testbed: Mimic the Real-life Internet
Note: All PCs are equipped with Intel Express Pro 10/100Mbps network interface cards. The V.35 serial clock rate between Cisco routers is set to 2Mbps. Each DUT is individually tested on this testbed.
Tool | Function | Description | Position in Fig. 1
Ncftpput [16] | TCP traffic generator | Traffic: 20 ncftpput flows from subnet X to subnet Y. Packet size: 1,500 bytes. TCP options: SACK/timestamp/window-scaling disabled. | A
SmartVoIpQoS [17] | VoIP (UDP) traffic generator | Traffic: Single VoIP flow with RTP format UDP packets. Codec: G.729 (50 frame/sec, frame size=74 byte, around 30kbps) | M
VoIP Gateway | Same as above | Same as above | K and N
ttt [18] | Real-time traffic bandwidth monitor | Monitor the bandwidth of the traffic passing through it by protocols, source/destination IP, etc. | G
Tcpdump [19] | Packet sniffer | Dump each packet’s header to the RAM disk to avoid I/O overheads. | A and H
Self-written AWK scripts [20] | Data Analyzer | Calculating statistics from the tcpdump result. | G
Self-written wan emulator [20] | WAN Emulator | To have different delays, delay jitters, and random/periodic packet loss rates impairments on different flows. | H
Table 6: Testing Tools
1. IP-aliasing3: In Linux each network interface card (NIC) can emulate 100 NICs, with each virtual NIC
having a unique IP address. With proper routing table setup at A in Fig.1, we can direct flows
destined to a certain virtual NIC at I through a certain virtual NIC at A. Virtual NICs generate packets with
their corresponding IP addresses, so the DUT perceives outgoing TCP data packets as coming from
different local hosts and incoming TCP Acks as coming from different remote hosts. Moreover, packets are
sent without link-layer collisions since only a single physical NIC is present at each of A and I.
2. wan-emu: Wan-emu is a Linux virtual interface driver that resides between the IP layer and the NIC
driver. In this testbed, multiple wan-emu virtual devices are attached to the sink-side last-hop NIC
driver (at H with IP 10.1.1.254) to impose different impairments on different routes. With proper static
routes, we can direct flows destined to a virtual NIC at I through a specific wan-emu interface that has
the desired link characteristics. Each packet passing through is stamped with the time at which
it is due to be released. An interrupt is triggered every 1ms to examine how many packets are due and
should be forwarded. The timer granularity can be easily tuned to 8192 Hz in Linux. Impairments such
as the random/periodic loss rate and delay jitter are also implemented.
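The release-time mechanism of wan-emu can be sketched in user space as a delay line that is polled on every timer tick. The sketch below is not the driver's code: the class name, the heap-based queue, and the every-Nth-packet model of periodic loss are assumptions made for illustration.

```python
import heapq
import itertools


class DelayLine:
    """Hold packets until their release time, checked on a periodic tick."""

    def __init__(self, delay_ms, loss_period=0):
        self.delay_ms = delay_ms
        self.loss_period = loss_period  # drop every Nth packet; 0 disables loss
        self._queue = []                # heap of (release_time_ms, seq, packet)
        self._seq = itertools.count(1)

    def enqueue(self, packet, now_ms):
        """Stamp the packet with its release time; return False if it is dropped."""
        seq = next(self._seq)
        if self.loss_period and seq % self.loss_period == 0:
            return False                # periodic loss
        heapq.heappush(self._queue, (now_ms + self.delay_ms, seq, packet))
        return True

    def tick(self, now_ms):
        """Called once per timer interrupt; return the packets that are due."""
        due = []
        while self._queue and self._queue[0][0] <= now_ms:
            due.append(heapq.heappop(self._queue)[2])
        return due
```

One such object per route, each with its own delay and loss period, mirrors the way multiple wan-emu interfaces give different flows different impairments.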
3.2 Test Methodology
This test includes three sub-tests: Basic Test, Robustness Test, and Advanced Test.
3 Note that some operating systems merely support alias IP addresses, but cannot support alias interfaces, such as FreeBSD and Windows 2000.
A. Basic Test
This test evaluates the accuracy of the class bandwidth and the fairness among the connections
within the class. This test also investigates the stability of each DUT across its five runs.
The total WAN link bandwidth is set to T1 (1.544Mbps)4 and is partitioned into five classes (20, 40, 128,
256, and 1100kbps), with each class matching four TCP connections. Each class is set to guarantee that
each connection has 1/4 of the class bandwidth5. All settings are fixed without any bandwidth
borrowing. The test is repeated for five consecutive runs, with 200-second intervals in between. Within each
run, 20 FTP connections flow simultaneously from A to I (Table 6), with each class matching 4
connections. After 250 seconds, all the ncftpput processes are killed. Data from 30 to 230 seconds are
analyzed. The statistics are explained in Table 7. Appendix C uses an intuitive example to illustrate these
statistics.
Statistic | Quantify what? | Definition | Comparison Standard
Accuracy | The differences between: (1) the class bandwidth settings (2) the measured class bandwidth | Averaged normalized goodput* | The closer to 1, the better
Stability of accuracy | The differences of the accuracy statistics among the five runs. | CoV** of normalized goodput among the five runs (Same as above, but take the CoV among the 5 runs instead of the average.) | It depends***.
Fairness | Fairness of bandwidth usage among the 4 connections in each class. | Averaged CoV among 4 connections’ goodputs | The closer to 0, the better
Stability of fairness | Differences of the fairness statistic among the five runs | Same as above, but take the standard deviation among the 5 runs instead of the average. | It depends***.
Retransmission Ratio | Retransmission ratio in each class. | - | The closer to 0, the better
* Goodput is the effective throughput (bytes/time) excluding the bandwidth consumed by retransmission.
** CoV denotes “coefficient of variation,” which means “standard deviation over mean.”
*** If the accuracy tends to 1, it would be better for its stability to be 0. This implies the DUT always performs accurately. However, if the accuracy tends to
0, and its stability also tends to 0, it implies that the DUT always performs inaccurately. This also applies to fairness and its stability (Appendix C).
Table 7: Basic Test Statistics
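The statistics of Table 7 can be sketched for one class as follows. The sketch assumes that normalized goodput means the measured class goodput divided by the configured class bandwidth, and it uses the population standard deviation; both choices, like the function names, are assumptions of this illustration rather than definitions taken from Appendix C.

```python
from statistics import mean, pstdev


def cov(values):
    """Coefficient of variation: standard deviation over mean."""
    return pstdev(values) / mean(values)


def basic_test_stats(class_kbps, runs):
    """runs: one list per run holding the goodput (kbps) of each connection in the class."""
    # Class goodput of each run, normalized by the configured class bandwidth.
    normalized = [sum(run) / class_kbps for run in runs]
    # CoV among the connections of the class, computed separately for each run.
    fairness_per_run = [cov(run) for run in runs]
    return {
        "accuracy": mean(normalized),
        "accuracy_stability": cov(normalized),
        "fairness": mean(fairness_per_run),
        "fairness_stability": pstdev(fairness_per_run),
    }


# A 40kbps class whose four connections each reach 10kbps in all five runs.
print(basic_test_stats(40, [[10, 10, 10, 10]] * 5))
```

For this ideal case the accuracy is 1 and the other three statistics are 0, which matches the "closer to 1" and "closer to 0" comparison standards of the table.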
4 BroadWeb/Acute iPolicer does not have WAN link speed setup.
5 NetGuard Guardian Pro cannot accept per-connection setting.
B. Robustness Test
Packets may be generated by different operating systems, hence different TCP implementations,
and pass through paths with various delays and loss rates. Long-distance TCP connections are expected
to be vulnerable to Internet losses because they require more time to obtain Acks for recovering to their
target bandwidth. Since many DUTs regulate TCP Acks, it is our concern whether they are compatible
with the major operating systems. Table 8 describes our test methodology.
Test Item | DUT Settings | Test Methodology | Comparison standard
Under Heterogeneous Internet Delays | Same as Basic Test. | WAN delays of the four connections in each class are 10ms, 50ms, 100ms, 150ms | Same as the Basic Test
Under Various Internet Loss Rates | 200kbps for the test flow. | A single TCP connection is tested under 0.5%, 1%, 2%, 4% and 8% periodic loss rates. | Whether the goodput can smoothly degrade.
Under Different Sending Operating Systems | 80kbps for the test flow. | (1) WAN: delay=50ms, periodic loss rate=1%. (2) TCP Source OS= {Linux 2.2.14, Windows 2000, FreeBSD 4.0, Solaris8}. (3) TCP Receiver OS= Linux 2.2.14. (4) Each time a single TCP connection is tested. | How closely the byte-time lines of the operating systems can overlap with each other.
Table 8: Robustness Test Methodology
C. Advanced Test
This test includes a bandwidth borrowing test and a VoIP quality test. Bandwidth borrowing has been
described in Section 2. VoIP quality is separately tested through SmartBits and VoIP Gateway to
evaluate whether the DUTs can precisely allocate adequate bandwidth for voice traffic. Each test is
conducted under heavily-loaded FTP traffic. Detailed test methodologies are in Table 9.
Test Item | DUT Settings | Test Methodology | Comparison standard
Inter-class Bandwidth Borrowing | (1) Link speed=T1 (1.544Mbps), divided into 2 classes A, B. A=B=777kbps. (2) Class A matches connection 1, Class B matches connection 2. (3) A and B can borrow with each other. | Connection 1 and 2 are started and stopped in sequence. | (1) Stability of each connection. (2) How seamlessly the total bandwidth line can be when connection 1 terminates.
Intra-class Bandwidth Borrowing | (1) Link speed=T1 (1.544Mbps), divided into 1 class A. A=1.544Mbps. (2) The class matches connection 1 and 2. (3) Per-connection bandwidth: at least 777kbps, at most 1.544Mbps. | Connection 1 and 2 are started and stopped in sequence. | (1) Stability of each connection. (2) How seamlessly the total bandwidth line can be when connection 1 terminates.
VoIP test using SmartVoIpQoS | (1) Link speed={T1,125kbps}, divided into 2 classes A, B. (2) A=30kbps for voice traffic, B={T1,125kbps}-30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins. | Background: 20 FTP connections. Foreground: a 30kbps G.729 VoIP flow. | PSQM (Note 1), jitter, delay and loss.
VoIP test using VoIP Gateway (Cisco 1750) | (1) Link speed={T1,125kbps}, divided into 2 classes A, B. (2) A=30kbps for voice traffic, B={T1,125kbps}-30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins. | Background: 20 FTP connections. Foreground: Dial a phone (JP to NP, G.729 codec), hold X’s and Y’s phones, speak 1 to 10 at 2 word/sec, and judge the voice quality. | Listening with ears (Note 2).
Note 1: PSQM (Perceptual Speech Quality Measurement) is calculated from delay, jitter, and loss statistics. A PSQM rating of 6.5 indicates the poorest quality.
Note 2: The VoIP Gateway is set to continuously sample the sound even when the primary tester keeps silent. Thus the data flow is always around 30 kbps.
Table 9: Advanced Test Methodology
4. Benchmark Test Results
A. Basic Test Results
A-1. Accuracy and Stability of Accuracy
Figure 2 (A1 is accuracy, B1 is its stability, A2 and B2 will be discussed in the robustness test)
reveals that the DUTs can be classified into three groups: ALTQ_CBQ, PacketShaper, and QoSWorks
have the most accurate and stable control for each class; WiseWAN and FloodGate are less effective in
the narrowband class (20kbps) because of their large retransmission ratios, as will be shown in Section A-3;
iPolicer and Guardian Pro are the least effective. With iPolicer, several connections terminate in the
middle of each run. Those connections no longer send data, so they waste bandwidth and cause instability
among the five runs6.
6 Note: The test crew had performed many “five-run” tests on iPolicer. It is only after the above phenomenon has been verified that we include the most general one of the “five-run” tests in our analysis.
Figure 2: Results of accuracy and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-2. Fairness and Stability of Fairness
Figure 3 (A1 is fairness, B1 is its stability, A2 and B2 will be discussed in the robustness test) also
distinguishes three groups: PacketShaper is the most fair and stable; QoSWorks is less fair but is stable
in the 20kbps class, implying that it is consistently less fair in the 20kbps class in all five test runs (Appendix C).
FloodGate and WiseWAN are less fair and less stable in the 20kbps class. iPolicer, Guardian Pro, and
ALTQ_CBQ+RED provide poor fairness. Pure CBQ has the poorest fairness under narrowband
(20~40kbps) classes. However, this is somewhat alleviated after applying RED to each class because RED
tends to drop more packets from the connection that is sending data more aggressively.
Figure 3: Results of fairness and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-3. Retransmission Ratio
Figure 4 A1 (A2 will be discussed in the robustness test) shows large retransmission ratios in
narrowband classes (20~40kbps) for all DUTs except PacketShaper and QoSWorks, and especially for WiseWAN,
iPolicer, FloodGate and ALTQ_CBQ+RED. As an analogy, a small exit often keeps many people
waiting before it. FloodGate and ALTQ_CBQ+RED use “packet dropping” to slow down TCP flows, so
they have high retransmission ratios. WiseWAN suffers enormous packet losses at the Cisco router before
WiseWAN can control the traffic at the WAN link. The results of iPolicer are hard to explain in
terms of the technology it claims to use (adjusting the TCP window size).
Figure 4: Test results of retransmission ratio (A1: No Internet Delay; A2: With Internet Delay)
B. Robustness Test Results
B-1. Under Heterogeneous Internet Delays
To make it easy to compare with the Basic Test, the test results are listed with those of the Basic Test.
Figure 2 (A2, B2), Figure 3 (A2, B2) and Figure 4 (A2) show the respective results. Most results
magnify the differences among the DUTs in the Basic Test, especially for iPolicer and ALTQ_CBQ in
the fairness statistic. Long-distance connections are vulnerable to packet losses due to buffer overflows
at the controlling device, as described in Section 3.2 B. ALTQ_CBQ+RED can alleviate the unfairness
of ALTQ_CBQ because the short-distance connections, which send data more aggressively,
have more packets dropped by the RED mechanism. Guardian Pro cannot guarantee each
connection and thus reveals significant instability between the Basic Test and this test. QoSWorks is less fair
under the broadband class (1.1Mbps).
B-2. Under Various Packet Loss Rates
Normally a TCP flow slows down its transmission rate when packet losses occur. Figure 5 shows
the goodput of each DUT under different Internet packet loss rates (each flow is limited to 200kbps and the
measured goodput is averaged over 200 seconds as in the Basic Test). Almost all the DUTs smoothly
lower their goodputs as the packet loss rate increases, except for PacketShaper and iPolicer. These two
devices give up sizing the TCP window when they detect TCP loss events (triple duplicate
Acks). Thus, the TCP sending window suddenly jumps and causes a burst of packets to flow to the
controlling device, resulting in a higher goodput at the 0.5% loss rate. This phenomenon is alleviated as
the packet loss rate increases.
Figure 5: Robustness Test—goodput under various packet loss rates
B-3. Under Different Sending Operating Systems
In this compatibility test (see Fig.6; the X axis is time and the Y axis is the bytes sent, thus the slope is the
bandwidth), TCP connections sent from different operating systems through PacketShaper
have different results. PacketShaper shrinks the TCP window so that no more than 4
packets are in the WAN pipe. Thus, each packet loss is recovered by a retransmission timeout instead of
fast retransmit [21]. BSD-derived UNIX systems use a coarse-grained retransmission timer
(500ms) [21], so they retransmit the lost packets slowly. In contrast, Linux keeps a fine-grained
retransmission timer and has the best performance when packet losses occur. iPolicer has a serious bug
when sending data from Windows 2000 to Linux 2.2.14. The tcpdump tool revealed that the TCP Ack
header length is miscalculated when passing through iPolicer, which incorrectly triggers data
packets from TCP senders. TCP has many options and various implementations, so explicitly modifying
the packet header requires rigorous compatibility tests. The other products treat TCP flows from
different operating systems fairly.
Figure 6: Robustness test— Under different Sending Operating Systems
C. Advanced Test Results
C-1. Bandwidth Borrowing Test Results
This test uses ttt to observe the effectiveness of bandwidth borrowing. In each figure we only focus
on three lines: the total bandwidth (ip/ether line), the bandwidth of connection 1 (xxxx/tcp line) and the
bandwidth of connection 2 (yyyy/tcp line). The test crew draws another baseline indicating the ideal
total link bandwidth (1.544Mbps) for comparison.
Inter-Class Bandwidth Borrowing Test Results
Figure 7 shows the inter-class bandwidth borrowing benchmark results. iPolicer does not have
this function, so we set the bandwidth of both classes to 1.544Mbps. However, the link between the Cisco
routers is set to 2Mbps, so the two 1.544Mbps flows through iPolicer exceed the baseline
bandwidth. After connection 1 terminates, the total bandwidth narrows down to around 1.5Mbps
with some bandwidth fluctuation. WiseWAN and ALTQ can automatically borrow bandwidth among
classes, and the others can be further configured with the degree of bandwidth borrowing. Guardian
Pro behaves unstably when connection 2 starts to obtain a bandwidth share. ALTQ_CBQ and
ALTQ_CBQ+RED can only borrow a limited bandwidth (from 777kbps to 1.1Mbps). FloodGate,
PacketShaper and QoSWorks can perform inter-class bandwidth borrowing seamlessly.
(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard GuardianPro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED
Figure 7: Inter-class Bandwidth Borrowing Test
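The idea behind inter-class borrowing can be sketched with a small allocation routine. This is only an illustration under stated assumptions: the function name, the two equal 772kbps guarantees, and the policy of lending spare bandwidth in proportion to the guarantees are hypothetical and do not describe the algorithm of any tested device. Guarantees are assumed to be positive.

```python
def share_with_borrowing(link_kbps, guaranteed_kbps, demand_kbps):
    """Allocate link bandwidth to classes, lending idle guarantees to busy classes.

    guaranteed_kbps and demand_kbps map class name -> kbps.
    """
    # Step 1: every class first receives min(demand, guarantee).
    alloc = {c: min(demand_kbps[c], guaranteed_kbps[c]) for c in guaranteed_kbps}
    spare = link_kbps - sum(alloc.values())
    # Step 2: hand spare bandwidth to classes that still want more,
    # in proportion to their guarantees (an assumed borrowing policy).
    hungry = {c for c in alloc if demand_kbps[c] > alloc[c]}
    while spare > 1e-9 and hungry:
        weight = sum(guaranteed_kbps[c] for c in hungry)
        granted = 0.0
        for c in list(hungry):
            offer = spare * guaranteed_kbps[c] / weight
            take = min(offer, demand_kbps[c] - alloc[c])
            alloc[c] += take
            granted += take
            if demand_kbps[c] - alloc[c] <= 1e-9:
                hungry.discard(c)
        spare -= granted
    return alloc


if __name__ == "__main__":
    guarantees = {"class1": 772, "class2": 772}  # hypothetical split of a T1 link
    # Both classes busy: each stays at its guarantee.
    print(share_with_borrowing(1544, guarantees, {"class1": 2000, "class2": 2000}))
    # Connection 1 has terminated: class2 borrows the idle half.
    print(share_with_borrowing(1544, guarantees, {"class1": 0, "class2": 2000}))
```

A device without borrowing corresponds to stopping after step 1, which is why the remaining class cannot exceed its configured rate.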
Intra-Class Bandwidth Borrowing
Figure 8 shows the intra-class bandwidth borrowing benchmark results. iPolicer lacks this
function, so after connection 1 terminates, connection 2 cannot occupy the newly available
bandwidth within the class. Guardian Pro and ALTQ_CBQ show fluctuating bandwidth sharing
between the two connections, since they cannot guarantee per-connection bandwidth. This
phenomenon in ALTQ_CBQ is again slightly alleviated after applying RED. The other four products
behave similarly in this test, except that PacketShaper and FloodGate show small gaps.
(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard Guardian Pro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED
Figure 8: Intra-class Bandwidth Borrowing Test
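The difference between a device with and without intra-class borrowing can be illustrated with a short sketch. It assumes an equal split among the connections of a class; the function name, the connection names and the 1544kbps class size are hypothetical and are not taken from any tested product.

```python
def intra_class_rates(class_kbps, configured_connections, active, borrowing=True):
    """Per-connection rate inside one class.

    With borrowing, the class bandwidth is split equally among the connections
    that are currently active. Without it, every connection keeps the static
    share computed from the configured connection count.
    """
    if not active:
        return {}
    divisor = len(active) if borrowing else configured_connections
    return {conn: class_kbps / divisor for conn in active}


if __name__ == "__main__":
    # Two connections share the class, then connection 1 terminates.
    print(intra_class_rates(1544, 2, ["conn1", "conn2"]))
    print(intra_class_rates(1544, 2, ["conn2"], borrowing=True))
    print(intra_class_rates(1544, 2, ["conn2"], borrowing=False))
```

In the last call the remaining connection stays at its static share, which matches the behavior described above for a device that lacks intra-class borrowing.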
C-2. VoIP Quality Test
This test does not include iPolicer, since it presently cannot control UDP traffic. The test is
performed with the Smartbits and with the Cisco 1750 VoIP gateways. The former gives quantitative results,
while the latter is used to judge the voice quality by ear.
Figure 9 (a) shows that under a T1 WAN link (1.544Mbps) the DUTs differ in latency and jitter.
However, the ultimate voice quality grades (PSQM) are similar except for ALTQ_CBQ. This is also
verified by the VoIP gateway test (Table 10). We thus conclude that under a T1 access link the G.729 bit
rate can be easily allocated. In contrast, under a 125kbps WAN link (Figure 9 (b) and Table 10), only
PacketShaper delivers voice that is barely recognizable. Transmitting a large packet (1500 bytes) over the
narrowband WAN link (125kbps) takes so long that the small voice packet (74 bytes) behind it
has to wait until the large packet is completely sent out. However, after QoSWorks
exercises Packet Size Optimization (reducing the maximum transfer unit of FTP connections when
the connections are established), the voice quality approaches that of the original voice in both the
Smartbits and the gateway tests. While this is promising, readers should be aware that reducing the packet
size of all other TCP connections can cause large overhead. As an analogy, several small trucks carrying
the goods incur more overhead than one big truck carrying the same goods. The network administrator
has to weigh this tradeoff.
(a) T1 WAN link (1.544Mbps). DUTs: Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, ALTQ_CBQ.
    Latency and jitter chart: Average Latency and Max Latency (ms, axis 0 to 250), Jitter (Latency Variation) (ms, axis 0 to 30).
    Data labels printed on the chart (bar assignment not recoverable): 30.6303, 81.0739, 1529.3163, 837.5684, 1.0533, 10.8978, 121.2768.
    PSQM and Loss Rate chart: PSQM (axis 0 to 6), Loss rate (%, axis 0 to 25).
    PSQM data labels as printed: 2.2, 2.48, 2.56, 2.45, 2.7, 2.6, 6.5.
(b) 125kbps WAN link. DUTs: Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, QoSWorks2, ALTQ_CBQ.
    Latency and jitter chart: Average Latency and Max Latency (ms, axis 0 to 3000), Jitter (Latency Variation) (ms, axis 0 to 80).
    PSQM and Loss Rate chart: PSQM (axis 0 to 8), Loss rate (%, axis 0 to 100).
    PSQM data labels as printed: 2.2, 6.5, 6.5, 6.5, 6.5, 6.5, 2.6, 6.23.
Note: “Base” results are obtained on a clean testbed without enabling any DUT. G.729 is a lossy codec, so even with little jitter and loss the PSQM is at least 2.2.
Figure 9: VoIP Test Results of SmartVoIPQoS
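The packet-size tradeoff discussed in this subsection can be made concrete with a short calculation. The sketch below assumes that 125kbps means 125,000 bits per second, that the TCP/IP header is 40 bytes (IPv4 plus TCP without options), and that link-layer framing is ignored. The 256-byte packet size is a hypothetical reduced size, not the value QoSWorks actually negotiates.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time needed to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000


def header_overhead(packet_bytes: int, header_bytes: int = 40) -> float:
    """Fraction of each packet consumed by headers (40 = IPv4 + TCP, no options)."""
    return header_bytes / packet_bytes


if __name__ == "__main__":
    narrowband = 125_000  # 125kbps WAN link, assumed to be 125,000 bit/s
    t1 = 1_544_000        # T1 WAN link

    # A 74-byte voice packet queued behind one data packet waits this long.
    for size in (1500, 256):
        print(f"{size:>4} bytes: "
              f"{serialization_delay_ms(size, narrowband):6.2f} ms at 125kbps, "
              f"{serialization_delay_ms(size, t1):5.2f} ms at T1, "
              f"header overhead {header_overhead(size):.1%}")
```

Under these assumptions a 1500-byte packet occupies the 125kbps link for 96 ms but the T1 link for under 8 ms, while shrinking the packet to 256 bytes cuts the wait to about 16 ms at the cost of raising the header share from under 3% to over 15%.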
T1 WAN link Speed
DUT                              Calling time     Delay time (estimated by ears)   Voice quality (legibility)
Baseline (only voice)            About 0.2 sec    Very short (< 0.1 sec)           Very good
Baseline (with background FTP)   Cannot establish the connection
iPolicer                         Cannot be tested (do not support UDP traffic control)
FloodGate                        About 0.5 sec    Very short (< 0.1 sec)           Very good
Guardian Pro                     About 0.5 sec    Very short (< 0.1 sec)           Very good
WiseWAN                          About 0.5 sec    Very short (< 0.1 sec)           Very good
PacketShaper                     About 0.5 sec    Very short (< 0.1 sec)           Very good
ALTQ_CBQ                         About 2 sec      Very short (< 0.1 sec)           Very good
QoSWorks                         About 1 sec      Very short (< 0.1 sec)           Very good
QoSWorks Optimized               Not tested (no need to)

125kbps WAN link Speed
DUT                              Calling time     Delay time (estimated by ears)   Voice quality (legibility)
Baseline (only voice)            < 1 sec          Very short (< 0.1 sec)           Very good
Baseline (with background FTP)   Cannot establish the connection
iPolicer                         Cannot be tested (do not support UDP traffic control)
FloodGate                        About 7 sec      About 1 sec                      Very poor (< 10%)
Guardian Pro                     About 3 sec      About 1.5 sec                    Ultra poor (< 1%)
WiseWAN                          About 7 sec      About 1.5 sec                    Ultra poor (< 1%)
PacketShaper                     About 1 sec      About 1 sec                      Poor (60%)
ALTQ_CBQ                         About 18 sec     About 1 sec                      Very poor (< 10%)
QoSWorks                         About 17 sec     About 1 sec                      Very poor (< 10%)
QoSWorks Optimized               About 6 sec      Very short (< 0.2 sec)           Very good
Table 10: VoIP Test Results Through VoIP Gateway
5. Conclusions
This work designs a novel testbed that mimics real-life Internet conditions, such as multiple
connections, heterogeneous WAN delays/delay jitters/packet loss rates, and different TCP source
implementations. Most test reports, such as those by the Tolly Group [22], are financed by the vendors
and may be biased. Additionally, the testbeds in those reports are over-simplified, lacking in-depth test
items or using an inadequate number of connections. This work first classifies the policy rules into three
major types: (1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule
among classes. The test methodology then quantifies the effectiveness of each device in enforcing these
policy rule types in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and VoIP quality.
The test results reveal several findings that are reproducible with our open tools: (1) the narrowband
class-based rule and its fairness among the flows are harder to enforce when multiple TCP connections
compete for the same queue, resulting in large queue lengths and TCP retransmissions; (2) explicitly
sizing the TCP window could cause performance or fairness degradation even under slight packet loss
rates; (3) the open source solution can compete with commercial products in accurately limiting flow
aggregates; (4) the video/voice quality of real-time applications depends significantly on the packet
sizes of all other traffic when using a narrowband (125kbps) access link. The detailed functionality
comparison among the DUTs gives further directions for enhancing open source solutions, such as
Packeteer’s traffic discovery and QoSWorks’s intuitive user interface. The ALTQ package lacks a
per-connection bandwidth guarantee within a class, so it needs further refinement to satisfy
enterprise demands. Some vendors in this test use open source software but never release their kernel
patches. We are currently patching ALTQ with a per-connection bandwidth guarantee and will contribute
the patch back to the Open Source community. After all, open source should be open.
6. References
[1] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, An Architecture for
Differentiated Services, RFC 2475, Dec. 1998.
[2] D. Stiliadis, and A. Varma, Latency-Rate Servers: A General Model for Analysis of Traffic
Scheduling Algorithms, IEEE/ACM Transactions on Networking, Vol. 6, No. 5, pp.611-624, Oct.
1998.
[3] S. Floyd, and V. Jacobson, Link-sharing and resource management models for packet
networks, IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp.365-386, 1995.
[4] K. Fall, and S. Floyd, Simulation-based Comparisons of Tahoe, Reno, and SACK TCP,
ACM Computer Communication Review, Vol. 26 No. 3, pp.5-21, Jul. 1996.
[5] J. Padhye, and S. Floyd, On Inferring TCP Behavior, ACM SIGCOMM'2001, San Diego,
USA, August, 2001. http://www.acm.org/sigcomm/sigcomm2001/p23.html (to appear)
[6] S. Floyd and V. Jacobson, Random Early Detection Gateways for Congestion Avoidance,
IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp.397-413, Aug. 1993.
[7] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer, TCP Rate Control, ACM
Computer Communication Review, Vol. 30, No. 1, Jan. 2000.
[8] K. Cho, Alternate Queueing for BSD UNIX (ALTQ), http://www.csl.sony.co.jp/person/kjc
[9] NetGuard Corporation, http://www.netguard.com
[10] Check Point Software Technologies, http://www.checkpoint.com
[11] BroadWeb Corporation, http://www.broadweb.com.tw
[12] Acute Communication Corporation, http://www.acutecomm.com
[13] Packeteer Corporation, http://www.packeteer.com
[14] Sitara Networks, http://www.sitaranetworks.com
[15] NetReality Corporation, http://www.net-reality.com
[16] Ncftpput Software, http://www.ncftp.com
[17] K. Cho, Tele Traffic Tapper (ttt), http://www.csl.sony.co.jp/person/kjc
[18] Spirent Communications, http://www.netcomsystems.com
[19] Lawrence Berkeley National Laboratory, tcpdump, http://www-nrg.ee.lbl.gov
[20] H. Y. Wei, WAN Emulator, http://speed.cis.nctu.edu.tw/wanemu/
[21] W. R. Stevens, TCP/IP Illustrated Volume 1 - The Protocols, Addison-Wesley, 1994.
[22] Tolly Group, http://www.tolly.com
Acknowledgements
We thank the vendors for generously providing the devices and for verifying
the test results. We are grateful to Ching-Chuan Chiang and Yi-Chung Liu for their help with the
preliminary tests and functionality comparisons.
Appendix
Appendix A. Detailed Functionality Comparison
A-1. Policy Console User Interface
As for the policy console user interface (Table A), a notable feature is how many devices a
management console can control. The policy consoles of PacketShaper and QoSWorks can control only one
device, since they rely on built-in web servers that are configured through Web browsers. The policy
consoles of the others (except for ALTQ) can remotely control multiple devices located at different places.
As for schedule control, per-rule schedule control is more effective than per-device control. For example,
some rules can be inactive during non-office hours, but the VoIP rule should always be active to guarantee
voice quality.
ALTQ
    Type: Config File
    Schedule Control: N
    Management Console: Single device
    OS: FreeBSD 4.0
    Monitor/Statistics: Per-class bandwidth usage
    Alert: N/A
NetGuard’s Guardian Pro
    Type: GUI Win32 Application
    Schedule Control: Per-rule
    Management Console: Global devices
    OS: Win NT/2000
    Monitor/Statistics: Line Statistics Report/Response Time Report/Protocol Distribution Report
    Alert: Log
CheckPoint’s FloodGate
    Type: GUI Win32 Application
    Schedule Control: Per-rule
    Management Console: Global devices
    OS: Win NT/2000
    Monitor/Statistics: Line Statistics Report/Response Time Report/Protocol Distribution Report
    Alert: N/A
NetReality’s WiseWan
    Type: GUI Java Application
    Schedule Control: Per-rule
    Management Console: Global devices
    OS: Win NT/Solaris
    Monitor/Statistics: Line Statistics Report/Port Report/Response Time Report/Protocol Distribution Report/VoIP Report/Top Ten Talkers/Top Ten Protocols or Apps
    Alert: SNMP trap
BroadWeb/Acute’s iPolicer
    Type: Web Browser (Java Applet)
    Schedule Control: Per-rule
    Management Console: Global devices
    OS: Web Server: Another NT; Web Client: IE 5.0
    Monitor/Statistics: Line Statistics Report/Top Ten Report/Top Ten Talkers/Top Ten Protocols
    Alert: Email trap
Packeteer’s PacketShaper
    Type: Web Browser (HTML)
    Schedule Control: Per-device
    Management Console: Single device
    OS: Web Server: Embedded Web Server; Web Client: Any
    Monitor/Statistics: Utilization/Network Efficiency/Top Ten Classes/Top Twenty Talkers/Per-class Bandwidth Usage/Response Time Report
    Alert: SNMP trap
Sitara’s QoSWorks
    Type: Web Browser (HTML)
    Schedule Control: Per-device
    Management Console: Single device
    OS: Web Server: Embedded Web Server; Web Client: Any
    Monitor/Statistics: Per-class Bandwidth Usage/Link statistics/Top classes per link/Top Applications/Protocol Distribution/Traffic by address
    Alert: SNMP trap
Table A: Management Interface and Statistics of Flow
A-2. Special Functions
PacketShaper is superior in its Traffic Discovery, which automatically identifies the protocols
of the traffic passing through it and gives instant feedback to the network administrator for further
bandwidth settings. With the other products, the administrator has to manually monitor whether newly
specified packet filters capture the corresponding traffic. WiseWAN is installed directly on the WAN link
(V.35 cable) and thus can verify whether the measured bandwidth matches the subscribed bandwidth.
Additionally, it can detect PVCs in the frame relay network. Thus a single WiseWAN device can control all
the traffic on the mesh-structured frame relay links among branch offices. QoSWorks focuses heavily on
controlling VoIP traffic. By shrinking the TCP data packet size, it lets VoIP traffic (UDP packets) pass
through smoothly, especially on a narrowband WAN link. Moreover, QoSWorks has a built-in Web cache
(not verified in this report). Both FloodGate and Guardian Pro can be integrated with their vendors’
firewall, VPN and NAT packages. Integrated solutions may reduce management costs.
Appendix B. Testbed Photo
Figure B: Testbed Photo
Appendix C. Intuitive Example for Basic Test Statistics
This intuitive example illustrates how the Basic Test statistics of the 20kbps class are derived. As
described in Section 3.2, each class matches four connections, and the test is repeated for five runs. Ideally,
within each run each connection receives 1/4 of the class bandwidth. In the example, the accuracy
statistic is 19, which is close to the ideal result of 20 but does not reflect the real conditions. Taking the
poor stability of accuracy into account, we can judge that the DUT is actually not good in accuracy. On the
other hand, “Not fair” combined with “Good stability of fairness” means that the DUT treats the
flows unfairly almost all the time.
Figure C: Intuitive Example for Basic Test Statistics
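Why an accuracy of 19 can mislead is easy to see with made-up numbers. The sketch below assumes that accuracy is the mean of the per-run class throughput and that stability of accuracy is the standard deviation across the five runs; the exact definitions are those of Section 3.2, and the per-run values here are hypothetical, chosen only so that their mean is 19.

```python
from statistics import mean, pstdev


def summarize(runs_kbps):
    """Return (accuracy, stability) under the assumed definitions:
    accuracy = mean over runs, stability = population standard deviation."""
    return mean(runs_kbps), pstdev(runs_kbps)


if __name__ == "__main__":
    unstable = [5, 33, 10, 30, 17]  # hypothetical per-run throughput of the 20kbps class
    steady = [19, 19, 19, 19, 19]   # hypothetical DUT with the same mean
    for name, runs in (("unstable", unstable), ("steady", steady)):
        acc, stab = summarize(runs)
        print(f"{name}: accuracy {acc:.1f} kbps, stability (std dev) {stab:.2f} kbps")
```

Both hypothetical DUTs report an accuracy of 19, but only the deviation across runs reveals that the first one rarely delivers anything close to 20kbps in a given run.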