INF570 dario.rossi INF570 v08/2013 Dario Rossi http://www.enst.fr/~drossi LEDBAT/uTP


Page 1

INF570 (v08/2013): LEDBAT/uTP

Dario Rossi, http://www.enst.fr/~drossi

Page 2

Plan

• Bufferbloat
  – Problem
  – Extent
• LEDBAT/uTP
  – Historical
  – Congestion control
  – BitTorrent swarm
  – Bufferbloat
• References

Page 3

Bufferbloat

• RTT delay between two Internet hosts?

[Figure: candidate RTT values across the Internet: ~100 ms, ~300 ms, ~2.5 s, ~5 s (!?)]

Page 4

Bufferbloat

• RTT delay between two close-by Internet hosts?

[Figure: RTT [s] (0 to 5) vs. time [s] (0 to 600), with TCP transfers from Alice to Bob and from Bob to Alice; the Earth-Moon RTT is marked for reference. Source: [PAM’10]]

Bufferbloat! RTT may grow to several seconds, with a nasty impact on interactive Web, VoIP, gaming traffic, etc.

Page 5

Bufferbloat

[Figure: candidate RTT values across the Internet: ~100 ms, ~400 ms, ~2.5 s, ~5 s (!?)]

• Bufferbloat = very large queuing delay
  – New word, old “persistently full buffer” problem
• Root causes
  – Narrow cable and ADSL pipe capacities (a few 100 Kbps)
  – Relatively large modem buffers (~100 KB)
  – TCP dynamics are loss driven, hence buffers fill
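The root causes above combine into a simple worst-case estimate: a full FIFO buffer of size B drains in B/C seconds on a link of capacity C. The sketch below only illustrates that arithmetic. The buffer is taken as 100,000 bytes, and the uplink rates (200, 400, 800 kbps) are assumed values, not measurements from the slides; the function name is hypothetical.

```python
def max_queuing_delay_s(buffer_bytes: float, capacity_bps: float) -> float:
    """Worst-case queuing delay: time needed to drain a full FIFO buffer."""
    return buffer_bytes * 8 / capacity_bps


buffer_bytes = 100_000  # assumption: "~100 KB" modem buffer read as 100,000 bytes
for capacity_kbps in (200, 400, 800):  # assumption: illustrative uplink rates
    delay = max_queuing_delay_s(buffer_bytes, capacity_kbps * 1000)
    print(f"{capacity_kbps} kbps uplink: up to {delay:.1f} s of queuing delay")
```

Under these assumptions the bound lands in the range of seconds, the same order of magnitude as the RTT values shown on the earlier slides.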

Page 6

Bufferbloat: single household, US

[Figure: J. Gettys' home, Comcast cable line]

Page 7

Bufferbloat: single household, EU

[Figure: D. Rossi's home, Free DSL line]

Page 8

Bufferbloat: uplink

[Figure: Netalyzr uplink measurements]

Page 9

Bufferbloat: downlink

[Figure: Netalyzr downlink measurements]

Page 10

Bufferbloat: where and why?

• Examples
  – OS stack (and above)
    • (L2) Ethernet driver, (L3) pfifo_fast queue, (L4) TCP/UDP buffers, (L7) application buffer
  – WiFi AP
    • e.g., when the speed changes from 54 to 2 Mbps
  – Interfering traffic
    • e.g., a PC-to-NAS backup uses the WiFi AP while a smartphone accesses the Internet
  – DSL box
    • Unless AQM is used (more details later)

[Figure: topology with end-user devices (PC, NAS), ISP box, and ISP network (DNS, gateway, NTP, hop, destination)]

[Figure: PC1 pings Google while PC2 backs up to the NAS; phases: backup of large files, backup of many small files, backup ended]

Page 11

LEDBAT: BitTorrent to the rescue?

[Figure labels: peer, file chunks, chunk transmission, LEDBAT congestion control]

Page 12

LEDBAT: BitTorrent to the rescue?

• BitTorrent announces closed source code and data transfer over UDP
• BitTorrent + UDP = Internet meltdown!
• After BitTorrent's denial and some discussion...
• Everybody agreed that the Internet is not going to die
• But is LEDBAT really the best approach?

Page 13

LEDBAT: BitTorrent to the rescue?

• BitTorrent solution
  – As a side effect of the Internet meltdown buzz, BT had to explain itself
  – Two fora: IETF and BitTorrent Enhancement Proposal (BEP)
• Two names, same goals:
  – Efficiently use the available bandwidth
  – Keep delay low on the network path
  – Quickly yield to regular TCP traffic
  – LEDBAT/uTP used interchangeably

IETF (LEDBAT)
• Low Extra Delay BAckground Transport
• Definition of a low-priority, delay-based congestion control algorithm

BEP29 (uTP)
• UDP framing for interoperability among BT clients

• Assumptions:
  – Bottleneck = user access link
  – Congestion = self-induced by users’ own traffic

Page 14

Research timeline: the big picture

• Two solutions to bufferbloat
  – Active Queue Management (AQM)
  – Low-priority/delay-based Congestion Control (CC)

[Timeline with tick marks at 1990, 1993, 1995, 2000, 2002, 2003, 2010, 2012]
• AQM: SFQ, RED, DRR, CHOKe, CoDel
• CC: Vegas, NICE, TCP-LP, LEDBAT

Let us expand this further

Page 15

LEDBAT/uTP evolution

[Figure: packet size [Bytes] (0 to 1500) vs. time [s] (0 to 50), one panel per client version; methodology labels: testbed [PAM’10], analysis, simulation, passive monitoring]

• TCP, v5.2.2 (Oct ’08): open source
• α1, v1.9-13485 (Dec ’08): closed source
• α2, v1.9-15380 (Mar ’09): first LEDBAT draft
• β1, v1.9-16666 (Aug ’09): draft as WG item
• RC1, v2.0.1 (Apr ’10): after IETF 77

• TCP transfers, full payload
• α1: small packet overkill!!
• α2: variable framing (not in draft)
• β1: finer bytewise cwnd control
• The evolution continues at IETF

Page 16

LEDBAT/uTP evolution

[Timeline with tick marks at May’10, Jul’10, Oct’10, Dec’10, Mar’11, Apr’11, May’11, Jul’11, Sep’11, Oct’11, Nov’12, showing client versions (2.0.2, 2.2 Griffin β, 2.2, 3.0, 3.1 α), the libUTP open-source release, draft versions (v2, v3, v4, v5-6, v7, v8-9, ~RFC), and related publications ([ICCCN’10], [LCN’10], [GLOB’10, TR1], [P2P’11], [TMA’12, P2P’12, CoNEXT’12]). Source: [TR1]]

• LEDBAT fast growing
  – on most BT clients
  – >50% of BT traffic
• LEDBAT almost RFC
  – with some bugs!

Page 17

Viewpoints

• Congestion control viewpoint
  – Latecomer unfairness [ICCCN’10] + solution via simulation [GLOB’10] and analytical modeling [TR1]
  – LEDBAT vs other low-priority protocols + magic numbers [LCN’10]
• BitTorrent swarm viewpoint
  – Reverse engineering of the closed source protocol [PAM’10]
  – Impact of LEDBAT on swarm completion time via simulation [P2P’11] and experiments [TMA’12]
• Bufferbloat viewpoint
  – LEDBAT vs AQM [CoNEXT’12]
  – Exploiting LEDBAT to gauge Internet bufferbloat delays [P2P’12]

Page 18

Congestion control viewpoint

“Because LEDBAT is an IETF protocol, whose scope is larger than BitTorrent”

Page 19

Congestion control viewpoint plan

• High-level idea and protocol specification [ICCCN’10]
• Latecomer unfairness: problem, extent, solution [GLOBECOM’10, TR’10]
• Protocol tuning and magic numbers [LCN’10]
• Comparison with other low-priority protocols [LCN’10]
• Experiments with BitTorrent implementation [PAM’10]

Page 20

LEDBAT operations: high level

TCP
Detect congestion by losses
• Increment the congestion window (cwnd) by one packet per RTT
• Halve cwnd on loss events
Consequences
• The buffer always fills up
• High delay for interactive apps
• Users need to prioritize traffic!
[Figure: cwnd vs. time, with losses marked]

LEDBAT
Infer congestion via delay measurement
• Senders measure the minimum delay
• Evaluate the offset from the TARGET delay
• React with a linear controller
Aim
• At most TARGET ms of delay
• Lower priority than TCP
• Do not harm interactive applications
• Avoid self-induced congestion
[Figure: cwnd vs. time, with the point where TARGET is reached marked]

Page 21

LEDBAT operations: gory details

• Pseudocode in draft v9 (31/10/2011)
• Note: lots of other details in the draft (e.g., route change, sample filtering, etc.) but not in today's talk
• TARGET = 25 ms (“magic number” fixed in draft v1), = 100 ms (in BEP29), <= 100 ms (from draft > v5)

@RX:
    remote_timestamp = data_packet.timestamp
    ack.delay = local_timestamp() - remote_timestamp
    ack.send()

@TX:
    current_delay = ack.delay
    base_delay = min(current_delay, base_delay)
    queuing_delay = current_delay - base_delay
    off_target = (TARGET - queuing_delay) / TARGET
    cwnd += GAIN * off_target * bytes_newly_acked * MSS / cwnd

Annotations:
• ack.delay: one-way delay (affected by clock offset and skew)
• base_delay: minimum delay seen (hopefully finds an empty queue)
• queuing_delay: queuing delay estimation (the clock offset cancels!)
• off_target: linear response to the distance from the target
• cwnd update: grows by at most 1 MSS after a full cwnd is acked
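The sender-side pseudocode above can be sketched as a small runnable class. This is a minimal illustration, not the draft's or libutp's implementation: MSS = 1500 bytes, GAIN = 1, and TARGET = 100 ms are assumed values, the one-MSS floor on cwnd is an added assumption, and route-change handling and sample filtering are left out as on the slide. The class and method names are hypothetical.

```python
from dataclasses import dataclass

MSS = 1500      # bytes (assumed)
TARGET = 0.100  # seconds (BEP29 value quoted on the slide)
GAIN = 1.0      # assumed


@dataclass
class LedbatSender:
    cwnd: float = 2 * MSS              # congestion window in bytes (assumed start)
    base_delay: float = float("inf")   # minimum one-way delay seen so far

    def on_ack(self, one_way_delay: float, bytes_newly_acked: int) -> float:
        """Update cwnd from the one-way delay echoed in an ACK."""
        self.base_delay = min(self.base_delay, one_way_delay)
        queuing_delay = one_way_delay - self.base_delay
        off_target = (TARGET - queuing_delay) / TARGET
        self.cwnd += GAIN * off_target * bytes_newly_acked * MSS / self.cwnd
        self.cwnd = max(self.cwnd, MSS)  # assumption: never shrink below 1 MSS
        return self.cwnd


sender = LedbatSender(cwnd=3000.0)
print(sender.on_ack(0.050, 1500))  # empty queue: window grows
print(sender.on_ack(0.250, 1500))  # 200 ms of queuing: window shrinks
```

Because only the difference between the current and the minimum delay is used, a constant clock offset between sender and receiver drops out of the computation.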

Page 22

LEDBAT vs TCP

1. TCP and LEDBAT start together
2. As soon as q > 20 pkts -> queuing delay > 25 ms -> LEDBAT decreases and stops
3. TCP experiences a loss (due to TCP AIMD) and halves its rate
4. Queue empties, LEDBAT restarts
5. TCP is more aggressive (larger cwnd w.r.t. t=0), LEDBAT yields
6. Cyclic losses

• LEDBAT is lower priority than TCP
• Total link utilization increases (w.r.t. TCP alone)
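The correspondence between 20 queued packets and roughly 25 ms of delay follows from the bottleneck rate. A quick check, assuming a 10 Mbps bottleneck and 1500-byte packets (the values used in the simulation setups later in the deck; the helper name is hypothetical):

```python
def queue_for_delay_pkts(delay_s: float, capacity_bps: float,
                         pkt_bytes: int = 1500) -> float:
    """Number of queued packets that produce the given queuing delay."""
    return delay_s * capacity_bps / (pkt_bytes * 8)


capacity_bps = 10e6  # assumption: 10 Mbps bottleneck
for target_ms in (25, 100):
    pkts = queue_for_delay_pkts(target_ms / 1000, capacity_bps)
    print(f"TARGET = {target_ms} ms <-> about {pkts:.1f} packets in the queue")
```

Under the same assumptions, a 100 ms target corresponds to roughly 83 queued packets, so a buffer smaller than that overflows before the target delay is ever reached.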

[Figure: cwnd [pkts] and queue [pkts] vs. time [s] for TCP, LEDBAT, and total; markers 1 to 6 correspond to the numbered steps. Source: [ICCCN]]

Page 23

LEDBAT vs LEDBAT

1. LEDBAT flows start together
2. Queue builds up
3. As soon as q = 20 pkts -> delay = 25 ms -> linear controller settles
4. Queue stays stable
5. No changes until some other flows arrive

• LEDBAT flows efficiently use the available bandwidth
• The bandwidth share is fair between LEDBAT flows

[Figure: cwnd [pkts] and queue [pkts] vs. time [s] for LEDBAT 1, LEDBAT 2, and total. Source: [ICCCN]]

Page 24

Latecomer unfairness

[Figure: cwnd [pkts] (0 to 80) vs. time [s] (5 to 25) in three panels: ∆T = 10 s, B = 100; ∆T = 10 s, B = 40; ∆T = 5 s, B = 40. Source: [ICCCN]]

• Second flow starts before the queue builds up: same target, but the latecomer gets a smaller share
• Second flow starts after the first has reached its target: the latecomer sets a higher target (the delay due to the first flow ends up in its base delay). After a loss, the base delay is possibly corrected (the queue empties)
• In case of a larger buffer, the latecomer starves the firstcomer
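The second case can be made concrete with the controller's off_target term. The sketch below assumes TARGET = 100 ms and a hypothetical 20 ms propagation delay; it only evaluates the controller input for both flows at a few queue levels and does not simulate window dynamics.

```python
TARGET = 0.100  # seconds (assumed)


def off_target(current_delay: float, base_delay: float) -> float:
    """Normalized distance from the target, as in the LEDBAT controller."""
    queuing_delay = current_delay - base_delay
    return (TARGET - queuing_delay) / TARGET


propagation = 0.020                # assumption: true delay with an empty queue
first_base = propagation           # the firstcomer once saw an empty queue
late_base = propagation + TARGET   # the latecomer only ever saw a queue of TARGET

for queue in (TARGET, 1.5 * TARGET, 2 * TARGET):
    current = propagation + queue
    print(f"queue = {queue * 1000:.0f} ms: "
          f"firstcomer off_target = {off_target(current, first_base):+.2f}, "
          f"latecomer off_target = {off_target(current, late_base):+.2f}")
```

With the inflated base delay, the latecomer keeps increasing until the real queue holds 2 x TARGET, while the firstcomer sees a negative off_target over that whole range and keeps backing off.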

Page 25

Latecomer unfairness

• Implementation
  – LEDBAT as a new TCP flavor in ns2
  – (also works as a Linux kernel module)
• Scenario
  – Single bottleneck
  – Backlogged transfers
• Parameters
  – Buffer size B
  – Time between flow starts ∆T
  – Target = 100 ms
• Methodology
  – Simulation [ICCCN’10]
  – Experiments [TR1]

[Figure: single-bottleneck topology with a 10 Mbps link and buffer B, 100 Mbps access links, and flows started ∆T apart]

Page 26

Latecomer unfairness

Real BitTorrent code in a testbed [TR1] (https://github.com/bittorrent/libutp)

• Latecomer overestimates the base delay (it includes the queuing due to the first flow)
• Firstcomer senses increasing delay and slows down its sending rate
• Firstcomer starvation! (now stated in small print after tons of disclaimers in draft v9)

Source: [TR1]

Page 27

Root cause of latecomer unfairness

D. Chiu and R. Jain, "Analysis of the Increase/Decrease Algorithms for Congestion Avoidance in Computer Networks," Journal of Computer Networks and ISDN, Vol. 17, No. 1, June 1989, pp. 1-14

[Figure: AIAD vs AIMD]

• Intrinsic to AIAD dynamics!
  – Known since the late 80s!!
  – Very well known, over 1500 citations!
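The Chiu-Jain argument can be reproduced with a toy two-flow model: both flows receive the same binary feedback and apply the same rule. This is a simplified sketch with assumed parameters (capacity 20, unit additive step, halving for the multiplicative case), not the ns2 setup used in the cited papers.

```python
def simulate(decrease, x1=1.0, x2=9.0, capacity=20.0, steps=200, increase=1.0):
    """Two flows with synchronized feedback: additive increase while the link
    is not overloaded, otherwise apply the given decrease rule to both."""
    for _ in range(steps):
        if x1 + x2 <= capacity:
            x1, x2 = x1 + increase, x2 + increase
        else:
            x1, x2 = decrease(x1), decrease(x2)
    return x1, x2


aiad = simulate(lambda x: x - 1.0)  # additive decrease
aimd = simulate(lambda x: x / 2.0)  # multiplicative decrease

print("AIAD final rates:", aiad, "gap:", abs(aiad[0] - aiad[1]))
print("AIMD final rates:", aimd, "gap:", abs(aimd[0] - aimd[1]))
```

Additive steps move both rates by the same amount, so the initial gap never changes; each multiplicative decrease halves the gap, which is why AIMD converges to a fair share and AIAD does not.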

Page 28

Solution to latecomer unfairness

Several possibilities:
• Random pacing
• Slow start
• Random cwnd drop
• Proportional cwnd drop

:( Slow start
• A loss empties the queue...
• ...but hurts interactive traffic

:( Random pacing
• Proposed on the IETF mailing list
• Doesn’t work (more later)

:/ Random cwnd drop
• Throughput fluctuations
• Efficiency ok, fairness almost ok

:/ Proportional cwnd drop
• Synchronization
• Fairness ok, efficiency almost ok

:) fair-LEDBAT (fLEDBAT)
• Reintroduces AIMD
• Uses proportional drop
• Details, simulation, fluid model: see [TR1]

Source: [GLOB]

Page 29

Extent of latecomer unfairness

• Fairness issue affects backlogged connections
  – If tx stops => queue empties => base delay measure is correct => all ok
• Relevance of latecomer unfairness with traffic models?
  – Chunk based: M=4 simultaneous connections out of N>M peers
  – After each chunk, continue with the same peer with probability P
  – Rough model of choking, pipelining, optimistic unchoking, etc.
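In the chunk-based model above, the number of consecutive chunks sent to the same peer is geometrically distributed, so its mean is 1/(1-P). A short sketch of that relation (the function name is hypothetical, and P values are illustrative):

```python
def expected_chunks_per_peer(p: float) -> float:
    """Mean number of consecutive chunks sent to one peer when the sender
    continues with the same peer with probability p after each chunk."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return 1.0 / (1.0 - p)


for p in (0.0, 0.5, 0.9, 0.99):
    print(f"P = {p}: {expected_chunks_per_peer(p):.0f} chunk(s) in a row on average")
```

As P approaches 1, connections carry long runs of chunks and behave more like the backlogged transfers for which latecomer unfairness was observed.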

[Figure: one sender (TX) and receivers RX1, RX2, RX3, ..., RXN; results for LEDBAT and fLEDBAT as a function of the persistence probability P. Source: [TR1]]

Page 30

Sensitivity analysis + low-priority comparison

• LEDBAT and TCP-NICE as new flavors in ns2
• Bottleneck with capacity C = 10 Mbps, other links at 100 Mbps
• RTT = 50 ms, packet size pktS = 1500 B, buffer Bmax = 100 pkts
• Flows start at the same time, ΔT = 0 s (avoids latecomer unfairness)

[Figure: N senders and N receivers sharing the bottleneck]

Metrics, with $x_i$ the throughput of flow $i$:
• Efficiency: $\eta = \frac{\sum_{i=1}^{N} x_i}{C}$
• Average queue occupancy: $B = \frac{E[B]}{B_{\max}}$
• Jain fairness index: $F = \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N \sum_{i=1}^{N} x_i^2}$

F ∈ [1/N, 1]: F = 1 is maximum fairness, F = 1/N is maximum unfairness
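The efficiency and fairness metrics are easy to compute from per-flow throughputs. A minimal sketch follows; the throughput vectors are made-up examples, not results from the slides.

```python
def efficiency(throughputs, capacity):
    """Fraction of the bottleneck capacity used by all flows together."""
    return sum(throughputs) / capacity


def jain_index(throughputs):
    """Jain fairness index: (sum x_i)^2 / (N * sum x_i^2), in [1/N, 1]."""
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one flow with non-zero throughput")
    return sum(throughputs) ** 2 / (n * sum_sq)


print(jain_index([2.5, 2.5, 2.5, 2.5]))        # equal shares
print(jain_index([10.0, 0.0, 0.0, 0.0]))       # one flow takes everything
print(jain_index([1.0, 1.0, 1.0, 0.0]))        # three equal flows, one starved
print(efficiency([2.5, 2.5, 2.5, 2.5], 10.0))  # link fully used
```

The two extreme cases reproduce the bounds F = 1 and F = 1/N; starving one flow out of four gives F = 3/4.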

Page 31

Sensitivity analysis

• Inter-protocol case: LEDBAT vs NewReno
  – Target T ∈ [2,150]% of Bmax
  – Gain G ∈ [1,10] as in draft
• Main findings
  – Gain has only minimal impact
  – Target has profound impact; hard to tune the level of priority

[Figure: efficiency η and fairness F as functions of gain G, f(G), and of target T [%], f(T); regions marked: R1 unstable, R2 low priority, R3 transient, R4 loss-based]

Page 32

Sensitivity analysis

• Intra-protocol case: LEDBAT vs LEDBAT
  – Target ratio T1/T2 ∈ [1,10]
  – Gain G = 1, since it has minimal impact
• Main findings
  – Even small target differences lead to serious unfairness

[Figure: efficiency η, fairness F, and average queue B vs. target ratio T1/T2; regions: R0 (T1 = T2) fair; R1 (T1 + T2 < 100% Bmax) unfair; R2 (T1 + T2 > 100% Bmax) lossy]

Page 33

Low-priority congestion control

• NICE
  – Inherits from Vegas, measures RTT delay, decreases the window when half of the samples > E[RTT]
• TCP-LP
  – EWMA of the OWD value; if a threshold is exceeded, halves cwnd and enters a congestion inference phase
• LEDBAT
  – OWD estimation, continuous AIAD response to the OWD offset from the target delay

Page 34

Low-priority congestion control

[Figure: cwnd [pkt] vs. time [s] in six panels: NICE.0 vs NICE.1, NewReno vs NICE, TCP-LP.0 vs TCP-LP.1, NewReno vs TCP-LP, LEDBAT.0 vs LEDBAT.1, NewReno vs LEDBAT]

Page 35

Multiple-flows comparison

• NewReno vs lower-priority flows
  – N ∈ [1,10] low-priority flows vs 1 NewReno flow
• Main findings
  – LEDBAT and NICE are efficient in exploiting spare capacity
  – TCP-LP is more aggressive than LEDBAT and NICE

[Figure: efficiency η and fairness F vs. number of flows N for LEDBAT, TCP-LP, NICE, and NewReno]

Page 36

Multiple-flows comparison

• Lower-priority flows against each other
  – N ∈ [1,5] low-priority flows for each flavor, 3N flows in total
• Main finding
  – Relative low-priority rank: LEDBAT, NICE, TCP-LP

[Figure: throughput breakdown vs. total number of low-priority flows for LEDBAT, TCP-LP, and NICE]

Page 37

Experiments

• [PAM’10]
  – Carried out before the draft
  – Still relevant to point out the difference between theory and practice
• Active testbed measurements
  – Single-flow LAN experiments
    • Timeline of LEDBAT evolution
    • Emulated capacity and delay
  – Multi-flow LAN experiments
    • Level of low priority?
  – ADSL experiments
    • In the wild
    • Interaction with cross-traffic

Page 38

LAN: single-flow experiments

• Application setup
  – uTorrent, official BitTorrent client, closed source
    • on native Windows (XP)
    • on Linux with Wine
  – Private torrent between PCs
• Network setup
  – Linux-based router with Netem to emulate network conditions
  – Capture traffic on both sides, analyze traces
• Experiment setup
  – Flavors comparison
  – Delay impairment
  – Bandwidth limitation

[Figure: testbed with seed and leecher connected through a 10/100 switch and a router running Netem; forward and backward paths marked]

Page 39

LEDBAT evolution

• Throughput evolution
  – Different LEDBAT versions
  – Each curve from a separate experiment
  – Also TCP BitTorrent (Linux + Win)
• Observations
  – α1: unstable, small packets
    • Small packets recently broke β1 too
  – α2, β1: stable, smooth throughput
    • 4 and 7 Mbps respectively
• Window limits
  – LEDBAT and TCP are influenced by the default maximum receive window:
    • TCP WinXP = 17 KB
    • TCP Linux = 108 KB
    • LEDBAT α2 = 30 KB
    • LEDBAT β1 = 45 KB

[Figure: throughput [Mb/s] (0 to 10) vs. time [s] (0 to 240) for α1, α2, β1, TCP WinXP, and TCP Linux]
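A maximum receive window caps throughput at about one window per round trip. The sketch below shows that bound only; the RTT values are hypothetical (the slides do not state the testbed RTT) and KB is read as 1000 bytes.

```python
def window_limited_throughput_mbps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput when at most one window is in flight per RTT."""
    return window_bytes * 8 / rtt_s / 1e6


windows_kb = {"TCP WinXP": 17, "TCP Linux": 108, "LEDBAT a2": 30, "LEDBAT b1": 45}
for rtt_ms in (20, 60):  # assumption: illustrative round-trip times
    for name, kb in windows_kb.items():
        bound = window_limited_throughput_mbps(kb * 1000, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms, {name}: at most {bound:.1f} Mb/s")
```

The bound scales linearly with the window and inversely with the RTT, which is why flavors with different default windows settle at different throughputs on the same path.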

Page 40

LAN: Constant delay

• LEDBAT is delay based
  – it measures the one-way delay
  – sensitive to forward delay
  – independent from backward delay
• Experiment setup
  – constant delay for all packets
  – delay = k·20 ms, k ∈ [1,5]
  – forward (or backward) path
• Observations
  – same behavior in both directions
  – due to the upper bound of the maximum receiver window (α2 = 20, β1 = 30)
  – not a constraint, since the bottleneck is shared among many flows

[Figure: throughput [Mb/s] (0 to 10) of α2 and β1 and delay profile [ms] (0 to 120) vs. time [s] (0 to 600), for delay on the forward path and on the backward path]

Page 41

LAN: Variable delay

• LEDBAT is delay based
• Experiment setup
  – random, uniformly distributed delay for each packet
  – reordering is possible!
  – range = 20 ± {0,5,10,20} ms
  – forward (or backward) path
• Observations
  – α2 is greatly affected by varying delay on the forward path
  – β1 probably implements some mechanism to detect reordering
  – Minor impact of varying delay on the backward path (ack delay only affects when decisions are taken)

[Figure: throughput [Mb/s] (6 to 10) of α2 and β1 and delay profile [ms] (0 to 120) vs. time [s] (0 to 360), for delay on the forward path and on the backward path]

Page 42

LAN: multi-flow experiments

• Application setup
  – uTorrent β1 flavor only
  – Private torrents
  – Each seeder/leecher couple shares a different torrent
• Network setup
  – Simple LAN environment
  – No emulated conditions
• Experiment setup
  – Varying ratios of TCP/LEDBAT flows
  – Different TCP network stacks
    • native Linux stack
    • Windows settings emulated over Linux (to measure losses)

[Figure: testbed with seeds and leechers connected through a 10/100 switch and a router running Netem]

Page 43

LAN: Multiple TCP and LEDBAT flows

Observations
• Throughput breakdown between X+Y competing TCP+LEDBAT flows
• Efficiency is always high
• Fairness among flows of the same type (intra-protocol) is preserved
• TCP-friendliness depends on TCP parameters
  – TCP Linux is stronger than LEDBAT, LEDBAT is stronger than TCP WinXP

[Figure: bandwidth breakdown between TCP and LEDBAT flows, for the TCP Windows and TCP Linux settings]

TCP+LEDBAT flows     4+0    3+1    2+2    1+3    0+4
TCP Windows
  Fairness           1.00   0.64   0.74   0.87   1.00
  Efficiency         0.67   0.94   0.93   0.92   0.96
TCP Linux
  Fairness           1.00   0.75   0.56   0.33   1.00
  Efficiency         0.98   0.98   0.98   0.98   0.96

Page 44

ADSL experiments

• Application setup
  – uTorrent β1 flavor only
  – native Windows (XP) only
  – private torrent
• Network setup
  – real ADSL modems
  – wild Internet
    • uncontrolled delay, bandwidth
    • interfering traffic
• Experiment setup
  – LEDBAT alone in the wild
  – LEDBAT with cross TCP traffic
    • on the forward path
    • on the backward path

[Figure: seed and leecher, each behind an ADSL line, connected through the Internet]

Page 45

ADSL experiments

• uTP alone: stable throughput, close to the DSL capacity
• TCP on forward/backward paths
  – Forward TCP: uTP correctly yields
  – Backward TCP: nasty influence
    • shaky RTT, competition with bursty ACKs on the forward path
    • RTT > 1 s affects the time at which decisions are effectively applied!

[Figure: uTP β1 throughput [Mb/s] (0.2 to 0.8, with the DSL capacity marked) and RTT [s] (0 to 5) vs. time [s] (0 to 600), with TCP forward and TCP backward phases. Source: [PAM]]

Page 46

Congestion control summary

• IETF LEDBAT
  – Latecomer unfairness is intrinsically due to AIAD dynamics [ICCCN’10, GLOBECOM’10]: the bug almost made it into an IETF RFC!
  – Unfairness due to heterogeneous TARGET [LCN’10]
  – LEDBAT has the lowest priority w.r.t. TCP-LP, NICE [LCN’10]
  – Target delay-driven AIMD solves the latecomer issue and keeps the queue bounded [TR’10]
  – Chunk-mode transmission: all flows are in a transient phase, the latecomer issue does not show up [TR’10]
  – Implementation details can easily break the performance [PAM’10]
  – Meaning of low priority depends on the meaning of best effort (i.e., TCP flavor) [PAM’10]
  – Bufferbloat in the reverse direction can still harm performance [PAM’10]

Page 47

BitTorrent swarm viewpoint

“Because LEDBAT is used by BitTorrent, whose users couldn’t care less for IETF”

Page 48

BitTorrent swarm viewpoint plan

• Simulation with arbitrary connection-level semantic [P2P’11]

• Experiments with real application [TMA’12]

Page 49

Simulation scenario

• Bottleneck at the access links (limited capacity C, large buffer B)
• Population: 1 seed, 100 leechers in flash-crowd scenario
• Seed policy: leave immediately, stay forever, or stay for a random time
• Leechers use either TCP or uTP to TX chunks (but are able to RX both flavors)
• Swarm: homogeneous (100% TCP or 100% uTP) or heterogeneous (50%-50%)

Page 50

Simulation results

• Homogeneous scenario: same completion time for TCP and uTP
• Heterogeneous scenario: swarm shows longer download time
• Heterogeneous scenario: uTP peers have shorter completion time w.r.t. TCP peers... unexpected!

Page 51

Completion time
• Opposite uTP/TCP behavior for growing buffer size
• Affected by the congestion control choice in the uplink direction

Results explanation
• TCP fills the buffer, up to 2 s
• uTP keeps buffer occupancy low: faster signalling due to shorter queuing delay
• uTP peers opportunistically steal chunk download slots from self-congested TCP peers

Page 52

Experimental scenario

• Grid’5000 controlled testbed [TMA’12]
  – uTorrent application
  – Gbps LAN, emulated capacity & delay
  – One peer per host (no shared queues)
• Experimental scenario
  – Flash crowd, single provisioned seed
  – 75 peers, mixed LEDBAT/TCP transport preferences
  – Homogeneous delay and capacity
• Performance
  – Torrent completion time (QoE)
  – Linux kernel queue length (QoS)
  – Closed source, so no chunk-level log :(

[Figure: one seed and N=75 peers]

Page 53

Experimental results

– Homogeneous swarms: (75 uTP) XOR (75 TCP) peers, separate experiments at different times
– Several repetitions (envelope reports min & max)
– As expected, uTP limits queue length

[Figure: queue length; E[TCP] = 385 ms, E[uTP] = 108 ms. Source: [TMA’12]]

Page 54

Experimental results

– Homogeneous swarms: (75 uTP) XOR (75 TCP) peers
– Shorter queue length leads to faster signaling: awareness of content availability propagates faster!
– In turn, this helps reduce the completion time

[Figure: completion time; E[TCP] = 1421 s, E[uTP] = 1325 s. Source: [TMA’12]]

Page 55

Experimental results

– Heterogeneous swarms: H TCP + (75-H) uTP peers (+ other scenarios, details in [TMA’12])
– Completion time increases linearly with TCP byte share! (exception: all-uTP scenario, details in [TMA’12])
– Impact of TCP flavors studied in [P2P’13]

Page 56

BitTorrent swarm summary

• LEDBAT has a nice interplay with BitTorrent
  – No latecomer unfairness due to chunk-based transmission [TR’10]
  – Lower queue occupancy translates into faster signaling [P2P’11, TMA’12]
  – Completion time of the swarm possibly decreases; actual results depend on application-layer connection management policies [P2P’11, TMA’12]
  – LEDBAT has a nice interplay with other applications, without hurting BitTorrent performance either
  – Similar interplay observed with NICE [P2P’13]; additional benefits follow from changes in the application (but uTorrent is closed source)
• Hidden details
  – uTorrent implements an aggregated bandwidth shaper that ensures the set of LEDBAT flows takes as much as the set of TCP flows (otherwise, LEDBAT starves)
• Further complexity
  – Dual TCP/LEDBAT stack: uTorrent attempts opening both connections in parallel; if the LEDBAT open succeeds, it drops the TCP connection

Page 57

Bufferbloat viewpoint

“Because LEDBAT is an important Internet player, but only one of the players”

AQM? VoIP? Games?

Page 58

Bufferbloat solution: LEDBAT?

LEDBAT achieves its goals
• Efficient and low queuing delay [ICCCN’10, PAM’10]
• Lowest priority w.r.t. the low-priority protocols TCP-LP, NICE [LCN’10]

LEDBAT not enough
• A single TCP connection can generate bufferbloat
• LEDBAT must ubiquitously replace TCP to solve bufferbloat
• ...but this is not going to happen

LEDBAT not perfect
• Latecomer advantage for backlogged flows can be solved [GLOB’10, TR1]
• Fixed 100 ms target does not scale with link capacity (not future proof)
• Heterogeneous targets are another source of unfairness [LCN’10]
• Troubles with reverse bufferbloat [PAM’10]
• ...but good interplay with BitTorrent [P2P’11, TMA’12]

Page 59

Bufferbloat viewpoint

• What is the actual extent of bufferbloat?
  – Maximum bufferbloat is easy to measure (e.g., Netalyzr)
  – Typical bufferbloat is unknown [P2P’12]
• Interesting questions
  – What is the queuing delay probability distribution (i.e., user X sees a queuing delay larger than Y)?
  – What % of RTT is due to queuing?

• What about LEDBAT interoperability?
  – 1 TCP flow can still create bufferbloat
  – Hence, AQM possibly deployed in parallel
• Interesting questions
  – Combination of LEDBAT and AQM?
  – More generally, of any low-priority protocol and AQM? [CoNEXT’12]

Page 60

Passive bufferbloat inference

• Methodology
  – Passive traffic observation
  – Gauge the remote peer buffer even with hidden B->C traffic
  – Simple idea: queuing delay = current OWD - minimum OWD
• Timestamps from payload
  – Exploit LEDBAT headers
  – TimeStamp Options for TCP (RFC1323)
  – Very accurate with kernel and application-level ground truth
  – Demo at [P2P’12]
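The "current OWD minus minimum OWD" idea can be sketched on a list of timestamp pairs. This is an offline illustration with made-up samples: it takes the minimum over the whole trace, whereas a live estimator would track a running minimum; header parsing is not shown and the function name is hypothetical.

```python
def queuing_delays(samples):
    """samples: list of (sender_timestamp_s, receiver_timestamp_s) pairs.
    Returns the per-packet queuing delay estimate: OWD minus the minimum OWD."""
    owds = [rx - tx for tx, rx in samples]
    base = min(owds)  # hopefully observed with an empty queue
    return [owd - base for owd in owds]


# Hypothetical trace: the receiver clock is 5 s ahead of the sender clock.
trace = [(0.0, 5.020), (1.0, 6.070), (2.0, 7.120), (3.0, 8.020)]
print([round(d, 3) for d in queuing_delays(trace)])
```

The unknown clock offset appears in every OWD sample and in the minimum alike, so it cancels in the difference; only clock drift over the trace would remain.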

Page 61

Passive bufferbloat inference (~2000 BitTorrent peers)

• LEDBAT: modal
  – Queuing delay of 100 ms as expected (target)...
  – ...unless competing with TCP
• TCP: bi-modal
  – very low delay <=> application-level bandwidth limit
  – very high delay <=> bufferbloat
• Biased/coupled sampling
  – Number of samples = congestion window
  – Congestion window ~ inversely proportional to delay!

Page 62

Passive bufferbloat inference (~2000 BitTorrent peers)

Removing sampling bias
• Bias due to the sampling process being synchronous with LEDBAT dynamics
• Make the process asynchronous by batching packets into fixed-duration windows

Performance analysis
• 1% of the 1-s windows see a queuing delay larger than 1 s
• Breakdown per access type (AT), BitTorrent client (BC), and operating system (OS)
• Order of importance: AT, BC, OS
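The fixed-duration batching can be sketched as follows. Per-packet samples are grouped into 1-second windows so that each window counts once, however many packets it contains. Summarizing a window by its mean is an assumption made here (the slides do not say which per-window statistic is used), and the sample data are made up.

```python
from collections import defaultdict


def window_means(samples, window_s=1.0):
    """samples: iterable of (time_s, queuing_delay_s).
    Returns {window_index: mean queuing delay in that window}."""
    buckets = defaultdict(list)
    for t, delay in samples:
        buckets[int(t // window_s)].append(delay)
    return {w: sum(v) / len(v) for w, v in sorted(buckets.items())}


def fraction_above(window_stats, threshold_s):
    """Fraction of windows whose statistic exceeds the threshold."""
    values = list(window_stats.values())
    return sum(v > threshold_s for v in values) / len(values)


# Hypothetical trace: many packets while delay is low, few while it is high.
trace = [(0.1, 0.1), (0.2, 0.1), (0.3, 0.1), (0.4, 0.1), (1.5, 2.0)]
per_packet = sum(d > 1.0 for _, d in trace) / len(trace)
per_window = fraction_above(window_means(trace), 1.0)
print(f"per-packet: {per_packet:.2f}, per-window: {per_window:.2f}")
```

In this toy trace the per-packet view under-represents the high-delay period, because fewer packets are sent when the delay is large; the per-window view weights both seconds equally.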

Page 63

AQM + LEDBAT

AQM alone
• AQM is not ubiquitous
• “Dark” buffers in OS, DSL, application
• AQM is hard to tune

AQM + LEDBAT
• Reprioritization!
• LEDBAT becomes as aggressive as NewReno TCP

Page 64

AQM + Low-priority congestion control

AQM + Low-priority CC
• Reprioritization for any AQM+CC combination!
• Simulation [CoNEXT’12] student workshop
• Experimental results confirm the trends [TMA’13b]
• Fluid model to explain dynamics [ITC’13]

Page 65

Wrap up

“Because nobody in the audience could possibly stand any further picture!!”

Page 66

Conclusions

• IETF LEDBAT
  – The protocol may perform poorly under some (not-so-weird) conditions (i.e., backlogged transfers)
  – Fixed TARGET delay is not future-proof!
  – Heterogeneous TARGET is a cause of further unfairness!
  – Bufferbloat due to LEDBAT is low unless TCP is in the background
  – Hence, LEDBAT alone is not enough to solve bufferbloat
  – Dilemma: distributed E2E or local AQM solution? Both at the same time have poor interaction
• BitTorrent uTP
  – When connections are not backlogged, the conditions for latecomer unfairness are not met, hence good performance
  – Good interplay with BT completion time (lower completion time)
  – Good interplay with the interactive traffic of the same users (bounded queuing delay, unlike TCP-LP, NICE)
  – May not be needed if AQM is properly configured on the DSL box?

Page 67

References

Optional readings:

[P2P’13] C. Testa, D. Rossi, A. Rao and A. Legout, Data Plane Throughput vs Control Plane Delay: Experimental Study of BitTorrent Performance. In IEEE P2P'XIII, Trento, Italy, September 2013.

[P2P’11] C. Testa and D. Rossi, The impact of uTP on BitTorrent completion time. In IEEE P2P'11, 2011.

For your thirst of knowledge:

[ITC’13] Y. Gong, D. Rossi and E. Leonardi, Modeling the interdependency of low-priority congestion control and active queue management. In The 25th International Teletraffic Congress (ITC25), September 2013.

[TMA’13a] C. Chirichella and D. Rossi, To the Moon and back: are Internet bufferbloat delays really that large. In Traffic Measurement and Analysis (TMA'13), 2013.

[TMA’13b] Y. Gong, D. Rossi, C. Testa, S. Valenti and D. Taht, Fighting the bufferbloat: on the coexistence of AQM and low priority congestion control. In Traffic Measurement and Analysis (TMA'13), 2013.

[CoNEXT’12] Y. Gong, D. Rossi, C. Testa, S. Valenti and D. Taht, Interaction or Interference: can AQM and low priority congestion control successfully collaborate? In ACM CoNEXT, Student Session, Dec 2012.

[P2P’12] C. Chirichella, D. Rossi, C. Testa, T. Friedman and A. Pescapé, Inferring the buffering delay of remote BitTorrent peers under LEDBAT vs TCP. In IEEE P2P'XII, 2012.

[TMA’12] C. Testa, D. Rossi, A. Rao and A. Legout, Experimental Assessment of BitTorrent Completion Time in Heterogeneous TCP/uTP swarms. In Traffic Measurement and Analysis (TMA), 2012.

[TR1] G. Carofiglio, L. Muscariello, D. Rossi, C. Testa and S. Valenti, Rethinking low extra delay background transport protocols. Elsevier Computer Networks, 2013.

[GLOBECOM’10] G. Carofiglio, L. Muscariello, D. Rossi and S. Valenti, The quest for LEDBAT fairness. In IEEE Globecom, 2010.

[LCN’10] G. Carofiglio, L. Muscariello, D. Rossi and C. Testa, A hands-on Assessment of Transport Protocols with Lower than Best Effort Priority. In 35th IEEE Conference on Local Computer Networks (LCN'10), Denver, CO, USA, October 10-14 2010.

[ICCCN’10] D. Rossi, C. Testa, S. Valenti and L. Muscariello, LEDBAT: the new BitTorrent congestion control protocol. In International Conference on Computer Communication Networks (ICCCN'10), Zurich, Switzerland, August 2-5 2010.

[PAM’10] D. Rossi, C. Testa and S. Valenti, Yes, we LEDBAT: Playing with the new BitTorrent congestion control algorithm. In Passive and Active Measurement (PAM'10), Zurich, Switzerland, April 2010.