making linux tcp fast - netdev...

26
1 Making Linux TCP Fast Yuchung Cheng Neal Cardwell netdev 1.2 Tokyo, October, 2016

Upload: lyngoc

Post on 02-Feb-2018

223 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

1

Making Linux TCP Fast

Yuchung ChengNeal Cardwell

netdev 1.2 Tokyo, October, 2016

Page 2: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Once upon a time, there was a TCP ACK...

Here is the a story of what happened next...

2

Page 3: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

RACK: detect losses by packets’ send time

Monitors the delivery process of every (re)transmission. E.x.

Sent packets P1 and P2

Receives a SACK of P2

=> P1 is lost if sent more than $RTT + $reo_wnd ago1

Reduce timeouts in Disorder state by 80% on Google.com

3

1 RACK draft-ietf-tcpm-rack-00 since Linux 4.4

Page 4: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

congestion control:how fast to send?

4

Page 5: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

5

Congestion and bottlenecks

Page 6: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

6

Deliv

ery

rate

BDP BDP + BufSize

Congestion and bottlenecks

amount in flight

Page 7: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

7

Deliv

ery

rate

BDP BDP + BufSize

RTT

amount in flight

Page 8: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

8

Deliv

ery

rate

BDP BDP + BufSize

RTT

CUBIC / Reno

amount in flight

Page 9: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

9

Deliv

ery

rate

BDP BDP + BufSize

RTT

Optimal max BW and min RTT (Gail & Kleinrock. 1981)

amount in flight

Page 10: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

BDP = (max BW) * (min RTT)

10

Deliv

ery

rate

BDP BDP + BufSize

RTT

Estimating optimal point (max BW, min RTT)

amount in flight

Est min RTT = windowed min of RTT samples

Est max BW = windowed max of BW samples

Page 11: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

11

Deliv

ery

rate

BDP BDP + BufSize

RTT

amount in flight

Only min RTT is visible

Onlymax BWis visible

But to see both max BW and min RTT,must probe on both sides of BDP...

Page 12: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

One way to stay near (max BW, min RTT) point:

12

Model network, update max BW and min RTT estimates on each ACK

Control sending based on the model, to...

Probe both max BW and min RTT, to feed the model samples

Pace near estimated BW, to reduce queues and loss

Vary pacing rate to keep inflight near BDP (for full pipe but small queue)

That's BBR congestion control (code in Linux v4.9 paper ACM Queue, Oct 2016)

BBR = Bottleneck Bandwidth and Round-trip propagation time

BBR seeks high tput with small queue by probing BW and RTT sequentially

Page 13: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

BBR: model-based walk toward max BW, min RTT

optimal operating point

13

Page 14: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

STARTUP: exponential BW search

14

Page 15: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

DRAIN: drain the queue created during startup

15

Page 16: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

PROBE_BW: explore max BW, drain queue, cruise

16

Page 17: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

PROBE_RTT briefly if min RTT filter expires (=10s)*

[*] if continuously sending

minimal packets in flight for max(0.2s, 1 round trip)

17

Page 18: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Packet scheduling: when to send?

18

Page 19: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

NIC

TCP

Pacing

Fair queuing

TCP Small Queues (TSQ) TSO autosizing

link19

?

fq

Page 20: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Performance results...

20

Page 21: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

BBR vs CUBIC synthetic bulk TCP test with 1 flow, bottleneck_bw 100Mbps, RTT 100ms

Fully use bandwidth, despite high loss

21

Page 22: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Low queue delay, despite bloated buffers

22BBR vs CUBIC synthetic bulk TCP test with 8 flows, bottleneck_bw=128kbps, RTT=40ms

Page 23: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

BBR is 2-20x faster on Google WAN

● BBR used for all TCP on Google B4 ● Most BBR flows so far rwin-limited

○ max RWIN here was 8MB (tcp_rmem[2])

○ 10 Gbps x 100ms = 125MB BDP

● after lifting rwin limit

○ BBR 133x faster than CUBIC

23

Page 24: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Conclusion

Algorithms and architecture in Linux TCP have evolved

● Maximizing BW, minimizing queue, and one-RTT recovery (BBR, RACK) ● Based on groundwork of a high-performance packet scheduler

(fq/pacing/tsq/tso-autosizing)● Orders of magnitude higher bandwidth and lower latency

Next Google, YouTube, and... the Internet?

● Help us make them better! https://groups.google.com/forum/#!forum/bbr-dev

24

Page 25: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Backup slides...

26

Page 26: Making Linux TCP Fast - NetDev confnetdevconf.org/1.2/slides/oct5/04_Making_Linux_TCP_Fast_netdev_1… · Conclusion Algorithms and architecture in Linux TCP have evolved Maximizing

Confidential + Proprietary

BBR convergence dynamics

Converge by sync'd PROBE_RTT + randomized cycling phases in PROBE_BW

● Queue (RTT) reduction is observed by every (active) flow● Elephants yield more (multiplicative decrease) to let mice grow

bw = 100 Mbit/secpath rtt = 10ms