MultiPath TCP: Linux Kernel implementation
Presenter: Christoph Paasch
IP Networking Lab
Université catholique de Louvain
August 22, 2012
http://mptcp.info.ucl.ac.be
IP Networking Lab http://mptcp.info.ucl.ac.be 1 / 32
Networks are becoming Multipath
Mobile devices can connect to the Internet via different interfaces
[Figure: a mobile device reaching the Internet over 3G and WiFi.]
Data-centers have a large redundant infrastructure
However, the protocols are single-path
TCP is used for 95% of the Internet communications
TCP identifies connections by the 5-tuple
[Figure: a regular TCP connection between IP 192.168.1.1, port 42424 and IP 4.4.4.4, port 80.]
A single TCP connection cannot be used across different interfaces.
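To make the 5-tuple limitation concrete, here is a minimal Python sketch of a connection table keyed by the 5-tuple. The class names and the second source address (10.0.0.7) are made up for illustration; only the WiFi-side addresses and ports come from the slide.

```python
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class ConnectionTable:
    """Toy demultiplexer: established connections keyed by their 5-tuple."""

    def __init__(self) -> None:
        self._conns: Dict[FiveTuple, str] = {}

    def establish(self, key: FiveTuple, name: str) -> None:
        self._conns[key] = name

    def lookup(self, key: FiveTuple) -> Optional[str]:
        return self._conns.get(key)


table = ConnectionTable()
wifi = FiveTuple("tcp", "192.168.1.1", 42424, "4.4.4.4", 80)
table.establish(wifi, "download over WiFi")

# Same ports and same server, but the client now sends from another
# interface (10.0.0.7 is a hypothetical 3G address).
cellular = wifi._replace(src_ip="10.0.0.7")

print(table.lookup(wifi))      # the original connection is found
print(table.lookup(cellular))  # None: no connection matches this 5-tuple
```

A segment sent from the second interface carries a different source address, so the lookup misses and the existing connection cannot continue there.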
End-hosts don’t use the multipath network
Smartphones have to restart their data-transfer when moving away from the WiFi access-point.
Collisions in data-centers reduce the bandwidth and result in suboptimal load-balancing
Mismatch between the multipath network and the single-path transport protocol.
Previous work
Transport layer
  SCTP
    Needs modifications in the applications
    Does not pass through most middleboxes/firewalls
Network layer
  Mobile IP(v6), shim6, HIP, ...
    Some are only designed for IPv6
    Do not pass through middleboxes/firewalls
    Are hard to deploy
MultiPath TCP
Runs with unmodified applications
Works over today’s Internet
IPv4/IPv6 are both supported (even simultaneously)
Standard Stream Socket API
Multiple TCP subflows to pass middleboxes
[Figure: protocol stack. The Application Layer uses the standard Socket API. In the Transport Layer, MultiPath TCP runs on top of several TCP subflows. In the Network Layer, the subflows use the WiFi, Wired and 3G interfaces over IPv4 or IPv6.]
Is the server MPTCP-capable?
SYN + MP_CAPABLE
SYN+ACK + MP_CAPABLE
Create new subflows
SYN + MP_JOIN
SYN+ACK + MP_JOIN
Separate sequence-number space
SubSeq: 1, DataSeq: 1000
SubSeq: 200, DataSeq: 1001
SubSeq: 2, DataSeq: 1002
[Figure: Subflow 1, Subflow 2 and the Data-level each keep their own write_seq, snd_cwnd and rcv_nxt.]
Cross-subflow reinjection
SubSeq: 1, DataSeq: 1000
SubSeq: 200, DataSeq: 1001
SubSeq: 2, DataSeq: 1002
SubSeq: 3, DataSeq: 1001
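The two sequence-number spaces and the reinjection can be sketched in a few lines of Python. This is a simplified model, not the kernel code: one segment consumes one sequence number (real TCP counts bytes), the class names are invented, and assigning SubSeq 1, 2, 3 to Subflow 1 and SubSeq 200 to Subflow 2 is an assumption read off the slide's numbers.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Mapping = Tuple[int, int]  # (SubSeq, DataSeq)


@dataclass
class Subflow:
    name: str
    next_subseq: int                       # subflow-level sequence space
    sent: List[Mapping] = field(default_factory=list)

    def send(self, dataseq: int) -> Mapping:
        mapping = (self.next_subseq, dataseq)
        self.next_subseq += 1
        self.sent.append(mapping)
        return mapping


@dataclass
class MetaConnection:
    next_dataseq: int                      # data-level sequence space

    def push(self, subflow: Subflow) -> Mapping:
        """Send new data: consumes one DataSeq and one SubSeq."""
        mapping = subflow.send(self.next_dataseq)
        self.next_dataseq += 1
        return mapping

    def reinject(self, dataseq: int, subflow: Subflow) -> Mapping:
        """Resend old data on another subflow: only a new SubSeq is consumed."""
        return subflow.send(dataseq)


meta = MetaConnection(next_dataseq=1000)
sf1 = Subflow("Subflow 1", next_subseq=1)
sf2 = Subflow("Subflow 2", next_subseq=200)

first = meta.push(sf1)             # (1, 1000)
second = meta.push(sf2)            # (200, 1001)
third = meta.push(sf1)             # (2, 1002)
again = meta.reinject(1001, sf1)   # (3, 1001)
print(first, second, third, again)
```

The reinjected segment keeps its DataSeq, so the receiver can recognise it as the same data regardless of the subflow it arrives on.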
MultiPath TCP
Linux Kernel Implementation
Available at http://mptcp.info.ucl.ac.be
Establishing first subflow
Exchanged Messages
SYN + MP_CAPABLE
High-Level Kernel design
Connection Establishment: Is the Peer MPTCP-Capable?
socket(AF_INET, SOCK_STREAM, 0); connect(...);
[Figure: the Application Layer reaches, through the standard Socket API, a single TCP subflow (struct tcp_sock) in the Transport Layer, above the Network Layer.]
In-depth call-stack
socket(AF_INET, SOCK_STREAM, 0); connect(...);
tcp_v4_connect(...);
tcp_connect(...);
tcp_connect_init(...);
Generate a new key for this connection
create SYN
tcp_transmit_skb(...);
tcp_syn_options(...);
mptcp_syn_options(...);
tcp_options_write(...);
mptcp_options_write(...);
pass packet to IP-stack
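The call stack starts from an ordinary socket() and connect(). As a reminder of what the application side looks like, here is a small Python sketch of a plain TCP client and server on loopback. Nothing in it is MPTCP-specific; whether the kernel underneath negotiates MP_CAPABLE is outside what this code shows. The loopback address and the payload are arbitrary choices for the example.

```python
import socket

# Plain TCP listener on loopback; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

# The client side is what the slide shows: socket() followed by connect().
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))

conn, _ = server.accept()
client.sendall(b"hello")

# Read until the five bytes have arrived or the peer closes.
received = b""
while len(received) < 5:
    chunk = conn.recv(5 - len(received))
    if not chunk:
        break
    received += chunk

for s in (conn, client, server):
    s.close()

print(received)
```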
Exchanged Messages
SYN+ACK + MP_CAPABLE
In-depth call-stack
tcp_v4_rcv(...);
tcp_v4_conn_request(...);
tcp_openreq_init(...);
* Store remote-key in mptcp_request_sock
* Generate unique local key
SYN
create the request-socket
tcp_v4_send_synack(...);
mptcp_synack_options(...);
SYN/ACK
High-Level Kernel design
Creating the Meta-socket and linking the application to this socket.
[Figure: below the standard Socket API, the MultiPath TCP Meta-socket sits on top of the TCP subflow (struct tcp_sock).]
In-depth call-stack
tcp_v4_rcv(...);
tcp_v4_do_rcv(...);
tcp_rcv_state_process(...);
tcp_rcv_synsent_state_process(...);
mptcp_alloc_mpcb(...);
create the meta-socket
mptcp_add_sock(...);
attach master to meta
SYN/ACK
Exchanged Messages
ACK + MP_CAPABLE
In-depth call-stack
ACK
tcp_v4_rcv(...);
tcp_v4_do_rcv(...);
tcp_v4_hnd_req(...);
tcp_check_req(...);
Handles the third ACK of the 3-way handshake
tcp_v4_syn_recv_sock(...);
Creates the master-socket
mptcp_check_req_master(...);
Creates the meta-socket
Initializes everything for MPTCP
Establishing additional subflows
Exchanged Messages
SYN + MP_JOIN
High-Level Kernel design
An additional subflow is being created
[Figure: the MultiPath TCP Meta-socket sits on top of two TCP subflows, each a struct tcp_sock.]
In-depth call-stack
mptcp_init4_subsockets(...);
tp->mptcp->slave_sk = 1;
connect(...);
mptcp_add_sock(...);
Creates a new IPv4 subflow
Thanks to slave_sk, an MP_JOIN is added
tcp_v4_connect(...);
tcp_connect(...);
SYN
Attach subflow to meta-sk
Exchanged Messages
SYN+ACK + MP_JOIN
In-depth call-stack
tcp_v4_rcv(...);
SYN
tcp_v4_send_synack(...);
mptcp_lookup_join(...);
SYN/ACK
Look for the MP_JOIN in a SYN
and see if we know this token
mptcp_v4_do_rcv(...);
mptcp_v4_join_request(...);
Similar to tcp_v4_conn_request()
Generate the truncated MAC
Exchanged Messages
ACK + MP_JOIN
In-depth call-stack
tcp_v4_rcv(...);
ACK
mptcp_check_req_child(...);
Verify the hmac of the ACK
mptcp_syn_recv_sock(...);
Look for a request-sock
mptcp_v4_do_rcv(...);
tcp_check_req(...);
mptcp_add_sock(...);
Attach subflow to meta-sk
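The token lookup, the truncated MAC in the SYN/ACK and the HMAC check on the final ACK can be illustrated as follows. The slides do not spell out the formulas; this sketch assumes the scheme standardised in RFC 6824 (token = first 32 bits of SHA-1 of the key, HMAC-SHA1 keyed with both keys over both nonces, 64-bit truncation in the SYN/ACK). The function names are invented for the example and do not mirror the kernel's.

```python
import hashlib
import hmac
import os


def token_from_key(key: bytes) -> bytes:
    """Assumed RFC 6824 rule: most significant 32 bits of SHA-1(key)."""
    return hashlib.sha1(key).digest()[:4]


def join_mac(key_local: bytes, key_remote: bytes,
             nonce_local: bytes, nonce_remote: bytes) -> bytes:
    """HMAC-SHA1 keyed with local||remote key over local||remote nonce."""
    return hmac.new(key_local + key_remote,
                    nonce_local + nonce_remote,
                    hashlib.sha1).digest()


# 64-bit keys exchanged earlier in the MP_CAPABLE handshake.
key_a, key_b = os.urandom(8), os.urandom(8)

# Server side: connections indexed by the token derived from its own key.
connections = {token_from_key(key_b): (key_b, key_a)}

# SYN + MP_JOIN: the client sends the server's token and a nonce.
nonce_a = os.urandom(4)
token_known = token_from_key(key_b) in connections

# SYN/ACK + MP_JOIN: the server replies with its nonce and a truncated MAC.
nonce_b = os.urandom(4)
synack_mac = join_mac(key_b, key_a, nonce_b, nonce_a)[:8]

# ACK + MP_JOIN: the client sends its full MAC, which the server verifies.
ack_mac = join_mac(key_a, key_b, nonce_a, nonce_b)
ack_valid = hmac.compare_digest(
    ack_mac, join_mac(key_a, key_b, nonce_a, nonce_b))

# An attacker who does not know key_a cannot produce a matching MAC.
forged = join_mac(os.urandom(8), key_b, nonce_a, nonce_b)
print(token_known, ack_valid, hmac.compare_digest(forged, ack_mac))
```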
Sending Data
Exchanged Messages
SubSeq: 1, DataSeq: 1000
SubSeq: 200, DataSeq: 1001
SubSeq: 2, DataSeq: 1002
[Figure: Subflow 1, Subflow 2 and the Data-level each keep their own write_seq, snd_cwnd and rcv_nxt.]
High-Level Kernel design
write(sock, &buf, size);
[Figure: the application writes through the standard Socket API into the send-queue of the MultiPath TCP Meta-socket (segments 1 to 5). The MultiPath TCP Scheduler distributes these segments over the send-queues of the Master subsocket and the Slave subsocket, one per TCP subflow.]
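The role of the scheduler can be sketched as below. The slides do not state which policy the scheduler applies; this example assumes a "lowest RTT among subflows with free congestion window" rule, and the RTT and window values are invented. It also simply moves segments out of the meta send-queue, leaving aside any copy kept at the meta level for reinjection.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class Subflow:
    name: str
    srtt_ms: float            # smoothed RTT (hypothetical value)
    cwnd: int                 # segments allowed in flight
    send_queue: Deque[int] = field(default_factory=deque)

    def has_room(self) -> bool:
        return len(self.send_queue) < self.cwnd


def pick_subflow(subflows: List[Subflow]) -> Optional[Subflow]:
    """Assumed policy: lowest RTT among subflows whose window is not full."""
    candidates = [sf for sf in subflows if sf.has_room()]
    return min(candidates, key=lambda sf: sf.srtt_ms, default=None)


def schedule(meta_queue: Deque[int], subflows: List[Subflow]) -> None:
    while meta_queue:
        sf = pick_subflow(subflows)
        if sf is None:
            break             # every window is full: wait for ACKs
        sf.send_queue.append(meta_queue.popleft())


meta_queue = deque([1, 2, 3, 4, 5])
master = Subflow("master", srtt_ms=20.0, cwnd=2)
slave = Subflow("slave", srtt_ms=80.0, cwnd=2)
schedule(meta_queue, [master, slave])

print(list(master.send_queue), list(slave.send_queue), list(meta_queue))
```

With these made-up parameters the faster subflow is filled first, the slower one takes the overflow, and segment 5 stays in the meta send-queue until a window opens.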
Receiving Data
Packets can be reordered at the data-level due to delay-differences.
SubSeq: 1, DataSeq: 1000
SubSeq: 200, DataSeq: 1001
SubSeq: 2, DataSeq: 1002
[Figure: at the MPTCP-Level, DataSeq 1000 is in the receive-queue. DataSeq 1002 arrives before DataSeq 1001 and waits in the out-of-order queue while DataSeq 1001 is still in flight.]
A loss at the subflow-level (or network-reordering) can also cause reordering at the subflow-level
SubSeq: 3, DataSeq: 1003 (packet dropped)
SubSeq: 4, DataSeq: 1004
Subflow-level out-of-order queues are necessary to handle the retransmission at the subflow-level
[Figure: SubSeq 4 (DataSeq 1004) waits in the Subflow-Level out-of-order queue.]
High-Level Kernel design
read(sock, &buf, size);
[Figure: each TCP subflow (Master subsocket and Slave subsocket) has its own ofo-queue. The MultiPath TCP Meta-socket has an ofo-queue and a receive-queue, from which the application reads through the standard Socket API.]
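The two levels of reordering on the receive side can be modelled with the same small queue used twice: once per subflow, keyed by SubSeq, and once at the meta level, keyed by DataSeq. This is a toy model under simplifying assumptions (one sequence number per segment, invented class names, and an assumed split of SubSeq 1 to 4 on one subflow and SubSeq 200 on the other), not the kernel's data structures.

```python
from typing import Dict, List


class ReorderQueue:
    """Releases items in sequence order; early arrivals wait in an ofo-queue."""

    def __init__(self, rcv_nxt: int) -> None:
        self.rcv_nxt = rcv_nxt
        self.ofo: Dict[int, int] = {}

    def receive(self, seq: int, item: int) -> List[int]:
        if seq < self.rcv_nxt or seq in self.ofo:
            return []                      # duplicate, e.g. a reinjection
        self.ofo[seq] = item
        released = []
        while self.rcv_nxt in self.ofo:
            released.append(self.ofo.pop(self.rcv_nxt))
            self.rcv_nxt += 1
        return released


class MptcpReceiver:
    def __init__(self, data_rcv_nxt: int, subflow_rcv_nxt: Dict[str, int]) -> None:
        self.meta = ReorderQueue(data_rcv_nxt)
        self.subflows = {name: ReorderQueue(nxt)
                         for name, nxt in subflow_rcv_nxt.items()}

    def receive(self, subflow: str, subseq: int, dataseq: int) -> List[int]:
        """Returns the DataSeqs that become readable by the application."""
        readable: List[int] = []
        # First restore order within the subflow, then at the data level.
        for ds in self.subflows[subflow].receive(subseq, dataseq):
            readable.extend(self.meta.receive(ds, ds))
        return readable


rx = MptcpReceiver(1000, {"sf1": 1, "sf2": 200})
a = rx.receive("sf1", 1, 1000)     # in order at both levels
b = rx.receive("sf1", 2, 1002)     # held in the meta-level ofo-queue
c = rx.receive("sf2", 200, 1001)   # fills the gap, releases 1001 and 1002
d = rx.receive("sf1", 4, 1004)     # SubSeq 3 was dropped: held at subflow level
e = rx.receive("sf1", 3, 1003)     # retransmission releases 1003 and 1004
print(a, b, c, d, e)
```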
MultiPath TCP
Performance Evaluation
MPTCP on Amazon EC2
2-8 paths available between hosts not on the same machine/rack
ECMP load-balancing
40 medium CPU instances running MPTCP
During 12 hours, all-to-all iperf with:
TCP
MPTCP (2 subflows)
MPTCP (4 subflows)
[Figure: throughput (Mb/s, 0 to 1000) versus flow rank (0 to 3000) for TCP, MPTCP with 2 subflows and MPTCP with 4 subflows.]
MultiPath TCP on a Smartphone/Mobile Node
Smartphones have multiple interfaces (WiFi/3G)
MultiPath TCP can be a benefit for these devices
Increased Bandwidth
Increased Resilience
MultiPath TCP handover from WiFi to 3G
[Figure: DSEQ 1000 and DSEQ 1100 are sent over WiFi. After the WiFi loss, a REMOVE_ADDR is signalled, the WiFi subflow is torn down and DSEQ 1100 is reinjected over 3G.]
Regular TCP would break!
Some applications support recovering from a broken TCP connection (HTTP Range header)
Thanks to the REMOVE_ADDR option, MPTCP is able to handle this without the need for application support.
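The handover can be modelled from the data sender's point of view: when an address is removed, the subflow using it is torn down and whatever was still unacknowledged on it is reinjected on a remaining subflow. The sketch below is a toy model with invented names; it assumes DSEQ 1000 had been acknowledged before the WiFi loss, which is consistent with the slide showing only 1100 being reinjected but is not stated there.

```python
from typing import Dict, List


class DataSender:
    """Toy data-level sender tracking unacknowledged DSEQs per subflow."""

    def __init__(self, subflows: List[str]) -> None:
        self.unacked: Dict[str, List[int]] = {name: [] for name in subflows}

    def send(self, subflow: str, dseq: int) -> None:
        self.unacked[subflow].append(dseq)

    def data_ack(self, dseq: int) -> None:
        # A data-level ACK clears the segment on whichever subflow carried it.
        for pending in self.unacked.values():
            if dseq in pending:
                pending.remove(dseq)

    def on_remove_addr(self, dead: str, fallback: str) -> List[int]:
        """Tear down `dead` and reinject its unacknowledged data on `fallback`."""
        orphaned = self.unacked.pop(dead)
        for dseq in orphaned:
            self.send(fallback, dseq)
        return orphaned


sender = DataSender(["wifi", "3g"])
sender.send("wifi", 1000)
sender.send("wifi", 1100)
sender.data_ack(1000)                        # assumed: acked before the loss
reinjected = sender.on_remove_addr("wifi", "3g")

print(reinjected)
print(sender.unacked)
```

The application never sees the teardown: the byte stream continues on the 3G subflow with the same data-level sequence numbers.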
[Figure: goodput (Mbps, 0 to 9) versus time relative to the Wi-Fi loss (-1.0 to 3.0 seconds), comparing Full MPTCP with Application Handover. The Wi-Fi loss is marked at time 0.]
Flow-to-core affinity
Individual TCP-flows are steered to the same CPU-core to avoid reordering inside the receive-code.
MPTCP has lots of L1/L2 cache-misses because the individual subflows are steered to different CPU-cores.
MPTCP-aware Receive-Flow-Steering steers all subflows to the same CPU-core.
[Figure: two CPUs with two cores each (CPU 0: Core 0 and Core 1; CPU 1: Core 2 and Core 3). Each core has its own L1 cache, each CPU has an L2 cache, and both CPUs share the DRAM.]
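The difference between per-flow steering and MPTCP-aware steering can be illustrated with a hash-based core selection. The slides only say that all subflows end up on the same core; keying the hash on a connection token, the CRC32 hash, the token value and the extra addresses below are all assumptions made for this sketch.

```python
import zlib
from typing import Tuple

NUM_CORES = 4  # matches the four cores in the slide's diagram

FourTuple = Tuple[str, int, str, int]


def core_per_flow(flow: FourTuple) -> int:
    """Plain steering: each subflow is hashed on its own addresses and ports."""
    return zlib.crc32(repr(flow).encode()) % NUM_CORES


def core_per_connection(token: int) -> int:
    """MPTCP-aware steering (assumed): hash a value shared by all subflows."""
    return zlib.crc32(token.to_bytes(4, "big")) % NUM_CORES


token = 0x1234ABCD  # hypothetical connection token
subflows = [
    ("192.168.1.1", 42424, "4.4.4.4", 80),
    ("10.0.0.7", 51000, "4.4.4.4", 80),
    ("10.0.1.9", 51001, "4.4.4.4", 80),
]

plain = [core_per_flow(sf) for sf in subflows]
aware = [core_per_connection(token) for _ in subflows]

print("per-flow steering:    ", plain)
print("MPTCP-aware steering: ", aware)
```

Per-flow hashing may scatter the subflows of one connection over several cores, while hashing a connection-wide value always yields a single core.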
Flow-to-core affinity - 10 Gbps interfaces
[Figure: average goodput in Gbps (0 to 14) versus number of iperf-sessions (0 to 70), for MPTCP and for MPTCP-aware RFS.]
Conclusion
MultiPath TCP increases the bandwidth and allows seamless mobile handover.
MultiPath TCP can be used with unmodified applications, over today’s Internet.
Freely available at http://mptcp.info.ucl.ac.be
Download it, try it out, contribute!
UCLouvain MPTCP-Team:
Christoph Paasch
Gregory Detal
Fabien Duchene
Prof. Olivier Bonaventure
Thanks to our previous and present partners/contributors: