Michio Honda (NEC Laboratories Europe)
With acknowledgment to Lars Eggert and Douglas Santry
IIJ-II Seminar
December 26th, Tokyo, Japan
Part 1: PASTE: Network Stacks Must Integrate with NVMM Abstractions
Part 2: Report on ACM HotNets 2016
*work done
https://www.hpe.com/us/en/servers/persistent-memory.html
Motivation
• Non-Volatile Main Memories (NVMMs)
  • Persistent
  • Byte-addressable
  • Low latency
    • 10s-1000s of ns
• Shift from block- to byte-granularity persistency
  • OS abstractions
    • Direct access to mmap()-ed files
  • Data structures
    • Filesystems and databases
What are the implications for networking?
Case Study: Write-Ahead Logging
• Persist client's request prior to acknowledgment
  • Durably store data into a log file to hide from the client the overhead of updating the primary database (e.g., B-tree)
• 1KB commit
  • 2030 us
    • Networking takes 40 us
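[Figure: request path in steps (1)-(5); components shown: client, NIC, network stack, DRAM, App, storage stack, SSD/Disk. The App calls read() to receive the request and write()/fsync() or memcpy()/msync() to persist it.]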
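The conventional path on this slide can be sketched as follows. This is a minimal illustration and not the measured server: a local socket pair stands in for the client connection, a temporary file stands in for the log, and `LOG_SIZE`, `log_request`, and `demo_copy_logging` are hypothetical names. Python's `mmap.flush()` plays the role of msync().

```python
import mmap
import os
import socket
import tempfile

LOG_SIZE = 1 << 16  # hypothetical log capacity in bytes


def log_request(conn, log, offset):
    """Persist one request with read() + memcpy() + msync(), then acknowledge."""
    data = conn.recv(2048)                    # read(): kernel buffer -> app buffer
    log[offset:offset + len(data)] = data     # memcpy(): app buffer -> mmap()-ed log
    log.flush()                               # msync(): push the log to the file
    conn.sendall(b"ACK")                      # acknowledge only after persisting
    return offset + len(data)


def demo_copy_logging():
    client, server = socket.socketpair()      # stand-in for a client connection
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")   # stand-in for a real log file
        with open(path, "w+b") as f:
            f.truncate(LOG_SIZE)
            log = mmap.mmap(f.fileno(), LOG_SIZE)
            client.sendall(b"x" * 1024)       # 1KB commit
            end = log_request(server, log, 0)
            ack = client.recv(3)
            log.close()
        with open(path, "rb") as f:
            persisted = f.read(end)
    client.close()
    server.close()
    return end, ack, persisted
```

Note that the payload is copied twice after it arrives: once into the application buffer and once into the log.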
Case Study: Write-Ahead Logging
• Persist client's request prior to acknowledgment
  • Durably store data into a log file to hide from the client the overhead of updating the primary database (e.g., B-tree)
• 1KB commit
  • 42 us with NVMM, down from roughly 2000 us with SSD/disk
    • Networking takes 40 us
    • The remaining 2 us (persisting to NVMM) is not small
[Figure: request path in steps (1)-(5); components shown: client, NIC, network stack, DRAM, App, storage stack, NVMM. NVMM is emulated using a reserved region of DRAM. The App calls read() to receive the request and write()/fsync() or memcpy()/msync() to persist it.]
Case Study: Write-Ahead Logging
• Parallel requests are serialized on each core
33% throughput decrease, 50% latency increase
Data Copies Matter
• Cache Misses
  • Copy to tmp buffer (e.g., read()) is cheap
  • Logging always happens to a different destination
[Figure: kernel buffer → read() → app buffer → memcpy() → log file (mmap()-ed)]
| Configuration | Overall cache misses | Largest contributor |
|---|---|---|
| Networking only | 0.0004% | net_rx_action() (84%) |
| Networking + NVMM (read() + memcpy() + msync()) | 4.4121% | memcpy() (98%) |
| Networking + NVMM (read() + msync()) | 8.3451% | sys_read() (99%) |
We must avoid data copy for logging!
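The read() + msync() configuration in the last table row can be sketched by receiving straight into the mmap()-ed log, which drops the memcpy() but, per the table, only moves the misses into the read path. This is an illustrative sketch under assumptions: the socket pair, the temporary log file, and the names `LOG_SIZE`, `log_request_direct`, and `demo_direct_logging` are hypothetical and not part of the measured setup.

```python
import mmap
import os
import socket
import tempfile

LOG_SIZE = 1 << 16  # hypothetical log capacity in bytes


def log_request_direct(conn, log, offset, max_len=2048):
    """Persist one request with read() + msync(): no intermediate app buffer."""
    # The memoryview slice is a writable window onto the mapped log, so the
    # kernel-to-user copy done by recv_into() lands directly in the log file.
    with memoryview(log) as view, view[offset:offset + max_len] as window:
        n = conn.recv_into(window)            # read() into the log mapping
    log.flush()                               # msync()
    return offset + n


def demo_direct_logging():
    client, server = socket.socketpair()      # stand-in for a client connection
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "wal.log")   # stand-in for a real log file
        with open(path, "w+b") as f:
            f.truncate(LOG_SIZE)
            log = mmap.mmap(f.fileno(), LOG_SIZE)
            client.sendall(b"y" * 1024)       # 1KB commit
            end = log_request_direct(server, log, 0)
            log.close()
        with open(path, "rb") as f:
            persisted = f.read(end)
    client.close()
    server.close()
    return end, persisted
```

One copy remains here (kernel buffer to log), which is the copy the slide argues must also be avoided.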
Packet Store (PASTE) Overview
• Static packet buffers on a named NVMM region
  • DMA to NVMM
• Zero-copy APIs
  • Fast logging
[Figure: request path in steps (1)-(4); components shown: client, NIC, network stack, App, storage stack, NVMM. Packet buffers live in /mnt/nvmm/pktbufs; the App writes metadata only (e.g., buffer index) to /mnt/pmem/appmd.]
Fast Logging with PASTE
• /mnt/nvmm/myapp_metadata
  • Metadata header: /mnt/nvmm/pktbufs, buf_ofs: 123
  • Metadata entries:

| buf_idx | off | len |
|---|---|---|
| 0 | 100 | 1135 |
| 1 | 100 | 932 |
| 3 | 100 | 1024 |

• /mnt/nvmm/pktbufs
  • Packet buffers (static)
• Application steps
  • (1) Read data (zero copy)
  • (2) Write metadata entry
  • (3) Flush (buffer and metadata)
[Figure: NIC ring and TCP/IP input and output in the kernel; the Application in user space reaches the packet buffers through the netmap API and mmap()]
Legend: Unread / Read or written
Legend: Flushed
• DMA is performed to L3 cache (DDIO)
• Idempotent request: unnecessary data is not flushed to DIMM
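The three application steps can be sketched as follows. This is an illustration under stated assumptions, not the PASTE or netmap API: temporary files stand in for /mnt/nvmm/pktbufs and /mnt/nvmm/myapp_metadata, `fake_dma` stands in for the NIC placing packets into static buffers, and the buffer size, entry layout (`ENTRY`), and entry-count header (`HEADER`) are hypothetical.

```python
import mmap
import os
import struct
import tempfile

BUF_SIZE = 2048                  # hypothetical size of one static packet buffer
NUM_BUFS = 4
ENTRY = struct.Struct("<III")    # buf_idx, off, len (layout is an assumption)
HEADER = struct.Struct("<I")     # number of valid entries (an assumption)


def create_and_map(path, size):
    """Create a zero-filled file of `size` bytes and map it read/write."""
    with open(path, "w+b") as f:
        f.truncate(size)
        return mmap.mmap(f.fileno(), size)


def fake_dma(pktbufs, buf_idx, off, payload):
    """Stand-in for the NIC writing a packet into a static buffer."""
    start = buf_idx * BUF_SIZE + off
    pktbufs[start:start + len(payload)] = payload


def log_entry(pktbufs, meta, buf_idx, off, length):
    """Steps (2) and (3): write a metadata entry, then flush buffer and metadata."""
    (count,) = HEADER.unpack_from(meta, 0)
    ENTRY.pack_into(meta, HEADER.size + count * ENTRY.size, buf_idx, off, length)
    HEADER.pack_into(meta, 0, count + 1)
    pktbufs.flush()              # payload stays where it arrived; no copy
    meta.flush()


def recover(pktbufs, meta):
    """Rebuild the logged payloads from the metadata entries alone."""
    (count,) = HEADER.unpack_from(meta, 0)
    payloads = []
    for i in range(count):
        buf_idx, off, length = ENTRY.unpack_from(meta, HEADER.size + i * ENTRY.size)
        start = buf_idx * BUF_SIZE + off
        payloads.append(pktbufs[start:start + length])
    return payloads


def demo_paste_logging():
    with tempfile.TemporaryDirectory() as tmp:
        pktbufs = create_and_map(os.path.join(tmp, "pktbufs"), BUF_SIZE * NUM_BUFS)
        meta = create_and_map(os.path.join(tmp, "myapp_metadata"), 4096)
        # (buf_idx, payload, must_log); buffer 2 holds an idempotent request.
        requests = [(0, b"a" * 1135, True), (1, b"b" * 932, True),
                    (2, b"c" * 64, False), (3, b"d" * 1024, True)]
        for buf_idx, payload, must_log in requests:
            fake_dma(pktbufs, buf_idx, 100, payload)   # (1) data is read in place
            if must_log:                               # idempotent: no entry, no flush
                log_entry(pktbufs, meta, buf_idx, 100, len(payload))
        lengths = [len(p) for p in recover(pktbufs, meta)]
        pktbufs.close()
        meta.close()
    return lengths
```

Calling `demo_paste_logging()` yields the lengths of the three logged requests, matching the buf_idx 0, 1, 3 rows in the metadata table; only 12-byte entries are written per request instead of the payload itself.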
Implementation
• Extension to netmap memory allocator
  • Claim packet buffers from a given file backed by NVMM
    • e.g., pkt-gen -i eth1@/mnt/pmem/bufs -f rx
• Server app using the netmap API can easily implement logging
2016-9-12 © 2016 NetApp, Inc. All rights reserved. 21
Preliminary Results
• Implementation
  • Extend the netmap framework
  • Stackmap for TCP/IP
10-88% throughput increase, 9-46% latency reduction
Related Work
• Enhanced network stacks
  • MegaPipe (OSDI'12), Stackmap (ATC'16), Fastsocket (ASPLOS'16)
  • IX and Arrakis (OSDI'14), mTCP (NSDI'14), Sandstorm (SIGCOMM'14), MICA (NSDI'14)
  • Not NVMM-aware
• NVMM filesystems
  • BPFS (SOSP'09), NOVA (FAST'16)
• NVMM databases
  • NVWAL (ASPLOS'16), REWIND (VLDB'15), NV-Tree (FAST'15)
• NVMM filesystems and databases are not networking-aware
Conclusion
• Implications
  • Network stacks are now a bottleneck for durably storing data
  • Improving network and storage stacks in isolation is not enough
  • We need a new stack design
PASTE: Fast logging with named packet buffers on NVMM and zero-copy API
HotNets 2016 Reports
• ACM Workshop on Hot Topics in Networks
  • Focus on new ideas and future directions in networking
• November 9-10, 2016 @ Atlanta
• ~90 attendees (invitation only)
  • 1 author per paper
  • Invited people in the community
  • Lottery
Submission and Reviews
• 30 papers accepted (out of 108 submitted)
  • 3-6 reviews per paper
  • 48 papers went to PC discussion
• We submitted 2 papers
  • 1 rejected (received 4 reviews)
    • 3 weak rejects + 1 weak accept
  • 1 accepted (received 6 reviews)
    • 1 accept + 4 weak accepts + 1 weak reject
Workshop Format
• Format was awesome
  • Very friendly, like a university-internal workshop
  • Young people can be active
  • Senior people (incl. TPC members) comment and/or raise discussion
Workshop Topics
• ISPs
  • Frontier networks, traffic engineering using MPTCP, monitoring
• Resource allocation
  • ML for cluster scheduling etc., blockchain for the Internet (BGP, DNS)
• Container networking
  • RDMA-based interfaces
• Social networks and clouds
  • SaaS, recommendation, etc.
• Datacenters
  • Topology, deadlocks in RDMA networks, debugging
• Mobile
  • Low-energy consumption network stack, network personalization
• Network monitoring and analysis
  • Using programmable switches
• Wireless (MIT)
• NF modeling verification
• DDoS