gRPC Talk with Intel
TRANSCRIPT
Out of the Box Network Developers | SDN and Switching | SF Bay OpenStack
05/17/2016
Sujata Tibrewala, Network Developer Evangelist
Intel Developer Zone, Networking
software.intel.com/networking
@intelsoftware #sdnnfv
Upcoming events
OPNFV Summit, June 21st-23rd, Berlin, Germany
Red Hat Summit, June 27th-30th, San Francisco, CA
DPDK Summit, August 10-11, San Jose, CA (www.dpdksummit.com)
Intel Developer Forum, August
DPDK deep dive, July 2016
Intel Team
Edwin Verplanke: Principal Engineer
Rashmin Patel : Software Architect
Priya V Autee: Software Engineer
Google Team
Jayant Kolhe: Director of Engineering
Abhishek Kumar : Engineering Lead & Manager at Google
Introductions
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
SDI/SDN/NFV/OpenStack | RDT | gRPC/EPA
SDN
OpenFlow, ODP, and ForCES (Forwarding and Control Element Separation) all perform similar functions at a high level:
● Separation of control and data plane
● Centralized management
● Programmable network behavior via well-defined interfaces
[Diagram: gRPC linking the SDN control plane with NFV platform ingredients such as DPDK, RDT, and QuickAssist]
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
Software Defined Infrastructure: 10,000 feet
[Diagram: SDI spans the Enterprise, Cloud Service Providers, and Communications Service Providers' Communication Infrastructure Cloud; each platform combines processor, DRAM, last level cache, crypto/compression, interconnect/switch, and a soft switch with packet-processing SW optimizations]
• Optimized I/O Access (Data Plane Development Kit)
• Intel® QuickAssist Technology for Crypto and Compression Acceleration
• Virtualization Technology Enhancements (Posted Interrupts, Page-Modification Logging)
• Intel® Resource Director Technology (CMT, CAT, CDP, MBM)
Services Deployment on SDI: 5,000 feet
[Diagram: services* deployed across Intel® Xeon® Processor E5 v4 servers, making service calls to one another]
Flexibility, Scalability, Service Agility, Resource Utilization
*can be a Process/Container/Pod/VM using a CPU core
NFV - Packet Pipeline on IA: 100 feet
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK, Intel® RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
Orchestration Support
Service/API Support
Security Policy
Scheduler Policy
SW/FW Compatibility
Threading Model
Quality of Service
Shared Memory Access
Optimized I/O Access
SDI Platform Ingredients
[Diagram: Intel® Xeon® Processor E5 v4 (Cores, HT, VT, RDT, Memory Controller, NIC, Crypto) plus platform SW/FW ingredients (DPDK, QAT, OS kernel optimizations) expose standard service semantics (openssl libcrypto, OVS, Hyperscan, ...) to an orchestrator]
DPDK – Data Plane Development Kit, QAT – Quick Assist Technology, RDT – Resource Director Technology, VT – Virtualization Technology, HT – Hyper-Threading Technology, OVS – Open vSwitch, NFV – Network Function Virtualization, SFC – Service Function Chaining
Optimized Packet I/O API
Software solution for accelerating Packet Processing workloads on Intel® Architecture
• Delivers 25X performance jump over Linux*
• Comprehensive Virtualization support
• Enjoys vibrant community support
• Free, Open Source, BSD License
Disclaimer: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Packet Processing Performance
Data Plane Development Kit (DPDK)
Process a batch of packets during each software iteration and amortize the access cost over multiple packets
For memory access, use HW- or SW-controlled prefetching. For PCIe access, use Data Direct I/O to write data directly into cache
Use access schemes that reduce the amount of sharing (e.g. lockless queues for message passing)
Page tables are constantly evicted (DTLB thrashing) – allow Linux to use Huge Pages (2 MB, 1 GB)
Switch from an interrupt-driven network device driver to a polled-mode driver (see the sketch after this slide)
DPDK - Data Plane Development Kit
DPDK Overview
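To make the batching and poll-mode points concrete, here is a minimal sketch (in C-style C++) of the receive loop a DPDK application is built around. It assumes port 0 has already been configured and started via rte_eth_dev_configure()/rte_eth_dev_start(), and error handling is trimmed.

#include <cstdlib>
#include <rte_eal.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

static const uint16_t BURST_SIZE = 32;  // amortize per-packet access cost over a batch

int main(int argc, char **argv) {
    // EAL init maps huge pages (avoiding DTLB thrashing) and probes poll-mode drivers.
    if (rte_eal_init(argc, argv) < 0)
        return EXIT_FAILURE;

    struct rte_mbuf *bufs[BURST_SIZE];
    for (;;) {
        // Poll the NIC instead of taking an interrupt per packet.
        uint16_t nb_rx = rte_eth_rx_burst(0 /*port*/, 0 /*queue*/, bufs, BURST_SIZE);
        for (uint16_t i = 0; i < nb_rx; i++) {
            // ... process the packet; prefetching the next mbuf hides memory latency ...
            rte_pktmbuf_free(bufs[i]);
        }
    }
}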
Multi-Architecture/NIC Support (dpdk.org)
DPDK Example Apps
Bond, QoS Sched, Link Status Interrupt, L3fwd, Load Balancer, KNI, IPv4 Multicast, L2fwd Keep Alive, Packet Distrib, IP Pipeline, Hello World, Exception Path, L2fwd Jobstats, L2fwd IVSHMEM, Timer, IP Reass, VMDq DCB, PTP Client, Packet Ordering, CLI, Multi Process, Ethtool, L3fwd VF, IP Frag, QoS Meter, L2fwd, Perf Thread, L2fwd Crypto, RxTx Callbacks, Quota & W'mark, Skeleton, TEP Term, Vhost, VM Power Manager, VMDq, L3fwd Power, L3fwd ACL, Netmap, Vhost Xen, QAT, L2fwd CAT, IPsec Sec GW
A fully open source (BSD licensed) software project with a strong dev community
Website: http://dpdk.org
Git: http://dpdk.org/browse/dpdk/
Current Infrastructure Support
Intel® Ethernet Network Adapter
* driver patch available
[Diagram: DPDK guest support across hypervisors —
Xen: e1000 device model (E1000_eth_pmd), grant-table shared memory with enq/deq, SR-IOV VF_pmd
KVM: ivshmem shared memory, vhost with Virtio_pmd, e1000 device model, SR-IOV VF_pmd
VMware ESXi: VMware vSwitch with VMXNET3 paravirtual interface (VMXNET3_pmd), e1000 device model, SR-IOV VF_pmd
Microsoft Hyper-V: extensible vSwitch with synthetic NIC driver (DPDK*), DEC 21140 device model with Linux drivers, SR-IOV VF driver]
Shared Resource Contention
• Last Level Cache is shared to make best use of the resources in the platform
• However, certain types of applications can cause noise and slow down others
• Applications that are streaming in nature can cause excessive LLC evictions, leading to up to 51% throughput degradation for network workloads
[Diagram: VMs on an Intel® Xeon® Processor E5 v4 under a Virtual Machine Monitor, contending for the shared Last Level Cache, memory, network I/O, and crypto I/O]
Solution: Intel® Resource Director Technology
Building on a rich and growing portfolio of technologies embedded in Intel silicon
Intel® Resource Director Technology (Intel® RDT)
[Diagram: apps on cores sharing the Last Level Cache and DRAM]
• Identify misbehaving applications and reschedule according to priority
• Cache Occupancy reported on a per Resource Monitoring ID (RMID) basis – Advanced Telemetry
Cache Monitoring Technology (CMT)
Cache Allocation Technology (CAT)
• Last Level Cache partitioning mechanism enabling separation and prioritization of apps or VMs
• Misbehaving threads can be isolated to increase determinism
Memory Bandwidth Monitoring (MBM)
• Monitors Memory Bandwidth consumption on per thread/core/app basis
• Shares common RMID architecture -- Telemetry
• Provides insight into second order of shared resource contention
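As a concrete illustration of the RMID-based telemetry described above, here is a minimal sketch that polls the LLC occupancy of one core using the standalone PQoS library from intel-cmt-cat (linked later in this deck). Function and field names follow a recent library release and may differ in older ones; error handling is trimmed.

#include <cstdio>
#include <cstring>
#include <unistd.h>
#include <pqos.h>

int main() {
    struct pqos_config cfg;
    std::memset(&cfg, 0, sizeof(cfg));
    cfg.fd_log = 1;  // log library messages to stdout
    if (pqos_init(&cfg) != PQOS_RETVAL_OK)
        return 1;

    // Attach an RMID to core 2 and monitor its LLC occupancy (CMT).
    unsigned core = 2;
    struct pqos_mon_data group;
    std::memset(&group, 0, sizeof(group));
    if (pqos_mon_start(1, &core, PQOS_MON_EVENT_L3_OCCUP, NULL, &group) != PQOS_RETVAL_OK)
        return 1;

    struct pqos_mon_data *groups[] = { &group };
    for (int i = 0; i < 10; i++) {
        sleep(1);
        pqos_mon_poll(groups, 1);  // refresh hardware counters
        std::printf("core %u LLC occupancy: %llu bytes\n",
                    core, (unsigned long long)group.values.llc);
    }

    pqos_mon_stop(&group);
    pqos_fini();
    return 0;
}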
Intel® RDT - University of California, Berkeley
http://span.cs.berkeley.edu
[Diagram: a load generator drives Ethernet traffic into four VMs (EndRE, IPSec, MazuNAT, Snort) running under a Qemu Virtual Machine Monitor on an Intel® Xeon® processor E5-2695 v4, all sharing the LLC]
• UCB has been researching the applicability of Intel® Resource Director Technology in edge devices
• Research focuses on maintaining Quality of Service while consolidating a variety of network-centric workloads
Core (ASIC-based, MPLS-like): handles scalable basic connectivity (resilience, load balancing, anycast, mcast, ...)
SDN Controller
Support for 3rd party services: partially at edge, partially in cloud
Edge Devices (x86, hybrid): handle all complex processing (NFV, NetVirt, ...)
Intel® Resource Director Technology
Intel® RDT - University of California, Berkeley
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Min packet size (64 bytes), 100K flows, uniformly distributed
• LLC contention causes up to 51% performance degradation in throughput
[Chart: max % throughput degradation, normalized]
http://span.cs.berkeley.edu
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Configurations: see slide 28. For more complete information, visit http://www.intel.com/performance/datacenter.
Intel® RDT - University of California, Berkeley
[Diagram repeated from the previous slide; chart: max % throughput degradation, normalized]
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Min packet size (64 bytes), 100K flows, uniformly distributed
• The VM under test is isolated utilizing CAT; 2 ways of the LLC are associated with the network function. Isolation causes only ~2% variation
http://span.cs.berkeley.edu
Intel® RDT - University of California, Berkeley
• Network functions execute simultaneously on isolated cores; the throughput of each virtual machine is measured
• Min packet size (64 bytes), 100K flows, uniformly distributed
[Chart: LLC latency in microseconds (log scale)]
http://span.cs.berkeley.edu
[Diagram: Intel® RDT support on Intel® Xeon® Processor E5 v4 with Intel® RDT —
User interface: cgroup fs (/sys/fs/cgroup/intel_rdt) for threads, perf / syscall(perf_event_open), and the standalone PQoS library
Kernel space: cache allocation (configure bitmask per CLOS; set CLOS/RMID for a thread during context switch) and cache/memory monitoring via perf (read event counter, read monitored data)
Hardware: MSR/CPUID driver]
Intel® RDT Software Enabling Approaches
Broad Platform Awareness Enabling
• Linux cgroup/perf/libvirt enabling
cgroup: https://github.com/fyu1/linux/tree/cat16.1/
Perf: CMT mainstream (v4.1) and MBM mainstream (v4.6-rc1)
Libvirt patches: https://www.redhat.com/archives/libvir-list/2016-January/msg01264.html
• Standalone Intel® RDT API available (01.org): https://github.com/01org/intel-cmt-cat
• DPDK API (dpdk.org) Intel® RDT enabling
examples/l2fwd-cat: RDT CAT and CDP, example of libpqos usage (a sketch follows below)
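For a feel of the standalone API, here is a minimal sketch that partitions the LLC with CAT via libpqos, in the spirit of the l2fwd-cat example above. Names follow a recent intel-cmt-cat release (older releases spell the association call pqos_l3ca_assoc_set); error handling is trimmed.

#include <cstring>
#include <pqos.h>

int main() {
    struct pqos_config cfg;
    std::memset(&cfg, 0, sizeof(cfg));
    cfg.fd_log = 1;  // log library messages to stdout
    if (pqos_init(&cfg) != PQOS_RETVAL_OK)
        return 1;

    // Define CLOS 1 with a two-way LLC mask on socket 0.
    struct pqos_l3ca ca;
    std::memset(&ca, 0, sizeof(ca));
    ca.class_id = 1;
    ca.u.ways_mask = 0x3;  // the two lowest cache ways
    if (pqos_l3ca_set(0 /*socket*/, 1 /*num classes*/, &ca) != PQOS_RETVAL_OK)
        return 1;

    // Associate the network function's core with CLOS 1 so its cache lines
    // can only be allocated in the reserved ways.
    if (pqos_alloc_assoc_set(2 /*lcore*/, 1 /*class_id*/) != PQOS_RETVAL_OK)
        return 1;

    pqos_fini();
    return 0;
}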
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK/RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
gRPC: A multi-platform RPC system
Abhishek Kumar
@grpcio
Mobile first
Software Defined Everything
Microservices Architecture
Everything as a service
Public Cloud
Internet of Things
gRPC touches and influences each of these areas.
High level trends
Microservices at Google: O(10¹⁰) RPCs per second.
Open source on Github for C, C++, Java, Node.js, Python, Ruby, Go, C#, PHP, Objective-C
Introduction to RPC: Hello, world!
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
Example (IDL)
// Create shareable virtual connection (may have 0-to-many actual connections; auto-reconnects)
ManagedChannel channel = ManagedChannelBuilder.forAddress(host, port).build();
GreeterBlockingStub blockingStub = GreeterGrpc.newBlockingStub(channel);

HelloRequest request = HelloRequest.newBuilder().setName("world").build();
HelloReply response = blockingStub.sayHello(request);

// To release resources, as necessary
channel.shutdown();
Example (Client)
Server server = ServerBuilder.forPort(port)
    .addService(new GreeterImpl())
    .build()
    .start();
server.awaitTermination();

class GreeterImpl extends GreeterGrpc.AbstractGreeter {
  @Override
  public void sayHello(HelloRequest req, StreamObserver<HelloReply> responseObserver) {
    HelloReply reply = HelloReply.newBuilder().setMessage("Hello, " + req.getName()).build();
    responseObserver.onNext(reply);
    responseObserver.onCompleted();
  }
}
Example (Server)
Overview
Decomposing Monolithic apps
[Diagram sequence: a monolith with components A, B, C, D is progressively split into separate services that communicate over gRPC]
Polyglot Microservices Architecture
[Diagram: C++, Golang, Java, and Python services, each with a gRPC server and gRPC stubs, calling one another]
Use Cases
Build distributed applications
• In data-centers
• In public/private cloud
• High performance
• Streaming
• Millions of outstanding RPCs
• Cross-language API framework

Client-server communication
• Clients and servers across: mobile, web, cloud
• Also: embedded systems, IoT

Access Google Cloud Services
• From GCP
• From Android and iOS devices
• From everywhere else
Some of the adopters
Microservices: in data centres
Streaming telemetry from network devices
Client-server communication
@grpcio
MicroServices using gRPC
Multi-language: 10 languages, Android and iOS platforms. Idiomatic, language-specific APIs.
Ease of use and scalability: simple programming model; Protocol Buffers for interface definition, data model, and wire encoding.
Streaming and high performance: HTTP/2 framing and multiplexing with flow control; QUIC support (see the sketch below).
Layered and pluggable architecture: integrated load balancing, health checking, and tracing across services; support for different transports (HTTP/2-over-TCP, QUIC, etc.); plugin APIs for naming, stats, auth, etc.
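To illustrate the streaming support above, here is a sketch of a server-streaming handler in gRPC C++. The SayHelloStream method is a hypothetical addition to the Greeter service from the earlier examples, not something the deck defines.

// Hypothetical IDL addition: rpc SayHelloStream (HelloRequest) returns (stream HelloReply);
#include <grpc++/grpc++.h>

class GreeterImpl final : public Greeter::Service {
  grpc::Status SayHelloStream(grpc::ServerContext* context,
                              const HelloRequest* request,
                              grpc::ServerWriter<HelloReply>* writer) override {
    // One RPC, many responses, carried as HTTP/2 frames under flow control.
    for (int i = 0; i < 3; i++) {
      HelloReply reply;
      reply.set_message("Hello, " + request->name());
      writer->Write(reply);
    }
    return grpc::Status::OK;
  }
};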
Architecture
Three complete stacks: C/C++, Java and Go.
Other language implementations wrap the C-runtime libraries.
The library API surface is defined in a language-idiomatic way and hand-implemented on top of the wrapped C-runtime libraries.
The initial choice of wrapping the C runtime gives scale and performance across languages, and ease of maintenance.
Implementation across languages
[Diagram: code-generated API (application layer) over the gRPC core (framework layer) over HTTP/2 and SSL (transport adapter layer) over TCP sockets (transport layer); complete native stacks planned in C/C++, Java, Go]
Architecture: Native Implementation in Language
[Diagram: code-generated, language-idiomatic APIs for Python, Ruby, PHP, Obj-C, C#, C++, ... (application layer) as language bindings over the generic low-level API of the gRPC core in C (framework layer) over HTTP/2 and SSL (transport layer)]
Architecture: Derived Stack
Wire Implementation across languages
[Diagram: gRPC core over HTTP/2 and TLS/SSL with a code-generated API; auth architecture and API comprising a credentials API, auth-credentials implementation, and auth plugin API]
Generic mechanism for attaching metadata to requests and responses
Built into the gRPC protocol – always available
Plugin API to attach "bearer tokens" to requests for auth
OAuth2 access tokens
OIDC ID tokens
Session state for specific auth mechanisms is encapsulated in an auth-credentials object
The metadata mechanism can be used for signaling up and down the stack (see the sketch after this slide)
Metadata and Auth
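As a minimal illustration of the bearer-token mechanism above, here is a sketch using the gRPC C++ client API with the Greeter service from the earlier examples; the token value and stub wiring are assumptions.

#include <string>
#include <grpc++/grpc++.h>

// Attach an OAuth2 bearer token to a call as metadata; the key/value pair
// travels as HTTP/2 headers alongside the request.
void SayHelloWithToken(Greeter::Stub* stub, const std::string& token) {
  grpc::ClientContext context;
  context.AddMetadata("authorization", "Bearer " + token);  // metadata keys are lowercase

  HelloRequest request;
  request.set_name("world");
  HelloReply reply;
  grpc::Status status = stub->SayHello(&context, request, &reply);
}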
How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack
Key Platform Requirements for SDI
SDI Platform Ingredients: DPDK/RDT
gRPC Service Framework
Intel® RDT and gRPC service framework
SDI - Software Defined Infrastructure, NFV - Network Function Virtualization
Agenda
Platform Exposure to gRPC Endpoints
[Diagram: Java, Golang, and C++ services with gRPC stubs and gRPC servers, each running the full gRPC stack (code-generated API, gRPC core, HTTP/2, SSL, TCP sockets) on Intel® Xeon® Processor E5 v4 platforms]
gRPC stack supporting Intel® Resource Director Technology
[Diagram: the standard gRPC stack (code-generated API at the application layer, gRPC core at the framework layer, HTTP/2 and SSL at the transport adapter layer, TCP sockets at the transport layer) extended with an RDT manager: RDT options are set in metadata by the caller, extracted from metadata on the receiving side, and set on the socket via cgroup/perf and a DPDK + Intel® RDT packet I/O manager]
A hypothetical sketch of the server side of this flow follows.
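What follows is a hypothetical sketch only: the deck does not spell out this interface, so the "rdt-class" metadata key and the wiring to libpqos are assumptions used to show how the extract-and-apply step could look on the server side.

#include <cstdlib>
#include <string>
#include <sched.h>          // sched_getcpu()
#include <grpc++/grpc++.h>
#include <pqos.h>

class GreeterImpl final : public Greeter::Service {
  grpc::Status SayHello(grpc::ServerContext* context,
                        const HelloRequest* request,
                        HelloReply* reply) override {
    // Extract the hypothetical "rdt-class" option the client set in metadata.
    const auto& md = context->client_metadata();
    auto it = md.find("rdt-class");
    if (it != md.end()) {
      std::string value(it->second.data(), it->second.size());
      unsigned clos = std::strtoul(value.c_str(), nullptr, 10);
      // Associate the core serving this RPC with the requested class of service
      // (assumes pqos_init() ran at startup and CLOS masks are already configured).
      pqos_alloc_assoc_set((unsigned)sched_getcpu(), clos);
    }
    reply->set_message("Hello, " + request->name());
    return grpc::Status::OK;
  }
};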
gRPC Enhanced stack using DPDK
[Diagram: the gRPC transport layer's TCP sockets replaced with DPDK sockets and a DPDK packet I/O manager, with crypto handled by a DPDK-Crypto/QAT session]
Contacts
DPDK
Site: dpdk.org
Mailing List: http://dpdk.org/ml/listinfo/dev
Intel® RDT APIs
Site: https://01.org/packet-processing/cache-monitoring-technology-memory-bandwidth-monitoring-cache-allocation-technology-code-and-data
gRPC
Site: grpc.io
Mailing List: [email protected]
Twitter Handle: @grpcio
DPDK provides high-performance I/O for SDN/NFV workloads, has a vibrant developer community, and yields 25x the performance of the standard Linux network stack
Intel® Resource Director Technology enables developers and system admins to monitor and control shared resources
gRPC is a multi-platform RPC system with multi-language support and a high-performance, pluggable architecture for services
Summary
Site: grpc.io
Mailing List: [email protected]
Twitter Handle: @grpcio
Questions?
Backup
Intel® Resource Director Technology (Intel® RDT)
[Diagram: 4 VNFs (VMs) with a simple packet pipeline on an Intel® Xeon® processor E5-2695 v4 under a Qemu Virtual Machine Monitor, two of them "noisy neighbors"]
VNF – Virtual Network Function
Prioritizing Important Apps: without Cache Allocation Technology, LLC contention causes 38% performance degradation; performance is restored utilizing CAT
Another benefit: average latency is reduced from 36 µs to 7 µs after isolation of the noisy neighbors
Container Workload: A security sandbox performing DPI on a suspected packet stream segment
CAT Application on Containers
CAT - Cache Allocation Technology (Intel® RDT Feature)
Number of Active Containers at time t: 50/100/150
Each Container processing a stream of packets/messages from suspected packet dump store
Containers’ Cache Pollution
Avg: 35-40MB
Max: 44MB
After CAT Application
Avg/Max: 15-20MB