Achieving the ultimate performance with KVM
Boyan Krosnov
Open Infrastructure Summit
Shanghai 2019
1
![Page 2: Boyan Krosnov Open Infrastructure Summit with …...Usual optimization goal - lowest cost per delivered resource - fixed performance target - calculate all costs - power, cooling,](https://reader030.vdocuments.net/reader030/viewer/2022040904/5e78ca90196e9b2d472b610a/html5/thumbnails/2.jpg)
StorPool & Boyan K.
● NVMe software-defined storage for VMs and containers
● Scale-out, HA, API-controlled
● Since 2011, in commercial production use since 2013
● Based in Sofia, Bulgaria
● Mostly virtual disks for KVM
● … and bare metal Linux hosts
● Also used with VMware, Hyper-V, XenServer
● Integrations into OpenStack/Cinder, Kubernetes Persistent Volumes, CloudStack, OpenNebula, OnApp
Why performance
● Better application performance -- e.g. time to load a page, time to rebuild, time to execute a specific query
● Happier customers (in cloud / multi-tenant environments)
● ROI, TCO - Lower cost per delivered resource (per VM) through higher density
Agenda
● Hardware
● Compute - CPU & Memory
● Networking
● Storage
Compute node hardware
Usual optimization goal
- lowest cost per delivered resource
- fixed performance target
- calculate all costs - power, cooling, space, server, network, support/maintenance
Example: cost per VM with 4x dedicated 3 GHz cores and 16 GB RAM
Unusual
- Best single-thread performance I can get at any cost
- 5 GHz cores, yummy :)
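The "usual" goal above can be sketched as a small calculation: add up all monthly costs of a node and divide by the number of fixed-size VMs it can host. This is only an illustration. The class, the function names and every number are hypothetical placeholders, not figures from the talk.

```python
from dataclasses import dataclass


@dataclass
class NodeCosts:
    """Monthly costs for one compute node (all values are hypothetical)."""
    server: float          # amortized hardware cost
    power_cooling: float
    space: float
    network: float
    support: float

    def total(self) -> float:
        return (self.server + self.power_cooling + self.space
                + self.network + self.support)


def vms_per_node(cores: int, ram_gb: int, reserved_cores: int,
                 reserved_ram_gb: int, vm_cores: int = 4,
                 vm_ram_gb: int = 16) -> int:
    """VMs with dedicated cores and RAM that fit after host reservations."""
    by_cpu = (cores - reserved_cores) // vm_cores
    by_ram = (ram_gb - reserved_ram_gb) // vm_ram_gb
    return max(0, min(by_cpu, by_ram))


def cost_per_vm(costs: NodeCosts, vm_count: int) -> float:
    """Total node cost divided by the number of VMs delivered."""
    if vm_count <= 0:
        raise ValueError("node cannot host any VM of this size")
    return costs.total() / vm_count


if __name__ == "__main__":
    # Hypothetical node: 40 cores, 192 GB RAM, 4 cores and 16 GB kept for the host.
    costs = NodeCosts(server=300.0, power_cooling=80.0, space=40.0,
                      network=30.0, support=50.0)
    n = vms_per_node(cores=40, ram_gb=192, reserved_cores=4, reserved_ram_gb=16)
    print(f"{n} VMs per node, {cost_per_vm(costs, n):.2f} per VM per month")
```

Comparing this figure across candidate CPUs and form factors is what turns "lowest cost per delivered resource" into a number.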
Compute node hardware
Intel
lowest cost per core:
- Xeon Gold 6222V - 20 cores @ 2.4 GHz
lowest cost per 3GHz+ core:
- Xeon Gold 6210U - 20 cores @ 3.2 GHz
- Xeon Gold 6240 - 18 cores @ 3.3 GHz
- Xeon Gold 6248 - 20 cores @ 3.2 GHz
AMD
- EPYC 7702P - 64 cores @ 2.0/3.35 GHz - lowest cost per core
- EPYC 7402P - 24 cores / 1S - low density
- EPYC 7742 - 64 cores @ 2.2/3.4GHz x 2S - max density
Compute node hardware
Form factor: from [image] to [image]
Compute node hardware
● firmware versions and BIOS settings
● Understand power management -- esp. C-states, P-states, HWP and “bias”
○ Different on AMD EPYC: "power-deterministic", "performance-deterministic"
● Think of rack level optimization - how do we get the lowest total cost per delivered resource?
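To see which power-management settings a host is actually running with, the Linux sysfs tree can be read directly. The sketch below is a rough starting point, not a complete audit. The function name and the returned dictionary layout are made up for illustration, and some files (for example `energy_performance_preference` or `energy_perf_bias`) exist only with certain drivers and CPUs, so missing files are reported as `None`.

```python
from pathlib import Path
from typing import Optional


def _read(path: Path) -> Optional[str]:
    """Return the stripped file contents, or None if absent or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def power_settings(cpu: int = 0, root: str = "/sys/devices/system/cpu") -> dict:
    """Collect P-state, HWP/bias and C-state information for one CPU."""
    base = Path(root) / f"cpu{cpu}"
    info = {
        "driver": _read(base / "cpufreq" / "scaling_driver"),
        "governor": _read(base / "cpufreq" / "scaling_governor"),
        "hwp_epp": _read(base / "cpufreq" / "energy_performance_preference"),
        "epb_bias": _read(base / "power" / "energy_perf_bias"),
        "c_states": [],
    }
    # cpuidle exposes one directory per C-state: state0, state1, ...
    states = sorted((base / "cpuidle").glob("state[0-9]*"),
                    key=lambda p: int(p.name[len("state"):]))
    for state in states:
        info["c_states"].append({
            "name": _read(state / "name"),
            "exit_latency_us": _read(state / "latency"),
            "disabled": _read(state / "disable"),
        })
    return info


if __name__ == "__main__":
    for key, value in power_settings(0).items():
        print(key, value)
```

Running it before and after a BIOS or firmware change shows whether the change reached the operating system.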
Tuning KVM
RHEL7 Virtualization_Tuning_and_Optimization_Guide (link)
https://pve.proxmox.com/wiki/Performance_Tweaks
https://events.static.linuxfound.org/sites/events/files/slides/CloudOpen2013_Khoa_Huynh_v3.pdf
http://www.linux-kvm.org/images/f/f9/2012-forum-virtio-blk-performance-improvement.pdf
http://www.slideshare.net/janghoonsim/kvm-performance-optimization-for-ubuntu
… but don’t trust everything you read. Perform your own benchmarking!
CPU and Memory
Recent Linux kernel, KVM and QEMU
… but beware of the bleeding edge
E.g. qemu-kvm-ev from RHEV (repackaged by CentOS)
tuned-adm profile virtual-host
tuned-adm profile virtual-guest
CPU
Typical
● (heavy) oversubscription, because VMs are mostly idling
● HT
● NUMA
● route IRQs of network and storage adapters to a core on the NUMA node they are on
Unusual
● CPU Pinning
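The IRQ routing point can be sketched as follows: find the NUMA node a network adapter is attached to, list the CPUs on that node, and spread the adapter's IRQs over them. This is a planning sketch under assumptions. The function names are invented for illustration, the IRQ numbers in the demo are made up, and the plan is only computed, not applied.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel cpulist such as "0-3,8,10-11" into a list of CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.extend(range(int(lo), int(hi or lo) + 1))
    return cpus


def local_cpus(netdev: str, sys_root: str = "/sys") -> list[int]:
    """CPUs on the NUMA node that the given network device is attached to."""
    root = Path(sys_root)
    node = int((root / "class/net" / netdev / "device/numa_node").read_text())
    if node < 0:  # -1 means the kernel reports no NUMA affinity
        node = 0
    cpulist = root / "devices/system/node" / f"node{node}" / "cpulist"
    return parse_cpulist(cpulist.read_text())


def affinity_plan(irqs: list[int], cpus: list[int]) -> dict[int, int]:
    """Assign IRQs round-robin to the NUMA-local CPUs."""
    if not cpus:
        raise ValueError("no CPUs to assign IRQs to")
    return {irq: cpus[i % len(cpus)] for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical IRQ numbers and cpulist, for illustration only.
    print(affinity_plan([40, 41, 42], parse_cpulist("0-1")))
```

Applying a plan means writing each CPU id to `/proc/irq/<irq>/smp_affinity_list` as root. A running `irqbalance` daemon may rewrite those values, so check how it is configured first.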
Understanding oversubscription and congestion
Linux scheduler statistics: linux-stable/Documentation/scheduler/sched-stats.txt
Next three are statistics describing scheduling latency:
7) sum of all time spent running by tasks on this processor (in jiffies)
8) sum of all time spent waiting to run by tasks on this processor (in jiffies)
9) # of timeslices run on this cpu
20% CPU load with large wait time (bursty congestion) is possible
100% CPU load with no wait time, also possible
Measure CPU congestion!
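One way to measure congestion from these counters is to sample `/proc/schedstat` twice and compare how much wait time (field 8) accumulated relative to run time (field 7). The sketch below assumes the per-CPU line layout described in sched-stats, with the CPU name followed by nine counters. Newer kernel documentation gives these two fields in nanoseconds, not jiffies, but the ratio is the same in either unit. The function names are illustrative.

```python
import time


def parse_schedstat(text: str) -> dict[str, tuple[int, int, int]]:
    """Return {cpu: (run_time, wait_time, timeslices)} from schedstat text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU lines look like "cpu0 f1 f2 ... f9"; skip version/domain lines.
        if len(fields) >= 10 and fields[0].startswith("cpu"):
            run, wait, slices = (int(v) for v in fields[7:10])
            stats[fields[0]] = (run, wait, slices)
    return stats


def wait_to_run_ratio(before: dict, after: dict) -> dict[str, float]:
    """Wait time accumulated per unit of run time between two samples."""
    ratios = {}
    for cpu, (run1, wait1, _) in after.items():
        if cpu not in before:
            continue
        run0, wait0, _ = before[cpu]
        d_run, d_wait = run1 - run0, wait1 - wait0
        ratios[cpu] = d_wait / d_run if d_run > 0 else 0.0
    return ratios


def sample(interval_s: float = 1.0, path: str = "/proc/schedstat") -> dict[str, float]:
    """Take two samples interval_s apart and return the per-CPU ratios."""
    with open(path) as f:
        before = parse_schedstat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_schedstat(f.read())
    return wait_to_run_ratio(before, after)
```

A ratio near zero means tasks rarely queue behind each other, whatever the CPU load. A high ratio at modest load is the bursty congestion case from the slide.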
Discussion
Memory
Typical
● Dedicated RAM
● huge pages, THP
● NUMA
● use local-node memory if you can
Unusual
● Oversubscribed RAM
● balloon
● KSM (RAM dedup)
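For the dedicated-RAM case with huge pages, a quick check is whether the host has enough free huge pages to back a VM completely. The sketch below parses `/proc/meminfo` text and does the arithmetic. The function names are illustrative, and the check is deliberately simple: it looks only at the default huge page size and ignores reserved pages and per-NUMA-node distribution.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Map /proc/meminfo keys to integers (kB where the kernel reports kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            info[key.strip()] = int(parts[0])
    return info


def hugepages_needed(vm_ram_gib: int, hugepage_kb: int) -> int:
    """Huge pages required to back all of a VM's RAM, rounded up."""
    vm_kb = vm_ram_gib * 1024 * 1024
    return -(-vm_kb // hugepage_kb)  # ceiling division


def vm_fits(meminfo: dict[str, int], vm_ram_gib: int) -> bool:
    """True if the free default-size huge pages can back the whole VM."""
    needed = hugepages_needed(vm_ram_gib, meminfo["Hugepagesize"])
    return needed <= meminfo.get("HugePages_Free", 0)


if __name__ == "__main__":
    # A 16 GiB VM on 2 MiB huge pages
    print(hugepages_needed(16, 2048))  # prints 8192
```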
Discussion
Networking
Virtualized networking
Use virtio-net driver
regular virtio vs vhost_net
Linux Bridge vs OVS in-kernel vs OVS-DPDK
Pass-through networking
SR-IOV (PCIe pass-through)
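With libvirt, the "regular virtio vs vhost_net" choice shows up as the `name` attribute of the interface `<driver>` element. The sketch below builds such an interface definition. It is one possible way to express the setting, not the configuration used in the talk, and the helper name and the bridge name `br0` are hypothetical.

```python
import xml.etree.ElementTree as ET


def virtio_interface(bridge: str, queues: int = 1, use_vhost: bool = True) -> str:
    """Build a libvirt <interface> element for a virtio-net NIC on a bridge."""
    iface = ET.Element("interface", type="bridge")
    ET.SubElement(iface, "source", bridge=bridge)
    ET.SubElement(iface, "model", type="virtio")
    # name="vhost" selects the in-kernel vhost_net backend,
    # name="qemu" keeps packet processing in the QEMU process.
    driver = {"name": "vhost" if use_vhost else "qemu"}
    if queues > 1:
        driver["queues"] = str(queues)  # multiqueue virtio-net
    ET.SubElement(iface, "driver", driver)
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    print(virtio_interface("br0", queues=4))
```

Benchmark both backends on your own workload before settling on one, in line with the advice not to trust everything you read.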
Networking - virtio
[Diagram labels: Qemu, VM, Kernel, Kernel, User space]
Networking - vhost
[Diagram labels: Qemu, VM, Kernel, Kernel, User space, vhost]
Networking - vhost-user
[Diagram labels: Qemu, VM, Kernel, Kernel, User space, vhost]
Networking - PCI Passthrough and SR-IOV
● Direct exclusive access to the PCI device
● SR-IOV - one physical device appears as multiple virtual functions (VF)
● Allows different VMs to share a single PCIe hardware device
[Diagram labels: Host with Hypervisor / VMM and host driver; three VMs, each with its own driver; IOMMU / VT-d; PCIe; NIC exposing PF, VF1, VF2, VF3]
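Whether a NIC supports SR-IOV, and how many virtual functions are currently enabled, can be read from sysfs on the host. The sketch below reads the standard `sriov_totalvfs` and `sriov_numvfs` attributes of the physical function. The function name and the returned dictionary are illustrative only.

```python
from pathlib import Path
from typing import Optional


def sriov_status(netdev: str, sys_root: str = "/sys") -> Optional[dict]:
    """Return SR-IOV VF counts for a network device, or None if unsupported."""
    dev = Path(sys_root) / "class/net" / netdev / "device"
    total = dev / "sriov_totalvfs"
    if not total.exists():
        return None  # device or driver does not expose SR-IOV
    return {
        "total_vfs": int(total.read_text()),
        "active_vfs": int((dev / "sriov_numvfs").read_text()),
        # Each enabled VF appears as a virtfnN link under the PF's device dir.
        "vf_links": sorted(p.name for p in dev.glob("virtfn*")),
    }
```

Enabling VFs is done by writing the desired count to `sriov_numvfs` as root. Handing a VF to a VM additionally needs the IOMMU (VT-d or the AMD equivalent) enabled in firmware and in the kernel.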
Discussion
Storage - virtualization
Virtualized
cache=none -- direct IO, bypass host buffer cache
io=native -- use Linux Native AIO, not POSIX AIO (threads)
virtio-blk vs virtio-scsi
virtio-scsi multiqueue
iothread
vs. Full bypass
SR-IOV for NVMe devices
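In libvirt terms, the virtualized settings above map to attributes of the disk `<driver>` element: `cache='none'`, `io='native'` and, optionally, `iothread`. The sketch below generates such a disk definition for a virtio-blk device. The helper name and the device path are hypothetical, and the `iothread` attribute only works if the domain also declares an `<iothreads>` count.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def virtio_disk(dev_path: str, target: str = "vda",
                iothread: Optional[int] = 1) -> str:
    """Build a libvirt <disk> element for a raw block device on virtio-blk."""
    disk = ET.Element("disk", type="block", device="disk")
    driver = {
        "name": "qemu",
        "type": "raw",
        "cache": "none",   # direct IO, bypass the host buffer cache
        "io": "native",    # Linux native AIO, not the thread pool
    }
    if iothread is not None:
        driver["iothread"] = str(iothread)  # dedicated IO thread for this disk
    ET.SubElement(disk, "driver", driver)
    ET.SubElement(disk, "source", dev=dev_path)
    ET.SubElement(disk, "target", dev=target, bus="virtio")
    return ET.tostring(disk, encoding="unicode")


if __name__ == "__main__":
    # /dev/example/volume1 is a placeholder path, not a real device.
    print(virtio_disk("/dev/example/volume1"))
```

On a plain QEMU command line the equivalent `-drive` option is spelled `aio=native`.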
Storage - vhost
Virtualized with host kernel bypass
vhost
before: guest kernel -> host kernel -> qemu -> host kernel -> storage system
after: guest kernel -> storage system
● Highly scalable and efficient architecture
● Scales up in each storage node & out with multiple nodes
[Diagram labels: storpool_server instances (1 CPU thread, 2-4 GB RAM each); NVMe SSDs; NIC; 25GbE links; storpool_block instance (1 CPU thread); KVM Virtual Machines]
Storage benchmarks
Beware: lots of snake oil out there!
● performance numbers from hardware configurations totally unlike what you’d use in production
● synthetic tests with high iodepth - 10 nodes, 10 workloads * iodepth 256 each. (because why not)
● testing with ramdisk backend
● synthetic workloads don't approximate real world (example)
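Why a very high iodepth makes headline numbers misleading follows from Little's law: mean latency equals the number of outstanding requests divided by throughput. The sketch below applies it to the slide's example, read here as 10 nodes each running 10 workloads at iodepth 256, which is an assumption about the slide's meaning. The IOPS figures are hypothetical, chosen only to show the effect.

```python
def mean_latency_ms(outstanding_ios: int, iops: float) -> float:
    """Little's law: mean latency = outstanding requests / throughput."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return outstanding_ios / iops * 1000.0


if __name__ == "__main__":
    deep = 10 * 10 * 256    # nodes * workloads * iodepth = 25600 in flight
    shallow = 10 * 10 * 1   # same layout at iodepth 1 = 100 in flight
    # Hypothetical throughput figures, for illustration only.
    print(f"{mean_latency_ms(deep, 1_000_000):.2f} ms at 1,000,000 IOPS")
    print(f"{mean_latency_ms(shallow, 400_000):.2f} ms at 400,000 IOPS")
```

With these made-up figures the deep-queue run reports 2.5 times the IOPS at roughly 100 times the latency, so a benchmark result should always state both numbers.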
Latency
[Figure, built up over four slides: latency plotted against ops per second, with regions labelled "best service", "lowest cost per delivered resource", "only pain" and "benchmarks"]
example1: 90 TB NVMe system - 22 IOPS per GB capacity
example2: 116 TB NVMe system - 48 IOPS per GB capacity
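The per-GB figures can be turned back into whole-system numbers with a one-line calculation. The sketch below assumes decimal units (1 TB = 1000 GB), which the slide does not state, and the function name is illustrative.

```python
def total_iops(capacity_tb: float, iops_per_gb: float, gb_per_tb: int = 1000) -> float:
    """System-wide IOPS implied by a capacity and an IOPS-per-GB density."""
    return capacity_tb * gb_per_tb * iops_per_gb


if __name__ == "__main__":
    print(total_iops(90, 22))    # prints 1980000
    print(total_iops(116, 48))   # prints 5568000
```

Under that assumption the two examples correspond to roughly 2.0 million and 5.6 million IOPS for the whole system.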
Real load
Discussion