xpdds17: xen schedulers and their impact on interrupt latency - stefano stabellini, aporeto &...
Xen Schedulers and Interrupt Latency
Dario Faggioli & Stefano Stabellini
The case for embedded virtualization
Galois SMACCMPPilot
Demo
Xen Summit 2014
Why Xen? Why a hypervisor?
• Efficiency and Consolidation
• Isolation and Partitioning
• Componentization
• Resilience
• Scaling
• Portability
Embedded != Cloud
Different requirements:
• short boot times
• small footprint
• small codebase (certifications)
• non-PCI device assignment
• driver domains
• co-processor virtualization
• low, deterministic irq latency
• real time schedulers
Xen supports/enables
Xen Schedulers
[Diagram: eight CPUs split among a Real Time Scheduler (ARINC 653), a Regular VM Scheduler (Credit), and two CPUs each dedicated to 1 vCPU]
Automotive
Hardware
Xen
Dom0: Linux Control Domain
UI Domain: Automotive Grade Android
HW Drivers, GPU Driver
PV Block & Net Frontends
PV Block & Net Backends
Audio Driver
GlobalLogic
EPAM
Xilinx Zynq MPSoC
Xen
[Diagram: Dom0 (Linux, Toolstack) alongside several Baremetal Apps with FPGA Drivers, running on dedicated CPUs on top of Xen, and the FPGA]
Latency Impact of Schedulers
pCPUs and vCPUs...
[Diagram: four pCPUs (pCPU0 to pCPU3) and a pool of vCPUs. Some vCPUs are running ("I’m running"), some are runnable ("We want to run!!"), and some are blocked ("We are blocked...").]
Keeping vCPUs “organised”: runqueues
[Diagram: pCPU0 to pCPU3, with runnable vCPUs queued in runqueues (runq)]
Runqueues in Credit1
[Diagram: pCPU0 to pCPU3, each with its own runqueue (runq0 to runq3) holding vCPUs]
1 runqueue per pCPU
Runqueues in Credit2
[Diagram: pCPU0 to pCPU3 served by two runqueues holding vCPUs]
Runqueues are shared
A vCPU Wake-Up in Credit1
[Diagram: pCPU0 to pCPU3 with runq0 to runq3; the numbers (1) to (4) mark the steps below]
Case a:
1. vcpu goes in runq1
2. where can vcpu run? Hey, pCPU2 is idle!
3. put vcpu in runq2
4. pCPU2 picks up vcpu from its runqueue
Case b:
1. vcpu goes in runq3
2. hey, vcpu can preempt what’s running on pCPU3!
3. context switch
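The two Credit1 cases above can be sketched as a toy model. This is only an illustration under stated assumptions, not Xen's actual code: the `PCPU` class, the `wake` function, and the single numeric priority per vCPU are hypothetical simplifications introduced here.

```python
from collections import deque

class PCPU:
    """One physical CPU with its own private runqueue, as in Credit1."""
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.running = None   # (name, prio) of the vCPU on this pCPU, None if idle
        self.runq = deque()   # runnable vCPUs waiting for this pCPU

def wake(pcpus, vcpu, home):
    """Wake `vcpu`, a (name, prio) tuple whose runqueue belongs to pCPU `home`."""
    pcpus[home].runq.append(vcpu)              # step 1: goes in its runqueue
    for p in pcpus:                            # case a: where can it run?
        if p.running is None:                  # an idle pCPU (its runq is empty)
            pcpus[home].runq.remove(vcpu)
            p.runq.append(vcpu)                # put vcpu in the idle pCPU's runq
            p.running = p.runq.popleft()       # the idle pCPU picks it up
            return ("idle", p.cpu_id)
    cur = pcpus[home].running                  # case b: nobody is idle
    if vcpu[1] > cur[1]:                       # assumed rule: higher prio preempts
        pcpus[home].runq.remove(vcpu)
        pcpus[home].runq.appendleft(cur)       # preempted vCPU is requeued
        pcpus[home].running = vcpu             # context switch
        return ("preempt", home)
    return ("queued", home)                    # otherwise it waits in the runqueue
```

For example, with pCPU2 idle, `wake(pcpus, ("w", 1), 1)` returns `("idle", 2)`, which mirrors case a. Every extra step here (searching for an idle pCPU, moving between runqueues) is work done before the vCPU can run.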
A vCPU Wake-Up in Credit2
[Diagram: pCPU0 to pCPU3 with two shared runqueues, runqA and runqB]
Case a:
1. vcpu goes in runqA
2. load-balancer moves vcpu to a less loaded runq
3. pCPU2 picks up vcpu from its runqueue
Case b:
1. vcpu goes in runqB
2. pCPU3 picks up vcpu
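A comparable toy model for the Credit2 cases, again as a sketch rather than the real implementation. The `SharedRunq` class, the load metric (busy pCPUs plus queued vCPUs), and the mapping of pCPU0/1 to runqA and pCPU2/3 to runqB are assumptions made for illustration; Credit2's actual load tracking is more elaborate.

```python
class SharedRunq:
    """A runqueue shared by several pCPUs, as in Credit2."""
    def __init__(self, name, running):
        self.name = name
        self.running = dict(running)  # pCPU id -> running vCPU, or None if idle
        self.queue = []               # runnable vCPUs waiting in this runqueue

    def load(self):
        # Assumed load metric: busy pCPUs plus queued vCPUs.
        busy = sum(v is not None for v in self.running.values())
        return busy + len(self.queue)

    def pick(self):
        """Let the first idle pCPU of this runqueue take the head of the queue."""
        for cpu, cur in self.running.items():
            if cur is None and self.queue:
                self.running[cpu] = self.queue.pop(0)
                return cpu
        return None

def wake(vcpu, home, runqs):
    """Wake `vcpu` on runqueue `home`; return (runqueue name, pCPU that picked)."""
    home.queue.append(vcpu)                         # step 1: goes in its runqueue
    target = min(runqs, key=SharedRunq.load)
    if target is not home and target.load() + 1 < home.load():
        home.queue.remove(vcpu)                     # case a: load balancer moves it
        target.queue.append(vcpu)
    else:
        target = home                               # case b: stays where it is
    return (target.name, target.pick())
```

With runqA heavily loaded and pCPU2 idle under runqB, waking a vCPU on runqA moves it to runqB and pCPU2 runs it (case a); waking it directly on runqB with pCPU3 idle needs no balancing step (case b).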
A vCPU Wake-Up in Credit2
0.000971102 irq_enter
0.000971102 irq_direct, vec fa, handler = apic_timer_interrupt
0.000971649 raise_softirq TIMER_SOFTIRQ
0.000971962 irq_exit, in_irq = 0
0.000974070 softirq_handler TIMER_SOFTIRQ
0.000976010 tasklet_schedule fn=hvm_assert_evtchn_irq, sched_on=6 (softirq)
0.000976010 tasklet_enqueue fn=hvm_assert_evtchn_irq
0.000976510 raise_softirq TASKLET_SOFTIRQ on cpu 6
0.000978213 softirq_handler TASKLET_SOFTIRQ
0.000981562 tasklet_do_work fn=hvm_assert_evtchn_irq
0.000982017 vcpu_wake d1v1
0.000982437 runstate_change d1v1 blocked->runnable
0.000982987 csched2:update_load
0.000983230 csched2:update_rq_load rq# 0, load = 1, avgload = 34.830%
0.000983430 csched2:update_vcpu_load d1v1, vcpu_load = 11.824%
0.000983735 csched2:runq_insert d1v1, position 0
0.000984060 csched2:runq_tickle_new d1v1, processor = 6, credit = 5292567
0.000984490 csched2:runq_tickle cpu 6
0.000984842 raise_softirq SCHEDULE_SOFTIRQ on cpu 6
0.000985500 softirq_handler SCHEDULE_SOFTIRQ
0.000988941 csched2:schedule cpu 6, rq# 0, idle, SMT idle, tickled
0.000989344 csched2:runq_cand_check d1v1
0.000989611 csched2:runq_candidate d1v1, credit = 5292567
0.000990881 sched_switch prev idle, run for 344.6us
0.000991199 sched_switch next d1v1, was runnable for 5.862us, next slice 1000.0us
0.000991377 sched_switch prev idle next d1v1
0.000991697 runstate_change idle running->runnable
0.000991979 runstate_change d1v1 runnable->running
Trace annotations, in order:
Interrupt arrives
vcpu wakes up; goes in runq
Scheduler triggered on CPU 6
CPU 6 schedules
vcpu runs
Scheduling-introduced latency: 9.9620 us
BEST CASE
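The 9.9620 us figure matches the gap between the `vcpu_wake` event and the final `runnable->running` transition in the Credit2 trace. A minimal sketch of that arithmetic follows; the helper name `scheduling_latency_us` is hypothetical, the parsing assumes one `timestamp event` pair per line, and only the relevant part of the trace is reproduced.

```python
from decimal import Decimal

# Excerpt of the Credit2 trace, from the wake-up to the vCPU actually running.
TRACE = """\
0.000982017 vcpu_wake d1v1
0.000982437 runstate_change d1v1 blocked->runnable
0.000984842 raise_softirq SCHEDULE_SOFTIRQ on cpu 6
0.000985500 softirq_handler SCHEDULE_SOFTIRQ
0.000988941 csched2:schedule cpu 6, rq# 0, idle, SMT idle, tickled
0.000991979 runstate_change d1v1 runnable->running
"""

def scheduling_latency_us(trace, vcpu):
    """Microseconds between `vcpu_wake` and `runnable->running` for `vcpu`."""
    wake = run = None
    for line in trace.splitlines():
        ts, _, event = line.partition(" ")
        if event == f"vcpu_wake {vcpu}" and wake is None:
            wake = Decimal(ts)                 # timestamps are in seconds
        elif event == f"runstate_change {vcpu} runnable->running" and wake is not None:
            run = Decimal(ts)
            break
    if wake is None or run is None:
        raise ValueError("wake-up or run event not found for " + vcpu)
    return (run - wake) * 1_000_000            # seconds -> microseconds

latency = scheduling_latency_us(TRACE, "d1v1")
print(f"{latency:.4f} us")  # 9.9620 us
```

`Decimal` is used instead of `float` so that the nanosecond-resolution timestamps subtract exactly.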
_This is all good_ ...
… Because, thanks to this, we can offer VMs/users:
• Overcommitting (i.e., having more vCPUs than pCPUs)
• Weighted fair share of pCPU time
• Hard and soft affinity
• Cache and NUMA awareness
• Caps and reservations on pCPU time
… but it _comes at a price_
The ‘null’ Scheduler
A scheduler that does nothing
If we want features & flexibility, we must pay the price :-(
What if we don’t want (or need) them, e.g.:
• Static environments (some embedded use cases)
• Systems (or cpupools) where we know we’ll never have overcommit
• For testing/benchmarking (as reference)
Then you can use the ‘null’ scheduler
Runqueues in ‘null’. No, wait...
[Diagram: pCPU0 to pCPU3, each with exactly one vCPU]
There are no runqs at all!
• vCPUs are statically assigned to pCPUs
• Only 1 vCPU per pCPU
• Overcommit is possible (i.e.: the system won’t explode), but use only if you really know what you’re doing (i.e.: the VMs will likely explode!)
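The static, runqueue-free assignment described above can be sketched as follows. This is a toy model, not the Xen implementation: the `NullScheduler` class, its method names, and the `waiting` list used for overcommitted vCPUs are assumptions made for illustration.

```python
class NullScheduler:
    """Toy model: each pCPU is statically assigned at most one vCPU."""
    def __init__(self, num_pcpus):
        self.assigned = [None] * num_pcpus  # pCPU index -> vCPU name, or None
        self.waiting = []                   # assumed: vCPUs with no free pCPU

    def add_vcpu(self, vcpu):
        """Give `vcpu` the first free pCPU; park it if there is none."""
        for cpu, owner in enumerate(self.assigned):
            if owner is None:
                self.assigned[cpu] = vcpu
                return cpu
        self.waiting.append(vcpu)           # overcommit: this vCPU cannot run
        return None

    def schedule(self, cpu, runnable):
        """No runqueue to scan: run the pCPU's own vCPU if runnable, else idle."""
        vcpu = self.assigned[cpu]
        return vcpu if vcpu in runnable else "idle"
```

With four pCPUs, a fifth vCPU is accepted but lands in `waiting`, which illustrates why overcommit leaves the system standing while the affected VM starves.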
A vCPU Wake-Up in ‘null’
[Diagram: pCPU0 to pCPU3; the waking vCPU runs directly on its pCPU, step (1)]
1. vcpu wakes up and runs :-)
A vCPU Wake-Up in ‘null’
0.636884641 irq_enter
0.636884641 irq_direct, vec fa, handler = 0xffff82d080267ec4
0.636885492 raise_softirq TIMER_SOFTIRQ
0.636885922 irq_exit, in_irq = 0
0.636889583 softirq_handler TIMER_SOFTIRQ
0.636892021 tasklet_schedule fn=hvm_assert_evtchn_irq, sched_on=5 (softirq)
0.636892021 tasklet_enqueue fn=hvm_assert_evtchn_irq
0.636892836 raise_softirq TASKLET_SOFTIRQ on cpu 5
0.636895074 softirq_handler TASKLET_SOFTIRQ
0.636895607 tasklet_do_work fn=hvm_assert_evtchn_irq
0.636896202 vcpu_wake d1v1
0.636896712 runstate_change d1v1 blocked->runnable
0.636897197 raise_softirq SCHEDULE_SOFTIRQ on cpu 5
0.636898470 softirq_handler SCHEDULE_SOFTIRQ
0.636899465 null:schedule cpu 5, vcpu d1v1
0.636899720 sched_switch prev idle, run for 999936.973us
0.636899970 sched_switch next d1v1, was runnable for 2.411us
0.636900155 sched_switch prev idle next d1v1
0.636900448 runstate_change idle running->runnable
0.636900738 runstate_change d1v1 runnable->running
Trace annotations, in order:
Interrupt arrives
vcpu wakes up
Scheduler triggered on CPU 5
CPU 5 schedules
vcpu runs
Scheduling-introduced latency: 4.5360 us
(less than half of Credit2)
NORMAL CASE
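As a quick check of the "less than half" claim, the two latencies can be recomputed from the `vcpu_wake` and `runnable->running` timestamps of the Credit2 and ‘null’ traces. The variable names below are illustrative only.

```python
from decimal import Decimal

US_PER_S = 10**6  # timestamps are in seconds, latencies in microseconds

# vcpu_wake -> "runnable->running" timestamps, copied from the two traces
credit2_us = (Decimal("0.000991979") - Decimal("0.000982017")) * US_PER_S
null_us = (Decimal("0.636900738") - Decimal("0.636896202")) * US_PER_S

ratio = null_us / credit2_us  # fraction of the Credit2 latency seen with 'null'

print(f"Credit2: {credit2_us:.4f} us")
print(f"null:    {null_us:.4f} us")
print(f"null is below half of Credit2: {ratio < Decimal('0.5')}")
```

Note that the Credit2 figure is labelled as the best case and the ‘null’ figure as the normal case, so the gap under load may be larger than this single comparison shows.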
Benchmarks
Hardware and Software configuration
Hardware:
Xilinx Zynq MPSoC: 4 ARM A53 Cores
Physical Timer
Software:
Xen 4.9.0-rc7 (+ phys_timer forwarding patch)
Dom0: Linux 4.9, dom0_mem=1G, dom0_max_vcpus=2
1 vcpu TBM ctest
Fin