Live Migration: Mitaka and Beyond
Paul Murray, Hewlett Packard Enterprise
Andrea Rosa, Hewlett Packard Enterprise
Pawel Koniszewski, Intel Corporation
Why live migration?
Host Maintenance
Rolling Updates
Power Optimization
Migrations in OpenStack
✗ Non-live migration (Cold migration)
• nova migrate <server>
✓ Live migration
• nova live-migration <server> [<host>]
✓ Block live migration (optional)
• nova live-migration --block-migrate <server> [<host>]
Assumptions
• Live
• Consistent
• Transparent
• Minimal service disruption
Production experience
Works 85% of the time*!
• Hypervisor doesn’t know what type of disks there are
• Migration may fail
• Migration may never end
• Migration traffic may impact network bandwidth
• Possible guest network disruption on migration
• New resource types and physical mappings
• …
...and bugs
*Vancouver OpenStack Summit: Live Migration at HP Public Cloud, Dive into VM Live Migration
https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/live-migration-at-hp-public-cloud
https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/dive-into-vm-live-migration
Nova Live Migration Priority
Make it simple
• Fewer config options
• Fewer API options
Make it work
• Many fixes in libvirt/qemu
• Fix bugs in OpenStack
• Improve CI
Make it manageable
• Progress information
• Abort or force to completion
• Isolate networking
Make it Simple
Friendly Live Migration API
Keep It Simple, Stupid Sir
“Let the machine do the dirty work”*
*Kernighan and Plauger, “The Elements of Programming Style”, 1978
- block_migration
- disk_over_commit
- live_migration_flag
- block_migration_flag
+ live_migration_tunnelled
block_migration: Nova virt driver
• Libvirt: is_on_shared_storage
• Xen: Using aggregate
• HyperV: Not supported
disk_over_commit: Libvirt specific, do not expose it via API
Warning - Rolling Upgrades
• Mitaka API is not backward compatible
• nova --os-compute-api-version 2.24 live-migration <server> [<host>]
• nova --os-compute-api-version 2.24 live-migration --block-migrate <server> [<host>]
live_migration_flag block_migration_flag
live_migration_tunnelled
Make it Work
Scheduling in Mitaka
• All original scheduling properties are preserved
• Scheduler can correctly choose target for live migration
• extra specs
• scheduler hints
• image properties
Block migration with attached volumes
• Volumes are not copied
• Selective disk migration requires Libvirt >= 1.2.17
• In Ubuntu it requires >= 1.2.16, but a manual change in code on compute nodes is needed
• Default config drive type is iso9660
• Due to a Libvirt bug, iso9660 is not migratable
Live Migration with config drive attached
                                iso9660   vfat
Block live migration               ✗       ✓
Volume-backed live migration       ✗       ✓
Shared storage live migration      ✓       ✓
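The matrix above can be encoded as a small lookup for scripted pre-migration checks. This is only a sketch of the slide's table: the function name and the short keys ("block", "volume", "shared") are made up for illustration and are not part of Nova.

```python
# Hypothetical helper encoding the slide's compatibility matrix.
# Keys and function name are illustrative assumptions, not a Nova API.
CONFIG_DRIVE_MIGRATABLE = {
    ("block", "iso9660"): False,
    ("block", "vfat"): True,
    ("volume", "iso9660"): False,
    ("volume", "vfat"): True,
    ("shared", "iso9660"): True,
    ("shared", "vfat"): True,
}


def config_drive_allows_migration(migration_type, config_drive_format):
    """Return True if the slide marks this combination as migratable."""
    return CONFIG_DRIVE_MIGRATABLE[(migration_type, config_drive_format)]


if __name__ == "__main__":
    for (kind, fmt), ok in sorted(CONFIG_DRIVE_MIGRATABLE.items()):
        print(f"{kind:>6} + {fmt:<7} -> {'ok' if ok else 'not migratable'}")
```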
Memory Oversubscription prior to Mitaka
• LM to specific host does not use memory oversubscription
• ram_allocation_ratio
[Diagram: Compute Node A, 2 GB RAM. Reported RAM = available - reserved. Labels: nova-conductor, 2 GB (three times); nova-scheduler, 4 GB; ram_allocation_ratio = 2.0]
Memory Oversubscription in Mitaka
• LM to specific host mimics RAM Filter logic
[Diagram: Compute Node A, 2 GB RAM, reports total RAM, free RAM and RAM allocation ratio. Labels: nova-conductor, 4 GB; nova-scheduler, 4 GB]
Memory = Total * Ratio – (Total – Free)
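A worked instance of the formula above, as a minimal sketch. The function names are illustrative and the code is not taken from Nova; it only evaluates Memory = Total * Ratio - (Total - Free) for the slide's 2 GB host with ram_allocation_ratio = 2.0.

```python
def usable_memory_mb(total_mb, free_mb, ram_allocation_ratio):
    """Memory = Total * Ratio - (Total - Free), as stated on the slide."""
    used_mb = total_mb - free_mb
    return total_mb * ram_allocation_ratio - used_mb


def instance_fits(instance_mb, total_mb, free_mb, ram_allocation_ratio):
    """Illustrative check: does the instance fit in the oversubscribed limit?"""
    return instance_mb <= usable_memory_mb(total_mb, free_mb, ram_allocation_ratio)


if __name__ == "__main__":
    # Empty 2 GB host with ratio 2.0: 2048 * 2.0 - 0 = 4096 MB usable.
    print(usable_memory_mb(2048, 2048, 2.0))
    # Same host with 2 GB already in use: 2048 * 2.0 - 2048 = 2048 MB usable.
    print(usable_memory_mb(2048, 0, 2.0))
    # A 3 GB instance fits on the empty host only because of the ratio.
    print(instance_fits(3072, 2048, 2048, 2.0))
```

With a ratio of 1.0 the formula reduces to the free RAM, which matches the behaviour described for releases prior to Mitaka.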
Page Modification Logging
• Hardware-assisted dirty logging mechanism
• Performance of a VM increased up to 8%
• Requires:
  • 4th generation Intel Xeon processor
  • Kernel version >= 4.0
Make it Manageable
Pets and cattle metaphor
molly.mycompany.com
charlie.mycompany.com
Pets and Cattle metaphor
server1.mycompany.com
server2.mycompany.com
Pets and cattle metaphor: the theory
Pets and cattle metaphor: the reality
Molly2.company.com
Charlie1.company.com
Management of on-going live migrations
• Progress details• Force to complete• Abort
Progress details
nova server-migration-list <server>
nova server-migration-show <server> <migration id>
• List migrations for a server
• Show a migration for a server
• Details: disk progress, memory progress
Progress details
$ nova server-migration-show e5fe4c30-c993-43a3-a4b4-a1e48ee93606 4
+------------------------+--------------------------------------+
| Property               | Value                                |
+------------------------+--------------------------------------+
| created_at             | 2016-04-22T12:31:55.000000           |
| dest_compute           | devstack3                            |
| dest_host              | -                                    |
| dest_node              | -                                    |
| disk_processed_bytes   | 8109686784                           |
| disk_remaining_bytes   | 13365149696                          |
| disk_total_bytes       | 21474836480                          |
| id                     | 4                                    |
| memory_processed_bytes | 0                                    |
| memory_remaining_bytes | 2156605440                           |
| memory_total_bytes     | 2156605440                           |
| server_uuid            | e5fe4c30-c993-43a3-a4b4-a1e48ee93606 |
| source_compute         | devstack3a                           |
| source_node            | -                                    |
| status                 | running                              |
| updated_at             | 2016-04-22T12:37:07.000000           |
+------------------------+--------------------------------------+
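The byte counters in this output are enough to derive a progress percentage. The sketch below assumes the fields have already been parsed into a dict keyed by the property names shown; the helper name is hypothetical and no Nova client call is made.

```python
# Values copied from the server-migration-show output above.
migration = {
    "disk_processed_bytes": 8109686784,
    "disk_remaining_bytes": 13365149696,
    "disk_total_bytes": 21474836480,
    "memory_processed_bytes": 0,
    "memory_remaining_bytes": 2156605440,
    "memory_total_bytes": 2156605440,
}


def percent_done(migration, kind):
    """Percentage processed for kind 'disk' or 'memory' (hypothetical helper)."""
    total = migration[f"{kind}_total_bytes"]
    if total <= 0:
        return 0.0
    return 100.0 * migration[f"{kind}_processed_bytes"] / total


if __name__ == "__main__":
    for kind in ("disk", "memory"):
        print(f"{kind}: {percent_done(migration, kind):.1f}% copied")
```

In this sample the disk copy is a little over a third done and the memory copy has not started yet.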
Operators love to kill a live migration
How to abort an in-progress live migration
nova live-migration-abort <server> <migration id>
• Aborts the running job and triggers a rollback
• Works only when libvirt is used as a driver (QEMU/KVM hypervisor)
• Won’t work with post-copy live migration
Force to Complete
nova live-migration-force-complete <server> <migration id>
• Pauses VM during LM
• Automatically unpauses VM
• Works only when libvirt is used as a driver
Live Migration on dedicated network
New configuration parameter:
live_migration_inbound_addr
Live migration traffic
Complex KVM installation with VSA model
live_migration_inbound_addr
Future of Live Migration
Post-copy Live Migration
Pre-copy Post-copy
● Move workload to destination in the middle of the process
Post-copy Live Migration
• Live migration ends in a finite time
• VM needs to be rebooted in case of failure
• Performance impact on memory reads
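Why pre-copy "may never end" while post-copy ends in finite time can be seen in a toy model. The sketch below is an assumption-laden simplification, not how QEMU or libvirt actually decide convergence: bandwidth and the guest's dirty rate are constant, units are arbitrary, and each round re-sends whatever was dirtied during the previous round.

```python
def precopy_rounds(memory, bandwidth, dirty_rate, stop_threshold, max_rounds=50):
    """Toy pre-copy model (illustrative only).

    Returns the number of copy rounds needed until the remaining dirty
    memory is small enough to pause the VM and finish, or None if the
    migration has not converged after max_rounds.
    """
    remaining = memory
    rounds = 0
    while remaining > stop_threshold:
        if rounds == max_rounds:
            return None  # guest dirties memory as fast as we can copy it
        seconds = remaining / bandwidth
        # Memory dirtied while this round was being transferred,
        # capped at the total size of guest memory.
        remaining = min(memory, dirty_rate * seconds)
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Dirty rate at half the bandwidth: remaining memory halves each round.
    print(precopy_rounds(1000, 100, 50, 10))
    # Dirty rate equal to the bandwidth: never converges in this model.
    print(precopy_rounds(1000, 100, 100, 10))
```

In this model, forcing completion by pausing the VM corresponds to a dirty rate of zero, so the next round is the last one. Post-copy sidesteps the loop entirely by moving execution to the destination and fetching the remaining pages on demand.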
Check Destination on Migration
• Live migration can be forced to particular host
• Adds a new parameter to check provided host in scheduler
nova live-migration <server> [<host>] --check
Summary
Mitaka blueprints (merged):
• Fewer config options
• Fewer API options
• Scheduling with original request properties
• Block migration with volumes and vfat config drive
• Fixed memory oversubscription
• Progress reporting
• Abort migration
• Force migration to complete
• Split network plane
Future
• Post-copy live migration
• Check destination…
… and more in planning
Legal Notices and Disclaimers
• Intel technologies’ features and benefits depend on system configuration and may require enabled
hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.
• No computer system can be absolutely secure.
• Tests document performance of components on a particular test, in specific systems. Differences in
hardware, software, or configuration will affect actual performance. Consult other sources of information
to evaluate performance as you consider your purchase. For more complete information about
performance and benchmark results, visit http://www.intel.com/performance.
• Intel, the Intel logo and others are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
• © 2016 Intel Corporation.
Credits
Presenters:
Nova Live Migration sub-team:
https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
Artwork: Dave McNally
HikingArtist: http://hikingartist.com