WinConnections Spring 2011 - 30 Bite-Sized Tips for Best vSphere and Hyper-V VM Performance
DESCRIPTION
At the end of the day, virtualization is all about performance. If you squish together 20 VMs onto a single host and they don’t perform well, then you’ve failed at your job. Conversely, if you’ve constructed the environment correctly, you win. In this fun and exciting session, Friend-of-the-Virtual-Machine Greg Shields presents 30 of his very best tips that you can immediately implement. Who knows, you might find one or two that solve your performance problems overnight!
TRANSCRIPT
30 Bite-Sized Tips for Best VM Performance
Greg Shields, MVP
Senior Partner and Principal Technologist
www.ConcentratedTech.com
#1: Purchase Compatible Hardware
• …and not just “compatible with ESX”.
• Purchase hardware compatible with each other.
● Particularly considering vMotion needs.
#2: Buy Nehalem/Opteron
• Intel Nehalem & AMD Opteron include support for Intel EPT / AMD RVI processor extensions.
● Together, referred to as Second Level Address Translations, or SLAT
● Includes hardware-assisted memory management unit (MMU) virtualization.
● Significantly faster for certain workloads, such as those with high context-switch rates.
● Finally, full support for Remote Desktop Services / XenApp.
• Note that these processors support Large Memory Pages, which will disable ESX’s transparent page sharing.
#3: Mind NIC Oversubscription
• One of the greatest benefits of iSCSI is its linear scalability.
● Need more throughput, just add another NIC!
• However, VLANs and link aggregation introduce the notion of NIC oversubscription.
● Ceteris Paribus, Storage traffic >>> Regular traffic.
• Even with VLANs, always segregate storage NICs from production networking NICs.
● If possible/affordable, use segregated network paths.
● Monitor! This will kill your performance faster than anything!
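A quick way to sanity-check oversubscription is simple arithmetic: compare aggregate per-VM storage demand against the capacity of the NICs you’ve dedicated to storage. A minimal sketch, with entirely hypothetical throughput figures:

```python
# Back-of-envelope NIC oversubscription check.
# All per-VM throughput figures below are hypothetical examples.
vm_storage_demand = {"sql01": 90, "exch01": 60, "file01": 45, "web01": 10}  # MB/s peaks

nic_capacity = 125   # rough theoretical max of one 1 GbE link, in MB/s
storage_nics = 2     # NICs dedicated to storage on this host

total_demand = sum(vm_storage_demand.values())
capacity = nic_capacity * storage_nics
ratio = total_demand / capacity

# A ratio above 1.0 means peak demand can exceed the dedicated paths.
print(f"{total_demand} MB/s demand vs {capacity} MB/s capacity -> ratio {ratio:.2f}")
```

Monitoring tells you the real demand numbers; the point is only that the comparison is this simple once you have them.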
#4: Consider Further Segregating Heavy Workloads
• Some VMs run workloads that make heavy use of their attached disks.
● Consider segregating these workloads onto their own independent NICs and paths.
● Keep an eye on your IOPS.
#5: vSphere 4.0 VMs Don’t Backup Applications Correctly!
• vSphere 4.1 added full support for Microsoft VSS on Server 2008 guests.
● This support is only automatic if the guest was initially created on a vSphere 4.1 host.
● Hosts upgraded from vSphere 4.0 aren’t properly backing up their applications.
• Fix this by setting disk.EnableUUID to True.
● Power off machine.
● Edit Settings | Options | General | Configuration Parameters | Add Row.
● Power on machine.
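If you’d rather script the change than click through the GUI, the same row can be appended to the VM’s .vmx file while it is powered off. A minimal sketch — the production .vmx path shown in the comment is hypothetical, and the snippet falls back to a temp file so it is runnable anywhere:

```shell
# Append disk.EnableUUID to a powered-off VM's .vmx file.
# In production, point VMX at the real file, e.g.
# /vmfs/volumes/<datastore>/<vm>/<vm>.vmx (hypothetical path).
VMX="${VMX:-$(mktemp)}"

# Add the row only if it isn't already present.
if ! grep -q '^disk.EnableUUID' "$VMX"; then
    echo 'disk.EnableUUID = "TRUE"' >> "$VMX"
fi

# Confirm the setting took.
grep '^disk.EnableUUID' "$VMX"
```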
#6: HBA Max Queue Depth
• One solution for poor Fibre Channel storage performance can be adjusting your HBA maximum queue depth.
● A deeper queue can mean more performance, but fewer cross-device and cross-VM optimizations.
● 32 by default.
● This is not a task to be taken lightly.
● Kind of like adjusting the air/fuel mix on a carburetor.
• Multi-step process.
• See http://kb.vmware.com/kb/1267 for details.
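For a feel of what the KB walks you through, the procedure boils down to a service-console sequence along these lines. The module name and option shown are illustrative values for a QLogic HBA; Emulex and other drivers use different names, and syntax varies by ESX version, so treat this as a sketch and follow the KB for your hardware:

```shell
# Set the maximum queue depth for a QLogic HBA driver
# (illustrative module/option names - see KB 1267 for your driver).
esxcfg-module -s ql2xmaxqdepth=64 qla2xxx

# Rebuild the boot configuration so the option persists, then reboot.
esxcfg-boot -b
reboot
```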
#7: Consider Hardware iSCSI
• …but, perhaps, don’t buy them…
• ESX’s software iSCSI initiator works well.
● However, using it incurs a small processing overhead.
● Hardware iSCSI NICs offload this overhead to the card.
● NFS/NAS storage also experiences this behavior.
• Newer NICs reduce this effect, those with…
● Checksum offload
● TCP segmentation offload (TSO)
● 64-bit DMA addressing
● Multiple Scatter Gather elements per Tx frame
● Jumbo frames
#8: Set NICs to Autonegotiate
• VMware’s recommendation is to set all NICs to autonegotiate, full duplex.
● This is sort of “duh” these days.
● But it’s worth mentioning, because…
● Some old-school network admins still prefer to manually set speed/duplex due to a crazy old race-condition bug that happened a long, long time ago.
● Just smack around those old coots.
● “You can take your Token Ring and your IPX and go home now!”
#9: Do Not Team Storage NICs
• What? Don’t team them?
● Well, I mean “team” in the classic sense of network teaming.
• Remember that storage NICs leverage MPIO for link aggregation.
● MPIO is a superior technology to link aggregation for storage anyway.
● ‘tis also easier to use, and better for routing!
• vCenter’s GUI wizards make this hard not to do, but be aware that extra steps are required…
#10: Enable Hyperthreading
• Early in ESX’s days we debated whether hyperthreading improved or decreased overall performance.
● That debate is over. The winner is “improved”.
• Today, hyperthreading adds a non-linear additional quantity of processing capacity.
● Like 20-30% (???), not a full extra proc. But you know this.
● Enable it in your servers’ BIOS.
● Just turn it on, OK?
#11: Allocate Only the CPUs You Need
• Allocate only as many vCPUs as a VM requires.
● Start with only one as your baseline. Rarely deviate. Circle this bullet point. No, really.
● Don’t use dual vCPUs for a single-threaded application.
● Don’t assign more vRAM than necessary.
• More vCPUs equals more problems.
● More vCPUs equals more interrupts.
● Extra overhead in maintaining a consistent memory view between vCPUs. This is tough, especially with today’s descheduled processing.
● Some OSs migrate single-threaded workloads between multiple CPUs, adding a performance tax.
• More CPUs are good for CPU spike handling.
#12: Disconnect Unused Physical Hardware Devices
• COM, LPT, USB, Floppy, CD/DVD, NICs, etc. all consume interrupt resources.
● High-priority resources.
● It is a big deal to insert a CD/DVD/USB.
• Connected Windows guests will poll CD/DVD drives very frequently, significantly affecting performance.
● Disconnect these in VM properties when not in use.
● There’s a reason why the “Connected” checkbox exists!
● Note! Connected devices can prevent a vMotion!
#13: Upgrade to VM Version 7
• Virtual hardware version 7 offers some very significant performance improvements.
● VMXNET3 paravirtualized NIC driver
● PVSCSI paravirtualized SCSI driver
● Upgrade VMware Tools. Reboot.
● (More on these in a minute)
• Note that VMv7 hardware cannot be vMotioned to ESX servers prior to 4.0.
● Be careful of this.
#14: Don’t Fear Scaling Out
• Creating VMs is easy, so we create them.
● You’ll eventually run out of CPU resources.
● You’ll probably run out of RAM first.
● Don’t run more VMs than processing/memory capacity.
• When running very close to capacity, use CPU reservations to guarantee 100% CPU availability for the console.
● Host | Configuration | System Resource Allocation | Edit
● Particularly important if you have software installed there.
● This is unnecessary in ESXi.
#15: 80% is Nice
• VMware recommends maintaining an administrative ceiling on utilization at 80%.
● This reserves enough capacity for failure and the service console.
● VMware suggests that 90% should be a warning for overconsumption.
• Less dynamic workloads can shift this up.
● …but, seriously, who can really state that?
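As arithmetic, the 80% ceiling is just a multiplier on host capacity, and the tighter of CPU or RAM sets your VM count. A minimal sketch with hypothetical host and per-VM numbers:

```python
# How many "average" VMs fit under an 80% administrative ceiling?
# All host and per-VM figures are hypothetical.
host_cores, ghz_per_core, host_ram_gb = 16, 2.4, 96
ceiling = 0.80

usable_ghz = host_cores * ghz_per_core * ceiling   # 30.72 GHz
usable_ram_gb = host_ram_gb * ceiling              # 76.8 GB

vm_ghz, vm_ram_gb = 1.2, 3.0   # average demand of one VM

# The more constrained of the two resources sets the limit.
fit = min(int(usable_ghz // vm_ghz), int(usable_ram_gb // vm_ram_gb))
print(fit)
```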
#16: With Older OSs, Use UP HAL When Possible
• Newer OSs (Vista, W7, 2008) use the same HAL for all UP/SMP conditions.
• Older OSs leverage two HALs● A Uniprocessor HAL● A Multiprocessor HAL
• An SMP HAL that is only given a single vCPU will run slightly slower.
● Slightly more synchronization code.
• Note that this will impact hot add.
#17: Mind Scheduling Affinity
• It is possible to tag a VM to a particular pProc.
● Good for ensuring that VM has processing resources during contention.
● Setting Core Sharing to None prevents any other vProc from using a pProc on the same core. Like disabling HT.
● Setting Core Sharing to Internal prevents vProcs on other VMs from using a pProc on the same core. Only the same VM.
● Just set this to Any.
#18: Don’t Touch this Setting.
• Exceptionally rare are the cases when this setting should be adjusted.
● So, no touchy.
● I will tell you when.
● I have very reasonable consulting rates.
#19: Don’t Just Keep Up the Old (and Dumb) Habit of Assigning 4 GB of RAM to Every Stinking Virtual Machine, No Matter What Workload it Runs. Really.
• Consciously consider the amount of RAM that a VM needs, and assign it that RAM.
● Yes, VMware has memory ballooning.
● But overallocating unnecessarily increases VM overhead.
● Ballooning isn’t automatic. Ballooning is slow. Ballooning is reactive.
• Note to Self: Talk about Hyper-V’s Dynamic Memory here. Very cool.
#20: Stop with the Snapshots
• Snapshots are (were) a significant selling point in the early days of virtualization.
● About to do something risky? Snapshot! It’s like a career protection device!
• However, snapshots aren’t (and never were) meant for long-term storage.
● And I mean “no more than just a few minutes” long.
● They’re not meant for backups.
● Reverting to an aged snapshot can break computer trust relationships with the Windows domain.
● Managing snapshots, particularly linked ones, significantly reduces overall VM performance.
#21: Perform vSphere Tasks in the Off Hours
• Some vSphere tasks are actually quite impactful on VM operations.
● Provisioning virtual disks
● Cloning virtual machines
● svMotion
● Manipulating file permissions
● Backups
● Anti-virus (bleh)
• Do these tasks during off hours, or you may impact performance for other running VMs.
#22: Mind Affinities
• Some VMs need to regularly communicate with each other with high throughput.
● “Keep Virtual Machines Together”
● Make sure these machines share the same vSwitch.
● Collocation forces inter-VM traffic through the system bus rather than pNICs, significantly increasing speed.
• Conversely, losing several VMs at once could be bad if they’re collocated on the same host.
● “Separate Virtual Machines”
#23: Disable Screen Savers
• And Window animations.
• Screen savers represent a machine interrupt, particularly those with heavy graphics.
● “Pipes”, I’m looking right at you!
● This interrupt is particularly impactful on collocated VMs.
● …and, plus, screen savers on servers are sooooo 2002.
#24: Use NTP, not VMware Tools, for Time Sync
• …and here’s one out of the odd files…
• VMware suggests configuring VMs to sync time from an external NTP server.
● They prefer this even over their own internal timekeeper.
● Their timekeeper uses a much lower resolution than NTP.
● NTP = milliseconds
● NT5DS = 1 second
● VMware Tools = ?
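In practice that means two guest-side steps: turn off Tools-based periodic sync, then point the OS at NTP. A sketch for a Linux guest — the pool server names are placeholders, and the Tools command name varies across Tools versions, so verify it on your build:

```shell
# 1) Disable VMware Tools periodic time sync inside the guest
#    (command name varies by Tools version - check yours).
vmware-toolbox-cmd timesync disable

# 2) Sync from external NTP servers instead (placeholder names).
cat >> /etc/ntp.conf <<'EOF'
server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst
EOF
service ntpd restart
```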
#25: Never Use PerfMon Inside the VM, Except…
• Not that you’d ever actually use PerfMon, but…
● Measuring performance from within a virtual machine fails to account for unscheduled time.
● Essentially, when the ESX server isn’t servicing the VM, no time passes within that VM.
● Also, in-VM PerfMon doesn’t recognize virtualization overhead.
● Most important, in-VM PerfMon can’t see down into layers below the VM: storage, processing, etc.
• VMware Tools adds PerfMon counters to VMs.
● These are OK to use, as they’re synched from ESX.
#26: Paravirtualization is Your Friend
• VM Hardware Version 7 adds two new paravirtualized drivers.
● VMXNET3 replaces E1000
● PVSCSI replaces BusLogic/LSILogic
• Paravirtualized drivers are superior to emulation.
● They are “aware” they’ve been virtualized. Can work directly with the host without needing emulation.
● Mexican menus versus French menus.
• VMXNET3 supports TSO & Jumbo Frames, in the VM!
● Even if the physical hardware doesn’t support TSO!
#27: Turn on Jumbo Frames, but Do it Everywhere
• If you plan to use Jumbo Frames…
● MTU size is usually set to 9000.
● Make sure you enable it everywhere.
● This particularly assists with large file transfers (think WDS, virtual disk provisioning, etc.) and storage connections.
• Not all network equipment supports Jumbo Frames.● Test, test, test.
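“Everywhere” on the ESX side means the vSwitch and the VMkernel interface, not just the physical switch ports. An illustrative ESX 4.x service-console sequence — names like vSwitch1, the “iSCSI” port group, and the IP details are placeholders, and syntax varies by version, so check your docs:

```shell
# Raise the MTU on the storage vSwitch (placeholder name).
esxcfg-vswitch -m 9000 vSwitch1

# VMkernel NICs must be created with the larger MTU, so delete and
# recreate the iSCSI vmknic (placeholder names and addresses).
esxcfg-vmknic -d "iSCSI"
esxcfg-vmknic -a -i 192.168.50.10 -n 255.255.255.0 -m 9000 "iSCSI"
```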
#28: DRS Will Prioritize Faster Hosts over Slower Ones
• A neat fact (that I didn’t know):
● When potential hosts for a DRS relocation have compatible CPUs but different CPU frequencies and/or memory capacity…
● …DRS will prioritize relocating VMs to the system with the highest CPU frequency and more memory.
● This won’t be the case if that CPU is already at capacity.
#29: Disable FT, Unless You’re Using It
• …and most of you aren’t.
• You can “turn on” but not “enable” FT.
● Problem: Turning on Fault Tolerance automatically disables some features that enhance VM performance.
● Hardware virtual MMU is one.
• Or, just don’t use that horrible feature. Har!● (Is there anyone from VMware in the audience…?)
#30: Match Configured OS with Actual OS
• Big oops here, usually during OS migrations.
• This setting also sets a few important low-level kernel optimizations.
• Make sure yours are correct!
BONUS TIP #31: Follow the Numbers
• Private Clouds are all about quantifying performance in terms of supply and demand.
• vSphere gives you those numbers. Just sum ‘em up.
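“Sum ‘em up” really is that literal: total the per-VM demand and put it next to what the host supplies. A minimal sketch with hypothetical figures:

```python
# Aggregate demand vs. supply on one host (hypothetical figures).
demand_ghz = [1.1, 0.6, 2.3, 0.9]    # per-VM average CPU demand
demand_gb = [2.0, 1.5, 4.0, 3.0]     # per-VM active memory
supply_ghz, supply_gb = 38.4, 96.0   # what the host offers

print(f"CPU:    {sum(demand_ghz):.1f} of {supply_ghz} GHz")
print(f"Memory: {sum(demand_gb):.1f} of {supply_gb} GB")
```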
Final Thoughts
• See! Creating good VMs isn’t all that easy.
● Our jobs aren’t going away any time soon!
● These little optimizations add up.
• Be smart with your virtual environment and always remember…
• …you cannot change the laws of physics!
30 Bite-Sized Tips for Best VM Performance
Greg Shields, MVP
Senior Partner and Principal Technologist
www.ConcentratedTech.com