impact of disk alignment in virtualized environments
Grant Cohoe
IMPACT OF DISK ALIGNMENT IN VIRTUALIZED ENVIRONMENTS
WHY SHOULD YOU CARE?
• Performance
• Misalignment causes more I/Os than you need
• Shared storage issues
UNDERSTAND YOUR STUFF
• Hard Disk Geometry
• Sector Size (Logical & Physical)
• Operating System
• What does it want?
• What does it do by default?
• Sometimes silly things…
LAYERS
Disk Geometry/Partitions
LVM
Host File System
VMDK Geometry/Partitions
VMFS
DISK GEOMETRY/PARTITIONS
TERMINOLOGY
• Sectors
• Units of disk storage
• Partition
• Logical group of sectors
• Track
• Ring of sectors on a single side of a platter
• Cylinder
• 3D track (all platters at one track location)
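MASTER BOOT RECORD (MBR)
• That thing that boots your OS
• First 512 bytes of the disk
• 446 bytes of bootstrap area (440 bytes of boot loader code, usually followed by a 4-byte disk signature and 2 reserved bytes)
• 64 bytes of partition information (4 entries × 16 bytes)
• 2-byte boot signature (0x55AA)
• 4 primary partitions, max size 2TB each (32-bit sector counts × 512B sectors)
[Figure: the 512-byte MBR from its start: boot loader, partition table, boot signature]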
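To make the layout concrete, here is a minimal Python sketch that reads the four primary partition entries out of a 512-byte MBR. It assumes the conventional layout (partition table at byte offset 446, four 16-byte entries, 0x55AA signature in the last two bytes). The names parse_mbr and build_demo_mbr are made up for this example, and the demo MBR is fabricated in memory rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512              # classic logical sector size, in bytes
PARTITION_TABLE_OFFSET = 446   # assumed conventional offset of the table
ENTRY_SIZE = 16                # 4 entries x 16 bytes = 64 bytes
BOOT_SIGNATURE = b"\x55\xaa"
ENTRY_FORMAT = "<B3sB3sII"     # status, CHS start, type, CHS end, start LBA, sector count


def parse_mbr(sector0: bytes):
    """Return (start_lba, sector_count, type) for each non-empty primary partition."""
    if len(sector0) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector0[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    partitions = []
    for i in range(4):
        begin = PARTITION_TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector0[begin:begin + ENTRY_SIZE]
        _status, _chs_start, ptype, _chs_end, start_lba, count = struct.unpack(ENTRY_FORMAT, entry)
        if ptype != 0:         # type 0 marks an unused slot
            partitions.append((start_lba, count, ptype))
    return partitions


def build_demo_mbr(start_lba: int, count: int, ptype: int = 0x83) -> bytes:
    """Hypothetical helper: fabricate an MBR holding one partition entry."""
    entry = struct.pack(ENTRY_FORMAT, 0x00, b"\x00" * 3, ptype, b"\x00" * 3, start_lba, count)
    return b"\x00" * PARTITION_TABLE_OFFSET + entry + b"\x00" * (3 * ENTRY_SIZE) + BOOT_SIGNATURE


demo = build_demo_mbr(start_lba=2048, count=16775168)
print(parse_mbr(demo))

# The 32-bit sector count caps a partition at 2**32 sectors * 512 B.
print(2**32 * SECTOR_SIZE // 2**40, "TiB max per partition")
```

The start LBA is the number that matters for everything that follows in this deck.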
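MASTER BOOT RECORD (MBR)
• DOS Compatibility
• Partitions cannot span cylinders, so they start on a track boundary (because DOS was silly)
• Number of sectors per track = 63
• 63 – 1 (MBR) = 62 unused sectors (LBA-1 to LBA-62) before the first usable sector, LBA-63
• This is deprecated
[Figure: MBR at LBA-0, unused LBA-1 through LBA-62, first partition at LBA-63]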
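MASTER BOOT RECORD (MBR)
• 1MB Alignment
• Align all partitions to 1MB
• 1MB = 1048576B; 1048576B / 512B per sector = 2048, so the 1st partition starts at sector 2048
• Improves performance
• Ensures compatibility with 4K “Advanced Format” disks
• This is the new standard (the default since Windows Vista)
[Figure: MBR at LBA-0, alignment space LBA-1 through LBA-2047, first partition at LBA-2048]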
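The 1MB arithmetic can be checked with a few lines of Python. This is only a sketch: the function names are invented for the example and 512B logical sectors are assumed.

```python
SECTOR_SIZE = 512          # assumed logical sector size, in bytes
ONE_MIB = 1024 * 1024      # 1048576 bytes


def first_aligned_lba(alignment_bytes=ONE_MIB, sector_size=SECTOR_SIZE):
    """First sector number that sits on an alignment boundary after sector 0."""
    return alignment_bytes // sector_size


def is_aligned(start_lba, alignment_bytes=ONE_MIB, sector_size=SECTOR_SIZE):
    """True if a partition starting at start_lba begins on an alignment boundary."""
    return (start_lba * sector_size) % alignment_bytes == 0


print(first_aligned_lba())                      # 2048
print(is_aligned(2048))                         # True: the 1MB-aligned start
print(is_aligned(63))                           # False: the old DOS-compatible start
print(is_aligned(63, alignment_bytes=4096))     # False: not even 4K aligned
print(is_aligned(2048, alignment_bytes=4096))   # True: 1MB alignment implies 4K alignment
```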
RESULTING DISK
• 512B MBR: sector 0
• Alignment space: sectors 1 through 2047
• 1st partition starting sector: 2048
• This is good!
[Figure: MBR, then partition sectors 2048 through 16777215]
LOGICAL VOLUME MANAGEMENT (LVM)
TERMINOLOGY
• Physical Volume (PV)
• Container of data stored as a partition on disk
• Logical Volume (LV)
• Virtualized storage structure stored as data in a PV
• pe_start
• Offset of the first physical extent (the start of LV data) within a PV
LVM PHYSICAL VOLUMES (LVM PV)
• pe_start specifies the start of LV data
• LVM is very intelligent about choosing it, so it is usually not a problem
• Needs to be aligned to your sectors!
LVM PHYSICAL VOLUMES (LVM PV)
• Bad
• pe_start does not line up with a sector boundary
• This will hurt performance later
[Figure: disk sectors 2048 through 2059 holding a Physical Volume; pe_start marks where the PV data region begins]
LVM PHYSICAL VOLUMES (LVM PV)
• Good
• As long as pe_start is a multiple of your sector size (usually 512B), you’re good!
LVM PHYSICAL VOLUMES (LVM PV)
• PE Size
• Physical Extent: the LVM “block” size
• The default is usually fine
• Must be a multiple of the sector size (512B)
RESULTING VOLUME
• LV starting point aligned (pe_start)
• PV aligned to sectors on disk
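Both conditions can be checked together by adding the partition start and pe_start to get the absolute position of LV data on the disk. The sketch below assumes 512B logical sectors; the function names and the sample pe_start values are hypothetical, chosen only to show one good case and two bad ones. On a real system the value can be listed with pvs -o +pe_start.

```python
SECTOR_SIZE = 512  # assumed logical sector size, in bytes


def lv_data_offset_bytes(partition_start_lba, pe_start_bytes, sector_size=SECTOR_SIZE):
    """Absolute byte offset on the disk where LV data begins."""
    return partition_start_lba * sector_size + pe_start_bytes


def is_multiple(offset_bytes, unit_bytes):
    """True if offset_bytes falls exactly on a unit_bytes boundary."""
    return offset_bytes % unit_bytes == 0


# Hypothetical examples, not measured values.
good = lv_data_offset_bytes(2048, 1024 * 1024)   # 1MB-aligned partition, 1MB pe_start
legacy = lv_data_offset_bytes(63, 192 * 1024)    # DOS-style partition, 192K pe_start
broken = lv_data_offset_bytes(2048, 1000)        # pe_start that is not a sector multiple

print(good, is_multiple(good, 512), is_multiple(good, 4096))
print(legacy, is_multiple(legacy, 512), is_multiple(legacy, 4096))
print(broken, is_multiple(broken, 512))
```

The middle case is the sneaky one: it lines up with 512B sectors but not with 4K boundaries.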
[Figure: sectors 2048 through 2059 holding a Physical Volume; the Logical Volume sits in the PV data region that begins at pe_start]
HOST FILE SYSTEM
HOST FILE SYSTEM
• Not much to do here
• RAID would be a different story…
• Ext is good at picking sane defaults
• Block size
• Smallest unit of data for the filesystem
RESULTING FILESYSTEM
[Figure: the Filesystem sits on the Logical Volume, which sits in the PV data region of the Physical Volume starting at sector 2048]
VMDK GEOMETRY & PARTITIONS
VMDK GEOMETRY/PARTITIONS
• Same principles as host disks
• DOS compatibility sucks
• 1MB alignment is good
• Performance impact of misalignment is bigger than on a host disk
[Figure: VMDK with its MBR and first partition at sector 2048]
VM FILE SYSTEM
VM FILE SYSTEM
• Don’t use RAID/LVM in VMs
• Unless you really need it for some reason
• Or if you did a physical-to-virtual (P2V) migration
[Figure: VM File System starting at VMDK sector 2048]
VM ALIGNMENT
PERFECTLY ALIGNED VM
[Figure: the VM File System at VMDK sector 2048, stacked on the host Filesystem, Logical Volume and Physical Volume at host sector 2048]
PERFECTLY ALIGNED VM
VM FS Block: 4096
VMDK Sectors: 512 512 512 512 512 512 512 512
Host FS Block: 4096
LVM PE*: 1024 1024 1024 1024
Host Disk Blocks: 512 512 512 512 512 512 512 512
* PE shown as 1K for example
MISALIGNED VM
[Figure: one 4096B VM FS block (8 × 512B VMDK sectors) straddling two 4096B host FS blocks]
• VM disk image sits across two host FS blocks, thus requiring more reads of the host disks to get all the data
• Reading 4096B of VM data requires reading 8192B from the host disks
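The extra reads are easy to count. The Python sketch below assumes 4096B host FS blocks and treats the guest block's position as a plain byte offset into the host filesystem; the function names are invented for the example.

```python
HOST_BLOCK_SIZE = 4096  # assumed host filesystem block size, in bytes


def host_blocks_touched(offset_bytes, length_bytes, host_block_size=HOST_BLOCK_SIZE):
    """Number of host FS blocks overlapped by a read of length_bytes at offset_bytes."""
    first = offset_bytes // host_block_size
    last = (offset_bytes + length_bytes - 1) // host_block_size
    return last - first + 1


def host_bytes_read(offset_bytes, length_bytes, host_block_size=HOST_BLOCK_SIZE):
    """Bytes the host must read, since it can only read whole blocks."""
    return host_blocks_touched(offset_bytes, length_bytes, host_block_size) * host_block_size


aligned_offset = 2048 * 512     # guest partition at sector 2048
misaligned_offset = 63 * 512    # guest partition at sector 63

print(host_bytes_read(aligned_offset, 4096))      # 4096
print(host_bytes_read(misaligned_offset, 4096))   # 8192
```

The misaligned case doubles the host I/O for every guest block, which is the 4096B versus 8192B result above.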
END GOAL
[Figure: VM File Systems aligned on top of the host Filesystem, Logical Volume and Physical Volume]
MODERN STUFF
BONUS MATERIAL
ADVANCED FORMAT DISKS
• 4K Sectors
• Old: 512B sectors
• New: 4096B (4K) sectors
• Much more efficient with today’s data usage
• 512e Emulation Mode
• Lets old stuff still work with new disks
• Logical (OS): 512B sectors
• Physical (Disk): 4096B sectors
[Figure: logical sectors 64 through 71 map onto physical sector 8]
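ADVANCED FORMAT DISKS & MBR
• Regular disks (512 byte sectors)
• First partition at LBA-63
• Advanced Format (4K sectors) w/ 512e
• First partition still at LBA-63, which falls inside physical 4K sector 7 (logical sectors 56 through 63) instead of on a boundary
• PROBLEM LATER ON
[Figure: logical sectors 0 through 75 laid over 4K physical sectors 0 through 9; LBA-63 is the last logical sector of physical sector 7]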
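The mapping from logical to physical sectors on a 512e disk is plain integer division. The Python sketch below assumes 512B logical and 4096B physical sectors; the function names are made up for the example.

```python
LOGICAL_SECTOR = 512     # what the OS sees on a 512e disk
PHYSICAL_SECTOR = 4096   # what the disk actually stores
LOGICAL_PER_PHYSICAL = PHYSICAL_SECTOR // LOGICAL_SECTOR   # 8


def physical_location(logical_lba):
    """Return (physical sector, position inside it) for a logical sector number."""
    return divmod(logical_lba, LOGICAL_PER_PHYSICAL)


def starts_on_physical_boundary(logical_lba):
    """True if the logical sector is the first one inside its physical sector."""
    return logical_lba % LOGICAL_PER_PHYSICAL == 0


for lba in (63, 64, 2048):
    physical, position = physical_location(lba)
    print(lba, physical, position, starts_on_physical_boundary(lba))
```

LBA-63 lands in the last slot of physical sector 7, so every 4K filesystem block on that partition straddles two physical sectors. LBA-2048 lands exactly at the start of physical sector 256.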
GUID PARTITION TABLE (GPT)
• That new thing that boots your OS
• First 17K of the disk
• Lots of stuff
• On disk: GPT, then alignment space, then the first partition at sector 2048
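Where the 17K comes from, as a small Python sketch. It assumes 512B sectors and the usual minimum GPT layout: one protective MBR sector, one header sector, and an array of 128 partition entries of 128 bytes each. The function name is invented for the example.

```python
SECTOR_SIZE = 512              # assumed logical sector size, in bytes
PROTECTIVE_MBR_SECTORS = 1     # LBA 0
GPT_HEADER_SECTORS = 1         # LBA 1
ENTRY_COUNT = 128              # assumed standard minimum number of entries
ENTRY_SIZE = 128               # bytes per partition entry


def gpt_front_sectors(sector_size=SECTOR_SIZE):
    """Sectors used at the start of the disk by the protective MBR and primary GPT."""
    entry_array_sectors = ENTRY_COUNT * ENTRY_SIZE // sector_size
    return PROTECTIVE_MBR_SECTORS + GPT_HEADER_SECTORS + entry_array_sectors


sectors = gpt_front_sectors()
print(sectors)                          # 34 sectors
print(sectors * SECTOR_SIZE // 1024)    # 17 (KiB)
print(2048 - sectors)                   # alignment space before sector 2048, in sectors
```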
RAID IMPLICATIONS
• If a RAID volume is misaligned, the entire array is affected
• RAID in VMs is BAD!
RAID TERMINOLOGY
• Data Disk
• A disk that has real data (not parity)
• Chunk
• RAID unit of IO (“block”): the data written to one disk before moving to the next
• Often loosely called a “stripe”
• Stride
• Number of filesystem blocks in one chunk
• Stripe Width
• Number of filesystem blocks in one full stripe across all data disks
RAID MATH
• Constants
• DATA_DISKS = 3 (let’s say this is RAID5 with 4 disks)
• BLOCK_SIZE = 4K (from the filesystem)
• CHUNK_SIZE = 512K
• Calculate Stride
• STRIDE = CHUNK_SIZE / BLOCK_SIZE = 512K / 4K = 128 blocks
• Calculate Stripe Width
• STRIPE_WIDTH = STRIDE * DATA_DISKS = 128 * 3 = 384 blocks
• What this means:
• One unit of RAID IO will write 128 blocks (512K) to the first disk, then move on to the next one
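The same math as a Python sketch, using the constants from this slide. The function names are invented for the example; the results are counts of filesystem blocks, the kind of stride and stripe width hints an ext filesystem can be given at creation time.

```python
KIB = 1024

DATA_DISKS = 3             # RAID5 with 4 disks: 3 carry data per stripe
BLOCK_SIZE = 4 * KIB       # filesystem block size, in bytes
CHUNK_SIZE = 512 * KIB     # RAID chunk size, in bytes


def stride_blocks(chunk_size=CHUNK_SIZE, block_size=BLOCK_SIZE):
    """Filesystem blocks written to one disk before moving to the next."""
    if chunk_size % block_size != 0:
        raise ValueError("chunk size should be a multiple of the block size")
    return chunk_size // block_size


def stripe_width_blocks(data_disks=DATA_DISKS, chunk_size=CHUNK_SIZE, block_size=BLOCK_SIZE):
    """Filesystem blocks in one full stripe across all data disks."""
    return stride_blocks(chunk_size, block_size) * data_disks


print(stride_blocks())                              # 128 blocks
print(stripe_width_blocks())                        # 384 blocks
print(stripe_width_blocks() * BLOCK_SIZE // KIB)    # 1536 (KiB of data per full stripe)
```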