Best Practice of HUAWEI OceanStor T Series Solutions for Key VMware Applications
Issue 1.0 (2012-07-06) Huawei Proprietary and Confidential
Copyright © Huawei Technologies Co., Ltd.
Copyright © Huawei Technologies Co., Ltd. 2013. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior
written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and
the customer. All or part of the products, services and features described in this document may not be
within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements,
information, and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Huawei Technologies Co., Ltd.
Address: Huawei Industrial Base
Bantian, Longgang
Shenzhen 518129
People's Republic of China
Website: http://www.huawei.com
Email: [email protected]
About This Document
Purpose
This document describes the best practice for using HUAWEI OceanStor T series on the
VMware vSphere 5.0 platform, including network design, configuration methods, and
performance tuning for storage systems and VMware virtual machines (VMs). The best
practice in this document can meet the requirements of configuration optimization in a variety
of application scenarios.
Intended Audience
This document is intended for:
Marketing engineers
Technical support engineers
Maintenance engineers
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Symbol Description
DANGER: Indicates an imminently hazardous situation which, if not
avoided, will result in death or serious injury.
WARNING: Indicates a potentially hazardous situation which, if not
avoided, could result in death or serious injury.
CAUTION: Indicates a potentially hazardous situation which, if not
avoided, may result in minor or moderate injury.
NOTICE: Indicates a potentially hazardous situation which, if not
avoided, could result in equipment damage, data loss,
performance deterioration, or unanticipated results.
NOTICE is used to address practices not related to personal
injury.
NOTE: Calls attention to important information, best practices and
tips.
NOTE is used to address information not related to personal
injury, equipment damage, and environment deterioration.
Contents
About This Document
1 Overview
2 Component Introduction
2.1 Introduction to HUAWEI OceanStor T Series Storage Systems
2.2 Introduction to VMware vSphere
3 Storage Network Design
3.1 Reliability Design
3.2 Bandwidth Design
3.3 Load Balancing Design
3.4 Network Design
4 Optimizing Storage System Configuration
4.1 Configuration Process
4.2 Selecting Disks
4.3 Configuring RAID Groups
4.3.1 RAID Group Levels
4.3.2 RAID Group Capacity
4.3.3 Hot Spare Disk and Reliability
4.3.4 RAID Group Performance Evaluation
4.4 LUN Configuration
4.4.1 Owning Controller
4.4.2 Stripe Depth
4.4.3 Prefetch Policies
4.4.4 Write-Back Policies
4.5 Host Mappings
5 VM Configuration Optimization
5.1 VMFS and RDM
5.2 Suggestion on VMFS Volume Configuration
5.2.1 Suggestion on Capacity Configuration
5.2.2 VMFS Volume Expansion
5.2.3 Exclusive Volume or Shared Volume
5.3 Suggestion on Virtual Disk Configuration
5.3.1 Choosing a Virtual Disk Format
5.3.2 Virtual Disk Modes
5.3.3 SCSI Bus Sharing Methods
5.3.4 Configuring Partition Alignment
5.4 Suggestion on RDM Configuration
5.5 Configuring I/O Queue Depth
5.6 DRS and HA Are Recommended
6 Summary
A Glossary
1 Overview
As virtualization matures, more and more companies are deploying applications on VMware
VMs. HUAWEI OceanStor T series storage systems have been optimized for VMware VMs and
offer comprehensive solutions for the infrastructure, key service applications, and backup
and disaster recovery of virtual data centers. This document describes how to select and
deploy a storage system on the VMware vSphere 5.0 platform, covering storage system
selection, storage network design, storage system configuration optimization, and
configuration optimization of the VMware vSphere platform.
If you are deploying storage systems on the VMware platform, you are advised to configure
them as recommended in this document to achieve the best performance and reliability.
2 Component Introduction
About This Chapter
2.1 Introduction to HUAWEI OceanStor T Series Storage Systems
2.2 Introduction to VMware vSphere
2.1 Introduction to HUAWEI OceanStor T Series Storage Systems
Targeted at high-end storage applications, HUAWEI OceanStor T series is built on
industry-leading hardware, a high-density disk design, TurboModule high-density I/O modules,
and a hot-swappable design. In addition, HUAWEI OceanStor T series integrates a variety of
advanced technologies, including the TurboBoost three-level performance acceleration
technology and multi-data redundancy technology. These technologies not only meet the
requirements of large-database OLTP/OLAP, high-performance computing, digital media,
Internet operation, integrated storage, backup, disaster recovery, and data transfer, but also
ensure service security and continuity.
Figure 2-1 HUAWEI OceanStor T series storage systems
More importantly, Huawei S5500T/S5600T/S5800T/S6800T comes with such features as high
performance, high reliability, high scalability, and low power consumption.
Table 2-1 Features provided by HUAWEI OceanStor T series storage systems
High Performance
SmartCache: The SmartCache resource pool consists of one or more SSDs. By collecting
real-time information about the access frequency of the data blocks in the storage system,
SmartCache dynamically moves frequently accessed (hotspot) data blocks from the HDDs to the
SmartCache resource pool. Because an SSD has a faster access speed, SmartCache improves the
read performance and the access efficiency of the host.
Cache intelligent prefetch: Cache intelligent prefetch can identify the current I/O sequence
and enable or disable the cache prefetch function based on different service models. Based on
different application scenarios, you can set the optimized prefetch length by using cache
intelligent prefetch. In addition to significantly improving the host read performance, cache
intelligent prefetch reduces disk access frequency, prolonging disk lifespan.
Dual-controller dynamic load balancing: Working in active-active mode, the two controllers
can concurrently process the I/O requests from the host and share the data storage loads. This
avoids the situation in which one controller is overloaded while the other is left unused.
Dual-controller dynamic load balancing not only reduces the load on a single controller, but
also uses system resources more effectively, improving both system efficiency and performance.
VAAI VM performance acceleration: The VMware VAAI technology is supported. VAAI can
significantly improve the usage effectiveness of the storage space in S5600T on the VMware
platform and balance loads between the server on the VMware platform and the storage system,
reducing load on the host and improving storage efficiency.
High Reliability
Built-in BBUs+data coffers: The built-in BBUs+data coffers design is used. Small, low-cost,
redundant, and hot-swappable, the built-in BBUs can supply power to the controllers and the
coffer after the external power fails. Data in caches can then be written to disks after the
external power failure, so data integrity and reliability are protected.
Disk pre-copy: After obtaining firsthand disk status information by using the disk prediction
technology, disk pre-copy uses pre-copy algorithms to analyze the disk running status and
calculate the probability of disk failure. If some disks are predicted to fail, disk pre-copy
copies the data on those disks to a hot spare disk. This prediction not only shortens or
eliminates reconstruction after a disk failure but also reduces the possibility of another
disk failing during reconstruction, improving storage security.
Critical data protection: HyperImage (virtual snapshot), HyperCopy (LUN copy), HyperMirror/S
(synchronous copy), and HyperMirror/A (asynchronous copy) are used to meet the requirements
of backup, disaster recovery, and data transfer.
2.2 Introduction to VMware vSphere
By using the virtualization products offered by VMware, you can run multiple operating
systems on one physical machine. For each operating system, you can set virtual partitions and
configurations and switch from one operating system to another.
VMware vSphere is a suite released by VMware for data centers. VMware vSphere transforms
data centers into a simplified cloud computing infrastructure by using virtualization, enabling
IT departments to provide flexible and reliable IT services. In addition to virtualizing and
integrating basic physical hardware resources, VMware vSphere provides abundant virtual
resources for data centers.
VMware vSphere consists of the following component layers:
Table 2-2 VMware vSphere components
Name Description
Infrastructure service: The infrastructure service is a set of services used for collecting,
integrating, and allocating hardware infrastructure resources. It includes vCompute, vStorage,
and vNetwork, which are responsible for virtualizing computing resources, storage resources,
and network resources respectively and integrating the resources in a virtual environment for
unified management.
VMware vCenter Server: VMware vCenter Server provides a unified management platform
for data centers and basic data center services, such as access control, performance
monitoring, and configuration.
Client: You can access the VMware vSphere data center management platform using vSphere
Client or Web Access (on a Web browser).
In addition to addressing the complexity, low efficiency, and inflexibility of data center
deployment, VMware reduces the cost of physical infrastructure, reduces the operating
expenses of data centers, and improves work efficiency, flexibility, and response speed.
VMware vSphere also provides a number of high-availability technologies, such as HA, DRS,
and FT. The latest version, VMware vSphere 5.0, provides powerful capabilities in
infrastructure virtualization and extension.
3 Storage Network Design
About This Chapter
3.1 Reliability Design
3.2 Bandwidth Design
3.3 Load Balancing Design
3.4 Network Design
3.1 Reliability Design
A SAN design must provide link redundancy, switch redundancy, and controller redundancy
to prevent single points of failure (SPOFs). Figure 3-1 shows a typical network diagram used
by OceanStor T series in a virtual environment. In the network, there are at least two data
paths between each host and LUN, and the paths pass through different switches and
controllers.
Figure 3-1 Typical network diagram used by OceanStor T series in a virtual environment
3.2 Bandwidth Design
When designing a SAN, you must choose the proper OceanStor T series storage systems
based on the bandwidth requirements of application systems (for details about bandwidth
performance, see the white papers and brochures of Huawei storage systems) and configure
enough physical channels for the ESX cluster to prevent the front-end links from becoming a
performance bottleneck.
3.3 Load Balancing Design
A SAN design must support load balancing across controllers and links. Figure 3-1 shows a
typical network in which LUNs are allocated equally to the controllers. The proper
multipathing configuration for OceanStor T series must be selected on the ESX server, and the
Fixed mode is recommended as the VM path selection policy to ensure link redundancy and
load balancing on transmission paths.
3.4 Network Design
FC network
An FC network is a relatively mature storage network for the VM platform. To connect
ESX servers to storage systems, HBAs must be configured on each ESX host. Each HBA
has a WWN as its unique identifier. The following configurations are recommended for
an FC network:
− When using an FC network, use HBAs with at least two ports to ensure link
redundancy.
− You are advised to use only the FC Zone function and to assign links of the same type
to the same FC Zone to prevent cross-network effects.
iSCSI network
Both HUAWEI OceanStor and VMware vSphere support 10GE network configuration.
Because the performance of a single port is improved, the number of network ports can be
considerably reduced, especially for blade servers.
The following configurations are recommended for an iSCSI network:
− On the ESX server, the flow control function can be configured for each network port.
You are advised to disable the flow control function to maximize the performance of
the storage system.
− HUAWEI OceanStor T series and VMware vSphere 5.0 support jumbo frames. You
are advised to enable the jumbo frame function to significantly improve network
performance. (The network switch also needs to support jumbo frames.)
− Because the iSCSI network may conflict with the management network, you are
advised to separate the iSCSI network from the management network by configuring
them in different network segments or VLANs.
4 Optimizing Storage System Configuration
About This Chapter
4.1 Configuration Process
4.2 Selecting Disks
4.3 Configuring RAID Groups
4.4 LUN Configuration
4.5 Host Mappings
4.1 Configuration Process
Figure 4-1 shows the configuration process of HUAWEI OceanStor T series, and this section
describes configuration optimization from the following perspectives:
Figure 4-1 Storage system configuration process
Select disks -> Configure RAID groups -> Configure LUNs -> Configure host mappings ->
Configure and use LUNs on the VMware platform
4.2 Selecting Disks
Because OceanStor T series supports a variety of disk types and capacities and supports the
mixed use of different disks, you can choose proper disks for VM storage based on the
requirements of your services. Table 4-1 lists empirical random-access IOPS values for small
data blocks on some disk types and capacities; the values can serve as references.
Table 4-1 Disk capacity and performance
Disk Type        Capacity                    Random IOPS    Application Scenario
SATA 7.2k rpm    1 TB / 2 TB                 30 - 60        Backup and archiving
SAS 15k rpm      300 GB / 450 GB / 600 GB    100 - 200      Video, file service, and database
SATA SSD         50 GB / 100 GB / 200 GB     1500 - 2500    Database and email service
4.3 Configuring RAID Groups
4.3.1 RAID Group Levels
You need to configure RAID groups based on the data characteristics of different
transactions. Table 4-2 lists the suggestions.
Table 4-2 RAID group configuration for different transactions
Transaction Category                        Data Characteristic                            Configured RAID Level    Configured Disk Type
Sequential I/Os in transaction log disks    Sequential I/Os, requiring high reliability    RAID 10                  SAS/FC
OLTP database data; Exchange Server data    Database                                       RAID 10                  SAS/FC
Backup and archiving                        Large capacity                                 RAID 5/RAID 6            SATA
Java applications and Web applications      Non-dense I/Os                                 RAID 5                   SAS/FC
File services and video applications        Random big I/Os                                RAID 5                   SAS/FC
VM boot disk                                Low I/O load, requiring swift response         RAID 5                   SAS/FC
CAUTION
Do not use RAID 3 except for large-block sequential reads, such as non-linear editing.
Do not use RAID 0 unless otherwise specified.
4.3.2 RAID Group Capacity
Recommended configuration:
The number of disks in a RAID group should range from 5 to 12. If the number is
smaller than 5, the performance of the RAID group is relatively poor, more RAID groups
are required, and the maintenance cost increases. For a RAID 5, RAID 6, or RAID 3
group, a small number of disks may also waste storage space. Conversely, if the number
is greater than 12, the RAID group has more member disks involved when it is
reconstructed after a disk failure, so the probability of multi-disk failure becomes higher
and the system reliability decreases accordingly.
An odd number of disks is recommended for a RAID 5 group, and five or nine disks are
preferred.
An even number of disks is recommended for a RAID 6 group, and six or ten disks are
preferred.
Dual-disk mirroring is recommended for a RAID 10 group except for applications
that demand ultra-high reliability.
Eight or twelve disks are recommended for a RAID 10 group with dual-disk mirroring.
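The sizing guidance above can be expressed as a small check. The following is a minimal sketch only: the function name and the level labels ("RAID5", "RAID6", "RAID10") are hypothetical and are not part of any OceanStor tool.

```python
from typing import List


def check_raid_group(level: str, disks: int) -> List[str]:
    """Return warnings for a proposed RAID group; an empty list means it matches the guidance."""
    warnings = []
    # General rule: 5 to 12 member disks per RAID group.
    if not 5 <= disks <= 12:
        warnings.append("disk count is outside the recommended range of 5 to 12")
    # Level-specific preferences from the recommendations above.
    if level == "RAID5" and disks % 2 == 0:
        warnings.append("an odd disk count (preferably 5 or 9) is recommended for RAID 5")
    if level == "RAID6" and disks % 2 == 1:
        warnings.append("an even disk count (preferably 6 or 10) is recommended for RAID 6")
    if level == "RAID10" and disks % 2 == 1:
        warnings.append("dual-disk mirroring requires an even disk count for RAID 10")
    return warnings


print(check_raid_group("RAID5", 9))   # matches the guidance
print(check_raid_group("RAID6", 13))  # too many disks and an odd count
```

An empty result means the proposed group is consistent with the recommendations; it does not replace the reliability and cost considerations described above.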
4.3.3 Hot Spare Disk and Reliability
Any disk may fail in use. Therefore, for data security, you must create hot spare disks for an
OceanStor storage system and determine the number of hot spare disks based on the reliability
requirement, maintenance cost, and number of RAID groups. You are advised to configure at
least one hot spare disk in each disk enclosure.
4.3.4 RAID Group Performance Evaluation
The random read and write performance of a RAID group can be evaluated by the random
performance of a single disk, and the bandwidth of a RAID group is determined by the
front-end host channels and back-end storage channels. The following formulas are used to
compute the random performance of different RAID groups. You can refer to these formulas
in actual application scenarios.
RIOPS_RAID indicates the random read IOPS performance of a RAID group.
WIOPS_RAID indicates the random write IOPS performance of a RAID group.
IOPS_DISK indicates the random IOPS performance of a single disk (with no notable
difference between HDD random read and write performance).
RMBPS_RAID indicates the sequential read bandwidth performance of a RAID group.
WMBPS_RAID indicates the sequential write bandwidth performance of a RAID group.
MBPS_PATH indicates the bandwidth of the back-end channels of a RAID group.
MBPS_DISK indicates the sequential bandwidth performance of a single disk.
N stands for the number of member disks in a RAID group (5 ≤ N ≤ 12). The formulas
assume that the front-end channels are not the performance bottleneck.
RAID 0
RAID 0: a striped volume of hard disks. The sequential bandwidth is the smaller of the
channel bandwidth and the total bandwidth of all disks. The random read and write
performance is equal to the total random performance of all disks.
RMBPS_RAID0 = WMBPS_RAID0 = MIN (MBPS_PATH, MBPS_DISK x N)
RIOPS_RAID0 = WIOPS_RAID0 = IOPS_DISK x N
RAID 10
RAID 10: a stripe of mirrored hard disks.
RMBPS_RAID10 = MIN (MBPS_PATH, MBPS_DISK x N)
WMBPS_RAID10 = 1/2 x MIN (MBPS_PATH, MBPS_DISK x N)
RIOPS_RAID10 = IOPS_DISK x N
WIOPS_RAID10 = 1/2 x IOPS_DISK x N
RAID 5
RAID 5: block-level striping with parity data distributed across all member disks.
RMBPS_RAID5 = MIN (MBPS_PATH, MBPS_DISK x N)
WMBPS_RAID5 = (N-1)/N x MIN (MBPS_PATH, MBPS_DISK x N)
RIOPS_RAID5 = IOPS_DISK x N
WIOPS_RAID5 = 1/4 x IOPS_DISK x N
RAID 6
RAID 6: block-level striping with two sets of parity data distributed across all member
disks.
RMBPS_RAID6 = MIN (MBPS_PATH, MBPS_DISK x N)
WMBPS_RAID6 = (N-2)/N x MIN (MBPS_PATH, MBPS_DISK x N)
RIOPS_RAID6 = IOPS_DISK x N
WIOPS_RAID6 = 1/6 x IOPS_DISK x N
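The bandwidth and IOPS formulas above can be collected into one helper for quick estimates. This is a sketch under stated assumptions: the function name, the dictionary keys, and the sample inputs (a 6-disk RAID 6 group, 150 IOPS and 100 MB/s per disk, a 400 MB/s back-end path) are illustrative and do not come from this document.

```python
def raid_estimates(level, n, iops_disk, mbps_disk, mbps_path):
    """Estimate sequential bandwidth (MB/s) and random IOPS of an HDD RAID group."""
    # Fraction of the sequential bandwidth available for writes, per RAID level.
    write_bw = {"RAID0": 1.0, "RAID10": 1 / 2, "RAID5": (n - 1) / n, "RAID6": (n - 2) / n}
    # Fraction of the aggregate disk IOPS available for random writes, per RAID level.
    write_iops = {"RAID0": 1.0, "RAID10": 1 / 2, "RAID5": 1 / 4, "RAID6": 1 / 6}
    # Sequential bandwidth is limited by the back-end path or by the disks, whichever is lower.
    seq = min(mbps_path, mbps_disk * n)
    total_iops = iops_disk * n
    return {
        "read_mbps": seq,
        "write_mbps": write_bw[level] * seq,
        "read_iops": total_iops,
        "write_iops": write_iops[level] * total_iops,
    }


# Hypothetical 6-disk RAID 6 group.
estimate = raid_estimates("RAID6", n=6, iops_disk=150, mbps_disk=100, mbps_path=400)
for name, value in estimate.items():
    print(f"{name}: {value:.1f}")
```

As in the formulas, the helper assumes that the front-end channels are not the bottleneck and that disk read and write IOPS are the same, so it applies to HDDs only.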
CAUTION
The preceding random IOPS formulas are not applicable to SSDs, because the random read
performance and random write performance of an SSD differ notably. For rough computation,
IOPS_DISK in the RIOPS_RAID and WIOPS_RAID formulas can be replaced with RIOPS_SSD and
WIOPS_SSD respectively. Note that because SSDs have high random performance, the maximum
IOPS of the storage system must be considered when you are computing the random performance
of a RAID group.
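The SSD adjustment described in this caution can be sketched as follows. All numbers are hypothetical: the per-SSD read and write IOPS and the system-wide IOPS ceiling are placeholders, not measured values for any OceanStor model.

```python
def ssd_raid_iops(level, n, riops_ssd, wiops_ssd, system_max_iops):
    """Rough random IOPS of an SSD RAID group, capped by the storage system's maximum IOPS."""
    # Same write factors as in the HDD formulas.
    write_factor = {"RAID0": 1.0, "RAID10": 1 / 2, "RAID5": 1 / 4, "RAID6": 1 / 6}[level]
    # Use the SSD read IOPS for reads and the SSD write IOPS for writes,
    # then apply the system-wide ceiling to each result.
    read = min(riops_ssd * n, system_max_iops)
    write = min(wiops_ssd * n * write_factor, system_max_iops)
    return read, write


# Hypothetical 5-disk RAID 5 group of SSDs behind a system limited to 10000 IOPS.
read_iops, write_iops = ssd_raid_iops("RAID5", 5, riops_ssd=2500, wiops_ssd=1500,
                                      system_max_iops=10000)
print(read_iops, write_iops)
```

In this example the read estimate is limited by the assumed system ceiling rather than by the disks, which is the situation the caution warns about.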
The following example uses the preceding formulas to compute the bandwidth and IOPS of a
9-disk RAID 5 group and an 8-disk RAID 10 group.
Suppose IOPS_DISK = 200, MBPS_DISK = 150 MB/s, and the back-end storage uses one 4 Gbit/s
SAS/FC loop. Then MBPS_PATH = 4 x 1000 x 1/8 x 8/10 = 400 MB/s (1/8 is the conversion
from bits to bytes, and 8/10 is the bandwidth loss in the 8 bit/10 bit conversion).
For a 9-disk RAID 5 group:
RMBPS_RAID5 = MIN (400, 150 x 9) = 400 MB/s
WMBPS_RAID5 = 8/9 x MIN (400, 150 x 9) ≈ 355 MB/s
RIOPS_RAID5 = 200 x 9 = 1800
WIOPS_RAID5 = 1/4 x 200 x 9 = 450
For an 8-disk RAID 10 group:
RMBPS_RAID10 = MIN (400, 150 x 8) = 400 MB/s
WMBPS_RAID10 = 1/2 x MIN (400, 150 x 8) = 200 MB/s
RIOPS_RAID10 = 200 x 8 = 1600
WIOPS_RAID10 = 1/2 x 200 x 8 = 800
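The back-end path conversion used in this example can be checked with a few lines of code. This is a sketch: the function name is hypothetical, and it assumes, as the example does, 1 Gbit/s = 1000 Mbit/s and 8 bit/10 bit encoding on the link.

```python
def path_mbps(line_rate_gbps):
    """Usable bandwidth (MB/s) of one back-end loop with 8 bit/10 bit encoding."""
    mbit_per_s = line_rate_gbps * 1000   # Gbit/s to Mbit/s
    mbyte_per_s = mbit_per_s / 8         # bits to bytes
    return mbyte_per_s * 8 / 10          # 8 payload bits carried per 10 line bits


mbps_path = path_mbps(4)
print(mbps_path)

# Sequential write bandwidth of the 9-disk RAID 5 group from the example.
n, mbps_disk = 9, 150
raid5_write = (n - 1) / n * min(mbps_path, mbps_disk * n)
print(round(raid5_write, 1))
```

The RAID 5 write figure evaluates to about 355.6 MB/s, which the worked example rounds down to 355 MB/s.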
4.4 LUN Configuration
4.4.1 Owning Controller
The OceanStor T series storage systems use two controllers that work in the active-active
mode, and each LUN has its owning controller. Under normal circumstances, the I/O
operations on a LUN are processed by its owning controller, and the other controller takes
over services only when the host access path or the owning controller fails.
You are advised to distribute heavily loaded LUNs evenly across both controllers to ensure
load balancing and improve application performance.
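One simple way to plan an even distribution is a greedy assignment: sort the LUNs by expected load and give each one to the controller that currently has the smaller total. This is an illustrative planning sketch only; the controller labels "A" and "B", the LUN names, and the load figures are hypothetical, and the owning controller itself is still set through the storage management software.

```python
def assign_owning_controllers(lun_loads):
    """Greedily assign LUNs to controllers "A" and "B" so that the total loads stay close."""
    totals = {"A": 0.0, "B": 0.0}
    assignment = {}
    # Place the heaviest LUNs first; each goes to the controller with the lower running total.
    for lun, load in sorted(lun_loads.items(), key=lambda item: item[1], reverse=True):
        controller = min(totals, key=totals.get)
        assignment[lun] = controller
        totals[controller] += load
    return assignment, totals


# Hypothetical expected loads in IOPS per LUN.
loads = {"lun0": 900, "lun1": 700, "lun2": 400, "lun3": 200}
assignment, totals = assign_owning_controllers(loads)
print(assignment)
print(totals)
```

A greedy plan like this is not guaranteed to be optimal, but it is usually close enough for deciding which LUNs should share an owning controller.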
4.4.2 Stripe Depth
OceanStor T series storage systems offer multiple striping policies that bring great flexibility
in application configuration. The LUN stripe depth must be determined based on the I/O size.
Table 4-3 lists the typical configurations.
Table 4-3 Typical stripe depths
RAID Level    Stripe Depth
RAID 0        256 KB / 512 KB
RAID 5        64 KB / 128 KB
RAID 10       256 KB / 512 KB
In addition, you must take the host-side stripe into consideration when configuring stripes.
For example, if you use ASM disks (whose default stripe depth is 1 MB) in Oracle databases,
you must set the stripe depth of the storage system to a value that divides 1 MB exactly.
A stripe depth of 256 KB is recommended for an 8-disk RAID 10 group, and 128 KB is
recommended for a 9-disk RAID 5 group.
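The arithmetic behind these two recommendations can be checked as follows. The interpretation is an assumption rather than something the document states: with the recommended depths, the data portion of one full stripe works out to exactly 1 MB, the same as the ASM stripe. The function names are hypothetical.

```python
def data_disks(level, disks):
    """Number of disks that hold data (not mirror copies or parity) in one stripe."""
    return {"RAID0": disks, "RAID10": disks // 2, "RAID5": disks - 1, "RAID6": disks - 2}[level]


def full_stripe_kb(level, disks, depth_kb):
    """Data carried by one full stripe, in KB."""
    return data_disks(level, disks) * depth_kb


def divides_host_stripe(depth_kb, host_stripe_kb=1024):
    """True if the host-side stripe is a whole multiple of the array stripe depth."""
    return host_stripe_kb % depth_kb == 0


for level, disks, depth in (("RAID10", 8, 256), ("RAID5", 9, 128)):
    print(level, disks, depth, full_stripe_kb(level, disks, depth), divides_host_stripe(depth))
```

Both recommended configurations pass the divisibility check and yield a 1024 KB full stripe, so each 1 MB host-side stripe maps onto one full stripe of the RAID group.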
4.4.3 Prefetch Policies
OceanStor T series storage systems offer four prefetch policies, which cover most
applications. For highly random workloads, configure non-prefetch.
Constant prefetch: prefetches a constant amount of data from hard disks on a cache miss.
It applies to large-block sequential reads.
Multiplied prefetch: prefetches an amount of data that is a multiple of the missed I/O
size from hard disks on a cache miss. It applies to small-block sequential reads.
Intelligent prefetch: automatically determines whether to prefetch data according to the
load characteristics on a cache miss. OceanStor T series intelligently calculates the amount
of data to be prefetched. Intelligent prefetch applies to most applications.
Non-prefetch: does not prefetch data. It applies to random services.
Intelligent prefetch is recommended for common OLTP database applications.
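The difference between the constant, multiplied, and non-prefetch policies can be illustrated with a small sketch. The policy labels, the 512 KB constant size, and the multiplier of 4 are hypothetical values chosen for illustration; intelligent prefetch is not modeled because the array decides its amount internally.

```python
def prefetch_kb(policy, missed_io_kb, constant_kb=512, multiplier=4):
    """Amount of data (KB) prefetched after a cache miss under a simplified policy model."""
    if policy == "none":
        return 0                          # non-prefetch: nothing is read ahead
    if policy == "constant":
        return constant_kb                # fixed amount, independent of the I/O size
    if policy == "multiplied":
        return missed_io_kb * multiplier  # scales with the size of the missed I/O
    raise ValueError(f"policy not modeled: {policy}")


for policy in ("none", "constant", "multiplied"):
    print(policy, prefetch_kb(policy, missed_io_kb=8))
```

With small I/Os, a multiplied policy reads ahead far less than a large constant policy, which is consistent with recommending it for small-block sequential reads.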
4.4.4 Write-Back Policies
OceanStor T series provides sound power failure protection and uses the write-back
technology, which greatly shortens I/O delay and improves application performance. The
write-back with mirroring mechanism applies to most applications. For applications that
demand high reliability, use write-through rather than write-back without mirroring for LUNs.
Write-back with mirroring: After receiving a write I/O request from the host, the current
controller writes this I/O request into the cache of the other controller, then writes this
I/O request into its own cache, and then notifies the host that the write I/O operation is
completed. The current controller then flushes the cached data to hard disks according to its
flush policy, ensuring that the cache always has enough space for new write I/O requests. The
write-back with mirroring mechanism ensures data reliability when a controller fails.
Write-through: After receiving a write I/O request from the host, the current controller
first writes this I/O request into its memory, then writes this I/O request to disk, and
then notifies the host that the write I/O operation is completed. The write-through
mechanism applies to application scenarios that demand ultra-high data reliability.
Write-back without mirroring: After receiving a write I/O request from the host, the
current controller first writes this I/O request into its memory, and then notifies the host
that the write I/O operation is completed. Because the write-back without mirroring
mechanism may cause data loss, it is not recommended for LUNs.
Write-back with mirroring is recommended to ensure high performance and high reliability of
storage systems.
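The three policies differ in where the data resides at the moment the host is told the write is complete. The Python sketch below is a simplified model of the descriptions above, not the storage system's logic. It records those locations and checks whether an acknowledged write would survive the loss of the owning controller; the location labels are illustrative names.

```python
from enum import Enum


class WritePolicy(Enum):
    WRITE_BACK_MIRRORED = "write-back with mirroring"
    WRITE_THROUGH = "write-through"
    WRITE_BACK_UNMIRRORED = "write-back without mirroring"


OWNER_CACHE = "owning controller cache"

# Where a write resides when the host receives the completion notice,
# in the order described in the text.
COPIES_AT_ACK = {
    WritePolicy.WRITE_BACK_MIRRORED: ("peer controller cache", OWNER_CACHE),
    WritePolicy.WRITE_THROUGH: (OWNER_CACHE, "disk"),
    WritePolicy.WRITE_BACK_UNMIRRORED: (OWNER_CACHE,),
}


def survives_owner_failure(policy):
    """True if a copy exists outside the owning controller's cache."""
    return any(location != OWNER_CACHE for location in COPIES_AT_ACK[policy])


for policy in WritePolicy:
    print(policy.value, survives_owner_failure(policy))
```

Only write-back without mirroring leaves an acknowledged write with a single volatile copy, which is why the text does not recommend it for LUNs.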
4.5 Host Mappings
You can map LUNs to a host or to a host group. Consider the following situations when
configuring mappings:
1. If such high-availability functions as HA and DRS are enabled on the VMware platform,
all ESX hosts must be able to see the same storage system. Therefore, you need to add all
ESX hosts to the host group and map LUNs to the host group.
2. If some host cluster functions need to use quorum disks, such as the Oracle RAC quorum
disk and the Microsoft cluster quorum disk, multiple ESX hosts must be able to access the
same disk. Therefore, you need to add the ESX hosts to the host group and map LUNs to the
host group.
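The requirement that every ESX host in a cluster sees the same LUNs can be checked with a simple set comparison. The Python sketch below assumes you have already collected, by whatever means, which LUNs each host can see. The host and LUN names are hypothetical, and the function does not query any storage system or vCenter.

```python
def hosts_missing_luns(cluster_hosts, visible_luns, required_luns):
    """Return {host: [missing LUNs]} for hosts that cannot see every shared LUN."""
    missing = {}
    for host in cluster_hosts:
        gap = set(required_luns) - set(visible_luns.get(host, ()))
        if gap:
            missing[host] = sorted(gap)
    return missing


# Hypothetical inventory: esx03 was mapped individually and missed one LUN.
visible = {
    "esx01": ["lun_ds1", "lun_ds2", "lun_quorum"],
    "esx02": ["lun_ds1", "lun_ds2", "lun_quorum"],
    "esx03": ["lun_ds1", "lun_ds2"],
}
result = hosts_missing_luns(["esx01", "esx02", "esx03"], visible,
                            ["lun_ds1", "lun_ds2", "lun_quorum"])
print(result)  # {'esx03': ['lun_quorum']}
```

Mapping LUNs to a host group rather than to individual hosts avoids this kind of drift, because every member of the group receives the same mappings.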
5 VM Configuration Optimization
About This Chapter
5.1 VMFS and RDM
5.2 Suggestion on VMFS Volume Configuration
5.3 Suggestion on Virtual Disk Configuration
5.4 Suggestion on RDM Configuration
5.5 Configuring I/O Queue Depth
5.6 DRS and HA Are Recommended
5.1 VMFS and RDM
As a high-performance clustered file system, VMware Virtual Machine File System (VMFS)
enables multiple VMs to access an integrated clustered storage pool, significantly
improving resource utilization. By using VMFS, you can create a small number of
large-capacity LUNs and allocate them to different VMs. Figure 5-1 shows a schematic of
VMFS.
Figure 5-1 VMFS schematic drawing
VMware provides two VMFS versions: VMFS-3 and VMFS-5. VMFS-5 improves on VMFS-3, for
example by supporting disks with a larger capacity. You are advised to adopt VMFS-5 to use
its new features.
VMFS-3 | VMFS-5
Supports up to a 2 TB disk volume. | Supports up to a 60 TB disk volume, including Raw Device Mapping (RDM) disks.
Supports master boot record (MBR) partition. | GUID partition table (GPT) supports a bigger capacity.
Data block sizes include 1 MB, 2 MB, 4 MB, and 8 MB. | Has a unified data block size of 1 MB and supports 256 GB or larger files.
The smallest sub-block size is 64 KB. | The sub-block size is 8 KB, occupying smaller space.
The maximum number of files is 30720. | Each volume supports over 100,000 files.
The largest VMDK file size is 2 TB. | The largest VMDK file size is 2 TB.
Supports up to 256 LUNs. | Supports up to 256 LUNs.
Uses the SCSI reservation mechanism to lock the whole LUN. | Uses VAAI hardware-assisted locking to reduce disk access conflicts.
VMFS supports RDM. RDM lets a VM access the physical storage subsystem directly, but the
storage subsystem must use FC or iSCSI. Figure 5-2 shows a schematic of RDM.
VMFS provides a symbolic link that is used in the VM configuration. When the VM
needs to open an RDM device, it first opens the symbolic link. After the symbolic link
file (.vmdk) resolves the address and finds the mapped physical device, subsequent read and
write operations do not have to go through the VMFS volume, and the VM operates on the
physical device directly.
Figure 5-2 RDM schematic drawing
There is no notable difference between the performance of VMFS and RDM. For random I/O
loads, the two deliver similar throughput. For sequential I/O loads, RDM performs
relatively better than VMFS. For details about the performance of VMFS and RDM, see
Performance Characterization of VMFS and RDM Using a SAN.
Application scenarios of VMFS and RDM:
The VMFS is preferred unless otherwise specified.
RDM applies to the following application scenarios:
Choose RDM for P2V or V2P migration.
Choose RDM when physical machines and VMs form a cluster together.
RDM disks with physical compatibility are recommended for VMs that use Microsoft Cluster
Services (MSCS).
5.2 Suggestion on VMFS Volume Configuration
5.2.1 Suggestion on Capacity Configuration
VMFS-5 uses a 1 MB data block and an 8 KB sub-block and supports 256 GB or larger files, with
smaller files occupying less space. A VMFS volume supports a maximum capacity of 60 TB.
You are not advised to create a RAID group with an excessively large capacity, because if a
physical hard disk fails, the RAID group needs to be reconstructed, and the reconstruction
may affect services.
5.2.2 VMFS Volume Expansion
The expansion function allows a VMFS volume to span multiple LUNs. In this application
scenario, LUNs are arranged linearly. Space in the first LUN is used first, and space in the
following LUNs is used only when space in the first LUN is used up. As a result, the VMFS
volume cannot balance I/O load among the LUNs to improve application performance, but it can
simplify storage management and realize thin provisioning at the application layer.
If the application has a light I/O load but demands large storage space, you can create a VMFS
volume across multiple LUNs; if the application has a heavy I/O load, you are advised to
create multiple virtual disks for the VM and assign them to multiple VMFS volumes.
Figure 5-3 VMFS volume expansion
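The linear arrangement described in this section can be illustrated with a short Python sketch. It is a simplified model of the "first LUN first" behavior stated above, not VMFS's actual allocator; the LUN sizes and used capacity are hypothetical.

```python
def extent_usage(used_gb, extent_sizes_gb):
    """Space consumed on each LUN when a spanned volume fills LUNs in order."""
    usage = []
    remaining = used_gb
    for size in extent_sizes_gb:
        taken = min(size, remaining)
        usage.append(taken)
        remaining -= taken
    if remaining > 0:
        raise ValueError("used capacity exceeds the total size of the LUNs")
    return usage


# A volume spanning three 500 GB LUNs with 600 GB in use:
print(extent_usage(600, [500, 500, 500]))  # [500, 100, 0]
```

In this model the third LUN holds no data and serves no I/O, which is why spanning simplifies capacity management but does not spread load.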
5.2.3 Exclusive Volume or Shared Volume
ESX administrators' top concerns may involve how many VMs a VMFS volume is assigned to
and how to assign VMFS volumes to VMs of different applications. A lot of factors must be
considered when determining whether to allow multiple VMs to share a big VMFS volume or
allow each VM to have its own relatively smaller VMFS volume.
Figure 5-4 VMFS exclusive volumes and shared volume
Table 5-1 lists the advantages and disadvantages of the two volume types. Choose the proper
type for your configuration based on management cost, performance, and extensibility.
Table 5-1 Advantages and disadvantages of exclusive VMFS and shared VMFS
Exclusive VMFS | Shared VMFS
Maps one VMFS volume to one VM. | Multiple VMs share one VMFS volume.
Poor resource utilization. | Improved resource utilization.
Deployed in isolation. | Easy deployment.
Requires more management cost. | Requires lower management cost.
Applicable to applications with big I/Os. | Resource competition may exist, compromising I/O performance.
The two types can be combined: disks with high I/O workloads are configured as exclusive
volumes to ensure the performance of applications with high throughput, and shared volumes
are used for the remaining systems.
5.3 Suggestion on Virtual Disk Configuration
5.3.1 Choosing a Virtual Disk Format
The VMFS supports the following virtual disk formats:
Thin: assigns space on demand.
Thick: assigns a fixed amount of space. The Thick format includes the following
sub-formats:
− Zeroed Thick: generates VMDK files with a fixed size and does not write data into
disks.
− Eager Zeroed Thick: generates VMDK files with a fixed size and writes 0 into disks.
Figure 5-5 shows that there is no notable difference between the performance of Thin and
Zeroed Thick. (For details, see Performance Study of VMware vStorage Thin Provisioning.)
Compared with the other two formats, Eager Zeroed Thick has better performance in
sequential write and has no notable performance difference in other I/O modes.
Virtual disks of each format can be expanded to a larger capacity, but no virtual disk can
be shrunk to reduce the space it actually uses.
Figure 5-5 Comparison of the performance of different VMFS disk formats
Remarks: The figure above is from Performance Study of VMware vStorage Thin Provisioning.
Zeroing: All VMFS blocks must be zeroed out before data is written into them.
Post-zeroing: All VMFS blocks have been zeroed out before data is written into them.
1. Choose Thin if capacity is your top concern.
2. Choose Eager Zeroed Thick if performance is your top concern.
3. If the VMFS volume is shared by different ESX hosts, you are advised to choose Eager Zeroed
Thick to prevent faults during VM startup.
5.3.2 Virtual Disk Modes
VMFS supports the following virtual disk modes:
Independent: The disk is not included when a snapshot is created for the VM. The
Independent mode includes the following sub-modes:
− Persistent: The data updates are persistently saved on the virtual disk.
− Non-persistent: The data updates are discarded when the virtual machine is powered
off or the VM snapshot is restored.
Dependent: A snapshot for the virtual disk is created when a snapshot for the VM is
created. In the Dependent mode, the data updates are persistently saved on the virtual
disk.
There is no notable performance difference between the Independent and Dependent modes. The
Non-persistent sub-mode is the exception: VMware creates a REDO file in the root directory
of the VM and saves all write operations to the virtual disk in that file. If the VM is
powered off or the VM snapshot is restored, this file is discarded. This sub-mode therefore
seriously affects virtual disk performance.
The independent mode is the default mode of the system. Do not use the dependent mode
unless otherwise specified.
5.3.3 SCSI Bus Sharing Methods
VMware supports three SCSI bus sharing methods:
None: The virtual disk cannot be shared by VMs.
Virtual: The virtual disk can be shared by the VMs on the same server.
Physical: The virtual disk can be shared by VMs on any server.
Figure 5-6 VMware bus sharing methods
1. If RDM volumes are used and shared by VM clusters on different servers, you are advised to choose
the Physical mode for SCSI bus.
2. If VMFS volumes are used and are shared by VMs on different servers, you can choose either the
Physical mode or the None mode. Use the None mode only in specific application scenarios. For
details, see Disabling Simultaneous Write Protection Provided by VMFS using the Multi-Writer
Flag.
5.3.4 Configuring Partition Alignment
In Linux or Windows, a disk (or LUN) has traditionally been partitioned using the legacy
Cylinder, Head, and Sector (CHS) geometry. When a partition is created, 63 sectors are
reserved at the start of the disk to store the partition structure information and the master
boot record, causing storage layers to be out of alignment and affecting application
performance. A VMFS volume improves on this by reserving 64 KB at the start of the volume
when it is created, but storage layers can still be out of alignment with the data structure
in the storage system.
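The effect of the 63-sector offset can be checked with simple arithmetic. The Python sketch below assumes 512-byte sectors and a 64 KB chunk on the storage system; the chunk size is an assumption chosen for illustration, so use the LUN's real stripe depth. The function names are hypothetical.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def partition_is_aligned(start_sector, boundary_bytes):
    """True if the partition's first byte falls on a boundary."""
    return (start_sector * SECTOR_BYTES) % boundary_bytes == 0


def chunks_touched(start_sector, io_offset_bytes, io_bytes, chunk_bytes):
    """Number of storage chunks covered by one I/O inside the partition."""
    first = start_sector * SECTOR_BYTES + io_offset_bytes
    last = first + io_bytes - 1
    return last // chunk_bytes - first // chunk_bytes + 1


CHUNK = 64 * 1024
print(partition_is_aligned(63, CHUNK))        # False: 63 * 512 = 32256 bytes
print(partition_is_aligned(128, CHUNK))       # True: 128 * 512 = 65536 bytes
print(chunks_touched(63, 0, CHUNK, CHUNK))    # 2
print(chunks_touched(128, 0, CHUNK, CHUNK))   # 1
```

With the legacy offset, a single chunk-sized I/O straddles two chunks, so the storage system does twice the work for the same request.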
Figure 5-7 shows the storage structures of the VMDK, VMFS, and LUN. When the cluster,
block, and chunk are out of alignment, reading or writing one cluster causes multiple blocks
to be read or written. In VMFS-5, the impact on the block layer is negligible, but
performance may still be compromised if the clusters are out of alignment with blocks.
Figure 5-7 Storage structures out of alignment
(Figure layers, top to bottom: clusters in the VMDK file (NTFS), blocks in the VMFS volume,
and chunks in the SAN LUN. An attempt to read one disk cluster may cause reads of up to three
SAN chunks.)
Remarks: The figure above is from Recommendation for Aligning VMFS Partitions. You are
advised to use the fdisk command-line tool in ESX and Linux and the diskpart command-line
tool in Windows to configure partition alignment for disks that require it. For details
about configuring partition alignment, see Recommendation for Aligning VMFS Partitions.
5.4 Suggestion on RDM Configuration
RDM supports two compatibility modes, and the two modes have no notable difference in
performance. Under normal circumstances, you are advised to choose the Physical mode.
Physical compatibility: allows the guest operating system to directly access the hardware.
The disk is not included when a snapshot of the VM is created.
Virtual compatibility: allows the virtual disk to use VMware snapshots and other
advanced capabilities.
5.5 Configuring I/O Queue Depth
Because there are limitations on the I/O queue depth on HBA ports, I/O queue depth of VMs,
and I/O queue depth of LUNs, you are advised to perform the following configuration to
improve system performance:
FC HBA: A single path has a maximum I/O queue depth of 32. Run the esxcfg-module
command to configure the HBA driver and set the maximum queue depth.
Virtual machine: A single VM has a maximum I/O queue depth of 32 by default. Modify
the depth by changing the value of the ESX advanced parameter
Disk.SchedNumReqOutstanding to 64.
Figure 5-8 Adjusting the I/O queue depth of a VM
LUN: A single LUN has a maximum I/O queue depth of 64 by default. Modify the depth by
changing the value of the ESX advanced parameter Disk.SchedQuantum to 256.
Figure 5-9 Modifying the I/O queue depth of a single LUN
A single virtual disk or SCSI controller has a limited I/O queue depth. Create multiple
virtual disks for the VM and use multiple SCSI controllers to increase the utilization ratio
of storage resources.
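Because an I/O has to pass through each of these queues in turn, the smallest limit along the path governs how many I/Os can be outstanding at once. The Python sketch below illustrates this under that assumption. The default values come from the text above; the tuned HBA value of 64 is hypothetical, because the text does not state a target for it.

```python
def effective_queue_depth(hba_path_depth, vm_depth, lun_depth):
    """Outstanding I/Os are bounded by the smallest queue along the path."""
    return min(hba_path_depth, vm_depth, lun_depth)


# Defaults quoted in this section: HBA path 32, VM 32, LUN 64.
default_depth = effective_queue_depth(32, 32, 64)
# After tuning: VM 64 and LUN 256 as advised; HBA 64 is an assumed value.
tuned_depth = effective_queue_depth(64, 64, 256)

print(default_depth)  # 32
print(tuned_depth)    # 64
```

Raising only one of the limits leaves the effective depth unchanged, so the settings need to be adjusted together.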
5.6 DRS and HA Are Recommended
The VMware DRS enables the intelligent allocation of computing resources, and the VMware
HA ensures the reliability of virtualized environment and service continuity. The VMware
DRS provides three levels of automation:
Manual: vCenter presents recommendations for VM placement and migration, and the
administrator applies them.
Partially automated: VMs are automatically placed on hosts, and vCenter presents
recommendations for VM migration.
Fully automated: Based on resource usage, VMs are automatically placed on hosts
and automatically migrated. VMware DRS has five migration thresholds and generates
migration recommendations according to the selected threshold.
In DRS Performance and Best Practices, VMware elaborates on how the VMware DRS
improves the performance. Figure 5-10 shows how the two migration levels improve
performance under an undesirable initial deployment of VMs, and Figure 5-11 shows how the
two DRS migration levels improve performance under a fully balanced initial deployment of
VMs.
You are advised to create an ESX cluster with multiple servers and use the VMware DRS
function to dynamically allocate resources. In addition, you are advised to enable the VMware
HA function to ensure service reliability, performance, and continuity in the virtualized
environment.
Figure 5-10 Performance improvement made by DRS under an undesirable initial deployment of
VMs
Figure 5-11 Performance enhancement made by DRS under a fully balanced initial deployment of
VMs
Remarks: The figures above are from DRS Performance and Best Practices
6 Summary
Huawei is committed to providing customers with quality storage products and solutions, and
HUAWEI OceanStor T series embodies this concept. Based on the VMware platform,
HUAWEI OceanStor T series offers comprehensive solutions and best practice with
high-availability, high-performance, and easy management.
Drawing from the key application solutions on the VMware platform, Huawei integrates its
storage systems with VMware vSphere's high availability and easy management features to
offer integrated architecture. This document describes the best practice of the configuration of
Huawei storage systems (based on VMware vSphere) and can serve as a reference for solution
configuration.
A Glossary
Acronym Full Name
ASM Automatic Storage Management
TCO Total Cost of Ownership
IT Information Technology
DBA Database Administrator
OLTP Online Transaction Processing
OLAP Online Analytical Processing
RAC Real Application Clusters
OCR Oracle Cluster Registry
AU Allocation Unit
FC Fibre Channel
LUN Logical Unit Number
RAID Redundant Array of Independent Disks
SAS Serial Attached SCSI
SATA Serial Advanced Technology Attachment
SSD Solid State Disk
SCSI Small Computer System Interface
ERP Enterprise Resource Planning