h11299 emc vplex elements performance testing best practices wp
8/20/2019 h11299 Emc Vplex Elements Performance Testing Best Practices Wp
White Paper
Abstract
This white paper describes the performance
characteristics, metrics, and testing considerations
for the EMC VPLEX family of products. Its intent is to refine
performance expectations, to review key planning
considerations, and to describe testing best
practices for VPLEX Local, Metro, and Geo. This
paper is not suitable for planning for exceptional
situations. In configuring for performance, every environment is unique and actual results may vary.
EMC VPLEX: ELEMENTS OF PERFORMANCE
AND TESTING BEST PRACTICES DEFINED
Copyright © 2012 EMC Corporation. All Rights Reserved.
EMC believes the information in this publication is accurate as of its publication
date. The information is subject to change without notice.
The information in this publication is provided “as is”. EMC Corporation makes no
representations or warranties of any kind with respect to the information in this
publication, and specifically disclaims implied warranties of merchantability or
fitness for a particular purpose.
Use, copying, and distribution of any EMC software described in this publication
requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation
Trademarks on EMC.com.
All other trademarks used herein are the property of their respective owners.
Part Number h11299
Table of Contents
Executive summary ....................................................................................................... 5
Audience ........................................................................................................................ 5
Introduction .................................................................................................................... 6
Transaction-based workloads................................................................................... 8
Throughput-based workloads ................................................................................... 8
The Role of Applications in Determining Acceptable Performance .................. 8
Section 1: VPLEX Architecture ...................................................................................... 10
VPLEX hardware platform ....................................................................................... 10
VPLEX GeoSynchrony 5.1 System Configuration Limits ....................................... 10
Read/Write IO Limits ............................................................................................... 11
Section 2: VPLEX Performance Highlights .................................................................. 12
Understanding VPLEX overhead ............................................................................ 12
Native vs. VPLEX Local Performance ..................................................................... 12
OLTP Workload Example ......................................................................................... 15
Native vs. VPLEX Metro Performance .................................................................... 15
Native vs. VPLEX Geo Performance ...................................................................... 16
Section 3: Hosts and Front-end Connectivity ............................................................. 17
Host Environment ...................................................................................................... 17
Host Paths .................................................................................................................. 17
Host to director connectivity .................................................................................. 20
Host Path Monitoring ................................................................................................ 22
Policy based path monitoring ................................................................................ 23
VPLEX Real-time GUI Performance Monitoring Stats ........................................... 24
Remote Monitoring and Scripting .......................................................................... 26
Watch4Net ................................................................................................................ 26
Perpetual Logs .......................................................................................................... 26
Benchmarking Applications, Tools and Utilities .................................................... 26
Section 4: Application Performance Considerations ................................................ 31
High Transaction environments .............................................................................. 31
High Throughput environments .............................................................................. 32
VPLEX Device Geometry ......................................................................................... 32
Section 5: Back-end Performance Considerations ................................................... 34
Storage Considerations ........................................................................................... 34
Storage Array Block Size .......................................................................................... 34
SAN Architecture for Storage Array Connectivity ............................................... 34
-
8/20/2019 h11299 Emc Vplex Elements Performance Testing Best Practices Wp
4/64
4EMC VPLEX: ELEMENTS OF PERFORMANCE AND TESTING BEST PRACTICES
DEFINED
Active/Active Arrays ................................................................................................ 35
Active/Passive Arrays .............................................................................................. 38
Additional Array Considerations ............................................................................ 40
Automated Storage Tiering..................................................................................... 40
Performance Metrics for Back-end IO ................................................................... 40
Back-end Connectivity Summary .......................................................................... 41
Section 6: SAN and WAN Performance ...................................................................... 42
SAN Redundancy ..................................................................................................... 42
Redundancy through Cisco VSANs or Brocade Virtual Fabrics ......................... 42
Planning SAN Capacity ........................................................................................... 43
ISL Considerations ..................................................................................................... 43
FC WAN Sizing ........................................................................................................... 44
Brocade switches: .................................................................................................... 44
IP WAN Settings VPLEX Metro-IP and VPLEX Geo................................................. 45
Areas to Check to Avoid SAN and WAN Performance Issues ........................... 45
Section 7: VPLEX Performance Checklist ................................................................... 47
Section 8: Benchmarking ............................................................................................ 51
Tips when running the benchmarks ....................................................................... 51
Take a scientific approach when testing ............................................................. 51
Typical Benchmarking Mistakes ............................................................................. 52
Real World Testing Mistake Example ..................................................................... 54
Understand the Metamorphosis of an IO ............................................................. 54
VPLEX Performance Benchmarking Guidelines ................................................... 54
IOMeter Example ...................................................................................................... 56
Conclusion .................................................................................................................... 61
References .................................................................................................................... 62
Executive summary
For several years, businesses have relied on traditional physical storage to meet their
information needs. Developments such as server virtualization and the growth of
multiple sites throughout a business's network have placed new demands on how
storage is managed and how information is accessed.
To keep pace with these new requirements, storage must evolve to deliver new methods of freeing data from a physical device. Storage must be able to connect
to virtual environments and still provide automation, integration with existing
infrastructure, consumption on demand, cost efficiency, availability, and security.
The EMC® VPLEX™ family is the next generation solution for information mobility and
access within, across, and between data centers. It is the first platform in the world
that delivers both Local and Distributed Federation.
Local Federation provides the transparent cooperation of physical elements
within a site.
Distributed Federation extends access between two locations across distance.
VPLEX is a solution for federating both EMC and non-EMC storage. VPLEX completely changes the way IT is managed and delivered – particularly when
deployed with server virtualization. By enabling new models for operating and
managing IT, resources can be federated – pooled and made to cooperate through
the stack — with the ability to dynamically move applications and data across
geographies and service providers. The VPLEX family breaks down technology silos
and enables IT to be delivered as a service.
VPLEX resides at the storage layer, where optimal performance is vital. This document
focuses on key considerations for VPLEX performance, performance metrics, and
testing best practices. The information provided is based on VPLEX Release 5.1. The
subject is advanced and it is assumed the reader has a basic understanding of the
VPLEX technology. For additional information on VPLEX best practices and detailed
technologies see the appendix for a reference list and hyperlinks to relevant
documents.
Audience
This white paper is intended for storage, network and system administrators who
desire a deeper understanding of the performance aspects of EMC VPLEX, the
testing best practices, and/or the planning considerations for the future growth of
their VPLEX virtual storage environment(s). This document outlines how VPLEX
technology interacts with existing storage environments, how existing environments might impact VPLEX technology, and how to apply best practices through basic
guidelines and troubleshooting techniques as uncovered by EMC VPLEX
performance engineering and EMC field experiences.
Introduction
Before we begin our discussion, it is important to know why we are providing
guidance on interpretation of the performance data provided in this document. The
business unit that has delivered VPLEX to the market has a guiding policy to be as
open and transparent as possible with EMC field resources, partners and customers.
We believe that all modern storage products have limitations and constraints and
therefore the most successful and satisfied customers are those that fully understand
the various constraints and limitations of the technology they intend to implement.
This approach leads our customers to success because there are fewer surprises and
the product expectations match the reality. Our intent is to be as candid as possible.
We ask readers to use the information to understand the performance aspects of
VPLEX implementations and to make better informed judgments about nominal
VPLEX capabilities rather than use the document as the final word on all VPLEX
performance (as competitors may be tempted to do). If you have questions about
any of the content in this document, please contact your local EMC Sales or
Technical representatives.
When considering a given solution from any vendor there will undoubtedly be
strengths and weaknesses that need to be considered. There will always be a
specific, unique IO profile that poses challenges in servicing the application load; the
key is to understand the overall IO mix and how this will impact real production
workloads. It is misleading to extrapolate a specific IO profile to be representative of
an entire environment unless the environment homogeneously shares a single IO
profile.
Let’s begin our discussion of VPLEX performance by considering performance in
general terms. What is good performance anyway? Performance can be
considered to be a measure of the amount of work that is being accomplished in a
specific time period. Storage resource performance is frequently quoted in terms of
IOPS (IO per second) and/or throughput (MB/s). While IOPS and throughput are both measures of performance, they are not synonymous and are actually inversely
related – meaning if you want high IOPS, you typically get low MB/s. This is driven in
large part by the size of the IO buffers used by each storage product and the time it
takes to load and unload each of them. This produces a relationship between IOPS
and throughput as shown in Figure 1 below.
Figure 1
For example, an application requests 1,000 IOPS at an 8KB IO size, which equals 8MB/s of throughput (1,000 IOPS x 8KB = 8MB/s). Over a 200MB/s Fibre Channel link, 8MB/s
doesn’t intuitively appear to be good performance (8MB/s is only 4% utilization of the
Fibre Channel bus) if you’re thinking of performance in terms of MB/s. However, if the
application is requesting 1,000 IOPS and the storage device is supplying 1,000 IOPS
without queuing (queue depth = 0), then the storage resource is servicing the
application needs without delay – meaning the performance is actually good.
Conversely, if a video streaming application is sequentially reading data with a 64MB
IO size and 3 concurrent streams, it would realize 192MB/s aggregate performance
across the same 200MB/s Fibre channel connection (64MB x 3 streams = 192MB/s).
While there’s no doubt that 192 MB/s performance is good (96% utilization of the Fibre
Channel bus), it's equally important to note we're only supporting 3 IOPS in this application environment.
These examples illustrate the context dependent nature of performance – that is,
performance depends upon what you are trying to accomplish (MB/s or IOPS).
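The worked examples above can be sketched as a few lines of code. This is a minimal illustration using the text's decimal arithmetic (1MB = 1,000KB) so the round numbers match; the function names are illustrative, not part of any VPLEX tooling.

```python
# Convert between IOPS, IO size, and throughput, and estimate link
# utilization, per the two examples in the text.

def throughput_mb_s(iops: float, io_size_kb: float) -> float:
    """MB/s delivered by a given IOPS rate at a given IO size."""
    return iops * io_size_kb / 1000.0

def link_utilization(mb_s: float, link_mb_s: float = 200.0) -> float:
    """Fraction of a Fibre Channel link's bandwidth a workload consumes."""
    return mb_s / link_mb_s

# OLTP-style example: 1,000 IOPS at 8KB is 8MB/s, only 4% of the link.
oltp_mb_s = throughput_mb_s(1000, 8)        # 8.0 MB/s
oltp_util = link_utilization(oltp_mb_s)     # 0.04
# Streaming example: 3 streams, one 64MB IO each per second = 192MB/s,
# 96% of the link, but only 3 IOPS.
stream_mb_s = 3 * 64.0
```

The same arithmetic applies to any link speed; only the `link_mb_s` default reflects the 200MB/s Fibre Channel figure used in the text.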
Knowing and understanding how your host servers and applications handle their IO
workload is the key to being successful with VPLEX performance optimization. In
general, there are two types of IO workloads:
• Transaction-based
• Throughput-based
As you saw in Figure 1, these workloads are quite different in terms of their objectives and must be planned for in specific ways. We can describe these two types of
workloads in the following ways:
A workload that is characterized by a high number of IO per second (IOPS) is
called a transaction-based workload.
A workload that is characterized by a large amount of data transferred,
normally with large IO sizes, is called a throughput-based workload.
What should you expect to see from each type of workload?
Transaction-based workloads
High-performance transaction-based environments cannot typically be built using
low-cost and consequently low-IOPS back-end arrays. Transaction processing rates
are heavily dependent on the competency of the back-end array. Ultimately, the
number of back-end physical drives that are available within a storage system to
process host IO becomes the limiting factor. In general, transaction-based
processing is limited by the physical spindle count and individual disk IO capabilities
of the array rather than the size of the connectivity pipes, the transfer buffer sizes, or
the internal bandwidth of the array.
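The spindle-count limit described above lends itself to a simple sizing sketch. The per-drive IOPS figure below is an illustrative assumption for a 15k RPM drive on small random IO, not a VPLEX or array specification.

```python
# Rough spindle-count sizing for a transaction-based (random small-block)
# workload: random IOPS scale with the number of back-end drives.
import math

def drives_needed(target_iops: float, per_drive_iops: float) -> int:
    """Minimum physical drives to sustain target_iops of random IO."""
    return math.ceil(target_iops / per_drive_iops)

# Example: 20,000 random IOPS on drives assumed to do ~180 IOPS each
# requires on the order of 112 spindles, regardless of pipe size.
example = drives_needed(20000, 180)
```

The point of the sketch is that doubling link bandwidth does not change the answer; only adding drives (or faster drives) does.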
Another common characteristic of transaction intense applications is that they use a
small random data block pattern to transfer data. With this type of data pattern,
having more back-end drives enables more host IO to be processed simultaneously.
When transaction-based workloads (or any type of workload) are random and write-biased, the efficacy of read cache is diminished, as misses need to be retrieved from the physical disks.
In many cases, slow transaction performance problems can be traced directly to
“hot” files that cause a bottleneck on a critical component (such as a single physical
disk). This situation can occur even when the overall storage subsystem sees a fairly
light workload. When bottlenecks occur, they can present an extremely difficult and
frustrating task to resolve.
Throughput-based workloads
Throughput-based workloads are seen with applications or processes that require
massive amounts of data to be transmitted in as few IO as possible. Generally these
workloads use large sequential blocks to reduce the impact of disk latency.
Applications such as satellite imagery, high performance compute (HPC), video
streaming, seismic research, surveillance, and the like would fit into this category.
Relatively speaking, a smaller number of physical drives are needed to reach
adequate IO performance compared to transaction-based workloads. In a
throughput-based environment, read operations make use of the storage subsystem
cache to pre-fetch large chunks of data at a time to improve the overall
performance. Throughput rates are heavily dependent on the connectivity pipe size,
IO buffer size, and storage subsystem’s internal bandwidth. Modern storage
subsystems with high bandwidth internal busses are able to reach higher throughput
numbers and bring higher rates to bear.
The Role of Applications in Determining Acceptable Performance
Regardless of the capability of a given storage frame, it cannot provide more IO
than the application requests. Ultimately, the application is the real performance
driver. For example, say an application generates requests for 2,500 IOPS from a
storage resource - is there any performance difference at the application level
between a storage frame capable of delivering 2,500 IOPS and another storage
frame capable of delivering 10,000 IOPS? Obviously, the answer is a resounding “No”.
Either resource is capable of servicing the 2,500 IOPS requirement. It’s like traveling in
a car at 65 mph on the freeway – if everyone obeys the 65 mph speed limit, then any
car that goes the speed limit will get you there in the same amount of time – whether
it’s a Chevy Lumina or Ferrari Enzo.
The point we are trying to make is that performance is very much dependent on the
point of view. Ultimately, performance can be considered good if the application is
not waiting on the storage frame. Understanding the application's performance
requirements and providing compatible storage resources ensures maximum
performance and application productivity. It goes without saying to always be
cautious about performance claims and spec sheet speeds and feeds. If the
environment that generated the claims is not identical or does not closely
approximate your environment, you may very well not see the same performance
results.
Section 1: VPLEX Architecture
VPLEX hardware platform
A VPLEX system with GeoSynchrony 5.1 is composed of one or two VPLEX clusters:
one cluster for VPLEX Local systems and two clusters for VPLEX Metro and VPLEX Geo
systems. These clusters provide the VPLEX AccessAnywhere capabilities.
Each VPLEX cluster consists of:
A VPLEX Management Console
One, two, or four engines
One standby power supply for each engine
In configurations with more than one engine, the cluster also contains:
A pair of Fibre Channel switches
An uninterruptible power supply for each Fibre Channel switch
As you add engines, you add cache, front-end, back-end, and WAN COM
connectivity capacity, as indicated in Table 2 below.
VPLEX GeoSynchrony 5.1 System Configuration Limits
Capacity                               Local                Metro                Geo
Maximum virtualized capacity           No Known Limit       No Known Limit       No Known Limit
Maximum virtual volumes                8,000                16,000               16,000
Maximum storage elements               8,000                16,000               16,000
Minimum/maximum virtual volume size    100MB/32TB           100MB/32TB           100MB/32TB
Minimum/maximum storage volume size    No VPLEX Limit/32TB  No VPLEX Limit/32TB  No VPLEX Limit/32TB
Number of host initiators              1,600                1,600                800
Table 1
Engine Type   Model    Cache [GB]   FC speed [Gb/s]   Engines   FC Ports   Announced
VPLEX VS1     Single   64           8                 1         32         10-May-10
VPLEX VS1     Dual     128          8                 2         64         10-May-10
VPLEX VS1     Quad     256          8                 4         128        10-May-10
VPLEX VS2     Single   72           8                 1         16         23-May-11
VPLEX VS2     Dual     144          8                 2         32         23-May-11
VPLEX VS2     Quad     288          8                 4         64         23-May-11
Table 2
Table 1 and Table 2 show the current limits and hardware specifications for the VPLEX
VS1 and VS2 hardware versions. Although the VS2 engines have half the number of
ports of the VS1, actual system throughput is improved because each VS2 port can
supply full line rate (8 Gb/s) of throughput, whereas the VS1 ports are over-subscribed.
Several of the VPLEX maximums are determined by the limits of the externally
connected physical storage frames and are therefore unlimited in terms of VPLEX itself.
The latest configuration limits are published in the GeoSynchrony 5.1 Release Notes,
which are available on Powerlink.EMC.com.
Read/Write IO Limits
VPLEX with GeoSynchrony 5.1 can be configured with one to four engines per cluster.
For a fully configured four-engine VS2 VPLEX cluster the maximums work out as
follows:
IOPS: up to 3 million IOPS
Throughput: up to 23.2 gigabytes per second
Section 2: VPLEX Performance Highlights
Understanding VPLEX overhead
Properly understanding VPLEX performance capabilities and dependencies will
greatly benefit many of the design decisions for your VPLEX environment. In general,
with VPLEX's large per-director cache, host reads are comparable to, and in some
cases better than, native array performance. Writes, on the other hand, follow
VPLEX's write-through caching model on VPLEX Local and Metro and will inevitably have
slightly higher latency than native.
There are many factors involved in determining if and when latency is added by
VPLEX. Factors such as host IO dispensation size, IO type, VPLEX internal queue
congestion, SAN congestion, and array congestion will play a role in whether or not
latency is introduced by VPLEX. In real world production environments, however,
what do all of these factors add up to? Let’s take a look at the average latency
impact. We can break these latencies into the following 3 categories based on the
type of host IO and whether or not the data resides in VPLEX cache:
For VPLEX read cache hits, the VPLEX read response time typically ranges from 85-150
microseconds depending on the overall system load and IO size. In many cases this is
less latency than the latency of the native storage array and can actually be
considered a reduction in latency. For local devices VPLEX adds a small amount of
latency to each VPLEX cache read miss and each write operation:
Typical Read Miss: About 200-400 microseconds
Typical Write: About 200-600 microseconds
These latency values will vary slightly depending on the factors mentioned earlier. For
example, large-block IO requests may need to be broken up into smaller parts (based
on VPLEX or individual array capabilities) and then written serially in
smaller pieces to the storage array, adding latency. Further, if you are comparing native array to
VPLEX performance, it will be heavily dependent on the overall load on the array. If
you have an array that is under cache pressure, adding VPLEX to the environment
can actually improve read performance. The additive cache from VPLEX may
offload a portion of read IO from the array, thereby reducing average IO latency.
Additional discussion on this topic is provided later in the subsequent host and
storage sections.
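The per-operation figures quoted above can be combined into a back-of-the-envelope model of average read latency through VPLEX. The hit-rate and array-latency inputs below are assumptions for illustration; only the hit latency (~85-150 microseconds) and miss overhead (~200-400 microseconds) come from the text.

```python
# Cache-hit-weighted mean read latency through VPLEX Local, in
# microseconds. Defaults are illustrative assumptions.

def avg_read_latency_us(hit_rate: float,
                        hit_latency_us: float = 120.0,
                        array_miss_us: float = 5000.0,
                        vplex_miss_overhead_us: float = 300.0) -> float:
    """Mean read latency given a VPLEX read cache hit rate (0.0-1.0)."""
    miss_latency = array_miss_us + vplex_miss_overhead_us
    return hit_rate * hit_latency_us + (1.0 - hit_rate) * miss_latency

# Even a modest 30% VPLEX read-hit rate against an assumed 5 ms array
# brings the mean read latency well below the native 5 ms, despite the
# per-miss overhead.
blended = avg_read_latency_us(0.3)
```

This is why an array under cache pressure can see reads improve behind VPLEX: the additive cache converts a fraction of misses into sub-millisecond hits.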
Native vs. VPLEX Local Performance
Native performance tests use a direct connection between a host and storage
array. VPLEX Local testing inserts VPLEX in the path between the host and array.
4KB Random Read Hit
Random read hits are tested over a working set size that fits entirely into array or
VPLEX cache.
Figure 2 VPLEX Random 4KB Read Hit Native vs. VPLEX
This test actually reveals that VPLEX performs faster than the storage frame used in
this test. Here, the lower the latency the better! Though this may not be typical of
every situation, it does help illustrate a case where VPLEX read hits can help to
improve overall storage environment performance.
4KB Random Read Miss
Random read misses bypass VPLEX cache (but can be a cache hit on the array)
because of their large working set size. Here we see the typical VPLEX read miss
overhead of about 300 microseconds.
Figure 3 Random Read Miss Native vs. VPLEX
Figure 4 Random 4KB Write Native vs. VPLEX
4KB IO are used in our tests to illustrate a high number of IO operations. In Figure 4 we
see VPLEX write overhead of about 350 microseconds. This test reveals that VPLEX
adds a measurable but relatively small impact to each IO. Here, the lower the
latency the better.
Figure 5 128 KB Sequential Write
128 KB IO are used in our tests to illustrate high throughput (bandwidth) operations. In
Figure 5 we see an average VPLEX write overhead of about 500 microseconds. This
test reveals that VPLEX adds a measurable but relatively small impact to each IO.
Here, the lower the latency the better.
OLTP Workload Example
Our synthetic OLTP-heavy benchmark workload (called OLTP2HW below) is a mix of
8KB and 64KB IO request sizes, with a 1:1 ratio of reads to writes.
In this test, the application demonstrates slightly more host latency with VPLEX
compared to native. The additional latency overhead is about 600 microseconds.
Native vs. VPLEX Metro Performance
VPLEX Metro write performance is highly dependent upon the WAN round-trip-time
latency (RTT latency). The general rule of thumb for Metro systems is host write IO
latency will be approximately 1x-3x the WAN round-trip time. While some may view this as an overly negative impact, we would caution against this view and highlight the
following points. First, VPLEX Metro uses a synchronous cache model and therefore is
subject to the laws of physics when it comes to data replication. In order to provide
a true active-active storage presentation it is incumbent on VPLEX to provide a
consistent and up to date view of data at all times. Second, many workloads have a
considerable read component, so the net WAN latency impact can be masked by
the improvements in the read latency provided by VPLEX read cache. This is another
reason that we recommend a thorough understanding of the real application
workload so as to ensure that any testing that is done is applicable to the workload
and environment you are attempting to validate.
In comparing VPLEX Metro to native array performance it is important to ensure that
the native array testing is also synchronously replicating data across an equal
distance and WAN link as VPLEX. Comparing Metro write performance to a single
array that is not doing synchronous replication is an apples-to-bananas comparison.
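The rule of thumb above can be expressed directly. The 1x-3x multipliers come from the text; the function itself is just an illustrative estimator, not a VPLEX sizing tool.

```python
# Estimate the expected range of host write latency over VPLEX Metro,
# given the WAN round-trip time (RTT), per the 1x-3x rule of thumb.

def metro_write_latency_range_ms(wan_rtt_ms: float) -> tuple:
    """(low, high) estimate of host write latency in milliseconds."""
    return (1.0 * wan_rtt_ms, 3.0 * wan_rtt_ms)

# A 5 ms RTT link implies host write latencies of roughly 5-15 ms,
# which is why RTT dominates Metro write planning.
low, high = metro_write_latency_range_ms(5.0)
```

Reads served from VPLEX cache are unaffected by RTT, which is how a read-heavy workload can mask much of this write penalty in practice.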
Figure 6 Metro WAN Latency Impact
Figure 6 illustrates the impact of WAN latency on VPLEX Metro. As WAN latency is
added there is a corresponding impact on write IO. The OLTP (green) lines show a
simulated OLTP application (8KB and 64 KB IO with roughly equal read and write IO)
and the overall impact of WAN latency with VPLEX Metro.
For write throughput-intensive applications such as backups, be aware of the
maximum available WAN bandwidth between VPLEX clusters. If the write workload
exceeds the WAN link bandwidth, response time will spike, and other applications
may also see severe performance degradations.
Native vs. VPLEX Geo Performance
Given the fundamental architectural differences of VPLEX Geo from Local and
Metro, namely its write-back caching model and asynchronous data replication, it's
even more difficult to accurately compare native array performance to VPLEX Geo
performance.
In short, VPLEX Geo performance will be limited to the available drain-rate, which is a
function of the available WAN bandwidth and storage-array performance at each
cluster. If a VPLEX director's incoming host write rate exceeds what the outgoing write
rate can achieve, there will inevitably be push-back or throttling on the
host, which will negatively affect per-operation host write latency, causing it to rise.
Ensure the WAN and arrays are properly configured, and that the various VPLEX Geo
related settings are tuned properly.
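The drain-rate constraint described above can be sketched as a simple check. The figures in the example are illustrative assumptions, not VPLEX limits.

```python
# VPLEX Geo sustains host writes only as fast as it can drain them:
# the drain rate is bounded by the slower of the WAN link and the
# storage-array write throughput at the remote cluster.

def geo_drain_rate_mb_s(wan_mb_s: float, array_write_mb_s: float) -> float:
    """Maximum sustainable outgoing write rate (the drain rate)."""
    return min(wan_mb_s, array_write_mb_s)

def will_throttle(host_write_mb_s: float, wan_mb_s: float,
                  array_write_mb_s: float) -> bool:
    """True if sustained host writes exceed the drain rate."""
    return host_write_mb_s > geo_drain_rate_mb_s(wan_mb_s, array_write_mb_s)

# Example: a 400 MB/s sustained write burst over a 300 MB/s WAN link
# will throttle, even if the remote array could absorb 800 MB/s.
throttled = will_throttle(400, 300, 800)
```

Short bursts above the drain rate can be absorbed by director cache, so this check applies to sustained rates, not peaks.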
Section 3: Hosts and Front-end Connectivity
There are certain baseline configuration recommendations when using VPLEX to
provision virtual storage to hosts. These considerations include how many paths
through the fabric are allocated to the host, how many host ports to use, how to
spread the hosts across VPLEX directors, logical unit number (LUN) mapping, and the
correct size of virtual volumes to use. Maximizing connectivity and following best
practices for inline devices such as VPLEX will optimize performance for your virtual
storage environment.
Host Environment
When configuring a new host for VPLEX, the first step is to determine the EMC
supported operating system, driver, firmware, and supported host bus adapters in
order to prevent unexpected problems due to untested configurations. Consult the
VPLEX Simple Support Matrix prior to bringing a new host into VPLEX for recommended
levels. The VPLEX support matrix is available at http://powerlink.emc.com or at
Support.EMC.com.
In addition, verify that all host path management software is enabled and operating
correctly.
Path management applications should be set as follows:
Operating System                 Recommended Policy
Hewlett-Packard HP-UX            PVLinks set to Failover
VMware ESX                       Set NMP policy to Fixed
IBM AIX                          Native MPIO set to Round Robin
All Linux                        MPIO set to Round Robin Load Balancing
All Veritas DMP                  Balanced policy with partitionsize set to 16MB
All platforms using PowerPath    EMC PowerPath policy set to Adaptive
Table 3
Note: The most current and detailed information for each host OS is provided in the
corresponding Host Connectivity Guides on Powerlink at: http://powerlink.emc.com
Host Paths
EMC recommends that you limit the total number of paths that the multipathing
software on each host is managing to four paths, even though the maximum
supported is considerably more than four. Following these rules helps prevent many
issues that might otherwise occur and leads to improved performance.
The major reason to limit the number of paths available to a host from the VPLEX is for
error recovery, path failover, and path failback purposes. These are also important
during the VPLEX non-disruptive upgrade (NDU) process. The overall time for
handling path loss by a host is significantly reduced when you keep the total number
of host paths to a reasonable number required to provide the aggregate
performance and availability. Additionally, the consumption of resources within the
host is greatly reduced each time you remove a path from path management
software.
During NDU, there are intervals where only half of the VPLEX directors and associated
front-end ports (on first and second upgraders, respectively) are available on the
front-end fabric. NDU front-end high availability checks ensure that the front-end
fabric is resilient against single points of failure during the NDU, even when either the
first or second upgrader front-end ports are offline.
From a host pathing perspective there are two types of configurations:
High availability configurations – VPLEX configurations that include sufficient
redundancy to avoid data unavailability during NDU, even in the event of
front-end fabric or port failures. The NDU high-availability pre-checks succeed
for these configurations.
Minimal configurations – VPLEX configurations that do not include sufficient
redundancy to avoid data unavailability in the event of front-end fabric or
port failures.
For minimal configurations, the NDU high-availability pre-checks fail, and the pre-
checks for these configurations must instead be performed manually. This can take a
considerable amount of time in large environments, and in general EMC believes that
the benefit of a lower port count does not justify the increased operational impact.
High availability configurations
VPLEX Non-Disruptive Upgrade (NDU) automated pre-checks verify that VPLEX is
resilient in the event of failures while the NDU is in progress.
In high availability configurations:
In dual- or quad-engine systems, each view has front-end target ports across
two or more engines in the first upgrader set (A directors), and two or more
engines in the second upgrader set (B directors).
In single-engine systems, each initiator port in a view has a path to at least one
front-end target port in the first upgrader (A director) and second upgrader (B
director). (See Figure 7.)
There are two variants of front-end configurations to consider for high availability that
will pass the high-availability pre-checks:
An optimal configuration for a single-engine cluster is one in which there are
redundant paths (dotted and solid lines in Figure 7) between both front-end fabrics
and both directors. In addition to protecting against failures of an initiator port,
HBA, front-end switch, VPLEX front-end port, or director, these redundant paths also
protect against front-end port failures during NDU.
A high-availability configuration for a single-engine cluster is one in which
there is a single path between the front-end fabrics and the directors (solid
lines in Figure 7). Like the optimal configuration described above, a high-availability
configuration protects against failures of initiator ports, HBAs, front-end
switches, and directors during NDU.
A high availability configuration provides protection against front-end port failures
during NDU.
Figure 7 High Availability Front-end Configuration (single-engine)
Minimal configurations
A minimal configuration is not considered highly-available, and the automated NDU
pre-check will not pass. For a single-engine cluster, a minimal configuration is one in
which each fabric has a single path to a single director. Minimal configurations
support failover, but have no redundancy during NDU.
Strict high-availability pre-checks for front-end and back-end connectivity have
been implemented in VPLEX 5.1 and higher. If the high-availability pre-check detects
one or more storage views that do not conform to the front-end high-availability
requirements of NDU, it will specify which storage views are in question. For example:
Error: Storage view /clusters/cluster-2/exports/storage-views/lsca3195_win2k3 does not have target ports from two or more directors in the second upgrader set at cluster-2.
Update these views to satisfy the high availability requirement. Ensure the storage-
view in question has front-end target ports across two or more engines in the first
upgrader set (A directors) and second upgrader set (B directors).
Figure 8 illustrates a single-engine cluster with a minimal front-end configuration:
Figure 8 Minimal Front-end Configuration
For minimal configurations, the automated high-availability pre-checks fail and must
be performed manually. Refer to the VPLEX Procedure Generator documentation on
upgrading for the necessary manual pre-checks, commands, and options for
minimal configurations.
Host to director connectivity
VPLEX caching algorithms send cache-coherency messages between directors via
the internal Fibre Channel networks or via the built-in CMI bus contained within each
engine chassis. The CMI bus is a low-latency, high-speed communication bus that
allows the two directors within the same engine to communicate directly. When two
or more VPLEX engines are available, the recommendation is to connect each host to
two directors, and to ensure each host is connected to an A director and a B
director on different VPLEX engines. There are possible exceptions to the two-director
connectivity rule. For example, a server with a heavy IO workload (OLTP) and four or
more adapter ports would need to connect to at least four, or possibly eight,
directors. The key takeaway is that VPLEX system performance under normal loads
will be virtually equivalent whether you use two directors in the same engine or two
directors in different engines. The benefits from the added availability tip the scale in
favor of connecting hosts to one director on two different engines. In general,
consuming just 2 directors per host will provide the best overall scalability and
balance of resource consumption for your VPLEX system.
Figure 9 Current VPLEX NDU Enforced Single and Dual Engine Connectivity
Note: For code releases through VPLEX GeoSynchrony code version 5.1 Patch 3, the
non-disruptive upgrade pre-check strictly enforces connecting hosts across 4
directors with 2- and 4-engine VPLEX systems. This restriction will likely be relaxed in
future releases to better align with the reasoning presented above.
When considering attaching a host to more than two directors in a dual-engine or
quad-engine VPLEX configuration, both the performance and the scalability of the
VPLEX complex should be considered. Though this may contradict what the
automated NDU will accept, this guidance stands for the following reasons:
Utilizing more than two directors per host increases cache update traffic
among the directors
Utilizing more than two directors per host decreases the probability of read-cache
hits on the ingress director.
Based on the reliability and availability characteristics of VPLEX hardware,
attaching a host to just two directors provides a high availability configuration
without unnecessarily impacting performance and scalability of the solution
General Best practice considerations for multipath software:
With EMC Powerpath the pathing policy should be set up for Adaptive mode.
Avoid connecting to multiple A directors and multiple B directors with a single
host or host cluster.
Avoid a round-robin policy that alternates on every single IO. Alternating each
IO across directors is inefficient for cache-coherency traffic and defeats
the VPLEX director's read-ahead cache pre-fetch. When using a round-robin
policy, set the burst or stream count to something greater than one so that more
consecutive I/Os are sent to the same director before another director is
chosen.
For Veritas DMP, using the balanced policy with a partitionsize value of 16MB is
optimal for VPLEX director cache-coherency.
Separate latency sensitive applications from each other, preferably using
independent directors and independent front-end ports.
For VPLEX Metro FC cross-connect solutions, be aware of which path(s) the
hosts are using, and configure the hosts to prefer the local paths over the
remote paths.
Host Path Monitoring
Host IO monitoring tools are available across virtually every open systems OS
supported by VPLEX. In particular, EMC Powerpath provides a consistent set of
commands and outputs across operating systems such as AIX, Linux, VMware, and
Windows.
Individual host path performance can be monitored using the powermt display
command:
Example 1 - Windows path monitoring with PowerPath for Windows
powermt display dev=all
Pseudo name=harddisk12
Invista ID=FNM00103600####
Logical device ID=6000144000000010A001ED129296E028
state=alive; policy=ADaptive; priority=0; queued-IOs=0
==============================================================================
---------------- Host ---------------   - Stor -  -- I/O Path -  -- Stats ---
###  HW Path               I/O Paths    Interf.   Mode    State  Q-IOs Errors
==============================================================================
4    port4\path0\tgt0\lun10 c4t0d10     04        active  alive  8     0
Also, latency by path is available with the powermt display latency command:
powermt display latency
Invista logical device count=86
==============================================================================
----- Host Bus Adapters ---------  ------ Storage System ----  - Latency (us) -
###  HW Path       ID              Interface    Current   Max
==============================================================================
3    port3\path0   FNM0010360####  01           0         0
4    port4\path0   FNM0010360####  04           0         0
Policy based path monitoring
There are many situations in which a host can lose one or more paths to storage. If
the problem is isolated to that one host, it might go unnoticed until an upgrade to
VPLEX or when a SAN event occurs that causes the remaining paths to go offline,
such as a switch failure, or routine switch maintenance. This can lead to poor
performance or, worse yet, a data unavailability event, which can seriously affect
your business. To prevent this loss-of-access event from happening, many users have
found it useful to implement automated path monitoring using path management
software like EMC Powerpath or Veritas DMP or to create custom scripts that issue
path status commands and then parse the output for specific key words that then
trigger further script action.
For EMC Powerpath you can turn on path latency monitoring and define a threshold
to simply stop using a specific path.
Example 2 – Automated latency monitoring with Powerpath
powermt set path_latency_monitor=on|off
powermt set path_latency_threshold=
It is also possible to set an autorestore policy with Powerpath so that any paths that
drop offline are brought back online if they are healthy.
Example 3 – Auto restore paths
powermt set periodic_autorestore=on|off
Each of these commands can provide hosts with self-monitoring and self-recovery to
deliver the greatest resiliency and availability possible for each host. These commands
can be combined with a scheduler, such as cron, and a notification system, such as
e-mail, to notify SAN administrators and system administrators if the number of
paths to the system changes.
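As an illustration of the custom-script approach described above, the following sketch counts alive paths in powermt output and warns when the count drops below an expected minimum. The sample output line and the expected-path count are hypothetical placeholders; a real deployment would pipe the live `powermt display dev=all` output through the same filter.

```shell
#!/bin/sh
# Hedged sketch: count alive paths and warn when below an expected minimum.
# SAMPLE is an illustrative powermt output line; in production, replace the
# printf with the live command:  powermt display dev=all
EXPECTED=1
SAMPLE='4 port4\path0\tgt0\lun10 c4t0d10 04 active alive 8 0'

# Count lines reporting an active, alive path.
ALIVE=$(printf '%s\n' "$SAMPLE" | grep -c 'active *alive')

if [ "$ALIVE" -lt "$EXPECTED" ]; then
    # In production, pipe this message to mail(1) to notify SAN administrators.
    echo "WARNING: only $ALIVE of $EXPECTED paths alive"
else
    echo "OK: $ALIVE of $EXPECTED paths alive"
fi
```

Scheduled from cron every few minutes, a script like this catches a silent single-host path loss long before an NDU or switch maintenance turns it into a data unavailability event.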
For Veritas DMP there are recovery settings that control how often a path will be
retried after failure. If these are not the default settings on your hosts, you should set
the following on any hosts using DMP:
Example 4 - DMP Tuning Parameters
vxdmpadm setattr enclosure emc-vplex0 recoveryoption=throttle iotimeout=30
vxdmpadm setattr enclosure emc-vplex0 dmp_lun_retry_timeout=30
The values shown in Example 4 specify a 30-second retry period for handling transient
errors. When all paths to a disk fail (such as during a VPLEX NDU), certain paths may
have only a temporary failure and are likely to be restored soon. If IOs are not
retried for a non-zero period of time, the IO may be failed by the application layer.
The DMP tunable dmp_lun_retry_timeout can be used for more robust handling of
such transient errors. If the tunable is set to a non-zero value, I/Os to a disk with all
failed paths will be retried until the specified dmp_lun_retry_timeout interval elapses
or until the I/O succeeds on one of the paths, whichever happens first. The default
value of the tunable is 0, which means that the paths are probed only once.
VPLEX Real-time GUI Performance Monitoring Stats
The Unisphere for VPLEX UI contains several key performance statistics for host
performance and overall health. They can be found on the Performance
Dashboard tab and can be added to the default performance charts that are
displayed. Using the data provided, the VPLEX administrator can quickly determine
the source of performance problems within an environment. Figure 10 below shows
the performance data included in the GeoSynchrony 5.1 version of VPLEX.
Figure 10 VPLEX Real-time Performance Data
Unisphere for VPLEX Performance Data Details
Back-end Latency – time in microseconds for IO to complete with physical
storage frames.
CPU Utilization – % busy of the VPLEX directors in each engine. 50% or less is
considered ideal.
Front-end Aborts – SCSI aborts received from hosts connected to VPLEX
front-end ports. 0 is ideal.
Front-end Bandwidth – total IO as measured in MB per second from hosts to
VPLEX.
Front-end Latency – time in microseconds for IO to complete between VPLEX
and hosts. Highly dependent on back-end array latency.
Front-end Throughput – Total IO as measured in IO per second.
Rebuild Status – Completion status of local and remote device rebuild jobs.
Subpage Writes – number of writes that are smaller than 4KB. This statistic has
greatly diminished importance for VPLEX Local and Metro systems running
GeoSynchrony 5.0.1 and later code. For VPLEX Geo, it is still a very relevant
metric.
WAN Link Usage – IO between VPLEX clusters as measured in MB per second.
This chart can be further sub divided into system, rebuild, and distributed
volume write activity.
WAN Link Performance – IO between VPLEX clusters as measured in IO per
second.
Figure 11 UniSphere for VPLEX Performance Dashboard
Figure 11 shows the VPLEX Performance Dashboard, which provides continuous real-
time data for 10 key performance metrics over a continuously updated 5-minute
window. Each of the charts can be added, moved, or removed from the display to
meet a wide variety of monitoring needs.
Remote Monitoring and Scripting
VPLEX has a RESTful API and supports SNMP monitoring via third party SNMP
monitoring tools. The VPLEX MIB is available on the VPLEX Management Server in the
following directory:
/opt/emc/VPlex/mibs
Today there is a limited set of performance categories for SNMP. Using the REST
API to access VPLEX allows virtually any command that can be run locally on a VPLEX
to be run remotely. This enables integration with Microsoft PowerShell and with
VMware vCOPS. Refer to the VPLEX 5.1 Administrators Guide, Performance Monitoring
chapter, for more details.
Watch4Net
Watch4Net will provide comprehensive historical and trending data along with
custom dashboard views for VPLEX. Watch4Net support is expected shortly after this
document is published.
Perpetual Logs
VPLEX maintains a perpetual log of over 50 different performance statistics on the
VPLEX management server. There are 10 of these files for each VPLEX director and
they roll at the 10 MB mark. The perpetual log files contain comma-separated data
that can easily be imported into MS Excel for aggregation, reporting, and historical
trending analysis. The files are located in the /var/log/VPlex/cli directory, as shown
below:
service@RD-GEO-2-1:/var/log/VPlex/cli> ll | grep PERPETUAL
-rw-r--r-- 1 service users  3374442 2012-11-17 05:41 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log
-rw-r--r-- 1 service users 10485855 2012-11-14 18:38 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.1
-rw-r--r-- 1 service users 10485864 2012-09-10 01:25 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.10
-rw-r--r-- 1 service users 10486060 2012-11-07 03:33 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.2
-rw-r--r-- 1 service users 10485825 2012-10-30 12:14 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.3
-rw-r--r-- 1 service users 10485922 2012-10-22 21:38 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.4
-rw-r--r-- 1 service users 10486009 2012-10-15 12:43 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.5
-rw-r--r-- 1 service users 10486000 2012-10-08 05:20 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.6
-rw-r--r-- 1 service users 10486298 2012-10-01 03:31 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.7
-rw-r--r-- 1 service users 10486207 2012-09-24 02:24 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.8
-rw-r--r-- 1 service users 10485969 2012-09-17 00:24 director-1-1-A_PERPETUAL_vplex_sys_perf_mon.log.9
-rw-r--r-- 1 service users  2467450 2012-11-17 05:41 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log
-rw-r--r-- 1 service users 10485770 2012-11-15 11:39 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.1
-rw-r--r-- 1 service users 10486183 2012-09-07 09:02 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.10
-rw-r--r-- 1 service users 10485816 2012-11-08 01:10 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.2
-rw-r--r-- 1 service users 10485977 2012-10-31 13:49 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.3
-rw-r--r-- 1 service users 10486275 2012-10-24 01:25 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.4
-rw-r--r-- 1 service users 10485793 2012-10-16 07:27 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.5
-rw-r--r-- 1 service users 10486230 2012-10-08 07:56 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.6
-rw-r--r-- 1 service users 10485762 2012-09-30 05:37 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.7
-rw-r--r-- 1 service users 10485807 2012-09-22 01:53 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.8
-rw-r--r-- 1 service users 10486077 2012-09-14 14:31 director-1-1-B_PERPETUAL_vplex_sys_perf_mon.log.9
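Because the perpetual logs are plain comma-separated text, they can also be summarized directly on the management server without Excel. The sketch below averages one numeric column with awk; the sample file, its header names, and the choice of column 3 as a front-end latency value are illustrative assumptions, so check the header row of your own logs before reusing the field number.

```shell
# Hedged sketch: average a numeric column from a comma-separated,
# perpetual-style log file. The file contents and column layout here are
# illustrative, not the actual VPLEX perpetual log schema.
LOG=/tmp/sample_perpetual.log
cat > "$LOG" <<'EOF'
time,fe-ops,fe-lat-us
10:00,1200,850
10:01,1400,910
10:02,1300,880
EOF

# Skip the header row (NR>1), then print the mean of column 3.
awk -F, 'NR>1 { sum += $3; n++ } END { printf "avg fe latency: %.0f us\n", sum/n }' "$LOG"
# prints: avg fe latency: 880 us
```

The same one-liner, pointed at a real perpetual log and the correct field number, gives a quick historical average without leaving the management server.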
Benchmarking Applications, Tools and Utilities
There are good benchmarking applications out there, and there are not-so-good
ones. Testers use tools they can trust. The following section reviews several
benchmarking tools that are useful (and not so useful) when testing VPLEX
performance in your environment.
Good benchmarks
IOMeter
IOMeter is one of the most popular public-domain benchmarking tools among
storage vendors, and is primarily a Windows-based tool. It is available
from http://www.iometer.org. In the Benchmarking section of this document we
provide some examples of IOMeter settings that are used to simulate specific
workloads for testing.
The popularity of IOMeter holds true at EMC. Many internal teams, including the
VPLEX Performance Engineering team, use IOMeter and are familiar with its behavior,
input parameters, and output. That being said, the IO patterns, queue depths, and
other tunables can be misused and distorted. It's important to maintain healthy
skepticism about any benchmark numbers you see until you know the full details of
the settings and overall testing parameters.
Warning: It's not recommended to run the IO client (dynamo) on Linux. Dynamo does
not appear to function completely as expected. It's best to use Windows clients with
Dynamo.
IOZone
IOZone has broad operating system support, but is primarily file-system based. It is
available for free from http://www.iozone.org.
iorate
Initially implemented by EMC, iorate has been released to the public as open source.
Available for free from http://iorate.org/
fio
fio is an I/O tool meant to be used both for benchmarking and for stress/hardware
verification. It has support for 13 different types of I/O engines, I/O priorities (for
newer Linux kernels), rated I/O, forked or threaded jobs, and much more. It can work
on block devices as well as files. fio spawns a number of threads or processes doing
a particular type of I/O action as specified by the user. The typical use of fio is to
write a job file matching the I/O load one wants to simulate. Available for free from
http://freecode.com/projects/fio
Additional info: http://linux.die.net/man/1/fio
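As an example of the job-file approach, the sketch below approximates the 8KB OLTP-style mix discussed earlier in this paper (random IO, roughly equal reads and writes, multiple outstanding IOs). The device path, runtime, and job name are placeholders, not recommendations; point the job at a dedicated test volume, never a production LUN.

```ini
; Hedged sketch of an fio job file -- all values are illustrative.
[global]
ioengine=libaio     ; asynchronous IO on newer Linux kernels
direct=1            ; bypass the host page cache
runtime=60
time_based

[oltp-8k]
filename=/dev/sdX   ; placeholder: a dedicated test device
rw=randrw           ; random mixed workload
rwmixread=50        ; roughly equal read and write IO
bs=8k               ; 8KB IO size, matching the OLTP profile above
iodepth=16          ; multiple outstanding IOs, unlike dd or file copies
```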
Poor benchmarks
In general, any single outstanding I/O or filesystem focused benchmarks are not
good choices.
Unix dd test
dd is completely single-threaded, with a single outstanding I/O, so it cannot drive a
modern storage system anywhere near its potential.
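For illustration, a typical dd test looks like the sketch below (the target path and sizes are arbitrary). Because dd issues one IO at a time, the result reflects single-stream latency rather than what the storage system can deliver under a realistic queue depth.

```shell
# Illustrative only: a single-threaded, single-outstanding-IO "benchmark".
# Writes 10 x 1MB blocks sequentially to a temporary file, then flushes
# the data to disk before dd reports its elapsed time.
dd if=/dev/zero of=/tmp/dd_testfile bs=1M count=10 conv=fsync
```

Even with conv=fsync forcing the data to storage, one such stream says little about how the same device behaves under 16 or 32 concurrent IOs.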
Bonnie
Bonnie was designed to test UNIX file systems and is over 20 years old.
Bst5 or "Bart's stuff test"
Bst5 is single outstanding I/O. http://www.nu2.nu/bst/
File copy commands
These are single threaded and single outstanding I/O. They use a host memory file
cache, so it is not known when or if a particular file IO hits storage. It is also not clear
what I/O size the filesystem will happen to choose and so it might be reading and
writing with inefficient IO sizes. In theory, a multiple file copy benchmark could be
constructed; however it requires careful parallelism and multiple independent source
and target locations.
It is best to separate reads and writes in performance testing. For example, a slow
performing read source device could penalize a fast write target device. The entire
copy test would show up as slow. Without detailed metrics into the read and write
times (not always gathered in a simple "how long did it take" file copy test), the
wrong conclusions can easily be drawn about the storage solution.
Note: See Section 8: Benchmarking for specific testing recommendations and
example results.
Benchmarking Applications
The list of possible application level benchmarking programs is numerous. Some that
are fairly well known and understood are:
Microsoft Exchange - JetStress
Microsoft SQL Server - SQLIO
Oracle - SwingBench, DataPump, or export/import commands
VMware - VMbench, VMmark - virtual machine benchmarking tools
These particular benchmarking applications are potentially one step closer to a
production application environment; however, like all artificially crafted benchmarks,
they suffer from the fact that they are likely not representative of your environment.
Engage EMC's application experts when you are interested in a specific application
benchmark. We stress that these benchmarks also exercise more of the application
and host IO stack, so they may not be representative of the underlying storage
devices and can be affected by many factors outside the storage layer.
Application Testing
Testing with the actual application is the best way to measure storage performance.
A production-like environment that can stress storage limits is desirable.
Measure performance of different solutions:
Compare OLTP response times.
Compare batch run times.
Compare sustained streaming rates.
Operating system and application tools can help monitor storage performance.
Production Testing
Ultimately, there must be a level of trust in the solution and in deploying the
solution in your production environment. When you are considering moving an
application into production, there are risks and rewards.
Risk vs. Reward:
Risk: taking an unsupported, well-traveled evaluation unit and putting it in a
production environment could compromise application availability and
expose unexpected system problems.
Reward: sometimes this is the only way to know for certain that storage
performance is acceptable for an application.
In order to minimize the risk side of the equation, consider a staged approach
whereby non-business-critical applications are virtualized with VPLEX first. This is
similar to the approach recommended by VMware in the early stages of host
virtualization. Go for the low-hanging fruit first, then closely monitor
performance throughout the process.
Section 4: Application Performance Considerations
When gathering data for planning from the application side, it is important to first
consider the workload type for the application. If multiple applications or workload
types will share the system, you need to know the workload type of each
application and, if the workloads are mixed (transaction-based and
throughput-based), which workload is the most critical. Many environments
have a mix of transaction-based and throughput-based workloads;
generally, transaction performance is considered the most critical. However, in
some environments, for example a backup media server environment, the streaming
high-throughput workload of the backup itself is the critical part of the operation. The
backup database, although a transaction-centered workload, is a less critical
workload.
High Transaction environments
So, what are the traits of transaction-based and high-throughput applications? The
following sections explain these traits in more detail.
Applications that use high transaction workloads are better known as Online
Transaction Processing (OLTP) systems. Examples of these systems are database
servers and mail servers. If you have a database, you tune the server type
parameters, as well as the database’s logical devices, to meet the needs of the
database application. If the host server has a secondary role of performing nightly
backups for the business, you may choose to use a different set of logical devices,
which are tuned for high throughput for the best backup performance.
As mentioned in the introduction, you can expect to see a high number of
transactions and a fairly small IO size in OLTP environments. Different databases use
different IO sizes for their logs, and these sizes vary from vendor to vendor. In all
cases, the logs are generally write-oriented workloads. For table spaces, most
databases use between a 4 KB and a 16 KB IO size. In certain applications, larger
chunks (for example, 64 KB) are moved into host application cache memory for
processing. VPLEX currently has a fixed 4 KB page size, and IO in this size range is not
appreciably impacted by the introduction of VPLEX.
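To make the page-size point concrete, the sketch below (our own illustration, not VPLEX code) counts how many fixed 4 KB pages a host IO spans. A page-aligned IO maps cleanly onto pages, while misalignment adds an extra page per IO.

```python
VPLEX_PAGE_SIZE = 4 * 1024  # VPLEX's fixed 4 KB page size

def pages_touched(io_offset, io_size, page_size=VPLEX_PAGE_SIZE):
    """Count how many fixed-size pages a host IO spans."""
    first_page = io_offset // page_size
    last_page = (io_offset + io_size - 1) // page_size
    return last_page - first_page + 1
```

A page-aligned 8 KB IO spans two pages (`pages_touched(0, 8192)` returns 2), while the same IO misaligned by 512 bytes spans three (`pages_touched(512, 8192)` returns 3), one reason alignment matters for small-block OLTP IO.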
Understanding how your application handles its IO is critical to laying out
the data properly at the storage layer. In many cases, the table space is a
large file made up of small blocks of data records. The records are normally
accessed using small IOs of a random nature, which can result in a high
cache miss ratio. It is also important to ensure the back-end storage array is able to keep
up with the IOPS requirement.
Another point to consider is whether the typical IO is a read or a write. Many OLTP
environments run a mix of about 70% reads and 30% writes. However,
the transaction logs of a database application have a much higher write ratio and,
therefore, perform better if they are isolated onto dedicated storage volumes.
VPLEX's large read cache benefits the read portion of such a workload, but the log
volumes will likely not benefit from cache and therefore need underlying storage
volumes that can keep pace with the write workload.
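The effect of the read/write mix and the cache hit ratio on blended response time can be sketched with simple weighted math. The helper and the latency figures below are purely illustrative assumptions, not measured VPLEX numbers:

```python
def effective_latency_ms(read_ratio, hit_ratio,
                         cache_hit_ms, read_miss_ms, write_ms):
    """Blend read and write latencies for a mixed workload.

    All latency inputs are caller-supplied assumptions.
    """
    # Reads split into cache hits and misses; writes are flat.
    read_ms = hit_ratio * cache_hit_ms + (1 - hit_ratio) * read_miss_ms
    return read_ratio * read_ms + (1 - read_ratio) * write_ms
```

For example, a 70/30 read/write mix with an assumed 80% read cache hit ratio, 0.5 ms hits, 5 ms misses, and 2 ms writes blends to about 1.58 ms, showing how much the cache-friendly read portion dominates the result.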
High Throughput environments
With high-throughput workloads, you have fewer transactions but much larger IO per
transaction. IO sizes of 128 KB or greater are normal, and these IOs are generally
sequential in nature. Applications that typify this type of workload are imaging,
video servers, seismic processing, high performance computing (HPC), and backup
servers.
When running applications that use larger IO sizes, it is important to be aware of the
extra IO impact that VPLEX adds by breaking up write IOs that are larger
than 128 KB. For example, a single 1 MB host write requires VPLEX to issue 8 x 128 KB
writes to the back-end storage frame. When practical, the maximum host and
application IO size and allocation units for high-throughput systems should be set
to 128 KB or less. An increase of the maximum back-end write size to 1 MB is expected in
the next major VPLEX code release.
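The write-splitting arithmetic above can be expressed as a one-line helper; this is our own illustrative sketch, not a VPLEX utility:

```python
import math

MAX_BACKEND_WRITE_BYTES = 128 * 1024  # current 128 KB back-end write limit

def backend_writes(host_write_bytes, max_write=MAX_BACKEND_WRITE_BYTES):
    """Back-end writes issued for a single host write of the given size."""
    return math.ceil(host_write_bytes / max_write)
```

`backend_writes(1024 * 1024)` returns 8, matching the 1 MB example above.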
Best practice: Database table spaces, journals, and logs should not be placed on
virtual volumes that reside on extents from the same backend storage volume.
VPLEX Device Geometry
The typical consumption methodology for back-end storage for the majority of
applications is to create 1:1 mapped (physical:virtual) devices. If striped (raid-0) or
concatenated (raid-c) geometries are used, VPLEX devices should be
constructed from storage volumes of similar RAID protection and performance
characteristics. This general purpose rule is applicable to most VPLEX back-end
storage configurations and simplifies the device geometry decision for storage
administrators. This type of physical storage consumption model enables the
continued use of array-based snap, clone, and remote replication technologies (for
example, MirrorView, SnapView, SRDF, and TimeFinder).
It is also important to consider where the failure domains are in the back-end storage
frames and take them into consideration when creating complex device
geometries. In this context, we define a failure domain as the set of storage
elements that will be affected by the loss of a single storage component. We
strongly advise against the creation of VPLEX devices consisting of striped or
concatenated extents across different back-end storage frames. Using different sets
of back-end storage frames makes the failure domain wider and makes it more
susceptible to being affected by a single failure. This can also unbalance the I/O,
and will limit the performance of those striped volumes to the slowest back-end
device. It is acceptable to use striped (raid-0) volumes with applications and storage
frames that do not already stripe their data across physical disks.
Section 5: Back-end Performance Considerations
Storage Considerations
It is of great importance that the selected storage subsystem model is able to support
the required IO workload. Besides availability concerns, adequate performance
must be ensured to meet the requirements of the applications; this includes
evaluating the physical drive type (EFD, FC, SATA) used and whether the internal
architecture of the storage subsystem is sufficient. For example, high-speed Fibre
Channel 15K rpm drives or Enterprise Flash drives are typically selected for use with
transaction-based (OLTP) workloads. As for the subsystem architecture, newer
generations of storage subsystems have larger internal caches, higher bandwidth
busses, and more powerful storage controllers.
Storage Array Block Size
Today VPLEX supports communicating with back-end storage arrays that advertise a
512 byte block size. Within VPLEX, the block-size parameter that you see for a
storage volume is not the underlying storage array's supported block
size, but rather the VPLEX-associated 4 KB block size. Each and every volume
reported by VPLEX today will show 4 KB. This has the implications for host-to-VPLEX IO
size that were discussed in Section 4.
Note: VPLEX can and does read/write to back-end arrays at I/O sizes as
small as 512 bytes as of GeoSynchrony 5.0.
SAN Architecture for Storage Array Connectivity
For back-end (storage) connectivity the recommended SAN topology consists of
redundant (A/B) fabrics. Though EMC does support direct storage-to-VPLEX
connectivity, this practice is extremely limited in terms of cost efficiency,
flexibility, and scalability. Direct connect is intended for proof of concept, test,
development, and/or specific sites that have only a single storage frame. Direct
connect allows for back-end connectivity while reducing the number of required
switch ports, but as mentioned earlier, the sacrifices in terms of scale and flexibility
make this a fairly uncommon connectivity scheme. Sites with multiple arrays, existing
SAN fabrics, or large implementations should plan to utilize dual redundant SAN
connectivity, as it provides the most robust overall solution.
Note: Direct connect applies only to back-end connectivity. Front-end (direct host
to VPLEX) connect is not supported.
Active/Active Arrays
With Active/Active storage platforms such as EMC VMAX and Symmetrix, Hitachi VSP,
IBM XIV, and HP 3PAR, each director in a VPLEX cluster must have a minimum of two
paths to every local back-end storage array and to every storage volume presented
to VPLEX. Each VPLEX director requires physical connections to the back-end
storage across dual fabrics. Each director is required to have redundant paths to
every back-end storage array across both fabrics. Otherwise this would create a
single point of failure at the director level that could lead to rebuilds that
continuously start/restart and never finish. This is referred to as asymmetric backend
visibility. This is detrimental when VPLEX is mirroring across local devices (RAID-1) or
across Distributed Devices (Distributed RAID-1).
Each storage array should have redundant controllers connected to dual fabrics,
with each VPLEX controller having a minimum of two ports connected to the back-
end storage arrays through the dual fabrics (required).
VPLEX allows a maximum of four back-end paths per director to a given LUN. Four is
considered optimal because each director will load balance across the four paths to
the storage volume. It is a maximum because having VPLEX use more paths to any given
storage volume (that is, more Initiator-Target-LUN, or ITL, nexuses) could result in excess
ITL nexuses per storage volume, leading to an inability to claim or work with the device.
Exceeding four paths per storage volume per director can also lead to elongated back-end
path failure resolution, NDU pre-check failures, and decreased scalability.
High quantities of storage volumes (for example, 1000+ storage volumes) or entire arrays
provisioned to VPLEX should be divided into appropriately sized groups (for example,
masking views or storage groups) and presented from the array to VPLEX via groups
of four array ports per VPLEX director so as not to exceed the four-active-paths-per-
director limitation. As an example, following the rule of four active paths per
storage volume per director (also referred to as ITLs), a four-engine VPLEX cluster
could have each director connected to four array ports dedicated to that director.
In other words, a quad-engine VPLEX cluster would have the ability to connect to 32
ports on a single array for access to a single device presented through all 32 ports
and still meet the connectivity rule of 4 ITLs per director. This can be accomplished
using only two ports per back-end I/O module, leaving the other two ports for access
to another set of volumes over the same or different array ports.
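The path arithmetic in this example can be checked with a small helper (an illustration of the rule, not an EMC tool): 4 engines x 2 directors per engine x 4 paths per director gives 32 ITL nexuses for a single storage volume.

```python
DIRECTORS_PER_ENGINE = 2
MAX_PATHS_PER_DIRECTOR = 4  # the four-active-paths rule described above

def itl_count(engines, paths_per_director=MAX_PATHS_PER_DIRECTOR):
    """Total ITL nexuses for one storage volume across a VPLEX cluster."""
    if paths_per_director > MAX_PATHS_PER_DIRECTOR:
        raise ValueError("exceeds four active paths per director")
    return engines * DIRECTORS_PER_ENGINE * paths_per_director
```

`itl_count(4)` returns 32 for a quad-engine cluster, matching the 32 array ports in the example, while `itl_count(1)` returns 8 for a single engine.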
Appropriateness is judged based on the planned total IO workload for the
group of LUNs and the limitations of the physical storage array. For example, storage
arrays often have limits on the number of LUNs per storage port, storage group,
or masking view.
Maximum performance, environment wide, is achieved by balancing IO workload
across the maximum number of ports on an array while staying within the ITL limits.
Performance is not based on a single host but on the overall impact of all resources
being utilized. Proper balancing of all available resources provides the best overall
performance.
Storage Best Practices: Create separate port groups within the storage frame for
each of the logical path groups that have been established. Spread each group
of four ports across storage array engines for redundancy. Mask devices to allow
access to the appropriate VPLEX initiators for both port groups.
Figure 12 shows the physical connectivity from a quad-engine VPLEX cluster to a
hex-engine VMAX array.
Figure 12 – Active/Active Storage to VPLEX Connectivity
Similar considerations should apply to other active/active arrays as well as ALUA
arrays. Follow the array best practices for all arrays including third party arrays.
The devices should be provisioned in such a way as to create "digestible" chunks
and provisioned for access through specific FA ports. The devices within this device
grouping should restrict access to four specific FA ports for a VPLEX A-director port
group and a different set of FA ports for a VPLEX B-director port group.
The VPLEX initiators (back-end ports) on a single director should be spread across engines
to increase HA and redundancy. The array should be configured into initiator groups
such that each VPLEX director acts as a single host with four paths.
This could mean four physical paths or four logical paths per VPLEX director
depending on port availability and whether or not VPLEX is attached to dual fabrics
or multiple fabrics in excess of two.
For the example above, the following basic limits apply on the VMAX:
Initiator Groups (IG) (HBAs): maximum of 32 WWNs per IG; maximum of 8192 IGs on a
VMAX; port flags are set on the IG; an individual WWN can belong to only 1 IG.
Cascaded Initiator Groups have other IGs (rather than WWNs) as members.
Port Groups (PG) (FA ports): maximum of 512 PGs; the ACLX flag must be enabled on
the port; ports may belong to more than 1 PG.
Storage Groups (SG) (LUNs / Symm Devs): maximum of 4096 Symm Devs per SG; a
Symm Dev may belong to more than 1 SG; maximum of 8192 SGs on a VMAX.
A Masking View consists of an Initiator Group, a Port Group, and a Storage
Group.
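As a way to sanity-check a planned layout against these limits, the following sketch models a masking view in Python. The class and field names are our own invention for illustration, not a Symmetrix API:

```python
from dataclasses import dataclass

MAX_WWNS_PER_IG = 32    # WWNs per Initiator Group
MAX_DEVS_PER_SG = 4096  # Symm Devs per Storage Group

@dataclass
class MaskingView:
    """One Initiator Group, one Port Group, and one Storage Group."""
    initiator_wwns: list
    fa_ports: list
    devices: list

    def validate(self):
        """Check this view against the basic VMAX limits noted above."""
        if len(self.initiator_wwns) > MAX_WWNS_PER_IG:
            raise ValueError("more than 32 WWNs in one initiator group")
        if len(self.devices) > MAX_DEVS_PER_SG:
            raise ValueError("more than 4096 devices in one storage group")
        return True
```

Building the four planned masking views as objects and calling `validate()` on each catches limit violations before any array configuration is attempted.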
We have divided the back-end ports of the VPLEX into two groups, allowing us to
create four masking views on the VMAX. Ports FC00 and FC01 for both directors are
zoned to two FAs each on the array. The WWNs of these ports are the members of
the first Initiator Group and will be part of Masking View 1. The Initiator Group created
with this group of WWNs becomes a member of a second Initiator Group, which
in turn becomes a member of a second Masking View. This is called Cascaded
Initiator Groups. This was repeated for ports FC02 and FC03, placing them in Masking
Views 3 and 4. This is only one example of attaching to the VMAX; other
configurations are allowed as long as the rules are followed.
VPLEX virtual volumes should be added to masking views containing initiators from a
director A and initiators from a director B. This translates to a single host with two
initiators connected to dual fabrics and having four paths into two VPLEX directors.
VPLEX would access that host's storage volumes via eight FAs on the array through
two VPLEX directors (an A director and a B director). The VPLEX A director and B
director each see four different FAs across at least two VMAX engines, if available.
This is an optimal configuration that spreads a single host's I/O over the maximum
number of array ports. Additional hosts will attach to different pairs of VPLEX directors
in a dual-engine or quad-engine VPLEX cluster. This will help spread the overall
environment I/O workload over more switch, VPLEX, and array resources. This
would allow for the greatest possible balancing of all resources resulting in the best
possible environment performance.
Figure 13 – ITLs per Storage Volume
Figure 13 shows the ITLs per storage volume. In this example the VPLEX cluster is a
single engine connected to an active/active array with four paths per storage
volume per director, giving a total of eight logical paths. The Show ITLs button
displays the ports on the VPLEX director from which the paths originate and the FAs
to which they are connected.
Active/Passive Arrays
When using a storage array that operates in an active-passive model, each director
needs to have logical (zoning and masking) and physical connectivity to both the
active and the passive storage controller. This ensures that VPLEX does not lose
access to storage volumes if the active controller fails or is restarted.
Additionally, arrays like the CLARiiON® have limitations on the size of initiator or
storage groups. It may be necessary to have multiple storage groups to
accommodate provisioning storage to the VPLEX. Follow the logical and physical
connectivity guidelines described earlier in this section.