
Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 87 Elm Street, Suite 900, Hopkinton, MA 01748 | T: 508.435.2556 | F: 508.435.2557 | www.tanejagroup.com

TECHNOLOGY IN DEPTH

PROTOCOL CHOICES FOR STORAGE IN THE VIRTUAL INFRASTRUCTURE

JUNE 2014

Over the past few years, server virtualization has rapidly emerged as the de facto standard for today’s data center. But the path has not been an easy one, as server virtualization has brought with it a near upheaval in traditional infrastructure integrations.

From network utilization to data backup, almost no domain of the infrastructure has been untouched, but by far some of the deepest challenges have revolved around storage. It may well be the case that no single infrastructure layer has ever imposed as great a challenge to any single IT initiative as the challenges that storage has cast before virtualization.

After experiencing wide-reaching initial rewards, IT managers have aggressively expanded their virtualization initiatives, and in turn the virtual infrastructure has grown faster than any other infrastructure technology ever before deployed. But with rapid growth, demands against storage have exceeded the level any business could have anticipated, requiring performance, capacity, control, adaptability, and resiliency like never before. In an effort to address these new demands, it quickly becomes obvious that storage cannot be delivered in the same old way. For organizations facing scale-driven virtualization storage challenges, storage must be delivered in a more utility-like fashion than ever before.

What do we mean by utility-like? Storage must be highly efficient; more easily presented, scaled, and managed; and delivered with more consistent performance and reliability than ever before.

In the face of these challenges, storage has advanced by leaps and bounds, but differences still remain between products and vendors. This is not a matter of performance or even purely interoperability, but rather one of suitability over time in the face of growing and constantly changing virtual infrastructures – changes that don’t solely revolve around the number and types of workloads, but also include a constantly evolving virtualization layer. A choice today is still routinely made – typically at the time of storage system acquisition – between iSCSI, Fibre Channel (FC), and NFS. While often a choice between block and file for the customer, there are substantial differences between these block and file architectures, and even between iSCSI and FC, that will define the process of presenting and using storage, and determine the customer’s efficiency and scale as they move forward with virtualization. Even minor differences will have long-ranging effects and ultimately determine whether an infrastructure can ever be operated with utility-like efficiency.

In this Technology in Depth report, Taneja Group set out to evaluate these protocol choices and determine what best fits the requirements of the virtual infrastructure. We built our criteria with the expectation that storage is about much more than performance, interoperability, or up-front ease of use – factors that are too often bandied about by vendors who conduct their own assessments while using their own alternative offerings as proxies for the competition. Instead, we defined a set of criteria that we believe are determinative in how customer infrastructure can deliver, adapt, and last over the long term.

Copyright The TANEJA Group, Inc. 2012. All Rights Reserved. 2 of 23 87 Elm Street, Suite 900 Hopkinton, MA 01748 T: 508.435.2556 F: 508.435.2557 www.tanejagroup.com

Technology in Depth

We summarize these characteristics as five key criteria. They are:

• Efficiency – in capacity and performance
• Presentation and Consumption
• Storage Control and Visibility
• Scalable and Autonomic Adaptation
• Resiliency

These are not inconsequential criteria, as a key challenge before the business is effectively realizing intended virtualization gains as the infrastructure scales. Moreover, our evaluation is not a matter of performance or interoperability – the protocols themselves have comparable marks here. Rather, our assessment is a broader consideration of storage architecture suitability over time in the face of a growing and constantly changing virtual infrastructure. As we’ll discuss, mismatched storage can create a number of inefficiencies that defeat virtualization gains and create significant problems for the virtual infrastructure at scale, and these criteria highlight the alignment of storage protocol choices with the intended goals of virtualization.

What did we find? Block storage solutions carry significant advantages today. Key capabilities such as VMware API integrations, and approaches to scaling, performance, and resiliency make a difference. While advantages may be had in initial deployment with NAS/NFS, architectural and scalability characteristics suggest this is a near-term advantage that does not hold up in the long run. Meanwhile, between block-based solutions, we see the differences today surfacing mostly at scale. At mid-sized scale, iSCSI may have a serious cost advantage, while “converged” form factors may let the mid-sized business/enterprise scale with ease into the far future. But for businesses facing serious IO pressure, or looking to build an infrastructure for long-term use that can serve an unexpected multitude of needs, FC storage systems deliver utility-like storage with a level of resiliency that likely won’t be matched without the FC SAN.

Selecting Storage for the Utility-class Virtual Infrastructure

• Fibre Channel (score: 4) – Class-leading for customers at scale who need efficiency, adaptability, control, and resiliency.

• iSCSI (score: 3) – Class-leading for mid-range customers or those pursuing truly converged infrastructures and the resulting ideal acquisition costs, with easy deployment, unmatched adaptability, and comprehensive features.

• NAS/NFS (score: 2) – Long heralded as a sensible choice for the virtual infrastructure, but trends toward scale, mission-critical workloads, and the need to deeply control and manage storage resources leave NAS/NFS at a disadvantage.

Table 8: Our findings from an assessment of storage choices for the virtual infrastructure (Harvey Ball scores indicate completeness of the solution when assessed against our five criteria – a full circle is more complete). What follows is our full in-depth assessment of these protocols within today’s VMware virtualization landscape.


VIRTUALIZATION – EFFICIENCY REQUIRES MORE THAN COMPUTE

For today’s enterprise, virtualization is now a long-ranging, center-of-stage technology. IT practitioners and vendors alike are focusing on big, business-important virtualization initiatives such as business-critical applications and business continuity. The enterprise has turned an eye toward virtualization as a key technology that will unlock higher efficiency in terms of both utilization and on-going operation for all systems. The cost proposition in turn is immense. Virtualization stands to create significant improvements in total IT cost of ownership by reducing both the capital cost of equipment as utilization improves, and the operational cost of management as virtualization’s layer of homogeneous abstraction, broad-reaching insight, and powerful tools are brought to bear.

Moreover, virtualization is by no means out of tricks. On-going innovations are bringing entirely new sets of functionality and products to bear to further reduce IT complexity and enable easier, faster, and cheaper access to better IT infrastructure. Many of these products – such as VMware vCloud and the Software-defined Data Center – are introducing a utility-like degree of self-service, access to resources, and management efficiency. With these products that introduce both higher levels of abstraction and more agile adaptation to new demands, administrators may soon be able to focus on managing less granular sets of resources while enabling their customers to easily consume whatever level of compute is desired. In effect, this promises to reduce integration complexity, and unleash a new level of efficient scalability with lower overhead for even better IT infrastructure services.

Figure 1: Intended TCO improvements are often eroded by a mismatch between virtual infrastructure and physical infrastructure capabilities, especially storage, that in turn create significant complexity and operational overhead.

But no matter the future vision of software-defined capabilities, the reality today is that the virtual infrastructure consists of many different parts – compute, networking, storage, and storage fabrics. Achieving efficiency and scalability requires the right foundation, one that closely integrates these multiple technology domains so that they work well together. Most often, the biggest hurdle to building that foundation is storage. A utility-like compute layer is of little use if it is so mismatched with storage capabilities that it is constantly and severely constrained – and this can easily be the case. Mismatched storage can create a number of inefficiencies that defeat virtualization gains and create significant problems for the virtual infrastructure at scale.

Constraint – Problem

• Efficiency – Consolidation drives up both space and performance demand, and requires a system that is uniquely efficient in delivering both; otherwise virtualization just trades off physical servers in exchange for an ever-larger storage infrastructure.

• Provisioning and Management – Virtual workloads grow in number faster than when workloads were constrained by physical hardware. The demand for more storage for more virtual systems can make provisioning and management deficiencies in the storage infrastructure a hurdle to getting good use out of virtualization.

• Control – Virtualization may virtualize everything, but needs differentiation – not everything can be tier 1 storage. At scale, it is too difficult to segregate and differentiate between storage demands, especially if it requires multiplying the number and type of storage systems.

• Flexibility – Virtualization unleashes flexibility – workload movement, cluster scaling, dynamic alterations in resources. Yet storage is often far from similarly flexible, and cannot easily adapt, scale, or alter allocated resources as virtual infrastructure needs change.

• Availability and Resiliency – At scale, storage systems multiply to compensate for storage inflexibility. With complexity comes risk, lowered resiliency, and inevitable compromises in availability from both incident/outage and service level compromise.

Table 1: Constraints commonly imposed by storage systems that create pain points in the virtual infrastructure.

The challenges above have faced storage infrastructures for decades, but virtualization has meant such challenges surface more often, and with greater impact on business capabilities and costs. An increasingly efficient and utility-like virtual infrastructure needs to be matched to an efficient and utility-like storage foundation. In turn, beneath the market-dominant ESXi hypervisor, both VMware and storage vendors have pursued a number of different features and capabilities intended to enhance storage in a utility-like fashion, detailed further in Table 2. Today, storage technologies and VMware vSphere APIs come together to enable better virtual infrastructure storage services.

THE IMPORTANCE OF VAAI AND VASA

In the past couple of years, VMware has steadily introduced a string of storage interoperability enhancements under the banner of vSphere Storage APIs. These Storage APIs are wide-ranging, and include the APIs for Data Protection (VADP), the vSphere API for Multipathing, the vSphere APIs for SRM, the vSphere API for Array Integration (VAAI), and the vSphere API for Storage Awareness (VASA). Today, VAAI has been the highest profile API for integrating storage capabilities with virtual infrastructure capabilities, and can have significant impact on storage efficiency. VAAI is built on top of standard SCSI T10 extensions, and includes five key capabilities: Block Zeroing, Hardware Assisted Locking, Extended Copy, Stun for Thin (pause at quota or threshold), and Space Reclamation. Each of these capabilities is designed to offload key operations to the array (such as writing out zeros, or copying existing data blocks), enhance virtualization and storage interaction, or enhance the management of capacity in thin provisioning environments. Vendor support for VAAI today still varies, but varies most significantly by storage protocol, with VMware slower to introduce support for NFS in key APIs. Secondary to VAAI, VASA also plays a key role with primary storage in the virtual infrastructure by allowing vSphere to peer into storage capabilities, such as disk types and speed, RAID settings, and snapshot space. In the future, as features like vVols are introduced (see Sidebar: vVols Simplify SANs), VASA will become increasingly important.
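To make the offload concrete, consider Block Zeroing. The back-of-envelope sketch below is ours, not VMware code; the IO and command sizes are illustrative assumptions. It compares host-driven zeroing of a new virtual disk with the offloaded path, where only small SCSI WRITE SAME commands cross the fabric and the array writes the zeros itself.

```python
# Hypothetical model of the Block Zeroing (WRITE SAME) offload.
# All sizes are illustrative assumptions, not measured values.

DISK_GB = 100
IO_SIZE_MB = 1             # assumed host-side zeroing transfer size
WRITE_SAME_SPAN_MB = 512   # assumed extent one offloaded command can cover

disk_mb = DISK_GB * 1024
host_writes = disk_mb // IO_SIZE_MB           # zero payload crosses the wire
offload_cmds = disk_mb // WRITE_SAME_SPAN_MB  # only tiny commands cross

print(f"Host-driven zeroing: {host_writes:,} x {IO_SIZE_MB} MB writes "
      f"({disk_mb:,} MB on the wire)")
print(f"VAAI Block Zero    : {offload_cmds:,} WRITE SAME commands "
      f"(~0 MB of zero payload)")
```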


Pursuing Storage Improvements – VMware APIs and Storage Vendor Advancements

• Efficiency – vSphere APIs for Array Integration (VAAI) features including block zero, full copy, SCSI UNMAP, and snapshot offload; storage improvements such as caching and acceleration; tiering, wide-striping, short stroking, and write-optimized data structures; thin provisioning, thin reclamation, thin snaps, read/write snaps, and snapshot scalability.

• Presentation and Consumption – vSphere plugins, Virtual Storage Appliances, vCloud integrations for multi-tiered storage, more easily deployable storage architectures and form factors, distributable and scalable storage systems that reduce sprawl.

• Storage Control and Visibility – Storage IO Control, vSphere APIs for Storage Awareness (VASA) and SMI-S APIs, command integrations (e.g. esxtop), STUN, Quota STUN, plugin integrations for storage management.

• Scalable and Autonomic Adaptation – Storage DRS, multipathing, controller clustering, scale-out architectures.

• Resiliency – Storage DRS (IO distribution and queue scale), storage-driven fabric/connectivity architecture, pathing, controller connectivity and load balancing, fabric bandwidth aggregation, storage system placement in large scaling infrastructures, stretched storage clusters, real-time replication.

Table 2: Innovations designed to address key customer pain points around storage and the virtual infrastructure.

PROTOCOL AND ARCHITECTURAL DIFFERENTIATION

Today, mainstream choices for storage in a virtual infrastructure consist of three networked storage options – Fibre Channel and iSCSI Storage Area Networks (SANs), and NFS Network Attached Storage (NAS). While these three terms represent “storage protocols” and storage fabric architectures, they have also come to represent distinctly different storage system architectures in today’s market. Storage systems in each category have broad general architectural similarities, and when assessed in the context of VMware vSphere integration, they face similar integration capabilities or limitations. In general, the most obvious differences surface between NFS and iSCSI/FC, as one is file and the others are block storage, and between iSCSI and FC because of differences in the network fabric that have historically motivated vendors to pursue different paths toward interaction with host systems.

Figure 2: FC, iSCSI, and NAS/NFS appear seemingly similar in deployment, but subtle and long-ranging differences exist between these storage solutions in both their storage system capabilities and their integration with virtual infrastructure storage innovations.

In a nutshell, these protocols are today associated with the capabilities of particular architectures in terms of controller scalability, scale-out and/or capacity aggregation capabilities, IO distribution, and VAAI and VASA integration, as well as more nuanced details (a few of these major details are summarized in Table 3). Given the range of architectural and capability variation among these storage protocols, selecting a storage system of one protocol or another can have subtle but long-ranging impact on the capabilities of virtual storage infrastructures. Furthermore, these choices and implications apply even among the choice of protocols on a multi-protocol storage system.

• Storage Layout – FC: multiple block devices aggregated and served up as a remotely accessible block LUN. iSCSI: multiple block devices aggregated and served up as a remotely accessible block LUN. NAS/NFS: remotely accessible file system where data is written into variable-length files.

• Storage Protocol – FC: FC (SCSI over FC). iSCSI: iSCSI (SCSI over Ethernet and TCP/IP). NAS/NFS: Network File System v3 – NFSv3 (standard today).

• Data Format – FC: VMFS datastore (cluster accessible) containing VMDKs, or directly mounted by VM as RDM. iSCSI: VMFS datastore (cluster accessible) containing VMDKs, or directly mounted by VM as RDM. NAS/NFS: NAS file system (proprietary) containing VMDKs (cluster accessible).

• Hypervisor Interaction – FC: storage protocol. iSCSI: storage protocol. NAS/NFS: network protocol.

• Storage Granularity (as of vSphere 5.1) – FC: LUN/datastore. iSCSI: LUN/datastore. NAS/NFS: file level / VMDK.

• Connectivity – FC: optical fiber. iSCSI: copper or fiber Ethernet. NAS/NFS: copper or fiber Ethernet.

• Active Multipathing – FC: yes. iSCSI: yes. NAS/NFS: no.

• Scalability – FC: typically highly clustered multi-controller architecture. iSCSI: typically scale-out multi-controller architecture. NAS/NFS: typically dual controller.

• Typical level of VAAI and VASA support – FC: full. iSCSI: full. NAS/NFS: partial.

Table 3: Basic overview of the three predominant choices for storage in virtual infrastructures.

In general, these similarities hold true for each given protocol, even though there are exceptions. For example, NFS systems are typically designed such that client connectivity is single-connection oriented, and storage systems sit behind a single pair of clustered controllers. It is not that this must be the case, but that architectural departures from this model are more complex, and fewer are found in virtualization environments today. For example, scaling out controllers and providing multipathed, multi-connection connectivity with an NFS solution typically requires proprietary connection clients or next-generation pNFS functionality that is not compatible with the VMware ESXi network stack, and attempts to tackle these challenges have not yet demonstrated much success in the market.
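The single-connection point is worth quantifying. In the minimal sketch below (our own model, with illustrative link speeds), an NFSv3 datastore session is capped at one path, while active/active block multipathing aggregates paths:

```python
# Illustrative bandwidth-ceiling comparison; link speeds are assumptions.

links_gbps = [10, 10]  # two host-to-storage paths

nfs_ceiling = max(links_gbps)   # single-connection NFSv3: one path at a time
mpio_ceiling = sum(links_gbps)  # active/active multipath: paths aggregate

print(f"NFSv3 datastore bandwidth ceiling : {nfs_ceiling} Gb/s")
print(f"Block multipath bandwidth ceiling : {mpio_ceiling} Gb/s")
```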


The Problem with Assessing Protocols

The testing data in the market today is often oriented only toward testing performance and interoperability, and is further biased by multi-protocol test results run against one vendor's multi-protocol system. For performance, such an approach encumbers the system being evaluated with additional overhead – in unified, multi-protocol storage solutions, either the block storage will be encapsulated on top of a file system, or the file storage approach will be layered on top of block storage volumes and protocol processing. The combination is usually determined by a vendor's particular approach to volume virtualization and feature engineering, with a goal of serving up consistent features across both file and block. Better discernment of differences can be made by testing best-of-breed solutions against each other.

Moreover, many times multi-protocol vendors will claim superiority of one approach over another based on a particular feature set and enablement inside the virtual infrastructure, and then claim that as an advantage against all products in the market serving up the less optimal protocol. Such conclusions are misleading, and too broadly generalize the expectations for different storage solutions in the market. For example, around NFS storage it is often claimed or perceived that datastore capacity management can be dismissed, solving the headaches of datastore growth in a virtual infrastructure and thereby delivering superior ease of use versus block storage. Against legacy SAN architectures this may be true, but it is no longer the norm among latest-generation block storage systems and the latest generations of VMware’s VMFS. Latest-generation block storage systems that fully virtualize volumes in the array offer matching functionality. Moreover, a new feature from VMware – vVols (see Sidebar: vVols Simplify SANs) – is soon to be upon us, and will do more to simplify the SAN while simultaneously extending how features can be applied to individual VMs; in contrast, NAS will not benefit as much. Coupled with scale-out or highly clustered, highly scalable block architectures, vVols look poised to simplify storage versus dual-controller architectures that are more likely to encounter capacity and performance limits – systems that can lead to inevitable sprawl, storage silos, and underutilization. Finally, and not to be overlooked, SAN technologies are today more often the ones tightly integrated into converged infrastructure offerings, and these solutions can offer advantages in simplifying the act of scaling while still keeping the storage and virtual infrastructures well aligned.

But regardless, performance and interoperability are only part of the equation. As illustrated, the virtual infrastructure is about much more than purely performance and interop.

ASSESSING PROTOCOLS AND STORAGE ARCHITECTURES FOR VIRTUALIZATION

Selecting Storage for the Utility-class Virtual Infrastructure

• Fibre Channel (score: 4) – Class-leading for customers at scale who need efficiency, adaptability, control, and resiliency.

• iSCSI (score: 3) – Class-leading for mid-range customers or those pursuing truly converged infrastructures and the resulting ideal acquisition costs, with easy deployment, unmatched adaptability, and comprehensive features.

• NAS/NFS (score: 2) – Long heralded as a sensible choice for the virtual infrastructure, but trends toward scale, mission-critical workloads, and the need to deeply control and manage storage resources leave NAS/NFS at a disadvantage.

Table 4: Assessing storage protocol and storage system choices for the virtual infrastructure, a preview of our detailed findings.


Because the storage market lacks a comprehensive and unbiased assessment of storage protocols, Taneja Group set out in mid-2012 to conduct a vendor-agnostic assessment of the storage choices in the virtual infrastructure. While many customers will continue to stand by their favorite incumbent, the implications of a less-than-ideal storage choice for the virtual infrastructure are significant, and the results of an objective assessment may surprise even long-term storage administrators.

We’ll turn next to a detailed discussion of each of these respective storage choices – NFS, iSCSI, and FC – while identifying the typical characteristics of the associated storage architectures. Following this protocol-by-protocol review, we’ll summarize how these results stack up and provide our assessment of where the best storage investments are today made for the virtual infrastructure.

Network File System – NFS

First brought to market in the mid-1980s by Sun Microsystems, Network File System (NFS) was designed to simulate local file access against a file system on a remote server. NFS uses a handful of NFS-specific file access commands executed across a network over the Remote Procedure Call (RPC) protocol. RPC is a remote execution protocol that runs across an IP network using UDP or TCP. By using a standardized set of operations, NFS allows a client to see a standard, compatible representation of directories and files irrespective of what file system a remote storage system may be running.

While variations exist – such as NFS over RDMA (NFSoRDMA) operating on an InfiniBand network – most common NFS storage infrastructures today run the protocol over a gigabit or 10 gigabit Ethernet network. In typical implementations, NFS hands off requests to multiple execution “threads” on the client, enabling many simultaneous commands, including tasks such as read-ahead caching, without creating IO bottlenecks for compute processes.

While originally envisioned as remote file access, storage hardware advancements over the years have allowed some vendors to turn their NFS storage offerings toward extreme workloads – highly transactional workloads that require low latency, far beyond what the original pioneers likely ever thought would be stored and accessed with NFS. In turn, NFS has become a flexible protocol that can be used for general-purpose storage. Commonly, vendors demonstrate proof points for applications that demand low latency such as databases, Microsoft Exchange, and of course virtualization. In reality, the NFS market has largely bifurcated into two camps – scale-out systems that are designed to deliver extreme capacity and performance for content, and traditional-architecture systems that deliver more general-purpose storage. Typically, the highly scalable NFS systems on the market today do not possess a full set of storage features, nor do they perform well under the small, highly random IO typical of transactional applications and virtualization, and they fall outside of our assessment here.

The common general-purpose storage systems on the market today, with a few less common exceptions, remain dual-controller architectures, and this architecture has been in part defined by the nature of NFS connectivity. NFS storage over Ethernet is bound to the connectivity capabilities of UDP, TCP, and RPC. These protocols dictate single-pathed, point-to-point connections between an NFS client and NFS server. This has in turn led general-purpose NFS storage vendors to stick with dual-controller storage system architectures, as the performance that could be delivered through hardware advancements has long seemed well suited to keeping up with client performance demands.

Figure 3: Typical NFS connections and IO paths, assuming active-active dual storage controllers.

The nature of NFS interaction has imposed some challenges as well. NFS can impose much more communication back and forth than seems reasonable, and it can be prone to unintentional abuse by clients if storage systems are multipurpose. Routine interaction, and the occasional misbehavior by a machine or human client can create excessive read activity on the storage system, while small NFS interactions can create excessive load on both the client and the server. At scale, both factors make a case for NFS acceleration through read caching and NFS protocol processing offload. Both types of products are readily available in the market today, and can help a NAS storage system go further before running out of steam.

In use, NFS storage is easily presented to hypervisors as a datastore, using typical IP address semantics, with little complexity. Once presented, a single NFS mount point can easily serve many machines and historically has not been restricted by volume size limitations that affected block storage (although recent generations of VMware’s vSphere suite have put an end to these complications for block volumes as well). The ability to reach an NFS NAS from anywhere Ethernet can reach makes this large single mount point attractive for the virtual infrastructure – a single NAS can serve a very large virtual infrastructure provided there is sufficient performance and capacity within the NAS, although reasonable care needs to be taken to ensure sufficient connectivity, bandwidth, and low latency. Meanwhile, NAS also provides versatility by simultaneously addressing file sharing needs, such as image repositories, user directories, or application mounts, from the same storage system.

While NAS storage management, accessibility, and storage versatility are no doubt strengths in many use cases, it is also apparent these strengths can simultaneously be weaknesses for the virtual infrastructure, and have in fact plagued many users. Storage controllers that enable unique levels of storage management on top of a proprietary file system can obscure visibility and limit insight into the storage system. There is no convenient, standardized way to communicate extra storage detail to a NAS client. Moreover, the NFS protocol resides in the network protocol stack within VMware’s ESXi rather than the storage protocol stack. These factors are key impediments that consistently create extra cycles and delay in developing many VAAI integrations. Today, NFS still lacks some VAAI integrations, and some integrations fall short of covering all VMware products.

As an example, NFS has no support for VAAI’s XCOPY when it comes to Storage vMotion, and VAAI’s Extended Statistics are not thoroughly integrated for NFS (in tools such as esxtop). Historically, NFS has lagged in receiving new capabilities – for example, VAAI could offload snapshots to the NFS storage system, but not full clones of VMs. Vendors have gone far in delivering functionality through vendor-proprietary plugins, but these can yield limitations compared to other protocols; for example, VMware’s vCloud Director can rely on VAAI snapshot offload for block storage performance at scale, but could not do so with NFS, forcing NFS adopters to rely on vCloud Director’s Fast Provisioning, which had disadvantages in performance and limited ESXi cluster size.

WHERE IS NFS SCALE-OUT?

Advancing storage system design beyond two controllers is complex, requiring the distribution of data across many nodes, and then the intelligent handling of requests for data across many storage controllers. Not only must the storage controllers have a cohesive, cluster-wide view of stored data, but they must also be able to deal with client requests where the data may be stored and authoritatively controlled by another controller. These challenges become even more significant, if not insurmountable, when trying to scale performance for single files such as databases or the VMDKs behind virtualization beyond the limitations of a single controller – this means multiple controllers must be able to manage a single file. Meanwhile, managing metadata and data consistency across this cluster often adds latency and overhead that makes small random IO impractical, and can make it impossible to implement the typical storage features most businesses demand – including snapshots and clones. NFSv4 promises to deal with some of these challenges, but still appears far from becoming common.
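The sidebar’s point can be illustrated with a toy placement model: if a scale-out design assigns each whole file to one controller (a simple hash below; real systems vary widely), then a single hot VMDK can never draw on more than one controller’s performance, no matter how many nodes the cluster has. Names and paths are made up.

```python
# Toy whole-file placement by hash; illustrative only, not any vendor's design.
import hashlib

controllers = ["node-a", "node-b", "node-c", "node-d"]

def owner(path):
    digest = int(hashlib.md5(path.encode()).hexdigest(), 16)
    return controllers[digest % len(controllers)]

for f in ["vm01/vm01.vmdk", "vm02/vm02.vmdk", "bigdb/bigdb.vmdk"]:
    # every IO to a given file lands on the same node, however hot it gets
    print(f, "->", owner(f))
```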

Finally, while the large mount point approach common across customers simplifies initial provisioning and allows versatile use of a single storage system, it can reduce visibility and proactive planning in rapidly growing virtual infrastructures. NFS datastores are by default thin-provisioned (VMDKs are treated as files that grow over time as they are used, and consequently are not fully allocated all at once). Large numbers of virtual machines can rapidly consume space, and limited visibility can make it difficult to guard against running out of space. Meanwhile, the consequences of running out of space can be severe: with limited scalability, NFS environments may need to add an additional storage system, and migrating and rebalancing workloads across those storage systems may be difficult. Moreover, the shared nature of NAS storage may mean there are many dependencies upon a single storage system, further exacerbating migration headaches. We have as a result seen large-scale environments using multiple NAS systems set goals of using no more than 50% of the expected capacity and performance in order to avoid forced migration and rebalancing efforts.
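That 50% discipline amounts to simple bookkeeping that can be automated. The hedged sketch below (names and thresholds are illustrative, not any vendor’s API) compares space actually used with space promised to thin VMDKs and flags datastores before migration becomes painful:

```python
# Illustrative capacity-headroom check for thin-provisioned datastores.

def headroom_report(datastores, alert_at=0.5):
    """datastores: list of (name, capacity_gb, used_gb, provisioned_gb)."""
    for name, capacity, used, provisioned in datastores:
        overcommit = provisioned / capacity   # how far thin promises exceed capacity
        utilization = used / capacity
        status = "OK" if utilization < alert_at else "REBALANCE/EXPAND"
        print(f"{name}: {utilization:.0%} used, "
              f"{overcommit:.1f}x overcommitted -> {status}")

headroom_report([
    ("nfs-ds-01", 10_000, 4_200, 18_000),  # under the 50% goal noted above
    ("nfs-ds-02", 10_000, 7_900, 26_000),  # past it: migration may already hurt
])
```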

NFS – Strengths and Deficiencies

Efficiency
Strengths:
• Storage-side features are comprehensive, and some plugins can put powerful features in the hands of virtual server admins
• Large mount points can simplify deployment and management, until capacity and performance limitations are reached
• Innovative vendor plugins can enhance management and some operations
Deficiencies:
• Limited VAAI and VASA support compromises capabilities and leads to less performance or capacity efficiency
• Large mount points with no path to further scaling compromise long-term efficiency
• Thin reclamation made complex by file system architecture
• Typically limited system scalability that eventually leads to sprawl and complexity

Presentation and Consumption
Strengths:
• Easy initial setup and use
• Versatile access from nearly anywhere
Deficiencies:
• At scale, management of many systems can compromise ease of presentation

Storage Control and Visibility
Deficiencies:
• Tends to manage everything in one large storage pool with limited control and visibility, essentially putting the enterprise at the mercy of a single, undifferentiated pool of storage
• Limited VAAI and VASA support compromises visibility and control

Scalable and Autonomic Adaptation
Strengths:
• At best, NFS/NAS vendors have employed solid state acceleration technologies to extend the performance of a single system
Deficiencies:
• Limited scalability means limited adaptation as the infrastructure grows. This leads to sprawl, and to underutilization to avoid limits and painful rebalancing.

Resiliency
Strengths:
• Active-active controllers made easy by NFS and IP
• Total control of a proprietary file system can help vendors ensure data protection and integrity
Deficiencies:
• Large shared storage systems can mean multiple hops across shared Ethernet networks, creating complexity and risk
• Single connections for sessions and reliance on IP network mechanisms to handle failures (e.g. Spanning Tree) can mean disruptions during logical or physical failures

Table 5: NAS/NFS Strengths and Weaknesses

iSCSI

iSCSI, first brought to market in 2001, is an IP-based storage networking standard that carries traditional SCSI storage commands over an IP network to a storage system serving up block-based storage volumes (LUNs). This makes storage interactions routable and switchable over the standard, ubiquitous, and low-cost Ethernet network (or other IP-carrying networks such as InfiniBand) while preserving the efficiency and low latency of direct block access and the SCSI command set.

Directly connected to the client (or initiator in iSCSI terminology), a storage volume is presented as a raw block volume and is entirely under control of that client for all data management and storage. At the storage system (or the target in iSCSI terminology), disks are typically aggregated, protected by RAID, and then served up as a “virtual” logical disk to clients. The storage system then provides various levels of storage features such as snapshots, clones, replication, volume expansion, volume migration, thin provisioning, auto-tiering, and various other functionality. With widespread integration of iSCSI in standard OSs, including hypervisors, iSCSI today is easily deployed and rivals NFS in terms of ease of provisioning. Simultaneously, iSCSI provides better means of segregating and controlling workloads – the SCSI transaction nature of iSCSI provides superior over-the-wire efficiency, and vendors have been able to easily harness the accessibility of IP networking to peer into storage interactions and enhance how storage systems interoperate with the virtual infrastructure.
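As a small aside on naming, initiators and targets identify themselves with iSCSI Qualified Names (IQNs) of the form iqn.<yyyy-mm>.<reversed-domain>:<label>. The snippet below uses a simplified check, not the full RFC 3720 grammar, and the example names are made up:

```python
# Simplified IQN format check; not a complete RFC 3720 validator.
import re

IQN = re.compile(r"^iqn\.\d{4}-\d{2}\.[a-z0-9.-]+(:.+)?$")

for name in ["iqn.1998-01.com.vmware:esxi-host-01",
             "iqn.2001-05.com.example:array.ctrl1",
             "not-an-iqn"]:
    print(name, "->", "valid" if IQN.match(name) else "invalid")
```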

As is the case with all block-based networked storage technologies, iSCSI operates with standard SCSI commands, only they are transmitted over an Ethernet wire. In turn, virtual infrastructure integrations based on SCSI command extensions (such as Full Copy, Block Zero, Hardware Assisted Locking, and others) use the same storage protocol stack as Fibre Channel and have become rapidly and widely supported by iSCSI storage systems – for most systems, VAAI, VASA, and Multipath APIs are fully supported. The use of both SCSI and IP in all access and communications has historically facilitated rapid innovation in iSCSI storage management, and likely plays, and will continue to play, a key role in how fast iSCSI supports new API innovations. Moreover, in the case of multipath, the iSCSI protocol was designed to be highly reliable and multipath capable. iSCSI has robust support for both native active/passive and active/active multipathing within VMware’s ESXi, as well as more sophisticated active/active support with vendor-specific multipathing plugins. The standard is usually an active/active configuration, where multiple IO streams are simultaneously used across multiple initiator and target interfaces, and across multiple network paths. This allows network paths to fail with no visible impact, but also aggregates these multiple paths for higher total bandwidth, and can permit load balancing of traffic and queue depths across controllers. The individual connections in a storage system can scale to thousands, and individual clients may each have dozens, depending on the architecture of the storage system.

Figure 4: Typical iSCSI active-active multipathing.
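The behavior described above can be sketched in a few lines. The following is an illustrative simulation of round-robin path selection with failover – not ESXi’s actual path selection plugin – and the path names merely mimic vmhba conventions:

```python
# Toy round-robin multipath model: IOs rotate across live paths, and a path
# failure redirects traffic without interrupting the IO stream.
from itertools import count

class MultipathLun:
    def __init__(self, paths):
        self.paths = {p: True for p in paths}  # path -> alive?
        self._turn = count()

    def fail(self, path):
        self.paths[path] = False               # link or controller port lost

    def next_path(self):
        live = [p for p, ok in self.paths.items() if ok]
        if not live:
            raise IOError("all paths down")
        return live[next(self._turn) % len(live)]

lun = MultipathLun(["vmhba2:C0:T0:L1", "vmhba3:C0:T1:L1"])
for i in range(3):
    print("IO", i, "->", lun.next_path())
lun.fail("vmhba2:C0:T0:L1")
print("after failure ->", lun.next_path())     # IO continues on surviving path
```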

iSCSI storage system architecture has become notably innovative over the past decade, likely in part due to the nature and flexibility of the iSCSI protocol. In reality, IP addressing has abstracted away much of the physically restrictive connectivity of legacy storage and allowed vendors to innovate around how clients connect to storage systems and access underlying storage volumes. Then without the complexity of file system semantics or huge namespaces, iSCSI storage architectures have been more readily able to tackle the challenges of scale. Today, a number of iSCSI systems exist that can scale-out and serve up a shared pool of storage resources across many storage controllers with utter simplicity. Such systems usually distribute volumes and stored data across multiple controllers, and load balance capacity and performance as new controllers and disk capacity are brought into the storage system. As the storage controllers scale, aggregate storage performance can grow, and oftentimes capacity can be scaled behind the controllers independently, thereby allowing businesses to adjust their storage capabilities according to changing business demands, and without undue complexity or cost. Moreover, the same system architecture that has enabled iSCSI storage scale-out has also resulted in the packaging of several products as Virtual Storage Appliances that run as virtual machines. These virtual machines can help segregate storage resources for better control, or provide local storage in places where a physical system might be inconvenient (at a cloud provider for instance). Using replication, most such appliances can move data back to a physical storage system easily. For the growing virtual infrastructure, a unique level of versatility and scalability can make the storage infrastructure just as agile as the virtual infrastructure.

With a substantial investment in software innovation to make iSCSI storage systems scale while maintaining effective consolidated management, vendors have found unique opportunities to also peer inside virtual machine storage interactions. With that insight, on top of a single shared LUN, iSCSI systems often deliver capabilities that rival those of file-based storage approaches, including per-VM snapshots and cloning inside larger shared datastores. Moreover, in some cases this even surpasses the capabilities of file-based storage: with per-VM insight and without a file system to manage, per-VM auto-tiering onto SSD for both read and write acceleration is straightforward, and allows vendors to increase the range of tools they have for optimizing the performance of contending workloads. Meanwhile, peering into storage interactions has also allowed vendors to provide extensive visibility and reporting.

Although iSCSI can use standard Ethernet networks, in reality most best-practice deployments also include dedicated and isolated network devices (switches) to support the iSCSI SAN. Creating localized Ethernet fabrics can further take iSCSI to best-in-class capabilities in terms of robustness, security, and performance, while still sticking with a familiar and well-understood medium. Moreover, it also enables customers to employ specialized technologies, like jumbo frames, that can reduce packet and processing overhead, while ensuring that these capabilities are present on the entire path – from client to storage system. Then, if necessary in order to serve some clients, this network can be bridged to a general network while using best practices for authenticating and securing iSCSI (using built-in CHAP authentication methods). In reality, the complexity introduced by these additional switches is minimal, and similar isolation can be provided in existing network infrastructures through VLANs. Simultaneously, a number of scale-out iSCSI storage solutions on the market today have taken on a deployment model where they are placed in the same racks as compute systems. Since even this dedicated Ethernet SAN can easily stretch across racks with standard Ethernet cabling, this allows iSCSI storage to be scaled on demand alongside new server deployments, while still maintaining a single storage system – the storage system is simply distributed across many racks in the data center and interwoven with servers. This “converged” infrastructure model allows businesses to scale on demand with minimal planning or manipulation of existing equipment, and has been a boon to many growing and continuously scaling customers.
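The jumbo frame benefit mentioned above is simple arithmetic: at the same throughput, a 9000-byte MTU cuts the number of frames – and therefore per-frame processing events – roughly six-fold versus 1500 bytes. A quick sketch, ignoring header overhead for simplicity:

```python
# Back-of-envelope frame-rate comparison at a given line rate.
throughput_gbps = 10
bytes_per_sec = throughput_gbps * 1e9 / 8

for mtu in (1500, 9000):
    frames = bytes_per_sec / mtu
    print(f"MTU {mtu}: ~{frames:,.0f} frames/sec at {throughput_gbps} Gb/s")
```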

In the virtual infrastructure, iSCSI volumes can be provisioned just about anywhere, and can support the largest ESXi clusters requiring shared storage. This can increase versatility, but to the uninitiated, it can also create a proliferation of storage volumes that can be perceived as more complex to manage. In reality, management tool sophistication and attention to standardizing how and where volumes are provisioned can easily put an end to complexity, but they cannot compensate where IT shops fail to standardize their storage practices and perform sufficient planning.

Finally, the iSCSI storage protocol can create a burden on the host operating system compared to the more traditional Fibre Channel protocol, where framing is simpler and more processing is handled in the Fibre Channel HBA. Because iSCSI rides on top of TCP/IP, there are multiple layers of processing involved. First, TCP sessions must be established and processed; then packets themselves must be opened, checksummed, and reassembled. Early in the 2000s, when processor cycles were more precious and TCP processing was relatively inefficient, performant iSCSI almost always required specialized Network Interface Cards (NICs) to offload the processing of TCP from the host processor. Over the years, broad support for varying forms of offload as well as a number of innovations in host operating system TCP handling have made this processing substantially more efficient. Using typical enterprise NICs, most host operating systems including ESXi are able to process full line-rate iSCSI at even 10 gigabit speeds with negligible impact on the host processors. ESXi does this today by leveraging TCP Offload Engine (TOE) capable NICs for partial offload functionality – including TCP Segmentation Offload (TSO) and Large Receive Offload (LRO). This partial offload significantly reduces the processing cycles required for iSCSI, and supporting hardware has a wide install base in enterprise servers today. Moreover, even with no offload, today’s abundant processing cycles mean many data center clients and even virtualization hosts have sufficient idle processor cycles to support iSCSI at high IO rates. The exception is 10 gigabit Ethernet, which is still assumed to require some degree of offload for most use cases.
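A rough model of why segmentation offload matters: the host hands the NIC one large segment and the NIC emits the wire-sized packets, so per-packet trips through the host TCP stack drop by the aggregation factor. The 64 KB figure matches common TSO limits; the rest is assumption:

```python
# Illustrative TSO aggregation arithmetic; sizes are assumptions.
mtu = 1500
tso_segment = 64 * 1024  # bytes handed to the NIC per send (typical TSO max)

wire_packets = tso_segment // mtu  # packets the NIC emits per host-side send
print(f"One TSO send replaces ~{wire_packets} per-packet trips "
      f"through the host stack")
```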

iSCSI – Strengths and Deficiencies

Efficiency
Strengths:
• Standardized SCSI interactions have ensured rapid VAAI and VASA support on nearly all iSCSI storage
• Extensive storage features that can conduct operations on both datastores and individual VMs help increase management and storage/compute efficiency
Deficiencies:
• Density and efficiency can often be exceeded, especially at scale, by FC systems that are built for larger scale and with higher bandwidth FC fabrics

Presentation and Consumption
Strengths:
• Ubiquitous Ethernet, OS iSCSI support, and converged storage form factors make deployment and provisioning easy
• Streamlined provisioning and homogeneous storage virtualization create on-going simplicity and ease of use
• The software-based nature of many iSCSI storage systems has led to virtual appliance form factors that can extend reach and versatility
Deficiencies:
• Target and initiator naming conventions can create perceptions of complexity for the unfamiliar administrator

Storage Control and Visibility
Strengths:
• Easily visible iSCSI traffic has been leveraged by storage system vendors to enhance reporting and visibility to extremes
• Scale-out inclinations have led vendors to develop management that can handle many systems, allowing good segregation of demands with little overhead
• iSCSI can simultaneously be used for VMFS datastores and for storage volumes directly attached to virtual machine guests, enhancing versatility

Scalable and Autonomic Adaptation
Strengths:
• Common iSCSI scale-out architectures help ensure easy adaptation to changing demands
• At scale, iSCSI deployment and provisioning retains ease-of-use advantages by way of scale-out architectures that present themselves through single portals
• vVols will likely enhance almost all iSCSI functionality very quickly (given history); this will further improve efficiency, presentation, and how well iSCSI scales
Deficiencies:
• Eventual limits to scale-out storage system size require the creation of additional “storage systems”. Although managed under one umbrella, they cannot aggregate performance and capacity. Today, scalability in a single system can reach hundreds of TBs and exceed 100,000 IOs per second

Resiliency
Strengths:
• Multipathing can be active/active and can tolerate link loss without disruption to uptime or performance
• Typical RAID mechanisms and snapshots protect storage, but increasingly, iSCSI systems also come with RAID or replication protection that spans controllers
Deficiencies:
• Use of Ethernet can create cross-domain dependencies when it comes to management, requiring interaction of both network and storage administrators and increasing risk
• Multipathing may require vendor plugins for full resiliency

Table 6: iSCSI Strengths and Weaknesses

Fibre Channel

Fibre Channel, originally developed in the late 1980s, was driven to broad market adoption as the de facto enterprise storage fabric by the wave of distributed computing that had changed enterprise compute by the mid-1990s. In the midst of compute’s shift to distributed architectures and multi-tier applications, the new systems hosting mission-critical applications needed more access to large amounts of data storage and performance than ever before. As distributed computing became more dominant and the number of systems multiplied, Fibre Channel (FC) technology advanced with it, leading to today’s FC SANs that commonly use a fully switched, fiber-optic fabric running the Fibre Channel Protocol (FCP) at high speeds, extremely low latency, and with extreme resiliency. Such FC networks can today be extended over distances, routed over IP, and even bridged to Ethernet networks where clients might run the FC over Ethernet protocol (FCoE).

Fibre Channel Protocol transports SCSI commands using a multi-layer model similar to that of the TCP/IP protocol stack on Ethernet – FCP handles everything from physical media management to packetizing data for transfer across a fabric. But within those packets are standard SCSI commands. FCP has historically made management communication challenging, and management initiatives like SMI-S have demonstrated slow adoption and success, with products often seeming like afterthought add-ons. Management products have too often turned to the IP network to work around a lack of communication on the FC fabric. But lately this has changed, with a seemingly new generation of capability being born as VMware has turned to standard T10 SCSI commands over the FC fabric to enable many of the latest virtualization-enhancing capabilities.

Among the most modern architectures typified by Fibre Channel storage, years of specialized architecture and innovation have led to large-scale, highly clustered systems that can leverage multiple controllers, large numbers of storage system ports, and redundancy across all systems as well as the FC fabric, making it practical to attach enormous numbers of drives to a single system. That single system usually contains tremendous amounts of controller horsepower and cache, and in some cases has a clustering architecture that allows the addition of more controllers should performance demands change. The most recent generations of product have shifted much of this capability to industry-standard hardware and moved storage functionality into rapidly evolvable software, sometimes aided by specialized hardware such as ASICs that may deliver certain features and functionality with high performance.

Historically, FC fabrics and SANs have been viewed as complex, difficult to understand, and difficult to manage. With the introduction of switching, SAN fabrics eventually became an exercise in designing fan-in/fan-out ratios, zoning, name services, FSPF, port configurations, fiber types, device-to-device interoperability, and more. Meanwhile, complex FC storage arrays often required extensive management of disk devices and logical volume constructs, and a similar degree of configuration complexity to provision these disks to clients.
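As a flavor of that design work, the hedged sketch below generates single-initiator/single-target zones (a widely cited FC best practice) from WWPN lists and computes a simple fan-in ratio. The WWPNs are made up and the naming scheme is our own:

```python
# Illustrative zoning generator; WWPNs and zone names are fabricated examples.

host_wwpns = ["10:00:00:00:c9:aa:00:01", "10:00:00:00:c9:aa:00:02"]
array_wwpns = ["50:06:01:60:3b:a0:11:11", "50:06:01:61:3b:a0:11:11"]

# one zone per (host port, array port) pair: single initiator, single target
zones = {f"z_h{h}_a{a}": [hw, aw]
         for h, hw in enumerate(host_wwpns)
         for a, aw in enumerate(array_wwpns)}

fan_in = len(host_wwpns) / len(array_wwpns)  # host ports per storage port
for name, members in zones.items():
    print(name, "->", members)
print(f"fan-in ratio: {fan_in:.0f}:1")
```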

This is no longer the case with latest generation products. While the FC SAN fabric itself can be complex at scale, latest generation FC storage systems have in fact become easy to use, and with moderate-sized FC fabrics, FC storage systems can compare favorably with iSCSI. Such systems have simplified provisioning through fabric auto-discovery and sophisticated wizards, have automated and simplified the construction of disk groups with internal virtualization, and today harness sophisticated levels of functionality that can make administrative tasks like optimization, protection, or replication easy. Meanwhile, beneficial parts of Fibre Channel’s legacy have stuck around. Today, FC remains a high bandwidth fabric (having advanced to 16Gbps), and standard implementations represent the best available performance and resiliency. Then for the more advanced administrators, management tools can help implement a level of control and insight that simply isn’t available with other fabrics.

Moreover, truly sophisticated visibility tools exist from third parties. These tools can deliver deeper visibility and analytics than with any other storage fabric – especially when best practices are adhered to and physical fiber TAPs are installed at key points in the infrastructure.

Nonetheless, it is still the case that FC infrastructures require skill specialization and costs beyond those of the other storage protocols. Proprietary configuration and management practices specific to Fibre Channel require specialized training and management that can increase the OPEX and complexity behind a data center. FC systems are more expensive on their own, and when the cost of Fibre Channel switching infrastructure and server adapters (HBAs) is considered, the CAPEX costs of Fibre Channel can be many times greater than the cost of NFS or iSCSI systems.

Figure 5: Multipathing in an FC SAN
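Figure 5's multipathing behavior is simple to sketch. The toy path selector below is a hypothetical illustration in Python (real path selection lives in the hypervisor's multipathing layer), showing the round-robin-with-failover pattern an FC host typically applies across redundant fabric paths.

# Toy round-robin path selector, loosely modeled on the behavior of a
# round-robin multipathing policy across redundant FC paths. Hypothetical
# illustration only.
from itertools import cycle

class RoundRobinSelector:
    def __init__(self, paths):
        self.paths = {p: True for p in paths}   # path -> alive?
        self._ring = cycle(paths)

    def mark_down(self, path):
        self.paths[path] = False                # e.g., a fabric link failure

    def next_path(self):
        # Skip dead paths; surviving paths absorb the IO transparently.
        for _ in range(len(self.paths)):
            p = next(self._ring)
            if self.paths[p]:
                return p
        raise RuntimeError("all paths down")

sel = RoundRobinSelector(["hba0:fabricA", "hba1:fabricB"])
sel.mark_down("hba0:fabricA")
print(sel.next_path())   # -> hba1:fabricB; IO continues on the survivor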

But for customers with FC investments, there is little doubt today that those investments are worth it. As the leading choice for enterprise storage, FC systems are typically the first to support new vSphere Storage API functionality, and today support every multipath, VAAI, and VASA feature. Coupled with high-powered controllers and a laundry list of storage array features – thin provisioning, thin snapshots, dynamic auto-tiering, wide-striped volumes, per-volume QoS, multi-tenancy, and array virtualization – VAAI integrations with the FC array can significantly increase the performance and capabilities of a virtual infrastructure.

These integrations have allowed FC vendors to broadly advance their capabilities in the past few years. Only a few years ago, FC vendors struggled to apply their storage features to the virtual infrastructure in a meaningful way: designed to work on a LUN basis, storage features weren't granular enough to manage VMs when hundreds were stored on a single datastore. Today, APIs and innovative plug-ins let these vendors apply their snapshot and cloning capabilities to individual VMs, allowing the virtual infrastructure to protect and scale workloads with blistering speed. The best FC arrays can execute thousands of snapshots, and through intelligent volume design and efficient controller caching, those snapshots may have no performance impact. This makes VM-level features particularly powerful with FC arrays – on other systems, using such features often makes performance fall apart under load and at high utilization levels. In the near future, such VM-level storage management will only improve. With vVols soon to arrive, every VM will be handled as an individual storage object, and array operations (snaps, clones, etc.) will be applied directly to a VM. This will further enhance efficiency and further differentiate block storage from the shared file system approach of NFS.

When it comes to the hotly contested area of capacity efficiency, vendors in this market not only offer thin provisioning but may also offer innovative thin acceleration technology that automatically detects empty space and/or accelerates the reclamation of space when it is freed by operating systems. Known as VAAI-based thin reclamation, this can provide significant efficiency advantages. On other storage systems, volume interactions may cause bloat over time, making thin provisioning relatively inefficient.
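A toy model makes the bloat problem easy to see. The sketch below is hypothetical (no vendor's allocator works exactly this way): blocks freed by a guest filesystem stay allocated on the array until an UNMAP-style reclamation hint arrives.

# Toy thin-provisioned volume: without an UNMAP-style hint, blocks freed
# by the guest OS stay allocated on the array, and the volume "bloats".
class ThinVolume:
    def __init__(self):
        self.allocated = set()          # blocks backed by real capacity

    def write(self, block):
        self.allocated.add(block)       # first write allocates

    def guest_delete(self, block):
        pass                            # filesystem delete: array sees nothing

    def unmap(self, block):
        self.allocated.discard(block)   # VAAI-style reclamation hint

vol = ThinVolume()
for b in range(1000):
    vol.write(b)
for b in range(500):
    vol.guest_delete(b)                 # guest freed half the blocks...
print(len(vol.allocated))               # -> 1000 (the array still holds all of them)
for b in range(500):
    vol.unmap(b)                        # reclamation returns space to the pool
print(len(vol.allocated))               # -> 500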

Since FC systems are often built to operate at tremendous scale, vendors have increasingly turned their attention to making large systems and multi-system infrastructures work better. Multi-tenancy models can “fence off” resources such as controller processing cycles or cache in order to guarantee or restrict performance for certain uses or customers. Tools like array virtualization can enhance how resources are fenced while increasing portability for the sake of redistribution, rebalancing, or migration. Resources can be kept guaranteed and predictable, and nearly limitless resources can be served up by adding more systems. Taking this capability to the extreme, some vendors leverage STUN in the SCSI protocol and VAAI stack to facilitate the non-disruptive movement of a workload across storage systems, even across locations. These technologies are key integrations that enable solutions like VMware’s vCloud today, and help keep those solutions optimally aligned with the way a provider – private or public – is trying to deliver infrastructure services.
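As a rough illustration of such fencing, the sketch below (hypothetical, not modeled on any particular array) gives each tenant its own token bucket, so one tenant's burst can exhaust only its own IOPS allowance while other tenants' guarantees hold.

# Hypothetical per-tenant IOPS fence: each tenant draws from its own
# token bucket, so a noisy tenant exhausts only its own allowance.
import time

class TokenBucket:
    def __init__(self, iops_limit):
        self.rate = iops_limit          # tokens (IOs) replenished per second
        self.tokens = iops_limit
        self.last = time.monotonic()

    def admit(self):
        now = time.monotonic()
        self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True                 # IO admitted within this tenant's fence
        return False                    # throttled; other tenants unaffected

fences = {"tenant-a": TokenBucket(5000), "tenant-b": TokenBucket(500)}
# tenant-b bursting cannot starve tenant-a: each draws from its own bucket
print(fences["tenant-b"].admit())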

Many of these vendors have also turned their architectures toward availability innovation, using in-the-array protection and some of these same workload portability approaches to simultaneously protect workloads across sites. The best clustered architectures use multi-node protection to preserve performance when a single controller is lost, protecting write caches in a fully clustered fashion. In other storage systems, the loss of a controller typically means degraded performance, not only from the loss of the controller itself but because the remaining controller stops caching writes in order to protect data. Meanwhile, multi-way array-to-array replication, workload movement, and transparent failover provide sophisticated distributed protection that is harder to come by with iSCSI and NFS. Moreover, these approaches are typically well integrated with thin technologies to optimize data replication and movement, as well as with VMware's APIs for SRM.

Fibre Channel – Strengths and Deficiencies

Efficiency
Strengths:
• Superior raw performance and low latency with minimal overhead
• Designed to operate at full speed with negligible impact from any storage feature or operation
• Often equipped with state-of-the-art thin technologies also designed to keep volumes thin over time
• Thorough vSphere Storage API support
• Innovative vendor plugins can enhance management and some operations
Deficiencies:
• Lacking deduplication – but typically not a desired item, because the FC storage system is focused first on performance, and dedupe is often perceived to compromise performance

Presentation and Consumption
Strengths:
• Designed for scale, multi-tenancy, and managing infrastructures of many systems joined together by replication and federation technologies
Deficiencies:
• More complex storage fabrics and provisioning processes, but becoming increasingly simple and scalable over time – likely to be further improved with vVols

Storage Control and Visibility
Strengths:
• Cutting-edge storage isolation features that provide a superior degree of control and guarantee
• Sophisticated QoS approaches that can throttle workloads and help guarantee performance for critical apps
• Optimization technologies that can adjust performance to the demands of changing workloads within administrator-set policies
Deficiencies:
• Ideal levels of visibility often require third-party tools, especially for verification of physical-layer interactions on a fairly unique Fibre Channel fabric

Scalable and Autonomic Adaptation
Strengths:
• High degrees of scalability paired with an ability to federate and manage an infrastructure of many systems
• Sophisticated arrays can constantly re-optimize workloads to ideally balance all demands against existing capabilities
Deficiencies:
• With scale, FC storage systems bring increasing degrees of sophistication requiring specialized skillsets

Resiliency
Strengths:
• Best-in-class availability architectures
• Clustering approaches can guarantee not just availability protection but also the protection of performance levels
• Multi-site protection capabilities that are tightly integrated with VMware

Table 7: Fibre Channel Strengths and Weaknesses


PROTOCOL DIVERSITY – PRIORITIZING CHOICES FOR MAINSTREAM ENVIRONMENTS

Previously in this brief, we identified five dimensions for our evaluation and emphasized applying these criteria to storage protocol, typical storage system architecture, and VMware API support when considering storage choices for the virtual infrastructure. Moreover, we emphasized that this evaluation is not a matter of performance or interoperability – the protocols themselves earn comparable marks there – but rather a broader consideration of storage architecture suitability over time in the face of a growing and constantly changing virtual infrastructure; especially an infrastructure in which change happens not just in the workloads, but periodically in the hypervisor itself. In review, we've labeled these five dimensions as:

• Efficiency – in capacity and performance
• Presentation and Consumption
• Storage Control and Visibility
• Scalable and autonomic adaptation
• Resiliency

With an eye toward evaluating the relative effectiveness of each of these storage technologies within virtual infrastructures that are increasingly scaling and demanding "utility-like" storage resources, we've briefly reviewed the individual technologies, and now turn to assessing how they compare against the characteristics we've identified for utility-like storage. To summarize our findings, we've used "Harvey Ball" indicators, where a more complete (solid) circle indicates a more complete solution; in the tables that follow these appear as numeric scores from 1 to 4, with 4 the most complete.

Selecting Storage for the Utility-class Virtual Infrastructure

Fibre Channel – 4: Class-leading for customers at scale who need efficiency, adaptability, control, and resiliency.

iSCSI – 3: Class-leading for mid-range customers or those pursuing truly converged infrastructures and the consequent ideal acquisition costs, with easy deployment, unmatched adaptability, and comprehensive features.

NAS/NFS – 2: Long heralded as a sensible choice for the virtual infrastructure, but trends toward scale, mission-critical workloads, and the need to deeply control and manage storage resources leave NAS/NFS at a disadvantage.

Table 8: Assessing storage protocol and storage system choices for the virtual infrastructure.

VVOLS SIMPLIFY SANS

At VMworld 2012 in Las Vegas, VMware announced a new storage technology called vVols (virtual volumes). vVols will put an end to LUN provisioning on the storage system by creating a single storage system endpoint (the Protocol Endpoint), allowing the storage system to allocate a single aggregated pool of capacity within a service class, and then enabling the virtual infrastructure to automate the allocation of storage resources to every VM. This in effect turns the SAN into one large capacity pool across which many VMs can be stored without further management complexity, much akin to a large NFS storage pool today. But vVols also give the storage system deep visibility into the individual VMDKs stored within a capacity pool, and the ability to conduct storage operations on those VMDKs as individual logical objects. This means per-VM granularity for all storage operations – operations like snapshots, clones, or data tiering will no longer happen against a datastore at the LUN level, but on each individual VM.
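The model is easy to sketch in a few lines. The following is our conceptual illustration, not the vVols API: a single Protocol Endpoint fronts a pooled capacity container, each VMDK is an individually addressable object, and a snapshot touches only the targeted VM's objects.

# Conceptual sketch of the vVols model (not the actual API): one protocol
# endpoint fronts a pooled capacity container, and each VMDK is an
# addressable object, so array operations apply per VM instead of per LUN.
class ProtocolEndpoint:
    def __init__(self, pool_gb):
        self.pool_gb = pool_gb
        self.objects = {}               # vm_name -> list of virtual volumes

    def create_vvol(self, vm, size_gb):
        self.objects.setdefault(vm, []).append({"size": size_gb, "snaps": []})
        self.pool_gb -= size_gb         # allocation comes from one shared pool

    def snapshot_vm(self, vm):
        # Array-side snapshot of just this VM's objects -- no LUN-wide snap.
        for vvol in self.objects[vm]:
            vvol["snaps"].append("snap@" + vm)

pe = ProtocolEndpoint(pool_gb=10_000)
pe.create_vvol("web01", 40)
pe.create_vvol("db01", 200)
pe.snapshot_vm("db01")                  # web01 is untouched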


We’ll briefly turn to the detail behind our conclusions.

EFFICIENCY

Efficiency comes in two parts – capacity and performance. Looking across the representative vendors offering each type of storage, we see greater differences in performance than in capacity. Capacity was historically the key measure for storage, at least until the past few years, when an onslaught of storage-intensive applications put performance under new pressure. But among the solutions representative of the storage approaches here, capacity remains the most closely contended dimension.

Fibre Channel - 4

Leading Fibre Channel storage solutions have a robust set of thin technologies and potentially broader support for re-thinning storage, or thin reclamation (in some cases acting as a zero-detect technology that can eliminate empty space before it is ever written to disk). Broad API interoperability helps ensure efficiency in interactions with the virtual infrastructure. The leading FC storage vendors also demonstrate a serious performance advantage – primarily by avoiding disruption to IO at all costs, including when thin provisioning, snapshots, or any other storage-side technology is in use. This balance of capacity and performance yields overall efficiency superiority over other solutions at large scale.

NAS/NFS - 3

Among the solutions, NAS/NFS storage merits leadership recognition for capacity efficiency. Today, leading vendors offer not only extensive thin capabilities, but can also intermittently deduplicate data to return additional efficiency. Where NAS/NFS falls down, and loses the leadership position, is performance efficiency. Across the broad landscape, NAS solutions often carry too much file system overhead to maintain robust performance under all circumstances, and the use of tools like snapshots or deduplication can exact a big performance penalty on highly utilized systems.

iSCSI – 3

Typical mainstream iSCSI solutions today demonstrate similarly pervasive use of thin technologies, although with more variation in how well and how easily volumes can be re-thinned. In a straight race, the best Fibre Channel and NAS/NFS solutions likely both hold long-term capacity superiority. But it shouldn't be overlooked that iSCSI acquisition costs are lower than those of the other storage systems, and a lower $/GB may yield better capacity efficiency in terms of cost of ownership. This is especially true with leading and emerging vendors who tightly integrate solid state into the IO path to make performance go farther, while simultaneously tiering data down to the most cost-effective high-capacity drives, and sometimes even deduplicating that data. Meanwhile, we also consider iSCSI performance efficiency at an advantage over NAS/NFS, as the scale-out architecture of iSCSI systems promises optimal performance at an optimal cost, irrespective of how business needs change. At the end of the day, such adaptability is outpaced only by the higher-performance architectures of Fibre Channel.
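The tiering behavior described here can be sketched as a simple heat-based promotion policy. The code below is a hypothetical toy, not any vendor's algorithm: the hottest blocks in each measurement epoch are promoted to the solid state tier, and everything else stays on high-capacity disk.

# Toy heat-based tiering: hot blocks are promoted to SSD, cold blocks
# remain on high-capacity disk. Hypothetical policy, not vendor code.
from collections import Counter

class TieringEngine:
    def __init__(self, ssd_blocks):
        self.heat = Counter()           # block -> access count this epoch
        self.ssd_capacity = ssd_blocks

    def record_io(self, block):
        self.heat[block] += 1

    def rebalance(self):
        # Hottest blocks fill the SSD tier; everything else stays on disk.
        hot = [b for b, _ in self.heat.most_common(self.ssd_capacity)]
        self.heat.clear()               # start a fresh measurement epoch
        return set(hot)

eng = TieringEngine(ssd_blocks=2)
for b in [1, 1, 1, 2, 2, 3]:
    eng.record_io(b)
print(eng.rebalance())                  # -> {1, 2} promoted to SSD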

PRESENTATION

Presentation is our term for how easily and quickly the right storage resources can be connected with the right applications and users. Minor differences on the surface ultimately add up to big cost-of-ownership differences over time, and significantly impact the agility of virtualized IT.

NAS/NFS - 4


NAS clearly holds the leadership position in presentation. No technology can hope to make things easier than storing everything on a single mount point, forever. But that very approach is what keeps NAS/NFS from delivering what the virtual infrastructure needs in other dimensions. The single mount point model cannot easily provide multi-tenant services and isolation, and simply falls apart when storage scales to the point of managing many systems. Sophisticated proprietary layers can take some NAS systems far, but they are beyond the norm and add complexity. Meanwhile, limited API support for technologies like vCloud, as well as a lack of Virtual Storage Appliance capabilities, can be additionally restrictive. Nonetheless, high marks remain for ease of presentation with a single system.

iSCSI – 4

iSCSI is a hotbed of storage innovation, and the leading vendors have radically simplified the storage volume provisioning process, effectively reducing it to a copy/paste, point-and-click process that is little different from connecting to an NFS mount point. Meanwhile, comprehensive API support and the ability to provision the right resources in the right way – including iSCSI volumes provisioned to individual VMs if desired – can give the infrastructure considerable flexibility to address business needs. In the near future, we expect vVols will even further reinvent the simplicity of iSCSI provisioning. While that day isn't quite here yet, expect iSCSI to eventually pull ahead of NAS/NFS.
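To give a sense of how lightweight iSCSI attachment has become, the sketch below wraps the standard open-iscsi initiator CLI on a Linux host (ESXi exposes the same discovery-and-login flow through its own interfaces). The portal address is a placeholder, and the script assumes open-iscsi is installed and run with sufficient privileges.

# Sketch of the two-step iSCSI attach on a Linux initiator using the
# standard open-iscsi CLI. The portal address below is a placeholder.
import subprocess

PORTAL = "192.0.2.10:3260"              # hypothetical array portal

def discover_targets(portal):
    # SendTargets discovery: the array advertises its target IQNs to the host.
    out = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        capture_output=True, text=True, check=True)
    return [line.split()[-1] for line in out.stdout.splitlines()]

def login(target_iqn, portal):
    # Log in to a discovered target; the LUN then appears as a block device.
    subprocess.run(
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        check=True)

for iqn in discover_targets(PORTAL):
    login(iqn, PORTAL)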

Fibre Channel - 3

Fibre Channel remains behind in ease of presentation, but it is no longer the complex endeavor many legacy customers would expect. Better discovery and automation have simplified the provisioning process in the array and are helping customers simplify fabrics. The most comprehensive API support, alongside tools like volume virtualization, thin provisioning, sophisticated policy engines, and dynamic auto-tiering, has largely put an end to hands-on management of the array, allowing administrators to work with highly automated systems that require only an occasional review of capacity and ongoing utilization. Meanwhile, FC offers potentially the most versatile presentation in terms of multi-tenancy and the management of an infrastructure of many federated systems. Once again, when vVols arrive, expect even the Fibre Channel array to leap ahead, quite possibly allowing all storage provisioning to become seamlessly integrated with the process of deploying or moving a VM.

CONTROL AND VISIBILITY

Shared caching among the large, churning data sets behind big virtual infrastructures will inevitably break down, and workloads will step on each other's toes. Without a tremendous number of idle resources sitting around to soak up bursty workloads, storage will fall apart. Control and visibility are about how the respective storage platforms answer this challenge, and may mean more to how well the infrastructure holds up than any other single dimension.

Fibre Channel - 4

Fibre Channel simply leads the market in addressing these challenges. As the flag bearer for high-performance enterprise storage, vendors have invested in caching and throughput technologies for FC storage for years, and the FC fabric has been engineered to transport the most precious data inside the data center walls. Between the fabric and the FC array there is a uniquely rich set of tools for segregating and managing workloads that all contend for the same storage resources, across one array or many. Meanwhile, the same vendors, along with some third-party pioneers, can lend serious insight into the storage fabric. Finally, comprehensive API support – and FC's position as the first storage to be integrated with evolving API features – gives FC a consistent leg up on storage control.

iSCSI - 3


iSCSI is a close contender with FC solutions, but not because of the same capabilities. With a few notable exceptions from early stage pioneers, most iSCSI storage offers little in the way of ability to seriously shape and control IO. Instead, the market leading iSCSI storage solutions make it easy to provision more storage in response to growing demands, and to carve up resources into isolated pools without sacrificing centralized manageability. The two capabilities together make it easy to build pools of storage for different classes of applications or users, while still scaling each pool, and this gives the business organization serious control and flexibility.

NAS/NFS – 2

In contrast, we see little that looks like control in most of the NAS/NFS solutions on the market. As customers settle into NAS/NFS with an expectation of ease of use, single large mount points do little to segregate performance demands, and the storage system is hard pressed to respond to contention effectively. Moreover, NAS remains consistently late to receive "full" API enablement, especially for the all-important VAAI and VASA. Meanwhile, limited scalability offers the customer little recourse when workloads run into a wall. Some vendors do offer substantial capabilities to control resources at a volume level, either through resource throttling or through virtualized NAS instances, but this requires multiple volumes and can break the presentation advantages of NFS.

SCALING

In the face of growth, longevity will be determined by scalability. Poorly scaling architectures weigh heavily in the near term on business agility, and over the long term on an enterprise's manageability and cost of ownership.

Fibre Channel - 4

The best Fibre Channel solutions use a variety of mechanisms to make sure performance scales to extremes and that every ounce of capability installed behind a controller is harnessed. The latest-generation FC fabrics are the most suited to carrying this performance at extreme scale. Meanwhile, a number of notable leaders among FC solutions take scalability to even greater extremes with multi-way clustering architectures that permit customers to increase performance and capacity separately as demands change. Finally, vendors are also injecting the latest FC systems with technologies that can federate multiple storage arrays with dynamic workload movement between them, allowing enterprises to scale easily into multiple arrays without isolating storage pools or increasing management overhead.

iSCSI - 3

Many iSCSI architectures actually permit more granular scaling than FC systems, allowing customers to gradually expand their storage infrastructure. While that storage infrastructure may not scale within a single system to performance or capacity numbers that can contend with FC systems or the FC fabric, granular scaling paired with iSCSI’s extreme cost effectiveness can make virtual infrastructure scaling truly agile. The right architectures deliver this functionality in solutions that transparently rebalance data, and can keep multiple controllers operating as seamless parts of a single storage system.

NAS/NFS – 1

With predominantly dual-controller architectures, and a reliance on NFS that can introduce pathing constraints by not aggregating multiple paths, NAS/NFS solutions fall short in scalability and offer few choices but to turn to multiple separate storage systems as an infrastructure grows. For this reason, we typically see customers running NAS/NFS solutions at low levels of utilization and building their expertise around managing many storage systems at once. This is not, in our opinion, a recipe for effective scalability.


RESILIENCY

Finally, with more workloads than ever before comes a need for greater resiliency than ever before – especially as solutions scale into multiple components with more failure points, and as virtual infrastructure interactions become increasingly complex.

Fibre Channel - 4

With little surprise, FC has a long list of resiliency advantages, from a dedicated multipathing fabric to deeply architected shared-backplane storage controllers and high levels of component redundancy. Meanwhile, deep integration with virtual infrastructure APIs makes widespread use of tools like snapshots and clones fully usable without concern about disruption to production IO. This can elevate the resiliency of workloads and applications well beyond just the storage infrastructure.

iSCSI - 3

iSCSI fabrics with active multipathing, and scale-out architectures with controller failure protection, can create highly resilient iSCSI infrastructures. Typical solutions have high degrees of availability protection in single systems, and can flexibly leverage the resources of multiple controllers to meet business requirements for resiliency, even across multiple physical locations or sites. Meanwhile, scaling out and deep API integration can bolster performance, making performance itself a resilient capability. Finally, converged form factors have value as well: close proximity to the workload and distribution throughout the data center reduce fabric complexity and individual points of catastrophic failure.

NAS/NFS – 1

NAS/NFS storage fabrics rely on single-pathed connections for datastore access, an inherent compromise in both availability and consistent performance. We've observed that these limitations encourage customers to underutilize storage in order to maintain performance and headroom for growth. Moreover, the scalability approach amounts to sprawl, which can further compromise resiliency. While connections can be protected at the network level with technologies like Spanning Tree Protocol, and at the hypervisor level with multiple datastores, such measures are not readily apparent amid NAS/NFS's ease of use, and leave NAS/NFS fabrics lacking in resiliency. In larger clusters, some of these consequences can be mitigated through virtual distributed switches, multiple uplinks, load balancing, and Network IO Control and/or Storage IO Control. Nevertheless, single connections remain subject to compromise, and the tendency to build multi-hop connections to a large shared mount point across the data center lowers our marks for resiliency.

                         Fibre Channel   iSCSI   NAS/NFS
Capacity Efficiency            4           3        4
Performance Efficiency         4           3        2
Presentation                   3           4        4
Control and Visibility         4           3        2
Scalability                    4           3        1
Resiliency                     4           3        1

Table 9: Summary of our evaluation of storage protocol choices.

TANEJA GROUP OPINION

The reality is that the individual characteristics do not simply add up to a total score, and individual businesses must still carefully consider their own needs and how the storage choices align with them. Business needs and requirements vary greatly: what matters most to one customer – presentation, for example – may rank much lower for the next enterprise. Nonetheless, an evaluation of the characteristics of each storage technology – FC, iSCSI, and NAS/NFS – demonstrates advantages in the block-based solutions (FC and iSCSI). This is especially true when the virtual infrastructure is considered both mission critical and growing.
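One practical way to read Table 9 is as input to a weighted score. The sketch below uses the scores from the table, with deliberately arbitrary example weights that each reader should replace with their own priorities; it shows how shifting priorities changes the spread between the options.

# Weighted read of Table 9: the scores come from the table above; the
# weights are deliberately arbitrary examples -- substitute your own.
SCORES = {  # dimension -> (FC, iSCSI, NAS/NFS)
    "capacity":     (4, 3, 4),
    "performance":  (4, 3, 2),
    "presentation": (3, 4, 4),
    "control":      (4, 3, 2),
    "scalability":  (4, 3, 1),
    "resiliency":   (4, 3, 1),
}

def weighted(weights):
    totals = [0.0, 0.0, 0.0]
    for dim, w in weights.items():
        for i, s in enumerate(SCORES[dim]):
            totals[i] += w * s
    return dict(zip(("FC", "iSCSI", "NAS/NFS"), totals))

# A scale-focused enterprise vs. an ease-of-use-focused midsize shop:
print(weighted({"performance": 3, "scalability": 3, "resiliency": 2,
                "capacity": 1, "control": 1, "presentation": 1}))
print(weighted({"presentation": 3, "capacity": 2, "performance": 1,
                "scalability": 1, "control": 1, "resiliency": 1}))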

Based on our assessment of how these key dimensions relate in importance to the typical enterprise with a virtualization initiative, we believe the findings here will be insightful for a broad range of businesses. While NAS/NFS may hold advantages in initial deployment, its architectural and scalability characteristics suggest this is a near-term advantage that does not hold up in the long run. Meanwhile, between the block-based solutions, we see the difference today surfacing mostly at scale. At mid-sized scale, iSCSI may have a serious cost advantage, while "converged" form factors may let the mid-sized business or enterprise scale with ease far into the future. But for businesses facing serious IO pressure, or looking to build an infrastructure for long-term use that can serve an unexpected multitude of needs, FC storage systems deliver utility-like storage with a level of resiliency that likely won't be matched without the FC SAN.

At the end of the day, we believe a more realistic evaluation of storage choices along these lines should take place for any business pursuing virtualization. Irrespective of whether the weightings for individual criteria change, the role of storage is about more than simply interoperability or apparent ease of use. The criteria we’ve identified for growing virtual infrastructures – and all enterprise virtual infrastructures are growing – clearly identify the advantages of block-based solutions. Moreover, we’re on the cusp of a new wave of capability, as vVols look likely to soon cement the virtual infrastructure in block-based storage.

NOTICE: The information and product recommendations made by the TANEJA GROUP are based upon public information and sources and may also include personal opinions both of the TANEJA GROUP and others, all of which we believe to be accurate and reliable. However, as market conditions change and are not within our control, the information and recommendations are made without warranty of any kind. All product names used and mentioned herein are the trademarks of their respective owners. The TANEJA GROUP, Inc. assumes no responsibility or liability for any damages whatsoever (including incidental, consequential or otherwise), caused by your use of, or reliance upon, the information and recommendations presented herein, nor for any inadvertent errors that may appear in this document.