

    Technical Report

ONTAP Select Product Architecture and Best Practices

    Tudor Pascu, NetApp

    July 2016 | TR-4517


TABLE OF CONTENTS

1 Introduction
  1.1 Software-Defined Infrastructure
  1.2 Running ONTAP as Software
  1.3 ONTAP Select Versus Data ONTAP Edge
  1.4 ONTAP Select Platform and Feature Support
2 Architecture Overview
  2.1 Virtual Machine Properties
  2.2 RAID Services
  2.3 Virtualized NVRAM
  2.4 High Availability
  2.5 Network Configurations
  2.6 Networking: Internal and External
3 Deployment and Management
  3.1 ONTAP Select Deploy
  3.2 Licensing
  3.3 ONTAP Management
4 Storage Design Considerations
  4.1 Storage Provisioning
  4.2 ONTAP Select Virtual Disks
  4.3 ONTAP Select Deploy
5 Network Design Considerations
  5.1 Supported Network Configurations
  5.2 vSphere: vSwitch Configuration
  5.3 Physical Switch Configuration
  5.4 Data and Management Separation
  5.5 Four-NIC Configuration
  5.6 Two-NIC Configuration
6 Use Cases
  6.1 Mirrored Aggregate Creation
  6.2 Remote/Branch Office
  6.3 Private Cloud (Data Center)
7 Upgrading
  7.1 Increasing Capacity
  7.2 Single-Node to Multinode Upgrade
8 Performance
Version History

LIST OF TABLES

Table 1) ONTAP Select versus Data ONTAP Edge.
Table 2) ONTAP Select virtual machine properties.
Table 3) Internal versus external network quick reference.
Table 4) Network configuration support matrix.
Table 5) Performance results.

LIST OF FIGURES

Figure 1) Virtual disk to physical disk mapping.
Figure 2) Incoming writes to ONTAP Select VM.
Figure 3) Four-node ONTAP Select cluster.
Figure 4) ONTAP Select mirrored aggregate.
Figure 5) ONTAP Select write path workflow.
Figure 6) HA heartbeating: steady state.
Figure 7) ONTAP Select multinode network configuration.
Figure 8) Network configuration of a multinode ONTAP Select VM.
Figure 9) Network configuration of a single-node ONTAP Select VM.
Figure 10) Server LUN configuration with only RAID-managed spindles.
Figure 11) Server LUN configuration on a mixed RAID/non-RAID system.
Figure 12) Virtual disk provisioning.
Figure 14) Standard vSwitch configuration.
Figure 15) LACP distributed vSwitch configuration.
Figure 16) Network configuration using a shared physical switch.
Figure 17) Network configuration using multiple physical switches.
Figure 18) Data and management separation using VST.
Figure 19) Data and management separation using VGT.
Figure 20) Four-NIC network configuration.
Figure 21) Two-NIC network configuration.
Figure 22) Scheduled backup of remote office to corporate data center.
Figure 23) Private cloud built on direct-attached storage.


    1 Introduction

    NetApp ONTAP Select is helping pioneer the newly emerging software-defined storage (SDS) area by

    bringing enterprise-class storage management features to the software-defined data center. ONTAP

    Select is a critical component of the Data Fabric envisioned by NetApp, allowing customers to run

    ONTAP management services on commodity hardware.

    This document describes the best practices that should be followed when building an ONTAP Select

    cluster, from hardware selection to deployment and configuration. Additionally, it aims to answer the

    following questions:

    How is ONTAP Select different from our engineered FAS storage platforms?

    Why were certain design choices made when creating the ONTAP Select architecture?

What are the performance implications of the various configuration options?

    1.1 Software-Defined Infrastructure

    Fundamental changes in IT require that companies reevaluate the technology they use to provide

business services. We've seen this with the migration away from the monolithic infrastructures used in

mainframe computing and, more recently, in the adoption of virtualization technologies. Today, the next wave of IT transformation is upon us. The implementation and delivery of IT services through software

    provide administrators with the ability to rapidly provision resources with a level of speed and agility that

    was previously impossible. Service-oriented delivery models, the need to dynamically respond to

    changing application requirements, and a dramatic shift in infrastructure consumption by enterprises have

    given rise to software-defined infrastructures (SDIs).

    Modern data centers are moving toward software-defined infrastructures as a mechanism to provide IT

services with greater agility and efficiency. Separating IT value from the underlying physical

infrastructure allows organizations to react quickly to changing IT needs by dynamically shifting infrastructure

    resources to where they are needed most.

Software-defined infrastructures are built on three tenets:

Flexibility

Scalability

Programmability

    These environments provide a set of IT services across a heterogeneous physical infrastructure,

    achieved through the abstraction layer provided by hardware virtualization. They are able to dynamically

    increase or decrease the amount of IT resources available, without needing to add or remove hardware.

    Most important, they are automatable, allowing for programmatic deployment and configuration at the

touch of a button. No racking, stacking, or recabling of equipment is necessary.

    SDI covers many aspects of IT infrastructure, from networking to compute to storage. In this document

    we focus on one specific area, SDS, and discuss how it relates to NetApp.

Software-Defined Storage

The shift toward software-defined infrastructures may be having its greatest impact in an area that has

    traditionally been one of the least affected by the virtualization movement: storage. Software-only

    solutions that separate out storage management services from the physical hardware are becoming more

    commonplace. This is especially evident within private cloud environments: enterprise-class service-

oriented architectures designed from the ground up with software-defined principles in mind. Many of these

    environments are being built on commodity hardware: white box servers with locally attached storage,

    with software controlling the placement and management of user data.

  • 7/25/2019 ONTAP Select Product Architecture

    5/44

    5

This is also seen in the emergence of hyperconverged infrastructures (HCIs), a building-block style of

    IT design based on the premise of bundling compute, storage, and networking services. The rapid

    adoption of hyperconverged solutions over the past several years has highlighted the desire for simplicity

    and flexibility. However, as companies make the decision to replace enterprise-class storage arrays with

a more customized, "make your own" model, by building storage management solutions on top of home-

    grown components, a set of new problems emerges.

    In a commodity world, where data lives fragmented across silos of direct-attached storage, data mobility

    and data management become complex problems that need to be solved. This is where NetApp can help.

    1.2 Running ONTAP as Software

    There is a compelling value proposition in allowing customers to determine the physical characteristics of

    their underlying hardware, while still giving them the ability to consume ONTAP and all of its storage

    management services. Decoupling ONTAP from the underlying hardware allows us to provide enterprise-

    class file and replication services within a software-defined environment.

    This is enabled by leveraging the abstraction layer provided by server virtualization, which allows us to

    tease apart ONTAP from any dependencies on the underlying physical hardware and put ONTAP into

places that our FAS arrays can't reach.

Why do we require a hypervisor? Why not run ONTAP on bare metal? There are two answers to that question:

    Qualification

    Hyperconvergence

    Running ONTAP as software on top of another software application allows us to leverage much of the

qualification work done by the hypervisor, which is critical in helping us rapidly expand our list of supported

    platforms. Additionally, positioning ONTAP as a hyperconverged solution allows customers to plug into

existing orchestration frameworks, enabling rapid provisioning and end-to-end automation, from

    deployment and configuration to the provisioning of storage resources through supported OnCommand

    tooling such as WFA or NMSDK.

    This is the goal of the ONTAP Select product.

    1.3 ONTAP Select Versus Data ONTAP Edge

If you're familiar with the previous NetApp software-defined offering, Data ONTAP Edge, you may be

    wondering how ONTAP Select is different. Although much of this is covered in additional detail in the

architecture overview section of this document, Table 1 highlights some of the major differences

    between the two products.

    Table 1) ONTAP Select versus Data ONTAP Edge.

Description                   Data ONTAP Edge                       ONTAP Select
Node count                    Single node                           Two offerings: single node and
                                                                    4-node with HA
Virtual machine CPU/memory    2 vCPUs/8GB                           4 vCPUs/16GB
Hypervisor                    vSphere 5.1, 5.5                      vSphere 5.5 update 3a (build
                                                                    3116895 or greater) and 6.0
                                                                    (build 2494585 or greater)
High availability             No                                    Yes
iSCSI/CIFS/NFS                Yes                                   Yes
SnapMirror and SnapVault      Yes                                   Yes
Compression                   No                                    Yes
Capacity limit                10TB, 25TB, 50TB                      Up to 100TB/node
Hardware platform support     Select families within qualified      Wider support for major vendor
                              server vendors                        offerings that meet minimum criteria

    1.4 ONTAP Select Platform and Feature Support

    The abstraction layer provided by the hypervisor allows ONTAP Select to run on a wide variety of

commodity platforms from virtually all of the major server vendors, provided they meet minimum

    hardware criteria. These specifications are detailed below.

    Hardware Requirements

ONTAP Select requires that the hosting physical server meet the following requirements:

Intel Xeon E5-26xx v3 (Haswell) CPU or greater: 6 cores (4 for ONTAP Select, 2 for OS)

32GB RAM

8-24 internal SAS disks

Minimum of two 10GbE NIC ports (four recommended)

Hardware RAID controller with writeback cache

    For a complete list of supported hardware platforms and management applications, refer to the ONTAP

Select Select 9.0 Release Notes.

    ONTAP Feature Support

The ONTAP Select product ships with clustered Data ONTAP 9.0 and offers full support

    for most functionality, with the exception of those features that have hardware-specific dependencies

    such as MetroCluster and FCoE.

    This includes support for:

    NFS/CIFS/iSCSI

    SnapMirror and SnapVault

FlexClone

    SnapRestore

    Dedupe and compression

    Additionally, support for the OnCommand management suite is included. This includes most tooling used

    to manage NetApp FAS arrays, such as OnCommand Unified Manager (OCUM), OnCommand Insight

    (OCI), Workflow Automation (WFA), and SnapCenter. Consult the IMT for a complete list of supported

    management applications.

    Note that the following ONTAP features are not supported by ONTAP Select:

    Interface groups (IFGRPs)

Hardware-centric features such as MetroCluster, Fibre Channel (FC/FCoE), and full disk encryption (FDE)

    SnapLock


    Compaction

    Inline dedupe

In traditional FAS systems, interface groups are used to provide aggregate throughput and fault tolerance using a single, logical, virtualized network interface configured on top of multiple physical network interfaces. ONTAP Select leverages the underlying hypervisor's virtualization of multiple physical network interfaces to achieve the same goals of throughput aggregation and resiliency. The network interface cards that ONTAP Select manages are therefore logical constructs, and configuring additional interface groups will not achieve the goals of throughput aggregation or recovering from hardware failures.

    2 Architecture Overview

ONTAP Select is clustered Data ONTAP deployed as a virtual machine, providing storage

management services on a virtualized commodity server by managing the server's direct-attached

    storage.

    The ONTAP Select product can be deployed two different ways:

Non-HA (single node). The single-node version of ONTAP Select is well suited for remote and branch offices, providing customers with the ability to run enterprise-class file services, backup, and disaster recovery solutions on commodity hardware.

High availability (multinode). The multinode version of the platform uses four ONTAP Select nodes and adds support for high availability and clustered Data ONTAP nondisruptive operations, all within a shared-nothing environment.

    When choosing a solution, resiliency requirements, environment restrictions, and cost factors should be

    taken into consideration. Although both versions run clustered Data ONTAP and support many of the

same core features, the multinode solution adds high availability and supports

    nondisruptive operations, a core value proposition for clustered Data ONTAP.

Note: The single-node and multinode versions of ONTAP Select are deployment options, not separate products. Although the multinode solution requires the purchase of additional node licenses, both

    share the same product model, FDvM300.

This section provides a deeper dive into various aspects of the system architecture for both the

    single-node and multinode solutions while highlighting important differences between the two variants.

    2.1 Virtual Machine Properties

    The ONTAP Select virtual machine has a fixed set of properties, described in Table 2. Increasing or

    decreasing the amount of resources allocated to the VM is not supported. Additionally, the ONTAP Select

instance hard-reserves its CPU and memory resources, meaning the physical resources backing the

virtual machine are unavailable to any other VMs hosted on the server.

    Table 2 shows the resources used by the ONTAP Select VM.

    Table 2) ONTAP Select virtual machine properties.

Description                   Single Node               Multinode (per Node)
CPU/memory                    4 vCPUs/16GB              4 vCPUs/16GB
Virtual network interfaces    2                         6
SCSI controllers              4                         4
System boot disk              10GB                      10GB
System coredump disk          120GB                     120GB
Mailbox disk                  556MB                     556MB
Cluster root disk             68GB                      68GB (x2 because the disk is mirrored)
Serial ports                  2 network serial ports    2 network serial ports

Note: The coredump disk partition is separate from the system boot disk. Because the corefile size is directly related to the amount of memory allocated to the ONTAP instance, this allows NetApp to support larger memory instances in the future without requiring a redesign of the system boot disk.

    ONTAP makes use of locally attached physical hardware, specifically the hardware RAID controller

    cache, to achieve a significant increase in write performance. Additionally, because ONTAP Select is

    designed to manage the locally attached storage on the system, certain restrictions apply to the ONTAP

    Select virtual machine. Specifically:

    Only one ONTAP Select VM can reside on a single server.

ONTAP Select may not be migrated or vMotioned to another server. This includes storage vMotion of the ONTAP Select VM.

Enabling vSphere Fault Tolerance (FT) is not supported. This is in part because the system disks of the ONTAP Select VM are IDE disks, which vSphere FT does not support.

    2.2 RAID Services

    Although some software-defined solutions require the presence of an SSD to act as a higher speed write

    staging device, ONTAP Select uses a hardware RAID controller to achieve both a write performance

    boost and the added benefit of protection against physical drive failures by moving RAID services to the

hardware controller. As a result, RAID protection for all nodes within the ONTAP Select cluster is

    provided by the locally attached RAID controller and not through Data ONTAP software RAID.

Note: ONTAP Select data aggregates are configured to use RAID 0, because the physical RAID controller is providing RAID striping to the underlying drives. No other RAID levels are supported.

    RAID Controller Configuration

    All locally attached disks that provide ONTAP Select with backing storage must sit behind a RAID

    controller. Most commodity servers come with multiple RAID controller options across multiple price

points, each with varying levels of functionality. The intent is to support as many of these options as

possible, provided they meet certain minimum requirements placed on the controller.

The RAID controller that is managing the ONTAP Select disks must support the following (a hypothetical controller-CLI sketch follows this list):

The hardware RAID controller must have a battery backup unit (BBU) or flash-backed write cache (FBWC).

The RAID controller must support a RAID mode that can withstand at least one or two disk failures (RAID 5, RAID 6).

The drive cache should be set to disabled.

The write policy should be configured for writeback mode with a fallback to writethrough upon BBU or flash failure. This is explained in further detail in the RAID Mode section of this document.

The I/O policy for reads must be set to cached.
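As an illustration only, the following sketch shows how these settings might look on a Broadcom/LSI-style controller using the storcli utility. The controller (/c0) and virtual drive (/v0) identifiers, and the choice of tool, are assumptions; other vendors expose equivalent settings under different syntax, so verify against your controller's documentation.

    # Hypothetical Broadcom/LSI storcli example; adjust controller/VD IDs for your system.
    storcli /c0/v0 set wrcache=wb      # writeback, with automatic fallback to writethrough on BBU failure
    storcli /c0/v0 set rdcache=ra      # cached/read-ahead policy for reads
    storcli /c0/vall set pdcache=off   # disable the physical drive cache
    storcli /c0 show all               # verify RAID level, cache policy, and BBU status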


    All locally attached disks that provide ONTAP Select with backing storage must be placed into a single

    RAID group running RAID 5 or RAID 6. Using a single RAID group allows ONTAP to reap the benefits of

    spreading incoming read requests across a higher number of disks, providing a significant gain in

    performance. Additionally, performance testing was done against single-LUN vs. multi-LUN

configurations. No significant differences were found, so for simplicity's sake, we strongly recommend

    creating the fewest number of LUNs necessary to support your configuration needs.

    Best Practice

    Although most configurations should require the creation of only a single LUN, if the physical server

contains a single RAID controller managing all locally attached disks, we recommend creating two

    LUNs: one to provide backing storage for the server OS and a second for ONTAP Select. In the event

    of boot disk corruption, this allows the administrator to recreate the OS LUN without affecting ONTAP

    Select.

    At its core, ONTAP Select presents Data ONTAP with a set of virtual disks, provisioned from a backing

    storage pool, using LUNs composed of locally attached spindles. Data ONTAP is presented with a set of

    virtual disks, which it treats as physical, and the remaining portion of the storage stack is abstracted by

    the hypervisor and RAID controller. Figure 1 shows this relationship in more detail, highlighting the

    relationship between the physical RAID controller, the hypervisor, and the ONTAP Select VM. Note the

    following:

    RAID group and LUN configuration occur from within the server's RAID controller software.

    Storage pool configuration happens from within the hypervisor.

    Virtual disks are created and owned by individual VMs, in this case, ONTAP Select.

    Figure 1) Virtual disk to physical disk mapping.

    RAID Mode

    Many RAID controllers support three modes of operation, each representing a significant difference in the

    data path taken by write requests. These are:

Writethrough. All incoming I/O requests are written to the RAID controller cache and then immediately flushed to disk before acknowledging the request back to the host.

Writearound. All incoming I/O requests are written directly to disk, circumventing the RAID controller cache.


Writeback. All incoming I/O requests are written directly to the controller cache and immediately acknowledged back to the host. Data blocks are flushed to disk asynchronously using the controller.

    Writeback mode offers the shortest data path, with I/O acknowledgement occurring immediately after the

blocks enter the cache, providing lower latency and higher throughput for mixed read/write workloads.

    However, without the presence of a BBU or nonvolatile flash technology, when operating in this mode,

    users run the risk of losing data should the system incur a power failure.

    Because ONTAP Select requires the presence of a battery backup or flash unit, we can be confident that

    cached blocks are flushed to disk in the event of this type of failure. For this reason, it is a requirement

    that the RAID controller be configured in writeback mode.

    Best Practice

    The server RAID controller should be configured to operate in writeback mode. If write workload

    performance issues are seen, check the controller settings and make sure that writethrough or

    writearound is not enabled.

2.3 Virtualized NVRAM

NetApp FAS systems are traditionally fitted with a physical NVRAM PCI card: a high-performing card

containing nonvolatile flash memory that provides a significant boost in write performance by granting

Data ONTAP the ability to:

    Immediately acknowledge incoming writes back to the client

    Schedule the movement of modified data blocks back to the slower storage media (this process is

    known as destaging)

Commodity systems are not traditionally fitted with this type of equipment because it can be cost

    prohibitive. Therefore, the functionality of the NVRAM card has been virtualized and placed into a partition

    on the ONTAP Select system boot disk. It is for precisely this reason that placement of the system virtual

    disk of the instance is extremely important, and why the product requires the presence of a physical RAID

    controller with a resilient cache.

    Data Path Explained: vNVRAM and RAID Controller

    The interaction between the virtualized NVRAM system partition and the RAID controller can be best

    highlighted by walking through the data path taken by a write request as it enters the system.

Incoming write requests to the ONTAP Select virtual machine are targeted at the VM's NVRAM partition.

    At the virtualization layer, this partition exists within an ONTAP Select system disk: a VMDK attached to

    the ONTAP Select VM. At the physical layer, these requests are cached in the local RAID controller, like

all block changes targeted at the underlying spindles. From here, the write is acknowledged back to the

    host.

    So at this point:

    Physically, the block resides in the RAID controller cache, waiting to be flushed to disk.

    Logically, the block resides in NVRAM, waiting for destaging to the appropriate user data disks.

Because changed blocks are automatically stored within the RAID controller's local cache, incoming

    writes to the NVRAM partition are automatically cached and periodically flushed to physical storage

    media. This should not be confused with the periodic flushing of NVRAM contents back to Data ONTAP

    data disks. These two events are unrelated and occur at different times and frequencies.


Figure 2 is intended to show the I/O path an incoming write takes, distinguishing the

physical layer, represented by the RAID controller cache and disks, from the virtual layer, shown through

the virtual machine's NVRAM and data virtual disks.

Note: Although blocks changed on the NVRAM VMDK are cached in the local RAID controller cache, the cache is not aware of the VM construct or its virtual disks. It stores all changed blocks on the system, of which NVRAM is only a part. This includes write requests bound for the hypervisor,

    which is provisioned from the same backing spindles.

    Figure 2) Incoming writes to ONTAP Select VM.

    Best Practice

    Because the RAID controller cache is used to store all incoming block changes and not only those

    targeted toward the NVRAM partition, when choosing a RAID controller, select one with the largest

    cache available. A larger cache allows for less frequent disk flushing and an increase in performance of

    the ONTAP Select VM, the hypervisor, and any compute VMs colocated on the server.

    2.4 High Availability

    Although customers are starting to move application workloads from enterprise-class storage appliances

    to software-based solutions running on commodity hardware, the expectations and needs around

    resiliency and fault tolerance have not changed. A high-availability solution providing a zero RPO is

    required, one that protects the customer from data loss due to a failure from any component in the

    infrastructure stack. This makes asynchronous replication engines poor candidates to provide these

    services.

    A large portion of the SDS market is built on the notion of nonshared storage, with software replication

providing data resiliency by storing multiple copies of user data across different storage silos. ONTAP

    Select builds on this premise by using the synchronous replication features (RAID SyncMirror) provided

    by clustered Data ONTAP to store an additional copy of user data within the cluster. This occurs within

    the context of an HA pair. Every HA pair stores two copies of user data: one on storage provided by the

    local node and one on storage provided by the HA partner. Within an ONTAP Select cluster, HA and

    synchronous replication are tied together, and the functionality of the two cannot be decoupled or used

    independently. As a result, the synchronous replication functionality is only available in the multinode

    offering.

Note: In an ONTAP Select cluster, synchronous replication functionality is a function of the HA implementation, not a replacement for the asynchronous SnapMirror or SnapVault replication engines. Synchronous replication cannot be used independently from HA.


    Synchronous Replication

    The Data ONTAP HA model is built on the notion of HA partners. As explained earlier, ONTAP Select

    extends this architecture into the nonshared commodity server world by using the RAID SyncMirror

    functionality that is present in clustered Data ONTAP to replicate data blocks between cluster nodes,

    providing two copies of user data spread across an HA pair.

Note: This product is not intended to be an MCC-style disaster recovery replacement and cannot be used as a stretch cluster. Cluster network and replication traffic occurs using link-local IP addresses and requires a low-latency, high-throughput network. As a result, spreading out cluster nodes across long distances is not supported.

    This architecture is represented by Figure 3. Note that the four-node ONTAP Select cluster is composed

    of two HA pairs, each synchronously mirroring blocks back and forth. Data aggregates on each cluster

    node are guaranteed to be identical, and in the event of a failover there is no loss of data.

    Figure 3) Four-node ONTAP Select cluster.

Note: Only one ONTAP Select instance may be present on a physical server. That instance is tied to the server, meaning the VM may not be migrated off to another server. ONTAP Select requires unshared access to the local RAID controller of the system and is designed to manage the locally attached disks, which would be impossible without physical connectivity to the storage.

    Mirrored Aggregates

    An ONTAP Select cluster is composed of four nodes and contains two copies of user data, synchronously

    mirrored across HA pairs over an IP network. This mirroring is transparent to the user and is a property of

    the aggregate assigned at the time of creation.

Note: All aggregates in an ONTAP Select cluster must be mirrored to ensure data availability in the case of a node failover and to avoid a single point of failure in case of hardware failure. Aggregates in an ONTAP Select cluster are built from virtual disks provided from each node in the HA pair and use:

A local set of disks, contributed by the current ONTAP Select node

A mirror set of disks, contributed by the HA partner of the current node

Note: Both the local and mirror disks used to build a mirrored aggregate must be of the same size. We will refer to these disk sets as Plex 0 and Plex 1 to indicate the local and remote mirror pairs respectively. The actual plex numbers may be different in your installation.

    This is an important point and fundamentally different from the way standard ONTAP clusters work. This

    applies to all root and data disks within the ONTAP Select cluster. Because the aggregate contains both


    local and mirror copies of data, an aggregate that contains N virtual disks actually offers N/2 disks worth

    of unique storage, because the second copy of data resides on its own unique disks.

    Figure 4 depicts an HA pair within a four-node ONTAP Select cluster. Within this cluster is a single

    aggregate, test, which uses storage from both HA partners. This data aggregate is composed of two

    sets of virtual disks: a local set, contributed by the ONTAP Select owning cluster node (Plex 0), and a

    remote set, contributed by the failover partner (Plex 1).

Plex 0 is the bucket that holds all local disks. Plex 1 is the bucket that holds mirror disks, or disks

    responsible for storing a second replicated copy of user data. The node that owns the aggregate

    contributes disks to Plex 0, and the HA partner of that node contributes disks to Plex 1.

    In our figure, we have a mirrored aggregate with two disks. The contents of this aggregate are mirrored

    across our two cluster nodes, with local disk NET-1.1 placed into the Plex 0 bucket and remote disk NET-

    2.1 placed into Plex 1. In this example, aggregate test is owned by the cluster node to the left and uses

    local disk NET-1.1 and HA partner mirror disk NET-2.1.

    Figure 4) ONTAP Select mirrored aggregate.

Note: When an ONTAP Select cluster is deployed, all virtual disks present on the system are automatically assigned to the correct plex, requiring no additional step from the user with respect to disk

    assignment. This prevents the accidental assignment of disks to an incorrect plex and makes sure of

    optimal mirror disk configuration.

    For an example of the process of building a mirrored aggregate using the Data ONTAP command line

    interface, refer to Configuration of a mirrored aggregate.
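As a minimal sketch of that process (the aggregate name, node name, and disk count here are illustrative), a mirrored aggregate is created from the clustershell by setting -mirror true, and the resulting plex layout can then be inspected:

    ::> storage aggregate create -aggregate test -node select-01 -diskcount 4 -mirror true
    ::> storage aggregate show-status -aggregate test    # lists the disks backing Plex 0 and Plex 1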

    Best Practice

While the existence of the mirrored aggregate is used to guarantee an up-to-date (RPO 0) copy of the primary aggregate, care should be taken that the primary aggregate does not run low on free space. A low-space condition in the primary aggregate may cause ONTAP to delete the common snapshot used as the baseline for storage giveback. While this works as designed to accommodate client writes, the lack of a common snapshot on failback requires the ONTAP Select node to do a full baseline from the mirrored aggregate. This operation can take a significant amount of time in a shared-nothing environment.

A good baseline for monitoring aggregate space utilization is 85%.
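One simple way to watch that threshold from the clustershell (the aggregate name is illustrative):

    ::> storage aggregate show -aggregate test -fields percent-used,availsize,size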


    Write Path Explained

    Synchronous mirroring of data blocks between cluster nodes and the requirement of no data loss in the

    event of a system failure have a significant impact on the path an incoming write takes as it propagates

    through an ONTAP Select cluster. This process consists of two stages:

    Acknowledgement

    Destaging

    Writes to a target volume occur over a data LIF and are committed to the virtualized NVRAM partition,

present on a system disk of the ONTAP Select node, before being acknowledged back to the client. In

an HA configuration, an additional step occurs, because these NVRAM writes are immediately mirrored to

the HA partner of the target volume's owner before being acknowledged. This ensures file system

consistency on the HA partner node in case of a hardware failure on the original node.

    After the write has been committed to NVRAM, Data ONTAP periodically moves the contents of this

    partition to the appropriate virtual disk, a process known as destaging. This process only happens once,

    on the cluster node owning the target volume, and does not happen on the HA partner.

    Figure 5 shows the write path of an incoming write request to an ONTAP Select node.

    Figure 5) ONTAP Select write path workflow.

Incoming write acknowledgement:

1. Writes enter the system through a logical interface owned by Select A.

2. Writes are committed to both local system memory and NVRAM, then synchronously mirrored to the HA partner.

a. Once the I/O request is present on both HA nodes, it is acknowledged back to the client.

Destaging to virtual disk:

3. Writes are destaged from system memory to the aggregate.

4. The mirror engine synchronously replicates blocks to both plexes.


    Disk Heartbeating

    Although the ONTAP Select HA architecture leverages many of the code paths used by the traditional

    FAS arrays, some exceptions exist. One of these exceptions is in the implementation of disk-based

    heartbeating, a nonnetwork-based method of communication used by cluster nodes to prevent network

    isolation from causing split-brain behavior. Split brain is the result of cluster partitioning, typically caused

    by network failures, whereby each side believes the other is down and attempts to take over cluster

    resources. Enterprise-class HA implementations must gracefully handle this type of scenario, and Data

    ONTAP does this through a customized disk-based method of heartbeating. This is the job of the HA

    mailbox, a location on physical storage that is used by cluster nodes to pass heartbeat messages. This

    helps the cluster determine connectivity and therefore define quorum in the event of a failover.

    On FAS arrays, which use a shared-storage HA architecture, Data ONTAP resolves split-brain issues

    through:

    1. SCSI persistent reservations

    2. Persistent HA metadata

    3. HA state sent over HA interconnect

    However, within the shared-nothing architecture of an ONTAP Select cluster, a node is only able to see

its own local storage and not that of the HA partner. Therefore, when network partitioning isolates each side of an HA pair, the preceding methods of determining cluster quorum and failover behavior are

    unavailable.

    Although the existing method of split-brain detection and avoidance cannot be used, a method of

    mediation is still required, one that fits within the constraints of a shared-nothing environment. ONTAP

    Select extends the existing mailbox infrastructure further, allowing it to act as a method of mediation in

    the event of network partitioning. Because shared storage is unavailable, mediation is accomplished

    through access to the mailbox disks over network-attached storage. These disks are spread throughout

    the cluster, across an iSCSI network, so intelligent failover decisions can be made by a cluster node

    based on access to these disks. If a node is able to access the mailbox disks of all cluster nodes outside

of its HA partner, it is likely up and healthy. If not, the node itself may be network isolated.

    Note: The mailbox architecture and disk-based heartbeating method of resolving cluster quorum and

split-brain issues are the reasons the multinode variant of ONTAP Select requires four separate nodes.

    HA Mailbox Posting

    The HA mailbox architecture uses a message post model. At repeated intervals, cluster nodes post

    messages to all other mailbox disks across the cluster, stating that the node is up and running. Within a

    healthy cluster, at any given point in time, a single mailbox disk on a cluster node will have messages

    posted from all other cluster nodes.

    Attached to each Select cluster node is a virtual disk that is used specifically for shared mailbox access.

    This disk is referred to as the mediator mailbox disk, since its main function is to act as a method of

    cluster mediation in the event of node failures or network partitioning. This mailbox disk contains

    partitions for each cluster node and is mounted over an iSCSI network by other Select cluster nodes.

Periodically, these nodes post health status to the appropriate partition of the mailbox disk. Using network-accessible mailbox disks spread throughout the cluster allows us to infer node health through a

    reachability matrix. For example, if cluster nodes A and B can post to the mailbox of cluster node D, but

not node C, and cluster node D cannot post to the mailbox of node C, it's likely that node C is either down

    or network isolated and should be taken over.


    HA Heartbeating

Like NetApp's FAS platforms, ONTAP Select periodically sends HA heartbeat messages over the HA

    interconnect. Within the ONTAP Select cluster, this is done over a TCP/IP network connection that exists

    between HA partners. Additionally, disk-based heartbeat messages are passed to all HA mailbox disks,

    including mediator mailbox disks. These messages are passed every few seconds and read back

periodically. The frequency with which these are sent/received allows the ONTAP Select cluster to detect

    HA failure events within 15 seconds, the same window available on FAS platforms. When heartbeat

    messages are no longer being read, a failover event is triggered.

    Figure 6 illustrates the process of sending and receiving heartbeat messages over the HA interconnect

    and mediator disks from the perspective of a single ONTAP Select cluster node, node C. Note that

    network heartbeats are sent over the HA interconnect to the HA partner, node D, while disk heartbeats

    use mailbox disks across all cluster nodes, A, B, C, and D.

    Figure 6) HA heartbeating: steady state.

    2.5 Network Configurations

    Decoupling Data ONTAP from physical hardware and providing it to customers as a software package

    designed to run on commodity servers introduced a new problem to NetApp, one best summed up by two

    important questions:

How can we be confident that Data ONTAP will run reliably with variability in the underlying network configuration?

Does running Data ONTAP as a virtual machine guarantee implicit support for any software-defined network configuration?

    Providing a storage management platform as software, instead of hardware, requires supporting a level of

abstraction between the storage OS and its underlying hardware resources. Configuration variability, which can affect the hardware's ability to meet the needs of the OS, becomes a real problem. Supporting

    any VM network configuration would potentially introduce the possibility of resource contention into

    cluster network communications, an area that has traditionally been owned exclusively by Data ONTAP.

    Therefore, to make sure that Data ONTAP has sufficient network resources needed for reliable cluster

    operations, a specific set of architecture requirements has been placed around the ONTAP Select

    network configuration.


    The network architecture of the single-node variant of the ONTAP Select platform is different from the

    multinode version. This section first dives into the more complicated multinode solution and then covers

    the single-node configuration, which is a logical subset.

    Network Configuration: Multinode

    The multinode ONTAP Select network configuration consists of two networks: an internal network,

    responsible for providing cluster and internal replication services, and an external network, responsible for

    providing data access and management services. End-to-end isolation of traffic that flows within these

two networks is extremely important in allowing us to build an environment that ensures cluster

    resiliency.

    Figure 7) ONTAP Select multinode network configuration.

These networks are represented in Figure 7, which shows a four-node ONTAP Select cluster running on a VMware vSphere platform. Note that each ONTAP Select instance resides on a separate physical

server, and internal and external traffic is isolated through the use of separate network port groups, which

    are assigned to each virtual network interface and allow the cluster nodes to share the same physical

    switch infrastructure.

    Each ONTAP Select virtual machine contains six virtual network adapters, presented to Data ONTAP as

    a set of six network ports, e0a through e0f. Although ONTAP treats these adapters as physical NICs, they

    are in fact virtual and map to a set of physical interfaces through a virtualized network layer. As a result,

    each hosting server does not require six physical network ports.

    Note: Adding virtual network adapters to the ONTAP Select VM is not supported.

    These ports are preconfigured to provide the following services:

    e0a, e0b. Data and management LIFs

    e0c, e0d. Cluster network LIFs

    e0e. RAID SyncMirror (RSM)

    e0f. HA interconnect

Ports e0a and e0b reside on the external network. Although ports e0c through e0f perform several different

    functions, collectively they comprise the internal Select network. When making network design decisions,

    these ports should be placed on a single L2 network. There is no need to separate these virtual adapters

    across different networks.
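After deployment, the port-to-IPspace mapping described in the following sections can be confirmed from the clustershell; a brief sketch (the node name is hypothetical):

    ::> network port show -node select-01 -fields ipspace    # e0a/e0b report Default; e0c/e0d report Cluster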


    The relationship between these ports and the underlying physical adapters can be seen in Figure 8,

    which depicts one ONTAP Select cluster node on the ESX hypervisor.

    Figure 8) Network configuration of a multinode ONTAP Select VM.

    Note that in Figure 8, internal traffic and external traffic are split across two different vSwitches.

    Segregating traffic across different physical NICs makes sure that we are not introducing latencies into

    the system due to insufficient access to network resources. Additionally, aggregation through NIC

    teaming makes sure that failure of a single network adapter does not prevent the ONTAP Select cluster

    node from accessing the respective network.

Refer to the Networking section for network configuration best practices.

    LIF Assignment

    With the introduction of IPspaces, Data ONTAP port roles have been deprecated. Like FAS, ONTAP

Select clusters contain both a default and a cluster IPspace. By placing network ports e0a and e0b into the default IPspace and ports e0c and e0d into the cluster IPspace, we have essentially walled off those ports

    from hosting LIFs that do not belong. The remaining ports within the ONTAP Select cluster are consumed

through the automatic assignment of interfaces providing internal services and are not exposed through the

    ONTAP shell, as is the case with the RSM and HA interconnect interfaces.

Note: Not all LIFs are visible through the ONTAP command shell. The HA interconnect and RSM interfaces are hidden from ONTAP and used internally by FreeBSD to provide their respective services.

    The network ports/LIFs are explained in further detail in the following sections.

    Data and Management LIFs (e0a, e0b)

Data ONTAP ports e0a and e0b have been delegated as candidate ports for logical interfaces that carry the following types of traffic:

    SAN/NAS protocol traffic (CIFS, NFS, iSCSI)

    Cluster, node, and SVM management traffic

    Intercluster traffic (SnapMirror, SnapVault)

Note: Cluster and node management LIFs are automatically created during ONTAP Select cluster setup. The remaining LIFs may be created post-deployment.
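For example, an NFS data LIF could be added post-deployment with a command along these lines (the SVM, LIF, node, and address values are hypothetical):

    ::> network interface create -vserver svm1 -lif data1 -role data -data-protocol nfs -home-node select-01 -home-port e0a -address 192.168.10.50 -netmask 255.255.255.0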


    Cluster Network LIFs (e0c, e0d)

    Data ONTAP ports e0c and e0d have been delegated as home ports for cluster interfaces. Within each

    ONTAP Select cluster node, two cluster interfaces are automatically generated during Data ONTAP setup

    using link-local IP addresses (169.254.x.x).

    Note: These interfaces cannot be assigned static IP addresses, and additional cluster interfaces should

not be created.

Cluster network traffic must flow through a low-latency, nonrouted layer 2 network. Due to cluster

    throughput and latency requirements, the ONTAP Select cluster is expected to be physically located

within close proximity (for example, multirack, single data center). Building a stretch cluster configuration

    by separating HA nodes across a wide area network or across significant geographical distances is not

    supported.

Note: To make sure of maximum throughput for cluster network traffic, this network port is configured to use jumbo frames (9000 MTU). This is not configurable, so to make sure of proper cluster operation, verify that jumbo frames are enabled on all upstream virtual and physical switches providing internal network services to ONTAP Select cluster nodes.
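A quick way to check both sides of this requirement, sketched under the assumption that a standard vSwitch named vSwitch1 carries the internal network:

    ::> network port show -fields mtu                          # ONTAP side: cluster ports should report 9000
    esxcli network vswitch standard list                       # ESXi side: confirm the internal vSwitch MTU
    esxcli network vswitch standard set -v vSwitch1 -m 9000    # raise the vSwitch MTU if needed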

    RAID SyncMirror Traffic (e0e)

Synchronous replication of blocks across HA partner nodes occurs using an internal network interface residing on network port e0e. This functionality happens automatically, using network interfaces

    configured by Data ONTAP during cluster setup, and requires no configuration by the administrator.

    Because this port is reserved by Data ONTAP for internal replication traffic, neither the port nor the

    hosted LIF is visible in the Data ONTAP CLI or management tooling. This interface is configured to use

    an automatically generated link-local IP address, and the reassignment of an alternate IP address is not

    supported.

    Note: This network port requires the use of jumbo frames (9000 MTU).

    Throughput and latency requirements that are critical to the proper behavior of the replication network

    dictate that ONTAP Select nodes be located within close physical proximity, so building a hot disaster

    recovery solution is not supported.

    HA Interconnect (e0f)

    NetApp FAS arrays use specialized hardware to pass information between HA pairs in an ONTAP cluster.

    Software-defined environments, however, do not tend to have this type of equipment available (such as

    Infiniband or iWARP devices), so an alternate solution is needed. Although several possibilities were

    considered, ONTAP requirements placed on the interconnect transport required that this functionality be

    emulated in software. As a result, within an ONTAP Select cluster, the functionality of the HA interconnect

    (traditionally provided by hardware) has been designed into the OS, using Ethernet as a transport

    mechanism.

    Each ONTAP Select node is configured with an HA interconnect port, e0f. This port hosts the HA

    interconnect network interface, which is responsible for two primary functions:

Mirroring the contents of NVRAM between HA pairs

Sending/receiving HA status information and network heartbeat messages between HA pairs

    HA interconnect traffic flows through this network port using a single network interface by layering RDMA

    frames within Ethernet packets. Similar to RSM, neither the physical port nor the hosted network interface

    is visible to users from either the ONTAP CLI or management tooling. As a result, the IP address of this

    interface cannot be modified, and the state of the port cannot be changed.

    Note: This network port requires the use of jumbo frames (9000 MTU).
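Although the interconnect itself is hidden, the health of the HA relationship that it carries is visible through the standard ONTAP failover commands (a sketch with illustrative output; node names are examples):

    select::> storage failover show
    Node             Partner          Possible  State Description
    ----             -------          --------  -----------------
    select-node-01   select-node-02   true      Connected to select-node-02
    select-node-02   select-node-01   true      Connected to select-node-01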


    Network Configuration: Single Node

Single-node ONTAP Select configurations do not require the ONTAP internal network, because there is no cluster, HA, or mirror traffic. Unlike the multinode version of the ONTAP Select product, which contains six virtual network adapters, each single-node ONTAP Select virtual machine contains two virtual network adapters, presented to Data ONTAP network ports e0a and e0b.

These ports are used to provide all of the following services: data, management, and intercluster LIFs.

    The relationship between these ports and the underlying physical adapters can be seen in Figure 9,

    which depicts one ONTAP Select cluster node on the ESX hypervisor.

    Figure 9) Network configuration of single-node ONTAP Select VM.

Note that unlike the multinode configuration, the ONTAP Select VM is configured to use only a single vSwitch: vSwitch0. Also note that, similar to the multinode solution, this vSwitch is backed by two physical NIC ports, eth0 and eth1, which is required to ensure the resiliency of the configuration.

    Best Practice

We encourage splitting physical network ports into vSwitches across ASIC boundaries. When a NIC has two ASICs, pick one port from each when teaming for the internal and external networks.

    LIF Assignment

As explained in the multinode LIF assignment section of this document, IPspaces are used by ONTAP Select to keep cluster network traffic separate from data and management traffic. The single-node variant of this platform does not contain a cluster network; therefore, no ports are present in the cluster IPspace.

Note: Cluster and node management LIFs are automatically created during ONTAP Select cluster setup. The remaining LIFs may be created post-deployment.
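As an illustration, a data LIF could be added after deployment using the standard ONTAP command set; the SVM, node, and address values below are hypothetical:

    select::> network interface create -vserver svm1 -lif svm1_data1
              -role data -data-protocol nfs -home-node select-node-01
              -home-port e0a -address 192.168.1.50 -netmask 255.255.255.0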

    2.6 Networking: Internal and External

    ONTAP Select Internal Network

    The internal ONTAP Select network, which is only present in the multinode variant of the product, is

    responsible for providing the ONTAP Select cluster with cluster communication, HA interconnect, and

    synchronous replication services. This network includes the following ports and interfaces:

e0c, e0d. Hosting cluster network LIFs


e0e. Hosting the RAID SyncMirror (RSM) interface

e0f. Hosting the HA interconnect

    The throughput and latency of this network are critical in determining the performance and resiliency of

    the ONTAP Select cluster. Network isolation is required for cluster security and to make sure that system

    interfaces are kept separate from other network traffic. Therefore, this network must be used exclusively

    by the ONTAP Select cluster.

Note: Using the internal network for non-Select cluster traffic, such as application or management traffic, is not supported. There can be no other VMs or hosts on the ONTAP-internal VLAN.

Network packets traversing the internal network must be on a dedicated VLAN-tagged layer-2 network. This can be accomplished either by:

Assigning a VLAN-tagged port group to the internal virtual NICs (e0c through e0f)

Using the native VLAN provided by the upstream switch

Tagging the internal network substantially reduces broadcast traffic; note that the VLAN ID is also used in the MAC address generation of the Data ONTAP ports associated with the internal network.
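On a standard vSwitch, the VLAN tag can be applied to the internal port group from the ESXi shell; the port group name and VLAN ID below are examples only:

    # Tag the internal network port group with VLAN 10
    ~ # esxcli network vswitch standard portgroup set -p "ONTAP-Internal" -v 10

    # Confirm the assignment
    ~ # esxcli network vswitch standard portgroup list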

ONTAP Select External Network

The ONTAP Select external network is responsible for all outbound communications by the cluster and

    therefore is present on both the single-node and multinode configurations. Although this network does not

    have the tightly defined throughput requirements of the internal network, the administrator should be

    careful not to create network bottlenecks between the client and ONTAP VM, because performance

    issues could be mischaracterized as ONTAP Select problems.

Internal Versus External Network

    Table 3 highlights the major differences between the ONTAP Select internal and external networks.

Table 3) Internal versus external network quick reference.

Description             Internal Network                        External Network
Network services        Cluster, HA/IC, RAID SyncMirror (RSM)   Data, management, intercluster (SnapMirror and SnapVault)
VLAN tagging            Required                                Optional
Frame size (MTU)        9,000                                   1,500 (default) / 9,000 (supported)
NIC aggregation         Required                                Required
IP address assignment   Autogenerated                           User defined
DHCP support            No                                      No

    NIC Aggregation

To ensure that the internal and external networks have both the bandwidth and the resiliency characteristics required to provide high performance and fault tolerance, physical network adapter aggregation is used. This is a requirement on both the internal and external networks of the ONTAP Select cluster, regardless of the underlying hypervisor, and provides the ONTAP Select cluster with two major benefits:


    Isolation from a single physical port failure

    Increased throughput

NIC aggregation allows the ONTAP Select instance to balance network traffic across two physical ports. LACP-enabled port channels are supported only on the external network (note that LACP is only available when using distributed vSwitches).

    Best Practice

    In the event that a NIC has multiple ASICs, select one network port from each ASIC when building

    network aggregation constructs through NIC teaming for the internal and external networks.

    MAC Address Generation

The MAC addresses assigned to all ONTAP Select network ports are generated automatically by the included deployment utility, using a NetApp-specific organizationally unique identifier (OUI) to make sure there is no conflict with FAS systems. A copy of each address is then stored in an internal database within the ONTAP Select installation VM (ONTAP Deploy) to prevent accidental reassignment during future node deployments. At no point should the administrator modify the assigned MAC address of a network port.

    3 Deployment and Management

    This section covers the deployment and management aspects of the ONTAP Select product.

    3.1 ONTAP Select Deploy

    The ONTAP Select cluster is deployed using specialized tooling that provides the administrator with the

    ability to build the ONTAP cluster as well as manage various aspects of the virtualized server. This utility,

    called ONTAP Select Deploy, comes packaged inside of an installation VM along with the ONTAP Select

    OS image. Bundling the deployment utility and ONTAP Select bits inside of a single virtual machine

allows NetApp to include all the necessary support libraries and modules while helping reduce the complexity of the interoperability matrix between various versions of ONTAP Deploy and ONTAP Select.

    The ONTAP Deploy application can be accessed two ways:

    Command-line interface (CLI)

    REST API

    The ONTAP Deploy CLI is shell-based and immediately accessible upon connecting to the installation VM

    using SSH. Navigation of the shell is similar to that of the ONTAP shell, with commands bundled into

    groupings that provide related functionality (for example, network create, network show, network

    delete).

    For automated deployments and integration into existing orchestration frameworks, ONTAP Deploy can

    also be invoked programmatically, through a REST API. All functionality available through the shell-based

    CLI is available through the API.
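As a sketch of programmatic access, a REST call against the installation VM might look like the following; the endpoint path and credentials are hypothetical, so refer to the API documentation bundled with the installation VM for the actual resource names:

    # List the clusters known to this ONTAP Deploy instance (illustrative endpoint)
    $ curl -k -u admin:password https://deploy.example.com/api/v1/clusters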

    Further ONTAP Deploy details can be found in the ONTAP Select 9.0 Installation and Setup Guide.

The ONTAP Deploy VM can be placed anywhere in the environment, provided there is network connectivity to the ONTAP Select target physical server. For more information, refer to the ONTAP Deploy VM Placement section in the design considerations portion of this document.


    Server Preparation

    Although ONTAP Deploy provides the user with functionality that allows for configuration of portions of

    the underlying physical server, there are several requirements that must be met before attempting to

    manage the server. This can be thought of as a manual preparation phase, because many of the steps

    are difficult to orchestrate through automation. This preparation phase involves the following:

RAID controller and attached local storage are configured, and RAID groups and LUNs have been provisioned.

Physical network connectivity to the server is verified.

Hypervisor is installed.

Virtual networking constructs are configured (vSwitches/port groups).

Note: After the ONTAP Select cluster has been deployed, the appropriate ONTAP management tooling should be used to configure SVMs, LIFs, volumes, and so on. ONTAP Deploy does not provide this functionality.
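On a vSphere host, several of these preparation steps can be spot-checked from the ESXi shell before the server is handed to ONTAP Deploy (standard esxcli commands; output omitted):

    # Verify physical NICs and their link state
    ~ # esxcli network nic list

    # Verify vSwitch and port group layout
    ~ # esxcli network vswitch standard list

    # Verify that the provisioned RAID LUN(s) are visible to the hypervisor
    ~ # esxcli storage core device list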

    The ONTAP Deploy utility and ONTAP Select software are bundled together into a single virtual machine,

    which is then made available as a .OVA file for vSphere. The bits are available from the NetApp Support

    site, from this link:

    http://mysupport.netapp.com/NOW/cgi-bin/software

This installation VM runs the Debian Linux OS and has the following properties:

    2 vCPUs

    4GB RAM

    40GB virtual disk

    Multiple ONTAP Select Deploy Instances

    Depending on the complexity of the environment, it may be beneficial to have more than one ONTAP

    Deploy instance managing the ONTAP Select environment. When this is desired, make sure that each

ONTAP Select cluster is managed by a dedicated ONTAP Deploy instance. ONTAP Deploy stores cluster metadata within an internal database, so managing an ONTAP Select cluster using multiple ONTAP Deploy instances is not recommended.

When deciding whether to use multiple installation VMs, keep in mind that although ONTAP Deploy attempts to create unique MAC addresses by using a numeric hash based on the IP address of the installation VM, the uniqueness of the MAC addresses can only be guaranteed within that Deploy instance. Because there is no communication across Deploy instances, it is possible for two separate instances to assign the same MAC address to multiple ONTAP Select network adapters.

    Best Practice

To eliminate the possibility of multiple Deploy instances assigning duplicate MAC addresses, use one Deploy instance per layer-2 network to manage existing or deploy new Select clusters and nodes.

Note: Each ONTAP Deploy instance can generate 64,000 unique MAC addresses. Each ONTAP Select node consumes four MAC addresses for its internal communication network schema. Therefore, each ONTAP Deploy instance could deploy a theoretical maximum of 16,000 Select nodes (the equivalent of 4,000 four-node Select clusters).


    3.2 Licensing

    ONTAP Select provides a flexible, consumption-based licensing model, specifically designed to allow

    customers to only pay for the storage that they need. Capacity licenses are sold in 1TB increments and

    must be applied to each node in the ONTAP Select cluster within 30 days of deployment. Failure to apply

a valid capacity license to each cluster node results in the VM being shut down until a valid license is applied.

    3.3 ONTAP Management

    Because ONTAP Select runs Data ONTAP, it supports many common NetApp management tools. As a

    result, after the product is deployed and Data ONTAP is configured, it can be administered using the

    same set of applications that a system administrator would use to manage FAS storage arrays. There is

    no special procedure required to build out an ONTAP configuration, such as creating SVMs, volumes,

    LIFs, and so on.

    4 Storage Design Considerations

This section covers the various storage-related options and best practices that should be taken into consideration when building a single-node or a four-node ONTAP Select cluster. The choices made by the administrator when building the underlying infrastructure can have a significant impact on both the performance and resiliency of the ONTAP Select cluster.

    4.1 Storage Provisioning

    The flexibility provided to the administrator by the ONTAP Select product requires that it support

    variability in the underlying hardware configurations. Server vendors offer the customer numerous

    choices, providing different families of servers designed for different types of application workloads or

    performance requirements. Even within a family, substantial variability may exist. Customers may

    customize virtually every aspect of their configuration, so although two physical servers may come from

    the same vendor and may even have the same model, they are likely composed of completely different

    physical components. This has the potential to impact the ONTAP Select installation workflow, explained

    in further detail later.

    Homogeneous Configurations

    The ONTAP Select product requires that all managed storage sit behind a single RAID controller.

    Managed storage is defined as storage that is consumed by the ONTAP Select VM and may not

    represent all storage attached to the system. A server may have different types of locally attached

    storage available, such as an internal flash drive (possibly used as a boot device) or even an SSD/SAS

    hybrid setup with the SSDs being managed by one controller or HBA and the SAS spindles by another.

Note: All locally attached storage that is managed by ONTAP Select must be of like kind with respect to storage type and speed. Furthermore, spreading the VM's virtual disks across multiple RAID controllers or storage types (including non-RAID-backed storage) is not supported.

From a storage standpoint, physical servers that are candidates for hosting an ONTAP Select cluster are frequently configured in one of two ways, described in the following sections.

    Single RAID Group

    Most RAID controllers support a maximum of 32 drives for a single RAID group. Extensive performance

    testing was done to determine whether there was any benefit in splitting ONTAP Select virtual drives

    across LUNs from multiple RAID groups. None was found.

    In the unlikely event that the target server has more than 32 attached drives, split the disks evenly across

    multiple RAID groups. Provision an equal number of LUNs to the server and subsequently stripe the


    virtualized file system across all LUNs. This creates a single storage pool that ONTAP Select can then

    use to carve into virtual disks.

    Best Practice

All locally attached storage on the server should be configured into a single RAID group. In the multinode configuration, no hot spares should be used, because a mirror copy of the data ensures data availability in the event of multiple drive failures.

    Local Disks Shared Between ONTAP Select and OS

The most common server configuration is one where all locally attached spindles sit behind a single RAID controller. In this style of configuration, a single RAID group using all of the attached storage should be created. From there, two LUNs should be provisioned: one for the hypervisor and another for the ONTAP Select VM.

Note: The one LUN for ONTAP Select statement assumes that the physical storage capacity of the system doesn't surpass the hypervisor-supported file system extent limits. See Multiple LUNs for more information.

For example, let's say a customer purchases an HP DL380 G8 with six internal drives and a single Smart Array P420i RAID controller. All internal drives are managed by this RAID controller, and no other storage is present on the system.

Figure 10 shows this style of configuration. In this example, no other storage is present on the system, so the hypervisor needs to share storage with the ONTAP Select node.

    Figure 10) Server LUN configuration with only RAID-managed spindles.

    Provisioning both LUNs from the same RAID group allows the hypervisor OS (and any client VMs that are

    also provisioned from that storage) to benefit from RAID protection, preventing a single-drive failure from

    bringing down the entire system.

    Best Practice

Separating the OS LUN from the storage managed by ONTAP Select prevents a catastrophic failure that requires a complete OS reinstallation or LUN reprovisioning from affecting the ONTAP Select VM or user data. It's strongly encouraged that two LUNs be used in this style of configuration.

    Local Disks Split Between ONTAP Select and OS

    The other possible configuration provided by server vendors involves configuring the system with multiple

    RAID or disk controllers. In this configuration, a set of disks is managed by one disk controller, which may


    or may not offer RAID services, with a second set of disks being managed by a hardware RAID controller

    that is able to offer RAID 5/6 services.

    With this style of configuration, the set of spindles that sits behind the RAID controller that is able to

    provide RAID 5/6 services should be used exclusively by the ONTAP Select VM. All spindles should be

    configured into a single RAID group, and from there, a single LUN should be provisioned and used by

    ONTAP Select. The second set of disks is reserved for the hypervisor OS (and any client VMs not using

    ONTAP storage).

    This is shown in further detail with Figure 11.

    Figure 11) Server LUN configuration on mixed RAID/non-RAID system.

Multiple LUNs

As servers become equipped with larger drives, the guidance around single-RAID group/single-LUN

    configurations must change. When a single LUN becomes larger than the supported extent limit of the

    underlying hypervisor, storage must be broken up into multiple LUNs to allow for successful file system

    creation. The term extent refers to an area of storage that is used by the file system and, for the purpose

    of this section, to the size of the disk or LUN that the hypervisor can use within a single file system.

    Best Practice

    ONTAP Select receives no performance benefits by increasing the number of LUNs within the RAID

    group. Adding additional LUNs should only be done to bypass hypervisor file system limitations.

    vSphere VMFS Limits

    The maximum extent size on a vSphere 5.5 server is 64TB. A VMFS file system cannot use disks or

    LUNs that are larger than this size. If a server has more than 64TB of storage attached, multiple LUNs

    must be provisioned for the host, each smaller than 64TB. A single vSphere datastore can contain

    multiple extents (multiple disks/LUNs), and the underlying file system VMFS can stripe across multiple

    storage devices.

    When multiple LUNs are required, use the following guidance:


Continue to group all locally attached spindles that are managed by ONTAP Select into a single RAID group.

Create multiple equal-sized LUNs (two should be sufficient).

    Provision the vSphere datastore using all attached LUNs.

The virtualized file system is striped across all available LUNs.
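After provisioning, the extent layout of the resulting datastore can be confirmed from the ESXi shell; the output below is abbreviated and purely illustrative:

    ~ # esxcli storage vmfs extent list
    Volume Name  VMFS UUID      Extent Number  Device Name           Partition
    -----------  -------------  -------------  --------------------  ---------
    datastore1   57ab3f1c-...   0              naa.600508b1xxxxxxxx          3
    datastore1   57ab3f1c-...   1              naa.600508b1yyyyyyyy          1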

    4.2 ONTAP Select Virtual Disks

    The ONTAP Select cluster consumes the underlying storage provided by the locally attached spindles

    through the abstraction layers provided by both the RAID controller and virtualized file system. ONTAP

    Select is completely unaware of the underlying spindle type and does not attempt to manage the disk

    directly. Storage presented to the ONTAP Select node is done through the window of a virtual disk, a

    mechanism provided by the hypervisor that allows a virtualized file system to be broken up into pieces

    that can be managed by an individual virtual machine and treated as if they were locally attached disks.

    For example, on vSphere, an ONTAP Select cluster node is presented with a datastore that is nothing

    more than a single LUN on which the VMFS file system has been configured. ONTAP Select provisions a

    set of virtual disks, or VMDKs, and treats these disks as if they were physical, locally attached spindles.

    ONTAP then assembles these disks into aggregates, from which volumes are provisioned and exported

    to clients through the appropriate access protocol.
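From inside Data ONTAP, these virtual disks appear and are managed exactly like physical drives, so the familiar disk and aggregate commands apply unchanged (a sketch; output omitted):

    # VMDKs surface as ordinary ONTAP disks...
    select::> storage disk show

    # ...and are assembled into aggregates
    select::> storage aggregate show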

    Best Practice

    Similar to creating multiple LUNs, ONTAP Select receives no performance benefits by increasing the

    number of virtual disks used by the system.

    Virtual Disk Provisioning

    To provide for a more streamlined user experience, the ONTAP Select management tool, ONTAP

    Deploy, automatically provisions virtual disks from the associated storage pool and attaches them to the

    ONTAP Select virtual machine. Virtual disks are then automatically assigned to a local and mirror storage

    pool.

    Because all virtual disks on the ONTAP Select VM are striped across the underlying physical disks, there

    is no performance gain in building configurations with a higher number of virtual disks and structuring

    application workloads across different aggregates. Additionally, shifting the responsibility of virtual disk

    creation and assignment from the administrator to the management tool prevents the user from

    inadvertently assigning a virtual disk to an incorrect storage pool.

ONTAP Select breaks up the underlying attached storage into equal-sized virtual disks, each not

    exceeding 8TB. A minimum of two virtual disks is created on each cluster node and assigned to the local

    and mirror plex to be used within a mirrored aggregate.

For example, if ONTAP Select is assigned a datastore or LUN that is 31TB (the space remaining after the VM is deployed and the system and root disks are provisioned), four 7.75TB virtual disks are created and assigned to the appropriate ONTAP local and mirror plex.

    Figure 12 shows this provisioning further. In this example, a single server has 16 locally attached 2TB

    disks. Note that:

    All disks are assigned to a single 32TB RAID group, with one disk acting as a hot spare.

    From the RAID group, a single 30TB LUN is provided.

The ONTAP Select VM has ~250GB worth of system and root disks, leaving 29.75TB of storage to be divided into virtual data disks.


Four 7.4TB data disks are created and placed into the appropriate ONTAP storage pools (two disks into the local pool (plex 0) and two into the mirror pool (plex 1)).

    Figure 12) Virtual disk provisioning.
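The sizing rule described above can be expressed compactly. The following shell sketch reproduces the arithmetic under the stated assumptions (an even number of equal-sized disks, each no larger than 8TB, minimum of two):

    usable_tb=29.75
    awk -v t="$usable_tb" 'BEGIN {
        c = int(t / 8); if (t / 8 > c) c++;   # ceil(t/8): disks of at most 8TB
        if (c % 2) c++;                       # even count: split across two plexes
        if (c < 2) c = 2;                     # minimum of two virtual disks
        printf "%d virtual disks of %.2fTB each\n", c, t / c
    }'
    # -> 4 virtual disks of 7.44TB each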

    4.3 ONTAP Select Deploy

    Careful consideration should be given to the placement of the ONTAP Deploy installation VM, because

    there is flexibility provided to the administrator with respect to the physical server that hosts the virtual

    machine.

    VM Placement

    The ONTAP Select installation VM can be placed on any virtualized server in the customer environment;

    it can be collocated on the same host as an ONTAP Select instance or on a separate virtualized server.

    The only requirement is that there exists network connectivity between the ONTAP Select installation VM

    and the ONTAP Select virtual servers.

    Figure 13 shows both of these deployment options.

    Figure 13) ONTAP Select installation VM placement.

Note: To reduce overall resource consumption, the installation VM can be powered down when not actively managing ONTAP Select VMs or virtualized servers through the ONTAP Deploy utility.


    5 Network Design Considerations

    This section covers the various network configurations and best practices that should be taken into

    consideration when building an ONTAP Select cluster. Like the design and implementation of the

    underlying storage, care should be taken when making network design decisions because these choices

    have a significant impact on both the performance and resiliency of the ONTAP Select cluster.

    5.1 Supported Network Configurations

    Server vendors understand that customers have different needs, and choice is critical. As a result, when

    purchasing a physical server, there are numerous options available when making network connectivity

    decisions. Most commodity systems ship with a variety of NIC choices, offering single-port and multiport

    options with varying permutations of 1Gb and 10Gb ports. Care should be taken when selecting server

    NICs, because the choices provided by server vendors can have a significant impact on the overall

    performance of the ONTAP Select cluster.

As mentioned in the Network Configurations section of this document, link aggregation is a core construct used to provide sufficient bandwidth to the external ONTAP Select network. Link Aggregation Control Protocol (LACP) is a vendor-neutral standard that provides an open protocol for network endpoints to bundle groupings of physical network ports into a single logical channel.

    When choosing an ONTAP Select network configuration, use of LACP, which requires specialized

    hardware support, may be a primary consideration. Although LACP requires support from both the

    software virtual switch and the upstream physical switch, it can provide a significant throughput benefit to

    incoming client protocol traffic.

Table 4 shows the supported NIC permutations and the underlying hypervisor support. Additionally, use of LACP is also called out, because hypervisor-specific dependencies prevent all combinations from being supported.

Table 4) Network configuration support matrix.

Available Network Interfaces   Internal Network (LACP not supported)   External Network
2 x 1Gb + 2 x 10Gb             2 x 10Gb                                2 x 1Gb (LACP supported)
4 x 10Gb                       2 x 10Gb                                2 x 10Gb (LACP supported)
2 x 10Gb                       2 x 10Gb                                2 x 10Gb (same physical ports / no LACP support)

Because the performance of the ONTAP Select VM is tied directly to the characteristics of the underlying hardware, increasing the throughput to the VM by selecting 10Gb-capable NICs results in a higher-performing cluster and a better overall user experience. When cost or form factor prevents the user from designing a system with four 10Gb NICs, two 10Gb NICs can be used.

    These choices are explained in further detail later.


    5.2 vSphere: vSwitch Configuration

    ONTAP Select supports the use of both standard and distributed vSwitch configurations. This section

    describes the vSwitch configuration and load-balancing policies that should be used in both two-NIC and

    four-NIC configurations.

vSphere: Standard vSwitch

All vSwitch configurations require a minimum of two physical network adapters bundled into a single link

    aggregation group (referred to as NIC teaming). On a vSphere server, NIC teams are the aggregation

    construct used to bundle multiple physical network adapters into a single logical channel, allowing the

network load to be shared across all member ports. It's important to remember that NIC teams can be

    created without support from the physical switch. Load-balancing and failover policies can be applied

    directly to a NIC team, which is unaware of the upstream switch configuration. In this case, policies are

    only applied to outbound traffic. In order to balance inbound traffic, the physical switch must be properly

    configured. Port channels are the primary way this is accomplished.

Note: LACP-enabled port channels are not supported with standard vSwitches, due to the lack of vSphere switch support; for this functionality, distributed vSwitches (vDSs) are required. Static port channels are not supported with ONTAP Select. Therefore, we recommend using distributed vSwitches for the external network.

    Best Practice

To ensure optimal load balancing across both the internal and the external ONTAP Select networks, use the load-balancing policy Route based on originating virtual port ID.
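On a standard vSwitch, this policy can be applied from the ESXi shell; the vSwitch and port group names below are examples:

    # Set "Route based on originating virtual port" at the vSwitch level
    ~ # esxcli network vswitch standard policy failover set -v vSwitch0 -l portid

    # Or override it on an individual port group
    ~ # esxcli network vswitch standard portgroup policy failover set -p "ONTAP-Internal" -l portid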

    Figure 14 shows the configuration of a standard vSwitch and port group, responsible for handling internal

    communication services for the ONTAP Select cluster.

    Figure 14) Standard vSwitch configuration.

    vSphere: Distributed vSwitch

    When using distributed vSwitches in your configuration, LACP can be used on the external network only,

    in order to increase the throughput available for data, management, and intercluster replication traffic.


    Best Practice

When using a distributed vSwitch with LACP for the external ONTAP Select network, we recommend configuring the load-balancing policy to Route based on IP hash on the port group and Source and destination IP address and TCP/UDP port on the link aggregation group (LAG).

Regardless of the type of vSwitch, the internal ONTAP Select network does not support LACP. The recommended load-balancing policy for the internal network remains Route based on originating virtual port ID.

    Figure 15 shows LACP configured on the external distributed port group responsible for handling

    outbound services for the ONTAP Select cluster. The unique number of network endpoints connecting to

    the ONTAP Select instance should be taken into consideration when determining the load-balancing

    policy for the link aggregation group (LAG). Although all load-balancing policies are technically supported,

    the algorithms available are tailored to specific network configurations and topologies and more efficiently

    distribute network traffic. In the event that the upstream physical switch has already been configured to

    use LACP, use the existing settings within the ESX vDS, because mixing LACP load-balancing algorithms

    can have unintended consequences.

Note that in Figure 15, the load-balancing mode for the external LACP-enabled aggregation group should be set to Source and destination IP address and TCP/UDP port to ensure optimal balancing across all adapters.

    Figure 15) LACP distributed vSwitch configuration.

Note: LACP requires the upstream switch ports to be configured as a port channel. Prior to enabling this on the distributed vSwitch, make sure that an LACP-enabled port channel is properly configured.

    Best Practice

When using LACP, NetApp recommends that the LACP mode be set to ACTIVE on both the ESX and the physical switches. Furthermore, the LACP timer should be set to FAST (1 second) on the port-channel interfaces and on the VMNICs.
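On the physical side, an equivalent port-channel configuration on a Cisco IOS-style switch might look like the following. This is purely an illustration of the active-mode and fast-timer recommendations; consult your switch vendor's documentation for the authoritative syntax:

    ! Member interfaces: active LACP with the fast (1-second) timer
    interface TenGigabitEthernet1/0/1
     channel-group 10 mode active
     lacp rate fast
    !
    interface TenGigabitEthernet1/0/2
     channel-group 10 mode active
     lacp rate fast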


    5.3 Physical Switch Configuration

Careful consideration should be given when making connectivity decisions from the virtual switch layer to

    physical switches. Separation of internal cluster traffic from external data services should extend to the

    upstream physical networking layer through isolation provi