

White Paper

Abstract

This white paper outlines the configuration, sizing, growth management, and reporting best practices of the EMC® Avamar® NDMP Accelerator using performance test results, customer scenarios, and feedback as the basis of this knowledge transfer. The uniquely efficient approach to backup and restore makes EMC Avamar one of the best solutions for protecting highly scaled NAS environments in the industry.

April 2011

EMC AVAMAR FOR NETWORK-ATTACHED STORAGE (NAS) BACKUPS USING NDMP

A best practices and performance overview of the Avamar NDMP Accelerator


Copyright © 2011 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners. Part Number h8235


Table of Contents

Executive summary
Intended audience
The NAS system backup solution
Suitability of the EMC Avamar NDMP Accelerator for NAS systems
NAS support
The NDMP Accelerator
Configuration overview and best practices
Managing multiple streams
Performance expectations
Level 0 full backup test results
Level 1 daily incremental backup test results
Extrapolation of results
Performance and key ideas summary
Proper sizing guidelines
A sizing example
Solution for the example
Additional considerations for sizing
Cause of the problem
Possible suggestions
Capacity planning and growth management
Extrapolating previous results to eight streams
Expanding the configuration
Conclusion
References
Appendix


Executive summary

The EMC® Avamar® solution for NAS backup and recovery utilizes the innovative Avamar NDMP Accelerator to deliver fast, daily full backups and one-step recovery. And unlike other solutions in the industry, Avamar stores network-attached storage (NAS) backup data on resilient, enterprise-class systems for extended retention and simple, granular-level restore.

This paper outlines the architecture, processes, performance expectations, anticipated growth, best practice reporting, and degree of fine tuning for the NDMP Accelerator. This white paper is a supplement to the Avamar NDMP Accelerator User Guide located on Powerlink® (access required), and is not intended for use as a replacement of that documentation.

Intended audience

This white paper is intended for experienced Avamar administrators who either already have NDMP Accelerators deployed or plan to deploy one or many in the near future. NDMP backups and the challenges associated with legacy, scaled-out NAS backups are not discussed here; a thorough understanding of both is assumed as a foundation before reading this paper. EMC Avamar represents an innovative, contemporary approach to NAS system backups supporting the explosive growth of enterprise NAS systems. This paper provides newly acquired best practices as gathered by EMC Backup and Recovery Systems (BRS) Integration Labs as well as empirical data and feedback provided by experienced EMC customers already leveraging the NDMP Accelerator to qualify, size, configure, tune, manage, and grow this enterprise NAS system backup solution to their needs.


The NAS system backup solution

This white paper first considers whether the EMC Avamar NDMP Accelerator is suited to backing up a given NAS environment. It will demonstrate that the Avamar NDMP Accelerator is a viable technical solution for NAS backup and recovery, with built-in scalability to handle the growth of one’s NAS environment. This paper will also cover the results of specific lab testing completed to validate sizing assumptions for throughput and backup window duration. Using these performance results, configuration best practices are described beyond those documented in the relevant EMC Avamar System Administration Guide to provide the best ongoing performance coupled with ease of operations and ongoing management. Lastly, this paper will discuss performance capacity planning and growth management.

Figure 1 depicts the typical configuration of a set of EMC Avamar NDMP Accelerators servicing backups from a NAS system.

Figure 1. Configuration of Avamar NDMP Accelerators backing up a NAS system

Suitability of the EMC Avamar NDMP Accelerator for NAS systems

Many of the general reasons to deploy Avamar for NAS backups have been covered in the higher-level white paper EMC Avamar for NAS Backups: An Overview and Business Case. Those items will not be reiterated here, and the reader should already understand that the NDMP Accelerator with Avamar solution is ideal for very fast, scalable backups and restores of NAS systems. While the business reasons associated with deployment are critical, this paper will extend the technical reasons showing why this solution provides a much more reliable and scalable approach to backups of NAS environments than traditional backup-to-tape, snapshot retention, replication, and other methods. These technical objectives are central to this paper’s focus.

NAS support

EMC Avamar supports both NetApp Data ONTAP and EMC Celerra®/VNX™ DART operating systems. Please refer to the latest interoperability matrix on Powerlink for the most current information.

NAS systems vary in CPU, memory, I/O connectivity, and cluster configurations. There are many different configurations supporting a number of different protocols including iSCSI, Fibre Channel SAN, FCoE, NFS, CIFS, FTP, and HTTP. Both file- and block-based storage can be serviced from these various protocols. For NetApp, the Avamar NDMP Accelerator can be used for any storage type (block or file) serviced by any of the protocols. However, file-level granularity of the backups and restores is only possible with NAS or file-based data. When LUNs are backed up and restored, the entire LUN is captured; files within the LUN cannot be identified using NDMP. For LUN backups, it is best to install the Avamar client software onto the respective client where the LUN is provisioned. With EMC Celerra and VNX, only file-based data serviced via NFS and CIFS is supported with NDMP backups. File-level granularity for restores is fully supported.

Since NAS systems can be so diverse in configuration and size, there is no clear answer or definitive method for sizing backups given a specific model of NAS system and related capacity. Information gathering from a number of different avenues and possibly exploratory baseline testing is the best way to ensure proper sizing with the lowest risk of mistakes.

The NDMP Accelerator

The size of an Avamar NDMP Accelerator is currently fixed. CPU and memory cannot be added to the appliance units. However, one can choose to purchase a smaller utility/accelerator node with less memory (4 GB with Gen3 or 12 GB with Gen4). For the purposes of this paper, we will only reference the larger NDMP Accelerator unit functionally designed for accelerating deduplication and backup of the largest NAS systems. These units contain more CPU cores and 36 GB of onboard memory, capable of supporting multiple accounts/clients, and more than four concurrent NDMP streams. Since the onboard resources are fixed, we can only vary the quantity of accelerators for a given solution. In the future, EMC will support the deployment of NDMP Accelerators in a virtual environment deployed within virtual machines. Additional sizing recommendations will be provided at that time.

An EMC Avamar NDMP backup solution can be applied to almost any EMC Celerra/VNX or NetApp FAS system supporting NDMP, regardless of the capacity. There are already many very large companies with high amounts of capacity and high file count serviced by NAS systems that are being backed up by EMC Avamar today. However, the variables identified in the following section will allow you to identify viability and backup window attainment given overhead of the NDMP backup process to the customer NAS system, scan rates of the accelerator, and connectivity of the appliances provided.


Configuration overview and best practices

Configuration of NDMP backups through the Avamar NDMP Accelerator is described in the EMC Avamar NDMP Accelerator User Guide. That guide provides a how-to process for creating the account or “client” on the NDMP Accelerator, identifying it in the Administrative interface, creating an NDMP dataset for your filesystems (EMC Celerra) or volumes (NetApp FAS), and associating the dataset with the client, schedule, and retention period into a suitable group within the Policy interface of Avamar.

Beyond the basic configurations, one must consider other factors such as performance balancing over multiple available streams or “accounts”; considerations for large file-count filesystems/volumes; considerations for unique data types such as PST files, VMDK files, and databases; growth management over time; and high availability.

Managing multiple streams

For balancing filesystems over multiple streams, the configuration in Avamar version 5 had a fixed, stream-to-filesystem relationship, and therefore needed to be balanced with other filesystems in order to use the finite number of available streams. The relationship was manually configured.

The legacy of this operation is that the term “account,” as configured on the NDMP Accelerator, was often mistaken for available “streams” since the relationship was fixed at a 1:1 ratio. In Avamar version 6.0, however, the NDMP streams are dynamically allocated to the next available filesystem defined in the Avamar dataset, up to the maximum number of streams defined. This maximum is defined under the Options tab within Avamar’s dataset definition. Once the single “account” is configured and registered on the NDMP Accelerator, the streams are dynamically allocated to each filesystem. Depending on the Celerra X-Blade (Data Mover) or NetApp FAS filer, this could be up to eight possible streams.

In Avamar 6.0, special attention needs to be paid to the difference between “account” and “stream” for just this reason.

The new, dynamic behavior of stream allocation in Avamar version 6 eliminates the need for manual stream-to-filesystem configuration, which also eliminates the need for a 1:1 relationship between the filesystem, dataset, and group policy definitions. Mass selection and inclusion of filesystems into a single dataset allow for simple configuration with no need for balancing across available streams. Streams are allocated to filesystems in alphabetical order up to the maximum number of streams defined in the dataset. One will notice avtar processes running on the NDMP Accelerator corresponding to the number of active streams one has configured, assuming that all are being leveraged on filesystems that are available to be backed up. In addition, one will see one extra avtar, the "progress avtar," that provides summary information regarding the progress of all jobs associated with a single instance of avndmp.


It is important to note that multistreaming does not function within or on a single filesystem. The relationship of stream-to-filesystem remains at a 1:1 ratio such that one avtar will work on a single filesystem; the main benefit is the dynamic allocation of available streams to the configured dataset/filesystems, as determined by the NDMP Accelerator. As before, pcache.dat files will be created under the /usr/local/avamar/var/<account name>/ directory, one per filesystem.
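The allocation behavior described above can be illustrated with a small simulation. This is a minimal sketch and not the accelerator's actual scheduler: it assumes filesystems are picked up in alphabetical order, that each filesystem occupies exactly one stream, and that a freed stream moves to the next waiting filesystem. The function name and the per-filesystem durations are hypothetical.

```python
import heapq

def simulate_streams(durations, max_streams):
    """Simulate dynamic allocation of streams to filesystems.

    durations: dict of filesystem name -> backup duration in hours
               (hypothetical values, for illustration only).
    Returns (schedule, elapsed), where schedule maps each filesystem
    to (stream id, start hour, end hour).
    """
    # Each heap entry is (hour at which the stream becomes free, stream id).
    free_at = [(0.0, stream) for stream in range(1, max_streams + 1)]
    heapq.heapify(free_at)
    schedule = {}
    # Filesystems are taken in alphabetical order, one stream per filesystem.
    for name in sorted(durations):
        start, stream = heapq.heappop(free_at)
        end = start + durations[name]
        schedule[name] = (stream, start, end)
        heapq.heappush(free_at, (end, stream))
    elapsed = max((end for _, _, end in schedule.values()), default=0.0)
    return schedule, elapsed

# Five hypothetical filesystems of uneven size.
fs = {"fs1": 3.0, "fs2": 1.0, "fs3": 1.0, "fs4": 2.0, "fs5": 2.0}
for n in (1, 2, 3, 4):
    _, total = simulate_streams(fs, n)
    print(n, "stream(s):", total, "hours")
```

With these assumed durations, adding a fourth stream does not shorten the run, because the longest single filesystem sets the floor on elapsed time once every other filesystem has a stream available.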

Additional information about unique data types is addressed in the Proper sizing guidelines section.

For environments with NAS systems deployed in remote offices, one can deploy the Avamar NDMP Accelerator into those offices to facilitate backup to a central office/data center. This configuration allows deduplication of the data within the remote office so that backups can be sent to an Avamar storage node or grid in another location. This works in exactly the same way as a client loaded on a server in a remote office.

To consider such a solution, one needs to perform the same sizing efforts as would be done for a server environment, except as applied to a NAS system. This includes gathering information such as:

Data type

Capacity

File count

Unique data types

Database capacity

Available bandwidth to the remote site

With these data points in hand, an NDMP Accelerator can be deployed close to (on the same subnet preferably) the NAS system, and send resultant deduplicated, encrypted data over the WAN to the Avamar server.

Highly available NDMP Accelerators can be configured in much the same way as an Avamar storage node/grid. Although network connections rarely fail, the connections to local switches can be configured in a redundant fashion. Linux bonding mode on each node in the ADS server can be set to mode 1 for "active-backup." With this setting, the NIC port enumerated as eth0 (Gb 1 port) acts as the primary connection and eth2 (Gb 3 port) acts as the backup connection. See the “Changes to Port Bonding Mode” section of the Configuration of High-Availability ADS Network technical note available internally to EMC employees.

Additionally, NDMP Accelerators are configured with dual disk drives mirrored (RAID 1) for extreme redundancy. As a further means of redundancy, multiple NDMP Accelerators can be configured with multiple “accounts” that correspond to the same initiating NAS system. An example follows:


NDMP Accelerator   Account 1                         Account 2
A                  celerra213-to-avamar227-primary   celerra210-to-avamar227-secondary
B                  celerra210-to-avamar227-primary   celerra213-to-avamar227-secondary

All of these accounts will register with Avamar server “227”; however, when selecting available accounts in the group policy, only the “*-primary” account should be selected for use. Otherwise, if multiple accounts are selected, multiple, redundant jobs of the same filesystems will be executed, which unnecessarily taxes the accelerator, NAS system, and Avamar grid.

Additionally, backup datasets are associated with the account that was used during the backup. For example, if the “celerra213-to-avamar227-primary” account was used for the Feb. 1 through Feb. 28 backups, then the data for those filesystems would be seen under this account name in the Restore interface of the Avamar console. Likewise, pcache (*.dat) files would be created under the /usr/local/avamar/var/celerra213-to-avamar227-primary/ directory with the prefixes of the filesystems backed up. If after Feb. 28 the other account named “celerra213-to-avamar227-secondary” was used, regardless of whether it was configured on the same accelerator, new pcache files (*.dat) would be created under the corresponding /usr/local/avamar/var/celerra213-to-avamar227-secondary/ subdirectory. Backup data dated after Feb. 28 would be located in the Restore interface of the Avamar console under the “celerra213-to-avamar227-secondary” account name. Changing accounts will also initiate another level 0 backup, since there is no history of backups for that filesystem using that account. Level 0 backups can be sped up by copying the *.dat files from the previous account subdirectory to the new one.
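The copy step mentioned above could be scripted along the following lines. This is a minimal sketch under stated assumptions: both account subdirectories are on the same accelerator, no backup is running against either account while the copy takes place, and existing files in the new subdirectory should be left alone. The function name is hypothetical; the directory layout is the one described in this paper.

```python
import shutil
from pathlib import Path

def copy_pcache_files(var_dir, old_account, new_account):
    """Copy *.dat cache files from one account subdirectory to another.

    Returns the names of the files that were copied.
    """
    src = Path(var_dir) / old_account
    dst = Path(var_dir) / new_account
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for dat in sorted(src.glob("*.dat")):
        target = dst / dat.name
        if target.exists():
            # Assumption: never overwrite a cache file the new account already has.
            continue
        shutil.copy2(dat, target)  # copy2 also preserves timestamps
        copied.append(dat.name)
    return copied
```

For the example accounts above, the call would be `copy_pcache_files("/usr/local/avamar/var", "celerra213-to-avamar227-primary", "celerra213-to-avamar227-secondary")`. If the secondary account lives on a different accelerator, the files would first need to be transferred between hosts, which this sketch does not cover.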

Changing accounts used for filesystem backups has some side effects, but nonetheless can be used to restart NAS system backups in the event that an NDMP Accelerator is lost and cannot be replaced in a timely manner.

When scheduling backups, please review the guide EMC Avamar Operational Best Practices on Powerlink. However, specific to NAS system backups, one must be conscious of and attempt to minimize interference with other snapshot, native deduplication, and off-hour replication activity. Each of these processes incurs overhead to the NAS system that will affect backups. Check the schedules (NetApp) or policies (EMC Celerra) for these activities prior to scheduling backups through Avamar.

Lastly, snapshots of filesystems or volumes are taken when an NDMP request for backup is received by the NAS system. If multiple snapshots for a given filesystem exist, then they will not be captured in the snapshot executed for that backup. If backup data needs to be captured in a given snapshot, then one must explicitly define the snapshot filesystem path in the dataset configuration of Avamar.


Please note that this will require a level 0 backup initially and ongoing. Level 1 (incremental) backups from explicitly defined snapshot paths are not executed. One may consider other tools for backing up snapshots within NetApp, such as SnapMirror-to-Tape (SM2T).

The Capacity planning and growth management section outlines considerations to take in the event that one needs to design an Avamar solution to back up very large or numerous NAS systems. Baseline test results from empirical data collected from BRS lab testing (outlined in the following sections) and customer feedback are used as a foundation for configuring such an environment.

Summary:

Leverage multiple streams for best utilization of the NDMP Accelerator and reduction of backup duration

Use NDMP Accelerators local to the NAS systems, if located in a remote or branch office (ROBO)

Connection of NDMP Accelerators to multiple network ports is generally not necessary but can assist in the event of a switch or NIC port outage

Creating redundant accounts on multiple NDMP Accelerators can provide decreased time-to-resolution should an individual accelerator have an issue

Be mindful of changes in configuration between Avamar versions 5 and 6, as the streams-to-filesystems relationship changes

Schedule backup activity to occur at off-hours when other storage array snapshot, deduplication, and replication activity does not interfere with NDMP/dump-requested snapshots

Performance expectations

Measuring performance for NAS system backups can be accomplished in multiple ways. Since the NDMP Accelerator performs deduplication within the appliance prior to sending the data to the Avamar system, we can measure throughput with a “logical” or “effective” approach as well as a physical one. The effective throughput demonstrates how much data is being effectively backed up from the NAS system on a routine basis, as though the NAS system was being fully backed up every time. Avamar backs up all clients in this same manner; every backup is a full backup.

However, because Avamar deduplicates data on the client before sending it across the network to the Avamar server, very little data is actually transmitted. And because of the way those unique, sub-file data segments are stored in the Avamar server (using hash IDs and multiple layers of indices), no further complete full backups need to be taken after the initial level 0 baseline of data is sent, nor is any post-processing performed in the Avamar server to re-create a “synthetic” full backup from multiple incremental backups. This is what makes Avamar such an impressive solution for backing up NAS systems: fast, daily full backups that only require daily level 1 dumps, which generate very little network traffic and minimal resource overhead on the NAS system.


To assess performance, the time to backup completion is the critical factor. To measure this, a number of tests were created in the EMC lab environment to determine backup speed and duration given a set of measured filesystems on a typical Celerra system. These results will be used as the basis of sizing examples later in this paper. The test environment details include:

One Celerra NS-120 NAS system with two Data Movers

Only one Data Mover was examined during this test cycle

Five filesystems on the NS-120 Celerra system with 2.5 TB of data and 3.2 million files to start. Filesystems were grown to a total sample size of 3.4 TB (for capacity tests), and over 15 million files (for high file count tests)

One NDMP Accelerator with an Avamar 6.0 client and NDMP software configured

One Avamar target system with Avamar 6.0 installed

Baseline level 0 backups (seeding) were completed first. The target Avamar system did not have any substantial amount of existing data residing on it prior to this test; therefore, the data deduplication effect was not as significant as might normally be observed in a production deployment. However, the amount of data sent to the Avamar system during this test did not exceed any thresholds of the infrastructure, so the global deduplication effect would not have impacted backup duration to much of a degree.

Additional monitoring and metrics include:

During the job runs, using the Activity Monitor interface within the Avamar console

Running Linux scripts to capture memory and CPU statistics (vmstat and iostat) and log outputs from the avtar processes on the NDMP Accelerator with associated date/time stamps

Capturing job activity details from Activity Monitor or Data Protection Network (DPN) Summary reports following the completion of the tests
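The scripts used in the lab are not reproduced in this paper. As an illustration only, time-stamped capture of command output could look like the sketch below; the function names, the log path, and the command arguments are assumptions rather than the lab's actual tooling.

```python
import subprocess
from datetime import datetime

def stamp(lines, now=None):
    """Prefix each output line with a date/time stamp."""
    ts = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    return [f"{ts} {line}" for line in lines]

def sample(command):
    """Run one command and return its output lines, time-stamped."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return stamp(result.stdout.splitlines())

def capture(commands, log_path):
    """Append one time-stamped sample of each command to a log file."""
    with open(log_path, "a") as log:
        for command in commands:
            for line in sample(command):
                log.write(line + "\n")
```

A call such as `capture([["vmstat", "1", "2"], ["iostat", "-x"]], "/tmp/accelerator_stats.log")`, repeated at a fixed interval during a job run, would produce a log that can be lined up against the job start and end times shown in Activity Monitor.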

For brevity, all of the outputs of these measurements will be summarized in the following tables. It should be noted that these tests were conducted in controlled laboratory conditions with little load on the NAS and Avamar systems. Given these light existing loads, some inherent deviation is included in our results as stated.

Level 0 full backup test results

During the tests, level 0 (initial baseline) backups ran consistently at ~190 GB/hour, or at 3.5 million small files per hour when the number of files per GB of storage was high.

For example, a customer NAS system has 20 million files, each 20 KB in size. Although the total space consumed is low (~400 GB), it will take roughly 5.7 hours to execute the initial backup of this filesystem. So as a rule of thumb when estimating level 0 duration, use the baseline denominator (190 GB/hr or 3.5 million files/hr) that results in the longest amount of time as the expected full duration of the level 0 backup.
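The rule of thumb above can be written as a short calculation. This is a sketch using the two lab baseline rates quoted in this section; the function and constant names are illustrative, and production rates may differ from the lab figures.

```python
GB_PER_HOUR = 190.0      # level 0 capacity rate observed in the lab tests
FILES_PER_HOUR = 3.5e6   # level 0 small-file rate observed in the lab tests

def level0_hours(capacity_gb, file_count):
    """Estimate level 0 duration as the slower of the two baseline rates."""
    by_capacity = capacity_gb / GB_PER_HOUR
    by_files = file_count / FILES_PER_HOUR
    return max(by_capacity, by_files)

# The example above: 20 million files of 20 KB each, about 400 GB in total.
hours = level0_hours(capacity_gb=400, file_count=20_000_000)
print(f"{hours:.1f} hours")  # prints "5.7 hours"
```

Here the capacity-based estimate is only about 2.1 hours, so the file count is the limiting factor.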

Level 0 backups tested with five filesystems (2.37 TB total and 3.2 million files)

Baseline backup time (regardless of the number of streams) equaled ~12.5 hours or a rate of 240,000 files/hr and ~190 GB/hour.

Accelerator CPU averaged 40 percent to 70 percent. Average memory utilization was ~ 3 GB only, with peaks of just over 7 GB.

Each test was run multiple times to ensure reliability.

The target system was a relatively low-capacity Avamar version 6 grid with one Gen4 node of 3.4 TB.

Level 1 daily incremental backup test results

Level 1 (differential incremental) backups can be measured in two ways, as described earlier: logical (or “effective”) and physical.

Tests included using one stream, two streams, and four streams. Daily backups with 0 percent change were routinely only 6 minutes for multiple runs of the two- and four-stream configurations.

Accelerator CPU averaged 40 percent to 70 percent. Average memory utilization was ~ 3 GB only, with peaks of just over 7 GB.

Adding capacity to each of the test filesystems, with varying streams configured, included the following:

Baseline test dataset capacity consisted of 2.371 TB. The same five filesystems used for the level 0 testing were also used. Fewer than 50,000 files were ever added to the complete test dataset for capacity testing.

Additional capacity to each filesystem contained mixed file types and depth of directory structures. Data types added consisted of VMDK images, operating system binaries, compressed software files, PDF files, log files, and some amount of fsutil-created large, flat files. This represented a true mix of data types, as would be expected in a customer environment.


# Filesystems   Added Capacity (%)   Added Capacity (GB)   Total Capacity (TB)   Streams   Backup Time (hh:mm:ss)   GB/Hour
1               10%                  133                   1.495                 1         0:58:00                  150
1               17%                  233                   1.595                 1         1:25:00                  164
5               5%                   159                   3.427                 2         0:48:00                  199
5               5%                   159                   3.427                 3         0:45:00                  212
5               5%                   159                   3.427                 4         0:41:30                  230

Adding 10 percent more small files (50 KB size files within five layers of directory depth) equally to each of the five filesystems meant the following:

Baseline capacity equaled 2.371 TB and 3.2 million files.

Added Files (%)   Additional Files (thousands)   Total Files (millions)   Streams   Backup Time (hh:mm:ss)   Millions of Files/Hour
10%               310                            3.5                      1         0:15:00                  1.24
10%               310                            3.5                      2         0:08:00                  2.27
10%               310                            3.5                      3         0:07:30                  2.48
10%               310                            3.5                      4         0:07:30                  2.48
20%               620                            3.9                      4         0:09:04                  ~4.10
30%               931                            4.2                      4         0:11:16                  ~4.96
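The rate columns in these tables appear to be the amount of added data (or added files) divided by the elapsed backup time. The sketch below shows that calculation; the helper names are illustrative, and the interpretation of the columns is an assumption inferred from the figures rather than a formula stated by the test team.

```python
def to_hours(hhmmss):
    """Convert an hh:mm:ss string to hours."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return (h * 3600 + m * 60 + s) / 3600

def rate_per_hour(amount, hhmmss):
    """Amount processed per hour, given the elapsed time as hh:mm:ss."""
    return amount / to_hours(hhmmss)

# 159 GB added, four streams, 0:41:30 elapsed -> GB/hour
gb_rate = rate_per_hour(159, "0:41:30")
# 310 thousand files added, 0:07:30 elapsed -> millions of files/hour
file_rate = rate_per_hour(0.310, "0:07:30")
print(round(gb_rate), round(file_rate, 2))
```

Under that assumption, these two inputs reproduce the 230 GB/hour and 2.48 million files/hour entries.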

In the final test, the objective was to determine a maximum throughput; roughly 11 million more files were added, representing an additional 1/2 TB of capacity.

Total space now increased to 2.97 TB and 14.4 million files.

Backup time equaled 3 hours, 29 minutes.

Throughput rate was ~3.1 million files/hr.

A “point of diminishing returns” was observed with respect to the addition of streams, which most likely is due to the limited number of filesystems used for this test, and/or the number of NDMP streams supported by this particular Celerra Data Mover (four maximum). It is likely that the same results would have been achieved had three streams been used for this test instead of the maximum configurable of four. Streams tie themselves to available filesystems in a round-robin fashion, which explains why three streams could have worked the same as four since two of the filesystems finished much quicker than others.


It was very important to find the limitations of the maximum number of files per backup that can be achieved since many customers are using NAS systems for storing millions of files from various applications. NAS systems are ideally suited for storing this type of data given the ease of backups and storage allocation/provisioning.

Extrapolation of results

Using a conservative 2.48 million files/hr and assuming a five-hour backup window yields 12.4 million files per backup window. Using the maximum observed rate of 3.1 million files/hr over the same five hours yields 15.5 million files per backup window.

Using the more conservative rate of 2.48 million files/hr (most production systems carry other load on the Avamar server and NAS system that these lab systems did not), a 10 percent change rate environment equates to 124 million files of total capacity, or “effective” backup throughput, since Avamar indexes every backup as a “full.”

Using the maximum rate found (on an unloaded NAS and Avamar system) of 3.1 million files/hr in a 10 percent change rate environment would equate to 155 million files of total capacity that can be backed up in a five-hour window.

For the capacity (low file count) scan rate per hour, a maximum of 230 GB/hr of new data was found with four streams active. For a five-hour window, that equates to 1.15 TB per backup window.

The math for these results is outlined in the Proper sizing guidelines section.
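The extrapolation is a two-step calculation: multiply the hourly rate by the backup window, then divide by the daily change rate to get the total amount protected. The following is a minimal sketch using the rates reported in this paper; the helper names are illustrative only.

```python
def per_window(rate_per_hour, window_hours):
    """Amount of changed data (files or GB) one backup window can absorb."""
    return rate_per_hour * window_hours


def effective_capacity(rate_per_hour, window_hours, change_rate):
    """Total protected amount if only change_rate of it changes per day."""
    return per_window(rate_per_hour, window_hours) / change_rate


WINDOW_HOURS = 5
CHANGE_RATE = 0.10

# File-count rates are in millions of files per hour.
for label, rate in [("conservative", 2.48), ("maximum", 3.1)]:
    changed = per_window(rate, WINDOW_HOURS)
    total = effective_capacity(rate, WINDOW_HOURS, CHANGE_RATE)
    print(f"{label}: {changed:.1f}M files per window, {total:.0f}M files effective")

# Capacity rate is in GB per hour; 1 TB is taken as 1000 GB.
changed_gb = per_window(230, WINDOW_HOURS)
total_gb = effective_capacity(230, WINDOW_HOURS, CHANGE_RATE)
print(f"capacity: {changed_gb / 1000:.2f} TB per window, {total_gb / 1000:.1f} TB effective")
```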

Note: The number of “files/hour” is identified in the console log of the Activity Monitor of the Avamar interface. Be mindful that the log is divided into sections corresponding to the filesystem being backed up. Aggregating throughputs (files/hour) must be done by correlating the start and end times of the given filesystem jobs within the log.
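One way to aggregate throughput across filesystem jobs is to divide the total number of files by the wall-clock span from the earliest start to the latest end, rather than adding up per-job rates. The sketch below assumes the start time, end time, and file count of each job have already been read from the log by hand. The tuple layout and the timestamps are hypothetical and do not reflect the actual log format.

```python
from datetime import datetime


def aggregate_files_per_hour(jobs):
    """Aggregate rate for overlapping jobs given (start, end, files) tuples."""
    jobs = list(jobs)
    first_start = min(start for start, _end, _files in jobs)
    last_end = max(end for _start, end, _files in jobs)
    elapsed_hours = (last_end - first_start).total_seconds() / 3600
    total_files = sum(files for _start, _end, files in jobs)
    return total_files / elapsed_hours


# Hypothetical jobs for two filesystems whose backups partly overlap.
jobs = [
    (datetime(2011, 4, 1, 1, 0), datetime(2011, 4, 1, 1, 30), 1_000_000),
    (datetime(2011, 4, 1, 1, 15), datetime(2011, 4, 1, 2, 0), 2_000_000),
]
print(f"{aggregate_files_per_hour(jobs):,.0f} files/hour")
```

In this illustration, simply summing the two per-job rates would overstate the aggregate, because the jobs overlap for only part of the hour.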

These results appear to be a major improvement over the previous lab results, which had evolved into a best practice of 40 million files maximum, as outlined in the Avamar version 4 and version 5 NDMP performance expectations.

Performance and key ideas summary

Although capacity increases yielded very good backup performance, this may not be the behavior if complex file types (such as VMDK, database, PST, and other unstructured files) dominate the capacity of the filesystem or filesystems. Deep directory structures with many millions of inodes can also extend backup times. Not all of these “difficult” scenarios can be tested.

The speed of backups may increase if more storage nodes exist in the Avamar grid, but that gain could be offset by existing backup activity contending for grid resources.

At maximum throughput, memory utilization averaged 3 GB, and never exceeded 7 GB on the NDMP Accelerator for any of the tests (baseline or daily incremental). This may increase if the NDMP Accelerator was configured with additional accounts servicing additional filesystems, or if it was servicing a NAS system capable of processing more than four NDMP streams.

For high-capacity, relatively moderate file count environments with mixed file types, one can expect 150 – 164 GB/hr for a single stream, or up to 230 GB/hr if all four streams are leveraged appropriately. Given a five-hour backup window, with all four streams utilized, for a 10 percent change rate environment, this equates to a maximum protected capacity or “effective backup” of 11.5 TB per NDMP Accelerator.

Improvements in NDMP client code in Avamar versions 5 and 6 appear to have resulted in a 3-4x improvement in backup times for high file-count environments. What once was a limit of 10 million files per stream (40 million files using a maximum of four streams) has now increased to a range of 124 million to 155 million total files (all available streams utilized) of “effective backup” per NDMP Accelerator.

Cautionary Note: Although small-scale tests show that the NDMP Accelerator can process 4.96 million files/hour (for four streams), that rate was observed for only a small dataset (~4.2 million files total, after 30 percent more files were added). When the dataset was expanded to roughly 15 million total files, the aggregate backup rate dropped to only 3.1 million files/hour (for four streams). This is likely due to the preparation time required by the Celerra to scan changed inodes and assemble them for the dump stream. Therefore, NAS systems can be a limiting factor in the backup of very dense filesystems. And, since NAS models vary in CPU, memory, and I/O capability (both disk and RAID types), it is nearly impossible to quantify throughput behavior for all possible combinations. The duration of wait time is tracked in the Activity Monitor under the Elapsed Wait column, assuming the job is not waiting for an available Avamar storage node stream.

Proper sizing guidelines

As simple as it is to back up a NAS system, the ideal amount of resources must be leveraged in order to properly service the backup processes. The backups influence three major components in the Avamar solution—the Avamar server/grid, Avamar NDMP Accelerator, and the NAS system itself. Understanding the influence on these systems provides guidance toward the most proper quantity of NDMP Accelerators to deploy, the proper amount of Avamar capacity, and the expected time to back up a given NAS system.

When sizing the EMC Avamar NDMP Accelerator, one must gather the following information:

Quantity of data—capacity, number of filesystems/volumes, and file count (or inode count)

Type of data (simple departmental shares, VMDK images, PST files, databases, and so on)


Change rate of the data

Model of the NAS system—indicating onboard resources such as CPU, memory, I/O boards, and so on

NAS operating system version

Connectivity of the NAS system to local and remote networks

Expected SLA for backup window

A sizing example

Given these data points, a proper sizing model can be developed, and the influence on the target Avamar server/grid can be anticipated. Here is an example:

A customer NAS system “alpha” contains 4 TB of file-based data with a total of 74 million files (majority are non-database) over 10 filesystems.

Data consists primarily of departmental shares, with 1 TB of database data served via the NFS protocol.

Estimated change rate (from the application level) is about 6 percent per day.

Model of the device is EMC Celerra NS-120, with one Data Mover.

Operating system is DART version 5.6.

Connectivity is to two local VLANs through two aggregated 1 GbE ports.

Replication of data to an offsite DR facility may be considered.

Local administrators report that past NDMP backups took ~9 hours and consumed ~15 percent CPU and 30 percent memory on the NS-120 “alpha” NAS system.

An important point here is that NDMP-based backups had been running on this “alpha” NAS system in the past. Alternatively, it would be valuable if the administrator had executed a test run of an NDMP backup, such as a dump to > /dev/null. This additional information indicates the overhead associated with NDMP backups on that particular NAS system. Since NAS systems have varying amounts of CPU, memory, and bus speeds, only empirical data from past NDMP backup activity can give an idea of the impact an EMC Avamar NDMP backup will have on a given NAS system.

Note: It is entirely possible that some customers, for example, those with very small CPU NAS systems, might observe NDMP backups overwhelming their NAS system with just the smallest backup request. It is imperative that a NAS system first be capable of handling the assembly and movement of requested blocks through the defined output port before expecting the backup infrastructure to perform to such expected rates of throughput.

To alleviate some of the overhead on NAS systems, Avamar requires only incremental level 1 dumps daily, so the impact is only as great as the creation of a snapshot containing the blocks that changed since the last backup recorded by Avamar. By definition, this is like executing a differential incremental backup that backs up only the blocks that have been modified since the most recent incremental level 1 or level 0 backup. Avamar continues requesting incremental level 1 backups every time after the initial full (level 0) backup; there is no need to rerun level 0 backups.

Avamar does not rely on retained NAS-based snapshots to determine which blocks or files need to be backed up since the last known backup. Instead, Avamar maintains a record of the last known good backup and the files associated with that backup, and requests via the NDMP command set that only the blocks and files changed between that date and the current date be sent. In this way, Avamar does not rely on the NAS system to provide that list or to maintain particular snapshots, which could otherwise be compromised by NAS system administrators. The dump (NetApp) or vbb (EMC) process is responsible for assembling and sending the files to the defined target (the Avamar NDMP Accelerator) following the creation of the snapshot.

Solution for the example

Since 25 percent of the total capacity (1 TB of the 4 TB total) is database, it is likely that those files (such as .dbf files) will be identified as all new blocks to the NAS system. Therefore, 3 TB of capacity should be treated as “general departmental files” and the remaining 1 TB will be treated as “unstructured data” where block shifting and heavy random writes will likely cause the NAS system to identify and therefore send nearly all of the 1 TB of database data over to the Avamar NDMP Accelerator (see the Additional considerations for sizing section for further explanation).

Given that 3 TB of total “general departmental files” data is present:

6% daily change rate = .06 x 3 TB = 180 GB new capacity per day

6% daily change rate = .06 x 74M files = 4.4M new files per day

Finding the maximum backup duration to use:

Using a capacity backup rate of 230 GB/hr on 180 GB/day = ~47 minutes

Using a files backup rate of 2.48M files/hr on 4.4M files/day = ~1.8 hours

Since the longest duration found was 1.8 hours, we will use this value for our expected daily backup time for the 3 TB of departmental file data.

And now given that 1 TB of total “unstructured DB” data is present, one should use the baseline capacity scan rate for this entire capacity as such:

Using a capacity backup rate of 190 GB/hr on 1000 GB = 5.25 hours
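The arithmetic for the “alpha” example can be sketched as follows. The function name is illustrative; the rates are those identified in this paper, 1 TB is taken as 1000 GB, and the serial total assumes that the database filesystems are backed up after the departmental shares rather than alongside them.

```python
def daily_backup_hours(capacity_gb, files_millions, change_rate,
                       gb_per_hour, mfiles_per_hour):
    """Expected daily backup time: the longer of the capacity and file estimates."""
    changed_gb = capacity_gb * change_rate
    changed_mfiles = files_millions * change_rate
    capacity_hours = changed_gb / gb_per_hour
    file_hours = changed_mfiles / mfiles_per_hour
    return max(capacity_hours, file_hours)


# 3 TB of departmental data, 74 million files, 6 percent daily change.
departmental_hours = daily_backup_hours(3000, 74, 0.06, 230, 2.48)

# 1 TB of database data, treated as sent in full at the baseline scan rate.
database_hours = 1000 / 190

print(f"departmental: {departmental_hours:.1f} h")
print(f"database:     {database_hours:.2f} h")
print(f"serial total: {departmental_hours + database_hours:.1f} h")
```

Taking the maximum of the two estimates reflects the fact that a backup is bounded by whichever of capacity or file count takes longer to process.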

Given the metrics identified above, this is how the final configuration could be presented:


What had not been previously identified is the distribution of departmental data (general file data) and database data to the filesystems. One could likely interpret that from the filesystem names, but should still inquire with the customer as to this correlation. Some distributions, such as one filesystem containing the majority of the 74 million files, could cause a reduction in deduplication rate, corresponding more to the previously observed 1.24 million files/hr backup rate. Stream allocation to available filesystems is a 1:1 relationship; in other words, multiple streams will not execute against a single filesystem.

Note: If this system had been a system with 8 GB of onboard memory (such as an NS-960) with DART version 6 or later, the maximum number of active streams available would be eight. This, however, would likely not be a linear increase in backup throughput since the NDMP Accelerator and the Avamar server have limited amounts of resources. Therefore it is best to plan conservatively using the previously identified rates.

Step Action

1 Since there are multiple filesystems, the duration of backup of filesystems with database data will be compounded with the duration of backup for the departmental share data. If there were more streams available than departmental share data filesystems, then the database filesystem backup could run in conjunction with the other filesystems; however, in this example there are 10 filesystems defined.

Total time = 1.8 hrs + 5.25 hrs = roughly 7 hours, with the majority of the backup time spent on those filesystems with databases

2 The total backup time fits within what a single NDMP Accelerator can complete in an 8-hour backup window; however, if Avamar replication is desired, then the total backup + replication window of 10 hours would be exceeded. So the solution here would be:

2a One NDMP Accelerator if no replication

2b Two NDMP Accelerators if replication required

3 The DART OS version is supported as stated in the interoperability matrix.

4 The overhead of the NDMP backup will likely not exceed what was experienced in the previous backups (~15% CPU and 30% memory).

5 Point out that unstructured DB data (and other such types) will likely be sent in its entirety to the NDMP Accelerator for deduplication, given the nature of snapshot processes on the NAS systems. This lengthens backup duration and directly affects the number of NDMP Accelerators needed.

6 The targeted Avamar system should be sized using traditional Avamar sizing techniques that consider data type, deduplication ratio, and retention. File count is not necessary for consideration of the Avamar grid node count or capacity.


Additional considerations for sizing

Some issues may surface when backing up particular data types. These include database files, VMDK image files, encrypted or compressed files, PST mail files, and others. Issues can arise in part from random block writes within a file or set of files that cause block offsets. When new blocks inserted into a file cause offset or “shifting” of subsequent blocks in the file, the logical block addresses (LBAs) change. When the LBAs change, the NAS system recognizes those as new blocks, and assembles the newly written blocks in addition to all of the subsequent blocks that were shifted. For EMC Avamar, block offset does not change the way deduplication works. However, since it does affect the NAS system’s differential incremental block identification process, more blocks are sent in the dump stream from the NAS system to the Avamar NDMP Accelerator. So although a backup job might still produce outstanding deduplication results (at the Avamar NDMP Accelerator), the duration of the backup job might be longer than expected. Check the following to identify this behavior as a possible issue:

The number of gigabytes sent to the NDMP Accelerator is greater than the expected change rate (divide the number of bytes sent by the total number of bytes used in the NAS volume or filesystem) as estimated by the application.

The backup job produces very good deduplication rates.

The rate of the level 1 backup throughput is nearly equal to a level 0.
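The three checks above can be combined into a simple screening function. This is a sketch only: the thresholds for “very good” deduplication (99 percent) and for “nearly equal” throughput (within 20 percent) are assumptions chosen for illustration, as are the sample values passed in at the end.

```python
def block_shift_indicators(gb_sent, gb_used, expected_change_rate,
                           dedup_rate, level1_gb_per_hour, level0_gb_per_hour,
                           good_dedup=0.99, rate_tolerance=0.20):
    """Evaluate the three indicators; thresholds are illustrative assumptions."""
    sent_fraction = gb_sent / gb_used
    rate_gap = abs(level1_gb_per_hour - level0_gb_per_hour) / level0_gb_per_hour
    return {
        "sent_exceeds_expected_change": sent_fraction > expected_change_rate,
        "dedup_still_very_good": dedup_rate >= good_dedup,
        "level1_rate_near_level0": rate_gap <= rate_tolerance,
    }


def likely_block_shift(**kwargs):
    """True only when all three indicators are present."""
    return all(block_shift_indicators(**kwargs).values())


# Hypothetical filesystem: most of its used capacity is resent every day.
sample = dict(gb_sent=700, gb_used=778, expected_change_rate=0.06,
              dedup_rate=0.996, level1_gb_per_hour=100, level0_gb_per_hour=95)
print(block_shift_indicators(**sample))
print("likely block shifting:", likely_block_shift(**sample))
```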

One such example of the above behavior was observed at a large customer who continued trying to refine the schedules for backup of PST filesystems in conjunction with a number of departmental filesystems.

The problem:

Some filesystem backups are taking 8 - 10 hours to complete, and others of the same capacity are taking 15 - 30 minutes.

The configuration:

8 filesystems on 1 Data Mover

Each filesystem has only PST files (personal email folder storage). These are large user files that are stored on a CIFS network share serviced by one Celerra Data Mover. The filesystems that are "good performing" are simple departmental folders that store typical data one might often find on a NAS device.

| Shares | Capacity (GB) | Bytes Sent from Celerra to NDMP Accelerator | Bytes Sent to be Deduplicated | Duration of Daily Incremental Backups |
|---|---|---|---|---|
| Departmental | 771 | 10 B | 11 MM | 15 min |
| PST | 778 | 42 B | 208 MM | 9 hrs |


Cause of the problem

The nature of these PST files is that the entire dump stream needs to be re-hashed every time it is backed up. For incremental (level 1) backups, the NAS system recognizes them as completely new files and sends them over to the Avamar NDMP Accelerator.

Since the majority of these PST files are recognized as completely new files, for roughly 800 GB of filesystem data the Celerra sends four times more data from the PST filesystems to the Avamar NDMP Accelerator than it does for the departmental shares. In addition, the NDMP Accelerator must scan (using the sticky byte algorithm) about 20 times more bytes of data from the PST filesystems than from the departmental shares. These two factors account for the massive difference in backup times observed. Essentially, the NDMP Accelerator is running at ~100 GB/hr (Avamar version 5), which is just a bit faster than initial seeding times for these PST filesystems, because of the high change rate of the PST files. The cause is evident because deduplication rates for these PST filesystems remain between ~99.5 percent and 99.7 percent, yet the backups are slow.

Possible suggestions

1. Continue to rebalance the PST filesystem scheduling for optimal timing

2. Separate PST filesystems to another Celerra Data Mover that can be dedicated to serving and backing up PST filesystems without influencing other departmental shares

3. Consider use of alternate methods to back up PST filesystems such as EMC NetWorker® with the Data Domain® NDMP Tape Server, or promote a “no backup” policy within the customer user base since many companies do not back up PST files

This behavior, as mentioned before, can easily be identified as an issue if previous NDMP-based backup methods or a test “dump” backup was performed on the volume or filesystem, and resulted in very slow incremental backups due to sending inordinately large amounts of data to the backup target (greater than the estimated application-level change rate).

It is important to note that although many NAS systems today can scale to over 256 million files per filesystem, and filesystems can scale to well over 100 TB, during a backup all blocks and related inodes within a filesystem need to be moved to a target device and be cataloged. Avamar, with the NDMP Accelerator, minimizes the influence of that activity on a NAS system; however, NAS systems should be designed to be able to service such activity. Because the Avamar NDMP Accelerator minimizes the overhead of backup resource consumption, NAS systems can scale beyond what traditional, legacy backup tools might support.


Summary

Gathering the correct type and amount of data from the existing NAS environment will lead to an accurate estimation of Avamar NDMP Accelerators and Avamar storage consumption.

Baseline tests or previous NDMP backup information from an existing NAS system can be very valuable to estimating backup duration and resource consumption.

The amount of data sent to the NDMP Accelerator, and the rate of deduplication can be dramatically affected by data type.

NAS system size can be increased beyond what traditional backup methods can support because of the high effective throughput of Avamar backups.

Capacity planning and growth management

When planning for Avamar growth, standard reports can be generated from the usual places within Avamar such as Enterprise Manager, the Avamar Administrative Console, and the MCCLI command set. EMC Data Protection Advisor (DPA) can also be leveraged for mining report data for historical purposes and cross-referencing/correlating data points with other elements of the data storage infrastructure.

NAS systems tend to be landing areas for many tiers of data that can uncontrollably grow beyond expectations. This can be due to policies that allow auto-growth of underlying filesystems or volumes, lack of leveraging quotas, and mass data migrations such as the movement of all departmental data shares to centralized NAS systems. Tools to migrate older files off to long-term media storage can be leveraged to help manage over-retention of data.

Since absolute resource limitations and rupture points were not found in our testing using such small test samples, we need to extrapolate the results to find the maximum number of files and capacity that can be backed up by a single NDMP Accelerator. This is a plausible exercise since the maximum number of streams serviced by the NDMP Accelerator used in our testing did not overly tax the unit beyond its available resources. Therefore, some amount of extrapolation can be done since the Avamar NDMP Accelerator is capable of supporting eight streams, although a linear extrapolation is not accurate.

Both the EMC Celerra/VNX and NetApp FAS systems can perform a level of native deduplication. As discussed earlier, these are processes that incur overhead to the NAS systems, so backup schedules should not interfere with native NAS system deduplication schedules. Flexibility in DART OS code for EMC Celerra and VNX allows rehydration (decompression) of files that had been previously compressed by Celerra data deduplication, meaning that NDMP backups will not send the compressed files to the Avamar NDMP Accelerator. This is good for the NDMP Accelerator since compressed files do not deduplicate well. However, data is sent to the NDMP Accelerator in an uncompressed state, which may result in slower backups for the initial (level 0) backup.


To enable this, set the switch “dedupe.backupDataThreshold=0”, which rehydrates the deduplicated files prior to sending. This switch must be set per filesystem. Upon restore, the data is recompressed. The following example shows how to configure this.

To set the backup data threshold for VNX File Deduplication and Compression on a Celerra Data Mover or VNX blade, use this command syntax:

$ fs_dedupe -default -set {<movername>|-all} -backup_data_threshold <percent>

<movername> = name of the Data Mover

<percent> = full percentage that a deduplicated file has to be below in order to trigger space-reduced backups for NDMP

For example, when set to 90, any deduplicated file whose physical size (compressed file plus changed blocks) is greater than 90 percent of the logical size of the file will have the entire file data backed up without attempting to back it up in a space-reduced format. Any deduplicated file whose physical size is less than 90 percent of the logical file size will be backed up in a space-reduced format. Setting this value to 0 disables a space-reduced backup. The range of values is 0 to 200, and the default value is 90 percent.
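The threshold rule can be expressed as a small decision function. This sketch models only the behavior described above and is not the DART implementation; the function name is hypothetical, and treating a file that sits exactly at the threshold as a full-data backup is an assumption, since the description covers only the “greater than” and “less than” cases.

```python
def uses_space_reduced_backup(physical_bytes, logical_bytes, threshold_percent=90):
    """Return True if a deduplicated file would be backed up space-reduced.

    Assumption: a file exactly at the threshold is backed up as full data.
    """
    if not 0 <= threshold_percent <= 200:
        raise ValueError("threshold must be between 0 and 200")
    if threshold_percent == 0:
        return False  # a value of 0 disables space-reduced backups
    return physical_bytes < logical_bytes * threshold_percent / 100


# Illustrative files, each with a logical size of 100 MB.
MB = 1024 * 1024
print(uses_space_reduced_backup(80 * MB, 100 * MB))      # below 90 percent
print(uses_space_reduced_backup(95 * MB, 100 * MB))      # above 90 percent
print(uses_space_reduced_backup(80 * MB, 100 * MB, 0))   # feature disabled
```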

Future developments to the VNX Operating Environment (the successor to DART version 6) include improvements for servicing incremental-forever methodology backups that relieve a large amount of resource consumption on the NDMP Accelerator. For this, we expect increases in throughput performance (not part of the tests outlined in this paper) when customers upgrade to future releases of VNX OS/DART 7 and Avamar NDMP Accelerator code.

Additional performance and process improvements for handling the natively deduplicated files will be included in future releases of Avamar NDMP client code.

Extrapolating previous results to eight streams

Memory was not overly taxed during the testing; no more than 7 GB of memory was observed to be consumed during backups of the test sample and four streams utilized. Memory resources were quickly released once the backup jobs completed. Therefore, an accelerator with 36 GB of memory should not develop a bottleneck from the lack of memory. It is safe to assume that a 36 GB accelerator node could handle more streams than what was identified in this study. The overconsumption of memory, however, has dramatic performance implications, causing a system to begin swapping memory pages to disk swap space. This situation severely impairs performance, and should be constantly monitored if one has expanded the number of streams beyond the recommendation or added more accounts to the NDMP Accelerator that would incur additional streams (avtar processes). Memory monitoring is accomplished using the “vmstat” command on the NDMP Accelerator unit.

CPU utilization reached peaks of 70 percent (of a 100 percent maximum as measured) during the tests in this paper. Therefore, we might assume that 30 percent additional processing power is available. It is more likely that an expanded NDMP Accelerator (by adding more streams) would run out of processing power before volatile memory resources.

Expanding the configuration

NDMP Accelerators can be linearly added to the configuration of a NAS backup system or set of systems; however, the accelerators do not interoperate with any sort of data/configuration transfer between them. This is not needed because the global deduplication process (the final step in the complete deduplication process executed) interacts with the Avamar target server/grid for inquiry of duplicate segment IDs. Since data segments and related hash IDs inside of the Avamar grid are not segregated, all data segments from all clients are compared to one another. Therefore, having multiple NDMP Accelerators will continue to provide outstanding deduplication, without the need for cross-accelerator communications. Data, as well, can be restored through an alternate account, and therefore an alternate NDMP Accelerator, given that the same plug-in type (Celerra or NetApp) is leveraged.

Requiring no advanced configuration or clustering, NDMP Accelerators are easily added into an environment to support very large NAS environments.

For continued results and examples, please review the complementary white paper EMC Avamar Backups for Highly Scaled Network-Attached Storage (NAS) Environments.

Conclusion

Using laboratory tests and a unique customer NAS example, we defined baseline performance expectations for the NDMP Accelerator. That led to appropriate solution sizing needed to meet specific SLAs.

The findings prove that the Avamar NDMP Accelerator is an ideal solution for efficient backup of EMC and NetApp NAS deployments. Centralized management, a variety of reporting mechanisms, and the assistance of EMC staff can ensure that a NAS backup solution using the Avamar NDMP Accelerator scales well into the future with high performance, resiliency, and availability.

With Avamar NDMP Accelerators deployed for NAS system backups, decreased capital and operating expenditures are realized through:

Extensive Avamar deduplication processes that allow for decreased network consumption

Fast, daily full backups that also reduce overhead to NAS systems

A disk-based grid architecture of the target Avamar system with RAIN for high availability, providing peace of mind

New multi-streaming selection embedded within the policy configurations that allows for efficient management and optimized performance


Collectively, these major effects empower greater attainment of SLAs. These business values, coupled with the unique technical approach to NAS system backup leveraging a well-developed protocol (NDMP), make the Avamar NDMP Accelerator a very cost-effective, scalable solution, proven in the industry.

References

The following documents can be found on EMC Powerlink (access required):

EMC Avamar 5.0 System Administration Guide

EMC Avamar Backups for Highly Scaled Network-Attached Storage (NAS) Environments

EMC Avamar NDMP Accelerator 5.0 User Guide

EMC Avamar for NAS Backups: An Overview and Business Case

EMC Avamar 5.0 Operational Best Practices Guide

EMC Celerra Network Server Version 5.5 Command Reference Manual

Configuring NDMP Backups on Celerra

The following documents can also be helpful:

NetApp Data ONTAP 7.3 Data Protection Tape Backup and Recovery Guide

Data Protection Strategies for Network Appliance Storage Systems

Appendix

There are a number of useful commands that can be leveraged on a NetApp FAS and EMC Celerra to assist with gathering information:

Gathering number of available kilobytes in EMC Celerra, assuming server_2 is the Data Mover for which you are sizing for backup:

‘server_df server_2’

Gathering number of available inodes or files in EMC Celerra, assuming server_2 is the Data Mover for which you are sizing for backup:

‘server_df server_2 -inode’

Gathering number of available kilobytes in NetApp:

‘df’

Gathering number of available inodes or files in NetApp:

‘df -i’


Take care to gather only the “Used” column of these outputs. Also, do not count EMC Celerra filesystems that are checkpoints (snapshots).

Additional performance data can be gathered on the NDMP Accelerator at the command line of that unit. This appliance can be accessed in the same way as any typical Linux or UNIX system. Use the following commands for accurate information gathering:

CPU and memory consumption on the NDMP Accelerator during a backup:

‘iostat -xtc 5 10’

‘vmstat 5 10’

‘top’

Look at the User, Nice, and System columns to see consumed CPU. The %idle column is the inverse combination of these three. The iostat and vmstat commands above are set to run every 5 seconds for 10 iterations.

Network packet transmission and port usage on the NDMP Accelerator during a backup:

‘netstat -i’

‘netstat -pnutl’

Processes running for servicing the streams on the NDMP Accelerator during a backup:

‘pgrep -l avtar’

‘ps -ef | grep avtar’

‘top’