TECHNICAL BRIEF▶ NetBackup 7.6 Deduplication Technology




Copyright © 2014 Symantec Corporation. All rights reserved. Symantec and the Symantec logo are trademarks of Symantec Corporation. All other brands and products are trademarks of their respective holder/s. 01/2014

Symantec Backup and Recovery Technical Brief NetBackup 7.6 Deduplication Technology

NetBackup 7.6 Deduplication Technology

NetBackup 7.6 Overview

The Symantec NetBackup Platform is a complete backup and recovery solution that is optimized for virtually any workload, including physical, virtual, arrays, or big data infrastructures. NetBackup delivers flexible target storage options, such as tape, third-party disk, cloud, or appliance storage devices, including the NetBackup Deduplication Appliances and Integrated Backup Appliances.

NetBackup 7.6 delivers the performance, automation, and manageability necessary to protect virtualized deployments at scale – where thousands of Virtual Machines and petabytes of data are the norm today, and where software-defined data centers and IT-as-a-service become the norm tomorrow. Enterprises trust Symantec.

Key Benefits

Comprehensive – As a single solution to protect all of your data assets, NetBackup provides support for virtually every popular server, storage, hypervisor, database, and application platform used in the enterprise today.

Scalable – High performance, elastic automation, and centralized management based on a flexible, multitier architecture enable NetBackup to adapt to the growing needs of a fast-paced, modern enterprise data center.

Integrated – From backup appliances to big data platforms, NetBackup integrates at every point in the technology stack to improve reliability and performance. OpenStorage Technology (OST) provides even tighter integration with third-party storage and snapshot solutions.

Innovative – With hundreds of patents awarded in areas including backup, recovery, virtualization, deduplication, and snapshot management, NetBackup continues a long tradition of bringing advanced technologies to market first.

Proven – For over a decade, NetBackup has led the industry as the most popular enterprise data protection software by market share and is used by many of the largest enterprises on the planet. When you need your data back, you can trust NetBackup.

Key Features

One platform, one console unifies virtual and physical global data protection

Unified global management of snapshots, replicated snapshots, backup, and recovery

Scalable, global deduplication across virtual and physical infrastructures

V-Ray one-pass backup, instant image and single file restore for virtual and physical

Automated virtual data protection and load-balanced backup performance

Deduplication Overview

Deduplication is defined as the elimination of redundant data from disk storage. NetBackup deduplication uses a hash algorithm to provide a unique identifier, or fingerprint, to data segments within a client backup stream. These fingerprints enable NetBackup to identify client data segments that are identical to one another, which can then be used to prevent the same data from being stored multiple times while still allowing the data to be restored when necessary.

Backup and archive infrastructures are ideal candidates for deduplication due to the redundant nature of backed up and archived data. For example, in many backup infrastructures most of the data backed up during a full backup is identical to that of the previous full backup. Deduplication prevents storing multiple copies of the identical data.
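The effect of repeated full backups can be illustrated with a little arithmetic. The sketch below is only a rough model: the function name and the figures (a 1,000 GB full backup, 20 GB of changed data between fulls, four retained fulls) are hypothetical and do not come from NetBackup or from this brief.

```python
def stored_gb(full_gb, changed_gb, full_backups, dedup):
    """Estimate disk usage for a series of full backups (hypothetical model).

    Without deduplication every full backup is stored whole. With
    deduplication the first full is stored once and each later full
    adds only the data that changed since the previous one.
    """
    if not dedup:
        return full_gb * full_backups
    return full_gb + changed_gb * (full_backups - 1)


# Hypothetical figures chosen only for illustration.
raw = stored_gb(1000, 20, 4, dedup=False)     # 4 * 1000 = 4000 GB
deduped = stored_gb(1000, 20, 4, dedup=True)  # 1000 + 3 * 20 = 1060 GB
```

Under these assumed numbers, four full backups occupy 1,060 GB instead of 4,000 GB. Real savings depend on the change rate and the data type.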

Key Benefits

Powerful, enterprise-class data deduplication technology

Dramatic optimization of disk-based backup storage

Flexible implementation choices, including server and client deduplication

Purpose-built appliance solutions for streamlined implementation and management

Support for OST-compliant deduplication storage devices


Figure 1: Deduplication overview

In addition to the significant storage savings that are provided by deduplication technology, time and other resources are also saved. Although many vendors provide the ability to perform deduplication at the storage target, usually at the backup storage location or appliance, being able to deduplicate data at the source, or the client, provides the ability to significantly reduce network bandwidth utilization, and can speed up the entire backup process. NetBackup supports deduplication at the target – also referred to as NetBackup Media Server Deduplication (MSDP) – as well as client deduplication.

NetBackup Deduplication Options

NetBackup offers several options for implementing deduplication. Symantec’s first deduplication product was PureDisk, which is a stand-alone application.

Soon after Symantec released the PureDisk product, it was integrated into NetBackup through an option called the PureDisk Deduplication Option, or PDDO, in which NetBackup used the PureDisk environment as a deduplication storage unit. PureDisk is now deprecated as a stand-alone software product.

Another deduplication option that is available is NetBackup Media Server Deduplication, or MSDP. An MSDP server is a NetBackup media server that provides a built-in method for NetBackup to support deduplication without the need for complex hardware. Prior to NetBackup 7.5, the MSDP storage limit was 32TB. In NetBackup 7.5 the limit was increased to 64TB. NetBackup can also perform deduplication at the source, using a technology called client deduplication.

As another alternative, Symantec offers the NetBackup 5200-series appliances that support deduplication and, from NetBackup 7.6.1, the 5300-series appliances. The 5200- and 5300-series appliances are effectively NetBackup media servers that have NetBackup installed and preconfigured to perform deduplication. The 5230 can support up to 144TB of deduplicated storage, while the 5330 can support up to 229TB, as shown in figure 2.


Figure 2: NetBackup deduplication-enabled appliances

Finally, certain third-party vendors provide appliances that support NetBackup’s OpenStorage Technology (OST). Some of these appliances support deduplication. Check the Symantec NetBackup hardware compatibility list to determine the vendor appliances that support this capability: http://www.symantec.com/business/support/index?page=content&id=TECH76495

How NetBackup Deduplication Works

While a backup job is running, NetBackup determines what client data needs to be backed up. For each file that is backed up, the file metadata – including file permissions, directory location, and file name – is separated from the actual content of the file. The file metadata is saved in the deduplication database.

The file content is broken down into smaller 128 KB segments. In figure 3, this is represented by segments A, B, C, and D for File 1. A hash fingerprint is calculated for each segment. The data segment fingerprints are compared against the fingerprints of data segments that have already been stored in deduplication storage, in order to identify unique data segments. Only the unique data segments are sent to deduplication storage, along with the file metadata.

It is important to note that file segments are checked for uniqueness across all clients’ backup data, not just for an individual client. This means that if an identical data segment exists on multiple clients, only a single copy of that data segment is written to storage. The metadata storage tracks all the metadata for each file that is backed up. The content storage tracks each file and all the segments that are associated with that file to enable NetBackup to put the file back together. In figure 3, data segments A, B, C, and D are required to restore File 1.

Next, when File 2 is backed up, the file is separated into metadata and content. The file contents are broken down into 128 KB segments and a fingerprint is calculated for each segment. Since File 2 differs slightly from File 1, segments E and F are determined to have unique fingerprints, while segments A and C have the same fingerprints as segments that are already stored in the database. Since segments E and F are unique, they are sent to storage and a notation is made that File 2 consists of segments A, E, C, and F. This process continues for each file in the backup job, until all files are processed.
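The File 1 and File 2 walkthrough above can be sketched in a few lines of Python. This is a toy in-memory model, not NetBackup code: the DedupStore class, its methods, and the block helper are hypothetical, and SHA-256 is an assumption, since this brief does not name the hash algorithm that NetBackup uses.

```python
import hashlib

SEGMENT_SIZE = 128 * 1024  # 128 KB segments, as described in the text


class DedupStore:
    """Toy in-memory stand-in for deduplication storage (hypothetical)."""

    def __init__(self):
        self.segments = {}  # fingerprint -> segment bytes (content storage)
        self.recipes = {}   # file name -> ordered list of fingerprints

    def backup(self, name, data):
        """Store a file and return how many new segments were written."""
        new_segments = 0
        recipe = []
        for offset in range(0, len(data), SEGMENT_SIZE):
            segment = data[offset:offset + SEGMENT_SIZE]
            # SHA-256 is an assumption made for this sketch.
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                self.segments[fingerprint] = segment  # unique: store it
                new_segments += 1
            recipe.append(fingerprint)  # record the segment order
        self.recipes[name] = recipe
        return new_segments

    def restore(self, name):
        """Reassemble a file from its recorded segment list."""
        return b"".join(self.segments[fp] for fp in self.recipes[name])


def block(letter):
    """Build one full segment filled with a single letter (test data)."""
    return letter.encode() * SEGMENT_SIZE


store = DedupStore()
file1 = block("A") + block("B") + block("C") + block("D")
file2 = block("A") + block("E") + block("C") + block("F")

new1 = store.backup("File 1", file1)  # A, B, C, D are all new
new2 = store.backup("File 2", file2)  # only E and F are new
```

After both backups the store holds six segments rather than eight, and each file can still be rebuilt from its own segment list. A single store shared by every client is what gives the cross-client deduplication described above.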


Figure 3: NetBackup deduplication example

Media Server and Client Deduplication Comparison

NetBackup provides the capability to perform both media server deduplication and client-side deduplication, as shown in figure 4. Depending on the circumstances, one deduplication method might be more beneficial than the other.

When using NetBackup media server deduplication, the client sends the entire backup data stream over the network to the media server. The deduplication media server performs the fingerprinting. It determines which data segments are unique and which data segments have an exact fingerprint match that has been stored previously.

The media server sends only the unique data segments, the deduplicated data stream, to deduplication storage. The advantage of performing deduplication on the media server is that the CPU of the client is not impacted by fingerprinting activity that occurs during the backup. Potential disadvantages are that all backup data must be sent over the network, and that heavy deduplication loads can affect the overall performance of the media server.

When using client-side deduplication, the segmenting of client files and the fingerprinting of the resulting data segments are performed by the deduplication plug-in on the client system. After the client compares fingerprints with a local fingerprint cache, or communicates with the deduplication media server to determine which data segments are unique, only the unique data segments (the deduplicated data stream) are sent over the network to the deduplication media server, which then writes the deduplicated data to deduplication storage.
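The lookup step on the client might be pictured roughly as follows. This is a sketch under stated assumptions, not the plug-in's actual logic: the MediaServerIndex class and unique_fingerprints function are hypothetical names, fingerprints are shown as single letters for readability, and checking the local cache before asking the media server is an assumed ordering.

```python
class MediaServerIndex:
    """Hypothetical stand-in for the media server's fingerprint index."""

    def __init__(self, known):
        self.known = set(known)
        self.queries = 0  # counts round trips made by the client

    def has(self, fingerprint):
        self.queries += 1
        return fingerprint in self.known


def unique_fingerprints(fingerprints, cache, server):
    """Return the fingerprints whose segments must be sent over the network."""
    to_send = []
    for fp in fingerprints:
        if fp in cache:
            continue  # already known locally, no server round trip
        if not server.has(fp):
            to_send.append(fp)  # neither side has seen this segment
        cache.add(fp)  # remember it for the rest of this backup
    return to_send


server = MediaServerIndex(known={"A", "C"})
cache = {"A"}  # assumed leftover from an earlier backup
to_send = unique_fingerprints(["A", "E", "C", "F", "E"], cache, server)
```

In this run only the segments for E and F need to be sent, and the server is queried three times (for E, C, and F), because A and the repeated E are answered from the local cache.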

Client-side deduplication has the advantages of distributing the fingerprinting workload to the clients, as each client deduplicates its own backup data, and sends only unique client data segments over the network. This can greatly reduce the network utilization. The potential disadvantage of client-side deduplication is the additional load that is placed on the client’s CPU to perform deduplication during the backup.
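The network trade-off between the two methods can be made concrete with a simplified model. The function below is hypothetical and counts only segment data crossing the link between client and media server. It ignores fingerprint lookups and metadata traffic, and it uses SHA-256 as an assumed hash because this brief does not name one.

```python
import hashlib

SEGMENT_SIZE = 128 * 1024  # 128 KB segments


def fingerprint(segment):
    """Fingerprint one segment (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(segment).hexdigest()


def segment_bytes_sent(data, stored, client_side):
    """Bytes of segment data sent from the client to the media server.

    stored is the set of fingerprints already in deduplication storage.
    """
    if not client_side:
        # Media server deduplication: the whole stream crosses the network.
        return len(data)
    seen = set(stored)
    sent = 0
    for offset in range(0, len(data), SEGMENT_SIZE):
        segment = data[offset:offset + SEGMENT_SIZE]
        fp = fingerprint(segment)
        if fp not in seen:
            sent += len(segment)  # only unique segments are transmitted
            seen.add(fp)
    return sent


def block(letter):
    """Build one full segment filled with a single letter (test data)."""
    return letter.encode() * SEGMENT_SIZE


# Segments A, B, C, D are assumed to be in storage from an earlier backup.
stored = {fingerprint(block(x)) for x in "ABCD"}
new_backup = block("A") + block("E") + block("C") + block("F")

media_server_bytes = segment_bytes_sent(new_backup, stored, client_side=False)
client_side_bytes = segment_bytes_sent(new_backup, stored, client_side=True)
```

With this test data the media server method sends all four segments (524,288 bytes), while client-side deduplication sends only the two unique ones (262,144 bytes), at the cost of hashing every segment on the client.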


Figure 4: Media server and client deduplication

NetBackup customers can use client-side deduplication for those clients where that method provides the greatest benefit, while simultaneously performing deduplication on the media server for other clients.

Summary

The Symantec Backup and Recovery product family offers market-leading backup and disaster recovery solutions for critical customer IT resources. This includes powerful and proven storage optimization technologies, such as data deduplication, that help customers manage data growth and backup storage costs to lower overall total cost of ownership.



For More Information

NetBackup Home Page: http://www.netbackup.com/

NetBackup Compatibility Information: http://www.symantec.com/docs/TECH59978

NetBackup Documentation: http://www.symantec.com/docs/DOC6488

Symantec Support Portal: http://www.symantec.com/support