
N Series Overview: Hardware and Competitive Advantage

Innovating to Deliver Choices

1993: NAS appliance and Snapshots
1996: Multiprotocol appliance
2001: Near-line storage appliance
2002: Unified SAN/NAS appliance
2003: iSCSI storage system
2004: RAID-DP™ disk resiliency
2005: Thin provisioning and virtual cloning
2006: Scalable grid storage

Addressing Today’s Challenges

• Explosive data growth
• Do more with less
• Scale the infrastructure
• 24x7 global access
• Data security & compliance

Consolidate storage. Operate everywhere. Protect your business.

Different Classes of Data

[Diagram axes: availability requirement, data criticality, cost]

Tier 1: Business Critical
Tier 2: Operational / Internal
Tier 3: Departmental / Distributed
Tier 4: Compliance / Reference / Archive

Different Tiers of Storage

Tier 1: Business Critical
Tier 2: Operational / Internal
Tier 3: Departmental / Distributed
Tier 4: Compliance / Reference / Archive

Data ONTAP® Operating System – SAN, NAS, iSCSI

• One architecture
• One application interface
• One management interface
• Total interoperability

Broadest Scalable Storage Architecture

N Series Family of Unified Enterprise Storage Systems

N3700: 4 FC ports, 16 TB max
N5000: 20 FC ports, 168 TB max
N7000: 32 FC ports, 504 TB max

A Fundamentally Simpler Approach

• Unrivaled synergy: everything works together

• Unique leverage: everything can do more

• Simpler administration: one process works everywhere

• Easier to deploy: less to learn means reduced training

[Diagram: one unified architecture spanning primary, secondary, backup, compliance and disaster-recovery tiers; SAN, iSCSI and NAS (NFS/CIFS) protocols; high-end FC, midrange FC, midrange ATA and low-end ATA platforms; virtualization]

N Series Storage: What is it?

Dedicated storage appliances: it's all about blocks!

• Filer head unit
• Disk shelves
• Optimised micro-kernel
• Connectivity

• Data ONTAP: an optimised micro-kernel
  – A highly optimised, scalable and flexible OS
• Write Anywhere File Layout (WAFL)
  – Allows for flexible storage containers
  – Tightly integrated with NVRAM and RAID
  – RAID 4 for performance and flexibility
  – RAID-DP for performance, flexibility and increased reliability!

Traditional Implementation

• Classic RAID 4: dedicated parity drive
• For each data block written, parity is written
• The parity drive becomes the bottleneck!
• 5 data blocks + 5 parity = 10 disk writes!

Traditional Implementation

• RAID 5: distributed parity
• For each data block written, parity is written
• 5 data blocks + 5 parity = 10 disk writes!

Performance achieved!

• WAFL and RAID 4
• The stripe is calculated in NVRAM; one parity block describes the whole stripe
• No bottleneck: all drives are written to evenly
• 5 data blocks + 1 parity = 6 disk writes!

Performance achieved!

• Let's extend that further…
  – RAID 4 with 28 disks: 28 data + 28 parity = 56 disk writes, with a hot parity drive
  – RAID 5 with 28 disks: 28 data + 28 parity = 56 disk writes
  – WAFL RAID 4 with 28 disks: 28 data + 2 parity = 30 disk writes
• The more disks, the bigger the difference! (See the sketch below.)
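A back-of-the-envelope check of those write counts (an illustrative Python sketch; the 2-parity figure assumes the 28 data writes land as full stripes across two WAFL RAID 4 groups):

    # Rough write-count model for the 28-block example above.
    def raid_random_update_writes(data_blocks):
        # RAID 4/5: every random data-block update also rewrites parity.
        return data_blocks + data_blocks

    def wafl_full_stripe_writes(data_blocks, raid_groups):
        # WAFL gathers writes in NVRAM and writes full stripes, so parity
        # is written once per RAID group, not once per data block.
        return data_blocks + raid_groups

    print(raid_random_update_writes(28))   # 56 disk writes
    print(wafl_full_stripe_writes(28, 2))  # 30 disk writes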

Protecting the Data

• Subsystem resilience
  – RAID-DP
• Disk resilience
  – Lost write protection
  – Momentary disk offline
  – Maintenance Center
  – Checksums
  – Background media scans
  – RAID scrub

Storage Resiliency – RAID-DP

• RAID-DP is dual-parity data protection
• NetApp RAID-DP is an implementation of the industry-standard RAID 6 as defined by SNIA
  – The SNIA definition was recently updated to include NetApp RAID-DP: “Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods, including dual check data computations (parity and Reed Solomon), orthogonal dual parity check data and diagonal parity have been used to implement RAID Level 6”

Storage Resiliency – RAID-DP

• Impact on usable capacity is zero
  – Default RAID group sizes with RAID-DP are double those of RAID 4
  – Result: even though an extra parity disk is used, the net result is the same number of data disks
  – Example: RAID 4 (7D+1P)+(7D+1P) vs. RAID-DP (14D+2P); both yield 14 data disks (checked in the sketch below)
• Comparable performance to RAID 4
  – Typically a 1–3% impact on performance
  – Competitor RAID 6 implementations typically see significant degradation on writes when compared to their RAID 5
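The capacity arithmetic is easy to verify (a throwaway Python sketch of the example above):

    # Same disk count, same data disks: two RAID 4 groups vs. one RAID-DP group.
    def data_disks(total_disks, group_size, parity_per_group):
        groups = total_disks // group_size
        return total_disks - groups * parity_per_group

    print(data_disks(16, 8, 1))   # RAID 4, (7D+1P)+(7D+1P): 14 data disks
    print(data_disks(16, 16, 2))  # RAID-DP, 14D+2P:         14 data disks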

Storage Resiliency – RAID-DP

• RAID-DP provides extra protection over single-parity solutions
• Within the disk drive industry:
  – Disk drives are much larger
  – Disk drive error-correction capability and reliability have not improved at the same rate
  – More data is read while reconstructing a RAID group
  – This increases the likelihood of an unrecoverable error when RAID is not available to correct it
• This leads to a higher risk of data loss from a single disk failure followed by an unrecoverable media error during reconstruction (MEDR), or from a double disk failure
• This is a disk-drive-industry risk that RAID-DP protects against

Storage Resiliency – RAID-DP

[Chart: data-loss risk for SATA vs. FC drives]

RAID-DP / RAID 6: so what? If a customer requires ATA storage, then RAID 6 surely has to be mandatory! The competitors' solution is to sell MORE DISK!

Lost Write Protection - Anatomy Of A Lost Write

• How a “lost” write occurs:
  1. A disk malfunction occurs during a write
  2. The disk reports a successful write
  3. In reality, no write happened (it was silently dropped) or the data was written to a random location (lost)
  4. A subsequent read* of the blocks returns bad data
  5. Result: data corruption or data loss

* Note: No checksum mismatch occurs (the existing data still matches its checksum, since neither was updated), so no error is reported. A RAID scrub detects and fixes the parity inconsistency, but without detecting or recovering the lost-write data.

Lost Write Protection

Step 1: Write a new block
• Write data to a free block (ID 1234)
• The block ID is stored in the checksum area
• WAFL tracks the block ID

Step 2: Write updated data
• Update the data in block 1234 and write it to a new location
• The previously free target block had ID 0000; it changes to block ID 1234

Step 3: Read data from disk
• Verify the data block's ID against the ID tracked by WAFL
• An incorrect ID indicates a lost write has occurred
• Re-create the data from parity (sketched below)
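A minimal sketch of that read-path check (hypothetical structures, not the actual WAFL code):

    # On-disk blocks: location -> (data, block ID stored in the checksum area).
    blocks = {}
    # Block IDs WAFL tracks independently for each location.
    wafl_expected = {}

    def write_block(location, data, block_id):
        blocks[location] = (data, block_id)
        wafl_expected[location] = block_id

    def read_block(location):
        data, stored_id = blocks[location]
        if stored_id != wafl_expected[location]:
            # The IDs disagree, so a write was silently lost:
            # rebuild the block from parity instead of returning bad data.
            return f"<block {location} re-created from parity>"
        return data

    write_block(7, "new data", 1234)
    blocks[7] = ("stale data", 0)   # simulate a lost write (old ID 0000 remains)
    print(read_block(7))            # the mismatch is detected on read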

[Diagram: RAID stripe (D D D D P DP) with per-block WAFL IDs and checksums]

Benefit: high data integrity

Momentary Disk Offline

• A feature where RAID temporarily suspends I/O to a drive
  – Available with Data ONTAP 7.0 for non-disruptive disk firmware upgrades
  – Disk offline for SATA drive spasm recovery supported (7.0.1)
  – Aggregate requirements:
    • RAID-DP or a mirrored aggregate (SyncMirror)
    • Allowed only if the RAID group is in a normal/restricted state and no disk copy is in progress in the group

Momentary Disk Offline

[Diagram: RAID group Data1–Data6, Parity1, Parity2]

Step 1: Trigger detection
• Firmware upgrade
• SATA spasm errors
• FC timeout errors
• Media/head errors

Step 2: Drive offline
• Reads are served from parity
• Writes are logged
• Execute the action (firmware update, error recovery, power cycle)

Step 3: Bring the drive online
• Recovery test (dummy I/O)
• Re-sync the logged writes
• Re-activate I/O to the drive (see the sketch below)
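The offline/log/re-sync flow in miniature (an illustrative Python sketch, not Data ONTAP internals):

    class MomentarilyOfflineDrive:
        def __init__(self):
            self.online = True
            self.log = []                  # writes deferred while the drive is out

        def take_offline(self):            # Step 2: suspend I/O to the drive
            self.online = False

        def write(self, block, data, drive_store):
            if self.online:
                drive_store[block] = data
            else:
                self.log.append((block, data))   # log writes for later

        def bring_online(self, drive_store):     # Step 3
            for block, data in self.log:         # re-sync logged writes
                drive_store[block] = data
            self.log.clear()
            self.online = True

    store = {}
    d = MomentarilyOfflineDrive()
    d.take_offline()
    d.write(0, "update during firmware upgrade", store)
    d.bring_online(store)
    print(store)   # the deferred write has been replayed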

Maintenance Center

Step 1: Predict failure → Step 2: Run diagnostics → Step 3: Fix problems

• Provides additional storage resiliency
• Predictive and preventative techniques keep system health at its peak
• Customer benefits:
  – Fewer storage-related issues
  – Lower IT management costs

Checksums

• Fibre Channel drives: Block Checksums (BCS)
  – 8 × 520-byte sectors per 4 KB block
  – 512 bytes of data plus 8 bytes of checksum per sector

• SATA drives: BCS emulation
  – 9 × 512-byte sectors per 4 KB block
  – 512 bytes of data per sector, with 64 bytes of checksum in the 9th sector (the arithmetic is checked below)

How checksums work:
• Verify every 4 KB block's checksum on read:
  – Read the data
  – Re-calculate the checksum
  – Compare it to the stored checksum
  – If needed, re-create the block from parity
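The sector arithmetic behind both layouts, as a quick Python check:

    # FC drives (BCS): 8 sectors of 520 bytes each hold one 4 KB block.
    fc_bytes = 8 * 520            # 4160 bytes on disk
    print(8 * 512, 8 * 8)         # 4096 data bytes + 64 checksum bytes

    # SATA drives (BCS emulation): 9 standard 512-byte sectors per 4 KB block.
    sata_bytes = 9 * 512          # 4608 bytes on disk
    print(8 * 512)                # 4096 data bytes; the 9th sector carries
                                  # the 64-byte checksum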

Background Media Scans

Step 1: Scan for media errors
• Begin scanning at disk block 0
• Uses SCSI Verify
• 128-block (512 KB) verify request size

Step 2: Detect and fix errors
• Looks for latent defects
• The drive marks the bad block
• Data is reconstructed from parity
• Re-allocated to an available block

Step 3: Complete the scan
• Continue scanning all blocks
• A background verify process with no performance impact
• Fixed scan rate (sectors/sec)


RAID Scrubs

Step 1: Scrub disks
• Issue reads to all disks in the RAID group
• Scan for media defects
• Verify checksums
• Compute parity (see the sketch below)

Step 2: Detect and fix errors
• Fix checksum errors
• Fix parity errors
• Fix media errors

Step 3: Complete the scan
• Runs 6 hours/week by default
• The schedule is configurable
• Can resume if interrupted
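A minimal parity-check sketch of the scrub's compare step (single-parity XOR shown for brevity; RAID-DP adds a second, diagonal parity):

    from functools import reduce

    def scrub_stripe(data_blocks, stored_parity):
        # Recompute parity as the XOR of all data blocks in the stripe.
        computed = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                          data_blocks)
        if computed != stored_parity:
            return computed          # Step 2: rewrite the bad parity block
        return stored_parity         # stripe is consistent

    d = [bytes([1, 2]), bytes([3, 4]), bytes([5, 6])]
    p = bytes([1 ^ 3 ^ 5, 2 ^ 4 ^ 6])
    assert scrub_stripe(d, p) == p   # a clean stripe passes the scrub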


Increasing Flexibility in the Dynamic Enterprise

Data ONTAP™ 7.0

Infrastructure Utilization Challenges

• Overall storage utilization is low
  – Most enterprises are below 50% utilization
• Too many untapped resources
  – Static allocation
  – Suboptimal performance
  – No sharing of resources

Industry Trends

• Disk capacity is growing
• More disks are being used to address performance
• Control requirements drive volume granularity
• Differing data types need different management
• The size of data units is growing unevenly

The result is an increasing mismatch between the tools and the building blocks.

[Diagram: RAID groups RG1–RG3 inside an aggregate]

Aggregates and FlexVol™ Volumes: How They Work

1. Create RAID groups
2. Create the aggregate
3. Create and populate each flexible volume (vol1, vol2, vol3)

• There is no preallocation of blocks to a specific volume
• WAFL® allocates space from the aggregate as data is written (see the sketch below)
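A toy model of that container-level allocation (illustrative classes, not the WAFL implementation):

    class Aggregate:
        def __init__(self, free_blocks):
            self.free = free_blocks          # one shared pool of blocks

        def allocate(self, n):
            if n > self.free:
                raise RuntimeError("aggregate out of space")
            self.free -= n
            return n

    class FlexVol:
        def __init__(self, aggregate):
            self.aggregate = aggregate
            self.used = 0                    # nothing preallocated

        def write(self, n_blocks):
            # Space is drawn from the aggregate only as data is written.
            self.used += self.aggregate.allocate(n_blocks)

    aggr = Aggregate(free_blocks=1000)
    vol1, vol2, vol3 = FlexVol(aggr), FlexVol(aggr), FlexVol(aggr)
    vol1.write(100)
    vol2.write(50)
    print(aggr.free)   # 850 blocks still available to all three volumes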

Flexible Volumes Improve Utilization

14 × 72 GB disks = 1 TB capacity. With traditional volumes:
• Vol0 (root): 1 GB max, on its own data + parity disks
• A 200 GB database on a volume of 7 data disks, plus parity
• A 3-disk volume for home directories / shares
• 1 hot spare
• Wasted: 140 GB + 370 GB + 40 GB = 550 GB of wasted space

Flexible Volumes Improve Utilization (with an aggregate)

14 × 72 GB disks = 1 TB capacity. With one aggregate (11 data disks, 2 parity, 1 hot spare):
• Vol0 = 1 GB max
• 200 GB database created
• A volume for home directories / shares
• Vol0, the database and the home directories all draw from the same aggregate
• 400 GB used; 600 GB of free space!

Benefits

• Flexibility

• Utilization

• Performance

• Cloning

FlexVols™: Enabling Thin Provisioning

Physical storage: 1 TB. FlexVols provisioned: 2 TB (see the sketch below).
• Container-level soft allocation: FlexVols of 300 GB, 200 GB, 200 GB, 150 GB, 100 GB and 50 GB
• Application-level soft allocation: LUNs (e.g. 1 TB provisioned, 400 GB in use)
• Container level: flexible provisioning and better utilization
• Application level: higher granularity; contains application over-allocation
• Separates physical allocation from the space visible to users
• Increases control of space allocation
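A thin-provisioning sketch using the slide's numbers (the "written" figures are assumptions for illustration):

    physical_tb = 1.0
    # Space promised to applications (TB): six FlexVols plus a 1 TB LUN.
    provisioned = [0.3, 0.2, 0.2, 0.15, 0.1, 0.05, 1.0]
    # Space actually written so far (TB) -- assumed figures.
    written = [0.1, 0.08, 0.05, 0.04, 0.02, 0.01, 0.4]

    print(sum(provisioned))        # 2.0 TB visible to users
    print(round(sum(written), 2))  # 0.7 TB physically consumed
    # The aggregate only needs to cover what is written, so 1 TB of real
    # disk can back 2 TB of provisioned space until usage approaches 1 TB.
    assert sum(written) <= physical_tb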

Data Availability: WAFL Snapshots

Causes of unplanned downtime (source: Gartner Group, 2005):
• Technology failures: 20%
• Operator errors: 40%
• Application errors: 40%

How N Series addresses them:
• Fewer components, redundant components, cluster failover, SnapMirror™ for DR
• Appliance simplicity, ease of management, plug-and-play, low product complexity
• Multiple point-in-time copies with low overhead
• Fast recovery of an entire file system or database

• A Snapshot is a reference to a complete point-in-time image of the volume’s file system, “frozen” as read-only.

• Taken automatically on a schedule or manually

• Readily accessible via “special” subdirectories

• 255 snapshots concurrently for each file system, with no performance degradation.

• Snapshots cover a large portion of the “oops!” cases for which backups are normally relied upon:
  – Accidental data deletion
  – Accidental data corruption

• Snapshots use minimal disk space (~1% per Snap)

Snapshots Defined

[Diagram: FileA.dat block pointers before and after a Snapshot, showing the file-read and file-write paths]

• A Snapshot only copies the pointers to the blocks
• On an update, only the changed block is written back to disk (in a new location)
• The previous block is maintained for the Snapshot version

Snapshots from Other Storage Vendors

How not to snapshot: copy-on-write!

[Diagram: FileA.dat block pointers, showing the file-read and file-write paths]

The snapshot copies the pointers to the blocks, but every subsequent update costs three operations (compared in the sketch below):
• Step 1: The original block must be moved
• Step 2: The snapshot index is updated
• Step 3: The new block is written to disk
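The write-cost difference as a rough Python model (one I/O per step; counts are illustrative):

    def wafl_redirect_on_write(updates):
        # New data goes straight to a free block; the old block simply
        # remains in place for the Snapshot. One write per update.
        return updates

    def copy_on_write(updates):
        # Move the original block, update the snapshot index, then write
        # the new block: three operations per update.
        return updates * 3

    print(wafl_redirect_on_write(100))  # 100 I/Os
    print(copy_on_write(100))           # 300 I/Os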

Data Availability: SnapRestore

SnapRestore Defined

• SnapRestore reverts an entire volume (file system) to any previous online Snapshot
  – Makes the Snapshot the new active file system
• Instant recovery (no reboot)*
• Particularly compelling for database contents or software-testing situations

* Except when restoring the root volume

N Series SnapRestore

Step 1: The Snapshot's volume index is set as the master; the current volume pointers are removed, and redundant blocks are flagged as available.

The volume is restored in seconds, with no performance impact!

SnapRestore from Other Vendors (if the functionality exists at all)

The competition's volume restore:
• Step 1: The volume index is restored; the data is inconsistent!
• Step 2: Blocks are copied back from the snapshot area

SnapRestore for Databases

Provides a unique solution to database recovery. Rather than restoring large amounts of data from backup tape:
1. Simply revert the entire volume back in time to its state when the Snapshot was taken
2. Then play the change logs forward to complete the recovery

• Effectively protects data without expensive mirroring or replication
• Use SnapRestore where the time to copy data from either a Snapshot or tape into the active file system is prohibitive

How many of your customers talk about recovery issues?

RTO and RPO

• Recovery Time Objective (RTO): how quickly service must be restored
• Recovery Point Objective (RPO): how much data loss is tolerable

Positioning SnapManager Products

[Chart: RTO and RPO plotted from weeks (W), days (D) and hours (H) down to minutes (M)]

• File data: Snapshot / SnapRestore
• E-mail: SnapManager for Exchange
• SQL: SnapManager for SQL
• Oracle: SnapManager for Oracle

Solutions to Meet Customer Challenges

Information Lifecycle Management

Solutions to Meet Customer Challenges

• Simplify for the lowest TCO
• Best-of-breed solutions
• Customer satisfaction

Four solution areas: storage consolidation, backup & recovery, regulatory compliance, disaster recovery.

Storage Consolidation

• High-availability storage
• Effortless, large-scale server consolidation
• Pooled storage with non-disruptive expansion
• Heterogeneous file sharing
• Simplified data management
• Seamless integration with existing software and hardware

[Diagram: primary storage tiered with nearline storage]

Backup and Recovery

• Simplified, centralized backup and restore
• Perform remote backups and restores locally
• Instantaneous access to backup data (file format)
• Uses significantly less storage
• Eliminates backup-window problems (back up hourly)

[Diagram: UNIX and Windows servers in Chicago, San Francisco, NY and London backing up over the WAN with SnapVault™ to nearline storage in the data center and at a remote site]

Regulated Data

• UK Data Protection Act, European Union Directive 95/46
• SEC Rule 17a-4 (broker-dealers)
• DoD 5015.2 (government)
• HIPAA (healthcare)
• 21 CFR 11 (life sciences / pharmaceuticals)
• Sarbanes-Oxley (public companies over $75M cap.)
• FSA Handbook (UK financial services)
• BSI DISC PD 0008 (evidential weight, code of practice)
• Basel II Accord
• Freedom of Information Act 2000

…to name just a few!

Regulatory Compliance

A comprehensive solution:
• Data permanence
• Data security
• Increased data protection
• Retention-date support
• Easy to integrate with existing applications
• Meets requirements for SEC 17a-4, HIPAA, DoD 5015.2, GoBS, and more

Unmatched flexibility:
• Runs on all platforms
• Systems can store compliance and non-compliance data

[Diagram: database or e-mail archival accesses data and moves it to WORM volumes; SnapMirror® replicates over the WAN between nearline storage in the data center (Chicago, Tokyo, London) and a remote site]

Disaster Recovery

• Mirror sites for rapid disaster recovery
• Remote-site users fail over to the mirrored site automatically
• A single solution for sync, async and semi-sync replication
• Runs across all platforms
• Cost-effective for remote sites
• An economical DR solution

[Diagram: primary and nearline storage in Chicago, Tokyo and London replicating over the WAN with SnapMirror™ to a DR site]

FlexClones

Data ONTAP™ 7.0

Infrastructure Utilization Challenges

• Overall storage utilization is low
  – Most enterprises are below 50% utilization
• Too many untapped resources
  – Static allocation
  – Suboptimal performance
  – No sharing of resources

FlexClone™ Software

• Enables multiple, instant data set clones with no storage overhead

• Provides dramatic improvement for application test and development environments

• Renders competitive methods archaic

FlexClone™ Volumes: Ideal for Managing Production Data Sets

• Error containment
  – Bug fixing
• Platform upgrades
  – ERP
  – CRM
• Multiple simulations against a large data set
  – ECAD
  – MCAD
  – Oil and gas

The Pain of Development

• Production volume (200 GB), plus full copies for pre-production, QA, dev, test and sandbox (200 GB each)
• A 1.4 TB storage solution with only 200 GB free
• Creating copies of the volume requires processor time and physical storage

Clones Remove the Pain

• Production volume (200 GB), with pre-production, QA, dev, test and sandbox volumes as clones
• The same 1.4 TB storage solution now has 1 TB free
• Create clones of the volume: no additional space is required
• Start working on the production volume and the cloned volumes
• Only changed blocks get written to disk (see the sketch below)!
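A quick space model of the example above (the per-clone change figure is an assumption for illustration):

    parent_gb = 200
    clones = 5                    # pre-prod, QA, dev, test, sandbox
    changed_gb_per_clone = 10     # assumed divergence after some work

    full_copies = parent_gb * (1 + clones)                  # the old way
    with_flexclone = parent_gb + clones * changed_gb_per_clone

    print(full_copies)      # 1200 GB of physical storage
    print(with_flexclone)   # 250 GB: clones share unchanged blocks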

In an Ideal IBM N Series World….

• A primary production array is mirrored to a secondary array with SnapMirror
• Create clones from the read-only mirrored volume
• This removes the development workload from production storage!

IBM N Series / EMC / HP

• NetApp: FAS250, FAS270, FAS3020, FAS3050, FAS3070, FAS6030, FAS6070
• EMC: AX150/S, CX-20, CX-40, CX-80, DMX family, Celerra NS Series, Centera
• HP: DL380, MSA1000, MSA1500, MSA1500cs, EVA4/6000, EVA8000, XP family, RISS

Data ONTAP™ Operating System – SAN, NAS, iSCSI
• One architecture
• One application interface
• One management interface
• Total interoperability

A Real-World Scenario

• The customer is looking for a scalable platform to support future growth
• They need to consider disaster recovery options
• And they have a requirement for a compliance solution

Scalability

IBM N Series / EMC / HP:
• IBM N Series: N3700, N5200, N5500, N5600, N7600, N7800
• EMC: AX150/S, CX-20, CX-40, CX-80, DMX family, Celerra NS Series, Centera
• HP: DL380, MSA1000, MSA1500, MSA1500cs, EVA4/6000, EVA8000, XP family, RISS

Interoperability

[The same IBM N Series / EMC / HP product landscape as above]

Compliance

[The same IBM N Series / EMC / HP product landscape as above]

A Real-World Scenario

• The customer is looking for a scalable platform to support future growth
  – N Series systems scale from the entry level to the enterprise
• They need to consider disaster recovery options
  – Any N Series system can replicate to any other N Series system
  – Natively over IP or FC
  – The environment can be a mix of FC-SAN and iSCSI
• And they have a requirement for a compliance solution
  – SnapLock can be added to any N Series system

And don't forget FlexClone!

Addressing Today’s Challenges

• Explosive data growth
• Do more with less
• Scale the infrastructure
• 24x7 global access
• Data security & compliance

Consolidate storage. Operate everywhere. Protect your business.