Appendix A: Introduction to Storage



DAS - Direct Attached Storage

Products used include devices such as vanilla SCSI hard drives and on-board RAID arrays

DAS connects directly to a single server. Clients on the network must have access to this server to use the storage device.

The server handles storage and retrieval of data files as well as applications such as email or databases.

DAS uses a lot of CPU power and requires even more CPU resources for sharing with other machines.

Purchasing too much storage in advance leads to an imbalance of storage between servers and a waste of storage resources.

Lacks features like snapshots, replications, and management

SAN – Storage Area Network

A network whose primary purpose is to transfer data between computer systems and storage elements as well as between storage elements.

A SAN consists of a communication infrastructure that provides physical connections and a management layer that organizes the connections, storage elements, and computer systems so that data transfer is secure and robust.

The term SAN is usually (but not necessarily) identified with block I/O services rather than file access services.


DAS characteristics

Difficult to manage

Limited functionality

Poor asset utilization

Trapped or captive storage

Limited scalability

Dell EqualLogic’s advantage:

Intelligent storage platform

Storage virtualization, load-balancing

Multi-server access – DAS is local to server

Enterprise data services – all included

The financial, operational, and management benefits of SAN


How a NAS Operates

NAS devices identify data by file name and byte offset, and transfer file data or "metadata" (a file's owner, permissions, creation date, etc.). NAS devices provide file sharing.

File systems are managed by the NAS processor. Performance can be affected by file assembly/disassembly from block I/O operations.

Not all applications are supported, such as some back-up and anti-virus agents.

Booting off NAS requires complex software and is not supported by many operating systems, including Microsoft Windows.

Characteristics

Suitable for applications like file sharing (NFS/CIFS)

Limited support for applications (e.g., databases)

Can be susceptible to viruses/attacks

Performance issues: file access vs. block access

Requires specialized back-up software support (NDMP)

Major Players:

Network Appliance

EMC (Celerra)

Blue Arc

Dell EqualLogic’s Advantage:

The PS Series array is a SAN that allows any flavor of NAS (if desired) by utilizing low-cost NAS heads (gateways).


A SAN is a high-performance network dedicated to storage. It provides any-to-any connectivity for the resources on the SAN; any server can potentially talk to any storage device, and communication between storage and SAN devices (switches, hubs, routers, bridges) is enabled.

SANs employ fiber optic and copper connections to create dedicated networks for servers and their storage systems. Due to the fast performance, high reliability, and long reach of fiber optics, the SAN makes it practical to locate the storage systems away from the servers. This opens the door to storage clustering, data sharing, and disaster planning applications, while managing the user's storage systems centrally.

SAN formal definition from the Storage Networking Industry Association (SNIA):

"A network whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements. A SAN consists of a communication infrastructure which provides physical connections and a management layer which organizes the connections, storage elements, and computer systems so that data transfer is secure and robust." --SNIA Technical Dictionary, copyright Storage Networking Industry Association, 2000

SANs address data by LUNs and Logical Block Addresses (LBAs). While SAN devices can be shared by many client servers, the data is typically not shared except within clusters. File systems are managed by the client servers, not by the SAN devices.

Performance is optimized for block I/O operations. Applications run on the client servers and therefore are all supported. Servers can boot off a SAN and can be free of any local disks.
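As a rough illustration of the LUN/LBA addressing just mentioned, the sketch below converts a byte offset on a LUN into a logical block address plus an offset within that block. The 512-byte block size and the helper name are assumptions for illustration, not part of any particular array's interface.

```python
# Minimal sketch of LBA addressing; 512 bytes per block is a common,
# but not universal, assumption.

BLOCK_SIZE = 512  # bytes per logical block

def byte_offset_to_lba(byte_offset: int) -> tuple[int, int]:
    """Return (logical block address, offset within that block) for a byte offset on a LUN."""
    lba, offset_in_block = divmod(byte_offset, BLOCK_SIZE)
    return lba, offset_in_block

# Byte 1,000,000 of the LUN lives in block 1953, at byte 64 within that block.
print(byte_offset_to_lba(1_000_000))  # (1953, 64)
```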


SAN and NAS are complementary solutions, not really competing solutions.

SAN uses block I/O between server and storage.

Data is stored and retrieved on disk and tape devices

Blocks are the atomic unit of data recognition and protection:

Represented in binary 1s and 0s

Fixed size

A unit of application data that is transferred within a single sequence is a set of data frames with a common sequence_ID corresponding to one message element, block, or information unit.

NAS uses file I/O between server and storage.

Works within a file system

File system refers to the structures and software used to organize and manage data and programs on the hard disk

Operating system dependent

Files are made up of:

Ordered sequence of data bytes,

Symbolic name identifier

Set of properties, e.g. ownership and access permissions

An application must read/write the file in order for operations to occur. Files may be created and deleted. Most file systems expand or contract during their lifetimes.
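To make the block-versus-file contrast above concrete, here is a minimal sketch: one helper fetches a fixed-size block by block number from a raw disk image (standing in for SAN-style block I/O), the other reads a byte range from a named file through the file system (standing in for NAS-style file I/O). The file names and block size are made-up examples.

```python
# Contrast of block I/O (address by block number) and file I/O (address by
# file name + byte offset). "disk.img" and "report.txt" are illustrative only.

BLOCK_SIZE = 512

def read_block(device_path: str, lba: int) -> bytes:
    """Block I/O style: fetch one fixed-size block by its logical block address."""
    with open(device_path, "rb") as dev:
        dev.seek(lba * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)

def read_file_range(path: str, offset: int, length: int) -> bytes:
    """File I/O style: the file system resolves the name; we address by byte offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

# block = read_block("disk.img", lba=2048)          # SAN-style access
# data  = read_file_range("report.txt", 0, 4096)    # NAS-style access
```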


Most customers that currently have DAS are now considering moving to a SAN.

The arguments for moving from DAS to SAN are that SANs:

Facilitate easy and dynamic storage growth with little or no disruption to users and applications.

Reduce significantly the total cost of ownership (TCO).

Better utilize storage and allow for storage consolidation.

Provide better management and control.

Provide the ability to utilize new technologies such as:

Clustering

Better and more efficient backup utilities

Replication and snapshots


Fibre Channel (FC) SANs

Fibre Channel solves cabling problems associated with parallel SCSI disk arrays

Installation base in high-end data centers and large enterprise networks

Expensive technology

Expensive skills required to implement, design, manage, etc.

Interoperability among FC solutions is an issue

Characteristics

Costly to deploy, often requires professional services

Complex to set up, configure, and troubleshoot

Connectivity to servers is very expensive (up to 10 times the cost of Gigabit Ethernet)

Requires support of a separate network (fiber-based)

Interoperability issues.

Major Players:

EMC/DELL

HP

IBM

iSCSI SAN utilizes iSCSI (standards-based IP technology)

iSCSI is a globally adopted standard (an IP-based protocol)

Simple to use, manage, and deploy

Cost effective with excellent bandwidth


Fibre Channel factors to be considered

Costly

Added cost per port

Added costs for multi-pathing software, e.g. EMC's PowerPath at $2,500/host

Redundancy for HA requires multiple fabrics

Changes are disruptive

Requires professional services to manage

New technology requires huge amounts of retraining

iSCSI factors to consider

Low cost

Lower cost per port

Gigabit Ethernet switches are cheaper than FC switches

Lower cost of HBA if used

Zero cost for software initiator in most OSs

Easy to manage. No professional services required.

Highly redundant and HA at a low cost

Easy to scale and no disruptions

Security built in with TCP/IP (see the CHAP sketch after this list):

CHAP

RADIUS

LDAP

Etc.

Native and scalable over long distances without the added boxes that FC requires
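The CHAP option listed above is a challenge-handshake defined in RFC 1994: the target sends an identifier and a random challenge, and the initiator answers with MD5(identifier || secret || challenge). The sketch below shows that computation; the secret and challenge values are made up for illustration.

```python
# Minimal sketch of a CHAP (RFC 1994) response computation, as used for
# iSCSI initiator/target authentication. Values here are illustrative only.
import hashlib
import os

def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Response = MD5(identifier || secret || challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()

# Target side: issue a challenge, later verify the initiator's answer.
identifier = 0x01
challenge = os.urandom(16)
secret = b"shared-chap-secret"          # known to both initiator and target

answer = chap_response(identifier, secret, challenge)            # initiator computes
assert answer == chap_response(identifier, secret, challenge)    # target verifies
```

The point of the exchange is that the shared secret is never sent over the wire; only the hash of it, bound to a one-time challenge, crosses the network.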


SAN technology is finding favor with users for a number of reasons:

SANs can be readily and cost-effectively expanded to support more users, more raw storage, more storage devices, more parallel data paths, and more widely distributed user populations.

Though SAN connectivity is not unlimited, the 126 nodes on a SAN loop are a good start. On a SAN fabric, the total number of addressable nodes is 2^24, resulting in over 16 million nodes.

This connectivity enables storage for many servers to be consolidated on a small number of shared storage devices, which reduces costs and eases management of capital assets. SAN provides higher utilization of storage compared to DAS.

By providing a separate channel for data, a SAN can offload back-up traffic from the LAN. Beyond LAN-free back-up, there is serverless back-up, where the data is moved across the SAN from one storage unit directly to another.

Since the SAN can allow any server to access any storage unit, it provides a foundation for server clustering and eventually data sharing. The application and server operating system, however, must also support the arrangement.

Finally, many SANs are being employed in disaster recovery plans where the long reach of the SAN can be used to deliver data to a remote mirror.


SAN users can add storage facilities as required without disrupting ongoing operations. For instance, if increased disk capacity causes backup times to exceed the time window available, then additional backup units can be added to the SAN and become immediately available to all servers and the backup application.

Once server clustering is established, the number of servers devoted to a given application can be increased or decreased dynamically to match demand. This can be done manually or under the control of a transaction monitor.

The storage system's performance can be measured and tuned for optimum throughput, balancing the load among servers and storage devices. Bandwidth can also be temporarily applied to high-priority bulk data transfers typical of data warehouse and data mining applications.


Emphasis on highly available, if not entirely "nonstop," applications has drawn attention to storage as a potential point of failure. Even with fully redundant storage solutions (such as RAID mirroring), as long as storage is accessible only through the server, the server itself must also be made fully redundant. In a SAN, the storage system is independent of the application server and can be readily switched from one server to another. In principle, any server can provide failover capability for any other, resulting in more protected servers at lower cost than in traditional 2-4 server cluster arrangements.

A central pool of storage can be managed cost-effectively using SAN tools. Central management makes it easier to measure and predict demands for service, and it leverages management tools and training across a broader base of systems. Backups proceed independently of the load on application servers. The most obvious of these operations are backup and archiving, but the list also includes expansion and reallocation of storage resources and the operation of support services such as location and name resolution.


A SAN is composed of three major areas:

Hosts/Servers

Each having one or more iSCSI Host Bus Adapters (HBAs) or Network Interface Cards (NICs)

Applications

Databases

File servers

Network

Gigabit Ethernet

Gigabit Ethernet switches

Cat 5e or Cat 6 cabling

WAN capable

Storage

iSCSI-based

Disk arrays

Tape systems


SAN components

Initiator/Server

HBA – Provides the physical connection and access point to the SAN storage elements and computer systems

NIC – Server-based Network Interface Card; provides physical connectivity to the SAN

Target/Storage

Disk

Tape

Traffic between initiator and target

Usually identified with block I/O services rather than file access services.

Small Computer System Interface (SCSI)

Defines I/O buses primarily intended for connecting storage subsystems or devices to hosts through HBAs.

Originally intended primarily for use with small (desktop and desk-side workstation) computers, SCSI has been extended to serve most computing needs and is, arguably, the most widely implemented I/O bus in use today.
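Because SCSI commands are what actually carry these block requests, a small sketch of a READ(10) command descriptor block (CDB) may help: a one-byte opcode (0x28), a 32-bit starting LBA, and a 16-bit transfer length in blocks. This is a hand-built illustration only; real initiators submit CDBs through an HBA driver or iSCSI stack.

```python
# Sketch of a SCSI READ(10) CDB: 10 bytes = opcode, flags, 4-byte LBA,
# group number, 2-byte transfer length (in blocks), control byte.
import struct

READ_10 = 0x28

def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Pack a READ(10) CDB for the given starting LBA and block count."""
    return struct.pack(">BBIBHB",
                       READ_10,     # operation code
                       0,           # flags (RDPROTECT/DPO/FUA left clear)
                       lba,         # logical block address (big-endian 32-bit)
                       0,           # group number
                       num_blocks,  # transfer length in blocks
                       0)           # control

cdb = build_read10_cdb(lba=2048, num_blocks=8)
print(cdb.hex())  # '28000000080000000800'
```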


Servers can be of any variety, although UNIX and NT systems are most common. HBAs are needed to connect the server to the SAN.

Disk storage systems can be RAID or JBOD. In addition, SANs often include tape systems or optical backup devices.

SAN connections are most often provided by fiber optic cables and connections through hubs and switches. However, small-scale SANs can be implemented using copper connections.

SAN management software allows the user to control the SAN and thestorage systems it supports.


Every device connected to a SAN requires an interface or adapter board. Some units have built-in interfaces while others rely on HBAs or server-based NIC cards designed for PCI.

Adapter boards will have either integrated (soldered) components or plug-in modules. The two module types are:

SFPs. Modules or cartridges that plug into external slots on system adapter boards and into switches. They are available in copper and fiber versions. Some adapters are capable of full duplex operation for performance up to 200 MB/sec.

SNMP and MIB support for ease of remote management


RAID (Redundant Array of Inexpensive Disks) is a disk clustering technology that has been available on larger systems for many years. Depending on how you configure the array, you can have the data mirrored (duplicate copies on separate drives), striped (interleaved across several drives), or parity-protected (extra data written to identify errors). You can use these techniques in combination to deliver the balance of performance and reliability that the user requires.

Because of the high capacity (and cost) of RAID storage systems, they are good candidates for sharing across a SAN. Although you can certainly have a SAN without RAID, the two technologies are often used hand-in-hand.

The four most common RAID levels are 0, 1, 3, and 5. The following list defines these RAID levels:

Level 0: Provides data striping (spreading out blocks of each file across multiple disks) but no redundancy. Improves performance but does not deliver fault tolerance.

Level 1: Provides disk mirroring; data is written to two duplicate disks simultaneously.

Level 3: Same as Level 0, but also reserves one dedicated disk for error correction data. Provides good performance and some level of fault tolerance.

Level 5: Provides data striping at the block level along with striped (distributed) error correction information. Results in excellent performance and good fault tolerance.


RAID 0 is not a fault-tolerant RAID solution. If one drive fails, all data within the entire array is lost. It is used where raw speed is the only (or major) objective.

Provides the highest storage efficiency of all array types.

Made by grouping two or more physical disks together to create a virtual disk. This virtual disk appears as one physical disk to the host. Each physical drive's storage space is partitioned into stripes.
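A short sketch of the striping just described: with an assumed chunk size and disk count, each logical block maps to one member disk and a position on that disk. The four-disk array and 64 KiB chunk below are illustrative values, not a recommendation.

```python
# Sketch of RAID 0 block placement with a chunked (striped) layout.
# 4 disks and a 64 KiB chunk of 512-byte blocks are made-up example values.

N_DISKS = 4
BLOCKS_PER_CHUNK = 128   # 128 * 512 B = 64 KiB chunk

def raid0_locate(logical_block: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block number on that disk)."""
    chunk, offset = divmod(logical_block, BLOCKS_PER_CHUNK)
    disk = chunk % N_DISKS                 # chunks rotate round-robin across disks
    chunk_on_disk = chunk // N_DISKS       # how many chunks precede it on that disk
    return disk, chunk_on_disk * BLOCKS_PER_CHUNK + offset

# Logical block 1000 sits in chunk 7 (blocks 896-1023), which lands on disk 3.
print(raid0_locate(1000))   # (3, 232)
```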

RAID 1 provides complete protection and is used in applications containing mission-critical data. It uses paired disks where one physical disk is partnered with a second physical disk. Each physical disk contains the exact same data to form a single virtual drive.

RAID 5 uses parity information interspersed across the drive array. RAID 5 requires a minimum of 3 drives. One drive can fail without affecting the availability of data. In the event of a failure, the controller regenerates the lost data of the failed drive from the other surviving drives.
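The parity idea behind RAID 5 can be shown in a few lines: the parity block is the XOR of the data blocks in a stripe, so any single missing block can be regenerated by XOR-ing the survivors. The three-drive stripe below is a minimal sketch; real controllers also rotate the parity position from stripe to stripe.

```python
# Sketch of RAID 5 parity generation and single-drive reconstruction using XOR.
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks byte by byte."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# One stripe across a 3-data-drive + parity layout (block contents are made up).
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d0, d1, d2])

# Drive holding d1 fails: rebuild it from the surviving data blocks and parity.
rebuilt_d1 = xor_blocks([d0, d2, parity])
assert rebuilt_d1 == d1
```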

RAID 6 is an extension of RAID 5 which allows for additional fault tolerance by using a second independent distributed parity scheme (dual parity).

Data is striped on a block level across a set of drives, just like in RAID 5, and a second set of parity is calculated and written across all the drives.

RAID 6 provides extremely high data fault tolerance and can sustain multiple simultaneous drive failures. It requires N+2 drives.
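As a sketch of what "a second independent distributed parity scheme" means in practice, the example below computes the two RAID 6 syndromes the way they are commonly defined: P is the plain XOR of the data blocks, while Q is a weighted sum over the Galois field GF(2^8), which is what lets a controller solve for two missing blocks. The generator {02} and field polynomial 0x11d follow the usual RAID 6 convention; the stripe contents are made up, and the two-drive recovery math itself is omitted.

```python
# Sketch of RAID 6 dual-parity (P and Q syndrome) computation over GF(2^8).
# P = D0 xor D1 xor ... ; Q = g^0*D0 xor g^1*D1 xor ...  with g = {02}.

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) with reducing polynomial 0x11d."""
    p = 0
    for _ in range(8):
        if b & 1:
            p ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1D
        b >>= 1
    return p

def raid6_syndromes(data_blocks: list[bytes]) -> tuple[bytes, bytes]:
    """Return (P, Q) parity blocks for one stripe of equal-length data blocks."""
    length = len(data_blocks[0])
    p = bytearray(length)
    q = bytearray(length)
    g_pow = 1                                   # g^i, starting at g^0 = 1
    for block in data_blocks:
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(g_pow, byte)
        g_pow = gf_mul(g_pow, 0x02)
    return bytes(p), bytes(q)

# Example stripe with three data drives (contents are made up).
P, Q = raid6_syndromes([b"AAAA", b"BBBB", b"CCCC"])
```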


RAID 10 consists of multiple sets of mirrored drives. These mirrored drives are then striped together to create the final virtual drive.

Pros:

High levels of reliability

Can handle multiple disk failures

Provides highest performance with data protection

Striping multiple mirror sets can create larger virtual drives

Cons:

Like RAID 1, writes the information twice and thus incurs a minor performance penalty when compared to writing to a single disk

Requires an additional disk to make up each mirror set

Implementing RAID 10 results in an extremely scalable mirrored array capable of performing reads and writes significantly faster (since the disk operations are spread over more drive heads).
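The "stripe of mirrors" layout can be sketched in a few lines: a logical block is first assigned to a mirror set round-robin (the RAID 0 part) and then written identically to every disk in that set (the RAID 1 part). The two-way mirrors and four mirror sets are illustrative assumptions.

```python
# Sketch of RAID 10 placement: stripe across mirror sets, duplicate within a set.

N_MIRROR_SETS = 4        # RAID 0 stripes across these
DISKS_PER_SET = 2        # RAID 1 copies within each set (two-way mirror)

def raid10_locate(logical_block: int) -> list[tuple[int, int]]:
    """Return every (disk index, block on disk) that stores this logical block."""
    mirror_set = logical_block % N_MIRROR_SETS
    block_on_disk = logical_block // N_MIRROR_SETS
    first_disk = mirror_set * DISKS_PER_SET
    return [(first_disk + copy, block_on_disk) for copy in range(DISKS_PER_SET)]

# Logical block 10 lands on mirror set 2, i.e. identical copies on disks 4 and 5.
print(raid10_locate(10))   # [(4, 2), (5, 2)]
```

Reads can be served from either copy in a set, which is where the extra read throughput comes from; writes must go to both.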

RAID 50 combines the block striping and parity of RAID 5 with the straight block striping of RAID 0. RAID 50 is a RAID 0 array striped across RAID 5 elements.

Pros:

Ensures that if one of the disks in any parity group fails, its contents can be extracted using the information on the remaining functioning disks in its parity group

Offers better data redundancy than the simple RAID types (i.e., RAID 1 & 5)

Can improve the throughput of read operations by allowing reads to be performed concurrently on multiple disks in the set

Cons:

Slower performance than RAID 10

Slower write than read performance (write penalty)

Improves on the performance of RAID 5 through the addition of RAID 0, particularly during writes.
