
Advances in Data Storage Technologies

Thomas M. Ruwart
University of Minnesota
Digital Technology Center
Intelligent Storage Consortium
April 23, 2004
[email protected]

Overview

• Opening thoughts
• Main-stream technologies
  • Storage – where data resides when it is not being manipulated or moved around
  • Transports – how data is moved between other components and how components are physically connected together
  • Protocols – the “language” that components use to talk to each other
  • Software – the control of what happens throughout the system
• Closing thoughts

Opening Thoughts

• Technologies – The underlying components used to build products using a given architecture
• Architectures – The way technologies are put together to solve a problem
• Applications – Define the scope of requirements and associated problems to be addressed

An orthogonal thought

• There are many interesting “technologies” that can be incorporated into “products”
• Products are what sell
• This presentation describes
  • Past and current “products” and associated technologies
  • The evolution of various technologies and architectures that may or may not become products

Storage

• Disk Drives
  • ATA Disk Technology
  • SCSI Disk Technology
• Tape Technology
• DVD (optical) Technology
• MEMS
• Solid State

Disk Drives in General

• Definition
  • “Winchester” disks have the rotating rigid media and read/write heads plus actuator enclosed in an air-tight case
• Current Status
  • Disk drives have been around since 1957 (IBM RAMAC 305)
  • Areal density is about 60 Gbit/in²
  • Current capacities are in the 300 GB range for 3.5-inch ATA-class drives
  • Rotation speeds are 7200 RPM for ATA, 15000 RPM for SCSI
  • Form factors are 3.5-inch for desktop/enterprise, 2.5-inch for mobile, some 1-inch
  • Interfaces include ATA, SATA, Parallel SCSI, and FC


Disk Drives continued…

• Technology Evolution
  • Perpendicular Recording
    • Will operate in the 100-200 Gbit/in² areal density range
    • 2 to 4 years out
  • New media types
    • Patterned media
    • Tilted perpendicular media
    • Self-organized media
  • Smaller form factors – move from 3.5-inch to 2.5-inch form factors
    • Lower power requirements
    • Higher manufacturing yields
    • Higher areal densities
    • Higher RPM drives == lower access latency
  • Serial interfaces
    • Serial ATA (SATA)
    • Serial Attached SCSI (SAS)
    • Fibre Channel (FC) – 4 and 10 Gbit/sec
  • Protocols
    • Object-based Storage Device
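The link between spindle speed and access latency can be made concrete with a small calculation. The sketch below assumes the usual rule of thumb that the average rotational latency is the time for half a revolution; the function name is illustrative, and seek time is deliberately left out.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency (half a revolution), in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution * 1000.0


# Spindle speeds quoted in the slides for ATA and SCSI drives.
for rpm in (7200, 15000):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
```

Under that assumption a 7200 RPM drive waits about 4.17 ms on average for the sector to come around, and a 15000 RPM drive about 2.00 ms.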

HDD Technologies for the future

[Figure: four panels]
• Patterned media, 33 nm bits (Bruce Terris, HGST)
• Perpendicular recording
• Heat-assisted magnetic recording uses both laser and field to record (T. McDaniel, Seagate)
• Self-organized magnetic arrays (D. Weller, Seagate)

Courtesy of INSIC and Tarnotek, Inc.

Capacity Growth: Sustainable?

[Figure: HDD Capacity vs. Time, 95 mm desktop drives, 7,200 rpm. Log-scale y-axis: HDD capacity (GB), 1 to 1000. X-axis: time, Jan-98 to Jan-09. Series: All HDD, Early HDD. Exponential fit: y = 6E-28 e^(0.0018x), R² = 0.9626. Annotations: INSIC 1 Tb/inch² AD demo goal; Seagate’s plans. (C) 2003 TarnoTek]

Courtesy of INSIC and Tarnotek, Inc.

Precipitous decline in $/GB

[Figure: Cost per Gigabyte, 95 mm desktop drives, 7,200 rpm. Log-scale y-axis: $/GB, 0.01 to 100.00. X-axis: time, Jan-98 to Jan-09. Series: HGST, Maxtor, Seagate, WDC. Trend annotation: -44%/year. (C) 2003 TarnoTek]

Courtesy of INSIC and Tarnotek, Inc.
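To get a feel for what a -44%/year trend implies, the sketch below compounds that rate forward. The starting price of $1.00/GB is a hypothetical input chosen for illustration, not a figure from the chart, and the function names are made up for this example.

```python
import math

ANNUAL_CHANGE = -0.44  # the "-44%/year" trend annotation on the chart


def projected_cost_per_gb(cost_now: float, years: float,
                          annual_change: float = ANNUAL_CHANGE) -> float:
    """Compound a constant annual price change over a number of years."""
    return cost_now * (1.0 + annual_change) ** years


def halving_time_years(annual_change: float = ANNUAL_CHANGE) -> float:
    """Years needed for the price to fall to half at a constant annual change."""
    return math.log(0.5) / math.log(1.0 + annual_change)


start = 1.00  # hypothetical starting price in $/GB
for years in (1, 3, 5):
    print(f"after {years} years: ${projected_cost_per_gb(start, years):.3f}/GB")
print(f"price halves roughly every {halving_time_years():.2f} years")
```

If the trend held, the price per gigabyte would halve in a little over a year, which is why the slide calls the decline precipitous.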

The Future of Hard Disk Drive Technology
Lab Demos: Possible HDD Areal Density Progression

[Figure: Laboratory Demonstrations. Log-scale y-axis: areal density (Gb/in²), 1.0 to 100000.0. X-axis: date, Jan-90 to Jan-17. Annotations: perpendicular recording; heat-assisted mag recording; self-organized arrays ?; patterned media ?; 50% CAG rate line; 30% CAG line; “70 now, highest in products”; 1 Terabit per inch² goal]

Courtesy of INSIC and Tarnotek, Inc.
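The two growth-rate lines on the chart can be compared with a short calculation. This sketch assumes the "70 now" annotation means 70 Gb/in² in shipping products and asks how long each compound annual growth rate would take to reach the 1 Terabit/in² goal; the function name is illustrative.

```python
import math


def years_to_reach(current: float, target: float, cagr: float) -> float:
    """Years for `current` to grow to `target` at a compound annual growth rate."""
    if current <= 0 or target <= 0 or cagr <= 0:
        raise ValueError("current, target and cagr must be positive")
    return math.log(target / current) / math.log(1.0 + cagr)


CURRENT_GBIT_IN2 = 70.0   # assumed reading of the "70 now" annotation
GOAL_GBIT_IN2 = 1000.0    # 1 Terabit per square inch

for cagr in (0.50, 0.30):
    years = years_to_reach(CURRENT_GBIT_IN2, GOAL_GBIT_IN2, cagr)
    print(f"{cagr:.0%} CAGR: about {years:.1f} years")
```

Under these assumptions the 50% line reaches the goal in roughly six and a half years and the 30% line in roughly ten.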

Disk Arrays in General

• Definition
  • RAID – Redundant Array of Independent (Inexpensive) Disks
  • Aggregation of disk drives to operate as a large single storage device
  • Used to improve reliability, availability, serviceability, and performance
• Current Status
  • Disk arrays have been around since 1988
  • Interface is primarily 2 Gbit Fibre Channel
• Technology Evolution
  • Recent developments in MAID – Massive Arrays of Independent (Inexpensive) Disks
  • MAID takes advantage of smaller form factor disk drives – lots of them
  • Intended to address problems associated with multiple drive failures
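One common way a disk array survives the loss of a single drive is XOR parity, as used in parity-based RAID levels. The slides do not name a RAID level, so the sketch below is an illustrative choice rather than a description of any particular product; the block contents and function name are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks, one per drive, plus one parity block.
data = [b"drive-0!", b"drive-1!", b"drive-2!"]
parity = xor_blocks(data)

# Simulate losing drive 1 and rebuilding it from the survivors and the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

A single parity block protects against one failure only, which is one reason multiple drive failures are called out as a separate problem.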


ATA/IDE Disk Technology

• Definition
  • Inexpensive interface used for consumer-grade disk drives
  • ATA stands for AT Attachment, IDE stands for Integrated Drive Electronics. The two terms are used interchangeably
  • Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
• Current Status
  • ATA disk drives are one half to one third the cost of equivalent-capacity SCSI disks and can result in lower overall equipment costs for large deployments
  • ATA cost levels make it an attractive alternative or augmentation to tape
• Technology Evolution
  • Parallel ATA interface technology has been around for many years and has matured to the point where it is very commonplace.
  • ATA disk technology is as mature as SCSI drive technology but is a very different drive technology.
  • Serial ATA is the next evolutionary step in ATA technology and is now available in production quantities.
  • Serial ATA improves upon Parallel ATA performance, addressing, and other limitations while still maintaining the cost effectiveness of Parallel ATA.

ATA/IDE Disk Technology (cont.)

• Technology Availability
  • ATA disk interfaces are available on both standard Winchester hard disks and on CD and DVD devices.
  • ATA disk drives of various capacities and form factors are available from multiple manufacturers: Seagate, Maxtor, Western Digital, Hitachi Global Storage (formerly IBM Storage), Fujitsu, NEC, and Samsung
• Comments
  • ATA disk drives are NOT simply a SCSI disk drive with a different interface. ATA (consumer or personal storage) and SCSI (enterprise-class storage) are designed with very different goals in mind.
    • ATA disks are designed to minimize cost, maximize volume, and are intended to operate as single units.
    • SCSI disks are designed to maximize performance and reliability as well as being able to operate in arrays.
  • The Winchester disk industry is moving toward a 2.5-inch form factor very similar to the disk drives in laptop computers. One significant implication is that there will be a much larger number of disk units (roughly four times as many) to manage in an overall installation. This move toward smaller form factors is motivated by higher areal densities and media yield.
• Reference Web Sites
  • Serial ATA – www.serialata.org

SCSI Disk Technology

• Definition
  • High-performance interface used for enterprise-class disk drives
  • SCSI stands for Small Computer Systems Interface
  • Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
• Current Status
  • SCSI disk drives are principally used in disk arrays and in applications that require consistent high bandwidth, lower latency, and higher reliability than can be provided by ATA/IDE disk drives.
• Technology Evolution
  • SCSI interface technology has been around for almost 20 years and has matured to the point where it is the disk drive interface of choice for enterprise-class storage.
  • Serial Attached SCSI (SAS) is the next evolutionary step in SCSI technology and will be available in production quantities in late 2004 or early 2005.
  • SAS provides some electrical compatibility with SATA.
• Technology Availability
  • SAS has been announced by Seagate on the new 2.5-inch enterprise-class disk drive.
• Reference Web Sites
  • SCSI – www.t10.org
  • Fibre Channel – www.t11.org
  • iSCSI – www.ietf.org

Tape Technology

• Definition
  • Magnetic recording
  • Linear and helical scan
• Current Status
  • Used for backup and long-term data archival
  • High density, low cost, very durable
  • Potential high latency perhaps an issue
• Technology Evolution
  • Magnetic tape well understood – it has been around since 1953
  • Magnetic tape follows disk density and performance curves
  • LTO (Linear Tape Open) getting a lot of market share
    • Drive consortium of IBM, Seagate and HP
    • Tape manufacturers include Maxell, Fujifilm, TDK, Imation, Sony, Emtec (BASF), Verbatim, etc.
    • www.lto.org
  • IBM 3490 tape technology is the gold standard in enterprise-class tape
• Comments
  • Technology Refresh is a significant issue with large data archives

Tape Technology (cont.)

• Technology availability
  • LTO Roadmap

Generation  Available  Media      Recording Method  Native MB/s  Native GB
1           2000       MP         RLL 1,7           10-20        100
2           2002       MP         PRML              20-40        200
3           2004       MP         PRML              40-80        400
4           2006       Thin Film  PRML              80-160       800

18-24 months between generations
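A quick calculation on the roadmap figures shows how long it takes to fill one cartridge at native speed. The sketch below copies the capacities and transfer-rate ranges from the roadmap and assumes decimal units (1 GB = 1000 MB); the names are illustrative.

```python
# Native capacity (GB) and native transfer-rate range (MB/s) per LTO generation,
# copied from the roadmap above.
LTO_ROADMAP = {
    1: {"native_gb": 100, "native_mb_s": (10, 20)},
    2: {"native_gb": 200, "native_mb_s": (20, 40)},
    3: {"native_gb": 400, "native_mb_s": (40, 80)},
    4: {"native_gb": 800, "native_mb_s": (80, 160)},
}


def fill_time_hours(capacity_gb: float, rate_mb_s: float) -> float:
    """Hours to write a full cartridge, assuming 1 GB = 1000 MB."""
    return capacity_gb * 1000.0 / rate_mb_s / 3600.0


for generation, spec in LTO_ROADMAP.items():
    slow, fast = spec["native_mb_s"]
    best = fill_time_hours(spec["native_gb"], fast)
    worst = fill_time_hours(spec["native_gb"], slow)
    print(f"Generation {generation}: {best:.2f} to {worst:.2f} hours per cartridge")
```

Because capacity and transfer rate both double with each generation, the time to fill a cartridge stays roughly constant across the roadmap.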

DVD

• Definition
  • Digital Versatile Disc
• Current Status
  • Long term data archival
  • High latency data
• Technology Evolution
  • Capacity
    • Now - 4.7 GB per side, 9.4 GB per disk double sided
    • Soon - 27 GB per side (blue-laser)
    • Ultimately - 50 GB per side
  • Transfer rates
    • Current write = 3.3 MB/sec, read = 10.8 MB/sec
    • Assume technology will advance (i.e. like CDRW)
• Technology Availability
  • Drives and robotics are common
• Comments
  • Long media life, reasonably durable
  • Exploits the consumer market for media


MEMS-based Storage Technology

• Definition
  • MEMS - Micro Electro Mechanical Systems
• Current Status
  • Still in the technology demonstration mode
• Technology Evolution
  • Research at CMU
• Technology Availability
  • Not available for another few years
• Reference website
  • www.pdl.cmu.edu

Solid State Storage Technology

• Definition
  • Non-volatile for permanent data storage
• Current Status
  • Most popular types
    • CompactFlash (CF)
    • Smart Media (SM)
    • Secure Digital Memory Card (SD)
    • Memory Stick (Sony)
    • USB Memory devices
• Technology Evolution
  • Read/write/re-write
  • Slow write speeds, moderate read speeds
  • Driven by the consumer electronics market
  • Continue to follow the density curves of integrated circuits
  • Factor of 10-100 times the price/MB of magnetic storage
• Technology Availability
  • Consumer products

Transports

• Fibre Channel
• Gigabit Ethernet
• TCP Offload Engines (TOEs)
• System Area Networks

Fibre Channel

• Definition
  • High speed data transport (physical, encoding, framing protocol)
  • Extensible to miles (100 km) with special GBICs and special 9-micron single-mode fiber
• Current Status
  • Currently the Storage Area Network interconnect of choice
• Technology Evolution
  • 1 and 2 Gbit shipping on disk drives and arrays
  • 4 Gbit may be shipping on disk drives in near future
  • 10 Gbit/sec standard is mature and is intended for disk arrays
• Vendors/Products
  • Multiple
• General Reference Web Sites
  • Fibre Channel Industry Association (www.fibrechannel.org, www.t11.org)
• Comments
  • Synchronization of HBAs, switches and storage
  • Technical Committee T11 (www.t11.org)
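The "encoding" layer mentioned in the definition has a direct effect on usable bandwidth. The sketch below assumes the commonly cited line rates for 1, 2 and 4 Gbit Fibre Channel and their 8b/10b encoding (10 line bits per 8 data bits); these figures are not in the slides, and framing overhead is ignored, so treat the results as upper bounds.

```python
# Assumed line rates in baud for 1, 2 and 4 Gbit Fibre Channel (not from the slides).
LINE_RATE_BAUD = {
    "1 Gbit FC": 1.0625e9,
    "2 Gbit FC": 2.125e9,
    "4 Gbit FC": 4.25e9,
}

DATA_BITS, LINE_BITS = 8, 10  # 8b/10b: every 8 data bits travel as 10 line bits


def payload_mb_per_s(line_rate_baud: float) -> float:
    """Upper bound on payload rate in MB/s after 8b/10b encoding overhead."""
    data_bits_per_s = line_rate_baud * DATA_BITS / LINE_BITS
    return data_bits_per_s / 8 / 1e6


for name, baud in LINE_RATE_BAUD.items():
    print(f"{name}: up to {payload_mb_per_s(baud):.2f} MB/s before framing overhead")
```

This is why a "1 Gbit" link is usually quoted at about 100 MB/s of usable throughput rather than 125 MB/s.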

Gigabit Ethernet

• Definition
  • Ethernet at speeds of 1,000 Mbit/sec (1GigE) to 10,000 Mbit/sec (10GigE)
• Current Status
  • Ethernet is the network transport of choice
  • 1GigE is the overwhelming favorite for high-speed LANs
  • 10GigE close behind
• Technology Evolution
  • IEEE 802.3, with 802.3ae defining the 10 Gbit/sec modifications (6/17/02)
• Technology Availability
  • Cisco Catalyst 6500 Serial 1550nm 10 Gigabit Ethernet Module
  • Intel® PRO/10GbE LR Server Adapter
  • Intel® TXN17401 Optical Transceiver
• Reference Web Sites
  • www.10gea.org/index.htm
• Comments
  • 10GigE products are currently shipping

TOE

• Definition
  • A TOE (TCP Offload Engine) is a chip or a board that handles the TCP protocol stack without utilizing host CPU resources
• Current Status
  • Reduces processing load on nodes that are connected via GigE and handle heavy traffic through this interface
  • Some TOEs have iSCSI protocol engines that accelerate the SCSI command protocol processing for iSCSI storage devices
• Technology Availability
  • Adaptec AIC-7211 1Gb ASIC with full TCP/IP offload
  • Lucent TA1000 manufactured by Intel


System Area Networks

• Definition
  • A network used to connect nodes together within a single system (computer room environment) that has the following operational characteristics
    • Very Low Latency (~1 µsec)
    • High Bandwidth (>1 Gigabit/sec)
    • Support for Atomic Operations
    • Remote DMA capability (RDMA)
    • Low Overhead
    • Allow the construction of NUMA-like systems with *standard* hardware
• Current Status
  • System Area Networks are being employed as the interconnect for compute and storage clusters

[Figure: SAN Switch]
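The emphasis on both latency and bandwidth can be illustrated with a simple first-order cost model: the time to deliver a message is the fixed latency plus the time to push its bits through the link. The sketch below uses the ~1 µsec and 1 Gigabit/sec figures from the list above as defaults; the model itself and the message sizes are assumptions for illustration.

```python
def transfer_time_s(message_bytes: int, latency_s: float = 1e-6,
                    bandwidth_bit_s: float = 1e9) -> float:
    """First-order model: fixed latency plus serialization time."""
    return latency_s + (message_bytes * 8) / bandwidth_bit_s


def latency_share(message_bytes: int, latency_s: float = 1e-6,
                  bandwidth_bit_s: float = 1e9) -> float:
    """Fraction of the total transfer time spent on fixed latency."""
    return latency_s / transfer_time_s(message_bytes, latency_s, bandwidth_bit_s)


for size in (64, 4096, 1_000_000):  # hypothetical message sizes in bytes
    total_us = transfer_time_s(size) * 1e6
    print(f"{size:>9} bytes: {total_us:10.3f} µs total, "
          f"{latency_share(size):.1%} of it latency")
```

For small messages such as cluster synchronization traffic the fixed latency dominates, while large transfers are limited almost entirely by bandwidth.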

System Area Networks (cont.)

• Technology Evolution
  • Two companies, Myricom and Quadrics, have been producing system area networks for a number of years with some success.
  • Several other companies have more recently come up with products based on VIA (Virtual Interface Architecture), but these devices support only RDMA and no atomic operations.
  • The most widely accepted system area network technology is InfiniBand, with more than 200 companies building various pieces of that technology.
  • Other companies such as AMD and Motorola have developed competing technologies, HyperTransport and Rapid I/O respectively, originally intended to be restricted to within the confines of a single “box”, but have since defined connectors and cables to allow them to interconnect boxes as well.

System Area Networks (cont.)

• Technology Availability
  • InfiniBand is available from a variety of companies. It is best to go to the InfiniBand Trade Association website to see who these companies are and what categories their products fall under (e.g., switches, NICs, software, etc.)
  • HyperTransport is slightly ahead of InfiniBand on the maturity curve but also has slightly different applicability.
  • Rapid I/O is on the same track as HyperTransport and is available from several vendors.
  • VIA hardware adapters are available in different forms from Qlogic, Emulex, Intel, and Troika Networks
  • Quadrics is a proprietary system area network and all the adapters and relevant software are available from Quadrics.
  • Myrinet hardware (NICs and switches) is available from Myricom; however, *all* the associated software, interface protocols, and APIs are publicly available.

System Area Networks (cont.)

• Comments
  • Evolving storage system architectures will incorporate System Area Networks
  • Storage “devices” will become peers on the System Area Network
• Reference Web Sites
  • InfiniBand – www.infinibandta.org
  • HyperTransport – www.hypertransport.org
  • Virtual Interface Architecture (VIA) – www.intel.com/design/servers/vi
  • Myrinet – www.myri.com
  • Quadrics – www.quadrics.com

Protocols

• iSCSI
• OSD
• NFS/CIFS

iSCSI

• Definition
  • Internet Small Computer Systems Interface (iSCSI) protocol
  • Encapsulated SCSI over IP
• Current Status
  • Important protocol for block storage applications
  • Can be thought of as an inexpensive alternative to Fibre Channel
• Technology Evolution
  • Number of early release products
    • Some driver based
    • Some with specialized hardware (TCP offload)
    • Limited commercial success
  • Security and discovery remain a problem
  • Draft IETF iSCSI
    • www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-14.pdf
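The idea of "encapsulated SCSI" can be sketched in a few lines: a SCSI command is just a small byte string (a command descriptor block, or CDB) that is wrapped in a transport frame and carried over TCP/IP. The CDB layout below follows the standard READ(10) command; the two-byte length-prefixed frame is a deliberately simplified, hypothetical stand-in and is NOT the real iSCSI PDU format defined in the IETF draft.

```python
import struct

READ_10 = 0x28  # operation code of the SCSI READ(10) command


def build_read10_cdb(lba: int, num_blocks: int) -> bytes:
    """Build a 10-byte READ(10) CDB: opcode, flags, 32-bit LBA,
    group number, 16-bit transfer length, control byte."""
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, num_blocks, 0)


def toy_encapsulate(cdb: bytes) -> bytes:
    """Hypothetical framing (length prefix + CDB); not the real iSCSI PDU."""
    return struct.pack(">H", len(cdb)) + cdb


def toy_decapsulate(frame: bytes) -> bytes:
    """Recover the CDB from the hypothetical frame."""
    (length,) = struct.unpack(">H", frame[:2])
    return frame[2:2 + length]


cdb = build_read10_cdb(lba=2048, num_blocks=8)
frame = toy_encapsulate(cdb)   # bytes that would travel over a TCP connection
print(len(cdb), toy_decapsulate(frame) == cdb)
```

The real protocol adds session management, sequencing, and digests on top of this basic wrapping, which is where TCP offload hardware comes in.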


iSCSI (cont.)

• Technology Availability
  • Adaptec ASA-7211 iSCSI Adapter (Mid 2002)
  • Intel Pro 1000 T IP
    • www.intel.com/network/connectivity/products/iscsi/index.htm?iid=ipp_home+netcon_iscsi&
• Reference Web Sites
  • iSCSI_network_storage.pdf
  • www.ietf.org/html.charters/ips-charter.html
• Comments
  • Replaces need for Fibre Channel, if accepted by customers and industry
  • Uphill battle - many camps for and against.

OSD – Object-based Storage Devices

• Definition
  • Object-based Storage Devices – A protocol for accessing data on storage devices
• Current Status
  • OSD can have a significant impact on helping to solve many of the issues that arise in building scalable, high performance storage systems
• Technology Evolution
  • First release of the T10 (SCSI) OSD standard specification has been submitted to the T10 committee
• Technology Availability
  • No widely available commercial products yet – some product is available from some companies
  • Many large storage companies looking into it
  • An OSD reference code is available at
    • http://www.sourceforge.net/projects/intel-iscsi

What is OSD?

• Object-based Storage Devices – An Enabling Technology
  • Grew out of the Network Attached Secure Disks (NASD) project at CMU
• A flexible and powerful protocol used to communicate with storage devices
• Proposed as a protocol extension to the SCSI command set
• Actively being pursued by the OSD Technical Working Group in the Storage Networking Industry Association (SNIA)
• It is a natural step in the evolution of storage interface protocols
• For some, however, it is very new and very different

ST506 SMD SCSI FC SCSI SCSI OSD OSD

1902 1985 1990 1998 2002? 200X

What OSD is NOT

• It is not intended or expected that the object abstraction be a complete file system
• There is NO notion of
  • Naming
  • Hierarchical relationships
  • Streams
  • File-system-style ownership access control
• The omitted features are assumed still to be the responsibility of the OS file system

The General Application: Storage Architectures Today

[Figure: three storage stacks, listed here from the application down to the device]
• Direct Attached Storage (blocks): I/O Application, File System, Storage Device
• Storage Area Network (blocks): I/O Application, File System, Interconnect, Storage Device
• Network Attached Storage (files): I/O Application, Interconnect, File System, Storage Device

Architecture defined by location of storage system & devices

OSD System Architecture

[Figure: DAS Architecture and OSD Architecture side by side. Each stack contains I/O Application, File System User Component, File System Storage Component, and Block Storage Device, with SCSI as the interface to the device]


File System Components

• User File System Component
  • Hierarchy Management
  • Naming
  • User Access Control
  • Data Properties (Attributes)
• File System Storage Component
  • Free space management
  • Storage allocation for data entities
  • Attribute Interpretation

[Figure: I/O Application, File System User Component, File System Storage Component, Block Storage Device, with SCSI as the device interface]

How OSD works

[Figure: two hosts, each with an I/O Application and a File System User Component; an OSD Manager; and a storage device containing the File System Storage Component and the Block Storage Device. Labels on the connections: Data Transfer, Object, Location, Security (twice)]
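The split between the user component and the storage component can be sketched as a toy model in which the device, not the host, decides where data lives. Everything below (class name, methods, block size) is hypothetical and for illustration only; it is not the T10 OSD command set, and it omits the security and manager roles shown in the figure.

```python
class ToyObjectStorageDevice:
    """Hypothetical in-memory model of an object-based device."""

    def __init__(self, num_blocks: int = 16, block_size: int = 4):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * num_blocks
        self.free = list(range(num_blocks))  # free-space management lives in the device
        self.objects = {}                    # object id -> blocks, length, attributes
        self.next_id = 1

    def create(self, **attrs) -> int:
        """Create an empty object and return its id; attributes are stored with it."""
        oid = self.next_id
        self.next_id += 1
        self.objects[oid] = {"blocks": [], "length": 0, "attrs": dict(attrs)}
        return oid

    def write(self, oid: int, data: bytes) -> None:
        """Replace the object's contents; the device chooses the blocks."""
        obj = self.objects[oid]
        needed = -(-len(data) // self.block_size)  # ceiling division
        if needed > len(self.free) + len(obj["blocks"]):
            raise IOError("device full")
        self.free.extend(obj["blocks"])            # release the old allocation
        obj["blocks"] = [self.free.pop(0) for _ in range(needed)]
        for i, block in enumerate(obj["blocks"]):
            chunk = data[i * self.block_size:(i + 1) * self.block_size]
            self.blocks[block] = chunk.ljust(self.block_size, b"\0")
        obj["length"] = len(data)

    def read(self, oid: int) -> bytes:
        obj = self.objects[oid]
        raw = b"".join(self.blocks[b] for b in obj["blocks"])
        return raw[:obj["length"]]

    def remove(self, oid: int) -> None:
        self.free.extend(self.objects.pop(oid)["blocks"])


device = ToyObjectStorageDevice()
oid = device.create(owner="app")       # the host only ever sees an object id
device.write(oid, b"hello object")
print(device.read(oid), len(device.free))
```

The host never sees block numbers or the free list, which is the property that lets allocation scale with the number of devices rather than with a single host-side file system.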

What problems are being solved?

• Depends on the APPLICATION
• Different people are trying to solve different problems for different reasons
  • Storage Device Utilization
  • Data Management
  • Cost
  • Reliability
  • Device Management
  • Performance
  • Security
  • Availability
  • Maintainability
  • Extensibility
• Restate the question: What problems CAN be solved with OSD?

What CAN be addressed by OSD

• Improved storage management
  • Self-managed, policy-driven storage (e.g., backup, recovery)
• Improved device and data sharing
  • Shared devices and data across OS platforms
• Improved storage performance
  • Hints, QoS, Differentiated Services
• Improved scalability (and not just capacity)
  • Of performance and metadata (e.g., free block allocation)
• Current block-based access protocols and associated file systems are 30 years old (that’s 210 in dog-years).
• OSD has the potential to make a significant impact on the Extensibility of a Storage System Architecture

Intelligent Storage

• Definition
  • Assume an Object-based Storage Device
  • Storage Device is “aware” of the data objects it stores
  • An Intelligent storage device can manipulate its data objects and potentially the “contents” of the data object
• Current Status
  • Pre-competitive research
  • Several organizations involved
    • University of Minnesota DTC Intelligent Storage Consortium
    • CMU Parallel Data Lab
    • UC Santa Cruz Storage Research Center
    • UCSD Center for Magnetic Recording Research
• Technology Evolution
  • Intelligent Storage is a natural evolution of OSD.

Why Intelligent Storage?

• From the storage device manufacturer and storage vendor’s perspective
  • More room to innovate and differentiate storage devices like disk drives
  • Increase margins – price storage devices based on capability not simply capacity
• From the User’s perspective
  • Increase in capability more specific to the User’s application space
  • Easier to manage


Things that need to happen

• Standards – OSD Interface to storage devices
• Standards – Runtime / execution environment
• Standards – API from Application to the Intelligent Storage System

NFS/CIFS – Network File System / Common Internet File System

• Definition
  • NFS and CIFS are file sharing protocols
• Current Status
  • NFS and CIFS are the current standard protocols used for file sharing
• Technology Evolution
  • NFS version 3 is the current generally available release.
  • NFS version 4 is under development at the University of Michigan and contains many enhancements that will allow it to exploit OSDs.
  • CIFS is the Microsoft answer to NFS – it essentially does the same thing as NFS
  • CIFS evolution is proprietary to Microsoft
• Technology Availability
  • NFS v3 is available for all OS and platforms
  • CIFS comes with all MS operating systems

Software

• DAS/SAN/NAS
• File systems
• Heterogeneous Shared File Systems
• Hierarchical Storage Management (HSM)
• Storage Resource Management
• Storage Management
• Virtualization

DAS/NAS/SAN

• Definition
  • These are more architectural terms than technologies
  • DAS – Direct Attached Storage
  • NAS – Network Attached Storage
  • SAN – Storage Area Networks
• Current Status
  • DAS is the oldest and most common storage interconnect
  • NAS is a generic term for NFS or CIFS
  • SAN is the architectural interconnect used to physically share storage devices among several host computer systems
  • DAS and SAN imply block-based storage access protocols such as SCSI or ATA
  • NAS implies a file-based access protocol
• Technology Evolution
  • DAS has been around since the beginning of time
  • NAS became widely used with NFS in the mid 1980s
  • SAN has only been around since about 1997 and is still somewhat quirky

File Systems

• There are several types of file systems
  • Local file systems that use Direct Attached Storage or dedicated storage on a SAN
  • Network File Systems that subscribe to the NFS or CIFS protocol
  • Shared File Systems that operate on a SAN and allow for concurrent read/write access to files with all other host computer systems on the SAN
• All these file systems, unless otherwise noted, are “block-based” file systems – a 30-year-old technology
• Block-based file systems require the host computer to manage the free space on the storage as well as the allocation of storage blocks to files
• Block-based file systems have difficulty scaling, particularly in a truly shared environment
• Object-based File Systems assume that free-space management and space allocation functions are delegated to the Object-based Storage Devices
• Object-based File Systems scale far better than block-based file systems
• Object-based File Systems enable the exploitation of Intelligent Storage Devices

Heterogeneous Shared File Systems

• Definition
  • Permits simultaneous file sharing among (1) different Operating Systems and (2) multiple computers at mount point level
• Current Status
  • Key foundation technologies that will ultimately support seamless, geographical data sharing
  • Allows a site to phase in (and out) different client and/or processing systems without affecting the data storage
  • Allows for the growth of data storage subsystems without affecting the client and/or processing systems
• Technology Evolution
  • Products on the market for several years
  • Still acclimating to 24x7 operational, ‘user’ heavy environments
  • None of the file systems supports the full range of Operating Systems
  • Debate over centralized versus distributed metadata
  • Scalability remains a question
  • Failover sometimes difficult


Heterogeneous Shared File Systems (cont.)

• Technology Availability
  • Several proprietary, somewhat heterogeneous file systems
    • Tivoli’s SANergy™
    • ADIC’s CVFS, now called SNFS
    • SGI’s CxFS
  • GFS by Red Hat (formerly Sistina), originally GPL’d but has since taken on a proprietary course
  • Lustre, currently under development as GPL
    • Funded by the DoE (ASCI Path Forward)
    • GPL’d solution in development and available from Cluster File Systems, Inc (Lustre)
• Reference Web Sites
  • Lustre – www.lustre.org
  • SNFS – www.adic.com
  • CxFS – www.sgi.com

Hierarchical Storage Management (HSM)

• Definition
  • Policy-based data migration between storage elements
• Current Status
  • Multiple storage tiers likely given mix of short latency/long archive requirements
• Technology Evolution
  • Traditional HSMs mature
  • SAN HSMs emerging
• Technology Availability
  • Numerous HSM vendors/products
  • SAN HSMs
    • ADIC’s StorNext SAN HSM now shipping
    • Tivoli’s SANergy™/Sun QFS/SAM-FS
• Comments
  • Most interesting and most relevant are emerging Disk-to-Disk systems and storage over IP oriented companies
    • Nexsan Technologies – www.nexsan.com
    • FalconStor Software – www.falconstor.com
  • HSM has been around for many years but has always had trouble getting traction
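"Policy-based data migration" can be illustrated with a minimal selection rule: files that have not been accessed for longer than a threshold become candidates for a slower, cheaper tier. The 90-day threshold, the file records, and all names below are hypothetical; real HSM products use richer policies than this sketch.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str
    size_gb: float
    days_since_access: int


def select_for_migration(files, max_idle_days: int = 90):
    """Return files idle longer than the policy threshold, most idle first."""
    stale = [f for f in files if f.days_since_access > max_idle_days]
    return sorted(stale, key=lambda f: f.days_since_access, reverse=True)


# Hypothetical catalog of files on the fast (disk) tier.
catalog = [
    FileInfo("run-2003.dat", 120.0, 400),
    FileInfo("results.db", 15.0, 2),
    FileInfo("archive.tar", 300.0, 95),
]

candidates = select_for_migration(catalog)
freed_gb = sum(f.size_gb for f in candidates)
print([f.name for f in candidates], freed_gb)
```

Running the policy on this catalog selects the two stale files and would free 420 GB on the fast tier.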

Storage Resource Management (SRM)

• Definition
  • Storage administration, capacity planning, monitoring, etc.
• Current Status
  • Movement towards centralized control of storage
• Technology Evolution
  • Evolving but disjointed
• Technology Availability
  • Multiple products by multiple vendors
    • McData’s SANavigator
    • AppIQ
• Comments
  • Getting the products to gel to provide a unified, cohesive view of storage
  • SMIS (a.k.a. Bluefin) is a new standard that is getting significant attention

What is SMIS?

• The Problem
  • Too many management infrastructures!
    • Simple Network Management Protocol for networks
    • Desktop Management Interface for desktops
    • Common Management Information Protocol for telco
    • System Management BIOS for motherboard/BIOS vendors
    • Alert Standard Format for system alarms, …
  • Non-interoperable models, frameworks and policies
    • The model describes what you are managing.
    • The framework allows you to manage the model.
    • And policies say how you can manage the model.
• We need a unified management infrastructure for the enterprise

So, What is SMIS?!

• Based on CIM/WBEM and the SNIA Shared Storage Model
• Provides common interface to SAN resources
  • Services include discovery, monitoring, configuration, security, capacity planning, …
• Bluefin says how we manage CIM via WBEM
• Solves the problem of multi-vendor SAN interoperability

SMIS – Storage Management

• Definition
  • SMIS is an object-oriented messaging interface that links distributed management applications (clients) with device management support (agents) to discover, manage, and control devices of any kind
  • A CIM/WBEM-based SAN management framework
  • A SNIA-based standard
• Current Status
  • Will significantly enhance the ability to manage the entire heterogeneous storage environment independent of hardware or software vendor or manufacturer
• Technology Evolution
  • Started 5 years ago in SNIA
  • Taken out of SNIA by the Partnership Development Process – a consortium of 17 companies
  • Rev 1 of the SMIS spec was brought back into SNIA June 2002 for review and approval by all the Technical Working Groups


SMIS – Storage Management (cont.)

• Technology Availability
  • Spec released to the public September 2002
  • Products that are SMIS-compliant are available from a limited number of companies
• Reference Web Sites
  • SNIA – www.snia.org

Storage Virtualization

• Definition
  • Unfortunately, there is no single definition – depends on the vendor
  • Generically, virtualization is an abstraction of physical data storage space
• Current Status
  • Sites are made up of many different vendors’ storage devices and some of the virtualization products allow the pooling of storage devices into a single, larger space for more efficient use of that space
• Technology Evolution
  • Some of the virtualization products are still only a few years old and have not had time to prove themselves as a success or failure
  • Virtualization was hyped during 2001 as a way to decrease the TCO of a storage system, but it is becoming commonly believed that this is not the case
• Technology Availability
  • Range from complete software to a mix of hardware and software products
  • Products from several vendors are currently available
    • StoreAge Networking Technologies
    • DataCore™ Software
• Comments
  • Debate over ‘in-band’ versus ‘out-of-band’ virtualization
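Pooling several devices into "a single, larger space" amounts to a mapping from a logical block address to a device and an offset on that device. The sketch below shows the simplest such mapping, plain concatenation; the device names and sizes are hypothetical, and real virtualization products may stripe or remap at finer granularity.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatenatedPool:
    """Hypothetical pool that concatenates devices into one logical address space."""

    def __init__(self, devices):
        # devices: list of (name, size_in_blocks) tuples
        self.names = [name for name, _ in devices]
        self.sizes = [size for _, size in devices]
        self.ends = list(accumulate(self.sizes))  # cumulative end offsets

    @property
    def total_blocks(self) -> int:
        return self.ends[-1] if self.ends else 0

    def locate(self, logical_block: int):
        """Map a logical block number to (device name, block on that device)."""
        if not 0 <= logical_block < self.total_blocks:
            raise IndexError("logical block outside the pool")
        index = bisect_right(self.ends, logical_block)
        start = self.ends[index] - self.sizes[index]
        return self.names[index], logical_block - start


pool = ConcatenatedPool([("vendorA-lun0", 100), ("vendorB-lun3", 50), ("vendorC-lun1", 200)])
print(pool.total_blocks, pool.locate(0), pool.locate(120), pool.locate(349))
```

Whether this mapping sits in the data path or beside it is essentially the in-band versus out-of-band debate noted above.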

Cool Stuff Happening in the Storage Industry

• Future of Data Storage Systems Workshop – April 27-29, UCSD, San Diego, CA
• Intelligent Storage Workshop, May 19-20, UMN/DTC, Minneapolis, MN
• SNIA OSD TWG, meets monthly
• StoreCloud, Supercomputing 2004, November 6-12, 2004, Pittsburgh, PA

Closing Thoughts

• Hardware technologies are evolving
  • Areal density increasing
  • Form factors shrinking
  • Serial interfaces/transports are replacing parallel interfaces/transports
  • Newer, cooler storage technologies like MEMS are in process
• Protocol and software technologies
  • Lagging the hardware evolution
  • Block-based access moving toward Object-based protocol in devices and protocols
  • Object-based File Systems are being developed
  • Traditional POSIX file system API is being challenged, reformed

Other References

• www.dtc.umn.edu
• www.insic.org
• www.datarecoverygroup.com/articles/article3.htm
• www.actionfront.com/ts_articles.asp
• History of disk drives
  • www.research.ibm.com/about/past_history.shtml
  • www.research.ibm.com/journal/rd/443/thompson.html
  • www.i-t-s.com/corporate/disk_drive_history.html
  • www.startribune.com/stories/484/4734780.html
• Future of Data Storage Technologies (NSF/NIST/DARPA project)
  • www.wtec.org/loyola/hdmem/toc.htm
  • www.eetimes.com/sys/news/OEG20030718S0038

Thank you!

University of Minnesota Digital Technology Center
Intelligent Storage Consortium
www.dtc.umn.edu


Software Issues

• Operating Systems – Homogeneity and Heterogeneity
  • Between OS types
    • Windows
    • Unix in all flavors
    • Mac
  • Within OS types
    • Linux releases from a single vendor (e.g., RedHat)
    • Linux releases from different vendors (e.g., RedHat vs. SuSE)
    • Patches from many different Linux value-add providers
    • Windows 95/98/NT/2000/XP, etc.
• Concurrent support for multiple OS types (Heterogeneous)
• Striping efficiency of the Virtualization Engine(s)
• Striping efficiency of the file system/Volume Manager

Software Issues (cont.)

• File System Incompatibilities
  • Name spaces
  • Security mechanisms
  • Metadata: proprietary versus standard
  • Disk storage layout: proprietary versus standard
• Application porting issues
  • From one OS to another OS (e.g., Unix to Windows)
  • Software rot
    • Losing source code
    • Losing algorithms
    • Losing compilers

Software Issues (cont.)

• Hard product functionality/operational limits
  • 1 TB LUN/file system limit for Solaris
    • Note this has secondary impacts on products such as CVFS (shared volume labeling)
  • Veritas file system
    • 1 TB file systems on Solaris
    • 2 TB file systems on HP-UX
  • QFS supports up to 252 LUNs, therefore 252 TB file systems
  • CVFS supports up to 1.84E19 files
  • Number of files in an HSM, etc.
• Driver (NICs, HBAs, etc.) availability
  • No iSCSI driver for SGI IRIX™
  • SNIA approved drivers that support LUN discovery
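Limits such as 1 TB, 2 TB, and 1.84E19 look arbitrary but line up with powers of two. The slides do not state where the limits come from, so the sketch below rests on an assumption: 512-byte sectors addressed with 31-bit (signed 32-bit) or 32-bit block numbers, and a 64-bit file counter. The constant and function names are illustrative.

```python
SECTOR_BYTES = 512   # assumed sector size
TIB = 2 ** 40        # one binary terabyte


def max_addressable_bytes(address_bits: int, sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest volume addressable with the given number of block-address bits."""
    return (2 ** address_bits) * sector_bytes


print(max_addressable_bytes(31) / TIB)   # 31 usable bits (signed 32-bit block numbers)
print(max_addressable_bytes(32) / TIB)   # 32 usable bits (unsigned 32-bit block numbers)
print(f"{2 ** 64:.3g}")                  # size of a 64-bit identifier space
```

Under these assumptions the three values come out as 1.0 TiB, 2.0 TiB, and about 1.84e+19, matching the limits listed above.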

Software Issues (cont.)

• Linux
  • Open source is powerful but not without its problems
  • Most everything is kernel and/or distribution (e.g., RedHat) specific
    • SANergy™
    • CVFS
    • GFS
• Product incompatibility – lots of examples
  • CVFS won’t run on a GFS patched kernel
• Security/firewall
  • CVFS uses dynamically assigned ports for communication with the FSS (metadata)
  • SANergy uses NFS
• Firmware and software upgrades
  • Impact on operations – not just computers
• Tuning
  • Variables/parameters at all levels

Protocol Issues

• Mixing protocols with interfaces – not all “endpoints” support all the possible combinations
  • FC over everything
  • TCP/IP over everything
  • SCSI over everything
• Go with what works
  • Ethernet for networking
  • Fibre Channel for storage
• Experiment with what will be the most likely winner
  • SCSI over Ethernet one way or another
• Plan for “phasing out” old technologies
• Plan on “phasing in” new technologies

Management Issues

• There are many pieces in a system to manage
• There is no single unified management tool – be wary of anyone who tries to sell you one
• Even in the SAN space, no single management tool can manage all the SAN devices
• Bluefin will help with this but it is still a ways out
• Real-time monitoring and management is still a problem


Management Issues (cont.)

• Failure management
  • Run under the assumption that there is ALWAYS something broken somewhere in the system
  • Complete architectural redundancy or allowance for degraded operation
    • Host, HBA, Switch(es), RAID controller, LUN
    • Interaction of all the components to effect a proper switchover
    • Disconnecting and shutting off failed components
      • SANergy
      • CVFS
      • GFS
  • Failback (restoration)
• Performance management
  • Treat bandwidth, latency, and transaction rates as resources that need to be monitored and managed
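The advice to assume that something is always broken follows from simple probability once a system has many parts. The sketch below assumes components fail independently and uses hypothetical per-component probabilities; neither assumption comes from the slides.

```python
def prob_any_failed(n_components: int, p_each: float) -> float:
    """Probability that at least one of n independent components is failed."""
    if not 0.0 <= p_each <= 1.0:
        raise ValueError("p_each must be a probability")
    return 1.0 - (1.0 - p_each) ** n_components


# Hypothetical: each component is down 0.1% of the time.
for n in (10, 100, 1000):
    print(f"{n:>5} components: {prob_any_failed(n, 0.001):.1%} chance something is broken")
```

With a thousand such components, the chance that at least one is down at any moment is already well over half, which is the case for designing in redundancy and degraded operation.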