© 2004 IBM Corporation IBM TotalStorage SAN File System •Overview •SAN FS for Grid & HPC Paul L. Bradshaw IBM Almaden Research Center Breakthrough to On Demand with IBM TotalStorage

Upload: haley-faulkner

Post on 28-Mar-2015


TRANSCRIPT

Page 1:

© 2004 IBM Corporation

IBM TotalStorage SAN File System
• Overview
• SAN FS for Grid & HPC

Paul L. Bradshaw
IBM Almaden Research Center

Breakthrough to On Demand with IBM TotalStorage

Page 2:

IBM TotalStorage® Open Software Family

Evolving to an on demand environment © 2004 IBM Corporation

[Diagram: IBM TotalStorage Open Software Family]
• Storage Orchestration
• Storage Infrastructure Management: for Data, for Fabric, for Disk, for Replication
• Hierarchical Storage Management, Archive Management, and Recovery Management, with offerings for Files, for Mail, for SAP, for Databases, and for Application Servers
• Storage Virtualization: SAN Volume Controller, SAN File System

IBM TotalStorage Open Software Family: Taking steps toward an On Demand storage environment

Page 3:

IBM Systems and Technology Group

IBM TotalStorage Open Software Family © 2004 IBM Corporation

Current IT Environment

FS = Multiple, different File Systems across servers, with individual interfaces

AF = Multiple, different Advanced Functions across storage devices with individual interfaces

[Diagram: Public Internet/Intranet Clients; Routers (Layer 3 Switches); Firewall; Caching Appliances; Layer 4-7 Switches; Layer 2 Switches; Web Servers; File/Print Servers; SSL Appliances; IBM eServer™ zSeries®; servers running file systems FS 1 through FS 11; Storage Fibre Switches; Storage Area Network of HDS, HP, IBM, and EMC devices, each with its own AF]

Increasing complexity of deployment, access, and management of IT infrastructure

Page 4:

Consolidate Server Infrastructure

FS = Multiple, different File Systems across servers, with individual interfaces

AF = Multiple, different Advanced Functions across storage devices with individual interfaces

[Diagram: Public Internet/Intranet Clients; Routers (Layer 3 Switches); Firewall; Caching Appliances; Web Servers; File/Print Servers; SSL Appliances; Layer 4-7 Switches; Layer 2 Switches; servers running file systems FS 1 through FS 11; Storage Fibre Switches; Storage Area Network of HDS, HP, IBM, and EMC devices, each with its own AF]

• Consolidate servers and distribute workloads to most appropriate platforms
• Consolidate to pSeries, xSeries, etc.
• Consolidate to zSeries
• Consolidate to BladeCenter
• Consolidate storage into SAN

Page 5:

Result: Simplified Server and Network Infrastructure

However …

Storage infrastructure is perhaps the most complex element of the overall IT infrastructure

FS = Multiple, different File Systems across servers, with individual interfaces

AF = Multiple, different Advanced Functions across storage devices with individual interfaces

Individual File Systems
• Difficult to implement consistent policies for file and database management
• Changes to storage and servers impact application availability
• Difficult to share data
• Inefficient usage of file and database storage across SAN

Storage Complexity
• Downtime to add, move, or allocate storage
• Storage isolated in SAN islands
• Storage is 44 ~ 55% utilized
• Expensive, restrictive and varying advanced functions
• Multiple interfaces for heterogeneous servers

[Diagram: Public Internet/Intranet Clients; Routers (Layer 3 Switches); Firewall; servers running file systems FS 1 through FS 4 and FS 8 through FS 11; Storage Area Network of HDS, HP, IBM, and EMC devices, each with its own AF]

Currently, complexity of storage management makes it difficult to visualize, manage and optimize storage infrastructure

Page 6:

Virtualize file systems across all servers with SAN File System

[Diagram: servers formerly running multiple, independent file systems (FS 1 through FS 4 and FS 8 through FS 11) attached to SAN File System; three storage pools on a Storage Area Network of HDS, HP, IBM, and EMC devices, each with its own AF; one label reads "(Future (1))"]

Single datastore and global name space across SAN

Storage Pool
• Pools of capacity
• Segmented based on business need

Policies
• File placement based on business need

FS = Multiple, different File Systems across servers, with individual interfaces

AF = Multiple, different Advanced Functions across storage devices with individual interfaces

Note: 1. SAN File System and SAN Volume Controller connectivity to zSeries represents IBM's future product plans and general intentions only. It is subject to change or cancellation without notice and should not be relied on for any purpose.

Page 7:

IBM TotalStorage SAN File System: Architecture based on Storage Tank™ technology

[Diagram]
• Clients: AIX, Solaris, HP-UX, and Linux (VFS w/Cache); Win2K/XP (IFS w/Cache)
• External Clients: NFS, CIFS
• Admin Client
• IP Network for Client/Metadata Cluster Communications
• Metadata Server Cluster: Metadata Servers (three shown) and a Metadata Store
• Storage Network
• Data Store: Shared Storage Devices, Multiple Storage pools

Page 8:

IBM TotalStorage SAN File System

[Diagram: SAN File System over storage pools on the SAN. Pool labels: Good, Better, Best; Project A, Project B, Project C; Customer 1, Customer 2, Customer 3. Backing storage labels: ESS, SATA, and SAN Volume Controller virtual disks]

Storage Pool
• Pools of capacity
• Segmented based on business need

Page 9:

IBM TotalStorage SAN File System

[Diagram: SAN File System name space rooted at /Storage Utility, with directories /A, /B, /C, /D, /E, /F]

Name Space
• Shared by all participating servers

Page 10:

IBM TotalStorage SAN File System

[Diagram: SAN File System over three storage pools on the SAN]

Policies
• File placement based on business need
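The slide does not show the policy syntax, so the following is only a minimal sketch of what "file placement based on business need" could look like as first-match rules. The rule fields, glob patterns, and pool names are all hypothetical and are not taken from SAN File System.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class PlacementRule:
    """Hypothetical rule: files whose name matches `pattern` go to `pool`."""
    pattern: str  # glob pattern, matched case-sensitively against the file name
    pool: str     # name of the storage pool that receives matching files


def choose_pool(filename: str, rules: list[PlacementRule], default_pool: str) -> str:
    """Return the pool of the first matching rule, or the default pool."""
    for rule in rules:
        if fnmatchcase(filename, rule.pattern):
            return rule.pool
    return default_pool


# Hypothetical policy set: database files on fast storage, logs on cheap storage.
RULES = [
    PlacementRule("*.dbf", "high_performance"),
    PlacementRule("*.log", "low_performance"),
]

if __name__ == "__main__":
    for name in ("orders.dbf", "server.log", "notes.txt"):
        print(name, "->", choose_pool(name, RULES, "medium_performance"))
```

Evaluating rules in order and falling back to a default pool keeps placement deterministic when several patterns could match the same file.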

Page 11:

IBM TotalStorage SAN File System: Integration with Databases

[Diagram: SAN File System over High Performance, Medium Performance, and Low Performance storage on the SAN. File labels: Data files, System indices, Transaction logs, Temp files, Object references]

Direct I/O
• Raw volume performance levels with the benefit of SAN File System file management

Improved Capacity Utilization
• Pooled space for temp files, shared by all database systems

FlashCopy Image
• Point-in-time copy for all files related to a given database

Page 12:

IBM TotalStorage SAN File System

[Diagram: SAN File System over three storage pools on the SAN]

Server Consolidation, Replacement, and Expansion

Page 13:

IBM TotalStorage SAN File System

[Diagram: Backup Server and four backup agents ("Bkup agent") with SAN File System over three storage pools on the SAN]

New Efficiencies
• Backup
• Virus scanning
• Other?

Page 14:

Improved Access to and Sharing of Data

Traditional SAN
• Difficult to share across applications
• Data is replicated: duplicate storage required
• Turnaround times are slowed by data copy and batch processing

SAN File System
• No replication
• No duplication
• Streamlined turnaround times

[Diagram labels: SAN (FTP, FTP); SAN with SAN File System (Share, Share)]

Page 15:

© 2003 IBM Corporation, IBM TotalStorage Open Software Family

IBM Storage Software

IBM TotalStorage SAN File System: Architecture based on Storage Tank™ technology

[Diagram]
• Clients: AIX, Solaris, HP-UX, and Linux (VFS w/Cache); Win2K/XP (IFS w/Cache)
• External Clients: NFS, CIFS
• Admin Client
• IP Network for Client/Metadata Cluster Communications
• Metadata Server Cluster: Metadata Servers (three shown) and a Metadata Store
• Storage Network
• Data Store: Shared Storage Devices, Multiple Storage pools

Page 16:

Functions of the components

Functions of the Meta-Data Servers

Heterogeneous file and byte range level locking
• Mandatory and advisory
• Lease based scheme so locks are renewed on a configurable interval
• When a lease expires then a lock can be revoked and the work transferred to another server
• Locks can non-disruptively move between meta-data servers on a meta-data server failure

Space allocation
• Configurable block and partition sizes
• Allocation of blocks occurs in partitions, adjusted dynamically

Striping of data across LUNs in a storage pool
• Configurable striping interval, round robin allocation

Volume Drain
• Automated, non-disruptive movement of data

LUN labeling

Quota management
• Soft and hard quotas per fileset

Volume management in storage pools

Flash copy image
• 32 active per fileset
• Updates are written to new blocks
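The lease-based locking described above can be illustrated with a small sketch. This is not the SAN File System protocol, which the slides do not specify; the class names, the single-holder model, and passing the current time explicitly are assumptions made to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    holder: str        # client that currently holds the lock
    expires_at: float  # time at which the lock may be revoked


class LeaseTable:
    """Hypothetical lock table: a lock is valid only while its lease is unexpired."""

    def __init__(self, lease_seconds: float):
        self.lease_seconds = lease_seconds  # the configurable renewal interval
        self._leases: dict[str, Lease] = {}

    def acquire(self, resource: str, client: str, now: float) -> bool:
        """Grant the lock unless another client holds an unexpired lease."""
        lease = self._leases.get(resource)
        if lease is not None and lease.holder != client and now < lease.expires_at:
            return False
        self._leases[resource] = Lease(client, now + self.lease_seconds)
        return True

    def renew(self, resource: str, client: str, now: float) -> bool:
        """Extend the lease; fails if the client let it expire or never held it."""
        lease = self._leases.get(resource)
        if lease is None or lease.holder != client or now >= lease.expires_at:
            return False
        lease.expires_at = now + self.lease_seconds
        return True


if __name__ == "__main__":
    table = LeaseTable(lease_seconds=10.0)
    print(table.acquire("/A/file", "client-a", now=0.0))   # first holder
    print(table.acquire("/A/file", "client-b", now=5.0))   # blocked while lease is live
    print(table.acquire("/A/file", "client-b", now=12.0))  # lease expired, lock revoked
```

A client that stops renewing simply loses the lock once its lease runs out, which is what lets the lock be handed to another party without waiting on a failed holder.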

Page 17:

Functions of the components

Functions of the SAN file system clients
• Linux client reference implementation available via open source
• Caching of the meta-data
  • Aggressive caching
  • Pro-active cache invalidation
• Mapping of access control rights to local file system semantics
• Mapping of operating system file interface to common file system functions
• Direct I/O, parallel I/O, async I/O support
• Benchmarks show we are only limited by SAN bandwidth, linear scalability as we add clients

Install/Packaging
• Software Only
• Rolling Upgrade maintains service during upgrade
• N and N-1 supported simultaneously
• Services lined up
• Current hardware supported
• Servers can be upgraded independently of clients
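The combination of aggressive metadata caching and pro-active cache invalidation listed among the client functions can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, and the real client/server protocol is not described in the slides.

```python
class MetadataServer:
    """Hypothetical server that remembers which clients cache each path."""

    def __init__(self):
        self._attrs = {}    # path -> attribute dict
        self._cachers = {}  # path -> set of clients holding a cached copy

    def read(self, client, path):
        # Record the reader so it can be told when the entry changes.
        self._cachers.setdefault(path, set()).add(client)
        return self._attrs.get(path)

    def write(self, writer, path, attrs):
        self._attrs[path] = attrs
        # Pro-active invalidation: notify every other caching client at once.
        for client in self._cachers.pop(path, set()):
            if client is not writer:
                client.invalidate(path)


class Client:
    """Hypothetical client that serves repeated lookups from its local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}
        self.server_reads = 0  # counts round trips to the metadata server

    def stat(self, path):
        if path not in self.cache:
            self.server_reads += 1
            self.cache[path] = self.server.read(self, path)
        return self.cache[path]

    def update(self, path, attrs):
        self.server.write(self, path, attrs)
        # Drop the local copy so the next stat re-registers with the server.
        self.cache.pop(path, None)

    def invalidate(self, path):
        self.cache.pop(path, None)


if __name__ == "__main__":
    server = MetadataServer()
    a, b = Client(server), Client(server)
    a.update("/A/f", {"size": 1})
    print(b.stat("/A/f"), b.stat("/A/f"), b.server_reads)
    a.update("/A/f", {"size": 2})
    print(b.stat("/A/f"), b.server_reads)
```

Because the server pushes invalidations, a client can keep cached entries indefinitely without polling for staleness.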

Page 18:

RAS

High Availability
• Non-disruptive move of workload between meta-data servers
• Built-in clustering and failover

Serviceability
• Meta-data checker enhancements
• Master Console
• Performance improvements for diagnostic logging
• One button data collection for error handling

Manageability
• Consistent on-line help
• Log filtering
• Fileset transaction statistics
• Policy set statistics
• Storage Pool assignment verification

[Diagram: Storage Tank architecture. Clients on Win 2K/Windows, AIX, Solaris, Linux, and HP/UX (VFS w/Cache, IFS w/Cache); Admin Client; External Clients (NFS, CIFS); IP Network for Client/Metadata Cluster Communications; Storage Tank Server Cluster of Metadata Servers with a Metadata Store; Storage Network; Data Store of Shared Storage Devices in Multiple Storage Pools]

Page 19:

TotalStorage SAN File System Supported Environments

[Diagram; the slide distinguishes "Today…" from "1H04…"]
• Clients: Microsoft Windows (MSCS); IBM AIX (HACMP); Sun Solaris (Sun Cluster); Linux (Intel); IBM BladeCenter (Windows / VMWare, Linux); VMWare (Windows guest, Linux guest)
• SAN: FC & iSCSI
• Storage models named: IBM ESS F20/800; Hitachi Thunder 9200; HP EMA 12000, 16000; IBM FAStT 200/500, 600/600T, 700/900; HP MA 8000
• Storage families named: IBM ESS; IBM FAStT; Hitachi Thunder; Hitachi Lightning; HP EMA; HP MA; EMC Symmetrix; EMC CLARiiON
• SAN Volume Controller virtual disks
• "ANY" Disk*

Intended as an overview only. For the most complete information, visit ibm.com/storage/software

* One IBM SAN Volume Controller, FAStT or ESS required for metadata information

Page 20:

IBM TotalStorage® Open Software Family

Evolving to an on demand environment IBM Confidential © 2004 IBM Corporation

What are the characteristics of SAN file system?

Scalable
• No architectural limits to file system size, file size, or number of files. Research tested hundreds of millions of files

Heterogeneous Data Sharing
• Resources are pooled across heterogeneous server and OS platforms (heterogeneous storage support in 2Q'04)
• Simultaneous access to same storage and data eliminates islands

Centralized Management
• Strong locking and security to enable centralized data management and on-demand clustering and availability solutions

Data Protection
• Fast, storage-efficient recovery points which can be leveraged to simplify data protection solutions

Value Add
• Enabling layer for value add for other components through open standards

Automation
• Storage organized by class of service
• Automated provisioning of storage by class of service with customer defined soft and hard quotas
• Information Life Cycle Management leveraging Tivoli

High Performance
• Direct access to data volumes takes advantage of SAN technology

SAN File System exploits the SAN to provide significantly enhanced file and data management for an on demand storage environment

Page 21:

Page 22:

Test Environment Configuration

• 8 iSCSI storage servers (IBM 200i)
  • 28 terabytes of SCSI disk space
  • Each 200i can sustain 70 MB/sec on random reads
  • Each 200i connected by a single gigabit Ethernet cable

• 100+ SAN File System Linux clients (32 bit)
  • Each client connected by a single gigabit Ethernet cable and running iSCSI initiators
  • 57 data LUNs across 10 storage pools and 12 filesets
  • Utilized automated system to install, configure, and run the client platforms

• 4 SAN File System metadata servers
  • Connected to 58 LUNs (1 system, 57 data)
  • The cluster managed 12 filesets over 10 storage pools
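As a rough worked example, the figures in this configuration give an upper bound on aggregate read throughput. The sketch below assumes that each storage server is limited by the smaller of its 70 MB/sec disk rate and its single gigabit link, taken here as 125 MB/sec raw line rate before protocol overhead; that link figure and the function name are assumptions, not numbers from the slides.

```python
STORAGE_SERVERS = 8          # from the slide
PER_SERVER_MB_S = 70         # from the slide: sustained random-read rate per 200i
GIGABIT_LINK_MB_S = 1000 / 8  # assumption: 1 Gbit/s = 125 MB/s, ignoring overhead


def aggregate_ceiling(servers: int,
                      per_server: float = PER_SERVER_MB_S,
                      link: float = GIGABIT_LINK_MB_S) -> float:
    """Upper bound on total MB/s if throughput scales linearly with servers."""
    return servers * min(per_server, link)


if __name__ == "__main__":
    for n in (1, 4, STORAGE_SERVERS):
        print(f"{n} storage server(s): at most {aggregate_ceiling(n):.0f} MB/sec")
    # With all 8 servers the ceiling is 8 * 70 = 560 MB/sec.
```

Under these assumptions the disks, not the gigabit links, set the per-server limit, so perfectly linear scaling would top out at 560 MB/sec for the eight servers listed.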

Page 23:

Ramp Up Test Notes

The storage controllers used were:
• 8 iSCSI storage units
• Each unit was capable of 70 MB/sec for large random reads in optimum configuration (limited by older RAID card in the units)
• Storage LUNs were RAID-1 (mirrored)

Client ramp up:
• Clients started the test application one at a time
• Once all 100+ clients were running, additional processes/threads were started sequentially

Page 24:

Page 25:

[Chart: iSCSI Storage Server Ramp Up Test (40 Linux clients running). X axis: iSCSI Storage Servers, 1 to 10. Y axis: MB/sec, 0 to 700]

With 10 storage servers SAN FS achieved 10x the throughput of 1!

Page 26:

[Diagram: Grid-based Storage. Several Storage Tank server clusters, each with Win 2K, AIX, Solaris, Linux, and HP/UX clients (VFS w/Cache), a Storage Network, Metadata Servers, and a Metadata Store; one cluster also shows an Admin Client, External Clients (NFS, CIFS), and the IP Network for Client/Metadata Cluster Communications. Other labels: Distributed Storage Tank, NAS Storage Tank]

Page 27: