Storage Systems in HPC
John A. Chandy, Department of Electrical and Computer Engineering, University of Connecticut


Post on 24-Dec-2015


TRANSCRIPT

Page 1: Storage Systems in HPC John A. Chandy Department of Electrical and Computer Engineering University of Connecticut

Storage Systems in HPC

John A. Chandy
Department of Electrical and Computer Engineering
University of Connecticut

Page 2

Research Summary

• Storage Systems

– Active Storage

– Parallel File Systems

– Reliable Data Storage

– Active Storage Networks

Page 3

Storage Systems

• Parallel Computing

– Building parallel file systems to support HPC

– Computation at the storage node

– Data organization methods to improve performance

• Reliable Data Storage

– Customizable and extensible storage for reliability

– Backup strategies using personal storage devices

– Data security, trust, and reliability in the cloud

Page 4

Parallel File Systems

• Network Attached Storage

– Put the storage on the network with a computer (server) acting as the go-between

Network

Page 5

Parallel File Systems

• Separate the metadata from the storage

Network

Metadata
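The split between metadata and storage can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class names (`MetadataServer`, `DataNode`), the round-robin striping policy, and the tiny stripe size are not taken from the slides. The point is only that the metadata server answers a small layout query, while file contents move directly between the client and the data nodes.

```python
from dataclasses import dataclass

STRIPE_SIZE = 4  # bytes per stripe unit; tiny so the example is easy to trace


@dataclass
class Layout:
    """Metadata only: file size and which data node holds each stripe unit."""
    size: int
    stripe_nodes: list


class MetadataServer:
    """Hypothetical metadata service: holds layouts, never file contents."""

    def __init__(self):
        self.layouts = {}

    def create(self, path, size, num_nodes):
        num_stripes = (size + STRIPE_SIZE - 1) // STRIPE_SIZE
        # Assumed policy: place stripe units round-robin across data nodes.
        layout = Layout(size, [i % num_nodes for i in range(num_stripes)])
        self.layouts[path] = layout
        return layout

    def lookup(self, path):
        return self.layouts[path]


class DataNode:
    """Hypothetical data node: stores stripe units keyed by (path, index)."""

    def __init__(self):
        self.units = {}


def write_file(mds, nodes, path, data):
    layout = mds.create(path, len(data), len(nodes))
    for i, node_id in enumerate(layout.stripe_nodes):
        start = i * STRIPE_SIZE
        nodes[node_id].units[(path, i)] = data[start:start + STRIPE_SIZE]


def read_file(mds, nodes, path):
    layout = mds.lookup(path)  # one small metadata request
    # Bulk data comes straight from the data nodes, not through the server.
    return b"".join(
        nodes[n].units[(path, i)] for i, n in enumerate(layout.stripe_nodes)
    )
```

With three data nodes, a 10-byte file is split into three stripe units, one per node, and a read touches the metadata server only once.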

Page 6

Parallel File Systems

• How do you improve metadata performance?

– Distribute metadata services on data nodes

– Use active storage and object services

Page 7

Active Storage

• Allows us to run applications on storage nodes

• Can dramatically reduce data traffic

– Eliminate large network latencies

• Take advantage of fast RAID arrays and SSDs

– Drives bottlenecked by slow networks

• Run applications in parallel across multiple nodes

• Make use of unused processor time
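The traffic reduction can be made concrete with a toy comparison. The sketch below counts matching records two ways: by shipping every record to the client, and by running the filter where the data lives and returning only the count. The function names and the 8-byte result size are assumptions chosen for illustration, and the byte totals are a rough model rather than a measurement.

```python
def client_side_count(records, predicate):
    """Ship every record to the client, then filter there."""
    bytes_moved = sum(len(r) for r in records)
    count = sum(1 for r in records if predicate(r))
    return count, bytes_moved


def active_storage_count(records, predicate, result_size=8):
    """Run the filter on the storage node; only the count crosses the network.

    result_size is an assumed size in bytes for the returned integer.
    """
    count = sum(1 for r in records if predicate(r))
    return count, result_size
```

Both functions return the same count, but the second moves a fixed few bytes regardless of how large the data set grows.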

Page 8

Programming Model

• Based on object storage

• RPC based

– Executable objects

– RPC calls have full access to all object functions: read, write, create, set attribute, etc.

• Functions can be synchronous or asynchronous

• Supports multiple languages (C, Java, Python)
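A rough sketch of the executable-object idea follows. It is not the interface from the talk: `ObjectStore`, its method names, and the `checksum` function are hypothetical, and a Python callable stored as an object stands in for a real executable. It shows a function that is itself stored as an object, started with an execute call, and that uses the ordinary object operations (read, write, create, set attribute) while it runs.

```python
class ObjectStore:
    """In-memory stand-in for an object storage device (hypothetical API)."""

    def __init__(self):
        self.data = {}
        self.attrs = {}

    def create(self, oid):
        self.data[oid] = b""
        self.attrs[oid] = {}

    def write(self, oid, payload):
        self.data[oid] = payload

    def read(self, oid):
        return self.data[oid]

    def set_attr(self, oid, key, value):
        self.attrs[oid][key] = value

    def execute(self, func_oid, *args):
        """Run an executable object synchronously and return its result."""
        func = self.data[func_oid]
        return func(self, *args)


def checksum(store, src_oid, dst_oid):
    """Example active-storage function: sum the bytes of one object mod 256."""
    total = sum(store.read(src_oid)) % 256
    store.create(dst_oid)
    store.write(dst_oid, bytes([total]))
    store.set_attr(dst_oid, "source", src_oid)
    return total
```

A client would write `checksum` into the store as an object, then call `execute` with the input and output object IDs; only the small result travels back.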

Page 9

Programming Model

• Based on work by Acharya, Riedel - Stream based

• Our model is Remote Procedure Call (RPC) based

o Use executable objects

o Added command to begin execution

o Allow full access to all OSD functions

• Functions can be run sync or async

o Due to iSCSI 30-second timeout

o Working to allow queries for async

• Allow parallel execution using async

• Support multiple languages (C, Java, Python)
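The asynchronous mode can be sketched as a call that returns a job handle immediately, so a long-running function never holds a request open past a transport timeout, plus a query call to check on the job later. The names below (`AsyncOSD`, `execute_async`, `query`, `wait`) are hypothetical, and a thread pool stands in for the storage node's execution engine. Starting one job per OSD before waiting on any of them gives the parallel execution mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncOSD:
    """Hypothetical OSD that runs functions in the background."""

    def __init__(self, objects):
        self.objects = objects
        self.pool = ThreadPoolExecutor(max_workers=1)
        self.jobs = {}
        self.next_id = 0

    def execute_async(self, func, oid):
        """Start func on an object and return a job ID right away."""
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = self.pool.submit(func, self.objects[oid])
        return job_id

    def query(self, job_id):
        """Report job status without blocking."""
        future = self.jobs[job_id]
        if future.done():
            return ("done", future.result())
        return ("running", None)

    def wait(self, job_id):
        """Block until the job finishes and return its result."""
        return self.jobs[job_id].result()


def parallel_sum(osds, oid):
    # Launch on every OSD first, then collect, so the jobs overlap in time.
    job_ids = [osd.execute_async(sum, oid) for osd in osds]
    return sum(osd.wait(job_id) for osd, job_id in zip(osds, job_ids))
```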

Page 10

Security

• Multiprocess implementation

– Limits AS functions from directly accessing objects

– Limits access to the object services library

– Enforces use of object security mechanisms

• chroot sandboxing

– C/Java engines run in a chroot directory

– Allows limited system libraries, e.g. libc
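The multiprocess idea can be sketched without root privileges: the active-storage function runs in a separate process with no handle on the object store, so its only route to data is a request/reply channel where the storage service checks each request. This is an illustrative model, not the talk's implementation. The JSON-over-pipes protocol, the `allowed` set, and all names are assumptions, and the chroot step is omitted because it needs elevated privileges.

```python
import json
import subprocess
import sys

# Hypothetical active-storage function, run in its own process.
# It cannot see the object store; it can only send RPC requests on stdout
# and read replies on stdin.
CHILD_SOURCE = r"""
import json, sys

def rpc(request):
    sys.stdout.write(json.dumps(request) + "\n")
    sys.stdout.flush()
    return json.loads(sys.stdin.readline())

data = rpc({"op": "read", "object": "obj1"})
denied = rpc({"op": "read", "object": "secret"})
rpc({"op": "done", "result": [data, denied]})
"""


def run_sandboxed(store, allowed):
    """Serve the child's RPCs, enforcing access checks on every request."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    result = None
    for line in proc.stdout:
        request = json.loads(line)
        if request["op"] == "done":
            result = request["result"]
            reply = {"ok": True}
        elif request["op"] == "read" and request["object"] in allowed:
            reply = {"ok": True, "data": store[request["object"]]}
        else:
            reply = {"ok": False, "error": "access denied"}
        proc.stdin.write(json.dumps(reply) + "\n")
        proc.stdin.flush()
        if request["op"] == "done":
            break
    proc.stdin.close()
    proc.stdout.close()
    proc.wait()
    return result
```

Because the child never holds a reference to `store`, the access check in the parent cannot be bypassed by the function's own code, which is the property the process boundary is meant to provide.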

Page 11

Security

• Multiprocess Implementation

o Limits AS functions from directly accessing objects

o Limits access to the OSD services library

  – Forces the use of RPC

o Enforces the use of OSD security mechanisms

• Chroot Sandboxing

o Applied to engines

o Confines engines to a single directory

o Allows limiting of libraries

  – AS versions of libraries possible

Page 12

Active Storage Code Example

Page 13

Results: AES Local vs. Active Storage

Page 14

Results: Scaling with Multiple OSDs

Page 15

Results: C vs. Java

Page 16

High Performance Computing

• Active storage network

– Computing in the network

– SIMD-like processing of data in motion

– Adaptive computing network elements

– Application optimizations for database queries, scientific applications, data mining, sort, etc.
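Processing data in motion can be pictured as a pipeline in which a network element applies the same operation to every record of every chunk that passes through it, before the data reaches the requester. The sketch below uses Python generators to model this for a simple query filter. The function names and the chunked stream model are assumptions for illustration and say nothing about how the actual network elements are built.

```python
def storage_stream(records, chunk_size):
    """Storage node: emit records in fixed-size chunks."""
    for i in range(0, len(records), chunk_size):
        yield records[i:i + chunk_size]


def network_element(chunks, operation):
    """Apply one operation to every record of each chunk passing through."""
    for chunk in chunks:
        kept = [record for record in chunk if operation(record)]
        if kept:  # drop chunks that became empty, saving downstream traffic
            yield kept


def receive(chunks):
    """Client: flatten whatever chunks arrive."""
    return [record for chunk in chunks for record in chunk]
```

Chaining the three stages means the client only ever sees records that satisfy the query, and the filtering cost is paid while the data is already in flight.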

Page 17

Active Storage Networks

Data Sort
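The figure for this slide did not survive in the transcript, so the following is only one plausible reading of a sort carried out in the network: each storage node emits its local data in sorted order, and a network element merges the sorted streams as they pass through. The function names are hypothetical, and `heapq.merge` from the Python standard library stands in for the merging element.

```python
import heapq


def node_stream(local_records):
    """Storage node: sort local data and stream it out in order."""
    yield from sorted(local_records)


def network_merge(*streams):
    """Network element: lazily merge already-sorted input streams."""
    return heapq.merge(*streams)
```

The merge holds only one pending record per input stream at a time, which fits the idea of operating on data while it moves rather than buffering the full data set.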

Page 18

BECAT Collaboration

• Large Data Problems

• Parallel File Systems Implementation