
Applying Data Grids to Support Distributed Data Management

Storage Resource Broker

Reagan W. Moore
Ian Fisk
Bing Zhu
University of California, San Diego

[email protected]
http://www.npaci.edu/DICE/

Data Management Systems

• Data sharing - data grids
  – Federation across administration domains
  – Latency management
  – Sustained data transfers

• Data publication - digital libraries
  – Discovery
  – Organization

• Data preservation - persistent archives
  – Technology management
  – Authenticity

Consistent Data Environments

• Storage Resource Broker combines the functionality of data grids, digital libraries, and persistent archives within a single data environment

• SRB provides
  – Metadata consistency
  – Latency management functions
  – Technology evolution management

Metadata Consistency

• Storage Resource Broker uses a logical name space to assign global identifiers to digital entities
  – Files, SQL command strings, database tables, URLs

• State information that characterizes the result of operations on the digital entities is mapped onto the logical name space

• Consistency of state information is managed as update constraints on the mapping
  – Write locks, synchronization flags, schema extension
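The mapping described above can be sketched as a toy model. This is an illustrative sketch only, not SRB's actual MCAT API; the class name, flag names, and example paths are all invented for the example:

```python
class LogicalNameSpace:
    """Toy model of a logical name space: a global identifier maps to a
    set of physical replicas plus state information, and consistency is
    enforced as update constraints (here, a write lock and a sync flag)."""

    def __init__(self):
        self._entries = {}  # logical name -> replica list + state info

    def register(self, logical_name, physical_location):
        self._entries[logical_name] = {
            "replicas": [physical_location],
            "locked": False,  # write lock
            "synced": True,   # synchronization flag
        }

    def lock(self, logical_name):
        self._entries[logical_name]["locked"] = True

    def add_replica(self, logical_name, physical_location):
        entry = self._entries[logical_name]
        if entry["locked"]:  # update constraint: refuse changes while locked
            raise RuntimeError("write lock held; update refused")
        entry["replicas"].append(physical_location)
        entry["synced"] = False  # replicas now need resynchronization


ns = LogicalNameSpace()
ns.register("/sdsc/demo/file1", "hpss://archive/f1")
ns.add_replica("/sdsc/demo/file1", "unix://cache/f1")
```

The point of the sketch is that clients only ever see the logical name; replica locations and consistency state live behind the mapping, which is what lets MCAT mediate every update.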

• SRB state information is managed in the MCAT metadata catalog

SRB Latency Management

[Diagram: latency management techniques between source and destination across the network]
• Replication - server-initiated I/O
• Streaming - parallel I/O
• Caching - client-initiated I/O
• Remote proxies, staging
• Data aggregation - containers
• Prefetch

SRB 2.0 - Parallel I/O

• Client-directed parallel I/O - Client/Server
  – Thread-safe client
  – Client decides the number of threads to use
  – Each thread is responsible for a data segment and connects to the server independently
  – Utilities srbpput and srbpget

• Sustains 80% to 90% of available bandwidth using 4 parallel I/O streams and a window size of 800 kBytes
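The client-directed scheme above can be sketched as follows: the client picks the thread count and each thread owns one byte-range segment. In real SRB each thread opens its own server connection; this hypothetical sketch simulates that with independent reads of a local file:

```python
import threading


def parallel_get(path, num_threads=4):
    """Fetch a file by splitting it into num_threads byte-range segments,
    one worker thread per segment (simulating independent connections)."""
    with open(path, "rb") as f:
        f.seek(0, 2)          # seek to end to learn the file size
        size = f.tell()
    seg = (size + num_threads - 1) // num_threads  # ceil division
    parts = [b""] * num_threads

    def fetch(i):
        # In SRB each thread would connect to the server independently;
        # here each thread just reads its own segment from disk.
        with open(path, "rb") as f:
            f.seek(i * seg)
            parts[i] = f.read(seg)

    threads = [threading.Thread(target=fetch, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return b"".join(parts)
```

Because every segment is an independent stream, a slow or lossy path hurts only one quarter of the transfer, which is how multiple streams recover most of the available bandwidth.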

SRB 2.0 - Parallel I/O (cont1)

• Server-directed parallel I/O - Client/Server
  – Server plans and decides the number of threads to use
  – Separate "control" and "data transfer" sockets
  – Client listens on the "control" socket and spawns threads to handle data transfer
  – Always a one-hop data transfer between client and server
  – Similar to HPSS

• Works seamlessly with HPSS Mover protocol
• Also works for other file systems

SRB 2.0 - Parallel I/O (cont2)

• Parallel I/O - Server/Server
  – Copy, replicate, and staging operations
  – Always used in third-party transfer operations
    • Server/server data transfer, client not involved
  – Uses up to 4 threads depending on file size
  – 7-10 times improvement for large files across country
  – Up to 39 MB/sec across campus (PC RAID disk, gigabit Ethernet)
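The slide says the server/server path uses "up to 4 threads depending on file size" but does not give the policy. A planner of that shape might look like the sketch below; the 8 MB-per-thread threshold is invented for illustration and is not SRB's actual rule:

```python
MB = 1024 * 1024


def plan_threads(file_size, max_threads=4, min_segment=8 * MB):
    """Pick a thread count: roughly one thread per min_segment of data,
    at least 1, capped at max_threads (4, per the slide)."""
    if file_size <= 0:
        return 1
    return max(1, min(max_threads, file_size // min_segment or 1))
```

Small files get a single stream (the setup cost of extra sockets would dominate), while anything large enough saturates at the 4-thread cap.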

Federated SRB Server Model

[Diagram: a read application presents a logical name or an attribute condition to an SRB agent; the agent consults the MCAT catalog, which performs (1) logical-to-physical mapping, (2) identification of replicas, and (3) access & audit control; the request is brokered peer-to-peer across federated SRB servers, which spawn server(s) for data access and return the selected replicas (R1, R2) via parallel data access.]

SRB 2.0 - Bulk operations

• Uploading and downloading large numbers of small files
  – Multi-threaded

• Bulk registration - 500 files in one call
  – Fill 8 MB buffer before sending
  – Use of containers

• New Sbload and Sbunload utilities
  – Over 100 files per second registration
  – 3-10+ times speedup
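The batching behind bulk registration can be sketched as follows, mirroring the "fill 8 MB buffer before sending" and "500 files in one call" figures above. The function name and the list-of-tuples input format are invented for the example:

```python
MB = 1024 * 1024


def batch_files(files, max_bytes=8 * MB, max_count=500):
    """files: iterable of (name, size) pairs. Yields batches of names,
    flushing whenever adding a file would exceed the byte budget or the
    per-call file-count limit."""
    batch, total = [], 0
    for name, size in files:
        if batch and (total + size > max_bytes or len(batch) >= max_count):
            yield batch
            batch, total = [], 0
        batch.append(name)
        total += size
    if batch:  # flush the final partial batch
        yield batch
```

Amortizing one round trip over hundreds of small files is what turns per-file registration into the 100+ files/second and 3-10x speedups the slide reports.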

SDSC Storage Resource Broker & Meta-data Catalog

[Architecture diagram]
• Access APIs (application side): C, C++ libraries; Unix shell; Java, NT browsers; OAI, WSDL; GridFTP; Linux I/O; DLL/Python; HRM
• Server layer: logical name space; latency management; data transport; metadata transport; consistency management / authorization-authentication (prime server)
• Storage abstraction: archives (HPSS, ADSM, UniTree, DMF); file systems (Unix, NT, Mac OS X); databases (DB2, Oracle, Postgres)
• Catalog abstraction: databases (DB2, Oracle, Sybase, SQLServer)

Technology Management

SRB Archival Tape Library System

• SRB archival storage system in addition to HPSS, UniTree, ADSM
  – A distributed pool of disk caches for the front end
  – A tape library system back end
    • STK silo for tape storage and tape mounts
    • 3590 tape drives

• I/O always performed on disk cache
  – Always stage data to cache

CMS Experiment

• Ian Fisk - user-level application
  – Installed SRB servers at CERN, Fermilab, UCSD under a user account

• Remotely invoked data replication
  – From UCSD, invoked data replication from CERN to Fermilab and to UCSD
  – Data transfers automatically used four parallel I/O streams with a default window size of 800 kBytes

• Observed
  – Sustained data transfer at 80% to 90% of available bandwidth
  – Transferred over 1 TB of data per day using multiple sessions
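As a back-of-envelope check (my arithmetic, not a figure from the slides), sustaining 1 TB per day corresponds to an average rate of roughly 12 MB/s, well within reach of the per-transfer rates quoted earlier:

```python
# 1 TB/day expressed as an average transfer rate in MB/s.
TB = 10**12
seconds_per_day = 24 * 60 * 60
avg_rate_mb_s = TB / seconds_per_day / 10**6
print(round(avg_rate_mb_s, 1))  # prints 11.6
```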

Future Plans

• SRB 2.1 - Grid-oriented features, SRB-G (5/31/03)
  – Add GridFTP driver - access data through a GridFTP server
  – Upgrade to GSI 2.2 (GSI 1.1 in current version)
  – Provide encrypted data transfer facility, using GSI encryption, between servers and between server and client
    • Explore network encryption as a digital entity property
  – WSDL services interface for SRB, including data movement, replication, access control, metadata ingestion and retrieval, and container support

• SRB 2.2 - Federated MCATs (8/30/03)
  – Peer-to-peer MCATs
  – Mount-point-like interface - /sdsc/…, /caltech/…

Next CMS Experiments

• Sustained transfer
  – Use 4 MB window size

• Bulk data registration
  – In tests with the DOE ASCI project, sustained registration of 400 files per second

• Peer-to-peer federation
  – Prototype of the ability to initiate data and metadata exchanges between MCAT catalogs

For More Information

Reagan W. Moore
San Diego Supercomputer Center

[email protected]

http://www.npaci.edu/DICE

http://www.npaci.edu/DICE/SRB/index.html

http://www.npaci.edu/dice/srb/mySRB/mySRB.html