Production Data Grids
SRB - iRODS
Storage Resource Broker
Reagan W. Moore
http://www.sdsc.edu/srb
Topics
• Production data grids
• Architecture and installation challenges
• Production challenges
• Interoperability challenges (federation)
• Applications
  • Data grids - sharing
  • Digital libraries - publication
  • Persistent archive - preservation
  • Real-time sensor data - collection
  • Cyberinfrastructure - analysis
BaBar High-Energy Physics
• Stanford Linear Accelerator
  • Palo Alto, CA
• IN2P3
  • Lyon, France
• A functioning international Data Grid for high-energy physics
Manchester-SDSC mirror
Moved over 300 TBs of data
Increasing to 5 TBs per day
Architecture Challenges
• Infrastructure heterogeneity
  • Storage in file systems, archives, ORBs
  • Choice of database for metadata catalog
• Network devices
  • Management of firewalls, virtual private networks, load levelers
• Network latency
  • Geographic distance between storage locations
Installation Choices
• Infrastructure heterogeneity
  • Provision of drivers for each type of storage system or database
  • Porting of APIs for each preferred access mechanism
• Network devices
  • Establishment of range of ports for access through firewall
  • Server-initiated parallel I/O and bulk operations
• Network latency
  • Master-slave metadata catalogs
  • Federation across multiple independent data grids
Using a Data Grid – in Abstract
[Figure: a user sends "Ask for data" to the Data Grid, and the Data Grid responds with "Data delivered"]
• User asks for data from the data grid
• The data is found and returned
• Where & how details are hidden
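The abstraction on this slide can be sketched as a catalog that maps a logical file name to one or more physical replicas. This is a minimal illustration only: the `Replica` and `Catalog` classes, the resource names, and the paths are hypothetical and are not the SRB or iRODS API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Replica:
    """One physical copy of a file (hypothetical structure)."""
    resource: str        # logical resource name, e.g. "disk-b"
    physical_path: str   # where the bytes live on that resource
    online: bool = True


@dataclass
class Catalog:
    """Toy metadata catalog: logical file name -> list of replicas."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_name, replica):
        self.entries.setdefault(logical_name, []).append(replica)

    def resolve(self, logical_name):
        """Return the first online replica registered for a logical name."""
        for replica in self.entries.get(logical_name, []):
            if replica.online:
                return replica
        raise LookupError(f"no online replica for {logical_name}")


catalog = Catalog()
catalog.register("/home/demo/run42.dat",
                 Replica("tape-a", "/archive/0001/run42.dat", online=False))
catalog.register("/home/demo/run42.dat",
                 Replica("disk-b", "/data/srv2/run42.dat"))

# The user supplies only the logical name; where and how stay hidden.
chosen = catalog.resolve("/home/demo/run42.dat")
print(chosen.resource, chosen.physical_path)
```

Because the caller never sees the physical path until the catalog resolves it, a replica can move or go offline without any change to the name the user asks for.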
Data Grid Management
• Data grids integrate multiple system components
  • Application level client software
  • Federation software
  • Data grid servers
  • Data grid metadata catalog
  • Security infrastructure
  • Storage systems
  • Database catalog
  • Network
• A failure in any of the systems is viewed as a failure of the data grid
Operation Challenges
• Data grids provide mechanisms to analyze all types of infrastructure failure
  • Integrity checks
  • Authenticity checks
  • System logs
• Data grids provide mechanisms to manage all types of infrastructure failure
  • Replication of data and metadata
  • Synchronization of replicas
  • Federation of data grids
  • Server rebooting and server maintenance mode
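An integrity check followed by replica synchronization might look like the sketch below. It assumes that a checksum is registered when the file is ingested and that replicas can be compared against it. The in-memory `replicas` dictionary, the site names, and the choice of SHA-256 are illustrative assumptions, not a description of how SRB stores or verifies data.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of a replica's contents."""
    return hashlib.sha256(data).hexdigest()


def find_bad_replicas(replicas: dict, expected: str) -> list:
    """Integrity check: names of replicas whose checksum differs."""
    return sorted(name for name, data in replicas.items()
                  if checksum(data) != expected)


def synchronize(replicas: dict, expected: str) -> list:
    """Overwrite each bad replica from an intact one; return repaired names."""
    bad = find_bad_replicas(replicas, expected)
    good = [name for name in replicas if name not in bad]
    if not good:
        raise RuntimeError("no intact replica left to copy from")
    for name in bad:
        replicas[name] = replicas[good[0]]
    return bad


original = b"detector event block"
registered = checksum(original)  # value recorded in the catalog at ingest

replicas = {
    "site-a": original,
    "site-b": b"detector event blocc",  # simulated corruption
    "site-c": original,
}

repaired = synchronize(replicas, registered)
print("repaired:", repaired)
```

Running the check again after synchronization should report no bad replicas, which is the condition a periodic integrity sweep would look for.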
Operation Procedures
• Periodic system administration
  • Manage integrity checks on data
  • Manage audit trails
  • Manage consistency checks on collections
  • Manage synchronization of replicas
  • Manage deletion of files (empty trash can)
  • Track all errors and reported data losses
  • Manage upgrades to new versions of the data grid servers
• Operational tasks for each data grid
  • Add servers for new storage systems
  • Add new users
  • Respond to user questions
  • Modify access controls on collections and storage
  • Restart data grid servers as needed
  • Identify problems with storage systems
  • Respond to installation questions
  • Integrate user interfaces with data grid
Automation of Management Tasks
• integrated Rule-Oriented Data System - iRODS
  • Express management policies as rules that control the execution of micro-services
    • Micro-service is a standard operation performed on a remote storage system
• Manage persistent state information that describes outcome of the micro-service
  • Persistent Metadata catalog stores state information
• Virtualize the management policies
  • Logical name space for rules
  • Logical name space for micro-services
  • Logical name space for state information
• First release in December 2006
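The relationship between rules, micro-services, and persistent state can be sketched as a small event-driven engine. Everything here is a hypothetical illustration: the `Rule` class, the micro-service names, and the `STATE` list standing in for the metadata catalog are assumptions, and this is not the iRODS rule language or its API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical micro-services: each performs one operation, returns a status.
def compute_checksum(ctx: dict) -> str:
    ctx["checksum"] = f"sum-of-{ctx['name']}"  # placeholder, not a real digest
    return "ok"


def replicate(ctx: dict) -> str:
    ctx["replicas"] = ctx.get("replicas", 1) + 1
    return "ok"


# Logical name space for micro-services: rules refer to names, not functions.
MICRO_SERVICES: Dict[str, Callable[[dict], str]] = {
    "checksum": compute_checksum,
    "replicate": replicate,
}


@dataclass
class Rule:
    event: str                         # event that triggers the rule
    condition: Callable[[dict], bool]  # policy test on the request context
    actions: List[str]                 # micro-service names, run in order


RULES = [
    Rule("put", lambda ctx: ctx["collection"] == "/archive",
         ["checksum", "replicate"]),
    Rule("put", lambda ctx: True, ["checksum"]),
]

STATE: List[tuple] = []  # stands in for the persistent metadata catalog


def apply_rules(event: str, ctx: dict) -> None:
    """Run the first matching rule and record each micro-service outcome."""
    for rule in RULES:
        if rule.event == event and rule.condition(ctx):
            for name in rule.actions:
                status = MICRO_SERVICES[name](ctx)
                STATE.append((ctx["name"], name, status))
            return


apply_rules("put", {"name": "a.dat", "collection": "/archive"})
apply_rules("put", {"name": "b.dat", "collection": "/scratch"})
print(STATE)
```

Changing a policy in this sketch means editing the `RULES` list rather than the micro-services themselves, which is the separation the slide describes.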
iRODS - integrated Rule-Oriented Data System
[Figure: iRODS architecture diagram. Labeled components: Client Interface, Admin Interface, Resources, Rule Invoker, Rule Engine, Rule Base, Current State, Confs, Metadata Persistent Repository, Service Manager, Micro Service Modules (Resource-based Micro-services), Micro Service Modules (Metadata-based Micro-services), Metadata Modifier Module, Config Modifier Module, Rule Modifier Module, and three Consistency Check Modules]
Interoperation Virtualization
• Management of federation with other data grid technologies
  • Define micro-service that executes the protocols required by the alternate data grid
  • Define rule for when this micro-service is executed (link to explicit storage location)
  • Separately manage state information from application of this micro-service
• iRODS enables encapsulation of the rules, access mechanisms, and state information needed for interoperation with other data grids
Federation Between Data Grids
[Figure: two federated data grids, one holding Data Collection A and the other Data Collection B, reached through Data Access Methods (Web Browser, DSpace, OAI-PMH)]
Each data grid manages its own:
• Logical resource name space
• Logical user name space
• Logical file name space
• Logical persistent state name space
• Logical rule name space
• Logical micro-service name space
Access controls and consistency constraints on cross registration of logical name spaces
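Cross registration of one logical name space between two data grids can be sketched as follows, using the user name space as the example. The `DataGrid` class, the `name@zone` convention, and the zone and file names are hypothetical choices for illustration. The sketch shows only the idea that each grid keeps its own name spaces and applies its own access controls to names registered from a peer.

```python
from dataclasses import dataclass, field


@dataclass
class DataGrid:
    """Toy data grid with its own user name space and access controls."""
    zone: str
    users: set = field(default_factory=set)
    acl: dict = field(default_factory=dict)  # logical file -> allowed users

    def add_user(self, name):
        self.users.add(f"{name}@{self.zone}")

    def cross_register_user(self, other, name):
        """Register a user from another grid, if that grid knows the user."""
        qualified = f"{name}@{other.zone}"
        if qualified not in other.users:
            raise ValueError(f"{qualified} is unknown in zone {other.zone}")
        self.users.add(qualified)

    def grant(self, logical_file, qualified_user):
        """Grant read access, but only to users registered in this grid."""
        if qualified_user not in self.users:
            raise PermissionError(f"{qualified_user} is not registered here")
        self.acl.setdefault(logical_file, set()).add(qualified_user)

    def can_read(self, logical_file, qualified_user):
        return qualified_user in self.acl.get(logical_file, set())


grid_a = DataGrid("zoneA")
grid_b = DataGrid("zoneB")
grid_a.add_user("alice")

# Grid B accepts alice from grid A, then applies its own access control.
grid_b.cross_register_user(grid_a, "alice")
grid_b.grant("/zoneB/collectionB/file1", "alice@zoneA")
print(grid_b.can_read("/zoneB/collectionB/file1", "alice@zoneA"))
```

Each grid stays independent in this sketch: grid A never gains authority over grid B's collections, and grid B decides which cross-registered users may read which files.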
OGF Data Grid Federation
Data Grid  Country      SRB version  SRB Zone name  Storage Resource Logical Name
APAC       Australia    3.4.0-P      AU             StoreDemoResc_AU
NOAO       Chile/US     3.4.2        noao-ls-t3-z1  noao-ls-t3-fs
ChinaGrid  China        CGSP-II
RNP        Brazil       3.4.1-P2     GGF-RNP        demoResc
UERJ       Brazil       3.4.1-P2     UERJ-HERPGrid  demoResc
IN2P3      France       3.4.2        ccin2p3        LyonFS4
DEISA      Italy        3.4.0-P      DEISA          demo-cineca
KEK        Japan        3.4.0-P      KEK-CRC        rsr01-ufs
SARA       Netherlands  3.4.0-P      SARA           SaraStore
IB         New Zealand  3.4.1        aucklandZone   aucklandResc
ASGC       Taiwan       3.4.0-P      TWGrid         SDSC-GGF_LRS1
NCHC       Taiwan       3.4.0-P      ecogrid        ggf-test
CCLRC      UK           3.4.0-P      tdmg2zone
IB         UK           3.4.1        avonZone       avonResc
WunGrid    UK           3.3.1        SDSC-wun       sfs-tape
LCDRG      US           3.4.2-P2     LCDRG-GGF      demoResc
Purdue     US           3.4.0-P      Purdue         uxResc1
Teragrid   US           3.4.0-P2     SDSC-GGF       sfs-disk
U Md       US           3.4.0-P      umiacs         narasrb02-unix1
For More Information
Reagan W. Moore
San Diego Supercomputer Center
http://www.sdsc.edu/srb/