INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
University of Coimbra
GSAF
Grid Storage Access Framework
Salvatore Scifo
INFN of Catania
EGEE User Forum
Manchester, UK - 10th-11th May 2007
Partnership
• Grid Storage Access Framework
– The project is carried out jointly by INFN and IR&T engineering s.r.l. (an SME located in Catania, Italy).
– The work is part of the Trinacria Grid Virtual Laboratory e-Infrastructure project and of the ADAT project ("Archivi Digitali Antico Testo"), which aims to implement a Digital Archive for Cultural Heritage data (antique manuscripts) with a Digital Repository based on EGEE grid services.
• Resources
– INFN: S. Scifo (s.scifo@ct.infn.it), A. Calanducci (a.calanducci@ct.infn.it), on behalf of the GILDA Team
– IR&T engineering (http://www.irt-engineering.it): V. Milazzo (v.milazzo@irt-engineering.it), A. Magrì (a.magri@irt-engineering.it)
Web Integration Requirements
• Main objectives of the web application
– Infrastructure side
    Organize and handle large amounts of information
    Share documents among several organizations
– Security side
    Define and apply Access Control Policies
– Development side
    Build applications without specific technical knowledge of the adopted infrastructure (high-level API)
    Build and maintain dynamic web content (simple tools to manage repositories for provisioning purposes)
– User side
    Manage Groups and Users (administrative user profiles)
    Manage Digital Resources (authoring user profiles)
    Access, search and retrieve files/data easily (end-user profiles)
Grid as Digital Repository
• Repository virtualization provided by
– interfaces to manage DATA
– an interface to manage METADATA
[Diagram: the GRID DMS comprises a Storage Element (SRM), a File Catalog (LFC) and a Metadata Catalog (AMGA), reached through the GFAL API, the Catalog API and the Metadata API]
• Data Management capabilities
– Handling of large and numerous files, also in distributed environments
– Ubiquity: data can be accessed independently of their location
• Security capabilities
– Centralized access control mechanism based on X.509 certificates
• System capabilities
– Availability, Scalability, Fault Tolerance
Classic Web Application
• The Data Presentation Layer consists of all the graphical interfaces that let the user interact with the application
• The Data Business Layer collects all the software components that implement the behavior of the application
• The Data Access Layer is made up of the software components that allow the application to manage data (ASCII files, XML files, digital objects, metadata, SQL data)
• Data Access Layer components interact with several types of data sources
– File System (for data stored in files)
– Relational Database Management System (for data organized in SQL tables)
[Diagram: Classic Application stack: Data Presentation Layer, Data Business Layer, Data Access Layer; below it the POSIX API and the Database API lead to the File System and the RDBMS]
Grid Web Application
• Grid environment porting aspects
– files are stored inside a Storage Element (SE)
– files can be replicated on several SEs for ubiquity, security and sharing needs; the relationships among file locations, replicas and their identifiers are kept within a specific File Catalogue Service
– descriptive metadata can be associated with each file and are organized through a specific Metadata Catalogue Service
• Technical approach
– replace the Data Access Layer with an appropriate interface that permits:
    business components to manage data stored within the DMS
    presentation objects to search and retrieve data from the DMS
[Diagram: GRID Application stack: Data Presentation Layer, Data Business Layer, Data Access Layer; below it the Storage Element API, File Catalog API and Metadata Catalog API lead to the GRID Data Management System]
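The idea of swapping only the Data Access Layer can be sketched roughly as follows. This is a minimal illustration, not GSAF code: the interface `DataAccessLayer`, both implementations and `ManuscriptService` are hypothetical names, and the grid services are replaced by in-memory maps so that the example runs without any middleware.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical contract between the business layer and whatever stores the data.
interface DataAccessLayer {
    void store(String logicalName, byte[] content, Map<String, String> metadata);
    Optional<byte[]> retrieve(String logicalName);
}

// In-memory stand-in for the classic stack: a file system plus an RDBMS table.
final class ClassicDataAccess implements DataAccessLayer {
    private final Map<String, byte[]> fileSystem = new HashMap<>();
    private final Map<String, Map<String, String>> sqlTable = new HashMap<>();

    @Override
    public void store(String logicalName, byte[] content, Map<String, String> metadata) {
        fileSystem.put(logicalName, content.clone());
        sqlTable.put(logicalName, new HashMap<>(metadata));
    }

    @Override
    public Optional<byte[]> retrieve(String logicalName) {
        return Optional.ofNullable(fileSystem.get(logicalName));
    }
}

// In-memory stand-in for the grid stack: Storage Element, File Catalogue, Metadata Catalogue.
final class GridDataAccess implements DataAccessLayer {
    private final Map<String, byte[]> storageElement = new HashMap<>();      // physical name -> bytes
    private final Map<String, String> fileCatalogue = new HashMap<>();       // logical name -> physical name
    private final Map<String, Map<String, String>> metadataCatalogue = new HashMap<>();

    @Override
    public void store(String logicalName, byte[] content, Map<String, String> metadata) {
        String physicalName = "se-replica-" + storageElement.size();         // made-up naming scheme
        storageElement.put(physicalName, content.clone());
        fileCatalogue.put(logicalName, physicalName);
        metadataCatalogue.put(logicalName, new HashMap<>(metadata));
    }

    @Override
    public Optional<byte[]> retrieve(String logicalName) {
        // Resolve the logical name first, then fetch the bytes from the storage element.
        return Optional.ofNullable(fileCatalogue.get(logicalName)).map(storageElement::get);
    }
}

// Business component: depends only on the interface, so it is unchanged by the port.
final class ManuscriptService {
    private final DataAccessLayer dal;

    ManuscriptService(DataAccessLayer dal) {
        this.dal = dal;
    }

    void archive(String name, String text) {
        dal.store(name, text.getBytes(StandardCharsets.UTF_8), Map.of("type", "manuscript"));
    }

    String read(String name) {
        return dal.retrieve(name).map(b -> new String(b, StandardCharsets.UTF_8)).orElse("");
    }
}
```

With this shape, `new ManuscriptService(new ClassicDataAccess())` and `new ManuscriptService(new GridDataAccess())` behave identically from the business layer's point of view; only the constructor argument changes.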
Designers point of view
• Development of applications (web or desktop) is not easy
– Grid Data Services are independent of each other
– They work in "stand-alone" mode
– No coherence among them is ensured
• This fragmentation forces software engineers and web designers to adopt a vertical architecture
• The application must take care of the atomicity, coherence and synchronization of data manipulation
[Diagram: the Application Data Access Layer contains a File Manager, a Catalog Manager and a Metadata Manager; they use the GFAL API, the Catalog API and the Metadata API to reach the Storage Element (SRM), File Catalog (LFC) and Metadata Catalog (AMGA) of the GRID DMS]
Users point of view
• Users can use only command-line tools
• These tools are installed on specific machines called User Interfaces (UI), located inside the Grid network boundaries
• Users encounter several network access problems
• If a user has a personal UI, who ensures its security?
• All logical relationships among data and metadata must be kept in the user's mind
[Diagram: User Workstations, Firewall, GRID User Interface and GRID Services; protocols shown: SSH, SSL, GSIFTP, HTTPS]
GSAF solution
• GSAF is an Object Oriented Framework
– built on top of the Grid Metadata Service and the Grid Data Service; it exposes classes and related methods to the applications located above
[Diagram: Web Application 1, Web Application 2 and Web Application 3 sit on the Grid Data Access Framework, which sits on the GRID Metadata Service and the GRID Data Service, hosted on the GRID FARM (Redundancy, High Availability, Data Backup & Recovery, High Storage Capability, Net Access Security)]
• Main objectives
– hide the complexity and fragmentation of the several underlying APIs
– group functional requirements shared among applications
– ensure atomicity among different data manipulations
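A facade of the kind these objectives describe could look roughly like the sketch below. All class and method names (`StorageFacade`, `upload`, `searchByMetadata`, the three `*Stub` classes) and the `srm://se.example.org` address are assumptions made for illustration; they are not taken from the GSAF API, and the stubs only imitate the SRM, LFC and AMGA services in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a Storage Element.
final class StorageElementStub {
    private final Map<String, byte[]> files = new HashMap<>();

    String put(byte[] content) {
        String surl = "srm://se.example.org/data/" + files.size();  // made-up storage URL
        files.put(surl, content.clone());
        return surl;
    }

    byte[] get(String surl) {
        return files.get(surl);
    }
}

// Hypothetical in-memory stand-in for a File Catalogue (logical name -> storage URL).
final class FileCatalogStub {
    private final Map<String, String> lfnToSurl = new HashMap<>();

    void register(String lfn, String surl) {
        lfnToSurl.put(lfn, surl);
    }

    String resolve(String lfn) {
        return lfnToSurl.get(lfn);
    }
}

// Hypothetical in-memory stand-in for a Metadata Catalogue.
final class MetadataCatalogStub {
    private final Map<String, Map<String, String>> entries = new HashMap<>();

    void addEntry(String lfn, Map<String, String> attributes) {
        entries.put(lfn, new HashMap<>(attributes));
    }

    List<String> find(String attribute, String value) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : entries.entrySet()) {
            if (value.equals(e.getValue().get(attribute))) {
                hits.add(e.getKey());
            }
        }
        return hits;
    }
}

// Facade: applications see one object and one call per use case,
// instead of coordinating three separate services themselves.
final class StorageFacade {
    private final StorageElementStub storage;
    private final FileCatalogStub catalog;
    private final MetadataCatalogStub metadata;

    StorageFacade(StorageElementStub storage, FileCatalogStub catalog, MetadataCatalogStub metadata) {
        this.storage = storage;
        this.catalog = catalog;
        this.metadata = metadata;
    }

    // Upload = store the bytes, register the logical name, attach the metadata.
    void upload(String lfn, byte[] content, Map<String, String> attributes) {
        String surl = storage.put(content);
        catalog.register(lfn, surl);
        metadata.addEntry(lfn, attributes);
    }

    // Search by metadata, then follow the catalogue to the stored bytes.
    List<byte[]> searchByMetadata(String attribute, String value) {
        List<byte[]> result = new ArrayList<>();
        for (String lfn : metadata.find(attribute, value)) {
            result.add(storage.get(catalog.resolve(lfn)));
        }
        return result;
    }
}
```

The sketch covers only the "hide the fragmentation" and "group shared requirements" objectives; it does not undo partial work when one of the three steps fails, so atomicity would still need a separate mechanism.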
GSAF System Architecture
[Diagram: the GSAF Interface sits on top of the GRID Storage Access Framework, whose File Manager, Catalog Manager and Metadata Manager use the GFAL API, the Catalog API and the Metadata API to reach the Storage Element (SRM), File Catalog (LFC) and Metadata Catalog (AMGA) of the GRID DMS]
GSAF Functional Requirement
• Managing Metadata Schemas
• Managing ACLs to access Metadata
• Managing ACLs to access Data
• Uploading files to the SE
• Browsing the Metadata Catalogue / File Catalogue
• Searching files by Metadata
• Deleting files
GSAF Web Interface
• GSAF Web Interface to manage data and their metadata remotely
– Initially, the main purpose of this application was to serve as a natural tester of the framework's basic functionality
– it is a useful tool to administer the Grid Storage over the internet
• The Web Interface is the easiest approach
– for new users who don't have specific knowledge of the Grid environment
– no syntax rules are required, and users don't lose the high-level view of data or of metadata schemas
– interaction is immediate thanks to comfortable, friendly guided procedures that make training and learning faster
– the web application needs only a simple internet connection, so it avoids any dependency on the Grid UI machines
GSAF Web Interface
ADAT project
Use Cases
• ADAT Project
– embeds GSAF within the Digital Archive Software
• Physics Department of the University of Catania (PI2S2 project)
– aims to implement a Grid-oriented Digital Archive for DICOM images
• BM Portal project (Bio-Lab, DIST, University of Genoa)
– embeds the GSAF framework as a plug-in
• GILDA Team
– adopts the GSAF web interface for dissemination and training purposes
Outlook
• File Replica support
• VOMS Integration
• ACLs at Disk Pool Manager (DPM) level
– for coherence between File Catalogue permissions and DPM permissions
• Transaction Manager
– Serialization levels
– Transaction pattern: Execute(), Commit(), Rollback()
• All we need to integrate applications…
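The Execute() / Commit() / Rollback() pattern named under Transaction Manager is listed as future work, and the slides do not show its design. One common way to obtain that behaviour across independent services is to pair every action with a compensating action, as in the sketch below. `Step`, `GridTransaction` and `AtomicUpload` are hypothetical names, the three services are plain sets of logical file names, and serialization levels are not modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;

// Hypothetical unit of work: an action plus the compensation that undoes it.
interface Step {
    void apply();
    void undo();

    static Step of(Runnable action, Runnable compensation) {
        return new Step() {
            @Override public void apply() { action.run(); }
            @Override public void undo() { compensation.run(); }
        };
    }
}

// Hypothetical transaction object following the Execute / Commit / Rollback pattern.
final class GridTransaction {
    private final Deque<Step> applied = new ArrayDeque<>();

    // Runs the step now and remembers it; a step that throws is not remembered.
    void execute(Step step) {
        step.apply();
        applied.push(step);
    }

    // Makes the work permanent by discarding the recorded compensations.
    void commit() {
        applied.clear();
    }

    // Undoes the recorded steps, most recent first.
    void rollback() {
        while (!applied.isEmpty()) {
            applied.pop().undo();
        }
    }
}

// Example use: an upload that touches three services, all or nothing.
final class AtomicUpload {
    static boolean upload(Set<String> storage, Set<String> catalog, Set<String> metadata,
                          String lfn, boolean metadataFails) {
        GridTransaction tx = new GridTransaction();
        try {
            tx.execute(Step.of(() -> storage.add(lfn), () -> storage.remove(lfn)));
            tx.execute(Step.of(() -> catalog.add(lfn), () -> catalog.remove(lfn)));
            tx.execute(Step.of(() -> {
                if (metadataFails) {
                    // Simulated failure of the third service.
                    throw new IllegalStateException("metadata catalogue unavailable");
                }
                metadata.add(lfn);
            }, () -> metadata.remove(lfn)));
            tx.commit();
            return true;
        } catch (RuntimeException e) {
            tx.rollback();  // removes the entries already written to storage and catalog
            return false;
        }
    }
}
```

Compensation gives all-or-nothing behaviour after the fact, but it does not isolate concurrent operations from each other; that is the part the "Serialization levels" item would have to address.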
Conclusions
• GSAF means
– A useful API to develop Grid Storage based applications
– A useful and simple web interface to access Data Management Services remotely
• Extremely flexible, multi-platform and multi-user
– to be a cross-application-domain plug-in
• Comfortable usage of the Web Interface
– to be a simple Content Management Tool to manage data remotely
• Candidate for the EGEE RESPECT Program
– to become recommended external software for the EGEE middleware
References
• GSAF wiki pages
– https://grid.ct.infn.it/twiki/bin/view/TRIGRID/GSAF
• AMGA Web Interface wiki pages
– https://grid.ct.infn.it/twiki/bin/view/TRIGRID/AMGAWI
• AMGA Service and Java API
– http://project-arda-dev.web.cern.ch/project-arda-dev/metadata/index.html
• GFAL Java API
– http://grid-deployment.web.cern.ch/grid-deployment/gis/GFAL/gfal.3.html
– https://grid.ct.infn.it/twiki/bin/view/GILDA/APIGFAL
• LFC Java API
– http://wiki.egee-see.org/index.php/SEE-GRID_File_Management_Java_API
• IR&T engineering s.r.l.
– http://www.irt-engineering.com
• Trigrid VL
– http://www.trigrid.it