San Diego Supercomputer Center
SDSC Storage Resource Broker
SRB as data grid solution (Chinese version)
Arun Jagatheesan
San Diego Supercomputer Center
SRB Workshop, National Center for High-performance Computing (NCHC)
Taiwan, August 3, 2004
SRB?
SRB = Storage Resource Broker
More Chinese
Oops, don’t know any more Chinese to continue
Thanks to Layton Chen
SRB as data grid solution (English version)
Arun Jagatheesan
San Diego Supercomputer Center
Talk Outline
• Introduction to problem statement(s)
• How SRB is the solution
• SRB Project History
• SRB Team
• SRB Architecture (from the architect himself)
What problem, why SRB solution?
• Why are people using SRB?
• What problems did it solve for them?
• Who are these people?
• Did they use it because they liked Arun?
Southern California Earthquake Center
• Build community digital library
• Manage simulation and observational data
  • Anelastic wave propagation output
  • 10 TBs, 1.5 million files
• Provide web-based interface
  • Support standard services on digital library
• Manage data distributed across multiple sites
  • USC, SDSC, UCSB, SDSU, SIO
• Provide standard metadata
  • Community based descriptive metadata
  • Administrative metadata
  • Application specific metadata
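As an illustration of the three metadata categories listed above, here is a minimal sketch in Python. It is not SRB's catalog schema: the class name `CollectionEntry`, the helper `find`, the field names, and all sample values are hypothetical, chosen only to show how descriptive, administrative, and application-specific attributes could sit side by side on one file and be queried by category.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionEntry:
    """One file in a collection (hypothetical structure, not SRB's schema)."""
    logical_name: str
    descriptive: dict = field(default_factory=dict)     # community vocabulary
    administrative: dict = field(default_factory=dict)  # e.g. hosting site
    application: dict = field(default_factory=dict)     # e.g. simulation parameters


def find(entries, category, key, value):
    """Return logical names whose metadata in `category` has key == value."""
    return [e.logical_name for e in entries
            if getattr(e, category).get(key) == value]


# Sample values are invented for illustration only.
entries = [
    CollectionEntry("run01/wave_0001.dat",
                    descriptive={"type": "simulation"},
                    administrative={"site": "SDSC"},
                    application={"time_step": 1}),
    CollectionEntry("run01/wave_0002.dat",
                    descriptive={"type": "simulation"},
                    administrative={"site": "USC"},
                    application={"time_step": 2}),
]

# List the files whose administrative metadata places them at USC.
print(find(entries, "administrative", "site", "USC"))
```

Keeping the three categories in separate dictionaries is only one possible design; it makes it easy to ask a question in one vocabulary (for example, by site) without touching the others.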
SCEC Data Management Technologies
• Portals
  • Knowledge interface to the library, presenting a coherent view of the services
• Knowledge Management Systems
  • Organize relationships between SCEC concepts and semantic labels
• Process management systems
  • Data processing pipelines to create derived data products
• Web services
  • Uniform capabilities provided across SCEC collections
• Data grid
  • Management of collections of distributed data
• Computational grid
  • Access to distributed compute resources
• Persistent archive
  • Management of technology evolution
NASA Data Grids
• NASA Information Power Grid
  • NASA Ames, NASA Goddard
  • Distributed data collection using the SRB
• ESIP federation
  • Led by Joseph JaJa (U Md)
  • Federation of ESIP data resources using the SRB
• NASA Goddard Data Management System
  • Storage repository virtualization (Unix file system, Unitree archive, DMF archive) using the SRB
• NASA EOS Petabyte store
  • Storage repository virtualization for EMC persistent store using the Nirvana version of SRB
[Figure: TeraGrid network diagram]
TeraGrid: 13.6 TF, 6.8 TB memory, 900 TB network disk, 10 PB archive
• NCSA: 6+2 TF, 4 TB memory, 400 TB disk
• SDSC: 4.1 TF, 2 TB memory, 500 TB SAN
• Caltech: 0.5 TF, 0.4 TB memory, 86 TB disk
• ANL: 1 TF, 0.25 TB memory, 25 TB disk
Labels in the diagram include the Chicago & LA DTF core switch/routers and the external networks ESnet, HSCC, MREN/Abilene, Starlight, vBNS, Calren, and NTON.
NIH BIRN SRB Data Grid
• Biomedical Informatics Research Network
  • Access and analyze biomedical image data
  • Data resources distributed throughout the country
  • Medical schools and research centers across the US
• Stable high performance grid based environment
  • Coordinate data sharing
  • Federate collections
  • Support data mining and analysis
BIRN: Inter-organizational Data
SDSC SRB User Community (Major US)
• BaBar, Stanford Linear Accelerator Center (SLAC)
• California Digital Library (CDL)
• Center for Integrated Space Weather Modeling (CISM)
• CVC, Visualization Portal
• LDC Data Storage
• NIH Biomedical Informatics Research Network (BIRN)
• NSF Southern California Earthquake Center (SCEC)
• National Archives and Records Administration (NARA)
• National Aeronautics and Space Administration Centers (NASA)
• National Virtual Observatory (NVO)
• Npackage, NSF Middleware Initiative (NMI)
• National Science Digital Library (NSDL)
• National Optical Astronomy Observatory (NOAO)
• ROADNet
• Purdue University
• SCCOOS, USA
• Scientific Rich Media Archive
• Salk Institute
• Strand Map Service, USA
• UC Berkeley Library
• UCSD Library
• University of Houston
• Persistent Archives Test bed
• University of Wisconsin, Madison
• WebBase, Stanford University
• Yale University Library
SDSC SRB User Community
• Academia Sinica, Taiwan
• Australian National University
• Bio-Lab, University of Genoa, Italy
• Council for the Central Laboratory of the Research Councils (CCLRC), UK
• CC-IN2P3, France
• Distributed Framework, Singapore
• Distributed Aircraft Maintenance Environment (DAME), UK
• eMinerals Project, UK
• eScience, Belfast Center
• Fraunhofer ITWM, Germany
• High Energy Accelerator Organization, KEK, Japan
• K* Grid Computing, Korea
• KEK Computing Center, Japan
• Lyon, France
• NorGrid, Norway
• Nanyang Data Grid, Singapore
• Queensland University of Technology (QUT), Australia
• Rutherford Appleton Laboratory (RAL), UK
• T-Systems, Germany
• UK eScience Project, UK
• UniGrid, Poland
• UMK, Poland
• Virtual Laboratory for eScience, Netherlands
What problem, why SRB solution?
• Why are people using SRB?
• What problems did it solve for them?
• Who are these people?
• Did they use it because they liked Arun?
Why do they use SRB?
• Distributed unstructured data management
  • Data Grids, Digital Libraries, Persistent Archives
  • Workflow/dataflow Pipelines, Knowledge Generation
• Distributed data storage provisioning
  • Common logical namespace for data and storage
• Data publication
  • Browsing and discovery of data in collections
• Data Preservation
  • Management of technology evolution
[Chart: Total data brokered by SDSC SRB]
Bar chart with categories > 100 TB, > 10 TB, > 5 TB, > 1 TB, > 500 GB, < 200 GB and series "Response" and "Unique"; vertical axis 0 to 14.
Data labels: Outside SDSC; 324 TB; 358 TB; 682 TB.
Countries actively using SDSC SRB
• USA: 53%
• Norway: 1%
• Japan: 3%
• Taiwan: 1%
• France: 3%
• Germany: 3%
• Korea: 3%
• Poland: 3%
• Singapore: 3%
• Australia: 6%
• United Kingdom: 17%
• Other: 4%
Looking back…
• 1995: MDAS Project by DARPA
• 1998: SRB Releases
• 2000: Arun joins SRB
  • Only after that SRB becomes a hit – lucky guy (just kidding)
• 2000 ++: Multiple client interfaces, many more functionalities, multiple projects across the world
• 2005: NCHC and its end users in Taiwan demonstrate significant interest in SRB (through this workshop)
Physical Layer (Real World)
• Distributed digital entities
• Heterogeneous and distributed storage resources
• Autonomous Organizations
• Distributed Users, distributed authentication
• Heterogeneous authorization schemes
• Users; sub-organizations; organizations/enterprises; virtual organizations
Data Grid Transparencies/Virtualizations (bits, data, information, ...)
[Figure: layered diagram]
Layers shown:
• Storage Resource Transparency
• Storage Location Transparency
• Data Identifier Transparency
• Data Replica Transparency
• Virtual Data Transparency
• Inter-organizational Information Storage Management
Example labels in the figure:
• E:\srbVault\image.jpg
• /users/srbVault/image.jpg
• Select … from srb.mdas.td where...
• image_0.jpg … image_100.jpg
• image.sql, image.cgi, image.wsdl
• patientRecordsCollection, myActiveNeuroCollection
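The location and replica transparencies named on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SRB code: the `ReplicaCatalog` class, its methods, the logical name `/home/demo/image.jpg`, and the resource names `unix-sdsc` and `ntfs-lab` are all hypothetical. Only the two physical paths are taken from the slide. A client asks for a logical name, and the catalog returns a physical copy on whichever registered resource is currently reachable.

```python
class ReplicaCatalog:
    """Maps a logical name to physical replicas (hypothetical, not the SRB API)."""

    def __init__(self):
        self._replicas = {}   # logical name -> list of (resource, physical path)
        self._online = set()  # resources currently reachable

    def set_online(self, resource, is_online=True):
        """Mark a storage resource as reachable or unreachable."""
        if is_online:
            self._online.add(resource)
        else:
            self._online.discard(resource)

    def register(self, logical_name, resource, physical_path):
        """Record one physical copy of a logical file."""
        self._replicas.setdefault(logical_name, []).append((resource, physical_path))

    def resolve(self, logical_name):
        """Return the first replica on an online resource, in registration order."""
        for resource, path in self._replicas.get(logical_name, []):
            if resource in self._online:
                return resource, path
        raise LookupError(f"no reachable replica for {logical_name}")


catalog = ReplicaCatalog()
catalog.set_online("unix-sdsc")
catalog.set_online("ntfs-lab")
catalog.register("/home/demo/image.jpg", "unix-sdsc", "/users/srbVault/image.jpg")
catalog.register("/home/demo/image.jpg", "ntfs-lab", r"E:\srbVault\image.jpg")

print(catalog.resolve("/home/demo/image.jpg"))  # served from the Unix resource
catalog.set_online("unix-sdsc", False)          # simulate an outage
print(catalog.resolve("/home/demo/image.jpg"))  # falls back to the other copy
```

The client code never changes when a copy moves or a resource goes offline; only the catalog entries do. That separation is the point of the sketch, whatever the real implementation looks like.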
We are SRB
Arun is here! - Shameless self promotion
Not in picture: Many students