www.geongrid.org CYBERINFRASTRUCTURE FOR THE GEOSCIENCES GEON Systems Report Karan Bhatia San Diego Supercomputer Center Friday Aug 13 2004


www.geongrid.org
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES

GEON Systems Report

Karan Bhatia
San Diego Supercomputer Center

Friday Aug 13 2004


Year 2 Goals & Accomplishments

• Goals:
  – Procure and deploy physical resources for partners
  – Provide infrastructure for management of systems
    • including mechanisms for collaboration and communication
  – Provide basic production services for data
  – Provide basic grid services for applications
• Physical Layer
  – Purchased and deployed hardware
• Systems Layer
  – Developed management software and collaborations with partner sites
  – Developed GEON Software Stack
• Grid Layer
  – Beginning to build out services
    • Portal & Security done (end of Aug)
    • Naming & Discovery, Data Management & Replication, and Mediation
      – Basic research still being done
• Applications Layer
  – Some apps ready, used as templates for how to build apps in GEON


GEONgrid Development

• Physical Deployment: hardware, clusters, networks
• Systems Layer: OS & software layer
• Grid Layer: grid system services
• Applications: end-user apps & services


Physical Deployment

• Vendors:
  – Dell (27 production systems + 9 development systems)
    • PowerEdge 2650-based systems
    • Dual 2.8 GHz Pentium processors
    • 2 GB RAM
  – ProMicro (3 systems)
    • Dual Pentium
    • 4 TB + RAID
  – HP cluster donation (9 systems)
    • rx2600-based, dual 1.4 GHz
• 15 partner sites
  – 1 PoP node
  – Optional small cluster (4 systems)
  – Optional data node
• Misc. equipment as needed
  – Switches, racks, etc.


Deployment Architecture

• Similar to BIRN architecture
  – Each site runs a PoP
  – Optional cluster and data nodes
• Users access resources through PoP
  – PoP provides point of entry
  – PoP provides access to global services
• Developers add services & data hosted on GEON resources
  – Web services/Grid services
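
As a rough illustration of the layout above, the sketch below models a partner site as one mandatory PoP with an optional cluster and an optional data node. The class, field, and host names are hypothetical and chosen only for this example; they are not taken from the GEON software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """One partner site: a mandatory PoP plus optional extras (hypothetical model)."""
    name: str
    pop_host: str                          # every site runs a PoP
    cluster_nodes: int = 0                 # optional small cluster (0 = none)
    data_node_host: Optional[str] = None   # optional data node

    def entry_point(self) -> str:
        # Users reach a site's resources through its PoP.
        return self.pop_host

    def resources(self) -> List[str]:
        # List which kinds of resources this site hosts.
        found = ["pop"]
        if self.cluster_nodes > 0:
            found.append("cluster")
        if self.data_node_host is not None:
            found.append("data")
        return found


# Two made-up sites: one with a 4-node cluster, one with a data node.
sites = [
    Site("site-a", "pop.site-a.example.org", cluster_nodes=4),
    Site("site-b", "pop.site-b.example.org",
         data_node_host="data.site-b.example.org"),
]
```

Keeping the PoP as the only required field mirrors the slide: the cluster and data node are add-ons, and the PoP is always the point of entry.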


GEONgrid Current Status

Physical Resources:
- All PoPs deployed, 3 data nodes deployed, all clusters up
- HP cluster delivered

Software Stack:
- Mix of GeonRocks 0.1 (Red Hat 9-based) and Red Hat 9


Systems Layer

• Unified software stack definition
  – Custom GEON Roll
    • Web/Grid services software stack
    • Common GEON applications and services
• Focus on scalable systems management
  – Modified Rocks for wide-area cluster management (see [Sacerdoti04])
• Collaborations with partner sites
  – Identified appropriate contacts


GEON Software Roll

• Development
  – OGSI 1.0 (from GT3.0.2) --> GT3.2 (packaged by NMI)
  – Web services (Jakarta, Axis, Ant, etc.)
  – GridSphere 2.02 Portal Framework
• Database
  – IBM DB2 (packaged for Protein Data Bank)
  – Postgres --> PostGIS
  – SRB client software
  – OPeNDAP roll (UNAVCO)
• Security
  – DB2 with GSI plugin (developed by TeraGrid)
  – Tripwire
• System Monitoring
  – Grid Monitor
  – INCA testing and monitoring framework (TeraGrid)
    • With GRASP benchmarks
  – Network Weather Service (NWS)

GEON Software Stack Version 1.0 to be deployed starting Sept 1, 2004!


Wide-Area Cluster Management

• Federico Sacerdoti, Sandeep Chandra, and Karan Bhatia, “Grid Systems Deployment and Management using Rocks”, Cluster 2004, Sept. 20-23, 2004, San Diego, California


Additional Infrastructure

• Production/Development servers
  – 8 development servers used for various activities
  – Main production portal
  – Blogs, forums, RSS
  – Production application services
• CVS services
  – cvs.geongrid.org
• GEON Certificate Authority
  – ca.geongrid.org


Grid Layer

• Goals
  – Evaluate core software infrastructure
    • CAS, Handle.net, RLS (Replica Location Service), VOMS (Virtual Organization Mgmt), Firefish, MCS (Metadata Catalog Service), SRB, CSF (Community Scheduling Framework)
  – Integrate or build as necessary
    1. Portal Infrastructure
    2. Security Infrastructure
    3. Naming and Discovery Infrastructure
    4. Data Management and Replication
    5. Generic Mediation


1. Portal Infrastructure

• GridSphere Portal Framework
  – Developed by GridLab (Jason Novotny and others), Albert Einstein Institute, Berlin, Germany
  – Java/JSP portlet container
    • JSR 168 support; WSRP and JSF coming
  – Supports
    • Collaboration (standard portlet API)
    • Personalization (e.g. my.yahoo.com)
    • Grid services (GSI support)
    • Web services
• Other frameworks
  – Open Grid Computing Environments (OGCE)
    • Apache Jetspeed based --> Sakai


2. Security Infrastructure

• GSI based
  – Collaboration with Telescience & BIRN
  – GEON certificate authority: ca.geongrid.org
    • SDSC CACL system
  – Role-based access control using Globus Community Authorization System (CAS)
    • geonAdmin, geonPI, geonUser, public
  – Portal integration
    • Account requests, certificate management
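
To make the role-based access control idea concrete, here is a minimal sketch of a permission check over the four roles named above. It does not use or imitate the Globus CAS API. The privilege ordering of the roles and the action names are assumptions made for this example; the slides do not say that the roles are strictly hierarchical.

```python
# Role names come from the slide; the privilege ordering and the
# action table below are assumptions made for this example only.
ROLE_LEVEL = {"public": 0, "geonUser": 1, "geonPI": 2, "geonAdmin": 3}

# Hypothetical actions and the minimum role each one requires.
REQUIRED_ROLE = {
    "browse_catalog": "public",
    "run_application": "geonUser",
    "publish_dataset": "geonPI",
    "approve_account": "geonAdmin",
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if `role` meets the minimum role required for `action`."""
    if role not in ROLE_LEVEL or action not in REQUIRED_ROLE:
        return False  # unknown roles and actions are denied by default
    return ROLE_LEVEL[role] >= ROLE_LEVEL[REQUIRED_ROLE[action]]
```

Under these assumptions, `is_allowed("geonPI", "publish_dataset")` is true while `is_allowed("geonUser", "publish_dataset")` is false. Denying unknown roles and actions by default keeps a typo in a role name from granting access.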


3. Naming and Discovery

• Naming
  – All service instances, datasets and applications
  – Two-level naming scheme to support replication and versioning
  – GeoID similar to LSID (Life Sciences ID)
  – Globally unique and resolvable
• Resolution
  – GeoID --> usable reference (e.g. WSDL)
  – Handle system (CNRI)
• Discovery
  – Discover resources in heterogeneous metadata repositories
    • MCAT, MCS, Geography Network (ESRI), OPeNDAP
  – Firefish (LBL)
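
One way to picture the two-level scheme is a logical GeoID that maps to several concrete instances, each with a version and a hosting site, and a resolver that picks one usable reference. The sketch below is only an illustration: the GeoID syntax, site names, and URLs are invented, and an in-memory dictionary stands in for the Handle System, whose real interface is not shown here.

```python
from typing import Dict, List, NamedTuple, Optional


class Instance(NamedTuple):
    version: int
    site: str
    reference: str  # usable reference, e.g. a WSDL URL


# Level 1: logical GeoID -> level 2: concrete instances (versions x replicas).
# All identifiers and URLs below are invented for illustration.
REGISTRY: Dict[str, List[Instance]] = {
    "geoid:example.org:gravity-service": [
        Instance(1, "site-a", "http://site-a.example.org/gravity/v1?wsdl"),
        Instance(2, "site-a", "http://site-a.example.org/gravity/v2?wsdl"),
        Instance(2, "site-b", "http://site-b.example.org/gravity/v2?wsdl"),
    ],
}


def resolve(geoid: str, version: Optional[int] = None,
            prefer_site: Optional[str] = None) -> str:
    """Resolve a GeoID to one usable reference.

    Picks the requested version (default: the latest), then a replica at
    `prefer_site` if one exists, otherwise the first replica listed.
    """
    instances = REGISTRY.get(geoid)
    if not instances:
        raise KeyError(f"unknown GeoID: {geoid}")
    if version is None:
        version = max(i.version for i in instances)
    replicas = [i for i in instances if i.version == version]
    if not replicas:
        raise KeyError(f"{geoid} has no version {version}")
    for inst in replicas:
        if inst.site == prefer_site:
            return inst.reference
    return replicas[0].reference
```

Because callers hold only the GeoID, a new version or an extra replica can be registered without changing any client that resolves the name.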


4. Data Management & Replication

• Installed services
  – GridFTP
  – SRB server
• GMR testing
  – Grid Movement and Replication
  – With IBM Research
• OGSA-DAI performance
  – With GRASP (Baru, Casanova, Snavely)

[Figure: Data Access Performance. Chart comparing OGSA-DAI and JDBC access times in seconds (axis 0 to 50) over LAN and WAN.]


5. Mediation Services

• GIS map integration
  – See next talk (Ludaescher)


Year 2 Summary

• Physical Layer
  – Purchased and deployed hardware
• Systems Layer
  – Developed management software and collaborations with partner sites
  – Developed GEON Software Stack
• Grid Layer
  – Beginning to build out services
    • Portal & Security done (end of Aug)
    • Naming & Discovery, Data Management & Replication, and Mediation
      – Basic research still being done
• Applications Layer
  – Some apps ready, used as templates for how to build apps in GEON


Looking Ahead, Year 3

• Goals:
  – Provide core software infrastructure
  – Integration with outside resources
  – Encourage software development and integration with partners
  – More data, more apps, more tools


Questions?


Additional Material


Grid Movement and Replication (with IBM)

• Data is stored in the Postgres database on the GEON node at UTEP.
• The GMR capture service running at UTEP reads the data and replicates it to the Postgres database running at SDSC.
• The GMR apply and monitor services run at SDSC and store the data sent by the capture service.
• The OGSA-DAI data access service provides access to the database on both the UTEP and SDSC nodes.
• The user application grid service accepts two parameters:
  – The name of the node to access
  – An SQL query that selects the data of interest to send to the grav application
• An XML query document is generated from the SQL query.
• An appropriate service handle is selected based on the node.
• The application grid service invokes the OGSA-DAI grid service handle to access data from the database.
• The application grid service receives the data and parses it to extract the relevant values, which are submitted to the grav application.
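
The request path in the last five bullets can be sketched as follows. This is a simplified illustration, not the GEON or OGSA-DAI implementation: the service handle URLs, the XML element names (`queryRequest`, `sql`, `result`, `row`), and the `fake_invoke` stand-in are all hypothetical, and the real OGSA-DAI request and response documents use their own schema.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical service handles, keyed by node name.
SERVICE_HANDLES: Dict[str, str] = {
    "UTEP": "http://utep.example.org/ogsa/services/GravityData",
    "SDSC": "http://sdsc.example.org/ogsa/services/GravityData",
}


def build_query_document(sql: str) -> str:
    """Wrap an SQL string in a small XML query document (made-up schema)."""
    root = ET.Element("queryRequest")
    ET.SubElement(root, "sql").text = sql
    return ET.tostring(root, encoding="unicode")


def parse_values(result_xml: str, column: str) -> List[float]:
    """Extract one numeric column from a result document (made-up schema)."""
    root = ET.fromstring(result_xml)
    return [float(row.findtext(column)) for row in root.iter("row")]


def fetch_for_grav(node: str, sql: str, column: str,
                   invoke: Callable[[str, str], str]) -> List[float]:
    """Select the handle for `node`, send the query, return parsed values."""
    handle = SERVICE_HANDLES[node]           # choose the handle based on the node
    query_doc = build_query_document(sql)    # XML document built from the SQL
    result_xml = invoke(handle, query_doc)   # call the data access service
    return parse_values(result_xml, column)  # values destined for the grav application


def fake_invoke(handle: str, query_doc: str) -> str:
    """Stand-in for the real service call; returns canned rows."""
    return ("<result>"
            "<row><gravity>979.1</gravity></row>"
            "<row><gravity>979.4</gravity></row>"
            "</result>")
```

Passing the service call in as `invoke` keeps the sketch runnable without a grid deployment; for example, `fetch_for_grav("SDSC", "SELECT gravity FROM obs", "gravity", fake_invoke)` returns the two canned values. Because GMR replicates the UTEP database to SDSC, the same query could be sent to either node by changing only the `node` argument.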