
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES

GEON 2007 Workshop at the University of Auckland, New Zealand, November 26-28, 2007

www.geongrid.org

GEON Architecture: Systems Components Overview

Sandeep Chandra, SDSC

The Geosciences Network (GEON) Cyberinfrastructure Workshop

University of Auckland, New Zealand.

26-28 November 2007


IT Goals

• Develop cyberinfrastructure to support the "day-to-day" conduct of science (e-science), not just "hero" computations
  – Based on a Web/Grid services-based distributed environment
• Work closely with geoscientists to help create data-sharing frameworks, best practices, and useful, usable capabilities and tools for information integration and knowledge discovery
• The "two-tier" approach
  – Use best practices, including commercial tools, while developing advanced technology in open source and doing CS research
• Leverage other, similar cyberinfrastructure projects


Balancing "Empowering" and "Controlling"

• System Deployment
  – Standard reference systems for the GEON PoP (point-of-presence) and GEON Portal middleware infrastructure
  – Additional resources can be attached to the PoP
• Software Deployment
  – Centralized software stack definition
  – Locally controlled extensions
• Application Development and Integration
  – Centralized web-based portal for access to core resources
  – Local portals provide customization for users' home environments and access to local expertise
• Security
  – Centralized user account policies
  – Locally defined "non-grid" user policies


Software Layers


Hardware Deployment

• Vendors
  – Dell (40 production systems + development systems)
    • PowerEdge 2950-based systems
    • Dual-core 2.8 GHz Intel Xeon
    • 750 GB SAS, 2-4 GB RAM
  – ProMicro (3 systems)
    • Dual Pentium
    • 4 TB + RAID
  – HP cluster donation (9 systems)
    • rx2600-based, dual 1.4 GHz
• PoPs (PI institutes, project partners, international partners)
  – 23 servers in 23 domains
• Compute and Data Clusters
  – 4 small clusters (3-4 nodes each)
  – 3 medium clusters (8-9 nodes)
  – 1 large cluster (30,000 SUs on TeraGrid)
• Data Storage
  – 3 data nodes (4 TB)
  – 12 TB online SAN
  – 10 TB tape archive
• Misc. Equipment
  – Switches, racks, etc.


Partner Sites


Deployment Architecture

• Hardware Deployment
  – Each site runs a PoP
  – Optional cluster and data nodes
• Users access resources through the PoP
  – PoP provides the point of entry
  – PoP provides access to global services in GEON
• Developers add services & data hosted on GEON resources
  – Portal services, application services, Web/Grid services


GEON Hardware Facility


Systems Software

• Unified software stack definition
  – Custom GEON Roll
    • GEON Portal
    • Web/Grid services software stack
    • Common GEON applications and services
• Focus on scalable systems management
  – Modified Rocks for wide-area cluster management
  – Mechanism to provide local extensions to the base software stack definition
• Collaborations with partner sites
  – Identified appropriate contacts
  – Helping partner sites in systems development


GEON Software Stack

• Base OS
  – Rocks: highly programmatic software configuration management
• Development
  – Globus 4.0.2 (GSI, GridFTP, etc.)
  – Web services (jakarta-tomcat-5.0.28, axis-1.2, ant-1.6, jdk1.4.2, etc.)
  – GridSphere 2.0.2 portal framework
• Database
  – IBM DB2
  – Postgres 8.0.3
  – PostGIS 1.2 (Geos, Proj)
• Security
  – Tripwire, chkrootkit
• System Monitoring
  – INCA testing and monitoring framework (TeraGrid), with GRASP benchmarks
  – Network Weather Service (NWS)
  – Ganglia
• Job Submission and Monitoring
  – Condor, PBS

Stack diagram, bottom to top: Rocks 4.2.1 based on Red Hat Enterprise Linux; JDK, Ant, Tripwire, Samba; Globus pre-WS, Axis, OGSA-DAI, Tomcat; INCA/GRASP, NWS, Condor, PBS; Postgres, PostGIS, Geos, Proj; GRASS (GDAL, NetCDF, TIFF), GMT; with the GridSphere Portal and GEONGrid software stack on top.


Wide-Area Cluster Management

Federico Sacerdoti, Sandeep Chandra, and Karan Bhatia, "Grid Systems Deployment and Management using Rocks", IEEE Cluster 2004, Sept. 20-23, 2004, San Diego, California.


GEON Rocks Central

• Local extensions to the software stack
• Partner sites package and maintain locally hosted rolls
• Provides easy installation and automatic configuration of software on nodes
• A highly customized node

Diagram: central roll servers at partner sites: the SDSC central server (GEONGrid Roll) at central.sdsc.geongrid.org, the ASU central server (GRASS Roll) at central.asu.geongrid.org, and the UTEP central server (GMT Roll) at central.utep.geongrid.org; in general, an <X> central server hosting the <Y> roll at central.<X>.geongrid.org. A GEON frontend (GEONGrid + GRASS + GMT) installs from these servers and manages its compute nodes.


Additional Infrastructure

• Production/Beta/Development servers
  – 8 production servers used for various activities
  – 3 beta servers
  – 1 common development server
  – 1 central server for hosting the Rocks and GEON stack
  – Blogs, forums, calendar, RSS
  – Bug-tracking software (JIRA)
• CVS/SVN services
  – svn.geongrid.org
• GEON Certificate Authority
  – gama.geongrid.org


Grid/Web Core Middleware Services

• Goals
  – Evaluate core software infrastructure
  – Collaborate and engineer solutions as needed
  – Integrate or build as necessary

1. Portal middleware infrastructure (GEON Portal)
2. Security infrastructure (GAMA)
3. Naming and discovery infrastructure (Handle.net)
4. Data management and replication (SRB, RLS)
5. Generic mediation


Core Services

• Authentication
  – GSI, CAS, SAML, MyProxy, CACL-CA, NAREGI-CA, GAMA
• Monitoring
  – NWS, INCA
• Scheduling
  – Condor, CSF (Community Scheduler Framework)
• Cataloging
  – RLS (Replica Location Service), Handle.net
• Data Transfer and Management
  – GridFTP, SRB
• Replication
  – RLS, SRB
• Databases
  – Postgres, PostGIS, DB2


Portal Infrastructure

• GridSphere Portal Framework
  – Developed by GridLab (Jason Novotny and others), Albert Einstein Institute, Berlin, Germany
  – Java/JSP portlet container
    • JSR 168 support, WSRP, and JSF (a minimal portlet sketch follows below)
  – Supports
    • Collaboration (standard portlet API)
    • Personalization (e.g., my.yahoo.com)
    • Grid services (GSI support)
    • Web services
• Other Frameworks
  – Open Grid Computing Environments (OGCE)
    • Based on Apache Jetspeed and Sakai
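To make the portlet model concrete, here is a minimal JSR 168 portlet of the kind a partner site might drop into GridSphere. This is an illustrative sketch, not GEON code; the class name and markup are invented, but the javax.portlet calls are the standard API.

```java
import java.io.IOException;
import java.io.PrintWriter;

import javax.portlet.GenericPortlet;
import javax.portlet.PortletException;
import javax.portlet.RenderRequest;
import javax.portlet.RenderResponse;

// Minimal JSR 168 portlet: any compliant container (GridSphere,
// Jetspeed, etc.) can host it once it is declared in portlet.xml.
public class HelloGeonPortlet extends GenericPortlet {

    protected void doView(RenderRequest request, RenderResponse response)
            throws PortletException, IOException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();

        // getRemoteUser() is null until the visitor has logged in.
        String user = request.getRemoteUser();
        out.println("<p>Hello, " + (user != null ? user : "guest")
                + ". Welcome to the portal.</p>");
    }
}
```

Deployment is declarative: the container picks the class up from a portlet.xml descriptor, which is what lets the same portlet run under GridSphere or any other JSR 168 container.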


GEON Portal

• The GEON Portal provides:
  – Authenticated access to data and Web services
  – Registration of data sets, tools, and services with metadata
  – Search for data, tools, and services, using ontologies
  – A scientific workflow environment and access to HPC
  – Data and map integration capability
  – Scientific data visualization and GIS mapping


Distributed Portals

• Distributed portal architecture
  – Allows partner sites to "brand" their portal
  – Facilitates development by partners
  – Allows custom apps for each site
• Unified user login
  – GSI-based, managed by the GEON system

• Networking for local organizations


End-user access through Distributed Portals

• Local customization
  – Partner sites can customize the local portal to the specific needs of the users at that site
• Support for integration of local resources
  – Each site may have significant local resources that can be integrated for local and external users

• Supports code development



Challenges

• Managing distributed catalogs

• Integration of tools

• Complete automation of portal middleware deployment process


Data Portal Middleware

• Portal Server
  – Dual-core Xeon
  – 750 GB SAS
  – 4-8 GB RAM
  – Rocks, GEON
• Data Server
  – Dual-core Xeon
  – 1.5 TB RAID 5
  – 4-8 GB RAM
  – Rocks, SRB
• CA Server
  – Dual-core Xeon
  – 30 GB SCSI
  – 2 GB RAM
  – Rocks, GAMA


Data Portals Deployed


Security Infrastructure

• Problem
  – Portal users need access to various Grid-enabled resources for job submission, data management, instrument control, etc.
  – The standard security mechanism is GSI (Grid Security Infrastructure), which typically involves the following steps (a proxy-retrieval sketch follows below):
    • Creation of credentials for a new user
    • Storage of a proxy in MyProxy by the user
    • Retrieval of the proxy upon user login to the portal
    • Configuration of resources to accept the credentials
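For reference, the proxy-retrieval step looks roughly like the sketch below using the Java CoG / jglobus MyProxy client. The host, account, and passphrase are placeholders, and exact signatures vary across Globus releases, so treat this as an assumption-laden illustration rather than GEON's actual code.

```java
import org.globus.myproxy.MyProxy;
import org.ietf.jgss.GSSCredential;

// Sketch of "retrieval of the proxy upon login" with the jglobus
// MyProxy client. Host and account are placeholders; a portal would
// run this server-side after validating the user's login form.
public class ProxyRetrievalExample {
    public static void main(String[] args) throws Exception {
        // 7512 is the conventional MyProxy port.
        MyProxy myProxy = new MyProxy("myproxy.example.org", 7512);

        // Fetch a short-lived delegated proxy (2 hours) for the user.
        GSSCredential proxy =
                myProxy.get("jdoe", "portal-passphrase", 2 * 60 * 60);

        // The proxy can now be handed to GridFTP/GRAM clients
        // to act on the user's behalf.
        System.out.println("Remaining lifetime (s): "
                + proxy.getRemainingLifetime());
    }
}
```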


Security Infrastructure

• GSI-based
  – Collaboration with Telescience & BIRN
  – GEON certificate authority: gama.geongrid.org
• SDSC CACL system
  – Role-based access control by extending GridSphere capabilities
    • geonAdmin, geonPI, geonUser, public
  – Portal integration
    • Account requests, certificate management


GAMA: Grid Account Management Architecture

• A Solution (a portal-side client sketch follows the list)
  – Install the command-line security infrastructure on a dedicated, locked-down machine (the GAMA server)
  – Wrap the apps in Web services on the GAMA server
  – Construct GridSphere portlets and services for submitting and managing account requests from users on a portal server
  – Configure GridSphere to automatically retrieve a proxy from the GAMA server when a user logs on to the portal


GAMA Services

Kurt Mueller, Sandeep Chandra, and Karan Bhatia, “GAMA: Grid Account Management Architecture”, IEEE E-Science 2005, Melbourne, Australia, Dec 2005.


GAMA Portal Components

Diagram: GridSphere-hosted ActionPortlets backed by PortletServices:
• AccountRequestService: all object persistence methods and many other account-management methods
• GAMAClientService: encapsulates all communication with the GAMA server
• GAMAAuthModule: provides GridSphere login and automatic credential retrieval
• Utility classes: form input validation, SendMail
AccountRequest and RequestApprovalRule objects are persisted via Hibernate to the GridSphere database.


GAMA Supports

• GSI-based, using best practices
• Global account acceptance policies
• Supports importing of grid accounts (privileged user)
• Supports non-grid local accounts (non-privileged user)
• Supports portals, clusters, and rich clients
• Packaged as Rocks rolls


Authorization

• Extending the GridSphere user DB

• Users can be authorized for access to tools/services at various levels (toy sketch below)
  – Registration, LiDAR, SYNSEIS
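As a flavor of what "various levels" means in practice, here is a toy role check using the role names from the earlier security slide. The hierarchy and the per-tool policy are invented for illustration; GridSphere's real role API differs.

```java
import java.util.Collections;
import java.util.Set;

// Toy illustration of role-gated access layered on the portal user DB.
// Role names come from the security slide (geonAdmin, geonPI, geonUser);
// the hierarchy and policy here are invented, not GridSphere's API.
public class ServiceAuthorizer {

    /** True if the user's roles satisfy the required minimum role. */
    public static boolean canAccess(Set<String> userRoles, String required) {
        // Assume geonAdmin implies geonPI, which implies geonUser.
        if (userRoles.contains("geonAdmin")) {
            return true;
        }
        if (userRoles.contains("geonPI") && !"geonAdmin".equals(required)) {
            return true;
        }
        return userRoles.contains(required);
    }

    public static void main(String[] args) {
        Set<String> roles = Collections.singleton("geonUser");
        // Assumed policy: SYNSEIS open to any authenticated user,
        // LiDAR bulk processing restricted to PIs.
        System.out.println("SYNSEIS: " + canAccess(roles, "geonUser")); // true
        System.out.println("LiDAR:   " + canAccess(roles, "geonPI"));   // false
    }
}
```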



Naming and Discovery

• Naming
  – All service instances, datasets, and applications
  – Two-level naming scheme to support replication and versioning (illustrated in the sketch below)
  – Globally unique and resolvable
• Resolution
  – Handle System (under evaluation)
  – Collaborating with the EarthChem project
• Discovery
  – Discover resources in heterogeneous metadata repositories
    • MCAT, MCS, Geography Network (ESRI), OPeNDAP
  – UDDI
  – Replica Location Service (Globus)
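A two-level scheme of this kind resolves a stable logical name (level one) to one of several versioned, replicated physical locations (level two). The sketch below is a toy illustration; the handle syntax, URLs, and replica-selection policy are all invented.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy two-level resolver: a stable logical handle maps to one or more
// versioned physical replicas. All names and URLs are invented; a real
// deployment would back this with the Handle System or RLS.
public class TwoLevelResolver {

    // Level 1 -> level 2: logical name to replica locations.
    private final Map<String, List<String>> catalog =
            new HashMap<String, List<String>>();

    public void register(String logicalName, List<String> replicas) {
        catalog.put(logicalName, replicas);
    }

    // Naive selection: first replica wins. A production resolver would
    // weigh liveness, proximity, and load before choosing.
    public String resolve(String logicalName) {
        List<String> replicas = catalog.get(logicalName);
        if (replicas == null || replicas.isEmpty()) {
            throw new IllegalArgumentException("Unknown name: " + logicalName);
        }
        return replicas.get(0);
    }

    public static void main(String[] args) {
        TwoLevelResolver resolver = new TwoLevelResolver();
        resolver.register("geon/gravity-map/v2", Arrays.asList(
                "srb://srb.sdsc.example.org/geon/gravity-map-v2.nc",
                "gsiftp://pop.utep.example.org/data/gravity-map-v2.nc"));
        System.out.println(resolver.resolve("geon/gravity-map/v2"));
    }
}
```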


Data Management & Replication

• Data Movement and Storage
  – GridFTP (client sketch below)
  – SRB server
• Caching and Replication
  – Replica Location Service (RLS)
• Data Services Metrics
  – GRASP
  – Inca
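For the GridFTP leg, a download with the jglobus client API would look roughly like the sketch below. The host, port, path, and credential bootstrap are placeholders, and exact method signatures vary across jglobus releases, so this is an assumption-heavy sketch rather than tested GEON code.

```java
import java.io.File;

import org.globus.ftp.GridFTPClient;
import org.gridforum.jgss.ExtendedGSSManager;
import org.ietf.jgss.GSSCredential;

// Sketch of a GridFTP download using the jglobus client API.
// Host, port, and path are placeholders; the credential comes from
// the user's default GSI proxy (e.g., one retrieved via MyProxy).
public class GridFtpFetchExample {
    public static void main(String[] args) throws Exception {
        GSSCredential cred = ExtendedGSSManager.getInstance()
                .createCredential(GSSCredential.INITIATE_AND_ACCEPT);

        // 2811 is the conventional GridFTP control-channel port.
        GridFTPClient client = new GridFTPClient("pop.sdsc.example.org", 2811);
        client.authenticate(cred);
        client.get("/geon/data/gravity-map.nc", new File("gravity-map.nc"));
        client.close();
    }
}
```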


Mediation Services

• GIS Map Integration
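The slide does not say which protocol the mediator speaks, but a common building block for GIS map integration is the OGC WMS GetMap request. The following generic sketch (server URL and layer name invented) fetches one map image of the kind a mediator could overlay with layers from other services.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

// Generic illustration of GIS map integration via an OGC WMS GetMap
// request. The server URL and layer name are placeholders; the slide
// does not specify which protocol GEON's mediator actually uses.
public class WmsGetMapExample {
    public static void main(String[] args) throws Exception {
        String url = "https://wms.example.org/wms"
                + "?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
                + "&LAYERS=geology&STYLES="
                + "&SRS=EPSG:4326"
                + "&BBOX=-125,32,-114,42"   // lon/lat bounding box
                + "&WIDTH=512&HEIGHT=512&FORMAT=image/png";

        HttpURLConnection conn =
                (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream()) {
            // Save the rendered map; a mediator would composite several
            // such layers into one view.
            Files.copy(in, Paths.get("map.png"));
        }
    }
}
```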


System Monitoring and Benchmarking

• Inca for user-level monitoring of Grid functionality and performance

• Measure bandwidth, latency, and other system metrics (a toy latency probe follows below)

• Use the Globus, GRASP, INCA, and NWS frameworks

• Archive results and display data continuously
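To give a flavor of the metrics involved, the sketch below times a TCP connect to each PoP, a crude stand-in for the latency probes that NWS, GRASP, and Inca run, forecast, and archive properly. The hostnames are placeholders.

```java
import java.net.InetSocketAddress;
import java.net.Socket;

// Minimal latency probe: time a TCP connect to each PoP.
// Hostnames are placeholders; NWS/GRASP/Inca provide the real
// measurement, forecasting, and archiving machinery.
public class LatencyProbe {
    public static void main(String[] args) {
        String[] pops = { "pop.sdsc.example.org", "pop.utep.example.org" };
        for (String host : pops) {
            long start = System.nanoTime();
            try (Socket socket = new Socket()) {
                // 5-second timeout; port 22 assumed reachable.
                socket.connect(new InetSocketAddress(host, 22), 5000);
                long micros = (System.nanoTime() - start) / 1000;
                System.out.println(host + ": " + micros + " us");
            } catch (Exception e) {
                System.out.println(host + ": unreachable (" + e.getMessage() + ")");
            }
        }
    }
}
```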


System Monitoring and Benchmarking


Summary

• Physical Layer
  – Deploy hardware
• Systems Layer
  – Developing management software and collaborations with partner sites
  – Developing and deploying GEON middleware
  – Collaborating with partner sites to develop local software stack extensions
• Grid Layer
  – Services for:
    • Portal & security (authentication & authorization)
    • Naming & discovery, data management & replication, and mediation
• Applications Layer
  – Apps ready, used as templates for how to build apps in GEON


iGEON Sites

• University of Hyderabad, India (iGEON-India network, with 2 more sites)
• Russian Academy of Sciences, Moscow
• Chinese Academy of Sciences, China
• AIST GeoGrid, Japan
• AuScope, Australia

and now:

• University of Auckland, New Zealand


Looking Ahead: GEON 2.0

• Goals:
  – Make the existing GEON systems and middleware infrastructure more robust
  – Build useful new tools on top of the existing framework
  – Encourage software development and resource integration with partner sites
  – More data, more apps


Resources

• http://geongrid.org

• http://portal.geongrid.org

• http://grid-devel.sdsc.edu/gama

• www.rocksclusters.org

• www.globus.org

• www.gridsphere.org


Acknowledgements

• GEON Team

• Grid-Devel Group

• Rocks Group

• University of Auckland

• BeSTGRID


Questions or Feedback?

• Mail: [email protected]