SAN DIEGO SUPERCOMPUTER CENTER
Science Gateways on the TeraGrid
Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways
SDSC Director of Consulting, Documentation, Training
San Diego Supercomputer Center
Hope you have had a productive week in San Diego
TeraGrid Science Gateways
• What is TeraGrid?
• What are Science Gateways?
• Why TeraGrid and Gateways?
• How Does This Help Me?
TeraGrid: Integrating NSF Cyberinfrastructure
TeraGrid is a facility that integrates computational, information, and analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research.
[Map of TeraGrid sites. Labels: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, USC-ISI, Utah, Iowa, Cornell, Buffalo, UNC-RENCI, Wisc]
TeraGrid Vision
• TeraGrid will create integrated, persistent, and pioneering computational resources that will significantly improve our nation’s ability and capacity to gain new insights into our most challenging research questions and societal problems.
– Our vision requires an integrated approach to the scientific workflow including obtaining access, application development and execution, data analysis, collaboration and data management.
• 20 compute platforms
– 10 at or above 10 Tflops
• 1 Pbyte of online disk
• Data collection hosting
• Remote visualization
• Single application process
TeraGrid PIs by Institution as of May 2006
[Map of TeraGrid PIs. Legend: Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI]
Gateways are part of TeraGrid’s 3-pronged strategy to further science
• DEEP Science: Enabling Terascale Science
– Make science more productive through an integrated set of very-high capability resources
• ASTA projects
• WIDE Impact: Empowering Communities
– Bring TeraGrid capabilities to the broad science community
• Science Gateways
• OPEN Infrastructure, OPEN Partnership
– Provide a coordinated, general purpose, reliable set of services and resources
• Grid interoperability working group
Science Gateways
A new initiative for the TeraGrid
• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
• Resources
• Users – from expert to K-12
• Software stacks, policies
• Science Gateways
– Provide “TeraGrid Inside” capabilities
– Leverage community investment
• Three common forms:
– Web-based Portals
– Application programs running on users' machines but accessing services in TeraGrid
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.
[Figure: Workflow Composer]
Initial Focus on 10 Gateways
What Did We Learn About Common Gateway Requirements?
• Accounting
– Support for accounts with differing capabilities
– Ability to associate a compute job with an individual portal user
– Scheme for portal registration and usage tracking
– Support for OSG’s Grid User Management System (GUMS)
– Dynamic accounts
• Security
– Community account privileges
– Need to identify the human responsible for a job for incident response
– Acceptance of other grid certificates
– TG-hosted web servers, cgi-bin code
• Web Services
– Initial analysis completed 12/05
– Some Gateways (LEAD, Open Life Sciences) have immediate needs
– Many will build on capabilities offered by GT4, but interoperability could be an issue
– Web Service security
– Interfaces to scheduling and account management are common requirements
• Software
– Interoperability of software stacks between TG and peer grids
– Software installations for gateways across all TG sites
– Community software areas
– Management (pacman, other options)
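The accounting and security items above share one underlying need: a job that runs under a shared community account must still be traceable to the individual portal user who requested it. A minimal Python sketch of such an audit record is given here for illustration. Every name in it (`JobRecord`, `JobAuditLog`, `record_submission`, `responsible_user`, the job IDs and account name) is hypothetical and is not a TeraGrid, GRAM, or GUMS interface.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class JobRecord:
    job_id: str             # identifier handed back by the compute resource
    portal_user: str        # individual gateway user who requested the job
    community_account: str  # shared account the job actually ran under
    submitted_at: datetime


class JobAuditLog:
    """Hypothetical in-memory audit log kept on the gateway side."""

    def __init__(self) -> None:
        self._by_job: Dict[str, JobRecord] = {}

    def record_submission(self, job_id: str, portal_user: str,
                          community_account: str) -> JobRecord:
        # Refuse duplicates so one job can never map to two users.
        if job_id in self._by_job:
            raise ValueError(f"job {job_id!r} already recorded")
        record = JobRecord(job_id, portal_user, community_account,
                           datetime.now(timezone.utc))
        self._by_job[job_id] = record
        return record

    def responsible_user(self, job_id: str) -> str:
        # Incident response: who asked for this job?
        return self._by_job[job_id].portal_user

    def usage_by_user(self) -> Dict[str, int]:
        # Usage tracking: number of jobs submitted per portal user.
        counts: Dict[str, int] = {}
        for record in self._by_job.values():
            counts[record.portal_user] = counts.get(record.portal_user, 0) + 1
        return counts
```

A real gateway would persist these records and tie them to the resource provider's own accounting; the sketch only shows the mapping that the requirements call for.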
Gateways are growing in numbers
• 10 initial projects as part of TG proposal
• >20 Gateway projects today
• No limit on how many gateways can use TG resources
– Prepare services and documentation so developers can work independently
• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE)
• National Virtual Observatory (NVO)
• Linked Environments for Atmospheric Discovery (LEAD)
• Computational Chemistry Grid (GridChem)
• Computational Science and Engineering Online (CSE-Online)
• GEON (GEOsciences Network)
• Network for Earthquake Engineering Simulation (NEES)
• SCEC Earthworks Project
• Network for Computational Nanotechnology and nanoHUB
• GIScience Gateway (GISolve)
• Biology and Biomedicine Science Gateway
• Open Life Sciences Gateway
• The Telescience Project
• Grid Analysis Environment (GAE)
• Neutron Science Instrument Gateway
• TeraGrid Visualization Gateway, ANL
• BIRN
• Gridblast Bioinformatics Gateway
• Earth Systems Grid
• Astrophysical Data Repository (Cornell)
• Many others interested
– SID Grid
– HASTAC
GEON
• The goal of GEON is
– to advance the field of geoinformatics and
– to prepare and train current and future generations of geoscience researchers, educators, and practitioners in the use of cyberinfrastructure to further their research, education, and professional goals.
• Geoinformatics will foster new interdisciplinary research, for example
– the gravity modeling of 3D geological features, such as plutons
– the study of active tectonics by integrating LiDAR data and geodynamics models
– the study of lithospheric structure and properties across diverse tectonic environments.
Southern California Earthquake Center
• Philip Maechling
• SCEC IT Architect
• Involves 500+ scientists at 55 institutions worldwide
• Focuses on earthquake system science using Southern California as a natural laboratory
• Translates basic research into practical products for earthquake risk reduction
[Figure: SCEC Focus Groups]
SCEC/CME Project
Goal: To develop a cyberinfrastructure that can support system-level earthquake science – the SCEC Community Modeling Environment (CME)
Support: 5-yr project funded by the NSF/ITR program under the CISE and Geosciences Directorates
Oct 1, 2001 – Sept 30, 2006
[Diagram: SCEC/ITR Project. Labels: NSF (CISE, GEO), SCEC Institutions, IRIS, USGS, ISI, SDSC, Information Science, Earth Science]
www.scec.org/cme
Outline
The goal of the SCEC Earthworks Science Gateway is to allow users to run wave propagation simulations.
• Seismological Researchers
• Grad Students
• Public Interest in resulting data products
Many of these target users are not used to using high performance computing.
SCEC Earthworks Science Gateway Basic Capabilities:
• Configure earthquake wave propagation simulations.
• Submit simulation for execution as workflow.
• Workflow executes across distributed grid environment including
– SCEC, USC HPCC, and TeraGrid
• Monitoring of workflow status
• Data products registered with metadata into digital library
• Data discovery tools using metadata searches
• Data Retrieval for data products of interest
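The capabilities listed above form a pipeline: configure a simulation, submit it as a workflow, monitor its status, register the outputs with metadata, and later find them by metadata search. The Python sketch below models that flow in memory. It is an illustration under stated assumptions, not the SCEC Earthworks implementation: the class names, the `magnitude` parameter, the status values, and the output file name are all invented for the example, and no grid submission takes place.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    CONFIGURED = "configured"
    SUBMITTED = "submitted"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Simulation:
    name: str
    params: Dict[str, float]  # hypothetical configuration values
    status: Status = Status.CONFIGURED


@dataclass
class DataProduct:
    path: str
    metadata: Dict[str, str]


@dataclass
class DigitalLibrary:
    """Hypothetical stand-in for the gateway's metadata catalog."""
    products: List[DataProduct] = field(default_factory=list)

    def register(self, path: str, **metadata: str) -> DataProduct:
        product = DataProduct(path, dict(metadata))
        self.products.append(product)
        return product

    def search(self, **criteria: str) -> List[DataProduct]:
        # A product matches when every requested key has the requested value.
        return [p for p in self.products
                if all(p.metadata.get(k) == v for k, v in criteria.items())]


def run_workflow(sim: Simulation, library: DigitalLibrary) -> List[Status]:
    """Step through the statuses and register one output product.

    A real gateway would submit to the grid and poll for status here;
    this sketch only records the transitions.
    """
    history = [sim.status]
    for next_status in (Status.SUBMITTED, Status.RUNNING, Status.DONE):
        sim.status = next_status
        history.append(next_status)
    library.register(f"{sim.name}/wavefield.dat",
                     simulation=sim.name, kind="wavefield")
    return history
```

For example, after `run_workflow(Simulation("demo_run", {"magnitude": 6.5}), library)`, calling `library.search(kind="wavefield")` returns the product registered for `demo_run`, which is the discovery-then-retrieval step in the list.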
SCEC Earthworks Science Gateway
NCAR Earth System Grid
• Science Gateway for climate research
• ESG was originally a distributed data management/access system, but it has evolved into more.
• User registration, authorization controls, and metrics tracking
• CCSM model source, initialization datasets, post-processing codes, and analysis and visualization tools.
• Prototypes of model-submission environments
– Eventually real-time tracking of model status along with references to available output datasets.
• Expect to see more model runs at higher resolution and with greater component scope.
Linked Environments for Atmospheric Discovery (LEAD)
• Providing tools that are needed to make accurate predictions of tornados and hurricanes
• Meteorological data
• Forecast models
• Analysis and visualization tools
• Data exploration and Grid workflow
nanoHUB Harnesses TeraGrid for Education
nanoHUB is used to complete coursework by undergraduate and graduate students in dozens of courses at 10 universities. It currently serves over 1000 users.
SAN DIEGO SUPERCOMPUTER CENTER
NanoHUB Middleware infrastructure
[Diagram: nanoHUB middleware infrastructure. Labels: Science Gateway, Workspaces, Research apps, Virtual backends, Virtual Cluster with VIOLIN, VM, Middleware, nanoHUB VO, Capability Computing, Capacity Computing, Grid, Campus Grids (Purdue, GLOW)]
Biomedical and Biology Gateway
• Led by Dan Reed, Renaissance Computing Institute, North Carolina
• Supports
– Distributed collaboration
– Multi-site data access
– Computational tools for local or remote execution
– Grid and cluster interoperability
• Will provide access to
– Common sequence and protein structure databases
– Over 140 software packages
Genetics and Disease Susceptibility
[Figure: Identify Genes; Phenotypes 1-4; Predictive Disease Susceptibility. Labels: Physiology, Metabolism, Endocrine, Proteome, Immune, Transcriptome, Biomarker Signatures, Morphometrics, Pharmacokinetics, Ethnicity, Environment, Age, Gender]
Source: Terry Magnuson, UNC
Science Communities and Outreach
• Communities
– Students and educators
– Phylogeneticists
– Evolutionary biologists
– Biomedical researchers
– Biostatisticians
– Computer scientists
– Medical clinicians
Biomedical and Biology, Building Biomedical Communities
• Partners
– University of North Carolina
– Duke University
– North Carolina State University
– NSF National Evolutionary Synthesis Center (NESC)
– NIH Carolina Center for Exploratory Genetic Analysis (CCEGA)
Would development of a gateway help your research?
• [email protected] mailing list
– Email [email protected]
– <subscribe gateways> in body
• Biweekly telecons to get advice from others. Current focus
– Gateway documentation
– GRAM auditing fully tested
– Community account policies
– TG-provided web service interfaces
• www.teragrid.org
– Details about current gateways
– Slides from June full day tutorial at TG06
• In depth presentations by LEAD, nanoHUB, RENCI, GIScience
– Documentation coming soon
– Potential integration assistance from TeraGrid staff
• Nancy Wilkins-Diehr, [email protected]
Thank you for your attention
Time for LUNCH!