Page 1: Science Gateways on the TeraGrid

SAN DIEGO SUPERCOMPUTER CENTER

Science Gateways on the TeraGrid

Nancy Wilkins-Diehr

TeraGrid Area Director for Science Gateways

SDSC Director of Consulting, Documentation, Training

San Diego Supercomputer Center

[email protected]

Page 2: Science Gateways on the TeraGrid

Hope you have had a productive week in San Diego

Page 3: Science Gateways on the TeraGrid

TeraGrid Science Gateways

• What is TeraGrid?
• What are Science Gateways?
• Why TeraGrid and Gateways?
• How Does This Help Me?

Page 4: Science Gateways on the TeraGrid

TeraGrid: Integrating NSF Cyberinfrastructure

TeraGrid is a facility that integrates computational, information, and analysis resources at the San Diego Supercomputer Center, the Texas Advanced Computing Center, the University of Chicago / Argonne National Laboratory, the National Center for Supercomputing Applications, Purdue University, Indiana University, Oak Ridge National Laboratory, the Pittsburgh Supercomputing Center, and the National Center for Atmospheric Research.

[Map labels: SDSC, TACC, UC/ANL, NCSA, ORNL, PU, IU, PSC, NCAR, Caltech, USC-ISI, Utah, Iowa, Cornell, Buffalo, UNC-RENCI, Wisc]

Page 5: Science Gateways on the TeraGrid

TeraGrid Vision

• TeraGrid will create integrated, persistent, and pioneering computational resources that will significantly improve our nation’s ability and capacity to gain new insights into our most challenging research questions and societal problems.

– Our vision requires an integrated approach to the scientific workflow including obtaining access, application development and execution, data analysis, collaboration and data management.

• 20 compute platforms
  – 10 at or above 10 Tflops
• 1 Pbyte of online disk
• Data collection hosting
• Remote visualization

• Single application process

Page 6: Science Gateways on the TeraGrid

TeraGrid PIs by Institution as of May 2006

[Map legend] Blue: 10 or more PIs; Red: 5-9 PIs; Yellow: 2-4 PIs; Green: 1 PI

Page 7: Science Gateways on the TeraGrid

Gateways are part of TeraGrid’s 3-pronged strategy to further science

• DEEP Science: Enabling Terascale Science
  – Make science more productive through an integrated set of very-high capability resources
    • ASTA projects
• WIDE Impact: Empowering Communities
  – Bring TeraGrid capabilities to the broad science community
    • Science Gateways
• OPEN Infrastructure, OPEN Partnership
  – Provide a coordinated, general purpose, reliable set of services and resources
    • Grid interoperability working group

Page 8: Science Gateways on the TeraGrid

Science Gateways: A new initiative for the TeraGrid

• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous:
  – Resources
  – Users, from expert to K-12
  – Software stacks, policies
• Science Gateways
  – Provide “TeraGrid Inside” capabilities
  – Leverage community investment
• Three common forms:
  – Web-based Portals
  – Application programs running on users' machines but accessing services in TeraGrid
  – Coordinated access points enabling users to move seamlessly between TeraGrid and other grids.

[Figure label: Workflow Composer]

Page 9: Science Gateways on the TeraGrid

Initial Focus on 10 Gateways

Page 10: Science Gateways on the TeraGrid

What Did We Learn About Common Gateway Requirements?

• Accounting
  – Support for accounts with differing capabilities
  – Ability to associate a compute job with an individual portal user
  – Scheme for portal registration and usage tracking
  – Support for OSG’s Grid User Management System (GUMS)
  – Dynamic accounts
• Security
  – Community account privileges
  – Need to identify the human responsible for a job for incident response
  – Acceptance of other grid certificates
  – TG-hosted web servers, cgi-bin code
• Web Services
  – Initial analysis completed 12/05
  – Some Gateways (LEAD, Open Life Sciences) have immediate needs
  – Many will build on capabilities offered by GT4, but interoperability could be an issue
  – Web Service security
  – Interfaces to scheduling and account management are common requirements
• Software
  – Interoperability of software stacks between TG and peer grids
  – Software installations for gateways across all TG sites
  – Community software areas
  – Management (pacman, other options)
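The accounting and security items above both come down to one bookkeeping problem: jobs run under a shared community account, yet each job must still be traceable to one portal user. The sketch below is one possible way to picture that mapping in Python. It is not TeraGrid's mechanism; the class names, the job ID format, and the account and site names are all assumptions made for illustration.

```python
# Hypothetical sketch: attributing jobs that run under a shared community
# account to individual portal users. Nothing here is a real TeraGrid API.
import itertools
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class JobRecord:
    job_id: str
    portal_user: str        # the person who submitted through the portal
    community_account: str  # the shared account the job actually ran under
    resource: str           # where the job ran (illustrative label)
    submitted_at: datetime


@dataclass
class GatewayAuditLog:
    community_account: str
    _records: Dict[str, JobRecord] = field(init=False, default_factory=dict)
    _ids: itertools.count = field(
        init=False, default_factory=lambda: itertools.count(1)
    )

    def record_submission(self, portal_user: str, resource: str) -> str:
        """Store who submitted the job and return a gateway-local job ID."""
        job_id = f"{self.community_account}-{next(self._ids)}"
        self._records[job_id] = JobRecord(
            job_id=job_id,
            portal_user=portal_user,
            community_account=self.community_account,
            resource=resource,
            submitted_at=datetime.now(timezone.utc),
        )
        return job_id

    def responsible_user(self, job_id: str) -> Optional[str]:
        """Incident response: which human is behind this job, if known?"""
        record = self._records.get(job_id)
        return record.portal_user if record else None

    def usage_by_user(self) -> Dict[str, int]:
        """Usage tracking: number of jobs submitted per portal user."""
        return dict(Counter(r.portal_user for r in self._records.values()))


if __name__ == "__main__":
    log = GatewayAuditLog("gateway_community")
    job = log.record_submission("alice", "site-a")
    log.record_submission("bob", "site-b")
    print(job, "was submitted by", log.responsible_user(job))
    print(log.usage_by_user())
```

A real gateway would persist these records and tie them to the resource provider's own job identifiers; the in-memory dictionary only illustrates the mapping.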

Page 11: Science Gateways on the TeraGrid

Gateways are growing in numbers

• 10 initial projects as part of TG proposal
• >20 Gateway projects today
• No limit on how many gateways can use TG resources
  – Prepare services and documentation so developers can work independently

• Open Science Grid (OSG)
• Special PRiority and Urgent Computing Environment (SPRUCE)
• National Virtual Observatory (NVO)
• Linked Environments for Atmospheric Discovery (LEAD)
• Computational Chemistry Grid (GridChem)
• Computational Science and Engineering Online (CSE-Online)
• GEON (GEOsciences Network)
• Network for Earthquake Engineering Simulation (NEES)
• SCEC Earthworks Project
• Network for Computational Nanotechnology and nanoHUB
• GIScience Gateway (GISolve)
• Biology and Biomedicine Science Gateway
• Open Life Sciences Gateway
• The Telescience Project
• Grid Analysis Environment (GAE)
• Neutron Science Instrument Gateway
• TeraGrid Visualization Gateway, ANL
• BIRN
• Gridblast Bioinformatics Gateway
• Earth Systems Grid
• Astrophysical Data Repository (Cornell)

• Many others interested
  – SID Grid
  – HASTAC

Page 12: Science Gateways on the TeraGrid

GEON

• The goal of GEON is
  – to advance the field of geoinformatics and
  – to prepare and train current and future generations of geoscience researchers, educators, and practitioners in the use of cyberinfrastructure to further their research, education, and professional goals.

• Geoinformatics will foster new interdisciplinary research, for example

– the gravity modeling of 3D geological features, such as plutons

– the study of active tectonics by integrating LiDAR data and geodynamics models

– the study of lithospheric structure and properties across diverse tectonic environments.

Page 13: Science Gateways on the TeraGrid

Southern California Earthquake Center

• Philip Maechling
• SCEC IT Architect

• Involves 500+ scientists at 55 institutions worldwide

• Focuses on earthquake system science using Southern California as a natural laboratory

• Translates basic research into practical products for earthquake risk reduction

SCEC Focus Groups

Page 14: Science Gateways on the TeraGrid

SCEC/CME Project

Goal: To develop a cyberinfrastructure that can support system-level earthquake science – the SCEC Community Modeling Environment (CME)

Support: 5-yr project funded by the NSF/ITR program under the CISE and Geosciences Directorates

Oct 1, 2001 – Sept 30, 2006

[Diagram labels: SCEC/ITR Project; NSF CISE, GEO; SCEC Institutions; IRIS; USGS; ISI; SDSC; Information Science; Earth Science]

www.scec.org/cme

Page 15: Science Gateways on the TeraGrid

Outline

The goal of the SCEC Earthworks Science Gateway is to allow users to run wave propagation simulations.

• Seismological Researchers
• Grad Students
• Public Interest in resulting data products

Many of these target users are not accustomed to using high performance computing.

Page 16: Science Gateways on the TeraGrid

SCEC Earthworks Science Gateway Basic Capabilities:

• Configure earthquake wave propagation simulations.
• Submit simulation for execution as workflow.
• Workflow executes across distributed grid environment including
  – SCEC, USC HPCC, and TeraGrid
• Monitoring of workflow status
• Data products registered with metadata into digital library
• Data discovery tools using metadata searches
• Data Retrieval for data products of interest
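The capability list above reads as a pipeline: configure, submit, execute across sites, monitor, register products with metadata, then search. A minimal Python sketch of that flow follows. It is an illustration only and not the SCEC Earthworks implementation; every class, function, parameter, and file name is hypothetical, and the "execution" step is a placeholder that only fabricates product names.

```python
# Hypothetical sketch of a configure -> submit -> monitor -> register ->
# discover pipeline. No real SCEC or TeraGrid interfaces are used.
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    CONFIGURED = "configured"
    RUNNING = "running"
    DONE = "done"


@dataclass
class Simulation:
    name: str
    params: Dict[str, float]   # illustrative simulation settings
    sites: List[str]           # where the workflow steps would run
    status: Status = Status.CONFIGURED


@dataclass
class DigitalLibrary:
    entries: List[Dict[str, object]] = field(default_factory=list)

    def register(self, product: str, metadata: Dict[str, object]) -> None:
        """Store a data product name together with its metadata."""
        self.entries.append({"product": product, **metadata})

    def search(self, **criteria: object) -> List[object]:
        """Return products whose metadata matches every criterion."""
        return [
            entry["product"]
            for entry in self.entries
            if all(entry.get(key) == value for key, value in criteria.items())
        ]


def run_workflow(sim: Simulation, library: DigitalLibrary) -> Status:
    """Pretend to execute one step per site and register each product."""
    sim.status = Status.RUNNING
    for site in sim.sites:
        # Placeholder for remote execution: a real gateway would submit a
        # job here and poll it until completion.
        product = f"{sim.name}@{site}.dat"
        library.register(
            product, {"simulation": sim.name, "site": site, **sim.params}
        )
    sim.status = Status.DONE
    return sim.status


if __name__ == "__main__":
    library = DigitalLibrary()
    sim = Simulation("demo_run", {"magnitude": 7.0},
                     ["scec", "usc_hpcc", "teragrid"])
    print(run_workflow(sim, library))
    print(library.search(site="teragrid"))
```

Keeping the metadata next to each product is what makes the later discovery step a simple filter; a production digital library would add persistence and richer queries.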

Page 17: Science Gateways on the TeraGrid

SCEC Earthworks Science Gateway

Page 18: Science Gateways on the TeraGrid

NCAR Earth System Grid

• Science Gateway for climate research

• ESG was originally a distributed data management/access system, but it has evolved into more.

• User registration, authorization controls, and metrics tracking

• CCSM model source, initialization datasets, post-processing codes, and analysis and visualization tools.

• Prototypes of model-submission environments

– Eventually real-time tracking of model status along with references to available output datasets.

• Expect to see more model runs at higher resolution and with greater component scope.

Page 24: Science Gateways on the TeraGrid

Linked Environments for Atmospheric Discovery (LEAD)

• Providing tools that are needed to make accurate predictions of tornados and hurricanes
• Meteorological data
• Forecast models
• Analysis and visualization tools
• Data exploration and Grid workflow

Page 25: Science Gateways on the TeraGrid

nanoHUB Harnesses TeraGrid for Education

nanoHUB is used to complete coursework by undergraduate and graduate students in dozens of courses at 10 universities. It currently serves over 1000 users.

Page 26: Science Gateways on the TeraGrid

NanoHUB Middleware infrastructure

[Diagram labels: Science Gateway; Workspaces; Research apps; Virtual backends; Virtual Cluster with VIOLIN; VM; nanoHUB VO; Middleware; Campus Grids (Purdue, GLOW); Grid; Capability Computing; Capacity Computing]

Page 27: Science Gateways on the TeraGrid

Biomedical and Biology Gateway

• Led by Dan Reed, Renaissance Computing Institute, North Carolina
• Supports
  – Distributed collaboration
  – Multi-site data access
  – Computational tools for local or remote execution
  – Grid and cluster interoperability
• Will provide access to
  – Common sequence and protein structure databases
  – Over 140 software packages

[Figure: Genetics and Disease Susceptibility. Labels: Identify Genes; Phenotype 1 to Phenotype 4; Predictive Disease Susceptibility; Physiology; Metabolism; Endocrine; Proteome; Immune; Transcriptome; Biomarker Signatures; Morphometrics; Pharmacokinetics; Ethnicity; Environment; Age; Gender. Source: Terry Magnuson, UNC]

Science Communities and Outreach

• Communities
  – Students and educators
  – Phylogeneticists
  – Evolutionary biologists
  – Biomedical researchers
  – Biostatisticians
  – Computer scientists
  – Medical clinicians

Biomedical and Biology, Building Biomedical Communities

• Partners
  – University of North Carolina
  – Duke University
  – North Carolina State University
  – NSF National Evolutionary Synthesis Center (NESC)
  – NIH Carolina Center for Exploratory Genetic Analysis (CCEGA)

Page 28: Science Gateways on the TeraGrid

Would development of a gateway help your research?

• [email protected] mailing list
  – Email [email protected]
  – <subscribe gateways> in body
• Biweekly telecons to get advice from others. Current focus
  – Gateway documentation
  – GRAM auditing fully tested
  – Community account policies
  – TG-provided web service interfaces
• www.teragrid.org
  – Details about current gateways
  – Slides from June full day tutorial at TG06
    • In depth presentations by LEAD, nanoHUB, RENCI, GIScience
  – Documentation coming soon
  – Potential integration assistance from TeraGrid staff

• Nancy Wilkins-Diehr, [email protected]

Page 29: Science Gateways on the TeraGrid

Thank you for your attention

Time for LUNCH!