
TeraGrid Quarterly, December 7, 2010

CyberGIS and TeraGrid Science Gateways update

Nancy Wilkins-Diehr
TeraGrid Area Director for Science Gateways

CyberGIS!

•SGW funds a Cyberinfrastructure in GIS workshop in conjunction with the UCGIS meeting in DC in February, 2010
  –Co-led by Shaowen Wang at NCSA and Nancy
  –Approved by UCGIS board after a lengthy voting process
  –Expected outcome of the workshop – all are happening!
    •Increased visibility
    •New partnerships for TeraGrid, UCGIS, and other pertinent organizations
    •Workshop report
    •Interesting collaborative proposal ideas
    •Potential future publications
  –50 attendees from throughout the US, Sweden and Australia
  –Workshop attendees also attended UCGIS meeting, including briefing from high-ranking administration officials


Shaowen and Nancy represent TeraGrid
Congressional Reading Room, Library of Congress


Administrative officials in attendance

•Karen Siderelis
  –GIO, Department of Interior; Acting Chair, Federal Geographic Data Committee
•Raphael Bostic
  –Assistant Secretary for Policy Development and Research, HUD
•Derek Douglas
  –Special Assistant to the President for Urban Affairs
•Jerry Johnston
  –CIO, EPA
•Stephen Lowe
  –GIO, Dept of Agriculture

Workshop brings results

•Scientific Software Integration (SSI) award for CyberGIS
  –Shaowen Wang, PI
  –Largest SSI awarded in FY10
    •$4.4M, 10/1/10-9/30/15
  –Academia, government, industry, international partnerships
    •Arizona State University
    •The Computer Network Information Center of the Chinese Academy of Sciences
    •Environmental Systems Research Institute (ESRI)
    •Georgia Institute of Technology
    •Oak Ridge National Laboratory
    •University College London Centre for Advanced Spatial Analysis (England)
    •University Consortium for Geographic Information Science
    •University of California-San Diego
    •University of California-Santa Barbara
    •University of Illinois at Urbana-Champaign
    •University of Washington
    •U.S. Geological Survey
    •Victorian Partnership for Advanced Computing (Australia)


Goals and objectives

•Establish CyberGIS as a fundamentally new software framework
  –Integration of CI, GIS, and spatial analysis/modeling capabilities
  –Widespread scientific breakthroughs, broad societal impacts
•Participatory evolution of CyberGIS community requirements
•CyberGIS software integration roadmap
•High performance and scalable CyberGIS
•Online CyberGIS gateway
•CyberGIS testing and integration with national and international CI
•Community-based and application-driven evaluation of CyberGIS

CyberGIS vision


Wang, S. 2010. "A CyberGIS Framework for the Synthesis of Cyberinfrastructure, GIS, and Spatial Analysis." Annals of the Association of American Geographers, 100(3): 535-557.

Initial software components

•GISolve
  –Large-scale spatial analysis and modeling (SAM), web interface
•GeoDa/PySAL
  –Spatial data analysis: exploratory data analysis (EDA), exploratory spatial data analysis (ESDA), maximum likelihood (ML) spatial regression
  –Open source library for spatial analysis: weights, computational geometry, ESDA, spatial econometrics, clustering and spatial dynamics
•Open-Topography
  –Community access to high-resolution, Earth science-oriented topography data, and related tools and resources
•PGIST
  –On-line tools for expanding public participation in transportation improvement programming
•pd-GRASS
  –GRASS (Geographic Resources Analysis Support System)
    •Geospatial data management and analysis, image processing, graphics/maps production, spatial modeling, and visualization
  –pd-GRASS
    •Shell scripts that distribute display functions of GRASS GIS through the network among several physical monitors in a synchronized manner


Early application targets

•Emergency management
  –Fire, flood, disease, earthquake
  –Managing during the incident and post incident, tracking victims, etc.
  –Real-time integration of data from multiple sources
  –Combining distinct analysis tools via workflows, incorporating HPC when warranted (“computational intensity maps”)
•Distributed analysis support, collaboration and participation tools through an online service
•Proof of concept research


What have the gateways been up to?

•Record high 896 users of community accounts on TG
  –31% of all users charging jobs
  –645 users of CIPRES gateway
    •Cited in at least 35 publications including Nature, PNAS, and Cell
    •77% of all jobs have been submitted from the US, including top-tier institutions such as Harvard, Yale, and Stanford
    •Jobs received regularly from 17 EPSCoR states
    •Job submissions from 34 countries on 5 continents
    •At least 5 undergraduate classes known to use the portal routinely. This is likely an underestimate (based on Web log patterns).
•International representation
  –Matthew Woitaszek gives keynote address at the International Workshop on Science Gateways (IWSG) in Catania, Italy
  –Rion Dooley also in attendance
•GRAM5
  –This deployment is the final piece needed to support attribute-based authentication
  –Much testing by gateway staff
    •Special kudos to Suresh Marru, Raminder Singh, David Carver, Stu Martin, Kate Ericson


2.5M CPU hours of Q3 gateway use
Source: Dave Hart

[Chart: By Resource. Legend: abe.ncsa.teragrid, frost.ncar.teragrid, ranger.tacc.teragrid, cobalt.ncsa.teragrid, steele.purdue.teragrid, condor.purdue.teragrid, kraken.nics.teragrid, lonestar.tacc.teragrid, queenbee.loni-lsu.teragrid, bigred.iu.teragrid, lincoln.ncsa.teragrid, nstg.ornl.teragrid, pople.psc.teragrid]

[Chart: By Community User. Legend: Cipres, Gridamp, Gisolve, Gridchem Chemistry, ccsmuser, Sidgrid, Robetta, DES, DESDM, Tera3D, OGCE, Nanohub, Bioportal, Nbcruser, LEAD, Jimmy Neutron, C4e4, Ultrascan]

Much progress on standardized treatment of community accounts

•Victor Hazlewood, Matthew Woitaszek, Jim Marsteller
•Nancy’s goal is to provide gateway developers a menu of what to expect at TG sites
•Fold all paper agreements into single, existing TG user responsibility form


Nancy’s “end of program” vision for community accounts

•Access to resources via gsi-ssh and GRAM
•If direct logins to community accounts are restricted:
  –Allow identified developers to “su in”
•If execution directories are restricted:
  –Provide developer controls through commsh


gsi-ssh GRAM su commsh
Site A X X
Site B X X X
Site C X X
Site D X X

•Uintah and PET
  –Gateway support staff and documentation are so good, need for advanced support nearly eliminated for large scale CFD gateway Uintah
    •Much to our surprise, Uintah developer presents at TG10 on nearly finished gateway after 20 minute phone conversation with Matthew Woitaszek and pointers to documentation!
  –New work outlined on Population-Environment-Technology (PET) model gateway


Analytical Ultracentrifugation
Emerging computational tool for the study of proteins

•Samples from researchers all over the world
  –Some (Germany, Australia) have their own ultracentrifuges and use only the analysis capabilities, others send samples to UT to spin
•Spin the samples at high speeds, learn about macromolecule properties
•Monte Carlo simulations
•Observations are electronically digitized and stored for further mathematical analysis


Source: Suresh Marru, IU

The Center for Analytical Ultracentrifugation of Macromolecular Assemblies, UT Health Sciences

Comprehensive data analysis environment

•Management of analytical ultracentrifugation data for single users or entire facilities
•Support for storage, editing, sharing and analysis of data
  –HPC facilities used for 2-D spectrum analysis and genetic algorithm analysis
    •TeraGrid (~2M CPU hours used)
    •Technische University of Munich
    •Juelich Supercomputing Center
•Portable graphical user interface
•MySQL database backend for data management
•Over 30 active institutions


Source: Suresh Marru, IU

Gateway and ASTA support: a growing trend

•TeraGrid advanced support
  –Fault tolerance
  –Workflows
  –Use of multiple TG resources (using Lonestar, expanding to QueenBee and Ranger, using Quarry for test server, waiting for GRAM5 on Ranger)
  –Community account implementation
  –Remote steering
  –Improved UI (no manual specification of CPU time)
  –Applying lessons learned from GridChem, LEAD, incorporating new features into OGCE
    •LEAD is portlet-based, GridChem is a Java Swing client-side app, Ultrascan is a PHP- and Perl-based gateway; all can use OGCE
•Big MPI app that forks off many independent runs; improvements here will be tackled by TG's advanced support team


Source: Suresh Marru, IU

New advanced support work

•Ocean Land Atmosphere Simulation (OLAS) group
  –PI Craig Mattocks, UNC
  –Simulation of flooding and inundation
  –Using TG’s gateway hosting service to set up high availability, real-time data server to ingest live data from NCAR's LDM, OPeNDAP and THREDDS services
  –Gateway work includes workflows to execute OLAS coupled models based on triggers from events in LDM data streams
    •Building off similar work in LEAD


Emerging gateway through DataONE

•Proposed MOU would have DataONE appear to TG as data oriented science gateway and TG appear to DataONE as a set of Member Nodes
•Combining distributed and diverse data sets to create new scientific insight from new syntheses of data
  –Observational eBird.org data integrated with environmental observational data such as NASA’s MODIS data from the ORNL DAAC to generate predictions of bird species migration patterns
  –TG10 paper by Daniel Fink


Gateway software listing

•Populate TeraGrid’s information service with gateway software information
  –A search for computational chemistry packages should turn up both commandline software and packages accessible through a gateway
  –Web services and programmatic generation of package listings too
    •So the RENCI science portal folks don’t have to hand-enter 140 applications!
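
The unified search described on this slide can be sketched roughly as follows. This is a minimal illustration, not TeraGrid's information service: the `SoftwareEntry` record, the `search` and `listing` helpers, and every package, resource, and gateway name in `CATALOG` are hypothetical placeholders. The only idea taken from the slide is that command-line installs and gateway-accessible packages live in one catalog, so a single category search returns both and listings can be generated rather than typed in by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareEntry:
    name: str      # package name (placeholder values below)
    category: str  # e.g. "computational chemistry"
    access: str    # "command-line" or "gateway"
    location: str  # resource or gateway offering it (placeholder)


# Hypothetical catalog contents, for illustration only.
CATALOG = [
    SoftwareEntry("chem-package-a", "computational chemistry", "command-line", "resource-1"),
    SoftwareEntry("chem-package-a", "computational chemistry", "gateway", "gateway-x"),
    SoftwareEntry("chem-package-b", "computational chemistry", "gateway", "gateway-x"),
    SoftwareEntry("phylo-package-c", "phylogenetics", "gateway", "gateway-y"),
]


def search(catalog, category):
    """Return every entry in a category, regardless of how it is accessed."""
    wanted = category.strip().lower()
    return [e for e in catalog if e.category.lower() == wanted]


def listing(entries):
    """Generate a tab-separated package listing programmatically."""
    ordered = sorted(entries, key=lambda e: (e.name, e.access))
    return "\n".join(f"{e.name}\t{e.access}\t{e.location}" for e in ordered)


if __name__ == "__main__":
    # One query surfaces both command-line and gateway-accessible packages.
    print(listing(search(CATALOG, "Computational Chemistry")))
```

Keeping the access mode as a field on each entry, rather than maintaining separate catalogs, is what lets one query cover both kinds of software.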


New gateway activities in the extension year

•Helpdesk support expanded
  –From .2 FTE in PY5 to 1.7 in Extension
    •Helpdesk and Condor support, new GIS communities, SimpleGrid extensions
•Accounting
  –Improved views for gateways now that we have attributes
•Community accounts
  –Continued work toward improved standardization
•Prebuilt VMs with gateway software
  –OGCE, SimpleGrid
•Online tutorials with CI Tutor and the EOT team
  –OGCE, SimpleGrid
•More example-based documentation
  –Less talk, more action, short videos, based on user feedback
•Remote vis for gateways – contract delays


Targeted Support in the Extension
All staff available for assignments as new projects come in

•Cactus
  –Meet the needs of several groups with large TG allocations
•GridChem, PolarGrid, Ultrascan
  –Scheduling, vis, Matlab processing, processing of centrifuge data for large international project
•CCSM-ESG
  –Continuing work to combine capabilities
•SNS
•CIPRES
•OpenSocial for gateways
•Condor and cloud support


TG to XD transitions for gateways
http://www.teragridforum.org/mediawiki/index.php?title=Science_Gateway_Use_Cases

•Stu, Nancy and XD gateway leads to conduct focused discussions with each gateway, pending architecture definition
  –Update use case description
  –Add/subtract gateways as necessary
  –Ask about impact of transition (depends on the architecture)
  –Ask about entries in software catalog
  –Ask about attribute-based authentication (also depends on architecture)
•Assessment and transition of gateway advanced support projects too


Gateway Sustainability Study
Small, non-TG, EAGER grant

•Characteristics of short funding cycles
  –Build exciting prototypes with input from scientists
  –Work with early adopters to extend capabilities
  –Tools are publicized, more scientists interested
  –Funding ends
  –Scientists who invested their time to use new tools are disillusioned
    •Less likely to try something new again
  –Start again on new short-term project
•Need to break this cycle
•EAGER grant to look at characteristics of successful gateways and domain areas where a gateway could have a big impact

4 focus group meetings over 2 years
First 2 held June, 2010

www.sciencegateways.org


Questions?