SAN DIEGO SUPERCOMPUTER CENTER, UCSD SciR&D

Hydrologic Information System Services Architecture

Collaborative project: UT Austin (D. R. Maidment) + SDSC (C. Baru, I. Zaslavsky) + Utah State U (D. Tarboton) + Drexel U (M. Piasecki) + Duke U. (J. Goodall) + CUAHSI Office (R. Hooper)




The Grid is becoming the backbone for collaborative science and data sharing

CI is about RE-USING data and research resources!!


CI Vision for Hydrologic Science

• Leverage ongoing cyberinfrastructure projects:
  • The Geosciences Network (GEON)
  • Share data between Earth disciplines
  • Secure access to Grid resources, single sign-on authentication/authorization, distributed data management, data publication, search, information integration, knowledge management, scientific workflows, archiving

• Integrate with common COTS (commercial off-the-shelf) software:
  • Excel, ArcGIS, Matlab…
  • and Fortran… mostly on Windows…
  • Interesting survey of CUAHSI partners by David Tarboton!


HIS User Assessment (Chapter 4 in Status Report)

Which of the four HIS goals is most important to you?

[Chart categories: Data Access; Science; Observatory support; Education]


Tuning to unique features of hydrology

• Hydrologic observations:
  • Reliance on federally organized data collection (NWIS, STORET, Ameriflux, etc.) with huge and complex nomenclatures: hence simplifying access to federal repositories, and relatively lower emphasis on data ownership
  • Handling time in both UTC and local
  • Various spatial offsets
  • Multiple data types: time series, fields, spatial data

• Integrative discipline:
  • Interoperation with atmospheric, ocean, soils, geomorphology, social datasets and services…

• Community:
  • Organized by “natural boundaries”: natural object hierarchy, networks of relatively autonomous self-managed data nodes
  • Partnership with public sector water management
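
One way to handle time in both UTC and local form is to keep the local timestamp together with the station's UTC offset and derive UTC from the pair. The following is a minimal sketch under that assumption; the helper name `to_utc_and_local` and the example offset are illustrative and are not taken from the HIS design.

```python
from datetime import datetime, timedelta, timezone


def to_utc_and_local(local_naive: datetime, utc_offset_hours: float):
    """Return (local_aware, utc_aware) for a naive local timestamp.

    utc_offset_hours is the station's fixed offset from UTC (for example
    -7 for US Mountain Standard Time). Hypothetical helper for illustration.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    local_aware = local_naive.replace(tzinfo=tz)       # attach the offset
    utc_aware = local_aware.astimezone(timezone.utc)   # same instant in UTC
    return local_aware, utc_aware


local, utc = to_utc_and_local(datetime(2006, 7, 1, 6, 30), -7)
print(local.isoformat())  # 2006-07-01T06:30:00-07:00
print(utc.isoformat())    # 2006-07-01T13:30:00+00:00
```

Keeping the offset alongside the local time means either representation can be rebuilt from the other, which is convenient when observations from several time zones are compared.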


Hydrologic Information System Service Oriented Architecture

[Architecture diagram. Recoverable labels:]
• WaterOneFlow Web Services: data access through web services (downloads); data storage through web services (uploads)
• Servers: Observatory servers; SDSC HIS servers; 3rd party servers, e.g. USGS, NCDC
• Web services interface, used from: GIS; Matlab; IDL; Splus, R; D2K, I2K; programming (Fortran, C, VB)
• Web portal interface (HDAS): information input, display, query and output services. Preliminary data exploration and discovery. See what is available and perform exploratory analyses
• Protocol labels: HTML - XML; WSDL - SOAP


Main Components

• Web services for accessing hydrologic repositories
• Hydrologic Observations Data Model
• Hydrologic Data Access System + Time Series Viewer
• Collection of CUAHSI nodes

[Diagram: CUAHSI Web Services at the center. Surrounding labels: NWIS, NCAR, Unidata, NASA, Storet, NCDC, Ameriflux; ArcGIS, Excel, Matlab, Access, SAS, Fortran, Visual Basic, C/C++]


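Database Sizes

[Table with rows EPA, NWS, USGS and columns Records, Stations, Time range. Values as they survive in the transcript; their assignment to rows is not recoverable:]
• Records: 200 million; ?; 250 million
• Stations: 800,000; 1.5 million; 19,000
• Time range: 100 years; 100 years; 100 years

(From Jon Goodall, Duke U.)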


Language for Data Representation

Agency | Unique Identifier for an Observation Station | Latitude, Longitude | Time of Measurement
EPA | Station ID | Station Latitude, Station Longitude | Activity Start
NWS | COOPID | LATITUDE, LONGITUDE | YEAR,MO,DA,TIME
USGS | site_no | dec_lat_va, dec_long_va | dv_dt

Lots of semantic differences in parameter names, methods, etc.
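
The naming differences listed above can be bridged with a per-agency lookup that renames each field to a shared vocabulary. This is only a sketch: the agency field names come from the slide, while the common names (`station_id`, `lat`, `lon`, `time`), the `normalize` helper, the assumption that NWS stores YEAR, MO, DA and TIME as separate fields, and all sample values are hypothetical.

```python
# Agency-specific field names are from the slide; the common names on the
# right-hand side are hypothetical.
FIELD_MAPS = {
    "EPA": {"Station ID": "station_id",
            "Station Latitude": "lat",
            "Station Longitude": "lon",
            "Activity Start": "time"},
    "USGS": {"site_no": "station_id",
             "dec_lat_va": "lat",
             "dec_long_va": "lon",
             "dv_dt": "time"},
    "NWS": {"COOPID": "station_id",
            "LATITUDE": "lat",
            "LONGITUDE": "lon"},
}


def normalize(agency: str, record: dict) -> dict:
    """Rename agency-specific keys to the common vocabulary."""
    mapping = FIELD_MAPS[agency]
    out = {common: record[src] for src, common in mapping.items() if src in record}
    # Assumption: NWS splits time across YEAR, MO, DA, TIME; join them.
    if agency == "NWS" and all(k in record for k in ("YEAR", "MO", "DA", "TIME")):
        out["time"] = "{YEAR}-{MO}-{DA} {TIME}".format(**record)
    return out


# Made-up sample records, not real station data.
usgs = normalize("USGS", {"site_no": "00000001", "dec_lat_va": 30.0,
                          "dec_long_va": -97.0, "dv_dt": "2006-07-01"})
nws = normalize("NWS", {"COOPID": "000002", "LATITUDE": 31.0, "LONGITUDE": -98.0,
                        "YEAR": "2006", "MO": "07", "DA": "01", "TIME": "0800"})
print(usgs)
print(nws)
```

A renaming table like this only addresses field names; differences in parameter codes and methods would need their own mappings.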


CUAHSI Search Wizard

Inside the NWIS Module

Search Wizard Demo

NWIS Parameter Codes

Output (Drexel Univ.)

Ontology Viz Demo


WaterOneFlow Web Services

Service | Input | Output
GetSites | Obs Network, filter | Get station codes in network
GetSiteInfo | Station code | Lat/long, station name
GetVariables | Obs Network or data source, filter | Get variable codes
GetVariableInfo | Variable code | Description of variable
GetValues | Station code or lat/long point, variable code, begin date, end date | A time series of values
GetChart | As for GetValues | A chart plotting the values

Output: string standardized to a common set of objects (XML schema)
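
Since each service returns a string that follows a common XML schema, a client needs only to assemble the inputs and parse the returned string. The sketch below illustrates this for GetValues. The XML element and attribute names, the argument keys, and both helper functions are invented for illustration and do not reproduce the actual WaterOneFlow schema or client API.

```python
import xml.etree.ElementTree as ET

# Hypothetical response; element and attribute names are invented and are
# not the real WaterOneFlow schema.
SAMPLE_RESPONSE = """
<timeSeries station="STATION-1" variable="VAR-1">
  <value dateTime="2006-07-01T00:00:00">12.5</value>
  <value dateTime="2006-07-02T00:00:00">13.0</value>
</timeSeries>
"""


def get_values_args(station_code, variable_code, begin_date, end_date):
    """Collect the GetValues inputs from the slide into one dict."""
    return {"station": station_code, "variable": variable_code,
            "beginDate": begin_date, "endDate": end_date}


def parse_time_series(xml_string):
    """Turn the returned XML string into a list of (dateTime, float) pairs."""
    root = ET.fromstring(xml_string.strip())
    return [(v.get("dateTime"), float(v.text)) for v in root.findall("value")]


args = get_values_args("STATION-1", "VAR-1", "2006-07-01", "2006-07-02")
series = parse_time_series(SAMPLE_RESPONSE)
print(args)
print(series)
```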


CUAHSI Web Services: http://www.cuahsi.org/his/webservices.html

NCEP North American Forecast Model, 12 km grid for continental US

New services: http://water.sdsc.edu/wateroneflow/


CUAHSI Point Hydrologic Observations Data Model

• A relational database stored in Access, PostgreSQL, MS SQL Server, …

• Stores observation data made at points

• Consistent format for storage of observations from many different sources and of many different types.

[Diagram labels: Streamflow; Flux tower data; Precipitation & Climate; Groundwater levels; Water Quality; Soil moisture data]

(D. Tarboton, USU)

Community design requirements (22 reviewers)
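
To make the idea of one consistent relational format for point observations concrete, here is a much-reduced sketch using SQLite from the Python standard library. The table names, column names and sample rows are hypothetical and do not reproduce the actual CUAHSI data model; SQLite stands in for the database engines named on the slide.

```python
import sqlite3

# Hypothetical, much-reduced layout; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (
    site_id   INTEGER PRIMARY KEY,
    site_code TEXT NOT NULL,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE variables (
    variable_id   INTEGER PRIMARY KEY,
    variable_name TEXT NOT NULL,
    units         TEXT
);
CREATE TABLE data_values (
    value_id    INTEGER PRIMARY KEY,
    site_id     INTEGER NOT NULL REFERENCES sites(site_id),
    variable_id INTEGER NOT NULL REFERENCES variables(variable_id),
    date_time   TEXT NOT NULL,
    data_value  REAL NOT NULL
);
""")

# Made-up sample rows: two different observation types at the same point.
conn.execute("INSERT INTO sites VALUES (1, 'SITE-A', 30.0, -97.0)")
conn.executemany("INSERT INTO variables VALUES (?, ?, ?)",
                 [(1, "Streamflow", "m3/s"), (2, "Soil moisture", "%")])
conn.executemany(
    "INSERT INTO data_values (site_id, variable_id, date_time, data_value) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, "2006-07-01T00:00", 4.2), (1, 2, "2006-07-01T00:00", 31.0)])

rows = conn.execute("""
    SELECT v.variable_name, d.data_value, v.units
    FROM data_values d JOIN variables v USING (variable_id)
    WHERE d.site_id = 1 ORDER BY v.variable_name
""").fetchall()
print(rows)
```

Because every observation type shares the same values table, a single query retrieves streamflow and soil moisture alike; only the variable lookup differs.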


Schema


Uses and tools for HODM

• HODM is central to HIS infrastructure, but lacks tools
• Testing HODM with two types of data: federal repositories, and external databases (Panola). Personal and enterprise versions.
• Mapping wizard: loading Excel observation data to HODM database:
  • Can save mapping files for subsequent runs of similarly formatted spreadsheets
  • Local data analysis can be done: charts and stats
• HDAS is an interface to HODM datasets, but it shall not be the only one, so HODM is also being exposed as Web services
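
The saved-mapping idea can be illustrated with a small sketch: a mapping file records which spreadsheet column feeds which data-model field, and the same file is reapplied to any similarly formatted sheet. The JSON layout, the field names, the `load_with_mapping` helper and the sample rows are all hypothetical, and a CSV string stands in for an Excel sheet.

```python
import csv
import io
import json

# Hypothetical saved mapping: spreadsheet header -> data-model field.
MAPPING_JSON = '{"Date": "date_time", "Site": "site_code", "Q (cfs)": "data_value"}'

# Stand-in for a spreadsheet exported as CSV; values are made up.
SHEET = "Date,Site,Q (cfs)\n2006-07-01,SITE-A,120.5\n2006-07-02,SITE-A,118.0\n"


def load_with_mapping(sheet_text, mapping):
    """Rename spreadsheet columns to data-model fields, row by row."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    return [{mapping[col]: val for col, val in row.items() if col in mapping}
            for row in reader]


mapping = json.loads(MAPPING_JSON)  # reusable for similarly formatted sheets
records = load_with_mapping(SHEET, mapping)
print(records)
```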


Hydrologic Data Access System

http://river.sdsc.edu/hdas/


Hydrologic Data Access System


Cross-platform design

[Architecture diagram. Recoverable labels:]
• Central CUAHSI HIS Node (Windows): IIS Web Server, ASP.Net; SQL Server; ArcGIS technologies; HDAS; HODM; Web services; Web service proxies; Data
• GEON Data Node (Linux): Apache Tomcat; GEON software stack; Data
• Also labeled: Proxy
• Remote CUAHSI HIS Nodes (Windows), shown several times, each with: IIS Web Server, ASP.Net; SQL Server; ArcGIS technologies; HDAS; HODM; Web services; Web service proxies; Data


Resource registration

• Shapefiles
• TIFF images, GMT rasters
• Web Services, WMS services
• Relational databases, ASCII
• PDFs, URLs
• “CUAHSI data”
• NetCDF
• Coming: Geodatabases and ODM


HIS Scalability

• Adding…
  …data types and datasets; processing models and services; servers; users and roles…
  …shall not create unmanageable bottlenecks that require system re-engineering

• Designing for scalability:
  • Distilling a generic set of web service signatures; resolving semantic and structural heterogeneities
  • Using HODM as a common generic format for time series data, for ease of coding and uniform search interfaces
  • HDAS GUI design to abstract specifics of disparate repositories
  • Leveraging common CI components developed in GEON and other projects
  • Have good design docs, to allow others to develop and deploy systems

Also: Need to work with agencies to remove the web services bottleneck: some progress with USGS and NOAA!!
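
The point about a generic set of service signatures can be sketched as one abstract interface with a separate adapter per repository, so that adding a data source does not change client code. Everything below is a hypothetical illustration: the class names, the method names and the in-memory data are not part of HIS.

```python
from abc import ABC, abstractmethod


class ObservationSource(ABC):
    """Generic signatures shared by every repository adapter (hypothetical)."""

    @abstractmethod
    def get_sites(self):
        """Return the station codes offered by this source."""

    @abstractmethod
    def get_values(self, station_code, variable_code, begin, end):
        """Return [(date, value), ...] with begin <= date <= end."""


class InMemorySource(ObservationSource):
    """Toy adapter backed by a dict; a real one would call a repository."""

    def __init__(self, data):
        # data: {(station_code, variable_code): [(iso_date, value), ...]}
        self._data = data

    def get_sites(self):
        return sorted({station for station, _ in self._data})

    def get_values(self, station_code, variable_code, begin, end):
        series = self._data.get((station_code, variable_code), [])
        # ISO dates compare correctly as strings.
        return [(d, v) for d, v in series if begin <= d <= end]


def collect(sources, station_code, variable_code, begin, end):
    """Client code sees one interface, whichever repository answers."""
    out = []
    for src in sources:
        out.extend(src.get_values(station_code, variable_code, begin, end))
    return sorted(out)


a = InMemorySource({("S1", "flow"): [("2006-07-01", 1.0), ("2006-07-03", 3.0)]})
b = InMemorySource({("S1", "flow"): [("2006-07-02", 2.0)]})
result = collect([a, b], "S1", "flow", "2006-07-01", "2006-07-02")
print(result)
```

Adding a new repository then means writing one more subclass; `collect` and any other client code stay unchanged.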


Plans

• HIS 1.0, Oct 30, 06:
  • Version 1 of WaterOneFlow web services
  • Version 1 of ODM
  • Additional catalogs in HODM
  • HDAS deployable
  • HIS workbook
  • HIS design document

• Web services and catalogs for fields
• Further GEON integration: registering ODMs to GEON catalog, and completing a Windows-based node
• Planning for 5 more years!