051207 agu sna frncisco data fed web services based mediator of distributed data flow and processing...

Web Services-Based Mediator of Distributed Data Flow and Processing Project Coordinators: Software Architecture: R. Husar Software Implementation: K. Höijärvi Data and Applications: S. Falke, R. Husar Center for Air Pollution Impact and Trend Analysis (CAPITA) Washington University, St. Louis, MO 63130

Upload: rudolf-husar

Post on 23-Jan-2015


DESCRIPTION

http://capitawiki.wustl.edu/index.php/20051205_Web_Services-Based_Mediator_of_Distributed_Data_Flow_and_Processing

TRANSCRIPT

Page 1: 051207 Agu Sna Frncisco Data Fed Web Services Based Mediator Of Distributed Data Flow And Processing Files Data Fed 051207 Agu

Web Services-Based Mediator of Distributed Data Flow and Processing

Project Coordinators:
Software Architecture: R. Husar
Software Implementation: K. Höijärvi
Data and Applications: S. Falke, R. Husar

Center for Air Pollution Impact and Trend Analysis (CAPITA)
Washington University, St. Louis, MO 63130


DataFed Description

DataFed Vision
Better air quality management and science through effective use of relevant data

DataFed Goals
• Facilitate the access and flow of atmospheric data from providers to users
• Support the development of user-driven data processing value chains
• Participate in specific application projects

Approach: Mediation Between Users and Data Providers
• DataFed assumes spontaneous, autonomous emergence of AQ data (a la Internet)
• Non-intrusively wraps datasets for access by web services
• WS-based mediators provide homogeneous data views, e.g. geo-spatial, time...
• End-user programming of data access and processing through WS composition (limited)

Applications
• Building browsers and analysis tools for distributed monitoring data
• Serve as data gateway for user programs: web pages, GIS, science tools

DataFed is currently focused on the mediation of air quality data


DataFed Multidimensional Data Model

4D Geo-Environmental Data Cube (X, Y, Z, T)

Environmental data represent measurements in the physical world which has space (X, Y, Z) and time (T) as its dimensions.

The specific inherent dimensions for geo-environmental data are: Longitude X, Latitude Y, Elevation Z and DateTime T.

The need to find, share and integrate geo-environmental data requires that the data be ‘coded’ in this 4D data space, at a minimum.
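As a minimal sketch of what coding data in the 4D space could look like, the Python below stores each measurement with its X, Y, Z, T coordinates and selects a space-time slice. The names `Observation` and `slice_cube`, and the choice of metres for elevation, are illustrative assumptions and not part of DataFed.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One measurement coded in the 4D space (X, Y, Z, T)."""
    lon: float       # X, degrees east
    lat: float       # Y, degrees north
    elev: float      # Z, metres (the unit is an assumption)
    time: datetime   # T
    value: float     # the measured quantity


def slice_cube(obs, lon_range, lat_range, time_range):
    """Return observations inside a lon/lat box and a time window (inclusive)."""
    (lon0, lon1), (lat0, lat1), (t0, t1) = lon_range, lat_range, time_range
    return [
        o for o in obs
        if lon0 <= o.lon <= lon1 and lat0 <= o.lat <= lat1 and t0 <= o.time <= t1
    ]
```

A map view is then a slice at a fixed time, and a time series is a slice at a fixed location.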


Data Flow & Processing in Air Quality Management

Primary Data (Diverse Providers)
• AQ DATA: EPA Networks, IMPROVE Visibility, Satellite-PM Pattern
• METEOROLOGY: Met. Data, Satellite-Transport, Forecast model
• EMISSIONS: National Emissions, Local Inventory, Satellite Fire Locs

Data ‘Refining’ Processes
• Filtering, Aggregation, Fusion

‘Knowledge’ Derived from Data
• Status and Trends
• AQ Compliance
• Exposure Assess.
• Network Assess.
• Tracking Progress

AQ Management Reports


Mediator-Based Integration Architecture (Wiederhold, 1992)

• The job of the mediator is to provide an answer to a user query (Ullman, 1997)
• In the database theory sense, a mediator is a view of the data found in one or more sources
• Heterogeneous sources are wrapped by translation software that converts the local language to the global language
• Mediators (web services) obtain data from wrappers or other mediators and process it …

[Diagram labels: Heterogeneous Data, Wrapper, Wrapper, Service, Service, Views, User Query]
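The wrapper and mediator roles described on this slide can be sketched in a few lines of Python. This is only an illustration of the pattern, not DataFed's implementation; the class names, the `fetch` method and the record fields (`site`, `value`) are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]  # a record in the assumed global schema


class Wrapper:
    """Translates one source's local records into the global schema."""

    def __init__(self, rows: Iterable[dict], translate: Callable[[dict], Record]):
        self._rows = list(rows)
        self._translate = translate

    def fetch(self) -> List[Record]:
        return [self._translate(r) for r in self._rows]


class Mediator:
    """A view over one or more sources (wrappers or other mediators)."""

    def __init__(self, sources, process=lambda recs: recs):
        self._sources = list(sources)
        self._process = process

    def fetch(self) -> List[Record]:
        # Gather records from every source, then apply this mediator's processing.
        merged = [rec for s in self._sources for rec in s.fetch()]
        return self._process(merged)
```

Because a mediator exposes the same `fetch` method as a wrapper, mediators can be stacked on other mediators, which is what the last bullet above describes.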


Generic Data Flow and Processing in DataFed

Physical Data: Resides in autonomous servers; accessed by view-specific wrappers which yield abstract data ‘slices’

Abstract Data: Abstract data slices are requested by viewers; uniform data are delivered by wrapper services

Processed Data: Data passed through filtering, aggregation, fusion and other web services

View Data: Processed data are delivered to the user as multi-layer views by portrayal and overlay web services

[Diagram: DataView 1, DataView 2 and DataView 3; labels: Physical Data, View Wrapper, Abstract Data, Abstract Data Access, Process Data, Processed Data, Portrayal/Render, Portrayed Data]
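The chain on this slide (abstract data, then processing, then portrayal) can be pictured as a composition of small stages. The sketch below is a hypothetical Python illustration; `compose` and the three stage functions are invented for the example, and a text string stands in for a rendered map layer.

```python
from functools import reduce
from statistics import mean


def compose(*stages):
    """Chain stages left to right: compose(f, g)(x) == g(f(x))."""
    return lambda data: reduce(lambda acc, stage: stage(acc), stages, data)


def drop_missing(values):
    """Filtering stage: discard missing (None) samples."""
    return [v for v in values if v is not None]


def average(values):
    """Aggregation stage: collapse the samples to one mean value."""
    return mean(values)


def render_text(value):
    """Portrayal stage: a text 'view' stands in for a rendered layer."""
    return f"mean={value:.1f}"


view = compose(drop_missing, average, render_text)
```

Swapping or adding a stage changes the view without touching the data source, which is the point of keeping each step a separate service.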


Anatomy of a Wrapper Service: TOMS Satellite Image Data

• Given the URL template and the image description, the wrapper service can access the image for any day and any spatial subset using an HTTP URL or the SOAP protocol.

• Wrapper classes are available for geo-spatial (incl. satellite) images, SQL servers, text files, etc. The mediator classes are implemented as web services for uniform data access, transformation and portrayal.

[Figure: TOMS image annotated with src_img_width, src_img_height, src_margin_left, src_margin_right, src_margin_top, src_margin_bottom, src_lon_min, src_lon_max, src_lat_min, src_lat_max]

Image Description for Data Access:

src_image_width=502 src_image_height=329

src_margin_bottom=105 src_margin_left=69 src_margin_right=69 src_margin_top=46

src_lat_min=-70 src_lat_max=70 src_lon_min=-180 src_lon_max=180
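To show how such an image description could be used for spatial subsetting, here is a hedged Python sketch that converts longitude/latitude to pixel coordinates. It assumes the map area is a linear (equirectangular) grid inside the margins and that pixel (0, 0) is the top-left corner; neither assumption is stated on the slide, and `lonlat_to_pixel` is a made-up name.

```python
# Values copied from the image description above.
DESC = {
    "src_image_width": 502, "src_image_height": 329,
    "src_margin_left": 69, "src_margin_right": 69,
    "src_margin_top": 46, "src_margin_bottom": 105,
    "src_lon_min": -180, "src_lon_max": 180,
    "src_lat_min": -70, "src_lat_max": 70,
}


def lonlat_to_pixel(lon, lat, d=DESC):
    """Map lon/lat to (column, row), assuming a linear grid and a top-left origin."""
    # Size of the map area once the margins are removed.
    map_w = d["src_image_width"] - d["src_margin_left"] - d["src_margin_right"]
    map_h = d["src_image_height"] - d["src_margin_top"] - d["src_margin_bottom"]
    # Fractional position: x grows eastward, y grows downward from the top edge.
    fx = (lon - d["src_lon_min"]) / (d["src_lon_max"] - d["src_lon_min"])
    fy = (d["src_lat_max"] - lat) / (d["src_lat_max"] - d["src_lat_min"])
    return (d["src_margin_left"] + fx * map_w, d["src_margin_top"] + fy * map_h)
```

With these numbers the map area is 364 by 178 pixels, so the north-west corner (-180, 70) falls on pixel (69, 46).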

The daily TOMS images reside on the FTP archive, e.g. ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y2000/ea000820.gif

URL template: ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y[yyyy]/ea[yy][mm][dd].gif
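Filling in the template for a given day can be sketched as below. The function name `expand_template` is hypothetical, and the reading of the placeholders ([yyyy] as four-digit year, [yy] as two-digit year, [mm] as month, [dd] as day) is inferred from the example file name above.

```python
from datetime import date

TEMPLATE = "ftp://toms.gsfc.nasa.gov/pub/eptoms/images/aerosol/y[yyyy]/ea[yy][mm][dd].gif"


def expand_template(template, day):
    """Fill the [yyyy], [yy], [mm] and [dd] placeholders for one day."""
    fields = {
        "[yyyy]": f"{day.year:04d}",
        "[yy]": f"{day.year % 100:02d}",
        "[mm]": f"{day.month:02d}",
        "[dd]": f"{day.day:02d}",
    }
    for placeholder, text in fields.items():
        template = template.replace(placeholder, text)
    return template
```

For 20 August 2000 this reproduces the example URL given on the slide.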

Transparent colors for overlays

RGB(89,140,255) RGB(41,117,41) RGB(23,23,23) RGB(0,0,0)


An Application Program: Voyager Data Browser

• The web program consists of a stable core and adaptive input/output layers
• The core maintains the state and executes the data selection, access and render services
• The adaptive, abstract I/O layers connect the core to evolving web data, flexible displays and a configurable user interface:
– Wrappers encapsulate the heterogeneous external data sources and homogenize the access
– Device Drivers translate generic, abstract graphic objects to specific devices and formats
– Ports connect the internal parameters of the program to external controls
– WSDL web service description documents

[Diagram labels: Data Sources, Controls, Displays; I/O Layer: Wrappers, Device Drivers, Ports; Core: App State, Data Flow Interpreter; Web Services, WSDL]
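One way to picture the split between a stable core and its I/O layers is the Python sketch below. It is an illustration under assumptions, not the Voyager code: the `Core` class, the `set_port` method and the dictionary-based wrappers and drivers are all hypothetical.

```python
class Core:
    """Stable core: holds the state and runs select -> access -> render."""

    def __init__(self, wrappers, drivers):
        self.wrappers = wrappers   # dataset name -> callable(state) -> data
        self.drivers = drivers     # device name -> callable(data) -> output
        self.state = {}            # internal parameters, set through ports

    def set_port(self, name, value):
        """A port exposes one internal parameter to an external control."""
        self.state[name] = value

    def render(self):
        # Access data through the selected wrapper, then portray it
        # through the selected device driver.
        data = self.wrappers[self.state["dataset"]](self.state)
        return self.drivers[self.state["device"]](data)
```

New data sources or display formats are added by registering another wrapper or driver; the core itself does not change.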


VIEW by Web Service Composition

Layers: SeaWiFS Satellite, Aerosol Chemical, Air Trajectory, Map Border


Air Quality Datasets

• Data are accessed from autonomous, distributed providers
• DataFed ‘wrappers’ provide uniform geo-time referencing
• Tools allow space/time overlay, comparisons and fusion

Near Real Time Data Integration
Delayed Data Integration

Surface Air Quality
AIRNOW: O3, PM25
ASOS_STI: Visibility, 300 sites
METAR: Visibility, 1200 sites
VIEWS_OL: 40+ Aerosol Parameters

Satellite
MODIS_AOT: AOT, Idea Project
GASP: Reflectance, AOT
TOMS: Absorption Indx, Refl.
SEAW_US: Reflectance, AOT

Model Output
NAAPS: Dust, Smoke, Sulfate, AOT
WRF: Sulfate

Fire Data
HMS_Fire: Fire Pixels
MODIS_Fire: Fire Pixels

Surface Meteorology
RADAR: NEXRAD
SURF_MET: Temp, Dewp, Humidity…
SURF_WIND: Wind vectors
ATAD: Trajectory, VIEWS locs.


Some of the Tools of DataFed

Consoles: Data from diverse sources are displayed to create a rich context for exploration and analysis

CATT: Combined Aerosol Trajectory Tool for browsing back-trajectories for specified chemical conditions

Viewer: General purpose spatio-temporal data browser and view editor applicable for all DataFed datasets


Sulfate in the Northeast

Sahara Dust in the Gulf

Fires in the Southeast

Time Series Console: Southeast

Analyst Console Applications:

Sulfate Episode: 8/27/04