050317 ws telecon husar
http://capitawiki.wustl.edu/index.php/20050317_Web_Services:_ES_Rationale_and_Assertions
Web Services: ES Rationale and Assertions
[Figure: Provider Push, Science Pull. Flow of data and flow of control between data sources (Data 1, Data 2) and Knowledge Products 1-5]
Web Services for Refining Data to Knowledge
Prepared for:
Technology Infusion Web Services Sub-group
March 17, 2004 Telecon
R. Husar, [email protected]
[Better Earth] Science is the DRIVER for the Information System!
The Researcher’s Challenge
“The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good he cannot merge them with other data.”
Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989
These resistances can be overcome through
• A catalog of distributed data resources for easy data ‘discovery’
• Uniform data coding and formatting for easy access, transfer and merging
• Rich and flexible metadata structure to encode the knowledge about data
• Powerful shared tools to access, merge and analyze the data
[For the data types they cover], OGC & OpenDAP are addressing the Finding and Reformatting tasks
The custom processing of data into knowledge is still a major burden at the user end
Petabytes 10^15, Terabytes 10^12, Gigabytes 10^9, Megabytes 10^6
Calibration, Transformation To Characterized
Geophysical Parameters
Filtering, Aggregation, Fusion, Modeling,
Trends, Forecasting
Interactive Dissemination
ACCESS
Multi-platform/parameter, high space/time resolution,
remote & in-situ sensing
Sensing Analysis & Synthesis
Earth Science Data to Knowledge Transformation:Value-Adding Processes
Data Acquisition Value Chain (Network)
InfoSystem Goal: Add as much value to the data as possible to benefit all users
Data Usage Value Network
Flexible data selection and processing to deliver the right knowledge to the right place at the right time
Data - L1 Information – L2 Knowledge – L3-6? Usable Knowledge
Query
Data
Distributed, Dynamic
More Local, DAAC
Processing Knowledge Use
Value-Added Processing in Service Oriented Architecture
Control
Data
Chain 1
Chain 2 Chain 3
Peer-to-peer network representation
Data ServiceCatalog
User
Data, services and users are distributed throughout the network
Users compose data processing chains from reusable services
Intermediate data are also exposed for possible further use
Chains can be linked to form compound value-adding processes
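The chaining idea above can be illustrated with a minimal Python sketch. It assumes that a "service" is simply a function from one dataset to another; the service names (filter_valid, daily_mean) and the sample values are hypothetical and are not taken from DataFed or any other system named in these slides.

```python
from typing import Callable, Dict, List

# Assumption: a "service" is any callable that maps one dataset to another.
Service = Callable[[List[float]], List[float]]


def compose_chain(*services: Service) -> Service:
    """Return a chain that applies the given services left to right."""
    def chain(data: List[float]) -> List[float]:
        for service in services:
            data = service(data)
        return data
    return chain


def run_exposing_intermediates(
    data: List[float], services: List[Service]
) -> Dict[str, List[float]]:
    """Run services in order and keep every intermediate result by name."""
    exposed = {"input": data}
    for service in services:
        data = service(data)
        exposed[service.__name__] = data
    return exposed


# Hypothetical reusable services.
def filter_valid(values: List[float]) -> List[float]:
    """Drop negative values, which stand in for missing-data codes here."""
    return [v for v in values if v >= 0]


def daily_mean(values: List[float]) -> List[float]:
    """Reduce a list of values to a single-element list holding the mean."""
    return [sum(values) / len(values)] if values else []


# Two simple chains linked into a compound value-adding process.
chain_1 = compose_chain(filter_valid)
chain_2 = compose_chain(daily_mean)
compound = compose_chain(chain_1, chain_2)

raw = [12.0, -999.0, 18.0, 15.0]
result = compound(raw)
exposed = run_exposing_intermediates(raw, [filter_valid, daily_mean])
```

Because a chain has the same signature as a single service, chains can themselves be linked, and the dictionary of intermediates shows one way partial results could be exposed for further use.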
Service chain representation
User Tasks:
Find data and services
Compose service chains
Expose output
Chain 2
Chain 1 Chain 3
Data
Service
User Carries less Burden
In a service-oriented peer-to-peer architecture, the user is aided by software ‘agents’
Data Flow and Flow Control in AQ Management
Provider Push User Pull
Data are supplied by the provider and exposed on the ‘smorgasbord’
However, the choice of data and processes is made by the user
Thus, the autonomous data consumers, providers and mediators form the info system
Flow of Data
Flow of Control
AQ DATA
METEOROLOGY
EMISSIONS DATA
Informing Public
AQ Compliance
Status and Trends
Network Assess.
Tracking Progress
Data to Knowledge ‘Refinery’
The data ‘refining’ process is not a chain but a network connecting processing nodes.
Like on the Internet, new nodes and connections are added continuously.
Thus, the infosystem needs to support the dynamic addition of new nodes and connections.
Hence, there is a need for a loosely coupled ‘plug-and-play’ architecture.
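One way to picture dynamic addition of nodes and connections is a small registry in which nodes are looked up by name at run time. The sketch below is an illustration only: the class RefineryNetwork, its methods, and the processing functions are hypothetical, and it assumes the connections form no cycles.

```python
from typing import Any, Callable, Dict, List, Set, Tuple


class RefineryNetwork:
    """Hypothetical registry of processing nodes and their connections."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Callable[[Any], Any]] = {}
        self._links: Dict[str, Set[str]] = {}

    def add_node(self, name: str, process: Callable[[Any], Any]) -> None:
        """Register a node; existing nodes are not modified."""
        self._nodes[name] = process
        self._links.setdefault(name, set())

    def connect(self, upstream: str, downstream: str) -> None:
        """Route the output of one registered node to another."""
        for name in (upstream, downstream):
            if name not in self._nodes:
                raise KeyError(f"unknown node: {name}")
        self._links[upstream].add(downstream)

    def push(self, start: str, data: Any) -> Dict[str, Any]:
        """Run the start node and propagate results downstream.

        Assumes the connections contain no cycles.
        """
        outputs: Dict[str, Any] = {}
        pending: List[Tuple[str, Any]] = [(start, data)]
        while pending:
            name, value = pending.pop()
            result = self._nodes[name](value)
            outputs[name] = result
            pending.extend((nxt, result) for nxt in self._links[name])
        return outputs


network = RefineryNetwork()
network.add_node("aq_data", lambda values: [v for v in values if v is not None])
network.add_node("status_and_trends", lambda values: sum(values) / len(values))
network.connect("aq_data", "status_and_trends")
before = network.push("aq_data", [10.0, None, 20.0])

# A new consumer is plugged in later without touching the existing nodes.
network.add_node("informing_public", lambda values: max(values))
network.connect("aq_data", "informing_public")
after = network.push("aq_data", [10.0, None, 20.0])
```

The nodes never reference each other directly; only the registry knows the connections, which is the sense of "loosely coupled" used here.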
A Sample of Datasets Accessible through DataFed/ESIP Mediation
Near Real Time (~ day)
It has been demonstrated (project FASTNET) that these and other datasets can be accessed, repackaged and delivered by AIRNow through ‘Consoles’
MODIS Reflectance
MODIS AOT TOMS Index
GOES AOT
GOES 1km Reflec
NEXRAD Radar
MODIS Fire Pix
NRL MODEL
NWS Surf Wind, Bext
Assertions on Web Services Technology
• Currently Web Services are the leading (and only?) technologies for building software applications in an autonomous, networked, dynamic environment
• The future is promising since businesses are driving the WS technologies and the community is benefiting from the increasingly ‘semantic web’
• A growing resource pool is exposed as ‘services’ and WS-based ES applications development frameworks are being developed/evaluated (e.g. SciFlo, DataFed)
WS Adaptation Issues
• Catalogs for finding and using services are grossly inadequate
• The semantic layers of the interoperability stack are not yet available
• General ‘fallacies of distributed computing’:
– Network is reliable
– Latency is zero
– Bandwidth infinite
– Network is secure
– Topology stable
– One administrator
– No transport costs
– Network uniform
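A client that does not assume a reliable network typically wraps each service call in a bounded retry. The following is a minimal sketch of that pattern, not a recommendation from the slides; call_with_retry and flaky_service are hypothetical names, and the failure is simulated rather than coming from a real network call.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    service: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 0.01,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call a service, retrying on network-style errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(attempts):
        try:
            return service()
        except (ConnectionError, TimeoutError) as err:
            last_error = err
            # Wait longer after each failure, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    assert last_error is not None
    raise last_error


# Simulated service that fails twice before succeeding (hypothetical).
calls = {"count": 0}


def flaky_service() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")
    return "ok"


result = call_with_retry(flaky_service, attempts=5, sleep=lambda seconds: None)
```

The sleep function is passed in as a parameter so the backoff can be disabled in tests; a retry addresses only the reliability and latency fallacies, not security or topology changes.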
Interoperability Stack
Layer      Description             Standards
Semantics  Meaning                 WSDL ext., Policy, RDF
Data       Types                   Schema, WSDL
Protocol   Communication behavior  SOAP, WS-* ext.
Syntax     Data format             XML
Transport  Addressing, Data flow   HTTP, SMTP
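To make the lower layers of the stack concrete, the sketch below builds and reads a SOAP 1.1 style envelope using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace urn:example:aq, the GetData operation, and its two parameters are hypothetical. The XML text is the syntax layer, the Envelope/Body wrapper is the protocol layer, and the conversion of maxRecords to an integer stands in for the data-type layer that a schema would normally define. Transport and semantics are not shown.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:aq"  # hypothetical application namespace


def build_request(parameter: str, max_records: int) -> str:
    """Serialize a hypothetical GetData request inside a SOAP envelope."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetData")
    ET.SubElement(request, f"{{{APP_NS}}}parameter").text = parameter
    ET.SubElement(request, f"{{{APP_NS}}}maxRecords").text = str(max_records)
    return ET.tostring(envelope, encoding="unicode")


def read_request(xml_text: str) -> dict:
    """Parse the envelope and convert each field to its intended type."""
    envelope = ET.fromstring(xml_text)
    request = envelope.find(f"{{{SOAP_NS}}}Body/{{{APP_NS}}}GetData")
    if request is None:
        raise ValueError("no GetData request in SOAP body")
    return {
        "parameter": request.findtext(f"{{{APP_NS}}}parameter"),
        "max_records": int(request.findtext(f"{{{APP_NS}}}maxRecords")),
    }


message = build_request("AOT", 10)
parsed = read_request(message)
```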
Kickoff Questions
• What is a Web Service?
  – e.g. 'A programming module with a well-defined, web-based I/O interface' (operating on well structured data??)
  – Examples of what is/is not a WS
• WS Classification by Interoperability Layer
  – Transport
  – Interface Syntax
    • Strongly typed interface (e.g. SOAP, WSDL)
    • Weakly typed interface (e.g. arbitrary CGI? URL interface)
  – Protocol/Data
  – Semantics
• WS Classification by Architecture
  – Services for Tightly Coupled applications (e.g. URL service called from IDL)
  – Services for Loosely Coupled (e.g. application composed from SOAP services)
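The strongly versus weakly typed distinction can be sketched by analogy in Python. This is not SOAP or WSDL; it only contrasts a URL-style interface, where every parameter arrives as an unchecked string, with an interface that declares and enforces its parameter types. The parameter names (dataset, lat, lon) and the DataRequest class are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict
from urllib.parse import parse_qs


def weakly_typed(query: str) -> Dict[str, str]:
    """URL-style interface: every value is a string and nothing is checked."""
    return {key: values[0] for key, values in parse_qs(query).items()}


@dataclass(frozen=True)
class DataRequest:
    """Declared interface: required fields and their types are explicit."""
    dataset: str
    lat: float
    lon: float

    @classmethod
    def from_query(cls, query: str) -> "DataRequest":
        raw = parse_qs(query)
        try:
            return cls(
                dataset=raw["dataset"][0],
                lat=float(raw["lat"][0]),  # raises ValueError if not numeric
                lon=float(raw["lon"][0]),
            )
        except KeyError as err:
            raise ValueError(f"missing parameter: {err.args[0]}") from err


query = "dataset=MODIS_AOT&lat=38.6&lon=-90.2"
loose = weakly_typed(query)
strict = DataRequest.from_query(query)
```

With the weakly typed form, a missing or malformed parameter is discovered only when some later step fails; with the declared form, the request is rejected at the interface.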