
Page 1: Information System for Air Quality Management: End-to-End System Architecture November 2001

Information System for Air Quality Management: End-to-End System Architecture

November 2001

Page 2: Information System for Air Quality Management: End-to-End System Architecture November 2001
Page 3: Information System for Air Quality Management: End-to-End System Architecture November 2001

Minimal (Voyager) Star Schema

Page 4: Information System for Air Quality Management: End-to-End System Architecture November 2001

Snowflake and Star Schema

• Fact Table: A fact table is a table that contains the measures of interest. For example, sales amount would be such a measure. This measure is stored in the fact table with the appropriate granularity. For example, it can be sales amount by store by day. In this case, the fact table would contain three columns: A date column, a store column, and a sales amount column.

• Snowflake Schema is a set of tables composed of a single, central fact table surrounded by normalized dimension hierarchies. Each dimension level is represented in a table. Snowflake schemas implement dimensional data structures with fully normalized dimensions. Star schemas are an alternative to snowflake schemas.

• Snowflake Schema: The snowflake schema is an extension of the star schema where each point of the star explodes into more points. The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and the joining of smaller lookup tables. The main disadvantage of the snowflake schema is the additional maintenance effort needed due to the increased number of lookup tables.

• Star Schema is a set of tables composed of a single, central fact table surrounded by de-normalized dimensions. Each dimension is represented in a single table. Star schemas implement dimensional data structures with de-normalized dimensions. Snowflake schemas are an alternative to star schemas. A star schema is a relational database schema for representing multidimensional data. The data are stored in a central fact table, with one or more tables holding information on each dimension. Dimensions have levels, and all levels are usually shown as columns in each dimension table.

• Star Schema: In the star schema design, a single object (the fact table) sits in the middle and is radially connected to other surrounding objects (dimension lookup tables) like a star. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table (see the sketch after this list).

• Federated Star Schema: In a federated star schema, instead of having the fact table in the middle, a chosen dimension sits in the middle. All the fact tables related to this particular dimension radiate from it. Finally, all the other dimensions that are related to each of the fact tables complete the loop. This type of schema is best used when one wants to focus the analysis on that one particular dimension. Because all the fact tables are connected to one central dimension, this is an excellent way of performing cross-fact analysis. The construct also allows much better segmentation and profiling of the one dimension of interest. An example of a product that uses this type of schema is MetaEdge's C-Insight.

• Part of the data modeling exercise is often the identification of data sources. Sometimes this step is deferred until the ETL step. However, my feeling is that it is better to find out where the data exist, or, better yet, whether they even exist anywhere in the enterprise at all. Should the data not be available, this is a good time to raise the alarm. If this were delayed until the ETL phase, rectifying it would become a much tougher and more complex process.

• Star Schema Database A database schema used in some RDBMS based OLAP servers in which all values of measures are stored in a "fact table" along with simple numeric keys representing the dimension members with which the values are associated. The descriptive member names that are associated with each numeric key in the fact table are stored in denormalized dimension tables, one per dimension of the hypercube. The dimension tables also describe the hierarchical relationships in each dimension. See: RDBMS Based OLAP Server, Snowflake Schema Database

• Snowflake Schema Database A variation of the star schema database in which the dimension tables are normalized. This will improve performance in cases where a star schema dimension table would be so large that unreasonable amounts of disk storage would be required and query performance would be degraded. See: RDBMS Based OLAP Server, Star Schema Database
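To make the star layout concrete, here is a minimal sketch in Python using sqlite3, following the sales-by-store-by-day example above. The table and column names are illustrative only and are not taken from any schema in this document.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: one de-normalized table per dimension (star layout).
cur.execute("""CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    full_date TEXT, month TEXT, year INTEGER)""")
cur.execute("""CREATE TABLE dim_store (
    store_key INTEGER PRIMARY KEY,
    store_name TEXT, city TEXT, region TEXT)""")

# Central fact table: the numeric measure plus foreign keys to the dimensions.
cur.execute("""CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    store_key INTEGER REFERENCES dim_store(store_key),
    sales_amount REAL)""")

cur.executemany("INSERT INTO dim_date VALUES (?, ?, ?, ?)",
                [(1, "2001-11-01", "2001-11", 2001),
                 (2, "2001-11-02", "2001-11", 2001)])
cur.executemany("INSERT INTO dim_store VALUES (?, ?, ?, ?)",
                [(1, "Store A", "St. Louis", "Midwest"),
                 (2, "Store B", "Fresno", "West")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(1, 1, 120.0), (1, 2, 80.0), (2, 1, 95.0)])

# A typical star-schema query: join the fact table to its dimensions
# and aggregate the measure, here sales amount by region and month.
for row in cur.execute("""
        SELECT s.region, d.month, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_date d ON f.date_key = d.date_key
        JOIN dim_store s ON f.store_key = s.store_key
        GROUP BY s.region, d.month"""):
    print(row)

In a snowflake variant, dim_store would itself be split into normalized store, city and region tables, trading extra joins for less redundant storage.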

Page 5: Information System for Air Quality Management: End-to-End System Architecture November 2001

PORTALS

• A portal provides users with personalized, one-stop shopping for structured and unstructured data, as well as various types of applications, all of which may exist either inside or outside the corporation. However, data warehousing teams need to be especially careful when selecting portal software.

• Portal products that warrant attention will be based on open standards of data communication like XML and provide an extensible platform that can accommodate a range of technologies. Portal vendors that also market ETL tools designed to maintain the data warehouse will be especially well positioned to provide enterprise data integration. Ascential Software is such a vendor.

Page 6: Information System for Air Quality Management: End-to-End System Architecture November 2001

Database–Data Warehouse Differences

Data Acquisition Databases

• Acronym: OLTP – On-Line Transaction Processing
• Examples: an order entry system; looking up your checking account when you go to an ATM to request a withdrawal
• Designed for very rapid selects and inserts of simple transactions that need to be executed with speed
• DBMSs designed for OLTP (e.g. Oracle) do not do the best job at data querying; several databases are instead designed to query and manage data
• Stores the transactional data of an enterprise; a database is nothing more, or less, than a technology for managing data files
• OLTP transactional data focus on particular operations or departments
• Current data only, no historical data
• For OLTP systems that means small tables (row size), some specific indexes related to transactional processing, and a high degree of normalization
• A database is normalized and contains several constraints to minimize input errors
• Contains only fine-grain data, no coarse-grain aggregations
• The database is the underlying infrastructure of a data warehouse (DW)

Data Analysis Databases

• Designed for massive ad hoc queries of large data volumes, not to process transactions
• Stores the historical data of an enterprise; the data warehouse is a centralized storage facility
• Used for reporting purposes; helps management make critical decisions
• For analysis of patterns, derived after analyzing data aggregations
• The data warehouse does not contain all records/info, only summarized info
• Data gathered from a variety of sources and retained for extended periods
• Integrated data formatted for easy access for queries, reports and trend analysis
• May contain all relevant detailed transaction info for traceability and drill-down of summaries; there is a need for good, clean transactional data in the warehouse, and the summaries and aggregations are stored there as well
• Larger tables, more indexes to facilitate queries, and many tables denormalized to varying degrees
• Implemented using a database engine, RDBMS or OLAP tools
• The schema is not normalized as in an operational database; the data are arranged in dimensions such as Time, Geographic Region, Product, Customer Class, Promotion, etc.
• The user does not need to know SQL or another language to access the database
• A data warehouse does not normally accept user inputs and is read only; it contains fine-grain as well as coarse-grain aggregate data
• Summaries inside the relational warehouse could be a simple star schema; if you use MicroStrategy to provide information, you will need a snowflake schema, while a Hyperion solution requires this summarized area to be a star schema

The query sketch after this list contrasts the two workloads.
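As an illustration of the difference in workload, the sketch below (Python with sqlite3; the account table and values are hypothetical) contrasts an OLTP-style transaction with a warehouse-style aggregate query.

import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE account (acct_id INTEGER, tx_date TEXT, amount REAL)")
cur.executemany("INSERT INTO account VALUES (?, ?, ?)",
                [(42, "2001-11-01", -20.0), (42, "2001-11-02", 500.0),
                 (7, "2001-11-01", -75.0)])

# OLTP style: a single small transaction touching one row, e.g. an ATM withdrawal.
cur.execute("INSERT INTO account VALUES (?, ?, ?)", (42, "2001-11-03", -40.0))
conn.commit()

# Warehouse style: an ad hoc aggregate scanning history, e.g. monthly totals per account.
for row in cur.execute("""
        SELECT acct_id, substr(tx_date, 1, 7) AS month, SUM(amount)
        FROM account
        GROUP BY acct_id, substr(tx_date, 1, 7)"""):
    print(row)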

Page 7: Information System for Air Quality Management: End-to-End System Architecture November 2001

Data Warehousing Trends

• During the 1990s, distributed computing became the predominant architecture; most hardware and software companies focused their research and development efforts on developing new and enhanced products that were compatible with this new technology. Specific to data warehousing, we saw tremendous progress relative to both the functionality and scalability associated with products in extract/transform/load (ETL), data repositories/databases, OLAP, data mining and other associated decision-support technologies.

• In the past few years, we have seen the rise of the Internet. The Internet's impact on data warehousing will be tremendous in terms of enabling more individuals to access and gain value from existing warehouses beginning with intranets and, more recently, making the information available to trading partners via extranets. At the same time, the Internet provides valuable information about customers, suppliers and competitors that was not readily available from traditional sources.

Page 8: Information System for Air Quality Management: End-to-End System Architecture November 2001

A Retrospective look at Data Warehousing

• In the late 1980s and early 1990s, something had to be done to address the growing level of discomfort with legacy applications. Thus, the concept of the data warehouse was born. The data warehouse was a database that stored detailed, integrated, current and historical data.

• The data warehouse was built from the detailed transaction data that was generated by and aggregated in the legacy applications. One of the major tasks of the data warehouse developer was going back into the legacy systems environment to find, convert and reformat legacy data. The task of integrating legacy data that was never designed for integration was daunting and dirty. Yet, this task was one of the cornerstones of the success of data warehousing.

• In the early days of data warehousing, there were no tools for integrating and converting data. The work had to be done by hand; and it was thankless, toilsome work. Soon, a subindustry evolved called the integration/transformation (i/t) or the extract, transform and load (ETL) industry. Software products were created that allowed the legacy environments to be integrated in an automated manner. There were two types of ETL tools: code generators, which could handle any conversion scenario, and run-time generators, which were easy to use but allowed for only limited complexity in integration.

Page 9: Information System for Air Quality Management: End-to-End System Architecture November 2001

OpenGIS Web Services

• Mission: Definition and specification of geospatial web services.
• A Web service is an application that can be published, located, and dynamically invoked across the Web.
• Applications and other Web services can discover and invoke the service.
• The sponsors of the Web services initiative include:

– Federal Geographic Data Committee
– Natural Resources Canada
– Lockheed Martin
– National Aeronautics and Space Administration
– U.S. Army Corps of Engineers Engineer Research and Development Center
– U.S. Environmental Protection Agency EMPACT Program
– U.S. Geological Survey
– U.S. National Imagery and Mapping Agency

• Phase I - February 2002
– Common Architecture: OGC Services Model, OGC Registry Services, and Sensor Model Language.
– Web Mapping: Map Server (raster), Feature Server (vector), Coverage Server (image), Coverage Portrayal Services.
– Sensor Web: OpenGIS Sensor Collection Service for accessing data from a variety of land, water, air and other sensors.

Page 10: Information System for Air Quality Management: End-to-End System Architecture November 2001

Driving Forces of Data Flow

• Need a ‘force’ to move data from one-shot to reusable form

• External force – contracts

• Internal – humanitarian, benefits

Page 11: Information System for Air Quality Management: End-to-End System Architecture November 2001

Resistances (it takes extra effort to recycle information)

• Mechanical

• Personal

• Institutional

Page 12: Information System for Air Quality Management: End-to-End System Architecture November 2001

Monitoring

Page 13: Information System for Air Quality Management: End-to-End System Architecture November 2001

The Data Flow Process: From Raw Data to Refined Knowledge

• Primary data are gathered from providers of sensory data
• Data are integrated, filtered, aggregated and fused into secondary data, figures, tables
• Report describes pollutant pattern and possibly causality

Page 14: Information System for Air Quality Management: End-to-End System Architecture November 2001

The Researcher/Analyst’s Challenge

“The researcher cannot get access to the data; if he can, he cannot read them; if he can read them, he does not know how good they are; and if he finds them good he cannot merge them with other data.”

Information Technology and the Conduct of Research: The User's View, National Academy Press, 1989

Page 15: Information System for Air Quality Management: End-to-End System Architecture November 2001

Data Flow Resistances

The data flow process is hampered by a number of resistances:

• The user does not know what data are available
• The available data are poorly described (metadata)
• There is a lack of QA/QC information
• Incompatible data cannot be combined and fused

These resistances can be overcome through a distributed system that catalogs and standardizes the data, allowing easy access for data manipulation and analysis.

Page 16: Information System for Air Quality Management: End-to-End System Architecture November 2001

NARSTO-Supersite Data System: Data Flow

• Data gathering, QA/QC and standard formatting are done by the individual projects

• The data exchange standards, data ingest and archives are handled by ORNL and NASA

• The data catalog, relational transformers, SQL server and I/O are provided by this project

[Data flow diagram with components: EPA Supersite data, Coordinated Supersite relational tables and Direct Web Data Input; EOSDIS Data Archive; NARSTO ORNL DES and Data Ingest; DES-SQL Transformer; Manual-SQL Transformer for auxiliary batch data; Supersite SQL Server; data query and table output.]

Page 17: Information System for Air Quality Management: End-to-End System Architecture November 2001

STAR Schema for Relational Tables: IMPROVE

Page 18: Information System for Air Quality Management: End-to-End System Architecture November 2001

STAR Schema for Relational Tables: CCAQS

Page 19: Information System for Air Quality Management: End-to-End System Architecture November 2001

Supersite Relational Data System: Schedule

[Schedule chart spanning Year 1 (2002), Year 2 (2003) and Year 3 (2004), with tasks: RDBMS design and feedback; implementation and testing; SQL Supersite data entry; auxiliary data entry; other coordinated data entry; and Supersite, Coordinated and Auxiliary data updates.]

Page 20: Information System for Air Quality Management: End-to-End System Architecture November 2001

• Multi-Tiered Architecture

• TCP/IP

Page 21: Information System for Air Quality Management: End-to-End System Architecture November 2001
Page 22: Information System for Air Quality Management: End-to-End System Architecture November 2001

Distributed Data Analysis & Dissemination System:D-DADS

• Specifications:
– Uses standardized forms of data, metadata and access protocols
– Supports distributed data archives, each run by its own provider
– Provides tools for data exploration, analysis and presentation

• Features:
– Data are structured as relational tables and multidimensional data cubes
– Dimensional data cubes are distributed but shared
– Analysis is supported by built-in and user functions
– Supports other data types, such as images, GIS data layers, etc.

Page 23: Information System for Air Quality Management: End-to-End System Architecture November 2001

D-DADS Architecture

[Architecture diagram: Data Providers (a legacy database with a custom OLAP translator, an SQL database, and ARC/INFO or other databases (SQL, Oracle, etc.) with ArcSDE translators) expose data cubes and GIS tables through OLAP service providers in a standardized description and format; Data Access and Manipulation Tools (OLAP, ArcIMS) combine them into a virtual data cube and GIS maps for user interaction.]

Page 24: Information System for Air Quality Management: End-to-End System Architecture November 2001

The D-DADS Components

• Data Providers supply primary data to the system, through SQL or other data servers.

• Standardized Description & Format populates and describes the data cubes and other data types using standard metadata.

• Data Access and Manipulation tools provide a unified interface to data cubes, GIS data layers, etc. for accessing and processing (filtering, aggregating, fusing) data and integrating data into virtual data cubes.

• Users are the analysts who access D-DADS and produce knowledge from the data.

The multidimensional data access and manipulation component of D-DADS will be implemented using OLAP.

Page 25: Information System for Air Quality Management: End-to-End System Architecture November 2001

Interoperability

One requirement for an effective distributed environmental data system is interoperability, defined as

“the ability to freely exchange all kinds of spatial information about the Earth and about objects and phenomena on, above, and below the Earth’s surface; and to cooperatively, over networks, run software capable of manipulating such information.” (Buehler & McKee, 1996)

Such a system has two key elements:

• Exchange of meaningful information

• Cooperative and distributed data management

Page 26: Information System for Air Quality Management: End-to-End System Architecture November 2001

On-line Analytical Processing: OLAP

• A multidimensional data model making it easy to select, navigate, integrate and explore the data.

• An analytical query language providing power to filter, aggregate and merge data as well as explore complex data relationships.

• Ability to create calculated variables from expressions based on other variables in the database.

• Pre-calculation of frequently queried aggregate values, e.g. monthly averages, enables fast response times to ad hoc queries (see the sketch below).
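As an illustration of these capabilities, here is a minimal sketch in Python using pandas; the monitoring sites and pollutant values are made up for the example. It derives a calculated variable, pre-computes monthly aggregates along the site and time dimensions, and answers an ad hoc query from the pre-computed table.

import pandas as pd

# Hypothetical hourly observations from two monitoring sites.
obs = pd.DataFrame({
    "site":  ["STL", "STL", "FRS", "FRS"],
    "time":  pd.to_datetime(["2001-11-01 01:00", "2001-11-02 13:00",
                             "2001-11-01 07:00", "2001-12-05 10:00"]),
    "pm25":  [18.0, 22.0, 35.0, 28.0],   # ug/m3
    "ozone": [0.04, 0.05, 0.06, 0.03],   # ppm
})

# Calculated variable derived from other variables in the database.
obs["pm25_to_ozone"] = obs["pm25"] / obs["ozone"]

# Pre-calculate frequently queried aggregates (monthly averages by site).
monthly = (obs
           .assign(month=obs["time"].dt.to_period("M"))
           .groupby(["site", "month"])[["pm25", "ozone"]]
           .mean())

# An ad hoc query is then answered from the small pre-aggregated table
# instead of rescanning the raw observations.
print(monthly.loc["STL"])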

Page 27: Information System for Air Quality Management: End-to-End System Architecture November 2001

User Interaction with D-DADS

[Diagram: the user issues a Query and receives a Data View (table, map, etc.); both the query and the returned data are exchanged with the Distributed Database as XML.]

Page 28: Information System for Air Quality Management: End-to-End System Architecture November 2001

Metadata Standardization

Metadata standards for describing air quality data are currently being actively pursued by several organizations, including:

• The Supersite Data Management Workgroup

• NARSTO

• FGDC

Page 29: Information System for Air Quality Management: End-to-End System Architecture November 2001

Potential D-DADS Nodes

The following organizations are potential nodes in a distributed data analysis and dissemination system:

• CAPITA

• NPS-CIRA

• EPA Supersites: California, Texas, St. Louis

Page 30: Information System for Air Quality Management: End-to-End System Architecture November 2001

Summary

In the past, data analysis has been hampered by data flow resistances. However, the tools and framework to overcome each of these resistances now exist, including:

• World Wide Web• XML• OLAP• OpenGIS• Metadata standards

Incorporating these tools will initiate a distributed data analysis and dissemination system.

Page 31: Information System for Air Quality Management: End-to-End System Architecture November 2001

Overview

Environmental data are collected by multiple, disparate data providers, such as individual EMPACT projects

Each data provider presents its data in its own format, making it difficult to find, access, read, and integrate the data

Standardized formats and data dissemination systems are required for data accessibility and integration of distributed data sets

This proposal presents a distributed data analysis and delivery system that provides users with data access to multiple sources

Page 32: Information System for Air Quality Management: End-to-End System Architecture November 2001

Fast Analysis of Shared Multidimensional Information (FASMI)

(Pendse, N., “The OLAP Report”)

An OLAP system is characterized as:

being Fast – The system is designed to deliver relevant data to users quickly and efficiently; suitable for ‘real-time’ analysis.

facilitating Analysis – The capability to have users extract not only “raw” data but also data that they “calculate” on the fly.

being Shared – The data and their access are distributed.

being Multidimensional – The key feature. The system provides a multidimensional view of the data.

exchanging Information – The ability to disseminate large quantities of various forms of data and information.

Page 33: Information System for Air Quality Management: End-to-End System Architecture November 2001

Multi-Dimensional Data Cubes

• Multi-dimensional data models use inherent relationships in data to populate multidimensional matrices called data cubes.

• A cube's data can be queried using any combination of dimensions.

• Hierarchical data structures are created by aggregating the data along successively larger ranges of a given dimension, e.g. the time dimension can contain the aggregates year, season, month and day (see the sketch below).
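A minimal sketch of this idea in Python with pandas (the station names and values are illustrative only): daily observations are pivoted into a small station-by-time cube and then rolled up along the time hierarchy from day to month.

import pandas as pd

# Illustrative daily PM2.5 observations with two dimensions: station and day.
obs = pd.DataFrame({
    "station": ["A", "A", "A", "B", "B", "B"],
    "day": pd.to_datetime(["2001-11-01", "2001-11-02", "2001-12-01",
                           "2001-11-01", "2001-11-02", "2001-12-01"]),
    "pm25": [18.0, 22.0, 30.0, 35.0, 28.0, 40.0],
})

# Populate a small data cube (station x day) from the relational records.
cube = obs.pivot_table(values="pm25", index="station", columns="day")
print(cube)

# Roll up along the time hierarchy: aggregate days into months.
monthly = (obs.assign(month=obs["day"].dt.to_period("M"))
              .pivot_table(values="pm25", index="station",
                           columns="month", aggfunc="mean"))
print(monthly)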

Page 34: Information System for Air Quality Management: End-to-End System Architecture November 2001

Example Application: Visibility D-DADS

Visibility observations (extinction coefficient) are an indicator of air quality and serve as an important data set in the public’s understanding of air quality.

A visibility D-DADS will consist of multiple forms of visibility data, such as visual range observations and digital images from web cameras.

Potential visibility data providers include:

- EMPACT projects and their hourly visual range data

- The IMPROVE database

- CAPITA, a warehouse for global surface observation data available every six hours

Page 35: Information System for Air Quality Management: End-to-End System Architecture November 2001

Possible Node in Geography Network

National Geographic and ESRI are establishing a geography network consisting of distributed spatial databases.

Some EMPACT projects are participating as nodes in the initial start-up phase

The visibility distributed data and analysis system could link to and become another node in the geography network, making use of the geography network’s spatial viewers.

Other views, such as a time view could be linked with the spatial viewer to take advantage of the multidimensional visibility data cubes.

Page 36: Information System for Air Quality Management: End-to-End System Architecture November 2001

Example Viewer

[Example viewer with four linked views: Map View, Variable View, Time View and WebCam View.]

The views are linked so that making a change in one view, such as selecting a different location in the map view, updates the other views.

Page 37: Information System for Air Quality Management: End-to-End System Architecture November 2001

[Diagram of data types and their services: GIS Data (title, georeference, satellite imagery, vector data, metadata) through OpenGIS Services; Multidimensional Data (OLAP cubes, SQL and dBase tables via SQL-OLAP and DB-OLAP) through Database Services; Textual Data through Weblink Services.]

General Approach: Ride the Internet wave to the max: XML, Web Services, OpenGIS.

Page 38: Information System for Air Quality Management: End-to-End System Architecture November 2001

Summary

In the past, data analysis has been hampered by data flow resistances. Fortunately, the tools and framework to overcome these resistances now exist, including:

• World Wide Web• XML• OLAP• ArcIMS• Metadata standards

It appears timely to consider a distributed environmental data analysis and dissemination system.

Page 39: Information System for Air Quality Management: End-to-End System Architecture November 2001

Distributed Data Browser Architecture

[Architecture diagram: distributed data of multiple types (satellite and vector GIS data, multidimensional data as OLAP cubes and SQL tables, and text data as web pages) are reached through XML Web Services, OpenGIS Services and HTTP Services; a Session Manager (Broker) with Data View, Connection, Data Access and Cursor-Query Managers renders them in linked Data Views – layered map, time chart, scatter chart, and text/table – coordinated by a cursor.]

Data are rendered by linked Data Views (map, time, text)

Distributed data of multiple types (spatial, temporal, text)

The Broker handles the views, connections, data access and cursor (a rough sketch of this linkage follows below)
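As a rough sketch of the broker-and-linked-views idea (not the project's actual implementation; all class and method names here are invented for illustration), a minimal observer-style coordinator in Python might look like this:

# A toy session broker: views register with the broker, and a cursor change
# (e.g. the user picks a new location on the map) is propagated to all views.

class DataView:
    def __init__(self, name):
        self.name = name

    def update(self, cursor):
        # A real view would re-query its data source and redraw itself.
        print(f"{self.name} view redrawn for {cursor}")


class SessionBroker:
    def __init__(self):
        self.views = []
        self.cursor = {}

    def register(self, view):
        self.views.append(view)

    def set_cursor(self, **selection):
        # The cursor holds the current selection shared by all views.
        self.cursor.update(selection)
        for view in self.views:
            view.update(self.cursor)


broker = SessionBroker()
for name in ("map", "time chart", "scatter chart", "table"):
    broker.register(DataView(name))

# Selecting a different location in the map view updates the other views.
broker.set_cursor(location="St. Louis", time="2001-11")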

Page 40: Information System for Air Quality Management: End-to-End System Architecture November 2001

NARSTO-Supersite Data System: Data Flow

• Data gathering, QA/QC and standard formatting are done by the individual projects

• The data exchange standards, data ingest and archives are handled by ORNL and NASA

• The data catalog, relational transformers, SQL server and I/O are provided by this project

[Data flow diagram with components: EPA Supersite data, Coordinated Supersite and Auxiliary batch data; EOSDIS Data Archive; NARSTO ORNL DES and Data Ingest; DES-SQL and Raw-SQL Transformers; Auxiliary SQL data; Supersite SQL Server; Data Catalog and SQL I/O; SQL Web Service; data query and table output.]

Page 41: Information System for Air Quality Management: End-to-End System Architecture November 2001

‘Global’ and ‘Local’ AQ Analysis

• AQ data analysis needs to be performed at both global and local levels

• The ‘global’ refers to regional, national, and global analysis. It establishes the larger-scale context.

• ‘Local’ analysis focuses on the specific and detailed local features

• Both global and local analyses are needed for full understanding.

• Global-local interaction (information flow) needs to be established for effective management.

National and Local AQ Analysis

Page 42: Information System for Air Quality Management: End-to-End System Architecture November 2001

Data Re-Use and Synergy

• Data producers maintain their own workspace and resources (data, reports, comments).

• Part of the resources is shared by creating common virtual resources.

• Web-based integration of the resources can be across several dimensions:
– Spatial scale: local – global data sharing
– Data content: combination of data generated internally and externally

• The main benefits of sharing are data re-use, data complementing and synergy.

• The goal of the system is to have the benefits of sharing outweigh the costs.

[Diagram: local and global users each maintain their own content; the shared parts of these resources form a pool of virtual shared resources (data, knowledge, tools, methods).]

Page 43: Information System for Air Quality Management: End-to-End System Architecture November 2001

Integration for Global-Local Activities

Global Activity – Local Benefit
• Global data, tools – Improved local productivity
• Global data analysis – Spatial context; initial analysis
• Analysis guidance – Standardized analysis, reporting

Local Activity – Global Benefit
• Local data, tools – Improved global productivity
• Local data analysis – Elucidate, expand initial analysis
• Identify relevant issues – Responsive, relevant global analysis

Global and local activities are both needed – e.g. ‘think global, act local’

‘Global’ and ‘Local’ here refer to relative, not absolute, spatial scale

Page 44: Information System for Air Quality Management: End-to-End System Architecture November 2001

Content Integration for Multiple Uses (Reports)

• Data from multiple measurements are shared by their providers or custodians
• Data are integrated, filtered, aggregated and fused in the process of analysis
• Reports use the analysis for Status and Trends; Exposure Assessment; Compliance …

The creation of the needed reports requires data sharing and integration from multiple sources.

Page 45: Information System for Air Quality Management: End-to-End System Architecture November 2001

Potential Applications of Global-Local Interaction

• OAQPS-State Analyst

• Supersite Program

National and Local AQ Analysis

Page 46: Information System for Air Quality Management: End-to-End System Architecture November 2001

Web Services as Program Components

• A Web Service is a URL addressable resource that returns requested data. The ‘service provider’ resides on the Internet.

• Web Services allow computer-to-computer communication, regardless of language or platform.

• Web Services are reusable components, like ‘LEGO blocks’, that can be integrated to create larger, richer applications.

• Example web services are: current weather server, currency converter, map server.

• Web Services use standard web protocols: SOAP, XML, HTTP.

• (Need a picture here: service provider-clients architecture)

• The Web Services vision is to transform the Web from a medium for viewing and downloading into a genuine computing platform, a ‘programmable medium’ in which data can be manipulated and transported across the Internet to do useful things (a minimal sketch of such a service follows below).
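As a minimal, illustrative sketch of a web service in this sense, the Python code below stands up a URL-addressable resource that returns requested data as XML over HTTP. The service name, port, query parameters and XML fields are hypothetical placeholders invented for the example.

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

# Hypothetical current air quality readings served by this provider.
CURRENT = {"STL": ("PM2.5", "21.5"), "FRS": ("PM2.5", "33.0")}

class AirQualityService(BaseHTTPRequestHandler):
    def do_GET(self):
        # URL-addressable resource, e.g. GET /current?site=STL
        query = parse_qs(urlparse(self.path).query)
        site = query.get("site", ["STL"])[0]
        param, value = CURRENT.get(site, ("PM2.5", "unknown"))
        body = (f"<observation site='{site}'>"
                f"<parameter>{param}</parameter>"
                f"<value>{value}</value></observation>").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Any HTTP client (another program, not just a browser) can now
    # invoke the service and parse the returned XML.
    HTTPServer(("localhost", 8080), AirQualityService).serve_forever()

A client, human or machine, would issue an HTTP GET to http://localhost:8080/current?site=STL and parse the XML reply, which is what makes such components composable into larger applications.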

Page 47: Information System for Air Quality Management: End-to-End System Architecture November 2001

Web Services

• Current distributed application methodologies, DCOM, CORBA, RMI, are for homogeneous environments, not for integration across the heterogeneous Internet.

• Web Services provide a simple, flexible, standards-based model for integrating applications from reusable, interoperable components distributed over the Internet.

• This allows agile application development by making it simple to integrate resources within and outside the organization.

Page 48: Information System for Air Quality Management: End-to-End System Architecture November 2001

Web Services: Connect, Communicate, Describe, Discover

Enabling Protocols of the Web Services architecture:

• Connect. Extensible Markup Language (XML) is the universal data format that makes connection and data sharing possible.

• Communicate. Simple Object Access Protocol (SOAP) is the new W3C protocol for data communication, e.g. making requests.

• Describe. Web Service Description Language (WSDL) describes the functions, parameters and the returned results of a service.

• Discover. Universal Description, Discovery and Integration (UDDI) is a broad industry effort for locating and understanding web services.

Page 49: Information System for Air Quality Management: End-to-End System Architecture November 2001

Web Services Enabled by Standards

Web Services operate ‘on top’ of many layers of Internet standards, TCP/IP, HTTP

Web Services also use an array of their own standards, some still in development.

[Web Services standards stack: Discovery – UDDI, Disco; Description – WSDL, XSchema (XSD, XSI); Invocation – SOAP; all layered on XML.]

On top of these Internet and Web Service Standards, we will need to develop our own:

Naming conventions

Metadata standards

Uniform database schemata, etc

Page 50: Information System for Air Quality Management: End-to-End System Architecture November 2001

Protocols for Web Services

• The industry-standard protocols for Web Services are defined by the W3C Consortium:
– Data are described by the Simple Object Access Protocol (SOAP)
– Data are expressed using the Extensible Markup Language (XML)
– Data are transmitted using the Hypertext Transfer Protocol (HTTP)

• SOAP is an XML-based protocol for distributed data exchange consisting of :

– Envelope describing what is in a message and how to process it

– A set of encoding rules for expressing data types

– A convention for representing remote procedure calls and responses

– A binding convention for exchanging messages

• XML is a language for describing hierarchical data consisting of:

– A

– B

– C

• HTTP is a protocol for encoding and transmitting data through the Internet (a minimal sketch combining SOAP, XML and HTTP follows below)
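To show how these three protocols fit together, here is a minimal, illustrative sketch in Python: it wraps a request in a SOAP envelope (XML) and posts it over HTTP. The endpoint URL, namespace and GetObservation operation are hypothetical placeholders, not a real service.

import urllib.request

# Hypothetical SOAP request asking a data provider for an observation.
ENDPOINT = "http://example.org/aq-service"   # placeholder endpoint
SOAP_BODY = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetObservation xmlns="http://example.org/aq">
      <site>STL</site>
      <parameter>PM2.5</parameter>
    </GetObservation>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=SOAP_BODY.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "http://example.org/aq/GetObservation"},
)

# The reply is itself a SOAP envelope; a real client would parse the XML body.
with urllib.request.urlopen(request) as response:
    print(response.read().decode("utf-8"))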

Page 51: Information System for Air Quality Management: End-to-End System Architecture November 2001

Central California AQ Study - Schema


Page 52: Information System for Air Quality Management: End-to-End System Architecture November 2001

IMPROVE Relational Database Schema (Tentative)