1. outline motivations research issues architecture: federated service-oriented geographic...


Upload: jack-gallagher

Post on 25-Dec-2015


Page 1: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Outline

• Motivations
• Research Issues
• Architecture: Federated Service-Oriented Geographic Information System
• Performance enhancing designs - measurements and analysis
• Conclusions

Geographic Information Systems (GIS)

• GIS is a system for creating, storing, sharing, analyzing, manipulating and displaying geo-data and associated attributes.
• Inherently requires federation (see the figure)
  – Autonomy for scalability, flexibility and extensibility
• Distributed data access for geo-data resources (databases, digital libraries, etc.)
• Utilizing remote analysis, simulation or visualization tools.
• Open Standards
  – OGC and ISO/TC-211

Motivations

• Requirements for
  o Interoperable service-oriented Geographic Information Systems
    – Necessity for sharing and integrating heterogeneous data and computation resources to produce knowledge.
  o Uniform data access/query, display and analysis from a single access point
  o Responsive and interactive information systems
    – GIS applications require quick response
      • Emergency early warning systems
      • Homeland security and natural disasters.

Research Issues

• Interoperability
  – Defining a component-based service-oriented GIS data Grid framework
  – Adoption of Open Geographic Standards (data model and services)
  – Applying Web Service principles to GIS data services
  – Integrating Web Service and Open Geographic Standards
• Federation
  – Capability-based federation of GIS Web Service components
  – Unified data access/query, display from a single access point through integrated data-views
• Addressing high-performance support for responsiveness
  – Streaming GIS Web Services and pre-fetching framework
  – Client-based caching
  – Parallel processing through attribute-based query decomposition

Service-oriented GIS: Web Service components and data-flow

• Built over:
  – Web Services standards (WS-I+) and
  – Open Geographic Standards (OGC and ISO/TC-211)
• Consists of two types of online services
  – Web Map Services (WMS) and Web Feature Services (WFS)
• And two types of data:
  – Binary data: map images (provided by WMS),
  – Structured data: GML, with content (core data) and presentation (attribute and geometry elements) (provided by WFS)

[Figure: relation of the components and data flow. Labels: WMS (GML rendering), WFS (mediator), WSDL, GML, binary data. WMS operations: getCapability, getMap, getFeatureInfo. WFS operations: getCapability, getFeature, DescribeFeatureType.]

• WMS are data rendering services providing human-comprehensible data (binary map images)
• WFS are data services providing data in the common data model GML (Geography Markup Language)
  – behaving as mediator and annotation services.
• WMS and WFS have their own type of capability metadata defined by Open Geographic specs.
• Inter-service communication is done through the “getCapability” service interface.
• UDDI-based registry services.
• Components are Web Services and all control goes through SOAP messages
• XML-based query language (standard schema)

Capability-based Federation of Standard GIS Web Service Components

• Built over the proposed standard Web Service components and common data models
  – WMS, WFS, and GML
• Federation is done by aggregating GIS Web Services’ capabilities metadata
  – Inspired by OGC’s cascading WMS
• Unified data access/query/display from a single access point
• Providing application-based hierarchical data definitions
  – layer-based data and service (WMS and WFS) compositions
• Capability is basically metadata about data+service:
  – Server’s information content and acceptable request parameter values

[Figure: an aggregating WMS (federator) sits between the Web Map Client (interactive map tools) and three services, each exposing a Capability.xml: WFS+ (seismic records, WSDL), WFS+ (state bounds, WSDL) and WMS+ (OnEarth, Google Maps, “REST”). Other labels: stubs, WSDL, SOAP, HTTP.]
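The aggregation idea on this slide can be sketched in a few lines. This is a minimal illustration only: real capability metadata is an XML document following the OGC schema, whereas here each capability is reduced to a hypothetical dictionary, and the service and layer names are made up for the example.

```python
# Hypothetical, simplified capability records: a real service returns an XML
# capability file; here each one is reduced to a service name, type and layers.
def aggregate_capabilities(capabilities):
    """Merge per-service capability metadata into one layer -> provider map."""
    aggregated = {}
    for cap in capabilities:
        for layer in cap["layers"]:
            # First provider wins; a real federator would need a conflict policy.
            aggregated.setdefault(
                layer, {"service": cap["service"], "type": cap["type"]}
            )
    return aggregated


caps = [
    {"service": "wfs-seismic", "type": "WFS", "layers": ["seismic-records"]},
    {"service": "wfs-states", "type": "WFS", "layers": ["state-bounds"]},
    {"service": "wms-onearth", "type": "WMS", "layers": ["satellite"]},
]
federated = aggregate_capabilities(caps)
print(sorted(federated))
```

A client that only sees `federated` can ask for any layer without knowing which service provides it, which is the single-access-point property the slide describes.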

Unified Data Query and Display over Integrated Data-views

• Step-1: (Before run time – blue lines in the figure) The federator searches for standard WS components (WMS or WFS) providing the required data layers and organizes them in one aggregated capability file.
  – According to the standard WMS capability schema definition
  – Capability metadata are collected through the standard getCapability service interface
  – Interactive Map Tools get the aggregated capability metadata from the federator through service interface (1)

[Figure: event-based Interactive Map Tools (browser), the federator with its aggregated capability, and the WMS and WFS services. Layers: a. NASA satellite layer (JPL at California); b. earthquake-seismic data (CGL at Indiana). Integrated data-view: b over a.
Service interfaces: 1. GetCapability (metadata data+service); 2. GetMap (get map data in set of layer(s)); 3. GetFeatureInfo (query the attributes of data).
Events: move, zooming in/out, panning (drag-drop), rectangular region, distance calc., attribute querying.]

• Step-2: (Run time – green lines) Users access/query and display data sources from a single access point (federator) over integrated data-views (multi-layered map images).
• Some layers are provided as binary map images (layers from WMS), and some layers are rendered from GML which is provided by WFS.
• Users interact with the system through generic Interactive Map Tools.
• Enables users to query the map images based on their attributes and features
• On-demand data access: there is no copying of the data at any intermediary places. Data are kept at their originating sources. Consistency and autonomy.

Why Capability metadata

• Web Services provide key low-level capability but do not define an information or data architecture
• These are left to domain-specific capabilities metadata and data description language (GML).
• Machine- and human-readable information
  – Enables easy integration and federation
• Enables developing application-based standard interactive re-usable tools
  – for data query, display and analysis
  – Seamless data access/query

Designs, measurements and analysis

Performance Investigation

• Interoperability requirements bring up some compliance costs:
  – Common data model (GML)
  – Web Services (SOAP protocol for communication)
• Approaches: Enhancing the GIS systems’ responsiveness
  – Data transfer and rendering
    • Streaming GIS Web Services (1)
    • Structured/annotated GML data rendering (2)
  – Federator-oriented approaches
    • Pre-fetching (3)
    • Client-based caching (4)
    • Query decomposition and parallel processing (5)
• Testing with large-scale geo-science applications
  – Earthquake forecasting (PI),
  – Virtual California (VC)
• Aim: Turning compliance requirements into competitiveness

(1) Streaming Data Flow Extension to GIS Web Services

• Concern is large-sized XML-structured data transfer
• Approach is that responses are chunked into parts and streamed to the client as the answer comes.
• Enables the client to render map images with partially returned data; no need to wait for the whole data to be returned.
• Provides better performance results
• Uses topic-based publish-subscribe messaging systems for exchanging SOAP messages and data payloads.
• SOAP is used for negotiation (line-3) with the standard “getFeature” request
  – Publisher information in a triple (topic, IP, port) is returned.
• Data transfer is done between publisher and subscriber

[Figure: Web Services’ publish-find-bind triangle (lines 1 to 3) among client (WMS, subscriber), server (WFS, publisher, with DB) and registry (UDDI), described by WSDL. The getFeature request returns (topic, IP, port); GML then flows from publisher to subscriber through a Narada Brokering server (line 4). A chart of measured average response time accompanies the diagram.]
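The chunk-and-stream idea can be illustrated with a generator. This is a sketch under simplifying assumptions: it leaves out the publish-subscribe broker and the SOAP negotiation entirely, and the function names and the feature list are hypothetical.

```python
def stream_features(features, chunk_size):
    """Yield the response in chunks instead of one large payload."""
    for start in range(0, len(features), chunk_size):
        yield features[start:start + chunk_size]


def render_incrementally(chunks):
    """Stand-in for map rendering: count the features drawn after each chunk."""
    drawn = 0
    progress = []
    for chunk in chunks:
        drawn += len(chunk)  # a real client would plot the chunk here
        progress.append(drawn)
    return progress


features = [(x, x + 1) for x in range(10)]
progress = render_incrementally(stream_features(features, chunk_size=4))
print(progress)
```

The consumer starts drawing after the first chunk arrives rather than after the whole response, which is the benefit the slide claims for partially returned data.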

(2) GML Data Processing

• Processing XML data: parsing and rendering to create map images.
• Two well-known approaches are document models (DOM) and push models (SAX).
• We use the pull approach for XML processing:
  – Parses only what is asked for
  – No support for document validation (major performance gain)
    • Structural correctness of the XML document
  – Doesn’t build a complete object model in memory (unlike DOM)
  – Contents are returned directly to the application from calls to the parser (unlike SAX)

Processing pipeline: [GML] → [Parsing / Validation] → [Geo-data extraction] → [Plotting]

Data size   DOM (dom4j)            DOM (dom4j)       Pull (Xpp)        Pull (Xpp)
(KB)        parsing + validation   total rendering   data extraction   total rendering
1           469.22                 469.22            15.59             15.59
10          494.06                 497.06            72.81             75.81
100         625.54                 640.87            183.06            198.39
1,000       760.20                 843.31            270.47            353.58
5,000       1,422.91               1,576.58          671.74            825.41
10,000      3,557.44               4,385.94          1,025.67          1,854.17
100,000     Out of Memory          Out of Memory     7,059.72          10,797.97

(1 GB allocated VM)
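To make the pull approach concrete, here is a small sketch using `xml.dom.pulldom` from the Python standard library rather than the Xpp parser named on the slide. The sample document is a hypothetical, heavily simplified GML fragment; only the coordinate elements are expanded, and nothing is validated.

```python
from xml.dom import pulldom

# Hypothetical, minimal GML-like document used only for this illustration.
SAMPLE_GML = """<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <gml:Point><gml:coordinates>-118.2,34.1</gml:coordinates></gml:Point>
  </gml:featureMember>
  <gml:featureMember>
    <gml:Point><gml:coordinates>-120.5,36.0</gml:coordinates></gml:Point>
  </gml:featureMember>
</gml:FeatureCollection>"""


def extract_points(gml_text):
    """Pull events from the parser and expand only the coordinate elements."""
    points = []
    events = pulldom.parseString(gml_text)
    for event, node in events:
        is_start = event == pulldom.START_ELEMENT
        if is_start and node.tagName.split(":")[-1] == "coordinates":
            events.expandNode(node)  # build a subtree for this element only
            text = "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            x, y = (float(value) for value in text.strip().split(","))
            points.append((x, y))
    return points


points = extract_points(SAMPLE_GML)
print(points)
```

The application drives the parser and asks only for what it needs, so no full document tree is held in memory, which matches the contrast with DOM drawn above.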

Analyzing Conventional OGC-GIS Systems and Baseline Performance Test

• Common/straightforward approaches are characterized as
  – Stateless services
  – On-demand data access/rendering
  – Single-threaded and no caching
• Systems developed with Open Geographic Standards have:
  – High degree of interoperability but poor performance results

Test setup (sample data: earthquake seismic data served in GML):
(a) Average response time
(b) Map rendering time
(c) GML data capture time
(d) Map image transfer time; the average value is 48.53 msec

a = b + c + d

The performance bottleneck is (c): query and data conversions and large-size XML data transfer.

(3) Pre-fetching

• Performance bottleneck:
  – On-demand access to originating databases through WFS,
  – Transmitting XML-encoded GML representation of data
• Solution:
  – Periodically fetching the whole data before it is needed (so-called pre-fetching).
  – Databases are mapped to GML files and stored locally in the federator
  – Successive on-demand queries are served by using pre-fetched data (red curve)
• Pros/Cons:
  – Removes the repeated resource-consuming query/data conversions at WFS and associated databases.
  – Gives the best performance outcomes for infrequently changing archived data,
  – But might cause a consistency problem depending on the fetching periodicity and data update periodicity

[Figure: the federator (or WMS) issues FETCH (GetData in GML) to the WFS, which serves any data; the GML goes through NB and temporary storage into the federator’s local file system. The fetch runs periodically, maps databases into GML files and stores them locally; the stored GML represents all the data at the database and their associated attributes. Event-based Interactive Map Tools issue the users’ on-demand access/query.]
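The trade-off between fetching periodicity and consistency can be sketched as follows. This is an assumption-laden illustration: the class name, the `fetch` callable and the explicit `now_s` clock are hypothetical, and the store refreshes lazily on access instead of running a background timer, which keeps the example deterministic.

```python
class PrefetchStore:
    """Keeps a local copy of the data and refreshes it on a fixed period."""

    def __init__(self, fetch, period_s):
        self._fetch = fetch  # hypothetical callable standing in for the WFS fetch
        self._period_s = period_s
        self._data = None
        self._fetched_at = None

    def get(self, now_s):
        """Serve the local copy; refetch only when it is older than the period."""
        stale = (
            self._fetched_at is None
            or now_s - self._fetched_at >= self._period_s
        )
        if stale:
            self._data = self._fetch()
            self._fetched_at = now_s
        return self._data


calls = []


def fake_fetch():
    calls.append(1)
    return f"gml-v{len(calls)}"


store = PrefetchStore(fake_fetch, period_s=60)
answers = [store.get(0), store.get(30), store.get(61)]
print(answers)
```

Between refreshes every query is answered from the local copy, so any update at the source within the period is invisible to clients; that window is the consistency risk listed under the cons.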

Performance Comparison: Pre-fetching vs. Straightforward On-demand Fetching

• Performance tests are done with earthquake seismic data records
• Pre-fetching (independent of run-time):
  – Earthquake data in the database is routinely mapped to GML and kept at the federator
  – Pre-fetched GML size is 127 MB.
• The response times are very close to each other in the case of pre-fetching
  – No matter how large the requested data size is, every time a request comes the map is rendered from the same size of pre-fetched GML data stored at the federator
  – The dominating performance bottleneck is removed: no need to go through the WFS to get the data from the database.
• Threshold value: 500 KB of data
• For 100 MB, pre-fetching is about 50 times faster.
• The larger the data size, the higher the performance gains.

[Figure: event-based Interactive Map Tools query the federator (or WMS), which renders the map from the pre-fetched GML kept in the federator’s disk space. Chart: average response time; time for rendering the map from pre-fetched GML data.]

Map rendering over pre-fetched data at run-time: 48.53 msec (included in the table values)
Details for the on-demand performance analysis are given in Slide-14

Requested GML (MB)   Avg. resp. time, on-demand   Avg. resp. time, pre-fetched data
0.01                 1,808.13                     7,576.46
0.1                  2,635.46                     7,426.86
0.5                  5,001.29                     7,537.04
1                    8,225.73                     7,742.04
5                    33,419.31                    8,460.56
10                   64,506.78                    8,480.46
50                   316,906.00                   11,197.08
100                  643,344.00                   12,304.99
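The "about 50 times faster" figure and the crossover point can be checked directly from the response-time table on this slide. The short script below only copies those numbers; the time unit is whatever the table uses.

```python
# (requested GML in MB, on-demand response time, pre-fetched response time),
# copied from the response-time table on this slide.
ROWS = [
    (0.01, 1808.13, 7576.46),
    (0.1, 2635.46, 7426.86),
    (0.5, 5001.29, 7537.04),
    (1, 8225.73, 7742.04),
    (5, 33419.31, 8460.56),
    (10, 64506.78, 8480.46),
    (50, 316906.00, 11197.08),
    (100, 643344.00, 12304.99),
]

# Ratio of on-demand time to pre-fetched time for each requested size.
speedup = {size: on_demand / prefetched for size, on_demand, prefetched in ROWS}

# Smallest tabulated size at which pre-fetching beats on-demand fetching.
crossover = next(
    size for size, on_demand, prefetched in ROWS if prefetched < on_demand
)

print(f"speedup at 100 MB: {speedup[100]:.1f}x")
print(f"pre-fetching first wins at {crossover} MB")
```

Among the tabulated sizes, on-demand fetching is still faster at 0.5 MB and slower at 1 MB, so the break-even point lies between those two rows.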

Enhancements over On-demand Fetching and Rendering

• Pre-fetching is a very fast and straightforward approach, BUT
  – Might cause inconsistency
    • Intermediary storage of copies of the data at the federator
• On-demand (just-in-time) fetching enables
  – Keeping the data at their originating sources all the time
  – Scalability, autonomy and easy data maintenance.
• It has a performance bottleneck in accessing/querying the federated heterogeneous data sources through WFS-based mediation.
  – Time-consuming request/response conversion
    • (Request): GetFeature request to SQL
    • (Response): Relational tables to GML
  – Transferring XML-encoded GML data
• Enhancement approaches:
  – Client-based caching
  – Parallel processing through attribute-based query decomposition.

(4) Client-based Caching

• Makes stateless GIS Web Services stateful
• Removes repeated time- and resource-consuming processes
• Helps share the workload as equally as possible for the most efficient parallel processing
• Each client is interested in different regions of the data sets (formulated and queried as bbox) and has a separate caching area allocated.
• Application of working-window and locality principles to map image rendering
• Clients are differentiated by the client-assigned session-id parameter in the header of queries.
• Federator always keeps the most recently used data sets for each client separately.
• Brings some overhead for maintaining the working-window for each client.

Brief Architecture

• FormerRequest class

class FormerRequest {
    String uuid;            /* unique user id */
    String bbox;            /* bounding box of the user's last request */
    Double density;         /* data size falling into per unit square */
    Vector[] feature_data;  /* geometry elements of the last request */
}

Server-side: create an identity card for each client, register it in the client table, and update it at every request from the client.

uuid-1   FormerRequest-1
uuid-2   FormerRequest-2
…..      ……

Client-side: set the identity in the message header.

ClientWSStub binding;
binding = (ClientWSStub) new ServiceLocator().WMSServices(servaddress);
String sessionID = session.getid(); // uuid-1
String channel_name = "getMapChannel";
/* Add SessionID to the SOAP message's header */
binding.setHeader(service_address, channel_name, sessionID);
Map mymap = binding.getMap(request);
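The server-side client table can be sketched as a dictionary keyed by session id. This is an illustrative assumption, not the deck's implementation: the field names mirror the FormerRequest class above, but density is approximated here as feature count per unit square, whereas the slide defines it in terms of data size.

```python
from dataclasses import dataclass, field


@dataclass
class FormerRequest:
    uuid: str
    bbox: tuple  # (min_x, min_y, max_x, max_y) of the last request
    density: float
    feature_data: list = field(default_factory=list)


client_table = {}


def handle_request(session_id, bbox, features):
    """Create the client's entry on first contact, replace it on later requests."""
    width = bbox[2] - bbox[0]
    height = bbox[3] - bbox[1]
    # Assumption: feature count per unit square stands in for data size.
    density = len(features) / (width * height)
    client_table[session_id] = FormerRequest(session_id, bbox, density, features)
    return client_table[session_id]


entry = handle_request("uuid-1", (-110, 35, -100, 40), [(-105, 37)] * 100)
print(entry.density)
```

Because each session id maps to its own entry, the stateless service gains per-client state without the clients sharing a cache.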

Comparing with Google’s Caching and Map Rendering Approach

• Google-like map servers are fast because
  – They replace computation with storage.
  – All images are pre-made and cut up into tiles
  – They formalize the accepted requests in terms of parameters, and responses in terms of the tile compositions.
• Google’s approach is good only for client-server based applications
  – Their approach is static and central.
  – In large-scale applications it is impossible to cache the whole data
  – There is always a limit on storage and computation capabilities
  – It can’t be applied to distributed dynamic data rendering and extensible applications.
• We do fine-grained dynamic information presentation, enabling attribute-based data querying and interaction from a single access point over integrated data-views in multi-layered map images.
• Client-based caching enables
  – Dynamic and flexible map rendering based on layer-specific attribute-based querying/rendering of data (such as magnitude values of earthquake seismic data)
  – Autonomy of data sources and easy data maintenance

(5) Parallel Processing through Attribute-based Query Decomposition

The attribute is BBOX, defining the ranges for the requested data in the main query getMap.

Step-1: Cached data extraction and partitioning of the main query bbox
[Figure: 4 possible positions (1 to 4) of the main query relative to the cached data, with the regions R1 to R4.]

Step-2: Bbox is an attribute defining range queries
Main query BBOX = bbox1 + bbox2 + … + bboxn, with one request per partition: GetFeature-1 . . . . GetFeature-n

Step-3: Main thread distributes the GetFeature requests to the worker threads

Step-4: Worker threads capture the GML data in parallel
[Figure: worker threads call WFS instances, each backed by a DB.]

Questions:
1. How to find the most efficient partition number
2. How to assign the partitions to worker nodes
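Steps 3 and 4 can be sketched with a thread pool. This is a minimal illustration under stated assumptions: `fetch_partition` is a hypothetical stand-in for a GetFeature call to a WFS and simply returns the centre point of its bbox.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_partition(bbox):
    """Hypothetical stand-in for a GetFeature call; returns one fake feature."""
    min_x, min_y, max_x, max_y = bbox
    return [((min_x + max_x) / 2, (min_y + max_y) / 2)]


def fetch_in_parallel(partitions, workers):
    """Distribute the sub-queries to worker threads and merge their results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps results in partition order, which simplifies merging.
        results = pool.map(fetch_partition, partitions)
        return [feature for part in results for feature in part]


partitions = [(-110, 35, -100, 36), (-110, 36, -100, 37)]
merged = fetch_in_parallel(partitions, workers=2)
print(merged)
```

The sketch does not answer the two questions on the slide; the partition list and the worker count are simply given as inputs.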

Partition list as bbox values for a sample case:
- Pn = 5
- Main query getMap bbox: -110,35,-100,40

-110,35,-100,36   GetFeature-1
-110,36,-100,37   GetFeature-2
-110,37,-100,38   GetFeature-3
-110,38,-100,39   GetFeature-4
-110,39,-100,40   GetFeature-5

[Figure: sample GetFeature request to get feature data (GML) from WFS.]
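The sample partition list can be reproduced with a few lines of code. The function name is hypothetical, and cutting along the second axis is inferred from the sample values on this slide rather than stated in the deck.

```python
def partition_bbox(bbox, pn):
    """Cut a bbox (min_x, min_y, max_x, max_y) into pn equal strips along y."""
    min_x, min_y, max_x, max_y = bbox
    step = (max_y - min_y) / pn
    return [
        (min_x, min_y + i * step, max_x, min_y + (i + 1) * step)
        for i in range(pn)
    ]


parts = partition_bbox((-110, 35, -100, 40), 5)
for number, part in enumerate(parts, start=1):
    print(f"GetFeature-{number}: {part}")
```

Each strip shares an edge with its neighbour, so features lying exactly on a boundary may be returned twice unless the range queries treat one edge as exclusive.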

Challenge: Geo-Data Characteristic

• Geo-data is characterized as unevenly distributed and variable-sized according to its location attributes.
  – Ex. human population
• A point datum is described with a location attribute
  – (x, y) coordinates.
• Linestrings, polylines, polygons, etc. are defined as sets of points.
• Data sets falling into a queried region are formulated as a bounding box (bbox)
  – Coordinates of a rectangle (a, b, c, d)

[Figure: a bbox with corners (a,b) and (c,d), cut in two ways, (1) and (2), at the points ((a+c)/2, b) and (c, (b+d)/2), giving the regions R1 to R4.]

• Need for advanced techniques for parallel processing and workload sharing!

Partitioning Techniques for Query Decomposition

• 1. Blind partitioning
  – For first-time queries
  – Uses a static/default partition number
  – Costs unnecessary partitioning overheads
  – Not efficient
• 2. Smart partitioning (next slide)
  – As a solution to the sharing of unpredictable workload
  – Utilizes client-based caching and
  – the FormerRequest object giving session information
  – Utilizes locality principles and working-window to find the most efficient partition number
• Partitions’ assignment: threads are assigned in round-robin fashion
  – Initially every worker node (WN) is assigned an equal number of partitions (share)
  – If the partition number (PN) cannot be divided evenly, then the remaining partitions (rmg) are re-distributed to the worker nodes
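The share/remainder rule can be written down directly. The slide does not say which workers receive the remaining partitions, so handing one extra partition to each of the first `rmg` workers is an assumption made for this sketch.

```python
def assign_partitions(pn, wn):
    """Return how many partitions each of wn worker nodes gets for pn partitions."""
    share, rmg = divmod(pn, wn)
    # Every worker gets `share`; the first `rmg` workers take one extra each.
    return [share + 1 if i < rmg else share for i in range(wn)]


print(assign_partitions(20, 6))
print(assign_partitions(10, 6))
```

With this rule the load of any two workers differs by at most one partition, although partitions of equal area may still carry very different amounts of data.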

Smart Partitioning through Client-based Caching

• Aim: determining the most efficient partition number to get the best performance result from parallel processing
• Based on the locality principles.
  – Assumption: successive requests have similar data density
  – Data density is a measure of the data size falling into a unit square.
  – Example: human population data: no population on the ocean, and urban areas have a higher population than rural areas.
• Brief algorithm:
  – Each layer on which partitioning will be done has a pre-defined threshold value.
  – The threshold value helps find the largest area in the bbox to be assigned.
  – The largest area changes depending on the density of the data last requested
    • Density is obtained from the FormerRequest object for that client

[Figure annotations: threshold: static, pre-defined; density: from the client-based caching FormerRequest object; if >= 2, then partition.]
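The formula itself did not survive on this slide, so the following is one plausible reading, not the author's algorithm. It assumes the threshold is the amount of data one partition should carry, so that the largest area per partition is threshold / density and the partition number is the query area divided by that largest area, rounded up.

```python
import math


def smart_partition_number(bbox, density, threshold):
    """Estimate a partition number from the density of the client's last request.

    Assumed reading: one partition may cover at most threshold / density
    units of area; fewer than 2 partitions means the query is not split.
    """
    if density <= 0:
        return 1  # nothing known or nothing there: do not partition
    min_x, min_y, max_x, max_y = bbox
    area = (max_x - min_x) * (max_y - min_y)
    largest_area = threshold / density
    pn = math.ceil(area / largest_area)
    return pn if pn >= 2 else 1


print(smart_partition_number((-110, 35, -100, 40), density=2.0, threshold=20))
print(smart_partition_number((-110, 35, -100, 40), density=0.1, threshold=20))
```

Under this reading a dense region is cut into more pieces than a sparse region of the same size, which is the behaviour the locality assumption is meant to produce.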

Performance Test Setup

• NASA satellite maps are provided by WMS from NASA’s OnEarth project.
• WFS servers, federator server and event-based interactive map tools are in Indiana University Community Grids Labs.
• Tests are done in a Local Area Network (LAN) by using grid-farm machines: gf12,15,16,17,18,19.ucs.indiana.edu.
• Grid-farm machines have 2 quad-core Intel Xeon processors running at 2.33 GHz with 8 GB of memory and run Red Hat Enterprise Linux ES release.

[Figure: event-based dynamic map tools (browser) send GetMap to the federator and receive a binary map image. 1: NASA satellite map images, delivered as binary map images by the WMS (JPL, California). 2: earthquake seismic records, delivered as GML by WFS-1 to WFS-6 over DB1 to DB6 (CGL, Indiana).]

Parallel & On-demand with Blind Partitioning

The number of worker WFS: 6. Partition numbers: 2, 10, 20.

• The larger the data size, the higher the performance gains
  – As the data size falling in a specific range query increases, the possibility of equal sharing increases.
• From the figure it seems partitioning into 10 or 20 gives the best results, but
  – What about relatively small-sized data rendering?
  – What partition number gives the best result for a specific range and data size?

See the next slide for an illustration of the need for smart partitioning.

Parallel & On-demand with Smart Partitioning

The number of worker WFS: 6. Partition numbers: 2, 10, 20.

[Figure: response-time charts; i marks the best partition/thread numbers (chart labels as extracted: 10, 22, 10).]

Actual performance results are much better because of the client-based caching. Depending on how much the cache and the main query overlap, response times vary between the orange line and the brown line in the second figure.

The brown line shows the best case, in which the whole main query range falls in the cached data ranges.

Overhead Timings Resulting from Parallelization

• Overheads: query partitioning, sub-query creation, and merging the results of sub-queries.
• Partitioning: defining the partition number and cutting the main query range into that number of pieces in the form of bounding box (bbox) values (range query attribute)
• Sub-query creation: creating the corresponding XML-based query (getFeature) for each partition in the partition list to fetch the remote GML data from WFS.
• Merging: aggregating the results of sub-queries and creating one complete map image as an answer to the main query

Illustration of overheads, for a range query with sample range 0 to 10:
1. Partitioning into 5: 0-2, 2-4, 4-6, 6-8, 8-10
2. Query creation for partitions: Q1, Q2, Q3, Q4, Q5
3. Merging

[Figure: the query for range 0-10 is replaced by queries/responses for the partitioned ranges, each going to a WFS over DB1.]
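The three overhead components can be timed separately, as in the sketch below, which follows the 0-to-10 sample range. All function names are hypothetical, the query strings are placeholders for real XML getFeature requests, and the "responses" are fabricated in place so that only the overhead steps are measured.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return its result with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def partition_range(low, high, pn):
    step = (high - low) / pn
    return [(low + i * step, low + (i + 1) * step) for i in range(pn)]


def create_queries(partitions):
    # Placeholder text; a real sub-query is an XML-based getFeature request.
    return [f"range {lo:g}-{hi:g}" for lo, hi in partitions]


def merge(responses):
    return [item for response in responses for item in response]


partitions, t_partition = timed(partition_range, 0, 10, 5)
queries, t_create = timed(create_queries, partitions)
merged, t_merge = timed(merge, [[query] for query in queries])
overhead = t_partition + t_create + t_merge
print(queries, f"overhead: {overhead:.6f} s")
```

In a real measurement the fetch time would be recorded separately, so that the overhead can be compared against the time saved by fetching the partitions in parallel.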

Conclusions – Performance

• Streaming data transfer techniques allow data rendering even on partially returned data.
• Pull parsing gives the best outcomes for XML-encoded GML data rendering by eliminating the requirement of data validation.
• The federator’s natural characteristics allowed us to develop advanced caching and parallel processing designs.
• Pre-fetching and parallel-processing techniques are mutually exclusive.
• Best performance outcomes are achieved through pre-fetching, but it can cause data inconsistency.
  – Triggering periodicity must be defined carefully.
• The success of parallel-processing techniques depends on how well we share the workload among worker nodes.
  – Unevenly distributed and variable-sized geo-data characteristics.
• Client-based caching enables efficient workload sharing for the most efficient parallel processing
  – Besides removing time- and resource-consuming repeated jobs.
• We saw that
  – Application of working-window and locality principles by means of client-based caching, and
  – Parallel processing through attribute-based query decomposition
  helped us increase the system responsiveness to a greater extent.

Conclusions – Framework

• Fine-grained dynamic information presentation through a federation framework enabled heterogeneous data sources to be queried as a single resource over an integrated data-view in multi-layered map images
  – Autonomous local resources controlling the definition of data
  – Removing the burden of individually accessing each data source with ad-hoc query languages.
• We showed that Open Geographic Standards (OGC) can be applied together with Web Service standards.
  – We converted HTTP GET/POST-based queries into XML-based queries by developing standard schemas compatible with the standards.
  – We also extended the standard service definitions with streaming data transfer capabilities by using publish-subscribe based messaging middleware.
• Easy extension with new data and service resources
  – Open Geographic and Web Service standards
• No physical data integration
  – Just-in-time or late-binding federation
  – Data is always kept at its originating resource
  – This enables easy data maintenance and a high degree of autonomy
• Seamless interaction with the system through integrated data views in multi-layered map images
  – Enables interactive feature-based querying besides displaying the data

Contributions

• A federated service-oriented Geographic Information Systems framework
  – Integrating Web Services with Open Geographic Standards to support interoperability at both data and service levels
  – Production of knowledge from distributed data sources in multi-layered map images.
    • Hierarchical data definitions through capability metadata federations
    • Fine-grained dynamic information presentation
    • Enabling unified interactive data access/query and display from a single access point through the federator.
• Investigated performance-efficient designs and did detailed benchmarking
  – Streaming GIS Web Services
  – Federator-oriented high-performance design techniques
    • Pre-fetching
    • Client-based caching: working-window and locality principles
    • Parallel processing through attribute-based query decomposition over unpredictable workload sharing

Acknowledgement

• The work described in this presentation is part of the QuakeSim project which is supported by the Advanced Information Systems Technology Program of NASA's Earth-Sun System Technology Office.
• Galip Aydin: Web Feature Server (WFS)

Thanks!

BACK-UP SLIDES

Page 36: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Capability-based Federation of the Standard Web Service Components

• Built over the proposed standard Web Service components and common data models
• Unified data access/query/display from a single access point
• Providing application-based hierarchical data definitions
  – Layer-based data and service (WMS and WFS) compositions
• Federation is done by aggregating GIS Web Services' capabilities metadata
• Capability is basically metadata about data + service:
  – Server's information content and acceptable request parameter values
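As a rough sketch of what aggregating capabilities metadata into layer-based compositions could look like, the example below merges what each service advertises into one application-level hierarchy. The function names, the dictionary layout, and the bounding-box values are illustrative assumptions; real capabilities documents are XML and carry far more detail.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def union_bbox(boxes: List[BBox]) -> BBox:
    """Smallest box covering all given boxes."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


def federate(application: str,
             composition: Dict[str, List[Tuple[str, str]]],
             advertised: Dict[str, Dict[str, BBox]]) -> dict:
    """Build a federated capability tree.

    composition: layer name -> list of (data name, service id)
    advertised:  service id -> {data name: bounding box}, i.e. a simplified
                 stand-in for each service's capabilities metadata
    """
    layers = {}
    for layer, members in composition.items():
        entries = []
        for data_name, service_id in members:
            if data_name not in advertised.get(service_id, {}):
                raise KeyError(f"{service_id} does not advertise {data_name}")
            entries.append({"data": data_name,
                            "service": service_id,
                            "bbox": advertised[service_id][data_name]})
        layers[layer] = {"data": entries,
                         "bbox": union_bbox([e["bbox"] for e in entries])}
    return {"application": application, "layers": layers}
```

A client that reads the returned tree sees one set of layers and never needs to know which WMS or WFS instance backs each one.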

[Figure: user portal (browser, interactive map tools), federator (capability federation, map rendering), and distributed WMS and WFS components, connected by request steps 1-3]

Sample layers for PI: a. NASA satellite layer; b. Earthquake-seismic layer; c. Google Map layer; d. State-boundaries layer (the browser displays a, b, c and d together)

1. GetCapability (metadata about data + service)
2. GetMap (get map data in a set of layer(s))
3. GetFeatureInfo (query the attributes of data)

Events: move, zooming in/out, panning (drag-drop), rectangular region, distance calc., attribute querying

Application-based hierarchical data:
[Application] Pattern Informatics
  – [Layer-1] State-boundary over Satellite
    • [Data-1] State-boundary (WFS-1)
    • [Data-2] Satellite-Image (WMS-2)
  – [Layer-2]
    • Google map (WMS-1)
  – [Layer-3] Earthquake-Seismic
    • [Data-1] Earthquake-Seismic (WFS-3)

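The three request types used between the portal and the federator (capabilities, map, feature info) can be illustrated with the standard OGC WMS 1.1.1 key-value HTTP encoding. This is only a sketch: the slides describe Web Service versions of these interfaces, so the plain-URL form here is an assumption made for brevity, and the helper names and the `example.org` endpoint are hypothetical.

```python
from urllib.parse import urlencode


def wms_url(base_url: str, request: str, **params) -> str:
    """Build a WMS 1.1.1 key-value request URL."""
    query = {"SERVICE": "WMS", "VERSION": "1.1.1", "REQUEST": request}
    query.update(params)
    return base_url + "?" + urlencode(query)


def get_capabilities(base_url: str) -> str:
    """Step 1: ask the server for its metadata about data and service."""
    return wms_url(base_url, "GetCapabilities")


def get_map(base_url, layers, bbox, width, height,
            srs="EPSG:4326", fmt="image/png") -> str:
    """Step 2: ask for a map image composed of the given layers."""
    return wms_url(base_url, "GetMap",
                   LAYERS=",".join(layers), STYLES="", SRS=srs,
                   BBOX=",".join(str(v) for v in bbox),
                   WIDTH=width, HEIGHT=height, FORMAT=fmt)


def get_feature_info(base_url, layers, bbox, width, height, x, y,
                     srs="EPSG:4326", fmt="image/png",
                     info_format="text/xml") -> str:
    """Step 3: query the attributes of the data under pixel (x, y)."""
    return wms_url(base_url, "GetFeatureInfo",
                   LAYERS=",".join(layers), STYLES="", SRS=srs,
                   BBOX=",".join(str(v) for v in bbox),
                   WIDTH=width, HEIGHT=height, FORMAT=fmt,
                   QUERY_LAYERS=",".join(layers), X=x, Y=y,
                   INFO_FORMAT=info_format)


if __name__ == "__main__":
    base = "http://example.org/wms"  # hypothetical federator endpoint
    print(get_capabilities(base))
    print(get_map(base, ["Google-map", "State-boundary"], (-125, 32, -113, 38), 400, 300))
```

In a federated setup the layer names passed to the map request would be the ones listed in the federator's aggregated capabilities, not those of an individual WMS or WFS.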

Page 37: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Event-based Interactive Tools: Query and data analysis over integrated data views

[Figure: hierarchical data and integrated data-view. 1: Google map layer; 2: States boundary lines layer; 3: seismic data layer]

Page 38: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements


Page 39: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Hierarchical data / Integrated data-view for IEISS Geo-science Application

Application-based hierarchical data:
[Application] IEISS
  – [Layer-1] Gas-pipeline over Satellite
    • [Data-1] Gas-pipeline (WFS-1)
    • [Data-2] Satellite-Image (WMS-2)
  – [Layer-2]
    • Google map (WMS-1)
  – [Layer-3] Electric-power
    • [Data-1] Electric-power (WFS-3)


Page 40: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

GetCapabilities Schema and Sample Request Instance

Page 41: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

GetMap Schema and Sample Request Instance

Page 42: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements


Page 43: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Event-based Interactive Map Tools

<event_controller>
  <event name="init" class="Path.InitListener" next="map.jsp"/>
  <event name="REFRESH" class="Path.InitListener" next="map.jsp"/>
  <event name="ZOOMIN" class="Path.InitListener" next="map.jsp"/>
  <event name="ZOOMOUT" class="Path.InitListener" next="map.jsp"/>
  <event name="RECENTER" class="Path.InitListener" next="map.jsp"/>
  <event name="RESET" class="Path.InitListener" next="map.jsp"/>
  <event name="PAN" class="Path.InitListener" next="map.jsp"/>
  <event name="INFO" class="Path.InitListener" next="map.jsp"/>
</event_controller>
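The controller configuration maps each user event to a listener class and the page shown next. A minimal sketch of reading such a mapping and looking up an event is given below. It only illustrates the name-to-listener-to-view idea; the functions `load_routes` and `dispatch` are hypothetical and say nothing about how the original JSP-based controller is implemented.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

# Shortened copy of the event controller configuration shown on the slide.
CONFIG = """<event_controller>
  <event name="init" class="Path.InitListener" next="map.jsp"/>
  <event name="ZOOMIN" class="Path.InitListener" next="map.jsp"/>
  <event name="PAN" class="Path.InitListener" next="map.jsp"/>
</event_controller>"""


def load_routes(xml_text: str) -> Dict[str, Tuple[str, str]]:
    """Parse the configuration into {event name: (listener class, next view)}."""
    root = ET.fromstring(xml_text)
    routes = {}
    for event in root.findall("event"):
        # strip() tolerates stray spaces around the class name
        routes[event.get("name")] = (event.get("class").strip(), event.get("next"))
    return routes


def dispatch(routes: Dict[str, Tuple[str, str]], event_name: str) -> Tuple[str, str]:
    """Return the listener class and next view registered for an event."""
    if event_name not in routes:
        raise KeyError(f"no listener registered for event {event_name!r}")
    return routes[event_name]
```

Keeping the mapping in configuration means a new map tool can be added by registering one more event entry rather than changing the dispatch logic.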

Page 44: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Sample GML document

Page 45: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Sample GetFeature Request Instance

Page 46: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

A simple template capabilities file for a WMS

Page 47: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Generalizing the Problem Domain

• Query heterogeneous data sources as a single resource
  – Heterogeneous: local resource controls definition of the data
  – Single resource: remove the burden of individually accessing each data source
• Easy extension with new data and service resources
• No real integration of data
  – Data always at local source
  – Easy maintenance of data
• Seamless interaction with the system
  – Collaborative decision making

[Figure: client/user query over an integrated view; mediators in front of data in files, HTML, XML/relational databases (DB), and spatial sources/sensors]

Page 48: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Generalization of the Proposed Architecture

• The GIS-style information model can be redefined in any application area, such as Chemistry and Astronomy
  – Application Specific Information Systems (ASIS)
• We need to define Application Specific:
  – Language (ASL) -> GML: expressing domain-specific features and the semantics of data
  – Feature Service (ASFS) -> WFS: serving data in a common language (ASL)
  – Visualization Services (ASVS) -> WMS: visualize information and provide a way of navigating ASFS-compatible/mediated data resources
  – Capabilities metadata for ASVS and ASFS
• Federator: federates the capabilities of distributed ASVS and ASFS to create an application-based hierarchy of distributed data and service resources
• Mediators: query and data format conversions
• Data sources maintain their internal structure
• Large degree of autonomy
• No actual physical data integration

[Figure: generalized architecture. Labels: AS sensors and AS repositories; mediators with a standard service API; messages using ASL; services such as filter, transformation, reasoning, data-mining, analysis; federator ASVS (capability federation, ASL rendering, standard service API); unified data query/access/display via steps 1-3]
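One way to picture the generalized roles is as a small set of interfaces: a feature service that serves data in the common language, a mediator that converts a source's native format into it, and a federator that exposes many services as one. The sketch below is an assumption-laden illustration. The class names, the `"id;value"` native format, and the dictionary standing in for ASL are all hypothetical.

```python
from typing import Dict, List, Protocol


class FeatureService(Protocol):
    """Role of an ASFS: advertise capabilities and serve data in the common language."""
    def get_capabilities(self) -> List[str]: ...
    def get_feature(self, name: str) -> List[dict]: ...


class SemicolonMediator:
    """Hypothetical mediator: the source keeps its own "id;value" rows,
    and conversion to ASL-like dictionaries happens only on request."""

    def __init__(self, name: str, rows: List[str]) -> None:
        self.name = name
        self.rows = rows  # stays in the source's internal structure

    def get_capabilities(self) -> List[str]:
        return [self.name]

    def get_feature(self, name: str) -> List[dict]:
        if name != self.name:
            raise KeyError(name)
        features = []
        for row in self.rows:
            ident, value = row.split(";", 1)
            features.append({"id": ident, "value": value})
        return features


class Federator:
    """Single access point that routes each request to the service advertising the data."""

    def __init__(self, services: List[FeatureService]) -> None:
        self.index: Dict[str, FeatureService] = {}
        for service in services:
            for name in service.get_capabilities():
                self.index[name] = service

    def get_capabilities(self) -> List[str]:
        return sorted(self.index)

    def get_feature(self, name: str) -> List[dict]:
        return self.index[name].get_feature(name)
```

Nothing is copied into the federator: it holds only the capability index, which matches the point that there is no physical data integration.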

Page 49: 1. Outline Motivations Research Issues Architecture: Federated Service-Oriented Geographic Information System Performance enhancing designs - measurements

Contributions (Systems Software)

• Developing Web Map Server (WMS) in Open Geographic Standards
  – Extended with Web Service Standards and
  – Streaming map creation capabilities
• Developing GIS Federator
  – Extended from WMS
  – Provides application-specific layer-structured hierarchical data as a composition of distributed standard GIS Web Service components
  – Enables uniform data access and query from a single access point
• Interactive map tools for data display, query and analysis
  – Browser and event-based
  – Extended with AJAX (Asynchronous JavaScript and XML)