Data Models and Query Languages of Spatio-Temporal Information
Cindy Xinmin Chen
Computer Science Department, UCLA
February 28, 2001

The Problem
Data models and query languages for spatio-temporal databases:
  many different approaches proposed
  complexity of the technical problem
  diversity of application requirements
Implementation: extensions for spatio-temporal information
  zero extensibility in relational DBMSs
  Object-Relational systems are better, but still have many limitations

Contribution of This Research
Better data models and query languages for temporal and spatio-temporal information
Multi-layered architecture for spatio-temporal extensions on O-R systems
Support further extensions and customization by end-users via user-defined spatio-temporal aggregates

Outline
Temporal Data Models and Query Languages -- SQLT
Spatio-Temporal Data Models and Query Languages -- SQLST
Implementation of SQLST
More Abstract Representation of Spatio-Temporal Data
Conclusion

State of the Art
More than 40 temporal data models according to [Jensen and Snodgrass 99]
Interval-based approach [Lorentzos 97]
  same conceptual level and implementation level representations
  but requires interval coalescing after projection
TSQL2's implicit time [Snodgrass 95]
  temporal joins are specified without ever mentioning the time column in WHERE or SELECT clauses of the query
Point-based approach [Toman 98]

Interval-Based Time Model and Coalescing
A temporal relation contains prescription information:
  Prescription(Melanie, Dr. Jones, Proventil, 3mg, [19960101, 19960131])
  Prescription(Melanie, Dr. Jones, Prozac, 3mg, [19960201, 19960229])
  Prescription(Melanie, Dr. Bond, Prozac, 3mg, [19960301, 19960331])
Projection on Name and Physician:
  (Melanie, Dr. Jones, [19960101, 19960229])
  (Melanie, Dr. Bond, [19960301, 19960331])
Projection on Name and Drug:
  (Melanie, Proventil, [19960101, 19960131])
  (Melanie, Prozac, [19960201, 19960331])
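
The coalescing step that the interval-based model needs after projection can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: the function name `coalesce`, the tuple layout, and the use of Python dates with closed day intervals are all assumptions made for the example.

```python
from datetime import date, timedelta
from itertools import groupby


def coalesce(rows):
    """Merge value-equivalent tuples whose closed day intervals overlap or are adjacent.

    rows: iterable of (key, start, end), where key holds the projected attributes.
    (Hypothetical helper; the layout is an assumption of this sketch.)
    """
    one_day = timedelta(days=1)
    result = []
    # Sorting puts equal keys together and orders their intervals by start date.
    for key, group in groupby(sorted(rows), key=lambda r: r[0]):
        cur_start = cur_end = None
        for _, start, end in group:
            if cur_end is not None and start <= cur_end + one_day:
                cur_end = max(cur_end, end)        # extend the current interval
            else:
                if cur_end is not None:
                    result.append((key, cur_start, cur_end))
                cur_start, cur_end = start, end    # open a new interval
        result.append((key, cur_start, cur_end))
    return result


prescriptions = [
    ("Melanie", "Dr. Jones", "Proventil", "3mg", date(1996, 1, 1), date(1996, 1, 31)),
    ("Melanie", "Dr. Jones", "Prozac", "3mg", date(1996, 2, 1), date(1996, 2, 29)),
    ("Melanie", "Dr. Bond", "Prozac", "3mg", date(1996, 3, 1), date(1996, 3, 31)),
]

# Projection on Name and Physician, then coalescing.
by_physician = coalesce(((n, p), s, e) for n, p, _, _, s, e in prescriptions)
# Projection on Name and Drug, then coalescing.
by_drug = coalesce(((n, d), s, e) for n, _, d, _, s, e in prescriptions)
```

On the three Prescription tuples this yields one interval per (Name, Physician) pair and one per (Name, Drug) pair, with the two adjacent Dr. Jones intervals and the two adjacent Prozac intervals merged.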

TSQL2
Bitemporal Conceptual Data Model -- coalesced data model
Two dimensional time -- valid time and transaction time
Implicit time model -- no coalescing
Lack of universality

TSQL2 -- An Example
Schema Definition
  CREATE TABLE Prescription (Name CHAR(30), Physician CHAR(30), Drug CHAR(30), Dosage CHAR(30))
  AS VALID STATE DAY
Query 1: find the drugs Melanie took in 1996 and the time she took them.
  SELECT Drug
  VALID INTERSECT(VALID(Prescription), PERIOD('[1996]' AS DAY))
  FROM Prescription
  WHERE Name = 'Melanie'

Point-Based Model
Expressive power [Toman 97]
Use user-defined aggregates to express Allen's interval operators
Universality:
  uniformly applicable to SQL, QBE and Datalog
  use current query languages' construct types
  no new constructs are introduced

SQLT: Schema Definition
Define the Prescription relation
  CREATE TABLE Prescription (Name CHAR(30), Physician CHAR(30), Drug CHAR(30), Dosage CHAR(30), VTime DATE)

Temporal Selection and Join
Query 1': find the drugs Melanie took in 1996 and the time she took them.
  SELECT Drug, VTime
  FROM Prescription
  WHERE Name = 'Melanie' AND 19960101 <= VTime AND 19961231 >= VTime

Interval-Oriented Reasoning
Query 2: find the patients who have taken Proventil throughout the time they took Prozac.
  SELECT P1.Name
  FROM Prescription AS P1, Prescription AS P2
  WHERE P1.Name = P2.Name AND P1.Drug = 'Proventil' AND P2.Drug = 'Prozac'
  GROUP BY P1.Name
  HAVING DURING(P1.VTime, P2.VTime)

Interval-Oriented Reasoning (cont.)
Query 2 in QBE
  Prescription | Name      | Physician | Drug      | Dosage | VTime
               | P.G._name |           | Proventil |        | _vtime1
               | _name     |           | Prozac    |        | _vtime2
  Conditions
  DURING(_vtime1, _vtime2)

Interval-Oriented Reasoning (cont.)
Query 2 in Datalog
  query2(Name, during<VTime1, VTime2>) <-
      prescription(Name, _, "Proventil", _, VTime1),
      prescription(Name, _, "Prozac", _, VTime2).
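
In a point-based model, Query 2 can be read as a set condition on time points: every day on which a patient took Prozac is also a day on which the same patient took Proventil. The sketch below illustrates that reading only. The slides do not show how the DURING aggregate is defined, so the function name `patients_covering`, the (name, drug, day) tuple layout, and the integer day numbers are assumptions of this example.

```python
from collections import defaultdict


def patients_covering(prescriptions, covering_drug, covered_drug):
    """Patients whose days on covered_drug are all days on covering_drug.

    prescriptions: iterable of (name, drug, vtime) points, one tuple per day.
    Patients who never took covered_drug are excluded (assumption of this sketch).
    """
    days = defaultdict(lambda: defaultdict(set))
    for name, drug, vtime in prescriptions:
        days[name][drug].add(vtime)
    return sorted(
        name
        for name, by_drug in days.items()
        if by_drug.get(covered_drug)
        and by_drug[covered_drug] <= by_drug.get(covering_drug, set())
    )


# Hypothetical data: integer day numbers stand in for DATE values.
rows = (
    [("Melanie", "Proventil", d) for d in range(1, 11)]
    + [("Melanie", "Prozac", d) for d in range(3, 6)]
    + [("Bob", "Proventil", d) for d in range(1, 4)]
    + [("Bob", "Prozac", d) for d in range(2, 7)]
    + [("Ann", "Proventil", d) for d in range(1, 4)]
)
result = patients_covering(rows, "Proventil", "Prozac")
```

Only Melanie qualifies in this data: Bob's Prozac days extend past his Proventil days, and Ann never took Prozac.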

Implementation of SQLT on DB2
From point-based representation to interval-based representation
Difficulty of supporting temporal data model and query language extensions on existing O-R systems
  only user-defined functions (UDFs) available
  UDFs cannot access the database tables directly
  UDFs are hard to develop and debug
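
The move from a point-based to an interval-based representation amounts to grouping consecutive time points into maximal intervals. The following is a small sketch of that idea under the assumption of day granularity. The helper name `points_to_intervals` is hypothetical and does not describe the actual DB2 implementation.

```python
from datetime import date, timedelta


def points_to_intervals(days):
    """Group day points into maximal closed intervals of consecutive days."""
    intervals = []
    for d in sorted(set(days)):
        if intervals and d - intervals[-1][1] == timedelta(days=1):
            intervals[-1][1] = d          # d continues the last interval
        else:
            intervals.append([d, d])      # d starts a new interval
    return [tuple(iv) for iv in intervals]


# Hypothetical VTime points for one (Name, Drug) pair.
points = [date(1996, 1, 2), date(1996, 1, 1), date(1996, 1, 3),
          date(1996, 1, 10), date(1996, 1, 11)]
intervals = points_to_intervals(points)
```

Each output pair is the first and last day of a run of consecutive days, so storage grows with the number of runs rather than the number of days.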

Previous Work
Constraint-based approach
Triangulation-based spatial objects + interval-based time [Chomicki 97]
Parametric rectangles + interval-based time [Cai 00]
Time as another dimension in space [Grumbach 98]
Composite spatio-temporal data types: mpoint and mregion [Güting 00]
Orthogonal space and time [Worboys 94]

Previous Work (cont.)
Commercial DBMSs
  no spatio-temporal extensions
  only spatial DataBlades, Extenders, etc.
    provide a predefined library of functions
    offer no extensibility

Objective of SQLST
orthogonality, minimality and extensibility
  separated temporal and spatial information
  minimal extensions to SQL
  additional constructs can be built in SQLST

Design and Implementation of SQLST
Define a minimal set of built-in primitives in a procedural language
Use user-defined aggregates for further extension
Data types:
  Temporal data type -- time interval
  Spatial data types -- points, lines (finite straight line segments), and counterclockwise directed triangles

Counterclockwise Directed Triangle
A triangle is counterclockwise directed if its three vertices are counterclockwise oriented:
$$\begin{vmatrix} V_{1x} & V_{1y} & 1 \\ V_{2x} & V_{2y} & 1 \\ V_{3x} & V_{3y} & 1 \end{vmatrix} > 0$$
Makes point-location problem easy
  inside(point, triangle)
[Figure: triangle T with vertices V1, V2, V3 and two points P and P']
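
The orientation of three vertices is commonly computed as the sign of a 3x3 determinant of their coordinates, and the same test gives a simple point-in-triangle check once the triangle is known to be counterclockwise. The sketch below illustrates this. The function bodies are an assumption of this example rather than the built-in primitive from the slides, and treating boundary points as inside is also an assumption.

```python
def orientation(v1, v2, v3):
    """Determinant of rows (x, y, 1) for v1, v2, v3; positive means counterclockwise."""
    return (v2[0] - v1[0]) * (v3[1] - v1[1]) - (v2[1] - v1[1]) * (v3[0] - v1[0])


def inside(point, triangle):
    """True if point lies in a counterclockwise directed triangle (boundary included).

    The point must be on the left of, or on, each directed edge.
    """
    v1, v2, v3 = triangle
    return (orientation(v1, v2, point) >= 0
            and orientation(v2, v3, point) >= 0
            and orientation(v3, v1, point) >= 0)


t = ((2, 2), (6, 2), (2, 6))          # counterclockwise: orientation(*t) > 0
p_in, p_out = (3, 3), (5, 5)
```

Because the vertex order is fixed in advance, `inside` needs only three sign tests and no case analysis on the triangle's direction.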

Application Example
Cyclone statistics for the Northern Hemisphere from NSF Arctic System Science Research Program

  ID      Trajectory (x1, y1, x2, y2)  Pressure  Start Time  End Time
  960001  (1146, 1034, 1303, 1775)     1004      1996-05-01  1996-05-02
  960001  (1303, 1775, 1664, 1779)     995       1996-05-02  1996-05-03
  960001  (1664, 1779, 1957, 1018)     991       1996-05-03  1996-05-04

[Figure: cyclone trajectory with positions labeled day1 through day4]

SQLST: Schema Definition
Define the Cyclone relation
  CREATE TABLE Cyclone (ID INT, Trajectory LINE, Pressure REAL, Tstart DATE, Tend DATE)
Define the Island relation
  CREATE TABLE Island (Name CHAR(30), Region TRIANGLE)

Spatio-Temporal Queries
Query 3: find all cyclones whose high pressure stage (pressure > 1000mb) has lasted more than 3 days.
  SELECT ID
  FROM Cyclone
  WHERE Pressure > 1000
  GROUP BY ID
  HAVING DURATION(Tstart, Tend) > 3

Spatio-Temporal Queries (cont.)
Query 4: find the cyclones whose trajectories have been enclosed by the island Misfortune.
  SELECT ID
  FROM Cyclone, Island
  WHERE Name = 'Misfortune'
  GROUP BY ID
  HAVING CONTAIN(Region, Trajectory)

Approach
Define a minimal set of ADTs built in C++
Use user-defined aggregates to define new spatio-temporal primitives
Allow end-users to extend and customize the system for their application
Built-in Spatial Functions
  length(line)
  area(triangle)
  center_of_mass(triangle)
  distance(point, point)
  distance(point, line)
  intersect(line, line)
  intersect(line, triangle)
  intersect(triangle, triangle)
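
To make a few of these primitives concrete, here is a minimal sketch of `length`, `area`, `center_of_mass`, and the point-to-point `distance`. The slides implement the ADTs in C++; the Python versions below, and the representation of a line as a pair of points and a triangle as a triple of points, are assumptions made for illustration only.

```python
import math


def length(line):
    """Length of a finite straight line segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = line
    return math.hypot(x2 - x1, y2 - y1)


def area(triangle):
    """Area of a triangle given as three (x, y) vertices."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return abs((x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)) / 2


def center_of_mass(triangle):
    """Centroid of a triangle: the mean of its three vertices."""
    xs, ys = zip(*triangle)
    return (sum(xs) / 3, sum(ys) / 3)


def distance(p, q):
    """Euclidean distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])
```

For example, `length(((0, 0), (3, 4)))` evaluates to 5.0 and `area(((2, 2), (6, 2), (2, 6)))` to 8.0.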

User-Defined Aggregates (UDAs)
UDAs provide a more general and powerful mechanism for DB extensions
  ease of use
  no impedance mismatch of data types and programming paradigms
  DB advantages -- scalability, data independence, optimizability, etc.

Aggregate eXtension Language (AXL) [Wang 00]
Stream-oriented processing
Three functions expressed in SQL
  INITIALIZE: gives an initial value to the aggregate
  ITERATE: computes the intermediate aggregate value for each new record
  TERMINATE: returns the final value computed for the aggregate
Local tables
  state
  return
Built on the Berkeley DB storage manager

Duration
Calculates the total length of the time intervals
  Cyclone(960001, _, _, 19960101, 19960105)
  Cyclone(960001, _, _, 19960111, 19960115)
  Cyclone(960001, _, _, 19960121, 19960125)
  Result: 15 days

Duration (cont.)
AGGREGATE DURATION(Tstart DATE, Tend DATE) : INT
{ TABLE state (i INT);
  INITIALIZE : {
    INSERT INTO state VALUES(Tend - Tstart + 1);
  }
  ITERATE : {
    UPDATE state SET i = i + (Tend - Tstart + 1);
  }
  TERMINATE : {
    INSERT INTO return SELECT i FROM state;
  }
}
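
The INITIALIZE, ITERATE, and TERMINATE stages of DURATION can be mimicked in a general-purpose language to show how the state evolves over the input stream. The class `Duration` and the driver `run_aggregate` below are hypothetical names for this sketch, which assumes day granularity and closed intervals, and it is not how AXL executes aggregates.

```python
from datetime import date


class Duration:
    """Mirrors the three stages of the DURATION aggregate with one state value."""

    def __init__(self):
        self.state = None

    def initialize(self, tstart, tend):
        # First record: the state starts at the length of its interval.
        self.state = (tend - tstart).days + 1

    def iterate(self, tstart, tend):
        # Each later record adds the length of its own interval.
        self.state += (tend - tstart).days + 1

    def terminate(self):
        return self.state


def run_aggregate(agg, rows):
    """Feed rows to the aggregate: initialize on the first, iterate on the rest."""
    rows = iter(rows)
    first = next(rows, None)
    if first is None:
        return None
    agg.initialize(*first)
    for row in rows:
        agg.iterate(*row)
    return agg.terminate()


cyclone_960001 = [
    (date(1996, 1, 1), date(1996, 1, 5)),
    (date(1996, 1, 11), date(1996, 1, 15)),
    (date(1996, 1, 21), date(1996, 1, 25)),
]
total_days = run_aggregate(Duration(), cyclone_960001)
```

Each of the three intervals spans 5 days, so the total is 15 days. Note that the aggregate sums interval lengths as given, so overlapping input intervals would be counted twice.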

Contain
Tests if one object contains another
  returns 1 if true; returns nothing otherwise
  contain(O1, O2) holds iff for every triangle t2 in O2 and every vertex v of t2, there is a triangle t1 in O1 such that v is inside t1

Contain (cont.)
AGGREGATE CONTAIN(Object1 TRIANGLE, Object2 TRIANGLE) : INT
{ TABLE state (b INT) AS VALUES(1);
  TABLE triangles(Object TRIANGLE);
  TABLE points(Vertex POINT);
  INITIALIZE : ITERATE : {
    INSERT INTO triangles VALUES(Object1);
    INSERT INTO points VALUES(Object2.Vertex);
  }
  TERMINATE : {
    UPDATE state SET b = 0
      WHERE NOT EXISTS (SELECT Vertex FROM points, triangles
                        WHERE inside(Vertex, Object) = 1);
    INSERT INTO return SELECT b FROM state WHERE b = 1;
  }
}
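
A containment aggregate of this kind collects the triangles of the first object and the vertices of the second while iterating, and decides the result at termination. The sketch below follows one reading of containment: every collected vertex must lie inside at least one collected triangle. That reading, the class name `Contain`, the Python representation of triangles as vertex triples, and the helper functions are assumptions of this example, not the AXL code itself.

```python
def orientation(a, b, c):
    """Positive when a, b, c are in counterclockwise order."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def inside(point, triangle):
    """Point-in-triangle test for a counterclockwise triangle (boundary included)."""
    a, b, c = triangle
    return (orientation(a, b, point) >= 0
            and orientation(b, c, point) >= 0
            and orientation(c, a, point) >= 0)


class Contain:
    """INITIALIZE and ITERATE share one body; TERMINATE checks every vertex."""

    def __init__(self):
        self.triangles = []
        self.points = []

    def iterate(self, object1, object2):
        self.triangles.append(object1)
        self.points.extend(object2)      # the vertices of object2

    initialize = iterate

    def terminate(self):
        ok = all(any(inside(p, t) for t in self.triangles) for p in self.points)
        return 1 if ok else None         # "returns nothing" is modeled as None


def contain(o1, o2):
    """Run the aggregate over the cross product of the two triangle lists."""
    agg = Contain()
    for t1 in o1:
        for t2 in o2:
            agg.iterate(t1, t2)
    return agg.terminate()


square = [((2, 2), (6, 2), (2, 6)), ((2, 6), (6, 2), (6, 6))]
inner = [((4, 4), (5, 5), (4, 5))]       # lies across both triangles of the square
sticking_out = [((3, 3), (7, 3), (3, 5))]
```

With this data, `contain(square, inner)` returns 1 even though the inner triangle straddles the diagonal, because each vertex only has to fall in some triangle of the container, while `contain(square, sticking_out)` returns None.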

Other UDAs
Overlap
  tests if any edges of two objects intersect
Edge_Distance
  calculates the minimum distance from the vertices of one object to the edges of the other object
Moving_Distance
  calculates the distance an object has traveled continuously

Key Issue: Performance
Size of data set:
  Cyclone table -- 200,000 tuples
  Island table -- 1000 tuples
Cases compared
  AXL using indexes
  AXL not using indexes
  C++ using indexes
  C++ not using indexes
Index Tstart on Cyclone table and Name on Island table

Performance -- Duration
Query 5: find the duration of the cyclones that occurred in June, 1996.
  SELECT DURATION(Tstart, Tend)
  FROM Cyclone
  WHERE 19960601 <= Tstart AND 19960630 >= Tstart
  GROUP BY ID

Performance -- Duration (cont.)

Performance -- Contain
Query 6: find the cyclones which occurred in June, 1996 and have been enclosed by the region of the island Misfortune.
  SELECT ID
  FROM Cyclone, Island
  WHERE 19960601 <= Tstart AND 19960630 >= Tstart AND Name = 'Misfortune'
  GROUP BY ID
  HAVING CONTAIN(Region, Trajectory)

Performance -- Contain (cont.)

Abstract Model
Objective: flexibility
  user can decide which level of abstraction they want
  may have more than two layers
Data types:
  temporal data type -- time instants
  spatial data types -- points, lines, and polygons

A Spatio-Temporal Object
The concrete model -- space triangles and time intervals
  (S, ((2,2),(6,2),(2,6)), [1,10])
  (S, ((2,6),(6,2),(6,6)), [1,10])
A more abstract representation -- sequence of snapshots
  (S, [(2,2),(2,6),(6,6),(6,2)], 1)
  (S, [(2,2),(2,6),(6,6),(6,2)], 2)
  ......
  (S, [(2,2),(2,6),(6,6),(6,2)], 10)
[Figure: object S occupying the square from (2,2) to (6,6) for 1 <= t <= 10]

Schema Definition
The Cyclone relation
  CREATE TABLE Cyclone (ID INT, Position POINT, Pressure REAL, Time DATE)
The Island relation
  CREATE TABLE Island (Name CHAR(30), Extent POLYGON)

Mapping
UDA -- map
  (Point, Time Instant) -> (Line, Time Interval)
Table function -- decompose
  (Polygon) -> (Triangle)
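
The two mappings between the abstract and the concrete layer can be sketched as follows: one turns a time-ordered sequence of (point, instant) samples into line segments with time intervals, and the other splits a polygon into counterclockwise triangles. The names `map_points` and `decompose`, the fan triangulation (which is only valid for convex polygons), and the data layout are assumptions of this sketch. The slides do not say how the actual `map` UDA and `decompose` table function are implemented.

```python
from datetime import date


def map_points(samples):
    """(point, instant) samples -> (line, tstart, tend) for consecutive samples."""
    ordered = sorted(samples, key=lambda s: s[1])
    return [((p1, p2), t1, t2)
            for (p1, t1), (p2, t2) in zip(ordered, ordered[1:])]


def decompose(polygon):
    """Fan triangulation of a convex polygon into counterclockwise triangles."""
    vertices = list(polygon)
    # Shoelace sum: positive for counterclockwise vertex order.
    doubled_area = sum(x1 * y2 - x2 * y1
                       for (x1, y1), (x2, y2)
                       in zip(vertices, vertices[1:] + vertices[:1]))
    if doubled_area < 0:
        vertices.reverse()               # make the order counterclockwise
    return [(vertices[0], vertices[i], vertices[i + 1])
            for i in range(1, len(vertices) - 1)]


# Positions taken from the cyclone example, one sample per day.
samples = [((1146, 1034), date(1996, 5, 1)), ((1303, 1775), date(1996, 5, 2)),
           ((1664, 1779), date(1996, 5, 3)), ((1957, 1018), date(1996, 5, 4))]
segments = map_points(samples)

# The square S from the snapshot representation.
triangles = decompose([(2, 2), (2, 6), (6, 6), (6, 2)])
```

Four daily positions produce three trajectory segments, each paired with the interval between its two samples, and the square is split into two triangles whose areas add up to the area of the square.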

Spatio-Temporal Queries
Query 3': find all cyclones whose high pressure stage (pressure > 1000mb) has lasted more than 3 days.
  SELECT NEW.ID
  FROM (SELECT ID, MAP(Position, Time)
        FROM Cyclone
        WHERE Pressure > 1000
        GROUP BY ID) AS NEW(ID, Trajectory, Tstart, Tend)
  GROUP BY NEW.ID
  HAVING DURATION(NEW.Tstart, NEW.Tend) > 3

Spatio-Temporal Queries (cont.)
Query 4': find the cyclones whose trajectories have been enclosed by the island Misfortune.
  SELECT NEW.ID
  FROM (SELECT ID, MAP(Position, Time), T.Region
        FROM Cyclone, Island, TABLE(decompose(Extent)) AS T
        WHERE Name = 'Misfortune'
        GROUP BY ID, T.Region) AS NEW(ID, Trajectory, Tstart, Tend, Region)
  GROUP BY NEW.ID
  HAVING CONTAIN(NEW.Region, NEW.Trajectory)

Conclusion
Better data models and query languages for temporal and spatio-temporal information
Multi-layered architecture for spatio-temporal extensions on O-R systems
Support further extensions and customization by end-users via user-defined spatio-temporal aggregates