Indexing Semistructured Data
J. McHugh, J. Widom, S. Abiteboul,
Q. Luo, and A. Rajaraman
Stanford University
January 1998
http://www-db.stanford.edu/lore/
EECS 684 02/21/2000 Presented by Weiming Zhou
Outline
• Introduction
- Data Model
- Query Language
• Indexes in Lore
• Query plans using indexes
• Conclusions
Data Model - Object Exchange Model (OEM)
The Lorel Query Language (Lorel)
Example 1
select DB.Movie.Title
where DB.Movie.Actor.Name = “Harrison Ford”
Example 2
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”
Indexes In Lore
• Value index
• Text index
• Link index
• Path index
• Edge index
Value index
Similar to attribute indexes in Relational DBMS
Example
Suppose we create a Value Index for DB.Movie.Year.
A lookup for DB.Movie.Year = “1956” then returns &12.
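The lookup above can be sketched as a small in-memory map. This is only an illustration, not Lore's implementation: the names `EDGES`, `VALUES`, and `build_value_index` are hypothetical, the index is keyed on the incoming label and the value (a simplifying assumption), only equality lookups are handled, and the value stored for &7 is a placeholder because the slides do not give it.

```python
from collections import defaultdict

# (parent, label, child) edges; only the Year edges named in the slides.
EDGES = [("&2", "Year", "&7"), ("&3", "Year", "&12")]
# Atomic values. "1956" for &12 comes from the slide; "1900" for &7 is a placeholder.
VALUES = {"&7": "1900", "&12": "1956"}


def build_value_index(edges, values):
    """Map (incoming label, value) to the atomic objects holding that value."""
    index = defaultdict(list)
    for _parent, label, child in edges:
        if child in values:
            index[(label, values[child])].append(child)
    return index


value_index = build_value_index(EDGES, VALUES)
print(value_index[("Year", "1956")])
```

With this toy data the lookup prints `['&12']`, matching the result on the slide.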
Text Index
• An information-retrieval style keyword search.
• Restricted by incoming labels.
• Locates string values containing specific words.
• Useful for strings containing a significant amount of text.
Implementation: Inverted lists - map a given word w and label l to a list of atomic values with incoming edge l that contain word w.
Example: Lookup for all objects with an atomic string value containing the word “Ford” and an incoming edge Name.
Results: {<&17, 2>, <&21, 2>}.
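A minimal sketch of such an inverted list, assuming each entry pairs an object with the 1-based position of the word inside its string (which is how the `2` in the results above is read here). The names `ATOMS`, `build_text_index`, and `text_lookup` are hypothetical, as is the title string for &5; whitespace tokenization and case folding are simplifications.

```python
from collections import defaultdict

# (incoming label, object id, atomic string value).
# The two Name values come from the slides; the Title string is a placeholder.
ATOMS = [
    ("Name", "&17", "Harrison Ford"),
    ("Name", "&21", "Harrison Ford"),
    ("Title", "&5", "Placeholder Title"),
]


def build_text_index(atoms):
    """Map (word, incoming label) to a list of (object id, word position)."""
    index = defaultdict(list)
    for label, oid, text in atoms:
        for position, word in enumerate(text.split(), start=1):
            index[(word.lower(), label)].append((oid, position))
    return index


def text_lookup(index, word, label):
    """Return the postings for `word` restricted to incoming edge `label`."""
    return index.get((word.lower(), label), [])


text_index = build_text_index(ATOMS)
print(text_lookup(text_index, "Ford", "Name"))
```

On this data the lookup prints `[('&17', 2), ('&21', 2)]`.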
Link Index
• Locates parents of a given object.
• Serves as back-pointers.
Implementation
• Extendible hashing
• One Link Index for the entire database graph
Example
The Link Index lookup for object &17 returns parent object &6, and the lookup for object &21 returns object &13.
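The back-pointer idea can be sketched with an ordinary Python dictionary. Note that this swaps in a plain hash table for the extendible hashing named above, and that returning the edge label together with the parent is an assumption made for illustration; `EDGES` and `build_link_index` are hypothetical names.

```python
from collections import defaultdict

# (parent, label, child) edges taken from the object ids used in the slides.
EDGES = [
    ("&2", "Actor", "&6"),
    ("&4", "Actor", "&13"),
    ("&6", "Name", "&17"),
    ("&13", "Name", "&21"),
]


def build_link_index(edges):
    """Map each child object to its (label, parent) back-pointers."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[child].append((label, parent))
    return index


link_index = build_link_index(EDGES)
print(link_index["&17"])
print(link_index["&21"])
```

The two lookups print `[('Name', '&6')]` and `[('Name', '&13')]`, the parents given in the example.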
Path Index
Locates all objects reachable by a given labeled path.
Provided by the DataGuide.
Example
select DB.Movie.Title
Use the Path Index to directly locate all objects reachable via DB.Movie.Title.
Results: &5; &9; &14.
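As a rough sketch, a path index can be pictured as a map from a label path to the objects reached by it. This is much simpler than a real DataGuide and is not Lore's implementation. The edge list is reconstructed from the object ids mentioned in these slides; the root id `&1` and the names `EDGES` and `build_path_index` are assumptions.

```python
from collections import defaultdict

# (parent, label, child) edges; &1 is an assumed id for the object named DB.
EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&2", "Actor", "&6"), ("&2", "Year", "&7"),
    ("&3", "Title", "&9"), ("&3", "Year", "&12"),
    ("&4", "Actor", "&13"), ("&4", "Title", "&14"),
    ("&6", "Name", "&17"), ("&13", "Name", "&21"),
]


def build_path_index(edges, root_name, root_oid):
    """Map each label path (a tuple starting at the root name) to its objects."""
    children = defaultdict(list)
    for parent, label, child in edges:
        children[parent].append((label, child))
    index = defaultdict(list)

    def visit(oid, path, on_path):
        if oid not in index[path]:
            index[path].append(oid)
        for label, child in children.get(oid, []):
            if child not in on_path:  # do not follow cycles
                visit(child, path + (label,), on_path | {child})

    visit(root_oid, (root_name,), frozenset([root_oid]))
    return index


path_index = build_path_index(EDGES, "DB", "&1")
print(path_index[("DB", "Movie", "Title")])
```

For this graph the lookup prints `['&5', '&9', '&14']`, with no traversal needed at query time.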
Edge Index
Locates all parent-child pairs connected via a specified label.
Example
Look up label “Year” in Edge Index
Results: &2-&7, &3-&12
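A minimal sketch of an edge index as a map from label to parent-child pairs. `EDGES` and `build_edge_index` are hypothetical names; the two Year edges are the ones listed in the results above, and the other edges are included only to show that they are filtered out.

```python
from collections import defaultdict

# (parent, label, child) edges using object ids from the slides.
EDGES = [
    ("&2", "Title", "&5"),
    ("&2", "Year", "&7"),
    ("&3", "Title", "&9"),
    ("&3", "Year", "&12"),
]


def build_edge_index(edges):
    """Map each label to the (parent, child) pairs connected by that label."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[label].append((parent, child))
    return index


edge_index = build_edge_index(EDGES)
print(edge_index["Year"])
```

The lookup prints `[('&2', '&7'), ('&3', '&12')]`.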
Query Plans Using Indexes
• Top-Down
• Bottom-Up
• Hybrid
Example
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”
Top-Down Query Plan
Exhaustive top-down traversals
DB.Movie.Actor.Name = “Harrison Ford” → &17, &21
Link Index: &17 → &2, &21 → &4
DB.Movie.Title → &5, &14
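A top-down evaluation of the example query can be sketched as a plain traversal from the root that uses no index at all. This is an illustration under assumptions, not Lore's query engine: the edge list is reconstructed from the object ids in these slides, the root id `&1` is assumed, and `step` and `top_down` are hypothetical helpers.

```python
from collections import defaultdict

# (parent, label, child) edges; &1 is an assumed id for the object named DB.
EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&2", "Actor", "&6"), ("&2", "Year", "&7"),
    ("&3", "Title", "&9"), ("&3", "Year", "&12"),
    ("&4", "Actor", "&13"), ("&4", "Title", "&14"),
    ("&6", "Name", "&17"), ("&13", "Name", "&21"),
]
VALUES = {"&17": "Harrison Ford", "&21": "Harrison Ford"}

CHILDREN = defaultdict(list)
for parent, label, child in EDGES:
    CHILDREN[parent].append((label, child))


def step(oids, label):
    """Follow every edge with the given label out of the given objects."""
    return [c for o in oids for (l, c) in CHILDREN.get(o, []) if l == label]


def top_down(root, actor_name):
    """Visit every Movie, test its Actor.Name values, then emit its Titles."""
    titles = []
    for movie in step([root], "Movie"):
        names = step(step([movie], "Actor"), "Name")
        if any(VALUES.get(n) == actor_name for n in names):
            titles.extend(step([movie], "Title"))
    return titles


print(top_down("&1", "Harrison Ford"))
```

This prints `['&5', '&14']`. Every Movie is visited, including &3, which has no matching actor; that exhaustive work is what the index-based plans try to avoid.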
Bottom-Up Query Plan
Look up Value Index: DB.Movie.Actor.Name = “Harrison Ford” → &17, &21
Link Index: &17 → &2, &21 → &4
DB.Movie.Title → &5, &14
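The bottom-up steps can be sketched by combining a value lookup with repeated link lookups. As before this is a sketch under assumptions: both indexes are plain dictionaries, the root id `&1` is assumed, and `parents_via` and `bottom_up` are hypothetical helpers. The link step from &17 to &2 passes through the Actor object &6 (and from &21 to &4 through &13).

```python
from collections import defaultdict

# (parent, label, child) edges; &1 is an assumed id for the object named DB.
EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&2", "Actor", "&6"), ("&2", "Year", "&7"),
    ("&3", "Title", "&9"), ("&3", "Year", "&12"),
    ("&4", "Actor", "&13"), ("&4", "Title", "&14"),
    ("&6", "Name", "&17"), ("&13", "Name", "&21"),
]
VALUES = {"&17": "Harrison Ford", "&21": "Harrison Ford"}

value_index = defaultdict(list)  # (label, value) -> atomic objects
link_index = defaultdict(list)   # child -> [(label, parent)]
children = defaultdict(list)     # parent -> [(label, child)]
for parent, label, child in EDGES:
    link_index[child].append((label, parent))
    children[parent].append((label, child))
    if child in VALUES:
        value_index[(label, VALUES[child])].append(child)


def parents_via(oids, label):
    """One Link Index step: parents reached by going up an edge with `label`."""
    return [p for o in oids for (l, p) in link_index.get(o, []) if l == label]


def bottom_up(root, actor_name):
    names = value_index.get(("Name", actor_name), [])  # &17, &21
    actors = parents_via(names, "Name")                 # &6, &13
    movies = [m for m in parents_via(actors, "Actor")   # &2, &4
              if ("Movie", root) in link_index.get(m, [])]
    return [c for m in movies for (l, c) in children.get(m, []) if l == "Title"]


print(bottom_up("&1", "Harrison Ford"))
```

This prints `['&5', '&14']` without ever touching movie &3.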
Hybrid Query Plan
select X
from A.B X
where exists Y in X.C : Y = 5
Bottom-up: Value Index A.B.C = “5”
Top-down: A.B
Intersect
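A sketch of the hybrid plan on a small, entirely hypothetical graph (the slides give no data for this query). The bottom-up half finds objects with a C child equal to 5, the top-down half finds the objects reachable via A.B, and the two sets are intersected. All object ids and the `hybrid` helper are assumptions.

```python
from collections import defaultdict

# Hypothetical graph: &a is the object named A.
EDGES = [
    ("&a", "B", "&b1"), ("&a", "B", "&b2"),
    ("&b1", "C", "&c1"), ("&b2", "C", "&c2"),
    ("&z", "C", "&c3"),  # &z has a matching C child but is not reachable via A.B
]
VALUES = {"&c1": 5, "&c2": 7, "&c3": 5}

value_index = defaultdict(list)  # (label, value) -> atomic objects
link_index = defaultdict(list)   # child -> [(label, parent)]
children = defaultdict(list)     # parent -> [(label, child)]
for parent, label, child in EDGES:
    link_index[child].append((label, parent))
    children[parent].append((label, child))
    if child in VALUES:
        value_index[(label, VALUES[child])].append(child)


def hybrid(root, value):
    # Bottom-up: Value Index lookup, then one Link Index step up the C edge.
    bottom = {p for c in value_index.get(("C", value), [])
              for (l, p) in link_index.get(c, []) if l == "C"}
    # Top-down: objects reachable via A.B.
    top = [c for (l, c) in children.get(root, []) if l == "B"]
    # Intersect the two candidate sets, keeping top-down order.
    return [x for x in top if x in bottom]


print(hybrid("&a", 5))
```

This prints `['&b1']`: &b2 is dropped because its C value is 7, and &z is dropped because it is not reachable via A.B.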
Conclusions
• Presents Lore’s indexing structures: Value
Index, Text Index, Link Index, Path Index
and Edge Index.
• Query plans using indexes
• Preliminary performance results:
at least an order of magnitude improvement
when indexes are used for query processing.