Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation


Page 1: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

04-19-2001

MQP Advisor: Prof. Elke A. Rundensteiner

Sponsor: Verizon Laboratories Incorporated

MQP Project Members: Tien Vu, Mirek Cymer, John Lee

Page 2: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

HTML vs. XML

Microsoft, IBM, Informix, Oracle, Sun, ...

Page 3: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

XML Data Management by RDBMS

Advantages:
• Efficient query and analysis tools.
• Mature database tools available.
• Easy integration with existing business databases.

Issues:
• Mapping between XML and the relational model.
• Update propagation.
• Query translation and optimization.

Page 4: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Motivation for Mapping

• Query performance varies with how the data is mapped.
• Flexible mapping: fixed translation and restructure.

[Figure: XML tree for a car element with make, model, and year children (values Ford, Mustang, 2001), alongside the alternate mapping below.]

Alternate Mapping: table car

make | model | year
Ford | Mustang | 2001
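To make the effect of the mapping choice concrete, the following is a minimal sketch in Python with SQLite (not the Oracle 8i setup used in the project). It stores the car example twice: once with one table per element, which is an assumed reading of "fixed translation", and once as the single alternate table. The `iid`/`pid`/`value` columns follow the SQL shown elsewhere in these slides; the table name `car_inlined` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Mapping 1 (assumed fixed translation): one table per element.
-- iid = element id, pid = parent element id, value = text content.
CREATE TABLE car   (iid INTEGER PRIMARY KEY, pid INTEGER);
CREATE TABLE make  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
CREATE TABLE model (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
CREATE TABLE year  (iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT);
INSERT INTO car   VALUES (1, NULL);
INSERT INTO make  VALUES (2, 1, 'Ford');
INSERT INTO model VALUES (3, 1, 'Mustang');
INSERT INTO year  VALUES (4, 1, '2001');

-- Mapping 2 (the alternate mapping): everything inlined into one table.
CREATE TABLE car_inlined (make TEXT, model TEXT, year TEXT);
INSERT INTO car_inlined VALUES ('Ford', 'Mustang', '2001');
""")

# Under mapping 1 the car has to be reassembled with joins on pid = iid.
joined = conn.execute("""
    SELECT make.value, model.value, year.value
    FROM car, make, model, year
    WHERE make.pid = car.iid
      AND model.pid = car.iid
      AND year.pid = car.iid
""").fetchall()

# Under mapping 2 the same answer is a plain scan of one table.
inlined = conn.execute(
    "SELECT make, model, year FROM car_inlined"
).fetchall()

print(joined)
print(inlined)
```

Both queries return the same row; the difference is only in how many tables the engine has to combine, which is the cost the restructuring operators are meant to control.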

Page 5: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Architecture

[Figure: Rainbow architecture. Inputs: DTD, XML, XML Query, XML User. Components: XML Query Engine, DTDM Manager, XML Manager, Restructuring Subsystem, RDBMS. Legend: XML Data, Subsystem.]

Page 6: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Goals of our MQP

What:
• Implement and evaluate restructuring subsystems within the large-scale Rainbow system.

How:
• Learn about database technologies and web tools.
• Translate research ideas into a software system design.
• Practice software engineering techniques: UML, engineering and reusing code.
• Design an experimental test plan and test bed.
• Conduct a performance study and analysis.

Page 7: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Restructuring Subsystem

[Figure: the Restructuring Subsystem within the Rainbow architecture (DTD, XML, XML Query, XML User, XML Query Engine, DTDM Manager, XML Manager). Restructuring components: Mapping, Restructure Operator Library, Restructurer, Query Storage. Legend: XML Model, Subsystem, Relational Model, Internal Process.]

Page 8: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Restructuring Operators

11 Restructuring Operators:
• Rename Item/Attribute
• Switch Nesting
• Pushup/Pushdown Attribute
• Pushup/Pushdown Nesting
• Split/Merge Nesting
• Reference/Dereference

Page 9: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Mapping: Sequence of Restructure Operators

A mapping is modeled as a sequence of reversible restructuring operators, each given as Operator Name + Parameters.

For example:

pushUpAttribute('account_number', 'value', 'invoice', 'account_number');
pushUpAttribute('bill_period', 'value', 'invoice', 'bill_period');
renameItem('invoice', 'summary');

[Figure: an invoice element whose account_num and bill_period children each hold a value is restructured into a summary element that carries account_num and bill_period directly.]
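The "operator name + parameters" model can be sketched in a few lines of Python. This is only an illustration, not Rainbow's Java implementation: it parses the textual mapping into (name, parameters) pairs and lists the operator names that would undo it, in reverse order. The slides spell only `pushUpAttribute` and `renameItem`; the name `pushDownAttribute` and the assumption that a rename is undone by another rename are hypothetical.

```python
import ast
import re

# Inverse operator names (assumptions: pushDownAttribute is a hypothetical
# spelling; renameItem is assumed to be undone by renameItem with its two
# parameters swapped).
INVERSE = {
    "pushUpAttribute": "pushDownAttribute",
    "pushDownAttribute": "pushUpAttribute",
    "renameItem": "renameItem",
}

# One operator call per line: name(args);
CALL = re.compile(r"(\w+)\((.*?)\);")


def parse_mapping(text):
    """Parse "op('a', 'b'); ..." into a list of (name, params) pairs."""
    steps = []
    for name, args in CALL.findall(text):
        # Evaluate the quoted argument list safely as a tuple of strings.
        params = tuple(ast.literal_eval("(" + args + ",)"))
        steps.append((name, params))
    return steps


def inverse_operator_names(steps):
    """Names of the operators that undo `steps`, last step first."""
    return [INVERSE[name] for name, _ in reversed(steps)]


mapping_text = """
pushUpAttribute('account_number', 'value', 'invoice', 'account_number');
pushUpAttribute('bill_period', 'value', 'invoice', 'bill_period');
renameItem('invoice', 'summary');
"""

steps = parse_mapping(mapping_text)
undo = inverse_operator_names(steps)
print(steps)
print(undo)
```

Keeping a mapping as plain data like this is one way to store it, replay it, or reverse it; the parameters of each inverse operator are deliberately left out because the slides do not specify them.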

Page 10: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

SQLs for Push-Up Attributes

CREATE VIEW new.A (<all-columns>, a) AS
SELECT A.<all-columns>, B.b
FROM old.A, old.B
WHERE B.pid = A.iid

CREATE VIEW new.B (<all-columns-but-b>) AS
SELECT B.<all-columns-but-b>
FROM old.B

[Figure: push-up of attribute b from child B into parent A, where it appears as attribute a.]
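The template above can be filled in mechanically once the column lists are known. The following Python sketch shows one way to do that; the function name `push_up_sql` and its parameter order are hypothetical and are not taken from the Rainbow code.

```python
def push_up_sql(parent, child, attr, new_name, parent_cols, child_cols):
    """Build the two CREATE VIEW statements of the push-up template.

    parent_cols / child_cols are the existing column names of the parent
    and child tables; attr is the child column being pushed up and
    new_name is the name it takes in the parent.
    """
    kept = [c for c in child_cols if c != attr]  # <all-columns-but-b>

    parent_view_cols = ", ".join(parent_cols + [new_name])
    parent_select = ", ".join(f"{parent}.{c}" for c in parent_cols)
    child_view_cols = ", ".join(kept)
    child_select = ", ".join(f"{child}.{c}" for c in kept)

    new_parent = (
        f"CREATE VIEW new.{parent} ({parent_view_cols}) AS\n"
        f"SELECT {parent_select}, {child}.{attr}\n"
        f"FROM old.{parent}, old.{child}\n"
        f"WHERE {child}.pid = {parent}.iid"
    )
    new_child = (
        f"CREATE VIEW new.{child} ({child_view_cols}) AS\n"
        f"SELECT {child_select}\n"
        f"FROM old.{child}"
    )
    return new_parent, new_child


parent_view, child_view = push_up_sql(
    "A", "B", "b", "a",
    parent_cols=["iid", "pid"],
    child_cols=["iid", "pid", "b"],
)
print(parent_view)
print(child_view)
```

The sketch does plain string substitution and does no quoting or validation of identifiers, so it is suitable only for trusted, schema-derived names.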

Page 11: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Example SQLs

Inline: make.value into car as attribute make.

Mapping:

pushUpAttribute('account_number', 'value', 'invoice', 'account_number');

SQL statements:

CREATE VIEW new.invoice (iid, pid, account_number) AS
SELECT invoice.iid, invoice.pid, account_number.value
FROM old.invoice, old.account_number
WHERE account_number.pid = invoice.iid

CREATE VIEW new.account_number (iid, pid) AS
SELECT account_number.iid, account_number.pid
FROM old.account_number
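The two views can be tried out on a toy database. The sketch below uses Python's sqlite3 module rather than Oracle 8i, so it is an adaptation and not the project's setup: SQLite views cannot span attached schemas, so the `old.` and `new.` schema qualifiers are replaced by `old_` and `new_` name prefixes. The sample row values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- "old" schema: one table per element (iid = element id, pid = parent id).
CREATE TABLE old_invoice (iid INTEGER PRIMARY KEY, pid INTEGER);
CREATE TABLE old_account_number (
    iid INTEGER PRIMARY KEY, pid INTEGER, value TEXT
);
INSERT INTO old_invoice VALUES (1, NULL);
INSERT INTO old_account_number VALUES (2, 1, '555 777-3158 573 234');

-- "new" schema: the two views from the slide, with prefixed names.
CREATE VIEW new_invoice (iid, pid, account_number) AS
SELECT old_invoice.iid, old_invoice.pid, old_account_number.value
FROM old_invoice, old_account_number
WHERE old_account_number.pid = old_invoice.iid;

CREATE VIEW new_account_number (iid, pid) AS
SELECT old_account_number.iid, old_account_number.pid
FROM old_account_number;
""")

# The account number is now readable from the invoice without a join.
invoice_rows = conn.execute(
    "SELECT iid, pid, account_number FROM new_invoice"
).fetchall()

# The child view no longer exposes the pushed-up value column.
cursor = conn.execute("SELECT * FROM new_account_number")
child_columns = [d[0] for d in cursor.description]

print(invoice_rows)
print(child_columns)
```

Because the restructured schema is defined as views over the old tables, no data is copied; the join is still evaluated whenever `new_invoice` is queried unless the database materializes it.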

Page 12: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Implementation

Development Tools
• Java: Visual Café2, Javadocs, JAVA2
• Oracle 8i, XML 4J, JDBC 1.2, SQL queries

Code Facts
• 44 total system classes
• 17 classes of Rainbow
• 27 classes reused
• ? lines of system code
• ? lines of Rainbow code
• ? lines of code reused

[Figure legend: new, re-use.]

Page 13: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Screen Shot

Page 14: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Screen Shot

Page 15: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Test & Experimental Evaluation

Experimental Setup
• Oracle 8i
• Windows NT

Data
• Created a DTD
• Randomly generated XML
• Hand-translated queries

Factors
• Type of query
• Number of operations
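The slides do not show how the random XML was produced. As a rough sketch of the idea, the Python below generates a reproducible random document; the element and attribute names are borrowed from the invoice example elsewhere in this deck, while the function name `random_invoice`, the value ranges, and the flat rate of 0.05 per minute are assumptions made for illustration.

```python
import random
import xml.etree.ElementTree as ET


def random_invoice(rng, n_calls):
    """Build a random <invoice> element with n_calls itemized calls."""
    invoice = ET.Element("invoice")
    ET.SubElement(invoice, "account_number").text = str(rng.randrange(10**9))

    total_cents = 0
    for no in range(1, n_calls + 1):
        minutes = rng.randint(1, 10)
        cents = 5 * minutes  # assumed flat rate of 0.05 per minute
        total_cents += cents
        ET.SubElement(
            invoice,
            "itemized_call",
            no=str(no),
            min=str(minutes),
            amount=f"{cents / 100:.2f}",
        )

    ET.SubElement(invoice, "total").text = f"${total_cents / 100:.2f}"
    return invoice


# A fixed seed makes the generated test data repeatable across runs.
sample = ET.tostring(random_invoice(random.Random(0), 3), encoding="unicode")
print(sample)
```

Seeding the generator matters for a test bed: the same documents can be regenerated for every mapping under test, so timing differences come from the mapping and not from the data.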

Page 16: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Query Performance Evaluation

[Figure: Query Performance vs. # Restructuring. x-axis: # Operations (0 to 10); y-axis: Query Performance (s) (0 to 0.2); series: pushUpAttribute.]

Page 17: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Conclusions

Technical accomplishments
• Functional prototype system
• Feasibility of Rainbow concepts
• Automated test bed designed
• Performance evaluations show that: (Ideal) moving up data on the embedded-relational level yields better query performance for join queries.

Knowledge gained
• OO, Java, JDBC, SQL, RDBMS, XML, DTD
• Teamwork, software engineering, and software reuse
• Logistics of setting up an experiment

Future work
• Experiment test plans and test beds to realize the full potential of the restructuring component.

Page 18: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow: XML and Relational Database Design, Implementation, and Evaluation

Project Members: Tien Vu, Mirek Cymer, John Lee

Advisor: Elke A. Rundensteiner

Ph.D. Student: Xin Zhang

Sponsored by: Verizon Laboratories Incorporated

Visit Rainbow at http://davis.wpi.edu/dsrg/TJM/

Page 19: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Recycled!!!

Page 20: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

XML: The Future of the Web

Benefits:
• Efficient query and analysis tools.
• Mature data warehousing support.
• Easy integration with existing business databases.

Applications:
• E-commerce
• Web-based industries

<invoice>

<account_number>555 777-3158 573 234 </account_number>

<bill_period>Jun 9 - Jul 8, 2000</bill_period>

<carrier>Sprint</carrier>

<itemized_call no="1" date="JUN 10" number_called="973 555-8888" time="10:17pm" rate="NIGHT" min="1" amount="0.05" />

<itemized_call no="2" date="JUN 13" number_called="973 650-2222" time="10:19pm" rate="NIGHT" min="1" amount="0.05" />

<itemized_call no="3" date="JUN 15" number_called="206 365-9999" time="10:25pm" rate="NIGHT" min="3" amount="0.15" />

<total>$0.25</total>

</invoice>
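To connect a document like this invoice to relational storage, here is a minimal sketch that walks the XML tree and produces one list of rows per element name, each row carrying an `iid` (element id) and a `pid` (parent id) as in the SQL shown elsewhere in these slides. The loading algorithm itself is not described in the deck, so the `shred` function and its id numbering are assumptions; the document is shortened to one itemized call.

```python
import xml.etree.ElementTree as ET

INVOICE_XML = """<invoice>
<account_number>555 777-3158 573 234</account_number>
<bill_period>Jun 9 - Jul 8, 2000</bill_period>
<carrier>Sprint</carrier>
<itemized_call no="1" date="JUN 10" number_called="973 555-8888"
               time="10:17pm" rate="NIGHT" min="1" amount="0.05" />
<total>$0.25</total>
</invoice>"""


def shred(root):
    """Return {element name: [row, ...]}; each row has iid and pid keys."""
    tables = {}
    next_id = 1
    stack = [(root, None)]  # (element, parent iid)
    while stack:
        elem, pid = stack.pop()
        iid = next_id
        next_id += 1

        row = {"iid": iid, "pid": pid}
        text = (elem.text or "").strip()
        if text:
            row["value"] = text   # text content becomes a value column
        row.update(elem.attrib)   # XML attributes become extra columns
        tables.setdefault(elem.tag, []).append(row)

        # Push children in reverse so they are numbered in document order.
        for child in reversed(list(elem)):
            stack.append((child, iid))
    return tables


tables = shred(ET.fromstring(INVOICE_XML))
for name, rows in tables.items():
    print(name, rows)
```

Each key of `tables` corresponds to one relational table under a one-table-per-element layout, which is the starting point that operators such as pushUpAttribute would then restructure.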

Page 21: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

XML and Relational Database

Problem
• Many applications change their data very frequently, e.g., flight reservation, online billing, inventory.

Current Solution
• Reloading the complete XML document whenever it changes, which is very expensive.

Rainbow Solution
• Incrementally propagate XML document updates to the stored XML data.
• Goal: XML repository implemented using an RDBMS
• Approach: flexible mapping
• Features:
  • DTD metadata management in RDB
  • Automatic schema creation
  • Incremental update propagation
  • XML query optimization

Page 22: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Analysis

[Figure: Exp1: Batch vs Series. x-axis: # of Operations (0 to 10); y-axis: Time (s) (0 to 2.5); series: avg, serial, Linear (avg).]

Page 23: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

Rainbow Analysis Cont.

[Figure: Time vs. Data Size. x-axis: Data Size (KB) (0 to 2500); y-axis: Time (s) (0.0 to 0.6); series: renameitem, renameattribute, pushupattribute, pushdownattribute.]

Page 24: Rainbow: Bridging XML and Relational Databases Design, Implementation, and Evaluation

HTML vs. XML

HTML
<h1>Car</h1>
<h2>Make</h2>
<p>Ford Mustang
<h2>Seats</h2>
<p>5
<h2>Top Speed</h2>
<p>70 m.p.h

XML
<h1>Car</h1>
<make>Ford Mustang</make>
<seats>5</seats>
<speed units="mph">70</speed>
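The practical difference is that the XML version names its data, so a program can ask for it directly. A small sketch with Python's standard XML parser follows; the wrapping `<car>` element is an assumption added so the fragment forms a single well-formed document, and the heading line is left out.

```python
import xml.etree.ElementTree as ET

# <car> is a hypothetical wrapper element; the fields are from the slide.
doc = ET.fromstring("""<car>
<make>Ford Mustang</make>
<seats>5</seats>
<speed units="mph">70</speed>
</car>""")

# Values are addressed by element name, and units travel with the number.
speed = doc.find("speed")
top_speed = (int(speed.text), speed.get("units"))
seats = int(doc.findtext("seats"))

print(top_speed, seats)
```

In the HTML version the same facts sit in generic heading and paragraph tags, so extracting "top speed" would depend on page layout and on parsing the string "70 m.p.h".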