Efficient XSLT Processing in Relational Database System Zhen Hua Liu Anguel Novoselsky Oracle Corporation VLDB 2006



Page 1: Efficient XSLT Processing in Relational Database System

Efficient XSLT Processing in Relational Database System

Zhen Hua Liu, Anguel Novoselsky

Oracle Corporation

VLDB 2006

Page 2: Efficient XSLT Processing in Relational Database System

Agenda

XML Processing Languages Overview
XSLT Processing - Coprocessor vs. Integrated Approach
XSLT to XQuery Rewrite Translation Technique
Performance Evaluation
Conclusion
Q & A

Page 3: Efficient XSLT Processing in Relational Database System

XML Processing Languages

XQuery/XPath
XSLT
SQL/XML: integration of XML and XQuery/XPath into SQL (http://www.sqlx.org)
SQL/XML (Oracle XMLDB)
  XMLTransform: Oracle XMLDB extension operator, embedding XSLT in SQL

SELECT XMLTransform(emp.resume, 'XSLT code') FROM emp; -- resume is an XMLType column of table emp

Page 4: Efficient XSLT Processing in Relational Database System

XQuery, XSLT, SQL/XML Comparison

All are declarative languages! (Central dogma in declarative query processing)
Share a common XQuery Data Model
  SQL/XML XMLType is based on the XQuery Data Model
XQuery & SQL/XML
  More database centric
  Share the same paradigm: Selection, Projection (XML Construction), Join, Order
XSLT
  Database foreign: template rule matching based execution model
  XSLT is more declarative than XQuery
  Impedance mismatch with the XQuery/SQL processing model: how to run XSLT in an RDBMS?

Page 5: Efficient XSLT Processing in Relational Database System

Multi-Coprocessors Approach

Embed off-the-shelf XQuery/XSLT processors into a SQL engine

[Diagram: the SQL Engine is connected to an XQuery Engine through XQuery (XMLQuery) and to an XSLT Engine through XSLT (Transform). Each connection is labeled "XQuery Data Model instances input" and "XQuery DM instances output".]

Page 6: Efficient XSLT Processing in Relational Database System

Issues & Challenges with Coprocessor Approach

Fully composable in SQL/XML: can we optimize XSLT, XQuery, and XPath as one language? Is cross-language optimization feasible?

XML is stored & indexed: can XSLT processing leverage indexes on XML in the RDBMS?

How to make the XSLT template rule matching based execution model "fit" into the RDBMS processing model?

Page 7: Efficient XSLT Processing in Relational Database System

XQuery/XSLT/SQL/XML Integrated Architecture

[Architecture diagram. Labels shown: XQuery; SQL/XML; XSLT with "XSLT to XQuery Rewrite"; "Common algebraic operator tree - storage independent optimization"; "XML Storage/Index dependent Optimization"; XMLType Abstraction; and the storage models OR Storage, SQLX View/Relational Data, Binary XML/XMLIndex, Extended/Hybrid Model.]

Page 8: Efficient XSLT Processing in Relational Database System

XSLT to XQuery Rewrite Translation

Page 9: Efficient XSLT Processing in Relational Database System

General XSLT to XQuery Translation Technique

Fokoue et al., "Compiling XSLT 2.0 into XQuery 1.0", WWW 2005
  Translate each XSLT template into an XQuery function
  Translate each XSLT instruction into the corresponding XQuery construct
  Translate <xsl:apply-templates> into XQuery function calls with a large XQuery conditional expression matching XSLT patterns
Issues
  Resultant XQuery is cumbersome and requires aggressive optimization
  Where to add intelligence in the translation?
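The generic translation scheme above can be mimicked in a few lines of Python, purely as an illustration of why the result is cumbersome. This is a sketch, not the translation from the cited paper: the function names (`tmpl_dept`, `apply_templates`, and so on) are hypothetical, Python functions stand in for XQuery functions, and the `if`/`elif` chain stands in for the large conditional expression that matches XSLT patterns at run time.

```python
import xml.etree.ElementTree as ET

# Each XSLT template becomes one function (hypothetical names).
def tmpl_dept(node):
    return "<H1>HIGHLY PAID DEPT EMPLOYEES</H1>" + apply_templates(node)

def tmpl_dname(node):
    return f"<H2>Department name: {node.text}</H2>"

def tmpl_loc(node):
    return f"<H2>Department location: {node.text}</H2>"

def apply_templates(parent):
    """Stand-in for <xsl:apply-templates/>: every child element is tested
    against every pattern in a conditional chain at run time."""
    out = []
    for child in parent:  # iterates over child elements only
        if child.tag == "dept":
            out.append(tmpl_dept(child))
        elif child.tag == "dname":
            out.append(tmpl_dname(child))
        elif child.tag == "loc":
            out.append(tmpl_loc(child))
        else:
            # Fallback loosely modeled on the built-in rule: recurse.
            out.append(apply_templates(child))
    return "".join(out)

doc = ET.fromstring(
    "<root><dept><dname>ACCOUNTING</dname><loc>NEW YORK</loc></dept></root>"
)
result = apply_templates(doc)
print(result)
```

Because the dispatch chain knows nothing about the input structure, it is re-evaluated for every node, and an optimizer has to work hard to discover which branch can actually fire at each call site.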

Page 10: Efficient XSLT Processing in Relational Database System

XSLT to XQuery Rewrite example

XMLType view over relational data

CREATE VIEW dept_emp AS
SELECT XMLElement("dept",
         XMLElement("dname", dname),
         XMLElement("loc", loc),
         XMLElement("employees",
           (SELECT XMLAgg(XMLElement("emp",
                     XMLElement("empno", empno),
                     XMLElement("ename", ename),
                     XMLElement("sal", sal)))
            FROM emp
            WHERE emp.deptno = dept.deptno))) AS dept_content
FROM dept

Page 11: Efficient XSLT Processing in Relational Database System

Example contd.

Result of XMLType View
============================================================
<dept>
  <dname>ACCOUNTING</dname>
  <loc>NEW YORK</loc>
  <employees>
    <emp>
      <empno>7782</empno>
      <ename>CLARK</ename>
      <sal>2450</sal>
    </emp>
    <emp>
      <empno>7934</empno>
      <ename>MILLER</ename>
      <sal>1300</sal>
    </emp>
  </employees>
</dept>

Page 12: Efficient XSLT Processing in Relational Database System

Example- XSLT on XMLType -1

SELECT XMLTransform(dept_emp.dept_content,
'<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="dept">
    <H1>HIGHLY PAID DEPT EMPLOYEES</H1>
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="dname">
    <H2>Department name: <xsl:value-of select="."/></H2>
  </xsl:template>
  <xsl:template match="loc">
    <H2>Department location: <xsl:value-of select="."/></H2>
  </xsl:template>

Page 13: Efficient XSLT Processing in Relational Database System

Example- XSLT on XMLType -2

  <xsl:template match="employees">
    <H2>Employees Table</H2>
    <table border="2">
      <td><b>EmpNo</b></td>
      <td><b>Name</b></td>
      <td><b>Weekly Salary</b></td>
      <xsl:apply-templates select="emp[sal > 2000]"/>
    </table>
  </xsl:template>
  <xsl:template match="emp">
    <tr>
      <td><xsl:value-of select="empno"/></td>
      <td><xsl:value-of select="ename"/></td>
      <td><xsl:value-of select="sal"/></td>
    </tr>
  </xsl:template>

Page 14: Efficient XSLT Processing in Relational Database System

Example- XSLT on XMLType -3

  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>')
FROM dept_emp;

Page 15: Efficient XSLT Processing in Relational Database System

Input XML Structural Info with XSLT template Analysis

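[Diagram: the input XML structure drawn as a tree (root; dept; dname, loc, employees; emp; text leaves), with each node annotated by the template that matches it: default template, dept template, dname template, loc template, employees template, emp template, text template.]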

Page 16: Efficient XSLT Processing in Relational Database System

XSLT Template Invocation Graph

[Diagram: template invocation graph. Nodes are the default template, dept template, dname template, loc template, employees template, emp template, and text template. Edges are labeled with the matched input nodes: root, dept, dname, loc, employees, emp[sal>2000], text.]
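A template invocation graph of this kind can be derived by walking the known input structure once and recording which template each apply-templates call can reach. The sketch below is an illustration under simplifying assumptions, not the algorithm from the talk: the `STRUCTURE` and `TEMPLATES` encodings and the function `invocation_graph` are hypothetical, patterns are plain element names, predicates such as `[sal > 2000]` are dropped, and text nodes are left out.

```python
# Structural information (hypothetical encoding): element -> child elements.
STRUCTURE = {
    "root": ["dept"],
    "dept": ["dname", "loc", "employees"],
    "employees": ["emp"],
    "dname": [],
    "loc": [],
    "emp": [],
}

# For each template: which children its apply-templates selects.
# "*" = all children, a list = named children only, None = no apply-templates.
TEMPLATES = {
    "dept": "*",
    "dname": None,
    "loc": None,
    "employees": ["emp"],  # stands for select="emp[sal > 2000]"
    "emp": None,
}

def invocation_graph(structure, templates, start="root"):
    """Walk the structure once and record (caller, callee) template edges."""
    edges = []

    def visit(element, caller):
        # Elements without a user template fall back to the built-in rule,
        # which processes all children.
        selects = templates.get(element, "*")
        if selects is None:
            return
        children = structure[element] if selects == "*" else selects
        for child in children:
            callee = child if child in templates else "default"
            edges.append((caller, callee))
            visit(child, callee)

    visit(start, "default")
    return edges

graph = invocation_graph(STRUCTURE, TEMPLATES)
for caller, callee in graph:
    print(f"{caller} template -> {callee} template")
```

Once every call site has a single known callee, each template body can be inlined into its caller, which is what removes the run-time pattern matching from the generated query.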

Page 17: Efficient XSLT Processing in Relational Database System

Example- XQuery from XSLT

SELECT XMLQuery('
declare variable $var000 := .;
(: builtin template :)
(
  let $var002 := $var000/dept
  return
    (: <xsl:template match="dept"> :)
    (
      <H1>HIGHLY PAID DEPT EMPLOYEES</H1>,
      (
        let $var003 := $var002/dname
        return
          (: <xsl:template match="dname"> :)
          <H2>{fn:concat("Department name: ", fn:string($var003))}</H2>,
        let $var003 := $var002/loc
        return
          (: <xsl:template match="loc"> :)
          <H2>{fn:concat("Department location: ", fn:string($var003))}</H2>,

Page 18: Efficient XSLT Processing in Relational Database System

Example- XQuery from XSLT

        let $var003 := $var002/employees
        return
          (: <xsl:template match="employees"> :)
          (
            <H2>Employees Table</H2>,
            <table border="2">
            {
              <td><b>EmpNo</b></td>,
              <td><b>Name</b></td>,
              <td><b>Weekly Salary</b></td>,
              (

Page 19: Efficient XSLT Processing in Relational Database System

Example- XQuery from XSLT Rewrite

                for $var005 in ($var003/emp[sal > 2000])
                return
                  (: <xsl:template match="emp"> :)
                  <tr>
                    <td>{fn:string($var005/empno)}</td>
                    <td>{fn:string($var005/ename)}</td>
                    <td>{fn:string($var005/sal)}</td>
                  </tr>
              )
            }
            </table>
          )
      )
    )
)' PASSING dept_emp.dept_content RETURNING CONTENT)
FROM dept_emp

Page 20: Efficient XSLT Processing in Relational Database System

Final Optimized SQL/XML Query

SELECT XMLConcat(
         XMLElement("H1", 'HIGHLY PAID DEPT EMPLOYEES'),
         XMLElement("H2", 'Department name: ' || "SYS_ALIAS_4"."DNAME"),
         XMLElement("H2", 'Department location: ' || "SYS_ALIAS_4"."LOC"),
         XMLElement("H2", 'Employees Table'),
         XMLElement("table", XMLAttributes('2' AS "border"),
           XMLElement("td", XMLElement("b", 'EmpNo')),
           XMLElement("td", XMLElement("b", 'Name')),
           XMLElement("td", XMLElement("b", 'Weekly Salary')),
           (SELECT XMLAGG(
                     XMLElement("tr",
                       XMLElement("td", "EMP"."EMPNO"),
                       XMLElement("td", "EMP"."ENAME"),
                       XMLElement("td", "EMP"."SAL")))
            FROM EMP
            WHERE SAL > 2000 AND DEPTNO = DEPT.DEPTNO)))
FROM DEPT

Page 21: Efficient XSLT Processing in Relational Database System

XSLT to XQuery Rewrite Key

Leverage XML structural information to generate the Template Invocation Graph
  XML Schema, DTD, SQL/XML construction functions
Inline each template with its caller
This generates compact XQuery amenable to further optimization
  Cancellation with XML view / relational data
  Path/Value Index for binary XML

Page 22: Efficient XSLT Processing in Relational Database System

Partial Evaluation

Partial Evaluation to obtain the template invocation graph
  Application computation is described as F(X,Y), where X changes less frequently than Y and a significant part of F's computation depends on X.
  Optimize F statically by holding X constant
Key observation: let F be the XSLT stylesheet, X be the input XML structural information, and Y be the actual XML instance document content
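The F(X, Y) idea can be shown with a toy specializer. This is only a sketch of partial evaluation in general, assuming a deliberately simple F; the names `transform`, `specialize`, `STRUCTURE_X`, and `row_y` are hypothetical and do not come from the talk. Here X is a list of field names (standing in for the input structure) and Y is one row of values (standing in for the instance content).

```python
def transform(structure, row):
    """General F(X, Y): walks the structure X on every call."""
    parts = []
    for field in structure:
        parts.append(f"<{field}>{row[field]}</{field}>")
    return "".join(parts)

def specialize(structure):
    """Partial evaluation: everything that depends only on X is folded into
    a precomputed format string; the residual function depends only on Y."""
    fmt = "".join(f"<{f}>{{{f}}}</{f}>" for f in structure)

    def residual(row):
        return fmt.format(**row)

    return residual

STRUCTURE_X = ["empno", "ename", "sal"]                  # X: known statically
row_y = {"empno": 7782, "ename": "CLARK", "sal": 2450}   # Y: changes per call

emp_to_xml = specialize(STRUCTURE_X)  # done once, while X is held constant
print(emp_to_xml(row_y))
```

The residual function gives the same answer as the general one but no longer inspects X at run time, which mirrors how fixing the input structure lets template dispatch be resolved before any instance document is read.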

Page 23: Efficient XSLT Processing in Relational Database System

Comparison with Related Work

Fokoue et al., "Compiling XSLT 2.0 into XQuery 1.0", WWW 2005
  Not leveraging input XML structure information for optimizing the XQuery generation process
  Concluded context sensitive flow analysis & function specialization for static optimization
Our work
  Optimization of XSLT based on input XML structure
  Leverage Partial Evaluation for obtaining the template invocation graph

Page 24: Efficient XSLT Processing in Relational Database System

Comparison with Related Work

Moerkotte, "Incorporating XSL Processing Into Database Engines", VLDB 2002
  XSLT into internal algebra and integrate with RDBMS
  Concluded future research in combined optimizations of XSLT with XML construction
Our work
  XSLT into XQuery
  Combined optimizations of XSLT with XML input

Page 25: Efficient XSLT Processing in Relational Database System

Comparison with Related Work

Jain et al., "Translating XSLT Programs to Efficient SQL Queries", WWW 2002, and Li et al., "Composing XSL Transformations with XML Publishing Views", SIGMOD 2003
Our work
  XQuery as intermediate language
  Work with any XML storage/index model

Page 26: Efficient XSLT Processing in Relational Database System

Performance Evaluation

Page 27: Efficient XSLT Processing in Relational Database System

XSLT Mark

Db-one row query: Index Probe (Table Scan vs. Index Scan)
Rewrite: Integrated Approach
No-Rewrite: Coprocessor Approach

[Chart: Rewrite vs. No-Rewrite at input sizes 8M, 16M, 32M, and 64M; vertical axis from 0 to 7000, units not given.]

Page 28: Efficient XSLT Processing in Relational Database System

XSLT Mark

Avts, metric: XML construction
Chart, total: XSLT uses XQuery aggregate functions: count/sum

[Chart: No-Rewrite vs. Rewrite for the avts, chart, metric, and total cases; vertical axis from 0 to 1200, units not given.]

Page 29: Efficient XSLT Processing in Relational Database System

Conclusions

Efficient XSLT processing in RDBMS is feasible despite the template rule based XSLT language

Use XQuery as intermediate language to which XSLT is translated

Leverage XML input structural information to get efficient & compact XQuery

Index probing, pull based execution model, parallel aggregation, and sort are applicable to XSLT in the RDBMS engine

XSLT is native to the RDBMS, just like XQuery and SQL/XML

Page 30: Efficient XSLT Processing in Relational Database System

Questions