Chapter 23 XML Maziar Sanaii Ashtiani – 105405 SCT – EMU, Fall 2011/12


Page 1:

Chapter 23 XML

Maziar Sanaii Ashtiani – 105405 SCT – EMU, Fall 2011/12

Page 2:

Introduction and Motivation

HTTP
Standard Generalized Markup Language
eXtensible Markup Language
▪ Useful as a data format to exchange between apps
▪ Markup is anything in the document that is not part of its printed content
▪ Tags are enclosed in angle brackets
▪ <title>Database System Concepts</title>

Page 3:

Freedom

<university>
  <department>
    <dept_name> Comp. Sci. </dept_name>
    <building> Taylor </building>
    <budget> 100000 </budget>
  </department>
  <course>
    <course_id> CS-101 </course_id>
    <title> Intro. to Computer Science </title>
    <dept_name> Comp. Sci. </dept_name>
    <credits> 4 </credits>
  </course>
  <instructor>
    <IID> 10101 </IID>
    <name> Srinivasan </name>
    <dept_name> Comp. Sci. </dept_name>
    <salary> 65000 </salary>
  </instructor>
  <teaches>
    <IID> 10101 </IID>
    <course_id> CS-101 </course_id>
  </teaches>
</university>

Page 4:

Advantages

Tags are self-documenting
No rigid format
Can evolve over time
Nested structures
Widely accepted
Lots of tools

XML has become THE dominant format for data exchange

Page 5:

Structure

Elements
Single root
Proper nesting
▪ Properly nested: <course> . . . <title> . . . </title> . . . </course>
▪ Not properly nested: <course> . . . <title> . . . </course> . . . </title>
Text in the context of an element
May be mixed with subelements
Nesting to avoid joins (fig. 23.5, 23.6)
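
The single-root and proper-nesting rules can be checked with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative and not taken from the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative inputs (assumed, not from the slides).
proper = "<course><title>Intro</title></course>"
improper = "<course><title>Intro</course></title>"  # tags closed in the wrong order
two_roots = "<course/><course/>"                    # violates the single-root rule

print(is_well_formed(proper), is_well_formed(improper), is_well_formed(two_roots))
```

A parser rejects the second and third inputs before any schema is consulted, since well-formedness is independent of DTD or XML Schema validation.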

Page 6:

Structure (Cont’d)

Attributes
▪ name = value
▪ Values are strings
▪ Useful as identifiers
Namespace
▪ <university xmlns:yale="http://www.yale.edu">
Literal values
▪ <![CDATA[<course> · · · </course>]]>
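
A minimal sketch of how attributes, a namespace prefix, and a CDATA section look to a parser, using Python's standard `xml.etree.ElementTree`. The document below is an assumption built around the slide's `xmlns:yale` declaration; the `note` element and the `yale:`-prefixed tags are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: only the xmlns:yale declaration comes from the slide.
doc = """<university xmlns:yale="http://www.yale.edu">
  <yale:course course_id="CS-101">
    <yale:title>Intro. to Computer Science</yale:title>
    <note><![CDATA[<course> literal text </course>]]></note>
  </yale:course>
</university>"""

root = ET.fromstring(doc)
ns = {"yale": "http://www.yale.edu"}  # prefix-to-URI map used for lookups

course = root.find("yale:course", ns)
course_id = course.get("course_id")   # attribute values are always strings
qualified_tag = course.tag            # the parser expands the prefix to the URI
literal = course.findtext("note")     # CDATA content arrives as plain text

print(course_id)
print(qualified_tag)
print(literal)
```

The CDATA content is returned as ordinary text, so the `<course>` inside it is never treated as markup.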

Page 7:

XML Document Schema

Databases have schemas
XML
▪ Document Type Definition
▪ XML Schema
▪ RELAX NG

Page 8:

DTD

<!DOCTYPE university [
  <!ELEMENT university ( (department|course|instructor|teaches)+)>
  <!ELEMENT department (dept_name, building, budget)>
  <!ELEMENT course (course_id, title, dept_name, credits)>
  <!ELEMENT instructor (IID, name, dept_name, salary)>
  <!ELEMENT teaches (IID, course_id)>
  <!ELEMENT dept_name (#PCDATA)>
  <!ELEMENT building (#PCDATA)>
  <!ELEMENT budget (#PCDATA)>
  <!ELEMENT course_id (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT credits (#PCDATA)>
  <!ELEMENT IID (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT salary (#PCDATA)>
]>

Page 9:

DTD (Cont’d)

<!DOCTYPE university-3 [
  <!ELEMENT university ( (department|course|instructor)+)>
  <!ELEMENT department (building, budget)>
  <!ATTLIST department
      dept_name ID #REQUIRED >
  <!ELEMENT course (title, credits)>
  <!ATTLIST course
      course_id ID #REQUIRED
      dept_name IDREF #REQUIRED
      instructors IDREFS #IMPLIED >
  <!ELEMENT instructor (name, salary)>
  <!ATTLIST instructor
      IID ID #REQUIRED
      dept_name IDREF #REQUIRED >
  · · · declarations for title, credits, building, budget, name and salary · · ·
]>
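
The ID, IDREF and IDREFS declarations amount to two checks: ID values are unique across the document, and every reference names an existing ID. Python's standard library does not validate against a DTD, so the sketch below hand-codes those two checks; the function name, the attribute tables, and the sample data (including the ID values `CompSci`, `I10101`, `I99999`) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute roles, transcribed from the university-3 ATTLIST declarations.
ID_ATTR = {"department": "dept_name", "course": "course_id", "instructor": "IID"}
IDREF_ATTRS = {"course": ["dept_name"], "instructor": ["dept_name"]}
IDREFS_ATTRS = {"course": ["instructors"]}

def reference_errors(root):
    """Return messages for duplicate IDs and references to missing IDs."""
    ids, errors = set(), []
    for el in root:
        value = el.get(ID_ATTR.get(el.tag, ""))
        if value is None:
            continue
        if value in ids:
            errors.append(f"duplicate ID {value}")
        ids.add(value)
    for el in root:
        refs = [el.get(a) for a in IDREF_ATTRS.get(el.tag, [])]
        for a in IDREFS_ATTRS.get(el.tag, []):
            refs.extend(el.get(a, "").split())  # IDREFS is a space-separated list
        errors.extend(f"dangling reference {r}" for r in refs if r and r not in ids)
    return errors

# Hypothetical instance: instructor I99999 is referenced but never declared.
SAMPLE = """<university-3>
  <department dept_name="CompSci"><building>Taylor</building><budget>100000</budget></department>
  <course course_id="CS-101" dept_name="CompSci" instructors="I10101 I99999">
    <title>Intro. to Computer Science</title><credits>4</credits>
  </course>
  <instructor IID="I10101" dept_name="CompSci"><name>Srinivasan</name><salary>65000</salary></instructor>
</university-3>"""

errors = reference_errors(ET.fromstring(SAMPLE))
print(errors)
```

All IDs share one pool here, so a course's `dept_name` could point at an instructor's ID without any error being reported.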

Page 10:

DTD Limitations

No type constraints on text elements and attributes
▪ Data verification needed
No numeric limits on how often an element occurs (only ?, * and +)
Lack of typing for ID and IDREF
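
Because a DTD treats element text as untyped character data, an application has to verify values itself. A minimal sketch of such a check in Python, assuming element names with underscores and made-up sample data; the function name `untyped_budgets` is hypothetical.

```python
from decimal import Decimal, InvalidOperation
import xml.etree.ElementTree as ET

# Both departments satisfy a DTD that declares budget as #PCDATA.
doc = """<university>
  <department><dept_name>Comp. Sci.</dept_name><building>Taylor</building><budget>100000</budget></department>
  <department><dept_name>Biology</dept_name><building>Watson</building><budget>lots</budget></department>
</university>"""

def untyped_budgets(root):
    """Return the names of departments whose budget text is not a decimal number."""
    bad = []
    for dept in root.iter("department"):
        text = (dept.findtext("budget") or "").strip()
        try:
            Decimal(text)
        except InvalidOperation:
            bad.append(dept.findtext("dept_name"))
    return bad

bad_departments = untyped_budgets(ET.fromstring(doc))
print(bad_departments)
```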

Page 11:

XML Schema

Result of deficiencies in DTD
Has string, integer, decimal, …
User-defined types

Page 12:

XML Schema (Cont’d)

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="university" type="UniversityType"/>
  <xs:element name="department">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="dept_name" type="xs:string"/>
        <xs:element name="building" type="xs:string"/>
        <xs:element name="budget" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="course">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="course_id" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="dept_name" type="xs:string"/>
        <xs:element name="credits" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="UniversityType">
    <xs:sequence>
      <xs:element ref="department" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="course" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="instructor" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="teaches" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

<xs:attribute name="dept_name"/>

Page 13:

XML Schema (Cont’d)

PK
<xs:key name="deptKey">
  <xs:selector xpath="/university/department"/>
  <xs:field xpath="dept_name"/>
</xs:key>

FK
<xs:keyref name="courseDeptFKey" refer="deptKey">
  <xs:selector xpath="/university/course"/>
  <xs:field xpath="dept_name"/>
</xs:keyref>
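
The key and keyref declarations pair a selector (which elements) with a field (which value). The standard Python library has no XML Schema validator, so the sketch below only mimics what `deptKey` and `courseDeptFKey` require; the helper names and sample data are assumptions, and the selector paths are written relative to the root element because `ElementTree` does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

def field_values(root, selector, field):
    """Return the text of `field` for each element matched by `selector`."""
    return [el.findtext(field) for el in root.findall(selector)]

def check_key_and_keyref(root):
    keys = field_values(root, "department", "dept_name")  # deptKey
    refs = field_values(root, "course", "dept_name")      # courseDeptFKey
    duplicate_keys = sorted({k for k in keys if keys.count(k) > 1})
    dangling_refs = sorted({r for r in refs if r not in keys})
    return duplicate_keys, dangling_refs

# Hypothetical instance: the History course refers to a missing department.
doc = """<university>
  <department><dept_name>Comp. Sci.</dept_name></department>
  <department><dept_name>Biology</dept_name></department>
  <course><course_id>CS-101</course_id><dept_name>Comp. Sci.</dept_name></course>
  <course><course_id>HIS-351</course_id><dept_name>History</dept_name></course>
</university>"""

duplicate_keys, dangling_refs = check_key_and_keyref(ET.fromstring(doc))
print(duplicate_keys, dangling_refs)
```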

Page 14:

XML Schema Benefits

Constraints
User-defined types
PK and FK
Integrated namespaces
Min and Max values
Type extension by inheritance

Page 15:

Query and Transformation

XPath
▪ Language for path expressions
XQuery
▪ Standard language for querying XML
▪ Modeled after SQL but different
▪ Deals with nested XML data

Page 16:

Tree Model of XML and XPath

Trees and nodes
Elements and attributes
XPath 2.0
/university-3/instructor/name
▪ <name>Srinivasan</name>
▪ <name>Brandt</name>
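
A small sketch of the tree model: each element is a node, and its attributes and subelements hang below it. The generator `walk` is a hypothetical helper, and marking attribute nodes with `@` follows XPath notation.

```python
import xml.etree.ElementTree as ET

def walk(node, depth=0):
    """Yield (depth, label) for element and attribute nodes in document order."""
    yield depth, node.tag
    for name in node.attrib:
        yield depth + 1, "@" + name
    for child in node:
        yield from walk(child, depth + 1)

# Illustrative one-instructor document (assumed data).
doc = '<university-3><instructor IID="10101"><name>Srinivasan</name></instructor></university-3>'
nodes = list(walk(ET.fromstring(doc)))

for depth, label in nodes:
    print("  " * depth + label)
```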

Page 17:

XPath features

Selection
/university-3/course[credits >= 4]/@course_id
Functions
count()
▪ /university-2/instructor[count(./teaches/course) > 2]
id("foo")
Union "|"
…
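
Python's `xml.etree.ElementTree` implements only a small subset of XPath (child steps and simple predicates, no numeric comparisons or `count()`), so the sketch below combines its path support with ordinary Python to approximate the expressions above. The sample data is assumed.

```python
import xml.etree.ElementTree as ET

# Assumed sample data in the attribute-based university-3 layout.
doc = """<university-3>
  <course course_id="CS-101"><title>Intro. to Computer Science</title><credits>4</credits></course>
  <course course_id="FIN-201"><title>Investment Banking</title><credits>3</credits></course>
  <instructor IID="10101"><name>Srinivasan</name><salary>65000</salary></instructor>
  <instructor IID="83821"><name>Brandt</name><salary>92000</salary></instructor>
</university-3>"""
root = ET.fromstring(doc)

# /university-3/instructor/name, written relative to the root element.
names = [n.text for n in root.findall("instructor/name")]

# /university-3/course[credits >= 4]/@course_id: the comparison is done in
# Python because ElementTree predicates cannot compare numbers.
big_courses = [c.get("course_id") for c in root.findall("course")
               if int(c.findtext("credits")) >= 4]

# count(): the length of a result list.
course_count = len(root.findall("course"))

# Union "|": approximated by concatenating two result lists.
union_tags = [e.tag for e in root.findall("course") + root.findall("instructor")]

print(names, big_courses, course_count, union_tags)
```

A full XPath 2.0 engine would evaluate the original expressions directly; this is only an approximation with the standard library.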

Page 18:

XQuery

W3C XQuery 1.0
▪ For
▪ Let
▪ Where
▪ Order by
▪ Return

Page 19:

XQuery (Cont’d)

for $x in /university-3/course
let $courseId := $x/@course_id
where $x/credits > 3
return <course_id> { $courseId } </course_id>

is equivalent to

for $x in /university-3/course[credits > 3]
return <course_id> { $x/@course_id } </course_id>
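
For readers more used to imperative code, the for/let/where/return clauses map onto a loop, an assignment, a filter, and a constructed result. The Python sketch below mirrors the first query clause by clause; the sample data is assumed, and placing the identifier as element text in the result is a simplification of what an XQuery processor would construct.

```python
import xml.etree.ElementTree as ET

# Assumed sample data.
doc = """<university-3>
  <course course_id="CS-101"><title>Intro. to Computer Science</title><credits>4</credits></course>
  <course course_id="FIN-201"><title>Investment Banking</title><credits>3</credits></course>
</university-3>"""
root = ET.fromstring(doc)

results = []
for x in root.findall("course"):            # for $x in /university-3/course
    course_id = x.get("course_id")          # let $courseId := $x/@course_id
    if int(x.findtext("credits")) > 3:      # where $x/credits > 3
        out = ET.Element("course_id")       # return <course_id> ... </course_id>
        out.text = course_id
        results.append(out)

serialized = [ET.tostring(e, encoding="unicode") for e in results]
print(serialized)
```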

Page 20:

XQuery Joins

for $c in /university/course,
    $i in /university/instructor,
    $t in /university/teaches
where $c/course_id = $t/course_id and $t/IID = $i/IID
return <course_instructor> { $c, $i } </course_instructor>

which is equivalent to

for $c in /university/course,
    $i in /university/instructor,
    $t in /university/teaches[ $c/course_id = $t/course_id
                               and $t/IID = $i/IID]
return <course_instructor> { $c, $i } </course_instructor>
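
The join ranges over every combination of course, instructor and teaches element and keeps the combinations whose identifiers match. A nested-loop sketch of the same logic in Python, with assumed sample data and underscore tag names; a real query processor is free to use a faster join strategy.

```python
import xml.etree.ElementTree as ET

# Assumed sample data in the flat university layout.
doc = """<university>
  <course><course_id>CS-101</course_id><title>Intro. to Computer Science</title></course>
  <course><course_id>BIO-301</course_id><title>Genetics</title></course>
  <instructor><IID>10101</IID><name>Srinivasan</name></instructor>
  <instructor><IID>76766</IID><name>Crick</name></instructor>
  <teaches><IID>10101</IID><course_id>CS-101</course_id></teaches>
  <teaches><IID>76766</IID><course_id>BIO-301</course_id></teaches>
</university>"""
root = ET.fromstring(doc)

pairs = []
for c in root.findall("course"):
    for i in root.findall("instructor"):
        for t in root.findall("teaches"):
            # where $c/course_id = $t/course_id and $t/IID = $i/IID
            if (c.findtext("course_id") == t.findtext("course_id")
                    and t.findtext("IID") == i.findtext("IID")):
                pairs.append((c.findtext("course_id"), i.findtext("name")))

print(pairs)
```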

Page 21:

Functions and Types

declare function local:dept_courses($iid as xs:string) as element(course)* {
  for $i in /university/instructor[IID = $iid],
      $c in /university/course[dept_name = $i/dept_name]
  return $c
};
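
The same function expressed in Python, as a rough analogue: given an instructor identifier, return the course elements offered by that instructor's department. The explicit `root` parameter and the sample data are assumptions; XQuery reads the document from its context instead.

```python
import xml.etree.ElementTree as ET

def dept_courses(root, iid):
    """Return course elements whose dept_name matches that of instructor `iid`."""
    return [c
            for i in root.findall("instructor") if i.findtext("IID") == iid
            for c in root.findall("course")
            if c.findtext("dept_name") == i.findtext("dept_name")]

# Assumed sample data.
doc = """<university>
  <course><course_id>CS-101</course_id><dept_name>Comp. Sci.</dept_name></course>
  <course><course_id>CS-190</course_id><dept_name>Comp. Sci.</dept_name></course>
  <course><course_id>BIO-301</course_id><dept_name>Biology</dept_name></course>
  <instructor><IID>10101</IID><name>Srinivasan</name><dept_name>Comp. Sci.</dept_name></instructor>
</university>"""
root = ET.fromstring(doc)

course_ids = [c.findtext("course_id") for c in dept_courses(root, "10101")]
print(course_ids)
```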

Page 22:

API to XML

Document Object Model
▪ Java DOM API
Simple API for XML
▪ Event model
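
The two API styles differ in who drives the traversal: DOM builds the whole tree in memory and lets the program navigate it, while SAX calls back into the program as each tag is read. The slide names the Java DOM API; the sketch below uses Python's analogous standard modules `xml.dom.minidom` and `xml.sax` on an assumed two-course document.

```python
import xml.dom.minidom
import xml.sax

# Assumed sample document.
DOC = ("<university><course><title>Intro. to Computer Science</title></course>"
       "<course><title>Genetics</title></course></university>")

# DOM: parse everything first, then navigate the tree.
dom = xml.dom.minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("title")]

# SAX: the parser reports start tags, text, and end tags as events.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None          # collects text while inside <title>

    def startElement(self, name, attrs):
        if name == "title":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._buffer = None

handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_titles = handler.titles

print(dom_titles, sax_titles)
```

The SAX handler never holds more than one title in its buffer, which is why the event model suits documents too large to load as a tree.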

Page 23:

Storage of XML Data

Non-relational Data Stores
Flat files (no ACID guarantees)
XML Database
▪ DOM C++-based

Page 24:

Storage of XML Data (Cont’d)

Relational Databases
Store as string
▪ clob
Tree Representation
Map to Relations
Publishing and Shredding XML Data
Native Storage within Relational Database
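
Shredding means breaking an XML document into tuples and storing them in ordinary relations. A minimal sketch with Python's standard `sqlite3` and `xml.etree.ElementTree` modules, assuming a `course` table with the four columns used elsewhere in the slides and made-up sample data.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed sample document.
doc = """<university>
  <course><course_id>CS-101</course_id><title>Intro. to Computer Science</title>
          <dept_name>Comp. Sci.</dept_name><credits>4</credits></course>
  <course><course_id>BIO-301</course_id><title>Genetics</title>
          <dept_name>Biology</dept_name><credits>4</credits></course>
</university>"""
root = ET.fromstring(doc)

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE course (
                    course_id TEXT PRIMARY KEY,
                    title     TEXT,
                    dept_name TEXT,
                    credits   INTEGER)""")

# Each <course> element becomes one row; each subelement becomes one column.
rows = [(c.findtext("course_id"), c.findtext("title"),
         c.findtext("dept_name"), int(c.findtext("credits")))
        for c in root.findall("course")]
conn.executemany("INSERT INTO course VALUES (?, ?, ?, ?)", rows)

stored = conn.execute(
    "SELECT course_id, credits FROM course ORDER BY course_id").fetchall()
print(stored)
```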

Page 25:

SQL/XML

select xmlelement (name "course",
           xmlattributes (course_id as course_id, dept_name as dept_name),
           xmlelement (name "title", title),
           xmlelement (name "credits", credits))
from course
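
The query above publishes each relational row as a `course` element with two attributes and two subelements. SQLite has no `xmlelement` function, so the sketch below builds the same shape in Python from the rows of an assumed in-memory table; it illustrates the output of the SQL/XML query rather than the SQL/XML implementation itself.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed table and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_id TEXT, title TEXT, dept_name TEXT, credits INTEGER)")
conn.execute("INSERT INTO course VALUES ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4)")

published = []
for course_id, title, dept_name, credits in conn.execute(
        "SELECT course_id, title, dept_name, credits FROM course"):
    # xmlelement(name "course", xmlattributes(...), ...)
    course = ET.Element("course", {"course_id": course_id, "dept_name": dept_name})
    ET.SubElement(course, "title").text = title            # xmlelement(name "title", title)
    ET.SubElement(course, "credits").text = str(credits)   # xmlelement(name "credits", credits)
    published.append(ET.tostring(course, encoding="unicode"))

print(published[0])
```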

Page 26:

XML Applications

Storing Data With Complex Structure
▪ ODF
▪ OOXML
Standardized Data Exchange Format
▪ B2B
Web Services – HTTP
▪ SOAP
▪ WSDL
Data Mediation – Wrappers