Internet Technologies 1 XSLT • Processing XML using XSLT • Using XPath

Post on 22-Dec-2015


Internet Technologies 1

XSLT

• Processing XML using XSLT

• Using XPath

Internet Technologies 2

Processing XML using XSLT

XSLT is available in Java and C#/.NET

Internet Technologies 3

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?><book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book>

Input

Internet Technologies 4

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "book">
  <HTML><BODY><xsl:apply-templates/></BODY></HTML>
</xsl:template>

<xsl:template match = "title">
  <H1><xsl:apply-templates/></H1>
</xsl:template>

<xsl:template match = "author">
  <H3><xsl:apply-templates/></H3>
</xsl:template>

<xsl:template match = "publisher">
  <P><I><xsl:apply-templates/></I></P>
</xsl:template>

</xsl:stylesheet>

Processing

Internet Technologies 5

<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> </BODY></HTML>

Output

Internet Technologies 6

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?><library><block><book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book></block></library>

Input

Internet Technologies 7

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"

version="1.0">

<xsl:template match = "book">

<HTML><BODY><xsl:apply-templates/></BODY></HTML>

</xsl:template>

<xsl:template match = "title">

<H1><xsl:apply-templates/></H1>

</xsl:template>

<xsl:template match = "author">

<H3><xsl:apply-templates/></H3>

</xsl:template>

<xsl:template match = "publisher">

<P><I><xsl:apply-templates/></I></P>

</xsl:template>

</xsl:stylesheet>

The default rules match the root, library and block elements.
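For reference, the default (built-in) rules of XSLT 1.0 behave as if the templates below were part of every stylesheet. This is only a sketch written out for illustration; a real stylesheet does not need to declare them.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Elements and the root: output nothing, just recurse into the children.
       This is the rule that handles the root, library and block above. -->
  <xsl:template match="*|/">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Text and attribute nodes: copy the string value to the output. -->
  <xsl:template match="text()|@*">
    <xsl:value-of select="."/>
  </xsl:template>

  <!-- Comments and processing instructions: produce nothing. -->
  <xsl:template match="processing-instruction()|comment()"/>

</xsl:stylesheet>
```

Any template written in the stylesheet for the same node takes precedence over these defaults.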

Internet Technologies 8

<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> </BODY></HTML>

The output is the same.

Internet Technologies 9

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>

<book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> <book>Cliff Notes on The Catcher in the Rye</book> </book>

Two books in the input

Internet Technologies 10

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>

<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>

<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>

<xsl:template match = "publisher"> <P><I><xsl:apply-templates/></I></P> </xsl:template>

</xsl:stylesheet>

What’s the output?

Internet Technologies 11

<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <P><I>Little, Brown and Company</I></P> <HTML><BODY>Cliff Notes on The Catcher in the Rye</BODY></HTML> </BODY></HTML>

Illegal HTML
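One possible repair, sketched under the assumption that only the outermost book should produce the HTML wrapper and that a nested book can be shown as a plain paragraph (that choice is ours, not the slide's):

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Only the book that is a child of the root creates the HTML wrapper. -->
  <xsl:template match="/book">
    <HTML><BODY><xsl:apply-templates/></BODY></HTML>
  </xsl:template>

  <!-- A book nested inside another book becomes an ordinary paragraph. -->
  <xsl:template match="book/book">
    <P><xsl:apply-templates/></P>
  </xsl:template>

  <xsl:template match="title">
    <H1><xsl:apply-templates/></H1>
  </xsl:template>

  <xsl:template match="author">
    <H3><xsl:apply-templates/></H3>
  </xsl:template>

  <xsl:template match="publisher">
    <P><I><xsl:apply-templates/></I></P>
  </xsl:template>

</xsl:stylesheet>
```

The two book patterns never match the same node, because the outer book has the root as its parent and the inner book has a book as its parent.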

Internet Technologies 12

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>

<book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book>

Input

Internet Technologies 13

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>

<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>

<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>

<!-- <xsl:template match = "publisher"> <P><I><xsl:apply-templates/></I></P> </xsl:template> -->

</xsl:stylesheet>

We are not matching on publisher.

Internet Technologies 14

<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> Little, Brown and Company </BODY></HTML>

We get the default rule matching the publisher and then printing its child.

Internet Technologies 15

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?>

<book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book>

Input

Internet Technologies 16

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>

<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>

<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>

<xsl:template match = "publisher"> <!-- Skip the publisher --> </xsl:template>

</xsl:stylesheet>

We can skip the publisher by matching and stopping the recursion.

Internet Technologies 17

<HTML><BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> </BODY></HTML>

Internet Technologies 18

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?><shelf> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book></shelf>

A shelf has many books.

Internet Technologies 19

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "book"> <HTML><BODY><xsl:apply-templates/></BODY></HTML> </xsl:template>

<xsl:template match = "title"> <H1><xsl:apply-templates/></H1> </xsl:template>

<xsl:template match = "author"> <H3><xsl:apply-templates/></H3> </xsl:template>

<xsl:template match = "publisher"> <i><xsl:apply-templates/></i> </xsl:template>

</xsl:stylesheet>

Will this do the job?

Internet Technologies 20

<HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML><HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML><HTML> <BODY> <H1>The Catcher in the Rye</H1> <H3>J. D. Salinger</H3> <i>Little, Brown and Company</i> </BODY></HTML>

This is not what we want.

Internet Technologies 21

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?><shelf> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book></shelf>

Same input.

Internet Technologies 22

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "shelf"> <HTML><BODY>Found a shelf</BODY></HTML> </xsl:template>

</xsl:stylesheet>

Checks for a shelf and quits.

Internet Technologies 23

<HTML><BODY>Found a shelf</BODY></HTML>

Output

Internet Technologies 24

<?xml version="1.0" ?><?xml-stylesheet type="text/xsl" href="demo1.xsl"?><shelf> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book> <book> <title>The Catcher in the Rye</title> <author>J. D. Salinger</author> <publisher>Little, Brown and Company</publisher> </book></shelf>

Same input.

Internet Technologies 25

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "shelf">
  <HTML>
    <BODY>
      <b>These are a few of my favorite books</b>
      <table width = "640" border = "5">
        <xsl:apply-templates/>
      </table>
    </BODY>
  </HTML>
</xsl:template>

<xsl:template match = "book">
  <tr>
    <td> <xsl:number/> </td>
    <xsl:apply-templates/>
  </tr>
</xsl:template>

<xsl:template match = "title | author | publisher">
  <td><xsl:apply-templates/></td>
</xsl:template>

</xsl:stylesheet>

Produce a table of books.

Internet Technologies 26

<HTML><BODY>
<b>These are a few of my favorite books</b>
<table width="640" border="5">
  <tr><td>1</td> <td>The Catcher in the Rye</td> <td>J. D. Salinger</td> <td>Little, Brown and Company</td></tr>
  <tr><td>2</td> <td>The XSLT Programmer's Reference</td> <td>Michael Kay</td> <td>Wrox Press</td></tr>
  <tr><td>3</td> <td>Computer Organization and Design</td> <td>Patterson and Hennessy</td> <td>Morgan Kaufmann</td></tr>
</table>
</BODY></HTML>

Internet Technologies 27

Internet Technologies 28

One More Time -- Input

<?xml version="1.0" ?>
<shelf>
  <book>
    <title>The Catcher in the Rye</title>
    <author>J. D. Salinger</author>
    <publisher>Little, Brown and Company</publisher>
  </book>
  <book>
    <title>Mindfulness In Plain English</title>
    <author>J. D. Salinger</author>
    <publisher>Wisdom Publications Boston</publisher>
  </book>
  <book>
    <title>To Kill A Mockingbird</title>
    <author>Harper Lee</author>
    <publisher>Addison-Wesley</publisher>
  </book>
</shelf>

Internet Technologies 29

One More Time -- XSLT

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match = "shelf">
  <HTML>
    <BODY>
      <b>These are a few of my favorite books</b>
      <table width = "640" border = "5">
        <xsl:apply-templates/>
      </table>
    </BODY>
  </HTML>
</xsl:template>

<xsl:template match = "book">
  <tr>
    <td> <xsl:number value="position() div 2"/> </td>
    <xsl:apply-templates/>
  </tr>
</xsl:template>

<xsl:template match = "title | author | publisher">
  <td><xsl:apply-templates/></td>
</xsl:template>

</xsl:stylesheet>

Let’s do some calculations on the fly.
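A likely reason the division by 2 is needed (our reading, not stated on the slide): with no select attribute, xsl:apply-templates processes every child of shelf, and in a processor that keeps whitespace-only text nodes the three books sit at positions 2, 4 and 6. A sketch of an alternative, assuming only the book elements should be numbered, is to select them explicitly so that position() counts 1, 2, 3:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="shelf">
    <HTML>
      <BODY>
        <b>These are a few of my favorite books</b>
        <table width="640" border="5">
          <!-- Select only book elements; whitespace text nodes are not in the list. -->
          <xsl:apply-templates select="book"/>
        </table>
      </BODY>
    </HTML>
  </xsl:template>

  <xsl:template match="book">
    <tr>
      <!-- position() is relative to the selected books, so no division is needed. -->
      <td><xsl:number value="position()"/></td>
      <xsl:apply-templates/>
    </tr>
  </xsl:template>

  <xsl:template match="title | author | publisher">
    <td><xsl:apply-templates/></td>
  </xsl:template>

</xsl:stylesheet>
```

This variant does not depend on how the processor treats whitespace between the book elements.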

Internet Technologies 30

[Picture: the rendered table of books; image not available in this transcript]

One More Time - Output

Internet Technologies 31

XPATH

• Non-XML language used to identify particular parts of an XML document

• Used by XSLT for matching and selecting particular elements to be copied into the result tree.

• Used by XPointer to identify a particular point in or part of an XML document that an XLink links to.

Slides adapted from “XML in a Nutshell” by Harold

Internet Technologies 32

XPATH

First, we’ll look at three commonly used XSLT instructions:

xsl:value-of xsl:template xsl:apply-templates

Internet Technologies 33

XPATH

<xsl:value-of select = "XPathExpression" />

The xsl:value-of element computes the string value of an XPath expression and inserts it into the result tree. XPath allows us to select nodes in the tree, and different node types produce different values.

Internet Technologies 34

XPATH

<xsl:value-of select = "XPathExpression" />

element => the text content of the element after all tags are stripped
text => the text of the node
attribute => the value of the attribute
root => the string value of the root (all the text in the document)
processing-instruction => the processing instruction data (<?, ?>, and the target are not included)
comment => the text of the comment (no comment symbols)
namespace => the namespace URI
node set => the value of the first node in the set (in document order)

Internet Technologies 35

XPATH

<xsl:template match = "pattern" />

The xsl:template top-level element is the key to all of XSLT. The match attribute contains a pattern (a restricted form of location path) against which nodes are compared as they’re processed. If the pattern matches a node, then the contents of the template are instantiated.

Internet Technologies 36

XPATH

<xsl:apply-templates select = "XPath node set expression" />

For each node in the selected node set, find and apply the highest priority template that matches it.

If the select attribute is not present then all children of the context node are processed.

Internet Technologies 37

The Tree Structure of an XML Document

<?xml version="1.0" ?>
<?xml-stylesheet type="text/xsl" href = "pi.xsl" ?>
<people>
  <person born="1912" died = "1954" id="p342">
    <name>
      <first_name>Alan</first_name>
      <last_name>Turing</last_name>
    </name>
    <!-- Did the word "computer scientist" exist in Turing's day? -->
    <profession>computer scientist</profession>
    <profession>mathematician</profession>
    <profession>cryptographer</profession>
  </person>

See Harold Pg. 147

Internet Technologies 38

  <person born="1918" died = "1988" id="p4567">
    <name>
      <first_name>Richard</first_name>
      <middle_initial>&#x4D;</middle_initial>
      <last_name>Feynman</last_name>
    </name>
    <profession>physicist</profession>
    <hobby>Playing the bongoes</hobby>
  </person>
</people>

Unicode ‘M’

Internet Technologies 39

[Tree diagram, partial, of the document above]

/
  <?xml-stylesheet type="text/xsl" href = "pi.xsl" ?>
  person (born = "1912", died = "1954", id = "p342")
    name
      first_name
        Alan
    <!-- Did the word "computer scientist" exist in Turing's day? -->
    profession
  person

Internet Technologies 40

Nodes seen by XPath:

• The root
• Element nodes
• Text nodes
• Attribute nodes
• Comment nodes
• Processing instructions
• Namespace nodes

Constructs not seen by XPath:

• CDATA sections
• Entity references
• Document type declarations

Internet Technologies 41

Note

The following appears in each example below, so it has been removed from the slides.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  :
  :
</xsl:stylesheet>

Internet Technologies 42

Location Paths

• The root

<xsl:template match="/"><a>matched the root</a>

</xsl:template>

<?xml version="1.0" encoding="utf-8"?><a>matched the root</a>

Internet Technologies 43

Location Paths

• Child element location paths (relative to context node)

<xsl:template match="/"> <xsl:value-of select = "people/person/profession" /></xsl:template>

computer scientist

Internet Technologies 44

Location Paths

• Attribute location paths (relative to context node)

<xsl:template match="/"> <xsl:value-of select = "people/person/@born" /></xsl:template>

<?xml version="1.0" encoding="utf-8"?>1912

Internet Technologies 45

Location Paths

• Attribute location paths (relative to context node)

<xsl:template match="/"> <xsl:apply-templates select = "people/person" /></xsl:template>

<xsl:template match = "person"> <date> <xsl:value-of select = "@born" /> </date></xsl:template>

<date>1912</date><date>1918</date>

Internet Technologies 46

Location Paths

• Comment Location Step (comments don’t have names)

<xsl:template match="/"> <xsl:value-of select = "people/person/comment()" /></xsl:template>

<?xml version="1.0" encoding="utf-8"?> Did the word "computer scientist" exist in Turing's day?

Internet Technologies 47

Location Paths

• Comment Location Step

<xsl:template match = "comment()" > <i>comment deleted</i></xsl:template>

Document content with comments replaced as shown. By default, comments are not output.

Internet Technologies 48

Location Paths

• Text Location Step (Text nodes don’t have names)

<xsl:template match="/"> <xsl:value-of select = "people/person/profession/text()" /></xsl:template>

computer scientist

Internet Technologies 49

Location Paths

• Processing Instruction Location Step

<xsl:template match="/"> <xsl:value-of select = "processing-instruction()" /></xsl:template>

<?xml version="1.0" encoding="utf-8"?>type="text/xsl" href = "pi.xsl"

Internet Technologies 50

Location Paths

• Wild cards

There are three wild cards: *, node(), @*

The * matches any element node. It will not match attributes, text nodes, comments or processing instruction nodes.

Internet Technologies 51

Location Paths

• Matching with *

<xsl:template match = "*" > <xsl:apply-templates select ="*" /></xsl:template>

Matches all elements and requests calls on sub-elements only. Nothing is displayed. The text nodes are never reached.

Internet Technologies 52

Location Paths

• Matching with node()

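The node() wild card matches all nodes: element nodes, text nodes, attribute nodes, processing instruction nodes, namespace nodes and comment nodes. Which of these it reaches depends on the axis: on the default child axis, node() does not select attribute or namespace nodes, so attributes have to be selected explicitly with @*.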

Internet Technologies 53

Matching with Node

<xsl:template match="node()">

<xsl:apply-templates/>

</xsl:template>

What is the output?

Internet Technologies 54

Matching with Node -Output

<?xml version="1.0" encoding="UTF-8"?>

Internet Technologies 55

Location Paths

• Matching with @*

The @* wild card matches all attribute nodes.

Internet Technologies 56

Matching with @*

<xsl:template match="@*">

Found an attribute <xsl:value-of select="."/>

</xsl:template>

<xsl:template match="node()">

<xsl:apply-templates select="@*"/> <xsl:apply-templates/>

</xsl:template>

What is the output?

Internet Technologies 57

Matching with @* - Output

<?xml version="1.0" encoding="UTF-8"?> Found an attribute 1912 Found an attribute 1954 Found an attribute p342 Found an attribute 1918 Found an attribute 1988 Found an attribute p4567

Internet Technologies 58

Matching with @*

<xsl:template match = "person" > <b> <xsl:apply-templates select = "@*" /> </b></xsl:template>

<?xml version="1.0" encoding="utf-8"?>

<b>19121954p342</b>

<b>19181988p4567</b>

Internet Technologies 59

Location Paths

• Multiple matches with |

<xsl:template match = "profession|hobby" > <activity> <xsl:value-of select = "text()"/> </activity></xsl:template>

<xsl:template match = "*" > <xsl:apply-templates /></xsl:template>

<xsl:template match = "text()" ></xsl:template>

Matches all the elements. Skips the text nodes unless they describe a profession or hobby.

Internet Technologies 60

Location Paths

• Selecting from all descendants with //

// selects from all descendants of the context node as well as the context node itself. At the beginning of an XPath expression, it selects from all descendants of the root node.

Internet Technologies 61

Location Paths

• Selecting from all descendants with //

<xsl:template match = "//name/last_name/text()" > <xsl:value-of select = "." /></xsl:template>

<xsl:template match = "text()" ></xsl:template>

<?xml version="1.0" encoding="utf-8"?>TuringFeynman

Internet Technologies 62

Location Paths

• Selecting from all descendants with //

<xsl:template match = "/" >

<xsl:value-of select = "//first_name/text()" />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>Alan

Internet Technologies 63

Location Paths

• Selecting from all descendants with //

<xsl:template match = "/" >

<xsl:apply-templates select = "//first_name/text()" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>AlanRichard

Internet Technologies 64

Location Paths

• Selecting from all descendants with //

<xsl:template match = "/" >

<xsl:apply-templates select = "//middle_initial/../first_name" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

</xsl:stylesheet>

<?xml version="1.0" encoding="utf-8"?>Richard

Internet Technologies 65

Specifying the Child Axis

Consider the following path:

/Envelope/Header/Signature

The above is an abbreviation for

/child::Envelope/child::Header/child::Signature
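The same expansion applies to the other abbreviations used on these slides: @ stands for the attribute axis, .. for parent::node(), and // for /descendant-or-self::node()/. As a small sketch against the people document shown earlier, the stylesheet below writes two of the earlier paths in unabbreviated form; the xsl:output line and the separating space are our additions.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Abbreviated form: people/person/@born -->
    <xsl:value-of select="child::people/child::person/attribute::born"/>
    <xsl:text> </xsl:text>
    <!-- Abbreviated form: //middle_initial/../first_name -->
    <xsl:value-of select="/descendant-or-self::node()/child::middle_initial/parent::node()/child::first_name"/>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:value-of uses the first node of each node set, this should print 1912 followed by Richard.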

Internet Technologies 66

Using an Axis

<xsl:template match="people">

<xsl:apply-templates select="person"/>

</xsl:template>

<xsl:template match = "person" >

<xsl:if test="position() = last()">

<xsl:value-of select="preceding-sibling::person/name"/>

</xsl:if>

Internet Technologies 67

<xsl:if test="position() != last()">

<xsl:value-of select="following-sibling::person/name"/>

</xsl:if>

</xsl:template>

What is the output?

Internet Technologies 68

<?xml version="1.0" encoding="UTF-8"?> Richard M Feynman Alan Turing

Axis Example - Output

Internet Technologies 69

Writing Output to an Attribute

<xsl:template match="@*">

<someTag id="{.}"></someTag>

</xsl:template>

<xsl:template match="node()">

<xsl:apply-templates select="@*"/> <xsl:apply-templates/>

</xsl:template>

Internet Technologies 70

Writing Output to an Attribute

<?xml version="1.0" encoding="UTF-8"?><someTag id="1912"/><someTag id="1954"/><someTag id="p342"/><someTag id="1918"/><someTag id="1988"/><someTag id="p4567"/>
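The braces in id="{.}" are an attribute value template: the XPath expression inside them is evaluated and its string value becomes the attribute value. As a sketch, the same result can be written more verbosely with the xsl:attribute instruction:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:template match="@*">
    <someTag>
      <!-- Same effect as id="{.}": the attribute value is computed at run time. -->
      <xsl:attribute name="id">
        <xsl:value-of select="."/>
      </xsl:attribute>
    </someTag>
  </xsl:template>

  <xsl:template match="node()">
    <xsl:apply-templates select="@*"/>
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

The braces are the shorter choice when the value is a single expression; xsl:attribute is useful when the attribute name or value needs more logic.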

Internet Technologies 71

Predicates

In general, an XPath expression may refer to more than one node. Predicates allow us to reduce the number of nodes we are interested in.

Each step in a location path may have a predicate that selects from the node list that is current at that step in the expression.

The boolean expression in the predicate is tested against each node in the context node list. If the expression is false then that node is deleted from the list.
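A predicate may also be a number, which XPath 1.0 treats as shorthand for a test on position(). The sketch below, run against the people document shown earlier, picks nodes by position; the xsl:output line and the separating space are our additions.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- person[2] is short for person[position() = 2]: the second person. -->
    <xsl:value-of select="/people/person[2]/name/last_name"/>
    <xsl:text> </xsl:text>
    <!-- last() is the size of the list at that step: the final profession. -->
    <xsl:value-of select="/people/person[1]/profession[last()]"/>
  </xsl:template>

</xsl:stylesheet>
```

With that input this should print Feynman followed by cryptographer.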

Internet Technologies 72

Predicates

<xsl:template match = "/" >

<xsl:apply-templates select = "//profession[.='physicist']/../name" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Richard M Feynman

Internet Technologies 73

Predicates

<xsl:template match = "/" >

<xsl:apply-templates select = "//person[@id='p4567']" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Richard M Feynman

physicist Playing the bongoes

Internet Technologies 74

Predicates

<xsl:template match = "/" >

<xsl:apply-templates select = "//person[@born &lt;= 1915]" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Alan Turing

computer scientist mathematician cryptographer

Internet Technologies 75

Predicates

<xsl:template match = "/" >

<xsl:apply-templates select = "//person[@born &lt;= 1919 and @born &gt;= 1917]" />

</xsl:template>

<xsl:template match = "text()" >

<xsl:value-of select = "." />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Richard M Feynman

physicist Playing the bongoes

Internet Technologies 76

Predicates

<xsl:template match = "/" >

<xsl:apply-templates select = "/people/person[@born &lt; 1950]/name[first_name='Alan']" />

</xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Alan Turing

Internet Technologies 77

General XPath Expressions

XPath expressions that are not node sets can’t be used in the match attribute of an xsl:template element.

They can be used for the values for the select attribute of xsl:value-of elements and in location path predicates.

Internet Technologies 78

General XPath Expressions

<xsl:template match = "/" > <xsl:apply-templates select = "/people/person" /></xsl:template>

<xsl:template match = "person"> <xsl:value-of select="@born div 10" /></xsl:template>

<xsl:template match = "text()"></xsl:template>

<?xml version="1.0" encoding="utf-8"?>191.2191.8

Internet Technologies 79

General XPath Expressions: XPath Functions

<xsl:template match = "/" > <xsl:apply-templates select = "/people/person" /></xsl:template>

<xsl:template match = "person"> Person <xsl:value-of select="position()" /></xsl:template>

<xsl:template match = "text()"></xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Person 1

Person 2

Internet Technologies 80

General XPath Expressions: XPath Functions

<xsl:template match = "/" > <xsl:apply-templates select = "//name[starts-with(last_name,'T')]"/></xsl:template>

<xsl:template match = "name"> Mr. T. <xsl:value-of select="." /></xsl:template>

<?xml version="1.0" encoding="utf-8"?>

Mr. T. Alan Turing

Node set converted to string
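The string value of name keeps the line breaks and indentation between first_name and last_name. A sketch of one way to tidy this, assuming plain text output is wanted, is the XPath function normalize-space(), which trims leading and trailing whitespace and collapses internal runs to a single space:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates select="//name[starts-with(last_name,'T')]"/>
  </xsl:template>

  <xsl:template match="name">
    <!-- xsl:text keeps the literal prefix free of surrounding whitespace. -->
    <xsl:text>Mr. T. </xsl:text>
    <!-- The name element is converted to a string, then normalized. -->
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

Against the people document this should produce Mr. T. Alan Turing on a single line.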