
Post on 22-Dec-2015

Introduction to XPointer

Transparency No. 1

Introduction to XML Pointer Language (XPointer)

Cheng-Chia Chen

Transparency No. 2

XPointer: Why, what, and how?

relative addressing: allows links to places with no anchors
can point to substrings in character data and to whole tree fragments (node ranges)
used by XLink to locate remote link resources

Example of an XPointer:

http://www.foo.org/bar.xml#xpointer(article/section[position() <= 5])

URI (the part before #): locates a resource whose Internet media type is one of text/xml, application/xml, text/xml-external-parsed-entity, or application/xml-external-parsed-entity.

Fragment identifier (the part after #): the XPointer expression (points to the first five section elements in the article root element).

In HTML, fragment identifiers may denote anchor IDs -- XPointer generalizes that.
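The split between the URI and the fragment identifier in the example above can be reproduced with Python's standard library. This is only an illustrative sketch: it separates the two parts and does not evaluate the XPointer expression.

```python
from urllib.parse import urldefrag

# The example URI reference from the slide.
reference = "http://www.foo.org/bar.xml#xpointer(article/section[position() <= 5])"

# urldefrag splits a URI reference at the '#'.
uri, fragment = urldefrag(reference)

print(uri)       # the part that locates the XML resource
print(fragment)  # the XPointer fragment identifier
```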

Transparency No. 3

XPointer vs. XPath

XPointer is based upon XPath
an XPath is evaluated wrt. some context; XPointer specifies this context
XPath says nothing about URIs; XPointer specifies that connection
XPointer adds some features not available in XPath

Transparency No. 4

XPointer fragment identifiers

An XPointer fragment identifier (the substring to the right of # in the URI) is either

the value of some ID attribute in the document (ID attributes are specified by the DTD),

a sequence of element numbers denoting the path from the root to an element, optionally preceded by an ID (e.g. /1/27/3, or theSection/1/27/3), or

a sequence of the form xpointer(...) xpointer(...) ... containing a list (typically of length 1) of XPointer expressions.

Each expression is evaluated in turn, and the first one whose evaluation succeeds is used. (This allows alternative pointers to be specified, thereby increasing robustness.)

(XPointer allows for other means of specifying pointers - XPointer pointers are recognized by "xpointer(...)".)

Next: We will now dig into XPath and then later describe what additional features XPointer adds to XPath...
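The three forms of fragment identifier listed above can be told apart with a small classifier. The following Python sketch is an assumption-laden illustration, not part of any XPointer implementation: the function name classify_fragment is made up, the name pattern is a simplification of XML names, and escaped parentheses inside xpointer(...) are not handled.

```python
import re

NAME = r"[A-Za-z_][\w.\-]*"   # simplified stand-in for an XML name


def classify_fragment(fragment):
    """Return (kind, parts) for one of the three fragment forms."""
    if fragment.startswith("xpointer("):
        # Collect the contents of each top-level xpointer(...) part,
        # tracking nesting depth so inner parentheses are kept intact.
        parts, depth, start = [], 0, None
        for i, ch in enumerate(fragment):
            if ch == "(":
                if depth == 0:
                    start = i + 1
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    parts.append(fragment[start:i])
        return "xpointer", parts
    if re.fullmatch(NAME, fragment):
        return "id", [fragment]
    match = re.fullmatch(rf"({NAME})?((?:/[1-9]\d*)+)", fragment)
    if match:
        steps = [int(n) for n in match.group(2).split("/")[1:]]
        return "child-sequence", [match.group(1), steps]
    raise ValueError(f"unrecognized fragment identifier: {fragment!r}")


print(classify_fragment("theSection/1/27/3"))
print(classify_fragment("xpointer(id('a')) xpointer(//b[1])"))
```

A processor would then try the parts of an "xpointer" result in order and use the first one that evaluates successfully.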

Transparency No. 5

XPath: Location paths

XPath is a declarative language for addressing (used in XPointer) and pattern matching (used in XSLT).

The central construct is the location path, which is a sequence of location steps separated by /.

A location step is evaluated wrt. some context resulting in a set of nodes.

A location path is evaluated compositionally, left-to-right, starting with some initial context.

Each node resulting from evaluation of one step is used as context for evaluation of the next, and the results are unioned together.
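As a rough illustration of this compositional evaluation, the Python sketch below models each location step as a function from one context node to a list of nodes, and unions the results step by step. The helper names (child, descendant, evaluate) and the sample document are made up for illustration; only element nodes are handled, using the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def child(name):
    """Step selecting child elements with the given name."""
    return lambda node: [c for c in node if c.tag == name]


def descendant(name):
    """Step selecting descendant elements with the given name."""
    # Element.iter() also yields the node itself, so it is skipped here.
    return lambda node: [d for d in node.iter(name) if d is not node]


def evaluate(path, context):
    """Evaluate the steps left to right, starting from one context node."""
    current = [context]
    for step in path:
        result = []
        for node in current:              # each result is the next context
            for hit in step(node):
                if not any(hit is seen for seen in result):
                    result.append(hit)    # union without duplicates
        current = result
    return current


doc = ET.fromstring(
    "<article>"
    "<section><cite href='a'/></section>"
    "<section><p><cite href='b'/></p></section>"
    "</article>"
)
cites = evaluate([child("section"), descendant("cite")], doc)
hrefs = [c.get("href") for c in cites]
print(hrefs)
```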

Transparency No. 6

XPath context

A context consists of:
  a context node,
  a context position and size (two integers, 1 ≤ position ≤ size),
  variable bindings,
  a function library, and
  a set of namespace declarations.

Initial context: defined externally (e.g. by XPointer or XSLT).
(Location paths starting with / always use the document root as initial context node!)

Example:

child::section[position()<6]/descendant::cite/attribute::href

selects all href attributes in cite elements in the first 5 sections of an article.
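The five components of a context can be pictured as a simple record. The Python sketch below is one possible representation; the class name XPathContext and its field names are illustrative assumptions, not taken from any XPath library.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class XPathContext:
    node: Any                                   # the context node
    position: int = 1                           # context position
    size: int = 1                               # context size
    variables: Dict[str, Any] = field(default_factory=dict)
    functions: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    namespaces: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self):
        # Enforce the constraint 1 <= position <= size from the slide.
        if not 1 <= self.position <= self.size:
            raise ValueError("context position must satisfy 1 <= position <= size")


initial = XPathContext(node="document root")
print(initial.position, initial.size)
```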

Transparency No. 7

Location steps

A location step has the form

axis :: node-test ('[' predicate ']')*

The axis selects a set of candidate nodes (e.g. the child nodes of the context node) related to the context node.

The node-test performs an initial filtration of the candidates based on their types (chardata node, processing instruction, etc.), or names (e.g. element or attribute name).

The predicates (zero or more) cause a further, potentially more complex, filtration. Only candidates for which the predicates evaluate to true are kept.

The candidates that survive the filtration constitute the result.
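The three stages of a location step (axis, node-test, predicates) can be sketched as a small pipeline. In this Python illustration the function location_step and its calling convention are assumptions: the axis is a function returning candidate nodes, the node-test is a boolean function, and each predicate receives the node together with its position and the size of the current candidate list.

```python
import xml.etree.ElementTree as ET


def location_step(context, axis, node_test, predicates=()):
    # 1. The axis selects candidates; 2. the node-test filters them.
    candidates = [n for n in axis(context) if node_test(n)]
    # 3. Each predicate filters again, seeing position and size
    #    relative to what survived the previous stage.
    for predicate in predicates:
        size = len(candidates)
        candidates = [n for pos, n in enumerate(candidates, start=1)
                      if predicate(n, pos, size)]
    return candidates


article = ET.fromstring("<article>" + "<section/>" * 7 + "<note/></article>")

# child::section[position()<6]
first_five = location_step(
    article,
    axis=list,                                   # children of an Element
    node_test=lambda n: n.tag == "section",
    predicates=[lambda n, pos, size: pos < 6],
)
print(len(first_five))
```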

Transparency No. 8

Example

The example from before:

child::section[position()<6]/descendant::cite/attribute::href

selects all href attributes in cite elements in the first 5 sections of an article.

axis :: node-test ([ predicate ])*

Transparency No. 9

Available axes

child               the children of the context node
descendant          all descendants (children, children's children, ...)
parent              the parent (empty if at the root)
ancestor            all ancestors from the root to the parent
following-sibling   siblings to the right
preceding-sibling   siblings to the left
following           all following nodes in the document
preceding           all preceding nodes in the document
attribute           the attributes of the context node
namespace           namespace declarations in the context node
self                the context node itself
ancestor-or-self, descendant-or-self

Transparency No. 10

Context node: this. Then

Axis                 Selected nodes
self (.)             {this}
parent (..)          {p}
ancestor             {p,a2,a3,/}
ancestor-or-self     {this,p,a2,a3,/}
child                {c1~c4}
descendant           {c1~c4,d1,d2}
descendant-or-self   {this,c1~c4,d1,d2}
preceding-sibling    {ps1,ps2}
following-sibling    {fs1,fs2}
attribute            {attr1,..}
namespace            {ns1,…}
preceding            {ps1,ps2,ps3,ps4}
following            {fs3,fs4,fs1,fs2}

Note:

U = self + ancestor + descendant + preceding + following

These five axes are pairwise disjoint.
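The partition into self, ancestor, descendant, preceding and following can be checked on a small tree. The Python sketch below uses xml.etree.ElementTree, which models element nodes only (no separate root node, attribute nodes or namespace nodes), so it is an approximation; the element names loosely mirror the labels in the figure.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<a3><a2><ps3/><p><ps1/><this><c1/><c3><d1/></c3></this><fs1/></p>"
    "<fs3/></a2></a3>"
)
order = list(doc.iter())                          # document order (preorder)
parent = {c: p for p in order for c in p}         # child -> parent map
this = doc.find(".//this")

ancestor = []
node = this
while node in parent:                             # walk up to the top element
    node = parent[node]
    ancestor.append(node)

descendant = [d for d in this.iter() if d is not this]
idx = order.index(this)
preceding = [n for n in order[:idx] if n not in ancestor]
following = [n for n in order[idx + 1:] if n not in descendant]

parts = [[this], ancestor, descendant, preceding, following]
total = sum(len(part) for part in parts)
distinct = len({id(n) for part in parts for n in part})
print(total == len(order), distinct == total)     # covers all nodes, no overlap
```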

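[Figure: Nodes selected by axes. A tree diagram with the root node / at the top and the document element below it; the ancestors a3, a2 and p above the context node this; the siblings ps1, ps2 (left of this) and fs1, fs2 (right of this); attribute1 and namespace1 attached to this; the children c1 (text), c2 (PI), c3 (element) and c4 (comment), with d1 and d2 below c3; and the non-sibling nodes ps3, ps4 (preceding) and fs3, fs4 (following).]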

Transparency No. 11

Available axes (cont’d)

descendant-or-self   the union of descendant and self
ancestor-or-self     the union of ancestor and self

Notes

attributes and namespace declarations are considered special kinds of nodes.

Some of these axes assume a document ordering of the tree nodes. The ordering essentially corresponds to a left-to-right preorder traversal of the document tree.

The resulting sets are ordered intuitively, either forward (in document order) or reverse (reverse document order).

For instance, following is a forward axis, and ancestor is a reverse axis.

Transparency No. 12

Node tests

Testing by node type:
  text()                     char data nodes
  comment()                  comment nodes
  processing-instruction()   processing instruction nodes
  node()                     all nodes

Testing by node name:
  name | *:NCName | NCName:*   nodes with that name
  *                            any node of the principal node type

When testing by name, only nodes of the "axis principal node type" are considered (attributes for the attribute axis, namespace nodes for the namespace axis, element nodes for all other axes).

Transparency No. 13

Predicates

expressions coerced to type boolean

A predicate filters a node-set by evaluating the predicate expression on each node in the set with that node as the context node,

the size of the node-set as the context size, and

the position of the node in the node-set wrt. the axis ordering as the context position.
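The dependence of the context position on the axis ordering shows up on a reverse axis such as ancestor, where position 1 is the nearest ancestor. The Python sketch below illustrates this with hypothetical helpers (ancestors, filter_by) over xml.etree.ElementTree elements; it is not how any particular XPath engine is implemented.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("<book><chapter><section><para/></section></chapter></book>")
parent = {child: p for p in doc.iter() for child in p}


def ancestors(node):
    """Ancestor axis in axis order, i.e. reverse document order."""
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found


def filter_by(nodes, predicate):
    """Apply a predicate with context node, position and size."""
    size = len(nodes)
    return [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos, size)]


para = doc.find(".//para")
nearest = filter_by(ancestors(para), lambda n, pos, size: pos == 1)      # ancestor::*[1]
farthest = filter_by(ancestors(para), lambda n, pos, size: pos == size)  # ancestor::*[last()]
print(nearest[0].tag, farthest[0].tag)
```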

Transparency No. 14

Expressions

Available types:
  node-set (set of nodes),
  boolean (true or false),
  number (floating point),
  string (Unicode text).

Abstract syntax for expressions:
  exp -> $variable
       | ( exp )
       | literal
       | numeral
       | function ( arguments )
       | boolean-expression
       | numerical-expression
       | node-set-expression

Coercion may occur at function arguments and when expressions are used as predicates.
Variables and functions are evaluated using the context.

Transparency No. 15

Expressions

Boolean expressions:
  Available boolean operators: or, and, =, !=, <, >, <=, >=
  Standard precedence, all left associative.

Numerical expressions:
  Available numerical operators: +, -, *, div, mod

Node-set expressions:
  location paths, filtered by predicates.
  Node-set expression operators: | (node-set union)

Transparency No. 16

Core function library

Node-set functions:
  last()                     returns the context size
  position()                 returns the context position
  count(node-set)            returns the number of nodes in node-set
  name(node-set?)            returns the qualified name of the first node in node-set
  local-name(node-set?)      returns the local name of the first node in node-set
  namespace-uri(node-set?)   returns the namespace URI of the first node in node-set
  id(object)                 returns a node-set:
      object is a node-set => the union of id(string(node)) for each node ∈ object
      object is a string   => split object into whitespace-separated tokens; return { node | node has an ID attribute whose value ∈ tokens }
      otherwise            => id(string(object))
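The three cases of id() can be sketched in Python as follows. Several things here are assumptions made for illustration: the function name xpath_id, the convention that an attribute literally named "id" plays the role of the DTD-declared ID attribute (ElementTree does not read DTD attribute types), a Python list standing in for a node-set, and str() standing in for the XPath string() conversion.

```python
import xml.etree.ElementTree as ET


def xpath_id(doc, obj, id_attr="id"):
    if isinstance(obj, list):
        # Node-set: union of id(string(node)) over all nodes.
        hits = []
        for node in obj:
            for hit in xpath_id(doc, "".join(node.itertext()), id_attr):
                if not any(hit is seen for seen in hits):
                    hits.append(hit)
        return hits
    # String (or anything else, converted to a string):
    # split into whitespace-separated tokens and match ID values.
    tokens = str(obj).split()
    return [n for n in doc.iter() if n.get(id_attr) in tokens]


doc = ET.fromstring('<d><a id="x"/><b id="y"/><ref>y x</ref></d>')
by_string = xpath_id(doc, "y x")
by_nodes = xpath_id(doc, [doc.find("ref")])
print([n.tag for n in by_string], [n.tag for n in by_nodes])
```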

Transparency No. 17

Core library functions

String functions:
  string(value?)                                     type cast to string
  concat(string, string, string*)                    string concatenation
  boolean starts-with(string, string)
  boolean contains(string, string)
  string substring-before(string, string)            substring-before("1999/4/10", "/") = "1999"
  string substring-after(string, string)             substring-after("1999/4/10", "/") = "4/10"
  string substring(string, start-position, length)   substring("123456", 3, 3) = "345"
  string-length(string?)
  string normalize-space(string?)                    normalize-space(" abs fff ") = "abs fff"
  string translate(string, string, string)           translate("abcdef", "abcd", "ABC") = "ABCef"
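For readers more used to Python, the behaviour of a few of these string functions can be approximated as below. The Python function names are illustrative, and substring is restricted to integer arguments (XPath itself rounds floating-point arguments).

```python
import re


def substring_before(s, sep):
    if not sep:
        return ""
    head, found, _ = s.partition(sep)
    return head if found else ""


def substring_after(s, sep):
    if not sep:
        return s
    _, found, tail = s.partition(sep)
    return tail if found else ""


def substring(s, start, length):
    # XPath positions are 1-based; keep positions p with start <= p < start + length.
    first = max(start, 1)
    last = start + length
    return s[first - 1:max(last - 1, 0)]


def normalize_space(s):
    # Strip leading/trailing XML whitespace and collapse inner runs to one space.
    return re.sub(r"[ \t\r\n]+", " ", s.strip(" \t\r\n"))


def translate(s, src, dst):
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) not in table:                  # first occurrence wins
            # Characters of src without a counterpart in dst are deleted.
            table[ord(ch)] = dst[i] if i < len(dst) else None
    return s.translate(table)


print(substring_before("1999/4/10", "/"), substring_after("1999/4/10", "/"))
print(substring("123456", 3, 3), translate("abcdef", "abcd", "ABC"))
```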

Transparency No. 18

Core library functions

Boolean functions:
  boolean(value)    type cast to boolean
  not(boolean)      boolean negation
  true(), false()
  lang(string)      returns true if the language of the context node is $1
  ...

Number functions:
  number(value)     type cast to number
  sum(node-set)     sum of the number value of each node in node-set
  floor(number), ceiling(number), round(number)

See the XPath specification for the complete list.

Transparency No. 19

Abbreviations

Syntactic sugar: convenient notation for common situations

Normal syntax                  Abbreviation
child::                        nothing (so child is the default axis)
attribute::                    @
/descendant-or-self::node()/   //
self::node()                   . (useful because location paths starting with / begin evaluation at the root)
parent::node()                 ..

Note: //par[1] ≠ /descendant::par[1]
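The difference in the note can be observed with Python's xml.etree.ElementTree, whose limited XPath support includes the // abbreviation and positional predicates. In this sketch, .//par[1] is evaluated from the document element, and /descendant::par[1] is emulated by taking the first par element in document order; the sample document is made up.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<doc>"
    "<sec><par n='1'/><par n='2'/></sec>"
    "<sec><par n='3'/></sec>"
    "</doc>"
)

# //par[1]: every par that is the first par child of its parent.
first_in_each_parent = [p.get("n") for p in doc.findall(".//par[1]")]

# /descendant::par[1]: the first par among all descendants.
first_in_document = [p.get("n") for p in list(doc.iter("par"))[:1]]

print(first_in_each_parent, first_in_document)
```

The reason is that //par[1] expands to /descendant-or-self::node()/child::par[1], so the predicate is applied per parent rather than to the whole descendant list.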

Transparency No. 20

Examples

Example:

.//@href

selects all href attributes in descendants of the context node.

The coercion rules often allow compact notation,

foo[3] refers to the third foo child element of the context node (because 3 is coerced to position()=3).

Transparency No. 21

XPointer Extensions to XPath

generalize node-set to location-set, where a location can be a node, a point, or a range.

Rules for establishing the XPath evaluation context.

New location type tests: point(), range()

A new expression, RangeExpr, which can generate results of the range location type.

Extra functions: range-to(.), string-range(..??), here(), origin(), range(.), range-inside(.), start-point(.), end-point(.)

Transparency No. 22

XPointer: Context initialization

An XPointer is basically an XPath expression occurring in a URI.

When evaluated, the initial context is defined as follows:
  The context node is the root node of the document.
  The context position and size are both 1.
  The variable bindings are empty.
  The function library consists of the core XPath functions + a few extra functions.
  The namespace declarations are chosen to be the ones whose scope contains the XPointer.

Transparency No. 23

Definition of point location

A point P = (N, I) where
  N is a node called the container node of point P,
  I is an integer ≥ 0 called the index of point P.

Types of points:
  character-point: represents a position in a character string.
    N is a leaf node and the index points to the position between the Ith and (I+1)th characters in the string.
    Axes: no children, no siblings, parent is N.
  node-point: represents a position between nodes.
    N is a root or element node, and the index points to the position between the Ith and (I+1)th nodes.
    Axes: no children, parent is N, siblings are parent/node()[I] (and [I+1]).

Transparency No. 24

Definition of range location

A range R = (S,E) where S and E are the start point and end point of R, respectively.
  represents all content between the start point and end point.
  S and E must be in the same document.
  S must not appear after E in document order.
  S = E => collapsed range.
  S and E must have the same container node if the container node of either is a comment, attribute, namespace or PI node.
  Range locations do not have an expanded-name.

string-value of R:
  S and E are both character-points and have the same container node => the characters between the two points.
  Otherwise => the characters that are in text nodes and that are between the two points.

axes(R) = axes(S). start-point(R) = S, end-point(R) = E.

Transparency No. 25

Covering ranges of locations

A covering range of a location is the minimal range that wholly encompasses the location.

CR: location → range is defined as follows:

CR(loc) =
  if loc is a range => loc
  if loc is a point => (loc, loc)
  if loc is an attribute or namespace node => ( [loc, 0], [loc, |string-value(loc)|] )
  if loc is a root => ( [loc, 0], [loc, #child(loc)] )
  o/w (element, text, PI, comment) => ( [parent(loc), position(loc)-1], [parent(loc), position(loc)] )

CR(location-set) =def { CR(loc) | loc ∈ location-set }
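The case analysis for covering ranges can be written down almost literally. The Python sketch below uses a deliberately tiny, made-up node model (the Node class and its kind strings are assumptions); a point is a (node, index) tuple and a range is a pair of points.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False, repr=False)          # identity-based equality and hashing
class Node:
    kind: str                             # "root", "element", "text", "comment",
                                          # "pi", "attribute" or "namespace"
    value: str = ""                       # string-value of leaf-like nodes
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def covering_range(loc):
    if isinstance(loc, tuple) and isinstance(loc[0], tuple):
        return loc                                        # already a range
    if isinstance(loc, tuple):
        return (loc, loc)                                 # point: collapsed range
    if loc.kind in ("attribute", "namespace"):
        return ((loc, 0), (loc, len(loc.value)))
    if loc.kind == "root":
        return ((loc, 0), (loc, len(loc.children)))
    # element, text, PI, comment: node-points around loc in its parent
    pos = next(i for i, c in enumerate(loc.parent.children, start=1) if c is loc)
    return ((loc.parent, pos - 1), (loc.parent, pos))


def covering_ranges(location_set):
    return [covering_range(loc) for loc in location_set]


root = Node("root")
elem = root.add(Node("element"))
text = elem.add(Node("text", "hi"))
note = elem.add(Node("comment", "x"))
attr = Node("attribute", "abc", parent=elem)
print(covering_range(note) == ((elem, 1), (elem, 2)))
```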

Transparency No. 26

NodeTests for point and range Locations

XPointer extends the XPath production for NodeType by adding items for the point and range location types.

[8] NodeType ::= 'comment' | 'text' | 'processing-instruction' | 'node' | 'point' | 'range'

This definition allows NodeTests to select locations of type point and range from a location set that may include locations of all three types.

Transparency No. 27

Document Order

P = (N, I): any point =>

preceding(P), written pre(P): the node immediately preceding P:
  character-point => pre(P) = N
  node-point and I > 0 => pre(P) = N/node()[I]
  node-point and I = 0 => pre(P) = N if N has no attribute or namespace nodes; o/w => the last of N's attribute or namespace nodes.

following(P): the node immediately following P is the node that is immediately after preceding(P) in document order.

Transparency No. 28

Document Order

N: node, P: point, R: range

N < P iff N ≤ preceding(P)

N < R = (S,E) iff N < S

P1 = (N1, I1) < P2 = (N2, I2) iff (N1 = N2 and I1 < I2) or (N1 ≠ N2 and pre(P1) < pre(P2))

R1 = (S1, E1) < R2 = (S2, E2) iff S1 < S2 or (S1 = S2 and E1 < E2).

Note: Document order forms a total order on the set of locations in a document.

Transparency No. 29

XPointer functions

range(location-set) =def CR($1). Returns the set of covering ranges of the locations in $1.

start-point(loc-set) : location-set
end-point(loc-set) : location-set

Return the set of start-points (resp. end-points) of the locations in $1.

loc                         start-point(loc)   end-point(loc)
p: point                    p                  p
(p1,p2): range              p1                 p2
N: attribute or namespace   fail               fail
N: / or element             (N,0)              (N, |N/node()|)
N: PI or comment            (N,0)              (N, |string-value(N)|)

Transparency No. 30

range-inside()

range-inside(loc) : range
  loc is a point or range => loc.
  loc is a node N, element or / => ( [N,0], [N, #child(N)] ).
  loc is a node N, o/w => ( [N,0], [N, |string-value(N)|] ).

range-inside(loc-set) = {range-inside(loc) | loc in loc-set}

Transparency No. 31

range-to()

range-to(loc) : range => ( start-point(context-node), end-point(loc) )

range-to(locSetExpr) = { range-to(lc) | lc in Eval(locSetExpr) }.

Ex:
  1. xpointer(id("chap1")/range-to(id("chap2"))) => (start of chap1, end of chap2)
  2. xpointer(descendant::REVST/range-to(following::REVEND[1])) => set of (start of <REVST/>, next <REVEND/>)s.

Transparency No. 32

The string-range() Function

location-set string-range(location-set, string, pos?, size?)

For each location in $1, find and return all non-overlapping ranges matching $2.
  default pos = 1, size = |$2| [size 0 => collapsed range]
  pos is the start position, relative to the match of $2, of the range to return, and size is the length of the returned range (may be > |$2|).

Example:
  string-range(//title,"Thomas Pynchon")[17]
  string-range(//P,"Thomas Pynchon",8,0)[3] = string-range(string-range(//P,"Thomas Pynchon")[3],"P",1,0)
  string-range(/,"!",1,2)[5] : returns the fifth ! and the char following it.
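The roles of pos and size can be illustrated on a plain string. The Python sketch below is a simplification under stated assumptions: it works on a single string instead of a location-set, returns 0-based (start, end) character offsets instead of range locations, and the name string_range and the empty-needle check are choices made here.

```python
def string_range(text, needle, pos=1, size=None):
    """Return (start, end) offsets for each non-overlapping match of needle."""
    if not needle:
        raise ValueError("this sketch does not handle an empty search string")
    if size is None:
        size = len(needle)                     # default: exactly the match
    ranges = []
    i = text.find(needle)
    while i != -1:
        start = i + pos - 1                    # pos is 1-based within the match
        end = min(start + size, len(text))     # size may exceed len(needle)
        ranges.append((start, end))            # size 0 gives a collapsed range
        i = text.find(needle, i + len(needle)) # continue after this match
    return ranges


sample = "ab! cd!x"
bangs = string_range(sample, "!", 1, 2)        # each ! plus the following char
print([sample[s:e] for s, e in bangs])

name = "Thomas Pynchon"
before_p = string_range(name, name, 8, 0)      # collapsed range before the "P"
print(before_p)
```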

Transparency No. 33

here()

here() gets the location of the (text, attr or PI) node containing the current XPointer.

Example

<button>#xpointer(here()/../slides[1])</button>
<button x:type="simple" x:href="#xpointer(here()/../slides[2])"> … </button>

The 1st here() refers to the text child of the button element.
The 2nd here() refers to the x:href attr.

Transparency No. 34

origin()

origin() gets the element location where the user initiated link traversal

ex:
<link x:type="extended">
  <s x:type="locator" x:href="…#e1" x:label="s" />
  <d x:type="locator" x:href="…#xpointer(origin()/… )" x:label="d" />
  <arc x:type="arc" x:from="s" x:to="d" … />
</link>

Transparency No. 35

Example of here() and origin()

<xlink:ext>
  <locator role="from" href="http://ok/page#fromhere" />
  <locator role="to" href="#xpointer(origin()/…)" />
  <locator role="to" title="here" href="…#xpointer( here()/…)" />
  <xlink:arc from="from" to="to" />
</xlink:ext>

Transparency No. 36

Tools

Kinds of tools supporting XLink: browsers, parsers, link bases

www.fujitsu.co.jp/hypertext/free/HyBrick/en the HyBrick browser

www.loria.fr/projets/XSilfide/EN/sxp the SXP parser

www.stepuk.com/x2x/x2x_ove.asp the X2X link base

pages.wooster.edu/ludwigj/xml the Link browser

Warning: most tools do not support the newest specifications.

Transparency No. 37

Links to more information

www.w3.org/TR/xlink W3C's Working Draft on XLink

www.w3.org/TR/xptr W3C's Working Draft on XPointer

www.w3.org/TR/xpath W3C's specification on XPath

www.stg.brown.edu/~sjd/xlinkintro.html a brief introduction to XML linking

metalab.unc.edu/xml/books/bible/updates/16.html a chapter from The XML Bible on XLink

metalab.unc.edu/xml/books/bible/updates/17.html a chapter from The XML Bible on XPointer (and XPath)