Post on 20-Dec-2015
XQuery from the Experts
Chapter 5 – Introduction to the formal Semantics
Νίκος Λούτας
Outline
Getting started with the formal semantics
• Dynamic Semantics
• Environments
• Matching Values and Types
• Errors
• Static Semantics
• Type Soundness
• Evaluation order
• Normalization
Outline (cont’d)
Learning more about XQuery
• Values and Types
• Matching and Subtyping
• FLWOR Expressions
• Path Expressions
• Implicit Coercion and Function calls
• Node Identity and Element Constructors
Getting Started…
The XQuery formal semantics describes a processing model that relates four phases:
• Query parsing: takes a query as input and produces a parse tree
• Normalization: transforms the parse tree into an equivalent parse tree in the core language
• Static analysis: produces a parse tree in which each expression has been assigned a type
• Dynamic evaluation: takes a parse tree in the core language and reduces its expressions to the XML value that is the result of the query
Dynamic Semantics
Evaluation takes an expression and returns a value:
  Expr ⇒ Value
Value ::= Boolean | Integer
Boolean ::= fn:true() | fn:false()
Integer ::= 0 | 1 | -1 | 2 | -2 | …
Dynamic Semantics (cont’d)
Expr ::= Value
| Expr < Expr
| Expr + Expr
| if (Expr) then Expr else Expr
e.g. 5 < 10, 1 + 2, if(1 < 2) then 4 + 5 else 6 + 7
Dynamic Semantics (cont’d)
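Evaluation is described by five rules:

i. (VALUE)
  ---------------
  Value ⇒ Value

ii. (LT)
  Expr0 ⇒ Integer0    Expr1 ⇒ Integer1
  --------------------------------------
  Expr0 < Expr1 ⇒ Integer0 < Integer1

iii. (SUM)
  Expr0 ⇒ Integer0    Expr1 ⇒ Integer1
  --------------------------------------
  Expr0 + Expr1 ⇒ Integer0 + Integer1

Dynamic Semantics (cont’d)

iv. (IF-TRUE)
  Expr0 ⇒ fn:true()    Expr1 ⇒ Value
  ------------------------------------------
  if (Expr0) then Expr1 else Expr2 ⇒ Value

v. (IF-FALSE)
  Expr0 ⇒ fn:false()    Expr2 ⇒ Value
  ------------------------------------------
  if (Expr0) then Expr1 else Expr2 ⇒ Value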
Dynamic Semantics (cont’d)
Example: build a proof tree to evaluate the expression 1 + 2

  ------- (VALUE)    ------- (VALUE)
  1 ⇒ 1              2 ⇒ 2
  ---------------------------------- (SUM)
  1 + 2 ⇒ 3
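The five rules can be read as a recursive evaluator. The following is a minimal Python sketch, not part of the original slides; the tuple encoding of expressions (`"val"`, `"lt"`, `"sum"`, `"if"`) and the function name `evaluate` are assumptions made for illustration.

```python
# Expressions are nested tuples (an encoding assumed for this sketch):
#   ("val", v) | ("lt", e0, e1) | ("sum", e0, e1) | ("if", e0, e1, e2)

def evaluate(expr):
    """Reduce an expression to a value, one branch per evaluation rule."""
    tag = expr[0]
    if tag == "val":                  # (VALUE): a value evaluates to itself
        return expr[1]
    if tag == "lt":                   # (LT): evaluate both operands, then compare
        return evaluate(expr[1]) < evaluate(expr[2])
    if tag == "sum":                  # (SUM): evaluate both operands, then add
        return evaluate(expr[1]) + evaluate(expr[2])
    if tag == "if":
        if evaluate(expr[1]):         # (IF-TRUE): only the then-branch is evaluated
            return evaluate(expr[2])
        return evaluate(expr[3])      # (IF-FALSE): only the else-branch is evaluated
    raise ValueError(f"unknown expression: {expr!r}")

# 1 + 2
one_plus_two = ("sum", ("val", 1), ("val", 2))
# if (1 < 2) then 4 + 5 else 6 + 7
conditional = ("if", ("lt", ("val", 1), ("val", 2)),
               ("sum", ("val", 4), ("val", 5)),
               ("sum", ("val", 6), ("val", 7)))

print(evaluate(one_plus_two))   # 3
print(evaluate(conditional))    # 9
```

Each recursive call corresponds to one premise above the line of a rule, so the call tree of `evaluate(one_plus_two)` mirrors the proof tree for 1 + 2.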
Environment
dynEnv ⊢ Expr ⇒ Value
• dynEnv: the dynamic environment
• An environment may have many components
• varValue: a map from variables to their values
• Binding a variable in an environment overrides any previous binding of that variable
Environment (cont’d)
Notation                                            Meaning
∅                                                   The initial environment, with an empty map
dynEnv.varValue(Var1 ⇒ Value1, …, Varn ⇒ Valuen)    The environment that maps Vari to Valuei
dynEnv + varValue(Var ⇒ Value)                      The environment identical to dynEnv except that Var maps to Value
dynEnv.varValue(Var)                                The value of Var in dynEnv
dom(dynEnv.varValue)                                The set of variables mapped in dynEnv
Environment (cont’d)
Expr ::= …previous expressions…
| $Var
| let $Var := Expr return Expr
e.g. $x, let $x := 1 return $x + 2
Environment (cont’d)
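The five rules shown before need to be revised to carry the environment, e.g. (LT):

ii. (LT)
  dynEnv ⊢ Expr0 ⇒ Integer0    dynEnv ⊢ Expr1 ⇒ Integer1
  --------------------------------------------------------
  dynEnv ⊢ Expr0 < Expr1 ⇒ Integer0 < Integer1

Two more rules are added:

vi. (VAR)
  dynEnv.varValue(Var) = Value
  ------------------------------
  dynEnv ⊢ $Var ⇒ Value

vii. (LET)
  dynEnv ⊢ Expr0 ⇒ Value0
  dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1
  ---------------------------------------------------
  dynEnv ⊢ let $Var := Expr0 return Expr1 ⇒ Value1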
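The environment rules can be sketched in the same style. This is an illustrative Python sketch under assumed names: `dyn_env` is a plain dictionary standing in for dynEnv.varValue, and the tuple tags `"var"` and `"let"` are this sketch's own encoding.

```python
def evaluate(expr, dyn_env=None):
    """Evaluate an expression in a dynamic environment (a dict from names to values)."""
    env = {} if dyn_env is None else dyn_env
    tag = expr[0]
    if tag == "val":
        return expr[1]
    if tag == "var":                        # (VAR): look the variable up in varValue
        return env[expr[1]]
    if tag == "sum":
        return evaluate(expr[1], env) + evaluate(expr[2], env)
    if tag == "let":                        # (LET)
        _, name, bound_expr, body = expr
        value0 = evaluate(bound_expr, env)  # first premise: Expr0 => Value0
        extended = {**env, name: value0}    # dynEnv + varValue(Var => Value0)
        return evaluate(body, extended)     # second premise, in the extended environment
    raise ValueError(f"unknown expression: {expr!r}")

# let $x := 1 return $x + 2
simple = ("let", "x", ("val", 1), ("sum", ("var", "x"), ("val", 2)))
# let $x := 1 return let $x := 10 return $x + 2   (the inner binding overrides the outer one)
shadowed = ("let", "x", ("val", 1),
            ("let", "x", ("val", 10), ("sum", ("var", "x"), ("val", 2))))

print(evaluate(simple))     # 3
print(evaluate(shadowed))   # 12
```

Building a new dictionary for the extended environment, rather than mutating the old one, matches the notation: the binding is visible only while the body of the let is evaluated.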
Matching Values and Types
The value must match the variable's declared type; otherwise an error is raised

Expr ::= …previous expressions… | let $Var as Type := Expr return Expr
Type ::= xs:boolean | xs:integer

• Static semantics: type declarations are checked when the expression is analyzed
• Dynamic semantics: type declarations are checked when the expression is evaluated

The judgment used is:
  Value matches Type
Matching Values and Types (cont’d)
Three new rules are added:

viii. (INT-MATCH)
  ----------------------------
  Integer matches xs:integer

ix. (BOOL-MATCH)
  ----------------------------
  Boolean matches xs:boolean

x. (LET-DECL)
  dynEnv ⊢ Expr0 ⇒ Value0
  Value0 matches Type
  dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1
  -----------------------------------------------------------
  dynEnv ⊢ let $Var as Type := Expr0 return Expr1 ⇒ Value1
Errors
dynEnv├ Expr raises Error
Error ::= typeErr | dynErr
Expr ::= …previous expressions… | Expr idiv Expr
Errors (cont’d)
Type errors are triggered if an operand's value does not match the operator's required type. The judgment used is:
  not (Value matches Type)

  dynEnv ⊢ Expr0 ⇒ Value0
  not (Value0 matches Type)
  -----------------------------------------------------------------
  dynEnv ⊢ let $Var as Type := Expr0 return Expr1 raises typeErr
Errors (cont’d) - Dynamic errors
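  dynEnv ⊢ Expr0 ⇒ Value0
  dynEnv ⊢ Expr1 ⇒ Value1
  Value1 ≠ 0
  ---------------------------------------------------
  dynEnv ⊢ Expr0 idiv Expr1 ⇒ Value0 idiv Value1

  dynEnv ⊢ Expr1 ⇒ 0
  -----------------------------------------
  dynEnv ⊢ Expr0 idiv Expr1 raises dynErr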
Errors (cont’d)
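Example: what errors does the following expression raise?
  (1 idiv 0) + (2 < 3)
• dynErr: the left operand divides by zero
• typeErr: the right operand is a boolean, but + requires integers
Either error may be raised.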
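The two kinds of error can be sketched as two exception classes. This Python sketch is illustrative only: the class names `TypeErr` and `DynErr`, the helper `require_int`, and the fixed left-to-right operand order are assumptions of the sketch, not something the slides prescribe.

```python
class TypeErr(Exception):
    """Raised when an operand's value does not match the required type."""

class DynErr(Exception):
    """Raised for dynamic errors such as division by zero."""

def require_int(value):
    # Python's bool is a subclass of int, so compare the exact type.
    if type(value) is not int:
        raise TypeErr(f"expected xs:integer, got {value!r}")
    return value

def evaluate(expr):
    tag = expr[0]
    if tag == "val":
        return expr[1]
    if tag == "lt":
        return require_int(evaluate(expr[1])) < require_int(evaluate(expr[2]))
    if tag == "sum":
        return require_int(evaluate(expr[1])) + require_int(evaluate(expr[2]))
    if tag == "idiv":
        left = require_int(evaluate(expr[1]))
        right = require_int(evaluate(expr[2]))
        if right == 0:
            raise DynErr("division by zero")
        quotient = abs(left) // abs(right)      # idiv truncates toward zero
        return quotient if (left < 0) == (right < 0) else -quotient
    raise ValueError(f"unknown expression: {expr!r}")

def outcome(expr):
    """Return the value, or the name of the error that was raised."""
    try:
        return evaluate(expr)
    except (TypeErr, DynErr) as err:
        return type(err).__name__

division = ("idiv", ("val", 1), ("val", 0))
comparison = ("lt", ("val", 2), ("val", 3))

print(outcome(("sum", division, comparison)))   # DynErr
print(outcome(("sum", comparison, division)))   # TypeErr
```

Because this sketch always evaluates the left operand first, swapping the operands changes which error surfaces. The formal semantics leaves that order open, which is why both errors are acceptable answers for the example.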
Static Semantics
How static types are associated with expressions
• Static typing takes a static environment and an expression and returns a type:
    statEnv ⊢ Expr : Type
• statEnv: the static environment, which captures the context available at query-analysis time (variables and their types)
• If an expression type checks, there is no need to check for type errors at evaluation time
Static Semantics (cont’d)
Two rules assign types to literals:
  statEnv ⊢ Boolean : xs:boolean (BOOLEAN-STATIC)
  statEnv ⊢ Integer : xs:integer (INTEGER-STATIC)
The rule for conditionals:
  statEnv ⊢ Expr0 : xs:boolean    statEnv ⊢ Expr1 : Type    statEnv ⊢ Expr2 : Type
  ---------------------------------------------------------------------------------- (IF-STATIC)
  statEnv ⊢ if (Expr0) then Expr1 else Expr2 : Type
• We do not examine the value of the condition; it is not known statically
• We examine only the type of the condition, which must be boolean
• The branches must have the same type
Static Semantics (cont’d)
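Expression: if (1 < 3) then 3 + 4 else 5 + 6

Each literal below is typed by (INTEGER-STATIC).

  statEnv ⊢ 1 : xs:integer    statEnv ⊢ 3 : xs:integer
  ------------------------------------------------------
  statEnv ⊢ 1 < 3 : xs:boolean

  statEnv ⊢ 3 : xs:integer    statEnv ⊢ 4 : xs:integer
  ------------------------------------------------------
  statEnv ⊢ 3 + 4 : xs:integer

  statEnv ⊢ 5 : xs:integer    statEnv ⊢ 6 : xs:integer
  ------------------------------------------------------
  statEnv ⊢ 5 + 6 : xs:integer

  statEnv ⊢ 1 < 3 : xs:boolean    statEnv ⊢ 3 + 4 : xs:integer    statEnv ⊢ 5 + 6 : xs:integer
  ---------------------------------------------------------------------------------------------- (IF-STATIC)
  statEnv ⊢ if (1 < 3) then 3 + 4 else 5 + 6 : xs:integer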
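The static rules can be sketched as a type checker that never evaluates anything. This Python sketch is illustrative: the names `type_of` and `StaticTypeError` are hypothetical, and the typing of `<` and `+` (integer operands, boolean and integer results) is an assumption, since the slides state only the literal and conditional rules.

```python
class StaticTypeError(Exception):
    """Raised when no static type can be assigned to an expression."""

def type_of(expr, stat_env=None):
    """Return the static type of an expression without evaluating it."""
    env = {} if stat_env is None else stat_env
    tag = expr[0]
    if tag == "val":      # (BOOLEAN-STATIC) and (INTEGER-STATIC)
        return "xs:boolean" if isinstance(expr[1], bool) else "xs:integer"
    if tag == "var":      # the variable's declared type comes from statEnv
        return env[expr[1]]
    if tag in ("lt", "sum"):   # assumed rules: both operands must be integers
        for operand in expr[1:]:
            if type_of(operand, env) != "xs:integer":
                raise StaticTypeError(f"{tag} requires xs:integer operands")
        return "xs:boolean" if tag == "lt" else "xs:integer"
    if tag == "if":       # (IF-STATIC)
        if type_of(expr[1], env) != "xs:boolean":
            raise StaticTypeError("condition must be xs:boolean")
        then_type = type_of(expr[2], env)
        else_type = type_of(expr[3], env)
        if then_type != else_type:
            raise StaticTypeError("branches must have the same type")
        return then_type
    raise ValueError(f"unknown expression: {expr!r}")

# if (1 < 3) then 3 + 4 else 5 + 6
example = ("if", ("lt", ("val", 1), ("val", 3)),
           ("sum", ("val", 3), ("val", 4)),
           ("sum", ("val", 5), ("val", 6)))
print(type_of(example))   # xs:integer
```

Note that the `"if"` branch inspects the types of both branches and never looks at the value of the condition, which is the point made by the remarks on (IF-STATIC).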
Type Soundness
Suppose statEnv ⊢ Expr : Type. Then Expr either yields a value that matches Type or raises a dynamic error.
The judgment dynEnv matches statEnv captures the relationship between a dynamic and a static environment, e.g.
  dynEnv1 := varValue(x ⇒ 1, y ⇒ 0, z ⇒ fn:false())
  statEnv1 := varType(x : xs:integer, y : xs:integer, z : xs:boolean)
Type Soundness (cont’d)
Theorem for Values
if
dynEnv matches statEnv
dynEnv ⊢ Expr ⇒ Value
statEnv ⊢ Expr : Type
then
Value matches Type
Type Soundness (cont’d)
Example
dynEnv1 matches statEnv1
dynEnv1 ⊢ if ($z) then $x else $y ⇒ 0
statEnv1 ⊢ if ($z) then $x else $y : xs:integer
0 matches xs:integer
Type Soundness (cont’d)
Theorem for Errors
if
dynEnv matches statEnv
dynEnv ├ Expr raises Error
statEnv ├ Expr : Type
then
Error ≠ typeErr
Type Soundness (cont’d)
Example
dynEnv1 matches statEnv1
dynEnv1├ $x idiv $y raises dynErr
statEnv1├ $x idiv $y : xs: integer
Type Soundness (cont’d)
Remember that:
• If an expression raises a type error, then it cannot type check, e.g.
    dynEnv1 ⊢ $x + $z raises typeErr
    statEnv1 ⊬ $x + $z : Type
• An expression that does not raise a type error may still fail to type check, e.g.
    dynEnv1 ⊢ if ($x < $y) then $x + $z else $y ⇒ 0
    statEnv1 ⊬ if ($x < $y) then $x + $z else $y : Type
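Both directions of the remark can be checked mechanically. The Python sketch below is illustrative and self-contained: the names `evaluate`, `type_of`, `type_checks`, `dyn_env1` and `stat_env1` are this sketch's own, and the static rules for `<` and `+` are assumed to require integer operands.

```python
class TypeErr(Exception):
    """Dynamic type error."""

class StaticTypeError(Exception):
    """The expression has no static type."""

def evaluate(expr, env):
    tag = expr[0]
    if tag == "var":
        return env[expr[1]]
    if tag in ("lt", "sum"):
        left, right = evaluate(expr[1], env), evaluate(expr[2], env)
        if type(left) is not int or type(right) is not int:
            raise TypeErr(f"{tag} requires integers")
        return left < right if tag == "lt" else left + right
    if tag == "if":
        condition = evaluate(expr[1], env)
        if type(condition) is not bool:
            raise TypeErr("condition must be a boolean")
        return evaluate(expr[2] if condition else expr[3], env)
    raise ValueError(f"unknown expression: {expr!r}")

def type_of(expr, env):
    tag = expr[0]
    if tag == "var":
        return env[expr[1]]
    if tag in ("lt", "sum"):
        if any(type_of(operand, env) != "xs:integer" for operand in expr[1:]):
            raise StaticTypeError(f"{tag} requires xs:integer operands")
        return "xs:boolean" if tag == "lt" else "xs:integer"
    if tag == "if":
        if type_of(expr[1], env) != "xs:boolean":
            raise StaticTypeError("condition must be xs:boolean")
        then_type, else_type = type_of(expr[2], env), type_of(expr[3], env)
        if then_type != else_type:
            raise StaticTypeError("branches must have the same type")
        return then_type
    raise ValueError(f"unknown expression: {expr!r}")

def type_checks(expr, env):
    try:
        type_of(expr, env)
        return True
    except StaticTypeError:
        return False

dyn_env1 = {"x": 1, "y": 0, "z": False}
stat_env1 = {"x": "xs:integer", "y": "xs:integer", "z": "xs:boolean"}

x, y, z = ("var", "x"), ("var", "y"), ("var", "z")
# if ($x < $y) then $x + $z else $y
conservative = ("if", ("lt", x, y), ("sum", x, z), y)

print(evaluate(conservative, dyn_env1))      # 0: the ill-typed branch is never taken
print(type_checks(conservative, stat_env1))  # False: static typing inspects both branches
```

Static typing is conservative: it rejects `conservative` because one branch is ill-typed, even though that branch is not reached for these particular variable values.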
Evaluation Order
In Expr and Expr, the operands may be tested in either order:
• Stop and return false if either one is false
• Raise an error if either one raises an error
Example: (1 idiv 0 < 2) and (4 < 3)
• Two possible results: dynErr or false
• The result depends on which operand is evaluated first
• Both results are correct
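The effect of evaluation order can be sketched by passing the two operands as unevaluated functions. This Python sketch is illustrative; `eval_and`, its `left_first` flag and the `DynErr` class are hypothetical names introduced here.

```python
class DynErr(Exception):
    """Dynamic error, e.g. division by zero."""

def idiv(left, right):
    if right == 0:
        raise DynErr("division by zero")
    quotient = abs(left) // abs(right)      # truncate toward zero
    return quotient if (left < 0) == (right < 0) else -quotient

def eval_and(left, right, left_first=True):
    """Evaluate `left and right`; operands are zero-argument functions."""
    first, second = (left, right) if left_first else (right, left)
    if not first():          # stop as soon as one operand is false
        return False
    return second()

def outcome(left_first):
    left = lambda: idiv(1, 0) < 2     # (1 idiv 0 < 2)
    right = lambda: 4 < 3             # (4 < 3)
    try:
        return eval_and(left, right, left_first)
    except DynErr:
        return "dynErr"

print(outcome(left_first=True))    # dynErr
print(outcome(left_first=False))   # False
```

An implementation is free to pick either order, so a conforming processor may produce either outcome for this expression.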
Normalization
Normalization takes an expression in full XQuery and returns an equivalent expression in core XQuery:
  [FullExpr]Expr == Expr
FullExpr ::= Expr | let $Var as Type := Expr where Expr return Expr
[Expr0 + Expr1]Expr == [Expr0]Expr + [Expr1]Expr
[$Var]Expr == $Var
[let $Var as Type := Expr0 where Expr1 return Expr2]Expr ==
  let $Var as Type := [Expr0]Expr return if ([Expr1]Expr) then [Expr2]Expr else ()
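Normalization is a syntax-to-syntax rewrite, which the following Python sketch illustrates on tuple-encoded expressions. The tags (`"let_where"`, `"let"`, `"if"`, `"empty"`) and the function name `normalize` are assumptions of this sketch.

```python
EMPTY = ("empty",)   # stands for the empty sequence ()

def normalize(expr):
    """Rewrite a full-language expression into the core language."""
    tag = expr[0]
    if tag in ("val", "var"):       # [$Var]Expr == $Var, and likewise for values
        return expr
    if tag in ("sum", "lt"):        # normalize both operands, keep the operator
        return (tag, normalize(expr[1]), normalize(expr[2]))
    if tag == "let_where":
        # [let $Var as Type := Expr0 where Expr1 return Expr2]Expr
        _, var, declared_type, bound, condition, body = expr
        return ("let", var, declared_type, normalize(bound),
                ("if", normalize(condition), normalize(body), EMPTY))
    raise ValueError(f"unknown expression: {expr!r}")

# let $x as xs:integer := 1 + 2 where $x < 5 return $x + 1
full = ("let_where", "x", "xs:integer",
        ("sum", ("val", 1), ("val", 2)),
        ("lt", ("var", "x"), ("val", 5)),
        ("sum", ("var", "x"), ("val", 1)))

core = normalize(full)
print(core[0])      # let
print(core[4][0])   # if
```

After normalization the `where` clause is gone: the core expression contains only `let` and `if`, so the dynamic and static rules need to cover only those forms.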
Outline
So far we have covered:
• Dynamic Semantics
• Environments
• Matching Values and Types
• Errors
• Static Semantics
• Type Soundness
• Evaluation order
• Normalization
Outline (cont’d)
Now we will talk in more depth about:
• Values and Types
• Matching and Subtyping
• FLWOR Expressions
• Path Expressions
• Implicit Coercion and Function calls
• Node Identity and Element Constructors
Part II: Values and Types
A value is a sequence of zero or more items
Value ::= () | Item (, Item)*
Item ::= AtomicValue | NodeValue
AtomicValue ::= xs:integer(String) | xs:boolean(String) | xs:string(String) | xs:date(String)
e.g. xs:string("XQuery") (i.e. "XQuery"), xs:boolean("false") (i.e. fn:false())
Values and Types (cont’d)
NodeValue ::= element ElementName TypeAnnotation? { Value }
| text { String }
ElementName ::= QName
TypeAnnotation ::= of type TypeName
TypeName ::= QName
Values and Types (cont’d)
ItemType ::= NodeType | AtomicType
NodeType ::= ElementType | text ()
AtomicType ::= AtomicTypeName
AtomicTypeName ::= xs:string | xs:integer | xs:boolean | xs:date
Values and Types (cont’d)
ElementType ::= element((ElementName (, TypeName)?)?)
  element(article): refers to a global declaration
  element(article, xs:string): a local declaration
Type ::= none() | empty() | ItemType | Type , Type | Type | Type | Type Occurrence
Occurrence ::= ? | + | *
Values and Types (cont’d)
SimpleType ::= AtomicTypeName | SimpleType | SimpleType | SimpleType Occurrence
Definition ::= define element ElementName TypeAnnotation
  | define type TypeName TypeDerivation
TypeDerivation ::= restricts AtomicTypeName | restricts TypeName { Type } | { Type }
Values and Types (cont’d)
Example
define element article of type Article
define type Article {
  element(name, xs:string),
  element(reserve_price, PriceList) *
}
define type PriceList restricts xs:anyType { xs:decimal * }
Matching and Subtyping
Matching relates complex XML values to complex types, e.g.
  <reserve_price> 10.00 20.00 25.00 </reserve_price>
  element reserve_price of type PriceList {10.0, 20.0, 25.0} matches element(reserve_price)
Subtyping checks whether one type is a subtype of another
Matching and Subtyping (cont’d)
Yields
ElementType yields element(ElementName, TypeName)
If the ElementType:
• is a reference to a global element: it yields the name of the element and the type annotation from the element declaration
• contains an element name with a type annotation: it yields the element name and the type name in the type annotation
• has a wildcard name followed by a type name: it yields the wildcard name and the type name
• has neither an element name nor a type name: it yields the wildcard name and xs:anyType
Matching and Subtyping (cont’d)
Substitutes for
ElementName1 substitutes for ElementName2:
• when the two names are equal
• when the second name is the wildcard *
An element name may substitute for itself:
  statEnv ⊢ ElementName substitutes for ElementName
Matching and Subtyping (cont’d)
Derives
TypeName1 derives from TypeName2
e.g. PriceList derives from xs:anyType
• Every type name derives from the type name that it is declared to derive from by restriction
• The relation is reflexive and transitive
Matching and Subtyping (cont’d)
Matches
Value matches Type
e.g. (10.0, 20.0, 25.0) matches xs:decimal *
• The empty sequence matches the empty sequence type, e.g. () matches empty()
• If two values match two types, then their sequence matches the corresponding sequence type
Matching and Subtyping (cont’d)
Matches
If a value matches a type, then it also matches a choice type in which that type is one of the choices:

  Value matches Type1
  -----------------------------
  Value matches Type1 | Type2

A value matches an optional occurrence of a type if it matches either the type or the empty sequence:

  Value matches empty() | Type
  ------------------------------
  Value matches Type ?
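The matching rules can be sketched as a recursive predicate over a small type language. This Python sketch is illustrative: the tuple encoding of types, the mapping from Python values to atomic type names, and the treatment of `+` and `*` (which the slides do not spell out) are assumptions.

```python
from decimal import Decimal

# Types (an encoding assumed for this sketch):
#   ("empty",) | ("atomic", name) | ("seq", t1, t2) | ("choice", t1, t2) | ("occ", t, "?" | "+" | "*")

def atomic_type(item):
    if isinstance(item, bool):          # test bool first: bool is a subclass of int
        return "xs:boolean"
    if isinstance(item, int):
        return "xs:integer"
    if isinstance(item, Decimal):
        return "xs:decimal"
    if isinstance(item, str):
        return "xs:string"
    raise TypeError(f"unsupported item: {item!r}")

def matches(value, typ):
    """value is a tuple of items; return True if it matches typ."""
    tag = typ[0]
    if tag == "empty":                  # () matches empty()
        return len(value) == 0
    if tag == "atomic":
        return len(value) == 1 and atomic_type(value[0]) == typ[1]
    if tag == "seq":                    # split the value into two matching halves
        return any(matches(value[:i], typ[1]) and matches(value[i:], typ[2])
                   for i in range(len(value) + 1))
    if tag == "choice":                 # matching either alternative is enough
        return matches(value, typ[1]) or matches(value, typ[2])
    if tag == "occ":
        inner, occurrence = typ[1], typ[2]
        if occurrence == "?":           # Type ? behaves like empty() | Type
            return matches(value, ("choice", ("empty",), inner))
        if occurrence == "+":           # assumed: Type + behaves like Type , Type *
            return matches(value, ("seq", inner, ("occ", inner, "*")))
        if occurrence == "*":           # assumed: empty, or a non-empty prefix then Type *
            if len(value) == 0:
                return True
            return any(matches(value[:i], inner) and matches(value[i:], typ)
                       for i in range(1, len(value) + 1))
    raise ValueError(f"unknown type: {typ!r}")

decimal_star = ("occ", ("atomic", "xs:decimal"), "*")
prices = (Decimal("10.0"), Decimal("20.0"), Decimal("25.0"))
print(matches(prices, decimal_star))    # True
```

The sketch tries every split point for sequence types, which is simple but exponential in the worst case; it is meant to mirror the rules, not to be an efficient matcher.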
Matching and Subtyping (cont’d)
Subtyping
Type1 subtype Type2
if and only if, for every Value,
Value matches Type1 implies Value matches Type2
e.g. element(*, PriceList) subtype element(xs:integer)
FLWOR Expressions
Expr ::= …previous expressions… | FLWRExpr
FLWRExpr ::= Clause+ return Expr
Clause ::= ForExpr | LetExpr | WhereExpr
ForExpr ::= for ForBinding (, ForBinding)*
LetExpr ::= let LetBinding (, LetBinding)*
WhereExpr ::= where Expr
ForBinding ::= $Var TypeDeclaration? PositionVar? in Expr
LetBinding ::= $Var TypeDeclaration? := Expr
TypeDeclaration ::= as SequenceType
PositionVar ::= at $Var
SequenceType ::= ItemType Occurrence
FLWOR Expressions (cont’d)
Normalization
A for/let clause with more than one binding (n > 1) is normalized by turning each binding into a separate nested for/let expression and normalizing the result:
  [let LetBinding1, …, LetBindingn return Expr]Expr ==
    [let LetBinding1 return … [let LetBindingn return Expr]Expr …]Expr
A where clause is normalized into an if expression that returns the empty sequence if the condition is false, and the result is then normalized:
  [where Expr0 return Expr1]Expr == [if (Expr0) then Expr1 else ()]Expr
FLWOR Expressions (cont’d)
for $i in $I, $j in $J
let $k := $i + $j
where $k >= 5
return ($i, $j)

is normalized to

for $i in $I return
  for $j in $J return
    let $k := $i + $j return
      if ($k >= 5) then ($i, $j)
      else ()
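The equivalence of the two forms can be checked on concrete data. In this illustrative Python sketch, a list comprehension stands in for the original FLWOR expression and explicit nested loops stand in for the normalized one; the function names and the sample inputs are made up for the example.

```python
def surface_form(I, J):
    """for $i in $I, $j in $J let $k := $i + $j where $k >= 5 return ($i, $j)"""
    # XQuery sequences are flat, so each ($i, $j) contributes two items.
    return [item for i in I for j in J if i + j >= 5 for item in (i, j)]

def normalized_form(I, J):
    """One nested loop per binding; the where clause becomes if/else ()."""
    result = []
    for i in I:
        for j in J:
            k = i + j
            result.extend([i, j] if k >= 5 else [])   # else (): contribute nothing
    return result

I, J = [1, 2, 3], [3, 4]
print(surface_form(I, J))                            # [1, 4, 2, 3, 2, 4, 3, 3, 3, 4]
print(surface_form(I, J) == normalized_form(I, J))   # True
```

Returning the empty sequence in the else branch is what makes the rewrite safe: concatenating an empty sequence leaves the overall result unchanged.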
FLWOR Expressions (cont’d)
Factored types consist of a prime type (a choice of item types) and an occurrence indicator (a quantifier)
Result type = Prime ∙ Quantifier
e.g. ((xs:integer, xs:string) | xs:integer) * subtype (xs:integer | xs:string) *
  prime(((xs:integer, xs:string) | xs:integer) *) = xs:integer | xs:string
  quant(((xs:integer, xs:string) | xs:integer) *) = *
FLWOR Expressions (cont’d)
Factorization theorem
for all types we have
Type subtype prime (Type) ∙ quant (Type)
further if
Type subtype Prime ∙ Quantifier
then
prime (Type) subtype Prime and
quant (Type) ≤ Quantifier
1 ≤ ?, 1 ≤ +, ? ≤ *, + ≤ *
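Factorization can be sketched by computing the prime type and the quantifier separately. The Python sketch below is illustrative: the tuple encoding of types is assumed, and the quantifier arithmetic (each quantifier as a pair of lower and upper bounds) is this sketch's own encoding rather than the tables of the formal semantics.

```python
MANY = 2   # stands for "more than one item"

# Each quantifier as (minimum, maximum) number of items.
BOUNDS = {"1": (1, 1), "?": (0, 1), "+": (1, MANY), "*": (0, MANY)}
NAMES = {bounds: name for name, bounds in BOUNDS.items()}

def prime(typ):
    """The set of item types that can occur anywhere in the type."""
    tag = typ[0]
    if tag == "empty":
        return frozenset()
    if tag == "atomic":
        return frozenset([typ[1]])
    if tag in ("seq", "choice"):
        return prime(typ[1]) | prime(typ[2])
    if tag == "occ":
        return prime(typ[1])
    raise ValueError(f"unknown type: {typ!r}")

def quant(typ):
    """A quantifier that bounds how many items the type allows."""
    tag = typ[0]
    if tag == "empty":
        return "?"
    if tag == "atomic":
        return "1"
    if tag == "occ":
        (lo1, hi1), (lo2, hi2) = BOUNDS[quant(typ[1])], BOUNDS[typ[2]]
        return NAMES[(lo1 * lo2, max(hi1, hi2))]
    (lo1, hi1), (lo2, hi2) = BOUNDS[quant(typ[1])], BOUNDS[quant(typ[2])]
    if tag == "seq":       # a sequence may hold more than one item
        return NAMES[(max(lo1, lo2), MANY)]
    if tag == "choice":    # either alternative may be taken
        return NAMES[(min(lo1, lo2), max(hi1, hi2))]
    raise ValueError(f"unknown type: {typ!r}")

def quant_le(q1, q2):
    """q1 ≤ q2 when q2 allows every item count that q1 allows."""
    (lo1, hi1), (lo2, hi2) = BOUNDS[q1], BOUNDS[q2]
    return lo2 <= lo1 and hi1 <= hi2

integer, string = ("atomic", "xs:integer"), ("atomic", "xs:string")
# ((xs:integer, xs:string) | xs:integer) *
example = ("occ", ("choice", ("seq", integer, string), integer), "*")

print(sorted(prime(example)))   # ['xs:integer', 'xs:string']
print(quant(example))           # *
```

The result is deliberately approximate: the factored type forgets the order and pairing of items, which is why the original type is only a subtype of prime(Type) ∙ quant(Type).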
Path Expressions
[QName]Path == child::QName
  e.g. [book/isbn]Path == child::book/child::isbn
[.]Path == self::node()
[..]Path == parent::node()
[Expr1//Expr2]Path == [Expr1/descendant-or-self::node()/Expr2]Path
Path Expressions (cont’d)
Expr ::= …previous expressions… | PathExpr
PathExpr ::= / | / RelativePathExpr | RelativePathExpr
RelativePathExpr ::= RelativePathExpr / StepExpr | StepExpr | RelativePathExpr // StepExpr
StepExpr ::= (ForwardStep | ReverseStep) Predicates
ForwardStep ::= ForwardAxis NodeTest
ReverseStep ::= ReverseAxis NodeTest
ForwardAxis ::= child:: | descendant:: | self:: | descendant-or-self::
ReverseAxis ::= parent::
Predicates ::= ( [ Expr ] )*
NodeTest ::= text() | node() | * | QName

Rule that relates normalization of expressions to normalization of path expressions:
  [PathExpr]Expr == fs:distinct-docorder([PathExpr]Path)
Normalization of absolute path expressions:
  [/]Path == fn:root($fs:dot)
  [/RelativePathExpr]Path == [fn:root($fs:dot)/RelativePathExpr]Path
• The built-in variable $fs:dot represents the context node
• An absolute path expression refers to the root of the XML tree that contains the context node
Path Expressions (cont’d)
Normalization of "/"
[RelativePathExpr / StepExpr]Path ==
  let $fs:sequence := fs:distinct-docorder([RelativePathExpr]Path) return
  let $fs:last := fn:count($fs:sequence) return
  for $fs:dot at $fs:position in $fs:sequence return
    [StepExpr]Path
This rule binds the variables $fs:sequence, $fs:last, $fs:dot and $fs:position to, respectively, the context sequence, the context size, the context node and the position of that node in the context sequence
Path Expressions (cont’d)
Normalization of step expressions:
[ForwardStep Predicates [Expr]]Path ==
  let $fs:sequence := [ForwardStep Predicates]Path return
  let $fs:last := fn:count($fs:sequence) return
  for $fs:dot at $fs:position in $fs:sequence return
    if ([Expr]Predicates) then $fs:dot else ()
A similar rule applies to ReverseStep, except that $fs:position is bound in reverse order
Example (simplified): child::*[2] is normalized to
  let $fs:sequence := child::* return
  let $fs:last := fn:count($fs:sequence) return
  for $fs:dot at $fs:position in $fs:sequence return
    if (fn:position() = 2) then $fs:dot else ()
Path Expressions (cont’d)
Predicate mapping
[Expr]Predicates ==
typeswitch([Expr]Expr)
case numeric $v return
op:numeric-equal(fn:round($v), $fs:position)
default $v return
fn:boolean($v)
Finally, axis mapping is straightforward
[ForwardAxis :: NodeTest]Path == ForwardAxis :: NodeTest
[ReverseAxis :: NodeTest]Path == ReverseAxis :: NodeTest
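The predicate mapping distinguishes numeric predicates from all others. The Python sketch below illustrates that dispatch on plain lists; `filter_step`, `predicate_holds` and `effective_boolean` are hypothetical names, and `effective_boolean` covers only a small part of what fn:boolean does.

```python
import math

def xquery_round(number):
    # fn:round rounds halves upward, unlike Python's built-in round().
    return math.floor(number + 0.5)

def effective_boolean(value):
    """A simplified stand-in for fn:boolean (assumption of this sketch)."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (list, tuple, str)):
        return len(value) > 0
    raise TypeError(f"no effective boolean value for {value!r}")

def predicate_holds(value, position):
    """Mirror the typeswitch: numeric values are compared with the position."""
    if not isinstance(value, bool) and isinstance(value, (int, float)):
        return xquery_round(value) == position     # case numeric
    return effective_boolean(value)                # default case

def filter_step(sequence, predicate):
    """Keep each context item for which the predicate holds.

    predicate(dot, position, last) plays the role of [Expr]Expr evaluated
    with $fs:dot, $fs:position and $fs:last bound.
    """
    last = len(sequence)
    return [dot for position, dot in enumerate(sequence, start=1)
            if predicate_holds(predicate(dot, position, last), position)]

children = ["a", "b", "c"]
print(filter_step(children, lambda dot, position, last: 2))            # ['b']
print(filter_step(children, lambda dot, position, last: last))         # ['c']
print(filter_step(children, lambda dot, position, last: dot != "b"))   # ['a', 'c']
```

The first call corresponds to a numeric predicate such as `[2]`, the second to `[last()]`, and the third to an ordinary boolean condition on the context item.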
Path Expressions (cont’d)
The path expression $input//a/b is normalized to:
fs:distinct-docorder(
  let $fs:sequence := (
    fs:distinct-docorder(
      let $fs:sequence := $input return
      let $fs:last := fn:count($fs:sequence) return
      for $fs:dot at $fs:position in $fs:sequence return
        fs:distinct-docorder(
          let $fs:sequence := descendant-or-self::node() return
          let $fs:last := fn:count($fs:sequence) return
          for $fs:dot at $fs:position in $fs:sequence return
            child::a))
  ) return
  let $fs:last := fn:count($fs:sequence) return
  for $fs:dot at $fs:position in $fs:sequence return
    child::b)
Implicit Coercion and Function calls
XQuery can represent a Schema containing irregular data in the formal type notation

<xs:element name="article" type="Article"/>
<xs:complexType name="Article">
  <xs:sequence>
    <xs:element name="name" type="xs:string"/>
    <xs:element name="reserve_price" type="PriceList" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
<xs:simpleType name="PriceList">
  <xs:list itemType="xs:decimal"/>
</xs:simpleType>

define element article of type Article
define type Article {
  element(name, xs:string),
  element(reserve_price, PriceList) *
}
define type PriceList { xs:decimal * }
Implicit Coercion and Function calls (cont’d)
• An arithmetic expression is well defined on any item sequence that can be coerced to zero or one atomic value
    $article/reserve_price: a sequence of zero or more reserve_price elements
• A comparison is well defined on any item sequence that can be coerced to a sequence of atomic values
    $article/reserve_price < 100: the typed content of $article/reserve_price is automatically extracted
• XPath's predicate expressions are well defined on any item sequence
    $article[reserve_price]: returns each node in $article that has at least one reserve_price child
Implicit Coercion and Function calls (cont’d)
First coercion
• Applied to expressions that require a boolean value
• Maps the Expr argument to a core expression and applies fn:boolean to the result

[if (Expr0) then Expr1 else Expr2]Expr ==
  if (fn:boolean([Expr0]Expr)) then [Expr1]Expr
  else [Expr2]Expr
Implicit Coercion and Function calls (cont’d)
Second coercion
• Applied to an expression when it is used in a context that requires a sequence of atomic values
• Maps the Expr argument to a core expression, then applies fn:data to the result

fn:data takes any item sequence, applies the following rules to each item and concatenates the results:
• If the item is an atomic value, it is returned
• Otherwise, the item is a node and its typed value is returned
Implicit Coercion and Function calls (cont’d)
Normalization rule for +
[Expr1 + Expr2]Expr ==
  let $v1 := fn:data([Expr1]Expr) return
  let $v2 := fn:data([Expr2]Expr) return
    fs:plus($v1, $v2)

Normalization rule for <
[Expr1 < Expr2]Expr ==
  some $v1 in fn:data([Expr1]Expr) satisfies
  some $v2 in fn:data([Expr2]Expr) satisfies
    fs:less-than($v1, $v2)

Example: $article/reserve_price + 10.00, where $article/reserve_price is
• <reserve_price/>: returns ()
• <reserve_price>10.00</reserve_price>: returns 20.00
• <reserve_price>10.00</reserve_price>, <reserve_price>20.00 25.00</reserve_price>: raises a type error, because the atomized value is a sequence of decimals
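The three outcomes of the example follow from atomizing first and adding second. This Python sketch is illustrative: `Element`, `fn_data`, `fs_plus` and `plus` are stand-ins written for the example, and checking for a type error before checking for an empty operand is an assumption of the sketch.

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass
class Element:
    name: str
    typed_value: list      # the atomic values obtained by validation

class TypeErr(Exception):
    """Raised when an operand is not zero or one atomic value."""

def fn_data(items):
    """Atomize: keep atomic values, replace each node by its typed value."""
    atoms = []
    for item in items:
        if isinstance(item, Element):
            atoms.extend(item.typed_value)
        else:
            atoms.append(item)
    return atoms

def fs_plus(left, right):
    if len(left) > 1 or len(right) > 1:
        raise TypeErr("operand is a sequence of more than one atomic value")
    if not left or not right:       # an empty operand gives the empty sequence
        return []
    return [left[0] + right[0]]

def plus(expr1_items, expr2_items):
    """[Expr1 + Expr2]: atomize both operands, then apply fs:plus."""
    return fs_plus(fn_data(expr1_items), fn_data(expr2_items))

ten = [Decimal("10.00")]
no_price = [Element("reserve_price", [])]
one_price = [Element("reserve_price", [Decimal("10.00")])]
three_prices = [Element("reserve_price", [Decimal("10.00")]),
                Element("reserve_price", [Decimal("20.00"), Decimal("25.00")])]

print(plus(no_price, ten))     # []
print(plus(one_price, ten))    # [Decimal('20.00')]
```

Calling `plus(three_prices, ten)` raises `TypeErr`, because atomization yields three decimals where at most one is allowed.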
Node Identity and Element Constructors
An element constructor creates new nodes with new identities, e.g.
  let $name := <name>Red Bicycle</name>
  return <article>{$name, $name}</article>
Store: a mapping from node identifiers to node values
Item ::= NodeId | AtomicValue
dynEnv store(NodeId ⇒ NodeValue)
Node Identity and Element Constructors (cont’d)
<article>
  <name>Red Bicycle</name>
  <start_date>1999-01-05</start_date>
  <end_date>1999-02-20</end_date>
  <reserve_price>40</reserve_price>
</article>

Store(
  N1 ⇒ element article of type Article {N2, N3, N4, N5},
  N2 ⇒ element name of type xs:string {"Red Bicycle"},
  N3 ⇒ element start_date of type xs:date {"1999-01-05"},
  N4 ⇒ element end_date of type xs:date {"1999-02-20"},
  N5 ⇒ element reserve_price of type xs:decimal {"40"} )
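A store can be sketched as a dictionary from node identifiers to node values. The Python sketch below is illustrative: `Store`, `NodeId`, `construct` and `copy` are hypothetical names, type annotations are omitted, and copying content nodes on construction is assumed as the way new identities arise.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeId:
    number: int     # N1, N2, ... in the notation of the slides

class Store:
    """Maps node identifiers to node values (element name, content)."""

    def __init__(self):
        self.nodes = {}
        self._numbers = itertools.count(1)

    def construct(self, name, content):
        """Element constructor: content nodes are copied, atomic values kept."""
        copied = [self.copy(item) if isinstance(item, NodeId) else item
                  for item in content]
        node_id = NodeId(next(self._numbers))
        self.nodes[node_id] = (name, copied)
        return node_id

    def copy(self, node_id):
        """Deep-copy a node, giving the copy a fresh identity."""
        name, content = self.nodes[node_id]
        return self.construct(name, content)

store = Store()
# let $name := <name>Red Bicycle</name> return <article>{$name, $name}</article>
name = store.construct("name", ["Red Bicycle"])
article = store.construct("article", [name, name])
first, second = store.nodes[article][1]

print(first == second)            # False: two copies, two identities
print(first == name)              # False: neither copy is the original node
print(store.nodes[first])         # ('name', ['Red Bicycle'])
```

The two children of the article have equal values but different identifiers, which is exactly the distinction that the store makes expressible.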
Node Identity and Element Constructors (cont’d)
Evaluation affects the store:
  Store0 ; dynEnv ⊢ Expr ⇒ Value ; Store1
Each new store computed by one judgment is passed as input to the next judgment
Most rules treat the store implicitly. Written out with explicit stores, the rule for let becomes:

  Store0 ; dynEnv ⊢ Expr0 ⇒ Value0 ; Store1
  Store1 ; dynEnv + varValue(Var ⇒ Value0) ⊢ Expr1 ⇒ Value1 ; Store2
  ---------------------------------------------------------------------
  Store0 ; dynEnv ⊢ let $Var := Expr0 return Expr1 ⇒ Value1 ; Store2
Node Identity and Element Constructors (cont’d)
Validation
The steps, in order:
• Evaluate the expression to yield a value
• Erase all type information in the value
• Construct the untyped element node
• Validate the node to yield the final typed value
[Figure: a schema and an untyped document go through validation; in the result, all nodes have an associated type annotation]
Node Identity and Element Constructors (cont’d)
Static semantics and element construction
The static type system performs a conservative analysis that catches errors early, during static analysis rather than dynamic evaluation, e.g. if an element is declared to have type xs:integer, then the type of its content must be xs:integer