xsugar: Dual Syntax for XML Languages
Tōkyō Daigaku [ July 15, 2005 ]
Claus Brabrand, Anders Møller, Michael Schwartzbach
{brabrand,amoeller,mis}@brics.dk
BRICS, Department of Computer Science
University of Aarhus, Denmark
// Outline (3 parts)
1. Introduction (xsugar)
   Introduction
   xsugar
   Syntax and Semantics
   Unifying Syntax Tree
2. Static Analyses
   Validation Analysis: DTDs & Summary Graphs, Schema Languages
   Reversibility Analysis: Info Preservation, Unambiguity
3. Assessment
   Teleportation
   More Examples
   Related & Future Work
   Assessment
   Conclusion
// Part 1: Introduction
// Motivation
Relax NG: two syntaxes, RNG and RNC
RNC-to-RNG: Python script (1,478 lines)
RNG-to-RNC: XSLT stylesheet (894 lines)
Dynamic issues:
• correspondence ?
• maintenance ?
• reversibility ?
• validity (XML) ?
• termination ?
// Motivation (cont’d)
XQuery: two syntaxes, XQuery and XQueryX
XQuery-to-XQueryX: Non-existent...!
XQueryX-to-XQuery: XSLT stylesheet (845 lines)
Dynamic issues:
• correspondence ?
• maintenance ?
• reversibility ?
• validity (XML) ?
• termination ?
// xsugar
One stylesheet, $s : L \leftrightarrow X$, produces:
L2X: Transformation: $L \to X$
X2L: Reverse transformation: $X \to L$
Static guarantees:
• correspondence !
• maintenance !
• reversibility !
• validity (XML) !
• termination !
// Example: Transformation…
$s : L \leftrightarrow X$
  Name = { ... }
  Email = { ... }
  Id = { [0-9]+ }

  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[Id id]>
                <name><[Name n]></name>
                <email><[Email e]></email>
              </student> }

l :
  Claus Brabrand ([email protected]) 19920539
x :
  <student id="19920539">
    <name>Claus Brabrand</name>
    <email>[email protected]</email>
  </student>

Transformation: l is parsed into [Name n] [Email e] [Id id], transformed, and unparsed as x.
// …and Reverse Transformation
$s : L \leftrightarrow X$
  Name = { ... }
  Email = { ... }
  Id = { [0-9]+ }

  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[Id id]>
                <name><[Name n]></name>
                <email><[Email e]></email>
              </student> }

x :
  <student id="19920539">
    <name>Claus Brabrand</name>
    <email>[email protected]</email>
  </student>
l :
  Claus Brabrand ([email protected]) 19920539

Reverse transformation: x is parsed into [Id id] [Name n] [Email e], transformed, and unparsed as l.
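The two directions of the student example can be imitated by hand to see what the single stylesheet replaces. The following is only a minimal sketch, not xsugar itself: the function names `l2x` and `x2l` are hypothetical, the `Name` and `Email` patterns are borrowed from the later Syntactic Constituents slides, single spaces between tokens are assumed, and the e-mail address is a made-up placeholder.

```python
import re
from xml.etree import ElementTree as ET

# Hypothetical hand-written counterpart of the `student` production.
# Name = [^(\n]+, Email = [^,) ]+, Id = [0-9]+ (patterns assumed from later slides).
STUDENT_LINE = re.compile(r"(?P<n>[^(\n]+) \((?P<e>[^,) ]+)\) (?P<id>[0-9]+)\n")


def l2x(line: str) -> str:
    """L -> X: parse one student line and unparse it as XML."""
    m = STUDENT_LINE.fullmatch(line)
    if m is None:
        raise ValueError("not a student line")
    student = ET.Element("student", id=m["id"])
    ET.SubElement(student, "name").text = m["n"]
    ET.SubElement(student, "email").text = m["e"]
    return ET.tostring(student, encoding="unicode")


def x2l(xml: str) -> str:
    """X -> L: parse the XML form and unparse it as a student line."""
    student = ET.fromstring(xml)
    name = student.findtext("name")
    email = student.findtext("email")
    return f"{name} ({email}) {student.get('id')}\n"


line = "Claus Brabrand (cb@example.org) 19920539\n"
assert x2l(l2x(line)) == line  # round trip on this input
```

Here the two functions are written separately and nothing checks that they stay in sync, which is exactly the maintenance problem the slides attribute to the Relax NG and XQuery converters.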
// Unifying Grammar
$G = \langle N, \Sigma, s, U, \pi \rangle$
  $N$: finite set of unifying nonterminals
  $\Sigma$: finite alphabet of terminals
  $s \in N$: start unifying nonterminal
  $U$: finite set of unification names
  $\pi : N \to \mathcal{P}(E^* \times E^*)$: unifying production function, where $E = \Sigma \cup (N \times U)$

Unification: 2 right-hand sides
  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[Id id]>
                <name><[Name n]></name>
                <email><[Email e]></email>
              </student> }
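One way to picture the unifying grammar is as a plain data structure. The sketch below is an illustration under assumptions: the class names `Term`, `Ref`, `Production` and `UnifyingGrammar` are hypothetical, and the XML right-hand side is flattened into text fragments and gaps, which ignores the element structure that xsugar actually tracks.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Term:
    """A terminal, i.e. an element of the alphabet."""
    text: str


@dataclass(frozen=True)
class Ref:
    """A nonterminal paired with a unification name, e.g. [Name n]."""
    nonterminal: str
    name: str


Item = Union[Term, Ref]  # E: a terminal or a (nonterminal, name) pair


@dataclass(frozen=True)
class Production:
    """One unifying production: a pair of right-hand sides."""
    left: Tuple[Item, ...]   # the L side
    right: Tuple[Item, ...]  # the X side (flattened here for simplicity)


@dataclass
class UnifyingGrammar:
    start: str
    productions: Dict[str, List[Production]]  # plays the role of the production function


student = Production(
    left=(Ref("Name", "n"), Term("("), Ref("Email", "e"), Term(")"),
          Ref("Id", "id"), Term("\n")),
    right=(Term("<student id="), Ref("Id", "id"), Term(">"),
           Term("<name>"), Ref("Name", "n"), Term("</name>"),
           Term("<email>"), Ref("Email", "e"), Term("</email>"),
           Term("</student>")),
)
grammar = UnifyingGrammar(start="student", productions={"student": [student]})


def names(items: Tuple[Item, ...]) -> List[str]:
    """Unification names in order of appearance on one side."""
    return [item.name for item in items if isinstance(item, Ref)]
```

Both sides mention the same names, only in a different order: `names(student.left)` gives n, e, id while `names(student.right)` gives id, n, e.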
// Regular Nonterminal Shorthand
Regular expressions are a convenient shorthand for regular nonterminals (with identity unification):
  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <email><[e]></email>
              </student> }

  Id = { [0-9]+ }

Desugaring of Id:
  id  : [num n] [id i] = { <[num n]> <[id i]> }
      : [num n]        = { <[num n]> }
  num : 0 = { 0 }
      …
      : 9 = { 9 }
// (A)symmetric Unification
Unification is symmetric: one grammar for both L and X
  $\pi : N \to \mathcal{P}(E^* \times E^*)$
However (XML-induced asymmetry): X (XML) contains lots of structure, typically impervious to grammatical structure
Thus, one can “think of G as a grammar for L”
Asymmetry reflected syntactically (unification: 2 right-hand sides):
  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[Id id]>
                <name><[Name n]></name>
                <email><[Email e]></email>
              </student> }
// Syntactic Constituents (L)
L:
  Construction   Name          UST
  foo            token         []
  [N a]          nonterminal   [a ]
  [N]            ignorable     []

  Name = { [^(\n]+ }
  Email = { [^,) ]+ }
  Id = { [0-9]+ }

  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <email><[e]></email>
              </student> }
// Syntactic Constituents (X)
X:
  Construction       Name                      UST
  ‘ ’                (XML) whitespace          []
  foo                text (“PC data”)          []
  <e…>…</e>          element (w/ attributes)   …
  <[a]>              gap                       [a ]
  <…a=[a]…>…</…      attribute gap             [a ]

  Name = { [^(\n]+ }
  Email = { [^,) ]+ }
  Id = { [0-9]+ }

  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <email><[e]></email>
              </student> }
// ASTL and ASTX
  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <email><[e]></email>
              </student> }

[Figure: ASTL (ordered tree) and ASTX (partially ordered) for the student example. Both have the root "student, prod: #1" with children such as [n] "Claus Brabrand" and [id] "19920539"; in ASTX, [id] is an Attr node.]
// UST (“Unifying Syntax Tree”)
  student : [Name n] ( [Email e] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <email><[e]></email>
              </student> }

[Figure: UST(L/X) (unordered tree) for the student example: root "student, prod: #1" with an unordered set of children such as [n] "Claus Brabrand" and [id] "19920539".]
// “The Big Picture”
[Figure: L <-> ASTL / ~L <-> UST <-> ASTX / ~XML <-> X. The outer steps are un-/parsing, the inner steps are transformation. Legend: ASTL is an ordered tree, UST an unordered tree, ASTX partially ordered; canonical l in L, canonical x in X.]
Reversible ? (i.e. is the whole chain 1-1 ?)
Parsing / Unparsing (i.e. 1-1 modulo ~L and ~XML ?): Grammar Ambiguity ?
Transformation (i.e. 1-1 ?): Information Preservation ?
// Part 2: Static Analyses
// Motivating Example (Ex. cont’d)
  Name = { [^(\n]+ }
  Email = { [^,) ]+ }
  Id = { [0-9]+ }

  student : [Name n] ( [emails es] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <emails><[es]></emails>
              </student> }

  emails : [Email e]               = { <email><[e]></email> }
         : [Email e] , [emails es] = { <email><[e]></email> <[es]> }

l :
  Anders Moeller ([email protected],[email protected]) 19940392
x :
  <student id="19940392">
    <name>Anders Moeller</name>
    <emails>
      <email>[email protected]</email>
      <email>[email protected]</email>
    </emails>
  </student>
// Motivating Example (cont’d2)
  Name = { [^(\n]+ }
  Email = { [^,) ]+ }
  Id = { [0-9]+ }

  student : [Name n] [opt_emails e] [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <[e]>
              </student> }

  opt_emails :                             = {}
             : ( [email e] )               = { <[e]> }
             : ( [email e] , [emails es] ) = { <emails><[e]><[es]></emails> }

  emails : [email e]               = { <[e]> }
         : [email e] , [emails es] = { <[e]><[es]> }

  email : [Email e] = { <email><[e]></email> }

  // -- sequence of students -----
  start : [students ss] = { <students> <[ss]> </students> }

  students : [student s]               = { <[s]> }
           : [student s] [students ss] = { <[s]> <[ss]> }
// Example (cont’d)
l :
  Claus Brabrand ([email protected]) 19920539
  Anders Moeller ([email protected],[email protected]) 19940392
  Michael Schwartzbach 18791398
x :
  <students>
    <student id="19920539">
      <name>Claus Brabrand</name>
      <email>[email protected]</email>
    </student>
    <student id="19940392">
      <name>Anders Moeller</name>
      <emails>
        <email>[email protected]</email>
        <email>[email protected]</email>
      </emails>
    </student>
    <student id="8">
      <name>Michael Schwartzbach 1879139</name>
    </student>
  </students>
Ambiguous grammar !
// Reversibility: Un-/Parsing
[Figure: L <-> ASTL / ~L, shown once for unparsing and once for parsing.]
Unparsing: pick a representative for each [N]
  induces: ~L (equiv. rel.), “only different wrt. [N]”
Parsing: Grammar ambiguity ?
  Undecidable! However…
// Approximating CFG Ambiguity
Undecidable ?:
  [Figure: a black box takes G and answers "unambiguous" or "ambiguous".]
However…!:
  [Figure: a safe (over-)approximation takes G and answers "Yes!" (unambiguous) or "No?" (possibly ambiguous), with grammar-level error messages.]
Safe (over-)approximation:
  G unambiguous, if:
    G horizontally unambiguous
    G vertically unambiguous
// Horizontal Ambiguity
Horizontal ambiguity (L):
  n : [Id i1] [Id i2]
    = { <[i1]> : <[i2]> }

  L          X
  abxy  <=   ab:xy
  abxy  =>   abx:y

Horizontal ambiguity (X):
  m : [Id i1] “:” [Id i2]
    = { <e> <[i1]> <[i2]> </e> }

  L          X
  ab:xy  =>  abxy
  abx:y  <=  abxy
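The horizontal example can be reproduced by brute force: enumerate every way of cutting a string into two adjacent parts that both match. This is only a witness search on one concrete string, not the conservative static analysis the slides refer to. The helper name `splits` is hypothetical, and `[a-z]+` is an assumed stand-in for `Id`, chosen because the slide's sample strings use letters.

```python
import re

ID = r"[a-z]+"  # assumed stand-in for Id, matching the letters used on the slide


def splits(left: str, right: str, text: str, sep: str = "") -> list:
    """All ways to read `text` as <left><sep><right>."""
    found = []
    for i in range(len(text) + 1):
        head, tail = text[:i], text[i:]
        if not tail.startswith(sep):
            continue
        tail = tail[len(sep):]
        if re.fullmatch(left, head) and re.fullmatch(right, tail):
            found.append((head, tail))
    return found


# Production n on the L side: [Id i1] [Id i2] with nothing in between.
print(splits(ID, ID, "abxy"))            # several readings: horizontally ambiguous
# With a ":" separator, as on the X side of n, the reading is unique.
print(splits(ID, ID, "ab:xy", sep=":"))
```

More than one split for the same string means the string does not determine where the first `Id` ends, which is why `ab:xy` can come back as `abx:y`.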
// Vertical Ambiguity
Vertical ambiguity (L):
  n : x [Id i]  = { !<[i]> }
    : xx [Id i] = { !!<[i]> }

  L         X
  xxy  <=   !xy
  xxy  =>   !!y

Vertical ambiguity (X):
  m : ! [Id i]  = { x<[i]> }
    : !! [Id i] = { xx<[i]> }

  L         X
  !xy  =>   xxy
  !!y  <=   xxy
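Vertical ambiguity can likewise be demonstrated by searching for a string that two alternatives of the same nonterminal both accept. The sketch below enumerates short strings over a small alphabet; it is a bounded witness search under assumptions (`[a-z]+` standing in for `Id`, the hypothetical helper name `vertical_witnesses`), not the approximation used by xsugar.

```python
import re
from itertools import product


def vertical_witnesses(alternatives, alphabet, max_len):
    """Strings up to max_len that are matched by more than one alternative."""
    witnesses = []
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            text = "".join(chars)
            hits = [alt for alt in alternatives if re.fullmatch(alt, text)]
            if len(hits) > 1:
                witnesses.append(text)
    return witnesses


# L side of production n: "x [Id i]" versus "xx [Id i]", with Id assumed to be [a-z]+.
n_alternatives = [r"x[a-z]+", r"xx[a-z]+"]
print(vertical_witnesses(n_alternatives, "xy", 3))
```

The string `xxy` from the slide shows up among the witnesses: it can be read as `x` followed by `xy` or as `xx` followed by `y`.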
// Information Preservation
“Never throw away or duplicate information”:
i.e. all named arguments must be used exactly once!
[Figure: L <-> ASTL / ~L <-> UST <-> ... X, with the transformation steps highlighted.]
// Information Preservation
Information preservation (L):
  n : foo ( [Id i] ) = { <e/> }

  L               X
  foo (abc)  =>   <e/>
  foo (???)  <=   <e/>

Information preservation (X):
  m : bar = { <f><[Id i]></f> }

  L         X
  bar  <=   <f>abc</f>
  bar  =>   <f>???</f>
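The "used exactly once" rule lends itself to a simple per-production check. The following sketch assumes each production has already been reduced to the list of unification names on its L side and on its X side; the function name and the message format are hypothetical.

```python
from collections import Counter


def information_preservation_errors(left_names, right_names):
    """Report names that are not used exactly once on each side."""
    left, right = Counter(left_names), Counter(right_names)
    errors = []
    for name in sorted(set(left) | set(right)):
        if left[name] != 1 or right[name] != 1:
            errors.append(f"{name}: used {left[name]}x in L, {right[name]}x in X")
    return errors


# n : foo ( [Id i] ) = { <e/> }  -- i is thrown away on the X side
print(information_preservation_errors(["i"], []))
# m : bar = { <f><[Id i]></f> }  -- i appears out of nowhere on the X side
print(information_preservation_errors([], ["i"]))
# student : [Name n] ( [Email e] ) [Id id] = { ... [id] ... [n] ... [e] ... }
print(information_preservation_errors(["n", "e", "id"], ["id", "n", "e"]))
```

The first two calls flag `i`, matching the `???` in the slide examples; the student production yields no errors.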
// Reversible Stylesheets!
[Figure: L <-> ASTL / ~L <-> UST <-> ASTX / ~XML <-> X. Un-/parsing: 1-1 ! Transformation: 1-1 ! Hence xsugar: 1-1 !]
Reversibility (proof): Assume: …
// Validation Analysis
Given DTD, D:
  $\forall l \in L : x(l) \in L(D)$
  $L(X) \subseteq L(D)$
  $SG(X) \subseteq L(D)$ (black-box)
“Static Validation of Dynamically Generated HTML” [ Claus Brabrand | Anders Møller | Michael Schwartzbach ], PASTE, 2001
// Summary Graphs
  Id = { [0-9]+ }
  Name = { [^(\n]+ }
  Email = { [^,) ]+ }

  student : [Name n] ( [emails es] ) [Id id] \n
          = { <student id=[id]>
                <name><[n]></name>
                <emails><[es]></emails>
              </student> }

  emails : [Email e]               = { <email><[e]></email> }
         : [Email e] , [emails es] = { <email><[e]></email> <[es]> }

[Figure: summary graph SG(X) for this grammar. Template nodes: "<student id=[ ]> <name><[ ]></name> <emails><[ ]></emails> </student>", "<email><[ ]></email>", and "<email><[ ]></email> <[ ]>"; string nodes: [0-9]+, [^(\n]+, [^,) ]+.]
$SG(X) \subseteq L(D)$ (black-box)
“Static Validation of Dynamically Generated HTML” [ Claus Brabrand | Anders Møller | Michael Schwartzbach ], PASTE, 2001
// Part 3: Assessment
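// Teleportation (non-local transformation)
  Int = { [0-9]+ }
  Str = { "\"" [^\"]+ "\"" }

  start : [list l] = { <[s]> <[l]> }

  list : [Int n] [list l] = { <[n]> <[l]> }
       : [Str s]          = {}

[Figure: two syntax trees, each with root "start" over nested "l" nodes holding 42, 87 and abc. On one side abc sits at the innermost l node; on the other side it has been teleported up to start. The mapping between the two trees is 1-1.]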
// FW: Teleportation (cont’d)
  Int = { [0-9]+ }
  Str = { "\"" [^\"]+ "\"" }

  start : [list l] = { <[s]> <[l]> }

  list : [Int n] [list l] = { <[n]> <[l]> }
       : [Str s]          = {}

Reversibility?:
[Figure: L <-> ASTL / ~L <-> UST <-> ASTX / ~XML <-> X, with an extra teleportation step at the UST. Un-/parsing: 1-1/~L ! and 1-1/~XML ! Transformation: 1-1 ! Teleportation: 1-1 !]
// Related Work
“XSLT” (aka. XSL Stylesheets):
  However, only one direction
  [Figure: RNG and RNC connected by two separate programs, P1 and P2.]
“Presenting XML”:
  “Java web application framework for presenting HTML, PDF, WML etc., in a device independent manner”.
  “It aims to achieve a complete separation of content and presentation”.
// Assessment
For “Relax NG”:
[Bar chart: lines of code (axis 0 to 1600) for XSLT, Python, and xsugar; series: Transform, Parse, Total.]
Conciseness: [ 1 / 12+ ]
Static guarantees:
• correspondence ? vs !
• maintenance ? vs !
• reversibility ? vs !
• validity (XML) ? vs !
• termination ? vs !
// Conclusion
xsugar: Reversible Stylesheets
One stylesheet, $s : L \leftrightarrow X$:
L2X: Stylesheet: $L \to X$
X2L: Reverse stylesheet: $X \to L$
Static guarantees:
• correspondence !
• maintenance !
• reversibility !
• validity (XML) !
• termination !
</presentation>
Questions please…
// Schema Languages
The xsugar Schema Language®:
  $x \in L(X)$ ?
Generalization (from Regexps to CFGs):
  Full CFG structure for PCDATA
  Full CFG structure for attribute values
// More Examples: “Nice.xsg”
  n : a [xs x_s] [ys y_s] b = { <Z><[y_s]><[x_s]></Z> }

  xs :            = {}
     : x [xs x_s] = { <X></X><[x_s]> }   // X: sequence

  ys :            = {}
     : y [ys y_s] = { <Y><[y_s]></Y> }   // Y: nested

l :
  a xx yyy b
x :
  <Z>
    <Y> <Y> <Y/> </Y> </Y>
    <X/> <X/>
  </Z>
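To make the sequence-versus-nesting contrast in Nice.xsg concrete, the two directions can be written out by hand. This is a rough sketch rather than xsugar output: the function names are hypothetical, single spaces between the tokens of the L syntax are assumed, and the XML is emitted without whitespace.

```python
import re
from xml.etree import ElementTree as ET


def l2x(text: str) -> str:
    """L -> X: x's become a sequence of <X>, y's become nested <Y>."""
    m = re.fullmatch(r"a (x*) (y*) b", text)
    if m is None:
        raise ValueError("not in L")
    xs, ys = len(m.group(1)), len(m.group(2))
    nested_y = "<Y>" * ys + "</Y>" * ys
    sequence_x = "<X></X>" * xs
    return f"<Z>{nested_y}{sequence_x}</Z>"


def x2l(xml: str) -> str:
    """X -> L: count the <X> siblings and the depth of <Y> nesting."""
    z = ET.fromstring(xml)
    children = list(z)
    xs = sum(1 for child in children if child.tag == "X")
    depth = 0
    node = children[0] if children and children[0].tag == "Y" else None
    while node is not None:
        depth += 1
        inner = list(node)
        node = inner[0] if inner else None
    return f"a {'x' * xs} {'y' * depth} b"


print(l2x("a xx yyy b"))
```

The swap of `y_s` and `x_s` in the template is why the nested `Y` elements come first in the XML even though the x's come first in the L syntax.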