carrot: an appetizing hybrid of xquery and xslt
DESCRIPTION
Balisage paper link: http://www.balisage.net/Proceedings/vol7/html/Lenz01/BalisageVol7-Lenz01.htmlTRANSCRIPT
CarrotAn appetizing hybrid of XQuery and XSLT
Evan LenzSoftware Developer, Community
MarkLogic Corporation
On reinventing the wheel
XSLT-UK 2001
“XQuery: Reinventing the wheel?” http://xmlportfolio.com/xquery.html
Disclaimer My own personal project & opinions
Praises for XSLT Template rules are really elegant and powerful
It’s mature in its set of features
Powerful modularization features (<xsl:import>)
Praises for XQuery Concise syntax
Highly composable syntax An element constructor is an expression
So you can write things like: foo/<bar/>
Gripes about XSLT Two layers of syntax, which can’t be freely composed
You can’t nest an instruction inside an expression E.g., you can’t apply templates inside an XPath
expression
Verbose syntax In general In particular, for function definitions and parameter-
passing
Gripes about XQuery Conflation of modules and namespaces
Don’t like being forced to use namespace URIs
Distinction between main and library modules You can’t reuse a main module Reuse requires refactoring
Lack of a distinction between: the default namespace for XPath, and the default namespace for element constructors
No template rules!
A lot in common The same data model (XPath 2.0)
Much of the same syntax (XPath 2.0)
Feeling boxed-in XSLT’s lack of composability
XQuery’s lack of template rules
Don’t like having to pick between two languages all the time
Solution: add a third one. ;-)
And I’m calling it…
YAASFXSLT “Yet Another Alternative Syntax For XSLT”
Sam Wilmott’s RXSLT http://www.wilmott.ca/rxslt/rxslt.html
Paul Tchistopolskii’s XSLScript http://markmail.org/message/niumiluelzho6bmt
XSLTXT http://savannah.nongnu.org/projects/xsltxt
Actually: “Carrot” More than just an alternative syntax
Carrot combines: the friendly syntax and composability of XQuery
expressions the power and flexibility of template rules in XSLT
A “host language” for XQuery expressions
Motivation & inspiration My boxed-in feelings
OH: “I will never write code in XML.”
James Clark’s element constructor proposal back in 2001 http://www.jclark.com/xml/construct.html
Haskell
Overall design approach 95% of semantics defined by reference to XQuery
and XSLT
90% of syntax defined by reference to XQuery
Haskell similarities Haskell defines functions using equations:
foo = "bar"
Carrot defines variables, functions, and rules similarly: $foo := "bar"; my:foo() := "bar"; ^foo(*) := "bar";
Haskell similarities Haskell defines functions using equations:
foo = "bar"
Carrot defines variables, functions, and rules similarly: $foo := "bar"; my:foo() := "bar"; ^foo(*) := "bar";
Everything on the RHS is an XQuery expression (plus a few extensions)
Intro by example
Intro by example A rule definition in XSLT:
<xsl:template match="para"> <p> <xsl:apply-templates/> </p></xsl:template>
Intro by example A rule definition in XSLT:
<xsl:template match="para"> <p> <xsl:apply-templates/> </p></xsl:template>
A rule definition in Carrot:^(para) := <p>{^()}</p>;
Intro by example A rule definition in XSLT:
<xsl:template match="para"> <p> <xsl:apply-templates/> </p></xsl:template>
A rule definition in Carrot:^(para) := <p>{^()}</p>;
Intro by example A rule definition in XSLT:
<xsl:template match="para"> <p> <xsl:apply-templates/> </p></xsl:template>
A rule definition in Carrot:^(para) := <p>{^()}</p>;
Intro by example A rule definition in XSLT:
<xsl:template match="para"> <p> <xsl:apply-templates/> </p></xsl:template>
A rule definition in Carrot:^(para) := <p>{^()}</p>;
Intro by example This:
^()
Is short for this:^(node())
Just as, in XSLT, this:<xsl:apply-templates/>
Is short for this:<xsl:apply-templates select="node()"/>
Intro by example Another rule definition in Carrot:
^toc(section) := <li>{ ^toc() }</li>;
The same rule definition in XSLT:<xsl:template match="section" mode="toc"> <li> <xsl:apply-templates mode="toc"/> </li></xsl:template>
Intro by example Another rule definition in Carrot:
^toc(section) := <li>{ ^toc() }</li>;
The same rule definition in XSLT:<xsl:template match="section" mode="toc"> <li> <xsl:apply-templates mode="toc"/> </li></xsl:template>
Intro by example Another rule definition in Carrot:
^toc(section) := <li>{ ^toc() }</li>;
The same rule definition in XSLT:<xsl:template match="section" mode="toc"> <li> <xsl:apply-templates mode="toc"/> </li></xsl:template>
Intro by example Another rule definition in Carrot:
^toc(section) := <li>{ ^toc() }</li>;
The same rule definition in XSLT:<xsl:template match="section" mode="toc"> <li> <xsl:apply-templates mode="toc"/> </li></xsl:template>
Intro by example Another rule definition in Carrot:
^toc(section) := <li>{ ^toc() }</li>;
The same rule definition in XSLT:<xsl:template match="section" mode="toc"> <li> <xsl:apply-templates mode="toc"/> </li></xsl:template>
The identity transform In Carrot:
^(@*|node()) := copy{ ^(@*|node()) };
In XSLT:<xsl:template match="@* | node()"> <xsl:copy> <xsl:apply-templates select="@* | node()"/> </xsl:copy></xsl:template>
The identity transform In Carrot:
^(@*|node()) := copy{ ^(@*|node()) };
In XSLT:<xsl:template match="@* | node()"> <xsl:copy> <xsl:apply-templates select="@* | node()"/> </xsl:copy></xsl:template>
The identity transform In Carrot:
^(@*|node()) := copy{ ^(@*|node()) };
In XSLT:<xsl:template match="@* | node()"> <xsl:copy> <xsl:apply-templates select="@* | node()"/> </xsl:copy></xsl:template>
The identity transform In Carrot:
^(@*|node()) := copy{ ^(@*|node()) };
In XSLT:<xsl:template match="@* | node()"> <xsl:copy> <xsl:apply-templates select="@* | node()"/> </xsl:copy></xsl:template>
Note the asymmetry This definition is illegal (missing pattern):
^() := <foo/>;
Just as this template rule is illegal:<xsl:template
match=""><foo/></xsl:template>
However, when invoking, you can omit the argument:
^()
Just as in XSLT:<xsl:apply-templates/>
An XSLT example<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <html> <head> <xsl:copy-of select="/doc/title"/> </head> <body> <xsl:apply-templates select="/doc/para"/> </body> </html> </xsl:template>
<xsl:template match="para"> <p> <xsl:apply-templates/> </p> </xsl:template>
</xsl:stylesheet>
An XSLT example<xsl:transform version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:template match="/"> <html> <head> <xsl:copy-of select="/doc/title"/> </head> <body> <xsl:apply-templates select="/doc/para"/> </body> </html> </xsl:template>
<xsl:template match="para"> <p> <xsl:apply-templates/> </p> </xsl:template>
</xsl:stylesheet>
The equivalent in Carrot^(/) := <html> <head>{ /doc/title }</head> <body>{ ^(/doc/para) }</body> </html>;
^(para) := <p>{ ^() }</p>;
The equivalent in Carrot^(/) := <html> <head>{ /doc/title }</head> <body>{ ^(/doc/para) }</body> </html>;
^(para) := <p>{ ^() }</p>;
Carrot expressions
Carrot expressions Same as an expression in XQuery, with these
additions:1. ruleset invocations — ^mode(nodes)
2. shallow copy{…} constructors
3. text node literals — `my text node`
1. Ruleset invocations In XSLT, each mode can be thought of as the name of
a polymorphic function
The syntax of Carrot makes this explicit
1. Ruleset invocations “ruleset” == “mode”
In Carrot:^myMode(myExpr)
In XSLT:<xsl:apply-templates mode="myMode"
select="myExpr"/>
1. Ruleset invocations “ruleset” == “mode”
In Carrot:^myMode(myExpr)
In XSLT:<xsl:apply-templates mode="myMode"
select="myExpr"/>
1. Ruleset invocations “ruleset” == “mode”
In Carrot:^myMode(myExpr)
In XSLT:<xsl:apply-templates mode="myMode"
select="myExpr"/>
2. Shallow copy constructors
In Carrot: copy{…}
In XSLT:<xsl:copy>…</xsl:copy>
2. Shallow copy constructors
In Carrot: copy{…}
In XSLT:<xsl:copy>…</xsl:copy>
Why extend XQuery here? The lack of shallow copy constructors in XQuery makes
modified identity transforms impractical (specifically, for preserving namespace nodes)
3. Text node literals In Carrot:
`my text node`
3. Text node literals In Carrot:
`my text node`
In XSLT (a literal text node):my text node
3. Text node literals In Carrot:
`my text node`
In XSLT (a literal text node):my text node
In XQuery (dynamic text node constructor):text{ "my text node" }
3. Text node literals In Carrot:
`my text node`
In XSLT (a literal text node):my text node
In XQuery (dynamic text node constructor):text{ "my text node" }
Why aren’t dynamic text constructors sufficient? After all, they take only six more characters (text{…})
3. Text node literals Consider an example:
<xsl:template match="/doc"> <result> <xsl:apply-templates mode="file-name" select="."/> <xsl:apply-templates mode="file-ext" select="."/> </result></xsl:template>
<xsl:template mode="file-name" match="doc">doc</xsl:template>
<xsl:template mode="file-ext" match="doc">.xml</xsl:template>
3. Text node literals Consider an example:
<xsl:template match="/doc"> <result> <xsl:apply-templates mode="file-name" select="."/> <xsl:apply-templates mode="file-ext" select="."/> </result></xsl:template>
<xsl:template mode="file-name" match="doc">doc</xsl:template>
<xsl:template mode="file-ext" match="doc">.xml</xsl:template>
3. Text node literals Consider an example:
<xsl:template match="/doc"> <result> <xsl:apply-templates mode="file-name" select="."/> <xsl:apply-templates mode="file-ext" select="."/> </result></xsl:template>
<xsl:template mode="file-name" match="doc">doc</xsl:template>
<xsl:template mode="file-ext" match="doc">.xml</xsl:template>
3. Text node literals Consider an example:
<xsl:template match="/doc"> <result> <xsl:apply-templates mode="file-name" select="."/> <xsl:apply-templates mode="file-ext" select="."/> </result></xsl:template>
<xsl:template mode="file-name" match="doc">doc</xsl:template>
<xsl:template mode="file-ext" match="doc">.xml</xsl:template>
And the result:
<result>doc.xml</result>
3. Text node literals Same example, rewritten “naturally” in Carrot:
^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;
^file-name(doc) := "doc";^file-ext (doc) := ".xml";
3. Text node literals Same example, rewritten “naturally” in Carrot:
^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;
^file-name(doc) := "doc";^file-ext (doc) := ".xml";
3. Text node literals Same example, rewritten “naturally” in Carrot:
^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;
^file-name(doc) := "doc";^file-ext (doc) := ".xml";
3. Text node literals Same example, rewritten “naturally” in Carrot:
^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;
^file-name(doc) := "doc";^file-ext (doc) := ".xml";
And the result (uh-oh):
<result>doc .xml</result>
3. Text node literals Sequences of atomic values have spaces added upon
conversion to a text node
Various fixes, none satisfactory: Wrap the two invocations in text{} or concat()
(high coupling – what if one of them returns an element?)
Wrap the strings in text{} (burdensome fix—regular strings work 90% of the time)
3. Text node literals Note the imbalance:
Returning a text node is more concise in XSLT than in XQuery (!) XSLT: Hello XQuery: text{"Hello"}
Returning a string is more concise in XQuery than in XSLT XQuery: "Hello" XSLT: <xsl:sequence select="'Hello'"/>
Text node literals in Carrot redress the imbalance Text nodes in Carrot: `Hello` Strings in Carrot: "Hello"
3. Text node literals Rewritten properly in Carrot:
^(/doc) := <result>{^file-name(.), ^file-ext (.)}</result>;
^file-name(doc) := `doc`;^file-ext (doc) := `.xml`;
Simple guideline now: Use text node literals when you are constructing part
of a result document Use string literals when you know you want to return a
string
Expression semantics Same as XQuery!
With this exception (remember the earlier gripe): Namespace attribute declarations on element
constructors do not affect the default element namespace for XPath expressions.
Okay, enough about namespaces
Carrot definitions
A Carrot module Consists of a set of unordered definitions
Three kinds of definitions: Global variables Functions Rules
Unlike XQuery, there is no top-level expression—only definitions Carrot is like XSLT in this regard
Global variables In Carrot:
$foo := "a string value";
Global variables In Carrot:
$foo := "a string value";
Equivalent to this XQuery: declare variable $foo := "a string value";
Global variables In Carrot:
$foo := "a string value";
Equivalent to this XQuery: declare variable $foo := "a string value";
Functions In Carrot:
my:foo() := "return value";
Functions In Carrot:
my:foo() := "return value";my:bar($str as xs:string) as xs:string := upper-case($str);
Functions In Carrot:
my:foo() := "return value";my:bar($str as xs:string) as xs:string := upper-case($str);
Equivalent to this XQuery:declare function my:foo() { "return value" };declare function my:bar($str as xs:string) as xs:string { upper-case($str) };
Functions In Carrot:
my:foo() := "return value";my:bar($str as xs:string) as xs:string := upper-case($str);
Equivalent to this XQuery:declare function my:foo() { "return value" };declare function my:bar($str as xs:string) as xs:string { upper-case($str) };
Rule definitions In Carrot:
^foo(*) := "return value";
Equivalent to this XSLT: <xsl:template match="*" mode="foo">
<xsl:sequence select="'return value'"/></xsl:template>
Rule definitions In Carrot:
^foo(*) := "return value";
Equivalent to this XSLT: <xsl:template match="*" mode="foo">
<xsl:sequence select="'return value'"/></xsl:template>
Rule definitions In Carrot:
^foo(*) := "return value";
Equivalent to this XSLT: <xsl:template match="*" mode="foo">
<xsl:sequence select="'return value'"/></xsl:template>
Rule definitions In Carrot:
^foo(*) := "return value";
Equivalent to this XSLT: <xsl:template match="*" mode="foo">
<xsl:sequence select="'return value'"/></xsl:template>
Rule parameters In Carrot:
^foo(* ; $str as xs:string) := concat($str, .);
Rule parameters In Carrot:
^foo(* ; $str as xs:string) := concat($str, .);
Equivalent to this XSLT:<xsl:template match="*" mode="foo"> <xsl:param name="str" as="xs:string"/> <xsl:sequence select="concat($str, .)"/></xsl:template>
Rule parameters In Carrot:
^foo(* ; $str as xs:string) := concat($str, .);
Equivalent to this XSLT:<xsl:template match="*" mode="foo"> <xsl:param name="str" as="xs:string"/> <xsl:sequence select="concat($str, .)"/></xsl:template>
Rule parameters In Carrot:
^foo(* ; tunnel $str as xs:string) := concat($str, .);
Equivalent to this XSLT:<xsl:template match="*" mode="foo"> <xsl:param name="str" as="xs:string" tunnel="yes"/> <xsl:sequence select="concat($str, .)"/></xsl:template>
Conflict resolution Same as XSLT
Import precedence first Then priority
Explicit priorities In Carrot:
^author-listing( author[1] ) 1 := ^();^author-listing( author ) := `, ` , ^();^author-listing( author[last()] ) := ` and ` , ^();
Explicit priorities In Carrot:
^author-listing( author[1] ) 1 := ^();^author-listing( author ) := `, ` , ^();^author-listing( author[last()] ) := ` and ` , ^();
Equivalent to this XSLT:<xsl:template mode="author-listing" match="author[1]" priority="1">
<xsl:apply-templates/>
</xsl:template>
<xsl:template mode="author-listing" match="author">
<xsl:text>, </xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template mode="author-listing" match="author[last()]">
<xsl:text> and </xsl:text>
<xsl:apply-templates/>
</xsl:template>
Explicit priorities In Carrot:
^author-listing( author[1] ) 1 := ^();^author-listing( author ) := `, ` , ^();^author-listing( author[last()] ) := ` and ` , ^();
Equivalent to this XSLT:<xsl:template mode="author-listing" match="author[1]" priority="1">
<xsl:apply-templates/>
</xsl:template>
<xsl:template mode="author-listing" match="author">
<xsl:text>, </xsl:text>
<xsl:apply-templates/>
</xsl:template>
<xsl:template mode="author-listing" match="author[last()]">
<xsl:text> and </xsl:text>
<xsl:apply-templates/>
</xsl:template>
Multiple modes In Carrot:
^foo|bar(*) := `result`;
Equivalent to this XSLT:<xsl:template mode="foo bar" match="*">result</xsl:template>
Multiple modes In Carrot:
^foo|bar(*) := `result`;
Equivalent to this XSLT:<xsl:template mode="foo bar" match="*">result</xsl:template>
Why I like composability
A couple of examples
Pipelines are easier A typical pipeline approach in XSLT:
<xsl:variable name="stage1-result"> <xsl:apply-templates mode="stage1" select="."/></xsl:variable><xsl:variable name="stage2-result"> <xsl:apply-templates mode="stage2" select="$stage1-result"/></xsl:variable><xsl:apply-templates mode="stage3" select="$stage2-result"/>
Pipelines are easier A typical pipeline approach in XSLT:
<xsl:variable name="stage1-result"> <xsl:apply-templates mode="stage1" select="."/></xsl:variable><xsl:variable name="stage2-result"> <xsl:apply-templates mode="stage2" select="$stage1-result"/></xsl:variable><xsl:apply-templates mode="stage3" select="$stage2-result"/>
In Carrot:^stage1(.)/^stage2(.)/^stage3(.)
Fewer features needed XSLT 2.1/3.0 promises to add some convenience
features, like: <xsl:copy select="foo">…</xsl:copy>
With Carrot, just use foo/copy{…}
What about feature X?
Two instruction categories
XSLT instructions that aren’t needed <xsl:for-each> - Use for expressions (or mapping operator!) <xsl:variable> - Use let instead <xsl:sort> - Use order by instead <xsl:choose>, <xsl:if> - Use if…then…else instead
XSLT instructions whose Carrot syntax is TBD <xsl:for-each-group> <xsl:result-document> <xsl:analyze-string> Etc., etc.
XQuery 3.0 may add grouping…
You can always import XSLT 2.0 stylesheets into Carrot
Top-level elements Imports and includes
import navigation.crt;include widgets.crt;
Top-level parametersparam $message as xs:string;
Etc. Still in mock-up stage. See some examples:
http://github.com/evanlenz/carrot/examples
Implementation strategy
Compile to XSLT 2.0 1:1 module mapping
Each Carrot module compiles to an XSLT 2.0 module
Carrot can include and import other Carrot modules or XSLT modules.
Carrot can also import XQuery modules, but since this is not supported directly in XSLT 2.0, the semantics depend on your target XSLT processor, e.g.: <saxon:import-query> in Saxon <xdmp:import-module> in MarkLogic Server
Steps to implementation Generate a parser
1. Create a BNF grammar for Carrota) Hand-convert the EBNF grammar for XQuery expressions
to BNF
b) Extend the resulting BNF to support Carrot definitions and expressions
2. Use yapp-xslt to generate the Carrot parser from the Carrot BNF http://www.o-xml.org/yapp/
Already running into trouble with this approach…
Write a compiler (in XSLT, naturally)
Project goals At this point, just explore the raw ideas (and share
them)
Solicit feedback
Placeholder for now: http://github.com/evanlenz/carrot
Help me cook this carrot You write the parser, I’ll write the compiler?
Future possibilities
XML-oriented browser scripting
XQIB
Saxon-CE
Carrot, or something like it, could make XSLT more palatable to Web developers
W3C activity Provide seeds for ideas in the XSL/XQuery WGs?
Carrot will grow with XPath/XQuery/XSLT
Mode merging Random XSLT feature idea: invoke more than one
mode<xsl:apply-templates mode="foo bar"/>
In Carrot:^foo|bar()
A static mode extension mechanism Handy for multi-stage transformations where each
stage is similar to but not the same as the next
Could be added to Carrot even if not supported directly in XSLT Carrot as a research playground for new language
ideas
Questions?