<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>xslt</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/5</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>UK-based XML/XSLT job</title>
 <link>http://www.jenitennison.com/blog/node/87</link>
 <description>&lt;p&gt;I&amp;#8217;ve been asked if I could advertise the following vacancy. Any interested parties should contact &lt;a href=&quot;mailto:gfuller@peopleworks.co.uk&quot; title=&quot;Email Graham Fuller&quot;&gt;Graham Fuller&lt;/a&gt; from &lt;a href=&quot;http://www.peopleworks.co.uk&quot; title=&quot;Peopleworks&quot;&gt;Peopleworks&lt;/a&gt; (but say you saw it here; I&amp;#8217;ll get a reward!).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Developer * XSLT * XML * Schemas * JavaScript * XHTML * CSS.&lt;/p&gt;
  
  &lt;p&gt;Global retail organisation and household name is looking for 2 (two) Front-End/User Interface Developers to work on a major consumer e-commerce portal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;!--break--&gt;

&lt;blockquote&gt;
  &lt;p&gt;MAIN TASKS/REQUIREMENTS:&lt;/p&gt;
  
  &lt;ul&gt;
  &lt;li&gt;Development of enterprise solutions&lt;/li&gt;
  &lt;li&gt;Development of Consumer driven applications&lt;/li&gt;
  &lt;li&gt;Adherence to Software Development Methodology&lt;/li&gt;
  &lt;/ul&gt;
  
  &lt;p&gt;ESSENTIAL SKILLS:&lt;/p&gt;
  
  &lt;ul&gt;
  &lt;li&gt;XSLT &amp;amp; XML&lt;/li&gt;
  &lt;li&gt;XML schemas&lt;/li&gt;
  &lt;li&gt;JavaScript&lt;/li&gt;
  &lt;li&gt;XHTML&lt;/li&gt;
  &lt;li&gt;Cross browser and platform CSS positioning&lt;/li&gt;
  &lt;li&gt;Accessibility&lt;/li&gt;
  &lt;/ul&gt;
  
  &lt;p&gt;DESIRABLE (not essential) SKILLS:&lt;/p&gt;
  
  &lt;ul&gt;
  &lt;li&gt;Understanding of web design&lt;/li&gt;
  &lt;li&gt;JavaScript, including AJAX &amp;amp; DHTML&lt;/li&gt;
  &lt;li&gt;OO JavaScript&lt;/li&gt;
  &lt;/ul&gt;
  
  &lt;p&gt;These roles represent the opportunity to consult for a global multi billion global organisation.&lt;/p&gt;
  
  &lt;p&gt;The roles will pay £400 to £450 per day and they are 3 to 6 months contracts.&lt;/p&gt;
  
  &lt;p&gt;The role is based in Welwyn Garden City in Hertfordshire.&lt;/p&gt;
  
  &lt;p&gt;It is a 15 to 20 minute walk from the train station and there is a company service bus every 15 minutes at peak times from the station to the campus.&lt;/p&gt;
  
  &lt;p&gt;naturally there is loads of parking space for cars.&lt;/p&gt;
&lt;/blockquote&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/87#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/38">work</category>
 <pubDate>Thu, 17 Apr 2008 21:02:16 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">87 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XSLT Q&amp;A: Refactoring templates</title>
 <link>http://www.jenitennison.com/blog/node/84</link>
 <description>&lt;p&gt;A question about how to refactor some repetitive templates.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The issue is in creating XHTML headings.  &lt;/p&gt;
  
  &lt;p&gt;For a small docbook article, I have the following templates in one of my stylesheets:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;!--break--&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:template match=&quot;article/title | article/info/title&quot;&amp;gt;
  &amp;lt;h1&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h1&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match=&quot;article/section/title&quot;&amp;gt;
  &amp;lt;h2&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h2&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match=&quot;article/section/section/title&quot;&amp;gt;
  &amp;lt;h3&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h3&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match=&quot;article/section/section/section/title&quot;&amp;gt;
  &amp;lt;h4&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h4&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match=&quot;article/section/section/section/section/title&quot;&amp;gt;
  &amp;lt;h5&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h5&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;xsl:template match=&quot;article/section/section/section/section/section/title&quot;&amp;gt;
  &amp;lt;h6&amp;gt;&amp;lt;xsl:apply-templates /&amp;gt;&amp;lt;/h6&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;blockquote&gt;
  &lt;p&gt;Obviously this was a quick and (VERY) dirty way to achieve the output I wanted.&lt;/p&gt;
  
  &lt;p&gt;So, I know you can do something similar with an &lt;code&gt;&amp;lt;xsl:choose&amp;gt;&lt;/code&gt; and some cases, but I have a feeling there&amp;#8217;s a more automatic way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Seek out the similarities. The last five of these templates all match &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; elements within a &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; element. They all create an XHTML heading element and apply templates to the content of the &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; to get the content of the heading.&lt;/p&gt;

&lt;p&gt;Identify the differences. They&amp;#8217;re different in the level of heading that they create and in the number of ancestor &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; elements the &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; has.&lt;/p&gt;

&lt;p&gt;Find the algorithm. Here&amp;#8217;s the mapping:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;number of &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; ancestors&lt;/th&gt;
      &lt;th&gt;required heading&lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;1&lt;/td&gt;
      &lt;td&gt;2&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;2&lt;/td&gt;
      &lt;td&gt;3&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;3&lt;/td&gt;
      &lt;td&gt;4&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;4&lt;/td&gt;
      &lt;td&gt;5&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;5&lt;/td&gt;
      &lt;td&gt;6&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;So the level of the heading is the number of ancestor &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; elements plus one.&lt;/p&gt;

&lt;p&gt;Put it together. Get the number of ancestor &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; elements with &lt;code&gt;count(ancestor::section)&lt;/code&gt;. Then build the name of the heading element with an attribute value template in the &lt;code&gt;name&lt;/code&gt; attribute of &lt;code&gt;&amp;lt;xsl:element&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:template match=&quot;section/title&quot;&amp;gt;
  &amp;lt;xsl:variable name=&quot;nAncestorSections&quot;
    select=&quot;count(ancestor::section)&quot; /&amp;gt;
  &amp;lt;xsl:variable name=&quot;headingLevel&quot;
    select=&quot;$nAncestorSections + 1&quot; /&amp;gt;
  &amp;lt;xsl:element name=&quot;h{$headingLevel}&quot;&amp;gt;
    &amp;lt;xsl:apply-templates /&amp;gt;
  &amp;lt;/xsl:element&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Of course there &lt;em&gt;are&lt;/em&gt; differences between this refactored code and the original. In particular, this template deals improperly with the case where there are more than five nested sections, because it creates an &lt;code&gt;&amp;lt;h7&amp;gt;&lt;/code&gt; element, which isn&amp;#8217;t legal. If you thought that was likely to occur, you could change how &lt;code&gt;$headingLevel&lt;/code&gt; is calculated to:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:variable name=&quot;headingLevel&quot;&amp;gt;
  &amp;lt;xsl:choose&amp;gt;
    &amp;lt;xsl:when test=&quot;$nAncestorSections &amp;gt;= 5&quot;&amp;gt;6&amp;lt;/xsl:when&amp;gt;
    &amp;lt;xsl:otherwise&amp;gt;
      &amp;lt;xsl:value-of select=&quot;$nAncestorSections + 1&quot; /&amp;gt;
    &amp;lt;/xsl:otherwise&amp;gt;
  &amp;lt;/xsl:choose&amp;gt;
&amp;lt;/xsl:variable&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;or:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:variable name=&quot;headingLevel&quot;
  select=&quot;if ($nAncestorSections &amp;gt;= 5)
          then 6 else $nAncestorSections + 1&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;in XSLT 2.0.&lt;/p&gt;
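
&lt;p&gt;As a more compact XSLT 2.0 alternative (a sketch that relies only on the standard XPath 2.0 &lt;code&gt;min()&lt;/code&gt; function), the same cap at six could probably be written as:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- take whichever is smaller: the computed level or 6 --&amp;gt;
&amp;lt;xsl:variable name=&quot;headingLevel&quot;
  select=&quot;min(($nAncestorSections + 1, 6))&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;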

&lt;p&gt;The other problem is that the template deals differently with &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; elements that appear within a &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; whose parent is neither &lt;code&gt;&amp;lt;article&amp;gt;&lt;/code&gt; nor another &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; (which aren&amp;#8217;t matched by the original templates). There are other possible parents for &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt;, namely &lt;code&gt;&amp;lt;appendix&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;chapter&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;partintro&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;preface&amp;gt;&lt;/code&gt;. If these elements are likely to appear in the subset of DocBook you&amp;#8217;re using and you want the code to behave differently for them, you need either to add more templates or to add some extra conditions to this one.&lt;/p&gt;
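
&lt;p&gt;As one possible sketch of the extra-conditions route (assuming you want to keep the original behaviour and only handle sections nested within an &lt;code&gt;&amp;lt;article&amp;gt;&lt;/code&gt;), a predicate on the match pattern can check that the outermost &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; ancestor is a child of &lt;code&gt;&amp;lt;article&amp;gt;&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- ancestor::section[last()] selects the outermost section ancestor --&amp;gt;
&amp;lt;xsl:template match=&quot;section/title[ancestor::section[last()]/parent::article]&quot;&amp;gt;
  &amp;lt;xsl:element name=&quot;h{count(ancestor::section) + 1}&quot;&amp;gt;
    &amp;lt;xsl:apply-templates /&amp;gt;
  &amp;lt;/xsl:element&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Titles of sections inside an &lt;code&gt;&amp;lt;appendix&amp;gt;&lt;/code&gt; or &lt;code&gt;&amp;lt;chapter&amp;gt;&lt;/code&gt; would then fall through to whatever other templates (or built-in rules) apply, as they did with the original six templates.&lt;/p&gt;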
</description>
 <comments>http://www.jenitennison.com/blog/node/84#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <pubDate>Sun, 06 Apr 2008 20:20:00 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">84 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Free Our Bills</title>
 <link>http://www.jenitennison.com/blog/node/83</link>
 <description>&lt;p&gt;The &lt;a href=&quot;http://www.theyworkforyou.com/freeourbills/&quot; title=&quot;TheyWorkForYou.com: Free Our Bills&quot;&gt;Free Our Bills&lt;/a&gt; campaign was launched recently in the UK. &lt;a href=&quot;http://www.theregister.co.uk/2008/03/26/mysociety_xml_bills_cameron/comments/#c_185029&quot; title=&quot;The Register: Comments on UK.gov urged to adopt web-friendly legislation format&quot;&gt;Some of the comments I&amp;#8217;ve seen&lt;/a&gt; about the campaign make me think that it might be helpful if people understood more about how Bills and legislation get published in the UK. I thought I&amp;#8217;d offer a bit of background based on my experience (though there are many people with more intimate knowledge of the processes involved; perhaps they&amp;#8217;ll correct me when I get it wrong).&lt;/p&gt;

&lt;!--break--&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Bills are draft legislation that is under discussion within the House of Commons or House of Lords. A Bill becomes law (legislation) when it is enacted.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bills are published by Parliament and are available on the &lt;a href=&quot;http://services.parliament.uk/bills/&quot; title=&quot;UK Parliament: Bills Before Parliament&quot;&gt;Parliament website&lt;/a&gt;. Legislation is published by &lt;a href=&quot;http://www.tso.co.uk/&quot; title=&quot;The Stationery Office&quot;&gt;The Stationery Office (TSO)&lt;/a&gt; under contract to the Office of Public Sector Information (OPSI) on the &lt;a href=&quot;http://www.opsi.gov.uk/legislation&quot; title=&quot;OPSI: Legislation&quot;&gt;OPSI website&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bills are changed (amended) as they progress through the Houses of Parliament. People are mostly interested in the most recent version of a Bill. Legislation can be changed (amended) by other legislation; the version of a piece of legislation with all the changes applied to it is known as consolidated legislation. Consolidated legislation is published in the &lt;a href=&quot;http://www.statutelaw.gov.uk&quot; title=&quot;Statute Law Database&quot;&gt;Statute Law Database&lt;/a&gt; as well as (to a more limited extent) on the &lt;a href=&quot;http://www.opsi.gov.uk/legislation/revised&quot; title=&quot;OPSI: Revised Legislation&quot;&gt;OPSI website&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bills are edited by a dedicated team of Parliament employees who must reflect the amendments that the MPs say they want to make. They use a WYSIWYG XML editor. As is usual in an environment that has only been concerned about printed copies for centuries, they tend to focus on appearance rather than semantics, even when the XML supports the semantics.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Free Our Bills campaign is not about making Bills (or legislation) easier for humans to read and understand, it&amp;#8217;s about making it easier to extract information from a Bill so that people can be notified when a new Bill comes along on a subject they care about, or an old Bill is redrafted, and so on.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Bills are already available for the public to view on the web, in PDF and HTML forms. The problem is that the HTML is Really Really Bad (&lt;a href=&quot;http://www.publications.parliament.uk/pa/ld200708/ldbills/044/08044.i-v.html&quot; title=&quot;Parliament: Climate Change Bill&quot;&gt;View Source to see&lt;/a&gt;) and that makes it Really Really Hard to extract useful information from them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;There are reasons for the Bills HTML being Really Really Bad:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;The HTML must look &lt;em&gt;exactly&lt;/em&gt; like it does in printed form, otherwise Members of Parliament (MPs) would get Really Really Confused.&lt;/li&gt;
&lt;li&gt;MPs refer to pieces of a Bill (which they might want to change) by page and line number, not by the semantic structure of the Bill, so the HTML must have page and line numbers in it or MPs would get Really Really Confused. &lt;/li&gt;
&lt;li&gt;Although the formatting of Bills is pretty consistent, there&amp;#8217;s always the chance that a piece will need to be formatted specially. It might be safe to assume a particular presentation for a particular semantic 99% of the time, but if that 1% isn&amp;#8217;t formatted in the different way, MPs would be Really Really Confused.&lt;/li&gt;
&lt;li&gt;The code that creates the Bill HTML was written several years ago, when browser support for CSS was Really Really Bad.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The picture for legislation is rather better because a strategic decision was made to focus on semantics rather than presentation. When a Bill is enacted, it gets converted into &lt;a href=&quot;http://www.opsi.gov.uk/legislation/schema/&quot; title=&quot;OPSI: Legislation schema&quot;&gt;reasonably good semantic XML&lt;/a&gt;, which forms the basis of all the HTML views. It also helps that this HTML was designed fairly recently, for modern browsers; it makes heavy use of CSS so there&amp;#8217;s relatively little obfuscation of the content.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think there are interesting general lessons here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Different user communities have different requirements.&lt;/strong&gt; MPs&amp;#8217; requirements of Bills differ from those of the general public, who don&amp;#8217;t care (as) much about line or page numbers. On the other hand, you need to actually consult with users about what they need rather than make assumptions about it: are MPs really likely to get Really Really Confused if the HTML presentation of a Bill looks slightly different from the PDF print version? I don&amp;#8217;t know.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Authors don&amp;#8217;t care about what they don&amp;#8217;t use.&lt;/strong&gt; When the only way of using a Bill is to print it, it&amp;#8217;s natural that authors and publishers only care about how it looks when it&amp;#8217;s printed. Training people to care about semantic markup is really hard, and it&amp;#8217;s made harder by WYSIWYG tools that allow them to override the semantic style. If a difference isn&amp;#8217;t visible, then in the author&amp;#8217;s eyes it doesn&amp;#8217;t exist.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You have to positively decide to ignore appearance.&lt;/strong&gt; When transforming from a WYSIWYG view, replicating appearance is the obvious thing to do. But it&amp;#8217;s worthwhile in the long run to focus on extracting the semantics, because the resulting documents are so much more reusable.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;HTML, XML and XSLT are not inherently good.&lt;/strong&gt; Parliament wanted Bills in HTML so that they were more accessible on the web. But the HTML is dreadfully inaccessible because of the other requirements placed on it. Similarly, XML can be incredibly obfuscated, or entirely about presentation, as formats such as OOXML illustrate. And just because your code is written in XSLT does not make it inherently easier to maintain than (say) a SAX transformation. It&amp;#8217;s easy to misuse a technology.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Developers who produce atrocious HTML aren&amp;#8217;t necessarily ignorant.&lt;/strong&gt; Unfortunately, there&amp;#8217;s sometimes a limit to how much you can argue with your customers.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/83#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/14">xml</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/37">legislation</category>
 <pubDate>Mon, 31 Mar 2008 20:10:14 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">83 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Extension primitives in XSDL</title>
 <link>http://www.jenitennison.com/blog/node/71</link>
 <description>&lt;p&gt;Michael Sperberg McQueen (CMSMcQ) has written a couple of interesting posts about &lt;a href=&quot;http://people.w3.org/~cmsmcq/blog/?p=26&quot; title=&quot;Michael Sperberg McQueen: Allowing ‘extension primitives’ in XML Schema?&quot;&gt;datatypes in W3C&amp;#8217;s XML Schema (XSDL)&lt;/a&gt;. (The second is &lt;a href=&quot;http://people.w3.org/~cmsmcq/blog/?p=27&quot; title=&quot;Michael Sperberg McQueen: Primitives and non-primitives in XSDL&quot;&gt;a response to&lt;/a&gt; a comment from &lt;a href=&quot;http://recycledknowledge.blogspot.com/&quot; title=&quot;John Cowan&#039;s Blog: Recycled Knowledge&quot;&gt;John Cowan&lt;/a&gt;, and attempts to justify some of the seemingly arbitrary decisions made in the set of datatypes present in XSDL 1.0.) The posts are a discussion of one of the issues against XSDL 1.1 raised by &lt;a href=&quot;http://saxonica.blogharbor.com/&quot; title=&quot;Michael Kay&#039;s Blog: Saxon diaries&quot;&gt;Michael Kay&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Michael proposes: just specify that implementations may provide additional implementation-defined primitive types. In the nature of things, an implementation can do this however it wants. Some implementors will code up email dates and CSS lengths the same way they code the other primitives. Fine. Some implementors will expose the API that their existing primitive types use, so they choose, at the appropriate moment, to link in a set of extension types, or not. Some will allow users to provide implementations of extension types, using that API, and link them at run time. Some may provide extension syntax to allow users to describe new types in some usable way (DTLL, anyone?) without having to write code in Java or C or [name of language here].&lt;/p&gt;
&lt;/blockquote&gt;

&lt;!--break--&gt;

&lt;p&gt;Since I&amp;#8217;m principally responsible for the &lt;a href=&quot;http://www.idealliance.org/papers/extreme/proceedings/html/2006/Tennison01/EML2006Tennison01.html&quot; title=&quot;Extreme 2006: Datatypes for XML: the Datatyping Library Language (DTLL)&quot;&gt;Datatype Library Language (DTLL)&lt;/a&gt; it&amp;#8217;ll come as no surprise that I think that XSDL is currently deficient in not providing mechanisms for creating new primitive types (such as colours) or different lexical representations for the primitive types it has (such as UK-style dates). So yes, I do think XSDL would be a better schema language if it supported &amp;#8220;extension primitives&amp;#8221;. &lt;/p&gt;

&lt;p&gt;In XSLT and XPath, providing extensibility hooks has proved very effective. It&amp;#8217;s enabled implementers to innovate (and those innovations fed back into the design of XSLT 2.0 and XPath 2.0). It&amp;#8217;s provided users with functionality (such as &lt;code&gt;xxx:node-set()&lt;/code&gt;) that they would otherwise not have had for years, and therefore made the lives of thousands of users much easier.&lt;/p&gt;

&lt;p&gt;On the other hand, it&amp;#8217;s impossible to say how XSLT and XPath would have developed if those extensibility hooks hadn&amp;#8217;t been there. Would implementers have extended the language anyway, leading to fragmentation? Would the WG have felt more pressure to get later versions of XSLT out the door if the only way the language could have been improved was through centralised changes?&lt;/p&gt;

&lt;p&gt;I think the big thing that helped XSLT&amp;#8217;s extensibility actually work was &lt;a href=&quot;http://www.exslt.org/&quot; title=&quot;EXSLT: Extensions in XSLT&quot;&gt;EXSLT&lt;/a&gt; (but then, I would say that, wouldn&amp;#8217;t I?). The majority of XSLT processors implement EXSLT extensions, and even those processors that don&amp;#8217;t implement all (or any) of EXSLT have other extensibility hooks (such as &lt;code&gt;&amp;lt;msxsl:script&amp;gt;&lt;/code&gt; or &lt;code&gt;&amp;lt;xsl:function&amp;gt;&lt;/code&gt;) and there are third-party implementations of EXSLT&amp;#8217;s functions available so it&amp;#8217;s possible to write interoperable stylesheets while still using those functions.&lt;/p&gt;
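
&lt;p&gt;As a rough illustration of how a stylesheet can stay interoperable while using an extension, here is a minimal sketch, assuming an XSLT 1.0 processor and the EXSLT Common namespace; the &lt;code&gt;$items&lt;/code&gt; variable and its content are hypothetical:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:exsl=&quot;http://exslt.org/common&quot;&amp;gt;

  &amp;lt;xsl:output method=&quot;text&quot; /&amp;gt;

  &amp;lt;!-- hypothetical result tree fragment, for illustration only --&amp;gt;
  &amp;lt;xsl:variable name=&quot;items&quot;&amp;gt;
    &amp;lt;item&amp;gt;a&amp;lt;/item&amp;gt;
    &amp;lt;item&amp;gt;b&amp;lt;/item&amp;gt;
  &amp;lt;/xsl:variable&amp;gt;

  &amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
    &amp;lt;xsl:choose&amp;gt;
      &amp;lt;!-- only call the extension function if the processor supports it --&amp;gt;
      &amp;lt;xsl:when test=&quot;function-available(&#039;exsl:node-set&#039;)&quot;&amp;gt;
        &amp;lt;xsl:value-of select=&quot;count(exsl:node-set($items)/item)&quot; /&amp;gt;
      &amp;lt;/xsl:when&amp;gt;
      &amp;lt;xsl:otherwise&amp;gt;
        &amp;lt;xsl:message terminate=&quot;yes&quot;&amp;gt;exsl:node-set() is not available&amp;lt;/xsl:message&amp;gt;
      &amp;lt;/xsl:otherwise&amp;gt;
    &amp;lt;/xsl:choose&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;function-available()&lt;/code&gt; test is what lets the same stylesheet load on processors with and without the extension; a real stylesheet might fall back to a processor-specific equivalent rather than terminating.&lt;/p&gt;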

&lt;p&gt;(EXSLT is by no means perfect: if we were doing it over again, we&amp;#8217;d build in much better methods for receiving user contributions of various kinds. But I think the general principle is sound.)&lt;/p&gt;

&lt;p&gt;If XSDL were to allow extension primitives, you&amp;#8217;d hope for something similar to happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a repository for common extension primitives&lt;/li&gt;
&lt;li&gt;implementations that respond to user demand for extensions in the repository&lt;/li&gt;
&lt;li&gt;development of higher-level languages for defining extension primitives&lt;/li&gt;
&lt;li&gt;implementations that provide hooks (in whatever way) for defining extension primitives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can&amp;#8217;t predict what implementers will do, but it seems likely that they&amp;#8217;d provide hooks for users to create their own extension primitives (albeit most likely using Java or .NET or whatever rather than a higher-level language such as DTLL). And once they do that, it&amp;#8217;s possible for the community to provide third-party implementations for extension primitives, thus retaining interoperability.&lt;/p&gt;

&lt;p&gt;So I think it could work, if implementers do the right thing and the user community gets involved.&lt;/p&gt;

&lt;p&gt;(Just in case you get the wrong impression: I still think &lt;a href=&quot;http://www.relaxng.org/&quot; title=&quot;RELAX NG&quot;&gt;RELAX NG&lt;/a&gt; is a vastly superior schema language to XSDL. If you need extension datatypes, you can have them in RELAX NG right now. Unfortunately, however, in the real world, you don&amp;#8217;t always get to make the right technical choice.)&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/71#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/35">dtll</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <pubDate>Sat, 19 Jan 2008 23:00:20 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">71 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XSLT 2.0 Q&amp;A: Linking elements in different documents</title>
 <link>http://www.jenitennison.com/blog/node/70</link>
 <description>&lt;p&gt;The first of what will probably become a series of posts where I answer publicly questions that people post me privately (with permission, of course)&amp;#8230;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;How should I model (and store) data for Courses, while being able to pull info about a Course into a particular context (a Semester or Curriculum)?&lt;/p&gt;
  
  &lt;p&gt;I&amp;#8217;m not quite sure how to do this in terms of writing the schema (and consequently the XML), and/or how to connect it with XSL (if that is appropriate).  My experience with XSLT is limited to pretty much straight templates, and I&amp;#8217;ve never cross-referenced two nodesets and used the result to provide a different output before.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The important thing is that within whatever XML you use, you have identifiers of some description that you can use to work out what a particular element means. The simplest and most general purpose identifiers are &lt;code&gt;xml:id&lt;/code&gt; attributes, but they&amp;#8217;re a bit limited because they have to be legal IDs.&lt;/p&gt;

&lt;p&gt;If you have something like a course that&amp;#8217;s identified by a number then it makes more sense to use the number of the course as the identifier; that can&amp;#8217;t live in an &lt;code&gt;xml:id&lt;/code&gt; attribute, so you have to use some other attribute (eg &lt;code&gt;number&lt;/code&gt;) for it. (You can use an element instead of an attribute, of course, but identifiers are usually metadata, and metadata should usually be an attribute unless it&amp;#8217;s structured.)&lt;/p&gt;

&lt;p&gt;You might have something that is actually uniquely identified by a combination of values. For example, a course might be identified by the department that offers the course plus the number identifying the course; it might be possible for two courses to have the same number, but be different courses, offered by different departments. Again, ultimately all that&amp;#8217;s really important is that it&amp;#8217;s possible to identify the department from the course; whether the identifier itself is on the element representing the course or on one of its ancestors doesn&amp;#8217;t really matter.&lt;/p&gt;

&lt;p&gt;In the XSLT, the first task is to pull in all the documents that hold the information you want to use and store them as global variables:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:variable name=&quot;curriculum&quot; as=&quot;document-node()&quot;
  select=&quot;document(&#039;curriculum.xml&#039;)&quot; /&amp;gt;
&amp;lt;xsl:variable name=&quot;transcript&quot; as=&quot;document-node()&quot;
  select=&quot;document(&#039;transcript.xml&#039;)&quot; /&amp;gt;
&amp;lt;xsl:variable name=&quot;database&quot; as=&quot;document-node()&quot;
  select=&quot;document(&#039;database.xml&#039;)&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then you should set up keys that will create indexes of the information in your documents based on their identifier(s). A simple key would look like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:key name=&quot;departments&quot; match=&quot;dept&quot; use=&quot;@xml:id&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;name&lt;/code&gt; attribute is a name for the key; you can call it anything you like, but for identifiers I usually just use the plural of a noun for the thing I&amp;#8217;m identifying. The &lt;code&gt;match&lt;/code&gt; attribute is a pattern that matches the elements that you&amp;#8217;re indexing (don&amp;#8217;t forget namespaces, if you have them). The &lt;code&gt;use&lt;/code&gt; attribute holds an XPath that should return a value for a given node that you&amp;#8217;re indexing; in the above case it&amp;#8217;s the value of the &lt;code&gt;xml:id&lt;/code&gt; attribute on the &lt;code&gt;&amp;lt;dept&amp;gt;&lt;/code&gt; element.&lt;/p&gt;

&lt;p&gt;For elements that use a combination of values as an identifier, you can use the &lt;code&gt;concat()&lt;/code&gt; function to create a unique value that combines the identifying values. For example, if your XML looks like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;courses dept=&quot;CMSC&quot;&amp;gt;
  &amp;lt;course number=&quot;131&quot;&amp;gt;...&amp;lt;/course&amp;gt;
  &amp;lt;course number=&quot;198W&quot;&amp;gt;...&amp;lt;/course&amp;gt;
  &amp;lt;course number=&quot;434&quot;&amp;gt;...&amp;lt;/course&amp;gt;
&amp;lt;/courses&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;then you could index each course by its department and number with:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:key name=&quot;courses&quot; match=&quot;course&quot; use=&quot;concat(../@dept, &#039;:&#039;, @number)&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&amp;#8217;s not really necessary in this case, but I&amp;#8217;ve put a separator in the &lt;code&gt;concat()&lt;/code&gt; call out of habit as it helps prevent problems such as something identified as &lt;code&gt;&#039;a&#039; + &#039;bc&#039;&lt;/code&gt; being given the same identifier as something identified through &lt;code&gt;&#039;ab&#039; + &#039;c&#039;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Note that the keys don&amp;#8217;t indicate which document the information is held in. It&amp;#8217;s only when you &lt;em&gt;call&lt;/em&gt; the key that you say which document you want to use. For example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;key(&#039;courses&#039;, &#039;CMSC:198W&#039;, $curriculum)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;would pull out the &lt;code&gt;&amp;lt;course&amp;gt;&lt;/code&gt; element whose parent had a &lt;code&gt;dept&lt;/code&gt; attribute equal to &lt;code&gt;&#039;CMSC&#039;&lt;/code&gt; and  a &lt;code&gt;number&lt;/code&gt; attribute with the value &lt;code&gt;&#039;198W&#039;&lt;/code&gt; from the document held in the &lt;code&gt;$curriculum&lt;/code&gt; variable. (This used to be harder to manage in XSLT 1.0, when there wasn&amp;#8217;t a third argument; without the third argument, the XSLT processor looks in the document you&amp;#8217;re currently in.)&lt;/p&gt;
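
&lt;p&gt;For comparison, a common XSLT 1.0 workaround (sketched here on the assumption that &lt;code&gt;$curriculum&lt;/code&gt; holds the root node of the document to search) is to change the context node with &lt;code&gt;&amp;lt;xsl:for-each&amp;gt;&lt;/code&gt; before calling the two-argument &lt;code&gt;key()&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:for-each select=&quot;$curriculum&quot;&amp;gt;
  &amp;lt;!-- the context node is now in the curriculum document,
       so key() searches that document --&amp;gt;
  &amp;lt;xsl:apply-templates select=&quot;key(&#039;courses&#039;, &#039;CMSC:198W&#039;)&quot; /&amp;gt;
&amp;lt;/xsl:for-each&amp;gt;
&lt;/code&gt;&lt;/pre&gt;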

&lt;p&gt;You can just call the &lt;code&gt;key()&lt;/code&gt; function directly, but if you&amp;#8217;re using XSLT 2.0 then I&amp;#8217;d suggest wrapping it in a function like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:function name=&quot;my:course&quot; as=&quot;element(course)&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;dept&quot; as=&quot;xs:string&quot; /&amp;gt;
  &amp;lt;xsl:param name=&quot;number&quot; as=&quot;xs:string&quot; /&amp;gt;
  &amp;lt;xsl:variable name=&quot;identifier&quot; select=&quot;concat($dept, &#039;:&#039;, $number)&quot; /&amp;gt;
  &amp;lt;xsl:sequence select=&quot;key(&#039;courses&#039;, $identifier, $curriculum)&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This means you can use&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;my:course(&#039;CMSC&#039;, &#039;198W&#039;)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;to locate the &lt;code&gt;&amp;lt;course&amp;gt;&lt;/code&gt; element you&amp;#8217;re after.&lt;/p&gt;
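
&lt;p&gt;For the function to compile, the &lt;code&gt;my&lt;/code&gt; and &lt;code&gt;xs&lt;/code&gt; prefixes have to be declared on the stylesheet. A minimal sketch of the outer element, in which the namespace URI bound to &lt;code&gt;my&lt;/code&gt; is purely hypothetical (any URI you control would do):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
  xmlns:my=&quot;http://example.com/ns/my&quot;
  exclude-result-prefixes=&quot;xs my&quot;&amp;gt;

  &amp;lt;!-- global variables, keys, functions and templates go here --&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;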

&lt;p&gt;Of course, most of the time you won&amp;#8217;t have fixed values for the arguments for that function: you&amp;#8217;ll have some XML that refers to the course in its own way. For example, you might have:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;semester season=&quot;fall&quot; year=&quot;2007&quot;&amp;gt;
  &amp;lt;course dept=&quot;ARTT&quot; number=&quot;210&quot; /&amp;gt;
  &amp;lt;course dept=&quot;INFM&quot; number=&quot;210&quot; /&amp;gt;
&amp;lt;/semester&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and want to make a list of the titles of the courses. You could do this with:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:template match=&quot;course&quot;&amp;gt;
  &amp;lt;li&amp;gt;
    &amp;lt;xsl:apply-templates select=&quot;my:course(@dept, @number)/title&quot; /&amp;gt;
  &amp;lt;/li&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Just a final thought: if you have control over it, it&amp;#8217;s useful to make a clear distinction between elements that define information and elements that reference those definitions. One way of doing that is to name them differently (eg &lt;code&gt;&amp;lt;course&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;courseRef&amp;gt;&lt;/code&gt;) or make sure that they have different sets of attributes (eg &lt;code&gt;number&lt;/code&gt; and &lt;code&gt;numberRef&lt;/code&gt;). Using the name of an element can be more useful because you can use that when you&amp;#8217;re defining the types of parameters and return values. For example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:function name=&quot;my:courses&quot; as=&quot;element(course)+&quot;&amp;gt;
  &amp;lt;xsl:param name=&quot;semester&quot; as=&quot;element(semester)&quot; /&amp;gt;
  &amp;lt;xsl:variable name=&quot;courseRefs&quot; as=&quot;element(courseRef)+&quot;
    select=&quot;$semester/courseRef&quot; /&amp;gt;
  &amp;lt;xsl:sequence select=&quot;$courseRefs/my:course(@dept, @number)&quot; /&amp;gt;
&amp;lt;/xsl:function&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
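
&lt;p&gt;Putting the pieces together, a complete stylesheet might look something like the sketch below. The &lt;code&gt;my&lt;/code&gt; namespace URI, the &lt;code&gt;curriculum.xml&lt;/code&gt; filename and the key definition are all assumptions made for illustration, and the source document is assumed to use &lt;code&gt;&amp;lt;courseRef&amp;gt;&lt;/code&gt; elements inside each &lt;code&gt;&amp;lt;semester&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
    xmlns:my=&quot;http://example.com/my&quot;
    exclude-result-prefixes=&quot;xs my&quot;&amp;gt;

  &amp;lt;!-- Assumption: the course definitions live in curriculum.xml --&amp;gt;
  &amp;lt;xsl:variable name=&quot;curriculum&quot; as=&quot;document-node()&quot;
    select=&quot;doc(&#039;curriculum.xml&#039;)&quot; /&amp;gt;

  &amp;lt;!-- Assumption: courses are keyed on &quot;dept:number&quot;, with dept on the parent --&amp;gt;
  &amp;lt;xsl:key name=&quot;courses&quot; match=&quot;course&quot;
    use=&quot;concat(../@dept, &#039;:&#039;, @number)&quot; /&amp;gt;

  &amp;lt;!-- Look up one course definition in the curriculum --&amp;gt;
  &amp;lt;xsl:function name=&quot;my:course&quot; as=&quot;element(course)&quot;&amp;gt;
    &amp;lt;xsl:param name=&quot;dept&quot; as=&quot;xs:string&quot; /&amp;gt;
    &amp;lt;xsl:param name=&quot;number&quot; as=&quot;xs:string&quot; /&amp;gt;
    &amp;lt;xsl:sequence
      select=&quot;key(&#039;courses&#039;, concat($dept, &#039;:&#039;, $number), $curriculum)&quot; /&amp;gt;
  &amp;lt;/xsl:function&amp;gt;

  &amp;lt;!-- Resolve every courseRef in a semester to its definition --&amp;gt;
  &amp;lt;xsl:function name=&quot;my:courses&quot; as=&quot;element(course)+&quot;&amp;gt;
    &amp;lt;xsl:param name=&quot;semester&quot; as=&quot;element(semester)&quot; /&amp;gt;
    &amp;lt;xsl:sequence select=&quot;$semester/courseRef/my:course(@dept, @number)&quot; /&amp;gt;
  &amp;lt;/xsl:function&amp;gt;

  &amp;lt;!-- List the titles of the courses taken in a semester --&amp;gt;
  &amp;lt;xsl:template match=&quot;semester&quot;&amp;gt;
    &amp;lt;ul&amp;gt;
      &amp;lt;xsl:for-each select=&quot;my:courses(.)&quot;&amp;gt;
        &amp;lt;li&amp;gt;&amp;lt;xsl:value-of select=&quot;title&quot; /&amp;gt;&amp;lt;/li&amp;gt;
      &amp;lt;/xsl:for-each&amp;gt;
    &amp;lt;/ul&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;One side effect of the declared types: a &lt;code&gt;&amp;lt;courseRef&amp;gt;&lt;/code&gt; that doesn&amp;#8217;t match any course makes &lt;code&gt;my:course()&lt;/code&gt; return an empty sequence, which the processor reports as a type error rather than silently producing nothing.&lt;/p&gt;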
</description>
 <comments>http://www.jenitennison.com/blog/node/70#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <pubDate>Fri, 18 Jan 2008 17:33:14 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">70 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Partial implementations #2: XSLT in Google Search Appliance</title>
 <link>http://www.jenitennison.com/blog/node/64</link>
 <description>&lt;p&gt;A &lt;a href=&quot;http://www.google.com/enterprise/gsa/&quot; title=&quot;Google Search Appliance&quot;&gt;Google Search Appliance&lt;/a&gt; (GSA) is a box that you plug into your network which crawls and indexes your data, and serves up the results of searches. Search results come in an XML format, and there&amp;#8217;s a built in XSLT engine which means you can convert that XML into as many different views as you like. So you can have HTML-based search results, summaries, feeds, and so on.&lt;/p&gt;

&lt;p&gt;My task recently was to debug some XSLT that transformed the GSA XML into an Atom feed. Easy enough, right? The GSA &lt;a href=&quot;http://code.google.com/apis/searchappliance/documentation/46/xml_reference.html#results_xml&quot; title=&quot;Google Search Appliance Documentation: XML Results Reference&quot;&gt;XML format&lt;/a&gt; is pretty hideous &amp;#8212; most of the elements max out at three capital letters in length (whatever happened to human-readability?) &amp;#8212; but logical enough, and the mapping is hardly complex.&lt;/p&gt;

&lt;p&gt;But all was not as it seemed. The GSA&amp;#8217;s XSLT implementation is&amp;#8230; how can I put this politely?&amp;#8230; &amp;#8220;non-standard&amp;#8221;. This post describes some of the problems and workarounds.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;To get the GSA to use your own XSLT, you have to go through its web interface. Basically there&amp;#8217;s a form with a text field in which you can type your XSLT. Or you can upload a file that you develop offline. Naturally you&amp;#8217;re going to do the latter because it means you can use your favourite editor with helpful things like syntax highlighting and validation-as-you-type, but of course that means switching between web browser windows and your IDE as you develop.&lt;/p&gt;

&lt;p&gt;So I upload the transformation, point the browser at a relevant search page, and&amp;#8230; oh&amp;#8230;&lt;/p&gt;

&lt;p&gt;When the GSA doesn&amp;#8217;t like the XSLT that you use, you get a really helpful error message. It says:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Internal server error.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So you know that there&amp;#8217;s been an error. With the server. Internally.&lt;/p&gt;

&lt;p&gt;Back to basics, I thought. Let&amp;#8217;s find out what processor the server&amp;#8217;s using. Then we can develop on that processor and be pretty sure the resulting XSLT will work. So I load up the default XSLT (which is used to create an HTML result) and add the line&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:value-of select=&quot;system-property(&#039;xsl:vendor&#039;)&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Save the XSLT, reload the page, and&amp;#8230;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Internal server error.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Okaaay&amp;#8230; so this is an XSLT processor that doesn&amp;#8217;t support the &lt;code&gt;xsl:vendor&lt;/code&gt; system property. If it doesn&amp;#8217;t support that, I&amp;#8217;m going to have to tread carefully. So let&amp;#8217;s start with something really simple:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
   xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
  &amp;lt;xsl:copy-of select=&quot;.&quot; /&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Save the XSLT, reload the page, and&amp;#8230;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Internal server error.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On a whim, I tried&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
   version=&quot;1.0&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
  &amp;lt;xsl:copy-of select=&quot;.&quot; /&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;instead. Save the XSLT, reload the page, and&amp;#8230; Success!&lt;/p&gt;

&lt;p&gt;Can you spot the difference? Yes, that&amp;#8217;s right: it&amp;#8217;s the order of the XSLT namespace declaration and the version attribute. Namespace declaration first, you&amp;#8217;re OK, version first, you&amp;#8217;re not.&lt;/p&gt;

&lt;p&gt;Okaaay&amp;#8230; so this is an XSLT processor that doesn&amp;#8217;t support the XML Recommendation (which says that attribute order doesn&amp;#8217;t matter). But heck, why split hairs? At least it&amp;#8217;s working! Now to create some Atom instead:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
   version=&quot;1.0&quot;
   xmlns=&quot;http://www.w3.org/2005/Atom&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
  &amp;lt;feed /&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Save the XSLT, reload the page, and we&amp;#8217;re back to&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Internal server error.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;At least there&amp;#8217;s some &lt;a href=&quot;http://code.google.com/apis/searchappliance/documentation/46/xml_reference.html#results_xslt&quot; title=&quot;Google Search Appliance Documentation: Custom HTML&quot;&gt;documentation&lt;/a&gt; about this one:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;XSL stylesheets that include other files may not be used with the Google search engine. An XSL stylesheet that contains the following tags generates an error result:&lt;/p&gt;
  
  &lt;ul&gt;
  &lt;li&gt;&lt;code&gt;&amp;lt;xsl:import&amp;gt;&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;&amp;lt;xsl:include&amp;gt;&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;xmlns:&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code&gt;document()&lt;/code&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Read that again. Yes, the third bullet point. That&amp;#8217;s right, it&amp;#8217;s saying that an XSLT that contains a namespace declaration will generate an error result because it &amp;#8220;includes other files&amp;#8221;.&lt;/p&gt;

&lt;p&gt;But, but, but, namespace declarations in XSLT stylesheets (or elsewhere for that matter) do not indicate file inclusion. Namespace URIs are &lt;em&gt;identifiers&lt;/em&gt;, not &lt;em&gt;locations&lt;/em&gt;. They are strings. They are not resolved. You do not need to be connected to the &amp;#8216;net to use them.&lt;/p&gt;

&lt;p&gt;And how am I supposed to serve an Atom feed, since Atom documents use a namespace? Or XHTML for that matter? Fortunately, the GSA only goes so far in banning namespace declarations: you&amp;#8217;re OK as long as you don&amp;#8217;t put them on the &lt;code&gt;&amp;lt;xsl:stylesheet&amp;gt;&lt;/code&gt; element. Move the declaration to the &lt;code&gt;&amp;lt;feed&amp;gt;&lt;/code&gt; element, as in&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot; 
   version=&quot;1.0&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
  &amp;lt;feed xmlns=&quot;http://www.w3.org/2005/Atom&quot; /&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and you&amp;#8217;re OK. Of course you have to repeat the namespace declaration in every template so you don&amp;#8217;t end up creating elements in no namespace. Tedious, oh so tedious, but workable.&lt;/p&gt;
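
&lt;p&gt;To show what the repetition looks like across templates, here&amp;#8217;s a minimal sketch. It assumes the &lt;code&gt;GSP&lt;/code&gt;, &lt;code&gt;Q&lt;/code&gt;, &lt;code&gt;RES&lt;/code&gt;, &lt;code&gt;R&lt;/code&gt;, &lt;code&gt;U&lt;/code&gt; and &lt;code&gt;T&lt;/code&gt; element names from the GSA results format, it hasn&amp;#8217;t been checked against a real appliance, and it leaves out elements (such as &lt;code&gt;&amp;lt;updated&amp;gt;&lt;/code&gt;) that a valid Atom feed needs.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
   version=&quot;1.0&quot;&amp;gt;

&amp;lt;!-- Assumption: GSP is the document element and Q holds the query --&amp;gt;
&amp;lt;xsl:template match=&quot;/GSP&quot;&amp;gt;
  &amp;lt;feed xmlns=&quot;http://www.w3.org/2005/Atom&quot;&amp;gt;
    &amp;lt;title&amp;gt;Results for &amp;lt;xsl:value-of select=&quot;Q&quot; /&amp;gt;&amp;lt;/title&amp;gt;
    &amp;lt;xsl:apply-templates select=&quot;RES/R&quot; /&amp;gt;
  &amp;lt;/feed&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;!-- Assumption: each R is one result, with its URL in U and its title in T --&amp;gt;
&amp;lt;xsl:template match=&quot;R&quot;&amp;gt;
  &amp;lt;!-- The declaration is repeated here; without it, entry is in no namespace --&amp;gt;
  &amp;lt;entry xmlns=&quot;http://www.w3.org/2005/Atom&quot;&amp;gt;
    &amp;lt;id&amp;gt;&amp;lt;xsl:value-of select=&quot;U&quot; /&amp;gt;&amp;lt;/id&amp;gt;
    &amp;lt;title&amp;gt;&amp;lt;xsl:value-of select=&quot;T&quot; /&amp;gt;&amp;lt;/title&amp;gt;
    &amp;lt;link href=&quot;{U}&quot; /&amp;gt;
  &amp;lt;/entry&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The children of each literal result element pick up the default namespace from their parent, so the declaration is only needed once per template, on the outermost element.&lt;/p&gt;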

&lt;p&gt;(I have a vague suspicion that the idea behind banning namespace declarations is something to do with certain XSLT processors using namespace URIs to pull in Java classes. But addressing that problem by banning namespace declarations entirely isn&amp;#8217;t just throwing the baby out with the bathwater, it&amp;#8217;s throwing the whole bathroom suite out of the window. And if you then allow namespace declarations further down the stylesheet, you haven&amp;#8217;t actually solved the problem.)&lt;/p&gt;

&lt;p&gt;Amazingly enough, given the inauspicious beginning, everything else I tried actually worked. I suspect that it&amp;#8217;s some standard XSLT processor underneath with a regex-based filter that (among other things) limits what&amp;#8217;s allowed in the &lt;code&gt;&amp;lt;xsl:stylesheet&amp;gt;&lt;/code&gt; start tag. They probably disallow &lt;code&gt;system-property(&#039;xsl:vendor&#039;)&lt;/code&gt; for security &amp;#8212; knowledge is power, after all.&lt;/p&gt;
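
&lt;p&gt;For processors that do allow it, a probe stylesheet along these lines reports all three system properties that XSLT 1.0 requires. It&amp;#8217;s only a sketch, and by the account above it would be rejected by the GSA itself.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
   version=&quot;1.0&quot;&amp;gt;

&amp;lt;xsl:output method=&quot;text&quot; /&amp;gt;

&amp;lt;!-- Report which processor is running, whatever the input document is --&amp;gt;
&amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
  &amp;lt;xsl:text&amp;gt;version: &amp;lt;/xsl:text&amp;gt;
  &amp;lt;xsl:value-of select=&quot;system-property(&#039;xsl:version&#039;)&quot; /&amp;gt;
  &amp;lt;xsl:text&amp;gt;; vendor: &amp;lt;/xsl:text&amp;gt;
  &amp;lt;xsl:value-of select=&quot;system-property(&#039;xsl:vendor&#039;)&quot; /&amp;gt;
  &amp;lt;xsl:text&amp;gt;; vendor URL: &amp;lt;/xsl:text&amp;gt;
  &amp;lt;xsl:value-of select=&quot;system-property(&#039;xsl:vendor-url&#039;)&quot; /&amp;gt;
&amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;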

&lt;p&gt;Anyway, my suggestions to others who might want to create a customised XSLT processor:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use a custom URL resolver to restrict access to documents.&lt;/li&gt;
&lt;li&gt;Restrict external function calls using something like Saxon&amp;#8217;s &lt;code&gt;ALLOW_EXTERNAL_FUNCTIONS&lt;/code&gt; feature, which can be set through JAXP.&lt;/li&gt;
&lt;li&gt;Document the restrictions you&amp;#8217;re placing on the stylesheets.&lt;/li&gt;
&lt;li&gt;Produce meaningful error messages that explain the extra restrictions when they&amp;#8217;re broken.&lt;/li&gt;
&lt;/ol&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/64#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/18">atom</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/19">google</category>
 <pubDate>Fri, 23 Nov 2007 22:22:19 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">64 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Detecting streamability in XPath expressions and patterns</title>
 <link>http://www.jenitennison.com/blog/node/61</link>
 <description>&lt;p&gt;The XSL Working Group &lt;a href=&quot;http://lists.w3.org/Archives/Public/public-xml-processing-model-comments/2007Oct/0118.html&quot; title=&quot;XSL WG Comments on XProc Last Call&quot;&gt;gave some comments&lt;/a&gt; recently on the &lt;a href=&quot;http://www.w3.org/TR/2007/WD-xproc-20070920/&quot; title=&quot;W3C: XProc Last Call Working Draft&quot;&gt;Last Call Working Draft of XProc&lt;/a&gt;. One of the comments was about a bunch of standard steps that we&amp;#8217;ve specified which do things you can do in XSLT, such as renaming certain nodes. These steps generally use XPath expressions or XSLT patterns to identify which nodes should be processed.&lt;/p&gt;

&lt;p&gt;What bothers the XSL WG is that these steps aren&amp;#8217;t guaranteed to be streamable. In a streamable process, an input document can be delivered to the processor as a stream of events (and an output similarly generated as a stream of events) rather than as an in-memory representation. Such processes will start producing results more quickly and require less memory than non-streamable ones. And, because they don&amp;#8217;t need as much memory, they are able to work on larger documents.&lt;/p&gt;

&lt;p&gt;If the processes we defined in XProc &lt;em&gt;were&lt;/em&gt; streamable, they&amp;#8217;d have a clear advantage over their XSLT equivalents, and therefore a purpose. However, since they&amp;#8217;re &lt;em&gt;not&lt;/em&gt; guaranteed streamable, it looks like we&amp;#8217;re simply creating yet another transformation language.&lt;/p&gt;

&lt;!--break--&gt;

&lt;p&gt;My &lt;a href=&quot;http://lists.w3.org/Archives/Public/public-xml-processing-model-comments/2007Oct/0123.html&quot; title=&quot;Jeni&#039;s response to XSL WG comments on XProc&#039;s streamability&quot;&gt;response&lt;/a&gt; was basically that we left it down to implementers to detect when a particular expression/pattern was streamable because defining a streamable subset of XPath would (a) take too long, (b) require people to learn a particular XPath subset, raising the adoption barrier, and (c) require implementers to implement their own XPath engines, raising the implementation barrier.&lt;/p&gt;

&lt;p&gt;But if you put those pragmatic reasons to one side, I think there are good abstract reasons not to specify a streamable XPath subset. First, there is no clear line that can be drawn between a streamable XPath and an unstreamable one, only a scale between &amp;#8220;buffering nothing&amp;#8221; and &amp;#8220;buffering everything&amp;#8221; (building an object model). Second, you can&amp;#8217;t judge the streamability of an XPath expression on its own: there are multiple other factors that affect how streamable a given XPath expression is for a particular processor.&lt;/p&gt;

&lt;p&gt;To illustrate, say that we&amp;#8217;re renaming all elements that we select, and let&amp;#8217;s start with an expression that&amp;#8217;s obviously streamable:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;//section
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;No problems here: as soon as we hit a start-tag (or end-tag) for a &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; element, we can change its name.&lt;/p&gt;
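
&lt;p&gt;For reference, the XSLT equivalent of this renaming step can be sketched as an identity transform plus one overriding template. The target name &lt;code&gt;div&lt;/code&gt; is made up for illustration. The match pattern &lt;code&gt;section&lt;/code&gt; picks out the same elements as &lt;code&gt;//section&lt;/code&gt;, and a predicate can be added to the pattern in the same way as to the expression. An XSLT processor will typically build the whole tree in memory before running this, which is exactly the cost a streaming step would avoid.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;

  &amp;lt;!-- Identity template: copy anything that isn&#039;t matched more specifically --&amp;gt;
  &amp;lt;xsl:template match=&quot;@*|node()&quot;&amp;gt;
    &amp;lt;xsl:copy&amp;gt;
      &amp;lt;xsl:apply-templates select=&quot;@*|node()&quot; /&amp;gt;
    &amp;lt;/xsl:copy&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Rename every section element; &quot;div&quot; is a hypothetical new name --&amp;gt;
  &amp;lt;xsl:template match=&quot;section&quot;&amp;gt;
    &amp;lt;div&amp;gt;
      &amp;lt;xsl:apply-templates select=&quot;@*|node()&quot; /&amp;gt;
    &amp;lt;/div&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;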

&lt;p&gt;Now add a predicate:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;//section[@type = &#039;summary&#039;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This predicate tests the value of the &lt;code&gt;type&lt;/code&gt; attribute on the &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; element. If we&amp;#8217;re using SAX or StAX events, then this is as straightforwardly streamable as the previous example, because attribute values are reported at the same time as start-tags. But that&amp;#8217;s purely down to the API: the underlying algorithm for streaming RELAX NG validation uses a different event model, for example, in which attributes are reported after the start tag begins (and before the start tag ends). So &lt;strong&gt;streamability depends on the event model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Now a different predicate:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;//section[title = &#039;Summary&#039;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This predicate tests the value of the &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; child of the &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; element. In fact, it tests if the &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; element has &lt;em&gt;any&lt;/em&gt; &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; child with the value &lt;code&gt;&#039;Summary&#039;&lt;/code&gt;. Normally, an XPath processor won&amp;#8217;t be able to tell that a &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; &lt;em&gt;doesn&amp;#8217;t&lt;/em&gt; satisfy the predicate until it gets to the end-tag of the element. So it will have to buffer the events from each &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; start tag, potentially all the way to its end tag, before it can work out whether to do the renaming or not.&lt;/p&gt;

&lt;p&gt;But say the &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt; elements in this markup language can only contain a single &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt;, and that&amp;#8217;s the first child of the &lt;code&gt;&amp;lt;section&amp;gt;&lt;/code&gt;. For an XPath processor that&amp;#8217;s aware of the DTD or schema that the document adheres to, the situation is then very similar to the previous one, which tested the attribute. So &lt;strong&gt;streamability depends on how much the processor knows about the markup language&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Changing the XPath to&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;//section[title[1] = &#039;Summary&#039;]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;similarly limits how much the processor will have to buffer if the &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; always appears, and always appears first, even without the processor being told that rule through a schema. So &lt;strong&gt;streamability depends on the markup language itself&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Anyway, I had a quick look at some academic work on streamability, such as &lt;a href=&quot;http://www.cs.umd.edu/projects/xsq/&quot; title=&quot;XSQ: A Streaming XPath Engine&quot;&gt;XSQ&lt;/a&gt;, &lt;a href=&quot;http://www-cs-students.stanford.edu/~amrutaj/work/papers/xpath.pdf&quot; title=&quot;Project Report on Streaming XPath Engine&quot;&gt;TurboXPath&lt;/a&gt; or the recent paper &lt;a href=&quot;http://doi.acm.org/10.1145/1247480.1247512&quot; title=&quot;Efficient Algorithms for Evaluating XPath over Streams&quot;&gt;&amp;#8220;Efficient Algorithms for Evaluating XPath over Streams&amp;#8221;&lt;/a&gt;. These papers really surprised me. The things that prove difficult include backwards axes (which is surprising since information about the previous nodes should be easily available), the descendant axis, and the position function. On the other hand, predicates are absolutely fine (despite requiring a &amp;#8220;look ahead&amp;#8221;). [Weirdly enough, all the papers I looked at contained XPath errors; I guess when you&amp;#8217;re considering abstract algorithms you don&amp;#8217;t have to care about insignificant things like language syntax.]&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://doi.acm.org/10.1145/1247480.1247512&quot; title=&quot;Efficient Algorithms for Evaluating XPath over Streams&quot;&gt;paper&lt;/a&gt; I mentioned above actually defines something called Univariate XPath which conforms to the syntax:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Path      := Step | Path Step
Step      := Axis NodeTest
             | Axis NodeTest &#039;[&#039; Predicate &#039;]&#039;
Axis      := &#039;/&#039; | &#039;//&#039;
NodeTest  := Name | &#039;*&#039;
Predicate := Path
             | Path CompOp Path
             | Predicate &#039;and&#039; Predicate
             | Predicate &#039;or&#039; Predicate
             | &#039;not&#039; Predicate                            [sic]
CompOp    := &#039;=&#039; | &#039;!=&#039; | &#039;&amp;gt;&#039; | &#039;&amp;gt;=&#039; | &#039;&amp;lt;&#039; | &#039;&amp;lt;=&#039;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This might be a useful starting point, but it omits useful things like attributes and functions which (as far as I can tell) wouldn&amp;#8217;t affect the applicability of the algorithms. It&amp;#8217;s also worth noting that it will allow paths such as &lt;code&gt;/database[dummy]/record&lt;/code&gt;, which would involve buffering every &lt;code&gt;&amp;lt;record&amp;gt;&lt;/code&gt; until the end tag of the &lt;code&gt;&amp;lt;database&amp;gt;&lt;/code&gt; document element was reached. This illustrates that just because an XPath is theoretically streamable (can be evaluated based on a stream of events) doesn&amp;#8217;t mean it can be evaluated efficiently.&lt;/p&gt;
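
&lt;p&gt;To make the buffering concrete, take a hypothetical document shaped like the one below. A streaming evaluator of &lt;code&gt;/database[dummy]/record&lt;/code&gt; can&amp;#8217;t release any &lt;code&gt;&amp;lt;record&amp;gt;&lt;/code&gt; until it has either seen a &lt;code&gt;&amp;lt;dummy&amp;gt;&lt;/code&gt; child or reached the &lt;code&gt;&amp;lt;database&amp;gt;&lt;/code&gt; end tag, so here every record has to be held in memory.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;database&amp;gt;
  &amp;lt;record&amp;gt;first&amp;lt;/record&amp;gt;
  &amp;lt;record&amp;gt;second&amp;lt;/record&amp;gt;
  &amp;lt;!-- potentially millions more records --&amp;gt;
  &amp;lt;dummy /&amp;gt;
&amp;lt;/database&amp;gt;
&lt;/code&gt;&lt;/pre&gt;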

&lt;p&gt;Some final thoughts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I wonder if there&amp;#8217;s scope for an XPath subset that can be mapped to RELAX NG syntax and therefore evaluated using Brzozowski derivatives&lt;/li&gt;
&lt;li&gt;what about an algorithm that evaluates XPaths using a pipeline process, whereby the stream of events is actually passed through several filters in order to provide the final evaluation&lt;/li&gt;
&lt;li&gt;I&amp;#8217;m sure there&amp;#8217;s preprocessing that could be done on some XPath expressions that would increase their streamability&lt;/li&gt;
&lt;/ul&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/61#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/6">pipelines</category>
 <pubDate>Tue, 06 Nov 2007 19:45:45 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">61 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Teaching XSLT 1.0</title>
 <link>http://www.jenitennison.com/blog/node/57</link>
 <description>&lt;p&gt;I&amp;#8217;m exhausted after two days of teaching XSLT 1.0. Yes, there are still people out there who want to learn it. The exhaustion comes mainly because I&amp;#8217;m an introvert (INFP, Myers-Briggs fans!) who finds it tiring just being in the same room as someone else.&lt;/p&gt;

&lt;p&gt;In fact, I&amp;#8217;ve been teaching XSLT 1.0 rather a lot in the past couple of months. The first lot was a bunch of C# programmers who had done some light XSLT work, the second a bunch of developers who&amp;#8217;d been using XSLT for years, but wanted to improve. The majority of people on this most recent course weren&amp;#8217;t even developers.&lt;/p&gt;

&lt;p&gt;It&amp;#8217;s interesting seeing who struggles and who sails through the course. Some observations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;You need to understand XML to use XSLT. At the very least, you need to know what elements and attributes look like in an XML document, so you just know that attributes need matching quotation marks and elements need end tags. Half the syntax problems learners have with XSLT are because they&amp;#8217;re not using proper XML syntax. Without that fundamental terminology, you haven&amp;#8217;t a hope, because you can&amp;#8217;t even talk about XSLT, let alone getting information out of an XML document. I guess this is an advantage of XQuery: at least as a teacher you &lt;em&gt;know&lt;/em&gt; you have to teach the basic syntax of the language, rather than taking it as a given. Actually, scratch that: you still have to teach XML syntax with XQuery, since that&amp;#8217;s what people are processing and generating, &lt;em&gt;and&lt;/em&gt; you have to teach the bastardised, almost-XML syntax that XQuery uses. I haven&amp;#8217;t done an XQuery course yet, but surely that must be confusing?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I always always always have to spend a long time explaining namespaces. That&amp;#8217;s not going to surprise anybody. In the end I just provide these rules of thumb:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;Declare the XSLT namespace (usually &lt;code&gt;xsl&lt;/code&gt; prefix)&lt;/li&gt;
&lt;li&gt;Declare the namespaces you want to appear in the result, with the prefixes that you want to appear in the result. This can include a default namespace declaration.&lt;/li&gt;
&lt;li&gt;Declare the namespaces that appear in the source, with prefixes for every one (even if they&amp;#8217;re usually the default namespace in the source). Add an &lt;code&gt;exclude-result-prefixes&lt;/code&gt; attribute that lists these prefixes (at least the ones that aren&amp;#8217;t also used in the result).&lt;/li&gt;
&lt;li&gt;Declare the namespaces that you use in the stylesheet. List the ones that are used for extensions (elements or functions) in &lt;code&gt;extension-element-prefixes&lt;/code&gt; (technically, you don&amp;#8217;t have to declare the ones you use for extension functions; I just think it&amp;#8217;s clearer if you list all the prefixes used for extensions here); make sure the other ones are listed in &lt;code&gt;exclude-result-prefixes&lt;/code&gt;.&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Having a programming background is a blessing and a curse when learning XSLT. It&amp;#8217;s a blessing because you understand basic principles like &amp;#8220;instructions&amp;#8221; and &amp;#8220;expressions&amp;#8221; and &amp;#8220;operators&amp;#8221; and &amp;#8220;functions&amp;#8221; and &amp;#8220;code blocks&amp;#8221;. But it&amp;#8217;s a curse because most programmers use conventional programming languages like C# and Java, which are procedural, and XSLT&amp;#8217;s way of doing things is completely different. The most recent course I taught was to people who dealt with completely data-oriented XML; trying to explain to them why applying templates is a useful thing to do was really hard.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Creating exercises that are based on the markup languages used by the people attending the course is always worthwhile. It takes time to prepare, but they get a better grasp on what&amp;#8217;s going on in the transformations that they create (because they understand the domain), and they learn how to do things on the course that are going to be directly useful for them in their ongoing practice.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The students find the examples the most helpful things in the slides. Whenever I see them look at the slides as they&amp;#8217;re doing the exercises, they&amp;#8217;re looking at the examples, not the abstract descriptions of how something works. Perhaps I should try having slides with (almost) nothing but examples. (It&amp;#8217;s interesting, because &lt;em&gt;I&amp;#8217;m&lt;/em&gt; always interested in the theoretical underpinnings of the things I learn about &amp;#8212; if I don&amp;#8217;t understand why, I can hardly understand how &amp;#8212; but it seems I&amp;#8217;m in the minority; it&amp;#8217;s the N in INFP, I guess.)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Having a bit of enforced interaction in the middle of the presentation works well, such as a list of examples and going around the room asking the students to explain what each means. It wakes up the students who are falling asleep. It forces them to think about what they&amp;#8217;ve been listening to. It highlights parts that they haven&amp;#8217;t understood (so I know they need a fresh explanation). And it seems to encourage them to ask questions. Just asking a question to the room doesn&amp;#8217;t work so well &amp;#8212; it tends to be the same people that answer each time, or there&amp;#8217;s an embarrassing silence.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I&amp;#8217;ve been trying to use a &lt;a href=&quot;http://www.garlikov.com/Soc_Meth.html&quot; title=&quot;The Socratic Method in teaching&quot;&gt;Socratic Method&lt;/a&gt; when I&amp;#8217;m asked for help during the exercises, at least when the problem isn&amp;#8217;t &amp;#8220;obvious&amp;#8221; (like a mis-spelled element name). The difficulty for me is finding the right question to ask, but eventually I find one that they&amp;#8217;re able to answer easily, and things flow well from there. More often than not, the student&amp;#8217;s able to find the answer to their problem without me providing the solution, which has to be good for learning.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;In the end, the main things that differentiate between those that learn a lot and those that learn a little are a willingness to try (those who don&amp;#8217;t attempt the exercises don&amp;#8217;t absorb what I&amp;#8217;ve taught from the front of the class) and a willingness to ask questions (those who don&amp;#8217;t ask end up getting stuck on one thing, and can&amp;#8217;t move past it). I also think cut-and-paste coders learn less &amp;#8212; cut-and-paste coding practises recognition (finding the right example to cut-and-paste) rather than generation (creating something new from scratch) &amp;#8212; but maybe that&amp;#8217;s just snobbishness on my part.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
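
&lt;p&gt;The namespace rules of thumb above can be sketched as a single stylesheet. The &lt;code&gt;src&lt;/code&gt; namespace URI and the &lt;code&gt;report&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;para&lt;/code&gt; element names are made up for illustration, and the EXSLT prefix is declared only to show where an extension namespace goes.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- Rule 1: the XSLT namespace.
     Rule 2: the result namespace (XHTML), as the default.
     Rule 3: the source namespace, given a prefix and excluded from the result.
     Rule 4: an extension namespace, listed in extension-element-prefixes. --&amp;gt;
&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns=&quot;http://www.w3.org/1999/xhtml&quot;
    xmlns:src=&quot;http://example.com/ns/source&quot;
    xmlns:exsl=&quot;http://exslt.org/common&quot;
    extension-element-prefixes=&quot;exsl&quot;
    exclude-result-prefixes=&quot;src&quot;&amp;gt;

  &amp;lt;!-- Source elements are always matched with the src prefix --&amp;gt;
  &amp;lt;xsl:template match=&quot;/src:report&quot;&amp;gt;
    &amp;lt;html&amp;gt;
      &amp;lt;head&amp;gt;
        &amp;lt;title&amp;gt;&amp;lt;xsl:value-of select=&quot;src:title&quot; /&amp;gt;&amp;lt;/title&amp;gt;
      &amp;lt;/head&amp;gt;
      &amp;lt;body&amp;gt;
        &amp;lt;xsl:apply-templates select=&quot;src:para&quot; /&amp;gt;
      &amp;lt;/body&amp;gt;
    &amp;lt;/html&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

  &amp;lt;!-- Result elements are unprefixed, so they land in the XHTML namespace --&amp;gt;
  &amp;lt;xsl:template match=&quot;src:para&quot;&amp;gt;
    &amp;lt;p&amp;gt;&amp;lt;xsl:value-of select=&quot;.&quot; /&amp;gt;&amp;lt;/p&amp;gt;
  &amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;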

&lt;p&gt;Anyway, I&amp;#8217;m hoping I can get rid of my XSLT 1.0 course soon, and move on to teaching XSLT 2.0. It&amp;#8217;ll be a longer course, but there&amp;#8217;ll be less &amp;#8220;no, there&amp;#8217;s no built-in support for that in XSLT 1.0&amp;#8221;.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/57#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <pubDate>Fri, 05 Oct 2007 22:28:37 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">57 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>View-Model-Template</title>
 <link>http://www.jenitennison.com/blog/node/45</link>
 <description>&lt;p&gt;I don&amp;#8217;t know anything about Struts 1, but &lt;a href=&quot;http://www.dehora.net/journal/2007/07/struts_1_problems.html&quot; title=&quot;Bill de hÓra: Struts 1 Problems&quot;&gt;Bill de hÓra&amp;#8217;s recent post&lt;/a&gt; has got some interesting web-application-design tips. There were two particular bits that spoke to me:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;struts-config.xml&lt;/strong&gt; struts-config tries to capture primarily the flow of application state on the server, by being an awkward representation of a call graph. In doing it misses a key aspect of the web - hypertext. In web architecture, HTML hypertext on the client is the engine of application state, not an XML file on the server.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In other words (I think) in web applications your state is the page you&amp;#8217;re on and taking action is about following the links (or submitting the forms) on the page. Your actions (and therefore the transitions between different states) are determined by what links and forms are on the page. But in fact, URLs should be hackable, and transitions unlimited. When you design the application what you really need to think about are the tasks the users want to achieve (and therefore the transitions that they might &lt;em&gt;want&lt;/em&gt; to make) rather than the &lt;em&gt;possible&lt;/em&gt; state transitions.&lt;/p&gt;

&lt;!--break--&gt;

&lt;blockquote&gt;
  &lt;p&gt;On the web, a suitable pattern is View, Model, Template [rather than Model, View, Controller (MVC)]. A request to a URL is dispatched to a View. This View calls into the Model, performs manipulations and prepares data for output. The data is passed to a Template that is rendered an [sic] emitted as a response. ideally [sic] in web frameworks, the controller is hidden from view. Note that this framework style is often called MVC anyway, confusing matters somewhat; The key differences are that Views and Templates are cohesive and Controllers are pushed down into the framework infrastructure.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I&amp;#8217;ve been thinking recently about whether and how XSLT might fit into a &lt;a href=&quot;http://www.rubyonrails.org/&quot; title=&quot;Ruby on Rails&quot;&gt;Ruby on Rails&lt;/a&gt; set-up. In &lt;abbr title=&quot;Ruby on Rails&quot;&gt;RoR&lt;/abbr&gt;, the controller usually either queries the database (via the model) to set up instance variables, and then renders a (template) view, or updates the database (via the model) and redirects to another view. The templates (for (X)HTML) use fairly standard &lt;code&gt;&amp;lt;% ... %&amp;gt;&lt;/code&gt; placeholders to hold code and insert values.&lt;/p&gt;

&lt;p&gt;I&amp;#8217;ve spent most of my professional life cursing (X)HTML documents with &lt;code&gt;&amp;lt;% ... %&amp;gt;&lt;/code&gt; in, because they use unescaped less-than-signs and therefore can&amp;#8217;t be generated or processed by XML tools, particularly XSLT. There&amp;#8217;s an advantage to having templates that are themselves well-formed, not least that you can easily process the templates themselves (for example to generate, update or document them). Plus if your templates are declarative, rather than containing embedded code, you aren&amp;#8217;t tied to a particular framework: I could move templates from Ruby on Rails to Django and they wouldn&amp;#8217;t need modification. When I think &amp;#8220;declarative templates&amp;#8221;, I think &amp;#8220;XSLT&amp;#8221;.&lt;/p&gt;

&lt;p&gt;The other advantage of using XSLT is that it can be used on the client side as well as the server side. So there&amp;#8217;s the possibility of moving that rendering from the server to the client completely, or of using it on particular clients, perhaps in an AJAX set-up, while having the same stylesheets on the server for those browsers that don&amp;#8217;t support client-side XSLT.&lt;/p&gt;

&lt;p&gt;You still need a way of getting the data from the model into the stylesheet, which can be done through a combination of XML and parameters. The XML is itself a view of the model, of course, but if you&amp;#8217;ve got any kind of intention to make your web application mashable, you&amp;#8217;re going to want to generate XML, probably Atom, anyway (yeah, or JSON, but it&amp;#8217;s easy enough to get from XML to JSON using XSLT too). If you add caching to the equation, this approach might help reduce database requests.&lt;/p&gt;
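
&lt;p&gt;As a minimal sketch of that combination, assuming a hypothetical no-namespace &lt;code&gt;&amp;lt;article&amp;gt;&lt;/code&gt; view of the model and a hypothetical &lt;code&gt;site-title&lt;/code&gt; parameter supplied by the controller (neither name comes from Rails), the template might be:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;

&amp;lt;!-- hypothetical parameter; the controller would pass the real value --&amp;gt;
&amp;lt;xsl:param name=&quot;site-title&quot; select=&quot;'Example site'&quot; /&amp;gt;

&amp;lt;!-- hypothetical XML view of the model: article with a title child --&amp;gt;
&amp;lt;xsl:template match=&quot;/article&quot;&amp;gt;
  &amp;lt;html&amp;gt;
    &amp;lt;head&amp;gt;
      &amp;lt;title&amp;gt;&amp;lt;xsl:value-of select=&quot;$site-title&quot; /&amp;gt;&amp;lt;/title&amp;gt;
    &amp;lt;/head&amp;gt;
    &amp;lt;body&amp;gt;
      &amp;lt;h1&amp;gt;&amp;lt;xsl:value-of select=&quot;title&quot; /&amp;gt;&amp;lt;/h1&amp;gt;
    &amp;lt;/body&amp;gt;
  &amp;lt;/html&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;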

&lt;p&gt;So I think that using XSLT as a templating language, even within a RoR framework, has at least something going for it. What I hope is that I&amp;#8217;m not falling into the &amp;#8220;when you&amp;#8217;ve got a hammer everything looks like a nail&amp;#8221; trap.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/45#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/22">rest</category>
 <pubDate>Fri, 27 Jul 2007 11:02:04 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">45 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>The perils of default namespaces</title>
 <link>http://www.jenitennison.com/blog/node/36</link>
 <description>&lt;p&gt;A lot of people run into problems with namespaces, and most of those arise from using default namespaces (ie not giving namespaces prefixes). The transformation technology you use can have a big effect on how confusing and irritating it gets.&lt;/p&gt;

&lt;p&gt;Default namespaces make XML documents easier to read because they allow you to just give the local name of an element rather than using prefixes all over the place. For example, using:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;house status=&quot;For Sale&quot; xmlns=&quot;http://www.example.com/ns/house&quot;&amp;gt;
  &amp;lt;askingPrice&amp;gt;...&amp;lt;/askingPrice&amp;gt;
  &amp;lt;address&amp;gt;...&amp;lt;/address&amp;gt;
  &amp;lt;layout&amp;gt;...&amp;lt;/layout&amp;gt;
&amp;lt;/house&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;!--break--&gt;

&lt;p&gt;rather than:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;h:house status=&quot;For Sale&quot; xmlns:h=&quot;http://www.example.com/ns/house&quot;&amp;gt;
  &amp;lt;h:askingPrice&amp;gt;...&amp;lt;/h:askingPrice&amp;gt;
  &amp;lt;h:address&amp;gt;...&amp;lt;/h:address&amp;gt;
  &amp;lt;h:layout&amp;gt;...&amp;lt;/h:layout&amp;gt;
&amp;lt;/h:house&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In some cases, specifically documents that are validated against a DTD or interpreted by non-namespace-aware applications, you might be forced to use the default namespace. The biggest example of this is (X)HTML.&lt;/p&gt;

&lt;p&gt;In transformation technologies, such as &lt;a href=&quot;http://www.w3.org/Style/XSL/&quot;&gt;XSLT&lt;/a&gt;, &lt;a href=&quot;http://www.w3.org/XML/Query/&quot;&gt;XQuery&lt;/a&gt; and &lt;a href=&quot;http://www.xlinq.net/&quot;&gt;XLinq in VB.NET&lt;/a&gt;, you have to deal with at least two documents: the source documents that you are processing and the result documents that you are creating. Often, the source and result documents will use default namespaces, or at any rate you&amp;#8217;ll want to query and create the documents without using prefixes. Sometimes, the source and result documents all use the same namespace, but it&amp;#8217;s far more common that they don&amp;#8217;t.&lt;/p&gt;

&lt;p&gt;So transformation technologies have to support at least &lt;em&gt;two&lt;/em&gt; default namespaces: one for querying and one for construction.&lt;/p&gt;

&lt;p&gt;In XPath 1.0, you must specify a prefix for each namespace you want to use. A path like &lt;code&gt;/house/layout&lt;/code&gt; will only select &lt;code&gt;&amp;lt;layout&amp;gt;&lt;/code&gt; elements in no namespace. In XSLT 1.0, the default namespace in the stylesheet (as declared by the &lt;code&gt;xmlns&lt;/code&gt; attribute on &lt;code&gt;&amp;lt;xsl:stylesheet&amp;gt;&lt;/code&gt;) is then free to be used for the result documents. For example, I might do&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:h=&quot;http://www.example.com/ns/house&quot;
  exclude-result-prefixes=&quot;h&quot;
  xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;h:house&quot;&amp;gt;
  &amp;lt;div class=&quot;house&quot;&amp;gt;
    &amp;lt;h1&amp;gt;&amp;lt;xsl:apply-templates select=&quot;h:askingPrice&quot; /&amp;gt;&amp;lt;/h1&amp;gt;
    ...
  &amp;lt;/div&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;[The best way to deal with multiple result documents in different default namespaces is to simply have different stylesheet documents to handle their generation, all included or imported into your main stylesheet application.]&lt;/p&gt;
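
&lt;p&gt;[A rough sketch of that arrangement, shown as three separate files in one listing. The file names, the mode names and the choice of Atom as the second result vocabulary are all assumptions for illustration:]&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- main.xsl (hypothetical): pulls in one module per result vocabulary --&amp;gt;
&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;
  &amp;lt;xsl:include href=&quot;xhtml.xsl&quot; /&amp;gt;
  &amp;lt;xsl:include href=&quot;atom.xsl&quot; /&amp;gt;

  &amp;lt;!-- choose which module handles the source by choosing a mode --&amp;gt;
  &amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
    &amp;lt;xsl:apply-templates mode=&quot;xhtml&quot; /&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;

&amp;lt;!-- xhtml.xsl (hypothetical): unprefixed literal elements are XHTML --&amp;gt;
&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:h=&quot;http://www.example.com/ns/house&quot;
  exclude-result-prefixes=&quot;h&quot;
  xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
  &amp;lt;xsl:template match=&quot;h:house&quot; mode=&quot;xhtml&quot;&amp;gt;
    &amp;lt;div class=&quot;house&quot;&amp;gt;&amp;lt;xsl:value-of select=&quot;h:askingPrice&quot; /&amp;gt;&amp;lt;/div&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;

&amp;lt;!-- atom.xsl (hypothetical): unprefixed literal elements are Atom --&amp;gt;
&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns:h=&quot;http://www.example.com/ns/house&quot;
  exclude-result-prefixes=&quot;h&quot;
  xmlns=&quot;http://www.w3.org/2005/Atom&quot;&amp;gt;
  &amp;lt;xsl:template match=&quot;h:house&quot; mode=&quot;atom&quot;&amp;gt;
    &amp;lt;entry&amp;gt;&amp;lt;title&amp;gt;&amp;lt;xsl:value-of select=&quot;h:address&quot; /&amp;gt;&amp;lt;/title&amp;gt;&amp;lt;/entry&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;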

&lt;p&gt;Users of XSLT 1.0 found it confusing that they couldn&amp;#8217;t just copy the namespace declarations (including a default namespace declaration) from a sample source document and have the paths just work. So in XPath 2.0, rather than no prefix meaning no namespace, the &lt;strong&gt;default element/type namespace&lt;/strong&gt; in the context is used for element names with no prefix. If the default element/type namespace is set to &lt;code&gt;http://www.example.com/ns/house&lt;/code&gt; then the path &lt;code&gt;/house/layout&lt;/code&gt; will select all &lt;code&gt;&amp;lt;layout&amp;gt;&lt;/code&gt; elements in the &lt;code&gt;http://www.example.com/ns/house&lt;/code&gt; namespace. You can set this default element/type namespace in XSLT 2.0 using the &lt;code&gt;[xsl:]xpath-default-namespace&lt;/code&gt; attribute, which can go anywhere but will usually be situated on the &lt;code&gt;&amp;lt;xsl:stylesheet&amp;gt;&lt;/code&gt; element (in which case it appears without the &lt;code&gt;xsl:&lt;/code&gt; prefix). The default element/type namespace can be scoped to a particular area of your stylesheet in the same way as namespace declarations.&lt;/p&gt;
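
&lt;p&gt;To illustrate the scoping, here is a small sketch in which one template overrides the stylesheet-wide setting with an empty string, so that unprefixed names inside it (including its own match pattern) mean no namespace. The no-namespace &lt;code&gt;&amp;lt;note&amp;gt;&lt;/code&gt; element is a hypothetical one, made up for the example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns=&quot;http://www.w3.org/1999/xhtml&quot;
  xpath-default-namespace=&quot;http://www.example.com/ns/house&quot;&amp;gt;

&amp;lt;!-- stylesheet-wide setting applies: matches house in the house namespace --&amp;gt;
&amp;lt;xsl:template match=&quot;house&quot;&amp;gt;
  &amp;lt;div class=&quot;house&quot;&amp;gt;
    &amp;lt;xsl:apply-templates /&amp;gt;
  &amp;lt;/div&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;!-- overridden for this template only: matches note in no namespace --&amp;gt;
&amp;lt;xsl:template match=&quot;note&quot; xpath-default-namespace=&quot;&quot;&amp;gt;
  &amp;lt;p class=&quot;note&quot;&amp;gt;&amp;lt;xsl:value-of select=&quot;.&quot; /&amp;gt;&amp;lt;/p&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;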

&lt;p&gt;Otherwise, XSLT 2.0 works like XSLT 1.0 in that the default namespace in the stylesheet supplies the default namespace for created elements, so you can do:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
  xmlns=&quot;http://www.w3.org/1999/xhtml&quot;
  xpath-default-namespace=&quot;http://www.example.com/ns/house&quot;&amp;gt;

&amp;lt;xsl:template match=&quot;house&quot;&amp;gt;
  &amp;lt;div class=&quot;house&quot;&amp;gt;
    &amp;lt;h1&amp;gt;&amp;lt;xsl:apply-templates select=&quot;askingPrice&quot; /&amp;gt;&amp;lt;/h1&amp;gt;
    ...
  &amp;lt;/div&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;By keeping the default query namespace and the default construction namespace separate, you&amp;#8217;re able to use unprefixed names in both paths and element constructors, even if the default namespaces in the two cases are different.&lt;/p&gt;

&lt;p&gt;XQuery and VB.NET, on the other hand, provide a single default namespace that is used for both queries and construction, and they work in slightly different ways.&lt;/p&gt;

&lt;p&gt;In XQuery you can declare the default namespace for the query, with&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;declare default element namespace &quot;http://www.example.com/ns/house&quot;;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which means that you can query the source document with paths like &lt;code&gt;/house/askingPrice&lt;/code&gt; and create elements in the &lt;code&gt;http://www.example.com/ns/house&lt;/code&gt; namespace with direct element constructors without prefixes, like&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;house status=&quot;For Sale&quot;&amp;gt;
  &amp;lt;askingPrice&amp;gt;...&amp;lt;/askingPrice&amp;gt;
  &amp;lt;address&amp;gt;...&amp;lt;/address&amp;gt;
  &amp;lt;layout&amp;gt;...&amp;lt;/layout&amp;gt;
&amp;lt;/house&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you then want to generate XHTML (or some other result for which you&amp;#8217;d prefer to use the default namespace), you can use a default namespace declaration on the XHTML you generate:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;div class=&quot;house&quot; xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
  &amp;lt;h1&amp;gt;...&amp;lt;/h1&amp;gt;
  ...
&amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But the default namespace declaration in the element constructor carries through into the embedded expressions, so&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;  &amp;lt;div class=&quot;house&quot; xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
    &amp;lt;h1&amp;gt;{ /house/askingPrice }&amp;lt;/h1&amp;gt;
    ...
  &amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;won&amp;#8217;t work. As a result, you end up having to use the query&amp;#8217;s default namespace declaration to set the default namespace to XHTML (or whatever the default namespace is in the result), and use prefixes in your queries (essentially the same situation as in XSLT 1.0).&lt;/p&gt;

&lt;p&gt;In XLinq in VB.NET, there&amp;#8217;s the same kind of pattern. The &lt;code&gt;Imports&lt;/code&gt; statement allows you to declare a default namespace that&amp;#8217;s used in both queries and construction, as in:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Imports &amp;lt;xmlns=&quot;http://www.example.com/ns/house&quot;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and you can use a default namespace declaration on the XHTML you generate to provide the default namespace for the elements in the XML literal:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;houseDiv =
  &amp;lt;div class=&quot;house&quot; xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
    &amp;lt;h1&amp;gt;...&amp;lt;/h1&amp;gt;
    ...
  &amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;But unlike in XQuery, the default XHTML namespace declaration in the &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; &lt;em&gt;doesn&amp;#8217;t&lt;/em&gt; have an effect on the default namespace used in embedded expressions, which means you can still use unprefixed element names in any paths used within the XML literal, like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;houseDiv =
  &amp;lt;div class=&quot;house&quot; xmlns=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
    &amp;lt;h1&amp;gt;&amp;lt;%= doc.&amp;lt;house&amp;gt;.&amp;lt;askingPrice&amp;gt; %&amp;gt;&amp;lt;/h1&amp;gt;
    ...
  &amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, if you build up your XHTML gradually, perhaps by using separate variables or methods, then every time you create a snippet of XHTML you have to specify this default namespace. Again, people will end up using the &lt;code&gt;Imports&lt;/code&gt; statement to set the default namespace to the default result namespace and using prefixes in their paths.&lt;/p&gt;

&lt;p&gt;The other factor to consider is that sometimes no prefix really does mean no namespace. If you&amp;#8217;re querying an XML document that contains elements in no namespace, you have to set the default query namespace to no namespace. In XSLT 1.0, that&amp;#8217;s always the case anyway; in XSLT 2.0, the &lt;code&gt;xpath-default-namespace&lt;/code&gt; shouldn&amp;#8217;t be set (or should be unset for those places that need to query no-namespace elements). In XQuery you can&amp;#8217;t use the query default namespace declaration, and in XLinq in VB.NET you can&amp;#8217;t use the &lt;code&gt;Imports&lt;/code&gt; statement. In both these cases, you&amp;#8217;d better hope your result is in no namespace too. If not, the best route (to make it work at all in XQuery, and to avoid repetitive &lt;code&gt;xmlns&lt;/code&gt; attributes in VB.NET) is to create a no-namespace version of your result first, and have a standard function or method that will add the right default namespace to that result.&lt;/p&gt;
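
&lt;p&gt;The shape of that namespace-adding step is simple enough to sketch. It is written here in XSLT 1.0 purely for illustration (in XQuery or VB.NET it would be a recursive function of the same shape), and the XHTML target namespace is an assumption for the example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
  xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;

&amp;lt;!-- recreate every element under the same local name in the XHTML namespace;
     attributes are copied as they are, and text is copied by the built-in rules --&amp;gt;
&amp;lt;xsl:template match=&quot;*&quot;&amp;gt;
  &amp;lt;xsl:element name=&quot;{local-name()}&quot;
               namespace=&quot;http://www.w3.org/1999/xhtml&quot;&amp;gt;
    &amp;lt;xsl:copy-of select=&quot;@*&quot; /&amp;gt;
    &amp;lt;xsl:apply-templates /&amp;gt;
  &amp;lt;/xsl:element&amp;gt;
&amp;lt;/xsl:template&amp;gt;

&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;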
</description>
 <comments>http://www.jenitennison.com/blog/node/36#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/14">xml</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/15">xlinq</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/29">xquery</category>
 <pubDate>Sun, 01 Jul 2007 20:32:44 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">36 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>
