<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xml:base="http://www.jenitennison.com/blog" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
 <title>schema</title>
 <link>http://www.jenitennison.com/blog/taxonomy/term/8</link>
 <description>The taxonomy view with a depth of 0.</description>
 <language>en</language>
<item>
 <title>RELAX NG for matching</title>
 <link>http://www.jenitennison.com/blog/node/79</link>
 <description>&lt;p&gt;I&amp;#8217;m still thinking about doing &lt;a href=&quot;http://www.jenitennison.com/blog/node/76&quot; title=&quot;Jeni&#039;s Musings: Automatic markup and XML pipelines&quot;&gt;automatic markup with XML pipelines&lt;/a&gt;, and the kind of components that you might need in such a pipeline. These are the useful ones (list inspired by the components offered by &lt;a href=&quot;http://www.gate.ac.uk/&quot; title=&quot;General Architecture for Text Engineering&quot;&gt;GATE&lt;/a&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;tokeniser&lt;/strong&gt; that uses regular expressions to add markup to plain text&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;gazetteer&lt;/strong&gt; that uses a lookup to add markup to plain text&lt;/li&gt;
&lt;li&gt;an &lt;strong&gt;annotator&lt;/strong&gt; that adds attributes to existing elements based on their context/content&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;grouper&lt;/strong&gt; that adds markup around sequences of existing markup&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;stripper&lt;/strong&gt; that removes markup&lt;/li&gt;
&lt;li&gt;a general purpose &lt;strong&gt;transformer&lt;/strong&gt; that uses XSLT to do just about everything else&lt;/li&gt;
&lt;/ul&gt;

&lt;!--break--&gt;

&lt;p&gt;The &amp;#8220;grouper&amp;#8221; is the most interesting and difficult of these. It needs to act like a tokeniser, except use regular expressions over markup rather than over text. For example, say I had:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;number&amp;gt;06&amp;lt;/number&amp;gt;&amp;lt;punc&amp;gt;/&amp;lt;/punc&amp;gt;&amp;lt;number&amp;gt;03&amp;lt;/number&amp;gt;&amp;lt;punc&amp;gt;/&amp;lt;/punc&amp;gt;&amp;lt;number&amp;gt;08&amp;lt;/number&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;I want to be able to create a rule that says &amp;#8220;any sequence that looks like a number element that contains a two-digit number between 1 and 31, followed by a punc element that contains a slash, followed by another two-digit number between 1 and 12, followed by a punc element that contains a slash, followed by another two-digit number should be wrapped in a date element&amp;#8221;.&lt;/p&gt;

&lt;p&gt;Now this is something that XPath is really bad at. Try writing an expression that selects, from a sequence of elements that may contain other &lt;code&gt;&amp;lt;number&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;punc&amp;gt;&lt;/code&gt; elements as well as other elements, only those sequences of elements that match the pattern I just described. It&amp;#8217;s something like:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;number[. &amp;gt;= 1 and . &amp;lt;= 31 and string-length(.) = 2]
      [following-sibling::*[1]/self::punc = &#039;/&#039;]
      [following-sibling::*[2]/self::number[. &amp;gt;= 1 and . &amp;lt;= 12 and string-length(.) = 2]]
      [following-sibling::*[3]/self::punc = &#039;/&#039;]
      [following-sibling::*[4]/self::number[string-length(.) = 2]]
  /(self::number, following-sibling::*[position() &amp;lt;= 4])
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which is fiddly and messy and only works in this particular example because I know precisely how many elements there are supposed to be in the group.&lt;/p&gt;

&lt;p&gt;In fact, this kind of grouping is difficult even in XSLT, and even with &lt;code&gt;&amp;lt;xsl:for-each-group&amp;gt;&lt;/code&gt;, because its grouping is designed around items that either share the same value or start or end with a particular kind of element, rather than around sequences that have a particular internal structure.&lt;/p&gt;
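
&lt;p&gt;To illustrate the limitation, here is a minimal XSLT 2.0 sketch of what &lt;code&gt;&amp;lt;xsl:for-each-group&amp;gt;&lt;/code&gt; does handle easily: wrapping every run of adjacent &lt;code&gt;&amp;lt;number&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;punc&amp;gt;&lt;/code&gt; elements. The &lt;code&gt;&amp;lt;tokens&amp;gt;&lt;/code&gt; parent and the &lt;code&gt;&amp;lt;run&amp;gt;&lt;/code&gt; wrapper are names made up for this illustration. Note that nothing in it checks whether a run actually has the day/month/year shape:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;&amp;gt;

  &amp;lt;!-- &#039;tokens&#039; is a hypothetical parent of the number/punc sequence --&amp;gt;
  &amp;lt;xsl:template match=&quot;tokens&quot;&amp;gt;
    &amp;lt;xsl:copy&amp;gt;
      &amp;lt;!-- the grouping key is true for number/punc elements, false for anything else --&amp;gt;
      &amp;lt;xsl:for-each-group select=&quot;*&quot;
          group-adjacent=&quot;exists(self::number | self::punc)&quot;&amp;gt;
        &amp;lt;xsl:choose&amp;gt;
          &amp;lt;xsl:when test=&quot;current-grouping-key()&quot;&amp;gt;
            &amp;lt;!-- any run of number/punc elements ends up here, date-like or not --&amp;gt;
            &amp;lt;run&amp;gt;&amp;lt;xsl:copy-of select=&quot;current-group()&quot; /&amp;gt;&amp;lt;/run&amp;gt;
          &amp;lt;/xsl:when&amp;gt;
          &amp;lt;xsl:otherwise&amp;gt;
            &amp;lt;xsl:copy-of select=&quot;current-group()&quot; /&amp;gt;
          &amp;lt;/xsl:otherwise&amp;gt;
        &amp;lt;/xsl:choose&amp;gt;
      &amp;lt;/xsl:for-each-group&amp;gt;
    &amp;lt;/xsl:copy&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Deciding whether a run matches the pattern, or splitting a longer run into several matches, would still need hand-written tests on the contents of each group.&lt;/p&gt;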

&lt;p&gt;The language that &lt;em&gt;is&lt;/em&gt; designed to describe sequences of elements is RELAX NG. Obviously RELAX NG is really useful as a schema language, but it&amp;#8217;s really all to do with defining regular expressions over XML structures. We can use RELAX NG to describe the pattern of elements we want to match:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;group&amp;gt;
  &amp;lt;element name=&quot;number&quot;&amp;gt;
    &amp;lt;data type=&quot;integer&quot;&amp;gt;
      &amp;lt;param name=&quot;minInclusive&quot;&amp;gt;1&amp;lt;/param&amp;gt;
      &amp;lt;param name=&quot;maxInclusive&quot;&amp;gt;31&amp;lt;/param&amp;gt;
      &amp;lt;param name=&quot;pattern&quot;&amp;gt;[0-9]{2}&amp;lt;/param&amp;gt;
    &amp;lt;/data&amp;gt;
  &amp;lt;/element&amp;gt;
  &amp;lt;element name=&quot;punc&quot;&amp;gt;
    &amp;lt;value&amp;gt;/&amp;lt;/value&amp;gt;
  &amp;lt;/element&amp;gt;
  &amp;lt;element name=&quot;number&quot;&amp;gt;
    &amp;lt;data type=&quot;integer&quot;&amp;gt;
      &amp;lt;param name=&quot;minInclusive&quot;&amp;gt;1&amp;lt;/param&amp;gt;
      &amp;lt;param name=&quot;maxInclusive&quot;&amp;gt;12&amp;lt;/param&amp;gt;
      &amp;lt;param name=&quot;pattern&quot;&amp;gt;[0-9]{2}&amp;lt;/param&amp;gt;
    &amp;lt;/data&amp;gt;
  &amp;lt;/element&amp;gt;
  &amp;lt;element name=&quot;punc&quot;&amp;gt;
    &amp;lt;value&amp;gt;/&amp;lt;/value&amp;gt;
  &amp;lt;/element&amp;gt;
  &amp;lt;element name=&quot;number&quot;&amp;gt;
    &amp;lt;data type=&quot;integer&quot;&amp;gt;
      &amp;lt;param name=&quot;pattern&quot;&amp;gt;[0-9]{2}&amp;lt;/param&amp;gt;
    &amp;lt;/data&amp;gt;
  &amp;lt;/element&amp;gt;
&amp;lt;/group&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;or, in compact syntax:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;element number { 
  xs:integer { minInclusive = &quot;1&quot; maxInclusive = &quot;31&quot; pattern = &quot;[0-9]{2}&quot; }
},
element punc { &quot;/&quot; },
element number { 
  xs:integer { minInclusive = &quot;1&quot; maxInclusive = &quot;12&quot; pattern = &quot;[0-9]{2}&quot; }
},
element punc { &quot;/&quot; },
element number { 
  xs:integer { pattern = &quot;[0-9]{2}&quot; }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As a language, RELAX NG is really good at this. You could even imagine adding attributes to name subexpressions which you could then do things with (in the same way as you can get the substring matching a subexpression when you use a regular expression over text).&lt;/p&gt;

&lt;p&gt;So I think a &amp;#8220;grouper&amp;#8221; component should use RELAX NG to identify sequences to be marked up. But I have no idea if there are RELAX NG libraries out there that can be used in this way: to identify and extract matching sequences rather than to validate entire documents.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/79#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/6">pipelines</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <pubDate>Thu, 06 Mar 2008 14:59:03 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">79 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>Extension primitives in XSDL</title>
 <link>http://www.jenitennison.com/blog/node/71</link>
 <description>&lt;p&gt;Michael Sperberg-McQueen (CMSMcQ) has written a couple of interesting posts about &lt;a href=&quot;http://people.w3.org/~cmsmcq/blog/?p=26&quot; title=&quot;Michael Sperberg-McQueen: Allowing ‘extension primitives’ in XML Schema?&quot;&gt;datatypes in W3C&amp;#8217;s XML Schema (XSDL)&lt;/a&gt;. (The second is &lt;a href=&quot;http://people.w3.org/~cmsmcq/blog/?p=27&quot; title=&quot;Michael Sperberg-McQueen: Primitives and non-primitives in XSDL&quot;&gt;a response to&lt;/a&gt; a comment from &lt;a href=&quot;http://recycledknowledge.blogspot.com/&quot; title=&quot;John Cowan&#039;s Blog: Recycled Knowledge&quot;&gt;John Cowan&lt;/a&gt;, and attempts to justify some of the seemingly arbitrary decisions made in the set of datatypes present in XSDL 1.0.) The posts discuss one of the issues against XSDL 1.1 raised by &lt;a href=&quot;http://saxonica.blogharbor.com/&quot; title=&quot;Michael Kay&#039;s Blog: Saxon diaries&quot;&gt;Michael Kay&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Michael proposes: just specify that implementations may provide additional implementation-defined primitive types. In the nature of things, an implementation can do this however it wants. Some implementors will code up email dates and CSS lengths the same way they code the other primitives. Fine. Some implementors will expose the API that their existing primitive types use, so they choose, at the appropriate moment, to link in a set of extension types, or not. Some will allow users to provide implementations of extension types, using that API, and link them at run time. Some may provide extension syntax to allow users to describe new types in some usable way (DTLL, anyone?) without having to write code in Java or C or [name of language here].&lt;/p&gt;
&lt;/blockquote&gt;

&lt;!--break--&gt;

&lt;p&gt;Since I&amp;#8217;m principally responsible for the &lt;a href=&quot;http://www.idealliance.org/papers/extreme/proceedings/html/2006/Tennison01/EML2006Tennison01.html&quot; title=&quot;Extreme 2006: Datatypes for XML: the Datatyping Library Language (DTLL)&quot;&gt;Datatype Library Language (DTLL)&lt;/a&gt; it&amp;#8217;ll come as no surprise that I think that XSDL is currently deficient in not providing mechanisms for creating new primitive types (such as colours) or different lexical representations for the primitive types it has (such as UK-style dates). So yes, I do think XSDL would be a better schema language if it supported &amp;#8220;extension primitives&amp;#8221;. &lt;/p&gt;

&lt;p&gt;In XSLT and XPath, providing extensibility hooks has proved very effective. It&amp;#8217;s enabled implementers to innovate (and those innovations fed back into the design of XSLT 2.0 and XPath 2.0). It&amp;#8217;s provided users with functionality (such as &lt;code&gt;xxx:node-set()&lt;/code&gt;) that they would otherwise not have had for years, and therefore made the lives of thousands of users much easier.&lt;/p&gt;

&lt;p&gt;On the other hand, it&amp;#8217;s impossible to say how XSLT and XPath would have developed if those extensibility hooks hadn&amp;#8217;t been there. Would implementers have extended the language anyway, leading to fragmentation? Would the WG have felt more pressure to get later versions of XSLT out the door if the only way the language could have been improved was through centralised changes?&lt;/p&gt;

&lt;p&gt;I think the big thing that helped XSLT&amp;#8217;s extensibility actually work was &lt;a href=&quot;http://www.exslt.org/&quot; title=&quot;EXSLT: Extensions in XSLT&quot;&gt;EXSLT&lt;/a&gt; (but then, I would say that, wouldn&amp;#8217;t I?). The majority of XSLT processors implement EXSLT extensions, and even those processors that don&amp;#8217;t implement all (or any) of EXSLT have other extensibility hooks (such as &lt;code&gt;&amp;lt;msxsl:script&amp;gt;&lt;/code&gt; or &lt;code&gt;&amp;lt;xsl:function&amp;gt;&lt;/code&gt;) and there are third-party implementations of EXSLT&amp;#8217;s functions available so it&amp;#8217;s possible to write interoperable stylesheets while still using those functions.&lt;/p&gt;
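
&lt;p&gt;As a rough sketch of what such an interoperable stylesheet might look like, the XSLT 1.0 stylesheet below uses EXSLT&amp;#8217;s &lt;code&gt;exsl:node-set()&lt;/code&gt; to turn a result tree fragment into a node set, guarded by &lt;code&gt;function-available()&lt;/code&gt; so that a processor without the extension fails with a clear message. The &lt;code&gt;&amp;lt;colour&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;item&amp;gt;&lt;/code&gt; elements are made-up sample data:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;1.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:exsl=&quot;http://exslt.org/common&quot;
    extension-element-prefixes=&quot;exsl&quot;&amp;gt;

  &amp;lt;xsl:template match=&quot;/&quot;&amp;gt;
    &amp;lt;!-- in XSLT 1.0 this variable holds a result tree fragment, not a node set --&amp;gt;
    &amp;lt;xsl:variable name=&quot;colours&quot;&amp;gt;
      &amp;lt;colour&amp;gt;red&amp;lt;/colour&amp;gt;
      &amp;lt;colour&amp;gt;green&amp;lt;/colour&amp;gt;
    &amp;lt;/xsl:variable&amp;gt;
    &amp;lt;xsl:choose&amp;gt;
      &amp;lt;xsl:when test=&quot;function-available(&#039;exsl:node-set&#039;)&quot;&amp;gt;
        &amp;lt;!-- convert the fragment so that it can be navigated with XPath --&amp;gt;
        &amp;lt;xsl:for-each select=&quot;exsl:node-set($colours)/colour&quot;&amp;gt;
          &amp;lt;item&amp;gt;&amp;lt;xsl:value-of select=&quot;.&quot; /&amp;gt;&amp;lt;/item&amp;gt;
        &amp;lt;/xsl:for-each&amp;gt;
      &amp;lt;/xsl:when&amp;gt;
      &amp;lt;xsl:otherwise&amp;gt;
        &amp;lt;xsl:message terminate=&quot;yes&quot;&amp;gt;exsl:node-set() is not available&amp;lt;/xsl:message&amp;gt;
      &amp;lt;/xsl:otherwise&amp;gt;
    &amp;lt;/xsl:choose&amp;gt;
  &amp;lt;/xsl:template&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;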

&lt;p&gt;(EXSLT is by no means perfect: if we were doing it over again, we&amp;#8217;d build in much better methods for receiving user contributions of various kinds. But I think the general principle is sound.)&lt;/p&gt;

&lt;p&gt;If XSDL were to allow extension primitives, you&amp;#8217;d hope for something similar to happen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a repository for common extension primitives&lt;/li&gt;
&lt;li&gt;implementations that respond to user demand for extensions in the repository&lt;/li&gt;
&lt;li&gt;development of higher-level languages for defining extension primitives&lt;/li&gt;
&lt;li&gt;implementations that provide hooks (in whatever way) for defining extension primitives&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can&amp;#8217;t predict what implementers will do, but it seems likely that they&amp;#8217;d provide hooks for users to create their own extension primitives (albeit most likely using Java or .NET or whatever rather than a higher-level language such as DTLL). And once they do that, it&amp;#8217;s possible for the community to provide third-party implementations for extension primitives, thus retaining interoperability.&lt;/p&gt;

&lt;p&gt;So I think it could work, if implementers do the right thing and the user community gets involved.&lt;/p&gt;

&lt;p&gt;(Just in case you get the wrong impression: I still think &lt;a href=&quot;http://www.relaxng.org/&quot; title=&quot;RELAX NG&quot;&gt;RELAX NG&lt;/a&gt; is a vastly superior schema language to XSDL. If you need extension datatypes, you can have them in RELAX NG right now. Unfortunately, however, in the real world, you don&amp;#8217;t always get to make the right technical choice.)&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/71#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/35">dtll</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <pubDate>Sat, 19 Jan 2008 23:00:20 +0000</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">71 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech 2007: Wednesday 16th May Afternoon</title>
 <link>http://www.jenitennison.com/blog/node/19</link>
 <description>&lt;p&gt;Yes, I&amp;#8217;m determined to write up every talk I attended at XTech 2007, so that &lt;em&gt;I&lt;/em&gt; have a record of it if nothing else. On Wednesday afternoon, I attended sessions on microformats, internationalisation and NVDL (as well as giving my own talk, of course).&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/41&quot; title=&quot;Microformats: the nanotechnology of the semantic web&quot;&gt;Microformats: the nanotechnology of the semantic web&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://adactio.com/&quot; title=&quot;Jeremy Keith&#039;s Website&quot;&gt;Jeremy Keith&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This was a supremely well-put-together presentation on &lt;a href=&quot;http://microformats.org/&quot; title=&quot;Microformats Website&quot;&gt;microformats&lt;/a&gt;: beautiful slides, drama and humour, and a reference to &lt;a href=&quot;http://en.wikipedia.org/wiki/Neal_Stephenson&quot; title=&quot;Wikipedia: Neal Stephenson&quot;&gt;Neal Stephenson&amp;#8217;s&lt;/a&gt; &lt;a href=&quot;http://www.amazon.com/Diamond-Age-Illustrated-Primer-Spectra/dp/0553380966&quot; title=&quot;Amazon: Diamond Age&quot;&gt;Diamond Age&lt;/a&gt; (was I really one of only three people in the packed room to have read it?). There was a lot about what microformats are, how they&amp;#8217;re designed, what their niche is (Jeremy was very up-front about the fact they don&amp;#8217;t solve every problem), and how they&amp;#8217;re developed. But there weren&amp;#8217;t any demonstrations of microformat-based applications, which I would have really liked to see. The other thing I thought was worth noting was that Jeremy talked about the dangers of &amp;#8220;grey goo&amp;#8221; (he was using a nanotechnology metaphor): the proliferation of microformats. He expressed the strong desire that the set of microformats be kept small, and even said (I paraphrase) &amp;#8220;Do use semantic class names in your HTML, but don&amp;#8217;t call them microformats [unless they&amp;#8217;ve been through the microformats standardisation process]!&amp;#8221;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.holoweb.net/~liam/&quot; title=&quot;Liam Quin&#039;s Website&quot;&gt;Liam Quin&lt;/a&gt; gave a paper entitled &lt;a href=&quot;http://www.idealliance.org/papers/extreme/proceedings/html/2006/Quin01/EML2006Quin01.html&quot; title=&quot;Microformats: Contaminants or Ingredients&quot;&gt;Microformats: Contaminants or Ingredients&lt;/a&gt; at &lt;a href=&quot;http://www.extrememarkup.com/&quot; title=&quot;Extreme Markup Languages&quot;&gt;Extreme&lt;/a&gt; last year, asking what we, as traditional markup geeks, should do about them. Some were very sceptical, saying something along the lines of &amp;#8220;They&amp;#8217;re headed for a trainwreck; and we should sit back, watch it happen, and pick up the pieces.&amp;#8221; Others wanted to celebrate: the fact that tagging has become understood is really good news for the semantic web, open data and all that jazz. &lt;/p&gt;

&lt;p&gt;Both the traditional markup and the microformats community have the same goals: they want to make information easier to search for, to query, to integrate and so on. The microformats approach is to minimise the cost to those supplying information, and to target just a few, very common, kinds of data such as contact information, events and social networks. Traditional markup, on the other hand, aims to cover every single kind of information you might want to make available, and has to worry about issues like validating, styling, and distinguishing between tag sets.&lt;/p&gt;

&lt;p&gt;It seems that a fundamental problem is that the benefits of including semantic markup aren&amp;#8217;t immediately obvious to the supplier. Whether you use semantic class names in HTML or use elements in known namespaces, it&amp;#8217;s purely a matter of faith that this will make your information easier to locate or use. You can&amp;#8217;t know that search engines will include that information in their weighting algorithms, or that people reading your page will have the screen-scraping software necessary to pull anything out. With so little (obvious) benefit, authors will only supply semantic data if the cost is low. Adding class names to existing HTML elements is easy whether a web page is generated by hand or automatically. Adding namespaces and authoring special CSS might not be that much more costly to do, but it&amp;#8217;s much more costly to grok.&lt;/p&gt;

&lt;p&gt;So if we want authors to start putting elements in their own namespaces in their web pages, we need an application that immediately cranks up the benefit of doing so. I have no idea what that is.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/50&quot; title=&quot;Applying the Internationalization Tag Set&quot;&gt;Applying the Internationalization Tag Set&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://www.translate.com/&quot; title=&quot;Yves Savourel&#039;s Website&quot;&gt;Yves Savourel&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;This was a good introduction to a standard I only knew about vaguely. It&amp;#8217;s definitely worth knowing about the &lt;code&gt;its:*&lt;/code&gt; attributes, which cover i18n features such as indicating which content should be translated, marking terms and providing comments for localisation, just in case you need to build those into new markup languages.&lt;/p&gt;

&lt;p&gt;I also have much admiration for how the ITS standard doesn&amp;#8217;t expect people to completely rework their markup languages to incorporate ITS data. Instead of using the ITS attributes directly in a document, you can use global rules embedded in the document itself, referenced from the document, or embedded in the schema for the document. I think this approach will prove useful in the development of &lt;a href=&quot;http://www.lmnlwiki.org/index.php/Talk:ECLIX#LIX&quot; title=&quot;LMNL in XML&quot;&gt;LIX&lt;/a&gt;, when we get around to formalising it.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/48&quot; title=&quot;NVDL - a breath of fresh air for compound document validation&quot;&gt;NVDL - a breath of fresh air for compound document validation&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://xmlguru.cz/&quot; title=&quot;Jirka Kosek&#039;s Website&quot;&gt;Jirka Kosek&lt;/a&gt; &amp;amp; &lt;a href=&quot;http://nalevka.com/&quot; title=&quot;Petr Nálevka&#039;s Website&quot;&gt;Petr Nálevka&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://www.nvdl.org/&quot; title=&quot;Namespace-based Validation Dispatching Language&quot;&gt;NVDL&lt;/a&gt; is Part 4 of &lt;a href=&quot;http://www.dsdl.org/&quot; title=&quot;Document Schema Definition Languages&quot;&gt;DSDL&lt;/a&gt;, specifically targeted at organising the validation of documents that incorporate multiple namespaces, such as XHTML documents containing islands of SVG, RDF and MathML. NVDL&amp;#8217;s approach is to identify subtrees within the document that need to be validated against a particular schema. The subtrees don&amp;#8217;t need to only hold one namespace, but often that will be the case.&lt;/p&gt;

&lt;p&gt;The XML Schema wonks in the room (Henry Thompson and Michael Sperberg-McQueen) were a bit befuddled, I think, because with XML Schema you just supply a whole bunch of schema documents to the processor, for different namespaces, and as long as the schemas contain wildcards they&amp;#8217;ll do the right thing. The concept of supplying multiple schemas to a validator isn&amp;#8217;t part of RELAX NG&amp;#8217;s validation approach, so you need something like NVDL if you don&amp;#8217;t want to rework your schema for every combination of namespaces.&lt;/p&gt;

&lt;p&gt;Henry and Michael were particularly concerned about the fact that it means you can override the original schema, allowing elements from foreign namespaces in situations where the original schema hasn&amp;#8217;t allowed them. But as Henry said, it just means that the primary schema you use to define what&amp;#8217;s allowed where is actually an NVDL schema: it&amp;#8217;s not auxiliary validation like Schematron is, but a language for the primary schema you use.&lt;/p&gt;

&lt;p&gt;Later, I wondered how much the &lt;a href=&quot;http://www.w3.org/TR/xproc&quot; title=&quot;XProc: An XML Pipeline Language&quot;&gt;XProc&lt;/a&gt; work would render NVDL irrelevant. After all, XProc can invoke validation of subtrees against multiple external schemas. On the other hand, NVDL&amp;#8217;s syntax is going to be easier to use if that&amp;#8217;s all you want to do. Perhaps someone will write a tool to convert NVDL schemas to XProc pipelines&amp;#8230;&lt;/p&gt;

&lt;p&gt;Actually, Jirka &amp;amp; Petr&amp;#8217;s experience with &lt;a href=&quot;http://sourceforge.net/projects/jnvdl/&quot; title=&quot;Java implementation of NVDL&quot;&gt;JNVDL&lt;/a&gt; is interesting from the XProc viewpoint, in particular the problems that they had with reporting meaningful line numbers when validating subtrees. Something that XProc implementers might want to look at in regard to error reporting with &lt;code&gt;&amp;lt;p:viewport&amp;gt;&lt;/code&gt;.&lt;/p&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/19#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/16">markup</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/6">pipelines</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Sun, 20 May 2007 22:52:14 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">19 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>XTech 2007: Wednesday 16th May Morning</title>
 <link>http://www.jenitennison.com/blog/node/18</link>
 <description>&lt;p&gt;Since there&amp;#8217;s next to no &amp;#8216;net connection at XTech 2007 (obviously the Web is not so ubiquitous as all that), I have nothing to do in the sessions but listen! Here are some thoughts about the sessions that I attended on the morning of Wednesday 16th. I&amp;#8217;ve left out the keynotes, not because they weren&amp;#8217;t interesting but because I can&amp;#8217;t think of anything to say about them at the moment.&lt;/p&gt;

&lt;!--break--&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/paper/60&quot; title=&quot;XML and LINQ: What&#039;s New in Orcas and Beyond&quot;&gt;XML and LINQ: What&amp;#8217;s New in Orcas and Beyond&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://research.microsoft.com/~emeijer/&quot; title=&quot;Erik Meijer&#039;s Website&quot;&gt;Erik Meijer&lt;/a&gt; (Microsoft)&lt;/h3&gt;

&lt;p&gt;I thought I&amp;#8217;d better go to this one because I&amp;#8217;m supposed to be talking about XML APIs at this year&amp;#8217;s &lt;a href=&quot;http://www.xmlsummerschool.com/&quot; title=&quot;XML Summer School, Oxford&quot;&gt;XML Summer School&lt;/a&gt; and LINQ, or XLINQ, is one of the hot topics. I&amp;#8217;m not a .NET developer, so it&amp;#8217;s all kinda passed me by thus far, and I&amp;#8217;m not sure I really understand it now. (I&amp;#8217;d welcome corrections and clarifications.) The three things that seemed to be important are:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;You can get at information held in objects, databases or XML using the same syntax. (Erik showed accessing XML with faulty XQuery syntax, which made me and &lt;a href=&quot;http://www.datypic.com/&quot; title=&quot;Priscilla Walmsley&#039;s Website&quot;&gt;Priscilla Walmsley&lt;/a&gt; grimace at each other.) This means you can put off deciding how you actually want to hold your data until further down the line. The big difference from previous attempts to work across paradigms is that the &lt;em&gt;data&lt;/em&gt; doesn&amp;#8217;t get converted, but the &lt;em&gt;queries&lt;/em&gt; do. So you write your query in LINQ syntax and it gets mapped on to SQL to query your SQL database, or on to XQuery (I guess) to query your XML document. This all seemed to assume data-oriented information: I have no idea, yet, how or whether mixed content gets handled.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;XML is a &amp;#8220;first class datatype&amp;#8221; in LINQ, so to create static XML you just write XML in your program (a bit like in XQuery). The example Erik showed included an XML declaration, which is just plain weird: dunno if that was an error or it&amp;#8217;s a way of indicating what version of XML you&amp;#8217;re using, or what. To create dynamic portions of the XML, you use &lt;code&gt;&amp;lt;%=...%&amp;gt;&lt;/code&gt; &amp;#8220;expression holes&amp;#8221; which can contain .NET code, including calls to a new API for creating XML elements and attributes (a DOM replacement).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Erik talked about writing applications in .NET and then automatically refactoring them (with a click in a context menu) to work in client/server architectures, and refactoring again to work across several clients. Presumably this creates all the code necessary to make the application work with WS* messaging, so you don&amp;#8217;t have to program it. This all sounded really dodgy to me: I don&amp;#8217;t want to rely on a tool to make a language/approach/architecture usable.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There was an amusing digression into the art of rendering triangles, and thus three-dimensional models, with zero-width, zero-height, bordered &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;s in XHTML. And a mention of the &amp;#8220;backbutton&amp;#8221; problem that you get when you spawn tabs/windows in your web browser and then go back to your original tab/window and hit submit, which made me think that perhaps a RESTful architecture would make a whole lot of complexity go away.&lt;/p&gt;

&lt;h2&gt;&lt;a href=&quot;http://2007.xtech.org/public/schedule/detail/159&quot; title=&quot;Data Model Perspectives for XML Schema&quot;&gt;Data Model Perspectives for XML Schema&lt;/a&gt;&lt;/h2&gt;

&lt;h3&gt;&lt;a href=&quot;http://www.ee.ethz.ch/&quot; title=&quot;Felix Michel&#039;s Website&quot;&gt;Felix Michel&lt;/a&gt; (ETH Zurich), &lt;a href=&quot;http://dret.net/netdret/&quot; title=&quot;Erik Wilde&#039;s Website&quot;&gt;Erik Wilde&lt;/a&gt; (UC Berkeley)&lt;/h3&gt;

&lt;p&gt;Felix mentioned that I might be interested in his talk in a &lt;a href=&quot;http://www.jenitennison.com/blog/node/2#comment-24&quot; title=&quot;Comment: Re: XTech Preparation&quot;&gt;comment here&lt;/a&gt;, and sure enough I found it fascinating. He&amp;#8217;s created a single-file representation of XML Schemas (consolidating schemas that, by virtue of using different namespaces, must be in different physical documents), and a set of XSLT 2.0 user-defined functions that provide access to and queries on the XML Schema information.&lt;/p&gt;

&lt;p&gt;For example, you can go from an instance element in your document to its type, find out if it&amp;#8217;s an extension or restriction, go to its base type, look at the annotations on it, and so on and so on. And all this in Basic XSLT 2.0 (the functions that work on instance elements traverse the instance document and schema in parallel to locate the element declaration that applies). You could use these functions to do everything you can do in Schema-Aware XSLT 2.0, with more flexibility, at the expense of performance.&lt;/p&gt;

&lt;p&gt;He also mapped content models onto &lt;code&gt;&amp;lt;occurrence&amp;gt;&lt;/code&gt; elements that encode the &amp;#8220;follow set&amp;#8221; for a particular occurrence, so you can easily answer the question &amp;#8220;what elements could come next?&amp;#8221;. I can&amp;#8217;t immediately think of a way of using that information in a stylesheet, but perhaps he can describe one.&lt;/p&gt;

&lt;p&gt;Anyway, I think Felix&amp;#8217;s point was not to provide XSLT programmers with a set of useful functions, but to demonstrate the kind of standard, fairly light-weight, API that we might use to access XML Schema information. There was some discussion, in the development of XPath 2.0, of providing this kind of API, but getting agreement on XDM was hard enough!&lt;/p&gt;

&lt;p&gt;However, my thoughts were veering off in different directions. To my mind, validation and annotation are separable processes, and the data types, element groups and linking behaviour that you might find useful on a data set are processing-specific. For example, it might make sense for one process to annotate the element &lt;code&gt;&amp;lt;foo&amp;gt;2007-05-17&amp;lt;/foo&amp;gt;&lt;/code&gt; as having the type date, while for another process (such as a transformation that deletes all &lt;code&gt;&amp;lt;foo&amp;gt;&lt;/code&gt; elements) it&amp;#8217;s unnecessary. I really don&amp;#8217;t want to have to define an XSD schema for my entire document just to indicate that the &lt;code&gt;&amp;lt;foo&amp;gt;&lt;/code&gt; element is of type &lt;code&gt;xs:date&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Just as it&amp;#8217;s better to define the links between elements using keys, rather than relying on ID annotations made by a DTD, I think type annotations and node groups (why limit it to elements?) could be defined in the stylesheet. To give an idea:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;!-- all date attributes have the type named &#039;xs:date&#039; --&amp;gt;
&amp;lt;ann:type name=&quot;xs:date&quot; match=&quot;@date&quot; /&amp;gt;
&amp;lt;!-- h1, h2, h3, h4, h5, h6 elements are heading elements --&amp;gt;
&amp;lt;ann:group name=&quot;xhtml:heading&quot; 
  match=&quot;xhtml:h1 | xhtml:h2 | xhtml:h3 | xhtml:h4 | xhtml:h5 | xhtml:h6&quot; /&amp;gt;
&amp;lt;!-- oh, and so&#039;s the h element --&amp;gt;
&amp;lt;ann:group name=&quot;xhtml:heading&quot; match=&quot;xhtml:h&quot; /&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It&amp;#8217;d be reasonably easy to give rudimentary support for &lt;code&gt;ann:type($node)&lt;/code&gt; and &lt;code&gt;ann:group($node)&lt;/code&gt; user-defined functions based on these, but they&amp;#8217;d really have to be built into the XSLT processor to get full pattern support and to work with modularised stylesheets. This all requires more detail than I have time to write right now, but is it even worth pursuing?&lt;/p&gt;
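
&lt;p&gt;For what it&amp;#8217;s worth, a rudimentary &lt;code&gt;ann:group($node)&lt;/code&gt; along those lines might look like the XSLT 2.0 sketch below. It rests on several assumptions: the &lt;code&gt;ann&lt;/code&gt; namespace URI is a placeholder, the declarations sit in the same stylesheet module as the function (read back with &lt;code&gt;document(&#039;&#039;)&lt;/code&gt;, which relies on the stylesheet having a usable base URI), and each &lt;code&gt;match&lt;/code&gt; attribute is only a &lt;code&gt;|&lt;/code&gt;-separated list of prefixed element names rather than a general pattern:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&amp;lt;xsl:stylesheet version=&quot;2.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:xs=&quot;http://www.w3.org/2001/XMLSchema&quot;
    xmlns:xhtml=&quot;http://www.w3.org/1999/xhtml&quot;
    xmlns:ann=&quot;http://example.org/ann&quot;&amp;gt;

  &amp;lt;!-- group declarations, as in the example above --&amp;gt;
  &amp;lt;ann:group name=&quot;xhtml:heading&quot;
    match=&quot;xhtml:h1 | xhtml:h2 | xhtml:h3 | xhtml:h4 | xhtml:h5 | xhtml:h6&quot; /&amp;gt;
  &amp;lt;ann:group name=&quot;xhtml:heading&quot; match=&quot;xhtml:h&quot; /&amp;gt;

  &amp;lt;!-- the ann:group declarations in this stylesheet module only --&amp;gt;
  &amp;lt;xsl:variable name=&quot;ann:groups&quot; as=&quot;element(ann:group)*&quot;
    select=&quot;document(&#039;&#039;)/*/ann:group&quot; /&amp;gt;

  &amp;lt;!-- returns the names of the groups whose match list contains the node&#039;s name --&amp;gt;
  &amp;lt;xsl:function name=&quot;ann:group&quot; as=&quot;xs:string*&quot;&amp;gt;
    &amp;lt;xsl:param name=&quot;node&quot; as=&quot;node()&quot; /&amp;gt;
    &amp;lt;xsl:sequence select=&quot;distinct-values(
      $ann:groups[some $name in tokenize(normalize-space(@match), &#039; ?\| ?&#039;)
                  satisfies resolve-QName($name, .) = node-name($node)]
        /@name/string())&quot; /&amp;gt;
  &amp;lt;/xsl:function&amp;gt;
&amp;lt;/xsl:stylesheet&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With that in place, a test such as &lt;code&gt;ann:group(.) = &#039;xhtml:heading&#039;&lt;/code&gt; could pick out headings, but anything beyond simple name lists (predicates, attributes such as &lt;code&gt;@date&lt;/code&gt;, imported modules) is out of reach for this approach.&lt;/p&gt;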
</description>
 <comments>http://www.jenitennison.com/blog/node/18#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/5">xslt</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/15">xlinq</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/4">xtech</category>
 <pubDate>Thu, 17 May 2007 21:50:59 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">18 at http://www.jenitennison.com/blog</guid>
</item>
<item>
 <title>A Creole by any other name...</title>
 <link>http://www.jenitennison.com/blog/node/6</link>
 <description>&lt;p&gt;Argh. I&amp;#8217;ve been contacted by the guys at &lt;a href=&quot;http://www.wikicreole.org&quot; title=&quot;Creole Wiki Markup language&quot;&gt;WikiCreole&lt;/a&gt; who want me to change the name of &lt;a href=&quot;http://www.lmnlwiki.org&quot; title=&quot;Creole schema language&quot;&gt;Creole&lt;/a&gt;. What should I do? Not only is &amp;#8220;Creole&amp;#8221; a great name for a schema language that deals with concurrent markup, but it&amp;#8217;s a great acronym too (Composable regular expressions for overlapping languages etc.)&lt;/p&gt;

&lt;p&gt;I did Google when I first came up with the name in August 2006, but didn&amp;#8217;t discover WikiCreole (unsurprisingly, since it was only coined in July 2006 itself). But now far more people know, care about and use WikiCreole than Creole grammars. So any suggestions for alternative names?&lt;/p&gt;

&lt;!--break--&gt;
</description>
 <comments>http://www.jenitennison.com/blog/node/6#comments</comments>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/7">creole</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/9">overlapping markup</category>
 <category domain="http://www.jenitennison.com/blog/taxonomy/term/8">schema</category>
 <pubDate>Wed, 25 Apr 2007 21:09:28 +0100</pubDate>
 <dc:creator>Jeni</dc:creator>
 <guid isPermaLink="false">6 at http://www.jenitennison.com/blog</guid>
</item>
</channel>
</rss>
